Regression Models and Multivariate Life Tables

Ross L Prentice; Shanshan Zhao

doi:10.1080/01621459.2020.1713792

. Author manuscript; available in PMC: 2022 Jan 1.

Published in final edited form as: J Am Stat Assoc. 2020 Feb 10;116(535):1330–1345. doi: 10.1080/01621459.2020.1713792

Regression Models and Multivariate Life Tables

Ross L Prentice ¹, Shanshan Zhao ²

PMCID: PMC8494047 NIHMSID: NIHMS1566901 PMID: 34629570

Abstract

Semiparametric, multiplicative-form regression models are specified for marginal single and double failure hazard rates for the regression analysis of multivariate failure time data. Cox-type estimating functions are specified for single and double failure hazard ratio parameter estimation, and corresponding Aalen–Breslow estimators are specified for baseline hazard rates. Generalization to allow classification of failure times into a smaller set of failure types, with failures of the same type having common baseline hazard functions, is also included. Asymptotic distribution theory arises by generalization of the marginal single failure hazard rate estimation results of Danyu Lin, L.J. Wei and colleagues. The Péano series representation for the bivariate survival function in terms of corresponding marginal single and double failure hazard rates leads to novel estimators for pairwise bivariate survival functions and pairwise dependency functions, at specified covariate history. Related asymptotic distribution theory follows from that for the marginal single and double failure hazard rates and the continuity, compact differentiability of the Péano series transformation and bootstrap applicability. Simulation evaluation of the proposed estimation procedures are presented, and an application to multiple clinical outcomes in the Women’s Health Initiative Dietary Modification Trial is provided. Higher dimensional marginal hazard rate regression modeling is briefly mentioned.

Keywords: Bivariate survival function, Composite outcome, Cross ratio, Empirical process, Hazard rates, Marginal modeling, Multivariate failure times

1. Introduction

Sir David Cox’s landmark paper (Cox, 1972) revolutionized the methods for analyzing censored failure time regression data. Cox regression, along with Kaplan–Meier (KM) survival function estimators, quickly became core methods for the analysis of univariate failure time data.

In the subsequent 45 years a considerable statistical literature has arisen proposing methods that are built upon univariate Cox regression for the analysis of multivariate failure time regression data. An important contribution was provided by Andersen and Gill (1982), who used the same semiparametric exponential model form for the counting process intensity, which models failure rates for multivariate events on a single failure time axis conditional on all preceding failure, censoring and covariate information. As such these methods are suited to examining hazard ratio dependencies on covariates after allowing for the preceding failure history for the correlated set of outcomes, but cannot be used to examine covariate effects on hazard rates without the counting process conditioning. Frailty models (e.g., Andersen et al., 1993; Hougaard, 2000; Aalen et al., 2010; Duchateau and Janssen, 2010; Wienke, 2011) have also played a major role in multivariate failure time data analysis methods. These models avoid intensity models that depend explicitly on the individual’s preceding failure time counting process by assuming independence between the failure times given a random effect, or frailty variable, which is typically assumed to act multiplicatively on the hazard rates for the correlated failure times given preceding covariate histories. These modeling methods are suited to the study of dependencies or clustering among correlated failure times given preceding covariate histories, but are less suited to the study of hazard rate associations with covariate histories themselves. In particular, marginal hazard rates given covariates that are induced by frailty models typically do not reduce to the standard form of the Cox models for marginal single failure hazard rates.

The copula model approach (e.g. Clayton, 1978; Oakes, 1986, 1989; Nan et al., 2006; Nelsen, 2007; Bandeen-Roche and Ning, 2008; Hu et al., 2011) to multivariate failure time modeling avoids these issues by assuming the standard Cox models for marginal single failures hazard rates, and by bringing together corresponding marginal survival functions (given covariates) through a copula function having a low dimensional parameter that controls dependency. A two-stage data analysis (e.g. Shih and Louis, 1995) then retains the usual marginal single failure hazard parameter rate estimators, while also providing estimators of copula distribution parameters. If the assumed copula model is a good fit to the data, this approach can provide a simple parametric description of dependencies among failure times, given covariates. This description can allow such dependencies to depend on baseline covariates, but the copula approach is not suited to estimating dependencies that are functions of covariates that evolve over the study follow-up period(s).

Frailty and copula models typically embrace a limited class of dependencies among the multivariate failure times. A semiparametric marginal regression modeling approach can add valuable analytic flexibility. Importantly the modeling approach that is considered here includes semiparametric regression models for both marginal single and marginal double failure hazard rates, and has the potential to add readily interpretable information on regression influences on failure time outcomes jointly, beyond that from Cox model analyses of the univariate outcomes. Asymptotic distribution theory for estimators of the regression parameters and corresponding baseline rates in the marginal single and double failure rate models will be developed by generalizing the empirical process results of Spiekerman and Lin (1998) for marginal single failure hazard rates and Lin et al. (2000) for recurrent event data, to embrace both marginal single and double failure rate models. These methods are largely complementary to well-developed semiparametric regression methods for the failure counting process intensity that considers the regression associations with the rates that condition on the entire preceding failure time counting process history for the correlated set of outcomes. The methods that are presented here aim to elucidate population-averaged regression effects, in contrast to the subject-specific associations targeted by intensity models.

2. Bivariate Failure Time Regression Modeling and Estimation

2.1. Marginal single and double failure rate regression

Consider bivariate failure times T₁ > 0 and T₂ > 0 that are subject to right censoring by C₁ ≥ 0 and C₂ ≥ 0, respectively, with the usual convention that failures precede censorings in the event of tied times. Also suppose that the pair (T₁, T₂) is accompanied by a bivariate covariate that may stochastically evolve over the study follow-up period. Let z(t₁, t₂) denote a vector of measured covariates at follow-up times (t₁, t₂) and let $Z (t_{1}, t_{2}) = z (0, 0) \lor {z (s_{1}, s_{2}); s_{1} < t_{1}, s_{2} < t_{2}}$ the covariate history prior to (t₁, t₂), where ‘∨’ denotes composition. One can define the marginal single failure hazard rate processes, $Λ_{10} {\cdot, 0; Z (\cdot, 0)}$ and $Λ_{01} {0, \cdot; Z (0, \cdot)}$ by

Λ_{10} {d t_{1}, 0; Z (t_{1}, 0)} = P {T_{1} \in [t_{1}, t_{1} + d t_{1}); T_{1} \geq t_{1}, Z (t_{1}, 0)} for t_{1} \geq 0, and Λ_{01} {0, d t_{2}; Z (0, t_{2})} = P {T_{2} \in [t_{2}, t_{2} + d t_{2}); T_{2} \geq t_{2}, Z (0, t_{2})}, for t_{2} \geq 0,

and marginal double failure hazard rate process $Λ_{11} {\cdot, \cdot; Z (\cdot, \cdot)}$ by

Λ_{11} {d t_{1}, d t_{2}; Z (t_{1}, t_{2})} = P {T_{1} \in [t_{1}, t_{1} + d t_{1}), T_{2} \in [t_{2}, t_{2} + d t_{2}); T_{1} \geq t_{1}, T_{2} \geq t_{2}, Z (t_{1}, t_{2})},

for all t₁ ≥ 0 and t₂ ≥ 0. An independent censoring assumption (given Z) for estimation of parameters in these hazard rate processes requires that lack of censoring in [0, t₁), in [0, t₂), and in [0, t₁) × [0, t₂) can be added, respectively, to the conditioning events in these expressions without altering the failure rates, for any (t₁, t₂).

Though a variety of regression models could be entertained for these marginal single and double failure hazard rates, we will focus on Cox-type semiparametric models, and write

Λ_{10} {d t_{1}, 0; Z (t_{1}, 0)} = Λ_{10} (d t_{1}, 0) \exp {X (t_{1}, 0) β_{10}},

(1)

Λ_{01} {0, d t_{2}; Z (0, t_{2})} = Λ_{01} (0, d t_{2}) \exp {X (0, t_{2}) β_{01}}, and

(2)

Λ_{11} {d t_{1}, d t_{2}; Z (t_{1}, t_{2})} = Λ_{11} (d t_{1}, d t_{2}) \exp {X (t_{1}, t_{2}) β_{11}} .

(3)

Here $Λ_{10} (\cdot, 0), Λ_{01} (0, \cdot),$ and $Λ_{11} (\cdot, \cdot)$ are unspecified ‘baseline’ hazard functions at zero values for the corresponding regression variables $X (t_{1}, 0) = {X_{1} (t_{1}, 0), X_{2} (t_{1}, 0), \dots}, X (0, t_{2}) = {X_{1} (0, t_{2}), X_{2} (0, t_{2}), \dots},$ and X(t₁, t₂) = {X₁(t₁, t₂), X₂(t₁, t₂),...}. These regression variables, with sample paths that are continuous from the left with limits from the right, are each fixed length vectors (i.e., same length vectors for all study subjects at all times) formed from {t₁, Z(t₁, 0)}, {t₂, Z(0, t₂)} and {t₁, t₂; Z(t₁, t₂)} respectively. Also, β₁₀, β₀₁ and β₁₁ are corresponding (column vector) regression parameters to be estimated. The hazard ratio factors on the right side of (1)–(3) aim to quantify the dependence of these failure rates on the pertinent preceding covariate history.

Consider a random sample $S_{1 i} = T_{1 i} \land C_{1 i}, δ_{1 i} = I [T_{1 i} = S_{1 i}], S_{2 i} = T_{2 i} \land C_{2 i}, δ_{2 i} = I [T_{2 i} = S_{2 i}], Z (S_{1 i}, S_{2 i}), for i = 1, \dots, n$ from a study cohort, where $\land$ denotes minimum and I[·] is an indicator function. These observations define corresponding counting processes N_1i, N_2i and ‘at risk’ processes Y_1i, Y_2i via

\begin{array}{l} N_{1 i} (d t_{1}) = {\begin{array}{l} 1 & if S_{1 i} = t_{1} and δ_{1 i} = 1 \\ 0 & otherwise \end{array}; N_{2 i} (d t_{2}) = {\begin{array}{l} 1 & if S_{2 i} = t_{2} and δ_{2 i} = 1, \\ 0 & otherwise \end{array} \\ Y_{1 i} (t_{1}) = {\begin{array}{l} 1 & if S_{1 i} \geq t_{1} \\ 0 & otherwise \end{array}; and Y_{2 i} (t_{2}) = {\begin{array}{l} 1 & if S_{2 i} \geq t_{2} \\ 0 & otherwise \end{array}, \end{array}

for i = 1,...,n. From the above expressions one can define processes $L_{10 i} (\cdot, 0; β_{10}), L_{01 i} (0, \cdot; β_{01})$ and $L_{11 i} (\cdot, \cdot; β_{11})$ that have zero means under (1)–(3) respectively, by

L_{10 i} (d t_{1}, 0; β_{10}) = N_{1 i} (d t_{1}) - Y_{1 i} (t_{1}) Λ_{10} (d t_{1}, 0) \exp {X_{i} (t_{1}, 0) β_{10}}, L_{01 i} (0, d t_{2}; β_{01}) = N_{2 i} (d t_{2}) - Y_{2 i} (t_{2}) Λ_{01} (0, d t_{2}) \exp {X_{i} (0, t_{2}) β_{01}}, and L_{11 i} (d t_{1}, d t_{2}; β_{11}) = N_{1 i} (d t_{1}) N_{2 i} (d t_{2}) - Y_{1 i} (t_{1}) Y_{2 i} (t_{2}) Λ_{11} (d t_{1}, d t_{2}) \exp {X_{i} (t_{1}, t_{2}) β_{11}} .

An estimating equation for the hazard ratio parameter $β = {(β_{10}^{'}, β_{01}^{'}, β_{11}^{'})}^{'}$ over a follow-up region $[0, τ_{1}] \times [0, τ_{2}]$ can be written as

U (β; τ_{1}, τ_{2}) = (\begin{matrix} U_{10} (β_{10}; τ_{1}) \\ U_{01} (β_{01}; τ_{2}) \\ U_{11} (β_{11}; τ_{1}, τ_{2}) \end{matrix}) = 0,

(4)

where

U_{10} (β_{10}; t_{1}) = \sum_{i = 1}^{n} \int_{0}^{t_{1}} {X_{i} (s_{1}, 0) - \bar{X} (s_{1}, 0; β_{10})}^{'} N_{1 i} (d s_{1}), U_{01} (β_{01}; t_{2}) = \sum_{i = 1}^{n} \int_{0}^{t_{2}} {X_{i} (0, s_{2}) - \bar{X} (0, s_{2}; β_{01})}^{'} N_{2 i} (d s_{2}),

and

U_{11} (β_{11}; t_{1}, t_{2}) = \sum_{i = 1}^{n} \int_{0}^{t_{1}} \int_{0}^{t_{2}} {X_{i} (s_{1}, s_{2}) - \bar{X} (s_{1}, s_{2}; β_{11})}^{'} N_{1 i} (d s_{1}) N_{2 i} (d s_{2}),

and where

\bar{X} (s_{1}, 0; β_{10}) = \sum_{i = 1}^{n} Y_{1 i} (s_{1}) X_{i} (s_{1}, 0) \exp {X_{i} (s_{1}, 0) β_{10}} / \sum_{i = 1}^{n} Y_{1 i} (s_{1}) \exp {X_{i} (s_{1}, 0) β_{10}}, \bar{X} (0, s_{2}; β_{01}) = \sum_{i = 1}^{n} Y_{2 i} (s_{2}) X_{i} (0, s_{2}) \exp {X_{i} (0, s_{2}) β_{01}} / \sum_{i = 1}^{n} Y_{2 i} (s_{2}) \exp {X_{i} (0, s_{2}) β_{01}}, and \bar{X} (s_{1}, s_{2}; β_{11}) = \sum_{i = 1}^{n} Y_{1 i} (s_{1}) Y_{2 i} (s_{2}) X_{i} (s_{1}, s_{2}) \exp {X_{i} (s_{1}, s_{2}) β_{11}} / \sum_{i = 1}^{n} Y_{1 i} (s_{1}) Y_{2 i} (s_{2}) \exp {X_{i} (s_{1}, s_{2}) β_{11}} .

Note that the solution $\hat{β} = {({\hat{β}}_{10}^{'}, {\hat{β}}_{01}^{'}, {\hat{β}}_{11}^{'})}^{'}$ to (4) provides an estimator of β under (1)–(3) as derives from the fact that N_1i(ds₁) in U₁₀ can be replaced by L_10i(ds₁, 0; β₁₀), N_2i(ds₂) in U₀₁ can be replaced by $L_{10 i} (d s_{1}, 0; β_{10}), N_{2 i} (d s_{2})$ and N_1i(ds₁)N_2i(ds₂) in U₁₁ can be replaced by L_11i(ds₁, ds₂; β₁₁) while retaining equality to zero, as follows from some simple algebra. Hence $U (β; τ_{1}, τ_{2})$ is composed of stochastic integrals of functions of the data with respect to a zero mean process at the ‘true’ β-value.

The distribution theory for $\hat{β}$ is complicated due to the dependence of the ‘centering’ processes $\bar{X} (\cdot, 0; β_{10}), \bar{X} (0, \cdot; β_{01})$ and $\bar{X} (\cdot, \cdot; β_{11})$ on data from all sampled individuals. However, under independent and identically distributed (i.i.d.) conditions and some regularity conditions these processes converge almost surely to the ratio of the expectations of numerator and denominator terms, denoted here by $\bar{x} (\cdot, 0; β_{10}), \bar{x} (0, \cdot; β_{01})$ and $\bar{x} (\cdot, \cdot; β_{11})$ respectively. In fact, the convergence is at a sufficiently rapid rate as $n \to \infty$ that the centering processes can be replaced by these limits without altering the asymptotic distribution of $n^{- 1 / 2} U (β; τ_{1}, τ_{2}) .$ The central limit theorem can therefore be applied to $n^{- 1 / 2} U (β; t_{1}, t_{2}),$ for all $(t_{1}, t_{2}) \in [0, τ_{1}] \times [0, τ_{2}],$ to show weak convergence to a zero mean Gaussian process with covariance function

Σ (β; t_{1}, t_{2}) = E {(\begin{matrix} \int_{0}^{t_{1}} {X (s_{1}, 0) - \bar{x} (s_{1}, 0; β_{10})}^{'} L_{10} (d s_{1}, 0; β_{10}) \\ \int_{0}^{t_{2}} {X (0, s_{2}) - \bar{x} (0, s_{2}; β_{01})}^{'} L_{01} (0, d s_{2}, 0; β_{01}) \\ \int_{0}^{t_{1}} \int_{0}^{t_{2}} {X (s_{1}, s_{2}) - \bar{x} (s_{1}, s_{2}; β_{11})}^{'} L_{11} (d s_{1}, d s_{2}; β_{11}) \end{matrix})}^{\otimes 2},

where E denotes expectation, and $a^{\otimes 2} = a a^{'}$ for column vector a. For these developments $(τ_{1}, τ_{2})$ is required to be in the support of the observed follow-up times ${(S_{1 i}, S_{2 i}), i = 1, \dots, n} .$ A Taylor series expansion of $U (β; τ_{1}, τ_{2})$ about the ‘true’ β value then leads under regularity conditions to a mean zero asymptotic Gaussian distribution for $n^{1 / 2} (\hat{β} - β)$ with analytic variance estimator $I^{- 1} \hat{Σ} I^{- 1}$ of sandwich form, where I is the product of n⁻¹ and the negative of the matrix of partial derivatives of $U (β; τ_{1}, τ_{2})$ with respect to β evaluated at $\hat{β},$ and $\hat{Σ}$ is an empirical estimator of $Σ (β; τ_{1}, τ_{2})$ that can be written as

\hat{Σ} = n^{- 1} \sum_{i = 1}^{n} {(\begin{matrix} \int_{0}^{τ_{1}} {X_{i} (t_{1}, 0) - \bar{X} (t_{1}, 0; {\hat{β}}_{10})}^{'} {\hat{L}}_{10 i} (d t_{1}, 0; {\hat{β}}_{10}) \\ \int_{0}^{τ_{2}} {X_{i} (0, t_{2}) - \bar{X} (0, t_{2}; {\hat{β}}_{01})}^{'} {\hat{L}}_{01 i} (0, d t_{2}; {\hat{β}}_{01}) \\ \int_{0}^{τ_{1}} \int_{0}^{τ_{2}} {X_{i} (t_{1}, t_{2}) - \bar{X} (t_{1}, t_{2}; {\hat{β}}_{11})}^{'} {\hat{L}}_{11 i} (d t_{1}, d t_{2}; {\hat{β}}_{11}) \end{matrix})}^{\otimes 2},

(5)

where ${\hat{L}}_{10}, {\hat{L}}_{01}$ and ${\hat{L}}_{11}$ are obtained from L₁₀, L₀₁ and L₁₁, respectively, by evaluating at $\hat{β}$ and at empirical estimators of the baseline hazard rates $Λ_{10}, Λ_{01}$ and $Λ_{11}$ in (1)–(3). Natural empirical estimators ${\hat{Λ}}_{10}, {\hat{Λ}}_{01}$ and ${\hat{Λ}}_{11}$ of these baseline rates have Aalen–Breslow form, and are given by

{\hat{Λ}}_{10} (t_{1}, 0) = \int_{0}^{t_{1}} \sum_{i = 1}^{n} N_{1 i} (d s_{1}) / \sum_{i = 1}^{n} Y_{1 i} (s_{1}) \exp {X_{i} (s_{1}, 0) {\hat{β}}_{10}},

(6)

{\hat{Λ}}_{01} (0, t_{2}) = \int_{0}^{t_{2}} \sum_{i = 1}^{n} N_{2 i} (d s_{2}) / \sum_{i = 1}^{n} Y_{2 i} (s_{2}) \exp {X_{i} (0, s_{2}) {\hat{β}}_{01}}, and

(7)

{\hat{Λ}}_{11} (t_{1}, t_{2}) = \int_{0}^{t_{1}} \int_{0}^{t_{2}} \sum_{i = 1}^{n} N_{1 i} (d s_{1}) N_{2 i} (d s_{2}) / \sum_{i = 1}^{n} Y_{1 i} (s_{1}) Y_{2 i} (s_{2}) \exp {X_{i} (s_{1}, s_{2}) {\hat{β}}_{11}} .

(8)

Empirical process methods can be used to show the weak convergence of $n^{1 / 2} ({\hat{Λ}}_{10} - Λ_{10}), n^{1 / 2} ({\hat{Λ}}_{01} - Λ_{01}), n^{1 / 2} ({\hat{Λ}}_{11} - Λ_{11})$ jointly as n → ∞ to a zero mean Gaussian process under (1)–(3). In fact an empirical covariance function estimator for these parameter estimates can be developed. These asymptotic developments follow from modest extensions of the work of Spiekerman and Lin (1998), and Lin et al. (2000). Some detail on these developments and related conditions is given in the Appendix in the more general context of §5.

The marginal single failure hazard rate models (1) and (2) impose constraints on the double failure hazard rate model (3) and visa versa, so that (1) and (2) typically will not be fully consistent with (3) in their respective hazard rate regression components. The solutions to (1)–(3), and the estimators of the baseline rates (6)–(8) may incorporate some asymptotic bias under departure from one or more of the regression models (1)–(3). However, through the time-varying features of the modeled regression variables in these models, and even further through the use of both time-varying regression variates and time-varying baseline hazard rate stratification, the latter of which can be readily incorporated by replacing (4) by corresponding summations over a fixed number of possibly time-dependent strata, one has the tools to arrange for each of (1), (2) and (3) to provide a suitable fit to available data. Having done so one can expect that estimators of joint survival probabilities and related statistics, for example those that assess the strength of dependency between T₁ and T₂ given Z, to be estimated with little bias. This topic too will be elaborated below. Also, departure from any one, or two, of the hazard rate models (1)–(3) does not adversely affect asymptotic distributional results for estimators of parameters in the remaining hazard rate models.

2.2. Simulation evaluations

Continuous failure times given a single binary covariate z that takes values 0 or 1 with probability 0.5 were generated under the rather specialized Clayton–Oakes regression model (Clayton, 1978; Oakes, 1986) of the form

F (t_{1}, t_{2}; z) = {F_{0} {(t_{1}, 0)}^{- θ} + F_{0} {(0, t_{2})}^{- θ} - 1}^{- e^{z γ} / θ}

(9)

for values θ ≥ 0, where F₀ denotes the survival function at z = z(0, 0) = 0. The resulting failure time variates have marginal single failure hazard rates of the form

Λ_{10} (d t_{1}, 0; z) = Λ_{10} (d t_{1}, 0) e^{z γ} and Λ_{01} (0, d t_{2}; z) = Λ_{01} (0, d t_{2}) e^{z γ},

and double failure hazard rates

Λ_{11} (d t_{1}, d t_{2}; z) = Λ_{11} (d t_{1}, d t_{2}) e^{z γ} (e^{z γ} + θ) / (1 + θ),

so that (1)–(3) is obtained for a binary covariate z with x(t₁, 0) = x(0, t₂) = x(t₁, t₂) = z, with $x (t_{1}, 0) = x (0, t_{2}) = x (t_{1}, t_{2}) = z,$ with $β_{10} = β_{01} = γ,$ and with $β_{11} = \log {e^{γ} (e^{γ} + θ) / (1 + θ)} .$ Data were generated with unit exponential marginals at z = 0, and censoring times that were independent of each other and equally and exponentially distributed with censoring hazard rate, c, chosen to give certain specified uncensored failure fractions for T₁ and T₂, or with no censoring (c = 0). Covariate values were generated with probabilities 0.5 for z = 0 and z = 1.

In implementing (4) $τ_{1}$ and $τ_{2}$ were specified as the maximal values of S_1i and S_2i respectively in the sample of size n. Table 1 shows sample means and sample standard deviations for $({\hat{β}}_{10}, {\hat{β}}_{01}, {\hat{β}}_{11})$ at $(θ, γ)$ values of (2, 0) and (2, log 2) at n = 250 or 500 with substantial censoring (c = 5), and at n = 100 with no censoring, based on 1000 simulations at each configuration. These simulations show little evidence of regression parameter bias, even though there are, for example, only about 13.5 expected double failures at $(θ, γ) = (2, 0)$ and n = 250, with substantial censoring (c = 5) for each of T₁ and T₂. Also there is good agreement generally between the sample standard deviation for the regression parameter estimates and the average of the standard deviation estimators from the sandwich variance formula, as well as good proximity to 95% for the associated asymptotic 95% confidence interval coverage. An exception occurs at $(θ, γ) = (2, 0)$ and n = 250, where the sample standard deviation for ${\hat{β}}_{11}$ is considerably larger than the average of sandwich standard deviations, presumably reflecting a distribution with heavier tails than the approximating asymptotic normal distribution. The approximation, however, seems adequate at n = 500, where the expected number of double failures is about 27. Note that the marginal double failure hazard rate from (9) has a very specialized form, which typically does not agree with the semiparametric model (3) if z is not binary.

Table 1:

Simulation summary statistics for marginal single failure hazard ratio parameters β₁₀ and β₀₁ and for double failure hazard ratio parameter β₁₁, and for related estimators from sandwich-form variance estimators, under the Clayton–Oakes model (9) with β₁₀ = β₀₁ = $γ$ and a binary covariate z. This model implies marginal hazard ratio (for z = 1 versus z = 0) of $e^{γ}$ for both T₁ and T₂, and double failure hazard ratio of $e^{γ} (e^{γ} + θ) / (1 + θ) .$

		$(θ = 2, γ = 0)$
Sample size (n)		250				500				100

T₁ and T₂ failing %		16.7				16.7				100
Double failure %		5.3				5.3				100
	True	Sample^a	Sample^b	Sandwich^b	95% CI^b	Sample	Sample	Sandwich	95% CI	Sample	Sample	Sandwich	95% CI
	True	Mean	SD	SD	Coverage	Mean	SD	SD	Coverage	Mean^a	SD	SD	Coverage
β ₁₀	0	0.003	0.315	0.316	0.953	0.006	0.227	0.222	0.951	−0.001	0.213	0.203	0.933
β ₀₁	0	0.007	0.327	0.317	0.948	−0.008	0.224	0.221	0.952	−0.005	0.216	0.203	0.943
β ₁₁	0	−0.001	1.222	0.606	0.961	−0.007	0.440	0.420	0.949	−0.003	0.282	0.263	0.928
		$(θ = 2, γ = log 2)$
Sample size (n)		250				500				100

T₁ and T₂ failing %		22.6				22.6				100
Double failure %		8.2				8.2				100
β ₁₀	0.693	0.706	0.281	0.281	0.966	0.705	0.203	0.198	0.941	0.696	0.222	0.212	0.942
β ₀₁	0.693	0.707	0.287	0.282	0.959	0.697	0.199	0.197	0.955	0.696	0.223	0.212	0.937
β ₁₁	0.981	1.056	0.798	0.525	0.956	1.004	0.381	0.364	0.951	0.986	0.315	0.288	0.929

Open in a new tab

Based on 1000 simulations at each sampling configuration.

Abbreviations: Sample SD is sample standard deviation of the regression parameter estimates; Sandwich SD is the corresponding average of standard deviation estimates derived from the sandwich form estimator of variance for $\hat{β},$ and 95% CI coverage is the fraction of simulated samples from which the approximate 95% confidence interval, formed from $\hat{β}$ and its sandwich-form variance estimator, includes the true β.

Table 2 shows corresponding summary statistics for the cumulative double failure hazard rate estimator ${\hat{Λ}}_{11} (t_{1}, t_{2}; z),$ under (8) and the second Table 1 configuration $(θ = 2, γ = \log 2),$ at both z = 0 and z = 1. One can show the targeted double failure hazard rate to be

Λ_{11} (t_{1}, t_{2}; z) = e^{z γ} (e^{z γ} + θ) {t_{1} / θ + t_{2} / θ - \log (e^{t_{1} θ} + e^{t_{2} θ} - 1) / θ^{2}}

under the simulation conditions of this subsection. Estimated ${\hat{Λ}}_{11}$ values are reasonably accurate under the configurations shown, as was also the case at some smaller sample sizes (e.g. n = 100 with no censoring). Empirical approximations to asymptotic standard deviation estimates are somewhat low in heavy censoring scenarios, especially close to the coordinate axes or toward distributional tails, and corresponding confidence interval coverage rates tend to be less than nominal levels. These features derive from few preceding double failures close to the axes, and from empty double failure risk sets for some samples toward distributional tails. Hence fairly large sample sizes may be needed for these asymptotic approximations to the distribution of ${\hat{Λ}}_{11}$ to be accurate.

Table 2:

Simulation summary statistics^* for double failure hazard estimator ${\hat{Λ}}_{11}$ at selected marginal survival probabilities given z, under the Clayton-Oakes model (9) with β₁₀ = β₀₁ = γ = log 2 and dependency parameter $θ = 2,$ at both z = 0 and z = 1.

Sample size (n)		500				1000				250
T_i and T₂ Failing %		22.6				22.6				100
		${\hat{Λ}}_{11}$				${\hat{Λ}}_{11}$				${\hat{Λ}}_{11}$
(T₁, T₂) Marginal	$Λ_{11}$	Sample	Sample	Empirical	95% CI	Sample	Sample	Empirical	95% CI	Sample	Sample	Empirical	95% CI
Survival Rates		Mean	SD	SD	Coverage	Mean	SD	SD	Coverage	Mean	SD	SD	Coverage
	z = 0
(0.85, 0.85)	0.060	0.060	0.021	0.020	0.916	0.060	0.015	0.014	0.921	0.060	0.016	0.015	0.925
(0.85, 0.70)	0.114	0.115	0.041	0.039	0.910	0.115	0.029	0.028	0.934	0.113	0.025	0.024	0.925
(0.85, 0.55)	0.161	0.159	0.070	0.060	0.855	0.163	0.048	0.045	0.898	0.159	0.033	0.032	0.916
(0.70, 0.70)	0.226	0.226	0.083	0.076	0.900	0.226	0.055	0.055	0.927	0.224	0.041	0.040	0.932
(0.70, 0.55)	0.330	0.326	0.151	0.115	0.848	0.331	0.103	0.090	0.901	0.327	0.058	0.055	0.917
(0.55, 0.55)	0.500	0.483	0.274	0.162	0.750	0.497	0.198	0.139	0.843	0.499	0.082	0.078	0.935
	z = 1
(0.85, 0.85)	0.046	0.044	0.016	0.015	0.903	0.045	0.011	0.011	0.927	0.046	0.017	0.017	0.928
(0.85, 0.70)	0.092	0.090	0.027	0.027	0.900	0.091	0.020	0.019	0.928	0.092	0.026	0.026	0.946
(0.85, 0.55)	0.140	0.138	0.043	0.042	0.876	0.139	0.031	0.030	0.921	0.140	0.034	0.035	0.947
(0.70, 0.70)	0.189	0.187	0.050	0.049	0.900	0.186	0.034	0.034	0.926	0.188	0.041	0.041	0.937
(0.70, 0.55)	0.290	0.290	0.082	0.076	0.906	0.287	0.055	0.054	0.927	0.288	0.054	0.054	0.943
(0.55, 0.55)	0.453	0.452	0.129	0.120	0.880	0.447	0.089	0.085	0.924	0.449	0.076	0.074	0.937

Open in a new tab

Sample mean and standard deviation (SD) based on 1000 simulations at each sample configuration. Empirical SD is the average of SD estimates based on the empirical variance estimator for ${\hat{Λ}}_{11}$ , and 95% Cl coverage is the fraction of the 1000 simulated samples where the asymptotic confidence interval using the empirical SD includes the true $Λ_{11}$ value.

3. Bivariate Survival Function and Dependency Function Estimation

3.1. Bivariate survival function estimation

Given specifications, such as (1)–(3), for marginal single and double failure hazard rate processes one can define a bivariate process F given Z for all t₁ ≥ 0 and t₂ > 0 by the product integrals

F {t_{1}, 0; Z (t_{1}, 0)} = \prod_{0}^{t_{1}} [1 - Λ_{10} {d s_{1}, 0; Z (s_{1}, 0)}], and F {0, t_{2}; Z (0, t_{2})} = \prod_{0}^{t_{2}} [1 - Λ_{01} {0, d s_{2}; Z (0, s_{2})}],

along the coordinate axes. Away from these axes F given Z is defined by the inhomogeneous Volterra integral equation

F {t_{1}, t_{2}; Z (t_{1}, t_{2})} = F {t_{1}, 0; Z (t_{1}, 0)} + F {0, t_{2}; Z (0, t_{2})} - 1 + \int_{0}^{t_{1}} \int_{0}^{t_{2}} F {s_{1}^{-}, s_{2}^{-}; Z (s_{1}, s_{2})} Λ_{11} {d s_{1}; d s_{2}; Z (s_{1}, s_{2})},

(10)

that has a unique Péano series solution given by

F {t_{1}, t_{2}; Z (t_{1}, t_{2})} = ψ {t_{1}, t_{2}; Z (t_{1}, 0), Z (0, t_{2})} + \sum_{j = 1}^{\infty} \int_{0}^{t_{1}} \int_{s_{11}}^{t_{1}} \dots \int_{s_{1, j - 1}}^{t_{1}} \int_{0}^{t_{2}} \int_{s_{21}}^{t_{2}} \dots \int_{s_{2, j - 1}}^{t_{2}} ψ {s_{11}^{-}, s_{21}^{-}; Z (s_{11}, s_{21})} \prod_{m = 1}^{j} Λ_{11} {d s_{1 m}, d s_{2 m}; Z (s_{1 m}, s_{2 m})},

(11)

where $ψ {t_{1}, t_{2}; Z (t_{1}, 0), Z (0, t_{2})} = F {t_{1}, 0; Z (t_{1}, 0)} + F {0, t_{2}; Z (0, t_{2})} - 1.$ Note that F given Z will have a survival function interpretation if Z is composed only of the baseline covariate data, z(0, 0), and evolving covariates that are external to the failure processes (e.g. Kalbfleisch and Prentice, 2002, chapter 6).

Now denote the uncensored T₁ failures in the sample of a size n by t₁₁, t₁₂,..., t_1I, and the uncensored T₂ failures by t₂₁,t₂₂,..., t_2J. The semiparametric model estimators ${\hat{Λ}}_{10} {d t_{1}, 0; Z (t_{1}, 0)}, {\hat{Λ}}_{01} {0, d t_{2}; Z (0, t_{2})}$ and ${\hat{Λ}}_{11} {d t_{1}, d t_{2}; Z (t_{1}, t_{2})}$ place mass only at the uncensored data grid points (t_1i, t_2j) within the risk region $[R = {(t_{1}, t_{2}; S_{1 ℓ} \geq t_{1}, S_{2 ℓ} \geq t_{2},$ for some $ℓ \in (1, \dots, n)}]$ of the data, and along the half-lines through the uncensored failure times in either direction beyond the risk region. One can readily estimate F given Z at specified covariate history Z, using a simple recursive procedure as follows: At uncensored failure time grid point $(t_{1 i}, t_{2 j}) \in R$ one can define

\hat{F} {Δ t_{1 i}, Δ t_{2 j}; Z (t_{1 i}, t_{2 j})} = \hat{F} {t_{1 i}^{-}, t_{2 j}^{-}; Z (t_{1 i}, t_{2 j})} {\hat{Λ}}_{11} {Δ t_{1 i}, Δ t_{2 j}; Z (t_{1 i}, t_{2 j})}

where ${\hat{Λ}}_{11} {Δ t_{1 i}, Δ t_{2 j}; Z (t_{1 i}, t_{2 j})} = {\hat{Λ}}_{11} (Δ t_{1 i}, Δ t_{2 j}) \exp {X (t_{1 i}, t_{2 j}) {\hat{β}}_{11}},$ from which

\hat{F} {t_{1 i}, t_{2 j}; Z (t_{1 i}, t_{2 j})} = \hat{F} {t_{1 i}, t_{2 j}^{-}; Z (t_{1 i}, t_{2 j})} + \hat{F} {t_{1 i}^{-}, t_{2 j}; Z (t_{1 i}, t_{2 j})} - \hat{F} {t_{1 i}^{-}, t_{2 j}^{-}; Z (t_{1 i}, t_{2 j})} [1 - {\hat{Λ}}_{11} {Δ t_{1 i}, Δ t_{2 j}; Z (t_{1 i}, t_{2 j})}] .

(12)

This expression provides a procedure for calculating $\hat{F} {t_{1 i}, t_{2 j}; Z (t_{1 i}, t_{2 j})}$ for a specified covariate history Z, starting with

\hat{F} {t_{1 i}, 0; Z (t_{1 i}, 0)} = \prod_{ℓ = 1}^{i} [1 - {\hat{Λ}}_{10} (Δ t_{1 ℓ}, 0) e^{X (t_{1 ℓ}, 0) {\hat{β}}_{10}}], and \hat{F} {0, t_{2 j}; Z (0, t_{2 j})} = \prod_{m = 1}^{j} [1 - {\hat{Λ}}_{01} (0, Δ t_{2 m}) e^{X (0, t_{2 m}) {\hat{β}}_{01}}] .

(13)

Under (1)–(3) $\hat{F}$ given Z will generally provide a strongly consistent estimator of F given Z, and $n^{1 / 2} (\hat{F} - F)$ given Z will converge as $n \to \infty$ to a zero mean Gaussian process, on the basis of the properties of the univariate Cox model estimators (13) and the continuity and weakly continuous compact differentiability of the Péano series transformation (Gill et al., 1995) from these marginal estimators and ${\hat{Λ}}_{11}$ to $\hat{F},$ given Z. As such (12) and (13) provide a rather flexible regression generalization, with fixed and external covariates, of a Volterra bivariate survival function estimator that has been attributed to Peter Bickel (Dabrowska, 1988).

3.2. Dependency function estimation

One use of the estimator $\hat{F}$ given Z is for assessing dependency between the two failure time variates given Z. If F given Z has a survival function interpretation one can define, building on the work of Fan et al. (2000), an average cross ratio function estimator $\hat{C} {t_{1}, t_{2}; Z (t_{1}, t_{2})}$ over $[0, τ_{1}] \times [0, τ_{2}],$ where $(τ_{1}, τ_{2})$ is in the risk region of the data, by

\hat{C} {t_{1}, t_{2}; Z (t_{1}, t_{2})} = \int_{0}^{t_{1}} \int_{0}^{t_{2}} \hat{F} {s_{1}^{-}, s_{2}^{-}; Z (s_{1}, s_{2})} {\hat{Λ}}_{11} {Δ s_{1}, Δ s_{2}; Z (s_{1}, s_{2})} / \int_{0}^{t_{1}} \int_{0}^{t_{2}} \hat{F} {s_{1}^{-}, s_{2}^{-}; Z (s_{1}, s_{2})} {\hat{Λ}}_{10} {Δ s_{1}, s_{2}^{-}; Z (s_{1}, s_{2})} {\hat{Λ}}_{01} {s_{1}^{-}, Δ s_{2}; Z (s_{1}, s_{2})},

(14)

which contrasts the double failure rate ${\hat{Λ}}_{11}$ with the corresponding local independence value ${\hat{Λ}}_{10} {\hat{Λ}}_{01}$ given Z, where

{\hat{Λ}}_{10} {Δ s_{1}, s_{2}^{-}; Z (s_{1}, s_{2})} = - \hat{F} {Δ s_{1}, s_{2}^{-}; Z (s_{1}, s_{2})} / \hat{F} {s_{1}^{-}, s_{2}^{-}; Z (s_{1}, s_{2})} and {\hat{Λ}}_{01} {s_{1}^{-}, Δ s_{2}; Z (s_{1}, s_{2})} = - \hat{F} {s_{1}^{-}, Δ s_{2}; Z (s_{1}, s_{2})} / \hat{F} {s_{1}^{-}, s_{2}^{-}; Z (s_{1}, s_{2})},

with weight function $\hat{F} {s_{1}^{-}, s_{2}^{-}; Z (s_{1}, s_{2})}$ that depends on the failure rates, but not the censoring rates, given Z.

Similarly, one can define, following Oakes (1989) and Fan et al. (2000), an average concordance function estimator between T₁ and T₂ over $[0, τ_{1}] \times [0, τ_{2}]$ given Z, that takes values in (−1, 1), by

\hat{T} {t_{1}, t_{2}; Z (t_{1}, t_{2})} = [\int_{0}^{t_{1}} \int_{0}^{t_{2}} \hat{F} {s_{1}^{-}, s_{2}^{-}; Z (s_{1}, s_{2})} \hat{F} {Δ s_{1}, Δ s_{2}; Z (s_{1}, s_{2})} - \int_{0}^{t_{1}} \int_{0}^{t_{2}} \hat{F} {Δ s_{1}, s_{2}^{-}; Z (s_{1}, s_{2})} \hat{F} {s_{1}^{-}, Δ s_{2}; Z (s_{1}, s_{2})}] / [\int_{0}^{t_{1}} \int_{0}^{t_{2}} \hat{F} {s_{1}^{-}, s_{2}^{-}; Z (s_{1}, s_{2})} \hat{F} {Δ s_{1}, Δ s_{2}; Z (s_{1}, s_{2})} + \int_{0}^{t_{1}} \int_{0}^{t_{2}} \hat{F} {Δ s_{1}, s_{2}^{-}; Z (s_{1}, s_{2})} \hat{F} {s_{1}^{-}, Δ s_{2}; Z (s_{1}, s_{2})}] .

(15)

These estimators quite generally inherit strong consistency, and weak Gaussian convergence properties from these same properties for $\hat{F}$ given Z and the continuity and compact differentiability of the transformations from $\hat{F}$ to $\hat{C}$ and from $\hat{F}$ to $\hat{T} .$ .

3.3. Confidence interval and confidence band estimation

The asymptotic properties just stated for survival and dependency function estimators conceptually generate corresponding analytic variance function estimators using the delta function method. However the transformations from marginal single and double hazard rate estimators to the bivariate survival function using (11), and the transformations from the survival function to the average cross ratio and concordance estimators (14) and (15) may be too complex for the delta function approach to be useful. Accordingly we employ a bootstrap resampling approach to estimate confidence intervals and bands for these functions, as well as to estimate confidence bands for marginal single and double failure cumulative hazard functions. The applicability of bootstrap procedures follows from the asymptotic Gaussian properties already cited for regression parameter and baseline hazard function estimators in (10)–(13), and the weakly continuous compact differentiabilty of the Péano series survival function transformation (11) (Gill et al., 1995) and of the transformations (14) and (15) (Fan et al., 2000).

3.4. Simulation evaluation of survival and dependency function estimators

In the special case where all regression parameters in (1)–(3) take value zero, $\hat{F}$ given Z from (12) and (13) is the previously mentioned Volterra estimator. While nonparametric plug-in estimators of the bivariate survival function due to Dabrowska (1988) and Prentice and Cai (1992) have been shown (Gill et al., 1995) to be nonparametric efficient under the compete independence of (T₁, T₂, C₁, C₂), this property evidently does not hold for the Volterra estimator. On that basis it has been speculated that the Volterra estimator may be ‘much inferior’ to these other estimators (Gill et al., 1995). To examine this topic further we conducted simulations under (9) with $γ = 0,$ so that β₁₀ = β₀₁ = β₁₁ = 0 with no regression variable influences. As previously T₁ and T₂ were specified as the maximal observed S_1i and S_2i values respectively in the generated sample.

Table 3 shows summary statistics evaluating the Volterra estimator and comparing it to the Dabrowska estimator, which is also simply calculated recursively using

\hat{F} (t_{1 i}, t_{2 j}) = \frac{\hat{F} (t_{1 i}^{-}, t_{2 j}) \hat{F} (t_{1 i}, t_{2 j}^{-})}{\hat{F} (t_{1 i}^{-}, t_{2 j}^{-})} {\frac{d_{i j}^{00} r_{i j}}{(d_{i j}^{10} + d_{i j}^{00}) (d_{i j}^{01} + d_{i j}^{00})}}

at all grid points where the denominator components in the factor in curly brackets are positive, and $\hat{F} = 0$ otherwise, again starting with KM marginal survival function estimators. In this expression ‘ $d_{i j}^{10}, d_{i j}^{01}$ and $d_{i j}^{00}$ ’ are the numbers of observations known to have ‘ $T_{1} = t_{1 i}$ and $T_{2} > t_{i j}$ ’; ‘ $T_{1} > t_{1 i}$ and $T_{2} = t_{2 j}$ ’; and ‘ $T_{1} > t_{1 i}, T_{2} > t_{2 j}$ ’ respectively, among the r_ij individuals at risk at uncensored failure time grid point (t_1i, t_2j). From Table 3 one can see that both the Volterra and Dabrowska estimators are quite accurate under the specified sampling configurations. The two estimators also appear to have similar corresponding moderate sample efficiencies, even at the complete independence of (T₁, T₂, C₁, C₂), where the Dabrowska estimator is nonparametric efficient. Note that, in contrast to the Dabrowska estimator, the Volterra estimator does not assign negative mass within the risk region of the data. However, it tends to assign more negative mass than does the Dabrowska estimator, to half-lines beyond the risk region. Overall, these simulations provide little basis for choosing between the Volterra and Dabrowska nonparametric estimators of the bivariate survivor function.

Table 3:

Sample means and standard deviations for Volterra (V) and Dabrowska (D) estimators of the bivariate survival function under a Clayton–Oakes model (9) with γ = 0 with unit exponential marginal distributions and dependency parameter θ. Censoring times were also exponentially distributed and independent of each other with censoring rate c taking values 0 (no censoring), 2 (moderate censoring), or 5 (heavy censoring) at various sample sizes (n), based on 1000 simulations at each configuration

Marginal Survival		Joint Survival	Volterra	n = 100	n = 100	n = 100	n = 250	n = 500
Probabilities		Probability	Dabrowska	c = 0	c = 2	c = 5	c = 5	c = 5
				$θ = 0$ (independence)
0.85	0.85	0.723	V	0.723(0.045)	0.723(0.050)	0.723(0.045)	0.722(0.036)	0.722(0.018)
			D	0.723(0.045)	0.723(0.050)	0.723(0.045)	0.722(0.036)	0.722(0.018)
0.85	0.70	0.595	V	0.594(0.048)	0.595(0.060)	0.594(0.048)	0.593(0.054)	0.595(0.026)
			D	0.594(0.048)	0.594(0.060)	0.594(0.048)	0.593(0.053)	0.595(0.026)
0.85	0.55	0.468	V	0.467(0.050)	0.468(0.072)	0.467(0.050)	0.465(0.089)	0.468(0.043)
			D	0.467(0.050)	0.467(0.071)	0.467(0.050)	0.465(0.095)	0.468(0.041)
0.70	0.70	0.490	V	0.491(0.050)	0.491(0.067)	0.491(0.050)	0.492(0.069)	0.489(0.050)
			D	0.491(0.050)	0.490(0.066)	0.491(0.050)	0.431(0.069)	0.488(0.047)
0.70	0.55	0.385	V	0.385(0.048)	0.386(0.078)	0.385(0.048)	0.394(0.102)	0.381(0.075)
			D	0.385(0.048)	0.385(0.073)	0.385(0.048)	0.387(0.123)	0.3880(0.086)
0.55	0.55	0.303	V	0.302(0.044)	0.306(0.090)	0.302(0.044)	0.311(0.122)	0.298(0.095)
			D	0.302(0.044)	0.304(0.085)	0.302(0.044)	0.316(0.142)	0.303(0.117)
				$θ = 2$ (cross ratio= 3)
0.85	0.85	0.752	V	0.752(0.043)	0.752(0.047)	0.753(0.054)	0.752(0.034)	0.752(0.025)
			D	0.752(0.043)	0.752(0.047)	0.753(0.054)	0.752(0.034)	0.752(0.025)
0.85	0.70	0.642	V	0.641(0.048)	0.641(0.058)	0.642(0.084)	0.641(0.053)	0.642(0.037)
			D	0.641(0.048)	0.641(0.058)	0.642(0.087)	0.641(0.052)	0.642(0.036)
0.85	0.55	0.521	V	0.519(0.050)	0.519(0.072)	0.529(0.135)	0.520(0.087)	0.520(0.064)
			D	0.519(0.050)	0.519(0.070)	0.522(0.156)	0.519(0.090)	0.520(0.063)
0.70	0.70	0.570	V	0.569(0.049)	0.570(0.064)	0.572(0.112)	0.565(0.070)	0.571(0.050)
			D	0.569(0.049)	0.570(0.062)	0.574(0.128)	0.567(0.070)	0.571(0.046)
0.70	0.55	0.480	V	0.478(0.049)	0.480(0.078)	0.493(0.150)	0.483(0.108)	0.479(0.086)
			D	0.478(0.049)	0.480(0.071)	0.480(0.183)	0.480(0.136)	0.480(0.084)
0.55	0.55	0.422	V	0.422(0.048)	0.426(0.095)	0.429(0.171)	0.416(0.138)	0.412(0.111)
			D	0.422(0.048)	0.427(0.086)	0.427(0.196)	0.421(0.160)	0.422(0.118)

Open in a new tab

Table 4 shows summary statistics for $\hat{F}$ at various follow-up times (t₁,t₂) under (9) and a specific Table 1 configuration $(θ = 2, γ = \log 2) .$ The survival function estimators $\hat{F}$ do not show evidence of bias under these simulation conditions, similar to what was observed for smaller sample sizes (e.g., n = 100 with no censoring). One could apply the bootstrap procedure to some transformation of $\hat{F}$ , such as log $\hat{F}$ , but we applied it directly to $\hat{F}$ in these simulations. Note the good correspondence between sample standard deviation (SD) based on 1000 generated samples at each configuration and the corresponding average of bootstrap SD estimates, based on 200 bootstrap replicates for each generated sample. Also asymptotic 95% confidence interval coverage rates, based on $\hat{F} (t_{1}, t_{2}) \pm 1.96$ (bootstrap SD), are close to the nominal levels throughout Table 4.

Table 4:

Simulation summary statistics^a for confidence intervals (CIs) for bivariate survival probability F at selected marginal survival probabilities under the Clayton–Oakes model (9) with $β_{10} = β_{01} = γ =$ log 2 and dependency parameter $θ = 2,$ at both z = 0 and z = 1.

Sample size (n)		500				1000				250
T₁ and T₂ Failing %		22.6				22.6				100
		$\hat{F}$				$\hat{F}$				$\hat{F}$
(T₁, T₂) Marginal		Sample	Sample	Bootstrap	95% CI	Sample	Sample	Bootstrap	95% CI	Sample	Sample	Empirical	95% CI
Survival Rates	F	Mean	SD	SD	Coverage	Mean	SD	SD	Coverage	Mean	SD	SD	Coverage
	z = 0
(0.85, 0.85)	0.752	0.752	0.031	0.030	0.933	0.752	0.021	0.021	0.943	0.754	0.028	0.028	0.936
(0.85, 0.70)	0.642	0.644	0.046	0.044	0.931	0.644	0.030	0.031	0.950	0.644	0.035	0.034	0.940
(0.85, 0.55)	0.521	0.526	0.070	0.071	0.945	0.523	0.048	0.048	0.948	0.522	0.040	0.038	0.930
(0.70, 0.70)	0.570	0.570	0.057	0.059	0.960	0.571	0.040	0.040	0.961	0.571	0.037	0.036	0.933
(0.70, 0.55)	0.480	0.482	0.089	0.089	0.967	0.482	0.062	0.062	0.966	0.481	0.039	0.038	0.935
(0.55, 0.55)	0.422	0.417	0.142	0.123	0.951	0.423	0.0918	0.091	0.953	0.423	0.040	0.039	0.935
	z = 1
(0.85, 0.85)	0.739	0.739	0.027	0.028	0.951	0.739	0.020	0.020	0.938	0.739	0.032	0.034	0.958
(0.85, 0.70)	0.623	0.623	0.036	0.036	0.943	0.623	0.025	0.025	0.951	0.624	0.037	0.038	0.960
(0.85, 0.55)	0.501	0.501	0.045	0.046	0.953	0.503	0.032	0.032	0.953	0.503	0.039	0.040	0.956
(0.70, 0.70)	0.538	0.538	0.041	0.042	0.953	0.538	0.029	0.029	0.949	0.540	0.039	0.040	0.952
(0.70, 0.55)	0.445	0.445	0.050	0.051	0.959	0.445	0.035	0.035	0.953	0.447	0.040	0.040	0.957
(0.55, 0.55)	0.379	0.378	0.063	0.064	0.965	0.379	0.043	0.044	0.957	0.380	0.040	0.040	0.947

Open in a new tab

Sample mean and standard deviation (SD) based on 1000 simulations at each sample configuration. Bootstrap SD is the SD estimates from averaging the sample variances for $\hat{F}$ based on 200 bootstrap replicates for each generated sample, and 95% CI coverage is the fraction of the 1000 simulated samples where the asymptotic confidence interval using the bootstrap SD includes the true F value.

Supplementary Table 1 compares analytic and bootstrap SD estimators for ${\hat{Λ}}_{11},$ as well as corresponding 95% confidence interval coverage rates under the same generated samples, and at the same (t₁, t₂) values, as in Table 4. There appears to be good agreement between empirical (sandwich) estimator and bootstrap (200 replicates) standard deviation estimators. Confidence interval coverage rates are low under some configurations, but tend to be a little closer to nominal levels with the bootstrap than with the analytic SD estimators.

Table 5 gives confidence band performance statistics for both ${\hat{Λ}}_{11}$ and $\hat{F},$ over specified follow-up regions. These were developed by applying a supremum statistic over the confidence region without estimator transformation, for each estimator. Specifically, over a follow-up region [0, t₁] × [0, t₂] the statistics

W_{Λ_{11}} (t_{1}, t_{2}) = \sup_{[0, t_{1}] \times [0, t_{2}]} n^{\frac{1}{2}} | {\hat{Λ}}_{11} (s_{1}, s_{2}) - Λ_{11} (s_{1}, s_{2}) | and W_{F} (t_{1}, t_{2}) = \sup_{[0, t_{1}] \times [0, t_{2}]} n^{\frac{1}{2}} | \hat{F} (s_{1}, s_{2}) - F (s_{1}, s_{2}) |

are targeted at specified (t₁, t₂) values. Bootstrap estimates of these quantities are obtained, respectively, by calculating

W_{{\hat{Λ}}_{11} (t_{1}, t_{2})} = \sup_{[0, t_{1}] \times [0, t_{2}]} n^{\frac{1}{2}} | {\hat{Λ}}_{11}^{*} (s_{1}, s_{2}) - {\hat{Λ}}_{11} (s_{1}, s_{2}) | and W_{\hat{F}} (t_{1}, t_{2}) = \sup_{[0, t_{1}] \times [0, t_{2}]} n^{\frac{1}{2}} | {\hat{F}}^{*} (s_{1}, s_{2}) - \hat{F} (s_{1}, s_{2}) |

where ${\hat{Λ}}_{11}^{*}$ and ${\hat{F}}^{*}$ denote bootstrap replicate estimators derived using ${\hat{Λ}}_{11}$ and $\hat{F} .$ Critical values for an α-level (e.g. α = 0.95) confidence region can be estimated as the α percentiles ${\hat{C}}_{α} ({\hat{Λ}}_{11})$ and ${\hat{C}}_{α} (\hat{F})$ from the bootstrap replicate supremum statistics $W_{{\hat{Λ}}_{11}} (t_{1}, t_{2})$ and $W_{\hat{F}} (t_{1}, t_{2})$ respectively. Corresponding α-level confidence bands are then estimated for region [0, t₁] × 0, t₂] as

{\hat{Λ}}_{11} (t_{1}, t_{2}) \pm n^{- \frac{1}{2}} {\hat{C}}_{α} ({\hat{Λ}}_{11}) and \hat{F} (t_{1}, t_{2}) \pm n^{- \frac{1}{2}} {\hat{C}}_{α} (\hat{F})

respectively.

Table 5:

Simulation summary statistic^a for bootstrap-based rectangular confidence bands for $Λ_{11}$ and F at z = 0 and z = 1, using supremum statistics. Samples were generated under the Clayton–Oakes model (9) with $β_{10} = β_{01} = γ = \log 2$ and dependency parameter θ = 2, with z a binary variate taking values 0 and 1 with probability 0.5.

Sample size (n)		1000			250
T₁ and T₂ Failure%		22.6			100
		$Λ_{11}$
(T₁,T₂) Marginal		Mean Boot.	SD Boot.	95% CB	Mean Boot.	SD Boot.	95% CB
Survival Rates	$Λ_{11}$	$C_{0.95} ({\hat{Λ}}_{11})$	$C_{0.95} ({\hat{Λ}}_{11})$	Coverage	$C_{0.95} ({\hat{Λ}}_{11})$	$C_{0.95} ({\hat{Λ}}_{11})$	Coverage

		z = 0
(0.85, 0.85)^b	0.060	0.95	0.19	0.916	0.51	0.11	0.921
(0.70, 0.70)	0.226	3.61	0.92	0.917	1.36	0.23	0.934
(0.55, 0.55)	0.500	8.47	3.46	0.837	2.60	0.41	0.920
		z = 1
(0.85, 0.85)	0.046	0.75	0.14	0.915	0.51	0.11	0.921
(0.70, 0.70)	0.189	2.34	0.39	0.920	1.36	0.23	0.934
(0.55, 0.55)	0.453	5.75	1.28	0.923	2.60	0.41	0.920
		F
(T₁, T₂) Marginal		Mean Boot.	SD Boot.	95% CB	Mean Boot.	SD Boot.	95% CB
Survival Rates	F	$C_{0.95} (\hat{F})$	$C_{0.95} (\hat{F})$	Coverage	$C_{0.95} (\hat{F})$	$C_{0.95} (\hat{F})$	Coverage

		z = 0
(0.85, 0.85)	0.752	1.53	0.12	0.939	1.02	0.10	0.931
(0.70, 0.70)	0.570	2.89	0.26	0.961	1.38	0.10	0.937
(0.55, 0.55)	0.422	5.89	1.80	0.928	1.60	0.10	0.932
		z = 1
(0.85, 0.85)	0.739	1.53	0.11	0.942	1.29	0.12	0.952
(0.70, 0.70)	0.538	2.32	0.16	0.952	1.66	0.12	0.946
(0.55, 0.55)	0.379	3.34	0.31	0.947	1.83	0.12	0.942

Open in a new tab

Mean Boot $C_{0.95} ({\hat{Λ}}_{11})$ is the average of the 95th percentiles of $W_{{\hat{Λ}}_{11}} (t_{1}, t_{2})$ over the 1000 samples, each with 200 bootstrap samples, and SD Boot $C_{0.95} ({\hat{Λ}}_{11})$ is the standard deviation of these bootstrap percentiles; 95% CB Coverage is the fraction of bootstrap-based 95% confidence bands that include $Λ_{11} (s_{1}, s_{2})$ for all $(s_{1}, s_{2}) ε [0, t_{1}] \times [0, t_{2}] .$ The same quantities are presented for confidence bands for F.

Confidence region over $[0, - \log (0.85)] \times [0, - \log (0.85)] .$

The simulation summary statistics in Table 5 include bootstrap-based confidence regions for both $Λ_{11}$ and F over certain rectangular follow-up regions, using 200 bootstrap replicates for each generated sample, for each of the latter two configurations of Table 4. Note that the full set of uncensored data grid points for a generated sample was retained for all associated bootstrap samples in the calculation of $W_{{\hat{Λ}}_{11}}$ and $W_{\hat{F}}$ statistics. The sample mean and standard deviation of critical value estimates from the 1000 generated samples are shown. Summary statistics for 95% confidence bands for $Λ_{11}$ and F at both z = 0 and z = 1 are also shown in Table 5. Coverage rates tend to be somewhat low but, considering the size of the standard deviation for the bootstrap critical value estimates, these may improve if a larger number of bootstrap replicates are used.

Supplementary Table 2 provides simulation summary statistics for average cross ratio and average concordance estimators under the same simulation conditions as Table 5. Under these simulation conditions $\hat{C} (t_{1}, t_{2}; z)$ estimates $1 + θ e^{- z γ}$ and $\hat{T} (t_{1}, t_{2}; z)$ estimates $θ e^{- z γ} / (θ e^{- z γ} + 2)$ at any $t_{1} > 0$ and $t_{2} > 0$ . As shown in Supplementary Table 2 the cross ratio estimates with follow-up periods [0, t₁] × [0, t₂] tend to have small upward bias, and average concordance estimators have a small downward bias at these sample sizes, especially under the configuration with substantial censoring. These estimated biases derive in part from moderate sample size distributions that are somewhat skewed, and additional calculations show they can be reduced through simple transformation (e.g., apply asymptotic normal approximation to log $\hat{C},$ rather than to $\hat{C}$ ). Bootstrap procedures can again be used to estimate confidence intervals and bands for these dependency function estimators.

4. Composite Outcomes in a Low-fat Dietary Pattern Trial

The Women’s Health Initiative (WHI) includes a low-fat dietary pattern randomized controlled trial among 48,835 postmenopausal women (Women’s Health Initiative Study Group, 1998). Participating women were in the age range 50–79 at randomization at one of the 40 clinical centers in the US during 1993–1998. Forty percent of the participants were assigned to a low-fat dietary pattern intervention that included goals of reducing dietary fat to 20% of energy, as well as increasing vegetables and fruit to five servings a day and grains to six servings a day. The intervention was administered by nutritionists in groups of size 10–15, with 18 sessions in the first year of the intervention, and quarterly maintenance sessions thereafter over an intervention period that averaged 8.5 years, with subsequent continuing non-intervention follow-up. The other 60% of the participants were assigned to a comparison (control) group, with no dietary intervention. Comparison group women were provided written materials on diet and health only. Breast cancer incidence and colorectal cancer incidence were designated primary outcomes, while coronary heart disease incidence was designated as the secondary trial outcome. Various other clinical outcomes, including mortality from any cause were also ascertained, and used in trial monitoring and reporting.

Chlebowski et al. (2017) recently reported updated analyses of breast cancer incidence (T₁) and total mortality (T₂) from this dietary modification (DM) trial, for both the intervention, and a combined intervention and post-intervention, time periods. Cox models (1)–(3) were applied with $X (t_{1}, 0) = X (0, t_{2}) = X, (t_{1}, t_{2}) = z$ , where z is an indicator for intervention (z = 1) or comparison (z = 0) randomization assignment, and with baseline stratification on age (5-year intervals) and on randomization status in the companion WHI hormone therapy trials. The (T₁, T₂) failure times were censored by a common value C₁ = C₂ = C equal to the participants follow-up time at the end of the intervention period (3/31/05) or, for a small fraction of women, at the time of earlier loss to follow-up. Since deaths can only follow breast cancer incidence events (i.e. T₂ ≥ T₁) for a participant, an independent censoring assumption requires specifically that censoring rates for T₂ do not depend on the corresponding T₁ value, an appropriate assumption here since all death ascertainment procedures continued unchanged following a breast cancer diagnosis, including matching to the U.S. National Death Index.

At the end of the intervention period the breast cancer (T₁) hazard ratio (estimated 95% CI) was 0.92 (0.84,1.01), with logrank significance level of p = 0.09, with 671 and 1093 incident breast cancer cases in the intervention and comparison groups respectively. The corresponding values for all-cause mortality (T₂) were 0.98 (0.91, 1.06), with 989 and 1519 deaths in the respective groups. The composite outcome (T₁, T₂) of breast cancer followed by death from any cause had an estimated double failure hazard ratio (95% confidence interval) of 0.64 (0.44, 0.93), with 40 and 94 women experiencing the dual events in the two randomization groups, respectively, during the intervention period.

Note that the composite outcome analysis provides stronger evidence for an intervention benefit (logrank p = 0.02) than does the marginal analysis for either outcome separately, in spite of a much smaller number of cases. A corresponding univariate analysis of time from randomization to death attributed to breast cancer has estimated hazard ratio (95% confidence interval) of 0.67 (0.43, 1.06), with logrank p = 0.08. There were 27 and 61 deaths attributed to breast cancer in the two groups, respectively, during the intervention period. Another univariate analysis considers death, with classification of whether or not the death followed a breast cancer diagnosis, as a marked point process. This approach leads to a hazard ratio estimate (95% CI) of 0.65 (0.45,0.94), as reported in Chlebowski et al. (2017), which is nearly identical to the double failure hazard ratio estimate given above. In fact the corresponding estimating equations agree except for minor differences in the dual outcome risk set specifications at each death time following breast cancer. The double failure hazard rate model, however, brings potential to address additional questions such as whether the observed intervention influence is primarily through breast cancer incidence or through subsequent survival, and can do so in a manner that retains intention-to-treat interpretation for inferences. For example, suppose that the modeled regression variable in $Λ_{11}$ is extended to $X (t_{1}, t_{2}) = {z, z (t_{2} - t_{1})} .$ One then obtains ${\hat{β}}_{11} = (0.226, - 0.220),$ with corresponding standard deviation estimates of (0.364, 0.101) from the sandwich-form estimated variance matrix. This gives nominally significant evidence (p = 0.03) of a dual outcome hazard ratio that is reduced at larger time periods from breast cancer diagnosis to death. See Chlebowski et al. (2017) for more detailed analyses that also include breast tumor hormone receptor status, subgroup analyses, and longer-term non-intervention follow-up.

For completeness Table 6 shows the estimated survival probability for (T_l, T₂) at follow-up times of three, six, and nine years from randomization for each variate. For this purpose we dropped the baseline hazard rate stratification described above, so that survival function estimators at z = 0 and z = 1 correspond to the comparison and intervention groups as a whole. Corresponding bootstrap-based 95% confidence intervals and 95% supremum-type confidence bands are also shown, the latter from a rectangular follow-up region with from 0 to 9 years for each failure time variate. These were based on 200 bootstrap replicates with asymptotic approximations applied to F, without transformation. Confidence bands are presented only at follow-up grid points {3, 6, 9} × {3, 6, 9} in years. As expected the confidence bands are somewhat wider than corresponding confidence intervals at these follow-up times, especially at short follow-up times. Supplementary Table 3 provides corresponding estimators, bootstrap-based confidence intervals and confidence bands for $Λ_{11}$ using the same bootstrap replicates. Since T₂ ≥ T_l, these are only of interest on or above the main diagonal.

Table 6:

Joint survival probability estimators $(\hat{F})$ for breast cancer incidence and total mortality in the Women’s Health Initiative Dietary Modification Trial (n = 48,835, with 1764 women with incident breast cancers, 2508 deaths, and 134 women with both breast cancer and death during the 8.5 year average trial intervention period).

Follow-up Years for Breast Cancer (T₁)		Comparison Group (z = 0)			Intervention Group (z = 1)
		Follow-up Years for Mortality (T₂)			Follow-up Years for Mortality (T₂)
		3	6	9	3	6	9

3	$\hat{F}$	0.979	0.961	0.936	0.980	0.962	0.937
	95% CI^a	(0.978,0.980)	(0.959,0.963)	(0.933,0.939)	(0.979,0.981)	(0.960,0.965)	(0.934,0.941)
	95% CB^b	(0.975,0.983)	(0.957,0.965)	(0.932,0.940)	(0.975,0.985)	(0.957,0.968)	(0.932,0.942)
6	$\hat{F}$	0.965	0.947	0.923	0.967	0.949	0.925
	95% CI	(0.962,0.967)	(0.945,0.950)	(0.919,0.926)	(0.965,0.969)	(0.947,0.952)	(0.921,0.929)
	95% CB	(0.961,0.968)	(0.943,0.951)	(0.919,0.926)	(0.962,0.972)	(0.944,0.954)	(0.920,0.930)
9	$\hat{F}$	0.951	0.933	0.909	0.954	0.937	0.912
	95% CI	(0.948,0.953)	(0.930,0.936)	(0.905,0.913)	(0.951,0.957)	(0.933,0.940)	(0.908,0.917)
	95% CB	(0.947,0.955)	(0.929,0.937)	(0.905,0.913)	(0.949,0.959)	(0.932,0.942)	(0.907,0.917)

Open in a new tab

95% confidence intervals for F given z based on 200 bootstrap replicates.

95% supremum-type confidence bands for F given z over the region [0, 9] × [0, 9] years, based on 200 bootstrap replicates.

A second illustration in the same clinical trial illustrates the value of including a focus on marginal hazard rates for T_l and T₂ beyond counting process intensity modeling. Although diabetes was not a designated outcome in the trial protocol, information on the use of ‘pills for diabetes’ or ‘insulin shots for diabetes’ were collected twice annually during the trial intervention period and annually thereafter, through medical update questionnaire. These self-reports were found to be in reasonably good agreement with periodic medication inventories provided by study participants. A total of 45, 579 women were without prevalent diabetes at baseline. Clinical practice dictates the use of diabetes pills as a first line treatment for diabetes, changing to insulin injections if the disease progresses. Cox-type regression models were applied to these data, with baseline rates stratified as described above. An analysis (Howard et al., 2018) of time from randomization to initiation of diabetes pills (T₁) gives a hazard ratio estimate (95% confidence interval) for the low-fat dietary pattern intervention of 0.95 (0.88,1.02) over the intervention period with p = 0.13, with 3179 women developing diabetes. A counting process intensity model was applied to the post-diabetes pills follow-up to ascertain time from randomization to insulin use (T₂). This intensity was modeled to allow a distinct parameter for the intervention hazard ratio, and a baseline hazard rate that retained the original stratification, but also stratified on time-from-randomization to first use of oral diabetes agents (in quartiles). The intervention hazard ratio estimate (95% CI) from this analysis was 0.82 (0.64, 1.04) with a significance level of 0.10 and with 309 women progressing to insulin during the intervention period. This provides some modest evidence that the intervention slowed progression to the more serious type of disease requiring insulin injections, after controlling for time from randomization to the initiation of diabetes pills. A marginal single and double hazard rate analysis of the (T₁, T₂) data was also carried out with the original stratification mentioned above for both time variates and with distinct baseline rates and intervention group regression parameters for the two times. The marginal hazard rate analysis for T₁ is the same as was described above, whereas the marginal hazard rate analysis for time from randomization to diabetes requiring insulin injections (T₂) gave intervention hazard ratio estimate (95% confidence interval) of 0.74 (0.59, 0.94) with intention-to-treat significance level of 0.01. An independent censorship assumption, again with C₁ ≡ C₂, is entirely appropriate in this context, so that one obtains considerably stronger evidence of intervention benefit for time from randomization to diabetes requiring insulin than is the case from analysis of either of its component parts; namely, time from randomization to diabetes pills and time from diabetes pills to insulin injections. Moreover, this stronger result arises from a comparison between randomized groups, whereas the time from pills to insulin component of the intensity modeling contrasts groups that may differ in their distributions of time-to-diabetes pills, complicating the associated regression parameter interpretation. The double failure hazard ratio estimate (95% confidence interval) here is nearly identical to that for T₂. Over a longer term follow-up that included a substantial post-intervention period, and a median total follow-up of 17.3 years, the T₁ estimated marginal hazard ratio estimate (95% CI) was 0.96 (0.91, 1.00), while that for T₂ was 0.88 (0.78, 0.99) as was reported in Howard et al. (2018).

5. Higher Dimensional Failure Time Regression Methods

5.1. Hazard rate regression models

With bivariate failure time data there may be natural commonalities in baseline rates and in regression parameters in (1) and (2). For example, in twin studies it may be natural to restrict the baseline hazard rates $Λ_{10} (d t_{1}, 0)$ and $Λ_{01} (0, d t_{2})$ to be identical, and to require some components of β₁₀ and β₀₁ to be equal. Following Spiekerman and Lin (1998) we will refer to failure times having a common baseline rate function as failures of the same ‘type’, and for notational convenience we will redefine the marginal single failure hazard rate regression parameter to have a single value for all failure types by allowing the modeled regression vector to include interaction terms with failure type. Also, we now allow the multivariate failure times to be of arbitrary dimension.

5.2. Regression on marginal single and double failure hazard rates

Suppose that there is an arbitrary number, q, of failure times denoted by T₁...,T_q for each ‘study subject,’ with a possibly evolving q-dimensional covariate Z. Denote by z(t₁,...,t_q) covariate values at (t₁,...,t_q) and by Z(t₁,..., t_q) = z(0,..., 0) ∨ {z(s₁,..., s_q); s₁ < t₁,..., s_q < t_q} the covariate history prior to (t₁,..., t_q). Also let M denote a unique mapping from {1,..., q} to {1,..., K}, with K ≤ q, so that k = M(j) denotes the unique failure type for T_j, out of K possible types, for j = 1,..., q. Much of the interest in the study of failure rates on Z typically resides in the marginal single failure hazard rates. Suppose that the single failure hazard rate for T_j at follow-up time t_j, given Z(0,...,0, t_j, 0,...) is modeled by

Γ_{k} (d t_{j}) \exp {X_{k} (t_{j}) β}

(16)

for j = 1, . . ., q. Note that failures of the same type, k, are assumed to have a common baseline hazard rate function ‘ $Γ_{k}$ ’, which is obtained when the modeled covariate X_k is identically zero, with X_k(t_j) a fixed length row vector which for T_j is formed from {t_j; Z(0,...,0, t_j, 0,...0)}, and β a corresponding (column) regression vector to be estimated. As noted by Spiekerman and Lin (1998), this parameterization is flexible enough to allow, for example, distinct hazard ratio parameter vectors for each failure type, by including interaction variables with failure type in the specification of X_k.

Similarly suppose that the marginal double failure hazard rate for a pair of failure time variates (T_j, T_h) at follow-up times (t_j, t_h), given Z(0,...,0, t_j, 0,...,0, t_h, 0,... 0), is modeled by

Γ_{k g} (d t_{j}, d t_{h}) \exp {X_{k g} (t_{j}, t_{h}) γ},

(17)

for each 1 ≤ j < h ≤ q, where k = M(j) and g = M(h) are the failure types for T_j and T_h, respectively. In (17) X_kg(t_j, t_h) is a fixed length row regression vector which for (T_j, T_k) is formed from {t_j, t_h; Z(0,...,0, t_j, 0,...,0, t_h, 0,...,)}, $γ$ is a corresponding column double failure hazard ratio parameter to be estimated, and ‘ $Γ_{k g}$ ’ is a baseline double failure hazard rate function that is obtained at X_kg ≡ 0.

In this formulation the failure times T₁,...,T_q can occur along the same or different failure time axes, but failures of the same type are required to fall on the same time axis. For the parameters in (16) and (17) to have a useful interpretation an independent censorship condition, given Z, needs to be met. Hence we assume that lack of censoring in [0, t_j) can be added to the single failure hazard rate conditioning without affecting (16) for any j = 1,...,q, and lack of censoring in [0, t_j) × [0, t_h) can be added to the double failure hazard rate conditioning without affecting (17), for any (t_j, t_h) and 1 ≤ j < h ≤ q.

Now consider a random sample ${(S_{j i}, δ_{j i}), j = 1, \dots, q; Z (S_{1 i}, \dots, S_{q i})},$ for $i = 1, \dots, n$ from a study cohort, where $S_{j i} = T_{j i} \land C_{j i}$ is the minimum of the jth failure time T_ji and a corresponding potential censoring time C_ji for the ith individual, and $δ_{j i} = I [S_{j i} = T_{j i}] .$ From these one can define counting processes N_ji and ‘at risk’ processes Y_ji by

N_{j i} (d t_{j}) = {\begin{array}{l} 1 & if S_{j i} = t_{j} and δ_{j i} = 1 \\ 0 & otherwise \end{array}; Y_{j i} (t_{j}) = {\begin{array}{l} 1 & if S_{j i} \geq t_{j} \\ 0 & otherwise \end{array}

for j = 1,...,q and i = 1,...,n. Missing failure times can be accommodated by setting the pertinent C_ji value equal to zero.

Similar to Spiekerman and Lin (1998) one can define an estimating equation for the marginal single failure hazard ratio parameter β by

\sum_{i = 1}^{n} \sum_{j = 1}^{q} \sum_{k = 1}^{K} I {M (j) = k} \int_{0}^{τ_{k}} {X_{k i} (t_{j}) - {\bar{X}}_{k} (t_{j}; β)}^{'} N_{j i} (d t_{j}) = 0.

(18)

Also, a corresponding estimating equation for the double failure hazard rates parameter $γ$ can be written

\sum_{i = 1}^{n} \sum_{j = 1}^{q} \sum_{h = j + 1}^{q} \sum_{k = 1}^{K} \sum_{g = k + 1}^{K} I {M (j) = k} I {M (h) = g} \int_{0}^{τ_{k}} \int_{0}^{τ_{g}} {X_{k g i} (t_{j}, t_{h}) - {\bar{X}}_{k g} (t_{j}, t_{h}; γ)}^{'} N_{j i} (d t_{j}) N_{h i} (d t_{h}) = 0.

(19)

In these expressions $(τ_{k}, τ_{g})$ are such that $P {S_{j i} \geq τ_{k}, M (j) = k, S_{h i} \geq τ_{g}, M (h) = g; Z (0, \dots, t_{j}, 0, \dots, 0, t_{h}, 0, \dots 0)} > 0$ for some (j, h), for each 1 ≤ k ≤ g ≤ K for theoretical developments, but each $τ_{k}$ can evidently be taken to be the maximal observed S_ji value, where k = M(j), in application. Also the ‘centering’ variates in (18) are

{\bar{X}}_{k} (t_{j}; β) = Q_{k}^{(1)} (t_{j}; β) / Q_{k}^{(0)} (t_{j}; β),

where $Q_{k}^{(ℓ)} (t_{j}; β) = n^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{q} I {M (j) = k} Y_{j i} (t_{j}) X_{k i} {(t_{j})}^{(ℓ)} \exp {X_{k i} (t_{j}) β},$ for $l$ = 0, 1, 2 with $a^{(0)}$ = 1, $a^{(1)}$ = a and $a^{(2)}$ = a′a row vector a, while those in (19) are

{\bar{X}}_{k g} (t_{j}, t_{h}; γ) = Q_{k g}^{(1)} (t_{j}, t_{h}; γ) / Q_{k g}^{(0)} (t_{j}, t_{h}; γ),

where $Q_{k g}^{(ℓ)} (t_{j}, t_{h}; γ) = n^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{q} \sum_{h = j + 1}^{q} I {M (j) = k} I {M (h) = g} Y_{j i} (t_{j}) Y_{h i} (t_{h}) X_{k g} {(t_{j}, t_{h})}^{(ℓ)} \exp {X_{k g} (t_{j}, t_{h}) γ}, for ℓ = 0, 1, 2.$

The utility of (18) and (19) as estimating functions derives from the fact that each $N_{j i} (d t_{j})$ in (18) can be replaced by $L_{j i} (d t_{j}; β)$ where

L_{j i} (t_{j}; β) = \int_{0}^{t_{j}} [N_{j i} (d s_{j}) - \sum_{k = 1}^{K} I {M (j) = k} Y_{j i} (s_{j}) \exp {X_{k i} (s_{j}) β} Γ_{k} (d s_{j})]

while retaining equality to zero under (16), and similarly each $N_{j i} (d t_{j}) N_{h i} (d t_{h})$ in (19) can be replaced by $L_{j h i} (d t_{j}, d t_{h}; γ)$ where

L_{j h i} (t_{j}, t_{h}; γ) = \int_{0}^{t_{j}} \int_{0}^{t_{h}} [N_{j i} (d s_{j}) N_{h i} (d s_{h}) - \sum_{k = 1}^{K} \sum_{g = k + 1}^{K} I {M (j) = k} I {M (h) = g} Y_{j i} (s_{j}) Y_{h i} (s_{h}) \exp {X_{k g i} (s_{j}, s_{h}) γ} Γ_{k g} (d s_{j}, d s_{h})],

while retaining equality to zero under (17). It follows that the product of n^−1/2 and left sides of (18) and (19) are stochastic integrals of sample variates with respect to processes L_ji and L_jhi that have zero means under (16) and (17) respectively at the true $(β, γ)$ values. Moreover, it turns out that under i.i.d. conditions for the processes ${(N_{j i}, Y_{j i}), j = 1, \dots, q, Z_{i} (S_{j i}, \dots, S_{j q})} for i = 1, \dots, n$ that the centering variates in (18) and (19) can be replaced by their almost sure limits.

{\bar{x}}_{k} (t_{j}; β) = q_{k}^{(1)} (t_{j}; β) / q_{k}^{(0)} (t_{j}; β) and {\bar{x}}_{k g} (t_{j}, t_{h}; γ) = q_{k g}^{(1)} (t_{j}, t_{h}; γ) / q_{k g}^{(0)} (t_{j}, t_{h}; γ),

where $q_{k}^{(ℓ)}$ and $q_{k g}^{(ℓ)}$ are expectations of $Q_{k}^{(ℓ)}$ and $Q_{k g}^{(ℓ)}$ respectively for $ℓ = 0, 1, 2,$ without altering the asymptotic distribution of the left sides of (18) and (19). It then follows further that the left sides of (18) and (19) behave, for large n, like a sum of i.i.d. variates to which the central limit theorem applies under modest additional regularity conditions. From this n^−1/2 times these left sides converges to a zero mean Gaussian variate at the ‘true’ values for $(β, γ)$ under (16) and (17). The variance matrix for this Gaussian variate quite generally can be consistently estimated by

\hat{Σ} = n^{- 1} \sum_{i = 1}^{n} {(\begin{array}{l} \sum_{j = 1}^{q} \sum_{k = 1}^{K} I {M (j) = k} \int_{0}^{τ_{k}} {X_{k i} (t_{j}) - {\bar{X}}_{k} (t_{j}, \hat{β})}^{'} {\hat{L}}_{j i} (d t_{j}; \hat{β}) \\ \sum_{j = 1}^{q} \sum_{h = j + 1}^{q} \sum_{k = 1}^{K} \sum_{g = k + 1}^{K} I {M (j) = k} I {M (h) = g} \\ \int_{0}^{τ_{k}} \int_{0}^{τ_{g}} {X_{k g i} (t_{j}, t_{h}) - {\bar{X}}_{k g} (t_{j}, t_{h}; \hat{γ})}^{'} {\hat{L}}_{j h i} (d t_{j}, d t_{h}; \hat{γ}) \end{array})}^{\otimes 2},

where L_ji and L_jhi denote L_ji and L_jhi respectively evaluated at $(\hat{β}, \hat{γ})$ and at Aalen–Breslow estimators of baseline hazard functions given by

{\hat{Γ}}_{k} (t_{k}; \hat{β}) = \sum_{i = 1}^{n} \sum_{j = 1}^{q} I {M (j) = k} \int_{0}^{t_{k}} N_{j i} (d s_{k}) / {n Q_{k}^{(0)} (s_{k}; \hat{β})}, and {\hat{Γ}}_{k g} (t_{k}, t_{g}; \hat{γ}) = \sum_{i = 1}^{n} \sum_{j = 1}^{q} \sum_{h = j + 1}^{q} \int_{0}^{t_{k}} \int_{0}^{t_{g}} I {M (j) = k} I {M (h) = g} N_{j i} (d s_{k}) N_{h i} (d s_{g}) / {n Q_{k g}^{(0)} (s_{k}, s_{g}; \hat{γ})},

for k = 1,..., K and for 1 ≤ k ≤ g ≤ K, respectively.

Taylor series expansions of the left sides of (18) and (19) about the true $(β, γ)$ values then lead, under regularity conditions, to a zero mean asymptotic normal distribution for

n^{\frac{1}{2}} (\begin{array}{l} \hat{β} - β \\ \hat{γ} - γ \end{array})

with variance matrix consistently estimated by ${\hat{A}}^{- 1} \hat{Σ} {\hat{A}}^{- 1}$ where $\hat{A}$ is the product of n⁻¹ and the negative of the derivative matrix of the left sides of (18) and (19) with respect to $(β, γ) .$ Specifically $n \hat{A}$ is a block diagonal matrix with entries

\sum_{k = 1}^{K} \int_{0}^{τ_{k}} [{Q^{(2)} (t_{k}; \hat{β}) / Q^{(0)} (t_{k}, \hat{β}) - {Q^{(1)} (t_{k}; \hat{β}) / Q^{(0)} (t_{k}; \hat{β})}}^{\otimes 2}] \sum_{j = 1}^{q} I {M (j) = k} \sum_{i = 1}^{n} N_{j i} (d t_{k})

in the upper left, with entries

\sum_{k = 1}^{K} \sum_{g = k + 1}^{K} \int_{0}^{τ_{k}} \int_{0}^{τ_{g}} [Q^{(2)} (t_{k}, t_{g}; \hat{γ}) / Q^{(0)} (t_{k}, t_{g}; \hat{γ}) - {Q^{(1)} (t_{k}, t_{g}; \hat{γ}) / Q^{(0)} (t_{k}, t_{g}; γ)}^{\otimes 2}] \sum_{j = 1}^{q} \sum_{h = j + 1}^{q} I {M (j) = k} I {M (h) = g} \sum_{i = 1}^{n} N_{j i} (d t_{k}) N_{h i} (d t_{g})

in the lower right, and with zero matrices in the off-diagonal blocks, so that one can write

\hat{A} = (\begin{matrix} {\hat{A}}_{1} & 0 \\ 0 & {\hat{A}}_{2} \end{matrix}) .

Empirical process methods can also be used to show $n^{1 / 2} ({\hat{Γ}}_{k} - Γ_{k}),$ for $k = 1, \dots, K$ and $n^{1 / 2} ({\hat{Γ}}_{k g} - Γ_{k g})$ for $1 \leq k < g \leq K$ to converge jointly to a zero mean Gaussian processes under (16) and (17), and a sandwich-type covariance process estimator can be specified for this set of parameter estimates. These asymptotic developments again follow from modest extensions of Spiekerman and Lin (1998) and Lin et al. (2000). Some related detail is given in the Appendix. As in the previous section, bootstrap resampling procedures can be used for supremum-type confidence band estimation for marginal single and double cumulative hazard rates, or for confidence intervals or bands for bivariate survival function estimators, and for pairwise cross ratio or concordance functions, given Z, for any 1 ≤ k ≤ g ≤ K.

It can also be remarked that these asymptotic results assume the marginal single and double failure rate models (16) and (17) to hold simultaneously. Note however, that the asymptotic properties for $\hat{β}$ and ${\hat{Γ}}_{k},$ for k = 1,..., K hold under (16) even under departure from (17), and those for $\hat{γ}$ and ${\hat{Γ}}_{k g},$ for all 1 ≤ k ≤ g ≤ K hold under (17) even under departure from (16), providing some flexibility in the modeling and interpretation of the respective single and double failure hazard rates. For example, the marginal single failure hazard ratio factor $\exp {X_{k} (t) \hat{β}}$ may have an interpretation as an average failure type k hazard ratio for the modeled covariate even if (16) is oversimplified and (17) fails to hold, and similarly for $\exp {X_{k g} (t_{1}, t_{2}) \hat{γ}}$ under an oversimplified double failure hazard rate model (17) and departure from (16). However, when the fitted marginal single and double failure hazard rates are brought together to estimate bivariate survival functions and pairwise dependency functions given Z, some care may be needed to ensure an adequate fit of (16) and (17) to available data, as will be considered further in Section 6 below.

Note also that mixed continuous and discrete failure times are included in the methodology described above, subject to the models (16) and (17), and the sandwich-type variance estimators and other weak convergence results mentioned above adapt appropriately to the nature of the failure time variates.

It may be possible to improve the efficiency of β estimators by introducing weights into the left side of (18) (e.g., Lin et al., 2000) possibly using the fitted marginal double failure hazard rates as a source of weighting information. Efficiency gains are likely to be small however, unless dependencies among the failure times are strong and censoring is not too severe. Related asymptotic results extend in a straightforward manner under some additional regularity conditions, following Lin et al. (2000), provided any dependence of the weights on β in estimating equation (18) is fixed at the parameter estimate $\hat{β}$ described above. Further study of the preferred form of weights that could be included in (18) and of their value for enhancing estimator efficiency, would be worthwhile.

6. Additional Aspects of Hazard Ratio Regression Parameter Modeling and Estimation

6.1. Model misspecification

Some readers may find it problematic that the single and double failure hazard rate models (16) and (17) may be mutually incompatible in that there may be no proper ‘survival’ function F given Z for which these models are simultaneously obtained. This issue arises also for mean and covariance parameter estimation using estimating equations with uncensored outcomes (e.g. Liang and Zeger, 1986). If this situation arises then one or both of (16) and (17) are misspecified, and, as usual, one can then expect some bias in estimators of related parameters, such as F given Z. An advantage of the semiparametric models (16) and (17), however, is that the unspecified baseline hazard rate functions provide valuable flexibility to these models, with restrictions entering only through the parametric form of the hazard ratio factors. The time-dependent covariate option allows the data analyst to adapt these hazard ratio factors to available data, and time-dependent baseline hazard rate stratification options allow even more flexible modeling. Hence, under careful modeling one can expect to obtain estimated single and double failure hazard rate estimators that are consistent with available data. These estimators uniquely determine estimators $\hat{F}$ given Z for all univariate and bivariate failure times, and these too will then be consistent with available data.

From a practical point of view a data analyst is likely to just include some simple time-dependent terms in the modeled single and double failure regression vectors in (16) and (17). We considered a generalization of the bivariate survival function model (9) where the single failure hazard rates for the binary covariate z are correctly modeled, but the double failure hazard rate is not, to examine the bias associated with this model misspecification, and to examine the extent to which it can be mitigated by the inclusion of the simple time-dependent components z log t₁ and z log t₂ in the respective regression vectors $X (t_{1}, 0)$ and $X (0, t_{2}),$ and the inclusion of both of these time-dependent terms in X(t₁, t₂), in the models (1)–(3).

The joint survival function considered was the Clayton-Oakes model

F (t_{1}, t_{2}, z) = {[F_{0} {(t_{1}, 0)}^{- θ \exp {z (β_{10} - γ)}} + F_{0} {(0, t_{2})}^{- θ \exp {z (β_{01} - γ)}} - 1]}^{- e^{z γ} / θ}

(20)

for $θ \geq 0,$ with F₀ again denoting the survival function at z = 0. This class of models has the same single failure hazard rates as (9), and the same cross ratio function $1 + θ e^{- z γ},$ but the double failure hazard rate model has the more complex form

Λ_{11} ({dt}_{1}, {dt}_{2}; z) = Λ_{11} ({dt}_{1}, {dt}_{2}) e^{z γ} (e^{z γ} + θ) / (1 + θ) {\frac{F_{0} {(t_{1}, 0)}^{- θ} + F_{0} {(0, t_{2})}^{- θ} - 1}{F_{0} {(t_{1}, 0)}^{- θ \exp {z (β_{10} - γ)}} + F_{0} {(0, t_{2})}^{- θ \exp {z (β_{01} - γ)}} - 1}}^{2} F_{0} {(t_{1}, 0)}^{- θ [\exp {z (β_{10} - γ)} - 1]} F_{0} {(0, t_{2})}^{- θ [\exp {z (β_{01} - γ)} - 1]},

which departs from (3) under departure from $β_{10} = β_{01} = γ .$ Table 7 shows some simulation results for estimating F given Z, at both z = 0 and z = 1 for $β_{10} = 0, β_{01} = \log 2, θ = 2,$ and $γ = \log 2,$ either with $x (t_{1}, 0) = x (0, t_{2}) = x (t_{1}, t_{2}) = z$ as before, or with $x (t_{1}, 0) = (z, z \log t_{1}), x (0, t_{2}) = (z, z \log t_{2})$ and $x (t_{1}, t_{2}) = (z, z \log t_{1}, z \log t_{2}) .$ From the left side of Table 7 one sees that the biases in $\hat{F}$ given z are minimal in the heavy censoring scenario, even without time-dependent regression variables, whereas bias is evident away from the origin in the uncensored data scenario where the model misspecification has more influence in the tails of the survival function. Much of this bias is avoided by the inclusion of these simple time-varying regression variable that allow the single and double failure hazard ratio for z = 1 versus z = 0 to be power functions of t_l and t₂. Note that the sample standard deviations for $\hat{F}$ given z are little affected by the inclusion of these time-dependent variables. Corresponding estimators of average cross ratios and average concordances incorporate somewhat greater biases under these sampling configurations, but these biases too were considerably reduced by the inclusion of the time-dependent components of the modeled regression variables. Results were similar at various other parameter values, sample sizes and censoring configurations. Time-dependent regression variable zt_l and zt₂, instead of z log t_l and z log t₂ were also considered, with very similar bias reduction properties for these choices, under the simulation model (20).

Table 7:

Simulations summary statistics for survival function estimators $\hat{F}$ at z = 0 and z = 1 at selected marginal survival function percentiles under Clayton–Oakes model (20) with $β_{10} = 0, β_{01} = \log 2, θ = 2$ and $γ = \log 2$ .

Sample size (n)		1000	250	1000	250
T₁ and T₂ failure %		18.2	100	18.2	100
		$\hat{F}$ –no time-varying covariates		$\hat{F}$ –time-varying covariates included
(T₁, T₂) Percentiles	F	Mean (SD)^a	Mean (SD)	Mean (SD)	Mean (SD)
		z = 0
(0.85,0.85)	0.752	0.751(0.021)	0.747(0.030)	0.752(0.022)	0.754(0.036)
(0.85,0.70)	0.642	0.640(0.029)	0.632(0.035)	0.641(0.030)	0.650(0.039)
(0.85,0.55)	0.521	0.517(0.048)	0.506(0.039)	0.519(0.051)	0.532(0.042)
(0.70,0.70)	0.570	0.571(0.041)	0.553(0.037)	0.571(0.042)	0.569(0.041)
(0.70,0.55)	0.480	0.475(0.065)	0.453(0.039)	0.476(0.069)	0.483(0.042)
(0.55,0.55)	0.422	0.421(0.094)	0.392(0.039)	0.420(0.106)	0.417(0.041)
		z = 1
(0.85,0.85)	0.739	0.739(0.021)	0.743(0.031)	0.739(0.022)	0.741(0.034)
(0.85,0.70)	0.623	0.624(0.026)	0.632(0.035)	0.624(0.026)	0.624(0.039)
(0.85,0.55)	0.501	0.504(0.033)	0.514(0.038)	0.503(0.033)	0.499(0.041)
(0.70,0.70)	0.538	0.537(0.034)	0.550(0.037)	0.538(0.034)	0.546(0.039)
(0.70,0.55)	0.445	0.446(0.042)	0.462(0.038)	0.447(0.043)	0.449(0.040)
(0.55,0.55)	0.379	0.374(0.064)	0.396(0.038)	0.378(0.068)	0.389(0.040)

Open in a new tab

Sample mean and standard deviation (SD) based on 1000 simulated samples at each sampling configuration.

An additional simulation was conducted under (20) and the same parameter configuration described above, but with regression vectors augmented to include a standard normal variate in addition to the binary regression variable, with the two regression variables having identical parameter values β₁₀, β₀₁ and $γ .$ Analyses that included modeled regression variables z log t₁ and z log t₂ for marginal single failure hazard rates, and z log(t₁ + t₂ + 1) for the marginal double failure hazard rate demonstrated good agreement between sample standard deviations and the average of standard deviation estimators from sandwich estimation–based standard deviation estimators, and good agreement of sandwich estimation–based 95% confidence intervals with nominal levels for targeted parameters, based on 1000 simulated data sets.

6.2. Higher dimensional marginal hazard rate regression estimation

Marginal hazard rate regression models analogous to (16) and (17) can also be considered for trivariate and higher dimensional marginal hazard rates. The methods of the preceding section generalize naturally to the estimation of hazard ratio regression parameters and baseline hazard rates for subsets of the failure times $(T_{1}, \dots, T_{q})$ for any q ≥ 1. Moreover, the survival function F given Z for $(T_{1}, \dots, T_{q})$ at a specified q-dimensional covariate history, with fixed or external covariates, can be readily estimated in a recursive fashion. For example, one can write

F {t_{1}, \dots, t_{q}; Z (t_{1}, \dots, t_{q})} = ψ_{Z} (t_{1}, \dots, t_{q}) + \int_{0}^{t_{1}} \dots \int_{0}^{t_{q}} F {s_{1}^{-}, \dots, s_{q}^{-}; Z (s_{1}, \dots, s_{q})} Λ_{1 \dots 1} {d s_{1}, \dots, d s_{q}; Z (s_{1}, \dots, s_{q})}

where $ψ_{Z} (t_{1}, \dots, t_{q})$ depends only on marginal distributions of F given Z of dimension less than q. This inhomogenous Volterra integral equation has a unique solution as a function of

ψ_{Z} (t_{1}, \dots, t_{q}) = {(- 1)}^{q - 1} + {(- 1)}^{q - 2} \sum_{i = 1}^{q} F {0, \dots, t_{i}, 0, \dots, 0; Z (0, \dots, t_{i}, 0, \dots, 0)} + {(- 1)}^{q - 3} \sum_{i = 1}^{q} \sum_{j = 1}^{q} F {0, \dots, t_{i}, 0, \dots, 0, t_{j}, 0, \dots, 0; Z (0, \dots, 0, t_{i}, 0, \dots, t_{j}, 0, \dots, 0)} + \dots + {(- 1)}^{0} \underset{i_{1} < i_{2} < \dots < i_{q - 1}}{\sum_{i_{1} = 1}^{q} \dots \sum_{i_{q - 1} = 1}^{q}} F {0, \dots, 0, t_{i_{1}}, 0, \dots, 0, t_{i_{q - 1}}, 0, \dots, 0; Z (0, \dots, 0, t_{i_{1}}, 0 \dots, 0, t_{i_{q - 1}}, 0 \dots, 0)}

and the q-variate hazard rate regression model $Λ_{1 \dots 1}$ in Péano series form, leading to strongly consistent and weakly Gaussian convergent estimators of F given Z by plugging in marginal hazard rate regression estimators for hazard rates of all dimensions up to q, starting with Cox model marginal single failure hazard rate estimators.

Note, however, that marginal q-variates hazard rate estimators have precision that depends directly on the number of individuals experiencing a q-variate failure. In many applications, for example in epidemiologic cohort studies with failure times constituting q specific clinical outcomes, the data for estimating high-dimensional marginal hazard rates will be too sparse to be useable. In fact, the most useful and interpretable regression information will often derive from marginal single and double failure rate estimation, analogous to mean and covariance parameter estimation in uncensored data regression settings (e.g. Liang and Zeger, 1986; Prentice and Zhao, 1991).

6.3. Summary and Concluding remarks

In summary, the methods provided here aim to fill an important gap in the various possible extensions of the univariate failure time Cox model to multivariate failure time data. The proposed marginal methods are based on semiparametric multiplicative form regression models for marginal single and double, and potentially higher order, failure hazard rates, where marginal implies that possibly-evolving covariate histories are included in the hazard rate conditioning, but the evolving failure time counting process for the ‘individual’ (correlated set of measurements) is not included. These methods, along with models of a similar form for the counting process intensity, which does condition on the preceding counting process history, provide flexible tools for the analysis of multivariate failure time regression data. The present marginal methods allow separate censoring processes to apply to the components of the multivariate failure time variable, and allow failure time components of different types to fall on unrelated time axes, provisions that are not available for martingale-based distribution theory for counting process intensity models. On the other hand, intensity process modeling allows censoring rates to depend on the prior counting process data for the correlated set, while somewhat stronger censoring requirements apply to the marginal hazard rate methods considered here.

The applicability of these stronger censoring requirements can be examined by applying models of the form (16) and (17) to marginal single and double failure censoring rates, while extending the conditioning event to include aspects of the preceding failure counting process for the ‘individual’ in addition to the preceding covariate history. A dependence of these censoring rates on the prior counting process history would suggest departure from independent censoring given Z.

The marginal methods can also be viewed as extending copula model methods to include a semiparametric class of dependency models, including models that can depend on an evolving covariate process. Additionally, the proposed marginal methods build upon the marginal single failure regression methods of Lin, Wei, and colleagues, while including higher dimensional marginal hazard rate regression models, and do so using straightforward computations that extend those used in this earlier work, and in Cox’s (1972) seminal paper. As a byproduct, these methods yield semiparametric bivariate survival function estimators, and related cross ratio and concordance dependency function estimators, with fixed or external covariates, that are considerably more flexible than corresponding estimators previously available using copula and frailty regression model approaches. Furthermore the relationship of marginal double failure hazard rates to covariates will often be readily interpretable, and may lead to novel insights; for example, into intervention effects and related intervention mechanisms in a clinical trial context.

Supplementary Material

Supplemental Material

NIHMS1566901-supplement-Supplemental_Material.pdf^{(198.7KB, pdf)}

Acknowledgments

This work was partially supported by National Institutes of Health grants R01CA10921 and P30CA015704, and by the Research Program of the National Institute of Environmental Health Sciences.

Biography

Ross L. Prentice is Professor, Fred Hutchinson Cancer Research Center, Seattle WA 98109 (rprentic@whi.org) and Shanshan Zhao is a Principal Investigator, National Institute of Environmental Health Sciences, Research Triangle Park NC 27709 (shanshan.zhao@nih.gov).

Contributor Information

Ross L. Prentice, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, Washington, USA 98109

Shanshan Zhao, National Institute of Environmental Health Sciences, 111 TW Alexander Dr, Rall Building, Research Triangle Park, North Carolina, USA 27709.

References

Aalen O, Borgan 0, and Gjessing H. (2010). Survival and Event History Analysis. Springer Science & Business Media. [Google Scholar]
Andersen PK, Borgan 0, Gill RD, and Keiding N. (1993). Statistical Models Based on Counting Processes. New York: Springer-Verlag. [Google Scholar]
Andersen PK and Gill RD (1982). Cox’s regression model for counting processes: a large sample study. The Annals of Statistics 10(4), 1100–1120. [Google Scholar]
Bandeen-Roche K. and Ning J. (2008). Nonparametric estimation of bivariate failure time associations in the presence of a competing risk. Biometrika 95 (1), 221–232. [DOI] [PMC free article] [PubMed] [Google Scholar]
Chlebowski RT, Aragaki AK, Anderson GL, Thomson CA, Manson JE, Simon MS, Howard BV, Rohan TE, Snetselar L, Lane D, Barrington W, Vitolins MZ, Womack C, Qi L, Hou L, Thomas F, and Prentice RL (2017). Low-fat dietary pattern and breast cancer mortality in the women’s health initiative randomized controlled trial. Journal of Clinical Oncology 35(25), 2919–2926. [DOI] [PMC free article] [PubMed] [Google Scholar]
Clayton DG (1978). Model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 65, 141–151. [Google Scholar]
Cox DR (1972). Regression models and life-tables (with discussion). Journal of the Royal Statistical Society. Series B (Methodological) 34 (2), 187–220. [Google Scholar]
Dabrowska DM (1988). Kaplan–Meier estimate on the plane. Annals of Statistics 16, 1475–1489. [Google Scholar]
Duchateau L. and Janssen P. (2010). The Frailty Model. New York: Springer-Verlag. [Google Scholar]
Fan J, Prentice RL, and Hsu L. (2000). A class of weighted dependence measures for bivariate failure time data. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 62(1), 181–190. [Google Scholar]
Gill RD, van der Laan MJ, and Wellner JA (1995). Inefficient estimators of the bivariate survival function for three models. Ann. Inst. H. Poincare Probab. Statist 31 (3), 545–597. [Google Scholar]
Hougaard P. (2000). Analysis of Multivariate Survival Data, Volume 564. Springer; New York. [Google Scholar]
Howard BV, Aragaki AK, Tinker LF, Allison M, Hingle MD, Johnson KC, Manson JE, Shadyab AH, Shikany JM, Snetselaar LG, Thomson CA, Zaslavsky O, and Prentice RL (2018). A low-fat dietary pattern and diabetes: A secondary analysis from the Women’s Health Initiative Dietary Modification Trial. Diabetes Care 41, 680–687. [DOI] [PMC free article] [PubMed] [Google Scholar]
Hu T, Nan B, Lin X, and Robins JM (2011). Time-dependent cross ratio estimation for bivariate failure times. Biometrika 98(2), 341–354. [DOI] [PMC free article] [PubMed] [Google Scholar]
Kalbfleisch JD and Prentice RL (2002). The Statistical Analysis of Failure Time Data, Second Edition. New York: Wiley and Sons. [Google Scholar]
Liang K-Y and Zeger SL (1986). Longitudinal data analysis using generalized linear models. Biometrika 73(1), 13–22. [Google Scholar]
Lin D, Wei L, Yang I, and Ying Z. (2000). Semiparametric regression for the mean and rate functions of recurrent events. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 62(4), 711–730. [Google Scholar]
Nan B, Lin X, Lisabeth LD, and Harlow SD (2006). Piecewise constant cross-ratio estimation for association of age at a marker event and age at menopause. Journal of the American Statistical Association 101 (473), 65–77. [Google Scholar]
Nelsen RB (2007). An Introduction to Copulas (Seconded.). New York: Springer-Verlag. [Google Scholar]
Oakes D. (1986). Semiparametric inference in a model for association in bivariate survival data. Biometrika 73 (2), 353–361. [Google Scholar]
Oakes D. (1989). Bivariate survival models induced by frailties. Journal of the American Statistical Association 84 (406), 487–493. [Google Scholar]
Prentice RL and Cai J. (1992). Covariance and survivor function estimation using censored multivariate failure time data. Biometrika 79(3), 495–512. [Google Scholar]
Prentice RL and Zhao LP (1991). Estimating equations for parameters in means and covariances of multivariate discrete and continuous responses. Biometrics, 825–839. [PubMed] [Google Scholar]
Shih JH and Louis TA (1995). Inferences on the association parameter in copula models for bivariate survival data. Biometrics, 1384–1399. [PubMed] [Google Scholar]
Spiekerman CF and Lin D. (1998). Marginal regression models for multivariate failure time data. Journal of the American Statistical Association 93(443), 1164–1175. [Google Scholar]
Wienke A. (2011). Frailty Models in Survival Analysis. Boca Raton: Chapman and Hall/CRC Press. [Google Scholar]
Women’s Health Initiative Study Group (1998). Design of the Women’s Health Initiative clinical trial and observational study. Controlled Clinical Trials 19(1), 61–109. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplemental Material

NIHMS1566901-supplement-Supplemental_Material.pdf^{(198.7KB, pdf)}

[R1] Aalen O, Borgan 0, and Gjessing H. (2010). Survival and Event History Analysis. Springer Science & Business Media. [Google Scholar]

[R2] Andersen PK, Borgan 0, Gill RD, and Keiding N. (1993). Statistical Models Based on Counting Processes. New York: Springer-Verlag. [Google Scholar]

[R3] Andersen PK and Gill RD (1982). Cox’s regression model for counting processes: a large sample study. The Annals of Statistics 10(4), 1100–1120. [Google Scholar]

[R4] Bandeen-Roche K. and Ning J. (2008). Nonparametric estimation of bivariate failure time associations in the presence of a competing risk. Biometrika 95 (1), 221–232. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R5] Chlebowski RT, Aragaki AK, Anderson GL, Thomson CA, Manson JE, Simon MS, Howard BV, Rohan TE, Snetselar L, Lane D, Barrington W, Vitolins MZ, Womack C, Qi L, Hou L, Thomas F, and Prentice RL (2017). Low-fat dietary pattern and breast cancer mortality in the women’s health initiative randomized controlled trial. Journal of Clinical Oncology 35(25), 2919–2926. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R6] Clayton DG (1978). Model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 65, 141–151. [Google Scholar]

[R7] Cox DR (1972). Regression models and life-tables (with discussion). Journal of the Royal Statistical Society. Series B (Methodological) 34 (2), 187–220. [Google Scholar]

[R8] Dabrowska DM (1988). Kaplan–Meier estimate on the plane. Annals of Statistics 16, 1475–1489. [Google Scholar]

[R9] Duchateau L. and Janssen P. (2010). The Frailty Model. New York: Springer-Verlag. [Google Scholar]

[R10] Fan J, Prentice RL, and Hsu L. (2000). A class of weighted dependence measures for bivariate failure time data. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 62(1), 181–190. [Google Scholar]

[R11] Gill RD, van der Laan MJ, and Wellner JA (1995). Inefficient estimators of the bivariate survival function for three models. Ann. Inst. H. Poincare Probab. Statist 31 (3), 545–597. [Google Scholar]

[R12] Hougaard P. (2000). Analysis of Multivariate Survival Data, Volume 564. Springer; New York. [Google Scholar]

[R13] Howard BV, Aragaki AK, Tinker LF, Allison M, Hingle MD, Johnson KC, Manson JE, Shadyab AH, Shikany JM, Snetselaar LG, Thomson CA, Zaslavsky O, and Prentice RL (2018). A low-fat dietary pattern and diabetes: A secondary analysis from the Women’s Health Initiative Dietary Modification Trial. Diabetes Care 41, 680–687. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R14] Hu T, Nan B, Lin X, and Robins JM (2011). Time-dependent cross ratio estimation for bivariate failure times. Biometrika 98(2), 341–354. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R15] Kalbfleisch JD and Prentice RL (2002). The Statistical Analysis of Failure Time Data, Second Edition. New York: Wiley and Sons. [Google Scholar]

[R16] Liang K-Y and Zeger SL (1986). Longitudinal data analysis using generalized linear models. Biometrika 73(1), 13–22. [Google Scholar]

[R17] Lin D, Wei L, Yang I, and Ying Z. (2000). Semiparametric regression for the mean and rate functions of recurrent events. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 62(4), 711–730. [Google Scholar]

[R18] Nan B, Lin X, Lisabeth LD, and Harlow SD (2006). Piecewise constant cross-ratio estimation for association of age at a marker event and age at menopause. Journal of the American Statistical Association 101 (473), 65–77. [Google Scholar]

[R19] Nelsen RB (2007). An Introduction to Copulas (Seconded.). New York: Springer-Verlag. [Google Scholar]

[R20] Oakes D. (1986). Semiparametric inference in a model for association in bivariate survival data. Biometrika 73 (2), 353–361. [Google Scholar]

[R21] Oakes D. (1989). Bivariate survival models induced by frailties. Journal of the American Statistical Association 84 (406), 487–493. [Google Scholar]

[R22] Prentice RL and Cai J. (1992). Covariance and survivor function estimation using censored multivariate failure time data. Biometrika 79(3), 495–512. [Google Scholar]

[R23] Prentice RL and Zhao LP (1991). Estimating equations for parameters in means and covariances of multivariate discrete and continuous responses. Biometrics, 825–839. [PubMed] [Google Scholar]

[R24] Shih JH and Louis TA (1995). Inferences on the association parameter in copula models for bivariate survival data. Biometrics, 1384–1399. [PubMed] [Google Scholar]

[R25] Spiekerman CF and Lin D. (1998). Marginal regression models for multivariate failure time data. Journal of the American Statistical Association 93(443), 1164–1175. [Google Scholar]

[R26] Wienke A. (2011). Frailty Models in Survival Analysis. Boca Raton: Chapman and Hall/CRC Press. [Google Scholar]

[R27] Women’s Health Initiative Study Group (1998). Design of the Women’s Health Initiative clinical trial and observational study. Controlled Clinical Trials 19(1), 61–109. [DOI] [PubMed] [Google Scholar]

PERMALINK

Regression Models and Multivariate Life Tables

Ross L Prentice

Shanshan Zhao

Abstract

1. Introduction

2. Bivariate Failure Time Regression Modeling and Estimation

2.1. Marginal single and double failure rate regression

2.2. Simulation evaluations

Table 1:

Table 2:

3. Bivariate Survival Function and Dependency Function Estimation

3.1. Bivariate survival function estimation

3.2. Dependency function estimation

3.3. Confidence interval and confidence band estimation

3.4. Simulation evaluation of survival and dependency function estimators

Table 3:

Table 4:

Table 5:

4. Composite Outcomes in a Low-fat Dietary Pattern Trial

Table 6:

5. Higher Dimensional Failure Time Regression Methods

5.1. Hazard rate regression models

5.2. Regression on marginal single and double failure hazard rates

6. Additional Aspects of Hazard Ratio Regression Parameter Modeling and Estimation

6.1. Model misspecification

Table 7:

6.2. Higher dimensional marginal hazard rate regression estimation

6.3. Summary and Concluding remarks

Supplementary Material

Acknowledgments

Biography

Contributor Information

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases