Abstract
In many randomized clinical trials, the primary response variable, for example, the survival time, is not observed directly after the patients enroll in the study but rather is observed after some period of time (lag time). It is often the case that such a response variable is missing for some patients due to censoring that occurs when the study ends before the patient’s response is observed or when the patients drop out of the study. It is often assumed that censoring occurs at random, which is referred to as noninformative censoring; however, in many cases such an assumption may not be reasonable. If the missing data are not analyzed properly, the estimator or test for the treatment effect may be biased. In this paper, we use semiparametric theory to derive a class of consistent and asymptotically normal estimators for the treatment effect parameter which are applicable when the response variable is right censored. The baseline auxiliary covariates and post-treatment auxiliary covariates, which may be time-dependent, are also considered in our semiparametric model. These auxiliary covariates are used to derive estimators that both account for informative censoring and are more efficient than the estimators which do not consider the auxiliary covariates.
Keywords: Informative censoring, Influence function, Logrank test, Nuisance tangent space, Proportional hazards model, Regular and asymptotically linear estimators
1 Introduction
In many randomized clinical trials, the primary endpoint of interest, denoted by Y, is not observed immediately after patients enroll into a study, but rather is observed after some period of time which may vary among patients. The time from a patient’s entry into the study until the response is observed is referred to as the lag time or time to ascertainment, and the corresponding response variable is referred to as the time-lagged response (Anstrom and Tsiatis 2001) or marked point process (Huang and Louis 1998). The simplest example of a time-lagged response is survival time, where the primary endpoint itself is the lag time, which varies by individual. Another simple example of a time-lagged response is a laboratory measurement taken after some fixed period of time, in which case the lag time is the same for all individuals. Yet another example is when the response variable of interest Y is the total medical cost incurred during the treatment of some disease, in which case the time to ascertainment is the length of disease, which will vary by individual. The first two examples of the time-lagged response will be discussed in detail in this paper. A real data example, ACTG 175, which will be introduced shortly in section 1.1, will be used to illustrate the application of our developed methods to these two cases. The methods can also be applied more generally to other time-lagged response data such as the third example.
Our primary goal is to estimate or test for the treatment effect between two competing treatment groups, e.g., a new treatment versus placebo, which is defined through the statistical model
$$p_{Y\mid Z}(y \mid z; \beta, \eta), \tag{1}$$
where Z denotes the treatment assignment with Z = 1 for new treatment and Z = 0 for placebo, pY|Z(y|z) is used to denote the conditional density of Y given Z, β is the treatment effect parameter of interest and η are nuisance parameters used to describe the class of conditional distributions of Y|Z. The lag time that it takes for the response variable Y to be ascertained is denoted by T. For example, if the time-lagged response is survival time then Y = T and a commonly used model for assessing the treatment effect is the proportional hazards model where the conditional density of T given Z is modeled through its hazard function; namely,
$$\lambda_{T\mid Z}(t \mid z) = \lambda(t)\exp(\beta z), \tag{2}$$
where λT|Z(t|z) denotes the conditional hazard rate of dying at time t for treatment z = 0, 1. Here the parameter β, i.e., the log hazard ratio between the two treatments, is the treatment effect of interest, whereas, the baseline hazard function λ(t), t ≥ 0 is the nuisance parameter η.
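As a small numerical illustration of the proportional hazards model (2), the sketch below checks that the hazard ratio between the two treatments equals exp(β) at every time point. The constant baseline hazard λ(t) = 1 and the function names `hazard` and `survival` are assumptions made only for this example.

```python
import math

def hazard(t, z, beta, baseline=lambda t: 1.0):
    """Conditional hazard lambda(t) * exp(beta * z), as in model (2)."""
    return baseline(t) * math.exp(beta * z)

def survival(t, z, beta, rate=1.0):
    """Survival function when the baseline hazard is the constant `rate`
    (an assumption of this sketch, not a requirement of the model)."""
    return math.exp(-rate * t * math.exp(beta * z))

beta = 0.3
# The ratio of the hazards for z = 1 versus z = 0 does not depend on t.
ratios = [hazard(t, 1, beta) / hazard(t, 0, beta) for t in (0.5, 1.0, 2.0)]
```

Under this model the survival function for z = 1 is the z = 0 survival function raised to the power exp(β), which is why β is interpreted as a log hazard ratio.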
For other time-lagged response data where T ≠ Y, if all the patients in a study are followed until their response is ascertained, the lag time itself does not add any useful additional information regarding the estimation of the treatment effect β.
It is very common, however, in such clinical trials that the time-lagged response data are missing because of right censoring of some patients. Depending on the study, censoring occurs for a variety of reasons. Administrative censoring occurs because patients enter the study in a staggered fashion and not all have been observed at the end of the study when the data are analyzed. Often it is assumed that the censoring time is independent of the primary response Y or, slightly more weakly, that the censoring time is independent of Y given treatment assignment Z. Independence between censoring time and response Y given Z is also required for the consistency and asymptotic normality of some commonly used standard methods, for example, the maximum partial likelihood estimator of Cox (Cox 1972) and the logrank test (Mantel 1966; Peto and Peto 1972). This assumption is often referred to as noninformative censoring. However, censoring may also occur because patients drop out of the study before their response data are observed. For example, patients may drop out of the study because of side effects, or prognostically worse or better patients may drop out for reasons that can be attributed to other time-dependent outcomes. Under such situations, the censoring time is likely to depend on the response Y given Z, and such censoring is usually referred to as informative censoring. Informative censoring, if not properly accounted for, may bias the results from standard inferential methods and give overly optimistic or pessimistic estimates of treatment effect.
It is also common in clinical trials to collect additional information on auxiliary covariates (for example, age, gender, health conditions, etc.). Some of these auxiliary variables are collected prior to randomization, while others may be collected after treatment assignment. Because of randomization, covariates collected prior to randomization, referred to as baseline auxiliary covariates, are independent of treatment assignment and are not affected by treatment, whereas, covariates measured after randomization, referred to as post-treatment auxiliary covariates, may be time-dependent and affected by treatment assignment. Nonetheless, some of these covariates may be important prognostic factors that are correlated with the primary response variable.
Several recent works (e.g. Zhao and Tsiatis 1999; van der Laan and Hubbard 1999; Wahed and Tsiatis 2006) discussed methods for estimating the survival distribution when censoring is present and utilized prognostic covariates to improve efficiency. More relevant work on the estimation of treatment effect was presented by Lu and Tsiatis (2008), who derived an augmented class of consistent and asymptotically normal estimators for the treatment-specific log hazard ratio regression parameter as defined in (2). The auxiliary covariates were used to derive estimators that are more efficient than the maximum partial likelihood estimator and the logrank test. However, the method proposed by Lu and Tsiatis (2008) was based on the assumption of noninformative censoring, that is, that the censoring is independent of survival time T given treatment assignment Z. If such an assumption is not satisfied, the corresponding results may be biased.
In the presence of informative censoring, Hubbard et al (1999) and Moor and van der Laan (2009) discussed covariate adjusted estimation methods on the treatment specific survival at a fixed end point for right-censored survival outcomes. In this paper, we focus on estimation of treatment-specific log hazard ratio regression parameter by weakening the assumption of independence between censoring time and the response variable Y given Z to allow for censoring that is informative in a manner that can be explained through the observed auxiliary covariates. The semiparametric theory and the major results in Robins and Rotnitzky (1992) will be used to derive a class of consistent and asymptotically normal estimators for the treatment effect parameter β. The auxiliary covariates here play an important role in deriving a class of augmented semiparametric consistent and asymptotically normal estimators for β when the censoring is informative. The correlations between auxiliary covariates and the primary response variable are also utilized to derive estimators that are more efficient than the estimators without using the auxiliary covariates.
1.1 An illustrative example: ACTG 175
AIDS Clinical Trials Group protocol 175 (ACTG 175) is a double blind study that randomized HIV-infected subjects to four antiretroviral regimes in equal proportions: zidovudine (ZDV) monotherapy, ZDV + didanosine (ddI), ZDV + zalcitabine, and ddI monotherapy (Hammer et al 1996). A subset of the data from 2139 HIV subjects is considered in this paper to demonstrate our proposed methods and to compare them with the commonly used standard techniques. Among the 2139 subjects, 532 subjects were randomized to ZDV monotherapy, 522 were randomized to ZDV + ddI, 524 were randomized to ZDV + zalcitabine and 561 were randomized to ddI monotherapy. Two primary endpoints, corresponding to the two special cases of time-lagged response mentioned earlier, are taken into account: (1) the survival time, defined as the time from diagnosis to the first occurrence of a ≥ 50 percent decline in the CD4 cell count, an event indicating progression to the acquired immunodeficiency syndrome (AIDS), or death; (2) the mean CD4 cell counts at 96 ± 5 weeks after diagnosis. Our aim is to compare the treatment effects on the two primary endpoints between each of the treatments ZDV + ddI, ZDV + zalcitabine and ddI monotherapy and the treatment ZDV monotherapy. Roughly 76% of the data were censored.
In addition to the censored survival times, CD4 cell counts at 96 ± 5 weeks and treatment arms, the data also contain several prognostic baseline covariates and post-treatment covariates. The baseline covariates include CD4 cell counts, CD8 cell counts, age (years), weight (kg), gender (0 = female), hemophilia indicator (0 = no), homosexual activity (0 = no), race (0 = white, 1 = non-white), history of IV drug use (0 = no), Karnofsky score (on a scale of 0-100), ZDV in the 30 days prior to 175 (0 = no), antiretroviral history stratification (1 = ’Antiretroviral Naive’, 0 = other), number of days pre-175 antiretroviral therapy and symptomatic status indicator (0 = asymptomatic). The post-treatment covariates include CD4 at 20 ± 5 weeks, CD8 at 20 ± 5 weeks, indicator of off-trt before 96 ± 5 weeks (0 = no, 1 = yes) and Missing CD4 at 96 ± 5 weeks (0 = missing, 1 = observed). The last two post-treatment covariates as well as CD4 at 96 ± 5 weeks are time-dependent covariates that will only be used as covariates for the survival endpoint, which extends beyond 96 weeks.
This article is organized as follows: Section 2 describes the notation and model assumptions which will be used throughout this article. Section 3 characterizes the class of regular and asymptotically linear estimators for β for a general time-lagged response. Sections 4 and 5 present the specific applications of our method to the two special cases mentioned earlier. For each case, we characterize a subclass of regular and asymptotically linear estimators for β, the treatment effect parameter of interest, when informative censoring exists and perform a series of simulation studies to compare our proposed estimators with the commonly used standard techniques. The proposed estimators are also applied to the real data example ACTG 175.
2 Model framework and notation
2.1 Notation and assumptions
Consider a randomized clinical trial where n subjects are sampled from a population of interest. Let Di = {Ui, Δi, ΔiYi, Zi, Xi(Ui)}, i = 1, …, n, denote the observed data, which are independent and identically distributed random vectors. For the i-th subject, Ui = min(Ti, Ci), where Ti denotes the underlying lag time and Ci denotes the potential censoring time; Δi = I(Ti ≤ Ci) is an indicator of whether the response data were ascertained (Δi = 1) or missing (Δi = 0); Yi denotes the response on which the primary analysis will be based, where Yi may be continuous or discrete and is only observed if Δi = 1; and Zi denotes the treatment indicator, with values 1 and 0 denoting experimental treatment and placebo, respectively. Furthermore, we let $X_i(U_i) = \{X_{1i}, \bar{X}_{2i}(U_i)\}$, where X1i denotes a vector of baseline auxiliary covariates which are measured prior to randomization, and $\bar{X}_{2i}(U_i)$, defined by {X2i(u), 0 ≤ u ≤ Ui}, denotes post-treatment auxiliary covariates, which may be time-dependent, in which case we would observe the history of these values up to time Ui. In addition, we use Vi = {Ti, Yi, Zi, Xi(Ti)} to denote the full data had there been no censoring or missing data.
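The mapping from full data to observed data can be sketched in a few lines. The function name `observe`, the dictionary layout, and the representation of the post-treatment covariate history as a list of (time, value) pairs are hypothetical choices for illustration; the sketch takes Δ = 1 to mean that the response was ascertained, that is, T ≤ C.

```python
def observe(t, c, y, z, x1, x2_history):
    """Map full data (T, Y, Z, X) and a censoring time C to observed data.

    x2_history is a list of (time, value) pairs for a post-treatment
    covariate; only the part measured up to U = min(T, C) is observed.
    """
    u = min(t, c)
    delta = 1 if t <= c else 0            # 1: response ascertained, 0: censored
    y_obs = y if delta == 1 else None     # the response is missing when censored
    x2_seen = [(s, v) for (s, v) in x2_history if s <= u]
    return {"U": u, "Delta": delta, "Y": y_obs, "Z": z, "X1": x1, "X2": x2_seen}

# A subject censored at time 1.5, before the lag time 2.0 is reached.
example = observe(t=2.0, c=1.5, y=7.0, z=1, x1=0.3,
                  x2_history=[(0.5, 1.0), (1.8, 2.0)])
```

In this example the covariate value measured at time 1.8 falls after the censoring time and is therefore not part of the observed history.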
Due to randomization, it is reasonable to assume that the treatment indicator Z is independent of the auxiliary baseline covariates X1 and that the randomization probability of receiving treatment 1 is equal to π with 0 < π < 1 which is known to us; that is,
$$P(Z = 1 \mid X_1) = P(Z = 1) = \pi. \tag{3}$$
It is well known that, instead of using the true value of π, using the estimated π generally leads to further gain in efficiency for estimators for β. Therefore, we recommend estimating π using the sample proportion as it is actually done in all the following computations both for simulations and real data example.
As in any missing data problem it is important to consider assumptions regarding the process in which the data are missing (censored); that is, we need to consider the conditional distribution of the censoring variable C given the full data V which we define through the conditional hazard function λC(u|V) at time u given V. Often, in randomized clinical trials, one assumes that the censoring variable C is completely independent of the full-data V. This assumption is similar in spirit to the “missing completely at random” assumption (MCAR) as defined by Rubin (1987) and may be a reasonable assumption if the data were administratively censored. A slightly weaker assumption that is implicitly made when one uses the logrank test to test for differences in the survival distributions for two treatments with right-censored data is that C is conditionally independent of (T, Y ) given Z. When a patient is censored due to drop out or lost to follow-up, one can imagine some scenarios where poorer prognostic patients may be more likely to be censored and other scenarios where the opposite may happen. We therefore consider the weaker assumption that
$$\lambda_C(u \mid V) = \lambda_C\{u \mid T \ge u, Z, X(u)\}, \tag{4}$$
where λC(u|·) denotes the conditional hazard rate of C at time u. In words, this assumption means that the probability of being censored at time u, given that one has not been censored or failed by time u, only depends on observed variables measured prior to time u and not additionally on the future data. This last assumption is similar in spirit to what Rubin refers to as missing at random (MAR) and we will refer to this assumption as censoring at random (CAR). This CAR assumption allows greater flexibility than the usual assumptions with censored data.
Without making additional assumptions, other than those given by (1), (3) and (4), we now consider how to derive a class of semiparametric estimators that are consistent and asymptotically normal for β using the results by Robins and Rotnitzky (1992).
2.2 Introduction to semiparametric theory
Robins and Rotnitzky (1992) restricted attention to estimators that are regular and asymptotically linear (RAL). An estimator $\hat{\beta}$ for β is asymptotically linear if there exists a random variable φ(D), which, under the truth, β = β0, has mean zero and finite variance, such that $n^{1/2}(\hat{\beta} - \beta_0) = n^{-1/2}\sum_{i=1}^{n} \varphi(D_i) + o_p(1)$. The function φ(Di) is referred to as the i-th influence function of the estimator $\hat{\beta}$. Regularity is a technical condition that rules out "pathological" estimators with undesirable local properties (Newey 1990), such as the "superefficient" estimator of Hodges (e.g. Tsiatis 2006, p. 24). The influence function of an RAL estimator for β is uniquely defined, and the asymptotic properties of such an estimator are determined by its influence function. It is clear from the definition of the influence function given above and a simple application of the central limit theorem that the asymptotic variance of an RAL estimator is equal to the variance of its influence function, i.e., E{φ(D)²}.
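For intuition, a familiar example of an RAL estimator (added here for illustration and not taken from the cited works) is the sample mean of i.i.d. observations Y1, …, Yn with true mean μ0 and finite variance. In this case the linear representation holds exactly, with no remainder term:

```latex
\begin{aligned}
\hat{\mu} &= n^{-1}\sum_{i=1}^{n} Y_i, \qquad
n^{1/2}(\hat{\mu} - \mu_0) = n^{-1/2}\sum_{i=1}^{n} (Y_i - \mu_0), \\
\varphi(Y_i) &= Y_i - \mu_0, \qquad
E\{\varphi(Y)\} = 0, \qquad
E\{\varphi(Y)^2\} = \mathrm{var}(Y).
\end{aligned}
```

The asymptotic variance of the sample mean, var(Y), is thus recovered as the variance of its influence function.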
For a general semiparametric model, Robins and Rotnitzky (1992) provided a series of steps for deducing a class of RAL estimators when data are censored at random: (1) characterize the class of full data influence functions, (2) characterize the class of observed data influence functions by applying the theory of augmented inverse probability complete case estimators and (3) identify the observed data estimators with estimating functions in this class.
3 The class of all semiparametric estimators for β
The class of full data influence functions for β, whose proof was given by Zhang et al (2008), is characterized by
(5) |
where ΨY Z(Y, Z; β0) is the class of influence functions for β that only use the information of time-lagged response variable Y and treatment assignment Z derived through model (1),
(6) |
and the sum of the two spaces S1 + S2 = {s1 + s2 : s1 ∈ S1 and s2 ∈ S2}. In addition, as we will see shortly in Theorem 1, the direct sum ⊕ of the two linear spaces S1 and S2 is the same as the regular sum of the two spaces but with an additional property that the two linear spaces only intersect at {0}.
For the time being, assume the hazard function λC{u|T ≥ u, Z, X(u)} to be known and let $K_C\{u, Z, X(u)\} = \exp[-\int_0^u \lambda_C\{s \mid T \ge s, Z, X(s)\}\,ds]$ be the conditional survival function for censoring. Under the CAR assumption, we obtain that P(Δ = 1|V) = KC{T, Z, X(T)}. The theory of Robins and Rotnitzky (1992) tells us that an observed data influence function for β can be written as an augmented inverse probability weighted complete case (AIPWCC) influence function; namely,
(7) |
where the full-data influence function ψF(V; β0) ∈ ΨF is given in (5), dMC{u, Z, X(u)} is the increment of the martingale process dNC(u) – λC{u|T ≥ u, Z, X(u)} Y(u)du, where dNC(u) is the increment of the counting process for censoring, i.e., NC(u) = I(U ≤ u, Δ = 0), Y(u) = I(U ≥ u) is the “at-risk” process, and L{u, Z, X(u)} is a function of all the data prior to time u; that is,
Theorem 1 If the hazard function λC{u|T ≥ u, Z, X (u)} for censoring is known, the class of all observed data influence functions is given by
(8) |
where ΨY Z and R are defined in (5) and (6), respectively, and
(9) |
Theorem 1 (see proof in Appendix A) tells us that if λC{u|T ≥ u, Z, X(u)} is known, then the class of RAL estimators for β can be represented as solutions to the following estimating equations
(10) |
for arbitrary functions ψY Z (Y, Z; β0) ∈ ΨY Z, f (X1) and L{u, Z, X(u)}.
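The inverse probability weighting that underlies the complete-case part of these estimating equations can be illustrated in a deliberately simplified setting. The sketch below is not the estimator of Theorem 1: it sets the augmentation terms to zero, targets the mean of a survival-type response Y = T rather than β, and assumes a known censoring survival function KC(u) = exp(−cu) that does not depend on covariates. The names `ipw_mean` and `simulate`, the exponential distributions and the rate c = 0.3 are all assumptions of this example.

```python
import math
import random

def ipw_mean(data, k_c):
    """Inverse probability weighted complete-case mean.

    data holds (U, Delta, Delta * Y) triples; each complete case is
    weighted by 1 / K_C(U), the inverse probability of being uncensored.
    """
    return sum(d * y / k_c(u) for (u, d, y) in data) / len(data)

def simulate(n, censor_rate, seed=0):
    """Exponential(1) lag times with independent exponential censoring."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t = rng.expovariate(1.0)
        c = rng.expovariate(censor_rate)
        d = int(t <= c)
        data.append((min(t, c), d, t if d else 0.0))
    return data

censor_rate = 0.3
data = simulate(20000, censor_rate)
ipw = ipw_mean(data, lambda u: math.exp(-censor_rate * u))
# Naive complete-case average, which ignores that long lag times are
# more likely to be censored.
naive = sum(y for (_, d, y) in data if d) / sum(d for (_, d, _) in data)
```

With a true mean of 1, the weighted average should land close to 1, whereas the naive complete-case average is pulled downward because subjects with long lag times are under-represented among the complete cases.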
In practice, however, the hazard function λC{u|T ≥ u, Z, X(u)} (also used to derive KC{u, Z, X(u)}) is not known and must be estimated from the data. Because of its availability and versatility, a proportional hazards regression model with time-dependent covariates will be used to model the hazard of the censoring distribution. For maximum flexibility, we will consider separate models for each treatment group. To be specific, we posit a stratified proportional hazards regression model for the censoring time C; that is, λC{u|T ≥ u, Z, X(u)} = λ0C(u, Z) H{Z, X(u); α}, where λ0C(u, Z) is the baseline hazard function for each treatment group Z = 0, 1, and . The vector of parameters α = (α11, …, α22) will be estimated by using the standard partial likelihood estimator (Cox 1975), denoted here by $\hat{\alpha}$, and λ0C will be estimated by
Denote the estimated hazard function and corresponding survival function by $\hat{\lambda}_C\{u \mid T \ge u, Z, X(u)\}$ and $\hat{K}_C\{u, Z, X(u)\}$, respectively. Placing these estimated functions in (10), after some algebra, we have an equivalent expression of the estimating equation
(11) |
where
(12) |
for arbitrary functions ψY Z(Y, Z; β0) ∈ ΨY Z, f(X1) and L {u, Z, X (u)}.
Clearly, if the proportional hazard regression model is the underlying true model for the censoring time, the estimator derived from equation (11) is a consistent estimator for β. In addition, if we consider the subclass of estimators of (11) by fixing function ψY Z(Y, Z; β0) ∈ ΨY Z but varying functions f(X1) and L{u, Z, X(u)}, following the analogous proof as that for Theorem 3 of Lu and Tsiatis (2008), we have the optimal functions
(13) |
(14) |
for f(·) and L(·), respectively, that lead to the most efficient estimator, that is, the one with the smallest variance, in this subclass. However, the conditional expectations in equations (13) and (14) are unknown in practice, and they will be estimated by positing parametric regression models. Substituting the resulting estimated functions $\hat{f}(\cdot)$ and $\hat{L}(\cdot)$ for f(·) and L(·) in (11) will lead to a more efficient estimator for β than one that does not use auxiliary covariates. This procedure will be explained in detail when we discuss the two special applications of our proposed method in sections 4 and 5, respectively.
Remark 1 Although we need the correct model for the censoring time to derive consistent estimators for β, we will see later in the simulation portions of sections 4 and 5 that the resulting estimators for β have smaller bias than the standard method even if the true model for censoring does not follow the assumed proportional hazards model.
4 Application to the proportional hazard model
4.1 Class of all semiparametric estimators for β
To illustrate how these estimators are derived, we will focus on the special case where the time-lagged response variable Y is the survival time T and T|Z follows a proportional hazard model (2), where the primary focus is to estimate the treatment effect parameter β. This problem is common in chronic disease clinical trials and is of importance in its own right.
Using standard results for the proportional hazards model (e.g. Andersen et al 1993; van der Laan and Robins 2003; Tsiatis 2006) the class of full data influence functions ΨY Z for estimators of β can be described by
(15) |
where dMT(u, Z; β) = dNT(u)–λ(u) exp(βZ)I(T ≥ u)du, NT(u) = I(T ≤ u), , and the proportionality constant .
Applying this class of full data influence functions ΨY Z to Theorem 1, we have the following corollary whose proof is given in Appendix B.
Corollary 1 If the hazard function λC{u|T ≥ u, Z, X (u)} for censoring is known, and if T|Z follows a proportional hazard model (2), the class of all observed data influence functions is given by
(16) |
where
(17) |
(18) |
(19) |
and R and C are defined in (6) and (9), respectively.
Analogous to (11), if we use the estimated functions $\hat{\lambda}_C\{u \mid T \ge u, Z, X(u)\}$ and $\hat{K}_C\{u, Z, X(u)\}$, assuming the stratified proportional hazards regression model, the class of all RAL estimators for β can be represented as a solution to the following estimating equation
(20) |
where
(21) |
for arbitrary functions W(u, Z), f(X1) and L{u, Z, X(u)}. We can also derive a more efficient estimator for β by substituting appropriate estimators $\hat{f}(\cdot)$ and $\hat{L}(\cdot)$ for f(·) and L(·) in (20). To be specific, we posit parametric models f(X1; a) = a^T q(X1), which is linear in a, and L{u, Z, X(u); b} = b^T s{u, Z, X(u)}, which is linear in b, where a and b are ra- and rb-dimensional vectors of unknown parameters, respectively, q(·) is an ra-dimensional vector of functions of X1 and s(·) is an rb-dimensional vector of functions of {u, Z, X(u)}, and consider the subclass of RAL estimators that solve the estimating equation (20) with f(·) = f(X1; a) and L(·) = L{u, Z, X(u); b}, for all a and b. We define a0 and b0 to be the values leading to the smallest asymptotic variance of the estimator within this subclass. Using standard regression methods, we obtain the estimators
(22) |
(23) |
for a0 and b0, respectively, where
is estimated using the increment of the Breslow estimate for the underlying cumulative hazard function, i.e., , , and , for Z = 0, 1 are estimated using the increment of the treatment-specific Nelson-Aalen estimator for the cumulative hazard function of the censoring distribution; that is, . and are calculated in a similar manner as in (12) by replacing the function L(·) with s(·). The estimated functions $\hat{f}(X_1)$ and $\hat{L}\{u, Z, X(u)\}$ are defined to be $\hat{a}^T q(X_1)$ and $\hat{b}^T s\{u, Z, X(u)\}$, respectively, where $\hat{a}$ and $\hat{b}$ denote the estimators in (22) and (23).
Remark 2 If the censoring time is noninformative, then after some algebra, the class of estimating equations described in (20) will be identical to the class of estimating equations characterized in (9) of Lu and Tsiatis (2008).
Remark 3 Finding the optimal function of W (u, Z) that gives the efficient estimator in the class of all RAL estimators derived from (20) would be very difficult. Therefore our strategy is to choose a function that gives an estimator at least as efficient as the maximum partial likelihood estimator when the censoring time is noninformative. Here, we choose , where is the treatment specific Kaplan-Meier estimator for the censoring time. Clearly, when censoring is noninformative, the first summand of the estimating equation (20) will reduce to the standard partial likelihood score function that leads to the maximum partial likelihood estimator for β.
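Remark 3 relies on a treatment-specific Kaplan-Meier estimator for the censoring time. A minimal sketch of such an estimator is given below; it treats censoring (Δ = 0) as the event of interest and an ascertained response (Δ = 1) as removal from the risk set. The function names and the convention that subjects with U equal to the current time are all counted as at risk (matching the at-risk process I(U ≥ u)) are choices made for this illustration.

```python
def km_censoring(times, deltas):
    """Kaplan-Meier estimate of the censoring survival function.

    times:  observed times U = min(T, C)
    deltas: 1 if the response was ascertained, 0 if censored
    Returns (time, survival just after time) pairs at each censoring time.
    """
    pairs = sorted(zip(times, deltas))
    n_at_risk = len(pairs)
    surv, curve = 1.0, []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        j, n_cens = i, 0
        while j < len(pairs) and pairs[j][0] == t:
            n_cens += 1 - pairs[j][1]   # count censoring events at time t
            j += 1
        if n_cens > 0:
            surv *= 1.0 - n_cens / n_at_risk
            curve.append((t, surv))
        n_at_risk -= j - i              # everyone observed at t leaves the risk set
        i = j
    return curve

def km_censoring_by_arm(times, deltas, arms):
    """Apply the estimator separately within each treatment arm."""
    return {
        z: km_censoring([t for t, a in zip(times, arms) if a == z],
                        [d for d, a in zip(deltas, arms) if a == z])
        for z in sorted(set(arms))
    }

curve = km_censoring([1.0, 2.0, 3.0, 4.0], [1, 0, 1, 0])
```

In the small example, the first censoring occurs at time 2 with three subjects at risk, so the estimated censoring survival drops to 2/3 there, and it drops to zero at time 4 when the last remaining subject is censored.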
4.2 Variance estimator for $\hat{\beta}$
If the potential censoring time truly follows a stratified proportional hazards regression model, λC{u|T ≥ u, Z, X(u)} = λ0C(u, Z)H{Z, X(u); α}, as described in section 3, then using the estimated functions $\hat{\lambda}_C$ and $\hat{K}_C$ in (20) will lead to an estimator with its influence function in , where Π(S1|S2) denotes the projection of space S1 onto space S2, is defined in (16) and Λφ = Λλ0C(·, Z) ⊕ Λα, in which Λλ0C(·, Z) is the nuisance tangent space associated with the nuisance baseline hazard function λ0C(·, Z), which is given in equation (31) of Appendix C, and Λα is the nuisance tangent space associated with the nuisance parameters α = (α11, …, α22), given in equation (32) of Appendix C (e.g. van der Laan and Robins 2003; Tsiatis 2006); that is,
Theorem 2 If the hazard function λC{u|T ≥ u, Z, X(u)} = λ0C(u, Z)H{Z, X(u); α}, then the class of all observed data influence functions is given by
(24) |
where is defined in (16), is given by
(25) |
in which CW and Z*(u;β) are defined in (19) and (18), respectively,
(26) |
(27) |
and Π(ψ(D)|Λα) is given by (35).
Remark 4 The proof of Theorem 2 is given in Appendix C. The expressions for the fourth and fifth summands in (25) are very complicated and difficult to evaluate numerically. Therefore we choose to ignore these terms in our variance estimator of the estimator that solves equation (20). If the censoring time is noninformative, then these two summands will add to zero, giving us a consistent variance estimator. In addition, for the posited model L{u, Z, X(u); b} = b^T s{u, Z, X(u)}, if s{u, Z, X(u)} contains Q(u), which is defined by (34), then the projection Π(ψ(D)|Λα) is identical to zero; otherwise, our estimator will be slightly conservative. However, in all the simulation studies that will be described later, the conservatism in the variance estimator was never noticeable. Thus, a variance estimator for $\hat{\beta}$ is computed by the following sandwich estimator
(28) |
where
4.3 Simulation
We performed a Monte-Carlo simulation study to compare the performance of the maximum partial likelihood estimator with our proposed estimators $\hat{\beta}_1$, $\hat{\beta}_2$ and $\hat{\beta}_3$ that are obtained by solving the estimating equation (20) with W(u, Z) chosen as in Remark 3 together with {f(·) = 0, L(·) = 0}, {f(·) = $\hat{f}(\cdot)$, L(·) = 0} and {f(·) = $\hat{f}(\cdot)$, L(·) = $\hat{L}(\cdot)$}, respectively. It is clear that these three estimators have decreasing asymptotic variances, as $\hat{\beta}_1$ is obtained without using any auxiliary covariate information except for estimating the survival function for the censoring time, whereas $\hat{\beta}_2$ and $\hat{\beta}_3$ are obtained by utilizing the information on baseline covariates and the information on all the covariates, respectively.
For this study, we considered one baseline covariate X1 and one post-treatment covariate X2 that can be obtained immediately for all patients. First we generated bivariate data (Y, X) from a bivariate normal density with mean zero, variance 1, and correlation ρ. We then independently generated the treatment indicator Z as a Bernoulli(π). Using inverse transformation, the survival time T was taken to be T = −exp(−βZ) log{1 − Φ(Y)}, where Φ(·) denotes the cumulative distribution function of a standard normal, and the baseline covariate was taken to be X1 = Φ(X), which follows a uniform (0, 1) distribution. This construction guarantees that the distribution of T given Z will follow a proportional hazards relationship λ(t|z) = λ(t) exp(βz), with λ(t) = 1, that is, T ~ Exponential{exp(βZ)}. The post-treatment covariate was generated using the formula which results in the correlation of X2 and T to be r. Note that this makes X2 conditionally independent of the treatment Z given T. Censoring time C was generated using the following three scenarios: 1) Exponential distribution with hazard rate λC(u|Z) = c exp(βZ), reflecting noninformative censoring given Z; 2) Exponential distribution with hazard rate λC(u|X1, X2, Z) = c {Z exp(α11 X1 + α21 X2) + (1 − Z) exp(α12 X1 + α22 X2)}, reflecting informative censoring with a stratified proportional hazards regression model; 3) Scaled lognormal distribution c·LN(μ, σ) with μ = Z(ℓ11X1 + ℓ21X2) + (1−Z)(ℓ12X1 + ℓ22X2) and σ = 1, in which case the censoring time is correlated with the survival time given Z but does not follow a stratified proportional hazards regression model. To calculate the estimated functions $\hat{f}(\cdot)$ and $\hat{L}(\cdot)$, we posited the models for f(·), and b0 + b1 X1 + b2 X2 + b3 X1 Z + b4 X2 Z for L(·).
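The data-generating steps for censoring scenario 1 can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the reported simulations: the post-treatment covariate X2 is omitted because its generating formula is not reproduced here, the baseline covariate is taken as X1 = Φ(X) so that it is uniform on (0, 1), and the value c = 0.5625 is our own choice, based on the observation that when both hazards are scaled by exp(βZ) the censoring probability is c/(1 + c), which equals 0.36 for this c. The function names are hypothetical.

```python
import math
import random

def std_normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def std_normal_sf(x):
    """Upper tail 1 - Phi(x), computed directly for accuracy in the tail."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def simulate_trial(n, beta, rho=0.5, pi=0.5, c=0.5625, seed=0):
    """Generate observed data (U, Delta, Z, X1) under scenario 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Bivariate normal (Y, X) with unit variances and correlation rho.
        y = rng.gauss(0.0, 1.0)
        x = rho * y + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        z = 1 if rng.random() < pi else 0
        # Inverse transformation: T is exponential with rate exp(beta * z).
        t = -math.exp(-beta * z) * math.log(std_normal_sf(y))
        x1 = std_normal_cdf(x)
        # Scenario 1: exponential censoring with hazard c * exp(beta * z).
        cens = rng.expovariate(c * math.exp(beta * z))
        rows.append({"U": min(t, cens), "Delta": int(t <= cens),
                     "Z": z, "X1": x1})
    return rows

rows = simulate_trial(20000, beta=0.3)
censored_fraction = 1.0 - sum(r["Delta"] for r in rows) / len(rows)
```

Because X is positively correlated with Y and larger Y gives larger T, the generated X1 is positively associated with the survival time, which is what makes it useful as a prognostic baseline covariate.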
For this demonstration, treatment was assigned with probability π = .5, and the correlation between the bivariate normal random variables was taken to be ρ = .5, which resulted in a sample correlation of approximately 0.4 between the survival time T and baseline covariate X1. The correlation of X2 and T was taken to be r = 0.7. Two values for the proportional hazards regression coefficient were considered, β = 0 (null hypothesis) and β = .3. For the censoring time, α11 = 1, α21 = 0.1, α12 = 2, α22 = 0.3, ℓ11 = 0.3, ℓ21 = 1, ℓ12 = 1, ℓ22 = 0.3, and the value c was chosen in each scenario so that roughly 36% of the data would be censored. Sample sizes of 250 and 600 were considered and each scenario used 1000 Monte-Carlo simulations. In Tables 1, 2 and 3, we compare the bias, standard error estimate, Monte-Carlo standard error, relative efficiency (ratio of variance estimates and ratio of Monte-Carlo variances), type I error and the power of the maximum partial likelihood estimator with those of our proposed estimators $\hat{\beta}_k$, k = 1, 2, 3, under the various simulation scenarios.
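The relative efficiencies reported in parentheses in the tables can be read as ratios of variances, that is, squared ratios of standard errors. The helper below is a hypothetical illustration of that calculation, using two rounded average standard errors from Table 1 (0.160 for the reference and 0.145 for the estimator using baseline covariates, at n = 250).

```python
def relative_efficiency(se_reference, se_estimator):
    """Variance ratio; values above 1 favor the estimator over the reference."""
    return (se_reference / se_estimator) ** 2

# Rounded average standard errors taken from Table 1 (beta0 = 0, n = 250).
re_baseline = relative_efficiency(0.160, 0.145)
```

The value obtained from the rounded standard errors is close to, but not exactly, the tabulated 1.21, presumably because the tabulated ratio was computed from unrounded quantities.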
Table 1.
True β0 | Statistic | n | MPLE | $\hat{\beta}_1$ | $\hat{\beta}_2$ | $\hat{\beta}_3$ |
---|---|---|---|---|---|---|
0 | Bias | 250 | 0.004 | 0.004 | 0.005 | 0.005 |
0 | Bias | 600 | 0.001 | 0.000 | 0.001 | 0.001 |
0 | AveSE | 250 | 0.160 (1.00) | 0.160 (1.00) | 0.145 (1.21) | 0.134 (1.42) |
0 | AveSE | 600 | 0.089 (1.00) | 0.089 (1.00) | 0.080 (1.24) | 0.077 (1.34) |
0 | MCSE | 250 | 0.165 (0.88) | 0.155 (1.00) | 0.141 (1.21) | 0.141 (1.21) |
0 | MCSE | 600 | 0.091 (0.94) | 0.088 (1.00) | 0.078 (1.27) | 0.078 (1.27) |
0 | Type I Error | 250 | 0.055 | 0.048 | 0.041 | 0.060 |
0 | Type I Error | 600 | 0.054 | 0.043 | 0.047 | 0.059 |
0.3 | Bias | 250 | 0.007 | 0.007 | 0.008 | 0.008 |
0.3 | Bias | 600 | 0.002 | 0.003 | 0.003 | 0.003 |
0.3 | AveSE | 250 | 0.162 (1.00) | 0.161 (1.00) | 0.147 (1.21) | 0.137 (1.38) |
0.3 | AveSE | 600 | 0.103 (1.00) | 0.014 (1.00) | 0.094 (1.21) | 0.088 (1.37) |
0.3 | MCSE | 250 | 0.165 (0.90) | 0.157 (1.00) | 0.143 (1.20) | 0.143 (1.20) |
0.3 | MCSE | 600 | 0.107 (0.91) | 0.102 (1.00) | 0.091 (1.26) | 0.091 (1.26) |
0.3 | Power | 250 | 0.481 | 0.474 | 0.555 | 0.606 |
0.3 | Power | 600 | 0.826 | 0.831 | 0.899 | 0.923 |
Table 2.
True β0 | Statistic | n | MPLE | $\hat{\beta}_1$ | $\hat{\beta}_2$ | $\hat{\beta}_3$ |
---|---|---|---|---|---|---|
0 | Bias | 250 | 0.159 | 0.014 | 0.015 | 0.015 |
0 | Bias | 600 | 0.159 | 0.008 | 0.008 | 0.008 |
0 | AveSE | 250 | 0.183 (1.07) | 0.189 (1.00) | 0.174 (1.18) | 0.161 (1.38) |
0 | AveSE | 600 | 0.117 (1.10) | 0.123 (1.00) | 0.114 (1.17) | 0.105 (1.38) |
0 | MCSE | 250 | 0.185 (1.02) | 0.187 (1.00) | 0.171 (1.20) | 0.171 (1.20) |
0 | MCSE | 600 | 0.121 (1.02) | 0.122 (1.00) | 0.110 (1.22) | 0.110 (1.22) |
0 | Type I Error | 250 | 0.141 | 0.053 | 0.049 | 0.069 |
0 | Type I Error | 600 | 0.289 | 0.043 | 0.039 | 0.064 |
0.3 | Bias | 250 | 0.152 | 0.019 | 0.020 | 0.020 |
0.3 | Bias | 600 | 0.149 | 0.010 | 0.010 | 0.010 |
0.3 | AveSE | 250 | 0.185 (1.05) | 0.190 (1.00) | 0.174 (1.18) | 0.163 (1.35) |
0.3 | AveSE | 600 | 0.119 (1.08) | 0.123 (1.00) | 0.113 (1.17) | 0.106 (1.34) |
0.3 | MCSE | 250 | 0.188 (1.01) | 0.189 (1.00) | 0.173 (1.20) | 0.173 (1.20) |
0.3 | MCSE | 600 | 0.121 (1.04) | 0.123 (1.00) | 1.112 (1.21) | 1.112 (1.21) |
0.3 | Power | 250 | 0.684 | 0.397 | 0.450 | 0.493 |
0.3 | Power | 600 | 0.962 | 0.707 | 0.769 | 0.803 |
Table 3.
True β0 | Statistics | n | MPLE | Proposed, k = 1 | Proposed, k = 2 | Proposed, k = 3 |
---|---|---|---|---|---|---|
0 | Bias | 250 | −0.245 | 0.023 | 0.019 | 0.019 |
0 | Bias | 600 | −0.254 | 0.051 | 0.052 | 0.052 |
0 | AveSE | 250 | 0.193 (1.37) | 0.225 (1.00) | 0.210 (1.15) | 0.183 (1.51) |
0 | AveSE | 600 | 0.123 (1.86) | 0.168 (1.00) | 0.160 (1.11) | 0.133 (1.60) |
0 | MCSE | 250 | 0.191 (1.43) | 0.229 (1.00) | 0.215 (1.13) | 0.215 (1.13) |
0 | MCSE | 600 | 0.123 (1.89) | 0.169 (1.00) | 0.164 (1.06) | 0.164 (1.06) |
0 | Type I Error | 250 | 0.247 | 0.045 | 0.041 | 0.071 |
0 | Type I Error | 600 | 0.527 | 0.040 | 0.037 | 0.088 |
0.3 | Bias | 250 | −0.230 | 0.023 | 0.019 | 0.019 |
0.3 | Bias | 600 | −0.240 | 0.046 | 0.048 | 0.048 |
0.3 | AveSE | 250 | 0.185 (1.38) | 0.218 (1.00) | 0.203 (1.15) | 0.177 (1.51) |
0.3 | AveSE | 600 | 0.118 (1.87) | 0.162 (1.00) | 0.154 (1.11) | 0.128 (1.60) |
0.3 | MCSE | 250 | 0.182 (1.42) | 0.216 (1.00) | 0.202 (1.15) | 0.202 (1.15) |
0.3 | MCSE | 600 | 0.117 (1.83) | 0.158 (1.00) | 0.154 (1.06) | 0.154 (1.06) |
0.3 | Power | 250 | 0.059 | 0.297 | 0.319 | 0.450 |
0.3 | Power | 600 | 0.067 | 0.617 | 0.666 | 0.781 |
Table 1 shows the simulation results under scenario 1, where the censoring time is noninformative. As expected, all the estimators are unbiased and control the type I error. Our proposed estimators are more efficient than the traditional maximum partial likelihood estimator and more powerful than the logrank test.
Table 2 shows the simulation results under scenario 2, where the censoring time follows a stratified proportional hazards regression model for each treatment group. As we can see, the traditional maximum partial likelihood estimator is severely biased, whereas all our proposed estimators are unbiased and control the type I error. This is consistent with the theoretical results in this paper.
In addition, from Table 3, we see that when the censoring time is informative but does not follow a stratified proportional hazards regression model, our proposed estimators are still less biased than the traditional maximum partial likelihood estimator.
4.4 Applying to ACTG 175 data
Figure 1 plots the logarithm of the negative logarithm of the estimated survival curve for each treatment group. The four lines, except for a few points early in time, are approximately parallel, suggesting that a proportional hazards relationship between treatments is reasonable. If the censoring time is noninformative, the maximum partial likelihood estimator is the most efficient estimator and the logrank test is the most powerful nonparametric test among those that do not use any additional covariates. The results of applying the standard analysis using Cox's maximum partial likelihood estimator can be found in Table 4. For example, the estimate of the log hazard ratio between treatments ZDV monotherapy and ZDV + ddI is −0.703 with a standard error of 0.124, which is highly statistically significant.
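The diagnostic described above rests on the fact that, under proportional hazards, log{−log S(t)} for two groups differs by a constant (the log hazard ratio), so the curves should be roughly parallel. The sketch below shows one way such curves could be computed; it uses a plain Kaplan-Meier estimator written with NumPy and is an illustration under that assumption, not the code used to produce Figure 1. The function names and the toy data are hypothetical.

```python
import numpy as np


def kaplan_meier(times, events):
    """Return distinct event times and the Kaplan-Meier estimate at each."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    surv = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)             # still under observation at t
        deaths = np.sum((times == t) & events)   # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)


def log_minus_log(surv):
    """Return a mask of usable points and log(-log S) where 0 < S < 1."""
    surv = np.asarray(surv, dtype=float)
    keep = (surv > 0.0) & (surv < 1.0)
    return keep, np.log(-np.log(surv[keep]))


if __name__ == "__main__":
    # Toy data for one group: follow-up times and event indicators (1 = event).
    t, s = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
    keep, y = log_minus_log(s)
    print(t[keep], y)
```

In practice one would compute these curves separately for each treatment group and plot them against time (or log time) on a common axis to judge whether they are approximately parallel.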
Table 4.
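Comparison | Estimator | Estimate | Standard Error | RE |
---|---|---|---|---|
ZDV monotherapy and ZDV + ddI | MPLE | −0.703 | 0.124 | 1.01 |
ZDV monotherapy and ZDV + ddI | Proposed, k = 1 | −0.689 | 0.124 | 1.00 |
ZDV monotherapy and ZDV + ddI | Proposed, k = 2 | −0.724 | 0.120 | 1.07 |
ZDV monotherapy and ZDV + ddI | Proposed, k = 3 | −0.721 | 0.117 | 1.12 |
ZDV monotherapy and ZDV + zalcitabine | MPLE | −0.640 | 0.121 | 1.01 |
ZDV monotherapy and ZDV + zalcitabine | Proposed, k = 1 | −0.638 | 0.122 | 1.00 |
ZDV monotherapy and ZDV + zalcitabine | Proposed, k = 2 | −0.617 | 0.114 | 1.15 |
ZDV monotherapy and ZDV + zalcitabine | Proposed, k = 3 | −0.590 | 0.111 | 1.21 |
ZDV monotherapy and ddI monotherapy | MPLE | −0.528 | 0.116 | 1.01 |
ZDV monotherapy and ddI monotherapy | Proposed, k = 1 | −0.525 | 0.116 | 1.00 |
ZDV monotherapy and ddI monotherapy | Proposed, k = 2 | −0.536 | 0.111 | 1.10 |
ZDV monotherapy and ddI monotherapy | Proposed, k = 3 | −0.509 | 0.109 | 1.14 |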
Applying a stratified proportional hazards regression model for the censoring time and using forward selection with an entry significance level of 0.05, we obtained a set of prognostic covariates for each treatment group. The important prognostic covariates for the censoring time are (AGE, RACE, STRAT, OFFTRT, MisCD4) for ZDV monotherapy, (HOMO, Z30, RACE, CD820, OFFTRT, MisCD4) for ZDV + ddI, (HOMO, RACE, STRAT, SYMP, OFFTRT) for ZDV + zalcitabine, and (Z30, GENDER, CD80, OFFTRT, MisCD4) for ddI monotherapy.
We also applied a similar model to the survival time and found a set of prognostic covariates for each treatment group, namely (CD40, CD80, GENDER, Z30, CD420, CD496, MisCD4) for ZDV monotherapy, (CD40, PREANTI, STRAT, SYMP, CD420, CD496, MisCD4) for ZDV + ddI, (PREANTI, KARN, SYMP, CD420, CD496, MisCD4) for ZDV + zalcitabine and (CD40, SYMP, CD420, CD496, MisCD4) for ddI monotherapy.
As we can see, for each treatment group, there is only one common covariate that is prognostic for both survival time and censoring time. For treatments ZDV monotherapy, ZDV + ddI and ddI monotherapy, MisCD4 is the only common covariate, and for treatment ZDV + zalcitabine, SYMP is the common covariate. Both MisCD4 and SYMP are binary variables, which gives us a rough sense that the survival time and censoring time are only weakly correlated given treatment assignment. Hence our proposed estimator may be close to the traditional partial likelihood estimator, which assumes independence between survival time and censoring time given treatment. This explains why there are no substantial differences between our estimators and the maximum partial likelihood estimator after applying our method to the ACTG 175 data, as shown in Table 4. Again, to obtain a more efficient estimator, we incorporated prognostic covariates into the model, including baseline covariates (CD4, CD8, AGE, WEIGHT, DRUG, KARN, Z30, SYMP, PREANTI) and post-treatment covariates (CD420, CD820, CD496, MisCD4, OFFTRT). The results seem to support that using auxiliary covariates in the model leads to more efficient estimators than the estimator that does not consider auxiliary covariates in the model.
5 Application to two-sample treatment comparison
In this section, we discuss another simple yet commonly occurring case where the time-lagged response is to be observed after a fixed time period; that is, the lag time is the same for every response to be observed. The primary objective here is to evaluate the population difference in mean responses between two treatment groups. Specifically, the treatment effect parameter is β = μ1 − μ0, where μk denotes the population mean of the response Y for treatment group k; i.e., μk = E(Y|Z = k), k = 0, 1.
5.1 Class of all semiparametric estimators for β
Applying Theorem 1, it is straightforward to derive that the influence functions for the RAL estimators for μk are
(29) |
for arbitrary functions f(X1) and L{u, Z, X(u)}, where the full data influence function for estimators of μk (Davidian et al 2005) is given by
Again, if we use the estimated functions and , assuming the stratified proportional hazards regression model, the class of all RAL estimators for μk can be represented as a solution to the following estimating equation
(30) |
for arbitrary functions f(X1) and L{u, Z, X(u)}. Consequently, the class of the RAL estimators for β, the difference in mean response Y between treatment groups (Tsiatis et al 2008) is and the corresponding influence function for is ϕβ (D; f, L) = ϕμ1(D; f, L) − ϕμ0(D; f, L).
From the results of Lu and Tsiatis (2008), we know that the efficient influence function for μk is , where
and
Considering that the underlying true models for E(Y|Z = k, X1) and E{Y|T ≥ u, Z = k, X(u)} are unknown in practice, we posit parametric models that is linear in ak and that is linear in bk, where ak and bk are rak-dimensional and rbk-dimensional vectors of unknown parameters, respectively, qk(·) is an rak-dimensional vector of functions of X1 and sk(·) is an rbk-dimensional vector of functions of {u, Z, X(u)}. The parameters ak and bk are estimated by the commonly used ordinary least squares method. Once we have the estimated functions and , the corresponding estimated optimal functions are easily obtained by
and
Now, let us consider the following three estimators , j = 1, 2, 3, for β, where is obtained by solving ∑ϕμk (Di; 0, 0) = 0 without considering auxiliary covariates which results in
is obtained by solving which only considers baseline covariates; that is
and is obtained by solving which considers all covariates; that is,
The variance of , j = 1, 2, 3, is estimated using the standard sandwich estimator based on the influence function of . According to our theory, the estimators and using auxiliary covariates should be more efficient than without using auxiliary covariates and the variance of should be decreasing as j increases from 1 to 3.
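To make the idea behind the estimator without auxiliary covariates concrete, the sketch below computes a generic inverse-probability-of-censoring-weighted (IPCW) difference in means. It solves ∑ Δi (Yi − μk)/K̂i = 0 within each treatment group, which gives a weighted mean of the observed responses. This is an illustration under stated assumptions rather than the displayed formula of this section: the estimated probability `k_hat` of remaining uncensored through the ascertainment time is taken as an input (in the paper it comes from a stratified proportional hazards model for censoring), and the function names are hypothetical.

```python
import numpy as np


def ipcw_mean(y, observed, k_hat):
    """Weighted mean of observed responses with weights 1 / k_hat.

    y        : responses (may be NaN where the response is censored)
    observed : 1 if the response was ascertained, 0 if censored
    k_hat    : assumed estimate of P(uncensored through ascertainment)
    """
    y = np.asarray(y, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    k_hat = np.asarray(k_hat, dtype=float)
    weights = np.zeros_like(k_hat)
    weights[observed] = 1.0 / k_hat[observed]
    y_filled = np.where(observed, y, 0.0)  # censored responses get weight 0
    return float(np.sum(weights * y_filled) / np.sum(weights))


def ipcw_mean_difference(y, observed, k_hat, z):
    """IPCW estimate of mu_1 - mu_0 for a binary treatment indicator z."""
    y, observed, k_hat, z = (np.asarray(a) for a in (y, observed, k_hat, z))
    mu1 = ipcw_mean(y[z == 1], observed[z == 1], k_hat[z == 1])
    mu0 = ipcw_mean(y[z == 0], observed[z == 0], k_hat[z == 0])
    return mu1 - mu0


if __name__ == "__main__":
    # Toy data: the third subject is censored, so the second subject,
    # who had only a 50% chance of being observed, is weighted by 2.
    y = [10.0, 20.0, np.nan, 30.0, 5.0, 15.0]
    observed = [1, 1, 0, 1, 1, 1]
    k_hat = [1.0, 0.5, 0.5, 1.0, 1.0, 1.0]
    z = [1, 1, 1, 1, 0, 0]
    print(ipcw_mean_difference(y, observed, k_hat, z))
```

The weighting is what removes the bias of a complete-case comparison when censoring depends on covariates: subjects who were unlikely to remain uncensored stand in for similar subjects whose responses were lost.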
5.2 Simulation
This simulation study was performed to compare our three proposed estimators with the commonly used t-test. The simulated data were generated from models fitted to the ACTG 175 data. For each simulated data set, we generated for each of n subjects the continuous baseline covariates logCD40, logCD80, AGE, WEIGHT, KARN and PREANTI from a multivariate normal distribution with the empirical mean and covariance matrix of these variables in the data. We then generated the binary indicators HOMO, RACE, DRUG, and STRAT for each subject from independent Bernoulli distributions using the observed data proportions for each variable. The treatment indicator was generated independently from Bernoulli(π) for each subject. The CD4 cell counts at 8, 20, 32, 44, 56, 68, 80, 92, 96 weeks for each subject were generated using the treatment-specific mixed model log{CD4(t)} = α0k + α1kt + γX1k + ∊k, where k is treatment group ZDV monotherapy or ZDV + ddI, α0k, α1k and ∊k are multivariate normal random effects with means and covariance matrix taken from the fit of this mixed model to the data, and the set of baseline covariates X1k in the treatment-specific mixed model consists of (logCD40, logCD80, STRAT) for ZDV monotherapy and (logCD40, AGE, KARN, PREANTI) for ZDV + ddI. The censoring time for each subject was generated using the treatment-specific hazard rate λC{u|T ≥ u, Z, X(u)} = λ0k exp{ζ1kCD4k(u)+ζ2kX1Ck}, where CD4k(u) records the last observed CD4 cell count at or before time u in treatment group k, and the sets of baseline covariates taken for each treatment group in this proportional hazards model are (WEIGHT, DRUG, RACE) for ZDV monotherapy and (WEIGHT, KARN) for ZDV + ddI. The coefficients (ζ1k, ζ2k) in the proportional hazards model were taken to be the values obtained by fitting this model to the ACTG 175 data. If the censoring time is greater than t, t = 8, 12, …, 96, then the CD4 cell counts at t + 1, …, 96 weeks are set to missing.
The true value of β is 80.13 and the randomization probability to each of the two treatment groups is π = 0.5. We considered the sample sizes n = 250, 600 and 1054, the last being the actual sample size of the ACTG 175 data for treatments ZDV monotherapy and ZDV + ddI.
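Because the censoring hazard above depends on the last observed CD4 count, it is piecewise constant between visits, and a censoring time can be drawn by inverting the cumulative hazard against a unit-exponential variate. The sketch below illustrates that step and the subsequent masking of later measurements. It is an illustration only: all parameter values are hypothetical rather than the fitted ACTG 175 values, `baseline_term` stands for the product ζ2kX1Ck, and measurements at visits after the censoring time are the ones treated as missing.

```python
import numpy as np


def draw_censoring_time(visit_weeks, cd4_at_visits, baseline_term,
                        lambda0, zeta1, rng):
    """Draw one censoring time from a hazard that is constant between visits.

    On [visit_weeks[j], visit_weeks[j + 1]) the hazard is
    lambda0 * exp(zeta1 * cd4_at_visits[j] + baseline_term).
    Returns inf if no censoring occurs before the last visit.
    """
    weeks = np.asarray(visit_weeks, dtype=float)
    cd4 = np.asarray(cd4_at_visits, dtype=float)
    hazards = lambda0 * np.exp(zeta1 * cd4 + baseline_term)
    target = rng.exponential(1.0)  # threshold for the cumulative hazard
    cumulative = 0.0
    for j in range(len(weeks) - 1):
        step = hazards[j] * (weeks[j + 1] - weeks[j])
        if cumulative + step >= target:
            # The cumulative hazard crosses the threshold in this interval.
            return float(weeks[j] + (target - cumulative) / hazards[j])
        cumulative += step
    return float("inf")


def mask_after_censoring(visit_weeks, cd4_at_visits, censor_time):
    """Set measurements at visits after the censoring time to NaN."""
    weeks = np.asarray(visit_weeks, dtype=float)
    cd4 = np.asarray(cd4_at_visits, dtype=float).copy()
    cd4[weeks > censor_time] = np.nan
    return cd4


if __name__ == "__main__":
    rng = np.random.default_rng(2024)
    weeks = [0, 8, 20, 32, 44, 56, 68, 80, 92, 96]
    cd4 = [350, 340, 360, 330, 320, 335, 310, 300, 305, 290]  # hypothetical
    c = draw_censoring_time(weeks, cd4, baseline_term=0.1,
                            lambda0=0.002, zeta1=-0.001, rng=rng)
    print(c, mask_after_censoring(weeks, cd4, c))
```

Inverting the cumulative hazard interval by interval avoids discretizing time and works for any covariate path that is constant between visits.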
Similarly, we also performed a simulation study for the comparisons of treatments ZDV + zalcitabine and ddI monotherapy with treatment ZDV monotherapy, respectively. To generate the CD4 cell counts at 8, 20, 32, 44, 56, 68, 80, 92, 96 weeks, we chose the baseline covariates (logCD40, logCD80, PREANTI, STRAT) for ZDV + zalcitabine; (logCD40, logCD80, WEIGHT, KARN) for ddI monotherapy in the treatment specific mixed model. The baseline covariates used in the treatment specific proportional hazard model to generate censoring time were taken to be (HOMO, KARN, PREANTI) for ZDV + zalcitabine and (HOMO, KARN, CD40) for ddI monotherapy. The true values of β for comparing treatments ZDV + zalcitabine and ddI monotherapy with treatment ZDV monotherapy are 63.57 and 46.33, respectively.
Tables 5, 6 and 7 give the simulation results for comparing treatment groups ZDV + ddI, ZDV + zalcitabine and ddI monotherapy with treatment ZDV monotherapy, respectively. The results of fitting the models to the ACTG 175 data are shown in Table 8. Clearly, under informative censoring, the standard t-test method is severely biased, while the biases of our three proposed estimators are negligible. In addition, the two estimators using auxiliary covariates seem to have smaller estimated variances than the estimator that does not use auxiliary covariates, which is consistent with our theoretical results mentioned earlier.
Table 5.
Statistics | n | t-test | Proposed, j = 1 | Proposed, j = 2 | Proposed, j = 3 |
---|---|---|---|---|---|
Bias | 250 | −21.050 | −5.953 | −5.564 | −3.871 |
Bias | 600 | −21.120 | −5.258 | −4.980 | −3.533 |
Bias | 1054 | −21.310 | −5.267 | −5.027 | −3.664 |
AveSE | 250 | 62.645 (0.71) | 52.961 (1.00) | 51.637 (1.05) | 47.126 (1.26) |
AveSE | 600 | 40.858 (0.70) | 34.246 (1.00) | 33.572 (1.04) | 30.815 (1.24) |
AveSE | 1054 | 30.949 (0.70) | 25.859 (1.00) | 25.395 (1.04) | 23.357 (1.23) |
MCSE | 250 | 64.104 (0.60) | 49.756 (1.00) | 50.074 (0.99) | 49.070 (1.03) |
MCSE | 600 | 41.377 (0.59) | 31.891 (1.00) | 31.490 (1.03) | 30.952 (1.06) |
MCSE | 1054 | 31.492 (0.59) | 24.150 (1.00) | 23.866 (1.02) | 23.530 (1.05) |
95% CI | 250 | 94.3 | 96.9 | 96.2 | 94.9 |
95% CI | 600 | 91.9 | 96.4 | 96.1 | 94.9 |
95% CI | 1054 | 89.2 | 95.7 | 96.0 | 94.5 |
Table 6.
Statistics | n | t-test | Proposed, j = 1 | Proposed, j = 2 | Proposed, j = 3 |
---|---|---|---|---|---|
Bias | 250 | −19.130 | −2.664 | −2.323 | −1.478 |
Bias | 600 | −18.980 | −2.213 | −1.871 | −1.075 |
Bias | 1056 | −19.040 | −2.308 | −2.105 | −1.320 |
AveSE | 250 | 57.639 (0.71) | 48.548 (1.00) | 47.068 (1.06) | 43.102 (1.27) |
AveSE | 600 | 37.648 (0.70) | 31.417 (1.00) | 30.595 (1.05) | 28.172 (1.24) |
AveSE | 1056 | 28.497 (0.69) | 23.715 (1.00) | 23.132 (1.05) | 21.337 (1.24) |
MCSE | 250 | 58.415 (0.62) | 45.859 (1.00) | 45.678 (1.01) | 44.740 (1.05) |
MCSE | 600 | 37.895 (0.61) | 29.479 (1.00) | 28.753 (1.05) | 28.170 (1.10) |
MCSE | 1056 | 28.685 (0.61) | 22.415 (1.00) | 21.794 (1.06) | 21.421 (1.09) |
95% CI | 250 | 95.4 | 96.7 | 96.2 | 94.4 |
95% CI | 600 | 93.4 | 96.9 | 96.5 | 94.7 |
95% CI | 1056 | 91.6 | 96.5 | 96.3 | 95.3 |
Table 7.
Statistics | n | t-test | Proposed, j = 1 | Proposed, j = 2 | Proposed, j = 3 |
---|---|---|---|---|---|
Bias | 250 | −10.970 | −1.748 | −1.325 | 0.081 |
Bias | 600 | −11.040 | −1.893 | −1.622 | −0.333 |
Bias | 1093 | −10.970 | −2.478 | −2.338 | −1.034 |
AveSE | 250 | 60.028 (0.69) | 49.787 (1.00) | 48.521 (1.05) | 44.447 (1.25) |
AveSE | 600 | 39.159 (0.67) | 32.165 (1.00) | 31.493 (1.04) | 29.015 (1.23) |
AveSE | 1093 | 29.126 (0.67) | 23.826 (1.00) | 23.368 (1.04) | 21.581 (1.22) |
MCSE | 250 | 61.090 (0.59) | 46.897 (1.00) | 46.771 (1.01) | 45.881 (1.04) |
MCSE | 600 | 39.473 (0.58) | 29.981 (1.00) | 29.444 (1.04) | 28.958 (1.07) |
MCSE | 1093 | 29.254 (0.58) | 22.233 (1.00) | 21.714 (1.05) | 21.408 (1.08) |
95% CI | 250 | 95.4 | 96.9 | 96.5 | 95.0 |
95% CI | 600 | 94.7 | 96.8 | 96.6 | 95.0 |
95% CI | 1093 | 93.7 | 96.7 | 96.9 | 95.5 |
Table 8.
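Comparison | Estimator | Estimates | Standard Errors | RE |
---|---|---|---|---|
ZDV monotherapy and ZDV + ddI | t-test | 53.485 | 13.216 | 1.05 |
ZDV monotherapy and ZDV + ddI | Proposed, j = 1 | 66.092 | 13.532 | 1.00 |
ZDV monotherapy and ZDV + ddI | Proposed, j = 2 | 70.219 | 12.242 | 1.22 |
ZDV monotherapy and ZDV + ddI | Proposed, j = 3 | 69.850 | 10.426 | 1.68 |
ZDV monotherapy and ZDV + zalcitabine | t-test | 68.566 | 13.286 | 1.03 |
ZDV monotherapy and ZDV + zalcitabine | Proposed, j = 1 | 76.215 | 13.483 | 1.00 |
ZDV monotherapy and ZDV + zalcitabine | Proposed, j = 2 | 77.115 | 12.143 | 1.23 |
ZDV monotherapy and ZDV + zalcitabine | Proposed, j = 3 | 74.750 | 10.308 | 1.71 |
ZDV monotherapy and ddI monotherapy | t-test | 41.881 | 13.294 | 1.04 |
ZDV monotherapy and ddI monotherapy | Proposed, j = 1 | 48.367 | 13.537 | 1.00 |
ZDV monotherapy and ddI monotherapy | Proposed, j = 2 | 53.300 | 12.065 | 1.26 |
ZDV monotherapy and ddI monotherapy | Proposed, j = 3 | 53.852 | 9.857 | 1.89 |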
6 Conclusion
Our interest in this paper is estimating the unconditional treatment effect β as defined in (1), where the primary outcome is the time-lagged response whose values are often missing in practice due to censoring of some patients. Under the assumption of censoring at random (CAR) given in (4) and the assumption of independence between baseline covariates and treatment assignments given in (3), we developed a class of regular and asymptotically linear (RAL) estimators for β by using the theory of semiparametrics and the major results of Robins and Rotnitzky (1992). The prognostic covariates were utilized by the model to improve the efficiency of the estimators.
We have discussed in detail the applications of our method to two special cases of time-lagged responses. Nevertheless, our proposed method has wider applicability to more general time-lagged response problems. In the first case, the time-lagged response itself is the survival time and the parameter of interest is the treatment-specific log hazard ratio. A class of RAL estimators for β characterized by (20) was derived that takes advantage of auxiliary covariates to gain efficiency while allowing us to weaken the usual assumption of noninformative censoring to the more reasonable assumption of censoring at random. In the second case, we are interested in estimating the population difference in mean response between two treatment groups where the time-lagged response is observed after a fixed period of time; i.e., the lag time is the same for each individual in the study. A class of RAL estimators was derived and compared with the commonly used standard method, the t-test.
Deriving the RAL estimators for β requires the correct model for the censoring time, which is impossible to know in practice. We therefore proposed a stratified proportional hazards regression model to estimate the survival function of the censoring time. As we demonstrated numerically in the above two applications, even if the underlying true model for the censoring time did not follow the proportional hazards model, our proposed estimators had much less bias than the standard estimation methods. As expected, when the censoring time truly followed a stratified proportional hazards model, the standard estimation techniques were biased while our proposed estimators were unbiased. In addition, both the results of the simulation study and the analysis of the data from ACTG 175 seemed to indicate that the estimators using prognostic covariates were more efficient than those without covariates in the model.
Acknowledgments
This work was supported in part by NIH grants P01 CA142538, R37 AI031789, and R01 CA051962.
Appendix A
Proof of Theorem 1
Using the equality
the influence function (7) can be written as
Since L(·) is an arbitrary function of , we have the class of all observed data influence functions as described in Theorem 1.
APPENDIX B. Proof of corollary 1
We first prove the following Lemmas.
Lemma 1 For any function b(u, Z), we have the following equality
where dMT (u, Z) = I(T = u) − λ0(u) exp(β0Z)I(T ≥ u)du.
Proof: Since E{I(T = v)|T ≥ u, Z} = I(v ≥ u) exp(β0Z)λ0(v)dv P(T ≥ v, Z)/P(T ≥ u, Z) and E{I(T ≥ v)|T ≥ u, Z} = I(v < u) + I(v ≥ u)P(T ≥ v, Z)/P(T ≥ u, Z), we have that
This implies
The last equality holds because
Hence, we have
Applying ψY Z in (15) to Theorem 1 and using Lemma 1, we can write the class of all influence functions as
where
and are defined in (6) and (9), respectively.
Now for any function W(u, Z), if we let the function a(u) = W(u, 1) − {W(u, 1) − W(u, 0)} Z*(u; β0), where Z*(u; β) is given in (18), after some simple algebra, we can easily get that . Recall that the proportionality constant CW, according to the theory of semiparametrics (Tsiatis 2006, chap. 4), is determined by the expectation of the partial derivative of
with respect to β evaluated at the true value β0, that is, CW = [E{∂eW(u, Z)(D; β0)/∂β}]^{−1}. After some algebra, we find that the proportionality constant CW is the same as that in (19) and hence, the space ε1 is the same as ε as described in (17).
APPENDIX C. Proof of theorem 2
Using standard results for the proportional hazards model (e.g. van der Laan and Robins 2003; Tsiatis 2006), the nuisance tangent space associated with the nuisance baseline hazard function λ0C(·, Z) is given by
(31) |
and the nuisance tangent space associated with the nuisance parameters α = (α11, …, α22) is given by
(32) |
Finding the projection in (25) is equivalent to finding the function a0(u, Z) such that is perpendicular to the nuisance tangent space Λλ(·, Z), that is
(33) |
for all functions a(u, Z). Using the result of Lemma 1, the influence function can be written as
where ψF(T, Z) is defined in (26). Following the standard results of Fleming and Harrington (1991) for the covariance of two martingale processes, (33) is equivalent to
for all functions a(u, Z). After a little algebra, we have that
Therefore, is identical to (25).
Similarly, obtaining the projection Π(ψ(D)|Λα) is equivalent to finding the function B0 such that is perpendicular to the nuisance tangent space , where
(34) |
After some algebra, we have that
and
(35) |
Contributor Information
Xiaomin Lu, Department of Biostatistics, College of Medicine and College of Public Health and Health Professions, University of Florida, Gainesville, FL, 32611.
Anastasios A. Tsiatis, Department of Statistics, North Carolina State University, Raleigh, North Carolina 27695-8203
References
- Andersen PK, Borgan O, Gill RD, Keiding N. Statistical models based on counting processes. Springer; New York: 1993.
- Anstrom KJ, Tsiatis AA. Using propensity scores to estimate causal treatment effects with censored time-lagged data. Biometrics. 2001;57:1207–1218. doi: 10.1111/j.0006-341x.2001.01207.x.
- Breslow N. Covariance analysis of censored survival data. Biometrics. 1974;30:89–99.
- Cox DR. Regression models and life tables. Journal of the Royal Statistical Society B. 1972;34:187–220.
- Cox DR. Partial likelihood. Biometrika. 1975;62:269–275.
- Davidian M, Tsiatis AA, Leon S. Semiparametric estimation of treatment effect in a pretest-posttest study with missing data (with discussion). Statistical Science. 2005;20:261–301. doi: 10.1214/088342305000000151.
- Fleming TR, Harrington DP. Counting Processes and Survival Analysis. John Wiley and Sons; New York, NY, USA: 1991.
- Hammer SM, Katzenstein DA, Hughes MD, Gundaker H, Schooley RT, Haubrich RH, Henry WK, Lederman MM, Phair JP, Niu M, Hirsch MS, Merigan TC. A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. New England Journal of Medicine. 1996;335:1081–1089. doi: 10.1056/NEJM199610103351501.
- Huang Y, Louis TA. Nonparametric estimation of the joint distribution of survival time and mark variables. Biometrika. 1998;85:785–798.
- Hubbard AE, van der Laan MJ, Robins JM. Nonparametric locally efficient estimation of the treatment specific survival distribution with right censored data and covariates in observational studies. In: Halloran E, Berry D, editors. Statistical Models in Epidemiology: The Environment and Clinical Trials. Springer-Verlag; NY: 1999. pp. 135–178.
- van der Laan MJ, Hubbard AE. Locally efficient estimation of the quality-adjusted lifetime distribution with right-censored data and covariates. Biometrics. 1999;55:530–536. doi: 10.1111/j.0006-341x.1999.00530.x.
- van der Laan MJ, Robins JM. Unified Methods for Censored Longitudinal Data and Causality. Springer; Berlin: 2003.
- Lu X, Tsiatis AA. Improving the efficiency of the logrank test using auxiliary covariates. Biometrika. 2008 doi: ASN003.
- Mantel N. Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemotherapy Report. 1966;50:163–170.
- Moore KL, van der Laan MJ. Application of time-to-event methods in assessment of safety in clinical trials. In: Peace KE, editor. Design, Summarization, Analysis and Interpretation of Clinical Trials with Time-to-Event Endpoints. Chapman and Hall; 2009.
- Newey WK. Semiparametric efficiency bounds. Journal of Applied Econometrics. 1990;5:99–135.
- Peto R, Peto J. Asymptotically efficient rank invariant procedures. Journal of the Royal Statistical Society A. 1972;135:185–207.
- Robins JM, Rotnitzky A. Recovery of information and adjustment for dependent censoring using surrogate markers. In: Jewell N, Dietz K, Farewell V, editors. AIDS Epidemiology-Methodological Issues. Birkhauser; 1992. pp. 297–331.
- Rubin DB. Multiple imputation for nonresponse in surveys. Wiley; New York, NY, USA: 1987.
- Tsiatis A. Semiparametric Theory and Missing Data. Springer; New York, NY, USA: 2006.
- Tsiatis AA, Davidian M, Zhang M, Lu X. Covariate adjustment for two-sample treatment comparisons in randomized clinical trials: A principled yet flexible approach. Statistics in Medicine. 2008 doi: 10.1002/sim.3113.
- Wahed AS, Tsiatis AA. Semiparametric efficient estimation of survival distributions in two-stage randomisation designs in clinical trials with censored data. Biometrika. 2006;93:163–177.
- Zhang M, Tsiatis AA, Davidian M. Improving efficiency of inferences in randomized clinical trials using auxiliary covariates. Biometrics. 2008 doi: 10.1111/j.1541-0420.2007.00976.x.
- Zhao H, Tsiatis AA. Efficient estimation of the distribution of quality-adjusted survival time. Biometrics. 1999;55:1101–1107. doi: 10.1111/j.0006-341x.1999.01101.x.