Author manuscript; available in PMC: 2009 Oct 15.
Published in final edited form as: Ann Stat. 2009 Feb 1;37(1):394–426. doi: 10.1214/07-AOS554

PROPORTIONAL HAZARDS MODELS WITH CONTINUOUS MARKS

Yanqing Sun 1, Peter B Gilbert 1, Ian W McKeague 1
PMCID: PMC2762218  NIHMSID: NIHMS93879  PMID: 19838313

Abstract

For time-to-event data with finitely many competing risks, the proportional hazards model has been a popular tool for relating the cause-specific outcomes to covariates [Prentice et al. Biometrics 34 (1978) 541–554]. This article studies an extension of this approach to allow a continuum of competing risks, in which the cause of failure is replaced by a continuous mark only observed at the failure time. We develop inference for the proportional hazards model in which the regression parameters depend nonparametrically on the mark and the baseline hazard depends nonparametrically on both time and mark. This work is motivated by the need to assess HIV vaccine efficacy, while taking into account the genetic divergence of infecting HIV viruses in trial participants from the HIV strain that is contained in the vaccine, and adjusting for covariate effects. Mark-specific vaccine efficacy is expressed in terms of one of the regression functions in the mark-specific proportional hazards model. The new approach is evaluated in simulations and applied to the first HIV vaccine efficacy trial.

Key words and phrases: Competing risks, distribution-free confidence bands and tests, failure time data, genetic data, HIV vaccine trial, pointwise and simultaneous confidence bands, semiparametric model, survival analysis

1. Introduction

It has been 30 years since Prentice et al. [15] introduced a Cox regression framework for the analysis of failure time data in the presence of finitely many competing risks. Yet many important applications of competing risks methodology involve continuous causes-of-failure (marks). In HIV vaccine trials, for example, genetic divergence of infecting HIV viruses from the HIV strain represented in the vaccine needs to be taken into account to properly assess vaccine efficacy, but the mark variable is essentially continuous because of the large number of mutations involved. Other examples of continuous mark variables include lifetime medical cost or a quality of life score associated with survival time [14]. The grouping of continuous mark data into discrete marks is unsatisfactory because that amounts to a coarsening of the data and the results will depend on the way the groups are defined. To address this problem, we develop inference for a proportional hazards model in which both the regression parameters and the baseline hazard function depend nonparametrically on a continuous mark.

The paper is motivated by the need for new methods to analyze data from HIV vaccine efficacy trials. Approximately 15,000 new HIV infections occur each day [21], making development of a protective HIV vaccine a top priority for biomedical science. In efficacy trials thousands of HIV-negative volunteers are randomized to receive vaccine or placebo, and are monitored for HIV infection. Four efficacy trials are ongoing (http://www.iavi.org). A primary objective of each trial is to assess vaccine efficacy (VE) to prevent infection, where typically VE is defined as one minus the hazard ratio (vaccine/placebo) of HIV infection. One of the greatest barriers to achieving an efficacious vaccine is the extreme genetic heterogeneity of HIV [12, 7]. Although it may be possible to develop a vaccine that protects against HIV strains genetically similar to the HIV virus or viruses represented in the vaccine, it may be quite difficult to develop one to protect against HIV strains dissimilar from the vaccine material. This phenomenon is well known for flu vaccines—moderate genetic mismatch between an exposing flu virus and the virus represented in the vaccine causes vaccine failure, which has necessitated development of a new vaccine each year that is closely matched to the contemporary circulating flu strains. The genetic divergence (or distance) between two aligned HIV sequences can be measured as the weighted percent mismatch of amino acids, and since this distance may be unique for all infected subjects, it is natural to consider it as a continuous mark variable. The formidable problem of HIV genetic diversity implies that an important objective of an efficacy trial is assessment of if and how VE depends on the genetic divergence.

This problem can be addressed in terms of the conditional mark-specific hazard function, defined as

$$\lambda(t,\upsilon\mid z)=\lim_{h_1,h_2\downarrow 0}P\{T\in[t,t+h_1),\ V\in[\upsilon,\upsilon+h_2)\mid T\ge t,\ Z(t)=z\}/(h_1h_2), \tag{1}$$

where T is the failure (infection diagnosis) time, V is a continuous mark variable and Z(t) is a (possibly time-dependent) p-dimensional covariate. Huang and Louis [8] developed the nonparametric maximum likelihood estimator of the joint distribution of T and V in terms of the unconditional mark-specific hazard function. Gilbert, McKeague and Sun [6] defined mark-specific vaccine efficacy as VE(t, υ) = 1 − λ(t, υ|1)/λ(t, υ|0), with z being the indicator of membership in the vaccine group; they developed several nonparametric and semiparametric tests concerning VE(t, υ).

In this article, we develop the mark-specific proportional hazards (PH) model

$$\lambda(t,\upsilon\mid z(t))=\lambda_0(t,\upsilon)\exp\{\beta(\upsilon)^Tz(t)\}, \tag{2}$$

where the baseline hazard function λ0(·, υ) and the p-dimensional regression parameter β(υ) are unknown continuous functions of υ. As far as we know, this model has never been studied in the literature, even though it is closely related to the discrete cause-of-failure models discussed by Prentice et al. [15]. The approach in the continuous case departs from the discrete case in that it is necessary to “borrow strength” from data in a neighborhood of υ, with the data closest to υ contributing the most.

For the HIV vaccine trial application, we partition the covariate as $z(t)=(z_1,z_2(t))^T$, where z1 is the treatment (vaccine) group indicator and z2(t) is a vector of possibly time-dependent covariates. Then the vaccine efficacy defined above takes the simpler form VE(υ) = 1 − exp(β1(υ)), without any dependence on t. By assuming proportional hazards, model (2) can provide more powerful tests of mark-specific vaccine efficacy than the nonparametric procedures of Gilbert, McKeague and Sun [6], and the model allows adjustment for covariate effects. Furthermore, ignoring the mark variable and studying vaccine efficacy using the standard Cox model, as is widely practiced in vaccine trials for many infectious diseases, can give misleading results. In fact, even in the case of model (2) with z as the treatment indicator, the ordinary (marginal) Cox model will be misspecified unless the baseline λ0(t, υ) factors into separate functions of t and υ.

Indeed, consider the model λ(t, υ|z = 0) = γ0/2 + γ1tυ and λ(t, υ|z = 1) = γ0υ + γ1tυ², for t ≥ 0, 0 ≤ υ ≤ 1, z ∈ {0, 1}. The corresponding marginal hazard functions are λ(t|z = 0) = γ0/2 + γ1t/2 and λ(t|z = 1) = γ0/2 + γ1t/3, for t ≥ 0. It is clear that λ(t|z) is not a proportional hazards model unless γ0 or γ1 is zero. If γ1 = 0, the resulting marginal hazards become proportional for z = 0 and z = 1. However, in this example, the marginal vaccine efficacy VE = 1 − λ(t|z = 1)/λ(t|z = 0) = 0 while the mark-specific vaccine efficacy is VE(υ) = 1 − 2υ. The ordinary Cox model averages the mark-specific vaccine efficacy over its range, and important vaccine effects may be missed. This issue will be further illustrated in our simulation study. In general, use of the ordinary Cox model for studying hazard ratios can be misleading if an important mark variable is ignored. The mark-specific PH model offers a way to correct for that deficiency.
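The marginal hazards in this example come from integrating the mark-specific hazards over υ ∈ [0, 1]. The following Python sketch checks the calculation numerically with a midpoint rule. The function names and the particular values of γ0, γ1 and t are illustrative assumptions, not part of the original analysis.

```python
def integrate_over_mark(f, n=10_000):
    """Midpoint-rule integral of f(v) over the mark range [0, 1]."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


def mark_hazard(t, v, z, g0, g1):
    """Mark-specific hazard of the example (z = 0 placebo, z = 1 vaccine)."""
    if z == 0:
        return g0 / 2 + g1 * t * v
    return g0 * v + g1 * t * v ** 2


def marginal_hazard(t, z, g0, g1):
    """Marginal hazard: the mark-specific hazard integrated over the mark."""
    return integrate_over_mark(lambda v: mark_hazard(t, v, z, g0, g1))


# Illustrative values (assumption): g1 = 0 makes the marginal hazards proportional.
g0, g1, t = 1.0, 0.0, 2.0
marginal_ve = 1 - marginal_hazard(t, 1, g0, g1) / marginal_hazard(t, 0, g0, g1)
mark_ve = [1 - mark_hazard(t, v, 1, g0, g1) / mark_hazard(t, v, 0, g0, g1)
           for v in (0.0, 0.5, 1.0)]
print("marginal VE:", marginal_ve)
print("mark-specific VE at v = 0, 0.5, 1:", mark_ve)
```

With γ1 = 0 the marginal efficacy is numerically zero even though the mark-specific efficacy ranges from 1 down to −1, which is the averaging effect described in the text.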

We also consider a cumulative vaccine efficacy estimand defined as $CV(\upsilon)=\int_a^{\upsilon}VE(u)\,du$, where a > 0. We develop distribution-free uniform confidence bands for CV(υ), which are useful for inferential purposes. In addition we derive test statistics for evaluating mark-specific vaccine efficacy based on the estimator of CV(υ).

The paper is organized as follows. Section 2 develops a local partial likelihood procedure for estimating β(υ), leading to the construction of pointwise confidence intervals and formal tests for various hypotheses of interest concerning vaccine efficacy. A simulation study evaluating the performance of the proposed tests and the pointwise and simultaneous confidence intervals for VE(υ) and CV(υ) is presented in Section 3. The proposed methods are applied to analyze the data from the first HIV vaccine efficacy trial in Section 4. We discuss some general aspects of mark-specific PH models in Section 5. Proofs of the main results are placed in the Appendix.

2. Mark-specific proportional hazards model

2.1. Local partial likelihood

We begin by stating some assumptions and notation that are used throughout the paper. The mark variable V is assumed to have a known and bounded support; rescaling V if necessary, this support is taken without loss of generality to be [0, 1]. The observations (Xi, δi, δi Vi, Zi), i = 1,…,n, are assumed to be i.i.d. replicates of (X, δ, δV, Z), where X is the right-censored failure time corresponding to T, which satisfies the model (2), and δ is the indicator of non-censorship. The mark is assumed to be observed whenever the corresponding failure time is uncensored; when δi = 0, Vi is undefined and is not meaningful. The censoring time is assumed to be conditionally independent of (T, V) given Z.

We consider a localized version of the log partial likelihood function for β = β(υ) at a fixed υ:

$$l(\upsilon,\beta)=\sum_{i=1}^{n}\int_0^1\int_0^{\tau}K_h(u-\upsilon)\Big[\beta^TZ_i(t)-\log\Big(\sum_{j=1}^{n}Y_j(t)e^{\beta^TZ_j(t)}\Big)\Big]N_i(dt,du), \tag{3}$$

where Kh(x) = K(x/h)/h, K(·) is a kernel function with support [−1, 1], τ is the end of the follow-up period and h = hn is a bandwidth. Here Yi(t) = I(Xi ≥ t) and Ni(t, υ) = I(Xi ≤ t, δi = 1, Vi ≤ υ) is the marked point counting process with a jump at an uncensored failure time Xi and the associated mark Vi. For background on marked point processes see Brémaud [2] and Martinussen and Scheike [11].

The log partial likelihood function (3) resembles that of Kalbfleisch and Prentice [9] in the case of discrete marks, except that it borrows strength from observations having marks in the neighborhood of υ. The kernel function is designed to give greater weight to observations with marks near υ than those further away. The local maximum partial likelihood estimator of β(υ) is a maximizer β̂(υ) of (3). A similar approach has been studied by Cai and Sun [3] for estimating time-dependent coefficients in Cox regression models.

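Denote $\mu_j=\int u^jK(u)\,du$ and $\nu_j=\int u^jK^2(u)\,du$ for j = 0, 1, 2. For $\beta\in\mathbb{R}^p$, t ≥ 0, let

$$S^{(j)}(t,\beta)=n^{-1}\sum_{i=1}^{n}Y_i(t)\exp\{\beta^TZ_i(t)\}Z_i(t)^{\otimes j},$$

where for any $z\in\mathbb{R}^p$, we denote $z^{\otimes 0}=1$, $z^{\otimes 1}=z$ and $z^{\otimes 2}=zz^T$. Define $s^{(j)}(t,\beta)=ES^{(j)}(t,\beta)$ and

$$J_n(t,\beta)=\frac{S^{(2)}(t,\beta)}{S^{(0)}(t,\beta)}-\Big(\frac{S^{(1)}(t,\beta)}{S^{(0)}(t,\beta)}\Big)^{\otimes 2},\qquad J(t,\beta)=\frac{s^{(2)}(t,\beta)}{s^{(0)}(t,\beta)}-\Big(\frac{s^{(1)}(t,\beta)}{s^{(0)}(t,\beta)}\Big)^{\otimes 2}.$$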

Taking the derivative of l(υ, β) with respect to β gives the score function

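$$U(\upsilon,\beta)=l'_\beta(\upsilon,\beta)=\sum_{i=1}^{n}\int_0^1\int_0^{\tau}K_h(u-\upsilon)\Big[Z_i(t)-\frac{S^{(1)}(t,\beta)}{S^{(0)}(t,\beta)}\Big]N_i(dt,du). \tag{4}$$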

The maximum partial likelihood estimator is a solution to U(υ, β̂(υ)) = 0, and can be computed using a Newton–Raphson algorithm. The second derivative of l(υ, β) with respect to β yields

$$l''_\beta(\upsilon,\beta)=-\sum_{i=1}^{n}\int_0^1\int_0^{\tau}K_h(u-\upsilon)J_n(t,\beta)N_i(dt,du).$$
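As a rough illustration of the score (4), the second derivative and the Newton-Raphson iteration, the sketch below treats the simplest case of a single time-independent scalar covariate, with no tied failure times. The Epanechnikov kernel, all function names, the toy data and the convergence tolerance are assumptions made for illustration; this is not the authors' implementation.

```python
import math


def epanechnikov(x):
    """Kernel K with support [-1, 1] (an assumed choice for this sketch)."""
    return 0.75 * (1.0 - x * x) if abs(x) < 1.0 else 0.0


def score_and_hessian(v, beta, X, delta, V, Z, h):
    """Local score U(v, beta) and second derivative for a scalar covariate."""
    U, H = 0.0, 0.0
    for i in range(len(X)):
        if delta[i] != 1:            # censored: no mark, no contribution
            continue
        w = epanechnikov((V[i] - v) / h) / h   # K_h(V_i - v)
        if w == 0.0:
            continue
        s0 = s1 = s2 = 0.0
        for j in range(len(X)):      # risk set at time X_i
            if X[j] >= X[i]:
                e = math.exp(beta * Z[j])
                s0 += e
                s1 += e * Z[j]
                s2 += e * Z[j] * Z[j]
        zbar = s1 / s0
        U += w * (Z[i] - zbar)
        H -= w * (s2 / s0 - zbar * zbar)
    return U, H


def local_mple(v, X, delta, V, Z, h, beta0=0.0, tol=1e-10, max_iter=50):
    """Newton-Raphson solution of U(v, beta) = 0 at a fixed mark value v."""
    beta = beta0
    for _ in range(max_iter):
        U, H = score_and_hessian(v, beta, X, delta, V, Z, h)
        if H == 0.0:
            raise ValueError("no failures with marks near v")
        step = U / H
        beta -= step
        if abs(step) < tol:
            break
    return beta


# Toy data (hypothetical): times, censoring indicators, marks, treatment.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
delta = [1, 1, 1, 0, 1, 1]
V = [0.50, 0.45, 0.55, None, 0.50, 0.52]
Z = [1, 0, 1, 0, 0, 1]
beta_hat = local_mple(0.5, X, delta, V, Z, h=0.1)
print("local estimate of beta(0.5):", beta_hat)
```

Because only failures with marks within h of υ receive positive weight, the estimate at each υ uses a subset of the observed failures, which is why the bandwidth matters for precision.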

Although inference on β is usually of primary interest, the baseline function λ0(t, υ) can also be estimated, by smoothing the increments of the following estimator of the doubly cumulative baseline function $\Lambda_0(t,\upsilon)=\int_0^t\int_0^{\upsilon}\lambda_0(s,u)\,ds\,du$:

$$\hat\Lambda_0(t,\upsilon)=\int_0^t\int_0^{\upsilon}\frac{N(ds,du)}{nS^{(0)}(s,\hat\beta(u))}. \tag{5}$$

2.2. Asymptotic results

We make use of the following regularity conditions; not all of these conditions are required for the proof of each theorem, nor are they the minimum required set of conditions.

CONDITION A

  • (A.1)

    β(υ) has componentwise continuous second derivatives on [0, 1]. The second partial derivative of λ0(t, υ) with respect to υ exists and is continuous on [0, τ] × [0, 1]. The covariate process Z(t) has paths that are left-continuous and of bounded variation, and satisfies the moment condition $E[\|Z(t)\|^4\exp(2M\|Z(t)\|)]<\infty$, where M is a constant such that $(\upsilon,\beta(\upsilon))\in[0,1]\times(-M,M)^p$ for all υ and $\|A\|=\max_{k,l}|a_{kl}|$ for a matrix $A=(a_{kl})$.

  • (A.2)

    For j = 0, 1, 2, each component of $s^{(j)}(t,\theta)$ is continuous on $[0,\tau]\times[-M,M]^p$, and $\sup_{t\in[0,\tau],\,\theta\in[-M,M]^p}\|S^{(j)}(t,\theta)-s^{(j)}(t,\theta)\|=O_p(n^{-1/2})$.

  • (A.3)

    $s^{(0)}(t,\theta)>0$ on $[0,\tau]\times[-M,M]^p$ and the matrix $\Sigma(\upsilon)=\int_0^{\tau}J(t,\beta(\upsilon))\lambda_0(t,\upsilon)s^{(0)}(t,\beta(\upsilon))\,dt$ is positive definite.

  • (A.4)

    $E(N_i(dt,d\upsilon)\mid\mathcal{F}_{t-})=E(N_i(dt,d\upsilon)\mid Y_i(t),Z_i(t))$, where $\mathcal{F}_t=\sigma\{I(X_i\le s,\delta_i=1),\ I(X_i\le s,\delta_i=0),\ V_iI(X_i\le s,\delta_i=1),\ Z_i(s);\ 0\le s\le t,\ i=1,\dots,n\}$ is the (right-continuous) filtration generated by $\{N_i(s,\upsilon),Y_i(s),Z_i(s);\ 0\le s\le t,\ 0\le\upsilon\le 1,\ i=1,\dots,n\}$.

  • (A.5)

    The kernel function K(·) is symmetric with support [−1, 1] and of bounded variation. The bandwidth satisfies nh² → ∞ and nh⁵ → 0 as n → ∞.

Note that the condition (A.2) holds under the condition (A.1) given some additional moment conditions on Z(t) − Z(s) and $\exp(b^TZ(t))-\exp(b^TZ(s))$. If Z(t) = Z, not depending on t, then (A.2) holds by the Donsker theorem (Theorem 19.5 of van der Vaart [20]). The condition (A.4) assumes that the mark-specific instantaneous failure rate at time t given the observed information up to time t only depends on the failure status and the current covariate value. Under (A.4) and by the definition (1), $E(N_i(dt,d\upsilon)\mid\mathcal{F}_{t-})=Y_i(t)\lambda(t,\upsilon\mid Z_i(t))\,dt\,d\upsilon$, and $M_i(t,\upsilon)=\int_0^t\int_0^{\upsilon}[N_i(ds,dx)-Y_i(s)\lambda(s,x\mid Z_i(s))\,ds\,dx]$ is a martingale with respect to $\mathcal{F}_t$ for each fixed υ ([11], page 31). Further, it follows by Aalen and Johansen [1] that Mi(·, υ1) and Mi(·, υ2) − Mi(·, υ1) are orthogonal square integrable martingales with respect to $\mathcal{F}_t$ for any 0 ≤ υ1 ≤ υ2 ≤ 1. To avoid the problems at the boundaries υ = 0, 1, we shall study the asymptotic properties of β̂(υ) for the interior values of υ ∈ [a, b] ⊂ (0, 1).

First we present the following result that is essential for proving the asymptotic normality of β̂(υ) and provides insight into the constructions of the confidence bands and test statistics that follow. Let

$$\tilde W_A(\upsilon)=n^{-1/2}\sum_{i=1}^{n}\int_a^{\upsilon}\int_0^{\tau}A(u)\Big[Z_i(t)-\frac{s^{(1)}(t,\beta(u))}{s^{(0)}(t,\beta(u))}\Big]M_i(dt,du), \tag{6}$$

where A(u) is a deterministic p × p matrix with bounded components and 0 ≤ a < b ≤ 1.

THEOREM 1

Assume that each component of the p × p matrix A(υ), υ ∈ [a, b], is continuous. Under conditions (A.1)–(A.4), $\tilde W_A(\upsilon)$ converges weakly to a p-dimensional mean-zero Gaussian martingale, $W_A(\upsilon)$, with continuous sample paths on υ ∈ [a, b]. The covariance matrix of $W_A(\upsilon)$ is given by $\mathrm{Cov}(W_A(\upsilon))=\int_a^{\upsilon}A(u)\Sigma(u)A^T(u)\,du$.

Let

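$$\hat\Sigma_{\hat A}(\upsilon)=n^{-1}\sum_{i=1}^{n}\int_a^{\upsilon}\int_0^{\tau}\hat A(u)J_n(t,\hat\beta(u))\hat A^T(u)N_i(dt,du), \tag{7}$$

where Â(υ) is a consistent estimator of A(υ) uniformly in υ ∈ [a, b] ⊂ [0, 1]. It can be shown that $\hat\Sigma_{\hat A}(\upsilon)$ is a consistent estimator of $\mathrm{Cov}(W_A(\upsilon))$.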

The consistency and asymptotic normality of β̂(υ) are established in the next two theorems.

THEOREM 2

Under conditions (A.1)–(A.5), β̂(υ) converges to β(υ) uniformly in υ ∈ [a, b] ⊂ (0, 1).

THEOREM 3

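Under conditions (A.1)–(A.5), $(nh)^{1/2}(\hat\beta(\upsilon)-\beta(\upsilon))\xrightarrow{\mathcal{D}}N(0,\nu_0\Sigma^{-1}(\upsilon))$ for υ ∈ [a, b].

The proof of Theorem 3 uses a Taylor expansion of the score function, leading to $\hat\beta(\upsilon)-\beta(\upsilon)=-(l''_\beta(\upsilon,\beta^*(\upsilon)))^{-1}U(\upsilon,\beta(\upsilon))$, where β*(υ) is on the line segment between β̂(υ) and β(υ). The asymptotic variance of $n^{-1/2}h^{1/2}U(\upsilon,\beta(\upsilon))$ is shown to be ν0Σ(υ), which is the limit in probability of $\tilde\Sigma_n(\beta(\upsilon))=n^{-1}h\sum_{i=1}^{n}\int_0^1\int_0^{\tau}(K_h(u-\upsilon))^2J_n(t,\beta(\upsilon))N_i(dt,du)$. It can also be shown that $\hat\Sigma(\upsilon)\equiv-l''_\beta(\upsilon,\hat\beta(\upsilon))/n\xrightarrow{P}\Sigma(\upsilon)$ as n → ∞. Thus, the asymptotic variance of $(nh)^{1/2}(\hat\beta(\upsilon)-\beta(\upsilon))$ can be estimated by $\hat\Sigma_1(\upsilon)=(l''_\beta(\upsilon,\hat\beta(\upsilon))/n)^{-1}\tilde\Sigma_n(\hat\beta(\upsilon))(l''_\beta(\upsilon,\hat\beta(\upsilon))/n)^{-1}$. An alternative estimator is $\hat\Sigma_2(\upsilon)=\nu_0(-l''_\beta(\upsilon,\hat\beta(\upsilon))/n)^{-1}$. It is easy to check that ν0 = 3/5 for Epanechnikov's kernel $K(x)=\tfrac{3}{4}(1-x^2)$, −1 < x < 1. Simulations indicate that the two estimators have similar finite sample performance.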
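The value ν0 = 3/5 for the Epanechnikov kernel can be confirmed with exact rational arithmetic. The short Python check below represents the kernel as a polynomial on [−1, 1] and integrates term by term; the helper names are assumptions introduced only for this check.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


def integrate_poly(coeffs):
    """Exact integral over [-1, 1]; odd powers integrate to zero."""
    total = Fraction(0)
    for k, c in enumerate(coeffs):
        if k % 2 == 0:
            total += Fraction(c) * Fraction(2, k + 1)
    return total


K = [Fraction(3, 4), 0, Fraction(-3, 4)]        # K(x) = 3/4 - (3/4) x^2
x_squared = [0, 0, 1]

mu0 = integrate_poly(K)                          # integral of K
mu2 = integrate_poly(poly_mul(x_squared, K))     # integral of u^2 K(u)
nu0 = integrate_poly(poly_mul(K, K))             # integral of K^2
print(mu0, mu2, nu0)
```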

Theorem 3 will lead to the construction of pointwise confidence intervals for VE(υ). Simultaneous inference over υ ∈ [a, b] will be possible in terms of the estimate $\hat B(\upsilon)=\int_a^{\upsilon}\hat\beta(u)\,du$ of the cumulative regression coefficient $B(\upsilon)=\int_a^{\upsilon}\beta(u)\,du$. We have the following weak convergence result for $\hat B(\upsilon)$.

THEOREM 4

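Under conditions (A.1)–(A.5), $n^{1/2}(\hat B(\upsilon)-B(\upsilon))$ converges weakly to a p-dimensional mean-zero Gaussian martingale $W_{\Sigma^{-1}}(\upsilon)$ with continuous sample paths on υ ∈ [a, b]. The covariance matrix of $W_{\Sigma^{-1}}(\upsilon)$ is $\int_a^{\upsilon}\Sigma(u)^{-1}\,du$, which can be consistently estimated by $\hat\Sigma_{\hat A}(\upsilon)$ defined by (7) with $A(\upsilon)=(\Sigma(\upsilon))^{-1}$ and $\hat A(\upsilon)=(\hat\Sigma(\upsilon))^{-1}$.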

2.3. Confidence bands for vaccine efficacy

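Let $\beta(\upsilon)=(\beta_1(\upsilon),\beta_2^T(\upsilon))^T$. Then the vaccine efficacy can be expressed as VE(υ) = 1 − exp(β1(υ)). The estimated vaccine efficacy is $\widehat{VE}(\upsilon)=1-\exp(\hat\beta_1(\upsilon))$. By Theorem 3 and the delta method, $(nh)^{1/2}(\widehat{VE}(\upsilon)-VE(\upsilon))\xrightarrow{\mathcal{D}}N(0,\nu_0\sigma_1^2(\upsilon)\exp(2\beta_1(\upsilon)))$ for υ ∈ [a, b], where $\sigma_1^2(\upsilon)$ is the first element on the diagonal of $\Sigma^{-1}(\upsilon)$. Let $\hat\sigma_{\beta_1}^2(\upsilon)$ be the first element on the diagonal of $\hat\Sigma_1(\upsilon)$. By the discussion of the consistent estimators for the asymptotic variance following Theorem 3, $\hat\sigma_{\beta_1}^2(\upsilon)$ is a consistent estimator for $\nu_0\sigma_1^2(\upsilon)$. A pointwise 100(1 − α)% confidence band for VE(υ) is given by

$$\widehat{VE}(\upsilon)\pm(nh)^{-1/2}z_{\alpha/2}\,\hat\sigma_{\beta_1}(\upsilon)\exp(\hat\beta_1(\upsilon)),\qquad a\le\upsilon\le b, \tag{8}$$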

where zα/2 is the upper α/2 quantile of the standard normal distribution.
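The band (8) is straightforward to evaluate once β̂1(υ) and its standard-error estimate are available at a given υ. The following sketch shows the arithmetic for a single mark value; the function name and the example inputs are hypothetical, and the standard-error input is assumed to be the square root of the first diagonal element of the variance estimate described after Theorem 3.

```python
import math
from statistics import NormalDist


def ve_pointwise_ci(beta1_hat, sigma_beta1_hat, n, h, alpha=0.05):
    """Delta-method interval for VE(v) = 1 - exp(beta1(v)) at one mark value.

    sigma_beta1_hat: estimated asymptotic s.d. of (nh)^{1/2} * beta1_hat(v).
    Returns (lower, estimate, upper).
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)      # upper alpha/2 quantile
    ve_hat = 1.0 - math.exp(beta1_hat)
    half_width = z * sigma_beta1_hat * math.exp(beta1_hat) / math.sqrt(n * h)
    return ve_hat - half_width, ve_hat, ve_hat + half_width


# Hypothetical inputs: beta1_hat = -0.5, s.d. estimate 1.2, n = 500, h = 0.1.
print(ve_pointwise_ci(-0.5, 1.2, 500, 0.1))
```

Note that the interval is symmetric on the VE scale, so for strongly negative β̂1(υ) the upper limit can exceed 1; an interval built on the β1 scale and then transformed would avoid that, but it is not the form given in (8).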

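To derive simultaneous confidence bands for the cumulative vaccine efficacy $CV(\upsilon)=\int_a^{\upsilon}VE(u)\,du$, we consider the point estimator $\widehat{CV}(\upsilon)=\int_a^{\upsilon}\widehat{VE}(u)\,du$. Then

$$\sqrt{n}(\widehat{CV}(\upsilon)-CV(\upsilon))=\sqrt{n}\int_a^{\upsilon}(\exp(\beta_1(u))-\exp(\hat\beta_1(u)))\,du.$$

Note that $\sqrt{n}(\widehat{CV}(\upsilon)-CV(\upsilon))\approx\sqrt{n}\int_a^{\upsilon}\exp(\beta_1(u))(\beta_1(u)-\hat\beta_1(u))\,du$. From the proof of Theorem 4, it can be shown that $\sqrt{n}(\widehat{CV}(\upsilon)-CV(\upsilon))$ converges weakly to a mean-zero Gaussian process, $e_1^TW_A(\upsilon)$, a ≤ υ ≤ b, with continuous paths and independent increments, where $A(\upsilon)=\exp(\beta_1(\upsilon))\Sigma(\upsilon)^{-1}$ and e1 is the first column of the p × p identity matrix. The variance of $e_1^TW_A(\upsilon)$ equals $\rho^2(\upsilon)=\int_a^{\upsilon}\sigma_1^2(u)\exp(2\beta_1(u))\,du$ by Theorem 1, which can be conveniently estimated by $\int_a^{\upsilon}\hat\sigma_1^2(u)\exp(2\hat\beta_1(u))\,du$, where $\hat\sigma_1^2(\upsilon)$ is the first element of the diagonal of $\hat\Sigma(\upsilon)^{-1}$. We suspect that this estimator may ignore the finite sample correlations of β1(υ) − β̂1(υ) at different values of υ, thus over- or underestimating the true variance. We propose to use $\hat\rho^2(\upsilon)=e_1^T\hat\Sigma_{\hat A}(\upsilon)e_1$ as the estimator of the asymptotic variance of $\sqrt{n}(\widehat{CV}(\upsilon)-CV(\upsilon))$, where $\hat\Sigma_{\hat A}(\upsilon)$ is obtained from (7) with $\hat A(\upsilon)=\exp(\hat\beta_1(\upsilon))\hat\Sigma(\upsilon)^{-1}$, which is uniformly consistent by Theorem 1. Consequently, a pointwise 100(1 − α)% confidence band for CV(υ) is given by

$$\widehat{CV}(\upsilon)\pm n^{-1/2}z_{\alpha/2}\,\hat\rho(\upsilon),\qquad a\le\upsilon\le b. \tag{9}$$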
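In practice the cumulative efficacy estimate has to be computed by numerical integration of the estimated VE over a grid of mark values. The sketch below uses the trapezoid rule and then forms the pointwise band (9). The grid, the quadrature rule and the function names are assumptions for illustration, and the values of ρ̂(υ) are taken as given inputs rather than computed from (7).

```python
import math
from statistics import NormalDist


def cumulative_ve(grid, beta1_hat):
    """Trapezoid-rule estimate of CV(v) on an increasing grid with grid[0] = a."""
    cv = [0.0]
    for k in range(1, len(grid)):
        ve_prev = 1.0 - math.exp(beta1_hat[k - 1])
        ve_curr = 1.0 - math.exp(beta1_hat[k])
        cv.append(cv[-1] + 0.5 * (ve_prev + ve_curr) * (grid[k] - grid[k - 1]))
    return cv


def cv_pointwise_band(cv_hat, rho_hat, n, alpha=0.05):
    """Pointwise band: cv_hat +/- n^{-1/2} * z_{alpha/2} * rho_hat."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return [(c - z * r / math.sqrt(n), c + z * r / math.sqrt(n))
            for c, r in zip(cv_hat, rho_hat)]


# Hypothetical example: constant hazard ratio 0.5 on [0.1, 0.9].
grid = [0.1 + 0.1 * k for k in range(9)]
beta1 = [math.log(0.5)] * len(grid)
cv_hat = cumulative_ve(grid, beta1)
print(cv_hat[-1])   # approximately 0.5 * (0.9 - 0.1)
```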

Let 𝒱 be a set of values of υ in [a, b]. We may take 𝒱 = [a, b] or 𝒱 = {υk, k = 1,…,K} with υ1 < ··· < υK. Note that if U(υ) is a Gaussian martingale with variance ρ²(υ), for a ≤ υ ≤ b, then $U(\upsilon)\rho(b)[\rho^2(b)+\rho^2(\upsilon)]^{-1}$ has the same distribution as $B^0(\rho^2(\upsilon)/(\rho^2(b)+\rho^2(\upsilon)))$, a ≤ υ ≤ b, where $B^0(\cdot)$ is a Brownian bridge. By the weak convergence of $\sqrt{n}(\widehat{CV}(\upsilon)-CV(\upsilon))$, the uniform consistency of ρ̂²(υ) to ρ²(υ) and the continuous mapping theorem, we have

$$\sup_{\upsilon\in\mathcal{V}}\Big|\sqrt{n}(\widehat{CV}(\upsilon)-CV(\upsilon))\,\hat\rho(b)/(\hat\rho^2(b)+\hat\rho^2(\upsilon))\Big|\xrightarrow{\mathcal{D}}\sup_{\upsilon\in\mathcal{V}}\Big|B^0\big(\rho^2(\upsilon)/(\rho^2(b)+\rho^2(\upsilon))\big)\Big|.$$

Thus a simultaneous 100(1 − α)% confidence band for CV(υ), υ ∈ 𝒱, is given by

$$\widehat{CV}(\upsilon)\pm n^{-1/2}u_{\alpha}[\hat\rho^2(b)+\hat\rho^2(\upsilon)]/\hat\rho(b), \tag{10}$$

where uα is the upper α-quantile of the distribution of $\sup_{\upsilon\in\mathcal{V}}|B^0(\rho^2(\upsilon)/(\rho^2(b)+\rho^2(\upsilon)))|$. The uα is the upper α-quantile of $\sup_{0\le\upsilon\le 1/2}|B^0(\upsilon)|$ if 𝒱 = [a, b], which has been tabulated by Schumacher [16] for some α values. In the simulation study presented in the next section, we estimate uα by the upper α-quantile of the distribution of $\sup_{\upsilon_k\in\mathcal{V}}|B^0(\hat\rho^2(\upsilon_k)/(\hat\rho^2(b)+\hat\rho^2(\upsilon_k)))|$ in both cases when 𝒱 = [a, b] or 𝒱 = {υk, k = 1,…,K}, which can be obtained by simulating a Brownian bridge for given ρ̂²(υ).
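The critical value uα can be approximated by Monte Carlo: simulate a Brownian bridge at the transformed time points ρ̂²(υk)/(ρ̂²(b) + ρ̂²(υk)) and take the empirical upper α-quantile of the supremum of its absolute value. The sketch below is one way to do this under stated assumptions: the function name, the number of simulated paths, the seed and the example time grid are all hypothetical choices.

```python
import math
import random


def sup_bridge_quantile(times, alpha=0.05, n_paths=2000, seed=0):
    """Monte Carlo upper alpha-quantile of sup_k |B0(t_k)| for t_k in [0, 1].

    The bridge is built as B0(t) = W(t) - t * W(1) from Wiener increments.
    """
    rng = random.Random(seed)
    pts = sorted(times)
    sups = []
    for _ in range(n_paths):
        w, prev, path = 0.0, 0.0, []
        for t in pts:
            w += rng.gauss(0.0, math.sqrt(t - prev))   # W(t) - W(prev)
            path.append((t, w))
            prev = t
        w1 = w + rng.gauss(0.0, math.sqrt(1.0 - prev))  # W(1)
        sups.append(max(abs(wt - t * w1) for t, wt in path))
    sups.sort()
    index = min(n_paths - 1, math.ceil((1.0 - alpha) * n_paths) - 1)
    return sups[index]


# Fine grid on (0, 1/2], mimicking the case where the set of marks is [a, b].
fine_grid = [0.5 * k / 200 for k in range(1, 201)]
u_alpha = sup_bridge_quantile(fine_grid)
print("approximate u_0.05:", u_alpha)
```

The half-width of the simultaneous band (10) at υ is then n^{-1/2} uα [ρ̂²(b) + ρ̂²(υ)]/ρ̂(b). A discrete grid slightly underestimates the supremum over a continuum, so a finer grid or a tabulated value may be preferable when 𝒱 = [a, b].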

Alternatively, other resampling techniques such as the Gaussian multiplier method of Lin, Wei and Ying [10] can be used to estimate the critical value uα. This method can be briefly outlined as follows. Let ξ1,…,ξn be i.i.d. standard normal random variables and

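$$W^*_{\hat A}(\upsilon)=n^{-1/2}\sum_{i=1}^{n}\xi_i\int_0^{\upsilon}\int_0^{\tau}\hat A(u)\Big[Z_i(t)-\frac{S^{(1)}(t,\hat\beta(u))}{S^{(0)}(t,\hat\beta(u))}\Big]M_i(dt,du). \tag{11}$$

Then the distribution of $\sqrt{n}(\widehat{CV}(\upsilon)-CV(\upsilon))$ can be approximated by the conditional distribution of $e_1^TW^*_{\hat A}(\upsilon)$ given the observed data sequence, where $\hat A(\upsilon)=\exp(\hat\beta_1(\upsilon))(\hat\Sigma(\upsilon))^{-1}$. Consequently, the distribution of $\sup_{\upsilon\in\mathcal{V}}|\sqrt{n}(\widehat{CV}(\upsilon)-CV(\upsilon))\hat\rho(b)[\hat\rho^2(b)+\hat\rho^2(\upsilon)]^{-1}|$ can be approximated by the conditional distribution of $U^*=\sup_{\upsilon\in\mathcal{V}}|e_1^TW^*_{\hat A}(\upsilon)\hat\rho(b)[\hat\rho^2(b)+\hat\rho^2(\upsilon)]^{-1}|$ given the observed data sequence. Let $u^*_{\alpha}$ be the (1 − α)-quantile of the copies of U* obtained by repeatedly generating sets of i.i.d. standard normal random variables. A simultaneous 100(1 − α)% confidence band for CV(υ), υ ∈ 𝒱, is given by

$$\widehat{CV}(\upsilon)\pm n^{-1/2}u^*_{\alpha}[\hat\rho^2(b)+\hat\rho^2(\upsilon)]/\hat\rho(b). \tag{12}$$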

This resampling technique is also applicable to the hypothesis tests for vaccine efficacy developed in the next subsection.

2.4. Testing vaccine efficacy

We are interested in testing the following two sets of hypotheses. The first set of hypotheses is

H10: VE(υ) = 0 for υ ∈ [a, b]
versus H1a: VE(υ) ≠ 0 for some υ (general alternative)
or H1m: VE(υ) ≥ 0 with strict inequality for at least some υ (monotone alternative).

The second set of hypotheses is

H20: VE(υ) does not depend on υ ∈ [a, b]
versus H2a: VE(υ) depends on υ (general alternative)
or H2m: VE(υ) decreases as υ increases (monotone alternative).

Let β1(υ) be the first component of β(υ). Then the null hypothesis H10 is equivalent to β1(υ) = 0 and the null hypothesis H20 is equivalent to β1(υ) does not depend on υ. The null hypothesis H10 implies the vaccine affords no protection against any infecting strain of virus. The alternative H1m indicates that the vaccine provides protection for at least some of the infecting strains, while H1a states that the vaccine provides either protection or increased risk for some infecting strains. The null hypothesis H20 implies there is no difference in vaccine protection for different infecting strains, measured by their distance υ to the strains contained in the vaccine. The ordered alternative H2m states that vaccine efficacy decreases with υ and the alternative H2a indicates that the vaccine efficacy changes with υ.

In this section, we develop some test procedures for detecting departures from H10 in the direction of H1m and H1a and for detecting departures from H20 in the direction of H2m and H2a. By Theorem 4 and the discussions in Section 2.3, the process $\sqrt{n}(\widehat{CV}(\upsilon)-CV(\upsilon))$, a ≤ υ ≤ b, converges weakly to a Gaussian martingale with predictable variation ρ²(υ). Let $\xi(\upsilon)=\sqrt{n}(\widehat{CV}(\upsilon)-CV(\upsilon))/\rho(b)$. It follows from Theorem 4 that $\xi(\upsilon)\xrightarrow{\mathcal{D}}W(t(\upsilon))$, a ≤ υ ≤ b, where W(·) is a Wiener process and t(υ) = ρ²(υ)/ρ²(b).

To test H10, let $\hat Z^{(1)}(\upsilon)=\sqrt{n}\,\widehat{CV}(\upsilon)/\hat\rho(b)$ and $\hat t(\upsilon)=\hat\rho^2(\upsilon)/\hat\rho^2(b)$. Consider the following test statistics:

$$T_a^{(1)}=\int_a^b(\hat Z^{(1)}(\upsilon))^2\,d\hat t(\upsilon),\qquad T_{m1}^{(1)}=\int_a^b\hat Z^{(1)}(\upsilon)\,d\hat t(\upsilon).$$

These test statistics have somewhat complicated null distributions (see below) so we consider the following simpler test statistic based on a finite grid, which leads to a standard normal null distribution:

$$T_{m2}^{(1)}=(K-1)^{-1/2}\sum_{k=2}^{K}\frac{\hat Z^{(1)}(\upsilon_k)-\hat Z^{(1)}(\upsilon_{k-1})}{(\hat t(\upsilon_k)-\hat t(\upsilon_{k-1}))^{1/2}},$$

where a ≤ υ1 < … < υK ≤ b are the grid points in [a, b]. A similar test statistic with a standard normal null distribution is also proposed for H20 later. Under H10, $T_a^{(1)}\xrightarrow{\mathcal{D}}\int_a^b(W(t(\upsilon)))^2\,dt(\upsilon)\overset{\mathcal{D}}{=}\int_0^1(W(t))^2\,dt$, $T_{m1}^{(1)}\xrightarrow{\mathcal{D}}\int_a^bW(t(\upsilon))\,dt(\upsilon)\overset{\mathcal{D}}{=}\int_0^1W(t)\,dt$ and $T_{m2}^{(1)}\xrightarrow{\mathcal{D}}N(0,1)$. The distributions of $T_a^{(1)}$ and $T_{m1}^{(1)}$ under H10 can also be approximated by those of $\int_a^b(W(\hat t(\upsilon)))^2\,d\hat t(\upsilon)$ and $\int_a^bW(\hat t(\upsilon))\,d\hat t(\upsilon)$ for given $\hat t(\upsilon)$, respectively, which are used in the numerical studies for better finite sample approximations. We denote the upper α-quantiles of these two distributions by $c_a^{(1)}$ and $c_{m1}^{(1)}$, respectively.
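The grid-based statistic is a normalized sum of standardized increments, so it is simple to compute once the normalized cumulative efficacy estimates and the estimated time transformation are available on the grid. The sketch below is a hypothetical helper, with the two input lists assumed to hold those quantities at υ1 < … < υK.

```python
import math
from statistics import NormalDist


def tm2_statistic(z_hat, t_hat):
    """Grid-based statistic: standardized increments of z_hat, scaled by (K-1)^{-1/2}.

    z_hat[k] and t_hat[k] are the values at the k-th grid point;
    t_hat must be strictly increasing.
    """
    K = len(z_hat)
    total = sum((z_hat[k] - z_hat[k - 1]) / math.sqrt(t_hat[k] - t_hat[k - 1])
                for k in range(1, K))
    return total / math.sqrt(K - 1)


def reject_one_sided(statistic, alpha=0.05):
    """One-sided test against the upper alpha quantile of N(0, 1)."""
    return statistic > NormalDist().inv_cdf(1.0 - alpha)


# Hypothetical grid values, chosen only to exercise the arithmetic.
stat = tm2_statistic([0.0, 1.0, 2.0], [0.0, 0.25, 0.5])
print(stat, reject_one_sided(stat))
```

Each increment is divided by the square root of the corresponding increment of the estimated time scale, which is what makes the K − 1 terms approximately independent standard normal under the null hypothesis.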

The test statistic $T_a^{(1)}$ captures general departures H1a, while the test statistics $T_{m1}^{(1)}$ and $T_{m2}^{(1)}$ are sensitive to the monotone departure H1m. Both test statistics $T_{m1}^{(1)}$ and $T_{m2}^{(1)}$ are likely to be positive when VE(υ) ≥ 0 for all υ with strict inequality for some υ. Hence the tests based on $T_a^{(1)}$, $T_{m1}^{(1)}$ and $T_{m2}^{(1)}$ reject H10 if $T_a^{(1)}>c_a^{(1)}$, $T_{m1}^{(1)}>c_{m1}^{(1)}$ and $T_{m2}^{(1)}>z_{\alpha}$, respectively.

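To test H20, let $\hat Z^{(2)}(\upsilon)=\sqrt{n}\big(\tfrac{1}{\upsilon-a}\widehat{CV}(\upsilon)-\tfrac{1}{b-a}\widehat{CV}(b)\big)/\hat\rho(b)$. Note that, under H20, $\hat Z^{(2)}(\upsilon)=\sqrt{n}\big[\tfrac{1}{\upsilon-a}(\widehat{CV}(\upsilon)-CV(\upsilon))-\tfrac{1}{b-a}(\widehat{CV}(b)-CV(b))\big]/\hat\rho(b)$. By Theorem 4 and the continuous mapping theorem, under H20, $\hat Z^{(2)}(\upsilon)\xrightarrow{\mathcal{D}}\tfrac{1}{\upsilon-a}W(t(\upsilon))-\tfrac{1}{b-a}W(1)\equiv Z^{(2)}(\upsilon)$ for υ ∈ [a1, b], where a < a1 < b. We propose the following test statistics for evaluating H20:

$$T_a^{(2)}=\int_{a_1}^b(\hat Z^{(2)}(\upsilon))^2\,d\hat t(\upsilon),\qquad T_{m1}^{(2)}=\int_{a_1}^b\hat Z^{(2)}(\upsilon)\,d\hat t(\upsilon),\qquad T_{m2}^{(2)}=\hat\Pi_K^{-1}\sum_{k=2}^{K}(\hat Z^{(2)}(\upsilon_{k-1})-\hat Z^{(2)}(\upsilon_k))/\hat\pi_k,$$

where a1 ≤ υ1 < … < υK ≤ b are K grid points in [a1, b], $\hat\pi_k^2$ is an estimate of the variance $\pi_k^2=\mathrm{Var}(Z^{(2)}(\upsilon_{k-1})-Z^{(2)}(\upsilon_k))$ and $\hat\Pi_K^2$ is an estimate of the variance $\Pi_K^2$ of $\sum_{k=2}^{K}(Z^{(2)}(\upsilon_{k-1})-Z^{(2)}(\upsilon_k))/\pi_k$. By the covariance of the Wiener process, it is easy to show that

$$\tau_{i,j}=\mathrm{Cov}(Z^{(2)}(\upsilon_i),Z^{(2)}(\upsilon_j))=\frac{t(\upsilon_i)}{(\upsilon_i-a)(\upsilon_j-a)}-\frac{t(\upsilon_i)}{(\upsilon_i-a)(b-a)}-\frac{t(\upsilon_j)}{(\upsilon_j-a)(b-a)}+\frac{1}{(b-a)^2},$$

for υi ≤ υj. Thus, $\pi_k^2=\tau_{k-1,k-1}-2\tau_{k-1,k}+\tau_{k,k}$. Let $\Gamma=(\tau_{i,j})_{K\times K}$ and

$$\xi^T=(\pi_2^{-1},\ \pi_3^{-1}-\pi_2^{-1},\ \dots,\ \pi_K^{-1}-\pi_{K-1}^{-1},\ -\pi_K^{-1}).$$

It follows that $\Pi_K^2=\xi^T\Gamma\xi$. The estimates $\hat\pi_k^2$ and $\hat\Pi_K^2$ are obtained by replacing t(υ) with $\hat t(\upsilon)$.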
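The normalizing constants for the grid-based test of H20 follow mechanically from the covariance formula for the limiting process. The sketch below assembles the covariance matrix, the increment standard deviations and the weight vector, and then evaluates the quadratic form. Function names and the example grid are assumptions for illustration; in an application the time-transformation values would be replaced by their estimates.

```python
import math


def tau(i, j, v, t, a, b):
    """Covariance of the limit process at grid points v[i] and v[j]."""
    if v[i] > v[j]:
        i, j = j, i
    return (t[i] / ((v[i] - a) * (v[j] - a))
            - t[i] / ((v[i] - a) * (b - a))
            - t[j] / ((v[j] - a) * (b - a))
            + 1.0 / (b - a) ** 2)


def tm2_normalizers(v, t, a, b):
    """Return (pi_2..pi_K, weight vector xi, Pi_K) for grid v and times t."""
    K = len(v)
    gamma = [[tau(i, j, v, t, a, b) for j in range(K)] for i in range(K)]
    # pi[k]: s.d. of the increment between grid points k-1 and k (0-based k).
    pi = [None] + [math.sqrt(gamma[k - 1][k - 1] - 2.0 * gamma[k - 1][k] + gamma[k][k])
                   for k in range(1, K)]
    # Each standardized increment adds +1/pi_k to point k-1 and -1/pi_k to point k.
    xi = [0.0] * K
    for k in range(1, K):
        xi[k - 1] += 1.0 / pi[k]
        xi[k] -= 1.0 / pi[k]
    big_pi_sq = sum(xi[i] * gamma[i][j] * xi[j] for i in range(K) for j in range(K))
    return pi[1:], xi, math.sqrt(big_pi_sq)


# Hypothetical grid on [a1, b] with a = 0.1, b = 0.9 and t(b) = 1.
grid = [0.3, 0.5, 0.7, 0.9]
t_vals = [0.2, 0.5, 0.8, 1.0]
pis, xi_vec, big_pi = tm2_normalizers(grid, t_vals, 0.1, 0.9)
print(pis, xi_vec, big_pi)
```

Accumulating the weights increment by increment reproduces the vector of differences of reciprocal standard deviations given in the text, and its entries sum to zero.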

By the weak convergence of $\hat Z^{(2)}(\upsilon)$ to $Z^{(2)}(\upsilon)$, and the convergence in probability of $\hat t(\upsilon)$ to t(υ), a1 ≤ υ ≤ b, we have $T_{m2}^{(2)}\xrightarrow{\mathcal{D}}N(0,1)$ under H20. It also follows that $T_a^{(2)}\xrightarrow{\mathcal{D}}\int_{a_1}^b(Z^{(2)}(\upsilon))^2\,dt(\upsilon)$, and $T_{m1}^{(2)}\xrightarrow{\mathcal{D}}\int_{a_1}^bZ^{(2)}(\upsilon)\,dt(\upsilon)$ under H20. The distributions of $T_a^{(2)}$ and $T_{m1}^{(2)}$ under H20 can be approximated by those of $\int_{a_1}^b\big(W(\hat t(\upsilon))/(\upsilon-a)-W(\hat t(b))/(b-a)\big)^2\,d\hat t(\upsilon)$ and $\int_{a_1}^b\big(W(\hat t(\upsilon))/(\upsilon-a)-W(\hat t(b))/(b-a)\big)\,d\hat t(\upsilon)$ for given $\hat t(\upsilon)$, respectively, which are used in the numerical studies for better finite sample approximations. We denote the upper α-quantiles of these two distributions by $c_a^{(2)}$ and $c_{m1}^{(2)}$, respectively.

The test statistic $T_a^{(2)}$ captures general departures H2a while the test statistics $T_{m1}^{(2)}$ and $T_{m2}^{(2)}$ are sensitive to the monotone departure H2m. Both $T_{m1}^{(2)}$ and $T_{m2}^{(2)}$ are expected to be positive when VE(υ) decreases as υ increases, that is, when H2m holds. Hence the tests $T_a^{(2)}$, $T_{m1}^{(2)}$ and $T_{m2}^{(2)}$ reject H20 if $T_a^{(2)}>c_a^{(2)}$, $T_{m1}^{(2)}>c_{m1}^{(2)}$ and $T_{m2}^{(2)}>z_{\alpha}$, respectively.

3. Simulation study

In this section, we conduct a simulation study to check the finite sample performance of the proposed estimation and hypothesis testing procedures using the simple mark-specific proportional hazards model:

$$\lambda(t,\upsilon\mid z)=\exp\{\gamma\upsilon+(\alpha+\beta\upsilon)z\},\qquad t\ge 0,\ 0\le\upsilon\le 1, \tag{13}$$

where α, β and γ are constants and the treatment indicator z takes value 0 or 1 with probability of 0.5 for each value. Under model (13), the mark-specific baseline function is λ0(t, υ) = exp(γ υ) and VE(υ) = 1 − exp(α + β υ). The null hypothesis H10 of no vaccine efficacy holds if both α = 0 and β = 0, and the null hypothesis H20 that vaccine efficacy does not depend on the type of infecting strain is true if β = 0. Various choices of α and β specify different alternatives for H10 and H20.

We consider the following simulation models:

(α, β, γ) = (0, 0, 0.3), for the null hypothesis H10 of no vaccine efficacy; (M1)
(α, β, γ) = (−0.5, 0.5, 0.3), as the first alternative of H10; (M2)
(α, β, γ) = (−0.6, 0.6, 0.3), as the second alternative of H10; (M3)
(α, β, γ) = (−0.6, 0, 0.3), as the third alternative of H10; (M4)
(α, β, γ) = (−0.69, 0, 0.3), for the null hypothesis H20 that vaccine efficacy does not depend on the type of infecting strain; (M5)
(α, β, γ) = (−1.2, 1.2, 0.3), as the first alternative of H20; (M6)
(α, β, γ) = (−1.5, 1.5, 0.3), as the second alternative of H20; (M7)
(α, β, γ) = (−1.8, 1.8, 0.3), as the third alternative of H20. (M8)

The models (M2) to (M4) are considered as the alternatives for H1m and H1a. The departure from H10: VE(υ) = 0 increases as the simulation model moves from (M2) to (M4). The models (M6) to (M8) are considered as the alternatives for H2m and H2a. The departure from H20 increases as the simulation model moves from (M6) to (M8).

We generate the censoring times from an exponential distribution, independent of (T, V), with the censoring rates ranging from 20% to 30%. We set the interval of analyses for υ as [a, b] = [0.1, 0.9] and bandwidths are chosen as h = 0.05, 0.1, 0.15. The observed failure times with marks outside the interval [a, b] can also be used since the smoothing at υ takes the cases with marks in its h-neighborhood. The Epanechnikov kernel K(x) = 0.75(1 − x²)I{|x| ≤ 1} is used throughout. Sample sizes of n = 500 and 800 are studied.
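Data from model (13) can be generated directly because the hazard does not depend on t. Integrating over the mark gives a constant overall hazard exp(αz)(e^c − 1)/c with c = γ + βz, so T given z is exponential, and V given z is independent of T with density proportional to exp(cυ) on [0, 1], which can be sampled by inverting its distribution function. The sketch below follows that reasoning. The function name, the seed and the censoring hazard of 0.3 are assumptions; the text only states that censoring is exponential with 20% to 30% of observations censored.

```python
import math
import random


def simulate_trial(n, alpha, beta, gamma, cens_hazard=0.0, seed=0):
    """Simulate (X, delta, V or None, z) from model (13) with exponential censoring."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        c = gamma + beta * z                       # exponent rate in the mark
        mass = (math.exp(c) - 1.0) / c if c != 0.0 else 1.0
        total_hazard = math.exp(alpha * z) * mass  # hazard integrated over the mark
        t = rng.expovariate(total_hazard)
        u = rng.random()
        # Inverse-CDF draw from the density proportional to exp(c * v) on [0, 1].
        v = math.log(1.0 + u * (math.exp(c) - 1.0)) / c if c != 0.0 else u
        cens = rng.expovariate(cens_hazard) if cens_hazard > 0 else math.inf
        if t <= cens:
            data.append((t, 1, v, z))
        else:
            data.append((cens, 0, None, z))        # mark unobserved when censored
    return data


# Model M2 with an assumed censoring hazard of 0.3.
sample = simulate_trial(5000, -0.5, 0.5, 0.3, cens_hazard=0.3, seed=1)
censored_fraction = sum(1 for row in sample if row[1] == 0) / len(sample)
print("fraction censored:", censored_fraction)
```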

For the tests $T_{m2}^{(1)}$ and $T_{m2}^{(2)}$, we take the grid of eight evenly spaced points in [a, b] from 0.196 to 0.868. Table 1 lists the empirical sizes and powers of the test statistics $T_a^{(1)}$, $T_{m1}^{(1)}$ and $T_{m2}^{(1)}$ and Table 2 for the test statistics $T_a^{(2)}$, $T_{m1}^{(2)}$ and $T_{m2}^{(2)}$. The significance levels of these tests are given at α = 0.05. Both tables also list the coverage probabilities of the 95% simultaneous confidence intervals for CV(υ), for υ ∈ [a, b] and for υ in the grid. The critical values for the tests $T_{m2}^{(1)}$ and $T_{m2}^{(2)}$ at α = 0.05 are zα = 1.645. The critical values for the tests $T_a^{(1)}$ and $T_a^{(2)}$, $T_{m1}^{(1)}$ and $T_{m1}^{(2)}$ are obtained by generating 10,000 Wiener processes W(·) with time parameter equal to $\hat t(\upsilon)$ and calculating the corresponding functionals of $W(\hat t(\upsilon))$, as described in the previous section. Each entry in Tables 1 and 2 is based on 1000 repetitions.

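TABLE 1.

Empirical sizes and powers (%) of the tests $T_a^{(1)}$, $T_{m1}^{(1)}$ and $T_{m2}^{(1)}$ at the nominal level 0.05, and coverage probabilities (%) of the 95% simultaneous confidence intervals for CV(υ) with υ on the grid and on [a, b]

| Model | (α, β, γ) | n | h | Size/power $T_a^{(1)}$ | Size/power $T_{m1}^{(1)}$ | Size/power $T_{m2}^{(1)}$ | Coverage, grid | Coverage, [a, b] |
|---|---|---|---|---|---|---|---|---|
| M1 | (0, 0, 0.3) | 500 | 0.05 | 2.9 | 3.1 | 7.8 | 97.5 | 98.1 |
| | | | 0.1 | 4.9 | 5.9 | 8.3 | 96.6 | 97.4 |
| | | | 0.15 | 5.1 | 6.9 | 7.3 | 96.2 | 96.8 |
| | | 800 | 0.05 | 5.3 | 2.8 | 6.9 | 95.9 | 96.8 |
| | | | 0.1 | 5.7 | 4.7 | 6.8 | 95.5 | 97.0 |
| | | | 0.15 | 5.8 | 5.2 | 6.3 | 95.6 | 96.5 |
| M2 | (−0.5, 0.5, 0.3) | 500 | 0.05 | 45.4 | 56.3 | 63.2 | 97.6 | 98.0 |
| | | | 0.1 | 60.3 | 71.4 | 65.7 | 97.0 | 97.5 |
| | | | 0.15 | 66.0 | 77.4 | 65.5 | 96.7 | 97.6 |
| | | 800 | 0.05 | 69.1 | 78.4 | 77.5 | 96.1 | 96.8 |
| | | | 0.1 | 80.3 | 86.5 | 80.1 | 95.6 | 96.7 |
| | | | 0.15 | 82.9 | 89.1 | 80.1 | 96.0 | 97.2 |
| M3 | (−0.6, 0.6, 0.3) | 500 | 0.05 | 59.7 | 70.0 | 76.5 | 97.5 | 98.0 |
| | | | 0.1 | 75.4 | 83.9 | 78.8 | 96.9 | 97.8 |
| | | | 0.15 | 80.9 | 87.2 | 78.5 | 96.9 | 97.9 |
| | | 800 | 0.05 | 83.7 | 90.4 | 87.6 | 96.2 | 96.9 |
| | | | 0.1 | 90.8 | 94.4 | 89.6 | 96.0 | 96.8 |
| | | | 0.15 | 93.0 | 96.0 | 89.6 | 96.2 | 97.2 |
| M4 | (−0.6, 0, 0.3) | 500 | 0.05 | 96.0 | 95.6 | 99.9 | 97.0 | 97.8 |
| | | | 0.1 | 99.1 | 98.8 | 100 | 96.7 | 97.6 |
| | | | 0.15 | 99.5 | 99.7 | 100 | 96.7 | 97.4 |
| | | 800 | 0.05 | 99.9 | 99.5 | 100 | 97.0 | 98.0 |
| | | | 0.1 | 100 | 100 | 100 | 96.9 | 97.3 |
| | | | 0.15 | 100 | 100 | 100 | 96.4 | 97.4 |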

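TABLE 2.

Empirical sizes and powers (%) of the tests $T_a^{(2)}$, $T_{m1}^{(2)}$ and $T_{m2}^{(2)}$ at the nominal level 0.05, and coverage probabilities (%) of the 95% simultaneous confidence intervals for CV(υ) with υ on the grid and on [a, b]

| Model | (α, β, γ) | n | h | Size/power $T_a^{(2)}$ | Size/power $T_{m1}^{(2)}$ | Size/power $T_{m2}^{(2)}$ | Coverage, grid | Coverage, [a, b] |
|---|---|---|---|---|---|---|---|---|
| M5 | (−0.69, 0, 0.3) | 500 | 0.05 | 1.6 | 3.7 | 3.7 | 97.0 | 97.8 |
| | | | 0.1 | 2.1 | 3.7 | 4.5 | 96.5 | 97.5 |
| | | | 0.15 | 2.1 | 3.5 | 4.6 | 96.8 | 97.3 |
| | | 800 | 0.05 | 2.3 | 4.0 | 2.9 | 97.3 | 98.3 |
| | | | 0.1 | 2.6 | 4.3 | 3.2 | 97.0 | 97.6 |
| | | | 0.15 | 2.1 | 3.5 | 3.0 | 96.9 | 97.4 |
| M6 | (−1.2, 1.2, 0.3) | 500 | 0.05 | 47.2 | 67.6 | 47.7 | 97.9 | 98.5 |
| | | | 0.1 | 60.2 | 76.7 | 62.3 | 97.1 | 97.6 |
| | | | 0.15 | 63.2 | 80.3 | 73.3 | 97.5 | 97.8 |
| | | 800 | 0.05 | 69.2 | 85.1 | 69.2 | 96.5 | 97.2 |
| | | | 0.1 | 80.4 | 92.0 | 80.4 | 96.6 | 97.6 |
| | | | 0.15 | 84.2 | 94.1 | 88.4 | 96.9 | 97.8 |
| M7 | (−1.5, 1.5, 0.3) | 500 | 0.05 | 63.8 | 81.4 | 62.1 | 97.7 | 98.0 |
| | | | 0.1 | 76.9 | 78.0 | 63.6 | 97.2 | 98.0 |
| | | | 0.15 | 81.2 | 91.7 | 86.3 | 97.6 | 98.0 |
| | | 800 | 0.05 | 85.1 | 94.4 | 82.6 | 96.2 | 97.1 |
| | | | 0.1 | 93.2 | 98.2 | 91.8 | 96.1 | 97.6 |
| | | | 0.15 | 96.0 | 98.9 | 97.4 | 96.7 | 97.7 |
| M8 | (−1.8, 1.8, 0.3) | 500 | 0.05 | 77.6 | 89.1 | 73.6 | 97.8 | 98.5 |
| | | | 0.1 | 87.1 | 95.6 | 85.7 | 97.3 | 98.4 |
| | | | 0.15 | 91.5 | 96.9 | 92.8 | 97.7 | 98.7 |
| | | 800 | 0.05 | 93.5 | 98.2 | 91.4 | 96.4 | 97.4 |
| | | | 0.1 | 98.2 | 99.5 | 97.0 | 96.3 | 97.5 |
| | | | 0.15 | 99.3 | 99.9 | 99.2 | 96.5 | 97.9 |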

Most tests have appropriate sizes close to 5%. The test $T_a^{(2)}$ seems to be conservative for the simulation models used in the study. The test $T_{m1}^{(1)}$ has better power than the tests $T_a^{(1)}$ and $T_{m2}^{(1)}$. The test $T_{m1}^{(2)}$ has better power than the tests $T_a^{(2)}$ and $T_{m2}^{(2)}$. Therefore the tests that incorporate $\widehat{CV}(\upsilon)$ over the entire range [a, b] present greater power than the simpler tests based on $\widehat{CV}(\upsilon)$ over the grid. We also observed that the powers of the tests seem to be influenced by the selection of bandwidth, with greater power for a larger bandwidth. Similar plots (not included here) to Figure 1 and Figure 2 but with larger bandwidth h = 0.2 show that the estimated standard errors of $\widehat{CV}(\upsilon)$ become smaller for larger h while the biases stay approximately the same, resulting in increased power for the larger bandwidth. We suspect that this phenomenon is associated with the sample size and the convergence rate of the normalized $\widehat{CV}(\upsilon)$ to a Wiener process. The dependence of the power on the bandwidth should become small as the sample size increases. Further study on the bandwidth selection is warranted.

FIG. 1.

Plots of estimates for β(υ), VE(υ) and CV(υ) under the models M1, M2, M5 and M6 for n = 500, h = 0.1. The solid dark lines are the true functions and the dashed lines are the averages of the estimates based on 1000 repetitions. The gray lines are the corresponding estimates for β(υ), VE(υ) and CV(υ) of 50 random samples.

FIG. 2.

Plots of the standard errors under the models M1, M2, M5 and M6, based on n = 500, h = 0.1. The solid lines are the averages of the estimates of the standard deviations of $\hat\beta(\upsilon)$, $\widehat{VE}(\upsilon)$ and $\widehat{CV}(\upsilon)$, while the dashed lines are the sample standard deviations of $\hat\beta(\upsilon)$, $\widehat{VE}(\upsilon)$ and $\widehat{CV}(\upsilon)$, based on 1000 repetitions. The gray lines are the corresponding estimates for the standard deviations of $\hat\beta(\upsilon)$, $\widehat{VE}(\upsilon)$ and $\widehat{CV}(\upsilon)$ of 50 random samples.

The coverage probabilities of the simultaneous confidence intervals for CV(υ) are closer to the 95% nominal level for υ on the grid than on [a, b]. This may be explained by the fact that the convergence for υ over the entire range [a, b] is slower than the convergence on the grid. The evaluations of the proposed estimators for β(υ), VE(υ) and CV(υ) and their respective estimators of the standard deviations under some of the simulation models are presented in Figure 1 and Figure 2. The plots of the pointwise coverage probabilities for VE(υ) and for CV(υ) are given in Figure 3. These plots are based on n = 500 and h = 0.1.

FIG. 3.

Plots of the pointwise coverage probabilities for VE(υ) (gray lines) and for CV(υ) (solid lines), based on n = 500, h = 0.1 and 1000 repetitions. The models on the left panel are M1, M2 and M3. The models on the right panel are M5, M6 and M7.

Now we demonstrate with a simulation example that the adoption of a standard method for testing the vaccine efficacy that ignores the mark is inefficient and can be misleading. We consider a special case of the model discussed in the Introduction, with λ(t, υ|z = 0) = 1 and λ(t, υ|z = 1) = 2υ, for t ≥ 0, 0 ≤ υ ≤ 1. The covariate z is again a treatment indicator taking values 0 and 1 with probability of 0.5 for each value. The marginal hazards model ignoring the mark is therefore λ(t|z = 0) = 1 and λ(t|z = 1) = 1, for t ≥ 0. The rest of the simulation setup such as the percentage of censorship, the kernel function and the bandwidth is the same as for the previous models. The model considered here represents both a proportional mark-specific hazards model for λ(t, υ|z) and a proportional hazards model for λ(t|z) = λ0(t) exp(βz), with the mark-specific vaccine efficacy VE(υ) = 1 − 2υ and the marginal VE = 1 − exp(β) = 0. The standard Wald test, denoted by Tw, under the marginal Cox model is often used to test for the vaccine efficacy. As expected, the standard Wald test shows no power (Table 3). It is incapable of revealing any vaccine efficacy or that the vaccine efficacy depends on the mark, thus missing the important scientific finding that the vaccine protects against viruses with smaller mark values (V < 0.5) and increases risk of infection with viruses with larger mark values (V > 0.5). The example we constructed here shows the weakness of using the standard approach that ignores the mark and is what motivates the present research.
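The cancellation in this example can be checked numerically. The sketch below integrates the two mark-specific hazards over the mark to recover the marginal hazards, then evaluates VE(υ) = 1 − 2υ at a few mark values. The function names and the midpoint-rule integration are illustrative choices, not part of the simulation code used for the paper.

```python
def mark_specific_hazard(v: float, z: int) -> float:
    """lambda(t, v | z) from the example: 1 for placebo (z = 0), 2v for vaccine (z = 1)."""
    return 1.0 if z == 0 else 2.0 * v


def marginal_hazard(z: int, n_grid: int = 10_000) -> float:
    """Integrate the mark-specific hazard over v in [0, 1] with the midpoint rule."""
    step = 1.0 / n_grid
    return sum(mark_specific_hazard((k + 0.5) * step, z) for k in range(n_grid)) * step


def mark_specific_ve(v: float) -> float:
    """VE(v) = 1 - lambda(t, v | z = 1) / lambda(t, v | z = 0), which is 1 - 2v here."""
    return 1.0 - mark_specific_hazard(v, 1) / mark_specific_hazard(v, 0)


# The marginal hazards coincide, so the marginal VE is zero up to rounding error.
marginal_ve = 1.0 - marginal_hazard(1) / marginal_hazard(0)
print(f"marginal VE = {marginal_ve:.3e}")

# The mark-specific VE is positive below v = 0.5 and negative above it.
for v in (0.1, 0.5, 0.9):
    print(f"VE({v}) = {mark_specific_ve(v):+.2f}")
```

The positive efficacy for small marks and the negative efficacy for large marks average out exactly, which is why a test based only on the marginal hazard ratio has nothing to detect.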

TABLE 3.

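Comparison of the standard Wald test with the proposed tests $T_a^{(1)}$, $T_{m1}^{(1)}$, $T_{m2}^{(1)}$, $T_a^{(2)}$, $T_{m1}^{(2)}$ and $T_{m2}^{(2)}$ at the nominal level 0.05

| n | h | Power (%): $T_w$ | $T_a^{(1)}$ | $T_{m1}^{(1)}$ | $T_{m2}^{(1)}$ | $T_a^{(2)}$ | $T_{m1}^{(2)}$ | $T_{m2}^{(2)}$ |
|---|---|---|---|---|---|---|---|---|
| 500 | 0.05 | 5.9 | 14.9 | 24.2 | 16.6 | 98.0 | 99.4 | 97.3 |
| | 0.1 | | 23.9 | 35.7 | 16.0 | 99.6 | 100 | 99.8 |
| | 0.15 | | 27.9 | 39.1 | 15.7 | 99.9 | 100 | 99.9 |
| 800 | 0.05 | 6.1 | 32.4 | 39.6 | 15.0 | 100 | 100 | 99.6 |
| | 0.1 | | 43.1 | 51.5 | 13.8 | 100 | 100 | 100 |
| | 0.15 | | 46.0 | 53.3 | 13.9 | 100 | 100 | 100 |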

4. Application

The first preventive HIV vaccine efficacy trial was carried out in North America and The Netherlands, and enrolled 5403 HIV-negative volunteers at risk for acquiring HIV infection [4]. Volunteers were randomized in a 2:1 ratio to receive a recombinant glycoprotein 120 vaccine (AIDSVAX) or placebo, and were monitored for HIV infection at semi-annual HIV testing visits for 36 months. The primary objective was to assess VE using the standard Cox model, and a secondary objective was to test H10: VE(t, υ) = 0 and H20: VE(t, υ) = VE(t) for three different mark variables V defined in terms of the percent mismatch of aligned amino acid sequences (for each infecting HIV sequence compared to the HIV sequence [named GNE8] contained in the AIDSVAX construct) in three subregions of HIV-gp120. For brevity, in this article we consider only one mark V, defined as the percent mismatch of amino acids in the whole gp120 region (581 amino acids long), where all possible mismatches of particular pairs of amino acids (e.g., A versus C) are weighted by the estimated probability of interchange [13]. The distance is based on the gp120 region because this region contains neutralizing epitopes that potentially can induce anti-HIV antibody responses that prevent HIV infection [22]; the vaccine was designed to protect by stimulating high titer antibodies that neutralize exposing HIVs. Of the 368 individuals infected during the trial, 32 had missing marks. Of the remaining 336 samples, all marks were unique (217 vaccine; 119 placebo).
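To make the construction of the mark concrete, the following sketch computes a weighted percent mismatch between an aligned sequence and a reference. It is only an illustration: the weights shown are invented placeholders, the actual weights come from the substitution matrices in [13] and are not reproduced here, and gap handling and the exact way [13] maps interchange probabilities to weights are not addressed.

```python
def weighted_percent_mismatch(seq: str, ref: str, weight) -> float:
    """Weighted percent mismatch between two aligned sequences of equal length.

    weight(a, b) is the penalty charged when residue a is observed where the
    reference has residue b; matching positions contribute nothing.
    """
    if len(seq) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    total = sum(weight(a, b) for a, b in zip(seq, ref) if a != b)
    return 100.0 * total / len(ref)


def unit_weight(a: str, b: str) -> float:
    """Every mismatch counts fully (plain percent mismatch)."""
    return 1.0


# Hypothetical penalties for two residue pairs; all other mismatches count fully.
HYPOTHETICAL_WEIGHTS = {frozenset("EQ"): 0.4, frozenset("KR"): 0.3}


def toy_weight(a: str, b: str) -> float:
    """Symmetric lookup in the placeholder table, defaulting to a full mismatch."""
    return HYPOTHETICAL_WEIGHTS.get(frozenset((a, b)), 1.0)


reference = "ACDEFGHIKL"   # toy stand-in for the vaccine sequence
infecting = "ACDQFGHIRL"   # toy stand-in for an infecting sequence (2 mismatches)

print(weighted_percent_mismatch(infecting, reference, unit_weight))
print(weighted_percent_mismatch(infecting, reference, toy_weight))
```

Because the weights vary continuously across residue pairs, the resulting distances rarely tie, which fits the observation that all 336 observed marks were unique.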

The vaccine efficacy is estimated and tested by adjusting for two covariates: age (ranging 18–62 years with mean of 36.5) and behavioral risk score (taking values 0–7) as defined in [4]. It is relevant to adjust for these covariates because they predict infection rate and because trial participants with different values of these covariates may be exposed to HIV strains with different distributions of V. Both covariates are considered as continuous variables. The histograms of the rescaled mark values, ages in years and behavioral risk scores are plotted in Figure 4. We denote the treatment indicator by z1 (z1 = 1 for the vaccine and z1 = 0 for the placebo), age by z2 and behavioral risk score by z3, and denote the corresponding coefficient functions by β1(υ), β2(υ) and β3(υ). Model (2) was fitted with h = 0.3; the estimates of β1(υ), β2(υ) and β3(υ) and their pointwise confidence bands are plotted in Figure 5. The plots of $\widehat{VE}(\upsilon)$ and $\widehat{CV}(\upsilon)$ with their corresponding pointwise confidence bands, adjusting for the two covariates z2 and z3, are given in Figure 6.

FIG. 4.

Histograms for the observed mark values, ages in years and behavioral risk scores. The left panel is for the vaccine group and the right panel is for the placebo group.

FIG. 5.

Plots of the estimated regression coefficients β1(υ), β2(υ) and β3(υ) and their 95% pointwise confidence bands for the vaccine trial data with h = 0.3.

FIG. 6.

Plots of the estimates of VE(υ) and CV(υ) and their confidence bands for the vaccine trial data with h = 0.3. The dashed lines are 95% pointwise confidence bands and the dotted lines are 95% simultaneous confidence bands.

Adjusting for age and behavioral risk score, the Wald test statistic for testing the marginal VE = 0 using the standard Cox model is −0.978, yielding a p-value of 0.328 for the two-sided alternative and 0.164 for the monotone alternative. Our test with the test statistic Ta(1) for H10: VE(υ) = 0 for all υ versus the general alternative H1a yields a p-value of 0.1532. The p-values for testing against the monotone alternative H1m are 0.0916 for Tm1(1) and 0.0228 for Tm2(1). These results give some, albeit weak, evidence of nonzero vaccine efficacy for at least one mark value; see Figure 6.
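The two p-values quoted for the marginal Wald test can be reproduced from the statistic itself. The sketch below assumes that the statistic is referred to a standard normal distribution and that the one-sided alternative points in the direction of a negative statistic (positive vaccine efficacy); both are assumptions about how the reported values were obtained.

```python
import math


def standard_normal_cdf(x: float) -> float:
    """Phi(x), computed from the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


wald_z = -0.978  # Wald statistic for the marginal VE reported in the text

p_two_sided = 2.0 * standard_normal_cdf(-abs(wald_z))
p_one_sided = standard_normal_cdf(wald_z)   # assumes the alternative is a negative statistic

print(f"two-sided p-value: {p_two_sided:.3f}")
print(f"one-sided p-value: {p_one_sided:.3f}")
```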

In addition, adjusting for age and behavioral risk score, we conducted the tests to evaluate whether the vaccine efficacy varies with the mark. The p-value for testing H20 that VE(υ) does not depend on υ versus the general alternative H2a is 0.2067 for the test statistic $T_a^{(2)}$. The p-value for testing for the monotone alternative H2m is 0.9363 for the test statistic $T_{m1}^{(2)}$ and 0.9047 for the test statistic $T_{m2}^{(2)}$. These p-values are expected given the plots in Figure 6, where $\widehat{VE}(\upsilon)$ shows some tendency to increase with υ.

5. Discussion

This article developed inference techniques for the proportional hazards model with a continuous mark variable, including nonparametric methods for estimation and testing of mark-specific regression functions. These techniques can be used to estimate mark-specific vaccine efficacy (VE(υ)) and cumulative mark-specific vaccine efficacy (CV(υ)) with simultaneous confidence bands, and to test hypotheses for VE(υ), while adjusting for time-dependent covariate effects. The testing procedures based on the statistics $T_{m1}^{(1)}$ and $T_{m2}^{(2)}$ showed greatest power in simulations and are recommended for testing VE(υ) = 0 for all υ and for testing VE(υ) independent of υ, respectively.

An alternative approach to the continuous mark-specific PH model would be a similar model that treats the mark variable as ordinal categorical. We focused on a continuous mark because (i) it most naturally suits the HIV vaccine application, as the choice of K bins for categorizing the marks would be arbitrary and (ii) testing β(υ) = β can often be done with greater power than testing equality of the cause-specific regression coefficients β1 = … = βK.

As is well known for a discrete mark-specific hazard function, the interpretation of the continuous mark-specific hazard function λ(t, υ) is restricted to actual study conditions, that is, it is the instantaneous rate of failure in the presence of all of the circulating competing risks (i.e., is a “crude” hazard in the terminology of Prentice et al. [15]). However, often the main scientific interest is in the “net” mark-specific hazard, the instantaneous rate of failure by mark υ in the absence of any other competing risks, but unfortunately this parameter is not identified except under untestable assumptions such as mutual independence of all of the notional (latent) mark-specific failure times [19]. This problem necessitates careful interpretation of inferences in the mark-specific PH model.

For the HIV vaccine trial example, the crude mark-specific hazard can be factored as

$$\lambda(t,\upsilon\mid z)=\lambda_E(t,\upsilon\mid z)\times\lambda_{PC}(t\mid\upsilon,z), \tag{14}$$

where λE(t, υ|z) is the intensity of exposure to strain υ for participants with covariates z and λPC(t|υ, z) (the “per-contact” transmission hazard) is the same as λ(t, υ|z) except that it further conditions on the (unobserved) presence of exposure to a virus with genetic distance υ during [t, t + dt). Exposure can arise from unprotected sex or sharing a needle with an individual infected with strain υ. Therefore the identified parameter measures a mixture of vaccine/placebo-group differences in mark-specific exposure rates and in conditional mark-specific per-exposure transmission probabilities, whereas biological interest is in

$$\mathrm{VE}_{PC}(t\mid\upsilon,z_2)=1-\frac{\lambda_{PC}(t\mid\upsilon,1,z_2)}{\lambda_{PC}(t\mid\upsilon,0,z_2)}$$

as a measure of vaccine efficacy. However, as data are not available for estimating the relative intensity λE(t, υ|1, z2)/λE(t, υ|0, z2), our approach is to use

$$\mathrm{VE}(t,\upsilon\mid z_2)=1-\frac{\lambda(t,\upsilon\mid 1,z_2)}{\lambda(t,\upsilon\mid 0,z_2)}$$

as the target estimand, and assume identical exposure rates between the two groups, so this target has the same interpretation as VEPC(t|υ, z2). Reliance on this assumption demonstrates the value of including covariates z2 that predict mark-specific exposure into the mark-specific PH model: the richer the covariate information the more likely VE(t, υ|z2) reflects biological vaccine efficacy. Gilbert, McKeague and Sun [6] provided further discussion of the interpretation of mark-specific hazard ratios.
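The role of the equal-exposure assumption can be made explicit by substituting the factorization (14) into the definition of VE(t, υ|z2). The following worked step is a sketch added for illustration; it writes the exposure and per-contact hazards with subscripts E and PC:

```latex
\begin{aligned}
1-\mathrm{VE}(t,\upsilon\mid z_2)
 &=\frac{\lambda(t,\upsilon\mid 1,z_2)}{\lambda(t,\upsilon\mid 0,z_2)}
  =\frac{\lambda_E(t,\upsilon\mid 1,z_2)}{\lambda_E(t,\upsilon\mid 0,z_2)}
   \times\frac{\lambda_{PC}(t\mid\upsilon,1,z_2)}{\lambda_{PC}(t\mid\upsilon,0,z_2)}\\
 &=\frac{\lambda_E(t,\upsilon\mid 1,z_2)}{\lambda_E(t,\upsilon\mid 0,z_2)}
   \times\bigl(1-\mathrm{VE}_{PC}(t\mid\upsilon,z_2)\bigr).
\end{aligned}
```

When the exposure ratio equals 1, the two measures coincide. When it differs from 1, VE(t, υ|z2) mixes the biological effect with the difference in exposure, which is why covariates that predict mark-specific exposure are worth including.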

The usefulness of our approach relies on the validity of the mark-specific proportional hazards model. Lin, Wei and Ying [10] developed goodness-of-fit tests for the standard Cox model based on martingale residuals, and their tests can be extended to the present setting by using the mark-specific martingale residuals

$$\hat M_i(t,\upsilon)=\int_0^t\int_a^\upsilon\Bigl[N_i(ds,du)-Y_i(s)\exp\bigl((\hat\beta(u))^TZ_i\bigr)\hat\Lambda_0(ds,du)\Bigr], \tag{15}$$

for i = 1,…, n. These residuals may be interpreted as the difference at time t between the observed and the predicted number of events with mark less than υ for the ith subject, and are informative about model misspecification. It can be checked that $n^{-1/2}\sum_{i=1}^n\hat M_i(t,\upsilon)=o_p(1)$. This property is similar to that in the standard Cox model, where the sum of the martingale residuals is exactly zero. The difference here is caused by the kernel smoothing in a neighborhood of υ. Because β(υ) is treated nonparametrically, the checking of the model (2) needs further development and has additional issues related to the bandwidth. This would need a thorough treatment that is beyond the scope of the present paper.

Finally, we caution that the method proposed here requires large sample sizes to work well as demonstrated in the simulation study. This is the result of β(υ) being treated nonparametrically: the estimation of β(υ) utilizes only the observed failures with marks in a neighborhood of υ. Although this does not cause a problem in our application to the first preventive HIV vaccine trial (which has a sample size of 5403), one needs to be careful in applying the method to situations with small sample sizes.
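The sample-size caution can be illustrated by counting how many observed marks actually contribute to the local estimate at a given υ. The sketch below is illustrative only: it assumes an Epanechnikov kernel on [−1, 1], uses synthetic uniformly distributed marks rather than the trial data, and borrows only the count of 336 observed marks from the application.

```python
import random


def epanechnikov(x: float) -> float:
    """Epanechnikov kernel on [-1, 1] (an assumed choice of K)."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0


def local_failure_summary(marks, v: float, h: float):
    """Return (number of marks with positive kernel weight at v, sum of K((u - v) / h))."""
    weights = [epanechnikov((u - v) / h) for u in marks]
    n_local = sum(1 for w in weights if w > 0.0)
    return n_local, sum(weights)


random.seed(0)
# Synthetic marks on [0, 1]; NOT the trial data, only the same count of observed marks.
marks = [random.random() for _ in range(336)]

for h in (0.05, 0.1, 0.3):
    n_local, weight_sum = local_failure_summary(marks, 0.5, h)
    print(f"h = {h}: {n_local} marks in the window, kernel weight sum = {weight_sum:.1f}")
```

With a narrow bandwidth only a small fraction of the failures inform the estimate at any single mark value, so the effective sample size is far below the number of observed failures.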

Acknowledgments

The authors gratefully acknowledge David Jobes and VaxGen Inc. for providing the HIV sequence data. The authors also thank the Associate Editor and two referees for their valuable comments.

APPENDIX

The following lemma is an extension of Theorem 5.7 of Van der Vaart [20] and will be used to prove the uniform consistency of β̂(υ).

LEMMA A.1

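Let $Q_n(\upsilon,\theta)$ be random functions and let $Q(\upsilon,\theta)$ be a fixed function of $(\upsilon,\theta)\in[a,b]\times\Theta$, $\Theta\subset\mathbb{R}^p$. Let $\beta(\upsilon)$ be a fixed function of $\upsilon\in[a,b]$ taking values in $\Theta$. Assume that $\sup_{\upsilon,\theta}|Q_n(\upsilon,\theta)-Q(\upsilon,\theta)|\xrightarrow{P}0$ and that for every $\varepsilon>0$ there exists an $\eta>0$ such that $\sup_{\|\theta-\beta(\upsilon)\|>\varepsilon}Q(\upsilon,\theta)<Q(\upsilon,\beta(\upsilon))-\eta$ for $\upsilon\in[a,b]$. Then for any sequence of estimators $\hat\beta(\upsilon)$ with $Q_n(\upsilon,\hat\beta(\upsilon))>Q_n(\upsilon,\beta(\upsilon))-o_p(1)$ uniformly in $\upsilon\in[a,b]$, we have $\hat\beta(\upsilon)\xrightarrow{P}\beta(\upsilon)$ uniformly in $\upsilon\in[a,b]$.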

PROOF

For every ε > 0, there exists an η > 0 such that

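$$\Bigl\{\sup_{\upsilon}\|\hat\beta(\upsilon)-\beta(\upsilon)\|>\varepsilon\Bigr\}\subset\bigcup_{\upsilon}\bigl\{\|\hat\beta(\upsilon)-\beta(\upsilon)\|>\varepsilon\bigr\}\subset\bigcup_{\upsilon}\bigl\{Q(\upsilon,\hat\beta(\upsilon))<Q(\upsilon,\beta(\upsilon))-\eta\bigr\}.$$

Since $Q_n(\upsilon,\hat\beta(\upsilon))>Q_n(\upsilon,\beta(\upsilon))-o_p(1)\xrightarrow{P}Q(\upsilon,\beta(\upsilon))$, uniformly in $\upsilon\in[a,b]$, we have $Q_n(\upsilon,\hat\beta(\upsilon))>Q(\upsilon,\beta(\upsilon))-o_p(1)$, uniformly in $\upsilon\in[a,b]$. It follows that

$$\begin{aligned}
\bigcup_{\upsilon}\bigl\{Q(\upsilon,\hat\beta(\upsilon))<Q(\upsilon,\beta(\upsilon))-\eta\bigr\}
&\subset\bigcup_{\upsilon}\bigl\{Q(\upsilon,\hat\beta(\upsilon))<Q_n(\upsilon,\hat\beta(\upsilon))-\eta+o_p(1)\bigr\}\\
&=\Bigl\{\inf_{\upsilon}\bigl(Q(\upsilon,\hat\beta(\upsilon))-Q_n(\upsilon,\hat\beta(\upsilon))\bigr)<-\eta+o_p(1)\Bigr\}\\
&=\Bigl\{\sup_{\upsilon}\bigl(Q_n(\upsilon,\hat\beta(\upsilon))-Q(\upsilon,\hat\beta(\upsilon))\bigr)>\eta-o_p(1)\Bigr\}\\
&\subset\Bigl\{\sup_{\upsilon}\bigl|Q_n(\upsilon,\hat\beta(\upsilon))-Q(\upsilon,\hat\beta(\upsilon))\bigr|>\eta-o_p(1)\Bigr\},
\end{aligned}$$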

whose probability goes to 0 by the uniform convergence of Qn(υ, θ) to Q(υ, θ). Hence P{supυ ‖β̂(υ) − β(υ)‖ > ε} → 0.

The following lemma is used to prove Theorem 3 and Theorem 4. Let $N=\sum_{i=1}^nN_i$ and $M=\sum_{i=1}^nM_i$.

LEMMA A.2

Under conditions (A.1)–(A.4), $n^{-1}N(t,\upsilon)\xrightarrow{P}EN_i(t,\upsilon)$, uniformly in $(t,\upsilon)\in[0,\tau]\times[0,1]$, and $n^{-1/2}M(t,\upsilon)$ converges weakly to a mean-zero continuous Gaussian random field $G(t,\upsilon)$, $(t,\upsilon)\in[0,\tau]\times[0,1]$, with independent increments and $\mathrm{Var}(G(t,\upsilon))=\int_0^t\int_0^\upsilon\lambda_0(s,u)s^{(0)}(s,\beta(u))\,ds\,du$.

PROOF

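We treat $\omega_i=(X_i,\delta_i,V_i)$, $i=1,\dots,n$, as a random sample from a probability distribution $P$ on a measurable space $(\mathcal{X},\mathcal{A})$, with $\mathcal{X}=[0,\infty)\times\{0,1\}\times[0,1]$ and $\mathcal{A}$ its Borel σ-field. Let $\mathcal{F}$ be the class of all indicator functions $f_{t,\upsilon}:\mathcal{X}\to\mathbb{R}$, where $f_{t,\upsilon}(\omega_i)=I([0,t]\times\{1\}\times[0,\upsilon])(\omega_i)=I(X_i\le t,\delta_i=1,V_i\le\upsilon)$, for $0\le t\le\tau$, $0\le\upsilon\le1$. Then $n^{-1}N(t,\upsilon)=n^{-1}\sum_{i=1}^nf_{t,\upsilon}(\omega_i)$. Let $\|f_{t,\upsilon}\|_{P,r}=(P|f_{t,\upsilon}|^r)^{1/r}$ be the $L_r(P)$-norm of $f_{t,\upsilon}$.

Let $0=t_0<t_1<\dots<t_K=\tau$ and $0=\upsilon_0<\upsilon_1<\dots<\upsilon_J=1$ be partitions of the intervals $[0,\tau]$ and $[0,1]$. Define the bracketing functions $l_{kj}=N_i(t_{k-1},\upsilon_{j-1})$ and $u_{kj}=N_i(t_k,\upsilon_j)$, for $k=1,\dots,K$, $j=1,\dots,J$. Then for any $f_{t,\upsilon}$, there is a bracket $[l_{kj},u_{kj}]$ such that $f_{t,\upsilon}\in[l_{kj},u_{kj}]$. Moreover,

$$\begin{aligned}
\|u_{kj}-l_{kj}\|_{P,1}&\le E\bigl(N_i(t_k,\upsilon_j)-N_i(t_{k-1},\upsilon_{j-1})\bigr)\\
&=\int_0^{t_k}\int_0^{\upsilon_j}\lambda_0(s,x)s^{(0)}(s,\beta(x))\,ds\,dx-\int_0^{t_{k-1}}\int_0^{\upsilon_{j-1}}\lambda_0(s,x)s^{(0)}(s,\beta(x))\,ds\,dx\\
&\le\int_{t_{k-1}}^{t_k}\int_0^1\lambda_0(s,x)s^{(0)}(s,\beta(x))\,ds\,dx+\int_0^\tau\int_{\upsilon_{j-1}}^{\upsilon_j}\lambda_0(s,x)s^{(0)}(s,\beta(x))\,ds\,dx\\
&\le C_1(t_k-t_{k-1})+C_2(\upsilon_j-\upsilon_{j-1}),
\end{aligned}$$

where $C_1$ and $C_2$ are some positive constants. For any $\varepsilon>0$, choose the grid points such that $t_k-t_{k-1}<\varepsilon$ and $\upsilon_j-\upsilon_{j-1}<\varepsilon$. Then $\|u_{kj}-l_{kj}\|_{P,1}\le[C_1+C_2]\varepsilon$. Hence, the bracketing number $N_{[\cdot]}(\varepsilon,\mathcal{F},L_1(P))$ is of the polynomial order $(1/\varepsilon)^2$. By the Glivenko–Cantelli theorem (Theorem 19.4 of van der Vaart [20]), $n^{-1}N(t,\upsilon)\xrightarrow{P}EN_i(t,\upsilon)$, uniformly in $(t,\upsilon)\in[0,\tau]\times[0,1]$.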

Next, consider the processes {Mi(t, υ), 0 ≤ t ≤ τ, 0 ≤ υ ≤ 1}, i = 1,…, n, as a random sample from a probability distribution P on a measurable space (𝒳, 𝒜). Let ℱ be the class of coordinate projections ft, υ: 𝒳 → R, where ft, υ(Mi) = Mi(t, υ), for 0 ≤ t ≤ τ, 0 ≤ υ ≤ 1. The process {Mi(t, υ), 0 ≤ t ≤ τ, 0 ≤ υ ≤ 1} is determined by the {Xi, δi, δi Vi, Zi}.

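Again, let $0=t_0<t_1<\dots<t_K=\tau$ and $0=\upsilon_0<\upsilon_1<\dots<\upsilon_J=1$ be the partitions of the intervals $[0,\tau]$ and $[0,1]$. Define the bracketing functions $l_{kj}=N_i(t_{k-1},\upsilon_{j-1})-\int_0^{t_k}\int_0^{\upsilon_j}Y_i(s)\lambda(s,x\mid Z_i(s))\,ds\,dx$ and $u_{kj}=N_i(t_k,\upsilon_j)-\int_0^{t_{k-1}}\int_0^{\upsilon_{j-1}}Y_i(s)\lambda(s,x\mid Z_i(s))\,ds\,dx$, for $k=1,\dots,K$, $j=1,\dots,J$. Then for any $f_{t,\upsilon}$, there is a bracket $[l_{kj},u_{kj}]$ such that $f_{t,\upsilon}\in[l_{kj},u_{kj}]$. The bracket size is

$$\begin{aligned}
\|u_{kj}-l_{kj}\|_{P,2}&\le\bigl\|N_i(t_k,\upsilon_j)-N_i(t_{k-1},\upsilon_{j-1})\bigr\|_{P,2}\\
&\quad+\Bigl\|\int_0^{t_k}\int_0^{\upsilon_j}Y_i(s)\lambda(s,x\mid Z_i(s))\,ds\,dx-\int_0^{t_{k-1}}\int_0^{\upsilon_{j-1}}Y_i(s)\lambda(s,x\mid Z_i(s))\,ds\,dx\Bigr\|_{P,2}\\
&\le\bigl[C_1(t_k-t_{k-1})+C_2(\upsilon_j-\upsilon_{j-1})\bigr]^{1/2},
\end{aligned}$$

where $C_1$ and $C_2$ are some positive constants. For any $\varepsilon>0$, choose the grid points such that $t_k-t_{k-1}<\varepsilon$ and $\upsilon_j-\upsilon_{j-1}<\varepsilon$. Then $\|u_{kj}-l_{kj}\|_{P,2}\le[C_1+C_2]^{1/2}\varepsilon^{1/2}$. Hence, the bracketing number $N_{[\cdot]}(\varepsilon^{1/2},\mathcal{F},L_2(P))$ is of the polynomial order $(1/\varepsilon)^2$. Thus, $N_{[\cdot]}(\varepsilon,\mathcal{F},L_2(P))$ is of the polynomial order $(1/\varepsilon)^4$. So the bracketing integral $J_{[\cdot]}(1,\mathcal{F},L_2(P))<\infty$. By the Donsker theorem (Theorem 19.5 of van der Vaart [20]), $n^{-1/2}M=\{n^{-1/2}\sum_{i=1}^nM_i(t,\upsilon),\ 0\le t\le\tau,\ 0\le\upsilon\le1\}$ converges weakly to a mean-zero Gaussian process $G(t,\upsilon)$, $(t,\upsilon)\in[0,\tau]\times[0,1]$, which can be constructed to have continuous paths by Theorem 18.14 and Lemma 18.15 of van der Vaart [20].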
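The finiteness of the bracketing integral follows from the polynomial bound by a standard calculation. The sketch below assumes that the bracketing number in $L_2(P)$ is at most $C\varepsilon^{-4}$ for some constant $C\ge1$ and all $0<\varepsilon\le1$ (the constant $C$ is introduced here for illustration), and uses the usual definition of the bracketing integral as the integral of the square root of the log bracketing number:

```latex
\begin{aligned}
J_{[\cdot]}(1,\mathcal{F},L_2(P))
 &=\int_0^1\sqrt{\log N_{[\cdot]}(\varepsilon,\mathcal{F},L_2(P))}\,d\varepsilon
  \le\int_0^1\sqrt{\log C+4\log(1/\varepsilon)}\,d\varepsilon\\
 &\le\sqrt{\log C}+2\int_0^1\sqrt{\log(1/\varepsilon)}\,d\varepsilon
  =\sqrt{\log C}+2\,\Gamma(3/2)
  =\sqrt{\log C}+\sqrt{\pi}<\infty .
\end{aligned}
```

The second inequality uses $\sqrt{x+y}\le\sqrt{x}+\sqrt{y}$, and the integral is evaluated with the substitution $\varepsilon=e^{-x}$, which turns it into $\int_0^\infty x^{1/2}e^{-x}\,dx=\Gamma(3/2)=\sqrt{\pi}/2$.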

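Now we show that $G(t,\upsilon)$ has independent increments. Note that for $t_1\le t_2$ and $\upsilon_1\le\upsilon_2$, the covariance of $G(t_1,\upsilon_1)$ and $G(t_2,\upsilon_2)-G(t_1,\upsilon_1)$ is $E\{M_i(t_1,\upsilon_1)(M_i(t_2,\upsilon_2)-M_i(t_1,\upsilon_1))\}$. By Aalen and Johansen [1], $M_i(t,\upsilon_1)$ and $M_i(t,\upsilon_2)-M_i(t,\upsilon_1)$, $0\le t\le\tau$, are orthogonal square integrable martingales for $0\le\upsilon_1\le\upsilon_2\le1$. It follows that

$$\begin{aligned}
E\{M_i(t_1,\upsilon_1)(M_i(t_2,\upsilon_2)-M_i(t_1,\upsilon_1))\}
&=E\{M_i(t_1,\upsilon_1)(M_i(t_2,\upsilon_2)-M_i(t_2,\upsilon_1))\}\\
&\quad+E\{M_i(t_1,\upsilon_1)(M_i(t_2,\upsilon_1)-M_i(t_1,\upsilon_1))\}=0.
\end{aligned}$$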

Hence G(t1, υ1) and G(t2, υ2) − G(t1, υ1) are independent.

PROOF OF THEOREM 1

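It is easy to check that the conditions of Lemma 1 of Sun and Wu [18] are satisfied under Condition A. It follows that $\tilde W_A(\upsilon)$ converges weakly to a vector of continuous mean-zero Gaussian random processes, $W_A(\upsilon)$, $\upsilon\in[a,b]$. Now we show that $W_A(\upsilon)$ has independent increments. Let $w_i(t,\upsilon)=\int_a^\upsilon\int_0^tA(u)[Z_i(t)-s^{(1)}(t,\beta(u))/s^{(0)}(t,\beta(u))]M_i(dt,du)$. Then $\tilde W_A(\upsilon)=n^{-1/2}\sum_{i=1}^nw_i(\tau,\upsilon)$. For $a\le\upsilon_1\le\upsilon_2\le b$, the covariance matrix of $W_A(\upsilon_1)$ and $W_A(\upsilon_2)-W_A(\upsilon_1)$ is equal to $E\{w_i(\tau,\upsilon_1)(w_i(\tau,\upsilon_2)-w_i(\tau,\upsilon_1))^T\}$. Since $M_i(t,\upsilon_1)$ and $M_i(t,\upsilon_2)-M_i(t,\upsilon_1)$, $0\le t\le\tau$, are orthogonal square integrable martingales, it follows that $w_i(t,\upsilon_1)$ and $w_i(t,\upsilon_2)-w_i(t,\upsilon_1)$, $0\le t\le\tau$, are orthogonal square integrable martingales. Hence $E\{w_i(\tau,\upsilon_1)(w_i(\tau,\upsilon_2)-w_i(\tau,\upsilon_1))^T\}=0$. So $W_A(\upsilon)$, $\upsilon\in[a,b]$, is a vector of mean-zero Gaussian random processes with independent increments.

Further, the covariance matrix of $W_A(\upsilon)$ is

$$\begin{aligned}
E\{w_i(\tau,\upsilon)(w_i(\tau,\upsilon))^T\}
&=E\Bigl\{\int_a^\upsilon\int_0^\tau A(u)\Bigl[Z_i(t)-\frac{s^{(1)}(t,\beta(u))}{s^{(0)}(t,\beta(u))}\Bigr]^{\otimes2}A(u)^T\,N_i(dt,du)\Bigr\}\\
&=E\Bigl\{\int_a^\upsilon\int_0^\tau A(u)\Bigl[Z_i(t)-\frac{s^{(1)}(t,\beta(u))}{s^{(0)}(t,\beta(u))}\Bigr]^{\otimes2}A(u)^T\,y(t\mid Z_i(t))\lambda(t,u\mid Z_i(t))\,dt\,du\Bigr\}\\
&=\int_a^\upsilon A(u)\,E\Bigl\{\int_0^\tau\Bigl[Z_i(t)-\frac{s^{(1)}(t,\beta(u))}{s^{(0)}(t,\beta(u))}\Bigr]^{\otimes2}y(t\mid Z_i(t))\lambda(t,u\mid Z_i(t))\,dt\Bigr\}A(u)^T\,du\\
&=\int_a^\upsilon A(u)\Sigma(u)A(u)^T\,du.
\end{aligned}$$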

This completes the proof of Theorem 1.

PROOF OF THEOREM 2

We shall prove Theorem 2 by verifying the conditions of Lemma A.1.

Let

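$$\begin{aligned}
\eta_n(\upsilon,\theta)&=n^{-1}\sum_{i=1}^n\int_0^\upsilon\int_0^\tau\bigl[\theta^TZ_i(t)-\log(S^{(0)}(t,\theta))\bigr]N_i(dt,du),\\
\xi_n(\upsilon,\theta)&=n^{-1}\sum_{i=1}^n\int_0^\upsilon\int_0^\tau\bigl[\theta^TZ_i(t)-\log(s^{(0)}(t,\theta))\bigr]N_i(dt,du),\\
Q_n(\upsilon,\theta)&=n^{-1}l(\upsilon,\theta)+n^{-1}\log n\int_0^1K_h(u-\upsilon)N(\tau,du).
\end{aligned}$$

Then by Condition A, $\eta_n(\upsilon,\theta)=\xi_n(\upsilon,\theta)+O_p(n^{-1/2})$ and

$$Q_n(\upsilon,\theta)=\int_0^1K_h(u-\upsilon)\,\eta_n(du,\theta)=\int_0^1K_h(u-\upsilon)\,\xi_n(du,\theta)+O_p(n^{-1/2}h^{-1}),$$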

uniformly in (υ, θ) ∈ [0, 1] × [−M, M], for M > 0. By application of the Glivenko–Cantelli and Donsker theorems, similarly to the proofs of Lemma A.2 and Theorem 1, ξn(υ, θ) = ξ(υ, θ) + Op(n−1/2), uniformly in (υ, θ) ∈ [0, 1] × [−M, M], with

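$$\xi(\upsilon,\theta)=E\Bigl[\int_0^\upsilon\int_0^\tau\bigl[\theta^TZ_i(t)-\log(s^{(0)}(t,\theta))\bigr]N_i(dt,du)\Bigr].$$

It follows that $Q_n(\upsilon,\theta)=Q(\upsilon,\theta)+O_p(n^{-1/2}h^{-1})$, uniformly in $(\upsilon,\theta)\in[a,b]\times[-M,M]$, where

$$Q(\upsilon,\theta)=E\Bigl[\int_0^\tau\bigl[\theta^TZ_i(t)-\log(s^{(0)}(t,\theta))\bigr]\lambda_0(t,\upsilon)\exp(\beta^T(\upsilon)Z_i(t))Y_i(t)\,dt\Bigr].$$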

Now we show that β(υ) is the well-separated point of maximum of Q(υ, θ) for υ ∈ [0, 1]. Note that

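$$\begin{aligned}
\partial Q(\upsilon,\theta)/\partial\theta&=E\Bigl[\int_0^\tau\Bigl[Z_i(t)-\frac{s^{(1)}(t,\theta)}{s^{(0)}(t,\theta)}\Bigr]\lambda_0(t,\upsilon)\exp(\beta^T(\upsilon)Z_i(t))Y_i(t)\,dt\Bigr],\\
\partial^2Q(\upsilon,\theta)/\partial\theta^2&=-E\Bigl[\int_0^\tau\Bigl\{\frac{s^{(2)}(t,\theta)}{s^{(0)}(t,\theta)}-\Bigl(\frac{s^{(1)}(t,\theta)}{s^{(0)}(t,\theta)}\Bigr)^{\otimes2}\Bigr\}\lambda_0(t,\upsilon)\exp(\beta^T(\upsilon)Z_i(t))Y_i(t)\,dt\Bigr].
\end{aligned}$$

We have $\partial Q(\upsilon,\beta(\upsilon))/\partial\theta=0$, and for every $\varepsilon>0$ there exists an $\eta>0$ such that $\sup_{\|\theta-\beta(\upsilon)\|>\varepsilon}Q(\upsilon,\theta)<Q(\upsilon,\beta(\upsilon))-\eta$ for $\upsilon\in[a,b]$, under condition (A.3), by Taylor expansion and continuity. Further, since $Q_n(\upsilon,\theta)\xrightarrow{P}Q(\upsilon,\theta)$, $\partial Q_n(\upsilon,\theta)/\partial\theta\xrightarrow{P}\partial Q(\upsilon,\theta)/\partial\theta$ and $\partial^2Q_n(\upsilon,\theta)/\partial\theta^2\xrightarrow{P}\partial^2Q(\upsilon,\theta)/\partial\theta^2$ uniformly in $(\upsilon,\theta)\in[a,b]\times[-M,M]$, and $\beta(\upsilon)$ is bounded in absolute value by a constant strictly smaller than $M$ for $a\le\upsilon\le b$, we have that for every $\alpha>0$ there exists an $n_0$ such that $P(-M\le\hat\beta(\upsilon)\le M,\ a\le\upsilon\le b)>1-\alpha$ for $n\ge n_0$.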

Therefore, for every ε > 0,

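$$\begin{aligned}
P\Bigl(\sup_{a\le\upsilon\le b}\|\hat\beta(\upsilon)-\beta(\upsilon)\|>\varepsilon\Bigr)
&\le\alpha+P\Bigl(\sup_{a\le\upsilon\le b}\|\hat\beta(\upsilon)-\beta(\upsilon)\|>\varepsilon,\ -M\le\hat\beta(\upsilon)\le M,\ a\le\upsilon\le b\Bigr)\\
&\to\alpha
\end{aligned}$$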

as n → ∞, by the previous checking of the conditions of Lemma A.1 together with Qn(υ, β̂(υ)) ≥ Qn(υ, β(υ)). Since α is arbitrary, we have P(supa≤υ≤b ‖β̂(υ) − β(υ)‖ > ε) → 0.

PROOF OF THEOREM 3

In the proof of this theorem, we set β = β(υ) for simplicity. Note that under Condition A, using a second-order Taylor expansion for λ(t, u|Zi(t)) in the neighborhood of υ, we have

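$$n^{-1/2}\Bigl|\sum_{i=1}^n\int_0^1\int_0^\tau K_h(u-\upsilon)\Bigl[Z_i(t)-\frac{S^{(1)}(t,\beta)}{S^{(0)}(t,\beta)}\Bigr]Y_i(t)\bigl[\lambda(t,\upsilon\mid Z_i(t))-\lambda(t,u\mid Z_i(t))\bigr]\,dt\,du\Bigr|=O_p(n^{1/2}h^2),$$

uniformly in $\upsilon\in[0,1]$. It follows that

$$\begin{aligned}
n^{-1/2}U(\upsilon,\beta)
&=n^{-1/2}\sum_{i=1}^n\int_0^1\int_0^\tau K_h(u-\upsilon)\Bigl[Z_i(t)-\frac{S^{(1)}(t,\beta)}{S^{(0)}(t,\beta)}\Bigr]\bigl[N_i(dt,du)-Y_i(t)\lambda(t,\upsilon\mid Z_i(t))\,dt\,du\bigr]\\
&=n^{-1/2}\sum_{i=1}^n\int_0^1\int_0^\tau K_h(u-\upsilon)\Bigl[Z_i(t)-\frac{S^{(1)}(t,\beta)}{S^{(0)}(t,\beta)}\Bigr]M_i(dt,du)+O_p(n^{1/2}h^2),
\end{aligned}$$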

uniformly in υ ∈ [0, 1].

Next, we show that for each υ, n−1/2 h1/2U(υ, β) converges weakly to a normal distribution. By Lemma A.2, n−1/2 M(t, υ) converges weakly to a mean-zero Gaussian process. By Condition A, ‖S(j)(t, β) − s(j)(t, β)‖ = op(n−1/2+δ), uniformly in t for j = 0, 1, for 0 < δ < 1/2. Note that n−1/2+δ h−1/2 = o(1) for δ = 1/4 as nh2 → ∞. We have h1/2 Kh(u − υ) ‖S(j)(t, β) − s(j)(t, β)‖ goes in probability to zero. Applying Lemma 2 of Gilbert, McKeague and Sun [6], we have

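$$\begin{aligned}
n^{-1/2}h^{1/2}U(\beta(\upsilon))
&=n^{-1/2}h^{1/2}\sum_{i=1}^n\int_0^1\int_0^\tau K_h(u-\upsilon)\Bigl[Z_i(t)-\frac{s^{(1)}(t,\beta)}{s^{(0)}(t,\beta)}\Bigr]M_i(dt,du)+O_p(n^{1/2}h^{5/2})+o_p(1)\\
&=n^{-1/2}h^{1/2}\sum_{i=1}^n\int_0^1\int_0^\tau K_h(u-\upsilon)\Bigl[Z_i(t)-\frac{s^{(1)}(t,\beta(u))}{s^{(0)}(t,\beta(u))}\Bigr]M_i(dt,du)+O_p(n^{1/2}h^{5/2})+o_p(1)\\
&=h^{1/2}\int_0^1K_h(u-\upsilon)\,\tilde W_I(du)+O_p(n^{1/2}h^{5/2})+o_p(1),
\end{aligned} \tag{16}$$

where $\tilde W_I(\upsilon)$ is defined in (6) with $A=I$ and $a=0$.

Since $\tilde W_I(\upsilon)\xrightarrow{\mathcal{D}}W_I(\upsilon)$ by Theorem 1, by the almost sure representation theorem ([17], page 47), there exist $\tilde W_I^*(\upsilon)$ and $W_I^*(\upsilon)$ on some probability space that have the same distributions and sample paths as $\tilde W_I(\upsilon)$ and $W_I(\upsilon)$, respectively, such that $\tilde W_I^*(\upsilon)\xrightarrow{a.s.}W_I^*(\upsilon)$ uniformly in $\upsilon\in[0,1]$. Hence $\int_0^1K_h(u-\upsilon)\,\tilde W_I^*(du)=\int_0^1K_h(u-\upsilon)\,W_I^*(du)+O_p(n^{-1/2}h^{-1})$ by integration by parts since $K(\cdot)$ has bounded variation. It follows that

$$h^{1/2}\int_0^1K_h(u-\upsilon)\,\tilde W_I(du)\overset{\mathcal{D}}{=}h^{1/2}\int_0^1K_h(u-\upsilon)\,\tilde W_I^*(du)=h^{1/2}\int_0^1K_h(u-\upsilon)\,W_I^*(du)+O_p(n^{-1/2}h^{-1/2}).$$

Since $W_I^*(\upsilon)$ is a Gaussian martingale with covariance matrix $\int_0^\upsilon\Sigma(u)\,du$, the integral $h^{1/2}\int_0^1K_h(u-\upsilon)\,W_I^*(du)$ is a mean-zero Gaussian random vector with covariance matrix equal to $h\int_0^1K_h^2(u-\upsilon)\Sigma(u)\,du\to\nu_0\Sigma(\upsilon)$ as $h\to0$. Hence, $h^{1/2}\int_0^1K_h(u-\upsilon)\,\tilde W_I(du)\xrightarrow{\mathcal{D}}N(0,\nu_0\Sigma(\upsilon))$ as $h\to0$, $nh\to\infty$. By the Slutsky theorem, $n^{-1/2}h^{1/2}U(\upsilon,\beta)$ converges weakly to $N(0,\nu_0\Sigma(\upsilon))$ as $nh^2\to\infty$ and $nh^5\to0$.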
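The kernel constant in this limit can be checked numerically in the scalar case Σ(u) ≡ 1. The sketch below assumes an Epanechnikov kernel on [−1, 1], the usual scaling K_h(x) = K(x/h)/h, and ν0 = ∫K²(x)dx, which equals 0.6 for that kernel; these conventions are assumptions, since they are not restated in this part of the paper.

```python
def epanechnikov(x: float) -> float:
    """Epanechnikov kernel on [-1, 1] (an assumed choice of K)."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0


def scaled_kernel_energy(v: float, h: float, n_grid: int = 20_000) -> float:
    """h * integral over [0, 1] of K_h(u - v)^2 du, with K_h(x) = K(x / h) / h (midpoint rule)."""
    step = 1.0 / n_grid
    total = 0.0
    for k in range(n_grid):
        u = (k + 0.5) * step
        kh = epanechnikov((u - v) / h) / h
        total += kh * kh
    return h * total * step


# For the Epanechnikov kernel, the integral of K(x)^2 over [-1, 1] is 0.75^2 * 16/15 = 0.6.
NU0 = 0.6

for h in (0.15, 0.1, 0.05):
    interior = scaled_kernel_energy(0.5, h)   # kernel window lies inside [0, 1]
    boundary = scaled_kernel_energy(0.0, h)   # half of the window falls outside [0, 1]
    print(f"h = {h}: interior {interior:.4f}, boundary {boundary:.4f}, nu0 = {NU0}")
```

At an interior point the scaled integral matches ν0, while at the boundary υ = 0 only half of the kernel mass is available. This illustrates why the asymptotic results are stated for υ in [a, b] ⊂ (0, 1).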

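Note that $U(\hat\beta(\upsilon))-U(\beta(\upsilon))=l_\beta(\upsilon,\beta^*(\upsilon))(\hat\beta(\upsilon)-\beta(\upsilon))$, where $\beta^*(\upsilon)$ is on the line segment between $\hat\beta(\upsilon)$ and $\beta(\upsilon)$. By Condition A and the uniform consistency of $\hat\beta(\upsilon)$ on $\upsilon\in[a,b]\subset(0,1)$, we have $n^{-1}l_\beta(\upsilon,\beta^*(\upsilon))=-\Sigma(\upsilon)+o_p(1)$, uniformly in $\upsilon\in[a,b]$ for $0<\delta<1/2$. Hence,

$$n^{1/2}h^{1/2}(\hat\beta(\upsilon)-\beta(\upsilon))=-\bigl(l_\beta(\upsilon,\beta^*(\upsilon))/n\bigr)^{-1}n^{-1/2}h^{1/2}U(\beta(\upsilon))=(\Sigma(\upsilon))^{-1}n^{-1/2}h^{1/2}U(\beta(\upsilon))+o_p(1), \tag{17}$$

uniformly in $\upsilon\in[a,b]$. It follows that $(nh)^{1/2}(\hat\beta(\upsilon)-\beta(\upsilon))\xrightarrow{\mathcal{D}}N(0,\nu_0\Sigma(\upsilon)^{-1})$ as $nh^2\to\infty$ and $nh^5\to0$.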

PROOF OF THEOREM 4

From (16) and the first line of (17), we have, for υ ∈ [a, b],

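$$\int_a^\upsilon n^{1/2}(\hat\beta(u)-\beta(u))\,du=\int_a^\upsilon(\Sigma(u))^{-1}\int_0^1K_h(x-u)\,\tilde W_I(dx)\,du+o_p(1).$$

Exchanging the order of integration and by the compact support of the kernel function $K(\cdot)$ on $[-1,1]$, we have

$$\begin{aligned}
\int_a^\upsilon n^{1/2}(\hat\beta(u)-\beta(u))\,du
&=\int_0^1\Bigl[\int_a^\upsilon(\Sigma(u))^{-1}K_h(x-u)\,du\Bigr]\tilde W_I(dx)+o_p(1)\\
&=\int_{a+h}^{\upsilon-h}\Bigl[\int_a^\upsilon(\Sigma(u))^{-1}K_h(x-u)\,du\Bigr]\tilde W_I(dx)\\
&\quad+\int_{a-h}^{a+h}\Bigl[\int_a^\upsilon(\Sigma(u))^{-1}K_h(x-u)\,du\Bigr]\tilde W_I(dx)\\
&\quad+\int_{\upsilon-h}^{\upsilon+h}\Bigl[\int_a^\upsilon(\Sigma(u))^{-1}K_h(x-u)\,du\Bigr]\tilde W_I(dx)+o_p(1).
\end{aligned} \tag{18}$$

By Theorem 1, the process $\tilde W_I(x)$ converges weakly to a mean-zero Gaussian process with continuous paths. Under the assumption (A.4), $\int_a^\upsilon(\Sigma(u))^{-1}K_h(x-u)\,du$ has bounded variation and converges uniformly to $\Sigma(x)^{-1}$ for $x\in(a+h,\upsilon-h)$. By Lemma 2 of Gilbert, McKeague and Sun [6], the first term in (18) is $\int_a^\upsilon(\Sigma(x))^{-1}\,\tilde W_I(dx)+o_p(1)$. Similar arguments show that the second and the third terms in (18) are $o_p(1)$. Hence

$$\int_a^\upsilon n^{1/2}(\hat\beta(u)-\beta(u))\,du=\int_a^\upsilon(\Sigma(x))^{-1}\,\tilde W_I(dx)+o_p(1)=\tilde W_{\Sigma^{-1}}(\upsilon)+o_p(1),$$

which converges weakly to a p-dimensional mean-zero Gaussian martingale, $W_{\Sigma^{-1}}(\upsilon)$, with continuous paths. The covariance matrix of $W_{\Sigma^{-1}}(\upsilon)$ equals $\mathrm{Cov}(W_{\Sigma^{-1}}(\upsilon))=\int_a^\upsilon\Sigma(u)^{-1}\Sigma(u)\Sigma(u)^{-1}\,du=\int_a^\upsilon\Sigma(u)^{-1}\,du$.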

Footnotes

1

Supported in part by NSF Grant DMS-06-4576, NIH Grant 2 RO1 AI054165-04 and funds provided by the University of North Carolina at Charlotte.

2

Supported in part by NIH Grant 2 RO1 AI054165-04.

3

Supported in part by NSF Grant DMS-0505201.

AMS 2000 subject classifications. Primary 62N01; secondary 62N02, 62N03, 62G20.

REFERENCES

  • 1. Aalen OO, Johansen S. An empirical transition matrix for non-homogeneous Markov chains based on censored observations. Scand. J. Statist. 1978;5:141–150. MR0509450.
  • 2. Brémaud P. Point Processes and Queues: Martingale Dynamics. New York: Springer; 1981. MR0636252.
  • 3. Cai Z, Sun Y. Local linear estimation for time-dependent coefficients in Cox’s regression models. Scand. J. Statist. 2003;30:93–111. MR1963895.
  • 4. Flynn NM, Forthal DN, Harro CD, Judson FN, Mayer KH, Para MF, Gilbert PB, The RGP120 HIV Vaccine Study Group. Placebo-controlled phase 3 trial of recombinant glycoprotein 120 vaccine to prevent HIV-1 infection. J. Infectious Diseases. 2005;191:654–665. doi: 10.1086/428404.
  • 5. Gilbert PB, McKeague IW, Sun Y. Tests for comparing mark-specific hazards and cumulative incidence functions. Lifetime Data Anal. 2004;10:5–28. doi: 10.1023/b:lida.0000019253.69537.91. MR2058572.
  • 6. Gilbert PB, McKeague IW, Sun Y. The two-sample problem for failure rates depending on a continuous mark: An application to vaccine efficacy. Biostatistics. 2007. doi: 10.1093/biostatistics/kxm028. To appear.
  • 7. Graham BS. Clinical trials of HIV vaccines. Annual Review of Medicine. 2002;53:207–221. doi: 10.1146/annurev.med.53.082901.104035.
  • 8. Huang Y, Louis TA. Nonparametric estimation of the joint distribution of survival time and mark variables. Biometrika. 1998;85:785–798. MR1666750.
  • 9. Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. New York: Wiley; 1980. MR0570114.
  • 10. Lin DY, Wei LJ, Ying Z. Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika. 1993;80:557–572. MR1248021.
  • 11. Martinussen T, Scheike TH. Dynamic Regression Models for Survival Data. New York: Springer; 2006. MR2214443.
  • 12. Nabel GJ. Challenges and opportunities for development of an AIDS vaccine. Nature. 2001;410:1002–1007. doi: 10.1038/35073500.
  • 13. Nickle DC, Heath L, Jensen MA, Gilbert PB, Kosakovsky Pond SLK, Mullins JI. Amino acid substitution matrices for HIV-1 subtype B. Technical report. Univ. Washington; 2005.
  • 14. Olschewski M, Schumacher M. Statistical analysis of quality of life in cancer clinical trials. Statistics in Medicine. 1990;9:749–763. doi: 10.1002/sim.4780090705.
  • 15. Prentice RL, Kalbfleisch JD, Peterson AV, Flournoy N, Farewell VT, Breslow NE. The analysis of failure time in the presence of competing risks. Biometrics. 1978;34:541–554.
  • 16. Schumacher M. Two-sample tests of Cramér–von Mises and Kolmogorov–Smirnov type for randomly censored data. Internat. Statist. Rev. 1984;52:263–281. MR0867175.
  • 17. Shorack GR, Wellner JA. Empirical Processes with Applications to Statistics. New York: Wiley; 1986. MR0838963.
  • 18. Sun Y, Wu H. Semiparametric time-varying coefficients regression model for longitudinal data. Scand. J. Statist. 2005;32:21–47. MR2136800.
  • 19. Tsiatis AA. A nonidentifiability aspect of the problem of competing risks. Proc. Natl. Acad. Sci. USA. 1975;72:20–22. doi: 10.1073/pnas.72.1.20. MR0356425.
  • 20. Van Der Vaart AW. Asymptotic Statistics. Cambridge Univ. Press; 1998. MR1652247.
  • 21. UNAIDS. Joint United Nations Programme for HIV/AIDS. 2004 AIDS Epidemic Update.
  • 22. Wyatt R, Kwong PD, Desjardins E, Sweet RW, Robinson J, Hendrickson WA, Sodroski JG. The antigenic structure of the HIV gp120 envelope glycoprotein. Nature. 1998;393:705–711. doi: 10.1038/31514.
