Abstract
Deployment of the recently licensed CYD-TDV dengue vaccine requires understanding of how the risk of dengue disease in vaccine recipients depends jointly on a host biomarker measured after vaccination (neutralization titer – NAb) and on a “mark” feature of the dengue disease failure event (the amino acid sequence distance of the dengue virus to the dengue sequence represented in the vaccine). The CYD14 phase 3 trial of CYD-TDV measured NAb via case-cohort sampling and the mark in dengue disease failure events, with about a third of the marks missing. We addressed the question of interest by developing inferential procedures for the stratified mark-specific proportional hazards model with missing covariates and missing marks. Two hybrid approaches are investigated that leverage both augmented inverse probability weighting and nearest neighborhood hot deck (NNHD) multiple imputation. The two approaches differ in how the imputed marks are pooled in estimation. Our investigation shows that NNHD imputation can lead to biased estimation without a properly selected neighborhood. Simulations show that the developed hybrid methods perform well with unbiased NNHD imputations from proper neighborhood selection. The new methods applied to CYD14 show that NAb is strongly inversely associated with risk of dengue disease in vaccine recipients, more strongly against dengue viruses with shorter distances.
Keywords: Augmented inverse probability weighting, competing risks failure time, missing marks, nearest neighborhood hot deck multiple imputation, semiparametric regression, two-phase sampling
1. Introduction
The CYD14 phase 3 trial randomized 2- to 14-year-old children within five countries of southeast Asia in 2:1 allocation to receive the CYD-TDV vaccine or placebo in three injections at months 0, 6, and 12, where the CYD-TDV vaccine (Sanofi Pasteur) is a recombinant, live-attenuated, tetravalent vaccine containing one representative dengue strain from each of the four dengue serotypes (Capeding et al., 2014). Participants underwent active surveillance for the primary study endpoint, symptomatic virologically confirmed dengue (henceforth “dengue disease”), between Month 13 and Month 25 post first vaccination. Partly based on this trial, which showed that the rate of dengue disease was an estimated 56% lower in the vaccine group than in the placebo group (p < 0.001), this vaccine has been licensed in more than a dozen countries. The vaccine has been thought to work by inducing anti-dengue neutralizing antibodies (NAbs).
We develop statistical methods to analyze the CYD14 efficacy trial data that are appropriate for interrogating how the association of anti-dengue NAb with dengue disease risk may differ depending on the amino acid (AA) sequence of the dengue virus causing the study endpoint, accounting for the fact that the expensive covariate of interest (NAb titer) was measured through a classic case-cohort sampling design (measured from a Bernoulli simple random sample of 19.5% of participants at enrollment and from all disease cases) and that there is a substantial percentage (about a third) of missing dengue sequences among cases. This integrated assessment of how a host biomarker and a “mark” feature of the failure event relate to failure risk has many applications, including general prospective studies that follow a cohort for acquisition of a genetically-diverse infectious disease, encompassing many pathogens including HIV-1, influenza, and malaria.
To define the general statistical problem, let T be the time to a failure event of interest, and Z be a time-independent p-dimensional covariate. Under the competing risks model, a cause-of-failure mark V is observed when a failure event occurs. Let V be a continuous mark variable with bounded support [0, 1]. The mark-specific failure time data follow a competing risks model where the mark variable V plays the role of cause-of-failure that is only observable upon failure. In the motivating dengue vaccine study, the mark V measures the AA sequence distance of a dengue-disease causing dengue sequence to the nearest dengue sequence inside the vaccine, which can only be observed in subjects experiencing the dengue disease endpoint and is not available nor meaningfully defined in subjects without the endpoint.
The mark-specific hazard function, defined as $\lambda(t, v) = \lim_{h_1, h_2 \to 0} P\{T \in [t, t + h_1), V \in [v, v + h_2) \mid T \ge t\}/(h_1 h_2)$, was studied by Gilbert et al. (2004). It measures the instantaneous risk of failure by a mark in the presence of all marks, e.g., dengue sequences circulating in the efficacy trial region exposing trial participants through mosquito bites, and can be considered as an extension of the cause-specific hazard function to a continuous mark. Subsequent statistical methods have been developed to model the conditional mark-specific hazard function with applications to HIV vaccine efficacy studies; see Sun et al. (2009), Sun and Gilbert (2012), Juraska and Gilbert (2013, 2016) and Yang et al. (2017).
Suppose that the population of interest includes K subpopulations or strata, each with different baseline mark-specific hazard functions. Let λk(t, v|z) be the conditional mark-specific hazard function at (T, V) = (t, v) given covariate Z = z for an individual in the kth stratum. The stratified mark-specific proportional hazards (PH) model postulates that
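$$\lambda_k(t, v \mid z) = \lambda_{0k}(t, v)\exp\{\beta(v)^{\mathsf T} z\}, \quad k = 1, \ldots, K, \tag{1}$$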
where λ0k(t, v) = λk(t, v|z = 0) is the unspecified baseline hazard function for the kth stratum, and β(v) is the p-dimensional unknown regression coefficient function of v. Model (1) allows different baseline functions for different strata.
The mark-specific PH model (1) was first studied by Sun et al. (2009) under K = 1 with the objective of evaluating mark-specific HIV vaccine efficacy, where the mark is an amino acid sequence distance of an infecting HIV strain to an HIV strain inside the vaccine. The model (1) was further studied by Sun and Gilbert (2012), Gilbert and Sun (2015) and Juraska and Gilbert (2016) for the situation where the marks are subject to missingness in subjects with observed failure times. Yang et al. (2017) investigated model (1) under two-phase sampling of components of Z allowing some participants to have missing covariates. However, the methods accounting for missing marks assumed complete measurements of all covariates, and the methods accounting for missing covariates assumed complete data on the marks of failures. Therefore, new methods are needed to account for both types of missing data.
In the motivating CYD14 efficacy trial, there are two types of missing data. The covariate NAb titer is missing through a case-cohort sampling design and the mark V (dengue sequence distance) is missing for some cases. Multiple imputation has been widely used for handling missing data, cf. Rubin (1987). Two-phase sampling/case-cohort designs are common forms of studies with missing covariates, where covariates are divided into phase one or phase two, with the former measured in all enrolled subjects and the latter measured only in a subset, typically because of expense of measurement. A “case-cohort” design typically refers to randomly sampling subjects at enrollment into a subcohort for measuring the phase-two covariates, which are also measured in all subjects outside the subcohort who experience the failure event and have the requisite samples available (White, 1982; Prentice, 1986; Breslow and Lumley, 2013). “Two-phase sampling” typically refers to the generalization of outcome-dependent case-control sampling, where within each cell of a 2 × K table defined by outcome status cross-classified with the K levels of a discrete phase-one covariate, subjects are randomly sampled for measuring the phase-two covariates (Breslow et al., 2009). These designs can be implemented with Bernoulli or without-replacement sampling, and our methods apply to any of the Bernoulli sampling versions. As is the usual case, application of the methods to the without-replacement sampling versions provides approximately correct results, with inferences tending to be slightly conservative. There is extensive literature on statistical methods for two-phase sampling/case-cohort designs, e.g. Prentice (1986); Robins et al. (1994); Borgan et al. (2000); Scheike and Martinussen (2004); Kulich and Lin (2004); Nan (2004); Breslow et al. (2009); Breslow and Lumley (2013).
Nearest neighborhood imputation is one of the hot deck imputation methods commonly used in survey sampling (Sedransk, 1985; Kovar et al., 1988; Jonsson and Wohlin, 2004; Andridge and Little, 2010). The idea of nearest neighborhood hot deck (NNHD) imputation is to replace each missing value with an observed response from a matching subject from the same dataset. The proposed hybrid approach leverages both augmented inverse probability weighted complete-case (AIPW) estimation (Robins et al., 1994) to handle the two-phase sampled covariates and NNHD imputation to fill in missing marks in failure cases (Chen and Shao, 2000; Beretta and Santaniello, 2016). AIPW estimation possesses a double robust property, yielding consistent estimates if either the model for whether phase-two covariates are missing, or the model for the conditional expectations of phase-two covariates, is correctly specified (Robins et al., 1994; Gao and Tsiatis, 2005). Most imputation methods assume a parametric model for the variable to be imputed. In contrast, as a nonparametric technique, NNHD imputation does not rely on model fitting for the variable to be imputed, and thus is potentially less sensitive to model misspecification than a parametric-model based imputation method. However, our investigation shows that NNHD imputation can lead to biased estimation without proper neighborhood selection.
We develop hybrid estimation and hypothesis testing procedures for model (1) that use both AIPW estimation and NNHD imputation. We investigate the neighborhood selection for the NNHD imputation for unbiased estimation. The NNHD imputation is employed to impute the values of missing marks, followed by completed-marks two-phase sampling data analysis with an AIPW method similar to that of Yang et al. (2017) that did not account for missing marks. We investigate two hybrid estimation methods using the completed-marks two-phase sampling data that differ in the way in which the imputed marks are pooled in estimation. We develop hypothesis testing procedures to evaluate whether the mark-specific hazard ratios are unity and whether they change with the mark. The main contribution of this paper is the development of hybrid estimation and hypothesis testing methods for model (1) that relates the hazard of an outcome to both covariates and marks, accounting for missingness in both, including the investigation of neighborhood selection for the NNHD imputation of marks to achieve valid inference on the association parameters. The developed procedures enable assessment of whether and how the hazard rate of an infectious disease with a pathogen genetically close to or far from a reference genetic sequence are modified by participant covariates. This application is exemplified by the dengue vaccine efficacy trial, with reference sequence the closest dengue strain in the vaccine construct and covariates age and immune response to the dengue vaccine strains.
In Section 2, we formulate the missing data problem, presenting notations and assumptions. The NNHD imputation technique is introduced in Section 2.1. The two hybrid estimation procedures are developed in Section 2.2. Techniques for estimation of the mark-specific cumulative incidence function rate are given in Section 2.3. Statistical procedures for hypothesis testing of the mark-specific hazard ratios are developed in Section 3. An extensive simulation study is conducted in Section 4 to examine the performances of the newly proposed methods, which are applied to the CYD14 data in Section 5. Some concluding remarks are given in Section 6. Additional discussions about the proposed hybrid methods along with more simulation results, analysis of the simulated data based on the CYD14 efficacy trial and additional analysis of the CYD14 efficacy trial are presented in the Supplementary Materials.
2. Hybrid estimation using AIPW and NNHD multiple imputations
The AIPW estimation method was proposed by Robins, Rotnitzky and Zhao (1994) for missing data to improve robustness and efficiency over simple inverse probability weighted estimators. This important methodology has been widely used, and shown efficiency and the double robust property in many studies, cf., Gao and Tsiatis (2005), Sun and Gilbert (2012), Yang et al. (2017) and Sun et al. (2018), among others. We investigate two hybrid methods of estimation of the mark-specific proportional hazards model that use both AIPW and NNHD imputation. We propose employing NNHD to impute the values of missing marks, followed by completed-marks two-phase sampling data analysis with an AIPW method similar to that of Yang et al. (2017) that did not account for missing marks. The first approach follows the standard multiple imputation scheme of Rubin (1987) while the second approach incorporates multiple imputations in estimating equations (MIEE).
Suppose the failure time T is subject to right censoring, and is partially observed through observation of X = min{T, C*} and δ = I(T ≤ C*), where I(·) is the indicator function and C* = min(C, τ) is the right censoring time with τ the end of follow-up and C the right censoring random variable. Let Z be a time-independent covariate vector. We assume independent censoring, that is, C is independent of (T, V) conditional on Z. Suppose that Z = (Z1, Z2) consists of two parts: Z1 is observed in all subjects (phase one) and Z2 is measured only in a subset (the phase-two sample). In addition, the mark variable V is subject to missingness. Let ξ = (ξz, ξv) be the vector of missing data indicators, where ξz is the indicator for whether a subject has complete covariate information, and ξv is the indicator for whether the mark variable V is observed. We set ξv = 1 if δ = 0, since the mark V is then inherently not available and is not considered as missing. We also set ξv = 1 if δ = 1 and V is observed; otherwise ξv = 0. Let A = (Az, Av) be auxiliary variables, with Az the auxiliary variable predictive of phase-two covariates and Av the auxiliary variable predictive of missing marks. For convenience, we denote Ω = (X, Z1, A); the observed data then consist of Ω, δ, ξ, the value of Z2 when ξz = 1, and the value of V when δ = 1 and ξv = 1.
We assume that Z2 and V are missing at random (Rubin, 1976), satisfying MAR:
(i) P(ξz = 1|X, Z1, Z2, A, δV, δ) = P(ξz = 1|X, Z1, Az, δ),
(ii) P(ξv = 1|X, Z1, Z2, A, δV, δ = 1) = P(ξv = 1|X, Z1, Z2, Av, δ = 1), and
(iii) P(ξz = 0, ξv = 0|X, Z1, Z2, A, δV, δ) = 0.
MAR (i) assumes that the missingness of Z2 does not depend on the values of Z2 and δV; MAR (ii) assumes that the missingness of V does not depend on the value of V; and MAR (iii) implies that Z2 and V do not have missing values on the same subjects, which is always satisfied under Prentice’s (1986) original case-cohort sampling design for which no cases have missing Z2 values. It is approximately satisfied for implemented case-cohort sampling designs (including our example) that intend to measure Z2 in all cases but end up with a small number of happenstance missing values.
Suppose there are K strata. Let nk be the number of subjects in the kth stratum and $n = \sum_{k=1}^{K} n_k$ the total number of subjects. We label the ith subject in the kth stratum with a pair of subscripts {ki}. Let Z1,ki and Z2,ki be copies of covariates Z1 and Z2 for subject i in stratum k, respectively. Similarly, ξz,ki and ξv,ki are copies of ξz and ξv, respectively. Let Zki = (Z1,ki, Z2,ki), ξki = (ξz,ki, ξv,ki), and Ωki = (Xki, Z1,ki, Aki). The observed data are Ωki, δki, ξki, the value of Z2,ki when ξz,ki = 1, and the value of Vki when δki = 1 and ξv,ki = 1, for i = 1, …, nk, k = 1, …, K. We assume that {Tki, Cki, Vki, Zki, ξki, Aki; i = 1, …, nk} are independent identically distributed (i.i.d.) replicates of (T, C, V, Z, ξ, A) from stratum k, k = 1, …, K.
2.1. Nearest neighborhood hot deck imputation of missing marks
In the competing risks setting, Vki is observable if a failure is observed, i.e., δki = 1. If the mark value Vki is not available for δki = 1, then we have a missing mark indicated by ξv,ki = 0. The standard imputation approach involves first drawing the parameters of the posterior distribution of the missing variables given the observed data, and then drawing M sets of imputed values for the missing data from their posterior distribution given the observed data, cf. Rubin (1987). However, parametric multiple imputation can be sensitive to misspecification of the imputation model (Carroll et al., 1984).
NNHD imputation, as a hot deck imputation method, replaces each missing value with an observed response from a matching subject from the same dataset. Hot deck imputation methods have been studied by Little (1988), Reilly (1993), Chen and Shao (2000), Beretta and Santaniello (2016), among others. Using the hot deck method, we impute a missing value V of a subject by choosing at random from observed V values among matching donors. Donors are matched for their similarity in regard to some metric. This approach does not rely on model fitting for the variable to be imputed, and thus is potentially less sensitive to model misspecification than an imputation method based on a parametric model.
We describe the NNHD imputation procedure as follows. Suppose Vki is missing, in which case ξv,ki = 0. We impute the missing value Vki using hot deck imputation from donors with similar (Tki, Zki), or with similar (Tki, Zki, Av,ki) in case a relevant Av,ki is available. Let a similarity measure (a distance) be defined between the matching variables of subject ki and those of a potential donor kj. Each hot deck imputation of Vki is obtained by randomly selecting a donor’s mark from the L-nearest neighborhood matched based on the similarity measure, where L is a number less than the number of non-missing marks for observed failures. Let Rki,L be the Lth order statistic of the similarity measures between subject ki and the subjects kj with δkj = 1, ξv,kj = 1, j = 1, …, nk. An L-nearest neighborhood of Vki is defined as the set of observed marks Vkj from subjects with δkj = 1 and ξv,kj = 1 whose similarity measure to subject ki is at most Rki,L. The implementation of the nearest neighborhood hot deck depends on the choice of metric and the variables included for the neighborhood selection. If some components of Zki are discrete, then the L-nearest neighborhood imputations are carried out based on the remaining matching variables, stratified by the values of the discrete components of Zki. Further, the similarity measure can be calculated based on the z-scores or the ranks of variables, which eliminates the effects of scales or units of the variables on the nearest neighbor selections. Let $V_{ki}^{(m)}$, m = 1, …, M, be M random selections from this neighborhood with replacement. If Vki is not missing, in which case ξv,ki = 1, then we let $V_{ki}^{(m)} = V_{ki}$ for m = 1, …, M.
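The procedure above can be sketched in a few lines of Python for a single stratum of failure cases. This is a minimal illustration, not the authors' implementation: the function name `nnhd_impute`, its arguments, and the choice of Euclidean distance on z-scores are assumptions made for the example.

```python
import numpy as np

def nnhd_impute(features, marks, observed, L=5, M=3, seed=0):
    """Hypothetical helper: L-nearest-neighbor hot deck imputation of marks.

    features : (n, d) matching variables for failure cases in one stratum
    marks    : (n,) mark values, np.nan where the mark is missing
    observed : (n,) boolean, True where the mark is observed (donors)
    Returns an (n, M) array of completed marks.
    """
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    marks = np.asarray(marks, dtype=float)
    observed = np.asarray(observed, dtype=bool)

    # z-scores remove the effect of scale/units on the neighbor search
    sd = features.std(axis=0)
    sd[sd == 0] = 1.0
    z = (features - features.mean(axis=0)) / sd

    donors = np.flatnonzero(observed)
    if L > donors.size:
        raise ValueError("L exceeds the number of donors with observed marks")

    # observed marks are simply repeated M times
    completed = np.tile(marks[:, None], (1, M))
    for i in np.flatnonzero(~observed):
        dist = np.linalg.norm(z[donors] - z[i], axis=1)
        nearest = donors[np.argsort(dist, kind="stable")[:L]]
        # draw M donors with replacement from the L nearest neighbors
        completed[i] = marks[rng.choice(nearest, size=M, replace=True)]
    return completed

# toy example: the last case has a missing mark and sits next to three donors
toy_features = [[0.0], [0.1], [0.2], [5.0], [5.1], [0.05]]
toy_marks = [0.1, 0.1, 0.1, 0.9, 0.9, np.nan]
toy_observed = [True, True, True, True, True, False]
toy_completed = nnhd_impute(toy_features, toy_marks, toy_observed, L=3, M=4)
```

Discrete covariates would be handled by calling such a function separately within each level of the discrete components, as described above.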
NNHD imputation is related to variable-bandwidth L-nearest neighbors kernel smoothing that is widely used in nonparametric density estimation and regression, cf. Stone (1977), Li (1984) and Altman (1992). Every case with missing V has the same number of marks imputed from the L-nearest neighbors. The NNHD approach with a fixed number, L, of neighbors is similar to defining neighborhoods by a metric with varying bandwidth such as Rki,L, whereas an alternative approach with a fixed bandwidth B is similar to allowing variable L. A fixed B-bandwidth neighborhood of Vki is defined as the set of observed marks Vkj from subjects with δkj = 1 and ξv,kj = 1 whose similarity measure to subject ki is at most B. In this case, the number, L, of neighbors belonging to this neighborhood varies among subjects with missing marks. An advantage of using fixed L is that the bandwidth is allowed to be larger when data are sparse, which is a common nonparametric smoothing approach to guard against incorporating too few points that could occur by using fixed bandwidth. While we study NNHD with fixed L, the method could also be implemented with fixed bandwidth or variable L.
Choosing the set of variables for neighborhood selection is very important. Our investigation shows that NNHD imputation can lead to biased estimation without proper selection of the neighborhood. Let W = (T, Z, Av) and ρk(v, W) = P(V ≤ v|δ = 1, W) be the conditional distribution of V given W for cases. For an observed value w = (t, z, a) of W of an individual in the kth stratum, ρk(v, w) = P(V ≤ v|δ = 1, W = w). Let gk(a|t, v, z) = P(Av,ki = a|Tki = t, Vki = v, Zki = z, δki = 1) be the probability density of a possible auxiliary variable for V. By Sun and Gilbert (2012),
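$$\rho_k(v, w) = \frac{\int_0^v \lambda_k(t, u \mid z)\, g_k(a \mid t, u, z)\, du}{\int_0^1 \lambda_k(t, u \mid z)\, g_k(a \mid t, u, z)\, du}. \tag{2}$$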
If Av,ki is not available or is independent of Vki given (Tki, Zki, δki), then $\rho_k(v, w) = \int_0^v \lambda_k(t, u \mid z)\, du \big/ \int_0^1 \lambda_k(t, u \mid z)\, du$. Equation (2) shows that the conditional distribution of Vki depends on (Tki, Zki, Av,ki) in general. An unbiased imputation of Vki should therefore be drawn from a neighborhood defined based on (Tki, Zki, Av,ki), except in certain special situations where β(v) in model (1) does not change with v and Av,ki is conditionally independent of Zki given (Tki, Vki, δki). In this case, z cancels out from (2) under model (1). A simulation example in Section 4 shows that the NNHD imputation leads to biased estimation without including Z2,ki in the set of variables used for neighborhood selection.
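The cancellation in the special case can be written out explicitly. The lines below are a sketch that assumes (2) has the ratio form of Sun and Gilbert (2012), i.e. the integral of $\lambda_k g_k$ over $[0, v]$ divided by the same integral over $[0, 1]$, and that $\beta(v) \equiv \beta$ and $g_k(a \mid t, u, z) = g_k(a \mid t, u)$:

```latex
\begin{align*}
\rho_k(v, w)
  &= \frac{\int_0^v \lambda_{0k}(t, u)\, e^{\beta^{\mathsf T} z}\, g_k(a \mid t, u)\, du}
          {\int_0^1 \lambda_{0k}(t, u)\, e^{\beta^{\mathsf T} z}\, g_k(a \mid t, u)\, du} \\
  &= \frac{\int_0^v \lambda_{0k}(t, u)\, g_k(a \mid t, u)\, du}
          {\int_0^1 \lambda_{0k}(t, u)\, g_k(a \mid t, u)\, du}.
\end{align*}
```

The factor $e^{\beta^{\mathsf T} z}$ does not depend on $u$, so it leaves both integrals and cancels. When $\beta(u)$ varies with $u$, the factor $e^{\beta(u)^{\mathsf T} z}$ stays inside the integrals and the conditional distribution of the mark depends on $z$.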
2.2. Hybrid estimation procedures
We propose two hybrid approaches for estimation of model (1). The first approach follows the standard multiple imputation scheme of Rubin (1987) such that the NNHD estimator is the average of the AIPW estimates of Yang et al. (2017) for two-phase sampling of covariates for completed marks under each imputation and the variance estimator is adjusted using Rubin’s formula. The second approach utilizes multiple imputations in a single AIPW estimating equation.
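Let Yki(t) = I(Xki ≥ t) be the at-risk process. The sampling probabilities of the phase-two covariates are given by πz,ki(t) = Pk{ξz,ki = 1|Ωki, δki, Yki(t) = 1}. Suppose that $\hat{\pi}_{z,ki}(t)$ is an estimator of πz,ki(t) based on parametric models as discussed in Yang et al. (2017). Let Wki(t) = ξz,ki/πz,ki(t) and $\hat{W}_{ki}(t) = \xi_{z,ki}/\hat{\pi}_{z,ki}(t)$. We define the marked counting processes for the completed marks by $N_{ki}^{(m)}(t, v) = I(X_{ki} \le t, \delta_{ki} = 1, V_{ki}^{(m)} \le v)$ for m = 1, …, M, where $V_{ki}^{(m)}$ denotes the mth imputed mark. If Vki is not missing, $V_{ki}^{(m)} = V_{ki}$ and $N_{ki}^{(m)}(t, v) = I(X_{ki} \le t, \delta_{ki} = 1, V_{ki} \le v)$.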
2.2.1. Hybrid estimation using standard multiple imputation
Standard MI estimation of β(v) uses the average of estimates obtained for each imputation. Following Yang et al. (2017), for the mth imputation, m = 1, …, M, let $\hat{\beta}^{(m)}(v)$ be the solution to the estimating equation for β = β(v) for v ∈ (0, 1):
| (3) |
where Kh(x) = K(x/h)/h, K(·) is a kernel function and h the bandwidth, and and are the estimates described in Yang et al. (2017).
In particular, is the estimate of , and
| (4) |
for j = 0, 1, and 2, where is the estimate of the conditional expectation for j = 0, 1, and 2. Write , where β1 and β2 are the coefficients for Z1,ki(t) and Z2,ki, respectively. Note that Z1,ki(t) is a part of Ωki. For given β, the first part of is Z1,ki(t) and the second part is . Similarly, , for j = 0, 1 and 2, depend on the observed data, and are functions of the conditional expectations , r = 0, 1 and 2. Yang et al. (2017) considered using parametric models for to obtain the estimate , where g(Z2,ki) is a specified function of Z2,ki such as Z2,ki, exp(β2Z2,ki) or Z2,ki exp(β2Z2,ki).
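By the standard multiple imputation scheme of Rubin (1987), the hybrid-Rubin estimator is defined by $\hat{\beta}_R(v) = M^{-1} \sum_{m=1}^{M} \hat{\beta}^{(m)}(v)$, where $\hat{\beta}^{(m)}(v)$ is the solution to (3) for the mth imputation. The variance estimate of $\hat{\beta}_R(v)$ adjusting for multiple imputation using Rubin’s rule (Rubin, 1987, p. 76) equals
$$\hat{\Sigma}_R(v) = \frac{1}{M} \sum_{m=1}^{M} \hat{\Sigma}^{(m)}(v) + (1 + M^{-1})\, \frac{1}{M - 1} \sum_{m=1}^{M} \{\hat{\beta}^{(m)}(v) - \hat{\beta}_R(v)\}^{\otimes 2}, \tag{5}$$
where $\hat{\Sigma}^{(m)}(v)$ is the variance estimator of Yang et al. (2017) based on the mth imputation and $a^{\otimes 2} = a a^{\mathsf T}$. The first part accounts for within-imputation variability, and the second part for between-imputation variability. The term 1 + M−1 corrects for bias due to the finite number of multiply imputed data sets.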
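Rubin's pooling rule is easy to express in code. The sketch below takes M per-imputation estimates and covariance matrices at a fixed mark value v and returns the pooled estimate and total covariance. The function name `rubin_pool` and the array layout are assumptions for illustration; the per-imputation AIPW fits themselves are not shown.

```python
import numpy as np

def rubin_pool(estimates, variances):
    """Hypothetical helper: pool M estimates with Rubin's rule.

    estimates : (M, p) array, one estimate of beta(v) per imputation
    variances : (M, p, p) array, one covariance estimate per imputation
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    M = estimates.shape[0]

    pooled = estimates.mean(axis=0)          # average of the M estimates
    within = variances.mean(axis=0)          # within-imputation variability
    dev = estimates - pooled
    between = dev.T @ dev / (M - 1)          # between-imputation variability
    total = within + (1.0 + 1.0 / M) * between
    return pooled, total

# toy example with M = 3 imputations and a scalar coefficient
pooled_est, total_var = rubin_pool(
    [[1.0], [2.0], [3.0]],
    [[[0.5]], [[0.5]], [[0.5]]],
)
```

In the toy example the within-imputation variance is 0.5 and the between-imputation variance is 1, so the total is 0.5 + (1 + 1/3) × 1.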
2.2.2. Hybrid estimation via the estimating equations approach
This subsection proposes another hybrid approach that incorporates multiple imputations into a single estimating equation. A subject with a missing mark gets M imputed marks, which are associated with a particular subject and are dependent. The M imputed marks can be considered as a cluster. We consider the following hybrid estimating equation for β = β(v) for v ∈ (0, 1):
| (6) |
where U(m)(v, β) is defined in (3). The estimating equation (6) for the hybrid-MIEE resembles the generalized estimating equation (GEE) approach for repeated measures analysis that assumes working independence (Liang and Zeger, 1986). A subject with an observed mark, in which case all M completed marks equal the observed mark, receives weight one, while the weight for a subject with a missing mark is 1/M for each imputation. The estimator that solves U(v, β) = 0 is termed the hybrid multiple imputation estimating equation (hybrid-MIEE) estimator.
The estimator can be implemented using the Newton-Raphson iterative algorithm. Starting with an initial value β(0)(v), let β(l)(v) be the estimate of β(v) at step l. The estimator is obtained by iterating steps (i) and (ii) as follows until convergence: (i) Estimate the conditional expectations for j = 0, 1, 2, and calculate ; (ii) Update the estimate β(l+1)(v) at step l + 1 by β(l+1)(v) = β(l)(v) − (∂U(v, β(l)(v))/∂β)−1 U(v, β(l)(v)).
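The update in step (ii) is a standard Newton-Raphson iteration. The following sketch shows the iteration for a generic estimating function; it does not implement the AIPW estimating equation (6). The names `newton_raphson`, `score` and `jacobian` are hypothetical, and step (i), the re-estimation of the conditional expectations at the current β, is assumed to happen inside the `score` and `jacobian` callbacks.

```python
import numpy as np

def newton_raphson(score, jacobian, beta0, tol=1e-8, max_iter=50):
    """Solve score(beta) = 0 by Newton-Raphson (generic sketch)."""
    beta = np.atleast_1d(np.asarray(beta0, dtype=float))
    for _ in range(max_iter):
        u = np.atleast_1d(score(beta))
        if np.max(np.abs(u)) < tol:
            return beta
        j = np.atleast_2d(jacobian(beta))
        # beta_(l+1) = beta_(l) - (dU/dbeta)^(-1) U(beta_(l))
        beta = beta - np.linalg.solve(j, u)
    raise RuntimeError("Newton-Raphson did not converge")

# toy estimating function (not the AIPW equation): a Poisson-type score
x = np.array([1.0, 1.0])
y = np.array([2.0, 4.0])
toy_score = lambda b: np.array([np.sum(x * (y - np.exp(b[0] * x)))])
toy_jacobian = lambda b: np.array([[-np.sum(x**2 * np.exp(b[0] * x))]])
beta_hat = newton_raphson(toy_score, toy_jacobian, beta0=[0.0])
```

For the toy score 6 − 2 exp(β), the root is β = log 3, which the iteration reaches from the starting value 0.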
Estimation of the stratified mark-specific PH model (1) also involves estimation of the baseline mark-specific hazard function λ0k(t, v). The MIEE approach treats the multiple imputations for a given subject as a cluster. As such, the Nelson-Aalen type estimator is a natural estimator of the doubly cumulative baseline function . The baseline function λ0k(t, v) can be estimated by obtained by smoothing the increments of the estimator . For example, one can use kernel smoothing , where and , with K(1)(·) and K(2)(·) kernel functions and h1 and h2 bandwidths. Other model parameters of interest discussed in Section 2.3 such as the overall conditional survival function, the conditional cumulative incidence function, and the mark-specific cumulative incidence function rate can be estimated under the same framework.
Next, we propose an estimator of the variance of . Let . The derivative of U(v, β) with respect to β equals . Following the proof of Theorem 2 of Yang et al. (2017), we have the approximation
| (7) |
where , and, similar to Yang et al. (2017), is approximated by
| (8) |
Here and is the estimator of given by
Hence, is obtained by replacing with , λ0k(t, u) dtdu with , and by replacing , and with their estimates.
Using Rubin’s idea to account for the between-imputation variability, we propose to estimate the variance of by , where
| (9) |
In Web Appendix A, we present heuristic arguments to show that the proposed hybrid-MIEE and hybrid-Rubin estimators are unbiased in large samples using NNHD imputation and under the model assumptions given by Yang et al. (2017). The hybrid-MIEE and hybrid-Rubin estimators also enjoy double robustness properties similar to those of the AIPW estimators of Yang et al. (2017).
Parametric multiple imputation often uses between 2 and 10 imputations (Rubin, 1987, p. 15). Reilly (1993) recommended that hot deck estimation be performed with 3, 5 and 10 imputations.
2.3. Estimation of the mark-specific cumulative incidence function rate
By definition of the conditional mark-specific hazard function, λk(t, v|z) dv measures the instantaneous rate of failure at time t with failure type/mark (e.g., dengue sequence distance) V ∈ [v, v+dv) in the presence of all other possible failure types (dengue viruses with different sequence distances) for a very small dv. In this section, we introduce the mark-specific cumulative incidence function rate that provides interpretable results (e.g., through visual display of estimates) and is useful for prediction. The conditional cumulative incidence function (CIF) for stratum k is defined by $F_k(t, v \mid z) = P(T_k \le t, V_k \le v \mid Z_k = z)$, which has the interpretation of the classical CIF as the conditional probability of failure by time t with failure cause Vk ≤ v. The conditional mark-specific cumulative incidence function rate (CIFR) fk,v(t, v|z) is the derivative of $F_k(t, v \mid z)$ with respect to v. The quantity fk,v(t, v|z) dv is the conditional probability that failure with mark V ∈ [v, v + dv) occurs by time t. The cumulative incidence of failure with mark V in an interval (v1, v2] ⊂ (0, 1) is given by $\int_{v_1}^{v_2} f_{k,v}(t, v \mid z)\, dv = F_k(t, v_2 \mid z) - F_k(t, v_1 \mid z)$.
While λk(t, v|z) is useful for measuring the instantaneous rate of failure occurrence at time t for those at risk, the mark-specific CIFR fk,v(t, v|z) is useful to estimate/predict the probability of failure by time t with V ∈ [v, v+dv). As with the classical competing risks model, the mark-specific hazard function is related to the CIFR through the simple formula $f_{k,v}(t, v \mid z) = \int_0^t \lambda_k(s, v \mid z)\, S_k(s \mid z)\, ds$, where Sk(t|z) is the conditional overall survival function of Tk given Zk = z, which is given by $S_k(t \mid z) = \exp\{-\int_0^t \int_0^1 \lambda_{0k}(s, u) \exp(\beta(u)^{\mathsf T} z)\, du\, ds\}$ under model (1). The CIF and CIFR can be estimated by plugging in the estimates of β(v) and λ0k(t, v). The details of estimation are given in the Web Appendix B. The relationships among the estimated conditional mark-specific hazard function, CIF and CIFR are the same as for their population quantities using the hybrid-MIEE approach with multiple imputations, but this is not the case for the hybrid-Rubin approach.
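As a worked illustration, consider the special case (an assumption made only for this example) in which the conditional mark-specific hazard does not depend on time, $\lambda_k(t, v \mid z) = \lambda_k(v \mid z)$. Using the standard competing-risks relation between hazard, overall survival and cumulative incidence, the CIFR then has a closed form:

```latex
\begin{align*}
\Lambda_k(z) &= \int_0^1 \lambda_k(u \mid z)\, du, \qquad
S_k(t \mid z) = \exp\{-\Lambda_k(z)\, t\}, \\
f_{k,v}(t, v \mid z)
  &= \int_0^t \lambda_k(v \mid z)\, S_k(s \mid z)\, ds
   = \frac{\lambda_k(v \mid z)}{\Lambda_k(z)}\,
     \bigl[1 - \exp\{-\Lambda_k(z)\, t\}\bigr].
\end{align*}
```

Integrating over $v \in (0, 1)$ recovers $1 - S_k(t \mid z)$, the overall probability of failure by time $t$, which is a quick consistency check on the formula.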
3. Statistical inferences for β(v)
We develop procedures for testing two sets of hypotheses regarding β(v). Let βr(v) be the rth component of β(v), 1 ≤ r ≤ p. We first test the null hypothesis H10: βr(v) = 0 for v ∈ [a, b] ⊂ (0, 1) against the general alternative H1a: βr(v) ≠ 0 for at least some v ∈ [a, b], and against the monotone alternative H1m: βr(v) ≤ 0 with βr(v) < 0 for some v ∈ [a, b]. The testing procedure can be used to test βr(v) ≥ 0 with simple modifications. The second null hypothesis H20 is that βr(v) does not depend on v for v ∈ [a, b]. We test H20 against the general alternative H2a that βr(v) depends on v for v ∈ [a, b] and the monotone alternative H2m that βr(v) is a monotone increasing function. The test can be modified to test the monotone alternative that βr(v) is a monotone decreasing function. The tests of H10 are helpful for identifying covariates that are correlated with risk for at least some failure types/marks. The tests of H20 evaluate whether the strength of association of a covariate with risk varies with values of the failure type/mark V.
We construct the following test procedures based on the hybrid-MIEE estimator of β(v). Let 0 < v1 < ⋯ < vG < 1 be a grid of G points in the range of the marks (0, 1). By Aalen and Johansen (1978) and Sun et al. (2009), it can be shown that the estimators $\hat{\beta}_r(v_g)$, g = 1, …, G, are asymptotically independent and approximately normal. The estimated variance of $\hat{\beta}_r(v_g)$, $\hat{\sigma}_r^2(v_g)$, is the rth element on the diagonal of the estimated covariance matrix of $\hat{\beta}(v_g)$. Let $\hat{Z}_r(v_g) = \hat{\beta}_r(v_g)/\hat{\sigma}_r(v_g)$. We propose the following test statistic to test H10: βr(v) = 0 against H1a: βr(v) ≠ 0: $T_{1a} = \sum_{g=1}^{G} \hat{Z}_r(v_g)^2$.
The following test statistic is used to test H10 against H1m: βr(v) ≤ 0: $T_{1m} = \sum_{g=1}^{G} \hat{Z}_r(v_g)$.
Under the null hypothesis H10, T1a has an approximately Chi-square distribution with G degrees of freedom, and T1m has an approximate normal distribution with mean zero and variance G. A larger value of T1a indicates departures from H10, rejecting H10 in favor of H1a at significance level α if T1a is greater than the (1 − α) percentile of the Chi-square distribution with G degrees of freedom. A smaller value of T1m shows evidence in favor of H1m, rejecting H10 at significance level α if T1m is less than the α percentile of N(0, G).
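The two statistics for H10 can be computed directly from the grid estimates and their standard errors. The sketch below assumes that T1a is the sum of squared standardized estimates and T1m is their sum, which is the form implied by the stated null distributions (Chi-square with G degrees of freedom and N(0, G)); the function name `h10_statistics` is hypothetical.

```python
import math
import numpy as np

def h10_statistics(beta_hat, se_hat):
    """Hypothetical helper: statistics for H10 on a grid of G mark values.

    beta_hat : (G,) estimates of beta_r(v_g)
    se_hat   : (G,) estimated standard errors of those estimates
    Returns (T1a, T1m, one-sided p-value of T1m for the alternative beta_r <= 0).
    """
    z = np.asarray(beta_hat, dtype=float) / np.asarray(se_hat, dtype=float)
    G = z.size
    t1a = float(np.sum(z**2))   # compare with Chi-square(G) upper percentile
    t1m = float(np.sum(z))      # compare with N(0, G) lower percentile
    # lower-tail normal probability via the error function
    p_mono = 0.5 * (1.0 + math.erf(t1m / math.sqrt(G) / math.sqrt(2.0)))
    return t1a, t1m, p_mono

# toy example with G = 3 grid points and unit standard errors
t1a, t1m, p_mono = h10_statistics([-1.0, -2.0, -3.0], [1.0, 1.0, 1.0])
```

The Chi-square p-value for T1a is omitted to keep the sketch free of dependencies beyond NumPy; T1a would be compared with a tabulated Chi-square critical value with G degrees of freedom.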
To test the null hypothesis H20 that the rth component βr(v) does not depend on v, we let . Then , where A is the (G − 1) × G matrix with −1 as the (i, i)th element, 1 as the (i, i + 1)th element for i = 1, …, G − 1, and the rest of the elements zero. Thus the covariance matrix of is , where is the diagonal matrix with , g = 1, …, G, on the diagonals. The following are the two test statistics for testing H20:
where is the diagonal matrix with , g = 1, …, G, on the diagonals, and J is a (G − 1)-dimensional vector of 1’s. Under H20, T2a has an approximately Chi-square distribution with G − 1 degrees of freedom, and T2m has an approximate normal distribution with mean zero and variance G − 1. We reject H20 in favor of H2a at significance level α if T2a is greater than the (1 − α) percentile of the Chi-square distribution with G − 1 degrees of freedom, and reject H20 in favor of H2m if T2m is greater than the (1 − α) percentile of N(0, G − 1).
In practice, we recommend that G take a value from 3 to 5, with approximately evenly spaced grid points and spacing greater than the size of the bandwidth, for better approximations of the null distributions of the test statistics.
4. Simulation study
We conducted a simulation study to evaluate the finite-sample performance of the proposed estimation and hypothesis testing procedures. Let U1, U2 and U3 be independent uniformly distributed random variables on (0, 1). Let Z1 = U1 +2U3 be a phase-one covariate, and Z2 = −U1+2U2 a phase-two covariate, with resulting correlation coefficient of −0.2. We study the scenario of one stratum K = 1. The (T1i, V1i) are generated from the following mark-specific proportional hazards model:
λ(t, v | z) = λ0(t, v) exp{β1(v)z1 + β2(v)z2},    (10)
where the mark-specific baseline function is λ0(t, v) = exp(−0.3v), β1(v) = −0.2v and β2(v) = α+θv. We study performance of the hypothesis tests of H10 and H20 for β2(v). The parameters α and θ are chosen to examine the sizes and powers of the proposed tests. All failure times greater than τ = 2.0 are right-censored at τ. Censoring times are generated from an exponential distribution, independent of (T, V), with parameter adjusted so that the overall censoring rate during follow-up is approximately 40%.
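One way to draw a failure time and mark under these specifications is sketched below. It assumes that marks live on [0, 1] and that the hazard is the baseline exp(−0.3v) multiplied by exp{β1(v)z1 + β2(v)z2}; under those assumptions the log hazard is linear in v, so the total hazard and the mark distribution have closed forms. The function name and the sampling scheme are illustrative and are not taken from the authors' code; the random exponential censoring is omitted.

```python
import math
import random

TAU = 2.0  # administrative censoring time stated in the text

def simulate_failure(z1, z2, alpha, theta, rng):
    """Draw (T, V) when log hazard = a + b*v on v in [0, 1] (constant in t)."""
    a = alpha * z2                      # part of the log hazard not involving v
    b = -0.3 - 0.2 * z1 + theta * z2    # slope of the log hazard in v
    if abs(b) < 1e-12:
        total_hazard = math.exp(a)      # integral of exp(a) over [0, 1]
        v = rng.random()                # mark is uniform when b = 0
    else:
        total_hazard = math.exp(a) * math.expm1(b) / b
        # Inverse-CDF draw from the density proportional to exp(b*v) on [0, 1].
        v = math.log1p(rng.random() * math.expm1(b)) / b
    t = rng.expovariate(total_hazard)   # constant hazard gives an exponential time
    return t, v

rng = random.Random(2020)
# Setting P1 (alpha = theta = 0) at hypothetical covariate values z1 = 1.5, z2 = 0.5.
draws = [simulate_failure(1.5, 0.5, 0.0, 0.0, rng) for _ in range(5000)]
# The mark is only observable when the failure occurs before TAU.
observed = [(min(t, TAU), v if t <= TAU else None) for t, v in draws]
mean_mark = sum(v for _, v in draws) / len(draws)
```

With a negative slope in v, the marks concentrate toward zero, so the average simulated mark falls below 0.5.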
For the two-phase sampling, we consider a simple Bernoulli random sample taken separately for cases and controls, with selection probability πz,1i = 1 for cases (δ1i = 1) and 0.5 for controls (δ1i = 0). Suppose there is an auxiliary variable Az correlated with Z2, Az = Z2 + ϵ, where ϵ is normally distributed with mean zero and standard deviation 0.5, which corresponds to a Pearson correlation coefficient between Z2 and Az of ρ = 0.75. The conditional expectations involving the phase-two covariate Z2 are estimated using linear models with (1, δ, Z1, Az, δZ1, δAz) as predictors based on the subjects with observed Z2. Covariates (1, Z1, Az) are used for estimating the logit linear model for πz,1i for subjects with δ1i = 0.
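The covariate construction and the Bernoulli phase-two selection can be sketched as follows. Since Cov(Z1, Z2) = −Var(U1) = −1/12 and Var(Z1) = Var(Z2) = 5/12, the correlation is −0.2, which the simulation below approximates. The helper names are hypothetical, and the inverse-probability weights shown are the simple design weights rather than the estimated weights used in the AIPW procedure.

```python
import math
import random

def simulate_covariates(n, rng):
    """Z1 = U1 + 2*U3 (phase one) and Z2 = -U1 + 2*U2 (phase two), U's uniform(0, 1)."""
    z1, z2 = [], []
    for _ in range(n):
        u1, u2, u3 = rng.random(), rng.random(), rng.random()
        z1.append(u1 + 2.0 * u3)
        z2.append(-u1 + 2.0 * u2)
    return z1, z2

def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def phase_two_selection(delta, rng, p_control=0.5):
    """Select every case and each control with probability p_control.

    Returns selection indicators and design weights (indicator / probability).
    """
    selected, weights = [], []
    for d in delta:
        p = 1.0 if d == 1 else p_control
        s = 1 if rng.random() < p else 0
        selected.append(s)
        weights.append(s / p)
    return selected, weights

rng = random.Random(1)
z1, z2 = simulate_covariates(50000, rng)
r = pearson(z1, z2)
delta = [1, 0, 0, 1, 0, 0, 0, 1]  # hypothetical failure indicators
selected, weights = phase_two_selection(delta, rng)
```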
Missingness of the mark V1i is governed by the conditional probability logit(πv,1i) = logit(P(ξv,1i = 1|Ω1i)) = 0.3Z1,1i + 0.8 for δ1i = 1, yielding about 22% missing marks. The hot deck imputation of a missing Vki is obtained from donors in the same stratum k with δki = 1 and with similar (Tki, Zki), where similarity is defined by Euclidean distance between the z-scores of (Tki, Zki); for our simulations we study only one stratum, k = 1. The L-nearest neighborhood imputations are carried out based on these z-scores with the Euclidean metric. By using the z-scores of the variables, we eliminate the effects of scales or units of the variables on the nearest neighbor selections. We consider M = 3 and M = 5 imputations from the M-nearest neighborhoods for the cases with missing marks.
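The neighborhood selection described above can be sketched in a few lines. The data layout (a list of dictionaries) and the function names are hypothetical, and drawing the M imputed marks at random with replacement from the L nearest donors is an assumption made for illustration; the text does not spell out that detail.

```python
import math
import random

def zscores(values):
    """Standardize a list of numbers using the sample standard deviation."""
    m = sum(values) / len(values)
    sd = math.sqrt(sum((x - m) ** 2 for x in values) / (len(values) - 1))
    return [(x - m) / sd for x in values]

def nearest_donor_marks(cases, L):
    """For each case with a missing mark, return the marks of its L nearest donors.

    cases: list of dicts with keys 't' (failure time), 'z' (tuple of covariates)
    and 'v' (mark, or None if missing). All entries are failures (delta = 1).
    """
    columns = list(zip(*[(c["t"], *c["z"]) for c in cases]))
    std = list(zip(*[zscores(list(col)) for col in columns]))  # one z-score row per case
    donors = [j for j, c in enumerate(cases) if c["v"] is not None]
    out = {}
    for i, c in enumerate(cases):
        if c["v"] is None:
            ranked = sorted(donors, key=lambda j: math.dist(std[i], std[j]))
            out[i] = [cases[j]["v"] for j in ranked[:L]]
    return out

def impute_marks(neighbor_marks, M, rng):
    """Draw M imputed marks per recipient from its neighborhood (assumed scheme)."""
    return {i: [rng.choice(marks) for _ in range(M)] for i, marks in neighbor_marks.items()}

cases = [
    {"t": 0.20, "z": (1.00, 0.50), "v": 0.10},
    {"t": 0.25, "z": (1.10, 0.60), "v": 0.15},
    {"t": 1.80, "z": (2.50, 1.50), "v": 0.90},
    {"t": 0.22, "z": (1.05, 0.55), "v": None},   # recipient
    {"t": 1.70, "z": (2.40, 1.40), "v": 0.85},
]
neighbors = nearest_donor_marks(cases, L=2)
imputed = impute_marks(neighbors, M=5, rng=random.Random(0))
```

Standardizing each variable before computing distances keeps a variable measured on a large scale from dominating the choice of donors.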
The performance of the proposed test procedures is evaluated through simulations under model (10) for the parameter settings P1 to P5, defined by P1: (α, θ) = (0, 0); P2: (α, θ) = (−0.4, 0); P3: (α, θ) = (−0.5, 0); P4: (α, θ) = (−1, 1.5); P5: (α, θ) = (−1, 2). P1 and P3 are models under the null hypotheses H10 and H20, respectively; P2-P3 are H1m alternatives to H10, and P4-P5 are H2m alternatives to H20.
The Epanechnikov kernel K(x) = 0.75(1 − x²)I{|x| ≤ 1} is used for the kernel smoothing. The bandwidth is selected using the formula h = Cσ̂v n^(−1/3), where σ̂v is the estimated standard error of the observed marks for uncensored failure times and C is a constant ranging from 2 to 5. Sun et al. (2016) and Yang et al. (2017) showed that this formula works well in simulations. A larger C can be used if the distribution of the observed marks is skewed or marks are sparse in some areas. Alternatively, the formula h = Cσ̂v no^(−1/3) has also been used in situations with a very large phase-one sample and low event rate (Yang et al., 2017), where no is the observed number of events. The values of σ̂v under model (10) for the settings P1 to P5 are approximately 0.29, yielding h = 0.15 for n = 500 and h = 0.13 for n = 800. We also studied the impact of using larger bandwidths: h = 0.20 for n = 500 and h = 0.17 for n = 800.
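For reference, a minimal sketch of the Epanechnikov kernel and its bandwidth-scaled version is given below. The function names are hypothetical. The numerical integration checks that the scaled kernel has total mass 1 when its support lies inside [0, 1], and that only half of the mass is available when the kernel is centered at the boundary v = 0, which is one way to see why estimates near the ends of the mark range behave differently.

```python
def epanechnikov(x):
    """K(x) = 0.75 * (1 - x^2) on |x| <= 1, and 0 otherwise."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0

def kernel_weight(v, v0, h):
    """Scaled kernel K_h(v - v0) = K((v - v0) / h) / h."""
    return epanechnikov((v - v0) / h) / h

def mass_on_unit_interval(v0, h, n_steps=2000):
    """Midpoint-rule integral of K_h(v - v0) over v in [0, 1]."""
    step = 1.0 / n_steps
    return sum(kernel_weight((k + 0.5) * step, v0, h) for k in range(n_steps)) * step

interior_mass = mass_on_unit_interval(0.5, 0.13)  # support [0.37, 0.63] is inside [0, 1]
boundary_mass = mass_on_unit_interval(0.0, 0.13)  # half of the support falls below 0
```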
We estimate β(v) over 21 evenly spaced grid points in [0, 1] with spacing 0.05, such that v1 = 0, v2 = 0.05, …, v21 = 1. The initial value for estimating β(v1) is set to zero. The estimate at v1 is used as the initial value for estimating β(v2), and in general the estimate at vi−1 is used as the initial value for estimating β(vi) for i = 2, …, 21. The Newton-Raphson iterative algorithm proposed in Section 2.2.2 is not overly sensitive to the choice of initial values.
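The warm-start strategy along the grid can be illustrated with a toy problem. The score function below, exp(b) − (1 + v), is a stand-in chosen only because its root log(1 + v) is known in closed form; it is not the estimating function of the paper.

```python
import math

def newton(score, dscore, start, tol=1e-10, max_iter=50):
    """One-dimensional Newton-Raphson iteration starting from `start`."""
    b = start
    for _ in range(max_iter):
        step = score(b) / dscore(b)
        b -= step
        if abs(step) < tol:
            break
    return b

grid = [0.05 * i for i in range(21)]  # v1 = 0, v2 = 0.05, ..., v21 = 1
estimates = []
start = 0.0                           # initial value at the first grid point
for v in grid:
    # Toy score with known root log(1 + v); replace with the real score in practice.
    b_hat = newton(lambda b, v=v: math.exp(b) - (1.0 + v), math.exp, start)
    estimates.append(b_hat)
    start = b_hat                     # warm start for the next grid point
```

Because neighboring grid points have similar solutions, each warm start begins close to the root and typically needs only a few iterations.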
Figure 1 shows the simulation results for estimating β(v) = (β1(v), β2(v))T under the setting P4 for model (10) without auxiliary Az, with M = 5 imputations from the 5-nearest neighborhoods of cases with missing marks, based on 1000 simulations using bandwidths h = 0.13 and h = 0.15. The 5-nearest neighborhoods are calculated using Euclidean distance and z-scores of the (T1j, Z1j) for cases (with δ1j = 1). Here Bias is the bias, SEE is the sample standard error of the estimator, ESE is the sample mean of the estimated standard errors, and CP is the empirical coverage probability of the 95% confidence intervals. Figure 2 compares the estimates using different neighborhood selections under the setting P4 for model (10).
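The four summary measures can be computed from simulation replicates as in the sketch below, applied at a single mark value. The function name and the toy inputs are hypothetical, and the coverage calculation assumes Wald-type intervals (estimate ± 1.96 × estimated standard error).

```python
from statistics import mean, stdev

def summarize(estimates, std_errors, truth, z=1.96):
    """Bias, SEE, ESE and CP for one parameter at one mark value."""
    bias = mean(estimates) - truth           # average estimate minus true value
    see = stdev(estimates)                   # sample standard error of the estimates
    ese = mean(std_errors)                   # mean of the estimated standard errors
    covered = [abs(e - truth) <= z * s for e, s in zip(estimates, std_errors)]
    cp = sum(covered) / len(covered)         # empirical coverage of the 95% intervals
    return {"Bias": bias, "SEE": see, "ESE": ese, "CP": cp}

# Five hypothetical replicates (real studies use many more, e.g. 1000).
summary = summarize(
    estimates=[0.48, 0.52, 0.55, 0.45, 0.75],
    std_errors=[0.05, 0.05, 0.05, 0.05, 0.05],
    truth=0.5,
)
```

An ESE close to the SEE suggests that the variance estimator is tracking the actual sampling variability.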
Figure 1.

Bias, SEE, ESE and CP for the estimators of β1(v) and β2(v) for n = 800 under the setting P4 for model (10) with M = 5 imputations from the 5-nearest neighborhoods of cases i with missing marks based on 1000 simulations. The 5-nearest neighborhoods are calculated using Euclidean distance and z-scores of the (T1j, Z1j) for cases (with δ1j = 1). The red lines are for the hybrid-MIEE estimator while the blue lines are for the hybrid-Rubin estimator.
Figure 2.

Bias, SEE, ESE and CP for the estimators of β1(v) and β2(v) for n = 800 under the setting P4 for model (10) with M = 5 imputations from the 5-nearest neighborhoods of cases with missing marks based on 1000 simulations. MIEE-NN(T,Z) is for the hybrid-MIEE estimator with the 5-nearest neighborhoods calculated using Euclidean distance and z-scores of the (T1j, Z1j) for cases (with δ1j = 1), while MIEE-NN(T) is the same except that z-scores of the T1j alone are used. The legends Rubin-NN(T,Z) and Rubin-NN(T) are defined similarly for the hybrid-Rubin estimator. The red lines are for the hybrid-MIEE estimator while the blue lines are for the hybrid-Rubin estimator.
Additional simulation studies are presented in Web Appendix C, which includes simulation results under setting P3 of model (10) and for a different mark-specific proportional hazards model with K = 2 strata. Web Appendix C also includes a real data simulation study that applies the proposed methods to a data set generated based on the CYD14 trial data.
The simulation study shows that, when the 5-nearest neighborhoods of cases with missing marks are calculated using Euclidean distance and z-scores of the (T1j, Z1j) for cases, the biases of both the hybrid-MIEE estimator and the hybrid-Rubin estimator are very small except in the left and right tails for β2(v) (which are the expected boundary effects in nonparametric estimation). The pointwise coverage probabilities are slightly below but very close to 95% for v ∈ (0, 1) except in the left and right tails for β2(v), indicating adequate performance of the proposed variance estimators. For larger bandwidths, the SEE and ESE of the proposed estimator are smaller. The study also shows that estimation based on the L-nearest neighborhoods calculated using only a subset of (T1j, Z1j) can yield much larger biases unless β(v) does not depend on v. In particular, under the setting P4 for model (10), Figure 2(b) shows that using T1j alone yields much larger biases than using (T1j, Z1j). We also notice from Web Figure 3 in Web Appendix C that the biases are small for both neighborhood selections under setting P3, since β(v) does not vary with v in this setting.
Using the 5-nearest neighborhoods calculated using Euclidean distance and z-scores of the (T1j, Z1j) for all cases (with δ1j = 1), Figure 3 shows the simulation results for estimating the conditional mark-specific cumulative incidence function rate f1,v(t, v|z) at Z1 = 1.5 and at the 10th, 50th and 90th percentiles of Z2 for t = 1 and n = 800 under the setting P4 for model (10) with M = 5 based on 1000 simulations using bandwidth h = 0.13. The figures show that the averages of the estimated f1,v(t, v|z) are close to the true values of f1,v(t, v|z).
Figure 3.

Estimation of the conditional mark-specific cumulative incidence function rate f1,v(τ, v|z) at Z1 = 1.5 and at the 10th, 50th and 90th percentiles of Z2 for τ = 1 and n = 800 under the setting P4 for model (10) with M = 5 imputations from the 5-nearest neighbors of the missing marks using bandwidth h = 0.13 based on 1000 simulations, where the grey lines represent the true values. The 5-nearest neighborhoods are calculated using Euclidean distance and z-scores of the (T1j, Z1j) for cases (with δ1j = 1). The red lines in (a) are the averages of the estimated f1,v(τ, v|z). The plots in (b) are the estimated ratios of f1,v(τ, v|z) at the 10th and 50th percentiles of Z2 divided by f1,v(τ, v|z) at the 90th percentile of Z2, respectively, for Z1 = 1.5 and τ = 1. The plots in (c) are the SEEs of the estimated f1,v(τ, v|z).
We also carried out simulations to examine the performance of the proposed tests, with the nearest neighborhoods calculated using Euclidean distance and z-scores of the (T1j, Z1j) for subjects with δ1j = 1. Table 1 presents the empirical sizes and powers of the tests T1a and T1m for testing H10 and the tests T2a and T2m for testing H20 at nominal level 0.05 using M = 3 and M = 5 imputations from the M-nearest neighborhoods based on 1000 simulations. The test statistics are calculated using bandwidth h = 0.15 for n = 500 and h = 0.13 for n = 800 and using G = 3 grid points with v1 = 0.2, v2 = 0.5, v3 = 0.8. The empirical sizes for testing H10 under P1 and for testing H20 under P3 are slightly higher than but close to the nominal level 0.05, indicating adequate performance of the proposed tests. The powers of the tests for testing H10 increase as the model moves from P1 to P3, while the powers of the tests for testing H20 increase as the model moves from P3 to P5, representing increasing departures from the null hypotheses H10 and H20, respectively. Powers of the tests with the auxiliary variable Az are slightly higher than those without Az. Powers of the tests are not overly sensitive to the number of imputations M, but seem to increase slightly for the larger bandwidth.
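When reading empirical sizes from 1000 replications, it can help to keep the Monte Carlo error in mind. The sketch below computes the rejection proportion from a list of p-values and the binomial standard error of an empirical size when the true size equals the nominal 0.05. The function names and the four example p-values are hypothetical.

```python
import math

def rejection_rate(p_values, alpha=0.05):
    """Proportion of replicates in which the test rejects at level alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)

def monte_carlo_se(rate, n_sim):
    """Binomial standard error of a rejection proportion over n_sim replicates."""
    return math.sqrt(rate * (1.0 - rate) / n_sim)

se_nominal = monte_carlo_se(0.05, 1000)
# Approximate 95% range for an empirical size if the true size were exactly 0.05.
band = (0.05 - 1.96 * se_nominal, 0.05 + 1.96 * se_nominal)
example_rate = rejection_rate([0.01, 0.20, 0.04, 0.50])
```

With 1000 replications the standard error is about 0.007, so empirical sizes between roughly 0.036 and 0.064 are within simulation noise of the nominal level.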
Table 1.
Empirical sizes and powers of the test statistics T1a and T1m for testing H10 and the test statistics T2a and T2m for testing H20 under model (10) with M = 3 and M = 5 imputations from the M-nearest neighborhoods of the missing marks at nominal level 0.05 based on 1000 simulations. The test statistics are constructed using G = 3, v1 = 0.2, v2 = 0.5, v3 = 0.8. “Without Az” refers to the scenario where there is no auxiliary Az, and “With Az” refers to the scenario where the auxiliary Az is used. BAND1 is the bandwidth setting of h = 0.15 for n = 500 and h = 0.13 for n = 800 while BAND2 is the bandwidth setting of h = 0.20 for n = 500 and h = 0.17 for n = 800.
Testing H10

| Model | n | M | Without Az, BAND1: T1a | Without Az, BAND1: T1m | Without Az, BAND2: T1a | Without Az, BAND2: T1m | With Az, BAND1: T1a | With Az, BAND1: T1m | With Az, BAND2: T1a | With Az, BAND2: T1m |
|---|---|---|---|---|---|---|---|---|---|---|
| P1 | 500 | 3 | 0.064 | 0.062 | 0.067 | 0.065 | 0.070 | 0.051 | 0.063 | 0.054 |
| P1 | 500 | 5 | 0.067 | 0.066 | 0.071 | 0.072 | 0.079 | 0.061 | 0.081 | 0.061 |
| P1 | 800 | 3 | 0.069 | 0.060 | 0.062 | 0.058 | 0.072 | 0.051 | 0.070 | 0.045 |
| P1 | 800 | 5 | 0.073 | 0.060 | 0.060 | 0.058 | 0.066 | 0.045 | 0.063 | 0.043 |
| P2 | 500 | 3 | 0.771 | 0.934 | 0.876 | 0.968 | 0.814 | 0.959 | 0.915 | 0.983 |
| P2 | 500 | 5 | 0.746 | 0.921 | 0.843 | 0.960 | 0.800 | 0.941 | 0.891 | 0.974 |
| P2 | 800 | 3 | 0.897 | 0.974 | 0.959 | 0.996 | 0.933 | 0.991 | 0.980 | 0.999 |
| P2 | 800 | 5 | 0.886 | 0.982 | 0.946 | 0.990 | 0.911 | 0.991 | 0.964 | 0.997 |
| P3 | 500 | 3 | 0.912 | 0.982 | 0.966 | 0.995 | 0.945 | 0.992 | 0.984 | 0.999 |
| P3 | 500 | 5 | 0.907 | 0.977 | 0.955 | 0.989 | 0.943 | 0.990 | 0.976 | 0.997 |
| P3 | 800 | 3 | 0.974 | 0.998 | 0.992 | 1.000 | 0.986 | 0.998 | 0.999 | 1.000 |
| P3 | 800 | 5 | 0.983 | 0.999 | 0.998 | 1.000 | 0.994 | 0.999 | 1.000 | 1.000 |

Testing H20

| Model | n | M | Without Az, BAND1: T2a | Without Az, BAND1: T2m | Without Az, BAND2: T2a | Without Az, BAND2: T2m | With Az, BAND1: T2a | With Az, BAND1: T2m | With Az, BAND2: T2a | With Az, BAND2: T2m |
|---|---|---|---|---|---|---|---|---|---|---|
| P3 | 500 | 3 | 0.070 | 0.049 | 0.055 | 0.047 | 0.079 | 0.055 | 0.062 | 0.053 |
| P3 | 500 | 5 | 0.056 | 0.063 | 0.050 | 0.059 | 0.070 | 0.069 | 0.062 | 0.069 |
| P3 | 800 | 3 | 0.066 | 0.060 | 0.069 | 0.054 | 0.075 | 0.062 | 0.083 | 0.058 |
| P3 | 800 | 5 | 0.066 | 0.061 | 0.056 | 0.059 | 0.073 | 0.066 | 0.069 | 0.063 |
| P4 | 500 | 3 | 0.757 | 0.883 | 0.839 | 0.942 | 0.779 | 0.891 | 0.867 | 0.953 |
| P4 | 500 | 5 | 0.755 | 0.909 | 0.868 | 0.964 | 0.772 | 0.918 | 0.896 | 0.970 |
| P4 | 800 | 3 | 0.894 | 0.973 | 0.959 | 0.990 | 0.904 | 0.980 | 0.967 | 0.993 |
| P4 | 800 | 5 | 0.871 | 0.970 | 0.955 | 0.991 | 0.891 | 0.975 | 0.966 | 0.992 |
| P5 | 500 | 3 | 0.953 | 0.990 | 0.989 | 0.999 | 0.996 | 0.998 | 0.993 | 0.999 |
| P5 | 500 | 5 | 0.946 | 0.988 | 0.987 | 0.998 | 0.999 | 1.000 | 0.993 | 0.999 |
| P5 | 800 | 3 | 0.991 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| P5 | 800 | 5 | 0.995 | 0.999 | 0.999 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000 |
5. Dengue Vaccine Efficacy Trial Analysis
The CYD14 cohort for data analysis consists of all participants who attended the Month 13 study visit without previously experiencing the dengue disease primary endpoint, comprising 6639 vaccine recipients and 3220 placebo recipients. Of these, 116 vaccine recipients and 129 placebo recipients experienced the dengue endpoint by Month 25, constituting an estimated 56.5% reduction in the hazard of dengue disease between Months 13 and 25 in the vaccine group relative to the placebo group (Capeding et al., 2014). The percentage of right-censoring by Month 25 was 98.3%. An important scientific question is how natural and vaccine-induced immunity work in preventing dengue disease. Neutralizing antibodies, which are present in many placebo recipients (caused by prior dengue exposures and infections) and are boosted/increased in many vaccine recipients (caused by dengue vaccination), are generally believed to be important for both natural and vaccine-induced protection (Moodie et al., 2018). In this section, we apply the developed methods to analyze the CYD14 data with the objective of understanding, for each of the placebo and vaccine groups, the association of Month 13 neutralizing antibody levels (“NAb titer”) with subsequent occurrence of the dengue disease primary endpoint through Month 25, and whether and how the associations depend on dengue amino acid sequence. The NAb titer marker is the average of an individual’s log-base-10 50% neutralization titer to each of the four dengue strains in the vaccine (one strain for each dengue serotype), where the 50% neutralization titer quantifies the ability of antibodies in an individual’s blood sample to kill a given dengue strain (defined in detail in Moodie et al., 2018). The analyses by treatment group can be interpreted as assessing NAb titer as a marker of different kinds of acquired protection/disease resistance: for placebo recipients, naturally-acquired resistance, and for vaccine recipients, a combination of naturally- and vaccine-acquired resistance.
This integrated analysis of host and pathogen data types would increase knowledge of NAb titer as a correlate of risk of dengue disease, with many applications including aiding refinement of models for bridging vaccine efficacy to new settings not studied in CYD14.
As summarized in the Introduction, NAb titer was measured from Month 13 blood samples from a subset of participants selected through a case-cohort sampling design. With controls defined as participants reaching the Month 25 visit never experiencing the dengue disease study endpoint, the NAb titer marker was measured from n=1879 controls (1275 vaccine, 604 placebo), and from all n=245 cases (116 vaccine, 129 placebo).
From blood samples drawn at dengue disease failure event times, dengue virus nucleotide sequences of the complete antigen-coding region of the dengue genome represented in the CYD-TDV vaccine (prM/E) were measured using 454 sequencing (Rabaa et al., 2017). The prM/E dengue genome (1,985 base pairs for serotypes 1, 2, 4 and 1,979 base pairs for serotype 3) was sequenced and translated to 661 amino acid positions (659 for serotype 3). The amino acid sequences were multiply aligned with the four vaccine strain sequences. A subset of 65 of the prM/E amino acid positions have been documented to be “NAb contact sites”, defined as positions on the outer surface of dengue that have been documented to interact with anti-dengue neutralizing antibodies. Because sequence variation in these contact sites was hypothesized to be especially relevant for potential protection against dengue disease, we studied the mark V defined as the Hamming distance based on these NAb contact sites. Specifically, the distance V, labeled “Hamming distances: NAb contact sites”, is the percent amino acid mismatch in the 65 NAb contact sites between the dengue sequence from a given disease case and the closest dengue sequence among the four vaccine strain sequences. The mark V was measured from 76 (66%) of the 116 vaccine recipient cases and from 84 (65%) of the 129 placebo recipient cases.
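The mark definition can be sketched as follows: compute the fraction of mismatched residues at the contact sites against each vaccine strain, then take the minimum. The toy sequences, the site indices and the function names are hypothetical, and the sketch assumes that the sequences have already been aligned to a common coordinate system; handling of alignment gaps is not addressed.

```python
def mismatch_fraction(seq, ref, sites):
    """Fraction of the given (0-based) aligned positions at which seq differs from ref."""
    return sum(seq[i] != ref[i] for i in sites) / len(sites)

def mark_distance(case_seq, vaccine_seqs, sites):
    """Mismatch fraction to the closest of the vaccine strain sequences."""
    return min(mismatch_fraction(case_seq, ref, sites) for ref in vaccine_seqs)

# Toy aligned sequences of length 10 with four "contact sites" (illustration only).
contact_sites = [1, 3, 4, 8]
vaccine_strains = ["ACDEFGHIKL", "ACDQFGHIKM", "TCDEYGHIKL", "ACDEFGHIRL"]
case_sequence = "ACDEYGHIRL"

per_strain = [mismatch_fraction(case_sequence, ref, contact_sites) for ref in vaccine_strains]
v_mark = mark_distance(case_sequence, vaccine_strains, contact_sites)
```

In this toy example the case sequence differs from the four strains at 2, 3, 1 and 1 of the four sites, so the mark is 0.25.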
Vaccine recipients exposed to dengue sequences with short distances to the vaccine may be more likely to be protected by antibodies than vaccine recipients exposed to dengue sequences with large distances. Therefore, if NAb titer is important for protection, its inverse correlation with dengue disease risk would be expected to be strongest against dengue viruses with small distances and be weakest or non-existent against dengue sequences with large distances. The results on these hypotheses may provide insights into how the vaccine partially worked and thereby guide next steps of vaccine research. In addition, the same analysis in placebo recipients aids understanding of how naturally acquired NAb titers associate with sequence-specific dengue risk.
Let T be the time between the Month 13 visit until diagnosis of dengue disease through Month 25. We consider the following mark-specific PH model:
λ(t, v | z) = λ0(t, v) exp{β1(v)Age + β2(v)NAb},    (11)
with K = 1 baseline stratum, where NAb is the Month 13 NAb titer and Age is age at enrollment. Age is a phase-one variable while NAb is a phase-two variable. For both the vaccine and placebo groups, NAb titer is observed for all the cases but missing for 80.5% of the non-cases.
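The stated missingness percentage can be checked from the counts reported earlier in this section. The sketch assumes that the non-cases in each arm are the analysis cohort minus the cases, and that the sampled controls are the only non-cases with NAb titer measured.

```python
# Counts taken from the text: cohort size, cases, and sampled controls by arm.
cohort = {"vaccine": 6639, "placebo": 3220}
cases = {"vaccine": 116, "placebo": 129}
sampled_controls = {"vaccine": 1275, "placebo": 604}

percent_missing = {}
design_weight = {}
for arm in cohort:
    non_cases = cohort[arm] - cases[arm]            # assumed denominator
    observed_fraction = sampled_controls[arm] / non_cases
    percent_missing[arm] = round(100.0 * (1.0 - observed_fraction), 1)
    design_weight[arm] = 1.0 / observed_fraction    # crude inverse sampling fraction
```

Both arms give 80.5% missing, and each sampled non-case stands in for roughly five non-cases under these crude design weights.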
We implemented the proposed estimation and testing procedures described in Sections 2 and 3. We estimated the probability of observing the NAb titer marker with a logistic regression model, with logit(P(ξz = 1|Ω)) a linear function of (1, Age, Sex). To implement the AIPW method, we used linear models for E(NAb|Ω) and E(exp(β2(v)NAb⊗j)|Ω) for j = 0, 1, 2, with predictors (1, Age, Sex). For each case i with missing mark Vi, we used M = 5 imputed marks from the 5-nearest neighborhoods calculated using z-scores of the event time, Age, and NAb titer from all cases j (with δ1j = 1).
Due to the very large phase-one sample and the low event rate, we used the bandwidth h = Cσ̂v no^(−1/3), where σ̂v is the estimated standard error of the observed marks and no is the number of cases. The standard deviation of the observed mark “Hamming distances: NAb contact sites” is 0.0225 for the vaccine group and 0.0243 for the placebo group, resulting in bandwidths h = 0.023 and h = 0.024, respectively.
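The reported bandwidths can be checked against a rule of the form h = Cσ̂ no^(−1/3). Both the functional form and the constant C = 5 are assumptions made here for illustration; they are chosen because they reproduce the reported values from the stated standard deviations and case counts, not because the text specifies them.

```python
def bandwidth(sigma, n_events, C):
    """Bandwidth rule h = C * sigma * n_events^(-1/3) (assumed form)."""
    return C * sigma * n_events ** (-1.0 / 3.0)

# Standard deviations and numbers of cases as reported in the text; C = 5 is assumed.
h_vaccine = bandwidth(sigma=0.0225, n_events=116, C=5)
h_placebo = bandwidth(sigma=0.0243, n_events=129, C=5)
```

Rounded to three decimals, these give 0.023 and 0.024, matching the bandwidths used for the vaccine and placebo groups.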
Figure 4 shows point and 95% confidence interval estimates of β1(v) for Age and β2(v) for NAb titer, by treatment arm. Greater age is associated with lower risk of dengue disease for the vaccine group and apparently not for the placebo group, and the associations do not appear to depend on the mark. NAb titer is strongly inversely associated with risk of dengue disease in the vaccine group, with stronger association for dengue viruses closest to the vaccine strains. In the placebo group the results suggest a weak inverse association of NAb titer with dengue disease, only for dengue viruses close to the vaccine strains.
Figure 4.

Estimation of the associations of age and NAb titer with the mark-specific hazard of the dengue disease endpoint with mark “Hamming distances: NAb contact sites” using M = 5 imputations from the 5-nearest neighborhoods of missing marks, with bandwidth h = 0.023 for the vaccine group and h = 0.024 for the placebo group. The 5-nearest neighborhoods are calculated using Euclidean distance and z-scores of the event time, Age, and NAb titer values of cases (with δ1j = 1). Plots of the estimated log mark-specific hazard ratios β1(v) for Age and β2(v) for NAb titer are given for the vaccine and placebo groups in the first and second columns, respectively.
Augmenting the results from Figure 4, Figure 5 shows the estimated conditional mark-specific cumulative incidence function rate fv(τ, v|z) at Month τ = 25 for the 10th, 50th and 90th percentiles of the NAb titer marker and at the average age of 8.35 years, by treatment arm. The figure shows that the estimated fv(τ = 25, v|z) is highest at the 10th percentile of NAb titer and lowest at the 90th percentile. Figure 5 also shows the ratios of the estimated fv(τ = 25, v|z) for the 10th vs. 90th percentiles and the 50th vs. 90th percentiles of NAb titer at the average age. For the vaccine group, the ratio for the 10th vs. 90th percentile is almost twice the ratio for the 50th vs. 90th percentile for mark values v < 0.021. Such differences in estimated fv(τ = 25, v|z) are not observed for the placebo group.
Figure 5.

Estimation of the conditional mark-specific cumulative incidence function rate fv(τ, v|z) at Month τ = 25 and the ratios with the mark calculated as “Hamming distances: NAb contact sites” using M = 5 imputations from the 5-nearest neighborhoods of missing marks, with bandwidth h = 0.023 for the vaccine group and h = 0.024 for the placebo group. The 5-nearest neighborhoods are calculated using Euclidean distance and z-scores of the event time, Age, and NAb titer values of cases (with δ1j = 1). The first row shows estimated fv(τ = 25, v|z) at the 10th, 50th and 90th percentiles of NAb titer at the average age of 8.35 years by treatment group. The second row shows the ratios at the 10th vs. 90th and the 50th vs. 90th percentiles.
Table 2 presents the results of the hypothesis tests of H10 and H20 for β1(v) and β2(v) for the vaccine group and the placebo group. The p-values are calculated using G = 4 grid points with v1 = 0.01, v2 = 0.03, v3 = 0.05, v4 = 0.07. The results support that the risk of dengue disease decreases as the NAb titer increases for the vaccine group, but not for the placebo group. Older children are at lower risk of dengue disease for both treatment groups, but more significantly for the vaccine group. Based on the T2m test, there is statistically significant evidence that the magnitude of the mark-specific association parameter β2(v) for NAb titer decreases with increasing mark values for both treatment groups.
Table 2.
Results of hypothesis tests of H10 and H20 for the CYD14 trial.
| Group | NAb titer, H10: T1a | NAb titer, H10: T1m | NAb titer, H20: T2a | NAb titer, H20: T2m | Age, H10: T1a | Age, H10: T1m | Age, H20: T2a | Age, H20: T2m |
|---|---|---|---|---|---|---|---|---|
| Vaccine | < 0.001 | < 0.001 | 0.069 | 0.005 | < 0.001 | < 0.001 | 0.857 | 0.762 |
| Placebo | 0.116 | 0.393 | 0.070 | 0.012 | 0.170 | 0.016 | 0.383 | 0.961 |
The analysis presented in this manuscript imputes missing dengue sequence distances from subjects with similar event times, ages, and NAb titer in a neighborhood. In Web Appendix C, we present the results of the data analysis using the alternative hotdeck imputations implemented by Juraska et al. (2018), which were obtained using information on study site and local clinic, as well as on dengue genotype and serotype. Similar results are obtained but with slightly weaker evidence that the magnitude of β2(v) decreases with increasing mark values in the vaccine group.
Because the CYD-TDV vaccine is licensed for children 9 years of age or older, we repeated the analyses restricting to 9–14 year-olds (Web Appendix D). For the vaccine group, the results for inference on β2(v) with covariate NAb titer are similar to those for all ages 2–14 (Web Appendix D Table 3, Figure 8, Figure 9). However, for the placebo group, the analysis restricted to 9–14 year-olds supports an inverse correlation of NAb titer with dengue disease for low dengue mark values, whereas the analysis of 2–14 year-olds did not suggest a correlation for any mark values.
6. Concluding Remarks
Motivated by the CYD14 dengue vaccine efficacy trial, this article developed estimation and hypothesis testing procedures for β(v) in model (1) under two-phase sampling of some covariates and with missing marks for some individuals with the failure event. We investigated two hybrid approaches that utilize nonparametric NNHD multiple imputations to impute missing marks of observed failures, followed by application of the AIPW technique to the completed-marks case-cohort sampled data sets. The two hybrid methods differ in how the imputed marks are pooled. Our simulations show that the hybrid-Rubin and the hybrid-MIEE estimators have similar performances in estimation.
We consider hot deck imputations of missing marks from donors with similar characteristics among the observed failures. The implementation of the NNHD depends on the choice of metric and the variables included for the neighborhood selection. Imputation based on a subset of (Tkj, Zkj) can lead to biased estimation. Our L-nearest neighborhood imputations are carried out based on z-scores of the (Tkj, Zkj) for cases and with the Euclidean metric. By using the z-scores of the variables, we eliminate the effects of scales or units of the variables on the nearest neighbor selections. Hsu and Yu (2019) recently studied the Cox model with missing covariates using the nonparametric multiple imputation approach with the neighborhood selected based on the predictive scores of two working regression models. We conducted a limited simulation study and found no advantages of the predictive score approach for the neighborhood selection.
Achieving consistent variance estimation in the presence of imputed data remains a challenge. Rubin’s rule (Rubin, 1987) for adjusting for multiple imputation has been widely used in practice. Other methods for estimating variances have been investigated, but few are rigorously justified; see, for example, Kovar and Chen (1994); Lee et al. (1994); Rancourt et al. (1994); Lee et al. (1995); Montaquila and Jernigan (1997). Chen and Shao (2000) investigated the theoretical properties of the NNHD imputation method and showed that the NNHD method provides asymptotically unbiased estimators for population means, quantiles and univariate distributions. They also derived consistent variance estimators of the NNHD estimators. The proposed hybrid-Rubin and hybrid-MIEE estimators for the mark-specific proportional hazards model (1) work very well, with small biases in the many different models we examined. However, finding consistent variance estimators is very challenging for the NNHD imputation of missing marks under two-phase sampling of covariates. We adopted Rubin’s rule for the variance estimators, which seems to slightly underestimate the true variances in some situations. The underestimated variances also lead to slightly inflated observed sizes for the proposed tests. Further investigation of variance estimation is needed.
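For readers unfamiliar with Rubin's rule, a minimal scalar version is sketched below: the pooled estimate is the average of the M completed-data estimates, and the total variance is the average within-imputation variance plus (1 + 1/M) times the between-imputation variance. The inputs are hypothetical, and how the rule is applied pointwise in v within the hybrid procedures is not shown here.

```python
from statistics import mean, variance

def rubin_pool(estimates, variances):
    """Pool M completed-data estimates and their variances by Rubin's rule."""
    M = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (divisor M - 1)
    total = within + (1.0 + 1.0 / M) * between
    return pooled, total

# Hypothetical estimates of one coefficient from M = 5 imputed data sets.
pooled_estimate, total_variance = rubin_pool(
    estimates=[-0.9, -1.0, -1.1, -1.0, -1.0],
    variances=[0.04, 0.04, 0.04, 0.04, 0.04],
)
```

In this example the between-imputation variance is 0.005, so the total variance is 0.04 + 1.2 × 0.005 = 0.046.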
For the analysis of the CYD14 efficacy trial, model (11) assumes that the mark-specific log-hazard ratio for Age is the same for every unit increase in Age and similarly for the mark-specific log-hazard ratio for NAb. However, the model assumptions may fail and thus model checking is an important problem. Sun et al. (2016) proposed a goodness-of-fit test procedure for the stratified mark-specific proportional hazards model (1) when covariates are observed and there are no missing marks. Developing the goodness-of-fit test procedure for model (1) with missing data is a project meriting future research.
The main manuscript presents the analysis of model (11) for children of all ages. However, the mark-specific effects β(v) may be different for different age groups. In Web Appendix D of the Supplementary Material, we conducted separate analyses for children in two different age groups: 2–8 and 9–14 year-olds. The additional analyses provide some insights on whether the effects of Age and NAb titer on the mark-specific risk of the dengue disease are different for different age groups.
Web Appendix E of the Supplementary Material also includes the analyses using the hotdeck imputations implemented in Juraska et al. (2018) that defined the neighborhood based on biological and geographic information, i.e., dengue genotype, serotype, study site and local clinic, and Juraska et al. validated that these hotdeck imputations were highly accurate. These hotdeck imputations are scientifically based and more robust to model misspecifications, while the other hotdeck imputations approach that we studied in this manuscript exploits the link between the failure time data and observed marks specified by the mark-specific proportional hazards model, which can improve power but at the expense of being less robust to model misspecifications. Further research is warranted to investigate the neighborhood selections and their impacts.
Acknowledgements
The authors thank the participants and investigators of the CYD14 trial. This research was partially supported by NIAID NIH award number R37AI054165, and by a contract from the CYD14 study sponsor Sanofi Pasteur. Dr. Sun’s research was partially supported by the National Science Foundation grants DMS1513072 and DMS1915829. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
Footnotes
Supplementary Materials
Web Appendices A, B, C, D and E referenced in this article are given in the Supplementary Material available at the journal’s website. The MATLAB code and instructions for carrying out the analysis of a simulated data set presented in Section 3.3 of the Web-based Supplementary Material are also available from the website.
Contributor Information
Yanqing Sun, University of North Carolina at Charlotte, Charlotte, U.S.A.
Li Qi, Sanofi, Bridgewater, U.S.A.
Fei Heng, University of North Florida, Jacksonville, U.S.A.
Peter B. Gilbert, University of Washington and Fred Hutchinson Cancer Research Center, Seattle, U.S.A.
References
- Aalen OO and Johansen S (1978) An empirical transition matrix for non-homogeneous markov chains based on censored observations. Scandinavian Journal of Statistics, 5, 141–150. [Google Scholar]
- Altman NS (1992) An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46, 175–185. [Google Scholar]
- Andridge RR and Little RJA (2010) A review of hot deck imputation for survey non-response. International Statistical Review, 78, 40–64. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Beretta L and Santaniello A (2016) Nearest neighbor imputation algorithms: a critical evaluation. BMC medical informatics and decision making, 16, 197–208. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Borgan O, Langholz B, Samuelsen SO, Goldstein L and Pogoda J (2000) Exposure stratified case-cohort designs. Lifetime data analysis, 6, 39–58. [DOI] [PubMed] [Google Scholar]
- Breslow NE and Lumley T (2013) Semiparametric models and two-phase samples: Applications to Cox regression, vol. Volume 9 of Collections, 65–77. Beachwood, Ohio, USA: Institute of Mathematical Statistics. [Google Scholar]
- Breslow NE, Lumley T, Ballantyne CM, Chambless L and Kulich M (2009) Improved horvitz-thompson estimation of model parameters from two-phase stratified samples: Applications in epidemiology. Statistics in Biosciences, 1, 32–49. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Capeding M, Tran N, Hadinegoro S, Ismail H, Chotpitayasunondh T, Chua M, Luong C, Rusmil K, Wirawan D, Nallusamy R, Pitisuttithum P, Thisyakorn U, Yoon I, van der Vliet D, Langevin E, Laot T, Hutagalung Y, Frago C, Boaz M, Wartel T, Tornieporth N, Saville M, Bouckenooghe A and CYD14 Study Group. (2014) Clinical efficacy and safety of a novel tetravalent dengue vaccine in healthy children in asia: a phase 3, randomised, observer-masked, placebo-controlled trial. Lancet, 384, 1358–65. [DOI] [PubMed] [Google Scholar]
- Carroll RJ, Spiegelman CH, Lan KKG, Bailey KT and Abbott RD (1984) On errors-in-variables for binary regression models. Biometrika, 71, 19–25. [Google Scholar]
- Chen J and Shao J (2000) Nearest neighbor imputation for survey data. J. Official Stat, 16, 113–141. [Google Scholar]
- Gao G and Tsiatis AA (2005) Semiparametric estimators for the regression coefficients in the linear transformation competing risks model with missing cause of failure. Biometrika, 92, 875–891. [Google Scholar]
- Gilbert P, McKeague I and Sun Y (2004) Tests for comparing mark-specific hazards and cumulative incidence functions. Lifetime Data Analysis, 10, 5–28.
- Gilbert P and Sun Y (2015) Inferences on relative failure rates in stratified mark-specific proportional hazards models with missing marks, with application to human immunodeficiency virus vaccine efficacy trials. Journal of the Royal Statistical Society: Series C (Applied Statistics), 64, 49–73.
- Hsu C-H and Yu M (2019) Cox regression analysis with missing covariates via nonparametric multiple imputation. Statistical Methods in Medical Research.
- Jonsson P and Wohlin C (2004) An evaluation of k-nearest neighbour imputation using Likert data. In 10th International Symposium on Software Metrics, 2004. Proceedings, 108–118. IEEE.
- Juraska M and Gilbert P (2013) Mark-specific hazard ratio model with multivariate continuous marks: An application to vaccine efficacy. Biometrics, 69, 328–337.
- Juraska M and Gilbert P (2016) Mark-specific hazard ratio model with missing multivariate marks. Lifetime Data Analysis, 22, 606–625.
- Juraska M, Magaret C, Shao J, Carpp L, Fiore-Gartland A, Benkeser D, Girerd-Chambaz Y, Langevin E, Frago C, Guy B, Jackson N, Duong T, Simmons C, Edlefsen P and Gilbert P (2018) Viral genetic diversity and protective efficacy of a tetravalent dengue vaccine in two phase 3 trials. Proceedings of the National Academy of Sciences, 115, E8378–E838.
- Kovar J, Whitridge P and MacMillan J (1988) Generalized edit and imputation system for economic surveys at Statistics Canada. In Proceedings of the Survey Research Methods Section, American Statistical Association, 627–630.
- Kovar JG and Chen EJ (1994) Jackknife variance estimation of imputed survey data. Survey Methodology, 20, 45–52.
- Kulich M and Lin D (2004) Improving the efficiency of relative-risk estimation in case-cohort studies. Journal of the American Statistical Association, 99, 832–844.
- Lee H, Rancourt E and Särndal C (1995) Variance estimation in the presence of imputed data for the generalized estimation system. In Proceedings of the Section on Survey Research Methods, vol. 1, 384–389. American Statistical Association, USA.
- Lee H, Rancourt E and Särndal CE (1994) Experiments with variance estimation from survey data with imputed values. Journal of Official Statistics, 10, 231–243.
- Li K-C (1984) Consistency for cross-validated nearest neighbor estimates in nonparametric regression. The Annals of Statistics, 12, 230–240.
- Liang K-Y and Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika, 73, 13–22.
- Little RJA (1988) Missing-data adjustments in large surveys. Journal of Business & Economic Statistics, 6, 287–296.
- Montaquila J and Jernigan R (1997) Variance estimation in the presence of imputed data. In Proceedings of the Survey Research Methods Section of the American Statistical Association, 273–277.
- Moodie Z, Juraska M, Huang Y, Zhuang Y, Fong Y, Carpp L, Self S, Chambonneau L, Small R, Jackson N, Noriega F and Gilbert P (2018) Neutralizing antibody correlates analysis of tetravalent dengue vaccine efficacy trials in Asia and Latin America. Journal of Infectious Diseases, 217(5), 742–753.
- Nan B (2004) Efficient estimation for case-cohort studies. Canadian Journal of Statistics, 32, 403–419.
- Prentice RL (1986) A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika, 73, 1–11.
- Rabaa MA, Girerd-Chambaz Y, Duong Thi Hue K, Vu Tuan T, Wills B, Bonaparte M, van der Vliet D, Langevin E, Cortes M, Zambrano B, Dunod C, Wartel-Tram A, Jackson N and Simmons CP (2017) Genetic epidemiology of dengue viruses in phase III trials of the CYD tetravalent dengue vaccine and implications for efficacy. eLife, 6.
- Rancourt E, Särndal C and Lee H (1994) Estimation of the variance in the presence of nearest neighbor imputation. In Proceedings of the Section on Survey Research Methods, 888–893. American Statistical Association.
- Reilly M (1993) Data analysis using hot deck multiple imputation. Journal of the Royal Statistical Society, Series D: The Statistician, 42, 307–313.
- Robins J, Rotnitzky A and Zhao L (1994) Estimation of regression-coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89, 846–866.
- Rubin D (1987) Multiple Imputation for Nonresponse in Surveys. Wiley, New York.
- Scheike TH and Martinussen T (2004) Maximum likelihood estimation for Cox’s regression model under case–cohort sampling. Scandinavian Journal of Statistics, 31, 283–293.
- Sedransk J (1985) The objective and practice of imputation. In Proceedings of the First Annual Research Conference, U.S. Bureau of the Census, Washington D.C., 445–452.
- Stone CJ (1977) Consistent nonparametric regression. The Annals of Statistics, 5, 595–620.
- Sun Y and Gilbert P (2012) Estimation of stratified mark-specific proportional hazards models with missing marks. Scandinavian Journal of Statistics, 39, 34–52.
- Sun Y, Gilbert P and McKeague I (2009) Proportional hazards models with continuous marks. Annals of Statistics, 37, 394–426.
- Sun Y, Li M and Gilbert P (2016) Goodness-of-fit test of the stratified mark-specific proportional hazards model with continuous mark. Computational Statistics and Data Analysis, 93, 348–358.
- Sun Y, Qi L, Yang G and Gilbert P (2018) Hypothesis tests for stratified mark-specific proportional hazards models with missing covariates, with application to HIV vaccine efficacy trials. Biometrical Journal, 60(3), 516–536.
- White JE (1982) A two stage design for the study of the relationship between a rare exposure and a rare disease. American Journal of Epidemiology, 115, 119–128.
- Yang G, Sun Y, Qi L and Gilbert P (2017) Estimation of stratified mark-specific proportional hazards models under two-phase sampling with application to HIV vaccine efficacy trials. Statistics in Biosciences, 9, 259–283.
