Biostatistics (Oxford, England). 2020 Sep 25;23(2):541–557. doi: 10.1093/biostatistics/kxaa040

Evaluation of treatment effect modification by biomarkers measured pre- and post-randomization in the presence of non-monotone missingness

Yingying Zhuang 1, Ying Huang 1, Peter B Gilbert 1
PMCID: PMC9216643  PMID: 32978622

Summary

In vaccine studies, an important research question is to study effect modification of clinical treatment efficacy by intermediate biomarker-based principal strata. In settings where participants entering a trial may have prior exposure and therefore variable baseline biomarker values, clinical treatment efficacy may further depend jointly on a biomarker measured at baseline and measured at a fixed time after vaccination. This makes it important to conduct a bivariate effect modification analysis by both the intermediate biomarker-based principal strata and the baseline biomarker values. Existing research allows this assessment if the sampling of baseline and intermediate biomarkers follows a monotone pattern, i.e., if participants who have the biomarker measured post-randomization would also have the biomarker measured at baseline. However, additional complications in study design could happen in practice. For example, in a dengue correlates study, baseline biomarker values were only available from a fraction of participants who have biomarkers measured post-randomization. How to conduct the bivariate effect modification analysis in these studies remains an open research question. In this article, we propose approaches for bivariate effect modification analysis in the complicated sampling design based on an estimated likelihood framework. We demonstrate advantages of the proposed method over existing methods through numerical studies and illustrate our method with data sets from two phase 3 dengue vaccine efficacy trials.

Keywords: Clinical trial, Dengue vaccine, Estimated likelihood, Non-monotone missingness, Principal stratification

1. Introduction

An important problem in many biomedical research fields is the evaluation of inexpensive surrogate endpoints based on the degree of their utility for treatment development. As a motivating example, a research goal of placebo-controlled preventive dengue vaccine efficacy (VE) trials is the evaluation of immunological markers in vaccine recipients as modifiers of VE against dengue disease, for which we utilize a principal stratification framework as described below.

1.1. Principal stratification

Causal inference for randomized experiments often uses the potential outcome framework originally introduced by Neyman (1923), in which each possible treatment could potentially be applied to each subject, and the potential outcomes are the outcomes that would be observed if each treatment were applied to each subject. With precise notation defined in Section 2, we consider a two-arm randomized trial where subjects are randomly assigned to either an active treatment (Inline graphic) or a placebo (Inline graphic). Let Inline graphic be the potential outcome of Inline graphic if assigned to treatment Inline graphic for Inline graphic. Then a causal effect of treatment on the outcome Inline graphic is defined to be a comparison between Inline graphic and Inline graphic of the same group of subjects. However, the immune response marker endpoints (Inline graphic) are usually measured at a fixed time Inline graphic post-randomization; if we let Inline graphic be the indicator of whether disease develops before Inline graphic, then Inline graphic is only measurable if Inline graphic. To study the causal effect of treatment on Inline graphic conditional on Inline graphic, standard methods using a comparison between the probabilities Inline graphic and Inline graphic based on observed Inline graphic data may not yield correct causal inference due to post-randomization selection bias. Specifically, the comparison is not made between the same groups of subjects because the group with post-randomization biomarker value Inline graphic under active treatment is not the same as the group with post-randomization biomarker value Inline graphic under placebo, thus violating the definition of causal effect. To remedy this problem, Frangakis and Rubin (2002) proposed a principal stratification framework for comparing treatments where the estimands are adjusted for post-randomization variables and yet are always causal effects. 
A principal stratification with respect to Inline graphic is a cross-classification of subjects defined by the potential outcome pair Inline graphic, where Inline graphic is the potential biomarker value if receiving treatment Inline graphic. Because Inline graphic is not affected by the treatment, it can be treated as a baseline covariate, and comparisons of treatment effect within principal strata are always causal effects.

1.2. Principal stratification effect modification analysis

Within the principal stratification framework and motivated by randomized placebo-controlled VE trials, Gilbert and Hudgens (2008), henceforth referred to as GH, proposed a clinically relevant causal estimand for evaluating a candidate surrogate biomarker, called the Causal Effect Predictiveness (CEP) surface: Inline graphic, where Inline graphic for Inline graphic and Inline graphic is a known contrast function satisfying Inline graphic if and only if Inline graphic. The CEP surface assesses whether and how treatment efficacy varies by subgroups defined by principal strata that are characterized by Inline graphic. Therefore, the analysis is essentially effect modification analysis, and it provides a way to rank biomarker endpoints by their strength of effect modification. The marginal CEP curve causal effect is also valuable for studying biomarkers as effect modifiers. It contrasts the risks averaged over the distribution of Inline graphic:

graphic file with name Equation1.gif

where Inline graphic. Unfortunately, if no further assumptions are made, the marginal CEP curve cannot be identified by the observed data because of the missing potential biomarker outcomes that define the principal strata. Follmann (2006) proposed two augmented trial designs to address this problem: baseline immunogenicity predictors (BIP) and closeout placebo vaccination (CPV). The BIP design uses baseline predictor(s) to infer the unobserved potential biomarker values, while the CPV design vaccinates placebo recipients who stay free of the clinical endpoint at the end of the standard trial follow-up and measures their immune response values, which are then used in place of their potential biomarker values if receiving vaccine. We call the BIP design with no CPV component the BIP-only design and the BIP design with a nonzero CPV component the BIP+CPV design.

Examples of good baseline predictors are baseline biomarker measurements in vaccine trials where participants may have prior exposure to the disease-causing pathogen and biomarker measurements at baseline reflect natural immunity arising from the pretrial exposure. In some settings, baseline predictors may further modify the marginal CEP curve such that the clinical risks under each treatment assignment may depend on both those baseline predictors and the intermediate biomarker-based principal strata. This makes it interesting to estimate and compare the marginal CEP curve in subgroups characterized by baseline biomarker measurements, which is essentially bivariate treatment effect modification analysis by the intermediate biomarker-based principal strata and baseline predictors, and is the focus of this article.

1.3. Motivating example and existing methods

In our motivating dengue vaccine trials, a critical issue is whether and how the efficacy of a vaccine depends on whether a vaccinated individual was previously infected with dengue prior to vaccination (i.e., is dengue-naive or pre-immune) (Capeding and others, 2014; Villar and others, 2015). Similarly, an important issue is whether and how the association of a biomarker response with VE depends on previous infection with dengue. These issues were present in the CYD14 (Capeding and others, 2014) and CYD15 (Villar and others, 2015) phase 3 trials of the CYD-TDV vaccine that constituted the primary basis for licensure of this vaccine, where the biomarker of interest is neutralizing antibody titer to the vaccine measured post-vaccination at 13 months and pre-vaccination at baseline. A baseline neutralization value of seronegative [defined as baseline biomarker value equal to or below the lower limit of quantification (LLOQ)] indicates dengue-naive status, whereas a value of seropositive (defined as baseline biomarker value above the LLOQ) indicates dengue pre-immune status. Comparing the marginal CEP curve within the baseline seronegative subgroup to that within the baseline seropositive subgroup could provide important insights. For instance, in general for any two major baseline subgroups, a statistical result that a biomarker response is not associated with VE in one subgroup and is strongly associated with VE in another subgroup may suggest advancing the biomarker as a study endpoint for subsequent phase 1–2 biomarker endpoint vaccine trials for one subgroup but not the other. 
Specifically for the dengue studies, it was previously shown that overall VE was greater in baseline seropositive than baseline seronegative individuals (Sridhar and others, 2018), and the indication for use of the CYD-TDV vaccine is restricted to individuals with evidence of previous dengue infection (i.e., baseline seropositive individuals), making it important to study the marginal CEP curve in each subgroup separately. Moreover, it is biologically relevant to compare the marginal CEP curve between the two subgroups, given that the neutralizing antibody marker has a different meaning for each: for baseline seronegative individuals it is the neutralization marker response generated by vaccination, and for baseline seropositive individuals it is the neutralization marker response generated by the aggregate of vaccination and previous dengue infection(s) prior to vaccination. Comparing the two marginal CEP curves addresses the question of whether a purely vaccine-induced marker response has the same impact on VE as a natural infection plus vaccine-induced marker response, where evidence for a similar curve may help support that the marker contributes to a mechanism of VE.

The majority of previous developments were based on a two-phase sampling setting where there is only missingness with respect to the intermediate potential outcome (Inline graphic) but the baseline predictors are measured in everyone (Gilbert and Hudgens, 2008; Wolfson and Gilbert, 2010; Huang and others, 2013). However, additional complications in study design could happen in practice. For example, in the two phase 3 dengue vaccine trials CYD14 and CYD15, the biomarker values at baseline were only measured in a fraction of those with the biomarker measured at the post-randomization time point such that the missingness with respect to the baseline biomarker and the intermediate biomarker is no longer monotone. Our goal in this manuscript is to propose methods for a bivariate treatment effect modification analysis by the intermediate biomarker-based principal strata and baseline predictors in general settings without requirements of a nested sub-sampling relationship between the intermediate biomarker and the baseline predictors, in other words, in the presence of non-monotone missingness of covariates, applicable to both the BIP-only design and the BIP+CPV design.

2. Methods

Consider a study in which Inline graphic subjects are independently and randomly selected from a given population of interest and are randomly assigned to either placebo or vaccine at baseline (time 0). Let Inline graphic if a subject is randomized to vaccine and Inline graphic if randomized to placebo. Let Inline graphic be a vector of baseline covariates used for modeling disease risk and Inline graphic can be partitioned into two parts, Inline graphic and Inline graphic. Inline graphic denotes baseline covariates recorded for everyone at baseline such as gender and country while Inline graphic denotes baseline covariates that are only available in a subset of the Inline graphic trial participants, such as baseline biomarker measurements. Trial participants are followed for the primary clinical endpoint for a predetermined period of time and let Inline graphic be the indicator of clinical endpoint event during the study follow-up period. At some fixed time, Inline graphic post randomization, an immune response endpoint, Inline graphic, is measured. Because Inline graphic must be measured prior to disease to evaluate its association with the treatment effect, Inline graphic is defined conditional on remaining clinical endpoint free at time Inline graphic (denoted by Inline graphic). If the clinical endpoint occurs in the time interval Inline graphic (denoted by Inline graphic), then Inline graphic is undefined and we set Inline graphic. If a CPV component is incorporated in the trial design, all or a fraction of placebo recipients who remain free of the clinical endpoint at the closeout of the trial are vaccinated and the immune response biomarker Inline graphic is measured at time Inline graphic after vaccination. In addition, we consider cases where Inline graphic and Inline graphic are continuous and subject to the LLOQ left censoring. 
The observable random variable Inline graphic where c is the LLOQ and Inline graphic has a continuous cumulative distribution function (cdf) with Inline graphic. Similar to Inline graphic, Inline graphic where Inline graphic has a continuous cdf with Inline graphic. If Inline graphic denotes the baseline biomarker measurements, then the observable random variable Inline graphic where Inline graphic has a continuous cdf with Inline graphic. Let Inline graphic, Inline graphic, Inline graphic, Inline graphic be the potential outcomes if assigned to treatment Inline graphic, for Inline graphic or Inline graphic. We consider a general sampling framework where baseline covariates Inline graphic and the clinical outcome data Inline graphic and Inline graphic are measured for everyone and the sampling probability of Inline graphic, Inline graphic and Inline graphic can depend on Inline graphic, Inline graphic and/or Inline graphic.
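The left-censoring convention described above can be sketched in a few lines of Python. This is a minimal sketch, assuming that a latent value below the LLOQ is recorded at the LLOQ itself (observed value = maximum of the latent value and c); the function names `censor_at_lloq` and `is_seropositive` and the example values are hypothetical.

```python
from typing import List


def censor_at_lloq(latent_values: List[float], lloq: float) -> List[float]:
    """Return observed values; latent values below the LLOQ are set to the LLOQ.

    Assumption (not confirmed by the text): observed = max(latent, lloq).
    """
    return [max(v, lloq) for v in latent_values]


def is_seropositive(observed: float, lloq: float) -> bool:
    """Seropositive: observed value strictly above the LLOQ (as defined in Section 1.3)."""
    return observed > lloq


latent = [0.4, 1.0, 1.7, 2.3]  # hypothetical latent biomarker values
observed = censor_at_lloq(latent, lloq=1.0)
flags = [is_seropositive(v, 1.0) for v in observed]
```

Under this convention, a censored observation only tells us that the latent value lies at or below c, which is why the continuous cdf of the latent variable enters the likelihood.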

An example of the marginal CEP curve is VE as a function of Inline graphic:

graphic file with name Equation2.gif (2.1)

Inline graphic is a specific form of the marginal CEP curve defined in Section 1.2 with the contrast function Inline graphic being Inline graphic. Inline graphic is the percentage reduction in the risk of the clinical outcome for the subgroup of treatment recipients with immune response Inline graphic compared to if they had not received the treatment. It measures a causal effect and examination of this Inline graphic curve as a function of varying levels of immune response if receiving vaccination provides levels of Inline graphic for a spectrum of subgroups and a ranking of immune response biomarkers by their strength of effect modification. If Inline graphic has larger variability across a range of values for biomarker A than across the same range of values for biomarker B, then biomarker A demonstrates stronger effect modification in Inline graphic and there is potential to achieve a larger VE by increasing the immune response.
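As a minimal illustration of the "percentage reduction in risk" interpretation, the sketch below evaluates VE from a pair of risks. It assumes the contrast function is h(x, y) = 1 - x/y, which matches the verbal description but is not shown explicitly in this extract; the function name and the risk values are hypothetical.

```python
from typing import List


def vaccine_efficacy(risk_vaccine: float, risk_placebo: float) -> float:
    """VE = 1 - risk under vaccine / risk under placebo (assumed contrast function)."""
    if risk_placebo <= 0.0:
        raise ValueError("placebo risk must be positive")
    return 1.0 - risk_vaccine / risk_placebo


# Hypothetical risks for three subgroups defined by increasing immune response.
risks_vaccine: List[float] = [0.12, 0.08, 0.04]
risks_placebo: List[float] = [0.16, 0.16, 0.16]
ve_curve = [vaccine_efficacy(r1, r0) for r1, r0 in zip(risks_vaccine, risks_placebo)]
```

A curve that changes substantially across subgroups, as in this hypothetical example, is what the text describes as strong effect modification.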

In this manuscript, we propose methods to estimate the causal estimand for bivariate treatment effect modification analysis Inline graphic where Inline graphicInline graphic, based on an estimated likelihood approach in the presence of non-monotone missingness with respect to Inline graphic and Inline graphic. Furthermore, in the special case where Inline graphic denotes the baseline biomarker values, we also derive the estimator for the VE curve within the baseline seropositive subgroup,

graphic file with name Equation3.gif (2.2)

and the VE curve within the baseline seronegative subgroup,

graphic file with name Equation4.gif (2.3)

where Inline graphic and Inline graphic.

We make the common assumptions for randomized clinical trials: (A1) Stable Unit Treatment Values and Consistency: Inline graphic for one subject is unaffected by the treatment assignments of other subjects, and given the treatment a subject actually receives, a subject’s potential outcomes equal the observed outcomes; (A2) Ignorable Treatment Assignment; (A3) Equal Early Clinical Risk: Inline graphic. These three assumptions help with identifiability of our estimands. They have been used and discussed in detail in previous literature, e.g., Gilbert and Hudgens (2008); Huang and others (2013). Henceforth, we drop the notation of Inline graphic and tacitly assume all probabilities condition on Inline graphic. Furthermore, we assume the risk functions have a generalized linear model form:

  • (A4) Inline graphic for some pre-specified link function Inline graphic, for Inline graphic.

In order to replace the unobservable Inline graphic among placebo recipients with the CPV measurements Inline graphic, the following two assumptions are made for the BIP+CPV design:

  • (A5) Time constancy of immune response: for placebo recipients who are clinical endpoint free at closeout, Inline graphic, and Inline graphic, for some underlying Inline graphic and i.i.d. measurement errors Inline graphic, Inline graphic that are independent of one another.

  • (A6) No placebo recipients who are clinical endpoint free at closeout experience the clinical endpoint over the next Inline graphic time-units.

Henceforth, we consolidate the notation and let Inline graphic be the potential outcome of Inline graphic under treatment arm Inline graphic, either obtained during the standard trial follow-up for vaccine recipients or replaced by the CPV measurements for placebo recipients who receive vaccination at closeout. We let Inline graphic and Inline graphic be the indicator that Inline graphic is measured. In addition, if Inline graphic denotes the baseline biomarker values, we also replace a missing Inline graphic with Inline graphic if it is available for placebo recipients based on (A3) and the next assumption (A7):

  • (A7) For participants who are clinical endpoint free at time Inline graphic, Inline graphic, and Inline graphic, for some underlying Inline graphic and i.i.d. measurement errors Inline graphic, Inline graphic that are independent of one another.

Henceforth, we use Inline graphic to denote the baseline biomarker values (or its substitute from Inline graphic), subject to the LLOQ: Inline graphic. We let Inline graphic be the indicator that Inline graphic is available. Finally, we state the assumptions about missing at random (MAR) and sampling probability positivity for Inline graphic and Inline graphic, needed to justify the conditional likelihood method and the inverse probability weighting method presented in Section 2.1, respectively:

  • (A8) Missing at random (MAR) and positive sampling probability: First, let Inline graphic indicate the missingness pattern for Inline graphic out of the four different combinations of Inline graphic and Inline graphic. The probability an individual belongs to each individual pattern only depends on the observed variables among Inline graphic. Moreover, we assume Inline graphic for every individual in the vaccine arm (Inline graphic), the sampling probability for Inline graphic only depends on Inline graphic, Inline graphic, and Inline graphic, and Inline graphic for every individual in the trial.

Table 1 shows whether Inline graphic and Inline graphic are available for subject Inline graphic in all possible scenarios. Inline graphic is always missing for placebo recipients in a BIP-only design and is replaced by CPV measurements, if available, in a BIP+CPV design. As Inline graphic and Inline graphic are not observed in every subject, indicators of their availability are denoted with Inline graphic and Inline graphic, respectively, and the observed data are Inline graphic. We assume the Inline graphic’s are independent and identically distributed (i.i.d.). We use the corresponding lowercase letters to denote the values each random variable takes on. Specifically, we use Inline graphic to denote a real value for Inline graphic to emphasize that it refers to the potential outcome Inline graphic if receiving vaccine.

Table 1.

For each trial participant Inline graphic, the table lists the possible scenarios to which the participant could belong for the availability of Inline graphic (the potential biomarker value measured at time Inline graphic if assigned to active treatment arm) and Inline graphic (the biomarker value measured at baseline).

Treatment assignment Inline graphic | Biomarker measured at baseline? | Biomarker measured at time Inline graphic given Inline graphic? | Availability of Inline graphic and Inline graphic
Inline graphic | Yes | Yes | Inline graphic observed; Inline graphic observed
Inline graphic | Yes | No | Inline graphic observed; Inline graphic missing
Inline graphic | No | Yes | Inline graphic missing; Inline graphic observed
Inline graphic | No | No | Inline graphic missing; Inline graphic missing
Inline graphic | Yes | Yes | Inline graphic observed; Inline graphic missing, replaced by CPV measurements if available when Inline graphic
Inline graphic | Yes | No | Inline graphic observed; Inline graphic missing, replaced by CPV measurements if available when Inline graphic
Inline graphic | No | Yes | Inline graphic replaced by biomarker measured at Inline graphic; Inline graphic missing, replaced by CPV measurements if available when Inline graphic
Inline graphic | No | No | Inline graphic missing; Inline graphic missing, replaced by CPV measurements if available when Inline graphic

2.1. Risk model parameters estimation

We propose an estimated likelihood approach to estimate our risk model parameters Inline graphic as specified in (A4). Based on the MAR assumption in (A8), we consider the maximization of the likelihood for Inline graphic conditional on observed Inline graphic and/or Inline graphic. Define the nuisance parameter Inline graphicInline graphic, the cdf of Inline graphic conditional on Inline graphic, the cdf of Inline graphic conditional on Inline graphic and Inline graphic, and the cdf of Inline graphic conditional on Inline graphic and Inline graphic. Then the conditional likelihood is Inline graphicInline graphic where

graphic file with name Equation5.gif (2.4)
graphic file with name Equation6.gif (2.5)
graphic file with name Equation7.gif (2.6)
graphic file with name Equation8.gif (2.7)

The likelihood contribution from subject Inline graphic takes on one of four forms ((2.4), (2.5), (2.6), (2.7)) depending on the values of Inline graphic and Inline graphic. If Inline graphic and Inline graphic, then neither Inline graphic nor Inline graphic is missing and subject Inline graphic contributes to the likelihood in the form of (2.4). If Inline graphic and Inline graphic, then Inline graphic is missing. Subject Inline graphic contributes to the likelihood in the form of (2.5), which is obtained by integrating Inline graphic over the distribution of Inline graphic conditional on Inline graphic and Inline graphic. If Inline graphic and Inline graphic, then Inline graphic is missing. Subject Inline graphic contributes to the likelihood in the form of (2.6), which is obtained by integrating Inline graphic over the distribution of Inline graphic conditional on Inline graphic and Inline graphic. If Inline graphic and Inline graphic, then both Inline graphic and Inline graphic are missing. Subject Inline graphic contributes to the likelihood in the form of (2.7), which is obtained by integrating Inline graphic over the joint distribution of Inline graphic and Inline graphic over Inline graphic.
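The four missingness patterns can be illustrated with a deliberately simplified Python sketch. It is not the authors' implementation: it assumes a probit risk model, ignores the always-observed baseline covariates and the LLOQ censoring, takes the baseline biomarker to be normal and the intermediate biomarker to be normal given the baseline biomarker, and replaces exact integration with a simple grid approximation. All names and parameter values are hypothetical.

```python
import math
from typing import Callable, Optional, Tuple


def norm_cdf(u: float) -> float:
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def normal_expectation(g: Callable[[float], float], mean: float, sd: float,
                       n: int = 201) -> float:
    """Approximate E[g(V)], V ~ N(mean, sd^2), on an even grid over mean +/- 6 sd."""
    total = weight_sum = 0.0
    for k in range(n):
        u = -6.0 + 12.0 * k / (n - 1)
        w = math.exp(-0.5 * u * u)  # unnormalized standard normal density
        total += w * g(mean + sd * u)
        weight_sum += w
    return total / weight_sum


def likelihood_contribution(y: int, z: int, s: Optional[float], x: Optional[float],
                            beta: Tuple[float, float, float, float],
                            nuis: Tuple[float, float, float, float, float]) -> float:
    """Likelihood of outcome y; s or x set to None means that biomarker is missing."""
    b0, b1, b2, b3 = beta                  # hypothetical probit coefficients
    mu_x, sd_x, a0, a1, sd_s = nuis        # X ~ N(mu_x, sd_x^2); S | X ~ N(a0 + a1 X, sd_s^2)

    def f(s_: float, x_: float) -> float:
        p = norm_cdf(b0 + b1 * z + b2 * s_ + b3 * x_)
        return p if y == 1 else 1.0 - p

    if s is not None and x is not None:    # both observed: plain Bernoulli term
        return f(s, x)
    if s is None and x is not None:        # integrate over S given X
        return normal_expectation(lambda s_: f(s_, x), a0 + a1 * x, sd_s)
    if s is not None:                      # integrate over X given S (joint normal)
        var_s = a1 ** 2 * sd_x ** 2 + sd_s ** 2
        slope = a1 * sd_x ** 2 / var_s
        cond_mean = mu_x + slope * (s - (a0 + a1 * mu_x))
        cond_sd = math.sqrt(sd_x ** 2 - slope * a1 * sd_x ** 2)
        return normal_expectation(lambda x_: f(s, x_), cond_mean, cond_sd)
    # both missing: integrate over the joint distribution of (S, X)
    return normal_expectation(
        lambda x_: normal_expectation(lambda s_: f(s_, x_), a0 + a1 * x_, sd_s, n=61),
        mu_x, sd_x, n=61)


beta = (-1.0, -0.3, -0.4, -0.2)            # hypothetical values
nuis = (2.0, 0.86, 0.5, 0.6, 0.4)          # hypothetical values
patterns = [(1.5, 2.0), (None, 2.0), (1.5, None), (None, None)]
contributions = [likelihood_contribution(1, 1, s, x, beta, nuis) for s, x in patterns]
```

In each pattern the contributions for y = 1 and y = 0 sum to one, which is a quick sanity check that the integration step only averages the risk model over the missing biomarker(s).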

We consider the estimated likelihood approach for the estimation of Inline graphic, originally proposed by Pepe and Fleming (1991) for two-phase sampling settings, where consistent estimators of Inline graphic are obtained first and then Inline graphic is maximized over Inline graphic. Here, we adopt a parametric form for Inline graphic and Inline graphic. Then according to Bayes’ theorem, we have

graphic file with name Equation9.gif

For example, in Section 2 of the Supplementary material available at Biostatistics online, we show that by assuming conditional normal distribution of Inline graphic and Inline graphic, we have Inline graphic and

graphic file with name Equation10.gif

where Inline graphic. In general, when the sampling probability depends on Inline graphic, Inline graphic, and Inline graphic, based on (A8), Inline graphic can be estimated by binary regression for the probability of Inline graphic conditional on Inline graphic, Inline graphic, Inline graphic or by the proportion of participants with Inline graphic measured in each subset defined by Inline graphic, Inline graphic, Inline graphic for discrete X; Inline graphic can be estimated by binary regression for the probability of Inline graphic conditional on Inline graphic, Inline graphic, Inline graphic or by the proportion of participants with Inline graphic and Inline graphic available in each subset defined by Inline graphic, Inline graphic, Inline graphic for discrete Inline graphic. One can then estimate Inline graphic by the weighted maximum likelihood estimator (MLE) using data from individuals with Inline graphic measured, Inline graphic, where the contribution of each individual to the likelihood is weighted by the inverse of the estimated Inline graphic. For estimation of Inline graphic, the weighted MLE can be implemented using data from vaccine recipients who have both Inline graphic and Inline graphic available, where the contribution of each individual to the likelihood is weighted by the inverse of the estimated probability of Inline graphic. The sampling scheme in the dengue studies is a special case, where Inline graphic is measured in a simple random sample of all participants and the set with both Inline graphic and Inline graphic available is also a random subset of vaccine recipients. Thus Inline graphic can be estimated by Inline graphic/Inline graphic and Inline graphic can be estimated by Inline graphic/Inline graphic. 
The inverse probability weighted estimators of Inline graphic and Inline graphic derived in this special setting are equivalent to unweighted estimates since we are estimating a constant for the sampling probability of Inline graphic, and for the sampling probability of Inline graphic and Inline graphic together among vaccine recipients. Note that even when there is a CPV component in the study design, we cannot use Inline graphic from placebo recipients for the estimation of Inline graphic because placebo recipients who have experienced the clinical endpoint by study closeout have zero probability of obtaining Inline graphic; thus, inverse probability weighting is not applicable. The estimator of Inline graphic is then derived as the maximizer of the estimated likelihood Inline graphic.
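The stratum-proportion version of the sampling probability estimator and the resulting inverse probability weights can be sketched as follows. This is a minimal sketch under stated assumptions: strata are given as arbitrary hashable labels (standing in for the discrete variables that drive sampling), and a weighted mean stands in for the weighted MLE of a nuisance parameter. The function names and data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Optional


def estimate_sampling_probabilities(strata: List[Hashable],
                                    measured: List[bool]) -> Dict[Hashable, float]:
    """Proportion of participants with the biomarker measured, within each stratum."""
    counts: Dict[Hashable, List[int]] = defaultdict(lambda: [0, 0])
    for key, m in zip(strata, measured):
        counts[key][0] += 1           # stratum size
        counts[key][1] += int(m)      # number sampled in the stratum
    return {key: sampled / total for key, (total, sampled) in counts.items()}


def ipw_mean(values: List[Optional[float]], strata: List[Hashable],
             measured: List[bool], probs: Dict[Hashable, float]) -> float:
    """Inverse-probability-weighted mean over participants with a measured value."""
    num = den = 0.0
    for v, key, m in zip(values, strata, measured):
        if m and v is not None:
            w = 1.0 / probs[key]      # weight = inverse of estimated sampling probability
            num += w * v
            den += w
    return num / den


strata = ["a", "a", "a", "a", "b", "b"]               # hypothetical strata
measured = [True, True, False, False, True, True]
values = [1.0, 3.0, None, None, 10.0, 20.0]           # hypothetical biomarker values
probs = estimate_sampling_probabilities(strata, measured)
weighted = ipw_mean(values, strata, measured, probs)
```

When every stratum has the same sampling proportion, all weights are equal and the weighted estimate coincides with the unweighted one, which is the special case noted above for the dengue studies.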

Standard errors for Inline graphic can be estimated using a perturbation resampling technique proposed by Zhuang and others (2019) (inspired by Parzen and others (1994)). In essence, one can generate Inline graphic random realizations of Inline graphic from a known distribution with mean of 1 and variance of 1 to create Inline graphic. Let Inline graphic be a perturbed version of Inline graphic, where Inline graphic and Inline graphic is the perturbed estimator of Inline graphic obtained by adding Inline graphic as the weights in the estimating equations. Then the perturbed estimator Inline graphic is derived as the maximizer of Inline graphic. As shown in Zhuang and others (2019), the distribution of Inline graphic given the observed data Inline graphic, can be used to approximate the unconditional distribution of Inline graphic. In practice, one may obtain a variance estimator of Inline graphic based on the empirical variance of Inline graphic realizations of Inline graphic. In our simulation and example, we use Inline graphic.
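The mechanics of perturbation resampling can be sketched with a toy estimator. This sketch assumes Exponential(1) perturbation weights, which have mean 1 and variance 1 as required, and uses a weighted mean as a stand-in for the estimated-likelihood maximizer; the number of perturbations (500) and all names are hypothetical choices, not values taken from the paper.

```python
import math
import random
import statistics
from typing import List


def weighted_mean(values: List[float], weights: List[float]) -> float:
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def perturbation_se(values: List[float], n_perturb: int = 500, seed: int = 0) -> float:
    """Standard error from the spread of estimators recomputed with random weights."""
    rng = random.Random(seed)
    realizations = []
    for _ in range(n_perturb):
        # One i.i.d. Exponential(1) weight per subject (mean 1, variance 1).
        weights = [rng.expovariate(1.0) for _ in values]
        realizations.append(weighted_mean(values, weights))
    return statistics.stdev(realizations)


data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(200)]   # hypothetical data
se_perturb = perturbation_se(data)
se_analytic = statistics.stdev(data) / math.sqrt(len(data))
```

For this toy estimator the perturbation standard error should be close to the usual analytic standard error of a mean; in the setting of the paper the same weights would also be carried through the nuisance parameter estimation so that its uncertainty is propagated.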

2.2. Estimation of marginal VE curves and VE curves within the baseline seropositive/seronegative subgroups

In this section, we study the special problem where Inline graphic denotes the baseline biomarker values subject to the LLOQ left censoring, and derive the estimators for the marginal VE curve, Inline graphic, the VE curve in the baseline seropositive subgroup, Inline graphic, and the VE curve in the baseline seronegative subgroup, Inline graphic. With some calculations, the risk of Inline graphic conditional on Inline graphic, Inline graphic and Inline graphic in the general population, and among baseline seropositive or baseline seronegative subgroups can be expressed as:

graphic file with name Equation11.gif

All three risk functions can be estimated based on Inline graphic and Inline graphic. We consider situations where Inline graphic is categorical with Inline graphic levels: Inline graphic. Then Inline graphic, Inline graphic, and Inline graphic. In Section 1 of the Supplementary material available at Biostatistics online, we show that based on Bayes’ theorem and with some calculations, we have

graphic file with name Equation12.gif (2.8)
graphic file with name Equation13.gif (2.9)
graphic file with name Equation14.gif (2.10)

We propose estimating Inline graphic nonparametrically, specified by Inline graphic, Inline graphic. Based on Bayes’ theorem, Inline graphic and Inline graphic can be estimated by consistent estimators Inline graphic and Inline graphic through: Inline graphic and Inline graphic. Probabilities Inline graphic, Inline graphic, and Inline graphic can then be estimated by Inline graphic and Inline graphic based on expressions (2.8), (2.9), and (2.10). Together with estimators for Inline graphic, Inline graphic and Inline graphic, the VE curve, baseline seropositive VE curve, and baseline seronegative VE curve defined in expressions (2.1), (2.2), and (2.3) can be estimated by using Inline graphic, Inline graphic and Inline graphic. In Section 2 of the Supplementary material available at Biostatistics online, we provide a detailed estimation procedure of the marginal VE curve, baseline seropositive VE curve, and baseline seronegative VE curve for the case where Inline graphic and Inline graphic are assumed normal and the risk functions take the form Inline graphic.

A perturbation resampling method (Zhuang and others, 2019) can be used to make simultaneous inference of the VE curve, the baseline seropositive VE curve, and the baseline seronegative VE curve. Because VE ranges from negative infinity to 1, we construct confidence intervals for the VE curves assuming approximate normality of the log of relative risk (RR), where Inline graphicVEInline graphic. To be specific, perturbed estimators Inline graphic, Inline graphic, and Inline graphic are obtained based on Inline graphic. Then the corresponding perturbed estimators for Inline graphic, Inline graphic, and Inline graphicInline graphic are obtained based on Inline graphic, Inline graphic, and Inline graphic. Repeat this process Inline graphic times to obtain Inline graphic realizations of Inline graphic, Inline graphic, and Inline graphic, denoted by Inline graphic, Inline graphic and Inline graphicInline graphic. Then based on the Inline graphic realizations, we calculate the sample standard deviations Inline graphic, Inline graphic, Inline graphic. The Inline graphic pointwise confidence intervals for Inline graphic in the general population and within the baseline seropositive/seronegative subgroups can be constructed as

graphic file with name Equation15.gif

And the Inline graphic simultaneous confidence bands of corresponding Inline graphic for Inline graphic can be constructed as

graphic file with name Equation16.gif

where Inline graphic is the Inline graphicth percentile of Inline graphic, Inline graphic is the Inline graphicth percentile of Inline graphic, and Inline graphic and Inline graphic are defined similarly to Inline graphic, with the logRR estimator, the perturbed logRR estimator, and the standard error estimator each replaced by its own version. Finally, the Wald Inline graphic pointwise confidence intervals and simultaneous confidence bands for Inline graphic, Inline graphic, and Inline graphic are obtained by transforming the symmetric bounds from the logRR scale back to the VE scale.
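The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `ve_bands` and its inputs are hypothetical, the perturbed logRR realizations are assumed to be already available on a grid of biomarker values, and the simultaneous critical value is assumed to be the upper percentile of the supremum over the grid of the standardized absolute deviation of each perturbed curve from the point estimate.

```python
from statistics import NormalDist

import numpy as np


def ve_bands(log_rr_hat, log_rr_pert, alpha=0.05):
    """Pointwise CIs and simultaneous bands for VE = 1 - RR.

    log_rr_hat  : shape (n_grid,), estimated log RR on a grid of s values.
    log_rr_pert : shape (n_pert, n_grid), perturbed realizations of log RR.
    Both inputs are hypothetical placeholders for the estimators in the text.
    """
    log_rr_hat = np.asarray(log_rr_hat, dtype=float)
    log_rr_pert = np.asarray(log_rr_pert, dtype=float)

    # Sample standard deviation of the perturbed realizations at each s.
    sd = log_rr_pert.std(axis=0, ddof=1)

    # Pointwise critical value: standard normal quantile.
    z = NormalDist().inv_cdf(1 - alpha / 2)

    # Simultaneous critical value (assumed form): percentile of the
    # supremum over s of |perturbed - estimate| / sd.
    sup_dev = np.max(np.abs(log_rr_pert - log_rr_hat) / sd, axis=1)
    c = float(np.quantile(sup_dev, 1 - alpha))

    def to_ve(log_rr):
        # Back-transform from the logRR scale to the VE scale.
        return 1.0 - np.exp(log_rr)

    # A larger log RR means a smaller VE, so the bounds swap on transformation.
    return {
        "ve_hat": to_ve(log_rr_hat),
        "pointwise": (to_ve(log_rr_hat + z * sd), to_ve(log_rr_hat - z * sd)),
        "simultaneous": (to_ve(log_rr_hat + c * sd), to_ve(log_rr_hat - c * sd)),
        "z": z,
        "c": c,
    }
```

Because the intervals are symmetric on the logRR scale and then transformed, the upper VE bounds stay below 1 by construction, which is the reason for working on the logRR scale in the first place.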

Furthermore, simultaneous inference enables testing of the hypothesis Inline graphic for Inline graphic. Let Inline graphic; then Inline graphic is equivalent to Inline graphic for Inline graphic. Let Inline graphic denote the estimator Inline graphic; let Inline graphic denote one of the Inline graphic realizations of the perturbed estimator Inline graphic, Inline graphic; let Inline graphic denote the sample standard deviation of Inline graphic; finally, let Inline graphic denote the Inline graphic percentile of Inline graphic. Then the Inline graphic simultaneous confidence band for Inline graphic is Inline graphic. Let Inline graphicInline graphic, Inline graphic and Inline graphic be the Inline graphic percentile of Inline graphic. Then, it follows from Roy and Bose (1953) that a test of the null hypothesis Inline graphic for Inline graphic at significance level Inline graphic is obtained by using the region of rejection Inline graphic, and the two-sided p-value can be obtained as the empirical probability that Inline graphic.
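As a rough illustration of this test, the sketch below computes a supremum-type statistic and its perturbation-based p-value. It is written under stated assumptions rather than taken from the authors' software: the name `sup_test` is hypothetical, the inputs are the estimated difference between the two subgroup curves on the logRR scale and its perturbed realizations on a common grid, and the null distribution of the statistic is approximated by the centered, standardized perturbed curves.

```python
import numpy as np


def sup_test(delta_hat, delta_pert, alpha=0.05):
    """Supremum test of H0: delta(s) = 0 for all s on the grid.

    delta_hat  : shape (n_grid,), estimated difference curve (hypothetical input).
    delta_pert : shape (n_pert, n_grid), perturbed realizations of that curve.
    """
    delta_hat = np.asarray(delta_hat, dtype=float)
    delta_pert = np.asarray(delta_pert, dtype=float)

    # Sample standard deviation of the perturbed differences at each s.
    sd = delta_pert.std(axis=0, ddof=1)

    # Observed statistic: largest standardized absolute difference.
    observed = float(np.max(np.abs(delta_hat) / sd))

    # Approximate null distribution (assumed form): centered perturbed curves.
    null_sup = np.max(np.abs(delta_pert - delta_hat) / sd, axis=1)

    critical = float(np.quantile(null_sup, 1 - alpha))
    p_value = float(np.mean(null_sup >= observed))
    return {
        "statistic": observed,
        "critical": critical,
        "p_value": p_value,
        "reject": observed > critical,
    }
```

Rejecting when the observed statistic exceeds the critical value is equivalent to checking whether the zero line falls outside the simultaneous confidence band for the difference at some grid point.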

3. Simulation studies

Simulation data were generated with 10 000 subjects randomized to vaccine or placebo in a 2:1 ratio. The baseline covariate Inline graphic was generated from a multinomial distribution with four categories (1, 2, 3, and 4), each with probability 0.25. Inline graphic, Inline graphic, and Inline graphic are dummy variables indicating category 2, 3, or 4, respectively. Baseline biomarker values Inline graphic were generated from a normal distribution with mean of Inline graphic and standard deviation of 0.86. Inline graphic values were generated from a normal distribution with mean of Inline graphic and standard deviation of 0.4, which leads to a correlation of 0.78 between Inline graphic and Inline graphic and a correlation of 0.20 between Inline graphic and Inline graphic. Let the LLOQ be 1; then Inline graphic and Inline graphic. We assume a probit risk model of the clinical outcome Inline graphic conditional on Inline graphic, Inline graphic, Inline graphic, and Inline graphic: Inline graphic. We set Inline graphic to Inline graphic so that the probability of the endpoint Inline graphic equals 0.16 in the placebo arm and 0.08 in the vaccine arm. These simulation parameters were chosen to reflect the characteristics of the two phase 3 dengue trials, CYD14 and CYD15. The resultant true Inline graphic, Inline graphic, and Inline graphic curves are shown in Figure 1.

Fig. 1.

The true Inline graphic curve (solid), Inline graphic curve (dashed), and Inline graphic curve (dot-dashed) in the simulation design.

To achieve a non-monotone sampling design, 35% of study participants are randomly sampled to have Inline graphic retained. For the BIP-only design, Inline graphic is set missing for all placebo recipients and retained, in the vaccine arm, for all cases and all participants with Inline graphic measured; that is, Inline graphic. For the BIP+CPV design, 70% of event-free placebo recipients are randomly sampled to be included in the CPV component and have Inline graphic retained. Simulation results are based on 500 Monte Carlo simulations; for each simulation, 250 perturbation iterations are generated to construct pointwise confidence intervals and simultaneous confidence bands. In our simulation, we used the exponential distribution with rate 1 to generate Inline graphic.
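The sampling scheme just described can be mimicked with a short Python sketch. It is illustrative only and makes several simplifying assumptions that are not in the text: treatment is assigned by independent Bernoulli draws with probability 2/3, the outcome is drawn directly from the stated arm-level event rates (0.08 and 0.16) instead of the probit risk model, and the function and variable names (`simulate_sampling`, `b_obs`, `s_obs`, `perturbation_weights`) are hypothetical. The exponential draws are assumed to play the role of the per-participant perturbation weights.

```python
import numpy as np


def simulate_sampling(n=10_000, design="BIP-only", seed=0):
    """Generate observation indicators under the two sampling designs."""
    if design not in ("BIP-only", "BIP+CPV"):
        raise ValueError("design must be 'BIP-only' or 'BIP+CPV'")
    rng = np.random.default_rng(seed)

    # Assumption: 2:1 vaccine:placebo allocation via independent draws.
    z = rng.binomial(1, 2 / 3, size=n)
    # Simplification: outcome drawn from arm-level event rates only.
    y = rng.binomial(1, np.where(z == 1, 0.08, 0.16))

    # Baseline biomarker retained for a random 35% of all participants.
    b_obs = rng.random(n) < 0.35

    # Post-randomization biomarker: never observed in the placebo arm;
    # in the vaccine arm, observed for cases and for those with baseline measured.
    s_obs = (z == 1) & (b_obs | (y == 1))

    if design == "BIP+CPV":
        # Closeout placebo vaccination: 70% of event-free placebo recipients
        # contribute a post-vaccination biomarker measurement.
        cpv = (z == 0) & (y == 0) & (rng.random(n) < 0.70)
        s_obs = s_obs | cpv

    return {"z": z, "y": y, "b_obs": b_obs, "s_obs": s_obs}


def perturbation_weights(n, rng):
    """One set of Exp(1) weights for a single perturbation iteration."""
    return rng.exponential(scale=1.0, size=n)
```

The pattern is non-monotone because some participants have the post-randomization biomarker without the baseline one (vaccine-arm cases outside the 35% sample) while others have the baseline biomarker only (sampled placebo recipients in the BIP-only design).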

For comparison with our proposed estimators, we consider two alternative estimators based on existing approaches for two-phase studies that restrict the analysis to part of the observed data. The first approach obtains MLEs without using baseline biomarker values Inline graphic either as baseline predictors or in the risk models (henceforth referred to as the MLEs ignoring Inline graphic), based on the estimated likelihood method proposed in GH, with the analysis restricted to the observed data Inline graphic, Inline graphic. The second approach restricts the analysis to the subset of data with Inline graphic measured, so that within the analysis subset B is measured for everyone and can be treated like a baseline covariate X. As in the first approach, MLEs are obtained based on the estimated likelihood method proposed in GH; henceforth we refer to this estimator as the MLEs restricted. The comparison between our proposed estimators, the MLEs ignoring Inline graphic, and the MLEs restricted is performed under two scenarios: (1) the true risk model Inline graphic with non-zero Inline graphic and Inline graphic, so that the working risk model is correct for our proposed estimators and the MLEs restricted but incorrect for the MLEs ignoring Inline graphic; (2) Inline graphic and Inline graphic, so that Inline graphic is not associated with Inline graphic in the risk model and the working risk models for our proposed estimators, the MLEs ignoring Inline graphic, and the MLEs restricted are all correctly specified. We present the detailed procedures for obtaining the MLEs ignoring Inline graphic and the MLEs restricted in Sections 3 and 4 of the Supplementary material available at Biostatistics online, respectively.

Tables 2 and 3 present the average bias and sample standard deviation (SD) of our proposed estimators compared with the MLEs ignoring Inline graphic and the MLEs restricted. In both the BIP-only and BIP+CPV designs, our proposed estimators have the smallest bias and smallest SD. In the setting where ignoring Inline graphic leads to an incorrect working model, the MLEs ignoring B have the largest bias because of the misspecified working risk model, and the MLEs restricted have the largest SD because of the reduced data size. In the setting where ignoring Inline graphic yields a correctly specified working model, our proposed estimators still perform best. The MLEs restricted have the largest bias and largest SD because of the reduced data size; the MLEs ignoring B are also less efficient than our proposed estimators because we gain efficiency by predicting missing Inline graphic through both Inline graphic and Inline graphic rather than through Inline graphic alone.

Table 2.

Bias and sample standard deviation (SD) comparison in a BIP-only design

    Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic
    (-0.40) (0.50) (0.65) (-0.55) (-1.30) (0.00) (0.24) (0.11) (0.20)
Inline graphic
Bias × 100 Our proposed estimators 0.47 -0.13 -0.65 -0.02 0.39 0.20 0.15 0.14 0.37
  MLEs ignoring B -4.05 -4.10 1.95 1.36 -0.55 0.51 -0.81
  MLEs restricted 3.92 0.46 -1.94 -0.16 0.39 0.30 -0.18 -0.47 -0.39
SD × 100 Our proposed estimators 10.84 12.26 3.21 3.96 2.16 7.18 4.28 4.39 4.68
  MLEs ignoring B 18.38 19.26 6.07 6.21 4.52 4.49 4.83
  MLEs restricted 25.32 31.83 16.42 14.72 12.69 12.87 13.05 14.14 11.40
Inline graphic
Bias × 100 Our proposed estimators 0.98 -0.15 -0.35 0.06 0.02 0.01 0.05 0.15 0.13
  MLEs ignoring B 1.19 -0.12 -0.58 0.10 -0.26 -0.24 -0.07
  MLEs restricted 1.42 -1.72 -0.61 1.05 0.38 0.24 -2.20 -1.86 -1.50
SD × 100 Our proposed estimators 10.16 11.79 3.31 3.86 2.71 3.40 4.33 4.41 4.59
  MLEs ignoring B 11.94 12.55 6.12 6.09 4.71 4.89 5.33
  MLEs restricted 15.92 19.17 10.16 8.27 8.33 9.48 10.64 11.37 8.88

Table 3.

Bias and sample standard deviation (SD) comparison in a BIP+CPV design

    Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic
    (-0.40) (0.50) (0.65) (-0.55) (-1.30) (0.00) (0.24) (0.11) (0.20)
Inline graphic
Bias × 100 Our proposed estimators -0.55 0.36 0.58 -0.16 0.28 0.12 -0.12 -0.19 0.08
  MLEs ignoring B 4.59 -3.85 -1.73 1.69 -0.18 -0.67 -0.34
  MLEs restricted 2.94 1.27 1.38 -0.31 0.21 0.15 0.33 -0.28
SD × 100 Our proposed estimators 5.04 5.70 3.16 2.29 2.75 7.29 1.61 1.75 1.50
  MLEs ignoring B 10.38 10.29 7.97 6.89 1.83 1.78 1.63
  MLEs restricted 12.60 14.79 16.85 10.69 16.16 12.50 4.89 5.64 3.65
Inline graphic
Bias × 100 Our proposed estimators -0.35 0.32 -0.14 -0.13 0.24 0.12 -0.14 -0.22 0.04
  MLEs ignoring B 0.43 -0.44 -0.16 0.19 0.24 0.16 0.36
  MLEs restricted 0.51 3.67 0.24 2.17 0.88 1.12 6.16 2.72 0.46
SD × 100 Our proposed estimators 5.27 5.34 3.16 1.32 2.09 6.99 1.83 1.74 1.81
  MLEs ignoring B 7.17 10.32 6.15 2.04 1.88 1.81 2.46
  MLEs restricted 8.26 11.22 9.58 2.83 6.42 19.50 4.51 4.47 3.48

We then evaluated the finite-sample performance of our proposed estimators for Inline graphic, Inline graphic, and Inline graphic. Results are presented in Figure 2. The empirical coverage levels of the 95% simultaneous confidence bands are reported as “simultaneous.cover.” The bias was smaller than 3% across all Inline graphic values. The increase in bias at larger Inline graphic values is due to two facts: observed Inline graphic values are rarer at these extreme values, so that less data are available for estimation, and at high VE only a small number of vaccine recipients have events. The coverage probabilities of the confidence intervals given fixed Inline graphic values and of the simultaneous confidence bands across all Inline graphic values are fairly close to the nominal level. The observed power of the test Inline graphic for Inline graphic is 0.67 for the BIP-only design and 0.72 for the BIP+CPV design.

Fig. 2.

Average bias for our proposed estimators Inline graphic, Inline graphic, and Inline graphic and coverage probabilities of 95% perturbation Wald confidence intervals.

4. Application to the CYD14 and CYD15 trials

CYD14 (CYD15) is an observer-masked, randomized controlled, multi-center, phase 3 trial in five countries in the Asia-Pacific (Latin America) region in which participants were randomized in a 2:1 allocation to receive three injections of the CYD-TDV vaccine or placebo at months 0, 6, and 12 (Capeding and others, 2014; Villar and others, 2015). Concentrations of dengue neutralizing antibody titers to each of the four dengue serotype strains at month 13 were measured for all virologically confirmed dengue disease cases and a subset of controls (defined as participants free of the virologically confirmed dengue disease endpoint), of which only a fraction have their baseline titers measured. In this illustration, we applied our proposed method to data pooled across CYD14 and CYD15, restricted to 9- to 16-year-olds, to assess how VE varied by month 13 titers within baseline seropositive and seronegative subgroups. We let Inline graphic and Inline graphic be the average of the log10-transformed neutralizing antibody titers to the four dengue serotypes at month 13 and at baseline, respectively. Baseline seropositive and seronegative subgroups are defined as Inline graphic and Inline graphic. The immunogenicity studies in CYD14 and CYD15 have a non-monotone sampling design with respect to Inline graphic and Inline graphic, where Inline graphic is available for everyone in the immunogenicity subset (Inline graphic) and Inline graphic is available for all vaccine recipients who are either in the immunogenicity subset or are cases (Inline graphic). Let Inline graphic denote participants’ age and country categories, and let Y be the indicator of virologically confirmed dengue disease occurrence after the month 13 study visit and by the month 25 visit. We model the risk of Inline graphic conditional on Inline graphic, Inline graphic, Inline graphic, and Inline graphic in the same way as in the simulation studies.

Figure 3 shows the estimated VE curves (marginal, baseline seropositive, and baseline seronegative) and 95% CIs and CBs based on 500 perturbation iterations. VE curves were similar for the baseline seropositive and baseline seronegative subgroups. For vaccine recipients with month 13 average titers of 500 and 10 000, estimated VE was 79.3% and 97.3% for the baseline seropositive subgroup compared to 70.4% and 91.8% for the baseline seronegative subgroup, respectively. Furthermore, we tested the null hypothesis Inline graphic: Inline graphic for Inline graphic range of month 13 average titers in vaccinees in the data, which gave a p-value of 0.35. The large variations in the VE curves suggest that the magnitude of dengue neutralizing antibody titer following vaccination predicts the level of CYD-TDV VE to prevent virologically confirmed dengue disease, and they provide a map between neutralizing antibody titer and the predicted level of VE. This guides use of the vaccine, for example by providing an input parameter into models for bridging VE to new populations (Gilbert and others, 2019). It also supports use of the neutralizing antibody endpoint for ranking and selecting among future candidate dengue vaccines developed within the same vaccine class. Moreover, the hypothesis test, which indicated no significant difference between the baseline seropositive and baseline seronegative VE curves, suggests that neutralizing antibody titer has a similar association with VE regardless of whether the child had been previously infected with dengue at the time of first vaccination.

Fig. 3.

Estimated VE by average Inline graphic titer at month 13 with 95% pointwise confidence intervals and simultaneous confidence bands in CYD14 and CYD15 9- to 16-year-olds.

5. Discussion

Considerable research has been devoted to studying the causal effect of treatment adjusted for post-randomization variables under the principal stratification framework (Frangakis and Rubin, 2002). Schwartz and others (2011) presented a general Bayesian semiparametric model for making comparisons on a continuous outcome Y, adjusting for intermediate variables. Their Bayesian approach treated the unobserved potential outcomes as part of the model parameters and proposed a Bayesian nonparametric model for the principal strata based on the Dirichlet process mixture model. Bayesian inference for the estimand of interest Inline graphic was based on the posterior distribution of model parameters conditional on the observed data. In another randomized experiment, which studied the effect of a job-training program on employment and wages (Frumento and others, 2012), the primary causal estimands were the average causal effect on employment for compliers and the average causal effect on wages for always-employed compliers. The authors also adopted a Bayesian inference approach in which missing potential outcomes are no different from unknown parameters, and conducted a likelihood analysis for the model parameters, assuming that the values of the model parameters that governed the distribution of observable data were drawn from a prior distribution. In this article, we focused on a binary outcome Inline graphic where the comparison is made on Inline graphic and Inline graphic, and developed a frequentist estimated likelihood approach to evaluate effect modification by subgroups defined by an intermediate response biomarker (principal strata) and a baseline covariate. The approach is applicable to general sampling designs with respect to the intermediate biomarker and the baseline covariate in vaccine trials.

In the motivating dengue application of this article, the distributions of Inline graphic and of Inline graphic conditional on Inline graphic were estimated from random samples among trial participants and among vaccine recipients, respectively. However, our methods can apply in general to settings where sampling probabilities for Inline graphic and Inline graphic vary with other covariates, through the use of inverse probability weighting techniques when modeling the distribution of Inline graphic and Inline graphic. This requires the missing data model to be clearly specified, which is satisfied in the dengue example since the missingness mechanism is determined by the study design. While we have considered situations where the baseline covariates Inline graphic are categorical and employed a normal model for Inline graphic and Inline graphic, our proposed definition and estimation procedure can be used with arbitrarily specified parametric forms for the conditional distributions of Inline graphic and Inline graphic. In practice, one should perform model checking to ensure an appropriate choice of parametric model. In addition, kernel-based methods can be considered for modeling the conditional distributions in low-dimensional covariate Inline graphic settings, especially for problems where parametric models do not provide a good fit. In our work, to deal with nonidentifiability due to missing Inline graphic, we focused on the BIP-only design and the BIP+CPV design. In the BIP-only design, the parametric risk model assumption made in (A4) is untestable, whereas the BIP+CPV design allows testing of this risk model assumption. Ertefaie and others (2018) utilize a different set of untestable assumptions to deal with missing Inline graphic, including a structural model assumption relating Inline graphic and Inline graphic and the existence of an instrumental variable for consistent estimation of the risk model based on observed data. Their approach may be a useful alternative in the immune response correlates of VE field.

In some VE trials, samples are stored from all participants at multiple post-randomization time points, such that it is possible to study immune response biomarkers at multiple time points as correlates of VE (e.g., Rerks-Ngarm and others, 2009). A direction of potential future research is to compare and integrate titers obtained at different time points toward developing biomarkers with maximum effect modification.

6. Software

The code to generate a simulated data set with a design similar to that of the dengue example in Section 4, analyze this simulated data set, and produce the figure for the VE curves with 95% CIs and CBs is available at https://github.com/Yingying-Z/TrtEffMod-NonMonotoneMissingness. The results obtained on this simulated data set are provided in Section 5 of the Supplementary material available at Biostatistics online.

Acknowledgments

The authors thank the participants, investigators, and sponsors of the CYD14 and CYD15 trials.

Conflict of Interest: None declared.

Supplementary material

Supplementary material is available online at http://biostatistics.oxfordjournals.org.

Funding

Research reported in this publication was supported by Sanofi Pasteur and the National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH), Department of Health and Human Services (award number R37AI054165). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or Sanofi Pasteur.

References

  1. Capeding, M. R., Tran, N. H., Hadinegoro, S. R., Ismail, H. I., Chotpitayasunondh, T., Chua, M. N., Luong, C. Q., Rusmil, K., Wirawan, D. N., Nallusamy, R. and others. (2014). Clinical efficacy and safety of a novel tetravalent dengue vaccine in healthy children in Asia: a phase 3, randomised, observer-masked, placebo-controlled trial. The Lancet 384, 1358–1365.
  2. Ertefaie, A., Hsu, J. Y., Page, L. C. and Small, D. S. (2018). Discovering treatment effect heterogeneity through post-treatment variables with application to the effect of class size on mathematics scores. Journal of the Royal Statistical Society: Series C (Applied Statistics) 67, 917–938.
  3. Follmann, D. (2006). Augmented designs to assess immune response in vaccine trials. Biometrics 62, 1161–1169.
  4. Frangakis, C. E. and Rubin, D. B. (2002). Principal stratification in causal inference. Biometrics 58, 21–29.
  5. Frumento, P., Mealli, F., Pacini, B. and Rubin, D. B. (2012). Evaluating the effect of training on wages in the presence of noncompliance, nonemployment, and missing outcome data. Journal of the American Statistical Association 107, 450–466.
  6. Gilbert, P. B., Huang, Y., Juraska, M., Moodie, Z., Fong, Y., Luedtke, A., Zhuang, Y., Shao, J., Carpp, L. N., Jackson, N. and others. (2019). Bridging efficacy of a tetravalent dengue vaccine from children/adolescents to adults in highly endemic countries based on neutralizing antibody response. The American Journal of Tropical Medicine and Hygiene 101, 164–179.
  7. Gilbert, P. B. and Hudgens, M. G. (2008). Evaluating candidate principal surrogate endpoints. Biometrics 64, 1146–1154.
  8. Huang, Y., Gilbert, P. B. and Wolfson, J. (2013). Design and estimation for evaluating principal surrogate markers in vaccine trials. Biometrics 69, 301–309.
  9. Neyman, J. S. (1923). On the application of probability theory to agricultural experiments. Essay on principles. Section 9. (Translated and edited by D. M. Dabrowska and T. P. Speed, Statistical Science (1990) 5, 465–480). Annals of Agricultural Sciences 10, 1–51.
  10. Parzen, M. I., Wei, L. J. and Ying, Z. (1994). A resampling method based on pivotal estimating functions. Biometrika 81, 341–350.
  11. Pepe, M. S. and Fleming, T. R. (1991). A nonparametric method for dealing with mismeasured covariate data. Journal of the American Statistical Association 86, 108–113.
  12. Rerks-Ngarm, S., Pitisuttithum, P., Nitayaphan, S., Kaewkungwal, J., Chiu, J., Paris, R., Premsri, N., Namwat, C., de Souza, M., Adams, E. and others. (2009). Vaccination with ALVAC and AIDSVAX to prevent HIV-1 infection in Thailand. New England Journal of Medicine 361, 2209–2220.
  13. Roy, S. N. and Bose, R. C. (1953). Simultaneous confidence interval estimation. The Annals of Mathematical Statistics 24, 513–536.
  14. Schwartz, S. L., Li, F. and Mealli, F. (2011). A Bayesian semiparametric approach to intermediate variables in causal inference. Journal of the American Statistical Association 106, 1331–1344.
  15. Sridhar, S., Luedtke, A., Langevin, E., Zhu, M., Bonaparte, M., Machabert, T., Savarino, S., Zambrano, B., Moureau, A., Khromava, A. and others. (2018). Effect of dengue serostatus on dengue vaccine safety and efficacy. New England Journal of Medicine 379, 327–340.
  16. Villar, L., Dayan, G. H., Arredondo-García, J. L., Rivera, D. M., Cunha, R., Deseda, C., Reynales, H., Costa, M. S., Morales-Ramírez, J. O., Carrasquilla, G. and others. (2015). Efficacy of a tetravalent dengue vaccine in children in Latin America. New England Journal of Medicine 372, 113–123.
  17. Wolfson, J. and Gilbert, P. B. (2010). Statistical identifiability and the surrogate endpoint problem, with application to vaccine trials. Biometrics 66, 1153–1161.
  18. Zhuang, Y., Huang, Y. and Gilbert, P. B. (2019). Simultaneous inference of treatment effect modification by intermediate response endpoint principal strata with application to vaccine trials. International Journal of Biostatistics 16(1). DOI: 10.1515/ijb-2018-0058.

Articles from Biostatistics (Oxford, England) are provided here courtesy of Oxford University Press
