Summary
In vaccine studies, an important research question is to study effect modification of clinical treatment efficacy by intermediate biomarker-based principal strata. In settings where participants entering a trial may have prior exposure and therefore variable baseline biomarker values, clinical treatment efficacy may further depend jointly on a biomarker measured at baseline and measured at a fixed time after vaccination. This makes it important to conduct a bivariate effect modification analysis by both the intermediate biomarker-based principal strata and the baseline biomarker values. Existing research allows this assessment if the sampling of baseline and intermediate biomarkers follows a monotone pattern, i.e., if participants who have the biomarker measured post-randomization would also have the biomarker measured at baseline. However, additional complications in study design could happen in practice. For example, in a dengue correlates study, baseline biomarker values were only available from a fraction of participants who have biomarkers measured post-randomization. How to conduct the bivariate effect modification analysis in these studies remains an open research question. In this article, we propose approaches for bivariate effect modification analysis in the complicated sampling design based on an estimated likelihood framework. We demonstrate advantages of the proposed method over existing methods through numerical studies and illustrate our method with data sets from two phase 3 dengue vaccine efficacy trials.
Keywords: Clinical trial, Dengue vaccine, Estimated likelihood, Non-monotone missingness, Principal stratification
1. Introduction
An important problem in many biomedical research fields is the evaluation of inexpensive surrogate endpoints based on the degree of their utility for treatment development. As a motivating example, a research goal of placebo-controlled preventive dengue vaccine efficacy (VE) trials is the evaluation of immunological markers in vaccine recipients as modifiers of VE against dengue disease, for which we utilize a principal stratification framework as described below.
1.1. Principal stratification
Causal inference for randomized experiments often uses the potential outcome framework originally introduced by Neyman (1923), where each of all possible treatments can be potentially applied to each subject, and the potential outcomes are the outcomes that would be observed if each of the treatments were applied to each of the subjects. With precise notation defined in Section 2, we consider a two-arm randomized trial where subjects are randomly assigned to either an active treatment ($Z = 1$) or a placebo ($Z = 0$). Let $Y(z)$ be the potential outcome of $Y$ if assigned to treatment $z$ for $z = 0, 1$. Then a causal effect of treatment on the outcome $Y$ is defined to be a comparison between $Y(1)$ and $Y(0)$ of the same group of subjects. However, the immune response marker endpoints ($S$) are usually measured at a fixed time $\tau$ post-randomization; let $Y^{\tau}$ be the indicator of whether disease develops before $\tau$, then $S$ is only measurable if $Y^{\tau} = 0$. To study the causal effect of treatment on $Y$ conditional on $S$, standard methods using a comparison between the probabilities $P(Y = 1 \mid S = s, Z = 1, Y^{\tau} = 0)$ and $P(Y = 1 \mid S = s, Z = 0, Y^{\tau} = 0)$ based on the observed data may not yield correct causal inference due to post-randomization selection bias. Specifically, the comparison is not made between the same groups of subjects because the group who have post-randomization biomarker value $s$ under active treatment is not the same as the group who have post-randomization biomarker value $s$ under placebo, thus violating the definition of causal effect. To remedy this problem, Frangakis and Rubin (2002) proposed a principal stratification framework for comparing treatments where the estimands are adjusted for post-randomization variables and yet are always causal effects. A principal stratification with respect to $S$ is a cross-classification of subjects defined by the potential outcome pair $(S(1), S(0))$, where $S(z)$ is the potential biomarker value if receiving treatment $z$. The pair $(S(1), S(0))$ is not affected by the treatment and can therefore be considered as a baseline covariate, so comparisons of treatment effect within principal strata are always causal effects.
1.2. Principal stratification effect modification analysis
Within the principal stratification framework and motivated by randomized placebo-controlled VE trials, Gilbert and Hudgens (2008), henceforth referred to as GH, proposed a clinically relevant causal estimand for evaluating a candidate surrogate biomarker, called the Causal Effect Predictiveness (CEP) surface: $CEP^{risk}(s_1, s_0) = h(risk_{(1)}(s_1, s_0), risk_{(0)}(s_1, s_0))$, where $risk_{(z)}(s_1, s_0) = P(Y(z) = 1 \mid S(1) = s_1, S(0) = s_0)$ for $z = 0, 1$ and $h(x, y)$ is a known contrast function satisfying $h(x, y) = 0$ if and only if $x = y$. The CEP surface assesses whether and how treatment efficacy varies by subgroups defined by principal strata that are characterized by $(S(1), S(0))$. Therefore, the analysis is essentially effect modification analysis, and it provides a way to rank biomarker endpoints by their strength of effect modification. The marginal CEP curve is also a valuable causal estimand for studying biomarkers as effect modifiers. It contrasts the risks averaged over the distribution of $S(0)$:
$$mCEP^{risk}(s_1) = h(risk_{(1)}(s_1), risk_{(0)}(s_1)),$$
where $risk_{(z)}(s_1) = P(Y(z) = 1 \mid S(1) = s_1)$. Unfortunately, if no further assumptions are made, the marginal CEP curve cannot be identified by the observed data because of the missing potential biomarker outcomes that define the principal strata. Follmann (2006) proposed two augmented trial designs to address this problem: baseline immunogenicity predictors (BIP) and closeout placebo vaccination (CPV). The BIP design uses baseline predictor(s) to infer the unobserved potential biomarker values, while the CPV design vaccinates placebo recipients who stay free of the clinical endpoint at the end of the standard trial follow-up and measures their immune response values, which are then used in place of their potential biomarker values if receiving vaccine. We call the BIP design with no CPV component the BIP-only design and the BIP design with a nonzero CPV component the BIP+CPV design.
Examples of good baseline predictors are baseline biomarker measurements in vaccine trials where participants may have prior exposure to the disease-causing pathogen and biomarker measurements at baseline reflect natural immunity arising from the pretrial exposure. In some settings, baseline predictors may further modify the marginal CEP curve such that the clinical risks under each treatment assignment may depend on both those baseline predictors and the intermediate biomarker-based principal strata. This makes it interesting to estimate and compare the marginal CEP curve in subgroups characterized by baseline biomarker measurements, which is essentially bivariate treatment effect modification analysis by the intermediate biomarker-based principal strata and baseline predictors, and is the focus of this article.
1.3. Motivating example and existing methods
In our motivating dengue vaccine trials, a critical issue is whether and how the efficacy of a vaccine depends on whether a vaccinated individual was infected with dengue prior to vaccination (i.e., is dengue-naive or pre-immune) (Capeding and others, 2014; Villar and others, 2015). Similarly, an important issue is whether and how the association of a biomarker response with VE depends on previous infection with dengue. These issues were present in the CYD14 (Capeding and others, 2014) and CYD15 (Villar and others, 2015) phase 3 trials of the CYD-TDV vaccine that constituted the primary basis for licensure of this vaccine, where the biomarker of interest is neutralizing antibody titer to the vaccine measured post-vaccination at 13 months and pre-vaccination at baseline. A baseline neutralization value of seronegative [defined as a baseline biomarker value equal to or below the lower limit of quantification (LLOQ)] indicates dengue-naive whereas a value of seropositive (defined as a baseline biomarker value above the LLOQ) indicates dengue pre-immune. Comparing the marginal CEP curve within the baseline seronegative subgroup to that within the baseline seropositive subgroup could provide important insights. For instance, for any two major baseline subgroups, a statistical result that a biomarker response is not associated with VE in one subgroup and is strongly associated with VE in another subgroup may suggest advancing the biomarker as a study endpoint for subsequent phase 1–2 biomarker endpoint vaccine trials for one subgroup but not the other.
Specifically for the dengue studies, it was previously shown that overall VE was greater in baseline seropositive than baseline seronegative individuals (Sridhar and others, 2018), and the indication for use of the CYD-TDV vaccine is restricted to individuals with evidence of previous dengue infection (i.e., baseline seropositive individuals), making it important to study the marginal CEP curve in each subgroup separately. Moreover, it is biologically relevant to compare the marginal CEP curve between the two subgroups, given that the neutralizing antibody marker has a different meaning for each: for baseline seronegative individuals it is the neutralization marker response generated by vaccination, and for baseline seropositive individuals it is the neutralization marker response generated by the aggregate of vaccination and previous dengue infection(s) prior to vaccination. Comparing the two marginal CEP curves addresses the question of whether a purely vaccine-induced marker response has the same impact on VE as a natural infection plus vaccine-induced marker response, where evidence for a similar curve may help support that the marker contributes to a mechanism of VE.
The majority of previous developments were based on a two-phase sampling setting where there is only missingness with respect to the intermediate potential outcome ($S(1)$) but the baseline predictors are measured in everyone (Gilbert and Hudgens, 2008; Wolfson and Gilbert, 2010; Huang and others, 2013). However, additional complications in study design can arise in practice. For example, in the two phase 3 dengue vaccine trials CYD14 and CYD15, the biomarker values at baseline were only measured in a fraction of those with the biomarker measured at the post-randomization time point, such that the missingness with respect to the baseline biomarker and the intermediate biomarker is no longer monotone. Our goal in this manuscript is to propose methods for a bivariate treatment effect modification analysis by the intermediate biomarker-based principal strata and baseline predictors in general settings that do not require a nested sub-sampling relationship between the intermediate biomarker and the baseline predictors, in other words, in the presence of non-monotone missingness of covariates. The methods are applicable to both the BIP-only design and the BIP+CPV design.
2. Methods
Consider a study in which $n$ subjects are independently and randomly selected from a given population of interest and are randomly assigned to either placebo or vaccine at baseline (time 0). Let $Z = 1$ if a subject is randomized to vaccine and $Z = 0$ if randomized to placebo. Let $W$ be a vector of baseline covariates used for modeling disease risk; $W$ can be partitioned into two parts, $X$ and $B$. $X$ denotes baseline covariates recorded for everyone at baseline such as gender and country while $B$ denotes baseline covariates that are only available in a subset of the $n$ trial participants, such as baseline biomarker measurements. Trial participants are followed for the primary clinical endpoint for a predetermined period of time, and we let $Y$ be the indicator of a clinical endpoint event during the study follow-up period. At some fixed time $\tau$ post-randomization, an immune response endpoint, $S$, is measured. Because $S$ must be measured prior to disease to evaluate its association with the treatment effect, $S$ is defined conditional on remaining clinical endpoint free at time $\tau$ (denoted by $Y^{\tau} = 0$). If the clinical endpoint occurs in the time interval $[0, \tau]$ (denoted by $Y^{\tau} = 1$), then $S$ is undefined and we set $S = *$. If a CPV component is incorporated in the trial design, all or a fraction of placebo recipients who remain free of the clinical endpoint at the closeout of the trial are vaccinated and the immune response biomarker $S^{C}$ is measured at time $\tau$ after vaccination. In addition, we consider cases where $S$ and $S^{C}$
are continuous and subject to the LLOQ left censoring. The observable random variable
where c is the LLOQ and
has a continuous cumulative distribution function (cdf) with
. Similar to
,
where
has a continuous cdf with
. If
denotes the baseline biomarker measurements, then the observable random variable
where
has a continuous cdf with
. Let
$S(z)$, $S^{C}(z)$, $Y^{\tau}(z)$, $Y(z)$ be the potential outcomes if assigned to treatment $z$, for $z = 0$ or $1$. We consider a general sampling framework where baseline covariates $X$ and the clinical outcome data $Y^{\tau}$ and $Y$ are measured for everyone and the sampling probability of $S$, $S^{C}$ and $B$ can depend on $Z$, $X$ and/or $Y$.
An example of the marginal CEP curve is VE as a function of $s_1$:
$$VE(s_1) = 1 - \frac{risk_{(1)}(s_1)}{risk_{(0)}(s_1)} = 1 - \frac{P(Y(1) = 1 \mid S(1) = s_1)}{P(Y(0) = 1 \mid S(1) = s_1)}. \qquad (2.1)$$
$VE(s_1)$ is a specific form of the marginal CEP curve defined in Section 1.2 with the contrast function $h(x, y)$ being $1 - x/y$. $VE(s_1)$ is the percentage reduction in the risk of the clinical outcome for the subgroup of treatment recipients with immune response $s_1$ compared to if they had not received the treatment. It measures a causal effect, and examination of the $VE(s_1)$ curve as a function of varying levels of immune response if receiving vaccination provides levels of $VE$ for a spectrum of subgroups and a ranking of immune response biomarkers by their strength of effect modification. If $VE(s_1)$ has larger variability across a range of values for biomarker A than across the same range of values for biomarker B, then biomarker A demonstrates stronger effect modification in $VE$ and there is potential to achieve a larger VE by increasing the immune response.
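As a minimal illustration of how a VE curve of the form (2.1) varies with the immune response, the sketch below evaluates one minus the ratio of vaccine-arm to placebo-arm risk under a probit working model. The function names, the model form, and the coefficient values are hypothetical and chosen only for illustration; they are not the fitted values from this article.

```python
import math
from typing import Sequence


def std_normal_cdf(x: float) -> float:
    """Standard normal cdf, used here as the inverse probit link."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ve_curve(s1: float, beta: Sequence[float]) -> float:
    """VE(s1) = 1 - risk_vaccine(s1) / risk_placebo(s1).

    Assumed (hypothetical) working model:
    risk_z(s1) = Phi(b0 + b1*z + b2*s1 + b3*z*s1).
    """
    b0, b1, b2, b3 = beta
    risk_placebo = std_normal_cdf(b0 + b2 * s1)
    risk_vaccine = std_normal_cdf(b0 + b1 + (b2 + b3) * s1)
    return 1.0 - risk_vaccine / risk_placebo


# Hypothetical coefficients: a negative interaction b3 makes VE rise with s1.
beta_demo = (-1.0, -0.2, 0.0, -0.3)
ve_values = [ve_curve(s, beta_demo) for s in (1.0, 2.0, 3.0)]
print([round(v, 3) for v in ve_values])
```

With no main effect of treatment and no interaction (b1 = b3 = 0), the two risks coincide and the curve is identically zero, which matches the requirement that the contrast function vanish exactly when the two risks are equal.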
In this manuscript, we propose methods to estimate the causal estimand for bivariate treatment effect modification analysis
where 
, based on an estimated likelihood approach in the presence of non-monotone missingness with respect to
and
. Furthermore, in the special case where
denotes the baseline biomarker values, we also derive the estimator for the VE curve within the baseline seropositive subgroup,
![]() |
(2.2) |
and the VE curve within the baseline seronegative subgroup,
![]() |
(2.3) |
where
and
.
We make the common assumptions for randomized clinical trials: (A1) Stable Unit Treatment Values and Consistency:
for one subject is unaffected by the treatment assignments of other subjects, and given the treatment a subject actually receives, a subject’s potential outcomes equal the observed outcomes; (A2) Ignorable Treatment Assignment; (A3) Equal Early Clinical Risk:
. These three assumptions help with identifiability of our estimands. They have been used and discussed in detail in previous literature, e.g., Gilbert and Hudgens (2008); Huang and others (2013). Henceforth, we drop the notation of
and tacitly assume all probabilities condition on
. Furthermore, we assume the risk functions have a generalized linear model form:
(A4)
for some pre-specified link function
, for
.
In order to replace the unobservable
among placebo recipients with the CPV measurements
, the following two assumptions are made for the BIP+CPV design:
(A5) Time constancy of immune response: for placebo recipients who are clinical endpoint free at closeout,
, and
, for some underlying
and i.i.d. measurement errors
,
that are independent of one another.(A6) No placebo recipients who are clinical endpoint free at closeout experience the clinical endpoint over the next
time-units.
Henceforth, we consolidate the notation and let
be the potential outcome of
under treatment arm
, either obtained during the standard trial follow-up for vaccine recipients or replaced by the CPV measurements for placebo recipients who receive vaccination at closeout. We let
and
be the indicator that
is measured. In addition, if
denotes the baseline biomarker values, we also replace a missing
with
if it is available for placebo recipients based on (A3) and the next assumption (A7):
(A7) For participants who are clinical endpoint free at time
,
, and
, for some underlying
and i.i.d. measurement errors
,
that are independent of one another.
Henceforth, we use
to denote the baseline biomarker values (or its substitute from
), subject to the LLOQ:
. We let
be the indicator that
is available. Finally, we state the assumptions about missing at random (MAR) and sampling probability positivity for
and
, needed to justify the conditional likelihood method and the inverse probability weighting method presented in Section 2.1, respectively:
(A8) Missing at random (MAR) and positive sampling probability: First, let
indicate the missingness pattern for
out of the four different combinations of
and
. The probability an individual belongs to each individual pattern only depends on the observed variables among
. Moreover, we assume
for every individual in the vaccine arm (
), the sampling probability for
only depends on
,
, and
, and
for every individual in the trial.
Table 1 shows whether
and
are available for subject
in all possible scenarios.
is always missing for placebo recipients in a BIP-only design while replaced by CPV measurements if available in a BIP+CPV design. As
and
are not observed in every subject, indicators of their availabilities are denoted with
and
, respectively, and the observed data are
. We assume the
’s are independently and identically distributed (i.i.d). We use the corresponding lowercase letters to denote the values each random variable takes on. Specifically, we use
to denote a real value for
to emphasize that it refers to the potential outcome
if receiving vaccine.
Table 1.
For each trial participant $i$, the table lists the possible scenarios to which the participant could belong for the availability of $S_i(1)$ (the potential biomarker value measured at time $\tau$ if assigned to active treatment arm) and $B_i$ (the biomarker value measured at baseline).

| Treatment assignment | Biomarker measured at baseline? | Biomarker measured at time $\tau$ given $Y^{\tau} = 0$? | Availability of $B_i$ and $S_i(1)$ |
|---|---|---|---|
| $Z_i = 1$ | Yes | Yes | $B_i$ observed; $S_i(1)$ observed |
| $Z_i = 1$ | Yes | No | $B_i$ observed; $S_i(1)$ missing |
| $Z_i = 1$ | No | Yes | $B_i$ missing; $S_i(1)$ observed |
| $Z_i = 1$ | No | No | $B_i$ missing; $S_i(1)$ missing |
| $Z_i = 0$ | Yes | Yes | $B_i$ observed; $S_i(1)$ missing, replaced by CPV measurements if available when $Y_i = 0$ |
| $Z_i = 0$ | Yes | No | $B_i$ observed; $S_i(1)$ missing, replaced by CPV measurements if available when $Y_i = 0$ |
| $Z_i = 0$ | No | Yes | $B_i$ replaced by biomarker measured at $\tau$; $S_i(1)$ missing, replaced by CPV measurements if available when $Y_i = 0$ |
| $Z_i = 0$ | No | No | $B_i$ missing; $S_i(1)$ missing, replaced by CPV measurements if available when $Y_i = 0$ |
2.1. Risk model parameters estimation
We propose an estimated likelihood approach to estimate our risk model parameters
as specified in (A4). Based on the MAR assumption in (A8), we consider the maximization of the likelihood for
conditional on observed
and/or
. Define the nuisance parameter 
, the cdf of
conditional on
, the cdf of
conditional on
and
, and the cdf of
conditional on
and
. Then the conditional likelihood is 
where
![]() |
(2.4) |
![]() |
(2.5) |
![]() |
(2.6) |
![]() |
(2.7) |
The likelihood contribution from subject
takes on one of four forms ((2.4), (2.5), (2.6), (2.7)) depending on the values of
and
. If
and
, then neither
nor
is missing and subject
contributes to the likelihood in the form of (2.4). If
and
, then
is missing. Subject
contributes to the likelihood in the form of (2.5), which is obtained by integrating
over the distribution of
conditional on
and
. If
and
, then
is missing. Subject
contributes to the likelihood in the form of (2.6), which is obtained by integrating
over the distribution of
conditional on
and
. If
and
, then both
and
are missing. Subject
contributes to the likelihood in the form of (2.7), which is obtained by integrating
over the joint distribution of
and
over
.
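The integration step described above can be sketched numerically. The example below is a simplified illustration, not the authors' implementation: it assumes a probit risk model with hypothetical coefficients, assumes the conditional distribution of the missing intermediate biomarker is normal with known mean and standard deviation, and ignores LLOQ censoring. Under those assumptions, the contribution of a subject whose intermediate biomarker is missing is the fully observed contribution averaged over that normal distribution, which Gauss-Hermite quadrature approximates well.

```python
import math

import numpy as np


def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lik_observed(y, z, s1, b, beta):
    """Contribution when both biomarkers are observed (hypothetical probit model)."""
    b0, bz, bs, bzs, bb = beta
    p = std_normal_cdf(b0 + bz * z + bs * s1 + bzs * z * s1 + bb * b)
    return p if y == 1 else 1.0 - p


def lik_missing_s1(y, z, b, beta, mu_s, sd_s, n_nodes=40):
    """Contribution when the intermediate biomarker is missing.

    Averages lik_observed over s1 ~ N(mu_s, sd_s^2), an assumed conditional
    distribution given the observed covariates, by Gauss-Hermite quadrature.
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    s_vals = mu_s + math.sqrt(2.0) * sd_s * nodes  # change of variables
    vals = [lik_observed(y, z, s, b, beta) for s in s_vals]
    return float(np.dot(weights, vals) / math.sqrt(math.pi))


# Hypothetical parameter values, for illustration only.
beta_demo = (-1.0, -0.3, 0.2, -0.6, -0.1)
contrib = lik_missing_s1(y=1, z=1, b=2.0, beta=beta_demo, mu_s=2.5, sd_s=0.5)
print(round(contrib, 6))
```

For a probit link and a normal integrating distribution the integral has a closed form, $\Phi\{(a + c\mu)/\sqrt{1 + c^2\sigma^2}\}$ for linear predictor $a + c\,s_1$, which gives a convenient check on the quadrature. The same pattern extends to a missing baseline biomarker, and to both being missing by integrating over a bivariate distribution.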
We consider the estimated likelihood approach for the estimation of
, originally proposed by Pepe and Fleming (1991) for two-phase sampling settings, where consistent estimators of
are obtained first and then
is maximized over
. Here, we adopt a parametric form for
and
. Then according to Bayes’ theorem, we have
![]() |
For example, in Section 2 of the Supplementary material available at Biostatistics online, we show that by assuming conditional normal distribution of
and
, we have
and
![]() |
where
. In general when the sampling probability depends on
,
, and
, based on (A8),
can be estimated by binary regression for the probability of
conditional on
,
,
or by the proportion of participants with
measured in each subset defined by
,
,
for discrete X;
can be estimated by binary regression for the probability of
conditional on
,
,
or by the proportion of participants with
and
available in each subset defined by
,
,
for discrete
. One can then estimate
by the weighted maximum likelihood estimator (MLE) for using data from individuals with
measured,
, where the contribution of each individual to the likelihood is inversely weighted by the estimated
. For estimation of
, the weighted MLE can be implemented using data from vaccine recipients who have both
and
available, where the contribution of each individual to the likelihood is weighted by the inverse of the estimated probability of
. The sampling scheme in the dengue studies is a special case, where
is measured in a simple random sample of all participants and the set with both
and
available is also a random subset of vaccine recipients. Thus
can be estimated by
/
and
can be estimated by
/
. The inverse probability weighted estimators of
and
derived in this special setting are equivalent to unweighted estimates since we are estimating a constant for the sampling probability of
, and for the sampling probability of
and
together among vaccine recipients. Note that even when there is a CPV component in the study design, we cannot use
from placebo recipients for the estimation of
because placebo recipients who have experienced the clinical endpoint by study closeout have zero probability of obtaining
thus inverse probability weighting is not applicable. The estimator of
is then derived as the maximizer of the estimated likelihood
.
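To make the weighting step concrete, the following sketch estimates sampling probabilities by the proportion sampled within each discrete stratum and then forms an inverse probability weighted estimate. It is a simplified stand-in: the weighted quantity is a mean (the weighted MLE of a normal mean) rather than the full conditional distribution, and the function names and toy data are hypothetical.

```python
import numpy as np


def stratum_sampling_probs(strata, sampled):
    """Estimate each subject's sampling probability by the sampled fraction in its stratum."""
    strata = np.asarray(strata)
    sampled = np.asarray(sampled, dtype=bool)
    probs = np.empty(strata.shape[0], dtype=float)
    for level in np.unique(strata):
        mask = strata == level
        probs[mask] = sampled[mask].mean()
    return probs


def ipw_mean(values, sampled, probs):
    """Inverse probability weighted mean using only the sampled subjects."""
    sampled = np.asarray(sampled, dtype=bool)
    weights = 1.0 / np.asarray(probs, dtype=float)[sampled]
    vals = np.asarray(values, dtype=float)[sampled]
    return float(np.sum(weights * vals) / np.sum(weights))


# Toy data: stratum 0 is sampled at 1/4, stratum 1 is fully sampled.
strata = [0, 0, 0, 0, 1, 1, 1, 1]
sampled = [1, 0, 0, 0, 1, 1, 1, 1]
values = [10.0, 0.0, 0.0, 0.0, 2.0, 2.0, 2.0, 2.0]  # unsampled entries are placeholders

probs = stratum_sampling_probs(strata, sampled)
print(ipw_mean(values, sampled, probs))               # weighted estimate
print(ipw_mean(values, sampled, np.full(8, 0.5)))     # constant probability: plain mean
```

The second call illustrates the remark about simple random sampling: when the estimated sampling probability is a constant, the weights cancel and the weighted estimate equals the unweighted one.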
Standard errors for
can be estimated using a perturbation resampling technique proposed by Zhuang and others (2019) (inspired by Parzen and others (1994)). In essence, one can generate
random realizations of
from a known distribution with mean of 1 and variance of 1 to create
. Let
be a perturbed version of
, where
and
is the perturbed estimator of
obtained by adding
as the weights in the estimating equations. Then the perturbed estimator
is derived as the maximizer of
. As shown in Zhuang and others (2019), the distribution of
given the observed data
, can be used to approximate the unconditional distribution of
. In practice, one may obtain a variance estimator of
based on the empirical variance of
realizations of
. In our simulation and example, we use
.
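The mechanics of perturbation resampling can be seen on a toy estimator. The sketch below is an illustration under simplifying assumptions, not the procedure of Zhuang and others (2019) in full: the estimator is a weighted sample mean standing in for the solution of weighted estimating equations, the perturbation weights are drawn from an exponential distribution with rate 1 (mean 1, variance 1), and all names and data are hypothetical.

```python
import numpy as np


def perturbed_estimates(x, n_perturb, rng):
    """Return n_perturb perturbed versions of the sample mean of x.

    Each replicate reweights every observation by an independent Exp(1)
    draw and recomputes the weighted mean; the data themselves are fixed.
    """
    x = np.asarray(x, dtype=float)
    draws = rng.exponential(scale=1.0, size=(n_perturb, x.size))
    return draws @ x / draws.sum(axis=1)


rng = np.random.default_rng(2024)
x = rng.normal(loc=1.0, scale=2.0, size=2000)

theta_star = perturbed_estimates(x, n_perturb=500, rng=rng)
se_perturb = theta_star.std(ddof=1)             # empirical SD of perturbed estimates
se_analytic = x.std(ddof=1) / np.sqrt(x.size)   # textbook standard error of a mean
print(se_perturb, se_analytic)
```

The two standard errors should be close. In the setting of this section the same random weights would be applied consistently at every stage, so that the variability from estimating the nuisance parameters propagates into the perturbed risk model estimates.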
2.2. Estimation of marginal VE curves and VE curves within the baseline seropositive/seronegative subgroups
In this section, we study the special problem where
denotes the baseline biomarker values subject to the LLOQ left censoring, and derive the estimators for the marginal VE curve,
, the VE curve in the baseline seropositive subgroup,
, and the VE curve in the baseline seronegative subgroup,
. With some calculations, the risk of
conditional on
,
and
in the general population, and among baseline seropositive or baseline seronegative subgroups can be expressed as:
![]() |
All three risk functions can be estimated based on
and
. We consider situations where
is categorical with
levels:
. Then
,
, and
. In Section 1 of the Supplementary material available at Biostatistics online, we show that based on Bayes’ theorem and with some calculations, we have
![]() |
(2.8) |
![]() |
(2.9) |
![]() |
(2.10) |
We propose estimating
nonparametrically specified by
,
and based on Bayes’ theorem
and
can be estimated by consistent estimators
and
through:
and
. Probabilities
,
, and
can then be estimated by
and
based on expression 2.8, 2.9, and 2.10. Together with estimators for
,
and
, the VE curve, baseline seropositive VE curve, and baseline seronegative VE curve defined in expressions 2.1, 2.2, and 2.3 can be estimated by using
,
and
. In Section 2 of the Supplementary material available at Biostatistics online, we provide a detailed estimation procedure of the marginal VE curve, baseline seropositive VE curve, and baseline seronegative VE curve for the case where
and
are assumed normal and the risk functions take the form
.
A perturbation resampling method (Zhuang and others, 2019) can be used to make simultaneous inference on the VE curve, the baseline seropositive VE curve, and the baseline seronegative VE curve. Because VE ranges from negative infinity to 1, we construct confidence intervals for the VE curves assuming approximate normality of the log of the relative risk (RR), where $\mathrm{RR} = 1 - \mathrm{VE}$
. To be specific, perturbed estimators
,
, and
are obtained based on
. Then the corresponding perturbed estimators for
,
, and 
are obtained based on
,
, and
. Repeat this process
times to obtain
realizations of
,
, and
, denoted by
,
and 
. Then based on the
realizations, we calculate the sample standard deviations
,
,
. The
pointwise confidence intervals for
in the general population and within the baseline seropositive/seronegative subgroups can be constructed as
![]() |
And the
simultaneous confidence bands of corresponding
for
can be constructed as
![]() |
where
is the
th percentile of
,
is the
th percentile of
, and
and
are defined similarly to
with the logRR estimator, logRR perturbed estimator and standard error estimator replaced by its own version. Finally, the Wald
pointwise and simultaneous confidence bands for
,
, and
are obtained by transformation of the symmetric bounds from the logRR scale back to the VE scale.
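The construction above can be sketched as follows, assuming the perturbed VE curves are available on a grid of biomarker values. The function name and the simulated inputs are hypothetical, and the critical value for the simultaneous band is taken here as the empirical quantile of the maximum standardized absolute deviation of the perturbed log RR from its estimate, which is one common way to calibrate such bands.

```python
from statistics import NormalDist

import numpy as np


def ve_confidence_bands(ve_hat, ve_perturbed, alpha=0.05):
    """Pointwise intervals and a simultaneous band for VE on a grid.

    ve_hat: shape (K,), estimated VE at K grid points.
    ve_perturbed: shape (M, K), M perturbed realizations of the curve.
    Bounds are built on the log RR scale, log(1 - VE), and mapped back.
    """
    ve_hat = np.asarray(ve_hat, dtype=float)
    ve_perturbed = np.asarray(ve_perturbed, dtype=float)
    log_rr_hat = np.log1p(-ve_hat)
    log_rr_star = np.log1p(-ve_perturbed)
    sd = log_rr_star.std(axis=0, ddof=1)

    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    sup_stat = np.max(np.abs(log_rr_star - log_rr_hat) / sd, axis=1)
    c_alpha = np.quantile(sup_stat, 1.0 - alpha)

    def to_ve(half_width):
        # A larger log RR means a smaller VE, so the bounds swap.
        lower = 1.0 - np.exp(log_rr_hat + half_width)
        upper = 1.0 - np.exp(log_rr_hat - half_width)
        return lower, upper

    return to_ve(z * sd), to_ve(c_alpha * sd)


# Simulated inputs for illustration only.
rng = np.random.default_rng(1)
ve_hat = np.array([0.3, 0.5, 0.7])
noise = rng.normal(0.0, 0.1, size=(1000, 3))
ve_star = 1.0 - np.exp(np.log(1.0 - ve_hat) + noise)

(pw_lo, pw_hi), (sim_lo, sim_hi) = ve_confidence_bands(ve_hat, ve_star)
print(np.round(pw_lo, 3), np.round(pw_hi, 3))
print(np.round(sim_lo, 3), np.round(sim_hi, 3))
```

Working on the log RR scale keeps every upper bound strictly below 1, which a symmetric interval built directly on the VE scale would not guarantee.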
Furthermore, simultaneous inference enables evaluation of the hypothesis testing of
for
. Let
, then
is equivalent to
for
. Let
denote the estimator
; let
denote one of the
realizations of the perturbed estimator
,
; let
denote the sample standard deviation of
; finally let
denote the
percentile of
. Then the
simultaneous confidence bands for
is
. Let 
,
and
be the
percentile of
. Then, it follows from Roy and Bose (1953) that a test of the null hypothesis
for
at significance level
is obtained by using the region of rejection
, and the two-sided p-value can be obtained as the empirical probability that
.
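A supremum-type test of this kind can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the exact procedure of the article: the estimated difference between two curves is supplied on a grid, the perturbed realizations are centered at the estimate to approximate the null distribution, and the p-value is the empirical proportion of perturbed maxima that reach the observed maximum. All names and inputs are hypothetical.

```python
import numpy as np


def sup_test_p_value(delta_hat, delta_perturbed):
    """Empirical p-value for H0: delta(s) = 0 at every grid point.

    delta_hat: shape (K,), estimated difference between two curves.
    delta_perturbed: shape (M, K), perturbed realizations of that difference.
    """
    delta_hat = np.asarray(delta_hat, dtype=float)
    delta_perturbed = np.asarray(delta_perturbed, dtype=float)
    sd = delta_perturbed.std(axis=0, ddof=1)
    observed = np.max(np.abs(delta_hat) / sd)
    null_draws = np.max(np.abs(delta_perturbed - delta_hat) / sd, axis=1)
    return float(np.mean(null_draws >= observed))


rng = np.random.default_rng(7)
noise = rng.normal(0.0, 0.1, size=(500, 4))

p_far = sup_test_p_value(np.ones(4), np.ones(4) + noise)   # estimate far from zero
p_null = sup_test_p_value(np.zeros(4), noise)              # estimate exactly zero
print(p_far, p_null)
```

Rejecting when the p-value falls below the significance level is equivalent to checking whether the simultaneous confidence band for the difference excludes zero somewhere on the grid.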
3. Simulation studies
Simulation data are generated with 10 000 subjects randomized to vaccine and placebo by a ratio of 2:1. The baseline covariate
was generated with a multinomial distribution to have four categories, 1, 2, 3, and 4 with corresponding probabilities of 0.25, 0.25, 0.25, and 0.25.
,
, and
are dummy variables indicating category 2, 3, or 4, respectively. Baseline biomarker values
were generated from a normal distribution with mean of
and standard deviation of 0.86.
values were generated from a normal distribution with mean of
and standard deviation of 0.4, which leads to a correlation of 0.78 between
and
and a correlation of 0.20 between
and
. Let the LLOQ be 1, then
and
. We assume a probit risk model of the clinical outcome
conditional on
,
,
, and
:
. We set
to
so that the probability of the endpoint
equals 0.16 in the placebo arm and 0.08 in the vaccine arm. These simulation parameters were chosen to reflect the characteristics of the two phase 3 dengue trials, CYD14 and CYD15. The resultant true
,
, and
curves are shown in Figure 1.
Fig. 1.
The true
curve (solid),
curve (dashed), and
curve (dot-dashed) in the simulation design.
To achieve a non-monotone sampling design, 35% of study participants are randomly sampled to have
retained. For the BIP-only design,
is set missing for all placebo recipients and retained in all cases and all participants with
measured in the vaccine arm, that is
. For the BIP+CPV design, 70% of event-free placebo recipients are randomly sampled to be included in the CPV component and have
retained. Simulation results are based on 500 Monte-Carlo simulations and for each simulation 250 perturbation iterations are generated to construct pointwise confidence intervals and simultaneous confidence bands. In our simulation, we used the exponential distribution with rate of 1 to generate
.
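The sampling design just described can be mimicked with a short data generator. The sketch below follows the stated design features (10 000 subjects, roughly 2:1 randomization, four equally likely covariate categories, standard deviations of 0.86 and 0.4, an LLOQ of 1, 35% sampling of the baseline biomarker, and the BIP-only retention rule), but the biomarker means and the probit risk coefficients are hypothetical placeholders because the exact values are not reproduced here. Treatment is assigned by independent Bernoulli draws, which is an assumption of this sketch.

```python
from statistics import NormalDist

import numpy as np

norm_cdf = np.vectorize(NormalDist().cdf)


def simulate_trial(n=10_000, lloq=1.0, frac_b=0.35, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.binomial(1, 2.0 / 3.0, size=n)   # about 2:1 vaccine to placebo
    x = rng.integers(1, 5, size=n)           # categories 1..4, equal probability

    # Hypothetical mean models for the uncensored biomarkers.
    b_star = rng.normal(1.2 + 0.1 * x, 0.86)
    s_star = rng.normal(1.5 + 0.4 * b_star, 0.4)
    b = np.maximum(b_star, lloq)             # left censoring at the LLOQ
    s1 = np.maximum(s_star, lloq)

    # Hypothetical probit risk model.
    lin = -1.0 - 0.2 * z - 0.3 * z * (s1 - lloq) - 0.1 * (b - lloq)
    y = rng.binomial(1, norm_cdf(lin))

    # Non-monotone sampling: B kept for a random 35% of all participants.
    b_sampled = rng.random(n) < frac_b
    b_obs = np.where(b_sampled, b, np.nan)

    # BIP-only: S(1) missing for placebo; kept in the vaccine arm for
    # all cases and for everyone with B measured.
    keep_s1 = (z == 1) & ((y == 1) | b_sampled)
    s1_obs = np.where(keep_s1, s1, np.nan)

    return {"z": z, "x": x, "y": y, "b_obs": b_obs,
            "s1_obs": s1_obs, "b_sampled": b_sampled}


data = simulate_trial()
print(data["b_sampled"].mean(), np.isnan(data["s1_obs"]).mean())
```

Because vaccine-arm cases keep the intermediate biomarker whether or not the baseline biomarker was sampled, some subjects have the intermediate value without the baseline value, which is what makes the missingness pattern non-monotone.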
For comparison with our proposed estimators, we consider two alternative estimators based on existing approaches for two-phase studies that restrict the analysis to part of the observed data. The first approach obtains MLEs without using baseline biomarker values
as baseline predictors nor in the risk models (henceforth referred to as the MLEs ignoring
) based on the estimated likelihood method proposed in GH and the analysis is restricted on observed data
,
; the second approach restricts analysis to the subset of data with
measured thus in the analysis subset B is measured from everyone and can be treated like a baseline covariate X. Similar to the first approach, MLEs are obtained based on the estimated likelihood method proposed in GH and henceforth we refer to this estimator as the MLEs restricted. The comparison analysis between our proposed estimators, the MLEs ignoring
, and the MLEs restricted is performed under two scenarios: (1) the true risk model
with non-zero
and
, thus the working risk model is correct for our proposed estimators and the MLEs restricted but is incorrect for the MLEs ignoring
; (2)
and
, therefore
is not associated with
in the risk model and the working risk models for our proposed estimators, the MLEs ignoring
, and the MLEs restricted are correctly specified. We present the detailed procedures of obtaining the MLEs ignoring
and the MLEs restricted in Section 3 and Section 4 of the Supplementary material available at Biostatistics online, respectively.
Tables 2 and 3 present the average bias and sample standard deviation (SD) of our proposed estimators compared to the MLEs ignoring $B$ and the MLEs restricted. In both the BIP-only and BIP+CPV designs, our proposed estimators have the smallest bias and smallest SD. In the setting where ignoring $B$ leads to an incorrect working model, the MLEs ignoring $B$ have the largest bias due to the incorrectly specified working risk model and the MLEs restricted have the largest SD due to the reduced data size. In the setting where ignoring $B$ yields a correctly specified working model, our proposed estimators still have the best performance. The MLEs restricted have the largest bias and largest SD due to the reduced data size; the MLEs ignoring $B$ are also less efficient than our proposed estimators because we gain efficiency by predicting missing $S(1)$ through both $B$ and $X$ rather than through $X$ alone.
Table 2.
Bias and sample standard deviation (SD) comparison in a BIP-only design

| | | (-0.40) | (0.50) | (0.65) | (-0.55) | (-1.30) | (0.00) | (0.24) | (0.11) | (0.20) |
|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | |
| Bias × 100 | Our proposed estimators | 0.47 | -0.13 | -0.65 | -0.02 | 0.39 | 0.20 | 0.15 | 0.14 | 0.37 |
| | MLEs ignoring B | -4.05 | -4.10 | 1.95 | 1.36 | — | — | -0.55 | 0.51 | -0.81 |
| | MLEs restricted | 3.92 | 0.46 | -1.94 | -0.16 | 0.39 | 0.30 | -0.18 | -0.47 | -0.39 |
| SD × 100 | Our proposed estimators | 10.84 | 12.26 | 3.21 | 3.96 | 2.16 | 7.18 | 4.28 | 4.39 | 4.68 |
| | MLEs ignoring B | 18.38 | 19.26 | 6.07 | 6.21 | — | — | 4.52 | 4.49 | 4.83 |
| | MLEs restricted | 25.32 | 31.83 | 16.42 | 14.72 | 12.69 | 12.87 | 13.05 | 14.14 | 11.40 |
| | | | | | | | | | | |
| Bias × 100 | Our proposed estimators | 0.98 | -0.15 | -0.35 | 0.06 | 0.02 | 0.01 | 0.05 | 0.15 | 0.13 |
| | MLEs ignoring B | 1.19 | -0.12 | -0.58 | 0.10 | — | — | -0.26 | -0.24 | -0.07 |
| | MLEs restricted | 1.42 | -1.72 | -0.61 | 1.05 | 0.38 | 0.24 | -2.20 | -1.86 | -1.50 |
| SD × 100 | Our proposed estimators | 10.16 | 11.79 | 3.31 | 3.86 | 2.71 | 3.40 | 4.33 | 4.41 | 4.59 |
| | MLEs ignoring B | 11.94 | 12.55 | 6.12 | 6.09 | — | — | 4.71 | 4.89 | 5.33 |
| | MLEs restricted | 15.92 | 19.17 | 10.16 | 8.27 | 8.33 | 9.48 | 10.64 | 11.37 | 8.88 |
Table 3.
Bias and sample standard deviation (SD) comparison in a BIP+CPV design
| Estimator | (-0.40) | (0.50) | (0.65) | (-0.55) | (-1.30) | (0.00) | (0.24) | (0.11) | (0.20) |
|---|---|---|---|---|---|---|---|---|---|
| Bias × 100 | | | | | | | | | |
| Our proposed estimators | -0.55 | 0.36 | 0.58 | -0.16 | 0.28 | 0.12 | -0.12 | -0.19 | 0.08 |
| MLEs ignoring B | 4.59 | -3.85 | -1.73 | 1.69 | — | — | -0.18 | -0.67 | -0.34 |
| MLEs restricted | 2.94 | 1.27 | 1.38 | -0.31 | 0.21 | 0.15 | 0.33 | -0.28 | |
| SD × 100 | | | | | | | | | |
| Our proposed estimators | 5.04 | 5.70 | 3.16 | 2.29 | 2.75 | 7.29 | 1.61 | 1.75 | 1.50 |
| MLEs ignoring B | 10.38 | 10.29 | 7.97 | 6.89 | — | — | 1.83 | 1.78 | 1.63 |
| MLEs restricted | 12.60 | 14.79 | 16.85 | 10.69 | 16.16 | 12.50 | 4.89 | 5.64 | 3.65 |
| | | | | | | | | | |
| Bias × 100 | | | | | | | | | |
| Our proposed estimators | -0.35 | 0.32 | -0.14 | -0.13 | 0.24 | 0.12 | -0.14 | -0.22 | 0.04 |
| MLEs ignoring B | 0.43 | -0.44 | -0.16 | 0.19 | — | — | 0.24 | 0.16 | 0.36 |
| MLEs restricted | 0.51 | 3.67 | 0.24 | 2.17 | 0.88 | 1.12 | 6.16 | 2.72 | 0.46 |
| SD × 100 | | | | | | | | | |
| Our proposed estimators | 5.27 | 5.34 | 3.16 | 1.32 | 2.09 | 6.99 | 1.83 | 1.74 | 1.81 |
| MLEs ignoring B | 7.17 | 10.32 | 6.15 | 2.04 | — | — | 1.88 | 1.81 | 2.46 |
| MLEs restricted | 8.26 | 11.22 | 9.58 | 2.83 | 6.42 | 19.50 | 4.51 | 4.47 | 3.48 |
We then evaluated the finite-sample performance of our proposed estimators for
,
, and
. Results are presented in Figure 2. The empirical coverage levels of the 95% simultaneous confidence bands are reported as “simultaneous.cover.” The bias was smaller than 3% across all
values. The increase in bias at larger
values is due to the fact that the observed
values are rarer at these extreme values, so that fewer data are available for estimation, and the fact that at high VE, only a small number of vaccine recipients have events. The coverage probabilities of the confidence intervals given fixed
values and simultaneous confidence bands across all
values are fairly close to the nominal level. The observed power of the test
for
is 0.67 and 0.72 for the BIP-only design and the BIP+CPV design, respectively.
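To make the distinction between pointwise intervals and simultaneous bands concrete, the sketch below builds both from a matrix of perturbed estimates on a grid of marker values. It is a generic sup-statistic construction under stated assumptions, not necessarily the exact procedure used in this article: the function name `wald_intervals_and_band` and the example inputs are hypothetical, the perturbed estimates are assumed to be centered around the point estimate, and any transformation of the VE scale is omitted.

```python
from statistics import NormalDist

import numpy as np


def wald_intervals_and_band(estimate, perturbed, level=0.95):
    """Pointwise Wald intervals and a simultaneous band over a grid.

    estimate:  array of shape (n_grid,), point estimates on the grid
    perturbed: array of shape (n_perturb, n_grid), perturbed estimates
    """
    estimate = np.asarray(estimate, dtype=float)
    perturbed = np.asarray(perturbed, dtype=float)

    # Pointwise standard errors from the spread of the perturbed estimates.
    se = perturbed.std(axis=0, ddof=1)

    # Pointwise intervals use the standard normal quantile.
    z_point = NormalDist().inv_cdf(0.5 + level / 2)

    # Simultaneous band: quantile of the maximum standardized deviation
    # over the whole grid, taken across perturbation replicates.
    sup_stat = np.max(np.abs(perturbed - estimate) / se, axis=1)
    c_band = float(np.quantile(sup_stat, level))

    return {
        "se": se,
        "z_point": z_point,
        "c_band": c_band,
        "pointwise": (estimate - z_point * se, estimate + z_point * se),
        "band": (estimate - c_band * se, estimate + c_band * se),
    }


# Hypothetical inputs for illustration only.
rng = np.random.default_rng(1)
grid_estimate = np.linspace(0.5, 0.9, 20)
perturbed = grid_estimate + rng.normal(scale=0.05, size=(500, 20))
result = wald_intervals_and_band(grid_estimate, perturbed)
print(result["z_point"], result["c_band"])
```

Because the band must cover the whole curve at once, its critical value is larger than the pointwise normal quantile, so the band is wider than the pointwise intervals at every grid point.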
Fig. 2.
Average bias for our proposed estimators
,
, and
and coverage probabilities of 95% perturbation Wald confidence intervals.
4. Application to the CYD14 and CYD15 trials
CYD14 (CYD15) is an observer-masked, randomized controlled, multi-center, phase 3 trial in five countries in the Asia-Pacific (Latin America) region in which participants were randomized in a 2:1 allocation to receive three injections of the CYD-TDV vaccine or placebo at months 0, 6, and 12 (Capeding and others, 2014; Villar and others, 2015). Dengue neutralizing antibody titers to each of the four dengue serotype strains at month 13 were measured for all virologically confirmed dengue disease cases and a subset of controls (defined as participants free of the virologically confirmed dengue disease endpoint), of whom only a fraction have their baseline titers measured. In this illustration, we applied our proposed method to data pooled across CYD14 and CYD15, restricted to 9- to 16-year-olds, to assess how VE varied by month 13 titers within the baseline seropositive and seronegative subgroups. We let
and
be the average of the log10-transformed neutralizing antibody titers to the four dengue serotypes at month 13 and at baseline, respectively. Baseline seropositive and seronegative subgroups are defined as
and
. Immunogenicity studies in CYD14 and CYD15 have a non-monotone sampling design with respect to
and
, where
is available for everyone in the immunogenicity subset (
) and
is available for all vaccine recipients who are either in the immunogenicity subset or are cases (
). Let
denote participants’ age and country categories, and let Y be the indicator of virologically confirmed dengue disease occurrence after the month 13 study visit and by the month 25 visit. We model the risk of
conditional on
,
,
, and
in the same way as in the simulation studies.
Figure 3 shows the estimated VE curves (marginal, baseline seropositive, and baseline seronegative) and 95% CIs and CBs based on 500 perturbation iterations. VE curves were similar for baseline seropositive and baseline seronegative subgroups. For vaccine recipients with month 13 average titers of 500 and 10 000, estimated VE was 79.3% and 97.3% for the baseline seropositive subgroup compared to 70.4% and 91.8% for the baseline seronegative subgroup, respectively. Furthermore, we tested the null hypothesis
:
for
range of month 13 average titers in vaccinees in the data, which gave a p-value of 0.35. The large variations in the VE curves suggest that the magnitude of dengue neutralizing antibody titer following vaccination predicts the level of CYD-TDV VE to prevent virologically confirmed dengue disease, and provides a map between neutralizing antibody titer and the predicted level of VE. This guides use of the vaccine, for example by providing an input parameter into models for bridging VE to new populations (Gilbert and others, 2019). It also supports use of the neutralizing antibody endpoint for ranking and selecting among future candidate dengue vaccines developed within the same vaccine class. Moreover, the hypothesis test result, which indicates no significant difference between the baseline seropositive and baseline seronegative VE curves, suggests that neutralizing antibody titer has a similar association with VE regardless of whether the child had been previously infected with dengue at the time of first vaccination.
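As a reading aid for the VE values quoted above, the sketch below shows how a titer is mapped to the log10 scale and how VE is obtained from a pair of risks, assuming the usual definition of VE as one minus the ratio of risk under vaccine to risk under placebo. The function name `vaccine_efficacy` and the two risk values are hypothetical; they are not estimates from CYD14 or CYD15.

```python
import math


def vaccine_efficacy(risk_vaccine, risk_placebo):
    """VE = 1 - (risk under vaccine) / (risk under placebo)."""
    if risk_placebo <= 0:
        raise ValueError("placebo risk must be positive")
    return 1.0 - risk_vaccine / risk_placebo


# Titers of 500 and 10 000 on the log10 scale used for the averaged titers.
s_low = math.log10(500)
s_high = math.log10(10_000)

# Hypothetical risks for illustration only.
ve_example = vaccine_efficacy(risk_vaccine=0.01, risk_placebo=0.05)
print(round(s_low, 2), round(s_high, 2), round(ve_example, 2))
```

Under this definition, a VE of 0 means equal risk in the two arms, and values closer to 1 mean a larger relative reduction in risk for vaccine recipients in that subgroup.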
Fig. 3.
Estimated VE by average
titer at month 13 with 95% pointwise confidence intervals and simultaneous confidence bands in CYD14 and CYD15 9- to 16-year-olds.
5. Discussion
Considerable research has been devoted to studying the causal effect of treatment adjusted for post-randomization variables under the principal stratification framework (Frangakis and Rubin, 2002). Schwartz and others (2011) presented a general Bayesian semiparametric model for comparisons on a continuous outcome Y, adjusting for intermediate variables. Their Bayesian approach treated the unobserved potential outcomes as part of the model parameters and proposed a Bayesian nonparametric model for the principal strata based on the Dirichlet process mixture model. Bayesian inference for the estimand of interest
was based on the posterior distribution of model parameters conditional on the observed data. In another randomized experiment to study the effect of a job-training program on employment and wages (Frumento and others, 2012), the primary causal estimands were the average causal effect on employment for compliers and the average causal effect on wages for always-employed compliers. They also adopted a Bayesian inference approach where missing potential outcomes are no different than unknown parameters and conducted a likelihood analysis for the model parameters, assuming that the values of the model parameters that governed the distribution of observable data were drawn from a prior distribution. In this article, we focused on a binary outcome
where the comparison is made on
and
and developed a frequentist estimated likelihood approach to evaluate effect modification by subgroups defined by an intermediate response biomarker (principal strata) and a baseline covariate applicable to general sampling designs with respect to the intermediate biomarker and the baseline covariate in vaccine trials.
In the motivating dengue application of this article, the distributions of
and of
conditional on
were estimated from random samples among trial participants and among vaccine recipients, respectively. However, our methods apply more generally to settings where sampling probabilities for
and
vary with other covariates, through the use of inverse probability weighting techniques when modeling the distribution of
and
. This requires the missing data model to be clearly specified, a condition that is satisfied in the dengue example because the missingness mechanism is determined by the study design. While we have considered situations where the baseline covariates
are categorical and employed a normal model for
and
, our proposed definition and estimation procedure can be used for arbitrary specified parametric forms for the conditional distributions of
and
. In practice, one should perform model checking to ensure an appropriate choice of parametric model. In addition, kernel-based methods can be considered for modeling the conditional distributions for low-dimensional covariate
settings, especially for problems where parametric models do not provide a good fit. In our work, to deal with nonidentifiability due to missing
, we focused on the BIP-only design and the BIP+CPV design. In the BIP-only design, the parametric risk model assumption made in (A4) is untestable, whereas the BIP+CPV design allows testing of this risk model assumption. Ertefaie and others (2018) utilize a different set of untestable assumptions to deal with missing
, including a structural model assumption relating
and
and the existence of an instrumental variable for consistent estimation of the risk model based on observed data. Their approach may be a useful alternative in the field of immune response correlates of VE.
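The inverse probability weighting idea mentioned in this paragraph can be sketched for the simplest case of fitting a normal model to a biomarker that is measured only on a subsample with known, covariate-dependent sampling probabilities. This is an illustrative sketch under those assumptions, not the authors' implementation: the function name `ipw_normal_fit`, the binary covariate, and all simulated values are hypothetical.

```python
import numpy as np


def ipw_normal_fit(x, sampled, sampling_prob):
    """Inverse-probability-weighted mean and SD for a normal model.

    x:             biomarker values (entries for unsampled participants are ignored)
    sampled:       boolean indicator that the biomarker was measured
    sampling_prob: known probability that each participant was sampled
    """
    x = np.asarray(x, dtype=float)
    sampled = np.asarray(sampled, dtype=bool)
    prob = np.asarray(sampling_prob, dtype=float)

    weights = 1.0 / prob[sampled]          # each sampled value stands in for 1/prob participants
    xs = x[sampled]
    mean = np.sum(weights * xs) / np.sum(weights)
    var = np.sum(weights * (xs - mean) ** 2) / np.sum(weights)
    return mean, np.sqrt(var)


# Hypothetical data: sampling depends on a binary covariate that also shifts the biomarker.
rng = np.random.default_rng(2)
n = 20000
group = rng.integers(0, 2, size=n)
biomarker = rng.normal(loc=1.0 + 2.0 * group, scale=1.0)
prob = np.where(group == 1, 0.8, 0.2)
sampled = rng.random(n) < prob
observed = np.where(sampled, biomarker, np.nan)

ipw_mean, ipw_sd = ipw_normal_fit(observed, sampled, prob)
naive_mean = observed[sampled].mean()
print(ipw_mean, ipw_sd, naive_mean)
```

In this toy setup the unweighted mean of the sampled values over-represents the heavily sampled group, while the weighted fit targets the distribution in the full cohort. This is why the weights must come from a clearly specified sampling design.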
In some VE trials, samples are stored from all participants at multiple post-randomization time points, such that it is possible to study immune response biomarkers at multiple time points as correlates of VE (e.g., Rerks-Ngarm and others, 2009). A direction of potential future research is to compare and integrate titers obtained at different time points toward developing biomarkers with maximum effect modification.
6. Software
The code to generate a simulated data set with a design similar to that of the dengue example in Section 4, analyze this simulated data set, and produce the figure of the VE curves with 95% CIs and CBs is available at https://github.com/Yingying-Z/TrtEffMod-NonMonotoneMissingness. The results obtained on this simulated data set are provided in Section 5 of the Supplementary material available at Biostatistics online.
Acknowledgments
The authors thank the participants, investigators, and sponsors of the CYD14 and CYD15 trials.
Conflict of Interest: None declared.
Supplementary material
Supplementary material is available online at http://biostatistics.oxfordjournals.org.
Funding
Research reported in this publication was supported by Sanofi Pasteur and the National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH), Department of Health and Human Services (award number R37AI054165). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or Sanofi Pasteur.
References
- Capeding, M. R., Tran, N. H., Hadinegoro, S. R., Ismail, H. I., Chotpitayasunondh, T., Chua, M. N., Luong, C. Q., Rusmil, K., Wirawan, D. N., Nallusamy, R. and others. (2014). Clinical efficacy and safety of a novel tetravalent dengue vaccine in healthy children in Asia: a phase 3, randomised, observer-masked, placebo-controlled trial. The Lancet 384, 1358–1365.
- Ertefaie, A., Hsu, J. Y., Page, L. C. and Small, D. S. (2018). Discovering treatment effect heterogeneity through post-treatment variables with application to the effect of class size on mathematics scores. Journal of the Royal Statistical Society: Series C (Applied Statistics) 67, 917–938.
- Follmann, D. (2006). Augmented designs to assess immune response in vaccine trials. Biometrics 62, 1161–1169.
- Frangakis, C. E. and Rubin, D. B. (2002). Principal stratification in causal inference. Biometrics 58, 21–29.
- Frumento, P., Mealli, F., Pacini, B. and Rubin, D. B. (2012). Evaluating the effect of training on wages in the presence of noncompliance, nonemployment, and missing outcome data. Journal of the American Statistical Association 107, 450–466.
- Gilbert, P. B., Huang, Y., Juraska, M., Moodie, Z., Fong, Y., Luedtke, A., Zhuang, Y., Shao, J., Carpp, L. N., Jackson, N. and others. (2019). Bridging efficacy of a tetravalent dengue vaccine from children/adolescents to adults in highly endemic countries based on neutralizing antibody response. The American Journal of Tropical Medicine and Hygiene 101, 164–179.
- Gilbert, P. B. and Hudgens, M. G. (2008). Evaluating candidate principal surrogate endpoints. Biometrics 64, 1146–1154.
- Huang, Y., Gilbert, P. B. and Wolfson, J. (2013). Design and estimation for evaluating principal surrogate markers in vaccine trials. Biometrics 69, 301–309.
- Neyman, J. S. (1923). On the application of probability theory to agricultural experiments. Essay on principles. Section 9. (Translated and edited by D. M. Dabrowska and T. P. Speed, Statistical Science (1990), 5, 465–480). Annals of Agricultural Sciences 10, 1–51.
- Parzen, M. I., Wei, L. J. and Ying, Z. (1994). A resampling method based on pivotal estimating functions. Biometrika 81, 341–350.
- Pepe, M. S. and Fleming, T. R. (1991). A nonparametric method for dealing with mismeasured covariate data. Journal of the American Statistical Association 86, 108–113.
- Rerks-Ngarm, S., Pitisuttithum, P., Nitayaphan, S., Kaewkungwal, J., Chiu, J., Paris, R., Premsri, N., Namwat, C., de Souza, M., Adams, E. and others. (2009). Vaccination with ALVAC and AIDSVAX to prevent HIV-1 infection in Thailand. New England Journal of Medicine 361, 2209–2220.
- Roy, S. N. and Bose, R. C. (1953). Simultaneous confidence interval estimation. The Annals of Mathematical Statistics 24, 513–536.
- Schwartz, S. L., Li, F. and Mealli, F. (2011). A Bayesian semiparametric approach to intermediate variables in causal inference. Journal of the American Statistical Association 106, 1331–1344.
- Sridhar, S., Luedtke, A., Langevin, E., Zhu, M., Bonaparte, M., Machabert, T., Savarino, S., Zambrano, B., Moureau, A., Khromava, A. and others. (2018). Effect of dengue serostatus on dengue vaccine safety and efficacy. New England Journal of Medicine 379, 327–340.
- Villar, L., Dayan, G. H., Arredondo-García, J. L., Rivera, D. M., Cunha, R., Deseda, C., Reynales, H., Costa, M. S., Morales-Ramírez, J. O., Carrasquilla, G. and others. (2015). Efficacy of a tetravalent dengue vaccine in children in Latin America. New England Journal of Medicine 372, 113–123.
- Wolfson, J. and Gilbert, P. B. (2010). Statistical identifiability and the surrogate endpoint problem, with application to vaccine trials. Biometrics 66, 1153–1161.
- Zhuang, Y., Huang, Y. and Gilbert, P. B. (2019). Simultaneous inference of treatment effect modification by intermediate response endpoint principal strata with application to vaccine trials. International Journal of Biostatistics 16(1). DOI: 10.1515/ijb-2018-0058.