Abstract
The COVID‐19 pandemic due to the novel coronavirus SARS‐CoV‐2 has inspired remarkable breakthroughs in the development of vaccines against the virus and the launch of several phase 3 vaccine trials in Summer 2020 to evaluate vaccine efficacy (VE). Trials of vaccine candidates using mRNA delivery systems developed by Pfizer‐BioNTech and Moderna have shown substantial VEs of 94–95%, leading the US Food and Drug Administration to issue Emergency Use Authorizations and subsequent widespread administration of the vaccines. As the trials continue, a key issue is the possibility that VE may wane over time. Ethical considerations dictate that trial participants be unblinded and those randomized to placebo be offered study vaccine, leading to trial protocol amendments specifying unblinding strategies. Crossover of placebo subjects to vaccine complicates inference on waning of VE. We focus on the particular features of the Moderna trial and propose a statistical framework based on a potential outcomes formulation within which we develop methods for inference on potential waning of VE over time and estimation of VE at any postvaccination time. The framework clarifies assumptions made regarding individual‐ and population‐level phenomena and acknowledges the possibility that subjects who are more or less likely to become infected may be crossed over to vaccine differentially over time. The principles of the framework can be adapted straightforwardly to other trials.
Keywords: crossover, inverse probability weighting, potential outcomes, randomized phase 3 vaccine trial, waning vaccine efficacy
1. INTRODUCTION
The primary objective of a vaccine trial is to estimate vaccine efficacy (VE). Typically, these trials are double‐blind, placebo‐controlled studies in which participants are randomized to either vaccine or placebo and followed for the primary endpoint. This endpoint is often time to viral infection, and inference on VE is based on it; VE is defined as the reduction in infection risk for vaccine relative to placebo, expressed as a percentage.
Vaccine trials have become the focus of immense global interest as a result of the COVID‐19 disease pandemic due to the novel coronavirus SARS‐CoV‐2 (COVID‐19 Vaccine Tracker). The pandemic inspired unprecedented scientific breakthroughs in the rapid development of vaccines against SARS‐CoV‐2, culminating in the launch of several large phase 3 vaccine trials in Summer 2020. Trials in the United States studying the vaccine candidates using messenger RNA (mRNA) delivery systems developed by Pfizer‐BioNTech and Moderna began in July 2020 and demonstrated substantial evidence of VEs of 94–95% at interim analyses, leading the US Food and Drug Administration (FDA) to issue Emergency Use Authorizations (EUAs) for both vaccines in December 2020 and to the rollout of vaccination programs shortly thereafter.
Implicit in the primary analysis in these trials is the assumption that VE is constant over the study period, and, with primary endpoint time to infection, VE is represented by (1 − the ratio of the hazard rate for vaccine to that for placebo), estimated based on a Cox proportional hazards model. As the trials continue following the EUAs, among the many issues to be addressed is the possibility that VE may wane over time. Principled evaluation of the nature and extent of waning of VE is of critical public health importance, as waning has implications for measures to control the pandemic. Were all participants in the trials to continue on their randomized assignments (study vaccine or placebo), evaluation of potential waning of VE would be straightforward. However, once efficacy is established, ethical considerations dictate the possibility of unblinding all participants and offering the study vaccine to those randomized to placebo. After consultation with stakeholders, Pfizer and Moderna issued amendments to their trial protocols specifying unblinding strategies and modifications to planned analyses.
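As a minimal numerical sketch of the relationship just described, VE under a constant hazard ratio is one minus that ratio, expressed as a percentage. The function name and the hazard values below are illustrative assumptions and are not taken from any trial.

```python
def vaccine_efficacy(hazard_vaccine: float, hazard_placebo: float) -> float:
    """Return VE (%) = 100 * (1 - hazard ratio of vaccine to placebo)."""
    if hazard_placebo <= 0:
        raise ValueError("placebo hazard must be positive")
    return 100.0 * (1.0 - hazard_vaccine / hazard_placebo)


# Hypothetical constant hazards (infections per person-week), for illustration only.
ve = vaccine_efficacy(hazard_vaccine=0.00005, hazard_placebo=0.001)
print(f"VE = {ve:.1f}%")
```

In practice the hazard ratio would be estimated, for example from a Cox proportional hazards fit, rather than supplied directly.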
Crossover of placebo subjects to the study vaccine of necessity complicates inference on waning of VE and has inspired recent research (Follmann et al., 2020; Fintzi and Follmann, 2021; Lin et al., 2021). We propose a statistical framework within which we develop methods for inference on whether or not VE wanes over time based on data where subjects are unblinded and those on placebo may cross over to study vaccine and in which assumptions made regarding individual and population phenomena are made transparent. It is possible that subjects who are more or less likely to become infected could be unblinded and cross over to vaccine differentially over time, which could lead to biased inferences due to confounding; accordingly, this possibility is addressed explicitly in the framework. The first author (AAT) has the privilege of serving on the Data and Safety Monitoring Board for all U.S. government‐sponsored COVID‐19 vaccine trials and is thus well acquainted with the unblinding approach for the Moderna trial. Accordingly, the development is based on the specifics of this trial, but the principles can be adapted to the features of other trials.
In Section 2, we review the Moderna trial and the resulting data. We present a conceptual framework in which we define VE precisely as a function of time postvaccination in Section 3. In Section 4, we develop a formal statistical framework within which we propose methodology for estimation of VE and describe its practical implementation in Section 5. Simulations demonstrating performance are presented in Section 6.
2. CLINICAL TRIAL STRUCTURE AND DATA
We first describe the timeline of the Moderna Coronavirus Efficacy (COVE) trial (Baden et al., 2020) on the scale of calendar time. The trial opened on July 27, 2020 (time 0), and reached full accrual at time (October 23, 2020). On December 11, 2020, denoted by , the FDA issued an EUA for the Pfizer vaccine, followed by an EUA for the Moderna mRNA‐1273 vaccine on December 18, 2020. Amendment 6 of the study protocol was issued on December 23, 2020 and specified the unblinding strategy (see figure 2 of the protocol) under which, starting on December 24, 2020, study participants are scheduled on a rolling basis over several months for Participant Decision clinic visits (PDCVs) at which they will be unblinded. If originally randomized to vaccine, participants continue to be followed; if randomized to placebo, participants can receive the Moderna vaccine at the PDCV or refuse and either seek another vaccine outside the study or remain unvaccinated. Let denote the time at which all PDCVs have taken place. The study will continue until time at which all participants will have completed full follow up at 24 months after initial treatment assignment. Assume that the analysis of VE using the methods in Sections 4.4 and 5 takes place at time , where all participants have achieved the primary endpoint, requested to be unblinded, or attended the PDCV by L. The Moderna vaccine is administered in two doses, ideally 4 weeks apart, and is not thought to achieve full efficacy until 2 weeks following the second dose. Accordingly, the primary endpoint is defined as symptomatic viral infection occurring after a lag of weeks following the initial dose.
Under this scheme, we characterize the data on a given participant as follows. Let denote the calendar time at which the subject entered the trial, denote baseline covariates, and (1) if assigned to placebo (vaccine). Denote observed time to infection on the scale of calendar time as U, and , where if B is true and 0 otherwise. At , availability of the Pfizer vaccine commenced, at which point some subjects not yet infected requested to be unblinded. Denote by R (calendar time) the minimum of (i) time to such an unblinding, in which case , and define ; (ii) time of PDCV, so , and let ; or (iii) time to infection, in which case and . If and , so that the subject was randomized to vaccine, she/he continues to be followed; if , she/he can agree to receive the Moderna vaccine, or refuse, . We distinguish the cases and 2 to acknowledge different unblinding dynamics before and after . Because a very small number of participants requested unblinding before , and although the protocol allows participants to refuse unblinding at PDCV, all subjects are strongly encouraged to unblind, we do not include these possibilities in the formulation.
Table 1 summarizes the timeline and observed data. The trial data are thus
(1) |
independent and identically distributed (iid) across i.
TABLE 1.
Summary of notation. All times are on the scale of calendar time, where time 0 is the start of the trial

| Variable | Definition |
|---|---|
| Trial Milestones | |
| | Full accrual reached, October 23, 2020 |
| | Pfizer granted EUA, December 11, 2020 |
| | Moderna granted EUA, December 18, 2020 |
| | Participant Decision clinic visits (PDCVs) commence, December 24, 2020 |
| | PDCVs conclude |
| | Follow‐up concludes, trial ends |
| ℓ | Lag between initial vaccine dose and full efficacy, 6 weeks |
| L | Time of analysis of vaccine efficacy using the proposed methods; time at which all subjects have achieved the endpoint, requested unblinding, or attended the PDCV |
| Observed Data on a Trial Participant | |
| E | Study entry time |
| | Baseline information |
| A | Treatment assignment: placebo, A = 0, or vaccine, A = 1 |
| | Time to symptomatic infection, indicator of infection by time L |
| | Time to requested unblinding, PDCV/requested unblinding, or infection, whichever comes first |
| | Infection occurs before requested/offered unblinding |
| | Time to requested unblinding |
| | Time to PDCV or requested unblinding |
| Ψ | Indicator of whether subject receives Moderna vaccine, or refuses and seeks another vaccine outside the study or remains unvaccinated |
3. CONCEPTUALIZATION OF VACCINE EFFICACY
Similar to Halloran et al. (1996) and Longini and Halloran (1996), we consider the following framework in which to conceptualize VE. The study population, comprising individuals for which inference on VE is of interest, is that of individuals susceptible to infection, represented by the trial participants. There is a population of individuals outside the trial with which trial participants interact, assumed to be much larger than the number of participants, so that interactions among participants are much less likely than interactions with the outside population. The probability that a trial participant will become infected at calendar time t depends on three factors: , the contact rate, the number of contacts with the outside population per unit time; , the prevalence of infections in the outside population at t; and , the transmission probability at t, the probability a susceptible individual in the study population will become infected per contact with an infected individual from the outside population. Dependence of on time acknowledges the emergence of new variants of the virus, which may be more or less virulent, as in the COVID‐19 pandemic. Assuming random mixing, is the contact rate at time t with infected individuals, and the infection rate at time t is .
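The three-factor decomposition in this paragraph can be sketched numerically. The sketch assumes random mixing, so the infection rate is the product of the contact rate, the prevalence, and the per-contact transmission probability; the function name, parameter names, and numbers are hypothetical.

```python
def infection_rate(contact_rate: float, prevalence: float,
                   transmission_prob: float) -> float:
    """Infection rate per unit time under random mixing.

    contact_rate      contacts with the outside population per unit time
    prevalence        fraction of the outside population that is infected
    transmission_prob probability of infection per contact with an infected person
    """
    # Contacts per unit time that are with infected individuals.
    infected_contact_rate = contact_rate * prevalence
    return infected_contact_rate * transmission_prob


# Hypothetical individual: 10 contacts per week, 2% prevalence.
rate_placebo = infection_rate(10.0, 0.02, transmission_prob=0.05)
rate_vaccine = infection_rate(10.0, 0.02, transmission_prob=0.0025)
print(rate_placebo, rate_vaccine, 1.0 - rate_vaccine / rate_placebo)
```

When contact rate and prevalence are the same under vaccine and placebo, as in this toy individual, the relative infection rate reduces to the ratio of the two transmission probabilities.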
We adapt this framework to the COVID‐19 pandemic. The prevalence rate in the pandemic can vary substantially in time and space, so denote by S the trial site at which a participant is enrolled, and let be the prevalence at time t at site . Although varies by t and s, assume that it is unaffected by the individuals in the trial and thus represents an external force. We view the contact rate as individual specific; accordingly, for an arbitrary individual in the study population, let the random variables denote potential contact rates. These potential outcomes can be regarded as individual‐specific behavioral characteristics of trial participants, where some may be more careful and make fewer contacts while others take more risks, and behavior can vary over time and by vaccination and blinding status. Here, is the contact rate at time t if the individual were to receive vaccine, , or placebo, , and be blinded to this assignment; by virtue of blinding, it is reasonable to take .
As in Table 1, letting ℓ denote the lag between initial dose and full efficacy, reflects behavior of a placebo subject who is unblinded, receives the Moderna vaccine, and is within ℓ weeks of vaccination. Likewise, reflects behavior of any unblinded Moderna vaccine recipient after ℓ, both those originally randomized to placebo and crossed over to the vaccine and those originally randomized to vaccine. Thus, allows for more cautious behavior before full efficacy is achieved for recently vaccinated placebo subjects; in the trial, all subjects randomized to vaccine were past the full efficacy lag at the time of unblinding (as in Table 1, ). Similar to the stable unit treatment value assumption (Rubin, 1980), assume that is the same if the individual was randomized to vaccine and unblinded before t or was randomized to placebo and subsequently unblinded and crossed over to the Moderna vaccine before t. The rate reflects behavior of an unblinded placebo subject who does not cross over to the Moderna vaccine and does not play a role in the development, and, as demonstrated in Section 4.4, such subjects do not contribute to the analysis of VE.
Finally, for an arbitrary participant, let the random variable be the potential individual‐specific transmission probability per contact at t if she/he were to receive placebo, and let be the same if she/he were to receive study vaccine and have been vaccinated for units of time. As we now demonstrate, this formulation allows us to represent VE as a function of τ and thus consider whether or not VE wanes over time since vaccination.
With the set of potential outcomes for an arbitrary individual in the study population who enrolls at site S thus given by , the infection rate in the study population at calendar time t if all individuals were to receive placebo and be blinded to that assignment is ; likewise, the infection rate at t if all individuals were to receive vaccine at time and be blinded to that assignment is . The relative infection rate at t is then
(2) |
Accordingly, VE at time t after vaccination at is , reflecting the proportion of infections at t that would be prevented if the study population were vaccinated and on study vaccine for τ units of time during the blinded phase of the study.
In the sequel, we assume that and thus depend only on τ and write and . This assumption embodies the belief that, although infection rates may change over time, the relative effect of vaccine to placebo remains approximately constant and holds if (i) , where means “independent of” and this independence is conditional on ; and (ii) , so does not depend on t and . Condition (i) reflects the interpretation of and as inherent biological characteristics of an individual, whereas S and are external and behavioral characteristics, respectively; thus, once common individual and external baseline covariates are taken into account, biological and geographic/behavioral characteristics are unrelated. Condition (ii) implies that, although new viral variants may change transmission probabilities under both vaccine and placebo over time, this change stays in constant proportion, and this proportion is similar for individuals with different characteristics. Further discussion is given in Section 7 and Web Appendix B of the Supporting Information.
Within this framework, the goal of inference on waning of VE based on the data from the trial can be stated precisely as inference on , , so reflecting VE after full efficacy is achieved. It is critical to recognize that, like estimands of interest in most clinical trials, represents VE at time since vaccination τ under the original conditions of the trial, under which all participants are blinded. The challenge we address in subsequent sections is how to achieve valid inference on , , using data from the modified trial in which blinded participants are unblinded in a staggered fashion, with placebo subjects offered the option to receive the study vaccine.
We propose a semiparametric model within which we cast this objective. Let , , and , , be the infection rates in the study population at t if all individuals were to receive vaccine at time and be unblinded to that fact. Analogous to (i) above, assume that , and continue to assume condition (ii). Then, for two values of τ, it is straightforward that (see Web Appendix A of the Supporting Information)
(3) |
Defining and , by (3) with and (ℓ) on the left (right) hand side, the infection rates at t if all individuals in the study population were unblinded and to receive vaccine at time are
(4) |
Likewise, from (2), the infection rate at t if all individuals in the study population were blinded and to receive vaccine at time is
(5) |
We now represent the infection rate ratio as
(6) |
where is a function of τ; θ0 and are real‐ and vector‐valued parameters, respectively; and is a real‐valued function of such that for all and . For example, taking yields , , in which case implies that , , does not change with time since vaccination, and indicates that decreases with increasing τ; that is, exhibits waning. More complex specifications of using splines (e.g., Fintzi and Follmann, 2021) or piecewise constant functions could be made; for example, for ,
(7) |
Because interest focuses only on , we leave unspecified.
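To make the role of the waning parameter concrete, the following sketch evaluates VE under an assumed log-linear rate-ratio model with a single change point: the rate ratio is exp(θ0) up to time v since vaccination and exp(θ0 + θ1) afterward. This specific form, the change point v = 20 weeks, and the parameter values are assumptions chosen for illustration so that VE is 95% before v and 65% after.

```python
import math


def rate_ratio(tau: float, theta0: float, theta1: float, v: float) -> float:
    """Assumed vaccine-to-placebo infection rate ratio at time tau since vaccination.

    Piecewise-constant specification: log ratio is theta0 for tau <= v
    and theta0 + theta1 for tau > v.
    """
    g = theta1 if tau > v else 0.0
    return math.exp(theta0 + g)


def ve(tau: float, theta0: float, theta1: float, v: float) -> float:
    """VE as a proportion: one minus the rate ratio."""
    return 1.0 - rate_ratio(tau, theta0, theta1, v)


theta0 = math.log(0.05)   # VE = 95% before the change point
theta1 = math.log(7.0)    # rate ratio rises to 0.35, so VE = 65% afterward
v = 20.0                  # hypothetical change point, in weeks

for tau in (10.0, 30.0):
    print(tau, round(ve(tau, theta0, theta1, v), 4))
```

Setting `theta1 = 0.0` gives the same VE at every `tau`, which corresponds to no waning; a positive `theta1` lowers VE after the change point.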
Under this model, (5) and (4) can be written as
(8) |
Thus, to estimate for any and make inference on potential waning of VE, we develop a principled approach to estimation of based on the data from the modified trial in which participants are unblinded and those on placebo may cross over to study vaccine.
4. STATISTICAL FRAMEWORK
4.1. Motivation
Estimation of , equivalently , would be straightforward for any over the entire follow‐up period if all participants remained blinded and on their assigned treatments throughout the trial. However, subjects randomized to placebo, when unblinded, have the option to receive the study vaccine on or after . For , it is possible to estimate because, due to randomization, for we have representative samples of blinded subjects on vaccine and placebo and thus information on and , so can estimate θ0 and components of identified for such τ; for example, in (7) depending on the values of v 1 and v 2. At , the data comprise a mixture of blinded and unblinded participants, where, within the latter group, those on placebo may have opted to receive study vaccine or refuse. Here, information, albeit diminishing during , on and is available from participants not yet unblinded, which contributes to estimation of θ0 and components of . Information is also available on from individuals who were originally randomized to vaccine and provide information on longer τ and from individuals who recently crossed over to study vaccine and provide information on shorter τ. For , there are no longer blinded subjects, so that information is available only on . For these latter groups, for longer and shorter , , and, because of the mixture of times since vaccination, can be fully estimated.
Through the following potential outcomes formulation and under suitable assumptions, in the next several sections, we develop an approach to estimation of based on the observed data (1) that embodies the foregoing intuitive principles.
4.2. Potential outcomes formulation
Denote by the potential time to infection on the scale of patient time for an arbitrary individual in the study population if she/he were to enter the trial at calendar time e, receive placebo and be blinded to that fact, and, if not infected by calendar time r, be unblinded and cross over to study vaccine at r. Let , if she/he is never crossed over to receive vaccine. Similarly, define to be the potential time to infection (patient time scale) for an arbitrary individual if she/he were to enter the trial at e, receive vaccine and be blinded to that fact, and, if not infected by r, be unblinded at r; and define . We make the consistency assumptions that if and if . For , denote the hazard at calendar time t, , by
(9) |
where the addition of e induces a shift from patient to calendar time. Denote the set of all potential outcomes as .
The development in Section 3 is in terms of infection rates at the individual‐specific and population levels. Population‐level hazard rates such as (9) are not equivalent to population‐level infection rates. However, we argue in Web Appendix C of the Supporting Information that, because the probabilities of infection under vaccine and placebo during the course of the trial are small, population‐level hazard rates and population‐level infection rates are approximately equivalent; this assumption is implicit in the standard primary analysis noted in Section 1. Thus, to reflect this, we use familiar notation and write , , and . Under these conditions, using (8), we can write for
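A rough numerical intuition for this approximation, not the argument of Web Appendix C, is sketched below. It assumes individual infection rates are constant in time and vary across individuals according to a gamma distribution (a hypothetical choice). The population hazard at time t averages the individual rates over those not yet infected, whereas the population infection rate averages over everyone; the two are close when few individuals have been infected.

```python
import math
import random


def hazard_to_rate_ratio(shape: float, mean_rate: float, t: float,
                         n: int = 100_000, seed: int = 1) -> float:
    """Ratio of population hazard at time t to population mean infection rate."""
    rng = random.Random(seed)
    scale = mean_rate / shape                      # gamma mean = shape * scale
    rates = [rng.gammavariate(shape, scale) for _ in range(n)]
    # Probability each individual is still uninfected at time t.
    surv = [math.exp(-lam * t) for lam in rates]
    # Hazard: mean rate among those still at risk (survival-weighted mean).
    pop_hazard = sum(l * s for l, s in zip(rates, surv)) / sum(surv)
    pop_rate = sum(rates) / n
    return pop_hazard / pop_rate


# About 3% expected cumulative infection: hazard and rate nearly coincide.
low = hazard_to_rate_ratio(shape=2.0, mean_rate=0.03 / 50.0, t=50.0)
# Very high cumulative infection: selection of low-risk survivors matters.
high = hazard_to_rate_ratio(shape=2.0, mean_rate=3.0 / 50.0, t=50.0)
print(round(low, 3), round(high, 3))
```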
(10) |
(11) |
where (11) follows because , , . Define the counting processes for infection by and , and the at‐risk processes by and , (Fleming and Harrington, 2005). From the above consistency assumptions, if , then , , . For , let be the cumulative hazard. Because , , it follows that , , are mean‐zero counting process increments. Thus, any linear combination of these increments over can be used to define unbiased estimating functions in of quantities of interest. In Web Appendix D of the Supporting Information, we formulate a particular set of estimating functions that, based on iid potential outcomes , , lead to consistent and asymptotically normal estimators for , , . Because interest focuses on for , estimation of and is not considered and is reflected in the specification of the linear combinations; see Web Appendix D.
For fixed t, , the estimating functions for and are, respectively,
(12) |
(13) |
where and , , are arbitrary nonnegative weight functions, specification of which is discussed later. The estimating function for is given by
(14) |
where . Analogous to Yang et al. (2018), envisioning (12)–(14) as characterizing a system of estimating functions , if we could observe , , we would estimate by solving the estimating equations .
4.3. Identifiability assumptions
Of course, the potential outcomes , , are not observed. However, we now present assumptions under which we can exploit the developments in the last section to derive estimating equations yielding estimators based on the observed data (1).
Define the indicator that a participant is observed to be infected at time t by , the observed at‐risk indicator at t by , and
(15) |
indicates that a subject entering the trial at time e and randomized to placebo () or vaccine () has not yet been infected or unblinded by t. For , indicates that a subject randomized to placebo at time e is unblinded (either by request or at a PDCV) at time r and crosses over to study vaccine at r, and if a subject randomized to vaccine at time e is unblinded at r. Make the consistency assumptions
(16) |
We now make assumptions similar in spirit to those adopted in observational studies. By randomization,
(17) |
where we subsume the site indicator S in , and let . It is realistic to assume that the mix of baseline covariates changes over the accrual period; for example, during the trial, because of lagging accrual of elderly subjects and subjects from underrepresented groups, an effort was made to increase participation of these groups in the latter part of the accrual period. Accordingly, we allow the distribution of entry time E to depend on , and denote its conditional density as . We make the no unmeasured confounders assumption
(18) |
Define the hazard functions of unblinding in the periods between the Pfizer EUA and the start of PDCVs and after the start of PDCVs, respectively, as
where for () and (). Because the accrual period was short relative to the length of follow‐up, we take these unblinding hazard functions to not depend on E, although including such dependence is straightforward; and, similar to a noninformative censoring assumption, to not depend on and write
(19) |
Define , , , , , , (), or (). Because and are defined on the nonoverlapping intervals and , respectively, with , ,
Finally, define , .
Let be the probability that a placebo participant unblinded at R agrees to receive the Moderna vaccine. Similar to (19), we assume that this probability does not depend on ; moreover, because the unblinding interval is short relative to the length of follow‐up, we assume that it does not depend on R but does depend on the unblinding dynamics at R. Thus, write
(20) |
4.4. Observed data estimating equations
We now outline, under the assumptions (16)–(20), which we take to hold henceforth, how we can develop unbiased estimating equations based on the observed data yielding consistent and asymptotically normal estimators for . The basic premise is to use inverse probability weighting (IPW) to probabilistically represent potential outcomes in terms of the observed data to mimic the estimating functions (12)–(14).
Considering (15), define the inverse probability weights
We show in Web Appendix E of the Supporting Information that
(21) |
(22) |
(23) |
(24) |
To obtain observed data analogs to the estimating functions (12)–(14), based on the equalities in (21)–(24), we substitute the IPW expressions in the conditional expectations on the left‐hand sides. Using (15) and (21)–(22), the analog to (12) is given by
(25) |
Likewise, using (23)–(24), the analog to (13) is
(26) |
An entirely similar representation of (14) in terms of the observed data can be deduced and is suppressed for brevity.
To simplify notation, based on (25), (26), and the analogous expression for (14), define
It is important to recognize that these expressions are equal to zero for an unblinded placebo participant who is at risk at time t, , and who has refused study vaccine, , which is equivalent to censoring such a subject, as she/he cannot provide information on study vaccine after t. These expressions are also equal to zero at time t for an at‐risk unblinded placebo participant who receives study vaccine but has been vaccinated for less than ℓ weeks at t and for an at‐risk blinded vaccine participant vaccinated for less than ℓ weeks at t, reflecting the fact that such individuals do not contribute information on VE for until times t at which they have reached full efficacy, in which case the expressions are ≥0. Moreover, by excluding the at‐risk unblinded placebo participants vaccinated for less than ℓ weeks at t, the behavior reflected by does not play a role.
Define also
Then, it is straightforward that the observed‐data estimating functions are
Letting , , , , , and denote evaluation at in (1), the foregoing developments lead to the set of observed‐data estimating equations
(27) |
(28) |
For fixed , the estimators for and are the solutions to the equations in (27) given by
(29) |
Substituting these expressions in (28) yields the estimating equation in given by
(30) |
which can be solved in to yield the estimator .
5. PRACTICAL IMPLEMENTATION AND INFERENCE
Choice of the weight functions , , , and is arbitrary but can play an important role in the performance of the resulting estimators. We recommend taking a fixed value of , for example, the sample mean, and setting and , , where the latter does not depend on t. The resulting weights and , , are referred to as stabilized weights (Robins et al., 2000), as they mitigate the effect of small inverse probability weights that can give undue influence to a few observations. Note that dependence of the inverse probability weights on cancels in construction of stabilized weights. Moreover, if there is no confounding, in that , in (19), , and do not depend on , the stabilized weights are equal to 1. Interpretation of the stabilized weights is discussed further in Web Appendix F.
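The idea of a stabilized weight can be sketched for a single factor of the inverse probability weights: the probability that an unblinded placebo participant agrees to receive vaccine. The sketch assumes a logistic model in one scalar covariate; the coefficients, the covariate values, and the function names are hypothetical, and the full weights in the text involve additional factors not shown here.

```python
import math


def expit(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def stabilized_weights(xs, intercept: float, slope: float):
    """Ratio of the agreement probability at a fixed covariate value x*
    (here the sample mean) to the probability at each individual's covariate."""
    x_star = sum(xs) / len(xs)
    numerator = expit(intercept + slope * x_star)
    return [numerator / expit(intercept + slope * x) for x in xs]


xs = [0.0, 1.0, 2.0]
# No confounding: the probability does not depend on x, so every weight is 1.
print(stabilized_weights(xs, intercept=math.log(4.0), slope=0.0))
# Confounding: participants less likely to agree receive weights above 1.
print(stabilized_weights(xs, intercept=math.log(4.0), slope=0.5))
```

Because numerator and denominator come from the same model, the weights stay near 1 unless the covariate strongly affects the probability, which is what limits the influence of any single observation.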
If the “survival probabilities” for R, , and the densities , , and in the inverse probability weights, which appear in the expressions in the estimating equation (30), were known, (30) could be solved to yield an estimator for and in particular characterizing VE waning. As these quantities are unknown, models must be posited for them, leading to estimators that can be substituted in (30). We propose the use of Cox proportional hazards models for , , in (19), which can be fitted using the data , ; and for the hazard of entry time E given , which can be fitted using , . A binary, for example, logistic, regression model can be used to represent and fitted using for i such that .
For individual i, the stabilized weights involve the quantities , , , and . With proportional hazards models as above with predictors , say, it is straightforward that , where the baseline hazard cancels from numerator and denominator, and similarly for . Thus, the estimated stabilized weights involve only the estimated cumulative hazard functions and estimators for the , each of which is root‐n consistent and asymptotically normal.
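The cancellation of the baseline hazard can be checked with a short sketch. It assumes a proportional hazards model with a single scalar covariate, so that the survival function is exp{−Λ0(t) exp(βx)} and the density is the hazard times the survival function; these are standard identities, while the function names and numerical values are hypothetical.

```python
import math


def survival_ratio(cum_base_hazard: float, beta: float,
                   x_star: float, x: float) -> float:
    """S(t | x*) / S(t | x) under a Cox model; depends on t only via Lambda0(t)."""
    return math.exp(-cum_base_hazard * (math.exp(beta * x_star) - math.exp(beta * x)))


def density_ratio(cum_base_hazard: float, beta: float,
                  x_star: float, x: float) -> float:
    """f(t | x*) / f(t | x): the baseline hazard lambda0(t) cancels."""
    return math.exp(beta * (x_star - x)) * survival_ratio(cum_base_hazard, beta, x_star, x)


def density(base_hazard: float, cum_base_hazard: float, beta: float, x: float) -> float:
    """Density written out in full, used only to verify the cancellation."""
    risk = math.exp(beta * x)
    return base_hazard * risk * math.exp(-cum_base_hazard * risk)


# The directly computed ratio is the same for any value of the baseline hazard.
direct_a = density(0.3, 1.2, 0.7, 0.5) / density(0.3, 1.2, 0.7, 1.5)
direct_b = density(5.0, 1.2, 0.7, 0.5) / density(5.0, 1.2, 0.7, 1.5)
print(direct_a, direct_b, density_ratio(1.2, 0.7, 0.5, 1.5))
```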
As sketched in Web Appendix G, with stabilized weights set equal to one or estimated, (30) can be solved easily in via a Newton–Raphson algorithm. A heuristic argument demonstrating that is asymptotically normal leading to an expression for its approximate sampling variance using the sandwich technique is given in Web Appendix G.
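For readers unfamiliar with the approach, a generic Newton–Raphson solver for a scalar estimating equation is sketched below. It is not the algorithm of Web Appendix G and does not implement (30); the toy estimating function (the score for a constant infection rate on the log scale) and all names and data are hypothetical.

```python
import math


def newton_raphson(score, score_derivative, start: float,
                   tol: float = 1e-10, max_iter: int = 100) -> float:
    """Solve score(theta) = 0 by Newton-Raphson iteration."""
    theta = start
    for _ in range(max_iter):
        step = score(theta) / score_derivative(theta)
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")


# Toy data: infection indicators and weeks at risk for five hypothetical subjects.
events = [0, 1, 0, 0, 1]
weeks_at_risk = [20.0, 12.0, 25.0, 30.0, 8.0]
total_events = sum(events)
total_time = sum(weeks_at_risk)


def score(theta: float) -> float:
    # Observed minus expected infections when the rate is exp(theta).
    return total_events - math.exp(theta) * total_time


def score_derivative(theta: float) -> float:
    return -math.exp(theta) * total_time


theta_hat = newton_raphson(score, score_derivative, start=0.0)
print(theta_hat, math.log(total_events / total_time))
```

For this toy problem the solution is available in closed form, which makes it easy to confirm that the iteration converges to the right value.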
6. SIMULATIONS
We report on simulation studies demonstrating performance of the methods, each involving 1000 Monte Carlo replications, based roughly on the Moderna trial. We took and , , , and , where all times are in weeks, and consider an analysis at calendar time weeks, with 30,000. In all cases, where weeks and , corresponding to VE = 95% prior to time v, so that, depending on θ1, VE potentially wanes following v. We consider , corresponding to VE = 65% after time v, and , corresponding to no waning.
Because the trial and unblinding process are ongoing, we were not able to base our generative scenarios on data from the trial. Owing to the complexity of the trial and multiple potential sources of confounding, to facilitate exploration of a range of conditions while controlling computational complexity and intensity, we focused on several basic scenarios meant to represent varying degrees of confounding consistent with our expectations for the most likely sources of such confounding in the trial. Specifically, we took and to not depend on (or A in the latter case) in any scenario, reflecting mostly random entry and PDCV unblinding processes. In scenarios involving confounding, we took , corresponding to the period in which “requested unblinding” occurred, and the “agreement process” to depend on , as described below, reflecting our belief that these processes could be associated with participant characteristics.
In the first set of simulations, we consider two cases: (i) no confounding, where all of , , , and , Γ) do not depend on ; and (ii) confounding, where and , Γ) depend on as above. In both (i) and (ii), the entry process , that is, uniform on , and the unblinding process during PDCVs was ; see below. In each simulation experiment, for each participant in each Monte Carlo data set, we first generated , two baseline covariates and , , and E as above. To obtain R, we generated G 1 to be exponential with hazard , , where , corresponding to roughly 7% unblinding during , , and , , , for (i), no confounding, and for (ii), confounding. With and , we let and . We generated Ψ as Bernoulli, , , expit , expit , where , corresponding to approximately 80% agreement to receive the study vaccine by unblinded placebo participants, and , , for (i) and for (ii).
To generate , we first generated and based on (10)–(11), with , where , leading to approximately a 3% infection rate for placebo participants over L, and ; ; ; and , so that in (10)–(11), , are piecewise constant hazards. and were obtained via inverse transform sampling. We then generated U (calendar time) as , where for G 2 exponential with hazard ; infection times for unblinded placebo participants who decline vaccine are not used in the analysis. Finally, we set , and defined and . Although we obtained Ψ for all n participants, Ψ is used only when , .
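Inverse transform sampling from a piecewise constant hazard, as mentioned above, can be sketched as follows. This is a generic illustration rather than the simulation code used for the results; the break points and hazard values are hypothetical.

```python
import math
import random


def sample_piecewise_exponential(breaks, hazards, rng) -> float:
    """Draw an event time from a piecewise constant hazard.

    breaks[k] is the start of piece k (breaks[0] must be 0); hazards[k] applies
    on [breaks[k], breaks[k+1]), and the last piece extends to infinity.
    """
    # A unit exponential draw equals the cumulative hazard at the event time.
    target = -math.log(1.0 - rng.random())
    cumulative = 0.0
    for k, hazard in enumerate(hazards):
        if hazard == 0.0:
            continue  # no events can occur on this piece
        start = breaks[k]
        end = breaks[k + 1] if k + 1 < len(breaks) else math.inf
        piece = hazard * (end - start)
        if cumulative + piece >= target:
            # Invert the cumulative hazard within this piece.
            return start + (target - cumulative) / hazard
        cumulative += piece
    return math.inf  # hazard is zero on the final piece: never infected


rng = random.Random(2021)
# Hypothetical hazards: higher before week 1, lower afterward.
draws = [sample_piecewise_exponential([0.0, 1.0], [1.0, 0.5], rng)
         for _ in range(20_000)]
print(sum(d <= 1.0 for d in draws) / len(draws))
```

The printed fraction should be close to 1 − exp(−1), the probability of an event during the first piece under a hazard of 1.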
For each combination of (i) and (ii) and (a) and (b) , we estimated and thus for and two ways: taking the stabilized weights equal to 1, so disregarding possible confounding, and with estimated stabilized weights. The latter were obtained by fitting proportional hazards models for entry time E with linear predictor and for , , with linear predictors and , respectively; and a logistic regression model for .
Table 2 presents the results for estimation of θ1, dictating waning; , VE prior to weeks; and , VE after weeks. Because the Monte Carlo distribution of some of these quantities exhibited slight skewness, those for the VE quantities likely due to the exponentiation, we report both Monte Carlo mean and median. Estimation of shows virtually no bias for both (a) and (b); that for in case (a) shows minimal bias and virtually none for (b). In all cases, standard errors obtained via the sandwich technique as outlined in Web Appendix G along with the delta method for the VEs track the Monte Carlo standard deviations. Under both (i) no confounding and (ii) confounding, estimation of the stabilized weights appears to have little consequence for precision of the estimators relative to setting them equal to 1. Wald 95% confidence intervals, exponentiated for the VEs, achieve nominal coverage. For (b) and each combination of stabilized weights set equal to 1 or estimated and (i), no confounding, and (ii), confounding, we also calculated the empirical Type I error achieved by a Wald test at level of significance 0.05 for VE waning addressing the null and alternative hypotheses versus . These values are 0.04 and 0.06 when using stabilized weights set equal to 1 under (i) and (ii), respectively; the analogous values with estimated weights are 0.05 and 0.05 under (i) and (ii).
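The delta-method standard error and the exponentiated Wald interval for a VE can be illustrated with a short sketch. It assumes, for illustration only, that VE is written as 1 − exp(η), where η is an estimated log relative infection rate with standard error available from the fit; the function name and arguments are hypothetical, and this is not the implementation in the VEwaning package.

```python
import math

def ve_wald_interval(eta_hat, se_eta, z=1.959964):
    """Point estimate, delta-method SE, and transformed Wald interval for VE.

    Assumes VE = 1 - exp(eta), with eta a log relative infection rate
    (an illustrative assumption, labeled as such).
    """
    ve = 1.0 - math.exp(eta_hat)

    # Delta method: |d/d(eta) (1 - exp(eta))| = exp(eta).
    se_ve = math.exp(eta_hat) * se_eta

    # Build the interval on the eta scale, then transform the end points.
    # The transformation is decreasing, so the end points swap.
    lower = 1.0 - math.exp(eta_hat + z * se_eta)
    upper = 1.0 - math.exp(eta_hat - z * se_eta)
    return ve, se_ve, lower, upper
```

Transforming the end points keeps the upper limit below 1, which a symmetric interval built directly on the VE scale would not guarantee.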
TABLE 2.
Simulation results based on 1000 Monte Carlo replications, first scenario. Mean = mean of Monte Carlo estimates, Med = median of Monte Carlo estimates, SD = standard deviation of Monte Carlo estimates, SE = average of standard errors obtained via the sandwich technique/delta method, Cov = empirical coverage of nominal 95% Wald confidence interval (transformed for ). , VE prior to weeks; , VE after weeks. True values: (a) , , ; (b) ,
Quantity | Stabilized weights = 1 | | | | | Stabilized weights estimated | | | | |
---|---|---|---|---|---|---|---|---|---|---|
 | Mean | Med | SD | SE | Cov | Mean | Med | SD | SE | Cov |
(i), no confounding; (a) | | | | | | | | | | |
θ1 | 1.961 | 1.935 | 0.310 | 0.308 | 0.95 | 1.983 | 1.959 | 0.303 | 0.310 | 0.96 |
VE (prior) | 0.950 | 0.953 | 0.019 | 0.019 | 0.95 | 0.950 | 0.952 | 0.019 | 0.019 | 0.95 |
VE (after) | 0.634 | 0.663 | 0.183 | 0.174 | 0.96 | 0.626 | 0.662 | 0.188 | 0.177 | 0.96 |
(ii), confounding; (a) | | | | | | | | | | |
θ1 | 2.030 | 2.013 | 0.325 | 0.320 | 0.95 | 1.990 | 1.973 | 0.346 | 0.335 | 0.95 |
VE (prior) | 0.951 | 0.953 | 0.019 | 0.018 | 0.96 | 0.951 | 0.952 | 0.019 | 0.019 | 0.95 |
VE (after) | 0.614 | 0.647 | 0.199 | 0.185 | 0.95 | 0.619 | 0.665 | 0.201 | 0.186 | 0.94 |
(i), no confounding; (b) | | | | | | | | | | |
θ1 | −0.020 | −0.019 | 0.433 | 0.422 | 0.95 | 0.007 | 0.019 | 0.421 | 0.424 | 0.96 |
VE (prior) | 0.950 | 0.952 | 0.020 | 0.019 | 0.95 | 0.950 | 0.952 | 0.020 | 0.019 | 0.96 |
VE (after) | 0.947 | 0.954 | 0.032 | 0.030 | 0.96 | 0.946 | 0.953 | 0.033 | 0.031 | 0.95 |
(ii), confounding; (b) | | | | | | | | | | |
θ1 | 0.053 | 0.045 | 0.446 | 0.436 | 0.95 | 0.011 | −0.004 | 0.452 | 0.450 | 0.96 |
VE (prior) | 0.951 | 0.952 | 0.019 | 0.019 | 0.96 | 0.950 | 0.952 | 0.020 | 0.019 | 0.95 |
VE (after) | 0.944 | 0.951 | 0.035 | 0.032 | 0.96 | 0.945 | 0.954 | 0.036 | 0.033 | 0.95 |
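The summary columns of Table 2 and the empirical Type I error can be computed from the Monte Carlo replicates along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the coverage shown is for an untransformed Wald interval (the paper transforms the interval for the VEs), and the test is taken to be two-sided, which the text does not confirm.

```python
import numpy as np

def monte_carlo_summary(estimates, std_errors, truth, z=1.959964):
    """Mean, Med, SD, SE, and Cov over Monte Carlo replicates for one parameter."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    covered = np.abs(est - truth) <= z * se   # untransformed Wald interval
    return {
        "Mean": est.mean(),
        "Med": np.median(est),
        "SD": est.std(ddof=1),   # Monte Carlo standard deviation
        "SE": se.mean(),         # average of estimated standard errors
        "Cov": covered.mean(),   # empirical coverage
    }

def wald_rejection_rate(estimates, std_errors, null_value=0.0, z=1.959964):
    """Proportion of replicates in which a two-sided Wald test rejects (assumed form)."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    return float(np.mean(np.abs((est - null_value) / se) > z))
```

When the data are generated under the null value, `wald_rejection_rate` estimates the Type I error; agreement between the SD and SE columns indicates that the estimated standard errors track the true sampling variability.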
In the first set of simulations, the confounding induced by our generative choices led to little to no bias in the estimators for θ1 and the VEs prior to and after 20 weeks. Notably, modeling and fitting of the stabilized weights to adjust for potential confounding shows little effect relative to setting the stabilized weights to 1. To the extent that this scenario is a plausible approximation to actual conditions of the trial, it may be that confounding will not be a serious challenge for the analysis of VE waning.
To examine the ability of the methods with estimated stabilized weights to adjust for confounding that potentially could be sufficiently strong to bias results, we carried out additional simulations under settings (a) and (b) with (ii) confounding in which our choices of generative parameters induce a stronger association between the potential infection times and the agreement process. Specifically, we took instead and , with all other settings identical to those above.
Table 3 shows the results. The estimators for θ1 and are slightly biased when stabilized weights are set equal to 1, although coverage probability for the latter is at the nominal level. This feature is mitigated by use of estimated stabilized weights. Coverage probability for θ1 is somewhat lower than nominal. Under (b), empirical Type I error achieved by a Wald test at level of significance 0.05 of versus is 0.12 when stabilized weights are equal to 1, demonstrating the potential for biased inference; Type I error is 0.06 using estimated stabilized weights, leading to a more reliable test.
TABLE 3.
Simulation results based on 1000 Monte Carlo replications, second scenario. Entries are as in Table 2. True values: (a) , , ; (b) ,
Quantity | Stabilized weights = 1 | | | | | Stabilized weights estimated | | | | |
---|---|---|---|---|---|---|---|---|---|---|
 | Mean | Med | SD | SE | Cov | Mean | Med | SD | SE | Cov |
(ii), confounding; (a) | | | | | | | | | | |
θ1 | 2.125 | 2.100 | 0.315 | 0.299 | 0.93 | 2.009 | 2.008 | 0.346 | 0.325 | 0.94 |
VE (prior) | 0.952 | 0.953 | 0.017 | 0.016 | 0.97 | 0.950 | 0.952 | 0.017 | 0.017 | 0.96 |
VE (after) | 0.581 | 0.611 | 0.191 | 0.182 | 0.95 | 0.613 | 0.640 | 0.179 | 0.175 | 0.96 |
(ii), confounding; (b) | | | | | | | | | | |
θ1 | 0.171 | 0.149 | 0.436 | 0.403 | 0.92 | 0.050 | 0.053 | 0.447 | 0.426 | 0.95 |
VE (prior) | 0.951 | 0.953 | 0.017 | 0.017 | 0.97 | 0.950 | 0.952 | 0.018 | 0.017 | 0.96 |
VE (after) | 0.937 | 0.945 | 0.038 | 0.034 | 0.95 | 0.942 | 0.949 | 0.034 | 0.032 | 0.95 |
Overall, we speculate that confounding may not lead to substantial bias in estimation of VE waning, for several reasons: the Moderna study is a randomized, double‐blind trial; the unblinding period is finite and eventually all participants are unblinded; the refusal rate is likely to be low; and infection rates are low.
7. DISCUSSION
We have proposed a conceptual framework based on potential outcomes for study of VE in which assumptions on biological, behavioral, and other phenomena are made transparent. The methods provide a mechanism to account for possible confounding. The corresponding statistical framework combines information from blinded and unblinded participants over time. We focus on the setting of the Moderna phase 3 trial, but the principles can be adapted to other settings, including the blinded crossover design of Follmann et al. (2020), and apply to ongoing and future vaccine trials in which unblinding may well occur throughout and some participants may refuse the study vaccine in favor of already‐licensed products. Extension of the methods to the setting where time between doses varies across vaccinated participants due to either deviations from the protocol or by design would require modification of the framework presented here to represent VE as a function of both time between doses and time since vaccination.
Our approach and those of Lin et al. (2021) (LZG), Follmann et al. (2020), and Fintzi and Follmann (2021) for estimation of VE waning use a calendar time formulation and Cox hazard models. As do we, Follmann et al. (2020) and Fintzi and Follmann (2021) include data from placebo participants who cross over to study vaccine, whereas LZG censor such subjects and propose a sensitivity analysis. Because the approach of Follmann et al. (2020) and Fintzi and Follmann (2021) is based on a randomized crossover design that maintains the blind, confounding is not addressed, while LZG adjust for confounding via regression modeling. In our methodology, confounding is addressed through a potential outcomes formulation and IPW. As we do, Follmann et al. (2020) and Fintzi and Follmann (2021) represent using parametric or flexible spline models; LZG model nonparametrically.
Through condition (ii) in Section 3, (ii) , the methods embed the assumption that VE is similar across current and emerging viral variants. If the analyst is unwilling to adopt an assumption like condition (ii), then it is not possible to rule out that the data from the blinded (prior to ) and unblinded (starting at ) phases of the trial reflect very different variant mixtures. In this case, calendar time and time since vaccination cannot be disentangled, and thus, it is not possible to evaluate VE solely as a function of time since vaccination. However, it may be possible to evaluate the ratio of infection rates under vaccine at any time t (and thus variant mixture in force at t) after different times since vaccination τ1 and τ2, say, during the unblinded phase of the trial, namely, , . The infection rates can be estimated based on the infection status data at time t from vaccinated individuals who received vaccine at times and , respectively. These infection rates and their ratio will reflect information about the waning of the vaccine itself under the conditions at time t, and, in fact, this infection rate ratio can be viewed as the ratio of vaccine efficacies at τ1 and τ2. However, because after information on will no longer be available, it is not possible to deduce VE itself for . But if data external to the trial became available that provide information on VE at t, even for small τ, it may be possible to integrate this information with that from the infection rates to gain insight into VE as a function of τ.
Supporting information
Web Appendices referenced in Sections 3–6 and R code implementing the simulation scenarios in Section 6 are available with this paper at the Biometrics website on Wiley Online Library. An R package, VEwaning, implementing the methodology is available on GitHub at https://github.com/sth1402/VEwaning and at the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/web/packages/VEwaning/index.html.
ACKNOWLEDGMENTS
The authors thank Dean Follmann for helpful discussions and insights. The authors are grateful to Shannon Holloway for creating the R package VEwaning, noted in the Supporting Information.
Tsiatis AA, Davidian M. Estimating vaccine efficacy over time after a randomized study is unblinded. Biometrics. 2021;1–14. 10.1111/biom.13509
Contributor Information
Anastasios A. Tsiatis, Email: tsiatis@ncsu.edu
Marie Davidian, Email: davidian@ncsu.edu.
DATA AVAILABILITY STATEMENT
Data sharing is not applicable to this article, as no datasets are generated or analyzed in this paper. The methods developed in the paper are proposed to enable future analyses of data from ongoing vaccine trials from which the required data are not yet fully accrued.
REFERENCES
- Baden, L.R., El Sahly, H.M., Essink, B., Kotloff, K., Frey, S., Novak, R. et al. for the COVE Study Group (2020) Efficacy and safety of the mRNA‐1273 SARS‐CoV‐2 vaccine. New England Journal of Medicine, 384, 403–416. 10.1056/NEJMoa2035389.
- COVID‐19 Vaccine Tracker. https://covid19.trackvaccines.org/. Last updated 11 July 2021.
- Fintzi, J. and Follmann, D. (2021) Assessing vaccine durability in randomized trials following placebo crossover. arXiv preprint arXiv:2101.01295v3.
- Fleming, T.R. and Harrington, D.P. (2005) Counting processes and survival analysis. New York: Wiley.
- Follmann, D., Fintzi, J., Fay, M.P., Janes, H.E., Baden, L., El Sahly, H. et al. (2020) Assessing durability of vaccine effect following blinded crossover in COVID‐19 vaccine efficacy trials. medRxiv. 2020 Dec 14;2020.12.14.20248137. 10.1101/2020.12.14.20248137.
- Halloran, M.E., Longini, I.M. and Struchiner, C.J. (1996) Estimability and interpretation of vaccine efficacy using frailty mixing models. American Journal of Epidemiology, 144, 83–97.
- Lin, D.‐Y., Zeng, D. and Gilbert, P.B. (2021) Evaluating the long‐term efficacy of COVID‐19 vaccines. Clinical Infectious Diseases, ciab226. 10.1093/cid/ciab226.
- Longini, I.M. and Halloran, M.E. (1996) Frailty mixture model for estimating vaccine efficacy. Journal of the Royal Statistical Society, Series C, 45, 165–173.
- Moderna Clinical Study Protocol, Amendment 6, 23 December 2020, available at https://www.modernatx.com/sites/default/files/content_documents/Final%20mRNA-1273-P301%20Protocol%20Amendment%206%20-%2023Dec2020.pdf
- Robins, J.M., Hernán, M.A. and Brumback, B. (2000) Marginal structural models and causal inference in epidemiology. Epidemiology, 11, 550–560.
- Rubin, D.B. (1980) Bias reduction using Mahalanobis‐metric matching. Biometrics, 36, 293–298.
- Yang, S., Tsiatis, A.A. and Blazing, M. (2018) Modeling survival distribution as a function of time to treatment discontinuation: A dynamic treatment regime approach. Biometrics, 74, 900–909.