. 2021 Apr 29;40(27):5983–6007. doi: 10.1002/sim.9001

Assessing vaccine durability in randomized trials following placebo crossover

Jonathan Fintzi 1, Dean Follmann 1
PMCID: PMC8242890  PMID: 33928660

Abstract

Randomized vaccine trials are used to assess vaccine efficacy (VE) and to characterize the durability of vaccine‐induced protection. If efficacy is demonstrated, the treatment of placebo volunteers becomes an issue. For COVID‐19 vaccine trials, there is broad consensus that placebo volunteers should be offered a vaccine once efficacy has been established. This will likely lead to most placebo volunteers crossing over to the vaccine arm, thus complicating the assessment of long term durability. We show how to analyze durability following placebo crossover and demonstrate that the VE profile that would be observed in a placebo controlled trial is recoverable in a trial with placebo crossover. This result holds no matter when the crossover occurs and with no assumptions about the form of the efficacy profile. We only require that the VE profile applies to the newly vaccinated irrespective of the timing of vaccination. We develop different methods to estimate efficacy within the context of a proportional hazards regression model and explore via simulation the implications of placebo crossover for estimation of VE under different efficacy dynamics and study designs. We apply our methods to simulated COVID‐19 vaccine trials with durable and waning VE and a total follow‐up of 2 years.

Keywords: COVID‐19, proportional hazards regression, vaccine efficacy, vaccine trial design

1. INTRODUCTION

Randomized phase III clinical trials are used to definitively demonstrate the efficacy of candidate vaccines. Volunteers are randomized to receive vaccine or a placebo and followed for a period of time to assess whether the vaccine reduces the rate of disease acquisition. An important question in vaccine development is whether vaccine‐induced protection is durable. For COVID‐19 vaccines, questions surrounding vaccine durability are important as acquired immunity against seasonal and other coronaviruses ranges from 6 months to 2 years. 1 , 2 Clinical trials for vaccines against COVID‐19 plan to follow participants for up to 2 years. 3

To assess long term safety and durability, long term blinded follow‐up of the original placebo and vaccine arms is ideal. 4 From an ethical perspective, placebo volunteers should be offered a vaccine once efficacy is established. 5 However, vaccination of placebo volunteers may occur before it is known whether vaccine‐induced protection is durable. Besides waning of efficacy, there is concern that the vaccine might eventually cause harm, that is, negative vaccine efficacy (VE), in subgroups. Such harm is known as vaccine associated enhanced disease and has been observed in other contexts, such as the Dengvaxia vaccine in seronegative individuals. 6

It might seem that the ability to assess vaccine durability following placebo crossover is completely lost once there is no longer an unvaccinated control group. 4 However, at the point of crossover the study remains a randomized trial, albeit of immediate vs deferred vaccination. This contrast allows the VE profile for a standard noncrossover trial to be recovered with placebo crossover. 7 The only additional assumption that is required is that the same VE profile applies to the newly vaccinated irrespective of the timing of vaccination, for example, June or December.

Crossover trials for absorbing endpoints, such as infection or death, have been discussed in the literature. 8 , 9 However, these methods apply to estimation of an intervention effect that stops once the intervention is removed. Vaccination is different as the benefit lingers and our goal is to see how the intervention effect varies with time. Crossover has been discussed for vaccine trials, but only for the placebo arm and only to measure immune response. 10 Delayed vaccination has been used in an Ebola vaccine trial, but to serve as a control group prior to deferred vaccination. 11

We establish that vaccine durability can be accurately assessed following placebo crossover under fairly mild assumptions. We demonstrate how to estimate VE as a function of time since vaccination under placebo crossover using proportional hazards (PH) regression 12 where VE, defined as 1 minus the hazard ratio (HR), is allowed to depend on time through use of time‐varying covariates. 13 We specify log‐linear and P‐spline functions to allow for a variety of shapes for the VE profile and provide a justification for using calendar time as the natural timescale in vaccine trials where risk can vary substantially with calendar time. We discuss and evaluate different approaches for crossing over placebo participants, and examine how estimation is affected by the timing and pace of crossover and by unobservable heterogeneity in risk. We evaluate our methods by simulation and analyze two simulated COVID‐19 vaccine trials that vaccinate placebo volunteers after efficacy is established.

2. VE UNDER PLACEBO CROSSOVER

2.1. Conceptual development

Consider a vaccine trial where volunteers are randomized to receive vaccine or placebo. For now, assume that everyone is enrolled at the same time, so calendar time and time since randomization are aligned. All participants are followed over the period [0,2τ], and a blinded crossover occurs at time τ, at which point the volunteers randomized to vaccine receive placebo, and the volunteers randomized to placebo receive vaccine. Following crossover, both arms are vaccinated and thus comparative efficacy might seem lost as there is no control group. However, a randomized trial remains, though now as a trial of immediate vs deferred vaccination; these assignments correspond to the original vaccine and placebo arms. This “rebranded” randomized trial can still provide information about vaccine durability, even after the point of crossover.

To illustrate, suppose we have case counts for the two randomization arms over the two periods, $(0,\tau]$ and $(\tau,2\tau]$. Suppose that the vaccine:placebo case split is 20:100 in period one, and in period two we observe a case split of 20:12 in the original vaccine arm:deferred vaccination arm. Using a person‐time analysis, and imagining the denominators are so large that they cancel out, we obtain a simple estimate of the period one VE as $\widehat{VE}_1 = 1 - 20/100 = 0.80$. Assume now that this VE applies to the newly vaccinated participants in the second period with 12 cases. With this assumption we can estimate $N_2^{plac}$, the number of cases for a counterfactual placebo group in period two, as we are assuming $0.80 = 1 - 12/\widehat{N}_2^{plac}$, which yields $\widehat{N}_2^{plac} = 12/0.2 = 60$. We then contrast the counterfactual placebo case count of 60 to the 20 observed cases in the original vaccine arm to obtain an estimate of placebo controlled VE in period two, $\widehat{VE}_2 = 1 - 20/60 = 0.667$. Based on these crude estimates, we conclude that VE has waned as efficacy has dropped from 80% in period one to 66.7% in period two.
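The period-based arithmetic above can be sketched in a few lines of Python. This is only an illustration of the worked example; the function and variable names are assumptions introduced here, and equal person-time in both arms is assumed so that denominators cancel.

```python
def period_ve(cases_vaccinated: float, cases_control: float) -> float:
    """Crude VE = 1 - (cases in vaccinated group / cases in control group).

    Assumes equal person-time in the two groups so denominators cancel.
    """
    return 1.0 - cases_vaccinated / cases_control


# Period one: 20 cases on vaccine versus 100 cases on placebo.
ve1 = period_ve(20, 100)

# Period two: the deferred arm (newly vaccinated) has 12 cases. If the
# period-one VE is portable to the newly vaccinated, the counterfactual
# placebo count solves ve1 = 1 - 12 / n2_placebo.
n2_placebo = 12 / (1.0 - ve1)

# Contrast the 20 cases in the original vaccine arm with that count.
ve2 = period_ve(20, n2_placebo)

print(f"VE period 1: {ve1:.3f}")
print(f"Counterfactual placebo cases in period 2: {n2_placebo:.1f}")
print(f"VE period 2: {ve2:.3f}")
```

The only modeling step is the second one: the observed deferred-arm count is inflated by the period-one hazard ratio to stand in for the missing placebo group.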

The crux of this example is that the VE among the newly vaccinated is portable across periods. That is, the original placebo arm receives the same immediate benefit from vaccination that the original vaccine group received, regardless of changes in the population attack rate. Additional considerations for a period focused approach are discussed in Follmann et al. 7 While a period focused approach is simple and clear, a more natural development allows VE to vary smoothly with time. The Cox PH model allows this to be easily accomplished. Under the PH model, the baseline placebo hazard function for the time to disease is arbitrary and the effect of vaccination induces a hazard proportional to the baseline hazard. We can formulate the hazard function corresponding to the previous example for a standard trial with no crossover as

$$h(t) = h_0(t)\exp\{Z\,1\{t<\tau\}\theta_1 + Z\,1\{t\ge\tau\}\theta_2\}, \tag{1}$$

where $t$ is time since randomization, $h_0(t)$ is the placebo hazard function, and $Z$ is the vaccine assignment indicator. VE is defined as the relative change in the instantaneous risk of acquiring disease, and is given by $1-\exp(\theta_1)$ for period one and $1-\exp(\theta_2)$ for period two.

Suppose placebo volunteers are vaccinated at the end of period one. Assuming the vaccine effect for the newly vaccinated applies in period two, we can write

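$$\begin{aligned}
h(t) &= h_0(t)\exp\{Z\,1\{t<\tau\}\theta_1 + Z\,1\{t\ge\tau\}\theta_2 + (1-Z)\,1\{t\ge\tau\}\theta_1\} \\
&= \lambda_0(t)\exp\{Z\,1\{t<\tau\}\theta_1 + Z\,1\{t\ge\tau\}(\theta_2-\theta_1)\},
\end{aligned}$$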

where the baseline hazard $\lambda_0(t)=h_0(t)\exp\{1\{t\ge\tau\}\theta_1\}$ applies to the original placebo arm for both periods. The first line parameterizes the placebo controlled VE for the newly vaccinated in period two as $1-\exp(\theta_1)$ and the VE for the original vaccinees in period two as $1-\exp(\theta_2)$. Analogous to the simple example where we recovered the placebo case count $N_2^{plac}$ after time $\tau$, here we can recover the counterfactual placebo hazard function $h_0(t)$ for $t\ge\tau$ as $h_0(t)=\lambda_0(t)\exp(-\theta_1)$.

Equation (1) is deceptively simple. Generalizations of this idea allow us to recover an arbitrary placebo controlled VE curve long after the placebo group has been completely vaccinated and with no assumptions about the baseline hazard function for an actual or counterfactual placebo arm. To illustrate, suppose that the placebo controlled hazard is given by a log‐linear function of time.

h(t)=h0(t)exp{Z(θ1+θ2t)}. (2)

If crossover to vaccine occurs at time τ in the placebo arm, the resultant hazard is

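$$\begin{aligned}
h(t) &= h_0(t)\exp\{Z(\theta_1+\theta_2 t) + (1-Z)\,1\{t\ge\tau\}(\theta_1+\theta_2(t-\tau))\} \\
&= \lambda_0(t)\exp\{Z\,1\{t<\tau\}\theta_1 + Z\theta_2 t + (1-Z)\,1\{t\ge\tau\}\theta_2(t-\tau)\},
\end{aligned} \tag{3}$$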

where $\lambda_0(t)=h_0(t)\exp(1\{t\ge\tau\}\theta_1)$. A visualization of this hazard function is given in Figure 1.

FIGURE 1.

Log hazard for two study participants: $i$, who is initially given placebo (orange line), and $j$, who is initially given vaccine (green line). Vaccine efficacy wanes (ie, log hazard increases) as a function of time since vaccination. At time $\tau$, participant $i$ is given vaccine and follows the same efficacy profile as participant $j$. The “baseline” hazard function for original placebo participant $i$ is $\lambda_0(t)=h_0(t)$ prior to crossover and $\lambda_0(t)=h_0(t)\exp\{\theta_1+\theta_2(t-\tau)\}$ after crossover. In this figure, $h_0(t)$ is constant, but this is not required [Colour figure can be viewed at wileyonlinelibrary.com]

To better understand what happens in terms of estimation, suppose that an event occurs in the original placebo arm at time t=s<τ, which is postrandomization, but before crossover. This scenario is shown in Figure 1. The partial likelihood contribution for this event under model (3) reduces to

$$\frac{1}{\sum_{i\in R(s)}\exp\{Z_i(\theta_1+\theta_2 s)\}},$$

where R(s) is the set of indices for volunteers who remain event free at time s. Thus, events prior to τ allow estimation of θ2 and, crucially, θ1. Next suppose an event occurs at time t=τ+s postrandomization to an individual in the original placebo arm who was vaccinated at time τ and who now has been vaccinated for s days (Figure 1). The partial likelihood contribution for this event is

$$\frac{\exp(\theta_2 s)}{\sum_{i\in R(\tau+s)}\exp\{Z_i(\tau+s)\theta_2 + (1-Z_i)s\theta_2\}}.$$

We see that θ1 is gone as the baseline hazard, λ0(τ+s)=h0(τ+s)exp(θ1), cancels out of the numerator and denominator. Thus, the precrossover period completely determines the reliability of the estimate of θ1. As a result, longer precrossover periods with more events are desirable to better estimate θ1. In addition, as τ approaches zero, the covariate value for the original vaccine arm (τ+s) is very close to the covariate value for the original placebo arm s. Insufficient variation in covariate values makes estimation of the associated regression slope more difficult. Thus, a longer precrossover period benefits estimation of θ2 in addition to helping to estimate θ1. We note that there is nothing really special about this kind of cancellation. Consider a Cox regression with two covariates: a treatment indicator and a sex indicator. Even if all the women eventually drop out or have events, we continue to accrue information about the effect of treatment, provided we still have men at risk.
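The cancellation of θ1 after crossover can be checked numerically. The sketch below evaluates the two log partial-likelihood contributions described above for a small hypothetical risk set under the log-linear model; the function names, the risk set, and the parameter values are illustrative assumptions, not the authors' code.

```python
import math


def pre_crossover_term(theta1, theta2, s, z_at_risk):
    """Log contribution of a placebo-arm event at time s < tau.

    z_at_risk lists the randomization indicators (1 = vaccine) of everyone
    still event free at s. The case is unvaccinated, so the numerator is 1.
    """
    denom = sum(math.exp(z * (theta1 + theta2 * s)) for z in z_at_risk)
    return -math.log(denom)


def post_crossover_term(theta1, theta2, tau, s, z_at_risk):
    """Log contribution of an original-placebo event at time tau + s.

    Everyone is vaccinated, so every relative hazard carries exp(theta1);
    it is kept explicit here so the cancellation is visible numerically.
    """
    def log_rel_hazard(z):
        time_since_vaccination = tau + s if z == 1 else s
        return theta1 + theta2 * time_since_vaccination

    log_denom = math.log(sum(math.exp(log_rel_hazard(z)) for z in z_at_risk))
    return log_rel_hazard(0) - log_denom


risk_set = [1, 1, 1, 0, 0]  # hypothetical: three vaccine-arm, two placebo-arm
theta2, tau, s = 0.5, 1.0, 0.3

pre_a = pre_crossover_term(-1.6, theta2, s, risk_set)
pre_b = pre_crossover_term(-0.5, theta2, s, risk_set)
post_a = post_crossover_term(-1.6, theta2, tau, s, risk_set)
post_b = post_crossover_term(-0.5, theta2, tau, s, risk_set)

print("pre-crossover terms differ:", pre_a, pre_b)
print("post-crossover terms agree:", post_a, post_b)
```

Changing θ1 moves the pre-crossover term but leaves the post-crossover term unchanged up to floating-point error, which is why only pre-crossover events inform θ1.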

2.2. General development

We now develop this approach for the more realistic setting of a staggered entry trial and consider more general models for VE over time. Let $t \ge 0$ now index time since study initiation, not time since randomization. For each subject $i \in \{1,\dots,N\}$, the data, $(\tau_i^{(e)}, \tau_i^{(v)}, T_i, C_i, Z_i, X_i)$, consist of the times of study entry and vaccination, $\tau_i^{(e)}$ and $\tau_i^{(v)}$, with $\tau_i^{(v)} \ge \tau_i^{(e)} > 0$; the time to symptomatic COVID‐19 or end‐of‐follow‐up, $T_i=\min(Y_i,C_i)$, with $Y_i$ the true but possibly unobserved event time; treatment assignment $Z_i$; and baseline covariates $X_i$. By convention, we take $\tau_i^{(v)}$ to be greater than the study duration if disease occurs prior to vaccination. We also define a time‐dependent vaccination indicator, $Z_i(t)=1\{t>\tau_i^{(v)}\}$.

The hazard for participant i is

$$h_i(t)=\begin{cases} 0, & t\le\tau_i^{(e)}, \\ h_0(t)\exp\{Z_i(t)\,f(t-\tau_i^{(v)};\theta)+X\beta\}, & t>\tau_i^{(e)}, \end{cases} \tag{4}$$

where h 0(t) is the arbitrary “reference” hazard for a placebo group, θ a vector of parameters governing VE over time, X a vector of baseline covariates and β a vector of parameters. The hazard is zero before an individual enters the trial by convention as the hazard function applies to volunteers who are uninfected when randomized. We calculate VE at time s postvaccination as one minus the ratio of vaccine to placebo hazards, that is,

$$VE(s) = 1-\exp[f(s;\theta)].$$

The model, (4), encompasses standard trials with parallel arms, in which case $\tau_i^{(v)}=\infty$ for placebo volunteers, and placebo‐crossover trials, in which case $\tau_i^{(v)}>\tau_i^{(e)}$ for participants on the placebo arm. Following crossover of all placebo subjects, $Z_i(t)=1$ for all study participants, hence the HR for any pair of subjects with the same $X$ depends only on the contrast in their times since vaccination:

$$\frac{h_i(t)}{h_j(t)}=\frac{\exp\{f(t-\tau_i^{(v)};\theta)\}}{\exp\{f(t-\tau_j^{(v)};\theta)\}}.$$

This ratio is 1 for a constant VE model and so following crossover, there is no additional information about a constant VE, just as there is no additional information about θ1 in the log‐linear decay model. This differs from the standard parallel arms trial where such information accrues throughout follow‐up.
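A small sketch may help make the post-crossover hazard ratio concrete. It evaluates VE(s) and the ratio for a pair of vaccinated participants under a constant and a log-linear choice of f; the helper names and the numeric values (vaccination times, slope of 0.5 per year) are illustrative assumptions.

```python
import math


def ve(s, f):
    """Vaccine efficacy s years after vaccination: VE(s) = 1 - exp(f(s))."""
    return 1.0 - math.exp(f(s))


def post_crossover_hr(t, tau_v_i, tau_v_j, f):
    """h_i(t) / h_j(t) for two vaccinated participants with the same X.

    The baseline hazard h_0(t) cancels, leaving only the contrast in
    f evaluated at each participant's time since vaccination.
    """
    return math.exp(f(t - tau_v_i) - f(t - tau_v_j))


def f_constant(s):
    return math.log(0.25)  # constant VE of 75%


def f_loglinear(s):
    return math.log(0.15) + 0.5 * s  # hypothetical waning slope


# Participant i vaccinated at calendar time 0, participant j at time 1.
hr_constant = post_crossover_hr(1.5, 0.0, 1.0, f_constant)
hr_waning = post_crossover_hr(1.5, 0.0, 1.0, f_loglinear)

print("VE at vaccination, constant model:", ve(0.0, f_constant))
print("post-crossover HR, constant VE:", hr_constant)
print("post-crossover HR, log-linear VE:", hr_waning)
```

Under constant VE the ratio is 1 regardless of vaccination times, so post-crossover events carry no information about the level; under log-linear waning the ratio equals exp(θ2 × difference in vaccination times), so only the slope remains identifiable.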

While it is standard to have the time index for the Cox model be time since randomization, calendar time is a more natural index for the Cox model as specified in (4) for trials where risk can wax and wane over time. In our setting, aligning the data on study entry induces nonproportional hazards and distorts the risk set in the Cox partial likelihood at each event time. These phenomena are diagrammed in Figure 2.

FIGURE 2.

Participant histories and baseline hazards when the data are indexed in calendar time or aligned on times of study entry. The true data generating mechanism is indexed in calendar time. A vs B: Aligning the data on study entry changes the risk set as k falls out of the risk set at i's event time and j is incorrectly introduced into the risk set. C vs D: Baseline hazards are no longer proportional after the data are aligned on study entry [Colour figure can be viewed at wileyonlinelibrary.com]

Suppose participant $i$ acquires disease at calendar time $t_i$ after being on study for a period $s_i=t_i-\tau_i^{(e)}$ (Panels A and B) and we use model (4) with calendar time index (Panel C). Setting aside baseline covariates, the partial likelihood contribution at calendar time $t_i$ is

$$\frac{h_0(t_i)\exp\{Z_i(t_i)f(t_i-\tau_i^{(v)};\theta)\}}{\sum_{j\in R(t_i)}h_0(t_i)\exp\{Z_j(t_i)f(t_i-\tau_j^{(v)};\theta)\}}$$

and the baseline hazards cancel out as they should.

Now, suppose we align participants on their times of study entry. The event of person $i$ at calendar time $t_i$ is at study time $s_i$ and the associated study time risk set is $\tilde{R}(s_i)$. The calendar time for participant $i$ is $t_i=s_i+\tau_i^{(e)}$, whereas for a generic participant $j$ it is a different calendar time $t_j=s_i+\tau_j^{(e)}$. Since the hazard truly depends on calendar time, the partial likelihood contribution at study time $s_i$ is

$$\frac{h_0(\tau_i^{(e)}+s_i)\exp\{Z_i(\tau_i^{(e)}+s_i)f(s_i(s_i);\theta)\}}{\sum_{j\in\tilde{R}(s_i)}h_0(\tau_j^{(e)}+s_i)\exp\{Z_j(\tau_j^{(e)}+s_i)f(s_j(s_i);\theta)\}},$$

where $s_j(s_i)$ is the time since vaccination at study time $s_i$ for person $j$. With alignment on time since study entry, the baseline hazards do not cancel out and a partial likelihood contribution which assumes they do will be misspecified.
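The distortion of the risk set can be illustrated with three hypothetical participants, loosely following the labels i, j, and k used in the Figure 2 caption. The entry and exit times below are invented for illustration and are not taken from the figure.

```python
from typing import NamedTuple


class Participant(NamedTuple):
    label: str
    entry: float      # calendar time of study entry
    exit_time: float  # calendar time of event or censoring


def calendar_risk_set(people, t):
    """Labels of participants enrolled and still event free at calendar time t."""
    return {p.label for p in people if p.entry < t <= p.exit_time}


def study_time_risk_set(people, s):
    """Labels of participants with at least s time units of follow-up."""
    return {p.label for p in people if (p.exit_time - p.entry) >= s}


cohort = [
    Participant("i", entry=0.0, exit_time=1.0),  # event at calendar time 1.0
    Participant("j", entry=1.2, exit_time=2.5),  # enrolls after i's event
    Participant("k", entry=0.5, exit_time=1.2),  # at risk when i's event occurs
]

t_i = 1.0                    # calendar time of i's event
s_i = t_i - cohort[0].entry  # the same event on the study-time scale

in_calendar = calendar_risk_set(cohort, t_i)
in_study_time = study_time_risk_set(cohort, s_i)
print("calendar-time risk set:", sorted(in_calendar))
print("study-time risk set:   ", sorted(in_study_time))
```

In calendar time, k is correctly compared with i because both are exposed to the same baseline hazard at that moment. After alignment on entry, k drops out and j, who had not yet enrolled, is brought in with a baseline hazard from a different calendar date.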

2.3. Flexible models for VE(s)

The log‐linear and piecewise‐constant forms of VE(s) discussed above are simple and useful to understand behavior of the model and estimation. However, the form of waning vaccine efficacy can be hard to anticipate for new vaccines and high constant efficacy followed by a quick or smooth decay is possible as are other shapes. It is thus appealing to model f(·) semiparametrically, for example, using penalized cubic P‐splines. 14 , 15 , 16 Let PL(t;k,δ) denote a P‐spline basis of degree δ=3 with L basis terms and vector of knot locations k, and let γ be a vector of coefficients, with γ0 reserved for the log‐HR immediately following vaccination. The hazard for participant i is

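$$h_i(t)=\begin{cases} 0, & t\le\tau_i^{(e)}, \\ h_0(t)\exp\left[Z_i(t)\left\{\gamma_0+\sum_{\ell=1}^{L}\gamma_\ell P_\ell(t-\tau_i^{(v)};k,\delta)\right\}\right], & t>\tau_i^{(e)}. \end{cases} \tag{5}$$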

In practice, we center the decay component estimated by the P‐spline at zero to ensure identifiability of γ0. Note that we need to evaluate the hazard for each participant at every event time, not merely at the time when a person experiences their own event. 17

Splines can be implemented in the SAS procedure PROC PHREG and in R using the survival package. 18 The latter provides users with a convenient summary method for the linear and nonlinear spline effects, which is useful for testing for nonlinearity in the decay profile.
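Because time since vaccination changes continuously, fitting these models requires each participant's covariate value at every event time at which they are at risk. One common way to supply this is to split follow-up into counting-process rows. The sketch below is a minimal, hypothetical illustration in plain Python; the function name, row layout, and example times are assumptions, and it is not the interface of PROC PHREG or the survival package.

```python
def split_at_event_times(entry, vaccinated, exit_time, had_event, event_times):
    """Split one participant's calendar-time follow-up at the cohort's event times.

    Returns rows (start, stop, event, z, time_since_vaccination). Covariates
    are evaluated at `stop`, the event time at which the row enters a risk
    set. Use vaccinated=float("inf") for someone never vaccinated.
    """
    cuts = sorted({t for t in event_times if entry < t < exit_time} | {exit_time})
    rows, start = [], entry
    for stop in cuts:
        z = 1 if stop > vaccinated else 0           # Z_i(t) = 1{t > tau_v}
        since_vax = stop - vaccinated if z else 0.0
        event = int(had_event and stop == exit_time)
        rows.append((start, stop, event, z, since_vax))
        start = stop
    return rows


# Hypothetical placebo participant: enters at 0, crosses over at 1.0,
# and has an event at 1.5. Other cohort events occur at 0.4 and 1.2.
rows = split_at_event_times(
    entry=0.0, vaccinated=1.0, exit_time=1.5,
    had_event=True, event_times=[0.4, 1.2, 1.5],
)
for row in rows:
    print(row)
```

Each row can then carry spline basis values computed from its time since vaccination, so the partial likelihood sees the correct covariate for every member of every risk set.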

2.4. Crossing over

Our development up to now has implicitly started counting cases immediately after the first dose of vaccine. In practice vaccine trials often use a per‐protocol primary analysis that forgoes counting disease cases until after the immunization schedule is complete, for example, 7 days following the last dose. Such an analysis better evaluates the full benefit of immunization. For such analyses, we need to symmetrically avoid counting cases in both arms during the second immunization period even if it is counterfactual, that is, if volunteers randomized to vaccine are unblinded and not immunized. To achieve this symmetry, a “blackout” period of length $\Delta$ can be defined by the hazard function $h(t)\{1-I(t\in[\tau^{(x)},\tau^{(x)}+\Delta])\}$, where $\tau^{(x)}$ is the start of the crossover or unblinding for an individual, and $\Delta$ the time from first dose to when cases are counted. The consequence is to define discontinuous pre‐ and post‐crossover risk intervals for volunteers who complete crossover without having an event. Volunteers who have an event before or during crossover have a single risk interval which ends in an event or censoring, respectively.
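The risk intervals implied by a blackout period can be sketched as follows. This is a hypothetical helper under simplifying assumptions: times are in days, `entry` is taken to be the time at which case counting starts, and the caller decides whether the final interval ends in an event or in censoring.

```python
def risk_intervals(entry, crossover_start, delta, exit_time):
    """Return the (start, stop] intervals during which cases are counted.

    The blackout window [crossover_start, crossover_start + delta] is
    excluded. A participant whose follow-up ends before or during the
    blackout contributes a single pre-crossover interval.
    """
    if exit_time <= crossover_start:
        return [(entry, exit_time)]        # ends before crossover
    intervals = [(entry, crossover_start)]  # censored at start of blackout
    blackout_end = crossover_start + delta
    if exit_time > blackout_end:
        intervals.append((blackout_end, exit_time))
    return intervals


# Hypothetical timings: crossover starts on day 365 with a 35-day blackout.
completed = risk_intervals(entry=0, crossover_start=365, delta=35, exit_time=730)
during = risk_intervals(entry=0, crossover_start=365, delta=35, exit_time=380)
before = risk_intervals(entry=0, crossover_start=365, delta=35, exit_time=200)

print(completed)
print(during)
print(before)
```

A participant who completes crossover contributes two disjoint intervals, which is the discontinuous risk history described above; the other two cases contribute one interval each.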

Placebo crossover might happen in a blinded or unblinded (open label) manner. Blinded crossover is preferred to avoid potentially differential risk behavior as the recently unblinded volunteers originally randomized to vaccine, who now know they are protected, might forgo risk avoidance behavior. 7 With open label crossover such differential behavior could cause a spurious waning efficacy in the period immediately following unblinding. One complex approach to address potential bias from unblinded crossover would be to use covariate adjustment and stratification. Let X denote a vector of covariates measured at baseline, or pre‐ or post‐unblinding that predict risk behavior. While clinical trials typically avoid use of postbaseline variables for adjustments, in the open label setting such adjustment may ameliorate bias. Once an individual is unblinded, a new hazard function applies. We illustrate using log‐linear decay:

$$h(t)=\lambda_0(t)\exp\{\theta_2(t-\tau^{(v)})+X\beta\}, \tag{6}$$

where λ0(t) is the new baseline hazard for the original placebo arm in this new open label milieu. Because crossover of all subjects cannot occur at the same time, there will be a crossover interlude during which the placebo volunteers become vaccinated. Thus, at calendar time t during the crossover interlude, the expanding unblinded cohort would use the hazard given by (6) while the dwindling blinded cohort would use the hazard h(t) given by (2). Following the crossover interlude, (6) would apply to all and, at some point, the term exp(Xβ) might not be needed if the volunteers in the two arms behave similarly. This construction is a form of time‐dependent stratification.

A simpler way to address open label crossover bias is to define a blackout period of length $\Delta$ such that behavior is presumed to be similar after the end of the period. Similarity should happen at some point as all trial volunteers will know they are vaccinated and protected. As above, time‐dependent stratification would make sense with different baseline hazards before unblinding and after the crossover blackout period. To be specific, for the log‐linear decay model we would have $h(t)=\lambda_0(t)\exp\{Z(t)[\theta_1+\theta_2(t-\tau^{(v)})]\}$ prior to unblinding and $h(t)=\lambda_0(t)\exp\{\theta_2(t-\tau^{(v)})\}$ at time $\Delta$ postunblinding.

If the VE dropped substantially during a blackout period, later estimates of VE might be compromised. As an extreme example, suppose all volunteers enroll at the same time, and all are blacked out during $[\tau,\tau+\Delta]$, which is exactly when VE drops. Then the estimated VE curve pre‐ and post‐crossover would incorrectly appear constant. In practice this scenario can be avoided with a staggered entry trial by exploiting the induced variation in time since vaccination at any calendar time. To illustrate what not to do, suppose that enrollment took 2 months, crossover took 2 months and the crossover order was in exact sync with the enrollment order. Then all would be crossed over at some time $\tau$ since randomization and all blacked out for the period $[\tau,\tau+\Delta]$. To minimize the problem, crossover could occur in reverse order with the first enrollees being crossed over last. Logistical considerations and placebo volunteers' sense of fairness could also come into play.

3. ASSUMPTIONS

To recover a VE curve under a standard trial with no crossover requires that the volunteers in each arm remain similar over time and that the external environment remain similar over time.

  • Volunteers in each arm remain similar over time. This can be violated if there is differential dropout in the two arms and dropout is related to underlying risk of disease. Relatedly, unobserved heterogeneity in risk can result in differential culling by infection of the vaccine and placebo groups. Thus, after a while, the remaining placebo arm volunteers tend to be a less risky group than the remaining vaccine arm volunteers and VE can appear to decrease, see References 19, 20, 21. For COVID‐19 trials with 30 000 or more enrolled and perhaps 200 to 1000 cases over follow‐up, such bias may be small. Of course one can explicitly model the heterogeneity. 22 Such methods are beyond the scope of this article. In practice, covariate adjustment for baseline risk factors can be used to mitigate violations of this assumption.

  • Study environment similar. The PHs model allows for the attack rate to change with time. But if the pathogen mutates to a form that is resistant to vaccine effects, efficacy may appear to wane. Another possibility is if human behavior changes in such a way that the vaccine is less effective. For example, if there is less mask wearing in the community over the study, the viral inoculum at infection may increase over the study and overwhelm the immune response for later cases. Vaccines may work less well against larger inoculums and thus VE might appear to wane. For viral mutation, analyses could be run separately for different major strains provided they occur both prior and postcrossover. More elaborate methods could also be developed to address viral mutation, but are beyond the scope of this article.

The only additional assumption that is required to recover the VE profile under placebo crossover is that the effect of vaccination be the same no matter when the vaccine is given. Interestingly, this is a common assumption; for vaccine trials with staggered entry it is implicitly assumed that the VE for early enrollees is the same as for the late enrollees. Importantly, both standard and crossover designs allow for a varying attack rate through the arbitrary hazard h 0(t) whether due to seasonality, vaccination coverage, or other reasons.

4. SIMULATED COVID‐19 TRIALS

In this section, we explore how placebo crossover, the dynamics of durability, and the baseline hazard affect our estimates of VE and durability. Since several COVID‐19 vaccine trials are powered to accrue 150 cases and follow all volunteers for 2 years, we evaluate three different designs: (i) crossover at 150 cases, (ii) crossover at 1 year, and (iii) a standard parallel arm trial. 3 We consider two settings for vaccine dynamics: constant VE of 75%, and VE waning linearly on the log‐hazard scale from 85% to 35% over 1.5 years. In the crossover scenarios, placebo arm volunteers cross over during a four‐week interlude. For each of the six settings we simulated 10 000 trials. Each trial enrolled 3000 participants in a 1:1 randomization with linear accrual of participants over an initial 3 month period and followed participants for 2 years postenrollment. While COVID‐19 trials are larger, we evaluated 3000 participants to lessen our computational burden. The baseline hazard was piecewise‐constant and calibrated to yield an average of 50, 75, 50, and 25 cases per 3 month period in the placebo arm in the first year, and either the same or half the year one case rates in the second study year. The data were analyzed in each simulation using the log‐linear decay model, (2), and the P‐spline model, (5).
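The waning scenario can be translated into the parameters of the log-linear model (2) by solving for the intercept and slope that give 85% efficacy at vaccination and 35% at 1.5 years. The sketch below shows that calculation; the variable names are assumptions, and the profile is evaluated only up to 1.5 years because the source does not say how the simulated efficacy behaves beyond that point.

```python
import math

VE_START, VE_END, YEARS = 0.85, 0.35, 1.5

# Log hazard ratio at vaccination and its change per year since vaccination.
theta1 = math.log(1.0 - VE_START)
theta2 = (math.log(1.0 - VE_END) - math.log(1.0 - VE_START)) / YEARS


def ve_waning(s):
    """VE s years after vaccination under log-linear waning."""
    return 1.0 - math.exp(theta1 + theta2 * s)


profile = {s: ve_waning(s) for s in (0.0, 0.5, 1.0, 1.5)}
for s, value in profile.items():
    print(f"VE at {s:.1f} years: {value:.3f}")
```

Efficacy that is linear on the log-hazard scale is not linear on the VE scale: the drop between 1.0 and 1.5 years is larger than the drop between 0 and 0.5 years.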

The simulations demonstrate that we can accurately estimate VE(s) and the change in VE in all simulation settings using both the log‐linear and P‐spline model (Tables 1 and B1). Coverage probabilities of 95% confidence intervals were near their nominal levels or somewhat conservative. The P‐spline model performs similarly to the log‐linear model except for the estimates at year two where the variance becomes notably larger. Initiating crossover at 1 year resulted in an average accrual of 44% more cases prior to crossover compared with trials that initiated crossover at 150 cases. We found that this improved the precision of our estimates for all quantities of interest. One way to quantify the relative performance of placebo crossover and parallel arm trials is by the ratio of empirical variances. We focus on the empirical variances of estimates of the linear predictor in the log‐linear model in the constant VE(s) setting in Table 1. The cross at 150:cross at 1 year variance ratios for $\widehat{VE}(s)$ are 0.051/0.029 = 1.8, 2.4, 2.5, and 2.4 at 0.5, 1, 1.5, and 2 years. This underscores the potential benefit of additional case accrual during the precrossover period leading into the second year when the baseline hazard was halved. We next compare the crossover at 1 year design to a standard parallel trial using the log‐linear model. This comparison is more of a benchmark as a standard trial may not be ethically possible following vaccine approval. The analogous empirical variance ratios for cross at 1 year compared with a standard trial are 0.029/0.022 = 1.3, 2.5, 2.3, and 2.0, respectively. Results are broadly similar for the P‐spline model and for waning VE.
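The variance ratios quoted above can be reproduced directly from the log-linear rows of Table 1 (constant VE setting, first empirical variance column). The short script below is only a convenience for checking the arithmetic; the list names are assumptions.

```python
# Empirical variances of the linear predictor at 0.5, 1.0, 1.5, and 2.0 years,
# taken from the log-linear rows of Table 1 (constant VE of 75%).
times = [0.5, 1.0, 1.5, 2.0]
cross_at_150 = [0.051, 0.146, 0.338, 0.626]
cross_at_1yr = [0.029, 0.061, 0.137, 0.256]
parallel = [0.022, 0.024, 0.059, 0.125]

ratio_150_vs_1yr = [round(a / b, 1) for a, b in zip(cross_at_150, cross_at_1yr)]
ratio_1yr_vs_parallel = [round(a / b, 1) for a, b in zip(cross_at_1yr, parallel)]

for t, r1, r2 in zip(times, ratio_150_vs_1yr, ratio_1yr_vs_parallel):
    print(f"{t:.1f} years: cross150/cross1yr = {r1}, cross1yr/parallel = {r2}")
```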

TABLE 1.

Bias, empirical variance, and coverage for the linear predictor in Cox PH models with time‐varying VE in simulated trials where the baseline hazard in year two was half the baseline hazard in year one

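True vaccine efficacy constant at 75%

| Design | Model | Time | log(1−VE(s)): Bias | log(1−VE(s)): Emp. var. | log(1−VE(s)): Covg. | log(1−VE(s))−log(1−VE(0)): Bias | log(1−VE(s))−log(1−VE(0)): Emp. var. | log(1−VE(s))−log(1−VE(0)): Covg. |
|---|---|---|---|---|---|---|---|---|
| Cross at 150 cases ($\tau_x = 0.6 \pm 0.05$) | log‐linear | 0.5 | −0.013 | 0.051 | 0.948 | 0.002 | 0.048 | 0.952 |
| | | 1.0 | −0.011 | 0.146 | 0.951 | 0.004 | 0.192 | 0.952 |
| | | 1.5 | −0.009 | 0.338 | 0.953 | 0.006 | 0.433 | 0.952 |
| | | 2.0 | −0.007 | 0.626 | 0.953 | 0.008 | 0.769 | 0.952 |
| | P‐spline | 0.5 | −0.016 | 0.067 | 0.967 | 0.006 | 0.142 | 0.971 |
| | | 1.0 | −0.017 | 0.194 | 0.965 | 0.004 | 0.288 | 0.959 |
| | | 1.5 | −0.014 | 0.369 | 0.958 | 0.008 | 0.460 | 0.956 |
| | | 2.0 | −0.015 | 0.951 | 0.974 | 0.006 | 1.043 | 0.972 |
| Cross at 1 year ($N_x = 216 \pm 13$) | log‐linear | 0.5 | −0.009 | 0.029 | 0.949 | 0.001 | 0.022 | 0.953 |
| | | 1.0 | −0.009 | 0.061 | 0.953 | 0.001 | 0.087 | 0.953 |
| | | 1.5 | −0.008 | 0.137 | 0.952 | 0.002 | 0.195 | 0.953 |
| | | 2.0 | −0.007 | 0.256 | 0.953 | 0.002 | 0.347 | 0.953 |
| | P‐spline | 0.5 | −0.013 | 0.041 | 0.982 | 0.005 | 0.146 | 0.978 |
| | | 1.0 | −0.014 | 0.084 | 0.976 | 0.004 | 0.178 | 0.969 |
| | | 1.5 | −0.010 | 0.160 | 0.969 | 0.008 | 0.240 | 0.972 |
| | | 2.0 | −0.024 | 0.671 | 0.987 | −0.006 | 0.786 | 0.983 |
| Parallel trial | log‐linear | 0.5 | −0.010 | 0.022 | 0.948 | −0.001 | 0.016 | 0.952 |
| | | 1.0 | −0.010 | 0.024 | 0.951 | −0.001 | 0.065 | 0.952 |
| | | 1.5 | −0.011 | 0.059 | 0.949 | −0.002 | 0.145 | 0.952 |
| | | 2.0 | −0.011 | 0.125 | 0.949 | −0.002 | 0.258 | 0.952 |
| | P‐spline | 0.5 | −0.012 | 0.039 | 0.981 | 0.014 | 0.173 | 0.979 |
| | | 1.0 | −0.013 | 0.056 | 0.981 | 0.012 | 0.208 | 0.967 |
| | | 1.5 | −0.024 | 0.073 | 0.980 | 0.002 | 0.194 | 0.974 |
| | | 2.0 | −0.071 | 0.447 | 0.981 | −0.045 | 0.560 | 0.981 |

Vaccine efficacy wanes from 85% to 35% over 1.5 years

| Design | Model | Time | log(1−VE(s)): Bias | log(1−VE(s)): Emp. var. | log(1−VE(s)): Covg. | log(1−VE(s))−log(1−VE(0)): Bias | log(1−VE(s))−log(1−VE(0)): Emp. var. | log(1−VE(s))−log(1−VE(0)): Covg. |
|---|---|---|---|---|---|---|---|---|
| Cross at 150 cases ($\tau_x = 0.63 \pm 0.05$) | log‐linear | 0.5 | −0.014 | 0.051 | 0.951 | 0.003 | 0.031 | 0.953 |
| | | 1.0 | −0.010 | 0.108 | 0.951 | 0.007 | 0.123 | 0.953 |
| | | 1.5 | −0.007 | 0.225 | 0.951 | 0.010 | 0.276 | 0.953 |
| | | 2.0 | −0.004 | 0.405 | 0.950 | 0.014 | 0.491 | 0.953 |
| | P‐spline | 0.5 | −0.015 | 0.067 | 0.964 | 0.011 | 0.143 | 0.977 |
| | | 1.0 | −0.008 | 0.162 | 0.965 | 0.018 | 0.277 | 0.964 |
| | | 1.5 | 0.000 | 0.247 | 0.960 | 0.027 | 0.359 | 0.960 |
| | | 2.0 | 0.005 | 0.475 | 0.962 | 0.031 | 0.587 | 0.962 |
| Cross at 1 year ($N_x = 211 \pm 13$) | log‐linear | 0.5 | −0.010 | 0.031 | 0.950 | 0.004 | 0.017 | 0.949 |
| | | 1.0 | −0.006 | 0.053 | 0.952 | 0.008 | 0.066 | 0.949 |
| | | 1.5 | −0.001 | 0.107 | 0.950 | 0.013 | 0.149 | 0.949 |
| | | 2.0 | 0.003 | 0.195 | 0.951 | 0.017 | 0.265 | 0.949 |
| | P‐spline | 0.5 | −0.013 | 0.039 | 0.976 | 0.014 | 0.142 | 0.983 |
| | | 1.0 | −0.005 | 0.072 | 0.973 | 0.022 | 0.182 | 0.970 |
| | | 1.5 | 0.008 | 0.119 | 0.966 | 0.035 | 0.208 | 0.973 |
| | | 2.0 | 0.016 | 0.352 | 0.977 | 0.043 | 0.469 | 0.976 |
| Parallel trial | log‐linear | 0.5 | −0.010 | 0.024 | 0.948 | 0.004 | 0.012 | 0.951 |
| | | 1.0 | −0.006 | 0.016 | 0.946 | 0.008 | 0.048 | 0.951 |
| | | 1.5 | −0.002 | 0.031 | 0.951 | 0.012 | 0.107 | 0.951 |
| | | 2.0 | 0.002 | 0.071 | 0.950 | 0.015 | 0.191 | 0.951 |
| | P‐spline | 0.5 | −0.012 | 0.036 | 0.982 | 0.022 | 0.164 | 0.984 |
| | | 1.0 | −0.007 | 0.038 | 0.983 | 0.027 | 0.213 | 0.970 |
| | | 1.5 | −0.004 | 0.039 | 0.980 | 0.030 | 0.182 | 0.975 |
| | | 2.0 | −0.006 | 0.203 | 0.979 | 0.028 | 0.337 | 0.976 |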

Note: The log‐linear and P‐spline models correspond to (2) and (5), respectively. The average time of crossover (in years), $\tau_x$, and the average number of events at crossover, $N_x$, are reported, along with SDs, with the crossover grouping in the design column. Time is given in years since study initiation.

Abbreviations: PH, proportional hazards; VE, vaccine efficacy.

In Tables 2 and B2 we provide estimates of the intercept and linear trend of the VE profile for the scenarios where the baseline hazard was halved or the same in year two, respectively. All estimates have negligible bias and good coverage. For the constant VE scenario and log‐linear model, the variance ratios for cross at 150 cases and cross at 1 year are 0.051/0.040 = 1.3 and 0.192/0.087 = 2.2 for the intercept and slope, respectively. Thus, there is a big advantage in slope estimation with delayed crossover. When we compare crossover at 1 year to a standard trial, the variance ratios are 0.040/0.052 = 0.8 and 0.087/0.065 = 1.3, respectively. Interestingly, crossover improves the intercept estimate because, during the crossover interlude, the newly vaccinated placebo volunteers contribute additional information about the intercept. The P‐spline and log‐linear model have similar empirical variances for the slope but the log‐linear model has about half the empirical variance of the P‐spline model for the intercept. Finally, for the constant VE scenario we compared the intercept estimates to a constant VE model (top half of Table 2). Under crossover, the empirical variance was modestly improved under this model compared with the log‐linear model. Conclusions are broadly similar for the waning VE scenario.

TABLE 2.

Empirical variance and coverage for estimates of the intercept and linear trend in vaccine efficacy under the log‐linear model, (2), and semiparametric model, (5)

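Vaccine efficacy constant at 75%

| Design | Model | Intercept: Bias | Intercept: Emp. var. | Intercept: Covg. | Linear trend: Bias | Linear trend: Emp. var. | Linear trend: Covg. |
|---|---|---|---|---|---|---|---|
| Cross at 150 cases | Constant VE | −0.012 | 0.039 | 0.952 | | | |
| | log‐linear | −0.015 | 0.051 | 0.951 | 0.004 | 0.192 | 0.952 |
| | P‐spline | −0.022 | 0.086 | 0.973 | 0.005 | 0.188 | 0.956 |
| Cross at 1 year | Constant VE | −0.008 | 0.028 | 0.950 | | | |
| | log‐linear | −0.010 | 0.040 | 0.950 | 0.001 | 0.087 | 0.953 |
| | P‐spline | −0.018 | 0.104 | 0.978 | 0.001 | 0.084 | 0.959 |
| Parallel trial | Constant VE | −0.005 | 0.018 | 0.949 | | | |
| | log‐linear | −0.009 | 0.052 | 0.951 | −0.001 | 0.065 | 0.952 |
| | P‐spline | −0.026 | 0.126 | 0.974 | 0.006 | 0.063 | 0.955 |

Vaccine efficacy wanes from 85% to 35% over 1.5 years

| Design | Model | Intercept: Bias | Intercept: Emp. var. | Intercept: Covg. | Linear trend: Bias | Linear trend: Emp. var. | Linear trend: Covg. |
|---|---|---|---|---|---|---|---|
| Cross at 150 cases | log‐linear | −0.017 | 0.056 | 0.951 | 0.007 | 0.123 | 0.953 |
| | P‐spline | −0.027 | 0.102 | 0.975 | −0.002 | 0.121 | 0.955 |
| Cross at 1 year | log‐linear | −0.014 | 0.043 | 0.952 | 0.008 | 0.066 | 0.949 |
| | P‐spline | −0.027 | 0.116 | 0.981 | −0.003 | 0.065 | 0.951 |
| Parallel trial | log‐linear | −0.014 | 0.057 | 0.950 | 0.008 | 0.048 | 0.951 |
| | P‐spline | −0.034 | 0.145 | 0.978 | 0.003 | 0.047 | 0.952 |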

Note: Here, the time‐varying baseline hazard in year two was half the baseline hazard in year one.

Abbreviation: VE, vaccine efficacy.

4.1. Crossover interlude

A design question is how estimation efficiency varies with the length of the crossover interlude. To explore this question, we ran additional simulations in which we evaluated a standard parallel trial of 2 years, a trial where all placebo participants are crossed over at 1 year, and a trial where the times of vaccination for all volunteers were uniformly distributed over 2 years. The baseline hazard was constant over the 2 year period. Under the constant VE(s) scenario, the empirical variances for the intercept term were 0.051, 0.035, and 0.031, respectively, while the variances for the slope were 0.039, 0.039, and 0.034, respectively (Tables B3 and B4). This suggests a longer crossover interlude is somewhat better for estimation of the intercept and the slope.

4.2. Unobserved heterogeneity in risk

Unobserved heterogeneity in the risk of disease can lead to bias in estimates of VE(s) and complicate the task of separating time‐varying efficacy from increased removal of the riskier individuals from the placebo arm. 23 We simulated placebo crossover and parallel arm trials with 30 000 participants and gamma distributed frailties with mean one, and variance equal to either one or four. Crossover trials initiated vaccination of the placebo arm at 1 year. The baseline hazard was constant and calibrated to yield either 50 or 300 cases per 6 month period on the placebo arm, and VE(s) was either constant or waned linearly on the log hazard scale, as before. The frailty distributions in the original placebo and vaccine arms at the end of follow‐up were more similar in the placebo crossover trials than in the standard parallel trials (Table B5). In the low baseline hazard scenario, where the dominant contribution to a participant's propensity for disease was their underlying frailty, placebo crossover trials yielded less biased estimates of VE(s) relative to the standard parallel design (Tables B6 and B7). Higher baseline hazards resulted in more differential culling of the risk set and increased bias in estimates of VE(s). In the high baseline hazard scenario, the common baseline hazard dominated heterogeneity in the frailty distribution, and in this setting the bias in VE(s) estimates under placebo crossover was comparable to the bias that was observed with parallel trials. However, the absolute bias was modest. Under a constant VE of 75%, the estimates at 2 years were about 70% on average under either design. This scenario with a variance of four had substantial heterogeneity in the risk of disease: the riskiest 1% in the placebo arm had a median probability of disease over 2 years of 0.60, while the least risky 1% had a median probability of 1 × 10^−10. In practice, we could mitigate biases resulting from heterogeneity in the frailty distribution by adjusting for known risk factors of disease and stratifying our analyses by site or geographic region.
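The differential culling described above can be illustrated with a standard property of gamma frailties: if the frailty has mean one and variance v, the mean frailty among participants who are still event free after cumulative baseline hazard H is 1/(1 + vH). The sketch below checks this closed form by simulation. The cumulative hazards of 0.08 (placebo) and 0.02 (vaccine at 75% VE) are illustrative assumptions, corresponding roughly to 600 cases per year among 15 000 placebo participants over 2 years; they are not taken from the authors' simulation code, and the function names are hypothetical.

```python
import math
import random


def survivor_mean_frailty(variance, cum_hazard):
    """Closed-form mean frailty among the event free (gamma frailty, mean 1)."""
    return 1.0 / (1.0 + variance * cum_hazard)


def simulate_survivor_mean(variance, cum_hazard, n=200_000, seed=1):
    """Monte Carlo check: draw frailties, keep those who stay event free."""
    rng = random.Random(seed)
    shape = 1.0 / variance  # shape * scale = 1 gives mean one
    scale = variance        # shape * scale**2 = variance
    total, count = 0.0, 0
    for _ in range(n):
        u = rng.gammavariate(shape, scale)
        # A participant with frailty u stays event free with prob exp(-u * H).
        if rng.random() < math.exp(-u * cum_hazard):
            total += u
            count += 1
    return total / count


# Assumed cumulative hazards over 2 years (illustrative only).
H_PLACEBO, H_VACCINE = 0.08, 0.02
for v in (1.0, 4.0):
    print(f"variance {v}: placebo {survivor_mean_frailty(v, H_PLACEBO):.3f}, "
          f"vaccine {survivor_mean_frailty(v, H_VACCINE):.3f}")
print(f"simulated, variance 4, placebo: {simulate_survivor_mean(4.0, H_PLACEBO):.3f}")
```

Under these assumptions the gap in mean frailty between the arms widens as either the frailty variance or the cumulative hazard grows, which is the qualitative pattern summarized in Table B5.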

5. ANALYSIS OF TWO SIMULATED TRIALS

In this section, we present detailed analyses of two simulated COVID‐19 vaccine trials where the true VE profile was either constant at 75% or waned linearly on the log‐hazard scale from 85% to 35% over 1.5 years. Each trial enrolled 30 000 participants with linear accrual over 3 months in a 1:1 randomization to vaccine or placebo. The baseline attack rate was piecewise constant with changepoints every 3 months, and was calibrated to yield 50, 75, 50, and 25 cases on the placebo arm in each period in the first year, and half the expected number of cases per period on the placebo arm in year two. In this example, interim analyses are planned at 150 cases, which ultimately result in crossover at the end of year one following evaluation and vetting of the efficacy by a regulatory agency. Placebo crossover occurs over a four week period. Each volunteer was followed for a total of 2 years.

The two simulated trials are summarized in Table 3. In the constant VE scenario, the trial reached 150 cases in 222 days, and recorded 223 events by the 1 year crossover time‐point and 273 events overall. In the waning VE scenario, the trial reached 150 cases in 242 days, and recorded 199 events by the 1 year crossover time‐point and 292 events overall. Based on the counts in Table 3, the share of cases on the original placebo arm declined from roughly 83% at the 150 case interim look to 76.2% at the completion of the study in the constant VE scenario, and from roughly 87% to 65.4% in the waning VE scenario. The overall VE estimate at the 1 year crossover, estimated using a PHs model without adjustment for time since vaccination, was 76.6% (95% CI: 67.2%, 83.3%) in the constant VE(s) case and 80.1% (95% CI: 71.0%, 86.3%) in the waning VE(s) scenario (the true geometric mean VE(s) to 1 year postvaccination is 75.6%).
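The geometric mean VE quoted for the waning scenario can be checked by hand. The sketch below assumes the log‐linear form VE(s) = 1 − exp(θ1 + θ2 s) with s in years, and reads "geometric mean VE" as one minus the geometric mean of the hazard ratio over the first year; both the reading and the function names are assumptions made for illustration. The values θ1 = −1.9 and θ2 = 0.98 are the rounded true values implied by VE waning from 85% to 35% over 1.5 years.

```python
import math


def ve_at(s_years, theta1, theta2):
    """VE at time s since vaccination under a log-linear hazard ratio."""
    return 1.0 - math.exp(theta1 + theta2 * s_years)


def geometric_mean_ve(theta1, theta2, horizon_years):
    """One minus the geometric mean hazard ratio over [0, horizon]."""
    # The log hazard ratio is linear in s, so its average over the
    # interval equals its value at the midpoint.
    mean_log_hr = theta1 + theta2 * horizon_years / 2.0
    return 1.0 - math.exp(mean_log_hr)


THETA1, THETA2 = -1.9, 0.98  # rounded true values, waning scenario

print(f"VE(0)   = {ve_at(0.0, THETA1, THETA2):.3f}")
print(f"VE(1.5) = {ve_at(1.5, THETA1, THETA2):.3f}")
print(f"geometric mean VE to 1 year = {geometric_mean_ve(THETA1, THETA2, 1.0):.3f}")
```

An analysis that ignores time since vaccination targets an average of this kind, which is why the unadjusted 1 year estimate is compared with 75.6% rather than with VE(0) or VE(1).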

TABLE 3.

Summary of example trials simulated under constant and waning vaccine efficacy (VE) at times of interim analysis and placebo crossover

| | True VE constant at 75% | True VE wanes from 85% to 35% |
|---|---|---|
| Time of 150 case interim look | Day 222 | Day 242 |
| Case split by original arm | | |
| at interim look | Placebo = 124, Vaccine = 26 | Placebo = 131, Vaccine = 19 |
| at 1 year crossover | Placebo = 181, Vaccine = 42 | Placebo = 166, Vaccine = 33 |
| at 2 year follow‐up | Placebo = 208, Vaccine = 65 | Placebo = 191, Vaccine = 101 |
| Estimates at interim look | | |
| log‐linear model | | |
| Intercept | −0.84 (95% CI: −1.6, −0.09) | −2.16 (95% CI: −3.17, −1.16) |
| Linear trend | −3.06 (95% CI: −6.05, −0.07) | 0.81 (95% CI: −2.13, 3.75) |
| LRT for time‐varying VE | 0.039 | 0.589 |
| P‐spline model | | |
| Intercept | −1.41 (95% CI: −2.77, −0.05) | −2.43 (95% CI: −4.23, −0.62) |
| Linear trend | −3.02 (95% CI: −6.36, 0.32) | 0.8 (95% CI: −2.13, 3.73) |
| LRT for time‐varying VE | 0.037 | 0.605 |
| Estimates at 1 year crossover | | |
| log‐linear model | | |
| Intercept | −1.34 (95% CI: −1.98, −0.7) | −2.36 (95% CI: −3.17, −1.55) |
| Linear trend | −0.29 (95% CI: −1.74, 1.17) | 1.8 (95% CI: 0.2, 3.4) |
| LRT for time‐varying VE | 0.698 | 0.027 |
| P‐spline model | | |
| Intercept | −1.14 (95% CI: −2.17, −0.1) | −2.26 (95% CI: −3.68, −0.83) |
| Linear trend | −0.28 (95% CI: −1.66, 1.1) | 1.8 (95% CI: 0.22, 3.37) |
| LRT for time‐varying VE | 0.133 | 0.054 |
| Estimates at 2 year follow‐up | | |
| log‐linear model | | |
| Intercept | −1.37 (95% CI: −1.77, −0.97) | −2.19 (95% CI: −2.62, −1.75) |
| Linear trend | −0.13 (95% CI: −0.7, 0.43) | 1.33 (95% CI: 0.82, 1.83) |
| LRT for time‐varying VE | 0.641 | <0.001 |
| P‐spline model | | |
| Intercept | −1.33 (95% CI: −2.09, −0.58) | −2.26 (95% CI: −3.15, −1.36) |
| Linear trend | −0.13 (95% CI: −0.7, 0.44) | 1.28 (95% CI: 0.77, 1.8) |
| LRT for time‐varying VE | 0.178 | <0.001 |

Note: The intercept and linear trend correspond to the immediate effect of vaccination and the time‐trend for VE(s) under model (2), and the true values were set to θ1 = −1.39 and θ2 = 0 in the constant VE scenario, and θ1 = −1.9 and θ2 = 0.98 in the waning VE setting. The likelihood ratio test (LRT) rows report P‐values for the test of time‐varying VE, which compares models (2) and (5) to a PH model without adjustment for time since vaccination.

Point estimates for VE(0) and the linear trend in log VE(s) from the log‐linear and P‐spline models were close in both scenarios, although confidence intervals in the P‐spline models were wider. The estimated efficacy profiles obtained with both methods were in agreement and recovered the true VE profile (Figure 3). The P‐spline estimates had wider pointwise confidence intervals, but the inflation in the variance appears to be fairly modest for the period spanning the end of study enrollment through, roughly, year 1.5 postvaccination. In practice, both the log‐linear decay model and the P‐spline model could be used to test a hypothesis of time‐varying VE(s). This is straightforwardly carried out for the log‐linear model via a likelihood ratio test (LRT) for the slope parameter in (2), where the test statistic is compared with a chi‐square distribution with one degree of freedom. For the P‐spline models, we perform an LRT for whether all of the P‐spline basis coefficients are jointly equal to zero, and compare the test statistic to a chi‐square distribution with 3.1 degrees of freedom (the effective degrees of freedom for the P‐splines in our models). At the end of 2 years of follow‐up, we resoundingly reject the null hypothesis of time‐homogeneous VE(s) in the waning VE(s) scenario, and fail to reject the null in the constant VE(s) scenario (Table 3). The benefit of an additional year of follow‐up past crossover is substantial in terms of evaluating the long term durability of the vaccine. Under the waning VE scenario, the P‐value for testing the null hypothesis of constant VE is close to 0.05 at 1 year and convincing at 2 years for both the log‐linear and P‐spline models. These simulated examples show that for both the waning and constant VE scenarios, accurate inference about the behavior of the VE over time can be recovered.
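For the log‐linear model, the LRT described above reduces to comparing twice the difference in maximized log partial likelihoods with a chi‐square distribution with one degree of freedom. The following is a minimal sketch of that calculation using only the standard library; the two log‐likelihood values are hypothetical placeholders, not output from the trials in Table 3. The P‐spline test with 3.1 effective degrees of freedom needs the general chi‐square tail function and is not covered here.

```python
import math


def chi2_sf_1df(x):
    """Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


def lrt_pvalue_1df(loglik_null, loglik_alt):
    """LRT statistic and P-value for one extra parameter (the slope)."""
    stat = 2.0 * (loglik_alt - loglik_null)
    # Guard against tiny negative values caused by numerical noise.
    return stat, chi2_sf_1df(max(stat, 0.0))


# Hypothetical maximized log partial likelihoods:
# null = constant VE, alternative = log-linear VE decay.
stat, p_value = lrt_pvalue_1df(loglik_null=-1500.0, loglik_alt=-1497.0)
print(f"LRT statistic = {stat:.2f}, P-value = {p_value:.4f}")
```

In a fitted analysis the two log‐likelihoods would come from the PH model without adjustment for time since vaccination and from model (2), respectively.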

FIGURE 3.

(Top) Number of events per quarter by arm. The delayed vaccination arm consists of the original placebo participants after they have been crossed over. (Bottom) Vaccine efficacy (VE) as a function of time since vaccination. Dashed lines are the true VE(s), solid curves and ribbons are pointwise means and 95% confidence intervals [Colour figure can be viewed at wileyonlinelibrary.com]

6. DISCUSSION

Knowing the durability of vaccine‐induced protection is a key question in vaccine development, especially for COVID‐19 vaccines. With placebo volunteers being offered vaccine before long term follow‐up has completed, it might seem that the ability to assess durability is lost. In this article, we demonstrated that placebo controlled VE can be accurately assessed long after the placebo group has disappeared. Our method is the familiar Cox PHs model. To reflect seasonal or outbreak variation in the attack rate, we used calendar time as the time index. To recover different VE curves we specified flexible models for VE decay. If crossover occurs quickly, the early VE(s) will remain poorly estimated no matter how many postcrossover cases occur, which will in turn affect later estimates of VE(s). Our results point out the advantages of delaying crossover and of longer crossover interludes for improving estimation. We also provide suggestions on how to manage the crossover interlude, discuss how to perform per‐protocol analyses, and discuss solutions for open label crossover where risk behavior might increase for the recently unblinded vaccinees. While developing this approach we became aware of two related methods that assess vaccine durability. Both use hazard function models with calendar time as the index; one considers the precrossover period 24 and the other both the pre and postcrossover periods. 25 The latter focuses on methods for settings where crossover is open label and confounding is an issue.

Future work could develop random effects or frailty type models. Such models seem especially suited for settings with a relatively high attack rate. Our work focused on the setting where the disease event was continuously monitored. An important endpoint in vaccine trials is seroconversion, or the development of antibodies to the pathogen of interest. Seroconversion is typically measured infrequently, which results in an interval censored endpoint. The extension of these methods to interval censored data will be important. Another potential generalization of these methods is to observational data, where methods could be developed that address confounding of risk with vaccine uptake. Finally, an emerging issue in the context of COVID‐19 is how to estimate vaccine durability in the presence of emerging strains. This could be addressed within a competing risks framework in which times to first acquiring disease due to different strains are treated as competing events. The framework developed in this article is easily extended to this setting since our models could straightforwardly be applied to the subdistribution hazards in a competing risks model.

ACKNOWLEDGEMENTS

This work utilized the computational resources of the NIH HPC Biowulf computing cluster (http://hpc.nih.gov). The authors would like to thank Keith Lumbard for help with simulations, as well as Michael Fay, Anastasios Tsiatis, Danyu Lin, Peter Gilbert, Holly Janes, and Larry Molton for helpful discussions regarding this work.

Appendix A. Illustrative Computer Code

In this section, we present a minimal example with SAS and R code to estimate a log‐linear waning efficacy curve. In this trial, a per‐protocol analysis is used and disease cases are counted starting 30 days after the first dose. Calendar time is relative to 1 January 2021, so the volunteer depicted in the first row was dosed on January 5, 2021. Thus, during the crossover period, cases are not counted for 30 days. A blinded crossover is assumed so the same placebo baseline hazard applies throughout the study without time‐dependent stratification. If an open label crossover were pursued, an additional “stratum” variable could be created that identified whether a risk interval was blinded or open label. The “stratum” variable would then be used as a stratification variable in the PHs model.

We assume the data is arranged so that anyone who gets the crossover dose has both the start and end date of crossover recorded and that if someone drops out or has an event before the start of the crossover period the start and end date of crossover are missing.

Volunteers 1, 2, 6, and 7 are censored after crossover while Volunteer 4 has an event after crossover. Volunteer 3 is censored before crossover while Volunteers 5 and 8 have events before crossover. Volunteers 9 and 10 enter the crossover interlude but have an event and drop out, respectively; thus they are censored at the start of the crossover interlude.

The variables are:

id        = subject identifier
arm       = original randomization arm (1 = vaccine, 0 = placebo)
entry     = # of days from 01‐Jan‐2021 to 30 days past first dose
Xstart    = # of days from 01‐Jan‐2021 to crossover start
Xend      = # of days from 01‐Jan‐2021 to 30 days post first crossover dose
eventtime = # of days from 01‐Jan‐2021 to a disease event or censoring
status    = 1 if a disease event at eventtime, 0 otherwise


A.1. SAS code



DATA new;
     INPUT id arm entry Xstart Xend eventtime status;
     CARDS;
         1  0  35  65  95  370 0
         2  1  45  80  110 400 0
         3  0  55   .   .  150 0
         4  1  60  170 200 310 1
         5  0  65   .   .  80  1
         6  1  80  190 210 410 0
         7  0  85  215 245 420 0
         8  1  70   .   .  90  1
         9  0  58  160 190 180 1
         10 1  71  160 190 166 0
;
RUN;

/* did not start crossover: a single risk interval from entry to event or censoring */
DATA data1; SET new;
   IF Xstart=. AND Xend=.;
      period=1; start=entry; stop=eventtime; event=status;
RUN;

/* event or dropout during crossover interlude: censor period 1 at the start of crossover */
DATA data2; SET new;
   IF Xstart^=. AND Xstart<=eventtime AND eventtime<=Xend;
      period=1; start=entry; stop=Xstart; event=0;
RUN;

/* did pass crossover, so output rows for the pre and postcrossover periods */
DATA data3; SET new;
   IF Xstart^=. AND eventtime>Xend;
      period=1; start=entry; stop=Xstart; event=0; OUTPUT;
      period=2; start=Xend; stop=eventtime; event=status; OUTPUT;
RUN;

/* Merge the three datasets and mark the vaccination time and status */
DATA newest;
  SET data1 data2 data3;
   IF arm=0 THEN DO; timevact=Xend; IF period=1 THEN vac=0; ELSE vac=1; END;
   IF arm=1 THEN DO; timevact=entry;                             vac=1; END;
RUN;

/* Fit the model with a log-linear VE decay; vactime is set to 0 when vac=0 so it is never missing */
PROC PHREG DATA=newest;
  MODEL (start, stop)*event(0) = vac vactime
          / itprint rl;
  vactime=vac*(stop-timevact);
   IF vac=0 THEN vactime=0;
RUN;

/* Print out the dataset, sorted */
PROC SORT DATA=newest;
   BY id period;
RUN;
PROC PRINT DATA=newest;
   VAR id arm period start stop event vac timevact;
RUN;

Obs  id  arm  period  start  stop  event  vac  timevact
1    1   0    1       35     65    0      0    95
2    1   0    2       95     370   0      1    95
3    2   1    1       45     80    0      1    45
4    2   1    2       110    400   0      1    45
5    3   0    1       55     150   0      0    .
6    4   1    1       60     170   0      1    60
7    4   1    2       200    310   1      1    60
8    5   0    1       65     80    1      0    .
9    6   1    1       80     190   0      1    80
10   6   1    2       210    410   0      1    80
11   7   0    1       85     215   0      0    245
12   7   0    2       245    420   0      1    245
13   8   1    1       70     90    1      1    70
14   9   0    1       58     160   0      0    190
15   10  1    1       71     160   0      1    71




In this minimal example, the estimates of (θ1, θ2) are (−0.82335, 0.02649), respectively, which correspond to estimates of VE(0) and VE(30) of 1 − exp(−0.82335) = 0.561 and 1 − exp(−0.82335 + 30 × 0.02649) = 0.028.
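The conversion from the fitted coefficients to VE at a given number of days since vaccination can be reproduced in a few lines. This is a small sketch under the log‐linear model, with time measured in days as in the example dataset; the function name and the "days until VE reaches zero" calculation are illustrative additions, not part of the original appendix.

```python
import math

# Coefficient estimates reported for the minimal example (time in days).
THETA1 = -0.82335  # log hazard ratio immediately after vaccination
THETA2 = 0.02649   # change in log hazard ratio per day since vaccination


def ve(days_since_vaccination):
    """VE(s) = 1 - exp(theta1 + theta2 * s) under the log-linear model."""
    return 1.0 - math.exp(THETA1 + THETA2 * days_since_vaccination)


# The fitted log hazard ratio crosses zero where theta1 + theta2 * s = 0.
days_to_zero_ve = -THETA1 / THETA2

print(f"VE(0)  = {ve(0):.3f}")
print(f"VE(30) = {ve(30):.3f}")
print(f"fitted VE reaches zero at about day {days_to_zero_ve:.1f}")
```

With only 10 volunteers these numbers are purely illustrative; the point is the mapping from (θ1, θ2) to the efficacy curve.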

A.2. R code



library(survival)
library(tidyverse) # for dplyr (%>%, reframe, case_when) and tidyr (drop_na)

# dataset
dat_raw =
   data.frame(
   id        = c(1,2,3,4,5,6,7,8,9,10),
   arm       = c(0,1,0,1,0,1,0,1,0,1),
   entry     = c(35,45,55,60,65,80,85,70,58,71),
   Xstart    = c(65,80,NA,170,NA,190,215,NA,160,160),
   Xend      = c(95,110,NA,200,NA,210,245,NA,190,190),
   eventtime = c(370,400,150,310,80,410,420,90,180,166),
   status    = c(0,0,0,1,1,0,0,1,1,0)
)

# reshape dataset into start-stop format
# add new variables for time-varying vaccine status and vaccination time
# (the grouping variable id is carried along automatically; reframe()
#  needs dplyr >= 1.1.0, with older versions use summarize() instead)
dat_long =
   dat_raw %>%
   group_by(id) %>%
   reframe(arm = rep(arm, 2),
           tstart = c(entry, Xend),
           tstop = c(min(eventtime, Xstart, na.rm = TRUE), eventtime),
           status =
               case_when(is.na(Xstart) ~
                            c(status, NA), # event happens before crossover
                         between(eventtime, Xstart, Xend) ~
                            c(0, NA), # no event, then blackout period
                         eventtime > Xend ~
                            c(0, status)), # no event, then record status
           vacc_status =
               case_when(arm == 1 ~ rep(1, 2),
                         arm == 0 ~ c(0, 1)),
           vacc_time =
               case_when(arm == 1 ~ rep(entry, 2),
                         arm == 0 & is.na(Xstart) & is.na(Xend) ~ Inf,
                         arm == 0 & !is.na(Xstart) & !is.na(Xend) ~ rep(Xend, 2))) %>%
    drop_na() %>% # drop the unused second interval
    as.data.frame()

print(dat_long, row.names = FALSE)
#  id  arm  tstart   tstop  status  vacc_status  vacc_time
#  1    0     35       65     0        0            95
#  1    0     95       370    0        1            95
#  2    1     45       80     0        1            45
#  2    1     110      400    0        1            45
#  3    0     55       150    0        0            Inf
#  4    1     60       170    0        1            60
#  4    1     200      310    1        1            60
#  5    0     65       80     1        0            Inf
#  6    1     80       190    0        1            80
#  6    1     210      410    0        1            80
#  7    0     85       215    0        0            245
#  7    0     245      420    0        1            245
#  8    1     70       90     1        1            70
#  9    0     58       160    0        0            190
#  10   1     71       160    0        1            71

# fit that model!
# tt() makes time since vaccination a time-varying covariate;
# pmax(0, .) keeps it at zero before vaccination (vacc_time = Inf if never vaccinated)
vacc_dur =
   coxph(formula =
        Surv(time = tstart,
            time2 = tstop,
            event = status) ~ vacc_status + tt(vacc_time),
               tt = function(vacc_time, t, ...) {
                  pmax(0, t - vacc_time)
             },
            data = dat_long)




The parameter estimates of (θ1, θ2) are (−0.82336, 0.02649).

Appendix B. Additional Simulation Results

B.1. Trials with year two baseline hazard equal to year one baseline hazard

In this section, we provide simulation results where the baseline hazard function in year two is the same as in year one. Table B1 is the analogue of Table 1, and Table B2 is the analogue of Table 2.

TABLE B1.

Bias, empirical variance, and coverage for the linear predictor in Cox PH models for simulated trials where the baseline hazard in year two was the same as the baseline hazard in year one

| Design | Model | Time | log(1 − VE(s)): Bias | log(1 − VE(s)): Emp. var. | log(1 − VE(s)): Covg. | log(1 − VE(s)) − log(1 − VE(0)): Bias | log(1 − VE(s)) − log(1 − VE(0)): Emp. var. | log(1 − VE(s)) − log(1 − VE(0)): Covg. |
|---|---|---|---|---|---|---|---|---|
| True vaccine efficacy constant at 75% | | | | | | | | |
| Cross at 150 cases (τx = 0.6 ± 0.05) | log‐linear | 0.5 | −0.012 | 0.046 | 0.950 | 0.001 | 0.028 | 0.952 |
| | | 1.0 | −0.011 | 0.102 | 0.951 | 0.002 | 0.111 | 0.952 |
| | | 1.5 | −0.010 | 0.213 | 0.950 | 0.003 | 0.249 | 0.952 |
| | | 2.0 | −0.010 | 0.379 | 0.951 | 0.003 | 0.443 | 0.952 |
| | P‐spline | 0.5 | −0.015 | 0.064 | 0.962 | 0.003 | 0.118 | 0.975 |
| | | 1.0 | −0.014 | 0.148 | 0.960 | 0.004 | 0.229 | 0.960 |
| | | 1.5 | −0.013 | 0.226 | 0.957 | 0.005 | 0.303 | 0.957 |
| | | 2.0 | −0.019 | 0.461 | 0.966 | −0.001 | 0.536 | 0.963 |
| Cross at 1 year (Nx = 216 ± 13) | log‐linear | 0.5 | −0.008 | 0.027 | 0.950 | −0.001 | 0.011 | 0.953 |
| | | 1.0 | −0.009 | 0.043 | 0.951 | −0.002 | 0.042 | 0.953 |
| | | 1.5 | −0.010 | 0.080 | 0.953 | −0.002 | 0.095 | 0.953 |
| | | 2.0 | −0.011 | 0.138 | 0.953 | −0.003 | 0.170 | 0.953 |
| | P‐spline | 0.5 | −0.010 | 0.035 | 0.976 | 0.005 | 0.101 | 0.981 |
| | | 1.0 | −0.010 | 0.056 | 0.971 | 0.005 | 0.117 | 0.965 |
| | | 1.5 | −0.013 | 0.087 | 0.969 | 0.002 | 0.132 | 0.977 |
| | | 2.0 | −0.021 | 0.271 | 0.977 | −0.005 | 0.336 | 0.977 |
| Parallel trial | log‐linear | 0.5 | −0.008 | 0.020 | 0.948 | 0.000 | 0.010 | 0.950 |
| | | 1.0 | −0.008 | 0.013 | 0.950 | 0.000 | 0.040 | 0.950 |
| | | 1.5 | −0.008 | 0.027 | 0.949 | 0.000 | 0.089 | 0.950 |
| | | 2.0 | −0.008 | 0.060 | 0.949 | 0.000 | 0.158 | 0.950 |
| | P‐spline | 0.5 | −0.010 | 0.033 | 0.981 | 0.012 | 0.134 | 0.983 |
| | | 1.0 | −0.007 | 0.033 | 0.983 | 0.016 | 0.164 | 0.968 |
| | | 1.5 | −0.012 | 0.035 | 0.981 | 0.011 | 0.142 | 0.975 |
| | | 2.0 | −0.039 | 0.171 | 0.977 | −0.016 | 0.274 | 0.976 |
| Vaccine efficacy wanes from 85% to 35% over 1.5 years | | | | | | | | |
| Cross at 150 cases (τx = 0.63 ± 0.05) | log‐linear | 0.5 | −0.015 | 0.048 | 0.952 | 0.001 | 0.016 | 0.952 |
| | | 1.0 | −0.014 | 0.078 | 0.949 | 0.002 | 0.064 | 0.952 |
| | | 1.5 | −0.013 | 0.140 | 0.950 | 0.004 | 0.144 | 0.952 |
| | | 2.0 | −0.012 | 0.234 | 0.949 | 0.005 | 0.255 | 0.952 |
| | P‐spline | 0.5 | −0.015 | 0.063 | 0.963 | 0.009 | 0.108 | 0.981 |
| | | 1.0 | −0.010 | 0.127 | 0.962 | 0.015 | 0.208 | 0.968 |
| | | 1.5 | −0.008 | 0.167 | 0.953 | 0.017 | 0.243 | 0.965 |
| | | 2.0 | −0.007 | 0.254 | 0.959 | 0.018 | 0.329 | 0.962 |
| Cross at 1 year (Nx = 211 ± 13) | log‐linear | 0.5 | −0.009 | 0.030 | 0.948 | 0.001 | 0.008 | 0.953 |
| | | 1.0 | −0.008 | 0.039 | 0.952 | 0.003 | 0.031 | 0.953 |
| | | 1.5 | −0.006 | 0.065 | 0.952 | 0.004 | 0.071 | 0.953 |
| | | 2.0 | −0.005 | 0.106 | 0.954 | 0.006 | 0.126 | 0.953 |
| | P‐spline | 0.5 | −0.010 | 0.034 | 0.972 | 0.012 | 0.089 | 0.987 |
| | | 1.0 | −0.006 | 0.051 | 0.968 | 0.016 | 0.116 | 0.969 |
| | | 1.5 | −0.002 | 0.070 | 0.966 | 0.020 | 0.118 | 0.980 |
| | | 2.0 | 0.001 | 0.154 | 0.973 | 0.023 | 0.218 | 0.974 |
| Parallel trial | log‐linear | 0.5 | −0.008 | 0.022 | 0.952 | 0.004 | 0.008 | 0.948 |
| | | 1.0 | −0.004 | 0.010 | 0.947 | 0.007 | 0.031 | 0.948 |
| | | 1.5 | −0.001 | 0.014 | 0.947 | 0.011 | 0.071 | 0.948 |
| | | 2.0 | 0.003 | 0.034 | 0.949 | 0.014 | 0.125 | 0.948 |
| | P‐spline | 0.5 | −0.011 | 0.030 | 0.980 | 0.019 | 0.119 | 0.990 |
| | | 1.0 | −0.004 | 0.023 | 0.983 | 0.026 | 0.166 | 0.973 |
| | | 1.5 | −0.001 | 0.020 | 0.978 | 0.029 | 0.144 | 0.976 |
| | | 2.0 | −0.003 | 0.076 | 0.975 | 0.027 | 0.197 | 0.977 |

Note: The log‐linear and P‐spline models correspond to (2) and (5), respectively. The average time of crossover (in years), τx, and the average number of events at crossover, Nx, are reported, along with SDs, beneath the crossover grouping in the design column. Time is given in years since study initiation.

Abbreviations: PH, proportional hazards; VE, vaccine efficacy.

TABLE B2.

Bias, empirical variance, and coverage for estimates of the intercept and linear trend in vaccine efficacy under the log‐linear model, (2), and semiparametric model, (5)

| Design | Model | Intercept: Bias | Intercept: Emp. var. | Intercept: Covg. | Linear trend: Bias | Linear trend: Emp. var. | Linear trend: Covg. |
|---|---|---|---|---|---|---|---|
| Vaccine efficacy constant at 75% | | | | | | | |
| Cross at 150 cases | Constant VE | −0.012 | 0.039 | 0.951 | | | |
| | log‐linear | −0.013 | 0.046 | 0.953 | 0.002 | 0.111 | 0.952 |
| | P‐spline | −0.018 | 0.076 | 0.970 | 0.002 | 0.109 | 0.954 |
| Cross at 1 year | Constant VE | −0.007 | 0.027 | 0.949 | | | |
| | log‐linear | −0.008 | 0.033 | 0.952 | −0.002 | 0.042 | 0.953 |
| | P‐spline | −0.015 | 0.079 | 0.977 | −0.002 | 0.042 | 0.956 |
| Parallel trial | Constant VE | −0.004 | 0.013 | 0.951 | | | |
| | log‐linear | −0.007 | 0.046 | 0.950 | 0.000 | 0.040 | 0.950 |
| | P‐spline | −0.023 | 0.110 | 0.977 | 0.002 | 0.039 | 0.952 |
| Vaccine efficacy wanes from 85% to 35% over 1.5 years | | | | | | | |
| Cross at 150 cases | log‐linear | −0.016 | 0.050 | 0.951 | 0.002 | 0.064 | 0.952 |
| | P‐spline | −0.025 | 0.086 | 0.975 | −0.002 | 0.063 | 0.953 |
| Cross at 1 year | log‐linear | −0.010 | 0.036 | 0.952 | 0.003 | 0.031 | 0.953 |
| | P‐spline | −0.022 | 0.085 | 0.979 | −0.004 | 0.031 | 0.955 |
| Parallel trial | log‐linear | −0.011 | 0.049 | 0.950 | 0.007 | 0.031 | 0.948 |
| | P‐spline | −0.030 | 0.124 | 0.980 | 0.003 | 0.031 | 0.949 |

Note: Here, the baseline hazard in year two was the same as the baseline hazard in year one.

Abbreviation: VE, vaccine efficacy.

B.2. Comparing uniform crossover, crossover at 1 year, and parallel trials

This section contains results for a set of idealized trials with constant baseline hazard, instantaneous enrollment and crossover, and constant VE. Trials either crossed placebo participants to the vaccine arm at 1 year, uniformly over the 2 year study period, or never (corresponding to a standard parallel arm design). Table B3 provides the results for the vaccine efficacy over time while Table B4 provides the parameter estimates.

TABLE B3.

Bias, empirical variance, and coverage for the linear predictor in Cox PH models for simulated trials in an idealized scenario with constant baseline hazards and either continuous crossover, instantaneous crossover at 1 year, or a standard trial

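| Design | Model | Time | log(1 − VE(s)): Bias | log(1 − VE(s)): Empir. var. | log(1 − VE(s)): Coverage | log(1 − VE(s)) − log(1 − VE(0)): Bias | log(1 − VE(s)) − log(1 − VE(0)): Empir. var. | log(1 − VE(s)) − log(1 − VE(0)): Coverage |
|---|---|---|---|---|---|---|---|---|
| Continuous uniform crossover | log‐linear | 0.0 | −0.008 | 0.031 | 0.950 | | | |
| | | 0.5 | −0.004 | 0.016 | 0.951 | 0.003 | 0.009 | 0.950 |
| | | 1.0 | −0.001 | 0.018 | 0.949 | 0.007 | 0.034 | 0.950 |
| | | 1.5 | 0.002 | 0.037 | 0.951 | 0.010 | 0.077 | 0.950 |
| | | 2.0 | 0.006 | 0.074 | 0.950 | 0.013 | 0.137 | 0.950 |
| | P‐spline | 0.0 | −0.017 | 0.072 | 0.979 | | | |
| | | 0.5 | −0.006 | 0.023 | 0.981 | 0.010 | 0.082 | 0.983 |
| | | 1.0 | −0.004 | 0.029 | 0.976 | 0.012 | 0.104 | 0.967 |
| | | 1.5 | 0.002 | 0.044 | 0.973 | 0.018 | 0.103 | 0.970 |
| | | 2.0 | 0.011 | 0.183 | 0.976 | 0.027 | 0.234 | 0.980 |
| Crossover at 1 year | log‐linear | 0.0 | −0.008 | 0.035 | 0.953 | | | |
| | | 0.5 | −0.008 | 0.026 | 0.954 | 0.000 | 0.010 | 0.947 |
| | | 1.0 | −0.007 | 0.037 | 0.947 | 0.000 | 0.039 | 0.947 |
| | | 1.5 | −0.007 | 0.068 | 0.947 | 0.000 | 0.088 | 0.947 |
| | | 2.0 | −0.007 | 0.117 | 0.946 | 0.000 | 0.156 | 0.947 |
| | P‐spline | 0.0 | −0.018 | 0.100 | 0.984 | | | |
| | | 0.5 | −0.009 | 0.032 | 0.982 | 0.009 | 0.112 | 0.986 |
| | | 1.0 | −0.012 | 0.048 | 0.975 | 0.005 | 0.122 | 0.971 |
| | | 1.5 | −0.010 | 0.078 | 0.970 | 0.007 | 0.130 | 0.981 |
| | | 2.0 | −0.002 | 0.237 | 0.980 | 0.016 | 0.333 | 0.975 |
| Parallel trial | log‐linear | 0.0 | −0.006 | 0.051 | 0.950 | | | |
| | | 0.5 | −0.007 | 0.022 | 0.950 | 0.000 | 0.010 | 0.951 |
| | | 1.0 | −0.007 | 0.013 | 0.949 | −0.001 | 0.039 | 0.951 |
| | | 1.5 | −0.008 | 0.024 | 0.951 | −0.001 | 0.088 | 0.951 |
| | | 2.0 | −0.008 | 0.054 | 0.953 | −0.002 | 0.157 | 0.951 |
| | P‐spline | 0.0 | −0.021 | 0.131 | 0.981 | | | |
| | | 0.5 | −0.011 | 0.032 | 0.982 | 0.010 | 0.140 | 0.986 |
| | | 1.0 | −0.010 | 0.033 | 0.981 | 0.010 | 0.186 | 0.968 |
| | | 1.5 | −0.010 | 0.034 | 0.983 | 0.011 | 0.165 | 0.973 |
| | | 2.0 | −0.024 | 0.133 | 0.982 | −0.004 | 0.261 | 0.977 |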

Note: The log‐linear and P‐spline models correspond to (2) and (5), respectively.

Abbreviations: PH, proportional hazards; VE, vaccine efficacy.

TABLE B4.

Empirical variance and coverage for estimates of the intercept and linear trend in vaccine efficacy under the log‐linear model, (2), and semiparametric model, (5), in an idealized scenario with constant baseline hazards and continuous crossover, instantaneous crossover at 1 year, or a standard trial

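| Design | Model | Intercept: Bias | Intercept: Emp. var. | Intercept: Covg. | Linear trend: Bias | Linear trend: Emp. var. | Linear trend: Covg. |
|---|---|---|---|---|---|---|---|
| Continuous uniform crossover | log‐linear | −0.008 | 0.031 | 0.950 | 0.007 | 0.034 | 0.950 |
| | P‐spline | −0.016 | 0.071 | 0.973 | 0.003 | 0.034 | 0.952 |
| Crossover at 1 year | log‐linear | −0.008 | 0.035 | 0.953 | 0.000 | 0.039 | 0.947 |
| | P‐spline | −0.018 | 0.099 | 0.978 | −0.002 | 0.038 | 0.952 |
| Parallel trial | log‐linear | −0.006 | 0.051 | 0.950 | −0.001 | 0.039 | 0.951 |
| | P‐spline | −0.021 | 0.130 | 0.977 | −0.001 | 0.039 | 0.952 |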

B.3. Frailty simulation results

This section presents results from simulated trials in which participants were heterogeneous in their baseline hazards. Simulation was analogous to trials simulated elsewhere in this article, except that each trial consisted of 30 000 participants and the participant level hazard was h̃_i(t) = U_i h_i(t), where U_i is the frailty of participant i and h_i(t) corresponds to either a constant VE or log‐linear VE model. In both cases the baseline hazard was constant. We considered two settings for the baseline hazard: a low event rate scenario calibrated to yield 100 cases per year on placebo or a high event rate scenario calibrated to yield 600 cases per year on placebo. Participant frailties were drawn from a gamma distribution with mean one and a variance of either one (low frailty variance scenario) or four (high frailty variance scenario).

Tables B5, B6, B7 present the results.

TABLE B5.

Summary statistics of the frailty distribution of participants still in the risk set at the end of 2 years of follow‐up

Frailty distribution summary statistics

| Baseline hazard | Frailty variance | Design | Original arm | Mean | SD | 25%ile | 50%ile | 75%ile |
|---|---|---|---|---|---|---|---|---|
| VE constant at 75% | | | | | | | | |
| Low | Low | Cross at 1 year | Placebo | 0.992 | 0.992 | 0.285 | 0.688 | 1.375 |
| | | | Vaccine | 0.997 | 0.996 | 0.287 | 0.691 | 1.382 |
| | | Parallel trial | Placebo | 0.987 | 0.987 | 0.284 | 0.684 | 1.368 |
| | | | Vaccine | 0.997 | 0.996 | 0.287 | 0.691 | 1.382 |
| | High | Cross at 1 year | Placebo | 0.969 | 1.937 | 0.010 | 0.169 | 1.010 |
| | | | Vaccine | 0.987 | 1.973 | 0.010 | 0.172 | 1.028 |
| | | Parallel trial | Placebo | 0.949 | 1.897 | 0.010 | 0.166 | 0.989 |
| | | | Vaccine | 0.987 | 1.973 | 0.010 | 0.172 | 1.028 |
| High | Low | Cross at 1 year | Placebo | 0.955 | 0.955 | 0.275 | 0.662 | 1.323 |
| | | | Vaccine | 0.980 | 0.980 | 0.282 | 0.680 | 1.359 |
| | | Parallel trial | Placebo | 0.926 | 0.926 | 0.266 | 0.642 | 1.283 |
| | | | Vaccine | 0.980 | 0.980 | 0.282 | 0.680 | 1.359 |
| | High | Cross at 1 year | Placebo | 0.841 | 1.681 | 0.009 | 0.147 | 0.876 |
| | | | Vaccine | 0.926 | 1.851 | 0.010 | 0.162 | 0.965 |
| | | Parallel trial | Placebo | 0.757 | 1.514 | 0.008 | 0.132 | 0.790 |
| | | | Vaccine | 0.926 | 1.851 | 0.010 | 0.162 | 0.965 |
| VE wanes from 85% to 35% over 1.5 years | | | | | | | | |
| Low | Low | Cross at 1 year | Placebo | 0.992 | 0.992 | 0.285 | 0.688 | 1.375 |
| | | | Vaccine | 0.994 | 0.994 | 0.286 | 0.689 | 1.378 |
| | | Parallel trial | Placebo | 0.987 | 0.987 | 0.284 | 0.684 | 1.368 |
| | | | Vaccine | 0.994 | 0.994 | 0.286 | 0.689 | 1.378 |
| | High | Cross at 1 year | Placebo | 0.968 | 1.936 | 0.010 | 0.169 | 1.009 |
| | | | Vaccine | 0.975 | 1.950 | 0.010 | 0.170 | 1.017 |
| | | Parallel trial | Placebo | 0.949 | 1.897 | 0.010 | 0.166 | 0.989 |
| | | | Vaccine | 0.975 | 1.950 | 0.010 | 0.170 | 1.017 |
| High | Low | Cross at 1 year | Placebo | 0.954 | 0.954 | 0.274 | 0.661 | 1.322 |
| | | | Vaccine | 0.964 | 0.964 | 0.277 | 0.668 | 1.337 |
| | | Parallel trial | Placebo | 0.926 | 0.926 | 0.266 | 0.642 | 1.283 |
| | | | Vaccine | 0.964 | 0.964 | 0.277 | 0.668 | 1.337 |
| | High | Cross at 1 year | Placebo | 0.838 | 1.676 | 0.009 | 0.146 | 0.874 |
| | | | Vaccine | 0.870 | 1.740 | 0.009 | 0.152 | 0.907 |
| | | Parallel trial | Placebo | 0.757 | 1.514 | 0.008 | 0.132 | 0.790 |
| | | | Vaccine | 0.870 | 1.740 | 0.009 | 0.152 | 0.907 |

Note: We report geometric means of summary statistics of each frailty distribution from 10 000 simulated trials.

Abbreviation: VE, vaccine efficacy.

TABLE B6.

Bias of estimates of the linear predictor for VE and change in the linear predictor for VE in Cox PH models for trials simulated with constant VE at 75% and gamma distributed frailties

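| Frailty variance | Model | Time | Placebo crossover: log(1 − VE(s)) | Placebo crossover: log(1 − VE(s)) − log(1 − VE(0)) | Parallel trial: log(1 − VE(s)) | Parallel trial: log(1 − VE(s)) − log(1 − VE(0)) |
|---|---|---|---|---|---|---|
| Low baseline hazard | | | | | | |
| Low | log‐linear | 0.5 | −0.014 | 0.000 | −0.013 | 0.001 |
| | | 1.0 | −0.015 | 0.000 | −0.012 | 0.002 |
| | | 1.5 | −0.015 | −0.001 | −0.011 | 0.003 |
| | | 2.0 | −0.015 | −0.001 | −0.010 | 0.005 |
| | P‐spline | 0.5 | −0.017 | 0.022 | −0.017 | 0.043 |
| | | 1.0 | −0.022 | 0.016 | −0.017 | 0.043 |
| | | 1.5 | −0.020 | 0.018 | −0.019 | 0.040 |
| | | 2.0 | −0.032 | 0.006 | −0.046 | 0.014 |
| High | log‐linear | 0.5 | −0.009 | 0.009 | −0.007 | 0.011 |
| | | 1.0 | 0.000 | 0.019 | 0.004 | 0.022 |
| | | 1.5 | 0.010 | 0.028 | 0.015 | 0.032 |
| | | 2.0 | 0.019 | 0.038 | 0.026 | 0.043 |
| | P‐spline | 0.5 | −0.014 | 0.021 | −0.014 | 0.041 |
| | | 1.0 | −0.007 | 0.028 | 0.001 | 0.057 |
| | | 1.5 | 0.003 | 0.038 | 0.008 | 0.063 |
| | | 2.0 | 0.005 | 0.041 | −0.017 | 0.039 |
| High baseline hazard | | | | | | |
| Low | log‐linear | 0.5 | 0.011 | 0.016 | 0.012 | 0.015 |
| | | 1.0 | 0.026 | 0.031 | 0.027 | 0.029 |
| | | 1.5 | 0.042 | 0.047 | 0.041 | 0.044 |
| | | 2.0 | 0.058 | 0.063 | 0.056 | 0.059 |
| | P‐spline | 0.5 | 0.010 | 0.017 | 0.010 | 0.018 |
| | | 1.0 | 0.026 | 0.033 | 0.027 | 0.036 |
| | | 1.5 | 0.042 | 0.050 | 0.042 | 0.050 |
| | | 2.0 | 0.052 | 0.059 | 0.046 | 0.054 |
| High | log‐linear | 0.5 | 0.054 | 0.054 | 0.055 | 0.050 |
| | | 1.0 | 0.108 | 0.108 | 0.105 | 0.100 |
| | | 1.5 | 0.162 | 0.162 | 0.154 | 0.149 |
| | | 2.0 | 0.216 | 0.216 | 0.204 | 0.199 |
| | P‐spline | 0.5 | 0.052 | 0.054 | 0.053 | 0.057 |
| | | 1.0 | 0.108 | 0.109 | 0.108 | 0.112 |
| | | 1.5 | 0.162 | 0.164 | 0.154 | 0.159 |
| | | 2.0 | 0.210 | 0.211 | 0.190 | 0.194 |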

Note: The low baseline hazard scenario was calibrated to yield an average of 50 cases per 6‐month period on the placebo arm, while the high baseline hazard scenario was calibrated to yield 300 cases per 6‐month period. The frailty distribution had mean one and a variance of either one (low variance) or four (high variance).

Abbreviations: PH, proportional hazards; VE, vaccine efficacy.
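
To read the biases in Table B6 on the VE scale, one can add a reported bias to the true linear predictor log(1 − VE(s)) and back-transform. The Python sketch below does this for the constant 75% VE scenario. The function name `ve_from_biased_log_hr` is hypothetical, and the calculation treats the bias as a simple shift of the linear predictor, ignoring Monte Carlo error.

```python
import math

def ve_from_biased_log_hr(true_ve: float, bias: float) -> float:
    """Return the VE implied by adding `bias` to the true log(1 - VE)."""
    true_log_hr = math.log(1.0 - true_ve)   # linear predictor for VE
    biased_log_hr = true_log_hr + bias      # truth shifted by the reported bias
    return 1.0 - math.exp(biased_log_hr)    # back-transform to the VE scale

# Largest bias in Table B6: high baseline hazard, high frailty variance,
# log-linear model, placebo crossover, time 2.0 (bias 0.216).
implied_ve = ve_from_biased_log_hr(true_ve=0.75, bias=0.216)
print(f"Implied VE: {implied_ve:.3f}")
```

Under this reading, a positive bias in the linear predictor corresponds to understated efficacy: the 0.216 entry maps a true VE of 75% to roughly 69%.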

TABLE B7.

Bias of estimates of the linear predictor for VE and change in the linear predictor for VE in Cox PH models for trials simulated with VE waning from 85% to 35% linearly on the log hazard scale and gamma distributed frailties

Frailty variance | Model | Time | Placebo crossover: log(1 − VE(s)) | Placebo crossover: log(1 − VE(s)) − log(1 − VE(0)) | Parallel trial: log(1 − VE(s)) | Parallel trial: log(1 − VE(s)) − log(1 − VE(0))
Low baseline hazard
Low log‐linear 0.5 −0.016 0.005 −0.013 0.008
1.0 −0.010 0.011 −0.005 0.015
1.5 −0.005 0.016 0.002 0.023
2.0 0.001 0.022 0.010 0.030
P‐spline 0.5 −0.019 0.036 −0.020 0.058
1.0 −0.010 0.045 −0.006 0.071
1.5 −0.001 0.054 0.000 0.077
2.0 0.009 0.064 0.005 0.083
High log‐linear 0.5 −0.008 0.011 −0.004 0.013
1.0 0.003 0.022 0.009 0.025
1.5 0.014 0.033 0.022 0.038
2.0 0.026 0.044 0.034 0.051
P‐spline 0.5 −0.012 0.037 −0.014 0.059
1.0 0.005 0.054 0.011 0.083
1.5 0.020 0.069 0.022 0.095
2.0 0.030 0.079 0.022 0.095
High baseline hazard
Low log‐linear 0.5 0.011 0.012 0.015 0.010
1.0 0.023 0.023 0.025 0.021
1.5 0.035 0.035 0.035 0.031
2.0 0.046 0.047 0.046 0.041
P‐spline 0.5 0.011 0.021 0.011 0.022
1.0 0.027 0.037 0.029 0.040
1.5 0.038 0.048 0.038 0.049
2.0 0.041 0.051 0.038 0.049
High log‐linear 0.5 0.056 0.038 0.063 0.033
1.0 0.095 0.077 0.096 0.066
1.5 0.133 0.115 0.128 0.099
2.0 0.172 0.154 0.161 0.131
P‐spline 0.5 0.056 0.059 0.057 0.063
1.0 0.108 0.110 0.109 0.114
1.5 0.142 0.145 0.135 0.141
2.0 0.151 0.153 0.136 0.141

Note: The low baseline hazard scenario was calibrated to yield an average of 50 cases per 6‐month period on the placebo arm, while the high baseline hazard scenario was calibrated to yield 300 cases per 6‐month period. The frailty distribution had mean one and a variance of either one (low variance) or four (high variance).

Abbreviations: PH, proportional hazards; VE, vaccine efficacy.
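
The waning scenario in Table B7 has VE falling from 85% to 35% linearly on the log hazard scale. A minimal Python sketch of such a profile follows, assuming the decline takes place over 1.5 years as in the waning scenario of the preceding frailty table. The function name and its default arguments are hypothetical, and the profile is deliberately left undefined beyond 1.5 years because this section does not specify it.

```python
import math

def waning_log_hr(s: float, ve_start: float = 0.85, ve_end: float = 0.35,
                  duration: float = 1.5) -> float:
    """log(1 - VE(s)) for VE waning linearly on the log hazard scale."""
    if not 0.0 <= s <= duration:
        raise ValueError("profile is only specified here for 0 <= s <= duration")
    start = math.log(1.0 - ve_start)   # linear predictor at vaccination
    end = math.log(1.0 - ve_end)       # linear predictor at the end of waning
    return start + (end - start) * s / duration

for s in (0.0, 0.5, 1.0, 1.5):
    log_hr = waning_log_hr(s)
    print(f"s = {s:.1f}: log(1 - VE) = {log_hr:.3f}, VE = {1 - math.exp(log_hr):.3f}")
```

Because the interpolation is linear in log(1 − VE(s)) rather than in VE(s), the decline in VE itself is slow at first and accelerates toward the end of the waning period.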

Fintzi J, Follmann D. Assessing vaccine durability in randomized trials following placebo crossover. Statistics in Medicine. 2021;40:5983–6007. doi: 10.1002/sim.9001

Abbreviations: HR, hazard ratio; PH, proportional hazards; VAED, vaccine associated enhanced disease; VE, vaccine efficacy.

REFERENCES

  • 1. Poland GA, Ovsyannikova IG, Kennedy RB. SARS‐CoV‐2 immunity: review and applications to phase 3 vaccine candidates. The Lancet. 2020.
  • 2. Choe PG, Kang CK, Suh HJ, et al. Waning antibody responses in asymptomatic and symptomatic SARS‐CoV‐2 infection. Emerg Infect Dis. 2020;27:327‐329.
  • 3. Moderna. A phase 3, randomized, stratified, observer‐blind, placebo‐controlled study to evaluate the efficacy, safety, and immunogenicity of mRNA‐1273 SARS‐CoV‐2 vaccine in adults aged 18 years and older; 2020. https://www.modernatx.com/sites/default/files/mRNA‐1273‐P301‐Protocol.pdf. Accessed November 17, 2020.
  • 4. World Health Organization. Placebo‐controlled trials of covid‐19 vaccines–why we still need them. N Engl J Med. 2020.
  • 5. Wendler D, Ochoa J, Millum J, Grady C, Taylor HA. COVID‐19 vaccine trial ethics once we have efficacious vaccines. Science. 2020.
  • 6. Sridhar S, Luedtke A, Langevin E, et al. Effect of dengue serostatus on dengue vaccine safety and efficacy. N Engl J Med. 2018;379(4):327‐340.
  • 7. Follmann DA, Fintzi J. Assessing durability of vaccine effect following blinded crossover in COVID‐19 vaccine efficacy trials. medRxiv; 2020.
  • 8. Nason M, Follmann D. Design and analysis of crossover trials for absorbing binary endpoints. Biometrics. 2010;66(3):958‐965.
  • 9. Makubate B, Senn S. Planning and analysis of cross‐over trials in infertility. Stat Med. 2010;29:3203‐3210.
  • 10. Follmann D. Augmented designs to assess immune response in vaccine trials. Biometrics. 2006;62(4):1161‐1169.
  • 11. Henao‐Restrepo AM, Camacho A, Longini IM, et al. Efficacy and effectiveness of an rVSV‐vectored vaccine in preventing Ebola virus disease: final results from the Guinea ring vaccination, open‐label, cluster‐randomised trial (Ebola Ça Suffit!). The Lancet. 2017;389(10068):505‐518.
  • 12. Cox DR. Regression models and life‐tables. J Royal Stat Soc Ser B (Methodol). 1972;34(2):187‐202.
  • 13. Therneau TM, Grambsch PM. Modeling Survival Data: Extending the Cox Model. Berlin, Germany: Springer Science & Business Media; 2013.
  • 14. Eilers PHC, Marx BD. Flexible smoothing with B‐splines and penalties. Stat Sci. 1996;11:89‐102.
  • 15. Wood SN. Generalized Additive Models: An Introduction with R. Boca Raton, FL: CRC Press; 2017.
  • 16. Perperoglou A, Sauerbrei W, Abrahamowicz M, Schmid M. A review of spline function procedures in R. BMC Med Res Methodol. 2019;19:1‐16.
  • 17. Therneau T, Crowson C, Atkinson E. Using time dependent covariates and time dependent coefficients in the Cox model. Survival Vignettes. 2017;1‐17.
  • 18. Therneau TM. A package for survival analysis in R; 2020. https://CRAN.R‐project.org/package=survival. R package version 3.2‐7.
  • 19. Lipsitch M. Challenges of vaccine effectiveness and waning studies. Clin Infect Dis. 2019;68:1631‐1633.
  • 20. Durham LK, Longini IM Jr, Halloran ME, Clemens JD, Azhar N, Rao M. Estimation of vaccine efficacy in the presence of waning: application to cholera vaccines. Amer J Epidemiol. 1998;147(10):948‐959.
  • 21. Aalen OO, Cook RJ, Røysland K. Does Cox analysis of a randomized survival study yield a causal treatment effect? Lifetime Data Anal. 2015;21:579‐593.
  • 22. Kanaan MN, Farrington CP. Estimation of waning vaccine efficacy. J Amer Stat Assoc. 2002;97(458):389‐397.
  • 23. Balan TA, Putter H. A tutorial on frailty models. Stat Methods Med Res. 2020;29:3424‐3454.
  • 24. Lin D, Zeng D, Gilbert P. Evaluating the long‐term efficacy of COVID‐19 vaccines. medRxiv; 2021.
  • 25. Tsiatis AA, Davidian M. Estimating vaccine efficacy over time after a randomized study is unblinded; 2021. arXiv preprint arXiv:2102.13103.

Articles from Statistics in Medicine are provided here courtesy of Wiley
