Philosophical transactions. Series A, Mathematical, physical, and engineering sciences. 2008 Apr 11;366(1874):2347–2360. doi: 10.1098/rsta.2008.0044

Estimation of vaccine efficacy in a repeated measures study under heterogeneity of exposure or susceptibility to infection

Clarissa Valim 1,2,3,4,*, Maura Mezzetti 5, James Maguire 4, Margarita Urdaneta 6, David Wypij 3,7
PMCID: PMC3227149  PMID: 18407892

Abstract

Vaccine efficacy (VE) is commonly estimated through proportional hazards modelling of the time to first infection or disease, even when the event of interest can recur. These methods can result in biased estimates when VE is heterogeneous across levels of exposure and susceptibility in subjects. These two factors are important sources of unmeasured heterogeneity, since they vary within and across areas, and often cannot be individually quantified. We propose an estimator of VE per exposure that accounts for heterogeneous susceptibility and exposure for a repeated measures study with binary recurrent outcomes. The estimator requires only information about the probability distribution of environmental exposures. Through simulation studies, we compare the properties of this estimator with proportional hazards estimation under the heterogeneity of exposure. The methods are applied to a reanalysis of a malaria vaccine trial in Brazil.

Keywords: vaccine, efficacy, generalized linear models, longitudinal analyses, generalized estimating equation, heterogeneity

1. Introduction

Vaccine efficacy (VE) is defined as the per cent reduction in the probability or hazard of disease conferred by the vaccine to an individual. It is typically estimated based on a marginal, or population-based, parameter, which is an average of the individual vaccine effects, specific to a geographically and temporally defined population (Halloran et al. 1991). Commonly, estimates of VE are based on one minus the hazard ratio from a proportional hazards model for time to first event, which can be an infection or disease, even when the disease under study recurs (Alonso et al. 1994; Bojang et al. 2001; Aponte et al. 2007). The use of the proportional hazards model in VE studies is widespread owing to its ease of implementation, interpretation and flexibility regarding the shape of the baseline hazard over time. An important advantage of proportional hazards models is that, in balanced randomized trials and under the proportional hazards assumption, the population VE represents the individual VE (Halloran et al. 1994; Gilbert et al. 1998), which is the causal parameter of interest. This individual VE in trials represents the experimental or biological effect of the vaccine and is useful in selecting or comparing the vaccine candidates. The population VE for the same vaccine could vary for different studies and does not allow an assessment of the efficacy of the vaccine itself.

Under homogeneity of risk factors of infection or disease in the population or of the VE itself, proportionality of the hazards may hold true (Vaupel et al. 1979; Aalen 1988; Hougaard 1995). However, the assumption of homogeneity is often unrealistic because fundamental risk factors for infectious diseases, such as individual exposure levels and susceptibilities to the pathogen pre-vaccination, are likely to be heterogeneous in the population (Struchiner et al. 1994; Halloran et al. 1995). In this paper we will focus on the heterogeneity due to exposure, assuming that the heterogeneous susceptibilities can be minimized in the design or analysis, e.g. by using covariates. Examples of covariates to model heterogeneous susceptibility include age or other markers of previous exposure history, such as time living in the area (Baird et al. 1993).

Exposure intensity varies greatly within and across populations and cannot always be reasonably and accurately quantified through covariates. The intensity of exposure can vary across individual behavioural characteristics such as occupation or sexual behaviour and is likely to fluctuate within geographical regions due to environmental factors (Smith et al. 1995). Under heterogeneity, proportional hazards analysis of time to first event can underestimate the individual VE, owing to unmeasured covariates representing prognostic factors for survival or random heterogeneity (Gail et al. 1984; Struthers & Kalbfleisch 1986; Schumacher et al. 1987; Aalen 1988, 1994; Chastang et al. 1988; Lin & Wei 1989; Omori & Johnson 1993; Schmoor & Schumacher 1997; Henderson & Oman 1999). Basically, unmeasured heterogeneity affects the comparability between vaccinees and non-vaccinees achieved by randomization in the beginning of the study. Intuitively, if the vaccine is effective, higher risk (more exposed) unvaccinated subjects fail faster than vaccinated subjects and are removed from the risk set.

Population heterogeneities in exposure intensities can be modelled using random effects survival methods, including Cox models with random effects (Hougaard 1995; Halloran et al. 1996; Longini & Halloran 1996). When infection or disease does not confer long-lasting immunity and may recur, multivariate survival methods may be used (Hougaard 2000; Therneau & Grambsch 2000). Multiple events may be analysed with Poisson regression or using a more flexible approach, such as the marginal model originally proposed by Anderson & Gill (1982) with variances adjusted for correlations within subjects using robust variance estimates (Lin & Wei 1989). Analyses of multiple events using the Anderson–Gill model are expected to correct for bias due to unmeasured heterogeneity: because subjects remain in the risk set until the end of the study, vaccinated and unvaccinated subjects remain comparable. However, continuous time survival analysis methods typically require the exact failure times, whereas in practice we can often only place a subject's outcome in the time interval between consecutive visits (active case detection). Even when studies combine active case detection with passive case detection (detection of the event whenever people seek care), many cases are found through active detection and have interval-censored event times. Under those circumstances, although the exact failure time could be approximated, a repeated measures analysis for a binary outcome represents a practical alternative to handle the sequential monitoring of subjects, interval-censored observations, recurrent episodes and non-proportional hazards.

In this paper, we present an estimator for VE given one exposure contact in a repeated measures analysis with a generalized estimating equations (GEE) approach, accounting for heterogeneous exposure and susceptibility pre-vaccination. Although the proposed estimator may be applied to different vaccines, we focus our model on estimating efficacy for malaria vaccines. Malaria is a leading cause of child mortality in developing countries (WHO 1995) and is transmitted by a mosquito vector. The transmission intensity of malaria is highly variable across and within regions, and highly seasonal (Smith et al. 1995). Immunity to malaria is partial and does not prevent recurrences of infection or disease. Currently there are several vaccines in the preclinical test phase (Aide et al. 2007). For some malaria vaccines, recent randomized trials have been performed in which the VE is primarily estimated based on the proportional hazards modelling of time to first event (Alonso et al. 2005; Bejon et al. 2006, 2007). It is not feasible to record exposure to infected mosquito bites received by each trial subject, and so exposure information typically relies on mosquito collections in samples of the study area. Case detection methods rely on both passive case detection, when individuals seek medical treatment, and active case detection, when houses are visited and individuals are examined at regular time intervals.

Section 2 presents our general model and estimator of VE. Under some simplifying assumptions, we propose estimation procedures using standard commercial software. In §3 we show that, with heterogeneity of exposure to infection, VE estimates based on the proportional hazards modelling of time to first event yield biased estimates of the individual VE. We also use simulation studies to compare our repeated measures estimator of VE with the proportional hazards model estimator of VE when the assumption of proportional hazards is violated by heterogeneous exposure. In §4, we apply the proposed model to a reanalysis of the data from a malaria SPf66 vaccine trial carried out in Brazil.

2. Model description

In a VE study, subjects i, for i=1, …, n, with different susceptibility status are randomized to vaccination (Vi=1) or placebo (Vi=0) at the beginning of the study period. The outcome status (Dij) of each subject (binary) is subsequently recorded at specific time intervals, j. For vaccines targeting recurrent events, after an event, a subject treated can re-enter the risk set and present the outcome again. The probability of presenting the outcome at each interval j depends on the specific amount of exposures received (Wij), which arrive with intensity λij. Although Wij cannot be measured, λij may be estimated in a separate ecological study or indirectly, allowing one to determine the empirical probability distribution of exposure. In this scenario, an estimator of VE can be defined based on the likelihood of

$$\Pr\{d_{ij} \mid v_i; \lambda_{ij}\} = \sum_{w_{ij}} \Pr\{d_{ij} \mid v_i, w_{ij}\}\,\Pr\{w_{ij}; \lambda_{ij}\},$$

where $\Pr\{w_{ij}; \lambda_{ij}\}$ is the probability distribution of exposure contacts for subject $i$ in time interval $j$. Define a subject's susceptibility status, or the probability of presenting the outcome (baseline probability) in one exposure contact, as $p_{ij} = f(\alpha_{ij}^{T} x_{ij})$, where $x_{ij}$ are covariates, possibly time varying. Assuming that vaccination reduces susceptibility by a multiplicative factor $g(\cdot)$, independent of the susceptibility $p_{ij}$ across exposures, we can define the probability of presenting the outcome given $W_{ij}$ exposures as

$$\Pr\{d_{ij} = 1 \mid v_i, w_{ij}\} = 1 - \left[1 - p_{ij}\, g(\cdot)\right]^{w_{ij}}.$$

The reduction in the probability of disease given one exposure conferred by the vaccine can be defined as $g(\beta_i v_i + \delta_{ij}^{T} x_{ij} v_i)$, allowing the VE to vary across subjects and to interact with covariates.

Different functions can be chosen to model f, g and λ and different probability distributions to model Pr{wij;λij}. If f and g are exponential functions (log link model), we can write the expression for the probability of disease for the ith subject at the jth interval given vaccination status as

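$$\Pr\{d_{ij}=1 \mid v_i, x_{ij}; \alpha, \beta_i, \delta, \lambda\} = \sum_{w_{ij}} \left\{1 - \left[1 - \exp(\alpha_{ij}^{T} x_{ij} + \beta_i v_i + \delta_{ij}^{T} x_{ij} v_i)\right]^{w_{ij}}\right\} \Pr\{w_{ij}; \lambda_{ij}\}. \tag{2.1}$$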

The resulting contribution to the likelihood of the ith subject at the jth time interval becomes

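$$L_{ij} = \sum_{w_{ij}} \left\{1 - \left[1 - \exp(\alpha_{ij}^{T} x_{ij} + \beta_i v_i + \delta_{ij}^{T} x_{ij} v_i)\right]^{w_{ij}}\right\}^{d_{ij}} \left\{\left[1 - \exp(\alpha_{ij}^{T} x_{ij} + \beta_i v_i + \delta_{ij}^{T} x_{ij} v_i)\right]^{w_{ij}}\right\}^{1-d_{ij}} \Pr\{w_{ij}; \lambda_{ij}\}.$$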

Several simplifying assumptions can be made. For instance, if we let $W_{ij}$ be independently Poisson($\lambda_{ij}$) distributed and $s = 1 - \exp(\alpha_{ij}^{T} x_{ij} + \beta_i v_i + \delta_{ij}^{T} x_{ij} v_i)$, then

$$\Pr\{d_{ij}=1 \mid v_i, x_{ij}; \alpha, \beta, \delta, \lambda\} = 1 - E_W[s^{W}],$$

where $E_W[s^{W}]$ is the probability generating function of a Poisson random variable, $E_W[s^{W}] = \exp\{-\lambda_{ij}(1-s)\}$. Algebraic manipulation reduces equation (2.1) to

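$$\ln\left(-\ln\left(1 - \Pr\{d_{ij}=1 \mid v_i, x_{ij}; \alpha, \beta_i, \delta, \lambda\}\right)\right) = \log(\lambda_{ij}) + \alpha_{ij}^{T} x_{ij} + \beta_i v_i + \delta_{ij}^{T} x_{ij} v_i.$$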

Thus, the probability of disease for the ith subject at the jth interval is linked to the parameters by a complementary log–log link function, allowing the estimation of VE through standard statistical software. In this model, the probability distribution of exposure contacts may also be allowed to vary over time and across covariate levels by $\lambda_{ij} = \lambda(\gamma_{ij}^{T} z_{ij})$. For instance, in the case of the malaria vaccine, the intensity of exposure can be estimated based on an ecological study of the number of infectious bites received by subjects at different months (or seasons) of the year and whether or not the subject uses a bed net.
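The reduction of equation (2.1) to a complementary log–log form can be checked numerically. The sketch below is a minimal illustration, not the authors' code: the function names and the example values of the linear predictor and exposure intensity are assumptions chosen for the check. It sums equation (2.1) over a truncated Poisson distribution and compares the result with the closed form implied by the Poisson probability generating function.

```python
import math

def prob_event_by_sum(eta, lam, w_max=200):
    """Equation (2.1) by direct summation over a truncated Poisson(lam).

    eta is the linear predictor (must be <= 0 so that exp(eta) is a probability).
    """
    s = 1.0 - math.exp(eta)      # probability of escaping the event in one exposure
    pmf = math.exp(-lam)         # Poisson probability of w = 0 exposures
    total = 0.0
    for w in range(w_max + 1):
        total += (1.0 - s ** w) * pmf
        pmf *= lam / (w + 1)     # recurrence: Poisson probability of w + 1
    return total

def prob_event_cloglog(eta, lam):
    """Closed form 1 - exp(-lam * exp(eta)), i.e. cloglog(P) = log(lam) + eta."""
    return 1.0 - math.exp(-lam * math.exp(eta))

# Hypothetical values: 0.2% event probability per exposure, 30 exposures per interval.
eta, lam = math.log(0.002), 30.0
p_sum = prob_event_by_sum(eta, lam)
p_closed = prob_event_cloglog(eta, lam)
print(p_sum, p_closed)
print(math.log(-math.log(1.0 - p_closed)), math.log(lam) + eta)
```

The two probabilities agree up to truncation and rounding error, and the complementary log–log transform of the probability equals the log exposure intensity plus the linear predictor, which is why the log intensity can be supplied to standard software as a fixed offset.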

In the above models, the individual VE or VE in one exposure contact varies across subjects and intervals and can be expressed by

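$$VE_{W_{ij}=1} = 1 - \frac{\Pr\{d_{ij}=1 \mid v_i=1, w_{ij}=1, x_{ij}; \alpha, \beta, \delta, \lambda\}}{\Pr\{d_{ij}=1 \mid v_i=0, w_{ij}=1, x_{ij}; \alpha, \beta, \delta, \lambda\}} = 1 - \exp(\beta + \delta_{ij}^{T} x_{ij}).$$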

In the absence of interaction between the vaccine and covariates (δ=0), the VE given one exposure contact is homogeneous across subjects, even when the baseline probability of an event or susceptibility in one exposure contact is varying. VE varies in the range (0, 1] when the vaccine is beneficial.

The population VE or VE marginal on exposure can be expressed as

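$$
\begin{aligned}
VE &= 1 - \frac{\Pr\{d_{ij}=1 \mid v_i=1, x_{ij}; \alpha, \beta, \delta, \lambda\}}{\Pr\{d_{ij}=1 \mid v_i=0, x_{ij}; \alpha, \beta, \delta, \lambda\}} \\
&= 1 - \frac{\sum_{w_{ij}} \Pr\{d_{ij}=1 \mid v_i=1, w_{ij}, x_{ij}; \alpha, \beta, \delta, \lambda\}\,\Pr\{w_{ij}; \lambda_{ij}\}}{\sum_{w_{ij}} \Pr\{d_{ij}=1 \mid v_i=0, w_{ij}, x_{ij}; \alpha, \beta, \delta, \lambda\}\,\Pr\{w_{ij}; \lambda_{ij}\}} \\
&= 1 - \frac{1 - \exp\{-\lambda_{ij} \exp\{\alpha_{ij}^{T} x_{ij} + \beta_i + \delta_{ij}^{T} x_{ij}\}\}}{1 - \exp\{-\lambda_{ij} \exp\{\alpha_{ij}^{T} x_{ij}\}\}}.
\end{aligned}
$$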
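The contrast between the VE per exposure and the population VE can be made concrete with a few lines of code. This is a minimal sketch under the Poisson exposure assumption; the function names and numerical values are illustrative assumptions, not quantities from the paper.

```python
import math

def ve_per_exposure(beta, delta_x=0.0):
    """Individual VE given one exposure contact: 1 - exp(beta + delta'x)."""
    return 1.0 - math.exp(beta + delta_x)

def ve_population(alpha_x, beta, lam, delta_x=0.0):
    """Population VE marginal on Poisson(lam) exposure.

    alpha_x is the baseline linear predictor alpha'x (log probability per exposure).
    """
    p_vaccinated = 1.0 - math.exp(-lam * math.exp(alpha_x + beta + delta_x))
    p_unvaccinated = 1.0 - math.exp(-lam * math.exp(alpha_x))
    return 1.0 - p_vaccinated / p_unvaccinated

# Hypothetical inputs: 1% baseline probability per exposure, 50% VE per exposure.
alpha_x, beta = math.log(0.01), math.log(0.5)
print(ve_per_exposure(beta))
for lam in (1.0, 30.0, 300.0):
    print(lam, ve_population(alpha_x, beta, lam))
```

With these inputs the population VE stays below the VE per exposure and falls as the exposure intensity rises, which illustrates why the population parameter depends on the transmission setting while the per-exposure parameter does not.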

Straightforward extensions of this model could involve random effects or Markov covariates (Diggle et al. 1994). The appropriateness of each of these approaches is related to the existence of heterogeneity other than that caused by exposure, i.e. heterogeneity in susceptibility not measured by the modelled covariates and heterogeneity in the VE per exposure. Alternative approaches will lead to different interpretation of the VE in one exposure.

Throughout this paper, we will model the marginal probability of disease given exposure. The marginal models assume that event history during the study period does not affect the susceptibility per exposure. To model the repeated measures, we will use GEE methods (Liang & Zeger 1986; Zeger & Liang 1986). When there is no additional random heterogeneity, 1−exp(β) represents the individual VE per exposure.

3. Assessing bias of our repeated measures model and the proportional hazards model under heterogeneous exposure intensities

In this section we show that VE based on one minus the hazard ratio of the first episode is a biased estimate of the VE per exposure (the individual VE or causal parameter of interest), when the assumption of proportional hazards is violated due to heterogeneous exposures to infection. We studied two scenarios of heterogeneous exposure: the first generated by the mixture of two Poisson distributions and the second by a continuous mixture of Poisson distributions. Using simulations, we also compared VE estimated through our repeated measures model to that estimated through proportional hazards under a heterogeneous and continuous intensity of exposure.

In the first exposure heterogeneity scenario, the population was assumed to be subdivided into two groups ($X_i = 0/1$). We allowed these two groups to receive exposures according to Poisson distributions with means $\lambda_0$ and $\lambda_1$, respectively. Under these assumptions, the interarrival times of each disease episode given the number of exposures received in the interval followed an exponential distribution, homogeneous across time, with mean $1/(\lambda_0\, p\, g(\beta v_i))$ when $X_i = 0$ and $1/(\lambda_1\, p\, g(\beta v_i))$ when $X_i = 1$, where $p$ is the baseline probability that one exposure causes an event and $g(\beta v_i)$ represents the reduction in the probability of an event in one exposure contact conferred by the vaccine. The proportion of individuals in each of the two groups varied from 5 to 95%. Half of the subjects in each group were vaccinated with a vaccine with individual efficacy of 50%, to mimic a perfectly balanced randomized trial. The resulting distribution of exposure contacts was thus a mixture of two Poisson distributions with mean and variance equal to

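$$E(W_{ij}) = \lambda = \Pr(X_i=0)\,\lambda_0 + \Pr(X_i=1)\,\lambda_1,$$

$$V(W_{ij}) = \Pr(X_i=0)\left[\lambda_0 + (\lambda_0 - \lambda)^2\right] + \Pr(X_i=1)\left[\lambda_1 + (\lambda_1 - \lambda)^2\right].$$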
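As a worked check of these moment formulas, the sketch below evaluates them for the two Poisson means given in the caption of figure 1 (12 and 2) at the five mixing probabilities listed there. The function name is an assumption, and which of the two means corresponds to $X_i = 0$ is not stated in the text, so the ordering used here is also an assumption.

```python
def mixture_moments(weights, means):
    """Mean and variance of a finite mixture of Poisson distributions."""
    mean = sum(w * m for w, m in zip(weights, means))
    # Each component contributes its own Poisson variance (equal to its mean)
    # plus the squared distance of its mean from the overall mean.
    var = sum(w * (m + (m - mean) ** 2) for w, m in zip(weights, means))
    return mean, var

for prob_low in (0.1, 0.3, 0.5, 0.7, 0.9):
    mean, var = mixture_moments((1.0 - prob_low, prob_low), (12.0, 2.0))
    print(prob_low, round(mean, 6), round(var, 6))
```

The resulting pairs of means and variances, (11, 20), (9, 30), (7, 32), (5, 26) and (3, 12), are the same values the figure 1(b) caption lists for the matched negative binomial distributions.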

When the exposure was based on a two-point distribution (figure 1a), the hazard was plotted based on

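$$
\begin{aligned}
h(t_i \mid v_i; \theta, \lambda, p) &= \frac{\pi f(t_i \mid v_i; p_1, \theta, \lambda) + (1-\pi) f(t_i \mid v_i; p_2, \theta, \lambda)}{\pi S(t_i \mid v_i; p_1, \theta, \lambda) + (1-\pi) S(t_i \mid v_i; p_2, \theta, \lambda)} \\
&= \frac{\pi \lambda p_1 \theta^{v_i} \exp(-\lambda p_1 \theta^{v_i} t_i) + (1-\pi) \lambda p_2 \theta^{v_i} \exp(-\lambda p_2 \theta^{v_i} t_i)}{\pi \exp(-\lambda p_1 \theta^{v_i} t_i) + (1-\pi) \exp(-\lambda p_2 \theta^{v_i} t_i)}.
\end{aligned}
$$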
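The two-point mixture hazard can be evaluated directly to see how one minus the hazard ratio drifts away from the VE per exposure. The sketch below is an illustration only: the function names, the time unit (days) and all parameter values are assumptions, with θ = 0.5 corresponding to an individual VE of 50%.

```python
import math

def mixture_hazard(t, v, pi, lam, p1, p2, theta):
    """Hazard of first event at time t under a two-point mixture.

    v is vaccination status (0 or 1); theta ** v scales the event rate.
    """
    r1 = lam * p1 * theta ** v   # event rate in the first subgroup
    r2 = lam * p2 * theta ** v   # event rate in the second subgroup
    density = pi * r1 * math.exp(-r1 * t) + (1.0 - pi) * r2 * math.exp(-r2 * t)
    survival = pi * math.exp(-r1 * t) + (1.0 - pi) * math.exp(-r2 * t)
    return density / survival

def ve_from_hazards(t, **params):
    """Population VE at time t: one minus the vaccinated/unvaccinated hazard ratio."""
    return 1.0 - mixture_hazard(t, 1, **params) / mixture_hazard(t, 0, **params)

# Hypothetical parameter values, time in days.
params = dict(pi=0.5, lam=0.25, p1=0.02, p2=0.002, theta=0.5)
for t in (0.0, 180.0, 360.0, 720.0):
    print(t, ve_from_hazards(t, **params))
```

At time zero the hazard ratio equals θ, so the population VE equals the individual VE; at later times the higher-risk unvaccinated subjects are depleted faster and the hazard-based VE falls below 50%.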

Figure 1.


Population VE based on the hazard ratio comparing vaccinated with unvaccinated subjects as a function of time and individual VE of 50%. (a) Exposure was generated assuming a mixing distribution of two Poisson distributions with means λ1=12 and λ2=2, with the probability of occurrence of λ2 from 0.1 to 0.9, as specified. Triple dot-dashed line, 0.1; dashed line, 0.3; dot-dashed line, 0.5; dotted line, 0.7; solid line, 0.9. (b) Exposure was generated assuming a continuous mixture of Poisson distributions (negative binomial) with different means and variances (means and variances were chosen to mimic the mean and variance of the two point distribution). Triple dot-dashed line, mean=11 and variance=20; dashed line, mean=9 and variance=30; dot-dashed line, mean=7 and variance=32; dotted line, mean=5 and variance=26; solid line, mean=3 and variance=12.

In the second exposure heterogeneity scenario, we assumed that half of the subjects were vaccinated with a vaccine of individual efficacy of 50%. The intensity of exposure for each subject ($\lambda_i$) followed a gamma distribution with mean $\lambda$ and variance $\lambda/\gamma$, leading to an overdispersed Poisson (negative binomial) distribution for exposure with mean $\lambda$ and variance $\lambda\phi$, where $\phi = (\gamma+1)/\gamma$. The parameters $\lambda$ and $\gamma$ were chosen to match the mean and variance of the two-group scenario described above. The interarrival times of each disease episode given the number of exposures received in the interval, for subject $i$, were exponentially distributed with mean $1/(\lambda_i\, p\, g(\beta v_i))$.
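Matching the gamma mixing parameter to a target mean and variance follows from the relation variance = λ(γ+1)/γ. The helper below is a small sketch with an assumed function name; the example inputs are the mean and variance pairs listed in the caption of figure 1(b).

```python
def gamma_from_moments(mean, variance):
    """Solve variance = mean * (gamma + 1) / gamma for gamma.

    Requires variance > mean (overdispersion relative to the Poisson).
    """
    if variance <= mean:
        raise ValueError("variance must exceed the mean for a gamma mixture")
    return mean / (variance - mean)

for mean, variance in ((11, 20), (9, 30), (7, 32), (5, 26), (3, 12)):
    g = gamma_from_moments(mean, variance)
    # Gamma mixing distribution for lambda_i: shape = mean * g, rate = g.
    print(mean, variance, round(g, 4), round(mean * (g + 1) / g, 6))
```

For example, a mean of 7 and a variance of 32 give γ = 0.28, so the subject-level intensities would be drawn from a gamma distribution with shape 1.96 and rate 0.28.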

When the exposure was based on a mixture of Poisson distributions (negative binomial, figure 1b), the hazard was plotted based on

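$$
\begin{aligned}
h(t_i \mid v_i; \theta, \lambda, p) &= \frac{\int_0^\infty f(t_i \mid v_i; p, \theta, \lambda_i)\, f(\lambda_i)\, d\lambda_i}{\int_0^\infty S(t_i \mid v_i; p, \theta, \lambda_i)\, f(\lambda_i)\, d\lambda_i} \\
&= \frac{\int_0^\infty \theta^{v_i} p \lambda_i \exp(-\theta^{v_i} p \lambda_i t_i)\, \left[\gamma^{\lambda\gamma}/\Gamma(\lambda\gamma)\right] \lambda_i^{\lambda\gamma - 1} \exp(-\gamma \lambda_i)\, d\lambda_i}{\int_0^\infty \exp(-\theta^{v_i} p \lambda_i t_i)\, \left[\gamma^{\lambda\gamma}/\Gamma(\lambda\gamma)\right] \lambda_i^{\lambda\gamma - 1} \exp(-\gamma \lambda_i)\, d\lambda_i} \\
&= \frac{p\, \theta^{v_i} \lambda \gamma}{\gamma + p\, \theta^{v_i} t_i}.
\end{aligned}
$$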

Figure 1 plots the VE over time based on hazard ratios under these two scenarios. In a study with one year follow-up time, the population VE is always lower than the biological VE or VE per exposure (50%). After two years, VE based on hazard ratios could substantially underestimate the true (and constant over time) effect of the vaccine. The difference between the individual and the population vaccine efficacies is higher when most subjects have higher intensity of exposure (λ) or when the heterogeneity is large. The hazard of the mixing distribution generated by a binary intensity of exposure is similar to that generated by a continuous intensity of exposure, except when the mean of the mixing distribution is low. Similar numerical results were found by Schmoor & Schumacher (1997).
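For the gamma mixing case the closed-form hazard makes the time trend easy to reproduce. The sketch below uses assumed function names and hypothetical parameter values, with time measured in 30-day intervals. It checks the closed form against a numerical derivative of the log survival function and then evaluates one minus the hazard ratio at several follow-up times.

```python
import math

def survival_gamma(t, v, p, theta, lam, gamma):
    """Marginal survival to first event under gamma-distributed exposure intensity."""
    return (gamma / (gamma + p * theta ** v * t)) ** (lam * gamma)

def hazard_gamma(t, v, p, theta, lam, gamma):
    """Closed-form marginal hazard: p * theta^v * lam * gamma / (gamma + p * theta^v * t)."""
    c = p * theta ** v
    return c * lam * gamma / (gamma + c * t)

def ve_hazard_ratio(t, p, theta, lam, gamma):
    """Population VE at time t: one minus the ratio of marginal hazards."""
    return 1.0 - hazard_gamma(t, 1, p, theta, lam, gamma) / hazard_gamma(t, 0, p, theta, lam, gamma)

# Hypothetical values: mean 7 exposures per interval, exposure variance 32.
p, theta, lam, gamma = 0.0035, 0.5, 7.0, 0.28

# Numerical check of the closed form: hazard = -d/dt log S(t).
t, h = 10.0, 1e-5
numeric = -(math.log(survival_gamma(t + h, 0, p, theta, lam, gamma))
            - math.log(survival_gamma(t - h, 0, p, theta, lam, gamma))) / (2 * h)
print(numeric, hazard_gamma(t, 0, p, theta, lam, gamma))

for months in (0, 12, 24):
    print(months, ve_hazard_ratio(months, p, theta, lam, gamma))
```

Under these assumed values the hazard-based VE starts at 50% and declines steadily with follow-up, in line with the qualitative pattern described for figure 1.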

For both exposure scenarios, we generated datasets of 250 simulations with 1200 study subjects. Subjects were followed for 720 days and then censored (type I censoring only). For the repeated measures analysis, subjects were allowed to re-enter the study after each failure. We assigned vaccine to 50% of the study subjects and then sampled exposure using the corresponding distribution. The probability of infection per bite, p, was chosen to average a cumulative probability of disease over the two years in unvaccinated subjects of approximately 40% and VE was chosen to be 50%.

In all simulations, time was subdivided into intervals as if the subjects had been observed every 30 days. A binary outcome random variable, Dij, was created, equal to 1 if the ith subject developed the outcome during the corresponding time period j. With these data, we fitted four models: proportional hazards models for time to first event; frailty (random effects) survival models for time to first event, with Gaussian frailty; our repeated measures model for all events, using a complementary log–log link function and GEE; and a marginal multiple events survival model for continuous time to event, i.e. the Anderson–Gill model. All simulations and analyses were performed in Splus v. 8.0 (Insightful Corporation, Seattle). Estimation for the repeated measures model was implemented using the complementary log–log link and GEEs, assuming an independence working covariance matrix, using the gee function from the correlatedData library. Estimation for all survival models was implemented using the coxph function, with the cluster function for the Anderson–Gill model and with the frailty function for the random effects survival model. Wald CIs were calculated based on the estimated standard errors.
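The data-generating step for the continuous mixture scenario can be sketched in a few lines. The code below is not the Splus code used in the paper and does not fit a Cox model or call a GEE routine. It relies on two simplifications, both assumptions of this sketch: with vaccination as the only covariate, the complementary log–log model is saturated, so the independence-working-correlation point estimate of VE can be computed from the observed per-interval event proportions; and the first-event comparator is a constant-hazard discrete-time analysis rather than a proportional hazards fit. All parameter values are hypothetical.

```python
import math
import random

def simulate_trial(n=1200, n_intervals=24, lam=7.0, gamma=0.28,
                   p=0.0035, theta=0.5, seed=0):
    """Simulate one trial and return per-arm counts.

    counts[v] = [event intervals, all intervals,
                 first events, intervals at risk of a first event]
    """
    rng = random.Random(seed)
    counts = {0: [0, 0, 0, 0], 1: [0, 0, 0, 0]}
    for i in range(n):
        v = i % 2  # alternate arms for a balanced allocation
        # Subject-level exposure intensity: gamma with shape lam*gamma, scale 1/gamma.
        lam_i = rng.gammavariate(lam * gamma, 1.0 / gamma)
        # Thinning a Poisson(lam_i) number of exposures with per-exposure event
        # probability p * theta**v gives Pr{D_ij = 1} = 1 - exp(-lam_i * p * theta**v).
        prob = 1.0 - math.exp(-lam_i * p * theta ** v)
        had_event = False
        for _ in range(n_intervals):
            d = rng.random() < prob
            counts[v][0] += d
            counts[v][1] += 1
            if not had_event:          # subject still at risk of a first event
                counts[v][2] += d
                counts[v][3] += 1
                had_event = d
    return counts

def ve_cloglog(events1, total1, events0, total0):
    """1 - exp(beta_hat) from a saturated complementary log-log model."""
    return 1.0 - math.log(1.0 - events1 / total1) / math.log(1.0 - events0 / total0)

counts = simulate_trial()
ve_repeated = ve_cloglog(counts[1][0], counts[1][1], counts[0][0], counts[0][1])
ve_first = ve_cloglog(counts[1][2], counts[1][3], counts[0][2], counts[0][3])
print(f"repeated measures VE: {ve_repeated:.3f}; first-event VE: {ve_first:.3f}")
```

Because the exposure intensity is constant across arms in this sketch, the log intensity offset cancels from the vaccine contrast and does not need to be supplied. A single simulated trial of 1200 subjects is noisy, so any comparison of the two estimates should average over many replicates, as the paper does with 250 simulations.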

When heterogeneity in the intensity of exposure varied continuously, the proportional hazards VE estimated without or with a random effect was biased. Our repeated measures estimator of individual efficacy performed substantially better and had negligible bias (figure 2). Results with our estimator and the Anderson–Gill approach were comparable. Discrepancies between our method and the Anderson–Gill approach are likely to be due to the discretization of time. While VE estimated by our method is based on the ratio of cumulative hazards over the specified time period, VE estimated by the Anderson–Gill method is based on the ratio of instantaneous hazards over the time period. Overall, our repeated measures model constituted a valid alternative to the Anderson–Gill approach, and in fact would be a more appropriate method to analyse data in which information about time to event is known in discrete time intervals (interval censoring). Moreover, although our estimator was based on discrete time to event, the half-widths of the 95% Wald CI of the estimator proposed here and that from the Anderson–Gill method were very similar. The difference was negligible in all simulation scenarios and at most 0.001.

Figure 2.


Comparison of the per cent bias in VE under heterogeneity of the intensity of exposure, as a function of the expected value of the mixing distribution of exposure, in 250 simulations each with a sample size of 1200 subjects. VE was estimated via modelling of time to first infection/disease (using proportional hazard and Gaussian frailty models), via modelling of time to all infection/disease (using Anderson–Gill model) and via a repeated measures model with a complementary log–log link and GEE approach. Exposure was generated assuming a continuous mixture of Poisson distributions with the specified mean intensity λ (and variance ϕλ). Dashed line, first event frailty model; dot-dashed line, repeated measures; dotted line, Anderson–Gill; solid line, first event proportional hazard.

4. Reanalysis of a malaria vaccine trial

We reanalyse the Brazilian trial of the SPf66 vaccine (Urdaneta et al. 1998) to compare the VE estimated through proportional hazards analysis (of first event only) with the VE estimated through our repeated measures estimator (of first and second events) implemented with a GEE approach. The SPf66 vaccine was expected only to protect vaccinees from disease, without affecting transmission (Graves et al. 1998; Graves & Gelband 2001).

In the Brazilian SPf66 vaccine trial, 58% of the study population had immigrated to the trial area in the two years prior to the trial, suggesting heterogeneous susceptibility to malaria among the study subjects. Although no mosquito surveys were performed in the trial area during the trial period, studies in nearby regions indicated that the intensity of exposure in the region ranged from 0.4 to 2.1 infected mosquito bites/person/night, depending on the season (Klein & Lima 1990; Urdaneta et al. 1996). A total of 800 individuals were randomized to vaccination (400) or placebo (400) and 572 (287 vaccinees versus 285 non-vaccinees) received three doses. As 32 of these individuals were lost to follow-up immediately after the third dose, the final analysis includes 540 study subjects. The study lasted 18 months after the third dose and recorded first and second malaria episodes.

Among the 540 subjects, 161 had one episode of falciparum malaria (the type of malaria with higher morbidity), and among those 44 presented with a second episode. The original trial analysed time to first infection/disease episode through life-table methods, with the hazard for each group estimated as the ratio of the number of cases at the end of the follow-up period to the person–time at risk. The trial reported a crude VE of 14.1% (95% CI of [−17.0, 36.9%]) for the first episode of malaria.

We performed a survival analysis for first episode using proportional hazards models, and a repeated measures analysis for first and second episodes, in monthly intervals, with GEEs using a working independence covariance assumption. In the repeated measures analysis, we assumed that the intensity of exposure (λ) was constant over time and, based on the previous entomological studies done in the area, equal to 30 infected bites/person/month. For this example, we chose three categorical covariates: vaccination; time living in the trial area; and age.

For each model, the estimated vaccine effect was low and did not reach statistical significance (table 1). Individuals who had lived in the area for more than two years had a lower susceptibility or probability of infection per exposure contact than those who were living in the area for two years or less. Neither the main effect of age nor its interaction with VE was significant in any analyses, indicating that baseline susceptibility and VE were relatively homogeneous across age groups. Adjusting for age or time living in the area did not appreciably change the point estimates of VE, suggesting no confounding due to these factors.
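To show how the fitted complementary log–log model maps onto interpretable quantities, the sketch below converts per-exposure parameters into monthly probabilities of a malaria episode under the assumed intensity of 30 infected bites/person/month. The inputs are the rounded point estimates reported in table 1 for GEE model 1; the function name is an assumption, and the derived monthly probabilities are an illustration rather than results reported in the trial or in this reanalysis.

```python
import math

def monthly_event_prob(lam, p_exposure, ve=0.0, rr=1.0):
    """Pr{D = 1} in one interval: 1 - exp(-lam * p * (1 - VE) * RR)."""
    return 1.0 - math.exp(-lam * p_exposure * (1.0 - ve) * rr)

LAM = 30.0          # assumed infected bites per person per month
P_BASE = 0.0012     # P(D | W=1, V=0) = 0.12%, reference group (two years or less in area)
VE = 0.101          # VE per exposure, GEE model 1
RR_AREA = 0.61      # more than two years versus two years or less in the area

# Coefficients on the complementary log-log scale implied by these estimates.
alpha0, beta1, alpha2 = math.log(P_BASE), math.log(1.0 - VE), math.log(RR_AREA)
print(round(alpha0, 3), round(beta1, 3), round(alpha2, 3))

for label, ve, rr in (("placebo, <=2 years", 0.0, 1.0),
                      ("vaccine, <=2 years", VE, 1.0),
                      ("placebo, >2 years", 0.0, RR_AREA),
                      ("vaccine, >2 years", VE, RR_AREA)):
    print(label, round(monthly_event_prob(LAM, P_BASE, ve, rr), 4))
```

The offset ln(30) is what allows the intercept to be read as a log probability per exposure contact; with a different assumed intensity the intercept would shift by the difference in log intensities, while the estimated VE per exposure would be unchanged.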

Table 1.

Point and CI estimates of VE and possible predictors of individual susceptibility, based on a survival analysis with proportional hazards model (using first episode only as outcome), and a repeated measures analysis with GEE of the Brazilian SPf66 malaria vaccine trial. Covariates include vaccination status, years living in the trial area prior to the trial and age.

| Variable | Proportional hazards model 1 (b) | Proportional hazards model 2 (b) | GEE model 1 (a, b) | GEE model 2 (a, b) |
|---|---|---|---|---|
| P(D \| W=1, V=0) (%) | | | 0.12 (0.09, 0.14) | 0.11 (0.08, 0.14) |
| VE (%) | 12.5 (−19.3, 35.8) | 13.6 (−17.8, 36.6) | 10.1 (−20.3, 32.8) | 10.8 (−19.7, 33.4) |
| RR(time in area >2 versus ≤2 \| W=1) | 0.61 (0.44, 0.84) | 0.61 (0.44, 0.85) | 0.61 (0.44, 0.83) | 0.61 (0.45, 0.84) |
| RR(age >20 versus age ≤20 \| W=1) | | 1.17 (0.86, 1.60) | | 1.12 (0.83, 1.51) |

(a) Standard errors of GEE models were calculated with robust variance estimators assuming a working independence covariance matrix.

(b) Models: model 1, $\ln(-\ln(1-\Pr\{d_{ij} \mid v_i, x_i\})) = \ln\{\lambda = 30\} + \alpha_0 + \beta_1 v_i + \alpha_2 I(\text{time in area}_i > 2)$; model 2, $\ln(-\ln(1-\Pr\{d_{ij} \mid v_i, x_i\})) = \ln\{\lambda = 30\} + \alpha_0 + \beta_1 v_i + \alpha_2 I(\text{time in area}_i > 2) + \alpha_3 I(\text{age}_i > 20)$.

Overall, these results confirm the lack of VE found by Urdaneta et al. (1998). Currently, alternative delivery systems for the SPf66 vaccine are under investigation to improve the immunogenicity of the vaccine.

5. Discussion

The repeated measures model presented in this paper constitutes a practical way to estimate the individual VE or the VE conditional on exposure when the intensity of exposure is heterogeneous, even when the only knowledge about exposure relies on the estimated intensity of exposure from an environmental study. The estimator proposed here requires only a coarse estimate of the density of exposure. The repeated measures model offers a flexible and convenient way of handling heterogeneity in susceptibility and in the VE per exposure contact through covariates and random effects. Under unmeasured heterogeneity of the intensity of exposure in the population, our repeated measures model results in more accurate and more robust estimates of VE than the proportional hazards modelling of time to first event and offers an alternative to multiple events survival methods when the time interval of events rather than the exact event time is known.

When unmeasured heterogeneity leads to the failure of the proportional hazards assumption, survival analysis with frailty models for time to first event has been proposed to estimate the marginal or population-based VE under heterogeneity (Halloran et al. 1996; Longini & Halloran 1996). However, frailty single event models may also provide biased estimates of the individual VE, as shown here. As Hougaard (1991) remarks, single event frailty models are sensitive to the choice of the frailty distribution. Thus, many VE studies having recurrent events would be more appropriately analysed through multivariate methods, such as continuous time multivariate survival or repeated measures analysis. The methods assessed here are more robust than Poisson models that have been used by vaccine studies to handle multiple events (Alonso et al. 2005; Bejon et al. 2006). Poisson models rely on exponentially distributed interarrival times and, thus, yield unbiased estimates of the individual VE only when the assumption of proportional hazards is valid.

Although we specified our estimator for a Poisson distributed exposure, an alternative mixture of distributions for exposure, such as the negative binomial, could also be easily handled. In addition, an intensity of exposure varying over time and across subgroups of the population could be easily incorporated into the model. Assuming that exposures are Poisson distributed simplified the implementation of the model due to the resulting complementary log–log link function for the marginal probability of the outcome. Different motivations for complementary log–log models have been described (Collett 1991), including the discrete proportional hazards model. Therefore, our model to estimate VE can also be viewed as a multivariate version of the proportional hazards model, in which robust variances can be obtained through standard approaches such as GEEs. Frailties or random effects in susceptibility or in the log-transformed exposure can easily be incorporated within our longitudinal or repeated analysis framework.

We did not assess here the impact on the repeated measures estimator of having an exposure or susceptibility heterogeneously changing over time. A heterogeneous time-decreasing susceptibility could be the result of subjects' susceptibility decreasing conditionally on a recent event. Under these circumstances, our estimator could possibly be biased. However, including a covariate to represent recent event history, such as in a conditional Markov model, would probably correct most of this bias.

Since many vaccines may be given to prevent or delay diseases that recur, such as malaria, otitis media in pneumococcal infection or recurrent manifestations of perennial infections, such as the herpes or HIV virus, the proposed repeated measures estimator may be widely applicable. Our estimator is appropriate for the data collection mechanism of VE studies, since case detection at pre-specified time points often does not allow ascertainment of the exact failure time. Further developments of the general model proposed here could be made towards the estimation of individual VE when the infection/disease does not recur using a semi-parametric approach as suggested by van der Laan & Robins (2003). Extension to designs that collect detailed exposure information on a small validation set of participants and crude exposure information on all participants could be based on the semi-parametric methods of Golm et al. (1999). The model could also be extended to include estimation of ‘strain’-specific VE, as suggested by Gilbert et al. (1998), by classifying each event caused by a specific strain as a different outcome.

Given the results presented here, we recommend that primary analysis of vaccine trials with recurrent events should consider methods based on recurrent events, such as our repeated measures model, and not just the proportional hazards models for time to first event. It is probable that modelling time to first event yields biased estimates of individual VE. The VE estimated when modelling time to first event represents a parameter that varies with the transmission intensity in the trial site and the duration of the trial and, thus, does not allow vaccines to be judged. The repeated measures estimator proposed not only provides nearly unbiased estimates of the actual parameter of interest, individual VE, but also is an appropriate method to analyse data collected at discrete time intervals.

Acknowledgments

The malaria vaccine trial reanalysed in this paper was initiated after approval from the Ministry of Health, Brazil, and the local authorities. Written consent was obtained from every patient.

This work was supported by TDR, WHO Research Training grant no. M8/181/4/V.106 and the Fogarty International Center of the National Institutes of Health grant no. 5 D43 TW 000918. We are also grateful to Dr James Robins and Dr Claudio Struchiner, who provided helpful suggestions.

Footnotes

One contribution of 13 to a Theme Issue ‘Mathematical and statistical methods for diagnoses and therapies’.

References

  1. Aalen O.O. Heterogeneity in survival analysis. Stat. Med. 1988;7:1121–1137. doi: 10.1002/sim.4780071105.
  2. Aalen O.O. Effects of frailty in survival analysis. Stat. Methods Med. Res. 1994;3:227–243. doi: 10.1177/096228029400300303.
  3. Aide P., Bassat Q., Alonso P. Towards an effective malaria vaccine. Arch. Dis. Child. 2007;92:476–479. doi: 10.1136/adc.2005.092551.
  4. Alonso P.L., et al. Randomized trial of efficacy of SPf66 vaccine against Plasmodium falciparum malaria in children in southern Tanzania. Lancet. 1994;344:1175–1181. doi: 10.1016/S0140-6736(94)90505-3.
  5. Alonso P.L., et al. Duration of protection with RTS,S/AS02A malaria vaccine in prevention of Plasmodium falciparum disease in Mozambican children: single-blind extended follow-up of a randomized controlled trial. Lancet. 2005;366:2012–2018. doi: 10.1016/S0140-6736(05)67669-6.
  6. Andersen P.K., Gill R.D. Cox's regression model for counting processes: a large sample study. Ann. Statist. 1982;10:1100–1120. doi: 10.1214/aos/1176345976.
  7. Aponte J.J., et al. Safety of the RTS,S/AS02D candidate malaria vaccine in infants living in a highly endemic area of Mozambique: a double blind randomized controlled phase I/IIb trial. Lancet. 2007;370:1543–1551. doi: 10.1016/S0140-6736(07)61542-6.
  8. Baird J.K., et al. Age-specific prevalence of Plasmodium falciparum among six populations with limited histories of exposure to endemic malaria. Am. J. Trop. Med. Hyg. 1993;49:707–719. doi: 10.4269/ajtmh.1993.49.707.
  9. Bejon P., et al. A phase 2b randomised trial of the candidate malaria vaccines FP9 ME-TRAP and MVA ME-TRAP among children in Kenya. PLoS Clin. Trials. 2006;1:e29. doi: 10.1371/journal.pctr.0010029.
  10. Bejon P., et al. Extended follow-up following a phase 2b randomized trial of the candidate malaria vaccines FP9 ME-TRAP and MVA ME-TRAP among children in Kenya. PLoS ONE. 2007;2:e707. doi: 10.1371/journal.pone.0000707.
  11. Bojang K.A., et al. Efficacy of RTS,S/AS02 malaria vaccine against Plasmodium falciparum infection in semi-immune adult men in The Gambia: a randomized trial. Lancet. 2001;358:1927–1934. doi: 10.1016/S0140-6736(01)06957-4.
  12. Chastang C., Byar D., Piantadosi S. A quantitative study of the bias in estimating the treatment effect caused by omitting a balanced covariate in survival models. Stat. Med. 1988;7:1243–1255. doi: 10.1002/sim.4780071205.
  13. Collett D. Chapman and Hall; New York, NY: 1991. Modelling binary data.
  14. Diggle P.J., Liang K., Zeger S.L. Oxford statistical science series. vol. 13. Oxford University Press; New York, NY: 1994. Analysis of longitudinal data.
  15. Gail M.H., Wieand S., Piantadosi S. Biased estimates of effect in randomized experiments with nonlinear regressions and omitted covariates. Biometrika. 1984;71:431–444. doi: 10.1093/biomet/71.3.431.
  16. Gilbert P.B., Self S.G., Ashby M.A. Statistical methods for assessing differential vaccine protection against human immunodeficiency virus types. Biometrics. 1998;54:799–814. doi: 10.2307/2533835.
  17. Golm G.T., Halloran M.E., Longini I.M. Semiparametric methods for multiple exposure mismeasurement and a bivariate outcome in HIV vaccine trials. Biometrics. 1999;55:94–101. doi: 10.1111/j.0006-341X.1999.00094.x.
  18. Graves, P. & Gelband, H. 2001 Vaccines for preventing malaria. Cochrane review. The Cochrane Library, issue no. 3. (Update Software, Oxford, 2001). See http://www.update-software.com/abstracts/ab000129.htm.
  19. Graves P., Gelband H., Garner P. The SPf66 malaria vaccine: what is the evidence for efficacy. Parasit. Today. 1998;14:218–220. doi: 10.1016/S0169-4758(98)01242-3.
  20. Halloran M.E., Haber M.J., Longini I.M. Direct and indirect effects in vaccine efficacy and effectiveness. Am. J. Epidemiol. 1991;133:323–331. doi: 10.1093/oxfordjournals.aje.a115884.
  21. Halloran M.E., Longini I.M., Haber M.J., Struchiner C.J., Brunet R.C. Exposure efficacy and change in contact rates in evaluating prophylactic HIV vaccines in the field. Stat. Med. 1994;13:357–377. doi: 10.1002/sim.4780130404.
  22. Halloran M.E., Longini I.M., Struchiner C.J. Causal inference in infectious diseases. Epidemiology. 1995;6:142–151. doi: 10.1097/00001648-199503000-00010.
  23. Halloran M.E., Longini I.M., Struchiner C.J. Estimability and interpretation of vaccine efficacy using frailty mixing models. Am. J. Epidemiol. 1996;144:83–97. doi: 10.1093/oxfordjournals.aje.a008858.
  24. Henderson R., Oman P. Effect of frailty on marginal regression estimates in survival analysis. J. R. Stat. Soc. B. 1999;61:367–379. doi: 10.1111/1467-9868.00182.
  25. Hougaard P. Modelling heterogeneity in survival data. J. Appl. Prob. 1991;28:695–701. doi: 10.2307/3214503.
  26. Hougaard P. Frailty models for survival data. Lifetime Data Anal. 1995;1:255–273. doi: 10.1007/BF00985760.
  27. Hougaard P. Springer; New York, NY: 2000. Analysis of multivariate survival data.
  28. Klein T.A., Lima J.B.P. Seasonal distribution and biting patterns of Anopheles mosquitoes in Costa Marques, Rondônia, Brazil. J. Am. Mosq. Control Assoc. 1990;6:700–707.
  29. Liang K., Zeger S. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73:13–22. doi: 10.1093/biomet/73.1.13.
  30. Lin D.Y., Wei L.J. The robust inference for the Cox proportional hazards model. J. Am. Stat. Assoc. 1989;84:1074–1078. doi: 10.2307/2290085.
  31. Longini I.M., Halloran M.E. A frailty mixture model for estimating vaccine efficacy. Appl. Stat. 1996;45:165–173. doi: 10.2307/2986152.
  32. Omori Y., Johnson R.A. The influence of random effects on the unconditional hazard rate and survival functions. Biometrika. 1993;80:910–914. doi: 10.1093/biomet/80.4.910.
  33. Schmoor C., Schumacher M. Effects of covariate omission and categorization when analysing randomized trials with the Cox model. Stat. Med. 1997;16:225–237. doi: 10.1002/(SICI)1097-0258(19970215)16:3<225::AID-SIM482>3.0.CO;2-C.
  34. Schumacher M., Olschewski M., Schmoor C. The impact of heterogeneity on the comparison of survival times. Stat. Med. 1987;6:773–784. doi: 10.1002/sim.4780060708.
  35. Smith T., Charlwood J.D., Takken W., Tanner M., Spiegelhalter D.J. Mapping the densities of malaria vectors within a single village. Acta Trop. 1995;59:1–18. doi: 10.1016/0001-706X(94)00082-C.
  36. Struchiner C.J., Halloran M.E., Brunet R.C., Ribeiro J.M.C., Massad E. Malaria vaccines: lessons from field trials. Cad. de Saude Publica, Rio de Janeiro. 1994;10:310–326. doi: 10.1590/s0102-311x1994000800009.
  37. Struthers C.A., Kalbfleisch J.D. Misspecified proportional hazards models. Biometrika. 1986;73:363–369. doi: 10.1093/biomet/73.2.363.
  38. Therneau T.M., Grambsch P.M. Springer; New York, NY: 2000. Modeling survival data. Extending the Cox model.
  39. Urdaneta M., Prata A., Struchiner C.J., Tosta C.E., Tauil P., Boulos M. SPf66 vaccine trial in Brazil: conceptual framework, study design and analytical approach. Rev. Soc. Bras. Med. Trop. 1996;29:259–269. doi: 10.1590/s0037-86821996000300007.
  40. Urdaneta M., Prata A., Struchiner C.J., Tosta C.E., Tauil P., Boulos M. Evaluation of SPf66 malaria vaccine efficacy in Brazil. Am. J. Trop. Med. Hyg. 1998;58:378–385. doi: 10.4269/ajtmh.1998.58.378.
  41. van der Laan M.J., Robins J.M. Springer; New York, NY: 2003. Unified methods for censored longitudinal data and causality. p. 72.
  42. Vaupel J.W., Manton K.G., Stallard E. The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography. 1979;16:439–454. doi: 10.2307/2061224.
  43. WHO. Twelfth program report of the UNDP/World Bank/WHO special program for research and training in tropical diseases (TDR). Bull. WHO. 1995;12:64–76.
  44. Zeger S.L., Liang K. Longitudinal data analysis for discrete and continuous outcomes. Biometrics. 1986;42:121–130. doi: 10.2307/2531248.

Articles from Philosophical transactions. Series A, Mathematical, physical, and engineering sciences are provided here courtesy of The Royal Society
