Author manuscript; available in PMC: 2021 Sep 1.
Published in final edited form as: Pharm Stat. 2020 May 5;19(5):710–719. doi: 10.1002/pst.2026

Estimands and inference in cluster-randomized vaccine trials

Kayla W Kilpatrick 1, Michael G Hudgens 1, M Elizabeth Halloran 2,3
PMCID: PMC8273646  NIHMSID: NIHMS1690677  PMID: 32372535

Summary

Cluster-randomized trials are often conducted to assess vaccine effects. Defining estimands of interest before conducting a trial is integral to the alignment between a study’s objectives and the data to be collected and analyzed. This paper considers estimands and estimators for overall, indirect, and total vaccine effects in trials, where clusters of individuals are randomized to vaccine or control. The scenario is considered where individuals self-select whether to participate in the trial, and the outcome of interest is measured on all individuals in each cluster. Unlike the overall, indirect, and total effects, the direct effect of vaccination is shown in general not to be estimable without further assumptions, such as no unmeasured confounding. An illustrative example motivated by a cluster-randomized typhoid vaccine trial is provided.

Keywords: causal inference, estimands, herd immunity, spillover, vaccine trials

1 |. INTRODUCTION

Vaccines are integral to combating a variety of infectious diseases. Quantifying a vaccine’s effects is vital to determining its benefits, which can then guide public health policies aimed at reducing the burden of disease. Cluster-randomized trials are often conducted to quantify the effects of a treatment or intervention such as a vaccine. In cluster-randomized trials, individuals are grouped together based on certain characteristics (eg, neighborhood of residence), and the entire cluster is randomized to treatment or control. The process of randomization ensures that the treatment and control groups are exchangeable. Cluster-randomization is useful when it is impractical or infeasible to randomize at the individual level.1 Comparisons between randomized clusters can be used to assess the overall impact of an intervention on the population, which is particularly important in settings where an intervention may have indirect (or spillover) effects.2 For example, in the infectious disease setting, whether one individual is vaccinated could affect the outcome of another individual. Moulton et al3 describe a cluster-randomized trial in the White Mountain Apache Reservation and the Navajo Nation wherein approximately 9000 infants within 38 clusters were randomized by cluster to the vaccine of interest (Streptococcus pneumoniae conjugate vaccine) or control (a meningococcal C conjugate vaccine). Diallo et al4 present a cluster-randomized trial of an inactivated influenza vaccine in Senegal in which approximately 7800 enrolled, age-eligible children within 20 clusters were randomized by cluster to the influenza vaccine or control (an inactivated polio vaccine). Sur et al5 describe a cluster-randomized trial of a typhoid vaccine in India, with approximately 38 000 individuals within 80 clusters randomized by cluster to the typhoid vaccine or control (hepatitis A vaccine).

Because the cluster-randomized trial is a common study design for evaluating vaccine effects, it is important to carefully define the estimands, that is, parameters of interest, in these trials. Careful definition of the effects of interest prior to the study can aid in study planning and can help ensure that the study’s goals are achieved.6 Recently, there has been increased interest in defining estimands in clinical trials. The International Council for Harmonisation (ICH) has published an addendum to the E9 guideline detailing the use of estimands in clinical trials.7 This addendum aims to describe the necessity of defining the target estimand before the design and analysis of trials to avoid misalignment of the trial goals and the data, as well as to ensure that estimation of the estimand is possible without relying upon dubious assumptions.8

Leuchs et al,6 Koch and Wiener,9 Permutt,10 and Phillips et al11 discuss examples of estimands of interest in regulatory clinical trials. Target estimands specifically for cluster-randomized trials have been previously considered for certain designs. Wu et al12 consider estimands for matched-pair cluster-randomized trials. Hudgens and Halloran13 consider estimands of the direct, indirect, total, and overall effects of treatment assuming a two-stage randomization scheme. In this design, clusters are randomly assigned to a treatment allocation program, and individuals within the clusters are randomly assigned to treatment based on the cluster-level assignment. In some cluster-randomized trials, individuals may not comply with their randomization assignment or may choose not to participate in the study.3,5,14,15 Frangakis et al16 consider clustered encouragement designs, which allow noncompliance, where individuals belong to one of three principal strata: always-takers, compliers, and never-takers. Kang and Keele17 also consider cluster-randomized trials with noncompliance. Like Frangakis et al,16 they consider the setting where there are the three principal strata mentioned above and also the special case where there are no always-takers. Even for this special case, they show the total and indirect (spillover) effects are not identified because principal strata membership is unknown for some individuals.

In this paper, we consider cluster-randomized vaccine trials where individuals choose whether to participate in the trial or not. As illustrated by the examples described above, it is common in cluster-randomized vaccine trials for the control to be another vaccine which is not expected to affect the outcome of interest. For simplicity, below the control vaccine will sometimes be referred to just as a control. Here, we consider the particular case where a control vaccine is employed and individuals are blinded, that is, unaware whether their cluster is randomly assigned to the vaccine of interest or to the control vaccine. In this setting, it is reasonable to assume individual participation behavior is unaffected by randomization, such that there are only two principal strata: always participators and never participators. Thus, our setting is similar to the special case considered by Kang and Keele.17 However, because it is assumed an individual will participate or not in the trial regardless of randomization assignment, principal strata membership is known for all individuals; this allows for identification and estimation of overall, total and indirect effects.

Sur et al5 provide a motivating example of a cluster-randomized vaccine trial where individuals self-select whether to participate. In this trial, clusters of individuals were randomized to either a typhoid vaccine or a control vaccine (for hepatitis A). The presence of a control allowed study blinding, so individuals in the clusters did not know which assignment their cluster received. While some individuals chose not to participate in the trial, outcome data were collected on all individuals. This allows inference about different effects of the vaccine, as described below.

The outline of the remainder of this paper is as follows. In Section 2, notation, estimands, estimators, and effects of interest are described. In Section 3, the Sur et al5 cluster-randomized typhoid vaccine trial is considered. Finally, Section 4 concludes with a discussion.

2 |. METHODS

2.1 |. Notation and potential outcomes

Consider a cluster-randomized vaccine trial with $n$ clusters (or groups) of individuals where each cluster is randomly assigned to vaccine or control. For $i = 1, \ldots, n$, let $A_i = 1$ if cluster $i$ is assigned to vaccine and $A_i = 0$ otherwise. Let $Y_i^{a=1}$ denote the potential outcome if cluster $i$ is assigned vaccine, and let $Y_i^{a=0}$ denote the potential outcome if cluster $i$ is assigned control. For example, $Y_i^{a=1}$ could denote the proportion of individuals in cluster $i$ who would develop typhoid within 1 year after randomization if, possibly counter to fact, cluster $i$ were assigned to vaccine. For now, we leave the particular outcome associated with $Y_i^{a}$ unspecified. Different specifications of $Y_i^{a}$ will correspond to different vaccine effects, as described below. Let $Y_i$ denote the observed outcome for cluster $i$, such that $Y_i = Y_i^{a=1} A_i + Y_i^{a=0}(1 - A_i)$. Below, the subscript $i$ is sometimes dropped for notational convenience.

In cluster-randomized vaccine trials, one individual’s vaccination status may affect another individual’s outcome, that is, there may be “interference” between individuals.18 For instance, if one individual receives a typhoid vaccine, this could affect whether another individual develops typhoid or not. Throughout this paper, it is assumed that there is no interference between individuals in different clusters, that is, there is “partial interference.”19 Under this assumption, the outcome Yi for cluster i depends only on the treatment assigned to cluster i. No assumption is made regarding the form of interference within clusters.

2.2 |. Estimands and estimators

Vaccine effects, that is, the causal effects of vaccination, can be defined by contrasts in the expected values of the potential outcomes $Y^{a=1}$ and $Y^{a=0}$. Assuming the $n$ clusters in the trial are randomly sampled from an infinite super-population of clusters, the average treatment (vaccine) effect is generally defined by

$$\theta = E[Y^{a=1}] - E[Y^{a=0}] \tag{1}$$

where $E[X]$ denotes the expected value of $X$ in the super-population of clusters. In words, Equation (1) is the difference in the average outcome in the super-population when a cluster receives $a = 1$ compared to when a cluster receives $a = 0$. Alternatively, the $n$ clusters could be considered the finite population of interest and $E[X]$ defined instead to be $n^{-1}\sum_{i=1}^{n} X_i$. The super-population perspective is adopted in this paper, but similar considerations to those provided here apply if the finite population approach is utilized instead. Likewise, estimands other than Equation (1) could be considered. For example, for binary $Y$, the risk ratio $E[Y^{a=1}]/E[Y^{a=0}] = \Pr[Y^{a=1} = 1]/\Pr[Y^{a=0} = 1]$ might be of greater interest than the risk difference [Equation (1)]. For instance, $Y$ might be an indicator of whether or not at least one individual in a cluster gets infected.20,21 More generally, causal effects can be defined by $g(E[Y^{a=1}], E[Y^{a=0}])$ for some contrast function $g(x, y)$ where $g(x, x) = 0$; for example, $g(x, y) = x - y$ corresponds to Equation (1). Below, estimands of the form in Equation (1) are described, but similar considerations apply for other contrasts.

A few aspects of defining causal effects bear mentioning. First, causal effects are typically defined by contrasts in expected values of the potential outcomes over the same set of units.22,23 In many settings, the unit is defined to be an individual; for example, a unit could be a participant in a randomized controlled trial. Here, we consider the clusters to be the units since randomization is at the cluster level. Note that contrasts in average potential outcomes between different sets of units do not have a causal interpretation. For example, suppose a cluster-randomized vaccine trial is conducted in schools, where students within the same school constitute the clusters. A comparison of the average $Y^{a=1}$ among clusters (schools) in rural areas to the average $Y^{a=0}$ among clusters in urban areas is not a causal effect. Also note that causal effects are contrasts in the expected value of the same outcome under different counterfactual scenarios. Contrasts in different outcomes are not causal effects. For example, a comparison of the average incidence of typhoid when clusters receive vaccine with the average incidence of cholera when clusters receive control would not be a causal effect. We will revisit this point below when discussing direct effects.

The average treatment effect can be estimated by the difference in sample means:

$$\hat{\theta} = \frac{1}{n_1}\sum_{i=1}^{n} Y_i I(A_i = 1) - \frac{1}{n_0}\sum_{i=1}^{n} Y_i I(A_i = 0) \tag{2}$$

where $n_a = \sum_{i=1}^{n} I(A_i = a)$ for $a = 0, 1$. This estimator is consistent and unbiased under commonly used randomization schemes, such as a completely randomized experiment where the number of clusters assigned vaccine (treatment) is fixed.24–26 The SE of $\hat{\theta}$ can be estimated and 95% Wald confidence intervals can be constructed in the usual manner for the difference in means. Equivalently, Equation (2) can be obtained by computing the least squares estimate of the slope parameter of simple linear regression of $Y$ on $A$. A generally more precise estimator can be obtained by regressing $Y$ on $A$ and $Z$ where $Z$ is some vector of baseline covariates. For simplicity, only estimators of the form in Equation (2) are considered below; see Tsiatis et al27 for further discussion on using baseline covariates to improve efficiency. Note also that Equation (2) utilizes only cluster level data and thus avoids the complexities associated with inference based on statistics constructed using individual level data, which require accounting for possible within-cluster correlation (eg, using mixed effects models or generalized estimating equations).
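As an illustration of Equation (2), the following minimal Python sketch computes the difference in cluster-level means together with an unpooled SE and a Wald interval. The function name `diff_in_means`, the normal quantile 1.96, and the six cluster outcomes are assumptions made for illustration; they are not taken from the trial or from the authors' code.

```python
import math

def diff_in_means(y, a, z=1.96):
    """Difference in cluster-level means (Equation 2) with a Wald interval.

    y: cluster-level outcomes Y_i
    a: cluster assignments A_i (1 = vaccine, 0 = control)
    z: normal quantile for the interval (1.96 assumed for 95%)
    """
    y1 = [yi for yi, ai in zip(y, a) if ai == 1]
    y0 = [yi for yi, ai in zip(y, a) if ai == 0]

    def mean(v):
        return sum(v) / len(v)

    def var_of_mean(v):
        # Sample variance divided by the number of clusters in the arm.
        m = mean(v)
        s2 = sum((x - m) ** 2 for x in v) / (len(v) - 1)
        return s2 / len(v)

    est = mean(y1) - mean(y0)
    se = math.sqrt(var_of_mean(y1) + var_of_mean(y0))
    return est, se, (est - z * se, est + z * se)

# Hypothetical cluster-level outcomes (illustrative values, not trial data).
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
a = [1, 1, 1, 0, 0, 0]
est, se, ci = diff_in_means(y, a)
print(est, se, ci)
```

Only one number per cluster enters the calculation, which is why no model for within-cluster correlation is needed here.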

2.3 |. Overall, indirect, and total effects

In this section, the general approach above is used to define estimands and estimators of the overall, indirect, and total effects. The outcome of interest will depend on the context of the vaccine trial, such as the infection or pathogen of interest, the target population, and so forth. Here, the outcome of interest is generically referred to as disease.

The overall effect compares the average disease outcome among all individuals when a cluster is assigned vaccine vs when a cluster is assigned control. This quantity may be the most relevant to public health policy because all individuals within clusters are used in the comparison. As it is likely that populations of interest will include a mixture of individuals who would and who would not choose to be vaccinated, the overall effect may be valuable for public health officials and policy makers in assessing the overall impact of a vaccine at the population level.

The overall effect estimand and estimator can be defined in terms of individual level outcomes as follows. Let $m_i$ denote the number of individuals in cluster $i$. For individual $j$ in cluster $i$, let $Y_{ij} = 1$ if individual $j$ develops disease, and let $Y_{ij} = 0$ otherwise. Let $Y_{ij}^{a=1}$ indicate the outcome that would have been observed for individual $j$ if cluster $i$ is randomized to vaccine, and define $Y_{ij}^{a=0}$ analogously for control, such that $Y_{ij} = Y_{ij}^{a=1} A_i + Y_{ij}^{a=0}(1 - A_i)$. For the overall effect, the estimand Equation (1) can be expressed in terms of individual potential outcomes by defining $Y_i^{a=1} = \sum_{j=1}^{m_i} Y_{ij}^{a=1}/m_i$ and $Y_i^{a=0} = \sum_{j=1}^{m_i} Y_{ij}^{a=0}/m_i$ for cluster $i$. The overall effect estimator can likewise be expressed in terms of the observed individual-level outcomes by letting $Y_i = \sum_{j=1}^{m_i} Y_{ij}/m_i$.

The indirect effect quantifies the effect of vaccination on individuals who chose not to participate in the trial and, therefore, have no chance of receiving the vaccine. This effect is defined as a contrast in the average outcomes among non-participants when their cluster does or does not receive vaccine.28 Because the indirect effect is defined only among individuals who never receive the vaccine, this effect (if present) is solely due to interference. Thus, indirect effects are a type of spillover or peer effect.19 Quantifying indirect effects may be of interest from a public health policy perspective because vaccinating some, but not all, individuals within a cluster can still provide benefits to those who are unable or choose not to be vaccinated.

Like the overall effect, the indirect effect estimand and estimator can be defined in terms of individual level outcomes. To do so, first define the potential outcome $S_{ij}^{a=1}$, where $S_{ij}^{a=1} = 1$ if individual $j$ in cluster $i$ would choose to participate in the trial if, possibly counter to fact, cluster $i$ were randomized to vaccine and $S_{ij}^{a=1} = 0$ otherwise. Define $S_{ij}^{a=0}$ analogously. Denote the observed participation outcome for individual $j$ in cluster $i$ by $S_{ij}$, such that $S_{ij} = S_{ij}^{a=1} A_i + S_{ij}^{a=0}(1 - A_i)$. Assume $S_{ij}^{a=1} = S_{ij}^{a=0}$, that is, an individual’s decision to participate is not affected by whether their cluster is assigned vaccine or control. This assumption may be reasonable in cluster-randomized trials where individuals are blinded, such as the typhoid vaccine trial described in Section 3, because in such settings, randomization is not expected to have an effect on an individual’s decision to participate in the trial. As mentioned in the Introduction section, Frangakis and Rubin23 and Kang and Keele17 utilize the principal stratification framework when considering non-compliance in cluster-randomized trials. Under the assumption $S_{ij}^{a=1} = S_{ij}^{a=0}$, all individuals belong to one of two principal strata: always participators, that is, individuals where $S_{ij}^{a=1} = S_{ij}^{a=0} = 1$, and never participators, that is, individuals where $S_{ij}^{a=1} = S_{ij}^{a=0} = 0$. Fortunately, unlike the setting considered by Kang and Keele, here the principal strata membership of each individual can be inferred directly from the observed data because $S_{ij} = S_{ij}^{a=1} = S_{ij}^{a=0}$.

The indirect effect is the effect of vaccine in the non-participator principal stratum. The indirect effect has the general form [Equation (1)], with $Y_i^{a=1}$ now defined to be $\{\sum_{j=1}^{m_i} Y_{ij}^{a=1} I(S_{ij}^{a=1} = 0)\}/\{\sum_{j=1}^{m_i} I(S_{ij}^{a=1} = 0)\}$ and $Y_i^{a=0}$ defined to be $\{\sum_{j=1}^{m_i} Y_{ij}^{a=0} I(S_{ij}^{a=0} = 0)\}/\{\sum_{j=1}^{m_i} I(S_{ij}^{a=0} = 0)\}$. This estimand compares the average disease outcome among non-participators when a cluster is assigned vaccine vs when a cluster is assigned control. Similarly, the indirect effect estimator can be expressed by Equation (2) with $Y_i$ defined to be $\{\sum_{j=1}^{m_i} Y_{ij} I(S_{ij} = 0)\}/\{\sum_{j=1}^{m_i} I(S_{ij} = 0)\}$.

The total effect measures the effect of treatment in the always participator principal stratum. Because always participators receive the vaccine if and only if their cluster is assigned vaccine, the total effect encompasses both the individual effect of receiving the vaccine as well as the effect of other individuals in the cluster being vaccinated. The total effect estimand and estimator have the same form as the indirect effect estimand and estimator described above, but with $S_{ij}^{a=1} = 0$ replaced by $S_{ij}^{a=1} = 1$, $S_{ij}^{a=0} = 0$ replaced by $S_{ij}^{a=0} = 1$, and $S_{ij} = 0$ replaced by $S_{ij} = 1$. The total effect quantifies the difference in the average disease outcome among always participators when a cluster is assigned vaccine vs when a cluster is assigned control. The total effect is often the effect of primary interest in this type of trial. An illustration of the overall, indirect, and total effects is given in Figure 1.
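The three cluster-level outcome definitions differ only in which individuals are averaged over. A minimal Python sketch, assuming individual-level disease indicators and participation indicators are available for one cluster, is given below; the function name `cluster_outcomes` and the five-person cluster are hypothetical.

```python
def cluster_outcomes(y, s):
    """Return (overall, indirect, total) cluster-level outcomes for one cluster.

    y: 0/1 disease indicators Y_ij for the individuals in the cluster
    s: 0/1 participation indicators S_ij for the same individuals
    """
    def mean_where(keep):
        vals = [yj for yj, sj in zip(y, s) if keep(sj)]
        # The outcome is undefined (None) when the stratum is empty.
        return sum(vals) / len(vals) if vals else None

    overall = mean_where(lambda sj: True)       # all individuals
    indirect = mean_where(lambda sj: sj == 0)   # never participators
    total = mean_where(lambda sj: sj == 1)      # always participators
    return overall, indirect, total

# Hypothetical cluster of five individuals (illustrative values only).
y = [1, 0, 0, 1, 0]
s = [1, 1, 1, 0, 0]
overall, indirect, total = cluster_outcomes(y, s)
print(overall, indirect, total)
```

Each of the three values can then be used as the cluster-level outcome in the difference-in-means estimator of Equation (2).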

FIGURE 1. Cluster counterfactual comparisons. The left circle represents a cluster if, possibly counter to fact, assigned to vaccine (A = 1). The right circle represents a cluster if, possibly counter to fact, assigned to control (A = 0). Within each circle, S indicates which individuals chose to participate in the study (S = 1 indicates participation, S = 0 otherwise). The overall, indirect, and total effects are contrasts in average potential outcomes over different sets of individuals within the clusters.

There are a few special cases of note. In the scenario where all individuals in the population are willing to participate in trials (ie, there are no non-participators), the indirect effect is not well-defined, and the total and overall effects are equivalent. In some trials, only a subset of individuals may be eligible to be randomized for vaccination. For example, in Sur et al,5 individuals were eligible if they were at least 2 years of age, were not pregnant or lactating, and did not have an elevated temperature when the vaccine was given. Indirect effects, analogous to that defined above for non-participators, can be defined and estimated in these individuals if their outcome of interest is measured.

2.4 |. Direct effect

The overall, indirect, and total effects each describe an effect of treatment (vaccination) which is at least partially due to interference, if present. The effect of treatment that is not attributable to interference may also be of interest. Such an effect is sometimes referred to as a direct effect. This section describes why it is not possible in general to estimate the direct effect of vaccination in a cluster-randomized trial with self-selection of participation without additional assumptions, such as no unmeasured confounding. Informally, the direct effect compares the average outcome when an individual is vaccinated to the average outcome when an individual is not vaccinated, holding fixed the proportion of other individuals vaccinated.28 Several formal definitions of the direct effect estimand have been proposed; for example, see Hudgens and Halloran,13 VanderWeele and Tchetgen Tchetgen,29 Liu et al,30 Eck et al,31 and Sävje et al.32

To develop intuition behind the lack of identifiability of the direct effect, consider the following naive approach. Suppose the proportion of vaccinated individuals with disease is compared to the proportion of unvaccinated individuals with disease in clusters assigned to vaccine by

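$$\frac{1}{n_1}\sum_{i=1}^{n} \frac{\sum_{j=1}^{m_i} Y_{ij} I(S_{ij} = 1)}{\sum_{j=1}^{m_i} I(S_{ij} = 1)} I(A_i = 1) - \frac{1}{n_1}\sum_{i=1}^{n} \frac{\sum_{j=1}^{m_i} Y_{ij} I(S_{ij} = 0)}{\sum_{j=1}^{m_i} I(S_{ij} = 0)} I(A_i = 1). \tag{3}$$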

By the law of large numbers, Equation (3) converges to

$$E[Y^{a=1}] - E[\tilde{Y}^{a=1}] \tag{4}$$

where

$$Y_i^{a=1} = \frac{\sum_{j=1}^{m_i} Y_{ij}^{a=1} I(S_{ij}^{a=1} = 1)}{\sum_{j=1}^{m_i} I(S_{ij}^{a=1} = 1)}$$

and

$$\tilde{Y}_i^{a=1} = \frac{\sum_{j=1}^{m_i} Y_{ij}^{a=1} I(S_{ij}^{a=1} = 0)}{\sum_{j=1}^{m_i} I(S_{ij}^{a=1} = 0)}.$$

Unfortunately, the estimand Equation (4) is not a causal effect, as it comprises a comparison of different cluster-level outcomes, namely $Y_i^{a=1}$ and $\tilde{Y}_i^{a=1}$. As noted above, for an estimand to have a causal interpretation, the same outcome must be compared under different counterfactual scenarios.
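To make the naive contrast concrete, the sketch below computes Equation (3) from individual-level data: within each cluster assigned to vaccine it takes the risk among participants minus the risk among non-participants, then averages over those clusters. The dictionary layout, the function name `naive_contrast`, and the toy clusters are assumptions for illustration, and the sketch assumes every vaccine cluster contains both participants and non-participants.

```python
def naive_contrast(clusters):
    """Equation (3): mean over vaccine clusters of
    (risk in participants) - (risk in non-participants).

    clusters: list of dicts with keys
      "a": cluster assignment (1 = vaccine, 0 = control)
      "y": 0/1 disease indicators
      "s": 0/1 participation indicators
    """
    diffs = []
    for c in clusters:
        if c["a"] != 1:
            continue  # control clusters do not enter Equation (3)
        part = [y for y, s in zip(c["y"], c["s"]) if s == 1]
        nonpart = [y for y, s in zip(c["y"], c["s"]) if s == 0]
        diffs.append(sum(part) / len(part) - sum(nonpart) / len(nonpart))
    return sum(diffs) / len(diffs)

# Hypothetical clusters (illustrative values only).
clusters = [
    {"a": 1, "y": [1, 0, 0, 0], "s": [1, 1, 0, 0]},
    {"a": 1, "y": [0, 1, 0, 0], "s": [1, 1, 1, 0]},
    {"a": 0, "y": [1, 1, 0, 0], "s": [1, 0, 1, 0]},
]
result = naive_contrast(clusters)
print(result)
```

The two groups being compared are different sets of individuals observed under the same assignment, which is why the resulting number is a descriptive contrast rather than a causal effect.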

It is conventional, although not incontrovertible,33 to define causal effects only for a treatment or exposure that is manipulable, that is, there can be “no causation without manipulation”.34 If this convention is followed, then in cluster-randomized trials with non-participation, the direct effect of vaccination would only be considered well defined in always participators. Otherwise, to define the relevant potential outcomes would require considering a counterfactual scenario where non-participators receive vaccine. However, for the study design under consideration, always participators receive vaccine if and only if other always participators in their cluster also receive vaccine. Thus, it is not possible to observe both (a) a vaccinated always participator and (b) an unvaccinated always participator, while holding fixed the proportion of other individuals who are vaccinated in the cluster; hence the direct effect is not identifiable without additional assumptions.

On the other hand, if the “no causation without manipulation” convention is not adopted, there are other complications that may arise with estimating the direct effect. In particular, in cluster-randomized trials with non-participation, vaccine coverage within a cluster is dictated by the collective level of individual participation in the study, which is not under the investigator’s control. Factors associated with participation may also be associated with the outcome of interest, creating the potential for confounding. Thus causal inference methods for observational studies, such as those assuming no unmeasured confounding, would in general be necessary to draw inference about direct effects. To be concrete, consider the counterfactual scenario (or policy) where individuals independently receive vaccine with probability $\alpha$. Let $A_{ij}$ denote the vaccination status of individual $j$ in cluster $i$, and let $A_i = (A_{i1}, A_{i2}, \ldots, A_{in_i})$, where $n_i$ is the number of individuals in cluster $i$ (denoted $m_i$ above). The random vector $A_i$ takes on values $a_i$ in the set $\mathcal{A}(n_i) = \{0, 1\}^{n_i}$. Let $Y_{ij}(a_i)$ denote the potential outcome for individual $j$ (in cluster $i$) corresponding to $a_i$. The potential outcomes $Y_{ij}(a_i)$ may also be expressed as $Y_{ij}(a_{i,-j}, a_{ij})$ where $a_{i,-j}$ denotes the vector of treatment indicators for all individuals except individual $j$ and $a_{ij}$ is the treatment indicator for individual $j$. Define the average outcome for individual $j$ when vaccinated under policy $\alpha$ by

$$Y_{ij}(1;\alpha) = \sum_{b \in \mathcal{A}(n_i - 1)} Y_{ij}(a_{i,-j} = b, a_{ij} = 1) P_{\alpha}(A_{i,-j} = b)$$

where $P_{\alpha}$ denotes the probability under policy $\alpha$. Define $Y_{ij}(0;\alpha)$ analogously such that $Y_{ij}(0;\alpha)$ is the average outcome for individual $j$ when not vaccinated under policy $\alpha$. Then define the direct effect under policy $\alpha$ to be

$$E[\bar{Y}_i(1;\alpha)] - E[\bar{Y}_i(0;\alpha)] \tag{5}$$
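The policy-averaged individual outcome can be illustrated by brute-force enumeration. The sketch below sums a potential-outcome function over all vaccination patterns of the other individuals in a cluster, weighting each pattern by its probability when those individuals are vaccinated independently with probability alpha. The function `policy_average` and the potential-outcome rule `toy_outcome` are hypothetical and chosen only to make the arithmetic easy to follow; in a real trial the full set of potential outcomes is never observed.

```python
from itertools import product

def policy_average(potential_outcome, n_i, j, a_ij, alpha):
    """Average of Y_ij(a_{i,-j} = b, a_ij) over b in {0,1}^(n_i - 1).

    Each b is weighted by alpha^(sum b) * (1 - alpha)^(n_i - 1 - sum b),
    the probability of b under independent vaccination with probability alpha.
    """
    total = 0.0
    for b in product([0, 1], repeat=n_i - 1):
        prob = alpha ** sum(b) * (1 - alpha) ** (n_i - 1 - sum(b))
        # Rebuild the full cluster vector with individual j's own status inserted.
        a = list(b[:j]) + [a_ij] + list(b[j:])
        total += potential_outcome(a, j) * prob
    return total

def toy_outcome(a, j):
    """Hypothetical rule: individual j develops disease (1) unless vaccinated
    or unless at least two other individuals in the cluster are vaccinated."""
    others_vaccinated = sum(a) - a[j]
    return 0 if (a[j] == 1 or others_vaccinated >= 2) else 1

# Cluster of three individuals, policy alpha = 0.5, individual j = 0.
y1 = policy_average(toy_outcome, n_i=3, j=0, a_ij=1, alpha=0.5)
y0 = policy_average(toy_outcome, n_i=3, j=0, a_ij=0, alpha=0.5)
print(y1, y0, y1 - y0)
```

The enumeration has 2^(n_i - 1) terms, so this is practical only for very small clusters; it is meant to clarify the definition rather than to serve as an estimator.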

In the cluster-randomized trial setting considered in this paper, individuals self-select whether to participate such that it would be dubious to assume treatment received is independent of an individual’s potential outcomes. However, in some settings, it might be reasonable to assume there exists some vector of baseline covariates, say $L_i$, such that the set of potential outcomes for individuals within cluster $i$ are conditionally independent of the treatment selected given these covariates, that is, $Y_{ij}(a_i) \perp A_i \mid L_i$. This is a cluster level version of the usual no unmeasured confounders assumption. Under this assumption, inverse probability weighted estimators have been proposed which are consistent for Equation (5).35,36

3 |. TYPHOID VACCINE TRIAL

A cluster-randomized study was conducted to investigate the effectiveness of a Vi polysaccharide typhoid vaccine in Kolkata, India, over 2 years of follow-up from 2004 to 2006.5 The control in this trial was an inactivated hepatitis A vaccine. Geographic mapping and a census that characterized and counted all people and households in the study area were used to define 80 clusters. For purposes of randomization, clusters were stratified by ward (an administrative unit of Kolkata) and by the number of residents in certain age groups. Overall, 40 clusters were assigned to Vi vaccine and the other 40 to control. Because data from the typhoid trial are not publicly available, a simulated data set was constructed (see Data S1). The data were simulated to match exactly the cluster level summary statistics from the actual trial shown in Table 1.

TABLE 1.

Summary statistics of a cluster-randomized study in Kolkata from 2004 to 2006 of a Vi typhoid vaccine vs a hepatitis A control vaccine5

| Characteristic | Typhoid vaccine | Control |
|---|---|---|
| Number of clusters | 40 | 40 |
| Mean ± SD of people per cluster | 777 ± 136 | 792 ± 142 |
| Mean ± SD of participants per cluster | 472 ± 103 | 470 ± 104 |
| Number of participants | 18 869 | 18 804 |
| Number of non-participants | 12 206 | 12 877 |
| Number of events in participants | 34 | 96 |
| Number of events in non-participants | 16 | 31 |
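For orientation, the counts in Table 1 can be turned into crude pooled summaries with a few lines of Python. This is a minimal sketch: the dictionary keys and the function `pooled_summary` are hypothetical names. Pooled rates weight every individual equally, whereas the estimator in Equation (2) averages cluster-level proportions, so these pooled values need not match the cluster-averaged rates reported in the text.

```python
# Counts copied from Table 1.
table1 = {
    "vaccine": {"participants": 18869, "non_participants": 12206,
                "events_participants": 34, "events_non_participants": 16},
    "control": {"participants": 18804, "non_participants": 12877,
                "events_participants": 96, "events_non_participants": 31},
}

def pooled_summary(arm):
    """Crude pooled participation proportion and typhoid rates per 1000 people."""
    people = arm["participants"] + arm["non_participants"]
    events = arm["events_participants"] + arm["events_non_participants"]
    return {
        "participation": arm["participants"] / people,
        "rate_all": 1000 * events / people,
        "rate_participants": 1000 * arm["events_participants"] / arm["participants"],
        "rate_non_participants": 1000 * arm["events_non_participants"] / arm["non_participants"],
    }

summary = {name: pooled_summary(arm) for name, arm in table1.items()}
for name, values in summary.items():
    print(name, {k: round(v, 3) for k, v in values.items()})
```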

Sur et al5 measure vaccine effects in terms of hazard ratios. However, causal interpretations for hazard ratios are difficult because hazard ratios can depend on time and have an inherent selection bias.37 In particular, time-specific hazard ratios compare different subsets of subjects and, as noted above, estimands have a causal interpretation only when comparing potential outcomes between the same set (or subset) of units. Due to these issues, instead of using the hazard ratio to determine the vaccine effects as in Sur et al,5 the risk difference of typhoid over 2 years is calculated here to quantify vaccine effects.

The overall, indirect, and total effects were estimated using Equation (2) with the $Y_i$ definitions provided in Section 2.3. The effect estimates, estimated SEs, and 95% Wald confidence intervals (CIs) are shown in Table 2. For example, the overall effect estimate was obtained by taking the difference in the average number of cases of typhoid per 1000 individuals between Vi clusters and control clusters. In particular, Vi clusters had 1.61 cases of typhoid per 1000 people, while control clusters had 4.10 cases of typhoid per 1000 people. Thus, the overall effect estimate is −2.49 cases per 1000 people. The SE of the overall effect estimate was calculated by $\{\hat{\sigma}_0^2 + \hat{\sigma}_1^2\}^{1/2}$, where $\hat{\sigma}_a$ denotes the estimated SE of the mean outcome among clusters assigned $a$, for $a = 0$ (control) and $a = 1$ (Vi). Finally, a 95% Wald CI was estimated in the usual manner with a result of (−3.41, −1.58). The overall effect estimate has a straightforward interpretation which may be of interest to public health officials such as epidemiologists. In particular, the number of cases of typhoid per 1000 persons over a 2 year period is estimated to decrease by 2.5 on average when a cluster receives the Vi vaccine compared to receiving control.

TABLE 2.

Estimates of overall, indirect, and total effects, SE, and 95% Wald confidence intervals (CI)

| Effect | Estimate (SE) | 95% CI |
|---|---|---|
| Overall | −2.49 (0.47) | (−3.41, −1.58) |
| Indirect | −1.29 (0.56) | (−2.38, −0.19) |
| Total | −3.30 (0.67) | (−4.61, −1.99) |

Note: Effect estimates are differences in typhoid cases per 1000 people per 2 years.
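As a rough check on how the intervals in Table 2 arise, the sketch below recomputes each Wald interval as estimate ± 1.96 × SE from the rounded values printed in the table. The quantile 1.96 is an assumption (the text says only that the intervals were constructed in the usual manner), and because the inputs are rounded to two decimals, the recomputed limits can differ from the reported ones in the last digit.

```python
Z = 1.96  # assumed normal quantile for a 95% Wald interval

# (estimate, SE, reported 95% CI) as printed in Table 2,
# in typhoid cases per 1000 people per 2 years.
table2 = {
    "overall": (-2.49, 0.47, (-3.41, -1.58)),
    "indirect": (-1.29, 0.56, (-2.38, -0.19)),
    "total": (-3.30, 0.67, (-4.61, -1.99)),
}

def wald_ci(estimate, se, z=Z):
    """Wald interval: estimate minus/plus z times the standard error."""
    return (estimate - z * se, estimate + z * se)

recomputed = {name: wald_ci(est, se) for name, (est, se, _) in table2.items()}
for name, (lower, upper) in recomputed.items():
    print(f"{name}: ({lower:.2f}, {upper:.2f})")
```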

Both participants and non-participants appear to benefit from the Vi vaccine. In particular, over the study period, on average, there were 1.85 cases of typhoid per 1000 participants in Vi clusters, and 5.15 cases of typhoid per 1000 participants in control clusters. Thus, the total effect estimate is −3.30 (95% CI −4.61, −1.99), indicating that assigning a cluster to Vi vaccine causes 3.3 fewer cases of typhoid per 1000 participants compared to assigning a cluster to hepatitis A vaccine. Likewise, Vi clusters had 1.29 cases of typhoid per 1000 non-participants on average, while control clusters had 2.58 cases of typhoid per 1000 non-participants on average over the study period. Taking the difference between these values gives an indirect effect estimate of −1.29 (95% CI −2.38, −0.19). The indirect effect estimate suggests that assigning a cluster to the typhoid vaccine results in 1.29 fewer cases per 1000 non-participants; as non-participants never receive the vaccine, this indicates an indirect (or herd immunity) effect of the typhoid vaccine.

On the other hand, the naive direct effect estimator Equation (3) equals 0.56 (95% CI −0.44, 1.55). Although not statistically significant, this point estimate implies that the average number of cases of typhoid per 1000 people is higher in vaccinated individuals compared to non-vaccinated individuals in clusters randomized to the Vi vaccine. However, as discussed in Section 2.4, this estimate cannot be interpreted as an effect of the vaccine. For example, perhaps individuals at higher risk of typhoid chose to participate in the trial, or those who participated tended to have different health care seeking behavior. Moreover, the average number of cases of typhoid per 1000 people was also higher in participants compared to non-participants (2.57, 95% CI 1.19, 3.96) in the control clusters, providing direct evidence of confounding. Sur et al5 reported similar results, with incidence of typhoid higher in participants compared to non-participants, both within Vi vaccine clusters and within control clusters.

4 |. DISCUSSION

Randomized controlled trials are the gold standard in vaccine trials since randomization ensures that the vaccine and control groups are comparable. Carefully defining estimands in clinical trials is vital to ensure accurate interpretation of the resulting treatment effect estimates. Because cluster-randomized trials can be large and expensive to conduct, it is important to formally characterize estimands for use in these trials. This paper considers causal estimands in cluster-randomized trials where interference may be present within clusters. An illustrative example motivated by a recent cluster-randomized typhoid vaccine trial is provided, demonstrating inference and interpretation of the overall, total, and indirect effect estimands. These types of analyses can be used to inform public health policies regarding vaccination.

In cluster-randomized trials with self-selection, estimators of the direct effect must account for possible confounding. As described at the end of Section 2, a standard method to adjust for confounding is to condition on covariates and assume that conditional on these covariates, participants and non-participants are exchangeable. A possible indirect way to adjust for confounding could involve comparing outcomes between participants and non-participants in the control clusters as an estimate of the confounding bias, if present, similar to negative control approaches described in Lipsitch et al38 and Tchetgen Tchetgen.39 Alternatively, two-stage randomized designs could be considered to eliminate possible confounding when drawing inference about the direct effect. In two-stage randomized experiments, clusters are first randomly assigned to a treatment allocation program, then individuals within those clusters are assigned to treatment or control based on their cluster’s treatment allocation program.13 Randomization eliminates possible confounding at the cluster and individual level, such that direct, indirect, total, and overall effects can be estimated.13,40,41 However, it may not always be feasible to conduct two-stage randomized trials. In addition, the effects estimated by a two-stage randomized experiment are not equivalent to the effects estimated in cluster-randomized trials with participation self-selection and may have less public health relevance.42,43

Estimated effects may have greater real-world relevance depending on the estimands of interest and characteristics of individuals in the trials, such as the level of participation. Westreich44 provides several examples of population intervention effects defined by contrasts in average potential outcomes under different possible interventions on the distribution of treatment. These population intervention effects may be more germane to real-world policy than the traditional approach of defining causal effects by comparing average outcomes when all individuals in the population receive treatment vs when no individuals receive treatment. The estimands described here for cluster-randomized trials with self-selection are examples of population intervention effects, to the extent that the participation rate in the trial approximates vaccination uptake should the vaccine under evaluation become widely available to the public. For example, in Sur et al,5 about 60% of individuals on average chose to be vaccinated in both Vi and hepatitis A clusters; thus, the overall, total, and indirect effect estimates approximate the effects of vaccinating 60% of the population. Such effect estimates could potentially help inform public health policy decisions regarding vaccination.
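
To make the interpretation above concrete, the sketch below computes overall, total, and indirect effect contrasts as differences (vaccine minus control) in averages of cluster-level risks. It assumes that the overall contrast uses all individuals, the total contrast uses participants, and the indirect contrast uses non-participants. The cluster values, field names, and function names are hypothetical, with participation fixed at 60% to mirror the participation level discussed above; none of the numbers are trial data.

```python
def average(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def risk_all(cluster):
    """Cluster-level risk among all individuals, as a participation-weighted mix."""
    p = cluster["participation"]
    return p * cluster["risk_part"] + (1 - p) * cluster["risk_nonpart"]


def arm_contrast(clusters, risk_fn):
    """Vaccine-minus-control difference in the average of cluster-level risks."""
    vac = average([risk_fn(c) for c in clusters if c["arm"] == "vaccine"])
    ctl = average([risk_fn(c) for c in clusters if c["arm"] == "control"])
    return vac - ctl


# Hypothetical cluster summaries (not trial data); 60% participation in every cluster.
clusters = [
    {"arm": "vaccine", "participation": 0.6, "risk_part": 0.010, "risk_nonpart": 0.035},
    {"arm": "vaccine", "participation": 0.6, "risk_part": 0.020, "risk_nonpart": 0.045},
    {"arm": "control", "participation": 0.6, "risk_part": 0.040, "risk_nonpart": 0.055},
    {"arm": "control", "participation": 0.6, "risk_part": 0.050, "risk_nonpart": 0.065},
]

overall = arm_contrast(clusters, risk_all)                       # all individuals
total = arm_contrast(clusters, lambda c: c["risk_part"])         # participants
indirect = arm_contrast(clusters, lambda c: c["risk_nonpart"])   # non-participants

print(f"overall={overall:.3f} total={total:.3f} indirect={indirect:.3f}")
```

Because participation is the same in every toy cluster, the overall contrast here equals 0.6 times the total contrast plus 0.4 times the indirect contrast. That decomposition is a property of this simplified example and need not hold exactly when participation varies across clusters.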

Supplementary Material

suppinfo-dataset

ACKNOWLEDGEMENTS

The authors thank the Causal Inference with Interference working group in the Biostatistics Department at UNC-Chapel Hill: Shaina Alexandria, Brian G. Barkley, Bryan Blette, Sujatro Chakladar, Bradley Saul, and Bonnie Shook-Sa for their helpful suggestions. This work was partially supported by NIH grants R01 AI085073 and T32 ES007018.

Funding information

National Institutes of Health, Grant/Award Numbers: R01 AI085073, T32 ES007018

Footnotes

DATA AVAILABILITY STATEMENT

Because data from the typhoid trial are not publicly available, a simulated dataset was constructed (see Supporting Information).

SUPPORTING INFORMATION

Additional supporting information may be found online in the Supporting Information section at the end of this article.

REFERENCES

  • 1. Halloran ME, Longini IM Jr, Struchiner CJ. Design and Analysis of Vaccine Studies. New York: Springer-Verlag; 2010.
  • 2. Hayes RJ, Alexander NDE, Bennett S, Cousens SN. Design and analysis issues in cluster-randomized trials of interventions against infectious diseases. Stat Methods Med Res. 2000;9(2):95–116.
  • 3. Moulton LH, O’Brien KL, Kohberger R, et al. Design of a group-randomized Streptococcus pneumoniae vaccine trial. Control Clin Trials. 2001;22(4):438–452.
  • 4. Diallo A, Diop OM, Diop D, et al. Effectiveness of seasonal influenza vaccination in children in Senegal during a year of vaccine mismatch: a cluster-randomized trial. Clin Infect Dis. 2019;69(10):1780–1788.
  • 5. Sur D, Ochiai RL, Bhattacharya SK, et al. A cluster-randomized effectiveness trial of Vi typhoid vaccine in India. N Engl J Med. 2009;361(4):335–344.
  • 6. Leuchs AK, Zinserling J, Brandt A, Wirtz D, Benda N. Choosing appropriate estimands in clinical trials. Ther Innov Regul Sci. 2015;49(4):584–592.
  • 7. International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. Addendum on estimands and sensitivity analysis in clinical trials: to the guideline on statistical principles for clinical trials E9(R1). https://database.ich.org/sites/default/files/E9-R1_Step4_Guideline_2019_1203.pdf. 2019. Accessed February 13, 2020.
  • 8. Mehrotra DV, Hemmings RJ, Russek-Cohen E. Seeking harmony: estimands and sensitivity analyses for confirmatory clinical trials. Clin Trials. 2016;13(4):456–458.
  • 9. Koch GG, Wiener LE. Commentary for the Missing Data Working Group’s perspective for regulatory clinical trials, estimands, and sensitivity analyses. Stat Med. 2016;35(17):2887–2893.
  • 10. Permutt T. A taxonomy of estimands for regulatory clinical trials with discontinuations. Stat Med. 2016;35(17):2865–2875.
  • 11. Phillips A, Abellan-Andres J, Soren A, et al. Estimands: discussion points from the PSI estimands and sensitivity expert group. Pharm Stat. 2017;16(1):6–11.
  • 12. Wu Z, Frangakis CE, Louis TA, Scharfstein DO. Estimation of treatment effects in matched-pair cluster randomized trials by calibrating covariate imbalance between clusters. Biometrics. 2014;70(4):1014–1022.
  • 13. Hudgens MG, Halloran ME. Toward causal inference with interference. J Am Stat Assoc. 2008;103(482):832–842.
  • 14. Sur D, Lopez AL, Kanungo S, et al. Efficacy and safety of a modified killed-whole-cell oral cholera vaccine in India: an interim analysis of a cluster-randomised, double-blind, placebo-controlled trial. Lancet. 2009;374(9702):1694–1702.
  • 15. PATH. Inactivated influenza vaccine effectiveness in tropical Africa. 2013. https://clinicaltrials.gov/ct2/show/record/NCT00893906. Identifier: NCT00893906. Accessed February 8, 2019.
  • 16. Frangakis CE, Rubin DB, Zhou X-H. Clustered encouragement designs with individual noncompliance: Bayesian inference with randomization, and application to advance directive forms. Biostatistics. 2002;3(2):147–164.
  • 17. Kang H, Keele L. Spillover effects in cluster randomized trials with noncompliance. arXiv preprint arXiv:1808.06418. 2018.
  • 18. Cox DR. Planning of Experiments. New York: Wiley; 1958.
  • 19. Sobel ME. What do randomized studies of housing mobility demonstrate? Causal inference in the face of interference. J Am Stat Assoc. 2006;101(476):1398–1407.
  • 20. Bjune G, Høiby EA, Grønnesby JK, et al. Effect of outer membrane vesicle vaccine against group B meningococcal disease in Norway. Lancet. 1991;338(8775):1093–1096.
  • 21. Halloran ME, Longini IM, Cowart DM, Nizam A. Community interventions and the epidemic prevention potential. Vaccine. 2002;20(27–28):3254–3262.
  • 22. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol. 1974;66(5):688–701.
  • 23. Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58(1):21–29.
  • 24. Miratrix LW, Sekhon JS, Yu B. Adjusting treatment effect estimates by post-stratification in randomized experiments. J R Stat Soc B. 2013;75(2):369–396.
  • 25. Imbens GW, Rubin DB. Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. New York: Cambridge University Press; 2015.
  • 26. Athey S, Imbens GW. The econometrics of randomized experiments. In Joe Kruze, Handbook of Economic Field Experiments. North-Holland, an imprint of Elsevier; Amsterdam, Netherlands, Vol 1; 2017:73–140.
  • 27. Tsiatis AA, Davidian M, Zhang M, Lu X. Covariate adjustment for two-sample treatment comparisons in randomized clinical trials: a principled yet flexible approach. Stat Med. 2008;27(23):4658–4677.
  • 28. Halloran ME, Struchiner CJ. Study designs for dependent happenings. Epidemiology. 1991;2(5):331–338.
  • 29. VanderWeele TJ, Tchetgen Tchetgen EJ. Effect partitioning under interference in two-stage randomized vaccine trials. Stat Probab Lett. 2011;81(7):861–869.
  • 30. Liu L, Hudgens MG, Becker-Dreps S. On inverse probability-weighted estimators in the presence of interference. Biometrika. 2016;103(4):829–842.
  • 31. Eck DJ, Morozova O, Crawford FW. Randomization for the direct effect of an infectious disease intervention in a clustered study population. arXiv preprint arXiv:1808.05593. 2018.
  • 32. Sävje F, Aronow PM, Hudgens MG. Average treatment effects in the presence of unknown interference. Ann Stat. 2020. (In press)
  • 33. Pearl J. Does obesity shorten life? Or is it the soda? On non-manipulable causes. J Causal Inference. 2018;6(2):1–7.
  • 34. Holland PW. Statistics and causal inference. J Am Stat Assoc. 1986;81(396):945–960.
  • 35. Tchetgen Tchetgen EJ, VanderWeele TJ. On causal inference in the presence of interference. Stat Methods Med Res. 2012;21(1):55–75.
  • 36. Perez-Heydrich C, Hudgens MG, Halloran ME, Clemens JD, Ali M, Emch ME. Assessing effects of cholera vaccination in the presence of interference. Biometrics. 2014;70(3):731–741.
  • 37. Hernán MA. The hazards of hazard ratios. Epidemiology. 2010;21(1):13–15.
  • 38. Lipsitch M, Tchetgen Tchetgen E, Cohen T. Negative controls: a tool for detecting confounding and bias in observational studies. Epidemiology. 2010;21(3):383–388.
  • 39. Tchetgen Tchetgen E. The control outcome calibration approach for causal inference with unobserved confounding. Am J Epidemiol. 2013;179(5):633–640.
  • 40. Baird S, Bohren JA, McIntosh C, Özler B. Optimal design of experiments in the presence of interference. Rev Econ Stat. 2018;100(5):844–860.
  • 41. Basse G, Feller A. Analyzing two-stage experiments in the presence of interference. J Am Stat Assoc. 2018;113(521):41–55.
  • 42. Barkley BG, Hudgens MG, Clemens JD, Ali M, Emch ME. Causal inference from observational studies with clustered interference. Ann Appl Stat. 2020. (In press)
  • 43. Papadogeorgou G, Mealli F, Zigler CM. Causal inference for interfering units with cluster and population level treatment allocation programs. Biometrics. 2019;75(3):778–787.
  • 44. Westreich D. From patients to policy: population intervention effects in epidemiology. Epidemiology. 2017;28(4):525–528.
