Author manuscript; available in PMC: 2012 Jul 1.
Published in final edited form as: Stat Probab Lett. 2011 Jul 1;81(7):861–869. doi: 10.1016/j.spl.2011.02.019

Effect partitioning under interference in two-stage randomized vaccine trials

Tyler J VanderWeele 1, Eric J Tchetgen Tchetgen 1,1
PMCID: PMC3084013  NIHMSID: NIHMS287377  PMID: 21532912

Abstract

In the presence of interference, the exposure of one individual may affect the outcomes of others. We provide new effect partitioning results under interference that express the overall effect as a sum of (i) the indirect (or spillover) effect and (ii) a contrast between two direct effects.

Keywords: Causal inference, counterfactual, interference, SUTVA, randomized experiments, spillover effects, vaccine trials

1. Introduction

In vaccine trials, the effect of a vaccine intervention on an individual’s disease outcome depends not only on whether an individual receives a vaccine but may also depend on what proportion of the population in the individual’s community receives the vaccine. In such settings when the outcome of one individual depends on the treatment or exposure status of other individuals, interference is said to be present. Drawing inferences about cause and effect is considerably more challenging in settings with such interference (Hong and Raudenbush, 2006; Sobel, 2006; Rosenbaum, 2007; Hudgens and Halloran, 2008; VanderWeele, 2010; Tchetgen Tchetgen and VanderWeele, 2011). One design that facilitates causal inference in the presence of interference in the vaccine context is a two-stage randomized experiment in which specific clusters are randomized to having a certain proportion of the cluster vaccinated (e.g. 50% versus 30%) and then, within each cluster, once the proportion is determined, individuals are randomized to receive the vaccine or not. In the second stage of such randomized experiments, either a fixed proportion of individuals are vaccinated (what we will call type A) or alternatively each individual is independently randomized to have the vaccine with some fixed probability (what we will call type B). Such designs are sometimes also referred to as split-plot or pseudo-cluster randomized designs.

In the presence of such interference, for each possible allocation of vaccine, it is of interest to know the extent to which an individual is protected as a result of receiving a vaccine and the extent to which the individual is protected due to others in the same cluster receiving the vaccine. The distinction between these two is sometimes referred to as the direct and indirect effects of a vaccine respectively. Such issues arise, for example, in attempting to assess the extent of herd immunity conferred by a cholera vaccine (Ali et al., 2005) and we consider an example similar to this as an illustration later in the paper. Hudgens and Halloran (2008) showed how under type A randomization one could define and identify effects they called the overall, total, direct and indirect effects even in the presence of interference; it is also possible to obtain finite sample confidence intervals for these effects under type A randomization (Tchetgen Tchetgen and VanderWeele, 2011).

Building on the work of Halloran and Struchiner (1995), Hudgens and Halloran (2008) also formally showed that the total effect could be partitioned into the sum of what they defined as a direct effect and an indirect effect. Suppose in the first stage of the randomized experiment, clusters are randomized to receive either allocation program α0 or α1. The total effect, as defined by Hudgens and Halloran (2008), is the effect on an individual comparing two scenarios, one in which the individual receives the vaccine and is in a cluster that receives allocation α1 and the second in which the individual does not receive the vaccine and is in a cluster that receives allocation α0; the population average total effect is then simply the individual total effect averaged over all individuals and clusters. This contrasts with the overall effect which is simply a comparison of the proportions who develop the outcome under allocation α1 versus under allocation α0. The effect partitioning result of Hudgens and Halloran (2008) shows that the total effect is the sum of a direct effect and an indirect effect where the direct effect is defined as the effect on an individual of having the vaccine in a cluster with allocation α1 versus not having the vaccine in a cluster with allocation α1 and the indirect effect is defined as the effect on an individual of not having the vaccine in a cluster with allocation α1 versus not having the vaccine in a cluster with allocation α0.

The effect partitioning result of Hudgens and Halloran (2008) is subject to two important criticisms. First, it is an effect partitioning result for the total effect, not the overall effect, and it is arguably the overall effect that is of the greatest policy interest. The overall effect, for example, would compare infection rates under two different allocations vaccinating, say, 50% versus 30% of each cluster. It is thus arguably the overall effect for which one would like a partition into direct and indirect components. Second, under the partitioning result given by Hudgens and Halloran (2008), the direct effect does not carry a clear interpretation as a "direct effect." In the definition given by Hudgens and Halloran (2008), the direct effect is the effect of having the vaccine in a cluster with allocation α1 versus not having the vaccine in a cluster with allocation α1. In this comparison, of the remaining individuals in the cluster other than the one in view, one fewer receives treatment in the former scenario than in the latter (so as to maintain overall allocation α1). The "direct effect" as defined by Hudgens and Halloran (2008) thus captures both the effect of an individual’s being vaccinated and the effect of one fewer of the other individuals in the cluster being vaccinated. It is thus not entirely clear in what sense the label "direct" is justified, as the vaccine/treatment status of the other individuals in the cluster is not being held fixed. Sobel (2006) and Hudgens and Halloran (2008) also consider a setting in which one of the allocations, α0, leaves everyone untreated and show that the overall effect can then be partitioned as the sum of the probability of being treated under α1 times the total effect plus the probability of being untreated under α1 times the indirect effect. However, this is not a partitioning into a direct and an indirect effect.

In this paper we give new effect partitioning results. We partition the overall effect, not the total effect, and we show that the overall effect is the sum of an indirect effect and a contrast between two direct effects. These two direct effects can in fact be appropriately interpreted as "direct" effects, i.e. effects for which the treatments of all other individuals in the cluster are held constant. We show that this partitioning holds under both randomization schemes of type A and type B, albeit under slightly different definitions of the direct and indirect effect in each of these two cases.

2. Interference and the Effect Partitioning of Hudgens and Halloran (2008)

In this section we review concepts and notation for causal inference under interference as described in Hudgens and Halloran (2008) and Tchetgen Tchetgen and VanderWeele (2011), and we review the effect partitioning result of Hudgens and Halloran (2008). Suppose there are $N > 1$ clusters in the study and, for $i = 1,\ldots,N$, let $n_i$ denote the number of individuals in cluster $i$. Let $A_i \equiv (A_{i1},\ldots,A_{in_i})$ denote the treatments received by individuals in cluster $i$, where $A_{ij}$ is a dichotomous random variable. Let $A_{i,-j} = (A_{i1},\ldots,A_{i,j-1},A_{i,j+1},\ldots,A_{in_i})$ denote the treatments received by individuals in cluster $i$ other than that of individual $j$. We refer to $A_i$ as a treatment or allocation program, to distinguish it from the individual treatment $A_{ij}$. Let $\mathcal{A}(n)$ be the set of possible treatment allocations of length $n$; thus $A_i$ takes one of $2^{n_i}$ possible values in $\mathcal{A}(n_i)$. For each cluster $i$, we assume there exist counterfactual (potential outcome) data $Y_i(\cdot) = \{Y_i(a_i) : a_i \in \mathcal{A}(n_i)\}$ where $Y_i(a_i) = \{Y_{i1}(a_i),\ldots,Y_{in_i}(a_i)\}$, and $Y_{ij}(a_i)$ is the outcome for individual $j$ in cluster $i$ had treatment allocation in cluster $i$ been $a_i$. The observed outcome $Y_{ij}$ for individual $j$ in cluster $i$ is equal to the potential outcome $Y_{ij}(A_i)$ under the realized treatment allocation $A_i$. Because $Y_{ij}(a_i)$ may depend on the entire vector $a_i$, we allow for the outcome of one individual to depend on the treatment received by other individuals in the same cluster, i.e. we allow for interference. We assume that counterfactuals for an individual in cluster $i$ do not depend on treatment assignments of individuals in a different cluster $i' \neq i$ (Sobel, 2006; Hudgens and Halloran, 2008). The ordinary no interference assumption (Cox, 1958; Rubin, 1980) generally made in the causal inference literature is then that, for all $i$ and $j$, if $a_i$ and $a_i'$ are such that $a_{ij} = a_{ij}'$ then $Y_{ij}(a_i) = Y_{ij}(a_i')$. We suppose that $Y_i(\cdot)$ is fixed for each individual and that the treatment allocation only determines which potential outcome is observed for each individual.

In a two-stage randomized experiment evaluating two allocation programs, each cluster is first randomized to one of the two allocation programs, say α0 and α1, and then within each cluster, individuals are randomized to receive treatment with probability determined by the allocation their cluster was assigned. We let πi (Ai; α) denote the probability that treatment in cluster i takes value Ai under allocation program α. Following Tchetgen Tchetgen and VanderWeele (2011), we will consider two different randomization parameterizations within each cluster as given in the next definition.

Definition 1: (A) A parametrization is said to be of type A with parameter $\alpha = (n_i, K_{0,i})$ for cluster $i$ if the treatment program $A_i$ in cluster $i$ is randomly allocated conditional on $\mathbf{1}_{n_i}^{T} A_i = \sum_{j=1}^{n_i} A_{ij} = K_{0,i}$ with probability mass function

$$\pi_i(A_i;\alpha) = I\{A_i \in \mathcal{A}(n_i, K_{0,i})\} \Big/ \binom{n_i}{K_{0,i}}$$

(B) A parametrization is said to be of type B with parameter $\alpha$ if treatment program $A_i$ is randomly assigned to individuals in cluster $i$ according to the known Bernoulli probability mass function

$$\pi_i(A_i;\alpha) = \prod_{j=1}^{n_i} \alpha^{A_{ij}} (1-\alpha)^{1-A_{ij}}$$

where 0 < α < 1.

A type A parameterization might assign half the subjects in a cluster to Aij = 1 and half to Aij = 0. A type B parameterization in contrast might independently assign each individual Aij = 1 with probability one half and Aij = 0 with probability one half, so that there is a positive probability that treatment is given to all individuals in the cluster (or to none). Throughout we will use α to reference a specific allocation (e.g. assign 50% of individuals in the cluster the vaccine or 30% of individuals the vaccine) and we will also use α to denote the proportion vaccinated (e.g. 50% or 30%). Further below, we will also denote probability distributions arising from a specific allocation α by Prα.
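The contrast between the two parameterizations in Definition 1 can be sketched numerically. The snippet below is a minimal illustration, not code from the paper: the function names `pmf_type_a` and `pmf_type_b` and the cluster size `n = 4` are assumptions chosen for the example.

```python
import itertools
from math import comb

def pmf_type_a(a, k):
    """Type A: uniform over allocations with exactly k treated, zero otherwise."""
    n = len(a)
    return 1.0 / comb(n, k) if sum(a) == k else 0.0

def pmf_type_b(a, alpha):
    """Type B: each individual treated independently with probability alpha."""
    p = 1.0
    for a_j in a:
        p *= alpha if a_j == 1 else 1.0 - alpha
    return p

n = 4  # hypothetical cluster size
allocations = list(itertools.product([0, 1], repeat=n))

# Both mass functions sum to one over the 2^n possible allocations.
total_a = sum(pmf_type_a(a, 2) for a in allocations)
total_b = sum(pmf_type_b(a, 0.5) for a in allocations)

# Treating everyone is impossible under type A with k = 2,
# but has positive probability under type B with alpha = 0.5.
p_all_treated_a = pmf_type_a((1, 1, 1, 1), 2)
p_all_treated_b = pmf_type_b((1, 1, 1, 1), 0.5)
```

Under type A the number treated is fixed by design, whereas under type B only its expectation is fixed.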

Hudgens and Halloran (2008) considered comparing two treatment allocation programs of type A, e.g. allocation programs α0 and α1. Building on the work of Halloran and Struchiner (1995) and Sobel (2006), Hudgens and Halloran (2008) define the direct, indirect, total and overall effects for type A parameterizations. So as to better correspond with prior causal inference literature we reverse the roles played by 0 and 1 in Hudgens and Halloran (2008). However, none of the substantive points made below are affected by doing so. They define the individual direct effect for individual $j$ in group $i$ when all other individuals in the group receive treatment $a_{i,-j}$ by $DE_{ij}(a_{i,-j}) = Y_{ij}(a_{i,-j}, a_{ij}=1) - Y_{ij}(a_{i,-j}, a_{ij}=0)$; the individual indirect effect comparing treatment allocations $a_i$ and $a_i'$ by $IE_{ij}(a_{i,-j}, a_{i,-j}') = Y_{ij}(a_{i,-j}, a_{ij}=0) - Y_{ij}(a_{i,-j}', a_{ij}=0)$; the individual total effect by $TE_{ij}(a_{i,-j}, a_{i,-j}') = Y_{ij}(a_{i,-j}, a_{ij}=1) - Y_{ij}(a_{i,-j}', a_{ij}=0)$; and the individual overall effect by $OE_{ij}(a_i, a_i') = Y_{ij}(a_i) - Y_{ij}(a_i')$. Sobel (2006) refers to the indirect effect as a "spillover effect." Note that under the assumption of no interference between individuals in a group, the individual indirect effect is equal to zero. Hudgens and Halloran (2008) also give corresponding definitions for the individual average direct, indirect, total and overall effects as stated in Definition 2 below. The superscript H in the notation given will denote definitions provided by Hudgens and Halloran so that we can distinguish this notation from similar definitions given in the next section.

Definition 2 (Individual Average Direct, Indirect, Total and Overall Effects of Hudgens and Halloran (2008)): Let

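$$\bar{Y}^{H}_{ij}(a;\alpha) = \sum_{s\in\mathcal{A}(n_i-1)} Y_{ij}(a_{i,-j}=s, a_{ij}=a)\,\Pr\nolimits_{\alpha}(A_{i,-j}=s \mid A_{ij}=a)$$

where

$$\Pr\nolimits_{\alpha}(A_{i,-j}=s \mid A_{ij}=a) = \frac{\pi_i(A_{i,-j}=s, A_{ij}=a;\alpha)}{\sum_{s'\in\mathcal{A}(n_i-1)} \pi_i(A_{i,-j}=s', A_{ij}=a;\alpha)}$$

and define

$$\begin{aligned}
\overline{DE}^{H}_{ij}(\alpha) &= \bar{Y}^{H}_{ij}(1;\alpha) - \bar{Y}^{H}_{ij}(0;\alpha) \\
\overline{IE}^{H}_{ij}(\alpha_1,\alpha_0) &= \bar{Y}^{H}_{ij}(0;\alpha_1) - \bar{Y}^{H}_{ij}(0;\alpha_0) \\
\overline{TE}^{H}_{ij}(\alpha_1,\alpha_0) &= \bar{Y}^{H}_{ij}(1;\alpha_1) - \bar{Y}^{H}_{ij}(0;\alpha_0).
\end{aligned}$$

Finally let

$$\bar{Y}^{H}_{ij}(\alpha) = \sum_{s\in\mathcal{A}(n_i)} Y_{ij}(a_i=s)\,\pi_i(A_i=s;\alpha)$$

and define $\overline{OE}^{H}_{ij}(\alpha_1,\alpha_0) = \bar{Y}^{H}_{ij}(\alpha_1) - \bar{Y}^{H}_{ij}(\alpha_0)$.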

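Note that individual average direct, indirect, total and overall effects are defined with respect to comparing two allocations, α1 and α0, and are averaged over all possible treatment assignments under these allocations. Hudgens and Halloran (2008) go on to define the group average direct effect by $\overline{DE}^{H}_{i}(\alpha) = \sum_{j=1}^{n_i} \overline{DE}^{H}_{ij}(\alpha)/n_i$ and the population average direct effect by $\overline{DE}^{H}(\alpha) = \sum_{i=1}^{N} \overline{DE}^{H}_{i}(\alpha)/N$; the group average indirect effect as $\overline{IE}^{H}_{i}(\alpha_1,\alpha_0) = \sum_{j=1}^{n_i} \overline{IE}^{H}_{ij}(\alpha_1,\alpha_0)/n_i$ and the population average indirect effect as $\overline{IE}^{H}(\alpha_1,\alpha_0) = \sum_{i=1}^{N} \overline{IE}^{H}_{i}(\alpha_1,\alpha_0)/N$; the group average total effect by $\overline{TE}^{H}_{i}(\alpha_1,\alpha_0) = \sum_{j=1}^{n_i} \overline{TE}^{H}_{ij}(\alpha_1,\alpha_0)/n_i$ and the population average total effect by $\overline{TE}^{H}(\alpha_1,\alpha_0) = \sum_{i=1}^{N} \overline{TE}^{H}_{i}(\alpha_1,\alpha_0)/N$; and finally, the group average overall effect by $\overline{OE}^{H}_{i}(\alpha_1,\alpha_0) = \sum_{j=1}^{n_i} \overline{OE}^{H}_{ij}(\alpha_1,\alpha_0)/n_i$ and the population average overall effect by $\overline{OE}^{H}(\alpha_1,\alpha_0) = \sum_{i=1}^{N} \overline{OE}^{H}_{i}(\alpha_1,\alpha_0)/N$.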

It follows immediately from their definitions that total effects at the individual, group or population levels can be decomposed as the sum of direct and indirect causal effects at the corresponding level. That is, for example, $\overline{TE}^{H}_{ij}(\alpha_1,\alpha_0) = \overline{DE}^{H}_{ij}(\alpha_1) + \overline{IE}^{H}_{ij}(\alpha_1,\alpha_0)$ (Hudgens and Halloran, 2008). Note that if interference is absent then the indirect effect, $\overline{IE}^{H}_{ij}(\alpha_1,\alpha_0)$, is zero, and the total and direct effects coincide and are equal to a comparison of vaccinating everyone versus vaccinating no one.
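The one-line step behind this decomposition can be written out explicitly by adding and subtracting the average outcome of an untreated individual under allocation α1. This is a sketch using only the quantities of Definition 2:

```latex
\begin{aligned}
\overline{TE}^{H}_{ij}(\alpha_1,\alpha_0)
  &= \bar{Y}^{H}_{ij}(1;\alpha_1) - \bar{Y}^{H}_{ij}(0;\alpha_0) \\
  &= \bigl[\bar{Y}^{H}_{ij}(1;\alpha_1) - \bar{Y}^{H}_{ij}(0;\alpha_1)\bigr]
   + \bigl[\bar{Y}^{H}_{ij}(0;\alpha_1) - \bar{Y}^{H}_{ij}(0;\alpha_0)\bigr] \\
  &= \overline{DE}^{H}_{ij}(\alpha_1) + \overline{IE}^{H}_{ij}(\alpha_1,\alpha_0).
\end{aligned}
```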

Let S ≡ (S1,…, SN) denote the first stage of randomization group assignments with Si = 1 if cluster i is assigned to α1 and 0 if cluster i is assigned to α0. Let η denote the parametrization for the distribution of S and suppose that in a two-stage randomized experiment {η, α0, α1} are Type A parametrizations.

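Suppose $S_i = 1$ corresponds to allocation program $\alpha$, then let $\hat{Y}_i(a;\alpha) = \sum_{j=1}^{n_i} I(A_{ij}=a)Y_{ij}(A_i) \big/ \sum_{j=1}^{n_i} I(A_{ij}=a)$, $\hat{Y}(a;\alpha) = \sum_{i=1}^{N} \hat{Y}_i(a;\alpha) I(S_i=1) \big/ \sum_{i=1}^{N} I(S_i=1)$, $\hat{Y}_i(\alpha) = \sum_{j=1}^{n_i} Y_{ij}(A_i)/n_i$, and $\hat{Y}(\alpha) = \sum_{i=1}^{N} \hat{Y}_i(\alpha) I(S_i=1) \big/ \sum_{i=1}^{N} I(S_i=1)$. Hudgens and Halloran (2008) proposed the following estimators:

$$\begin{aligned}
\widehat{DE}(\alpha_0) &= \hat{Y}(1;\alpha_0) - \hat{Y}(0;\alpha_0), \\
\widehat{IE}(\alpha_1,\alpha_0) &= \hat{Y}(0;\alpha_1) - \hat{Y}(0;\alpha_0), \\
\widehat{TE}(\alpha_1,\alpha_0) &= \hat{Y}(1;\alpha_1) - \hat{Y}(0;\alpha_0), \\
\widehat{OE}(\alpha_1,\alpha_0) &= \hat{Y}(\alpha_1) - \hat{Y}(\alpha_0),
\end{aligned}$$

which they showed to be unbiased, i.e. $E\{\widehat{DE}(\alpha_0)\} = \overline{DE}^{H}(\alpha_0)$, $E\{\widehat{IE}(\alpha_1,\alpha_0)\} = \overline{IE}^{H}(\alpha_1,\alpha_0)$, $E\{\widehat{TE}(\alpha_1,\alpha_0)\} = \overline{TE}^{H}(\alpha_1,\alpha_0)$, $E\{\widehat{OE}(\alpha_1,\alpha_0)\} = \overline{OE}^{H}(\alpha_1,\alpha_0)$, where the expectation is taken with respect to the joint density of $(S, A_1, \ldots, A_N)$. Tchetgen Tchetgen and VanderWeele (2011) provided finite sample confidence intervals for these estimators.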

As noted above, it follows from the definitions of Hudgens and Halloran (2008) that total effects at the individual, group or population levels can be partitioned into the sum of direct and indirect causal effects at the corresponding level. That is, for example, $\overline{TE}^{H}_{ij}(\alpha_1,\alpha_0) = \overline{DE}^{H}_{ij}(\alpha_1) + \overline{IE}^{H}_{ij}(\alpha_1,\alpha_0)$. This effect partitioning is subject to two potential criticisms. First, the partitioning is for the total effect rather than for the overall effect. The overall effect is the comparison of the expected proportion with the outcome under allocation α1 versus the expected proportion with the outcome under allocation α0. It is arguably the most important policy effect of interest: it compares the infection rates that would arise under two different proportions allocated (e.g. 50% versus 30%). The group average total effect in contrast compares a scenario in which an individual is treated in a cluster with allocation α1 to a scenario in which the individual is not treated in a cluster with allocation α0 and then averages this contrast over all individuals in the cluster. This total effect contrast is of less interest insofar as, if α1 and α0 are allocations in which some individuals do and some do not receive treatment, then it is impossible that each individual receives the treatment under α1 as considered in the first scenario concerning the total effect. The total effect thus, while of some interest from a hypothetical standpoint, is of much less interest from a policy perspective.

The second potential criticism of the effect partitioning of Hudgens and Halloran (2008) is that it is not entirely clear that what they defined as a direct effect, $\overline{DE}^{H}_{ij}(\alpha) = \bar{Y}^{H}_{ij}(1;\alpha) - \bar{Y}^{H}_{ij}(0;\alpha)$, actually merits the label "direct effect." The direct effect of Hudgens and Halloran (2008) compares a scenario in which an individual is treated in a cluster with allocation α to a scenario in which the individual is not treated in a cluster with allocation α. In this second scenario, however, because the individual is not treated and the cluster has allocation α, of the remaining individuals in the cluster, one more will be treated in the second scenario than in the first scenario because in the first scenario the individual himself receives treatment. The treatments of the other individuals in the cluster are not held fixed, even in proportion. To take an extreme scenario, suppose there were just two individuals in a cluster and the allocation program α consisted of treating just one of the two; then we would have that $\overline{DE}^{H}_{i1}(\alpha) = Y_{i1}((1,0)) - Y_{i1}((0,1))$, i.e. a comparison for individual 1 of individual 1’s receiving treatment (but not individual 2) versus individual 2’s receiving treatment (but not individual 1). This effect may be of interest but it is not clear it merits the label "direct effect" because the treatments of both individuals are being changed in the comparison being made. What is being held "fixed" in the direct effect of Hudgens and Halloran (2008) is effectively the allocation program, not the treatments received by other individuals.
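The two-individual example can be made concrete with a small sketch. The potential outcome values below are hypothetical numbers invented purely for illustration; only the structure of the comparison comes from the text.

```python
# Hypothetical potential outcomes (e.g. infection risk) for individual 1 in a
# two-person cluster, keyed by the allocation (a_1, a_2). Illustrative only.
y1 = {(0, 0): 0.40, (1, 0): 0.10, (0, 1): 0.25, (1, 1): 0.05}

# Type A with n = 2 and K = 1: the only possible allocations are (1, 0) and
# (0, 1). Conditional on a_1 = 1 the other individual must be untreated;
# conditional on a_1 = 0 the other individual must be treated.
ybar_h_treated = y1[(1, 0)]
ybar_h_untreated = y1[(0, 1)]
de_h = ybar_h_treated - ybar_h_untreated

# Contrasts that hold individual 2's treatment fixed instead.
de_other_untreated = y1[(1, 0)] - y1[(0, 0)]
de_other_treated = y1[(1, 1)] - y1[(0, 1)]
```

With these illustrative values `de_h` differs from both fixed-treatment contrasts, because individual 2's treatment changes along with individual 1's.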

In the next section we provide new effect partitioning results to remedy both of these potential criticisms. We give an effect partitioning result for the overall effect (rather than the total effect) and moreover the direct effect within the partitioning truly does have an interpretation of a "direct effect."

3. New Effect Partitioning Results for the Overall Effect

We begin with an effect partitioning result for two-stage randomized experiments under Type B parameterizations, wherein each individual has a fixed probability of receiving treatment independent of the treatments received by other individuals in the cluster. We then show that a similar result also holds for two-stage randomized experiments under Type A parameterizations. We first define the effects of interest under type B parameterizations.

Definition 3 (Individual Average Direct, Indirect, Total and Overall Effects Under Type B Parameterizations): Let

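$$\begin{aligned}
\bar{Y}^{B}_{ij}(a;\alpha) &= \sum_{s\in\mathcal{A}(n_i-1)} Y_{ij}(a_{i,-j}=s, a_{ij}=a)\,\Pr\nolimits_{\alpha}(A_{i,-j}=s) \\
\bar{Y}^{B}_{ij}(\alpha) &= \sum_{s\in\mathcal{A}(n_i)} Y_{ij}(a_i=s)\,\Pr\nolimits_{\alpha}(A_i=s)
\end{aligned}$$

where $\Pr\nolimits_{\alpha}(A_i=s) = \prod_{j=1}^{n_i} \alpha^{s_j}(1-\alpha)^{1-s_j}$ and define

$$\begin{aligned}
\overline{DE}^{B}_{ij}(\alpha) &= \bar{Y}^{B}_{ij}(1;\alpha) - \bar{Y}^{B}_{ij}(0;\alpha) \\
\overline{IE}^{B}_{ij}(\alpha_1,\alpha_0) &= \bar{Y}^{B}_{ij}(0;\alpha_1) - \bar{Y}^{B}_{ij}(0;\alpha_0) \\
\overline{OE}^{B}_{ij}(\alpha_1,\alpha_0) &= \bar{Y}^{B}_{ij}(\alpha_1) - \bar{Y}^{B}_{ij}(\alpha_0).
\end{aligned}$$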

Note that for the effects in Definition 3, in contrast with those in Definition 2, the potential outcomes are averaged over the unconditional distribution of Ai,−j, whereas in Definition 2 this averaging takes place over the conditional distribution of Ai,−j given Aij = a.

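Group average ($\overline{DE}^{B}_{i}(\alpha)$, $\overline{IE}^{B}_{i}(\alpha_1,\alpha_0)$, $\overline{OE}^{B}_{i}(\alpha_1,\alpha_0)$) and population average ($\overline{DE}^{B}(\alpha)$, $\overline{IE}^{B}(\alpha_1,\alpha_0)$, $\overline{OE}^{B}(\alpha_1,\alpha_0)$) direct, indirect and overall effects could be defined by averaging over individuals and then over clusters respectively. Suppose $S_i = 1$ corresponds to allocation program $\alpha$, and let $\hat{Y}_i(a;\alpha) = \sum_{j=1}^{n_i} I(A_{ij}=a)Y_{ij}(A_i) \big/ \sum_{j=1}^{n_i} I(A_{ij}=a)$, $\hat{Y}(a;\alpha) = \sum_{i=1}^{N} \hat{Y}_i(a;\alpha) I(S_i=1) \big/ \sum_{i=1}^{N} I(S_i=1)$, $\hat{Y}_i(\alpha) = \sum_{j=1}^{n_i} Y_{ij}(A_i)/n_i$, and $\hat{Y}(\alpha) = \sum_{i=1}^{N} \hat{Y}_i(\alpha) I(S_i=1) \big/ \sum_{i=1}^{N} I(S_i=1)$. It is then straightforward to show that under type B parameterizations, $\hat{Y}(1;\alpha) - \hat{Y}(0;\alpha)$, $\hat{Y}(0;\alpha_1) - \hat{Y}(0;\alpha_0)$ and $\hat{Y}(\alpha_1) - \hat{Y}(\alpha_0)$ are unbiased for the population average direct, indirect and overall effects respectively.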

Note that for two-stage randomized experiments under Type B parameterizations, the direct effect in Definition 3, defined by $\overline{DE}^{B}_{ij}(\alpha) = \bar{Y}^{B}_{ij}(1;\alpha) - \bar{Y}^{B}_{ij}(0;\alpha)$, now merits the label "direct effect." This direct effect, $\overline{DE}^{B}_{ij}(\alpha)$, compares what would happen to an individual with versus without treatment where in both scenarios every other individual received treatment with probability α. Our definition of the direct effect under parameterization B thus circumvents one of the two criticisms of the effect partitioning result of Hudgens and Halloran (2008). Our next result shows that the other criticism is circumvented as well.

Theorem 1 (Overall Effect Partitioning Under Type B Parameterization). Suppose πi (Ai; α0) and πi (Ai; α1) are type B parameterizations then

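$$\overline{OE}^{B}_{ij}(\alpha_1,\alpha_0) = \alpha_1\overline{DE}^{B}_{ij}(\alpha_1) - \alpha_0\overline{DE}^{B}_{ij}(\alpha_0) + \overline{IE}^{B}_{ij}(\alpha_1,\alpha_0).$$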

Theorem 1 partitions the overall effect, $\overline{OE}^{B}_{ij}(\alpha_1,\alpha_0)$, into two components: (i) the indirect effect, $\overline{IE}^{B}_{ij}(\alpha_1,\alpha_0)$, which compares the outcome of an individual without treatment when every other individual receives treatment with probability α1 versus with probability α0, and (ii) a contrast of two direct effects, under allocation α1 and under allocation α0, each multiplied by the probability of an individual receiving treatment under that allocation, i.e. $\alpha_1\overline{DE}^{B}_{ij}(\alpha_1) - \alpha_0\overline{DE}^{B}_{ij}(\alpha_0)$. This contrast is in many ways intuitive. Under allocation α0, each individual will have treatment with probability α0 and thus the direct effect, $\overline{DE}^{B}_{ij}(\alpha_0)$, will be in operation with probability α0. Likewise, under allocation α1, each individual will have treatment with probability α1 and thus the direct effect, $\overline{DE}^{B}_{ij}(\alpha_1)$, will be in operation with probability α1. For an overall change in allocation from α0 to a more extensive allocation α1, everyone will benefit from the indirect effect, $\overline{IE}^{B}_{ij}(\alpha_1,\alpha_0)$, of there being a larger proportion of individuals treated; and whereas a proportion α0 benefited from the direct effect $\overline{DE}^{B}_{ij}(\alpha_0)$ under allocation α0, a proportion α1 will now benefit from the direct effect of treatment, $\overline{DE}^{B}_{ij}(\alpha_1)$, under allocation α1.
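The identity in Theorem 1 can be checked by brute-force enumeration for a single individual. The following is a minimal sketch, assuming a hypothetical cluster of size 4 and arbitrary randomly generated potential outcomes; the helper names are illustrative and not from the paper.

```python
import itertools
import random

def pr_b(s, alpha):
    """Type B probability of a 0/1 vector s with treatment probability alpha."""
    p = 1.0
    for s_k in s:
        p *= alpha if s_k == 1 else 1.0 - alpha
    return p

def with_j(s_others, j, a):
    """Full allocation: individual j set to a, the others set to s_others."""
    return s_others[:j] + (a,) + s_others[j:]

def ybar_b_given(y, n, j, a, alpha):
    """Ybar^B_ij(a; alpha): own treatment fixed at a, others averaged out."""
    return sum(y[with_j(s, j, a)] * pr_b(s, alpha)
               for s in itertools.product([0, 1], repeat=n - 1))

def ybar_b(y, n, alpha):
    """Ybar^B_ij(alpha): average over the full allocation vector."""
    return sum(y[s] * pr_b(s, alpha)
               for s in itertools.product([0, 1], repeat=n))

random.seed(0)
n, j = 4, 1                  # hypothetical cluster size and individual index
alpha1, alpha0 = 0.5, 0.3
# Arbitrary potential outcomes Y_ij(a_i) for one individual (made up).
y = {s: random.random() for s in itertools.product([0, 1], repeat=n)}

de1 = ybar_b_given(y, n, j, 1, alpha1) - ybar_b_given(y, n, j, 0, alpha1)
de0 = ybar_b_given(y, n, j, 1, alpha0) - ybar_b_given(y, n, j, 0, alpha0)
ie = ybar_b_given(y, n, j, 0, alpha1) - ybar_b_given(y, n, j, 0, alpha0)
oe = ybar_b(y, n, alpha1) - ybar_b(y, n, alpha0)
rhs = alpha1 * de1 - alpha0 * de0 + ie
```

Here `oe` and `rhs` agree up to floating-point error for any choice of potential outcomes, since the identity does not depend on the values in `y`.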

In fact, a similar decomposition holds for two-stage randomized experiments under parameterizations of type A. We first introduce alternative notions of direct and indirect effects under type A parameterization to those given by Hudgens and Halloran (2008).

Definition 4 (Alternative Individual Average Direct and Indirect Effects Under Type A Parameterizations): Let

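$$\begin{aligned}
\bar{Y}^{A}_{ij}(a;\alpha,a^{*}) &= \sum_{s\in\mathcal{A}(n_i-1)} Y_{ij}(a_{i,-j}=s, a_{ij}=a)\,\Pr\nolimits_{\alpha}(A_{i,-j}=s \mid a_{ij}=a^{*}) \\
\bar{Y}^{A}_{ij}(a;\alpha) &= \sum_{s\in\mathcal{A}(n_i)} Y_{ij}(a_{i,-j}=s\backslash s_j, a_{ij}=a)\,\Pr\nolimits_{\alpha}(A_i=s) \\
\bar{Y}^{A}_{ij}(\alpha) &= \sum_{s\in\mathcal{A}(n_i)} Y_{ij}(a_i=s)\,\Pr\nolimits_{\alpha}(A_i=s)
\end{aligned}$$

and define

$$\begin{aligned}
\overline{DE}^{A}_{ij}(\alpha,a) &= \bar{Y}^{A}_{ij}(1;\alpha,a) - \bar{Y}^{A}_{ij}(0;\alpha,a) \\
\overline{IE}^{A}_{ij}(\alpha_1,\alpha_0) &= \bar{Y}^{A}_{ij}(0;\alpha_1) - \bar{Y}^{A}_{ij}(0;\alpha_0) \\
\overline{OE}^{A}_{ij}(\alpha_1,\alpha_0) &= \bar{Y}^{A}_{ij}(\alpha_1) - \bar{Y}^{A}_{ij}(\alpha_0).
\end{aligned}$$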

Once again, group average ($\overline{DE}^{A}_{i}(\alpha,a)$, $\overline{IE}^{A}_{i}(\alpha_1,\alpha_0)$, $\overline{OE}^{A}_{i}(\alpha_1,\alpha_0)$) and population average ($\overline{DE}^{A}(\alpha,a)$, $\overline{IE}^{A}(\alpha_1,\alpha_0)$, $\overline{OE}^{A}(\alpha_1,\alpha_0)$) direct, indirect and overall effects could be defined by averaging over individuals and then clusters respectively. As with Definition 3, the definition of a direct effect given in Definition 4 again merits the label "direct effect." The direct effect in Definition 4, $\overline{DE}^{A}_{ij}(\alpha,a) = \bar{Y}^{A}_{ij}(1;\alpha,a) - \bar{Y}^{A}_{ij}(0;\alpha,a)$, compares the outcome for an individual with versus without treatment where in both scenarios all other individuals are assigned treatment under allocation α conditional on the individual in question having been given treatment a. This contrasts with the "direct effect" of Hudgens and Halloran in which the allocation rule for the other individuals in the cluster varied according to whether or not the individual in question received treatment. The indirect effect in Definition 4 is also subtly different from that of Hudgens and Halloran (2008). In the indirect effect in Definition 4, the individual is untreated in both scenarios, i.e. under allocation programs α1 and α0, but other individuals in the cluster are assigned treatment under allocations α1 or α0 without conditioning on the fact that the individual in question was untreated. As shown in the following theorem, the same effect partitioning effectively holds for the effects given in Definition 4 under type A parameterization as for those given in Definition 3 under type B parameterizations.

Theorem 2. (Overall Effect Partitioning Under Type A Parameterization). Suppose πi (Ai; α0) and πi (Ai; α1) are type A parameterizations with α1 = (ni,K1,i) and α0 = (ni,K0,i) then

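$$\overline{OE}^{A}_{ij}(\alpha_1,\alpha_0) = \frac{K_{1,i}}{n_i}\overline{DE}^{A}_{ij}(\alpha_1,1) - \frac{K_{0,i}}{n_i}\overline{DE}^{A}_{ij}(\alpha_0,1) + \overline{IE}^{A}_{ij}(\alpha_1,\alpha_0).$$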

Note that under allocation α1, the proportion receiving treatment is simply $K_{1,i}/n_i$ and under allocation α0, the proportion receiving treatment is simply $K_{0,i}/n_i$. The effect partitioning result in Theorem 2 for type A parameterization is thus entirely analogous to that in Theorem 1 for type B parameterization. Under type B parameterization, however, we were able to identify the population average direct, indirect and overall effects defined in Definition 3 and partitioned in Theorem 1. For type A parameterizations, the population average direct and indirect effects given in Definition 4 and partitioned in Theorem 2 are not identified. For example, population averages of $\bar{Y}^{A}_{ij}(1;\alpha,0)$ are not identified because for no cluster is there an individual j with treatment 1 in a cluster wherein all other individuals are assigned treatment under an allocation rule α conditional on individual j having been untreated; there are no clusters with this treatment arrangement because such a cluster would not in fact qualify as having been assigned allocation α. It may, however, be possible to identify the direct and indirect effects in Definition 4 under a modified type A two-stage randomized trial wherein randomization is extended to three stages, so that those assigned to allocation α are further randomized either to actual allocation α or to a modified allocation α with an individual randomly selected to receive treatment and all others assigned treatment under an allocation rule α conditional on individual j having been untreated. Such developments and considerations of alternative study designs will be left for future work.
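Although the Definition 4 effects are not identified from a type A trial, the identity in Theorem 2 can still be checked on full counterfactual data by enumeration. The sketch below assumes a hypothetical cluster of size 6 with K1 = 3 and K0 = 2 treated and made-up potential outcomes; all function names are illustrative.

```python
import itertools
import random
from math import comb

def pr_full(s, k):
    """Type A probability of a full allocation s with exactly k treated."""
    return 1.0 / comb(len(s), k) if sum(s) == k else 0.0

def pr_others(s, n, k, a_star):
    """Pr(A_{i,-j} = s | a_ij = a_star) under type A with parameter (n, k)."""
    k_rest = k - a_star  # treated individuals left among the other n - 1
    return 1.0 / comb(n - 1, k_rest) if sum(s) == k_rest else 0.0

def ybar_cond(y, n, j, a, k, a_star):
    """Ybar^A_ij(a; alpha, a_star): own treatment a, others drawn given a_star."""
    return sum(y[s[:j] + (a,) + s[j:]] * pr_others(s, n, k, a_star)
               for s in itertools.product([0, 1], repeat=n - 1))

def ybar_marg(y, n, j, a, k):
    """Ybar^A_ij(a; alpha): own treatment overwritten by a, s drawn from Pr(A_i = s)."""
    return sum(y[s[:j] + (a,) + s[j + 1:]] * pr_full(s, k)
               for s in itertools.product([0, 1], repeat=n))

def ybar(y, n, k):
    """Ybar^A_ij(alpha): average over the full allocation."""
    return sum(y[s] * pr_full(s, k)
               for s in itertools.product([0, 1], repeat=n))

random.seed(1)
n, j, k1, k0 = 6, 2, 3, 2    # hypothetical cluster size, individual, K1, K0
y = {s: random.random() for s in itertools.product([0, 1], repeat=n)}

de1 = ybar_cond(y, n, j, 1, k1, 1) - ybar_cond(y, n, j, 0, k1, 1)
de0 = ybar_cond(y, n, j, 1, k0, 1) - ybar_cond(y, n, j, 0, k0, 1)
ie = ybar_marg(y, n, j, 0, k1) - ybar_marg(y, n, j, 0, k0)
oe = ybar(y, n, k1) - ybar(y, n, k0)
rhs = (k1 / n) * de1 - (k0 / n) * de0 + ie
```

The quantity `ybar_cond(y, n, j, 1, k, 0)` could also be computed here, which is exactly the kind of term that no observed cluster in a type A trial can supply.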

Note that in a limit scenario in which each cluster effectively had infinite sample size, the direct and indirect effects in Definition 4 would essentially reduce to those of Hudgens and Halloran (2008) in Definition 2 because Prα(Ai,−j|Aij) ≈ Prα(Ai,−j) and these would then be identified by the estimators given in the previous section. A rigorous formalization of this in terms of limit arguments lies beyond the scope of this paper and will be left to future work. It is, however, the case that even if one retains the definitions of Hudgens and Halloran (2008) in Definition 2, essentially the same effect partitioning holds, as is stated in our final theorem.

Theorem 3. (Overall Effect Partitioning Under Parameterization A for the Direct and Indirect Effects of Hudgens and Halloran (2008)). Suppose πi (Ai; α0) and πi (Ai; α1) are type A parameterizations with α1 = (ni,K1,i) and α0 = (ni,K0,i) then

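$$\overline{OE}^{H}_{ij}(\alpha_1,\alpha_0) = \frac{K_{1,i}}{n_i}\overline{DE}^{H}_{ij}(\alpha_1) - \frac{K_{0,i}}{n_i}\overline{DE}^{H}_{ij}(\alpha_0) + \overline{IE}^{H}_{ij}(\alpha_1,\alpha_0).$$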

Here the population average direct and indirect effects, $\overline{DE}^{H}(\alpha_1)$ and $\overline{IE}^{H}(\alpha_1,\alpha_0)$, are identified under simple type A two-stage randomized experiments and the effect partitioning holds, but the direct effect does not carry a clear interpretation as a "direct" effect.
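Theorem 3 can be checked numerically in the same spirit. Because a type A allocation is uniform over its support, conditioning on an individual's own treatment amounts to a plain average over the allocations in the support with that treatment. The sketch below relies on that observation; the cluster size, K values and potential outcomes are assumptions made up for the example.

```python
import itertools
import random

def support(n, k):
    """All allocations of length n with exactly k treated (type A support)."""
    return [s for s in itertools.product([0, 1], repeat=n) if sum(s) == k]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def ybar_h_given(y, n, k, j, a):
    """Ybar^H_ij(a; alpha): average over support allocations with s_j = a."""
    return mean(y[s] for s in support(n, k) if s[j] == a)

def ybar_h(y, n, k):
    """Ybar^H_ij(alpha): average over the whole support."""
    return mean(y[s] for s in support(n, k))

random.seed(2)
n, j, k1, k0 = 6, 0, 3, 2    # hypothetical cluster size, individual, K1, K0
y = {s: random.random() for s in itertools.product([0, 1], repeat=n)}

de1 = ybar_h_given(y, n, k1, j, 1) - ybar_h_given(y, n, k1, j, 0)
de0 = ybar_h_given(y, n, k0, j, 1) - ybar_h_given(y, n, k0, j, 0)
ie = ybar_h_given(y, n, k1, j, 0) - ybar_h_given(y, n, k0, j, 0)
oe = ybar_h(y, n, k1) - ybar_h(y, n, k0)
rhs = (k1 / n) * de1 - (k0 / n) * de0 + ie
```

The check requires 0 < K < n for both allocations so that both conditioning events have positive probability.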

Theorems 1, 2 and 3 taken together show that the result partitioning the overall effect into the sum of an indirect effect and a contrast between two direct effects applies to the effects defined in Hudgens and Halloran (under type A allocations) and also to those given in this paper in Definition 4 (for type A allocations) and Definition 3 (for type B allocations).

4. Illustration

Consider the hypothetical data in Table 1 presented by Hudgens and Halloran (2008), based on a hypothetical trial similar to that of Ali et al. (2005) of a cholera vaccine, where 30% are allocated treatment under α0 (Si = 0) and 50% are allocated treatment under α1 (Si = 1).

Table 1.

Hypothetical data from Hudgens and Halloran (2008)

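| Group i | Assignment Si | Vaccinated (Aij = 1): Total, ∑j Aij | Vaccinated (Aij = 1): Cases, ∑j Aij Yij(Ai) | Placebo (Aij = 0): Total, ∑j (1 − Aij) | Placebo (Aij = 0): Cases, ∑j (1 − Aij) Yij(Ai) |
|---|---|---|---|---|---|
| 1 | 1 | 12,541 | 16 | 12,541 | 18 |
| 2 | 1 | 11,513 | 26 | 11,513 | 54 |
| 3 | 0 | 10,772 | 17 | 25,134 | 119 |
| 4 | 0 | 8,883 | 22 | 20,727 | 122 |
| 5 | 0 | 5,627 | 15 | 13,130 | 92 |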

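Suppose this had in fact come from a two-stage randomized experiment under type B parameterization. Our estimates of the population average direct, indirect and overall effects in Definition 3 per 1,000 individuals would then be $\overline{DE}^{B}(\alpha_0) = -3.64$, $\overline{DE}^{B}(\alpha_1) = -1.3$, $\overline{IE}^{B}(\alpha_1,\alpha_0) = -2.81$, $\overline{OE}^{B}(\alpha_1,\alpha_0) = -2.37$. We would then have the decomposition of the overall effect as

$$\overline{OE}^{B}(\alpha_1,\alpha_0) = \alpha_1\overline{DE}^{B}(\alpha_1) - \alpha_0\overline{DE}^{B}(\alpha_0) + \overline{IE}^{B}(\alpha_1,\alpha_0) \quad\text{i.e.}\quad -2.37 = (0.5)(-1.3) - (0.3)(-3.64) + (-2.81).$$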

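If the data had come from a two-stage randomized experiment under type A parameterization then, as in Hudgens and Halloran, the population average direct, indirect, total and overall effects in Definition 2 per 1,000 individuals would then be $\overline{DE}^{H}(\alpha_0) = -3.64$, $\overline{DE}^{H}(\alpha_1) = -1.3$, $\overline{IE}^{H}(\alpha_1,\alpha_0) = -2.81$, $\overline{TE}^{H}(\alpha_1,\alpha_0) = -4.11$, $\overline{OE}^{H}(\alpha_1,\alpha_0) = -2.37$. Hudgens and Halloran (2008) note that

$$\overline{TE}^{H}(\alpha_1,\alpha_0) = \overline{DE}^{H}(\alpha_1) + \overline{IE}^{H}(\alpha_1,\alpha_0) \quad\text{i.e.}\quad -4.11 = (-1.3) + (-2.81).$$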

Hudgens and Halloran comment that an estimate of the direct effect of the vaccine of −1.3, if all clusters had been given a 50% allocation, would underestimate the total effect of −4.11 considerably.

We note further here, illustrating Theorem 3, that

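$$\overline{OE}^{H}(\alpha_1,\alpha_0) = \alpha_1\overline{DE}^{H}(\alpha_1) - \alpha_0\overline{DE}^{H}(\alpha_0) + \overline{IE}^{H}(\alpha_1,\alpha_0) \quad\text{i.e.}\quad -2.37 = (0.5)(-1.3) - (0.3)(-3.64) + (-2.81).$$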

Here we see that the overall effect in fact is smaller in magnitude than the indirect effect. This is because the direct effect is reduced from −3.64 when only 30% are given the vaccine to −1.3 when 50% are given the vaccine. Thus under the allocation of 50%, although a greater proportion are vaccinated, this is offset by a reduced direct effect for the vaccine resulting in a net causative contribution for the direct effect contrast and consequently we see that the overall effect is smaller than the indirect effect.
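The estimates quoted in this section can be reproduced from Table 1 with a few lines of arithmetic. The sketch below is one way to do so, assuming the estimators are simple averages of cluster-level case rates as defined earlier; the variable and function names are illustrative.

```python
# Table 1 rows: (S_i, vaccinated total, vaccinated cases,
#                placebo total, placebo cases)
groups = [
    (1, 12541, 16, 12541, 18),
    (1, 11513, 26, 11513, 54),
    (0, 10772, 17, 25134, 119),
    (0, 8883, 22, 20727, 122),
    (0, 5627, 15, 13130, 92),
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def y_hat(a, s):
    """Average over clusters with S_i = s of the case rate among A_ij = a."""
    return mean(c_vac / n_vac if a == 1 else c_pla / n_pla
                for s_i, n_vac, c_vac, n_pla, c_pla in groups if s_i == s)

def y_hat_overall(s):
    """Average over clusters with S_i = s of the cluster-wide case rate."""
    return mean((c_vac + c_pla) / (n_vac + n_pla)
                for s_i, n_vac, c_vac, n_pla, c_pla in groups if s_i == s)

scale = 1000.0  # effects are reported per 1,000 individuals
de0 = scale * (y_hat(1, 0) - y_hat(0, 0))
de1 = scale * (y_hat(1, 1) - y_hat(0, 1))
ie = scale * (y_hat(0, 1) - y_hat(0, 0))
te = scale * (y_hat(1, 1) - y_hat(0, 0))
oe = scale * (y_hat_overall(1) - y_hat_overall(0))

# Net contribution of the direct effect contrast with alpha1 = 0.5, alpha0 = 0.3.
net_direct = 0.5 * de1 - 0.3 * de0
```

Rounded to two decimals these give −3.64, −1.30, −2.81, −4.11 and −2.37, and `net_direct` is positive, which is why the overall effect is smaller in magnitude than the indirect effect. The plug-in decomposition `net_direct + ie` matches `oe` only approximately, because the realized proportions vaccinated in groups 3 to 5 are not exactly 30%.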

5. Discussion

In the previous example, the overall effect was smaller than the indirect effect; this was because the net contribution of the direct effect contrast was causative: by increasing the numbers vaccinated, herd immunity was effectively granted and the direct effect was rendered smaller in magnitude. More generally we might refer to contrasts of the form $\alpha_1\overline{DE}^{A}(\alpha_1) - \alpha_0\overline{DE}^{A}(\alpha_0)$ or $\alpha_1\overline{DE}^{B}(\alpha_1) - \alpha_0\overline{DE}^{B}(\alpha_0)$ or $\alpha_1\overline{DE}^{H}(\alpha_1) - \alpha_0\overline{DE}^{H}(\alpha_0)$ as the "net direct effects contribution"; this may be in either direction. For allocation proportions under α1 that are greater than allocation proportions under α0, we might consider three scenarios: one in which $\overline{DE}^{A}(\alpha_1) < \overline{DE}^{A}(\alpha_0)$ and consequently the net contribution of the direct effects is protective; one in which $\overline{DE}^{A}(\alpha_1) > \overline{DE}^{A}(\alpha_0)$ but the increase in the proportion treated from α0 to α1 is sufficient to offset the diminished direct effect so that the net contribution of the direct effects is again protective; and finally, one in which $\overline{DE}^{A}(\alpha_1) > \overline{DE}^{A}(\alpha_0)$ but the increase in the proportion treated from α0 to α1 is not sufficient to offset the diminished direct effect so that the net contribution of the direct effects is causative. It was this third scenario that was illustrated in the previous section. As was also seen in the illustration, however, even if the net contribution of the direct effect is causative, the overall effect of treatment may still be highly protective.
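The three scenarios can be distinguished mechanically from the two direct effects and the two allocation proportions. The helper below is a hypothetical sketch that assumes α1 > α0 and protective (negative) direct effects; the first call uses the estimates from the illustration, and the other two use made-up values for the α1 direct effect.

```python
def net_direct_contribution(alpha1, de1, alpha0, de0):
    """alpha1 * DE(alpha1) - alpha0 * DE(alpha0)."""
    return alpha1 * de1 - alpha0 * de0

def classify(alpha1, de1, alpha0, de0):
    """Label the scenario, assuming alpha1 > alpha0 and negative direct effects."""
    net = net_direct_contribution(alpha1, de1, alpha0, de0)
    if de1 < de0:
        return "stronger direct effect, net protective"
    if net < 0:
        return "diminished direct effect, net protective"
    return "diminished direct effect, net causative"

# Estimates from the illustration (per 1,000 individuals).
illustration = classify(0.5, -1.3, 0.3, -3.64)
# Made-up alternatives for the direct effect under alpha1.
offset_case = classify(0.5, -3.0, 0.3, -3.64)
stronger_case = classify(0.5, -4.0, 0.3, -3.64)
```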

The results given in this paper have implications for study design. We have seen that if one is interested in an effect partitioning that (i) is for the overall effect, (ii) involves direct effects that are actually interpretable as direct effects and (iii) is such that all components are identifiable, then it will be important to use either a two-stage randomized experiment with type B parameterizations or a modification of the type A parameterization which extends the randomization to a third stage. We have seen also that the quantity which we have referred to as the "net direct effects contribution" is an important one for understanding the overall effect comparing different allocations and for understanding whether the overall effect will be greater than or less than the indirect effect.

We note that the effect decomposition results given in this paper hold also for outcomes that are not binary. Tchetgen Tchetgen and VanderWeele (2011) gave finite sample confidence intervals for the direct, indirect, total and overall effects of Hudgens and Halloran (2008), as given in Definition 2 above, when the outcome was binary. Future work could consider such finite sample confidence intervals for the overall, direct and indirect effects given in Definitions 2, 3 and 4 above, for both binary and continuous outcomes. Future work could also formally consider how the direct and indirect effects in Definition 2 might approximate those in Definition 4 when the size of each cluster is large.

Appendix: Proofs

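Proof of Theorem 1. We have that

$$\begin{aligned}
\overline{Y}_{ij}^{B}(\alpha)
&=\alpha\sum_{s\in\mathcal{A}(n-1)}Y_{ij}(a_{i,-j}=s,a_{ij}=1)\Pr\nolimits_{\alpha}(A_{i,-j}=s)+(1-\alpha)\sum_{s\in\mathcal{A}(n-1)}Y_{ij}(a_{i,-j}=s,a_{ij}=0)\Pr\nolimits_{\alpha}(A_{i,-j}=s)\\
&=\alpha\left[\overline{Y}_{ij}^{B}(1;\alpha)-\overline{Y}_{ij}^{B}(0;\alpha)\right]+\overline{Y}_{ij}^{B}(0;\alpha).
\end{aligned}$$

Thus,

$$\begin{aligned}
\overline{OE}_{ij}^{B}(\alpha_1,\alpha_0)
&=\overline{Y}_{ij}^{B}(\alpha_1)-\overline{Y}_{ij}^{B}(\alpha_0)\\
&=\alpha_1\left[\overline{Y}_{ij}^{B}(1;\alpha_1)-\overline{Y}_{ij}^{B}(0;\alpha_1)\right]-\alpha_0\left[\overline{Y}_{ij}^{B}(1;\alpha_0)-\overline{Y}_{ij}^{B}(0;\alpha_0)\right]+\left[\overline{Y}_{ij}^{B}(0;\alpha_1)-\overline{Y}_{ij}^{B}(0;\alpha_0)\right]\\
&=\alpha_1\overline{DE}_{ij}^{B}(\alpha_1)-\alpha_0\overline{DE}_{ij}^{B}(\alpha_0)+\overline{IE}_{ij}^{B}(\alpha_1,\alpha_0).
\end{aligned}$$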
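The identity in Theorem 1 can also be checked numerically by brute-force enumeration. The sketch below is ours, not part of the original proof: it assumes a single small cluster, independent Bernoulli allocation with probability α for each individual (type B), and arbitrary randomly generated potential outcomes for one individual. The quantities are computed as they appear in the proof, with the direct effect taken as the average outcome when treated minus the average outcome when untreated, and the indirect effect as the difference in untreated average outcomes between the two allocation strategies.

```python
import itertools
import random


def bernoulli_prob(s, alpha):
    """Probability of allocation s under independent Bernoulli(alpha) treatment."""
    k = sum(s)
    return alpha ** k * (1 - alpha) ** (len(s) - k)


def y_bar_given_own(y, j, own, alpha, n):
    """Average outcome of individual j with own treatment fixed at `own`,
    averaging over the other n - 1 individuals' allocations."""
    total = 0.0
    for s in itertools.product((0, 1), repeat=n - 1):
        full = s[:j] + (own,) + s[j:]  # insert own treatment at position j
        total += y[full] * bernoulli_prob(s, alpha)
    return total


def y_bar(y, alpha, n):
    """Marginal average outcome of individual j over all n-vector allocations."""
    return sum(y[a] * bernoulli_prob(a, alpha)
               for a in itertools.product((0, 1), repeat=n))


# Hypothetical setup: cluster of size 4, individual j = 0, and arbitrary
# potential outcomes indexed by the full allocation vector.
n, j = 4, 0
rng = random.Random(0)
y = {a: rng.random() for a in itertools.product((0, 1), repeat=n)}
alpha1, alpha0 = 0.5, 0.3


def de(alpha):
    """Direct effect under allocation strategy alpha."""
    return y_bar_given_own(y, j, 1, alpha, n) - y_bar_given_own(y, j, 0, alpha, n)


ie = y_bar_given_own(y, j, 0, alpha1, n) - y_bar_given_own(y, j, 0, alpha0, n)
oe = y_bar(y, alpha1, n) - y_bar(y, alpha0, n)
rhs = alpha1 * de(alpha1) - alpha0 * de(alpha0) + ie

assert abs(oe - rhs) < 1e-12
print("overall effect:", oe, "decomposition:", rhs)
```

Because the potential outcomes are arbitrary, agreement here does not depend on any particular interference structure.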

Proof of Theorem 2. We have that

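$$\begin{aligned}
\overline{Y}_{ij}^{A}(\alpha)
&=\frac{K_{0,i}}{n_i}\overline{Y}_{ij}^{A}(1;\alpha,1)+\frac{n_i-K_{0,i}}{n_i}\overline{Y}_{ij}^{A}(0;\alpha,0)+\left\{\frac{K_{0,i}}{n_i}\overline{Y}_{ij}^{A}(0;\alpha,1)-\frac{K_{0,i}}{n_i}\overline{Y}_{ij}^{A}(0;\alpha,1)\right\}\\
&=\frac{K_{0,i}}{n_i}\left\{\overline{Y}_{ij}^{A}(1;\alpha,1)-\overline{Y}_{ij}^{A}(0;\alpha,1)\right\}+\frac{n_i-K_{0,i}}{n_i}\overline{Y}_{ij}^{A}(0;\alpha,0)+\frac{K_{0,i}}{n_i}\overline{Y}_{ij}^{A}(0;\alpha,1)\\
&=\frac{K_{0,i}}{n_i}\overline{DE}_{ij}^{A}(\alpha,1)+\sum_{s\in\mathcal{A}(n-1)}Y_{ij}(a_{i,-j}=s,a_{ij}=0)\Pr\nolimits_{\alpha}(A_{i,-j}=s\mid a_{ij}=0)\left\{\frac{n_i-K_{0,i}}{n_i}\right\}\\
&\qquad+\sum_{s\in\mathcal{A}(n-1)}Y_{ij}(a_{i,-j}=s,a_{ij}=0)\Pr\nolimits_{\alpha}(A_{i,-j}=s\mid a_{ij}=1)\left\{\frac{K_{0,i}}{n_i}\right\}\\
&=\frac{K_{0,i}}{n_i}\overline{DE}_{ij}^{A}(\alpha,1)+\sum_{s\in\mathcal{A}(n-1)}Y_{ij}(a_{i,-j}=s,a_{ij}=0)\Pr\nolimits_{\alpha}(A_{i,-j}=s,a_{ij}=0)\\
&\qquad+\sum_{s\in\mathcal{A}(n-1)}Y_{ij}(a_{i,-j}=s,a_{ij}=0)\Pr\nolimits_{\alpha}(A_{i,-j}=s,a_{ij}=1)\\
&=\frac{K_{0,i}}{n_i}\overline{DE}_{ij}^{A}(\alpha,1)+\sum_{s\in\mathcal{A}(n)}Y_{ij}(a_{i,-j}=s\setminus s_j,a_{ij}=0)\Pr\nolimits_{\alpha}(A_{i}=s)\\
&=\frac{K_{0,i}}{n_i}\overline{DE}_{ij}^{A}(\alpha,1)+\overline{Y}_{ij}^{A}(0;\alpha).
\end{aligned}$$

Thus,

$$\begin{aligned}
\overline{OE}_{ij}^{A}(\alpha_1,\alpha_0)
&=\overline{Y}_{ij}^{A}(\alpha_1)-\overline{Y}_{ij}^{A}(\alpha_0)\\
&=\frac{K_{1,i}}{n_i}\overline{DE}_{ij}^{A}(\alpha_1,1)+\overline{Y}_{ij}^{A}(0;\alpha_1)-\left\{\frac{K_{0,i}}{n_i}\overline{DE}_{ij}^{A}(\alpha_0,1)+\overline{Y}_{ij}^{A}(0;\alpha_0)\right\}\\
&=\frac{K_{1,i}}{n_i}\overline{DE}_{ij}^{A}(\alpha_1,1)-\frac{K_{0,i}}{n_i}\overline{DE}_{ij}^{A}(\alpha_0,1)+\overline{IE}_{ij}^{A}(\alpha_1,\alpha_0).
\end{aligned}$$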
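A similar brute-force check is possible for Theorem 2. The sketch below is ours and rests on assumptions: under the type A design exactly K of the n individuals in the cluster are treated, with every such allocation equally likely, and the averaged quantities are computed as they are expanded in the proof above. The cluster size, the values of K and the potential outcomes are hypothetical.

```python
import itertools
import random

# Hypothetical setup: cluster of size 5, individual j = 0, arbitrary outcomes.
n, j = 5, 0
rng = random.Random(1)
y = {a: rng.random() for a in itertools.product((0, 1), repeat=n)}


def with_own(s, own):
    """Insert individual j's own treatment into an (n - 1)-vector s."""
    return s[:j] + (own,) + s[j:]


def y_bar(k):
    """Average outcome when exactly k of n are treated, uniformly at random."""
    allocs = [a for a in itertools.product((0, 1), repeat=n) if sum(a) == k]
    return sum(y[a] for a in allocs) / len(allocs)


def y_bar_own(own, k, cond):
    """Average of Y(a_{-j} = s, a_j = own) over the distribution of the others'
    allocation s given that individual j's assignment was `cond`."""
    others = [s for s in itertools.product((0, 1), repeat=n - 1)
              if sum(s) == k - cond]
    return sum(y[with_own(s, own)] for s in others) / len(others)


def y_bar_zero(k):
    """Average over full allocations of the outcome with j's own treatment
    set to 0 and the others' treatments left as allocated."""
    allocs = [a for a in itertools.product((0, 1), repeat=n) if sum(a) == k]
    return sum(y[with_own(a[:j] + a[j + 1:], 0)] for a in allocs) / len(allocs)


def de(k):
    """Direct effect, conditioning the others' allocation on a_j = 1."""
    return y_bar_own(1, k, 1) - y_bar_own(0, k, 1)


k1, k0 = 3, 1  # numbers treated under alpha1 = 3/5 and alpha0 = 1/5
ie = y_bar_zero(k1) - y_bar_zero(k0)
oe = y_bar(k1) - y_bar(k0)
rhs = (k1 / n) * de(k1) - (k0 / n) * de(k0) + ie

assert abs(oe - rhs) < 1e-12
print("overall effect:", oe, "decomposition:", rhs)
```

The check requires 1 ≤ K ≤ n − 1 so that both conditional distributions are defined.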

Proof of Theorem 3. We have that

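$$\begin{aligned}
\overline{Y}_{ij}^{H}(\alpha)
&=\frac{K_{0,i}}{n_i}\overline{Y}_{ij}^{H}(1;\alpha)+\frac{n_i-K_{0,i}}{n_i}\overline{Y}_{ij}^{H}(0;\alpha)+\left\{\frac{K_{0,i}}{n_i}\overline{Y}_{ij}^{H}(0;\alpha)-\frac{K_{0,i}}{n_i}\overline{Y}_{ij}^{H}(0;\alpha)\right\}\\
&=\frac{K_{0,i}}{n_i}\left\{\overline{Y}_{ij}^{H}(1;\alpha)-\overline{Y}_{ij}^{H}(0;\alpha)\right\}+\frac{n_i-K_{0,i}}{n_i}\overline{Y}_{ij}^{H}(0;\alpha)+\frac{K_{0,i}}{n_i}\overline{Y}_{ij}^{H}(0;\alpha)\\
&=\frac{K_{0,i}}{n_i}\overline{DE}_{ij}^{H}(\alpha)+\overline{Y}_{ij}^{H}(0;\alpha).
\end{aligned}$$

Thus,

$$\begin{aligned}
\overline{OE}_{ij}^{H}(\alpha_1,\alpha_0)
&=\overline{Y}_{ij}^{H}(\alpha_1)-\overline{Y}_{ij}^{H}(\alpha_0)\\
&=\frac{K_{1,i}}{n_i}\overline{DE}_{ij}^{H}(\alpha_1)+\overline{Y}_{ij}^{H}(0;\alpha_1)-\left\{\frac{K_{0,i}}{n_i}\overline{DE}_{ij}^{H}(\alpha_0)+\overline{Y}_{ij}^{H}(0;\alpha_0)\right\}\\
&=\frac{K_{1,i}}{n_i}\overline{DE}_{ij}^{H}(\alpha_1)-\frac{K_{0,i}}{n_i}\overline{DE}_{ij}^{H}(\alpha_0)+\overline{IE}_{ij}^{H}(\alpha_1,\alpha_0).
\end{aligned}$$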


References

  1. Ali M, Emch M, von Seidlein L, Yunus M, Sack DA, Rao M, Holmgren J, Clemens JD. Herd immunity conferred by killed oral cholera vaccine in Bangladesh: a reanalysis. Lancet. 2005;366:44–49. doi: 10.1016/S0140-6736(05)66550-6.
  2. Halloran ME, Struchiner CJ. Causal inference for infectious diseases. Epidemiology. 1995;6:142–151. doi: 10.1097/00001648-199503000-00010.
  3. Hong G, Raudenbush SW. Evaluating kindergarten retention policy: A case study of causal inference for multilevel observational data. Journal of the American Statistical Association. 2006;101:901–910.
  4. Hudgens MG, Halloran ME. Towards causal inference with interference. Journal of the American Statistical Association. 2008;103:832–842. doi: 10.1198/016214508000000292.
  5. Rosenbaum PR. Interference between units in randomized experiments. Journal of the American Statistical Association. 2007;102:191–200. doi: 10.1080/01621459.2012.655954.
  6. Sobel ME. What do randomized studies of housing mobility demonstrate?: Causal inference in the face of interference. Journal of the American Statistical Association. 2006;101:1398–1407.
  7. Tchetgen Tchetgen EJ, VanderWeele TJ. Estimation of causal effects in the presence of interference. Statistical Methods in Medical Research - Special Issue on Causal Inference. 2011; in press. doi: 10.1177/0962280210386779.
  8. VanderWeele TJ. Direct and indirect effects for neighborhood-based clustered and longitudinal data. Sociological Methods and Research. 2010;38:515–544. doi: 10.1177/0049124110366236.
