Abstract
Interference is said to be present when the exposure or treatment received by one individual may affect the outcomes of other individuals. Such interference can arise in settings in which the outcomes of the various individuals come about through social interactions. When interference is present, causal inference is rendered considerably more complex, and the literature on causal inference in the presence of interference has just recently begun to develop. In this paper we summarize some of the concepts and results from the existing literature and extend that literature in considering new results for finite sample inference, new inverse probability weighting estimators in the presence of interference and new causal estimands of interest.
1 Introduction
Interference is said to be present when the exposure or treatment received by one individual may affect the outcomes of other individuals. Such interference can arise in settings in which the outcomes of the various individuals come about through social interactions (Manski, 2000, 2010). Most of the literature on causal inference proceeds by making an assumption of "no-interference." For example, in Rubin’s formulation of the potential outcomes framework, an assumption referred to as the "Stable Unit Treatment Value Assumption" or "SUTVA" is made, which includes within it a no-interference assumption (Rubin, 1980). Such no-interference assumptions are employed routinely, though not always acknowledged. When interference is present, causal inference is rendered considerably more complex, and the literature on causal inference in the presence of interference has only recently begun to develop (Sobel, 2006; Hong and Raudenbush, 2006; Rosenbaum, 2007; Hudgens and Halloran, 2008; Graham, 2008; Manski, 2010). In this paper we hope both to summarize some of the concepts and results from the existing literature and to extend that literature by considering new results for finite sample inference, new inverse probability weighting estimators in the presence of interference and new causal estimands of interest.
The remainder of this paper is organized as follows. In section 2 we present the notation we will be using throughout. In section 3 we review notions of direct, indirect (spillover), total and overall causal effects of Hudgens and Halloran (2008) that arise when interference is present. In section 4 we discuss inference for these effects in randomized trials and present new results on variance estimation and finite sample confidence intervals in the presence of interference. In section 5 we consider the context of observational studies and present a result on inverse probability weighting estimators of causal effects when interference is present. In section 6, we discuss varieties of direct and indirect effects present in the causal inference literature and comment on the terminological ambiguity concerning the expressions "direct effect" and "indirect effect"; we also introduce a new causal estimand that indicates a non-zero "infectiousness effect" in the context of vaccine trials (Datta et al., 1999). Finally, in section 7, we offer some concluding remarks and directions for future research.
2 Preliminaries
2.1 Counterfactuals
As in Hudgens and Halloran (2008), suppose data are observed on $N > 1$ groups of individuals, or blocks of units. For $i = 1, \ldots, N$ let $n_i$ denote the number of individuals in group $i$ and let $A_i \equiv (A_{i1}, \ldots, A_{in_i})$ denote the treatments those $n_i$ individuals received. Throughout, we assume perfect compliance, that is, treatment assigned to an individual is equivalent to treatment received by the individual. We assume that $A_{ij}$ is a dichotomous random variable with support equal to $\{0, 1\}$, so that $A_i$ takes values in the set $\{0, 1\}^{n_i}$. Let $A_{i,-j} \equiv (A_{i1}, \ldots, A_{in_i}) \setminus A_{ij} \equiv (A_{i1}, \ldots, A_{i,j-1}, A_{i,j+1}, \ldots, A_{in_i})$ denote the $n_i - 1$ subvector of $A_i$ with the $j$th entry deleted. Following Hudgens and Halloran (2008) and Sobel (2006), we refer to $A_i$ as an intervention, treatment or allocation program, to distinguish it from the individual treatment $A_{ij}$. Furthermore, for $n = 1, 2, \ldots$, we define $\mathcal{A}(n)$ as the set of vectors of possible treatment allocations of length $n$; for instance $\mathcal{A}(2) \equiv \{(0, 0), (0, 1), (1, 0), (1, 1)\}$. Therefore, $A_i$ takes one of $2^{n_i}$ possible values in $\mathcal{A}(n_i)$, while $A_{i,-j}$ takes values in $\mathcal{A}(n_i - 1)$ for all $j$. For positive integers $n$ and $k$, we further define $\mathcal{A}(n, k)$ to be the subset of $\mathcal{A}(n)$ wherein exactly $k$ individuals receive treatment 1, that is, every element $a$ of $\mathcal{A}(n, k)$ satisfies $1_n^{T} a = k$, where $1_n$ is the vector of length $n$ with entries all equal to one.
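The allocation sets introduced above are small enough to enumerate directly for modest group sizes, which can help fix ideas. The following is a minimal sketch; the function names are hypothetical and chosen here only for illustration.

```python
from itertools import product


def allocations(n):
    """All 2**n binary treatment vectors of length n, i.e. the set A(n)."""
    return list(product((0, 1), repeat=n))


def allocations_with_k_treated(n, k):
    """The subset A(n, k): vectors in A(n) with exactly k entries equal to 1."""
    return [a for a in allocations(n) if sum(a) == k]


def drop_entry(a, j):
    """The subvector a_{-j}: a with its j-th entry (0-based index) removed."""
    return a[:j] + a[j + 1:]


print(allocations(2))
print(allocations_with_k_treated(3, 2))
print(drop_entry((1, 0, 1), 1))
```

Enumeration grows as $2^n$, so this is only practical for small groups.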
For each block $i$, we shall assume there exist counterfactual (potential outcome) data $Y_i(\cdot) = \{Y_i(a_i) : a_i \in \mathcal{A}(n_i)\}$ where $Y_i(a_i) = \{Y_{i1}(a_i), \ldots, Y_{in_i}(a_i)\}$, and $Y_{ij}(a_i)$ is individual $j$’s response under treatment allocation $a_i$; and that the observed outcome $Y_{ij}$ for individual $j$ in block $i$ is equal to his counterfactual outcome $Y_{ij}(A_i)$ under the realized treatment allocation $A_i$. The notation $Y_{ij}(a_i)$ makes explicit the possibility for interference between individuals within a block, that is, the potential outcome for individual $j$ may depend on another individual’s treatment assignment in block $i$. Also, note that for counterfactuals to remain well defined, this notation implicitly assumes that counterfactuals for an individual in block $i$ do not depend on treatment assignments of individuals in a different block $i' \neq i$. This encodes the assumption of partial interference considered by Sobel (2006) and Hudgens and Halloran (2008), which they point out to be particularly appropriate when the observed blocks are well separated by space or time such as in some group randomized studies in the social sciences, or in some community-randomized vaccine trials. The ordinary no interference assumption (Cox, 1958; Rubin, 1980) generally made in the causal inference literature is then that for all $i$ and $j$, if $a_i$ and $a_i'$ are such that $a_{ij} = a_{ij}'$ then $Y_{ij}(a_i) = Y_{ij}(a_i')$, which in turn implies that the counterfactual outcomes for individual $j$ in group $i$ can be written as $\{Y_{ij}(a) : a = 0, 1\}$.
Hereafter, we follow the convention in Sobel (2006) and Hudgens and Halloran (2008), and suppose that Yi(·) is fixed as it does not depend on the random treatment allocation program Ai. In addition to treatment and outcome data, we suppose that we also observe fixed data Li = (Li1, …, Lini), i = 1, …, N, where Lij denotes pretreatment covariates for individual j in block i; we allow Lij to contain block level covariates along with block aggregates of individual level covariates.
2.2 Treatment Assignment in Group Randomized Experiments
In group randomized experiments, treatment allocation is determined by the experimenter; therefore the assignment mechanism πi (Ai) of Ai is known. Let πi(Ai; α0) denote an experimenter’s particular choice of parametrization for the distribution of Ai indexed by the parameter α0, that is πi (Ai) = πi(Ai; α0). In this paper, we consider two types of parametrizations.
Definition
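(A) A parametrization of type A with parameters $n_i$ and $K_{0,i}$ for block $i$ entails a so-called mixed individual group assignment strategy, whereby the treatment program $A_i$ in block $i$ is randomly allocated conditional on $\sum_{j=1}^{n_i} A_{ij} = K_{0,i}$ with probability mass function
$$\pi_i(A_i; \alpha_0) = \binom{n_i}{K_{0,i}}^{-1} 1\{A_i \in \mathcal{A}(n_i, K_{0,i})\}.$$
(B) A parametrization of type B entails a Bernoulli individual group assignment strategy, whereby treatment is randomly assigned to different individuals within block $i$ according to the known probability mass function
$$\pi_i(A_i; \alpha_0) = \prod_{j=1}^{n_i} \alpha_0^{A_{ij}} (1 - \alpha_0)^{1 - A_{ij}},$$
where $0 < \alpha_0 < 1$.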
For example, two type A treatment assignment strategies α0 and α1 might entail randomly assigning half of the ni individuals in group i to treatment 1 and the other half to treatment 0 under the strategy corresponding to α0, versus assigning all individuals in a group to treatment 0 under the second strategy corresponding to α1. Similarly, two treatment assignment strategies of the second type might assign each individual in a group to treatment 1 with probability 1/2 under strategy α0 versus assigning each individual in a group to treatment 0 with probability 1/3 under strategy α1. Sobel (2006) and Hudgens and Halloran (2008) considered type A treatment allocation programs in group randomized trials; in Section 5, we show that allocation programs of type B play an important conceptual role in the definition and estimation of causal effects in observational studies.
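The two assignment mechanisms can be written as short probability mass functions. The sketch below assumes that a type A strategy is uniform over all vectors with exactly k treated individuals and that a type B strategy treats individuals independently with a common probability; the function names are hypothetical.

```python
from itertools import product
from math import comb


def pmf_type_a(a, k):
    """Type A (mixed) strategy: uniform over vectors with exactly k treated.

    Assumption of this sketch: every vector in A(n, k) is equally likely.
    """
    n = len(a)
    return 1.0 / comb(n, k) if sum(a) == k else 0.0


def pmf_type_b(a, alpha):
    """Type B (Bernoulli) strategy: independent assignment, P(A_ij = 1) = alpha."""
    treated = sum(a)
    return alpha ** treated * (1.0 - alpha) ** (len(a) - treated)


# Both mechanisms should put total mass one on the 2**n possible vectors.
n = 4
space = list(product((0, 1), repeat=n))
total_a = sum(pmf_type_a(a, k=2) for a in space)
total_b = sum(pmf_type_b(a, alpha=0.5) for a in space)
print(total_a, total_b)
```

Under type A the number treated is fixed by design, whereas under type B it is random; this is the practical difference between the two parametrizations.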
Suppose our goal is to assess the causal effects of assigning groups to α0, compared to α1, where α0 and α1 are two individual group assignment strategies of type A. To achieve this goal in an experimental study, Hudgens and Halloran (2008) considered the following two-stage group randomization framework. In the first stage, each of the N groups is randomly assigned to either α0 or α1. In the second stage individuals within a group are randomly assigned to treatment conditional on their group’s assignment in the first stage. For instance, in the first stage, half of the N groups might be assigned to an allocation strategy α0 while the other half is assigned to α1; in the second stage, two-thirds of the individuals within groups assigned α0 are randomly assigned to treatment 1, while one-third of the individuals within a group assigned to α1 receive treatment 1. Such a design is commonly known as split-plot randomization or pseudo-cluster randomization. As Hudgens and Halloran (2008) point out, two-stage randomization designs are key to obtaining answers for important public health questions in the face of interference, such as: how many cases due to an infectious disease will be averted by vaccinating two-thirds of the population compared to only vaccinating one-third of the population?
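The two-stage design described above can be simulated in a few lines. This is a minimal sketch under stated assumptions: both stages use fixed-count (type A) randomization, the within-group fractions are rounded to whole individuals, and all function and parameter names are hypothetical.

```python
import random


def two_stage_randomization(group_sizes, n_groups_alpha0,
                            frac_alpha0, frac_alpha1, seed=0):
    """Return (s, allocations) for a two-stage group randomized design.

    s[i] = 1 if group i is assigned to strategy alpha0, else 0 (alpha1).
    allocations[i][j] = 1 if individual j in group i receives treatment 1.
    """
    rng = random.Random(seed)
    n_groups = len(group_sizes)

    # Stage 1: pick exactly n_groups_alpha0 groups for strategy alpha0.
    chosen = set(rng.sample(range(n_groups), n_groups_alpha0))
    s = [1 if i in chosen else 0 for i in range(n_groups)]

    # Stage 2: within each group, treat a fixed number of individuals
    # determined by the strategy drawn in stage 1.
    allocations = []
    for i, n_i in enumerate(group_sizes):
        frac = frac_alpha0 if s[i] == 1 else frac_alpha1
        k_i = round(frac * n_i)
        treated = set(rng.sample(range(n_i), k_i))
        allocations.append([1 if j in treated else 0 for j in range(n_i)])
    return s, allocations


# Ten groups of six: half the groups get alpha0 (two-thirds treated),
# the other half get alpha1 (one-third treated).
s, a = two_stage_randomization([6] * 10, n_groups_alpha0=5,
                               frac_alpha0=2 / 3, frac_alpha1=1 / 3)
for s_i, a_i in zip(s, a):
    print(s_i, a_i)
```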
3 Causal Estimands
3.1 Direct Causal Effects
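Following Halloran and Struchiner (1995), we define the individual direct causal effect of treatment 0 compared to treatment 1 for individual $j$ in group $i$ by:
$$DE_{ij}(a_{i,-j}) \equiv Y_{ij}(a_{i,-j}, a_{ij} = 0) - Y_{ij}(a_{i,-j}, a_{ij} = 1),$$
and the individual average direct causal effect for individual $j$ in group $i$ by
$$\overline{DE}_{ij}(\alpha_0) \equiv \bar{Y}_{ij}(0; \alpha_0) - \bar{Y}_{ij}(1; \alpha_0), \tag{1}$$
where for $a = 0, 1$,
$$\bar{Y}_{ij}(a; \alpha_0) \equiv \sum_{s \in \mathcal{A}(n_i - 1)} Y_{ij}(a_{i,-j} = s, a_{ij} = a)\, \pi_i(A_{i,-j} = s \mid A_{ij} = a; \alpha_0).$$
Note that in the above display, and until stated otherwise, $\pi_i(\cdot; \alpha_0)$ may either be of Type A or B. Thus, $\overline{DE}_{ij}(\alpha_0)$ is a difference in individual average counterfactual outcomes when $a_{ij} = 0$ and when $a_{ij} = 1$ under $\alpha_0$. This is a marginal causal effect as it is a comparison between expected values of the marginal distributions of $Y_{ij}(A_{i,-j}, a_{ij} = 0)$ and of $Y_{ij}(A_{i,-j}, a_{ij} = 1)$ with respect to $\alpha_0$. Finally, we define the group average direct causal effect by $\overline{DE}_i(\alpha_0) \equiv \bar{Y}_i(0; \alpha_0) - \bar{Y}_i(1; \alpha_0)$, where $\bar{Y}_i(a; \alpha_0) \equiv n_i^{-1} \sum_{j=1}^{n_i} \bar{Y}_{ij}(a; \alpha_0)$, and the population average direct causal effect by $\overline{DE}(\alpha_0) \equiv \bar{Y}(0; \alpha_0) - \bar{Y}(1; \alpha_0)$, where $\bar{Y}(a; \alpha_0) \equiv N^{-1} \sum_{i=1}^{N} \bar{Y}_i(a; \alpha_0)$.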
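3.2 Indirect Causal Effects or "Spillover Effects"
Halloran and Struchiner (1995) also define an individual indirect causal effect as the causal effect on an individual of the treatment received by others in the group. Specifically, let $IE_{ij}(a_{i,-j}, a_{i,-j}')$ be the individual indirect causal effect on subject $j$ in group $i$ of treatment allocation $a_i$ compared with $a_i'$, so that:
$$IE_{ij}(a_{i,-j}, a_{i,-j}') \equiv Y_{ij}(a_{i,-j}, a_{ij} = 0) - Y_{ij}(a_{i,-j}', a_{ij} = 0).$$
Sobel (2006) refers to the indirect effect defined above as a "spillover effect." Note that if interference is absent then $IE_{ij}(a_{i,-j}, a_{i,-j}') = 0$. Similar to direct effects, define the individual average indirect causal effect by $\overline{IE}_{ij}(\alpha_0, \alpha_1) \equiv \bar{Y}_{ij}(0; \alpha_0) - \bar{Y}_{ij}(0; \alpha_1)$. Finally, define the group average indirect causal effect as $\overline{IE}_i(\alpha_0, \alpha_1) \equiv \bar{Y}_i(0; \alpha_0) - \bar{Y}_i(0; \alpha_1)$ and the population average indirect causal effect as $\overline{IE}(\alpha_0, \alpha_1) \equiv \bar{Y}(0; \alpha_0) - \bar{Y}(0; \alpha_1)$.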
3.3 Total Causal Effects
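Total effects reflect both the direct and the indirect effects of a particular treatment assignment on an individual. Following Halloran and Struchiner (1995) we define the individual total causal effects for individual $j$ in group $i$ as:
$$TE_{ij}(a_{i,-j}, a_{i,-j}') \equiv Y_{ij}(a_{i,-j}, a_{ij} = 0) - Y_{ij}(a_{i,-j}', a_{ij} = 1),$$
the individual average total causal effect by $\overline{TE}_{ij}(\alpha_0, \alpha_1) \equiv \bar{Y}_{ij}(0; \alpha_0) - \bar{Y}_{ij}(1; \alpha_1)$, the group average total causal effect by $\overline{TE}_i(\alpha_0, \alpha_1) \equiv \bar{Y}_i(0; \alpha_0) - \bar{Y}_i(1; \alpha_1)$ and the population average total causal effect by $\overline{TE}(\alpha_0, \alpha_1) \equiv \bar{Y}(0; \alpha_0) - \bar{Y}(1; \alpha_1)$.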
3.4 Overall Causal Effects
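Following Hudgens and Halloran (2008), we define the individual overall causal effect of treatment $a_i$ compared to treatment $a_i'$ for individual $j$ in group $i$ by
$$OE_{ij}(a_i, a_i') \equiv Y_{ij}(a_i) - Y_{ij}(a_i').$$
Similarly, define the individual average overall causal effect comparing $\alpha_0$ to $\alpha_1$ by $\overline{OE}_{ij}(\alpha_0, \alpha_1) \equiv \bar{Y}_{ij}(\alpha_0) - \bar{Y}_{ij}(\alpha_1)$, where $\bar{Y}_{ij}(\alpha) \equiv \sum_{s \in \mathcal{A}(n_i)} Y_{ij}(a_i = s)\, \pi_i(A_i = s; \alpha)$, the group average overall causal effect by $\overline{OE}_i(\alpha_0, \alpha_1) \equiv \bar{Y}_i(\alpha_0) - \bar{Y}_i(\alpha_1)$ with $\bar{Y}_i(\alpha) \equiv n_i^{-1} \sum_{j=1}^{n_i} \bar{Y}_{ij}(\alpha)$, and the population average overall effect by $\overline{OE}(\alpha_0, \alpha_1) \equiv \bar{Y}(\alpha_0) - \bar{Y}(\alpha_1)$ with $\bar{Y}(\alpha) \equiv N^{-1} \sum_{i=1}^{N} \bar{Y}_i(\alpha)$.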
The following simple yet instructive properties describe the relationship between the various causal effects:
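It follows immediately from their definitions that total effects at the individual, group or population levels can be decomposed as the sum of direct and indirect causal effects at the corresponding level. That is, for example, $\overline{TE}(\alpha_0, \alpha_1) = \overline{DE}(\alpha_1) + \overline{IE}(\alpha_0, \alpha_1)$ (Hudgens and Halloran, 2008).
Total causal effects are not commutative, for instance $\overline{TE}(\alpha_0, \alpha_1) \neq \overline{TE}(\alpha_1, \alpha_0)$ in general. However, $\overline{TE}(\alpha_0, \alpha_1) + \overline{TE}(\alpha_1, \alpha_0) = \overline{DE}(\alpha_0) + \overline{DE}(\alpha_1)$, so that while the total causal effects are not necessarily equal, they are constrained in sum to equal the sum of direct effects (Hudgens and Halloran, 2008).
If $\overline{IE}(\alpha_0, \alpha_1) = 0$, then $\overline{TE}(\alpha_0, \alpha_1) = \overline{DE}(\alpha_1)$ and $\overline{TE}(\alpha_1, \alpha_0) = \overline{DE}(\alpha_0)$. In the absence of indirect effects, the total effects are commutative if and only if the direct effects are equal (Hudgens and Halloran, 2008).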
We also have the following decomposition for the overall effect:
The group average overall effects are equal to a weighted sum of the group average indirect, direct and total effects: , where
Under the assumption of no interference between individuals of a group, the individual indirect causal effect is equal to zero and therefore individual, group and population average total causal effects are equal to the average direct causal effects at the corresponding level. Recall that in the absence of interference, the counterfactual outcomes for individual $j$ in group $i$ can be written as $\{Y_{ij}(a) : a = 0, 1\}$ and the individual and group average causal effects become $Y_{ij}(1) - Y_{ij}(0)$ and $n_i^{-1} \sum_{j=1}^{n_i} \{Y_{ij}(1) - Y_{ij}(0)\}$ respectively. Furthermore, the assumption of no interference implies that the various causal effects do not depend on the treatment assignment strategies $\alpha_0$ and $\alpha_1$, whereas in the presence of interference within groups, these effects do in general depend on the assignment strategies.
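The relationships among the four effects can be checked numerically by exact enumeration in a single small group. The sketch below is illustrative only: the potential outcome function `y` is hypothetical, the strategies are taken to be of type B (so that conditioning on one's own treatment does not change the distribution of the others' treatments), and the effects follow the sign convention "treatment 0 minus treatment 1" used in this section.

```python
from itertools import product


def bernoulli_pmf(vec, alpha):
    """Type B probability of a treatment vector: independent Bernoulli(alpha)."""
    treated = sum(vec)
    return alpha ** treated * (1 - alpha) ** (len(vec) - treated)


def y(j, a):
    """Hypothetical potential outcome of individual j under allocation a.

    Own treatment lowers the outcome by 0.3; the share of treated peers
    lowers it further, which is where interference enters.
    """
    others = a[:j] + a[j + 1:]
    return 0.8 - 0.3 * a[j] - 0.4 * sum(others) / len(others)


def ybar(a_own, alpha, n):
    """Group average outcome with own treatment fixed at a_own, averaging
    the other n - 1 treatments over the strategy alpha."""
    total = 0.0
    for j in range(n):
        for s in product((0, 1), repeat=n - 1):
            full = s[:j] + (a_own,) + s[j:]
            total += y(j, full) * bernoulli_pmf(s, alpha)
    return total / n


def ybar_overall(alpha, n):
    """Group average outcome when all n treatments follow the strategy alpha."""
    total = 0.0
    for j in range(n):
        for a in product((0, 1), repeat=n):
            total += y(j, a) * bernoulli_pmf(a, alpha)
    return total / n


def direct(alpha, n):
    return ybar(0, alpha, n) - ybar(1, alpha, n)


def indirect(alpha0, alpha1, n):
    return ybar(0, alpha0, n) - ybar(0, alpha1, n)


def total_effect(alpha0, alpha1, n):
    return ybar(0, alpha0, n) - ybar(1, alpha1, n)


def overall(alpha0, alpha1, n):
    return ybar_overall(alpha0, n) - ybar_overall(alpha1, n)


n, alpha0, alpha1 = 3, 0.5, 0.2
de0, de1 = direct(alpha0, n), direct(alpha1, n)
ie = indirect(alpha0, alpha1, n)
te = total_effect(alpha0, alpha1, n)
oe = overall(alpha0, alpha1, n)
print(de0, de1, ie, te, oe)
```

In this toy example the total effect equals the direct effect under the second strategy plus the indirect effect, and the two total effects sum to the sum of the direct effects, matching the properties listed above.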
4 Inference in group randomized studies
4.1 Estimation
In this section, we consider the estimation of the following four key causal contrasts, the population average direct causal effect , the population average indirect causal effect , the population average total causal effect and the population average overall effect . Unbiased estimators of these parameters under a two-stage randomization scheme were proposed by Hudgens and Halloran (2008) under the following assumption:
Assumption 1
Let S ≡ (S1, …, SN) denote the first stage of randomization group assignments with Si = 1 if group i is assigned to α0 and zero if group i is assigned to α1. Let η denote the parametrization for the distribution of S and let C = ∑iSi denote the number of groups assigned α0. Then, {η, α0, α1} are assumed to be Type A parametrizations.
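Suppose $S_i = 1$ and let $\hat{Y}_i(a; \alpha_0) \equiv \sum_{j=1}^{n_i} Y_{ij} 1(A_{ij} = a) / \sum_{j=1}^{n_i} 1(A_{ij} = a)$; also define $\hat{Y}(a; \alpha_0) \equiv \sum_{i=1}^{N} \hat{Y}_i(a; \alpha_0) 1(S_i = 1) / \sum_{i=1}^{N} 1(S_i = 1)$, and $\hat{Y}(\alpha_0) \equiv \sum_{i=1}^{N} \{n_i^{-1} \sum_{j=1}^{n_i} Y_{ij}\} 1(S_i = 1) / \sum_{i=1}^{N} 1(S_i = 1)$, with $\hat{Y}(a; \alpha_1)$ and $\hat{Y}(\alpha_1)$ defined analogously over the groups with $S_i = 0$. Hudgens and Halloran (2008) proposed the following estimators:
$$\widehat{DE}(\alpha_0) \equiv \hat{Y}(0; \alpha_0) - \hat{Y}(1; \alpha_0), \tag{2}$$
$$\widehat{IE}(\alpha_0, \alpha_1) \equiv \hat{Y}(0; \alpha_0) - \hat{Y}(0; \alpha_1), \tag{3}$$
$$\widehat{TE}(\alpha_0, \alpha_1) \equiv \hat{Y}(0; \alpha_0) - \hat{Y}(1; \alpha_1), \tag{4}$$
$$\widehat{OE}(\alpha_0, \alpha_1) \equiv \hat{Y}(\alpha_0) - \hat{Y}(\alpha_1), \tag{5}$$
which they showed to be unbiased under Assumption 1, i.e.
$$E\{\widehat{DE}(\alpha_0)\} = \overline{DE}(\alpha_0), \quad E\{\widehat{IE}(\alpha_0, \alpha_1)\} = \overline{IE}(\alpha_0, \alpha_1), \quad E\{\widehat{TE}(\alpha_0, \alpha_1)\} = \overline{TE}(\alpha_0, \alpha_1), \quad E\{\widehat{OE}(\alpha_0, \alpha_1)\} = \overline{OE}(\alpha_0, \alpha_1),$$
where the expectation is taken with respect to the joint density of $(S, A_1, \ldots, A_N)$.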
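To make the estimation step concrete, the following sketch computes difference-in-means estimates in the spirit of Hudgens and Halloran (2008) from two-stage data. It assumes that each group-level quantity is a within-group sample mean among individuals with a given treatment and that population-level quantities average those means over the groups assigned to the same strategy; the data and function names are hypothetical.

```python
def group_mean(outcomes, treatments, a):
    """Mean observed outcome among group members who received treatment a."""
    vals = [y for y, t in zip(outcomes, treatments) if t == a]
    return sum(vals) / len(vals)


def population_mean(groups, s, strategy, a=None):
    """Average of group-level means over groups with S_i == strategy.

    strategy=1 selects groups assigned to alpha0, strategy=0 those assigned
    to alpha1. With a=None the group mean is taken over all members.
    """
    means = []
    for (outcomes, treatments), s_i in zip(groups, s):
        if s_i != strategy:
            continue
        if a is None:
            means.append(sum(outcomes) / len(outcomes))
        else:
            means.append(group_mean(outcomes, treatments, a))
    return sum(means) / len(means)


def effect_estimates(groups, s):
    """Direct, indirect, total and overall effect estimates (0 minus 1 convention)."""
    y0_a0 = population_mean(groups, s, 1, a=0)
    y1_a0 = population_mean(groups, s, 1, a=1)
    y0_a1 = population_mean(groups, s, 0, a=0)
    y1_a1 = population_mean(groups, s, 0, a=1)
    return {
        "DE": y0_a0 - y1_a0,
        "IE": y0_a0 - y0_a1,
        "TE": y0_a0 - y1_a1,
        "OE": population_mean(groups, s, 1) - population_mean(groups, s, 0),
    }


# Hypothetical data: each group is (outcomes, treatments); the first two
# groups were assigned to alpha0 (two of four treated), the last two to
# alpha1 (one of four treated).
groups = [
    ([1, 0, 0, 0], [0, 1, 1, 0]),
    ([1, 0, 0, 0], [0, 0, 1, 1]),
    ([1, 1, 1, 0], [0, 0, 0, 1]),
    ([1, 0, 1, 1], [1, 0, 0, 0]),
]
s = [1, 1, 0, 0]
est = effect_estimates(groups, s)
print(est)
```

The sketch requires every group to contain at least one individual at each treatment level, which a type A design with 0 < K < n guarantees.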
4.2 Variance Estimation
4.2.1 Variance Estimation under Stratified interference
Unbiased estimation of the variances of the various estimators of the previous section appears not to be generally available without additional assumptions regarding the underlying structure of interference. Hudgens and Halloran (2008) illustrate this difficulty by considering the estimation of Var (Ŷ (1; α0)|Si=1) under Assumption 1 only. They note that the estimator Ŷ (1; α0) is based on a single systematic random sample of fixed size Ki from the set of potential outcomes {Yij (ai) : ai ∈ 𝒜 (ni, Ki), aij = 1}. By the non-existence of an unbiased estimator of the variance of the sample mean from a single systematic sample, this implies the non-existence of an unbiased estimator of Var (Ŷ (1; α0)|Si=1). However, as we show in the next lemma, the non-existence of an unbiased estimator of Var (Ŷ (1; α0)|Si=1) does not preclude simple yet conservative estimation of the latter quantity, as an unbiased estimator of an upper bound for the variance is often a useful measure of uncertainty. The following lemma gives the result for a nonnegative outcome.
Lemma 1
Suppose that Yij (ai) ≥ 0 for all ai ∈ A(ni;K0,i) and for j = 1, …, ni, and define
then the following holds under Assumption 1:
The proof of this lemma is given in the appendix.
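In contrast with Lemma 1, Hudgens and Halloran (2008) consider variance estimators that rely on the following assumption of Stratified interference.
Assumption 2
Stratified interference: For $i = 1, \ldots, N$ and $j = 1, \ldots, n_i$, $Y_{ij}(a_i) = Y_{ij}(a_i')$ for all $a_i, a_i' \in \mathcal{A}(n_i)$ such that $a_{ij} = a_{ij}'$ and $\sum_{j' \neq j} a_{ij'} = \sum_{j' \neq j} a_{ij'}'$.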
Assumption 2 states that ai ↦ Yij (ai) is a function of ai only through (aij, ∑j′≠j aij′), that is an individual’s counterfactual outcome only depends on his exposure level aij, and on the total number of people exposed in his group. Let Yij (aij; α0) ≡ Yij (aij, ai,−j; α0) for any ai,−j ∈ A(ni – 1, Ki – aij), aij = 0, 1; and let
where is the within-group sample variance and the between group sample variance for individuals with Aij = a ∈ {0, 1}. Also, let
and
and define
(6) |
(7) |
(8) |
(9) |
Hudgens and Halloran (2008) proved that under assumptions 1 and 2:
(10) |
(11) |
(12) |
(13) |
That is the variance estimators (6)-(9) are generally conservative. However, as they show in equation (10), equality holds if and only if
(14) |
for fixed constant, for j = 1, …, ni and i = 1, …, N, which is equivalent to an additive individual direct causal effect across all groups. Note that when Yij (Ai) is binary, and 0 < |DE (α0)| < 1, then the hypothesis of additive direct treatment effects cannot hold as the only values of DE (α0) consistent with additivity are 0, 1 and −1. Hudgens and Halloran (2008) also establish analogous conditions under which equality holds for each of the other equations (11)–(13).
Despite the availability under assumptions 1 and 2, of reasonable variance estimators given by equations (6) – (9) for the various estimators of causal effects proposed by Hudgens and Halloran (2008), a formal framework for statistical inference on population average causal effects is currently lacking. As a remedy, in the following section, we develop a finite sample framework for making causal inferences in the context of interference.
4.3 Finite sample inference for a binary outcome
We construct novel finite sample confidence intervals for the four population average causal effects of interest. To simplify the exposition, we mainly focus on the case of a binary outcome. To the best of our knowledge there currently exists no method, whether finite or large sample-based, to construct a confidence interval for any of the causal parameters of current interest. In a technical report, we show that admits an alternative representation as a martingale, an observation which enables us to use a Hoeffding-type exponential inequality to obtain the desired finite sample confidence interval. We prove the following results.
Theorem 1
For any level γ ∈ (0, 1), the interval
is a finite sample (1 – γ) CI of DE (α0) under assumption 1, where
(15) |
and for i = 1, …, N
According to the theorem, for each value of (q, N, γ), the coverage probability Pr{DE(α0) ∈ CDE (γ, q, N)} is guaranteed under assumption 1 to be no smaller than 1 – γ, with the length of CDE (γ, q, N) proportional to , so that for a fixed value of (γ, q), CDE (γ, q, N) becomes increasingly precise as the number of groups in the study grows. However, we note that CDE (γ, q, N) may not be particularly useful when N is small, for those values of (γ, q) such that . This is because in such a case, the corresponding confidence interval is noninformative, as it contains the entire range of possible values of , since [−1, 1] ⊆ CDE (γ, q, N) and . To further illustrate this point, suppose that and q = 1/2, then . This implies that CDE (γ, q, N) is guaranteed to be noninformative for values of N ≤ 9. As made evident in the proof of the theorem, the term 4 in equation (15) is an upper bound for the squared absolute deviation of the conditional average direct effect from the population average direct effect Ȳ(0; α0) – Ȳ(1; α0). This bound increases as q decreases towards zero, a situation which can arise in a study where the proportion of groups randomized to the treatment allocation α0 is very small, and can happen even when C and N are both relatively large. This will invariably result in an increase in uncertainty in our inferences on . However, we note that more accurate inferences may still be possible for the population conditional average causal direct effect which we define as
and which corresponds to the average causal direct effect for the population of groups actually randomized to α0. The next theorem provides a finite sample confidence interval for .
Theorem 2
For any level γ ∈ (0,1), the interval
is a finite sample (1 – γ) CI of under assumption 1, where
Note that both CDE (γ, q, N) and CIDEc (γ, q, N) are centered around the same estimator , which is unbiased for and is conditionally unbiased for . However, the length of the second confidence interval no longer includes the term 4 and thus will often be substantially shorter.
The following theorem provides a finite sample confidence interval for the population average indirect causal effect.
Theorem 3
For any level γ ∈ (0, 1), the interval
is a finite sample (1 – γ) CI of $\overline{IE}(\alpha_0, \alpha_1)$ under assumption 1, where
and
The next two theorems give finite sample confidence intervals for the population average total causal effect and for the population average overall causal effect respectively.
Theorem 4
For any level γ ∈ (0, 1), the interval
is a finite sample (1 – γ) CI of $\overline{TE}(\alpha_0, \alpha_1)$ under assumption 1, where
and
Theorem 5
For any level γ ∈ (0, 1), the interval
is a finite sample (1 – γ) CI of $\overline{OE}(\alpha_0, \alpha_1)$ under assumption 1, where
and
Note that , with the corresponding confidence intervals having identical length. Future work could improve upon the length of these confidence intervals by sharpening the exponential inequalities used in their derivation (van der Vaart and Wellner, 1996) and by leveraging additional assumptions such as that of Stratified interference. In future work, we also plan to consider inference for continuous and possibly unbounded outcomes. The technical developments necessary to achieve these results are beyond the scope of the current paper and will be addressed elsewhere.
5 Towards Inference in observational studies
In this section, we briefly consider an approach for drawing causal inferences from observational data in the presence of interference. We begin by noting that in the absence of (two-stage) randomization, the estimators of Section 4 are no longer valid in an observational study. This is because Assumption 1 is in general no longer tenable in the non-experimental setting of an observational study, so a different approach is needed. To make progress, we consider the following assumption:
Assumption 3
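For $i = 1, \ldots, N$, we assume that conditional on $L_i$, the treatment allocation $A_i$ is independent of the counterfactual variables $Y_i(\cdot)$, that is:
$$\Pr\{A_i = a_i \mid L_i, Y_i(\cdot)\} = f_{A|L_i}(a_i \mid L_i), \tag{16}$$
where $f_{A|L_i}(a_i \mid L_i) \equiv \Pr\{A_i = a_i \mid L_i\}$.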
This assumption is a group-level generalization of the standard conditional randomization assumption routinely made at the individual-level in the analysis of observational studies. It states that the treatment allocation program Ai is randomly assigned to individuals in group i conditional on the vector of covariates Li observed on these individuals. Whereas in the previous section, the outcome was assumed to be binary, hereafter, no such assumption is needed. In addition to Assumption 3, we suppose that the following positivity assumption holds:
Assumption 4
For $i = 1, \ldots, N$ we assume that conditional on $L_i$, we have that for all $a_i \in \mathcal{A}(n_i)$
$$f_{A|L_i}(a_i \mid L_i) > 0. \tag{17}$$
Assumption 4 is a group-level version of the positivity assumption routinely made at the individual level in the analysis of observational studies. In the appendix, we show that the following theorem holds:
Theorem 6
Suppose that fA|Li (·|Li) satisfies assumptions 3 and 4, and that α0 is the parametrization of a Bernoulli individual group assignment strategy (i.e. a type B parametrization) which satisfies assumption 4. Let
and
Then
and
According to this theorem, if the allocation probability mechanism fA|L(·|Li) is known, the population counterfactual averages Ȳi (a; α0) and Ȳi (α0) are identified from the observed data, and are unbiased estimators of Ȳi (a; α0) and Ȳi (α0) respectively. The theorem also immediately gives the following result. Let
Unfortunately, are not feasible in practice since, as is usually the case in observational studies, fA|L(·|Li) is unknown to the analyst. To proceed, we must estimate this unknown treatment allocation mechanism from the observed data. Because Li will typically include a large vector of covariates, nonparametric estimation of fA|L(Ai|Li) is not a viable option, and parametric or semi-parametric models must be adopted in practice. Next, we provide a brief and informal description to illustrate what a parametric approach entails in practice, in the particularly favorable setting where the number of groups N is reasonably large. In such a setting, we propose to estimate a parsimonious model fA|Li (Ai|Li;ψ) = fA|L(Ai|Li;ψ) i = 1, …, N, with unknown parameter ψ = (ψa, ψb), where fA|L(Ai|Li; ψ) is assumed to be a mixed model of the form
with hA|L (1|Lij bi; ψa) say the logistic regression model logit and bi a random effect known to follow a parametric density fb (bi|Vi;ψb) indexed by an unknown parameter ψb. The standard logistic-normal mixed model corresponds to the choice of fb (bi|Vi;ψb) univariate normal with mean ψa,1 and variance ψa,2. Estimation of ψa = (ψa,1, ψa,2) and ψb is obtained by maximizing
(18) |
with respect to ψ to give ψ̂. The mixed model paradigm is particularly appealing in the current setting, as it provides a flexible framework to account for a possible non-null conditional association between Aij and Aij′ given Li, for j ≠ j′. Furthermore, under the assumption that Ai and Ai′, are independent given Li and Li′ for i ≠ i′, ψ̂ is a maximum likelihood estimator, and thus, under standard regularity conditions it is . However, note that the mixed model is agnostic to a possible non-null conditional association between Aij and Ai′j′ for i ≠ i′. Such a non-null association between the exposure levels of individuals belonging to different groups may arise say due to the spatial proximity of the two groups, even in the absence of between-group interference. In such a case, ψ̂ is no longer the mle, but will remain consistent as the number of groups grows to infinity, provided that the non-null association of exposure levels between groups is not too pervasive. Specifically, this will hold provided that the dependence between the treatment allocation program of a given group is non-null only with that of a fixed number of groups, as determined say by spatial proximity. Feasible estimators of the various causal effects are then obtained by substituting Alternately, one may use the more stable estimators
A large sample estimator of the variances of the estimates of the various causal effects can be obtained under standard regularity assumptions using well known Taylor series arguments that we do not reproduce here. The finite sample behavior of these various estimators will be examined in a simulation study we plan to report elsewhere.
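The inverse probability weighting idea of this section can be illustrated by exact enumeration in one small group. This is a sketch under loudly stated assumptions rather than a reproduction of the estimator in Theorem 6: it assumes a weight of the form "type B probability of the others' treatments divided by the group-level allocation probability", a known (hypothetical) allocation mechanism `propensity` that is positive everywhere and correlated within the group, and a hypothetical potential outcome function `y`.

```python
from itertools import product


def bernoulli_pmf(vec, alpha):
    """Type B probability of a treatment vector: independent Bernoulli(alpha)."""
    treated = sum(vec)
    return alpha ** treated * (1 - alpha) ** (len(vec) - treated)


def y(j, a):
    """Hypothetical potential outcome of individual j under allocation a."""
    others = a[:j] + a[j + 1:]
    return 0.8 - 0.3 * a[j] - 0.4 * sum(others) / len(others)


def propensity(a):
    """Hypothetical known mechanism f(a | L): mass proportional to 1 + #treated.

    It is positive for every allocation (positivity) and is not a product
    of individual probabilities, so treatments are dependent within the group.
    """
    norm = sum(1 + sum(b) for b in product((0, 1), repeat=len(a)))
    return (1 + sum(a)) / norm


def ipw_estimate(y_obs, a_obs, a_own, alpha, f_obs):
    """Assumed IPW form: sum_j pi(A_{-j}; alpha) 1(A_j = a_own) Y_j / (n f(A | L))."""
    n = len(a_obs)
    total = 0.0
    for j in range(n):
        if a_obs[j] == a_own:
            others = a_obs[:j] + a_obs[j + 1:]
            total += bernoulli_pmf(others, alpha) * y_obs[j]
    return total / (n * f_obs)


def expected_ipw(a_own, alpha, n):
    """Exact expectation of the IPW estimate over the allocation mechanism."""
    expectation = 0.0
    for a in product((0, 1), repeat=n):
        y_obs = [y(j, a) for j in range(n)]
        expectation += propensity(a) * ipw_estimate(y_obs, a, a_own, alpha,
                                                    propensity(a))
    return expectation


n, alpha = 3, 0.5
print(expected_ipw(0, alpha, n), expected_ipw(1, alpha, n))
```

For this toy outcome model the target averages are 0.8 − 0.3a − 0.4α, and the enumeration recovers them even though the observed allocations were not generated by the type B strategy α. In practice the allocation probabilities would have to be estimated, as discussed above.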
Thus far we have assumed that Yi(·) is fixed; we will now briefly consider a setting in which Yi(·) is considered random. Hong and Raudenbush (2006) assume Stratified interference (Assumption 2) and assume that Yij (ai) depends on ai,−j only through some known scalar function v(ai,−j) so that Yij (ai) can be written as Yij (aij, v(ai,−j)). Suppose now that for all i, j, Aij is determined by simple randomization; then assumption 3 will hold and it will also be the case that
(19) |
Hong and Raudenbush (2006) consider a variation on this assumption in the context of observational data. Specifically, they assume that
(20) |
and from this it follows that
and from this one could obtain conditional direct, indirect and total effects, namely,
Hong and Raudenbush (2006) also allow Lij to contain cluster level covariates along with cluster aggregates of individual level covariates. A similar approach is taken in VanderWeele (2010) in the context of mediation in the presence of interference. Note, however, that (20) requires that Yij (aij, v) be mean independent of both Aij and v(Ai,−j) conditional on Lij. If, for each individual, Aij is randomized conditional on Lij, although this will imply that Yij (aij, v) is mean independent of Aij conditional on Lij, it does not necessarily guarantee that Yij (aij, v) is mean independent of v(Ai,−j) conditional on Lij. More generally, instead of (20) we might consider
(21) |
where h(Li) is a known function of Li. However once again, with (21), even if for each individual Aij were randomized conditional on Lij, h(Li), this does not guarantee that Yij (aij, v) is mean independent of V (ai,−j) conditional on Lij, h(Li) unless h(Li) = Li.
6 Varieties of direct and indirect effects
We have considered several types of effects that arise when there is interference between units. We have considered the effect on some outcome of an individual’s treatment when the treatments of other units in a cluster are held fixed at a certain value; following Hudgens and Halloran (2008), this was referred to as a "direct effect." We have also considered the effect on an individual’s outcome of holding the individual’s own treatment fixed but modifying the treatments received by other individuals in the same cluster; again following Hudgens and Halloran (2008), this was referred to as an "indirect effect." Of course, the terms "direct effects" and "indirect effects" are also used in the context of questions of mediation analysis, i.e. in assessing the extent to which the effect of some treatment on an outcome is mediated through some intermediate (the indirect effect) and the extent to which it occurs through other pathways (the direct effect). In some contexts, both interference and mediation may be present and of interest, and the terms "direct effect" and "indirect effect" become ambiguous as they may make reference to the concepts from interference or from mediation.
In the infectious disease literature, the terminology of "direct and indirect effects" when interference is present dates at least as far back as Halloran and Struchiner (1991), although Hudgens and Halloran (2008) arguably provide the first formal counterfactual definitions. The terminology of "direct and indirect effects" in the context of mediation analysis extends at least as far back as the literature on structural equation modeling (e.g. Duncan, 1966) motivated by the method of path coefficients of Wright (1921); counterfactual notions of direct and indirect effects were described in detail by Holland (1988) and Robins and Greenland (1992). Because of the potential ambiguity in the terms "direct effect" and "indirect effect," Sobel (2006) chose to use the term "spillover effect" for the effect on an individual’s outcome of holding the individual’s own treatment fixed but modifying the treatments received by other individuals. An early paper (Strain et al., 1976) in experimental educational psychology appears to have interchangeably used "indirect effect" and "spillover effect" to denote the effect on a child’s outcome of holding the child’s own treatment fixed but modifying the treatments received by other children. Complicating terminological issues yet further, the causal inference literature on mediation itself has produced alternative definitions of direct and indirect effects based on potential interventions on the mediator (Robins and Greenland, 1992; Pearl, 2001) or alternatively on the notion of principal strata (Frangakis and Rubin, 2002; Rubin, 2004).
Variants of the notions of direct and indirect effects based on principal strata may in fact further be reformulated in the context of interference. Consider a vaccine trial (type A randomization) in which each cluster has two individuals so that for all i, ni = 2 (e.g. a study of married households with no children) such that half of the households were randomized to no vaccine (α0 = 0) and half of the households were randomized to having one individual (e.g. the wife) vaccinated (α1 = 0.5). For each i, let j = 1 denote the subject that is potentially vaccinated (e.g. the wife) and j = 2 the subject that is never vaccinated (e.g. the husband). In the infectious disease context, a vaccination for individual 1 may prevent individual 2 from being infected either because the vaccine prevents individual 1 from being infected or possibly because, even if individual 1 becomes infected, the vaccine itself renders the infection less contagious. A distinction between these two possibilities is sometimes drawn by using "susceptibility effect" to describe the former and "infectiousness effect" to describe the latter (Datta et al., 1999). Consider the following causal quantity, Ei(Yi2 (1, 0) – Yi2 (0, 0)|Yi1 (1, 0) = Yi1 (0, 0) = 1); this is the effect on individual 2 of vaccinating individual 1 (with individual 2 unvaccinated) amongst the subset of households for whom individual 1 becomes infected irrespective of whether individual 1 receives the vaccination; this would be a principal strata direct effect (Rubin, 2004). If this quantity were non-zero we might interpret this as evidence of an "infectiousness effect" of the vaccine since the vaccination of individual 1 affects the outcome of individual 2 even though it has no effect on the outcome of individual 1.
Future work could potentially adapt estimation methods for principal strata direct effects (Gallop et al., 2009; Sjölander et al., 2009) to attempt to estimate and potentially test for the presence of an "infectiousness effect", Ei(Yi2 (1, 0) – Yi2 (0, 0)|Yi1 (1, 0) = Yi1 (0, 0) = 1).
Note that although the infectiousness effect quantity defined above is a "principal strata direct effect," within the context of interference it is a form of "indirect effect" since individual 2’s vaccination status is fixed to be unvaccinated in the causal comparison. Within the context of interference, both the "susceptibility effect" and the "infectiousness effect" are in fact forms of "indirect effects" (in the interference sense) because both concern the effect on individual 2 of holding individual 2’s vaccine status fixed but changing the vaccine status of individual 1; if interference were absent, neither of the effects would be present. If interference were absent then the principal strata "infectiousness effect" quantity defined above would reduce to Ei(Yi2 (0) – Yi2 (0)|Yi1 (1) = Yi1 (0) = 1) = 0. Again, terminology concerning "direct and indirect effects" is ambiguous and is easily confused: what is a "direct effect" in the context of principal strata is an "indirect effect" in the context of interference.
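As a purely illustrative sketch of the definition, the principal strata quantity Ei(Yi2 (1, 0) – Yi2 (0, 0)|Yi1 (1, 0) = Yi1 (0, 0) = 1) can be evaluated on a toy table of potential outcomes. The six households and their outcomes below are hypothetical and were invented for illustration, as is the function name `infectiousness_effect`. In practice the two potential outcomes of a household are never both observed, so this computation shows what the estimand means and is not an estimator.

```python
from statistics import mean

# Hypothetical potential outcomes for six two-person households (1 = infected).
# Each tuple is (Y1(1,0), Y1(0,0), Y2(1,0), Y2(0,0)); the values are made up.
households = [
    (1, 1, 0, 1),
    (1, 1, 1, 1),
    (1, 1, 0, 1),
    (1, 1, 0, 0),
    (0, 1, 0, 1),  # vaccine protects individual 1: outside the stratum
    (0, 0, 0, 0),  # individual 1 never infected: outside the stratum
]


def infectiousness_effect(rows):
    """Mean of Y2(1,0) - Y2(0,0) among households with Y1(1,0) = Y1(0,0) = 1."""
    stratum = [r for r in rows if r[0] == 1 and r[1] == 1]
    if not stratum:
        raise ValueError("the principal stratum is empty")
    return mean(y2_vax - y2_unvax for _, _, y2_vax, y2_unvax in stratum)


effect = infectiousness_effect(households)

# Without interference, individual 2's outcome cannot depend on individual 1's
# treatment, so Y2(1,0) is replaced by Y2(0,0) and the contrast vanishes.
no_interference = [(y1v, y1u, y2u, y2u) for y1v, y1u, _, y2u in households]
effect_without_interference = infectiousness_effect(no_interference)

print(effect, effect_without_interference)
```

Only the first four households belong to the principal stratum; the last two are excluded because individual 1 is not infected under both treatment assignments. Setting Yi2 (1, 0) equal to Yi2 (0, 0) in every household mirrors the no-interference case, in which the quantity is identically zero.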
Because of the multiple varieties of direct and indirect effects, the use of more specific terminology may be desirable. In the context of interference, "indirect effect" and "direct effect" could be replaced by "spillover effect" and "unit-treatment effect"; in the context of mediation, "indirect effect" and "direct effect" could be replaced by "mediated effect" and "unmediated effect." In the context of infectious diseases and the principal strata effect defined above, "susceptibility effect" and "infectiousness effect" could be used rather than making reference to "direct and indirect effects." Yet further caution with regard to terminology on direct and indirect effects will be needed when both interference and mediation are present and of interest (VanderWeele, 2010).
7 Concluding remarks
In this paper we have reviewed some of the literature on causal inference in the presence of interference, provided new results on inference without the assumption of stratified interference, and described an inverse probability weighting approach to causal inference under interference in the context of observational studies. Interference arises in settings in which social interactions are present, including settings of infectious disease, the study of neighborhoods and classrooms, and a variety of economic contexts. Although most work in causal inference has proceeded under a no-interference assumption, there are clearly many contexts in which such an assumption is not plausible. The issues raised by interference can be circumvented to a certain extent by implementing treatment programs at the cluster level rather than the individual level. However, interference gives rise to spillover effects which are themselves of intrinsic interest, and the analysis of such spillover effects is inaccessible without explicitly taking interference into account. Theory and methods to address questions of interference and spillover effects will thus likely be important for a number of applied research settings.
The present work could be extended in a number of directions. Finite sample confidence intervals of shorter length than those in section 4 could be obtained by employing additional assumptions such as stratified interference; continuous and unbounded outcomes could also be considered. The finite sample behavior of the inverse probability weighting estimation approach we proposed in this paper could be explored. Identification or partial identification results for the "infectiousness effect," formalized in terms of principal strata, could be developed. Finally, further research could also potentially develop a more general framework for interference and spillover effects so as to consider a range of settings in which both interference and mediation are present and also so as to potentially allow for both within-cluster and between-cluster forms of interference. Causal inference under interference is a relatively new subfield and considerable work remains to be carried out.
APPENDIX
Proof of Lemma 1
Note that
Let . Each term of the first sum equals
and
so that
Therefore, as Yij (ai) ≥ 0 for all ai ∈ A(ni;K0,i;) and all j in group i, , since
Proof of Theorems 1-5
See technical report available from the authors.
Proof of Theorem 6
Under Assumptions 3 and 4, we have that for
similarly,
References
- 1. Chow YS, Teicher HP. Probability Theory: Independence, interchangeability, martingales. 3rd edition. Springer Texts in Statistics; 1997.
- 2. Datta S, Halloran ME, Longini IM. Efficiency of estimating vaccine efficacy for susceptibility and infectiousness: randomization by individual versus household. Biometrics. 1999;55:792–798. doi: 10.1111/j.0006-341x.1999.00792.x.
- 3. Duncan OD. Path analysis: sociological examples. American Journal of Sociology. 1966;72:1–16.
- 4. Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58:21–29. doi: 10.1111/j.0006-341x.2002.00021.x.
- 5. Gallop R, Small DS, Lin JY, Elliott MR, Joffe M, Ten Have TR. Mediation analysis with principal stratification. Statistics in Medicine. 2009;28:1108–1130. doi: 10.1002/sim.3533.
- 6. Graham B. Identifying social interactions through conditional variance restrictions. Econometrica. 2008;76:643–660.
- 7. Halloran ME, Struchiner CJ. Causal inference for infectious diseases. Epidemiology. 1995;6:142–151. doi: 10.1097/00001648-199503000-00010.
- 8. Hoeffding W. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association. 1963 Mar;58(301):13–30.
- 9. Holland PW. Causal inference, path analysis, and recursive structural equations models. Sociological Methodology. 1988;18:449–484.
- 10. Hong G, Raudenbush SW. Evaluating kindergarten retention policy: A case study of causal inference for multilevel observational data. Journal of the American Statistical Association. 2006;101:901–910.
- 11. Hudgens MG, Halloran ME. Towards causal inference with interference. Journal of the American Statistical Association. 2008;103:832–842. doi: 10.1198/016214508000000292. PMC2600548.
- 12. Manski CF. Economic analysis of social interactions. Journal of Economic Perspectives. 2000;14:115–136.
- 13. Manski CF. Identification of treatment response with social interactions. Northwestern University Working Paper. 2010.
- 14. Joag-Dev K, Proschan F. Negative Association of Random Variables with Applications. Annals of Statistics. 1983;11(1):286–295.
- 15. Pearl J. Direct and indirect effects. Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence; San Francisco: Morgan Kaufmann. 2001. pp. 411–420.
- 16. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3:143–155. doi: 10.1097/00001648-199203000-00013.
- 17. Rosenbaum PR. Interference between units in randomized experiments. Journal of the American Statistical Association. 2007;102:191–200. doi: 10.1080/01621459.2012.655954.
- 18. Rubin DB. Comment on: "Randomization analysis of experimental data in the Fisher randomization test" by D. Basu. Journal of the American Statistical Association. 1980;75:591–593.
- 19. Rubin DB. Direct and indirect effects via potential outcomes. Scandinavian Journal of Statistics. 2004;31:161–170.
- 20. Sjölander A, Humphreys K, Vansteelandt S, Bellocco R, Palmgren J. Sensitivity analysis for principal stratum direct effects, with an application to a study of physical activity and coronary heart disease. Biometrics. 2009;65:514–520. doi: 10.1111/j.1541-0420.2008.01108.x.
- 21. Sobel ME. What Do Randomized Studies of Housing Mobility Demonstrate?: Causal Inference in the Face of Interference. Journal of the American Statistical Association. 2006;101:1398–1407.
- 22. Strain PS, Shores RE, Kerr MM. An experimental analysis of "spillover" effects on the social interaction of behaviorally handicapped preschool children. Journal of Applied Behavior Analysis. 1976;9:31–40. doi: 10.1901/jaba.1976.9-31.
- 23. van der Vaart AW. Weak Convergence and Empirical Processes: With Applications to Statistics. Springer Series in Statistics. 1996.
- 24. VanderWeele TJ. Direct and indirect effects for neighborhood-based clustered and longitudinal data. Sociological Methods and Research. 2010; in press. doi: 10.1177/0049124110366236.
- 25. Wright S. Correlation and causation. J. Agric. Res. 1921;20:557–585.