On causal inference in the presence of interference

Eric J Tchetgen Tchetgen; Tyler J VanderWeele

doi:10.1177/0962280210386779

Author manuscript; available in PMC: 2014 Nov 2.

Published in final edited form as: Stat Methods Med Res. 2010 Nov 10;21(1):55–75. doi: 10.1177/0962280210386779

PMCID: PMC4216807 NIHMSID: NIHMS632455 PMID: 21068053

1 Introduction

Interference is said to be present when the exposure or treatment received by one individual may affect the outcomes of other individuals. Such interference can arise in settings in which the outcomes of the various individuals come about through social interactions (Manski, 2000, 2010). Most of the literature on causal inference proceeds by making an assumption of "no-interference." For example, in Rubin’s formulation of the potential outcomes framework, an assumption referred to as the "Stable Unit Treatment Value Assumption" or "SUTVA" is made, which includes within it a no-interference assumption (Rubin, 1980). Such no-interference assumptions are employed routinely though not always acknowledged. When interference is present, causal inference is rendered considerably more complex, and the literature on causal inference in the presence of interference has only recently begun to develop (Sobel, 2006; Hong and Raudenbush, 2006; Rosenbaum, 2007; Hudgens and Halloran, 2008; Graham, 2008; Manski, 2010). In this paper we hope both to summarize some of the concepts and results from the existing literature and to extend that literature by considering new results for finite sample inference, new inverse probability weighting estimators in the presence of interference, and new causal estimands of interest.

The remainder of this paper is organized as follows. In section 2 we present the notation we will be using throughout. In section 3 we review notions of direct, indirect (spillover), total and overall causal effects of Hudgens and Halloran (2008) that arise when interference is present. In section 4 we discuss inference for these effects in randomized trials and present new results on variance estimation and finite sample confidence intervals in the presence of interference. In section 5 we consider the context of observational studies and present a result on inverse probability weighting estimators of causal effects when interference is present. In section 6, we discuss varieties of direct and indirect effects present in the causal inference literature and comment on the terminological ambiguity concerning the expressions "direct effect" and "indirect effect"; we also introduce a new causal estimand that indicates a non-zero "infectiousness effect" in the context of vaccine trials (Datta et al., 1999). Finally, in section 7, we offer some concluding remarks and directions for future research.

2 Preliminaries

2.1 Counterfactuals

As in Hudgens and Halloran (2008), suppose data are observed on N > 1 groups of individuals, or blocks of units. For i = 1, …, N let n_i denote the number of individuals in group i and let A_i ≡ (A_i1, …, A_{in_i}) denote the treatments those n_i individuals received. Throughout, we assume perfect compliance, that is, treatment assigned to an individual is equivalent to treatment received by the individual. We assume that A_ij is a dichotomous random variable with support equal to {0, 1}, so that A_i takes values in the set {0, 1}^{n_i}. Let A_{i,−j} ≡ (A_i1, …, A_{in_i})\A_ij ≡ (A_i1, …, A_{i,j−1}, A_{i,j+1}, …, A_{in_i}) denote the (n_i – 1)-dimensional subvector of A_i with the jth entry deleted. Following Hudgens and Halloran (2008) and Sobel (2006), we refer to A_i as an intervention, treatment or allocation program, to distinguish it from the individual treatment A_ij. Furthermore, for n = 1, 2, …, we define 𝒜(n) as the set of vectors of possible treatment allocations of length n; for instance 𝒜(2) ≡ {(0, 0), (0, 1), (1, 0), (1, 1)}. Therefore, A_i takes one of 2^{n_i} possible values in 𝒜(n_i), while A_{i,−j} takes values in 𝒜(n_i – 1) for all j. For positive integers n and k, we further define 𝒜(n, k) to be the subset of 𝒜(n) wherein exactly k individuals receive treatment 1, that is, every element a of 𝒜(n, k) satisfies $1_{n}^{T} a = k$, where 1_n is the vector of length n with entries all equal to one.
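The allocation sets just defined are small enough to enumerate for toy group sizes. The following is a minimal sketch, not part of the original development; the function names `allocations`, `allocations_k` and `drop` are hypothetical, and indices are 0-based rather than 1-based.

```python
from itertools import product


def allocations(n):
    """All 2**n treatment vectors of length n, i.e. the set A(n)."""
    return list(product((0, 1), repeat=n))


def allocations_k(n, k):
    """The subset A(n, k): vectors of length n with exactly k entries equal to 1."""
    return [a for a in allocations(n) if sum(a) == k]


def drop(a, j):
    """The subvector a_{-j}: the vector a with its j-th entry (0-based) deleted."""
    return a[:j] + a[j + 1:]


print(allocations(2))          # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(len(allocations_k(4, 2)))
print(drop((1, 0, 1), 1))
```

Full enumeration is only practical for small n, since 𝒜(n) has 2^n elements.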

For each block i, we shall assume there exist counterfactual (potential outcome) data Y_i(·) = {Y_i(a_i) : a_i ∈ 𝒜(n_i)} where Y_i(a_i) = {Y_i1 (a_i), …, Y_{in_i} (a_i)}, and Y_ij (a_i) is individual j’s response under treatment allocation a_i; and that the observed outcome Y_ij for individual j in block i is equal to his counterfactual outcome Y_ij (A_i) under the realized treatment allocation A_i. The notation Y_ij(a_i) makes explicit the possibility for interference between individuals within a block, that is, the potential outcome for individual j may depend on another individual’s treatment assignment in block i. Also, note that for counterfactuals to remain well defined, this notation implicitly assumes that counterfactuals for an individual in block i do not depend on treatment assignments of individuals in a different block i′ ≠ i. This encodes the assumption of partial interference considered by Sobel (2006) and Hudgens and Halloran (2008), which they point out to be particularly appropriate when the observed blocks are well separated by space or time, such as in some group randomized studies in the social sciences or in some community-randomized vaccine trials. The ordinary no-interference assumption (Cox, 1958; Rubin, 1980) generally made in the causal inference literature is then that for all i and j, if a_i and $a_{i}^{'}$ are such that $a_{i j} = a_{i j}^{'}$ then $Y_{i j} (a_{i}) = Y_{i j} (a_{i}^{'})$, which in turn implies that the counterfactual outcomes for individual j in group i can be written as {Y_ij (a) : a = 0, 1}.

Hereafter, we follow the convention in Sobel (2006) and Hudgens and Halloran (2008), and suppose that Y_i(·) is fixed, in that it does not depend on the random treatment allocation program A_i. In addition to treatment and outcome data, we suppose that we also observe fixed data L_i = (L_i1, …, L_{in_i}), i = 1, …, N, where L_ij denotes pretreatment covariates for individual j in block i; we allow L_ij to contain block-level covariates along with block aggregates of individual-level covariates.

2.2 Treatment Assignment in Group Randomized Experiments

In group randomized experiments, treatment allocation is determined by the experimenter; therefore the assignment mechanism π_i (A_i) of A_i is known. Let π_i(A_i; α₀) denote an experimenter’s particular choice of parametrization for the distribution of A_i indexed by the parameter α₀, that is π_i (A_i) = π_i(A_i; α₀). In this paper, we consider two types of parametrizations.

Definition

(A) A parametrization of type A with parameters n_i and K_{0,i} for block i entails a so-called mixed individual group assignment strategy, whereby the treatment program A_i in block i is randomly allocated conditional on $1_{n_{i}}^{T} A_{i} = \sum_{j = 1}^{n_{i}} A_{i j} = K_{0, i}$ with probability mass function

π_{i} (A_{i}; α_{0}) = I (A_{i} \in 𝒜 (n_{i}, K_{0, i})) / \binom{n_{i}}{K_{0, i}}

(B) A parametrization of type B entails a Bernoulli individual group assignment strategy, whereby treatment is randomly assigned to different individuals within block i according to the known probability mass function

π_{i} (A_{i}; α_{0}) = \prod_{j = 1}^{n_{i}} α_{0}^{A_{i j}} {(1 - α_{0})}^{1 - A_{i j}}

where 0 < α₀ < 1.
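As an illustration, the two probability mass functions can be written down directly and checked to sum to one over 𝒜(n_i). This is a sketch only; the names `pi_type_a` and `pi_type_b` and the numerical values (n_i = 4, K_{0,i} = 2, α₀ = 0.3) are assumptions made for the example.

```python
from itertools import product
from math import comb


def pi_type_a(a, k):
    """Type A pmf: uniform over allocations with exactly k treated, zero otherwise."""
    return 1.0 / comb(len(a), k) if sum(a) == k else 0.0


def pi_type_b(a, alpha):
    """Type B pmf: independent Bernoulli(alpha) assignment for each individual."""
    p = 1.0
    for a_j in a:
        p *= alpha if a_j == 1 else 1.0 - alpha
    return p


n = 4
all_a = list(product((0, 1), repeat=n))
total_a = sum(pi_type_a(a, 2) for a in all_a)    # sums over A(4); support is A(4, 2)
total_b = sum(pi_type_b(a, 0.3) for a in all_a)
print(total_a, total_b)
```

Under type A the number treated is fixed by design, whereas under type B it is random with a binomial distribution.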

For example, two type A treatment assignment strategies α₀ and α₁ might entail randomly assigning half of n_i individuals in group i to treatment 1 and the other half to treatment 0 under a strategy corresponding to α₀ versus assigning all individuals in a group to treatment zero under the second strategy corresponding to α₁. Similarly, two treatment assignment strategies $α_{0}^{*}$ and $α_{1}^{*}$ of the second type might assign each individual in a group to treatment 1 with probability 1/2 under strategy $α_{0}^{*}$ versus assigning each individual in a group to treatment 0 with probability 1/3 under strategy $α_{1}^{*}$. Sobel (2006) and Hudgens and Halloran (2008) considered type A treatment allocation programs in group randomized trials; in Section 5, we show that allocation programs of type B play an important conceptual role in the definition and estimation of causal effects in observational studies.

Suppose our goal is to assess the causal effects of assigning groups to α₀, compared to α₁, where α₀ and α₁ are two individual group assignment strategies of type A. To achieve this goal in an experimental study, Hudgens and Halloran (2008) considered the following two-stage group randomization framework. In the first stage, each of the N groups is randomly assigned to either α₀ or α₁. In the second stage individuals within a group are randomly assigned to treatment conditional on their group’s assignment in the first stage. For instance, in the first stage, half of the N groups might be assigned to an allocation strategy α₀ while the other half is assigned to α₁; in the second stage, two-thirds of the individuals within groups assigned α₀ are randomly assigned to treatment 1, while one-third of the individuals within a group assigned to α₁ receive treatment 1. Such a design is commonly known as split-plot randomization or pseudo-cluster randomization. As Hudgens and Halloran (2008) point out, two-stage randomization designs are key to obtaining answers for important public health questions in the face of interference, such as: how many cases due to an infectious disease will be averted by vaccinating two-thirds of the population compared to only vaccinating one-third of the population?
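The two-stage design described above can be simulated in a few lines. The sketch below is illustrative only: the function name `two_stage_assign`, the group sizes, and the rounding of the treated fraction to an integer count are all assumptions, not part of the design as stated by Hudgens and Halloran (2008).

```python
import random


def two_stage_assign(group_sizes, n_groups_alpha0, frac0, frac1, seed=0):
    """Two-stage randomization with type A strategies at both stages.

    Stage 1: exactly n_groups_alpha0 groups receive strategy alpha_0 (S_i = 1).
    Stage 2: within group i, a fixed number of individuals is treated,
             round(frac0 * n_i) under alpha_0 and round(frac1 * n_i) under alpha_1.
    """
    rng = random.Random(seed)
    n_groups = len(group_sizes)
    chosen = set(rng.sample(range(n_groups), n_groups_alpha0))
    S = [1 if i in chosen else 0 for i in range(n_groups)]
    A = []
    for i, n_i in enumerate(group_sizes):
        frac = frac0 if S[i] == 1 else frac1
        k_i = round(frac * n_i)  # assumption: fractions give (near) integer counts
        treated = set(rng.sample(range(n_i), k_i))
        A.append([1 if j in treated else 0 for j in range(n_i)])
    return S, A


sizes = [6, 6, 9, 9]
S, A = two_stage_assign(sizes, n_groups_alpha0=2, frac0=2 / 3, frac1=1 / 3)
print(S)
print(A)
```

Here half of the groups follow α₀ (two-thirds treated) and the other half follow α₁ (one-third treated), mirroring the example in the text.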

3 Causal Estimands

3.1 Direct Causal Effects

Following Halloran and Struchiner (1995), we define the individual direct causal effect of treatment 0 compared to treatment 1 for individual j in group i by:

D E_{i j} (a_{i, - j}) \equiv Y_{i j} (a_{i, - j}, a_{i j} = 0) - Y_{i j} (a_{i, - j}, a_{i j} = 1)

and the individual average direct causal effect for individual j in group i by

{\bar{D E}}_{i j} (α_{0}) \equiv {\bar{Y}}_{i j} (0; α_{0}) - {\bar{Y}}_{i j} (1; α_{0})

(1)

where for a = 0, 1,

{\bar{Y}}_{i j} (a; α_{0}) \equiv \sum_{s \in 𝒜 (n_{i} - 1)} Y_{i j} (a_{i, - j} = s, a_{i j} = a) {Pr}_{α_{0}} (A_{i, - j} = s | A_{i j} = a)

with

{Pr}_{α_{0}} (A_{i, - j} = s | A_{i j} = a) = \frac{π_{i} (A_{i, - j} = s, A_{i j} = a; α_{0})}{\sum_{s' \in 𝒜 (n_{i} - 1)} π_{i} (A_{i, - j} = s', A_{i j} = a; α_{0})}

Note that in the above display, and until stated otherwise, π_i (·; α₀) may either be of Type A or B. Thus, ${\bar{D E}}_{i j} (α_{0})$ is a difference in individual average counterfactual outcomes when a_ij = 0 and when a_ij = 1 under α₀. This is a marginal causal effect as it is a comparison between expected values of the marginal distributions of Y_ij(A_i,−j, a_ij = 0) and of Y_ij(A_i,−j, a_ij = 1) with respect to α₀. Finally, we define the group average direct causal effect by ${\bar{D E}}_{i} (α_{0}) = \sum_{j = 1}^{n_{i}} {\bar{D E}}_{i j} (α_{0}) / n_{i}$ and the population average direct causal effect by $\bar{D E} (α_{0}) = \sum_{i = 1}^{N} {\bar{D E}}_{i} (α_{0}) / N$ .
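To make the averaging in these definitions concrete, the sketch below evaluates the group average direct effect for a single toy group by brute-force enumeration. The potential outcome function `toy_outcome` is entirely hypothetical (own treatment lowers the outcome by 0.4 and each treated neighbour by a further 0.1), as are the helper names and the choice n_i = 4.

```python
from itertools import product
from math import comb


def pi_type_a(a, k):
    """Type A pmf: uniform over allocations with exactly k treated."""
    return 1.0 / comb(len(a), k) if sum(a) == k else 0.0


def pi_type_b(a, alpha):
    """Type B pmf: independent Bernoulli(alpha) assignments."""
    p = 1.0
    for a_j in a:
        p *= alpha if a_j == 1 else 1.0 - alpha
    return p


def toy_outcome(j, a):
    """Hypothetical potential outcome Y_ij(a_i) for individual j (0-based)."""
    return 1.0 + 0.5 * j - 0.4 * a[j] - 0.1 * (sum(a) - a[j])


def y_bar_ij(j, a_val, n, pmf):
    """Ybar_ij(a; alpha): average of Y_ij(s, a_ij = a) over Pr(A_{i,-j} = s | A_ij = a)."""
    num = den = 0.0
    for s in product((0, 1), repeat=n - 1):
        full = s[:j] + (a_val,) + s[j:]   # re-insert a_ij = a at position j
        w = pmf(full)
        num += w * toy_outcome(j, full)
        den += w
    return num / den


def group_direct_effect(n, pmf):
    """Group average direct effect: mean over j of Ybar_ij(0) - Ybar_ij(1)."""
    return sum(y_bar_ij(j, 0, n, pmf) - y_bar_ij(j, 1, n, pmf) for j in range(n)) / n


n = 4
de_type_a = group_direct_effect(n, lambda a: pi_type_a(a, 2))
de_type_b = group_direct_effect(n, lambda a: pi_type_b(a, 0.5))
print(de_type_a, de_type_b)
```

In this toy example the direct effect differs between the two strategies (about 0.3 under type A with two of four treated, about 0.4 under type B with α₀ = 1/2) because, under type A, fixing a_ij changes how many of the other individuals are treated.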

3.2 Indirect Causal Effects or "Spillover Effects"

Halloran and Struchiner (1995) also define an individual indirect causal effect as the causal effect on an individual of the treatment received by others in the group. Specifically, let $I E_{i j} (a_{i, - j}, a_{i, - j}^{'})$ be the individual indirect causal effect on subject j in group i of treatment allocation a_i compared with $a_{i}^{'}$ so that:

I E_{i j} (a_{i, - j}, a_{i, - j}^{'}) = Y_{i j} (a_{i, - j}, a_{i j} = 0) - Y_{i j} (a_{i, - j}^{'}, a_{i j}^{'} = 0)

Sobel (2006) refers to the indirect effect defined above as a "spillover effect." Note that if interference is absent then $I E_{i j} (a_{i, - j}, a_{i, - j}^{'}) = 0$. Similar to direct effects, define the individual average indirect causal effect by ${\bar{I E}}_{i j} (α_{0}, α_{1}) = {\bar{Y}}_{i j} (0; α_{0}) - {\bar{Y}}_{i j} (0; α_{1})$. Finally, define the group average indirect causal effect as ${\bar{I E}}_{i} (α_{0}, α_{1}) = \sum_{j = 1}^{n_{i}} {\bar{I E}}_{i j} (α_{0}, α_{1}) / n_{i}$ and the population average indirect causal effect as $\bar{I E} (α_{0}, α_{1}) = \sum_{i = 1}^{N} {\bar{I E}}_{i} (α_{0}, α_{1}) / N$.

3.3 Total Causal Effects

Total effects reflect both the direct and the indirect effects of a particular treatment assignment on an individual. Following Halloran and Struchiner (1995) we define the individual total causal effects for individual j in group i as:

T E_{i j} (a_{i, - j}, a_{i, - j}^{'}) \equiv Y_{i j} (a_{i, - j}, a_{i j} = 0) - Y_{i j} (a_{i, - j}^{'}, a_{i j}^{'} = 1),

the individual average total causal effect by ${\bar{T E}}_{i j} (α_{0}, α_{1}) = {\bar{Y}}_{i j} (0; α_{0}) - {\bar{Y}}_{i j} (1; α_{1})$, the group average total causal effect by ${\bar{T E}}_{i} (α_{0}, α_{1}) = \sum_{j = 1}^{n_{i}} {\bar{T E}}_{i j} (α_{0}, α_{1}) / n_{i}$ and the population average total causal effect by $\bar{T E} (α_{0}, α_{1}) = \sum_{i = 1}^{N} {\bar{T E}}_{i} (α_{0}, α_{1}) / N$.

3.4 Overall Causal Effects

Following Hudgens and Halloran (2008), we define the individual overall causal effect of treatment a_i compared to treatment $a_{i}^{'}$ for individual j in group i by

O E_{i j} (a_{i}, a_{i}^{'}) = Y_{i j} (a_{i}) - Y_{i j} (a_{i}^{'})

Similarly, define the individual average overall causal effect comparing α₀ to α₁ by ${\bar{O E}}_{i j} (α_{0}, α_{1}) = {\bar{Y}}_{i j} (α_{0}) - {\bar{Y}}_{i j} (α_{1})$, where ${\bar{Y}}_{i j} (α) \equiv \sum_{s \in 𝒜 (n_{i})} Y_{i j} (a_{i} = s) π_{i} (s; α)$ is the average potential outcome of individual j under strategy α; the group average overall causal effect by ${\bar{O E}}_{i} (α_{0}, α_{1}) = \sum_{j = 1}^{n_{i}} {\bar{O E}}_{i j} (α_{0}, α_{1}) / n_{i}$; and the population average overall effect by $\bar{O E} (α_{0}, α_{1}) = \sum_{i = 1}^{N} {\bar{O E}}_{i} (α_{0}, α_{1}) / N$.

The following simple yet instructive properties describe the relationship between the various causal effects:

It follows immediately from their definitions that total effects at the individual, group or population levels can be decomposed as the sum of direct and indirect causal effects at the corresponding level. That is, for example, $\bar{T E} (α_{0}, α_{1}) = \bar{D E} (α_{1}) + \bar{I E} (α_{0}, α_{1})$ (Hudgens and Halloran, 2008).
Total causal effects are not commutative; for instance, in general $\bar{T E} (α_{0}, α_{1}) \neq \bar{T E} (α_{1}, α_{0})$. However $\bar{I E} (α_{0}, α_{1}) = - \bar{I E} (α_{1}, α_{0}) \Rightarrow \bar{D E} (α_{0}) + \bar{D E} (α_{1}) = \bar{T E} (α_{0}, α_{1}) + \bar{T E} (α_{1}, α_{0})$, so that while the total causal effects are not necessarily equal, they are constrained in sum to equal the sum of direct effects (Hudgens and Halloran, 2008).
If $\bar{I E} (α_{0}, α_{1}) = \bar{I E} (α_{1}, α_{0}) = 0$ , then $\bar{T E} (α_{0}, α_{1}) = \bar{T E} (α_{1}, α_{0}) \Leftrightarrow \bar{D E} (α_{0}) = \bar{D E} (α_{1})$ . In the absence of indirect effects, the total effects are commutative if and only if the direct effects are equal (Hudgens and Halloran, 2008).

We also have the following decomposition for the overall effect:
The group average overall effects are equal to a weighted combination of the group average indirect, total and direct effects: ${\bar{O E}}_{i} (α_{0}, α_{1}) = Pr (A_{i j} = 0; α_{1}) {\bar{I E}}_{i} (α_{0}, α_{1}) + Pr (A_{i j} = 1; α_{1}) {\bar{T E}}_{i} (α_{0}, α_{1}) - Pr (A_{i j} = 1; α_{0}) {\bar{D E}}_{i} (α_{0})$, where Pr (A_ij = 0; α) = 1 – Pr (A_ij = 1; α) and

Pr (A_{i j} = 1; α) = \sum_{s \in 𝒜 (n_{i} - 1)} π_{i} (s, a_{i j} = 1; α) = \begin{cases} K_{i} / n_{i} & \text{if } π_{i} (A_{i}; α) \text{ is of type A} \\ α & \text{if } π_{i} (A_{i}; α) \text{ is of type B} \end{cases}
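The relationships among the four effects can be checked numerically for a single toy group under two type B strategies. The sketch below is an illustration under stated assumptions: the potential outcome function `y` is hypothetical, α₀ = 0.5 and α₁ = 0.2 are arbitrary, and the overall-effect identity is written with last term −Pr(A_ij = 1; α₀) DE_i(α₀), which is the form obtained by expanding Ȳ_i(α) = Pr(A_ij = 0; α) Ȳ_i(0; α) + Pr(A_ij = 1; α) Ȳ_i(1; α).

```python
from itertools import product


def pi_b(a, alpha):
    """Type B pmf: independent Bernoulli(alpha) assignments."""
    p = 1.0
    for a_j in a:
        p *= alpha if a_j == 1 else 1.0 - alpha
    return p


def y(j, a):
    """Hypothetical potential outcome with nonlinear interference."""
    return (1.0 - 0.5 * a[j]) * 0.8 ** (sum(a) - a[j]) + 0.1 * j


def y_bar(a_val, alpha, n):
    """Group average Ybar_i(a; alpha) under a type B strategy."""
    total = 0.0
    for j in range(n):
        for s in product((0, 1), repeat=n - 1):
            full = s[:j] + (a_val,) + s[j:]
            total += pi_b(s, alpha) * y(j, full)
    return total / n


def y_bar_marginal(alpha, n):
    """Group average Ybar_i(alpha): average over the full allocation A_i."""
    total = 0.0
    for j in range(n):
        for a in product((0, 1), repeat=n):
            total += pi_b(a, alpha) * y(j, a)
    return total / n


def de(alpha, n):
    return y_bar(0, alpha, n) - y_bar(1, alpha, n)


def ie(alpha_a, alpha_b, n):
    return y_bar(0, alpha_a, n) - y_bar(0, alpha_b, n)


def te(alpha_a, alpha_b, n):
    return y_bar(0, alpha_a, n) - y_bar(1, alpha_b, n)


def oe(alpha_a, alpha_b, n):
    return y_bar_marginal(alpha_a, n) - y_bar_marginal(alpha_b, n)


n, a0, a1 = 3, 0.5, 0.2
print(te(a0, a1, n), de(a1, n) + ie(a0, a1, n))
print(oe(a0, a1, n), (1 - a1) * ie(a0, a1, n) + a1 * te(a0, a1, n) - a0 * de(a0, n))
```

Each printed pair should agree up to floating-point error, illustrating the total-effect decomposition and the overall-effect decomposition respectively.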

Under the assumption of no interference between individuals of a group, the individual indirect causal effect is equal to zero and therefore individual, group and population average causal total effects are equal to the average causal direct effects at the corresponding level. Recall that in the absence of interference, the counterfactual outcomes for individual j in group i can be written as {Y_ij (a) : a = 0, 1} and the individual and group average causal effects become Y_ij (1) – Y_ij (0) and $\sum_{j = 1}^{n_{i}} \{Y_{i j} (1) - Y_{i j} (0)\} / n_{i}$ respectively. Furthermore, the assumption of no interference implies that the various causal effects do not depend on the treatment assignment strategies α₀ and α₁, whereas in the presence of interference within groups, these effects do in general depend on the assignment strategies.

4 Inference in group randomized studies

4.1 Estimation

In this section, we consider the estimation of the following four key causal contrasts, the population average direct causal effect $\bar{D E} (α_{0})$ , the population average indirect causal effect $\bar{I E} (α_{0}, α_{1})$ , the population average total causal effect $\bar{T E} (α_{0}, α_{1})$ and the population average overall effect $\bar{O E} (α_{0}, α_{1})$ . Unbiased estimators of these parameters under a two-stage randomization scheme were proposed by Hudgens and Halloran (2008) under the following assumption:

Assumption 1

Let S ≡ (S₁, …, S_N) denote the first stage of randomization group assignments with S_i = 1 if group i is assigned to α₀ and zero if group i is assigned to α₁. Let η denote the parametrization for the distribution of S and let C = ∑_i S_i denote the number of groups assigned α₀. Then, {η, α₀, α₁} are assumed to be Type A parametrizations.

Suppose S_i = 1 and let ${\hat{Y}}_{i} (a; α_{0}) = \sum_{j = 1}^{n_{i}} I (A_{i j} = a) Y_{i j} (A_{i}) / \sum_{j = 1}^{n_{i}} I (A_{i j} = a)$ and ${\hat{Y}}_{i} (α_{0}) = \sum_{j = 1}^{n_{i}} Y_{i j} (A_{i}) / n_{i}$. Also define $\hat{Y} (a; α_{0}) = \sum_{i = 1}^{N} {\hat{Y}}_{i} (a; α_{0}) I (S_{i} = 1) / \sum_{i = 1}^{N} I (S_{i} = 1)$ and $\hat{Y} (α_{0}) = \sum_{i = 1}^{N} {\hat{Y}}_{i} (α_{0}) I (S_{i} = 1) / \sum_{i = 1}^{N} I (S_{i} = 1)$. The corresponding quantities under α₁ are defined analogously over the groups with S_i = 0. Hudgens and Halloran (2008) proposed the following estimators:

\hat{D E} (α_{0}) = \hat{Y} (0; α_{0}) - \hat{Y} (1; α_{0}),

(2)

\hat{I E} (α_{0}, α_{1}) = \hat{Y} (0; α_{0}) - \hat{Y} (0; α_{1}),

(3)

\hat{T E} (α_{0}, α_{1}) = \hat{Y} (0; α_{0}) - \hat{Y} (1; α_{1}),

(4)

\hat{O E} (α_{0}, α_{1}) = \hat{Y} (α_{0}) - \hat{Y} (α_{1}),

(5)

which they showed to be unbiased under Assumption 1, i.e.

E {\hat{D E} (α_{0})} = \bar{D E} (α_{0}),

E {\hat{I E} (α_{0}, α_{1})} = \bar{I E} (α_{0}, α_{1}),

E {\hat{T E} (α_{0}, α_{1})} = \bar{T E} (α_{0}, α_{1}),

E {\hat{O E} (α_{0}, α_{1})} = \bar{O E} (α_{0}, α_{1})

where the expectation is taken with respect to the joint density of (S,A₁, …, A_N).
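A minimal sketch of how these four estimators might be computed from observed data follows. The data layout (lists `S`, `A`, `Y` indexed by group) and the function names are assumptions made for illustration, and the numbers are invented. The total effect is computed as Ŷ(0; α₀) − Ŷ(1; α₁), matching the definition of the total causal effect in section 3.3.

```python
def group_mean(y_i, a_i, a=None):
    """Yhat_i(a; alpha) if a is 0 or 1; the overall group mean Yhat_i(alpha) if a is None."""
    vals = [y for y, t in zip(y_i, a_i) if a is None or t == a]
    return sum(vals) / len(vals)


def arm_mean(S, A, Y, s, a=None):
    """Average of the group-level means over groups with first-stage assignment S_i = s."""
    means = [group_mean(Y[i], A[i], a) for i in range(len(S)) if S[i] == s]
    return sum(means) / len(means)


def effect_estimates(S, A, Y):
    """Estimators of DE(alpha_0), IE, TE and OE; S_i = 1 codes alpha_0, S_i = 0 codes alpha_1."""
    return {
        "DE": arm_mean(S, A, Y, 1, 0) - arm_mean(S, A, Y, 1, 1),
        "IE": arm_mean(S, A, Y, 1, 0) - arm_mean(S, A, Y, 0, 0),
        "TE": arm_mean(S, A, Y, 1, 0) - arm_mean(S, A, Y, 0, 1),
        "OE": arm_mean(S, A, Y, 1) - arm_mean(S, A, Y, 0),
    }


# Invented data: two groups under alpha_0 (two of three treated),
# two groups under alpha_1 (one of three treated).
S = [1, 1, 0, 0]
A = [[1, 1, 0], [1, 0, 1], [0, 0, 1], [0, 1, 0]]
Y = [[0.2, 0.4, 0.9], [0.1, 0.8, 0.3], [0.6, 1.0, 0.2], [0.5, 0.3, 0.7]]
est = effect_estimates(S, A, Y)
print(est)
```

The sketch assumes every group contains at least one treated and one untreated individual, as is guaranteed under type A strategies with 0 < K_i < n_i.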

4.2 Variance Estimation

4.2.1 Variance Estimation under Stratified interference

Unbiased estimation of the variances of the various estimators of the previous section appears not to be generally available without additional assumptions regarding the underlying structure of interference. Hudgens and Halloran (2008) illustrate this difficulty by considering the estimation of Var (Ŷ_i (1; α₀)|S_i=1) under assumption 1 only. They note that the estimator Ŷ_i (1; α₀) is based on a single systematic random sample of fixed size K_i from the set of potential outcomes {Y_ij (a_i) : a_i ∈ 𝒜 (n_i, K_i), a_ij = 1}. By the non-existence of an unbiased estimator of the variance of the sample mean from a single systematic sample, this implies the non-existence of an unbiased estimator of Var (Ŷ_i (1; α₀)|S_i=1). However, as we show in the next lemma, the non-existence of an unbiased estimator of Var (Ŷ_i (1; α₀)|S_i=1) does not preclude the possibility of simple yet conservative estimation of the latter quantity, as an unbiased estimator of an upper bound for the variance is often a useful measure of uncertainty. The following lemma gives the result for a nonnegative outcome.

Lemma 1

Suppose that Y_ij (a_i) ≥ 0 for all a_i ∈ 𝒜(n_i, K_{0,i}) and for j = 1, …, n_i, and define

{\hat{Var}}_{u} ({\hat{Y}}_{i} (1; α_{0}) | S_{i} = 1) \equiv \frac{1 - π_{i} (A_{i}; α_{0})}{K_{0, i}^{2}} \left\{ \sum_{j = 1}^{n_{i}} A_{i j} Y_{i j}^{2} (A_{i}) + \sum_{j \neq j'} A_{i j} A_{i j'} Y_{i j} (A_{i}) Y_{i j'} (A_{i}) \right\},

then the following holds under Assumption 1:

E {{\hat{Var}}_{u} ({\hat{Y}}_{i} (1; α_{0}) | S_{i} = 1) | S_{i} = 1} \geq V a r ({\hat{Y}}_{i} (1; α_{0}) | S_{i} = 1)

The proof of this lemma is given in the appendix.
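For a small group, the inequality of Lemma 1 can be verified exactly by enumerating all allocations in 𝒜(n_i, K_{0,i}). The sketch below does this for a hypothetical nonnegative potential outcome function `y`; it illustrates the statement and is not a substitute for the proof. The function name `lemma1_check` and the values n_i = 4, K_{0,i} = 2 are assumptions.

```python
from itertools import combinations
from math import comb


def y(j, a):
    """Hypothetical nonnegative potential outcome with neighbour interference."""
    return (1 + j) * 0.9 ** sum(a) + 0.2 * a[(j + 1) % len(a)]


def lemma1_check(n, k):
    """Return (exact variance of Yhat_i(1), exact expectation of the bound Var_u)."""
    pi = 1.0 / comb(n, k)                      # type A pmf on A(n, k)
    means, bounds = [], []
    for treated in combinations(range(n), k):  # enumerate A(n, k)
        a = tuple(1 if j in treated else 0 for j in range(n))
        ys = [y(j, a) for j in treated]
        sq = sum(v * v for v in ys)
        cross = sum(ys[p] * ys[q] for p in range(k) for q in range(k) if p != q)
        means.append(sum(ys) / k)
        bounds.append((1.0 - pi) * (sq + cross) / k ** 2)
    e_mean = pi * sum(means)
    var = pi * sum((m - e_mean) ** 2 for m in means)
    e_bound = pi * sum(bounds)
    return var, e_bound


var, e_bound = lemma1_check(4, 2)
print(var, e_bound, e_bound >= var)
```

The bound is typically far from tight, which is consistent with its description as a conservative measure of uncertainty.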

In contrast with Lemma 1, Hudgens and Halloran (2008) consider variance estimators that rely on the following assumption of stratified interference.

Assumption 2

Stratified interference: For k = 1, …, n_i – 1, $Y_{i j} (a_{i}) = Y_{i j} (a_{i}^{'})$ for all $a_{i}, a_{i}^{'} \in 𝒜 (n_{i}, k)$ such that $a_{i j} = a_{i j}^{'}$.
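Whether a given potential outcome function satisfies stratified interference can be checked by enumeration in small groups. The following sketch uses two hypothetical outcome functions, one that depends on others only through the number treated and one that depends on which neighbour is treated; all names are assumptions made for illustration.

```python
from itertools import product


def satisfies_stratified_interference(y, n, tol=1e-12):
    """True if, for every j, Y_ij(a) depends on a only through (a_j, number treated)."""
    for j in range(n):
        seen = {}
        for a in product((0, 1), repeat=n):
            key = (a[j], sum(a))
            val = y(j, a)
            if key in seen and abs(seen[key] - val) > tol:
                return False
            seen.setdefault(key, val)
    return True


def y_count(j, a):
    """Depends on the others only through how many of them are treated."""
    return 1.0 - 0.4 * a[j] - 0.1 * (sum(a) - a[j])


def y_neighbour(j, a):
    """Depends on whether one specific neighbour is treated."""
    return 1.0 - 0.4 * a[j] - 0.1 * a[(j + 1) % len(a)]


print(satisfies_stratified_interference(y_count, 4))      # True
print(satisfies_stratified_interference(y_neighbour, 4))  # False
```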

Assumption 2 states that a_i ↦ Y_ij (a_i) is a function of a_i only through (a_ij, ∑_{j′≠j} a_ij′), that is, an individual’s counterfactual outcome depends only on his exposure level a_ij and on the total number of people exposed in his group. Let Y_ij (a_ij; α₀) ≡ Y_ij (a_ij, a_{i,−j}; α₀) for any a_{i,−j} ∈ 𝒜(n_i – 1, K_i – a_ij), a_ij = 0, 1; and let

{\hat{σ}}_{i a}^{2} (α) \equiv \sum_{j = 1}^{n_{i}} {[Y_{i j} (a; α) - {\hat{Y}}_{i} (a; α)]}^{2} 1 (A_{i j} = a) / (K_{i} - 1)

{\hat{σ}}_{g a}^{2} (α) \equiv \sum_{i = 1}^{N} {[{\hat{Y}}_{i} (a; α) - \hat{Y} (a; α)]}^{2} S_{i} / (C - 1)

where ${\hat{σ}}_{i a}^{2} (α)$ is the within-group sample variance and ${\hat{σ}}_{g a}^{2} (α)$ the between group sample variance for individuals with A_ij = a ∈ {0, 1}. Also, let

{\hat{σ}}_{D E}^{2} (α) \equiv \sum_{i = 1}^{N} {[{\hat{D E}}_{i} (α) - \hat{D E} (α)]}^{2} S_{i} / (C - 1)

{\hat{σ}}_{M}^{2} (α) \equiv \sum_{i = 1}^{N} {[{\hat{Y}}_{i} (α) - \hat{Y} (α)]}^{2} S_{i} / (C - 1)

and

\hat{V a r} {{\hat{D E}}_{i} (α) | S_{i} = 1} = \frac{{\hat{σ}}_{i 1}^{2} (α)}{K_{i}} + \frac{{\hat{σ}}_{i 0}^{2} (α)}{n_{i} - K_{i}}

and define

\hat{V a r} {\hat{D E} (α_{0}) | S_{i} = 1} \equiv (1 - \frac{C}{N}) {\hat{σ}}_{D E}^{2} (α_{0}) + \frac{1}{C N} \sum_{i = 1}^{N} \hat{V a r} {{\hat{D E}}_{i} (α_{0}) | S_{i} = 1} S_{i}

(6)

\hat{V a r} {\hat{I E} (α_{0}, α_{1})} \equiv \frac{{\hat{σ}}_{g 0}^{2} (α_{0})}{N - C} + \frac{{\hat{σ}}_{g 0}^{2} (α_{1})}{C}

(7)

\hat{V a r} {\hat{T E} (α_{0}, α_{1})} \equiv \frac{{\hat{σ}}_{g 0}^{2} (α_{0})}{N - C} + \frac{{\hat{σ}}_{g 1}^{2} (α_{1})}{C}

(8)

\hat{V a r} {\hat{O E} (α_{0}, α_{1})} \equiv \frac{{\hat{σ}}_{M}^{2} (α_{0})}{N - C} + \frac{{\hat{σ}}_{M}^{2} (α_{1})}{C}

(9)

Hudgens and Halloran (2008) proved that under assumptions 1 and 2:

E [\hat{V a r} {\hat{D E} (α_{0}) | S_{i} = 1}] \geq V a r {\hat{D E} (α_{0}) | S_{i} = 1}

(10)

E [\hat{V a r} {\hat{I E} (α_{0}, α_{1})}] \geq V a r {\hat{I E} (α_{0}, α_{1})}

(11)

E [\hat{V a r} {\hat{T E} (α_{0}, α_{1})}] \geq V a r {\hat{T E} (α_{0}, α_{1})}

(12)

E [\hat{V a r} {\hat{O E} (α_{0}, α_{1})}] \geq V a r {\hat{O E} (α_{0}, α_{1})}

(13)

That is, the variance estimators (6)–(9) are generally conservative. However, as they show, equality holds in equation (10) if and only if

Y_{i j} (1; α_{0}) = Y_{i j} (0; α_{0}) + ψ_{D, i}

(14)

for a fixed constant ψ_{D,i}, for j = 1, …, n_i and i = 1, …, N, which is equivalent to an additive individual direct causal effect across all groups. Note that when Y_ij (A_i) is binary, and 0 < |DE (α₀)| < 1, then the hypothesis of additive direct treatment effects cannot hold as the only values of DE (α₀) consistent with additivity are 0, 1 and −1. Hudgens and Halloran (2008) also establish analogous conditions under which equality holds for each of the other equations (11)–(13).

Despite the availability under assumptions 1 and 2, of reasonable variance estimators given by equations (6) – (9) for the various estimators of causal effects proposed by Hudgens and Halloran (2008), a formal framework for statistical inference on population average causal effects is currently lacking. As a remedy, in the following section, we develop a finite sample framework for making causal inferences in the context of interference.

4.3 Finite sample inference for a binary outcome

We construct novel finite sample confidence intervals for the four population average causal effects of interest. To simplify the exposition, we mainly focus on the case of a binary outcome. To the best of our knowledge there currently exists no method, whether finite or large sample-based, to construct a confidence interval for any of the causal parameters of current interest. In a technical report, we show that $\hat{D E} (α_{0}) - D E (α_{0})$ admits an alternative representation as a martingale, an observation which enables us to use a Hoeffding-type exponential inequality to obtain the desired finite sample confidence interval. We prove the following results.

Theorem 1

For any level γ ∈ (0, 1), the interval

C_{D E} (γ, q, N) \equiv (\hat{D E} (α_{0}) - ε_{D E}^{*} (q, γ, N), \hat{D E} (α_{0}) + ε_{D E}^{*} (q, γ, N))

is a finite sample (1 – γ) CI of DE (α₀) under assumption 1, where

ε_{D E}^{*} (q, γ, N) = \sqrt{\frac{[4 {(\frac{1}{q} - 1)}^{2} + \frac{\sum_{i = 1}^{N} {(\frac{L_{D E, i}}{q})}^{2}}{N}]}{2 N} ln (\frac{2}{γ})}

(15)

$q \equiv Pr (S_{i} = 1) = \frac{C}{N}$ and for i = 1, …, N

L_{D E, i} (α_{0}) \equiv 2 (1 - \frac{1}{(\begin{matrix} n_{i} \\ K_{0, i} \end{matrix})}) .
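The half-width in equation (15) is straightforward to evaluate. The sketch below is an illustration only; the function names `l_de` and `eps_de` and the inputs (q = 1/2, γ = 0.05, groups of size 50 with 25 treated under α₀) are assumptions chosen so that 1/\binom{n_i}{K_{0,i}} is negligible.

```python
from math import comb, log, sqrt


def l_de(n_i, k0_i):
    """L_{DE,i}(alpha_0) = 2 * (1 - 1 / binom(n_i, K_{0,i}))."""
    return 2.0 * (1.0 - 1.0 / comb(n_i, k0_i))


def eps_de(q, gamma, sizes, k0):
    """Half-width of the Theorem 1 interval, following equation (15)."""
    n_groups = len(sizes)
    mean_sq = sum((l_de(n, k) / q) ** 2 for n, k in zip(sizes, k0)) / n_groups
    bracket = 4.0 * (1.0 / q - 1.0) ** 2 + mean_sq
    return sqrt(bracket / (2.0 * n_groups) * log(2.0 / gamma))


for n_groups in (9, 10, 100, 1000):
    eps = eps_de(0.5, 0.05, [50] * n_groups, [25] * n_groups)
    # A half-width of 2 or more means the interval necessarily covers [-1, 1].
    print(n_groups, round(eps, 3), "noninformative" if eps >= 2 else "informative")
```

With these inputs the half-width behaves like roughly 6/√N, so a fairly large number of groups is needed before the interval becomes narrow.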

According to the theorem, for each value of (q, N, γ), the coverage probability Pr{DE(α₀) ∈ C_DE (γ, q, N)} is guaranteed under assumption 1 to be no smaller than 1 – γ (for example, 95% when γ = 0.05), with the length of C_DE (γ, q, N) proportional to $\frac{1}{N^{1 / 2}}$, so that for a fixed value of (γ, q), C_DE (γ, q, N) becomes increasingly precise as the number of groups in the study grows. However, we note that C_DE (γ, q, N) may not be particularly useful when N is small, for those values of (γ, q) such that $ε_{D E}^{*} (q, γ, N) \geq 2$. This is because in such a case, the corresponding confidence interval is noninformative, as it contains the entire range of possible values of $\bar{D E} (α_{0})$, since [−1, 1] ⊆ C_DE (γ, q, N) and $| \bar{D E} (α_{0}) | \leq 1$. To further illustrate this point, suppose that $1 / \binom{n_{i}}{K_{0, i}} \approx 0$, q = 1/2 and γ = 0.05; then $ε_{D E}^{*} (1 / 2, γ, N) \approx \frac{6}{\sqrt{N}}$. This implies that C_DE (γ, q, N) is guaranteed to be noninformative for values of N ≤ 9. As made evident in the proof of the theorem, the term $4 {(\frac{1}{q} - 1)}^{2}$ in equation (15) is an upper bound for the squared absolute deviation of the conditional average direct effect $E \{\hat{D E} (α_{0}) | S\} = \frac{1}{C} \sum_{i : S_{i} = 1} \{{\bar{Y}}_{i} (0; α_{0}) - {\bar{Y}}_{i} (1; α_{0})\}$ from the population average direct effect Ȳ(0; α₀) – Ȳ(1; α₀). This bound increases as q decreases towards zero, a situation which can arise in a study where the proportion of groups randomized to the treatment allocation α₀ is very small, and can happen even when C and N are both relatively large. This will invariably result in an increase in uncertainty in our inferences on $\bar{D E} (α_{0})$. However, we note that more accurate inferences may still be possible for the population conditional average causal direct effect which we define as

{\bar{D E}}_{c} (α_{0}) \equiv \frac{1}{C} \sum_{i : S_{i} = 1} \{{\bar{Y}}_{i} (0; α_{0}) - {\bar{Y}}_{i} (1; α_{0})\}

and which corresponds to the average causal direct effect for the population of groups actually randomized to α₀. The next theorem provides a finite sample confidence interval for ${\bar{D E}}_{c} (α_{0})$ .

Theorem 2

For any level γ ∈ (0,1), the interval

C_{D E_{c}} (γ, q, N) \equiv (\hat{D E} (α_{0}) - ε_{D E_{c}}^{*} (q, γ, N), \hat{D E} (α_{0}) + ε_{D E_{c}}^{*} (q, γ, N))

is a finite sample (1 – γ) CI of ${\bar{D E}}_{c} (α_{0})$ under assumption 1, where

ε_{D E_{c}}^{*} (q, γ, N) = \frac{1}{\sqrt{C}} \sqrt{\frac{\sum_{i : S_{i} = 1} L_{D E, i}^{2} (α_{0})}{2 C} ln (\frac{2}{γ})}

Note that both C_DE (γ, q, N) and C_{DE_c} (γ, q, N) are centered around the same estimator $\hat{D E} (α_{0})$, which is unbiased for $\bar{D E} (α_{0})$ and is conditionally unbiased for ${\bar{D E}}_{c} (α_{0})$. However, the length of the second confidence interval no longer includes the term $4 {(\frac{1}{q} - 1)}^{2}$ and thus will often be substantially shorter.
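The difference in length between the two intervals can be illustrated numerically. The sketch below assumes that the half-width of Theorem 2 is read as (1/√C) √{[Σ_{i:S_i=1} L²_{DE,i} / (2C)] ln(2/γ)}, with the sum divided by 2C; the function names and the inputs (N = 100 groups of size 50, C = 50, γ = 0.05) are likewise assumptions.

```python
from math import comb, log, sqrt


def l_de(n_i, k0_i):
    """L_{DE,i}(alpha_0) = 2 * (1 - 1 / binom(n_i, K_{0,i}))."""
    return 2.0 * (1.0 - 1.0 / comb(n_i, k0_i))


def eps_de(q, gamma, sizes, k0):
    """Half-width for the population average direct effect (Theorem 1)."""
    n_groups = len(sizes)
    mean_sq = sum((l_de(n, k) / q) ** 2 for n, k in zip(sizes, k0)) / n_groups
    bracket = 4.0 * (1.0 / q - 1.0) ** 2 + mean_sq
    return sqrt(bracket / (2.0 * n_groups) * log(2.0 / gamma))


def eps_de_c(gamma, sizes_alpha0, k0_alpha0):
    """Half-width for the conditional average direct effect (Theorem 2).

    Only the C groups actually randomized to alpha_0 enter the sum.
    """
    c = len(sizes_alpha0)
    sum_sq = sum(l_de(n, k) ** 2 for n, k in zip(sizes_alpha0, k0_alpha0))
    return sqrt(sum_sq / (2.0 * c) * log(2.0 / gamma)) / sqrt(c)


n_groups, c = 100, 50
half_width_pop = eps_de(c / n_groups, 0.05, [50] * n_groups, [25] * n_groups)
half_width_cond = eps_de_c(0.05, [50] * c, [25] * c)
print(round(half_width_pop, 3), round(half_width_cond, 3))
```

Under these inputs the conditional interval is noticeably shorter, in line with the remark that it omits the term involving 1/q – 1.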

The following theorem provides a finite sample confidence interval for the population average indirect causal effect.

Theorem 3

For any level γ ∈ (0, 1), the interval

C_{I E} (γ) \equiv (\hat{I E} (α_{0}, α_{1}) - ε_{I E}^{*} (q, γ, N), \hat{I E} (α_{0}, α_{1}) + ε_{I E}^{*} (q, γ, N))

is a finite sample (1 – γ) CI of $\bar{I E} (α_{0}, α_{1})$ under assumption 1, where

ε_{I E}^{*} (q, γ, N) = \sqrt{\frac{[max \{\frac{1}{q^{2}}, \frac{1}{{(1 - q)}^{2}}\} + \sum_{i} L_{I E, i}^{2} (q) / N]}{2 N} ln (\frac{2}{γ})}

and

L_{I E, i} (q) = max \{\frac{1}{q} (1 - 1 / \binom{n_{i}}{K_{0, i}}), \frac{1}{1 - q} (1 - 1 / \binom{n_{i}}{K_{1, i}})\}

The next two theorems give finite sample confidence intervals for the population average total causal effect and for the population average overall causal effect respectively.

Theorem 4

For any level γ ∈ (0, 1), the interval

C_{T E} (γ) \equiv (\hat{T E} (α_{0}, α_{1}) - ε_{T E}^{*} (q, γ, N), \hat{T E} (α_{0}, α_{1}) + ε_{T E}^{*} (q, γ, N))

is a finite sample (1 – γ) CI of $\bar{T E} (α_{0}, α_{1})$ under assumption 1, where

ε_{T E}^{*} (q, γ, N) = \sqrt{\frac{[max \{\frac{1}{q^{2}}, \frac{1}{{(1 - q)}^{2}}\} + \sum_{i} L_{T E, i}^{2} (q) / N]}{2 N} ln (\frac{2}{γ})}

and

L_{T E, i} (q) = max \{\frac{1}{q} (1 - 1 / \binom{n_{i}}{K_{0, i}}), \frac{1}{1 - q} (1 - 1 / \binom{n_{i}}{K_{1, i}})\}

Theorem 5

For any level γ ∈ (0, 1), the interval

C_{O E} (γ, q, N) \equiv (\hat{O E} (α_{0}, α_{1}) - ε_{O E}^{*} (q, γ, N), \hat{O E} (α_{0}, α_{1}) + ε_{O E}^{*} (q, γ, N))

is a finite sample (1 – γ) CI of $\bar{O E} (α_{0}, α_{1})$ under assumption 1, where

ε_{O E}^{*} (q, γ, N) = \sqrt{\frac{[max \{\frac{1}{q^{2}}, \frac{1}{{(1 - q)}^{2}}\} + \sum_{i} L_{O E, i}^{2} (q) / N]}{2 N} ln (\frac{2}{γ})}

and

L_{O E, i} (q) = max \{\frac{1}{q} (1 - 1 / \binom{n_{i}}{K_{0, i}}), \frac{1}{1 - q} (1 - 1 / \binom{n_{i}}{K_{1, i}})\}

Note that $ε_{I E}^{*} (q, γ, N) = ε_{T E}^{*} (q, γ, N) = ε_{O E}^{*} (q, γ, N)$, so the corresponding confidence intervals have identical length. Future work could reduce the length of these confidence intervals by sharpening the exponential inequalities used in their derivation (van der Vaart and Wellner, 1996), by deriving potentially sharper alternative exponential inequalities, or by leveraging additional assumptions such as that of stratified interference. In future work, we also plan to consider inference for continuous and possibly unbounded outcomes. The technical developments necessary to achieve these results are beyond the scope of the current paper and will be addressed elsewhere.
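Because the three half-widths coincide, a single function suffices to evaluate them. The sketch below is illustrative; the names `l_ie` and `eps_ie` and the inputs (groups of size 6 with four treated under α₀ and two under α₁, q = 1/2, γ = 0.05) are assumptions.

```python
from math import comb, log, sqrt


def l_ie(q, n_i, k0_i, k1_i):
    """L_{IE,i}(q) = max{(1 - 1/binom(n_i, K_0i)) / q, (1 - 1/binom(n_i, K_1i)) / (1 - q)}."""
    return max((1.0 - 1.0 / comb(n_i, k0_i)) / q,
               (1.0 - 1.0 / comb(n_i, k1_i)) / (1.0 - q))


def eps_ie(q, gamma, sizes, k0, k1):
    """Common half-width of the intervals in Theorems 3, 4 and 5."""
    n_groups = len(sizes)
    mean_sq = sum(l_ie(q, n, a, b) ** 2 for n, a, b in zip(sizes, k0, k1)) / n_groups
    bracket = max(1.0 / q ** 2, 1.0 / (1.0 - q) ** 2) + mean_sq
    return sqrt(bracket / (2.0 * n_groups) * log(2.0 / gamma))


for n_groups in (25, 100, 400):
    eps = eps_ie(0.5, 0.05, [6] * n_groups, [4] * n_groups, [2] * n_groups)
    print(n_groups, round(eps, 3))
```

Quadrupling the number of groups halves the half-width, reflecting the 1/√N rate.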

5 Towards Inference in observational studies

In this section, we briefly consider an approach for drawing causal inferences from observational data in the presence of interference. We begin by noting that in the absence of (two-stage) randomization, the estimators of Section 4 are no longer valid in an observational study. This is because Assumption 1 is in general no longer tenable in the non-experimental setting of an observational study; therefore, a different approach is needed. To make progress, we consider the following assumption:

Assumption 3

For i = 1, …, N, we assume that conditional on L_i, the treatment allocation A_i is independent of the counterfactual variables Y_i(·), that is:

Pr {A_{i} = a_{i} | L_{i}, Y_{i} (\cdot)} = f_{A | L, i} (a_{i} | L_{i})

(16)

where f_{A|L,i} (a_i|L_i) ≡ Pr {A_i = a_i|L_i}.

This assumption is a group-level generalization of the standard conditional randomization assumption routinely made at the individual-level in the analysis of observational studies. It states that the treatment allocation program A_i is randomly assigned to individuals in group i conditional on the vector of covariates L_i observed on these individuals. Whereas in the previous section, the outcome was assumed to be binary, hereafter, no such assumption is needed. In addition to Assumption 3, we suppose that the following positivity assumption holds:

Assumption 4

For i = 1, …, N we assume that conditional on L_i, we have that for all a_i ∈ 𝒜 (n_i)

Pr {A_{i} = a_{i} | L_{i}} > 0

(17)

Assumption 4 is a group-level version of the positivity assumption routinely made at the individual level in the analysis of observational studies. In the appendix, we show that the following theorem holds:

Theorem 6

Suppose that f_A|Li (·|L_i) satisfies Assumptions 3 and 4, and that α₀ is the parametrization of a Bernoulli individual group assignment strategy (i.e. a type B parametrization) which satisfies Assumption 4. Let

{\hat{Y}}_{i}^{i p w} (a; α_{0}) \equiv \frac{\sum_{j = 1}^{n_{i}} π_{i} (A_{i, - j}; α_{0}) 1 (A_{i j} = a) Y_{i j} (A_{i})}{n_{i} \times f_{A | L, i} (A_{i} | L_{i})}

and

{\hat{Y}}_{i}^{i p w} (α_{0}) \equiv \frac{\sum_{j = 1}^{n_{i}} π_{i} (A_{i}; α_{0}) Y_{i j} (A_{i})}{n_{i} \times f_{A | L, i} (A_{i} | L_{i})}

Then

E {{\hat{Y}}_{i}^{i p w} (a; α_{0})} = {\bar{Y}}_{i} (a; α_{0}) = \frac{1}{n_{i}} \sum_{j = 1}^{n_{i}} \sum_{s \in 𝒜 (n_{i} - 1)} Y_{i j} (a_{i, - j} = s, a_{i j} = a) \prod_{j' = 1, j' \neq j}^{n_{i}} α_{0}^{s_{i j'}} {(1 - α_{0})}^{1 - s_{i j'}}

and

E {{\hat{Y}}_{i}^{i p w} (α_{0})} = {\bar{Y}}_{i} (α_{0}) = \frac{1}{n_{i}} \sum_{j = 1}^{n_{i}} \sum_{s \in 𝒜 (n_{i})} Y_{i j} (a_{i} = s) \prod_{j' = 1}^{n_{i}} α_{0}^{s_{i j'}} {(1 - α_{0})}^{1 - s_{i j'}}
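The unbiasedness claim of Theorem 6 can be checked numerically for a single small group by enumerating every allocation. The following is a minimal sketch under stated assumptions, not the authors' implementation: the group size, the potential outcomes, the allocation mechanism `f` and the value of `alpha0` are all hypothetical, and π_i(·; α₀) is taken to be the product of independent Bernoulli(α₀) probabilities, as in the displayed formulas.

```python
from itertools import product
import random

def bernoulli_weight(alloc, alpha):
    """Probability of `alloc` under independent Bernoulli(alpha) assignment."""
    w = 1.0
    for s in alloc:
        w *= alpha if s == 1 else (1.0 - alpha)
    return w

def ipw_group_estimate(a, alloc, outcomes, propensity, alpha0):
    """IPW estimate for one group: weighted sum over units with A_ij = a."""
    n = len(alloc)
    total = 0.0
    for j in range(n):
        if alloc[j] == a:
            others = alloc[:j] + alloc[j + 1:]  # A_{i,-j}
            total += bernoulli_weight(others, alpha0) * outcomes[j]
    return total / (n * propensity)

def target_mean(a, potential, n, alpha0):
    """Counterfactual average: unit j set to a, others drawn Bernoulli(alpha0)."""
    total = 0.0
    for j in range(n):
        for s in product((0, 1), repeat=n - 1):
            full = s[:j] + (a,) + s[j:]  # insert a at position j
            total += potential[full][j] * bernoulli_weight(s, alpha0)
    return total / n

# Hypothetical group of n = 3 units with fixed potential outcomes Y_ij(a_i).
random.seed(0)
n = 3
allocations = list(product((0, 1), repeat=n))
potential = {s: [random.random() for _ in range(n)] for s in allocations}

# Hypothetical known allocation mechanism f(s | L_i): any strictly positive pmf.
raw = [random.uniform(0.5, 2.0) for _ in allocations]
f = {s: r / sum(raw) for s, r in zip(allocations, raw)}

alpha0 = 0.3
for a in (0, 1):
    # Exact expectation of the estimator over the allocation mechanism f.
    expectation = sum(
        f[s] * ipw_group_estimate(a, s, potential[s], f[s], alpha0)
        for s in allocations
    )
    print(a, expectation, target_mean(a, potential, n, alpha0))
```

For each value of a, the two printed numbers should agree up to floating point error, whatever strictly positive mechanism `f` is chosen; this mirrors the cancellation of f in the proof.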

According to this theorem, if the allocation probability mechanism f_A|L(·|L_i) is known, the population counterfactual averages Ȳ_i (a; α₀) and Ȳ_i (α₀) are identified from the observed data, and ${\hat{Y}}_{i}^{i p w} (a; α_{0})$ and ${\hat{Y}}_{i}^{i p w} (α_{0})$ are unbiased estimators of Ȳ_i (a; α₀) and Ȳ_i (α₀), respectively. The theorem also immediately gives the following result. Let

{\hat{D E}}^{i p w} (α_{0}) = {\hat{Y}}^{i p w} (0; α_{0}) - {\hat{Y}}^{i p w} (1; α_{0}),

{\hat{I E}}^{i p w} (α_{0}, α_{1}) = {\hat{Y}}^{i p w} (0; α_{0}) - {\hat{Y}}^{i p w} (0; α_{1}),

{\hat{T E}}^{i p w} (α_{0}, α_{1}) = {\hat{Y}}^{i p w} (0; α_{0}) - {\hat{Y}}^{i p w} (1; α_{1}),

{\hat{O E}}^{i p w} (α_{0}, α_{1}) = {\hat{Y}}^{i p w} (α_{0}) - {\hat{Y}}^{i p w} (α_{1}),

where ${\hat{Y}}^{i p w} (a; α_{0}) = \sum_{i = 1}^{N} {\hat{Y}}_{i}^{i p w} (a; α_{0}) / N$ and ${\hat{Y}}^{i p w} (α_{0}) = \sum_{i = 1}^{N} {\hat{Y}}_{i}^{i p w} (α_{0}) / N$. Then,

E {{\hat{D E}}^{i p w} (α_{0})} = \bar{D E} (α_{0}),

E {{\hat{I E}}^{i p w} (α_{0}, α_{1})} = \bar{I E} (α_{0}, α_{1}),

E {{\hat{T E}}^{i p w} (α_{0}, α_{1})} = \bar{T E} (α_{0}, α_{1}),

E {{\hat{O E}}^{i p w} (α_{0}, α_{1})} = \bar{O E} (α_{0}, α_{1})
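The four contrasts can be assembled from the group-level estimators as in the sketch below. It is illustrative only: the three groups, their outcomes and the "known" allocation probabilities stored under the key `f` are hypothetical, the function names are not from the paper, and the total effect is taken to be the contrast Ȳ(0; α₀) − Ȳ(1; α₁), following Hudgens and Halloran (2008).

```python
def bernoulli_weight(alloc, alpha):
    """Probability of `alloc` under independent Bernoulli(alpha) assignment."""
    w = 1.0
    for s in alloc:
        w *= alpha if s == 1 else (1.0 - alpha)
    return w

def group_ipw(alloc, outcomes, propensity, alpha, a=None):
    """Group-level IPW estimate; a=None gives the marginal version."""
    n = len(alloc)
    total = 0.0
    for j in range(n):
        if a is None:
            total += bernoulli_weight(alloc, alpha) * outcomes[j]
        elif alloc[j] == a:
            others = alloc[:j] + alloc[j + 1:]
            total += bernoulli_weight(others, alpha) * outcomes[j]
    return total / (n * propensity)

def population_ipw(groups, alpha, a=None):
    """Average of the group-level estimates over the N groups."""
    return sum(
        group_ipw(g["alloc"], g["y"], g["f"], alpha, a) for g in groups
    ) / len(groups)

def effect_estimates(groups, alpha0, alpha1):
    """Direct, indirect, total and overall effect contrasts."""
    return {
        "DE": population_ipw(groups, alpha0, 0) - population_ipw(groups, alpha0, 1),
        "IE": population_ipw(groups, alpha0, 0) - population_ipw(groups, alpha1, 0),
        "TE": population_ipw(groups, alpha0, 0) - population_ipw(groups, alpha1, 1),
        "OE": population_ipw(groups, alpha0) - population_ipw(groups, alpha1),
    }

# Hypothetical data: observed allocation, observed outcomes, and the
# (assumed known) probability f of the observed allocation given L_i.
groups = [
    {"alloc": (1, 0, 1), "y": (0.2, 0.9, 0.4), "f": 0.10},
    {"alloc": (0, 0, 1), "y": (0.7, 0.5, 0.1), "f": 0.20},
    {"alloc": (1, 1, 0), "y": (0.3, 0.6, 0.8), "f": 0.15},
]
est = effect_estimates(groups, alpha0=0.3, alpha1=0.6)
print(est)
```

With these definitions the estimated total effect equals the estimated indirect effect plus the estimated direct effect evaluated at α₁, which is a purely algebraic consequence of the contrasts and gives a convenient consistency check.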

Unfortunately, ${\hat{D E}}^{i p w} (α_{0})$, ${\hat{I E}}^{i p w} (α_{0}, α_{1})$, ${\hat{T E}}^{i p w} (α_{0}, α_{1})$ and ${\hat{O E}}^{i p w} (α_{0}, α_{1})$ are not feasible in practice since, as is usually the case in observational studies, f_A|L(·|L_i) is unknown to the analyst. To proceed, we must estimate this unknown treatment allocation mechanism from the observed data. Because L_i will typically include a large vector of covariates, nonparametric estimation of f_A|L(A_i|L_i) is not a viable option, and parametric or semi-parametric models must be adopted in practice. Next, we provide a brief and informal description to illustrate what a parametric approach entails in practice, in the particularly favorable setting where the number of groups N is reasonably large. In such a setting, we propose to estimate a parsimonious model f_A|Li (A_i|L_i;ψ) = f_A|L(A_i|L_i;ψ), i = 1, …, N, with unknown parameter ψ = (ψ_a, ψ_b), where f_A|L(A_i|L_i; ψ) is assumed to be a mixed model of the form

f_{A | L} (A_{i} | L_{i}; ψ) \equiv \int \prod_{j = 1}^{n_{i}} h_{A | L} (A_{i j} | L_{i j}, b_{i}; ψ_{a}) f_{b} (b_{i} | V_{i}; ψ_{b}) d b_{i}

with h_A|L (1|L_ij, b_i; ψ_a) given by, say, the logistic regression model $logit \{h_{A | L} (1 | L_{i j}, b_{i}; ψ_{a})\} = b_{i} + ψ_{a}^{'} L_{i j}$, and b_i a random effect known to follow a parametric density f_b (b_i|V_i;ψ_b) indexed by an unknown parameter ψ_b. The standard logistic-normal mixed model corresponds to the choice of f_b (b_i|V_i;ψ_b) univariate normal with mean ψ_b,1 and variance ψ_b,2. Estimation of ψ_a and ψ_b = (ψ_b,1, ψ_b,2) is obtained by maximizing

\sum_{i = 1}^{N} log {f_{A | L} (A_{i} | L_{i}; ψ_{a}, ψ_{b})}

(18)

with respect to ψ to give ψ̂. The mixed model paradigm is particularly appealing in the current setting, as it provides a flexible framework to account for a possible non-null conditional association between A_ij and A_ij′ given L_i, for j ≠ j′. Furthermore, under the assumption that A_i and A_i′ are independent given L_i and L_i′ for i ≠ i′, ψ̂ is a maximum likelihood estimator, and thus, under standard regularity conditions, it is $\sqrt{N}$-consistent. However, note that the mixed model is agnostic to a possible non-null conditional association between A_ij and A_i′j′ for i ≠ i′. Such a non-null association between the exposure levels of individuals belonging to different groups may arise, say, due to the spatial proximity of the two groups, even in the absence of between-group interference. In such a case, ψ̂ is no longer the MLE, but it will remain consistent as the number of groups grows to infinity, provided that the non-null association of exposure levels between groups is not too pervasive. Specifically, this will hold provided that the treatment allocation program of a given group is dependent on that of only a fixed number of other groups, as determined, say, by spatial proximity. Feasible estimators of the various causal effects are then obtained by substituting $f_{A | L} (A_{i} | L_{i}; \hat{ψ})$ for $f_{A | L, i} (a_{i} | L_{i})$ in ${\hat{Y}}_{i}^{i p w} (a; α_{0})$ and ${\hat{Y}}_{i}^{i p w} (α_{0})$. Alternatively, one may use the more stable estimators

{\hat{Y}}^{i p w} (a; α_{0}, \hat{ψ}) = \frac{\sum_{i = 1}^{N} [\sum_{j = 1}^{n_{i}} π_{i} (A_{i, - j}; α_{0}) 1 (A_{i j} = a) Y_{i j} (A_{i}) / {n_{i} \times f_{A | L, i} (A_{i} | L_{i}; \hat{ψ})}]}{\sum_{i = 1}^{N} [\sum_{j = 1}^{n_{i}} π_{i} (A_{i, - j}; α_{0}) 1 (A_{i j} = a) / {n_{i} \times f_{A | L, i} (A_{i} | L_{i}; \hat{ψ})}]}

{\hat{Y}}^{i p w} (α_{0}, \hat{ψ}) = \frac{\sum_{i = 1}^{N} [\sum_{j = 1}^{n_{i}} π_{i} (A_{i}; α_{0}) Y_{i j} (A_{i}) / {n_{i} \times f_{A | L, i} (A_{i} | L_{i}; \hat{ψ})}]}{\sum_{i = 1}^{N} [\sum_{j = 1}^{n_{i}} π_{i} (A_{i}; α_{0}) / {n_{i} \times f_{A | L, i} (A_{i} | L_{i}; \hat{ψ})}]}

A large sample estimator of the variances of the estimates of the various causal effects can be obtained under standard regularity assumptions using well known Taylor series arguments that we do not reproduce here. The finite sample behavior of these various estimators will be examined in a simulation study we plan to report elsewhere.
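To make the feasible, stabilized estimator concrete, the sketch below evaluates a logistic-normal group propensity by numerical integration over the random effect and plugs it into the ratio form of the estimator. Several choices are assumptions made for illustration rather than details taken from the text: the random-effect density does not depend on V_i, the integral is approximated with a trapezoidal rule on a grid of ±8 standard deviations, the parameter values stand in for a fitted ψ̂ (the maximization of (18) is not shown), and all data and function names are hypothetical.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def bernoulli_weight(alloc, alpha):
    """Probability of `alloc` under independent Bernoulli(alpha) assignment."""
    w = 1.0
    for s in alloc:
        w *= alpha if s == 1 else (1.0 - alpha)
    return w

def group_propensity(alloc, covs, psi_a, mu_b, sigma_b, grid=2001, width=8.0):
    """Approximate the mixed-model probability of the whole group allocation.

    Integrates prod_j h(A_ij | L_ij, b) against a Normal(mu_b, sigma_b^2)
    density for b using the trapezoidal rule.
    """
    lo = mu_b - width * sigma_b
    step = 2.0 * width * sigma_b / (grid - 1)
    total = 0.0
    for k in range(grid):
        b = lo + k * step
        z = (b - mu_b) / sigma_b
        dens = math.exp(-0.5 * z * z) / (sigma_b * math.sqrt(2.0 * math.pi))
        lik = 1.0
        for a, l in zip(alloc, covs):
            p = expit(b + sum(c * x for c, x in zip(psi_a, l)))
            lik *= p if a == 1 else (1.0 - p)
        end_weight = 0.5 if k in (0, grid - 1) else 1.0
        total += end_weight * lik * dens * step
    return total

def stabilized_ipw(groups, a, alpha0, psi_a, mu_b, sigma_b):
    """Ratio (stabilized) IPW estimate; fails if no unit has A_ij = a."""
    num = den = 0.0
    for g in groups:
        n = len(g["alloc"])
        f_hat = group_propensity(g["alloc"], g["covs"], psi_a, mu_b, sigma_b)
        for j in range(n):
            if g["alloc"][j] == a:
                others = g["alloc"][:j] + g["alloc"][j + 1:]
                w = bernoulli_weight(others, alpha0) / (n * f_hat)
                num += w * g["y"][j]
                den += w
    return num / den

# Hypothetical plug-in values standing in for the fitted parameters.
psi_a_hat, mu_b_hat, sigma_b_hat = [0.5], -0.2, 1.0
groups = [
    {"alloc": (1, 0, 1), "covs": [(0.1,), (1.2,), (-0.4,)], "y": (0.2, 0.9, 0.4)},
    {"alloc": (0, 0, 1), "covs": [(0.8,), (-1.0,), (0.3,)], "y": (0.7, 0.5, 0.1)},
    {"alloc": (1, 1, 0), "covs": [(0.0,), (0.5,), (1.5,)], "y": (0.3, 0.6, 0.8)},
]
print(stabilized_ipw(groups, 1, 0.3, psi_a_hat, mu_b_hat, sigma_b_hat))
```

Because the stabilized estimator is a weighted average with positive weights, its value always lies between the smallest and largest observed outcome among units with A_ij = a, which is one reason it tends to be less variable than the unnormalized version.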

Thus far we have assumed that Y_i(·) is fixed; we will now briefly consider a setting in which Y_i(·) is considered random. Hong and Raudenbush (2006) assume Stratified interference (Assumption 2) and assume that Y_ij (a_i) depends on a_i,−j only through some known scalar function v(a_i,−j), so that Y_ij (a_i) can be written as Y_ij (a_ij, v(a_i,−j)). Suppose now that, for all i, j, A_ij is determined by simple randomization; then Assumption 3 will hold and it will also be the case that

E [Y_{i j} (a_{i j}, v) | A_{i j}, V (a_{i, - j})] = E [Y_{i j} (a_{i j}, v)] .

(19)

Hong and Raudenbush (2006) consider a variation on this assumption in the context of observational data. Specifically, they assume that

E [Y_{i j} (a_{i j}, v) | A_{i j}, V (a_{i, - j}), L_{i j}] = E [Y_{i j} (a_{i j}, v) | L_{i j}]

(20)

and from this it follows that

E [Y_{i j} (a, v) | L_{i j} = l_{i j}] = E [Y_{i j} | A_{i j} = a, V (a_{i, - j}) = v, L_{i j} = l_{i j}]

and from this one could obtain conditional direct, indirect and total effects, namely,

E [Y_{i j} (a, v) | L_{i j} = l_{i j}] - E [Y_{i j} (a', v) | L_{i j} = l_{i j}]

E [Y_{i j} (a, v) | L_{i j} = l_{i j}] - E [Y_{i j} (a, v') | L_{i j} = l_{i j}]

E [Y_{i j} (a, v) | L_{i j} = l_{i j}] - E [Y_{i j} (a', v') | L_{i j} = l_{i j}] .
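Under (20), each conditional counterfactual mean equals an observed cell mean, so the three contrasts can be read off a table of means indexed by (a, v, l). The sketch below illustrates this with a handful of made-up records; it assumes that both the summary v and the covariate l take a small number of discrete values, and it uses plain cell averages rather than the regression models one would need with richer covariates. The function names and data are hypothetical.

```python
from collections import defaultdict

def cell_means(records):
    """Average outcome within each (a, v, l) cell of (a, v, l, y) records."""
    sums = defaultdict(lambda: [0.0, 0])
    for a, v, l, y in records:
        cell = sums[(a, v, l)]
        cell[0] += y
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

def conditional_effects(means, l, a, a_prime, v, v_prime):
    """Conditional direct, indirect and total contrasts at covariate value l."""
    return {
        "direct": means[(a, v, l)] - means[(a_prime, v, l)],
        "indirect": means[(a, v, l)] - means[(a, v_prime, l)],
        "total": means[(a, v, l)] - means[(a_prime, v_prime, l)],
    }

# Hypothetical records: (own treatment a, peer summary v, covariate l, outcome y).
records = [
    (1, 1, 0, 3.5), (1, 1, 0, 4.5),
    (0, 1, 0, 3.0),
    (1, 0, 0, 2.0),
    (0, 0, 0, 0.5), (0, 0, 0, 1.5),
]
means = cell_means(records)
effects = conditional_effects(means, l=0, a=1, a_prime=0, v=1, v_prime=0)
print(effects)
```

A missing cell raises a `KeyError`, which is the sample analogue of a positivity failure: the corresponding contrast cannot be computed from cell means alone.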

Hong and Raudenbush (2006) also allow L_ij to contain cluster-level covariates along with cluster aggregates of individual-level covariates. A similar approach is taken in VanderWeele (2010) in the context of mediation in the presence of interference. Note, however, that (20) requires that Y_ij (a_ij, v) be mean independent of both A_ij and V (a_i,−j) conditional on L_ij. If, for each individual, A_ij is randomized conditional on L_ij, this will imply that Y_ij (a_ij, v) is mean independent of A_ij conditional on L_ij, but it does not necessarily guarantee that Y_ij (a_ij, v) is mean independent of V (a_i,−j) conditional on L_ij. More generally, instead of (20) we might consider

E [Y_{i j} (a_{i j}, v) | A_{i j}, V (a_{i, - j}), L_{i j}, h (L_{i})] = E [Y_{i j} (a_{i j}, v) | L_{i j}, h (L_{i})]

(21)

where h(L_i) is a known function of L_i. However once again, with (21), even if for each individual A_ij were randomized conditional on L_ij, h(L_i), this does not guarantee that Y_ij (a_ij, v) is mean independent of V (a_i,−j) conditional on L_ij, h(L_i) unless h(L_i) = L_i.

6 Varieties of direct and indirect effects

We have considered several types of effects that arise when there is interference between units. We have considered the effect on some outcome of an individual’s treatment when the treatments of other units in a cluster are held fixed at a certain value; following Hudgens and Halloran (2008), this was referred to as a "direct effect." We have also considered the effect on an individual’s outcome of holding the individual’s own treatment fixed but modifying the treatments received by other individuals in the same cluster; again following Hudgens and Halloran (2008), this was referred to as an "indirect effect." Of course, the terms "direct effects" and "indirect effects" are also used in the context of questions of mediation analysis, i.e. in assessing the extent to which the effect of some treatment on an outcome is mediated through some intermediate (the indirect effect) and the extent to which it occurs through other pathways (the direct effect). In some contexts, both interference and mediation may be present and of interest, and the terms "direct effect" and "indirect effect" become ambiguous as they may make reference to the concepts from interference or from mediation.

In the infectious disease literature, the terminology of "direct and indirect effects" when interference is present dates at least as far back as Halloran and Struchiner (1991), although Hudgens and Halloran (2008) arguably provide the first formal counterfactual definitions. The terminology of "direct and indirect effects" in the context of mediation analysis extends at least as far back as the literature on structural equation modeling (e.g. Duncan, 1966) motivated by the method of path coefficients of Wright (1921); counterfactual notions of direct and indirect effects were described in detail by Holland (1988) and Robins and Greenland (1992). Because of the potential ambiguity in the terms "direct effect" and "indirect effect," Sobel (2006) chose to use the term "spillover effect" for the effect on an individual’s outcome of holding the individual’s own treatment fixed but modifying the treatments received by other individuals. An early paper (Strain et al., 1976) in experimental educational psychology appears to have interchangeably used "indirect effect" and "spillover effect" to denote the effect on a child’s outcome of holding the child’s own treatment fixed but modifying the treatments received by other children. Complicating terminological issues yet further, the causal inference literature on mediation itself has produced alternative definitions of direct and indirect effects based on potential interventions on the mediator (Robins and Greenland, 1992; Pearl, 2001) or alternatively on the notion of principal strata (Frangakis and Rubin, 2002; Rubin, 2004).

Variants of the notions of direct and indirect effects based on principal strata may in fact further be reformulated in the context of interference. Consider a vaccine trial (type A randomization) in which each cluster has two individuals so that for all i, n_i = 2 (e.g. a study of married households with no children) such that half of the households were randomized to no vaccine (α₀ = 0) and half of the households were randomized to having one individual (e.g. the wife) vaccinated (α₁ = 0.5). For each i, let j = 1 denote the subject that is potentially vaccinated (e.g. the wife) and j = 2 the subject that is never vaccinated (e.g. the husband). In the infectious disease context, a vaccination for individual 1 may prevent individual 2 from being infected either because the vaccine prevents individual 1 from being infected or possibly because, even if individual 1 becomes infected, the vaccine itself renders the infection less contagious. A distinction between these two possibilities is sometimes drawn by using "susceptibility effect" to describe the former and "infectiousness effect" to describe the latter (Datta et al., 1999). Consider the following causal quantity, E_i(Y_i2 (1, 0) – Y_i2 (0, 0)|Y_i1 (1, 0) = Y_i1 (0, 0) = 1); this is the effect on individual 2 of vaccinating individual 1 (with individual 2 unvaccinated) amongst the subset of households for whom individual 1 becomes infected irrespective of whether individual 1 receives the vaccination; this would be a principal strata direct effect (Rubin, 2004). If this quantity were non-zero we might interpret this as evidence of an "infectiousness effect" of the vaccine, since the vaccination of individual 1 affects the outcome of individual 2 even though it has no effect on the outcome of individual 1. Future work could potentially adapt estimation methods for principal strata direct effects (Gallop et al., 2009; Sjölander et al., 2009) to attempt to estimate and potentially test for the presence of an "infectiousness effect", E_i(Y_i2 (1, 0) – Y_i2 (0, 0)|Y_i1 (1, 0) = Y_i1 (0, 0) = 1).
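The estimand itself can be illustrated on a toy table of potential outcomes. The sketch below only evaluates the definition and is not an estimator: in real data the four potential outcomes of a household are never observed together, so membership in the principal stratum is unknown. The five households and the function name are hypothetical.

```python
# Hypothetical potential outcomes for five two-person households.
# y1[a1] is Y_i1(a1, 0): infection status of individual 1 when individual 1
# receives vaccine status a1 and individual 2 is unvaccinated.
# y2[a1] is Y_i2(a1, 0): infection status of individual 2 in the same scenario.
households = [
    {"y1": {1: 1, 0: 1}, "y2": {1: 0, 0: 1}},  # 1 always infected, 2 protected
    {"y1": {1: 1, 0: 1}, "y2": {1: 1, 0: 1}},  # 1 always infected, 2 unaffected
    {"y1": {1: 0, 0: 1}, "y2": {1: 0, 0: 1}},  # vaccine protects 1: not in stratum
    {"y1": {1: 0, 0: 0}, "y2": {1: 0, 0: 0}},  # 1 never infected: not in stratum
    {"y1": {1: 1, 0: 1}, "y2": {1: 0, 0: 0}},  # 1 always infected, 2 never infected
]

def infectiousness_effect(households):
    """Mean of Y_i2(1,0) - Y_i2(0,0) among households with Y_i1(1,0) = Y_i1(0,0) = 1."""
    stratum = [h for h in households if h["y1"][1] == 1 and h["y1"][0] == 1]
    if not stratum:
        return None  # the principal stratum is empty
    return sum(h["y2"][1] - h["y2"][0] for h in stratum) / len(stratum)

effect = infectiousness_effect(households)
print(effect)
```

Here three households fall in the stratum and the contrast is −1/3; the third household, in which the vaccine protects individual 1, contributes to a susceptibility effect but is excluded from the infectiousness contrast by construction.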

Note that although the infectiousness effect quantity defined above is a "principal strata direct effect," within the context of interference it is a form of an "indirect effect" since individual 2’s vaccination status is fixed to be unvaccinated in the causal comparison. Within the context of interference, both the "susceptibility effect" and the "infectiousness effect" are in fact forms of "indirect effects" (in the interference sense) because both concern the effect on individual 2 of holding individual 2’s vaccine status fixed but changing the vaccine status of individual 1; if interference were absent, neither of the effects would be present. If interference were absent then the principal strata "infectiousness effect" quantity defined above would reduce to E_i(Y_i2 (0) – Y_i2 (0)|Y_i1 (1) = Y_i1 (0) = 1) = 0. Again, terminology concerning "direct and indirect effects" is ambiguous and is easily confused: what is a "direct effect" in the context of principal strata is an "indirect effect" in the context of interference.

Because of the multiple varieties of direct and indirect effects, the use of more specific terminology may be desirable. In the context of interference, "indirect effect" and "direct effect" could be replaced by "spillover effect" and "unit-treatment effect"; in the context of mediation, "indirect effect" and "direct effect" could be replaced by "mediated effect" and "unmediated effect." In the context of infectious diseases and the principal strata effect defined above, "susceptibility effect" and "infectiousness effect" could be used rather than making reference to "direct and indirect effects." Yet further caution with regard to terminology on direct and indirect effects will be needed when both interference and mediation are present and of interest (VanderWeele, 2010).

7 Concluding remarks

In this paper we have reviewed some of the literature on causal inference in the presence of interference, provided new results on inference without the assumption of Stratified interference, and described an inverse probability weighting approach to causal inference under interference in the context of observational studies. Interference arises in settings in which social interactions are present, including settings of infectious disease, the study of neighborhoods and classrooms, and a variety of economic contexts. Although most work in causal inference has proceeded under a no-interference assumption, there are clearly many contexts in which such an assumption is not plausible. The issues raised by interference can be circumvented to a certain extent by implementing treatment programs at the cluster level rather than the individual level. However, interference gives rise to spillover effects which are themselves of intrinsic interest, and the analysis of such spillover effects is inaccessible without explicitly taking interference into account. Theory and methods to address questions of interference and spillover effects will thus likely be important for a number of applied research settings.

The present work could be extended in a number of directions. Finite sample confidence intervals of shorter length than those in section 4 could be obtained by employing additional assumptions such as Stratified interference; continuous and unbounded outcomes could also be considered. The finite sample behavior of the inverse probability weighting estimation approach we proposed in this paper could be explored. Identification or partial identification results for the "infectiousness effect," formalized in terms of principal strata, could be developed. Finally, further research could also potentially develop a more general framework for interference and spillover effects so as to consider a range of settings in which both interference and mediation were present and also so as to potentially allow for both within-cluster and between-cluster forms of interference. Causal inference under interference is a relatively new subfield and considerable work remains to be carried out.

APPENDIX

Proof of Lemma 1

Note that $Var ({\hat{Y}}_{i} (1; α_{0}) | S_{i} = 1) = Var {\sum_{j = 1}^{n_{i}} A_{i j} Y_{i j} (A_{i}) / \sum_{j = 1}^{n_{i}} A_{i j}} = \frac{1}{K_{0, i}^{2}} [\sum_{j} Var {A_{i j} Y_{i j} (A_{i})} + \sum_{j \neq j'} Cov {A_{i j} Y_{i j} (A_{i}), A_{i j'} Y_{i j'} (A_{i})}]$

Let $p_{i} \equiv 1 / \binom{n_{i}}{K_{0, i}}$. Each term of the first sum equals

Var {A_{i j} Y_{i j} (A_{i})} = Var {\sum_{ω \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} 1 (A_{i, - j} = ω) A_{i j} Y_{i j} (a_{i j} = 1, a_{i, - j} = ω)}

= \sum_{ω \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} Var {1 (A_{i, - j} = ω) A_{i j} Y_{i j} (a_{i j} = 1, a_{i, - j} = ω)}

+ \sum_{(ω, ω') \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} 1 (ω \neq ω') Cov {1 (A_{i, - j} = ω) A_{i j} Y_{i j} (a_{i j} = 1, a_{i, - j} = ω), 1 (A_{i, - j} = ω') A_{i j} Y_{i j} (a_{i j} = 1, a_{i, - j} = ω')}

= p_{i} {1 - p_{i}} \sum_{ω \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} Y_{i j}^{2} (a_{i j} = 1, a_{i, - j} = ω)

- p_{i}^{2} \sum_{(ω, ω') \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} 1 (ω \neq ω') Y_{i j} (a_{i j} = 1, a_{i, - j} = ω) Y_{i j} (a_{i j} = 1, a_{i, - j} = ω')

and

Cov {A_{i j} Y_{i j} (A_{i}), A_{i j'} Y_{i j'} (A_{i})} = E {A_{i j} Y_{i j} (A_{i}) A_{i j'} Y_{i j'} (A_{i})} - E {A_{i j} Y_{i j} (A_{i})} E {A_{i j'} Y_{i j'} (A_{i})}

= \sum_{ω \in 𝒜 (n_{i} - 2, K_{0, i} - 2)} p_{i} Y_{i j} (a_{i j} = 1, a_{i j'} = 1, a_{i, - (j, j')} = ω) Y_{i j'} (a_{i j} = 1, a_{i j'} = 1, a_{i, - (j, j')} = ω)

- \sum_{(ω, ω') \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} p_{i}^{2} Y_{i j} (a_{i j} = 1, a_{i, - j} = ω) Y_{i j'} (a_{i j'} = 1, a_{i, - j'} = ω')

= \sum_{ω \in 𝒜 (n_{i} - 2, K_{0, i} - 2)} p_{i} Y_{i j} (a_{i j} = 1, a_{i j'} = 1, a_{i, - (j, j')} = ω) Y_{i j'} (a_{i j} = 1, a_{i j'} = 1, a_{i, - (j, j')} = ω)

- \sum_{ω \in 𝒜 (n_{i} - 2, K_{0, i} - 2)} p_{i}^{2} Y_{i j} (a_{i j} = 1, a_{i j'} = 1, a_{i, - (j, j')} = ω) Y_{i j'} (a_{i j} = 1, a_{i j'} = 1, a_{i, - (j, j')} = ω)

- \sum_{(ω, ω') \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} 1 (ω \neq ω') p_{i}^{2} Y_{i j} (a_{i j} = 1, a_{i, - j} = ω) Y_{i j'} (a_{i j'} = 1, a_{i, - j'} = ω')

so that

Var ({\hat{Y}}_{i} (1; α_{0}) | S_{i} = 1) = \frac{1}{K_{0, i}^{2}} p_{i} {1 - p_{i}}

\times [\begin{matrix} Σ_{j} \sum_{ω \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} Y_{i j}^{2} (a_{i j} = 1, a_{i, - j} = ω) \\ + Σ_{j \neq j'} \sum_{ω \in 𝒜 (n_{i} - 2, K_{0, i} - 2)} Y_{i j} (a_{i j} = 1, a_{i j'} = 1, a_{i, - (j, j')} = ω) Y_{i j'} (a_{i j} = 1, a_{i j'} = 1, a_{i, - (j, j')} = ω) \end{matrix}]

- \frac{1}{K_{0, i}^{2}} p_{i}^{2} [\begin{matrix} Σ_{j} \sum_{(ω, ω') \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} 1 (ω \neq ω') Y_{i j} (a_{i j} = 1, a_{i, - j} = ω) Y_{i j} (a_{i j} = 1, a_{i, - j} = ω') \\ + Σ_{j \neq j'} \sum_{(ω, ω') \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} 1 (ω \neq ω') Y_{i j} (a_{i j} = 1, a_{i, - j} = ω) Y_{i j'} (a_{i j'} = 1, a_{i, - j'} = ω') \end{matrix}]

= \frac{1}{K_{0, i}^{2}} p_{i} {1 - p_{i}}

\times [\begin{matrix} Σ_{j} \sum_{ω \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} Y_{i j}^{2} (a_{i j} = 1, a_{i, - j} = ω) \\ + Σ_{j \neq j'} \sum_{ω \in 𝒜 (n_{i} - 2, K_{0, i} - 2)} Y_{i j} (a_{i j} = 1, a_{i j'} = 1, a_{i, - (j, j')} = ω) Y_{i j'} (a_{i j} = 1, a_{i j'} = 1, a_{i, - (j, j')} = ω) \end{matrix}]

- \frac{1}{K_{0, i}^{2}} p_{i}^{2} [Σ_{j, j'} \sum_{(ω, ω') \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} 1 (ω \neq ω') Y_{i j} (a_{i j} = 1, a_{i, - j} = ω) Y_{i j'} (a_{i j'} = 1, a_{i, - j'} = ω')]

Therefore, as Y_ij (a_i) ≥ 0 for all a_i ∈ 𝒜 (n_i, K_0,i) and all j in group i, $E {{\hat{Var}}_{u} ({\hat{Y}}_{i} (1; α_{0}) | S_{i} = 1) | S_{i} = 1} > Var ({\hat{Y}}_{i} (1; α_{0}) | S_{i} = 1)$, since

E {{\hat{Var}}_{u} ({\hat{Y}}_{i} (1; α_{0}) | S_{i} = 1) | S_{i} = 1}

= \frac{1}{K_{0, i}^{2}} p_{i} {1 - p_{i}}

\times [\begin{matrix} Σ_{j} \sum_{ω \in 𝒜 (n_{i} - 1, K_{0, i} - 1)} Y_{i j}^{2} (a_{i j} = 1, a_{i, - j} = ω) \\ + Σ_{j \neq j'} \sum_{ω \in 𝒜 (n_{i} - 2, K_{0, i} - 2)} Y_{i j} (a_{i j} = 1, a_{i j'} = 1, a_{i, - (j, j')} = ω) Y_{i j'} (a_{i j} = 1, a_{i j'} = 1, a_{i, - (j, j')} = ω) \end{matrix}] .
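The single-unit variance identity used in the first step of the proof can be verified by brute force for a small group. The sketch below is a numerical sanity check under stated assumptions, not part of the proof: the allocation is taken to be uniform over all allocations with exactly K treated units, so that p is one over the number of such allocations, and the group size, K, the unit index j and the potential outcomes are hypothetical.

```python
from itertools import combinations
import math
import random

n, K, j = 5, 2, 0  # hypothetical group size, number treated, unit of interest
random.seed(1)

# All allocations with exactly K treated units, each with probability p.
allocs = [
    tuple(1 if u in chosen else 0 for u in range(n))
    for chosen in combinations(range(n), K)
]
p = 1.0 / math.comb(n, K)

# Fixed, nonnegative potential outcome of unit j under each allocation.
y = {s: random.random() for s in allocs}

# Exact variance of A_ij * Y_ij(A_i) by enumeration: E[X^2] - (E[X])^2.
values = [s[j] * y[s] for s in allocs]
mean = p * sum(values)
exact_var = p * sum(v * v for v in values) - mean ** 2

# Closed form from the proof: p(1 - p) * sum of squares over allocations
# with unit j treated, minus p^2 * sum of cross products over distinct pairs.
treated = [s for s in allocs if s[j] == 1]
formula = p * (1.0 - p) * sum(y[s] ** 2 for s in treated) - p * p * sum(
    y[s] * y[t] for s in treated for t in treated if s != t
)

print(exact_var, formula)
```

The two printed values should coincide up to rounding, reflecting that the indicators of distinct allocations are mutually exclusive events of common probability p.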

Proof of Theorems 1-5

See technical report available from the authors.

Proof of Theorem 6

Under Assumptions 3 and 4, we have that for a = 0, 1,

E {{\hat{Y}}_{i}^{i p w} (a; α_{0})} = \frac{1}{n_{i}} \sum_{s \in 𝒜 (n_{i})} \frac{Pr {A_{i} = s | L_{i}, Y_{i} (\cdot)}}{f_{A | L, i} (s | L_{i})} \sum_{j = 1}^{n_{i}} 1 (s_{i j} = a) Y_{i j} (a_{i, - j} = s_{i, - j}, a_{i j} = s_{i j}) \prod_{j' = 1, j' \neq j}^{n_{i}} α_{0}^{s_{i j'}} {(1 - α_{0})}^{1 - s_{i j'}}

= \frac{1}{n_{i}} \sum_{s \in 𝒜 (n_{i})} \frac{f_{A | L, i} (s | L_{i})}{f_{A | L, i} (s | L_{i})} \sum_{j = 1}^{n_{i}} 1 (s_{i j} = a) Y_{i j} (a_{i, - j} = s_{i, - j}, a_{i j} = s_{i j}) \prod_{j' = 1, j' \neq j}^{n_{i}} α_{0}^{s_{i j'}} {(1 - α_{0})}^{1 - s_{i j'}}

= \frac{1}{n_{i}} \sum_{j = 1}^{n_{i}} \sum_{s \in 𝒜 (n_{i} - 1)} Y_{i j} (a_{i, - j} = s, a_{i j} = a) \prod_{j' = 1, j' \neq j}^{n_{i}} α_{0}^{s_{i j'}} {(1 - α_{0})}^{1 - s_{i j'}}

Similarly,

E {{\hat{Y}}_{i}^{i p w} (α_{0})} = \frac{1}{n_{i}} \sum_{s \in 𝒜 (n_{i})} \frac{Pr {A_{i} = s | L_{i}, Y_{i} (\cdot)}}{f_{A | L, i} (s | L_{i})} \sum_{j = 1}^{n_{i}} Y_{i j} (a_{i} = s) \prod_{j' = 1}^{n_{i}} α_{0}^{s_{i j'}} {(1 - α_{0})}^{1 - s_{i j'}}

= \frac{1}{n_{i}} \sum_{s \in 𝒜 (n_{i})} \frac{f_{A | L, i} (s | L_{i})}{f_{A | L, i} (s | L_{i})} \sum_{j = 1}^{n_{i}} Y_{i j} (a_{i} = s) \prod_{j' = 1}^{n_{i}} α_{0}^{s_{i j'}} {(1 - α_{0})}^{1 - s_{i j'}}

= \frac{1}{n_{i}} \sum_{j = 1}^{n_{i}} \sum_{s \in 𝒜 (n_{i})} Y_{i j} (a_{i} = s) \prod_{j' = 1}^{n_{i}} α_{0}^{s_{i j'}} {(1 - α_{0})}^{1 - s_{i j'}}

References

1. Chow YS, Teicher HP. Probability Theory: Independence, interchangeability, martingales. 3rd edition. Springer Texts in Statistics; 1997.
2. Datta S, Halloran ME, Longini IM. Efficiency of estimating vaccine efficacy for susceptibility and infectiousness: randomization by individual versus household. Biometrics. 1999;55:792–798. doi: 10.1111/j.0006-341x.1999.00792.x.
3. Duncan OD. Path analysis: sociological examples. American Journal of Sociology. 1966;72:1–16.
4. Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58:21–29. doi: 10.1111/j.0006-341x.2002.00021.x.
5. Gallop R, Small DS, Lin JY, Elliott MR, Joffe M, Ten Have TR. Mediation analysis with principal stratification. Statistics in Medicine. 2009;28:1108–1130. doi: 10.1002/sim.3533.
6. Graham B. Identifying social interactions through conditional variance restrictions. Econometrica. 2008;76:643–660.
7. Halloran ME, Struchiner CJ. Causal inference for infectious diseases. Epidemiology. 1995;6:142–151. doi: 10.1097/00001648-199503000-00010.
8. Hoeffding W. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association. 1963 Mar;58(301):13–30.
9. Holland PW. Causal inference, path analysis, and recursive structural equations models. Sociological Methodology. 1988;18:449–484.
10. Hong G, Raudenbush SW. Evaluating kindergarten retention policy: A case study of causal inference for multilevel observational data. Journal of the American Statistical Association. 2006;101:901–910.
11. Hudgens MG, Halloran ME. Towards causal inference with interference. Journal of the American Statistical Association. 2008;103:832–842. doi: 10.1198/016214508000000292. PMC2600548.
12. Manski CF. Economic analysis of social interactions. Journal of Economic Perspectives. 2000;14:115–136.
13. Manski CF. Identification of treatment response with social interactions. Northwestern University Working Paper. 2010.
14. Joag-Dev K, Proschan F. Negative Association of Random Variables with Applications. Annals of Statistics. 1983;11(1):286–295.
15. Pearl J. Direct and indirect effects. Proceedings of the Seventeenth Conference on Uncertainty and Artificial Intelligence; San Francisco: Morgan Kaufmann. 2001. pp. 411–420.
16. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3:143–155. doi: 10.1097/00001648-199203000-00013.
17. Rosenbaum PR. Interference between units in randomized experiments. Journal of the American Statistical Association. 2007;102:191–200. doi: 10.1080/01621459.2012.655954.
18. Rubin DB. Comment on: "Randomization analysis of experimental data in the Fisher randomization test" by D. Basu. Journal of the American Statistical Association. 1980;75:591–593.
19. Rubin DB. Direct and indirect effects via potential outcomes. Scandinavian Journal of Statistics. 2004;31:161–170.
20. Sjölander A, Humphreys K, Vansteelandt S, Bellocco R, Palmgren J. Sensitivity analysis for principal stratum direct effects, with an application to a study of physical activity and coronary heart disease. Biometrics. 2009;65:514–520. doi: 10.1111/j.1541-0420.2008.01108.x.
21. Sobel ME. What Do Randomized Studies of Housing Mobility Demonstrate?: Causal Inference in the Face of Interference. Journal of the American Statistical Association. 2006;101:1398–1407.
22. Strain PS, Shores RE, Kerr MM. An experimental analysis of "spillover" effects on the social interaction of behaviorally handicapped preschool children. Journal of Applied Behavior Analysis. 1976;9:31–40. doi: 10.1901/jaba.1976.9-31.
23. van der Vaart AW, Wellner JA. Weak Convergence and Empirical Processes: With Applications to Statistics. Springer Series in Statistics. 1996.
24. VanderWeele TJ. Direct and indirect effects for neighborhood-based clustered and longitudinal data. Sociological Methods and Research. 2010. doi: 10.1177/0049124110366236. in press.
25. Wright S. Correlation and causation. J. Agric. Res. 1921;20:557–585.

[R1] 1.Chow YS, Teicher HP. Probability Theory: Independence, interchangeability, martingales. 3rd edition. Springer Texts in Statistics; 1997. [Google Scholar]

[R2] 2.Datta S, Halloran ME, Longini IM. Efficiency of estimating vaccine efficacy for susceptibility and infectiousness: randomization by individual versus household. Biometrics. 1999;55:792–798. doi: 10.1111/j.0006-341x.1999.00792.x. [DOI] [PubMed] [Google Scholar]

[R3] 3.Duncan OD. Path analysis: sociological examples. American Journal of Sociology. 1966;72:1–16. [Google Scholar]

[R4] 4.Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58:21–29. doi: 10.1111/j.0006-341x.2002.00021.x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R5] 5.Gallop R, Small DS, Lin JY, Elliott MR, Joffe M, Ten Have TR. Mediation analysis with principal stratification. Statistics in Medicine. 2009;28:1108–1130. doi: 10.1002/sim.3533. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R6] 6.Graham B. Identifying social interactions through conditional variance restrictions. Econometrica. 2008;76:643–660. [Google Scholar]

[R7] 7.Halloran ME, Struchiner CJ. Causal inference for infectious diseases. Epidemiology. 1995;6:142–151. doi: 10.1097/00001648-199503000-00010. [DOI] [PubMed] [Google Scholar]

[R8] 8.Hoeffding W. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association. 1963 Mar;58(301):13–30. [Google Scholar]

[R9] 9.Holland PW. Causal inference, path analysis, and recursive structural equations models. Sociological Methodology. 1988;18:449–484. [Google Scholar]

[R10] 10.Hong G, Raudenbush SW. Evaluating kindergarten retention policy: A case study of causal inference for multilevel observational data. Journal of the American Statistical Association. 2006;101:901–910. [Google Scholar]

[R11] 11.Hudgens MG, Halloran ME. Towards causal inference with interference. Journal of the American Statistical Association. 2008;103:832–842. doi: 10.1198/016214508000000292. PMC2600548. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R12] 12.Manski CF. Economic analysis of social interactions. Journal of Economic Perspectives. 2000;14:115–136. [Google Scholar]

[R13] 13.Manski CF. Identification of treatment response with social interactions. Northwestern University Working Paper. 2010 [Google Scholar]

[R14] 14.Joag-Dev K, Proschan F. Negative Association of Random Variables with Applications. Annals of Statistics. 1983;11(1):286–295. [Google Scholar]

[R15] 15.Pearl J. Direct and indirect effects. Proceedings of the Seventeenth Conference on Uncertainty and Artificial Intelligence; San Francisco: Morgan Kaufmann. 2001. pp. 411–420. [Google Scholar]

[R16] 16.Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3:143–155. doi: 10.1097/00001648-199203000-00013. [DOI] [PubMed] [Google Scholar]

[R17] 17.Rosenbaum PR. Interference between units in randomized experiments. Journal of the American Statistical Association. 2007;102:191–200. doi: 10.1080/01621459.2012.655954. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R18] 18.Rubin DB. Comment on: "Randomization analysis of experimental data in the fisher randomization test” by D. Basu. Journal of the American Statistical Association. 1980;75:591–593. [Google Scholar]

[R19] 19.Rubin DB. Direct and indirect effects via potential outcomes. Scandinavian Journal of Statistics. 2004;31:161–170. [Google Scholar]

[R20] 20.Sjölander A, Humphreys K, Vansteelandt S, Bellocco R, Palmgren J. Sensitivity analysis for principal stratum direct effects, with an application to a study of physical activity and coronary heart disease. Biometrics. 2009;65:514–520. doi: 10.1111/j.1541-0420.2008.01108.x. [DOI] [PubMed] [Google Scholar]

[R21] 21. Sobel ME. What do randomized studies of housing mobility demonstrate? Causal inference in the face of interference. Journal of the American Statistical Association. 2006;101:1398–1407.

[R22] 22. Strain PS, Shores RE, Kerr MM. An experimental analysis of "spillover" effects on the social interaction of behaviorally handicapped preschool children. Journal of Applied Behavior Analysis. 1976;9:31–40. doi: 10.1901/jaba.1976.9-31.

[R23] 23. van der Vaart AW. Weak Convergence and Empirical Processes: With Applications to Statistics. Springer Series in Statistics. 1996.

[R24] 24. VanderWeele TJ. Direct and indirect effects for neighborhood-based clustered and longitudinal data. Sociological Methods and Research. 2010. In press. doi: 10.1177/0049124110366236.

[R25] 25. Wright S. Correlation and causation. Journal of Agricultural Research. 1921;20:557–585.
