GENERAL THEORY FOR INTERACTIONS IN SUFFICIENT CAUSE MODELS WITH DICHOTOMOUS EXPOSURES

Tyler J VanderWeele; Thomas S Richardson

doi:10.1214/12-aos1019

Author manuscript; available in PMC: 2014 Dec 29.

Published in final edited form as: Ann Stat. 2012;40(4):2128–2161. doi: 10.1214/12-aos1019


PMCID: PMC4278668 NIHMSID: NIHMS632467 PMID: 25552780

Abstract

The sufficient-component cause framework assumes the existence of sets of sufficient causes that bring about an event. For a binary outcome and an arbitrary number of binary causes any set of potential outcomes can be replicated by positing a set of sufficient causes; typically this representation is not unique. A sufficient cause interaction is said to be present if within all representations there exists a sufficient cause in which two or more particular causes are all present. A singular interaction is said to be present if for some subset of individuals there is a unique minimal sufficient cause. Empirical and counterfactual conditions are given for sufficient cause interactions and singular interactions between an arbitrary number of causes. Conditions are given for cases in which none, some or all of a given set of causes affect the outcome monotonically. The relations between these results, interactions in linear statistical models and Pearl’s probability of causation are discussed.

Keywords: causal inference, counterfactual, epistasis, interaction, potential outcomes, synergism

1. Introduction

Rothman’s sufficient-component cause model [21] postulates a set of different causal mechanisms each sufficient to bring about the outcome under consideration. Rothman refers to these hypothesized causal mechanisms as “sufficient causes”, conceiving of them as minimal sets of actions, events or states of nature which together initiate a process resulting in the outcome.

Thus each sufficient cause is hypothesized to consist of a set of “component causes”. Whenever all components of a particular sufficient cause are present, the outcome occurs; within every sufficient cause, each component would be necessary for that sufficient cause to lead to the outcome. Models of this kind have a long history: a simple version is considered by Cayley [3]; it also corresponds to the INUS model introduced by Mackie [11] in the philosophical literature. Much recent work has sought to relate the model to other causal modelling frameworks [7, 8, 29, 31, 32].

In traditional sufficient-component cause (SCC) models the outcome and all the component causes are events, or equivalently, binary random variables. An SCC model with k component causes implies a set of 2^k potential outcomes. Conversely, in Section 2 we show that for any given list of potential outcomes there is at least one SCC model which represents this set. However, in general there may be many such SCC models.

One question concerns whether, given a set of potential outcomes implied by some (unknown) SCC model, one may infer that two component causes are present within some sufficient cause in the unknown SCC model. In general, it is possible that two SCC models both imply the same set of potential outcomes yet although A and B occur together in some sufficient component cause in the first model, A and B are not present together in any sufficient component cause in the second. In [32] two sufficient component causes are said to form a ‘sufficient cause interaction’ (or to be ‘irreducible’) if they are both present within at least one sufficient cause in every SCC model for a given set of potential outcomes. Of course, in general, the distribution of potential outcomes for a given population is also unknown, though it is constrained (marginally) by the observed data from a randomized experiment. In [32] empirical conditions are given which are sufficient to ensure that for any set of potential outcomes compatible with experimental data, all compatible SCC models will contain a sufficient cause involving A and B. These results are an improvement upon earlier empirical tests for the existence of a two-way interaction in an SCC model [22], which required the assumption of monotonicity; see also [1, 9, 10, 15, 30]. The new results are able to establish the existence of an interaction in situations where monotonicity does not hold. In this paper we develop empirical conditions that are sufficient for the existence of a sufficient cause containing a given subset of an arbitrary number of variables both with and without monotonicity assumptions.

As illustrative motivation for the theoretical development we will consider data presented in a study by [26], summarized in Table 1, from a case-control study of bladder cancer examining possible three-way interaction between smoking (1 =present), and genetic variants on NAT2 (0 = R, 1 = S genotype) and NAT1 (1 for the *10 allele) for Caucasian individuals. We return to this example at the end of this paper to examine the evidence for a sufficient cause containing all three of smoking, the S genotype on NAT2 and the *10 allele on NAT1.

Table 1.

Smoking	NAT2	NAT1*10	Cases (n = 215)	Controls (n = 191)	Odds Ratio (95% CI)
0	0	0	6	13	1
0	0	1	8	16	1.1	(0.3, 3.9)
0	1	0	16	31	1.1	(0.4, 3.5)
0	1	1	6	10	1.3	(0.3, 5.3)
1	0	0	42	32	2.8	(1.1, 8.3)
1	0	1	41	26	3.4	(1.2, 10.1)
1	1	0	61	51	2.6	(0.9, 7.3)
1	1	1	35	12	6.3	(2.0, 20.3)


The remainder of this paper is organized as follows: Section 2 presents the sufficient-component cause framework as formalized by VanderWeele and Robins [32]. Section 3 describes general n-way irreducible interactions (aka ‘sufficient cause interactions’) and characterizes these in terms of potential outcomes. Section 4 derives empirical conditions for the existence of irreducible interactions both with and without monotonicity assumptions. Section 5 describes ‘singular’ interactions which arise in genetic contexts, provides a characterization, derives empirical conditions that are sufficient for their existence, and relates this notion to Pearl’s probability of causation. Section 6 discusses the relation between singular and sufficient cause interactions and linear statistical models. Section 7 provides some comments regarding stronger interpretations of sufficient cause models, and returns to the data presented in Table 1. Finally section 8 offers some possible extensions to the present work.

2. Notation and Basic Concepts

We will use the following notation: An event is a binary random variable taking values in {0, 1}. We use uppercase roman to indicate events (X), boldface to indicate sets of events (C), and lowercase to indicate specific values both for single random variables (X = x), and, with slight abuse of notation, for sets {C = c} ≡ {∀i, (C)_i = (c)_i}, and {a ≤ b} ≡ {∀i, (a)_i ≤ (b)_i}; 1 and 0 are vectors of 1’s and 0’s; the cardinality of a set is denoted |C|. We use fraktur ( $B$ ) to denote collections of sets of events.

The complement of some event X is denoted by $\bar{X} \equiv 1 - X$ . A literal event associated with X, is either X or $\bar{X}$ . For a given set of events C, $𝕃 (C)$ is the associated set of literal events:

𝕃 (C) \equiv C \cup {\bar{X} ∣ X \in C} .

For a literal $L \in 𝕃 (C)$ , and an assignment c to C, (L)_c denotes the value assigned to L by c. The conjunction of a set of literal events $B = {F_{1}, \dots, F_{m}} \subseteq 𝕃 (C)$ is defined as:

⋀ (B) \equiv \prod_{i = 1}^{m} F_{i} = \min {F_{1}, \dots, F_{m}};

note that Λ(B) = 1 iff for all i, F_i = 1. We also define B₁ ∧B₂ ≡ Λ {B₁, B₂}. We will use $𝕀 (A)$ to denote the indicator function for event A. There is a simple correspondence between conjunctions of literals and indicator functions: let B = {X₁, …, X_s} and C = {Y₁, …, Y_t}, then

⋀ ({X_{1}, \dots, X_{s}, {\bar{Y}}_{1}, \dots, {\bar{Y}}_{t}}) = 1 \Leftrightarrow 𝕀 ({B = 1, C = 0}) = 1 .

(2.1)

Similarly, the set of literals corresponding to an assignment c to C is defined:

B^{[c]} \equiv {L ∣ L \in 𝕃 (C), {(L)}_{c} = 1}

so that $⋀ (B^{[c]}) = 𝕀 (C = c)$ ; note that |B^[c]| = |C|. The disjunction of a set of binary random variables is defined as:

⋁ ({Z_{1}, \dots, Z_{p}}) \equiv \max {Z_{1}, \dots, Z_{p}};

note that ∨({Z₁, …, Z_p}) = 1 iff for some j, Z_j = 1. Similarly we let B₁ ∨ B₂ ≡ ∨{B₁, B₂}. Given a collection of sets of literals $B = {C_{1}, \dots, C_{q}}$ we define:

⋁ ⋀ (B) \equiv ⋁_{i} (⋀ (C_{i})) .
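The notation above can be illustrated with a minimal sketch. The encoding is an assumption made only for this illustration: a literal is a pair (i, v) with a 0-based variable index i, where v = 1 stands for X_i and v = 0 for its complement; the helper names are hypothetical.

```python
from itertools import product

# Encoding assumed for this sketch: a literal is a pair (i, v), where
# (i, 1) stands for X_i and (i, 0) for its complement; i is a 0-based index.

def literal_value(lit, c):
    """(L)_c: value assigned to literal `lit` by assignment c (a tuple of 0/1)."""
    i, v = lit
    return 1 if c[i] == v else 0

def conj(B, c):
    """Conjunction of the literals in B under c; the empty conjunction is taken as 1."""
    return min((literal_value(lit, c) for lit in B), default=1)

def disj_of_conj(collection, c):
    """Disjunction of conjunctions; the empty disjunction is taken as 0."""
    return max((conj(B, c) for B in collection), default=0)

def literals_of(c):
    """B^[c]: the literals that take value 1 under assignment c."""
    return frozenset(enumerate(c))

# Check that the conjunction of B^[c] is the indicator of C = c, for |C| = 3.
for c in product((0, 1), repeat=3):
    for c_other in product((0, 1), repeat=3):
        assert conj(literals_of(c), c_other) == int(c == c_other)
```

The final loop checks, for three variables, the identity that the conjunction of B^[c] equals the indicator of C = c.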

We use $\dot{ℙ} (𝕃 (C))$ to denote the set of subsets of $𝕃 (C)$ that do not contain both X and $\bar{X}$ for any X ∈ C, more formally:

\dot{ℙ} (𝕃 (C)) \equiv {B ∣ B \subset 𝕃 (C), for all X \in C, {X, \bar{X}} ⊈ B} .

Note that if $B \in \dot{ℙ} (𝕃 (C))$ , and |B| = |C|, so that for all C ∈ C, exactly one of C or $\bar{C}$ is in B, then an assignment of values b to B induces a unique assignment c to C and vice versa.

2.1. Potential Outcomes Models

Consider a potential outcome model [14, 23, 24] with s binary factors, X₁, …, X_s, which represent hypothetical interventions or causes and let D denote some binary outcome of interest. We use Ω to denote the sample space of individuals in the population and use ω for a particular sample point. Let D_{x₁…x_s}(ω) denote the counterfactual value of D for individual ω if the cause X_j were set to the value x_j for j = 1, …, s. The potential outcomes framework we employ makes two assumptions: first, that for a given individual these counterfactual variables are deterministic; second, in asserting that the counterfactual D_{x₁…x_s}(ω) is well-defined, it is implicitly assumed that the value that D would take on for individual ω is determined solely by the values that X₁, …, X_s are assigned for this individual, and not the assignments made to these variables for other individuals ω′. This latter assumption is often called ‘no interference’ [6], or the Stable Unit Treatment Value Assumption (SUTVA) [25]. An example of a situation where this assumption might fail is a vaccine trial where there is ‘herd’ immunity.

We will use D_{x₁…x_s}(ω), D_{X₁=x₁,…,X_s=x_s}(ω), D_c(ω) and D_{C=c}(ω), with C = {X₁, …, X_s}, interchangeably. In this setting there will be 2^s potential outcomes for each individual ω in the population, one potential outcome for each possible value of (X₁, …, X_s); we use $D (C; ω)$ to denote the set of all such potential outcomes for an individual, and $D (C; Ω)$ for the population. Note that if G = g(C) is some deterministic function of C then G_{C=c}(ω) = g(c), and hence is constant; thus our usage is consistent with the definition of (L)_c in the previous section.

The actual observed value of D for individual ω will be denoted by D(ω) and similarly the actual value of X₁, …, X_s by X₁(ω), …, X_s(ω). Actual and counterfactual outcomes are linked by the Consistency Axiom which requires that

D_{X_{1} = X_{1} (ω), \dots, X_{s} = X_{s} (ω)} (ω) = D (ω),

(2.2)

i.e. that the value of D which would have been observed if X₁, …, X_s had been set to the values they actually took is equal to the value of D which was in fact observed [?]. It follows from this axiom that D_{X₁(ω),…,X_s(ω)}(ω) = D is observed, but it is the only potential outcome for individual ω that is observed.

Example 1. Consider a binary outcome D with three binary causes of interest, X₁, X₂ and X₃. Suppose that the population consists of two individuals. The potential outcomes (LHS) and actual outcomes (RHS) are shown in Table 2.

Table 2.

All potential outcomes and actual outcomes for three binary causes X₁, X₂ and X₃, in a population with two individuals.

Individual	D ₀₀₀	D ₀₀₁	D ₀₁₀	D ₀₁₁	D ₁₀₀	D ₁₀₁	D ₁₁₀	D ₁₁₁	(X₁,X₂,X₃)	D
1	0	1	1	0	0	1	1	0	(1, 0, 1)	1
2	0	1	1	0	0	1	1	1	(0, 0, 0)	0
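As an illustration, the two rows of Table 2 can be encoded directly and the Consistency Axiom (2.2) checked against them. The dictionary layout and variable names below are assumptions of this sketch, not part of the original presentation.

```python
# Potential outcomes D_{x1 x2 x3}(ω) from Table 2, keyed by the string "x1x2x3".
ASSIGNMENTS = ["000", "001", "010", "011", "100", "101", "110", "111"]
potential = {
    1: dict(zip(ASSIGNMENTS, [0, 1, 1, 0, 0, 1, 1, 0])),
    2: dict(zip(ASSIGNMENTS, [0, 1, 1, 0, 0, 1, 1, 1])),
}

# Exposures actually received, as listed in the (X1, X2, X3) column.
actual_exposure = {1: "101", 2: "000"}

# Consistency: the observed outcome is the potential outcome indexed by
# the exposures the individual actually received.
observed = {w: potential[w][actual_exposure[w]] for w in potential}
print(observed)
```

Only one of the eight potential outcomes per individual is observed; the remaining seven are counterfactual.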


We use the notation A ⫫ B | C to indicate that A is independent of B conditional on C in the population distribution.

2.2. Definitions for sufficient cause models

The following definitions generalize those in [32] to sub-populations, $\emptyset \neq Ω^{*} \subseteq Ω$ :

Definition 2.1 (Sufficient Cause). A subset B of the putative (binary) causes $𝕃 (C)$ for D forms a sufficient cause for D (relative to C) in sub-population Ω* if for all c ∈ {0, 1}^|C| such that (Λ(B))_c = 1, D_c(ω) = 1 for all ω ∈ Ω* ⊆ Ω. (We assume that there exists a c* such that (Λ(B))_c* = 1.)

Observe that if B is a sufficient cause for D then any intervention setting the variables C to c with (Λ(B))_c = 1 will ensure that D_c(ω) = 1 for all ω ∈ Ω*. We restrict the definition to non-empty sets Ω*, to preclude every set B being a sufficient cause in an empty sub-population. Likewise we require that there exists some c* such that (Λ(B))_c* = 1 in order to avoid logically inconsistent conjunctions, e.g. $X_{1} \land {\bar{X}}_{1}$ , being classified (vacuously) as a sufficient cause. As a direct consequence, for any binary random variable X, at most one of X and $\bar{X}$ appear in any sufficient cause B.

Proposition 2.2. In Ω* if B is a sufficient cause for D relative to C then B is sufficient for D in any set C* with B ⊆ C* ⊆ C.

B may be sufficient for D relative to C in Ω*, but not relative to C′ ⊃ C.

Proposition 2.3. If B is a sufficient cause for D relative to C in Ω* then B is sufficient for D relative to C in any subset $\emptyset \neq Ω^{* *} \subseteq Ω^{*}$ .

B may be sufficient for D relative to C in Ω*, but not in Ω′ ⊃ Ω*.

Definition 2.4 (Minimal Sufficient Cause). A set $B \subset 𝕃 (C)$ forms a minimal sufficient cause for D (relative to C) in sub-population Ω* if B constitutes a sufficient cause for D in Ω* but no proper subset B* ⊂ B also forms a sufficient cause for D in Ω*.

Note that (in some Ω*) B may be a minimal sufficient cause for D relative to C, but not relative to C* ⊂ C, so the analog of Proposition 2.2 does not hold. For individual 2 in Table 2 {X₁, X₃} is a minimal sufficient cause relative to {X₁, X₂, X₃}. However, if we suppose that for ω = 2, X₂ is not caused by X₁ and X₃, so for all x₁, x₃, (X₂)_{X₁=x₁,X₃=x₃}(ω = 2) = X₂(ω = 2), then {X₁, X₃} is not a minimal sufficient cause relative to {X₁, X₃}:

\begin{matrix} D_{X_{1} = 0, X_{3} = 1} (ω = 2) & = D_{X_{1} = 0, X_{2} = 0, X_{3} = 1} (ω = 2) = 1, \\ D_{X_{1} = 1, X_{3} = 1} (ω = 2) & = D_{X_{1} = 1, X_{2} = 0, X_{3} = 1} (ω = 2) = 1, \end{matrix}

(since X₂(ω =2) = 0) hence X₃ is a sufficient cause of D relative to {X₁, X₃}, hence {X₁, X₃} is not minimal relative to {X₁, X₃} for ω = 2.

Similarly, if B is a minimal sufficient cause for D relative to C in Ω*, it does not follow that B is a minimal sufficient cause for D relative to C in subsets Ω** ⊆ Ω*, so the analog to Proposition 2.3 does not hold. In particular, it may be the case that for all ω ∈ Ω*, B is not a minimal sufficient cause for D in {ω}.

In the language of digital circuit theory [12], sufficient causes are termed ‘implicants’, and minimal sufficient causes are ‘prime implicants’.
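Definitions 2.1 and 2.4 can be checked mechanically on the potential outcomes of Table 2. The following is a minimal sketch; the literal encoding (pairs (i, v) with a 0-based index i) and the function names are assumptions made for illustration.

```python
from itertools import product

# Potential outcomes D_c(ω) from Table 2, keyed by c = (x1, x2, x3).
CELLS = list(product((0, 1), repeat=3))          # 000, 001, ..., 111
D = {
    1: dict(zip(CELLS, [0, 1, 1, 0, 0, 1, 1, 0])),
    2: dict(zip(CELLS, [0, 1, 1, 0, 0, 1, 1, 1])),
}

def holds(B, c):
    """True if every literal (i, v) in B takes value 1 under assignment c."""
    return all(c[i] == v for i, v in B)

def is_sufficient(B, omegas):
    """Definition 2.1: D_c(ω) = 1 whenever the conjunction of B is 1 under c."""
    return all(D[w][c] == 1 for w in omegas for c in CELLS if holds(B, c))

def is_minimal_sufficient(B, omegas):
    """Definition 2.4. Sufficiency is inherited by supersets, so it is
    enough to test the subsets obtained by dropping a single literal."""
    B = frozenset(B)
    return is_sufficient(B, omegas) and not any(
        is_sufficient(B - {lit}, omegas) for lit in B)

X1, X3, not_X2 = (0, 1), (2, 1), (1, 0)
print(is_minimal_sufficient({X1, X3}, [2]))         # individual 2 only
print(is_sufficient({X1, X3}, [1]))                 # individual 1 only
print(is_minimal_sufficient({not_X2, X3}, [1, 2]))  # both individuals
```

The first call corresponds to the claim in the text that {X₁, X₃} is a minimal sufficient cause for individual 2 relative to {X₁, X₂, X₃}.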

Definition 2.5 (Determinative Set of Sufficient Causes). A set of sufficient causes for D, $B = {B_{1}, \dots, B_{n}} \subseteq \dot{ℙ} (𝕃 (C))$ , is said to be determinative for D (relative to C) in sub-population Ω* if for all ω ∈ Ω* and for all c, D_c(ω) = 1 if and only if ${(⋁ ⋀ (B))}_{c} = 1$ .

We will refer to a determinative set of sufficient causes for D as a sufficient cause model. Observe that in any sub-population Ω* for which there exists a determinative set of sufficient causes, the vectors of potential outcomes for D are identical, so $D (C, ω) = D (C, ω^{'})$ for all ω,ω′ ∈ Ω*.

Definition 2.6 (Non-redundant Set of Sufficient Causes). A determinative set of sufficient causes $B$ , for D, is said to be non-redundant (in Ω*, relative to C) if there is no proper subset $B^{*} \subset B$ that is also determinative for D.

Note that sufficient causes are conjunctions, while sets of sufficient causes form disjunctions of conjunctions; minimality refers to the components in a particular conjunction, that each component is required for the conjunction to be sufficient for D; non-redundancy implies that each conjunction is required for the disjunction of the set of conjunctions to be determinative. If for some set of sufficient causes $B \subseteq \dot{ℙ} (𝕃 (C))$ , for all X ∈ C, and all $B \in B$ , either X ∈ B or $\bar{X} \in B$ then $B$ is a non-redundant set of sufficient causes.

Example 1 (Revisited). The set $B_{1} = {{X_{1}, X_{2}}, {X_{2}, {\bar{X}}_{3}}, {{\bar{X}}_{2}, X_{3}}}$ forms a determinative set of sufficient causes for the individual ω = 2, since:

D_{c} (ω = 2) = {((X_{1} \land X_{2}) \lor (X_{2} \land {\bar{X}}_{3}) \lor ({\bar{X}}_{2} \land X_{3}))}_{c}

(2.3)

as does $B_{2} = {{X_{1}, X_{3}}, {X_{2}, {\bar{X}}_{3}}, {{\bar{X}}_{2}, X_{3}}}$ :

D_{c} (ω = 2) = {((X_{1} \land X_{3}) \lor (X_{2} \land {\bar{X}}_{3}) \lor ({\bar{X}}_{2} \land X_{3}))}_{c} .

(2.4)

As this example shows, determinative sets of sufficient causes are not, in general, unique.
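The two representations (2.3) and (2.4) can be verified by enumerating all eight assignments. This is a sketch under the same assumed encoding of literals as pairs (i, v); the names `model_1` and `model_2` are hypothetical labels for the two sets of sufficient causes in the example.

```python
from itertools import product

CELLS = list(product((0, 1), repeat=3))
D = {
    1: dict(zip(CELLS, [0, 1, 1, 0, 0, 1, 1, 0])),   # Table 2, individual 1
    2: dict(zip(CELLS, [0, 1, 1, 0, 0, 1, 1, 1])),   # Table 2, individual 2
}

def disj_of_conj(collection, c):
    """Value under c of the disjunction of the conjunctions in `collection`."""
    return int(any(all(c[i] == v for i, v in B) for B in collection))

def is_determinative(collection, w):
    """Definition 2.5 for the single individual w."""
    return all(D[w][c] == disj_of_conj(collection, c) for c in CELLS)

X1, X2, X3 = (0, 1), (1, 1), (2, 1)
not_X2, not_X3 = (1, 0), (2, 0)

model_1 = [{X1, X2}, {X2, not_X3}, {not_X2, X3}]     # as in (2.3)
model_2 = [{X1, X3}, {X2, not_X3}, {not_X2, X3}]     # as in (2.4)

print(is_determinative(model_1, 2), is_determinative(model_2, 2))
print(is_determinative(model_1, 1))
```

Both sets reproduce the potential outcomes of individual 2, while neither does so for individual 1, whose outcome under (1, 1, 1) is 0.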

2.3. Sufficient cause representations for a population

As noted, if B is a sufficient cause for D in Ω*, then all the units in Ω* will have D = 1 for any assignment c to C, such that (Λ(B))_c = 1. In most realistic settings it is unlikely that any set B will be sufficient to ensure D = 1 in an entire population. Consequently different sets of sufficient causes will be required within different sub-populations. A sufficient cause representation is a set of sub-populations each with its own determinative sufficient cause representation:

Definition 2.7. A sufficient cause representation (A, $B$ ) for $D (C; Ω)$ is an ordered set A = 〈A₁, … A_p〉 of binary random variables, with (A_i)_c = A_i for all i, c, and a set $B = 〈 B_{1}, \dots, B_{p} 〉$ , with $B_{i} \in \dot{ℙ} (𝕃 (C))$ , such that for all ω, c, D_c(ω) = 1 ⇔ for some j, A_j(ω) = 1 and (∧(B_j))_c = 1.

Note that the binary random variables A_i and the sets B_i are naturally paired via the orderings of A and $B$ ; we will refer to a pair (A_i, B_i) as occurring in the representation. The requirement that (A_i)_c = A_i for all i, c implies that A ∩ C = ∅, and further that the A_i are unaffected by interventions on the X_i; this is in keeping with the interpretation of the A_i as defining pre-existing sub-populations with particular sets of potential outcomes for D.

Proposition 2.8. If (A, $B$ ) is a sufficient cause representation for $D (C; Ω)$ then B_i is a sufficient cause of D in the sub-population in which A_i(ω) = 1.

Proposition 2.9. If (A, $B$ ) is a sufficient cause representation for $D (C; Ω)$ , then for all A* ⊆ A, if

\emptyset \neq Ω_{A \ A^{*}}^{A^{*}} \equiv {ω ∣ for all A_{i} \in A, A_{i} (ω) = 1 iff A_{i} \in A^{*}},

then

B^{A^{*}} \equiv {B_{i} ∣ B_{i} \in B; A_{i} \in A^{*}}

forms a determinative set of sufficient causes (relative to C) for $Ω_{A \ A^{*}}^{A^{*}}$ .

Note that $Ω_{A \ A^{*}}^{A^{*}}$ consists of the sub-population in which A_i(ω) = 1 for all A_i ∈ A* and A_j(ω) = 0 for all A_j ∈ A\A*.

Proof: Suppose for some $ω \in Ω_{A \ A^{*}}^{A^{*}}$ , $B_{j} \in B^{A^{*}}$ , and c we have (Λ(B_j))_c = 1. Since $ω \in Ω_{A \ A^{*}}^{A^{*}}$ , A_j(ω) = 1. It then follows from the definition of a sufficient cause representation that D_c(ω) = 1. Conversely, suppose D_c(ω) = 1. As (A, $B$ ) is a sufficient cause representation, for some j, A_j(ω) = 1 and (Λ(B_j))_c = 1. Since, by hypothesis, $ω \in Ω_{A \ A^{*}}^{A^{*}}$ , it follows that A_j ∈ A*, hence $B_{j} \in B^{A^{*}}$ .

Theorem 2.10. For any $D (C; Ω)$ there exists a sufficient cause representation (A, $B$ ).

Proof: Let p = 2^|C|, and define $B \equiv {B ∣ B \in \dot{ℙ} (𝕃 (C)), ∣ B ∣ = ∣ C ∣} \equiv 〈 B_{1}, \dots, B_{p} 〉$ , ordered arbitrarily. Further define A_i(ω) ≡ D_{B_i=1}(ω). Given an arbitrary c, for some j, B^[c] = B_j, by construction of $B$ . We then have

D_{c} (ω) = 1 \Leftrightarrow D_{B_{j} = 1} (ω) = 1 \Leftrightarrow A_{j} (ω) = 1 and {(⋀ (B_{j}))}_{c} = 1,

as required. The last step follows since by definition B^[c] = 1 iff C = c.

[32] proves this for the case of |C| = 2; see also [7] and [29] for discussion of the case |C| = 1.

Example 1 (Revisited). The construction given in the proof of Theorem 2.10 would yield the following sets of sufficient causes to represent $D (C; Ω)$ shown in Table 2:

\begin{matrix} B & = 〈 B_{1}, \dots, B_{8} 〉 \\ & = 〈 {X_{1}, X_{2}, X_{3}}, {X_{1}, X_{2}, {\bar{X}}_{3}}, {X_{1}, {\bar{X}}_{2}, X_{3}}, {X_{1}, {\bar{X}}_{2}, {\bar{X}}_{3}}, \\ & \quad {{\bar{X}}_{1}, X_{2}, X_{3}}, {{\bar{X}}_{1}, X_{2}, {\bar{X}}_{3}}, {{\bar{X}}_{1}, {\bar{X}}_{2}, X_{3}}, {{\bar{X}}_{1}, {\bar{X}}_{2}, {\bar{X}}_{3}} 〉 \end{matrix}

(2.5)

with A₄ = A₅ = A₈ = 0, A₂ = A₃ = A₆ = A₇ = 1, $A_{1} = 𝕀 ({ω = 2})$ .
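The construction in the proof of Theorem 2.10 can be sketched for the population of Table 2 as follows. To avoid committing to a particular ordering, this illustration indexes each pair by the assignment c′ that defines the conjunction B^[c′] rather than by a position in the list; that indexing and the function name are assumptions of the sketch.

```python
from itertools import product

CELLS = list(product((0, 1), repeat=3))
D = {
    1: dict(zip(CELLS, [0, 1, 1, 0, 0, 1, 1, 0])),   # Table 2, individual 1
    2: dict(zip(CELLS, [0, 1, 1, 0, 0, 1, 1, 1])),   # Table 2, individual 2
}

# One full conjunction B^[c'] per assignment c', paired with the binary
# variable A_{c'}(ω) = D_{c'}(ω), exactly as in the proof.
A = {cp: {w: D[w][cp] for w in D} for cp in CELLS}

def conj_full(cp, c):
    """Value under c of the conjunction of the literals in B^[c']."""
    return int(all(c[i] == v for i, v in enumerate(cp)))

def represented(c, w):
    """1 iff some pair has A(ω) = 1 and its conjunction equal to 1 under c."""
    return int(any(A[cp][w] == 1 and conj_full(cp, c) == 1 for cp in CELLS))

# The pairs reproduce every potential outcome in the population.
assert all(D[w][c] == represented(c, w) for w in D for c in CELLS)

for cp in CELLS:
    print(cp, A[cp])
```

The printed values show which of the eight binary variables are identically 0, identically 1, or differ between the two individuals.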

3. Irreducible conjunctions

We saw in Example 1 above with ω = 2 that an individual’s potential outcomes may be such that there are two determinative sets of sufficient causes $B$ and $B^{'}$ and {X₁, X₂} is in $B$ , but not in $B^{'}$ . However, certain conjunctions are such that in every representation either the conjunction is present or it is contained in some larger conjunction; such conjunctions are said to be ‘irreducible’:

Definition 3.1. $B \in \dot{ℙ} (𝕃 (C))$ is said to be irreducible for $D (C, Ω)$ if in every representation (A, $B$ ) for $D (C, Ω)$ there exists $B_{i} \in B$ , with B ⊆ B_i.

[32] also refer to irreducibility of B for $D (C, Ω)$ as a ‘sufficient cause interaction’ between the components of B. (Note, however, that if B is irreducible, this does not in general imply that B is either a minimal sufficient cause, or even a sufficient cause, only that there is a sufficient cause that contains B.) It can be shown (via Theorem 3.2 below) that {X₂, ${\bar{X}}_{3}$ } and { ${\bar{X}}_{2}$ , X₃} are irreducible for $D (C; Ω)$ in Table 2. In §7 we provide an interpretation of irreducibility in terms of the existence of a mechanism involving the variables in B. We now characterize irreducibility:

Theorem 3.2. Let $C = C_{1} \dot{\cup} C_{2}$ , $B \in \dot{ℙ} (𝕃 (C_{1}))$ , |B| = |C₁|, then B is irreducible for $D (C, Ω)$ iff there exists ω* ∈ Ω and values $c_{2}^{*}$ for C₂ such that: (i) $D_{B = 1, C_{2} = c_{2}^{*}} (ω^{*}) = 1$ ; (ii) for all L ∈ B, $D_{B \ {L} = 1, L = 0, C_{2} = c_{2}^{*}} (ω^{*}) = 0$ .

Here $C_{1} \dot{\cup} C_{2}$ indicates the disjoint union of C₁ and C₂. Thus B is irreducible if and only if there exists an individual in Ω, who would have response D = 1 if every literal in B is set to 1, but D = 0 whenever one literal is set to 0 and the rest to 1 (in some context $C_{2} = c_{2}^{*}$ ). Note that conditions (i) and (ii) are equivalent to:

D_{B = 1, C_{2} = c_{2}^{*}} (ω^{*}) - \sum_{L \in B} D_{B \ {L} = 1, L = 0, C_{2} = c_{2}^{*}} (ω^{*}) > 0 .

(3.1)

Proof: (⇒) We adapt the proof of Theorem 2.10, to show that if for all ω ∈ Ω and assignments $c_{2}^{*}$ to C₂, at least one of (i) or (ii) does not hold, then there exists a representation (A, $B$ ) for $D (C, Ω)$ such that for all $B_{i} \in B$ , B ⊈ B_i.

Define:

\begin{matrix} B^{†} & \equiv 〈 B_{i}^{†} 〉 \equiv {B^{*} ∣ B^{*} \in \dot{ℙ} (𝕃 (C)), ∣ B^{*} ∣ = ∣ C ∣, B ⊈ B^{*}}, \\ B^{‡} & \equiv 〈 B_{i}^{‡} 〉 \equiv {B^{*} ∣ B^{*} \in \dot{ℙ} (𝕃 (C)), ∣ B^{*} ∣ = ∣ C ∣ - 1, B \ B^{*} = {L}, L \in B}, \end{matrix}

under arbitrary orderings. Thus $B^{†}$ is the set of subsets of exactly |C| literals that do not include B as a subset, while $B^{‡}$ contains those subsets of size |C| − 1 that contain all but one of the literals in B.

For $B_{i}^{†} \in B^{†}$ define the corresponding $A_{i}^{†} (ω) \equiv D_{B_{i}^{†} = 1} (ω)$ ;
For $B_{i}^{‡} \in B^{‡}$ define $A_{i}^{‡} (ω) \equiv D_{B_{i}^{‡} = 1, L_{i} = 0} (ω) D_{B_{i}^{‡} = 1, L_{i} = 1} (ω)$ , where ${L_{i}} \equiv B \ B_{i}^{‡}$ .

The representation is given by $(A, B) \equiv (A^{†} \cup A^{‡}, B^{†} \cup B^{‡})$ , where $A^{†} \equiv 〈 A_{i}^{†} 〉, A^{‡} \equiv 〈 A_{i}^{‡} 〉$ . To see this, first note that if for some ω and c, there is a pair (A_j, B_j) in (A, $B$ ) such that A_j(ω) = 1 and (Λ(B_j))_c = 1 then by construction of A^† and A^‡ it follows that D_c(ω) = 1. For the converse, suppose that for some c and ω, D_c(ω) = 1. There are two cases to consider:

(Λ(B))_c=0. In this case B ⊈ B^[c], so for some j, $B_{j}^{†} = B^{[c]}$ , hence $A_{j}^{†} (ω) \equiv D_{B_{j}^{†} = 1} (ω) = D_{c} (ω) = 1$ , as required.

(Λ(B))_c =1. Let c be partitioned as (c₁, c₂). Since (i) holds with $c_{2}^{*} = c_{2}$ , (ii) does not. Thus for some L ∈ B, D_{B\{L}=1,L=0,C₂=c₂}(ω) = 1. By construction of $B^{‡}$ , for some j, $B_{j}^{‡} = B^{[c]} \ {L}$ , so ${(⋀ (B_{j}^{‡}))}_{c} = 1$ . Since 1 = D_c(ω) = D_{B\{L}=1,L=0,C₂=c₂}(ω), we have $A_{j}^{‡} (ω) = 1$ , as required.

(⇐) Suppose for a contradiction, that for some ω* and $c_{2}^{*}$ , (i) and (ii) hold, but B is not irreducible. Then there exists a representation (A, $B$ ) such that for all $B_{i} \in B$ , B ⊈ B_i. By (i), $D_{B = 1, c_{2}^{*}} (ω^{*}) = 1$ . Thus for some pair (A_j, B_j), A_j(ω*) = 1, and $B_{j} \subseteq B \cup B^{[c_{2}^{*}]}$ . Since B ⊈ B_j there exists some L ∈ B \ B_j, but then since A_j(ω*) = 1 and ${(⋀ (B_{j}))}_{B \ {L} = 1, L = 0, C_{2} = c_{2}^{*}} = 1$ , we have $D_{B \ {L} = 1, L = 0, C_{2} = c_{2}^{*}} (ω^{*}) = 1$ , which is a contradiction.
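Conditions (i) and (ii) of Theorem 3.2 are finite checks and can be carried out by enumeration. The sketch below applies them to Table 2, taking C₁ to be the variables that appear in B; the literal encoding (pairs (i, v) with a 0-based index i) and the function name are assumptions of this illustration.

```python
from itertools import product

CELLS = list(product((0, 1), repeat=3))
D = {
    1: dict(zip(CELLS, [0, 1, 1, 0, 0, 1, 1, 0])),   # Table 2, individual 1
    2: dict(zip(CELLS, [0, 1, 1, 0, 0, 1, 1, 1])),   # Table 2, individual 2
}

def is_irreducible(B, omegas, n=3):
    """Search for ω* and c2* satisfying conditions (i) and (ii) of Theorem 3.2."""
    B = dict(B)                        # variable index -> value making the literal 1
    rest = [i for i in range(n) if i not in B]       # indices of C2
    for w in omegas:
        for c2 in product((0, 1), repeat=len(rest)):
            c = [None] * n
            for i, v in B.items():
                c[i] = v               # every literal in B set to 1
            for i, v in zip(rest, c2):
                c[i] = v               # context C2 = c2
            if D[w][tuple(c)] != 1:    # condition (i) fails
                continue
            flipped = []
            for i in B:                # set one literal to 0, the rest to 1
                d = list(c)
                d[i] = 1 - d[i]
                flipped.append(D[w][tuple(d)])
            if all(x == 0 for x in flipped):          # condition (ii)
                return True
    return False

X1, X2, X3 = (0, 1), (1, 1), (2, 1)
not_X2, not_X3 = (1, 0), (2, 0)
print(is_irreducible({X2, not_X3}, [1, 2]))
print(is_irreducible({not_X2, X3}, [1, 2]))
print(is_irreducible({X1, X2}, [2]))
```

The first two calls correspond to the conjunctions stated to be irreducible for the population of Table 2; the third shows a conjunction that appears in one representation of individual 2 but fails the conditions.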

Corollary 3.3. If B is irreducible for $D (C, Ω)$ then for any Ω* ⊃ Ω, B is irreducible for $D (C, Ω^{*})$ .

Proof: By Theorem 3.2, since if Ω satisfies (i) and (ii) then so does Ω*.

3.1. B irreducible for $D (C, Ω)$ with |B| = |C|

In the special case where |B| = |C|, the concepts of minimal sufficient cause for some ω* and irreducibility coincide.

Proposition 3.4. If $B \in \dot{ℙ} (𝕃 (C))$ and |B| = |C| then B is a minimal sufficient cause for some ω* ∈ Ω relative to C iff B is irreducible for $D (C, Ω)$ .

Proof: If |B| = |C| then condition (i) in Theorem 3.2 (taking C₂ = ∅) holds iff B is a sufficient cause for D for ω*, and similarly condition (ii) holds iff B is a minimal sufficient cause for D (for ω*).

Thus we have the following:

Corollary 3.5. If $B \in \dot{ℙ} (𝕃 (C))$ , |B| = |C| and B is a minimal sufficient cause for D for some ω* ∈ Ω then $B \in B$ for every representation (A, $B$ ) for $D (C, Ω)$ .

Proof: Immediate from Proposition 3.4.

3.2. B irreducible for $D (C, Ω)$ with |B| < |C|

When |B| < |C| the conditions for irreducibility and for being a minimal sufficient cause are logically distinct. Condition (i) in Theorem 3.2 requires $D_{B = 1, C_{2} = c_{2}^{*}} (ω^{*}) = 1$ for one assignment $c_{2}^{*}$ (and some ω*), while if B is a sufficient cause (for ω*) then this condition is required to hold for all assignments $c_{2}^{*}$ ; in contrast condition (ii) in Theorem 3.2 requires that there exists a single $c_{2}^{*}$ (and some ω*) such that for all $L \in B, D_{B \ {L} = 1, L = 0, C_{2} = c_{2}^{*}} (ω^{*}) = 0$ , while for B to be a minimal sufficient cause for ω* merely requires that for all L ∈ B, there exists an assignment $c_{2}^{L}$ such that $D_{B \ {L} = 1, L = 0, C_{2} = c_{2}^{L}} (ω^{*}) = 0$ .

Example 1 (Revisited). Let C = {X₁, X₂, X₃}, Ω = {2}. Relative to C, {X₁, X₂} is a minimal sufficient cause for ω = 2 since D₁₁₁(2) = D₁₁₀(2) = 1, and D₀₁₁(2) = D₁₀₀(2) = 0. However {X₁, X₂} is not irreducible for $D (C, Ω)$ because we have D₁₀₁(2) = D₀₁₀(2) = 1, hence condition (ii) in Theorem 3.2 is not satisfied for either X₃ = 0, or X₃ = 1. Conversely {X₁} is irreducible for Ω = {2} since D₁₁₁(2) = 1, while D₀₁₁(2) = 0, but {X₁} is not a sufficient cause because D₁₀₀(2) = 0.

Though irreducibility of B for $D (C, Ω)$ neither implies, nor is implied by B being a minimal sufficient cause for some ω ∈ Ω, it does imply that every sufficient cause representation for $D (C, Ω)$ contains at least one conjunction B_j of which B is a (possibly proper) subset. However, prima facie this still leaves open the possibility that, for example, every representation either includes B ∪ {L} or $B \cup {\bar{L}}$ , for some L, but no representation includes both. However, this cannot occur:

Corollary 3.6. Let $C = C_{1} \dot{\cup} C_{2}$ , $B \in \dot{ℙ} (𝕃 (C_{1}))$ . If B is irreducible for $D (C, Ω)$ then there exists a set $B^{*} \in \dot{ℙ} (𝕃 (C))$ , with |B*| = |C| such that in every representation (A, $B$ ) for $D (C, Ω)$ there exists $B_{j} \in B$ , with B ⊆ B_j ⊆ B*.

Thus irreducibility of B further implies that there is a set B* of size |C| such that in every representation there is at least one conjunct containing B that is itself contained in B*. However, it should be noted that, in general, there may be more than one conjunct B_j with B ⊆ B_j ⊆ B*.

Proof: Immediate from Theorem 3.2, taking $B^{*} = B \cup B^{[c_{2}^{*}]}$ .

Finally, we note that a conjunction that is both irreducible and a minimal sufficient cause corresponds to an ‘essential prime implicant’ in digital circuit theory [12]. The Quine-McCluskey algorithm [13, 18, 19] finds the set of essential prime implicants for a given Boolean function, which here corresponds to the potential outcomes $D (C, ω)$ for an individual.
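The correspondence with prime implicants can be illustrated for individual 2 of Table 2. The sketch below uses plain brute-force enumeration over all 27 candidate terms rather than the Quine-McCluskey algorithm itself; the encoding of a term as a tuple with entries 1, 0 or None (variable absent) is an assumption of the sketch.

```python
from itertools import product

CELLS = list(product((0, 1), repeat=3))
D2 = dict(zip(CELLS, [0, 1, 1, 0, 0, 1, 1, 1]))      # individual 2 in Table 2

def covers(term, c):
    """term[i] is 1 (X_i), 0 (complement of X_i) or None (X_i absent)."""
    return all(t is None or t == x for t, x in zip(term, c))

def is_sub(s, t):
    """True if the literals of s form a proper subset of the literals of t."""
    return s != t and all(a is None or a == b for a, b in zip(s, t))

terms = list(product((None, 0, 1), repeat=3))

# Implicants (sufficient causes): D = 1 on every assignment the term covers.
implicants = [t for t in terms if all(D2[c] == 1 for c in CELLS if covers(t, c))]

# Prime implicants (minimal sufficient causes): no proper sub-term is an implicant.
primes = [t for t in implicants if not any(is_sub(s, t) for s in implicants)]

# Essential prime implicants: cover some assignment with D = 1 that no
# other prime implicant covers.
minterms = [c for c in CELLS if D2[c] == 1]
essential = [p for p in primes
             if any(covers(p, m) and not any(covers(q, m) for q in primes if q != p)
                    for m in minterms)]

print(primes)
print(essential)
```

For this individual the enumeration yields four prime implicants, of which two are essential; these two are the conjunctions of X₂ with the complement of X₃ and of the complement of X₂ with X₃.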

3.3. Enlarging the set of potential causes

As noted in §2.2, a set B may be a minimal sufficient cause relative to C but not relative to a superset C′. Irreducibility is also not preserved without further conditions. To state these conditions we require the following:

Definition 3.7. X′ is said to be not causally influenced by a set C if for all ω ∈ Ω, the potential outcomes $X_{C = c}^{'} (ω)$ are constant as c varies.

We will also assume that if every X′ ∈ C′ is not causally influenced by C then the Relativized Consistency Axiom holds:

D_{C = c, C^{'} = C^{'} (ω)} (ω) = D_{C = c} (ω),

(3.2)

i.e. that if variables in C′ are not causally influenced by the variables in C then the counterfactual value of D intervening to set C to c is the same as the counterfactual value of D intervening to set C to c and the variables in C′ to the values they actually took on.

We now have the following Corollary to Theorem 3.2:

Corollary 3.8. Let $C = C_{1} \dot{\cup} C_{2}$ , $B \in \dot{ℙ} (𝕃 (C_{1}))$ , |B| = |C₁|. If B is irreducible for $D (C, Ω)$ , C′ ∩ C = ∅ and for all X′ ∈ C′, X′ is not causally influenced by C, then B is irreducible for $D (C \cup C^{'}, Ω)$ .

Proof: By Theorem 3.2 there exists ω* ∈ Ω and an assignment $c_{2}^{*}$ to C₂ such that (i) and (ii) hold. Let c′ = C′(ω*). Since variables in C′ are not causally influenced by C, for all assignments b:

D_{B = b, C_{2} = c_{2}^{*}, C^{'} = c^{'}} (ω^{*}) = D_{B = b, C_{2} = c_{2}^{*}, C^{'} = C^{'} (ω^{*})} (ω^{*}) = D_{B = b, C_{2} = c_{2}^{*}} (ω^{*}),

the second equality here follows from (3.2). It follows that ω* and ( $c_{2}^{*}$ , c′) obey (i) and (ii) in Theorem 3.2 with respect to C ∪ C′.

The assumption that every variable in C′ is not causally influenced by C, is required because otherwise we may have C′(ω*) ≠ (C′)_{B=b*}(ω*) for some assignment b* to B. For example, let C = {X₁, X₂, X₃}, and suppose that

\begin{matrix} D_{X_{1} = x_{1}, X_{2} = x_{2}, X_{3} = x_{3}} (ω) & = x_{3} \\ {(X_{3})}_{X_{1} = x_{1}, X_{2} = x_{2}} (ω) & = x_{1} \land x_{2} \end{matrix}

for all ω ∈ Ω. In this case {X₁, X₂} is irreducible for $D ({X_{1}, X_{2}}, Ω)$ , but not for $D ({X_{1}, X_{2}, X_{3}}, Ω)$ . We saw earlier that if B is a minimal sufficient cause for C then this does not imply that B is a minimal sufficient cause with respect to subsets of C. Here we see that if B is irreducible for $D (C, Ω)$ then this does not imply irreducibility for supersets C* ⊃ C, unless every variable in C* \ C is not causally influenced by a variable in C.
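This counterexample can be checked numerically. The sketch below assumes the usual composition of counterfactuals, so that the outcome under an intervention on X₁ and X₂ alone is the outcome with X₃ set to the value it would take under that intervention; the function name and the literal encoding (pairs (i, v) with a 0-based index i) are hypothetical.

```python
from itertools import product

def is_irreducible(B, D, n):
    """Conditions (i) and (ii) of Theorem 3.2 for one response type, where
    D maps each assignment tuple of length n to the potential outcome."""
    B = dict(B)                                   # index -> value making the literal 1
    rest = [i for i in range(n) if i not in B]
    for c2 in product((0, 1), repeat=len(rest)):
        c = [None] * n
        for i, v in B.items():
            c[i] = v
        for i, v in zip(rest, c2):
            c[i] = v
        if D[tuple(c)] != 1:                      # condition (i)
            continue
        ok = True
        for i in B:                               # condition (ii)
            d = list(c)
            d[i] = 1 - d[i]
            ok = ok and D[tuple(d)] == 0
        if ok:
            return True
    return False

# D depends on x3 only, and X3 responds to (x1, x2) as x1 AND x2.
D_three = {c: c[2] for c in product((0, 1), repeat=3)}
# Composition (assumed): D_{x1, x2} = D_{x1, x2, X3 under (x1, x2)}.
D_two = {(x1, x2): D_three[(x1, x2, x1 & x2)]
         for x1, x2 in product((0, 1), repeat=2)}

B = {(0, 1), (1, 1)}                              # the conjunction {X1, X2}
print(is_irreducible(B, D_two, 2))                # relative to {X1, X2}
print(is_irreducible(B, D_three, 3))              # relative to {X1, X2, X3}
```

Relative to {X₁, X₂} the outcome behaves as x₁ ∧ x₂, whereas once X₃ is included among the potential causes the outcome no longer responds to X₁ or X₂ when X₃ is held fixed.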

4. Tests for irreducibility

In this section we derive empirical conditions which imply that a given conjunction B is irreducible for $D (C, Ω)$ . Our first approach is via condition (3.1).

4.1. Adjusting for Measured Confounders

To detect that (3.1) holds requires us to identify the mean of potential outcomes in certain subpopulations. This is only possible if we have no unmeasured confounders [??]:

Definition 4.1. A set of covariates W suffice to adjust for confounding of (the effect of) C on D if:

D_{C = c} ⫫ C ∣ W = w

(4.1)

for all c, w.

Proposition 4.2. If a set W suffices to adjust for confounding of C on D and P (C = c, W = w) > 0 then

E [D_{C = c} ∣ W = w] = E [D ∣ C = c, W = w] .

The proof of this is standard and hence omitted.

Note that if W is sufficient to adjust for confounding of C on D, then W is also sufficient to adjust for confounding of B on D, where $B \in \dot{ℙ} (𝕃 (C))$ , |B|=|C|.

4.2. Tests for irreducibility without monotonicity

Theorem 4.3. Let $C = C_{1} \dot{\cup} C_{2}$ , $B \in \dot{ℙ} (𝕃 (C_{1}))$ , |B| = |C₁|. If W is sufficient to adjust for confounding of C on D, and for some c₂,w,

\begin{aligned} 0 < {} & E [D ∣ B = 1, C_{2} = c_{2}, W = w] \\ & - \sum_{L \in B} E [D ∣ B \setminus \{L\} = 1, L = 0, C_{2} = c_{2}, W = w] \end{aligned}

(4.2)

then B is irreducible for $D (C, Ω)$ .

Proof: We prove the contrapositive. Suppose that B is not irreducible for $D (C, Ω)$ . Then by Theorem 3.2, for all ω ∈ Ω, and all c₂,

D_{B = 1, C_{2} = c_{2}} (ω) - \sum_{L \in B} D_{B \ {L} = 1, L = 0, C_{2} = c_{2}} (ω) \leq 0 .

Hence for any w,

E [D_{B = 1, C_{2} = c_{2}} - \sum_{L \in B} D_{B \ {L} = 1, L = 0, C_{2} = c_{2}} ∣ W = w] \leq 0 .

Applying Proposition 4.2 to each term implies the negation of (4.2).

The condition provided in Theorem 4.3 can be empirically tested with t-test type statistics if W consists of a small number of categorical or binary variables or using regression or inverse probability of treatment weighting methods [20, 34] otherwise.
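As a minimal sketch of the point estimate behind such a test, the code below computes stratum-specific means of D and the contrast on the right-hand side of (4.2). It ignores sampling variability entirely (no standard errors or test statistic), and the record layout `(b, c2, w, d)` and the function names are assumptions made for illustration; literals in B are coded so that 1 means the literal is true.

```python
from collections import defaultdict

def stratum_means(records):
    """Mean of D within each observed (b, c2, w) cell; records are (b, c2, w, d)."""
    totals = defaultdict(lambda: [0.0, 0])
    for b, c2, w, d in records:
        cell = totals[(b, c2, w)]
        cell[0] += d
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

def contrast_4_2(means, n_b, c2, w):
    """E[D | B=1, c2, w] minus the sum over L of E[D | B minus {L} = 1, L=0, c2, w]."""
    ones = (1,) * n_b
    value = means[(ones, c2, w)]
    for i in range(n_b):
        flipped = ones[:i] + (0,) + ones[i + 1:]
        value -= means[(flipped, c2, w)]
    return value

# Toy data with two literals, no C2 variables, and a single stratum w = 0.
records = (
    [((1, 1), (), 0, 1)] * 3 + [((1, 1), (), 0, 0)]
    + [((1, 0), (), 0, 0)] * 3 + [((1, 0), (), 0, 1)]
    + [((0, 1), (), 0, 0)] * 2
)
estimate = contrast_4_2(stratum_means(records), n_b=2, c2=(), w=0)
```

A positive estimate is only suggestive; a formal test would compare it with its standard error, as the text indicates.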

It follows from Corollary 3.8 that condition (4.2) further establishes that B is irreducible for $D (C \cup C^{'}, Ω)$ so long as every variable in C′ is not causally influenced by variables in C.

It may be shown that condition (4.2) is the sole restriction on the law of (D, C, W) implied by the negation of irreducibility.

4.3. Graphs

In the next section we develop more powerful tests under monotonicity assumptions. However, to state these conditions we first introduce some concepts from graph theory:

Definition 4.4. A graph $G$ defined on a set B is a collection of pairs of elements in B, $G$ ≡ {E | E = {B₁, B₂} ⊆ B, B₁ ≠ B₂}.

This is the usual definition of a graph, except that the vertex set here is a set of literals. We will refer to sets in $G$ as edges, which we will represent pictorially as B₁ — B₂.

Definition 4.5. Two elements L, L* ∈ B are said to be connected in $G$ if there exists a sequence L = L₁, …, L_p = L* of distinct elements in B such that $\{L_{i}, L_{i + 1}\} \in G$ for i = 1, …, p – 1.

The sequence of edges joining L and L* is said to form a path in $G$ .

Definition 4.6. A graph $G$ on B is said to form a tree if $∣ G ∣ = ∣ B ∣ - 1$ , and any pair of distinct elements in B are connected in $G$ .

In a tree there is a unique path between any two elements.

Proposition 4.7. Let $T$ be a tree on B. For each element R ∈ B there is a natural bijection:

ϕ_{R}^{T} : B \ {R} \leftrightarrow T

given by $ϕ_{R}^{T} (L) = E = {L^{'}, L}$ where $E \in T$ is the last edge on a path from R to L.

It is not hard to show that for a graph $G$ , if the bijections described in Proposition 4.7 exist then $G$ is a tree.
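The bijection of Proposition 4.7 can be sketched as a traversal outward from the root R. The representation below (edges as two-element `frozenset`s, vertices as strings) and the function name `phi` are illustrative assumptions rather than notation from the paper.

```python
def phi(tree, root):
    """Map each non-root vertex L to the last edge on the path from root to L."""
    adjacency = {}
    for edge in tree:
        u, v = tuple(edge)
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    mapping, frontier, seen = {}, [root], {root}
    while frontier:
        parent = frontier.pop()
        for child in adjacency.get(parent, ()):
            if child not in seen:
                seen.add(child)
                # The edge by which we first reach `child` ends the path from root.
                mapping[child] = frozenset((parent, child))
                frontier.append(child)
    return mapping

# The tree X2 - X1 - X3, rooted at X2.
tree = {frozenset(("X1", "X2")), frozenset(("X1", "X3"))}
phi_from_X2 = phi(tree, "X2")
```

Each non-root vertex receives a distinct edge and every edge is used once, which is the bijection between B \ {R} and the tree.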

Theorem 4.8 (Cayley [4]). On a set B there are $∣ B ∣^{∣ B ∣ - 2}$ different trees.
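For small vertex sets, the count in Theorem 4.8 can be confirmed by brute force using Definition 4.6 directly: keep every edge set of size |B| − 1 in which all vertices are connected. The helper names are hypothetical, and returning a single empty tree for fewer than two vertices is a convention assumed here.

```python
from itertools import combinations

def is_connected(vertices, edges):
    """True if every vertex can be reached from the first one along `edges`."""
    vertices = list(vertices)
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        v = stack.pop()
        for e in edges:
            if v in e:
                (w,) = set(e) - {v}
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return len(seen) == len(vertices)

def trees_on(vertices):
    """All trees on `vertices`, each returned as a frozenset of two-element edges."""
    vertices = list(vertices)
    if len(vertices) <= 1:
        return [frozenset()]  # assumed convention: one empty tree
    all_edges = [frozenset(pair) for pair in combinations(vertices, 2)]
    return [
        frozenset(edges)
        for edges in combinations(all_edges, len(vertices) - 1)
        if is_connected(vertices, edges)
    ]

counts = [len(trees_on(range(n))) for n in (2, 3, 4, 5)]
```

This enumeration grows quickly and is only practical for a handful of vertices.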

4.4. Monotonicity

Sometimes it may be known that a certain cause has an effect on an outcome that is either always positive or always negative.

Definition 4.9. B_i has a positive monotonic effect on D relative to a set B (with B_i ∈ B) in a population Ω if for all ω ∈ Ω and all values b_{−i} for the variables in B \ {B_i}, $D_{B \setminus \{B_{i}\} = b_{- i}, B_{i} = 1} (ω) \geq D_{B \setminus \{B_{i}\} = b_{- i}, B_{i} = 0} (ω)$ .

Similarly we say that L has a negative monotonic effect relative to B ∪ {L} if $\bar{L}$ has a positive monotonic effect relative to $B \cup \{\bar{L}\}$ . Note that the case in which $D_{B \setminus \{B_{i}\} = b_{- i}, B_{i} = 1} (ω) = D_{B \setminus \{B_{i}\} = b_{- i}, B_{i} = 0} (ω)$ for all ω, and hence B_i has no effect on D relative to B, is included as a degenerate case.

The definition of a positive monotonic effect requires that an intervention does not decrease D for every individual, not simply on average, regardless of the other interventions taken. This is thus a strong assumption; see [33] for further discussion.

Proposition 4.10. The number of sets of potential outcomes that are compatible with monotonicity is given by the Sperner number S(n).
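The proposition does not define n. As a hedged illustration, suppose n is the number of binary causes and that every cause has a positive monotonic effect in the sense of Definition 4.9; the compatible patterns for one individual are then the monotone 0/1-valued functions on {0,1}ⁿ, which can be counted by brute force. The function name is hypothetical, and no claim is made here about how these counts relate to the S(n) notation.

```python
from itertools import product

def count_monotone_response_types(n):
    """Count maps D: {0,1}^n -> {0,1} that never decrease when one input goes 0 -> 1."""
    points = list(product((0, 1), repeat=n))
    count = 0
    for values in product((0, 1), repeat=len(points)):
        D = dict(zip(points, values))
        if all(
            D[p[:i] + (1,) + p[i + 1:]] >= D[p[:i] + (0,) + p[i + 1:]]
            for p in points
            for i in range(n)
        ):
            count += 1
    return count

monotone_counts = [count_monotone_response_types(n) for n in (1, 2, 3)]
```

For n = 1, 2, 3 the counts are 3, 6 and 20, against 4, 16 and 256 unrestricted patterns, which shows how strongly monotonicity restricts the potential outcomes.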


4.5. Tests for irreducibility with monotonicity

Knowledge of the monotonicity of certain potential causes allows for the construction of more powerful statistical tests for irreducibility than those given by Theorem 4.3.

Theorem 4.11. Let $C = C_{1} \dot{\cup} C_{2}$ , $B = (B_{+} \dot{\cup} B^{'}) \in \dot{ℙ} (𝕃 (C_{1}))$ , |B| = |C₁| and suppose that every L ∈ B₊ has a positive monotonic effect on D relative to C. If for some tree $T$ on B₊, ω* ∈ Ω and some c₂ we have:

\begin{aligned} 0 < {} & D_{B = 1, C_{2} = c_{2}} (ω^{*}) - \sum_{L \in B} D_{B \setminus \{L\} = 1, L = 0, C_{2} = c_{2}} (ω^{*}) \\ & + \sum_{E \in T} D_{B \setminus E = 1, E = 0, C_{2} = c_{2}} (ω^{*}), \end{aligned}

(4.3)

then B is irreducible for $D (C, Ω)$ .

If we know that X has a negative monotonic effect on D, then we may use this Theorem to construct more powerful tests of the irreducibility of sets containing $\bar{X}$ with respect to $D (C, Ω)$ . Under the assumption that every L ∈ C has a monotonic effect on D we have shown via direct calculation using cddlib [?] that for |C| ≤ 4, the conditions (4.3) are the sole restrictions on the law of (D, C, W) implied by the negation of irreducibility. We conjecture that this holds in general.

Proof: By Theorem 3.2 it is sufficient to prove that under the monotonicity assumption on B₊, (4.3) implies (3.1). Suppose that (3.1) does not hold, so that for all values c₂, and all ω* ∈ Ω,

D_{B = 1, C_{2} = c_{2}} (ω^{*}) - \sum_{L \in B} D_{B \ {L} = 1, L = 0, C_{2} = c_{2}} (ω^{*}) \leq 0 .

Then for each ω* ∈ Ω there exists R ∈ B₊ such that

D_{B = 1, C_{2} = c_{2}} (ω^{*}) - \sum_{L \in B^{'} \cup {R}} D_{B \ {L} = 1, L = 0, C_{2} = c_{2}} (ω^{*}) \leq 0 .

For a given tree $T$ , the remaining terms on the RHS of (4.3) are:

\begin{aligned} & - \sum_{L \in B_{+} \setminus \{R\}} D_{B \setminus \{L\} = 1, L = 0, C_{2} = c_{2}} (ω^{*}) + \sum_{E \in T} D_{B \setminus E = 1, E = 0, C_{2} = c_{2}} (ω^{*}) \\ & \quad = \sum_{L \in B_{+} \setminus \{R\}} \left( D_{B \setminus ϕ_{R}^{T} (L) = 1, ϕ_{R}^{T} (L) = 0, C_{2} = c_{2}} (ω^{*}) - D_{B \setminus \{L\} = 1, L = 0, C_{2} = c_{2}} (ω^{*}) \right), \end{aligned}

by Proposition 4.7. Finally since $ϕ_{R}^{T} (L) = \{L, L^{'}\} \subseteq B_{+}$ , L′ has a positive monotonic effect on D relative to C, thus no term in the sum is positive. Consequently for all ω* ∈ Ω, (4.3) does not hold for any tree $T$ .

Example 2. In the case B = {X₁, X₂} = B₊ = C, there is only one tree on B₊, consisting of the single edge X₁ — X₂. Thus if X₁ and X₂ have a positive monotonic effect on D (relative to C) then Theorem 4.11 implies that if the following inequality holds for some ω ∈ Ω,

D_{11} (ω) - (D_{10} (ω) + D_{01} (ω)) + D_{00} (ω) > 0,

then {X₁, X₂} is irreducible for $D (C, Ω)$ .

Example 3. If B = {X₁, X₂, X₃} = B₊ = C, then there are three trees on B₊, see Figure 1. These correspond to the following conditions:

D₁₁₁(ω) − (D₁₁₀(ω) + D₁₀₁(ω) + D₀₁₁(ω)) + (D₀₁₀(ω) + D₀₀₁(ω)) > 0;
D₁₁₁(ω) − (D₁₁₀(ω) + D₁₀₁(ω) + D₀₁₁(ω)) + (D₁₀₀(ω) + D₀₀₁(ω)) > 0;
D₁₁₁(ω) − (D₁₁₀(ω) + D₁₀₁(ω) + D₀₁₁(ω)) + (D₁₀₀(ω) + D₀₁₀(ω)) > 0.

Thus B is irreducible for $D (C, Ω)$ if at least one holds for some ω ∈ Ω.

Fig 1 — The three trees on {X₁, X₂, X₃}.

Corollary 4.12. Let $C = C_{1} \dot{\cup} C_{2}$ , $B = (B_{+} \dot{\cup} B^{'}) \in \dot{ℙ} (𝕃 (C_{1}))$ , |B| = |C₁|. Suppose that every L ∈ B₊ has a positive monotonic effect on D relative to C, and W is sufficient to adjust for confounding of C on D. If for some tree $T$ on B₊, and some c₂, w we have:

\begin{aligned} 0 < {} & E [D ∣ B = 1, C_{2} = c_{2}, W = w] \\ & - \sum_{L \in B} E [D ∣ B \setminus \{L\} = 1, L = 0, C_{2} = c_{2}, W = w] \\ & + \sum_{E \in T} E [D ∣ B \setminus E = 1, E = 0, C_{2} = c_{2}, W = w], \end{aligned}

(4.4)

then B is irreducible for $D (C, Ω)$ .

Proof: Directly analogous to the proof of Theorem 4.3.

The special case of the previous Corollary where |B₊| = |C| = 2, and W = ∅, appears in Rothman and Greenland [22]; see also Koopman [10].

Theorem 4.8 implies that if every literal in B has a positive monotonic effect on D then we will have $∣ B ∣^{∣ B ∣ - 2}$ conditions to test, each of which is sufficient to establish the irreducibility of B for $D (C, Ω)$ . As before, the conditions (4.4) may be tested via t-test type statistics or using various statistical models.

As with the results in §4.2, we may apply Corollary 3.8 to establish that B is irreducible for $D (C \cup C^{'}, Ω)$ if every variable in C′ is not causally influenced by variables in C.
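To illustrate why the tree conditions can detect irreducibility where (4.2) cannot, the sketch below evaluates the contrast in (4.4) for a hypothetical population with B = B₊ = C = {X₁, X₂, X₃}, W = ∅ and no confounding, so that cell means equal mean potential outcomes. Half of the individuals have D = x₁ ∧ x₂ ∧ x₃ and half have D = x₃; both response types are monotone. The population, the index-based encoding of trees and the function name are assumptions for illustration.

```python
from itertools import product

def tree_contrast(mean, n, tree):
    """Contrast in (4.4) when B = C has n positive literals; tree = [] gives (4.2)."""
    def zero_out(indices):
        return tuple(0 if i in indices else 1 for i in range(n))
    value = mean[(1,) * n]
    value -= sum(mean[zero_out({i})] for i in range(n))
    value += sum(mean[zero_out(set(edge))] for edge in tree)
    return value

# Hypothetical population: two equally likely monotone response types.
type_a = lambda c: int(all(c))   # D = x1 AND x2 AND x3
type_b = lambda c: c[2]          # D = x3
mean = {c: 0.5 * type_a(c) + 0.5 * type_b(c) for c in product((0, 1), repeat=3)}

# The three trees on {X1, X2, X3}, written with 0-based indices.
trees = [((0, 1), (0, 2)), ((0, 1), (1, 2)), ((0, 2), (1, 2))]
without_tree = tree_contrast(mean, 3, [])
with_tree = {t: tree_contrast(mean, 3, t) for t in trees}
```

In this example the contrast without a tree is exactly 0, so (4.2) is not satisfied, while two of the three trees give a contrast of 0.5 and therefore establish irreducibility under the monotonicity assumption.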

4.6. Tests for a minimal sufficient cause under monotonicity

As noted in §3.1, if |B| = |C| then irreducible conjunctions are also minimal sufficient causes. Thus in this special case the tests of irreducibility given in Theorem 4.3 and Corollary 4.12 also establish that B is a minimal sufficient cause relative to C. When |B| < |C| these tests do not in general establish this. However, under positive monotonicity assumptions on C₂, such tests may be obtained by taking c₂ = 0:

Proposition 4.13. Let $C = C_{1} \dot{\cup} C_{2}$ , $B \in \dot{ℙ} (𝕃 (C_{1}))$ , |B| = |C₁|. Suppose every L ∈ C₂ has a positive monotonic effect on D relative to C. If (i) $D_{B = 1, C_{2} = 0} (ω^{*}) = 1$ and (ii) for all L ∈ B, $D_{B \setminus \{L\} = 1, L = 0, C_{2} = 0} (ω^{*}) = 0$ , then B is a minimal sufficient cause for D relative to C for ω*.

Proof: For any c₂, $D_{B = 1, C_{2} = c_{2}} (ω^{*}) \geq D_{B = 1, C_{2} = 0} (ω^{*})$ by the monotonicity assumption. Hence, by (i), B is a sufficient cause for D relative to C for ω*. Minimality follows directly from condition (ii).

We have the following corollaries which provide conditions under which B is a minimal sufficient cause for D relative to C for some ω ∈ Ω, in addition to being irreducible for $D (C, Ω)$ :

Corollary 4.14. Let $C = C_{1} \dot{\cup} C_{2}$ , $B \in \dot{ℙ} (𝕃 (C_{1}))$ , |B| = |C₁|. Suppose every L ∈ C₂ has a positive monotonic effect on D relative to C and W is sufficient to adjust for confounding of C on D. If (4.2) holds with c₂ = 0 for some w then B is a minimal sufficient cause of D relative to C for some ω ∈ Ω.

Proof: Immediate from Proposition 4.13 and Theorem 4.3.

Corollary 4.15. Let $C = C_{1} \dot{\cup} C_{2}$ , $B = (B_{+} \dot{\cup} B^{'}) \in \dot{ℙ} (𝕃 (C_{1}))$ , |B| = |C₁|. Suppose that every L ∈ B₊ ∪ C₂ has a positive monotonic effect on D relative to C, and W is sufficient to adjust for confounding of C on D.

If (4.4) holds with c₂ = 0 for some w and some tree $T$ on B₊ then B is a minimal sufficient cause of D relative to C for some ω ∈ Ω.

Proof: Immediate from Proposition 4.13 and Corollary 4.12.

5. Singular interactions

In the genetics literature, in the context of two binary genetic factors, X₁ and X₂, ‘biologic’ or ‘compositional’ epistasis [2, 5, 17] is said to be present if for some ω*, D₁₁(ω*) = 1 but D₁₀(ω*) = D₀₁(ω*) = D₀₀(ω*) = 0; in this case the effect of one genetic factor is effectively masked when the other genetic factor is absent. If {X₁, X₂} is a minimal sufficient cause of D relative to {X₁, X₂} for ω* then although this implies D₁₁(ω*) = 1 and D₁₀(ω*) = D₀₁(ω*) = 0, it does not imply D₀₀(ω*) = 0. This motivates the following:

Definition 5.1. A minimal sufficient cause B for D relative to C for ω* is said to be singular if there is no $B^{'} \in \dot{ℙ} (𝕃 (C))$ , B′ ≠ B, forming a minimal sufficient cause for D relative to C for ω*. B is singular for $D (C, Ω)$ if B is singular relative to C for some ω* ∈ Ω.

If B is singular for $D (C, Ω)$ then we will also refer to a singular interaction between the components of B. We now characterize singularity in terms of potential outcomes:

Theorem 5.2. Let $C = C_{1} \dot{\cup} C_{2}$ , $B \in \dot{ℙ} (𝕃 (C_{1}))$ , |B| = |C₁|. Then B is singular for $D (C, Ω)$ iff there exists ω* ∈ Ω such that

\text{for all values } c_{2}^{*}, b : \quad D_{B = b, C_{2} = c_{2}^{*}} (ω^{*}) = 1 \Leftrightarrow b = 1 .

(5.1)

Note that (5.1) is equivalent to:

D_{C = c} (ω^{*}) = {(⋀ (B))}_{c} for all c .

(5.2)

Thus if B is singular for $D (C, Ω)$ then there is some individual ω* whose potential outcomes are given by the single conjunction B.
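Condition (5.1) is straightforward to check for a single individual once the potential outcomes are listed. The sketch below is illustrative only: assignments are encoded as tuples with the literals of B first and the C₂ variables last, and the function name is hypothetical.

```python
from itertools import product

def is_singular_for_individual(D, n_b, n_c2):
    """Condition (5.1): for every c2, D(b, c2) = 1 exactly when b is all ones."""
    return all(
        D(b + c2) == int(all(b))
        for b in product((0, 1), repeat=n_b)
        for c2 in product((0, 1), repeat=n_c2)
    )

# D = x1 AND x2: the outcome is given by the single conjunction.
d_and = lambda c: c[0] & c[1]
# D = 1 iff x1 == x2: {X1, X2} is a minimal sufficient cause, but D_00 = 1.
d_equal = lambda c: int(c[0] == c[1])
# D = x1 AND x2 AND x3, examined for B = {X1, X2} with C2 = {X3}.
d_three = lambda c: c[0] & c[1] & c[2]

singular_and = is_singular_for_individual(d_and, n_b=2, n_c2=0)
singular_equal = is_singular_for_individual(d_equal, n_b=2, n_c2=0)
singular_three = is_singular_for_individual(d_three, n_b=2, n_c2=1)
```

The second case is the situation described at the start of this section: D₁₁ = 1 and D₁₀ = D₀₁ = 0 hold, yet D₀₀ = 1 rules out singularity. The third fails because B = 1 does not produce D = 1 when X₃ = 0.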

Proof: By definition, B is a sufficient cause for D for ω* iff $(b = 1 \Rightarrow D_{B = b, C_{2} = c_{2}^{*}} (ω^{*}) = 1)$ . Thus it is sufficient to show that, assuming B is a minimal sufficient cause for D for ω*, there are no other minimal sufficient causes of D for ω* iff $(D_{B = b, C_{2} = c_{2}^{*}} (ω^{*}) = 1 \Rightarrow b = 1)$ .

Suppose B is the only minimal sufficient cause for D for ω*, but that for some b* ≠ 1 and $c_{2}^{*}$ , $D_{B = b^{*}, C_{2} = c_{2}^{*}} (ω^{*}) = 1$ . Let $B^{†} \equiv B^{[B = b^{*}, C_{2} = c_{2}^{*}]}$ . Then B^† forms a sufficient cause for D for ω*, and B ⊈ B^†. Hence there is some B′ ⊆ B^† that is a minimal sufficient cause for D for ω*, and B′ ≠ B, a contradiction.

Conversely suppose $(D_{B = b, C_{2} = c_{2}^{*}} (ω^{*}) = 1 \Rightarrow b = 1)$ but there exists another minimal sufficient cause B′ for D for ω*, with B′ ≠ B. Since B′ is minimal, B ⊈ B′. Thus there exists a $\tilde{c}$ such that ${(B)}_{\tilde{c}} \neq 1$ , but ${(B^{'})}_{\tilde{c}} = 1$ and hence $D_{C = \tilde{c}} (ω^{*}) = 1$ , a contradiction.

Corollary 5.3. For $D (C, Ω)$ , if B is singular then B is irreducible.

Proof: Immediate from (5.1) and the definition of irreducibility.

Theorem 5.4 relates singular interactions to properties of the set of sufficient cause representations for $D (C, Ω)$ .

Theorem 5.4. Let $B \in \dot{ℙ} (𝕃 (C))$ . B is singular for $D (C, Ω)$ iff there exists ω* ∈ Ω such that in every representation (A, $B$ ) for $D (C, Ω)$ , (i) for all $B^{*} \in \dot{ℙ} (𝕃 (C))$ , with |B*| = |C| and B ⊆ B* there exists $B_{i} \in B$ with B_i ⊆ B* and A_i(ω*) = 1; (ii) for all $B_{i} \in B$ such that B ⊈ B_i, A_i(ω*) = 0.

Proof: Let $C = C_{1} \dot{\cup} C_{2}$ , where $B \in \dot{ℙ} (𝕃 (C_{1}))$ , and |B| = |C₁|.

(⇒) Suppose B is singular for $D (C, Ω)$ , so that some ω* ∈ Ω satisfies (5.2). Then for any representation (A, $B$ ) for $D (C, Ω)$ and any B* such that |B*| = |C| and B ⊆ B* we can select values $c_{2}^{*}$ so that $B^{*} = B^{[B = 1, C_{2} = c_{2}^{*}]}$ . Since $D_{B = 1, C_{2} = c_{2}^{*}} (ω^{*}) = 1$ there exists A_i ∈ A, $B_{i} \in B$ with A_i (ω*) = 1 and ${(⋀ (B_{i}))}_{B = 1, C_{2} = c_{2}^{*}} = 1$ . Thus B_i ⊆ B*, so (i) holds. For all $B_{i} \in B$ such that B ⊈ B_i, we can choose $\tilde{B} \in \dot{ℙ} (𝕃 (C_{1}))$ , $∣ \tilde{B} ∣ = ∣ C_{1} ∣$ with $\tilde{B} \neq B$ and values ${\tilde{c}}_{2}$ so that $B_{i} \subseteq B^{[\tilde{B} = 1, C_{2} = {\tilde{c}}_{2}]}$ . Since $D_{\tilde{B} = 1, C_{2} = {\tilde{c}}_{2}} (ω^{*}) = 0$ we have A_i(ω*) = 0 since ${(⋀ (B_{i}))}_{\tilde{B} = 1, C_{2} = {\tilde{c}}_{2}} = 1$ , so (ii) holds as required.

(⇐) Suppose there exists ω* ∈ Ω such that every representation (A, $B$ ) satisfies (i) and (ii). We will show that (5.1) holds. For any values $c_{2}^{*}$ let $B^{*} \equiv B^{[B = 1, C_{2} = c_{2}^{*}]}$ , so |B*| = |C| and B ⊆ B*. Thus by (i) there exists $B_{i} \in B$ with B_i ⊆ B* and A_i(ω*) = 1. Hence $D_{B = 1, C_{2} = c_{2}^{*}} (ω^{*}) = 1$ since A_i(ω*) = 1 and ${(⋀ (B_{i}))}_{B = 1, C_{2} = c_{2}^{*}} = 1$ . Conversely for any b′ ≠ 1, let $B^{'} \equiv B^{[B = b^{'}]}$ , so |B′| = |C₁| with B′ ≠ B. Thus for all $B_{i} \in B$ such that ${(⋀ (B_{i}))}_{B^{'} = 1, C_{2} = c_{2}^{*}} = 1$ , B ⊈ B_i and thus by (ii) A_i (ω*) = 0. Hence $D_{B^{'} = 1, C_{2} = c_{2}^{*}} (ω^{*}) = 0$ .

We now consider results relevant for testing for singular interactions with or without monotonicity assumptions.

Theorem 5.5. Let $B = B_{+} \dot{\cup} B^{'} \in \dot{ℙ} (𝕃 (C))$ , |B| = |C| and suppose that every L ∈ B₊ has a positive monotonic effect on D relative to C. If for some tree $T$ on B₊ and some ω* ∈ Ω we have

\begin{aligned} & D_{B = 1} (ω^{*}) - \sum_{L \in B_{+}} D_{B \setminus \{L\} = 1, L = 0} (ω^{*}) \\ & \quad - \sum_{\tilde{B} : \emptyset \neq \tilde{B} \subseteq B^{'}} D_{B \setminus \tilde{B} = 1, \tilde{B} = 0} (ω^{*}) + \sum_{E \in T} D_{B \setminus E = 1, E = 0} (ω^{*}) > 0 \end{aligned}

(5.3)

then B is singular for $D (C, Ω)$ .

Proof: By Theorem 5.4 B is singular for $D (C, Ω)$ iff

\text{for some } ω^{*} \in Ω, \quad D_{B = 1} (ω^{*}) - \sum_{\tilde{B} : \emptyset \neq \tilde{B} \subseteq B} D_{B \setminus \tilde{B} = 1, \tilde{B} = 0} (ω^{*}) > 0 .

(5.4)

Suppose for a contradiction that (5.4) does not hold but (5.3) holds for some ω* ∈ Ω. Since B₊ has a positive monotonic effect on D relative to C, if $\tilde{B} \subseteq B$ is such that $\tilde{B} \cap B_{+} \neq \emptyset$ then $D_{B \ \tilde{B} = 1, \tilde{B} = 0} (ω^{*}) = 1$ implies $D_{B \ {L} = 1, L = 0} (ω^{*}) = 1$ for some $L \in \tilde{B}$ . Hence for all ω ∈ Ω

D_{B = 1} (ω) - \sum_{L \in B} D_{B \ {L} = 1, L = 0} (ω) - \sum_{\tilde{B} \subseteq B^{'}, ∣ \tilde{B} ∣ \geq 2} D_{B \ \tilde{B} = 1, \tilde{B} = 0} (ω) \leq 0 .

(5.5)

By applying the same argument to the first two terms on the LHS of (5.5) as was applied in the proof of Theorem 4.11 we have that (5.3) does not hold for any ω ∈ Ω, which is a contradiction.

The following corollary to Theorem 5.5 generalizes the discussion in [27,28] to an arbitrary number of dichotomous factors:

Corollary 5.6. Let $B = B_{+} \dot{\cup} B^{'} \in \dot{ℙ} (𝕃 (C))$ , |B| = |C|. Suppose that every L ∈ B₊ has a positive monotonic effect on D relative to B, and W is sufficient to adjust for confounding of C on D. If for some tree $T$ on B₊, and some w we have:

\begin{aligned} 0 < {} & E [D ∣ B = 1, W = w] - \sum_{L \in B_{+}} E [D ∣ B \setminus \{L\} = 1, L = 0, W = w] \\ & - \sum_{\tilde{B} : \emptyset \neq \tilde{B} \subseteq B^{'}} E [D ∣ B \setminus \tilde{B} = 1, \tilde{B} = 0, W = w] \\ & + \sum_{E \in T} E [D ∣ B \setminus E = 1, E = 0, W = w] \end{aligned}

(5.6)

then B is singular for $D (C, Ω)$ .

Proof: By applying Proposition 4.2 to each term in (5.3).

Condition (5.6) leads directly to a statistical test of compositional epistasis. This is notable since some claims in the genetics literature appear to suggest that such tests did not exist [5].

As stated in the next corollary, from Theorem 5.5 if all or all but one of the elements of B have positive monotonic effects on D then singularity and irreducibility coincide:

Corollary 5.7. Suppose |B| = |C| and that for all or all but one of B_i ∈ B, B_i has a positive monotonic effect on D relative to B. Then B is singular for $D (C, Ω)$ iff B is irreducible for $D (C, Ω)$ .

An important consequence of this corollary is that when there is at most one variable that does not have a positive monotonic effect condition (4.4) establishes that B is singular in addition to being irreducible for $D (C, Ω)$ .

Proof: Let B′ denote the one or zero elements of B that do not have a monotonic effect on D relative to C. If B is irreducible for $D (C, Ω)$ then by Theorem 4.11,

D_{B = 1} (ω^{*}) - \sum_{L \in B} D_{B \ {L} = 1, L = 0} (ω^{*}) + \sum_{E \in T} D_{B \ E = 1, E = 0} (ω^{*}) > 0 .

Since the third term on the LHS of (5.3) vanishes when |B′| ≤ 1, it follows that B is singular for $D (C, Ω)$ . The converse is given in Corollary 5.3.

Corollary 5.8. Suppose |B| = |C| and that for all or all but one of $B_{i} \in B \dot{\cup} C^{'}$ , B_i has a positive monotonic effect on D relative to B ∪ C, for all X′ ∈ C′, X′ is not causally influenced by C and B is singular for $D (C, Ω)$ then B is singular for $D (C \cup C^{'}, Ω)$ .

Proof: By Corollary 5.7, B is irreducible relative to $D (C, Ω)$ . Hence by Corollary 3.8 B is irreducible relative to $D (C \cup C^{'}, Ω)$ . The conclusion then follows from a further application of Corollary 5.7.

5.1. Relation to Pearl’s Probability of Causation

Pearl [16, chapter 9] defined the probability of necessity and sufficiency (PNS) of cause X for outcome D to be P (D_X₌₁(ω) = 1, D_X₌₀(ω) = 0). In other words PNS(D, X) is the probability that D would occur if X occurred and would not have done so had X not occurred. We generalize this to the setting in which there are multiple causes B:

Definition 5.9. For $B \in \dot{ℙ} (𝕃 (C))$ , the probability of necessity and sufficiency of B causing D, is:

PNS (D, B) \equiv P (D_{B = 1} = 1 \text{ and for all } b \neq 1, D_{B = b} = 0) .

Thus PNS(D, B) is the probability that D would occur if every literal L ∈ B occurred and would not have done so had at least one literal in B not occurred.

Proposition 5.10. If |B| = |C| then PNS(D, B) > 0 iff B is singular for $D (C, Ω)$ .

Proof: Follows directly from Theorem 5.4 and Definition 5.9.

This connection also provides an interpretation for condition (5.6).

Proposition 5.11. Under the conditions of Corollary 5.6, with W = ∅ = B₊, PNS(D, B) is bounded below by the RHS of (5.6).

Proof:

\begin{aligned} PNS (D, B) & = P (D_{B = 1} = 1 \text{ and for all } b \neq 1, D_{B = b} = 0) \\ & \geq P (D_{B = 1} = 1) + P (\text{for all } b \neq 1, D_{B = b} = 0) - 1 \\ & = P (D_{B = 1} = 1) - P (\text{for some } b \neq 1, D_{B = b} = 1) \\ & \geq P (D_{B = 1} = 1) - \sum_{b \neq 1} P (D_{B = b} = 1) \\ & = E [D ∣ B = 1] - \sum_{b \neq 1} E [D ∣ B = b], \end{aligned}

which is the RHS of (5.6) with W = ∅ = B₊.

This generalizes some of the lower bounds on PNS(D, X) given by Pearl [16, §9.2]. The assumption in Proposition 5.11 that B and D are unconfounded, so W = ∅, and the absence of monotonicity assumptions so B₊ = ∅, are for expositional convenience.
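The bound in Proposition 5.11 can be illustrated on a small hypothetical population described by its response types. The sketch computes PNS(D, B) directly from Definition 5.9 and compares it with the lower bound P(D_{B=1} = 1) − Σ_{b≠1} P(D_{B=b} = 1); with no confounding these probabilities equal the conditional expectations appearing in (5.6). The population and the function name are assumptions.

```python
from itertools import product

def pns_and_bound(types, n):
    """types: list of (probability, D), with D mapping each b in {0,1}^n to 0 or 1."""
    ones = (1,) * n
    others = [b for b in product((0, 1), repeat=n) if b != ones]
    # Exact PNS: mass of types with D_{B=1} = 1 and D_{B=b} = 0 for every b != 1.
    pns = sum(p for p, D in types if D[ones] == 1 and all(D[b] == 0 for b in others))
    # Lower bound built from marginal probabilities of the potential outcomes.
    bound = sum(p * D[ones] for p, D in types)
    bound -= sum(p * D[b] for p, D in types for b in others)
    return pns, bound

cells = list(product((0, 1), repeat=2))
conjunction = {b: int(all(b)) for b in cells}   # D = x1 AND x2
never = {b: 0 for b in cells}                   # D = 0 always
always = {b: 1 for b in cells}                  # D = 1 always
pns, bound = pns_and_bound([(0.6, conjunction), (0.3, never), (0.1, always)], n=2)
```

Here PNS is 0.6 while the bound is 0.4; the gap arises because individuals with D always equal to 1 are subtracted once for every b ≠ 1.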

6. Relation to Statistical Models with Linear Links

In related work [32] it is noted that the presence of interaction terms in statistical models does not in general correspond to sufficient conditions for irreducibility. Consider for example a saturated Bernoulli regression model for D with identity link and binary regressors C = {X₁, …, X_p}:

E [D ∣ C = c] = \sum_{\tilde{B} \subseteq C} β_{\tilde{B}} {(⋀ (\tilde{B}))}_{c} .

(6.1)

Note that with c = (x₁, …, x_p) then ${(⋀ (\tilde{B}))}_{c} = Π_{X_{i} \in \tilde{B}} x_{i}$ , the usual multiplicative interaction term. The conditions, given earlier, for detecting the presence of irreducibility and singularity lead to linear restrictions on the regression coefficients $β_{\tilde{B}}$ :

\sum_{\tilde{B} \subseteq C} m_{\tilde{B}} β_{\tilde{B}} > 0 .

(6.2)

Note that (6.2) includes an intercept β_∅. First we define

\deg_{T} (L) \equiv ∣ \{E ∣ E \in T, L \in E\} ∣ ,

the degree of L in a tree $T$ , to be the number of edges in $T$ that contain L.

Proposition 6.1. Under the conditions of Theorem 4.11, with B = C, condition (4.3) is equivalent to restriction (6.2) with $m_{\tilde{B}} = m_{\tilde{B}}^{irred}$ where:

m_{\tilde{B}}^{irred} \equiv 1 - ∣ B \ \tilde{B} ∣ + ∣ T ∣ - \sum_{L \in \tilde{B} \cap B_{+}} \deg_{T} (L) + ∣ {E ∣ E \in T, E \subseteq \tilde{B} \cap B_{+}} ∣ .

(6.3)

Note that since $T$ is a tree on B₊, the last term in (6.3) has a natural graphical interpretation as the number of edges in the ‘induced subgraph’ of $T$ on the subset $\tilde{B}$ . Definition (6.3) also subsumes condition (4.2) given in Theorem 4.3 (without monotonicity), in which case the last three terms in (6.3) vanish. Though Proposition 6.1 assumes that C₂ = ∅, the condition given by (6.2) and (6.3) continues to apply in the case where c₂ = 0 as in Corollaries 4.14 and 4.15; obvious extensions apply to the case where c₂ ≠ 0.

Proof: This follows by counting the number (and sign) of expectations in (4.3) for which $\tilde{B}$ is a subset of the variables assigned the value 1 in the conditioning event. The first two terms in (4.3) correspond, respectively, to the first two terms in (6.3). The remaining three terms in (6.3) correspond to the last sum in (4.3): | $T$ |, the number of edges in $T$ , is the total number of terms in the sum. The sum over degrees subtracts the number of terms in which some $L \in \tilde{B}$ is assigned zero. Since this double counts terms corresponding to edges contained in $\tilde{B}$ , the last term corrects for this.

Proposition 6.2. Under the conditions of Theorem 5.5, with B = C, condition (5.3) is equivalent to restriction (6.2) with $m_{\tilde{B}} = m_{\tilde{B}}^{sing}$ where:

m_{\tilde{B}}^{sing} \equiv m_{\tilde{B}}^{irred} + (∣ B^{'} \ \tilde{B} ∣) - (2^{∣ B^{'} \ \tilde{B} ∣} - 1) .

(6.4)

Proof: Expression (6.4) follows from another counting argument similar to the proof of Proposition 6.1, together with the observation that conditions (4.3) and (5.3) only differ in that the terms in the sum over L in (4.3) for L ∈ B′ are replaced by a sum over all subsets of B′.
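The coefficients in (6.3) and (6.4) can be computed directly from their definitions. The following sketch does so for arbitrary B, B₊ and tree, and reports only the nonzero coefficients; the set-based encoding and the helper names are illustrative assumptions.

```python
from itertools import combinations

def subsets(items):
    items = sorted(items)
    return [frozenset(s) for r in range(len(items) + 1) for s in combinations(items, r)]

def m_irred(b_tilde, B, B_plus, tree):
    """Coefficient (6.3); `tree` is a list of two-element frozensets on B_plus."""
    inside = b_tilde & B_plus
    degree_sum = sum(sum(1 for e in tree if L in e) for L in inside)
    induced_edges = sum(1 for e in tree if e <= inside)
    return 1 - len(B - b_tilde) + len(tree) - degree_sum + induced_edges

def m_sing(b_tilde, B, B_plus, tree):
    """Coefficient (6.4), where B' = B minus B_plus."""
    k = len((B - B_plus) - b_tilde)
    return m_irred(b_tilde, B, B_plus, tree) + k - (2 ** k - 1)

def nonzero_coefficients(m, B, B_plus, tree):
    values = {tuple(sorted(s)): m(s, B, B_plus, tree) for s in subsets(B)}
    return {key: value for key, value in values.items() if value != 0}

B = frozenset({"X1", "X2", "X3"})
no_monotonicity = nonzero_coefficients(m_irred, B, frozenset(), [])
all_monotone = nonzero_coefficients(
    m_irred, B, B, [frozenset({"X1", "X2"}), frozenset({"X1", "X3"})]
)
singular = nonzero_coefficients(m_sing, B, frozenset(), [])
```

With three regressors these reproduce the restrictions stated in Example 5: β₁₂₃ − 2β_∅ − β₁ − β₂ − β₃ > 0 without monotonicity, β₁₂₃ − β₁ > 0 for the tree with edges X₁ — X₂ and X₁ — X₃, and β₁₂₃ − 6β_∅ − 2β₁ − 2β₂ − 2β₃ > 0 for singularity.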

Example 4 (Two-way interactions). Consider the saturated Bernoulli regression with identity link with C = {X₁, X₂}:

E [D ∣ X_{1} = x_{1}, X_{2} = x_{2}] = β_{\emptyset} + β_{1} x_{1} + β_{2} x_{2} + β_{12} x_{1} x_{2} .

Suppose that X₁ and X₂ are unconfounded with respect to D, so (4.1) holds with W = ∅. Proposition 6.1 implies that {X₁, X₂} is irreducible relative to C if β₁₂ > β_∅; Proposition 6.2 implies that {X₁, X₂} is singular relative to C if β₁₂ > 2β_∅. If one of X₁ or X₂ has a positive monotonic effect on D relative to C then Proposition 6.1 and Corollary 5.7 imply that {X₁, X₂} is both irreducible and singular relative to C if β₁₂ > β_∅. If both X₁ and X₂ have positive monotonic effects on D relative to C then Proposition 6.1 and Corollary 5.7 imply that {X₁, X₂} is both irreducible and singular relative to C if β₁₂ > 0.

Thus only under the assumption of positive monotonic effects for both X₁ and X₂ does the sufficient condition for the irreducibility and singularity of {X₁, X₂} coincide with the classical two-way interaction term β₁₂ being positive.

It also follows from Proposition 3.4 that if {X₁, X₂} is irreducible relative to C then there exists some ω ∈ Ω for whom {X₁, X₂} is a minimal sufficient cause relative to C (since |B| = |C|).

Example 5 (Three-way interactions). The saturated Bernoulli regression with three binary variables and an identity link can be written as:

E [D ∣ X_{1} = x_{1}, X_{2} = x_{2}, X_{3} = x_{3}] = β_{\emptyset} + β_{1} x_{1} + β_{2} x_{2} + β_{3} x_{3} + β_{12} x_{1} x_{2} + β_{13} x_{1} x_{3} + β_{23} x_{2} x_{3} + β_{123} x_{1} x_{2} x_{3} .

Suppose that C = {X₁, X₂, X₃} is unconfounded for D. Proposition 6.1 implies that {X₁, X₂, X₃} is irreducible relative to C if

β_{123} > 2 β_{\emptyset} + β_{1} + β_{2} + β_{3} .

(6.5)

It follows from Proposition 3.4 that if {X₁, X₂, X₃} is irreducible relative to C then there exists some ω ∈ Ω for whom {X₁, X₂, X₃} is a minimal sufficient cause relative to C (since |B| = |C|).

Proposition 6.2 implies {X₁, X₂, X₃} is singular relative to C if

β_{123} > 6 β_{\emptyset} + 2 β_{1} + 2 β_{2} + 2 β_{3} .

However, if X₁, X₂ and X₃ have positive monotonic effects on D (relative to C) then Proposition 6.1 implies {X₁, X₂, X₃} is irreducible relative to C if any of the following hold:

β_{123} > β_{1}, β_{123} > β_{2}, β_{123} > β_{3};

(6.6)

equivalently, β₁₂₃ > min{β₁, β₂, β₃}. By Corollary 5.7 this also establishes that {X₁, X₂, X₃} is singular relative to C.

If only X₁ and X₂ have positive monotonic effects on D relative to C then Proposition 6.1 implies that {X₁, X₂, X₃} is irreducible relative to C if

β_{123} > β_{\emptyset} + β_{1} + β_{2} .

(6.7)

By Corollary 5.7 condition (6.7) also implies that {X₁, X₂, X₃} is singular relative to C (since only X₃ does not have a positive monotonic effect on D). As we would expect condition (6.7) is weaker than (6.5) but stronger than any of the conditions (6.6). If only one potential cause has a monotonic effect on D relative to C then we can only use (6.5) to establish irreducibility.

Thus for three-way interactions β₁₂₃ > 0 does not correspond to any of the sufficient conditions for irreducibility or singularity of {X₁, X₂, X₃} relative to C, regardless of whether or not monotonicity assumptions hold.
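As a numerical cross-check of (6.5), the sketch below builds cell means from a saturated identity-link model with arbitrary hypothetical coefficients and confirms that the contrast in (4.2) equals β₁₂₃ − (2β_∅ + β₁ + β₂ + β₃). The coefficient values and the function name are made up for illustration; regressors are indexed 0, 1, 2.

```python
def cell_mean(beta, c):
    """Saturated identity-link model: sum beta over subsets of the regressors equal to 1."""
    active = {i for i, x in enumerate(c) if x == 1}
    return sum(value for subset, value in beta.items() if set(subset) <= active)

# Hypothetical coefficients, keyed by the indices of the regressors in each product term.
beta = {
    (): 0.05, (0,): 0.10, (1,): 0.10, (2,): 0.05,
    (0, 1): 0.02, (0, 2): 0.03, (1, 2): 0.04, (0, 1, 2): 0.50,
}

# Contrast (4.2) with B = C: E[D|111] - E[D|011] - E[D|101] - E[D|110].
contrast = (
    cell_mean(beta, (1, 1, 1))
    - cell_mean(beta, (0, 1, 1))
    - cell_mean(beta, (1, 0, 1))
    - cell_mean(beta, (1, 1, 0))
)
threshold_form = beta[(0, 1, 2)] - (2 * beta[()] + beta[(0,)] + beta[(1,)] + beta[(2,)])
```

The two-way coefficients cancel in the contrast, which is why they do not appear in (6.5).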

7. Interpretation of sufficient cause models

As mentioned in §2.1 the observed data for an individual (C(ω), D(ω)) represents a strict subset of the potential outcomes $D (C, ω)$ ; this is the ‘fundamental problem of causal inference’. Further, as we have seen, for a given set of potential outcomes there can exist different determinative sets of minimal sufficient causes B for the same set of potential outcomes; see (2.3) and (2.4). Thus we have the following for an individual:

\begin{array}{ccccc} \vdots & \searrow & \vdots & \searrow & \\ B & \to & D (C, ω) & \to & (C (ω), D (ω)) . \\ \vdots & \nearrow & \vdots & \nearrow & \\ & \text{many-one} & & \text{many-one} & \end{array}

(7.1)

It is typically impossible to know the set of potential outcomes for an individual $D (C, ω)$ , even when C = {X}, even from randomized experiments. However possession of this knowledge would permit one to predict how a given individual would respond under any given pattern of exposures C = c.

The results in this paper demonstrate that, given data from a randomized experiment (or when sufficient variables are measured to adjust for confounding) it is possible to infer the existence of an individual for whom all sets of minimal sufficient causes $B$ have certain features in common. However, given the double many-one relationship (7.1), and the fact that the set of potential outcomes $D (C, ω)$ , if they were known, apparently address all potential counterfactual queries, it is natural to ask what is to be gained by considering such representations. We now motivate our results by presenting several different interpretations of sufficient cause representations.

7.1. The descriptive interpretation

Under this view sets of minimal sufficient causes are merely a way to describe the set of potential outcomes $D (C, Ω)$ . The representation (A, $B$ ) may be more compact; compare Table 2 and (2.5). Extending this to a population Ω, the variables A in a representation (A, $B$ ) merely describe subpopulations with particular patterns of potential outcomes. Knowing that there exists an individual for whom all representations $B$ have certain features in common provides qualitative information about the set of potential outcomes.

For two binary causes {X₁, X₂}, Theorem 3.2 implies that {X₁, X₂} is irreducible relative to C for ω* if D₁₁(ω*) = 1 and D₁₀(ω*) = D₀₁(ω*) = 0. Such a pattern is of interest insofar as it indicates that the causal process resulting in this individual’s potential outcomes D(C, ω*) is such that (for some setting of the variables in C \ {X₁, X₂}), D = 1 if both X₁ = 1 and X₂ = 1, but not when X₁ = 1 and X₂ = 0 or vice versa.

Similarly it follows from Theorem 5.2 that {X₁, X₂} is singular relative to C for ω* if D₁₁(ω*) = 1 and D₁₀(ω*) = D₀₁(ω*) = D₀₀(ω*) = 0. Hence the causal process producing $D (C, ω^{*})$ is such that, for some setting of the variables in C \ {X₁, X₂}, D = 1 if both X₁ = 1 and X₂ = 1, but not when either X₁ = 0 or X₂ = 0.

In contrast to the classical notions of interaction arising in linear models (see §6), irreducibility and singularity are causal in that they relate to the potential outcomes. §4 and §5 contain empirical tests for the presence of irreducible or singular interactions.

7.2. Generative mechanism interpretations

A minimal sufficient cause representation may be interpreted in terms of a ‘generative mechanism’:

Definition 7.1. A mechanism M(ω) relative to C takes as input an assignment c to C, and outputs a ‘state’ M_c(ω) which is either ‘active’ (1) or ‘inactive’ (0). A mechanism is said to be generative for D if whenever it is active, the event D = 1 is caused, so that M_c(ω) = 1 implies D_c(ω) = 1. Conversely, a mechanism is said to be preventive for D if whenever M_c(ω) = 1, D_c(ω) = 0 is caused.

Though this definition refers to a mechanism ‘causing’ D = 1 or D = 0, we abstain from defining this more formally in terms of potential outcomes until the next section. Our reason for proceeding in this way is that there may be circumstances in which an investigator is able to posit the existence of a causal mechanism causing D = 1 or D = 0, e.g. based on experiments manipulating the inputs C and output D, but lacks sufficiently detailed information to posit well-defined counterfactual outcomes involving interventions on these (hypothesized) mechanisms.

Definition 7.2. A set of generative mechanisms M = 〈M¹, …, M^p〉 will be said to be exhaustive for a given set of potential outcomes $D (C, Ω)$ if for all ω ∈ Ω, and all c, if D_c(ω) = 1 then for some Mⁱ ∈ M, $M_{c}^{i} (ω) = 1$ .

Note that if M forms an exhaustive set of mechanisms for $D (C, Ω)$ , then it follows that in a context in which no mechanism Mⁱ is active, D = 0.

Proposition 7.3. If M = 〈M¹, …, M^p〉 forms an exhaustive set of generative mechanisms for $D (C, Ω)$ then $D = ⋁ (M)$ and $D_{c} (ω) = ⋁ (M_{c} (ω))$ .

Proof: This follows from Definitions 7.1 and 7.2. □
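Definitions 7.1 and 7.2 and Proposition 7.3 can be illustrated for a single individual. The Python sketch below rests on assumptions that are not in the paper: mechanisms and potential outcomes are represented as functions of an assignment c, the two example mechanisms are hypothetical, and `is_generative` checks only the implication M_c(ω) = 1 ⇒ D_c(ω) = 1, not the causal claim behind it.

```python
from itertools import product


def assignments(n):
    """All 0/1 assignments c to n binary causes."""
    return list(product((0, 1), repeat=n))


def is_generative(mech, outcome, n):
    """M_c = 1 implies D_c = 1 for every assignment c."""
    return all(outcome(c) == 1 for c in assignments(n) if mech(c) == 1)


def is_exhaustive(mechs, outcome, n):
    """D_c = 1 implies that at least one mechanism is active under c."""
    return all(any(m(c) == 1 for m in mechs)
               for c in assignments(n) if outcome(c) == 1)


# Hypothetical individual with D = X1 X2 or (1 - X1).
def outcome(c):
    return int((c[0] and c[1]) or not c[0])


def m1(c):
    return int(c[0] and c[1])      # active iff X1 = X2 = 1


def m2(c):
    return int(not c[0])           # active iff X1 = 0


mechs = [m1, m2]
print(all(is_generative(m, outcome, 2) for m in mechs))   # both generative?
print(is_exhaustive(mechs, outcome, 2))                   # jointly exhaustive?
# Proposition 7.3: D_c equals the disjunction of the mechanism states.
print(all(outcome(c) == max(m(c) for m in mechs) for c in assignments(2)))
```

Dropping `m2` from the list leaves a set that is still generative but no longer exhaustive, since D = 1 when X₁ = 0 with no remaining mechanism active.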

Proposition 7.4. Suppose M forms an exhaustive set of generative mechanisms for $D (C, Ω)$ . If $B \in \dot{ℙ} (𝕃 (C))$ , |B| = |C| and B is irreducible for $D (C; Ω)$ then there exists an individual ω* and a mechanism Mⁱ such that $M_{B = 1}^{i} (ω^{*}) = 1$ but for all L ∈ B, $M_{B \ {L} = 1, L = 0}^{i} (ω^{*}) = 0$ .

Thus if there exists an exhaustive set of generative mechanisms for $D (C, Ω)$ and B is irreducible then there is an individual ω* and a mechanism Mⁱ such that Mⁱ is active when all the literals in B take the value 1, and is inactive when any one literal is 0 and the rest continue to take the value 1.

Proof: By Theorem 3.2, since B is irreducible for $D (C; Ω)$ , there exists ω* ∈ Ω such that D_B=1(ω*) = 1 and for all L ∈ B, D_B\{L}=1,L=0(ω*) = 0. Since M is an exhaustive set of generative mechanisms for $D (C, Ω)$ , we have that for all c, $D_{c} (ω^{*}) = ⋁ (M_{c} (ω^{*}))$ . Since D_B=1(ω*) = 1, for some Mⁱ ∈ M, $M_{B = 1}^{i} (ω^{*}) = 1$ . Since for all L ∈ B, D_B\{L}=1,L=0(ω*) = 0, no mechanism can be active under these assignments, so in particular $M_{B \ {L} = 1, L = 0}^{i} (ω^{*}) = 0$ .
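The argument in this proof can be traced on a small case. The Python sketch below assumes a hypothetical individual with three binary causes and two hypothetical generative mechanisms, with B = {X₁, X₂, X₃}, so that B = 1 corresponds to the assignment (1, 1, 1). The helper names are illustrative and not from the paper.

```python
def flip(c, j):
    """Assignment c with coordinate j switched (literal j set to 0)."""
    return c[:j] + (1 - c[j],) + c[j + 1:]


def has_theorem_3_2_pattern(outcome, c_star):
    """D = 1 at c_star and D = 0 whenever exactly one literal is switched."""
    return outcome(c_star) == 1 and all(
        outcome(flip(c_star, j)) == 0 for j in range(len(c_star)))


def witnesses(mechs, outcome, c_star):
    """Indices of mechanisms active at c_star, if the pattern holds there."""
    if not has_theorem_3_2_pattern(outcome, c_star):
        return []
    return [i for i, m in enumerate(mechs) if m(c_star) == 1]


# Hypothetical mechanisms for one individual.
def m1(c):
    return int(c[0] and c[1] and c[2])          # X1 X2 X3


def m2(c):
    return int((1 - c[0]) and (1 - c[1]))       # (1 - X1)(1 - X2)


mechs = [m1, m2]


def outcome(c):
    """D_c as the disjunction of the mechanism states."""
    return int(any(m(c) for m in mechs))


c_star = (1, 1, 1)
found = witnesses(mechs, outcome, c_star)
print(found)
# Each witness is inactive once any single literal is switched to 0.
print(all(mechs[i](flip(c_star, j)) == 0 for i in found for j in range(3)))
```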

Proposition 7.5. Suppose M forms an exhaustive set of generative mechanisms for $D (C, Ω)$ . If $B \in \dot{ℙ} (𝕃 (C))$ , |B| = |C| and B is singular for $D (C; Ω)$ then there exists an individual ω* and a mechanism Mⁱ such that $M_{B = b}^{i} (ω^{*}) = 1$ iff b = 1.

Hence under the conditions of Proposition 7.5, if B is singular then there is an individual ω* and a mechanism Mⁱ such that Mⁱ is active if and only if all the literals in B take the value 1.

Proof: Similar to the proof of Proposition 7.4, replacing Theorem 3.2 by Theorem 5.2.

As the next example shows, the assumption that there exists an exhaustive set of generative mechanisms is substantive, and does not hold in all cases.

Example 6. Suppose C = {X₁, X₂} where X₁ and X₂ denote the presence of a variant allele at two particular loci. Let M¹ and M² denote two different proteins such that Mⁱ is produced iff X_i = 0, i.e. the associated allele is not present. Finally let D denote some characteristic whose occurrence is blocked by the presence of either M¹ or M² (or both). In this example

\begin{matrix} M_{x_{1} x_{2}}^{i} (ω) & = (1 - x_{i}) \\ D_{x_{1} x_{2}} (ω) & = (1 - M_{x_{1} x_{2}}^{1}) \land (1 - M_{x_{1} x_{2}}^{2}) = x_{1} x_{2} . \end{matrix}

By De Morgan’s Law, the second equation here may also be expressed as:

1 - D_{x_{1} x_{2}} (ω) = M_{x_{1} x_{2}}^{1} (ω) \lor M_{x_{1} x_{2}}^{2} (ω) = 1 - x_{1} x_{2} .

The mechanisms M¹ and M² are preventive for D, so that D = 1 only occurs when both mechanisms are inactive. An exhaustive set of generative mechanisms does not exist because in this example there are no generative mechanisms.
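Example 6 is small enough to enumerate. The Python sketch below encodes the verbal description (each protein is produced iff the corresponding allele is absent, and D is blocked by either protein) and tabulates all four assignments. The function names are illustrative assumptions.

```python
from itertools import product


def mechanism(i, x):
    """Protein M^i is produced iff the variant allele X_i is absent."""
    return 1 - x[i]


def outcome(x):
    """D occurs only if neither protein is present."""
    return (1 - mechanism(0, x)) * (1 - mechanism(1, x))


rows = []
for x in product((0, 1), repeat=2):
    row = (x, mechanism(0, x), mechanism(1, x), outcome(x))
    rows.append(row)
    print(row)   # (x1, x2), M1, M2, D

# D coincides with x1 * x2, and 1 - D with 'M1 or M2'.
print(all(d == x[0] * x[1] for x, _, _, d in rows))
print(all(1 - d == max(m1, m2) for _, m1, m2, d in rows))
```

In every row where a protein is active the outcome is 0, so each mechanism is preventive and neither satisfies the implication required of a generative mechanism.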

It is natural to suppose that mechanisms are ‘modular’ and thus may be isolated or rendered inactive without affecting other such mechanisms. This may be formalized via potential outcomes:

Definition 7.6. An exhaustive set of generative mechanisms M for $D (C, Ω)$ is said to support counterfactuals if there exist well-defined potential outcomes D_C=c,M=m(ω) and D_M=m(ω) such that

D_{C = c, M = m} (ω) = D_{M = m} (ω) = {(⋁ (M))}_{m} .

The important assumption here is the existence of the potential outcomes D_m(ω) and D_c,m(ω). Note that if M supports counterfactuals then interventions on C do not affect D if interventions are also made on M.

Proposition 7.7. If the exhaustive set of generative mechanisms M supports counterfactuals then

D_{M = M (ω)} (ω) = D_{C = C (ω), M = M (ω)} (ω) = D (ω)

so that consistency holds for the potential outcomes D_m(ω) and D_c,m(ω).

Proof: This follows because

D_{C = C (ω), M = M (ω)} (ω) = D_{M = M (ω)} (ω) = ⋁ (M (ω)) = D (ω) .
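Definition 7.6 and Proposition 7.7 can be mirrored in a few lines of Python. This is a sketch under stated assumptions: the two mechanisms are hypothetical, a single individual is considered, and an intervention on M is modelled simply by passing the mechanism states m directly, which is one way to encode D_{C=c,M=m} = D_{M=m} = ⋁(m).

```python
from itertools import product


def m1(c):
    return int(c[0] and c[1])      # hypothetical mechanism: X1 X2


def m2(c):
    return int(not c[0])           # hypothetical mechanism: 1 - X1


MECHS = (m1, m2)


def natural_states(c):
    """M_c: the mechanism states that arise when C is set to c."""
    return tuple(m(c) for m in MECHS)


def d_c(c):
    """D_c as the disjunction of M_c (Proposition 7.3)."""
    return int(any(natural_states(c)))


def d_m(m):
    """D_{M=m}: outcome when the mechanisms themselves are set to m."""
    return int(any(m))


def d_cm(c, m):
    """D_{C=c, M=m}: equal to D_{M=m}, so c plays no further role."""
    return d_m(m)


# Consistency: setting C and M to their own values returns the observed D.
consistent = all(
    d_cm(c, natural_states(c)) == d_m(natural_states(c)) == d_c(c)
    for c in product((0, 1), repeat=2))
print(consistent)

# Rendering both mechanisms inactive gives D = 0 whatever C is set to.
print(d_cm((1, 1), (0, 0)))
```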

7.3. Counterfactual interpretation of a minimal sufficient cause representation

If we have an exhaustive set of generative mechanisms which supports counterfactuals, and further each mechanism is a conjunction of literals, then there will be a minimal sufficient cause representation that itself supports counterfactuals.

Definition 7.8. A representation (A, $B$ ) for $D (C, Ω)$ will be said to be structural if for each pair (A_i, B_i), A_i ∈ A, $B_{i} \in B$ there exists a generative mechanism (or mechanisms) Mⁱ such that Mⁱ = A_i ∧ (⋀(B_i)) and

{M^{i}}_{C = c} (ω) = A_{i} (ω) \land {(⋀ (B_{i}))}_{c} .

Thus if (A, $B$ ) is structural then each pair (A_i, B_i), A_i ∈ A, $B_{i} \in B$ corresponds to a mechanism Mⁱ. Thus in this case the variables A_i(ω) may be interpreted as indicating whether the corresponding mechanism(s) Mⁱ is ‘present’ in individual ω. We may thus associate potential outcomes with the A_i, corresponding to removing (or inserting) the corresponding mechanism(s). This interpretation of the A_i’s is consistent with the notion of ‘co-cause’ which arises in the literature on minimal sufficient causes.
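The reading of A_i as a ‘presence’ indicator can be made concrete. The Python sketch below is illustrative only: it assumes a hypothetical representation on C = {X₁, X₂} with B₁ = {X₁, X₂} and B₂ = {1 − X₁}, encodes each literal as a pair (j, v) meaning "cause j takes value v", and evaluates Mⁱ_c = A_i ∧ (⋀(B_i))_c for three hypothetical individuals.

```python
from itertools import product


def conjunction(literals, c):
    """Value of the conjunction of literals under assignment c.

    Each literal is a pair (j, v): it equals 1 when cause j takes value v.
    """
    return int(all(c[j] == v for j, v in literals))


def mechanism_states(a, b_sets, c):
    """M^i_c = A_i and (conjunction of B_i)_c for each pair (A_i, B_i)."""
    return [int(a_i and conjunction(b_i, c)) for a_i, b_i in zip(a, b_sets)]


def outcome(a, b_sets, c):
    """D_c as the disjunction of the mechanism states."""
    return int(any(mechanism_states(a, b_sets, c)))


# Hypothetical representation: B_1 = {X1, X2}, B_2 = {1 - X1}.
b_sets = [[(0, 1), (1, 1)], [(0, 0)]]
# Hypothetical individuals: A_i = 1 means mechanism i is 'present'.
individuals = {"both": (1, 1), "first only": (1, 0), "neither": (0, 0)}

for name, a in individuals.items():
    table = {c: outcome(a, b_sets, c) for c in product((0, 1), repeat=2)}
    print(name, table)
```

Switching an A_i from 1 to 0 in this sketch corresponds to removing the associated mechanism, which is the kind of additional intervention a structural representation is meant to support.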

We note that ‘structural’ is often used as a synonym for ‘causal’. However, even under the weak interpretation, a sufficient cause representation is causal in that it represents a set of potential outcomes. The word is used in Definition 7.8 to connote that the structure of the representation itself represents (additional) potential outcomes arising from intervention on a set of mechanisms M that correspond with the pairs (A_i, B_i), A_i ∈ A, $B_{i} \in B$ . Note that there need not be a unique structural representation (A, $B$ ) for $D (C, Ω)$ . There might be several functionally equivalent generative mechanisms corresponding to a given pair (A_i, B_i).

Proposition 7.9. If a representation (A, $B$ ) for $D (C, Ω)$ , where A = 〈A₁, …, A_p〉, is structural then the associated set of generative mechanisms 〈M¹, …, M^p〉 is exhaustive.

Proof: This follows from Definitions 2.7 and 7.2. □

Proposition 7.10. Suppose that M forms an exhaustive set of generative mechanisms for $D (C, Ω)$ and M supports counterfactuals. If for all Mⁱ ∈ M there exists $B_{i} \in \dot{ℙ} (𝕃 (C))$ , and an A_i such that for all c, and ω ∈ Ω, if A_i(ω) = 1 then (Mⁱ)_c(ω) = ⋀(B_i)_c, then (A = 〈A₁, …, A_p〉, $B = 〈 B_{1}, \dots, B_{p} 〉$ ) forms a representation for $D (C, Ω)$ that is structural.

Proof: This follows from Definitions 2.7 and 7.8.

Proposition 7.11. If there is some representation (A, $B$ ) that is structural and $B \in \dot{ℙ} (𝕃 (C))$ is irreducible for $D (C; Ω)$ then there exists a mechanism Mⁱ that is active only if B = 1.

Proof: If B is irreducible for $D (C; Ω)$ then there exists $B_{i} \in B$ with $B \subseteq B_{i}$ ; the mechanism Mⁱ = A_i ∧ (⋀(B_i)) is such that Mⁱ = 1 only if B = 1.

Note that the conclusion of Proposition 7.11, unlike those of Propositions 7.4 and 7.5, does not make reference to an individual ω*. This is because Proposition 7.11 assumes that there is a representation (A, $B$ ) that is structural: in this representation the A_i variables may be seen as a constituent part of the corresponding mechanism Mⁱ.

Note that there may exist a set of exhaustive generative mechanisms, but these mechanisms may not themselves be conjunctions of literals, so that there is no sufficient cause representation for $D (C; Ω)$ that is structural:

Example 7. Suppose C = {X₁, X₂} where X₁ and X₂ again denote the presence of variant alleles, acquired by maternal and paternal inheritance respectively, at a particular locus. Let M denote a protein that is produced iff either X₁ = X₂ = 1 or X₁ = X₂ = 0 and let D denote some characteristic that occurs iff M = 1. Suppose we can intervene to remove or add the protein. We then have that

\begin{matrix} M_{x_{1} x_{2}} (ω) & = x_{1} x_{2} \lor (1 - x_{1}) (1 - x_{2}) \\ D_{x_{1} x_{2}} (ω) & = M_{x_{1} x_{2}} (ω) \\ D_{x_{1} x_{2} m} (ω) = D_{m} (ω) & = m . \end{matrix}

Thus {M} constitutes an exhaustive set of generative mechanisms for $D (C, Ω)$ . We have a representation for $D (C, Ω)$ :

D_{x_{1} x_{2}} (ω) = {(A_{1} (ω) X_{1} X_{2} \lor A_{2} (ω) (1 - X_{1}) (1 - X_{2}))}_{x_{1} x_{2}}

with A₁(ω) = A₂(ω) = 1 for all ω ∈ Ω. However, this representation is not structural because A₁(ω)X₁X₂ and A₂(ω)(1−X₁)(1−X₂) do not constitute separate mechanisms for which interventions are conceivable; there is only one mechanism M, the protein. Since for any ω ∈ Ω, D₁₁(ω) = 1 and D₁₀(ω) = D₀₁(ω) = 0, {X₁, X₂} is irreducible relative to C; however, it is not the case that there is a mechanism Mⁱ that is active only if X₁X₂ = 1, since for the only mechanism M it is the case that M = 1 if X₁ = X₂ = 0. Note, however, that in this example there is still a mechanism, namely M, that will be ‘active’ if X₁ = X₂ = 1 but will be ‘inactive’ if only one of X₁ or X₂ is 1.
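The claims made about Example 7 can be verified by enumeration. The Python sketch below encodes the protein M and the outcome D as described in the example; the function names are illustrative assumptions.

```python
from itertools import product


def protein(x1, x2):
    """M is produced iff both variant alleles are present or both are absent."""
    return int((x1 and x2) or ((1 - x1) and (1 - x2)))


def outcome(x1, x2):
    """D occurs iff the protein is present."""
    return protein(x1, x2)


def two_term_representation(x1, x2, a1=1, a2=1):
    """A1 X1 X2 or A2 (1 - X1)(1 - X2), with A1 = A2 = 1 for everyone."""
    return int((a1 and x1 and x2) or (a2 and (1 - x1) and (1 - x2)))


grid = list(product((0, 1), repeat=2))

# The two-term representation reproduces D for every assignment.
print(all(outcome(*x) == two_term_representation(*x) for x in grid))
# Irreducibility pattern: D11 = 1 and D10 = D01 = 0.
print(outcome(1, 1) == 1 and outcome(1, 0) == 0 and outcome(0, 1) == 0)
# Yet the single mechanism M is also active when X1 = X2 = 0.
print(protein(0, 0))
```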

Example 8. To illustrate the results in the paper we consider again the data presented in Table 1. We let D denote bladder cancer, X₁ smoking, X₂ the S NAT2 genotype, and X₃ the *10 allele on NAT1. As discussed in Example 5, if the effect of C = {X₁, X₂, X₃} is unconfounded for D and we fit the model

E [D = 1 ∣ X_{1} = x_{1}, X_{2} = x_{2}, X_{3} = x_{3}] = β_{\emptyset} + β_{1} x_{1} + β_{2} x_{2} + β_{3} x_{3} + β_{12} x_{1} x_{2} + β_{13} x_{1} x_{3} + β_{23} x_{2} x_{3} + β_{123} x_{1} x_{2} x_{3}

(7.2)

then if X₁, X₂ and X₃ have positive monotonic effects on D (relative to C) then {X₁, X₂, X₃} is irreducible relative to C if any of the following hold:

β_{123} > β_{1}, β_{123} > β_{2}, β_{123} > β_{3} .

We cannot fit the model (7.2) directly with case control data. However, under the assumption that the outcome is rare (reasonable with bladder cancer) so that odds ratios approximate risk ratios, we can fit the model:

E [D = 1 ∣ X_{1} = x_{1}, X_{2} = x_{2}, X_{3} = x_{3}] ∕ E [D = 1 ∣ X_{1} = 0, X_{2} = 0, X_{3} = 0] = 1 + θ_{1} x_{1} + θ_{2} x_{2} + θ_{3} x_{3} + θ_{12} x_{1} x_{2} + θ_{13} x_{1} x_{3} + θ_{23} x_{2} x_{3} + θ_{123} x_{1} x_{2} x_{3}

(7.3)

and the conditions for the irreducibility of {X₁, X₂, X₃} relative to C under monotonicity of {X₁, X₂, X₃} become:

θ_{123} > θ_{1}, θ_{123} > θ_{2}, θ_{123} > θ_{3} .

If we fit model (7.3) using maximum likelihood, we find that

\begin{matrix} θ_{123} - θ_{1} & = 1.21 (95 % CI : - 3.83, 6.26), \\ θ_{123} - θ_{2} & = 2.93 (95 % CI : - 2.85, 8.72), \\ θ_{123} - θ_{3} & = 2.97 (95 % CI : - 2.80, 8.74) . \end{matrix}

In each case the point estimate suggests evidence of irreducibility, under monotonicity of {X₁, X₂, X₃}, but the sample size is not sufficiently large to draw this conclusion confidently. With monotonicity of {X₁, X₂, X₃}, irreducibility also implies a singular interaction for {X₁, X₂, X₃}. If we assume that only {X₁, X₂} or {X₁, X₃} or {X₂, X₃} are monotonic relative to C then the conditions for irreducibility in Example 5 can be expressed respectively as:

θ_{123} > 1 + θ_{1} + θ_{2}, θ_{123} > 1 + θ_{1} + θ_{3}, θ_{123} > 1 + θ_{2} + θ_{3} .

From model (7.3) we have that

\begin{matrix} θ_{123} - (1 + θ_{1} + θ_{2}) & = 0.09 (95 % CI : - 4.77, 4.96), \\ θ_{123} - (1 + θ_{1} + θ_{3}) & = 0.13 (95 % CI : - 4.69, 4.95), \\ θ_{123} - (1 + θ_{2} + θ_{3}) & = 1.86 (95 % CI : - 3.41, 7.12) . \end{matrix}

Again, in each case the point estimate suggests evidence of irreducibility, under monotonicity of just two of the three exposures, but the sample size is not sufficiently large to draw this conclusion confidently. With monotonicity of two of the three exposures, irreducibility also implies a singular interaction for {X₁, X₂, X₃}. The test for irreducibility in Example 5 without assumptions about monotonicity can be expressed as:

θ_{123} > 2 + θ_{1} + θ_{2} + θ_{3} .

From model (7.3) we have that

θ_{123} - (2 + θ_{1} + θ_{2} + θ_{3}) = - 0.99 (95 % CI : - 5.86, 3.88) .

In this case, not even the point estimate is positive.
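The three families of contrasts used in this example are simple functions of the fitted coefficients. The Python sketch below computes them from a dictionary of θ estimates. The numerical values supplied are hypothetical and chosen only for illustration (they are not the estimates from Table 1), and no confidence intervals are computed, since that would require the estimated covariance matrix, which is not reported here.

```python
def irreducibility_contrasts(theta):
    """Contrasts whose positivity is the condition stated in the text.

    theta maps "1", "2", "3" and "123" to the corresponding coefficients
    of the risk-ratio model (7.3).
    """
    t1, t2, t3, t123 = theta["1"], theta["2"], theta["3"], theta["123"]
    return {
        # All of X1, X2, X3 monotonic: any one positive contrast suffices.
        "all monotonic": [t123 - t1, t123 - t2, t123 - t3],
        # Only the named pair assumed monotonic.
        "X1,X2 monotonic": t123 - (1 + t1 + t2),
        "X1,X3 monotonic": t123 - (1 + t1 + t3),
        "X2,X3 monotonic": t123 - (1 + t2 + t3),
        # No monotonicity assumption.
        "no monotonicity": t123 - (2 + t1 + t2 + t3),
    }


# Hypothetical coefficient values, for illustration only.
theta = {"1": 0.5, "2": 0.25, "3": 0.25, "123": 2.0}
contrasts = irreducibility_contrasts(theta)
for label, value in contrasts.items():
    print(label, value)
```

With these hypothetical values the contrasts are positive under full or pairwise monotonicity but negative without any monotonicity assumption, which mirrors the qualitative ordering of the three conditions: each weakening of the assumptions raises the threshold that θ₁₂₃ must exceed.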

If {X₁, X₂, X₃} is in fact irreducible and if there exists a representation that is structural then it follows by Proposition 7.11 that there exists a mechanism that is active only if X₁ = 1, X₂ = 1, X₃ = 1.

8. Concluding Remarks

In this paper we have developed general results for notions of interaction that we referred to as ‘irreducibility’ (aka ‘a sufficient cause interaction’) and ‘singularity’ for sufficient cause models with an arbitrary number of dichotomous causes. The theory could be extended by developing notions of sufficient cause, irreducibility and singularity for causes and outcomes that are categorical and/or ordinal in nature.

Acknowledgments

We thank Stephen Stigler for pointing us to earlier work by Cayley on minimal sufficient cause models. We thank James Robins for helpful conversations. Tyler VanderWeele was supported by the National Institutes of Health (R01 ES017876). Thomas Richardson was supported by the National Science Foundation (CRI 0855230), the National Institutes of Health (R01 AI032475), and The Institute of Advanced Studies, University of Bologna.

Footnotes

AMS 2000 subject classifications: Primary 62A01; secondary 68T30, 62J99

Contributor Information

Tyler J. VanderWeele, Harvard School of Public Health, Department of Epidemiology, 677 Huntington Avenue, Boston, MA 02115, tvanderw@hsph.harvard.edu, URL: http://www.hsph.harvard.edu/faculty/tyler-vanderweele/

Thomas S. Richardson, University of Washington, Department of Statistics, Box 354322, Seattle, WA 98195, thomasr@uw.edu

References

[1] Aickin M. Causal Analysis in Biomedicine and Epidemiology Based on Minimal Sufficient Causation. Marcel Dekker; New York: 2002.
[2] Bateson W. Mendel’s Principles of Heredity. Cambridge University Press;
[3] Cayley A. Note on a question in the theory of probabilities. London, Edinburgh and Dublin Philosophical Magazine. 1853;VI:259.
[4] Cayley A. A theorem on trees. Quart. J. Math. 1889;23:376–378.
[5] Cordell H. Epistasis: what it means, what it doesn’t mean, and statistical methods to detect it in humans. Hum. Mol. Genet. 2002;11:2463–2468. doi: 10.1093/hmg/11.20.2463.
[6] Cox DR. The Planning of Experiments. Wiley; New York: 1958.
[7] Flanders D. Sufficient-component cause and potential outcome models. Eur. J. Epidemiol. 2006;21:847–853. doi: 10.1007/s10654-006-9048-3.
[8] Greenland S, Brumback B. An overview of relations among causal modelling methods. Int. J. Epidemiol. 2002;31:1030–1037. doi: 10.1093/ije/31.5.1030.
[9] Greenland S, Poole C. Invariants and noninvariants in the concept of interdependent effects. Scand. J. Work Environ. Health. 1988;14:125–129. doi: 10.5271/sjweh.1945.
[10] Koopman JS. Interaction between discrete causes. Am. J. Epidemiol. 1981;113:716–724. doi: 10.1093/oxfordjournals.aje.a113153.
[11] Mackie JL. Causes and conditions. American Philosophical Quarterly. 1965;2:245–255.
[12] Marcovitz AB. Introduction to Logic Design. McGraw-Hill; 2001.
[13] McCluskey EJ. Minimization of Boolean functions. The Bell System Technical Journal. 1956;35:1417–1444.
[14] Neyman J. Sur les applications de la théorie des probabilités aux expériences agricoles: Essai des principes. In: Dabrowska D, Speed T, editors. Statistical Science 5. 1923. pp. 463–472. Excerpts in English.
[15] Novick LR, Cheng PW. Assessing interactive causal influence. Psychological Review. 2004;111:455–485. doi: 10.1037/0033-295X.111.2.455.
[16] Pearl J. Causality: Models, Reasoning, and Inference. Cambridge University Press; Cambridge: 2000.
[17] Phillips P. Epistasis – the essential role of gene interactions in the structure and evolution of genetic systems. Nature Reviews Genetics. 2008;9:855–867. doi: 10.1038/nrg2452.
[18] Quine WV. The problem of simplifying truth functions. American Math. Monthly. 1952;59:521–531.
[19] Quine WV. A way to simplify truth functions. American Math. Monthly. 1955;62:627–631.; Robins JM. A new approach to causal inference in mortality studies with sustained exposure period - application to control of the healthy worker survivor effect. Mathematical Modelling. 1986;7:1393–1512.
[20] Robins JM. Marginal structural models versus structural nested models as tools for causal inference. In: Halloran E, Berry D, editors. Statistical Models in Epidemiology: the Environment and Clinical Trials. Springer; New York: 1999. pp. 95–134.; Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70:41–55.
[21] Rothman KJ. Causes. Am. J. Epidemiol. 1976;104:587–592. doi: 10.1093/oxfordjournals.aje.a112335.
[22] Rothman KJ, Greenland S. Modern Epidemiology. Lippincott-Raven; Philadelphia: 1998.
[23] Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J. Educ. Psychol. 1974;66:688–701.
[24] Rubin DB. Bayesian inference for causal effects: The role of randomization. Ann. Statist. 1978;6:34–58.
[25] Rubin DB. Comment on Neyman 1923 and causal inference in experiments and observational studies. Statistical Science. 1990;5:472–480.
[26] Taylor JA, Umbach DM, Stephens E, Castranio T, Paulson D, Robertson G, Mohler JL, Bell DA. The role of N-acetylation polymorphisms in smoking-associated bladder cancer: Evidence of a gene-gene-exposure three-way interaction. Cancer Research. 1998;58:3603–3610.
[27] VanderWeele T. Empirical tests for compositional epistasis. Nature Reviews Genetics. 2010;11:166. doi: 10.1038/nrg2579-c1.
[28] VanderWeele T. Epistatic interactions. Statistical Applications in Genetics and Molecular Biology. 2010;9:1–22. doi: 10.2202/1544-6115.1517.
[29] VanderWeele TJ, Hernán MA. From counterfactuals to sufficient component causes, and vice versa. Eur. J. Epidemiol. 2006;21:855–858. doi: 10.1007/s10654-006-9075-0.
[30] VanderWeele TJ, Robins JM. Directed acyclic graphs, sufficient causes and the properties of conditioning on a common effect. Am. J. Epidemiol. 2007;166:1096–1104. doi: 10.1093/aje/kwm179.
[31] VanderWeele TJ, Robins JM. The identification of synergism in the sufficient-component cause framework. Epidemiol. 2007;18:329–339. doi: 10.1097/01.ede.0000260218.66432.88.
[32] VanderWeele TJ, Robins JM. Empirical and counterfactual conditions for sufficient cause interactions. Biometrika. 2008;95:49–61.
[33] VanderWeele TJ, Robins JM. Signed directed acyclic graphs for causal inference. J. Roy. Statist. Soc., Series B. 2010;72:111–127. doi: 10.1111/j.1467-9868.2009.00728.x.
[34] Vansteelandt S, Goetghebeur E. Causal inference with generalized structural mean models. J. Roy. Statist. Soc., Series B. 2003;65:817–835.
