Philosophical Transactions of the Royal Society B: Biological Sciences. 2005 May 29;360(1457):953–967. doi: 10.1098/rstb.2005.1641

A graphical approach for evaluating effective connectivity in neural systems

Michael Eichler 1,*
PMCID: PMC1854925  PMID: 16087440

Abstract

The identification of effective connectivity from time-series data such as electroencephalogram (EEG) or time-resolved functional magnetic resonance imaging (fMRI) recordings is an important problem in brain imaging. One commonly used approach to inferring effective connectivity is based on vector autoregressive models and the concept of Granger causality. However, this probabilistic concept of causality can lead to spurious causalities in the presence of latent variables. Recently, graphical models have been used to discuss problems of causal inference for multivariate data. In this paper, we extend these concepts to the case of time-series and present a graphical approach for discussing Granger-causal relationships among multiple time-series. In particular, we propose a new graphical representation that allows the characterization of spurious causality and, thus, can be used to investigate it. The method is demonstrated with concurrent EEG and fMRI recordings, which are used to investigate the interrelations between the alpha rhythm in the EEG and blood oxygenation level dependent (BOLD) responses in the fMRI. The results confirm previous findings on the location of the source of the EEG alpha rhythm.

Keywords: graphical models, multivariate time-series, Granger causality, causal inference, spurious causality

1. Introduction

Recent studies in functional brain imaging have focused on the investigation of interactions between brain areas that are activated during certain tasks. Of particular interest is the determination of directed interactions and, hence, of the effective connectivity, which is defined as the influence one region exerts over another (Büchel & Friston 2000).

Several approaches have been suggested for modelling and estimating effective connectivity: structural equation modelling (Büchel & Friston 1997; McIntosh & Gonzales-Lima 1997), nonlinear dynamic causal models (Friston et al. 2003) and vector autoregressive modelling (Goebel et al. 2003; Harrison et al. 2003). The latter approach explicitly makes use of the temporal information in time-resolved fMRI measurements and is closely related to the concept of Granger causality (Goebel et al. 2003). This probabilistic concept of causality introduced by Granger (1969) is based on the idea that causes always precede their effects and can be formulated in terms of predictability.

In general, inferences about effective connectivity rely on the assumption that the underlying dependence structure of the system is correctly specified by the statistical model employed for the analysis. In the case of Granger causality and vector autoregressive modelling, it is well known that omission of relevant factors can lead to temporal correlations between the observed components that are falsely detected as causal relationships. Such so-called spurious causalities (Hsiao 1982) are particularly a problem in brain imaging applications where typically only a small number of the potential factors are included in an analysis. This raises the questions of whether and how effective connectivity of neural systems can be determined by fitting vector autoregressive models.

In this paper, we address this question using a new graphical approach for the investigation of dependence structures in multivariate dynamic systems. This approach is motivated by recent work on causal inference (Pearl 2000; Lauritzen 2001) that is based on graphical models. The idea of graphical modelling is to represent possible dependencies among the variables of a multivariate distribution in a graph. Such graphs can be easily visualized and allow an intuitive understanding of complex dependence structures. For an introduction to graphical models we refer to the monographs by Whittaker (1990), Cox & Wermuth (1996), Lauritzen (1996) and Edwards (2000).

In time-series analysis, there exist various approaches for defining graphical models; a partial overview can be found in Dahlhaus & Eichler (2003). Here, we focus on the approach by Eichler (2001, 2002), who linked the notions of graphical modelling and Granger causality to describe the dynamic dependencies in vector autoregressions. In §2, we review the concept of path diagrams associated with vector autoregressions and multivariate Granger causality. Furthermore, we introduce a similar graphical representation of the bivariate Granger-causal relationships among multiple time-series. In §3, we discuss the properties of such multivariate and bivariate path diagrams and their relation to each other. The results suggest using a combination of both representations for the description of dependencies in multiple time-series. These general path diagrams and the implications for causal inference from vector autoregressions are discussed in §4, and an application for simultaneous recordings from EEG and time-resolved fMRI is provided in §5.

2. Granger causality and autoregressive modelling

(a) Multivariate Granger causality

The concept of Granger causality is a fundamental tool for the empirical investigation of dynamic interactions in multivariate time-series. This probabilistic concept of causality is based on the common sense conception that causes always precede their effects. Thus, an event taking place in the future cannot cause another event in the past or present. This temporal ordering implies that the past and present values of a series X that influences another series Y should help to predict future values of this latter series Y. Furthermore, the improvement in the prediction of future values of Y should persist after any other relevant information for the prediction has been exploited. Suppose that the vector time-series Z comprises all variables that might affect the dependence between X and Y, such as confounding variables. Then we say that a series X Granger causes another series Y with respect to the information given by the series (X, Y, Z) if the value of Y(t+1) can be better predicted by using the entire information available at time t than by using the same information apart from the past and present values of X. Here, ‘better’ means a smaller variance of forecast error.

Because of the temporal ordering, it is clear that Granger causality can only capture functional relationships for which cause and effect are sufficiently separated in time. To describe causal dependencies between variables at the same time point, Granger (1969) proposed the notion of ‘instantaneous causality’. In general, it is not possible to attribute a unique direction to such ‘instantaneous causalities’ and we therefore will only speak of contemporaneous dependencies.

In practice, the use of Granger causality has mostly been restricted to the investigation of linear relationships. This notion of linear Granger causality is closely related to the important class of vector autoregressive processes. More precisely, let XV be a zero-mean, weakly stationary vector time-series with components Xv, v∈V={1,…,d} and assume that XV has an autoregressive representation of the form

$$X_V(t)=\sum_{u=1}^{\infty}\Phi(u)\,X_V(t-u)+\epsilon_V(t), \tag{2.1}$$

where the matrices Φ(u) are square summable and ϵV(t) is a white noise process with mean zero and non-singular covariance matrix Σ. Then, for any b∈V, the linear prediction of Xb(t+1) on the past and present values of XV is given by

$$\hat{X}_b(t+1)=\sum_{u=1}^{\infty}\sum_{a\in V}\Phi_{ba}(u)\,X_a(t-u+1).$$

It follows that a component Xa is Granger-non-causal for Xb with respect to the complete series XV if, and only if, Φba(u)=0 for all lags u. Furthermore, if Σab=0, the variables Xa(t+1) and Xb(t+1) are uncorrelated after removing the linear effects of XV(s), s≤t, and we say that Xa and Xb are contemporaneously uncorrelated with respect to XV.

From the definition of Granger causality in terms of the autoregressive parameters, it is clear that the notion of Granger causality depends on the multivariate time-series XV available for the analysis. If, for example, we consider only a subset Xs, s∈S, of the series in XV, the vector time-series XS again has an autoregressive representation

$$X_S(t)=\sum_{u=1}^{\infty}\tilde{\Phi}(u)\,X_S(t-u)+\tilde{\epsilon}_S(t),$$

for some white noise series ϵ˜S with covariance matrix Σ˜, but the coefficients Φ˜(u) and Σ˜ will generally be different from those in (2.1). For distinct a, b∈V, we say that Xa does not Granger-cause Xb with respect to XS if Φ˜ba(u)=0 for all lags u. To illustrate this dependence on the analysed process XV, we consider the following trivariate system. Let

$$\left.\begin{aligned} X_1(t)&=\alpha X_2(t-1)+\epsilon_1(t),\\ X_2(t)&=\beta X_3(t-1)+\epsilon_2(t),\\ X_3(t)&=\epsilon_3(t), \end{aligned}\right\} \tag{2.2}$$

where ϵv(t), v=1, 2, 3 are independent and identically normally distributed with mean zero and variance σ². Then X3 Granger-causes X2, which in turn Granger-causes X1, whereas X3 is Granger-non-causal for X1 (all with respect to the full trivariate series X{1,2,3}). On the other hand, in a bivariate autoregressive representation of X1 and X3, we have

$$\begin{aligned} X_1(t)&=\alpha\beta X_3(t-2)+\tilde{\epsilon}_1(t),\\ X_3(t)&=\epsilon_3(t), \end{aligned}$$

with ϵ˜1(t)=ϵ1(t)+αϵ2(t−1). This representation implies that X3 Granger-causes X1 with respect to the bivariate series X{1,3}.
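The dependence on the analysed series can be checked numerically. The following is a minimal sketch, not the estimation procedure used in this paper: it simulates system (2.2) with illustrative values α=0.8, β=0.6 and unit error variance (all assumptions), and fits vector autoregressions of order 2 by ordinary least squares. The helper names `simulate_chain` and `fit_var` are hypothetical.

```python
import numpy as np

def simulate_chain(n, alpha, beta, seed=0):
    """Simulate system (2.2); columns 0, 1, 2 hold X1, X2, X3."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 3))   # start from the white noise errors
    x[1:, 1] += beta * x[:-1, 2]      # X2(t) = beta * X3(t-1) + eps2(t)
    x[1:, 0] += alpha * x[:-1, 1]     # X1(t) = alpha * X2(t-1) + eps1(t)
    return x

def fit_var(x, p):
    """Least-squares VAR(p) fit.

    Returns phi with phi[u-1][b, a] = coefficient of X_a(t-u)
    in the equation for X_b(t) (zero-based component indices).
    """
    n, d = x.shape
    # Row i of the design matrix holds [x(t-1), ..., x(t-p)] for t = p + i.
    design = np.hstack([x[p - u : n - u] for u in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(design, x[p:], rcond=None)   # shape (p*d, d)
    return coef.T.reshape(d, p, d).transpose(1, 0, 2)

alpha, beta = 0.8, 0.6                       # illustrative values (assumption)
x = simulate_chain(20000, alpha, beta)
phi_full = fit_var(x, p=2)                   # trivariate fit on (X1, X2, X3)
phi_pair = fit_var(x[:, [0, 2]], p=2)        # bivariate fit on (X1, X3)

print("X3 -> X1, trivariate fit:", phi_full[:, 0, 2])   # both lags near zero
print("X3 -> X1, bivariate fit: ", phi_pair[:, 0, 1])   # lag 2 near alpha*beta
```

In the trivariate fit the coefficients of X3 in the equation for X1 are close to zero, while in the bivariate fit the lag-2 coefficient is close to αβ, in line with the representation above.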

(b) Path diagrams

The Granger-causal relationships in a vector autoregressive process XV can be represented graphically by a path diagram in which vertices v correspond to the components Xv, while directed edges (→) between the vertices indicate Granger-causal influences. For a complete graphical description of the dependence structure of XV, we also include undirected edges (---) to depict contemporaneous correlations between the corresponding components. This leads to the following definition of path diagrams associated with vector autoregressions (cf. Eichler 2002).

Let XV be a weakly stationary time-series with the autoregressive representation (2.1). Then the path diagram associated with XV is a graph G=(V, E) with vertex set V and edge set E such that for a, b∈V with a≠b

  1. a→b ∉ E ⇔ Φba(u)=0 for u=1,2,…;

  2. a---b ∉ E ⇔ Σab=0.

In other words, the path diagram G contains a directed edge a→b if, and only if, Xa Granger-causes Xb with respect to the full series XV. Similarly, an undirected edge a---b is present in the path diagram if, and only if, Xa and Xb are contemporaneously correlated with respect to XV.
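For a finite-order autoregression with known parameters, the two conditions of the definition translate directly into a scan of the coefficient and covariance matrices. The sketch below assumes exact parameters and a hypothetical helper `path_diagram`; with estimated parameters the zero checks would have to be replaced by statistical tests, which are not shown here.

```python
import numpy as np

def path_diagram(phi, sigma, tol=1e-12):
    """Return (directed, undirected) edge sets of the path diagram.

    phi   : array of shape (p, d, d); phi[u-1][b, a] multiplies X_a(t-u)
            in the equation for X_b(t) (zero-based indices)
    sigma : (d, d) covariance matrix of the white noise errors
    Vertices are numbered 1..d as in the text.
    """
    phi = np.asarray(phi, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    d = sigma.shape[0]
    directed, undirected = set(), set()
    for a in range(d):
        for b in range(d):
            if a == b:
                continue                      # self-loops are omitted
            if np.any(np.abs(phi[:, b, a]) > tol):
                directed.add((a + 1, b + 1))  # edge a -> b
            if a < b and abs(sigma[a, b]) > tol:
                undirected.add((a + 1, b + 1))  # edge a --- b
    return directed, undirected

# System (2.2) with illustrative parameter values (assumption).
alpha, beta = 0.8, 0.6
phi = np.zeros((1, 3, 3))
phi[0, 0, 1] = alpha    # X1(t) depends on X2(t-1)
phi[0, 1, 2] = beta     # X2(t) depends on X3(t-1)
directed, undirected = path_diagram(phi, np.eye(3))
print(sorted(directed), sorted(undirected))
```

For system (2.2) this yields the directed edges 2→1 and 3→2 and no undirected edges.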

As an example, we consider the trivariate vector autoregression in equation (2.2). The associated path diagram is depicted in figure 1. Here the directed path from 3 to 1 via 2 reflects the fact that X3 Granger-causes X1 if X2 is not included in the analysis. A more complicated path diagram is obtained for the following trivariate system:

$$\left.\begin{aligned} X_1(t)&=\Phi_{12}X_2(t-1)+\Phi_{13}X_3(t-1)+\epsilon_1(t),\\ X_2(t)&=\Phi_{23}X_3(t-1)+\epsilon_2(t),\\ X_3(t)&=\Phi_{32}X_2(t-1)+\Phi_{33}X_3(t-1)+\epsilon_3(t), \end{aligned}\right\} \tag{2.3}$$

with Σ13=Σ23=0. The associated path diagram in figure 2 provides a concise summary of the interactions and shows for example a feedback relation between X2 and X3. Note that the dependence of X3 on past values of itself is not depicted in the diagram. It follows from the results in §3 that inclusion of such self-loops v→v would not change the properties of the graph and we therefore omit them for the sake of simplicity.

Figure 1. A trivariate VAR(1) model with independent errors and its associated multivariate path diagram.

Figure 2. Multivariate path diagram associated with the VAR(1) model in (2.3). For each edge that is absent in the graph the corresponding zero constraint on the parameters is given.

(c) Bivariate Granger causality

Much of the literature on Granger causality has been concerned with the analysis of relationships between two time-series (or two vector time-series) and, as a consequence, relationships among multiple time-series are still quite frequently investigated using bivariate Granger causality; examples involving EEG or fMRI signals can be found in Kamiński et al. (2001), Goebel et al. (2003) and Hesse et al. (2003).

For a better understanding of this approach and its relation to a full multivariate analysis based on multivariate Granger causality, we now introduce a graphical representation of the bivariate connectivity structure. Suppose that XV is a weakly stationary process of the form (2.1). Then for a, b∈V the bivariate subprocess X{a,b} is again a weakly stationary process and has an autoregressive representation:

$$\left.\begin{aligned} X_a(t)&=\sum_{u=1}^{\infty}\tilde{\Phi}_{aa}(u)X_a(t-u)+\sum_{u=1}^{\infty}\tilde{\Phi}_{ab}(u)X_b(t-u)+\tilde{\epsilon}_a(t),\\ X_b(t)&=\sum_{u=1}^{\infty}\tilde{\Phi}_{ba}(u)X_a(t-u)+\sum_{u=1}^{\infty}\tilde{\Phi}_{bb}(u)X_b(t-u)+\tilde{\epsilon}_b(t), \end{aligned}\right\} \tag{2.4}$$

where ϵ˜(t)=(ϵ˜a(t),ϵ˜b(t)) is a white noise process with covariance matrix Σ˜. From this representation, it follows that Xa is bivariately Granger-non-causal for Xb if, and only if, the coefficients Φ˜ba(u) are zero for all lags u. This leads to the following definition of bivariate path diagrams that visualize the bivariate connectivities of XV.

Let XV be a weakly stationary time-series of the form (2.1). Then the bivariate path diagram associated with XV is a graph G=(V, E) with vertex set V and edge set E such that for all a, b∈V with a≠b

  1. a⤍b ∉ E ⇔ Φ˜ba(u)=0 for u=1,2,…,

  2. a---b ∉ E ⇔ Σ˜ab=0,

where Φ˜(u), u=1, 2,… and Σ˜ are the parameters in the autoregressive representation (2.4) of the bivariate subprocess X{a,b}.

As an example, we consider again the trivariate system in (2.2). As noted before, the bivariate autoregressive representation of X1 and X3 is given by

$$\begin{aligned} X_1(t)&=\alpha\beta X_3(t-2)+\tilde{\epsilon}_1(t),\\ X_3(t)&=\epsilon_3(t), \end{aligned}$$

with ϵ˜1(t)=ϵ1(t)+αϵ2(t−1), which implies that X3 Granger-causes X1. Furthermore, it can be shown that the bivariate representation of the subprocess X{1,2} is given by

$$\begin{aligned} X_1(t)&=\alpha X_2(t-1)+\epsilon_1(t),\\ X_2(t)&=\tilde{\epsilon}_2(t), \end{aligned}$$

with ϵ˜2(t)=ϵ2(t)+βϵ3(t−1). Hence, X2 Granger-causes X1 in a bivariate analysis. Finally, the autoregressive representation of X{2,3} is determined by the second and the third equation in (2.2) and, thus, X3 Granger-causes X2. The bivariate Granger-causal relationships between the variables can be summarized by the path diagram in figure 3b.

Figure 3. Graphical representations for the trivariate system in (2.2). (a) Multivariate path diagram associated with the autoregressive representation of the trivariate series X{1,2,3}. (b) Bivariate path diagram associated with autoregressive representations of bivariate series X{1,2}, X{1,3} and X{2,3}.

We note that in general the relation between the bivariate representations (2.4) and the multivariate representation (2.1) is more complicated than in this example and an analytic derivation of the bivariate representations would be very difficult to obtain.

3. Markov properties

The edges in the path diagrams discussed in this paper represent pairwise Granger-causal relationships with respect to either the complete process in the case of multivariate path diagrams or bivariate subprocesses in the case of path diagrams depicting the bivariate connectivity structure. The results in this section show that both types of path diagrams more generally provide sufficient conditions for Granger-non-causal relations with respect to subseries XS for any subset S of V. The proofs of the results are technical and therefore omitted; for multivariate path diagrams the proofs can be found in Eichler (2002), for bivariate path diagrams they can be obtained from the author upon request.

(a) Multivariate path diagrams

We first review a path-oriented concept of separating subsets of vertices in a mixed graph that has been used to represent the Markov properties of linear structural equation systems (Spirtes et al. 1998; Koster 1999). Following Richardson (2003), we will call this notion of separation in mixed graphs m-separation.

Let G=(V, E) be a mixed graph with directed edges → and undirected edges ---. A path π between two vertices a and b in G is a sequence π=〈e1,…,en〉 of edges ei∈E such that ei is an edge between vi−1 and vi for some sequence of vertices v0=a, v1,…,vn=b. We say that a and b are the endpoints of the path, while v1,…,vn−1 are the intermediate vertices on the path. Note that the vertices vi in the sequence do not need to be distinct and, therefore, that paths may be self-intersecting.

An intermediate vertex c on a path π is said to be a collider on the path if the edges preceding and succeeding c on the path both have an arrowhead or a dashed tail at c, that is, →c←, →c---, ---c←, ---c---; otherwise the vertex c is said to be a non-collider on the path. A path π between vertices a and b is said to be m-connecting given a set C if

  1. every non-collider on the path is not in C and

  2. every collider on the path is in C,

otherwise we say the path is m-blocked given C. If all paths between a and b are m-blocked given C, then a and b are said to be m-separated given C. Similarly, sets A and B are said to be m-separated in G given C if for every pair a∈A and b∈B, a and b are m-separated given C.

As an example consider the graph in figure 4. In this graph, vertices 1 and 4 are m-separated given S={3}. To show this, we examine all paths between the two vertices. First, we consider the path 4→3→1. Since 3 is a non-collider on this path, the path is m-blocked given {3}. Second, we note that every path that passes through vertex 2 contains this vertex as a collider. Two examples of such paths are given in figure 4b,c. Since 2 lies outside S={3}, these paths are m-blocked given S. The only path between 1 and 4 that does not pass through 2 is the path 4→3→1, which is also m-blocked given S. Hence, no path exists that is m-connecting given S and consequently 1 and 4 are m-separated given S.

Figure 4. Illustration of m-separation in mixed graphs: vertices 1 and 4 are m-separated given S={3} since all paths between 1 and 4 are m-blocked given S. (a) Path 4→3→1 is m-blocked by non-collider 3∈S. (b) Path 4→2←1 is m-blocked by collider 2∉S. (c) Path 4→3---2←1 is m-blocked by collider 2∉S.
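Because paths may be self-intersecting, m-separation cannot be checked by listing simple paths only. One way to decide it, sketched below under stated assumptions, is a breadth-first search over states (vertex, mark of the incoming edge at that vertex). The edge encoding, the function name `m_connected` and the edge list `fig4` are illustrative; the edges of figure 4 are read off the paths named in the text and caption, since the figure itself is not reproduced here.

```python
from collections import deque

# Marks of each edge type at (source, target):
# "->" solid directed, "-->" dashed directed, "---" undirected (dashed).
MARKS = {"->": ("tail", "head"), "-->": ("dash", "head"), "---": ("dash", "dash")}

def m_connected(edges, a, b, cond):
    """True if some path between a and b is m-connecting given cond."""
    moves = {}   # vertex v -> list of (mark at v, neighbour w, mark at w)
    for u, v, kind in edges:
        mu, mv = MARKS[kind]
        moves.setdefault(u, []).append((mu, v, mv))
        moves.setdefault(v, []).append((mv, u, mu))
    cond = set(cond)
    queue = deque((w, mw) for _, w, mw in moves.get(a, []))
    seen = set(queue)
    while queue:
        v, mark_in = queue.popleft()
        if v == b:
            return True
        for mark_out, w, mark_w in moves[v]:
            # v is a collider if both edges meet it with an arrowhead
            # or a dashed tail, i.e. neither mark is a solid tail.
            collider = mark_in != "tail" and mark_out != "tail"
            # Passing v is allowed if (collider and v in cond) or
            # (non-collider and v not in cond).
            if collider == (v in cond) and (w, mark_w) not in seen:
                seen.add((w, mark_w))
                queue.append((w, mark_w))
    return False

# Edges assumed for figure 4: 4->3, 3->1, 4->2, 1->2 and 3---2.
fig4 = [(4, 3, "->"), (3, 1, "->"), (4, 2, "->"), (1, 2, "->"), (3, 2, "---")]

print(m_connected(fig4, 1, 4, {3}))      # vertices 1 and 4 given S={3}
print(m_connected(fig4, 1, 4, set()))    # path 4->3->1 is open without 3
print(m_connected(fig4, 1, 4, {2, 3}))   # conditioning on collider 2 opens 4->2<-1
```

The search terminates because there are only finitely many (vertex, mark) states, even though the set of paths is infinite.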

Now suppose that XV is a weakly stationary time-series with autoregressive representation (2.1) and let G be its associated multivariate path diagram. Furthermore, let A, B and C be disjoint subsets of V. Then, if A and B are m-separated given C, it can be shown (Eichler 2002, theorem 3.1) that XA(t) and XB(t+u) are uncorrelated at all lags u after removing the linear effects of the complete series XC. For example, in the path diagram in figure 3a associated with the trivariate process in (2.2), vertices 1 and 3 are m-separated given S={2}. This implies that the components X1 and X3 are uncorrelated conditionally on X2 but not in a bivariate analysis.

On the other hand, we have shown in §2a that in a bivariate analysis X3 Granger-causes X1 whereas X1 is Granger-non-causal for X3 (as indicated by the bivariate path diagram in figure 3b). Obviously, the notion of m-separation is too strong for the derivation of Granger-non-causal relations from multivariate path diagrams: the definition of m-separation requires that all paths between vertices 1 and 3 are m-blocked whereas the path 3→2→1 is intuitively interpreted as a causal link from X3 to X1. Consequently, the path should not be considered when discussing Granger causality from X1 to X3.

The example suggests the following definition. A path π between vertices a and b is said to be b-pointing if it has an arrowhead at the endpoint b. More generally, a path π between A and B is said to be B-pointing if it is b-pointing for some b∈B. In order to establish Granger non-causality from XA to XB, it suffices to only consider all B-pointing paths between A and B.

Theorem 3.1. Suppose XV is a weakly stationary time-series with autoregressive representation (2.1) and let G=(V, E) be the path diagram associated with XV. Furthermore, let A, B and C be disjoint subsets of V. If every B-pointing path between A and B is m-blocked given B∪C, then XA is Granger-non-causal for XB with respect to XA∪B∪C.

Similarly, a graphical condition for the absence of contemporaneous correlation can be obtained. Intuitively, two variables Xa and Xb are contemporaneously uncorrelated with respect to XS if they are contemporaneously uncorrelated with respect to XV and the variables are not jointly affected by past values of XV\S. For a precise formulation of the conditions, we need the following definition. A path π between vertices a and b is said to be bi-pointing if it has an arrowhead at both endpoints a and b. Then a sufficient condition for XA and XB to be contemporaneously uncorrelated can be stated as follows.

Theorem 3.2. Suppose XV is a weakly stationary time-series with autoregressive representation (2.1) and let G=(V, E) be the path diagram associated with XV. Furthermore, let A, B and C be disjoint subsets of V. If

  1. a---b ∉ E for all a∈A and b∈B and

  2. every bi-pointing path between A and B is m-blocked given A∪B∪C,

then XA and XB are contemporaneously uncorrelated with respect to XA∪B∪C.

As an example, consider the four-dimensional vector autoregressive process XV with components

$$\left.\begin{aligned} X_1(t)&=\alpha X_4(t-2)+\epsilon_1(t),\\ X_2(t)&=\beta X_4(t-1)+\gamma X_3(t-1)+\epsilon_2(t),\\ X_3(t)&=\epsilon_3(t),\\ X_4(t)&=\epsilon_4(t), \end{aligned}\right\} \tag{3.1}$$

where ϵi(t), i=1,…,4 are uncorrelated white noise processes with mean zero and variance one. The path diagram associated with XV is shown in figure 5a. In this graph, vertices 1 and 3 are connected by the path 3→2←4→1. On this path, vertex 2 is a collider, whereas vertex 4 is a non-collider. Thus, the path is m-connecting given the set {2} but m-blocked given {1}, the conditioning set relevant for a bivariate analysis of X1 and X3. It follows from theorem 3.1 that X3 is Granger-non-causal for X1 in a bivariate analysis, whereas the same conclusion cannot be drawn for a trivariate analysis including X2. Furthermore, vertices 1 and 2 are connected by the bi-pointing path 1←4→2, which is only m-blocked given vertex 4. Therefore, it follows by theorems 3.1 and 3.2 that X1 and X2 Granger-cause each other and, additionally, are contemporaneously correlated regardless of whether X3 is included in the analysis or not. The Granger-causal relationships, with respect to X{1,2,3}, that can be inferred from the path diagram in figure 5a can be summarized by the graph in figure 5b.

Figure 5. Illustration of the global Granger-causal Markov property. (a) Multivariate path diagram G{1,2,3,4} of complete series X{1,2,3,4} of the system in (3.1). (b) Graphical representation of those relations with respect to the subseries X{1,2,3} that are encoded in G{1,2,3,4}. (c) True multivariate path diagram G{1,2,3} of X{1,2,3} depicting all relations with respect to the subseries X{1,2,3} that hold for X{1,2,3}. (d) Bivariate path diagram of X{1,2,3} depicting the bivariate relations among components X1, X2 and X3.
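The graphical conditions of theorems 3.1 and 3.2 can be evaluated mechanically for single vertices a and b. The sketch below is one possible implementation under stated assumptions: it searches for a b-pointing (or bi-pointing) m-connecting path with a breadth-first search over (vertex, incoming mark) states. All function names are hypothetical, and the edge list `fig5a` (4→1, 4→2, 3→2) is read off system (3.1). A return value of True means the non-causal relation is implied by the graph; False only means that the sufficient condition fails.

```python
from collections import deque

# Marks of each edge type at (source, target).
MARKS = {"->": ("tail", "head"), "-->": ("dash", "head"), "---": ("dash", "dash")}

def pointing_path_exists(edges, a, b, cond, bi_pointing=False):
    """Is there a b-pointing (optionally bi-pointing) path between a and b
    that is m-connecting given cond?"""
    moves = {}   # vertex v -> list of (mark at v, neighbour w, mark at w)
    for u, v, kind in edges:
        mu, mv = MARKS[kind]
        moves.setdefault(u, []).append((mu, v, mv))
        moves.setdefault(v, []).append((mv, u, mu))
    cond = set(cond)
    # A bi-pointing path must start with an arrowhead at a.
    queue = deque((w, mw) for ma, w, mw in moves.get(a, [])
                  if not bi_pointing or ma == "head")
    seen = set(queue)
    while queue:
        v, mark_in = queue.popleft()
        if v == b and mark_in == "head":
            return True                  # the path ends with an arrowhead at b
        for mark_out, w, mark_w in moves[v]:
            collider = mark_in != "tail" and mark_out != "tail"
            if collider == (v in cond) and (w, mark_w) not in seen:
                seen.add((w, mark_w))
                queue.append((w, mark_w))
    return False

def noncausality_implied(edges, a, b, c=()):
    """Condition of theorem 3.1 for A={a}, B={b}, C=c."""
    return not pointing_path_exists(edges, a, b, {b, *c})

def uncorrelated_implied(edges, a, b, c=()):
    """Conditions of theorem 3.2 for A={a}, B={b}, C=c."""
    has_edge = any(k == "---" and {u, v} == {a, b} for u, v, k in edges)
    return not has_edge and not pointing_path_exists(
        edges, a, b, {a, b, *c}, bi_pointing=True)

fig5a = [(4, 1, "->"), (4, 2, "->"), (3, 2, "->")]   # path diagram of (3.1)

print(noncausality_implied(fig5a, 3, 1))        # bivariate analysis of X1, X3
print(noncausality_implied(fig5a, 3, 1, {2}))   # trivariate analysis with X2
print(uncorrelated_implied(fig5a, 1, 2, {3}))   # X1, X2 within X{1,2,3}
```

For figure 5a the first call reports that X3 is Granger-non-causal for X1 in a bivariate analysis, while the other two conditions fail because of the paths 3→2←4→1 and 1←4→2.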

More generally, if a mixed graph G encodes certain Granger-non-causal relations of a process XV, we say that XV satisfies a Markov property with respect to the graph G.

We say that a weakly stationary time-series XV satisfies the global Granger-causal Markov property with respect to a mixed graph G if the following conditions hold for all disjoint subsets A, B and C of V:

  1. If in the graph G every B-pointing path between A and B is m-blocked given B∪C, then XA is Granger-non-causal for XB with respect to XA∪B∪C.

  2. If the sets A and B are not connected by an undirected edge (---) in graph G and every bi-pointing path between A and B is m-blocked given A∪B∪C, then XA and XB are contemporaneously uncorrelated with respect to XA∪B∪C.

With this definition, theorems 3.1 and 3.2 state that a weakly stationary time-series XV with the autoregressive representation (2.1) satisfies the global Granger-causal Markov property, with respect to its multivariate path diagram G.

For the four-dimensional time-series in (3.1), we have shown above that the Granger-causal relationships with respect to the trivariate subseries X{1,2,3} are encoded by the graph in figure 5b. It follows from theorems 3.1 and 3.2 that the trivariate subprocess X{1,2,3} satisfies the global Granger-causal Markov property with respect to the graph in figure 5b. On the other hand, it can be shown that X{1,2,3} is given by

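$$\begin{aligned} X_1(t)&=\frac{\alpha\beta}{1+\beta^2}X_2(t-1)-\frac{\alpha\beta\gamma}{1+\beta^2}X_3(t-2)+\tilde{\epsilon}_1(t),\\ X_2(t)&=\gamma X_3(t-1)+\tilde{\epsilon}_2(t),\\ X_3(t)&=\epsilon_3(t), \end{aligned}$$

where ϵ˜2(t)=ϵ2(t)+βX4(t−1) and

$$\tilde{\epsilon}_1(t)=\epsilon_1(t)-\frac{\alpha\beta}{1+\beta^2}\epsilon_2(t-1)+\frac{\alpha}{1+\beta^2}X_4(t-2).$$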

Simple calculations show that ϵ˜(t)=(ϵ˜1(t),ϵ˜2(t),ϵ3(t)) is indeed a white noise process with uncorrelated components. Thus, the path diagram associated with the trivariate process X{1,2,3} is given by the graph in figure 5c. Note that this graph is a subgraph of the graph in figure 5b, which has been derived from the path diagram of the complete series XV: theorems 3.1 and 3.2 provide only sufficient, not necessary, conditions for Granger non-causality with respect to subprocesses.
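The equation for X1 in the trivariate representation can be traced back to a linear projection of the unobserved term X4(t−2) onto the observed past. The following worked sketch uses only system (3.1) and the unit-variance assumption on the errors; the auxiliary variable W and the hat notation for the best linear predictor are introduced here for illustration. Among the observed past values, only X2(t−1) is correlated with X4(t−2), and X3(t−2) enters because it is needed to isolate the X4 term in X2(t−1).

```latex
\begin{align*}
W(t-1) &:= X_2(t-1)-\gamma X_3(t-2) = \beta X_4(t-2)+\epsilon_2(t-1),\\
\widehat{X}_4(t-2) &= \frac{\operatorname{cov}\{X_4(t-2),W(t-1)\}}{\operatorname{var}\{W(t-1)\}}\,W(t-1)
  = \frac{\beta}{1+\beta^2}\,\bigl\{X_2(t-1)-\gamma X_3(t-2)\bigr\},\\
X_4(t-2)-\widehat{X}_4(t-2) &= \frac{1}{1+\beta^2}X_4(t-2)-\frac{\beta}{1+\beta^2}\epsilon_2(t-1),\\
X_1(t) &= \alpha\widehat{X}_4(t-2)+\alpha\bigl\{X_4(t-2)-\widehat{X}_4(t-2)\bigr\}+\epsilon_1(t)\\
 &= \frac{\alpha\beta}{1+\beta^2}X_2(t-1)-\frac{\alpha\beta\gamma}{1+\beta^2}X_3(t-2)
   +\underbrace{\epsilon_1(t)-\frac{\alpha\beta}{1+\beta^2}\epsilon_2(t-1)+\frac{\alpha}{1+\beta^2}X_4(t-2)}_{\tilde{\epsilon}_1(t)}.
\end{align*}
```

The coefficient of X3(t−2) is non-zero whenever α, β and γ are non-zero, which is why the edge from 3 to 1 appears in the path diagram of the trivariate subprocess although X3 has no influence on X1 in the full system.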

(b) Bivariate path diagrams

Next, we discuss the properties of the bivariate path diagrams introduced in §2c. Recall that these path diagrams may have two kinds of edges, namely dashed directed edges ⤍ and undirected edges ---. We note that the choice of dashed directed edges to represent bivariate Granger-causal relationships allows the application of the concept of m-separation without further modifications. More precisely, let G be a mixed graph with directed edges ⤍ and undirected edges --- and let π be a path in G. Then the intermediate vertices on π can be characterized as colliders and non-colliders as in §3a, that is, an intermediate vertex c on the path π is said to be a collider if the edges preceding and succeeding c on the path both have an arrowhead or a dashed tail at c. Since G contains only edges of the form ⤍ or ---, it follows that all paths π in G are pure-collider paths, that is, all intermediate vertices are colliders.

In §3a, we have shown that the concepts of m-separation and of pointing paths can be used to derive Granger-non-causal relations with respect to subprocesses from multivariate path diagrams. The same is also true for bivariate path diagrams. More precisely, we have the following result.

Theorem 3.4. Let XV be a weakly stationary time-series with autoregressive representation (2.1) and let G be the bivariate path diagram of XV. Then XV satisfies the global Granger-causal Markov property with respect to G.

As an example, consider the four-dimensional process in (3.1) and suppose that variable X4 has not been observed. Simple calculations show that X3 is bivariately Granger-causal for X2, which, in turn, bivariately Granger-causes X1. Replacing X4(t−2) with ϵ4(t−2) in the equation for X1(t) in (3.1), we find on the other hand that X1(t) depends only on ϵ1(t) and ϵ4(t−2) while X3(t)=ϵ3(t). Since the white noise processes ϵi(t) are assumed to be uncorrelated, it follows that X1 and X3 are completely unrelated. In particular, this implies that X3 does not bivariately Granger-cause X1.

The corresponding bivariate path diagram is shown in figure 5d. In this path diagram, the absence of the edge 3⤍1 implies that X3 is bivariately Granger-non-causal for X1. On the other hand, vertices 1 and 3 are connected by the 1-pointing path 3⤍2⤍1. It follows from theorem 3.4 that, in a trivariate analysis based on X{1,2,3}, the series X3 Granger-causes X1. Similarly, the 2-pointing path 1⤌2⤌3⤍2 indicates that X1 Granger-causes X2 with respect to X{1,2,3}. Since this path is also bi-pointing, the diagram also indicates that X1 and X2 are contemporaneously correlated with respect to X{1,2,3}. Altogether, we find that the bivariate path diagram in figure 5d implies all relations that are encoded by the graph in figure 5b and additionally that X3 is bivariately Granger-non-causal for X1.

(c) Comparison of bivariate and multivariate Granger causality

The notion of Granger causality is based on the idea that a correlation between two variables that cannot be explained otherwise must be a causal influence; the temporal ordering then determines the direction of the causal link. This approach requires that all relevant information is included in the analysis. Given data from a multivariate time-series XV, it therefore seems plausible to discuss Granger causality with respect to the full multivariate process XV.

As a first example, we consider again the trivariate system in (2.2). Here, the multivariate path diagram (figure 3a) describes the effective connectivity among the components correctly, whereas it is not clear from the bivariate path diagram (figure 3b) whether X3 has a direct influence on X1 or affects X1 only indirectly via X2. The situation becomes worse for the following trivariate system

$$\left.\begin{aligned} X_1(t)&=\alpha X_2(t-2)+\epsilon_1(t),\\ X_2(t)&=\epsilon_2(t),\\ X_3(t)&=\beta X_2(t-1)+\epsilon_3(t), \end{aligned}\right\} \tag{3.2}$$

where the error series are again assumed to be uncorrelated. The corresponding path diagrams are depicted in figure 6. Here, a bivariate analysis suggests a causal link from X3 to X1, although the observed correlation between X1 and X3 is only due to confounding by X2. A trivariate analysis correctly shows no direct connections between X1 and X3. This phenomenon that an observed Granger-causal relation between two variables vanishes after adding variables to the information set is called spurious causality of type II (Hsiao 1982).

Figure 6. Graphical representations of the connectivity of the trivariate system in (3.2). (a) Multivariate path diagram associated with the autoregressive representation of the trivariate series X{1,2,3}. (b) Bivariate path diagram associated with autoregressive representations of bivariate series X{1,2}, X{1,3}, and X{2,3}.
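Spurious causality of type II in system (3.2) can be illustrated with the variance-of-forecast-error formulation of Granger causality from §2a. The sketch below is an illustration under assumptions, not the analysis of this paper: it uses simulated data with illustrative values α=0.8, β=0.6 and unit error variances, a fixed lag order of 2, and a hypothetical helper `residual_variance` that returns the in-sample variance of least-squares one-step forecast errors.

```python
import numpy as np

def residual_variance(target, predictors, p):
    """In-sample variance of the least-squares one-step forecast error of
    target(t) given lags 1..p of every series in predictors."""
    n = len(target)
    design = np.column_stack(
        [s[p - u : n - u] for s in predictors for u in range(1, p + 1)])
    y = target[p:]
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(np.var(y - design @ coef))

rng = np.random.default_rng(1)
n, alpha, beta = 20000, 0.8, 0.6          # illustrative values (assumption)
eps = rng.standard_normal((n, 3))
x2 = eps[:, 1]                            # X2(t) = eps2(t)
x1 = eps[:, 0].copy()
x1[2:] += alpha * x2[:-2]                 # X1(t) = alpha * X2(t-2) + eps1(t)
x3 = eps[:, 2].copy()
x3[1:] += beta * x2[:-1]                  # X3(t) = beta * X2(t-1) + eps3(t)

p = 2
v_own = residual_variance(x1, [x1], p)               # past of X1 only
v_biv = residual_variance(x1, [x1, x3], p)           # bivariate: add X3
v_no_x3 = residual_variance(x1, [x1, x2], p)         # X1 and X2, without X3
v_tri = residual_variance(x1, [x1, x2, x3], p)       # trivariate: add X3

print("gain from X3, bivariate :", v_own - v_biv)
print("gain from X3, trivariate:", v_no_x3 - v_tri)
```

In the bivariate comparison the past of X3 clearly reduces the forecast error variance of X1, whereas once the past of X2 is included the additional reduction is negligible (in-sample variances can never increase when predictors are added, so a small positive value is expected).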

A serious problem that arises in practice is the omission of relevant variables from the analysis (e.g. because the variables could not be measured). As an example, we consider again the situation described in §3b, where we have examined the connectivity structure of the trivariate subprocess X{1,2,3} of the four-dimensional system (3.1). Here, the multivariate path diagram (figure 5c) indicates the presence of a direct causal link from X3 to X1 although this Granger-causal influence vanishes in a bivariate analysis. In this situation, the bivariate path diagram (figure 5d) clearly provides a better graphical description of the effective connectivity among the variables than the multivariate path diagram. This phenomenon that a Granger-causal relationship vanishes after the information set has been reduced is called spurious causality of type I (Hsiao 1982).

More generally, consider a system in which all relationships between the observed variables are due to confounding by latent variables. It can be shown that two observed components are unrelated whenever they are not confounded by the same latent variables. Conversely, this implies that two observed variables are dependent only if they are affected by a common confounder. Thus, the effective connectivity of such a system can be best investigated by bivariate analyses.

4. Latent variables and spurious causality

The discussion in §3 has shown that multivariate path diagrams are best suited for the representation of structures that do not involve confounding by latent variables, while the connectivity of a system where all relationships between the observed variables are induced by confounding is best described by a bivariate path diagram. In practice, however, causal structures may be a combination of both situations with only a part of the Granger-causal relationships being due to confounding by latent variables. In such cases neither graphical representation would provide a correct description of the dependencies among the observed variables. As an example, consider the following four-dimensional system:

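$$\left.\begin{aligned} X_1(t)&=\alpha X_2(t-1)+\epsilon_1(t),\\ X_2(t)&=\beta Z(t-2)+\epsilon_2(t),\\ X_3(t)&=\gamma Z(t-1)+\delta X_4(t-1)+\epsilon_3(t),\\ X_4(t)&=\epsilon_4(t), \end{aligned}\right\} \tag{3.3}$$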

where ϵi, i=1,…,4 and Z are independent white noise series. Simple calculations show that the multivariate and the bivariate path diagrams are given by the graphs in figure 7a,b, respectively. Neither graph depicts the effective connectivity of the system correctly. Moreover, both graphs describe different aspects of the dependence structure. From the multivariate path diagram, we learn that X3 is Granger-non-causal for X1 with respect to X{1,2,3}, while the same Granger-non-causal relation cannot be derived from the bivariate path diagram. On the other hand, the bivariate path diagram implies that X4 is bivariately Granger-non-causal for X2, which is not encoded in the multivariate path diagram.

Figure 7.

Graphical representations for the four-dimensional system in (3.3). (a) Multivariate path diagram. (b) Bivariate path diagram. (c) General path diagram.

(a) General path diagrams

We now propose a new graphical representation that is based on the global Granger-causal Markov property and combines elements of both multivariate and bivariate representations. More precisely, we consider graphs that may contain three types of edges, namely →, ⤍, and ---. For the previous example, such a graph is shown in figure 7c. Unlike the graphs in figure 7a,b, this graph encodes both that X4 is bivariately Granger-non-causal for X2 and that X3 is Granger-non-causal for X1 with respect to X{1,2,3}.

Suppose that observations from a multivariate time-series XS that is part of a larger system XV are available. A necessary condition for a graph G=(S, E) to serve as a model for the effective connectivity of XS is that it does not imply any independence relations that do not hold for XS.

Definition 4.1. A mixed graph G is consistent with XS if XS satisfies the global Granger-causal Markov property with respect to G.

Obviously, this condition is not sufficient for the identification of effective connectivity since the Markov property trivially holds for a saturated graph with all possible edges included. So how can we determine a graph that depicts the dependencies between the observed time-series as closely as possible? We first note that the type of a directed edge, → or ⤍, cannot be determined from the pairwise Granger-non-causal relations between the two corresponding variables alone. For instance, suppose that X{a,b} is a bivariate process such that Xa Granger-causes Xb but not vice versa. From this information, it is impossible to decide between the two graphs a→b and a⤍b since both encode the same Granger-non-causal relations.

On the other hand, consider again the four-dimensional process (3.1) and suppose, as in §3b, that only the subprocess XS=X{1,2,3} has been observed. The multivariate path diagram G associated with XS is depicted in figure 8a. As noted in §3b, the variable X3 does not Granger-cause X1 in a bivariate analysis. While this Granger-non-causal relation cannot be derived from graph a in figure 8, it is implied by graphs c and d. This suggests that the latter two diagrams might be better representations of the effective connectivity. Thus, the type of the directed edge from 2 to 1 is determined by the bivariate relationship between components X3 and X1.

Figure 8.

Illustration of minimal consistent graphs for the trivariate subseries X{1,2,3} of the system in (3.3). (a) Multivariate path diagram G{1,2,3} associated with X{1,2,3}—this graph is consistent with X{1,2,3} but does not imply that X3 is bivariately Granger-non-causal for X1. (b) A graph G′, which is Markov equivalent to G{1,2,3}. (c) and (d) Two minimal consistent graphs, which both correctly encode that X3 is bivariately Granger-non-causal for X1. (e) Two inconsistent graphs which falsely encode that X3 is Granger-non-causal for X1 with respect to X{1,2,3}.

This reasoning can be formalized in terms of so-called minimal consistent graphs, which have been introduced by Pearl (2000) who addressed the problem of inferring causal effects from multivariate distributions satisfying certain conditional independence relations. Note that, in contrast to the graphs described here, the causal structures discussed by Pearl need to be directed acyclic graphs.

For a mixed graph G=(S, E), let J(G) be the set of all independence relations between the variables in XS that are encoded in the graph according to the global Granger-causal Markov property. Then for two mixed graphs G and G′, we write G ≼ G′ if J(G′) ⊆ J(G), that is, the graph G encodes the same or more independence relations than the graph G′. Furthermore, if G ≼ G′ and G′ ≼ G then J(G′)=J(G) and we say that the two graphs are Markov equivalent (with respect to the global Granger-causal Markov property). In this case, we will write G ∼ G′.

Now if G and G′ are two consistent graphs such that G ≼ G′, we prefer G as a graphical representation of the effective connectivity of XS since it encodes more information about the dependencies between the components of XS. Thus, a graphical representation G is optimal if it is consistent with XS and cannot be further reduced without violating consistency.

Definition 4.2. A graph G is minimal in the class of all graphs consistent with XS if G is consistent and for any other consistent graph G′ we have G′ ∼ G whenever G′ ≼ G.

In other words, any graph G′ that implies further Granger-non-causal relations additional to those implied by a minimal consistent graph G must be inconsistent with XS.
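The ordering of graphs, Markov equivalence and minimality can be illustrated with plain sets. The sketch below assumes that the set J(G) of encoded relations has already been worked out for each consistent graph (deriving it from a graph via m-separation is not shown). The graph names A to D and the relation labels r1 to r4 are hypothetical placeholders and are not taken from figure 8.

```python
def precedes(J_G, J_Gp):
    """G precedes G' (G encodes the same or more relations) iff J(G') is a subset of J(G)."""
    return J_Gp <= J_G

def markov_equivalent(J_G, J_Gp):
    """G and G' are Markov equivalent iff they encode exactly the same relations."""
    return J_G == J_Gp

def minimal_graphs(consistent):
    """Names of graphs G such that every consistent G' preceding G is equivalent to G."""
    return sorted(
        name
        for name, J in consistent.items()
        if all(markov_equivalent(J, Jp) or not precedes(Jp, J)
               for Jp in consistent.values())
    )

# Hypothetical relation sets of four consistent graphs.
graphs = {
    "A": frozenset({"r1", "r2"}),
    "B": frozenset({"r1", "r2"}),          # Markov equivalent to A
    "C": frozenset({"r1", "r2", "r3"}),    # encodes one further relation
    "D": frozenset({"r1", "r2", "r4"}),    # encodes a different further relation
}

if __name__ == "__main__":
    print(minimal_graphs(graphs))
```

With these placeholder sets, two graphs are minimal without being Markov equivalent, so no single graph encodes all relations. This is the same kind of situation the text describes for the trivariate example.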

We illustrate these concepts by application to the system in (3.1). As before we assume that the component X4 has not been observed. Figure 8a displays the corresponding multivariate path diagram G{1,2,3} associated with the trivariate subprocess X{1,2,3}, which by theorems 3.1 and 3.2 is consistent with X{1,2,3}. For the derivation of a minimal consistent graph G ≼ G{1,2,3}, we first note that the edge 2→1 in G{1,2,3} can be replaced by the edge 2⤍1 without changing the set J(G{1,2,3}) of encoded relations. Thus, the resulting graph G′ in figure 8b is Markov equivalent to G{1,2,3}.

Next, we consider the graph G in figure 8c, which is a subgraph of G′ and hence satisfies G ≼ G′. In addition to the relations encoded by G′, this graph G correctly implies that X3 is bivariately Granger-non-causal for X1 since the only 1-pointing path from 3 to 1 is m-blocked given the empty set. Since it does not imply any further relations among the components, it is also consistent with X{1,2,3}. The graph resembles the bivariate path diagram of X{1,2,3} (figure 8d), which by theorem 3.4 is also consistent with X{1,2,3}. Neither graph can be further reduced without violating an existing Granger causality from X3 to X2 or from X2 to X1; therefore, both are minimally consistent with X{1,2,3}. However, we note that the two graphs are not Markov equivalent. For example, the bi-pointing path 1⤌2←3→2 in G is m-connecting given the empty set and hence indicates that X1 bivariately Granger-causes X2. In contrast, the corresponding bi-pointing path 1⤌2⤌3⤍2 in the bivariate path diagram is m-connecting given {3} and, hence, indicates that X1 Granger-causes X2 with respect to X{1,2,3}. In this case, there does not exist an optimal graphical representation that encodes all relations that hold for X{1,2,3}.

Finally, we note that the graphs in figure 8e both falsely imply that X3 is Granger-non-causal for X1 in a trivariate analysis and hence are inconsistent with the process X{1,2,3}.

(b) Identification of minimal consistent graphs

In practice, the determination of minimal consistent graphs must be data-based. Recall that if G′ is consistent with the process XS and G is another graph such that G ≼ G′ but G and G′ are not Markov equivalent, then G implies some Granger-non-causal (or contemporaneous non-correlation) relation that is not encoded in G′. Thus, we can decide between the two graphs G and G′ by testing for this relationship. More precisely, suppose that the graph G implies, in addition to the relations in J(G′), that Xa0 is Granger-non-causal for Xb0 with respect to XA for some A ⊆ S and a0, b0 ∈ A. Since G′ is consistent with XS, the subseries XA has an autoregressive representation:

$$X_A(t) = \sum_{u=1}^{p} \Psi(u)\,X_A(t-u) + \epsilon(t), \qquad \operatorname{var}(\epsilon(t)) = \Omega,$$

where the parameters Ψ(u), u=1,…,p and Ω are constrained with respect to the graph G′, that is,

  (i) Ψba(u)=0 for u=1,…,p whenever G′ implies that Xa is Granger-non-causal for Xb with respect to XA and

  (ii) Ωab=0 whenever G′ implies that Xa and Xb are contemporaneously uncorrelated with respect to XA.

Now, if the graph G is also consistent with XS, we have additionally that Xa0 is Granger-non-causal for Xb0 with respect to XA and hence Ψb0a0(u)=0 for all lags u=1,…,p. Thus, we can determine whether G is consistent with XS by testing the null hypothesis,

$$H_0:\ \Psi_{b_0 a_0}(u) = 0 \quad \text{for all } u = 1,\ldots,p,$$

against the alternative:

$$H_a:\ \Psi_{b_0 a_0}(u) \neq 0 \quad \text{for some } u = 1,\ldots,p.$$

Similarly, if G additionally implies that Xa0 and Xb0 are contemporaneously uncorrelated with respect to XA, we can test for consistency of G by considering the null hypothesis H0:Ωa0b0=0.

The test is based on fitting a VAR(p) model to the subseries XA under the constraints (i) and (ii). These constraints define a path diagram GA=(A, EA) for XA with a→b ∉ EA if G′ implies that Xa is Granger-non-causal for Xb with respect to XA, and a---b ∉ EA if G′ implies that Xa and Xb are contemporaneously uncorrelated with respect to XA. Therefore, the restricted model can be fitted by using the iterative procedure for fitting graphical vector autoregressions described in the Appendix. Noting that for large sample sizes T the estimates Ψˆba=(Ψˆba(1),…,Ψˆba(p))ᵀ are approximately normally distributed with covariance matrix var(Ψˆba) ≈ VΨ/T, we obtain, as a test statistic for the above test problem,

$$S_{ba} = T\,\hat\Psi_{ba}^{\mathsf T}\,\hat V_{\Psi}^{-1}\,\hat\Psi_{ba},$$

where VˆΨ is an estimate for VΨ. Under the null hypothesis H0, the test statistic Sba is asymptotically χ² distributed with p degrees of freedom.

As an example, we consider again the system (3.1) which we discussed in the previous section. Recall that we have found that the graph G′ in figure 8b is Markov equivalent to the multivariate path diagram associated with X{1,2,3} and, thus, is consistent with X{1,2,3}. Furthermore, we have seen that the graph G in figure 8c satisfies G ≼ G′. In order to test whether G is consistent with the data, we note that G′ implies that X1 is bivariately Granger-non-causal for, and bivariately contemporaneously uncorrelated with, X3, whereas G entails that X1 and X3 are completely unrelated. Thus, the graph G′ leads to the following restricted bivariate VAR(p) model for X{1,3}:

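$$X_1(t) = \sum_{u=1}^{p} \Psi_{11}(u)\,X_1(t-u) + \sum_{u=1}^{p} \Psi_{13}(u)\,X_3(t-u) + \epsilon_1(t),$$
$$X_3(t) = \sum_{u=1}^{p} \Psi_{33}(u)\,X_3(t-u) + \epsilon_3(t),$$

with Ω13=cov(ϵ1(t),ϵ3(t))=0. For this example, we obtain as the null hypothesis H0: Ψ13(1)=⋯=Ψ13(p)=0.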

Finally, we note that if two graphs G and G′ are Markov equivalent we cannot decide between the two graphical representations.

(c) Causal effects and spurious causality

The minimal graphs consistent with a process XS describe all causal models that can be used for an explanation of the observed dependencies among the variables in XS. Therefore, if a directed edge a→b is present in all minimal consistent graphs, the Granger-causal relationship from Xa to Xb might be in part, but not completely, attributed to a common explanatory latent variable. This implies the existence of a causal influence of Xa on Xb.

Definition 4.3. Xa has a causal effect on Xb if there exists a directed path from a to b in every minimal causality graph G¯ consistent with XS.

Conversely, if Xa Granger-causes Xb with respect to some subprocess XS′ of XS (including the case S′=S), but no directed path from a to b exists in any minimal graph consistent with XS, then this Granger-causal relationship between Xa and Xb must be spurious.

Definition 4.4. Suppose that Xa Granger-causes Xb with respect to XS. Then we say that Xa spuriously causes Xb with respect to XS if no directed path from a to b exists in any minimal graph consistent with XS.

The concepts presented in this section can be used to evaluate effective connectivity. We note that it may happen that the direct and indirect effects of one variable on another cancel out such that the total effect vanishes. Under such circumstances, the effective connectivity differs from the structural connectivity of the system. Inference about the structural connectivity of a system is only possible under the assumption that the complete system XV (including latent variables) is stable (Pearl 2000) or faithful (Spirtes et al. 2001) with respect to its associated path diagram GV, that is, a Granger-non-causal relation holds for XV if, and only if, it can be derived from GV (and the same for contemporaneous correlation). We note that it is not possible to test whether a system satisfies this assumption.

5. Applications

(a) Simulated example

For an illustration of the application of the concepts developed in this paper, we first consider simulated data (length=512) from a four-dimensional system

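$$
\begin{aligned}
X_1(t) &= \alpha_1 L_1(t-1) + \epsilon_1(t),\\
X_2(t) &= \alpha_2 L_1(t-2) + \alpha_3 L_2(t-1) + \epsilon_2(t),\\
X_3(t) &= \alpha_4 X_4(t-1) + \alpha_5 L_2(t-2) + \epsilon_3(t),\\
X_4(t) &= \alpha_6 X_1(t-1) + \epsilon_4(t),
\end{aligned}
$$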

where the series ϵi(t), i=1,…,4 and Li(t), i=1, 2 are independent white noise with mean zero and variance one. For the simulation, we have set α1=⋯=α6=0.6.
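The system can be simulated with a few lines of code. The sketch below is illustrative only: it uses the lag-1 cross-correlation as a crude stand-in for a bivariate Granger analysis (reasonable here because X1 and X2 are each serially uncorrelated), not the PDC estimator of the Appendix. The function names, the seed and the burn-in length are assumptions. Since X1(t−1) and X2(t) share the latent term L1(t−2), theory gives a lagged correlation of 0.36/√(1.36·1.72) ≈ 0.24 from X1 to X2 although X1 has no structural effect on X2, and a correlation of zero in the reverse direction.

```python
import math
import random

def simulate(T=512, a=0.6, seed=1, burn=10):
    """Simulate X1..X4 with all coefficients equal to `a`; L1 and L2 stay latent."""
    rng = random.Random(seed)
    n = T + burn
    L1 = [rng.gauss(0, 1) for _ in range(n)]
    L2 = [rng.gauss(0, 1) for _ in range(n)]
    x1, x2, x3, x4 = ([0.0] * n for _ in range(4))
    for t in range(2, n):
        x1[t] = a * L1[t - 1] + rng.gauss(0, 1)
        x2[t] = a * L1[t - 2] + a * L2[t - 1] + rng.gauss(0, 1)
        x3[t] = a * x4[t - 1] + a * L2[t - 2] + rng.gauss(0, 1)
        x4[t] = a * x1[t - 1] + rng.gauss(0, 1)
    # Drop the start-up values so that every retained point has a full history.
    return [x[burn:] for x in (x1, x2, x3, x4)]

def lagged_corr(x, y, lag):
    """Sample correlation between x(t - lag) and y(t) for lag >= 1."""
    xs, ys = x[:-lag], y[lag:]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((u - mx) * (v - my) for u, v in zip(xs, ys))
    sxx = sum((u - mx) ** 2 for u in xs)
    syy = sum((v - my) ** 2 for v in ys)
    return sxy / math.sqrt(sxx * syy)

if __name__ == "__main__":
    x1, x2, x3, x4 = simulate(T=512)
    print("X1 -> X2:", lagged_corr(x1, x2, 1))
    print("X2 -> X1:", lagged_corr(x2, x1, 1))
```

With only 512 observations, as in the simulated example, the estimates fluctuate noticeably around the theoretical values. A longer series makes the asymmetry clearer.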

In this system, L1 and L2 represent latent variables that were not included in the analysis. For the remaining four variables X1,…,X4 we have estimated the partial directed correlations (PDC) πab(u) (cf. Appendix) with respect to the complete available information (X{1,…,4}) as well as with respect to bivariate subprocesses. The estimates are shown in figure 9 and suggest that the series can be well approximated by a VAR model of order 2. Moreover, the results of the multivariate analysis show five PDC that are significantly different from zero, which leads to the multivariate path diagram in figure 9b. For a quantitative analysis, we have also performed tests for Granger causality and contemporaneous non-correlation (table 1), which corroborate the results obtained from the inspection of the PDC. Similarly, the bivariate path diagram in figure 9c has been obtained from the results of the bivariate analyses. Comparing the two graphs, we note that all directed edges that are present in the multivariate path diagram are also included in the bivariate path diagram. This means that the analysis of all bivariate subseries does not detect any additional relations among the components and thus does not indicate any spurious causalities.

Figure 9.

(a) Multivariate and bivariate partial directed correlations (PDC) for a simulated VAR(2) process X{1,2,3,4}, the horizontal lines signify pointwise 95% test bounds for the hypothesis that the PDC is zero. (b) Empirically determined multivariate path diagram for X{1,2,3,4}. (c) Empirically determined bivariate path diagram for X{1,2,3,4}.

Table 1.

Results of multivariate and bivariate analyses of simulated data (a) p-values of tests for Granger causality with respect to X{1,2,3,4}; (b) p-values of tests for bivariate Granger causality; (c) p-values of tests for multivariate (below diagonal) and bivariate (above diagonal) contemporaneous non-correlation.

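(a) multivariate Granger causality

| to \ from | X1 | X2 | X3 | X4 |
|---|---|---|---|---|
| X1 | - | 0.907 | 0.294 | 0.607 |
| X2 | <0.001 | - | 0.116 | 0.096 |
| X3 | 0.016 | <0.001 | - | <0.001 |
| X4 | <0.001 | 0.988 | 0.371 | - |

(b) bivariate Granger causality

| to \ from | X1 | X2 | X3 | X4 |
|---|---|---|---|---|
| X1 | - | 0.970 | 0.589 | 0.895 |
| X2 | 0.001 | - | 0.533 | 0.305 |
| X3 | <0.001 | <0.001 | - | <0.001 |
| X4 | <0.001 | 0.865 | 0.489 | - |

(c) contemporaneous non-correlation

| | X1 | X2 | X3 | X4 |
|---|---|---|---|---|
| X1 | - | 0.369 | 0.803 | 0.981 |
| X2 | 0.381 | - | 0.202 | 0.042 |
| X3 | 0.616 | 0.819 | - | 0.829 |
| X4 | 0.924 | 0.754 | 0.246 | - |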

We now examine whether any of the observed relationships between the variables are due to confounding by latent variables. We first consider the graphs in figure 10a. Here the left graph G′a is Markov equivalent to the multivariate path diagram. Removing the edge 1→3 we obtain the right graph Ga which, in addition to the independencies encoded by G′a, implies that X1 is Granger-non-causal for X3 with respect to X{1,2,3}. To test whether Ga is consistent with the data, we follow the procedure described in the previous section and fit a VAR(2) model to the subseries X{1,2,3} under the constraints implied by the graph G′a, that is, Ψ12(u)=Ψ13(u)=Ψ23(u)=0 for u=1, 2 and Ω12=Ω13=Ω23=0. In this model, the null hypothesis that the subgraph Ga is consistent with the data can be formulated as H0:Ψ31(1)=Ψ31(2)=0. The corresponding test clearly rejects the null hypothesis (table 2) and, thus, we conclude that graph Ga is not consistent with the data.

Figure 10.

Identification of the effective connectivity of the simulated four-dimensional series X{1,2,3,4}: three graphs (left) that are Markov equivalent to the multivariate path diagram in figure 9b and three subgraphs (right) encoding that (a) X1 does not Granger-cause X3 with respect to X{1,2,3}; (b) X1 does not Granger-cause X3 with respect to X{1,3,4}; and (c) X1 does not bivariately Granger-cause X3.

Table 2.

Identification of effective connectivity: tests for consistency of subgraphs Ga and Gb.

| graph | test statistic | p-value |
|---|---|---|
| Ga | 29.46 | <0.001 |
| Gb | 5.34 | 0.069 |
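The p-values in table 2 can be checked by hand. Both hypotheses restrict the coefficients at the two lags of a VAR(2) model, so the reference distribution is χ² with 2 degrees of freedom, whose upper-tail probability has the closed form exp(−s/2). The following minimal sketch uses this closed form; the function name is hypothetical.

```python
import math

def chi2_sf_2dof(s):
    """Upper-tail probability P(S > s) for a chi-square variable with 2 degrees of freedom."""
    return math.exp(-s / 2.0)

if __name__ == "__main__":
    for graph, statistic in (("Ga", 29.46), ("Gb", 5.34)):
        print(graph, statistic, chi2_sf_2dof(statistic))
```

For other numbers of lags the closed form no longer applies and a general χ² tail function would be needed.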

Alternatively, examination of the PDC between X1 and X3 with respect to X{1,2,3} (figure 11a) shows a still significant positive influence of X1 on X3, which implies that X1 Granger-causes X3 with respect to the subseries X{1,2,3}. Following the same steps as above with component X4 substituted for X2 (figure 10b), we find that the null hypothesis that X1 is Granger-non-causal for X3 with respect to X{1,3,4} cannot be rejected (table 2). This is also indicated by the PDC between X1 and X3 with respect to X{1,3,4}, which vanishes completely (figure 11b). Consequently, the graph Gb is consistent with X{1,2,3,4}. Further examination shows that the graph is also minimally consistent with X{1,2,3,4}.

Figure 11.

Partial directed correlations (PDC) between X1 and X3: (a) with respect to X{1,2,3} and (b) with respect to X{1,3,4}.

Finally, we consider the right graph in figure 10c which encodes that X1 is bivariately Granger-non-causal for X3. Since this contradicts the results of the bivariate analysis of X{1,3} it follows that this graph is inconsistent with X{1,2,3,4}.

From definitions 4.3 and 4.4, we conclude that the causality from X2 to X3 is spurious whereas the association between X4 and X3 is due to a causal effect of X4 on X3. In contrast, neither a causal effect nor spurious causality can be established for the directed edges from 1 to 2 and from 1 to 4, since each of them may be replaced by a dashed directed edge ⤍ without changing the Markov properties of the graph.

(b) Application to fMRI data

In this section, we apply the methods discussed in this paper to a seven-dimensional time-series obtained from concurrent recordings from EEG and fMRI. The EEG was sampled at 200 Hz from an array of 16 bipolar pairs, with an additional channel for the EEG and scan trigger. The fMRI series were measured with a time resolution of 2.5 s at six slice planes (TR=4 mm, skip 1 mm), with the second most inferior slice oriented through the anterior commissure–posterior commissure (AC–PC) line. The data and their acquisition are described in detail in Goldman et al. (2002). Informed consent was obtained from the volunteers based on a protocol previously approved by the UCLA Office for the Protection of Research Subjects.

For the analysis, the time-varying spectrum of the EEG has been decomposed by parallel factor (PARAFAC) analysis into trilinear components (called atoms), each being the product of spatial, spectral and temporal factors. The PARAFAC analysis extracted three significant atoms characterized by their spectral signature. Only the temporal factor of the alpha atom corresponding to a frequency range of 8–12 Hz was included in the effective connectivity analysis.

The fMRI data are time-series of length T=108 for six regions in the brain whose activation was correlated with the EEG alpha atom: visual cortex, thalamus, left and right insulae and left and right somatosensory areas. The time-series for each region was obtained by averaging the time-series of all voxels in that region. For details about preprocessing the data, we refer to Martínez-Montes et al. (2004). The use of PARAFAC analysis in analysing three-dimensional EEG data (space, time and frequency) is described in Miwakeichi et al. (2004).

For the determination of the Granger-causal relationships among the seven series, a vector autoregressive model of order p=1 was fitted to the data. The model order has been determined by minimization of Akaike's information criterion (AIC) (Lütkepohl 1993),

$$\mathrm{AIC}(p) = \frac{1}{2}\log\lvert\hat\Sigma_p\rvert + \frac{r}{T},$$

where Σˆp is the estimate for the covariance matrix of ϵ(t) and r is the number of parameters for the model. From the parameter estimates, the corresponding PDC with respect to the complete series have been computed (figure 12, below diagonal). Similarly, estimates for the bivariate PDC have been obtained by fitting vector autoregressive models of order p=2, where the order has been chosen so that the AIC criterion is minimized for most bivariate analyses (figure 12, above diagonal).
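Order selection by this criterion can be sketched as follows. The log-determinants below are made-up placeholder values, not results from the data, and the parameter count r=p·d² (all autoregressive coefficients of an unrestricted VAR(p) in d dimensions) is an assumption, since the text only says that r is the number of parameters. Under that assumption, each additional lag costs d²/T ≈ 0.45 for d=7 and T=108, so log|Σˆp| has to drop by about 0.91 per lag before a higher order is preferred.

```python
def aic(log_det_sigma, n_params, T):
    """AIC(p) = 0.5 * log|Sigma_hat_p| + r / T."""
    return 0.5 * log_det_sigma + n_params / T

def select_order(log_dets, d, T):
    """Pick the order with the smallest AIC.

    log_dets maps each candidate order p to log|Sigma_hat_p|.
    Assumes r = p * d * d parameters for an unrestricted VAR(p).
    """
    scores = {p: aic(log_det, p * d * d, T) for p, log_det in log_dets.items()}
    best = min(scores, key=scores.get)
    return best, scores

if __name__ == "__main__":
    # Hypothetical log-determinants for orders 1..3 (illustration only).
    example = {1: -3.0, 2: -3.5, 3: -3.9}
    print(select_order(example, d=7, T=108))
```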

Figure 12.

Multivariate and bivariate partial directed correlations (PDC) for fMRI and EEG data computed from graphical VAR models; the horizontal lines signify pointwise 95% test bounds for the hypothesis that the PDC is zero.

For the identification of the multivariate path diagram, we have performed a series of tests for pairwise Granger non-causality and contemporaneous non-correlation; the results of the tests are given in table 3 and the resulting multivariate path diagram is depicted in figure 13. Similarly, the bivariate path diagram could be obtained by testing for bivariate Granger non-causality and bivariate contemporaneous non-correlation. Due to the large number of edges in this graph, it would be difficult to draw any inferences from the bivariate path diagram and we therefore omit it. The results of the tests are also given in table 3.

Table 3.

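Results of multivariate and bivariate analyses of the concurrent EEG and fMRI data (a) p-values of tests for multivariate Granger causality; (b) p-values of tests for bivariate Granger causality and (c) p-values of tests for multivariate (below diagonal) and bivariate (above diagonal) contemporaneous non-correlation. (VC, visual cortex; TH, thalamus; LI and RI, left and right insula; LS and RS, left and right somatosensory area; EEG, EEG alpha atom.)

(a) multivariate Granger causality

| to \ from | VC | TH | LI | RI | LS | RS | EEG |
|---|---|---|---|---|---|---|---|
| VC | - | 0.00 | 0.86 | 0.61 | 0.10 | 0.40 | 0.00 |
| TH | 0.02 | - | 0.15 | 0.17 | 0.36 | 0.43 | 0.00 |
| LI | 0.06 | 0.00 | - | 0.17 | 0.01 | 0.30 | 0.84 |
| RI | 0.03 | 0.00 | 0.01 | - | 0.14 | 0.91 | 0.84 |
| LS | 0.40 | 0.00 | 0.41 | 0.76 | - | 0.02 | 0.08 |
| RS | 0.30 | 0.00 | 0.42 | 0.38 | 0.51 | - | 0.84 |
| EEG | 0.07 | 0.35 | 0.31 | 0.37 | 0.72 | 0.13 | - |

(b) bivariate Granger causality

| to \ from | VC | TH | LI | RI | LS | RS | EEG |
|---|---|---|---|---|---|---|---|
| VC | - | 0.00 | 0.35 | 0.48 | 0.53 | 0.72 | 0.00 |
| TH | 0.00 | - | 0.01 | 0.56 | 0.00 | 0.00 | 0.10 |
| LI | 0.00 | 0.00 | - | 0.30 | 0.00 | 0.01 | 0.01 |
| RI | 0.03 | 0.00 | 0.35 | - | 0.27 | 0.34 | 0.00 |
| LS | 0.11 | 0.00 | 0.31 | 0.02 | - | 0.00 | 0.02 |
| RS | 0.12 | 0.00 | 0.00 | 0.00 | 0.09 | - | 0.09 |
| EEG | 0.96 | 0.71 | 0.87 | 0.50 | 0.81 | 0.20 | - |

(c) contemporaneous non-correlation

| | VC | TH | LI | RI | LS | RS | EEG |
|---|---|---|---|---|---|---|---|
| VC | - | 0.08 | 0.12 | 0.02 | 0.00 | 0.00 | 0.70 |
| TH | 0.01 | - | 1.00 | 0.65 | 0.41 | 0.16 | 0.22 |
| LI | 0.86 | 0.10 | - | 0.00 | 0.04 | 0.95 | 0.88 |
| RI | 0.70 | 0.83 | 0.00 | - | 0.08 | 0.71 | 0.33 |
| LS | 0.00 | 0.23 | 0.85 | 0.48 | - | 0.00 | 0.54 |
| RS | 0.03 | 0.34 | 0.07 | 0.03 | 0.01 | - | 0.40 |
| EEG | 0.97 | 0.11 | 0.87 | 0.35 | 0.76 | 0.28 | - |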

Figure 13.

Multivariate path diagram for concurrent EEG and fMRI data empirically determined by the tests for Granger causality and contemporaneous non-correlation in table 3a,c.

The multivariate path diagram most notably shows that the EEG alpha atom has a direct influence on the thalamus and visual cortex only, while the BOLD responses in the other regions are only influenced indirectly. This indicates that the brain regions involved in the generation of the ‘EEG alpha rhythm’ are primarily located in the thalamus and visual cortex and corroborates previous findings by Goldman et al. (2002) and Martínez-Montes et al. (2004), who identified the parieto-occipital cortex and thalamus as generators of the ‘alpha brain rhythm’. Comparing these findings with the results of the bivariate analysis in table 3b,c, we find that although the EEG alpha atom shows an influence on all cortical brain regions, the positive correlation between the EEG alpha atom and the BOLD response in the thalamus is no longer significant. This suggests that the Granger causality from EEG alpha atom to thalamus is spurious.

For a more detailed analysis of the effective connectivity, we first note that in neither the multivariate nor the bivariate analysis do the EEG alpha atom, the visual cortex and the thalamus seem to be affected by the remaining four brain regions. We therefore may restrict our analysis to these three components. Figure 14a,b displays the corresponding multivariate and bivariate path diagrams G(m) and G(b), respectively; the former has been obtained following the same steps as above (order p=2), while the latter has been directly derived from the bivariate results in table 3. Here, the multivariate path diagram G(m) implies that the thalamus and the visual cortex neither Granger-cause the EEG alpha atom nor are they contemporaneously correlated with the EEG component, while the bivariate path diagram G(b) additionally encodes that, first, the EEG alpha atom does not bivariately Granger-cause the thalamus and second, the visual cortex and thalamus are bivariately contemporaneously uncorrelated. Thus, we have G(b) ≼ G(m). Furthermore, since, according to theorem 3.4, the bivariate path diagram is consistent with the data and already encodes all empirically determined relations among the variables, we conclude that the bivariate path diagram G(b) is minimally consistent with the data. Replacing the edge TH⤍VC with TH→VC, we obtain another graph G′ which is Markov equivalent to G(b) and, hence, is also minimally consistent with the data (figure 14c). Simple considerations show that no further minimally consistent graph exists.

Figure 14.

Identification of effective connectivity between the EEG alpha atom, the visual cortex and the thalamus: (a) multivariate path diagram; (b) bivariate path diagram; (c) extended path diagram. The graphs in (b) and (c) are Markov equivalent and minimally consistent with the data.

The minimally consistent graphs show that the correlation between EEG alpha atom and thalamic BOLD responses observed in a multivariate analysis can be attributed to the indirect link EEG⤍VC⤍TH mediated by the visual cortex. This is in line with the results by Martínez-Montes et al. (2004), who identified the visual cortex as the source of the EEG alpha rhythm. However, note that, contrary to the explanation provided by Martínez-Montes et al. (2004), we find that EEG and visual cortex are negatively correlated while the PDCs that correspond to the link from the visual cortex to the thalamus are positive. Since the intermediate vertex is a collider, the combined correlation is negated, resulting in a positive correlation for the pathway. This effect of changing sign can also be observed in figure 8, where the pathway 1⤍2⤍3 leads to an overall negative correlation (see also Eichler et al. 2003).

Finally, we note that the contemporaneous correlation between the thalamus and the visual cortex in a multivariate analysis is due to the pathway TH⤌VC⤌EEG⤍VC.

The analysis suggests a causal influence of the EEG on the visual cortex. This ‘causal ordering’ is most likely due to the fact that brain activity, which is instantaneously reflected in the EEG measurements, takes much longer to be reflected in the fMRI. Thus, the analysis allows no conclusions about the causal ordering of the EEG and the fMRI component. Furthermore, we note that it is not possible to determine whether the link between the EEG alpha atom and the visual cortex is due to direct confounding since the Markov properties of the graph would not be changed if the directed edge between the EEG alpha atom and the visual cortex were replaced by a dashed directed edge.

A serious limitation of effective connectivity analysis from time-resolved fMRI data in general is the small sample size; in the example above only 108 measurements were available. Consequently, only strong correlations among the components show up significantly in an analysis. This particularly affects the identification of indirect pathways as needed for the distinction between direct and spurious causality. For instance, in our discussion of the four-dimensional system (3.1) in §3a, we have derived the autoregressive representation of the trivariate subseries X{1,2,3}. Here, the coefficient Φ13(2), which corresponds to the m-connecting pathway 3→2←4→1, was of the form αβγ/(1+β²). Since all coefficients were smaller than one, the Granger-causal effect of X3 on X1 is much smaller than the other links between the components. In an analysis with small sample size such correlations can easily become insignificant and thus hamper the identification of the effective connectivity of the system.

6. Conclusion

In this paper, a new graphical approach for the analysis of causal relationships in multivariate time-series has been presented. In particular, this approach allows the comparison of multivariate and bivariate Granger causality, both of which are frequently used in brain imaging to resolve functional relationships between brain areas from EEG or fMRI time-series data. We have shown that both concepts may provide useful information that cannot be obtained by the other.

Our discussion has also shown that the effective connectivity of systems that may be affected by unmeasured latent variables cannot be resolved by multivariate and bivariate analyses alone, but only by examination of Granger-non-causal relations with respect to all subseries. To this end, we have generalized the idea of multivariate and bivariate path diagrams and introduced mixed graphs that can be used to visualize the effective connectivity of such systems.

Currently, the identification of such graphical representations is based on a multi-step procedure where each step requires the fitting of a new autoregressive model. As a consequence, it is impossible to compare two graphical representations of the effective connectivity and test between them. Furthermore, the statistical errors in different steps may lead to contradictory results. Therefore, future research aims at the development of new graphical time-series models with dependencies that are constrained by general path diagrams.

Acknowledgments

The author wishes to thank Robin Goldman and Mark Cohen, who conducted the EEG–fMRI experiments discussed in §5, and Pedro Valdéz-Sosa and Eduardo Martínez-Montes for providing the data and many helpful comments.

Appendix A. Statistical inference

For the analysis of empirical data, VAR(p) models can be fitted using least-squares estimation. For observations XV(1),…,XV(T) from a d-dimensional multiple time-series XV, let Rˆp=(Rˆp(u,v))u,v=1,…,p be the pd×pd matrix composed of the submatrices

$$\hat R_p(u,v) = \frac{1}{T-p}\sum_{t=p+1}^{T} X(t-u)\,X(t-v)^{\mathsf T}.$$

Similarly, we set rˆp=(Rˆp(0,1),…,Rˆp(0,p)). Then the least-squares estimates of the autoregressive coefficients are given by

$$\hat\Phi(u) = \sum_{v=1}^{p} (\hat R_p)^{-1}(u,v)\,\hat r_p(v),$$

for u=1,…,p, while the covariance matrix Σ is estimated by

$$\hat\Sigma = \frac{1}{T}\sum_{t=p+1}^{T} \hat\epsilon(t)\,\hat\epsilon(t)^{\mathsf T},$$

where $\hat\epsilon(t) = X(t) - \sum_{u=1}^{p}\hat\Phi(u)\,X(t-u)$ are the least-squares residuals. The estimates are asymptotically normally distributed; for details we refer to Lütkepohl (1993).
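For p=1 and two components the estimation step can be written out without any matrix library. The sketch below is a simplified illustration under these assumptions, not the author's implementation: it uses the standard least-squares normal equations, which for p=1 give Φˆ(1)=Rˆ(0,1)Rˆ(1,1)⁻¹, and checks them on data simulated from an arbitrarily chosen coefficient matrix. All function names and parameter values are hypothetical.

```python
import random

def simulate_var1(phi, T, seed=0, burn=50):
    """Simulate a bivariate VAR(1) with unit-variance, uncorrelated innovations."""
    rng = random.Random(seed)
    x, out = [0.0, 0.0], []
    for t in range(T + burn):
        e = [rng.gauss(0, 1), rng.gauss(0, 1)]
        x = [phi[0][0] * x[0] + phi[0][1] * x[1] + e[0],
             phi[1][0] * x[0] + phi[1][1] * x[1] + e[1]]
        if t >= burn:
            out.append(x)
    return out

def fit_var1(X):
    """Least-squares estimates (Phi_hat, Sigma_hat) for a bivariate VAR(1)."""
    T, idx = len(X), range(2)
    n = T - 1                                  # T - p terms in the sums
    # R01[a][b]: average of X_a(t) X_b(t-1); R11[a][b]: average of X_a(t-1) X_b(t-1).
    R01 = [[sum(X[t][a] * X[t - 1][b] for t in range(1, T)) / n for b in idx] for a in idx]
    R11 = [[sum(X[t - 1][a] * X[t - 1][b] for t in range(1, T)) / n for b in idx] for a in idx]
    det = R11[0][0] * R11[1][1] - R11[0][1] * R11[1][0]
    inv = [[R11[1][1] / det, -R11[0][1] / det],
           [-R11[1][0] / det, R11[0][0] / det]]
    phi = [[sum(R01[a][k] * inv[k][b] for k in idx) for b in idx] for a in idx]
    # Residual covariance, normalised by T as in the text.
    resid = [[X[t][a] - sum(phi[a][b] * X[t - 1][b] for b in idx) for a in idx]
             for t in range(1, T)]
    sigma = [[sum(r[a] * r[b] for r in resid) / T for b in idx] for a in idx]
    return phi, sigma

if __name__ == "__main__":
    data = simulate_var1([[0.5, 0.3], [0.0, 0.4]], T=5000)
    print(fit_var1(data))
```

In practice a general VAR(p) in d dimensions would be fitted with a linear-algebra library. The two-dimensional case is used here only to keep the normal equations visible.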

The coefficients Φab(u) depend, like any regression coefficient, on the unit of measurement of Xa and Xb and, thus, are not suited for comparisons of the strength of causal relationships between different pairs of variables. Therefore, Eichler (2005) proposes partial directed correlations as a measure of the strength of causal effects. For u>0 (u=0) the partial directed correlation πab(u) is defined as the correlation between Xa(t) and Xb(t−u) after removing the linear effects of $\bar X_V(t-1)\setminus\{X_b(t-u)\}$ ($\bar X_V(t-1)$), while for u<0 we have πab(u)=πba(−u). It is further shown that for u>0, estimates for the partial directed correlations πab(u) can be obtained from the parameter estimates of a VAR(p) model by rescaling the coefficients Φab(u):

$$\hat\pi_{ab}(u) = \frac{\hat\Phi_{ab}(u)}{\sqrt{\hat\Sigma_{aa}\,\hat\tau_{ab}(u)}},$$

where

$$\hat\tau_{ab}(u) = \hat K_{bb} + \sum_{v=1}^{u-1}\sum_{k,l\in V}\hat\Phi_{kb}(v)\,\hat K_{kl}\,\hat\Phi_{lb}(v) + \frac{\hat\Phi_{ab}(u)^{2}}{\hat\Sigma_{aa}},$$

with $\hat K = \hat\Sigma^{-1}$. For u=0, we obviously have

$$\hat\pi_{ab}(0) = \frac{\hat\Sigma_{ab}}{\sqrt{\hat\Sigma_{aa}\,\hat\Sigma_{bb}}}.$$

For large sample length T, the partial directed correlations are approximately normally distributed with mean πab(u) and variance 1/T.
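The rescaling and the resulting test bounds can be illustrated for lag u=1, where the sum over v in τˆab(u) is empty. The sketch below is restricted to two components so that Kˆ=Σˆ⁻¹ can be written in closed form. The bound ±1.96/√T follows from the stated variance 1/T under the hypothesis πab(u)=0; that the figures use exactly this bound is an assumption. Function names and input values are hypothetical.

```python
import math

def pdc_lag1(phi1, sigma, a, b):
    """Partial directed correlation pi_ab(1) for a bivariate VAR.

    Uses tau_ab(1) = K_bb + Phi_ab(1)^2 / Sigma_aa with K = Sigma^{-1}
    and pi_ab(1) = Phi_ab(1) / sqrt(Sigma_aa * tau_ab(1)).
    """
    det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
    K = [[sigma[1][1] / det, -sigma[0][1] / det],
         [-sigma[1][0] / det, sigma[0][0] / det]]
    tau = K[b][b] + phi1[a][b] ** 2 / sigma[a][a]
    return phi1[a][b] / math.sqrt(sigma[a][a] * tau)

def pdc_bound(T, z=1.96):
    """Pointwise 95% bound for a PDC estimate with asymptotic variance 1/T."""
    return z / math.sqrt(T)

if __name__ == "__main__":
    phi1 = [[0.0, 0.5], [0.0, 0.0]]        # hypothetical coefficients
    sigma = [[1.0, 0.0], [0.0, 1.0]]       # hypothetical innovation covariance
    print(pdc_lag1(phi1, sigma, a=0, b=1))
    print(pdc_bound(512), pdc_bound(108))
```

The bound is about 0.09 for the simulated series of length 512 but about 0.19 for the 108 fMRI scans, which is one way to see how strongly the short record limits the detection of weak links.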

Tests for Granger causal relationships among the variables can be derived from the asymptotic distribution of the parameters of the VAR(p) model. More precisely, let $\hat{V}(u,v)=(\hat{R}_p)^{-1}_{bb}(u,v)\,\hat{\Sigma}_{aa}$ be the estimate for the asymptotic covariance between $\hat{\Phi}_{ab}(u)$ and $\hat{\Phi}_{ab}(v)$ and let $\hat{V}$ be the corresponding $p\times p$ matrix. Then the existence of a Granger causal effect of $X_b$ on $X_a$ can be tested by evaluation of the test statistic

$$S_{ab}=T\sum_{u,v=1}^{p}\hat{\Phi}_{ab}(u)\,\hat{\Phi}_{ab}(v)\,(\hat{V}^{-1})(u,v).$$

Under the null hypothesis that $X_b$ is Granger non-causal for $X_a$ with respect to $X_V$, the test statistic $S_{ab}$ is asymptotically $\chi^2$ distributed with $p$ degrees of freedom.
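The test statistic can be computed directly from the fitted quantities. The sketch below is illustrative only; the function name `granger_statistic` and the argument `r_big` (the stacked $pd\times pd$ matrix $\hat{R}_p$, with the entry for lag $u$ and component $b$ at index $(u-1)d+b$, counting components from zero) are assumptions of this example.

```python
import numpy as np


def granger_statistic(phi, sigma, r_big, T, a, b):
    """Statistic S_ab for testing whether X_b is Granger non-causal for X_a.

    phi   : array (p, d, d), phi[u-1][a, b] = Phi_ab(u)
    sigma : array (d, d), residual covariance estimate
    r_big : array (p*d, p*d), stacked matrix with blocks R_p(u, v)
    T     : sample length
    a, b  : zero-based component indices
    """
    phi = np.asarray(phi, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    p, d, _ = phi.shape
    r_inv = np.linalg.inv(np.asarray(r_big, dtype=float))
    # Positions of X_b(t-1), ..., X_b(t-p) in the stacked regressor vector.
    idx = b + d * np.arange(p)
    v_hat = r_inv[np.ix_(idx, idx)] * sigma[a, a]   # p x p matrix V
    coef = phi[:, a, b]                             # Phi_ab(1), ..., Phi_ab(p)
    return float(T * coef @ np.linalg.solve(v_hat, coef))
```

The returned value would then be compared with a quantile of the $\chi^2$ distribution with $p$ degrees of freedom, for example about 3.84 at the 5% level when $p=1$.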

More generally, inference on causal structures should be based on fitting graphical vector autoregressive models. For given graph $G$ and order $p$, the parameters of the VAR(p, G) model can be computed iteratively from the following two steps. First, for the given estimate $\hat{\Sigma}$, the estimates $\hat{\Phi}_{ab}(u)$ are determined as the solution of the linear equation system:

$$\Big(\sum_{v=1}^{p}\hat{\Sigma}^{-1}\Phi(v)\,\hat{R}_p(u,v)\Big)_{ab}=\big(\hat{\Sigma}^{-1}\hat{R}_p(0,v)\big)_{ab},$$

for $u=1,\ldots,p$ and $a,b\in V$ such that $b\to a$ in $G$ under the constraints that $\Phi_{ab}(u)=0$ whenever the directed edge $b\to a$ is absent in the graph $G$. Second, the estimate $\hat{\Sigma}$ is obtained by solving the nonlinear equation system,

$$(\Sigma^{-1})_{ab}=(\Sigma^{-1}\hat{\Sigma}_0\Sigma^{-1})_{ab},$$

for $a,b\in V$ such that b --- a in $G$, where $\hat{\Sigma}_0=\frac{1}{T}\sum_{t=p+1}^{T}\hat{\epsilon}(t)\,\hat{\epsilon}(t)^{\mathsf T}$ is the unconstrained estimate for the covariance matrix of the residuals $\hat{\epsilon}(t)$. The second step corresponds to fitting a covariance model to the residuals $\hat{\epsilon}(t)$, which is determined by the zero constraints on the covariance matrix $\Sigma$. An iterative algorithm for fitting such covariance models has been introduced by Drton & Richardson (2004). Since the solutions for both sets of equations are not independent, an iteration of the two steps is needed to obtain a joint solution. The fitting of graphical vector autoregressive models will be described in more detail in a forthcoming paper.
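Only the first of the two steps is sketched here, and only as one possible reading of it: for a fixed covariance estimate, the free coefficients are chosen so that the weighted score $\hat{\Sigma}^{-1}(\hat{r}_p-B\hat{R}_p)$, with $B=[\Phi(1),\ldots,\Phi(p)]$, vanishes at every position where a directed edge is present, while all other coefficients are held at zero. The function name `constrained_coefficients`, the boolean `mask` encoding of the graph and the stacked inputs `r_big` and `r_small` are assumptions of this example. The second step (covariance fitting with zero constraints, as in Drton & Richardson 2004) and the outer iteration are not implemented.

```python
import numpy as np


def constrained_coefficients(r_big, r_small, sigma, mask):
    """Coefficients B = [Phi(1), ..., Phi(p)] under zero constraints, Sigma fixed.

    r_big   : array (p*d, p*d), stacked matrix with blocks R_p(u, v)
    r_small : array (d, p*d), stacked matrix with blocks R_p(0, v)
    sigma   : array (d, d), current covariance estimate
    mask    : boolean array (d, p*d); mask[a, (u-1)*d + b] is True when the
              edge b -> a is present, so that Phi_ab(u) is a free parameter
    Returns B of shape (d, p*d) with zeros wherever mask is False.
    """
    r_big = np.asarray(r_big, dtype=float)
    r_small = np.asarray(r_small, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    d, pd = r_small.shape
    k = np.linalg.inv(np.asarray(sigma, dtype=float))
    free = np.flatnonzero(mask.ravel(order="F"))
    # Column-major vec identity: vec(K B R) = (R^T kron K) vec(B).
    big = np.kron(r_big.T, k)
    rhs = (k @ r_small).ravel(order="F")
    vec_b = np.zeros(d * pd)
    # Restrict the linear system to the free coefficients.
    vec_b[free] = np.linalg.solve(big[np.ix_(free, free)], rhs[free])
    return vec_b.reshape((d, pd), order="F")
```

When `mask` is all `True` (complete graph), the result reduces to the unconstrained least-squares solution, which gives a simple consistency check for the sketch.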

Footnotes

One contribution of 21 to a Theme Issue ‘Multimodal neuroimaging of brain connectivity’.

References

  1. Büchel C, Friston K.J. Modulation of connectivity in visual pathways by attention: cortical interactions evaluated with structural equation modelling and fMRI. Cereb. Cortex. 1997;7:768–778. doi: 10.1093/cercor/7.8.768.
  2. Büchel C, Friston K.J. Assessing interactions among neuronal systems using functional imaging. Neural Netw. 2000;13:871–882. doi: 10.1016/s0893-6080(00)00066-6.
  3. Cox D.R, Wermuth N. Multivariate dependencies: models, analysis and interpretation. Chapman & Hall; London: 1996.
  4. Dahlhaus R, Eichler M. Causality and graphical models in time series analysis. In: Green P, Hjort N, Richardson S, editors. Highly structured stochastic systems. Oxford University Press; 2003. pp. 115–137.
  5. Drton M, Richardson T.S. Iterative conditional fitting for estimation of a covariance matrix with zeros. Technical report, University of Washington; 2004.
  6. Edwards D. Introduction to graphical modelling. 2nd edn. Springer; New York: 2000.
  7. Eichler M. Graphical modelling of time series. Technical report, Universität Heidelberg; 2001.
  8. Eichler M. Granger-causality and path diagrams for multivariate time series. Technical report, The University of Chicago; 2002.
  9. Eichler M. Graphical modelling for time series using chain graphs. Technical report, Universität Heidelberg; 2005.
  10. Eichler M, Dahlhaus R, Sandkühler J. Partial correlation analysis for the identification of synaptic connections. Biol. Cybern. 2003;89:289–302. doi: 10.1007/s00422-003-0400-3.
  11. Friston K.J, Harrison L, Penny W. Dynamic causal modelling. NeuroImage. 2003;19:1273–1302. doi: 10.1016/s1053-8119(03)00202-7.
  12. Goebel R, Roebroeck A, Kim D.-S, Formisano E. Investigating directed cortical interactions in time-resolved fMRI data using vector autoregressive modeling and Granger causality mapping. Magn. Reson. Imaging. 2003;21:1251–1261. doi: 10.1016/j.mri.2003.08.026.
  13. Goldman R.I, Engel S.J.M, Jerome J, Cohen M.S. Simultaneous EEG and fMRI of the alpha rhythm. NeuroReport. 2002;13:2487–2492. doi: 10.1097/01.wnr.0000047685.08940.d0.
  14. Granger C.W.J. Investigating causal relations by econometric models and cross-spectral methods. Econometrica. 1969;37:424–438.
  15. Harrison L, Penny W.D, Friston K.J. Multivariate autoregressive modeling of fMRI time series. NeuroImage. 2003;4:1477–1491. doi: 10.1016/s1053-8119(03)00160-5.
  16. Hesse W, Möller E, Arnold M, Schack B. The use of time-variant EEG Granger causality for inspecting directed interdependencies of neural assemblies. J. Neurosci. Methods. 2003;124:27–44. doi: 10.1016/s0165-0270(02)00366-7.
  17. Hsiao C. Autoregressive modeling and causal ordering of econometric variables. J. Econ. Dyn. Control. 1982;4:243–259.
  18. Kamiński M, Ding M, Truccolo W.A, Bressler S.L. Evaluating causal relations in neural systems: Granger causality, directed transfer function and statistical assessment of significance. Biol. Cybern. 2001;85:145–157. doi: 10.1007/s004220000235.
  19. Koster J.T.A. On the validity of the Markov interpretation of path diagrams of Gaussian structural equations systems with correlated errors. Scand. J. Stat. 1999;26:413–431.
  20. Lauritzen S.L. Graphical models. Oxford University Press; 1996.
  21. Lauritzen S.L. Causal inference from graphical models. In: Barndorff-Nielsen O.E, Cox D.R, Klüppelberg C, editors. Complex stochastic systems. CRC Press; London: 2001. pp. 63–107.
  22. Lütkepohl H. Introduction to multiple time series analysis. Springer; New York: 1993.
  23. Martínez-Montes E, Valdés-Sosa P.A, Miwakeichi F, Goldman R.I, Cohen M.S. Concurrent EEG/fMRI analysis by multiway partial least squares. NeuroImage. 2004;22:1023–1034. doi: 10.1016/j.neuroimage.2004.03.038.
  24. McIntosh A.R, Gonzales-Lima F. Structural equation modelling and its application to network analysis in functional brain imaging. Hum. Brain Mapp. 1997;2:2–22.
  25. Miwakeichi F, Martínez-Montes E, Valdés-Sosa P, Nishiyama N, Mizuhara H, Yamaguchi Y. Decomposing EEG data into space-time-frequency components using parallel factor analysis. NeuroImage. 2004;22:1035–1045. doi: 10.1016/j.neuroimage.2004.03.039.
  26. Pearl J. Causality. Cambridge University Press; Cambridge: 2000.
  27. Richardson T. Markov properties for acyclic directed mixed graphs. Scand. J. Stat. 2003;30:145–157.
  28. Spirtes P, Richardson T.S, Meek C, Scheines R, Glymour C. Using path diagrams as a structural equation modeling tool. Soc. Methods Res. 1998;27:182–225.
  29. Spirtes P, Glymour C, Scheines R. Causation, prediction, and search. 2nd edn. MIT Press; Cambridge, MA: 2001. With additional material by David Heckerman, Christopher Meek, Gregory F. Cooper & Thomas Richardson.
  30. Whittaker J. Graphical models in applied multivariate statistics. Wiley; Chichester: 1990.

Articles from Philosophical Transactions of the Royal Society B: Biological Sciences are provided here courtesy of The Royal Society
