Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences. 2020 Apr 29;476(2236):20190779. doi: 10.1098/rspa.2019.0779

Deriving pairwise transfer entropy from network structure and motifs

Leonardo Novelli 1, Fatihcan M Atay 2,3, Jürgen Jost 3,4, Joseph T Lizier 1,3
PMCID: PMC7209155  PMID: 32398937

Abstract

Transfer entropy (TE) is an established method for quantifying directed statistical dependencies in neuroimaging and complex systems datasets. The pairwise (or bivariate) TE from a source to a target node in a network does not depend solely on the local source-target link weight, but on the wider network structure that the link is embedded in. This relationship is studied using a discrete-time linearly coupled Gaussian model, which allows us to derive the TE for each link from the network topology. It is shown analytically that the dependence on the directed link weight is only a first approximation, valid for weak coupling. More generally, the TE increases with the in-degree of the source and decreases with the in-degree of the target, indicating an asymmetry of information transfer between hubs and low-degree nodes. In addition, the TE is directly proportional to weighted motif counts involving common parents or multiple walks from the source to the target, which are more abundant in networks with a high clustering coefficient than in random networks. Our findings also apply to Granger causality, which is equivalent to TE for Gaussian variables. Moreover, similar empirical results on random Boolean networks suggest that the dependence of the TE on the in-degree extends to nonlinear dynamics.

Keywords: network inference, connectome, motifs, information theory, transfer entropy

1. Introduction

From a network dynamics perspective, the activity of a system over time is the result of the interplay between the dynamical rules governing the nodes and the network structure (or topology). Studying the structure–dynamics relationship is an ongoing research effort, often aimed at optimizing the synchronization, controllability, or stability of complex systems, or at understanding how these properties are shaped by evolution [1–4]. Information theory [5] offers a general mathematical framework to study the diverse range of dynamics across technical and biological networks, from neural to genetic to cyber-physical systems [6]. It provides quantitative definitions of uncertainty and of elementary information-processing operations (such as storage, transfer and modification), which align with qualitative descriptions of dynamics on networks and could serve as a common language to interpret the activity of complex systems [7].

This study will focus on a specific information-theoretic measure: transfer entropy (TE) [8,9]. In its original formulation as a pairwise measure, TE can be used to study the activity of a network and detect asymmetric statistical dependencies between pairs of nodes. TE has been widely used to characterize directed relationships in complex systems, in particular in the domain of computational neuroscience [10,11]. For a given dynamics, there is a non-trivial dependence of the local TE between pairs of nodes on the wider global structure of the network. For example, several empirical studies have reported a dependence of the TE on the in- and out-degrees of the source and target nodes [12–17], as well as on other aspects of network structure such as long links in small-world networks [18]. The main purpose of this work is to present a systematic analytic characterization of the relationship between network structure and the TE on a given link, which has not been previously established.

In order to provide an analytic treatment, we will use a stationary vector autoregressive (VAR) process, characterized by linear interactions and driving Gaussian noise (§2). This model is a simplification when compared with most real-world processes, but can be viewed as approximating the weakly coupled near-linear regime [19]. Interestingly, a recent review found that the VAR model performed better than six more complex mainstream neuroscience models in predicting the undirected functional connectivity (based on Pearson correlation) from the brain structural connectivity (based on tractography) [20]. Other studies have related the undirected functional connectivity to specific structural features, such as search information, path transitivity [21], and topological similarity [22]. Analytic relationships between the network structure and the correlation/covariance between nodes for the VAR and similar dynamics have also been well studied [23–25].

This work will instead focus on the analytical treatment of the directed functional connectivity obtained via the pairwise TE for the VAR process. Building on previous studies of other information-theoretic measures for this process (regarding the TSE complexity [26] in [19,27] and the active information storage in [28]), we explicitly establish the dependence of the TE for a given link on the related structural motifs. Motifs are small subnetwork configurations, such as feed-forward or feedback loops, which have been studied as building blocks of complex networks [29]. Specific motif classes are over-represented in biological networks when compared with random networks, suggesting they could serve specific functions [30–33]. Indeed, linear systems analyses have been used to predict functional sub-circuits from the nervous system topology of the C. elegans nematode [34].

It is shown analytically (in §3) that the dependence of the TE on the directed link weight from the source to the target is only a first approximation, valid for weak coupling. More generally, the TE increases with the in-degree of the source and decreases with the in-degree of the target, indicating an asymmetry of information transfer between hubs and low-degree nodes. In addition, the TE is directly proportional to weighted motif counts involving common parents or multiple walks from the source to the target, which are more abundant in networks with a high clustering coefficient than in random networks. These results are tested using numerical simulations and discussed in §4.

Being based on a linearly coupled Gaussian model, our findings apply directly to Granger causality, which is equivalent to TE for Gaussian variables [35]. However, similar empirical results on random Boolean networks (RBNs) suggest that the dependence of the TE on the in-degrees extends to nonlinear dynamics (appendix C).

2. Information-theoretic measures on networks of coupled Gaussians

Let us consider a discrete-time, stationary, first-order autoregressive process on a network of N nodes. This multivariate VAR(1) process is described by the recurrence relation

$$Z(t+1) = Z(t)\,C + \varepsilon(t), \qquad (2.1)$$

where $Z_i(t)$ is the activity of node $i$ at time $t$ (and $Z(t)$ is a row vector). Here, $\varepsilon(t)$ is spatially and serially uncorrelated Gaussian noise of unit variance and $C = [C_{ij}]$ is the $N \times N$ weighted adjacency matrix representing the weighted network structure (where $C_{ij}$ is the weight of the directed connection from node $i$ to node $j$). A stationary autoregressive process has a multivariate Gaussian distribution, whose expected Shannon entropy [5], independent of $t$, is [36, Ch. 8]:

$$H(Z) = \tfrac{1}{2}\ln\!\left[(2\pi e)^N \, |\Omega|\right]. \qquad (2.2)$$

In equation (2.2), $|\Omega|$ represents the determinant of the covariance matrix $\Omega := \langle Z(t)^T Z(t) \rangle$ and $\langle \cdot \rangle$ denotes the average over the statistical ensemble at time $t$ [36]. Barnett et al. [19] show that the covariance matrix satisfies $\Omega = I + C^T \Omega C$, where $I$ denotes the relevant identity matrix, and the solution is obtained in general via the power series

$$\Omega = I + C^T C + (C^2)^T C^2 + \cdots = \sum_{j=0}^{\infty} (C^j)^T C^j. \qquad (2.3)$$

(A simpler form exists for symmetric C [19]). As discussed in [19,28], the convergence of the series is guaranteed under the assumption of stationarity (for which a sufficient condition is that the spectral radius of C is smaller than one). Information-theoretic measures relating variables over a time difference s also involve covariances across time, which can be computed via the lagged covariance matrix [28]

$$\Omega(s) := \langle Z(t)^T Z(t+s) \rangle = \Omega\, C^s. \qquad (2.4)$$

Interestingly, equation (2.4) can be used to directly reconstruct the weighted adjacency matrix C from empirical calculations of Ω and Ω(s) from observations [37].
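To make this concrete, the power series in equation (2.3), the fixed-point relation $\Omega = I + C^T \Omega C$, and the reconstruction of $C$ via equation (2.4) can all be checked numerically. The following minimal sketch is our own illustration (assuming numpy; the function name `stationary_covariance` is ours, not from the article):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random weighted adjacency matrix C with spectral radius < 1,
# so that the VAR(1) process Z(t+1) = Z(t) C + eps(t) is stationary.
N = 5
C = rng.normal(scale=0.1, size=(N, N))
assert max(abs(np.linalg.eigvals(C))) < 1

def stationary_covariance(C, tol=1e-12, max_terms=10_000):
    # Power series of equation (2.3): Omega = sum_j (C^j)^T C^j.
    Omega = np.zeros_like(C)
    Cj = np.eye(C.shape[0])
    for _ in range(max_terms):
        term = Cj.T @ Cj
        Omega += term
        if np.abs(term).max() < tol:
            break
        Cj = Cj @ C
    return Omega

Omega = stationary_covariance(C)

# Omega satisfies the fixed-point relation Omega = I + C^T Omega C.
assert np.allclose(Omega, np.eye(N) + C.T @ Omega @ C)

# Lagged covariance, equation (2.4): Omega(1) = Omega C, so the weighted
# adjacency matrix can be recovered from covariances alone.
C_reconstructed = np.linalg.solve(Omega, Omega @ C)
assert np.allclose(C_reconstructed, C)
```

Here the reconstruction is exact because $\Omega$ is computed analytically; with $\Omega$ estimated from finite data, the recovered $C$ would only be approximate.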

3. Approximating the pairwise transfer entropy

In this section, we will derive the TE [8] for pairs of nodes from the VAR process in equation (2.1) as a function of specific network motifs; the final results are listed in equation (3.14) and shown in figure 1.

Figure 1. Visual summary of the motifs involved in the pairwise transfer entropy from a source node X to a target node Y in the network. The seven panels (a–g) correspond to the seven motifs in equations (3.14a)–(3.14g), expanded up to order $O(||C||^4)$. The motifs in (c) and (d) represent the effect of the weighted in-degree of the source and the target (which have a positive and negative contribution to the transfer entropy, respectively, with the negative indicated by a dashed red line). The motifs in (b), (f) and (g) are clustered motifs, which can enhance or detract from the predictive effect of the directed link, depending on the sign of the link weights. In particular, motifs (b) and (f) involve a common parent of X and Y, whereas (g) involves an additional pathway effect. Note that the unlabelled nodes are distinct from X and Y (and from each other in (b)). (Online version in colour.)

For two given nodes X and Y in Z, the TE $T_{X\to Y}$, as a conditional mutual information, can be decomposed into four joint entropy terms [9]:

$$T_{X\to Y} = I(X ; Y' \mid \mathbf{Y}) = H(Y', \mathbf{Y}) - H(\mathbf{Y}) - H(X, Y', \mathbf{Y}) + H(X, \mathbf{Y}). \qquad (3.1)$$

Here, we use the shorthand $Y'$ to represent the next value $Y(t+1)$ of the target at time $t+1$, $X$ for the previous value $X(t)$ of the source, and $\mathbf{Y}$ for the past state of $Y$ at time $t$. We drop the time index $t$ to simplify the notation under the stationarity assumption. Following convention, finite embedding vectors $\mathbf{Y} := \mathbf{Y}^{(k)} = (Y(t-k+1), \ldots, Y(t))$ of the past $k$ values of $Y$ will be used to represent the previous state [8,9]. (One could also embed the source process $X$; however, only a single value is used here, in line with the order-1 causal contributions in equation (2.1).)

We can then rewrite the TE in terms of $\Omega(Y', \mathbf{Y}^{(k)})$, $\Omega(\mathbf{Y}^{(k)})$, $\Omega(X, Y', \mathbf{Y}^{(k)})$ and $\Omega(X, \mathbf{Y}^{(k)})$: the covariance matrices of the joint processes involved in the four entropy terms. Plugging equation (2.2) into equation (3.1) for each term yields

$$T_{X\to Y} = \tfrac{1}{2}\left(\ln|\Omega(Y', \mathbf{Y}^{(k)})| - \ln|\Omega(\mathbf{Y}^{(k)})| - \ln|\Omega(X, Y', \mathbf{Y}^{(k)})| + \ln|\Omega(X, \mathbf{Y}^{(k)})|\right). \qquad (3.2)$$
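Equation (3.2) is directly computable: the joint covariance matrices can be assembled from $\Omega$ and the lagged covariances $\Omega(s)$ of equation (2.4), giving the exact TE for any link. The following is a minimal numpy sketch (our own illustration; the helper names are not from the article):

```python
import numpy as np

def stationary_covariance(C, tol=1e-14, max_terms=10_000):
    # Power series of equation (2.3).
    Omega = np.zeros_like(C)
    Cj = np.eye(C.shape[0])
    for _ in range(max_terms):
        term = Cj.T @ Cj
        Omega += term
        if np.abs(term).max() < tol:
            break
        Cj = Cj @ C
    return Omega

def pairwise_te(C, x, y, k=4):
    # TE T_{X->Y} via equation (3.2): the joint Gaussian vector is
    # (X(t), Y(t+1), Y(t), ..., Y(t-k+1)), and its covariance entries
    # follow from Omega(s) = Omega C^s (equation (2.4)).
    Omega = stationary_covariance(C)
    lags = [Omega]
    for _ in range(k + 1):
        lags.append(lags[-1] @ C)

    def cov(a, b):
        # a, b are (node, time) pairs; cov(Z_i(t), Z_j(t+s)) = Omega(s)_{ij}.
        (i, ti), (j, tj) = a, b
        s = tj - ti
        return lags[s][i, j] if s >= 0 else lags[-s][j, i]

    def logdet(vs):
        S = np.array([[cov(a, b) for b in vs] for a in vs])
        return np.linalg.slogdet(S)[1]

    X, Yp = [(x, 0)], [(y, 1)]
    Ypast = [(y, -j) for j in range(k)]  # Y(t), Y(t-1), ..., Y(t-k+1)
    return 0.5 * (logdet(Yp + Ypast) - logdet(Ypast)
                  - logdet(X + Yp + Ypast) + logdet(X + Ypast))

# Two nodes with self-loops and a single directed link 0 -> 1.
C = np.array([[0.2, 0.4],
              [0.0, 0.2]])
te = pairwise_te(C, 0, 1)
assert te > 0

# Removing the link makes the two processes independent, so the TE vanishes.
C0 = np.diag([0.2, 0.2])
assert abs(pairwise_te(C0, 0, 1)) < 1e-8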

Furthermore, from the matrix identity $|e^A| = e^{\mathrm{tr}(A)}$ (valid for any square matrix $A$ [38]) and from the Taylor-series expansion of the natural logarithm, it follows that

$$\ln|\Omega| = \sum_{m=1}^{\infty} \frac{(-1)^{m-1}}{m}\,\mathrm{tr}\!\left[(\Omega - I)^m\right], \qquad (3.3)$$

where tr[ · ] is the trace operator. Plugging equation (3.3) into equation (3.2) gives

$$T_{X\to Y} = \tfrac{1}{2}\sum_{m=1}^{\infty} \frac{(-1)^{m-1}}{m}\Big(\mathrm{tr}[(\Omega(Y', \mathbf{Y}^{(k)}) - I)^m] - \mathrm{tr}[(\Omega(\mathbf{Y}^{(k)}) - I)^m] - \mathrm{tr}[(\Omega(X, Y', \mathbf{Y}^{(k)}) - I)^m] + \mathrm{tr}[(\Omega(X, \mathbf{Y}^{(k)}) - I)^m]\Big). \qquad (3.4)$$
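The trace expansion in equation (3.3) is easy to verify numerically for a covariance matrix close to the identity. A small sketch (ours, assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(1)

# A covariance matrix close to the identity, so that the spectral radius
# of (Omega - I) is below one and the logarithm series converges.
A = rng.normal(scale=0.05, size=(4, 4))
Omega = np.eye(4) + A @ A.T

M = Omega - np.eye(4)
series = sum((-1) ** (m - 1) / m * np.trace(np.linalg.matrix_power(M, m))
             for m in range(1, 60))
assert np.isclose(series, np.log(np.linalg.det(Omega)))
```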

In order to simplify equation (3.4), consider the block structure of $B := \Omega(X, Y', \mathbf{Y}^{(k)}) - I$ and note that it contains $\Omega(Y', \mathbf{Y}^{(k)}) - I$, $\Omega(\mathbf{Y}^{(k)}) - I$ and $\Omega(X, \mathbf{Y}^{(k)}) - I$ as submatrices with overlapping diagonals

$$B = \begin{pmatrix} \Omega_{XX} - 1 & \Omega(1)_{XY} & \Omega_{XY} & \Omega(1)_{YX} & \cdots & \Omega(k-1)_{YX} \\ \Omega(1)_{XY} & \Omega_{YY} - 1 & \Omega(1)_{YY} & \Omega(2)_{YY} & \cdots & \Omega(k)_{YY} \\ \Omega_{XY} & \Omega(1)_{YY} & \Omega_{YY} - 1 & \Omega(1)_{YY} & \cdots & \Omega(k-1)_{YY} \\ \vdots & \vdots & \vdots & \ddots & & \vdots \\ \Omega(k-1)_{YX} & \Omega(k)_{YY} & \Omega(k-1)_{YY} & \cdots & & \Omega_{YY} - 1 \end{pmatrix}, \qquad (3.5)$$

where $\Omega(s)_{XY}$ represents the $(X, Y)$ entry of the lag-$s$ covariance matrix $\Omega(s)$ in equation (2.4). An explicit representation of these covariance matrices is provided in appendix A. Since most of the terms in the trace of $B^m$ also appear in the traces of the other covariance matrices in equation (3.4), they cancel. As shown in appendix A, the only non-zero terms remaining in equation (3.4) are those in $\mathrm{tr}[B^m]$ that involve the multiplication of at least one entry of $B$ from the first row or column (corresponding to correlations with $X$) and one entry from the second row or column (corresponding to correlations with the next value $Y'$ of the target). Therefore, we can simplify equation (3.4) as

$$T_{X\to Y} = \tfrac{1}{2}\sum_{m=1}^{\infty} T_{XY}^{(m)} = \tfrac{1}{2}\sum_{m=1}^{\infty} \frac{(-1)^m}{m}\,\overline{\mathrm{tr}[B^m]}, \qquad (3.6)$$

where $T_{XY}^{(m)}$ indicates the contribution to $T_{X\to Y}$ from power $m$ of $B$, and the overbar on $\overline{\mathrm{tr}[B^m]}$ indicates that only the terms involving at least one entry of $B$ from the first row and one from the second row (or columns) are considered. More formally,

$$\overline{\mathrm{tr}[B^m]} = \overline{\sum_i (B^m)_{ii}} = \sum_{\substack{i_1, \ldots, i_m \ \mathrm{s.t.} \\ \{1,2\} \subseteq \{i_1, \ldots, i_m\}}} B_{i_1 i_2} B_{i_2 i_3} \cdots B_{i_{m-1} i_m} B_{i_m i_1}. \qquad (3.7)$$

Let us now consider the cases $m = 1, 2$ separately. When $m = 1$, all the (diagonal) terms in $\overline{\mathrm{tr}[B]}$ are excluded, since a single diagonal entry cannot involve both the first and the second row:

$$T_{XY}^{(1)} = -\,\overline{\mathrm{tr}[B]} = -\,\overline{\sum_i B_{ii}} = 0. \qquad (3.8)$$

When m = 2, we have

$$T_{XY}^{(2)} = \tfrac{1}{2}\,\overline{\mathrm{tr}[B^2]} = \tfrac{1}{2}\,\overline{\sum_{i,j} B_{ij} B_{ji}} = \tfrac{1}{2}\sum_{\substack{i=1,\, j=2 \\ i=2,\, j=1}} B_{ij} B_{ji} = [\Omega(1)_{XY}]^2 = [(\Omega C)_{XY}]^2, \qquad (3.9)$$

where the last step follows from equation (2.4). Before proceeding to consider the cases m > 2, let us see how equation (3.9) can be used to relate the TE contribution TXY(2) to the network structure. Plugging equation (2.3) into equation (3.9) yields

$$T_{XY}^{(2)} = (C_{XY})^2 + 2\, C_{XY} (C^T C^2)_{XY} + O(||C||^6) \qquad (3.10a)$$
$$\quad = (C_{XY})^2 + 2 \sum_{i_1, i_2} C_{XY} C_{i_1 X} C_{i_1 i_2} C_{i_2 Y} + O(||C||^6). \qquad (3.10b)$$

In equation (3.10) and in the following, we will only consider the contributions to the TE up to order $O(||C||^4)$, where $||\cdot||$ is any consistent matrix norm [19]. Our approximations will, therefore, be most accurate when the link weights are homogeneous or have the same order of magnitude. Noting that product sums of connected link weights as in equation (3.10b) represent weighted walk counts of the relevant motifs, the first two panels in figure 1a,b provide a visual summary of the motifs involved in $T_{XY}^{(2)}$.
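As a numerical sanity check on equations (3.9) and (3.10b), the exact second-order contribution $[(\Omega C)_{XY}]^2$ can be compared against the weighted motif counts for a weakly coupled random network; the discrepancy should be of order $O(||C||^6)$. A sketch (ours, assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 6
C = rng.normal(scale=0.03, size=(N, N))  # weak coupling

# Stationary covariance via the power series of equation (2.3).
Omega = np.zeros_like(C)
Cj = np.eye(N)
for _ in range(200):
    Omega += Cj.T @ Cj
    Cj = Cj @ C

x, y = 0, 1
# Exact second-order contribution (3.9): T^(2) = [(Omega C)_XY]^2.
t2_exact = (Omega @ C)[x, y] ** 2

# Motif expansion (3.10b):
# (C_XY)^2 + 2 sum_{i1,i2} C_XY C_{i1,X} C_{i1,i2} C_{i2,Y}.
t2_motifs = C[x, y] ** 2 + 2 * C[x, y] * (C.T @ C @ C)[x, y]

# The two agree up to the neglected O(||C||^6) terms.
assert abs(t2_exact - t2_motifs) < 1e-6
```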

Now, consider the higher-order cases. When m = 3, we have

$$T_{XY}^{(3)} = -\tfrac{1}{3}\,\overline{\mathrm{tr}[B^3]} = -\tfrac{1}{3}\,\overline{\sum_{i,j,k} B_{ij} B_{jk} B_{ki}} = -\tfrac{1}{3} \sum_{\substack{i=1,\, j=2,\, k=1,\ldots,N \\ i=2,\, j=1,\, k=1,\ldots,N \\ j=1,\, k=2,\, i \neq 1 \\ j=2,\, k=1,\, i \neq 2 \\ k=1,\, i=2,\, j \neq 1,2 \\ k=2,\, i=1,\, j \neq 1,2}} B_{ij} B_{jk} B_{ki} \qquad (3.11a)$$
$$= -[(\Omega C)_{XY}]^2 (\Omega_{YY} - 1) - [(\Omega C)_{XY}]^2 (\Omega_{XX} - 1) - 2 [(\Omega C)_{XY}] [(\Omega C)_{YY}]\, \Omega_{YX} - 2 [(\Omega C)_{XY}] [(\Omega C^2)_{YY}] [(\Omega C)_{YX}] - 2 \sum_{l>2} [(\Omega C)_{XY}] [(\Omega C^l)_{YY}] [(\Omega C^{l-1})_{YX}]. \qquad (3.11b)$$

The six cases in the sum in equation (3.11a) are those where at least one of the indices (i, j, k) is equal to 1 and another index is equal to 2 (the third index can range between 1 and N, with some values excluded to avoid double counting).

Plugging equation (2.3) into equation (3.11b) yields

$$T_{XY}^{(3)} = -(C_{XY})^2 (C^T C)_{XX} - (C_{XY})^2 (C^T C)_{YY} - 2\, C_{XY} C_{YY} (C^T C)_{YX} - 2\, C_{XY} (C^2)_{YY} C_{YX} + O(||C||^6)$$
$$= -\sum_{i_1} (C_{XY})^2 (C_{i_1 X})^2 \qquad (3.12a)$$
$$\quad - \sum_{i_1} (C_{XY})^2 (C_{i_1 Y})^2 \qquad (3.12b)$$
$$\quad - 2 \sum_{i_1} C_{XY} C_{YY} C_{i_1 X} C_{i_1 Y} \qquad (3.12c)$$
$$\quad - 2 \sum_{i_1} C_{XY} C_{YX} C_{Y i_1} C_{i_1 Y} + O(||C||^6). \qquad (3.12d)$$

Similarly, when m = 4, we have

$$T_{XY}^{(4)} = \tfrac{1}{4}\,\overline{\mathrm{tr}[B^4]} = \tfrac{1}{4}\,\overline{\sum_{i,j,k,l} B_{ij} B_{jk} B_{kl} B_{li}} = \tfrac{1}{2}(C_{XY})^4 \qquad (3.13a)$$
$$\quad + (C_{XY})^2 (C_{YY})^2 \qquad (3.13b)$$
$$\quad + (C_{XY})^2 (C_{YX})^2 \qquad (3.13c)$$
$$\quad + 2\, C_{XY} C_{YX} (C_{YY})^2 + O(||C||^6). \qquad (3.13d)$$

The full derivation for the case $m = 4$ is provided in appendix B. We will not need to consider the cases $m > 4$, since $T_{XY}^{(m)} \in O(||C||^6)$ for all $m > 4$.

So far, we have analysed the cases m = 1, 2, 3, 4 separately. Let us now combine the results by summing the weighted walk counts from equations (3.10), (3.12), (3.13). In order to simplify the expressions, we will isolate the occurrences where the indices in the sums are equal to X or Y from the other values. In so doing, some of the weighted walk counts found previously will cancel each other. The final decomposition for the TE in terms of weighted walk counts of relevant motifs, which is the main result of this paper, is then

$$T_{X\to Y} = \tfrac{1}{2}\left(T_{XY}^{(2)} + T_{XY}^{(3)} + T_{XY}^{(4)}\right) + O(||C||^6) = \tfrac{1}{2}(C_{XY})^2 - \tfrac{1}{4}(C_{XY})^4 \qquad (3.14a)$$
$$\quad + \sum_{i_1 \neq X,Y}\ \sum_{i_2 \neq X,Y,i_1} C_{XY} C_{i_1 X} C_{i_1 i_2} C_{i_2 Y} \qquad (3.14b)$$
$$\quad + \tfrac{1}{2} \sum_{i_1 \neq X,Y} (C_{XY})^2 (C_{i_1 X})^2 \qquad (3.14c)$$
$$\quad - \tfrac{1}{2} \sum_{i_1 \neq X,Y} (C_{XY})^2 (C_{i_1 Y})^2 \qquad (3.14d)$$
$$\quad + \tfrac{1}{2}(C_{XX})^2 (C_{XY})^2 \qquad (3.14e)$$
$$\quad + \sum_{i_1 \neq X,Y} C_{XY} C_{i_1 i_1} C_{i_1 X} C_{i_1 Y} \qquad (3.14f)$$
$$\quad + \sum_{i_1 \neq X,Y} C_{XY} C_{XX} C_{X i_1} C_{i_1 Y} + O(||C||^6). \qquad (3.14g)$$

The motifs from equations (3.12c)–(3.12d) and equations (3.13b)–(3.13d) cancelled; on the other hand, the new motifs in equations (3.14e)–(3.14g) were introduced as special cases of equation (3.10b). The quartic term in equation (3.14a) and the term in equation (3.14d) are the only terms remaining from $T_{XY}^{(3)}$ that are negatively correlated with the TE and were not completely cancelled here. Figure 1 provides a visual summary of the motifs involved in $T_{X\to Y}$, up to order $O(||C||^4)$.
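Equation (3.14) can be validated numerically by comparing the motif-based approximation against the exact TE of equation (3.2) on a small, weakly coupled network. The following sketch is our own illustration (assuming numpy; the six-node example with a common parent is a hypothetical toy network, not one from the article):

```python
import numpy as np

def stationary_covariance(C, tol=1e-14, max_terms=5000):
    # Power series of equation (2.3).
    Omega = np.zeros_like(C)
    Cj = np.eye(C.shape[0])
    for _ in range(max_terms):
        term = Cj.T @ Cj
        Omega += term
        if np.abs(term).max() < tol:
            break
        Cj = Cj @ C
    return Omega

def pairwise_te_exact(C, x, y, k=10):
    # Exact TE via equation (3.2), with the joint covariance of
    # (X(t), Y(t+1), Y(t), ..., Y(t-k+1)) assembled from Omega(s) = Omega C^s.
    Omega = stationary_covariance(C)
    lags = [Omega]
    for _ in range(k + 1):
        lags.append(lags[-1] @ C)

    def cov(a, b):
        (i, ti), (j, tj) = a, b
        s = tj - ti
        return lags[s][i, j] if s >= 0 else lags[-s][j, i]

    def logdet(vs):
        return np.linalg.slogdet([[cov(a, b) for b in vs] for a in vs])[1]

    X, Yp = [(x, 0)], [(y, 1)]
    Ypast = [(y, -j) for j in range(k)]
    return 0.5 * (logdet(Yp + Ypast) - logdet(Ypast)
                  - logdet(X + Yp + Ypast) + logdet(X + Ypast))

def pairwise_te_motifs(C, x, y):
    # Motif approximation, equation (3.14), up to order O(||C||^4).
    others = [i for i in range(C.shape[0]) if i not in (x, y)]
    t = 0.5 * C[x, y] ** 2 - 0.25 * C[x, y] ** 4                          # (3.14a)
    t += sum(C[x, y] * C[i1, x] * C[i1, i2] * C[i2, y]
             for i1 in others for i2 in others if i2 != i1)               # (3.14b)
    t += 0.5 * sum(C[x, y] ** 2 * C[i1, x] ** 2 for i1 in others)         # (3.14c)
    t -= 0.5 * sum(C[x, y] ** 2 * C[i1, y] ** 2 for i1 in others)         # (3.14d)
    t += 0.5 * C[x, x] ** 2 * C[x, y] ** 2                                # (3.14e)
    t += sum(C[x, y] * C[i1, i1] * C[i1, x] * C[i1, y] for i1 in others)  # (3.14f)
    t += sum(C[x, y] * C[x, x] * C[x, i1] * C[i1, y] for i1 in others)    # (3.14g)
    return t

# Directed ring of 6 nodes with self-loops, plus a common parent (node 2)
# of the source X = 0 and the target Y = 1; all weights 0.1.
N = 6
C = 0.1 * np.eye(N)
for i in range(N):
    C[i, (i + 1) % N] = 0.1
C[2, 0] = C[2, 1] = 0.1

te_exact = pairwise_te_exact(C, 0, 1)
te_approx = pairwise_te_motifs(C, 0, 1)
# The motif expansion neglects only O(||C||^6) terms.
assert abs(te_exact - te_approx) < 1e-4
```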

4. Numerical simulations and discussion

(a). Directed link

The pairwise TE $T_{X\to Y}$ clearly depends on the weight of the directed link X → Y (as per equations (3.14a) and (3.14e) and corresponding figure 1a,e). Equation (3.14a) is the dominant term in equation (3.14) for linear Gaussian systems with weights $C_{XY} \in [-1, 1]$ being similar across the network, which is perhaps not so surprising. For such weights, the $(C_{XY})^2$ term will have a larger magnitude than the $(C_{XY})^4$ term, and so the total direct contribution of $C_{XY}$ to the TE in equation (3.14a) will be positive and will increase with the magnitude of $C_{XY}$.

(i). Discussion

Similarly, Hahs & Pethel [39] analytically investigated the TE between coupled Gaussian processes (for pairs of processes without a network embedding) and identified a general increase with link weight. Furthermore, a recent analytic study of a Boolean network model of policy diffusion also found that the TE depends on the square of the directed link weight as a first-order approximation [40]. Moreover, the directed link weight in the structural brain connectome is correlated with functional connectivity [22,41]. Positive and negative directed link weights result in the same contribution for the motifs in equations (3.14a) and (3.14e) (this dependence becomes more complex for the higher-order terms; see later sections). To distinguish the sign of the underlying link weight, one could examine the sub-components of the TE [42].

Yet, it is not always the case that information transfer is dominated by (or even correlated with) the weight of a directed link between the source and the target: the dependence on the link weight is generally non-monotonic, especially in nonlinear systems (see [8] and [9, Fig. 4.1]).

(b). In-degree of source and target

Beyond the effect of the directed link, the TE increases with the in-degree of the source X (see equation (3.14c) and figure 1c) and decreases with the in-degree of the target Y (see equation (3.14d) and figure 1d), regardless of the sign of the weights (since the weights are squared in the sums). This is because a higher number of incoming links can increase the variability of the source X (and therefore its entropy), which enables higher TE. The same effect has the opposite consequence on the target: although a higher target in-degree may increase the collective transfer [43,44] from the set of sources taken jointly, the confounds introduced by more sources weaken the predictive effect of each single source considered individually. The result is an asymmetry of information transfer, whereby the TE from the hubs to the other nodes is larger than the TE from the other nodes to the hubs. These factors are expected to have a strong effect in networks with low clustering coefficient, where the other motifs (equations (3.14b), (3.14f) and (3.14g)) are comparatively rare on average, e.g. in random networks.

(i). Numerical simulations

In order to test this prediction, the TE between all pairs of linked nodes was measured in undirected scale-free networks of 100 nodes obtained via preferential attachment [45]. At each iteration of the preferential attachment algorithm, a new node was connected bidirectionally to a single existing node (as well as to itself via a self-loop). A constant uniform link weight $C_{XY} = C_{XX} = 0.1$ was assigned to all the links, including the self-loops. The theoretical TE was computed according to equation (3.2) with $k = 14$ (matching the later empirical studies in §4c) and approximating $\Omega$ via the power series in equation (2.3) (until convergence). Differently from equation (3.14), the higher-order terms (i.e. $O(||C||^6)$) are not neglected here. The experiment was repeated on 10 000 different realizations of scale-free networks and the TE was averaged over the pairs with the same source and target in-degrees.
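The in-degree asymmetry can be reproduced in miniature: for a hub receiving many links, the exact TE of equation (3.2) on its outgoing link exceeds that on its incoming link. A sketch (ours, assuming numpy; the six-node hub network is a toy example, not the scale-free ensemble used above):

```python
import numpy as np

def stationary_covariance(C, tol=1e-14, max_terms=5000):
    # Power series of equation (2.3).
    Omega = np.zeros_like(C)
    Cj = np.eye(C.shape[0])
    for _ in range(max_terms):
        term = Cj.T @ Cj
        Omega += term
        if np.abs(term).max() < tol:
            break
        Cj = Cj @ C
    return Omega

def pairwise_te_exact(C, x, y, k=10):
    # Exact TE via equation (3.2) and Omega(s) = Omega C^s (equation (2.4)).
    Omega = stationary_covariance(C)
    lags = [Omega]
    for _ in range(k + 1):
        lags.append(lags[-1] @ C)

    def cov(a, b):
        (i, ti), (j, tj) = a, b
        s = tj - ti
        return lags[s][i, j] if s >= 0 else lags[-s][j, i]

    def logdet(vs):
        return np.linalg.slogdet([[cov(a, b) for b in vs] for a in vs])[1]

    X, Yp = [(x, 0)], [(y, 1)]
    Ypast = [(y, -j) for j in range(k)]
    return 0.5 * (logdet(Yp + Ypast) - logdet(Ypast)
                  - logdet(X + Yp + Ypast) + logdet(X + Ypast))

# Hub node 0 receives links from nodes 2..5; nodes 0 and 1 are linked
# bidirectionally; every node has a self-loop. Uniform weight 0.1.
N = 6
C = 0.1 * np.eye(N)
C[0, 1] = C[1, 0] = 0.1
for i in range(2, N):
    C[i, 0] = 0.1

te_hub_to_leaf = pairwise_te_exact(C, 0, 1)
te_leaf_to_hub = pairwise_te_exact(C, 1, 0)
# The high in-degree of the hub boosts the TE on its outgoing link
# and suppresses the TE on its incoming link.
assert te_hub_to_leaf > te_leaf_to_hub
```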

As shown in figure 2, the pairwise TE increased with the source in-degree and decreased with the target in-degree. The factor-of-three difference between the minimum and maximum TE values underlines the importance of these network effects beyond local pairwise link weights.

Figure 2. The pairwise transfer entropy (TE) increases with the in-degree of the source and decreases with the in-degree of the target, regardless of the sign of the link weights. The TE is plotted as a function of the source and target in-degree. The results were obtained from 10 000 simulations of scale-free networks of 100 nodes generated via preferential attachment, and the TE was averaged over all the node pairs with the same source and target in-degree. Note that the values in the lower-left corner are the result of an average over many samples, since most of the node pairs have low in-degree. There are progressively fewer samples for higher in-degree pairs, and none for most pairs in the upper-right corner (the absence indicated by the white colour). (Online version in colour.)

(ii). Discussion

Interestingly, qualitatively similar results were obtained when the experiment was replicated on RBNs, despite their nonlinear dynamics (appendix C). Similarly, a recent analytic study of a Boolean network model of policy diffusion also found that the TE is proportional to the weighted in-degree of the source and negatively proportional to the weighted in-degree of the target, as a second-order approximation [40]. A positive correlation between the pairwise TE and the in-degree of the source was also reported in simulations involving neural mass models [17], Kuramoto oscillators [16], and a model of cascading failures in energy networks [15]. This is consistent with further findings showing that the degree of a node X is correlated to the ratio of (average) outgoing to incoming information transfer from/to X in various dynamical models, including Ising dynamics on the human connectome [12,13]. Similarly, a study by Walker et al. [46] on effects of degree-preserving versus non-degree-preserving network randomizations on Boolean dynamics suggests that the presence of hubs plays a significant role in information transfer, as well as identifying that local structure beyond degree also contributes (as per the next section). Our results reinforce the suggestion that such correlation of source in-degree to TE is to be expected in general [17], since the linear Gaussian autoregressive processes considered here can be seen as approximations of nonlinear dynamics in the weakly coupled near-linear regime [19].

Differently though, Timme et al. [14] report that the out-degree of the source correlates with the computation performed by a neuron (defined as the synergistic component of the TE [47]). It is difficult to identify a direct mechanistic reason for this; however, it is possible that this effect is mediated indirectly by re-entrant walks between the source and the target, similarly to how path transitivity enhances the undirected functional connectivity [21]. The role of the motifs involving multiple walks is discussed in the next section.

Returning to the earlier qualification that a higher target in-degree may increase the collective transfer from the target’s set of sources taken jointly, we note that this was previously empirically observed by Li et al. [17], and over the sum of pairwise transfers by Olin-Ammentorp and Cady [48]. Analytically investigating collective transfer across a set of sources jointly for the VAR dynamics remains a topic for future work.

Finally, echoing [40], the effect of the in-degree has implications for computing the directed functional connectivity via the pairwise TE, which has been widely employed in neuroscience [10,49–51]. When using TE as a pairwise measure, the links from hubs to low-degree nodes would generally be easier to infer than links between hubs, as well as links from low-degree nodes to hubs. This applies especially when the low number of time samples makes it difficult to distinguish weak transfer from noise and, importantly, could introduce a bias in the estimation of network properties. More specifically, we expect the in-degree of hubs to be underestimated, which may thin the tail of the in-degree distribution. As Goodman and Porfiri [40] also concluded, 'the out-degree plays a surprisingly marginal role on the quality of the inference'. However, where the out-degree is correlated with the in-degree (e.g. for undirected networks), we expect the out-degree of non-hubs to be underestimated, which may relatively fatten the tail of the out-degree distribution. For all of these reasons, the rich-club coefficient [52] may also be altered. These implications also apply to iterative or greedy algorithms based on multivariate TE [53–57], since they rely on computing the pairwise TE as a first step.

(c). Clustered motifs

So far, we have discussed the directed motif (equation (3.14a)) and we have considered networks with a low global clustering coefficient, where the in-degrees of the source and the target (equations (3.14c) and (3.14d)) play an important role. In networks with higher global clustering coefficients, such as lattice or small-world networks, other motifs will provide a significant contribution to the pairwise TE beyond the effect of the in-degrees. Specifically, these are the clustered motifs that involve a common parent (equations (3.14b) and (3.14f) and corresponding figure 1b,f) or a secondary path (equation (3.14g) and figure 1g) in addition to the directed link X → Y. The relative importance of the terms in equation (3.14) in fact depends on the properties of the network: if the clustering coefficient is high, the abundance of the clustered motifs makes their effect significant, despite each motif only contributing to the TE at order 4 (see equations (3.14b), (3.14f) and (3.14g)). Therefore, if the link weights are positive, we would expect the pairwise TE to be higher (due to these motifs) than what would be accounted for by the directed and in-degree motifs alone. The reason is that the common parent and the secondary pathways reinforce the effect of the directed link X → Y, leading to a greater predictive payoff from knowing the activity of the source X.

(i). Numerical simulations

This prediction was tested on Watts–Strogatz ring networks [58], starting from a directed ring network of N = 100 nodes with uniform link weights $C_{XY} = C_{XX} = 0.15$ and fixed in-degree $d_{\mathrm{in}} = 4$ (i.e. each node was linked to two neighbours on each side as well as itself). The source of each link was rewired with probability γ, such that the in-degree of each node was unchanged and the effect of the other motifs could be studied. The clustering coefficient decreased for higher values of γ as the network underwent a small-world transition, and so did the number of clustered motifs. Accordingly, the average theoretical TE between linked nodes (computed via equation (3.2) with k = 14 as above) decreased as predicted (see orange circles in figure 3).
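The structural side of this prediction can be checked without computing any TE: degree-preserving rewiring destroys the clustered motifs of equations (3.14b), (3.14f) and (3.14g). A sketch (ours, assuming numpy; it counts unweighted motifs only, as a proxy for the weighted walk counts):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100

def ring_adjacency(N):
    # Directed ring lattice: each node receives links from its two
    # neighbours on each side (in-degree 4), as in the Watts-Strogatz start.
    A = np.zeros((N, N), dtype=int)
    for j in range(N):
        for d in (1, 2):
            A[(j - d) % N, j] = 1
            A[(j + d) % N, j] = 1
    return A

def rewired_adjacency(N, rng):
    # Fully rewired (gamma = 1): each node keeps in-degree 4,
    # but its sources are chosen uniformly at random.
    A = np.zeros((N, N), dtype=int)
    for j in range(N):
        sources = rng.choice([i for i in range(N) if i != j],
                             size=4, replace=False)
        A[sources, j] = 1
    return A

def clustered_motifs_per_link(A):
    # For every existing link X -> Y, count common parents of X and Y
    # (as in motifs b/f) and two-step walks X -> i -> Y (as in motif g).
    counts = []
    for x, y in zip(*np.nonzero(A)):
        others = [i for i in range(A.shape[0]) if i not in (x, y)]
        common_parents = sum(A[i, x] * A[i, y] for i in others)
        two_walks = sum(A[x, i] * A[i, y] for i in others)
        counts.append(common_parents + two_walks)
    return np.mean(counts)

ring = clustered_motifs_per_link(ring_adjacency(N))
rand = clustered_motifs_per_link(rewired_adjacency(N, rng))
# The ring lattice is far richer in clustered motifs than its
# degree-preserving randomization.
assert ring > rand
```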

Figure 3. Average transfer entropy (TE) as a function of the rewiring probability in Watts–Strogatz ring networks. For positive link weights, the pairwise TE is higher in clustered networks than in random networks, due to the higher number of clustered motifs. For each value of the rewiring probability (γ), the results for 10 simulations on different networks are presented (low-opacity markers) in addition to the mean values (solid markers). The plot shows that the approximation based on all the motifs up to order 4 (green triangles) is closer to the theoretical values (orange circles) than the approximation based on the in-degrees and directed motifs alone (red squares) or on the directed motifs alone (violet diamonds). The empirical values are also shown (blue crosses) as a validation of the theoretical results. (Online version in colour.)

Figure 3 also reports the empirical values of the TE, estimated from synthetic time series of 100 000 time samples. The analysis was carried out using the IDTxl software [59], employing the Gaussian estimator and selecting an optimal embedding of size k = 14 for the target time series. This provides a validation of the theoretical TE (computed via equations (3.2) and (2.3)), which matches these empirical values. The approximation in terms of motifs up to order $O(||C||^4)$ (computed via equation (3.14)), while not capturing all higher-order components of the TE, does reproduce the overall trend in agreement with the theoretical values, providing further validation of our main derivations. On the other hand, the partial approximation based on the directed link weight and the in-degrees (motifs a, c, d and e) is not sufficient to reproduce the empirical TE trend, since it does not account for the changing contribution of motif structures with the rewiring parameter γ.
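For Gaussian variables, such an empirical estimator reduces to a ratio of linear-regression residual variances. A self-contained sketch of this principle on simulated data (ours, assuming numpy; this illustrates the general Gaussian estimator, not the IDTxl implementation):

```python
import numpy as np

rng = np.random.default_rng(4)
T, N, k = 200_000, 2, 4
C = np.array([[0.3, 0.4],
              [0.0, 0.3]])

# Simulate the VAR(1) process of equation (2.1).
Z = np.zeros((T, N))
for t in range(T - 1):
    Z[t + 1] = Z[t] @ C + rng.normal(size=N)
x_series, y_series = Z[:, 0], Z[:, 1]

def residual_variance(target, predictors):
    # Variance of the least-squares residual of target on predictors.
    beta, *_ = np.linalg.lstsq(predictors, target, rcond=None)
    return (target - predictors @ beta).var()

# Align Y(t) with its k past values and the previous source value X(t-1).
Yp = y_series[k:]
Ypast = np.column_stack([y_series[k - 1 - j : T - 1 - j] for j in range(k)])
X = x_series[k - 1 : T - 1]

# Gaussian TE estimator:
# T_{X->Y} = 1/2 ln( var(Y'|Y_past) / var(Y'|Y_past, X) ).
v_reduced = residual_variance(Yp, Ypast)
v_full = residual_variance(Yp, np.column_stack([Ypast, X]))
te_hat = 0.5 * np.log(v_reduced / v_full)

# The link 0 -> 1 with weight 0.4 yields a clearly positive estimate.
assert 0.01 < te_hat < 0.2
```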

(ii). Discussion

If the link weights are positive, the pairwise TE increases with the number of clustered motifs. (This applies on average in the mammalian cortex, where the majority of the connections are thought to be excitatory [27].) As such, the effect of the clustered motifs has implications for computing the directed functional connectivity via the pairwise TE: the directed functional connectivity is better able to infer links within brain modules (where such motifs enhance TE values) than links across modules. This appears to align with results of Stetter et al. [51], finding that the true positive rate for TE-based directed functional network inference on simulated neural cultures generally increased with clustering coefficient of the underlying network structure. When negative weights are present (interpretable as inhibitory in a neural context), the direct relationship to the number of motifs for equations (3.14b), (3.14f) and (3.14g) is less clear and depends intricately on the proportion and placement of these negatively-weighted links (though the overall relation to weighted motif counts obviously still holds).

Differently from the case of the in-degree, the effect of the clustered motifs on the pairwise TE was not qualitatively preserved in RBNs. Our experiments on RBNs in appendix C show that the pairwise TE there increases with the rewiring probability γ. These results align with more comprehensive experiments in a previous study [18], where it was argued that long links are able to introduce new information to the target that it was less likely to have previously been exposed to, in contrast to the information available from its clustered near neighbours. This effect does not appear to be so important for linear dynamics, as it cannot be identified in the motifs in equation (3.14) and figure 1. Mediano & Shanahan [62] also report a slightly different effect in other nonlinear dynamics: averages of (higher-order conditional) TE peak at values of γ on the random side of the small-world regime in a model of coupled spiking neurons (in contrast to our approach, this is averaged over all pairs of nodes in the system, connected or not). They argue that the neurons are functionally decoupled in the regular regime, and that in the random regime the strong correlations across the network mean that the source cannot add information about the target beyond what is already conditioned on. The dominant effect in the linear dynamics under consideration here is the additive reinforcement achieved from the clustered structure identified in equations (3.14b), (3.14f) and (3.14g); such an effect is likely less pertinent to nonlinear dynamics such as those of RBNs and spiking neurons.

(d). Further remarks

The decomposition of the pairwise TE in terms of network motifs (equation (3.14) and figure 1) was performed up to order $O(||C||^4)$. Longer motifs will start to appear in higher-order approximations. For example, motifs involving a confounding effect (i.e. a common parent of X and Y without the directed link X → Y) appear at order 6 (not shown). The higher-order motifs provide only a small contribution for $C_{XY} = C_{XX} = 0.15$ in figure 3; that contribution will become more significant as the link weights become larger (in particular when the spectral radius is close to 1).

A similar decomposition of the active information storage in the dynamics of a target node was provided in previous work [28], reporting that the largest contributions were from low-order feedback and feed-forward motifs (with the relevant feed-forward motifs converging on the target node Y). The motifs contributing to the information storage at a node Y contrast with those contributing to the decomposition of information transfer from X → Y presented in equation (3.14). First, there is no explicit contribution of feedback loops in the TE decomposition. This may seem contrary to the expectation that they detract from TE (since they facilitate prior knowledge of the source stored in the past of the target, which TE removes). While such terms do not appear explicitly, their detracting effect has been implicitly removed prior to the final result: because the unlabelled nodes in figure 1 are distinct from the target Y, any feedback loops potentially including Y have been removed from the counts in figure 1 (panels b, f, g). Moreover, the types of feed-forward motifs that contribute to information storage on Y and transfer from X → Y are slightly distinct. Feed-forward motifs contribute to transfer here where the source X is on one of two walks with the same lengths to Y from some common driver (equations (3.14b), (3.14f) and (3.14g)). By contrast, a motif will generate an information storage effect on the target Y where the lengths of those walks are distinct [28]. We can interpret this as the difference between the reinforcement of a direct effect from X (transfer) versus a correlation in Y of dynamics across time steps (storage).

5. Conclusion

A linear, order-1 autoregressive process was used to systematically investigate the dependence of the pairwise TE on the global network topology. Specific weighted motifs were found to enhance or reduce the TE (equation (3.14)), as summarized in figure 1. The assumptions of linearity, stationarity, Gaussian noise, and uniform link weights were made in order to enable the analytical treatment. Importantly, under these assumptions, the results also apply to Granger causality [35]. Moreover, the numerical simulations in appendix C and the recent literature on the topic suggest that the dependence of the TE on the in-degree also holds for nonlinear dynamics.
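Under these assumptions, the pairwise TE can be computed in closed form from the stationary lagged covariances of the process. The sketch below is a minimal, hypothetical illustration using a target history length of 1 (the derivation in the paper uses general history length k) and unit-variance noise, with the convention that C[i, j] is the weight of the link from node j to node i:

```python
import numpy as np

def pairwise_te(C, x, y):
    """Gaussian transfer entropy TE_{x->y} (in nats), history length 1,
    for the VAR(1) process Z_t = C Z_{t-1} + N(0, I) noise.
    Assumes the spectral radius of C is below 1 (stationarity)."""
    n = C.shape[0]
    # stationary covariance solves G0 = C G0 C^T + I; iterate to convergence
    G0 = np.eye(n)
    for _ in range(1000):
        G0 = C @ G0 @ C.T + np.eye(n)
    G1 = C @ G0  # lag-1 covariance: Cov(Z_t, Z_{t-1})
    # residual variance of Y_t given Y_{t-1} alone
    var_y_past = G0[y, y] - G1[y, y] ** 2 / G0[y, y]
    # residual variance of Y_t given (Y_{t-1}, X_{t-1})
    b = np.array([G1[y, y], G1[y, x]])
    S = np.array([[G0[y, y], G0[y, x]],
                  [G0[x, y], G0[x, x]]])
    var_y_both = G0[y, y] - b @ np.linalg.solve(S, b)
    return 0.5 * np.log(var_y_past / var_y_both)

# single link X -> Y with weight c
c = 0.5
C = np.array([[0.0, 0.0], [c, 0.0]])
te = pairwise_te(C, x=0, y=1)
```

For an isolated link X → Y of weight c, this setup yields TE = ½ ln(1 + c²), the weak-coupling first approximation discussed above.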

In future work, the analytic approach will be extended to linear systems in continuous time, such as the multivariate Ornstein–Uhlenbeck process (as performed by Barnett et al. [19,27] for the Tononi–Sporns–Edelman (TSE) complexity [26]). Recent progress has already been made in the inference of the weighted adjacency matrix from observations for these continuous-time systems [63–65]. Furthermore, higher-order conditional and collective transfer entropies [43,44] could also be investigated in a similar fashion. Since conditional TE terms remove redundancies and include synergies between the considered source and the conditional sources [47], it is likely that, compared with the pairwise effect, some contributing motif structures will be removed and new ones included.

Acknowledgements

The authors acknowledge the Sydney Informatics Hub and the University of Sydney’s high-performance computing cluster Artemis for providing the high-performance computing resources that have contributed to the research results reported within this paper.

Appendix A. Covariance matrices and non-zero terms in equation (3.4)

The covariance matrices Ω(Y, Y(k)) − I, Ω(Y(k)) − I, and Ω(X, Y(k)) − I can be obtained as submatrices of B = Ω(X, Y, Y(k)) − I (see equation (3.5)). Specifically, we have

$$\Omega(\mathbf{Y}^{(k)}) - I = \begin{pmatrix} \Omega(0)_{YY} - 1 & \cdots & \Omega(k-1)_{YY} \\ \vdots & \ddots & \vdots \\ \Omega(k-1)_{YY} & \cdots & \Omega(0)_{YY} - 1 \end{pmatrix}, \tag{A 1}$$

$$\Omega(Y, \mathbf{Y}^{(k)}) - I = \begin{pmatrix} \Omega(0)_{YY} - 1 & \Omega(1)_{YY} & \cdots & \Omega(k)_{YY} \\ \Omega(1)_{YY} & \Omega(0)_{YY} - 1 & \cdots & \Omega(k-1)_{YY} \\ \vdots & \vdots & \ddots & \vdots \\ \Omega(k)_{YY} & \Omega(k-1)_{YY} & \cdots & \Omega(0)_{YY} - 1 \end{pmatrix} \tag{A 2}$$

and

$$\Omega(X, \mathbf{Y}^{(k)}) - I = \begin{pmatrix} \Omega(0)_{XX} - 1 & \Omega(0)_{YX} & \cdots & \Omega(k-1)_{YX} \\ \Omega(0)_{YX} & \Omega(0)_{YY} - 1 & \cdots & \Omega(k-1)_{YY} \\ \vdots & \vdots & \ddots & \vdots \\ \Omega(k-1)_{YX} & \Omega(k-1)_{YY} & \cdots & \Omega(0)_{YY} - 1 \end{pmatrix}. \tag{A 3}$$

The four matrix traces involved in equation (3.4) are

$$\operatorname{tr}\!\left[\left(\Omega(Y, \mathbf{Y}^{(k)}) - I\right)^m\right], \tag{A 4a}$$
$$\operatorname{tr}\!\left[\left(\Omega(\mathbf{Y}^{(k)}) - I\right)^m\right], \tag{A 4b}$$
$$\operatorname{tr}\!\left[\left(\Omega(X, Y, \mathbf{Y}^{(k)}) - I\right)^m\right] = \operatorname{tr}\!\left[B^m\right] \tag{A 4c}$$
and
$$\operatorname{tr}\!\left[\left(\Omega(X, \mathbf{Y}^{(k)}) - I\right)^m\right]. \tag{A 4d}$$

Let us start with the difference (equation (A 4d) − equation (A 4c)). The trace in equation (A 4c) can be expanded as

$$\operatorname{tr}\!\left[B^m\right] = \sum_i \left(B^m\right)_{ii} = \sum_{i_1, \ldots, i_m} B_{i_1 i_2} B_{i_2 i_3} \cdots B_{i_{m-1} i_m} B_{i_m i_1}, \tag{A 5}$$

and the trace in equation (A 4d) can be expanded similarly as a sum. Since Ω(X, Y(k)) − I is a submatrix of B, all the terms in equation (A 4d) also appear in equation (A 4c). Thus, the remaining terms in the difference (equation (A 4d) − equation (A 4c)) are the terms in equation (A 5) that involve entries from the second row (or column) of B, i.e. those where at least one of the indices i1, …, im is equal to 2 (corresponding to Y).

Similarly, all the terms in equation (A 4b) also appear in equation (A 4a). Thus, the remaining terms in the difference (equation (A 4a)–equation (A 4b)) are those where at least one of the indices i1, …, im corresponds to Y (being equal to 1 for the matrix in equation (A 4a), but equal to 2 when aligned with matrix B in equation (A 5)).

Finally, the remaining terms in the trace differences in equation (3.4)

$$\bigl(\text{equation (A 4a)} - \text{equation (A 4b)}\bigr) - \bigl(\text{equation (A 4c)} - \text{equation (A 4d)}\bigr)$$

are the terms in equation (A 5) that (i) involve at least one entry of B from the second row (or column), corresponding to Y (as per the arguments above), and also (ii) involve at least one entry of B from the first row (or column), corresponding to X (in order to appear in (equation (A 4c) − equation (A 4d)) but not in (equation (A 4a) − equation (A 4b))). That is, the remaining terms are those in equation (A 5) where at least one of the indices i1, …, im is equal to 1 and another one is equal to 2.
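This index-counting argument is easy to verify numerically by brute force. The following sketch (an illustration with our own naming, not the authors' code) checks, for m = 4 and a small random symmetric matrix B with X and Y in the first two positions, that the sum of cyclic terms involving both indices equals the combination (equation (A 4c) − equation (A 4d)) − (equation (A 4a) − equation (A 4b)) of trace differences:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
N = 5
B = rng.normal(size=(N, N))
B = (B + B.T) / 2            # covariance-like: symmetric
m = 4

def tr_m(M):
    """tr[M^m] for the chosen order m."""
    return np.trace(np.linalg.matrix_power(M, m))

def sub(M, drop):
    """Submatrix of M with the rows/columns in `drop` removed."""
    keep = [i for i in range(N) if i not in drop]
    return M[np.ix_(keep, keep)]

# sum of cyclic products B_{i1 i2} ... B_{im i1} whose index set
# contains both X (index 0) and Y (index 1)
direct = sum(
    np.prod([B[idx[a], idx[(a + 1) % m]] for a in range(m)])
    for idx in itertools.product(range(N), repeat=m)
    if 0 in idx and 1 in idx
)

# terms involving Y:            tr[B^m] - tr[(B without Y)^m]
# terms involving Y but not X:  the same difference with X removed first
traces = (tr_m(B) - tr_m(sub(B, {1}))) - (tr_m(sub(B, {0})) - tr_m(sub(B, {0, 1})))
```

Here B without Y plays the role of Ω(X, Y(k)) − I, and B without X that of Ω(Y, Y(k)) − I.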

Appendix B. Derivation of motifs for m = 4

When m = 4 in equation (3.6), we have

$$T^{(4)}_{X \to Y} = \frac{1}{4}\,\overline{\operatorname{tr}\!\left[B^4\right]} = \frac{1}{4}\,\overline{\sum_{i,j,k,l} B_{ij} B_{jk} B_{kl} B_{li}}, \tag{B 1}$$

where the overbar indicates that only the terms that involve at least one entry of B from the first row and one from the second row (or columns) are considered. There are 12 cases to consider, i.e. those where at least one of the four indices (i, j, k, l) is equal to 1 and another index is equal to 2 (the other indices can range between 1 and N, with some values excluded to avoid double counting):

$$T^{(4)}_{X \to Y} = \frac{1}{4} \left( \sum_{\substack{i=1;\, j=2;\\ k;\, l}} + \sum_{\substack{i=2;\, j=1;\\ k;\, l}} + \sum_{\substack{i \neq 2;\, j=1;\\ k=2;\, l}} + \sum_{\substack{i \neq 1;\, j=2;\\ k=1;\, l}} + \sum_{\substack{i;\, j \neq 1,2;\\ k=1;\, l=2}} + \sum_{\substack{i;\, j \neq 1,2;\\ k=2;\, l=1}} + \sum_{\substack{i=2;\, j \neq 1;\\ k \neq 1,2;\, l=1}} + \sum_{\substack{i=1;\, j \neq 2;\\ k \neq 1,2;\, l=2}} \right) B_{ij} B_{jk} B_{kl} B_{li} \tag{B 2a}$$

$$\quad + \frac{1}{4} \left( \sum_{\substack{i=1;\, k=2;\\ j;\, l}} + \sum_{\substack{i=2;\, k=1;\\ j;\, l}} + \sum_{\substack{j=1;\, l=2;\\ i \neq 1;\, k \neq 1}} + \sum_{\substack{j=2;\, l=1;\\ i \neq 2;\, k \neq 2}} \right) B_{ij} B_{jk} B_{kl} B_{li}. \tag{B 2b}$$

The terms in equation (B 2b) will be neglected since they contribute at order $O(\lVert C \rVert^6)$ once the expansions of the covariance matrices are inserted (equations (2.3) and (2.4)). Computing the remaining terms in equation (B 2a) gives the result shown in equation (3.13).

Appendix C. Extension to random Boolean networks

Random Boolean networks (RBNs) are a class of discrete dynamical systems proposed as models of gene regulatory networks by Kauffman [66]. Each node in the network has a Boolean state value, which is updated in discrete time. In the original formulation, the new state of each node is a deterministic Boolean function of the current state of its parents. Given the topology of the network, this function is assigned at random for each node when the network is initialized, subject to a probability r of producing ‘1’ outputs. Unlike in the original formulation, the Boolean functions were made stochastic here by introducing a probability p = 0.005 that each node switches state at each time step.
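A minimal sketch of such a noisy RBN update rule (the structure and names below are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def simulate_rbn(adj, T, r=0.5, p=0.005, seed=0):
    """Simulate a noisy random Boolean network.

    adj[i] is an integer array listing the parents of node i. Each node
    gets a random Boolean lookup table with probability r of a '1'
    output; after each deterministic update, every node flips its new
    state with probability p (the stochastic noise described above).
    """
    rng = np.random.default_rng(seed)
    n = len(adj)
    # one lookup table per node: 2^(in-degree) random Boolean outputs
    tables = [rng.random(2 ** len(parents)) < r for parents in adj]
    state = rng.integers(0, 2, size=n)
    history = np.empty((T, n), dtype=int)
    for t in range(T):
        # index each node's table by the joint state of its parents
        idx = [int("".join(map(str, state[parents])), 2) if len(parents) else 0
               for parents in adj]
        state = np.array([tables[i][idx[i]] for i in range(n)], dtype=int)
        # stochastic noise: flip each node's state with probability p
        flips = rng.random(n) < p
        state = np.where(flips, 1 - state, state)
        history[t] = state
    return history

# toy 3-node ring: 0 -> 1 -> 2 -> 0
adj = [np.array([2]), np.array([0]), np.array([1])]
hist = simulate_rbn(adj, T=1000)
```

Setting p = 0 recovers the original deterministic formulation.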

The experiment described in §4 (in-degree of source and target) was repeated on RBNs with r = 0.5, keeping the same topology (scale-free networks obtained via preferential attachment). In the absence of theoretical results, the pairwise TE was estimated numerically from synthetic time series of 100 000 time samples. The time series were embedded with a history length k = 14, as in §4. The results (shown in figure 4) were qualitatively similar to those obtained using linear Gaussian processes (figure 2).
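As an indication of how such an estimate works, the sketch below implements a naive plug-in TE estimator for binary time series. It is shown with history length k = 1 for tractability and is an illustrative assumption, not the estimator used in the paper; the k = 14 embedding used above requires far more samples and bias correction:

```python
from collections import Counter
from math import log
import random

def plugin_te(x, y, k=1):
    """Plug-in transfer entropy TE_{X->Y} (in nats) for binary series,
    with target history length k and a single source lag of 1."""
    joint = Counter()
    for t in range(k, len(y)):
        joint[(y[t], tuple(y[t - k:t]), x[t - 1])] += 1
    n = sum(joint.values())
    # marginal counts needed for the conditional probabilities
    c_ypx, c_yp, c_yyp = Counter(), Counter(), Counter()
    for (yt, yp, xt), c in joint.items():
        c_ypx[(yp, xt)] += c
        c_yp[yp] += c
        c_yyp[(yt, yp)] += c
    # TE = sum p(yt, yp, xt) log[ p(yt | yp, xt) / p(yt | yp) ]
    te = 0.0
    for (yt, yp, xt), c in joint.items():
        te += (c / n) * log(c * c_yp[yp] / (c_yyp[(yt, yp)] * c_ypx[(yp, xt)]))
    return te

# x drives y with a one-step lag: TE_{X->Y} should be near ln 2,
# while TE_{Y->X} should be near zero
random.seed(0)
x = [random.randint(0, 1) for _ in range(20000)]
y = [0] + x[:-1]  # y copies x with lag 1
te_xy = plugin_te(x, y)
te_yx = plugin_te(y, x)
```

The plug-in estimator is non-negative by construction but positively biased, which is why the main text relies on bias-corrected estimates.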

Figure 4.

Pairwise transfer entropy (TE) as a function of the source and target in-degree in random Boolean networks. Similarly to the linear Gaussian case (figure 2), the TE increases with the in-degree of the source and decreases with the in-degree of the target. The results were obtained from 10 000 simulations of scale-free networks of 100 nodes generated via preferential attachment. The TE was averaged over all the node pairs with the same in-degrees. The values in the lower-left corner are the result of an average over many samples, since most of the node pairs have low in-degrees. There are progressively fewer observations for higher in-degrees and none in the upper-right corner (the absence indicated by white colour). (Online version in colour.)

The experiment presented in §4 (clustered motifs) was also repeated using random Boolean networks, keeping the same topology. In this case, the results (shown in figure 5) were not qualitatively similar to those obtained using linear Gaussian processes (figure 3): consistent with previous studies [18] (which did not add stochastic noise), the pairwise TE increases with the rewiring probability γ.

Figure 5.

Average empirical transfer entropy as a function of the rewiring probability in Watts–Strogatz ring networks with random Boolean dynamics. The results for 20 simulations on different networks are presented (low-opacity markers) in addition to the mean values (solid markers). (Online version in colour.)

Footnotes

1. The determination of these embedding parameters followed the method of Garland et al. [60], selecting the values that maximize the active information storage, with the important additional inclusion of bias correction (since increasing k generally increases the bias of the estimate) [61].

Data accessibility

The synthetic data were generated via computer simulations. The code and the data are available on GitHub (github.com/LNov/transferEntropyMotifs) and Zenodo (https://doi.org/10.5281/zenodo.3724176) to facilitate the reproduction of the original results.

Authors' contributions

J.T.L., L.N., F.M.A. and J.J. conceptualized the study; L.N. and J.T.L. carried out the formal analysis; L.N. performed the numerical validation, prepared the visualizations and wrote the original draft; all the authors edited and approved the final manuscript.

Competing interests

We declare we have no competing interests.

Funding

J.T.L. was supported through the Australian Research Council DECRA Fellowship grant no. DE160100630 and through the University of Sydney Research Accelerator (SOAR) prize program.

References

1. Barrat A, Barthelemy M, Vespignani A. 2008. Dynamical processes on complex networks. Cambridge, UK: Cambridge University Press.
2. Liu Y-Y, Barabási A-L. 2016. Control principles of complex systems. Rev. Mod. Phys. 88, 035006. (doi:10.1103/RevModPhys.88.035006)
3. Nishikawa T, Sun J, Motter AE. 2017. Sensitive dependence of optimal network dynamics on network structure. Phys. Rev. X 7, 041044. (doi:10.1103/physrevx.7.041044)
4. Sporns O, Tononi G, Edelman GM. 2000. Connectivity and complexity: the relationship between neuroanatomy and brain dynamics. Neural Netw. 13, 909–922. (doi:10.1016/S0893-6080(00)00053-8)
5. Shannon CE. 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423. (doi:10.1002/j.1538-7305.1948.tb01338.x)
6. Prokopenko M, Boschetti F, Ryan AJ. 2009. An information-theoretic primer on complexity, self-organization, and emergence. Complexity 15, 11–28. (doi:10.1002/cplx.20249)
7. Lizier JT. 2013. The local information dynamics of distributed computation in complex systems. Berlin/Heidelberg, Germany: Springer Theses.
8. Schreiber T. 2000. Measuring information transfer. Phys. Rev. Lett. 85, 461–464. (doi:10.1103/PhysRevLett.85.461)
9. Bossomaier T, Barnett L, Harré M, Lizier JT. 2016. An introduction to transfer entropy. Cham, Switzerland: Springer International Publishing.
10. Wibral M, Vicente R, Lizier JT. 2014. Directed information measures in neuroscience. Berlin/Heidelberg, Germany: Springer.
11. Timme NM, Lapish C. 2018. A tutorial for information theory in neuroscience. eNeuro 5, e0052-18. (doi:10.1523/ENEURO.0052-18.2018)
12. Marinazzo D, Pellicoro M, Wu G, Angelini L, Cortés JM, Stramaglia S. 2014. Information transfer and criticality in the Ising model on the human connectome. PLoS ONE 9, e93616. (doi:10.1371/journal.pone.0093616)
13. Marinazzo D, Wu G, Pellicoro M, Angelini L, Stramaglia S. 2012. Information flow in networks and the law of diminishing marginal returns: evidence from modeling and human electroencephalographic recordings. PLoS ONE 7, e45026. (doi:10.1371/journal.pone.0045026)
14. Timme NM, Ito S, Myroshnychenko M, Nigam S, Shimono M, Yeh F-C, Hottowy P, Litke AM, Beggs JM. 2016. High-degree neurons feed cortical computations. PLoS Comput. Biol. 12, e1004858. (doi:10.1371/journal.pcbi.1004858)
15. Lizier JT, Prokopenko M, Cornforth DJ. 2009. The information dynamics of cascading failures in energy networks. In Proc. of the European Conf. on Complex Systems (ECCS), Warwick, UK, 21–25 September, p. 54.
16. Ceguerra RV, Lizier JT, Zomaya AY. 2011. Information storage and transfer in the synchronization process in locally-connected networks. In IEEE Symp. on Artificial Life (ALIFE), Paris, France, 11–15 April, pp. 54–61. Piscataway, NJ: IEEE.
17. Li M, Han Y, Aburn MJ, Breakspear M, Poldrack RA, Shine JM, Lizier JT. 2019. Transitions in information processing dynamics at the whole-brain network level are driven by alterations in neural gain. PLoS Comput. Biol. 15, e1006957. (doi:10.1371/journal.pcbi.1006957)
18. Lizier JT, Pritam S, Prokopenko M. 2011. Information dynamics in small-world Boolean networks. Artif. Life 17, 293–314. (doi:10.1162/artl_a_00040)
19. Barnett L, Buckley CL, Bullock S. 2009. Neural complexity and structural connectivity. Phys. Rev. E 79, 051914. (doi:10.1103/PhysRevE.79.051914)
20. Messé A, Rudrauf D, Giron A, Marrelec G. 2015. Predicting functional connectivity from structural connectivity via computational models using MRI: an extensive comparison study. NeuroImage 111, 65–75. (doi:10.1016/j.neuroimage.2015.02.001)
21. Goñi J et al. 2014. Resting-brain functional connectivity predicted by analytic measures of network communication. Proc. Natl Acad. Sci. USA 111, 833–838. (doi:10.1073/pnas.315529111)
22. Bettinardi RG, Deco G, Karlaftis VM, Van Hartevelt TJ, Fernandes HM, Kourtzi Z, Kringelbach ML, Zamora-López G. 2017. How structure sculpts function: unveiling the contribution of anatomical connectivity to the brain's spontaneous correlation structure. Chaos 27, 047409. (doi:10.1063/1.4980099)
23. Galán RF. 2008. On how network architecture determines the dominant patterns of spontaneous neural activity. PLoS ONE 3, e2148. (doi:10.1371/journal.pone.0002148)
24. Pernice V, Staude B, Cardanobile S, Rotter S. 2011. How structure determines correlations in neuronal networks. PLoS Comput. Biol. 7, e1002059. (doi:10.1371/journal.pcbi.1002059)
25. Saggio ML, Ritter P, Jirsa VK. 2016. Analytical operations relate structural and functional connectivity in the brain. PLoS ONE 11, e0157292. (doi:10.1371/journal.pone.0157292)
26. Tononi G, Sporns O, Edelman GM. 1994. A measure for brain complexity: relating functional segregation and integration in the nervous system. Proc. Natl Acad. Sci. USA 91, 5033–5037. (doi:10.1073/pnas.91.11.5033)
27. Barnett L, Buckley CL, Bullock S. 2011. Neural complexity: a graph theoretic interpretation. Phys. Rev. E 83, 041906. (doi:10.1103/PhysRevE.83.041906)
28. Lizier JT, Atay FM, Jost J. 2012. Information storage, loop motifs, and clustered structure in complex networks. Phys. Rev. E 86, 026110. (doi:10.1103/PhysRevE.86.026110)
29. Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U. 2004. Network motifs: simple building blocks of complex networks. Science 305, 824–827. (doi:10.1126/science.1100519)
30. Song S, Sjöström PJ, Reigl M, Nelson S, Chklovskii DB. 2005. Highly nonrandom features of synaptic connectivity in local cortical circuits. PLoS Biol. 3, e68. (doi:10.1371/journal.pbio.0030068)
31. Sporns O, Kötter R. 2004. Motifs in brain networks. PLoS Biol. 2, e369. (doi:10.1371/journal.pbio.0020369)
32. Mangan S, Alon U. 2003. Structure and function of the feed-forward loop network motif. Proc. Natl Acad. Sci. USA 100, 11980–11985. (doi:10.1073/pnas.2133841100)
33. Azulay A, Itskovits E, Zaslaver A. 2016. The C. elegans connectome consists of homogenous circuits with defined functional roles. PLoS Comput. Biol. 12, e1005021. (doi:10.1371/journal.pcbi.1005021)
34. Varshney LR, Chen BL, Paniagua E, Hall DH, Chklovskii DB. 2011. Structural properties of the Caenorhabditis elegans neuronal network. PLoS Comput. Biol. 7, e1001066. (doi:10.1371/journal.pcbi.1001066)
35. Barnett L, Barrett AB, Seth AK. 2009. Granger causality and transfer entropy are equivalent for Gaussian variables. Phys. Rev. Lett. 103, 238701. (doi:10.1103/PhysRevLett.103.238701)
36. Cover TM, Thomas JA. 2005. Elements of information theory, 2nd edn. Hoboken, NJ: John Wiley & Sons, Inc.
37. Lai P-Y. 2017. Reconstructing network topology and coupling strengths in directed networks of discrete-time dynamics. Phys. Rev. E 95, 022311. (doi:10.1103/PhysRevE.95.022311)
38. Hall BC. 2015. The matrix exponential. In Lie groups, Lie algebras, and representations, pp. 31–48. Cham, Switzerland: Springer.
39. Hahs D, Pethel S. 2013. Transfer entropy for coupled autoregressive processes. Entropy 15, 767–788. (doi:10.3390/e15030767)
40. Goodman RH, Porfiri M. 2020. Topological features determining the error in the inference of networks using transfer entropy. Math. Eng. 2, 34–54. (doi:10.3934/mine.2020003)
41. Honey CJ, Sporns O, Cammoun L, Gigandet X, Thiran JP, Meuli R, Hagmann P. 2009. Predicting human resting-state functional connectivity from structural connectivity. Proc. Natl Acad. Sci. USA 106, 2035–2040. (doi:10.1073/pnas.0811168106)
42. Goetze F, Lai P-Y. 2019. Reconstructing positive and negative couplings in Ising spin networks by sorted local transfer entropy. Phys. Rev. E 100, 012121. (doi:10.1103/PhysRevE.100.012121)
43. Lizier JT, Prokopenko M, Zomaya AY. 2010. Information modification and particle collisions in distributed computation. Chaos 20, 037109. (doi:10.1063/1.3486801)
44. Lizier JT, Prokopenko M, Zomaya AY. 2008. Local information transfer as a spatiotemporal filter for complex systems. Phys. Rev. E 77, 026110. (doi:10.1103/PhysRevE.77.026110)
45. Barabási A-L, Albert R. 1999. Emergence of scaling in random networks. Science 286, 509–512. (doi:10.1126/science.286.5439.509)
46. Walker SI, Kim H, Davies PCW. 2016. The informational architecture of the cell. Phil. Trans. R. Soc. A 374, 20150057. (doi:10.1098/rsta.2015.0057)
47. Williams PL, Beer RD. 2011. Generalized measures of information transfer. (http://arxiv.org/abs/1102.1507)
48. Olin-Ammentorp W, Cady N. 2018. Applying non-parametric testing to discrete transfer entropy. bioRxiv 460733. (doi:10.1101/460733)
49. Honey CJ, Kotter R, Breakspear M, Sporns O. 2007. Network structure of cerebral cortex shapes functional connectivity on multiple time scales. Proc. Natl Acad. Sci. USA 104, 10240–10245. (doi:10.1073/pnas.0701519104)
50. Ito S, Hansen ME, Heiland R, Lumsdaine A, Litke AM, Beggs JM. 2011. Extending transfer entropy improves identification of effective connectivity in a spiking cortical network model. PLoS ONE 6, e27431. (doi:10.1371/journal.pone.0027431)
51. Stetter O, Battaglia D, Soriano J, Geisel T. 2012. Model-free reconstruction of excitatory neuronal connectivity from calcium imaging signals. PLoS Comput. Biol. 8, e1002653. (doi:10.1371/journal.pcbi.1002653)
52. van den Heuvel MP, Sporns O. 2011. Rich-club organization of the human connectome. J. Neurosci. 31, 15775–15786. (doi:10.1523/JNEUROSCI.3539-11.2011)
53. Faes L, Nollo G, Porta A. 2011. Information-based detection of nonlinear Granger causality in multivariate processes via a nonuniform embedding technique. Phys. Rev. E 83, 051112. (doi:10.1103/PhysRevE.83.051112)
54. Lizier JT, Rubinov M. 2012. Multivariate construction of effective computational networks from observational data. Technical Report Preprint 25/2012, Max Planck Institute for Mathematics in the Sciences.
55. Montalto A, Faes L, Marinazzo D. 2014. MuTE: a MATLAB toolbox to compare established and novel estimators of the multivariate transfer entropy. PLoS ONE 9, e109462. (doi:10.1371/journal.pone.0109462)
56. Sun J, Taylor D, Bollt EM. 2015. Causal network inference by optimal causation entropy. SIAM J. Appl. Dyn. Syst. 14, 73–106. (doi:10.1137/140956166)
57. Novelli L, Wollstadt P, Mediano PAM, Wibral M, Lizier JT. 2019. Large-scale directed network inference with multivariate transfer entropy and hierarchical statistical testing. Netw. Neurosci. 3, 827–847. (doi:10.1162/netn_a_00092)
58. Watts DJ, Strogatz SH. 1998. Collective dynamics of ‘small-world’ networks. Nature 393, 440–442. (doi:10.1038/30918)
59. Wollstadt P, Lizier JT, Vicente R, Finn C, Martinez-Zarzuela M, Mediano PAM, Novelli L, Wibral M. 2019. IDTxl: the Information Dynamics Toolkit xl: a Python package for the efficient analysis of multivariate information dynamics in networks. J. Open Source Softw. 4, 1081. (doi:10.21105/joss.01081)
60. Garland J, James RG, Bradley E. 2016. Leveraging information storage to select forecast-optimal parameters for delay-coordinate reconstructions. Phys. Rev. E 93, 022221. (doi:10.1103/PhysRevE.93.022221)
61. Erten E, Lizier J, Piraveenan M, Prokopenko M. 2017. Criticality and information dynamics in epidemiological models. Entropy 19, 194. (doi:10.3390/e19050194)
62. Mediano PAM, Shanahan M. 2017. Balanced information storage and transfer in modular spiking neural networks. (http://arxiv.org/abs/1708.04392) (doi:10.1007/s12559-017-9464-6)
63. Ching ESC, Lai P-Y, Leung CY. 2015. Reconstructing weighted networks from dynamics. Phys. Rev. E 91, 030801. (doi:10.1103/PhysRevE.91.030801)
64. Ching ESC, Tam HC. 2017. Reconstructing links in directed networks from noisy dynamics. Phys. Rev. E 95, 010301. (doi:10.1103/PhysRevE.95.010301)
65. Zhang Z, Zheng Z, Niu H, Mi Y, Wu S, Hu G. 2015. Solving the inverse problem of noise-driven dynamic networks. Phys. Rev. E 91, 012814. (doi:10.1103/PhysRevE.91.012814)
66. Kauffman SA. 1993. The origins of order: self-organization and selection in evolution. Oxford, UK: Oxford University Press.



Articles from Proceedings. Mathematical, Physical, and Engineering Sciences are provided here courtesy of The Royal Society
