PLOS One. 2014 Jun 23;9(6):e99462. doi: 10.1371/journal.pone.0099462

Quantifying ‘Causality’ in Complex Systems: Understanding Transfer Entropy

Fatimah Abdul Razak 1,2,*, Henrik Jeldtoft Jensen 1
Editor: Renaud Lambiotte
PMCID: PMC4067287  PMID: 24955766

Abstract

‘Causal’ direction is of great importance when dealing with complex systems. Often large volumes of data in the form of time series are available, and it is important to develop methods that can inform about possible causal connections between the different observables. Here we investigate the ability of the Transfer Entropy measure to identify causal relations embedded in emergent coherent correlations. We do this by first applying Transfer Entropy to an amended Ising model. In addition we use a simple Random Transition model to test the reliability of Transfer Entropy as a measure of ‘causal’ direction in the presence of stochastic fluctuations. In particular we systematically study the effect of the finite size of data sets.

Introduction

Many complex systems are able to self-organise into a critical state [1], [2]. The local properties of the system will typically fluctuate in time and space, but the way the fluctuations are interrelated or correlated may differ. In this context a critical state is defined in terms of the way in which the correlations of the local fluctuations decay in space and time. When a system is not critical, the correlations of the fluctuations of a quantity measured at two different positions $x_1$ and $x_2$ at two different times $t_1$ and $t_2$ decay as an exponential function of the separation in space $|x_1 - x_2|$ and also decay exponentially as a function of the separation in time $|t_1 - t_2|$. However, in a critical state the correlations exhibit a much slower algebraic decay, i.e. the correlation functions decay as negative powers of $|x_1 - x_2|$ and $|t_1 - t_2|$. This is the behaviour observed at second order phase transitions in thermal equilibrium, at the so-called critical points. The slow algebraic decay of correlations is equivalent to correlations effectively spanning the entire system; in other words, in the critical state local distortions can propagate throughout the entire system [2]–[4]. We address here how to identify directed stochastic causal connections embedded in a background of strongly correlated stochastic fluctuations.

Most ‘causality’ and directionality measures have been tested on low-dimensional systems and neglect the behaviour of systems consisting of large numbers of interdependent degrees of freedom, which is a main feature of complex systems. From a complex systems point of view, on one hand there is the system as a whole (collective behaviour) and on the other there are the individual interactions that lead to the collective behaviour. A measure that can help understand and differentiate these two elements is needed. We shall first seek to make a clear definition of ‘causality’ and then relate this definition to complex systems. We outline the different approaches and measures used to quantify this type of ‘causality’. We highlight that, for multiple reasons, Transfer Entropy seems to be a very suitable candidate for a ‘causality’ measure for complex systems. Consequently we seek to shed some light on the usage of Transfer Entropy on complex systems.

To improve our understanding of Transfer Entropy we study two simplistic models of complex systems which, in a very controllable way, generate correlated time series. A complex system is characterised by essential cooperative behaviour [5], including instances where the whole system is interdependent. We therefore apply Transfer Entropy to the (amended) Ising model in order to investigate its behaviour at different temperatures, particularly near the critical temperature. Moreover, we are also interested in the different magnitudes of Transfer Entropy in general (which are not fully understood [6]); we study these by looking at the effect of different transition probabilities, or activity levels, and discuss the interpretation of the different magnitudes of Transfer Entropy by varying transition rates in a Random Transition model.

Quantifying ‘Causality’

The quantification of ‘causality’ was first envisioned by the mathematician Wiener [7], who propounded the idea that the ‘causality’ of a variable in relation to another can be measured by how well the variable helps to predict the other. In other words, variable $Y$ ‘causes’ variable $X$ if the ability to predict $X$ is improved by incorporating information about $Y$ in the prediction of $X$. The conceptualisation of ‘causality’ as envisioned by Wiener was formulated by Granger [8], leading to the establishment of the Wiener-Granger framework of ‘causality’. This is the definition of ‘causality’ that we shall adopt in this paper.

In the literature, references to ‘causality’ take many guises. The terms directionality, information transfer and sometimes even independence can refer to some sort of ‘causality’ in line with the Wiener-Granger framework. Continuing with the assumption that $Y$ causes $X$, one would expect the relationship between $Y$ and $X$ to be asymmetric and the information to flow in a direction from the source $Y$ to the target $X$. One can assume that this information transfer is the unique information provided by the causal variable to the affected one. When one variable causes another, the affected variable (the target) will be dependent (to a certain extent) on the causal variable (the source). There must exist a certain time lag, however small, between the source and the target [9]–[11]; this will henceforth be referred to as the causal lag [8]. One could also say that the Wiener-Granger framework of prediction-based ‘causality’ is equivalent to looking for dependencies between the variables at a certain causal lag.

Roughly, there are two different approaches to establishing ‘causality’ in a system. One approach is to make a qualified guess of a model that will fit the data, called the confirmatory approach [12]. Models of this nature are typically very field specific and rely on particular insights into the mechanisms involved. A contrasting approach, known as the exploratory approach, infers ‘causal’ direction from the data. This approach does not rely on any preconceived idea about underlying mechanisms and lets the results from the data shape the directed model of the system. Most of the measures within the Wiener-Granger framework fall into this category. One can think of the different approaches as lying on a spectrum from purely confirmatory to purely exploratory.

The nature of complex systems calls for the exploratory approach, and the abundance of data emphasises this even more. In fact, ‘causality’ measures in the Wiener-Granger framework have been increasingly utilised on data sets obtained from complex systems such as the brain [13], [14] and financial systems [15]. Unfortunately, most of the basic testing of the effectiveness of these measures has been done on dynamical systems [16]–[18] or simple time series, without taking into account the emergence of collective behaviour and criticality. Complex systems are typically stochastic and thus different from deterministic systems, where the internal and external influences are distinctly identified. As mentioned above, here we focus on the emergence of collective behaviour in complex systems and in particular on how the intermingling of the collective behaviour with individual (coupled) interactions complicates the identification of ‘causal’ relationships. Identifying a measure that is able to distinguish between these different interactions will obviously help us to improve our understanding of the dynamics of complex systems.

Transfer Entropy

Within the Wiener-Granger framework, two of the most popular ‘causality’ measures are Granger Causality (G-causality) and its nonlinear analogue, Transfer Entropy. G-causality and Transfer Entropy are exploratory, as they measure causality based on the distribution of the sampled data. The standard steps of prediction-based ‘causality’ that underlie these measures can be summarised as follows. Say we want to test whether variable $Y$ causes variable $X$. The first step is to predict the current value of $X$ using the historical values of $X$. The second step is to make another prediction where the historical values of $X$ and $Y$ are both used to predict the current value of $X$. The last step is to compare the former to the latter. If the second prediction is judged to be better than the first, then one can conclude that $Y$ causes $X$. This being the main idea, we outline why Transfer Entropy is more suitable for complex systems.

Granger causality is the most commonly used ‘causality’ indicator [9]. However, in the context of the nonlinearities of a complex system (collective behaviour and criticality being the main examples), using G-causality may not be sufficient. Moreover, the inherently linear autoregressive framework makes G-causality less exploratory than Transfer Entropy. Transfer Entropy was defined [16], [17] as a nonlinear measure to infer directionality using the Markov property. The aim was to incorporate the properties of Mutual Information and the dynamics captured by transition probabilities in order to understand the concept and exchange of information. More recently, the usage of Transfer Entropy to detect causal relationships [19]–[21] and causal lags (the time between cause and effect) has been further examined [6], [22]. Thus we are especially interested in Transfer Entropy due to its propounded ability to capture nonlinearities, its exploratory nature, as well as its information-theoretic background that provides an information-transfer-related interpretation. Unfortunately, some vagueness in terms of interpretation may cause confusion in complex systems. The rest of the paper is an attempt to discuss these issues in a reasonably self-contained manner.

Mutual Information based measures

Define random variables $X$ and $Y$ with discrete probability distributions $p(x)$, $p(y)$ and joint distribution $p(x,y)$. The entropy of $X$ is defined [23], [24] as

$$H(X) = -\sum_{x} p(x) \log p(x) \qquad (1)$$

where log to the base $2$ is used and the convention $0 \log 0 = 0$ is adopted. The joint entropy of $X$ and $Y$ is defined as

$$H(X,Y) = -\sum_{x,y} p(x,y) \log p(x,y) \qquad (2)$$

and the conditional entropy can be written as

$$H(Y|X) = H(X,Y) - H(X) = -\sum_{x,y} p(x,y) \log p(y|x) \qquad (3)$$

where $p(x,y)$ is the joint distribution and $p(y|x)$ is the respective conditional distribution. The Mutual Information [24], [25] is defined as

$$I(X;Y) = H(X) + H(Y) - H(X,Y) = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)p(y)} \qquad (4)$$

Taking into account conditional variables, the conditional Mutual Information [19], [24] is defined as $I(X;Y|Z) = H(X|Z) + H(Y|Z) - H(X,Y|Z)$. A variant of conditional Mutual Information, namely the Transfer Entropy, was first defined by Schreiber in [16]. Let $X_\delta$ be the variable $X$ shifted by $\delta$, so that the value of $X_\delta$ at time step $t$ is $x_{t+\delta}$, where $x_t$ is the value of $X$ at time step $t$, and similarly for $Y_\delta$. We highlight a simple form of Transfer Entropy where conditioning is minimal, such that

$$TE_{Y \to X}(\delta) = I(X_\delta; Y \,|\, X) = H(X_\delta | X) - H(X_\delta | X, Y) \qquad (5)$$

The idea is that, if $Y$ causes $X$ at causal lag $\Delta$, then $TE_{Y \to X}(\delta) \le TE_{Y \to X}(\Delta)$ for any lag $\delta$, since $y_t$ should provide the most information about the change of $x_t$ to $x_{t+\Delta}$. This simple form allows us to vary the value of the time lag $\delta$ in ascertaining the actual causal lag. This form of Transfer Entropy was also used in [13], [18], [22], [26], [27]. The Transfer Entropy in equation (5) can also be written as

$$TE_{Y \to X}(\delta) = \sum_{x_{t+\delta},\, x_t,\, y_t} p(x_{t+\delta}, x_t, y_t) \log \frac{p(x_{t+\delta} \,|\, x_t, y_t)}{p(x_{t+\delta} \,|\, x_t)} \qquad (6)$$

Our choice of this simple definition was motivated by the fact that it directly captures how the state of $Y$ influences the changes in $X$, i.e. from $x_t$ to $x_{t+\delta}$. In other words, equation (5) is tailor-made to measure whether the state of $Y$ influences the current changes in $X$. This coincides with the predictive view of ‘causality’ in the Wiener-Granger framework, where the current state of one variable (the source) influences the changes in another variable (the target) in the future. The same concept will be applied in order to probe this kind of ‘causality’ in our models.

The Ising Model

A system is critical when correlations are long ranged. A simple prototype example is the Ising model [2] at the critical temperature $T_c$. Away from $T_c$ correlations are short ranged and die off exponentially with separation. We shall apply Transfer Entropy to the Ising model in order to investigate its behaviour at different temperatures, particularly in the vicinity of the critical temperature. One can visualise the 2D Ising model as a two dimensional square lattice of length $L$ composed of $N = L^2$ sites $s_i$. These sites can only be in two possible states, spin-up ($s_i = +1$) or spin-down ($s_i = -1$). We restrict the interaction of a site to only its nearest neighbours (in two dimensions these are the sites to the north, south, east and west). Let the interaction strength between $s_i$ and $s_j$ be denoted by

$$J_{ij} = \begin{cases} J & \text{if } i \text{ and } j \text{ are nearest neighbours} \\ 0 & \text{otherwise} \end{cases} \qquad (7)$$

so that the Hamiltonian (energy), $H$, is given by [2], [28]

$$H = -\sum_{\langle i,j \rangle} J_{ij}\, s_i s_j \qquad (8)$$

$H$ is used to obtain the Boltzmann (Gibbs) distribution $p \propto e^{-\beta H}$ with $\beta = 1/(k_B T)$, where $k_B$ is the Boltzmann constant and $T$ is temperature.

We implement the usual Metropolis Monte Carlo (MMC) algorithm [2], [29], [30] for the simulation of the Ising model in two dimensions with periodic boundary conditions. The algorithm, proposed by Metropolis and co-workers in 1953, was designed to sample the Boltzmann distribution $p \propto e^{-\beta H}$ by artificially imposing dynamics on the Ising model. The implementation of the MMC algorithm in this paper is outlined as follows. A site is chosen at random and considered for flipping (change of state); the flip is accepted with probability $\min(1, e^{-\beta \Delta H})$, where $\Delta H$ is the energy change the flip would cause. The event of considering the change, and afterwards the actual change (if accepted) of the configuration, shall henceforth be referred to as a flipping consideration. A sample is taken after each $N = L^2$ flipping considerations; the logic being that, since sites to be considered are chosen randomly one at a time, after $N$ flipping considerations each site will on average have been selected for consideration once. The interaction strength is set to $J = 1$ and the Boltzmann constant is fixed as $k_B = 1$ for all the simulations. We let the system equilibrate before sampling at regular intervals.
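The flipping-consideration loop described above can be sketched as follows. This is a minimal illustrative implementation; the lattice size, temperature and number of sweeps below are arbitrary choices, not the values used for the paper's figures:

```python
import math
import random

def metropolis_sample(L=10, T=2.5, sweeps=200, seed=0):
    """Minimal Metropolis Monte Carlo for the 2D Ising model (J=1, k_B=1)
    with periodic boundaries; one sample of the total spin S is recorded
    per L*L flipping considerations, matching the sampling rule above."""
    rng = random.Random(seed)
    spins = [[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
    beta = 1.0 / T
    samples = []
    for _ in range(sweeps):
        for _ in range(L * L):                  # one sweep = N considerations
            i, j = rng.randrange(L), rng.randrange(L)
            nb = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
                  + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
            dE = 2.0 * spins[i][j] * nb         # energy change if we flip s_ij
            if dE <= 0 or rng.random() < math.exp(-beta * dE):
                spins[i][j] *= -1               # accept the flip
        samples.append(sum(sum(row) for row in spins))  # total spin S
    return samples
```

Each site's sampled states then form the Markov chain used for the temporal averages described below.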

Through the MMC algorithm, a Markov chain (process) is formed for every site on the lattice. The state of each site at each sample will be taken as a time step $t$ in that site's Markov chain. Let $n$ be the number of samples (the length of the Markov chains). To get the probability values for each site, we utilise temporal averages. All the numerical probabilities obtained for the Ising model in this paper have been obtained by averaging over long simulations unless stated otherwise.

Measures on Ising model

In an infinite two dimensional lattice, the phase transition of the Ising model with $J = 1$ and $k_B = 1$ is known to occur at the critical temperature $T_c = 2/\ln(1+\sqrt{2}) \approx 2.269$ [2]. In a finite system, due to finite size effects, the critical values will not be quite as exact; we will call the temperature where the transition effectively occurs in the simulation the crossover temperature $T_{co}$. Susceptibility $\chi$ is an observable that is normally used to identify $T_{co}$ for the Ising model, as seen in Figure (1). In order to define $\chi$, let $S$ be the sum of spins on a lattice of size $N = L^2$ at a given time step. The susceptibility [2] is given by

$$\chi = \frac{\langle S^2 \rangle - \langle S \rangle^2}{N k_B T} \qquad (9)$$

where $\langle \cdot \rangle$ is the expectation in terms of a temporal average and $T$ is temperature. The covariance on the Ising model can be defined as

$$\mathrm{Cov}(s_i, s_j) = \langle s_i s_j \rangle - \langle s_i \rangle \langle s_j \rangle \qquad (10)$$

where the expectations are again taken as temporal averages.
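Given the sampled total spins, equation (9) reduces to a one-line temporal-average computation. A small sketch, with $k_B = 1$ as in the simulations (the function name is our own):

```python
def susceptibility(samples, L, T):
    """Equation (9): chi = (<S^2> - <S>^2) / (N k_B T), with k_B = 1 and
    <.> the temporal average over the sampled total spins S."""
    n = len(samples)
    mean = sum(samples) / n
    mean_sq = sum(s * s for s in samples) / n
    return (mean_sq - mean * mean) / (L * L * T)
```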

Figure 1. Susceptibility $\chi$ on the Ising model with lengths L=10,25,50,100, obtained using equation (9).

Figure 1

Peaks can be seen at the respective crossover temperatures $T_{co}$.

To display measures applied to individual sites, let sites $A$, $B$ and $C$ represent three fixed coordinates on the lattice. The values of the covariance $\mathrm{Cov}(A,B)$ and the Mutual Information $I(A;B)$ are displayed in Figure (2) and Figure (3). It can be seen that for the Ising model, Mutual Information gives no more information than covariance. From these figures, one can also see that the values are system size dependent only up to a certain system size. We conclude that, up to this length scale, correlations are detectable across the entire lattice [2]. Thus we shall frequently utilise $L = 50$ when illustration is required.

Figure 2. Covariance $\mathrm{Cov}(A,B)$ on the Ising model with lengths L=10,25,50,100, obtained using equation (10).

Figure 2

Figure 3. Mutual Information $I(A;B)$ on the Ising model with lengths L=10,25,50,100, obtained using equation (4).

Figure 3

Using time shifted variables, we obtained the Transfer Entropy values shown in Figures (4)–(6). By looking at Figure (4) and then contrasting Figures (5) and (6), one can see that there is no clear difference between $TE_{A \to B}$ and $TE_{B \to A}$; thus no direction of ‘causality’ can be established between $A$ and $B$. This is expected due to the symmetry of the lattice. More interestingly, the fact that Transfer Entropy peaks near $T_{co}$ can be attributed to the correlations spanning the entire lattice at $T_{co}$. Therefore, one may say that the critical transition and collective behaviour in the Ising model are detected by Transfer Entropy as a type of ‘causality’ that is symmetric in both directions. It is logical to interpret collective behaviour as a type of ‘causality’ in all directions, since information is disseminated throughout the whole lattice when it is fully connected. This is an important fact to take into account when estimating Transfer Entropy on complex systems.

Figure 4. Transfer Entropy $TE_{A \to B}$ and $TE_{B \to A}$ on the Ising model of length L=50, obtained using equation (5).

Figure 4

Peaks for both directions are at $T_{co}$.

Figure 6. Transfer Entropy $TE_{B \to A}$ on the Ising model of lengths L=10,25,50,100, obtained using equation (5).

Figure 6

Peaks can be seen at the respective $T_{co}$.

Figure 5. Transfer Entropy $TE_{A \to B}$ on the Ising model of lengths L=10,25,50,100, obtained using equation (5).

Figure 5

Peaks can be seen at the respective $T_{co}$.

Amended Ising Model

In the amended Ising model we introduce an explicit directed dependence between the sites $A$, $B$ and $C$ in order to study how well Transfer Entropy is able to detect this causality. We define the amended Ising model using the following algorithm. At each step a site chosen at random is considered for flipping with the usual Metropolis acceptance probability, except when $B$ or $C$ is selected, in which case an extra condition needs to be fulfilled before the change is allowed. If $s_A = +1$ at the sample taken $\Delta$ time steps earlier, $B$ (or $C$) can be considered for flipping as usual; however, if $s_A = -1$, no change is allowed. Thus only one state of $A$ ($s_A = +1$ in this case) allows sites $B$ and $C$ to be considered for flipping. Therefore, although $B$ and $C$ have their own dynamics, their changes still depend on $A$.
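The gating step that distinguishes the amended model from the plain Metropolis dynamics can be sketched as below. The function name, the bookkeeping of past states of $A$ via `gate_history`, and the choice of spin-up as the permitting state are illustrative assumptions, not the paper's code:

```python
import math
import random

def amended_sweep(spins, gate_history, beta, special, source, delta, rng):
    """One sweep of the amended model: sites in `special` (e.g. B and C) may
    only be considered for flipping when the source site A was spin-up
    `delta` samples ago; all other sites follow plain Metropolis updates.
    `gate_history` records the sampled past states of A (illustrative)."""
    L = len(spins)
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        if (i, j) in special and gate_history[-delta] != 1:
            continue                     # condition not fulfilled: no change allowed
        nb = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
              + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
        dE = 2.0 * spins[i][j] * nb
        if dE <= 0 or rng.random() < math.exp(-beta * dE):
            spins[i][j] *= -1
    gate_history.append(spins[source[0]][source[1]])  # record A for later gating
```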

We simulated the amended Ising model for different lattice lengths $L$. Figure (7) displays the values of the susceptibility $\chi$ for the model, and the peaks clearly show the presence of $T_{co}$, just like Figure (1) for the Ising model. Figures (8) and (9) display the values of the covariance $\mathrm{Cov}(A,B)$ and the Mutual Information $I(A;B)$ respectively. We reiterate that the correlations reach across the system near $T_{co}$ [2], [31]. While covariance and Mutual Information give similar results to those of the standard Ising model in Figures (2) and (3), a difference is clearly seen in the Transfer Entropy values. Figures (10)–(12) display the contrast between $TE_{A \to B}$ and $TE_{B \to A}$ on the amended Ising model, which explicitly indicates the direction of ‘causality’ $A \to B$. While Figure (12) is not very different from Figure (6), Figures (10) and (11) are clearly different from their counterparts for the Ising model, Figures (4) and (5). Transfer Entropy captures the effect of the amendment.

Figure 7. Susceptibility $\chi$ on the amended Ising model of lengths L=10,25,50,100, obtained using equation (9).

Figure 7

Peaks can be seen at the respective $T_{co}$.

Figure 8. Covariance $\mathrm{Cov}(A,B)$ on the amended Ising model of lengths L=10,25,50,100, obtained using equation (10).

Figure 8

Peaks can be seen at the respective $T_{co}$, similar to Figure (2) for the Ising model.

Figure 9. Mutual Information $I(A;B)$ on the amended Ising model with lengths L=10,25,50,100, obtained using equation (4).

Figure 9

Not much different from the results on the Ising model in Figure 3.

Figure 10. Transfer Entropy $TE_{A \to B}$ and $TE_{B \to A}$ on the amended Ising model, obtained using equation (5).

Figure 10

The direction $A \to B$ at time lag $\delta = \Delta$ is indicated. Very different from the result for the Ising model in Figure 4.

Figure 12. Transfer Entropy $TE_{B \to A}$ on the amended Ising model of lengths L=10,25,50,100, obtained using equation (5).

Figure 12

Peaks can be seen at the respective $T_{co}$, similar to the Ising model results in Figure (6).

Figure 11. Transfer Entropy $TE_{A \to B}$ on the amended Ising model of lengths L=10,25,50,100, obtained using equation (5).

Figure 11

Values continue to increase after $T_{co}$, which is very different from Figure (5).

Furthermore, with this amendment one can utilise Transfer Entropy to illustrate the effect of separation in time. The effect of deviation from the predetermined causal lag $\Delta$ can be clearly seen in Figure (13), where the values of $TE_{A \to B}(\delta)$ decay towards $0$ at rates depending on the deviation of $\delta$ from $\Delta$: the further $\delta$ is from $\Delta$, the faster the decrease towards $0$. Figure (14) is simply Figure (13) plotted over different time lags $\delta$, to illustrate how Transfer Entropy correctly and distinctly identifies the causal lag $\Delta$.

Figure 13. $TE_{A \to B}(\delta)$ versus temperature $T$ for different time lags $\delta$ in the amended Ising model, using equation (5).

Figure 13

The figure shows the effect of separation in time.

Figure 14. A different view of Figure (13), where $TE_{A \to B}(\delta)$ versus time lag $\delta$ for different temperatures $T$ is plotted instead.

Figure 14

The figure highlights detection of the causal lag $\Delta$.

That temperature is a main factor influencing the strength of the Transfer Entropy values is apparent in all the figures in this section. One can observe, especially in Figure (13), that the Transfer Entropy values approach $0$ as the temperature moves away from $T_{co}$, except when the time lag $\delta$ matches the induced delay ($\delta = \Delta$), in which case the Transfer Entropy stabilises to a certain fixed value, as seen in Figure (15). In the vicinity of $T_{co}$ the lattice is highly correlated, subsequently leading to higher values of Transfer Entropy. The increase and stabilisation of the value after $T_{co}$ is due to the fact that, as temperature increases, the probability for all ‘flipping considerations’ approaches a uniform distribution. This leads to transfer of information between site $A$ and sites $B$ and $C$ occurring much more frequently at elevated temperatures.

Figure 15. $TE_{A \to C}(\Delta)$ as in Figure 17, extended to higher temperatures.

Figure 15

The Transfer Entropy stabilises because the Boltzmann distribution approaches a uniform distribution at higher temperatures.

Figures (16) and (17) display Transfer Entropy values for the Ising model and the amended Ising model respectively. The figures illustrate the mechanism by which Transfer Entropy detects the predefined causal delay. Consider the following question: which site ‘causes’ site $C$? Firstly, we see that $TE_{C \to C}$ is zero in both figures, due to the definition in equation (5): conditioning on the present of $C$ removes all the information $C$ carries about its own future. Note that this holds only because the source and the conditioning variable coincide in our simple form; with a different conditioning the Transfer Entropy value would be nonzero and would also peak at $\delta = \Delta$. More importantly, we see that $TE_{B \to C}$ is different from $TE_{A \to C}$. In Figure (16) for the Ising model, the difference is due to separation (distance) in space and the nearest neighbour interaction in the model; thus $TE_{B \to C} > TE_{A \to C}$, since $A$ is further away from $C$ than $B$. But in Figure (17) for the amended Ising model the opposite is true, and separation in space does not dominate the Transfer Entropy value in this interaction. The figure very clearly indicates that $A$ ‘causes’ $C$ at $\delta = \Delta$ and $B$ does not. In other words, in the amended Ising model Transfer Entropy identifies $A$ as a source of which one of the targets is $C$, whereas in the Ising model the expected nearest neighbour dynamics presides. This result is only obtained for measures sensitive to transition probabilities. Measures that depend only on static probabilities, such as covariance, Mutual Information and conditional Mutual Information, will only give values in accordance with the underlying nearest neighbour dynamics in both the Ising model and the amended Ising model [32].

Figure 16. $TE_{C \to C}$, $TE_{B \to C}$ and $TE_{A \to C}$ in the Ising model.

Figure 16

$TE_{B \to C} > TE_{A \to C}$ due to distance (separation) in space, where $B$ is closer to $C$ than $A$. The nearest neighbour effect is observed.

Figure 17. $TE_{C \to C}$, $TE_{B \to C}$ and $TE_{A \to C}$ in the amended Ising model.

Figure 17

$TE_{A \to C} > TE_{B \to C}$ due to the implanted ‘causal’ lag. The effect of separation in space is no longer visible.

Transfer Entropy, directionality and change

In order to understand the dynamics of each site we calculate the effective rate of change (ERC) in relation to the transition probabilities. Let $ERC(i) = p(s_i(t+1) \neq s_i(t))$ for any site $i$ on the lattice. Figure (18) illustrates how $ERC(B)$ and $ERC(C)$ are equal, as expected, and significantly different from $ERC(A)$. In Figure (10), the corresponding Transfer Entropy in both directions is displayed. At higher temperatures it can be clearly seen that $TE_{A \to B}$ is larger than $TE_{B \to A}$. However, for temperatures near $T_{co}$ it is not as clear, and therefore, to highlight the relative values, we calculate the ratio $TE_{A \to B}(\Delta)/ERC(B)$ in Figure (19) and Figure (20), where the ratio is taken to be $0$ if the ERC is $0$. We see that this quantity actually exhibits a clear jump at $T_{co}$ and remains more or less constant after $T_{co}$. Therefore, even though the Transfer Entropy in neither direction is zero, a clear indication of directionality can be obtained. Interestingly, the division by the ERC brought out a clear phase transition-like behaviour that distinguishes the situations below and above $T_{co}$. Referring back to Figure (4) for the unamended Ising model, we can clearly see that no such jump occurs in either direction there. We have demonstrated that the ratio of Transfer Entropy to ERC is able to cancel out the symmetric contribution from the collective behaviour and capture only the imposed directed interdependence.
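The ERC itself is straightforward to estimate from the sampled time series of a single site, as the fraction of consecutive samples at which the state changes. A minimal sketch (the function name is our own):

```python
def effective_rate_of_change(series):
    """ERC: estimate p(s(t+1) != s(t)) by the temporal average of the
    fraction of consecutive sampled time steps at which the state changes."""
    changes = sum(1 for a, b in zip(series, series[1:]) if a != b)
    return changes / (len(series) - 1)
```

Dividing a Transfer Entropy estimate by this quantity gives the kind of normalised directionality indicator discussed above.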

Figure 18. $ERC$ (effective rate of change) of sites $A$, $B$ and $C$ on the amended Ising model.

Figure 18

Figure 19. $TE_{A \to B}(\Delta)/ERC(B)$ on the amended Ising model, displaying phase transition-like behaviour.

Figure 19

Figure 20. $TE_{A \to C}(\Delta)/ERC(C)$ on the amended Ising model.

Figure 20

All with a phase transition-like jump.

In his introductory paper [16], Schreiber warns that in certain situations, due to different information content as well as different information rates, the difference in magnitude should not be relied on to imply directionality unless the Transfer Entropy in one direction is $0$. We have shown that when collective behaviour is present in the Ising model, the value of the Transfer Entropy cannot possibly be $0$. We suggest that this is because collective behaviour is itself a type of ‘causality’ (disseminating information in all directions), and thus the Transfer Entropy is correctly indicating ‘cause’ in all directions. The clear difference in Transfer Entropy magnitude (even at $T_{co}$) observed when the model is amended indicates that the difference in Transfer Entropy can indeed serve as an indicator of directionality in systems with emergent cooperative behaviour. We have seen that Transfer Entropy is influenced by the nearest neighbour interactions, the collective behaviour and the ERC. In the next section we use the Random Transition model to further investigate how the ERC influences the Transfer Entropy.

Random Transition Model

In the amended Ising model we implemented a causal lag as a restriction of one variable on another, in such a way that a value of the source variable affects the possible changes of the target variable. It is this novel concept of implementing ‘causality’ that we analyse and expand in the Random Transition model. Let $p_A$, $p_B$ and $p_C$ be the independent probabilities for the stochastic swaps of the variables $A$, $B$ and $C$ at every time step, respectively. In addition, a restriction is placed on $B$ and $C$ such that they are only allowed to perform their stochastic swaps, with probabilities $p_B$ and $p_C$, if the state of $A$ fulfils a certain condition. This restriction means that $B$ and $C$ can only change state if $A$ is in the conditioned state, thus creating a ‘dependence’ on $A$, analogous to the dependence of $B$ and $C$ on $A$ in the amended Ising model.

However, in this model we allow the number of states $m$ to be more than just two. The purpose of this is twofold: on one hand it contributes towards verifying that the behaviour of Transfer Entropy observed on the amended Ising model extends to cases where $m > 2$; on the other hand, the model also serves to highlight different properties of Transfer Entropy, as well as the very crucial issue of probability estimation that may lead to misleading results. The processes are initialised randomly and independently. The swapping probabilities are taken to be constant in time, thus enabling Transfer Entropy values to be calculated analytically. The transition probabilities of the Random Transition model are as follows. We assume that if a process changes state it must choose one of the other $m-1$ states with equal probability, so that the marginal and joint probabilities remain uniform but the transition probabilities are

graphic file with name pone.0099462.e300.jpg
graphic file with name pone.0099462.e301.jpg

and

graphic file with name pone.0099462.e302.jpg

where Inline graphic such that one can control ‘dependence’ on Inline graphic by altering Inline graphic.
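The dynamics described above can be made concrete with a short simulation. The following Python sketch is our own illustration: the parameter names (`M`, `gamma_x`, `gamma_yz`, `cond_state`) and the specific choice of condition ("X was in a designated state at the previous time step") are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def swap(state, M, rng):
    """Jump to one of the other M-1 states with equal probability."""
    new = rng.integers(0, M - 1)
    return new if new < state else new + 1

def simulate_rtm(T, M=3, gamma_x=0.5, gamma_yz=0.5, cond_state=0, seed=0):
    """Random Transition model sketch: X swaps freely with probability gamma_x;
    Y and Z may swap (with probability gamma_yz) only if X fulfilled the
    condition at the previous time step."""
    rng = np.random.default_rng(seed)
    X = np.empty(T, dtype=int)
    Y = np.empty(T, dtype=int)
    Z = np.empty(T, dtype=int)
    X[0], Y[0], Z[0] = rng.integers(0, M, size=3)   # independent uniform start
    for t in range(1, T):
        X[t] = swap(X[t - 1], M, rng) if rng.random() < gamma_x else X[t - 1]
        allowed = X[t - 1] == cond_state            # restriction from the source
        Y[t] = swap(Y[t - 1], M, rng) if allowed and rng.random() < gamma_yz else Y[t - 1]
        Z[t] = swap(Z[t - 1], M, rng) if allowed and rng.random() < gamma_yz else Z[t - 1]
    return X, Y, Z
```

By construction, Y and Z change state only at time steps where X satisfied the condition one step earlier, which is the directed ‘causal’ link at lag one.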

The relationship between Inline graphic and Inline graphic

To understand how the value of Inline graphic affects the value of Inline graphic we need a different variable. Let Inline graphic be the probability that the condition is fulfilled given current knowledge at time Inline graphic, such that Inline graphic. The value of Inline graphic will depend on Inline graphic and, in our model, particularly on whether or not Inline graphic satisfies the condition. One can divide the possible states Inline graphic of all the processes into two sets such that

graphic file with name pone.0099462.e317.jpg
graphic file with name pone.0099462.e318.jpg

Note that Inline graphic and Inline graphic since Inline graphic, so that Inline graphic can be interpreted as the proportion of states of Inline graphic that fulfill the condition. Due to the equiprobability of spins and the uniform initial distribution, for any Inline graphic there are only two possible values of Inline graphic: one for Inline graphic and one for Inline graphic. Therefore define Inline graphic such that

graphic file with name pone.0099462.e329.jpg (11)

to get

graphic file with name pone.0099462.e330.jpg (12)

Thus Inline graphic with the Inline graphic as in equation (11).

The relationship between Inline graphic and Inline graphic can be defined using the formula for total probability Inline graphic. Letting Inline graphic and using the fact that Inline graphic, we get

graphic file with name pone.0099462.e338.jpg (13)

Due to the sole dependence of Inline graphic on Inline graphic, Inline graphic will make the transition probability of Inline graphic uniform such that Inline graphic for any Inline graphic since we have that

graphic file with name pone.0099462.e345.jpg

for any Inline graphic. Consequently, Inline graphic also makes all values of Inline graphic uniform so that equation (13) becomes

graphic file with name pone.0099462.e349.jpg (14)

Therefore, on this model, when Inline graphic we have Inline graphic for any Inline graphic. This is why we get Figure (21), where Inline graphic only if Inline graphic, since Inline graphic in equation (16) cancels out.

Figure 21. Analytical Transfer Entropy Inline graphic versus time lags Inline graphic of the Random Transition model with Inline graphic (hence Inline graphic) and Inline graphic in equation (16) where Inline graphic is varied but Inline graphic fixed.


Inline graphic is monotonically increasing with respect to Inline graphic. Inline graphic is affected by Inline graphic. The figure illustrates how the internal dynamics of Inline graphic influence Inline graphic when Inline graphic is the target variable. Transfer Entropy changes even though the external influence Inline graphic is constant.

For any Inline graphic, the relationship between Inline graphic and Inline graphic can be derived from equation (13) where

graphic file with name pone.0099462.e374.jpg (15)
graphic file with name pone.0099462.e375.jpg
graphic file with name pone.0099462.e376.jpg

Note that when Inline graphic (hence Inline graphic) this simplifies to Inline graphic.

Transfer Entropy formula on the Random Transition model

Using Inline graphic as in equation (12) we have that

graphic file with name pone.0099462.e381.jpg

which gives us

graphic file with name pone.0099462.e382.jpg (16)

where we used Bayes' theorem, i.e.

graphic file with name pone.0099462.e383.jpg

Due to independence, if Inline graphic were to be conditioned on Inline graphic we would have that

graphic file with name pone.0099462.e386.jpg

Therefore, for values other than when Inline graphic and Inline graphic conditioned on Inline graphic, this ratio yields Inline graphic. This renders Inline graphic. If we find that Inline graphic, we can say that Transfer Entropy indicates ‘causality’, or some form of directionality, from Inline graphic to Inline graphic and from Inline graphic to Inline graphic, at time lag Inline graphic. In a similar manner, for Inline graphic we have that

graphic file with name pone.0099462.e399.jpg

such that Inline graphic is exactly like equation (16) except that Inline graphic is replaced with Inline graphic.

When Inline graphic we have that Inline graphic is either Inline graphic or Inline graphic, since the condition was placed at Inline graphic. More specifically, we have Inline graphic and Inline graphic. Substituting these two values into equation (16) we obtain

graphic file with name pone.0099462.e410.jpg (17)

A more thorough treatment of the Random Transition model and other methods of Transfer Entropy estimations is given in [32].

Understanding ‘causality’ on the Random Transition model

The unclear meaning of the magnitude of Transfer Entropy is one of its main criticisms [6], [18]. This is partly due to the ERC, which incorporates both external and internal influences, whose separation is rather unclear. The advantage of investigating Transfer Entropy on the Random Transition model is that the ERC can be defined in terms of internal and external elements, i.e. for any variable Inline graphic we have

graphic file with name pone.0099462.e412.jpg

where Inline graphic is the internal transition probability of Inline graphic and Inline graphic represents the external influence applied to Inline graphic. If the condition in our model is that Inline graphic for Inline graphic and Inline graphic to change values, then Inline graphic, so that Inline graphic and Inline graphic. However, for the source Inline graphic, which has no external influence, Inline graphic and consequently Inline graphic.
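This decomposition of the ERC into an internal factor and an external factor can be checked numerically on a sketch of the model: the empirical rate of change of the target should be close to the product of its internal swap probability and the probability that the source fulfills the condition (1/M for a uniform source). The code below is our own minimal illustration under these assumptions; the function and parameter names are hypothetical.

```python
import numpy as np

def change_rate(T=200_000, M=3, gamma_x=0.5, gamma_y=0.5, cond_state=0, seed=1):
    """Empirical ERC of Y: the fraction of time steps at which Y changes state.
    Y may swap (probability gamma_y) only if X was in cond_state one step earlier."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, M)
    y = rng.integers(0, M)
    changes = 0
    for _ in range(T):
        x_prev = x
        if rng.random() < gamma_x:                       # X swaps freely
            x = (x + 1 + rng.integers(0, M - 1)) % M     # uniform over other states
        if x_prev == cond_state and rng.random() < gamma_y:
            y = (y + 1 + rng.integers(0, M - 1)) % M     # external condition met
            changes += 1                                  # a swap always changes state
    return changes / T

rate = change_rate()
# internal factor (gamma_y = 0.5) times external factor (P[X = cond_state] = 1/3)
print(rate)  # close to 0.5 / 3
```

The agreement illustrates why the ERC, and hence the Transfer Entropy magnitude, mixes internal dynamics with the external ‘causal’ influence.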

When Inline graphic, the model essentially replicates the Ising model without the collective behaviour effect, i.e. far above Inline graphic, where the Boltzmann distribution approaches a uniform distribution. Consequently, at these temperatures the influence of collective behaviour is negligible. One can see in Figure (21) and Figure (22) that the Inline graphic (hence the ERC) values are indeed key in determining the strength of Transfer Entropy. In Figure (21), Inline graphic influences Inline graphic monotonically when every other value is fixed; in this case the Transfer Entropy therefore reflects the internal dynamics Inline graphic rather than the external influence Inline graphic. If ‘causality’ is the aim, surely Inline graphic is the very thing that makes the relationship ‘causal’ and should be the main focus. This factor needs to be taken into account when comparing magnitudes of Transfer Entropy. Figure (21) also shows that when Inline graphic is uniform (since Inline graphic), hence Inline graphic, one gets Inline graphic only if Inline graphic, which makes causal lag detection fairly straightforward. However, in Figure (22) the effect of varying Inline graphic can be clearly seen in the nonzero values of Inline graphic when Inline graphic. Nevertheless, the value at Inline graphic seems to be fully determined by Inline graphic regardless of the value of Inline graphic. The mechanism by which Inline graphic affects Inline graphic is sketched in the appendix.

Figure 22. Analytical Transfer Entropy Inline graphic versus time lags Inline graphic of the Random Transition model with Inline graphic (hence Inline graphic) and Inline graphic in equation (16) where Inline graphic fixed and Inline graphic is varied.


Only at Inline graphic does Inline graphic not affect Inline graphic, and the values remain constant. For Inline graphic at Inline graphic, Transfer Entropy is affected by Inline graphic. Inline graphic and Inline graphic coincide. The figure shows how the internal dynamics of Inline graphic influence Inline graphic when Inline graphic is the source variable.

Therefore one can conclude that when Inline graphic is the source (the ‘causal’ variable) and Inline graphic is the target (the variable being affected by the ‘causal’ link), the value of the Transfer Entropy Inline graphic at Inline graphic is influenced only by Inline graphic, but for Inline graphic, Inline graphic is determined by both Inline graphic and Inline graphic. We have verified that this is indeed the case even when Inline graphic in this model. This should apply to all variables in the model and, much more generally, to any kind of source-target ‘causal’ relationship in this sense. We suspect that this also extends to cases where there is more than one source; this will be a subject of future research. Thus, for causal lag detection purposes, it is clear that theoretically Transfer Entropy attains its maximum value at the exact causal lag. It is also clear that Transfer Entropy at nearby lags can be nonzero due to this single ‘causal’ relationship. On data sets it is therefore strongly recommended to compare values across a range of lags.
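The recommendation to scan over a range of lags can be illustrated with a generic lagged-copy example (not the Random Transition model itself): estimate Transfer Entropy at several candidate lags and locate the maximum. The estimator below is a standard histogram (plug-in) form; the function name and the choice of coupling `y[t] = x[t-3]` are our own assumptions for illustration.

```python
import math
import random
from collections import Counter

def transfer_entropy(x, y, lag=1):
    """Plug-in estimate of T_{X->Y} at a given lag, in bits:
    sum over (y_t, y_{t-1}, x_{t-lag}) of
    p(triple) * log2[ p(y_t | y_{t-1}, x_{t-lag}) / p(y_t | y_{t-1}) ]."""
    start = max(1, lag)
    joint, cond, pair, marg = Counter(), Counter(), Counter(), Counter()
    for t in range(start, len(y)):
        yn, yp, xp = y[t], y[t - 1], x[t - lag]
        joint[(yn, yp, xp)] += 1
        cond[(yp, xp)] += 1
        pair[(yn, yp)] += 1
        marg[yp] += 1
    n = len(y) - start
    return sum((c / n) * math.log2((c / cond[(yp, xp)]) / (pair[(yn, yp)] / marg[yp]))
               for (yn, yp, xp), c in joint.items())

random.seed(0)
x = [random.randint(0, 1) for _ in range(5000)]
y = [0, 0, 0] + x[:-3]                       # y copies x with a causal lag of 3
te = {lag: transfer_entropy(x, y, lag) for lag in range(1, 7)}
best = max(te, key=te.get)                   # expected at the true causal lag
```

Here the estimate peaks sharply at the true lag because the coupling is a pure copy; in the Random Transition model, as noted above, nearby lags can also be nonzero.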

Transfer Entropy estimations of the Random Transition model

For a classical histogram estimation of Transfer Entropy on real data sets [17], one can say that the number of states Inline graphic corresponds to the number of bins chosen for estimation. Estimating Transfer Entropy for larger Inline graphic requires a sufficient sample size (a sufficient length of time series). To illustrate this finite sampling effect we set the value Inline graphic to three different values: Inline graphic for Case 1, Inline graphic for Case 2 and Inline graphic for Case 3. We plot the analytical Transfer Entropy Inline graphic, and its estimates on simulated values of varying time series length Inline graphic, for all three cases in Figure (23). The exact Inline graphic is known and incorporated in the estimations.
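The finite sampling effect can be reproduced with a plug-in estimator on independent uniform series: although the true Transfer Entropy is zero, the estimate is positive and grows with the number of states `M` at fixed series length. This is our own minimal illustration; the estimator is a standard histogram plug-in at lag one, not the exact estimator used for the figure.

```python
import math
import random
from collections import Counter

def plugin_te(x, y):
    """Histogram (plug-in) Transfer Entropy T_{X->Y} at lag 1, in bits."""
    joint, cond, pair, marg = Counter(), Counter(), Counter(), Counter()
    for t in range(1, len(y)):
        yn, yp, xp = y[t], y[t - 1], x[t - 1]
        joint[(yn, yp, xp)] += 1
        cond[(yp, xp)] += 1
        pair[(yn, yp)] += 1
        marg[yp] += 1
    n = len(y) - 1
    return sum((c / n) * math.log2((c / cond[(yp, xp)]) / (pair[(yn, yp)] / marg[yp]))
               for (yn, yp, xp), c in joint.items())

random.seed(2)
N = 2000
bias = {}
for M in (2, 5, 10):
    x = [random.randrange(M) for _ in range(N)]
    y = [random.randrange(M) for _ in range(N)]   # independent: true TE is 0
    bias[M] = plugin_te(x, y)
    print(M, bias[M])                              # positive bias, growing with M
```

The growth of the spurious estimate with `M` at fixed `N` is exactly the reason the choice of the number of bins must be matched to the available series length.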

Figure 23. Transfer Entropy Inline graphic versus the number of states Inline graphic (number of chosen bins) for Cases Inline graphic and Inline graphic.


Inline graphic are uniformly distributed. Analytical values are obtained by substituting the respective Inline graphic values in equation (17). Simulated values are acquired using equation (5) on simulated data of varying sample size Inline graphic (length of time series), where Inline graphic. Error bars display two standard deviations above and two standard deviations below (some bars are very small and can barely be seen). The aim is primarily to display how the choice of Inline graphic has to be made according to the length Inline graphic of the available time series. For large Inline graphic the error bar becomes smaller than the width of the curve.

The observed existence of spurious detection or overestimation (finite sampling effects), as in Figure (23), is not uncommon and has been reported in relation to causality measures [15], [20], [33], [34]. The situation would be even more confusing when Inline graphic is not known (unfortunately, this is more often than not the case). Significance testing of Transfer Entropy, or the lack of it, is admittedly one of its main criticisms. Initially, we sidestepped this issue by implementing Transfer Entropy on relatively small Inline graphic to easily obtain statistically significant estimations. In fact, one of the main motivations for using the Ising model to test Transfer Entropy is exactly to sidestep this issue, since no binning is required and one can focus on the question of what exactly Transfer Entropy measures. However, Figure (23) clearly shows that for larger Inline graphic some form of validation is required to avoid false conclusions about directionality. Surrogates have been suggested as a form of significance testing for Transfer Entropy [13], [20], [26], [35]. Surrogate data sets are synthetically generated data that should ideally preserve all properties of the underlying system except the one being tested [20]. There are many different types of surrogates serving different purposes [13], [14], [16], [20], [26], [35]. The idea is to break the coupling (causal link) but maintain the dynamics, in the hope that one can differentiate cause and effect from any other dynamics.

One way to obtain surrogates is by generating a null model (in the case of the Random Transition model this is simply three randomly generated time series) and testing the values of Transfer Entropy, as in Figure (24). Subtracting the null model from the values on the Random Transition model is equivalent to subtracting the Transfer Entropy values of the two directions, since one direction is theoretically zero. This is the idea behind the effective and corrected Transfer Entropy [15], [18]. However, this does not quite solve the problem, as the values may still be negative if the sample size is small. Many other corrections [6], [13] have been proposed to address this issue, involving subtraction of the null model in various forms. Nevertheless, as we have seen in Figure (19) of the amended Ising model, only by subtracting the two directions of Transfer Entropy did we obtain a clear direction, as this cancelled out the underlying collective behaviour. We suspect that this will work equally well for cancelling out other types of background effects and succeed in revealing directionality.
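The surrogate idea can be sketched as follows: estimate Transfer Entropy on the original pair, then on surrogates in which the source series is randomly permuted (destroying the coupling while preserving the source's marginal distribution), and check whether the original value exceeds the surrogate distribution. A minimal sketch, assuming a shuffle surrogate and a standard histogram plug-in estimator; it is an illustration, not the specific surrogate scheme of any of the cited works.

```python
import math
import random
from collections import Counter

def plugin_te(x, y):
    """Histogram (plug-in) Transfer Entropy T_{X->Y} at lag 1, in bits."""
    joint, cond, pair, marg = Counter(), Counter(), Counter(), Counter()
    for t in range(1, len(y)):
        yn, yp, xp = y[t], y[t - 1], x[t - 1]
        joint[(yn, yp, xp)] += 1
        cond[(yp, xp)] += 1
        pair[(yn, yp)] += 1
        marg[yp] += 1
    n = len(y) - 1
    return sum((c / n) * math.log2((c / cond[(yp, xp)]) / (pair[(yn, yp)] / marg[yp]))
               for (yn, yp, xp), c in joint.items())

random.seed(3)
x = [random.randint(0, 1) for _ in range(4000)]
y = [0] + x[:-1]                       # y is driven by x at lag 1
te_true = plugin_te(x, y)

surrogates = []
for _ in range(19):                    # shuffle x: break the coupling, keep marginals
    xs = x[:]
    random.shuffle(xs)
    surrogates.append(plugin_te(xs, y))

significant = te_true > max(surrogates)
```

Comparing against the whole surrogate distribution, rather than subtracting a single null value, avoids the negative-value problem mentioned above for small samples.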

Figure 24. Transfer Entropy using equation (17) on a simulated null model with varying sample size (length of time series) Inline graphic, where Inline graphic.


Analytical values are all Inline graphic. Error bars in the first figure display two standard deviations above and two standard deviations below. For large Inline graphic the error bar becomes smaller than the width of the curve. In order to use the null model as surrogates, Inline graphic still has to be chosen in accordance with Inline graphic.

Discussion

This paper highlights the question of distinguishing interdependencies induced by collective behaviour from those due to individual (coupled) interactions, in order to understand the inner workings of complex systems from data sets. These data sets are usually in the form of time series that behave essentially as stochastic series. It is hence of great interest to understand measures proposed to probe ‘causality’ in complex systems. Transfer Entropy has been suggested as a good probe on the basis of its nonlinearity, its exploratory approach and its interpretation in terms of information transfer.

To investigate the behaviour of Transfer Entropy, we studied two simple models. From the results of applying Transfer Entropy to the Ising model, we proposed that collective behaviour is also a type of ‘causality’ in the Wiener-Granger framework, but highlighted that it should be distinguished from individual interactions, illustrating this issue on an amended Ising model. The collective behaviour that emerges near criticality may overshadow the intrinsic directionality in the system, as the latter is not detected by measures such as covariance (correlation) and Mutual Information. We showed that by taking into account both directions of Transfer Entropy on the amended Ising model, a clear direction can be identified. In addition, we verified on the amended Ising model that Transfer Entropy is indeed maximal at the exact causal lag.

By obtaining the phase transition-like difference measure, we have shown that Transfer Entropy is highly dependent on the effective rate of change (ERC) and is therefore likely to depend on the overall activity level given by, say, the temperature in thermal systems, as we demonstrated on the amended Ising model. Using the Random Transition model we illustrated that the ERC comprises internal as well as external influences, which is why Transfer Entropy reflects both. This also explains why collective behaviour in the Ising model is detected as a type of ‘causality’. In complex systems, where various interactions are bound to exist on top of the emergent collective behaviour, the situation can be difficult to disentangle and caution is needed. Moreover, we pointed out the danger of spurious values in the estimation of Transfer Entropy due to finite statistics, which can be circumvented to a certain extent by comparing the amplitude of the causality measure in both directions and by the use of null models.

We believe that identifying these influences is important for our understanding of Transfer Entropy, with the aim of utilising its full potential in uncovering the dynamics of complex systems. The mechanism of replicating ‘causality’ in the amended Ising model and the Random Transition model may be used to investigate these ‘causality’ measures even further. Plans for future investigations involve indirect ‘causality’, multiple sources and multiple targets. It would also be interesting to understand these measures in terms of local and global dynamics in dynamical systems. It is our hope that these investigations will help establish these ‘causality’ measures as part of a repertoire of measures for complex systems.

Acknowledgments

The authors gratefully acknowledge the financial support received in the form of research grants from Universiti Kebangsaan Malaysia (GGPM-2013-067 and DLP-2013-007).

Funding Statement

The authors gratefully acknowledge the financial support received in the form of research grants from Universiti Kebangsaan Malaysia (GGPM-2013-067 and DLP-2013-007). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Bak P (1996) How Nature Works: The Science of Self Organized Criticality. New York: Springer-Verlag. [Google Scholar]
  • 2.Christensen K, Moloney RN (2005) Complexity and Criticality. London: Imperial College Press. [Google Scholar]
  • 3.Jensen HJ (1998) Self Organized Criticality: Emergent Complex Behavior in Physical and Biological Systems. Cambridge: Cambridge University Press. [Google Scholar]
  • 4.Pruessner G (2012) Self-Organised Criticality: Theory, Models and Characterisation. Cambridge: Cambridge University Press. [Google Scholar]
  • 5.Jensen HJ (2009) Probability and statistics in complex systems, introduction to. In: Encyclopedia of Complexity and Systems Science. pp. 7024–7025.
  • 6. Runge J, Heitzig J, Marwan N, Kurths J (2012) Quantifying causal coupling strength: A lag-specific measure for multivariate time series related to transfer entropy. Phys Rev E 86: 061121. [DOI] [PubMed] [Google Scholar]
  • 7.Wiener N (1956) I Am a Mathematician: The Later Life of a Prodigy. Massachusetts: MIT Press. [Google Scholar]
  • 8. Granger CWJ (1969) Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37: 424–438. [Google Scholar]
  • 9. Bressler SL, Seth A (2011) Wiener-Granger causality: A well established methodology. NeuroImage 58: 323–329. [DOI] [PubMed] [Google Scholar]
  • 10. Sauer N (2010) Causality and causation: What we learn from mathematical dynamic systems theory. Transactions of the Royal Society of South Africa 65: 65–68. [Google Scholar]
  • 11. Hausman DM (1999) The mathematical theory of causation. Brit J Phil Sci 3: 151–162. [Google Scholar]
  • 12. Friston K (2011) Dynamic causal modeling and Granger causality comments on: The identification of interacting networks in the brain using fMRI: Model selection, causality and deconvolution. NeuroImage 58: 303–305. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Vicente R, Wibral M, Lindner M, Pipa G (2011) Transfer entropy: a model-free measure of effective connectivity for the neurosciences. J Comput Neurosci 30: 45–67. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Martini M, Kranz TA, Wagner T, Lehnertz K (2011) Inferring directional interactions from transient signals with symbolic transfer entropy. Phys Rev E 83: 011919. [DOI] [PubMed] [Google Scholar]
  • 15. Marschinski R, Kantz H (2002) Analysing the information flow between financial time series: An improved estimator for transfer entropy. Eur Phys J B 30: 275–281. [Google Scholar]
  • 16. Schreiber T (2000) Measuring information transfer. Phys Rev Lett 85: 461–464. [DOI] [PubMed] [Google Scholar]
  • 17. Kaiser A, Schreiber T (2002) Information transfer in continuous process. Physica D 166: 43–62. [Google Scholar]
  • 18. Pompe B, Runge J (2011) Momentary information transfer as a coupling of measure of time series. Phys Rev E 83: 051122. [DOI] [PubMed] [Google Scholar]
  • 19. Hlavackova-Schindler K, Paluš M, Vejmelka M, Bhattacharya J (2007) Causality detection based on information-theoretic approaches in time series analysis. Physics Reports 441: 1–46. [Google Scholar]
  • 20. Vejmelka M, Palus M (2008) Inferring the directionality of coupling with conditional mutual information. Phys Rev E 77: 026214. [DOI] [PubMed] [Google Scholar]
  • 21. Lungarella M, Ishiguro K, Kuniyoshi Y, Otsu N (2007) Methods for quantifying the causal structure of bivariate time series. J Bifurcation Chaos 17: 903–921. [Google Scholar]
  • 22. Wibral M, Pampu N, Priesemann V, Siebenhuhner F, Seiwert H, et al. (2013) Measuring information-transfer delays. PLoS ONE 8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Shannon CE (1948) A mathematical theory of communication. The Bell System Technical Journal 27: 379–423, 623–656. [Google Scholar]
  • 24.Cover T, Thomas J (1999) Elements of information theory. New York: Wiley. [Google Scholar]
  • 25. Kraskov A, Stögbauer H, Grassberger P (2004) Estimating mutual information. Phys Rev E 69: 066138. [DOI] [PubMed] [Google Scholar]
  • 26. Nichols JM, Seaver M, Trickey ST (2005) Detecting nonlinearity in structural systems using the transfer entropy. Phys Rev E 72: 046217. [DOI] [PubMed] [Google Scholar]
  • 27. Li Z, Ouyang G, Li D, Li X (2011) Characterization of the causality between spike trains with permutation conditional mutual information. Phys Rev E 84: 021929. [DOI] [PubMed] [Google Scholar]
  • 28. Cipra BA (1987) An introduction to the Ising model. The American Mathematical Monthly 94: 937–959. [Google Scholar]
  • 29.Krauth W (2006) Statistical Mechanics: Algorithms and Computations. Oxford: Oxford University Press. [Google Scholar]
  • 30.Norris JR (2008) Markov Chains. Cambridge: Cambridge University Press. [Google Scholar]
  • 31.Witthauer L, Dieterle M (2007). The phase transition of the 2D-Ising model. Available: http://quantumtheory.physik.unibas.ch/bruder/Semesterprojekte2007/p1/index.htmlx1-110002.1.6. (refer to Figure 9).
  • 32.Abdul Razak F (2013) Mutual Information based measures on complex interdependent networks of neuro data sets. Ph.D. thesis, Department of Mathematics, Imperial College London.
  • 33. Theiler J (1986) Spurious dimension from correlation algorithms applied to limited time-series data. Phys Rev A 34: 2427–2432. [DOI] [PubMed] [Google Scholar]
  • 34. Papana A, Kugiumtzis D, Larsson PG (2011) Reducing the bias of causality measures. Phys Rev E 83: 036207. [DOI] [PubMed] [Google Scholar]
  • 35. Palus M, Stefanovska A (2003) Direction of coupling from phases of interacting oscillators: An information-theoretic approach. Phys Rev E 67: 055201. [DOI] [PubMed] [Google Scholar]
