Partial cross mapping eliminates indirect causal influences

Siyang Leng; Huanfei Ma; Jürgen Kurths; Ying-Cheng Lai; Wei Lin; Kazuyuki Aihara; Luonan Chen

doi:10.1038/s41467-020-16238-0

Nat Commun. 2020 May 26;11:2632. doi: 10.1038/s41467-020-16238-0

PMCID: PMC7251131 PMID: 32457301

Abstract

Causality detection likely misidentifies indirect causations as direct ones, due to the effect of causation transitivity. Although several methods in traditional frameworks have been proposed to avoid such misinterpretations, there still is a lack of feasible methods for identifying direct causations from indirect ones in the challenging situation where the variables of the underlying dynamical system are non-separable and weakly or moderately interacting. Here, we solve this problem by developing a data-based, model-independent method of partial cross mapping based on an articulated integration of three tools from nonlinear dynamics and statistics: phase-space reconstruction, mutual cross mapping, and partial correlation. We demonstrate our method by using data from different representative models and real-world systems. As direct causations are keys to the fundamental underpinnings of a variety of complex dynamics, we anticipate our method to be indispensable in unlocking and deciphering the inner mechanisms of real systems in diverse disciplines from data.

Subject terms: Network topology, Ecological networks, Applied mathematics

It is crucial yet challenging to identify cause-consequence relation in complex dynamical systems where direct causal links can mix with indirect ones. Leng et al. propose a data-driven model-independent method to distinguish direct from indirect causality and test its applicability to real-world data.

Introduction

Causal interactions are fundamental underpinnings in natural and engineering systems, as well as in social, economic, and political systems. Here system details are typically not known, but only time series are available. Correctly identifying causal relations among the dynamical variables generating the time series provides a window through which the inner dynamics of the target system may be probed, and a number of methods have been developed for this purpose, such as those based on the celebrated Granger causality^1–5, the entropy^6–11, the dynamical Bayesian inference^12–15, and the mutual cross mapping (MCM)^16–21, with applications to real-world systems^5,7,22–31. If the system contains two independent variables only, the causal relation between them is straightforwardly direct. However, for a complex system with a large number of interacting nodes connected with each other in a networked fashion, two kinds of causation can arise: direct and indirect. In particular, if there is a direct link between two nodes, the detected causal relation between them can contain a direct component and an indirect one through other nodes in the network as a result of the generic phenomenon of causation transitivity (see Fig. 1). Even for two nodes that are not directly connected, a causal relation may be detected, but it must be indirect. Eliminating indirect causal influences so as to ascertain direct causal links is of paramount importance, as the latter constitute the basis for modeling, predicting, and controlling the system. Previous studies made significant advances in detecting direct causal links to reconstruct the underlying true causal network based on the concept of partial transfer entropy or its linear Gaussian version, the conditional Granger causality, which led to many successful data-mining applications in related fields^32–38. 
Combining these methods with graphical models, recent studies further provided a visible and comprehensive description of causal relations among the variables of interest^36,38,39. However, mathematically, none of these methods is directly applicable in situations where the relevant dynamical variables are non-separable, so that the information from any variable cannot be separated easily in a prediction framework (see “Methods” for the rigorous concept of non-separability). In real-world nonlinear systems, non-separability is ubiquitously present among system variables¹⁷. To our knowledge, the problem of ascertaining direct causation by removing indirect causal influences for general complex dynamical systems has not been fully studied and remains outstanding.

Fig. 1 — a There is directional interaction between variables X and Y, but Z is an independent variable. b The variables X, Y, and Z constitute a one-directional causal chain with an indirect causal link from X to Y. c The variables constitute a causal loop, where every two neighboring variables have, in two opposite directions, a direct and an indirect causal link, respectively. d For a network with many interacting variables, more indirect causal links would be falsely identified as direct causal links.

In this paper, we develop a data-based, model-free method of partial cross mapping (PCM) to eliminate indirect causal influences in situations where non-separability is allowed to be present. The central idea is to integrate three basic data analysis methods from nonlinear dynamics and statistics: classic phase-space reconstruction, MCM, and partial correlation, to detect direct causal links for complex and nonlinear networked systems. The method is validated using various benchmark systems. Its applications to real-world systems lead to new insights into their dynamical underpinnings. The method provides a solution to the long-standing, crucial problem with existing causality detection methods: misidentifying indirect causal influences as direct ones. Because of its unprecedented ability to eliminate indirect causation, this method can be a powerful tool to understand and model complex dynamical systems.

Results

Direct and indirect causal links

To illustrate the difference between direct and indirect causal links, we first consider a toy system of three variables with different interaction structures. If only two variables interact in one direction and the third one is isolated (Fig. 1a), then the previous methods can be effective for identifying the direct causal link^16–21. However, when the three variables constitute a unidirectional causal chain (Fig. 1b), applying any of the previous methods to the time series from a pair of variables would detect a false direct link between the two non-neighboring variables X and Y in Fig. 1b (see “Methods” for a false link aroused by the transitivity). When the three variables constitute a causal loop (Fig. 1c), every two neighboring variables may have an indirect causal link in addition to the direct one in the opposite direction. In this case, previous methods would falsely identify any actual indirect link as a direct one. In addition to the above three representative interaction structures for the three variables, all the other possible modes have been introduced thoroughly and investigated systematically in Supplementary Note 1. Moreover, with more observable variables, the likelihood that indirect causal links are incorrectly regarded as direct ones will substantially increase (Fig. 1d).

Partial cross mapping

To overcome this problem, we propose the PCM method. The key idea is to examine the consensus between one time series and its cross map prediction from the other, conditioned on the part that is transferred from the third variable. For clarity, we consider the simple case of three variables (X, Y, and Z) causally interacting with each other in a unidirectional chain (Fig. 2a). Let $X = {\{x_{t}\}}_{t = 1}^{L}$ , $Y = {\{y_{t}\}}_{t = 1}^{L}$ , and $Z = {\{z_{t}\}}_{t = 1}^{L}$ be the corresponding time series of length L. Using Takens–Mañé’s delay-coordinate embedding^40,41, we obtain three shadow manifolds: $M_{X} = {\{\mathbf{x}_{t}\}}_{t = r}^{L}$ , $M_{Y} = {\{\mathbf{y}_{t}\}}_{t = r}^{L}$ , and $M_{Z} = {\{\mathbf{z}_{t}\}}_{t = r}^{L}$ with the delay vectors

\mathbf{x}_{t} = (x_{t}, x_{t - τ_{x}}, \dots, x_{t - (E_{x} - 1) τ_{x}}), \quad \mathbf{y}_{t} = (y_{t}, y_{t - τ_{y}}, \dots, y_{t - (E_{y} - 1) τ_{y}}), \quad \mathbf{z}_{t} = (z_{t}, z_{t - τ_{z}}, \dots, z_{t - (E_{z} - 1) τ_{z}}),

where E_x, E_y, and E_z are the respective embedding dimensions, τ_x, τ_y, and τ_z are the time lags, and $r = \max_{ξ = x, y, z} \{1 + (E_{ξ} - 1) τ_{ξ}\}$ . These embedding dimensions and time lags can be determined computationally by the methods of false nearest neighbors (FNN) and delayed mutual information (DMI), respectively. More advanced techniques can also be utilized^20,42. In general, for any pair of variables ξ and η ∈ {x, y, z}, we set ${\hat{N}}^{ξ} (η_{t}) = \{η_{t^{'}} ∣ ξ_{t^{'}} \in N (ξ_{t})\}$ , where $N (ξ_{t})$ is a set containing a fixed number (usually taken as E_ξ + 1, which is the minimum number of points needed for a bounded simplex in an E_ξ-dimensional space⁴³) of nearest neighboring points of ξ_t in the corresponding shadow manifold. For $ξ = η$ , ${\hat{N}}^{ξ} (η_{t})$ becomes $N (η_{t})$ . For $ξ \neq η$ , ${\hat{N}}^{ξ} (η_{t})$ becomes a cross mapping neighborhood from $N (ξ_{t})$ (for an illustrative example, see the horizontal arrows from M_Y to M_X in Fig. 2a). The dependence from $N (η_{t})$ to ${\hat{N}}^{ξ} (η_{t})$ characterizes the causal influence from the variable producing η_t to the variable producing ξ_t. Previously developed heuristic measures for quantifying such dependence and causal influence^16–18,20,21 constitute the MCM framework. We exploit the correlation coefficient¹⁷ between η_t and ${\hat{η}}_{t}^{ξ} = E [{\hat{N}}^{ξ} (η_{t})]$ , where ${\hat{η}}_{t}^{ξ}$ is the mapping from ξ_t and $E [\cdot]$ is an operation taking an appropriately weighted average over all the points in a given set. Specifically, if the correlation coefficient $ϱ_{C} = ∣Corr (x_{t}, {\hat{x}}_{t}^{y})∣$ is larger than an empirical threshold T, the MCM method will stipulate that there is a causal influence from X to Y. MCM complements the field of causality analysis in pairwise non-separable dynamical systems. 
However, due to causation transitivity, the causal link detected by MCM can be either direct or indirect, as illustrated in Fig. 2a. Additionally, since causation manifests its influence in a certain time delay, we search for an optimal time delay that maximizes the causation (i.e., the obtained correlation coefficient ϱ_C) between a translated Y and X (see “Methods” for a detailed description)²⁰.
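
The cross-mapping step described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: the function names (`delay_embed`, `cross_map`, `mcm_index`) are hypothetical, a single E and τ are used for both series, the optimal-delay search is omitted, and the exponential neighbour weights follow the convention commonly used in convergent cross mapping, which the text only describes as an "appropriately weighted average".

```python
import math

def delay_embed(series, E, tau):
    """Return (start, vectors); vectors[k] is the delay vector at time start + k."""
    start = (E - 1) * tau
    vectors = [tuple(series[t - j * tau] for j in range(E))
               for t in range(start, len(series))]
    return start, vectors

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def cross_map(source, target, E, tau):
    """Estimate target[t] from the E + 1 nearest neighbours of source's delay vector."""
    start, vecs = delay_embed(source, E, tau)
    times, estimates = [], []
    for k, v in enumerate(vecs):
        # Nearest neighbours on the shadow manifold of `source`, excluding the point itself.
        nearest = sorted((math.dist(v, w), m)
                         for m, w in enumerate(vecs) if m != k)[:E + 1]
        d0 = max(nearest[0][0], 1e-12)  # guard against a zero nearest distance
        # Assumption: exponential weights scaled by the nearest distance.
        weights = [math.exp(-d / d0) for d, _ in nearest]
        estimate = sum(w * target[start + m]
                       for w, (_, m) in zip(weights, nearest)) / sum(weights)
        times.append(start + k)
        estimates.append(estimate)
    return times, estimates

def mcm_index(cause, effect, E, tau):
    """rho_C for the link cause -> effect: |Corr(cause, cause estimated from effect)|."""
    times, estimate = cross_map(effect, cause, E, tau)
    return abs(pearson([cause[t] for t in times], estimate))
```

Note the direction convention: to test for an influence from X to Y, X is estimated from the shadow manifold of Y, so `mcm_index(x, y, E, tau)` cross maps from `y` to `x`.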

Heuristically, ϱ_C, as defined above, represents the cosine of the angle between X and ${\hat{X}}^{Y}$ in the entire space, as shown in Fig. 2b. To determine whether causation transitivity is present, we consider the projection of ϱ_C onto the information space orthogonal to the indirect information that is induced by the causation transitivity. To this end, we formulate our PCM framework (see “Methods” and Supplementary Fig. 1 for detailed formulations and practical instructions). First, for the time series pair Z and the translated $Y_{τ_{i}} = \{y_{t + τ_{i}}\}$ with candidate time delays τ_i (i = 1, 2, …, m), we apply the conventional MCM method to determine the optimal time delay $τ_{i} = τ_{i_{1}}$ , which maximizes the correlation coefficient $Corr (Z, {\hat{Z}}^{Y_{τ_{i}}})$ . The corresponding mapping ${\hat{Z}}^{Y_{τ_{i_{1}}}}$ from $Y_{τ_{i_{1}}}$ is denoted by ${\hat{Z}}^{Y}$ for simplicity. The next step is to repeat the procedure for the time series pair X and the translated ${\hat{Z}}_{τ_{i}}^{Y}$ so as to obtain the optimal time delay $τ_{i_{2}}$ , which maximizes the coefficient $Corr (X, {\hat{X}}^{{\hat{Z}}_{τ_{i}}^{Y}})$ , together with the mapping ${\hat{X}}^{{\hat{Z}}_{τ_{i_{2}}}^{Y}}$ from ${\hat{Z}}_{τ_{i_{2}}}^{Y}$ . 
We denote this mapping by ${\hat{X}}^{{\hat{Z}}^{Y}}$ ; it is acquired from a successive MCM procedure and characterizes the indirect information flow through Z. Repeating the above procedure for the time series pair X and the translated $Y_{τ_{i}}$ yields ${\hat{X}}^{Y}$ , which characterizes all causal information from X to Y. We then introduce the correlation index $ϱ_{D} = ∣Pcc (X, {\hat{X}}^{Y} ∣ {\hat{X}}^{{\hat{Z}}^{Y}})∣$ to measure the direct causation from X to Y conditioned on the indirect causation through Z, where Pcc( ⋅ , ⋅ ∣ ⋅ ) is the partial correlation coefficient describing the degree of association between the first two variables with the information about the third variable removed⁴⁴. This contrasts with the MCM index $ϱ_{C} = ∣Corr (X, {\hat{X}}^{Y})∣$ . Note that we search for the strongest causation over the candidate time delays in every MCM procedure above. As a consequence, ϱ_D can be regarded intuitively as the projection of ϱ_C onto the information space orthogonal to the indirect information ${\hat{X}}^{{\hat{Z}}^{Y}}$ (Fig. 2b), and thus eliminates the indirect causal influence.
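
For a single conditioning variable, the partial correlation in ϱ_D reduces to the standard first-order formula in terms of pairwise Pearson correlations. The sketch below illustrates that formula only; the names `partial_corr` and `pcm_index` are hypothetical, the three inputs are assumed to be already time-aligned sequences (X, the estimate of X from Y, and the estimate of X obtained through Z), and returning 0.0 in the degenerate case is a convention chosen here, not something stated in the text.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y with z held fixed."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    denom_sq = (1.0 - r_xz ** 2) * (1.0 - r_yz ** 2)
    if denom_sq <= 1e-24:
        # Degenerate case: x or y is (numerically) a linear function of z.
        # Assumption: report no remaining association.
        return 0.0
    return (r_xy - r_xz * r_yz) / math.sqrt(denom_sq)

def pcm_index(x, x_from_y, x_via_z):
    """rho_D = |Pcc(X, X^Y | X^(Z^Y))| for already aligned sequences."""
    return abs(partial_corr(x, x_from_y, x_via_z))

# Toy illustration: x and y are correlated only through their shared component z.
z = [1, 1, -1, -1]
x = [2, 0, 0, -2]   # z + [1, -1, 1, -1]
y = [2, 0, -2, 0]   # z + [1, -1, -1, 1]
marginal = pearson(x, y)            # 0.5: an apparent association
conditional = partial_corr(x, y, z)  # approximately 0 once z is removed
```

In the toy data the marginal correlation is 0.5 while the partial correlation given `z` vanishes up to rounding, which mirrors how ϱ_D can stay small when ϱ_C is large.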

For three causally interacting variables X, Y, and Z, we generally have ϱ_C ≥ ϱ_D. Setting an empirical threshold 1 > T ≫ 0, we have three cases for the order of the correlation indices: ϱ_C ≥ ϱ_D ≥ T, ϱ_C ≥ T ≫ ϱ_D, and T > ϱ_C ≥ ϱ_D, corresponding, respectively, to the three causal relations: a direct causal link from X to Y, a sole indirect causal link from X to Y, and the absence of any causal link from X to Y. The index ϱ_D thus characterizes the degree to which direct causal links can be ascertained while eliminating the possibility of indirect links. For the example in Fig. 2a, the causal interaction of X and Y belongs to the second case above, which can be inferred from the correlation indices being ordered as ϱ_C ≥ T ≫ ϱ_D. In real applications, it can happen that the causal signals in transition are not strong enough, so that ϱ_C ≳ T and ϱ_D is close to T. In such a case, the detection of direct causal links becomes more sensitive to the value of T. To overcome this difficulty, we introduce γ = ϱ_D/ϱ_C to measure the proximity of the two index values. The closer γ is to one, the more likely it is that a direct causal link exists. Multiple tests^45–47 have been conducted to ensure statistical reliability.
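
The three cases above amount to a simple decision rule, sketched here for illustration. The function names are hypothetical, and the sketch simplifies the condition T ≫ ϱ_D to ϱ_D < T; it does not include the multiple-testing step mentioned in the text.

```python
def classify_link(rho_c, rho_d, threshold):
    """Map the index pair (rho_C, rho_D) to one of the three cases in the text."""
    if rho_c < threshold:
        return "none"       # T > rho_C >= rho_D: no causal link detected
    if rho_d >= threshold:
        return "direct"     # rho_C >= rho_D >= T: direct causal link
    return "indirect"       # rho_C >= T > rho_D: only an indirect link

def gamma_ratio(rho_c, rho_d):
    """gamma = rho_D / rho_C; values close to one favour a direct link."""
    return rho_d / rho_c
```

For example, with T = 0.5, the pair (ϱ_C, ϱ_D) = (0.9, 0.1) is classified as an indirect link, whereas (0.9, 0.8) is classified as direct.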

The framework of PCM can be generalized to networked systems with an arbitrary number of interacting variables: X, Y, Z^1, …, Z^s (s ≥ 2) (e.g., Fig. 1d). With the full correlation between X and ${\hat{X}}^{Y}$ , we calculate their partial correlation coefficient, denoted as $ϱ_{D_{1}} = ∣Pcc (X, {\hat{X}}^{Y} ∣ \{{\hat{X}}^{{{\hat{Z}}^{i}}^{Y}} ∣ i = 1, \dots, s\})∣$ , by removing the information of the cross mapping variables from the s variables Z^1, …, Z^s, where $ϱ_{D_{1}}$ is a first-order measure for distinguishing the direct from the indirect causal link from X to Y. The motivation and formalization for extending this measure to higher orders are described in the “Methods” section. We emphasize here that strongly coupled (synchronized) variables in nonlinear systems are not in the scope of the PCM framework, because in this circumstance the complete system collapses to the cause system sub-manifold, and the effect variable becomes an observation function on the cause system, where bidirectional causation will always be computationally detected¹⁷. In addition, theoretically our PCM framework is based on the Takens–Mañé theorem, which is applicable only for autonomous systems. Data entirely recorded from nonautonomous systems are therefore not directly suitable for this framework⁴⁸, but our method can be applied to some nonautonomous systems. In particular, it can be numerically used to detect piecewise causations with data from switching systems where the switching points can be located and each duration between consecutive switching points is sufficiently long. Also, our framework is suitable for some forced systems and/or systems with weak or moderate noise, because some generalized embedding theorems support the soundness of our framework^49,50. 
As for an important kind of nonautonomous system, viz., dynamical oscillators with time-evolving coupled functions and/or with various types of noise, the dynamical Bayesian inference with a delicate set of function bases can provide practical solutions¹⁴. As for future research, possible investigations include combining the above mutually complementary methods for causation detection in more general dynamical systems without knowing explicit model equations but with highly complex interaction structures.

Ascertaining direct causation in benchmark systems

To validate our PCM method, we use the following benchmark system of three interacting species: $x_{t} = x_{t - 1} (α_{x} - α_{x} x_{t - 1} - β_{x y} y_{t - 1}) + ϵ_{x, t}$ , $y_{t} = y_{t - 1} (α_{y} - α_{y} y_{t - 1} - β_{y x} x_{t - 1} - β_{y z} z_{t - 1}) + ϵ_{y, t}$ , and $z_{t} = z_{t - 1} (α_{z} - α_{z} z_{t - 1} - β_{z x} x_{t - 1}) + ϵ_{z, t}$ , for α_x = 3.6, α_y = 3.72, and α_z = 3.68, where $ϵ_{i, t}$ (i ∈ {x, y, z}) are white noise of zero mean and standard deviation 0.005. Different choices of the coupling parameters β_xy, β_yx, β_yz, and β_zx can lead to distinct interacting modes (Fig. 3a). From the time series, we compute the MCM and PCM indices, ϱ_C and ϱ_D, respectively, for detecting the causal link from X to Y, with results listed in Fig. 3b, c. While there are cases where both methods are effective at detecting the direct causal links, for the causal chain and the causal loop structures with the threshold value T = 0.5, the PCM method succeeds in discriminating the indirect causal links, while clearly the MCM method, without eliminating the influence of the causation transitivity, fails. As further shown in Supplementary Note 2, the PCM performance is more robust than that of the MCM method with respect to variations in the value of T, making the PCM method applicable to real-world systems when there is little or no a priori knowledge for assigning a proper value of T. The results in Fig. 3b, c have also been verified by using multiple-testing corrections. Additionally, for all the other possible interaction structures of three species, including the representative network motifs: fan-in, fan-out, and cascading structures^51,52, our systematic studies show that the PCM method achieves fully accurate causation detection (see Supplementary Note 1). 
More importantly, we systematically conducted comparison studies with the Granger causality, the transfer entropy and all their conditional extensions to detect the causations for the above three species system and tested their robustness against different noise levels and time series lengths. As clearly shown in Supplementary Note 3, the PCM outperforms those existing methods which are, in principle, suitable only for the variables satisfying the separability condition. We also provided a comparison study between the PCM framework and the dynamical Bayesian inference in Supplementary Note 3. Both methods have their own particular advantages and could be used in a complementary manner. All these results systematically demonstrate the universal and peculiar usefulness of our method to the typical situation where the variables of dynamical systems are non-separable.
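
The benchmark system can be simulated directly from its three update equations. The following is a minimal sketch: the function name, the initial condition, and the seed are assumptions made for illustration, and the coupling values in the usage line are hypothetical because the text does not list the ones used for Fig. 3.

```python
import random

def simulate_three_species(n, beta_xy, beta_yx, beta_yz, beta_zx,
                           noise_sd=0.005, init=(0.4, 0.4, 0.4), seed=0):
    """Iterate the three-species benchmark map for n time points."""
    rng = random.Random(seed)
    alpha_x, alpha_y, alpha_z = 3.6, 3.72, 3.68  # growth rates given in the text
    xs, ys, zs = [init[0]], [init[1]], [init[2]]
    for _ in range(n - 1):
        x, y, z = xs[-1], ys[-1], zs[-1]
        # beta_xy: Y acts on X; beta_yx: X acts on Y;
        # beta_yz: Z acts on Y; beta_zx: X acts on Z.
        xs.append(x * (alpha_x - alpha_x * x - beta_xy * y)
                  + rng.gauss(0.0, noise_sd))
        ys.append(y * (alpha_y - alpha_y * y - beta_yx * x - beta_yz * z)
                  + rng.gauss(0.0, noise_sd))
        zs.append(z * (alpha_z - alpha_z * z - beta_zx * x)
                  + rng.gauss(0.0, noise_sd))
    return xs, ys, zs

# Causal chain X -> Z -> Y with hypothetical coupling strengths.
xs, ys, zs = simulate_three_species(1000, beta_xy=0.0, beta_yx=0.0,
                                    beta_yz=0.1, beta_zx=0.1)
```

Setting a coupling parameter to zero removes the corresponding link, so the interaction modes of Fig. 3a correspond to different patterns of zero and nonzero couplings.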

Fig. 3 — a Three distinct interaction modes of the system. b Causal links from X to Y detected by the MCM method, which contain false direct causation for the second and the third interaction modes. c Direct causal links detected by the PCM method, which successfully excludes the false direct causations in b. In each of 100 trials, a segment of length 1000 is randomly selected from a time series of length 5000, where the sampling rate is 1 Hz so that the length matches exactly the time unit of the system. The average is calculated over the results of these randomly selected trials. The phase-space reconstruction parameters are E = 4 and τ = 1. Here superscripts of ϱ_C and ϱ_D denote the specified causal direction.

Additionally, we validate the effectiveness of the PCM method in a network model containing eight interacting species. As shown in Supplementary Fig. 10, the direct causal network can be reconstructed faithfully while the indirect links are all eliminated successfully by setting appropriate values of T. In contrast, with the same values of T, the MCM method produces a dense network containing direct, indirect, and even erroneous causal links. We also find that the ratio γ = ϱ_D/ϱ_C can be used to improve the detection accuracy even for relatively small values of the threshold T (Supplementary Note 4). Moreover, selecting a practically effective threshold value is more feasible and robust in our PCM method (see Supplementary Fig. 11 and see Supplementary Note 5 for detailed information on statistical tests and methods for threshold selection). The robustness tests of PCM against the time series lengths and the noise scales also show good effectiveness even with small data size and relatively strong noise in this model (Supplementary Note 3). These additional results demonstrate the power of our PCM method in detecting direct links and accurately reconstructing the underlying causal networks from multivariate time series.

Detecting direct causation in real-world networks

We test gene regulatory networks with gene expression data available from DREAM4 in silico Network Challenge^53–55. There are five networks with different, synthetically produced structures. Each network has 100 genes. We use the software GeneNetWeaver⁵⁶ to randomly select 20 interacting genes, where each gene has 10 realizations of 21 gene expression time series data. Figure 4a presents one gene regulatory network (see Supplementary Fig. 12 for the others). For each gene, we combine all realizations as one time series for phase-space reconstruction. We compare the direct causal links detected by PCM with the a priori known edges of the five networks and calculate the respective ROC (receiver operating characteristic) curves (Fig. 4b). We find the average of the five areas under the ROC curves approaches the value of ~0.75, indicating high detection accuracies of direct links in gene regulatory networks even with small data sets, a task for which PCM outperforms the MCM method (see Supplementary Note 6).
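
As a reminder of what the area under the ROC curve measures, the sketch below computes it from a list of link scores and known edge labels using the rank (Mann-Whitney) formulation: the AUROC equals the probability that a randomly chosen true edge receives a higher score than a randomly chosen non-edge. The function name and the toy numbers are illustrative assumptions; the text does not specify how the curves in Fig. 4b were computed.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, is_edge in zip(scores, labels) if is_edge]
    negatives = [s for s, is_edge in zip(scores, labels) if not is_edge]
    if not positives or not negatives:
        raise ValueError("need at least one true edge and one non-edge")
    # Each (positive, negative) pair counts 1 if ranked correctly, 0.5 on ties.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))

# Toy example: PCM-like scores for four candidate links, two of which are true edges.
example = auroc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of true edges from non-edges.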

Fig. 4 — a One of the five gene regulatory networks with 20 interacting genes from GeneNetWeaver. Each red (blue) arrow represents an activating (inhibitory) effect. b ROC curves characterizing the PCM detection performance. The corresponding AUROCs are also indicated. The reconstruction parameters are E = 2 and τ = 1. c A food chain network of three plankton species, where the direction of each red arrow represents a prey to predator interaction. d The PCM indices (the color region framed by red boxes) signifying successful detection of the direct causal links (for E = 4 and τ = 1). A relatively weak but direct causal link (the yellow arrow in c) from *Rotifers* to *Pico cyanobacteria* is identified through the index framed by the yellow box. e Results on all successfully detected interactions between air pollutants and cardiovascular diseases (red box) for E = 7 and τ = 1. f The reconstructed causal network from the results in e. All detection results are verified using multiple testing corrections.

We next consider the food chain network of three plankton species: Pico cyanobacteria, Rotifers and Cyclopoids, with the prey–predator relations indicated in Fig. 4c. The oscillatory population data are selected from an 8-year mesocosm experiment of a plankton community isolated from the Baltic Sea^57–59. Our PCM method yields six indices for all the possible causal links, and we preserve the links with index values ⪆10⁻¹ and discard other links (see Supplementary Note 5 for issues on threshold selection). This leads to two direct causal links, which agree with the ground truth of the original network (Fig. 4d). Remarkably, our PCM method successfully excludes the indirect link from Pico cyanobacteria to Cyclopoids. For this network, there is also a weak direct link from Rotifers to Pico cyanobacteria, and our method is indeed able to detect it (verified with multiple-testing corrections). This reveals that the actual prey–predator hierarchy does not necessarily match the direct causal links among the species. For example, while predators hunt prey, a predator can, through hunting, significantly influence the prey population when the prey are not highly abundant. In such a case, the predator can be regarded as the causal source, giving rise to the third relatively weak but direct causal link.

Our third real-world example is from the recorded data of air pollution and hospital admission of cardiovascular diseases in Hong Kong from 1994 to 1997 (see Supplementary Note 6)^60–62. As shown in Fig. 4e, f, our PCM method uncovers that only two pollutants, namely nitrogen dioxide and respirable suspended particulates, are detected as the major causes of cardiovascular diseases. Neither sulfur dioxide nor ozone has been identified as a cause of the diseases, which is consistent with previous results^20,63. Our method reveals a unidirectional causal relation from ozone to sulfur dioxide, but the detected causal relations among the recognized pollutants are bidirectional. These detected causal relations may be either direct or indirect, because data of other factors, such as temperature, humidity, and wind speed, which can be common causes of some pollutants, are not completely available (e.g., the fan-out interaction mode shown in Supplementary Fig. 2).

We also apply the PCM method to additional real-world examples, including gene expression data related to the circadian rhythms and electroencephalography data of the human brain, in Supplementary Note 7. All the results demonstrate the broad applicability of our method to data sets of different scales, and indeed offer new viewpoints on the dynamical underpinnings of real-world systems.

Discussion

To summarize, by exploiting both dynamical and statistical features of the observed data, our method offers two major advantages: detecting direct causality based on PCM and handling the non-separability problem based on Takens–Mañé’s embedding theorem. Variables of a nonlinear dynamical system are generally considered non-separable due to their intertwined nonlinear nature. Specifically, in contrast to the existing methods on detecting causation, which either misidentify indirect causal links as direct ones or fail due to a violation of the condition of separability, we develop a method theoretically and computationally to solve this outstanding problem, coping with the situation for which the existing frameworks cannot work effectively. The central idea lies in examining the consensus between one time series and its cross map prediction from the other, conditioned on the part that is transferred from the third variable. Our method is capable of not only distinguishing direct from indirect causal influences but also removing the latter. A virtue of our method is that it is generally applicable to nonlinear dynamical networks without requiring the condition of separability, which complements the missing part of causality analysis (see Supplementary Table 3). In fact, the concept of causality in dynamical systems is different from the widely accepted traditional statistical viewpoint that X causes Y if and only if an intervention in X has an effect on Y. Due to the non-separability, causality in dynamical systems should have a different formalization, which in the simplest way can be intuitively interpreted as a coupling term from X to Y in the system’s equations. Further theoretical interpretations regarding this new framework will be included in our future work. Finally, our PCM method is validated by applying it to a number of real-world systems, yielding new insights into the dynamics of these systems. 
Unambiguous identification of direct causal links with indirect causal influence eliminated is a key to understanding and accurately modeling the underlying system, and our framework therefore provides a vehicle to achieve this goal.

Methods

The concept of non-separability

We illustrate the concept, non-separability, by using a general continuous-time dynamical system:

\dot{x} = F (x), \tag{1}

where the state variable $x (t) = {[x_{1} (t), x_{2} (t), \dots, x_{n} (t)]}^{⊤}$ evolves inside a compact manifold $M_{x}$ , forming an attractor $A$ with a dimension $d_{A}$ . Here, $d_{A}$ can be computed as the box-counting dimension of $A$ . The dynamics with an initial value $x_{0} \in M_{x}$ are denoted by x(t) = φ_t(x₀), where φ_t(⋅) is regarded as a flow along the manifold $M_{x}$ . According to Takens–Mañé’s embedding theory and its fractal generalizations, one can, with probability one, reconstruct the system with a positive delay τ and a smooth observation function $h : M_{x} \to R$ in the sense that the delay-coordinate map $Γ_{h, φ, τ} (x) = {[h (x), h (φ_{- τ} (x)), h (φ_{- 2 τ} (x)), \dots, h (φ_{- (L - 1) τ} (x))]}^{⊤}$ is generically an embedding map as long as $L > 2 d_{A}$ . Particularly for direct illustration, we take the observation function h(x) as a simple coordinate function: h(x) = x_i, where x_i is the ith component of x. Thus, we have $y (t) = {[x_{i} (t), x_{i} (t - τ), \dots, x_{i} (t - (L - 1) τ)]}^{⊤}$ and also have the manifold $M_{x}$ mapped to the shadow manifold $M_{y}$ by the embedding map Γ. Since the embedding map is one to one, the dynamics ψ_τ on the shadow manifold $M_{y}$ are topologically conjugated with the dynamics φ_τ on $M_{x}$ , that is,

y (t + τ) = ψ_{τ} (y (t)) = Γ \circ φ_{τ} \circ Γ^{- 1} (y (t)) . \tag{2}

On the one hand, system (1) implies that the future dynamics of one specific component, say x_j with j = i or j ≠ i, is governed by

{[φ_{τ}]}_{j} : x (t) = {[x_{1} (t), x_{2} (t), \dots, x_{n} (t)]}^{⊤} \to x_{j} (t + τ) \tag{3}

and thus depends on the history of all the components x₁, x₂, …, x_n. On the other hand, the relation in (2) implies the other fact that as long as the embedding map Γ exists, the future dynamics of x_j is also governed by

{[Γ^{- 1} \circ ψ_{τ}]}_{j} : y (t) = {[x_{i} (t), x_{i} (t - τ), \dots, x_{i} (t - (L - 1) τ)]}^{⊤} \to x_{j} (t + τ) \tag{4}

and thus only depends on the history of one variable x_i and on the embedding map Γ as well.

Generically, it is possible to make a prediction of x_j(t + τ) based only on the observation of one variable, and this prediction could be as perfect as the prediction using the information of all the variables x₁(t), x₂(t), …, x_n(t) of the system (this undermines the idea of Granger causality and its extensions). Thus, Takens–Mañé’s embedding theory reveals that, in such a deterministic nonlinear dynamical system, the information of the whole dynamical system could be generically injected into only one single variable and thus could be reconstructed from the observation data of that variable. This motivates the concept of non-separability: in general, one cannot remove the information of some variable from the other variables when any prediction is made for the dynamical system. This also reveals that the methods based on prediction frameworks, such as the Granger causality, the transfer entropy, and all their extensions, are mathematically not suitable for dealing with the time series data produced by nonlinear dynamical systems where non-separability always exists among the internal variables. A toy example showing how Granger causality (GC) fails in non-separable systems can be found in the Supplementary Materials of ref. ¹⁷.

Transitivity arousing indirect causation

To investigate how the transitivity arouses indirect causation, we consider a heuristic logistic model of three species connected in the following manner:

x_{t} = x_{t - 1} (α_{x} - α_{x} x_{t - 1}), \quad z_{t} = z_{t - 1} (α_{z} - α_{z} z_{t - 1} - β_{z x} x_{t - 1}), \quad y_{t} = y_{t - 1} (α_{y} - α_{y} y_{t - 1} - β_{y z} z_{t - 1}), \tag{5}

where the three species X = {x_t}, Z = {z_t} and Y = {y_t} are interacting in a causal chain, denoted by X → Z → Y, and the coupling strengths β_zx and β_yz are nonzero.
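
The chain model in (5) can be iterated directly. The following is a minimal sketch, assuming illustrative parameter values and initial conditions (α_x = 3.8, α_z = 3.5, α_y = 3.6, β_zx = β_yz = 0.1); these values, like the function name `simulate_chain`, are assumptions chosen to keep the trajectories bounded and are not taken from the paper.

```python
import numpy as np

def simulate_chain(n=500, a_x=3.8, a_z=3.5, a_y=3.6, b_zx=0.1, b_yz=0.1,
                   x0=0.2, z0=0.3, y0=0.4):
    """Iterate the three-species logistic chain X -> Z -> Y of Eq. (5).

    All default parameters and initial values are illustrative assumptions.
    """
    x, z, y = np.empty(n), np.empty(n), np.empty(n)
    x[0], z[0], y[0] = x0, z0, y0
    for t in range(1, n):
        x[t] = x[t - 1] * (a_x - a_x * x[t - 1])                    # X evolves autonomously
        z[t] = z[t - 1] * (a_z - a_z * z[t - 1] - b_zx * x[t - 1])  # Z is driven by X
        y[t] = y[t - 1] * (a_y - a_y * y[t - 1] - b_yz * z[t - 1])  # Y is driven by Z only
    return x, z, y

x, z, y = simulate_chain()
# Removing the X -> Z coupling leaves X unchanged but alters Y from the
# second step onward, although X never appears in the equation for Y.
x_cut, z_cut, y_cut = simulate_chain(b_zx=0.0)
```

Comparing `y` with `y_cut` shows the two-step delay with which a change in the X → Z link reaches Y.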

Now, we shift the second equation in (5) by one time step and substitute it into the last equation in (5), which yields:

y_{t} = y_{t - 1} [α_{y} - α_{y} y_{t - 1} - β_{y z} z_{t - 2} (α_{z} - α_{z} z_{t - 2} - β_{z x} x_{t - 2})] . \tag{6}

The last equation in (5) can also be rearranged as:

z_{t - 1} = \frac{1}{β_{y z}} (α_{y} - α_{y} y_{t - 1} - y_{t} / y_{t - 1}), \tag{7}

so that

z_{t - 2} = \frac{1}{β_{y z}} (α_{y} - α_{y} y_{t - 2} - y_{t - 1} / y_{t - 2}) . \tag{8}

Then, substituting Eq. (8) into Eq. (6) gives:

y_{t} = y_{t - 1} \{α_{y} - α_{y} y_{t - 1} - β_{y z} \frac{1}{β_{y z}} (α_{y} - α_{y} y_{t - 2} - y_{t - 1} / y_{t - 2}) \times [α_{z} - α_{z} \frac{1}{β_{y z}} (α_{y} - α_{y} y_{t - 2} - y_{t - 1} / y_{t - 2}) - β_{z x} x_{t - 2}]\} . \tag{9}

Consequently, this equation, together with the first equation in (5), constitutes a unidirectional causal relation from X to Y. This causation, however, is indirect: it is induced by transitivity, and in discrete-time dynamical systems the influence therefore arrives with a time delay.
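
The elimination of Z can be checked numerically. The sketch below, which assumes the same illustrative parameter values as above (not values from the paper), simulates the chain and then recomputes y_t from y_{t−1}, y_{t−2}, and x_{t−2} alone, without reading z at any point. The helper name `y_from_history` is hypothetical.

```python
import numpy as np

# Illustrative parameters and initial values (assumptions, not taken from the paper).
a_x, a_z, a_y, b_zx, b_yz = 3.8, 3.5, 3.6, 0.1, 0.1
n = 200
x, z, y = np.empty(n), np.empty(n), np.empty(n)
x[0], z[0], y[0] = 0.2, 0.3, 0.4
for t in range(1, n):
    x[t] = x[t - 1] * (a_x - a_x * x[t - 1])
    z[t] = z[t - 1] * (a_z - a_z * z[t - 1] - b_zx * x[t - 1])
    y[t] = y[t - 1] * (a_y - a_y * y[t - 1] - b_yz * z[t - 1])

def y_from_history(y1, y2, x2):
    """y_t expressed through y_{t-1} (y1), y_{t-2} (y2) and x_{t-2} (x2) only."""
    z2 = (a_y - a_y * y2 - y1 / y2) / b_yz   # z_{t-2} recovered from the Y series
    return y1 * (a_y - a_y * y1 - b_yz * z2 * (a_z - a_z * z2 - b_zx * x2))

t = np.arange(2, n)
y_pred = y_from_history(y[t - 1], y[t - 2], x[t - 2])
max_err = np.max(np.abs(y_pred - y[t]))      # agreement up to rounding error
```

The recovered values agree with the simulated ones up to floating-point rounding, which illustrates that X enters the dynamics of Y with a lag of two steps even though the two are not directly coupled.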

The PCM method of first order and higher order

We now formulate the PCM framework formally (see Supplementary Fig. 1 for a schematic of the PCM procedure). The first step is to translate the time series Y = {y_t} by time steps τ_i (i = 1, 2, …, m), generating m translated variables denoted as $Y_{τ_{i}} = \{y_{t + τ_{i}}\}$ . For each pair of time series $Y_{τ_{i}}$ and Z, we apply the conventional MCM method (see the practical steps below) to obtain the mapping ${\hat{Z}}^{Y_{τ_{i}}}$ from $Y_{τ_{i}}$ and calculate the correlation coefficient $Corr (Z, {\hat{Z}}^{Y_{τ_{i}}})$ . For simplicity, we denote by ${\hat{Z}}^{Y}$ the mapping ${\hat{Z}}^{Y_{τ_{i_{1}}}}$ with

i_{1} = {argmax}_{1 \leq i \leq m} Corr (Z, {\hat{Z}}^{Y_{τ_{i}}}) . \tag{10}

The next step is to repeat the procedure for the pair formed by the translated ${\hat{Z}}_{τ_{i}}^{Y}$ and X so as to obtain the mapping ${\hat{X}}^{{\hat{Z}}_{τ_{i}}^{Y}}$ from ${\hat{Z}}_{τ_{i}}^{Y}$ , and to set ${\hat{X}}^{{\hat{Z}}^{Y}}$ as ${\hat{X}}^{{\hat{Z}}_{τ_{i_{2}}}^{Y}}$ with

i_{2} = {argmax}_{1 \leq i \leq m} Corr (X, {\hat{X}}^{{\hat{Z}}_{τ_{i}}^{Y}}) . \tag{11}

The obtained ${\hat{X}}^{{\hat{Z}}^{Y}}$ represents the indirect information flow. By directly applying MCM to the translated $Y_{τ_{i}}$ and X, we obtain ${\hat{X}}^{Y}$ , which represents all the information transferred from X to Y and is shorthand for ${\hat{X}}^{Y_{τ_{i_{3}}}}$ with

i_{3} = {argmax}_{1 \leq i \leq m} Corr (X, {\hat{X}}^{Y_{τ_{i}}}) . \tag{12}

We now introduce the correlation index:

ϱ_{D} = ∣Pcc (X, {\hat{X}}^{Y} ∣ {\hat{X}}^{{\hat{Z}}^{Y}})∣,

where Pcc( ⋅ , ⋅ ∣ ⋅ ) is the partial correlation coefficient describing the association degree between the first two variables with information about the third variable removed. We review the definition of partial correlation coefficient here. For time series X, Y, and Z¹, …, Z^s, the partial correlation coefficient between X and Y conditioned on Z¹ is

Pcc (X, Y ∣ Z^{1}) = \frac{Corr (X, Y) - Corr (X, Z^{1}) Corr (Y, Z^{1})}{\sqrt{(1 - Corr {(X, Z^{1})}^{2}) (1 - Corr {(Y, Z^{1})}^{2})}} .

The partial correlation coefficient between X and Y conditioned on both Z¹ and Z² is

Pcc (X, Y ∣ Z^{1}, Z^{2}) = \frac{Pcc (X, Y ∣ Z^{1}) - Pcc (X, Z^{2} ∣ Z^{1}) Pcc (Y, Z^{2} ∣ Z^{1})}{\sqrt{(1 - Pcc {(X, Z^{2} ∣ Z^{1})}^{2}) (1 - Pcc {(Y, Z^{2} ∣ Z^{1})}^{2})}},

and the partial correlation coefficient between X and Y conditioned on more variables can be defined recursively. For the computation and more information on the partial correlation coefficient, see refs. ^44,64.
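
The recursion above translates into a few lines of code. The following is a minimal sketch using NumPy; the function name `pcc` and the synthetic series are assumptions made for illustration. The last conditioning series plays the role of Z² in the two-variable formula, and the remaining ones are handled by recursive calls.

```python
import numpy as np

def pcc(x, y, conds=()):
    """Partial correlation of x and y given the series in `conds` (recursive definition)."""
    if len(conds) == 0:
        return np.corrcoef(x, y)[0, 1]           # plain Corr(X, Y)
    *rest, z = conds                             # peel off the last conditioning series
    r_xy = pcc(x, y, rest)
    r_xz = pcc(x, z, rest)
    r_yz = pcc(y, z, rest)
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Synthetic illustration: x and y are related only through z1 and z2.
rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=2000), rng.normal(size=2000)
x = z1 + 0.5 * z2 + rng.normal(size=2000)
y = z1 - 0.5 * z2 + rng.normal(size=2000)
r_plain = pcc(x, y)                # clearly nonzero, through the shared drivers
r_partial = pcc(x, y, [z1, z2])    # close to zero once z1 and z2 are removed
```

The recursion evaluates three lower-order coefficients per level, so its cost grows quickly with the size of the conditioning set; for large sets an equivalent matrix-based computation would be preferable.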

To provide detailed instructions for our method, we summarize the practical steps here:

Procedure A: MCM for detecting causation from $U = \{u_{t}\}_{t = 1}^{L}$ to $V = \{v_{t}\}_{t = 1}^{L}$ :

Reconstruct the phase space by delay-coordinate embedding of the time series U and V, which yields the reconstructed manifolds M_U and M_V; the reconstruction parameters (embedding dimensions E_u, E_v and time lags τ_u, τ_v) can be selected by the FNN algorithm and by the method of DMI, respectively (see Supplementary Note 5);
For each time index t, find the set of neighboring points $N (v_{t})$ of v_t on M_V (E_v + 1 nearest neighbors are used, since this is the minimum number of points needed for a bounded simplex in an E_v-dimensional space⁴³);
Find the points in M_U that have the same time indexes as the points in $N (v_{t})$ and calculate their weighted average (the weights are determined by the distances between the points in $N (v_{t})$ and v_t, which defines the operation $E [\cdot]$ ) to obtain the estimate ${\hat{u}}_{t}^{v}$ ;
Use an appropriate index (such as $ϱ_{C} = ∣Corr (u_{t}, {\hat{u}}_{t}^{v})∣$ ) to characterize the agreement between the estimated time series ${\hat{U}}^{V_{0}}$ (the subscript 0 indicates that V is not translated here, for consistency with the notation below) and the original time series U, which measures the causation from U to V.
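
The four steps of Procedure A can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' released code: the embedding parameters are fixed by hand rather than chosen by FNN and DMI, the exponential distance weighting follows common cross-mapping practice (the text only states that the weights depend on the distances), and the scalar u at the neighbor time indexes is averaged, which corresponds to the first coordinate of the averaged points of M_U. The names `delay_embed`, `cross_map`, and `rho_c` are hypothetical.

```python
import numpy as np

def delay_embed(s, E, lag):
    """Rows are [s_t, s_{t-lag}, ..., s_{t-(E-1)lag}] for t = (E-1)*lag, ..., len(s)-1."""
    start = (E - 1) * lag
    return np.column_stack([s[start - k * lag: len(s) - k * lag] for k in range(E)])

def cross_map(u, v, E=2, lag=1):
    """Estimate u_t from the E+1 nearest neighbours of v_t on the embedding of V.

    Returns (u_true, u_hat), both aligned to the embedded time indexes.
    """
    Mv = delay_embed(np.asarray(v, float), E, lag)
    u_true = np.asarray(u, float)[(E - 1) * lag:]
    d = np.linalg.norm(Mv[:, None, :] - Mv[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                      # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :E + 1]            # time indexes of the neighbours
    dn = np.take_along_axis(d, nn, axis=1)
    w = np.exp(-dn / np.maximum(dn[:, :1], 1e-12))   # assumed exponential weighting
    w /= w.sum(axis=1, keepdims=True)
    u_hat = (w * u_true[nn]).sum(axis=1)             # weighted average at the same time indexes
    return u_true, u_hat

def rho_c(u, v, **kw):
    """MCM index |Corr(u_t, u_hat_t)| for the causation from U to V."""
    u_true, u_hat = cross_map(u, v, **kw)
    return abs(np.corrcoef(u_true, u_hat)[0, 1])

# Illustrative pair in which x drives y (parameter values are assumptions).
n = 600
x, y = np.empty(n), np.empty(n)
x[0], y[0] = 0.2, 0.4
for t in range(1, n):
    x[t] = x[t - 1] * (3.8 - 3.8 * x[t - 1])
    y[t] = y[t - 1] * (3.5 - 3.5 * y[t - 1] - 0.1 * x[t - 1])
rho_x_to_y = rho_c(x, y)   # how well the embedding of Y recovers X
```

The pairwise distance matrix makes this sketch quadratic in the series length; a tree-based neighbor search would be a natural replacement for long series.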

Procedure B: PCM for detecting direct causation from X to Y conditioning on Z:

Translate time series Y with different candidate time delays τ_i (i = 1, 2, …, m) to generate $Y_{τ_{i}} = \{y_{t + τ_{i}}\}$ ;
For each pair Z to $Y_{τ_{i}}$ , perform Procedure A to obtain $Corr (Z, {\hat{Z}}^{Y_{τ_{i}}})$ , and denote ${\hat{Z}}^{Y}$ as ${\hat{Z}}^{Y_{τ_{i_{1}}}}$ , where the time delay $τ_{i_{1}}$ maximizes $Corr (Z, {\hat{Z}}^{Y_{τ_{i}}})$ as in (10);
Translate time series ${\hat{Z}}^{Y}$ with different candidate time delays τ_i(i = 1, 2, …, m) to generate ${\hat{Z}}_{τ_{i}}^{Y}$ ;
For each pair X to ${\hat{Z}}_{τ_{i}}^{Y}$ , perform Procedure A to obtain $Corr (X, {\hat{X}}^{{\hat{Z}}_{τ_{i}}^{Y}})$ , and denote ${\hat{X}}^{{\hat{Z}}^{Y}}$ as ${\hat{X}}^{{\hat{Z}}_{τ_{i_{2}}}^{Y}}$ , where the time delay $τ_{i_{2}}$ maximizes $Corr (X, {\hat{X}}^{{\hat{Z}}_{τ_{i}}^{Y}})$ as in (11);
For each pair X to $Y_{τ_{i}}$ , perform Procedure A to obtain $Corr (X, {\hat{X}}^{Y_{τ_{i}}})$ , and denote ${\hat{X}}^{Y}$ as ${\hat{X}}^{Y_{τ_{i_{3}}}}$ , where the time delay $τ_{i_{3}}$ maximizes $Corr (X, {\hat{X}}^{Y_{τ_{i}}})$ as in (12);
Use $ϱ_{D} = ∣Pcc (X, {\hat{X}}^{Y} ∣ {\hat{X}}^{{\hat{Z}}^{Y}})∣$ to measure the direct causation from X to Y conditioning on Z.
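
Procedure B can be sketched end to end as below. This is a simplified illustration rather than the released implementation: the embedding parameters are fixed, the candidate delays are assumed to be the integers 0 to 3, the neighbor weights are assumed to be exponential in distance, and undefined entries produced by translating a series are tracked with NaN so that all series stay aligned in time. All function names are hypothetical, and the test system uses assumed parameter values.

```python
import numpy as np

def shift(s, tau):
    """Translated series {s_{t+tau}}, aligned to t and padded with NaN at the end."""
    out = np.full(len(s), np.nan)
    out[:len(s) - tau] = s[tau:]
    return out

def cross_map(target, source, E=2, lag=1):
    """Procedure A: estimate target_t from neighbours on the delay embedding of source.

    The output has the input length; entries that cannot be estimated are NaN.
    """
    n = len(source)
    t_idx = np.arange((E - 1) * lag, n)
    M = np.column_stack([source[t_idx - k * lag] for k in range(E)])
    ok = ~np.isnan(M).any(axis=1) & ~np.isnan(target[t_idx])
    M, t_idx = M[ok], t_idx[ok]
    d = np.linalg.norm(M[:, None, :] - M[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :E + 1]
    dn = np.take_along_axis(d, nn, axis=1)
    w = np.exp(-dn / np.maximum(dn[:, :1], 1e-12))   # assumed exponential weights
    w /= w.sum(axis=1, keepdims=True)
    est = np.full(n, np.nan)
    est[t_idx] = (w * target[t_idx][nn]).sum(axis=1)
    return est

def corr(*series):
    """Correlation matrix over the time indexes where every series is defined."""
    stacked = np.vstack(series)
    ok = ~np.isnan(stacked).any(axis=0)
    return np.corrcoef(stacked[:, ok])

def best_map(target, source, delays, **kw):
    """Cross-map target from each translated source; keep the delay maximizing Corr."""
    cands = [cross_map(target, shift(source, tau), **kw) for tau in delays]
    scores = [corr(target, c)[0, 1] for c in cands]
    return cands[int(np.nanargmax(scores))]

def pcm(x, y, z, delays=(0, 1, 2, 3), **kw):
    """Return (rho_C, rho_D) for the link X -> Y conditioned on Z."""
    z_hat_y = best_map(z, y, delays, **kw)           # Z estimated from translated Y
    x_hat_zy = best_map(x, z_hat_y, delays, **kw)    # indirect flow via Z
    x_hat_y = best_map(x, y, delays, **kw)           # all information from X in Y
    r = corr(x, x_hat_y, x_hat_zy)
    r_ab, r_ac, r_bc = r[0, 1], r[0, 2], r[1, 2]
    rho_d = abs((r_ab - r_ac * r_bc) / np.sqrt((1 - r_ac**2) * (1 - r_bc**2)))
    return abs(corr(x, x_hat_y)[0, 1]), rho_d

# Illustrative causal chain X -> Z -> Y (parameter values are assumptions).
n = 500
x, z, y = np.empty(n), np.empty(n), np.empty(n)
x[0], z[0], y[0] = 0.2, 0.3, 0.4
for t in range(1, n):
    x[t] = x[t - 1] * (3.8 - 3.8 * x[t - 1])
    z[t] = z[t - 1] * (3.5 - 3.5 * z[t - 1] - 0.1 * x[t - 1])
    y[t] = y[t - 1] * (3.6 - 3.6 * y[t - 1] - 0.1 * z[t - 1])
rho_c, rho_d = pcm(x, y, z)
```

Here `rho_c` is the ordinary MCM index for X → Y and `rho_d` is the PCM index after conditioning on Z; how far the two differ depends on the system, the series length, and the embedding choices.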

Note that in every MCM procedure above we search for the strongest causation over the candidate time delays. For consistency, all the MCM results throughout this work are based on the same strategy. Moreover, it is possible to characterize the causal relations among variables over a distribution of time delays (i.e., a causal spectrum). This full causal description will be included in our future work.

As described above, the first-order PCM method can be defined as follows for networked systems of more than three interacting variables X, Y, Z¹, …, Z^s (s ≥ 2) (e.g., Fig. 1d), and higher-order methods can be derived from it:

ϱ_{D_{1}} = ∣Pcc (X, {\hat{X}}^{Y} ∣\{{\hat{X}}^{{\hat{Z}}^{i Y}}∣ i = 1, \dots, s\})∣ .

In complex dynamical networks, indirect causation can also be transferred through more than one variable (e.g., through two variables, X → Z¹ → Z² → Y). The high-order PCM method is derived specifically to characterize this situation. In particular, we calculate the correlation coefficient between X and ${\hat{X}}^{Y}$ , and the partial correlation coefficient between them after removing the information about the cross mapping variables obtained via two of the s variables Z¹, …, Z^s. The partial correlation coefficient

ϱ_{D_{2}} = ∣Pcc (X, {\hat{X}}^{Y} ∣\{{\hat{X}}^{{\hat{Z}}^{i^{{\hat{Z}}^{j Y}}}}∣ i \neq j, i, j \in \{1, \dots, s\}\})∣

effectively represents a second-order measure for distinguishing the direct causal link from X to Y from indirect ones that are transferred through two mediating variables. Analogously, the nth-order measure, denoted by $ϱ_{D_{n}}$ , can be defined through any combination of n mediating variables from Z¹, …, Z^s as

ϱ_{D_{n}} = ∣Pcc (X, {\hat{X}}^{Y} ∣ \{{\hat{X}}^{{\hat{Z}}^{{i_{1}^{\dots}}^{{\hat{Z}}^{i_{n} Y}}}} ∣ (i_{1}, \dots, i_{n}) \text{ is an } n\text{-combination from } \{1, \dots, s\}\})∣ .

Together with $ϱ_{C}, ϱ_{D_{n}} (n = 1, \dots, s)$ and the PCM measure

γ = (Π_{n = 1}^{s} ϱ_{D_{n}}) / ϱ_{C}^{s},

reflecting the proximity of all these coefficients, we obtain higher-order PCM methods for detecting direct causal links in large networks. However, for a relatively large order n, the possible number of combinations of n mediate variables is quite large. We will study the computations and applications of the high-order methods in future work, and in this research, we only consider the first-order problem.

In practice, computing the partial correlation becomes problematic when the network is relatively large, because a large conditioning set must then be taken into account. In this case, we could select the several nodes Zⁱ that maximize $ϱ_{C}^{X \to Z^{i}} + ϱ_{C}^{Z^{i} \to Y}$ (or $\min \{ϱ_{C}^{X \to Z^{i}}, ϱ_{C}^{Z^{i} \to Y}\}$ ), since a large value indicates a high probability that an indirect link through Zⁱ exists, and condition only on these nodes. Moreover, if we know a priori that the network is sparse, that is, that indirect connections are rare, we could also condition on Z¹, …, Z^s one at a time and take the minimum value of $ϱ_{D}^{X \to Y ∣ Z^{i}}$ as the final result.
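
Both shortcuts reduce to simple operations on precomputed indices. The sketch below assumes that the MCM indices for X → Zⁱ and Zⁱ → Y, and the single-conditioned PCM indices, have already been computed; the numerical values are hypothetical, as are the function names.

```python
import numpy as np

def select_conditioning(rho_x_to_z, rho_z_to_y, k, rule="sum"):
    """Indices of the k candidate mediators Z^i with the largest indirect-path score."""
    a = np.asarray(rho_x_to_z, float)
    b = np.asarray(rho_z_to_y, float)
    score = a + b if rule == "sum" else np.minimum(a, b)
    return np.argsort(-score)[:k]

def sparse_direct_index(rho_d_given_each_z):
    """Sparse-network shortcut: condition on each Z^i alone and keep the minimum."""
    return float(np.min(rho_d_given_each_z))

# Hypothetical MCM indices for four candidate mediators Z^1, ..., Z^4.
rho_xz = [0.9, 0.2, 0.6, 0.1]
rho_zy = [0.8, 0.9, 0.4, 0.2]
top_by_sum = select_conditioning(rho_xz, rho_zy, k=2)              # mediators 0 and 1
top_by_min = select_conditioning(rho_xz, rho_zy, k=2, rule="min")  # mediators 0 and 2

# Hypothetical PCM indices for X -> Y conditioned on each Z^i separately.
rho_d_final = sparse_direct_index([0.6, 0.05, 0.5, 0.55])
```

The two scoring rules can disagree, as in this example: the sum rewards a mediator with one very strong link, whereas the minimum requires both links of the indirect path to be strong.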

Moreover, the PCM idea can be further developed or varied by replacing the partial correlation with other measures that characterize conditional dependence. For example, the coefficient of determination (denoted r²) is a possible index that is estimated directly from the cross map neighbors and parcels out an effect size for each contributing factor. Another heuristic observation is that, for an indirect causal influence X → Z → Y, cutting off either the link X → Z or the link Z → Y is enough to eliminate the whole indirect information flow, which suggests another variation of the PCM framework. These further variations will be included in our future work.

Supplementary information

Supplementary Information (5.1 MB, PDF)

Peer Review File (214.8 KB, PDF)

Acknowledgements

W.L. is supported by the National Key R&D Program of China (No. 2018YFC0116600), by the National Natural Science Foundation of China (Nos 11925103 and 61773125), and by the STCSM (Nos 18DZ1201000, 19511132000, and 2018SHZDZX01). L.N.C. is supported by the National Key R&D Program of China (No. 2017YFA0505500), by the Strategic Priority Project of CAS (No. XDB38000000), by the Natural Science Foundation of China (Nos 31771476 and 31930022), and by Shanghai Municipal Science and Technology Major Project (No. 2017SHZDZX01). S.Y.L. and K.A. are supported by JSPS KAKENHI (No. JP15H05707) and by AMED (No. JP20dm0307009). Y.-C.L. is supported by ONR (No. N00014-16-1-2828). H.F.M. is supported by the National Key R&D Program of China (No. 2018YFA0801100) and the National Natural Science Foundation of China (No. 11771010). J.K. is supported by the project RF Government Grant 075-15-2019-1885.

Author contributions

W.L. and L.N.C. conceived the idea; S.Y.L., H.F.M., W.L., K.A., and L.N.C. designed the research; S.Y.L., H.F.M., and W.L. performed the research; all authors (S.Y.L., H.F.M., J.K., Y.-C.L., W.L., K.A., and L.N.C.) analyzed the data and wrote the paper.

Data availability

The data sets generated during and/or analyzed during the current study are all available from the corresponding author on reasonable request. The links/references for the public data sets used and analyzed during the current study are all provided in Supplementary Information.

Code availability

The codes as well as their directions for the PCM framework that we developed in this article are publicly available at https://github.com/Partial-Cross-Mapping.

Competing interests

The authors declare no competing interests.

Footnotes

Peer review information Nature Communications thanks Aneta Stefanovska and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Wei Lin, Email: wlin@fudan.edu.cn.

Kazuyuki Aihara, Email: aihara@sat.t.u-tokyo.ac.jp.

Luonan Chen, Email: lnchen@sibs.ac.cn.

Supplementary information

Supplementary information is available for this paper at 10.1038/s41467-020-16238-0.

References

1. Granger CW. Investigating causal relations by econometric models and cross-spectral methods. Econometrica. 1969;37:424–438.
2. Geweke JF. Measurement of linear dependence and feedback between multiple time series. J. Am. Stat. Assoc. 1982;77:304–313.
3. Geweke JF. Measures of conditional linear dependence and feedback between time series. J. Am. Stat. Assoc. 1984;79:907–915.
4. Ding M, Chen Y, Bressler SL. In Handbook of Time Series Analysis, 437–460. Hoboken: Wiley; 2006.
5. Guo S, Ladroue C, Feng J. In Frontiers in Computational and Systems Biology, 83–111. New York: Springer; 2010.
6. Schreiber T. Measuring information transfer. Phys. Rev. Lett. 2000;85:461. doi: 10.1103/PhysRevLett.85.461.
7. Vicente R, Wibral M, Lindner M, Pipa G. Transfer entropy-a model-free measure of effective connectivity for the neurosciences. J. Comput. Neurosci. 2011;30:45–67. doi: 10.1007/s10827-010-0262-3.
8. Cover TM, Thomas JA. Elements of Information Theory. Hoboken: Wiley; 2012.
9. Sun J, Cafaro C, Bollt EM. Identifying the coupling structure in complex systems through the optimal causation entropy principle. Entropy. 2014;16:3416–3433.
10. Cafaro C, Lord WM, Sun J, Bollt EM. Causation entropy from symbolic representations of dynamical systems. Chaos. 2015;25:043106. doi: 10.1063/1.4916902.
11. Sun J, Taylor D, Bollt EM. Causal network inference by optimal causation entropy. SIAM J. Appl. Dyn. Syst. 2015;14:73–106.
12. Duggento A, Stankovski T, McClintock PV, Stefanovska A. Dynamical Bayesian inference of time-evolving interactions: from a pair of coupled oscillators to networks of oscillators. Phys. Rev. E. 2012;86:061126. doi: 10.1103/PhysRevE.86.061126.
13. Stankovski T, Duggento A, McClintock PV, Stefanovska A. A tutorial on time-evolving dynamical Bayesian inference. Eur. Phys. J. Spec. Top. 2014;223:2685–2703.
14. Stankovski T, Ticcinelli V, McClintock PV, Stefanovska A. Coupling functions in networks of oscillators. N. J. Phys. 2015;17:035002.
15. Stankovski T, Pereira T, McClintock PV, Stefanovska A. Coupling functions: universal insights into dynamical interaction mechanisms. Rev. Mod. Phys. 2017;89:045001.
16. Schiff SJ, So P, Chang T, Burke RE, Sauer T. Detecting dynamical interdependence and generalized synchrony through mutual prediction in a neural ensemble. Phys. Rev. E. 1996;54:6708. doi: 10.1103/physreve.54.6708.
17. Sugihara G, et al. Detecting causality in complex ecosystems. Science. 2012;338:496–500. doi: 10.1126/science.1227079.
18. Ma H, Aihara K, Chen L. Detecting causality from nonlinear dynamics with short-term time series. Sci. Rep. 2014;4:7464. doi: 10.1038/srep07464.
19. Jiang J-J, Huang Z-G, Huang L, Liu H, Lai Y-C. Directed dynamical influence is more detectable with noise. Sci. Rep. 2016;6:24088. doi: 10.1038/srep24088.
20. Ma H, et al. Detection of time delays and directional interactions based on time series from complex dynamical systems. Phys. Rev. E. 2017;96:012221. doi: 10.1103/PhysRevE.96.012221.
21. Harnack D, Laminski E, Schünemann M, Pawelzik KR. Topological causality in dynamical systems. Phys. Rev. Lett. 2017;119:098301. doi: 10.1103/PhysRevLett.119.098301.
22. Joskow PL, Rose NL. In Handbook of Industrial Organization, Vol. 2, 1449–1506. Amsterdam: Elsevier; 1989.
23. Kamiński M, Ding M, Truccolo WA, Bressler SL. Evaluating causal relations in neural systems: Granger causality, directed transfer function and statistical assessment of significance. Biol. Cybern. 2001;85:145–157. doi: 10.1007/s004220000235.
24. Banos R, et al. Optimization methods applied to renewable and sustainable energy: a review. Renew. Sust. Energ Rev. 2011;15:1753–1766.
25. Brockmann D, Helbing D. The hidden geometry of complex, network-driven contagion phenomena. Science. 2013;342:1337–1342. doi: 10.1126/science.1245200.
26. Deyle ER, et al. Predicting climate effects on Pacific sardine. Proc. Natl Acad. Sci. USA. 2013;110:6430–6435. doi: 10.1073/pnas.1215506110.
27. Van Nes EH, et al. Causal feedbacks in climate change. Nat. Clim. Change. 2015;5:445.
28. Tsonis AA, et al. Dynamical evidence for causality between galactic cosmic rays and interannual variation in global temperature. Proc. Natl Acad. Sci. USA. 2015;112:3253–3256. doi: 10.1073/pnas.1420291112.
29. Hirata Y, et al. Detecting causality by combined use of multiple methods: climate and brain examples. PLoS ONE. 2016;11:e0158572. doi: 10.1371/journal.pone.0158572.
30. Ma H, Leng S, Aihara K, Lin W, Chen L. Randomly distributed embedding making short-term high-dimensional data predictable. Proc. Natl Acad. Sci. USA. 2018;115:E9994–E10002. doi: 10.1073/pnas.1802987115.
31. Leng S, Xu Z, Ma H. Reconstructing directional causal networks with random forest. Chaos. 2019;29:093130. doi: 10.1063/1.5120778.
32. Guo S, Seth AK, Kendrick KM, Zhou C, Feng J. Partial Granger causality-eliminating exogenous inputs and latent variables. J. Neurosci. Methods. 2008;172:79–93. doi: 10.1016/j.jneumeth.2008.04.011.
33. Frenzel S, Pompe B. Partial mutual information for coupling analysis of multivariate time series. Phys. Rev. Lett. 2007;99:204101. doi: 10.1103/PhysRevLett.99.204101.
34. Zhao J, Zhou Y, Zhang X, Chen L. Part mutual information for quantifying direct associations in networks. Proc. Natl Acad. Sci. USA. 2016;113:5130–5135. doi: 10.1073/pnas.1522586113.
35. Runge J, Heitzig J, Petoukhov V, Kurths J. Escaping the curse of dimensionality in estimating multivariate transfer entropy. Phys. Rev. Lett. 2012;108:258701. doi: 10.1103/PhysRevLett.108.258701.
36. Schelter B, et al. Direct or indirect? Graphical models for neural oscillators. J. Physiol. 2006;99:37–46. doi: 10.1016/j.jphysparis.2005.06.006.
37. Nawrath J, et al. Distinguishing direct from indirect interactions in oscillatory networks with multiple time scales. Phys. Rev. Lett. 2010;104:038701. doi: 10.1103/PhysRevLett.104.038701.
38. Runge J. Causal network reconstruction from time series: from theoretical assumptions to practical estimation. Chaos. 2018;28:075310. doi: 10.1063/1.5025050.
39. Runge J, Petoukhov V, Kurths J. Quantifying the strength and delay of climatic interactions: the ambiguities of cross correlation and a novel measure based on graphical models. J. Clim. 2014;27:720–739.
40. Takens F. In Dynamical Systems and Turbulence, Warwick 1980, 366–381. New York: Springer; 1981.
41. Mañé R. In Dynamical Systems and Turbulence, Warwick 1980, 230–242. New York: Springer; 1981.
42. Kantz H, Schreiber T. Nonlinear Time Series Analysis, Vol. 7. Cambridge: Cambridge Univ. Press; 2004.
43. Sugihara G, May RM. Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature. 1990;344:734. doi: 10.1038/344734a0.
44. Bailey NT. Statistical Methods in Biology. Cambridge: Cambridge Univ. Press; 1995.
45. Noble WS. How does multiple testing correction work? Nat. Biotechnol. 2009;27:1135. doi: 10.1038/nbt1209-1135.
46. Shaffer JP. Multiple hypothesis testing. Annu. Rev. Psychol. 1995;46:561–584.
47. Lancaster G, Iatsenko D, Pidde A, Ticcinelli V, Stefanovska A. Surrogate data for hypothesis testing of physical systems. Phys. Rep. 2018;748:1–60.
48. Clemson PT, Stefanovska A. Discerning non-autonomous dynamics. Phys. Rep. 2014;542:297–368.
49. Stark J. Delay embeddings for forced systems. I. Deterministic forcing. J. Nonlinear Sci. 1999;9:255–332.
50. Stark J, Broomhead DS, Davies M, Huke J. Delay embeddings for forced systems. II. Stochastic forcing. J. Nonlinear Sci. 2003;13:519–577.
51. Milo R, et al. Network motifs: simple building blocks of complex networks. Science. 2002;298:824–827. doi: 10.1126/science.298.5594.824.
52. Alon U. Network motifs: theory and experimental approaches. Nat. Rev. Genet. 2007;8:450. doi: 10.1038/nrg2102.
53. Marbach D, et al. Revealing strengths and weaknesses of methods for gene network inference. Proc. Natl Acad. Sci. USA. 2010;107:6286–6291. doi: 10.1073/pnas.0913357107.
54. Marbach D, Schaffter T, Mattiussi C, Floreano D. Generating realistic in silico gene networks for performance assessment of reverse engineering methods. J. Comput. Biol. 2009;16:229–239. doi: 10.1089/cmb.2008.09TT.
55. Prill RJ, et al. Towards a rigorous assessment of systems biology models: the DREAM3 challenges. PLoS ONE. 2010;5:e9202. doi: 10.1371/journal.pone.0009202.
56. Schaffter T, Marbach D, Floreano D. GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods. Bioinformatics. 2011;27:2263–2270. doi: 10.1093/bioinformatics/btr373.
57. Benincà E, Jöhnk KD, Heerkloss R, Huisman J. Coupled predator–prey oscillations in a chaotic food web. Ecol. Lett. 2009;12:1367–1378. doi: 10.1111/j.1461-0248.2009.01391.x.
58. Benincà E, et al. Chaos in a long-term experiment with a plankton community. Nature. 2008;451:822. doi: 10.1038/nature06512.
59. Neutel A-M, Heesterbeek JA, de Ruiter PC. Stability in real food webs: weak links in long loops. Science. 2002;296:1120–1123. doi: 10.1126/science.1068326.
60. Lee B-J, Kim B, Lee K. Air pollution exposure and cardiovascular disease. Toxicol. Res. (Seoul., Repub. Korea) 2014;30:71. doi: 10.5487/TR.2014.30.2.071.
61. Wong TW, et al. Air pollution and hospital admissions for respiratory and cardiovascular diseases in Hong Kong. Occup. Environ. Med. 1999;56:679–683. doi: 10.1136/oem.56.10.679.
62. Fan J, Zhang W. Statistical estimation in varying coefficient models. Ann. Stat. 1999;27:1491–1518.
63. Milojevic A, et al. Short-term effects of air pollution on a range of cardiovascular events in England and Wales: case-crossover analysis of the MINAP database, hospital admissions and mortality. Heart. 2014;100:1093–1098. doi: 10.1136/heartjnl-2013-304963.
64. Baba K, Shibata R, Sibuya M. Partial correlation and conditional correlation as measures of conditional independence. Aust. N. Z. J. Stat. 2004;46:657–664.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Information^{(5.1MB, pdf)}

Peer Review File^{(214.8KB, pdf)}

Data Availability Statement

The codes as well as their directions for the PCM framework that we developed in this article are publicly available at https://github.com/Partial-Cross-Mapping.

[CR1] 1.Granger CW. Investigating causal relations by econometric models and cross-spectral methods. Econometrica. 1969;37:424–438. [Google Scholar]

[CR2] 2.Geweke JF. Measurement of linear dependence and feedback between multiple time series. J. Am. Stat. Assoc. 1982;77:304–313. [Google Scholar]

[CR3] 3.Geweke JF. Measures of conditional linear dependence and feedback between time series. J. Am. Stat. Assoc. 1984;79:907–915. [Google Scholar]

[CR4] 4.Ding, M., Chen, Y. & Bressler, S. L. In Handbook of Time Series Analysis 437–460 (Wiley, Hoboken, 2006).

[CR5] 5.Guo, S., Ladroue, C. & Feng, J. In Frontiers in Computational and Systems Biology 83–111 (Springer, New York, 2010).

[CR6] 6.Schreiber T. Measuring information transfer. Phys. Rev. Lett. 2000;85:461. doi: 10.1103/PhysRevLett.85.461. [DOI] [PubMed] [Google Scholar]

[CR7] 7.Vicente R, Wibral M, Lindner M, Pipa G. Transfer entropy-a model-free measure of effective connectivity for the neurosciences. J. Comput. Neurosci. 2011;30:45–67. doi: 10.1007/s10827-010-0262-3. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR8] 8.Cover TM, Thomas JA. Elements of Information Theory. Hoboken: Wiley; 2012. [Google Scholar]

[CR9] 9.Sun J, Cafaro C, Bollt EM. Identifying the coupling structure in complex systems through the optimal causation entropy principle. Entropy. 2014;16:3416–3433. [Google Scholar]

[CR10] 10.Cafaro C, Lord WM, Sun J, Bollt EM. Causation entropy from symbolic representations of dynamical systems. Chaos. 2015;25:043106. doi: 10.1063/1.4916902. [DOI] [PubMed] [Google Scholar]

[CR11] 11.Sun J, Taylor D, Bollt EM. Causal network inference by optimal causation entropy. SIAM J. Appl. Dyn. Syst. 2015;14:73–106. [Google Scholar]

[CR12] 12.Duggento A, Stankovski T, McClintock PV, Stefanovska A. Dynamical bayesian inference of time-evolving interactions: from a pair of coupled oscillators to networks of oscillators. Phys. Rev. E. 2012;86:061126. doi: 10.1103/PhysRevE.86.061126. [DOI] [PubMed] [Google Scholar]

[CR13] 13.Stankovski T, Duggento A, McClintock PV, Stefanovska A. A tutorial on time-evolving dynamical bayesian inference. Eur. Phys. J. Spec. Top. 2014;223:2685–2703. [Google Scholar]

[CR14] 14.Stankovski T, Ticcinelli V, McClintock PV, Stefanovska A. Coupling functions in networks of oscillators. N. J. Phys. 2015;17:035002. [Google Scholar]

[CR15] 15.Stankovski T, Pereira T, McClintock PV, Stefanovska A. Coupling functions: universal insights into dynamical interaction mechanisms. Rev. Mod. Phys. 2017;89:045001. [Google Scholar]

[CR16] 16.Schiff SJ, So P, Chang T, Burke RE, Sauer T. Detecting dynamical interdependence and generalized synchrony through mutual prediction in a neural ensemble. Phys. Rev. E. 1996;54:6708. doi: 10.1103/physreve.54.6708. [DOI] [PubMed] [Google Scholar]

[CR17] 17.Sugihara G, et al. Detecting causality in complex ecosystems. Science. 2012;338:496–500. doi: 10.1126/science.1227079. [DOI] [PubMed] [Google Scholar]

[CR18] 18.Ma H, Aihara K, Chen L. Detecting causality from nonlinear dynamics with short-term time series. Sci. Rep. 2014;4:7464. doi: 10.1038/srep07464. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR19] 19.Jiang J-J, Huang Z-G, Huang L, Liu H, Lai Y-C. Directed dynamical influence is more detectable with noise. Sci. Rep. 2016;6:24088. doi: 10.1038/srep24088. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR20] 20.Ma H, et al. Detection of time delays and directional interactions based on time series from complex dynamical systems. Phys. Rev. E. 2017;96:012221. doi: 10.1103/PhysRevE.96.012221. [DOI] [PubMed] [Google Scholar]

[CR21] 21.Harnack D, Laminski E, Schünemann M, Pawelzik KR. Topological causality in dynamical systems. Phys. Rev. Lett. 2017;119:098301. doi: 10.1103/PhysRevLett.119.098301. [DOI] [PubMed] [Google Scholar]

[CR22] 22.Joskow, P. L. & Rose, N. L. In Handbook of Industrial Organization, Vol. 2, 1449–1506 (Elsevier, Amsterdam, 1989).

[CR23] 23.Kamiński M, Ding M, Truccolo WA, Bressler SL. Evaluating causal relations in neural systems: Granger causality, directed transfer function and statistical assessment of significance. Biol. Cybern. 2001;85:145–157. doi: 10.1007/s004220000235. [DOI] [PubMed] [Google Scholar]

[CR24] 24.Banos R, et al. Optimization methods applied to renewable and sustainable energy: a review. Renew. Sust. Energ Rev. 2011;15:1753–1766. [Google Scholar]

[CR25] 25.Brockmann D, Helbing D. The hidden geometry of complex, network-driven contagion phenomena. Science. 2013;342:1337–1342. doi: 10.1126/science.1245200. [DOI] [PubMed] [Google Scholar]

[CR26] 26.Deyle ER, et al. Predicting climate effects on pacific sardine. Proc. Natl Acad. Sci. USA. 2013;110:6430–6435. doi: 10.1073/pnas.1215506110. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR27] 27.Van Nes EH, et al. Causal feedbacks in climate change. Nat. Clim. Change. 2015;5:445. [Google Scholar]

[CR28] 28.Tsonis AA, et al. Dynamical evidence for causality between galactic cosmic rays and interannual variation in global temperature. Proc. Natl Acad. Sci. USA. 2015;112:3253–3256. doi: 10.1073/pnas.1420291112. [DOI] [PMC free article] [PubMed] [Google Scholar]

29. Hirata Y, et al. Detecting causality by combined use of multiple methods: climate and brain examples. PLoS ONE. 2016;11:e0158572. doi: 10.1371/journal.pone.0158572.

30. Ma H, Leng S, Aihara K, Lin W, Chen L. Randomly distributed embedding making short-term high-dimensional data predictable. Proc. Natl Acad. Sci. USA. 2018;115:E9994–E10002. doi: 10.1073/pnas.1802987115.

31. Leng S, Xu Z, Ma H. Reconstructing directional causal networks with random forest. Chaos. 2019;29:093130. doi: 10.1063/1.5120778.

32. Guo S, Seth AK, Kendrick KM, Zhou C, Feng J. Partial Granger causality: eliminating exogenous inputs and latent variables. J. Neurosci. Methods. 2008;172:79–93. doi: 10.1016/j.jneumeth.2008.04.011.

33. Frenzel S, Pompe B. Partial mutual information for coupling analysis of multivariate time series. Phys. Rev. Lett. 2007;99:204101. doi: 10.1103/PhysRevLett.99.204101.

34. Zhao J, Zhou Y, Zhang X, Chen L. Part mutual information for quantifying direct associations in networks. Proc. Natl Acad. Sci. USA. 2016;113:5130–5135. doi: 10.1073/pnas.1522586113.

35. Runge J, Heitzig J, Petoukhov V, Kurths J. Escaping the curse of dimensionality in estimating multivariate transfer entropy. Phys. Rev. Lett. 2012;108:258701. doi: 10.1103/PhysRevLett.108.258701.

36. Schelter B, et al. Direct or indirect? Graphical models for neural oscillators. J. Physiol. 2006;99:37–46. doi: 10.1016/j.jphysparis.2005.06.006.

37. Nawrath J, et al. Distinguishing direct from indirect interactions in oscillatory networks with multiple time scales. Phys. Rev. Lett. 2010;104:038701. doi: 10.1103/PhysRevLett.104.038701.

38. Runge J. Causal network reconstruction from time series: from theoretical assumptions to practical estimation. Chaos. 2018;28:075310. doi: 10.1063/1.5025050.

39. Runge J, Petoukhov V, Kurths J. Quantifying the strength and delay of climatic interactions: the ambiguities of cross correlation and a novel measure based on graphical models. J. Clim. 2014;27:720–739.

40. Takens, F. In Dynamical Systems and Turbulence, Warwick 1980, 366–381 (Springer, New York, 1981).

41. Mañé, R. In Dynamical Systems and Turbulence, Warwick 1980, 230–242 (Springer, New York, 1981).

42. Kantz, H. & Schreiber, T. Nonlinear Time Series Analysis, Vol. 7 (Cambridge Univ. Press, Cambridge, 2004).

43. Sugihara G, May RM. Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature. 1990;344:734. doi: 10.1038/344734a0.

44. Bailey NT. Statistical Methods in Biology. Cambridge: Cambridge Univ. Press; 1995.

45. Noble WS. How does multiple testing correction work? Nat. Biotechnol. 2009;27:1135. doi: 10.1038/nbt1209-1135.

46. Shaffer JP. Multiple hypothesis testing. Annu. Rev. Psychol. 1995;46:561–584.

47. Lancaster G, Iatsenko D, Pidde A, Ticcinelli V, Stefanovska A. Surrogate data for hypothesis testing of physical systems. Phys. Rep. 2018;748:1–60.

48. Clemson PT, Stefanovska A. Discerning non-autonomous dynamics. Phys. Rep. 2014;542:297–368.

49. Stark J. Delay embeddings for forced systems. I. Deterministic forcing. J. Nonlinear Sci. 1999;9:255–332.

50. Stark J, Broomhead DS, Davies M, Huke J. Delay embeddings for forced systems. II. Stochastic forcing. J. Nonlinear Sci. 2003;13:519–577.

51. Milo R, et al. Network motifs: simple building blocks of complex networks. Science. 2002;298:824–827. doi: 10.1126/science.298.5594.824.

52. Alon U. Network motifs: theory and experimental approaches. Nat. Rev. Genet. 2007;8:450. doi: 10.1038/nrg2102.

53. Marbach D, et al. Revealing strengths and weaknesses of methods for gene network inference. Proc. Natl Acad. Sci. USA. 2010;107:6286–6291. doi: 10.1073/pnas.0913357107.

54. Marbach D, Schaffter T, Mattiussi C, Floreano D. Generating realistic in silico gene networks for performance assessment of reverse engineering methods. J. Comput. Biol. 2009;16:229–239. doi: 10.1089/cmb.2008.09TT.

55. Prill RJ, et al. Towards a rigorous assessment of systems biology models: the DREAM3 challenges. PLoS ONE. 2010;5:e9202. doi: 10.1371/journal.pone.0009202.

56. Schaffter T, Marbach D, Floreano D. GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods. Bioinformatics. 2011;27:2263–2270. doi: 10.1093/bioinformatics/btr373.

57. Benincà E, Jöhnk KD, Heerkloss R, Huisman J. Coupled predator–prey oscillations in a chaotic food web. Ecol. Lett. 2009;12:1367–1378. doi: 10.1111/j.1461-0248.2009.01391.x.

58. Benincà E, et al. Chaos in a long-term experiment with a plankton community. Nature. 2008;451:822. doi: 10.1038/nature06512.

59. Neutel A-M, Heesterbeek JA, de Ruiter PC. Stability in real food webs: weak links in long loops. Science. 2002;296:1120–1123. doi: 10.1126/science.1068326.

60. Lee B-J, Kim B, Lee K. Air pollution exposure and cardiovascular disease. Toxicol. Res. (Seoul., Repub. Korea) 2014;30:71. doi: 10.5487/TR.2014.30.2.071.

61. Wong TW, et al. Air pollution and hospital admissions for respiratory and cardiovascular diseases in Hong Kong. Occup. Environ. Med. 1999;56:679–683. doi: 10.1136/oem.56.10.679.

62. Fan J, Zhang W. Statistical estimation in varying coefficient models. Ann. Stat. 1999;27:1491–1518.

63. Milojevic A, et al. Short-term effects of air pollution on a range of cardiovascular events in England and Wales: case-crossover analysis of the MINAP database, hospital admissions and mortality. Heart. 2014;100:1093–1098. doi: 10.1136/heartjnl-2013-304963.

64. Baba K, Shibata R, Sibuya M. Partial correlation and conditional correlation as measures of conditional independence. Aust. N. Z. J. Stat. 2004;46:657–664.
