Significance
Effective use of ecosystem monitoring data to resolve global environmental issues is a major challenge of 21st-century ecology. A promising solution to address this challenge is a time series–based causal analysis that can provide insight on the mechanistic links between ecosystem components. In this work, a model-free framework named EcohNet is proposed. EcohNet utilizes ensemble predictions of echo state networks, which are known to be fast, accurate, and highly relevant for a variety of dynamical systems, and can robustly predict causal networks of ecosystem components. It can also provide an optimized forecasting of overall ecosystem components and could be used to analyze complex and hybrid multivariate time series in many scientific areas, not limited to ecosystems.
Keywords: ecosystem monitoring, lake ecosystem, causal network, echo state network, Granger causality
Abstract
Ecosystems are complex systems of various physical, biological, and chemical processes. Since ecosystem dynamics are composed of a mixture of different levels of stochasticity and nonlinearity, handling these data is a challenge for existing methods of time series–based causal inference. Here, we show that, by harnessing contemporary machine learning approaches, the concept of Granger causality can be effectively extended to the analysis of complex ecosystem time series and bridge the gap between dynamical and statistical approaches. The central idea is to use an ensemble of fast and highly predictive artificial neural networks to select a minimal set of variables that maximizes the prediction of a given variable. It enables decomposition of the relationship among variables through quantifying the contribution of an individual variable to the overall predictive performance. We show how our approach, EcohNet, can improve interaction network inference for a mesocosm experiment and simulated ecosystems. The application of the method to a long-term lake monitoring dataset yielded interpretable results on the drivers causing cyanobacteria blooms, which are a serious threat to ecological integrity and ecosystem services. Since performance of EcohNet is enhanced by its predictive capabilities, it also provides an optimized forecasting of overall components in ecosystems. EcohNet could be used to analyze complex and hybrid multivariate time series in many scientific areas not limited to ecosystems.
Various systems in the world, from cells and organisms to ecosystems and our own societies, are complex and driven by many interacting components. An attempt to infer relationships among the components as a causal network is an important step in understanding their mechanistic basis. Great efforts have been made to infer such networks from time series (1, 2). In particular, methods that can overcome the limitations of classical methods (3, 4) have been proposed and are becoming widely used. These include convergent cross mapping (CCM) (5), which is suitable for systems under the influence of dynamic processes, and a combination of the Peter–Clark algorithm and the Momentary Conditional Independence test (PCMCI) (6), which is suitable for stochastic systems.
However, there is a problem when attempting to apply the existing time series–based causal analyses to ecosystems. Ecosystems are complex systems of various physical, biological, and chemical processes (7). Meteorological variables such as temperature and precipitation are under the influence of atmospheric and oceanic fluid dynamics (8). Recent studies showed that these Earth system dynamics are well captured as stochastic processes, and approaches based on statistical causal analysis are effective in elucidating their relationships (2, 6, 9). In contrast, the density and abundance of organisms often show strong nonlinear dynamics, driven by interactions between organisms. This motivates the application and success of approaches based on nonlinear dynamical systems (5, 10–12). Some chemicals control ecological dynamics as essential resources and are produced by organismal activities; as well, they are under the control of global geochemical cycles (13). These processes influence each other, but it is through rather weak causal couplings, and the dynamics of any one process may not become dominant (5, 14–17). Methods for revealing causal relationships among ecosystem components from time series data need to be robust to dynamical complexity, that is, the different levels of stochasticity and nonlinearity. However, the assumptions of most methods do not satisfy this requirement (1, 18).
In this paper, we introduce a method called EcohNet. It is based on the ensemble prediction of neural networks (19–21) that can seamlessly handle stochastic/deterministic and linear/nonlinear dynamics (22). Therefore, it is expected to be robust to dynamical complexity. The ensemble prediction is used to decompose relationships among variables in terms of predictability (23). Here, the contribution of one variable to predictive performance can be evaluated apart from that of the other variables, as in the concepts of partial correlation and conditional independence. It is expected that, even if some variables are driven by a strong driver, weak relationships among variables can be detected separately from the effect of the driver without special treatments (24–27).
Although various methods have been proposed and used for descriptive purposes, the relevance of their application to actual ecosystem monitoring data has not been seriously examined. How well can interactions between ecosystem components be captured as causal relationships in EcohNet? What advantages does it have over conventional methods used for causal analysis? To answer these questions, we performed benchmarking with data from long-term observation of an aquatic mesocosm, as well as a simulated dataset to test robustness to different dynamical complexities (equilibrium, equilibrium forced by an external oscillator, and intrinsic oscillatory dynamics, as well as different magnitudes of noise) under two interaction types (food web and random interaction) and three observational conditions (different data size, sampling interval, and the presence of unobserved species). In addition to the performance criteria for network inference, we focused on susceptibility to the often-problematic interaction topologies such as chain relationships (e.g., in a three-species food chain X←Y←Z, a causal relationship may be identified between X and Z) and fan-out relationships (e.g., in a one-predator-two-prey-relationship X←Z→Y, a causal relationship may be identified between X and Y) (5, 24, 28). We then applied our method to a long-term lake monitoring dataset that includes heterogeneous components such as meteorological, chemical, and biological variables, and interpreted the results. Here, we mainly focused on how EcohNet provides insight on the drivers of cyanobacteria blooms, which are often a serious threat to ecological integrity and ecosystem services (29), but also we show how well the detection of unrealistic causal relationships such as those from biological to meteorological variables was avoided.
EcohNet
EcohNet combines a type of recurrent neural network (RNN), called an echo state network (ESN) (19, 20, 30), with a progressive selection of variables (31) (Fig. 1; see Materials and Methods for details). Initially, a target variable is selected (Fig. 1A). Then, ESNs are generated, and the ensemble predictive performance (prediction skill) of one time step ahead is evaluated when the past state of the target itself is given as an input of ESNs (Fig. 1 B and C). As a result, we obtain a distribution of prediction skills (illustrated by a gray mountain shape in Fig. 1B). In the next step, we try each variable other than the target variable as a second input, one by one, and adopt the one that improves the prediction skill the most. Here, ESNs are generated for each pair of variables. If the prediction skill does not improve, no further new variables are adopted. This process is repeated for variables that have not yet been selected, as long as new variables are adopted. As a result of the above process, a set of variables that maximizes the prediction skill for the target variable is obtained. Then, is used to evaluate the unique prediction skill, which represents the unique contribution of each variable in to the overall prediction skill (Fig. 1D). If and the target variable is , for each variable (here, and ), we evaluate how much the prediction changes when one variable is excluded from it (e.g., ). For example, when denoting the prediction skill for and as and , respectively, the unique contribution of Y on X is . In this stage, a variable is removed from if . By performing the above procedure with all variables as targets in turn, a prediction skill–based causal network is obtained for the entire system.
Fig. 1.
Graphical explanation of EcohNet. EcohNet is illustrated in four steps: target variable selection (A), progressive selection of predictive variables (B), evaluation of prediction skill (C), and evaluation of unique prediction skill (D). See main text and Materials and Methods for the explanation.
To show how EcohNet works, we explain what would be for X in three representative relationships of the three variables X, Y, and Z. First, we assume a transitive relationship (Z→Y→X and Z→X; SI Appendix, Fig. S1A). Here, in addition to , both and are nonzero, corresponding to the direct relationship Y→X and Z→X. In a chain relationship (Z→Y→X; SI Appendix, Fig. S1B), the direct relationship Z→X is removed. In this case, the contribution of Z on X is always mediated by Y. Thus, if all else is equal, only and are expected to have nonzero values. In a fan-out relationship (X←Z→Y; SI Appendix, Fig. S1C), the direct relationship Y→X is removed. In this case, both X and Y have a direct relationship with Z but remain disconnected from each other. X and Y share the dynamic imprint of Z, which may contribute to predicting X from Y, but this contribution is already contained in the direct influence of Z on X. Thus, if all else is equal, only and are expected to have nonzero values. The importance of evaluating the unique prediction skill is highlighted by another representative example (SI Appendix, Fig. S2). In this case, both Z1 and Z2 have a fan-out relationship with X and Y (X←Z1→Y, and X←Z2→Y). As explained in Runge (1), a simple forward stepwise algorithm would select Y first, followed by Z1 and Z2, because Y can have a larger individual contribution than either Z1 or Z2. However, evaluation of the unique prediction skill can remove Y from , since the contribution of Y is included in the union of the contribution of Z1 and Z2, and thus .
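The leave-one-out idea behind the unique prediction skill can be illustrated with a minimal sketch. It assumes a hypothetical `skill_fn(target, inputs)` that returns an ensemble of prediction skills, and it summarizes each ensemble by its median, which is a choice made here for illustration only. The toy skill table mimics a chain Z→Y→X and is invented for the example; it is not data from the study.

```python
from statistics import median
from typing import Callable, Dict, FrozenSet, Iterable, List

# Hypothetical interface: ensemble of prediction skills for `target`
# when the variables in `inputs` are used as predictors.
SkillFn = Callable[[str, FrozenSet[str]], List[float]]


def unique_skills(target: str, selected: Iterable[str], skill_fn: SkillFn) -> Dict[str, float]:
    """Leave-one-out contribution of each selected variable to predicting `target`."""
    full = frozenset(selected) | {target}
    full_skill = median(skill_fn(target, full))
    return {
        v: full_skill - median(skill_fn(target, full - {v}))
        for v in sorted(full - {target})
    }


# Invented toy skills for a chain Z -> Y -> X: Z adds nothing once Y is known.
TOY = {
    frozenset({"X"}): [0.50, 0.52, 0.51],
    frozenset({"X", "Y"}): [0.80, 0.82, 0.81],
    frozenset({"X", "Z"}): [0.65, 0.66, 0.64],
    frozenset({"X", "Y", "Z"}): [0.80, 0.82, 0.81],
}


def toy_skill(target: str, inputs: FrozenSet[str]) -> List[float]:
    return TOY[frozenset(inputs)]


contrib = unique_skills("X", ["Y", "Z"], toy_skill)
print(contrib)  # Y has a positive unique contribution, Z has none
```

In this toy chain, Z improves the prediction of X when used alone with X, yet its leave-one-out contribution vanishes once Y is included, which is the behavior the chain example above describes.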
Conventional Approaches
We selected a correlation-based method (Spearman rank correlation), two equation-free methods from nonlinear time series analysis [CCM (5) and partial cross-mapping (PCM) (28)], and an equation-based method, learning interactions from microbial time series (LIMITS) (32), as the conventional approaches for comparison with EcohNet (SI Appendix, section A includes the details of the implementations), and compared them using the evaluation criteria of a binary classification task (SI Appendix, Fig. S3). CCM and PCM are intended to detect causality in dynamical systems and are therefore comparable to EcohNet. Since there are several different implementations of CCM, we tested three representative approaches in a preliminary analysis (SI Appendix, section A2). PCM is an extension of CCM that accounts for direct causality between two variables conditional on indirect causation through a third variable. Thus, PCM implements an idea similar to the unique prediction skill of EcohNet. LIMITS infers the strength of ecological interactions directly rather than the causal relationships among variables. In the benchmark using simulation data, the results of LIMITS should be interpreted carefully because LIMITS has the advantage that its basic process is consistent with the data; that is, it assumes the Lotka–Volterra (LV) equation used to generate the data. In other words, it contains more prior knowledge than the other methods. Applying it to relationships among all ecosystem components, including chemical and meteorological variables, exceeds its scope of application. Although it is frequently pointed out that correlation does not signal the presence of interaction, we considered Spearman rank correlation as a baseline.
Results
Benchmarking.
When applied to the data from a long-term mesocosm experiment (33), EcohNet identified 11 out of 13 interacting pairs of components, with three false positives, and was superior to conventional methods (Fig. 2 A–E). It outperformed other methods in all evaluation criteria (Fig. 2F). The results were not sensitive to different parameter values that might affect the performance of ESNs, except when the forgetting factor was very close to one (SI Appendix, section B). A benchmark with datasets generated by food web models (SI Appendix, Fig. S4) also supported the superiority of our method. The area under the receiver operating characteristic curve (ROC-AUC) and F1 score of EcohNet exceeded those of CCM, PCM, and Spearman rank correlation, except for ROC-AUC of CCM in in which the signal of interaction was considered to be strong (Fig. 2 G and H). Performance of CCM was largely reduced in and . PCM performed better than CCM in , but it was reduced in . CCM was better than PCM in , potentially because of the use of the seasonal surrogate method. Performance of Spearman rank correlation was comparable to other methods only in . LIMITS outperformed EcohNet in , especially in ROC-AUC. Its high applicability to cases of intrinsic oscillations has been reported in a previous study (34). It is also worth noting that EcohNet’s performance was comparable to LIMITS in most cases, despite not explicitly assuming underlying LV processes.
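As a worked illustration of the binary-classification criteria, the counts quoted above for the mesocosm data (11 of 13 true pairs recovered, three false positives) can be turned into precision, recall, and an F1 score. This is only arithmetic on those counts under the usual textbook definitions; the function name is hypothetical, and the values shown in Fig. 2 may have been computed differently.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Standard precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# 11 true positives, 3 false positives, and 13 - 11 = 2 missed pairs.
p, r, f1 = precision_recall_f1(tp=11, fp=3, fn=2)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Specificity and accuracy additionally require the number of true negatives, which depends on the total number of candidate pairs and is not stated in this passage.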
Fig. 2.
Benchmarking results. (A–E) Actual (blue) and inferred (red) network of EcohNet and conventional approaches (CCM, PCM, Spearman rank correlation, and LIMITS) for mesocosm data (DIN and SRP stand for total dissolved inorganic nitrogen and soluble reactive phosphorus, respectively). The numbers of true positives (TP) and false positives (FP) are shown above the panel. (F) The value of evaluation criteria for mesocosm data. (G and H) Two performance criteria (ROC-AUC and F1 score) for the food web models. Here, O is intrinsic oscillation, E is equilibrium, and EO is equilibrium forced by an external oscillator, and the suffixes S and L indicate small and large noise magnitude, respectively (SI Appendix, section D2 includes simulation parameter values). In the box plot, white lines indicate the median, box edges indicate the first and third quartile values, and whiskers indicate maximum and minimum values. Black circles indicate the median value of EcohNet. In G and H, the result of LIMITS should be carefully interpreted, because it assumes an LV equation used to generate the benchmark data. (I) Cumulative number of incorrectly identified links in chain relationships (for 6 × 100 = 600 time series in total). (J) Cumulative number of incorrectly identified links in fan-out relationships (for 6 × 100 = 600 time series in total).
Although accuracy of EcohNet, CCM, and PCM was consistent among , , and except for PCM for , there were large variations in sensitivity and specificity for CCM and PCM (Fig. 3 A–C). The number of detected links varied depending on the simulation conditions (Fig. 3D). This suggests that the fundamental sensitivity of the methods used to detect interactions is affected by dynamic complexity. In comparison, performance of EcohNet was relatively stable, and its sensitivity, specificity, and the number of links detected were not greatly affected by the simulation conditions. Our benchmarking also showed that false positives due to chain and fan-out relationships were best suppressed in EcohNet (Fig. 2 I and J).
Fig. 3.
Values of accuracy (A), sensitivity (B), specificity (C), and number of inferred links (D) for the food web models. Since connectance was fixed at 0.33 when generating an interaction matrix, the number of links detected should ideally be a constant value (since the number of species was fixed at eight, the average number of links was (8² − 8) × 0.33 ≈ 18). In the box plot, white lines indicate the median, box edges indicate the first and third quartile values, and whiskers indicate maximum and minimum values. Black circles indicate the median value of EcohNet.
The above conclusions did not change substantially across the dataset sizes (SI Appendix, Figs. S5 and S6), sampling intervals (SI Appendix, Figs. S7–S10), and presence of unobserved species (SI Appendix, Fig. S11) that we tested. While these results are based on food web models where the interactions are bidirectional, the superiority of EcohNet was also shown in a random interaction model where the interactions were unidirectional and the detectability of directional interactions was evaluated (SI Appendix, Fig. S12; SI Appendix, section C includes the evaluation of directional interactions in the food web models).
Phytoplankton Dynamics in a Real Ecosystem.
We applied EcohNet to a time series (Lake Kasumigaura Long-term Monitoring Dataset) to examine the top-down and bottom-up causal factors of phytoplankton community composition (Fig. 4A). The causal network was naturally organized in a top-down structure with temperature at the top, and temperature had the largest number of interaction links across multiple trophic levels. EcohNet showed that each of the seven dominant phytoplankton groups was determined by different factors, and those networks were complex. Four out of seven phytoplankton groups (all three cyanobacteria and Thalassiosiraceae) were forced by NO3-N. We also detected top-down control by rotifers and copepods (calanoids and cyclopoids) on not only diatoms but also cyanobacteria (Rotifers→Nitzschia, Rotifers→Oscillatoriales, Calanoida→Microcystis, Calanoida→Thalassiosiraceae, Cyclopoida→Aulacoseira), while large and small cladocerans did not influence dominant phytoplankton groups. In the opposite direction, three bottom-up links from phytoplankton to zooplankton (Nitzschia→Rotifers, Oscillatoriales→Cyclopoida, Fragilaria→Small cladocera) were also detected. Importantly, in addition to the effects of environmental variables and zooplankton, we identified three interactions among dominant phytoplankton groups (Nostocales→Oscillatoriales, Oscillatoriales→Aulacoseira, and Fragilaria→Microcystis).
Fig. 4.
Analysis of Lake Kasumigaura Long-term Monitoring Dataset. (A) Causal network obtained by EcohNet. We removed causal links with a prediction skill of less than 0.002 (23 out of 70 links were removed). The magnitude of the prediction skill is indicated by both the thickness and color of arrows. The seven dominant phytoplankton groups are underlined. (B) We identified causal relationships including meteorological variables (wind speed, precipitation, and temperature) and examined how well unrealistic causal links (i.e., organisms/chemicals to meteorological variables) were avoided while the effects of meteorological variables were detected. (C) Ratio of causal links larger than the threshold level (x axis) is shown for all links (solid lines) and unrealistic links (dashed lines). Inset shows the ratio of the unrealistic links to all links. The faster the unrealistic/all ratio converges to zero, the smaller the unique prediction skill of unrealistic causal links, and the easier it is to remove unrealistic links while retaining more reliable links.
An additional analysis of the direction of causation between meteorological and other components showed that the total number of unrealistic causalities was lowest in PCM, followed by EcohNet and CCM in that order (Fig. 4B). For the causal links with the meteorological variables as factors, EcohNet detected the same number of links as PCM for temperature and precipitation and found more links for wind speed. Moreover, unrealistic causal links were likely to be detected as strong links in CCM and PCM, but appeared only as weak links in EcohNet (Fig. 4C).
Discussion
We developed and evaluated the effectiveness of a method, EcohNet, that utilizes ensemble predictions of online ESNs. In short, our method encompasses the scope of Granger causality (3) and nonlinear time series analysis (35), and implements a framework for decomposing relationships among variables in terms of predictability (23). Although ESNs have been applied to causal analysis (36–38), we developed a reliable method of causal network inference by integrating adaptive online ESNs (20) into an ensemble machine learning framework. In addition to its applicability to nonlinear dynamics, our approach is not affected by the reliability of nonlinear prediction methods for the dynamics of a given system (39), which differs from previous approaches (5, 26, 28, 34, 40, 41). As we have confirmed (Fig. 2 G and H), performance of CCM and PCM, which rely on a specific nonlinear forecasting method (state space reconstruction and simplex projection), was sensitive to the differences in dynamic complexity (42, 43), although they performed better than the Spearman rank correlation coefficient. Another advantage of our method is its capability to decompose relationships among variables; we confirmed that false positives due to chain and fan-out relationships were best suppressed in EcohNet, followed by PCM (Fig. 2 I and J). This result is plausible because PCM also addresses both relationships (28).
EcohNet has two prominent features. First, it was robust in its ability to detect links and has relatively stable sensitivity and specificity (Fig. 3). This robustness is important because it means that the error rate of an inferred network is consistently controlled by the regularization scheme irrespective of the dynamics. For example, comparison of the connectivity of ecological networks from several sites would be difficult if the method applied was sensitive to dynamical complexity (16). Second, we confirmed that EcohNet can identify causation between meteorological and other components while filtering out the unrealistic direction of causalities (Fig. 4 B and C) (18). The unrealistic causal links were likely to be detected only as weak links in EcohNet. Because of observational noise and other factors, it is hard to completely avoid detecting unrealistic causalities. Therefore, it is a desirable feature that detection of unrealistic causal relationships can be avoided depending on the threshold setting. Among the rest of the methods, PCM had stable sensitivity and specificity and suppressed the detection of unrealistic links. On the other hand, its overall performance (Fig. 2 G and H) was no better than CCM. One reason would be the absence of the evaluation step of the convergence of a prediction skill in PCM, whereas, in CCM, this step helps to discount the predictability of the target variable itself. In EcohNet, it is considered as .
EcohNet revealed that temperature determined nutrient dynamics, phytoplankton, and zooplankton communities in Lake Kasumigaura. This result is similar to that of Tanentzap et al. (44), who demonstrated strong direct and indirect impacts of temperature on phytoplankton and zooplankton. Quantifying the causal effects of temperature is an important advantage of EcohNet. CCM failed to distinguish causal relationships from seasonality-driven synchronization, leading to misidentification of causality (45). Likewise, previous work on Lake Kasumigaura did not detect the causal effects of temperature on phytoplankton and zooplankton (25). Our results suggest that temperature may determine the whole food web structure of Lake Kasumigaura.
Each of the seven dominant phytoplankton groups was determined by different factors (Fig. 4). Some studies reported that phytoplankton species or genera have unique physiological and ecological features and thus respond differently to environmental factors and grazing (46, 47). These differences can result in dynamic changes in phytoplankton community composition; indeed, the dominant phytoplankton group has been reported to change over time in Lake Kasumigaura (48–50). NO3-N determined four out of seven phytoplankton groups. This is consistent with previous work on Lake Kasumigaura showing that nitrogen limits phytoplankton primary production (25). Not only diatoms but also cyanobacteria were influenced by rotifers, calanoids, and cyclopoids. Although our analysis did not identify whether these are direct predation or indirect effects, it is known that some copepods can ingest and shorten the filament size of cyanobacteria, and rotifers can graze or utilize decomposed cyanobacteria (51, 52). Our results suggest that these zooplankton groups might have an important role in the food web of shallow hypereutrophic lakes.
We identified significant interactions among dominant phytoplankton groups. It has been reported that phytoplankton groups may have compensatory responses to environmental factors (53). One possible mechanism for the interaction between Nostocales and Oscillatoriales could be nitrogen availability, because N2-fixing cyanobacteria, including Nostocales, compete with non-N2-fixing bacteria, including Oscillatoriales (54). Another mechanism could be light availability. Mixing regimes determined by climatic factors like heat exchange and wind action affect competition for light between phytoplankton species (e.g., buoyant cyanobacteria and sinking diatoms), and shading by cyanobacteria blooms also influences other phytoplankton (55). As pointed out in Freeman et al. (47), previous studies have not incorporated these interactions in models for identifying species-specific factors. Explicitly including not only environmental factors and grazing impacts but also community interactions in complex systems is another advantage of EcohNet.
As well as predicting the flow of causation, EcohNet enables cyanobacteria bloom forecasts for a lake (SI Appendix, Figs. S13 and S14). The application of EcohNet to long-term monitoring data can be a useful tool for lake management. Harmful cyanobacteria blooms are a serious threat to ecological integrity and ecosystem services, despite management efforts, and climate change is predicted to promote the occurrence and severity of cyanobacterial blooms (56). As shown here, the system is complex, and, therefore, identifying the drivers causing cyanobacteria blooms and forecasting these blooms using EcohNet are critically important for water quality management (29).
There are two developmental directions that we did not fully address in this paper. First, we did not consider the possibility of synergistic effects of two or more variables that may contribute more to the prediction than the simple addition of their effects. We recognize the importance of such nonadditivity (57) and consider that a further extension of our methodology, for example, evaluating the combinatorial effect of removing multiple variables, could address this issue. Second, we did not consider data-dependent parameter fitting of ESN recursive least squares (ESN-RLS). Optimal parameter settings might further improve the performance of EcohNet's network inference. Guiding principles for such a setup would be to use autocorrelation to determine the forgetting factor (58), and to determine parameters to maximize the predictability of the time series (59). However, we have demonstrated the benefits of EcohNet without these further optimizations.
The development of frameworks to facilitate time series–based causal analysis would advance the field of ecology in the coming decades, where advanced techniques for ecosystem monitoring (60–64) will lead to unprecedented increases in dataset size (7, 65, 66). Such analysis could promote studies to estimate the effects of global environmental change on biological communities (27), prevent regime shifts that lead to catastrophic impacts on biodiversity (67), and support efficient management of biological resources (68). Further development by applying EcohNet to time series of high-frequency data, such as sensor data, could lead to real-time, near-term forecasting to prepare for or preempt future impairment of ecological functions and services. It is expected that, with integration into active hypothesis testing, these studies will aid the development of predictive and manipulative ecology in the 21st century.
Materials and Methods
ESN-RLS.
An ESN (19, 21) implements a type of reservoir computing that uses an RNN as a dynamical reservoir (an internal structure that is made up of individual, nonlinear units and can store information). ESNs can skillfully reconstruct and predict time series from different nonlinear dynamical systems (69–75). It has been demonstrated that ESNs can track the temporal evolution of time series better than backpropagation-based artificial neural networks (71).
In an ESN, an input signal induces a nonlinear response to the dynamical reservoir RNN, and the reservoir states are converted to an output signal by linear weights (Fig. 1C). Here, for a multivariate time series of variables and , we assume that is the input at time specified by a set of indices , and is a target variable to which an ESN is trained to output its one-step-ahead prediction. Following the typical implementation, we defined an ESN by three matrices and one function. The matrices are 1) the input weight matrix, ; 2) the reservoir weight matrix, ; and 3) the output vector, , where and are the number of nodes in the input layer (number of elements in ) and dynamical reservoir, respectively. The state update equations for an ESN are as follows:
[1] |
[2] |
Here, is a vector of neural states in the dynamical reservoir, is the output of an ESN as the prediction of , and is a neural activation function. The neural connections of an ESN are randomly generated, except for the output weights . Here, is sequentially updated by the RLS method. This defines an online implementation of ESN, namely, ESN-RLS, which is robust to the nonstationary dynamics and easily applicable to predictive purposes (20).
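For orientation, the state update and readout of an ESN are commonly written in the following form in the reservoir computing literature. This is a sketch of that common form, not a transcription of Eqs. 1 and 2: every symbol here ($\mathbf{u}$ for the input, $\mathbf{x}$ for the reservoir state, $W^{\mathrm{in}}$, $W$, $\mathbf{w}^{\mathrm{out}}$, and $f$) is notation assumed for illustration, and details such as bias terms, leak rates, or time indexing may differ in the original equations.

```latex
\begin{align}
\mathbf{x}(t) &= f\!\left(W^{\mathrm{in}}\,\mathbf{u}(t) + W\,\mathbf{x}(t-1)\right), \\
\hat{y}(t+1) &= \mathbf{w}^{\mathrm{out}}(t)^{\top}\,\mathbf{x}(t).
\end{align}
```

Only $\mathbf{w}^{\mathrm{out}}$ is learned; $W^{\mathrm{in}}$ and $W$ stay fixed after random generation, which is what makes the online update described next inexpensive.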
RLS is widely used in linear signal processing and has the desirable feature of fast convergence (76). It incorporates the error history of a system into the calculation of the present error compensation. The recursive updating rule of RLS for is as follows:
[3] |
[4] |
[5] |
[6] |
Here, is the forgetting factor, which determines the effect of past errors on the update, is the output error at time when applying the weight matrix before the update, is the gain vector, and is the inverse of the self-correlation matrix of weighted by the forgetting factor. This update can be repeated multiple times at each time step, and we define the number of iterations as . We set the initial conditions of the update rule as
[7] |
[8] |
where is an identity matrix, is a zero vector of length , and is the regularization factor. We also set the initial reservoir state as .
This algorithm minimizes the following cost function, which is a weighted sum of the output errors at time when applying the output weight matrix at time with a regularization term:
[9] |
Here, represents the L2 norm.
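The following is a minimal sketch of an online ESN with an RLS readout, written from the textbook form of both algorithms rather than from the equations of this section. The class name `EsnRls`, the `tanh` activation, the initialization of the inverse correlation matrix as the identity divided by the regularization factor, and the way repeated updates are applied are all assumptions. The demonstration uses a gentler forgetting factor (0.995) and a single update per step, chosen here only to keep the toy example numerically tame.

```python
import numpy as np


class EsnRls:
    """Minimal online echo state network with a recursive-least-squares readout."""

    def __init__(self, n_inputs, n_nodes=32, spectral_radius=0.95, connectance=0.1,
                 forgetting=0.95, n_iter=1, delta=0.001, seed=0):
        rng = np.random.default_rng(seed)
        # Dense input weights and sparse reservoir weights, both random and fixed.
        self.w_in = rng.uniform(-0.1, 0.1, size=(n_nodes, n_inputs))
        w = rng.uniform(-1.0, 1.0, size=(n_nodes, n_nodes))
        w = w * (rng.random((n_nodes, n_nodes)) < connectance)
        w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
        self.w = w
        self.x = np.zeros(n_nodes)           # reservoir state
        self.w_out = np.zeros(n_nodes)       # readout weights (the only trained part)
        self.p = np.eye(n_nodes) / delta     # assumed initial inverse correlation matrix
        self.forgetting = forgetting
        self.n_iter = n_iter

    def predict(self, u):
        """Advance the reservoir with the current input and predict the next target value."""
        self.x = np.tanh(self.w_in @ np.atleast_1d(u) + self.w @ self.x)
        return float(self.w_out @ self.x)

    def update(self, y_true):
        """RLS update of the readout once the true next value is observed."""
        for _ in range(self.n_iter):
            err = y_true - self.w_out @ self.x
            px = self.p @ self.x
            gain = px / (self.forgetting + self.x @ px)
            self.w_out = self.w_out + gain * err
            self.p = (self.p - np.outer(gain, px)) / self.forgetting


# Toy demonstration: one-step-ahead prediction of a sine wave.
series = np.sin(2 * np.pi * np.arange(600) / 25)
esn = EsnRls(n_inputs=1, forgetting=0.995)
preds = []
for now, nxt in zip(series[:-1], series[1:]):
    preds.append(esn.predict(now))
    esn.update(nxt)
preds = np.array(preds)
rmse = float(np.sqrt(np.mean((preds[300:] - series[1:][300:]) ** 2)))
print(f"one-step RMSE over the second half: {rmse:.4f}")
```

Because the readout is linear in the reservoir state, each update costs only a few matrix-vector products, which is what makes large ensembles of such networks affordable.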
Parameterization of ESN-RLS.
The parameter values were set according to the standard recommendations for parameterization of the ESN and RLS (20, 21, 76–78) (Table 1 and SI Appendix, section B). The elements of the input weight matrix are randomly drawn from a uniform distribution between −0.1 and 0.1. The reservoir weight matrix is generated by drawing each element from a uniform distribution between −1 and 1 with probability 0.1 (and setting it to zero otherwise), and is then rescaled to a spectral radius of 0.95. The number of nodes in the dynamical reservoir is 32, which is smaller than that of a typical ESN. This is intended to reduce computation time. Also, this number is related to the memory capacity of the reservoir (20), and the number of nodes must be scaled to the time series length. In this sense, a large number of nodes was not needed in this study. In our benchmark using aquatic mesocosm data, we examined the stability of the results for four representative parameters (spectral radius, forgetting factor, number of RLS iterations, and regularization factor; Table 1) that may affect the performance of ESNs (SI Appendix, Fig. S16).
Table 1. Parameters of ESN-RLS
Description | Value |
---|---|
Connectance of | 1 |
Range of values in | [−0.1, 0.1] |
Number of nodes in () | 32 |
Connectance of | 0.1 |
Range of values in | [−1, 1] |
Spectral radius of () | 0.95* |
Forgetting factor () | 0.95* |
Number of iterations of RLS updates () | 8* |
Regularization factor () | 0.001* |
*For these parameters, we tested the impact on EcohNet's performance with different values. The tested values are , , , and .
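As an illustration of how the values in Table 1 translate into weight matrices, the following Python sketch generates an input matrix and a sparse reservoir matrix and rescales the latter to a given spectral radius. The function name, the argument names, and the assignment of the first two table rows to the input matrix and the later rows to the reservoir matrix are assumptions, since the symbols are not legible in this version of the text.

```python
import numpy as np

def make_reservoir(n_inputs, n_nodes=32, connectance=0.1, rho=0.95, seed=0):
    """Draw input and reservoir weights following the values in Table 1 (assumed mapping)."""
    rng = np.random.default_rng(seed)
    # Input weights: fully connected, uniform in [-0.1, 0.1].
    w_in = rng.uniform(-0.1, 0.1, size=(n_nodes, n_inputs))
    # Reservoir weights: uniform in [-1, 1] with probability `connectance`, otherwise zero.
    w = rng.uniform(-1.0, 1.0, size=(n_nodes, n_nodes))
    w *= rng.random((n_nodes, n_nodes)) < connectance
    # Rescale so that the largest absolute eigenvalue equals `rho`.
    radius = np.max(np.abs(np.linalg.eigvals(w)))
    if radius == 0:
        raise ValueError("reservoir has zero spectral radius; draw again")
    return w_in, w * (rho / radius)
```

Each call with a different `seed` yields a different network, which is how an ensemble of ESNs could be generated.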
Progressive Selection of Input Variables.
For a target variable (corresponding to X in Fig. 1A), we used progressive selection to obtain the set of variables that maximizes the prediction skill for ({X, Y} in Fig. 1B). For this purpose, we first define the prediction skill in this paper as
[10] |
Here, is the time series of target variable (blue lines in Fig. 1C), and represents the prediction of given a set of variables and an ESN indexed by (yellow lines in Fig. 1C). In practice, we calculated the prediction skill after discarding the first 20% of the time series, which removes the effect of the initial conditions (, , and ).
To prevent the results from depending on a particular network, prediction skill is evaluated by an ensemble of ESNs. This allows us to account for the estimation error in prediction skill and ensures more reliable variable selection. Moreover, to avoid overfitting, the input is replaced by a zero vector with a fixed probability at each time step. We therefore use the following equation, instead of Eq. 1, to update the reservoir state:
[11] |
where is a random number that is one with probability 0.5 and zero otherwise.
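The random masking of the input can be sketched as follows. Eq. 1 is not reproduced in this section, so the update used here (a hyperbolic tangent of the weighted input plus the weighted previous state, starting from a zero state) is an assumed standard ESN form; `run_reservoir` and `p_keep` are hypothetical names.

```python
import numpy as np

def run_reservoir(u, w_in, w, p_keep=0.5, seed=0):
    """Drive a reservoir with inputs u (T, n_inputs), zeroing the input at random steps."""
    rng = np.random.default_rng(seed)
    x = np.zeros(w.shape[0])               # assumed zero initial reservoir state
    states = np.empty((len(u), w.shape[0]))
    for t, u_t in enumerate(u):
        keep = rng.random() < p_keep       # one draw per time step: 1 keeps the input
        x = np.tanh(w_in @ (u_t * keep) + w @ x)
        states[t] = x
    return states
```

With `p_keep=0.5`, roughly half of the time steps see a zero input vector, so the readout cannot rely on the input being present at every step.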
The progressive selection of variables proceeds as follows. First, the set of indices of input variables to predict variable is initialized to to account for the predictability inherent in the target variable itself (first step in Fig. 1B). Correspondingly, the set of indices of the rest of the variables is initialized to . Then, ESNs are generated to perform predictions of by setting (Fig. 1C). This returns a distribution of prediction skills ( indicates the step of variable selection). To select the next variable to be added, for each index in , are generated, where the suffix indicates that is the th element of , and then, for each , ESNs are generated to obtain predictions of by setting (second step in Fig. 1B). Among the distribution of the prediction skills for , one that has the highest median value () is set as (second step in Fig. 1B). The criterion used to accept as is that the median of is larger than % of values in , formally written as , where is the number of in that satisfies the condition . If the criterion is satisfied, we replace by , by , by (removing from ), and increment by one, and proceed to the next step only if ; otherwise, the procedure is stopped (third step in Fig. 1B) with returning as the set of the indices of optimal variables that maximizes the prediction of .
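The procedure above can be summarized by the following Python sketch. Here, `skill_samples` is a hypothetical callable that returns the distribution of prediction skills obtained from an ensemble of ESNs for a given list of input indices. The acceptance rule is written under one assumed reading of the criterion, namely that a candidate is accepted when the fraction of current skills exceeding the candidate's median is below the threshold `q`; this reading is consistent with the statement that thresholds closer to 0.5 retain weaker links, but the exact inequality should be checked against the original equation.

```python
import numpy as np

def progressive_selection(target, candidates, skill_samples, q=0.48):
    """Greedy forward selection of input variables for one target variable."""
    selected = [target]                                  # start from the target itself
    remaining = [c for c in candidates if c != target]
    base = np.asarray(skill_samples(selected))
    while remaining:
        # Evaluate every remaining variable when added to the current set.
        trials = {c: np.asarray(skill_samples(selected + [c])) for c in remaining}
        best = max(remaining, key=lambda c: np.median(trials[c]))
        # Assumed criterion: few current skills may exceed the candidate's median.
        frac_above = np.mean(base > np.median(trials[best]))
        if frac_above >= q:
            break                                        # no sufficient improvement
        selected.append(best)
        remaining.remove(best)
        base = trials[best]
    return selected
```

Because every step compares whole distributions rather than single scores, a variable is added only when its improvement is large relative to the spread across networks.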
The number of ESNs generated for each evaluation, N, determines the stability of the results. If N is small, the variability in results from trial to trial may be large, especially for weak causality. In this paper, we set N = 10,000 for real-world data, to obtain stable results, and N = 5,000 for simulation datasets, to reduce the time required for evaluation. The threshold value for accepting a new variable, , was set to 0.48 throughout this paper. The closer this value is to 0.5, the more likely it is that weak links will remain. Thus, a gain in sensitivity (true positive rate) may come at the cost of reduced specificity (an increased false positive rate). In this paper, the value is set close to 0.5, taking into account the evaluation by ROC-AUC.
Calculation of Unique Prediction Skill.
Unique prediction skill quantifies the unique contribution of a single variable to the overall prediction skill (Fig. 1D). For each , the unique prediction skill is defined as
[12] |
Here, is the distribution of prediction skill for , and
[13] |
that is, is obtained by removing from . and are again calculated by ESNs. represents how uniquely contributes to the prediction of (illustrated as the difference of the positions of two distributions in Fig. 1D). We identify as the indicator of causal influence from to . In this step, such that is removed from , and we set as well as other variables that are not included in .
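A minimal sketch of this comparison is given below. Since Eq. 12 is not legible in this version of the text, the sketch assumes that the unique prediction skill is the difference between the medians of the two skill distributions, in line with the description of Fig. 1D as a difference in the positions of two distributions; the function name and this definition are assumptions.

```python
import numpy as np

def unique_skill(skill_full, skill_without_j):
    """Difference in median skill with and without variable j (assumed definition)."""
    return float(np.median(skill_full) - np.median(skill_without_j))
```

A value near zero would indicate that the remaining variables can compensate for the removed one, whereas a large value would indicate a contribution that no other selected variable provides.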
Dataset.
We benchmarked our method using both experimental and simulation data (SI Appendix, section D) and then applied it to a long-term monitoring dataset from a lake ecosystem. One of the benchmarking datasets was from a real ecological system, which was obtained by long-term observation of a mesocosm (33). The other datasets were obtained by simulations of two ecological models, namely food web and random interaction models (all simulation conditions are listed in SI Appendix, Table S1), and allowed us to examine how the dynamic complexity of a time series affects identification of causal relationships, and how the effects of two types of interaction topologies (chain and fan-out relationships) can cause false positives.
Lake Kasumigaura Long-Term Monitoring Dataset.
We applied EcohNet to the long-term monitoring data from Lake Kasumigaura, a hypereutrophic lake. Lake Kasumigaura is the second largest lake in Japan (167.7 km²) and is shallow (mean depth: ∼4 m; maximum depth: 7.4 m). The National Institute for Environmental Studies has been conducting monthly monitoring in Lake Kasumigaura since 1976 and publishing the long-term data on the Lake Kasumigaura Database. In this lake, cyanobacteria blooms occur and disappear repeatedly, and the dominant cyanobacteria group changes. Phytoplankton communities can interact with many components, such as temperature, nutrients, and zooplankton, and are complex and dynamic. Using EcohNet, we quantified the causal relationships among environmental variables, seven dominant phytoplankton groups (three cyanobacteria [Microcystis, Nostocales, and Oscillatoriales] and four diatoms [Thalassiosiraceae, Aulacoseira, Fragilaria, and Nitzschia]), and zooplankton, and examined the interactions among these phytoplankton groups.
We analyzed the monitoring data at the center of Takahamairi Bay (Station 3), which is shallow (ca. 3.2 m) and the most eutrophic site with cyanobacteria blooms. Phytoplankton biovolume data were obtained from Takamura and Nakagawa (79). For zooplankton, we used the abundance data of five functional groups: large cladocerans (>1.0 mm), small cladocerans (<1.0 mm), rotifers, adult calanoids, and adult cyclopoids (25). Zooplankton data were obtained from Takamura et al. (80). Our key environmental variables were surface water temperature, soluble reactive phosphorus (PO4-P), and nitrate nitrogen (NO3-N) from the Lake Kasumigaura Database (81). We also included 30-d moving averages of wind speed and precipitation, which were collected from the Tsukuba-Tateno Meteorological Station of the Japan Meteorological Agency (https://www.jma.go.jp/jma/menu/menureport.html). Since CCM, PCM, and LIMITS require taking consecutive time lags of observed variables, we analyzed data from April 1996 to March 2019 (276 mo). There were no missing data for any variables during this time interval. All time series were square-root transformed and normalized to have a mean of zero and a variance of one to adjust for rapid increases in some phytoplankton species.
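The preprocessing described above can be sketched in Python as follows. The function names are hypothetical, the use of the population SD for normalization is an assumption, and the moving average is a plain unweighted window over daily values whose alignment with the monthly sampling dates is not specified here.

```python
import numpy as np

def preprocess(series):
    """Square-root transform, then scale to zero mean and unit variance."""
    x = np.sqrt(np.asarray(series, dtype=float))
    return (x - x.mean()) / x.std()        # population SD (assumed)

def moving_average(daily, window=30):
    """Unweighted moving average over `window` consecutive daily values."""
    daily = np.asarray(daily, dtype=float)
    return np.convolve(daily, np.ones(window) / window, mode="valid")
```

The square-root transform compresses large bloom values before scaling, so that a few extreme months do not dominate the normalized series.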
Evaluation.
We evaluated the performance of EcohNet and conventional methods (SI Appendix, section A) in detecting interactions, using the evaluation criteria of a binary classification task, namely, ROC-AUC, F1 score, accuracy, sensitivity, and specificity (SI Appendix, Fig. S3). Since accuracy is affected by the degree of connectance, we considered ROC-AUC and F1 score as criteria for overall performance. We specifically considered prediction skill (EcohNet, CCM, and PCM), correlation coefficient (Spearman rank correlation), and interaction coefficient (LIMITS) as classifiers to identify interactions. We adjusted the sparsity of the Spearman rank correlation matrix based on the P value; that is, we made the correlation matrix sparse by replacing elements with with zero. In the same manner, we set the threshold P value of CCM as 0.05 (SI Appendix, section A2). For PCM, following the author implementation (28), we used the value of prediction skill instead of the P value and set the threshold as 0.2. For LIMITS, as in EcohNet, the sparsity of an interaction matrix depends on the forward stepwise algorithm. Here, the threshold value for the improvement of prediction error (SI Appendix, section A4) was set as zero; that is, a new variable is added if it at least improves the prediction.
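For reference, the evaluation criteria named above can be computed from flattened score and ground-truth arrays as in the following sketch. The function names are hypothetical, and ROC-AUC is computed here through its rank interpretation (the probability that a true link receives a higher score than a non-link, with ties counted as one half) rather than through any specific library routine.

```python
import numpy as np

def binary_metrics(pred, truth):
    """Sensitivity, specificity, accuracy, and F1 score for boolean arrays."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    tp = np.sum(pred & truth)
    tn = np.sum(~pred & ~truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / pred.size,
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

def roc_auc(scores, truth):
    """ROC-AUC as the fraction of (link, non-link) pairs ranked correctly."""
    scores, truth = np.asarray(scores, float), np.asarray(truth, bool)
    pos, neg = scores[truth], scores[~truth]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)
```

Unlike accuracy, ROC-AUC does not depend on a single threshold, which is one reason to prefer it when the true networks differ in connectance.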
In predator–prey relationships, species affect each other directly, whereas, in a causal relationship, a relationship between species may only be detected in one direction, due to a difference in the time dependence of each effect (5, 24, 25). Therefore, in this paper, when evaluating EcohNet, CCM, and PCM for the long-term mesocosm experiment and the food web models using the evaluation criteria of the binary classification task, we consider the ability to detect interacting pairs (pairs where at least one has a direct influence on the other). Specifically, we symmetrized the matrices of the prediction skill of EcohNet, CCM, and PCM so that before evaluation, and evaluated the performance of prediction skill as a classifier in detecting elements for which and whose values are nonzero in the actual interaction matrix. The matrix of Spearman rank correlation, which is originally symmetric, was evaluated in the same manner. For the random interaction model, we considered the ability of EcohNet, CCM, and PCM to detect direct influences. Specifically, we tested the presence of influence from one species to another (corresponding to a nonzero in the interaction matrix) when there are causal links from the former to the latter. For LIMITS, we evaluated the performance of detecting nonzero components of the interaction matrix directly in all cases, according to its original definition.
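The symmetrization step can be illustrated as below. The text states only that the matrices are made symmetric, so keeping the larger of the two directed values is an assumption, as are the function names; the ground-truth matrix is symmetrized by marking a pair as interacting when either direction is nonzero, as described above.

```python
import numpy as np

def symmetrize_scores(m):
    """Keep the larger of the two directed scores for each pair (assumed rule)."""
    m = np.asarray(m, dtype=float)
    return np.maximum(m, m.T)

def interacting_pairs(a):
    """True where at least one direction of the interaction matrix is nonzero."""
    a = np.asarray(a)
    return (a != 0) | (a.T != 0)
```

After this step, each unordered pair is represented twice in the matrix, so only one triangle should be passed to the classification metrics.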
Acknowledgments
We thank Shinji Nakaoka for valuable discussions, and Megumi Nakagawa for counting phytoplankton and providing helpful comments. We also thank two anonymous reviewers and all members of the Lake Kasumigaura long-term monitoring group of the National Institute for Environmental Studies. This work was supported by funding from the Management Expenses Grant for RIKEN BioResource Research Center, Ministry of Education, Culture, Sports, Science and Technology, and the Japan Society for the Promotion of Science KAKENHI Grants JP20K06820 and JP20H03010 (to K.S.).
Footnotes
The authors declare no competing interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2204405119/-/DCSupplemental.
Data, Materials, and Software Availability
We used Mathematica 12.3 and 13.1 for our analysis. The computer codes and data used for the analysis can be downloaded from GitHub (82).
Change History
October 14, 2022: The text of the introduction has been updated for grammatical clarity.
References
- 1. Runge J., Causal network reconstruction from time series: From theoretical assumptions to practical estimation. Chaos 28, 075310 (2018).
- 2. Runge J., et al., Inferring causation from time series in Earth system sciences. Nat. Commun. 10, 2553 (2019).
- 3. Granger C. W., Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37, 424–438 (1969).
- 4. Akoglu H., User’s guide to correlation coefficients. Turk. J. Emerg. Med. 18, 91–93 (2018).
- 5. Sugihara G., et al., Detecting causality in complex ecosystems. Science 338, 496–500 (2012).
- 6. Runge J., Nowack P., Kretschmer M., Flaxman S., Sejdinovic D., Detecting and quantifying causal associations in large nonlinear time series datasets. Sci. Adv. 5, eaau4996 (2019).
- 7. Durden J. M., Integrating “big data” into aquatic ecology: Challenges and opportunities. Limnol. Oceanogr. Bull. 26, 101–108 (2017).
- 8. Vallis G. K., Atmospheric and Oceanic Fluid Dynamics (Cambridge University Press, 2017).
- 9. Kretschmer M., Coumou D., Donges J. F., Runge J., Using causal effect networks to analyze different arctic drivers of midlatitude winter circulation. J. Clim. 29, 4069–4081 (2016).
- 10. Perretti C. T., Munch S. B., Sugihara G., Model-free forecasting outperforms the correct mechanistic model for simulated and experimental data. Proc. Natl. Acad. Sci. U.S.A. 110, 5253–5257 (2013).
- 11. Deyle E. R., May R. M., Munch S. B., Sugihara G., Tracking and forecasting ecosystem interactions in real time. Proc. Biol. Sci. 283, 20152258 (2016).
- 12. McGowan J. A., et al., Predicting coastal algal blooms in southern California. Ecology 98, 1419–1433 (2017).
- 13. Schlesinger W. H., Cole J. J., Finzi A. C., Holland E. A., Introduction to coupled biogeochemical cycles. Front. Ecol. Environ. 9, 5–8 (2011).
- 14. Tsonis A. A., et al., Dynamical evidence for causality between galactic cosmic rays and interannual variation in global temperature. Proc. Natl. Acad. Sci. U.S.A. 112, 3253–3256 (2015).
- 15. Deyle E. R., Maher M. C., Hernandez R. D., Basu S., Sugihara G., Global environmental drivers of influenza. Proc. Natl. Acad. Sci. U.S.A. 113, 13081–13086 (2016).
- 16. Rogers T. L., Munch S. B., Hidden similarities in the dynamics of a weakly synchronous marine metapopulation. Proc. Natl. Acad. Sci. U.S.A. 117, 479–485 (2020).
- 17. Chang C. W., et al., Causal networks of phytoplankton diversity and biomass are modulated by environmental context. Nat. Commun. 13, 1140 (2022).
- 18. Baskerville E. B., Cobey S., Does influenza drive absolute humidity? Proc. Natl. Acad. Sci. U.S.A. 114, E2270–E2271 (2017).
- 19. Jaeger H., “The “echo state” approach to analysing and training recurrent neural networks-with an erratum note” (GMD Technical Report 148 + 13, German National Research Center for Information Technology, Bonn, Germany, 2001).
- 20. Jaeger H., Adaptive nonlinear system identification with echo state networks. Adv. Neural Inf. Process. Syst. 15, 609–616 (2002).
- 21. Jaeger H., Haas H., Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication. Science 304, 78–80 (2004).
- 22. Bollt E., On explaining the surprising success of reservoir computing forecaster of chaos? The universal machine learning dynamical system with contrast to VAR and DMD. Chaos 31, 013108 (2021).
- 23. Pennekamp F., et al., The intrinsic predictability of ecological time series and its potential to guide forecasting. Ecol. Monogr. 89, e01359 (2019).
- 24. Ye H., Deyle E. R., Gilarranz L. J., Sugihara G., Distinguishing time-delayed causal interactions using convergent cross mapping. Sci. Rep. 5, 14750 (2015).
- 25. Matsuzaki S. S., Suzuki K., Kadoya T., Nakagawa M., Takamura N., Bottom-up linkages between primary production, zooplankton, and fish in a shallow, hypereutrophic lake. Ecology 99, 2025–2036 (2018).
- 26. Ushio M., et al., Fluctuating interaction network and time-varying stability of a natural fish community. Nature 554, 360–363 (2018).
- 27. Chang C. W., et al., Long-term warming destabilizes aquatic ecosystems through weakening biodiversity-mediated causal networks. Glob. Change Biol. 26, 6413–6423 (2020).
- 28. Leng S., et al., Partial cross mapping eliminates indirect causal influences. Nat. Commun. 11, 2632 (2020).
- 29. Carey C. C., et al., Advancing lake and reservoir water quality management with near-term, iterative ecological forecasting. Inland Waters 12, 107–120 (2021).
- 30. Nakajima K., Fischer I., Reservoir Computing (Springer, 2021).
- 31. Friedman J., Hastie T., Tibshirani R., The Elements of Statistical Learning (Springer, 2001).
- 32. Fisher C. K., Mehta P., Identifying keystone species in the human gut microbiome from metagenomic timeseries using sparse linear regression. PLoS One 9, e102451 (2014).
- 33. Benincà E., et al., Chaos in a long-term experiment with a plankton community. Nature 451, 822–825 (2008).
- 34. Suzuki K., Yoshida K., Nakanishi Y., Fukuda S., An equation-free method reveals the ecological interaction networks within complex microbial ecosystems. Methods Ecol. Evol. 8, 1774–1785 (2017).
- 35. Kantz H., Schreiber T., Nonlinear Time Series Analysis (Cambridge University Press, 2004).
- 36. Huang Y., Fu Z., Franzke C. L. E., Detecting causality from time series in a machine learning framework. Chaos 30, 063116 (2020).
- 37. Duggento A., Guerrisi M., Toschi N., Echo state network models for nonlinear Granger causality. Philos. Trans. R. Soc. Math. Phys. Eng. Sci. 379, 20200256 (2021).
- 38. Wang M., Fu Z., A new method of nonlinear causality detection: Reservoir computing Granger causality. Chaos Solitons Fractals 154, 111675 (2022).
- 39. Judd K., Mees A., On selecting models for nonlinear time series. Physica D 82, 426–444 (1995).
- 40. Cenci S., Sugihara G., Saavedra S., Regularized S-map for inference and forecasting with noisy ecological time series. Methods Ecol. Evol. 10, 650–660 (2019).
- 41. Chang C. W., et al., Reconstructing large interaction networks from empirical time series data. Ecol. Lett. 24, 2763–2774 (2021).
- 42. Cobey S., Baskerville E. B., Limits to causal inference with state-space reconstruction for infectious disease. PLoS One 11, e0169050 (2016).
- 43. Mønster D., et al., Causal inference from noisy time-series data—Testing the convergent cross-mapping algorithm in the presence of noise and external influence. Future Gener. Comput. Syst. 73, 52–62 (2017).
- 44. Tanentzap A. J., et al., Climate warming restructures an aquatic food web over 28 years. Glob. Change Biol. 26, 6852–6866 (2020).
- 45. Kitayama K., Ushio M., Aiba S. I., Temperature is a dominant driver of distinct annual seasonality of leaf litter production of equatorial tropical rain forests. J. Ecol. 109, 727–736 (2021).
- 46. Huber V., Wagner C., Gerten D., Adrian R., To bloom or not to bloom: Contrasting responses of cyanobacteria to recent heat waves explained by critical thresholds of abiotic drivers. Oecologia 169, 245–256 (2012).
- 47. Freeman E. C., Creed I. F., Jones B., Bergström A. K., Global changes may be promoting a rise in select cyanobacteria in nutrient-poor northern lakes. Glob. Change Biol. 26, 4966–4987 (2020).
- 48. Takamura N., Otsuki A., Aizaki M., Nojiri Y., Phytoplankton species shift accompanied by transition from nitrogen dependence to phosphorus dependence of primary production in Lake Kasumigaura, Japan. Arch. Hydrobiol. 124, 129–148 (1992).
- 49. Tomioka N., Imai A., Komatsu K., Effect of light availability on Microcystis aeruginosa blooms in shallow hypereutrophic Lake Kasumigaura. J. Plankton Res. 33, 1263–1273 (2011).
- 50. Fukushima T., Arai H., Regime shifts observed in Lake Kasumigaura, a large shallow lake in Japan: Analysis of a 40-year limnological record. Lakes Reservoirs Res. Manage. 20, 54–68 (2015).
- 51. Kâ S., et al., Can tropical freshwater zooplankton graze efficiently on cyanobacteria? Hydrobiologia 679, 119–138 (2012).
- 52. Ger K. A., et al., The interaction between cyanobacteria and zooplankton in a more eutrophic world. Harmful Algae 54, 128–144 (2016).
- 53. Jochimsen M. C., Kümmerlin R., Straile D., Compensatory dynamics and the stability of phytoplankton biomass during four decades of eutrophication and oligotrophication. Ecol. Lett. 16, 81–89 (2013).
- 54. Moisander P. H., Paerl H. W., Zehr J. P., Effects of inorganic nitrogen on taxa-specific cyanobacterial growth and nifH expression in a subtropical estuary. Limnol. Oceanogr. 53, 2519–2532 (2008).
- 55. Huisman J., et al., Changes in turbulent mixing shift competition for light between phytoplankton species. Ecology 85, 2960–2970 (2004).
- 56. Huisman J., et al., Cyanobacterial blooms. Nat. Rev. Microbiol. 16, 471–483 (2018).
- 57. Williams P. L., Beer R. D., Nonnegative decomposition of multivariate information. arXiv [Preprint] (2010). https://arxiv.org/abs/1004.2515. Accessed 6 August 2022.
- 58. Leung S. H., So C. F., Gradient-based variable forgetting factor RLS algorithm in time-varying environments. IEEE Trans. Signal Process. 53, 3141–3150 (2005).
- 59. Wang L., Wang Z., Liu S., An effective multivariate time series classification approach using echo state network and adaptive differential evolution algorithm. Expert Syst. Appl. 43, 237–249 (2016).
- 60. Caporaso J. G., et al., QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336 (2010).
- 61. Bohan D. A., et al., Next-generation global biomonitoring: Large-scale, automated reconstruction of ecological networks. Trends Ecol. Evol. 32, 477–487 (2017).
- 62. Bálint M., et al., Environmental DNA time series in ecology. Trends Ecol. Evol. 33, 945–957 (2018).
- 63. Haase P., et al., The next generation of site-based long-term ecological monitoring: Linking essential biodiversity variables and ecosystem integrity. Sci. Total Environ. 613–614, 1376–1384 (2018).
- 64. Taberlet P., Bonin A., Zinger L., Coissac E., Environmental DNA: For Biodiversity Research and Monitoring (Oxford University Press, 2018).
- 65. Hampton S. E., et al., Big data and the future of ecology. Front. Ecol. Environ. 11, 156–162 (2013).
- 66. Mouquet N., et al., Predictive ecology in a changing world. J. Appl. Ecol. 52, 1293–1310 (2015).
- 67. Scheffer M., Carpenter S. R., Catastrophic regime shifts in ecosystems: Linking theory to observation. Trends Ecol. Evol. 18, 648–656 (2003).
- 68. Chivian E., Bernstein A., Eds., Sustaining Life: How Human Health Depends on Biodiversity (Oxford University Press, 2008).
- 69. Pathak J., Lu Z., Hunt B. R., Girvan M., Ott E., Using machine learning to replicate chaotic attractors and calculate Lyapunov exponents from data. Chaos 27, 121102 (2017).
- 70. Du C., et al., Reservoir computing using dynamic memristors for temporal information processing. Nat. Commun. 8, 2204 (2017).
- 71. Lu Z., et al., Reservoir observers: Model-free inference of unmeasured variables in chaotic systems. Chaos 27, 041102 (2017).
- 72. Lu Z., Hunt B. R., Ott E., Attractor reconstruction by machine learning. Chaos 28, 061104 (2018).
- 73. Pathak J., Hunt B., Girvan M., Lu Z., Ott E., Model-free prediction of large spatiotemporally chaotic systems from data: A reservoir computing approach. Phys. Rev. Lett. 120, 024102 (2018).
- 74. Chattopadhyay A., Hassanzadeh P., Palem K., Subramanian D., Data-driven prediction of a multi-scale Lorenz 96 chaotic system using a hierarchy of deep learning methods reservoir computing, ANN, and RNN-LSTM. arXiv [Preprint] (2019). https://arxiv.org/abs/1906.08829. Accessed 6 August 2022.
- 75. Reichstein M., et al., Deep learning and process understanding for data-driven Earth system science. Nature 566, 195–204 (2019).
- 76. Farhang-Boroujeny B., Adaptive Filters: Theory and Applications (John Wiley, 2013).
- 77. Ifeachor E. C., Jervis B. W., Digital Signal Processing: A Practical Approach (Pearson Education, 2002).
- 78. Lukoševičius M., “A practical guide to applying echo state networks” in Neural Networks: Tricks of the Trade, Montavon G., Orr G. B., Müller K.-R., Eds. (Springer, 2012), pp. 659–686.
- 79. Takamura N., Nakagawa M., Phytoplankton species abundance in Lake Kasumigaura (Japan) monitored monthly or biweekly since 1978. Ecol. Res. 27, 837–837 (2012).
- 80. Takamura N., Nakagawa M., Hanazato T., Zooplankton abundance in the pelagic region of Lake Kasumigaura (Japan): Monthly data since 1980. Ecol. Res. 32, 1 (2017).
- 81. National Institute for Environmental Studies, Lake Kasumigaura Database. https://db.cger.nies.go.jp/gem/moni-e/inter/GEMS/database/kasumi/index.html. Accessed 18 June 2021.
- 82. Suzuki K., EcohNet. GitHub. https://github.com/kecosz/EcohNet. Deposited 28 August 2022.