Abstract
Reservoir computing is a machine learning framework that exploits nonlinear dynamics, exhibiting significant computational capabilities. One of the defining characteristics of reservoir computing is that only linear output, given by a linear combination of reservoir variables, is trained. Inspired by recent mathematical studies of generalized synchronization, we propose a novel reservoir computing framework with a generalized readout, including a nonlinear combination of reservoir variables. Learning prediction tasks can be formulated as an approximation problem of a target map that provides true prediction values. Analysis of the map suggests an interpretation that the linear readout corresponds to a linearization of the map, and further that the generalized readout corresponds to a higher-order approximation of the map. Numerical study shows that introducing a generalized readout, corresponding to the quadratic and cubic approximation of the map, leads to a significant improvement in accuracy and an unexpected enhancement in robustness in the short- and long-term prediction of Lorenz and Rössler chaos. Towards applications of physical reservoir computing, we particularly focus on how the generalized readout effectively exploits low-dimensional reservoir dynamics.
Keywords: Reservoir computing, Generalized synchronization, Echo state property
Subject terms: Computer science, Nonlinear phenomena, Machine learning
Introduction
Reservoir computing (RC) is a machine learning framework that exploits dynamical systems and has remarkable computational capabilities1–3. For example, RC using random networks, called echo state networks (ESNs), can efficiently predict chaotic time series4. Adding a closed loop makes an RC system autonomous and capable of replicating chaotic attractors, which can be utilized to estimate Lyapunov exponents5. Furthermore, recent studies have shown that such ‘autonomous’ RC systems can reproduce true dynamical properties more accurately than those computed from limited training data and can extrapolate true dynamical structures, such as bifurcations, outside the training data6–8. Another branch of research, physical RC, harnesses various physical dynamics and demonstrates high information processing capability9–14.
Why does RC work so well with untrained random networks and physical systems? This is a central open problem in RC research, and more broadly in machine learning and neuroscience. Partial answers to this problem have been provided using dynamical systems theory7,15–17. In particular, Grigoryeva, Hart, and Ortega17 rigorously proved the existence of a continuously differentiable synchronization map under certain conditions and explicitly showed what the RC learns when predicting chaotic dynamics. In other words, they provided a formal expression of the map that RC approximates for prediction, $F = g^{\tau} \circ h^{-1}$, as explained later in Eq. (8). Hara and Kokubu7 uncovered a key mathematical structure for learning with RC, i.e. a smooth conjugacy between target and reservoir dynamics, based on observations from a numerical study of the logistic map.
Inspired by these seminal studies7,17, we propose a novel method of RC with a generalized readout. Based on generalized synchronization, the Taylor expansion of the map $F$, Eq. (10), gives an interpretation of the conventional RC as a linearization of $F$. Moreover, it implies that the computational capabilities of RC with a generalized readout are superior to those of conventional RC. Remark that a specific type of nonlinear readout, such as the squared reservoir variables $[r_t]_i^2$ in the notation introduced later, has already been used in previous research5,18,19. We emphasize that our theoretical framework comprehensively explains why these nonlinear readouts are so effective and, furthermore, presents a new direction of research, including cubic readouts, bridging rigorous mathematics7,17 with practical applications.
Indeed, numerical studies on the prediction of Lorenz and Rössler chaos as benchmark problems strongly support this: for both short- and long-term prediction, we reveal the significant computational capabilities of RC with the generalized readout compared to conventional RC. Moreover, for long-term prediction, the autonomous RC system with the generalized readout acquires notable robustness, in contrast to the lack of robustness of conventional RC.
Formulation
Conventional RC
Here we briefly sketch the method of conventional RC. Let us consider the target input $u_t \in \mathbb{R}^{K_{\mathrm{in}}}$ and target output $v_t \in \mathbb{R}^{K}$ vectors ($t = 0, 1, \ldots$) and their sequence $\{(u_t, v_t)\}_t$. The goal is to construct a machine that, given an input $u_t$, produces an output $y_t$ that approximates the target output $v_t$, i.e. $y_t \approx v_t$, by using the training data $\{(u_t, v_t)\}$.
The machine consists of the reservoir variables $r_t \in \mathbb{R}^{N}$, whose dynamics are determined by a map $f$ and the input $u_t$ as follows:

$$r_{t+1} = f(r_t, u_t). \qquad (1)$$

The output $y_t$ is determined by the readout weight matrix $W \in \mathbb{R}^{K \times N}$ and the output bias $b \in \mathbb{R}^{K}$ as

$$y_t = W r_t + b. \qquad (2)$$

The readout weight matrix $W$ is determined such that $y_t \approx v_t$;

$$\min_{W,\, b}\; \big\langle \| v_t - (W r_t + b) \|^2 \big\rangle, \qquad (3)$$

i.e., by the usual method of least squares, where $\langle \cdot \rangle$ denotes the long-term average. For simplicity, the regularization term is omitted in this formulation, but it is used in the numerical study below.
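For concreteness, the following minimal sketch (not the authors' code; the reservoir map f, the data, and the regularization strength are placeholder assumptions) collects the reservoir states of Eq. (1) and solves the least-squares problem (3) in the ridge-regularized form used later in the numerical study.

```python
import numpy as np

def run_reservoir(f, u_seq, r0):
    """Drive the reservoir by Eq. (1), r_{t+1} = f(r_t, u_t); return the states."""
    r, states = r0, []
    for u in u_seq:
        r = f(r, u)
        states.append(r)
    return np.array(states)                            # shape (T, N)

def train_linear_readout(R, V, beta=1e-8):
    """Solve Eq. (3) with ridge regularization: min <|v_t - (W r_t + b)|^2>.
    R: (T, N) reservoir states, V: (T, K) target outputs."""
    X = np.hstack([R, np.ones((len(R), 1))])           # constant feature for the bias b
    Wb = np.linalg.solve(X.T @ X + beta * np.eye(X.shape[1]), X.T @ V)
    return Wb[:-1].T, Wb[-1]                           # W: (K, N), b: (K,)
```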
Synchronizations
Common-Signal-Induced Synchronization (CSIS), or equivalently the Echo State Property (ESP) in the context of RC, is a key property required for the reservoir dynamics determined by the map $f$. For a given (common) input signal $\{u_t\}$ and arbitrary initial reservoir states $r_0, r'_0$, we say that CSIS occurs if the reservoir states converge to a unique state that depends only on the sequence of the input signal, i.e.

$$\| r_t - r'_t \| \to 0 \quad (t \to \infty), \qquad (4)$$

where these dynamics are determined by

$$r_{t+1} = f(r_t, u_t), \qquad r'_{t+1} = f(r'_t, u_t). \qquad (5)$$

The occurrence of CSIS can be characterized by the conditional Lyapunov exponent15,16.
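CSIS can be checked numerically by driving two copies of the same reservoir, Eq. (5), with a common input signal from different initial states and observing the contraction (4). A minimal sketch, assuming the tanh ESN map used later in the numerical study and placeholder scales for the random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K_in, T = 50, 3, 200
A = rng.uniform(-0.1, 0.1, (N, N))      # placeholder scales; CSIS holds only for
B = rng.uniform(-1.0, 1.0, (N, K_in))   # sufficiently contracting reservoir dynamics

f = lambda r, u: np.tanh(A @ r + B @ u)                # the ESN map used below

u_seq = rng.standard_normal((T, K_in))                 # common input signal
r, rp = rng.uniform(-1, 1, N), rng.uniform(-1, 1, N)   # arbitrary initial states
for u in u_seq:
    r, rp = f(r, u), f(rp, u)           # Eq. (5): two states, one common input
print(np.linalg.norm(r - rp))           # ~ 0 when CSIS (ESP) occurs, Eq. (4)
```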
In this paper, we study only the case where the input signal $u_t$ is generated by another dynamical system, referred to as the target dynamical system, determined by a nonlinear map $g$:

$$u_t = g^{t}(u_0), \qquad (6)$$

where $u_0$ denotes the initial point of the target dynamics and $g^{t}$ is the $t$-times composition of $g$. It is more general to formulate an observation function of the target dynamics, as in Grigoryeva et al.17; however, we do not consider it for simplicity.
Note that, if CSIS occurs, the asymptotic states of the reservoir dynamics $r_t$ are uniquely determined by the target dynamics $u_t$ after the transient period. This correspondence is referred to as generalized synchronization and is denoted by

$$r_t = h(u_t), \qquad (7)$$

where $h$ is the generalized synchronization map. See Fig. 1a for an illustration of $h$. Grigoryeva et al.17 proved the existence and the differentiability of the map $h$ under certain conditions.
Fig. 1.
An illustration of the target and reservoir dynamics in phase space. As an example of the target dynamical system, the Rössler attractor is shown in the left panel. The right panel shows the projection of the reservoir dynamics driven by the Rössler dynamics onto the subspace spanned by the first three reservoir variables, $([r]_1, [r]_2, [r]_3)$. The red arrow depicts the schematic of the generalized synchronization map $h$.
Generalized readout
Let us assume that the inverse of the map $h$ exists, so that $u_t = h^{-1}(r_t)$. In that case, for instance, the $\tau$-ahead prediction task of the target dynamics can be expressed by

$$v_t = u_{t+\tau} = g^{\tau}(u_t) = (g^{\tau} \circ h^{-1})(r_t) =: F(r_t) \qquad (8)$$

as a function of the reservoir state $r_t$. Therefore, predicting the $\tau$-ahead target dynamics with RC is mathematically equivalent to the functional approximation of the map $F$. This indicates that the conventional RC may be viewed as a linearization of the map $F$ (see Eq. (2)), i.e.,

$$F(r_t) \approx W r_t + b. \qquad (9)$$
However, the map $F$ is not linear in general, since it is the composition of the nonlinear maps $g^{\tau}$ and $h^{-1}$. Here we assume $F$ is sufficiently smooth and consider the Taylor expansion of $F$ around a reservoir state $r^{\ast}$,

$$[F(r)]_k = [F(r^{\ast})]_k + \sum_{i} J_{ki}\, \delta r_i + \frac{1}{2} \sum_{i,j} H_{kij}\, \delta r_i\, \delta r_j + O(\|\delta r\|^{3}), \qquad (10)$$

where $\delta r := r - r^{\ast}$, and $J$ and $H$ denote the Jacobian $J_{ki} = \partial_{i}[F(r^{\ast})]_k$ and the Hessian $H_{kij} = \partial_{i}\partial_{j}[F(r^{\ast})]_k$, respectively. This gives an interpretation of the conventional RC; that is, the output bias vector $b$ and the readout weight matrix $W$ are used to approximate the first two terms in the Taylor expansion as

$$b + W r_t \approx F(r^{\ast}) + J\,(r_t - r^{\ast}), \qquad (11)$$

with an approximation error of $O(\|r_t - r^{\ast}\|^{2})$. In this sense, the conventional RC may be understood as a linear approximation of the target map $F$.
In this paper, we propose to utilize nonlinear combinations of the reservoir variables for the approximation of the higher-order terms in the Taylor expansion. In other words, our method, referred to as RC with generalized readout, approximates the terms in the Taylor expansion beyond the linear one. Taking into account up to second order, we include the quadratic form of the reservoir variables in the output as

$$[y_t]_k = b_k + \sum_{i} W_{ki} [r_t]_i + \sum_{i,j} W^{(2)}_{kij} [r_t]_i [r_t]_j, \qquad (12)$$

so that the readout weight tensor $W^{(2)}$ approximates the Hessian term as

$$\sum_{i,j} W^{(2)}_{kij} [r_t]_i [r_t]_j \approx \frac{1}{2} \sum_{i,j} H_{kij}\, \delta r_i\, \delta r_j, \qquad (13)$$

up to lower-order terms absorbed into $b$ and $W$, with an approximation error of $O(\|r_t - r^{\ast}\|^{3})$. This corresponds to the quadratic approximation of the map $F$. As in the conventional RC, the readout weights $W$ and $W^{(2)}$ are determined such that

$$y_t \approx v_t, \qquad (14)$$

$$\min_{b,\, W,\, W^{(2)}}\; \big\langle \| v_t - y_t \|^2 \big\rangle, \qquad (15)$$

which we refer to as quadratic-form RC (QRC).
Note that learning in our method is linear with respect to the weights $W$ and $W^{(2)}$, which again results in a least-squares problem, and therefore retains the simplicity of the conventional RC, i.e., the low computational cost and guaranteed optimality. Furthermore, the output of our method is nonlinear with respect to the reservoir variables $r_t$, which leads to a greater variety of approximations to the functional dependence of $v_t$ on $r_t$. Moreover, it is natural to expect that including higher-order terms, beyond QRC, will give a better approximation, and indeed, we show in the following numerical experiments that this is true at least up to the third order (see Fig. 7).
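In practice, the quadratic readout (12) can be implemented by augmenting the regression features with the distinct products $[r]_i [r]_j$ ($i \le j$, exploiting the symmetry of $W^{(2)}$) and reusing the same linear least squares. A minimal sketch under these assumptions:

```python
import numpy as np

def quadratic_features(R):
    """Map states (T, N) to [r, products r_i r_j for i <= j], the feature
    set of the quadratic-form readout, Eq. (12), with symmetric W2."""
    iu, ju = np.triu_indices(R.shape[1])
    return np.hstack([R, R[:, iu] * R[:, ju]])     # shape (T, N + N(N+1)/2)

def train_qrc(R, V, beta=1e-8):
    """Train b, W, W2 jointly by ridge regression, Eqs. (14)-(15); returns
    the stacked weights for [features, 1]."""
    X = np.hstack([quadratic_features(R), np.ones((len(R), 1))])
    return np.linalg.solve(X.T @ X + beta * np.eye(X.shape[1]), X.T @ V)
```

Because the output is linear in the weights, the optimization remains an ordinary least-squares problem even though the readout is nonlinear in $r_t$.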
Fig. 7.
Summary of the quantitative comparison of reconstruction ability. The left (“1”), center (“2”), and right (“3”) points correspond to the results of the ℓ-, q-, and c-ESN, respectively. The top and bottom panels show the MCE and KLD values over ten realizations of the random matrices A and B, respectively.
Numerical study of QRC and beyond
We numerically show that the RC with generalized readout is superior to the conventional RC for the prediction task, and that the closed-loop long-term prediction using the QRC provides better performance with unexpected robustness.
Here, we use the Echo State Network (ESN) as a reservoir, i.e., the map $f$ is given by $f(r, u) = \tanh(Ar + Bu)$, where the component-wise application of $\tanh$ is employed as the activation function, and $A \in \mathbb{R}^{N \times N}$ and $B \in \mathbb{R}^{N \times K_{\mathrm{in}}}$ are random matrices. The elements of the random matrices A and B are sampled independently and identically from uniform distributions over symmetric intervals, and we use ridge regression at the training phase with a regularization term. Hyperparameter optimization was performed for each ESN and each task, and the detailed results are given in the Supplementary Information.
The number of training parameters, denoted by M in the following numerical study, is summarised here for later use (Fig. 3b). Henceforth, we refer to the conventional ESN as the linear ESN (ℓ-ESN) and to the quadratic-form ESN as the q-ESN. Concerning the ℓ-ESN, in accordance with previous mathematical studies6–8, we employ a simple architecture without leaking rate, special structure of the adjacency matrix, output bias, and so on, where simply $y_t = W r_t$, giving $M = KN$ for K outputs. As for the q-ESN, using the symmetry of $W^{(2)}$, i.e., $W^{(2)}_{kij} = W^{(2)}_{kji}$, we have $M = KN + KN(N+1)/2$ for K outputs.
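For concreteness, these counts are the numbers of distinct monomials of the reservoir variables up to the given order, multiplied by K; a small illustrative calculation (the network sizes N are arbitrary examples):

```python
from math import comb

def num_params(N, K, order):
    """M = K * (number of distinct monomials of degree 1..order in N variables);
    order 1, 2, 3 corresponds to the l-, q-, and c-ESN, respectively."""
    return K * sum(comb(N + d - 1, d) for d in range(1, order + 1))

K = 3                                   # e.g., the three Lorenz variables
for N in (10, 20, 40):                  # illustrative network sizes
    print(N, [num_params(N, K, p) for p in (1, 2, 3)])
```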
Fig. 3.
Short-term prediction (open-loop). (a) The root mean square error (RMSE), $\sqrt{\langle \| v_t - y_t \|^2 \rangle}$, over the size of the network N. The red open and blue solid circles show the RMSE using the ℓ- and q-ESN, respectively. (b) The same as (a), but the horizontal axis shows the number of trained parameters M, where $M = KN$ for the ℓ-ESN and $M = KN + KN(N+1)/2$ for the q-ESN.
The target dynamical system is determined by the Lorenz equations,

$$\dot{x} = \sigma(y - x), \quad \dot{y} = x(\rho - z) - y, \quad \dot{z} = xy - \beta z,$$

with the standard parameters $(\sigma, \rho, \beta) = (10, 28, 8/3)$, whose time-$\Delta t$ map gives the map $g$ of the target dynamics (6) with $u = (x, y, z)$, calculated using the fourth-order Runge–Kutta method. As another example of a dynamical system, we show the prediction results for the Rössler system in the Supplementary Information.
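The time-$\Delta t$ map $g$ can be generated as follows (a standard fourth-order Runge–Kutta step; the step size is a placeholder, not the value used in the paper):

```python
import numpy as np

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0   # standard chaotic parameters

def lorenz(u):
    x, y, z = u
    return np.array([SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z])

def g(u, dt=0.01):
    """Time-dt map of the Lorenz flow via one fourth-order Runge-Kutta step."""
    k1 = lorenz(u)
    k2 = lorenz(u + 0.5 * dt * k1)
    k3 = lorenz(u + 0.5 * dt * k2)
    k4 = lorenz(u + dt * k3)
    return u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
```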
Short-term prediction (open-loop)
First, we study the short-term prediction, in particular, the $\tau$-ahead prediction of the Lorenz chaos; hence, when the input is $u_t$, the target output is $v_t = u_{t+\tau}$. Figure 2 shows the prediction results with a small ESN; we also use this small network size in the long-term prediction later, and it is quite small compared to commonly used sizes, such as the much larger reservoir used in ref. 5. The left panels of Fig. 2 show the time series of the target signals, $v_t$, depicted by the grey dashed lines, and those of the predictions, $y_t$, depicted by the red solid lines. The predictions by the ℓ-ESN and the q-ESN are shown in Fig. 2a and b, respectively.
Fig. 2.
Short-term prediction (open-loop). Panels (a) and (b) show the results using the ℓ- and q-ESN, respectively. The left and right panels show the time series of the target (grey dashed) and prediction (red solid) and the phase space structures of the orbits, respectively. The colors represent the local error of the prediction, $\| v_t - y_t \|$.
Although there is a discrepancy between the target signal $v_t$ and the prediction $y_t$ by the ℓ-ESN, it is difficult to distinguish the target signal from the prediction by the q-ESN; i.e., the q-ESN provides a more accurate prediction than the ℓ-ESN. The right panels of Fig. 2 show the phase space structures of the orbits $\{y_t\}$ corresponding to the time series in the left panels. The phase space structure of the orbit predicted by the ℓ-ESN is far from that of the true Lorenz attractor; however, the q-ESN can predict an orbit whose phase space structure is qualitatively the same as the true one, the butterfly wing shape. In summary, the q-ESN has better short-term predictive ability than the ℓ-ESN at this small network size.
To quantitatively compare the predictive ability of the ℓ-ESN and the q-ESN, we plot the root mean square error (RMSE), $\sqrt{\langle \| v_t - y_t \|^2 \rangle}$, in Fig. 3a as a function of the ESN size N. The RMSE values over 20 different realizations of each case are shown to investigate the dependence on the random number realizations used for the matrices A and B. Here, the red and blue circles represent the RMSE given by the ℓ-ESN and the q-ESN, respectively. While the RMSE values typically decrease with increasing ESN size, there is a huge gap between the RMSE values of the ℓ-ESN and the q-ESN, i.e., the RMSE of the q-ESN is significantly lower than that of the ℓ-ESN.
Figure 3a compares the RMSE of the ℓ- and q-ESNs at the same network size N; however, the number of training parameters M differs between the ℓ- and q-ESNs. Figure 3b is the same as Fig. 3a, but the horizontal axis is M. At fixed M, the RMSE values are almost the same for the ℓ- and q-ESN. Note that, in this comparison, the network sizes of the ℓ- and q-ESN are not the same: since M grows only linearly with N for the ℓ-ESN but quadratically for the q-ESN, the network size of the q-ESN is considerably smaller at the same M. The smaller network size of the q-ESN results in a larger scatter; however, even when the number of training parameters is fixed for the comparison, the best result achieved by the q-ESN is almost the same as, or remarkably better than, that of the ℓ-ESN, e.g., at the largest M shown.
Long-term prediction (closed-loop)
For the long-term prediction, both ESNs are trained for the one-step-ahead prediction task, i.e. $\tau = 1$, of the Lorenz chaos. Again, the target output is $v_t = u_{t+1}$. The output from the ESN is denoted by $y_t = \psi(r_t)$, where the output function is $\psi(r) = Wr$ for the ℓ-ESN and $[\psi(r)]_k = \sum_i W_{ki}[r]_i + \sum_{i,j} W^{(2)}_{kij}[r]_i[r]_j$ for the q-ESN. After training, we obtain $\psi(r_t) \approx u_{t+1}$. In the next step, we employ

$$r_{t+1} = f(r_t, \psi(r_t)) =: F_{\mathrm{a}}(r_t)$$

instead of (1), where the map $F_{\mathrm{a}}$ determines the autonomous dynamical system in the reservoir state space. This closed-loop method using the ESN, which we call the autonomous ESN for short, not only provides long-term prediction, but also has the surprising ability to reconstruct the target dynamics determined by $g$, as mentioned in the introduction. Here we fix the network size N and examine the effect of varying the functional form of $\psi$, i.e. using the ℓ-ESN or the q-ESN, on these abilities.
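Schematically, the closed loop replaces the external input by the ESN's own output; a sketch, assuming `f` and `psi` are the trained reservoir map and output function from above:

```python
def run_autonomous(f, psi, r, steps):
    """Iterate the autonomous map F_a(r) = f(r, psi(r)) and return the outputs."""
    preds = []
    for _ in range(steps):
        y = psi(r)        # prediction of the next target state, y_t ~ u_{t+1}
        r = f(r, y)       # feed the prediction back as the input (closed loop)
        preds.append(y)
    return preds
```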
First, Fig. 4 shows the results of long-term prediction using the autonomous ℓ-ESN. Since the results depend on the realization of the random matrices A and B, we show two cases with different random numbers in Fig. 4a, where the black dashed lines and the red solid lines represent the time series of the target Lorenz chaos and the prediction by the autonomous ℓ-ESN, respectively. We first use the open-loop method and then switch to the closed-loop method at $t = 0$. For both cases, the orbits generated by the autonomous ℓ-ESN deviate significantly from the target orbits after about three Lyapunov times, as the maximal Lyapunov exponent of the Lorenz system is $\lambda_{\max} \simeq 0.9$. The second case, shown in the right panel of Fig. 4a, suggests that the dynamics generated by the autonomous ℓ-ESN is not chaotic, but converges to a fixed point, which is unstable in the target Lorenz system.
Fig. 4.
Long-term prediction (closed-loop) using the autonomous ℓ-ESN. (a) The time series of the target (grey dashed) and prediction (red solid). The difference between the left and right panels lies in the realizations of the random numbers used for A and B. (b) The phase space structures of the orbits generated by the autonomous ℓ-ESN. Panels (a)–(j) correspond to the ten realizations of the random numbers used for A and B. The colors represent the local conjugacy error, $E(r_t)$, defined later in Eq. (18). The orbits shown have the same length in all panels.
To investigate the reconstruction ability, we show the phase space structure of the orbit generated by the autonomous ℓ-ESN for 10 different realizations in panels (a)–(j) of Fig. 4b, where the left and right panels of Fig. 4a correspond to cases (a) and (b), respectively. Obviously, the reconstruction ability of the autonomous ℓ-ESN is highly dependent on the realizations; in other words, it is not robust.
Figure 5 is the same as Fig. 4, but using the q-ESN instead of the ℓ-ESN. The autonomous q-ESN exhibits long-term prediction ability over about 8 Lyapunov times, a remarkable improvement over the ℓ-ESN with the same network size. Due to the intrinsic orbital instability of the Lorenz chaos, the orbits generated by the autonomous q-ESN inevitably deviate from the target orbits; however, the phase space structures shown in Fig. 5b are qualitatively equivalent to the Lorenz attractor. As will be quantitatively examined later, the reconstruction ability of the autonomous q-ESN is essentially independent of the realizations; in other words, it can robustly reproduce the dynamics of the Lorenz chaos. Similar results have been reported in previous studies, e.g. by Pathak et al.5; however, they used a relatively large network. We emphasize that the autonomous q-ESN achieves such long-term prediction and robust reconstruction with a tiny network.
Fig. 5.
Long-term prediction (closed-loop) using the autonomous q-ESN. The same as Fig. 4, but the q-ESN is used instead of the ℓ-ESN.
Quantitative comparison
For quantitative comparison, we introduce the mean conjugacy error (henceforth MCE) and the Kullback-Leibler divergence (henceforth KLD), which quantify the error between orbits and the error between invariant distributions, respectively. First, we define the MCE. As discussed in Hara and Kokubu7, the dynamical system determined by the autonomous ESN, $r_{t+1} = F_{\mathrm{a}}(r_t)$, is expected to be smoothly conjugate to the Lorenz dynamics, $u_{t+1} = g(u_t)$. The MCE quantifies the deviation from the expected conjugacy as follows. The relationship $r_t = h(u_t)$ implies

$$h(g(u_t)) = F_{\mathrm{a}}(h(u_t)). \qquad (16)$$

On the other hand, for the autonomous ESN, we have $\psi(r_t) \approx u_{t+1}$, leading to

$$\psi(F_{\mathrm{a}}(r_t)) \approx g(\psi(r_t)). \qquad (17)$$

Considering the map $\psi$ as the conjugacy map, we define the conjugacy error at the reservoir state $r$ by

$$E(r) = \| \psi(F_{\mathrm{a}}(r)) - g(\psi(r)) \|, \qquad (18)$$

where $g$ is approximated by the four-stage, fourth-order Runge-Kutta method. The long-time average of $E(r_t)$ along the orbit $\{r_t\}$ generated by $F_{\mathrm{a}}$ defines the MCE,

$$\mathrm{MCE} = \langle E(r_t) \rangle. \qquad (19)$$
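Combining the pieces above, the conjugacy error (18) and its long-time average (19) can be estimated along an autonomous orbit as in the following sketch (assuming `f`, `psi`, and the time-$\Delta t$ map `g` defined earlier; the orbit and transient lengths are placeholders):

```python
import numpy as np

def mce(f, psi, g, r0, steps=10000, transient=1000):
    """MCE, Eq. (19): long-time average of E(r) = |psi(F_a(r)) - g(psi(r))|,
    Eq. (18), where F_a(r) = f(r, psi(r)) is the autonomous reservoir map."""
    r, errors = r0, []
    for t in range(steps):
        r_next = f(r, psi(r))                  # F_a(r_t)
        if t >= transient:                     # discard the initial transient
            errors.append(np.linalg.norm(psi(r_next) - g(psi(r))))
        r = r_next
    return np.mean(errors)
```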
The colors of the orbits in Figs. 4b and 5b represent the conjugacy error, where the values on the color bars correspond to $E(r_t)$. Obviously, compared to the autonomous ℓ-ESNs (Fig. 4b), the colors of the orbits generated by the autonomous q-ESNs (Fig. 5b) are blue almost everywhere on the attractor, suggesting successful conjugacy to the target Lorenz dynamics.
While the MCE quantifies the reconstruction ability of the autonomous ESN, it is not perfect. For instance, if the orbit generated by $F_{\mathrm{a}}$ converges to a saddle point along the stable manifold of the target system, the MCE may take a small value. However, the saddle point cannot be an attractor; therefore, in this case, the autonomous ESN fails to reproduce the target attractor even though the MCE is small. To shed light on the ergodic aspect of the dynamics, we compare the invariant probability measures through the KLD as a quantification complementary to the MCE.
Figure 6 shows the probability density functions (PDFs) of the variable x. The grey dashed and red solid lines represent the PDF p(x) calculated from the target Lorenz chaos data and the PDF q(x) calculated from the autonomous ESN data, respectively. The two panels of Fig. 6a show the results of the autonomous ℓ-ESN, corresponding to the two cases shown in Fig. 4a. Although the time series shown in the left panel of Fig. 4a and the phase space structure shown in Fig. 4b(a) are similar to those of the target Lorenz system, the PDF q(x) shown in the left panel of Fig. 6a differs from p(x). The time series shown in the right panel of Fig. 4a converges to the fixed point, so the PDF q(x) shown in the right panel of Fig. 6a has a delta-function-like form and apparently differs from p(x).
Fig. 6.
Probability density functions (PDFs) of the variable x. The dashed lines show the PDF of the target Lorenz system, p(x). The red solid lines show the PDF q(x) calculated from data generated by (a) the ℓ-ESN and (b) the q-ESN. The two panels of (a) and (b) correspond to the cases shown in Figs. 4a and 5a, respectively.
The two panels of Fig. 6b show the results of the autonomous q-ESN, corresponding to the two cases shown in Fig. 5a. The PDFs p(x) and q(x) are quite similar, suggesting that the autonomous q-ESN can reproduce the global structure of the target Lorenz attractor, in addition to the accurate prediction along the orbit verified by the MCE, which is local in phase space. The values of the KLD, defined as

$$D_{\mathrm{KL}}(p \| q) = \int p(x) \log \frac{p(x)}{q(x)}\, dx,$$

for the PDFs q(x) shown in the left and right panels of Fig. 6b are significantly smaller than those in the case of the ℓ-ESN, e.g. that for the PDF q(x) shown in the left panel of Fig. 6a.
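Numerically, $D_{\mathrm{KL}}(p\|q)$ can be estimated from normalized histograms of the two time series of x; a sketch, in which the bin number and the small constant regularizing empty bins are ad hoc choices:

```python
import numpy as np

def kld(x_target, x_model, bins=100, eps=1e-12):
    """Estimate D_KL(p||q) = sum_i p_i log(p_i / q_i) from histograms of x."""
    lo = min(x_target.min(), x_model.min())
    hi = max(x_target.max(), x_model.max())
    p, edges = np.histogram(x_target, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(x_model, bins=bins, range=(lo, hi), density=True)
    dx = edges[1] - edges[0]
    p, q = p * dx + eps, q * dx + eps      # bin probabilities, regularized
    return float(np.sum(p * np.log(p / q)))
```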
Figure 7 summarizes the quantitative comparison. The top and bottom panels show the MCE and KLD values over ten realizations of the random matrices A and B, respectively. The left points, labelled “1” on the horizontal axis, represent the results for the autonomous ℓ-ESN, excluding the two extremely poor results shown in Fig. 4b(e) and (f). The centre points, labelled “2” on the horizontal axis, represent the results for the autonomous q-ESN, illustrating that the values of both the MCE and the KLD are significantly smaller than those of the ℓ-ESN. We immediately notice this remarkable reconstruction ability of the q-ESN. Moreover, we find that both quantities are less dependent on the realizations of the random matrices compared to the ℓ-ESN, i.e. the q-ESN improves not only the accuracy but also the robustness of the reconstruction results.
Finally, we remark on the results beyond the q-ESN, i.e. the cubic-form ESN (c-ESN), which includes the terms of the reservoir variables up to third order, $[r_t]_i [r_t]_j [r_t]_l$, and the corresponding output weights, $W^{(3)}$, which are trained to approximate the fourth term in the Taylor expansion (10) as

$$\sum_{i,j,l} W^{(3)}_{kijl} [r_t]_i [r_t]_j [r_t]_l \approx \frac{1}{3!} \sum_{i,j,l} \partial_i \partial_j \partial_l [F(r^{\ast})]_k\, \delta r_i\, \delta r_j\, \delta r_l, \qquad (20)$$

up to lower-order terms.
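The cubic readout is obtained by extending the feature map with the distinct third-order monomials; a generic sketch for an arbitrary order, exploiting the symmetry of the weight tensors:

```python
import numpy as np
from itertools import combinations_with_replacement

def polynomial_features(R, order=3):
    """Map states (T, N) to all distinct monomials of degree 1..order;
    order=3 gives the feature set of the c-ESN."""
    N = R.shape[1]
    cols = [np.prod(R[:, list(idx)], axis=1)
            for d in range(1, order + 1)
            for idx in combinations_with_replacement(range(N), d)]
    return np.stack(cols, axis=1)
```

The same ridge regression as before then yields $W$, $W^{(2)}$, and $W^{(3)}$ simultaneously.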
We show the MCE and the KLD for the autonomous c-ESN at the right points, labelled “3” on the horizontal axis of Fig. 7. As expected, the accuracy of the reconstruction by the autonomous c-ESN is superior to that of the ℓ-ESN and the q-ESN. Furthermore, we find again that the c-ESN improves not only the accuracy but also the robustness of the reconstruction results, compared to those of the ℓ-ESN and the q-ESN.
Conclusion and discussion
Inspired by the seminal works on the mathematical analysis of RC7,17, we have proposed a novel method of RC with a generalized readout, with a theoretical guarantee of its high computational capabilities based on generalized synchronization. Numerical studies on the Lorenz and Rössler chaos have uncovered significant short-term (Figs. 2 and 3) and long-term (Figs. 4, 5, 6 and 7) prediction and reconstruction abilities of the q-ESN, with improved robustness. The MCE and KLD have quantified these properties complementarily, i.e. from the notions of orbit and distribution. By including the higher-order approximations, we have revealed a “hierarchical” improvement in reconstruction ability and robustness; i.e. the c-ESN is superior to the q-ESN, which is superior to the ℓ-ESN (Fig. 7).
As future extensions of the present work, we discuss the following three directions: mathematical analysis, machine learning, and physical implementation. From the mathematical analysis of RC7,17, it may be natural that introducing the generalized readout improves prediction ability. However, we unexpectedly observed an improvement in the robustness of the reconstruction ability. Further analysis of the reservoir dynamics is crucial; unveiling fundamental properties such as the topological conjugacy and the mechanism behind the enhanced robustness will have major implications for several fields, including machine learning, where stabilizing the dynamics of neural networks by adding noise and normalization is one of the critical issues20.
One of the key applications of the generalized readout is physical RC; in many physical systems, such as photonic integrated circuits10, only a small number of physical degrees of freedom is available9–13. The hard challenge is to find a way to exploit these low-dimensional dynamics for computation. We emphasize that our generalized readout paves the way, and this is why we focus on small networks in the numerical study. For future work along this line of research21,22, combining linear physical systems with the generalized readout may be effective.
An apparent drawback of using the generalized readout is the large number of parameters to be trained, although the training remains within the linear learning framework. Therefore, based on the hierarchical improvement in accuracy with an increasing number of parameters (Fig. 7), the balance between accuracy and learning cost should be determined for each application. For the large number of parameters to be trained, transfer learning23,24 may be efficient: since linear regression is used, the trained parameters, i.e. the generalized readout weights, can be reused with a minor correction for similar tasks, e.g. predicting chaotic dynamics that are structurally stable.
The autonomous RC with generalized readout achieves accurate predictions, e.g. longer than 8 Lyapunov times (Fig. 5); however, such a prediction eventually fails due to the orbital instability. Toward practical predictions of, for instance, fluid turbulence25, combining the autonomous RC with generalized readout and data assimilation may be essential in future work. Also, we have assumed that observational data of all state variables are available for training. It is important to investigate the effectiveness of the generalized readout in the case where only partial observation data are available, as studied in ref. 26. In practice, the prediction of high-dimensional dynamical systems, such as fluid turbulence23,27, is crucial. We have found that the RC with generalized readout is effective for some high-dimensional chaos, which will be reported elsewhere.
The concept of the generalized readout does not require the RC framework; rather, it may be essential in a more general machine learning context, e.g. training recurrent neural networks. Although the main claim of this paper is to propose the mathematical framework and the generalized readout method, a systematic comparison across a variety of neural networks, e.g. with deep architectures28, on more general tasks may be valuable and is left for future study. Studies of neural connections similar to the q- and c-ESN may also be interesting in the context of learning mechanisms in biological brains. Whatever the direction, the concepts from dynamical systems theory used in the above discussion, such as synchronization, orbital instability, and conjugacy, will shed light on guiding principles for future studies.
Acknowledgements
We thank M. Hara, H. Kokubu, S. Sunada, T. Yoneda, S. Matsumoto, S. Goto, Y. Saiki, and J. A. Yorke for their insightful comments and encouragement. We would also like to thank C. P. Caulfield and DAMTP, University of Cambridge, for providing a great environment in which this work was completed on M.I.’s sabbatical. This work was partially supported by JSPS Grants-in-Aid for Scientific Research (Grants Nos. 22K03420, 22H05198, 20H02068 and 19KK0067).
Author contributions
M.I. and A.O. wrote the main manuscript text and prepared all the figures. All authors reviewed the manuscript.
Data availability
The program used to generate the data is provided within the supplementary information file.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-024-81880-3.
References
- 1. Jaeger, H. The “echo state” approach to analysing and training recurrent neural networks, with an erratum note. GMD Technical Report 148, German National Research Center for Information Technology, Bonn, Germany (2001).
- 2. Maass, W., Natschläger, T. & Markram, H. Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Comput. 14, 2531–2560 (2002).
- 3. Nakajima, K. & Fischer, I. Reservoir Computing: Theory, Physical Implementations, and Applications. Natural Computing Series (Springer, Berlin, 2021).
- 4. Jaeger, H. & Haas, H. Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication. Science 304, 78–80 (2004).
- 5. Pathak, J., Lu, Z., Hunt, B., Girvan, M. & Ott, E. Using machine learning to replicate chaotic attractors and calculate Lyapunov exponents from data. Chaos 27, 121102 (2017).
- 6. Kim, J., Lu, Z., Nozari, E., Pappas, G. & Bassett, D. Teaching recurrent neural networks to infer global temporal structure from local examples. Nat. Mach. Intell. 3, 316–323 (2021).
- 7. Hara, M. & Kokubu, H. Learning dynamics by reservoir computing. J. Dyn. Differ. Equ. 36, 515–540 (2022).
- 8. Kobayashi, M., Nakai, K., Saiki, Y. & Tsutsumi, N. Dynamical system analysis of a data-driven model constructed by reservoir computing. Phys. Rev. E 104, 044215 (2021).
- 9. Wang, S. et al. Echo state graph neural networks with analogue random resistive memory arrays. Nat. Mach. Intell. 5, 104–113 (2023).
- 10. Takano, K. et al. Compact reservoir computing with a photonic integrated circuit. Opt. Express 26, 29424–29439 (2018).
- 11. Sunada, S. & Uchida, A. Photonic reservoir computing based on nonlinear wave dynamics at microscale. Sci. Rep. 9, 19078 (2019).
- 12. Appeltant, L. et al. Information processing using a single dynamical node as complex system. Nat. Commun. 2, 468 (2011).
- 13. Tanaka, G. et al. Recent advances in physical reservoir computing: A review. Neural Netw. 115, 100–123 (2019).
- 14. Sande, G., Brunner, D. & Soriano, M. Advances in photonic reservoir computing. Nanophotonics 6, 561–576 (2017).
- 15. Inubushi, M. & Yoshimura, K. Reservoir computing beyond memory-nonlinearity trade-off. Sci. Rep. 7, 10199 (2017).
- 16. Inubushi, M., Yoshimura, K., Ikeda, Y. & Nagasawa, Y. On the characteristics and structures of dynamical systems suitable for reservoir computing. In Reservoir Computing: Theory, Physical Implementations, and Applications, 97–116 (Springer, 2021).
- 17. Grigoryeva, L., Hart, A. & Ortega, J. Chaos on compact manifolds: Differentiable synchronizations beyond the Takens theorem. Phys. Rev. E 103, 062204 (2021).
- 18. Herteux, J. & Räth, C. Breaking symmetries of the reservoir equations in echo state networks. Chaos 30, 123142 (2020).
- 19. Bollt, E. On explaining the surprising success of reservoir computing forecaster of chaos? The universal machine learning dynamical system with contrast to VAR and DMD. Chaos 31, 013108 (2021).
- 20. Wikner, A. et al. Stabilizing machine learning prediction of dynamics: Novel noise-inspired regularization tested with reservoir computing. Neural Netw. 170, 94–110 (2024).
- 21. Shougat, M., Li, X., Mollik, T. & Perkins, E. An information theoretic study of a Duffing oscillator array reservoir computer. J. Comput. Nonlinear Dyn. 16, 081004 (2021).
- 22. Coulombe, J., York, M. & Sylvestre, J. Computing with networks of nonlinear mechanical oscillators. PLoS ONE 12, e0178663 (2017).
- 23. Inubushi, M. & Goto, S. Transfer learning for nonlinear dynamics and its application to fluid turbulence. Phys. Rev. E 102, 043301 (2020).
- 24. Sakamaki, R., Kanno, K., Inubushi, M. & Uchida, A. Transfer learning based on photonic reservoir computing using semiconductor laser with optical feedback. IEICE Proc. Ser. 71, 229–232 (2022).
- 25. Inubushi, M., Saiki, Y., Kobayashi, M. & Goto, S. Characterizing small-scale dynamics of Navier–Stokes turbulence with transverse Lyapunov exponents: A data assimilation approach. Phys. Rev. Lett. 131, 254001 (2023).
- 26. Storm, L., Gustavsson, K. & Mehlig, B. Constraints on parameter choices for successful time-series prediction with echo-state networks. Mach. Learn. Sci. Technol. 3, 045021 (2022).
- 27. Matsumoto, S., Inubushi, M. & Goto, S. Stable reproducibility of turbulence dynamics by machine learning. Phys. Rev. Fluids 9, 104601 (2024).
- 28. Wang, R., Kalnay, E. & Balachandran, B. Neural machine-based forecasting of chaotic dynamics. Nonlinear Dyn. 98, 2903–2917 (2019).