Proceedings of the National Academy of Sciences of the United States of America
. 2023 Mar 16;120(12):e2216030120. doi: 10.1073/pnas.2216030120

Network inference from short, noisy, low time-resolution, partial measurements: Application to C. elegans neuronal calcium dynamics

Amitava Banerjee a,b,1, Sarthak Chandra c,d, Edward Ott a,b,e
PMCID: PMC10041139  PMID: 36927154

Significance

Network link inference from measured time series data is an important problem in several fields of study. Techniques for doing this typically score each potential network link; ideally, these scores would enable clear discrimination between linked and unlinked node pairs. However, real-world data, which typically consist of short, noisy time series with poor time resolution, lead to poor link discriminability. Using traditional and recently developed link inference techniques, we characterize scenarios leading to poor link discrimination through tests on synthetic data from chaotic systems, as well as on experimental neuronal activity data from C. elegans, for which the ground-truth neuronal synaptic connectivity network is known. Furthermore, we test a surrogate data method that provides quantitative statistical information on the confidence levels of inferred links.

Keywords: network inference, time series analysis, surrogate data techniques, causal inference, neural networks

Abstract

Network link inference from measured time series data of the behavior of dynamically interacting network nodes is an important problem with wide-ranging applications, e.g., estimating synaptic connectivity among neurons from measurements of their calcium fluorescence. Network inference methods typically begin by using the measured time series to assign to any given ordered pair of nodes a numerical score reflecting the likelihood of a directed link between those two nodes. In typical cases, the measured time series data may be subject to limitations, including limited duration, low sampling rate, observational noise, and partial nodal state measurement. However, it is unknown how the performance of link inference techniques on such datasets depends on these experimental limitations of data acquisition. Here, we utilize both synthetic data generated from coupled chaotic systems and experimental data obtained from Caenorhabditis elegans neural activity to systematically assess the influence of data limitations on the character of scores reflecting the likelihood of a directed link between a given node pair. We do this for three network inference techniques: Granger causality, transfer entropy, and a machine learning-based method. Furthermore, we assess the ability of appropriate surrogate data to determine statistical confidence levels associated with the results of link-inference techniques.


The task of reconstructing complex networks solely from observations of their nodal state time series dynamics has applications in a wide range of problems. Examples include inferring neuronal networks from neural recordings (Fig. 1 A–C) (1, 2), predator–prey interaction networks from population data of different species in an ecosystem (3), gene and protein interaction networks from protein concentration and gene expression data of biochemical systems (4), climatic interaction networks from global temperature and ocean circulation data (5, 6), etc. A number of computational techniques have been proposed for this purpose, e.g., Granger causality (7–9), transfer entropy (10–13), Bayesian techniques (14, 15), machine learning-based techniques (16–20), among others (21–23). Typically these techniques take the time series of the nodal states of the unknown network as input and use it to assign to each potential directed link between a pair of nodes (i, j) of the network a link score Sij that is supposedly reflective of the probability of existence of an actual network connection from node i to node j. We refer to the distribution of scores between linked node pairs as the link-score distribution and the distribution of scores between unlinked node pairs as the nonlink-score distribution. Ideally, these network inference techniques would yield a bimodal distribution of all potential link scores Sij, with the link-score distribution having relatively larger scores and the nonlink-score distribution having relatively smaller scores, with a clear gap in between these two types of scores (Fig. 2B), so that linked and unlinked node pairs are easy to separate based solely on their respective link scores. In practice, however, it is often the case that the link-score distribution and the nonlink-score distribution overlap (Fig. 2C).
In such cases, it becomes useful i) to understand the relationships between the experimental factors characterizing the measurements of the network nodal state time series, the link inference techniques, and the distribution of link scores and ii) to make better inferences from the results of network inference link scoring.
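If the ground-truth network were known, the degree of overlap between the two score distributions could be summarized by the probability that a randomly chosen linked pair outscores a randomly chosen unlinked pair (the area under the ROC curve). This is not a procedure from the paper, just a minimal illustrative sketch:

```python
import numpy as np

def score_separability(link_scores, nonlink_scores):
    """Rank-based separability (ROC AUC) of link vs. nonlink scores:
    1.0 for perfectly separated bimodal distributions, ~0.5 when the
    two distributions fully overlap. Ties count one half."""
    link = np.asarray(link_scores, dtype=float)
    nonlink = np.asarray(nonlink_scores, dtype=float)
    # Pairwise differences between every link score and every nonlink score
    diff = link[:, None] - nonlink[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()
```

With perfectly separated scores, e.g., `score_separability([0.9, 0.8], [0.1, 0.2])`, this evaluates to 1.0.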

Fig. 1.

(A–C) Schematic of the network inference task from calcium fluorescence dynamics of individual C. elegans neurons, showing (A) identified neurons in a live worm, (B) representative full time series available from three identified neurons from ref. 24, and (C) the 8-neuron ground-truth synaptic network of C. elegans from ref. 25.

Fig. 2.

(A) Example time series data of the three variables (x, y, z) of one Lorenz oscillator (Eqs. 1–3) which is a part of the network in Fig. 1C. For the 8-neuron network in Fig. 1C, two examples of link-score distributions with (B) separated scores from Lorenz oscillator time series with T = 3,000, σObs = 0.01, Δt = 0.02, δt = 0.02, and complete nodal states and full network sampled, and (C) nonseparated scores from C. elegans calcium fluorescence dynamics from ref. 24.

To this end, primarily focusing on a machine learning link inference technique (18), we first conduct tests to examine the factors that affect the score distribution. This technique, developed in refs. 18 and 17, involves training an artificial neural network called a “reservoir computer” with the measured time series and then using the trained reservoir as an in silico model of the experimental system to evaluate the network structure between the nodes of the experimental system. We describe the construction and implementation of this method in detail in Methods section 6.

In general, to test the validity and efficacy of link inference techniques, it is essential to apply them in a scenario where the ground truth of the underlying network is known. Thus, we focus our attention on two types of network dynamical systems. One is the neural system of the Caenorhabditis elegans worm, a model organism that we choose due to the availability of its complete neural connectome as obtained from previous microscopy studies (25, 26), in addition to whole-brain calcium fluorescence imaging of individual neurons (24, 27, 28). The other dataset is obtained by numerical simulations in which we take each node to be a Lorenz oscillator (29). In this case, we simulate and compare results from two different kinds of nodal state time series measurements: i) the entire three-component vector state of each node is assumed to be measured, and ii) a single composite variable per node is taken as the nodal observation (representative of the case of partial measurements of nodal states). An example time series of such a nodal state is shown in Fig. 2A.

As we will show, the overlap between the link-score distribution and the nonlink-score distribution can have various causes, including 1) low temporal resolution of sampling (e.g., ∼3 samples/second for the C. elegans time series that we use, examples shown in Fig. 1B); 2) the presence of observational noise; 3) short length of the available time series data (e.g., ∼0.3 s × 3,000 time points for the C. elegans time series in ref. 24, Fig. 1B); and 4) availability of only partial, indirect measurements (e.g., instead of accessing the full biophysical state of each neuron, we might only have measurements of the fluorescence intensity of the calcium ions involved in the neural activity); among others, e.g., the presence of strong synchrony or correlations among the activities of different neurons, making connectivity inference prone to spurious false-positive connections between neurons with correlated activities (17, 18). In section 2, we illustrate the effect of such factors on the separability of the distribution functions of link and nonlink scores in the Lorenz network model with chaotic dynamics, using the example of the machine learning network inference method of ref. 18. When these distributions are no longer separable, using scores to estimate the underlying network structure is nontrivial. In section 4, we demonstrate that in these cases, appropriate surrogate time series data can be generated, in which causal relationships in the experimental data between a given pair of nodes are destroyed, and that this surrogate data can be used to estimate null-hypothesis score distributions, which mimic the distributions of the nonlink scores of the original time series data (30–37). We show that such surrogate data enable an estimate of the P-value associated with any particular node pair being linked in the underlying network, and we do this for the machine learning method, the Granger causality method (7, 8), and the transfer entropy method (10, 13).
An appropriate P-value can then be chosen as a link-score cutoff to partially reconstruct an estimate of the unknown network at a confidence level corresponding to the chosen cutoff.

The main results/contributions of this paper are as follows:

  1. We systematically study the effects of training time length, observational noise variance, sampling resolution, and incomplete nodal state measurements on the performance of network inference.

  2. We investigate how the effectiveness of network link inference techniques depends on such factors as short, noisy, low-sampling-resolution time series data and partial nodal state measurements.

  3. Using C. elegans as a system with a known ground-truth synaptic network, we demonstrate the efficacy of link inference and surrogate data methods in a real-world setting.

Physical Systems and Datasets Used

Calcium Imaging Time Series Data from C. elegans.

To test the performance of techniques for inferring neural connectomes from neural recordings, we apply them to the publicly available (24) dataset of whole-brain calcium-imaging time series of individual neurons of a freely moving C. elegans worm. We then compare our inferred connectivity among the identified neurons to the well-established ground-truth connectome for C. elegans available from refs. 25 and 26. While data are available for all neurons, a large fraction of the neurons have low magnitudes of activity over the duration of neural recording. Thus, we focus on the subnetwork consisting of the 16 most active motor neurons and the synaptic connections between them. Furthermore, due to the left–right symmetry of the C. elegans worm (25, 26), the network of neuronal connections of these 16 neurons is “folded-over” to form a network of connections between the corresponding 8 left–right neuron pairs. These pairs are conventionally labeled AIB, AVA, AVB, RIB, RIM, RIV, SMDD, and SMDV. We take the link score of each potential connection in the folded network to be the average over the corresponding left and right connections in the original 16-node network. Considering the folded-over network, our ground-truth network has 8 nodes, with 30 directed pairs of linked nodes and 26 pairs of unlinked nodes (Fig. 1C). We henceforth refer to this 8-node network as the folded C. elegans network.
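The fold-over step can be sketched as follows. The indexing convention and the choice to average over all four left/right combinations are our assumptions; the text specifies only an average over the corresponding left and right connections.

```python
import numpy as np

def fold_scores(S16, pairs):
    """Fold a 16x16 link-score matrix (S16[i, j] scores the directed
    link i -> j) over left-right neuron pairs.  `pairs` lists, for each
    folded node, its (left, right) indices in the 16-node network.  Each
    folded score averages the connections between the two pairs
    (averaging all four combinations is an assumption here)."""
    n = len(pairs)
    S8 = np.zeros((n, n))
    for a, (la, ra) in enumerate(pairs):
        for b, (lb, rb) in enumerate(pairs):
            S8[a, b] = 0.25 * (S16[la, lb] + S16[la, rb]
                               + S16[ra, lb] + S16[ra, rb])
    return S8
```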

Coupled Lorenz Oscillator Network Simulations.

In order to test the effect of different experimental and sampling conditions, such as sampling frequency, observational noise, etc., we also generate time series data from a set of 8 Lorenz oscillators, coupled using the same network topology as the folded C. elegans network (Fig. 1C). The equations describing our example network of Lorenz oscillators are as follows:

\frac{dx_i}{dt} = -\alpha x_i + \frac{\alpha}{2} \sum_j A_{ij}\, y_j + \sigma_{\mathrm{Dyn}}\, \xi_i^{x}(t), \qquad [1]
\frac{dy_i}{dt} = x_i(\rho_i - z_i) - y_i + \sigma_{\mathrm{Dyn}}\, \xi_i^{y}(t), \qquad [2]
\frac{dz_i}{dt} = x_i y_i - \beta z_i + \sigma_{\mathrm{Dyn}}\, \xi_i^{z}(t), \qquad [3]

for i = 1, 2, ..., 8, where (x_i, y_i, z_i) constitutes the nodal state of the i-th node, and A_ij is a binary (0 or 1) adjacency matrix corresponding to the folded C. elegans network (shown in Fig. 1C). The coupling, represented by the second term on the right side of Eq. 1, is from the y variable of node j to the x variable of node i. The parameters α and β are chosen to have the same values used by Lorenz, α = 10 and β = 8/3, and, for each node i, the parameter ρ_i is sampled uniformly from the range [30, 70] to increase heterogeneity across nodes. The dynamical noise strength parameter is σDyn = 10^−3, and the noise is assumed to be white, ⟨ξ_i^p(t) ξ_j^q(t′)⟩ = 2δ_pq δ_ij δ(t − t′). Thus, in this setup, the y-variable of node j directly affects the x-variable of node i if A_ij = 1. Starting from a random initial condition, we integrate the system using a fourth-order Runge–Kutta method with step size δt = 0.02 and discard the initial transients. In addition, to mimic realistic sampling, we add white Gaussian observational noise of mean zero and SD σObs independently at each sampled data point and sample the time series of the system with a time-step Δt (chosen to be a multiple of the Runge–Kutta simulation time-step δt). We consider two modes of measurement of the dynamics: i) the case of full nodal state measurements, where the full state (x_i, y_i, z_i) is sampled from each of the nodes; in this case, we calculate the link score from y_i to x_j as the relevant quantity measuring the strength of the connection from node i to node j. ii) The case where a single composite variable, s_i = x_i^2/⟨x_i^2⟩_t + y_i^2/⟨y_i^2⟩_t + z_i^2/⟨z_i^2⟩_t, is measured, where ⟨⋅⟩_t denotes the time average of the respective variable.
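As a concrete illustration of this setup, the following sketch simulates Eqs. 1–3 and both measurement modes. It is our own minimal reconstruction: the adjacency matrix is a random stand-in for the folded C. elegans network of Fig. 1C, and the dynamical noise is added once per integration step in a simple Euler–Maruyama fashion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Parameters from the text. The 8x8 adjacency matrix A used here is a
# random stand-in (an assumption), not the folded C. elegans network.
N, alpha, beta = 8, 10.0, 8.0 / 3.0
rho = rng.uniform(30, 70, size=N)            # heterogeneous rho_i
A = (rng.random((N, N)) < 0.4).astype(float)
np.fill_diagonal(A, 0)
sigma_dyn, sigma_obs, dt = 1e-3, 0.01, 0.02

def deriv(state):
    """Right-hand side of Eqs. 1-3 (deterministic part); state is (3, N)."""
    x, y, z = state
    dx = -alpha * x + (alpha / 2) * A @ y    # coupling: y_j -> x_i
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return np.array([dx, dy, dz])

def rk4_step(state, dt):
    """One RK4 step plus per-step white dynamical noise; the correlation
    <xi xi'> = 2*delta gives a per-step noise std of sqrt(2*dt)."""
    k1 = deriv(state)
    k2 = deriv(state + 0.5 * dt * k1)
    k3 = deriv(state + 0.5 * dt * k2)
    k4 = deriv(state + dt * k3)
    new = state + (dt / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return new + sigma_dyn * np.sqrt(2 * dt) * rng.standard_normal(state.shape)

state = rng.standard_normal((3, N))
for _ in range(2000):                        # discard initial transients
    state = rk4_step(state, dt)
traj = np.empty((3000, 3, N))                # T = 3,000 samples, Delta t = dt
for t in range(3000):
    state = rk4_step(state, dt)
    traj[t] = state

# Observational noise added independently at each sampled data point
obs = traj + sigma_obs * rng.standard_normal(traj.shape)

# Composite single variable per node: s_i = x^2/<x^2> + y^2/<y^2> + z^2/<z^2>
x2, y2, z2 = (traj ** 2).mean(axis=0)
s = traj[:, 0] ** 2 / x2 + traj[:, 1] ** 2 / y2 + traj[:, 2] ** 2 / z2
```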

Link-Score Distributions for the Coupled Lorenz Oscillator Network

As shown in Fig. 2C and in section 4, the C. elegans data that we use yield link-score distributions that strongly differ from what we call the ideal case of Fig. 2B and are instead more like Fig. 2C. Thus, we were motivated to ask why this happens. This section addresses this question through an investigation of the Lorenz network model system of section 1.2. We expect that the general qualitative behavior we find in this section applies broadly and is not limited by particular features of the example treated. For example, we expect that the main results of this section will be robust to modifications of the example, such as incorporation of link strength heterogeneity (18), time delays along links (17), etc.

We assume that the states of the dynamical system are sampled at a time interval Δt at times nΔt with n = {1, 2, …, T}. In particular, we investigate the effects of data sampling time interval, length of time series, and the number of sampled variables on the distribution of link inference scores. (In a real-world setting, such as the C. elegans dataset we considered, the data characteristics are often fixed and are largely determined by the system and the experimental setup recording the data.) Our main results regarding this case are summarized in Fig. 3 A–O, which plot histograms of scores obtained using the machine learning technique. In each panel of Fig. 3, the histogram plotted in black is for the scores of all directed node pairs, the histogram plotted in red is for the scores of directed node pairs that are linked (i.e., the link-score distribution), and the histogram plotted in blue is for directed node pairs that are not linked (i.e., the nonlink-score distribution). (The histograms plotted in green will not be discussed in this section but will be discussed in section 4.)

Fig. 3.

Number of directed network node pairs vs. their scores for all such pairs (all-score distribution, plotted in black), for pairs corresponding to actual links (plotted in red), for pairs not corresponding to links (plotted in blue), and for the surrogate data (plotted in green) from the coupled Lorenz oscillators on the network shown in Fig. 1C, with (A–C) 3 variables and (D–F) 1 variable per node sampled, with T = 3,000, δt = 0.02, σObs = 0.01; and 3 variables per node sampled, with (G–I) Δt = 0.02, δt = 0.02, σObs = 0.01, (J–L) Δt = 0.02, δt = 0.02, T = 3,000, and (M–O) Δt = 0.02, δt = 0.02, T = 3,000, σObs = 0.01. Note that the black curves overlap the blue and red curves in cases where the link-score and nonlink-score histograms separate completely, e.g., in panels (A), (H–K), and (M–O).

We start with the parameter values T = 3,000 time points, observational noise strength σObs = 0.01, the data sampling time interval Δt initially chosen to be the same as the numerical time-step δt = 0.02, and all 3 variables of each Lorenz oscillator measured. Fig. 3A shows a typical histogram approximation of the link-score distribution for this example network obtained by application of the machine learning technique described in the Methods section (18). The histograms plotted are constructed by pooling the scores obtained through 10 random realizations of the reservoir computer (see Methods section 6.1 for additional details). We observe from Fig. 3A that we get a distribution with a clear gap between the low-score component and the high-score component. In this case, judging scores above and below the gap to correspond to links and nonlinks, respectively, we obtain perfect network inference. We will henceforth refer to this case as the “canonical parameter case.”

To see how the network inference performance varies with change of parameters away from our canonical parameter case, we first decrease the sampling rate by increasing the sampling time Δt for the system. There are two natural ways of doing this. One is by keeping the total time duration of the time series constant as Δt is varied. The other is by keeping the number of sampled points constant as Δt is varied. We consider the latter case. We find that with increasing sampling times, the magnitude of the scores diminishes, and the separation between the link-score and nonlink-score distributions is reduced, resulting in progressive deterioration of the performance of the link inference technique (Fig. 3 B and C). We interpret this reduction in performance with increasing Δt as arising from the causal effect of state changes at a node spreading to other nodes to which it is not directly connected by a single link. That is, with larger Δt, the causal effect of a state change at one node can propagate to another node via multiple-link paths. This effectively yields a larger-than-expected score for disconnected node pairs, leading to a merger of the nonlink-score distribution with the link-score distribution. In contrast, at the lower Δt value (Fig. 3A), we postulate that such a causal effect has time to travel only one link, and, thus, a causal effect of a state change at time t at node i on the state of node j at time t + Δt implies a network link i → j. We note that the largest Lyapunov exponent of the uncoupled Lorenz system is about 1, so a time of order 1 may be regarded as a characteristic time for variation of the state variables x, y, and z, against which the sampling interval Δt may be compared.

We now consider the variation when the number of sampled points, T, is varied, with the sampling time Δt of the input time series held fixed. We first note, from panels (G–I) of Fig. 3, that, for a fixed sampling time Δt, the separation between the link and nonlink components of the distribution decreases as the time duration of the input data, TΔt, decreases. Since the time duration of the input data also decreases with Δt as we move from panel (C) to panel (B) to panel (A) in Fig. 3, the separation of the link and nonlink distributions would improve even more than shown in panels (A)–(C) if the duration of the time series were instead held fixed.

We next study the case where we sample only a single variable (as described at the end of section 1.2) from each Lorenz oscillator. In this case, we do not find any parameter regime in which the score distribution (black curves in Fig. 3 D–F) is bimodal. Since, in applications, only the black curve is observed (the link-score [red] and nonlink-score [blue] distributions are unknown to an experimenter), link inference based solely on the black curves of Fig. 3 D–F is not possible. On the other hand, from Fig. 3D, taking into account the red and blue plots, we see that the scores in the large-score tail of the black-plotted distribution predominantly correspond to actual links. However, as Δt is increased (Fig. 3 E and F), this feature becomes weaker. This behavior suggests that, as we will soon discuss, there may be potential for extracting useful information from measurements like those resulting in Fig. 3D.

We also vary the length T of the training time series and the SD of observational noise σObs for the case where all 3 variables are sampled. As might be intuitively expected, the link inference performance progressively improves with an increase of T (Fig. 3 G–I) and deteriorates with increasing σObs (Fig. 3 J–L) due to increased spurious features in the training data arising from the observational noise.

Our previous results (17, 18), as well as results of others (3842), show that the strength of dynamical noise (σDyn) present in the system is another important factor in determining the nature of the link-score distribution. A main result of the previous work is that, in contrast to observational noise, dynamical noise can have a positive effect on network inference. This happens because the dynamical noise itself generates perturbations of the nodal states of the system which propagate to other linked nodes to aid link inference (e.g., see refs. 38 and 42). Because the effect of varying dynamical noise has been previously considered (18), we will hold σDyn fixed and not investigate the dependence of our score distributions on σDyn.

Finally, we test the performance of our link inference technique in cases where we sample only a subset of the nodes in the full 8-node network of Fig. 1C. To do this, for the canonical parameter case, we consider two cases, one in which we sample a subnetwork of N′ = 4 nodes and one in which we sample N′ = 6 nodes. The sampled subnetworks in these two cases have L = 5 and L = 16 links, respectively. In both cases, we see from Fig. 3 M–O that we get a perfect separation between link scores and nonlink scores, showing that the method may be used even when we do not have a complete sampling of all the nodes, as is often the case in practice.

To summarize this section, we note that, in general, time series data available for use in link inference will typically be subject to limitations. In many such cases, application of a link scoring technique may confront us with score distributions, which, by themselves, cannot be used to infer links. However, as our examples from this section show, even in cases where score distributions are not bimodally separated (e.g., Fig. 2C), there may be valuable information in the score distribution that could help us obtain useful partial information about the network structure. This is because, in many cases where the score distributions of links and nonlinks are not well separated, the tail of the total distribution may contain scores predominantly from actually linked nodes (e.g., Figs. 2C, 3D, and 3L). Although, in practice, we have access to only the total distribution (black curves in Fig. 3), if we knew that the true links were concentrated sufficiently far into the tail of the distribution, we could use that to identify some fraction of the true links that occur in the tail. However, if the only information we have is the score histogram, we do not know that this is in fact the case. We thus desire some method of obtaining information regarding the true link-score distribution, which would then determine how large a score cutoff one should choose before judging that a score in the tail most likely indicates a true link. In the next section, we address this crucial issue by use of surrogate data.

Statistical Analysis of Candidate Link Scores for Link Inference

Brief Review of Surrogate Data Generation Techniques for Network Link Inference.

The use of surrogate time series data is common for estimating the statistical significance of inferred network links (43). Such surrogate time series data are data that are synthetically generated to have, as much as possible, the same statistical properties as the original data but with any causal dependence between the nodes of the node pair whose score is to be determined removed. Ideally, the surrogate data would then yield a set of scores whose distribution is similar to the distribution of the scores of nonlinked variables in the original time series data. Without prior knowledge of the underlying ground-truth network, we cannot reconstruct the distribution of nonlink scores directly from the original data; however, the distribution of scores calculated from an appropriate “causality-destroyed surrogate data” (CDSD) set can serve as a proxy for the nonlink-score distribution. Using this proxy distribution, statistical tools can then be applied to obtain the significance of inferred network links. In particular, we can estimate a true-link P-value for any particular score s found from the causal time series data, by calculating the fraction of the CDSD scores that are larger than s.
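The P-value estimate described here can be written down directly. The +1 terms below are a standard conservative correction for a finite number of surrogate scores, an addition of ours rather than something stated in the text:

```python
import numpy as np

def link_p_value(score, surrogate_scores):
    """Empirical P-value of a candidate link score: the fraction of
    causality-destroyed-surrogate (CDSD) scores at least as large as
    `score`, with a +1 correction so the estimate is never exactly 0."""
    surr = np.asarray(surrogate_scores, dtype=float)
    return (1 + np.sum(surr >= score)) / (1 + surr.size)
```

Candidate links can then be accepted at, say, the P < 0.05 level by thresholding on this value.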

There exist several distinct methods for generating CDSD. A key early paper on surrogate data methods applied to the analysis of time series from dynamical systems is that of Theiler et al. (43), while (to our knowledge) the first application to neuronal systems was in the paper of Kaminski et al. (7). Refs. 30–37 provide reviews of several such CDSD generation methods, and refs. 31, 32, and 37 compare their performance on causality detection and network inference in different settings involving multivariate time series data. Some common techniques for generating CDSD involve performing one of the following operations on the time series of individual variables: i) time-shifting the time series of individual variables so as to destroy their causal relationships with the time series of other variables (31, 44), ii) randomizing the phases of the Fourier transform of the original time series and then inverse-Fourier-transforming to generate surrogate data that preserve the individual variables’ power spectra (43, 45), and iii) drawing segments of the original time series, each starting at a random time point and having a random length, and joining them contiguously (31, 32, 46). Some examples of using surrogate data to infer causal interactions and network connectivity from multivariate neural time series data are refs. 47–52. Refs. 11, 12, 30, 37, and 53 use surrogate data to estimate the statistical significance of their link scores and also evaluate the performance of their network inference techniques using a ground-truth network and simulated and experimental data from large-scale networks. In what follows, we use the time-shifting method.
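Method ii) above (not the time-shifting method the paper ultimately uses) can be sketched as follows; the handling of the DC and Nyquist bins is our own choice, made to keep the surrogate real-valued:

```python
import numpy as np

def phase_randomized_surrogate(x, rng=None):
    """Fourier phase-randomized surrogate: same power spectrum (hence
    same autocorrelation) as x, but with randomized phases, destroying
    causal relationships with other variables' time series."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    X = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=X.size)
    Xs = np.abs(X) * np.exp(1j * phases)
    Xs[0] = X[0]                     # preserve the mean (DC bin)
    if x.size % 2 == 0:
        Xs[-1] = X[-1]               # keep the Nyquist bin real-valued
    return np.fft.irfft(Xs, n=x.size)
```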

Method of Surrogate Time Series Generation and Statistical Significance Analysis of Inferred Connections.

For our examples in sections 4.1–4.3, we generate the surrogate scores as follows. For a given node i, we divide the node-i time series in half to form two sub-time series, each of length T/2. We then interchange them, as illustrated in Fig. 4, thus forming our surrogate time series for node i. To generate surrogate scores for candidate links j → i that are incoming to node i, we then apply the chosen link scoring procedure to the shifted time series for node i, with the time series of all other nodes j left unshifted. We repeat this procedure for all nodes in the network. (Additionally, when using the machine learning technique (section 6), we also repeat this procedure ten times for each parameter value using different random realizations of the machine learning procedure.) Through this procedure of shifting the nodal time series, the distribution of all scores generated from the surrogate data is expected to be a useful proxy for the nonlink-score distribution of the original, unshifted time series data. Fig. 3 shows comparisons of the histograms of these two distributions.
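The half-interchange just described is equivalent to a circular shift by T/2 and can be sketched as:

```python
import numpy as np

def half_shift_surrogate(x):
    """Interchange the two halves of a single node's time series
    (a circular shift by T/2), destroying its causal alignment with
    the other nodes' unshifted time series."""
    x = np.asarray(x)
    return np.roll(x, x.size // 2)

def surrogate_dataset(series, i):
    """Copy of a multivariate series (shape T x N) in which only node
    i's column is half-shifted; used to score candidate links j -> i."""
    surr = np.array(series, copy=True)
    surr[:, i] = half_shift_surrogate(surr[:, i])
    return surr
```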

Fig. 4.

Schematic showing the construction of time-shifted surrogate time series data used in Figs. 3 and 5.

Tests on Example Systems

In this section, we apply three link inference score methods (sections 6.1–6.3) in conjunction with surrogate data to examples (sections 1.1 and 1.2) and assess the effects of imperfect data features (section 2) on link inference. For the purpose of the following discussion (sections 4.1 and 4.2), it is useful to briefly summarize some features of one of the three link inference score methods that we consider, namely, the machine learning scoring method (a more detailed discussion appears in section 6.1, and a schematic is shown in Fig. 7). Specifically, as used here, this method employs a particular type of machine learning known as reservoir computing, and we use a three-layer implementation of reservoir computing (Methods Fig. 7). The first layer (the input layer) is a fixed, randomly chosen matrix mapping the time series data of the evolving nodal states of the network to be inferred to the second layer (the reservoir). The reservoir consists of a large number of dynamically interacting nodes connected by a network of links specified by a fixed, randomly chosen network connectivity matrix. The third and final layer (the output layer) is an adjustable (trained) matrix mapping the time-evolving state vector of the reservoir nodes to a desired output. The desired output is approximately achieved by adjusting (training) the values of the matrix elements of the output layer. As discussed in section 6.1, this training is achieved by linear regression in which, to avoid overfitting, we use a “ridge regression” term with an adjustable strength parameter. This latter feature suppresses the occurrence of very large output layer matrix elements and is referred to as “regularization.”
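The three-layer architecture and ridge-regression training just described can be sketched as below. This is a generic reservoir computer in the spirit of the description, not the implementation of ref. 18 (which additionally derives link scores from the trained reservoir); the hyperparameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

class Reservoir:
    """Minimal three-layer reservoir computer: fixed random input layer,
    fixed random recurrent reservoir of tanh nodes, and an output layer
    trained by ridge regression. Hyperparameters are illustrative."""

    def __init__(self, n_in, n_res=300, spectral_radius=0.9, sigma_in=0.1):
        W = rng.standard_normal((n_res, n_res))
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.W = W                                         # reservoir layer
        self.W_in = sigma_in * rng.standard_normal((n_res, n_in))  # input layer

    def states(self, U):
        """Drive the reservoir with the input sequence U (shape T x n_in)
        and record the reservoir state vector at each time step."""
        r = np.zeros(self.W.shape[0])
        R = np.empty((len(U), r.size))
        for t, u in enumerate(U):
            r = np.tanh(self.W @ r + self.W_in @ u)
            R[t] = r
        return R

    def train(self, U, Y, ridge=1e-6):
        """Fit the output layer W_out by ridge (Tikhonov-regularized)
        linear regression; the ridge term suppresses very large output
        matrix elements. Returns the training-set predictions."""
        R = self.states(U)
        self.W_out = np.linalg.solve(
            R.T @ R + ridge * np.eye(R.shape[1]), R.T @ Y).T
        return (self.W_out @ R.T).T                        # shape T x n_out

# Example: train the output layer to predict the next input sample
t = np.linspace(0, 20 * np.pi, 1001)
x = np.sin(t)
res = Reservoir(n_in=1)
pred = res.train(x[:-1, None], x[1:, None])
```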

Fig. 7.

Schematic of the reservoir computer.

Application to Lorenz Network Model.

In this subsection, we apply the above surrogate data procedure to the model given by Eqs. 1–3. The histograms plotted in green in Fig. 3 correspond to the distributions of surrogate scores for the various cases (panels A–O) of Fig. 3. (Note that in a real network inference situation, the ground-truth network is unknown; the blue and red plotted curves are consequently also unknown. Only the black and green curves would be known by someone applying a link inference procedure.) These figures demonstrate that, in cases where there is a partial overlap between the link-score and nonlink-score distributions calculated from the causal data, the surrogate score distribution is in reasonable agreement with the nonlink-score distribution calculated from the causal time series data.

We note however that this agreement between the nonlink-score distribution and the surrogate score distribution appears to depend strongly on the choice of the regularization parameter of the reservoir computing technique. In Methods, we describe a heuristic for choosing an appropriate regularization parameter.

In Fig. 3 B and C, we see that for our Lorenz network example with all three variables sampled, for larger sampling time Δt, both the scores calculated from the causal time series and the surrogate scores decrease. In fact, the distributions of the surrogate data scores match the distribution of the nonlink scores better as we increase Δt and the link and nonlink scores overlap. This shows that the surrogate scores are able to provide a proxy for the nonlink-score distribution in the situations where, due to the lack of clear separation between the link and nonlink scores, such proxies are needed the most. In each case, a significant fraction of scores in the tail of the total score distribution (black histograms in Fig. 3 A–D) correspond to true links in the network, and appropriate score thresholds based on the surrogate data distribution can successfully identify a large fraction of them. For the case of Δt = 0.02 in Fig. 3A, where there is a clear separation, the exact distributions of the surrogate scores and the nonlink scores do not match very well. However, the link and nonlink scores are already well separated, diminishing the need for surrogate data in this case. Despite this, a score threshold based on the surrogate score distribution is indeed smaller than the link scores and falls within the range of scores separating the link- and nonlink-score distributions. Thus, in practice, the surrogate data continue to perform their intended function well.

Similar results are seen in Fig. 3 G–I and J–L, where we vary the observational noise strength (σObs) and the training length (T) for the Lorenz network with three variables sampled. In Fig. 3 G and L, we observe overlapping link-score and nonlink-score distributions. However, in both cases, the density of link scores is enriched at the tails of the total score distributions (black histograms in Fig. 3 G and L). In both of these cases, comparisons between the surrogate and nonlink-score distributions show that the surrogate scores generate a good proxy for the nonlink-score distribution. Thus, score thresholds based on the surrogate distributions in each of these cases are able to identify link scores in the tail of the full distribution. Furthermore, as Fig. 3 H and I and J and K show, in situations with relatively smaller observational noise and relatively larger training length, when the scores separate, a cutoff based on the surrogate data falls within the separation range. Thus, a cutoff based on the surrogate data works well in all such cases. The same behavior occurs in Fig. 3 M–O, where the surrogate score distribution is shown to give a cutoff within the range separating the link and nonlink scores.

For the cases where only one variable is sampled per Lorenz oscillator node (Fig. 3 D–F), we again observe (Fig. 3D with Δt = 0.02) that, when the link- and nonlink-score distributions overlap, the surrogate distribution matches the nonlink-score distribution well, and the surrogate distribution can be used to choose a score threshold that allows the link scores in the tail of the score distribution to be identified. For the cases with larger Δt (Fig. 3 E and F), the link inference procedure is unable to give any information about the network, since the tail of the score distribution does not contain scores predominantly from links.

Applications to C. elegans Data.

Next, we test the performance of our surrogate data on the time series of C. elegans neurons described in section 1. When using our machine learning technique as the network link inference method, we consider two cases: i) where a large number of neurons (127 in the example we used, which is the total number of neurons for which time series data were available in the dataset of ref. 24) are measured in the data used for training the reservoir computer (Fig. 5 A and B) and ii) where only the 8 neuron pairs (listed in Fig. 1C) are assumed to be measured in the data used for training the reservoir computer (Fig. 5 C and D). The score results (black) are plotted in Fig. 5, where we simultaneously plot the link-score (red) and nonlink-score (blue) distributions obtained from the original data and the surrogate score (green) distribution obtained from the corresponding CDSD.

Fig. 5.


Distributions of scores for links, nonlinks, and surrogate data and the corresponding average number of true and false links at different P-value cutoffs for the network shown in Fig. 1C, as inferred from the C. elegans calcium fluorescence time series of ref. 24 with (A and B) all sampled neurons used and (C–H) with only 8 pairs of neurons used for calculating link scores using different link inference techniques. In (B) and (D), the shaded region indicates the region between the first and third quartiles of the distribution of the average number of inferred true and false links for different random configurations of the reservoir computer.

The P-value associated with each score s obtained from the original data is estimated as the fraction of surrogate scores larger than s [the P-value of a score s_0 is defined as the probability that a randomly chosen nonlink score s_ij is larger than s_0: p(s_0) = Prob(s_ij > s_0 | j → i is not an actual link)]. This is useful in practical scenarios where users of a link inference technique might have a predefined statistical confidence threshold for accepting inferred links, which could be quantified in terms of a P-value cutoff (e.g., selecting only those inferred links for which P < 0.02). In such cases, the correspondence between link scores and estimated P-values can be used to convert a P-value cutoff into a link-score threshold, and a desired confidence level (e.g., P < 0.02) can be achieved by selecting only the inferred links whose link scores lie above that threshold. Following this procedure, for each such confidence level represented by a P-value cutoff, we get a set of acceptable inferred links having link scores lying above the score threshold corresponding to the P-value cutoff. For each such set of accepted links, we use the ground-truth connectome data of C. elegans, shown in Fig. 1C, to calculate the number of true-positive and false-positive link inferences. We plot these two numbers against the corresponding cutoff P-values (Fig. 5 B and D) for the two cases introduced earlier in this subsection.
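The score-to-P-value conversion described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code; the score arrays below are synthetic stand-ins, and the 0.02 cutoff is only an example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: in practice these come from a link inference technique
# applied to the causal data (candidate_scores) and to surrogate data.
surrogate_scores = rng.normal(0.0, 1.0, size=1000)
candidate_scores = np.array([0.5, 2.0, 4.0])

def empirical_p_value(score, surrogate_scores):
    """Estimated P-value: fraction of surrogate scores exceeding `score`."""
    return np.mean(surrogate_scores > score)

p_values = np.array([empirical_p_value(s, surrogate_scores)
                     for s in candidate_scores])

# A P-value cutoff (here P < 0.02) converts to a score threshold given by the
# corresponding upper quantile of the surrogate score distribution.
threshold = np.quantile(surrogate_scores, 1 - 0.02)
accepted = candidate_scores[candidate_scores > threshold]
```

Higher scores map to smaller estimated P-values, so raising the score threshold tightens the confidence level of the accepted links.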

From both Fig. 5 A and C, we see that actual links predominate in the tail of the score distribution at high scores. We also note that the surrogate score distributions, in both cases, provide a good estimate of the nonlink-score distribution obtained from the causal data. Furthermore, score cutoffs based on the surrogate data distribution successfully yield a significant number of true links and relatively few false positives (Fig. 5 B and D). For example, Fig. 5 B and D show that at a threshold P-value of 0.05, in the first case, we obtain 5 true links and less than one false link on average, while, in the second case, we obtain the same number of true links with no false links. As we increase the P-value cutoff, both Fig. 5 B and D show that the number of correctly inferred links increases rapidly, while the number of false links increases at a slower rate. For example, even at a cutoff of P = 0.2, we obtain 14 true links and 4 false links in the first case and 15 true links and 4 false links in the second case. Note that, in both cases, the ratio of the number of true positives to the number of false positives (TP/FP), which is about 4, is significantly higher than would have been obtained by randomly assigning links and nonlinks to candidate links: Given that the subnetwork of Fig. 1C has 30 links and 26 nonlinks, such random assignments would correspond to a TP/FP ratio of approximately 30/26 ≈ 1 on average. Furthermore, we note that, in both cases (Fig. 5 B and D), for a given score threshold s = s_0, the corresponding estimated P-value is approximately equal to the ratio of the joint probability that a randomly chosen score s_ij is larger than s_0 and corresponds to a nonlink [Prob(s_ij > s_0 and j → i is not an actual link) = FP/56] and the probability that any randomly chosen score corresponds to a nonlink [Prob(j → i is not an actual link) = 26/56], that is, P ≈ FP/26.

Transfer Entropy and Granger Causality.

Next, we test the performance of transfer entropy and Granger causality as link inference techniques for the C. elegans time series. In both cases, we use the same surrogate data generation technique as described in section 3.2. The results are plotted in Fig. 5 E–H. Unlike the reservoir computing technique, which yields an ensemble of scores for each candidate link, with each score in the ensemble predicted by a different realization of the random reservoir connectivity matrix and input-to-reservoir coupling matrix, transfer entropy and Granger causality predict unique scores for each candidate link. Fig. 5 E and G show that, in the case of transfer entropy and Granger causality, the surrogate distribution is narrower than the distribution of scores for nonlinks. This is also reflected in Fig. 5 F and H, where we see that, at a very small P-value cutoff, we obtain a large number of false-positive links in both cases (9 for transfer entropy and 4 for Granger causality). This suggests that the surrogate data generation technique that we used for reservoir computer link inference may not be universally applicable to other link inference techniques.

Finally, we note that while we compared the performance of three different link inference methods (reservoir computer, Granger causality, transfer entropy) in Fig. 5, the methods themselves use different data as their inputs. Transfer entropy is calculated pairwise in this work (Methods section 6.2). For the Granger causality results shown in Fig. 5 G and H, as well as the reservoir computer results shown in Fig. 5 C and D, only the subset of 8 motor neuron pairs (listed in Fig. 1C) was used. Granger causality applied to measurements of all the neurons in the available dataset gave qualitatively similar results, but link inference performance was somewhat worse than that shown in Fig. 5 G and H due to fewer link scores being present in the tail of the score distribution. Lastly, Fig. 5 A and B show results obtained using the reservoir computer technique with input time series from all nodes in the dataset. We observe that, unlike the Granger causality results, the reservoir computer results from only the 8 neuron pairs (Fig. 5 C and D) are statistically similar to those obtained using all the measured neurons (Fig. 5 A and B).

Another Surrogate Data Generation Technique.

Finally, we test a different surrogate data generation technique, using all three of the link inference techniques considered in this paper, namely, reservoir computing, transfer entropy, and Granger causality. The tests are done on the same C. elegans dataset as in our previous results and on the Lorenz network with one variable per node sampled with Δt = 0.02 (the same parameter regime as Fig. 3D). This surrogate data generation technique, known as the amplitude-adjusted Fourier transform (AAFT), was introduced in ref. 43 and improved in ref. 45; it involves a Fourier transform of the original time series data, followed by phase randomization and, finally, an inverse Fourier transform. In our case, we apply AAFT to the time series of one Lorenz network node (or one pair of neurons for C. elegans) at a time, keeping the time series of the other nodes intact. We then apply the link inference techniques and, for each technique, collect the link scores corresponding to links incoming to the node on which we applied the AAFT to generate our pool of surrogate scores. For the reservoir computing method, we chose the value of the regularization parameter according to the methodology described in Methods (section 6.1). The results plotted in Fig. 6 show that, for both the Lorenz and C. elegans networks, the AAFT surrogate score distribution gives a good estimate of the nonlink-score distribution for the reservoir computer technique. In these cases, a cutoff based on the extent of the surrogate data is able to pick up the scores in the tail of the score distribution, which are predominantly seen to come from true links. For the other two link inference techniques, the surrogate score distribution also approximately matches the nonlink-score distribution. However, unlike the reservoir computer results, for both transfer entropy and Granger causality, a few nonlink scores extend beyond the score value where the AAFT surrogate score distribution ends. Together with Fig. 3D and Fig. 5A, this example suggests that our reservoir computer link inference technique can be used robustly with multiple surrogate time series generation methods.
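The AAFT procedure itself is straightforward to implement. Below is a minimal NumPy sketch of the basic algorithm of ref. 43 (rank-ordering a Gaussian series, randomizing its Fourier phases, and rescaling back to the original amplitudes), applied here to an arbitrary placeholder series rather than to the paper's data:

```python
import numpy as np

def aaft_surrogate(x, rng):
    """Amplitude-adjusted Fourier transform (AAFT) surrogate of a 1-D series x."""
    n = len(x)
    ranks = np.argsort(np.argsort(x))
    # 1. Reorder a Gaussian sequence to follow the ranks of x.
    gaussian = np.sort(rng.normal(size=n))[ranks]
    # 2. Randomize the Fourier phases of the Gaussianized series.
    spectrum = np.fft.rfft(gaussian)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=len(spectrum))
    phases[0] = 0.0  # keep the zero-frequency (mean) component real
    randomized = np.fft.irfft(np.abs(spectrum) * np.exp(1j * phases), n=n)
    # 3. Rescale back to the original amplitude distribution of x.
    return np.sort(x)[np.argsort(np.argsort(randomized))]

rng = np.random.default_rng(1)
x = np.cumsum(rng.normal(size=512))  # placeholder stand-in for a measured series
s = aaft_surrogate(x, rng)
```

The surrogate preserves the amplitude distribution of the original series exactly and its power spectrum approximately, while destroying the temporal structure that carries causal information.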

Fig. 6.


Distributions of scores for links, nonlinks, and amplitude-adjusted Fourier transform (AAFT) surrogate data for three different link inference techniques (reservoir computer, transfer entropy, and Granger causality) from (A–C) data from the Lorenz oscillator network with one variable sampled per node and (D–F) the C. elegans neuronal fluorescence time series.

While it may appear (Figs. 5 and 6) that the RC has performed somewhat better than transfer entropy and Granger causality on the C. elegans and Lorenz datasets, we feel that making a general statement to this effect would require future work with tests on other example systems.

Discussion

In this work, we tested the performance of three network inference techniques (Granger causality, transfer entropy, and a reservoir-computer-based method) on time series data from calcium fluorescence images of C. elegans neurons and on a network of Lorenz oscillators, with the ground-truth networks known in all cases. For the example of the Lorenz model, we systematically tested the effects of varying the sampling interval, total duration of sampling, and observational noise variance, as well as of partial sampling of nodal states. The results showed that reservoir-computer-based network inference improves with smaller sampling time, longer total sampling duration, complete sampling of nodal states, and lower noise variance. Thus, experimental constraints on such parameters can potentially limit the effectiveness of the network inference technique.

These results complement previous results published in refs. 17 and 18. In ref. 17, the authors applied the machine learning-based technique to experimental and simulated data from optoelectronic oscillator networks and found that the technique works even in the case where individual network links have a time delay associated with them, provided the dynamics of different nodes of the system do not synchronize and that the link delay times are not too heterogeneous. In their work, both dynamical noise and network connection strengths controlled the amount of synchrony in the system, and, as such, network inference performance depended on both these parameters via the degree of synchrony among the network nodal states. In ref. 18, the authors applied the reservoir-computer-based technique to Lorenz oscillator networks of 100 nodes with arbitrary connection strengths, containing both excitatory and inhibitory connections. The results showed that the technique was able to infer connection strengths and types (excitatory and inhibitory) correctly, provided that the dynamics of different network nodes did not synchronize. To this end, they also showed that the presence of dynamical noise facilitated link inference, while the presence of observational noise hindered it.

However, in a typical experimental situation, it will not be known what the specific limiting effects are for a given network inference task. In real-world cases, by definition, network inference is useful only when the network is unknown, and one is interested in the types of tests we have performed as confidence-building exercises. In such cases, one possibility is to evaluate such effects on network inference performance in related simulated networks with known ground-truth connectivity, whose dynamics bear resemblance to the original system. For example, our tests of network inference for C. elegans may be viewed as providing confidence especially (but not exclusively) for link inference of other neuronal systems for which calcium fluorescence data are available (1, 2, 54, 55), e.g., Drosophila (56) and zebrafish (57, 58).

We note that the presence of such limiting effects, even if their specific characters are unknown, will be signaled only by the distribution function of the scores not separating into two groups corresponding to links and nonlinks (unlike the "ideal case" of Fig. 2B). For example, in the case of the C. elegans time series data studied in this work, we do not know the character and magnitude of effects, such as observational and dynamical noise strengths, synaptic delays, heterogeneity of synapses, sampling time, and partial state observations, on network inference performance. Nevertheless, however the "nonideal case" arises, our C. elegans tests show that appropriate surrogate data can be constructed to extract potentially useful information from nodal time series data by prescribing a quantitative statistical certainty associated with each candidate link in a network.

Materials and Methods

Reservoir-Computer-Based Connection Inference Technique.

In this section, we summarize the reservoir-computer-based link inference technique (18) that we employ in our results.

Consider a system of D_n nodes connected by a directed network that we wish to infer. We assume that the full state of the i-th node in the network is given by a time-dependent vector X_i(t) of dimension D_s, with i = 1, 2, …, D_n. Denoting the components of this vector by X_i^μ with μ = 1, 2, …, D_s, we suppose that the dynamics of the full system are governed by a general differential equation of the form

dX_i^μ(t)/dt = F_i^μ(X_1(t), X_2(t), …, X_{D_n}(t)). [4]

Here, F_i^μ is the function dictating how the dynamics of the μ-th component of the state vector of the i-th node is governed by the states of all the nodes. A network link is said to exist from node i to node j if and only if F_i^μ is a function of X_j for some μ. Moreover, we define a network link to exist from the μ-th component of node i to the ν-th component of node j if F_i^μ is a function of X_j^ν. Thus, the presence (absence) of such a link implies that ∂F_i^μ/∂X_j^ν is nonzero (zero). If we knew the functions F_i^μ, we could calculate these derivatives. However, since we consider situations without such knowledge, we are tasked with estimating the derivatives solely from the observed time series data.

We assume that observations of the dynamical system (Eq. 4) are sampled at time interval Δt at times nΔt for n = 1, …, T. We further assume that these observations are carried out via some measurement apparatus that measures Y_i = M(X_i) for each node i. In cases where the full nodal state is available at each time, Y is identical to X; however, in general, this may not be the case, and Y may be a vector with dimensionality D′_s ≤ D_s.

We concatenate the sampled input measurements of the time-dependent node state vectors {Y_i[t]} and place them in a single time-dependent column vector 𝒳[t] of length D_𝒳 = D_s × D_n,

𝒳[t] = (Y_1^1[t], Y_1^2[t], …, Y_1^{D_s}[t], Y_2^1[t], Y_2^2[t], …, Y_2^{D_s}[t], …, Y_{D_n}^{D_s}[t])^T. [5]

Considering the case where full-state measurements are taken, we now briefly describe the machine learning technique for link inference developed in ref. 18, which the reader can consult for a more detailed derivation. We train an artificial neural network, called a reservoir computer (RC), to predict the time evolution of the nodal states 𝒳[t] one sampling time step Δt into the future, i.e., the RC predicts 𝒳[(n+1)Δt] from 𝒳[nΔt]. While RCs can, in general, be constructed from arbitrary high-dimensional dynamical systems (59), we implement the RC in silico as a dynamical system on a network of nodes (Fig. 7). (Note that this network of nodes is unrelated to the underlying network of the dynamical system being measured.) We assume that the number of reservoir nodes D_r is large (such that D_r ≫ D_n × D_s = D_𝒳). The nodal states are stored in a vector R of length D_r. The sampled time series vector is fed into the reservoir via a D_r-by-D_𝒳 input-to-reservoir coupling matrix W^in (Fig. 7). Furthermore, the reservoir nodes affect the dynamics of each other according to a D_r-by-D_r asymmetric adjacency matrix H. The time evolution of the reservoir node states R is given by the equation,

R[nΔt] = σ(H R[(n−1)Δt] + W^in 𝒳[nΔt]), [6]

where n is a positive integer, and σ is a sigmoidal activation function acting componentwise on its vector argument (which has the same dimension, D_r, as R).
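For concreteness, the reservoir construction and the update rule of Eq. 6 can be sketched as follows. This is a minimal NumPy illustration with placeholder dimensions and random stand-in data, not the code used for the paper's results; the constructions of W^in and H follow the description given later in this section:

```python
import numpy as np

rng = np.random.default_rng(2)

# Placeholder sizes for illustration; the paper typically uses D_r = 3,000.
D_X, D_r, T = 5, 200, 100
w, d_av, rho = 0.105, 3, 0.9

# Input coupling W_in: each of the D_X input components feeds D_r/D_X
# distinct reservoir nodes, with weights uniform in [-w, w].
W_in = np.zeros((D_r, D_X))
block = D_r // D_X
for j in range(D_X):
    W_in[j * block:(j + 1) * block, j] = rng.uniform(-w, w, block)

# Sparse random adjacency matrix H with average in-degree d_av,
# rescaled so that its spectral radius equals rho.
H = np.where(rng.random((D_r, D_r)) < d_av / D_r,
             rng.uniform(-1.0, 1.0, (D_r, D_r)), 0.0)
H *= rho / np.max(np.abs(np.linalg.eigvals(H)))

# Drive the reservoir with the measured series (Eq. 6):
# R[n dt] = sigma(H R[(n-1) dt] + W_in X[n dt]), with sigma = tanh.
X = rng.normal(size=(T, D_X))  # placeholder for the concatenated measurements
R = np.zeros((T, D_r))
r = np.zeros(D_r)
for n in range(T):
    r = np.tanh(H @ r + W_in @ X[n])
    R[n] = r
```

The stored states R[nΔt] then serve as the regression features for the training step described next.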

The goal of reservoir computer training is to predict the one-time-step future values of the sampled components {Y_i[(n+1)Δt]} in the concatenated form 𝒳[(n+1)Δt] (Eq. 5) from their current values 𝒳[nΔt] using the reservoir state vector R[nΔt]. In our case, this is done with a regularized linear regression determining a D_𝒳-by-D_r reservoir-to-output coupling matrix W^out by best-fitting W^out R[nΔt] to the data for 𝒳 one time step Δt in the future, 𝒳[(n+1)Δt], i.e., minimizing the cost function 𝒞 given by

𝒞 = {(1/T) Σ_{n=1}^{T} ‖𝒳[(n+1)Δt] − W^out R[nΔt]‖²} + λ‖W^out‖², [7]

where T is the number of training steps, the last term (λ‖W^out‖²) is a "ridge" regularization term (60) used to prevent overfitting to the training data, and λ is typically a small number.
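Minimizing Eq. 7 over W^out is an ordinary ridge regression with the closed-form solution W^out = (𝒴^T R/T)(R^T R/T + λI)^{−1}, where the rows of R are the reservoir states and the rows of 𝒴 the one-step-ahead targets. A minimal NumPy sketch with placeholder data:

```python
import numpy as np

rng = np.random.default_rng(3)

# Placeholder training data: rows of R are reservoir states R[n dt],
# rows of Y are the one-step-ahead targets X[(n+1) dt].
T, D_r, D_X = 400, 50, 4
R = rng.normal(size=(T, D_r))
Y = rng.normal(size=(T, D_X))
lam = 1e-3  # ridge regularization parameter (lambda in Eq. 7)

# Closed-form minimizer of the cost function of Eq. 7.
W_out = (Y.T @ R / T) @ np.linalg.inv(R.T @ R / T + lam * np.eye(D_r))
```

Setting the gradient of Eq. 7 with respect to W^out to zero gives W^out(R^T R/T + λI) = 𝒴^T R/T, which the last line solves directly; in practice, solving the linear system is numerically preferable to forming the explicit inverse.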

The input matrix W^in is chosen such that each of the D_𝒳 components of the input vector feeds into D_r/D_𝒳 distinct nodes of the reservoir. The nonzero elements of W^in are chosen randomly from a uniform distribution on the interval [−w, w]. H is a sparse random matrix, corresponding to an average in-degree d_av of the reservoir nodes. The nonzero elements of H are chosen randomly from a uniform distribution such that the spectral radius of H equals some predefined value ρ, which we choose to be 0.9. The hyperparameters w and d_av are chosen using a Nelder–Mead optimization procedure (61). In our work, we used d_av = 2.7, w = 0.105 for the Lorenz networks and d_av = 5, w = 0.42 for the C. elegans network. In all network inference work with the Lorenz (C. elegans) model, we typically use λT = 10^3 (10^−3), unless stated otherwise, a choice that we shall examine in the next section. The sigmoidal activation function σ is taken to be the hyperbolic tangent function. The reservoirs we used typically had D_r = 3,000 nodes.

As derived in ref. 18, we estimate the Jacobian matrix of partial derivatives M_ij(t) = ∂𝒳_i[(n+1)Δt]/∂𝒳_j[nΔt] by Eq. 8,

∂𝒳_i[(n+1)Δt]/∂𝒳_j[nΔt] = Σ_{k=1}^{D_r} (W^out)_{ik} σ′(((W^in + H(W^out)^{−1})𝒳[nΔt])_k) × (W^in + H(W^out)^{−1})_{kj}, [8]

and construct link scores as S_ij = ⟨|M_ij(t)|⟩_t, where ⟨·⟩_t denotes time-averaging over a time long enough that the averages do not change significantly after doubling the averaging time.
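The score construction of Eq. 8 and the time average above can be sketched as follows. This illustration uses random placeholder matrices in place of the trained reservoir quantities, takes σ = tanh (so σ′(u) = 1 − tanh²(u)), and interprets (W^out)^{−1} as the Moore–Penrose pseudoinverse, an assumption made here because W^out is not square:

```python
import numpy as np

rng = np.random.default_rng(4)

# Random placeholder stand-ins for the trained quantities of the text;
# in practice W_in and H come from the reservoir setup and W_out from training.
D_X, D_r, T = 4, 60, 200
W_in = rng.normal(scale=0.1, size=(D_r, D_X))
H = rng.normal(scale=0.05, size=(D_r, D_r))
W_out = rng.normal(scale=0.1, size=(D_X, D_r))
X = rng.normal(size=(T, D_X))  # placeholder measured time series

# Effective matrix appearing in Eq. 8, with (W_out)^(-1) taken as the
# Moore-Penrose pseudoinverse.
A = W_in + H @ np.linalg.pinv(W_out)

def jacobian(x):
    """M_ij = d X_i[(n+1) dt] / d X_j[n dt] per Eq. 8, with sigma = tanh."""
    u = A @ x
    return W_out @ np.diag(1.0 - np.tanh(u) ** 2) @ A

# Link scores S_ij: time averages of |M_ij| over the sampled data.
S = np.mean([np.abs(jacobian(x)) for x in X], axis=0)
```

Each entry S_ij is then the raw score for the corresponding candidate link, to be compared against the surrogate score distribution.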

Finally, we comment on the choice of the regularization parameter λ for the reservoir computer. As we increase λ across several orders of magnitude, the ranges of both the all-score distribution (black curves in Fig. 3) and the surrogate score distribution (green curves in Fig. 3) decrease and merge, as individual link, nonlink, and surrogate scores diminish in magnitude with increasing regularization. Typically, the all-score distribution has a larger range than the surrogate score distribution. After examining how the all-score and surrogate score distributions vary as we monotonically change λ, we select a range of λ values such that there remains a substantial number of scores in the high-score tail of the all-score distribution extending beyond the range of the surrogate score distribution; these scores would then be inferred as links. We then choose the smallest λ value from this range. This gave us λT = 10^3 in all of the results that we show in Figs. 3 and 5. For the reservoir computer results in Fig. 6, we used λT = 10^3 for the Lorenz data and λT = 10^0 for the C. elegans data. Values of λ are only roughly determined by the above method. However, this appears to be sufficient for our purposes, since we find that there is typically a range of λ values spanning an order of magnitude over which the numbers of inferred true-positive and false-positive links are approximately constant.

For the chosen λ value, the corresponding surrogate score distribution can be used as a proxy for the unknown nonlink-score distribution, so we use it to estimate the P-value corresponding to a particular score. The score threshold for assigning links can then be equivalently expressed as a P-value cutoff, thus quantifying the statistical confidence of an inferred link.

Transfer Entropy.

For time series from two processes I and J, the transfer entropy from J to I is defined by Schreiber (10) as

T_{J→I} = Σ_n p(i_{n+1}, i_n^{(k)}, j_n^{(l)}) log [p(i_{n+1} | i_n^{(k)}, j_n^{(l)}) / p(i_{n+1} | i_n^{(k)})], [9]

where p denotes a joint or conditional probability, i_n is the element of the time series of I sampled at the n-th time point, and i_n^{(k)} = (i_n, …, i_{n−k+1}) is the delay-embedding vector. In Eq. 9, k and l are two parameters to be chosen. There are multiple computational techniques (62) for estimating the transfer entropy expressed by this equation. Among them, in this work, we used the MATLAB code of ref. 62 for the rank-based technique. The three input parameters of this transfer entropy evaluation program are k, l, and the number of entropy quantization levels. We chose l = 1, motivated by the notion that (as discussed in section 2 and supported by Fig. 3 A–C) link inference performs best when causal dependence over short times is tested (see also ref. 18). We also found that, with this choice for l, the results were largely unchanged under variation of k, and we took k = 1 in our evaluation. We took the number of entropy quantization levels to be 3, and, again, the results were largely independent of this choice. For more details of the method and the definitions of these parameters, see the MATLAB code associated with ref. 62.
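To make Eq. 9 concrete, here is a simple plug-in (histogram) estimator with k = l = 1 and 3 quantization levels, applied to a toy pair of series in which I is driven by the past of J. This is an illustrative sketch, not the rank-based estimator of ref. 62:

```python
import numpy as np
from collections import Counter

def transfer_entropy(i_series, j_series, n_levels=3):
    """Plug-in estimate of Eq. 9 with k = l = 1 on quantized amplitudes."""
    def quantize(x):
        # Map values to n_levels equal-occupancy levels.
        edges = np.quantile(x, np.linspace(0, 1, n_levels + 1)[1:-1])
        return np.digitize(x, edges)

    i_q, j_q = quantize(i_series), quantize(j_series)
    n = len(i_q) - 1
    triples = Counter(zip(i_q[1:], i_q[:-1], j_q[:-1]))  # (i_{n+1}, i_n, j_n)
    pairs_ij = Counter(zip(i_q[:-1], j_q[:-1]))          # (i_n, j_n)
    pairs_ii = Counter(zip(i_q[1:], i_q[:-1]))           # (i_{n+1}, i_n)
    singles = Counter(i_q[:-1])                          # i_n
    te = 0.0
    for (i1, i0, j0), c in triples.items():
        p_joint = c / n
        p_cond_full = c / pairs_ij[(i0, j0)]
        p_cond_self = pairs_ii[(i1, i0)] / singles[i0]
        te += p_joint * np.log(p_cond_full / p_cond_self)
    return te

# Toy example: I is driven by the past of J (unidirectional coupling).
rng = np.random.default_rng(5)
j = rng.normal(size=2001)
i = np.roll(j, 1) + 0.1 * rng.normal(size=2001)
i, j = i[1:], j[1:]  # drop the wrapped-around first sample

te_J_to_I = transfer_entropy(i, j)  # expect clearly positive
te_I_to_J = transfer_entropy(j, i)  # expect near zero
```

On such a unidirectionally coupled pair, the estimate in the driving direction (J → I) comes out much larger than in the reverse direction, which is the asymmetry the link scores exploit.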

Granger Causality.

For two time series from processes I and J, we say that J does not Granger-cause I if I, conditional on its own past, is independent of the past of J (7, 8, 63). Thus, in the notation of section 6.2, if knowledge of j_n^{(l)} improves the prediction of i_{n+1}, then J Granger-causes I. The typical way to test this dependency involves fitting a vector autoregressive model for I and measuring whether the inclusion of J in that model makes the fitting error significantly lower. For network link scoring purposes, we use the logarithm of the ratio of the two fitting errors as the network link score. Furthermore, in our computations, we take l = 1, for the same reason as given in section 6.2. For more details, as well as for the MATLAB toolbox that we used in this work, see ref. 63.
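A minimal sketch of this fitting-error-ratio score (with l = 1, using ordinary least squares rather than the toolbox of ref. 63) on the same kind of toy unidirectionally driven pair:

```python
import numpy as np

def granger_score(i_series, j_series):
    """Log ratio of AR(1) fitting errors; larger values suggest J Granger-causes I.

    Restricted model:   i_{n+1} ~ const + i_n
    Unrestricted model: i_{n+1} ~ const + i_n + j_n   (l = 1)
    """
    target = i_series[1:]
    own_past = np.column_stack([np.ones(len(target)), i_series[:-1]])
    both_past = np.column_stack([own_past, j_series[:-1]])
    err_restricted = np.linalg.lstsq(own_past, target, rcond=None)[1][0]
    err_full = np.linalg.lstsq(both_past, target, rcond=None)[1][0]
    return np.log(err_restricted / err_full)

# Toy example: I is driven by the past of J (unidirectional coupling).
rng = np.random.default_rng(6)
j = rng.normal(size=3000)
i = 0.5 * np.roll(j, 1) + 0.1 * rng.normal(size=3000)
i, j = i[1:], j[1:]  # drop the wrapped-around first sample

score_J_to_I = granger_score(i, j)  # expect clearly positive
score_I_to_J = granger_score(j, i)  # expect near zero
```

Because adding a regressor can never increase the least-squares residual, the score is nonnegative by construction; the driving direction yields a much larger score than the reverse direction.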

Acknowledgments

This work was supported by NSF grant DMS1813027.

Author contributions

E.O. designed research; A.B., S.C., and E.O. performed research; A.B. performed numerical experiments; A.B., S.C., and E.O. analyzed data; and A.B., S.C., and E.O. wrote the paper.

Competing interests

The authors declare no competing interest.

Footnotes

Reviewers: E.B., George Mason University; and I.F., IFISC (UIB-CSIC).

Data, Materials, and Software Availability

References for previously published experimental time series data (24) and software (62, 63) used in this work can be found in the main text. Additionally, example reservoir computer code to generate the results of this paper can be found at the GitHub repository https://github.com/banerjeeamitava/Neuronal-Network-Inference-with-Reservoir-Computer. No new data have been generated in this paper, other than the code outputs from the analysis of existing time series data.

References

  • 1.Blevins A. S., Bassett D. S., Scott E. K., Vanwalleghem G. C., From calcium imaging to graph topology. Netw. Neurosci. 6, 1125–1147 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.de Abril I. M., Yoshimoto J., Doya K., Connectivity inference from neural recording data: Challenges, mathematical bases and research directions. Neural Netw. 102, 120–137 (2018). [DOI] [PubMed] [Google Scholar]
  • 3.Sander E. L., Wootton J. T., Allesina S., Ecological network inference from long-term presence-absence data. Sci. Rep. 7, 1–12 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Qiu X., et al. , Inferring causal gene regulatory networks from coupled single-cell expression dynamics using scribe. Cell Syst. 10, 265–274 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Ludescher J., et al. , Network-based forecasting of climate phenomena. Proc. Natl. Acad. Sci. U.S.A. 118, e1922872118 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Fan J., Meng J., Ashkenazy Y., Havlin S., Schellnhuber H. J., Network analysis reveals strongly localized impacts of El Niño. Proc. Natl. Acad. Sci. U.S.A. 114, 7543–7548 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Kamiński M., Ding M., Truccolo W. A., Bressler S. L., Evaluating causal relations in neural systems: Granger causality, directed transfer function and statistical assessment of significance. Biol. Cyber. 85, 145–157 (2001). [DOI] [PubMed] [Google Scholar]
  • 8.Friston K., Moran R., Seth A. K., Analysing connectivity with Granger causality and dynamic causal modelling. Curr. Opin. Neurobiol. 23, 172–178 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Pratapa A., Jalihal A. P., Law J. N., Bharadwaj A., Murali T., Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat. Methods 17, 147–154 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Schreiber T., Measuring information transfer. Phys. Rev. Lett. 85, 461 (2000). [DOI] [PubMed] [Google Scholar]
  • 11.Novelli L., Wollstadt P., Mediano P., Wibral M., Lizier J. T., Large-scale directed network inference with multivariate transfer entropy and hierarchical statistical testing. Netw. Neurosci. 3, 827–847 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Shorten D. P., Spinney R. E., Lizier J. T., Estimating transfer entropy in continuous time between neural spike trains or other event-based data. PLoS Comput. Biol. 17, e1008054 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Vicente R., Wibral M., Lindner M., Pipa G., Transfer entropy-a model-free measure of effective connectivity for the neurosciences. J. Comput. Neurosci. 30, 45–67 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Yu J., Smith V. A., Wang P. P., Hartemink A. J., Jarvis E. D., Advances to Bayesian network inference for generating causal networks from observational biological data. Bioinformatics 20, 3594–3603 (2004). [DOI] [PubMed] [Google Scholar]
  • 15.Zou C., Feng J., Granger causality vs. dynamic Bayesian network inference: A comparative study. BMC Bioinform. 10, 1–17 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Brunton S. L., Proctor J. L., Kutz J. N., Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc. Natl. Acad. Sci. U.S.A. 113, 3932–3937 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Banerjee A., Hart J. D., Roy R., Ott E., Machine learning link inference of noisy delay-coupled networks with optoelectronic experimental tests. Phys. Rev. X 11, 031014 (2021). [Google Scholar]
  • 18.Banerjee A., Pathak J., Roy R., Restrepo J. G., Ott E., Using machine learning to assess short term causal dependence and infer network links. Chaos: Interdiscip. J. Nonlinear Sci. 29, 121104 (2019). [DOI] [PubMed] [Google Scholar]
  • 19.Leng S., Xu Z., Ma H., Reconstructing directional causal networks with random forest: Causality meeting machine learning. Chaos: Interdiscip. J. Nonlinear Sci. 29, 093130 (2019). [DOI] [PubMed] [Google Scholar]
  • 20.E. Tan, D. Corrêa, T. Stemler, M. Small, Backpropagation on dynamical networks. arXiv [Preprint] (2022). http://arxiv.org/abs/2207.03093 (Accessed 7 February 2023).
  • 21.Sima C., Hua J., Jung S., Inference of gene regulatory networks using time-series data: A survey. Curr. Gen. 10, 416–429 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Casadiego J., Nitzan M., Hallerberg S., Timme M., Model-free inference of direct network interactions from nonlinear collective dynamics. Nat. Commun. 8, 1–10 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Wang W. X., Lai Y. C., Grebogi C., Data based identification and prediction of nonlinear and complex dynamical systems. Phys. Rep. 644, 1–76 (2016). [Google Scholar]
  • 24.Kato S., et al. , Global brain dynamics embed the motor command sequence of Caenorhabditis elegans. Cell 163, 656–669 (2015). [DOI] [PubMed] [Google Scholar]
  • 25.Varshney L. R., Chen B. L., Paniagua E., Hall D. H., Chklovskii D. B., Structural properties of the Caenorhabditis elegans neuronal network. PLoS Comput. Biol. 7, e1001066 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.White J. G., Southgate E., Thomson J. N., Brenner S., The structure of the nervous system of the nematode Caenorhabditis elegans. Philos. Trans. R. Soc. Lond. B Biol. Sci. 314, 1–340 (1986). [DOI] [PubMed] [Google Scholar]
  • 27.Nguyen J. P., et al. , Whole-brain calcium imaging with cellular resolution in freely behaving Caenorhabditis elegans. Proc. Natl. Acad. Sci. U.S.A. 113, E1074–E1081 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Nichols A. L., Eichler T., Latham R., Zimmer M., A global brain state underlies C. elegans sleep behavior. Science 356, eaam6851 (2017). [DOI] [PubMed] [Google Scholar]
  • 29.Lorenz E. N., Deterministic nonperiodic flow. J. Atmos. Sci. 20, 130–141 (1963). [Google Scholar]
  • 30.Runge J., Causal network reconstruction from time series: From theoretical assumptions to practical estimation. Chaos: Interdiscip. J. Nonlinear Sci. 28, 075310 (2018). [DOI] [PubMed] [Google Scholar]
  • 31.Diks C., Fang H., Transfer entropy for nonparametric Granger causality detection: An evaluation of different resampling methods. Entropy 19, 372 (2017). [Google Scholar]
  • 32.Papana A., Kyrtsou C., Kugiumtzis D., Diks C., Assessment of resampling methods for causality testing: A note on the US inflation behavior. PLoS One 12, e0180852 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Good P., Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses (Springer Science & Business Media, 2013).
  • 34.Good P. I., Resampling Methods (Springer, 2006).
  • 35.Lancaster G., Iatsenko D., Pidde A., Ticcinelli V., Stefanovska A., Surrogate data for hypothesis testing of physical systems. Phys. Rep. 748, 1–60 (2018). [Google Scholar]
  • 36.Paluš M., From nonlinearity to causality: Statistical testing and inference of physical mechanisms underlying complex dynamics. Contemp. Phys. 48, 307–348 (2007). [Google Scholar]
  • 37.Cliff O. M., Novelli L., Fulcher B. D., Shine J. M., Lizier J. T., Assessing the significance of directed and multivariate measures of linear dependence between time series. Phys. Rev. Res. 3, 013145 (2021). [Google Scholar]
  • 38.Ren J., Wang W. X., Li B., Lai Y. C., Noise bridges dynamical correlation and topology in coupled oscillator networks. Phys. Rev. Lett. 104, 058701 (2010). [DOI] [PubMed] [Google Scholar]
  • 39.Panaggio M. J., Ciocanel M. V., Lazarus L., Topaz C. M., Xu B., Model reconstruction from temporal data for coupled oscillator networks. Chaos: Interdiscip. J. Nonlinear Sci. 29, 103116 (2019). [DOI] [PubMed] [Google Scholar]
  • 40.Lipinski-Kruszka J., Stewart-Ornstein J., Chevalier M. W., El-Samad H., Using dynamic noise propagation to infer causal regulatory relationships in biochemical networks. ACS Synth. Biol. 4, 258–264 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Prill R. J., Vogel R., Cecchi G. A., Altan-Bonnet G., Stolovitzky G., Noise-driven causal inference in biomolecular networks. PLoS One 10, e0125777 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Wang W. X., Ren J., Lai Y. C., Li B., Reverse engineering of complex dynamical networks in the presence of time-delayed interactions based on noisy time series. Chaos: Interdiscip. J. Nonlinear Sci. 22, 033131 (2012). [DOI] [PubMed] [Google Scholar]
  • 43.Theiler J., Eubank S., Longtin A., Galdrikian B., Farmer J. D., Testing for nonlinearity in time series: The method of surrogate data. Phys. D: Nonlinear Phenom. 58, 77–94 (1992). [Google Scholar]
  • 44.Quiroga R. Q., Kraskov A., Kreuz T., Grassberger P., Performance of different synchronization measures in real data: A case study on electroencephalographic signals. Phys. Rev. E 65, 041903 (2002). [DOI] [PubMed] [Google Scholar]
  • 45.Kugiumtzis D., Surrogate data test for nonlinearity including nonmonotonic transforms. Phys. Rev. E 62, R25 (2000). [DOI] [PubMed] [Google Scholar]
  • 46.Politis D. N., Romano J. P., The stationary bootstrap. J. Am. Stat. Assoc. 89, 1303–1313 (1994). [Google Scholar]
  • 47.Wang X., Chen Y., Ding M., Testing for statistical significance in bispectra: A surrogate data approach and application to neuroscience. IEEE Trans. Biomed. Eng. 54, 1974–1982 (2007). [DOI] [PubMed] [Google Scholar]
  • 48.Fujisawa S., Amarasingham A., Harrison M. T., Buzsáki G., Behavior-dependent short-term assembly dynamics in the medial prefrontal cortex. Nat. Neurosci. 11, 823–833 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Lizier J. T., Heinzle J., Horstmann A., Haynes J. D., Prokopenko M., Multivariate information-theoretic measures reveal directed information structure and task relevant changes in fMRI connectivity. J. Comput. Neurosci. 30, 85–107 (2011). [DOI] [PubMed] [Google Scholar]
  • 50.Shimono M., Beggs J. M., Functional clusters, hubs, and communities in the cortical microconnectome. Cerebral Cortex 25, 3743–3757 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Dimitriadis S. I., Zouridakis G., Rezaie R., Babajani-Feremi A., Papanicolaou A. C., Functional connectivity changes detected with magnetoencephalography after mild traumatic brain injury. NeuroImage: Clin. 9, 519–531 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Olejarczyk E., Marzetti L., Pizzella V., Zappasodi F., Comparison of connectivity analyses for resting state EEG data. J. Neural Eng. 14, 036017 (2017). [DOI] [PubMed] [Google Scholar]
  • 53.Gilson M., Tauste Campo A., Chen X., Thiele A., Deco G., Nonparametric test for connectivity detection in multivariate autoregressive networks and application to multiunit activity data. Netw. Neurosci. 1, 357–380 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Urai A. E., Doiron B., Leifer A. M., Churchland A. K., Large-scale neural recordings call for new insights to link brain and behavior. Nat. Neurosci. 25, 11–19 (2022). [DOI] [PubMed] [Google Scholar]
  • 55.Lin A., et al. , Imaging whole-brain activity to understand behaviour. Nat. Rev. Phys. 4, 292–305 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Mann K., Gallen C. L., Clandinin T. R., Whole-brain calcium imaging reveals an intrinsic functional network in Drosophila. Curr. Biol. 27, 2389–2396 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Ahrens M. B., et al. , Brain-wide neuronal dynamics during motor adaptation in zebrafish. Nature 485, 471–477 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Ahrens M. B., Orger M. B., Robson D. N., Li J. M., Keller P. J., Whole-brain functional imaging at cellular resolution using light-sheet microscopy. Nat. Methods 10, 413–420 (2013). [DOI] [PubMed] [Google Scholar]
  • 59.Tanaka G., et al. , Recent advances in physical reservoir computing: A review. Neural Netw. 115, 100–123 (2019). [DOI] [PubMed] [Google Scholar]
  • 60.Hoerl A. E., Kennard R. W., Ridge regression: Applications to nonorthogonal problems. Technometrics 12, 69–82 (1970). [Google Scholar]
  • 61.Singer S., Nelder J., Nelder-Mead algorithm. Scholarpedia 4, 2928 (2009). [Google Scholar]
  • 62.Lee J., et al. , Transfer entropy estimation and directional coupling change detection in biomedical time series. Biomed. Eng. Online 11, 1–17 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Barnett L., Seth A. K., The MVGC multivariate Granger causality toolbox: A new approach to Granger-causal inference. J. Neurosci. Methods 223, 50–68 (2014). [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

References for the previously published experimental time series data (24) and software (62, 63) used in this work can be found in the main text. Additionally, an example reservoir computer code to generate the results of this paper can be found at the GitHub repository https://github.com/banerjeeamitava/Neuronal-Network-Inference-with-Reservoir-Computer. No new data were generated in this paper, other than the code outputs from analysis of existing time series data.