PLOS Computational Biology. 2025 Apr 21;21(4):e1012971. doi: 10.1371/journal.pcbi.1012971

Temporal resolution of spike coding in feedforward networks with signal convergence and divergence

Zach Mobille 1,2,*, Usama Bin Sikandar 3,4, Simon Sponberg 2,3,5, Hannah Choi 1,2,*
Editor: Sacha Jennifer van Albada6
PMCID: PMC12021431  PMID: 40258062

Abstract

Convergent and divergent structures in the networks that make up biological brains are found across many species and brain regions at various spatial scales. Neurons in these networks fire action potentials, or "spikes," whose precise timing is becoming increasingly appreciated as a rich source of information about both sensory input and motor output. In this work, we investigate the extent to which feedforward convergent/divergent network structure is related to the gain in information of spike timing representations over spike count representations. While previous theories on coding in convergent and divergent networks have largely neglected the role of precise spike timing, our model and analyses place this aspect at the forefront. For a suite of stimuli with different timescales, we demonstrate that structural bottlenecks (small groups of neurons post-synaptic to a network convergence) have a stronger preference for spike timing codes than expansion layers created by structural divergence. We further show that this relationship generalizes across different spike-generating models and measures of coding capacity, implying a potentially fundamental link between network structure and coding strategy using spikes. Additionally, we found that a simple network model based on the convergence and divergence ratios of a hawkmoth (Manduca sexta) nervous system can reproduce the relative contribution of spike timing information in its motor output, providing testable predictions on optimal temporal resolutions of spike coding across the moth sensory-motor pathway at both the single-neuron and population levels.

Author summary

Within the complex anatomy of the brain, there are certain structures that appear more often than expected. One example is when large populations of neurons connect to much smaller populations, and vice versa. We refer to these structural patterns as network convergence and divergence; they are observed in systems like the cerebellum, insect olfactory networks, visuomotor pathways, and the early visual system of mammals. Despite the ubiquity of this connectivity pattern, we are only beginning to understand its functional implications from a computational point of view. Here, we construct and analyze mathematical models of spiking neural networks to understand how convergent and divergent structure shapes the way that information is represented in each part of the network, as a function of the temporal resolution of population spiking activity. We then develop a simple feedforward network model of the visuomotor pathway of a moth, with similar convergent/divergent network structure, and reproduce a similar proportion of spike timing to spike count information as observed experimentally. Our results yield predictions about spike coding in populations that have not yet been observed experimentally.

Introduction

The neural systems of animals comprise networks with highly non-random topological structure [1–6]. The relationship between computation and connectivity in neural networks is multi-faceted and depends on our definition of these terms [7–9,52], but often it can be fruitful to focus on computation in networks with connectivity patterns that are observed more often in biological systems than would be expected in a totally random model network [4,10–12,57]. One particular structural motif that is common in many areas of the nervous system involves populations of neurons synapsing with other populations of a much different size. When a large population of neurons synapses with a much smaller population, it may be called a "convergent" pathway. If a small population synapses with a much larger population, we call this structure "divergent." Structural convergence and divergence are observed in a wide range of neural systems across species, including the mammalian early visual system [13–16], mammalian cerebellum-like structures [17,18], and the insect olfactory system [19]. A notable example is the divergence from 200 million mossy fibers to 50 billion granule cells and then convergence to 15 million Purkinje cells in the human cerebellum, a mostly feedforward network [20,21].

Despite their ubiquity, convergent/divergent structures are only beginning to be understood from a functional point of view. Previous work has shown that network convergence synergizes with nonlinear activation functions to boost information coding [13]. Other studies have focused explicitly on networks with bottlenecks (small groups of neurons pre- and post-synaptic to much larger groups of neurons on both sides), demonstrating that modular connectivity increases their information transfer in classification tasks [22], and that they increase dimension while reducing noise in the expansion layer post-synaptic to them [17]. While highlighting the computational significance of structural convergence and divergence, the network or neuron models used in these studies were non-spiking, neglecting the biologically relevant role of precisely timed action potentials. Instead of directly testing how feedforward convergence and divergence shape the coding strategies of spiking neurons, a recent study [23] examined the relationship between temporal coding in a spiking population and the population's size. In that work, time-dependent stimuli were decoded from uncoupled spiking neuron populations of varying size. It was found that signal reconstruction error drops linearly with population size when decoded from precisely timed spikes, but sublinearly when decoded from imprecisely timed spikes. Although this work reveals an interesting relationship between spike coding and population size, it is still unclear how convergent and divergent network structures directly shape the importance of spike timing in information processing.

The demand for understanding the implications of convergent and divergent structure for information coding is especially high in light of growing experimental evidence showing that spike timing can encode significantly more information about inputs [24] and outputs [25] than spike count. While the importance of spike timing is well established at the levels of sensory input [26–30] and motor output [25,31–35], it is not as well understood in the intermediate stages of processing between sensory and motor populations. In the context of vertebrate and invertebrate visuomotor systems, these pathways involve several cascades of structural convergence and divergence from the early visual system to cortex [15,16] and eventually through the cerebellum [18,20,21] to the spinal cord and commanding muscles. A classic modeling study suggests that the cortex, a large population of neurons post-synaptic to structural divergence, is more likely to use a population spike count code due to high variability in inter-spike intervals [36]. Another work argues, based on energy expenditure, that rate/count coding can only explain around 15% of the activity in primary visual cortex [37], suggesting that other coding strategies must explain the rest [38]. From a purely quantitative perspective, single-neuron count codes are slow and information-poor, but robust to noise [39]. The activity of large populations of neurons following a structural divergence comprises a high-dimensional space and may therefore benefit from a collective count code due to the noise reduction. On the other hand, spike timing codes are fast, efficient, and information-rich, but potentially sensitive to noise [40]. It is, therefore, possible that bottleneck populations of neurons post-synaptic to structural convergences are good candidates for a temporal code, since this would allow them to encode a similar amount of information as the larger pre-synaptic layer but with a smaller number of neurons. Indeed, experiments using white-noise optogenetic stimuli in mouse cortex have shown that the temporal precision of spiking increases in interneurons post-synaptic to a structural convergence compared to the pyramidal neurons pre-synaptic to them [41]. However, this is just one example, and our understanding of the information processing between sensory input and motor output would improve if the relationship between population spike coding and convergent/divergent structure were also explored theoretically.

Thus, we aim to systematically investigate how convergent/divergent structure is related to precise spike time coding in spiking neural network models. Our primary hypothesis is that temporal coding is more beneficial in bottleneck populations post-synaptic to a structural convergence than it is in an expansion layer. While expansion layers have a surplus of neurons and may represent stimuli equally well with a coarse count code, bottlenecks have fewer neurons available to encode signals. Therefore, bottlenecks may preserve information by preferring temporal expressions of signal representations. To test this hypothesis, we train feedforward spiking neural networks to autoencode a time-dependent stimulus [38] and perform decoding analyses [42] on the population spike trains binned at various resolutions. First, we study 3-layered feedforward networks with varying levels of convergence and divergence to establish a relationship between structure and spike coding. Next, we develop a 5-layered model resembling the patterns of expansion and contraction in a hawkmoth visuomotor pathway, whose output is known to use a spike timing code during hover-feeding and target tracking [25]. We test whether our model, although lacking many biological details present in the moth, can recapitulate a similar relative proportion of information in spike timing and spike rate coding as observed in experiment. To confirm that our results are not an exception due to the specific spiking model or decoder we chose, we also test the robustness of the results using other models and measures.

Results

A graphical summary of our approach is shown in Fig 1. We first train a feedforward spiking neural network with a given structure to autoencode a time-dependent stimulus s(t) (left of Fig 1A), and then decode it using a recurrent neural network into its reconstruction ŝ(t) (bottom right of Fig 1A). To test how the encoding changes as we increase the temporal resolution of the spike trains, we use a decoding analysis in which we process each layer's spikes over a sliding window of width T = 50 ms, within which spikes are further binned at resolution Δt. The choice of 50 ms for the duration of the response window was motivated by the wingstroke period of the hawkmoth Manduca sexta, and it is also consistent with previous neural decoding studies [42]. The binned spikes R(t; Δt) are then fed to a decoder (the recurrent neural network), which treats the binned spikes within the larger response window as a sequence of hidden states within its own dynamics. The decoder estimates the stimulus presented to the input layer of the network with a reconstruction ŝ, based on the binned population spiking of the layer of interest. We then quantify the relationship between response R(t; Δt) and stimulus s(t) by computing both the decoding accuracy R² (coefficient of determination) and the mutual information I_m between the true stimulus s and the decoded stimulus ŝ for various Δt. These measures approximate the true information carried at the population level and are computed across a range of Δt to establish the temporal resolution of the optimal coding strategy; we refer to the resulting plots as "information curves".

Fig 1. Schematic of model and analysis methods.


(A) Raster plot of the 3-layer network model trained to a 4 Hz + 20 Hz sum of sines stimulus. Red and blue indicate the x- and y-components for both the stimulus and readout (left). Depiction of the procedure used to process population spike trains before feeding them to the decoder to estimate the stimulus (right). T = 50 ms is the width of the sliding window used here, and Δt is the bin size. (B) Sketch of the information theoretic method used to validate the 5-layer network model against previous results from hawkmoth data. A window of duration T = 50 ms is used in this analysis. The variable R_c denotes the spike count response and R_t denotes the spike timing response.

We also perform an information theoretic analysis at the single-neuron level, based on past work [25,33,48]. The strength of this method is that it quantifies the amount of spike count and spike timing information without confounding the two variables; note that the binning method used in the population analysis considers spike counts at increasing levels of time resolution and therefore does not isolate the spike-timing code completely from the spike-count code. The strength of the population binning method, on the other hand, is that it considers all neurons in the layer and thus quantifies the population coding strategy, not just single-neuron coding. Computing mutual information at the single-cell resolution allows us to compare our model's results with previously obtained experimental results at the single-neuron level. For a detailed explanation of our model and analysis, see Methods.

Three-layer network

We first focus on a feedforward network of 3 layers, systematically varying the number of neurons in the middle (hidden) layer, N_h, while keeping the number of neurons in the input and output layers fixed at N_in = N_out = 100. By doing this, we simultaneously tune the level of structural divergence and convergence. The network model consists of leaky integrate-and-fire neurons with both excitatory and inhibitory synapses. The parameters of the network, including synaptic weights, membrane time constants, and readout weights (described below), are optimized to minimize the following loss function

$L_{\mathrm{MSE}}(z,s) = \frac{1}{N_t} \sum_{t=1}^{N_t} (z_t - s_t)^2$  (1)

where N_t is the total number of time points, s_t is the true stimulus at time t, and z is a readout from the output layer of the form

$z = \gamma z_{\mathrm{time}} + (1 - \gamma) z_{\mathrm{count}}$  (2)

where z_time is a readout based on the spike timings of the output layer and z_count is a readout based on the spike counts of the output layer. The quantity γ is a hyperparameter that we set to 0.5 so as to weight the count- and timing-based readouts equally, thus not biasing our results toward either coding strategy (for more details, see Methods).

After training the network, we decode the stimulus from each layer using the population spikes binned at various time resolutions Δt, with two types of recurrent neural networks (see Decoding analysis methods). Membrane time constants are around 5 ms and there are no synaptic delays in our network model, so spikes binned in a given T = 50 ms time window are most likely to be caused by other spikes in the same time window. The association between the true stimulus s and the decoded stimulus ŝ is estimated using various measures, including the mutual information I_m(s, ŝ). Since the decoded stimulus is a function of the response (i.e., ŝ = f(r)), the data-processing inequality states that I_m(s, ŝ) ≤ I_m(s, r). Thus, when we quantify how associated s and ŝ are, we are computing a lower bound on the true association between stimulus s and response r. Note that the estimated stimulus ŝ from each layer and the network readout z are separate quantities: the former is constructed by binning population spike trains at various Δt's and feeding them to the decoder, while the latter is purely a mechanism by which we train the network, thereby increasing the information in the deeper layers before the decoding analysis that forms the ŝ's.
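Concretely, any consistent estimator of I_m(s, ŝ) can serve as this lower bound. As a minimal sketch (not necessarily the exact estimator used in our analyses), the following Python snippet computes a plug-in histogram estimate of the mutual information between a true and a decoded stimulus; the bin count is an illustrative choice.

```python
import numpy as np

def plugin_mutual_information(s, s_hat, n_bins=20):
    """Plug-in (histogram) estimate of I(s; s_hat) in bits.

    s, s_hat : 1D arrays of true and decoded stimulus samples.
    n_bins is a hypothetical choice; finer binning needs more data.
    """
    joint, _, _ = np.histogram2d(s, s_hat, bins=n_bins)
    p = joint / joint.sum()                 # joint distribution p(s, s_hat)
    ps = p.sum(axis=1, keepdims=True)       # marginal p(s)
    ph = p.sum(axis=0, keepdims=True)       # marginal p(s_hat)
    nz = p > 0                              # avoid log(0)
    return float(np.sum(p[nz] * np.log2(p[nz] / (ps @ ph)[nz])))

# Toy check: a noisy reconstruction carries positive information about s,
# and by the data-processing inequality this lower-bounds I(s; r).
rng = np.random.default_rng(0)
s = np.sin(2 * np.pi * 4 * np.linspace(0, 1, 5000))
s_hat = s + 0.3 * rng.standard_normal(s.size)
print(plugin_mutual_information(s, s_hat))
```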

For a variety of stimuli, we demonstrate how this information changes in each layer as a function of the network structure and the timescale of spike counting Δt. When Δt is equal to the duration of the response window T, the input to our decoder is a vector of spike counts, one per neuron. When Δt = 1 ms (the time step of our simulations), the input to the decoder is a matrix of 1's and 0's indicating when spikes occurred at each time step, across all neurons in the population. Due to the loss in dimensionality of the neural representation implied by network convergence, we hypothesize that a temporal code (high information at small Δt but low information at high Δt) will be especially beneficial in bottlenecks. Conversely, large populations post-synaptic to network divergence should have less to gain from temporal codes, since they have high-dimensional representations even with a count or rate code (high information across all Δt's).

We first sought to test deterministic stimuli with fixed and well-defined frequency content, opting for sinusoidal stimuli of various frequencies. We start with analyses of the optimal temporal resolution of codes at the output layer. Although we always expect a decrease in information with decreasing temporal resolution (larger bin size Δt), different layers of the network, receiving varying ratios of convergent and divergent feedforward signals, will have different rates of information degradation with respect to increasing Δt. Layers with very steep slopes encode information with precisely timed spike codes, whereas layers with shallow slopes encode information with a coarser spike count code. In Fig 2, the information in the output layer declines more steeply with growing Δt in the case of the expansion hidden layer structure than in the bottleneck hidden layer structure, especially at higher stimulus frequencies. This is shown for a wide range of stimulus frequencies f_high in Fig 2C, where the slope of the information curves is plotted as a function of f_high. There is a general decrease in the slopes with increasing stimulus frequency for both bottleneck and expansion networks, owing to progressively better encoding of faster stimuli by spikes binned at higher temporal resolution. Additionally, for all frequencies f_high > 20 Hz tested, the slope distributions are significantly lower in the expansion hidden layer structure (where signals converge onto the output layer) than in the bottleneck hidden layer structure (where signals diverge onto the output layer). This demonstrates that structural convergence is more associated with precise spike timing codes than structural divergence, where information is relatively more preserved at coarser temporal resolutions. To ensure that this result did not depend on our specific choices, we tested different decoders and spiking neuron models in S1 and S2 Figs and found the same result.
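The slope analysis itself is straightforward; the sketch below illustrates it with synthetic stand-in curves (the real curves come from the decoder), fitting a line to R² vs. Δt per network seed and comparing the slope distributions with a one-sided Wilcoxon rank-sum test.

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(0)
dts = np.array([1.0, 5.0, 10.0, 25.0, 50.0])   # bin sizes (ms); illustrative grid

def information_slope(scores):
    """Slope of the best line fit to an R^2-vs-bin-size curve."""
    slope, _intercept = np.polyfit(dts, scores, deg=1)
    return slope

# Synthetic stand-ins for 10 seeds each: expansion curves fall faster with dt.
scores_expansion = [0.9 - 0.006 * dts + 0.01 * rng.standard_normal(dts.size)
                    for _ in range(10)]
scores_bottleneck = [0.9 - 0.002 * dts + 0.01 * rng.standard_normal(dts.size)
                     for _ in range(10)]

slopes_exp = [information_slope(s) for s in scores_expansion]
slopes_bot = [information_slope(s) for s in scores_bottleneck]

# One-sided test: are expansion slopes more negative (steeper) than bottleneck?
stat, p = ranksums(slopes_exp, slopes_bot, alternative="less")
print(f"rank-sum statistic = {stat:.2f}, p = {p:.3g}")
```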

Fig 2. Structural convergence to the output layer promotes timing codes across stimulus frequencies.


(A) Stimuli are sums of sines with fixed frequency component f_low = 4 Hz and variable component f_high. (B) Decoding accuracy based on output layer spikes binned at time resolution Δt. The blue points show the f_high = 50 Hz result for the network with N_h = 1000 hidden neurons, and the red points for the network with N_h = 10 hidden neurons. Lines are the best line fits, averaged over all network simulations. The slopes of these fits are plotted in C and E. (C) Slope of the R² vs. Δt fits as a function of the high-frequency stimulus component f_high. Asterisks denote where a one-sided Wilcoxon rank-sum test is significant at p < 0.05. (D) Mutual information rate R_info = I_m(s, ŝ)/T based on the output layer spikes binned at time resolution Δt. (E) Slope of the R_info vs. Δt curves as a function of f_high. Asterisks denote where a one-sided rank-sum test is significant at p < 0.05. Error bars represent distributions of the results over 10 independent network simulations.

For the same networks tested in Fig 2, we also performed a decoding analysis on the hidden layer for the case f_low = 4 Hz and f_high = 20 Hz in Fig 3. As a visual representation of how more precise temporal codes are associated with bottleneck populations of neurons, stimulus reconstructions are shown for N_h = 10 and N_h = 1000 in Fig 3A from spike trains binned at Δt = 5 ms and Δt = 50 ms. In the case of an expansion hidden layer (N_h = 1000), there is little difference between Δt = 5 ms and Δt = 50 ms; the drop in decoding accuracy when going from a more precise temporal code (Δt = 5 ms) to a less precise code (Δt = 50 ms) is only ΔR² = 0.078 (see right side of Fig 3B). However, when decoding from the hidden layer of the bottleneck network (N_h = 10), there is a large drop in decoding accuracy when going from the more precise code to the less precise one. From Fig 3A, it is clear that the drop in accuracy arises because the N_h = 10, Δt = 50 ms reconstruction misses the faster 20 Hz frequency component while the other reconstructions do not. By having a higher-dimensional representation of the input, the N_h = 1000 expansion layer can still encode these higher-frequency components even with a less precise code, binned over a time window equal to the period of the faster stimulus component. We again tested this result for an alternative spiking model, decoder, and association metric, finding the same general trend in S3–S6 Figs.

Fig 3. Bottlenecks have more to gain from temporal codes than expansion layers.


(A) Example reconstructions from the hidden layer spikes binned at Δt = 5 ms (left) and Δt = 50 ms (right) resolution for N_h = 10 (top) and N_h = 1000 (bottom). Thin traces show reconstructions from individual network seeds. Thick colored traces show means across all network seeds. (B) Decoding accuracy from the hidden layer spikes as a function of bin size Δt for the bottleneck (left) and expansion (right) network. Gray points denote which bin sizes were used to compute the accuracy gain ΔR². Error bars denote standard errors of the mean over network seed distributions. (C) Accuracy gain of the temporal code over the count code when reconstructing the stimulus based on spikes from the hidden layer, for bottleneck (red) and expansion (blue) networks. One-sided Wilcoxon rank-sum test, p < 6×10⁻¹⁰. Results are shown for 25 independent network simulations.

To explicitly show that the higher-frequency component fhigh=20 Hz contributes to the drop in decoding accuracy for Nh=10 at Δt = 50 ms in Fig 3, we decode the low frequency component flow=4 Hz separately from the high frequency component fhigh=20 Hz in Fig 4 for all layers of the bottleneck and expansion networks. At the input layer (left), both the bottleneck and the expansion networks have a very similar dependence of decoding accuracy R2 on bin size Δt for a given frequency component. When decoding from the hidden layer of either the bottleneck or expansion network, the decoding accuracy of the 4 Hz component remains constant for all Δt’s. However, there is a large discrepancy between the bottleneck and expansion networks when decoding the 20 Hz component from the hidden layer: the bottleneck has a steep decrease in decoding accuracy with increasing Δt while the expansion shows a much slower decrease in R2 with increasing Δt. Furthermore, going from the hidden layer to the output layer steepens the 20 Hz curve for the network with an expansion hidden layer, but leaves the 20 Hz curve for the network with a bottleneck hidden layer virtually unchanged. These results support the conclusion that populations post-synaptic to a network convergence encode high-frequency stimulus information with spike codes of high temporal resolution. Populations post-synaptic to structural divergence maintain similar information curves as their pre-synaptic layer, indicating that either spike count or spike timing codes are feasible for divergent populations. The analysis with mutual information is in S7 Fig.

Fig 4. Temporal codes capture high-frequency stimulus components more accurately in layers following structural convergence.


Decoding accuracy versus bin size for each layer of the bottleneck and expansion networks receiving a 4 Hz + 20 Hz sum of sines stimulus. The 4 Hz (top) and 20 Hz (bottom) components are decoded separately here. Error bars represent standard errors of the mean over 10 independent network seeds.

In the previous results, all stimuli used were sums of two sines. In Figs 5 and S8, we show accuracy gains in the hidden and output layers for four different stimuli. For a slow (5 Hz), continuous single-sine stimulus (top of S8 Fig), there is little to be gained from a more precise temporal code. For the other stimuli shown, which all include some sort of faster timescale or unpredictability, the hidden layer has a higher accuracy gain in a bottleneck network than in a uniformly structured (N_h = 100) or expansion-compression (N_h = 1000) network. For the white noise and binary stimuli, the output layer has significantly higher accuracy gains in the expansion-hidden-layer network (N_h = 1000) than in the networks without structural convergence onto the output layer. Together, these results demonstrate that structural convergence promotes temporal coding in networks responding to stimuli with fast timescales or unpredictability. For slow stimuli without fast jumps, there is little, if anything, to be gained from a temporal code for any network structure tested. The analysis with mutual information is shown in S9 Fig.

Fig 5. Stimulus-dependence of spike coding as shaped by convergent/divergent structure.


(A) Each row shows the stimulus used for the corresponding plots on the right. Filtered white noise was used instead of pure white noise since much of the variation in pure white noise is low-pass filtered by the membrane voltage of the neurons. (B) Decoding accuracy vs. the number of hidden neurons at Δt = 5 ms and Δt = 50 ms for the hidden layer (left) and output layer (right). (C) Accuracy gain (R² at Δt = 5 ms minus R² at Δt = 50 ms) vs. the number of hidden neurons. Asterisks denote where a one-sided Wilcoxon rank-sum test is significant (* for p < 0.05, ** for p < 0.01, and *** for p < 0.001). All boxplots represent distributions of the results over 25 independent network simulations. For the filtered white noise stimulus, the mean drop in accuracy gain for the hidden layer when going from N_h = 10 to N_h = 1000 is 0.14; the mean drop in accuracy gain for the output layer when going from N_h = 1000 to N_h = 10 is 0.04. For the binary stimulus, the accuracy gain drop is 0.04 for the hidden layer and 0.06 for the output layer.

Five-layer network model of the moth visuomotor pathway

Now that a relationship between optimal spike coding strategy and convergent/divergent structure has been established in a simple 3-layer model, we next test this model-based conclusion in a specific biological instance of a convergent/divergent neural network found in nature. Specifically, we focus on the visuomotor pathway of the hawkmoth Manduca sexta for its convergent/divergent architecture along the signal pathway and its relative behavioral simplicity during flower tracking [79]. The output of this system primarily consists of only 10 muscles that control wing motion, each acting effectively as a single motor unit or output channel. This compact set of muscles, recorded with spike-level resolution, encodes the majority of the information about motor output in its spike timing [25,49].

The output layer of the hawkmoth visuomotor pathway provides a nearly complete motor program for behavior, allowing near-perfect (>99%) reconstruction of behavioral output states [50] and between 85% and 95% reconstruction of the continuous 6-degree-of-freedom (DoF) body forces and torques [51,90]. The input layer corresponds to the visual system, which we have simplified here as a group of 48 motion-sensitive neurons [24] separated into two subpopulations, each tuned to a direction along a line. Intermediate layers of the moth's visuomotor pathway include the brain, the neck connective, and the thoracic motor circuits that drive the wing muscles. Structurally speaking, these stages correspond respectively to an expansion (from ~10^5 to ~10^6 neurons), a bottleneck (a large convergence from ~10^6 to ~10^3 neurons), and another expansion (from ~10^3 to ~10^4 neurons) before finally converging (from ~10^4 to ~10^1 neurons) at the output layer. For a schematic diagram of the moth's visuomotor pathway and our corresponding model network, see Fig 6. The size of each neural population in the model network was chosen to preserve the relative order of magnitude of divergence and convergence observed in the moth, within computational capacity. This is a very coarse representation of the real network: of course, many brain regions are not involved in target tracking, but the optic lobe and premotor regions make up a very large portion of the brain of moths and other insects [75–78].

Fig 6. Experimental system and network model.


Diagram of the central nervous system of the hawkmoth Manduca sexta and a schematic of the 5-layered spiking neural network developed here to model its visuomotor pathway. Numbers in parentheses denote the number of neurons in each population for the moth (orders of magnitude, left) and the model (exact, right).

Our first objective with the 5-layer network was to validate it against previous findings from the motor program of the hawkmoth. Although the real hawkmoth visuomotor pathway contains recurrent, within-layer connections, which have been shown to affect the coding of neuronal populations [44,45], here we focus on testing whether the feedforward pattern of convergence/divergence is sufficient to reproduce experimental results. In particular, Putney et al. [25] performed experiments in which hawkmoths were shown a robotic flower oscillating horizontally at 1 Hz, a stimulus that they are naturally inclined to track when foraging. During flower tracking, the 10 muscles coordinating flight were recorded with spike timing resolution down to 0.1 ms. The authors found that a significant majority of the mutual information between the spiking activity of these muscles and the motor output (forces/torques generated during flight) was encoded by spike timing rather than spike count in each unit; in fact, spike timing encoded three times more information than spike count. Subsequent analysis showed that the precision of the spike timing code was on the order of 1 ms across all output units [49].

We re-analyzed their data from moth motor units to first confirm this result, shown in Fig 7C. Next, we trained our 5-layer network model to autoencode the same 1 Hz stimulus that was used during the experiment and performed the same single-neuron information theoretic analysis for all layers of the model. Since there is no "motor output" from our model, we computed the mutual information between the stimulus and the response, which is analogous to the mutual information between motor output and response in a setting where the stimulus is being physically tracked. The results are shown in Fig 7. In particular, a large majority of the mutual information in the output layer of our model is encoded by spike timing (bottom of Fig 7C), just as found in the experimental data (top of Fig 7C). Furthermore, we show the single-neuron information rate averaged across all neurons within a layer in Fig 7B. The spike count information is low in all layers compared to the spike timing information. The single-neuron spike timing information starts low in the input layer, rises in the first expansion (E1) layer, falls in the bottleneck (B), rises slightly in the second expansion (E2), and rises again in the output layer. In the output layer, the spike timing information rate exceeds the spike count information rate by a much larger amount than it does in the input layer. At first glance, these results may seem contradictory to the previous analyses associating divergence with spike count codes and convergence with spike timing codes. The trend differs here because we are analyzing the coding strategy of single neurons, averaged across the whole population, whereas the previous analyses characterize the coding strategy of the population; the population analysis for the 5-layer network receiving a 1 Hz stimulus is shown in S10 Fig, where the stimulus is decoded almost perfectly (R² ≈ 1) in all layers at all timescales Δt. Note also that the information theoretic method used in this analysis is conservative in the sense that contributions from spike timing are only counted once those from spike count have been completely accounted for. We verified that our single-neuron mutual information estimates from the output layer were robust to the choice of hyperparameter and the number of data points in S11–S13 Figs. Overall, our result lends evidence to the notion that convergent/divergent structure in the hawkmoth visuomotor pathway supports a transformation from the input layer, where spike timing is less important, to the output layer, where spike timing provides an order of magnitude more information than spike count. Furthermore, when interpreted in light of the population decoding analysis showing near-perfect reconstruction across all Δt's in all layers (S10 Fig), the single-neuron analysis shown in Fig 7B indicates a high amount of redundancy in the large expansion layers. We also confirmed that pairwise redundancy in the output layer of the model is mostly contained in spike timing, not spike count (S14 Fig), which was another key result of ref. [25], demonstrating that coordination in hawkmoth hovering is achieved through spike timing and not spike count.

Fig 7. Single-neuron information during 1 Hz stimulus.


(A) Raster plot of the 5-layer network model trained to a 1 Hz sinusoidal stimulus. (B) Single-neuron information rate in each layer, decomposed into spike count and spike timing contributions. Each dot represents the result of a single network seed, averaged across all neurons in the layer. Lines connect the means of the distributions. (C) Mutual information in spike count and spike timing from the hawkmoth motor program (top) and the 10 neurons in the output layer of the model (bottom). The plots on the right show mutual information pooled from the output muscles (top) and the output layer of the model (bottom). Asterisks denote where a one-sided Wilcoxon rank-sum test is significant at p < 0.01. For the model, the mutual information I_m(s, R) is taken between stimulus and response; for the moth data, the mutual information I_m(m, R) is taken between motor output m and response. The single-neuron method depicted in Fig 1B and described in Methods was used here to compute mutual information, consistent with ref. [25], where the moth data was originally published. For the moth muscle results, boxplots represent distributions over 7 individual moths. For the model results, boxplots represent distributions over 25 independent network simulations.

Since the 1 Hz sinusoid was decoded very well in all layers and at all timescales (see S10 Fig), we sought to investigate which coding strategy is optimal for a more complex and biologically relevant stimulus. Specifically, we were interested in the idea that the bottleneck may filter noise in some way. To answer this, we trained the 5-layer network on a noiseless 4 Hz + 20 Hz sum of sines stimulus; its input was a version of the same stimulus with white noise added. In each layer, we decoded the noiseless stimulus from population spikes binned at various Δt's, with the results shown in Fig 8. We found that both expansion layers have a broader range of Δt's over which nearly perfect decoding accuracy is achieved than the smaller layers. This was quantified by computing the slope of the best line fits to the R² vs. Δt curves shown at the top of Fig 8B. The distributions of these slopes for each layer are shown at the bottom of Fig 8B, and also explicitly against layer in Fig 8C. A slope of zero means that there is no preference for spike count or spike timing. A negative slope indicates a gain in information with a spike timing strategy over a spike count strategy. For the noisy sum of sines used here, all of the slopes (except for one outlier in the E1 layer) were negative. However, the slopes were more negative in the bottleneck and output layers, supporting the conclusion that precise spike timing codes are more beneficial in layers following structural convergence than in layers post-synaptic to structural divergence.

Fig 8. Decoding analysis of a noisy 4 Hz + 20 Hz stimulus.


(A) The 5-layer network receives a sum of sines corrupted by noise, but is trained to encode the noiseless version at the output. The decoding is done with respect to the noiseless stimulus. (B) Decoding accuracy from spikes binned at resolution Δt, in each layer of the 5-layer model. Each gray trace represents an individual network seed. Black traces are the means across all network seeds (top). Distribution of slopes of the best line fits to the R² vs. Δt curves (bottom). (C) Slope distributions versus layer. Results are shown for 25 independent network simulations.

Discussion

We observe significant differences in information between spike count and spike timing representations as a function of convergent/divergent network structure. Although the stimulus reconstruction task is relatively low-dimensional, the fact that we are decoding from discrete spikes rather than continuous rates makes the problem more difficult. Nonetheless, we observe differences in performance between spike count and spike timing representations, even for large layers. The 3-layer network results show that bottleneck populations of neurons post-synaptic to a structural convergence have more to gain from precise spike timing codes than expansion layers, so long as the stimulus being encoded has sufficiently fast dynamics. The simple 5-layer network model, replicating the cascades of convergence and divergence in the moth sensory-motor pathway, reproduces the relative proportion of spike timing information previously measured at the single-unit level from the spike-resolved motor program of Manduca sexta. Notably, the amount by which spike timing information exceeds spike count information at the output layer is higher than at the input layer. Even without the extensive recurrence and reafferent sensing observed in biological networks, our simple feedforward model replicates the experimental result at the hawkmoth motor output. This suggests that the feedforward signal compressions and expansions induced by structural convergence and divergence can predict the relative information gain obtained by temporal coding along the various stages of the hawkmoth visuomotor pathway. Our work goes beyond previous theoretical studies of the effects of convergent and divergent structure on information processing [13,17,22] by establishing a relationship between this ubiquitous structural motif and the information encoded by spikes at various time resolutions in its constituent neurons.

A related but distinct concept to structural bottlenecks in biological neural networks is the information bottleneck: a variational method for extracting the most relevant information that a random variable X has about another random variable Y by finding an optimal compressed representation X̃ [85]. This method optimizes the tradeoff between prediction and compression and has been used to shed light on learning [84] and optimal architectures [83] in deep neural networks. While the vanilla information bottleneck method is agnostic to any particular mapping between X̃ and Y, recent work has extended the idea by finding an X̃ that is specific to the decoder used for downstream prediction [82], thus improving generalization in artificial neural networks. A similar variant of the information bottleneck was applied to neural data from cells of the retina, showing that predictive information about future visual inputs can be encoded and compressed by neurons post-synaptic to the retina [74]. Although the information bottleneck method is useful for understanding artificial neural networks [82–84] and neural data [74], its mapping onto structural bottlenecks in biological neural networks is unclear. Whereas information bottlenecks are optimal compressions in an abstract sense, the structural bottleneck studied here is a widely observed feature of biological networks, which we treat as a starting point whose consequences for information processing we study.

Network convergence and divergence are widespread across species and brain areas [14–16,18,19], but the implications of this structure for spike coding of time-dependent stimuli have not been well characterized. While the importance of spike timing at both sensory input [26–30] and motor output [31–35] is well established, its role in the intermediate processing stages shaped by structural convergence and divergence has been less clear [36,37,87,91]. Our results demonstrate that bottlenecks benefit greatly from a more temporally resolved spike code, more so than expansion layers, which have a plethora of neurons to represent a signal with spike counts. This finding is relevant to a variety of systems where cascades of network convergence and divergence are present, including visuomotor pathways, cerebellum-like structures, the early visual system, and olfactory systems [13,16–18]. Additionally, the nervous systems of segmented organisms contain neural ganglia that are often coupled by fewer fibers than they comprise, resulting in a convergent/divergent connectivity pattern that is another good candidate system for further exploring temporal coding [86]. Conversely, there also exist cases of extremely precise temporal coding in interneurons of bats [47] and owls [46] that may be interpreted, in the context of the present work, as potential bottlenecks. High temporal precision has also been found in the sensory neurons of mice [43], and it will be interesting for future work to investigate the minimum temporal resolution needed to preserve information in the large neuronal populations that these sensory neurons diverge onto, i.e., barrel cortex.

Recent theoretical work with groups of uncoupled Poisson neurons is consistent with our finding that larger populations of neurons can encode time-dependent signals well with a count code, whereas small populations must use precisely-timed spikes to achieve the same decoding accuracy [23]. Here we show that this trend extends to feedforward networks with convergent/divergent structure where the number of neurons pre-synaptic to a given population shapes the coding strategy of that population, even when the size of that population is fixed (as in Fig 2). Additionally, while our decoder is a nonlinear function (a recurrent neural network) trained on discrete sequences of population spiking, the decoder used in ref. [23] is a linear function of spikes convolved with an exponential filter of width 10 ms. Thus, our method accounts for sequences of population spike trains extended in time whereas the method used in ref. [23] decodes a continuous signal from a continuous representation of a spike train over a short time. Our technique is more consistent with emerging definitions of spike timing codes in which longer sequences of spikes are critical for encoding information [24,25,73,80]. In order to obtain elegant analytical results, the authors of ref. [23] assumed that their neural populations were not correlated with the signal being decoded, whereas the networks in our computational study explicitly encode the stimulus in the input layer and are trained to encode it at the output layer.

Whether our findings could be recapitulated in alternative learning models is an open question. Although artificial neural network (ANN) models can predict precise spike timing from biologically relevant stimuli [90], they are unable to make predictions about the role of spike timing in intermediate layers, since their units lack a spiking mechanism. Indeed, this frontier is where past work has delivered mixed results [36,37,91], and it was important for us to test. Other approaches like the "chronotron" [88] and "tempotron" [89] are single-neuron models that learn to classify inputs with distinct spike timing patterns. However, such a training method, although useful in other contexts, would bias the coding strategy toward spike timing, which is undesirable when the goal is to isolate the effect of network structure on coding, as in our study. For this reason, we chose to train our network in a way that is agnostic to the coding strategy at the output (see Eq. (2)), a notable strength of our approach.

There are many models of spike-generating mechanisms for the design of SNNs, and these may promote different coding and network features. For example, the dynamics of resonate-and-fire neurons [55] and their generalizations [54] are selective for stimuli of certain frequencies. This could be especially important in the context of spike timing codes, since patterns of pre-synaptic spikes with a steady firing rate equal to the natural frequency of the post-synaptic neuron will produce post-synaptic spikes with higher probability than other pre-synaptic firing rates [53]. This is in contrast to leaky integrate-and-fire (LIF) neurons, which are most likely to spike when the input amplitude is high and the frequency is low [53]. Although we tested models only within the LIF family, this model in its generalized form has been shown to reproduce a variety of experimentally measured neuronal spiking behaviors [56]. Thus, we expect that the general trends we observe in the two LIF models tested here will extend to other spike-generating mechanisms.

In summary, we found that convergent and divergent structure shapes the way in which populations of neurons encode high-frequency or less predictable dynamic stimulus information with precisely timed spikes. Structural bottlenecks resulting from network convergence benefit much more from precise spike timing than expansion layers arising from network divergence. A simple model recapitulates previous experimental findings at the motor output of the visuomotor pathway of the hawkmoth. While comprehensive experimental data across all layers of the hawkmoth visuomotor pathway are unavailable, our model makes predictions about unobserved populations and untested stimuli, which could be confirmed experimentally in future studies. In particular, our single-neuron analyses (Fig 7) predict high amounts of redundancy in the spike timing representation of simple visual stimuli by the populations comprising the brain and thoracic circuits of the hawkmoth. From our population decoding analyses (Figs 8 and S10), we predict that precise spike timing representations are needed for accurate tracking of fast stimuli in the bottlenecks of the hawkmoth visuomotor pathway (neck connective and motor neurons), but provide only marginal gains over spike count codes in the expansion layers (the brain and thoracic circuits). Overall, our work establishes a novel structure-function relationship in feedforward neural networks with signal convergence and divergence, elucidating how this structural motif, prevalent across neural systems and species, determines the optimal coding strategy with spikes.

Methods

Analytical support for spike timing and count codes

In this section, we present an analytical explanation for why network convergence promotes timing-based codes, demonstrating that a count-based code requires more neurons to achieve the same entropy upper bound as a timing-based code.

Single-neuron example

Consider the example where the response window is of duration T ms and the refractory period of the neuron is τ_ref ms. The total number of bins in which to place spikes is then n_bins = T/τ_ref. In the case of a spike count code, we may bin spikes at resolution Δt = T ms. Including the outcome of 0 spikes, the total number of outcomes for the spike count code is |S_c| = n_bins + 1. By assuming each of these outcomes is equally likely, the probability distribution becomes uniform, i.e. the probability of i spikes is p_i = 1/|S_c| for i = 0, 1, …, n_bins. Using this probability distribution with maximum entropy, we may calculate an upper bound on the true entropy of the spike count code. Let us refer to the true entropy of the spike count code as H_c and its upper bound as H̃_c. Then we have:

$H_c \le \tilde{H}_c$  (3)
$= -\sum_i p_i \log_2 p_i$  (4)
$= -\sum_{i=0}^{n_{\mathrm{bins}}} \frac{1}{|S_c|} \log_2 \frac{1}{|S_c|}$  (5)
$= \log_2 |S_c|$  (6)
$\tilde{H}_c = \log_2 (n_{\mathrm{bins}} + 1)$  (7)

Similarly, we may bin spikes at resolution Δt = τ_ref ms and list the possible spike timings as binary sequences. The total number of possible outcomes for the spike timing code is equal to the number of binary sequences of length n_bins, which is given by |S_t| = 2^(n_bins). Again assuming a uniform distribution p_i = 1/|S_t| for each outcome i = 1, …, |S_t|, the upper bound on the entropy of the spike timing code is:

$H_t \le \tilde{H}_t$  (8)
$= -\sum_i p_i \log_2 p_i$  (9)
$= -\sum_{i=1}^{|S_t|} \frac{1}{|S_t|} \log_2 \frac{1}{|S_t|}$  (10)
$= \log_2 |S_t|$  (11)
$= \log_2 (2^{n_{\mathrm{bins}}})$  (12)
$\tilde{H}_t = n_{\mathrm{bins}}$  (13)

Therefore, the entropy upper bound for the spike count code scales logarithmically with the duration of the response window, whereas that for the spike timing code scales linearly. For the example where T = 15 ms and τ_ref = 5 ms, the number of bins is n_bins = T/τ_ref = 3, and the maximum entropy rate is H̃_t = 3 bits per 15 ms for the spike timing code and H̃_c = 2 bits per 15 ms for the spike count code.
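These two bounds are easy to check numerically; the short Python snippet below reproduces the worked example.

```python
import numpy as np

T, tau_ref = 15, 5                # response window and refractory period (ms)
n_bins = T // tau_ref             # = 3 usable bins

H_count = np.log2(n_bins + 1)     # Eq. (7): 2 bits per 15 ms window
H_timing = float(n_bins)          # Eq. (13): 3 bits per 15 ms window
print(H_count, H_timing)
```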

Population of neurons

Let us now consider spike coding in a population of N_nrn neurons. In the case of a spike count code, each neuron in the population can fire anywhere between 0 and n_bins spikes. Therefore, the total number of outcomes is |S_c^pop| = (n_bins + 1)^(N_nrn). The upper bound of the population spike count code entropy is then:

$H_c^{\mathrm{pop}} \le \tilde{H}_c^{\mathrm{pop}}$  (14)
$= \log_2 |S_c^{\mathrm{pop}}|$  (15)
$= \log_2 (n_{\mathrm{bins}} + 1)^{N_{\mathrm{nrn}}}$  (16)
$\tilde{H}_c^{\mathrm{pop}} = N_{\mathrm{nrn}} \log_2 (n_{\mathrm{bins}} + 1)$  (17)

For the spike timing code, each neuron in the population can fire one of 2^(n_bins) possible spike sequences. Thus, the total number of possible outcomes for the population spike timing code is |S_t^pop| = (2^(n_bins))^(N_nrn) = 2^(n_bins·N_nrn). The upper bound of the entropy for the population spike timing code is:

$H_t^{\mathrm{pop}} \le \tilde{H}_t^{\mathrm{pop}}$  (18)
$= \log_2 |S_t^{\mathrm{pop}}|$  (19)
$= \log_2 2^{n_{\mathrm{bins}} N_{\mathrm{nrn}}}$  (20)
$\tilde{H}_t^{\mathrm{pop}} = n_{\mathrm{bins}} N_{\mathrm{nrn}}$  (21)

The entropy upper bounds for both the spike count and spike timing codes grow linearly with the number of neurons N_nrn, but with different slopes. For the spike count code, the slope is log₂(n_bins + 1); for the spike timing code, the slope is n_bins. The slopes of the entropy upper bounds of both codes are plotted as a function of n_bins in Fig 9A, where we can see that the spike timing entropy slope exceeds the spike count slope at all values of n_bins > 1. Furthermore, the gain in entropy of a spike timing code over a spike count code becomes greater as longer spike trains are considered (i.e., as n_bins is increased). With the parameter values T = 15 ms and τ_ref = 5 ms, we plot the maximum entropy rate H̃/T as a function of the number of neurons N_nrn in Fig 9B. For N_nrn = 10 neurons, we can see that the spike timing code achieves an entropy rate of H̃_t^pop/T = 2 bits/ms, whereas the spike count code can only encode H̃_c^pop/T ≈ 1.3 bits/ms. This analytical finding is consistent with our main computational Results, showing that spike timing codes increase the coding capacity of small populations of neurons post-synaptic to a network convergence. To reach the entropy achieved by a given population employing a spike timing code, but with a spike count code, the number of neurons in the population must increase.
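Expressed as rates, the population bounds can be verified the same way; the snippet below reproduces the N_nrn = 10 example and the population size a count code would need to match the timing code.

```python
import numpy as np

T, tau_ref = 15, 5
n_bins = T // tau_ref                        # 3 bins per response window

def count_rate(N_nrn):
    """Eq. (17) as a rate: max entropy (bits/ms) of a population count code."""
    return N_nrn * np.log2(n_bins + 1) / T

def timing_rate(N_nrn):
    """Eq. (21) as a rate: max entropy (bits/ms) of a population timing code."""
    return N_nrn * n_bins / T

print(timing_rate(10))                       # 2.0 bits/ms
print(count_rate(10))                        # ~1.33 bits/ms

# Count-code population matching a timing code of N neurons:
# N_count = N * n_bins / log2(n_bins + 1) = 15 neurons here.
print(10 * n_bins / np.log2(n_bins + 1))
```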

Fig 9. Maximum entropy of population spike codes.


(A) Slope of the entropy vs. population size curves, as a function of the number of time bins. The purple curve is the linear function y = n_bins for the spike timing code, and the teal curve is the function y = log₂(n_bins + 1) for the spike count code. (B) Example of the entropy rate vs. population size for both types of spike code. We set T = 15 ms and τ_ref = 5 ms here so that n_bins = T/τ_ref = 3.

Spiking neuron models

The Python package snnTorch [38] was used to train and run simulations of the spiking neural networks (SNNs) studied here. The spiking neuron model that is used for all primary results is the spike response model or “alpha” neuron. We have also implemented a simple leaky integrate-and-fire neuron model, to verify that the main results are not model-dependent. Parameter values used for the alpha neuron model are listed in Table 1 and for the LIF neuron in Table 2.

Table 1. Parameter values for the neurons in the alpha neuron model.

The symbol U(A, B) denotes the uniform distribution between A and B.

Alpha neuron parameter initializations
Parameter name | Symbol | Value
Excitatory current decay rate | α | U(0.7, 0.9)
Inhibitory current decay rate | β | α − 0.1
Reset membrane potential | U_reset | 0
Threshold membrane potential | U_thr | U(0, 0.5)

Table 2. Parameter values for the neurons in the LIF neuron model.

The symbol U(A, B) denotes the uniform distribution between A and B.

LIF neuron parameter initializations
Parameter name | Symbol | Value
Membrane potential decay rate | β | U(0.7, 0.9)
Reset membrane potential | U_reset | 0
Threshold membrane potential | U_thr | U(0, 1.1)

The evolution of the alpha neuron is governed by the following difference equations:

$I_{\mathrm{exc}}[t+1] = \alpha I_{\mathrm{exc}}[t] + I_{\mathrm{in}}[t+1]$  (22)
$I_{\mathrm{inh}}[t+1] = \beta I_{\mathrm{inh}}[t] - I_{\mathrm{in}}[t+1]$  (23)
$U[t+1] = \tau_\alpha (I_{\mathrm{exc}}[t+1] + I_{\mathrm{inh}}[t+1])$  (24)

where α is the decay rate of the excitatory current I_exc and β is the decay rate of the inhibitory current I_inh. The term I_in represents external current, which comes either from a stimulus or from pre-synaptic spiking. The time constant for the membrane potential U is given by $\tau_\alpha = \frac{\ln \alpha}{\ln \beta - \ln \alpha} + 1$. To ensure that positive inputs increase the membrane potential, we set α > β.

The leaky integrate-and-fire (LIF) neuron is governed by

$U[t+1] = \beta U[t] + I_{\mathrm{in}}[t+1]$  (25)

where β is the decay rate of the membrane potential U and I_in is the input current. For both the alpha and LIF neuron models, we set U[t+1] = U_reset whenever the membrane potential exceeds the spiking threshold, U[t] > U_thr.
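For illustration, a minimal NumPy sketch of both difference-equation models is shown below. The parameter values are single draws from the ranges in Tables 1 and 2, and the alpha-neuron reset (zeroing both synaptic currents so that the membrane returns to U_reset) is our assumption, since U is a deterministic function of the currents; the snnTorch implementations are the reference.

```python
import numpy as np

def simulate_lif(I_in, beta=0.8, U_thr=0.5, U_reset=0.0):
    """Leaky integrate-and-fire neuron, Eq. (25), with hard reset."""
    U, spikes = 0.0, np.zeros(len(I_in))
    for t in range(len(I_in)):
        U = beta * U + I_in[t]
        if U > U_thr:
            spikes[t] = 1.0
            U = U_reset
    return spikes

def simulate_alpha(I_in, alpha=0.8, beta=0.7, U_thr=0.25, U_reset=0.0):
    """Alpha (spike response) neuron, Eqs. (22)-(24). Note alpha > beta."""
    tau_a = np.log(alpha) / (np.log(beta) - np.log(alpha)) + 1
    I_exc, I_inh = 0.0, 0.0
    spikes = np.zeros(len(I_in))
    for t in range(len(I_in)):
        I_exc = alpha * I_exc + I_in[t]
        I_inh = beta * I_inh - I_in[t]
        U = tau_a * (I_exc + I_inh)
        if U > U_thr:
            spikes[t] = 1.0
            # Assumption: zero both currents so that U returns to U_reset.
            I_exc, I_inh = 0.0, 0.0
    return spikes

# Example: drive both neurons with a noisy 4 Hz sinusoidal current (1 ms steps).
t = np.arange(1000)
I = 0.2 * (1 + np.sin(2 * np.pi * 4 * t / 1000)) + 0.05 * np.random.randn(1000)
print(simulate_lif(I).sum(), simulate_alpha(I).sum())
```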

Our network's input layer is inspired by insect visual systems with a mechanism for motion direction selectivity in two dimensions of the visual scene [62,96]. We designed the input layer of our models to be tuned to various regions of a 2-dimensional plane (the visual scene). For the ith neuron in the subpopulation of the input layer tuned to quadrant q, the input it receives is

$I_{\mathrm{in}}^{q,i} = \begin{cases} \sqrt{s_x^2 + s_y^2} & \text{if } (s_x - s_{q,i},\; s_y - s_{q,i}) \in \text{quadrant } q \\ 0 & \text{otherwise} \end{cases}$  (26)

where q = 1, 2, 3, 4 and i = 1, …, N_in/4. The stimulus s is a time-dependent vector with two components s_x and s_y denoting the x- and y-positions of a moving object. The total number of neurons in the input layer is N_in. We generate a set of random offsets s_{q,i} ~ N(0, 0.1) for each quadrant, independently sampled for each neuron. The purpose of this is to encourage smooth transitions between the firing of the four subpopulations, which is more biologically realistic than discrete switching.
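A sketch of this input encoding is given below. The mapping of sign patterns to the quadrant labels q = 1, …, 4 and the interpretation of N(0, 0.1) as a variance are our assumptions for illustration.

```python
import numpy as np

def input_currents(s_xy, N_in=100, seed=0):
    """Quadrant-tuned input currents, a sketch of Eq. (26).

    s_xy : (s_x, s_y) stimulus position at a single time step.
    Each neuron i carries a private offset s_{q,i}.
    """
    rng = np.random.default_rng(seed)
    offsets = rng.normal(0.0, np.sqrt(0.1), size=N_in)  # s_{q,i} ~ N(0, 0.1)
    quadrants = np.repeat([1, 2, 3, 4], N_in // 4)      # neuron -> preferred quadrant
    sx, sy = s_xy[0] - offsets, s_xy[1] - offsets       # offset-shifted stimulus

    # Quadrant membership of the shifted stimulus (sign convention assumed).
    in_quadrant = {1: (sx >= 0) & (sy >= 0), 2: (sx < 0) & (sy >= 0),
                   3: (sx < 0) & (sy < 0),   4: (sx >= 0) & (sy < 0)}
    active = np.array([in_quadrant[q][i] for i, q in enumerate(quadrants)])
    return np.where(active, np.hypot(s_xy[0], s_xy[1]), 0.0)

print(input_currents(np.array([0.5, -0.3]))[:8])
```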

Network connectivity

To capture critical biological features, our spiking neuron model includes both excitatory and inhibitory input synapses because of their role in passing on relevant sensory information and evoking balanced motor responses in the sensorimotor pathway [20].

All layers besides the input layer of our feedforward network models receive inputs solely from neurons in pre-synaptic layers. The ith neuron in the (k+1)th layer receives the input current

$I_{\mathrm{in}}^{i,k+1}[t] = \sum_{j=1}^{N_{\mathrm{pre}}} W_{ij}^{k,k+1} X_j^k[t]$  (27)

where N_pre is the number of neurons in the layer pre-synaptic to neuron i in layer k+1, W^{k,k+1} is the synaptic weight matrix from layer k to layer k+1, and X_j^k[t] = 1 if pre-synaptic neuron j in layer k spiked at time t and X_j^k[t] = 0 otherwise. The entries of W^{k,k+1} that are non-zero with probability p are distributed according to W_ij^{k,k+1} ~ N(0, 1/(pN_pre)). Thus, excitatory and inhibitory connections are equally probable in our model and may both exist for the same pre-synaptic neuron; Dale's law was disregarded for simplicity. We set the connection probability to p = 0.7 for all models.
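A sketch of this initialization is shown below, interpreting 1/(pN_pre) as the variance of the normal distribution.

```python
import numpy as np

def random_feedforward_weights(N_pre, N_post, p=0.7, seed=0):
    """Sketch of the synaptic weight initialization W^{k,k+1}.

    Entries are nonzero with probability p and drawn from a zero-mean
    normal whose variance 1/(p * N_pre) keeps the total input variance
    roughly independent of the pre-synaptic layer size.
    """
    rng = np.random.default_rng(seed)
    mask = rng.random((N_post, N_pre)) < p               # which synapses exist
    W = rng.normal(0.0, np.sqrt(1.0 / (p * N_pre)), size=(N_post, N_pre))
    return np.where(mask, W, 0.0)                        # signed: no Dale's law

# Example: bottleneck layer of 10 neurons receiving from 100 input neurons.
W = random_feedforward_weights(N_pre=100, N_post=10)
print(W.shape, (W != 0).mean())                          # (10, 100), ~0.7
```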

Network training

The synaptic weights between layers of spiking neurons in our networks are optimized with backpropagation through time (BPTT) to minimize the following loss function:

$L_{\mathrm{MSE}}(z,s) = \frac{1}{N_t} \sum_{t=1}^{N_t} (z_t - s_t)^2$  (28)

where N_t is the total number of time points, s_t is the true stimulus at time t, and z is a readout from the output layer of the form

$z = \gamma z_{\mathrm{time}} + (1 - \gamma) z_{\mathrm{count}}$  (29)

where γ = 0.5 equally weighs spike count and timing, z_time = W_time r_time, and z_count = W_count r_count. The matrices W_count ∈ ℝ^{N_out × d_s} and W_time ∈ ℝ^{N_out × d_s} are read-out weights whose entries are initialized randomly from a normal distribution N(0, 0.1). The symbol d_s denotes the dimensionality of the stimulus dynamics: either d_s = 1 for the 5-layer network or d_s = 2 for the 3-layer network. The quantities r_time ∈ ℝ^{N_t × N_out} and r_count ∈ ℝ^{N_t × N_out} are convolutions of the output layer’s spike trains with two different kernels:

r_time = K_time ∗ ρ_out   (30)
r_count = K_count ∗ ρ_out   (31)

where ∗ denotes convolution. The binarized population spike trains of the output layer, ρ_out ∈ B^{N_t × N_out} (where B = {0, 1}), are convolved with the kernels K_time and K_count, which are of the form

K(t) = exp[−(t − Δt/2)²/σ]  if 0 < t < Δt,  and 0 otherwise   (32)

where Δt = 10 ms for K_time ∈ ℝ^{10 × N_out} and Δt = 70 ms for K_count ∈ ℝ^{70 × N_out}. We chose Δt = 10 ms as the scale of the timing convolution because typically only 1–3 spikes fall within this window, and smaller values of Δt resulted in poor training. The value Δt = 70 ms was chosen for the count convolution because multiple spikes usually fall within this window; values larger than 70 ms also resulted in poorer training. The standard deviation is set to σ = 0.1 ms² for both kernels.
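The two-kernel readout of eqs (29)–(32) can be sketched as follows. For clarity this sketch shares one kernel across all output neurons and uses "same" padding for the convolution; both are simplifying assumptions.

```python
import numpy as np

def kernel(dt_window, sigma=0.1):
    """Eq (32): Gaussian bump centered at dt_window/2, zero outside (0, dt_window)."""
    t = np.arange(1, dt_window)
    return np.exp(-(t - dt_window / 2) ** 2 / sigma)

K_time, K_count = kernel(10), kernel(70)   # 10 ms timing kernel, 70 ms count kernel

def readout(rho_out, W_time, W_count, gamma=0.5):
    """Eqs (29)-(31): convolve each neuron's spike train with both kernels,
    then mix the timing and count readouts with weight gamma."""
    r_time = np.apply_along_axis(np.convolve, 0, rho_out, K_time, mode="same")
    r_count = np.apply_along_axis(np.convolve, 0, rho_out, K_count, mode="same")
    return gamma * r_time @ W_time + (1 - gamma) * r_count @ W_count
```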

The read-out weights W_time and W_count, the membrane decay rates α and β, and the synaptic weights (see Spiking neuron models) of the spiking neural network are all trained via backpropagation to minimize the mean-squared error. Fig 10 shows the MSE loss over training, along with an example of the read-out compared to the stimulus after training.

Fig 10. Network training.

(A) Reduction of MSE loss through training with BPTT. Thin gray traces show individual network seeds; the thick black trace shows the average across all 25 seeds. (B) Readout after training 3-layer networks with N_in = N_h = N_out = 100 on the 4 Hz + 20 Hz sum-of-sines stimulus. Colored traces show the readout; the black trace denotes the true stimulus presented to the network. The top shows the x-dimension of the stimulus and the bottom shows the y-dimension.

Decoding analysis

To determine how the population responses of layers in our network model relate to stimuli, we trained and tested decoders [42]. In particular, long short-term memory (LSTM) and gated recurrent unit (GRU) networks were used to predict the stimulus at time t from the neural response during the window [t, t + T], binned at resolution Δt; that is, we use each spike-train window to decode the stimulus value at its beginning. To clarify this process, suppose a neural recording of t_f = 10 time steps results in the following spike train:

ρ=[1,0,0,1,1,0,1,0,1,0]

where “0” represents no spike and “1” represents a spike. Sliding a rectangular window of width T = 8 over this spike train yields

[1, 0, 0, 1, 1, 0, 1, 0]
[0, 0, 1, 1, 0, 1, 0, 1]
[0, 1, 1, 0, 1, 0, 1, 0]

Each of these windows is then sub-divided into bins of size Δt. If Δt = T = 8, the binned response R becomes a vector of spike counts over the response window T:

R = [4]
    [4]
    [4]

If Δt = 4, then the binned response is

R = [2, 2]
    [2, 2]
    [2, 2]

If Δt = 2, then the binned response is

R = [1, 1, 1, 1]
    [0, 2, 1, 1]
    [1, 1, 1, 1]

And if Δt = 1, then the binned response becomes identical to the original binary spike train ρ.

The above matrix R has size (n_samples × n_bins), where n_samples = t_f − T + 1 = 10 − 8 + 1 = 3 and n_bins = T/Δt. If instead of 1 neuron we have recordings from N_nrn neurons (as in our population decoding analyses), the same procedure is performed on each neuron’s spike train and the resulting matrices are stacked together to form a tensor of dimension (n_samples × n_features × n_bins), where n_features = N_nrn. This tensor is used to decode the stimulus over time. The dimension along which spikes are binned at resolution Δt is treated as a hidden state by the LSTM and GRU decoders, so that decoding depends on specific spike sequences. The stimulus is stored as a matrix S of size (n_samples × d_s), where d_s is the dimension of the stimulus, either 1 or 2 here. The task of decoding is to find a function f that forms an estimate Ŝ = f(R) of the true stimulus S, minimizing the error Σ_{i,j} (ŝ_ij − s_ij)². In our analysis, the control parameter Δt is varied to modulate the temporal resolution with which spikes are counted. When Δt = 1, there is no difference between ρ and R, and the specific timing of every spike is preserved. As Δt is increased, spike timings within the larger window of size T become increasingly blurred; the maximum value Δt = T yields a vector R whose entries are the numbers of spikes that occurred in the respective windows of duration T. Conversely, as Δt decreases, the code depends increasingly on spike timing rather than spike count.
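A minimal sketch of this windowing-and-binning procedure, reproducing the worked example above:

```python
import numpy as np

def bin_responses(rho, T, dt):
    """Slide a width-T window over a binary spike train, then count spikes in bins of size dt."""
    n_samples = len(rho) - T + 1
    windows = np.stack([rho[i:i + T] for i in range(n_samples)])   # (n_samples, T)
    return windows.reshape(n_samples, T // dt, dt).sum(axis=2)     # (n_samples, n_bins)

rho = np.array([1, 0, 0, 1, 1, 0, 1, 0, 1, 0])
print(bin_responses(rho, T=8, dt=8))   # [[4] [4] [4]]
print(bin_responses(rho, T=8, dt=2))   # [[1 1 1 1] [0 2 1 1] [1 1 1 1]]
```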

We used the Python package keras to implement the LSTM and GRU decoders. Cross-validation was performed by maximizing validation accuracy with Bayesian optimization [68] to select hyperparameters.
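As an illustration of the decoding step, here is a minimal keras LSTM decoder under assumed shapes; the layer size, optimizer, and random data are placeholders, not the Bayesian-optimized hyperparameters used in the paper.

```python
import numpy as np
from tensorflow import keras

# Hypothetical shapes: 200 window samples, 4 time bins, 100 neurons, d_s = 2.
R = np.random.rand(200, 4, 100)   # keras RNNs expect (samples, timesteps, features)
S = np.random.rand(200, 2)

model = keras.Sequential([
    keras.Input(shape=R.shape[1:]),
    keras.layers.LSTM(32),            # hidden state integrates the binned spike sequence
    keras.layers.Dense(S.shape[1]),   # linear map to the stimulus dimensions
])
model.compile(optimizer="adam", loss="mse")
model.fit(R, S, epochs=5, verbose=0)
S_hat = model.predict(R)              # decoded stimulus estimate
```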

Single-neuron information theoretic analysis

We follow Putney et al. [25] for the single-neuron mutual information analysis. Briefly, the idea is to compute the mutual information I_m between motor output m and single-neuron response R via

I_m(m, R) = I_m(m, R_c) + Σ_{i=1}^{R_c,max} p(R_c = i) · I_m(R_t, m | R_c = i)   (33)

where R_c is the spike count, R_t is the vector of spike timings, and m is the first two principal components of the motor output (forces/torques generated by the wing muscles during hover feeding). The first term in eq. (33) is what we label the “spike count” information, and the second term is the “spike timing” information in the single-neuron analyses of the 5-layer network results. In our implementation, R_c ∈ ℕ_0^{N_T}, where ℕ_0 denotes the set of non-negative integers and N_T = N_t/T is the number of non-overlapping response windows of duration T falling within the experiment or simulation of duration N_t. For the moth experiments, T = 50 ms equals the wingstroke period of the animal, so N_T equals the number of wingstrokes in this context. The spike timing matrix R_t ∈ ℝ^{N_T × R_c,max} contains the spike timings within each response window, where R_c,max is the maximum number of spikes observed in a single wingstroke. The quantity p(R_c = i) denotes the probability that a spike count of i was observed. The mutual information in spike count, I_m(m, R_c), and the mutual information in spike timing conditioned on spike count, I_m(R_t, m | R_c = i), were both estimated numerically using the Kraskov–Stögbauer–Grassberger (KSG) method [65] (see Association measures). Since there is no “motor output” for our network model, we performed this analysis by substituting the stimulus s for the motor output m in eq. (33). As the stimulus represents the position of a target that moths follow, it is reasonable to assume that the stimulus information is directly reflected in the motor output.
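A rough sketch of eq (33) using scikit-learn’s KSG-based estimator is given below. Note one simplification: the conditional timing term estimates mutual information for each spike-time column separately and sums them, which only approximates the joint conditional mutual information computed in the paper (via Associations.jl).

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression  # KSG-based estimator

def count_timing_information(R_c, R_t, m, min_samples=10):
    """Sketch of eq (33). R_c: (N_T,) spike counts; R_t: (N_T, R_c_max) spike times
    (padded after the last spike); m: (N_T,) one motor/stimulus dimension."""
    I_count = mutual_info_regression(R_c.reshape(-1, 1), m)[0]   # I_m(m, R_c)
    I_timing = 0.0
    for c in range(1, int(R_c.max()) + 1):
        idx = R_c == c
        if idx.sum() < min_samples:           # skip counts with too few windows
            continue
        p_c = idx.mean()                      # p(R_c = c)
        times = R_t[idx, :c]                  # the c spike times in each such window
        I_timing += p_c * mutual_info_regression(times, m[idx]).sum()
    return I_count, I_timing
```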

Association measures

To quantify the amount of information between stimulus and response in our population decoding analyses, we employ various association measures between the true stimulus S and the decoded stimulus Ŝ = f(R) based on the response R. If we define I_m(X, Y) as the mutual information between random variables X and Y, the data-processing inequality states that I_m(S, Ŝ) ≤ I_m(S, R), since the estimate Ŝ = f(R) cannot contain more information about S than the response R from which it is computed [66]. For large populations of neurons and small Δt, the response matrix R becomes very high dimensional, rendering the quantity I_m(S, R) difficult to estimate directly [64]. Thus, we instead estimate the quantity I_m(S, Ŝ), which forms a lower bound on the true mutual information of interest I_m(S, R). This is done via the Kraskov–Stögbauer–Grassberger (KSG) method [65], employed via scikit-learn. For the single-neuron mutual information calculations, we used the Julia package Associations.jl. In addition to mutual information, which captures nonlinear associations between variables, we also report the coefficient of determination R² (decoding accuracy), which measures linear association.
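For the population analyses, the lower-bound estimate I_m(S, Ŝ) and the decoding accuracy R² can be computed along these lines; estimating the mutual information per stimulus dimension is an assumption of this sketch.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression
from sklearn.metrics import r2_score

def association(S, S_hat):
    """KSG estimate of I(S, S_hat) per stimulus dimension -- a lower bound on
    I(S, R) by the data-processing inequality -- plus the R^2 decoding accuracy."""
    mi = np.array([mutual_info_regression(S_hat[:, [d]], S[:, d])[0]
                   for d in range(S.shape[1])])
    return mi, r2_score(S, S_hat)
```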

Supporting information

S1 Fig. Version of Fig 2 but with GRU decoder. (TIF)
S2 Fig. Version of Fig 2 but with LIF spiking model. (TIF)
S3 Fig. Version of Fig 3 but with mutual information. (TIF)
S4 Fig. Version of Fig 3 but with GRU decoder. (TIF)
S5 Fig. Version of Fig 3 but with GRU decoder and mutual information. (TIF)
S6 Fig. Version of Fig 3 but with LIF neuron model. (TIF)
S7 Fig. Frequency analysis robustness to information metric. (TIF)
S8 Fig. Decoding analysis for sinusoidal stimuli. (TIF)
S9 Fig. All stimuli analysis robustness to information metric. (TIF)
S10 Fig. Decoding analysis of 5-layer model receiving 1 Hz stimulus. (TIF)
S11 Fig. Single-neuron information across layers of the 5-layer model for three values of k, the number of nearest neighbors in the KSG method. (TIF)
S12 Fig. Single-neuron information across layers of the 5-layer model for various data set sizes n. (TIF)
S13 Fig. Robustness of the single-neuron information estimates. (TIF)
S14 Fig. Interaction information in experimental data vs. the 5-layer model. (TIF)
S1 Appendix. Additional descriptions of supplementary figures. (PDF)

Acknowledgments

We thank Soon Ho Kim for insightful comments and discussions on the manuscript.

Data Availability

All code for reproducing the analytical and computational figures in this work is available at https://github.com/HChoiLab/SpikeCoding.

Funding Statement

Research was supported by the National Institute of General Medical Sciences of the National Institutes of Health under award number T32GM142616 to ZM, the Achievement Rewards for College Scientists scholar award to ZM, the Alfred P. Sloan Foundation Fellowship in Neuroscience FG-2022-18969 to HC, and the Air Force Office of Scientific Research MURI grant FA9550-22-1-0315 to SS and HC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
