Abstract
The brain’s spatial orientation system uses different neuron ensembles to aid in environment-based navigation. Two of the ways brains encode spatial information are through head direction cells and grid cells. Brains use head direction cells to determine orientation, whereas grid cells form overlapping layers of neurons whose combined activity supports environment-based navigation. These neurons fire in ensembles, where several neurons fire at once to represent a single head direction or grid. We want to capture this firing structure and use it to decode head direction and animal location from head direction and grid cell activity. Understanding, representing, and decoding these neural structures require models that encompass higher-order connectivity, more than the one-dimensional connectivity that traditional graph-based models provide. To that end, in this work, we develop a topological deep learning framework for neural spike train decoding. Our framework combines unsupervised simplicial complex discovery with the power of deep learning via a new architecture we develop herein called a simplicial convolutional recurrent neural network. Simplicial complexes, topological spaces that use not only vertices and edges but also higher-dimensional objects, naturally generalize graphs and capture more than just pairwise relationships. Additionally, this approach does not require prior knowledge of the neural activity beyond spike counts, which removes the need for similarity measurements. The effectiveness and versatility of the simplicial convolutional recurrent neural network are demonstrated on head direction and trajectory prediction via head direction and grid cell datasets.
Significance
We propose the simplicial convolutional recurrent neural network (SCRNN) as a method for decoding navigation cell spike trains. The SCRNN utilizes simplicial complexes, a tool from computational topology that captures higher-order connectivity, paired with a recurrent neural network to decode head direction and grid cell spiking data. The simplicial convolutional layers capture the firing structure of the neurons and pass the underlying connectivity as an input to the recurrent neural network. We compared the median absolute error of the optimized SCRNN to those of three optimized traditional neural networks: a feedforward neural network, a recurrent neural network, and a graph neural network. We conclude that the SCRNN is able to predict head direction and grid activation better than these three traditional neural networks.
Introduction
Neurophysiological recording techniques have produced simultaneous recordings from increased numbers of neurons, both in vitro and in vivo, allowing for access to the activity of the hundreds of neurons required to encode certain variables (1,2,3,4). This makes efficient algorithms for decoding the information content from neural spike trains of increasing interest. Neural decoding can help provide insight into the function and significance of individual neurons or even entire regions of the brain (5). Additionally, neural decoding provides a foundation for new machine learning algorithms that leverage the mammalian brain structure. Utilizing a lower-dimensional structure is one way the mammalian brain brings efficiency into neural data processing. Head direction (HD) cells and grid cells are two types of brain cells recorded in a quantity that allows for the analysis of their functional connectivity and structure of their population activity (1,6). The activity of HD cells has been shown to lie on a circle (7), whereas the activity of a module of grid cells lies on a torus (1). Hence, algorithmic tools that capture and utilize the inherent structure in the data are well equipped to decode neural spiking data.
Decoding methods typically employ statistical or deep-learning-based models, since decoding may be viewed as a regression problem in which we learn the relationship between the spike trains (the independent variables) and the decoded dependent variable. Statistical methods such as linear regression, Bayesian reconstruction, and Kalman filtering are utilized for their interpretability and relatively low computational costs (6,8,9). On the other hand, deep learning for neural decoding is a rapidly growing field due to neural networks’ (NNs’) observed success at time series tasks like sequence prediction and their ability to generalize beyond training data (9,10,11,12,13,14,15). NNs have outperformed statistical methods at decoding HD and two-dimensional, environment-based position from neural recordings of HD cells and place cells, respectively (9,16,17). Deep learning’s superior decoding performance has been observed for a variety of network architectures including recurrent NNs (RNNs) (18,19), fully connected feedforward NNs (FFNNs), and convolutional NNs (CNNs) (11,20,21,22). The smaller network sizes required for success in decoding compared to visual tasks allow for state-of-the-art performance on limited amounts of data (8). However, these deep learning applications to neural decoding utilize architectures that ignore the underlying structure of the input neural activity.
One approach is to look at the underlying graphical structure of the neurons and the neuronal maps and to utilize this information for feature extraction. Graph NNs (GNNs) feed one-dimensional (pairwise) connectivity information into an NN and use that information to update the network weights (23). Although graphs are able to capture pairwise connectivity, neurons in the brain form networks that lead to correlated activity across multiple neurons (24,25). Beyond these structural connections, higher-dimensional functional connectivity has been observed within groups of neurons exhibiting similar firing properties, for example, grid cells within a module (26). Simplicial complexes, topological spaces with the ability to describe multiway relationships, naturally lend themselves to defining and encapsulating the hierarchical properties of neuronal data (26,27), making them an increasingly popular tool for representing neural activity (1,7,28,29,30,31,32,33). Hence, there exist simplicial CNNs (SCNNs) that account for this higher-order connectivity (34,35,36).
Our proposed approach, the simplicial convolutional RNN (SCRNN), combines the connectivity-based structure of the SCNN with the power of an RNN. First, the neural activity is defined on a simplicial complex via a preprocessing procedure. Neural spikes are binned to generate a binarized spike count matrix, where each set of active cells within a time bin is connected by a simplex. The construction of the simplicial complex makes no assumptions about the spike train’s encoding, and the higher-dimensional connectivity of the simplicial complex enriches the feature representation. Then, each simplicial complex is fed into SC layers for feature extraction. Next, the outputs of the final SC layer are concatenated to form a single feature vector, which is fed into the RNN portion of the network. Finally, the algorithm predicts either an HD or a location, depending on the dataset used for training. For an overarching view, see Fig. 1.
Figure 1.
The framework for the SCRNN. First, captured neural spiking data are recorded in a spike matrix. Next, the data are converted into a simplicial complex for a series of time windows. Then, each simplicial complex is fed into the simplicial convolutional layers. After flattening the simplicial information into a single vector, the vector is fed through an RNN, which predicts the location (or head direction) of the mouse based on the neural firing data. To see this figure in color, go online.
We first demonstrate the method by decoding HD from a population of HD cells (6) and compare the results to those produced by three other NN architectures. Applying the SCRNN to the HD data yields a lower average absolute error (AAE) and median absolute error than the three traditional NNs we tested it against. After verifying our architecture’s viability on HD decoding, we demonstrate the effectiveness of the SCRNN by decoding two-dimensional location from a population of grid cells and comparing it to the same networks mentioned above. We show that the SCRNN has the smallest average Euclidean distance (AED) between the ground truth and decoded location, demonstrating its aptitude for decoding different kinds of spiking data. Notably, to the best of our knowledge, our grid cell decoding task marks one of the first deep learning applications to decode experimental grid cell data.
The paper is organized as follows. Related work surveys relevant decoding algorithms and prior uses of simplicial complexes for neural data. Materials and methods discusses the architecture of the SCRNN, including the preprocessing procedure and the datasets we consider. Decoding results and comparisons to other machine learning algorithms for both the HD and grid cell data can be found in results and discussion. Finally, we conclude and comment on future directions in conclusions.
Related work
The SCRNN draws inspiration from the SC layer’s ability to leverage the underlying connectivity of a dataset and from the success of RNNs at decoding time-dependent data. First, we consider prior work on decoding neural data, with an emphasis on machine learning methods. Next, we look at how simplicial complexes have been used to capture the connectivity of neural activity. Finally, we consider SCNNs and how they leverage the underlying data structure for predictive purposes.
Neural decoding
Deep learning for neural decoding is a rapidly growing field due to NNs’ observed success at tasks like image recognition and sequence prediction and their ability to generalize beyond training data (37). NNs have outperformed statistical methods at decoding HD and two-dimensional position within an environment from neural recordings of HD cells and place cells, respectively (9,16,17). The superior performance has been observed for a variety of network architectures including RNNs, FFNNs, and CNNs. The smaller network sizes required for success in decoding compared to visual tasks allow for state-of-the-art performance on limited amounts of data (8).
Simplicial complexes and neural activity
Simplicial complexes have previously been used to represent neural activity. The study in (28) used place cell spike trains to reconstruct the environment. The work in (29) analyzed clique complexes generated from place cell firing fields to detect geometric structure in neural correlations. Simplicial complexes also play a pivotal role in manifold discovery, a growing area of neuroscience focused on finding the underlying manifolds on which different types of neural activity live. Persistent homology on a point cloud representing the population activity of HD cells revealed that the states of the HD circuit form a one-dimensional ring (7). Similarly, persistent cohomology was employed to show that the activity of a single grid cell module forms a toroidal manifold (1). For more background on simplicial complexes, see SC layers.
SC neural nets
Neural activity is regularly converted to a matrix, where rows represent either individual neurons or different electroencephalogram channels and columns correspond to nonintersecting time bins. The most common deep learning approach to handling the matrix is to use a CNN (20). In a CNN, convolutional layers extract features from the input by aggregating weighted information from neighboring elements in the input matrix. This localization of information sharing assumes regular connectivity, where only neighboring rows, or columns, bear significance to each other. However, the ordering of the matrix rows is arbitrary and not dependent on neural connectivity. Hence, there is a need for a different kind of convolution that takes the functional connectivity into account.
SCNNs, such as those found in (34,35), utilize the simplicial complexes formed by the connectivity of the network as the input. These SCNNs take in the simplicial complexes constructed from the data and generate matrices that capture low- and high-dimensional connectivity information. These matrices are used to construct simplicial filters, which contain the NN weights. Then, these features are flattened and fed into an FFNN, which can use the features for prediction. For our particular application, the prediction is a mouse’s HD or location. Although the simplicial layers capture connectivity, spiking data’s time-dependent nature makes other networks, such as RNNs, better tools for neural decoding applications. As such, we propose a network that consists of SC layers and RNN layers.
Materials and methods
Our method consists of three major parts: preprocessing, SC layers, and the back-end RNN. Below, we elaborate on each portion individually. For an overarching view of the architecture, see Fig. 1.
Preprocessing
One of the strengths of the SCRNN is its ability to process different types of data that do not have an explicit graphical structure. Neuronal spiking data are an example of data where extracting connectivity provides implicit structural information. Spiking data are captured by inserting probes into the brain and recording the electrical activity, specifically when individual neurons fire. These data are captured in raster plots, where the x axis represents time and the y axis represents which neuron is firing. Hence, preprocessing spiking data into a simplicial complex provides information about which neurons fire together.
The experimental HD data and grid cell data consist of neurons and their corresponding spike times. Given the spike times of N simultaneously recorded neurons, we first construct a spike count matrix A by creating nonintersecting time bins of a fixed width and counting each individual neuron’s number of spikes within each bin, as shown in Fig. 2. The element $A_{ij}$ is then set equal to the spike count of neuron i within bin j. The next step is to binarize A via a row-wise thresholding procedure. For a fixed row i, consider its elements ordered from highest to lowest, $A_{i,(1)} \geq A_{i,(2)} \geq \cdots \geq A_{i,(n)}$, where n is the number of time bins. Then, for some value $p \in (0,1)$, we select $A_{i,(1)}, \ldots, A_{i,(\ell)}$ for $\ell$ given by

$\ell = \lceil p\,n \rceil. \quad (1)$
Figure 2.
An example of the preprocessing procedure. First, neural spiking data are represented as a raster plot. Next, the data are binned and converted to a spike count matrix. A row-wise thresholding procedure, given in Eq. 1, binarizes the matrix. In this figure, each colored box denotes a 1, and each white box denotes a 0. Then, each neuron is represented by a node of the simplicial complex, color coded to match the corresponding matrix row. To construct the simplicial complex, the colored nodes are connected by the appropriate dimensional simplex to capture the neurons that fire together. For example, we see that the second column of the binarized matrix has three active neurons (green, orange, and blue). This generates a 2 simplex on the corresponding nodes. To see this figure in color, go online.
The $\ell$ selected row elements are then set to 1, while the remaining elements are set to 0. This is repeated for every row of A using the same value of p as before. Note that thresholding row-wise accounts for the variability in total spikes among neurons by comparing each neuron’s activity against itself. We then proceed column-wise through the binarized matrix, connecting the active neurons within each time bin by the appropriate-dimensional simplex (see Fig. 2). Specifically, if there are $m$ active neurons in a column, then an $(m-1)$-simplex is constructed on the nodes corresponding to those active neurons. This allows for higher-order descriptions of node connectivity as opposed to the pairwise node connectivity that graphs capture.
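To make the binning, the row-wise thresholding of Eq. 1, and the column-wise simplex construction concrete, here is a minimal NumPy sketch of one plausible implementation; the function name, the top-$\lceil p\,n \rceil$ selection rule, and the handling of ties are our assumptions rather than the released code.

```python
import numpy as np

def preprocess_spikes(spike_times, t_start, t_stop, bin_width, p):
    """Bin spike trains, binarize each row, and extract one maximal
    simplex (the tuple of co-active neurons) per time bin."""
    edges = np.arange(t_start, t_stop + bin_width, bin_width)
    # Spike count matrix A: rows are neurons, columns are time bins.
    A = np.stack([np.histogram(times, bins=edges)[0] for times in spike_times])
    n_neurons, n_bins = A.shape

    # Row-wise thresholding (our reading of Eq. 1): keep the ceil(p * n_bins)
    # largest counts of each neuron, so every neuron is compared to itself.
    n_keep = int(np.ceil(p * n_bins))
    B = np.zeros_like(A)
    for i in range(n_neurons):
        top_bins = np.argsort(A[i])[::-1][:n_keep]
        B[i, top_bins] = 1

    # Column-wise pass: the neurons active in the same bin span one simplex,
    # e.g. three active neurons span a 2-simplex on the matching nodes.
    maximal_simplices = [tuple(np.flatnonzero(B[:, j])) for j in range(n_bins)]
    return A, B, maximal_simplices
```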
SC layers
It is common practice for neural activity to be converted to a matrix where rows represent individual neurons and columns correspond to time bins. The most widely used deep learning approach to handling matrices as inputs is to employ a CNN. In a CNN, convolutional layers extract features from the input by aggregating weighted information from neighboring elements in the input matrix. This localization of information sharing assumes regular connectivity, where only neighboring rows, or columns, possess significance to each other. Thus, in tasks where neighboring rows of a matrix bear no significance to each other, CNNs do not extract features in an intuitive way.
Simplicial convolutions generalize convolutions to account for data with irregular connectivity (34,35,38,39,40,41). We introduce simplices and simplicial complexes, the topological structures we exploit for feature representation. For more information on simplicial complexes beyond what is outlined below, see (42).
Definition 1: a collection $\{v_0, v_1, \ldots, v_k\} \subset \mathbb{R}^n$ is geometrically independent if, and only if, for any scalars $t_0, \ldots, t_k$ with $\sum_{i=0}^{k} t_i = 0$, the condition $\sum_{i=0}^{k} t_i v_i = 0$ implies $t_i = 0$ for all $i$.
Definition 2: a $k$-simplex, $s^k$, is the convex hull of $k+1$ geometrically independent points $v_0, v_1, \ldots, v_k$, denoted by $s^k = [v_0, v_1, \ldots, v_k]$.
Definition 3: the faces of a simplex $s^k = [v_0, \ldots, v_k]$ are the simplices $s^m = [v_{i_0}, \ldots, v_{i_m}]$ given by nonempty subsets $\{v_{i_0}, \ldots, v_{i_m}\} \subseteq \{v_0, \ldots, v_k\}$ for some $m \leq k$ and are denoted $s^m \subseteq s^k$.
Definition 4: a simplicial complex S is a collection of simplices satisfying
- (1) if $s \in S$, then every face of $s$ is in $S$, and
- (2) if $s_1, s_2 \in S$, then $s_1 \cap s_2 = \emptyset$ or $s_1 \cap s_2$ is a face of both $s_1$ and $s_2$.
To ease understanding, one may consider a 0 simplex as a vertex, a 1 simplex as an edge, a 2 simplex as a triangle, a 3 simplex as a tetrahedron, and so on. Orientation can be assigned to simplices, forming what is called an ordered simplex. For a face $s^m \subseteq s^k$, if the orientation of $s^m$ coincides with that of $s^k$, we write $s^m \sim s^k$. Additionally, features, typically vectors or scalars, can also be assigned to the simplices. The features of the $k$-simplices are represented by a vector or matrix, depending on the feature size, called the cochain, which is denoted by $\mathbf{c}_k$.
Definition 5: let $s_1, \ldots, s_{N_k}$ be the ordered $k$-simplices of a simplicial complex. Then, for each $s_i$, assign a feature $c_i$. The cochain, $\mathbf{c}_k$, is then given by

$\mathbf{c}_k = [c_1, c_2, \ldots, c_{N_k}]^\top. \quad (2)$
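As a small worked example (ours, for illustration only), the 2-simplex on vertices {0, 1, 2} together with all of its faces forms a simplicial complex, and a scalar cochain assigns one number to each ordered simplex of a fixed dimension:

```python
from itertools import combinations

# The 2-simplex (0, 1, 2) together with every face of it satisfies
# condition (1) of Definition 4, so it is a valid simplicial complex.
top_simplex = (0, 1, 2)
complex_by_dim = {
    k: list(combinations(top_simplex, k + 1)) for k in range(len(top_simplex))
}
# {0: [(0,), (1,), (2,)], 1: [(0, 1), (0, 2), (1, 2)], 2: [(0, 1, 2)]}

# A scalar 1-cochain (Eq. 2): one feature per ordered 1-simplex.
one_cochain = {s: float(i) for i, s in enumerate(complex_by_dim[1])}
```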
For these layers, input data are defined on a simplicial complex, and information sharing is generated by the Hodge-Laplacian. To define the Hodge-Laplacian, we must first introduce the $k$-dimensional incidence matrix, $B_k \in \mathbb{R}^{N_{k-1} \times N_k}$, whose $(i,j)$th element is given by

$[B_k]_{ij} = \begin{cases} 1 & \text{if } s_i^{k-1} \subseteq s_j^{k} \text{ and } s_i^{k-1} \sim s_j^{k}, \\ -1 & \text{if } s_i^{k-1} \subseteq s_j^{k} \text{ and } s_i^{k-1} \nsim s_j^{k}, \\ 0 & \text{otherwise}, \end{cases} \quad (3)$

where $N_{k-1}$ and $N_k$ are the number of $(k-1)$-simplices and $k$-simplices, respectively. Note, we consider $B_0 = \mathbf{0}$. Then, finally, the Hodge-Laplacian, $L_k$, is defined as

$L_k = B_k^\top B_k + B_{k+1} B_{k+1}^\top. \quad (4)$
In simplicial convolutions, the terms of the Hodge-Laplacian in Eq. 4 act as shift operators defining which simplices of the same dimension share information. The terms $L_{k,\ell} = B_k^\top B_k$ and $L_{k,u} = B_{k+1} B_{k+1}^\top$ are called the lower and upper Laplacians, and they capture connectivity by lower and higher dimensional simplices, respectively. A degree D simplicial filter consisting of $2D+1$ weights is an operator, $H_k$, given by

$H_k = w_0 I + \sum_{i=1}^{D} w_i \left(L_{k,\ell}\right)^i + \sum_{i=1}^{D} w_i' \left(L_{k,u}\right)^i, \quad (5)$

where $k$ is the dimension of the simplices and $(\cdot)^i$ denotes the $i$th power of a matrix. Note, each power $i$ of the lower and upper Laplacians localizes information sharing to within the $i$-nearest simplices, similar to increasing the filter size in a traditional convolutional layer.
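The following sketch builds the incidence matrices of Eq. 3 for simplices stored as sorted vertex tuples (so the usual alternating-sign convention fixes the orientations), assembles the Hodge-Laplacian of Eq. 4, and applies a degree-D filter in the spirit of Eq. 5. The function signatures and weight layout are our assumptions; only the matrix algebra is taken from the text.

```python
import numpy as np

def incidence_matrix(faces, simplices):
    """B_k of Eq. 3: rows index (k-1)-simplices, columns index k-simplices.
    Simplices are sorted vertex tuples; dropping the m-th vertex of a
    k-simplex yields a face with sign (-1)**m."""
    row = {f: i for i, f in enumerate(faces)}
    B = np.zeros((len(faces), len(simplices)))
    for j, s in enumerate(simplices):
        for m in range(len(s)):
            B[row[s[:m] + s[m + 1:]], j] = (-1) ** m
    return B

def hodge_laplacian(B_k, B_k_plus_1):
    """L_k = B_k^T B_k + B_{k+1} B_{k+1}^T (Eq. 4); either term may be absent."""
    lower = B_k.T @ B_k if B_k is not None else 0.0
    upper = B_k_plus_1 @ B_k_plus_1.T if B_k_plus_1 is not None else 0.0
    return lower + upper

def simplicial_filter(cochain, B_k, B_k_plus_1, w0, w_lower, w_upper):
    """Degree-D filter on k-cochains: a weighted sum of powers of the lower
    and upper Laplacians applied to the cochain (our reading of Eq. 5)."""
    out = w0 * cochain
    laplacians = (None if B_k is None else B_k.T @ B_k,
                  None if B_k_plus_1 is None else B_k_plus_1 @ B_k_plus_1.T)
    for L, weights in zip(laplacians, (w_lower, w_upper)):
        if L is None:
            continue
        power = cochain
        for w in weights:              # weights[i-1] multiplies L^i @ cochain
            power = L @ power
            out = out + w * power
    return out
```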
We now discuss the dynamics of the SC layers of an SCRNN. The proof of the following proposition is delegated to the supporting material.
Proposition 1: consider an SCRNN consisting of L SC layers, each equipped with F degree-D filters, $H_k^{(l),f}$, for each dimension k of the functional simplicial complex with maximum simplicial dimension K, where $(l)$ denotes the SC layer. In such a network, the number of parameters used in the SC layers is at most $L\,F\,(K+1)(2D+1)$.
Note that the dynamics of the SC layers prevent exponential growth of parameters with respect to filters and number of layers.
For the first layer, $l = 1$, features are extracted from the input cochains, $\mathbf{c}_k^{(0)}$, via some nonlinear transformation $\sigma$,

$\mathbf{c}_k^{(1),f} = \sigma\!\left(H_k^{(1),f}\,\mathbf{c}_k^{(0)}\right), \quad (6)$

for each filter $f = 1, \ldots, F$ and each dimension $k = 0, \ldots, K$. Note that the entries of the input cochains $\mathbf{c}_k^{(0)}$ are set by a hyperparameter. For the intermediate SC layers $1 < l < L$ and fixed k, each of the F filters is applied to each of the F extracted features from the previous layer. To prevent the exponential growth of the number of features, the outputs extracted from the same feature from the previous layer are summed together to create one single output feature. That is, for each feature f from the previous layer, we extract

$\mathbf{c}_k^{(l),f} = \sigma\!\left(\sum_{g=1}^{F} H_k^{(l),g}\,\mathbf{c}_k^{(l-1),f}\right), \quad (7)$

for $f = 1, \ldots, F$. In the final SC layer, $l = L$, features are extracted following the same procedure as the intermediate layers, but additionally, all extracted features are summed:

$\mathbf{c}_k^{\text{out}} = \sum_{f=1}^{F} \mathbf{c}_k^{(L),f}, \quad (8)$

where $\mathbf{c}_k^{(L),f}$ is as in Eq. 7 for $f = 1, \ldots, F$. If the features assigned to the simplices are vectors rather than scalars, then $\mathbf{c}_k^{\text{out}}$ is summed across columns, which gives us a vector with one entry per $k$-simplex. Finally, the outputs for each dimension of the simplicial complex, $\mathbf{c}_0^{\text{out}}, \ldots, \mathbf{c}_K^{\text{out}}$, are stacked to create one output feature vector, $\mathbf{c}^{\text{out}}$. For illustrative purposes, Fig. 3 depicts two SC layers, each consisting of two filters for each dimension of the input simplicial complex.
Figure 3.
A diagram of two simplicial convolutional layers, each equipped with two filters for each dimension of the input simplicial complex. In the first layer, we see three orange and three blue filters, indicating the three dimensions of the input simplicial complex. The features extracted using these filters result in new orange and blue simplicial complexes, respectively. In the second simplicial convolutional layer, the process is repeated with two new filters, depicted by yellow and dark blue, giving us four simplicial complexes. In order to prevent exponential growth, features extracted from the same input from the previous layer are summed, resulting in a new orange and a new blue simplicial complex. Finally, all extracted features are summed and flattened to create one feature vector for the RNN. To see this figure in color, go online.
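Putting Eqs. 6, 7, and 8 together, a forward pass through the SC layers for a single simplicial dimension might look like the sketch below. It reuses the hypothetical `simplicial_filter` above, and the summation pattern follows our reading of the text rather than the released implementation.

```python
import numpy as np

def sc_layers_forward(cochain_in, B_k, B_k_plus_1, filters, sigma=np.tanh):
    """filters[l][f] holds the (w0, w_lower, w_upper) weights of filter f in
    SC layer l for this simplicial dimension.  Returns the summed output
    feature of the final layer (Eq. 8) for this dimension."""
    # First layer (Eq. 6): each filter extracts one feature from the input.
    features = [sigma(simplicial_filter(cochain_in, B_k, B_k_plus_1, *flt))
                for flt in filters[0]]

    # Intermediate and final layers (Eq. 7): apply every filter to every
    # feature and sum the outputs that came from the same input feature,
    # keeping the number of features fixed at F.
    for layer in filters[1:]:
        features = [sigma(sum(simplicial_filter(feat, B_k, B_k_plus_1, *flt)
                              for flt in layer))
                    for feat in features]

    # Final layer (Eq. 8): sum all extracted features into a single output.
    return sum(features)
```

The per-dimension outputs returned by such a routine would then be stacked into one feature vector, as described above, before entering the RNN.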
SCRNN
To form an input sequence to the RNN component of the SCRNN, we consider the outputs of the SC layers corresponding to a desired number of consecutive time bins. The output of the simplicial layers captures the connectivity of the simplicial complex as a single feature vector. Given the sequential nature of the decoding task, we append the SC layers with a multilayer RNN, an NN architecture designed for time series data. We opt for the Elman RNN architecture (18) over its more complex counterpart, the long short-term memory network (43). In this task, only neural activity recorded in time bins close to the target time bin bears any relevance to the decoded variable, thus making the extra parameters of a long short-term memory network, designed for handling long sequences, unnecessary. Elman RNNs utilize what are called hidden states to handle sequential data. Specifically, for a given input sequence $x_1, \ldots, x_T$, an Elman RNN computes the hidden state, $h_t$, given by

$h_t = \sigma\!\left(W_{ih}\,x_t + b_{ih} + W_{hh}\,h_{t-1} + b_{hh}\right), \quad (9)$

where $W_{ih}$ and $W_{hh}$ are weight matrices and $b_{ih}$ and $b_{hh}$ are bias vectors. The final output of an RNN is obtained by computing a nonlinear mapping of a linear transformation of the hidden state; that is,

$y_t = \sigma\!\left(W_{ho}\,h_t + b_o\right), \quad (10)$

where $h_t$ is defined in Eq. 9. Finally, a multilayer RNN is created by stacking multiple RNNs, feeding the outputs of one as the inputs to another. In the SCRNN, simplicial complexes generated from consecutive time bins are fed as inputs to the SC layers, and their outputs form the input sequence to the RNN.
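A minimal sketch of the recurrent back end, assuming PyTorch and treating the stacked SC output vectors of consecutive time bins as the input sequence; the layer sizes and the linear read-out are placeholders, not the tuned architecture reported later.

```python
import torch
import torch.nn as nn

class SCRNNBackEnd(nn.Module):
    """Multilayer Elman RNN (the tanh update of Eq. 9) followed by a
    read-out layer.  Inputs: sequences of SC-layer feature vectors."""
    def __init__(self, feature_dim, hidden_size, num_layers, out_dim):
        super().__init__()
        self.rnn = nn.RNN(input_size=feature_dim, hidden_size=hidden_size,
                          num_layers=num_layers, batch_first=True)
        self.readout = nn.Linear(hidden_size, out_dim)

    def forward(self, x):                    # x: (batch, seq_len, feature_dim)
        _, h_n = self.rnn(x)                 # h_n: (num_layers, batch, hidden)
        return self.readout(h_n[-1])         # predict from the last hidden state

# Example: sequences of 5 consecutive time bins, decoding an (x, y) position.
model = SCRNNBackEnd(feature_dim=128, hidden_size=50, num_layers=1, out_dim=2)
prediction = model(torch.randn(8, 5, 128))   # -> shape (8, 2)
```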
Results and discussion
For each task, we compare four different networks: the FFNN, RNN, GNN, and SCRNN. We chose to compare the SCRNN to the FFNN and RNN, as they are basic networks that have been used for a variety of decoding tasks (8). The convolutional layers of the GNN are the same as those of the SCRNN but without the higher-order connectivity, and as such, the comparison serves as an ablation study of the higher-order connectivity. We optimize the hyperparameters of each network using RayTune, a distributed hyperparameter tuning tool (44). The program uses tree-structured Parzen estimators, an algorithm that combines random search with two greedy sequential methods (45). HD decoding accuracy was measured in two different ways. First, we considered the median absolute error (MAE), which is defined as

$\mathrm{MAE} = \underset{1 \le i \le n}{\mathrm{median}}\ \left|\mathrm{rescale}\!\left(\hat{\theta}_i - \theta_i\right)\right|, \quad (11)$

where $n$ is the number of time bins and $\hat{\theta}_i$ and $\theta_i$ are the decoded and the ground-truth directions, respectively. The mapping rescale accounts for the ring structure of HD; for example, directions of $359^\circ$ and $1^\circ$ should be recorded as a difference of $2^\circ$ instead of $358^\circ$.
Similarly, we compute the AAE, which considers the average instead of the median discrepancy, as defined below:

$\mathrm{AAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\mathrm{rescale}\!\left(\hat{\theta}_i - \theta_i\right)\right|. \quad (12)$
While optimizing our HD networks, we chose to minimize AAE. The MAE is included for additional comparison.
To measure the success of our grid cell model, we compute the AED across all time bins:

$\mathrm{AED} = \frac{1}{n}\sum_{i=1}^{n}\left\|\hat{\mathbf{x}}_i - \mathbf{x}_i\right\|_2, \quad (13)$

where $n$ is the number of time bins and $\hat{\mathbf{x}}_i$ and $\mathbf{x}_i$ are the decoded and ground-truth coordinates, respectively.
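These metrics are straightforward to compute; below is a sketch in which the circular rescale mapping is implemented as a wrap to [−180°, 180°), which is our choice of convention.

```python
import numpy as np

def rescale(delta_deg):
    """Wrap angular differences to [-180, 180) so that, e.g., 359 deg vs.
    1 deg counts as a 2 deg error rather than 358 deg."""
    return (delta_deg + 180.0) % 360.0 - 180.0

def mae_degrees(decoded, truth):
    return np.median(np.abs(rescale(decoded - truth)))            # Eq. 11

def aae_degrees(decoded, truth):
    return np.mean(np.abs(rescale(decoded - truth)))               # Eq. 12

def aed_cm(decoded_xy, truth_xy):
    # decoded_xy, truth_xy: arrays of shape (n_time_bins, 2), in cm.
    return np.mean(np.linalg.norm(decoded_xy - truth_xy, axis=1))  # Eq. 13
```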
To evaluate our architecture, we look at two different types of spiking data, HD spiking data and grid cell spiking data, both of which are outlined in detail below.
HD
The neurons making up the HD system in the brain encode the direction the head is facing at any given time. This encoding is carried out by ensembles of neurons, called HD cells, that fire synchronously, with each ensemble representing a different direction. Additionally, HD is decoded independently of body orientation.
To demonstrate the effectiveness of our method, we analyze HD data recorded in (6). The spike times of HD cells in the anterodorsal thalamic nucleus, along with the corresponding ground-truth head angles, of 7 mice were recorded using multisite silicon probes and an alignment of LED lights on the mice’s head stage, respectively. The recorded sessions comprised 2 h of sleep, followed by 30–45 min of foraging in an open rectangular environment, followed by 2 more hours of sleep. For the following analysis, we used the foraging portion of session “Mouse28-140313,” which consists of recordings from 22 HD cells. The training and test data were constructed from 20 min of a 38 min session of open foraging using 100 ms time bins.
We used RayTune to search the parameter space given in Table S1 to optimize the AAE for each of the four networks. On the test data, the FFNN recorded an AAE of 17.990°, the RNN yielded 14.587°, and the GNN yielded 14.624°, whereas the SCRNN produced 11.493°. For the corresponding hyperparameters and the MAE, please see Table 1. Overall, the SCRNN performed the best of all the architectures, as shown in Fig. 4.
Table 1.
Comparison of the different networks on the head direction data and the grid cell data based on their trial with the lowest AAE, measured in degrees, and AED, measured in centimeters, respectively
 | Head direction decoding (lowest AAE) | | | | Grid cell decoding (lowest AED) | | | 
Hyperparameters | FFNN | RNN | GNN | SCRNN | FFNN | RNN | GNN | SCRNN
---|---|---|---|---|---|---|---|---
Epochs | 100 | 50 | 100 | 100 | 100 | 100 | 100 | 100 |
Batch size | 32 | 16 | 64 | 8 | 32 | 32 | 8 | 8 |
Learning rate | 0.001 | 0.0001 | 0.001 | 0.0001 | 0.001 | 0.001 | 0.001 | 0.001 |
Dropout | 0.2 | 0.2 | 0.3 | 0.3 | 0.2 | 0.3 | 0.2 | 0.2 |
NN layers | 2 | 2 | 2 | 2 | 3 | 3 | 1 | 1 |
Layer width/hidden size | 128 | 200 | 100 | 200 | 512 | 200 | 100 | 50 |
SC layers | – | – | 1 | 2 | – | – | 2 | 1 |
No. of filters | – | – | 3 | 2 | – | – | 3 | 3 |
Sequence length | – | – | 5 | 5 | – | – | 5 | 5 |
Validation loss | 0.380 | 0.221 | 0.418 | 0.321 | 0.015 | 0.001 | 0.003 | 0.002 |
Training MAE (HD) | 10.959 | 9.414 | 9.406 | 6.785 | – | – | – | – |
Training AAE (HD) | 15.918 | 12.548 | 12.804 | 8.233 | – | – | – | – |
Test MAE (HD) | 12.080 | 9.812 | 9.950 | 8.416 | – | – | – | – |
Test AAE (HD) | 17.990 | 14.587 | 14.624 | 11.493 | – | – | – | – |
Training AED (grid) | – | – | – | – | 7.620 | 3.086 | 3.030 | 2.547 |
Validation AED (grid) | – | – | – | – | 11.801 | 3.350 | 3.539 | 3.088 |
Note that a dash (–) indicates that the hyperparameter was not used for that network.
Figure 4.
Plots depicting the true and the predicted head angles for the second minute of the testing portion of four different networks using 100 ms time bins, (a) FFNN, (b) RNN, (c) GNN, and (d) SCRNN, and the corresponding test AAEs for the first 5 min. Each network was generated with the RayTune-optimized hyperparameters listed in Table 1. To see this figure in color, go online.
Grid cells
Grid cells encode two-dimensional allocentric location by forming hexagonal, periodic firing fields within an environment. Grid cells with firing fields exhibiting the same spacing and orientation form what are referred to as modules. Because firing fields for cells within a module are the same, except for a shift in space, it takes more than one module to encode position (46,47). Cells firing at the same time within a module generate a spatial grid over the environment. A multiscale representation for location is then created by layering the grids generated by different modules.
To showcase the ability of our method on a more complex task, we consider a population of cells recorded in layers II and III of the medial entorhinal cortex of a moving rat, which contains “pure” grid cells, HD cells, and conjunctive grid/HD cells (1,48). Neural activity was recorded using high-site-count Neuropixels silicon probes while the rat foraged in an open square environment. Specifically, we look at a population of 482 cells that contains three grid modules consisting of 166, 167, and 149 cells total, with 93, 149, and 145 of them being pure grid cells, respectively (1); the rest of the population is made up of conjunctive grid/HD cells. We use 8 total minutes of recorded neural activity and ground-truth position, binned into nonintersecting time bins as in the preprocessing procedure. The difficulties of decoding such a population stem not only from the large number of cells but also from the fact that some cells are not solely responsible for encoding position, the target variable we aim to decode (49). We decode the position (i.e., the xy coordinates) from the activity of this population. The larger population of cells and, consequently, the larger size of the functional simplicial complex compared to the HD decoding task mean that a more heavily parameterized SCRNN is required to decode position.
We used RayTune to search the hyperparameter space to minimize the AED (Eq. 13) (see Table S1). For the training set, the FFNN recorded 7.620 cm, the RNN yielded 3.086 cm, and the GNN yielded 3.030 cm, whereas the SCRNN produced 2.547 cm. For the validation set, the FFNN recorded 11.801 cm, the RNN yielded 3.350 cm, and the GNN yielded 3.539 cm, whereas the SCRNN produced 3.088 cm. Plots of the best predictions for the hyperparameters associated with each network type are shown in Fig. 5. For the corresponding hyperparameters, please see Table 1. Thus, the SCRNN is clearly able to learn the relationship between grid cell activity and position in the environment. Note, a discrepancy between training and validation results is expected given that grid cells may not encode the exact location, so training could bias the network to map neural codes for general locations to the specific labeled locations included in the training data.
Figure 5.
Grid cell decoding task. (a–d) Plots showing results from 2 min of the grid cell decoding task. (a and b) Comparison of decoded versus ground-truth x coordinate and y coordinate. (c) Error for each time bin measured by Eq. 13. (d) In gray, the ground-truth position of the rat’s trajectory in the environment. For visual purposes, we include colorized paths showing a 5 s comparison of decoded versus ground-truth position. To see this figure in color, go online.
Conclusions
As neuroscientists capture more data, they desire tools not only to decode neural data but also to leverage the underlying structure of the brain’s systems to predict animal behavior. This additional requirement provides interpretability, something traditional NNs lack. The SCRNN combines the interpretability of the SC layers with the power of an RNN. As shown on the HD and grid cell data, the SCRNN is able to decode spiking data with a higher level of accuracy than traditional NN architectures. Indeed, the SCRNN outperformed the FFNN, RNN, and GNN when evaluated by the median absolute error and the AAE on the HD data and also had the lowest AED of the four networks for the grid cell decoding task. Future work includes applying the SCRNN to other decoding tasks, such as decoding place cells and cue cells. From a modeling standpoint, it would be crucial to examine how higher-dimensional simplices, as expressed in highly correlated data, may be used to capture more information and decode even more of the underlying structure of brain activity.
Data and code availability
The code can be found on GitHub at https://github.com/emitch27/SCRNN.
Author contributions
E.C.M. performed research. E.C.M. and B.S. contributed analytic tools, analyzed the data, and wrote the paper. P.J.F. and D.B. provided expert advice throughout the research process. V.M. designed the research and wrote the paper. All authors read and edited the paper.
Acknowledgments
This work has been partially funded by the US Army Research Lab contract no. W911NF2120186. We would also like to thank the editor and an anonymous reviewer for their comments, which significantly improved our paper.
Declaration of interests
The authors declare no competing interests.
Editor: Tamar Schlick.
Footnotes
Supporting material can be found online at https://doi.org/10.1016/j.bpj.2024.01.025.
References
- 1. Gardner R.J., Hermansen E., et al. Moser E.I. Toroidal topology of population activity in grid cells. Nature. 2022;602:123–128. doi: 10.1038/s41586-021-04268-7.
- 2. Yoshida T., Ohki K. Natural images are reliably represented by sparse and variable populations of neurons in visual cortex. Nat. Commun. 2020;11:872. doi: 10.1038/s41467-020-14645-x.
- 3. Jun J.J., Steinmetz N.A., et al. Harris T.D. Fully integrated silicon probes for high-density recording of neural activity. Nature. 2017;551:232–236. doi: 10.1038/nature24636.
- 4. Steinmetz N.A., Aydin C., et al. Harris T.D. Neuropixels 2.0: A miniaturized high-density probe for stable, long-term brain recordings. Science. 2021;372. doi: 10.1126/science.abf4588.
- 5. Glaser J.I., Benjamin A.S., et al. Kording K.P. The roles of supervised machine learning in systems neuroscience. Prog. Neurobiol. 2019;175:126–137. doi: 10.1016/j.pneurobio.2019.01.008.
- 6. Peyrache A., Lacroix M.M., et al. Buzsáki G. Internally organized mechanisms of the head direction sense. Nat. Neurosci. 2015;18:569–575. doi: 10.1038/nn.3968.
- 7. Chaudhuri R., Gerçek B., et al. Fiete I. The intrinsic attractor manifold and population dynamics of a canonical cognitive circuit across waking and sleep. Nat. Neurosci. 2019;22:1512–1520. doi: 10.1038/s41593-019-0460-x.
- 8. Glaser J.I., Benjamin A.S., et al. Kording K.P. Machine learning for neural decoding. eNeuro. 2020;7. doi: 10.1523/ENEURO.0506-19.2020.
- 9. Xu Z., Wu W., et al. Clark B.J. A Comparison of Neural Decoding Methods and Population Coding Across Thalamo-Cortical Head Direction Cells. Front. Neural Circ. 2019;13. doi: 10.3389/fncir.2019.00075.
- 10. Krizhevsky A., Sutskever I., Hinton G.E. ImageNet Classification with Deep Convolutional Neural Networks. In: Pereira F., Burges C., et al. Weinberger K., editors. Advances in Neural Information Processing Systems, volume 25. Curran Associates, Inc.; 2012. https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
- 11. LeCun Y., Bengio Y., Hinton G. Deep learning. Nature. 2015;521:436–444. doi: 10.1038/nature14539.
- 12. Szabó P., Barthó P. Decoding neurobiological spike trains using recurrent neural networks: a case study with electrophysiological auditory cortex recordings. Neural Comput. Appl. 2022;34:3213–3221.
- 13. Maroulas V., Mike J.L., Oballe C. Nonparametric Estimation of Probability Density Functions of Random Persistence Diagrams. J. Mach. Learn. Res. 2019;20:1–49. http://jmlr.org/papers/v20/18-618.html
- 14. Oballe C., Boothe D., et al. Maroulas V. ToFU: Topology functional units for deep learning. Found. Data Sci. 2022;4:641–665.
- 15. Oballe C., Cherne A., et al. Maroulas V. Bayesian topological signal processing. Discrete and Continuous Dynamical Systems - S. 2022;15:797–817.
- 16. Frey M., Tanni S., et al. Barry C. Deepinsight: a general framework for interpreting wide-band neural activity. Preprint at bioRxiv. 2019. https://www.biorxiv.org/content/early/2019/12/11/871848
- 17. Tampuu A., Matiisen T., et al. Vicente R. Efficient neural decoding of self-location with a deep recurrent network. PLoS Comput. Biol. 2019;15:e1006822. doi: 10.1371/journal.pcbi.1006822.
- 18. Elman J. Finding structure in time. Cognit. Sci. 1990;14:179–211.
- 19. Rumelhart D.E., Hinton G.E., Williams R.J. Learning representations by back-propagating errors. Nature. 1986;323:533–536.
- 20. Lecun Y., Boser B., et al. Jackel L.D. Backpropagation Applied to Handwritten Zip Code Recognition. Neural Comput. 1989;1:541–551.
- 21. Love E.R., Filippenko B., et al. Carlsson G. Topological Convolutional Layers for Deep Learning. J. Mach. Learn. Res. 2023;24:1–35.
- 22. Love E.R., Filippenko B., et al. Carlsson G. Topological Deep Learning. Preprint at arXiv. 2021. https://arxiv.org/abs/2101.05778
- 23. Bessadok A., Mahjoub M.A., Rekik I. Graph Neural Networks in Network Neuroscience. IEEE Trans. Pattern Anal. Mach. Intell. 2023;45:5833–5848. doi: 10.1109/TPAMI.2022.3209686.
- 24. Grande X., Sauvage M.M., et al. Berron D. Transversal functional connectivity and scene-specific processing in the human entorhinal-hippocampal circuitry. Elife. 2022;11. doi: 10.7554/eLife.76479.
- 25. Maass A., Berron D., et al. Düzel E. Functional subregions of the human entorhinal cortex. Elife. 2015;4. doi: 10.7554/eLife.06426.
- 26. Hafting T., Fyhn M., et al. Moser E.I. Microstructure of a spatial map in the entorhinal cortex. Nature. 2005;436:801–806. doi: 10.1038/nature03721.
- 27. O’Keefe J. Place units in the hippocampus of the freely moving rat. Exp. Neurol. 1976;51:78–109. doi: 10.1016/0014-4886(76)90055-8.
- 28. Curto C., Itskov V. Cell Groups Reveal Structure of Stimulus Space. PLoS Comput. Biol. 2008;4:e1000205. doi: 10.1371/journal.pcbi.1000205.
- 29. Giusti C., Pastalkova E., et al. Itskov V. Clique topology reveals intrinsic geometric structure in neural correlations. Proc. Natl. Acad. Sci. USA. 2015;112:13455–13460. doi: 10.1073/pnas.1506407112.
- 30. Andjelković M., Tadić B., Melnik R. The topology of higher-order complexes associated with brain hubs in human connectomes. Sci. Rep. 2020;10. doi: 10.1038/s41598-020-74392-3.
- 31. Billings J., Saggar M., et al. Petri G. Simplicial and topological descriptions of human brain dynamics. Netw. Neurosci. 2021;5:549–568. doi: 10.1162/netn_a_00190.
- 32. Arai M., Brandt V., Dabaghian Y. The effects of theta precession on spatial learning and simplicial complex dynamics in a topological model of the hippocampal spatial map. PLoS Comput. Biol. 2014;10. doi: 10.1371/journal.pcbi.1003651.
- 33. Giusti C., Ghrist R., Bassett D.S. Two’s company, three (or more) is a simplex: Algebraic-topological tools for understanding higher-order structure in neural data. J. Comput. Neurosci. 2016;41:1–14. doi: 10.1007/s10827-016-0608-6.
- 34. Ebli S., Defferrard M., Spreemann G. Simplicial Neural Networks. Preprint at arXiv. 2020. https://arxiv.org/abs/2010.03633
- 35. Yang M., Isufi E., Leus G. Simplicial Convolutional Neural Networks. In: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2022. pp. 8847–8851.
- 36. Hajij M., Istvan K., Zamzmi G. Cell Complex Neural Networks. In: NeurIPS Workshop on Topological Data Analysis and Beyond; 2020.
- 37. Livezey J.A., Glaser J.I. Deep learning approaches for neural decoding across architectures and recording modalities. Briefings Bioinf. 2021;22:1577–1591. doi: 10.1093/bib/bbaa355.
- 38. Hajij M., Zamzmi G., et al. Cai X. Simplicial Complex Representation Learning. In: Machine Learning on Graphs (MLoG) Workshop at the 15th ACM International WSDM Conference; 2022.
- 39. Hajij M., Ramamurthy K.N., et al. High Skip Networks: A Higher Order Generalization of Skip Connections. In: ICLR 2022 Workshop on Geometrical and Topological Representation Learning; 2022.
- 40. Hajij M., Zamzmi G., et al. Schaub M. Topological Deep Learning: Going Beyond Graph Data. Preprint at arXiv. 2023. doi: 10.48550/arXiv.2206.00606.
- 41. Bodnar C., Frasca F., et al. Bronstein M. Weisfeiler and Lehman Go Topological: Message Passing Simplicial Networks. In: Meila M., Zhang T., editors. Proceedings of the 38th International Conference on Machine Learning, PMLR, volume 139 of Proceedings of Machine Learning Research; 2021. pp. 1026–1037. https://proceedings.mlr.press/v139/bodnar21a.html
- 42. Hatcher A. Algebraic Topology. Cambridge University Press; Cambridge: 2002.
- 43. Hochreiter S., Schmidhuber J. Long Short-Term Memory. Neural Comput. 1997;9:1735–1780. doi: 10.1162/neco.1997.9.8.1735.
- 44. Liaw R., Liang E., et al. Stoica I. Tune: A Research Platform for Distributed Model Selection and Training. Preprint at arXiv. 2018. doi: 10.48550/arXiv.1807.05118.
- 45. Bergstra J., Bardenet R., et al. Kégl B. Algorithms for hyper-parameter optimization. Adv. Neural Inf. Process. Syst. 2011;24.
- 46. Mathis A., Herz A.V.M., Stemmler M. Optimal Population Codes for Space: Grid Cells Outperform Place Cells. Neural Comput. 2012;24:2280–2317. doi: 10.1162/NECO_a_00319.
- 47. Stemmler M., Mathis A., Herz A.V.M. Connecting multiple spatial scales to decode the population activity of grid cells. Sci. Adv. 2015;1. doi: 10.1126/science.1500816.
- 48. Gerlei K., Passlack J., et al. Nolan M.F. Grid cells are modulated by local head direction. Nat. Commun. 2020;11:4228. doi: 10.1038/s41467-020-17500-1.
- 49. McNaughton B.L., Battaglia F.P., et al. Moser M.-B. Path integration and the neural basis of the ‘cognitive map’. Nat. Rev. Neurosci. 2006;7:663–678. doi: 10.1038/nrn1932.