Abstract
Neural decoding and its applications to brain computer interfaces (BCI) are essential for understanding the association between neural activity and behavior. A prerequisite for many decoding approaches is spike sorting, the assignment of action potentials (spikes) to individual neurons. Current spike sorting algorithms, however, can be inaccurate and do not properly model uncertainty of spike assignments, therefore discarding information that could potentially improve decoding performance. Recent advances in high-density probes (e.g., Neuropixels) and computational methods now allow for extracting a rich set of spike features from unsorted data; these features can in turn be used to directly decode behavioral correlates. To this end, we propose a spike sorting-free decoding method that directly models the distribution of extracted spike features using a mixture of Gaussians (MoG) encoding the uncertainty of spike assignments, without aiming to solve the spike clustering problem explicitly. We allow the mixing proportion of the MoG to change over time in response to the behavior and develop variational inference methods to fit the resulting model and to perform decoding. We benchmark our method with an extensive suite of recordings from different animals and probe geometries, demonstrating that our proposed decoder can consistently outperform current methods based on thresholding (i.e. multi-unit activity) and spike sorting. Open source code is available at https://github.com/yzhang511/density_decoding.
1. Introduction
Decoding methods for large-scale neural recordings are opening up new ways to understand the neural mechanisms underlying cognition and behavior in diverse species (Urai et al. 2022). The emergence of high-density multi-electrode array (HD-MEA) devices introduced a tremendous increase in the number of extracellular channels that can be recorded simultaneously (Jun et al. 2017; Steinmetz et al. 2021), leading to scalable and high-bandwith brain computer interfaces (BCI) systems (Musk et al. 2019; Paulk et al. 2022).
Traditional neural decoding methods assume that spiking activity has already been correctly spike-sorted. As a result, these methods are not appropriate for situations where sorting cannot be performed with high precision. Despite intensive efforts towards automation, current spike sorting algorithms still require manual supervision to ensure sorting quality (Steinmetz et al. 2018). Even after careful curation, current spike sorters suffer from many sources of errors including erroneous spike assignment (Deng et al. 2015). The dense spatial resolution of HD probes makes some known issues of spike sorting even more evident. With the increased density of the recording channels, the probability of visibly overlapping spikes (spike collisions) is higher (Buccino et al. 2022). Even for the same HD dataset, different spike sorters have low agreement on the isolated units and can find a significant number of poorly sorted and noisy units. (Buccino et al. 2020). Consequently, only single units that pass quality control metrics are included in many neural coding studies (IBL et al. 2022).
Because the spike-sorting problem remains unresolved, alternative approaches that do not rely on sorted single-units for decoding have been proposed. A popular choice is multi-unit threshold crossing that uses spiking activity on each electrode for decoding (Fraser et al. 2009; Trautmann et al. 2019). However, this approach ignores the fact that the signal on each electrode is a combination of signals from different neurons, thus making inefficient use of the data (Todorova et al. 2014). Ventura (2008) proposed a spike-sorting free decoding paradigm that estimates neuronal tuning curves from electrode tuning curves and then infers the behavior of interest using the estimated tuning curves and newly observed electrode spike trains. More recently, Chen et al. (2012), Kloosterman et al. (2014), Deng et al. (2015), and Rezaei et al. (2021) developed spike feature decoding methods that use marked point processes to characterize the relationship between the behavior variable and features of unsorted spike waveforms. However, these state-based decoders make explicit assumptions about the underlying system dynamics which reduce their flexibility in capturing complex relationships in the data. Moreover, these methods mainly utilize simple waveform features for decoding such as the maximum amplitude on each electrode and do not take advantage of HD spike features such as the estimated spike location.
To leverage the spatial spread and density of HD probes, Hurwitz et al. (2019) and Boussard et al. (2021) developed spike localization methods. These methods estimate the source location of a detected spike; this is a low-dimensional feature that is informative about the firing neuron’s identity. We propose a probabilistic model-based decoding method that scales to HD-MEA devices and utilizes these novel localization features in conjunction with additional waveform features. We use a mixture of Gaussians (MoG) model to encode the uncertainty associated with spike assignments in the form of parametric distributions of the spike features. Unlike traditional MoG models with a fixed mixing proportion, our method allows the mixing proportion to depend on the behavior of interest and change over time. This is motivated by the theory that behavioral covariates that modulate neurons’ firing rates also contain information about spike identities and that such tuning information should be incorporated into spike sorting and neural decoding in order to obtain unbiased and consistent tuning function estimates (Ventura 2009). To infer the functional relationship between spike features and behavioral correlates, we employ automatic differentiation variational inference (ADVI) (Kucukelbir et al. 2017) and coordinate ascent variational inference (CAVI) (Blei et al. 2017), which enable us to perform efficient and accurate inference while considering the behavior-modulated MoG model.
We apply our method to a large number of HD recordings and decode various types of behavioral correlates. Experimental results show that our decoder consistently outperforms decoders based on multi-unit threshold crossings and single-units sorted by Kilosort 2.5 (Pachitariu et al. 2023). We further validate the robustness of our method by applying it to recordings with different levels of sorting quality, HD probes with varying geometry, and recordings from multiple animal species. Consistent with previous results, our findings indicate that relying solely on “good” units, as determined by sorting quality metrics, leads to information loss and suboptimal decoding performance. This observation motivates our transition to a spike sorting-free decoding framework which enables us to extract more information from the spiking activity and improve decoding performance.
2. Method
Consider an electrophysiological recording comprised of trials, where each trial is divided into equally spaced time bins. Let , denote a set of spike features, where indexes the -th spike, represents the number of spikes collected in the -th time bin of the -th trial, and is the dimension of spike features. For example, the spike feature includes the spike location along the x- and z-axis of the probe, and its maximum peak-to-peak (max ptp) amplitude. Let be the observed time-varying behavior in the trial , e.g., the speed of a rotating wheel controlled by a mouse. When the behavior in the trial does not vary over time, it can take on either a binary () or scalar () value, e.g., the mouse responds to a scalar-valued stimulus by making a binary decision.
The proposed decoding method comprises an encoder and a decoder model. During the training of the model, the encoder captures the relationship between the observed spike feature distribution and the observed behavior. During the testing phase, the decoder utilizes the learned relationship from the encoder to predict the unobserved behavior based on newly observed spike features. In the following section, we present a general formulation of the encoding-decoding paradigm and provide more detailed implementations in the supplementary materials.
Encoder
The multivariate spike feature distribution is modeled using a mixture of Gaussian (MoG). The encoder generative model is as follows:
| (1) |
| (2) |
| (3) |
where is a function that describes the firing rate’s dependence on behaviors , while represents a general prior on encompassing the model parameters for the mixture component . Intuitively, the behavior-dependent governs the mixing proportion of the MoG, which determines the specific component from which a spike feature is generated. As varies over time in response to , spikes that are spatially close and share similar waveform features may originate from different MoG components at different time points within a trial. In our implementation, we parameterize using a generalized linear model (GLM), but alternative models such as neural networks (NN) can also be used; see the supplementary material for the GLM configuration.
We employ variational inference (VI) to learn the unknown quantities. In the standard MoG setting (Blei et al. 2017), our goal is to infer the spike assignment which indicates the latent component from which the observation originates. However, unlike the standard MoG, our spike assignment is influenced by the firing rates of the neurons which are modulated by the behavior . Consequently, learning the association between and necessitates the estimation of the unknown model parameters . Our objective is to simultaneously learn both the latent variables and model parameters based on the observed spike features and behavior . To accomplish this, we posit a mean-field Gaussian variational approximation
| (4) |
for the posterior . Subsequently, we employ the CAVI or ADVI methods to maximize the evidence lower bound (ELBO) and compute updates for and . Analogous to the standard Expectation-Maximization (EM) algorithm for MoG, the proposed CAVI and ADVI procedures consist of an E step and a M step. The E step, which updates , closely resembles that of the ordinary MoG, while the M step, responsible for updating , differs. CAVI utilizes coordinate ascent to find that maximizes the ELBO, while ADVI employs stochastic gradient ascent for updates. For detailed information on the CAVI and ADVI model specifications, refer to Supplement 1 and 2.
Decoder
The decoder adopts the same generative model in Equation 1-3 as the encoder with two distinctions: 1) is unobserved and considered a latent variable we aim to estimate, i.e., , where is a general prior, and 2) the model parameters are obtained from the encoder and kept constant. In practice, the choice of prior relies on the nature of . For instance, if is binary, we can sample from a Bernoulli distribution while a Gaussian process prior can capture the temporal correlation between time steps if . The posterior is approximated using a mean-field Gaussian variational approach
| (5) |
We employ standard CAVI or ADVI methods to infer and decode .
Robust behavior decoding
In practice, we found that direct decoding of using the approximated posterior in Equation 5 was not robust, leading to decoding results of inconsistent quality across different datasets. Although the factorization described in Equation 5 may not fully exploit the available information for predicting , it is useful for learning about the spike assignment . To enhance decoding robustness, we compute a weight matrix from the MoG outputs as input to the final behavior decoder. The weight matrix, denoted as , has dimensions and entries , which capture the posterior probability of assigning spike collected at time of the trial into the component . In scenarios such as spike sorting or multi-unit thresholding, spikes are assigned deterministically to one of sorted single units or thresholded channels. In this case, has one-hot rows, and each entry represents the number of spikes belonging to trial , time and unit (channel) . To obtain , we rely on estimating the posterior , which requires either the observed or the estimated . During model training, we can substitute the observed into Equation 1 to calculate the posterior for the train trials. At test time, we use the estimated obtained from multi-unit thresholding along with the learned encoder parameters to calculate the posterior for the test trials. With the posterior in hand, we then estimate to compute the weight matrix for both the train and test trials, which serves as input to the final behavior decoder. The choice of the behavior decoder depends on the user’s preference. For instance, we can use logistic regression as the behavior decoder for binary , and ridge regression for that exhibit temporal variations. Additional information regarding the selection of the behavior decoder can be found in Supplement 5.
3. Experiments
We conducted experiments using both electrophysiological and behavior data obtained from the International Brain Laboratory (IBL) (IBL et al. 2021). The electrophysiological recordings were acquired using Neuropixels (NP) probes that were implanted in mice performing a decision-making task. Each recording comprises multiple trials with several behavioral variables recorded during each trial including the choice, face motion energy, and wheel speed; see Figure 2 (a) for details. Each trial has a duration of 1.5 seconds and is divided into 30 time bins of 50 milliseconds length. The NP probe spans multiple brain regions; an example of the brain parcellation can be seen in Figure 3. To prepare the recordings for decoding, we first applied IBL’s standard destriping procedure (Chapuis et al. 2022) to reduce artifacts. Then, we used a subtraction-based spike detection and denoising method described in Boussard et al. 2023. After preprocessing, we computed the spike locations (Boussard et al. 2021) to acquire spike features for decoding and then utilized registration techniques (Windolf et al. 2022) to correct for motion drift in the recorded data. Further details about data preprocessing can be found in Supplement 4. In all experiments, we used spike locations along the width and depth of the NP probe, and maximum peak-to-peak amplitudes of spikes for decoding. We selected this set of spike features based on empirical evidence from our experiments, which showed their good decoding performance. Furthermore, previous studies (Boussard et al. 2023; Hilgen et al. 2017) have also recognized that these features were highly informative about unit identity. Figure 1(a) illustrates the spike localization and waveform features that were utilized for decoding.
Figure 2: Density-based decoding is robust to varying levels of spike sorting quality.
(a) We decode various behaviors including choice, motion energy and wheel speed. In the experimental setup, the mouse detects the presence of a visual stimulus to their left or right and indicates the perceived location (choice) by turning a steering wheel in the corresponding direction. Motion energy is calculated within a square region centered around the mouse’s whiskers. The example behavior traces are distinguished by different colors for each trial. (b) We compare decoders using two experimental sessions with “good” sorting quality (represented by the color green) and two sessions with “bad” sorting quality (represented by the color purple) based on IBL’s quality metrics. The box plots display various statistical measures including the minimum, maximum, first and third quartiles, median (indicated by a gray dashed line), mean (indicated by a red solid line), and outliers (represented by dots). These decoding metrics are obtained from a 5-fold CV and are averaged across both “good” and “bad” sorting example sessions. (c) We compare the traces decoded by spike-sorted decoders and our method on example sessions with “good” sorting quality (indicated by green) and “bad” sorting quality (indicated by purple). (d) The scatter plots depict the decoding quality of motion energy, measured by , with respect to various spike-sorting quality metrics. Each point represents one of the 20 IBL sessions, and different colors and shapes are used to distinguish between the type of decoder and sorting quality. The sorting quality metrics include “contamination,” which estimates the fraction of unit contamination (Hill et al. 2011), “drift,” which measures the absolute value of the cumulative position change in micrometers per second (um/sec) of a given KS unit, and “missed spikes,” which approximates the fraction of missing spikes from a given KS unit (Hill et al. 2011). These metrics are averaged across all KS units in a session. The scatter plots demonstrate that decoding quality tends to decrease when sorting quality is compromised. However, our method outperforms spike-sorted decoders even in the presence of these sorting issues.
Figure 3: Decoding comparisons broken down by brain regions.
(a) We decoded 20 IBL datasets acquired using NP1 probes which were inserted into mice performing a behavioral task. The locations of the probe insertions in the mouse brain and the corresponding brain parcellations along the NP1 probe are shown. We compared the performance of all decoders across different recorded brain regions. For the “All” region, spikes from all brain regions were utilized for decoding. In contrast, for the “PO,” “LP,” “DG,” “CA1,” and “VISa” regions, only spikes from the respective regions were used for decoding. The decoding performance were summarized using box plots showing metrics obtained through a 5-fold CV and averaged across 20 IBL sessions. We observe a higher accuracy from PO, LP, and VISa regions when decoding choice; decoding results are more comparable across regions for the continuous behavioral variables. Our proposed decoder consistently achieves higher accuracy in decoding the continuous variables. (b) We use scatter plots to quantify the relationship between decoding quality, measured by from decoding motion energy, and the number of components used for decoding. In the case of “all KS” and “good KS”, the number of components corresponds to the number of KS units. For our method, the number of components refers to the number of MoG components used. For all methods, the decoding performance is higher when using more components (in the regime of a small number of components). Our decoding method consistently outperforms spike-sorted decoders based on KS 2.5 while tending to need fewer components.
Figure 1: Decoding paradigm and graphical model.
(a) Spike localization features, (, ), the locations of spikes along the width and depth of the NP1 probe, and waveform features, , the maximum peak-to-peak (max ptp) amplitudes of spikes. Amplitude is measured in standard units (s.u.). Spike features from the entire probe are shown, and we focus on a specific segment of the probe. (b) During the training phase, the encoder takes the observed spike features and behavior from the train trials as inputs and then outputs the variational parameters which control the dependence of the firing rate on the behavior . At test time, the decoder utilizes the learned model parameters obtained from the encoder and the observed spike features from the test trials to predict the corresponding behavior in the test trials. To ensure reliable decoding of behaviors, we initially calculate the during training using the learned and observed behaviors from the train trials. Then, we compute the during test time using the learned and the estimated behavior obtained through multi-unit thresholding from the test trials. Finally, we generate the weight matrix for both the train and test trials as input to the final behavior decoder, e.g., linear regression or neural networks (Glaser et al. 2020; Livezey et al. 2021). (c) In the encoder, the firing rates of each MoG component are modulated by the observed behavior in the train trials. This modulation affects the MoG mixing proportion , which in turn determines the spike assignment that generates the observed spike features in the train trials. In the decoder, the behavior in the test trials is unknown and considered as a latent variable to be inferred. The decoder uses the observed spike features from the test trials along with the fixed model parameters learned by the encoder to infer the latent behavior .
We evaluate the performance of our decoding method by comparing it to the following baselines: (1) Spike-thresholded decoders which utilize the spiking activity on each electrode after a voltage-based detection step. (2) Spike-sorted decoders that utilize all single-units found using Kilosort (KS) 2.5 (Pachitariu et al. 2016). (3) Spike-sorted decoders based on “good” units which consist of KS units that have passed IBL’s quality control procedure (IBL et al. 2022). The parameters used for KS were tuned across multiple IBL datasets as described by IBL’s spike sorting white paper (Chapuis et al. 2022).
To assess the quality of decoding, we perform 5-fold cross validation (CV) and compute relevant decoding metrics. The coefficient of determination () is used to evaluate continuous behavior decoding (e.g., motion energy and wheel speed) while accuracy is used for discrete behaviors (e.g., choice). To demonstrate the efficacy of our approach in a wide range of settings, we conduct the following experiments.
Varying levels of spike sorting quality
Our objective is to compare the proposed decoding method to spike-sorted decoders using datasets which have varying levels of spike sorting quality. We apply our method to two datasets with high sorting quality (“good” sorting) and two with low sorting quality (“bad” sorting). The quality of sorting is assessed using IBL’s quality metrics (Chapuis et al. 2022). Although motion registration has been performed, we find that the recordings that have “bad” sortings are more affected by motion drift then the recordings which have “good” sortings; see supplementary materials.
Different brain regions from 20 datasets
To demonstrate the efficacy of our method across many different datasets, we decode 20 IBL datasets (IBL et al. 2022). In these datasets, mice perform a behavioral task while NP1 probes record activity from multiple brain regions. These brain regions are repeatedly targeted across all the datasets. To explore how behaviors are linked to specific brain regions, we use spikes that are confined to a particular area of the mouse brain for decoding. We collect decoding results from the posterior thalamic nucleus (PO), the lateral posterior nucleus (LP), the dentate gyrus (DG), the cornu ammonis (CA1) and the anterior visual area of the visual cortex (VISa).
Different probe geometry
Our method is capable of decoding electrophysiological data from a variety of HD probes. To demonstrate this, we apply our method on Neuropixels 2.4 (NP2.4) and Neuropixels 1.0 in nonhuman primates (NP1-NHP) datasets. The NP2.4 and NP1-NHP recordings are preprocessed using an identical pipeline as employed for NP1; see Supplement 4 for details. For spike sorting, the NP2.4 and NP1 adopt the same KS parameters as outlined in IBL’s spike sorting procedure. Different KS parameters are utilized for NP1-NHP probes which are detailed in Trautmann et al. 2023.
NP2.4: Each NP2.4 probe consists of four shanks and a total of 384 channels (Steinmetz et al. 2021). NP2.4 probes are more dense (i.e., more channels in a given area) than NP1. The mice were trained in accordance with the IBL experiment protocols to perform a visual decision-making task. The behavioral correlates we decode are choice, motion energy, and wheel speed.
NP1-NHP: NP1-NHP is designed for nonhuman primate species, such as macaques. The NP1-NHP probe maintains the same number of channels as NP1 (384), but its overall length is extended, resulting in a sparser configuration compared to NP1 (Trautmann et al. 2023). During the experiment, the macaque underwent training in a sequential multi-target reaching task (Marshall et al. 2022). The behavioral correlate we decode is the monkey’s arm force. The probe was implanted in the macaque’s motor cortex.
Comparison to a state-of-the-art clusterless decoder
Although the lack of available code for prior methods make comprehensive comparisons difficult, we benchmark our density-based decoder against a state-of-the-art clusterless decoder on datasets from both HD probes and multiple tetrodes. We compare our method to the clusterless point process decoder of Denovellis et al. (2021), which utilizes a marked point process to connect spike features with behaviors. For more details of this comparison, see Section 9 of the supplementary materials.
To decode the binary choice variable, we utilize the CAVI algorithm described in Supplement 2. For continuous behaviors like motion energy, wheel speed, and arm force, we employ the ADVI algorithm outlined in Supplement 1. We specify the maximum number of iterations as a hyperparameter in the CAVI model as it requires analytical updates for the model parameters. Running the CAVI encoder and decoder for fewer than 50 iterations yields satisfactory decoding outcomes. As for the ADVI algorithm, we implement it in PyTorch and update the model parameters using the Adam optimizer with a learning rate of 0.001 and a batch size of 6. The ADVI model is run for 1000 iterations. Open source code is available at https://github.com/yzhang511/density_decoding.
4. Results
Varying levels of spike sorting quality
The performance of our method in comparison to the spike-thresholded and spike-sorted decoders for both the “good” and “bad” sorting examples is summarized in Figure 2. For the “good” sorting examples, our method has the highest decoding performance for motion energy and wheel speed. For choice decoding, our approach is comparable to decoding based on all KS single-units and better than decoders based on multi-unit thresholding and “good” KS units. For the “bad” sorting sessions, the gap in decoding performance between our method and other decoders is more pronounced. Example traces are illustrated in Figure 2 (c), which demonstrate that the behavior traces decoded by our method closely match the observed traces compared to decoded traces from the sorted decoders. In Figure 2 (d), we quantify the relationship between sorting quality and decoding quality using data from 20 IBL sessions. For all three quality metrics, the performance of our decoder and the spike-sorted decoder decreases as the quality of the sorting decreases. Despite this decrease in performance, our method consistently has better performance than the spike-sorted decoder even in the presence of significant motion drift as well as a when there is a large fraction of missed spikes or contaminated units.
Different brain regions from 20 datasets
The decoding results across various brain regions for 20 IBL sessions are summarized in Figure 3. Overall, our approach consistently achieves higher values compared to other competing methods in decoding both motion energy and wheel speed across the five recorded brain regions. Notably, decoders based on “good” KS units exhibit poor performance across all recorded brain regions when compared to decoders based on all KS units. This observation highlights the importance of utilizing all available information for decoding behaviors rather than solely relying on “good” units based on sorting quality metrics. The scatter plots in Figure 3 (b) indicate a general trend where decoding quality tends to increase when more components (i.e., KS units for the spike-sorted decoders and MoG components for our method) are available for decoding. However, our method outperforms spike-sorted decoders even with a limited number of components.
Different probe geometry
The decoding results for the NP2.4 and NP1-NHP geometries are illustrated in Figure 4. For NP2.4, our approach significantly outperforms other competing methods when decoding motion energy and wheel speed, while again, the approaches are more comparable when decoding the discrete choice variable (our method performs slightly worse than the spike-sorted decoder). For NP1-NHP, Figure 4 demonstrates that our method achieves better decoding performance () compared to the spike-thresholded () and spike-sorted baselines (). “Good” KS units are not available in this scenario (the IBL quality criteria were not applied to this primate dataset) and are therefore not included in the results.
Figure 4: Decoding performance generalizes across different animals and probe geometry.
(a) We compare all decoders on a NP2.4 dataset using box plots showing performance metrics obtained from a 5-fold CV. Our method achieves much higher performance than all other decoders on continuous behavior decoding with slightly worse choice decoding than the spike-sorted decoder. (b) We utilize data from a single NP1-NHP recording session to decode the reaching force of a monkey engaged in a path-tracking (pacman) behavioral task. The decoders are evaluated through both quantitative analysis (box plots) and qualitative examination of the decoded traces. Each trial within the NP1-NHP recording has a duration of 9.85 seconds. Our method outperforms all other decoders on predicting the arm force.
Comparison to a state-of-the-art clusterless decoder
We compare our method to a state-of-the-art clusterless point process decoder (Denovellis et al. 2021) in Table 1. Our method has higher decoding performance than the point process decoder on both HD and simulated tetrode datasets for all behavior variables. This performance improvement is likely due to the increased flexibility of our decoder compared to state-space models that make stronger assumptions about the dynamics of the decoded signals.
Table 1: Comparison to a state-of-the-art clusterless decoder.
We evaluated the performance of both methods using 5-fold cross-validation. We reported the mean correlation between the ground-truth behavior and the decoded behavior along with the standard deviation. All tetrode data was simulated. For the HD datasets, we averaged the results across three IBL datasets.
| Multiple tetrodes (position) | NP1 (wheel speed) | NP1 (motion energy) | |
|---|---|---|---|
| Denovellis et al. 2021 | 0.91 (± 0.01) | 0.50 (± 0.16) | 0.55 (± 0.15) |
| Density-Based | 0.97 (± 0.03) | 0.63 (± 0.12) | 0.63 (± 0.14) |
Computation time
In Figure 5, we provide a computation time comparison relative to real-time. Our decoding step operates at a sub-real-time pace (0.3 times real-time). The total time after preprocessing for our method is close to real-time.
Figure 5: Computation time measured relative to real-time.
“Preprocessing” includes destriping, required by all decoders (IBL et al. 2022). “Total after preprocess” includes spike subtraction, denoising, localization, registration and density-decoding. The computation time of the clusterless point process decoder (Denovellis et al. 2021) is also provided.
5. Discussion
In this work, we introduce a probabilistic model-based neural decoding method that relates spike feature distributions to behavioral correlates for more accurate behavior decoding. Our method is designed for high-density recording devices such as Neuropixels probes, utilizing novel HD spike features (i.e., spike locations) and maximum peak-to-peak amplitudes of the spikes. We further develop an efficient variational approach to perform inference in this model. We benchmark our method across a comprehensive set of HD recordings with varying levels of sorting quality, different probe geometries, and distinct brain regions. We demonstrate that our decoding method can consistently outperform spike-thresholded decoders and spike-sorted decoders across a wide variety of experimental contexts. This motivates a shift towards a spike-feature based decoding paradigm that avoids the need for spike sorting while also achieving comparable or superior decoding performance to approaches relying on well-isolated single-units.
While our method shows promising results, it is essential to explore avenues for improvement. Two potential improvements to our method include utilizing deep learning-based models to capture complex functional associations between firing rates and behaviors and also introducing dependencies among the mixture components to account for correlated neuronal firing patterns. An interesting extension of this work would be to apply this spike-feature based paradigm to unsupervised learning of neural dynamics which would enable us to estimate MoG “firing rates” conditioned on time-varying latent variables. By addressing these challenges and expanding the scope of our method, we can advance our understanding of neural activity and its relationship with behavior in both the supervised and unsupervised settings.
Supplementary Material
Acknowledgement
This work was supported by grants from the Wellcome Trust (209558 and 216324), National Institutes of Health (1U19NS123716 and K99MH128772) and the Simons Foundation. We thank Matt Whiteway, Tatiana Engel, and Alexandre Pouget for helpful conversations and feedback on the manuscript.
References
- Blei D. M., Kucukelbir A., and McAuliffe J. D. (2017). “Variational inference: A review for statisticians”. In: Journal of the American statistical Association 112.518, pp. 859–877. [Google Scholar]
- Boussard J. et al. (2021). “Three-dimensional spike localization and improved motion correction for Neuropixels recordings”. In: Advances in Neural Information Processing Systems 34, pp. 22095–22105. [Google Scholar]
- Boussard J. et al. (2023). “DARTsort: A modular drift tracking spike sorter for high-density multi-electrode probes”. In: bioRxiv, p. 553023. [Google Scholar]
- Buccino A. P. et al. (2020). “SpikeInterface, a unified framework for spike sorting”. In: Elife 9, e61834. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Buccino A. P., Garcia S., and Yger P. (2022). “Spike sorting: new trends and challenges of the era of high-density probes”. In: Progress in Biomedical Engineering. [Google Scholar]
- Chapuis M. F. et al. (2022). “Spike sorting pipeline for the International Brain Laboratory”. In: channels 10, p. 6. [Google Scholar]
- Chen Z. et al. (2012). “Transductive neural decoding for unsorted neuronal spikes of rat hippocampus”. In: pp. 1310–1313. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Deng X. et al. (2015). “Clusterless decoding of position from multiunit activity using a marked point process filter”. In: Neural computation 27.7, pp. 1438–1460. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Denovellis E. L. et al. (2021). “Hippocampal replay of experience at real-world speeds”. In: Elife 10, e64505. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Fraser G. W. et al. (2009). “Control of a brain–computer interface without spike sorting”. In: Journal of neural engineering 6.5, p. 055004. [DOI] [PubMed] [Google Scholar]
- Glaser J. I. et al. (2020). “Machine learning for neural decoding”. In: Eneuro 7.4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hilgen G. et al. (2017). “Unsupervised spike sorting for large-scale, high-density multielectrode arrays”. In: Cell reports 18.10, pp. 2521–2532. [DOI] [PubMed] [Google Scholar]
- Hill D. N., Mehta S. B., and Kleinfeld D. (2011). “Quality metrics to accompany spike sorting of extracellular signals”. In: Journal of Neuroscience 31.24, pp. 8699–8705. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hurwitz C. et al. (2019). “Scalable spike source localization in extracellular recordings using amortized variational inference”. In: Advances in Neural Information Processing Systems 32. [Google Scholar]
- IBL, I. B. L. et al. (2022). “Reproducibility of in-vivo electrophysiological measurements in mice”. In: bioRxiv, pp. 2022–05. [Google Scholar]
- IBL, T. I. B. et al. (2021). “Standardized and reproducible measurement of decision-making in mice”. In: Elife 10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jun J. J. et al. (2017). “Fully integrated silicon probes for high-density recording of neural activity”. In: Nature 551.7679, pp. 232–236. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kloosterman F. et al. (2014). “Bayesian decoding using unsorted spikes in the rat hippocampus”. In: Journal of neurophysiology. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kucukelbir A. et al. (2017). “Automatic differentiation variational inference”. In: Journal of machine learning research. [Google Scholar]
- Livezey J. A. and Glaser J. I. (2021). “Deep learning approaches for neural decoding across architectures and recording modalities”. In: Briefings in bioinformatics 22.2, pp. 1577–1591. [DOI] [PubMed] [Google Scholar]
- Magland J. F. and Barnett A. H. (2015). “Unimodal clustering using isotonic regression: ISO-SPLIT”. In: arXiv preprint arXiv:1508.04841. [Google Scholar]
- Marshall N. J. et al. (2022). “Flexible neural control of motor units”. In: Nature neuroscience 25.11, pp. 1492–1504. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Musk E. et al. (2019). “An integrated brain-machine interface platform with thousands of channels”. In: Journal of medical Internet research 21.10, e16194. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pachitariu M., Sridhar S., and Stringer C. (2023). “Solving the spike sorting problem with Kilosort”. In: bioRxiv, pp. 2023–01. [Google Scholar]
- Pachitariu M. et al. (2016). “Kilosort: realtime spike-sorting for extracellular electrophysiology with hundreds of channels”. In: BioRxiv, p. 061481. [Google Scholar]
- Paulk A. C. et al. (2022). “Large-scale neural recordings with single neuron resolution using Neuropixels probes in human cortex”. In: Nature Neuroscience 25.2, pp. 252–263. [DOI] [PubMed] [Google Scholar]
- Rezaei M. R. et al. (2021). “Real-time point process filter for multidimensional decoding problems using mixture models”. In: Journal of neuroscience methods 348, p. 109006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Steinmetz N. A. et al. (2018). “Challenges and opportunities for large-scale electrophysiology with Neuropixels probes”. In: Current opinion in neurobiology 50, pp. 92–100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Steinmetz N. A. et al. (2021). “Neuropixels 2.0: A miniaturized high-density probe for stable, long-term brain recordings”. In: Science 372.6539, eabf4588. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Todorova S. et al. (2014). “To sort or not to sort: the impact of spike-sorting on neural decoding performance”. In: Journal of neural engineering 11.5, p. 056005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Trautmann E. M. et al. (2019). “Accurate estimation of neural population dynamics without spike sorting”. In: Neuron 103.2, pp. 292–308. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Trautmann E. M. et al. (2023). “Large-scale brain-wide neural recording in nonhuman primates”. In: bioRxiv, pp. 2023–02. [Google Scholar]
- Urai A. E. et al. (2022). “Large-scale neural recordings call for new insights to link brain and behavior”. In: Nature neuroscience 25.1, pp. 11–19. [DOI] [PubMed] [Google Scholar]
- Ventura V. (2008). “Spike train decoding without spike sorting”. In: Neural computation 20.4, pp. 923–963. [DOI] [PMC free article] [PubMed] [Google Scholar]
- — (2009). “Traditional waveform based spike sorting yields biased rate code estimates”. In: Proceedings of the National Academy of Sciences 106.17, pp. 6921–6926. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Windolf C. et al. (2022). “Robust online multiband drift estimation in electrophysiology data”. In: bioRxiv, pp. 2022–12. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.





