bioRxiv [Preprint]. Version 2, posted 2025 Apr 6; originally published 2024 Nov 5. doi: 10.1101/2024.11.05.622159

In vivo cell-type and brain region classification via multimodal contrastive learning

Han Yu 1, Hanrui Lyu 2, Ethan Yixun Xu 1, Charlie Windolf 1, Eric Kenji Lee 3, Fan Yang 4, Andrew M Shelton 5, Shawn Olsen 5, Sahar Minavi 5, Olivier Winter 6; International Brain Laboratory, Eva L Dyer 7, Chandramouli Chandrasekaran 3, Nicholas A Steinmetz 8, Liam Paninski 1, Cole Hurwitz 1
PMCID: PMC11580900  PMID: 39574717

Abstract

Current electrophysiological approaches can track the activity of many neurons, yet it is usually unknown which cell-types or brain areas are being recorded without further molecular or histological analysis. Developing accurate and scalable algorithms for identifying the cell-type and brain region of recorded neurons is thus crucial for improving our understanding of neural computation. In this work, we develop a multimodal contrastive learning approach for neural data that can be fine-tuned for different downstream tasks, including inference of cell-type and brain location. We utilize this approach to jointly embed the activity autocorrelations and extracellular waveforms of individual neurons. We demonstrate that our embedding approach, Neuronal Embeddings via MultimOdal contrastive learning (NEMO), paired with supervised fine-tuning, achieves state-of-the-art cell-type classification for two opto-tagged datasets and brain region classification for the public International Brain Laboratory Brain-wide Map dataset. Our method represents a promising step towards accurate cell-type and brain region classification from electrophysiological recordings. The project page and code are available at https://ibl-nemo.github.io/.

1. Introduction

High-density electrode arrays now allow for simultaneous extracellular recording from hundreds to thousands of neurons across interconnected brain regions (Jun et al., 2017; Steinmetz et al., 2021; Ye et al., 2023b; Trautmann et al., 2023). While significant progress has been made in developing algorithms for tracking neural activity (Hennig et al., 2019; Buccino et al., 2020; Magland et al., 2020; Boussard et al., 2023; Pachitariu et al., 2024), identifying cell types and brain regions solely from electrophysiological features remains an open problem.

Traditional approaches for electrophysiological cell-type classification utilize simple features of the extracellular action potential (EAP) such as its width or peak-to-trough amplitude (Mountcastle et al., 1969; Matthews & Lee, 1991; Nowak et al., 2003; Barthó et al., 2004; Vigneswaran et al., 2011) or features of neural activity, such as the inter-spike interval distribution (Latuske et al., 2015; Jouty et al., 2018). These simple features are interpretable and easy to visualize but lack discriminative power and robustness across different datasets (Weir et al., 2015; Gouwens et al., 2019). Current automated featurization methods for EAPs (Lee et al., 2021; Vishnubhotla et al., 2024) and neural activity (Schneider et al., 2023a) improve upon manual features but are limited to a single modality.

There has been a recent push to develop multimodal methods that can integrate information from both recorded EAPs and spiking activity. PhysMAP (Lee et al., 2024) is a UMAP-based (McInnes et al., 2018a) approach that can predict cell-types using multiple physiological modalities through a weighted nearest neighbor graph. Another recently introduced method utilizes variational autoencoders (VAEs) to embed each physiological modality separately and then combines these embeddings before classification (Beau et al., 2025). Although both methods show promising results, PhysMAP is hard to fine-tune for downstream tasks as it is nondifferentiable, and the VAE-based method captures features that are important for reconstruction, not discrimination, impairing downstream performance (Guo et al., 2017). Neither approach has been used to classify brain regions.

In this work, we introduce a multimodal contrastive learning method for neurophysiological data, Neuronal Embeddings via MultimOdal Contrastive Learning (NEMO), which utilizes large amounts of unlabeled paired data for pre-training and can be fine-tuned for different downstream tasks including cell-type and brain region classification. We utilize a recently developed contrastive learning framework (Radford et al., 2021) to jointly embed individual neurons’ activity autocorrelations and average extracellular waveforms. The key assumption of our method is that jointly embedding different modalities into a shared latent space will capture shared information while discarding modality-specific noise (Huang et al., 2024). We evaluate NEMO on cell-type classification using optotagged Neuropixels Ultra (NP Ultra) data from the mouse visual cortex (Ye et al., 2023b) and optotagged Neuropixels 1 data from the mouse cerebellum (Beau et al., 2025). We evaluate NEMO on brain region classification using the International Brain Laboratory (IBL) Brain-wide Map dataset (IBL et al., 2023). Across all datasets and tasks, NEMO outperforms current unsupervised (PhysMAP and VAEs) and supervised methods, with particularly strong performance in label-limited regimes. These results demonstrate that NEMO is a significant advance towards accurate cell-type and brain region classification from electrophysiological recordings.

2. Related work

2.1. Contrastive learning for neuronal datasets

The goal of contrastive learning is to find an embedding space where similar examples are close together while dissimilar ones are well-separated (Le-Khac et al., 2020). Contrastive learning has found success across a number of domains including language (Reimers & Gurevych, 2019), vision (Chen et al., 2020), audio (Saeed et al., 2021), and multimodal learning (Radford et al., 2021; Tian et al., 2020). Contrastive learning has also been applied to neuronal morphological data (Chen et al., 2022), connectomics data (Dorkenwald et al., 2023) and preprocessed spiking activity (Azabou et al., 2021; Urzay et al., 2023; Schneider et al., 2023b; Antoniades et al., 2023). In each of these applications, associated downstream tasks such as 3D neuron reconstruction, cellular sub-compartment classification, or behavior prediction have shown improvement using this contrastive paradigm. One contrastive method, CEED, has been applied to raw extracellular recordings to perform spike sorting and cell-type classification. In contrast to NEMO, CEED is unimodal (it ignores neural activity) and has never been applied to optotagged data or brain region classification (Vishnubhotla et al., 2024).

2.2. Cell-type classification

The goal of cell-type classification is to assign neurons to distinct classes based on their morphology, function, electrophysiological properties, and molecular markers (Masland, 2004). Current transcriptomic (Tasic et al., 2018; Gala et al., 2019; Yao et al., 2021; 2023) and optical methods (Cardin et al., 2010; Kravitz et al., 2013; Lee et al., 2020) are effective but require extensive sample preparation or specialized equipment, limiting their scalability and applicability for in-vivo studies (Lee et al., 2024). Recently, calcium imaging has been utilized in conjunction with molecular approaches to identify cell-types (Bugeon et al., 2022; Mi et al., 2023). This approach has low temporal resolution and requires substantial post hoc effort to align molecular imaging with calcium data, making it unsuitable for closed-loop in vivo experiments. A promising alternative is to use the electrophysiological properties of recorded neurons as they capture some of the variability of the transcriptomic profile (Bomkamp et al., 2019). Simple electrophysiological features from a neuron’s EAP and spiking activity are commonly used to identify putative cell types, such as excitatory and inhibitory cells (Frank et al., 2001; Gouwens et al., 2019). Automated methods including EAP-specific methods (Lee et al., 2021) and activity-based methods (Schneider et al., 2023a) are an improvement in comparison to manual features. Most recently, multi-modal cell-type classification methods including PhysMAP (Lee et al., 2024) and a VAE-based algorithm (Beau et al., 2025; Herzfeld et al., 2025) have been introduced which make use of multiple physiological modalities such as the EAP, activity, or peri-stimulus time histogram (PSTH).

2.3. Brain region classification

Brain region classification consists of predicting the location of a neuron or electrode based on the recorded physiological features (Steinmetz et al., 2018; Davis et al., 2023). Rather than predicting a 3D location, the task is to classify the brain region a neuron or electrode occupies, which can be estimated using post-insertion localization via histology (Sunkin et al., 2012). Brain region classification is an important task for understanding fundamental differences in physiology between brain areas and for targeting regions that are hard to hit via insertion. Most importantly, brain region classification can provide a real-time estimate of the probe’s location in the brain during experiments. Additionally, insertions in primates and human subjects often lack histological information, instead relying on experimental heuristics that lack standardization between laboratories. As this task is relatively new, only simple features of the EAPs have been utilized for classification (Jia et al., 2019; Tolossa et al., 2024).

3. Datasets

For cell-type classification, we use two mouse datasets: an opto-tagged dataset from the visual cortex recorded with Neuropixels Ultra (NP Ultra; Ye et al., 2023b) and a dataset from the cerebellar cortex recorded with Neuropixels 1 (Beau et al., 2025). For brain region classification, we utilize the IBL Brain-wide Map of neural activity from mice performing a decision-making task (IBL et al., 2023).

NP Ultra opto-tagged mouse data.

This dataset consists of NP Ultra recordings of spontaneous and opto-stimulated activity from the visual cortex of mice. We included spontaneous periods only and excluded units with fewer than 100 spikes after removal of stimulation periods (see Supplement C). We obtained 462 ground-truth neurons spanning three distinct cell-types: 237 parvalbumin (PV), 107 somatostatin (SST), and 118 vasoactive intestinal peptide (VIP) cells. There are also 8491 unlabelled neurons that we can utilize for pre-training.

C4 cerebellum dataset.

This dataset consists of Neuropixels recordings from the cerebellar cortex of mice. Opto-tagging is utilized to label 202 ground-truth neurons with five distinct cell-types. The ground-truth neurons are composed of 27 molecular layer interneurons (MLI), 18 Golgi cells (GoC), 30 mossy fibers (MF), 69 Purkinje cell simple spikes (PCss), and 58 Purkinje cell complex spikes (PCcs). There are 3,090 unlabelled neurons for pretraining.

IBL Brain-wide Map.

This dataset consists of Neuropixels recordings from animals performing a decision-making task. Each neuron is annotated with the brain region where it is located. We utilize 675 insertions from over 100 animals, yielding 37,017 ‘good’ quality neurons after spike sorting and quality control (Banga et al., 2022; IBL et al., 2022). Each brain is parcellated using brain atlas annotations that we divide into 10 broad areas: isocortex, olfactory areas (OLF), cortical subplate (CTXsp), cerebral nuclei (CNU), thalamus (TH), hypothalamus (HY), midbrain (MB), hindbrain (HB), cerebellum (CB), and hippocampal formation (HPF). We divide this dataset into training, validation, and testing sets by insertion so that we can evaluate each model on held-out insertions.

4. NEMO

We introduce Neuronal Embeddings via MultimOdal contrastive learning (NEMO), which learns a multimodal embedding of neurophysiological data. To extract representations from multiple modalities, we utilize Contrastive Language-Image Pretraining (CLIP; Radford et al., 2021). CLIP uses a contrastive objective to learn a joint representation of images and captions. For NEMO, we utilize the same objective but with modality-specific data augmentations and encoders (see Figure 1c).

Figure 1: Schematic illustration of NEMO.


(a) Neuropixels Ultra (Ye et al., 2023b) recordings capture activity from many different cell-types which have distinct extracellular action potentials (EAPs) and spiking activity. We present waveform and spiking activity snippets from five example neurons for each cell-type. (b) We transform the spiking activity of each neuron into a compact autocorrelogram (ACG) image, introduced in Beau et al. (2025), that accounts for variations in the firing rate (see Section 4.1). (c) NEMO utilizes a CLIP-based objective where an EAP encoder and an ACG image encoder are trained to embed randomly augmented EAPs and ACG images from the same neuron close together while keeping different neurons separate. The learned representations can be utilized for downstream tasks such as cell-type and brain-region classification.

4.1. Preprocessing

We construct a paired dataset of spiking activity and EAPs for all recorded neurons. Using the open-source Python package NeuroPyxels (Beau et al., 2021), we compute an autocorrelogram (ACG) image for each neuron by smoothing the spiking activity with a 250-ms boxcar filter, dividing the firing rate distribution into 10 deciles, and then building an ACG for each decile (see Figure 1b). This ACG image is a useful representation because the activity autocorrelations of a neuron can change as a function of its firing rate. By computing ACGs for each firing rate decile, the ACG image accounts for firing-rate-dependent variations in the autocorrelations, allowing for comparisons between different areas of the brain, behavioral contexts, and animals (Beau et al., 2025). For the EAPs, we construct a ‘template’ waveform which is the mean of ~500 waveforms for that neuron. For NP Ultra, we utilize multi-channel templates which take advantage of the detailed spatial structure enabled by the small channel spacing; we use the nine channels with the highest peak-to-peak (ptp) amplitude, re-ordered from highest to lowest amplitude. For the C4 and IBL datasets, all main text results utilize templates consisting of the single channel with maximal amplitude.
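To make the preprocessing concrete, the sketch below builds a firing-rate-conditioned ACG image from scratch in Python. It is an illustration of the idea rather than the NeuroPyxels implementation; the bin size, lag window, and normalization are assumptions, while the 250-ms boxcar and the 10 firing-rate deciles follow the description above.

```python
import numpy as np

def acg_image(spike_times, bin_s=0.001, win_s=0.1, smooth_s=0.25, n_deciles=10):
    # Illustrative firing-rate-conditioned ACG image: one autocorrelogram per
    # instantaneous-firing-rate decile. Bin size and lag window are assumptions.
    spike_times = np.asarray(spike_times, dtype=float)
    edges = np.arange(0.0, spike_times.max() + bin_s, bin_s)
    counts, _ = np.histogram(spike_times, bins=edges)

    # smoothed instantaneous firing rate (spikes/s) via a 250-ms boxcar filter
    k = max(int(round(smooth_s / bin_s)), 1)
    rate = np.convolve(counts, np.ones(k) / k, mode="same") / bin_s

    # assign each spike to a firing-rate decile based on the rate at its time bin
    spike_bins = np.clip(np.searchsorted(edges, spike_times, side="right") - 1, 0, len(rate) - 1)
    spike_rate = rate[spike_bins]
    cuts = np.quantile(spike_rate, np.linspace(0, 1, n_deciles + 1)[1:-1])
    decile = np.searchsorted(cuts, spike_rate, side="right")   # values in 0 .. n_deciles-1

    n_lags = int(2 * win_s / bin_s)
    image = np.zeros((n_deciles, n_lags))
    for d in range(n_deciles):
        ref = spike_times[decile == d]
        for t in ref:                                          # lags from each reference spike
            lags = spike_times - t
            lags = lags[(np.abs(lags) <= win_s) & (lags != 0)]
            image[d] += np.histogram(lags, bins=n_lags, range=(-win_s, win_s))[0]
        image[d] /= max(len(ref), 1)                           # mean spike count per lag bin
    return image
```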

4.2. Data augmentations

Previous work on contrastive learning for spiking activity utilizes data augmentations including sparse multiplicative noise (pepper noise), Gaussian noise, and temporal jitter (Azabou et al., 2021). As it is computationally expensive to construct ACG images for each batch during training, we instead design augmentations directly for the ACG images rather than the original spiking data. Our augmentations include temporal Gaussian smoothing, temporal jitter, amplitude scaling, additive Gaussian noise, and multiplicative pepper noise (see Supplemental B for more details and Supplementary Figure 14 for an ablation). For single channel templates, we use additive Gaussian noise as our only augmentation. For multi-channel templates, we also include electrode dropout and amplitude jitter as described in Supplementary Table 1.
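A minimal sketch of how these augmentations might be applied to an ACG image and a single-channel template is shown below; the augmentation magnitudes are placeholders rather than the values from Supplementary Table 1.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(0)

def augment_acg_image(acg, max_jitter_bins=2, noise_sd=0.05, pepper_p=0.05):
    # Illustrative ACG-image augmentations; parameter values are assumptions.
    x = acg.copy()
    x = gaussian_filter1d(x, sigma=rng.uniform(0.3, 1.0), axis=1)                 # temporal Gaussian smoothing
    x = np.roll(x, rng.integers(-max_jitter_bins, max_jitter_bins + 1), axis=1)   # temporal jitter
    x = x * rng.uniform(0.8, 1.2)                                                 # amplitude scaling
    x = x + rng.normal(0.0, noise_sd * x.std(), size=x.shape)                     # additive Gaussian noise
    x[rng.random(x.shape) < pepper_p] = 0.0                                       # multiplicative pepper noise
    return x

def augment_single_channel_waveform(wf, noise_sd=0.05):
    # Single-channel templates use additive Gaussian noise as the only augmentation.
    return wf + rng.normal(0.0, noise_sd * np.abs(wf).max(), size=wf.shape)
```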

4.3. Encoders

We employ separate encoders for each electrophysiological modality. For the ACG image encoder, we use a version of the convolutional architecture introduced in Beau et al. (2025) with 2 layers and Gaussian Error Linear Units (GeLU) (Hendrycks & Gimpel, 2016). For the waveform encoder, we use a 2-layer multilayer perceptron (MLP) with GeLU units. The representation sizes are 200-dimensional and 300-dimensional for the ACG image encoder and the waveform encoder, respectively. We set the projection size to be 512. For details about hyperparameters, see Supplement B.
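The PyTorch sketch below illustrates encoders of this shape. Only the layer counts, GeLU units, and the 200/300/512 output sizes come from the text; the channel counts, kernel sizes, pooling, and hidden width are assumptions, and PyTorch itself is an assumed choice of framework.

```python
import torch.nn as nn
import torch.nn.functional as F

class ACGEncoder(nn.Module):
    # Sketch of a small convolutional encoder for the (10 x n_lags) ACG image.
    def __init__(self, out_dim=200):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.GELU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.GELU(),
            nn.AdaptiveAvgPool2d((5, 10)), nn.Flatten(),
        )
        self.fc = nn.Linear(16 * 5 * 10, out_dim)

    def forward(self, x):                 # x: (batch, 1, 10, n_lags)
        return self.fc(self.conv(x))

class WaveformEncoder(nn.Module):
    # 2-layer MLP with GeLU units; the 300-d output follows the text, hidden width is assumed.
    def __init__(self, in_dim, hidden=256, out_dim=300):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.GELU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, x):                 # x: (batch, n_samples) flattened template
        return self.net(x)

class ProjectionHead(nn.Module):
    # Projects a representation into the shared 512-d space used by the contrastive loss.
    def __init__(self, in_dim, proj_dim=512):
        super().__init__()
        self.proj = nn.Linear(in_dim, proj_dim)

    def forward(self, x):
        return F.normalize(self.proj(x), dim=-1)   # L2 normalization
```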

4.4. Contrastive objective

We utilize the contrastive objective defined in CLIP. Let $z_{\mathrm{acg}}$ and $z_{\mathrm{wf}}$ be the L2-normalized projections of each modality. For a batch $B$, the objective is as follows,

$\mathcal{L} = -\frac{1}{2|B|}\sum_{i=1}^{|B|}\left[\log\frac{\exp(z_{\mathrm{acg}}^{i} \cdot z_{\mathrm{wf}}^{i}/\tau)}{\sum_{j=1}^{|B|}\exp(z_{\mathrm{acg}}^{i} \cdot z_{\mathrm{wf}}^{j}/\tau)} + \log\frac{\exp(z_{\mathrm{acg}}^{i} \cdot z_{\mathrm{wf}}^{i}/\tau)}{\sum_{j=1}^{|B|}\exp(z_{\mathrm{acg}}^{j} \cdot z_{\mathrm{wf}}^{i}/\tau)}\right]$ (1)

where $\tau$ is a temperature parameter which we fix during training. The objective function encourages the model to correctly match $z_{\mathrm{acg}}^{i}$ with its corresponding $z_{\mathrm{wf}}^{i}$, and vice versa, over all other possible pairs in the batch. This loss can easily be extended to more than two modalities including PSTHs.
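In code, Equation 1 is the standard symmetric InfoNCE objective used by CLIP. A minimal PyTorch sketch, assuming the projections are already L2-normalized; the temperature value shown is illustrative:

```python
import torch
import torch.nn.functional as F

def nemo_clip_loss(z_acg, z_wf, temperature=0.1):
    # Symmetric InfoNCE loss of Eq. (1); z_acg and z_wf are (|B|, d) L2-normalized
    # projections of the same batch of neurons.
    logits = z_acg @ z_wf.T / temperature                       # pairwise similarities
    targets = torch.arange(z_acg.shape[0], device=z_acg.device) # matched pairs lie on the diagonal
    loss_acg_to_wf = F.cross_entropy(logits, targets)           # match each ACG to its waveform
    loss_wf_to_acg = F.cross_entropy(logits.T, targets)         # and each waveform to its ACG
    return 0.5 * (loss_acg_to_wf + loss_wf_to_acg)
```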

4.5. Single-neuron and multi-neuron brain region classification

Brain region classification using electrophysiological datasets is a new problem that requires novel classification schemes. We develop two classification schemes for our evaluation: a single-neuron and a multi-neuron classifier. For our single-neuron classifier, we predict the brain area for each neuron independently using its embedding. For our multi-neuron classifier, we predict the brain region for each 20-μm bin along the depth of the probe by ensembling the predictions of nearby neurons within a 60-μm radius (i.e., averaging the logits of the single-neuron model) as shown in Figure 3a (ii). When more than five neurons fall within this range, only the nearest five are selected.
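A sketch of the multi-neuron ensembling step is given below; the depth bookkeeping (variable names, handling of empty bins) is illustrative, while the 60-μm radius and the five-neighbor cap follow the text.

```python
import numpy as np

def multi_neuron_predictions(depth_bins, neuron_depths, neuron_logits,
                             radius_um=60.0, max_neighbors=5):
    # For each depth bin (e.g., every 20 um along the probe), average the
    # single-neuron logits of the (at most 5) nearest neurons within 60 um.
    preds = []
    for d in depth_bins:
        dist = np.abs(neuron_depths - d)
        idx = np.where(dist <= radius_um)[0]
        if len(idx) == 0:
            preds.append(None)                     # no neuron close enough to this bin
            continue
        idx = idx[np.argsort(dist[idx])][:max_neighbors]
        preds.append(int(neuron_logits[idx].mean(axis=0).argmax()))
    return preds
```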

Figure 3: Results for NEMO on the IBL brain region classification task.


(a) Schematic for multi-neuron classifier. (i) At each depth, the neurons within 60 μm were used to classify the anatomical region. Only the nearest 5 neurons were selected if there were more than 5 neurons within that range. (ii) For logits averaging, single-neuron classifier logits are predicted using a linear model/MLP trained on the representations of our two physiological modalities. The final prediction is based on the average of the individual logits. (b) Confusion matrices for the single-neuron region classifier for fine-tuned NEMO. (c) Confusion matrices for the NEMO multi-neuron region classifier, averaged across 5 runs. (d) Single neuron balanced accuracy with linear classifier and the MLP head for each model trained/fine-tuned with different label ratios. (e) Single-neuron MLP-classification balanced accuracy for each modality separately and for the combined representation.

5. Experimental Setup

5.1. Baselines

For our baselines, we compare to PhysMAP (Lee et al., 2024), a VAE-based method (Beau et al., 2025), and a fully supervised MLP. For fair comparison, we utilize the same encoder architectures for NEMO and the VAE-based method. We include two versions of the VAE baseline: (1) the latent space (10D) is used to predict cell-type or brain region (as in Beau et al. (2025)), or (2) the output of the layer before the latent space (500D) is used to predict cell-type or brain region. Although this approach was not proposed in Beau et al. (2025), we find that utilizing the 500D representations before the latent space performs better than using the 10D latent space and also outperforms a VAE trained with a 500D latent space. For the VAEs, we use the default hyperparameters from Beau et al. (2025) (see Supplement N for a hyperparameter sensitivity analysis). For the supervised baseline, we again use the same encoder architectures as NEMO. For training NEMO, we use an early stopping strategy based on validation data. For the VAE-based method, we use the training scheme introduced in Beau et al. (2025). We fix the hyperparameters for all methods across all datasets. For more details about baselines, training, and hyperparameters, see Supplements B and D.

5.2. Evaluation

For both NEMO and the VAE-based method, the representations from the ACG image and EAPs are concatenated before classification or fine-tuning. We apply three classification schemes for evaluation: (1) freezing the model and training a linear classifier on the final-layer outputs, (2) freezing the model and training an MLP-based classifier on the final-layer outputs, and (3) fine-tuning the original model end-to-end together with an MLP-based classifier on the final-layer outputs. To ensure balanced training data, we resample the dataset prior to fitting the linear classifier. For all PhysMAP comparisons, we utilize the weighted graph alignment approach provided in Lee et al. (2024). For our classification metrics, we utilize the macro-averaged F1 score, calculated as the unweighted mean of per-class F1 scores, and balanced accuracy, which measures the average accuracy per class. For additional details about baseline hyperparameters, see Supplement B.
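Both metrics can be computed with scikit-learn, for example; a minimal sketch with placeholder labels:

```python
import numpy as np
from sklearn.metrics import f1_score, balanced_accuracy_score

y_true = np.array([0, 0, 1, 1, 2, 2])   # placeholder ground-truth labels
y_pred = np.array([0, 1, 1, 1, 2, 0])   # placeholder predictions

macro_f1 = f1_score(y_true, y_pred, average="macro")   # unweighted mean of per-class F1 scores
bal_acc = balanced_accuracy_score(y_true, y_pred)      # mean of per-class recalls
```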

5.3. Experiments

NP Ultra opto-tagged dataset.

For the NP Ultra dataset, we pretrain NEMO and the VAE-based method on 8491 unlabelled neurons. This pretraining strategy is important for reducing overfitting to the small quantity of labeled cell-types. To evaluate each model after pretraining, we perform the three evaluation schemes introduced in Section 5.2: freezing + linear classifier, freezing + MLP classifier, and full end-to-end finetuning. For PhysMAP, we utilize the anchor alignment technique introduced by Lee et al. (2024). For all methods and evaluation schemes, we perform a 5-fold cross-validation with 10 repeats to evaluate each model.
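The evaluation protocol corresponds to a repeated stratified cross-validation. The sketch below illustrates it for the frozen-representation, linear-classifier scheme using placeholder data; logistic regression stands in here for whichever linear classifier is used.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import RepeatedStratifiedKFold

# Placeholder stand-ins for frozen NEMO representations and opto-tagged cell-type labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(462, 500))
y = rng.integers(0, 3, size=462)

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = []
for train_idx, test_idx in cv.split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores.append(balanced_accuracy_score(y[test_idx], clf.predict(X[test_idx])))
print(f"balanced accuracy: {np.mean(scores):.2f} +/- {np.std(scores):.2f}")
```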

C4 cerebellum dataset.

For the C4 dataset, we pretrain all methods on 3090 unlabelled neurons. To evaluate each model after pretraining, we perform the three evaluation schemes introduced in Section 5.2. For all classifiers, we do not provide layer information as an input, nor do we exclude neurons based on a confidence threshold as done in Beau et al. (2025), as we are interested in evaluating the predictiveness of the features directly without additional information. For all methods and evaluation schemes, we perform a 5-fold cross-validation with 10 repeats to evaluate each model.

IBL Brain-wide Map.

For the IBL dataset, we randomly divide all insertions (i.e., Neuropixels recordings) into a 70-10-20 split to create training, validation, and test sets for each method. We pretrain NEMO and the VAE-based method on all neurons in the training split and then perform the three evaluation schemes introduced in Section 5.2. For PhysMAP, we utilize the anchor alignment technique. We train both a single-neuron and a multi-neuron classifier using the representations learned by NEMO and the VAE-based method. For PhysMAP, we only evaluate the single-neuron classifier. We compute the average and standard deviation of the metrics using five random seeds.
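Splitting at the insertion level (rather than the neuron level) keeps all neurons from a held-out recording out of the training data. A sketch of such a split, with an assumed random seed and helper name:

```python
import numpy as np

def split_by_insertion(insertion_ids, seed=0, frac=(0.7, 0.1, 0.2)):
    # Assign every neuron to train/val/test based on which insertion it came from,
    # so entire insertions are held out (70-10-20 split as described above).
    rng = np.random.default_rng(seed)
    unique_ids = rng.permutation(np.unique(insertion_ids))
    n_train = int(frac[0] * len(unique_ids))
    n_val = int(frac[1] * len(unique_ids))
    train_ids = set(unique_ids[:n_train])
    val_ids = set(unique_ids[n_train:n_train + n_val])

    splits = np.array(["test"] * len(insertion_ids), dtype=object)
    splits[[i in train_ids for i in insertion_ids]] = "train"
    splits[[i in val_ids for i in insertion_ids]] = "val"
    return splits
```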

6. Results

6.1. Classification

NP Ultra cell-type classifier.

The results for the NP Ultra opto-tagged dataset are shown in Table 1 and Figure 2. For all three evaluation schemes, NEMO achieves the highest macro-averaged F1 score and balanced accuracy by a significant margin. Notably, its largest improvement is in the linear and frozen MLP evaluations, indicating that NEMO captures cell-type-discriminative features even without fine-tuning. After fine-tuning, all methods are closer in performance, suggesting that there is some saturation for this dataset. These results demonstrate that NEMO is an accurate method for cell-type classification in visual cortical microcircuits even without fine-tuning.

Table 1: Cell-type classification for the NP Ultra dataset.

5-Fold accuracy and F1-scores are reported for three conditions: (i) a linear layer and (ii) MLP on top of the frozen pretrained representations, and (iii) after MLP finetuning. Chance level is 0.33 for this dataset.

Model Linear MLP MLP fine-tuned
Acc F1 Acc F1 Acc F1
Supervised N/A N/A N/A N/A 0.79 ± 0.00 0.78 ± 0.00
PhysMAP (Lee et al., 2024) 0.52¹ 0.52¹ N/A N/A N/A N/A
VAE (10D latent; Beau et al. (2025)) 0.74 ± 0.00 0.73 ± 0.00 0.74 ± 0.00 0.73 ± 0.00 0.78 ± 0.00 0.79 ± 0.00
VAE (500D rep) 0.76 ± 0.00 0.75 ± 0.00 0.77 ± 0.00 0.77 ± 0.00 0.78 ± 0.00 0.79 ± 0.00
NEMO (500D rep) 0.78 ± 0.01 0.78 ± 0.00 0.80 ± 0.00 0.79 ± 0.00 0.80 ± 0.00 0.80 ± 0.00
Figure 2: Comparing NEMO to baseline models on the NP Ultra opto-tagged dataset.

Figure 2:

(a) UMAP visualization of the pretrained NEMO representations of unseen opto-tagged visual cortex units, colored by different cell-types. Neurons of the same class form clusters, particularly when combined modalities are used. (b) Balanced accuracy and (c) Confusion matrices for the NP Ultra classification results, normalized by ground truth label and averaged across 5 random seeds. NEMO outperforms the other embedding methods by a significant margin across all cell-types and evaluation methods.

C4 cerebellum cell-type classifier.

The results for the C4 cerebellum dataset are shown in Table 2 and Supplementary Figure 4. For all evaluation schemes, NEMO outperforms the baseline models, achieving the highest macro-averaged F1 score and balanced accuracy, with the largest gains again in the linear and frozen MLP evaluations. These findings demonstrate that NEMO can accurately classify cell types in the cerebellum without any hyperparameter adjustments.

Table 2: Cell-type classification for the C4 cerebellum dataset.

5-Fold accuracy and F1-scores are reported for three conditions: (i) a linear layer and (ii) MLP on top of the frozen pretrained representations, and (iii) after MLP finetuning. Chance level is 0.20 for this dataset.

Model Linear MLP (5-fold) Finetuned MLP (5-fold)
Acc F1 Acc F1 Acc F1
Supervised N/A N/A N/A N/A 0.82 ± 0.00 0.82 ± 0.00
PhysMAP 0.51¹ 0.49¹ N/A N/A N/A N/A
VAE (10D latent; Beau et al. (2025)) 0.79 ± 0.00 0.78 ± 0.01 0.74 ± 0.01 0.73 ± 0.02 0.82 ± 0.00 0.81 ± 0.00
VAE (500D rep) 0.82 ± 0.00 0.81 ± 0.00 0.82 ± 0.00 0.82 ± 0.00 0.83 ± 0.00 0.83 ± 0.00
NEMO (500D rep) 0.85 ± 0.01 0.85 ± 0.01 0.85 ± 0.00 0.85 ± 0.00 0.86 ± 0.01 0.86 ± 0.01

IBL single-neuron region classifier.

We next investigate how much information NEMO extracts from each neuron about its anatomical location, i.e., brain region. We do this by training classifiers that use single-neuron features to identify anatomical regions for the IBL dataset (see Table 3 and Figure 3 for results). The confusion matrices for the VAE, supervised MLP, and PhysMAP are shown in Supplementary Figure 9. We find that NEMO outperforms all other methods using both the linear and MLP-based classification schemes. Without end-to-end fine-tuning, NEMO with an MLP classification head is already on par with the supervised MLP. NEMO’s success with both the linear and MLP classifiers with frozen encoder weights indicates that NEMO extracts a region-discriminative representation of neurons without additional fine-tuning. This representation can be further improved by fine-tuning NEMO. All methods have closer performance after fine-tuning, potentially due to the substantial amount of labeled data.

Table 3: Single-neuron brain region classification for the IBL dataset.

The accuracy and F1-scores are reported for three conditions: (i) a linear layer and (ii) MLP on top of the frozen pretrained representations, and (iii) after MLP finetuning. Chance level is 0.10 for this dataset.

Model Linear MLP MLP fine-tuned
Acc F1 Acc F1 Acc F1
Supervised N/A N/A N/A N/A 0.45 ± 0.01 0.42 ± 0.01
PhysMAP 0.31¹ 0.27¹ N/A N/A N/A N/A
VAE (500D rep) 0.40 ± 0.01 0.37 ± 0.01 0.41 ± 0.00 0.37 ± 0.00 0.45 ± 0.01 0.43 ± 0.00
NEMO (500D rep) 0.42 ± 0.00 0.40 ± 0.00 0.45 ± 0.01 0.42 ± 0.00 0.47 ± 0.01 0.44 ± 0.01

IBL multi-neuron region classifier.

We investigate whether combining information from multiple neurons at each location can improve brain region classification. We use the nearest-neuron ensembling approach described in Section 4.5 and shown in Figure 3a. Averaging the logits of predictions from single neurons improves classification performance over the single-neuron model. NEMO still has the best region classification performance, especially for the linear and frozen MLP evaluations (see Table 4 and Figure 3c for results). Again, all methods have closer performance after fine-tuning.

Table 4: Multi-neuron brain region classification for the IBL dataset.

The accuracy and F1-scores are reported for three conditions: (i) a linear layer and (ii) MLP on top of the frozen pretrained representations, and (iii) after MLP finetuning. Chance level is 0.10 for this dataset.

Model Linear MLP MLP fine-tuned
Acc F1 Acc F1 Acc F1
Supervised N/A N/A N/A N/A 0.50 ± 0.00 0.48 ± 0.01
VAE (500D rep) 0.45 ± 0.01 0.42 ± 0.01 0.46 ± 0.00 0.43 ± 0.00 0.51 ± 0.01 0.49 ± 0.01
NEMO (500D rep) 0.48 ± 0.00 0.45 ± 0.00 0.50 ± 0.00 0.48 ± 0.00 0.51 ± 0.00 0.50 ± 0.00

6.2. Clustering

We next examine the clusterability of NEMO representations for the IBL Brain-wide Map. We follow the clustering strategy used in Lee et al. (2021), running Louvain clustering on a UMAP graph constructed from the representations NEMO extracts from the IBL training neurons. We adjust two main settings: the neighborhood size in UMAP and the resolution in Louvain clustering. We select these parameters by maximizing the modularity index, which has the effect of minimizing the number of resulting clusters (Figure 4c). The clustering results relative to the region labels are presented in Figures 4a and b. The UMAP visualization of the NEMO representations, colored by region label, demonstrates that the regions are separable in the representation space. Notably, there is a distinct separation of thalamic neurons from other regions, along with an isolated cluster of cerebellar neurons. Neurons from other regions are also well organized by region label within the NEMO representation space, allowing them to be grouped into several distinct clusters. Additionally, overlaying the neurons colored by their cluster IDs onto their anatomical locations (Figure 4d) reveals a cluster distribution closely correlated with anatomical regions that is consistent across insertions from different labs (Supplementary Figure 11). We find that clustering NEMO’s representations leads to more region-selective clusters than using the raw features directly (Supplementary Figures 12 and 13). These results demonstrate that NEMO extracts features that capture the electrophysiological diversity across regions even without labels.
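A sketch of this clustering procedure using umap-learn, networkx, and python-louvain (Aynaud, 2020) is shown below; the n_neighbors and resolution values are placeholders that would be tuned by maximizing modularity as described above.

```python
import numpy as np
import umap                               # umap-learn
import networkx as nx
import community as community_louvain     # python-louvain (Aynaud, 2020)

def cluster_nemo_representations(embeddings, n_neighbors=30, resolution=1.0):
    # Louvain community detection on the UMAP fuzzy nearest-neighbor graph of the
    # representations; parameter values here are placeholders.
    reducer = umap.UMAP(n_neighbors=n_neighbors).fit(embeddings)
    graph = nx.from_scipy_sparse_array(reducer.graph_)            # weighted k-NN graph
    partition = community_louvain.best_partition(graph, resolution=resolution, random_state=0)
    labels = np.array([partition[i] for i in range(len(embeddings))])
    modularity = community_louvain.modularity(partition, graph)   # used for parameter selection
    return labels, modularity
```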

Figure 4: IBL neuron clustering using NEMO pretraining.


(a) A UMAP visualization of the representations that NEMO extracts from the training data colored by anatomical brain region. (b) The same UMAP as shown in (a) but instead colored by cluster labels using a graph-based approach (Louvain clustering). (c) We tuned the neighborhood size in UMAP and the resolution for the clustering. These parameters were selected by maximizing the modularity index which minimized the number of clusters. (d) 2D brain slices across three brain views with the location of individual neurons colored using the cluster IDs shown in (b). The black lines show the region boundaries of the Allen mouse atlas (Wang et al., 2020). The cluster distribution found using NEMO is closely correlated with the anatomical regions and is consistent across insertions from different labs.

6.3. Ablations

Label ratio sweep.

We assess whether NEMO requires less labeled data for fine-tuning by conducting a label ratio sweep with single-neuron region classifiers. We train both linear and MLP classifiers under two conditions (frozen encoder weights and full end-to-end fine-tuning), using 1%, 10%, 30%, 50%, 80%, and 100% of the labeled data. Accuracy results are shown in Figure 3d (F1 scores in Supplementary Figure 6). The fine-tuned NEMO model outperforms all other methods across all label ratios. Notably, with only 50% of the training labels, both the linear and fine-tuned NEMO models outperform the corresponding VAE models and the supervised MLP, even when the latter models are trained on the full dataset.

Single modality classifier.

We examine whether combining modalities increases region-relevant information and whether NEMO enhances feature extraction by aligning the embeddings of the two modalities. We compare the classification performance of the MLP classifier, with encoder weights frozen and end-to-end fine-tuned, across all models using: (1) waveforms only, (2) ACGs only, and (3) waveforms and ACGs. Brain region classification balanced accuracies are shown in Figure 3e (for F1, see Supplementary Figure 6). We find that bimodal models generally outperform unimodal models, suggesting that combining both modalities provides extra information about the anatomical location of neurons. Once again, NEMO achieves the best performance, demonstrating its ability to enhance single-modality information extraction by leveraging the other modality. After fine-tuning, NEMO and the VAE have closer performance, potentially due to the substantial amount of ground-truth labels.

Joint vs. independent learning for NEMO.

To ablate the importance of learning a shared representation of each modality, we train a version of NEMO where we independently learn an embedding for each modality using a unimodal contrastive method, SimCLR (Chen et al., 2020). The results for brain region classification are shown in Figure 5a, where NEMO trained with CLIP outperforms NEMO trained with SimCLR for all label ratios and classification methods. NEMO trained with CLIP is also able to extract more informative representations from each modality, as shown in Figure 5b. These results demonstrate that learning a shared representation of the two modalities is important for the downstream performance of NEMO. Supplementary Tables 9 and 10 show that joint training with CLIP also leads to an improvement over SimCLR for cell-type classification.

Figure 5: Ablating joint vs. independent learning for NEMO.


To evaluate the importance of learning a shared representation between modalities, we train a version of NEMO on the IBL brain classification task where each modality is independently embedded using SimCLR. (a) Across all label ratios and classifiers, we find that NEMO trained with CLIP outperforms the SimCLR version. (b) NEMO trained with CLIP also extracts more informative representations for each modality than when training with SimCLR.

7. Discussion

In this work, we proposed NEMO, a pretraining framework for electrophysiological data that utilizes multi-modal contrastive learning. We demonstrate that NEMO is able to extract informative representations for cell-type and brain region classification with minimal fine-tuning across three different datasets. This is especially valuable in neuroscience, where ground truth data, like opto-tagged cells, are costly, labor-intensive, or even impossible to obtain (e.g., in human datasets).

Our work has several limitations. First, we focus on shared information between two modalities, assuming this is most informative for identifying cell identity or anatomical location. However, modeling both shared and modality-specific information could further improve performance, as each modality may contain unique features relevant to cell identity or anatomical location (Liu et al., 2024; Liang et al., 2024). Additionally, NEMO utilizes the activity of each neuron independently, which ignores neuron correlations that can help distinguish cell-types (Mi et al., 2023). Extending NEMO to encode population-level features is an exciting future direction.

NEMO opens up several promising avenues for future research. Our framework can be adapted for studies of peripheral nervous systems, such as the retina (Wu et al., 2023). NEMO can also be combined with RNA sequencing to find features that are shared between RNA and electrophysiological data (Li et al., 2023). It will also be possible to correlate the cell-types discovered using NEMO with animal behavior to characterize their functional properties. Finally, we imagine that the representations extracted by NEMO can be integrated with current multi-animal pretraining approaches for neural activity to provide additional cell-type information which could improve generalizability to unseen sessions or animals (Azabou et al., 2023; Ye et al., 2023a; Zhang et al., 2025; 2024).

Supplementary Material


8. Acknowledgments

We thank Jonathan Pillow and Tatiana Engel for providing feedback on this manuscript. We also thank Maxime Beau and the other authors of Beau et al. (2025) for sharing the C4 cerebellum dataset. This project was supported by the Wellcome Trust (PRF 209558, 216324, 201225, and 224688 to MH, SHWF 221674 to LFR, collaborative award 204915 to MC, MH and TDH), National Institutes of Health (1U19NS123716), the Simons Foundation, the DoD OUSD (R&E) under Co-operative Agreement PHY-2229929 (The NSF AI Institute for Artificial and Natural Intelligence), the Kavli Foundation, the Gatsby Charitable Foundation (GAT3708), the NIH BRAIN Initiative (U01NS113252 to NAS, SRO, and TDH), the Pew Biomedical Scholars Program (NAS), the Max Planck Society (GL), the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 834446 to GL and AdG 695709 to MH), the Giovanni Armenise Harvard Foundation (CDA to LFR), the Human Technopole (ECF to LFR), the NSF (IOS 211500 to NBS), the Klingenstein-Simons Fellowship in Neuroscience (NAS), the NINDS R01NS122969, the NINDS R21NS135361, the NINDS F31NS131018, the NSF CAREER awards IIS-2146072, as well as generous gifts from the McKnight Foundation, and the CIFAR Azrieli Global Scholars Program. GM is supported by a Boehringer Ingelheim Fonds PhD Fellowship. The primate research procedures were supported by the NIH P51 (OD010425) to the WaNPRC, and animal breeding was supported by NIH U42 (OD011123). Computational modeling work was supported by the European Union Horizon 2020 Research and Innovation Programme under Grant Agreement No. 945539 Human Brain Project SGA3 and No. 101147319 EBRAINS 2.0 (GTE and TVN). Computational resources for building machine learning models were provided by ACCESS, which is funded by the US National Science Foundation.

Footnotes

¹ We utilize PhysMAP’s anchor alignment technique (which is deterministic) to evaluate its performance.

References

  1. Akiba Takuya, Sano Shotaro, Yanase Toshihiko, Ohta Takeru, and Koyama Masanori. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp. 2623–2631, 2019. [Google Scholar]
  2. Antoniades Antonis, Yu Yiyi, Canzano Joseph, Wang William, and Smith Spencer LaVere. Neuroformer: Multimodal and multitask generative pretraining for brain data. arXiv preprint arXiv:2311.00136, 2023. [Google Scholar]
  3. Aynaud Thomas. python-louvain x.y: Louvain algorithm for community detection. https://github.com/taynaud/python-louvain, 2020.
  4. Azabou Mehdi, Azar Mohammad Gheshlaghi, Liu Ran, Lin Chi-Heng, Johnson Erik C, Bhaskaran-Nair Kiran, Dabagia Max, Avila-Pires Bernardo, Kitchell Lindsey, Hengen Keith B, et al. Mine your own view: Self-supervised learning through across-sample prediction. arXiv preprint arXiv:2102.10106, 2021. [Google Scholar]
  5. Azabou Mehdi, Arora Vinam, Ganesh Venkataramana, Mao Ximeng, Nachimuthu Santosh, Mendelson Michael J, Richards Blake, Perich Matthew G, Lajoie Guillaume, and Dyer Eva L. A unified, scalable framework for neural population decoding. arXiv preprint arXiv:2310.16046, 2023. [Google Scholar]
  6. Banga Kush, Boussard Julien, Chapuis Gaëlle A, Faulkner Mayo, Harris Kenneth D, Huntenburg JM, Hurwitz Cole, Lee Hyun Dong, Paninski Liam, Rossant Cyrille, et al. Spike sorting pipeline for the international brain laboratory. 2022. [Google Scholar]
  7. Barthó Peter, Hirase Hajime, Monconduit Lenaïc, Zugaro Michael, Harris Kenneth D, and Buzsaki Gyorgy. Characterization of neocortical principal cells and interneurons by network interactions and extracellular features. Journal of neurophysiology, 92(1):600–608, 2004. [DOI] [PubMed] [Google Scholar]
  8. Beau Maxime, D’Agostino Federico, Lajko Ago, Martínez Gabriela, and Kostadinov Dimitar. Neuropyxels: loading, processing and plotting neuropixels data in python. Zenodo, doi: 10.5281/zenodo.5509733, 2021. [DOI] [Google Scholar]
  9. Beau Maxime, Herzfeld David J., Naveros Francisco, Hemelt Marie E., D’Agostino Federico, Oostland Marlies, Sánchez-López Alvaro, Chung Young Yoon, Maibach Michael, Stabb Hannah N., et al. A deep-learning strategy to identify cell types across species from high-density extracellular recordings. Cell, 2025. doi: 10.1016/j.cell.2025.01.041. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Bomkamp Claire, Tripathy Shreejoy J, Gonzales Carolina Bengtsson, Hjerling-Leffler Jens, Craig Ann Marie, and Pavlidis Paul. Transcriptomic correlates of electrophysiological and morphological diversity within and across excitatory and inhibitory neuron classes. PLoS computational biology, 15(6):e1007113, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Boussard Julien, Windolf Charlie, Hurwitz Cole, Lee Hyun Dong, Yu Han, Winter Olivier, and Paninski Liam. Dartsort: A modular drift tracking spike sorter for high-density multi-electrode probes. bioRxiv, pp. 2023–08, 2023. [Google Scholar]
  12. Buccino Alessio P, Hurwitz Cole L, Garcia Samuel, Magland Jeremy, Siegle Joshua H, Hurwitz Roger, and Hennig Matthias H. Spikeinterface, a unified framework for spike sorting. Elife, 9: e61834, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Bugeon Stephane, Duffield Joshua, Dipoppa Mario, Ritoux Anne, Prankerd Isabelle, Nicoloutsopoulos Dimitris, Orme David, Shinn Maxwell, Peng Han, Forrest Hamish, et al. A transcriptomic axis predicts state modulation of cortical interneurons. Nature, 607(7918):330–338, 2022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Cardin Jessica A, Carlén Marie, Meletis Konstantinos, Knoblich Ulf, Zhang Feng, Deisseroth Karl, Tsai Li-Huei, and Moore Christopher I. Targeted optogenetic stimulation and recording of neurons in vivo using cell-type-specific expression of channelrhodopsin-2. Nature protocols, 5(2):247–254, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Chen Hanbo, Yang Jiawei, Iascone Daniel, Liu Lijuan, He Lei, Peng Hanchuan, and Yao Jianhua. Treemoco: Contrastive neuron morphology representation learning. Advances in Neural Information Processing Systems, 35:25060–25073, 2022. [Google Scholar]
  16. Chen Ting, Kornblith Simon, Norouzi Mohammad, and Hinton Geoffrey. A simple framework for contrastive learning of visual representations. In International conference on machine learning, pp. 1597–1607. PMLR, 2020. [Google Scholar]
  17. Davis Zachary W, Dotson Nicholas M, Franken Tom P, Muller Lyle, and Reynolds John H. Spike-phase coupling patterns reveal laminar identity in primate cortex. Elife, 12:e84512, 2023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Dorkenwald Sven, Li Peter H, Januszewski Michał, Berger Daniel R, Maitin-Shepard Jeremy, Bodor Agnes L, Collman Forrest, Schneider-Mizell Casey M, da Costa Nuno Maçarico, Lichtman Jeff W, et al. Multi-layered maps of neuropil with segmentation-guided contrastive learning. Nature Methods, 20(12):2011–2020, 2023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Frank Loren M, Brown Emery N, and Wilson Matthew A. A comparison of the firing properties of putative excitatory and inhibitory neurons from ca1 and the entorhinal cortex. Journal of neurophysiology, 86(4):2029–2040, 2001. [DOI] [PubMed] [Google Scholar]
  20. Gala Rohan, Gouwens Nathan, Yao Zizhen, Budzillo Agata, Penn Osnat, Tasic Bosiljka, Murphy Gabe, Zeng Hongkui, and Sümbül Uygar. A coupled autoencoder approach for multi-modal analysis of cell types. Advances in Neural Information Processing Systems, 32, 2019. [Google Scholar]
  21. Gouwens Nathan W, Sorensen Staci A, Berg Jim, Lee Changkyu, Jarsky Tim, Ting Jonathan, Sunkin Susan M, Feng David, Anastassiou Costas A, Barkan Eliza, et al. Classification of electrophysiological and morphological neuron types in the mouse visual cortex. Nature neuroscience, 22(7):1182–1195, 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Guo Xifeng, Gao Long, Liu Xinwang, and Yin Jianping. Improved deep embedded clustering with local structure preservation. In Ijcai, volume 17, pp. 1753–1759, 2017. [Google Scholar]
  23. Hendrycks Dan and Gimpel Kevin. Gaussian error linear units (gelus). arXiv preprint arXiv:1606.08415, 2016. [Google Scholar]
  24. Hennig Matthias H, Hurwitz Cole, and Sorbaro Martino. Scaling spike detection and sorting for next-generation electrophysiology. In Vitro Neuronal Networks: From Culturing Methods to Neuro-Technological Applications, pp. 171–184, 2019. [DOI] [PubMed] [Google Scholar]
  25. Herzfeld David J, Hall Nathan J, and Lisberger Stephen G. Strategies to decipher neuron identity from extracellular recordings in the cerebellum of behaving non-human primates. bioRxiv, pp. 2025–01, 2025. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Huang Wei, Han Andi, Chen Yongqiang, Cao Yuan, Xu Zhiqiang, and Suzuki Taiji. On the comparison between multi-modal and single-modal contrastive learning. arXiv preprint arXiv:2411.02837, 2024. [Google Scholar]
  27. IBL, Banga Kush, Benson Julius, Bonacchi Niccolò, Bruijns Sebastian A, Campbell Rob, Chapuis Gaëlle A, Churchland Anne K, Davatolhagh M Felicia, Lee Hyun Dong, et al. Reproducibility of in-vivo electrophysiological measurements in mice. bioRxiv, pp. 2022–05, 2022. [Google Scholar]
  28. IBL, Benson Brandon, Benson Julius, Birman Daniel, Bonacchi Niccolo, Carandini Matteo, Catarino Joana A, Chapuis Gaelle A, Churchland Anne K, Dan Yang, et al. A brain-wide map of neural activity during complex behaviour. bioRxiv, pp. 2023–07, 2023. [Google Scholar]
  29. Jia Xiaoxuan, Siegle Joshua H, Bennett Corbett, Gale Samuel D, Denman Daniel J, Koch Christof, and Olsen Shawn R. High-density extracellular probes reveal dendritic backpropagation and facilitate neuron classification. Journal of neurophysiology, 121(5):1831–1847, 2019. [DOI] [PubMed] [Google Scholar]
  30. Jouty Jonathan, Hilgen Gerrit, Sernagor Evelyne, and Hennig Matthias H. Non-parametric physiological classification of retinal ganglion cells in the mouse retina. Frontiers in Cellular Neuroscience, 12:481, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Jun James J, Steinmetz Nicholas A, Siegle Joshua H, Denman Daniel J, Bauza Marius, Barbarits Brian, Lee Albert K, Anastassiou Costas A, Andrei Alexandru, Aydın Çağatay, et al. Fully integrated silicon probes for high-density recording of neural activity. Nature, 551(7679):232–236, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Kravitz Alexxai V, Owen Scott F, and Kreitzer Anatol C. Optogenetic identification of striatal projection neuron subtypes during in vivo recordings. Brain research, 1511:21–32, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Latuske Patrick, Toader Oana, and Allen Kevin. Interspike intervals reveal functionally distinct cell populations in the medial entorhinal cortex. Journal of Neuroscience, 35(31):10963–10976, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Le-Khac Phuc H, Healy Graham, and Smeaton Alan F. Contrastive representation learning: A framework and review. Ieee Access, 8:193907–193934, 2020. [Google Scholar]
  35. Lee Candice, Lavoie Andreanne, Liu Jiashu, Chen Simon X, and Liu Bao-hua. Light up the brain: the application of optogenetics in cell-type specific dissection of mouse brain circuits. Frontiers in neural circuits, 14:18, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Lee Eric Kenji, Balasubramanian Hymavathy, Tsolias Alexandra, Anakwe Stephanie Udochukwu, Medalla Maria, Shenoy Krishna V, and Chandrasekaran Chandramouli. Non-linear dimensionality reduction on extracellular waveforms reveals cell type diversity in premotor cortex. Elife, 10: e67490, 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Lee Eric Kenji, Gul Asim, Heller Greggory, Lakunina Anna, Jaramillo Santiago, Przytycki Pawel, and Chandrasekaran Chandramouli. Physmap-interpretable in vivo neuronal cell type identification using multi-modal analysis of electrophysiological data. bioRxiv, pp. 2024–02, 2024. [Google Scholar]
  38. Li Qiang, Lin Zuwan, Liu Ren, Tang Xin, Huang Jiahao, He Yichun, Sui Xin, Tian Weiwen, Shen Hao, Zhou Haowen, et al. Multimodal charting of molecular and functional cell states via in situ electro-sequencing. Cell, 186(9):2002–2017, 2023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Liang Paul Pu, Deng Zihao, Ma Martin Q, Zou James Y, Morency Louis-Philippe, and Salakhutdinov Ruslan. Factorized contrastive learning: Going beyond multi-view redundancy. Advances in Neural Information Processing Systems, 36, 2024. [Google Scholar]
  40. Liu Shengzhong, Kimura Tomoyoshi, Liu Dongxin, Wang Ruijie, Li Jinyang, Diggavi Suhas, Srivastava Mani, and Abdelzaher Tarek. Focal: Contrastive learning for multimodal time-series sensing signals in factorized orthogonal latent space. Advances in Neural Information Processing Systems, 36, 2024. [Google Scholar]
  41. Magland Jeremy, Jun James J, Lovero Elizabeth, Morley Alexander J, Hurwitz Cole Lincoln, Buccino Alessio Paolo, Garcia Samuel, and Barnett Alex H. Spikeforest, reproducible web-facing ground-truth validation of automated neural spike sorters. Elife, 9:e55167, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Masland Richard H. Neuronal cell types. Current Biology, 14(13):R497–R500, 2004. [DOI] [PubMed] [Google Scholar]
  43. Matthews RT and Lee WL. A comparison of extracellular and intracellular recordings from medial septum/diagonal band neurons in vitro. Neuroscience, 42(2):451–462, 1991. [DOI] [PubMed] [Google Scholar]
  44. McInnes Leland, Healy John, and Melville James. Umap: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426, 2018a. [Google Scholar]
  45. McInnes Leland, Healy John, Saul Nathaniel, and Grossberger Lukas. Umap: Uniform manifold approximation and projection. The Journal of Open Source Software, 3(29):861, 2018b. [Google Scholar]
  46. Mi Lu, Le Trung, He Tianxing, Shlizerman Eli, and Sümbül Uygar. Learning time-invariant representations for individual neurons from population dynamics. Advances in Neural Information Processing Systems, 36:46007–46026, 2023. [Google Scholar]
  47. Mountcastle Vernon B, Talbot William H, Sakata Hideo, and Hyvärinen J. Cortical neuronal mechanisms in flutter-vibration studied in unanesthetized monkeys. neuronal periodicity and frequency discrimination. Journal of neurophysiology, 32(3):452–484, 1969. [DOI] [PubMed] [Google Scholar]
  48. Nowak Lionel G, Azouz Rony, Sanchez-Vives Maria V, Gray Charles M, and McCormick David A. Electrophysiological classes of cat primary visual cortical neurons in vivo as revealed by quantitative analyses. Journal of neurophysiology, 89(3):1541–1566, 2003. [DOI] [PubMed] [Google Scholar]
  49. Pachitariu Marius, Sridhar Shashwat, Pennington Jacob, and Stringer Carsen. Spike sorting with kilosort4. Nature Methods, pp. 1–8, 2024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Radford Alec, Kim Jong Wook, Hallacy Chris, Ramesh Aditya, Goh Gabriel, Agarwal Sandhini, Sastry Girish, Askell Amanda, Mishkin Pamela, Clark Jack, et al. Learning transferable visual models from natural language supervision. In International conference on machine learning, pp. 8748–8763. PMLR, 2021. [Google Scholar]
  51. Reimers Nils and Gurevych Iryna. Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv preprint arXiv:1908.10084, 2019. [Google Scholar]
  52. Saeed Aaqib, Grangier David, and Zeghidour Neil. Contrastive learning of general-purpose audio representations. In ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3875–3879. IEEE, 2021. [Google Scholar]
  53. Schneider Aidan, Azabou Mehdi, McDougall-Vigier Louis, Parks David F, Ensley Sahara, Bhaskaran-Nair Kiran, Nowakowski Tomasz, Dyer Eva L, and Hengen Keith B. Transcriptomic cell type structures in vivo neuronal activity across multiple timescales. Cell reports, 42(4), 2023a. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Schneider Steffen, Lee Jin Hwa, and Mathis Mackenzie Weygandt. Learnable latent embeddings for joint behavioural and neural analysis. Nature, 617(7960):360–368, 2023b. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Steinmetz Nicholas A, Koch Christof, Harris Kenneth D, and Carandini Matteo. Challenges and opportunities for large-scale electrophysiology with neuropixels probes. Current opinion in neurobiology, 50:92–100, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Steinmetz Nicholas A, Aydin Cagatay, Lebedeva Anna, Okun Michael, Pachitariu Marius, Bauza Marius, Beau Maxime, Bhagat Jai, Böhm Claudia, Broux Martijn, et al. Neuropixels 2.0: A miniaturized high-density probe for stable, long-term brain recordings. Science, 372(6539):eabf4588, 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Sunkin Susan M, Ng Lydia, Lau Chris, Dolbeare Tim, Gilbert Terri L, Thompson Carol L, Hawrylycz Michael, and Dang Chinh. Allen brain atlas: an integrated spatio-temporal portal for exploring the central nervous system. Nucleic acids research, 41(D1):D996–D1008, 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Tasic Bosiljka, Yao Zizhen, Graybuck Lucas T, Smith Kimberly A, Nguyen Thuc Nghi, Bertagnolli Darren, Goldy Jeff, Garren Emma, Economo Michael N, Viswanathan Sarada, et al. Shared and distinct transcriptomic cell types across neocortical areas. Nature, 563(7729):72–78, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Tian Yonglong, Krishnan Dilip, and Isola Phillip. Contrastive multiview coding. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XI 16, pp. 776–794. Springer, 2020. [Google Scholar]
  60. Tolossa Gemechu B, Schneider Aidan M, Dyer Eva L, and Hengen Keith B. A conserved code for anatomy: Neurons throughout the brain embed robust signatures of their anatomical location into spike trains. bioRxiv, pp. 2024–07, 2024. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Trautmann Eric M, Hesse Janis K, Stine Gabriel M, Xia Ruobing, Zhu Shude, O’Shea Daniel J, Karsh Bill, Colonell Jennifer, Lanfranchi Frank F, Vyas Saurabh, et al. Large-scale high-density brain-wide neural recording in nonhuman primates. bioRxiv, pp. 2023–02, 2023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Urzay Carolina, Ahad Nauman, Azabou Mehdi, Schneider Aidan, Atamkuri Geethika, Hengen Keith B, and Dyer Eva L. Detecting change points in neural population activity with contrastive metric learning. In 2023 11th International IEEE/EMBS Conference on Neural Engineering (NER), pp. 1–4. IEEE, 2023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Vigneswaran Ganesh, Kraskov Alexander, and Lemon Roger N. Large identified pyramidal cells in macaque motor and premotor cortex exhibit “thin spikes”: implications for cell type classification. Journal of Neuroscience, 31(40):14235–14242, 2011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Vishnubhotla Ankit, Loh Charlotte, Srivastava Akash, Paninski Liam, and Hurwitz Cole. Towards robust and generalizable representations of extracellular data using contrastive learning. Advances in Neural Information Processing Systems, 36, 2024. [PMC free article] [PubMed] [Google Scholar]
  65. Wang Quanxin, Ding Song-Lin, Li Yang, Royall Josh, Feng David, Lesnar Phil, Graddis Nile, Naeemi Maitham, Facer Benjamin, Ho Anh, et al. The allen mouse brain common coordinate framework: a 3d reference atlas. Cell, 181(4):936–953, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Weir Keiko, Blanquie Oriane, Kilb Werner, Luhmann Heiko J, and Sinning Anne. Comparison of spike parameters from optically identified gabaergic and glutamatergic neurons in sparse cortical cultures. Frontiers in cellular neuroscience, 8:460, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Wu Eric G, Rudzite Andra M, Bohlen Martin O, Li Peter H, Kling Alexandra, Cooler Sam, Rhoades Colleen, Brackbill Nora, Gogliettino Alex R, Shah Nishal P, et al. Decomposition of retinal ganglion cell electrical images for cell type and functional inference. bioRxiv, 2023. [DOI] [PubMed] [Google Scholar]
  68. Yao Zizhen, Van Velthoven Cindy TJ, Nguyen Thuc Nghi, Goldy Jeff, Sedeno-Cortes Adriana E, Baftizadeh Fahimeh, Bertagnolli Darren, Casper Tamara, Chiang Megan, Crichton Kirsten, et al. A taxonomy of transcriptomic cell types across the isocortex and hippocampal formation. Cell, 184(12):3222–3241, 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Yao Zizhen, van Velthoven Cindy TJ, Kunst Michael, Zhang Meng, McMillen Delissa, Lee Changkyu, Jung Won, Goldy Jeff, Abdelhak Aliya, Aitken Matthew, et al. A high-resolution transcriptomic and spatial atlas of cell types in the whole mouse brain. Nature, 624(7991):317–332, 2023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Ye Joel, Collinger Jennifer, Wehbe Leila, and Gaunt Robert. Neural data transformer 2: multicontext pretraining for neural spiking activity. bioRxiv, pp. 2023–09, 2023a. [Google Scholar]
  71. Ye Zhiwen, Shelton Andrew M, Shaker Jordan R, Boussard Julien, Colonell Jennifer, Birman Daniel, Manavi Sahar, Chen Susu, Windolf Charlie, Hurwitz Cole, et al. Ultra-high density electrodes improve detection, yield, and cell type identification in neuronal recordings. bioRxiv, 2023b. [Google Scholar]
  72. Zhang Yizi, Lyu Hanrui, Hurwitz Cole, Wang Shuqi, Findling Charles, Hubert Felix, Pouget Alexandre, International Brain Laboratory, Varol Erdem, and Paninski Liam. Exploiting correlations across trials and behavioral sessions to improve neural decoding. bioRxiv, 2024. [Google Scholar]
  73. Zhang Yizi, Wang Yanchen, Jiménez-Benetó Donato, Wang Zixuan, Azabou Mehdi, Richards Blake, Tung Renee, Winter Olivier, Dyer Eva, Paninski Liam, et al. Towards a "universal translator" for neural dynamics at single-cell, single-spike resolution. Advances in Neural Information Processing Systems, 37:80495–80521, 2025. [PMC free article] [PubMed] [Google Scholar]
