SUMMARY
Cell type is hypothesized to be a key determinant of a neuron’s role within a circuit. Here, we examine whether a neuron’s transcriptomic type influences the timing of its activity. We develop a deep-learning architecture that learns features of interevent intervals across timescales (ms to >30 min). We show that transcriptomic cell-class information is embedded in the timing of single-neuron activity in the intact brain of behaving animals (calcium imaging and extracellular electrophysiology) as well as in a bio-realistic model of the visual cortex. Further, a subset of excitatory cell types is distinguishable and can be classified with higher accuracy when cortical layer and projection class are taken into account. Finally, we show that computational fingerprints of cell types may generalize across structured stimuli and naturalistic movies. Our results indicate that transcriptomic class and type may be imprinted in the timing of single-neuron activity across diverse stimuli.
In brief
Schneider et al. develop a machine-learning architecture showing that a neuron’s genetic cell type imposes a recognizable fingerprint on its spike patterning in the context of the intact brain and across diverse stimuli. They recover fingerprints from visual cortical neurons during presentation of simplified and complex stimuli.
Graphical abstract

INTRODUCTION
Mammalian brains are composed of between 10⁸ and 10¹¹ neurons that can be categorized into hundreds of genetically distinct cell classes and types.1,2 There is great interest in understanding the role of cell types in the intact brain, driven by the promise that a careful description of cells, their anatomical diversity, and their rules of connectivity will offer insight into how circuits generate behavior.2–4 Anatomically and morphologically diverse classes of neurons were initially described by Ramon y Cajal,5 while more recently, neuronal classes have been defined based on molecular markers, morphology, and anatomical location. Single-cell transcriptomics has revealed a diverse but finite number of profiles that align with known cell classes, thus indicating the genetic basis of cellular phenotype.2,6,7 How diverse cell types influence the pattern and timing of neuronal spiking in the intact circuit is largely unknown.6
Previous attempts to determine functional signatures of cell types based on their transcriptomic profiles have taken advantage of reduced preparations and synaptically isolated cells. Ex vivo patch-clamp-based studies suggest that, given a controlled input, electrical properties of neurons may indicate genetic identity.8–11 However, finding reliable signatures of cell types in the intact brain has proven elusive because isocortical spike timing is highly irregular,12–18 and even distinct neuronal classes exhibit broad variability in their activity patterns.19–22 The challenge of identifying spike-timing-based signatures of neuronal classes is exacerbated by experimental limitations related to simultaneously measuring physiology and genetic cell type.23,24 As a result, classification of cell type in the intact brain relies either on (1) molecular tools or (2) analysis of the shape of the extracellular waveform, which reliably labels only parvalbumin (PV)-positive neurons, alongside additional waveform clusters that are currently not tied to a transcriptomic cell type.19,25–27
Ongoing developments in genetic tools and high-density recordings are accelerating the investigation of how cell types impact brain dynamics. Recent work demonstrates that inhibitory cell type influences activity correlations among subpopulations of neurons.28 This work reveals that molecularly diverse cell classes influence patterns of population activity. This powerful observation raises two enticing questions: (1) if cell type is observable in population interactions, does it impose reliable operational constraints at the level of individual neuron activity, and (2) if so, could such information be used to classify unknown cells? In other words, what is the minimal (resolvable) unit at which transcriptomic class impacts cortical computation, and how reliable is that fingerprint? If the computational influence of cell type were spread across thousands of neurons, it is unlikely that microcircuits could use that information, and its measurement would demand a large-scale experimental approach. In contrast, should a spike train convey the identity of an individual neuron, cell type would be discernible by any observer of spiking, such as an experimenter or the cortex itself.
To investigate this possibility, we developed a deep-learning architecture that employs an attentional mechanism to extract cell type/class from the time series of a single neuron’s activity. The model learns from a neuron’s activity patterns (as small as a single interspike interval) and global statistics (up to the entire length of the recording, >30 min). When trained on many trials of simplistic, repeated stimuli (drifting gratings), this approach can accurately decode 16 cell types in a bio-realistic model of the primary visual cortex with ~230,000 neurons (note that despite the breadth of these datasets, not all cell types and classes are available). When applied to in vivo electro- and optophysiological datasets where transcriptomic types can be identified (or inferred), we show that cell type and class can be decoded in the isocortex of behaving animals. We then apply this tool to the question of whether genetic subtypes of excitatory neurons are computationally discernible in the intact brain.
Finally, we ask whether computational fingerprints are robust across diverse stimuli. We produce a single model that successfully identifies cell classes in both drifting gratings and naturalistic movies. Our data reveal (1) robust spike-timing fingerprints of the three transcriptomic families of inhibitory neurons labeled in high-density electrophysiological recordings (PV-, somatostatin-, and vasoactive intestinal peptide-expressing), (2) robust event-timing fingerprints of the same three interneuron classes and an excitatory class in calcium imaging experiments, and (3) robust event-timing fingerprints of a subset of excitatory subtypes in calcium imaging experiments. Our data suggest that these latent timing-based fingerprints are universal across stimuli. This raises the intriguing possibility that transcriptomic class/type information is accessible to and implicated in computations throughout the broader cortical network.
RESULTS
To examine whether information about cell type is embedded in the timing of neuronal activity in the intact circuit, we sought datasets that satisfy two criteria: (1) high-resolution sampling of the activity of many neurons and (2) ground-truth information about each neuron’s underlying transcriptomic classification. These were satisfied by two open datasets centered on mouse primary visual cortex from the Allen Institute, specifically high-density extracellular electrophysiology with optotagging of inhibitory types29 and optophysiology/Ca2+ imaging.30 In addition, we deployed a bio-realistic model of the primary visual cortex comprising ~230,000 neurons (~52,000 in the core) with 16 cell types modeled on transcriptomic classes in a layer-dependent fashion.31 In all cases, recordings were obtained from visual cortical regions in equivalent tasks and conditions (Figures 1A–1C). We refer to broad transcriptomic families (PV, vasoactive intestinal peptide [VIP], somatostatin [SST], and excitatory [E]) as “classes” and transcriptomically defined subdivisions of E neurons as “types.” In the electrophysiological dataset, three inhibitory classes were distinguished with Cre driver lines as expressing either PV, VIP, or SST. In the calcium imaging dataset, in addition to the same three inhibitory classes, eight type-specific E Cre lines were recorded across cortical layers (Cux2-layer 2/3 [L2/3], Ntsr1-L6, Rorb-L4, Fezf-L5, Nr5a1-L4, Rbp4-L5, Scnn1a-L4, Tlx3-L5), along with two pan-glutamatergic lines (Emx1 and Slc17a7). All E Cre lines were recorded at a single imaging depth, with the exception of Cux2 (recorded in L1 and L2/3).30 In contrast to the inhibitory classes, excitatory Cre lines varied in their degree of specificity, with some labeling a single cell type and others targeting multiple cell types and layers simultaneously. In the bio-realistic model, the same three inhibitory classes and an E class were specified by layer, each with functional and connectomic properties informed by experimental evidence.31
Figure 1. Experimental overview and separability of cell types with 17 statistics.

Three datasets contain the time series of many neurons’ activity (spikes or calcium events) and corresponding genetic type.
(A–C) Example visual stimulus, recording setup, raw data, and extracted event raster for (A) a bio-realistic model of V1 containing 4 cell classes (VIP, SST, PV, E) organized by layer to generate 16 cell types; (B) in vivo calcium imaging sessions, collected from Cre lines labeling three inhibitory classes (VIP, SST, PV) and eight excitatory cell types; and (C) extracellular electrophysiology from Neuropixels, collected with optotagging of three inhibitory classes (VIP, SST, PV) in the corresponding Cre lines. (D–F) Linear separability of cell type by 17 statistics that describe neuronal activity. The ranked order of the statistics (best to worst) in the bio-realistic model is shown in the blue spectrum (D). Color order is maintained in (E) and (F) to show shuffling relative to (D). Coarse categories of statistics are indicated by label color. Error bars represent SEM over 20 random splits.
(G–I) Confusion matrices for linear models trained to predict cell class using all parameters. Ca2+ imaging sampling rate eliminates the two highest PSD bands. ISI (interspike interval)/interevent interval, min (minimum), med (median), and max (maximum); mean FR, mean firing rate; CV, coefficient of variation; LV, local variation; LVR, revised local variation; CV2, local alternative to coefficient of variation; PSD, spectral density power in five bands (delta, theta, alpha, beta, and gamma); Std Dev, standard deviation; γ - α/β, the two parameters of the gamma distribution (α and β) used to describe the ISI distribution.
Here, we include all individual neurons in which spikes/detected calcium events could be paired with information about underlying transcriptomic identity. Pooled across animals, neurons were divided into non-overlapping training, validation, and test sets (60% training, 20% validation, 20% testing). To account for sampling error, we trained and tested on 20 independent random splits of each dataset. Class balancing was performed so that models did not learn to simply predict the most prevalent classes (see STAR Methods). As a result, chance is 1/number of cell types (K). When appropriate, we report balanced accuracy, defined as the averaged per-class accuracy: (true positive rate + true negative rate)/2, which is effective when classes are imbalanced.32 Precision of this estimate with n = 20 random splits is expressed as standard error of the mean (e.g., balanced accuracy ± SEM). To graphically convey performance across classes, we use confusion matrices: in these square heatmaps, each row represents true class labels, and each column displays predicted class labels. As a result, values along the diagonal represent correctly predicted neurons.33
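For concreteness, this evaluation scheme can be sketched in a few lines of Python. The feature matrix, labels, and classifier below are placeholders, class_weight="balanced" stands in for the class balancing described above, and scikit-learn’s balanced_accuracy_score (macro-averaged recall) is used as a close proxy for the per-class (true positive rate + true negative rate)/2 definition.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder data: 17 timing features per neuron and K = 4 class labels.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(2000, 17)), rng.integers(0, 4, size=2000)

accs = []
for seed in range(20):  # 20 independent random splits
    X_tv, X_test, y_tv, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)
    X_train, X_val, y_train, y_val = train_test_split(
        X_tv, y_tv, test_size=0.25, stratify=y_tv, random_state=seed)  # 0.25 x 0.8 = 20%
    clf = LogisticRegression(max_iter=1000, class_weight="balanced")
    clf.fit(X_train, y_train)  # the validation split is reserved for model selection
    accs.append(balanced_accuracy_score(y_test, clf.predict(X_test)))

# Report mean balanced accuracy +/- SEM over the 20 splits.
print(f"{np.mean(accs):.2%} +/- {np.std(accs, ddof=1) / np.sqrt(len(accs)):.2%} (SEM)")
```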
Cell-type separability with established statistics
One possibility is that cell type is only relevant in establishing connectivity and determining neurotransmitter content. In this case, spike timing should only be informed by a neuron’s inputs and noise. Alternatively, in addition to inputs and noise, cell type might predictably influence the activity of individual neurons. To distinguish these possibilities, we attempted to predict cell class/type by examining 17 measures (i.e., features) of event timing that have been previously deployed to describe spike time series (Figures 1D–1F; STAR Methods).34–40 Broadly, these features convey information about (1) standard statistics, such as mean firing rate (FR); (2) acute variance, such as the local alternative to coefficient of variation (CV2); (3) spectral power, such as the delta band (0.1–4 Hz); and (4) parameters of the interspike interval (ISI) distribution. We assessed the ability of features to predict cell type individually (Figures 1D–1F), as well as in a combined logistic regression (LogReg) model (Figures 1G–1I).
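Several of these features can be computed directly from a neuron’s spike or event times. The sketch below illustrates a plausible implementation of a handful of them using standard definitions from the literature; it is not the paper’s exact feature code, and the spectral and gamma-distribution features are omitted.

```python
import numpy as np

def timing_features(event_times):
    """A sketch of a few of the 17 timing features for one neuron.

    event_times: 1D array of spike/event times in seconds.
    """
    t = np.sort(np.asarray(event_times))
    isi = np.diff(t)                                   # interspike/interevent intervals
    mean_fr = len(t) / (t[-1] - t[0])                  # mean firing/event rate (Hz)
    cv = isi.std() / isi.mean()                        # coefficient of variation
    # CV2: local alternative to CV, contrasting adjacent intervals
    cv2 = np.mean(2.0 * np.abs(isi[1:] - isi[:-1]) / (isi[1:] + isi[:-1]))
    # LV: local variation of adjacent intervals
    lv = np.mean(3.0 * (isi[1:] - isi[:-1]) ** 2 / (isi[1:] + isi[:-1]) ** 2)
    return {"mean_fr": mean_fr, "isi_min": isi.min(), "isi_med": np.median(isi),
            "isi_max": isi.max(), "std_dev": isi.std(), "cv": cv, "cv2": cv2, "lv": lv}
```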
In the bio-realistic model, the combination of all features achieved a balanced accuracy of 45.14% ± 0.38% (Figure 1G), significantly above chance levels, when predicting all 16 cell types (K = 16; chance = 6.25%, p < 1E–10 relative to chance performance on shuffled labels, ANOVA with Tukey’s honestly significant difference [HSD] post hoc). For benchmark comparison with in vivo datasets, we collapsed the 16 cell types into 4 classes (K = 4; PV, VIP, SST, and E; chance = 25%). In this case, the combination of all features yielded a balanced accuracy of 69.50% ± 0.37% (p < 1E–10, Tukey’s HSD). These data suggest that, in the context of an ideal model in which every cell’s type is known, cell type is partially separable along some features of neuronal activity. There was extensive variability in the performance of individual features (Figure 1D).
We then deployed the same 17 features on the event time series from in vivo datasets. In calcium imaging, we collapsed the 10 E cell types into a single class for benchmark comparisons of K = 4 (PV, VIP, SST, and E; chance = 25%). The combination of statistics achieved a balanced accuracy of 74.44% ± 0.76% (p < 1E–10, Tukey’s HSD) (Figure 1E). Note that, due to sampling rate, calcium imaging datasets lack the two highest bands of the power spectrum. Some individual features offered surprisingly powerful separation of cell types (Figure 1E).
In the electrophysiological dataset, tasked with predicting the inhibitory classes (K = 3, chance = 33%), the combination of the 17 features yielded a balanced accuracy of 47.07% ± 0.78%, which was significantly above chance (p < 1E–10, Tukey’s HSD) although low relative to the benchmarks in the other datasets (Figure 1I). It should be noted that none of the 17 features performed notably well in the context of extracellular electrophysiology, and roughly half were only marginally above chance (Figure 1F).
A priori-defined features have the advantage of being immediately interpretable. For example, separability along the axis of FR would indicate a cell-type-dependent change in the number of spikes per second. We next sought to understand which features or groups of features are most effective in discerning cell type. We reasoned that, due to the high variability in overall performance across the three datasets, the mean accuracy of individual features would provide little insight into this problem. Thus, we ranked individual features by performance within a dataset and compared rankings across the datasets. To our surprise, there was little correspondence in the ranking of features across datasets (Figures 1D–1F, blue coloration). Feature rankings in the calcium imaging dataset were positively correlated with feature rankings in the bio-realistic model (0.5571, p = 0.0310, Spearman rank correlation), whereas feature rankings in the Neuropixels dataset were not positively correlated with those in the bio-realistic model (−0.2214, p = 1.0, Spearman rank correlation). Feature rankings in the Neuropixels and calcium imaging datasets had a non-significant positive correlation (0.3214, p = 0.2427, Spearman rank correlation) (Figure S1). Even the broad categories of the most informative features differed by modality. In the bio-realistic model, the top two features were standard statistics (ISI and CV). In the calcium imaging data, the top two features related to local variability (revised local variation [LVR] and LV). In the electrophysiology data, the top two features reflected spectral power (beta and theta bands). This does not reflect a slight disordering of the top features: the most effective features in one dataset were among the least effective in the others.
Taken together, these results suggest that some degree of cell-type classification can be achieved in all three datasets through the combination of hand-selected statistical features. This supports the hypothesis that cell type imposes meaningful constraints on the activity of individual neurons in the brains of behaving animals. However, there are three important caveats. First, from a practical perspective, the combination of features achieved mediocre performance in both the bio-realistic model and the electrophysiology dataset. Second, the lack of consistency in accuracy provided by features suggests that different recording methodologies are likely to demand unique sets of features. Third, these datasets were acquired under highly stereotyped conditions of repeated drifting grating stimuli.
Bottom-up, non-linear separability of cell types based on neuronal activity
Summary statistics, such as those employed above, reduce observed events into single values. In addition, the selection of summary statistics requires assumptions on the part of the experimenter. We reasoned that the entire distribution of a neuron’s interevent intervals (IEIs) might contain more information about type/class than summary statistics do and that cell class/type might not be linearly separable. To learn such a “fingerprint,” we trained a multilayer perceptron (MLP; see STAR Methods; Figures S2A–S2F) to classify cell type/class based on the histogram of IEIs generated by each neuron.
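A minimal sketch of such an MLP is shown below; the histogram bin count, hidden width, and dropout rate are illustrative assumptions rather than the tuned hyperparameters reported in Figure S2.

```python
import torch
import torch.nn as nn

class IEIHistogramMLP(nn.Module):
    """Sketch of a non-linear classifier over a neuron's global IEI histogram."""
    def __init__(self, n_bins=128, n_classes=4, hidden=256, p_drop=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_bins, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, n_classes))

    def forward(self, x):  # x: (batch, n_bins), each row a normalized IEI histogram
        return self.net(x)

logits = IEIHistogramMLP()(torch.rand(8, 128))  # toy batch of 8 IEI histograms
```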
In the bio-realistic model (K = 16), the MLP achieved 60.17% ± 0.26% balanced accuracy, a 33.30% relative improvement over linear models (p < 1E–10, Tukey’s HSD). Collapsed to K = 4, the MLP achieved 75.06% ± 0.25% balanced accuracy, an improvement of 8% (p < 1E–10, Tukey’s HSD) (Figure S2G). In the calcium imaging dataset (K = 4), the MLP achieved a balanced accuracy of 76.09% ± 0.50%, an improvement of 2.22% relative to summary statistics (p = 0.109, Tukey’s HSD) (Figure S2H). Similarly, in the electrophysiological dataset (K = 3), the MLP reached a balanced accuracy of 49.22% ± 0.82%, an improvement of 4.57% (p = 0.1901, Tukey’s HSD) (Figure S2I).
With the exception of its performance in the bio-realistic model, these results suggest that the MLP, despite representing a non-linear and bottom-up approach, performs on par with linear separation of cell type using 17 carefully selected features. One interpretation of the equivalent performances of these approaches is that they reflect the true maximal differentiability of cell types based on a computational fingerprint. However, there is another quality of neuronal activity that is captured by neither summary statistics nor an MLP trained on the global IEI histogram: alterations in activity that progress over time. In this case, the defining features of a neuron’s activity may appear intermittently or may only be meaningful when two states are considered in contrast to one another.
A flexible architecture for cell-type classification that leverages attention
Neuronal activity is typically described in terms of acute, stimulus-driven statistics or broad summary statistics. The possibility that other information may be embedded across or between these scales poses the challenge of learning to flexibly attend to the relevant subsets of data. Recent progress in machine learning tackles this problem with “attention,” a process that allows a classifier to weight different subsets of data depending on their value to the task.41,42 Taking advantage of this, we developed local latent concatenated attention (LOLCAT), a model that learns to predict cell type based on a neuron’s time series.
In brief, the algorithm works as follows (for details, see STAR Methods). We begin by dividing an entire recording into short chunks of time (snippets), within which we extract the IEI distribution (Figure 2A). For convenience, in the context of drifting gratings, snippets are aligned with each 3 s trial. First, LOLCAT extracts local features from the IEI distribution of each snippet. These local features are then passed through a multi-head attentional network that aggregates the features into a final representation of the cell. Each attention head functions independently to learn which aspects of snippets to attend to, and thus the inclusion of multiple attention heads allows LOLCAT to attend to features of activity at distinct scales, ranging from a few important snippets (local) to equally weighted snippets over the entire recording (global).
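This pipeline can be summarized in schematic PyTorch. Layer sizes, the head count, and the per-head scoring function below are illustrative assumptions, not the published configuration; the essential structure is a shared snippet encoder followed by several independent softmax-attention pools whose outputs are concatenated.

```python
import torch
import torch.nn as nn

class LOLCATSketch(nn.Module):
    """Schematic of the encoder + multi-head attention pooling described above."""
    def __init__(self, n_bins=64, d=32, n_heads=4, n_classes=4):
        super().__init__()
        self.encoder = nn.Sequential(            # shared per-snippet encoder
            nn.Linear(n_bins, d), nn.ReLU(), nn.Linear(d, d))
        self.scorers = nn.ModuleList(            # one attention scorer per head
            [nn.Linear(d, 1) for _ in range(n_heads)])
        self.classifier = nn.Linear(d * n_heads, n_classes)

    def forward(self, x):                        # x: (batch, n_snippets, n_bins)
        h = self.encoder(x)                      # per-snippet IEI features
        pooled = []
        for scorer in self.scorers:
            a = torch.softmax(scorer(h), dim=1)  # attention weights over snippets
            pooled.append((a * h).sum(dim=1))    # weighted, order-invariant pool
        return self.classifier(torch.cat(pooled, dim=-1))

logits = LOLCATSketch()(torch.rand(2, 119, 64))  # 2 neurons, 119 snippets each
```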
Figure 2. LOLCAT learns computational fingerprints of cell types.

(A) Architecture for LOLCAT: a time series of neuronal activity is split into short snippets (3 s) and the IEI distribution is estimated for each local window. IEIs are fed into an encoder to obtain features for each window before being aggregated in a multihead attention module. Each attention head computes a weighted combination of the representations of each IEI distribution, creating a pooled version of the representations. The attention head’s representations are concatenated to obtain an estimate of the class. The network is trained in a supervised manner using dropout, both across the input blocks and at units in layers within the network. Inset (top right) shows the allocation of attention (in red) from 3 heads to the same 5 snippets.
(B) Confusion matrix for LOLCAT on the bio-realistic model with 16 cell types (4 excitatory, 12 inhibitory; Htr3a is analogous to VIP). Annotation of performance below chance is deleted for visual clarity.
(C and D) LOLCAT results (confusion matrix) for four cell classes, E, PV, SST, and VIP, in the bio-realistic model (C) and calcium imaging data (D).
(E) Same as (C) and (D) but for three inhibitory cell classes labeled in extracellular electrophysiological recordings.
(F) Summary of final balanced accuracies across different methods of decoding cell types from event times. Bars represent statistical significance of comparison across models within task using Tukey’s HSD (ns, p ≥ 0.05; *p < 0.05; **p < 0.01; ***p < 0.001). Error bars represent SEM over 20 random splits. Logistic regression (LogReg) was trained on 17 statistics (Figure 1). A multilayer perceptron (MLP; non-linear deep network) was trained on the full IEI distribution of a recording. LOLCAT was trained on IEI distributions divided into snippets.
The design choices behind LOLCAT are motivated by three factors. (1) A generalizable computational fingerprint of cell type must be independent of stimulus information, such as the orientation or frequency of a drifting grating. LOLCAT is not passed anything beyond the time series of neuronal activity; it searches for the same patterns in every snippet, ignorant of stimulus and experimental conditions. (2) The order of stimulus presentation in experiments may carry information. To avoid this influence and induce LOLCAT to learn robust fingerprints, its architecture is intrinsically order invariant (i.e., shuffling snippet order has no impact on the learning of fingerprints; see the check below). (3) To limit assumptions about fingerprints and maximize flexibility across observations, LOLCAT generates predictions with as many snippets as are available.
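The order invariance in point (2) is straightforward to verify for softmax-attention pooling; the standalone check below uses arbitrary dimensions and a randomly initialized scorer.

```python
import torch

torch.manual_seed(0)
scorer = torch.nn.Linear(32, 1)

def attn_pool(h):                        # h: (batch, n_snippets, features)
    a = torch.softmax(scorer(h), dim=1)  # attention over the snippet dimension
    return (a * h).sum(dim=1)

h = torch.randn(1, 119, 32)
perm = torch.randperm(h.shape[1])        # shuffle snippet order
assert torch.allclose(attn_pool(h), attn_pool(h[:, perm]), atol=1e-5)
```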
We first tested the ability of LOLCAT to learn fingerprints from the 16 cell types that comprise the bio-realistic model (K = 16) (Figure 2B). To our surprise, and in contrast to the hypothesis that the MLP reflected asymptotic performance, LOLCAT achieved a mean balanced accuracy of 70.10% ± 1.42% (chance = 6.25%), a 16.50% relative improvement over the MLP (p < 1E–10, Tukey’s HSD) and 55.29% over the linear model (p < 1E–10, Tukey’s HSD) (Figure 2F). The accuracy of individual cell types ranged between 40.19% and 95.73%. Of the 16 cell types, only five fell below 60% accuracy and seven below 70%. When the 16 cell types were collapsed into the benchmark K = 4 (PV, VIP, SST, and E; chance = 25%), LOLCAT achieved a mean balanced accuracy of 85.01% ± 1.34%, with an improved diagonal structure of the confusion matrix and strong performance across all classes (PV: 88.66%, VIP: 85.42%, SST: 81.06%, and E: 84.97%) (Figure 2C). This represents a 13.24% relative improvement over the MLP (p < 1E–10, Tukey’s HSD) and 22.30% over the linear model (p < 1E–10, Tukey’s HSD).
The increase in performance in the bio-realistic model could be due to some combination of learning from an extensive dataset and the perfect reconstructability of a synthetic dataset. We next applied LOLCAT to calcium imaging, the second largest dataset, to determine whether performance improvements over the MLP might be achieved in a smaller, in vivo context. In the benchmark K = 4 label set (PV, VIP, SST, E; chance = 25%), LOLCAT achieved a similar balanced accuracy to that observed in the bio-realistic model (79.59% ± 2.70%) (Figure 2D). This represents a 4.60% relative improvement over the MLP (p = 0.0019, Tukey’s HSD) and a 6.92% relative improvement over the linear model (p = 3.11E–7, Tukey’s HSD) (Figure 2F). Model accuracy was well balanced across classes (E: 78.28%, PV: 81.92%, SST: 82.73%, VIP: 75.41%). The accuracy achieved for each class is noteworthy given the severe imbalance between excitatory and inhibitory cells in this dataset (PV: 180, SST: 470, VIP: 470, E: 22,245). Further increases in accuracy may be achievable by expanding the inhibitory populations.
Next, we turned to electrophysiology, the smallest dataset with three inhibitory classes (K = 3; PV, SST, and VIP; chance = 33%). Trained on spike times from optotagged neurons in this dataset, LOLCAT extracted cell class with a balanced accuracy of 54.22% ± 1.57% (Figure 2E). These results are a 10.12% relative improvement over the MLP (p = 2.85E–4, Tukey’s HSD) and a 15.14% relative increase over the global statistics baseline (p = 5.04E–7, Tukey’s HSD) (Figure 2F). Compared with the MLP and linear/summary statistics, class accuracies were well balanced (PV: 58.14%, VIP: 53.57%, and SST: 50.89%). It is worth noting that it is more challenging to extract transcriptomic class from this dataset compared with the calcium imaging data and bio-realistic model. This is most likely the product of three factors: the electrophysiology dataset (1) has the fewest labeled neurons, (2) has the fewest class labels, and (3) is unavoidably challenged by the inclusion of spikes from multiple neurons, even in well-isolated clusters.
Despite the challenges associated with extracellular electrophysiology, LOLCAT’s results across all three datasets suggest that information about genetic cell type is robustly embedded in the time series of neuronal activity in isocortical circuits. LOLCAT’s performance was not achieved by learning one class at the expense of others; the model delivered well-balanced class identification (Figures 2B–2E). In the benchmark comparisons (bio-realistic model K = 4; calcium imaging K = 4; electrophysiology K = 3), LOLCAT exhibited a range of class accuracies (highest class accuracy to lowest class accuracy) of 7.60%, 7.32%, and 7.25%, respectively. This indicates that LOLCAT achieved increased performance by enhancing the resolvability of all classes rather than a subset. In contrast, accuracy ranges in the MLP results were 31.37%, 15.83%, and 24.70%. In the linear models, ranges of 39.03%, 18.68%, and 25.38% were observed.
Computational division of excitatory subtypes
The inhibitory classes considered here are genetically distinct and stable over time.1 In other words, PV neurons do not express SST and are unlikely to switch classes. In contrast, the degree to which E neuronal subtypes are distinct and stable is an open question.1 We took advantage of the 10 E Cre lines in the calcium imaging dataset to address this, excluding two lines, Emx1 and Slc17a7, known for their widespread glutamatergic expression and lack of specificity.7,43,44 Thus, we trained LOLCAT on the remaining excitatory Cre lines (K = 8) to understand which, if any, excitatory subtypes are computationally distinguishable (Figure 3A).
Figure 3. Excitatory cell types in the cortex are computationally separable by layer and projection class.

(A) Confusion matrix for LOLCAT applied to the time series of neuronal activity recorded in eight excitatory Cre lines (Ca2+ imaging). Annotation of performance at or below chance is deleted for visual clarity.
(B) Confusion matrix for LOLCAT applied to excitatory cell type when the 8 Cre lines are grouped by laminar position and projection class.45
(C) Uniform manifold approximation and projection (UMAP) embedding of all separable cells colored by Cre line. Layer/projection class assignments are indicated by adjacent labels (IT, intratelencephalic; PT, pyramidal tract; CT, corticothalamic).
Lines like Cux2, Ntsr1, and Rbp4 could be discerned at or above 50% accuracy (chance is 12.5%) (Figure 3A). These lines are known to be more specific to individual cell types.45,46 In contrast, other lines were more likely to be confused with a subset of other labels. Specifically, Rorb and Scnn1a performed below 30%. Interestingly, these lines are known to label a wider range of E cells.7,47 Together, these data suggest that the genetic specificity of excitatory Cre lines is correlated with the resolvability of cell types in the time series of neuronal activity.
Further distinctions within canonically defined cell types are still being recognized, and the genetic specificity of excitatory Cre lines is variable. In light of this, we reasoned that the computational divisibility of E neurons might improve along another axis. We investigated LOLCAT’s performance when E neurons were organized according to the taxonomy for Cre lines proposed in Harris et al.45 In this work, each Cre line was characterized in terms of its laminar expression pattern, as well as its axonal target (pyramidal tract [PT], intratelencephalic tract [IT], corticothalamic tract [CT], or a mixture). The outputs of LOLCAT were thus collapsed into four classes: L2/3 IT neurons (Cux2), L4/5 IT neurons (Nr5a1: L4, Rorb: L4/5 IT, Scnn1a: L4/5 IT), L5 IT/PT (Tlx3: L5 IT, Rbp4: L5 IT/PT), and L6 CT neurons (Ntsr1) (Figure 3B). Grouped this way, LOLCAT achieved 52.10% ± 0.31% balanced accuracy. The highest accuracy (59.24%) was associated with L5 IT/PT. Other classes reached roughly 50% accuracy, suggesting that excitatory cell types may be better differentiated by layer and projection type than by transcriptomic signatures.
We visualized the representations learned by the model with a supervised variant of uniform manifold approximation and projection (UMAP) that aims to keep points from the same class close in the embedded space (Figure 3C).48 Distances between classes are learned directly by the model, and class arrangements can be interpreted as reflecting whether clusters/classes are close in the latent space (that comprises IEI features). When we examined this embedding for E neurons that were correctly classified by LOLCAT, we found that Cre lines from similar layers are mapped to nearby points in the latent space, providing further evidence that LOLCAT learned similar latent features for cells from similar laminar positions. Overall, we found that pure IT lines are embedded into similar parts of the latent space, and lines containing mixtures of IT/PT are closer to one another, with CT in L6 being well separated from the other lines. The Cux2 line in L2/3 is one of the most well-sampled classes in the dataset, and despite the broad coverage, our analysis suggests that the representations of different neurons within the class are relatively homogeneous.
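Supervised UMAP of this kind is available in the umap-learn package by passing class labels to fit_transform; in the sketch below, the latent features and labels are placeholders for LOLCAT’s learned representations and Cre-line assignments.

```python
import numpy as np
import umap  # umap-learn

# Placeholder stand-ins for LOLCAT's latent features and Cre-line labels.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 128))
labels = rng.integers(0, 8, size=500)

# Passing y makes the embedding supervised: same-class points are kept close.
embedding = umap.UMAP(n_components=2, random_state=0).fit_transform(Z, y=labels)
```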
Understanding the basis of computational fingerprints learned by LOLCAT
LOLCAT could identify cell type from a single IEI or increasingly global patterns that emerge over longer timescales. To understand the rules LOLCAT uses to optimally solve the cell-type task, we dissected the four attention heads within the calcium imaging K = 4 model. Visualization of attention allocation across trials (i.e., attention masks) revealed that, consistently and without prompt or guidance, two heads learned local rules, while the other two learned global rules involving nearly all of the trials (Figure 4A). In other words, some heads attended to the entire recording, while others focused on specific trials in the data. We also directly quantified the contribution of attention to this process by ablating the attention mechanism while maintaining the rest of the LOLCAT architecture. In this case, LOLCAT’s performance was significantly diminished (76.80% ± 1.86% ablated vs. 79.59% ± 2.70% intact, p = 0.0117, Tukey’s HSD), and it performed at the level of the standard MLP (76.09% ± 0.50%, p = 0.936, Tukey’s HSD).
Figure 4. The basis of computational fingerprints learned by LOLCAT.

(A) Visualization of attention paid to 119 snippets. The same snippets are shown in the context of each of four attention heads. Attention weight is indicated by the shade of red (darker is more attention). The left heads are global in nature (broad distribution of attention), while the right heads are local (sparse distribution of attention).
(B) Bar plots showing accuracy of decoding the cell class from each head individually (rather than concatenating all four heads as in the full LOLCAT model). No one head individually decodes all classes.
(C) Distribution of attention scores for each head in low (0–2 Hz), medium (2–4 Hz), and high (4–10 Hz) firing rate snippets. Densities are normalized and shifted vertically for visualization.
(D) Prediction confidence as a function of the number of observed snippets (K = 4 indicated by color). Error represents SEM from all correctly classified cells of a class in the test set.
(E and F) Prediction accuracy of the full model when tested on a single stimulus orientation (E) or temporal frequency (F).
(G) Integrated gradients attribution score (teal bars; top), corresponding to the relative importance of each snippet for LOLCAT’s prediction of the correct class. Snippet rasters (gray; bottom) underlying the attribution score shown above.
(H) IEI histograms of high-attribution trials reveal consistent fingerprints of cell classes after clustering with Wasserstein distance. Fingerprints are computed as the Wasserstein barycenter (center of mass) of the top IEI histograms for neurons in a specific class. For SST (left), example high-attribution trial histograms are shown (top), as well as the mean smoothed IEI fingerprint (bottom).
LOLCAT’s predictions arise from pooling the results of the attention heads. To test whether cell-type information arises from local vs. global attention heads, we extracted the latent representations from individual heads. We then trained a linear layer on top of each attention head to decode the four cell classes of interest (Figure 4B). This allowed us to assess the ability of each attention head to decode cell type on its own. Interestingly, excitatory types could be decoded at >60% accuracy across both global and both local heads. VIP and SST were decoded by the same two heads, one local and one global. PV, in contrast, was only decodable in a single local head that did not attend to the other inhibitory types. It is worth noting that LOLCAT relies on the combination of all heads, which is not quantified here. However, the ability to decode from single heads provides insight into the timescales over which cell-type fingerprints emerge, as well as the complexity of the rules necessary for accurate grouping (the distribution of non-overlapping performance across multiple heads).
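This probing procedure amounts to fitting an independent linear readout on each head’s pooled representation; a sketch with placeholder representations follows.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score

# head_reprs: hypothetical per-head pooled representations, (n_neurons, d) each.
rng = np.random.default_rng(0)
head_reprs = {i: rng.normal(size=(600, 32)) for i in range(4)}
y = rng.integers(0, 4, size=600)
train, test = slice(0, 480), slice(480, 600)

for i, Z in head_reprs.items():
    probe = LogisticRegression(max_iter=1000).fit(Z[train], y[train])  # linear readout
    acc = balanced_accuracy_score(y[test], probe.predict(Z[test]))
    print(f"head {i}: balanced accuracy {acc:.2f}")  # chance is 0.25 for K = 4
```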
A reasonable hypothesis is that each attention head simply attends to snippets containing the most activity. To evaluate this, we plotted attention weight as a function of event rate for each attention head (Figure 4C). Surprisingly, across all heads, there was generally equivalent attention paid to low and high levels of activity. This suggests that computational fingerprints involve diverse patterns of IEIs that are not only driven by preferred stimuli.
Fingerprints could manifest in two ways. First, if cell type shaped the timing of all neuronal activity, only a small number of snippets would be necessary for identification. Second, if the impact of cell type were subtle and/or stochastically distributed, evidence from many trials might be required to make a high-confidence identification. To distinguish between these possibilities, we presented LOLCAT with increasing amounts of data, from 1 snippet to 600, and evaluated its performance (Figure 4D). In order from the smallest to largest amount of data required, SST, E, PV, and VIP each revealed a unique relationship between discernibility and the number of snippets. All classes were significantly different from one another (p < 0.001, linear mixed model with Tukey’s HSD).
Despite the fact that LOLCAT is not provided information about stimuli, stimuli influence neuronal activity. We hypothesized that computational fingerprints would be most reliably driven by a neuron’s preferred stimuli. In stark contrast, we found little relationship between stimuli and the snippets that carried computational fingerprints (Figures 4E and 4F). Specifically, SST and PV cells could be predicted with high accuracy across all stimulus subsets. E and VIP cells were more variable, with decreased resolvability of E cells at low temporal frequencies. We also observed an anti-correlated trend in SST, with a small decrease in accuracy for high temporal frequencies consistent with previous studies in this dataset.30
Visualizing local fingerprints
We next asked whether we could visualize the local fingerprints that define each class. We applied the integrated gradients (IG) method49 to identify snippets that were particularly useful for LOLCAT’s predictions (Figure 4G). We averaged the IEI distributions of the most useful snippets to reveal local fingerprints associated with each class (Figure 4H) (see STAR Methods for details). The fingerprints associated with the four classes reflected distinct temporal dynamics. Specifically, inhibitory cell classes were characterized by multimodal peaks (i.e., more than one prominent, separable local maximum), while E neurons tended to be unimodal and concentrated in shorter IEIs. K-means-based clustering of IEI fingerprints revealed distinct subtypes of fingerprints within the three inhibitory classes but not within the excitatory class. IEI fingerprints exhibited variability within cell classes, suggesting that LOLCAT learns multiple features of neuronal activity to determine cell type.
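For one-dimensional distributions such as IEI histograms, the Wasserstein machinery is simple: pairwise distances are available in SciPy, and the barycenter reduces to averaging quantile (inverse CDF) functions. The sketch below uses synthetic IEI samples; note that it clusters with an agglomerative method on precomputed Wasserstein distances rather than the k-means variant used above, and it requires scikit-learn ≥1.2 for the metric argument.

```python
import numpy as np
from scipy.stats import wasserstein_distance
from sklearn.cluster import AgglomerativeClustering

# Synthetic stand-ins for high-attribution IEI samples from one cell class.
rng = np.random.default_rng(0)
samples = [rng.gamma(2.0, 0.1, size=200) for _ in range(50)]

# Pairwise 1D Wasserstein distances, then clustering of fingerprint shapes.
D = np.array([[wasserstein_distance(a, b) for b in samples] for a in samples])
clusters = AgglomerativeClustering(n_clusters=2, metric="precomputed",
                                   linkage="average").fit_predict(D)

# 1D Wasserstein barycenter ("center of mass") via quantile averaging.
q = np.linspace(0.0, 1.0, 100)
barycenter = np.mean([np.quantile(s, q) for s in samples], axis=0)
```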
We also applied IG to the LOLCAT model that categorized K = 8 E cell types in the calcium imaging dataset (Figure 3). This revealed distinguishable and complex fingerprints within the E cell population (Figure S3). These data suggest that IEI fingerprints at the class level may mask diversity in the fingerprints of underlying cell types.
Computational fingerprints persist across complex, naturalistic stimuli and multiple conditions
The synthetic stimuli that maximally drive spiking in visual cortical neurons do little to predict the activity of the same neuron in more complex, naturalistic conditions.50–52 This raises the possibility that computational fingerprints learned by LOLCAT are highly specific to drifting gratings. Alternatively, and consistent with LOLCAT’s performance across activity levels and stimuli (Figures 4C, 4E, and 4F), computational fingerprints may be an underlying feature of cell type that are relevant in diverse conditions. To address this, we trained LOLCAT to predict cell class (K = 4, calcium imaging) from neural activity recorded during the presentation of (1) naturalistic movies (NM) and (2) a mixture of NMs and drifting gratings (DGs) (Figure 5A). In each of these contexts, we also evaluated our feature-based model (LogReg) and the MLP used in previous experiments.
Figure 5. Computational fingerprints generalize throughout diverse stimulus sets.

(A) “Within-context” models (K = 4, Ca2+ imaging) are trained and tested on neuronal time series recorded during the presentation of either drifting gratings (top, DG:DG) or naturalistic movies (bottom, NM:NM).
(B) Performance of within-context LogReg, MLP, and LOLCAT models. Bars represent statistical significance of comparison across models within task using Tukey’s HSD (ns, p ≥ 0.05; *p < 0.05; **p < 0.01; ***p < 0.001). Error bars represent SEM over 20 random splits.
(C) “Mixed” models (K = 4, Ca2+ imaging) are trained on time series recorded during both DG and NM presentation. Mixed models are tested separately on both NMs and DGs.
(D) Performance of LogReg, MLP, and LOLCAT mixed models tested on DGs (left) and NMs (right).
(E) Local IEI fingerprints of the four cell classes attended to by the DG (solid lines) and NM (dashed lines) models. Cell class is indicated by color.
(F) Local IEI fingerprints attended to by the mixed model when identifying cell types in the context of DGs (solid lines) and NMs (dashed lines). Error bars represent SEM over 20 random splits.
To establish a baseline, we started with the NM-only condition, in which models were trained and tested on neuronal activity recorded during the presentation of NMs. Surprisingly, despite the complexity of the NM stimulus, all three approaches reached performance levels similar to DG only (Figure 5B). LOLCAT reached 76.48% ± 1.90% balanced accuracy, and the MLP and LogReg achieved 74.98% ± 0.37%* and 74.86% ± 0.82%*, respectively (ANOVA p = 0.006654, asterisks indicate comparison with LOLCAT, *p < 0.05; **p < 0.01; ***p < 0.001). This result implies that cell type imposes structure on the timing of neuronal activity even during complex, non-stereotyped naturalistic stimuli. However, these results offer no insight into whether the fingerprints learned in the NM condition bear any similarity to those in DG. There is good reason to assume they should differ; activity levels change systematically when comparing DGs with NMs (Figure S5). SST neurons modestly decrease their activity, while VIP and PV neurons increase their activity by an order of magnitude. E cells increase their activity by ~10%.
To test whether it is possible to learn fingerprints that define cell types across DG and NM contexts, we trained LOLCAT on mixtures of snippets from DG and NM sessions. We then tested the mixed model on withheld neurons from DG recordings and withheld neurons from NM recordings (Figure 5C). In these conditions, LogReg and the MLP maintained their performance when tested with DGs and decreased in performance when tested with NMs. In contrast, relative to the within-domain train/test LOLCAT models, LOLCAT’s performance significantly improved when tested with DGs and was maintained when tested with NMs. The mixed LOLCAT significantly outperformed both the mixed MLP and mixed LogReg when tested on DG trials (LOLCAT = 82.98% ± 1.46%; LogReg = 73.96% ± 0.86%***; MLP = 74.04% ± 0.63%***; ANOVA p < 1E–10, asterisks indicate comparison with LOLCAT) as well as NM trials (LOLCAT = 74.79% ± 0.76%; LogReg = 66.03% ± 0.74%***; MLP = 69.51% ± 0.59%***; ANOVA p < 1E–10, asterisks indicate comparison with LOLCAT). It is worth noting that LOLCAT’s accuracy on DG trials in the mixed model increased relative to the DG-only condition (from 79.59% to 82.98%), suggesting that LOLCAT overcomes data limitations by learning rules common to both conditions. These results demonstrate that computational fingerprints of cell class generalize strongly across diverse stimuli.
Given that global activity levels differ substantially between conditions, we reasoned that local fingerprints learned by LOLCAT must be the source of consistency across DG and NM experiments. To evaluate the comparability of fingerprints, we applied IG to compare salient snippets from DG-only and NM-only models. Many of the IEI fingerprints overlapped in this contrast (Figure 5E). In particular, VIP and PV classes showed similar bimodal structure and non-uniform distributions. We then applied IG to the mixed model when it was tested with DGs and when it was tested with NMs. The fingerprints weighted heavily by the mixed model exhibited remarkable stability across DG and NM testing (Figure 5F). This demonstrates that once LOLCAT learns to solve the mixed-condition problem, the same fingerprints are relevant in both conditions. In other words, a neuron’s class is embedded in the timing of its activity consistently across divergent inputs. These results suggest that, once fingerprints are learned, cell type may be identifiable in a broad range of datasets, both those already recorded and forthcoming.
DISCUSSION
In this study, we asked whether a neuron in a cortical network embeds a fingerprint of its transcriptomic identity into its activity. To test this, we attempted to resolve cell classes and types using only spike and event timing in (1) a bio-realistic model of visual cortex, (2) an open, in vivo optophysiology dataset from mouse visual cortex, and (3) an open dataset comprising in vivo high-density extracellular electrophysiological recordings. Taking a data-driven approach to this problem, we developed LOLCAT, a deep-learning architecture that uses attention to extract computational signatures of cell types. LOLCAT achieves substantial success in decoding cell classes and types from the time series of neurons in all three datasets. The computational fingerprints that define cell type are robust across diverse stimuli including NMs. Our findings reveal that a reliable stream of information about cell type is embedded within the time series of neuronal activity in the intact brain of behaving animals.
Linking transcriptionally defined neuronal types to their functional phenotypes in the intact brain is a widely recognized challenge.1,28 Traditionally, the cellular diversity of neurons was described in physiological and anatomical terms.53 Currently, neural taxonomies rely on transcriptomic, morphoelectric, and/or molecular parameters.1,3,10,54–58 While powerful, these approaches provide limited insight into the intact, functioning circuit and require very specific stimulus protocols to extract cell-type information.
Prior efforts to use spike timing to resolve even small numbers of excitatory cell types in vivo suggest that network-driven variability in neuronal responses conceals the existence of subtypes.59 In line with this, there is an extensive literature centered on the randomness of isocortical neuronal activity.60–62 Spike times of isocortical neurons are Poisson in nature, a feature that provides computational robustness60,63 and appears to be incompatible with reliable, cell-type-specific responses. However, results from simplified preparations demonstrate functional differences between some cell types,10 and populations of cell types are distinguishable based on population-level responses.28 LOLCAT may reconcile these observations through the consideration of neuronal activity across longer timescales. Spike trains could be Poisson within short intervals and carry reliable patterns over longer, more global intervals. Alternatively, nested amid many irregular short intervals, stereotyped patterns of activity could occasionally arise. Local and global attention heads naturally formed in each of our models, suggesting that LOLCAT learned signatures of cell type/class consistent with each of these possibilities.
The rules by which cell-type information shapes a spike train are likely to be quite complex and thus difficult to summarize. The fact that no single statistical measure or feature differentiates cell types across datasets and conditions underscores this. However, our data clearly demonstrate the existence of robust computational signatures that predict cell identity in withheld data. Even at the level of four broad classes, rules were variably distributed across attention heads. Further decomposition of a class into types reveals additional layers of complexity (e.g., the difference between the E class and the 8 E types). As a result, the rules learned by LOLCAT are unlikely to be summarized by a small number of traditional measures (e.g., CV and spectral power). That being said, traditional measures provide some insight into the axes along which cell types may substantially influence spike timing. Considered across all three datasets, LVR was the most informative single measure. This implies that FR irregularity is a likely nexus of cell-type influence that contributes to the more complex rules learned by LOLCAT.
LOLCAT is somewhat remarkable for the generalizability of its results across diverse conditions. Not only is the model effective when challenged by both DGs and NMs, but stable fingerprints emerge in both conditions. Generalizability is facilitated by deliberately preventing LOLCAT from contextualizing snippets of one neuron’s activity with regard to the experiment or other neurons. This is imposed at three levels in the architecture. First, within snippets, LOLCAT evaluates IEI distributions rather than an ordered sequence of spike times. The effect of this is that LOLCAT cannot “see” the local response to stimulus onset or change, only the distribution of IEIs that arose in a short interval. Second, snippet IEI distributions are passed through an identical encoder without any information about the stimulus. This undermines LOLCAT’s ability to learn stimulus-specific patterns.28,64 Third, because the multihead attention module pools the representations of snippets, it is blind to the order of trials. Put simply, LOLCAT has no access to temporally correlated activity, nor can it decipher the structure of the experiment. As a result, LOLCAT derives fingerprints that can be used to classify a neuron independent of the network and stimulus.
The ability to infer transcriptomic cell type from only the time series of recorded neuronal activity opens up a wide range of possibilities. Key among these are (1) training a more comprehensive LOLCAT model on diverse cell types, including subcortical neurons and those responsible for neuromodulatory tone, (2) the deployment of LOLCAT on extant or future datasets that otherwise lack cell-type information, and (3) the application of self-supervised approaches to learn computationally distinct populations from recording without genetic labels.65 The first possibility is limited only by the availability of necessary genetic tools and large-scale datasets for training. The second will allow the interactions of cell classes/types to be studied during cognition and behavior. The third presents an opportunity to take the opposite approach: functional classification of cells can be subsequently mapped onto types.
Prior work has taken advantage of cell-type-specific correlations at the population level to describe cell-type-specific differences in neuronal activity.28 However, this requires a priori knowledge of cell types and is not applicable to the problem of classifying an unknown neuron. The current version of LOLCAT suggests that accuracies of up to 80% may be achievable by examining nothing other than the time series of one neuron’s spiking. At this level, a meaningful minority of labels would be incorrectly assigned. However, this is significantly more accurate than inference of genetic cell types from similar approaches, such as those based on the extracellular waveform.26 Combining LOLCAT with complementary information streams, such as waveform shape, stands to achieve even higher accuracies. That said, our fundamental question is not maximal accuracy but whether cell-type information reliably shapes spike timing.
The existence of computational fingerprints of cell type raises the question of whether they are utilized by the brain or are epiphenomenal. Previous modeling suggests that this type of information could be utilized by sensory neurons66 and in credit assignment during learning.67 Further work is required to clarify whether the brain takes advantage of information streams that carry the identity of presynaptic neurons. Interestingly, the question can be reframed as follows: does the presence of diverse cell types provide a computational benefit? Given the important and evolving connections between neuroscience and artificial intelligence (AI),68 our results raise the possibility that the inclusion of different cell types in deep-learning architectures will advance AI.
Limitations of the study
The datasets employed here represent the largest datasets of their class currently available, yet some classes and cell types are absent, notably the Lamp5 and Sncg inhibitory classes. Similarly, many excitatory Cre lines lack genetic specificity.7,43,44 There is no reason to believe that LOLCAT or similar architectures will be unable to learn more classes as relevant datasets come online: in the bio-realistic model, LOLCAT maintained high accuracy with 16 classes. In addition, the in vivo datasets rely on Cre expression to identify cell types. While the presence of Cre is unlikely to impact event timing, it will be interesting to compare the signatures learned here with those extracted from independent datasets that do not rely on Cre. Similarly, it will be valuable to test the fingerprints learned by LOLCAT across data acquired by different groups in different brain areas using different recording methods.
Our baseline comparisons provide insight into the linear and non-linear separability of cell types based on activity timing. This is not to suggest that the accuracy of baseline models could not be boosted by increased model complexity. For example, we explored a range of hyperparameter values for our MLP (Figure S2) to capture the performance of a standard, non-linear approach. The MLPs included in this work employ the optimal set of hyperparameters for each task (Figure S2). However, it stands to reason that the integration of further architectures alongside the MLP, such as a recurrent neural network (RNN) or a convolutional neural network (CNN), might further increase accuracy. Similarly, the linear classification using LogReg (Figure 1) would presumably improve with the addition of nonlinear features.
Finally, analyses of calcium events are degraded by lumping all excitatory cell types together. As a group, excitatory types are parsable, but two factors make it challenging to fully delineate genetic types. The first is the lack of genetic specificity intrinsic to some of the excitatory Cre lines. The second is the signal degradation inherent in calcium event detection. This is a key limiting factor in our analyses, as accurate reconstruction of the neuronal spike time series may be necessary for a reliable computational fingerprint to emerge. Similarly, in the Neuropixels recordings, misattributed spikes are a likely source of noise that limits classification.
STAR★METHODS
RESOURCE AVAILABILITY
Lead contact
Further information and requests for resources and reagents should be directed to the lead contact, Keith Hengen (khengen@wustl.edu).
Materials availability
This study did not generate new unique reagents.
Data and code availability
This paper analyzes existing, publicly available data. The accession numbers for the datasets are listed in the key resources table. The 2P-Imaging and Neuropixels datasets were generated by the Allen Institute. Directions for access are available here: https://allensdk.readthedocs.io/en/latest/data_resources.html. Bio-realistic data were generated using a model available here: https://alleninstitute.github.io/bmtk/.
KEY RESOURCES TABLE

| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Deposited data | | |
| Visual Coding - Neuropixels Dataset | Allen Institute | https://allensdk.readthedocs.io/en/latest/visual_coding_neuropixels.html |
| 2-Photon Imaging Dataset | Allen Institute | https://allensdk.readthedocs.io/en/latest/visual_behavior_optical_physiology.html#photon-imaging-dataset |
| Software and algorithms | | |
| AllenSDK | Allen Institute | https://allensdk.readthedocs.io/en/latest/data_api_client.html |
| Brain Modeling Toolkit | Allen Institute | https://alleninstitute.github.io/bmtk/ |
| Code Base for LOLCAT | This paper | https://doi.org/10.5281/zenodo.7636935 |
All original code has been deposited on GitHub and is publicly available as of the date of publication. DOIs are listed in the key resources table.
Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
METHOD DETAILS
Datasets
Bio-realistic model for generating synthetic spike trains (V1)
We used the Allen Institute’s bio-realistic model for simulations of mouse primary visual cortex.31 The V1 simulations use generalized leaky integrate-and-fire (GLIF) models with the same connectivity graph as the biophysical model.31 The model comprises a core of radius 400 μm surrounded by leaky integrate-and-fire neurons, for a total radius of 845 μm. For more details, please see Figure 1B of Billeh et al. (2020). The bio-realistic model contains 51,978 cells in the core, and the whole network contains 230,924 cells with 85% excitatory and 15% inhibitory neurons. The V1 network contains 17 cell types represented by 111 unique GLIF neuron models. After filtering for neurons with a firing rate exceeding 0.1 Hz, layer 5 inhibitory Htr3a neurons were very rare and were therefore not included in classification tasks. The simulation consists of 100 presentations of drifting gratings to LGN at different orientations (0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°). Each of these 100 trials starts with 500 ms of gray screen followed by 2.5 s of drifting gratings at one orientation. The total duration of the simulation was 300 s.
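As a concrete illustration of this inclusion criterion, the following sketch (with hypothetical variable names; the 0.1 Hz cutoff and 300 s duration come from the text) filters simulated units by mean firing rate:

```python
import numpy as np

# Hypothetical illustration: keep simulated units whose mean firing rate
# over the 300 s simulation exceeds 0.1 Hz.
SIM_DURATION_S = 300.0
RATE_THRESHOLD_HZ = 0.1

def passes_rate_filter(spike_times_s: np.ndarray) -> bool:
    """True if the unit's mean firing rate exceeds the 0.1 Hz cutoff."""
    return len(spike_times_s) / SIM_DURATION_S > RATE_THRESHOLD_HZ

# spike_trains is a hypothetical dict mapping unit id -> spike times (s)
spike_trains = {0: np.sort(np.random.uniform(0, 300, 60)), 1: np.array([1.0, 2.0])}
kept = {uid: st for uid, st in spike_trains.items() if passes_rate_filter(st)}
```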
Calcium imaging datasets
Neural data from the Allen Brain Institute’s Visual Coding - 2-Photon Imaging dataset were retrieved as NeuroData Without Borders (NWB) files through AllenSDK, and the default filter criteria for selecting quality units were applied. For drifting gratings, we retrieved recordings coinciding with all 3 s stimulus presentation trials. Stimulus presentations varied in spatial orientation (0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°, clockwise from 0° = right-to-left) and temporal frequency (1, 2, 4, 8, and 15 Hz) and remained consistent in spatial frequency (0.04 cycles/degree) and contrast (80%), with 15 equivalent presentations of each particular stimulus. Each presentation was followed by 1 s of gray screen. In total, this amounted to 30 minutes of recording from each unit. For naturalistic movies, we retrieved recordings coinciding with presentations of Natural Movie 3, a 120-second clip with no cuts from the movie “Touch of Evil”. This was presented 10 times, amounting to 20 minutes of recording from each unit. Separately for recordings coinciding with all presentations of drifting gratings or naturalistic movies, high-quality units with at least five events in one or more 3-second intervals (i.e., trials for drifting gratings) were included for analysis. This criterion did not eliminate any classes from the dataset.
Neuropixels datasets
Neural data from the Allen Brain Institute’s Visual Coding - Neuropixels dataset were retrieved as NeuroData Without Borders (NWB) files through AllenSDK, and the default filter criteria for selecting quality units were applied. For recordings coinciding with all presentations of drifting gratings, high-quality units whose average firing rate was at least 0.1 Hz were included for analysis; this did not eliminate classes from the dataset. For drifting gratings, we retrieved recordings coinciding with all 3 s stimulus presentation trials. In the Brain Observatory dataset, stimulus presentations varied in spatial orientation (0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°, clockwise from 0° = right-to-left) and temporal frequency (1, 2, 4, 8, and 15 Hz) and remained consistent in spatial frequency (0.04 cycles/degree) and contrast (80%), with 15 equivalent presentations of each particular stimulus. Each presentation was followed by 1 s of gray screen. In total, this amounted to 30 minutes of recording from each unit. Units were excluded if they never exhibited more than five spikes in the span of an individual stimulus presentation. In the Functional Connectivity dataset, stimulus presentations varied in spatial orientation (0°, 45°, 90°, 135°) and 9 contrasts (0.01, 0.02, 0.04, 0.08, 0.13, 0.2, 0.35, 0.6, 1.0) and remained consistent in temporal frequency (2 Hz) and spatial frequency (0.04 cycles/degree), with 75 equivalent presentations of each particular stimulus. Each presentation was followed by 1 s of gray screen. In total, this amounted to 30 minutes of recording from each unit. These datasets were combined to increase the number of available units for the electrophysiology dataset.
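For readers reproducing the retrieval step, a minimal sketch of the AllenSDK workflow is shown below; the manifest path is illustrative, and the rate filter is a rough approximation of the criterion described above rather than the exact pipeline used here:

```python
import numpy as np
from allensdk.brain_observatory.ecephys.ecephys_project_cache import EcephysProjectCache

# 'manifest.json' is an illustrative local cache path.
cache = EcephysProjectCache.from_warehouse(manifest="manifest.json")

sessions = cache.get_session_table()                # one row per recording session
session = cache.get_session_data(sessions.index[0])

units = session.units                               # unit metadata (default quality filters)
gratings = session.get_stimulus_table("drifting_gratings")

# Rough version of the >=0.1 Hz criterion; the exact rate computation used
# in this paper may differ (e.g., in how the recording duration is defined).
rates = {uid: len(st) / (st[-1] - st[0])
         for uid, st in session.spike_times.items() if len(st) > 1}
kept = [uid for uid, r in rates.items() if r >= 0.1]
```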
Note: the term “transcriptomic class/type” is used here to reflect classes and types of cells that are reliably identifiable by the transcription of a single gene. In the Ca2+ and Neuropixels datasets, transcription of these genes drives the production of a Ca2+ indicator or a channelrhodopsin, respectively. The identification of some cell types requires evaluating the transcription of multiple genes; the Cre lines underlying these datasets cannot reliably discriminate these types. As a result, some of the excitatory Cre lines are not constrained to a single transcriptomic type. We excluded the least specific E types in this work.
LOLCAT
Our approach for cell type identification, LOLCAT, consists of three main components. We outline the architecture and rationale behind each of these components below.
Local feature extractor
We observe the response of each neuron to a series of stimulus presentations; each trial is 3 s long, but the total number of trials varies slightly across the dataset. We first process each trial individually before aggregating the information globally: for each trial, the inter-event interval (IEI) distribution is computed using D log-spaced bins.
The resulting D-dimensional input vector is fed to the local feature extractor, which is a multi-layer perceptron (MLP) equipped with batch normalization layers and rectified linear units.
The same local feature extractor is shared across trials, as it is tasked with extracting features that locally characterize the signature of a neuron’s activity. This design is directly influenced by the structure of our data.69 The reuse of the same feature extractor is known as weight sharing and has the advantage of artificially augmenting the number of samples that the local feature extractor is trained on, as it observes one 3-second trial at a time instead of observing the entire activity of the neuron at once. For the V1 k=16 task, the hidden dimensions of the encoder were (128, 64, 64, 32), and for the k=4 task, they were (32, 16, 16, 16). For the Neuropixels tasks, we performed a hyperparameter sweep on the validation set to determine the values of several hyperparameters and report the accuracy of the best validation hyperparameters on the test set.
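A minimal sketch of this trial-level pipeline is shown below, assuming log-spaced bin edges between 1 ms and 3 s (the exact edges, and the placement of normalization on the output layer, are assumptions):

```python
import numpy as np
import torch
import torch.nn as nn

D = 128  # number of log-spaced IEI histogram bins (value used for spiking datasets)

def iei_histogram(event_times_s, t_min=1e-3, t_max=3.0, n_bins=D):
    """Per-trial IEI histogram over log-spaced bins; bin edges are illustrative."""
    ieis = np.diff(np.sort(event_times_s))
    edges = np.logspace(np.log10(t_min), np.log10(t_max), n_bins + 1)
    counts, _ = np.histogram(ieis, bins=edges)
    return counts.astype(np.float32)

class LocalFeatureExtractor(nn.Module):
    """MLP with batch normalization and ReLUs, weight-shared across trials.

    Defaults follow the V1 k=16 configuration: hidden sizes (128, 64, 64, 32)
    and a 16-dimensional embedding per trial.
    """
    def __init__(self, d_in=D, hidden=(128, 64, 64, 32), d_out=16):
        super().__init__()
        layers, prev = [], d_in
        for h in (*hidden, d_out):
            layers += [nn.Linear(prev, h), nn.BatchNorm1d(h), nn.ReLU()]
            prev = h
        self.net = nn.Sequential(*layers)

    def forward(self, x):      # x: (n_trials, D) histograms for one neuron
        return self.net(x)     # (n_trials, d_out): one local feature vector per trial
```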
Multi-head attention module
We aggregate the extracted local features to produce a cell-level representation that describes the neuron’s global activity distribution. To allow the network to seek out or attend to specific trials, we use an attention network41,42 that generates an attention score for each trial and then uses it to weight the trial’s contribution to the final global feature vector.
This is simply a weighted sum of the trial-level local features. The use of the softmax operator ensures that the attention scores sum up to 1. All these design choices enable us to apply our model to a cell with any number of observed trials, as this pooling operation does not depend on a fixed number of trials.
During training, the model can learn to attend to all trials equally in order to extract information about, for example, the average firing rate, or it can learn to be highly selective and identify trials that are relevant for distinguishing between the cell types but whose informational content would otherwise be drowned out if all trials were weighted equally. A single attention head assigns a single scalar to each trial, and thus can only learn a single rule for how to attend to the set of trials. To allow the model to learn different attention patterns, we use multiple attention heads (4 or 8) in parallel, each of which can learn to be selective to different activity patterns and aggregate statistics over different scales. All the pooled feature vectors are concatenated to produce a final global feature vector, which can simultaneously include features describing the average statistics of the neuron’s activity and other features encoding the presence of trials that are characteristic of a particular cell type (or group).
In the attention-ablated control, we utilized average-pooling of encoder outputs. This is the equivalent of equally distributed attention across all trials.
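The following sketch illustrates the pooling mechanism; the single linear scoring layer per head is an assumption, and the closing comment shows the attention-ablated (average-pooling) control:

```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    """Multi-head attention pooling over a variable number of trials.

    A minimal sketch: each head scores every trial, the scores are
    softmax-normalized (so they sum to 1 across trials), and each head
    returns the weighted sum of trial features. Head outputs are
    concatenated into the global feature vector.
    """
    def __init__(self, d_feat=16, n_heads=4):
        super().__init__()
        self.score = nn.Linear(d_feat, n_heads)    # one scalar per head per trial

    def forward(self, h):                          # h: (n_trials, d_feat)
        w = torch.softmax(self.score(h), dim=0)    # (n_trials, n_heads); columns sum to 1
        pooled = w.T @ h                           # (n_heads, d_feat): weighted sums
        return pooled.reshape(-1)                  # concatenated global feature vector

# Attention-ablated control: replace the weighted sum with uniform average
# pooling, i.e., h.mean(dim=0), which weights all trials equally.
```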
Classifier
After concatenating all of the features from the different attention heads, we then pass this global representation to a final classification network, which uses the information aggregated at multiple scales, with different degrees of selectivity, to predict the cell type. We use an MLP with a single hidden layer.
Training
We initialize all of the model weights using a standard initialization proposed by He et al., 2015,70 except for the last layer of the classifier. To take into account the class imbalance in our datasets, we initialize the bias of the last layer of the classifier to be proportional to the class weight. This was found to speed up convergence.
We use the AdamW optimizer71 with learning rate η and weight decay 10−5. The learning rate is decayed at milestone epochs late in training. We also use network dropout with a rate of 0.5.
| Dataset | V1 k=4 | V1 k=16 | Ca2+ k=4 | Ca2+ k=8 |
|---|---|---|---|---|
| Number of bins | 128 | 128 | 90 | 90 |
| Embedding size | 16 | 16 | 16 | 16 |
| Hidden sizes | 32, 16, 16, 16 | 128, 64, 64, 32 | 32, 16, 16 | 64, 32, 32 |
| Number of heads | 4 | 4 | 4 | 8 |
| Learning rate (η) | 10−3 | 10−4 | 10−2 | 5 × 10−3 |
| Epochs | 100 | 100 | 100 | 300 |
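The sketch below illustrates this optimization setup in PyTorch; the decay factor (gamma) is an assumption, the milestone epochs are taken from the sweep grid below, and a trivial stand-in model and placeholder batches are used so the snippet runs on its own:

```python
import torch
import torch.nn as nn

# Stand-in model; in practice this is LOLCAT (encoder + attention + classifier).
model = nn.Sequential(nn.Linear(128, 32), nn.ReLU(), nn.Dropout(0.5), nn.Linear(32, 4))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-5)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[50, 100, 200], gamma=0.1  # decay late in training
)

for epoch in range(100):
    x, y = torch.randn(64, 128), torch.randint(0, 4, (64,))  # placeholder batch
    loss = nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```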
For the neuropixels tasks, we perform a hyperparameter sweep for each random seed. After an initial broad randomized search to determine the appropriate ranges for hyperparameter values on 5 splits, these smaller ranges were swept iteratively across each of the splits.
| Hyperparameter | Grid values |
|---|---|
| Number of histogram bins | 64, 128 |
| Embedding size | 16 |
| Hidden sizes | (64, 32, 16), (64, 32, 16, 8), (128, 64, 32), (128, 64, 32, 16) |
| Number of heads | 4 |
| Learning rate | 10−2, 10−3 |
| Batch size | 64, 128 |
| Trial dropout | 0.3, 0.45, 0.6 |
| Network dropout | 0.5, 0.7 |
| Weight decay | 10−5 |
| Milestones | (50, 100, 200), (100, 200) |
| Epochs | 300 |
| Sampler factor | 0.03, 0.1 |
Data augmentation
Adaptive trial dropout is parametrized by a dropout probability p. The probability of dropping a trial (removing it from the set of trials observed by LOLCAT) is proportional to p and inversely proportional to the neuron’s firing rate during that trial. In other words, if the trial is sparser than average, it is more likely to be dropped. This heuristic is inspired by similar heuristics designed for graph augmentations.72
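A minimal sketch of this augmentation, with the exact normalization of the drop probability left as an assumption:

```python
import numpy as np

def adaptive_trial_dropout(trial_rates_hz, p, rng=None):
    """Drop trials with probability proportional to p and inversely
    proportional to the trial's firing rate, so sparser-than-average
    trials are dropped more often. Normalizing by the mean rate is an
    assumption; returns a boolean mask of retained trials.
    """
    rng = rng or np.random.default_rng()
    rates = np.asarray(trial_rates_hz, dtype=float) + 1e-8   # avoid divide-by-zero
    drop_prob = np.clip(p * rates.mean() / rates, 0.0, 1.0)  # sparse trial -> higher prob
    return rng.random(len(rates)) >= drop_prob
```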
Dealing with class imbalance
Because of the class imbalance and variability across classes, we designed a strategy for reweighting different classes adaptively over training. To balance classes, i.e., to decide the number of cells to sample from each class per batch, we calculate two scores: (i) a score that quantifies overfitting on a given class and (ii) a score that quantifies how the model performs on a given class relative to the overall dataset. At the end of each training epoch, we calculate the loss terms across both the training and validation sets. First, we determine whether there is overfitting by calculating the difference between the average training and validation losses for a given class. If the difference is greater than one standard deviation, calculated over all training loss terms, then the sampling factor for that class is increased for the following epoch (typically by dividing it by 0.99). We also determine whether the model is over- or under-performing on a particular class by comparing the average per-class loss to the globally averaged loss, relative to the standard deviation. If this score is greater than a fixed threshold (0.1), then the sampling factor for that class is decreased (typically by dividing it by 1.01) and otherwise increased. Initially, the sampling factor for each class is set to balance the classes (majority class count / class count).
More formally, the overfitting score for class $i$ can be expressed as

$$o_i = \frac{1}{|\mathcal{V}_i|} \sum_{x \in \mathcal{V}_i} \ell(x) \;-\; \frac{1}{|\mathcal{T}_i|} \sum_{x \in \mathcal{T}_i} \ell(x),$$

where the first term measures the average classification loss over all samples in the $i$th class’s validation set $\mathcal{V}_i$ and the second term measures the average classification loss over all samples in the $i$th class’s training set $\mathcal{T}_i$. This score captures the gap in loss between validation and training: when the model overfits a class, its score is large.
The other measure we compute is how well a given class is predicted relative to cells in other classes:

$$r_i = \frac{\frac{1}{|\mathcal{V}_i|} \sum_{x \in \mathcal{V}_i} \ell(x) \;-\; \mu_{\mathcal{V}}}{\sigma_{\mathcal{V}}},$$

where the first term in the numerator is the class’s average loss on the validation set (as in the overfitting measure) and the mean $\mu_{\mathcal{V}}$ and standard deviation $\sigma_{\mathcal{V}}$ are computed over the entire validation set.
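Putting the two scores together, a hedged sketch of the epoch-end update (the per-class loss bookkeeping and the increment applied when a class is under-sampled are assumptions):

```python
import numpy as np

def update_sampling_factors(factors, train_loss, val_loss, threshold=0.1):
    """Epoch-end reweighting sketch based on the two scores above.

    train_loss / val_loss: dicts of per-sample losses grouped by class,
    e.g. {class_id: np.array([...])}.
    """
    all_train = np.concatenate(list(train_loss.values()))
    all_val = np.concatenate(list(val_loss.values()))
    for c in factors:
        overfit = val_loss[c].mean() - train_loss[c].mean()
        if overfit > all_train.std():            # class-specific overfitting
            factors[c] /= 0.99                   # increase sampling factor
        relative = (val_loss[c].mean() - all_val.mean()) / all_val.std()
        if relative > threshold:                 # class predicted worse than average
            factors[c] /= 1.01                   # decrease sampling factor
        else:
            factors[c] *= 1.01                   # assumed symmetric increase
    return factors
```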
Integrated gradients
The integrated gradients algorithm aims to explain the model’s predictions in terms of its input features, by identifying the features that were important in making the prediction possible.49 We use a slightly altered version of the algorithm: we use it to explain the relationship between the model’s outputs and the trial-level features. In other words, we identify the trials that were critical in the prediction of the cell type. The algorithm assigns an attribution score to each trial. A trial with high attribution means that the neuron exhibited an activity pattern that skewed the model towards correctly predicting a specific cell type. This means that if this same pattern were to be observed, it is highly likely that the neuron is from the corresponding cell type.
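A minimal sketch of the path-integral approximation, for a model that maps a set of trial-level features to class logits (the zero baseline and 50 integration steps are conventional defaults, not values taken from this paper):

```python
import torch

def integrated_gradients(model, x, target, baseline=None, steps=50):
    """Integrated gradients (Sundararajan et al., 2017) over trial features.

    x: (n_trials, d_feat) trial-level features for one neuron; model maps
    this set to a vector of class logits. Attributions are summed over the
    feature dimension to yield one score per trial.
    """
    if baseline is None:
        baseline = torch.zeros_like(x)
    total_grad = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (x - baseline)).detach().requires_grad_(True)
        logit = model(point)[target]           # scalar logit for the class of interest
        grad, = torch.autograd.grad(logit, point)
        total_grad += grad / steps             # Riemann approximation of the path integral
    return ((x - baseline) * total_grad).sum(dim=-1)   # one score per trial
```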
Clustering IEI distributions
For each cell type, we collect the inter-event interval distributions of the trials with the highest attribution across cells. Using the Wasserstein-2 (W2) distance as a measure of similarity between two distributions,73 we perform clustering and then compute the Wasserstein barycenter of each cluster. The Wasserstein barycenter is a different version of the centroid or average: instead of a standard coordinate-wise average, it is the center of mass in Wasserstein space, i.e., the distribution that minimizes the summed W2 distance to all samples in the cluster. We use this measure because IEI distributions are sparse and salient trials vary slightly, so a coordinate-wise average produces an overly smoothed version that obscures the shared structure among the IEIs in a cluster. Note that this analysis focuses only on the highly local information used by LOLCAT; the prediction is made after considering different pieces of information, aggregated across different scales.
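Because these are one-dimensional distributions, W2 reduces to the L2 distance between quantile functions, and the W2 barycenter is the distribution whose quantile function is the average of the members’ quantile functions. The sketch below exploits this; the clustering algorithm (Ward linkage) and the quantile grid are assumptions:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def quantile_fn(samples, qs):
    """Empirical quantile function (inverse CDF) evaluated on a grid."""
    return np.quantile(np.asarray(samples), qs)

def cluster_iei_trials(trial_ieis, n_clusters=3, n_q=100):
    """Cluster high-attribution trials by 1-D Wasserstein-2 distance.

    trial_ieis: list of per-trial IEI arrays (seconds). Returns cluster
    labels and per-cluster barycenters represented as quantile functions.
    """
    qs = (np.arange(n_q) + 0.5) / n_q
    Q = np.stack([quantile_fn(ieis, qs) for ieis in trial_ieis])  # (n_trials, n_q)
    dist = np.sqrt(((Q[:, None, :] - Q[None, :, :]) ** 2).mean(-1))  # pairwise W2
    Z = linkage(squareform(dist, checks=False), method="ward")
    labels = fcluster(Z, n_clusters, criterion="maxclust")
    barycenters = {c: Q[labels == c].mean(axis=0) for c in np.unique(labels)}
    return labels, barycenters
```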
Logistic regression model
For each neuron within a dataset, several summary statistics based on the event times during presentation of drifting gratings and/or natural movie presentation were used as features for a logistic regression implemented in scikit-learn.74 When possible, we elected to compute spiking statistics derived from interspike intervals rather than binned spike counts. For instance, we show the dynamic range of activity with the minimum ISI (that is, the maximum instantaneous firing rate) and maximum ISI (the minimum instantaneous firing rate).
Computed standard statistics included the minimum ISI, maximum ISI, median ISI, mode ISI (based on a binned distribution of ISIs < 3s, 128 bins for neuropixels and the bio-realistic model or 90 bins for 2P-imaging), standard deviation of ISIs, coefficient of variation for ISIs, and mean firing rate (inverse of mean ISI).
To capture oscillations, the power spectral density (PSD) of the event time series was computed from a periodogram using scipy.signal.75 The mean value of the PSD was calculated within established bands (<4 Hz, 4–8 Hz, 8–12 Hz, 12–40 Hz, 40–100 Hz).76 For 2P imaging, the two highest frequency bands were not used due to the sampling rate.
For a parametric description of the ISI distribution, a gamma distribution was fit (using scipy.stats) to the list of the neuron’s ISIs less than 3 s (the span of a trial/snippet), and the alpha and beta parameters were extracted.75 The probability density function of the gamma distribution is

$$f(x; \alpha, \beta) = \frac{\beta^{\alpha}\, x^{\alpha - 1}\, e^{-\beta x}}{\Gamma(\alpha)}, \qquad x > 0.$$
To capture more local aspects of spiking variability (e.g., bursting), CV2,35 LV,39 and LVR40 were calculated using Elephant.77 The following equations pertain to a sequence of N consecutive interspike intervals $I_1, \ldots, I_N$; for LVR, $R$ is the refractoriness constant, which was set to 5 ms (the default value):

$$\mathrm{CV}_2 = \frac{1}{N-1} \sum_{i=1}^{N-1} \frac{2\,|I_{i+1} - I_i|}{I_{i+1} + I_i}$$

$$\mathrm{LV} = \frac{1}{N-1} \sum_{i=1}^{N-1} \frac{3\,(I_i - I_{i+1})^2}{(I_i + I_{i+1})^2}$$

$$\mathrm{LvR} = \frac{3}{N-1} \sum_{i=1}^{N-1} \left(1 - \frac{4\, I_i I_{i+1}}{(I_i + I_{i+1})^2}\right) \left(1 + \frac{4R}{I_i + I_{i+1}}\right)$$
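For reference, these statistics can be implemented directly from the equations above (the paper used Elephant’s implementations; these numpy versions are illustrative):

```python
import numpy as np

def cv2(isi):
    """Local coefficient of variation (Holt et al., 1996)."""
    isi = np.asarray(isi, dtype=float)
    return np.mean(2.0 * np.abs(np.diff(isi)) / (isi[1:] + isi[:-1]))

def lv(isi):
    """Local variation (Shinomoto et al., 2003)."""
    isi = np.asarray(isi, dtype=float)
    return np.mean(3.0 * (isi[:-1] - isi[1:]) ** 2 / (isi[:-1] + isi[1:]) ** 2)

def lvr(isi, R=0.005):
    """Revised local variation (Shinomoto et al., 2009); R in seconds (5 ms default)."""
    isi = np.asarray(isi, dtype=float)
    s = isi[:-1] + isi[1:]
    return np.mean(3.0 * (1.0 - 4.0 * isi[:-1] * isi[1:] / s**2) * (1.0 + 4.0 * R / s))
```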
These basic statistics were used individually and collectively to attempt to predict class for each task. A logistic regression with an L2 penalty and a stopping-criterion tolerance of 1E-4 was employed using scikit-learn.74
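A condensed sketch of the feature assembly and classifier fit (covering a subset of the statistics above; the binarized event series used for the periodogram and the sampling rate are assumptions):

```python
import numpy as np
from scipy import signal, stats
from sklearn.linear_model import LogisticRegression

def summary_features(isis, event_times, fs=1000.0):
    """Assemble per-neuron summary statistics (a simplified subset)."""
    isis = np.asarray(isis, dtype=float)
    feats = [isis.min(), isis.max(), np.median(isis), isis.std(),
             isis.std() / isis.mean(), 1.0 / isis.mean()]   # CV and mean rate
    # Gamma fit to ISIs < 3 s; floc=0 pins the location parameter at zero.
    a, _, scale = stats.gamma.fit(isis[isis < 3.0], floc=0)
    feats += [a, 1.0 / scale]                                # alpha, beta (rate)
    # Mean PSD within conventional bands from a binarized event series.
    train = np.zeros(int(event_times.max() * fs) + 1)
    train[(event_times * fs).astype(int)] = 1.0
    f, pxx = signal.periodogram(train, fs=fs)
    for lo, hi in [(0, 4), (4, 8), (8, 12), (12, 40), (40, 100)]:
        feats.append(pxx[(f >= lo) & (f < hi)].mean())
    return np.array(feats)

# X: (n_neurons, n_features) matrix of summary features; y: class labels
# clf = LogisticRegression(penalty="l2", tol=1e-4).fit(X, y)
```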
Multilayer perceptron (MLP) model
For each neuron within a dataset, a binned histogram of its ISI distribution during blocks of drifting gratings and/or natural movie presentation was used as the feature vector for an MLP implemented in scikit-learn.74 The ISI distribution ranges from 0 to 3 s, the length of a snippet/trial. For the spiking datasets (bio-realistic model and Neuropixels), 128 evenly spaced bins were optimal. For the 2P-imaging datasets, IEIs were separated into 90 equally sized bins because of the reduced sampling rate of calcium. In each dataset, a Bayesian hyperparameter optimization (implemented with the HyperOpt python package) was used to select network dimensions, batch size, regularization, input scaling, activation function, learning rate, and class balancing (using RandomOverSampler and/or RandomUnderSampler from the imbalanced-learn python package) to maximize balanced accuracy on the validation set.78,79
| Hyperparameter | Possible values |
|---|---|
| Hidden sizes | (1028,), (512,), (256,), (128,), (64,), (32,), (1028,512,256,128,64,32), (512,256,128,64,32), (256,128,64,32), (128,64,32), (64,32,16), (32,16,8), (64,64,64), (32,32,32), (16,16,16) |
| Activation function | Identity, Logistic, Tanh, ReLU |
| Learning rate | 1E-7, 1E-6, 1E-5, 1E-4, 1E-3, 1E-2, 1E-1 |
| Batch size | 25:300:25 |
| L2 regularizer | 0, 1E-8, 1E-7, 1E-6, 1E-5, 1E-4, 1E-3, 1E-2, 1E-1, 1E0 |
| Scaling | Z-score, MinMax, None |
| Epochs | 100 |
The base grid of hyperparameter values explored in optimization for each task is shown above. 25:300:25 indicates all multiples of 25 up to 300 (inclusive). The ranges of these values varied slightly by task; for instance, larger network dimensions (hidden sizes) were explored for data from 2P imaging during natural movies. Importantly, the class-wise sample ratio was also optimized, exploring values from approximately 5% to 95% of the batch, supported by random oversampling and random undersampling.
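A minimal sketch of this search, pairing HyperOpt’s tree-structured Parzen estimator with scikit-learn’s MLPClassifier over a small subset of the grid above (placeholder data stand in for the ISI-histogram features):

```python
import numpy as np
from hyperopt import fmin, hp, tpe
from sklearn.metrics import balanced_accuracy_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X_train, y_train = rng.random((200, 128)), rng.integers(0, 4, 200)  # placeholder data
X_val, y_val = rng.random((50, 128)), rng.integers(0, 4, 50)

space = {
    "hidden": hp.choice("hidden", [(256, 128, 64, 32), (128, 64, 32), (64, 32, 16)]),
    "lr": hp.choice("lr", [1e-5, 1e-4, 1e-3, 1e-2]),
    "alpha": hp.choice("alpha", [0.0, 1e-5, 1e-3, 1e-1]),           # L2 regularizer
    "activation": hp.choice("activation", ["logistic", "tanh", "relu"]),
}

def objective(params):
    clf = MLPClassifier(hidden_layer_sizes=params["hidden"],
                        learning_rate_init=params["lr"], alpha=params["alpha"],
                        activation=params["activation"], max_iter=100)
    clf.fit(X_train, y_train)
    # hyperopt minimizes, so return the negative balanced accuracy
    return -balanced_accuracy_score(y_val, clf.predict(X_val))

best = fmin(objective, space, algo=tpe.suggest, max_evals=50)
```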
QUANTIFICATION AND STATISTICAL ANALYSES
Statistical analyses are described in the main text, methods, and figure legends. We trained and tested on 20 independent random splits of each dataset (n = 20). Neurons were pooled across animals and divided into non-overlapping training, validation, and test sets (60% training, 20% validation, 20% testing). The standard error of the mean (SEM) is used to quantify error in balanced accuracy across replicates. To evaluate the significance of differences in balanced accuracy across datasets or models, we applied an ANOVA followed by a post hoc Tukey HSD test. To evaluate the significance of the re-ordering of feature importances, we used the Spearman rank correlation. A p-value below 0.05 is considered significant, consistent with the conventions of the field.
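For concreteness, the following sketch reproduces this testing pipeline on placeholder accuracy values (the synthetic numbers are illustrative only):

```python
import numpy as np
from scipy.stats import f_oneway, spearmanr
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical balanced accuracies for three models over n = 20 splits.
rng = np.random.default_rng(0)
acc_by_model = {m: rng.normal(mu, 0.02, 20)
                for m, mu in [("LogReg", 0.55), ("MLP", 0.62), ("LOLCAT", 0.70)]}

F, p = f_oneway(*acc_by_model.values())                  # omnibus ANOVA
scores = np.concatenate(list(acc_by_model.values()))
groups = np.repeat(list(acc_by_model.keys()), 20)
print(pairwise_tukeyhsd(scores, groups, alpha=0.05))     # post hoc Tukey HSD

# Re-ordering of feature importances between two conditions:
rho, p_rank = spearmanr(rng.random(12), rng.random(12))  # placeholder rankings
```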
Supplementary Material
Highlights.
A neuron’s genetic cell type can be recovered from a time series of its activity
A deep-learning architecture learns activity patterns to differentiate cell types
Projection type and cortical depth can help to classify excitatory subtypes more accurately
Visual cortical cells show type-specific activity for structured and naturalistic stimuli
ACKNOWLEDGMENTS
We would like to thank Drs. Saskia de Vries and Josh Siegle (Allen Brain Institute) for helpful discussions. This project was supported by NIH BRAIN Initiative awards 1R01NS118442 (K.B.H.) and 1R01EB029852 (E.L.D. and K.B.H.), as well as by NSF award IIS-2039741 (E.L.D.) and generous gifts from the Alfred P. Sloan Foundation (E.L.D.), the McKnight Foundation (E.L.D.), the CIFAR Azrieli Global Scholars Program (E.L.D.), and the Incubator for Transdisciplinary Futures, an Arts & Sciences Signature Initiative at Washington University in St. Louis (K.B.H.).
Footnotes
DECLARATION OF INTERESTS
The authors declare no competing interests.
SUPPLEMENTAL INFORMATION
Supplemental information can be found online at https://doi.org/10.1016/j.celrep.2023.112318.
REFERENCES
- 1. Scala F, Kobak D, Bernabucci M, Bernaerts Y, Cadwell CR, Castro JR, Hartmanis L, Jiang X, Laturnus S, Miranda E, and Tolias AS (2021). Phenotypic variation of transcriptomic cell types in mouse motor cortex. Nature 598, 144–150. 10.1038/s41586-020-2907-3.
- 2. BRAIN Initiative Cell Census Network BICCN (2021). A multimodal cell census and atlas of the mammalian primary motor cortex. Nature 598, 86–102. 10.1038/s41586-021-03950-0.
- 3. Ecker JR, Geschwind DH, Kriegstein AR, Ngai J, Osten P, Polioudakis D, Regev A, Sestan N, Wickersham IR, and Zeng H (2017). The BRAIN Initiative Cell Census Consortium: lessons learned toward generating a comprehensive brain cell atlas. Neuron 96, 542–557. 10.1016/j.neuron.2017.10.007.
- 4. Yuste R, Hawrylycz M, Aalling N, Aguilar-Valles A, Arendt D, Armañanzas R, Ascoli GA, Bielza C, Bokharaie V, Bergmann TB, and Lein E (2020). A community-based transcriptomics classification and nomenclature of neocortical cell types. Nat. Neurosci. 23, 1456–1468. 10.1038/s41593-020-0685-8.
- 5. Ramón y Cajal S (1892). Rev. Ciencias Méd. Barcelona 18, 361–376, 457–476, 505–520, 529–541.
- 6. Lein E, Borm LE, and Linnarsson S (2017). The promise of spatial transcriptomics for neuroscience in the era of molecular cell typing. Science 358, 64–69. 10.1126/science.aan6827.
- 7. Tasic B, Menon V, Nguyen TN, Kim TK, Jarsky T, Yao Z, Levi B, Gray LT, Sorensen SA, Dolbeare T, and Zeng H (2016). Adult mouse cortical cell taxonomy revealed by single cell transcriptomics. Nat. Neurosci. 19, 335–346. 10.1038/nn.4216.
- 8. Nowak LG, Azouz R, Sanchez-Vives MV, Gray CM, and McCormick DA (2003). Electrophysiological classes of cat primary visual cortical neurons in vivo as revealed by quantitative analyses. J. Neurophysiol. 89, 1541–1566. 10.1152/jn.00580.2002.
- 9. Rainnie DG, Mania I, Mascagni F, and McDonald AJ (2006). Physiological and morphological characterization of parvalbumin-containing interneurons of the rat basolateral amygdala. J. Comp. Neurol. 498, 142–161. 10.1002/cne.21049.
- 10. Gouwens NW, Sorensen SA, Baftizadeh F, Budzillo A, Lee BR, Jarsky T, Alfiler L, Baker K, Barkan E, Berry K, and Zeng H (2020). Integrated morphoelectric and transcriptomic classification of cortical GABAergic cells. Cell 183, 935–953.e19. 10.1016/j.cell.2020.09.057.
- 11. Lazarevich I, Prokin I, Gutkin B, and Kazantsev V (2023). Spikebench: an open benchmark for spike train time-series classification. PLoS Comput. Biol. 19, e1010792. 10.1371/journal.pcbi.1010792.
- 12. Segundo JP, Moore GP, Stensaas LJ, and Bullock TH (1963). Sensitivity of neurones in Aplysia to temporal pattern of arriving impulses. J. Exp. Biol. 40, 643–667. 10.1242/jeb.40.4.643.
- 13. Softky WR, and Koch C (1993). The highly irregular firing of cortical cells is inconsistent with temporal integration of random EPSPs. J. Neurosci. 13, 334–350. 10.1523/JNEUROSCI.13-01-00334.1993.
- 14. Mainen ZF, and Sejnowski TJ (1995). Reliability of spike timing in neocortical neurons. Science 268, 1503–1506. 10.1126/science.7770778.
- 15. Stevens CF, and Zador AM (1998). Input synchrony and the irregular firing of cortical neurons. Nat. Neurosci. 1, 210–217. 10.1038/659.
- 16. Nolte M, Reimann MW, King JG, Markram H, and Muller EB (2019). Cortical reliability amid noise and chaos. Nat. Commun. 10, 3792–3815. 10.1038/s41467-019-11633-8.
- 17. Seeman SC, Campagnola L, Davoudian PA, Hoggarth A, Hage TA, Bosma-Moody A, Baker CA, Lee JH, Mihalas S, Teeter C, and Jarsky T (2018). Sparse recurrent excitatory connectivity in the microcircuit of the adult mouse and human cortex. Elife 7, e37349. 10.7554/eLife.37349.
- 18. Peron S, Pancholi R, Voelcker B, Wittenbach JD, Ólafsdóttir HF, Freeman J, and Svoboda K (2020). Recurrent interactions in local cortical circuits. Nature 579, 256–259. 10.1038/s41586-020-2062-x.
- 19. Barthó P, Hirase H, Monconduit L, Zugaro M, Harris KD, and Buzsáki G (2004). Characterization of neocortical principal cells and interneurons by network interactions and extracellular features. J. Neurophysiol. 92, 600–608. 10.1152/jn.01170.2003.
- 20. Cardin JA, Palmer LA, and Contreras D (2007). Stimulus feature selectivity in excitatory and inhibitory neurons in primary visual cortex. J. Neurosci. 27, 10333–10344. 10.1523/JNEUROSCI.1692-07.2007.
- 21. Royer S, Zemelman BV, Losonczy A, Kim J, Chance F, Magee JC, and Buzsáki G (2012). Control of timing, rate and bursts of hippocampal place cells by dendritic and somatic inhibition. Nat. Neurosci. 15, 769–775. 10.1038/nn.3077.
- 22. Buzsáki G, and Mizuseki K (2014). The log-dynamic brain: how skewed distributions affect network operations. Nat. Rev. Neurosci. 15, 264–278. 10.1038/nrn3687.
- 23. Lima SQ, Hromádka T, Znamenskiy P, and Zador AM (2009). PINP: a new method of tagging neuronal populations for identification during in vivo electrophysiological recording. PLoS One 4, e6099. 10.1371/journal.pone.0006099.
- 24. Ding L, Balsamo G, Chen H, Blanco-Hernandez E, Zouridis IS, Naumann R, Preston-Ferrer P, and Burgalossi A (2022). Juxtacellular optotagging of hippocampal CA1 neurons in freely moving mice. Elife 11, e71720. 10.7554/eLife.71720.
- 25. Niell CM, and Stryker MP (2008). Highly selective receptive fields in mouse visual cortex. J. Neurosci. 28, 7520–7536. 10.1523/JNEUROSCI.0623-08.2008.
- 26. Jia X, Siegle JH, Bennett C, Gale SD, Denman DJ, Koch C, and Olsen SR (2019). High-density extracellular probes reveal dendritic backpropagation and facilitate neuron classification. J. Neurophysiol. 121, 1831–1847. 10.1152/jn.00680.2018.
- 27. Trainito C, von Nicolai C, Miller EK, and Siegel M (2019). Extracellular spike waveform dissociates four functionally distinct cell classes in primate cortex. Curr. Biol. 29, 2973–2982.e5. 10.1016/j.cub.2019.07.051.
- 28. Bugeon S, Duffield J, Dipoppa M, Ritoux A, Prankerd I, Nicoloutsopoulos D, Orme D, Shinn M, Peng H, Forrest H, and Harris KD (2022). A transcriptomic axis predicts state modulation of cortical interneurons. Nature 607, 330–338. 10.1038/s41586-022-04915-7.
- 29. Siegle JH, Jia X, Durand S, Gale S, Bennett C, Graddis N, Heller G, Ramirez TK, Choi H, Luviano JA, and Koch C (2021). Survey of spiking in the mouse visual system reveals functional hierarchy. Nature 592, 86–92. 10.1038/s41586-020-03171-x.
- 30. de Vries SEJ, Lecoq JA, Buice MA, Groblewski PA, Ocker GK, Oliver M, Feng D, Cain N, Ledochowitsch P, Millman D, and Koch C (2020). A large-scale standardized physiological survey reveals functional organization of the mouse visual cortex. Nat. Neurosci. 23, 138–151. 10.1038/s41593-019-0550-9.
- 31. Billeh YN, Cai B, Gratiy SL, Dai K, Iyer R, Gouwens NW, AbbasiAsl R, Jia X, Siegle JH, Olsen SR, and Arkhipov A (2020). Systematic integration of structural and functional data into multi-scale models of mouse primary visual cortex. Neuron 106, 388–403.e18. 10.1016/j.neuron.2020.01.040.
- 32. Brodersen KH, Ong CS, Stephan KE, and Buhmann JM (2010). The balanced accuracy and its posterior distribution. In 2010 20th International Conference on Pattern Recognition (IEEE), pp. 3121–3124. 10.1109/ICPR.2010.764.
- 33. Townsend JT (1971). Theoretical analysis of an alphabetic confusion matrix. Percept. Psychophys. 9, 40–50. 10.3758/BF03213026.
- 34. Bair W, Koch C, Newsome W, and Britten K (1994). Power spectrum analysis of bursting cells in area MT in the behaving monkey. J. Neurosci. 14, 2870–2892. 10.1523/JNEUROSCI.14-05-02870.1994.
- 35. Holt GR, Softky WR, Koch C, and Douglas RJ (1996). Comparison of discharge variability in vitro and in vivo in cat visual cortex neurons. J. Neurophysiol. 75, 1806–1814. 10.1152/jn.1996.75.5.1806.
- 36. McCormick DA, Connors BW, Lighthall JW, and Prince DA (1985). Comparative electrophysiology of pyramidal and sparsely spiny stellate neurons of the neocortex. J. Neurophysiol. 54, 782–806. 10.1152/jn.1985.54.4.782.
- 37. Kuffler SW, Fitzhugh R, and Barlow HB (1957). Maintained activity in the cat’s retina in light and darkness. J. Gen. Physiol. 40, 683–702. 10.1085/jgp.40.5.683.
- 38. Li M, Xie K, Kuang H, Liu J, Wang D, Fox GE, Wei W, Li X, Li Y, Zhao F, and Tsien JZ (2018). Spike-timing pattern operates as gamma-distribution across cell types, regions and animal species and is essential for naturally-occurring cognitive states. Preprint at bioRxiv, 145813. 10.1101/145813.
- 39. Shinomoto S, Shima K, and Tanji J (2003). Differences in spiking patterns among cortical neurons. Neural Comput. 15, 2823–2842. 10.1162/089976603322518759.
- 40. Shinomoto S, Kim H, Shimokawa T, Matsuno N, Funahashi S, Shima K, Fujita I, Tamura H, Doi T, Kawano K, and Toyama K (2009). Relating neuronal firing patterns to functional differentiation of cerebral cortex. PLoS Comput. Biol. 5, e1000433. 10.1371/journal.pcbi.1000433.
- 41. Bahdanau D, Cho K, and Bengio Y (2014). Neural machine translation by jointly learning to align and translate. Preprint at arXiv. 10.48550/arXiv.1409.0473.
- 42. Li Y, Zemel R, Brockschmidt M, and Tarlow D (2016). Gated graph sequence neural networks. In Proceedings of ICLR’16. 10.48550/arXiv.1511.05493.
- 43. Gorski JA, Talley T, Qiu M, Puelles L, Rubenstein JLR, and Jones KR (2002). Cortical excitatory neurons and glia, but not GABAergic neurons, are produced in the Emx1-expressing lineage. J. Neurosci. 22, 6309–6314. 10.1523/JNEUROSCI.22-15-06309.2002.
- 44. Daigle TL, Madisen L, Hage TA, Valley MT, Knoblich U, Larsen RS, Takeno MM, Huang L, Gu H, Larsen R, and Zeng H (2018). A suite of transgenic driver and reporter mouse lines with enhanced brain-cell-type targeting and functionality. Cell 174, 465–480.e22. 10.1016/j.cell.2018.06.035.
- 45. Harris JA, Mihalas S, Hirokawa KE, Whitesell JD, Choi H, Bernard A, Bohn P, Caldejon S, Casal L, Cho A, and Zeng H (2019). Hierarchical organization of cortical and thalamic connectivity. Nature 575, 195–202. 10.1038/s41586-019-1716-z.
- 46. Gil-Sanz C, Espinosa A, Fregoso SP, Bluske KK, Cunningham CL, Martinez-Garay I, Zeng H, Franco SJ, and Müller U (2015). Lineage tracing using Cux2-cre and Cux2-CreERT2 mice. Neuron 86, 1091–1099. 10.1016/j.neuron.2015.04.019.
- 47. Matho KS, Huilgol D, Galbavy W, He M, Kim G, An X, Lu J, Wu P, Di Bella DJ, Shetty AS, and Huang ZJ (2021). Genetic dissection of the glutamatergic neuron system in cerebral cortex. Nature 598, 182–187. 10.1038/s41586-021-03955-9.
- 48. McInnes L, Healy J, and Melville J (2018). UMAP: uniform manifold approximation and projection for dimension reduction. Preprint at arXiv. 10.48550/arXiv.1802.03426.
- 49. Sundararajan M, Taly A, and Yan Q (2017). Axiomatic attribution for deep networks. In International Conference on Machine Learning (PMLR), pp. 3319–3328.
- 50. David SV, Vinje WE, and Gallant JL (2004). Natural stimulus statistics alter the receptive field structure of V1 neurons. J. Neurosci. 24, 6991–7006. 10.1523/JNEUROSCI.1422-04.2004.
- 51. Talebi V, and Baker CL (2012). Natural versus synthetic stimuli for estimating receptive field models: a comparison of predictive robustness. J. Neurosci. 32, 1560–1576. 10.1523/JNEUROSCI.4661-12.2012.
- 52. Yeh CI, Xing D, Williams PE, and Shapley RM (2009). Stimulus ensemble and cortical layer determine V1 spatial receptive fields. Proc. Natl. Acad. Sci. USA 106, 14652–14657. 10.1073/pnas.0907406106.
- 53. Harris KD, and Shepherd GMG (2015). The neocortical circuit: themes and variations. Nat. Neurosci. 18, 170–181. 10.1038/nn.3917.
- 54. Mukamel EA, and Ngai J (2019). Perspectives on defining cell types in the brain. Curr. Opin. Neurobiol. 56, 61–68. 10.1016/j.conb.2018.11.007.
- 55. Rudy B, Fishell G, Lee S, and Hjerling-Leffler J (2011). Three groups of interneurons account for nearly 100% of neocortical GABAergic neurons. Dev. Neurobiol. 71, 45–61. 10.1002/dneu.20853.
- 56. Kepecs A, and Fishell G (2014). Interneuron cell types are fit to function. Nature 505, 318–326. 10.1038/nature12983.
- 57. Tremblay R, Lee S, and Rudy B (2016). GABAergic interneurons in the neocortex: from cellular properties to circuits. Neuron 91, 260–292. 10.1016/j.neuron.2016.06.033.
- 58. Zeisel A, Muñoz-Manchado AB, Codeluppi S, Lönnerberg P, La Manno G, Juréus A, Marques S, Munguba H, He L, Betsholtz C, and Linnarsson S (2015). Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347, 1138–1142. 10.1126/science.aaa1934.
- 59. Crockett T, Wright N, Thornquist S, Ariel M, and Wessel R (2015). Turtle dorsal cortex pyramidal neurons comprise two distinct cell types with indistinguishable visual responses. PLoS One 10, e0144012. 10.1371/journal.pone.0144012.
- 60. Ma Z, Turrigiano GG, Wessel R, and Hengen KB (2019). Cortical circuit dynamics are homeostatically tuned to criticality in vivo. Neuron 104, 655–664.e4. 10.1016/j.neuron.2019.08.031.
- 61. Shadlen MN, and Newsome WT (1994). Noise, neural codes and cortical organization. Curr. Opin. Neurobiol. 4, 569–579. 10.1016/0959-4388(94)90059-0.
- 62. Shadlen MN, and Newsome WT (1998). The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. J. Neurosci. 18, 3870–3896. 10.1523/JNEUROSCI.18-10-03870.1998.
- 63. Van Vreeswijk C, and Sompolinsky H (1996). Chaos in neuronal networks with balanced excitatory and inhibitory activity. Science 274, 1724–1726. 10.1126/science.274.5293.1724.
- 64. Millman DJ, Ocker GK, Caldejon S, Kato I, Larkin JD, Lee EK, Luviano J, Nayan C, Nguyen TV, North K, et al. (2020). VIP interneurons in mouse primary visual cortex selectively enhance responses to weak but specific stimuli. Elife 9, e55130. 10.7554/eLife.55130.
- 65. Liu R, Azabou M, Dabagia M, Lin CH, Azar MG, Hengen KB, Valko M, and Dyer E (2021). Drop, swap, and generate: a self-supervised approach for generating neural activity. Adv. Neural Inf. Process. Syst. 34, 10587–10599. 10.48550/arXiv.2111.02338.
- 66. Gütig R, and Sompolinsky H (2006). The tempotron: a neuron that learns spike timing–based decisions. Nat. Neurosci. 9, 420–428. 10.1038/nn1643.
- 67. Payeur A, Guerguiev J, Zenke F, Richards BA, and Naud R (2021). Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits. Nat. Neurosci. 24, 1010–1019. 10.1038/s41593-021-00857-x.
- 68. Richards BA, Lillicrap TP, Beaudoin P, Bengio Y, Bogacz R, Christensen A, Clopath C, Costa RP, de Berker A, Ganguli S, and Kording KP (2019). A deep learning framework for neuroscience. Nat. Neurosci. 22, 1761–1770. 10.1038/s41593-019-0520-2.
- 69. Battaglia PW, Hamrick JB, Bapst V, Sanchez-Gonzalez A, Zambaldi V, Malinowski M, and Pascanu R (2018). Relational inductive biases, deep learning, and graph networks. Preprint at arXiv. 10.48550/arXiv.1806.01261.
- 70. He K, Zhang X, Ren S, and Sun J (2015). Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034.
- 71. Loshchilov I, and Hutter F (2017). Decoupled weight decay regularization. Preprint at arXiv. 10.48550/arXiv.1711.05101.
- 72. Zhu Y, Xu Y, Yu F, Liu Q, Wu S, and Wang L (2021). Graph contrastive learning with adaptive augmentation. In Proceedings of the Web Conference 2021, pp. 2069–2080. 10.1145/3442381.3449802.
- 73. Cuturi M (2013). Sinkhorn distances: lightspeed computation of optimal transport. Adv. Neural Inf. Process. Syst. 26. 10.48550/arXiv.1306.0895.
- 74. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, and Duchesnay E (2011). Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830.
- 75. Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J, et al. (2020). SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272. 10.1038/s41592-019-0686-2.
- 76. Buzsáki G, and Draguhn A (2004). Neuronal oscillations in cortical networks. Science 304, 1926–1929. 10.1126/science.1099745.
- 77. Denker M, Yegenoglu A, and Grün S (2018). Collaborative HPC-enabled workflows on the HBP Collaboratory using the Elephant framework. In Computational and Systems Neuroscience, Neuroinformatics 2018, Montreal (Canada).
- 78. Bergstra J, Yamins D, and Cox D (2013). Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures. In International Conference on Machine Learning (PMLR), pp. 115–123.
- 79. Lemaître G, Nogueira F, and Aridas CK (2017). Imbalanced-learn: a python toolbox to tackle the curse of imbalanced datasets in machine learning. J. Mach. Learn. Res. 18, 559–563.