Abstract
High-density microelectrode arrays can be used to record extracellular action potentials from hundreds to thousands of neurons simultaneously. Efficient spike sorters must be developed to cope with such large data volumes. Most existing spike sorting methods for single electrodes or small multielectrodes, however, suffer from the “curse of dimensionality” and cannot be directly applied to recordings with hundreds of electrodes. This holds particularly true for the standard reference spike sorting algorithm, principal component analysis-based feature extraction, followed by k-means or expectation maximization clustering, against which most spike sorters are evaluated. We present a spike sorting algorithm that circumvents the dimensionality problem by sorting local groups of electrodes independently with classical spike sorting approaches. It is scalable to any number of recording electrodes and well suited for parallel computing. The combination of data prewhitening before the principal component analysis-based extraction and a parameter-free clustering algorithm obviated the need for parameter adjustments. We evaluated its performance using surrogate data in which we systematically varied spike amplitudes and spike rates and that were generated by inserting template spikes into the voltage traces of real recordings. In a direct comparison, our algorithm could compete with existing state-of-the-art spike sorters in terms of sensitivity and precision, while parameter adjustment or manual cluster curation was not required.
NEW & NOTEWORTHY We present an automatic spike sorting algorithm that combines three strategies to scale classical spike sorting techniques for high-density microelectrode arrays: 1) splitting the recording electrodes into small groups and sorting them independently; 2) clustering a subset of spikes and classifying the rest to limit computation time; and 3) prewhitening the spike waveforms to enable the use of parameter-free clustering. Finally, we combined these strategies into an automatic spike sorter that is competitive with state-of-the-art spike sorters.
Keywords: HD-MEA surrogate data, high-density microelectrode array, prewhitening, spike sorting
INTRODUCTION
High-density microelectrode arrays (HD-MEAs) are important tools in electrophysiology research and are used to simultaneously record the electrical activity of large numbers of neurons. Recent advances in complementary metal oxide-semiconductor (CMOS) technology have increased the number of simultaneously active recording electrodes from a few hundred to several thousand per chip (Berdondini et al. 2009; Bertotti et al. 2014; Frey et al. 2010; Johnson et al. 2012; Müller et al. 2013; Obien et al. 2017; Viswam et al. 2016). At the same time, the center-to-center electrode distances (pitch) have decreased significantly to a point where the electrode density comes close to or even exceeds the density of neurons in certain tissues, for example, the density of ganglion cells in the murine retina (Fiscella et al. 2012). Each electrode records the activity of all neurons in its vicinity, and because of the high electrode density each neuron is usually recorded on many electrodes. Thus the recorded multielectrode activity can be thought of as the sum of mixed neuronal signals and correlated noise. We do not know a priori the number of active neurons or the exact time at which an action potential was fired. The process of estimating the number of recorded neurons and assigning each detected action potential to one of those neurons is called “spike sorting.”
Various spike sorting methods have been described previously, and many constitute a variant of the following approach (Lewicki 1998): First, spike events are detected through thresholding, and the spike waveforms are extracted from the recordings (throughout this report we assume that the recorded signals have been band-pass filtered before spike detection is performed). Next, a feature extraction, such as principal component analysis, reduces the dimensionality of the cut-out spike waveforms. Finally, a clustering technique groups the spikes and assigns them to putative neurons based on assumptions about the distribution of spikes of the same neuron in the feature space (Einevoll et al. 2012; Jun et al. 2017; Lewicki 1998; Schmidt 1984).
This approach represents the standard reference method against which most other spike sorting algorithms are evaluated; however, it cannot be scaled to HD-MEA data in a straightforward way: When using a thousand parallel readout electrodes at a typical sampling rate of 20 kHz, even 1-ms-short spike waveforms would span a vector space with 20,000 dimensions. Having that many dimensions makes feature extraction and clustering extremely susceptible to even small quantities of noise because of a phenomenon known as the “curse of dimensionality” (Bishop 2007). Furthermore, temporal spike overlaps, i.e., spikes of different neurons that occur at the same time, are difficult to resolve and become more abundant with increasing numbers of recorded neurons (Dragas et al. 2015; Franke et al. 2012; Pillow et al. 2013).
On the other hand, HD-MEAs have the advantage that spikes of each neuron are simultaneously detected on many electrodes, which provides spatial information in addition to the characteristic temporal spike waveforms of each neuron. This combined information increases the discriminability of neurons and has been used in various spike sorting algorithms (Franke et al. 2015a; Hill et al. 2011; Jäckel et al. 2012; Lambacher et al. 2011; Marre et al. 2012; Muthmann et al. 2015; Prentice et al. 2011; Swindale and Spacek 2014).
In this article, we present and discuss three strategies with which the previously sketched standard reference method could be scaled up for HD-MEA data: The first strategy involved splitting the overall set of recording electrodes into smaller local electrode groups (LEGs) and treating each of them independently as a classical spike sorting problem. Yger et al. (2018) used a similar approach with one group around each electrode. At the end, we combined the results from each LEG into a final, global sorting result. This strategy helped to circumvent the “curse of dimensionality” and, at the same time, helped to eliminate the problem of temporal spike overlaps between spatially disjoint neurons (we do not address spatiotemporal overlaps within a single LEG in this report). Another advantage was that all LEGs could be processed in parallel, which made the overall processing duration effectively independent of the number of recording electrodes. We split the recording electrodes into LEGs in a way that each neuron would be well sorted in at least one LEG, while keeping the total number of LEGs low and thus minimizing the number of detections of the same neuron in multiple LEGs.
The second strategy involved performing feature selection and clustering only on a subset of spikes, which limited the computational cost of these steps and rendered them independent of the total number of spikes. In a subsequent step, we classified all spikes by template matching. During template matching, we compared the voltage traces around each detected spike with characteristic waveforms of putative neurons and matched them to the neuron with the most similar waveform (Franke et al. 2015b; Lewicki 1998). This strategy has been employed by several other spike sorters previously (Franke et al. 2015b; Marre et al. 2012; Rutishauser et al. 2006).
The third strategy included the use of prewhitening before feature extraction, followed by nonparametric clustering. We describe how prewhitening increased cluster separation and made cluster shapes and sizes predictable, which allowed us to use parameter-free clustering by means of the mean-shift algorithm.
Finally, we developed an algorithm that incorporates these three strategies, which can be easily parallelized, is scalable to thousands of electrodes, and relies on a clustering algorithm that does not require any parameter adjustment. We evaluated this algorithm against newly generated HD-MEA benchmark data sets with up to 1,000 electrodes, and we compared it to five other spike sorters using publicly available data sets. All of our evaluations were done without manual curation of the sorting results, and all parameters were fixed before we ran the evaluation on the benchmarks, i.e., they were identical for all evaluations and not fitted to individual data sets to improve the performance of the algorithm.
It remains difficult to obtain good benchmarking data sets to evaluate the performance of HD-MEA spike sorting algorithms. Patch-clamp recordings in combination with MEA recordings can be used to provide a ground truth, but currently only a handful of neurons can be patched at the same time (Fournier et al. 2016; Franke et al. 2015a). Simulated data may provide a surrogate ground truth for all neurons in a given data set (Einevoll et al. 2012; Hagen et al. 2015; Jäckel et al. 2012), but the generation of useful data sets requires a well-characterized electrophysiological model for each experimental condition.
To generate our benchmarking data sets, we used an established method to create surrogate ground-truth data by adding typical spike waveforms at random time points into recorded data (Pachitariu et al. 2016; Rossant et al. 2016). We adapted this method for HD-MEA data to obtain a simple and robust tool with which we assessed the performance of our spike sorter.
METHODS
Tissue Preparation
Syrian hamsters (Mesocricetus auratus; Janvier Labs) were anesthetized and killed under protocols that were approved by the Basel City Veterinary Office in accordance with Swiss federal laws on animal welfare. Each hamster was kept in darkness for 10 min, anesthetized (Telazol 30 mg/kg, xylazine 10 mg/kg), and decapitated. Retinas from both eyes were immediately removed under dim red light and immersed in Ames medium (8.8 g/l, supplemented with 1.9 g/l sodium bicarbonate; Sigma-Aldrich Chemie, Buchs, Switzerland), which was perfused with Oxycarbon at room temperature (PanGas, Dagmersellen, Switzerland) for at least 30 min before the optical stimulus sequence was started. The retina patch was placed ganglion cell side down on the electrode array and perfused with Ames medium (pH 7.4, 36°C) equilibrated with 5%CO2-95%O2. Retinal ganglion cell extracellular activity was recorded for 7–8 h.
Light Stimulation
For retinal light stimulation, we used the Acer K10 light projector (60-Hz refresh rate) in a previously developed and described setup in which extracellular electrophysiological measurements under light projections can be performed on a microscope stage [more details about the microscope and the optics for focusing the light stimulus on the retina can be found in Fiscella et al. (2012)]. We simultaneously used blue (460 ± 15 nm) and green (523 ± 23 nm) projector LEDs for stimulating retinal photoreceptors with moving, flashing squares and moving bars. We did not analyze the light responses here, as the recordings were used only for the creation of surrogate data.
MEA Recordings
We used a high-density CMOS MEA (Müller et al. 2015) with a total of 26,400 electrodes at a density of 3,300 electrodes/mm2. The electrodes have a size of 9.3 × 5.45 µm2, and their center-to-center pitch is 17.5 µm. The MEA featured 1,024 readout channels with a 10-bit, 20-kHz analog-to-digital converter each and a noise level of 2.4 µVrms in the action potential band (300 Hz to 10 kHz). A reprogrammable switch matrix allowed us to connect two arbitrarily located, disjoint blocks of 23 × 23 adjacent electrodes to the readout channels with only a few electrodes per block that remained unconnected. We created these high-density blocks so that they were translation symmetric, which means that the unconnected electrodes were located at the same relative positions within each block. This was important for the generation of the surrogate data, as can be seen below. We produced six different recordings with a duration of ~20 min each. The signals were digitally band-pass filtered between 300 and 7,000 Hz before the spike sorting started.
RESULTS
Spike Sorting Algorithm
The spike sorting process consisted of three steps as depicted in Fig. 1. In the first step, we divided the global set of electrodes into LEGs. In the second step, we performed spike sorting on each group independently, and we processed all groups in parallel. In the third step, we identified and resolved duplicates of neuronal units, i.e., neurons that were detected in multiple LEGs. The spike sorting process within each LEG started with spike detection and the extraction of the waveforms for each spike. To limit computational complexity, we selected a random subset of Nspikes spikes within each LEG and performed waveform alignment, feature extraction, and clustering only on this subset. The resulting cluster centers were then used to classify the entire set of spikes with template matching. Finally, we merged clusters based on the similarity of their mean waveforms.
Fig. 1.
A: block diagram of the spike sorting algorithm. Step 1: the global set of electrodes was subdivided into local electrode groups (LEGs). Step 2: each LEG was processed independently and in parallel with the other LEGs. The parallel processes consisted of 6 steps indicated at bottom. Step 3: finally, we resolved duplicate units, which were neurons that were found in >1 LEG. B: shape and size of the multielectrode waveform (3.75-ms cutout) of a typical retinal ganglion cell on the high-density microelectrode array. r, Radius. C: enlarged view of the waveform center with a schematic of 3 LEGs.
We semantically distinguish here between neurons as biological entities in the experiment and neuronal units as the results of the spike sorting process. In an optimal case, each neuronal unit corresponds to one neuron and reproduces all of its action potentials. The values of all parameters that were used and are not specified separately are listed in Table 1.
Table 1.
Values of parameters used in this evaluation
| Parameter | Value | Remark |
|---|---|---|
| θel | 4.2σn | Spike-detection threshold |
| Δtdist | 1.25 ms | Maximal temporal distance between threshold crossing and spike peak |
| Δtevent | 0.9 ms | Maximal temporal distance between peaks belonging to the same spike event |
| fs | 20 kHz | Sampling rate |
| tw | 1.0 ms | Temporal extension of waveforms |
| tpeak | 0.4 ms | Temporal location of the peak within the waveforms |
| Nspikes | 50,000 | Maximal no. of spikes selected for clustering |
| k | 6 | No. of principal components used for clustering |
| ch | 1.8 | Parameter for mean-shift bandwidth estimation |
| Mspikes | 10 | Minimal number of spikes per cluster |
| Dmax | 0.30σn | Maximal Euclidean distance between merged templates |
| Pmin | 0.93 | Minimal value of vector projection between merged templates |
| Δtoverlap | 0.5 ms | Maximal temporal distance between spikes that count as overlaps |
σn, Noise standard deviation.
The algorithm was implemented in MATLAB, and the source code is available for download at https://git.bsse.ethz.ch:443/hima_public/HDsort.git.
Subdivision into local electrode groups.
We divided the overall array of recording electrodes into groups of adjacent electrodes. These LEGs can overlap, i.e., some electrodes were members of multiple LEGs. The number of LEGs as well as the number of electrodes, Nels, within an LEG varied according to the size and the topology of the recording electrode sets. The LEGs were obtained in an iterative process that assigned each electrode to at least one LEG. The goal was to assign all electrodes in close vicinity to each other to the same LEG in order to minimize the number of groups and the overlap between groups. For a detailed description of this process see Subdivision into Local Electrode Groups in the appendix. We set the maximal number of electrodes per LEG to 9, because we wanted to compartmentalize the recording area into squares of 3 × 3 electrodes. This arrangement proved to be well suited for recordings from murine retina and for the given electrode pitch. There is a trade-off between having more electrodes in an LEG and thereby increasing the amount of information on a specific neuron vs. having fewer electrodes and thereby keeping the number of recorded neurons, the number of spike overlaps, as well as the computational complexity low. We found that, within a radius of 43.5 µm from the center of an LEG, the precise location of the neuron did not affect the sorting performance or, in other words, as long as the maximal signal amplitude of a neuron was within the area covered by the LEG, it did not matter if this neuron was only partially covered by the LEG (for more information see Sorting Performance as a Function of LEG Distance in the appendix). In experiments with different spatial arrangements of the electrodes, electrode pitches, or cell densities, other solutions for the creation of LEGs may be more appropriate.
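The compartmentalization into squares of 3 × 3 electrodes can be illustrated with a minimal sketch. It assumes a complete rectangular grid without gaps and produces nonoverlapping tiles, which is a simplification of the iterative procedure described in the appendix; the function name `tile_into_legs` is hypothetical.

```python
def tile_into_legs(n_rows, n_cols, side=3):
    """Partition an n_rows x n_cols electrode grid into side x side squares.

    Returns a list of LEGs; each LEG is a list of (row, col) coordinates.
    Squares at the array border may contain fewer than side * side electrodes.
    Simplification: no gaps in the grid and no overlap between LEGs.
    """
    legs = []
    for r0 in range(0, n_rows, side):
        for c0 in range(0, n_cols, side):
            leg = [(r, c)
                   for r in range(r0, min(r0 + side, n_rows))
                   for c in range(c0, min(c0 + side, n_cols))]
            legs.append(leg)
    return legs


# One 23 x 23 high-density block, as used for the recordings
legs = tile_into_legs(23, 23)
```

Under these assumptions every electrode belongs to exactly one LEG, and no LEG holds more than 9 electrodes.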
Spike detection.
Spikes were detected on each electrode independently whenever the signal crossed a threshold (θel). The threshold was defined independently for each electrode as a multiple of the electrode’s noise standard deviation (σel). We estimated σel in a two-step procedure: In the first step, we computed a preliminary threshold, based on the signal standard deviation, and detected all spikes that surpassed this threshold. In the second step, we excluded these spikes and computed the median absolute deviation on the remaining noise signal. This allowed us to estimate σel in a way that is robust also in the presence of spikes and other outliers.
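The two-step noise estimate can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: the preliminary factor of 4 is a hypothetical value, only the samples that exceed the preliminary threshold are excluded (rather than whole spike epochs), and the median absolute deviation is converted to a standard deviation with the usual factor 1.4826 for normally distributed noise.

```python
from statistics import median, pstdev


def estimate_noise_sd(signal, prelim_factor=4.0):
    """Two-step robust estimate of the noise standard deviation."""
    # Step 1: preliminary threshold based on the overall signal SD
    prelim_threshold = prelim_factor * pstdev(signal)
    noise = [v for v in signal if abs(v) < prelim_threshold]
    # Step 2: median absolute deviation (MAD) of the remaining samples
    med = median(noise)
    mad = median(abs(v - med) for v in noise)
    return 1.4826 * mad  # MAD -> SD for Gaussian noise (assumption)


def detection_threshold(signal, factor=4.2):
    """Spike-detection threshold as a multiple of the noise SD (Table 1)."""
    return factor * estimate_noise_sd(signal)


# Toy trace: unit-amplitude "noise" plus ten large negative spike samples
toy_signal = [1.0, -1.0] * 500 + [-50.0] * 10
sigma = estimate_noise_sd(toy_signal)
```

In this toy trace the plain standard deviation is inflated by the spike samples, whereas the MAD-based estimate is not affected by them.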
We defined the spike event as a time point at which the electrical signal reached its maximal amplitude within a short period (Δtdist) after threshold crossing. When two spike events on different electrodes within a single LEG occurred within a defined short time interval (Δtevent), we considered them to be the same spike and kept only the spike event with the highest amplitude. Then, we extracted the waveforms, wi,j(t), for each spike event i on each electrode j within the LEG and grouped them to obtain a multielectrode waveform. We selected a random subset of Nspikes spikes, which we used in the following tasks of waveform alignment, feature selection, and clustering.
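How spike events from different electrodes of one LEG collapse into a single spike can be sketched as below. The tuple layout and the greedy, time-ordered grouping are assumptions made for illustration; amplitudes are compared by absolute value.

```python
def merge_spike_events(events, dt_event_ms=0.9):
    """Keep one event per spike within an LEG.

    events: list of (time_ms, electrode, amplitude) tuples.
    Events closer than dt_event_ms to the currently kept event are treated
    as the same spike; only the one with the largest |amplitude| survives.
    """
    kept = []
    for ev in sorted(events):  # sorts by time first
        if kept and ev[0] - kept[-1][0] <= dt_event_ms:
            if abs(ev[2]) > abs(kept[-1][2]):
                kept[-1] = ev  # the larger peak replaces the smaller one
        else:
            kept.append(ev)
    return kept


events = [(10.0, 3, -40.0), (10.2, 4, -75.0), (10.5, 5, -20.0), (15.0, 3, -35.0)]
spikes = merge_spike_events(events)
```

Here the first three events are merged into the event on electrode 4, which has the largest amplitude, and the event at 15.0 ms remains a separate spike.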
Waveform alignment.
As the recording sampling rate, fs, is finite, the spike waveforms, their amplitudes, and the exact timing of their peaks may slightly differ because of small temporal shifts of the waveform with respect to the sampling interval (so-called registration jitter). We corrected for this jitter through upsampling, alignment of the upsampled waveforms, and subsequent downsampling. The upsampling factor (Lup = 3) and the downsampling factor (Ldown = 3) were chosen to be the same in this case, which meant that the aligned waveforms had the same length as the original waveforms. To align the spikes, we searched for the maximum in the cross-correlation between each upsampled waveform and the mean waveform, which gave us the temporal shift of that waveform. We shifted the waveforms by this value and repeated this process until convergence was reached.
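The iterative alignment can be sketched for single-electrode waveforms that are assumed to be upsampled already; the up- and downsampling steps are omitted. The search range `max_shift`, the iteration limit, and the zero padding at the waveform edges are hypothetical choices.

```python
def shift(w, s):
    """Shift waveform w by s samples (positive = later), zero-padded."""
    n = len(w)
    return [w[i - s] if 0 <= i - s < n else 0.0 for i in range(n)]


def best_shift(w, ref, max_shift):
    """Lag that maximizes the cross-correlation of the shifted w with ref."""
    def xcorr(s):
        return sum(a * b for a, b in zip(shift(w, s), ref))
    # On ties, prefer the smallest absolute shift
    return max(range(-max_shift, max_shift + 1),
               key=lambda s: (xcorr(s), -abs(s)))


def align(waveforms, max_shift=3, max_iter=10):
    """Repeatedly align all waveforms to their mean until no shift remains."""
    ws = [list(w) for w in waveforms]
    for _ in range(max_iter):
        mean = [sum(col) / len(ws) for col in zip(*ws)]
        shifts = [best_shift(w, mean, max_shift) for w in ws]
        if all(s == 0 for s in shifts):
            break  # converged
        ws = [shift(w, s) for w, s in zip(ws, shifts)]
    return ws


base = [0, 0, 0, 1, 3, 1, 0, 0, 0]
aligned = align([base, shift(base, 1), shift(base, -1)])
```

In this toy example the three copies of the same pulse, offset by one sample each, end up with their peaks at the same sample.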
We trimmed the aligned waveforms to a length tw with their peaks located at tpeak relative to their first sample. Then, we concatenated them over all electrodes to form a vector wi with dimension Ndim = tw·fs·Nels.
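As a worked example of the dimension formula, the following sketch evaluates Ndim for the parameter values in Table 1; the function name is hypothetical.

```python
def waveform_dimension(tw_ms, fs_khz, n_els):
    """Ndim = tw * fs * Nels, with tw in ms and fs in kHz."""
    samples_per_electrode = round(tw_ms * fs_khz)
    return samples_per_electrode * n_els


ndim_leg = waveform_dimension(1.0, 20, 9)       # one LEG of 3 x 3 electrodes
ndim_full = waveform_dimension(1.0, 20, 1000)   # a thousand electrodes at once
```

With tw = 1.0 ms and fs = 20 kHz, an LEG of 9 electrodes gives 180 dimensions, whereas a thousand electrodes treated jointly give the 20,000 dimensions mentioned in the Introduction.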
Feature selection.
We started the feature selection with prewhitening of the waveforms. For this, we estimated the spatiotemporal noise covariance matrix, C, on periods of data where no spike events were detected. We computed C with a method similar to the one described in Pouzat et al. (2002). In brief, we computed the auto- and cross-correlations of noise epochs (i.e., in voltage traces where no spikes occurred) between all pairs of electrodes. With these correlation functions, we built the blockwise Toeplitz matrix C, where each block corresponds to the temporal noise covariance between two electrodes (for more details see Noise Covariance Matrix in the appendix).
Through Cholesky decomposition, we obtained the whitening matrix U that has the property UTU = C. The spike waveforms were then transformed by wi ← U−1wi (Franke et al. 2015b; Pouzat et al. 2002; Rutishauser et al. 2006). After prewhitening, we performed a principal component analysis and reduced the dimensionality of the spike vectors by keeping only the projections of each spike on the first k principal components (PCs).
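The whitening transform can be sketched with a small Cholesky factorization. The sketch uses the lower-triangular factor L with L·Lᵀ = C (so that L = Uᵀ) and treats each waveform as a column vector, for which solving L·y = w gives y = U⁻ᵀw; this is the same operation as multiplying a row vector by U⁻¹. The 2 × 2 covariance matrix is a made-up example, and the function names are hypothetical.

```python
from math import sqrt


def cholesky_lower(C):
    """Lower-triangular L with L * L^T = C (C symmetric positive definite)."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = sqrt(C[i][i] - s)
            else:
                L[i][j] = (C[i][j] - s) / L[j][j]
    return L


def forward_solve(L, w):
    """Solve L * y = w for y by forward substitution."""
    y = []
    for i in range(len(w)):
        y.append((w[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def prewhiten(C, w):
    """Whiten the column vector w given the noise covariance matrix C."""
    return forward_solve(cholesky_lower(C), w)


C_example = [[4.0, 2.0], [2.0, 3.0]]   # hypothetical noise covariance
y_example = prewhiten(C_example, [2.0, 1.0])
```

Applying the same transform to noise with covariance C yields decorrelated noise with unit variance in every dimension, which is the property that the clustering step relies on.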
Prewhitening is a linear operation that transforms signals with a given noise covariance matrix C so that the resulting signals are decorrelated, i.e., their noise covariance becomes the identity matrix. This means, in an ideal case where cluster shapes are fully defined by the noise covariance, that after prewhitening all clusters become hyperspherical with σ2 = 1 (Fig. 2A). In this case, the first few PCs of all spikes will represent the largest between-cluster variance, and the cluster distances in this subspace will increase on average by 50% (Fig. 2, A–D) in comparison to cluster distances in the PC space computed with nonprewhitened spike waveforms.
Fig. 2.
Effects of prewhitening on cluster separation in the linear subspace spanned by the first principal components (PCs) (A–D) and on cluster shapes (E–G). A: simplified schematic of the effect of prewhitening (W, and its inverse W−1) on the PCs in a 2-dimensional example case (x1 and x2). Correlated noise between x1 and x2 gives clusters an elliptic shape. PC1 is not an optimal direction to separate these 2 clusters. Prewhitening decorrelates the noise and produces circular clusters. After prewhitening PC1′ represents the optimal subspace for cluster separation. B and C: example of a labeled cluster (dark gray) corresponding to an artificial unit surrounded by other clusters in its local electrode group (light gray). B: scatterplot in PCs 1–3. C: histogram of Euclidean distances in PCs 1–6 for each spike to the center of the labeled cluster. The distance was normalized such that the variance of the labeled cluster became 1. Vertical dashed lines show the distance d within which 5% of spikes from other clusters are located. D: change of distances Δd introduced by prewhitening for 109 labeled clusters across first 30 PCs. Each point represents 1 cluster. Horizontal black lines indicate the median values. E: schematic of the contribution of cluster PCs to amplitude variability. We computed principal component analyses for each cluster and determined the contribution μ of each cluster PC to the direction of amplitude variability, which is the direction from the origin to the cluster center. F and G: means (black line) and standard deviations (gray areas) over 440 clusters. F: % of amplitude variability captured by each PC. The total contribution over the first 6 PCs is 75% and 64% in the nonwhitened and the prewhitened case, respectively. G: standard deviations within each PC (dark gray) compared with the standard deviation within the direction of amplitude variability (light gray).
Cluster shapes in our data, however, were not determined by noise correlation alone but also by intrinsic spike variation, mainly amplitude variation. A large share of the variability captured by the first 6 PCs is due to variability in the amplitude of the spikes (Fig. 2, E and F). Prewhitening therefore did not produce completely spherical clusters; instead, the median standard deviation in the first PC was σ1 = 6.8, and the average standard deviation over the first 6 PCs was (Fig. 2G).
Clustering.
Prewhitening had the effect that the cluster distributions became close to standard normal, which allowed us to implement a parameter-free spike clustering method by using mean-shift clustering with a flat kernel (Fukunaga and Hostetler 1975). Mean-shift is a density-based clustering algorithm that requires only one input parameter, h, called bandwidth. It works by iteratively shifting each data point x toward the mean of all points within a neighborhood around x, until all points converge to their cluster centers. The size of the neighborhood is defined by the bandwidth. The shift of each point toward the mean within the neighborhood is given by

$$m(x) - x = \frac{\sum_i K(x_i - x)\, x_i}{\sum_i K(x_i - x)} - x,$$

where K(x) is a flat kernel defined as

$$K(x) = \begin{cases} 1 & \text{if } \lVert x \rVert \le h \\ 0 & \text{otherwise.} \end{cases}$$

To avoid under- or overclustering, the size of the kernel should be roughly on the same order as the spread of the clusters in the data. For a k-dimensional cluster with mean µ and covariance matrix C, the mean squared Euclidean distance, $d^2 = (x - \mu)^T (x - \mu)$, between each data point and the mean is given as

$$E[d^2] = \operatorname{tr}(C) = \sum_{i=1}^{k} \sigma_i^2,$$

where $\sigma_i^2$ is the variance of the cluster along dimension i; for a standard normal cluster this equals k.

Using our empirical estimate of the residual variability of the clusters after prewhitening we could therefore fix the bandwidth to $h^2 = c_h k$ with a constant parameter $c_h$ (Table 1).
At the end of the clustering step, we retained all clusters that had at least Mspikes spikes, which prevented single outliers from forming their own clusters. It is important to keep in mind that the clustering was not used for the final assignments of spikes to neuronal units but only to compute the templates, which were subsequently used for template matching.
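A minimal sketch of flat-kernel mean-shift clustering with the bandwidth fixed to h² = ch·k follows. It is a naive illustration rather than the published code: every point is shifted independently, converged points that lie within h of each other are grouped into one cluster, and the tolerance and iteration limit are hypothetical. The removal of clusters with fewer than Mspikes spikes is not included.

```python
from math import sqrt


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_shift_flat(points, h, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift; returns (labels, centers)."""
    modes = []
    for p in points:
        x = list(p)
        for _ in range(max_iter):
            # Flat kernel: all points within distance h contribute equally
            nbrs = [q for q in points if dist2(q, x) <= h * h]
            m = [sum(col) / len(nbrs) for col in zip(*nbrs)]
            moved2 = dist2(m, x)
            x = m
            if moved2 < tol * tol:
                break
        modes.append(x)
    # Group converged points that ended up within h of an existing center
    centers, labels = [], []
    for x in modes:
        for idx, c in enumerate(centers):
            if dist2(x, c) <= h * h:
                labels.append(idx)
                break
        else:
            centers.append(x)
            labels.append(len(centers) - 1)
    return labels, centers


k, c_h = 2, 1.8                 # toy dimensionality; c_h from Table 1
h = sqrt(c_h * k)               # bandwidth from h^2 = c_h * k
cluster_a = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (-0.5, 0.0), (0.0, -0.5)]
cluster_b = [(x + 10.0, y + 10.0) for x, y in cluster_a]
labels, centers = mean_shift_flat(cluster_a + cluster_b, h)
```

In this toy example the two groups of points are far apart relative to h, so each group converges to its own center.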
Template matching.
In the previous step we clustered only a subset of all detected spikes and obtained clusters that corresponded to putative neuronal units. For each cluster n we computed a template ξn by averaging the waveforms w of its assigned spikes.
In this step, we classified all detected spikes in a process called “template matching.” This process compares the waveform of each spike to all templates and matches each waveform to the template with the highest similarity. Here we used a specific method called Bayes optimal template matching. It combines matched filters with Fisher linear discriminant analysis. Matched filters maximize the separation between signal and noise, whereas linear discriminant analysis optimizes the discrimination between classes that have the same covariance matrix (Franke 2011; Franke et al. 2015b) (for more details see Template Matching in the appendix). In brief, the multielectrode waveform of each detected spike is convolved with a set of multielectrode matched filters. Each filter is matched to one of the templates. From each filter’s output a discriminant function can be computed, which ranks how well the spike fits to the respective template. The values of all discriminant functions are then compared against each other, and the spike is assigned to the template with the largest associated discriminant function value.
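The assignment rule can be illustrated with a simplified discriminant. The sketch assumes prewhitened waveforms (identity noise covariance), for which the linear discriminant of template ξn reduces to xᵀξn − ½ξnᵀξn + ln pn, where the first term is the matched-filter output at the spike time. Equal priors, the absence of a noise class, and the omission of overlap handling are simplifications; the full method is described in the appendix.

```python
from math import log


def discriminant(x, template, prior):
    """Linear discriminant for one template in whitened space (assumption)."""
    matched_filter = sum(a * b for a, b in zip(x, template))
    energy = sum(b * b for b in template)
    return matched_filter - 0.5 * energy + log(prior)


def match_template(x, templates, priors=None):
    """Index of the template with the largest discriminant value."""
    if priors is None:
        priors = [1.0 / len(templates)] * len(templates)
    scores = [discriminant(x, t, p) for t, p in zip(templates, priors)]
    return max(range(len(scores)), key=scores.__getitem__)


templates = [[0.0, -5.0, 0.0], [0.0, -2.0, -2.0]]   # toy whitened templates
unit_a = match_template([0.1, -4.6, 0.3], templates)
unit_b = match_template([0.0, -1.8, -2.2], templates)
```

Each toy spike is assigned to the template it resembles most, that is, the first spike to template 0 and the second to template 1.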
Cluster merging.
Spikes from the same neuron may be distorted by noise and overlying spikes of other neurons so that they do not always have the same waveform. Additionally, the spike generation process can cause variability in the waveforms, e.g., during a burst (Fee et al. 1996). The first spike within the burst will have a larger amplitude than the second spike within the same burst. Therefore, a single neuron can produce spikes that fall into multiple, well-defined clusters if its intrinsic waveform variability is on the same order as or larger than its extrinsic variability (i.e., noise and overlapping spikes).
There are two ways to address this problem: First, one can increase the mean-shift bandwidth in an effort to prevent split neurons, but this approach increases the risk of merging two separate units together. Second, one can keep a smaller bandwidth and end up with neurons that are represented by multiple clusters. The second option is the preferred one. Not only is it easier to do automatic merging of clusters than automatic splitting, it is also beneficial to have more than one template per neuron for template matching. For this reason, we set the mean-shift bandwidth parameter lower than the previously estimated value.
It is therefore important that there is a step in which the detected neuronal units are compared to each other and merged if they represent the same neuron. This step was done after template matching, when all the spikes were classified and the final template for each neuronal unit could be computed. Since the final templates were averaged over more spikes than those obtained through clustering, their noise component was lower. We merged two neuronal units when the following two conditions with respect to their templates ξn and ξm were met: the Euclidean distance between ξn and ξm did not exceed Dmax, and the vector projection between ξn and ξm was at least Pmin (Table 1).
This process of merging was done iteratively: 1) The two units with the smallest distance were merged (i.e., the spikes of one unit were assigned to the other). 2) The template of the newly formed unit was computed by averaging the waveforms of all its spikes. 3) The Euclidean distances and the projections with the other templates were recalculated. 4) The process was repeated until no pair of units met the merging criteria.
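The iterative merging loop can be sketched as follows. Each unit is represented by the list of its spike waveforms, and the projection is implemented as a normalized dot product (cosine similarity); this normalization is an assumption, since the exact definition is not reproduced here. The threshold values in the example are illustrative and use arbitrary units.

```python
from math import sqrt


def mean_waveform(spikes):
    """Template of a unit: sample-wise mean over its spike waveforms."""
    return [sum(col) / len(spikes) for col in zip(*spikes)]


def distance(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def projection(a, b):
    """Normalized projection between two templates (assumed definition)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def merge_units(units, d_max, p_min):
    """Merge the closest admissible pair until no pair meets both criteria."""
    units = [list(u) for u in units]
    while True:
        templates = [mean_waveform(u) for u in units]
        candidates = []
        for i in range(len(units)):
            for j in range(i + 1, len(units)):
                d = distance(templates[i], templates[j])
                if d <= d_max and projection(templates[i], templates[j]) >= p_min:
                    candidates.append((d, i, j))
        if not candidates:
            return units
        _, i, j = min(candidates)          # pair with the smallest distance
        units[i].extend(units.pop(j))      # spikes of unit j go to unit i


unit_1 = [[0.0, -10.0, 0.0], [0.0, -10.2, 0.0]]
unit_2 = [[0.0, -10.1, 0.1]]
unit_3 = [[0.0, 0.0, -10.0]]
merged = merge_units([unit_1, unit_2, unit_3], d_max=0.3, p_min=0.93)
```

The first two toy units have nearly identical templates and are merged, whereas the third unit has a different waveform and is kept separate.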
Duplicate resolution.
As a result of the process described in the previous section, we obtained a number of neuronal units for every LEG, each consisting of a spike train and a template. In a final step, we resolved and removed duplicated neuronal units between LEGs, each of which was the result of a neuron being detected independently in multiple LEGs. Occurrence of the same neuronal unit in multiple LEGs was found in cases when the extracellular signal of a neuron was large and spread over several LEGs or when a neuron was located at the intersection of two LEGs. We used a simple heuristic that was computationally efficient and that compared the global templates (i.e., the templates over the entire set of recording electrodes) and the spike trains to decide whether two units represented the same neuron.
We began with the assumption that a neuron was always detected best in the LEG where its amplitude was largest. Consequently, when the global template of a neuronal unit found in LEG A had its maximal amplitude in LEG B, the respective neuron must have been detected more reliably in LEG B. Therefore, we removed all units that featured global templates with maximal amplitude located in other LEGs, which accounted for roughly half of all duplicates (data not shown).
The remaining duplicates were units that had their maximal template amplitudes on electrodes at the intersection of two LEGs. We made pairwise comparisons of all these units and earmarked those pairs that featured sufficient template similarity (for more details see Duplicate Resolution in the appendix).
Finally, we compared the spike trains of these earmarked pairs: We counted the number of overlapping spikes, i.e., the spikes within the same time window of Δtoverlap. When the percentage of overlapping spikes was >50%, we determined the pair to be duplicates and discarded the unit with the smaller maximum template amplitude. All remaining units were retained as final results of the spike sorting algorithm.
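The spike-train comparison for earmarked pairs can be sketched as below. The percentage of overlapping spikes is computed here relative to the shorter of the two spike trains, which is an assumption; spike times are given in milliseconds.

```python
def overlap_fraction(train_a, train_b, dt_overlap_ms=0.5):
    """Fraction of spikes of the shorter train that coincide with the other."""
    a, b = sorted(train_a), sorted(train_b)
    if not a or not b:
        return 0.0
    i = j = matched = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= dt_overlap_ms:
            matched += 1      # one-to-one pairing of coincident spikes
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matched / min(len(a), len(b))


def is_duplicate(train_a, train_b, dt_overlap_ms=0.5, min_fraction=0.5):
    """True if more than min_fraction of the spikes overlap."""
    return overlap_fraction(train_a, train_b, dt_overlap_ms) > min_fraction


train_1 = [10.0, 20.0, 30.0, 40.0]
train_2 = [10.2, 20.1, 35.0, 40.4]
train_3 = [12.0, 22.0, 32.0, 42.0]
```

In this toy example `train_1` and `train_2` share three of four spikes and count as duplicates, whereas `train_1` and `train_3` share none. Of a duplicate pair, the unit with the smaller maximal template amplitude would then be discarded.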
Surrogate Data Generation
For the performance assessment of a spike sorter, it is necessary to have a benchmarking data set in which the exact spike times of many neurons are known. We wanted to emulate realistic noise properties and spike shapes for each experimental condition that we analyzed. In this section, we describe a simple method with which we generated surrogate data based on real HD-MEA recordings. We created two separate benchmarks, each containing 12 data sets, with different randomly inserted neuronal units. In the first benchmark, the amplitudes of the inserted spikes were varied, in the second benchmark the firing rates.
Initially, we ran our spike sorter on six different 20-min recordings from murine retinas without surrogate ground truth and detected a total of 4,034 neuronal units that we named “original units.” As our HD-MEA allowed us to record from an almost arbitrary selection of 1,024 electrodes of the 26,400 available electrodes, we placed the electrodes in two similar blocks of 23 × 23 adjacent electrodes. The electrode blocks (Fig. 3A) were spatially separated by ∼0.1 mm. Because of design constraints of our HD-MEA, some electrodes were not connected during the recording, resulting in apparent gaps in the recording area. These gaps did not influence the performance assessment. For each recording, we computed global templates of 10 original units per electrode block. To make sure that the templates contained little noise, we only took the mean spike waveforms of units with at least 4,000 spikes (~3.3 spikes/s). We also band-pass filtered these templates with the same filter settings that were used to prefilter the raw data (300–7,000 Hz) and multiplied them with a Tukey window to ensure zero on- and offset.
Fig. 3.
Generation of benchmark data exemplified with a representative artificial neuron. A: schematic of “waveform swapping” to generate artificial templates: the microelectrode data were recorded with an electrode configuration consisting of 2 high-density electrode blocks. The data were spike sorted, and 20 units from the sorting were chosen (“original units”). An “artificial unit” was created from each of the templates by swapping of the waveforms between the 2 high-density blocks. B: example template of an artificial neuron. Inset: region of the array in which the template had the largest-amplitude waveforms. (Note that the high-density blocks contained gaps. This is a consequence of a design constraint in our high-density microelectrode arrays that only subsets of electrodes can be simultaneously recorded with high-density electrode configurations.) C: example waveforms of a spike that has been inserted into the recordings. D: recording traces of the electrodes corresponding to the one in C before insertion of the spike (top) and after insertion (bottom). E: superposition of all spike waveforms on the one electrode before insertion into the recordings (top) and after insertion into the recordings (bottom). Insets: histograms of spike amplitudes. B–E: * marks the electrode where the amplitude of the example waveform was maximal.
For each of these templates, we created a new, artificial template by interchanging the waveforms of the original template between the two high-density blocks (Fig. 3, A and B). The obtained new templates formed the basis of 20 “artificial units.” The advantage of this procedure is that the shape and spatial distribution of the inserted spikes were identical to those obtained from real neurons in the recordings, yet the interchange between the blocks put them at new locations so that they were sufficiently different from their originals.
The templates created this way were inserted as spikes into the original recordings by adding the waveforms onto the respective recorded traces (Fig. 3, C and D). Before insertion of an artificial spike i we multiplied the respective templates by a factor αi drawn from a normal distribution with variance $\sigma_\alpha^2$, $\alpha_i \sim \mathcal{N}(1, \sigma_\alpha^2)$, so that the amplitudes of the inserted spikes reflected the amplitude distribution of neurons in the recordings (Fig. 3E, top). The spike amplitudes after insertion were therefore randomly distributed, but their variance was larger than $\sigma_\alpha^2$ because of the noisy background to which they were added (Fig. 3E, bottom; see Spike Amplitude Distribution in the appendix). We also jittered each spike in time before inserting it into the data by upsampling the respective template by a factor of 10, randomly shifting the upsampled template between 0 and 9 samples before downsampling it again. This process ensured that the inserted spikes were not always perfectly aligned with the sampling intervals.
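The scaling and jittering of a single inserted spike can be sketched as follows for one electrode. Linear interpolation is used for the upsampling, which is an assumption (the interpolation method is not specified here), and the function names and the `sigma_alpha` argument are hypothetical.

```python
import numpy as np

def jitter_template(template, shift, factor=10):
    """Shift a single-channel template by shift/factor samples.

    Upsamples by `factor` with linear interpolation (an assumption), delays
    the upsampled waveform by `shift` samples, and downsamples again.
    """
    n = len(template)
    t_up = np.arange(n * factor) / factor
    up = np.interp(t_up, np.arange(n), template)
    shifted = np.concatenate([np.zeros(shift), up[:len(up) - shift]])
    return shifted[::factor]

def insert_spike(trace, template, onset, sigma_alpha, rng, factor=10):
    """Add one scaled, jittered copy of `template` to `trace` in place."""
    alpha = rng.normal(1.0, sigma_alpha)      # amplitude factor, mean 1
    shift = int(rng.integers(0, factor))      # 0 .. factor-1 upsampled samples
    wave = alpha * jitter_template(template, shift, factor)
    trace[onset:onset + len(wave)] += wave
    return alpha, shift
```

For a multielectrode template, the same `alpha` and `shift` would be applied to the waveform of every electrode so that the spatial footprint of the spike is preserved.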
To choose the time points at which spikes were inserted into the data, we computed an independent Poisson process for each artificial unit with spiking rate parameter λi and a refractory period of 1.5 ms.
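A minimal sketch of such a spike-time generator is given below. It enforces the refractory period by discarding candidate spikes that follow the previously accepted spike too closely; this is one of several possible ways to combine a Poisson process with a refractory period and is an assumption of the sketch, as is the function name.

```python
import random

def poisson_spike_times(rate_hz, duration_s, refractory_s=0.0015, seed=None):
    """Spike times (s) of a Poisson process with an absolute refractory period.

    Candidate spikes are drawn with exponential inter-spike intervals and
    dropped when they fall within `refractory_s` of the last accepted spike.
    """
    rng = random.Random(seed)
    times = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_hz)
        if t >= duration_s:
            break
        if times and t - times[-1] < refractory_s:
            continue
        times.append(t)
    return times
```

Because violating spikes are discarded, the realized rate is slightly lower than `rate_hz`; at 5 Hz and 1.5 ms the difference is below 1%.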
Performance Assessment
We evaluated the performance of our algorithm by sorting the surrogate data and comparing the resulting sorted units with the inserted artificial units. We benchmarked the sorting performance as a function of spike amplitude and spike rate, using two dedicated sets of surrogate data. Since our surrogate data consisted not only of the artificial units but also of many more unknown neurons, it was not trivial to obtain meaningful metrics to assess the sorting performance. We matched the artificial units to the sorted units by using two independent procedures: In the first procedure, we compared the spike trains of all pairs of one sorted unit and one artificial unit and counted the number of true positives as well as the detection errors categorized as false positives, false negatives, and false classifications (for a detailed description of these terms see Performance Assessment in the appendix). We matched those pairs that produced the smallest number of detection errors. In the second procedure, we compared the templates of the units and matched the pairs with the smallest Euclidean distance of their templates. When both procedures matched the same sorted unit to an artificial unit we categorized it as “found,” and otherwise as “lost.”
We then computed the sorting performance metrics, sensitivity, precision, and error rate, to quantify the detection accuracy of each artificial unit based on spike train similarity to its matched sorted unit. The descriptions of these metrics are given in Table 2 (for a more detailed description see Performance Assessment in the appendix).
Table 2.
Definitions of sorting performance metrics
| Term | Description |
|---|---|
| Sensitivity | Percentage of true positives in surrogate ground truth spikes |
| Precision | Percentage of true positives in sorted unit spikes |
| Error rate | Percentage of detection errors in surrogate ground truth spikes |
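The three metrics can be computed from the spike counts as in the following sketch. It assumes that the detection errors are the sum of false positives, false negatives, and false classifications; the exact counting rules are those of the appendix, and the function and argument names are hypothetical.

```python
def sorting_metrics(true_pos, false_pos, false_neg, false_class,
                    n_ground_truth, n_sorted):
    """Sensitivity, precision, and error rate in percent.

    n_ground_truth: number of inserted (surrogate ground truth) spikes.
    n_sorted: number of spikes in the matched sorted unit.
    """
    errors = false_pos + false_neg + false_class
    return {
        "sensitivity": 100.0 * true_pos / n_ground_truth,
        "precision": 100.0 * true_pos / n_sorted,
        "error_rate": 100.0 * errors / n_ground_truth,
    }
```

Because the error rate is normalized by the number of inserted spikes, it can exceed 100% when a sorted unit contains many spikes that do not belong to the artificial unit.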
Sorting performance as a function of spike amplitudes.
The templates of the artificial units used in this section were selected randomly from the original recordings. The mean spiking rate was the same for each artificial unit with a value of λi = 5 Hz. This means that all artificial units in this evaluation had roughly the same number of spikes. All spike amplitudes shown here are given as multiples of the noise standard deviation, σn.
Figure 4, A–D, shows the relationship between the spike amplitude and sensitivity, precision, and error rate. Table 3 lists the mean detection metrics of all found units. There is a steep drop in sorting performance for units with amplitudes around the detection threshold of 4.2. This is due to the fact that half of these units were completely lost but also that found units within this range were sorted with an overall error rate of 115%. Units with amplitudes between 4.2 and 10.0 were generally sorted well, with ~10% of them being lost. The found units, however, were sorted with >90% sensitivity and precision. Except for a few outliers, all units with amplitudes >10.0 were found and sorted with a median sensitivity and precision of 99.0% and 100.0%, respectively.
Fig. 4.
Evaluation of the sorting performance with respect to spike amplitudes in units of noise standard deviation (σn) (A–D) and spike rates (Hz) (E–H). A and E: sensitivity: no. of true positives divided by no. of inserted spikes. B and F: precision: no. of true positives divided by no. of detected spikes. C and G: error rate: no. of detection errors divided by no. of inserted spikes. D and H: histogram of found and lost units. A–D: left dashed vertical line indicates the spike detection threshold (4.2σn); right dashed vertical line indicates 10σn.
Table 3.
Mean and median values to assess sorting performance of found units
| Metric | Amplitude 0–4.2 σn | Amplitude 4.2–10.0 σn | Amplitude >10.0 σn |
|---|---|---|---|
| Sensitivity, mean (%) | 62.6 | 92.4 | 97.7 |
| Sensitivity, median (%) | 72.4 | 94.8 | 99.0 |
| Precision, mean (%) | 52.4 | 92.3 | 98.6 |
| Precision, median (%) | 49.7 | 98.4 | 100.0 |
| Error rate, mean (%) | 115.0 | 16.8 | 3.56 |
| Error rate, median (%) | 101 | 8.42 | 1.01 |
| Found units | 7 | 111 | 99 |
| All units | 15 | 124 | 101 |
σn, Noise standard deviation.
Sorting performance as a function of spike rate.
To investigate how the number of spikes per neuron that were available for clustering affected the detection performance, we used the second surrogate data set in which we varied the spike rates of the artificial units. The templates were created in the same way as described above; however, the original units were not selected randomly. Instead, we looked at the amplitude distribution of the original units and selected the 10 units per high-density block that produced signals closest to the mean amplitude in each recording (≈10σn, data not shown). This way, we ended up with 20 artificial units per recording that had similar spike amplitudes.
The spike rates λi that we used as input to the Poisson spike-time generator were evenly distributed (on a logarithmic scale) in the range of 0.05–50 Hz. This produced spike counts in the range of 40–60,000.
Analogous to the previous section, Fig. 4, E–H, shows spike rate vs. sensitivity, precision and error rate, and Table 4 lists the mean detection metrics of all found units. We saw a drop in the number of found units at spike rates <0.2 Hz, with a few outliers above. The found units with spike rates <0.1 Hz (~120 spikes) had a large mean error rate (3,340%). These errors were all due to a lack of precision, as the sensitivity was 100%, i.e., there were no false negatives but only falsely classified spikes from other neurons. In general, the sensitivity was ∼99% for all found units irrespective of their spike rate. The mean precision also approached 99% for higher spike counts. The error rate was <2% for units with spike rates >1 Hz (corresponding to spike counts of 1,500 and more).
Table 4.
Mean and median values to characterize sorting performance of found units
| Metric | Spike rate <0.1 Hz | Spike rate 0.1–1 Hz | Spike rate 1–10 Hz | Spike rate >10 Hz |
|---|---|---|---|---|
| Sensitivity, mean (%) | 99.9 | 98.9 | 99.7 | 99.0 |
| Sensitivity, median (%) | 100.0 | 100.0 | 99.9 | 99.9 |
| Precision, mean (%) | 64.5 | 95.3 | 98.5 | 99.1 |
| Precision, median (%) | 100.0 | 100.0 | 100.0 | 100.0 |
| Error rate, mean (%) | 3340 | 21.1 | 0.846 | 1.90 |
| Error rate, median (%) | 0 | 0.131 | 0.0877 | 0.231 |
| Found units | 12 | 68 | 69 | 56 |
| All units | 28 | 80 | 72 | 60 |
Comparison to other spike sorters.
We compared the performance of our algorithm to other spike sorters using a publicly available spike sorting benchmark data set. We used the algorithm with parameters identical to those in the previous section. The benchmark data set is available for download at http://phy.cortexlab.net/data/sortingComparison (Steinmetz 2016). On this data set five other spike sorters have been evaluated and compared (phy: Rossant et al. 2016; spykingCircus: Yger et al. 2018; globalSuper: Shabnam 2016; kiloSort: Pachitariu et al. 2016; JRClust: Jun et al. 2017). The data set consisted of recordings of cortical neurons with 118 electrodes, arranged in two columns, and the duration of the recordings ranged between 46 and 83 min. The surrogate ground truth spikes were created by adding denoised waveforms into recorded data. Thus the final data set contained, besides the artificially inserted neurons, recorded spikes of an unknown number of real neurons (Pachitariu et al. 2016).
To ensure that the performance metrics were comparable to those used to evaluate the other sorters, we matched and compared our sorted units to the surrogate ground truth by using the code that was provided together with the data sets. This code was used to compute a score for each pair of sorted units and artificial units and matched the pairs with the highest scores. The score was equivalent to sensitivity plus precision minus 1. The code further helped to assess whether the obtained score could be improved by merging units, but we only report here the initial scores before the merging.
Figure 5 compares the results of all six sorters per given data set. Our sorter showed consistently high median scores for all data sets, either matching or exceeding those of the top three sorters. The median precision was nearly 100% in all data sets, whereas the results for sensitivity were more mixed. We observed a loss of sensitivity in some units of data set 6 that we attribute to overclustering as a consequence of multimodal amplitude distributions; other sorters seemed to have the same problem.
Fig. 5.
Performance comparison with other spike sorters using 6 data sets of increasing difficulty. Data sets were sorted separately with our algorithm (hdsort) using default parameters. Score, sensitivity, and precision were computed with software that was published together with the data sets at http://phy.cortexlab.net/data/sortingComparison (Steinmetz 2016). A: score combining sensitivity and precision (sensitivity + precision – 1). B: sensitivity: no. of true positives divided by no. of inserted spikes. C: precision: no. of true positives divided by no. of detected spikes.
Runtime estimation.
To estimate the runtime of our algorithm in a realistic scenario, we measured the time required to process each LEG (parallel processes) and for the final duplicate resolution step. We excluded the time that was necessary to load the data sets, filter them, and detect the spikes, as these time spans are highly dependent on the performance of the file system and the file format in which the recordings were saved. We compare the runtimes of a 20-min and a 63-min recording in Table 5. The results showed that the runtimes of individual LEGs varied by a factor of more than 20, which was mainly due to the fact that the number of spikes per LEG could differ significantly. The theoretical total runtime upon using one CPU per LEG in a parallel approach was limited by the slowest of the parallel processes plus the duplicate resolution step. It amounted to 10.5 min for the 20-min recording and 23.3 min for the 63-min recording. These results further showed that the runtime did not scale linearly with the recording duration but was comparatively shorter for longer recordings. This was due to the fact that the slowest process was the mean-shift clustering and that we defined a maximum of 50,000 spikes for this step within a single LEG (for more on this see Runtime Estimation in the appendix).
Table 5.
Runtime estimation for two data sets with different recording durations
| Quantity | 20-min recording | 63-min recording |
|---|---|---|
| No. of LEGs | 92 | 166 |
| No. of electrodes | 578 | 890 |
| Parallel processes, mean (min) | 3.8 | 7.6 |
| Parallel processes, median (min) | 3.6 | 5.8 |
| Parallel processes, min. (min) | 0.4 | 0.8 |
| Parallel processes, max. (min) | 9.7 | 20.9 |
| Duplicate resolution (min) | 0.8 | 2.4 |
| Total runtime without parallelization (min) | 353 | 1,269 |
| Theoretical runtime with 1 CPU per parallel process (min) | 10.5 | 23.3 |
LEG, local electrode group.
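The two totals in Table 5 can be related to the per-LEG runtimes with a short calculation. The sketch below assumes that the serial runtime is the sum over all LEGs plus the duplicate resolution and that the parallel runtime is the slowest LEG plus the duplicate resolution; the function name is hypothetical.

```python
def runtime_estimates(n_legs, mean_leg_min, max_leg_min, duplicate_min):
    """Return (serial, parallel) runtime estimates in minutes."""
    serial = n_legs * mean_leg_min + duplicate_min
    parallel = max_leg_min + duplicate_min
    return serial, parallel

# Values taken from Table 5.
recordings = {20: (92, 3.8, 9.7, 0.8), 63: (166, 7.6, 20.9, 2.4)}
for duration, args in recordings.items():
    serial, parallel = runtime_estimates(*args)
    print(f"{duration}-min recording: serial ~{serial:.0f} min, "
          f"parallel {parallel:.1f} min")
```

The parallel estimates reproduce the tabulated 10.5 and 23.3 min. The serial estimates come out a few minutes below the tabulated 353 and 1,269 min, which is to be expected because the mean per-LEG runtime in the table is rounded to one decimal place.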
DISCUSSION
Spike Sorting Algorithm
We extended the standard reference method for spike sorting of recordings with low numbers of electrodes to recordings with thousands of electrodes and simultaneously recorded neurons. The high dimensionality of these HD-MEA recordings poses a challenge to standard spike sorters that have been designed for small electrode numbers, so that new solutions are required. We used three comparatively simple strategies to implement such an extension to large numbers.
Subdivision into local electrode groups.
We divided the entire set of recording electrodes into LEGs and then sorted each group independently. In this way, we limited the complexity of each task to treating signals of maximally nine electrodes, a number that was small enough that standard spike sorting methods could be applied. As the LEGs were treated independently, we were able to implement a parallel computation scheme that scaled with the duration of the recording and not with the number of recording electrodes.
The division of the overall electrode array into LEGs was based on the spatial arrangement of the electrodes, with the goal to cover the overall array recording area with little redundancy. As some neurons were located at the edge of an LEG, it was necessary to assess whether the reduced information content within each LEG negatively affected the sorting performance or, in other words, whether neurons that were located at the border between two LEGs were sorted with a lower performance than neurons in the center of an LEG. Our analysis showed that the relative position of a neuron within the LEG did not have an impact on the sorting quality as long as the distance between the LEG and the electrode that recorded the maximal amplitude of a neuron did not exceed 43.5 µm (2.5 electrode distances). The LEGs, therefore, need to be selected in a way that this distance is not exceeded. An alternative approach would include treating each electrode as the center of an LEG and clustering only spikes that have their maximal amplitude on that central electrode (Yger et al. 2018).
The independent sorting of LEGs obviated the problem of having coinciding spikes from neurons that were located far apart. However, it did not solve the issue of spatiotemporal overlaps, i.e., the fact that spikes of neighboring neurons may occur at the same time. Methods for resolving these overlaps have been presented before and could be implemented additionally. One way would be to create additional overlap templates; another way would be to iteratively subtract detected spikes from the recorded signals (Franke et al. 2015b; Marre et al. 2012; Pillow et al. 2013). These methods offer solutions for scenarios in which highly correlated neurons need to be separated.
Clustering of a limited number of spikes.
To limit the computational costs of waveform alignment, feature extraction, and, especially, clustering, we did not apply these steps to all spikes detected within a given LEG. Instead, we limited these steps to a random subset consisting of 50,000 spikes. Owing to this limit, the mean-shift clustering method reached convergence within a few minutes, irrespective of the recording duration. After clustering, all spikes were assigned to one of the detected clusters with template matching. This strategy had the effect that, for LEGs with >50,000 spikes, the runtime scaled linearly—instead of quadratically—with the number of spikes.
Prewhitening of spike waveforms.
To achieve parameter-free clustering, we prewhitened the spike waveforms before using the mean-shift clustering algorithm. Prewhitening had two positive effects: 1) Cluster separation in the first 6 PCs increased by an average of 50% through prewhitening (Fig. 2D). 2) The cluster shapes became predictable and closer to standard normal. The remaining nonsphericity can, in large part, be accounted for by intrinsic spike amplitude variability (Fig. 2, E and F). It was thus possible to use a fixed kernel bandwidth without manual intervention. There are other autonomous clustering methods that could be used here, such as consensus-based clustering (Fournier et al. 2016), density-peak clustering (Jun et al. 2017; Rodriguez and Laio 2014; Yger et al. 2018), or even k-means clustering (Lewicki 1998).
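The effect of prewhitening can be illustrated with a minimal sketch that derives a whitening matrix from a noise covariance matrix. A Cholesky factorization is used here as one common choice; the decomposition used in the actual implementation is not stated in this section, and the function names are hypothetical.

```python
import numpy as np

def whitening_matrix(noise_cov):
    """Return W such that W @ C @ W.T is the identity matrix."""
    chol = np.linalg.cholesky(noise_cov)   # C = L @ L.T
    return np.linalg.inv(chol)             # W = L^-1

def prewhiten(waveforms, noise_cov):
    """Whiten waveforms given as rows of an (n_spikes, n_dims) array."""
    return waveforms @ whitening_matrix(noise_cov).T
```

After this transformation the noise contribution to each cluster is approximately standard normal in every direction, which is what allows a single, fixed mean-shift kernel bandwidth to be used across LEGs and recordings.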
Merging and Duplicate Resolution
A neuron can produce different spike waveforms within bursts of action potentials (Meister et al. 1994). This can cause a spike sorter to split the same neuron into multiple units. We addressed this problem by merging clusters based on the similarity of their templates. However, the quality of this merging depends on how strongly the waveforms change within a burst and, in turn, on how well the parameters in the merging step can be adapted to this change. We did not further investigate this issue in our study.
In a situation where intraburst variability would be so large that our merging method would not be applicable, a possible solution would include merging units based on their spike time cross-correlation. However, while using this approach one would have to take into account that even distant neurons could be correlated.
Since neurons could be detected and sorted on multiple LEGs, we added a duplicate resolution step based on a heuristic that was designed for speed and scalability. Duplicates were identified and removed when necessary. It may be possible to improve this step by solving a well-defined optimization problem, which we have not attempted yet. An alternative to this step could be applying a unimodality test (Chung et al. 2017), a manually curated merging step (Yger et al. 2018), or an adaptation of existing cluster comparison methods (Hill et al. 2011; Pachitariu et al. 2016; Rossant et al. 2016).
In many experiments it is interesting to analyze, in addition to the spike-sorted units, the so-called multiunit activity (MUA), i.e., the “residual spike trains” after spike sorting. The MUA contains all small-amplitude spikes, which could not be assigned to a high-quality sorted unit. In an HD-MEA context, however, each LEG will have its own MUA, and it stands to reason that many spikes between adjacent MUAs will actually be duplicates. We did not investigate whether our method for duplicate resolution would be suited to resolving this issue, but approaches for spike detection in an HD-MEA context that avoid duplicate detection have been developed (Swindale and Spacek 2014).
Spike sorting often requires a manual curation step, in which the results are visually inspected by a human operator and spike clusters are either split or merged when necessary. For recordings with thousands of neurons, this step becomes impractical and could be biased (Einevoll et al. 2012; Wood et al. 2004). The results we present here show that good performance for many neurons could be attained despite avoidance of manual curation.
The high throughput achieved by combining HD-MEA recordings with this sorting algorithm proved useful in experiments where tens of thousands of cells need to be screened to assess, for example, a disease-related phenotype (Hillier et al. 2017; Yonehara et al. 2016).
Surrogate Data Generation
We presented a simple method for generating surrogate data based on real HD-MEA recordings. This method relied on generating artificial neuronal units according to templates of recorded neuronal units by manipulating the spatial location of the template waveforms. The spikes of these artificial units had the same waveform shapes and spatial distributions as the spikes of the recorded units. This method enabled us to create data sets with an arbitrary but low number of inserted neurons for which we knew the exact spike times. All inserted spikes were scaled copies of the same basic waveform, and the data sets featured real noise and background activity. It is important to note that this data set is not suitable to assess the sorting performance of our algorithm on data sets in which the spike waveforms vary over time, e.g., because of spatial movements of the neurons with respect to the electrodes (Franke et al. 2010) or bursting. A more suitable way of benchmark data generation, which accounts for spike shape variations, would need to be employed.
It is possible that we inserted units that were spatially overlapping with other existing units in the recording in a way that would not have occurred in real data, because two neurons usually cannot occupy exactly the same location. The exact overall number of neurons in the preparation as well as their properties remained unknown. We could only perform an analysis of the detection performance of the developed method on the inserted units.
Performance Assessment
We present a spike sorter for HD-MEA recordings that relies on a comparatively simple approach. Although our intention was not to build a spike sorter with highest possible accuracy but a spike sorter that requires a minimum of human interference, we could show that the developed spike sorter yields results that are comparable to those of other state-of-the-art spike sorters. Moreover, we found that the presented spike sorter could correctly detect and classify all spikes of units with amplitudes > 10σn (noise standard deviation) and reliably identified templates for most units with amplitudes between 4.2σn and 10σn. Almost all units that were lost during the sorting had signal amplitudes smaller than our spike detection threshold of 4.2σn, yet some of their spikes were nevertheless detected. This result is due to the fact that background noise can add constructively to the spike template and push an otherwise undetectable spike beyond the threshold. Furthermore, this noise effect occurs on each electrode of the LEG, which increases the likelihood that a spike will be detected. The presented algorithm is then able to cluster and classify these spikes, albeit with a significantly increased error rate. We therefore would recommend discarding units with amplitudes close to and below the detection threshold.
A plausible way to further reduce the number of undetected spikes would be to run the template matching on the entire continuous recording traces rather than on predetected and cut-out spikes, thereby also finding spikes below the detection threshold (Dragas et al. 2015; Franke 2011; Franke et al. 2015b).
The correlation between the spike rate of a neuron and the detection performance showed that highly active neurons were generally well sorted, whereas neurons with low activity were more likely to be missed by the spike sorter. As the spike sorter randomly selects only 50,000 spikes for clustering and is set to overcluster, it can happen that neurons with low spike counts are not able to form clusters with >10 spikes and thus get discarded as noise. In stimulation-driven sensory systems, such as light-stimulated retinas, we deem it unlikely that neurons have spike rates that are so low that they can be missed through this phenomenon. We therefore think that the method of clustering a random subsample of spikes is well suited in such a situation and that it can also be used for other experiments in which low-activity neurons are not of interest. However, if the data are likely to contain strongly asymmetric firing rate distributions, i.e., single neurons contribute to fewer than ~120 of the 50,000 clustered spikes (~0.25%), a different way of clustering will be needed. Possible solutions for this problem include clustering all spikes, introducing a weight measure that favors dissimilar spikes, or implementing an iterative process with the goal to also find small clusters.
Finally, the comparison to other spike sorters showed that our approach performs on par with other currently available spike sorters for most of the test data sets. This result was obtained without parameter adjustment and without manual curation of the clusters.
GRANTS
Financial support through the ERC Advanced Grants 694829 “neuroXscales” and 267351 “NeuroCMOS” and Swiss National Science Foundation Sinergia Projects CRSII3_141801 and CRSII5_173728 is acknowledged, as well as individual support for R. Diggelmann through a Swiss SystemsX IPhD grant and for F. Franke through Swiss National Science Foundation Ambizione Grant PZ00P3_167989.
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the authors.
AUTHOR CONTRIBUTIONS
R.D. and F.F. conceived and designed research; M.F. performed experiments; R.D. analyzed data; R.D. and F.F. interpreted results of experiments; R.D. prepared figures; R.D. drafted manuscript; R.D., M.F., A.H., and F.F. edited and revised manuscript; R.D., A.H., and F.F. approved final version of manuscript.
ACKNOWLEDGMENTS
We thank Thomas L. Russell, ETH Zurich, for sharing tissue from experiments. We also thank Nick Steinmetz from the Cortical Processing Laboratory at University College London for the benchmarking data and evaluation code.
APPENDIX
Subdivision into Local Electrode Groups
The process we used for subdividing a set of recording electrodes into LEGs was iterative and terminated when each electrode was assigned to at least one LEG. Each iteration consisted of three steps:
1) Pick an unassigned electrode, e0, and add it to a new LEG. Select the closest electrode to e0 within a distance Rmin and assign it to the new LEG. Continue until the LEG is full or no unassigned electrodes are in the vicinity of e0.
2) If the LEG contains fewer than Nmin electrodes, expand the radius to Rmax and add only electrodes without an assignment.
3) If the LEG still contains fewer than Nmin electrodes, add other electrodes within Rmax until Nmax is reached.
The parameters used in this evaluation are listed in Table A1.
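The iterative subdivision described above can be sketched as follows. The order in which unassigned electrodes are picked (lowest index first) and the interpretation of a "full" LEG as one with Nmax electrodes are assumptions of this sketch, and the function and argument names are hypothetical.

```python
import math

def subdivide_into_legs(positions, r_min, r_max, n_min, n_max):
    """Greedy LEG assignment; positions is a list of (x, y) tuples."""
    def neighbours(center, radius):
        """Indices within `radius` of electrode `center`, nearest first."""
        cx, cy = positions[center]
        near = [(math.hypot(x - cx, y - cy), i)
                for i, (x, y) in enumerate(positions) if i != center]
        return [i for d, i in sorted(near) if d <= radius]

    unassigned = set(range(len(positions)))
    legs = []
    while unassigned:
        e0 = min(unassigned)          # assumption: lowest index first
        leg = [e0]
        unassigned.discard(e0)

        def grow(radius, only_unassigned):
            """Add nearest electrodes within `radius` until the LEG is full."""
            for i in neighbours(e0, radius):
                if len(leg) >= n_max:
                    break
                if i in leg or (only_unassigned and i not in unassigned):
                    continue
                leg.append(i)
                unassigned.discard(i)

        grow(r_min, only_unassigned=True)        # step 1
        if len(leg) < n_min:
            grow(r_max, only_unassigned=True)    # step 2
        if len(leg) < n_min:
            grow(r_max, only_unassigned=False)   # step 3
        legs.append(leg)
    return legs
```

The loop terminates because every iteration removes at least e0 from the set of unassigned electrodes, and step 3 is the only place where an electrode can become a member of more than one LEG.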
Sorting Performance as a Function of LEG Distance
Since we split the array of recording electrodes somewhat arbitrarily into LEGs, a neuron might sit exactly on the border between two LEGs. Therefore, the question arose of whether the location of a neuron with respect to the LEGs affected the sorting performance. To answer this question, we investigated whether the sorting performance of individual neurons correlated with their relative position with respect to each LEG. We specifically wanted to investigate at what distance a neuron will be lost and whether this distance would be within the area of an LEG. We selected those artificial units that were detected with high sensitivity and precision in at least one LEG. Each of those artificial units was then matched to the best-matching sorted unit (before the duplicate resolution!) in all LEGs. In this way we could estimate how far away from a neuron it was still possible to sort it with good performance. The matching was performed with the same method as described in Performance Assessment in results, and the same performance metrics were computed. Finally, we determined the distance between the location of the electrode on which the signal amplitude of the artificial unit was maximal and the “center of mass” of the LEG.
Figure A1 shows that, below 43.5 µm (~2.5 × the electrode pitch), the sorting performance did not depend on the distance. Furthermore, all the artificial units were within this distance relative to the LEGs in which they were found. This means that the sorting performance of a neuron did not decrease if it was located at the edge of an LEG compared to the center. Therefore, the partitioning of the electrode array into LEGs should be arranged such that each neuron is <43.5 µm apart from the center of its closest LEG, which was the case in our investigations and for our algorithm.
Noise Covariance Matrix
We computed the noise covariance matrix C in the event space by first computing the noise cross-correlation function, ρi,j(τ), with a maximum lag τmax of 3.75 ms between all pairs of electrodes i,j within an LEG. The noise covariance matrix C was then constructed as

$$C = \begin{pmatrix} C_{1,1} & \cdots & C_{1,N} \\ \vdots & \ddots & \vdots \\ C_{N,1} & \cdots & C_{N,N} \end{pmatrix}$$

where N is the number of electrodes in the LEG and Ci,j is a τmax fs × τmax fs Toeplitz matrix, constructed from ρi,j(τ), as described in Pouzat et al. (2002). To guarantee that C was invertible, we added to all diagonal elements 0.1 times the trace of C.
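The assembly of C from the cross-correlation functions can be sketched as below. The cross-correlation estimator, the sign convention for the lag, and the function names are assumptions of this sketch; the diagonal loading follows a literal reading of the sentence above, and the normalization in the authors' implementation may differ.

```python
import numpy as np

def noise_xcorr(x, y, n_lags):
    """Estimate rho_xy(tau) = mean(x[t] * y[t + tau]).

    Returns values for tau = -(n_lags - 1) .. (n_lags - 1).
    """
    n = len(x)
    out = []
    for tau in range(-(n_lags - 1), n_lags):
        if tau >= 0:
            out.append(np.dot(x[:n - tau], y[tau:]) / n)
        else:
            out.append(np.dot(x[-tau:], y[:n + tau]) / n)
    return np.array(out)

def noise_covariance(noise, n_lags, diag_load=0.1):
    """Block matrix of Toeplitz blocks from an (n_electrodes, n_samples) array.

    n_lags corresponds to tau_max * fs (number of samples per waveform).
    """
    n_el = noise.shape[0]
    idx = np.arange(n_lags)
    # Entry (a, b) of each block holds rho(b - a); offset into the lag axis.
    lag_index = idx[None, :] - idx[:, None] + (n_lags - 1)
    C = np.zeros((n_el * n_lags, n_el * n_lags))
    for i in range(n_el):
        for j in range(n_el):
            rho = noise_xcorr(noise[i], noise[j], n_lags)
            C[i * n_lags:(i + 1) * n_lags,
              j * n_lags:(j + 1) * n_lags] = rho[lag_index]
    # Diagonal loading: add diag_load * trace(C) to every diagonal element.
    C[np.diag_indices_from(C)] += diag_load * np.trace(C)
    return C
```

The input `noise` is meant to contain only spike-free stretches of the filtered recording, so that the estimate reflects background noise rather than spike waveforms.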
Mean-Shift
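Mean squared Euclidean distance between data points x and cluster mean for k-dimensional cluster with mean μ and covariance matrix C:

$$\mathrm{E}\left[\lVert x - \mu \rVert^2\right] = \operatorname{tr}(C)$$

For prewhitened data, for which C is the k × k identity matrix, this expression equals k.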
Template Matching
The template matching used in this report was a simplified version of the Bayes optimal template matching presented in Franke (2011) and Franke et al. (2015b). For each template n we computed the discriminant function, Dn(t), based on its waveform ξn, the inverse of the noise covariance matrix C, and the input signal X(t):
In Franke et al. (2015a), these discriminant functions were computed continuously for the entire recording, and spike detection and classification were performed simultaneously. Whenever a Dn(t) crossed a threshold, a spike event was detected. This spike was immediately assigned to the neuron whose discriminant function produced the largest peak within a short time window of the detection. For performance reasons, we simplified the process here by applying the matched filters only to the previously detected spike waveforms. The template matching was therefore only used for spike classification, not for spike detection, and we did not attempt to resolve overlapping spikes. This made it unnecessary to align all spikes, and only spikes used for clustering were upsampled and aligned. The classification principle remained the same: a spike was assigned to the neuronal unit n for which the amplitude of Dn(t) was maximal.
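The classification of cut-out spike waveforms can be illustrated with the sketch below. It assumes a discriminant of the common linear form ξnᵀC⁻¹x − ½ ξnᵀC⁻¹ξn without a prior term and evaluates it at a single alignment per spike; this form, the function name, and the array layout are assumptions and may differ from the discriminant used in the actual implementation.

```python
import numpy as np

def classify_spikes(waveforms, templates, noise_cov):
    """Assign each waveform to the template with the largest discriminant.

    waveforms: (n_spikes, n_dims) concatenated multielectrode waveforms.
    templates: (n_templates, n_dims) array in the same layout.
    Returns (labels, scores) with scores of shape (n_spikes, n_templates).
    """
    # Matched filters C^-1 xi_n, one per column.
    filters = np.linalg.solve(noise_cov, templates.T)
    # Energy term 1/2 * xi_n^T C^-1 xi_n for each template.
    energy = 0.5 * np.sum(templates.T * filters, axis=0)
    scores = waveforms @ filters - energy
    return np.argmax(scores, axis=1), scores
```

The energy term keeps large-amplitude templates from winning merely because their matched-filter output is large; without it, a scaled-up template would always score higher than a small one.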
Duplicate Resolution
Here we describe in detail how we compared the templates of two duplicate candidates in the duplicate resolution step. The three steps were taken in order, each time decreasing the number of candidates and thus reducing the computational efforts of the subsequent processing:
Contributing electrodes.
Contributing electrodes En in a template ξn were those electrodes where the signal amplitude was >30% of the largest amplitude of the template. We compared the relative number of shared contributing electrodes between the candidates n and m (with respect to the smaller number of contributing electrodes of either n or m). When this fraction was <80%, the candidates were not considered as duplicates and were thus eliminated from the process:

$$\frac{|E_n \cap E_m|}{\min\left(|E_n|, |E_m|\right)} < 0.8$$
Template correlation.
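The template correlation, cn,m, was computed on overlapping contributing electrodes, En,m, only. When cn,m was smaller than 85%, the candidates were eliminated from the process (ξ̄n is a vector of the same dimensionality as ξn whose elements all equal the mean over all elements of ξn):

$$c_{n,m} = \frac{(\xi_n - \bar{\xi}_n)^T (\xi_m - \bar{\xi}_m)}{\lVert \xi_n - \bar{\xi}_n \rVert \, \lVert \xi_m - \bar{\xi}_m \rVert}$$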
Template distance.
When the maximal distance dn,m between waveforms (again on overlapping contributing electrodes En,m only) was >60%, the candidates were eliminated from the process (the function “max” returned the largest element of the input vectors):
The candidate pairs that remained after these three steps were then considered possible duplicates before we compared their spike trains (see results).
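The three screening steps can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code used in this report: the per-electrode amplitude is taken as the largest absolute value on that electrode, the correlation is taken as a mean-subtracted (Pearson) correlation, and the distance is normalized by the larger of the two peak amplitudes. The normalization of the distance in particular is a guess. The function names are hypothetical.

```python
import numpy as np

def contributing_electrodes(template, rel_threshold=0.30):
    """Indices of electrodes whose amplitude exceeds 30% of the template's largest amplitude.

    template : (n_electrodes, n_samples) array
    """
    amplitude = np.max(np.abs(template), axis=1)   # assumed amplitude definition
    return set(np.flatnonzero(amplitude > rel_threshold * amplitude.max()).tolist())

def is_duplicate_candidate(t_n, t_m, min_shared=0.80, min_corr=0.85, max_dist=0.60):
    """Return True if the template pair survives all three screening steps."""
    e_n, e_m = contributing_electrodes(t_n), contributing_electrodes(t_m)
    if not e_n or not e_m:
        return False
    shared = sorted(e_n & e_m)

    # Step 1: fraction of shared contributing electrodes, relative to the smaller set.
    if len(shared) / min(len(e_n), len(e_m)) < min_shared:
        return False

    # Steps 2 and 3 use the overlapping contributing electrodes only.
    x, y = t_n[shared].ravel(), t_m[shared].ravel()

    # Step 2: mean-subtracted correlation (assumed Pearson form).
    xc, yc = x - x.mean(), y - y.mean()
    corr = float(xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc)))
    if corr < min_corr:
        return False

    # Step 3: largest sample-wise difference, normalized by the larger peak
    # amplitude (assumed normalization).
    dist = float(np.max(np.abs(x - y)) / max(np.max(np.abs(x)), np.max(np.abs(y))))
    return dist <= max_dist
```

Ordering the checks from cheapest to most expensive means that most pairs are rejected by the set comparison before any waveform arithmetic is done.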
Spike Amplitude Distribution
The distribution of the template multiplication factor, αi, was estimated such that the inserted spikes showed an amplitude distribution similar to that of the recorded spikes. Because we inserted the spikes into a noisy signal, we could not simply use the standard deviation of the recorded amplitudes â for αi; instead, we had to estimate the intrinsic standard deviation σa of the neurons without the added noise. The multiplication factor was normalized with respect to the mean amplitude and given by αi ~ N(1, σa²/μa²). Here we describe how we estimated μa and σa.

We assumed that the observed spike amplitude â was the sum of the intrinsic amplitude a and a noise component ϵ. We assumed a and ϵ to be independent and normally distributed, a ~ N(μa, σa²) and ϵ ~ N(0, σϵ²). From this it followed that â was also normally distributed, â ~ N(μa, σa² + σϵ²). Therefore

$$\mu_a = \mu_{\hat{a}}, \qquad \sigma_a^2 = \sigma_{\hat{a}}^2 - \sigma_\epsilon^2$$

We expressed the waveform wi of spike i with the template ξ, intrinsic amplitude ai, and noise ni:

$$w_i = a_i \xi + n_i$$

The ni followed a normal distribution with noise covariance matrix C:

$$n_i \sim \mathcal{N}(0, C)$$

The amplitude âi could be obtained by projecting wi onto ξ and normalizing with ‖ξ‖² = ξᵀξ:

$$\hat{a}_i = \frac{\xi^T w_i}{\xi^T \xi} = a_i + \frac{\xi^T n_i}{\xi^T \xi} = a_i + \epsilon_i$$

where ϵi was the projection of the noise vector ni onto the template. Its distribution could be expressed as

$$\epsilon_i \sim \mathcal{N}(0, \sigma_\epsilon^2)$$

and

$$\sigma_\epsilon^2 = \frac{\xi^T C \xi}{(\xi^T \xi)^2}$$

We obtained σâ = 0.12 and μâ = 1.0 by fitting a Gaussian curve to the spike amplitude histogram of sorted units. From this fit we obtained the standard deviation of the multiplication factor α as σα = σa/μa = √(σâ² − σϵ²)/μâ.
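The noise correction can be checked numerically with a short sketch. It assumes zero-mean noise that is independent of the intrinsic amplitude, so that the variance of the projected noise is ξᵀCξ/(ξᵀξ)² and the intrinsic variance is the observed variance minus this term. The function name and the example template and covariance are hypothetical and not taken from the recordings.

```python
import numpy as np

def multiplication_factor_sd(template, noise_cov, sigma_obs, mu_obs=1.0):
    """Standard deviation of the amplitude factor after removing the noise contribution.

    template  : 1-D array, concatenated multi-electrode template xi
    noise_cov : (d, d) noise covariance matrix C
    sigma_obs : standard deviation of the observed (noisy) amplitudes
    mu_obs    : mean of the observed amplitudes
    """
    xi = np.asarray(template, dtype=float).ravel()
    norm_sq = xi @ xi
    # Variance of the noise projected onto the template and normalized by xi^T xi.
    var_eps = xi @ noise_cov @ xi / norm_sq**2
    var_intrinsic = sigma_obs**2 - var_eps
    if var_intrinsic < 0:
        raise ValueError("projected noise variance exceeds the observed amplitude variance")
    return float(np.sqrt(var_intrinsic) / mu_obs)

# Illustrative values only: a two-sample template with white noise of variance 0.25.
sd_alpha = multiplication_factor_sd([3.0, 4.0], 0.25 * np.eye(2), sigma_obs=0.12)
rng = np.random.default_rng(0)
alphas = rng.normal(loc=1.0, scale=sd_alpha, size=1000)  # factors for inserted spikes
```

The explicit check for a negative variance matters in practice: for small templates the projected noise can account for all of the observed amplitude spread, in which case no intrinsic variability can be estimated.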
Performance Assessment
For the pairwise comparison of spike trains, we computed their cross-correlation function for small temporal lags (1 ms) and aligned them to its peak. We then counted the number of spikes that were present in both spike trains (true positive, TP) within a window of 1 ms. Three different cases were counted as errors (Fig. A2): spikes not present in the artificial unit that were found in the matched unit (false positive, FP); true spikes in the artificial unit that were not found at all (false negative, FN); and true spikes in the artificial unit that were assigned to a different unit (false classification, FC).
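A minimal sketch of such a pairwise comparison is given below. It assumes spike times in milliseconds, a cross-correlogram bin width of 0.05 ms, and a one-to-one matching of spikes. These choices and all function names are illustrative assumptions. Separating unmatched true spikes into FN and FC additionally requires the spike trains of all other sorted units, so the sketch only reports the unmatched counts.

```python
import numpy as np

def best_lag(true_ms, found_ms, max_lag=1.0, bin_width=0.05):
    """Lag (ms) at the peak of the cross-correlogram, restricted to +-max_lag."""
    # All pairwise time differences; acceptable for a sketch, quadratic in memory.
    diffs = (found_ms[:, None] - true_ms[None, :]).ravel()
    diffs = diffs[np.abs(diffs) <= max_lag]
    if diffs.size == 0:
        return 0.0
    n_bins = int(round(2 * max_lag / bin_width))
    hist, edges = np.histogram(diffs, bins=np.linspace(-max_lag, max_lag, n_bins + 1))
    k = int(np.argmax(hist))
    return float(0.5 * (edges[k] + edges[k + 1]))

def compare_spike_trains(true_ms, found_ms, window=1.0):
    """Align two spike trains and count spikes that coincide within +-window ms."""
    true_ms = np.sort(np.asarray(true_ms, dtype=float))
    found_ms = np.sort(np.asarray(found_ms, dtype=float))
    lag = best_lag(true_ms, found_ms)
    shifted = found_ms - lag
    i = j = tp = 0
    # Two-pointer sweep over both sorted trains; each spike is matched at most once.
    while i < len(true_ms) and j < len(shifted):
        diff = shifted[j] - true_ms[i]
        if abs(diff) <= window:
            tp += 1
            i += 1
            j += 1
        elif diff < 0:
            j += 1
        else:
            i += 1
    return {
        "lag_ms": lag,
        "tp": tp,
        "unmatched_true": len(true_ms) - tp,    # FN or FC
        "unmatched_found": len(found_ms) - tp,  # FP
    }
```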
We calculated three metrics for the sorting performance: sensitivity, precision, and error rate.
Sensitivity.
As a metric for sensitivity, we calculated the true positive rate (TPR), which is the ratio between TP and the number of true spikes of the artificial unit:

$$\mathrm{TPR} = \frac{TP}{TP + FN + FC}$$
Precision.
Since we recorded the activity of hundreds of real neurons in our recordings, we had for any given unit a very large number of true negative (TN) spikes, i.e., all spikes of all other neurons in the entire experiment. It was therefore not useful to calculate the true negative rate (TNR) as a metric for specificity. We used the positive predictive value (PPV) instead, which is the ratio between TP and the number of detected spikes:

$$\mathrm{PPV} = \frac{TP}{TP + FP}$$
Error rate.
Error rate is the total number of errors divided by the total number of inserted spikes:

$$\mathrm{Error\ rate} = \frac{FP + FN + FC}{TP + FN + FC}$$
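The three metrics can be computed from the four spike counts as in the following sketch. It assumes that the true (inserted) spikes of a unit are TP + FN + FC and that the detected spikes of the matched unit are TP + FP, which is our reading of the definitions above. The function name is hypothetical.

```python
def sorting_metrics(tp, fp, fn, fc):
    """Sensitivity (TPR), precision (PPV), and error rate from spike counts."""
    n_true = tp + fn + fc   # all inserted spikes of the artificial unit
    n_detected = tp + fp    # all spikes of the matched sorted unit
    nan = float("nan")
    return {
        "tpr": tp / n_true if n_true else nan,
        "ppv": tp / n_detected if n_detected else nan,
        "error_rate": (fp + fn + fc) / n_true if n_true else nan,
    }
```

Under this definition the error rate can exceed 1 when a matched unit contains many false positives, because FP is not bounded by the number of inserted spikes.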
Tables A2 and A3 show the total numbers of units sorted in the original recordings, the number of sorted units in the benchmarking data set, the number of artificial units in total, and the number of found and lost units.
Runtime Estimation
Figure A3 shows the relationship between the runtime of a parallel process and the number of spikes in its LEG. Because the mean-shift algorithm scales with the square of the number of spikes, we fitted a quadratic function to the runtime values for LEGs with up to 50,000 spikes. We clustered a maximum of 50,000 spikes within each LEG, which had the effect that the quadratic trend did not continue beyond 50,000 spikes and the runtime increased linearly instead. The linear fit predicts a slope of ~10-min runtime per 100,000 spikes, if the number of parallel computing nodes stays fixed.
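The two-regime fit can be sketched as follows. The sketch assumes spike counts expressed in thousands (to keep the polynomial fit well conditioned) and a cap of 50 (i.e., 50,000 spikes) separating the quadratic from the linear regime. The function name is hypothetical and the data in the usage example are synthetic, not measured runtimes.

```python
import numpy as np

def fit_runtime(n_spikes_k, runtime_min, cap_k=50.0):
    """Fit a quadratic below the clustering cap and a line above it.

    n_spikes_k  : spikes per LEG, in thousands
    runtime_min : runtime per LEG, in minutes
    Returns (quadratic coefficients, linear coefficients), highest power first.
    """
    n = np.asarray(n_spikes_k, dtype=float)
    t = np.asarray(runtime_min, dtype=float)
    below = n <= cap_k
    quad = np.polyfit(n[below], t[below], deg=2)   # clustering-dominated regime
    lin = np.polyfit(n[~below], t[~below], deg=1)  # classification-dominated regime
    return quad, lin

# Synthetic example: quadratic growth up to the cap, linear growth beyond it.
n_k = np.array([5.0, 10.0, 20.0, 30.0, 40.0, 50.0, 100.0, 200.0, 400.0])
t_min = np.where(n_k <= 50.0, 0.002 * n_k**2 + 0.1, 0.1 * n_k + 0.1)
quad_coeffs, lin_coeffs = fit_runtime(n_k, t_min)
```

The slope of the linear fit, `lin_coeffs[0]`, is the quantity that corresponds to the runtime per additional spike reported in the text, here in minutes per thousand spikes.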
For the two data sets reported in Table 5, we measured the total sorting duration using a workstation with 16 CPUs (Intel Xeon CPU E5-2690 0 at 2.90 GHz) and 64 GB of RAM. In contrast to the runtimes reported in Runtime estimation in results, the following numbers also include the time required for file access, data decompression, and filtering, i.e., all the processes that depend strongly on the available data storage infrastructure, network speed, and data format. We parallelized our algorithm with the MATLAB parfor functionality using 16 workers (or nodes). All files were stored on a shared network file system. The raw data of the 20-min data set were stored in uncompressed binary files, amounting to a total of 27 GB, which resulted in a total runtime of 85.4 min. The raw data of the 63-min data set were stored in HDF5 files with compression level 1, amounting to a total of 60 GB, which resulted in a total runtime of 308 min.
The total runtimes for the comparison data sets in Comparison to other spike sorters in results are reported in Table A4.
Fig. A1.
Effect of the distance between artificial units and the local electrode groups (LEGs) in which they were detected. A–C: sorting performances (as shown in Fig. 4) as a function of the distance between a unit and an LEG in which it was found. D: distribution of the distances between the artificial units that were classified as found and the LEGs in which they were detected. E: the location of an LEG was defined as the “center of mass” of all electrodes in the LEG; the location of an artificial unit was the electrode where the template had its maximal amplitude.
Fig. A2.

Categorization of spikes in a pairwise spike train comparison. Spike trains were aligned with respect to the peak of their cross-correlation functions. Two spikes coinciding within 1 ms were considered true positive (TP). We considered 3 types of errors: false negative (FN), false positive (FP), and false classification (FC).
Fig. A3.

Runtime of parallel processes in relation to the number of spikes in a local electrode group (LEG). Dashed line: maximum no. of spikes used for feature selection and clustering. Red line: fitted quadratic function. Blue line: fitted linear function.
Table A1.
Values of parameters used in this evaluation
| Parameter | Value | Remark |
|---|---|---|
| Nmin | 1 | Minimal number of electrodes per LEG |
| Nmax | 9 | Maximal number of electrodes per LEG |
| Rmin | 20 µm | Minimal radius for addition of electrode to LEG |
| Rmax | 52 µm | Maximal radius for addition of electrode to LEG |
LEG, local electrode group.
Table A2.
Number of units across trials for spike amplitude evaluation
| Trial | Sorted Units | Artificial Units | Found | Lost |
|---|---|---|---|---|
| 1 | 571 | 20 | 20 | 0 |
| 2 | 768 | 20 | 18 | 2 |
| 3 | 599 | 20 | 19 | 1 |
| 4 | 899 | 20 | 19 | 1 |
| 5 | 527 | 20 | 15 | 5 |
| 6 | 628 | 20 | 18 | 2 |
| 7 | 553 | 20 | 20 | 0 |
| 8 | 764 | 20 | 18 | 2 |
| 9 | 612 | 20 | 15 | 5 |
| 10 | 926 | 20 | 18 | 2 |
| 11 | 536 | 20 | 16 | 4 |
| 12 | 646 | 20 | 18 | 2 |
| Total | 8,029 | 240 | 214 | 26 |
Table A3.
Number of units across trials for spike rate evaluation
| Trial | Sorted Units | Artificial Units | Found | Lost |
|---|---|---|---|---|
| 1 | 544 | 20 | 18 | 2 |
| 2 | 703 | 20 | 18 | 2 |
| 3 | 571 | 20 | 17 | 3 |
| 4 | 894 | 20 | 16 | 4 |
| 5 | 495 | 20 | 17 | 3 |
| 6 | 593 | 20 | 17 | 3 |
| 7 | 565 | 20 | 19 | 1 |
| 8 | 721 | 20 | 18 | 2 |
| 9 | 583 | 20 | 19 | 1 |
| 10 | 887 | 20 | 16 | 4 |
| 11 | 533 | 20 | 14 | 6 |
| 12 | 610 | 20 | 16 | 4 |
| Total | 7,699 | 240 | 205 | 35 |
Table A4.
Runtimes for spike sorting comparison data sets
| Comparison Data Set | Data Set 1 | Data Set 2 | Data Set 3 | Data Set 4 | Data Set 5 | Data Set 6 |
|---|---|---|---|---|---|---|
| Recording duration, min | 45.7 | 82.9 | 72.3 | 82.9 | 72.3 | 45.7 |
| Mean no. of spikes per LEG (×10³) | 15.0 | 27.1 | 391.1 | 619.6 | 390.8 | 355.1 |
| Total runtime (parfor, 16 nodes), min | 18.1 | 109.4 | 70.4 | 135.1 | 69.6 | 79.7 |
LEG, local electrode group.
REFERENCES
- Berdondini L, Imfeld K, Maccione A, Tedesco M, Neukom S, Koudelka-Hep M, Martinoia S. Active pixel sensor array for high spatio-temporal resolution electrophysiological recordings from single cell to large scale neuronal networks. Lab Chip 9: 2644–2651, 2009. doi: 10.1039/b907394a.
- Bertotti G, Velychko D, Dodel N, Keil S, Wolansky D, Tillak B, Schreiter M, Grall A, Jesinger P, Rohler S, Eickenscheidt M, Stett A, Moller A, Boven K-H, Zeck G, Thewes R. A CMOS-based sensor array for in-vitro neural tissue interfacing with 4225 recording sites and 1024 stimulation sites. IEEE Biomedical Circuits and Systems Conference (BioCAS) Proceedings 2014: 304–307, 2014.
- Bishop CM. Pattern Recognition and Machine Learning. New York: Springer, 2007.
- Chung JE, Magland JF, Barnett AH, Tolosa VM, Tooker AC, Lee KY, Shah KG, Felix SH, Frank LM, Greengard LF. A fully automated approach to spike sorting. Neuron 95: 1381–1394.e6, 2017. doi: 10.1016/j.neuron.2017.08.030.
- Dragas J, Jackel D, Hierlemann A, Franke F. Complexity optimization and high-throughput low-latency hardware implementation of a multi-electrode spike-sorting algorithm. IEEE Trans Neural Syst Rehabil Eng 23: 149–158, 2015. doi: 10.1109/TNSRE.2014.2370510.
- Einevoll GT, Franke F, Hagen E, Pouzat C, Harris KD. Towards reliable spike-train recordings from thousands of neurons with multielectrodes. Curr Opin Neurobiol 22: 11–17, 2012. doi: 10.1016/j.conb.2011.10.001.
- Fee MS, Mitra PP, Kleinfeld D. Automatic sorting of multiple unit neuronal signals in the presence of anisotropic and non-Gaussian variability. J Neurosci Methods 69: 175–188, 1996. doi: 10.1016/S0165-0270(96)00050-7.
- Fiscella M, Farrow K, Jones IL, Jäckel D, Müller J, Frey U, Bakkum DJ, Hantz P, Roska B, Hierlemann A. Recording from defined populations of retinal ganglion cells using a high-density CMOS-integrated microelectrode array with real-time switchable electrode selection. J Neurosci Methods 211: 103–113, 2012. doi: 10.1016/j.jneumeth.2012.08.017.
- Fournier J, Mueller CM, Shein-Idelson M, Hemberger M, Laurent G. Consensus-based sorting of neuronal spike waveforms. PLoS One 11: e0160494, 2016. doi: 10.1371/journal.pone.0160494.
- Franke F. Real-Time Analysis of Extracellular Multielectrode Recordings (PhD thesis). Berlin: Technische Universität Berlin, 2011.
- Franke F, Jäckel D, Dragas J, Müller J, Radivojevic M, Bakkum D, Hierlemann A. High-density microelectrode array recordings and real-time spike sorting for closed-loop experiments: an emerging technology to study neural plasticity. Front Neural Circuits 6: 105, 2012. doi: 10.3389/fncir.2012.00105.
- Franke F, Natora M, Boucsein C, Munk MH, Obermayer K. An online spike detection and spike classification algorithm capable of instantaneous resolution of overlapping spikes. J Comput Neurosci 29: 127–148, 2010. doi: 10.1007/s10827-009-0163-5.
- Franke F, Pröpper R, Alle H, Meier P, Geiger JR, Obermayer K, Munk MH. Spike sorting of synchronous spikes from local neuron ensembles. J Neurophysiol 114: 2535–2549, 2015a. doi: 10.1152/jn.00993.2014.
- Franke F, Quian Quiroga R, Hierlemann A, Obermayer K. Bayes optimal template matching for spike sorting—combining Fisher discriminant analysis with optimal filtering. J Comput Neurosci 38: 439–459, 2015b. [Erratum in J Comput Neurosci 38: 461, 2015.] doi: 10.1007/s10827-015-0547-7.
- Frey U, Sedivy J, Heer F, Pedron R, Ballini M, Müller J, Bakkum D, Hafizovic S, Faraci FD, Greve F, Kirstein K-U, Hierlemann A. Switch-matrix-based high-density microelectrode array in CMOS technology. IEEE J Solid-State Circuits 45: 467–482, 2010. doi: 10.1109/JSSC.2009.2035196.
- Fukunaga K, Hostetler L. The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans Inf Theory 21: 32–40, 1975. doi: 10.1109/TIT.1975.1055330.
- Hagen E, Ness TV, Khosrowshahi A, Sørensen C, Fyhn M, Hafting T, Franke F, Einevoll GT. ViSAPy: a Python tool for biophysics-based generation of virtual spiking activity for evaluation of spike-sorting algorithms. J Neurosci Methods 245: 182–204, 2015. doi: 10.1016/j.jneumeth.2015.01.029.
- Hill DN, Mehta SB, Kleinfeld D. Quality metrics to accompany spike sorting of extracellular signals. J Neurosci 31: 8699–8705, 2011. doi: 10.1523/JNEUROSCI.0971-11.2011.
- Hillier D, Fiscella M, Drinnenberg A, Trenholm S, Rompani SB, Raics Z, Katona G, Juettner J, Hierlemann A, Rozsa B, Roska B. Causal evidence for retina-dependent and -independent visual motion computations in mouse cortex. Nat Neurosci 20: 960–968, 2017. doi: 10.1038/nn.4566.
- Jäckel D, Frey U, Fiscella M, Franke F, Hierlemann A. Applicability of independent component analysis on high-density microelectrode array recordings. J Neurophysiol 108: 334–348, 2012. doi: 10.1152/jn.01106.2011.
- Johnson LJ, Cohen E, Ilg D, Klein R, Skeath P, Scribner DA. A novel high electrode count spike recording array using an 81,920 pixel transimpedance amplifier-based imaging chip. J Neurosci Methods 205: 223–232, 2012. doi: 10.1016/j.jneumeth.2012.01.003.
- Jun JJ, Mitelut C, Lai C, Gratiy S, Anastassiou C, Harris TD. Real-time spike sorting platform for high-density extracellular probes with ground-truth validation and drift correction (Preprint). bioRxiv 101030, 2017. doi: 10.1101/101030.
- Lambacher A, Vitzthum V, Zeitler R, Eickenscheidt M, Eversmann B, Thewes R, Fromherz P. Identifying firing mammalian neurons in networks with high-resolution multi-transistor array (MTA). Appl Phys A 102: 1–11, 2011. doi: 10.1007/s00339-010-6046-9.
- Lewicki MS. A review of methods for spike sorting: the detection and classification of neural action potentials. Network 9: R53–R78, 1998. doi: 10.1088/0954-898X_9_4_001.
- Marre O, Amodei D, Deshmukh N, Sadeghi K, Soo F, Holy TE, Berry MJ 2nd. Mapping a complete neural population in the retina. J Neurosci 32: 14859–14873, 2012. doi: 10.1523/JNEUROSCI.0723-12.2012.
- Meister M, Pine J, Baylor DA. Multi-neuronal signals from the retina: acquisition and analysis. J Neurosci Methods 51: 95–106, 1994. doi: 10.1016/0165-0270(94)90030-2.
- Müller J, Ballini M, Livi P, Chen Y, Radivojevic M, Shadmani A, Viswam V, Jones IL, Fiscella M, Diggelmann R, Stettler A, Frey U, Bakkum DJ, Hierlemann A. High-resolution CMOS MEA platform to study neurons at subcellular, cellular, and network levels. Lab Chip 15: 2767–2780, 2015. doi: 10.1039/C5LC00133A.
- Müller J, Ballini M, Livi P, Chen Y, Shadmani A, Frey U, Jones IL, Fiscella M, Radivojevic M, Bakkum DJ, Stettler A, Heer F, Hierlemann A. Conferring flexibility and reconfigurability to a 26,400 microelectrode CMOS array for high throughput neural recordings. In: 2013 Transducers and Eurosensors XXVII: The 17th International Conference on Solid-State Sensors, Actuators and Microsystems (TRANSDUCERS and EUROSENSORS 2013). New York: IEEE, 2013, p. 744–747.
- Muthmann JO, Amin H, Sernagor E, Maccione A, Panas D, Berdondini L, Bhalla US, Hennig MH. Spike detection for large neural populations using high density multielectrode arrays. Front Neuroinform 9: 28, 2015. doi: 10.3389/fninf.2015.00028.
- Obien ME, Gong W, Frey U, Bakkum DJ. CMOS-based high-density microelectrode arrays: technology and applications. In: Emerging Trends in Neuro Engineering and Neural Computation, edited by Bhatti A, Lee KH, Garmestani H, Lim CP. Singapore: Springer Singapore, 2017, p. 3–39. doi: 10.1007/978-981-10-3957-7_1.
- Pachitariu M, Steinmetz N, Kadir S, Carandini M, Harris KD. Kilosort: realtime spike-sorting for extracellular electrophysiology with hundreds of channels (Preprint). bioRxiv 061481, 2016. doi: 10.1101/061481.
- Pillow JW, Shlens J, Chichilnisky EJ, Simoncelli EP. A model-based spike sorting algorithm for removing correlation artifacts in multi-neuron recordings. PLoS One 8: e62123, 2013. doi: 10.1371/journal.pone.0062123.
- Pouzat C, Mazor O, Laurent G. Using noise signature to optimize spike-sorting and to assess neuronal classification quality. J Neurosci Methods 122: 43–57, 2002. doi: 10.1016/S0165-0270(02)00276-5.
- Prentice JS, Homann J, Simmons KD, Tkačik G, Balasubramanian V, Nelson PC. Fast, scalable, Bayesian spike identification for multi-electrode arrays. PLoS One 6: e19884, 2011. doi: 10.1371/journal.pone.0019884.
- Rodriguez A, Laio A. Clustering by fast search and find of density peaks. Science 344: 1492–1496, 2014. doi: 10.1126/science.1242072.
- Rossant C, Kadir SN, Goodman DF, Schulman J, Hunter ML, Saleem AB, Grosmark A, Belluscio M, Denfield GH, Ecker AS, Tolias AS, Solomon S, Buzsáki G, Carandini M, Harris KD. Spike sorting for large, dense electrode arrays. Nat Neurosci 19: 634–641, 2016. doi: 10.1038/nn.4268.
- Rutishauser U, Schuman EM, Mamelak AN. Online detection and sorting of extracellularly recorded action potentials in human medial temporal lobe recordings, in vivo. J Neurosci Methods 154: 204–224, 2006. doi: 10.1016/j.jneumeth.2005.12.033.
- Schmidt EM. Computer separation of multi-unit neuroelectric data: a review. J Neurosci Methods 12: 95–111, 1984. doi: 10.1016/0165-0270(84)90009-8.
- Shabnam K. Global Superclustering (Online). 2016. https://github.com/kwikteam/global_superclustering [10 July 2018].
- Steinmetz N. Sorting Comparison Results (Online). 2016. http://phy.cortexlab.net/data/sortingComparison/ [16 March 2018].
- Swindale NV, Spacek MA. Spike sorting for polytrodes: a divide and conquer approach. Front Syst Neurosci 8: 6, 2014. doi: 10.3389/fnsys.2014.00006.
- Viswam V, Dragas J, Shadmani A, Chen Y, Stettler A, Müller J, Hierlemann A. Multi-functional microelectrode array system featuring 59,760 electrodes, 2048 electrophysiology channels, impedance and neurotransmitter measurement units. In: 2016 IEEE International Solid-State Circuits Conference (ISSCC). New York: IEEE, 2016, p. 394–396.
- Wood F, Black MJ, Vargas-Irwin C, Fellows M, Donoghue JP. On the variability of manual spike sorting. IEEE Trans Biomed Eng 51: 912–918, 2004. doi: 10.1109/TBME.2004.826677.
- Yger P, Spampinato GL, Esposito E, Lefebvre B, Deny S, Gardella C, Stimberg M, Jetter F, Zeck G, Picaud S, Duebel J, Marre O. A spike sorting toolbox for up to thousands of electrodes validated with ground truth recordings in vitro and in vivo. eLife 7: e34518, 2018. doi: 10.7554/eLife.34518.
- Yonehara K, Fiscella M, Drinnenberg A, Esposti F, Trenholm S, Krol J, Franke F, Scherf BG, Kusnyerik A, Müller J, Szabo A, Jüttner J, Cordoba F, Reddy AP, Németh J, Nagy ZZ, Munier F, Hierlemann A, Roska B. Congenital nystagmus gene FRMD7 is necessary for establishing a neuronal circuit asymmetry for direction selectivity. Neuron 89: 177–193, 2016. doi: 10.1016/j.neuron.2015.11.032.






