2016 Jun 2;10:248. doi: 10.3389/fnins.2016.00248

Processing and Analysis of Multichannel Extracellular Neuronal Signals: State-of-the-Art and Challenges

Mufti Mahmud 1,*, Stefano Vassanelli 1,*
PMCID: PMC4889584  PMID: 27313507


In recent years, multichannel neuronal signal acquisition systems have allowed scientists to address research questions that were previously out of reach. They are a powerful means to study brain (dys)functions in in-vivo and in-vitro animal models. Typically, each session of electrophysiological experiments with multichannel data acquisition systems generates a large amount of raw data. For example, a 128 channel signal acquisition system with 16 bits A/D conversion and 20 kHz sampling rate will generate approximately 17 GB of data per hour (uncompressed). This poses an important and challenging problem of inferring conclusions from the large amounts of acquired data. Thus, automated signal processing and analysis tools are becoming a key component in neuroscience research, facilitating extraction of relevant information from neuronal recordings in a reasonable time. The purpose of this review is to introduce the reader to the current state-of-the-art of open-source packages for (semi)automated processing and analysis of multichannel extracellular neuronal signals (i.e., neuronal spikes, local field potentials, electroencephalogram, etc.), and the existing Neuroinformatics infrastructure for tool and data sharing. The review concludes by pinpointing some major challenges currently being faced, which include the development of novel benchmarking techniques, cloud-based distributed processing and analysis tools, as well as defining novel means to share and standardize data.

Keywords: neuroengineering, brain-machine interface, neuronal probes, neuronal signal, neuronal signal processing and analysis, neuronal activity, neuronal spikes, local field potentials

1. Introduction

The open question of the structure-function relationship has attracted a lot of interest in Systems Neuroscience. Recent works on anatomical substructures of the brain (Briggman and Denk, 2006; Mikula et al., 2012) promise to improve our understanding of neuronal network physiology and drive the development of novel applications of neurotechnology by interpreting the activities of large neuronal ensembles via extracellular methods (Buzsaki, 2004; Nicolelis and Lebedev, 2009).

On the other hand, neuronal signals recorded by means of neuronal probes require rigorous (pre)processing and analysis. In terms of technological advancement, the extracellular interfacing of neurons with artificial chip-based devices has taken a considerable leap forward, even in comparison with the very popular patch-clamp, EEG, and fMRI techniques (Vassanelli, 2011; Spira and Hai, 2013). In the last two decades, such advances have allowed neuroscientists to record neural activity simultaneously from many neurons, with up to thousands of recording sites in a single neuronal probe and at a temporal resolution from a few up to hundreds of kilohertz (kHz) (Buzsaki, 2004; Schröder et al., 2015).

The wide variety of electrode sizes and dimensions allows different types of neuronal signals to be recorded from the extracellular space. Single-unit activities (action potentials) from single neurons can be sensed by small electrodes in their close proximity (Buzsaki et al., 2012). These electrodes also pick up multi-unit activities from several simultaneously active neurons near the electrode (Einevoll et al., 2012). With increasing electrode dimensions, local field potentials (LFPs) are sensed from neighboring neuronal populations as the synchronous net activity of several hundreds to thousands of neurons (Tsytsarev et al., 2006; Vassanelli, 2011, 2014; Vassanelli et al., 2012; Khodagholy et al., 2015). Therefore, the neurophysiological signals from different brain structures can be measured using a wide range of techniques based on the dimensions of the electrodes (see Figure 1; Sejnowski et al., 2014).

Figure 1.


Spatiotemporal range of neurophysiological signal acquisition techniques. Spatiotemporal range of the main techniques to measure neurophysiological signals from the brain. EEG, electroencephalography; MEG, magnetoencephalography.

Also, the massive growth in the field of brain imaging techniques allowed scientists to image brain activities at very different scales, from imaging single ion-channels to the whole brain (for a review, see Freeman, 2015).

Recently developed neural probes have allowed neuroscientists to investigate neural processing by monitoring groups of neurons and their activation patterns at unprecedented resolution (Brown et al., 2004; Giocomo, 2015), thus also helping to bridge the gap between neuronal network activity and behavior (Berenyi et al., 2014). In addition, they have provided deep insights into the pathological basis of brain disorders (Friston et al., 2015). As a drawback, investigation of brain function and pathology can require massive data mining. For example, in an hour, a 128 channel signal acquisition system with 16 bits A/D conversion and 20 kHz sampling rate will generate approximately 17 GB of uncompressed data (Mahmud et al., 2014). Inferring meaningful conclusions from this massive amount of data is pivotal to the neuroscience and neuroengineering community (Mahmud et al., 2010a, 2012a), and tools for the analysis of such multichannel extracellular recordings that support rapid and accurate data interpretation are still missing (Stevenson and Kording, 2011). Although computing power has increased and costs have decreased, processing and analysis of signals have remained labor-intensive. This poses a huge challenge to computational neuroscientists: to develop tools for analyzing such complex data that are optimized for both memory management and processing time (Stevenson and Kording, 2011).
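The data volume quoted above follows directly from the acquisition parameters. As a quick check, the short Python sketch below reproduces the arithmetic; the function name is hypothetical and chosen only for illustration. Note that the product is about 18.4 GB in decimal units (10^9 bytes) and about 17.2 GiB in binary units (2^30 bytes), so the quoted figure of approximately 17 GB corresponds to the binary convention.

```python
def raw_data_volume_bytes(n_channels: int, bits_per_sample: int,
                          sampling_rate_hz: float, duration_s: float) -> float:
    """Uncompressed size of a multichannel recording, in bytes."""
    bytes_per_sample = bits_per_sample / 8
    return n_channels * bytes_per_sample * sampling_rate_hz * duration_s


GB = 10 ** 9   # decimal gigabyte
GIB = 2 ** 30  # binary gigabyte (gibibyte)

# 128 channels, 16-bit A/D conversion, 20 kHz sampling, one hour of recording
volume = raw_data_volume_bytes(128, 16, 20_000, 3600)
print(f"{volume / GB:.1f} GB (decimal), {volume / GIB:.1f} GiB (binary)")
# prints "18.4 GB (decimal), 17.2 GiB (binary)"
```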

Over the years, to make data handling and analysis fast, interactive, and user friendly, several software tools have been developed by individual laboratories, e.g., Mahmud et al. (2012a), but only a negligible number of them have been released to the community. In practice, a large number of analysis scripts are kept private, which reduces analysis transparency and hampers the reproducibility of analysis results (Schofield et al., 2009).

It has also been argued that the acquired data, despite being in digitized form, have been only minimally made publicly available for other scientists to explore and validate (Van Horn and Ball, 2008). To overcome this, in recent years, the community sees a growing need to have standardized and publicly available tools (Gardner et al., 2008; Akil et al., 2011) as well as experimental data repositories (Ascoli, 2006a; De Schutter, 2010). To this aim, a paradigm shift has been initiated by a set of laboratories to share their analysis tools through open-source licenses fostering standardization (Ince et al., 2010). Given the circumstances, distributed and cloud-based computing solutions have become an obvious and valuable option (Mahmud et al., 2014).

This review will introduce the readers to the available major open-source academic toolboxes for processing and analysis of neurophysiological signals acquired by means of multichannel probes, and the available infrastructure for sharing such tools and the experimental data. Also, some of the challenges and bottlenecks the community is currently facing will be identified and highlighted, and development perspectives which, in our opinion, will facilitate result reproducibility, flexibility, and standardization will be provided.

2. State-of-the-art

The state-of-the-art for processing and analysis of neurophysiological signals can be categorized based on signal types, i.e., electroencephalography or magnetoencephalography, (local) field potentials, and spikes. Though the majority of the toolboxes specialize in processing and analyzing one specific type of signal, a few provide rather comprehensive methods covering two or more signal types. Therefore, based on the signal types, we categorized the toolboxes into three broad categories:

  • Toolboxes for Electroencephalography (EEG) analysis;

  • Toolboxes for spike trains and field potentials analysis;

  • Toolboxes for spike sorting.

Most of the tools were developed mainly in the Matlab (Mathworks Inc., Natick, USA) and Python programming languages due to their widespread use in the neuroscience community. Other programming languages such as C, C++, R, Delphi7, and Java were also used to code parts of some packages.

2.1. Toolboxes for electroencephalography (EEG) analysis

In the last decade, various techniques have been developed and applied to EEG data analysis, and focused reviews on specific techniques have been reported (Pascual-Marqui et al., 2002; Stam, 2005; Hallez et al., 2007; Grech et al., 2008; Lenkov et al., 2013; de Cheveigné and Parra, 2014). Table 1 summarizes some of the popular open-source EEG analysis tools, which are described below, with their representative features.

Table 1.

Popular EEG processing and analysis toolboxes with their representative features.

| Toolbox | Lang | PF | GUI DV | DIE | AR | E/C L | SM | Representative method(s) | Representative measure/analysis |
|---|---|---|---|---|---|---|---|---|---|
| EEGLAB | Matlab | LUMW | Yes | Yes | Yes | Yes | Yes | ICA (Comon, 1994) | 1. ERSP; 2. ITC; 3. ERCC |
| FieldTrip | Matlab | OSML | Yes | Yes | Yes | Yes | Yes | 1. Fourier / WT (Perrier et al., 1995); 2. MTM (Percival and Walden, 1993); 3. LCMVB (Veen and Buckley, 1988); 4. MUSIC (Schmidt, 1986) | 1. (N)PSA; 2. BMCA |
| ERPWAVELAB | Matlab | LUMW | Yes | Yes | Yes | Yes | Yes | 1. PARAFAC (Harshman, 1970); 2. TUCKER (Tucker, 1966) | 1. ESP; 2. ITPC; 3. ITLC |
| eConnectome | Matlab | OSML | Yes | Yes | No | Yes | Yes | 1. GC (Granger, 1969); 2. (Adaptive) DTF (Kaminski and Blinowska, 1991; Wilke et al., 2008); 3. PDC (Baccala and Sameshima, 2001); 4. CCD (Dale and Sereno, 1993) | 1. EERP; 2. ERS/D; 3. FC |
| pyMVPA | Python | LMW | Yes | Yes | No | No | No | 1. LSVM (Vapnik, 1995); 2. SMLR (Krishnapuram et al., 2005); 3. GPRLK (Rasmussen and Williams, 2006) | |
| SCoT | Python | NS | No | No | No | No | Yes | 1. MVARICA (Koles et al., 1990) | 1. CSD; 2. (FF/G)P(D)C; 3. (D/FF/G)DTF; 4. F/EC |
| EMDLAB | Matlab | OSML | Yes | No | No | No | No | (E/wS/M)EMD (Rehman and Mandic, 2010; Rutkowski et al., 2010; Zeiler et al., 2012) | 1. IMF; 2. ERM |
| PREP | Matlab | OSML | Yes | Yes | Yes | No | No | 1. MTM (Mitra and Pesaran, 1999); 2. ICA; 3. PCA; 4. LDA & HDCA (Marathe et al., 2014) | 1. MAC; 2. RWD |
| EEGVIS | Matlab | OSML | Yes | No | No | No | No | Provides rich visualization of multichannel EEG | |

Lang, Language; PF, Platform; GUI DV, GUI and Data Visualization; DIE, Data Import/Export; AR, Artifact Removal; E/C L, Event/Channel Localization; SM, Source Modeling; L, Linux; U, Unix; M, Mac; W, Windows; OSML, Operating System supported by Matlab; NS, Not Specified; ICA, Independent Component Analysis; WT, Wavelet Transform; MTM, MultiTaper Methods; PARAFAC, PARAllel FACtor analysis; LCMVB, Linear Constrained Minimum Variance Beamformer; LSVM, Linear Support Vector Machine; SMLR, Sparse Multinomial Logistic Regression; GPRLK, Gaussian Process Regression with Linear Kernel; GC, Granger Causality; DTF, Directed Transfer Function; PDC, Partial Directed Coherence; CCD, Cortical Current Density; MVARICA, Multi Vector AutoRegressive Independent Component Analysis; CSPVARICA, Common Spatial Patterns Vector AutoRegressive Independent Component Analysis; (E/wS/M)EMD, (Ensemble/weighted Sliding/Multivariate) Empirical Mode Decomposition; ERSP, Event-Related Spectral Perturbation; ITC, Inter-Trial Coherence; ERCC, Event-Related Cross-Coherence; (N)PSA, (Non)Parametric Spectral Analysis; BMCA, Bivariate and Multivariate Connectivity Analysis; ESP, Evoked Spectral Perturbation; ITP/LC, Inter-Trial Phase/Linear Coherence; MIRELIEF, Multivariate Iterative RELIEF; EERP, Extraction of Event-Related Potentials; ERS/D, Event-Related Synchronization/Desynchronization; F/EC, Functional/Effective Connectivity; CSD, Cross Spectral Density; (FF/G)P(D)C, (Full Frequency/Generalized) Partial (Directed) Coherence; (D/FF/G)DTF, (Direct/Full Frequency/Generalized) Directed Transfer Function; IMF/ERM, Visualization of (Intrinsic Mode Functions/Event-Related Modes); MAC, Maximum Absolute Correlation; RWD, Robust Window Deviation.

2.1.1. EEGLAB

“EEGLAB” is a Matlab based EEG signal processing environment with time-frequency and ICA methods (Delorme and Makeig, 2004). It allows the user to: plot channel spectra and maps, remove artifacts, extract signal epochs, average data, select and compare multiple data, plot event related potential (ERP) images, decompose data using ICA and time/frequency methods, and estimate source locations. In addition, it allows handling data from multiple subjects and performing statistical analysis on them. It can be obtained from


2.1.2. ERPWAVELAB

“ERPWAVELAB” is another Matlab based EEG processing toolbox (Morup et al., 2007) which depends on EEGLAB for certain functionalities. It is capable of multi-channel time-frequency analysis of ERPs in EEG and MEG data and provides data decomposition using multiway (tensor) factorization. The features include: various visualizations and maps, artifact rejection in the time-frequency domain, clustering dendrogram, statistical analysis across different groups and subjects, cross coherence analysis, etc. It can be obtained from

2.1.3. pyMVPA

“pyMVPA” is a multivariate pattern analysis package developed in Python that aims to facilitate statistical learning analyses of large datasets (Hanke et al., 2009). It offers data handling and an extensible framework for multivariate statistical analyses such as classification, regression, and feature selection. It can be downloaded from

2.1.4. eConnectome

“eConnectome” is a Matlab based software with interactive graphical interfaces for EEG/ECoG/MEG preprocessing, source estimation, connectivity analysis and visualization where the connectivity from EEG/ECoG/MEG can be mapped over sensor and source domains (He et al., 2011). It can be obtained from

2.1.5. FieldTrip

“FieldTrip” is a Matlab based toolbox developed for the analysis of MEG, EEG, and other noninvasively recorded electrophysiological data (Oostenveld et al., 2011). Capable of handling data directly from many proprietary formats (e.g., BrainProducts/BrainVision, NeuroScan, Electrical Geodesics Inc., BCI2000, Micromed, Nexstim, European data format, Generic standard formats, etc.), it allows the user to perform time-frequency analysis using multitapers, source reconstruction using dipoles, distributed sources and beamformers, connectivity analysis, and nonparametric statistical permutation tests at the channel and source level. It can be obtained from

2.1.6. EEGVIS

“EEGVIS” is a Matlab based toolbox that allows users to explore multichannel EEG and other large array-based data sets using multiscale drill-down techniques (Robbins, 2012). Available at, and useable as a plugin to “EEGLAB.”

2.1.7. SCoT

“SCoT” is a toolbox written in Python for connectivity analysis on EEG/MEG sources. It performs blind source separation, connectivity estimation, resampling statistics, and visualization (Billinger et al., 2014). It works with both multi-trial and single trial data. The source code can be downloaded from

2.1.8. EMDLAB

“EMDLAB” is developed in Matlab as a plugin to the EEGLAB to perform various empirical mode decomposition (EMD), e.g., plain EMD, ensemble EMD, weighted sliding EMD, and multivariate EMD (MEMD) on EEG data (Al-Subari et al., 2015). It can be obtained from

2.1.9. PREP

“PREP” is a Matlab based preprocessing pipeline for early-stage EEG processing that aims at cleaning the EEG signals (e.g., removing line noise, fixing drift, interpolating corrupt channels, etc.) (Bigdely-Shamlo et al., 2015). The library is available at

2.2. Toolboxes for spike trains and field potentials analysis

With the increasing capability to record simultaneously from a growing number of neurons, computational neuroscientists have developed automated toolboxes addressing the required processing and analyses. We touch upon a few of the publicly available ones below. Table 2 summarizes these packages with their representative features.

Table 2.

Popular spike trains and field potentials processing and analysis toolboxes with their representative features.

| Toolbox | Lang | PF | GUI DV | DIE | AR | PDP | Representative method(s)/measures/analyses |
|---|---|---|---|---|---|---|---|
| DATA-MEAns | Delphi7 | W | Yes | No | No | No | 1. PoSTH; 2. PEH; 3. AC; 4. CC; 5. FF; 6. COH; 7. ES (Quian Quiroga et al., 2002); 8. NNC (Cover and Hart, 1967); 9. KMC (MacQueen, 1967) |
| MeaBench | C++ / Matlab | L | Yes | No | Yes | No | 1. Spike detection; 2. Spike validation; 3. Burst detection |
| KNSNDM | C++ | LMW | Yes | Yes | Yes | No | 1. AC; 2. CC; 3. KlustaKwik (Section 2.3.2); 4. CEM (Celeux and Govaert, 1992) |
| BSMART | Matlab / C | OSML | Yes | Yes | No | No | 1. AMAR; 2. (A)B/MAR; 3. FFT; 4. GC (Granger, 1969); 5. COH; 6. CN; 7. GCN |
| FIND | Matlab | OSML | Yes | Yes | No | No | 1. CV; 2. PPM; 3. PWCC; 4. ASGF; 5. RLDE (Nawrot et al., 2003); 6. Spike detection |
| STAToolkit | Matlab / C | LMW | Yes | No | No | Yes | 1. DM (Strong et al., 1998); 2. MSM (Victor and Purpura, 1997); 3. BLM (Victor, 2002); 4. AD (Treves and Panzeri, 1995); 5. JD (Thomson and Chave, 1991); 6. DMB (Ma, 1981); 7. BUB (Paninski, 2003); 8. CA (Chao and Shen, 2003); 9. BDP (Wolpert and Wolf, 1995) |
| PANDORA | Matlab | LMW | No | Yes | No | EMP | 1. RDBCDS; 2. SSC; 3. KLDM (Kullback and Leibler, 1951); 4. RAD (Johnson and Sinanovic, 2001) |
| sigTOOL | Matlab | OSML | Yes | Yes | No | No | 1. AC; 2. CC; 3. COH; 4. PSA; 5. ICA; 6. PEH |
| ibTB | Matlab | LMW | No | No | No | No | 1. DM; 2. QE (Strong et al., 1998); 3. PTBC (Panzeri and Treves, 1996); 4. SHP (Montemurro et al., 2007); 5. BBC (Optican et al., 1991); 6. GM (Misra et al., 2005) |
| Chronux | Matlab | LMW | Yes | No | No | No | 1. HCM (Fee et al., 1996); 2. LOWESS (Cleveland, 1979); 3. LOCFIT (Loader, 1999); 4. MTM; 5. COH; 6. SFC |
| SPKTool | Matlab | OSML | Yes | Yes | No | No | 1. (NL)ESD (Mukhopadhyay and Ray, 1998); 2. PCA; 3. EMGMM (Duda et al., 2000) |
| nSTAT | Matlab | OSML | No | No | No | No | 1. PPGLM (Paninski et al., 2007); 2. GLM-PSTH; 3. (A/B)IC; 4. SSGLM; 5. KF; 6. MTM; 7. STG |
| SigMate | Matlab | OSML | Yes | Yes | Yes | No | 1. FO; 2. LC (Mahmud et al., 2016); 3. CLAOD (Mahmud et al., 2010b); 4. CSD (Mahmud et al., 2011); 5. SLFPC (Mahmud et al., 2012c) |
| MVGC | Matlab | OSML | No | No | No | No | 1. OLS; 2. LWRA (Levinson, 1946); 3. VARMLE; 4. CPSD; 5. MTM; 6. FFT; 7. UGC |
| QSpike Tools | Matlab | ML | No | No | Yes | EMP | 1. Spike detection; 2. Spike validation; 3. PSTH; 4. PEH; 5. Burst detection and validation; 6. Wave_Clus (Section 2.3.1) |

Lang, Language; PF, Platform; GUI DV, GUI and Data Visualization; DIE, Data Import/Export; AR, Artifact Removal; PDP, Parallel Data Processing; KNSNDM, Klusters, NeuroScope, NDManager; L, Linux; U, Unix; M, Mac; W, Windows; OSML, Operating System supported by Matlab; PoSTH, Post-Stimulus Time Histogram; PSTH, Peri-Stimulus Time Histogram; PEH, Peri-Event Histogram; AC, AutoCorrelation; (PW)CC, (Pair Wise) Cross-Correlation; FF, Fano Factor; COH, (Cross) COHerence; ES, Event Synchrony; NNC, Nearest Neighbor Clustering; KMC, K-Means Clustering; CEM, Classification Expectation Maximization; (A)B/MAR, (Adaptive) Bi/Multi variate AutoRegressive model; FFT, Fast Fourier Transform; GC, Granger Causality; CN, Coherent Network; GCN, GC Network; PPM, Point Process Model; ASGF, Asymmetric Savitzky-Golay Filter; RLDE, Response Latency Differences Estimation; DM, Direct Method; MSM, Metric Space Method; BLM, BinLess Method; AD, Asymptotically Debiased; JD, Jackknife Debiased; DMB, Debiased Ma Bound; BUB, Best Upper Bound; CA, Coverage-Adjusted; BDP, Bayesian with Dirichlet Prior; EMP, EMbarrassingly Parallel; RDBCDS, Rational DataBase Creation from DataSet; SSC, Spike Shape Characteristics; KLDM, Kullback-Leibler Divergence Measure; RAD, Resistor-Average Distance; PSA, Power Spectral Analysis; QE, Quadratic Extrapolation; PTBC, Panzeri and Treves Bias Correction; SHP, Shuffling Procedure; BBC, Bootstrap Bias Correction; GM, Gaussian Method; HCM, Hierarchical Clustering Method; LOWESS, Locally Weighted Sum of Squares; SFC, Spike Field Coherence; (NL)ESD, (NonLinear) Signal Energy for Spike Detection; EMGMM, Expectation Maximization on Gaussian Mixed Model; PPGLM, Point Process Generalized Linear Model; (A/B)IC, (Akaike's/Bayesian) Information Criteria; SSGLM, State-Space Generalized Linear Model; KF, Kalman Filtering; STG, Spectrogram; FO, File Operations (file splitting, concatenation, column rearranging); LC, Latency Calculation; CLAOD, Cortical Layer Activation Order Detection; CSD, Current Source Density; SLFPC, Single LFP Classification; OLS, Ordinary Least Squares; LWRA, LWR Algorithm; VARMLE, VAR Maximum Likelihood Estimator; CPSD, Cross-Power Spectral Density; UGC, Unconditional GC.

2.2.1. DATA-MEAns

“DATA-MEAns” is a toolbox developed in Borland Delphi 7 (Embarcadero Technologies Inc., Austin, USA) and Matlab (Bonomini et al., 2005). It provides data visualization, basic analysis (i.e., autocorrelations, perievent histograms, rate curves, PSTHs, ISIs, etc.), and nearest neighbor or k-means clustering. Available at
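To make the basic spike-train measures mentioned here more concrete, the following is a minimal Python sketch of interspike intervals (ISIs) and a peri-stimulus time histogram (PSTH). It is an illustration only and not code from DATA-MEAns; the function names and the example spike times are hypothetical, and spike times are assumed to be given in seconds relative to stimulus onset.

```python
from typing import List, Sequence, Tuple


def interspike_intervals(spike_times: Sequence[float]) -> List[float]:
    """Differences between consecutive spike times (seconds)."""
    ordered = sorted(spike_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def psth(trials: Sequence[Sequence[float]], window: Tuple[float, float],
         bin_width: float) -> List[float]:
    """Peri-stimulus time histogram as mean firing rate (spikes/s) per bin.

    Each trial holds spike times relative to stimulus onset (t = 0).
    """
    start, stop = window
    n_bins = int(round((stop - start) / bin_width))
    counts = [0] * n_bins
    for trial in trials:
        for t in trial:
            if start <= t < stop:
                # Clamp guards against floating-point overshoot at the last edge.
                index = min(int((t - start) / bin_width), n_bins - 1)
                counts[index] += 1
    # Normalize counts by number of trials and bin width to obtain a rate.
    return [count / (len(trials) * bin_width) for count in counts]


# Hypothetical example: three trials, 200 ms analysis window, 50 ms bins
example_trials = [
    [-0.050, 0.012, 0.018, 0.110],
    [0.015, 0.095],
    [0.011, 0.019, 0.160],
]
print(interspike_intervals(example_trials[0]))
print(psth(example_trials, window=(0.0, 0.2), bin_width=0.05))
```

Normalizing by the number of trials and the bin width expresses the histogram as a firing rate, which makes results comparable across experiments with different trial counts.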

2.2.2. MeaBench

“MeaBench” is a toolbox written mainly in C++ with certain parts written in Perl and Matlab. It is intended for data acquisition and online analysis of commercial multielectrode array recordings from Multichannel Systems GmbH (Reutlingen, Germany) (Wagenaar et al., 2005). It allows real-time data visualization, line and stimulus artifact suppression, and spike and burst detection and validation. Available at

2.2.3. Klusters, NeuroScope, NDManager

“Klusters,” “NeuroScope,” and “NDManager” are three integrated modules bundled together for processing and analysis of spike and field potential signals (Hazan et al., 2006). Klusters performs spike sorting using KlustaKwik (see Section 2.3.2) and displays 2D projection of features, spike traces, correlograms, and error matrix view. NeuroScope allows inspection, selection, and event editing of spike signals as well as local field potentials (LFPs). NDManager facilitates experimental and preprocessing parameter management. Available at

2.2.4. Brain-system for multivariate AutoRegressive time series (BSMART)

“BSMART” toolbox is written in Matlab/C for spectral analysis of neurophysiological signals (Cui et al., 2008). It provides (multi-)bi-variate AutoRegressive modeling, spectral analysis through coherence and Granger causality, and network analysis. Available at

2.2.5. Finding information in neural data (FIND)

“FIND” is a platform-independent framework for the analysis of neuronal data based on Matlab (Meier et al., 2008). It provides a unified data import function for various proprietary formats, simplifying standardized interfacing with analysis tools, and allows analysis of discrete series of spike events, continuous time series, and imaging data. It also allows simulating multielectrode activity using a point-process based stochastic model. Available at

2.2.6. Spike train analysis toolkit (STAToolkit)

“STAToolkit” is a Matlab/C-hybrid toolbox implementing information theoretic methods to quantify how well the stimuli can be distinguished based on the timing of neuronal firing patterns in a spike train (Goldberg et al., 2009). Available at

2.2.7. PANDORA

“PANDORA” is a Matlab-based toolbox that extracts user-defined characteristics from spike train signals and creates numerical database tables from them (Gunay et al., 2009). Further analyses (e.g., drug and parameter effects, spike shape characterization, histogramming and comparison of distributions, cross-correlation, etc.) can then be performed on these tables. Spike detection and feature extraction can also be performed. It is available at

2.2.8. sigTOOL

“sigTOOL” toolbox is written in Matlab and allows direct loading of a wide range of proprietary file formats (Lidierth, 2009). It provides (auto-)cross-correlation, power spectral analysis, and coherence estimation in addition to usual spike train analysis (i.e., ISI, event auto- and cross-correlations, spike-triggered averaging, peri-event time histograms, frequencygrams, etc.). Available at

2.2.9. Information breakdown ToolBox (ibTB)

“ibTB” is a Matlab-based toolbox which implements information theory methods for spike, LFP, and EEG analysis (Magri et al., 2009). It provides the information breakdown technique to decode the encoding of sensory stimuli by different groups of neurons. The source code can be obtained from the publisher's website (

2.2.10. Chronux

“Chronux” toolbox is developed in Matlab for the analysis of both point process and continuous data (Bokil et al., 2010). It provides spike sorting, and local regression and multitaper spectral analysis of neural signals. Available at

2.2.11. SPKTool

“SPKTool” is coded in Matlab for the detection and analysis of neural spiking activity (Liu et al., 2011). It performs spike detection, feature extraction, manual and semi-automatic clustering of spike trains. Available at

2.2.12. nSTAT

“nSTAT” toolbox is coded in Matlab and performs spike train analysis in time domain (e.g., Kalman Filtering), frequency domain (e.g., multi-taper spectral estimation), and mixed time-frequency domain (e.g., spectrogram) (Cajigas et al., 2012). Available at

2.2.13. SigMate

“SigMate” is a Matlab-based comprehensive framework that allows preprocessing and analysis of EEG, LFPs, and spike signals (Mahmud et al., 2012a). Its main contribution is in the analysis of LFPs, which includes data display, file operations, baseline correction, artifact removal, noise characterization, current source density (CSD) analysis, latency estimation from LFPs and CSDs, determination of cortical layer activation order using LFPs and CSDs, and single LFP clustering. The EEG and spike analyses are provided through the EEGLAB (see Section 2.1.1) and Wave_Clus (see Section 2.3.1) toolboxes. It can be obtained from
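As an illustration of what a CSD analysis computes, the sketch below uses the standard second-spatial-derivative estimate for a linear probe with equally spaced contacts, CSD(z) ≈ -σ [φ(z+h) - 2φ(z) + φ(z-h)] / h², where φ is the field potential, h the contact spacing, and σ the tissue conductivity. This is a generic textbook estimate written in Python, not necessarily the method implemented in SigMate; the function name, the default conductivity of 0.3 S/m, and the example values are assumptions made for illustration.

```python
from typing import List, Sequence


def csd_second_derivative(lfp: Sequence[float], spacing_m: float,
                          conductivity_s_per_m: float = 0.3) -> List[float]:
    """Estimate CSD at the interior contacts of a linear probe.

    lfp: potentials (V) at equally spaced depths for one time instant.
    Returns one CSD value (A/m^3) per interior contact; the two outermost
    contacts have no estimate because they lack a neighbor on one side.
    The default conductivity is an assumed, illustrative value.
    """
    if len(lfp) < 3:
        raise ValueError("at least three contacts are required")
    h_squared = spacing_m ** 2
    return [
        -conductivity_s_per_m * (lfp[i + 1] - 2 * lfp[i] + lfp[i - 1]) / h_squared
        for i in range(1, len(lfp) - 1)
    ]


# Hypothetical depth profile: five contacts, 100 µm apart, potentials in volts
profile = [0.0, -50e-6, -200e-6, -60e-6, 0.0]
print(csd_second_derivative(profile, spacing_m=100e-6))
```

With this sign convention, a local minimum in the potential profile yields a negative CSD value (a current sink), while a potential that varies linearly with depth yields zero.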

2.2.14. Multivariate granger causality toolbox (MVGC)

“MVGC” is a toolbox written in Matlab that implements Wiener-Granger causality (G-causality) on multiple equivalent representations of a vector autoregressive model in both time and frequency domains (Barnett and Seth, 2014). It can be applied to neuroelectric, neuromagnetic, and fMRI signals and can be obtained from

2.2.15. QSpike tools

“QSpike Tools” is a Linux/Unix-based cloud-computing framework, modeled using a client-server architecture and developed in Matlab and Bash scripts, for processing and analysis of extracellular spike trains (Mahmud et al., 2014). It performs batch preprocessing of CPU-intensive operations for each channel (e.g., filtering, multi-unit activity detection, spike-sorting, etc.), in parallel, by delegating them to a multi-core computer or to a computer cluster. It can be obtained from
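The per-channel delegation described here is an instance of embarrassingly parallel processing: each channel can be handled independently of the others. The Python sketch below illustrates the pattern only and is not QSpike Tools code; the function names, the simple threshold-crossing detector, and the example data are hypothetical. A thread pool is used to keep the example self-contained; for genuinely CPU-bound work, a process pool or a cluster scheduler would be the more likely choice.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def detect_threshold_crossings(samples: List[float], threshold: float) -> List[int]:
    """Indices at which the signal first drops below a negative threshold."""
    return [
        i for i in range(1, len(samples))
        if samples[i] < threshold <= samples[i - 1]
    ]


def process_all_channels(recording: Dict[str, List[float]], threshold: float,
                         max_workers: int = 4) -> Dict[str, List[int]]:
    """Run the same detection on every channel as an independent task."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One task per channel; no task depends on the result of another.
        futures = {
            name: pool.submit(detect_threshold_crossings, data, threshold)
            for name, data in recording.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Hypothetical two-channel recording (arbitrary units)
recording = {
    "ch0": [0.0, -0.1, -5.0, -0.2, 0.1, -6.0, 0.0],
    "ch1": [0.0, 0.1, -0.1, 0.0],
}
print(process_all_channels(recording, threshold=-4.0))
```

Because the tasks share no state, the same structure scales from a multi-core workstation to a cluster simply by changing where the tasks are submitted.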

2.3. Toolboxes for spike sorting

As seen in the literature, the majority of efforts have been devoted to developing tools for spike sorting and analysis. A recent review by Rey et al. outlines the basic concepts of spike sorting, applicability requirements, and shortcomings of currently available algorithms (Rey et al., 2015). Detailing all spike-sorting packages and their functionalities would require a complete review of its own; therefore, here we restrict our discussion to some of the popular open-source toolboxes.

2.3.1. Wave_Clus

“Wave_Clus” is the most popular spike sorting package to date. Developed in Matlab, it uses a wavelet transformation based feature selection method and superparamagnetic clustering (Blatt et al., 1996) to sort the spikes into different classes (Quian Quiroga et al., 2004). It is available at

2.3.2. KlustaKwik

“KlustaKwik” is a stand-alone program written in C++ for automatic clustering analysis (Harris et al., 2000) by fitting a mixture of Gaussians and masked expectation-maximization (Kadir et al., 2014; Rossant et al., 2016). Download link is

2.3.3. OSort

“OSort” is a template-based, unsupervised, online spike sorting algorithm written in Matlab (Rutishauser et al., 2006). It uses a residual-sum-of-squares based distance method and custom thresholds to sort the recorded spikes on the fly. Available at
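The general idea of online, template-based sorting with a residual-sum-of-squares (RSS) distance can be sketched as follows. This Python example is a deliberately simplified illustration and not the OSort implementation: it uses a single fixed threshold, omits steps such as spike detection and cluster merging, and all names and example waveforms are hypothetical.

```python
from typing import List, Sequence


def residual_sum_of_squares(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared sample-wise differences between two waveforms."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class OnlineTemplateSorter:
    """Assign each incoming spike to the nearest template or start a new one."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold              # maximum RSS to join a cluster
        self.templates: List[List[float]] = []  # running mean waveform per cluster
        self.counts: List[int] = []             # number of spikes per cluster

    def assign(self, waveform: Sequence[float]) -> int:
        """Return the cluster label for one spike and update the templates."""
        if self.templates:
            distances = [residual_sum_of_squares(waveform, t) for t in self.templates]
            best = min(range(len(distances)), key=distances.__getitem__)
            if distances[best] < self.threshold:
                # Incremental update of the running mean template.
                n = self.counts[best] + 1
                self.templates[best] = [
                    m + (x - m) / n for m, x in zip(self.templates[best], waveform)
                ]
                self.counts[best] = n
                return best
        # No template is close enough: the spike seeds a new cluster.
        self.templates.append(list(waveform))
        self.counts.append(1)
        return len(self.templates) - 1


# Hypothetical three-sample waveforms arriving one at a time
sorter = OnlineTemplateSorter(threshold=10.0)
stream = [[0, -10, 2], [0, -9, 2], [0, -30, 10], [0, -11, 2], [0, -29, 9]]
labels = [sorter.assign(w) for w in stream]
print(labels)  # prints [0, 0, 1, 0, 1]
```

Since each spike is processed once, as it arrives, and only the running templates are kept in memory, this kind of scheme suits online use during a recording.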

2.3.4. SpikeOMatic

“SpikeOMatic” is a spike sorting package developed in R (Pouzat and Chaffiol, 2009). It implements Gaussian Mixture and Dynamic Hidden Markov Models using expectation-maximization and Markov Chain Monte Carlo methods, respectively. Available at

2.3.5. Spyke

“Spyke” is a Python based toolbox for visualizing, navigating, and spike sorting of high-density multichannel extracellular spikes (Spacek et al., 2009). It uses PCA for dimensionality reduction and a modified gradient ascent clustering algorithm (Fukunaga and Hostetler, 1975; Swindale and Spacek, 2014) to classify the features. Available at

2.3.6. UltraMegaSort2000

“UltraMegaSort2000” is a Matlab based toolbox for spike detection and clustering which implements a hierarchical clustering scheme using similarities of spike shape and spike timing statistics, and provides false-positive and false-negative errors as quality evaluation metrics (Fee et al., 1996; Hill et al., 2011). Available at

2.3.7. EToS

“EToS” is a spike sorting toolbox written in C++ implementing multimodality-weighted PCA and variational Bayes for Student's t mixture model (Takekawa et al., 2012). The spike sorting code is parallelized through OpenMP and available at

2.3.8. MClust

“MClust” is a spike sorting toolbox developed in Matlab. It supports both manual and automated clustering with the possibility of manual feature selection (Redish, 2014). It can be obtained from

2.3.9. NEV2lkit

“NEV2lkit” is a package written in C++ with routines for analysis, visualization, and classification of spikes (Bongard et al., 2014). Its results are accurate, efficient, and consistent across experiments. Available at

2.3.10. WIToolbox

“WIToolbox” implements a combination of wavelet transform and information theory in Matlab for better classification of spikes in the presence of spike time jitter, background noise, and sample size problems (Lopes-dos Santos et al., 2015). Available at

3. Sharing of analysis tools and experimental data

Making analysis toolboxes for easy and efficient handling of massive neuronal data available to the community is just one part of the solution. The other part is the availability of infrastructures that allow these tools and the experimental data to be shared. Computational neuroscientists are putting constant and significant effort into building and refining “Neuroinformatics” infrastructures, as outlined below, to make data, tools, and resources electronically accessible over the web (Ascoli, 2006b), which is believed to facilitate standardization and benchmarking and to foster collaborative research (Mahmud et al., 2012b). As quoted by Prof. Jan G. Bjaalie, “Neuroinformatics applies the methods and approaches required for large scale data integration and thereby paves the way toward understanding the brain.”

3.1. Neuroshare

The Neuroshare framework started with the goal of creating and supporting open data file format specifications for neurophysiology, a set of open libraries to access those data, and open-source software tools for their analysis. This is particularly important because different acquisition software packages use many proprietary neuronal signal file formats. Leveraging the “Neuroshare API,” the framework aims at standardizing access to the individual file formats of neurophysiological experiment data by creating low-level handling and processing tools. This was designed to be achieved in two successive phases: (i) creation of an open library and format standards for the experimental data, and (ii) development of free and open-source tools for low-level handling and processing of the data. Currently, it provides eight Neuroshare-compliant dynamic link libraries (DLLs) to access raw data files recorded with proprietary acquisition setups, e.g., Alpha-Omega, Blackrock Microsystems, Cambridge Electronic Design, Multichannel Systems, NeuroExplorer, Plexon, RC Electronics, and Tucker-Davis Technologies.

3.2. International neuroinformatics coordinating facility (INCF)

To facilitate tool and data sharing and to foster development in the field of Neuroinformatics, an organization called the International Neuroinformatics Coordinating Facility (INCF) was formed by 12 member countries of the Organization for Economic Co-operation and Development (OECD). It is financed by Belgium, Czech Republic, Finland, France, Germany, Italy, Japan, The Netherlands, Norway, Sweden, Switzerland, the United States, and the European Commission, and many of these member countries have their own nodes to provide this facility locally (Rautenberg et al., 2011). An article by the Executive Director of INCF during 2006–2008 defined its aims as follows (Bjaalie and Grillner, 2007):


  • coordinate and foster international activities in Neuroinformatics;

  • contribute to the development of scalable, portable, and extensible applications that can be used for furthering our knowledge of the human brain and its diseases;

  • contribute to the development and maintenance of specific database and other computational infrastructures and support mechanisms; and

  • focus on developing mechanisms for the seamless flow of information and knowledge between academia, private enterprises, and the publication industry.


3.3. Code analysis, repository and modeling for e-Neuroscience (CARMEN)

The Code Analysis, Repository and Modeling for e-Neuroscience (CARMEN) project was one of a kind in developing a virtual neuroscience laboratory, especially for electrophysiology data, which facilitates e-Neuroscience by providing a unique infrastructure for sharing data, tools, and services (Watson et al., 2010). These secure services allow a user to curate data and analysis code in defined storage, document experimental protocols, and execute data analyses (Fletcher et al., 2008). Data cannot be curated into the CARMEN databases without a proper metadata description. This description is essential for retrieving the correct data from the thousands of available datasets and for interpreting them using the appropriate analysis code (Jessop et al., 2010).
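
The rule that data cannot be curated without a metadata description can be illustrated with a small check. This is a sketch only: the required field names below are assumptions chosen for illustration and do not reproduce the actual CARMEN metadata schema.

```python
from typing import Dict, List

# Illustrative required fields (assumed here, not CARMEN's actual schema).
REQUIRED_FIELDS = ("species", "preparation", "sampling_rate_hz", "n_channels")


def missing_metadata(metadata: Dict[str, object]) -> List[str]:
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if metadata.get(field) in (None, "")]


def accept_for_curation(metadata: Dict[str, object]) -> bool:
    """A dataset is accepted only when every required field is filled in."""
    return not missing_metadata(metadata)


complete = {
    "species": "mouse",
    "preparation": "retina",
    "sampling_rate_hz": 20000,
    "n_channels": 128,
}
incomplete = {"species": "ferret", "sampling_rate_hz": 20000}
```

With these two records, `accept_for_curation(complete)` succeeds, while `missing_metadata(incomplete)` names the fields that a curator would still have to supply.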

The CARMEN framework currently supports analysis code written in Matlab, Python, C/C++, and R. Users may upload their code as non-interactive standalone command-line applications, wrapped with a Service Builder tool to create a service format suitable for execution on the platform (Weeks et al., 2013).
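
What “non-interactive standalone command-line application” means in practice can be sketched as below. The analysis itself (counting upward threshold crossings) and the argument names are assumptions made for illustration; the sketch does not show the Service Builder tool or any CARMEN-specific interface. The essential property is that every input arrives through command-line arguments and every result is written to standard output, so nothing is asked of the user at run time.

```python
import argparse
import json
import sys
from typing import List, Optional


def count_threshold_crossings(samples: List[float], threshold: float) -> int:
    """Count upward crossings of `threshold` (a deliberately simple analysis)."""
    return sum(
        1 for prev, cur in zip(samples, samples[1:])
        if prev < threshold <= cur
    )


def main(argv: Optional[List[str]] = None) -> int:
    """Parse arguments, run the analysis, and print the result as JSON."""
    parser = argparse.ArgumentParser(
        description="Count threshold crossings in one channel."
    )
    parser.add_argument("--input", required=True,
                        help="text file with one sample per line")
    parser.add_argument("--threshold", type=float, required=True)
    args = parser.parse_args(argv)

    with open(args.input) as handle:
        samples = [float(line) for line in handle if line.strip()]

    # Results go to stdout only; the program never prompts for input.
    result = {"crossings": count_threshold_crossings(samples, args.threshold)}
    json.dump(result, sys.stdout)
    sys.stdout.write("\n")
    return 0
```

Saved as a script (for example under the hypothetical name `count_crossings.py`) with `sys.exit(main())` placed under the usual `if __name__ == "__main__":` guard, it would be invoked as `python count_crossings.py --input channel.txt --threshold 1.0`.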

Recently, a programming document demonstrated the use of a curated repository of multielectrode array recordings of spontaneous activity from mouse and ferret retina. The dataset was in HDF5 format (a format for hierarchical data organization), and the document outlined guidelines for efficient use of the CARMEN software workflow. Moreover, the dataset structure was reported along with examples of reproducible research using those data files (Eglen et al., 2014).

3.4. Neurodata without borders: neurophysiology (NWB:N)

To facilitate research reproducibility and to make it possible to explore data recorded by others, data standardization is a must. The Neurodata Without Borders: Neurophysiology (NWB:N) initiative aims to promote data standardization and sharing. Since its inception, NWB:N has focused on producing a common data format for recordings and metadata of cellular electrophysiology, which has recently been released along with a sample dataset (Teeters et al., 2015).

4. Challenges and future perspectives

Secure infrastructures are vital for the success of large-scale, multi-institutional Neuroinformatics research. Neuroinformatics research facilities are expected to integrate data seamlessly from different sources for data sharing, but they should also be secure enough to address challenging issues such as:

  • research collaboration with the option for collaborators to protect their proprietary data,

  • user friendliness allowing users with minimal information technology skills to explore, navigate, and use scientific data and services provided by the environment.

In recent years, the emergence and popularity of distributed computing have offered an opportunity to share resources that would otherwise require more effort. In particular, cloud computing and service-oriented architecture open novel avenues for fostering collaborative neuronal signal analysis through distributed infrastructure. These approaches allow the responsibilities of the different users to be represented in accordance with their granted privileges. In our opinion, development is expected to move toward:

  • Design and implementation of secure and protected systems;

  • Advances in cloud-based web applications;

  • Easy deployment of data;

  • Reusability and sharing of tools with adaptability to changing requirements;

  • Empowerment of researchers to share the functionalities that they want to publish.

Based on the current state-of-the-art, we identified a few challenges that require the immediate attention of the community; some of them are indicated below:

  1. Over the last few years, neuroscientists have put together quite a few useful neuroimage repositories and associated analysis tools (Eickhoff et al., 2016), but neurophysiology is lagging behind. Though a few individual databases exist, they are very limited in comparison to their imaging counterparts (Tripathy et al., 2014).

  2. With changes in current acquisition systems and the required data formats, interoperability and data conversion remain a nightmare due to the lack of widely adopted standards. In addition, when data are curated in a database, the metadata describing them are often incompatible among different labs/curators, which hampers meaningful analyses using data from another lab. This unnecessarily increases the time and effort required for data discovery and analysis.

  3. Because of the practical need for rapid and customized analyses, most labs develop their own analysis scripts. This approach has severe drawbacks on the global scale: interoperability, compatibility, and sharing of tools with other laboratories are highly restricted. Thus, the problems of creating a common set of analyses and of making benchmark analysis tools available are yet to be addressed.

  4. Though the price of computing power has fallen significantly over the years, the power required to demystify large neuronal ensembles is still alarmingly high. From a Neuroinformatics perspective, the availability of powerful international computing facilities will greatly facilitate remote, automated, and standardized multichannel neuronal signal processing and analysis.

  5. Cloud computing's popularity is rapidly growing. Exploiting the benefits of distributed computing, a Competitor-to-Collaborator concept would be very interesting, in which small clusters of laboratories working on similar research questions share their resources and tools through a unified cloud-based framework, so that other laboratories can use them as web services.
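
The metadata incompatibility described in challenge 2 can be made concrete with a small sketch. The lab-specific keys, the shared field names, and the alias table below are all hypothetical; the point is only that, in the absence of a widely adopted standard, each pair of vocabularies has to be reconciled by hand in this way.

```python
from typing import Dict

# Hypothetical aliases: each lab-specific key maps to one shared field name.
ALIASES = {
    "fs": "sampling_rate_hz",
    "sample_rate": "sampling_rate_hz",
    "sampling_rate_hz": "sampling_rate_hz",
    "nchan": "n_channels",
    "channels": "n_channels",
    "n_channels": "n_channels",
    "animal": "species",
    "species": "species",
}


def harmonize(record: Dict[str, object]) -> Dict[str, object]:
    """Rename known keys to the shared vocabulary; keep unknown keys under 'extra'."""
    common: Dict[str, object] = {}
    extra: Dict[str, object] = {}
    for key, value in record.items():
        target = ALIASES.get(key.lower())
        if target is None:
            extra[key] = value  # unknown field: preserved, but not comparable
        else:
            common[target] = value
    if extra:
        common["extra"] = extra
    return common


lab_a = {"fs": 20000, "nchan": 128, "animal": "rat"}
lab_b = {"Sample_Rate": 20000, "channels": 128, "species": "rat", "rig": "B2"}
```

After `harmonize`, the two records share the same field names and can be searched together, whereas the unrecognized `rig` entry from the second lab remains under `extra` and still needs a human decision.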

Author contributions

MM performed the reported study. MM wrote and SV edited the paper. Both authors have seen and approved the final manuscript.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


Financial support by the 7th Framework Programme of the European Commission through the “RAMP” project (contract no. 612058) is kindly acknowledged.



