Abstract
Recently, the computational neuroscience community has pushed for more transparent and reproducible methods across the field. In the interest of unifying the domain of auditory neuroscience, naplib-python provides an intuitive and general data structure for handling all neural recordings and stimuli, as well as extensive preprocessing, feature extraction, and analysis tools which operate on that data structure. The package removes many of the complications associated with this domain, such as varying trial durations and multi-modal stimuli, and provides a general-purpose analysis framework that interfaces easily with existing toolboxes used in the field.
Keywords: python, auditory neuroscience, iEEG, ECoG, preprocessing
Nr | Code metadata description | Metadata |
---|---|---|
C1 | Current code version | 0.2.0 |
C2 | Permanent link to code/repository used for this code version | https://github.com/naplab/naplib-python |
C3 | Permanent link to reproducible capsule | https://codeocean.com/capsule/6656601/tree |
C4 | Legal code license | MIT License |
C5 | Code versioning system used | git |
C6 | Software code languages, tools and services used | python |
C7 | Compilation requirements, operating environments and dependencies | Linux, macOS, or Windows; matplotlib, numpy, scipy, pandas, statsmodels, hdf5storage, mne, scikit-learn |
C8 | If available, link to developer documentation/manual | https://naplib-python.readthedocs.io |
C9 | Support email for questions | nima@ee.columbia.edu |
1. Introduction
With the recent explosion of neural data acquisition and computational power, the field of neuroscience has seen incredible growth in the use of computational methods to analyze neural response patterns and make inferences about the brain. The field of auditory neuroscience is no exception, with the widespread use of computational analyses such as spectro-temporal receptive field (STRF) estimation (Aertsen et al., 1981; Theunissen et al., 2000, 2001) and software such as STRFlab (http://www.strflab.berkeley.edu), mTRF (Crosse et al., 2016), Neural Encoding Model System (NEMS) (David, 2018), and others dedicated to these specific techniques. Many papers are now accompanied by small bits of code to reproduce figures or analyses, which greatly aids in the reproducibility of scientific research. However, the explosion of computational methods has also led to a software ecosystem with many highly specialized packages written by, and for, people with very different needs. Even when code is shared openly, it may do little to aid the reproducibility of the experiments because of difficulties running others’ code or omissions of critical details in the original report (Easterbrook, 2014; Miłkowski et al., 2018). Therefore, there is a need for a unifying framework which provides comprehensive tools for the auditory neuroscience domain and can easily fit into existing codebases and analysis pipelines used by researchers in the field.
As a Neural Acoustic Processing Library in Python, naplib-python is specifically built for auditory neuroscience. Its library of methods complements those provided by other public toolkits. Python is an ideal language for this purpose, due to its vast open-source ecosystem and the fact that many in the neuroscience community are moving towards Python (Muller et al., 2015). In addition to implementing a host of relevant analysis methods that previously had no popular Python implementation, naplib-python expands upon the existing MATLAB toolkit NAPlib (Khalighinejad et al., 2017), which focuses on phoneme response analysis methods, by porting several of these methods to Python and increasing their ease of use by ensuring they work with naplib-python data structures. One existing Python package, MNE-Python (Gramfort et al., 2013), supports a broad set of functionalities for neural data, including magnetoencephalography (MEG), electroencephalography (EEG), and intracranial EEG (iEEG). However, that package was not designed specifically for auditory neuroscience, and so it offers techniques applicable to many areas of neuroscience while lacking functionality that is particularly useful in the auditory domain, such as linguistic alignment and phonetic feature extraction. In this paper, we first describe the basic data structure and API used in the package, then provide an overview of the methods available in the package and describe their use in common analysis pipelines in the field.
2. Package Description
2.1. Data Structure and API
In auditory neuroscience, several problems often arise which make using a general-purpose data structure difficult. Stimuli tend to be non-uniform in duration, especially when using naturalistic stimuli like human speech (Hamilton & Huth, 2020). Additionally, a large variety of trial-specific metadata may be needed for analyses, such as transcripts of speech stimuli or event labels, which would traditionally need to be loaded and treated as separate variables. The Data object in naplib-python takes care of these problems by seamlessly storing any number of trials containing any number of fields, which may include things like auditory stimulus waveforms and time-frequency representations, neural responses, stimulus transcripts, and trial condition labels. By not enforcing equal durations and allowing various data types through its various fields, the Data class is reminiscent of a pandas DataFrame (McKinney, 2011), and in many ways it operates similarly. Fig. 1 shows a visual representation of the most typical information that could be stored in a Data instance. The Data object is designed to enable easy processing by trial, by field, or all together. Many functions in naplib-python can be called by either passing a set of parameters, or simply passing a Data object containing all the necessary fields to fill the parameters. For example, a user can fit a STRF model using the TRF class in naplib-python by passing in a Data instance and the fields for stimulus and response will be automatically extracted and used as the input and output when training the model. Alternatively, certain parameters can be passed in manually without needing them to be stored in a Data instance.
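The following is a minimal sketch of this workflow, assuming the Data constructor accepts a list of per-trial dictionaries and that the TRF class follows the usage described above; exact field names ('aud', 'resp'), attributes, and constructor arguments are illustrative assumptions and should be checked against the online documentation.

```python
# Minimal sketch: building a Data object from per-trial dictionaries and
# fitting a forward STRF. Field names and exact TRF arguments are assumptions.
import numpy as np
import naplib as nl

rng = np.random.default_rng(0)

# Three trials of unequal duration: a 128-band stimulus spectrogram ('aud')
# and responses from 10 electrodes ('resp'), both sampled at 100 Hz.
trials = []
for dur in (500, 750, 620):  # samples per trial
    trials.append({
        'aud': rng.standard_normal((dur, 128)),
        'resp': rng.standard_normal((dur, 10)),
        'condition': 'speech',
    })
data = nl.Data(trials)

# Fields can be inspected across trials or accessed per trial, like a list.
print(len(data), data.fields)   # number of trials and available fields (assumed attribute)
resp_trial0 = data['resp'][0]

# Fit a forward STRF (stimulus -> response) with lags from 0 to 300 ms,
# letting the Data object supply the input and output fields.
strf = nl.encoding.TRF(tmin=0, tmax=0.3, sfreq=100)
strf.fit(data=data, X='aud', y='resp')
```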
For convenient adoption, naplib-python supports loading data from the Brain Imaging Data Structure (BIDS) format (Gorgolewski et al., 2016; Niso et al., 2018; Pernet et al., 2019) directly into a Data object with a single line of code. Trials are automatically separated based on events defined in the BIDS files, and then the data is ready to be quickly processed. These features make the Data object easy for beginners to adopt and broadly useful for any type of analysis.
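As a hypothetical illustration of that single line of code, the reader function name and keyword arguments below are placeholders rather than confirmed API; the point is only that BIDS recordings arrive already split into trials inside a Data object.

```python
# Hypothetical one-line BIDS import; 'read_bids' and its arguments are
# placeholder names, not confirmed members of naplib.io.
import naplib as nl

data = nl.io.read_bids(root='/path/to/bids_dataset', subject='01',
                       datatype='ieeg', task='listening')
print(len(data))  # one trial per event defined in the BIDS events files
```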
2.2. Library Overview
In this section we summarize the main modules currently available in the package and the tools offered in each. Full documentation and examples for the functions within these modules, as well as the other utility modules, are available online at https://naplib-python.readthedocs.io.
2.2.1. Features
In auditory neuroscience, a wide variety of features have been proposed to describe both acoustic signals and neural data. An implementation of the auditory spectrogram (Yang et al., 1992) is provided, a time-frequency representation which models the inner ear and cochlear spectral decomposition of sound waves. Additionally, many linguistic features are available to describe speech signals. A forced aligner is provided based on the Prosodylab-Aligner (Gorman et al., 2011), as well as functions to extract phoneme and word alignment labels from the aligner’s output, which can be used to identify phonetic information and timing in speech stimuli.
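A hedged sketch of computing the auditory spectrogram for a stimulus waveform is shown below; the function location follows the features module described above, but the exact signature and default parameters should be verified in the documentation.

```python
# Sketch: cochlear-model time-frequency representation of a sound.
# The exact signature of auditory_spectrogram is an assumption here.
import numpy as np
import naplib as nl

fs = 16000
t = np.arange(0, 2.0, 1 / fs)
tone = np.sin(2 * np.pi * 440 * t)      # stand-in for a speech waveform

aud = nl.features.auditory_spectrogram(tone, fs)
print(aud.shape)                        # (time, frequency channels)
```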
2.2.2. Encoding
This module is dedicated to encoding models used by the auditory neuroscience community. For example, a robust temporal receptive field (TRF) class is implemented which interfaces naturally with the Data object. By default, the TRF model uses cross-validated ridge regression, but any class which adheres to the scikit-learn linear model API (Pedregosa et al., 2011) can be used, meaning that TRFs can be trained from L1-regularized models, elastic net models, or any other linear or non-linear user-defined classes. This is useful for training both forward STRF models (Theunissen et al., 2000, 2001), which predict neural responses from acoustic stimuli, or backward models for stimulus reconstruction (Bialek et al., 1991; Mesgarani et al., 2009), which reconstruct acoustic stimuli from neural signals.
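The sketch below swaps the default cross-validated ridge regression for an L1-regularized model and shows how a backward (reconstruction) model would simply exchange the input and output fields; the `estimator` keyword and field names are assumptions about the interface, not confirmed signatures.

```python
# Sketch: using a scikit-learn linear model inside the TRF class.
# The 'estimator' keyword and field names are assumptions for illustration.
from sklearn.linear_model import Lasso
import naplib as nl

# Forward STRF with 0-400 ms of lags and an L1-regularized estimator.
strf_l1 = nl.encoding.TRF(tmin=0, tmax=0.4, sfreq=100, estimator=Lasso(alpha=0.1))
# strf_l1.fit(data=data, X='aud', y='resp')    # reusing the Data object from above

# Backward model for stimulus reconstruction: swap input and output fields
# and use causal lags in the opposite direction.
recon = nl.encoding.TRF(tmin=-0.4, tmax=0, sfreq=100)
# recon.fit(data=data, X='resp', y='aud')
```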
2.2.3. Segmentation
Responses to stimulus onsets are frequently studied in auditory neuroscience, for example in work on evoked potentials (Gage et al., 1998; Näätänen et al., 1978; Picton, 2013) or stimulus onset-locked encoding (Gwilliams et al., 2018; Phillips et al., 2002; Hamilton et al., 2018). The segmentation module contains methods for segmenting multi-trial data based on aligned labels. For example, the aligned labels could be phoneme onsets (where phoneme alignment can be computed using the Features module described above), enabling easy analysis of phoneme onset responses.
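The underlying idea is illustrated by the self-contained NumPy sketch below, which cuts a fixed window around each aligned onset and averages the resulting epochs; it demonstrates the generic approach rather than the naplib-python segmentation API itself.

```python
# Generic onset-locked segmentation: slice a fixed window around each
# aligned phoneme-onset sample and stack the epochs for averaging.
import numpy as np

fs = 100                                   # sampling rate (Hz)
resp = np.random.randn(6000, 10)           # one trial: (time, electrodes)
onsets = np.array([120, 480, 910, 2300])   # phoneme onsets in samples
pre, post = int(0.1 * fs), int(0.5 * fs)   # 100 ms before to 500 ms after onset

epochs = np.stack([resp[t - pre:t + post] for t in onsets
                   if t - pre >= 0 and t + post <= len(resp)])
evoked = epochs.mean(axis=0)               # average onset response
print(epochs.shape, evoked.shape)          # (n_onsets, window, electrodes)
```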
2.2.4. Preprocessing
A significant amount of preprocessing is typically involved when analyzing neural signals, so this module includes several functions which are useful for a variety of signal types, especially intracranial recordings. There are functions for extracting the envelope and phase of different frequency bands using a filter bank followed by the Hilbert transform (Edwards et al., 2009), which can be used to extract the well-studied high-gamma envelope response for iEEG data, or as an input to further analyses of phase-amplitude coupling (Canolty et al., 2006; Tort et al., 2010). There are also generic filtering functions that operate on Data instances which are useful for performing notch filtering to remove line noise or filtering EEG/MEG data into different frequency bands.
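The filter-bank-plus-Hilbert idea can be sketched with plain SciPy as follows; naplib-python provides its own functions for this step, so the code below only illustrates the underlying computation for a high-gamma (roughly 70-150 Hz) envelope.

```python
# Generic sketch of high-gamma envelope extraction via a filter bank
# followed by the Hilbert transform (illustration only, not the naplib API).
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000                                   # iEEG sampling rate (Hz)
x = np.random.randn(10 * fs)                # one electrode, 10 s of data

# Small bank of bands spanning the high-gamma range.
bands = [(70, 90), (90, 110), (110, 130), (130, 150)]
envelopes = []
for lo, hi in bands:
    b, a = butter(4, [lo, hi], btype='bandpass', fs=fs)
    analytic = hilbert(filtfilt(b, a, x))   # analytic signal of the band
    envelopes.append(np.abs(analytic))      # instantaneous amplitude

high_gamma = np.mean(envelopes, axis=0)     # envelope averaged across bands
```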
2.2.5. Stats
Statistical analysis is critical for assessing the significance of findings in neuroscience. This module provides several statistical tools common to auditory neuroscience. For example, one of the first steps in many analysis pipelines is electrode selection, which can be done using a t-test between responses to speech and silence to identify stimulus-responsive electrodes (Mesgarani & Chang, 2012). Another common statistic offered in the package is the F-ratio, which is often used to quantify the discriminability of neural responses between different stimulus classes (Khalighinejad et al., 2021). Additionally, a linear mixed-effects model is offered for linear modeling that can control for effects such as subject identity, which may be needed when studying data pooled across heterogeneous subjects, as is common with iEEG data. Similarly, a generalized t-test method is offered, which performs t-tests while controlling for additional factors, such as subject identity when the tested distribution has underlying groupings.
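A generic sketch of t-test-based electrode selection is shown below; naplib-python's stats module wraps this kind of test, but the code here uses plain SciPy to illustrate the idea of comparing speech and silence responses per electrode.

```python
# Generic electrode selection: per-electrode t-test of speech vs. silence
# responses with a simple Bonferroni correction (illustration only).
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_electrodes = 64
speech = rng.standard_normal((500, n_electrodes)) + 0.5   # samples x electrodes
silence = rng.standard_normal((400, n_electrodes))

t_vals, p_vals = ttest_ind(speech, silence, axis=0)
responsive = np.where(p_vals < 0.01 / n_electrodes)[0]    # Bonferroni-corrected
print(f"{len(responsive)} speech-responsive electrodes")
```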
2.2.6. Input-Output
File input and output are supported in the IO module, which includes functions to save and load naplib-python data as well as to read from and write to third-party formats such as MATLAB files and BIDS datasets. These functions make naplib-python easy to use regardless of how a researcher's data are currently stored or at what stage of the analysis pipeline the package is adopted.
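A hedged sketch of this round trip is given below, under the assumption that the io module exposes simple save and load helpers; the names and argument order are not verified here, and MATLAB files could alternatively be read with the hdf5storage dependency before constructing a Data object.

```python
# Hedged sketch: 'save' and 'load' are assumed naplib.io helpers for persisting
# a Data object between sessions; verify names and argument order in the docs.
import naplib as nl

nl.io.save('session1.pkl', data)            # 'data' is the Data object built earlier
data_restored = nl.io.load('session1.pkl')

# MATLAB files can be read with the hdf5storage dependency and wrapped into
# per-trial dictionaries before building a Data object.
import hdf5storage
mat_contents = hdf5storage.loadmat('session1.mat')   # hypothetical file path
```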
3. Software Impact
With its large suite of implemented methods, naplib-python enables researchers in the field of auditory neuroscience to run common analysis pipelines in only a few lines of code, all while using a general data framework that applies to nearly any type of neural recording data. This allows for collaborations and easier code sharing across disciplines, even when data or recording methods differ significantly between researchers. Furthermore, the ability to interface between naplib-python and other commonly used packages greatly extends the utility of all related toolkits, since researchers can rely on the naplib-python framework and data structure for basic analyses but still utilize state-of-the-art methods available elsewhere without needing to write new code. There are multiple tutorial notebooks available on the online documentation illustrating how to integrate naplib-python with other toolkits, such as fitting encoding models on naplib-python data using NEMS and plotting EEG analysis results using MNE visualization tools.
In addition to providing an analysis framework that interfaces with various third-party packages, naplib-python implements a wide array of methods that are commonly used in the field but lack standard open-source implementations. As a result, naplib-python is well-positioned to become the standard toolbox for researchers in the field. Introducing this package to the field will therefore improve the reproducibility of new research by reducing the amount of independently produced code and making code sharing easier. The package was critical in a recent study of noise adaptation mechanisms in auditory cortex (Mischler et al., 2022).
4. Limitations and Future Improvements
The main limitation of naplib-python is that, because all operations are performed in memory, processing large (hours-long) datasets becomes highly inefficient. This could be addressed in the future with dynamic loading and saving of individual trial data. Additionally, while data can be imported from various sources, there is currently limited ability to export data to a wide variety of file structures, since naplib-python is primarily intended for the later stages of data analysis, beginning with preprocessing after raw data have been collected and stored.
Acknowledgements
We thank the members of the Neural Acoustic Processing Lab who provided feedback on the package and its applications. This work was supported by National Institutes of Health grant R01DC018805 and National Institute on Deafness and Other Communication Disorders grant R01DC014279. GM was supported in part by the National Science Foundation Graduate Research Fellowship Program under grant DGE2036197.
References
- Aertsen A. M. H. J., Olders J. H. J., & Johannesma P. I. M. (1981). Spectro-temporal receptive fields of auditory neurons in the grassfrog - III. Analysis of the stimulus-event relation for natural stimuli. Biological Cybernetics, 39(3). 10.1007/BF00342772
- Bialek W., Rieke F., de Ruyter van Steveninck R. R., & Warland D. (1991). Reading a neural code. Science, 252(5014). 10.1126/science.2063199
- Canolty R. T., Edwards E., Dalal S. S., Soltani M., Nagarajan S. S., Kirsch H. E., Berger M. S., Barbaro N. M., & Knight R. T. (2006). High gamma power is phase-locked to theta oscillations in human neocortex. Science, 313(5793). 10.1126/science.1128115
- Crosse M. J., Di Liberto G. M., Bednar A., & Lalor E. C. (2016). The multivariate temporal response function (mTRF) toolbox: A MATLAB toolbox for relating neural signals to continuous stimuli. Frontiers in Human Neuroscience, 10. 10.3389/fnhum.2016.00604
- David S. V. (2018). Incorporating behavioral and sensory context into spectro-temporal models of auditory encoding. Hearing Research, 360. 10.1016/j.heares.2017.12.021
- Easterbrook S. M. (2014). Open code for open science? Nature Geoscience, 7(11). 10.1038/ngeo2283
- Edwards E., Soltani M., Kim W., Dalal S. S., Nagarajan S. S., Berger M. S., & Knight R. T. (2009). Comparison of time-frequency responses and the event-related potential to auditory speech stimuli in human cortex. Journal of Neurophysiology, 102(1). 10.1152/jn.90954.2008
- Gage N., Poeppel D., Roberts T. P. L., & Hickok G. (1998). Auditory evoked M100 reflects onset acoustics of speech sounds. Brain Research, 814(1–2). 10.1016/S0006-8993(98)01058-0
- Gorgolewski K. J., Auer T., Calhoun V. D., Craddock R. C., Das S., Duff E. P., Flandin G., Ghosh S. S., Glatard T., Halchenko Y. O., Handwerker D. A., Hanke M., Keator D., Li X., Michael Z., Maumet C., Nichols B. N., Nichols T. E., Pellman J., … Poldrack R. A. (2016). The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Scientific Data, 3. 10.1038/sdata.2016.44
- Gorman K., Howell J., & Wagner M. (2011). Prosodylab-Aligner: A tool for forced alignment of laboratory speech. Canadian Acoustics - Acoustique Canadienne, 39(3).
- Gramfort A., Luessi M., Larson E., Engemann D. A., Strohmeier D., Brodbeck C., Goj R., Jas M., Brooks T., Parkkonen L., & Hämäläinen M. (2013). MEG and EEG data analysis with MNE-Python. Frontiers in Neuroscience, 7. 10.3389/fnins.2013.00267
- Gwilliams L., Linzen T., Poeppel D., & Marantz A. (2018). In spoken word recognition, the future predicts the past. Journal of Neuroscience, 38(35). 10.1523/JNEUROSCI.0065-18.2018
- Hamilton L. S., & Huth A. G. (2020). The revolution will not be controlled: natural stimuli in speech neuroscience. Language, Cognition and Neuroscience, 35(5). 10.1080/23273798.2018.1499946
- Khalighinejad B., Nagamine T., Mehta A., & Mesgarani N. (2017). NAPLib: An open source toolbox for real-time and offline Neural Acoustic Processing. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. 10.1109/ICASSP.2017.7952275
- Khalighinejad B., Patel P., Herrero J. L., Bickel S., Mehta A. D., & Mesgarani N. (2021). Functional characterization of human Heschl’s gyrus in response to natural speech. NeuroImage, 235. 10.1016/j.neuroimage.2021.118003
- McKinney W. (2011). pandas: a Foundational Python Library for Data Analysis and Statistics. Python for High Performance and Scientific Computing.
- Mesgarani N., & Chang E. F. (2012). Selective cortical representation of attended speaker in multi-talker speech perception. Nature, 485(7397). 10.1038/nature11020
- Mesgarani N., David S. V., Fritz J. B., & Shamma S. A. (2009). Influence of context and behavior on stimulus reconstruction from neural activity in primary auditory cortex. Journal of Neurophysiology, 102(6). 10.1152/jn.91128.2008
- Miłkowski M., Hensel W. M., & Hohol M. (2018). Replicability or reproducibility? On the replication crisis in computational neuroscience and sharing only relevant detail. Journal of Computational Neuroscience, 45(3). 10.1007/s10827-018-0702-z
- Mischler G., Keshishian M., Bickel S., Mehta A. D., & Mesgarani N. (2022). Deep neural networks effectively model neural adaptation to changing background noise and suggest nonlinear noise filtering methods in auditory cortex. NeuroImage, 119819.
- Muller E., Bednar J. A., Diesmann M., Gewaltig M. O., Hines M., & Davison A. P. (2015). Python in neuroscience. Frontiers in Neuroinformatics, 9. 10.3389/fninf.2015.00011
- Näätänen R., Gaillard A. W. K., & Mäntysalo S. (1978). Early selective-attention effect on evoked potential reinterpreted. Acta Psychologica, 42(4). 10.1016/0001-6918(78)90006-9
- Niso G., Gorgolewski K. J., Bock E., Brooks T. L., Flandin G., Gramfort A., Henson R. N., Jas M., Litvak V., Moreau J. T., Oostenveld R., Schoffelen J. M., Tadel F., Wexler J., & Baillet S. (2018). MEG-BIDS, the brain imaging data structure extended to magnetoencephalography. Scientific Data, 5. 10.1038/sdata.2018.110
- Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V., Vanderplas J., Passos A., Cournapeau D., Brucher M., Perrot M., & Duchesnay É. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12.
- Pernet C. R., Appelhoff S., Gorgolewski K. J., Flandin G., Phillips C., Delorme A., & Oostenveld R. (2019). EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Scientific Data, 6(1). 10.1038/s41597-019-0104-8
- Phillips D. P., Hall S. E., & Boehnke S. E. (2002). Central auditory onset responses, and temporal asymmetries in auditory perception. Hearing Research, 167(1–2). 10.1016/S0378-5955(02)00393-3
- Picton T. (2013). Hearing in time: Evoked potential studies of temporal processing. Ear and Hearing, 34(4). 10.1097/AUD.0b013e31827ada02
- Theunissen F. E., David S. V., Singh N. C., Hsu A., Vinje W. E., & Gallant J. L. (2001). Estimating spatio-temporal receptive fields of auditory and visual neurons from their responses to natural stimuli. Network: Computation in Neural Systems, 12(3). 10.1088/0954-898X/12/3/304
- Theunissen F. E., Sen K., & Doupe A. J. (2000). Spectral-temporal receptive fields of nonlinear auditory neurons obtained using natural sounds. Journal of Neuroscience, 20(6). 10.1523/jneurosci.20-06-02315.2000
- Tort A. B. L., Komorowski R., Eichenbaum H., & Kopell N. (2010). Measuring phase-amplitude coupling between neuronal oscillations of different frequencies. Journal of Neurophysiology, 104(2). 10.1152/jn.00106.2010
- Yang X., Wang K., & Shamma S. A. (1992). Auditory representations of acoustic signals. IEEE Transactions on Information Theory, 38(2). 10.1109/18.119739