Extracellular voltage thresholds for maximizing information extraction in primate auditory cortex: implications for a brain computer interface

James Bigelow; Brian J Malone

doi:10.1088/1741-2552/ab7c19

Author manuscript; available in PMC: 2025 Aug 25.

Published in final edited form as: J Neural Eng. 2021 Mar 4;18(3):10.1088/1741-2552/ab7c19. doi: 10.1088/1741-2552/ab7c19


PMCID: PMC12371764 NIHMSID: NIHMS2105464 PMID: 32126540

Abstract

Objective.

Research by Oby et al (2016 J. Neural Eng. 13 036009) demonstrated that the optimal threshold for extracting information from visual and motor cortices may differ from the optimal threshold for identifying single neurons via spike sorting methods. The optimal threshold for extracting information from auditory cortex has yet to be identified, nor has the optimal temporal scale for representing auditory cortical activity. Here, we describe a procedure to jointly optimize the extracellular threshold and bin size with respect to the decoding accuracy achieved by a linear classifier for a diverse set of auditory stimuli.

Approach.

We used linear multichannel arrays to record extracellular neural activity from the auditory cortex of awake squirrel monkeys passively listening to both simple and complex sounds. We executed a grid search of the coordinate space defined by the voltage threshold (in units of standard deviation) and the bin size (in units of milliseconds), and computed decoding accuracy at each point.

Main results.

The optimal threshold for information extraction was consistently near two standard deviations below the voltage trace mean, which falls significantly below the range of three to five standard deviations typically used as inputs to spike sorting algorithms in basic research and in brain-computer interface (BCI) applications. The optimal binwidth was minimized at the optimal voltage threshold, particularly for acoustic stimuli dominated by temporally dynamic features, indicating that permissive thresholding permits readout of cortical responses with temporal precision on the order of a few milliseconds.

Significance.

The improvements in decoding accuracy we observed for optimal readout parameters suggest that standard thresholding methods substantially underestimate the information present in auditory cortical spiking patterns. The fact that optimal thresholds were relatively low indicates that local populations of cortical neurons exhibit high temporal coherence that could be leveraged in service of future auditory BCI applications.

Keywords: primate, brain-computer interface, auditory, cortex, decoding, neural prosthetics, spike timing

1. Introduction

Extracting sensory and motor information from neural activity is a fundamental goal of both basic and applied neuroscience (Averbeck et al 2006, Nicolelis and Lebedev 2009, Wander and Rao 2014, Moxon and Foffani 2015). The long-standing experimental tradition of characterizing neuronal spike train responses associated with specific sensory stimuli and motor events has yielded considerable progress in revealing the functional organization of the brain, including both the global distribution of sensory and motor regions throughout the brain and the local organization of specific feature representations within a given sensory or motor area. Identifying reliable and distinct responses associated with specific stimuli or motor movements (e.g. Georgopoulos et al 1982), especially in primary sensory and motor areas, has permitted decoding of the stimulus or motor event from neural activity. This ability to extract stimulus and movement parameters from neural activity has provided a foundation for technologies including brain-computer interfaces (BCIs) and neural prostheses. For instance, intended movement trajectories may be decoded from motor cortical activity and subsequently translated into directions for a computer cursor or robotic arm (Simeral et al 2011, Collinger et al 2013, Pandarinath et al 2017). Decoding responses evoked by sensory events may similarly be used for BCI control, and constitutes a critical step toward developing stimulating neural prostheses (Kellis et al 2010, Pasley et al 2012, Kaufmann et al 2013, Smith et al 2013, Höhne and Tangermann 2014, Wander and Rao 2014, Flesher et al 2016, Van Eyndhoven et al 2016).

Basic physiology experiments have traditionally focused on stimulus-response relationships at the single neuron level. For in vivo applications, single-unit activity is primarily obtained through extracellular recordings (Yael and Bar-Gad 2017). Putative action potentials are identified in the raw extracellular signal as high-frequency voltage fluctuations exceeding a specified threshold. Clustering techniques incorporating spike timing and waveform parameters are then used to isolate the activity of one or more single neurons (Lewicki 1998, Rey et al 2015). Despite the historical emphasis on single-unit activity within the basic physiology literature, analyses reflecting the aggregate activity of multiple neurons have become increasingly common, in part due to growing recognition that local cell populations may be collectively tuned for specific stimulus parameters (Kreiman et al 2006, Moran and Bar-Gad 2010, Panzeri et al 2015). Similarly, although spike sorting was once thought to be critical for intracortical BCI applications, accumulating evidence now suggests that the information content of multi-unit signals, including unsorted spikes, is sufficient for robust BCI performance (Stark and Abeles 2007, Ventura 2008, Fraser et al 2009, Homer et al 2013, Perge et al 2014, Wander and Rao 2014, Oby et al 2016). For such applications, multi-unit signals offer the significant advantage of bypassing the computationally and/or manually intensive spike sorting step (Lewicki 1998, Christie et al 2015, Todorova et al 2014, Oby et al 2016).

The increasing popularity of multi-unit activity has resulted in efforts to identify strategies and best practices for maximizing information extraction from extracellular signals (Stark and Abeles 2007, Todorova et al 2014, Oby et al 2016). One particularly elegant paradigm systematically examined information available in threshold crossings defined by a broad range of extracellular voltage thresholds (Oby et al 2016). A key insight from this approach was that, for both kinematic parameters in primary motor cortex and visual stimulus parameters in primary visual cortex, information was maximized at substantially more permissive thresholds than have been conventionally adopted in BCI applications and basic physiological experiments, which typically range from 3σ to 5σ below the mean of the voltage trace (Rey et al 2015). This result suggests prior studies relying on multi-unit signals obtained with conventional thresholds may have failed to capture much of the information available in voltage threshold crossings for relatively small excursions. A second important observation was that optimal voltage thresholds differed significantly among motor and visual stimulus parameters, likely reflecting organizational differences in the cortical microcircuitry associated with each parameter. Estimating optimal voltage threshold settings thus has implications for performance benchmarks in applied settings such as neural prosthetics as well as for understanding the basic functional organization of cortex, suggesting this approach may be usefully applied beyond the visual and motor parameters investigated so far.

The auditory cortex occupies a central position among networks underlying auditory perception and cognition, forming a critical part of both the ascending pathway responsible for detailed acoustic information processing and an extended cognitive control network governing auditory-guided behaviors (Schreiner and Polley 2014). As such, it has been the subject of a large number of studies detailing its encoding of a wide range of acoustic parameters, as well as its sensitivity to contextual influences such as attention and behavioral state (Mesgarani et al 2009). More recently, auditory cortex has become a focal point of early foundational work toward developing central auditory neural interfaces, including studies investigating the possibility of enabling auditory perception through microstimulating implants (Smith et al 2013) and restoring communicative abilities through decoding inner speech imagery (Pasley et al 2012). The success of these efforts will ultimately depend on detailed understanding of how acoustic features are represented in auditory neural circuits and the ability to efficiently extract information about these parameters from extracellular signals. The motivation of the current study was to contribute to these long-term goals by estimating optimal voltage thresholds for decoding a range of fundamental acoustic parameters from extracellular signals in auditory cortex of awake nonhuman primates, including sound frequency and level, as well as temporal and spectral modulation.

Two additional aspects of neural encoding that may be particularly significant for the auditory system were addressed by the present threshold optimization analyses. First, numerous studies have documented the fine temporal encoding precision of auditory neurons, and have further demonstrated that much more information can often be extracted from temporal spiking patterns than is available in averaged evoked firing rates alone (Malone et al 2007, 2010, 2013, 2014, Kayser et al 2010, Garcia-Lazaro et al 2013, Panzeri et al 2015). A small but growing body of literature suggests this temporal information is utilized during auditory-guided behaviors. In one example, the auditory cortex of ferrets trained to discriminate vocalizations showed no evidence of changes in firing rates evoked by the trained stimuli. However, training significantly increased information available in temporal spiking patterns, which had to be analyzed at high temporal resolution to ensure accurate decoding (Schnupp et al 2006). In another study, rats were trained to perform a consonant discrimination task, after which their behavioral performance was correlated with neural discrimination performance assessed by decoding the trained stimuli from auditory cortical responses (Engineer et al 2008). In contrast to a previous study of visual motion processing, neural discrimination performance in the auditory task was well correlated with behavior only when spike timing information was preserved.

Considering these outcomes, the present study not only sought to identify extracellular threshold settings for maximizing information about auditory parameters, but also to identify the effective temporal resolution with which these parameters are encoded by the local cell populations supplying the multi-unit signals. These goals were approached by manipulating the temporal resolution of spike train decoding methods applied to threshold crossings obtained at a range of voltage thresholds (Malone et al 2007, Kayser et al 2010).

The second aspect of neural encoding explored in the current study relates to the question of how signals are represented in the presence of competing background events in natural environments (Mesgarani and Chang 2012, Rabinowitz et al 2013, Teschner et al 2016, Malone et al 2017). In contrast to typical laboratory settings, behaviorally relevant sounds in naturalistic settings do not occur in isolation, but instead arrive at the auditory periphery as part of a summed sound pressure waveform often heavily influenced by diverse sources of noise. Understanding how noise affects efficient information extraction from neural signals will be critical for most applied settings in which background noise is inevitable. Thus, the present study included analysis of information content available in threshold crossings about acoustic stimuli presented in isolation, as well as in the presence of mild to intense background noise.

In the following, we document four primary insights. First, across all stimulus sets analyzed, voltage thresholds that maximized information were substantially lower (1.5σ to 2.5σ) than conventional thresholds adopted by most previous studies (3σ to 5σ). Indeed, the most effective range of thresholds (where decoder accuracy was at least 70% of the maximum) was almost completely missed by the conventional range, implying these settings effectively discard the most information-rich source of neural activity. Second, although decoder accuracy improves substantially at these relatively low thresholds, it does not continue to improve monotonically as threshold approaches zero. It instead drops precipitously below ~1σ, implying the addition of large numbers of neural events occurring randomly in time with respect to the stimulus. Third, effective decoding of auditory stimulus parameters requires fine temporal discretization of cortical responses. For nearly all stimulus parameters tested, optimal decoding required binwidths of 10 ms or less. In all cases, optimal binwidth was strongly and inversely correlated with decoding accuracy. Fourth, in real world listening conditions, which invariably include sources of acoustic noise, auditory decoder accuracy can be improved with modest increases to the voltage threshold (still well below conventional ranges) and increases in sampling binwidth.

2. Methods

2.1. Subjects and surgical preparation

All procedures were carried out in strict compliance with recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health, and were approved by the Institutional Animal Care and Use Committee of the University of California, San Francisco. Details regarding protocol and methodology have been published previously (Malone et al 2013, 2015a, 2017), and are briefly summarized here.

Electrophysiological data were collected from a total of four adult squirrel monkeys (Saimiri sciureus, Monkey 1: male; Monkeys 2–4: female). Monkeys 1 and 2 served as subjects for the pure tone (PT) experiments, and Monkeys 3 and 4 served for all other experiments. Subjects were group housed with other conspecifics in a temperature- and humidity-controlled colony. Subjects had ad libitum access to water and primate diet supplemented with fresh fruits and vegetables. An environmental enrichment program was administered by UCSF Laboratory Animal Resource Center staff. Regular monitoring and care was provided by UCSF veterinary staff.

Prior to electrophysiological recording, subjects were acclimated to a primate chair. A head post was then surgically implanted to allow head restraint. For all surgical procedures, subjects were sedated with ketamine (25 mg kg⁻¹) and midazolam (0.1 mg kg⁻¹), and anesthetized with isoflurane gas (0.5%–5%). Implants were secured to the cranium with bone screws and dental acrylic. Perioperative antibiotics and analgesics were administered as needed in consultation with UCSF veterinary staff. After subjects were acclimated to the primate chair while head fixed, they underwent a second surgery in which a recording chamber was implanted over primary auditory cortex (A1). The temporal muscle was resected, the cranium overlying auditory cortex was exposed, and a recording chamber was secured with bone screws and dental acrylic. Perioperative care was administered as before.

Sterile procedures were used for all recording sessions to access auditory cortex. Following lidocaine (1%) application, a small cranial burr hole (2–3 mm) was drilled inside the recording chamber under magnification with a surgical microscope. A small incision was then made in the dura using micro-surgical instruments. The process was repeated as needed for subsequent recording sessions to expose additional areas of auditory cortex. Between recording sessions, implants were cleaned aseptically and the chamber was filled with antibiotic ointment and sealed with a metal cap.

2.2. Auditory stimuli

A summary of the stimulus sets used in each experiment is presented in figure 1. All stimuli were generated in MATLAB (MathWorks, Natick, MA) and presented at a sample rate of 96 kHz. Sounds were delivered through a free-field speaker centrally in front of the subject at ear level, 40 cm from the interaural line. Sound levels were calibrated with a Brüel & Kjær Model 2209 meter using an A-weighted decibel filter and a Model 4192 microphone. We summarize the stimulus details for each stimulus class below.

Pure tones (PT).

PTs comprise sinusoidal waveforms characterized by a single frequency and amplitude, and represent perhaps the most widely-used stimulus class in auditory physiology for characterizing sensitivity to sound frequency and level throughout the auditory system (Merzenich and Brugge 1973, Cheung et al 2001). The PT stimulus set in the present experiment comprised tone pips (50 ms duration, 5 ms on- and off-cosine-squared ramps) spanning 60 frequencies logarithmically spaced between 0.5 and 40 kHz and 15 levels between 0 and 70 dB in 5 dB steps (figure 1(A)). Three repetitions of each frequency/level combination were presented in pseudorandom order with an interstimulus interval (ISI) of 300 ms.
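The size of the PT stimulus grid follows directly from the numbers above. The following is a minimal sketch of how such a grid could be enumerated; the helper name `log_spaced` and the random seed are illustrative assumptions, not part of the original stimulus-generation code (which was written in MATLAB).

```python
import random

def log_spaced(lo, hi, n):
    """Return n values logarithmically spaced from lo to hi (inclusive)."""
    ratio = (hi / lo) ** (1.0 / (n - 1))
    return [lo * ratio ** i for i in range(n)]

# Values taken from the text: 60 frequencies between 0.5 and 40 kHz,
# 15 levels from 0 to 70 dB in 5 dB steps, 3 repetitions each.
freqs_hz = log_spaced(500.0, 40000.0, 60)
levels_db = list(range(0, 75, 5))
n_reps = 3

# One entry per presentation, shuffled into a pseudorandom order.
# The seed is arbitrary and only makes the sketch reproducible.
trials = [(f, lvl) for f in freqs_hz for lvl in levels_db for _ in range(n_reps)]
random.Random(0).shuffle(trials)
```

Under these settings the grid contains 60 × 15 = 900 frequency/level combinations and 2700 presentations in total.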

Sinusoidal amplitude modulated tones, noise (SAMt, SAMn).

Information present in complex acoustic signals is often divided into fine structural cues defining the spectral content of the ‘carrier’ signal, and the slower changes in the overall amplitude of the pressure waveform defining the envelope (Rosen 1992, Smith et al 2002, Joris et al 2004, Malone and Schreiner 2010). SAM stimuli have been used to characterize temporal encoding at various stages of the auditory system in many species (Joris et al 2004, Malone and Schreiner 2010), including the auditory cortical fields of nonhuman primates such as the rhesus macaque (Malone et al 2007, 2010, 2014), squirrel monkey (Beiser and Muller-Preuss 1996, Malone et al 2013, 2015a), and marmoset (Liang et al 2002).

In this study, SAM stimuli (figure 1(B)) were presented with either a pure tone carrier (SAMt), or a noise carrier (SAMn) centered on the site’s best frequency with a bandwidth of two octaves (SigGen, Tucker-Davis Technologies, Gainesville, FL). The spectrum of the noise carriers was flat within that range. Tonal SAM consisted of a sinusoidal carrier tone ($f_c$) modulated sinusoidally by a second tone at a lower frequency ($f_m$) such that $s(t) = A\,[1 + m \sin(2\pi f_m t + \phi)]\,\sin(2\pi f_c t)$. For noise SAM, the $\sin(2\pi f_c t)$ term is replaced by the noise carrier, but the modulation term remains the same. The amplitude term $A$ was adjusted to equalize the level. The phase term, $\phi$, was equal to $-\pi/2$, so that each modulation cycle begins and ends at the minimum amplitude within the cycle. For all stimuli, the modulation depth $m$ was set to 100%. The list of tested modulation frequencies was typically 4, 6, 8, 10, 16, 24, 32, 64, 96, 128, 192, 256, 384, and 512 Hz.
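The tonal SAM expression can be sketched in a few lines. The function names, the 1 kHz carrier, and the 250 ms duration below are hypothetical choices for illustration; only the 96 kHz sample rate, the 100% modulation depth, and the −π/2 starting phase come from the text.

```python
import math

def sam_envelope(t, fm_hz, depth=1.0, phase=-math.pi / 2):
    """Modulation term 1 + m*sin(2*pi*fm*t + phase)."""
    return 1.0 + depth * math.sin(2 * math.pi * fm_hz * t + phase)

def sam_tone(fc_hz, fm_hz, dur_s, fs_hz=96000, depth=1.0, amp=1.0):
    """Sampled s(t) = A * envelope(t) * sin(2*pi*fc*t) as a list of floats."""
    n = int(round(dur_s * fs_hz))
    return [
        amp * sam_envelope(i / fs_hz, fm_hz, depth) * math.sin(2 * math.pi * fc_hz * i / fs_hz)
        for i in range(n)
    ]

# Hypothetical example: 1 kHz carrier, 4 Hz modulation, one full modulation cycle.
wave = sam_tone(fc_hz=1000.0, fm_hz=4.0, dur_s=0.25)
```

With the phase set to −π/2, the envelope is zero at the start of each cycle and reaches its maximum of 2A half a cycle later, which is the behavior described above.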

Frequency modulated sweeps (FMS).

Logarithmic FMS were presented at different velocities (octaves/s), directions (up or down) and levels (figure 1(C)). The FMS were delivered via an RP2.1 (Tucker-Davis Technologies, Alachua, FL). FMS frequency ranged from 50 to 21 000 Hz in upward and downward directions with rates of frequency change of 10, 20, 35, 60, 85, and 110 octaves s⁻¹. The corresponding sweep durations were 871, 436, 249, 145, 103, and 79 ms. A constant tone (100 ms) was appended to each FMS at the starting and ending frequencies. These constant frequency portions were gated with 5 ms cosine ramps. FMS stimuli were played near 60 dB sound-pressure level (SPL) with energy-matching compensations made for the different sweep velocities (Malone et al 2017). Although the background noise was continuous, data were collected in trials lasting 2500 ms.
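The sweep durations listed above follow from the frequency span in octaves divided by the sweep rate. A short check, with variable names chosen here for illustration:

```python
import math

f_lo_hz, f_hi_hz = 50.0, 21000.0
rates_oct_per_s = [10, 20, 35, 60, 85, 110]

# Frequency span of the sweep in octaves: log2(21000 / 50), roughly 8.71.
span_oct = math.log2(f_hi_hz / f_lo_hz)

# Duration (ms) = span (octaves) / rate (octaves per second) * 1000.
durations_ms = [round(1000.0 * span_oct / r) for r in rates_oct_per_s]

def upward_sweep_freq(t_s, rate_oct_per_s, f_start_hz=f_lo_hz):
    """Instantaneous frequency of an upward logarithmic sweep at time t."""
    return f_start_hz * 2.0 ** (rate_oct_per_s * t_s)
```

Rounded to the nearest millisecond, `durations_ms` reproduces the values reported in the text (871, 436, 249, 145, 103, and 79 ms).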

Each stimulus was repeated 20 times in pseudorandom order with a minimum intertrial interval of 500 ms. In some experiments, the FMS level (20, 30, 40, 50, 60, 71, and 85 dB SPL) was varied while presenting the ascending sweep at 35 oct s⁻¹. When sweep level was varied in this way, changes in both sweep level (dB) and trajectory (ID) were pseudorandomly interleaved.

Frequency modulated sweeps in noise (FMS_SNR).

The same FMS stimuli described above were also presented at different signal-to-noise ratios (SNRs) obtained by varying the background noise level. We refer to the presentation of the FMS in silence as the +∞ SNR condition. The remaining SNRs (0, −5, −10, −15 and −20 dB) corresponded to background noise levels of 60, 65, 70, 75 and 80 dB SPL for sweep levels of 60 dB. Different SNR conditions were presented in blocks, and within each block the individual FMS were presented in pseudorandom order. Order of the SNR conditions was varied across penetrations to reduce order effects related to changes in recording conditions over time. One FMS condition (the ascending sweep of 35 oct s⁻¹) was also tested at several stimulus levels (20, 30, 40, 50, 60, 71 and 85 dB) across the range of background noise levels described above. The absolute background noise levels for this condition were the same as for the full FMS (60, 65, 70, 75 and 80 dB SPL).
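The mapping between SNR and noise level is a simple difference in decibels. The sketch below reproduces the stated SNRs for the 60 dB sweeps. It also tabulates the SNR implied for the level-varied condition, which is not listed in the text and is derived here under the assumption that SNR equals sweep level minus noise level.

```python
sweep_level_db = 60
noise_levels_db = [60, 65, 70, 75, 80]

# SNR (dB) = signal level (dB SPL) - noise level (dB SPL).
snrs_db = [sweep_level_db - n for n in noise_levels_db]

# Level-varied condition (ascending 35 oct/s sweep): implied SNR for each
# combination of sweep level and noise level. Derived, not reported in the text.
sweep_levels_db = [20, 30, 40, 50, 60, 71, 85]
implied_snr_db = {
    (s, n): s - n for s in sweep_levels_db for n in noise_levels_db
}
```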

2.3. Electrophysiology

Recordings were conducted inside an anechoic sound attenuation chamber (Industrial Acoustics Company, Bronx, NY). Extracellular data were collected using 16-channel linear electrode arrays (177 μm² contact size, 100 or 150 μm spacing, NeuroNexus Technologies, Ann Arbor, MI). Probes were advanced into cortex with a hydraulic microdrive (David Kopf Instruments, Tujunga, CA) to depths at which neural activity was evident on most or all channels. Penetrations were approximately perpendicular to the surface of the exposed cortex, although it was not possible to achieve strict orthogonality for every recording given the complex anatomy of auditory cortex near the superior temporal sulcus (Cheung et al 2001). For the PT experiments, extracellular signals were amplified with an RA16 Medusa preamplifier (Tucker-Davis Technologies, Gainesville, FL), band-pass filtered (800–5000 Hz) and stored to hard disk at 30.3 kHz using a Cheetah A/D system (Neuralynx, Inc., Bozeman, MT). For the SAM and FMS experiments, extracellular signals were amplified with an RA16 Medusa preamplifier, band-pass filtered (600–7000 Hz) and stored to hard disk at 25 kHz using an RX-5 A/D system with BrainWare software (Tucker-Davis Technologies, Gainesville, FL). For parity with previous research (Oby et al 2016), all signals were filtered offline using a Kaiser window with a 700–3000 Hz passband.

We quantified the information about auditory stimulus parameters by decoding extracellular voltage deflections defined over a range of specified thresholds. Following previously published methodology (Oby et al 2016), thresholds were defined by the standard deviation (σ) of the continuous filtered voltage trace (including stimulus periods and interstimulus intervals). Negative threshold crossings (TCs) were extracted at voltage thresholds spanning 0 to −10σ in 0.5σ steps; hereafter they are referred to by their absolute value for notational convenience. As in prior work, TCs included at a given threshold step are inclusive of TCs of all lower steps. The time for each TC was defined by the first voltage sample to exceed the specified threshold. TCs cannot be unequivocally classified as spikes, particularly for low threshold values, where TCs result from the superposition of multiple coincident action potentials distal to the probe.
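The thresholding step can be illustrated with a minimal sketch. It assumes the threshold is placed at the trace mean minus a multiple of the population standard deviation, and that a crossing is the first sample below threshold after a sample at or above it. The function name and toy trace are hypothetical; this is not the authors' code.

```python
import statistics

def crossing_indices(trace, k_sigma):
    """Sample indices where the trace first drops below mean - k_sigma * SD."""
    mu = statistics.fmean(trace)
    sigma = statistics.pstdev(trace)  # population SD; an assumption of this sketch
    thresh = mu - k_sigma * sigma
    return [
        i for i in range(1, len(trace))
        if trace[i] < thresh and trace[i - 1] >= thresh
    ]

# Thresholds spanning 0 to 10 sigma (absolute value) in 0.5 sigma steps.
thresholds = [0.5 * k for k in range(21)]

# Toy trace: flat baseline with one large and one smaller two-sample deflection.
toy = [0.0] * 100
toy[10], toy[50], toy[51] = -5.0, -2.0, -3.0
counts = {k: len(crossing_indices(toy, k)) for k in thresholds}
```

Crossing times in seconds follow by dividing each index by the sampling rate. In the toy trace, the smaller deflection is detected at permissive thresholds but lost at stricter ones, so the count falls as the threshold becomes more negative.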

Example data depicting the voltage thresholding procedure are provided in figure 2. Generally, more negative thresholds resulted in lower TC counts (figure 2(C)) and longer inter-TC-intervals (figure 2(D)), both of which have been negatively correlated with stimulus information in previous studies of single-unit and multi-unit activity defined by conventional voltage threshold settings (Shih et al 2011). For many channels, zero or near zero TCs were obtained at extreme voltage thresholds (>5σ). Because decoder accuracy is necessarily chance without any TCs to decode (or similarly, with only a single TC), only channels for which TCs occurred at a rate of at least 0.1 Hz at a given threshold were included. The minimum event rate threshold (0.1 Hz) thus ensured that decoding results below were not spuriously reduced by attempting to decode empty or nearly empty response vectors. The proportion of channels analyzed at each threshold is reflected by the area of the line plot markers in figures 4–7, panels D and E.
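The minimum event rate criterion amounts to a per-threshold filter on each channel. The sketch below assumes the rate is computed over the full recording duration, which the text does not specify; the function names and example counts are hypothetical.

```python
MIN_RATE_HZ = 0.1  # minimum threshold-crossing rate stated in the text

def passes_rate_criterion(n_crossings, recording_dur_s, min_rate_hz=MIN_RATE_HZ):
    """True if the mean crossing rate is at least min_rate_hz."""
    return n_crossings / recording_dur_s >= min_rate_hz

def included_thresholds(counts_by_threshold, recording_dur_s):
    """Thresholds (in sigma) at which a channel would be retained for decoding."""
    return [
        k for k, n in sorted(counts_by_threshold.items())
        if passes_rate_criterion(n, recording_dur_s)
    ]

# Hypothetical channel: crossing counts at three thresholds over a 600 s recording.
example_counts = {2.0: 5000, 5.0: 120, 8.0: 3}
kept = included_thresholds(example_counts, recording_dur_s=600.0)
```

Here the channel would contribute at 2σ and 5σ (about 8.3 Hz and 0.2 Hz) but not at 8σ (0.005 Hz).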

Figure 4. Optimal decoding parameters for pure tone frequency and level. (A–C) Example data showing stimulus-aligned threshold crossings and results for decoding accuracy and binwidth optimization for a single recording site. (A) The top panels in each column depict the response area for an example recording site at thresholds of 2σ (left) and 5σ (right). The red and yellow rectangles delimit the parameter ranges that were decoded for tone frequency and tone level, respectively. Responses were averaged across the shorter dimension of each rectangle and then decoded across the longer dimension. The lower panels illustrate the pure tone responses within each colored rectangle as rasters, segregated by parameter (frequency: lower left panel; level: lower right panel). (B) Example curves describing how decoding accuracy for pure tone frequency (red) and level (yellow) in A varied across voltage threshold are shown relative to the adjusted significance criterion (dashed line). (C) Example curves show how the binwidth that maximized decoding accuracy for tone frequency (red) and level (yellow) varied with voltage threshold. (D) and (E) Population summary for decoding accuracy and binwidth optimization across all recording sites meeting criteria ensuring sound responsiveness and minimum threshold crossing counts (see section 2). Solid lines indicate the mean and shading indicates standard error of the mean. Marker sizes are proportional to the fraction of total sites meeting inclusion criteria included at each threshold (see section 2). (D) Curves show how the population-averaged decoding accuracy for tone frequency (red) and level (yellow) varied with voltage threshold. Boxplots in the inset panel compare the distribution of best thresholds across recording sites for the tone frequency and level decoding paradigms. Boxplots on this and other figures indicate the median, 25th, and 75th percentiles of the data distribution. Significant differences are indicated by an asterisk (p < 0.05, Wilcoxon rank-sum test). (E) Same graphical conventions as D, illustrating analogous results for the optimal binwidth.

Figure 7. Optimal decoding parameters for the rate of frequency modulated sweeps (FMS) embedded in wideband noise. (A–C) Example data showing stimulus-aligned threshold crossings and results for decoding accuracy and binwidth optimization for a single recording site. (A) The rows of panels in each column contain response rasters for FMS for three different signal-to-noise ratios (SNRs). Events are shown for voltage thresholds of 2σ (left column) and 5σ (right column). (B) Example curves describing how decoding accuracy for FMS rate varied across voltage threshold for the responses in A are shown relative to the adjusted significance criterion (dashed line). Responses obtained for the least favorable SNRs are indicated by darker shades of blue. (C) Example curves depict how the binwidth that maximized decoding accuracy for FMS rate varied across voltage threshold for all tested SNRs. (D) and (E) Population summary for decoding accuracy and binwidth optimization across all recording sites meeting criteria ensuring sound responsiveness and minimum threshold crossing counts (see section 2). Solid lines indicate the mean and shading indicates standard error of the mean. Marker sizes are proportional to the fraction of total sites meeting inclusion criteria included at each threshold (see section 2). (D) Curves show how the population-averaged decoding accuracy for FMS rate varied with voltage threshold. Boxplots in the inset panel compare the distribution of best thresholds across recording sites when decoding the FMS rate at different SNRs. Asterisks indicate significant differences (p < 0.05, Wilcoxon rank-sum tests). (E) Same graphical conventions as D, illustrating analogous results for the optimal binwidth.

2.4. Stimulus decoding analysis

Decoding accuracy for a nearest-neighbor linear classifier (Foffani and Moxon 2004) provides a lower-bound estimate of the mutual information between the stimulus and the neural response (Schnupp et al 2006). The classifier assigns the neural response on each trial to the eliciting stimulus via a template matching procedure that can be continuously scaled in temporal resolution by varying the bin size used to enumerate features of cortical activity, such as action potentials or threshold crossings. Details for this procedure have been provided in previous reports using a similar methodology (e.g. Hoglen et al 2018). Because the optimal temporal resolution and the optimal threshold setting could interact to determine the optimal readout parameters for auditory cortex (figure 3), we evaluated decoding accuracy at temporal resolutions spanning multiple orders of magnitude (100 bin sizes logarithmically spaced between 0.1 and 1000 ms). Responses to each stimulus were averaged across trials to form a response template for that stimulus, and binned to form a bin-dimensional vector representing the response across time. Individual trials were binned at the same temporal resolution and compared against each response template by computing the Euclidean distance between the trial and the template vectors. The stimulus associated with the nearest template in the response space was assumed to be the stimulus that elicited the response on that trial. We used a complete cross validation procedure so no trial was compared with a response template that included itself. The decoding procedure generates a confusion matrix whose columns represent the actual stimulus and whose rows represent the stimulus selected by the decoding algorithm. Correctly identified trials result in matrix entries along the diagonal. Decoding accuracy, the percentage of correctly classified trials, was computed as the sum of the diagonal entries (i.e. the trace of the matrix) divided by the total number of trials.
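The template-matching procedure described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the authors' code: cross validation is implemented as leave-one-out (the held-out trial is dropped from its own stimulus template), ties go to the first template, at least two trials per stimulus are required, and accuracy is returned as a fraction. All function names and the toy data are hypothetical.

```python
import math

def bin_counts(event_times_ms, bin_ms, dur_ms):
    """Histogram of event times using bins of width bin_ms over [0, dur_ms)."""
    n_bins = max(1, math.ceil(dur_ms / bin_ms))
    counts = [0.0] * n_bins
    for t in event_times_ms:
        if 0 <= t < dur_ms:
            counts[min(int(t // bin_ms), n_bins - 1)] += 1
    return counts

def decode(trials_by_stim, bin_ms, dur_ms):
    """Nearest-template decoding; returns (accuracy, confusion matrix).

    trials_by_stim[s] is a list of trials for stimulus s, each a list of
    event times in ms. Confusion rows are decoded stimuli, columns actual.
    """
    binned = [[bin_counts(tr, bin_ms, dur_ms) for tr in trials] for trials in trials_by_stim]
    n_stim = len(binned)
    confusion = [[0] * n_stim for _ in range(n_stim)]
    for actual, trials in enumerate(binned):
        for i, trial in enumerate(trials):
            best, best_d = None, math.inf
            for cand, cand_trials in enumerate(binned):
                # Exclude the held-out trial from its own stimulus template.
                pool = [v for j, v in enumerate(cand_trials) if not (cand == actual and j == i)]
                template = [sum(col) / len(pool) for col in zip(*pool)]
                d = math.dist(trial, template)  # Euclidean distance
                if d < best_d:
                    best, best_d = cand, d
            confusion[best][actual] += 1
    n_trials = sum(len(t) for t in binned)
    accuracy = sum(confusion[k][k] for k in range(n_stim)) / n_trials
    return accuracy, confusion

# 100 bin sizes logarithmically spaced between 0.1 and 1000 ms, as in the text.
bin_sizes_ms = [0.1 * (1000.0 / 0.1) ** (i / 99) for i in range(100)]

# Toy data: two stimuli that differ only in response timing, not event count.
toy = [[[10.0], [11.0], [12.0]], [[60.0], [61.0], [62.0]]]
acc_fine, conf_fine = decode(toy, bin_ms=10.0, dur_ms=100.0)
acc_coarse, _ = decode(toy, bin_ms=100.0, dur_ms=100.0)
```

In the toy example the two stimuli are decoded perfectly at 10 ms resolution, whereas a single 100 ms bin reduces each response to an identical count and accuracy falls to the two-stimulus chance level. Sweeping `bin_ms` over `bin_sizes_ms` and repeating at each voltage threshold gives the kind of grid search described in the text.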

Figure 3. Illustration of how stimulus decoding accuracy varies with temporal scale and the voltage threshold setting. (A) Peristimulus time histograms (PSTHs) depicting the response to a single presentation of a 4 Hz SAM tone for an example electrode site. The same response is shown for voltage thresholds of 2σ (left pair) and 5σ (right pair). Within each pair, the left and right PSTHs show the same response binned at 10 ms and 50 ms resolution, respectively. (B) Each column depicts the PSTH associated with the modulation frequency indicated on each row for all trials (the response obtained for the single trial shown in A is excluded). The response templates in each column are shown at the same binning resolution and voltage threshold as the single trial PSTHs in A. Lines to the right of each column of PSTHs indicate the Euclidean distance between the trial PSTHs in A and template PSTHs in B, with asterisks indicating the minimum value identified by the decoding algorithm. (C) Confusion matrices at the base of the columns in A and B index the modulation frequency value returned by the decoder against the actual modulation frequency on each trial. The decoding accuracy value shown above each matrix was obtained by taking the ratio of diagonal entries to all matrix entries. (D) The black and gray curves indicate how decoding accuracy varies as a function of voltage threshold for binning resolutions of 10 and 50 ms, respectively. (E) The black and gray curves indicate how decoding accuracy varies as a function of binning resolution for thresholds of 2σ and 5σ, respectively.

We determined whether the decoding accuracy significantly exceeded chance performance by generating a distribution of simulated accuracy values by random assignment. Because we report the actual decoding accuracy at the temporal resolution that maximized the performance of the decoding algorithm, we repeated the random assignment procedure 100 times (for the 100 tested bin sizes) and took the maximum value. We estimated decoding accuracy at chance (indicated by a dashed line on each of the relevant figures) as the median of the values obtained after 1000 iterations of this process. We computed chance decoding accuracy separately for each stimulus set to account for differences in the numbers of stimuli.
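One way to read the chance-level procedure is sketched below. The text does not specify how the random assignment was drawn, so uniform random guessing among the stimuli is an assumption here, as are the function name `chance_accuracy` and its parameters.

```python
import random
import statistics

def chance_accuracy(n_stimuli, trials_per_stimulus,
                    n_bin_sizes=100, n_iterations=1000, seed=0):
    """Median, over iterations, of the best of n_bin_sizes random-guess accuracies."""
    rng = random.Random(seed)
    actual = [s for s in range(n_stimuli) for _ in range(trials_per_stimulus)]
    maxima = []
    for _ in range(n_iterations):
        best = 0.0
        # One random assignment per tested bin size; keep the maximum,
        # mirroring the selection of the best bin size for the real data.
        for _ in range(n_bin_sizes):
            hits = sum(rng.randrange(n_stimuli) == a for a in actual)
            best = max(best, hits / len(actual))
        maxima.append(best)
    return statistics.median(maxima)

# Small run for illustration; the defaults follow the counts given in the text.
example_chance = chance_accuracy(5, 20, n_bin_sizes=10, n_iterations=50)
```

Because the maximum is taken over many random assignments, the resulting chance level sits above the naive value of one divided by the number of stimuli.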

We used two main tests to provide statistical verification of our main results. We computed one-way ANOVA tests to verify that the voltage threshold had a significant effect on decoding accuracy and optimal binwidth. We computed correlation coefficients between optimal bin size and voltage threshold to verify that the temporal resolution of the analysis was correlated with the threshold value.

2.5. Site inclusion criteria

Limitations inherent to recording from awake animals with multichannel electrode arrays made it impractical to ensure that every recording site was in auditory cortex prior to initiating data collection (e.g. the deepest sites may have extended into white matter and the shallowest sites may have rested above the dura mater). We thus imposed two recording site inclusion criteria in the analyses below aimed at removing ‘dead channels’ and retaining only sites likely to have been in cortex, from which decoder accuracy could reasonably be expected to be above chance.

The first criterion required the presence of threshold crossings at a rate of 0.1 Hz or greater for at least one threshold value, as described above (e.g. removing sites above dura from which action potentials were not detected). The second criterion ensured that sites fell within sound-responsive regions of cortex with potentially differentiable responses among stimulus parameters of interest (described individually for each dataset below). The criteria thus ensure the optimal threshold and binwidth estimation procedures are valid under the assumption that recording sites reflect cortical activity related to the stimulus set of interest. In practice, however, these inclusion criteria made little impact on any of the results below, and affected none of the primary conclusions.

3. Results

Varying the voltage threshold has a significant impact on the rate of TCs (figure 2) because it determines the volume of the ‘listening sphere’ (Oby et al 2016) of the electrode. Figure 2(A) shows an example of a voltage waveform obtained during a recording from the auditory cortex of an awake squirrel monkey at different temporal resolutions (top versus bottom). Red lines indicating various thresholds show how the pattern and rate of TCs vary with the threshold. Examples of the voltage deflections that constitute TC events are illustrated in figure 2(B). The steepest slopes of the functions defining the population averages for TC rate and the inter-TC intervals occur for thresholds in the range from 0 to −5σ of the voltage signal, and span more than two orders of magnitude across that range (figures 2(C) and (D)).

The essential question is whether TCs for low (absolute-valued) thresholds contain useful stimulus information. Very large voltage deflections are dominated by action potentials occurring proximate to the recording electrode; smaller voltage deflections reflect action potentials from more distant neurons, perhaps with different stimulus preferences. In the limit, near zero, TCs might be dominated by uninformative neural noise. The temporal resolution used to characterize the pattern of TCs also sets important limits on the amount of stimulus information that can be extracted from them, particularly for auditory signals with time-varying features, such as modulations in amplitude or frequency. Because changing the event threshold changes the TC rate (figure 2), and changing the temporal resolution changes the interval over which TC rates are measured, it is crucial that both parameters be jointly sampled to estimate how much information can be decoded from TC patterns.
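As a rough illustration of how the threshold setting changes the TC count, the sketch below marks negative-going crossings of a threshold placed a given number of standard deviations below the trace mean. The function name, the absence of any refractory or dead-time rule, and the toy trace are assumptions for illustration only.

```python
import statistics

def threshold_crossings(voltage, k_sigma, fs_hz):
    """Return times (ms) at which the trace first falls to or below mean - k_sigma * SD."""
    mu = statistics.fmean(voltage)
    sigma = statistics.pstdev(voltage)
    threshold = mu - k_sigma * sigma
    times_ms = []
    for i in range(1, len(voltage)):
        # A crossing is a sample at or below threshold preceded by one above it.
        if voltage[i - 1] > threshold >= voltage[i]:
            times_ms.append(1000.0 * i / fs_hz)
    return times_ms

# Toy trace sampled at 1 kHz: one large and one small negative deflection.
toy_trace = [0.0] * 1000
toy_trace[100] = -10.0
toy_trace[500] = -4.0
permissive = threshold_crossings(toy_trace, k_sigma=2, fs_hz=1000)
stringent = threshold_crossings(toy_trace, k_sigma=20, fs_hz=1000)
```

In this toy case the permissive threshold captures both deflections, while the stringent one keeps only the larger event.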

Figure 3 shows an example of the decoding method employed throughout this report (see Methods) applied to neural responses to a 4 Hz SAM stimulus for two distinct threshold values (2σ and 5σ) and temporal resolutions (10 and 50 ms). Figure 3(A) shows peristimulus time histograms (PSTHs) for the four combinations of threshold values and bin sizes for a single trial. Figure 3(B) shows the corresponding response templates built from binned responses on the remaining trials. For 64 Hz SAM, for example, the template binned at 10 ms resolution and thresholded at 2σ evidences clear synchronization to the stimulus envelope; this response synchrony is obscured by binning at 50 ms, and eliminated by thresholding at 5σ.

The confusion matrices shown in figure 3(C) indicate that precise binning of permissively thresholded events results in significant increases in decoding accuracy relative to the alternatives. Diagonal entries indicate trials that were correctly decoded. Figures 3(D) and 3(E) illustrate how decoding accuracy varies as a function of voltage threshold and binwidth, respectively. In figure 3(D), the effect of varying the voltage threshold is obscured when the bin size is too large because of the restricted range of the decoding accuracy curve. As figure 3(E) shows, bin sizes of less than 5 ms are required to maximize decoding accuracy for this recording site when the threshold values are less than 2σ. A stringent threshold combined with relatively coarse binning of the spike times would recover only a fraction of the stimulus information available. For this reason, we sampled a two-dimensional surface defined by threshold values and temporal resolutions in all the stimulus-specific analyses that follow.
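The joint sampling of threshold and bin size can be sketched as a simple grid search. The 100 logarithmically spaced bin sizes between 0.1 and 1000 ms follow the Methods; the threshold grid (0 to 10σ in 0.5σ steps), the helper names and the toy accuracy function are hypothetical stand-ins for a real decoder.

```python
import math

def log_spaced(lo, hi, n):
    """n values logarithmically spaced from lo to hi (inclusive)."""
    step = (math.log10(hi) - math.log10(lo)) / (n - 1)
    return [10 ** (math.log10(lo) + i * step) for i in range(n)]

def grid_search(thresholds, bin_sizes, accuracy_fn):
    """For each threshold, return (best accuracy, bin size achieving it)."""
    surface = {}
    for thr in thresholds:
        best_bin = max(bin_sizes, key=lambda b: accuracy_fn(thr, b))
        surface[thr] = (accuracy_fn(thr, best_bin), best_bin)
    return surface

bin_sizes_ms = log_spaced(0.1, 1000.0, 100)    # from the Methods
thresholds_sd = [0.5 * i for i in range(21)]   # assumed grid, for illustration

def toy_accuracy(thr, bin_ms):
    """Invented accuracy surface peaking near 2 SD and about 3 ms."""
    return 1.0 - abs(thr - 2.0) / 10 - abs(math.log10(bin_ms) - 0.5) / 10

surface = grid_search(thresholds_sd, bin_sizes_ms, toy_accuracy)
best_threshold = max(surface, key=lambda t: surface[t][0])
```

Reading off `surface[thr]` for each threshold gives the two curves reported throughout the Results: peak decoding accuracy and optimal binwidth as functions of threshold.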

4. Pure tone decoding

Frequency response area (FRA) functions were constructed from the TC rate in a 50 ms window following tone onset at every tone frequency and level (figure 4(A)). This is arguably the most common and fundamental tuning characterization for auditory neurons. In order to avoid the inclusion of spurious estimates of ‘optimal’ bin sizes and threshold values in the population data, we restricted our analyses to channels with strong evidence of frequency and/or level tuning. For each recording site and threshold setting, frequency tuning functions (FTFs) and level tuning functions (LTFs) were constructed by summing TCs evoked during the tone period for each frequency and level, respectively. To test the significance of tuning, we generated null functions by randomly shuffling the stimulus and response labels (n = 1000 iterations) and computing the variance of the resulting functions. The distribution of variance values was used to convert the variance of the actual FTFs and LTFs into z-scores. Channels with a z-score above 3.29 (p < 0.001) at one or more thresholds for either the FTF or LTF were considered significantly tuned and retained for further analysis (n = 631 of 1120 channels).
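The shuffle-based significance test for tuning can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: the function name, the use of population variance, and the toy data are assumptions, and the sketch does not handle the degenerate case in which every shuffle yields the same variance.

```python
import random
import statistics

def tuning_z_score(stim_labels, tc_counts, n_shuffles=1000, seed=0):
    """z-score of the tuning-function variance against a label-shuffled null."""
    def tuning_variance(labels):
        totals = {}
        for label, count in zip(labels, tc_counts):
            totals[label] = totals.get(label, 0) + count  # summed TCs per stimulus
        return statistics.pvariance(list(totals.values()))

    rng = random.Random(seed)
    observed = tuning_variance(stim_labels)
    shuffled = list(stim_labels)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break the stimulus-response pairing
        null.append(tuning_variance(shuffled))
    return (observed - statistics.fmean(null)) / statistics.pstdev(null)

# Toy site: five stimulus values, 20 trials each, strong response to value 2.
toy_labels = [f for f in range(5) for _ in range(20)]
toy_counts = [10 if f == 2 else 1 for f in toy_labels]
toy_z = tuning_z_score(toy_labels, toy_counts)
```

A site would pass the criterion in the text when the resulting z-score exceeds 3.29 at one or more thresholds.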

The spectral extent of the FRA spanned the range of tuning preferences we observed in our data sample. As a result, many frequency-level combinations necessarily fall outside the response areas of individual recording sites. This limits decoding accuracy because frequency-level combinations that do not elicit responses cannot be discriminated. Given the comparatively large numbers of distinct frequencies (n = 60) relative to other stimulus sets, groups of adjacent frequency or level values were combined for the purposes of decoding to maintain rough parity with other analyses, similar to the approach employed in a prior report (Teschner et al 2016). This adjustment reflects the fact that we are chiefly interested in relative decoding accuracy across threshold and bin size rather than absolute performance. For frequency decoding analysis, the original 60 sampled frequencies were collapsed into five bins (12 frequencies each) and were further restricted to the best level defined for each site ± one level (three total levels). If the best level was 70 dB (the highest), the two lower levels were included (60, 65 and 70), so that the number of trials was equivalent for analyses across sites. For level decoding analysis, the original 15 levels were similarly collapsed into five bins (3 levels each) and restricted to the best frequency (BF) for each site ± five frequency bins (11 total frequencies). In cases where the BF was nearer to the spectral edges of the FRA than five bins, additional bins above or below BF were included to ensure analyses reflected equivalent trials across sites. Spike rasters for a representative site reflecting trials grouped by the five frequency and level bins are shown in figure 4(A) below the FRA plots, with the center frequency or level of the bin indicated by the tick label. The span of frequency-level combinations corresponding to these trials is indicated by red (frequency) and yellow (level) rectangles superimposed on the FRA plots.
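The grouping and windowing rules above amount to simple index arithmetic, sketched below. The function names and the zero-based indexing of frequencies and levels are assumptions for illustration.

```python
def group_index(stim_index, n_values, n_groups):
    """Map a stimulus index to its decoding group (e.g. 60 frequencies -> 5 groups of 12)."""
    return stim_index // (n_values // n_groups)

def clamped_window(best_index, half_width, n_values):
    """Indices of a window of 2 * half_width + 1 values centred on best_index.

    Near an edge the window is shifted inward so that its size, and hence
    the number of trials, stays the same for every site.
    """
    width = 2 * half_width + 1
    start = min(max(best_index - half_width, 0), n_values - width)
    return list(range(start, start + width))

# Best level at the highest of 15 levels: the two lower levels are added.
level_window = clamped_window(best_index=14, half_width=1, n_values=15)
# Best frequency two steps from the low edge of 60 frequencies: 11 values kept.
freq_window = clamped_window(best_index=2, half_width=5, n_values=60)
```

With 15 levels, a best level at the top index gives the window of the three highest levels, consistent with the 60, 65 and 70 dB example in the text.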

As shown in figure 4(B), decoding accuracy as a function of voltage threshold for this site exhibits an inverted-U shape: accuracy nears chance (dashed line) when the voltage threshold is so low that noise dominates TCs and when the threshold is so high that few TCs occur to carry information about the stimulus. This outcome was quite similar when either frequency (PT Hz) or level (PT dB) was decoded. The optimal binwidths observed for frequency and level decoding (figure 4(C)) for this recording site were narrow (<3 ms) at voltage thresholds in the optimal range (~1.5 to 4σ). One-way ANOVAs verified that decoding accuracy varied significantly with voltage threshold for both frequency (F_20,9199 = 263.37, p = 0) and level (F_20,9199 = 175.76, p = 0). Optimal binwidth similarly varied significantly across thresholds for both frequency (F_20,9199 = 33.63, p < 10⁻¹²⁵) and level (F_20,9199 = 15.84, p < 10⁻⁵³).

Comparison of figures 4(B) and 4(C) suggests that the plateau of the decoding accuracy function coincides with the trough of the optimal binwidth function, although the latter function is somewhat variable. Across the population, there were trends suggesting an inverse correlation between the decoding accuracy function and the population-averaged best binwidth function for both frequency (r = −0.49, p = 0.023) and level (r = −0.55, p = 0.011).

The median best thresholds for tone frequency (2.06σ) and tone level discrimination (2.09σ) did not differ significantly (p = 0.28, Wilcoxon rank-sum test), as indicated by the inset boxplots in figure 4(D). However, the median best binwidth for tone frequency decoding (9.54 ms) significantly exceeded that for tone level decoding (6.58 ms; p < 10⁻⁶, Wilcoxon rank-sum test) when computed at best threshold. Despite the static nature of pure tones, excepting their onset and offset transients (Malone et al 2015b), our results indicate that the extractable information in cortical TCs is maximal when large numbers of TCs are read out with high temporal precision.

5. Sinusoidal amplitude-modulated tone and noise decoding

Amplitude modulation (AM) is an essential feature of many communication sounds, and responses to sinusoidal AM (SAM) have been used to characterize the temporal precision of the neural representation of dynamic sounds throughout the ascending auditory pathway (Joris et al 2004, Malone and Schreiner 2010). Only recording sites that exhibited decoding accuracy significantly above chance were included in the analyses to avoid inclusion of spurious ‘optimal’ values. To identify these recording sites, we employed a bootstrapping method based on generating a null distribution of decoding accuracy values by circularly shifting TC times by a random value selected from an interval equal to the stimulus period (n = 100 iterations). Decoding accuracy was considered significant if the true accuracy value exceeded all values in the null distribution (p < 0.01). Channels were retained for further analysis if decoding accuracy was significant for either SAM tones or SAM noise at one or more voltage thresholds. By this criterion, 234 of 264 channels were significant.
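The circular-shift null can be sketched as below. Several details are not specified in the text and are assumed here: each trial receives its own random shift, shifted times wrap around at the trial duration, and `accuracy_fn` is a hypothetical callable standing in for the full decoder.

```python
import random

def circular_shift(event_times_ms, shift_ms, duration_ms):
    """Shift event times forward, wrapping around at the end of the trial."""
    return sorted((t + shift_ms) % duration_ms for t in event_times_ms)

def is_significant(true_accuracy, trials, duration_ms, period_ms,
                   accuracy_fn, n_iter=100, seed=0):
    """True if true_accuracy exceeds every accuracy in the shifted null."""
    rng = random.Random(seed)
    for _ in range(n_iter):
        shifted = [circular_shift(tr, rng.uniform(0, period_ms), duration_ms)
                   for tr in trials]
        if accuracy_fn(shifted) >= true_accuracy:
            return False
    return True

wrapped = circular_shift([10, 990], shift_ms=20, duration_ms=1000)
```

Requiring the true value to exceed all 100 null values corresponds to the p < 0.01 criterion stated above.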

Figure 5(A) shows TC rasters of responses to SAM tones (SAMt; top) and SAM noise (SAMn; bottom) for two voltage thresholds (2σ: left column; 5σ: right column) for an example recording channel. The voltage threshold profoundly affects both TC rate (figure 5(A)) and decoding accuracy (figure 5(B)) for SAM tones and noise. For voltage thresholds in the optimal range, the optimal binwidth can be 1 ms or less (figure 5(C)). Similar trends are evident for the population-averaged functions in figures 5(D) and 5(E). One-way ANOVA confirmed that decoding accuracy varied significantly with the voltage threshold for SAM tones (F_20,3924 = 112.51, p = 0) and noise (F_20,3850 = 106.63, p = 0). The same was true for the optimal binwidth (tones: F_20,3924 = 46.02, p < 10⁻¹⁶¹; noise: F_20,3850 = 33.41, p < 10⁻¹¹⁷). The decoding accuracy and best binwidth functions are almost perfectly inversely correlated for both tonal carriers (r = −0.98, p < 10⁻¹³) and noise carriers (r = −0.95, p < 10⁻¹⁰).

As seen in figure 5(D), decoding accuracy for SAM tones was consistently higher than for SAM noise, congruent with our prior report (Malone et al 2013). The median best threshold for SAM tone decoding (1.44σ) was significantly (p = 0.001, Wilcoxon rank-sum test) lower than that for SAM noise decoding (1.60σ). The distribution of best bin sizes did not differ (3.43 ms versus 3.13 ms; p = 0.631, Wilcoxon rank-sum test) when computed at the best threshold. Overall, results for the amplitude modulated stimuli were similar to those for static tones. For dynamic stimuli, the inverse relationship between optimal threshold settings and bin size is especially clear.

6. Frequency modulated sweep decoding

Like amplitude modulation, frequency modulation is also a crucial component of many communication sounds, including squirrel monkey vocalizations (Godey et al 2005). The bootstrapping method described for the SAM dataset was used to assess whether decoding accuracy at each channel and threshold was greater than expected by chance. Channels were retained for further analysis if decoding accuracy was significant for at least one FMS stimulus set (rate, dB) at one or more thresholds (FMS rate: 314 of 321 channels; FMS dB: 178 of 181 channels).

The challenges involved in discriminating FMS rate and level are distinct. FMS rate affects the latency, relative to trial onset, when a given sweep intersects with a neuron’s receptive field (Malone et al 2017). FMS rate was treated as a signed value, such that negative velocities indicate sweeps that descend in frequency. Sweep level, by contrast, can affect the TC response pattern, but the accompanying changes in response latency are considerably more subtle, as shown in the example response rasters in the bottom row of figure 6(A). For this site, the decoding accuracy function for FMS rate exhibited a wide plateau that extended to relatively stringent thresholds. Even at 5σ, there are enough neural events to define the location of the intersection of the FMS with the preferred frequencies of neurons near the recording site (figure 6(A), upper right panel), or to widen the response period for FMS at higher levels (figure 6(A), lower right panel). Decoding accuracy for this example site is summarized in figure 6(B) for the full range of thresholds, indicating accuracy exceeded chance at all but the most extreme thresholds (> 9σ). Binwidth optimization results summarized in figure 6(C) suggested a coarse, inverse relationship between decoder accuracy and binwidth.

Results for the population-averaged decoding functions (figure 6(D)) exhibited narrower peaks at 2σ for both FMS rate and level. The variation in decoding accuracy across threshold level was highly significant for both FMS rate (F_20,5013 = 186.40, p = 0) and level (F_20,2873 = 114.92, p = 0). The variation in optimal binwidth was likewise significant (FMS rate: F_20,5013 = 62.16, p < 10⁻²²²; FMS level: F_20,2873 = 24.85, p < 10⁻⁸⁴). Similar to the results for SAM, the decoding accuracy and optimal binwidth functions were strongly anti-correlated for both FMS rate (r = −0.97, p < 10⁻¹³) and FMS level (r = −0.95, p < 10⁻¹⁰). As shown by the inset boxplot, there were no differences in the median best thresholds (FMS rate versus level: 2.06 versus 2.08; p > 0.06, Wilcoxon rank-sum test) or best binwidths computed at the best threshold (6.58 ms versus 6 ms; p > 0.48, Wilcoxon rank-sum test). Despite differences in absolute decoding accuracy for FMS rate and amplitude, the overall trends were remarkably similar with respect to the readout parameters we evaluated for extracting stimulus information from cortical TCs.

7. Frequency modulated sweep in noise decoding

In a prior study, we examined auditory cortical responses to FMS embedded in varying levels of background noise to characterize neural processing strategies that could engender robust signal processing in challenging listening conditions (Malone et al 2017). Here, we focus on how the neural readout parameters affect our estimates of the underlying neural representations of competing acoustic signals. The bootstrapping method applied to the FMS dataset was similarly applied to FMS in noise. Channels were retained for further analysis if decoding accuracy was significant for at least one SNR (+∞, 0, −5, −10, −15 dB) at one or more thresholds. Because not all thresholds were tested for all channels, the number of channels retained for analysis varies (FMS +∞: 314 of 321 channels; FMS 0 dB: 306 of 316 channels; FMS −5 dB: 282 of 295 channels; FMS −10 dB: 303 of 307 channels; FMS −15 dB: 265 of 273 channels).

Figure 7 shows the results for decoding the sweep velocity and direction of twelve distinct FMS stimuli embedded in broadband background noise. To facilitate comparisons with the data in figure 6, decoding accuracy curves for FMS presented in silence (+∞ SNR) are also included. Decoding accuracy functions for a representative recording site (figure 7(B)) suggest that the optimal threshold increases slightly as the SNR decreases. The best binwidth functions replicate a familiar pattern, appearing as inverses of the decoding accuracy functions, albeit staggered along the ordinate by the SNR. The population-averaged functions bear out these trends. Decoding accuracy varied significantly across voltage threshold at every SNR (p < 10⁻¹⁵⁹ in all cases). The same was true of best binwidth (p < 10⁻⁷⁷ in all cases). The shapes of the decoding accuracy and best binwidth functions were strongly anti-correlated for all tested SNRs (|r| > 0.92; p < 10⁻⁸ in all cases).

The SNR had a significant effect on both the best threshold for decoding and the optimal integration time at the best threshold, as indicated by the inset boxplots in figures 7(D) and 7(E) (all significant pairwise comparisons are indicated by lines above the boxplots; p < 0.05, Wilcoxon rank-sum tests). The median best threshold was lowest when the sweeps were presented in silence (2.06), and highest for the most adverse SNR condition (2.32 at −15 dB). The median best threshold in the no-noise condition was significantly lower than every other condition (p < 10⁻⁴, Wilcoxon rank-sum tests). The median best threshold for the −15 dB condition was higher than in the −5 dB (2.18) and −10 dB (2.16) conditions, but the statistical trend was marginal (p < 0.05, Wilcoxon rank-sum tests). Decreasing SNRs were associated with a clear increase in the median best binwidth at the optimal threshold, which spanned a range from 6.58 ms for sweeps in silence to 16.69 ms for the −15 dB SNR condition. The median best binwidth was significantly lower in the no-noise condition than all other conditions, and significantly higher in the −15 dB condition than all other conditions (p < 10⁻³, Wilcoxon rank-sum tests). These results indicate that unfavorable SNRs require coarser binning for optimal decoding, reinforcing similar conclusions in our previous report (Malone et al 2017).

8. Decoding functions were consistent across stimuli and peaked below standard threshold values

Figure 8(A) shows normalized decoding accuracy as a function of voltage threshold for all the stimulus types depicted in prior figures. To generate these curves, we normalized each curve by setting the peak and trough of each recording site’s decoding function to 1 and 0, respectively, and then computed population averages. The normalized decoding curves are remarkably similar in shape, exhibiting rapid rises in decoding accuracy for thresholds greater than 0.5σ, and moderately less steep declines at thresholds greater than 2.5σ. Nevertheless, the decreases in decoding accuracy with increasingly high voltage thresholds are slightly more rapid for amplitude modulation frequency decoding (SAMn in grey, and SAMt in black).
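The per-site normalization described above can be sketched as a min-max rescaling followed by an average across sites. The function names and toy curves are illustrative assumptions; a site with a completely flat decoding function would need separate handling because its peak and trough coincide.

```python
def normalize_curve(accuracy_by_threshold):
    """Rescale one site's decoding function so its trough is 0 and its peak is 1."""
    lo, hi = min(accuracy_by_threshold), max(accuracy_by_threshold)
    return [(a - lo) / (hi - lo) for a in accuracy_by_threshold]

def population_average(site_curves):
    """Average the normalized curves across sites, threshold by threshold."""
    normalized = [normalize_curve(c) for c in site_curves]
    return [sum(col) / len(col) for col in zip(*normalized)]

# Two toy sites sampled at three thresholds (values invented for illustration).
toy_average = population_average([[20, 60, 40], [10, 30, 10]])
```

Normalizing before averaging keeps sites with high absolute accuracy from dominating the shape of the population curve.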

We summarized the location of the decoding accuracy peaks by defining a range where decoding accuracy was greater than 70% of the peak value (figure 8(A); inset), using cubic spline interpolation to upsample the mean decoding functions from 21 to 1000 threshold values (Atencio et al 2012). For all stimulus paradigms, the majority of the peak decoding range fell outside the conventional range of voltage thresholds (figure 8(B)). For modulation frequency decoding, the peak decoding range fell entirely outside the conventional cutoff range.

These findings indicate that the information decoded from patterns of TCs obtained at conventional thresholds significantly underestimates the information available at lower thresholds clustered around 2σ. To estimate the information gap, we computed the improvement in decoding accuracy when the optimal threshold replaces a threshold value of 4σ (figure 9). Improvement was calculated as accuracy at the best threshold minus accuracy at a conventional threshold (4σ; Rey et al 2015), divided by accuracy at the conventional threshold. Best threshold was defined by the population average decoding function for a given stimulus parameter, which provides a more conservative estimate of the improvement than best threshold performance for individual sites. The largest improvements occurred for amplitude modulated stimuli, followed by frequency modulated stimuli. For example, decoding of SAM tones improved by 79.2%. The improvements for FMS decreased roughly as a function of noise level. The smallest improvements occurred for discrimination of static stimulus parameters, such as stimulus levels for pure tones and frequency modulated sweeps, and the frequency of pure tones. However, it is possible that the smaller benefit of more liberal thresholding reflects a restriction of range, since decoding accuracy was generally low for these stimulus classes. This explanation could also account for why improvements for frequency sweeps in noise were roughly ordered by the SNR. In summary, the greatest improvements in stimulus decoding at low voltage thresholds occurred for stimulus classes that could be discriminated on the basis of their dynamic temporal features.
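The improvement measure can be written out as a short sketch. The function names and the toy accuracy values are assumptions for illustration; the best threshold is taken from the population-average decoding function and the conventional threshold is 4σ, as described in the text.

```python
def percent_improvement(acc_best, acc_conventional):
    """Relative gain of the best threshold over the conventional one, in percent."""
    return 100.0 * (acc_best - acc_conventional) / acc_conventional

def improvement_at_population_best(site_curves, thresholds, conventional=4.0):
    """Per-site improvement, using the threshold that maximizes the population mean."""
    mean_curve = [sum(col) / len(col) for col in zip(*site_curves)]
    best_i = max(range(len(thresholds)), key=mean_curve.__getitem__)
    conv_i = thresholds.index(conventional)
    return [percent_improvement(c[best_i], c[conv_i]) for c in site_curves]

# Two toy sites evaluated at 2 and 4 SD (accuracies invented for illustration).
toy_improvements = improvement_at_population_best(
    [[0.5, 0.25], [0.75, 0.5]], thresholds=[2.0, 4.0])
```

Because the measure is a ratio, stimulus classes with low accuracy at the conventional threshold can show large percentage gains from modest absolute changes, which is worth keeping in mind when comparing stimulus classes.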

Figure 9. Summary of changes in decoding accuracy obtained using the best voltage threshold setting defined for each stimulus parameter relative to a fixed, conventional threshold of 4σ. Left: decoding accuracy results using best (outlined boxplots) and conventional (filled boxplots) voltage thresholds. Right: decoding improvements estimated by subtracting conventional from best, then dividing by conventional and multiplying the result by 100. Boxplots indicate medians and the 25th and 75th percentiles.

9. Discussion

Our results demonstrate that the optimal readout setting based on auditory cortical threshold crossings is substantially lower than thresholds that are typically employed in neurophysiology experiments that anticipate subsequent spike sorting. In fact, the steepest increases in decoding accuracy occur in the range between the optimal and conventional threshold settings. Although the size of the benefit associated with permissive thresholding varies with stimulus paradigm, the optimal range was surprisingly consistent around a modal value near 2σ. Effectively, low amplitude neural activity contains significant information about auditory signals that should not be discarded in BCI applications whose primary concern is signal estimation or discrimination.

Optimal thresholds were clearly and inversely related to the temporal resolution used to extract dynamic firing rate information. Summing activity over a larger sample of neurons provides a more precise estimate of the temporal envelope of the cortical response that supports discrimination among both static and dynamic sounds. Of course, the value of detailed envelope information will depend on the degree to which spectral or temporal differences distinguish a set of sounds. Accordingly, decoding accuracy for dynamic acoustic stimuli (e.g. SAM and FMS) improved more than decoding accuracy for static stimuli, such as tones, that vary in constant amplitude (dB SPL), or frequency.

To obtain our results, the temporal coherence among local populations of cortical neurons must be high enough that ‘destructive interference’ based on latency differences or disparate tuning preferences does not compromise TC-based response representations at permissive thresholds. Instead, estimating the time-varying output of local populations at millisecond resolution (Kayser et al 2010, Ince et al 2013) provides the highest lower-bounds on the mutual information between acoustic signals and cortical responses.

To our knowledge, this paper is the first to explore the effects of changing voltage thresholds in tandem with changing the temporal resolution of the decoding algorithm. An important implication of the current paper for BCI applications is that the decision about setting the voltage threshold necessarily informs the decision about temporal discretization, since the two are clearly interdependent—greater temporal precision is optimal when more TCs are available. Analogously, more precise binning advantages decoding of multiunit activity relative to sorted spikes from single units in mouse and monkey auditory cortex (Hoglen et al 2018).

The historic focus on single neuron activity in sensory physiology is reflected in conventional voltage thresholds (3σ to 5σ; e.g. Rey et al 2015) chosen to maximize the identification and isolation of single unit activity. As noted above, our results for auditory cortex argue forcefully against these conventions when the objective is maximizing the information that can be extracted from the neural signal. Our results confirm analogous findings in motor cortex and visual cortex, where it has also been observed that optimal thresholds are substantially lower than those adopted in many BCI studies (Oby et al 2016; but see Christie et al 2015).

A critical question is how the optimal voltage threshold depends on the stimulus, structure, and system. In a pioneering study on this topic, Oby et al (2016) demonstrated that in motor cortex, a directional parameter, velocity, is better represented at higher thresholds than a scalar quantity, speed, which is better represented at lower thresholds. In visual cortex, stimulus orientation was better represented in cortical TCs at higher thresholds than information about stimulus contrast. Conversely, we found that the optimal thresholds for auditory cortex were consistent across a diverse range of auditory stimulus decoding paradigms (figure 8). What explains this difference? Oby et al (2016) argue that the representation of a stimulus parameter ‘…in an extracellular voltage trace will depend in part on how the topographic scale of tuning to that parameter in the cortex relates to the effective sampling radius of the electrode, as determined by the detection threshold.’ The underlying assumption is that summing inputs from similarly tuned neurons improves the representation, while summing inputs from dissimilarly tuned neurons degrades it (Christie et al 2015). It is also assumed that the number of neurons contributing to the threshold crossing signal is proportional to the voltage threshold, which defines the ‘listening radius’ of the electrode (Martinez et al 2009, Pedreira et al 2012).

In our results, however, the absolute values of the optimal thresholds were sufficiently low (~2σ) that the reductions in decoding accuracy at more permissive thresholds likely reflect the inclusion of noise, rather than signals from increasingly distant neurons with divergent tuning. Future work could explore this issue by examining the effects of including waveforms recorded on adjacent channels. Given the spacing of our electrodes (100 or 150 μm), waveforms from adjacent channels likely sample a greater cortical volume than is sampled at the lowest voltage thresholds on a single channel. Of course, the sampling radius of the electrode will depend on additional factors including the packing density of local neurons, which has been shown to vary across the cortical sheet, with the highest and lowest densities having been reported in the visual and motor cortices, respectively (Turner et al 2016). Nevertheless, differences in the optimal threshold within a cortical area for different stimuli (e.g. velocity versus speed, or orientation versus contrast; Oby et al 2016) cannot be explained by differences in the sampling radius.

The functional topography of auditory cortex may be less strictly regimented than that of visual cortex with respect to the stimulus parameters we examined. Although tonotopy is a fundamental organizing principle in the auditory system, topographic organization for other stimulus features we examined, such as modulation frequency, sound level, or the signal-to-noise ratio, is not well established (Joris et al 2004, Hullett et al 2016). Further, there does not appear to be a principle of auditory cortical organization that is analogous to the tiling of orientation preference across the visual retinotopic map. Aural dominance also appears to be comparatively weak relative to ocular dominance, particularly in nonhuman primates (Scott et al 2011).

More fundamentally, our decoding methods differed significantly from those typically employed in past studies, and the BCI literature more generally with respect to binning precision. Specifically, we analyzed neural events at temporal resolutions as precise as one millisecond, compared to typical values in the 100 ms range, a difference of roughly two orders of magnitude. Decoding accuracy using bins of 100 ms or more was poor for most of the stimulus paradigms we employed in this study, echoing prior work in macaques (Malone et al 2007, 2010, 2014) and squirrel monkeys (Malone et al 2013, 2017).

It may be problematic to apply the concept of ‘tuning scale’ to stimuli that are fundamentally time varying when the neural representation is ‘explicit’ (Wang et al 2008). In such cases, it is not necessary to invoke a tuning function that maps stimulus features to response rates. For example, the amplitude modulation frequency for SAM tones or noise can be estimated directly from neural responses synchronized to the modulation period.

The concept of topographic tuning scale raises a related question about the ‘coverage’ of a given stimulus feature provided by individual cortical neurons. For example, many auditory cortical neurons each encode a significant fraction of the dynamic range in amplitude (dB SPL), but map a significant fraction of the hearing frequency range to an equivalent response (i.e. no response relative to the baseline rate). However, they exhibit heterogeneous rate-based tuning for sound amplitude, and nonmonotonic responses are quite common (Sadagopan and Wang 2008, Scott et al 2011), unlike the case in V1, where neurons exhibit similar tuning to contrast (Oby et al 2016). More fundamentally, however, it is possible to extract more information about the frequency (Malone et al 2014) and level (Malone et al 2010) of parametrically ‘static’ pure tones by decoding rate-normalized temporal profiles of cortical responses rather than average response rates. Thus, it may prove difficult to relate optimal voltage thresholds to cortical topography when rate-based tuning does not dominate how a given stimulus class is encoded and represented. In summary, the prospect of using optimal thresholds as a means for understanding auditory cortical organization (Oby et al 2016) is appealing, but our results suggest that it may be limited when applied to how auditory cortex processes temporally dynamic signals.

As Christie et al (2015) noted, reliance on TCs for BCI devices could facilitate the transition of such devices ‘from research into the clinic’ by simplifying the requisite hardware and software. Improvements in automated spike sorting procedures might make decoding based on sorted spikes feasible. We should note that unlike a number of recent studies that have compared decoding performance with TCs against sorted spikes, or local field potentials (Stark and Abeles 2007, Ventura 2008, Chestek et al 2011, Kloosterman et al 2014, Todorova et al 2014, Christie et al 2015), we only addressed the question of how to maximize decoding performance for TCs across a range of threshold settings, and, crucially, timescales. Central BCI applications for audition must confront the unique temporal demands of auditory processing. The distribution of tuning to acoustic envelope features (e.g. modulation phase, or rise-time) among local populations of neurons will constrain how best to estimate the informative features of a neural representation of time-varying sounds such as speech.

Acknowledgments

The authors thank Ralph Beitel for experimental assistance, and Drs. Christoph Schreiner and Gregg Recanzone for helpful suggestions and feedback on the manuscript.

References

Atencio CA, Sharpee TO and Schreiner CE 2012. Receptive field dimensionality increases from the auditory midbrain to cortex J. Neurophysiol 107 2594–603
Averbeck BB, Latham PE and Pouget A 2006. Neural correlations, population coding and computation Nat. Rev. Neurosci 7 358–66
Bieser A and Müller-Preuss P 1996. Auditory responsive cortex in the squirrel monkey: neural responses to amplitude-modulated sounds Exp. Brain Res 108 273–84
Chestek CA et al 2011. Long-term stability of neural prosthetic control signals from silicon cortical arrays in rhesus macaque motor cortex J. Neural Eng 8 045005 (doi: 10.1088/1741-2560/8/4/045005)
Cheung SW, Bedenbaugh PH, Nagarajan SS and Schreiner CE 2001. Functional organization of squirrel monkey primary auditory cortex: responses to pure tones J. Neurophysiol 85 1732–49
Christie BP, Tat DM, Irwin ZT, Gilja V, Nuyujukian P, Foster JD, Ryu SI, Shenoy KV, Thompson DE and Chestek CA 2015. Comparison of spike sorting and thresholding of voltage waveforms for intracortical brain–machine interface performance J. Neural Eng 12 016009
Collinger JL, Wodlinger B, Downey JE, Wang W, Tyler-Kabara EC, Weber DJ, McMorland AJ, Velliste M, Boninger ML and Schwartz AB 2013. High-performance neuroprosthetic control by an individual with tetraplegia Lancet 381 557–64
Engineer CT, Perez CA, Chen YH, Carraway RS, Reed AC, Shetake JA, Jakkamsetti V, Chang KQ and Kilgard MP 2008. Cortical activity patterns predict speech discrimination ability Nat. Neurosci 11 603–8
Flesher SN, Collinger JL, Foldes ST, Weiss JM, Downey JE, Tyler-Kabara EC, Bensmaia SJ, Schwartz AB, Boninger ML and Gaunt RA 2016. Intracortical microstimulation of human somatosensory cortex Sci. Trans. Med 8 361ra141
Foffani G and Moxon KA 2004. PSTH-based classification of sensory stimuli using ensembles of single neurons J. Neurosci. Methods 135 107–20
Fraser GW, Chase SM, Whitford AS and Schwartz AB 2009. Control of a brain–computer interface without spike sorting J. Neural Eng 6 055004
Garcia-Lazaro JA, Belliveau LA and Lesica NA 2013. Independent population coding of speech with sub-millisecond precision J. Neurosci 33 19362–72
Georgopoulos AP, Kalaska JF, Caminiti R and Massey JT 1982. On the relations between the direction of two-dimensional arm movements and cell discharge in primate motor cortex J. Neurosci 2 1527–37
Godey B, Atencio CA, Bonham BH, Schreiner CE and Cheung SW 2005. Functional organization of squirrel monkey primary auditory cortex: responses to frequency-modulation sweeps J. Neurophysiol 94 1299–311
Hoglen NE, Larimer P, Phillips EA, Malone BJ and Hasenstaub AR 2018. Amplitude modulation coding in awake mice and squirrel monkeys J. Neurophysiol 119 1753–66
Höhne J and Tangermann M 2014. Towards user-friendly spelling with an auditory brain-computer interface: the charstreamer paradigm PLoS One 9 e98322
Homer ML, Nurmikko AV, Donoghue JP and Hochberg LR 2013. Sensors and decoding for intracortical brain computer interfaces Annu. Rev. Biomed. Eng 15 383–405
Hullett PW, Hamilton LS, Mesgarani N, Schreiner CE and Chang EF 2016. Human superior temporal gyrus organization of spectrotemporal modulation tuning derived from speech stimuli J. Neurosci 36 2014–26
Ince RA, Panzeri S and Kayser C 2013. Neural codes formed by small and temporally precise populations in auditory cortex J. Neurosci 33 18277–87
Joris PX, Schreiner CE and Rees A 2004. Neural processing of amplitude modulated sounds Physiol. Rev 84 541–77
Kaufmann T, Holz EM and Kübler A 2013. Comparison of tactile, auditory, and visual modality for brain–computer interface use: a case study with a patient in the locked-in state Front. Neurosci 7 129
Kayser C, Logothetis NK and Panzeri S 2010. Millisecond encoding precision of auditory cortex neurons Proc. Natl. Acad. Sci. USA 107 16976–81
Kellis S, Miller K, Thomson K, Brown R, House P and Greger B 2010. Decoding spoken words using local field potentials recorded from the cortical surface J. Neural Eng 7 056007
Kloosterman F, Layton SP, Chen Z and Wilson MA 2014. Bayesian decoding using unsorted spikes in the rat hippocampus J. Neurophysiol 111 217–27
Kreiman G, Hung CP, Kraskov A, Quiroga RQ, Poggio T and DiCarlo JJ 2006. Object selectivity of local field potentials and spikes in the macaque inferior temporal cortex Neuron 49 433–45
Lewicki MS 1998. A review of methods for spike sorting: the detection and classification of neural action potentials Network 9 R53–78
Liang L, Lu T and Wang X 2002. Neural representations of sinusoidal amplitude and frequency modulations in the primary auditory cortex of awake primates J. Neurophysiol 87 2237–61
Malone BJ, Scott BH and Semple MN 2007. Dynamic amplitude coding in the auditory cortex of awake rhesus macaques J. Neurophysiol 98 1451–74
Malone BJ and Schreiner CE 2010. Coding of time varying sounds: envelope modulations The Oxford Handbook of Auditory Science: The Auditory Brain, Volume 2 ed Rees A and Palmer A (Oxford: Oxford University Press) p 125
Malone BJ, Scott BH and Semple MN 2010. Temporal coding of amplitude contrast in auditory cortex J. Neurosci 30 767–84
Malone BJ, Beitel RE, Vollmer M, Heiser MA and Schreiner CE 2013. Spectral context affects temporal processing in awake auditory cortex J. Neurosci 33 9431–50
Malone BJ, Scott BH and Semple MN 2014. Encoding frequency contrast in primate auditory cortex J. Neurophysiol 111 2244–63
Malone BJ, Beitel RE, Vollmer M, Heiser MA and Schreiner CE 2015a. Modulation-frequency-specific adaptation in awake auditory cortex J. Neurosci 35 5904–16
Malone BJ, Scott BH and Semple MN 2015b. Diverse cortical codes for scene segmentation in primate auditory cortex J. Neurophysiol 113 2934–52
Malone BJ, Heiser MA, Beitel RE and Schreiner CE 2017. Background noise exerts diverse effects on the cortical encoding of foreground sounds J. Neurophysiol 118 1034–54
Martinez J, Pedreira C, Ison MJ and Quian Quiroga R 2009. Realistic simulation of extracellular recordings J. Neurosci. Methods 184 285–93
Merzenich MM and Brugge JF 1973. Representation of the cochlear partition on the superior temporal plane of the macaque monkey Brain Res. 50 275–96
Mesgarani N, David SV, Fritz JB and Shamma SA 2009. Influence of context and behavior on stimulus reconstruction from neural activity in primary auditory cortex J. Neurophysiol 102 3329–39
Mesgarani N and Chang EF 2012. Selective cortical representation of attended speaker in multi-talker speech perception Nature 485 233–6
Moran A and Bar-Gad I 2010. Revealing neuronal functional organization through the relation between multi-scale oscillatory extracellular signals J. Neurosci. Methods 186 116–29
Moxon KA and Foffani G 2015. Brain–machine interfaces beyond neuroprosthetics Neuron 86 55–67
Nicolelis MA and Lebedev MA 2009. Principles of neural ensemble physiology underlying the operation of brain–machine interfaces Nat. Rev. Neurosci 10 530
Oby ER et al 2016. Extracellular voltage threshold settings can be tuned for optimal encoding of movement and stimulus parameters J. Neural Eng 13 036009 (doi: 10.1088/1741-2560/13/3/036009)
Pandarinath C, Nuyujukian P, Blabe CH, Sorice BL, Saab J, Willett FR, Hochberg LR, Shenoy KV and Henderson JM 2017. High performance communication by people with paralysis using an intracortical brain–computer interface eLife 6 e18554
Panzeri S, Macke JH, Gross J and Kayser C 2015. Neural population coding: combining insights from microscopic and mass signals Trends Cogn. Sci 19 162–72
Pasley BN et al 2012. Reconstructing speech from human auditory cortex PLoS Biol. 10 e1001251
Pedreira C, Martinez J, Ison MJ and Quian Quiroga R 2012. How many neurons can we see with current spike sorting algorithms? J. Neurosci. Methods 211 58–65
Perge JA, Zhang S, Malik WQ, Homer ML, Cash S, Friehs G, Eskandar EN, Donoghue JP and Hochberg LR 2014. Reliability of directional information in unsorted spikes and local field potentials recorded in human motor cortex J. Neural Eng 11 046007
Rabinowitz NC, Willmore BD, King AJ and Schnupp JW 2013. Constructing noise-invariant representations of sound in the auditory pathway PLoS Biol. 11 e1001710
Rey HG, Pedreira C and Quiroga RQ 2015. Past, present and future of spike sorting techniques Brain Res. Bull 119 106–17
Rosen S 1992. Temporal information in speech: acoustic, auditory and linguistic aspects Philos. Trans. R. Soc. Lond. B Biol. Sci 336 367–73
Sadagopan S and Wang X 2008. Level invariant representation of sounds by populations of neurons in primary auditory cortex J. Neurosci 28 3415–26
Schnupp JW, Hall TM, Kokelaar RF and Ahmed B 2006. Plasticity of temporal pattern codes for vocalization stimuli in primary auditory cortex J. Neurosci 26 4785–95
Schreiner CE and Polley DB 2014. Auditory map plasticity: diversity in causes and consequences Curr. Opin. Neurobiol 24 143–56
Scott BH, Malone BJ and Semple MN 2011. Transformation of temporal processing across auditory cortex of awake macaques J. Neurophysiol 105 712–30
Shih JY, Atencio CA and Schreiner CE 2011. Improved stimulus representation by short interspike intervals in primary auditory cortex J. Neurophysiol 105 1908–17
Simeral JD, Kim SP, Black MJ, Donoghue JP and Hochberg LR 2011. Neural control of cursor trajectory and click by a human with tetraplegia 1000 days after implant of an intracortical microelectrode array J. Neural Eng 8 025027
Smith E, Kellis S, House P and Greger B 2013. Decoding stimulus identity from multi-unit activity and local field potentials along the ventral auditory stream in the awake primate: implications for cortical neural prostheses J. Neural Eng 10 016010
Smith ZM, Delgutte B and Oxenham AJ 2002. Chimaeric sounds reveal dichotomies in auditory perception Nature 416 87–90
Stark E and Abeles M 2007. Predicting movement from multiunit activity J. Neurosci 27 8387–94
Teschner MJ, Seybold BA, Malone BJ, Hüning J and Schreiner CE 2016. Effects of signal-to-noise ratio on auditory cortical frequency processing J. Neurosci 36 2743–56
Todorova S, Sadtler P, Batista A, Chase S and Ventura V 2014. To sort or not to sort: the impact of spike-sorting on neural decoding performance J. Neural Eng 11 056005
Turner EC, Young NA, Reed JL, Collins CE, Flaherty DK, Gabi M and Kaas JH 2016. Distributions of cells and neurons across the cortical sheet in old world macaques Brain Behav. Evol 88 1–13
Van Eyndhoven S, Francart T and Bertrand A 2016. EEG-informed attended speaker extraction from recorded speech mixtures with application in neuro-steered hearing prostheses IEEE Trans. Biomed. Eng 64 1045–56
Ventura V 2008. Spike train decoding without spike sorting Neural Comput. 20 923–63
Wander JD and Rao RP 2014. Brain–computer interfaces: a powerful tool for scientific inquiry Curr. Opin. Neurobiol 25 70–75
Wang X, Lu T, Bendor D and Bartlett E 2008. Neural coding of temporal information in auditory thalamus and cortex Neuroscience 157 484–94
Yael D and Bar-Gad I 2017. Filter based phase distortions in extracellular spikes PLoS One 12 e0174790
