The Journal of Physiology. 2012 May 8;590(Pt 13):3129–3139. doi: 10.1113/jphysiol.2012.232892

Differential representation of auditory categories between cell classes in primate auditory cortex

Joji Tsunada 1, Jung H Lee 1,2, Yale E Cohen 1
PMCID: PMC3406395  PMID: 22570374

Abstract

A comprehensive understanding of the neural mechanisms of cognitive function requires an understanding of how neural representations are transformed across different scales of neural organization: from within local microcircuits to across different brain areas. However, the neural transformations within the local microcircuits are poorly understood. Particularly, the role that two main cell classes of neurons in cortical microcircuits (i.e. pyramidal neurons and interneurons) have in auditory behaviour and cognition remains unknown. In this study, we tested the hypothesis that pyramidal cells and interneurons in the auditory cortex play a differential role in auditory categorization. To test this hypothesis, we recorded single-unit activity from the auditory cortex of rhesus monkeys while they categorized speech sounds. Based on the spike-waveform shape, a neuron was classified as either a narrow-spiking putative interneuron or a broad-spiking putative pyramidal neuron. We found that putative interneurons and pyramidal neurons in the auditory cortex differentially coded category information: interneurons were more selective for auditory categories than pyramidal neurons. These differences between cell classes may be an essential property of the neural computations underlying auditory categorization within the microcircuitry of the auditory cortex.


Key points

  • The role that pyramidal neurons and interneurons have in auditory behaviour and cognition remains unknown.

  • In this study, we tested the hypothesis that pyramidal cells and interneurons in the auditory cortex play a differential role in auditory categorization.

  • Putative interneurons in the auditory cortex were more selective for auditory categories than putative pyramidal neurons.

  • The greater category selectivity in putative interneurons may be a characteristic of auditory categorization in the microcircuit of the auditory cortex.

Introduction

A comprehensive understanding of the neural mechanisms underlying cognitive function requires an understanding of how neural representations are transformed across different scales of neural organization: from within local microcircuits to across different brain areas. Although previous studies have demonstrated how neural representations of perception, categorization and decision-making are transformed across cortical areas (Gold & Shadlen, 2007; Russ et al. 2007; Freedman & Miller, 2008; Hernández et al. 2010), the neural transformations that occur within the local microcircuits that mediate cognitive function remain poorly understood. Recent studies have begun to elucidate the role that two main classes of neurons in cortical microcircuits, excitatory pyramidal neurons and inhibitory interneurons (Markham et al. 2004), play in vision, somatosensation and motor control (Wilson et al. 1994; Swadlow, 2003; Mitchell et al. 2007; Diester & Nieder, 2008; Isomura et al. 2009; Johnston et al. 2009; Yokoi & Komatsu, 2010; Ison et al. 2011). However, the role that these two cell classes have in auditory behaviour and cognition is not known (Atencio & Schreiner, 2008; Sakata & Harris, 2009; Ogawa et al. 2011).

Here, we tested the hypothesis that pyramidal cells and interneurons in the auditory cortex play a differential role in auditory categorization, a fundamental auditory-cognitive function across a broad range of animal species (Russ et al. 2007). To test this hypothesis, we recorded single-unit activity from a region of the auditory cortex in the superior temporal gyrus (STG) of rhesus monkeys while they categorized speech sounds. The STG was targeted because neurons in this brain region are known to respond categorically to human phonemes (human studies: Chang et al. 2010; Steinschneider et al. 2011; non-human primate study: Tsunada et al. 2011). Based on a neuron's spike-waveform shape, recorded neurons were classified into one of two categories (Bartho et al. 2004; Sakata & Harris, 2009): narrow-spiking putative interneurons (NS neurons) and broad-spiking putative pyramidal neurons (BS neurons). We found that putative interneurons and pyramidal neurons in the auditory cortex differentially coded category information. Specifically, interneurons were more selective for auditory categories than pyramidal neurons. These differences between cell classes may be an essential property of the neural computations underlying auditory categorization within the microcircuitry of the auditory cortex.

Methods

General procedures

We recorded neural activity from the superior temporal gyrus (STG) of two male rhesus monkeys (Macaca mulatta, monkey H and monkey T). Under isoflurane anaesthesia and using aseptic techniques, both monkeys were implanted with a head-positioning cylinder and a recording chamber. Monkey H was additionally implanted with a scleral search coil (Judge et al. 1980). STG recordings were obtained from the left hemisphere of monkey H and from the right hemisphere of monkey T. All recordings were guided by pre- and post-operative magnetic resonance images of each monkey's brain.

Behavioural and neurophysiological recording sessions took place in a darkened sound-attenuated room. While the monkeys were in the room, they were monitored with an infrared camera and were seated in a primate chair. The monkeys were given juice rewards in response to correct reports during all behavioural and recording sessions; see Match-to-category task below for more details about the behavioural task.

Dartmouth College's and the University of Pennsylvania's Institutional Animal Care and Use Committees approved all of the experimental protocols. All neural and behavioural data were collected previously as a part of our recent study (Tsunada et al. 2011). Both monkeys H and T are currently still participating in experiments in the laboratory.

Auditory stimuli

The prototype stimuli were the spoken words bad and dad that were provided by Dr Michael Kilgard. Perceptually, these stimuli differ in their place of articulation of each word's initial consonant (i.e. the place of articulation for /b/ is the lips, and for /d/, it is the roof of the mouth). Morphed versions of the prototypes were created using the STRAIGHT toolbox (Kawahara et al. 1999), which is run in the Matlab (The Mathworks Inc.) programming environment. Morphing was accomplished by calculating the shortest trajectory between the fundamental and formant frequencies of the two prototypes; the shortest trajectory was based on a computed distance metric. Morphed versions of the two prototypes were created at 20%, 40%, 50%, 60% and 80% of the distance along this trajectory. Operationally, the bad prototype was defined as the 0% morph, and the dad prototype was defined as the 100% morph.

Match-to-category task

As schematized in Supplementary Fig. S1A, the task began with a presentation of a ‘reference’ stimulus that was followed by the presentation of a ‘test’ stimulus. The reference stimulus and the test stimulus could be either of the prototypes or any of the morphs with one exception: the 50% morph was not allowed to be a reference stimulus. The reference and test stimuli were 500 ms in duration and the inter-stimulus interval was between 1100 and 1300 ms. The stimuli were presented from a speaker (Pyle, PLX32) that was placed in front of the monkey at their eye level. The stimuli were presented at 70 dB SPL. The monkeys reported whether the reference and test stimuli belonged to the same category or to different categories. They reported this decision by making a saccade to one of two LEDs; these LEDs were 20 deg to the left and to the right of the speaker. The LEDs were illuminated 1100–1300 ms after test-stimulus offset. The eye position of monkey H was monitored with a scleral search coil. The eye position of monkey T was monitored non-invasively with an infrared eye tracker (Eye-Trac6 RS6-HS; Applied Science Laboratories).

If the reference stimulus and the test stimulus belonged to the same category, the monkeys were rewarded when they made a saccadic eye movement to the LED that was 20 deg to the left of the speaker. Stimuli that belonged to the same category were on the same side of the 50%-morph boundary. Examples of reference and test stimuli that belonged to the same category were: (1) when the reference stimulus was the 40% morph and the test stimulus was the 0% morph; or (2) when the reference stimulus was the 100% morph and the test stimulus was the 60% morph.

In contrast, if the reference stimulus and the test stimulus belonged to different categories, the monkeys were rewarded when they made a saccadic eye movement to the LED that was 20 deg to the right of the speaker. Stimuli that belonged to different categories were on different sides of the 50%-morph boundary. Examples of reference and test stimuli that belonged to different categories were: (1) when the reference stimulus was the 40% morph and the test stimulus was the 100% morph; or (2) when the reference stimulus was the 20% morph and the test stimulus was the 80% morph.

When the test stimulus was at the categorical boundary (i.e. the 50% morph), there was not a ‘correct’ answer. On these trials, the monkeys were rewarded randomly (Tsunada et al. 2011).

Importantly, the monkeys could not successfully complete this task by calculating the morphing distance between the reference and the test stimuli because stimuli with the same morphing distance (e.g. 20%) could belong to the same category (e.g. the 60% morph and the 80% morph) or to different categories (e.g. the 40% morph and the 60% morph).
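The reward rule described above can be summarized in a minimal Python sketch. The function names (`category`, `correct_target`) are hypothetical and are not taken from the task software used in the study; the sketch only encodes the rule as stated in the text, with the 50% morph treated as having no category.

```python
from typing import Optional

BOUNDARY = 50  # morph percentage at the category boundary


def category(morph: int) -> Optional[str]:
    """Return 'bad' or 'dad' for a morph percentage; None at the boundary."""
    if morph == BOUNDARY:
        return None
    return "bad" if morph < BOUNDARY else "dad"


def correct_target(reference: int, test: int) -> Optional[str]:
    """Return the rewarded LED: 'left' for same category, 'right' for different.

    Returns None when the test stimulus is the 50% morph, for which there
    was no correct answer and reward was delivered randomly.
    """
    if reference == BOUNDARY:
        raise ValueError("the 50% morph was never used as a reference stimulus")
    if test == BOUNDARY:
        return None
    return "left" if category(reference) == category(test) else "right"


# Two pairs with the same 20% morphing distance but different correct answers:
print(correct_target(60, 80))  # left  (same category)
print(correct_target(40, 60))  # right (different categories)
```

The two printed cases illustrate why morphing distance alone cannot solve the task: both pairs are 20% apart, yet they require opposite saccades.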

Recording procedures

Our recordings targeted the lateral surface of the STG as well as the superior surface that is just lateral to the rostral auditory field R (Russ et al. 2008; Tsunada et al. 2011). This region of the STG coincides with the anterolateral belt and the rostral parabelt of the auditory cortex; the anterolateral belt and the rostral parabelt are part of the ventral auditory stream and process information about auditory identity (Rauschecker & Tian, 2000).

While the monkeys participated in the match-to-category task, the electrode was advanced through the STG to search for spiking activity. Since we were interested in neurons that responded to sounds, we mainly focused on neurons whose spiking activity was modulated by the presentation of auditory stimuli. Also, since STG neurons respond broadly to different auditory stimuli (Russ et al. 2008), we did not filter neurons based on their auditory tuning nor did we tailor the stimuli to the response properties of a particular neuron. On each trial, the reference- and test-stimulus combination was chosen in a balanced pseudorandom order. We report those neurons in which we were able to collect data from at least 132 successful trials of different reference- and test-stimulus combinations.

Single-unit extracellular recordings were obtained with a tungsten microelectrode (∼1.0 MΩ at 1 kHz; Frederick Haer & Co.) or a 4-core-multifibre microelectrode (‘tetrode’; ∼0.8 MΩ at 1 kHz; Thomas Recording GmbH) that was seated inside a stainless-steel guide tube. Extracellular neural signals from each electrode were sampled at 24 kHz, amplified and filtered (0.6–6.0 kHz) with a multi-channel recording system (Tucker-Davis Technologies); these recording parameters are comparable to those used in previous studies (e.g. Ogawa et al. 2011). This 24 kHz rate was above the sampling rate needed to adequately extract the spike-waveform parameters (see below for cell-classification methods). Custom software written in LabView (National Instruments) synchronized neural data collection with stimulus presentation and behavioural control.

Action potentials from individual neurons were extracted from the neural recordings with an off-line spike-sorting algorithm (WaveClus) (Quiroga et al. 2004) that ran on the Matlab programming platform. The algorithm first detected the troughs of the spike waveforms by setting an amplitude threshold that was 4 standard deviations below the mean of the background activity. Next, to extract the characteristics of each waveform, the algorithm performed a wavelet decomposition of each detected spike, obtaining 64 wavelet coefficients. A Kolmogorov–Smirnov test assessed whether each wavelet coefficient was normally distributed. Finally, the 10 wavelet coefficients that had the largest deviations from normality were used for the clustering analysis; a super-paramagnetic clustering algorithm classified these wavelet coefficients.
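The first step of this procedure, threshold-based trough detection, can be illustrated with a short Python sketch. This is not the WaveClus implementation: the function name `detect_troughs` is hypothetical, and the plain mean and standard deviation of the whole trace are used here as a stand-in for the estimate of background activity, which is an assumption.

```python
import statistics


def detect_troughs(signal, n_sd=4.0):
    """Return (threshold, indices) of local minima below mean - n_sd * SD.

    Assumption: the mean and SD of the full trace approximate the
    background activity; a real spike sorter may estimate noise differently.
    """
    mean = statistics.fmean(signal)
    sd = statistics.pstdev(signal)
    threshold = mean - n_sd * sd
    troughs = []
    for i in range(1, len(signal) - 1):
        v = signal[i]
        # Keep samples that cross the threshold and are local minima.
        if v < threshold and v <= signal[i - 1] and v < signal[i + 1]:
            troughs.append(i)
    return threshold, troughs


# Toy trace: low-amplitude background with one large negative deflection.
trace = [0.0, 0.1, -0.1] * 50
trace[75] = -10.0
threshold, trough_indices = detect_troughs(trace)
print(trough_indices)  # [75]
```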

To test the separation between clusters of spikes, we calculated an isolation-distance metric (Sakata & Harris, 2009). The isolation-distance metric was the Mahalanobis distance between a specific cluster of spikes and other spikes. To validate our metric, we benchmarked our isolation-distance metric relative to sample data supplied with the WaveClus software (Quiroga et al. 2004) and found that well-isolated spikes had isolation distances between 1.0 and 11.9. The metric values for our isolated spikes were well within this range (see Fig. S4A).
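As an illustration of the building block behind this metric, the following Python sketch computes the Mahalanobis distance from a single point to the centre of a cluster. It is restricted to two feature dimensions so that the covariance matrix can be inverted in closed form; the function name `mahalanobis_2d` and the choice of two dimensions are assumptions made for brevity, and the sketch is not the full isolation-distance computation used in the study.

```python
import math


def mahalanobis_2d(point, cluster):
    """Mahalanobis distance from `point` to the centre of a 2-D `cluster`."""
    n = len(cluster)
    mx = sum(p[0] for p in cluster) / n
    my = sum(p[1] for p in cluster) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]] of the cluster.
    sxx = sum((p[0] - mx) ** 2 for p in cluster) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in cluster) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in cluster) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = v^T S^{-1} v, using the closed-form inverse of a 2x2 matrix.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)


cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(mahalanobis_2d((1.0, 1.0), cluster))  # 0.0 (the cluster centre)
```

Larger distances for spikes outside a cluster indicate that the cluster is better separated from the remaining spikes.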

Neurophysiological-data analyses

The neurons reported in this study were those classified as ‘auditory’. An STG neuron was classified as auditory if the 95% confidence interval of the firing rate (spikes per second) of a neuron during the 100-ms period that began with reference- and test-stimulus onset was different from the mean firing rate of the baseline period (i.e. the 500-ms period that preceded reference-stimulus onset). This 100-ms period was chosen because many of the neurons had substantial phasic responses to stimulus onset; phasic responses could not be detected with longer time windows. The aforementioned analysis was conducted independently of the morph value of the stimuli.

Next, neurons were classified into one of two categories based on their spike-waveform shapes (Bartho et al. 2004; Sakata & Harris, 2009). A neuron's waveform was characterized by a negative-voltage deflection (trough) followed by a positive-voltage deflection (peak). For the classification, we used both the trough-to-peak time and the half-amplitude duration that were derived from each neuron's mean waveform. We removed two auditory neurons from the classification analysis because these neurons did not have typical waveforms. The distribution of those two parameters of each neuron was classified by k-means clustering (k= 2, squared Euclidean distance) (Bartho et al. 2004; Sakata & Harris, 2009).
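A minimal Python sketch of this two-cluster classification is given below. It implements a basic k-means (Lloyd's algorithm) with k = 2 and squared Euclidean distance on two waveform features. The function name, the deterministic initialization and the example values are all hypothetical; they are not the parameters or data of the study.

```python
def kmeans_two_clusters(points, n_iter=100):
    """Split 2-D points into two clusters (k = 2, squared Euclidean distance).

    Assumption: centroids start at the points with the smallest and the
    largest first feature, so that the result is deterministic.
    """
    centroids = [min(points), max(points)]
    labels = []
    for _ in range(n_iter):
        new_labels = []
        for x, y in points:
            d = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            new_labels.append(0 if d[0] <= d[1] else 1)
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels, centroids


# Hypothetical (trough-to-peak time, half-amplitude duration) pairs in microseconds.
waveforms = [(250, 160), (280, 170), (300, 150), (700, 240), (780, 260), (800, 250)]
labels, centroids = kmeans_two_clusters(waveforms)
print(labels)  # [0, 0, 0, 1, 1, 1]: label 0 = narrow-spiking, 1 = broad-spiking
```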

We further sub-classified these NS and BS neurons as either ‘increasingly responsive’ or ‘decreasingly responsive’ based on their mean firing rate during the stimulus presentation. If the average firing rate during the stimulus presentation was higher than the firing rate during the baseline period, we defined the neuron as increasingly responsive. If the firing rate during the stimulus presentation was lower than the baseline firing rate, the neuron was defined as decreasingly responsive (see the response properties in Fig. S2B and C).

Category index

Following from Freedman and colleagues (Freedman et al. 2001), the selectivity of a neuron to a stimulus’ category was tested with the ‘category index’. We treated the 0% (i.e. the bad prototype), 20% and 40% morphs as one category and the 60%, 80% and 100% (i.e. the dad prototype) morphs as a second category. The advantage of this index is that it quantifies whether neurons respond differentially to stimuli that have the same morphing distance (e.g. 20%) but can belong either (1) to the same category (e.g. the 20% morph and the 40% morph) or (2) to different categories (e.g. the 40% morph and the 60% morph).

On a neuron-by-neuron basis, we first calculated the ‘within-category difference’ (WCD). The WCD was the average of the absolute difference in firing rate between morph pairs that were on the same side of the category boundary: for example, the 0% and the 20% morphs or the 60% and the 100% morphs. Second, we calculated the ‘between-category difference’ (BCD), which was the average of the absolute difference in firing rate between morph pairs that were on different sides of the category boundary: for example, the 40% and 80% morphs. The category index was the difference between the BCD and the WCD, divided by their sum.

Two versions of the category index were calculated. In the first version, we calculated the category index by using the absolute difference in firing rate between all morph pairs. Category-index values close to 1 indicate that the neural responses were categorical. That is, values close to 1.0 indicate ‘binary-like’ neural responses to the morphs according to their category membership: similar neural responses to morphs on the same side of the 50% border but very different responses to morphs on different sides of the 50% border.
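The first version of the index can be sketched in Python as follows. The function name `category_index` and the example firing rates are hypothetical; treating a completely flat response as an index of zero is also an assumption, since the text does not state how that case was handled.

```python
from itertools import combinations

BAD = (0, 20, 40)    # morphs on the 'bad' side of the 50% boundary
DAD = (60, 80, 100)  # morphs on the 'dad' side of the 50% boundary


def category_index(rates):
    """Return (BCD - WCD) / (BCD + WCD) computed over all morph pairs.

    `rates` maps each morph percentage to a mean firing rate (spikes/s).
    """
    wcd_pairs = list(combinations(BAD, 2)) + list(combinations(DAD, 2))
    bcd_pairs = [(b, d) for b in BAD for d in DAD]
    wcd = sum(abs(rates[a] - rates[b]) for a, b in wcd_pairs) / len(wcd_pairs)
    bcd = sum(abs(rates[a] - rates[b]) for a, b in bcd_pairs) / len(bcd_pairs)
    if bcd + wcd == 0:
        return 0.0  # assumption: a flat response is treated as non-selective
    return (bcd - wcd) / (bcd + wcd)


# A hypothetical 'binary-like' neuron: one rate per category.
binary_neuron = {0: 30.0, 20: 30.0, 40: 30.0, 60: 10.0, 80: 10.0, 100: 10.0}
print(category_index(binary_neuron))  # 1.0
```

For the binary-like example, every within-category difference is zero and every between-category difference is 20 spikes/s, so the index reaches its maximum of 1.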

Next, to test the temporal dynamics of the category index, distributions of category-index values were calculated from data in consecutive 5 ms bins, relative to reference-stimulus or test-stimulus onset. On a neuron-by-neuron basis, we also calculated the ‘category-index latency’. This latency was defined as the first time bin of five consecutive time bins (i.e. 25 ms) for which category-index values were higher than the 95% confidence interval of the mean category index during the baseline period.
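The latency criterion can be expressed as a short Python sketch. The function name and arguments are hypothetical. Two assumptions are made: the threshold passed in is the upper bound of the baseline 95% confidence interval, and the latency is reported as the start time of the first qualifying bin, with bin 0 starting at stimulus onset.

```python
def category_index_latency(ci_by_bin, threshold, bin_ms=5, n_consecutive=5):
    """Return the latency (ms) of the first run of supra-threshold bins.

    The latency is the start of the first of `n_consecutive` consecutive
    bins whose category index exceeds `threshold`; None if no run exists.
    """
    run = 0
    for i, value in enumerate(ci_by_bin):
        run = run + 1 if value > threshold else 0
        if run == n_consecutive:
            first_bin = i - n_consecutive + 1
            return first_bin * bin_ms
    return None


# Hypothetical category-index time course in 5 ms bins.
ci = [0.0, 0.3, 0.0, 0.3, 0.3, 0.3, 0.3, 0.3, 0.1]
print(category_index_latency(ci, threshold=0.2))  # 15
```

In the example, the isolated supra-threshold value in the second bin is ignored because it is not followed by four further supra-threshold bins.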

Since the first version of the category index was calculated from more BCD pairs than WCD pairs (i.e. 9 BCD pairs versus 6 WCD pairs), it is inherently biased toward positive values. To control for this difference in the number of pairs, we calculated a second ‘control’ category index that had the same number of BCD pairs (i.e. (1) the 20% and 60% pair and (2) the 40% and 80% pair) and WCD pairs (i.e. (1) the 0% and 40% pair and (2) the 60% and 100% pair). As with our first index, values near 1 imply that the neurons responded categorically: similar responses to morphs on the same side of the 50% border but different responses to morphs on different sides of the 50% border. Values near –1 imply that the neurons also responded categorically but to a category border that was orthogonal to the 50% border (e.g. the neuron codes the 0%, 20% and 60% morphs as one category and the 40%, 80% and 100% morphs as a second category).
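The difference between the two versions of the index can be illustrated with a small Python sketch. The helper `index_from_pairs` and both example neurons are hypothetical. The pair lists follow the text: all 6 WCD and 9 BCD pairs for the first version, and 2 WCD and 2 BCD pairs, each spanning a 40% morphing distance, for the control version.

```python
from itertools import combinations

BAD, DAD = (0, 20, 40), (60, 80, 100)
ALL_WCD = list(combinations(BAD, 2)) + list(combinations(DAD, 2))  # 6 pairs
ALL_BCD = [(b, d) for b in BAD for d in DAD]                       # 9 pairs
CONTROL_WCD = [(0, 40), (60, 100)]
CONTROL_BCD = [(20, 60), (40, 80)]


def index_from_pairs(rates, wcd_pairs, bcd_pairs):
    """Return (BCD - WCD) / (BCD + WCD) for the given lists of morph pairs."""
    wcd = sum(abs(rates[a] - rates[b]) for a, b in wcd_pairs) / len(wcd_pairs)
    bcd = sum(abs(rates[a] - rates[b]) for a, b in bcd_pairs) / len(bcd_pairs)
    total = bcd + wcd
    return (bcd - wcd) / total if total else 0.0


# Hypothetical neuron whose rate grows linearly with morph percentage,
# i.e. a neuron with no special sensitivity to the 50% boundary.
linear = {m: 5.0 + m / 10 for m in (0, 20, 40, 60, 80, 100)}
print(index_from_pairs(linear, ALL_WCD, ALL_BCD))          # about 0.38
print(index_from_pairs(linear, CONTROL_WCD, CONTROL_BCD))  # 0.0

# Hypothetical neuron that groups the 0%, 20% and 60% morphs against the rest.
regrouped = {0: 30.0, 20: 30.0, 60: 30.0, 40: 10.0, 80: 10.0, 100: 10.0}
print(index_from_pairs(regrouped, CONTROL_WCD, CONTROL_BCD))  # -1.0
```

For the linear example, the all-pairs index is positive even though the neuron does not respond categorically, whereas the control index is zero.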

Receiver-operating-characteristic (ROC) analysis

After explicitly testing whether auditory-cortex neurons responded categorically with the category index, we next applied signal-detection theory to calculate an ROC value (Green & Swets, 1966). For each neuron and on a trial-by-trial basis, firing rates were first divided into two distributions: the bad distribution contained the firing rates elicited when the reference or test stimulus was the 0% (i.e. the bad prototype), 20% or 40% morph, whereas the dad distribution contained the firing rates elicited when the reference or test stimulus was the 60%, 80% or 100% morph. An ROC curve was then generated from these two distributions of firing rates. The area under the curve represents the probability that an ideal observer can differentiate between these two categories. ROC values range from 0.5 to 1. Larger values indicate better differentiation between the two categories. To examine the ROC value as a function of time, we calculated the ROC value for consecutive 5 ms bins, relative to reference-stimulus or test-stimulus onset.
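A minimal Python sketch of an ROC value for two firing-rate distributions is shown below. It uses the standard equivalence between the area under the ROC curve and the probability that a random draw from one distribution exceeds a random draw from the other, with ties counted as one half. The function name `roc_area` is hypothetical, and folding the area around 0.5 is an assumption inferred from the stated range of 0.5 to 1.

```python
def roc_area(bad_rates, dad_rates):
    """Area under the ROC curve for two lists of single-trial firing rates.

    Computed by pairwise comparison, then rectified so that the result
    lies between 0.5 (no discrimination) and 1 (perfect discrimination),
    regardless of which category evokes the higher rate.
    """
    wins = 0.0
    for b in bad_rates:
        for d in dad_rates:
            if b > d:
                wins += 1.0
            elif b == d:
                wins += 0.5
    auc = wins / (len(bad_rates) * len(dad_rates))
    return max(auc, 1.0 - auc)


# Hypothetical single-trial firing rates (spikes/s) for the two categories.
print(roc_area([10, 12, 14], [1, 2, 3]))  # 1.0 (non-overlapping distributions)
print(roc_area([1, 2], [1, 2]))           # 0.5 (identical distributions)
```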

Results

To test the hypothesis that pyramidal cells and interneurons in the auditory cortex play a differential role in auditory categorization, we recorded single-unit activity from the auditory cortex in the STG of rhesus monkeys while they categorized speech sounds. As shown in our previous study (Tsunada et al. 2011), the monkeys categorized the 0% (the bad prototype)–40% morphs as one category and the 60%–100% (the dad prototype) morphs as a second category; the monkeys’ response to the 50% morphs was intermediate between their behavioural reports on the lower-percentage morphs and the higher-percentage morphs (Fig. S1B).

One-hundred and ten auditory neurons (54 neurons from monkey H, 56 neurons from monkey T) were recorded from the STG while the monkeys participated in this categorization task (Tsunada et al. 2011). Offline, a k-means-clustering algorithm classified each neuron's waveform into two distinct populations based on each waveform's trough-to-peak time and half-amplitude duration (Bartho et al. 2004; Sakata & Harris, 2009) (Fig. 1). The first class, the ‘narrow-spiking’ (NS; n= 51) neurons had relatively short trough-to-peak times (mean, 271 μs) and short half-amplitude durations (mean, 164 μs). The second class, the ‘broad-spiking’ (BS; n= 57) neurons had relatively longer trough-to-peak times (mean, 761 μs) and half-amplitude durations (mean, 254 μs). A two-way ANOVA (cell class (NS neurons versus BS neurons) and spike-waveform parameter (trough-to-peak time versus half-amplitude duration) as factors) indicated that both spike-waveform parameters were reliably different between NS and BS neurons (cell class: F(1,213) = 254.01, P < 0.05; spike-waveform parameter: F(1,213) = 306.5, P < 0.05).

Figure 1. Cell classification based on spike waveforms’ trough-to-peak times and half-amplitude durations.


A, the average spike waveform of narrow-spiking (NS) neurons (n= 51; red) and broad-spiking (BS) neurons (n= 57; blue). Each neuron's waveform was first normalized relative to the amplitude between trough and peak of the spike. Next, these waveforms were averaged across their respective populations. The dotted lines indicate the bootstrapped 95% confidence interval. B, the distribution of the waveform's trough-to-peak times and half-amplitude durations for the populations of NS neurons (red) and BS neurons (blue). The inset shows the definition of a waveform's trough-to-peak time and half-amplitude duration.

Both BS and NS neurons were found in most of the recording sites in which we found auditory neurons; unfortunately, it was not possible to reconstruct the exact locations of these recording sites. Moreover, in those recording sessions in which we were able to simultaneously record from a pair of neurons (n= 21), we isolated NS and BS neurons 29% of the time (n= 6). These NS and BS neuron pairs were recorded from several different penetrations including the most lateral penetrations that might overlap with the parabelt. Overall, these findings indicated that NS and BS neurons were intermingled throughout the auditory cortex.

At a functional level, we found that NS neurons were more category selective than BS neurons. An example of an increasingly responsive (see Methods) NS neuron that responded categorically is shown in Fig. 2A. During presentations of the reference stimuli, the neuron responded strongly to the 0% (the prototype bad), 20% and 40% morphs but had a relatively weaker response to the 60%, 80% and 100% (the prototype dad) morphs. In addition to categorical differences in firing rate, the temporal response profile of this neuron was also categorical. When the 0% (the prototype bad), 20% or 40% morphs were presented, the latency of the peak firing rate was longer than when the 60%, 80% or 100% (the prototype dad) morphs were presented. Figure 2B shows the response of a BS neuron whose category selectivity is clearly weaker than that of the NS neuron in Fig. 2A (see Fig. S3A for another example of a BS neuron). We also found that decreasingly responsive (see Methods) NS neurons were more category selective than decreasingly responsive BS neurons. Examples of decreasingly responsive neurons are shown in Fig. S3B–D.

Figure 2. Categorical responses of an NS neuron (A) and a BS neuron (B) during presentations of the reference stimulus.


The plots in the left column show the mean firing rates of the two neurons as a function of time and the reference stimulus presented. The inset in the upper graph of each plot shows the neuron's waveform. The middle column shows each neuron's category-index values as a function of time. The right column shows ROC values as a function of time. For all of the panels, the two vertical dotted lines indicate stimulus onset and offset, respectively.

To quantify these observations and to test whether these observations were valid at the population level, we calculated a category index (Freedman et al. 2001). This index quantified the degree of category selectivity and enabled us to analyse the temporal evolution of a neuron's category selectivity. As expected, the category selectivity of the NS neuron in Fig. 2A was greater than that of the BS neuron in Fig. 2B (compare the middle panels of Fig. 2A and B). The category-index population analysis revealed three findings. First, consistent with the single-neuron examples, NS neurons were more category selective than BS neurons (Fig. 3A, B, E and F; two-way ANOVA with cell class (NS neurons versus BS neurons) and response type (increasingly responsive versus decreasingly responsive) as factors; cell class during the reference stimulus: F(1,105) = 77.73, P < 0.05; cell class during the test stimulus: F(1,105) = 17.98, P < 0.05). Second, the category selectivity of the increasingly- and decreasingly-responsive neurons was comparable (two-way ANOVA, response type during the reference stimulus: F(1,105) = 0.84, P > 0.05; response type during the test stimulus: F(1,105) = 1.31, P > 0.05). Finally, the category selectivity of the NS and BS neurons had different temporal dynamics: the category-index latency (see Methods) was reliably earlier for NS neurons than for BS neurons (Fig. 3C and G; two-way ANOVA; cell class during the reference stimulus: F(1,104) = 15.55, P < 0.05; cell class during the test stimulus: F(1,103) = 3.94, P < 0.05).

Figure 3. Population results of category index and ROC analysis.


The temporal profile (A and E), mean (B and F), and latency of the category index (C and G) during reference-stimulus presentation and test-stimulus presentation. The panels in D and H plot the temporal profile of ROC values. Error bars represent bootstrapped 95% confidence intervals of the mean.

Next, we conducted an ROC analysis (Green & Swets, 1966) that tested how well an ideal observer could discriminate between the firing-rate distributions elicited by a neuron in response to the two stimulus categories. The advantage of this ROC analysis is that it takes into account the trial-by-trial variability of a neuron's firing rate. Like the category index, this analysis also indicated that the responses of NS neurons were more category selective than those of BS neurons for the single-neuron examples (right panel of Fig. 2) and at the population level for both the reference and test stimuli (Fig. 3D and H; two-way ANOVA for mean ROC values, response type during the reference stimulus: F(1,105) = 7.82, P < 0.05; cell class during the reference stimulus: F(1,105) = 39.01, P < 0.05; response type during the test stimulus: F(1,105) = 7.67, P < 0.05; cell class during the test stimulus: F(1,105) = 25.77, P < 0.05).

Finally, we performed three control analyses. First, since NS neurons had higher baseline and auditory-evoked firing rates than BS neurons (median baseline firing rate: 20.1 Hz for NS neurons; 12.0 Hz for BS neurons; Mann–Whitney U test, P < 0.05; see also Fig. S2), our findings may simply be due to differences between their firing rates. To eliminate this possibility, we tested the category selectivity of NS and BS neurons as a function of their auditory-evoked firing rates (Mitchell et al. 2007). We found that, even when we controlled for differences in firing rate, the category selectivity (i.e. the category-index and ROC values) of NS neurons was still reliably greater than that of BS neurons (two-way ANOVA with cell class (NS neurons versus BS neurons) and auditory-evoked firing rate (<5 Hz, 5–10 Hz, 10–15 Hz, 15–20 Hz, >20 Hz) as factors, category-index value: F(1,102) = 77.42, P < 0.05 for cell class; ROC value: F(1,102) = 41.38, P < 0.05 for cell class). Therefore, it is unlikely that the difference in category selectivity between NS and BS neurons can be trivially attributed to differences between the firing rates of these two classes of neurons.

Second, we tested whether the quality of the spike isolation may underlie the difference in category selectivity between NS and BS neurons. For example, the weaker category selectivity of BS neurons may be simply attributed to the fact that BS neurons are multi-unit clusters of NS neurons that could not be isolated. To test for this possibility, we calculated the isolation distances between clusters of spikes (Sakata & Harris, 2009) and tested the relationships between the isolation distances, the trough-to-peak time of the spike waveform, and category selectivity. We could not identify a reliable correlation between the isolation distance and the trough-to-peak time (P > 0.05; Fig. S4A). Moreover, we could not identify a reliable correlation between the isolation distance and the category index (P > 0.05; Fig. S4B). Finally, even when we controlled for the isolation distance, the category index of the NS neurons was still reliably greater than the BS neurons (Fig. S4C; two-way ANOVA with cell class (NS neurons versus BS neurons) and isolation distance (<2, 2–4, 4–6, >6) as factors; cell class: F(1,101) = 78.88, P < 0.05). Therefore, the quality of the spike isolation is unlikely to be an underlying factor for the difference in the category selectivity between NS and BS neurons.

Third, to confirm the results for the category index (Fig. 3B and F), we computed a second ‘control’ category index (see Methods). For this index, the BCD values were calculated from the 20% and 60% pair and the 40% and 80% pair; whereas the WCD values were calculated from the 0% and 40% pair and the 60% and 100% pair. The advantage of this index is that it is unbiased: it had the same number of morph pairs that went into computing the BCD and WCD values. Using this control category index, we found that the average category-index value for NS neurons was reliably larger than zero (i.e. the 95% confidence interval of the mean was greater than 0) and reliably larger than the average category-index value for BS neurons (Fig. S5; t test, P < 0.05). Thus, the ‘control’ category index confirmed that NS neurons are more category selective than BS neurons.

Discussion

We recorded single-unit activity from the auditory cortex of rhesus monkeys while they categorized speech sounds. Neurons were classified as either narrow-spiking (NS) putative interneurons or broad-spiking (BS) putative pyramidal neurons. We found that putative interneurons and pyramidal neurons in the auditory cortex differentially coded category information: interneurons were more selective for auditory categories than pyramidal neurons. We hypothesize these differences between cell classes may be an essential property of the neural computations underlying auditory categorization within the microcircuit of the auditory cortex.

NS and BS neurons are putative interneurons and pyramidal neurons, respectively

Based on the spike-waveform parameters, 47% and 53% of our neural population was classified as NS and BS neurons, respectively. How do our classifications compare with previous classifications? First, the spike-waveform parameters (i.e. the trough-to-peak times) that we observed were comparable to those recorded from the rat primary auditory cortex under anaesthesia (Ogawa et al. 2011), though shorter trough-to-peak times have also been reported in the anaesthetized cat and anaesthetized/awake rat primary auditory cortex (Atencio & Schreiner, 2008; Sakata & Harris, 2009). Second, unlike previous studies that reported small proportions (10–30%) of NS neurons (Wilson et al. 1994; Markham et al. 2004; Mitchell et al. 2007; Atencio & Schreiner, 2008; Diester & Nieder, 2008; Isomura et al. 2009; Sakata & Harris, 2009; Yokoi & Komatsu, 2010; Ison et al. 2011), we found a relatively large proportion (47%) of NS neurons. Since our NS neurons had a reliably higher spontaneous firing rate than the BS neurons (Fig. S2A; see also spontaneous firing rates in the rat primary auditory cortex in Hromadka et al. 2008) and since the firing rates of more than half (39/57, 68%) of the BS neurons decreased in response to auditory stimuli (see Fig. S2B and C), our recordings may have been biased toward isolating these NS neurons with relatively higher firing rates as our electrode advanced through the STG (see Recording procedures).

Based on the differences in (1) the spike-waveform parameters and (2) the baseline firing rates between NS and BS neurons, we hypothesize that NS and BS neurons are two distinct types of neurons (Wilson et al. 1994; Swadlow, 2003; Markham et al. 2004; Mitchell et al. 2007; Diester & Nieder, 2008; Isomura et al. 2009; Johnston et al. 2009; Yokoi & Komatsu, 2010; Ison et al. 2011). In particular, we hypothesize that NS neurons are putative interneurons – calcium-binding protein parvalbumin-positive, aspiny stellate, chandelier cells or basket cells – whereas BS neurons are putative pyramidal neurons. This methodology of classifying neurons by spike-waveform shape has been validated with detailed morphological analyses, protein-expression analyses and intracellular recordings (McCormick et al. 1985; Kawaguchi & Kubota, 1993, 1997; Markham et al. 2004; González-Burgos et al. 2005).

Comparison with previous studies testing properties of NS and BS neurons

Consistent with our finding of differential representation of the auditory categories in the NS (putative interneuron) and BS (putative pyramidal neuron) populations, previous studies have also demonstrated a differential functional role for interneurons and pyramidal neurons in vision, audition, somatosensation and motor control (Wilson et al. 1994; Swadlow, 2003; Markham et al. 2004; Mitchell et al. 2007; Diester & Nieder, 2008; Isomura et al. 2009; Johnston et al. 2009; Yokoi & Komatsu, 2010; Ison et al. 2011). In particular, Atencio & Schreiner (2008) reported that, under anaesthesia, NS neurons in the cat primary auditory cortex (1) had broader spectral tuning, (2) had greater feature selectivity for auditory stimuli, and (3) were more strongly phase-locked to the features of auditory stimuli than BS neurons.

In contrast, our findings differed substantially from a comparable visual-categorization study in the prefrontal cortex (PFC) (Diester & Nieder, 2008). That study reported greater visual-category selectivity for BS neurons than for NS neurons, whereas we found greater category selectivity for NS neurons. Three non-exclusive possibilities may underlie this difference. The first possibility relates to differences in the local connectivity patterns and interactions between interneurons and pyramidal neurons in the PFC versus the auditory cortex (Kätzel et al. 2010). Indeed, in the PFC, simultaneously recorded (and, hence, nearby) BS and NS neurons have different category preferences and tuning properties (Wilson et al. 1994; Diester & Nieder, 2008). In contrast, in the auditory cortex, preliminary data indicate that simultaneously recorded pairs of NS and BS neurons (n = 6) have similar category preferences (data not shown). Thus, the neural computations required to encode a stimulus's category within a local microcircuit may substantially depend on the local circuitry and area-specific computations.

Second, the nature of the categorization task may also affect the category selectivity of NS and BS neurons. Our task required the perceptual categorization of stimuli whereas the prefrontal task (Diester & Nieder, 2008) required a more abstract type of categorization. Perceptual categorization is based on the physical attributes of a stimulus (Russ et al. 2007; Freedman & Miller, 2008). In contrast, abstract categorization is based on not only the shared physical features of stimuli but also their functional characteristics and the subject's knowledge of a stimulus (Russ et al. 2007; Freedman & Miller, 2008). Therefore, perceptual categorization and abstract categorization may require different neural systems and/or neural mechanisms.

A third possibility relates to differences in stimulus dynamics. The visual stimuli in the PFC study were static (Diester & Nieder, 2008), whereas our speech stimuli had complex spectrotemporal dynamics. For the categorization of dynamic stimuli, the moment-by-moment features of stimuli need to be quickly categorized (Chang et al. 2010). Since strong and dense inhibition from interneurons can rapidly control the spiking activity of pyramidal neurons (Hefti & Smith, 2003; Wehr & Zador, 2003; Atencio & Schreiner, 2008; Isaacson & Scanziani, 2011), greater NS neuron (interneuron) category selectivity may be a consequence of the inhibition that is needed to categorize dynamic stimuli.

Conclusion

We found that putative interneurons (NS neurons) are more category selective than putative pyramidal (BS) neurons in the auditory cortex. This finding is somewhat counterintuitive because it is natural to hypothesize that pyramidal neurons, which project to other cortical areas, should be more category selective than interneurons, which process information that is local to the auditory cortex. However, it is not clear how ‘much’ category selectivity is needed for those computations undertaken by cortical areas that receive afferent input from the auditory cortex. Indeed, a re-analysis (J. Tsunada and Y. E. Cohen, unpublished findings) of our earlier data from the PFC (Russ et al. 2008) indicated that, whereas these PFC neurons code a monkey's decision, they are less category selective than those in the auditory cortex, which do not code a monkey's decision (Tsunada et al. 2011). A second, non-exclusive hypothesis is that since projection neurons in the PFC provide top-down influence on interneurons in the auditory cortex (Barbas et al. 2005; Medalla et al. 2007), the category selectivity of interneurons in the auditory cortex may be enhanced by these top-down signals. Finally, another possibility is that a subset of the most category-selective pyramidal neurons preferentially transmits category information to other brain areas. Thus, in the future, it will be important to test directly how information is transmitted and transformed between the auditory cortex and other brain regions in order to more fully elucidate the cortical computations underlying auditory categorization.

Acknowledgments

We would like to thank Subhash Bennur, Andrew Liu, Adam Gifford, Maria Geffen and Heather Hersh for helpful comments on the preparation of this manuscript. We also want to thank Harry Shirley for outstanding animal care. This research was supported by grants from the National Institute on Deafness and Other Communication Disorders (NIDCD) to Y.E.C.

Glossary

BCD: between-category difference

BS neuron: broad-spiking neuron

NS neuron: narrow-spiking neuron

PFC: prefrontal cortex

ROC analysis: receiver-operating-characteristic analysis

STG: superior temporal gyrus

WCD: within-category difference

Author contributions

J.T., J.H.L. and Y.E.C. designed the study and wrote the paper. J.T. and J.H.L. collected the electrophysiological data. J.T. analysed the data. All authors approved the final version.

Supplementary material

Supplementary Figure S1

Supplementary Figure S2

Supplementary Figure S3

Supplementary Figure S4

Supplementary Figure S5

tjp0590-3129-SD1.pdf (1.1MB, pdf)

References

  1. Atencio CA, Schreiner CE. Spectrotemporal processing differences between auditory cortical fast-spiking and regular-spiking neurons. J Neurosci. 2008;28:3897–3910. doi: 10.1523/JNEUROSCI.5366-07.2008.
  2. Barbas H, Medalla M, Alade O, Suski J, Zikopoulos B, Lera P. Relationship of prefrontal connections to inhibitory systems in superior temporal areas in the rhesus monkey. Cereb Cortex. 2005;15:1356–1370. doi: 10.1093/cercor/bhi018.
  3. Bartho P, Hirase H, Monconduit L, Zugaro M, Buzsáki G. Characterization of neocortical principal cells and interneurons by network interactions and extracellular features. J Neurophysiol. 2004;92:600–608. doi: 10.1152/jn.01170.2003.
  4. Chang EF, Rieger JW, Johnson K, Berger MS, Barbaro NM, Knight RT. Categorical speech representation in human superior temporal gyrus. Nat Neurosci. 2010;13:1428–1432. doi: 10.1038/nn.2641.
  5. Diester I, Nieder A. Complementary contributions of prefrontal neuron classes in abstract numerical categorization. J Neurosci. 2008;28:7737–7747. doi: 10.1523/JNEUROSCI.1347-08.2008.
  6. Freedman DJ, Miller EK. Neural mechanisms of visual categorization: insights from neurophysiology. Neurosci Biobehav Rev. 2008;32:311–329. doi: 10.1016/j.neubiorev.2007.07.011.
  7. Freedman DJ, Riesenhuber M, Poggio T, Miller EK. Categorical representation of visual stimuli in the primate prefrontal cortex. Science. 2001;291:312–316. doi: 10.1126/science.291.5502.312.
  8. Gold JI, Shadlen MN. The neural basis of decision making. Ann Rev Neurosci. 2007;30:535–574. doi: 10.1146/annurev.neuro.29.051605.113038.
  9. González-Burgos G, Krimer LS, Povysheva NV, Barrionuevo G, Lewis DA. Functional properties of fast spiking interneurons and their synaptic connections with pyramidal cells in primate dorsolateral prefrontal cortex. J Neurophysiol. 2005;93:942–953. doi: 10.1152/jn.00787.2004.
  10. Green DM, Swets JA. Signal Detection Theory and Psychophysics. New York: John Wiley and Sons, Inc; 1966.
  11. Hefti BJ, Smith PH. Distribution and kinetic properties of GABAergic inputs to layer V pyramidal cells in rat auditory cortex. J Assoc Res Otolaryngol. 2003;4:106–121. doi: 10.1007/s10162-002-3012-z.
  12. Hernández A, Nácher V, Luna R, Zainos A, Lemus L, Alvarez M, Vázquez Y, Camarillo L, Romo R. Decoding a perceptual decision process across cortex. Neuron. 2010;29:300–314. doi: 10.1016/j.neuron.2010.03.031.
  13. Hromadka T, DeWeese MR, Zador A. Sparse representation of sounds in the unanesthetized auditory cortex. PLoS Biol. 2008;6:e16. doi: 10.1371/journal.pbio.0060016.
  14. Isaacson JS, Scanziani M. How inhibition shapes cortical activity. Neuron. 2011;72:231–243. doi: 10.1016/j.neuron.2011.09.027.
  15. Isomura Y, Harukuni R, Takekawa T, Aizawa H, Fukai T. Microcircuitry coordination of cortical motor information in self-initiation of voluntary movements. Nat Neurosci. 2009;12:1586–1593. doi: 10.1038/nn.2431.
  16. Ison MJ, Mormann F, Cerf M, Koch C, Fried I, Quiroga RQ. Selectivity of pyramidal cells and interneurons in the human medial temporal lobe. J Neurophysiol. 2011;106:1713–1721. doi: 10.1152/jn.00576.2010.
  17. Johnston K, DeSouza JF, Everling S. Monkey prefrontal cortical pyramidal and putative interneurons exhibit differential patterns of activity between prosaccade and antisaccade task. J Neurosci. 2009;29:5516–5524. doi: 10.1523/JNEUROSCI.5953-08.2009.
  18. Judge SJ, Richmond BJ, Chu FC. Implantation of magnetic search coils for measurement of eye position: an improved method. Vis Res. 1980;20:535–538. doi: 10.1016/0042-6989(80)90128-5.
  19. Kätzel D, Zemelman BV, Buetfering C, Wölfel M, Miesenböck G. The columnar and laminar organization of inhibitory connections to neocortical excitatory cells. Nat Neurosci. 2010;14:100–107. doi: 10.1038/nn.2687.
  20. Kawaguchi Y, Kubota Y. Correlation of physiological subgroupings of nonpyramidal cells with parvalbumin- and calbindinD28k-immunoreactive neurons in layer V of rat frontal cortex. J Neurophysiol. 1993;70:387–396. doi: 10.1152/jn.1993.70.1.387.
  21. Kawaguchi Y, Kubota Y. GABAergic cell subtypes and their synaptic connections in rat frontal cortex. Cereb Cortex. 1997;7:476–486. doi: 10.1093/cercor/7.6.476.
  22. Kawahara H, Masuda-Katsuse I, de Cheveigne A. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction. Speech Comm. 1999;27:187–199.
  23. McCormick DA, Connors BW, Lighthall JW, Prince DA. Comparative electrophysiology of pyramidal and sparsely spiny stellate neurons of the neocortex. J Neurophysiol. 1985;54:782–806. doi: 10.1152/jn.1985.54.4.782.
  24. Markham H, Toledo-Rodriguez M, Wang Y, Gupta A, Silberberg G, Wu C. Interneuron of the neocortical inhibitory system. Nat Rev Neurosci. 2004;5:793–807. doi: 10.1038/nrn1519.
  25. Medalla M, Lera P, Feinberg M, Barbas H. Specificity in inhibitory systems associated with prefrontal pathways to temporal cortex in primates. Cereb Cortex. 2007;17:i136–i150. doi: 10.1093/cercor/bhm068.
  26. Mitchell JF, Sundberg KA, Reynolds JH. Differential attention-dependent response modulation across cell classes in macaque visual area V4. Neuron. 2007;5:131–141. doi: 10.1016/j.neuron.2007.06.018.
  27. Ogawa T, Riera J, Goto T, Sumiyoshi A, Nonaka H, Jerbi K, Bertrand O, Kawashima R. Large-scale heterogeneous representation of sound attributes in rat primary auditory cortex: from unit activity to population dynamics. J Neurosci. 2011;31:14639–14653. doi: 10.1523/JNEUROSCI.0086-11.2011.
  28. Quiroga RQ, Nadasy Z, Ben-Shaul M. Unsupervised spike detection and sorting with wavelets and superparamagnetic clustering. Neural Comput. 2004;16:1661–1687. doi: 10.1162/089976604774201631.
  29. Rauschecker JP, Tian B. Mechanisms and streams for processing of ‘what’ and ‘where’ in auditory cortex. Proc Natl Acad Sci U S A. 2000;97:11800–11806. doi: 10.1073/pnas.97.22.11800.
  30. Russ BE, Lee Y-S, Cohen YE. Neural and behavioral correlates of auditory categorization. Hear Res. 2007;229:204–212. doi: 10.1016/j.heares.2006.10.010.
  31. Russ BE, Orr LE, Cohen YE. Prefrontal neurons predict choices during an auditory same-different task. Curr Biol. 2008;18:1483–1488. doi: 10.1016/j.cub.2008.08.054.
  32. Sakata S, Harris KD. Laminar structure of spontaneous and sensory-evoked population activity in auditory cortex. Neuron. 2009;64:404–418. doi: 10.1016/j.neuron.2009.09.020.
  33. Steinschneider M, Nourski KV, Kawasaki H, Oya H, Brugge JF, Howard MA. Intracranial study of speech-elicited activity on the human posterolateral superior temporal gyrus. Cereb Cortex. 2011;10:2332–2347. doi: 10.1093/cercor/bhr014.
  34. Swadlow HA. Fast-spike interneurons and feedforward inhibition in awake sensory neocortex. Cereb Cortex. 2003;13:25–32. doi: 10.1093/cercor/13.1.25.
  35. Tsunada J, Lee JH, Cohen YE. Representation of speech categories in the primate auditory cortex. J Neurophysiol. 2011;105:2634–2646. doi: 10.1152/jn.00037.2011.
  36. Wehr MS, Zador A. Balanced inhibition underlies tuning and sharpens spike timing in auditory cortex. Nature. 2003;27:442–446. doi: 10.1038/nature02116.
  37. Wilson FA, O'Scalaidhe SP, Goldman-Rakic PS. Functional synergism between putative γ-aminobutyrate-containing neurons and pyramidal neurons in prefrontal cortex. Proc Natl Acad Sci U S A. 1994;91:4009–4013. doi: 10.1073/pnas.91.9.4009.
  38. Yokoi I, Komatsu H. Putative pyramidal neurons and interneurons in the monkey parietal cortex make different contributions to the performance of a visual grouping task. J Neurophysiol. 2010;104:1603–1611. doi: 10.1152/jn.00160.2010.

Articles from The Journal of Physiology are provided here courtesy of The Physiological Society
