Author manuscript; available in PMC: 2010 Feb 16.
Published in final edited form as: Nat Neurosci. 2009 Jan 11;12(2):221–228. doi: 10.1038/nn.2246

Neural Correlates of Categorical Perception in Learned Vocal Communication

JF Prather 1, S Nowicki 1,2, RC Anderson 2, S Peters 2, R Mooney 1
PMCID: PMC2822723  NIHMSID: NIHMS103067  PMID: 19136972

Abstract

The division of continuously variable acoustic signals into discrete perceptual categories is a fundamental feature of vocal communication, including human speech. Despite the importance of categorical perception to learned vocal communication, the neural correlates underlying this phenomenon await identification. Here we report that individual sensorimotor neurons in freely behaving swamp sparrows express categorical auditory responses to changes in note duration, a learned feature of their songs, and that the neural response boundary accurately predicts the categorical perceptual boundary measured in field studies of the same sparrow population. Furthermore, swamp sparrow populations that learn different song dialects exhibit different categorical perceptual boundaries, consistent with the boundary being learned. Our results extend the analysis of the neural basis of perceptual categorization into the realm of vocal communication, while advancing the learned vocalizations of songbirds as a model for investigating how experience shapes categorical perception and the activity of categorically responsive neurons.

INTRODUCTION

One way perception enables adaptive behavior is by grouping variable stimuli into classes. An impressive example of this synthetic capacity of the nervous system occurs when continuous changes in a stimulus parameter are perceived as discrete perceptual categories1. Categorical perception plays a prominent role in processing acoustic signals important to social communication2, including vocalizations made by humans3,4, as well as by monkeys5, rodents6, birds7, and frogs8,9. In spoken English, for example, the distinction between the speech sounds that elicit the phonemes /ba/ and /pa/ stems from a difference in delay time between the initial sound made by opening the lips and the onset of vocal fold vibration (“voice onset time”). Voice onset time varies considerably from individual to individual, but this variation is perceived categorically3,4, with native English speakers recognizing a categorical boundary between /ba/ and /pa/ at ~ 12 ms3,4. Presumably, categorical processing of speech more generally facilitates comprehension by generating perceptual constancy in the face of individual variation in many dimensions including vocal pitch, timbre, and tempo.

Despite the prominent role of categorical perception in the processing of speech and other natural signals, the neural mechanisms underlying this cognitive ability remain poorly studied. The fact that some animals have been shown to perceive human speech sounds categorically, even when they do not learn their own species-typical vocalizations6, suggests that the capacity for categorical perception reflects some fundamental and innate brain mechanism. At the same time, however, the underlying mechanism must also be sensitive to experience, because perceptual boundaries that define speech sounds vary across native speakers of different languages3,4, indicating that the boundary, like other features of speech, is learned. What has been lacking is a model system in which the neural underpinnings of categorical perception could be studied for a natural, learned communication system. Here we ask whether sensorimotor neurons known to be important to learned vocal communication in songbirds could also mediate categorical perception of learned vocalizations.

Swamp sparrows (Melospiza georgiana) are especially appropriate for examining the neural events underlying categorical perception of learned vocalizations for several reasons. First, like all songbirds studied to date, swamp sparrows learn their song notes by imitation10, a feature of human speech that is otherwise rare among animals11. Second, behavioral experiments have shown that male swamp sparrows use categorical perception to distinguish fundamental acoustic elements in their species-typical vocal repertoire7. Swamp sparrow songs comprise repeated groups of 2 to 5 “notes” (Fig. 1A) composed of short pure-tonal frequency sweeps, with note categories differing primarily in duration, bandwidth and rate of change in frequency12. Previous work has shown that males use categorical perception to distinguish between note types that, like the phones produced in speech, are produced with considerable variation by different individuals but are grouped into natural categories. Specifically, note types in Categories I and VI share most spectral features but differ along a duration continuum7,12,13, and individual male swamp sparrows in a wild population in New York State perceive these differences categorically, with an estimated perceptual boundary at a duration of 13 ms (Figs. 1B-C)7.

FIGURE 1. Earlier behavioral tests with swamp sparrows reveal categorical perception of changes in individual song notes.

(a) Male swamp sparrows sing a small repertoire of distinct song types composed of trilled syllables that comprise multiple notes (2-5 song types, each ~ 2 sec duration). (b, c, d) Categorical perception in swamp sparrows was demonstrated previously7 by (b) splicing out one note in the syllable (note C) and replacing it with a note with similar spectral characteristics but different duration. Replacement note durations differed by equal increments on a logarithmic scale. (c) Behavioral testing revealed that swamp sparrows perceive strong differences, evident as robust expression of aggressive displays, when note transitions spanned an estimated boundary of 13 ms (stimulus set 2) but perceive little or no difference when transitions did not cross that boundary (stimulus sets 1 and 3, figure adapted from Nelson and Marler7), indicating that swamp sparrows employ categorical perception in their song discrimination.

A third feature that advances the swamp sparrow as a model for investigating neural activity underlying categorical perception is that their brain, as in other songbirds, contains a discrete set of specialized sensorimotor structures devoted to learning, producing and perceiving song14-19. Previously, we showed that one of these structures, the nucleus HVC, contains a certain class of striatum-projecting neurons (HVCX cells; Figs. 2B-C, left) that respond to only one song type in the bird’s repertoire20,21. Three observations implicate song type-selective HVCX neurons in song perception. First, song perception is impaired by lesions to HVC15,16. Second, song perception is also impaired by lesions to the striatal portion of an anterior forebrain pathway into which HVCX cells project their axons22. Third, some HVCX cells display a sensorimotor correspondence reminiscent of mirror neurons in the monkey cortex that are hypothesized to play an important role in perception20,23.

FIGURE 2. Extracellular recordings reveal differences in the auditory response of identified single units in nucleus HVC.

(a) A parasagittal schematic of the swamp sparrow brain at the level of HVC. Antidromic stimulation was used to identify HVC neurons19 as projecting to striatal areas implicated in song learning and perception (Area X), projecting to premotor structures (robust nucleus of the arcopallium, RA), or making local connections within HVC (HVC interneurons, omitted for clarity). Each cell type within HVC receives auditory input45 (dotted line; D = dorsal, R = rostral). (b, c) Auditory responses of HVCX neurons (N = 24 of 29 cells, 5 birds) were phasic and typically evoked by only one song type in the bird’s repertoire (“primary song type20”, N = 23 of 24 responsive cells, identity of primary song type varied across cells). Auditory responses of HVCINT cells (N = 18 of 18 cells, 5 birds) consisted of tonic increases in activity in response to many or all song types in the repertoire. In contrast, HVCRA neurons were unresponsive to any stimulus that was presented (top: raw data traces; middle: auditory response raster; bottom: auditory response PSTH, 10 ms binsize; N = 16 cells, 5 birds). (c) HVCX phasic responses occurred at a restricted phase in the syllable of the primary song type (top: auditory response raster; bottom: auditory response PSTH, 1 ms binsize), whereas HVCINT cells responded throughout the syllable duration.

Although HVCX cells appear to be well suited to a role in perception, the extent to which their auditory properties are correlated with the bird’s perception of categorical differences among song features is unknown. Therefore, we used a lightweight chronic recording device20,24 to assess whether HVCX cells in freely behaving adult male swamp sparrows respond categorically to changes in note duration, known from behavioral experiments to be a key feature that distinguishes between note type Categories I and VI7.

RESULTS

In initial experiments, we recorded from antidromically identified HVC neurons in freely behaving adult male swamp sparrows and presented the bird’s song types through a speaker located near its perch. These recordings confirmed that HVCX neurons responded to a single song type in the bird’s repertoire (the “primary song type”), and also revealed that robust auditory responses could be evoked in interneurons (HVCINT cells). In contrast, all of the cells we sampled of the other projection neuron type in HVC, which innervates the song premotor nucleus RA (HVCRA cells), were unresponsive to auditory stimuli.

To test whether individual HVCX neurons or interneurons express categorical responses to changes in the duration of notes in the primary song type, we first identified the primary song type to which a cell responded and then identified a note in the repeated syllable comprising that song type that was in either a Category I (< 13 ms) or a Category VI (> 13 ms) note type as defined in previous studies7,12,13. We then replaced the Category I or Category VI note in the primary song type with a note that had similar spectral features but a different duration, following the same procedures used in prior behavioral studies (Fig. 3A) in which replacement notes were chosen that differed in their durations by equal increments on a logarithmic scale7 (see Methods for additional details). The resulting synthetic syllable was assembled into a trill to form a stimulus song, with all other notes, inter-note intervals and trill rate identical to the natural song. Repeating this procedure using different replacement notes to create each stimulus, we presented to each cell a set of song stimuli comprising 5 to 11 variants of the primary song type, each differing only in the duration of a single replacement note in each trilled syllable, with durations spanning a range from notes that would be classified as Category I to those that would be classified as Category VI.
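The spacing rule for replacement notes ("equal increments on a logarithmic scale") can be illustrated with a minimal sketch. The function name `log_spaced_durations` is hypothetical, and the endpoints of 4 and 32 ms are taken from the behavioral stimulus transitions described for Fig. 4B. Because the actual replacement notes were natural notes chosen from recorded songs, their durations would only approximate such an idealized series.

```python
import math


def log_spaced_durations(shortest_ms, longest_ms, n_steps):
    """Return n_steps durations (ms) equally spaced on a logarithmic scale.

    Hypothetical helper for illustration only; it is not the authors' code.
    """
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    log_lo = math.log(shortest_ms)
    log_hi = math.log(longest_ms)
    step = (log_hi - log_lo) / (n_steps - 1)
    return [math.exp(log_lo + i * step) for i in range(n_steps)]


# Idealized series between 4 ms and 32 ms in four steps.
durations = log_spaced_durations(4.0, 32.0, 4)
rounded = [round(d, 1) for d in durations]

# Equal log increments mean each duration is a constant multiple of the last.
ratios = [b / a for a, b in zip(durations, durations[1:])]
```

On a logarithmic scale each step multiplies the duration by the same factor, so the difference between 4 and 8 ms counts as the same size of change as the difference between 16 and 32 ms.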

FIGURE 3. Manipulation of song note duration reveals categorical auditory responses of HVCX neurons.

(a) Individual notes in the primary song type syllable (note C in top spectrogram) were replaced by another swamp sparrow song note with similar spectral characteristics but different duration7. Auditory responses in this HVCX cell were strong among one group of replacement notes (4, 8, 16 ms) but weak among another group (27, 31 ms; third row: auditory response PSTH, 1 ms binsize; bottom row: stimulus syllables, gray boxes indicate replacement notes). (b) Categorical responses like those in (a) were evident in all HVCX cells in this bird that responded to the same song type (N = 4, left, open blue squares indicate the cell in panel (a)). In contrast, categorical activity was not evident in any of the HVCINT neurons that responded to the same song type in the same bird (N = 3, right). In behavioral testing, this same bird responded to differences between stimuli that crossed the putative neurophysiological boundary but not to differences between stimuli that did not cross the boundary (Supp. Fig. 4). (c) Among all cells tested in all birds (N = 22 of 24 auditory responsive HVCX cells, 5 birds and 18 of 18 responsive HVCINT cells, 5 birds), categorical activity was consistently evident for HVCX neurons (N = 19 of 22 cells, 5 birds), with stronger response strengths evoked by replacement notes in the same category as the note that was replaced. In contrast, categorical activity was very rarely evident in HVCINT neurons (N = 2 of 18 cells, 2 birds). (d) Categorical responses were observed in HVCX cells regardless of whether the duration of note that was replaced was naturally short or long, and the location of the categorical boundary (black triangle) was consistent across HVCX cells and birds (21 ± 4 ms, mean ± SD; 10 cells shown for clarity). A subset of 10 categorically responsive HVCX cells (2 birds) was tested further to probe the categorical boundary at a higher resolution (19, 22, 25 ms; e.g., thick blue line), revealing an estimated categorical boundary (20 ± 4 ms) very similar to that estimated using the full dataset (21 ± 4 ms; p = 0.47, unpaired t-test).

Categorical auditory responses are evoked by changes in duration of individual notes

Acoustic presentation of variants of the primary song type revealed that the auditory responses of HVCX neurons, but not interneurons, are highly sensitive to changes in note duration (Figs. 3A-D). Most HVCX neurons responded robustly to song stimuli containing replacement notes with durations within the range of the target note category but weakly or not at all to variants of the primary song type with replacement notes having durations outside that category (N = 19 of 22 HVCX cells, 5 birds). Robust responses were invariably evoked by stimuli containing replacement notes with durations that would fall unambiguously into the same category as the target note in the natural song (e.g., Fig. 3A, note durations 4, 8 ms; target note duration = 7 ms; replacement notes near the categorical boundary discussed below). Following the criteria established in previous studies of categorical perception7,25-27, these cells were deemed categorically responsive (see Methods for additional details). Categorical responsiveness was evident in HVCX cells both within and across birds (within bird: Fig. 3B, left, N = 4 of 4 cells, 1 bird, p < 0.001; across birds: Fig. 3C, left; p < 0.001, Mann-Whitney U test; N = 19 cells, 5 birds), regardless of whether the original target note was of the shorter Category I type or the longer Category VI type (Fig. 3D). Beyond the sheer abundance of categorically-responsive HVCX neurons, two observations indicate that the HVCX neuronal population encodes note duration in a categorical manner. First, the three HVCX cells that did not respond categorically also did not display bell-shaped tuning curves to intermediate note durations, as might be expected if note duration is represented continuously (Supp. Fig. 1). Second, our dataset differed significantly from a model in which note duration is encoded linearly by HVCX cells (Supp. Fig. 2). 
In contrast to HVCX neurons, HVCINT cells responded similarly regardless of whether the replacement note was of the same category or a different category as the target note (Figs. 3B-C, right; p = 0.24, Mann-Whitney U test; N = 16 of 18 cells, 5 birds). These results indicate that categorical responses to changes in note duration are displayed by only one subset of auditory responsive HVC neurons, namely those that project to a striatal pathway important in song perception15,22.

Estimating the neuronal response boundary

To probe the link between these categorical neuronal responses and song perception, we first estimated the categorical boundary evident in the auditory responses of HVCX neurons. Interpolation at the transition from strong to weak responses yielded an estimated categorical response boundary at 21 ± 4 ms (mean ± SD, N = 19 cells, 5 birds). We further probed the location of the categorical response boundary at higher resolution in a subset of HVCX cells using stimuli containing replacement notes with durations near the estimated boundary (e.g., thick blue line in Fig. 3D; 19, 22, 25 ms; N = 10 cells, 2 birds, Supp. Fig. 3). This higher resolution investigation of the categorical transition yielded an estimated boundary (20 ± 4 ms), very similar to that obtained using the full dataset (21 ± 4 ms, p = 0.47, unpaired t-test). Such a sharp response transition is a hallmark of categorical activity3,4,28 (see also Supplementary Methods), further establishing the auditory responses of HVCX neurons as categorical.
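One way to picture the interpolation step is the following minimal sketch. It assumes that the boundary is taken as the duration at which a normalized tuning curve crosses a half-maximal criterion, found by linear interpolation between the two adjacent stimuli that bracket the crossing. The 0.5 criterion, the function name `estimate_boundary` and the response values are all illustrative assumptions and are not taken from the recordings. The exact procedure is described in the Methods.

```python
def estimate_boundary(durations_ms, responses, criterion=0.5):
    """Interpolate the duration at which the response crosses `criterion`.

    durations_ms: stimulus note durations in ascending order.
    responses: normalized response strength at each duration.
    Returns None if the tuning curve never crosses the criterion.
    Hypothetical helper; the half-maximal criterion is an assumption.
    """
    points = list(zip(durations_ms, responses))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        brackets = (y0 - criterion) * (y1 - criterion) <= 0
        if brackets and y0 != y1:
            # Linear interpolation between the two bracketing stimuli.
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None


# Invented tuning curve for a cell whose target note was short:
# strong responses to short replacement notes, weak responses to long ones.
example_durations = [4.0, 8.0, 16.0, 27.0, 31.0]
example_responses = [1.0, 0.9, 0.8, 0.1, 0.0]
boundary_ms = estimate_boundary(example_durations, example_responses)
```

Because the function only checks for a sign change around the criterion, the same sketch handles cells whose target note was long, for which the response rises rather than falls with duration.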

Categorical boundary in neural response predicts a novel perceptual boundary

Although the categorical responses of HVCX cells are consistent with their playing some role in the categorical perception of note type based on duration, we found it curious that the categorical boundary estimated from our neural data (~ 21 ms) was different from the perceptual boundary reported previously for swamp sparrows based on behavioral testing (~ 13 ms)7. This discrepancy suggests that the neural responses we measured are not linked to perception of note duration or, alternatively, that the perceptual boundary in the birds we studied differed from the previously reported values. All birds from which we collected neural data were obtained from a population in northwestern Pennsylvania, whereas the previous behavioral study examined a population from upstate New York, some 540 km distant7. As with most songbirds, geographically distinct populations of swamp sparrows learn different song dialects29, raising the possibility that the perceptual boundary separating Category I and Category VI notes may be influenced by learning and thus vary across populations.

To test whether the discrepancy between our neural data and previous behavioral measures of a categorical boundary reflects a learned dialect difference, we first compared the acoustic characteristics of Category I and Category VI notes obtained from the Pennsylvania and New York populations. We found that the distributions of note durations differ between these two populations, as would be predicted by a dialect difference in this phonological feature (Fig. 4A). We next asked whether population-specific differences in note durations are paralleled by population differences in song perception. We tested the categorical perceptual boundary for note duration in the Pennsylvania population, replicating the behavioral experiment done by Nelson and Marler on the New York population7, in which habituation to repeated presentation of one song stimulus was followed by presentation of a test stimulus (i.e., stimulus transitions in Figs. 4B-C). This approach is a standard method for establishing categorical perception in human infants30 and non-human animals7,26.

FIGURE 4. The perceptual boundary in swamp sparrows’ categorization of note duration was predicted by the categorical boundary evident in auditory responses of HVCX neurons.

(a) The distribution of Category I and Category VI11 note durations in the songs of swamp sparrows from Pennsylvania (PA, thick black line) differed from the distribution in the songs of a previously studied population of swamp sparrows from New York7 (NY, thin gray line; p < 0.001, Kolmogorov-Smirnov 2-sample test, N = 129 notes, 62 songs from 35 PA birds; N = 52 notes, 29 songs from 12 NY birds, 3 ms binsize), suggesting that the categorical boundary in each population (dotted lines) may be learned through population-specific auditory and vocal experience. (b) Stimuli in our behavioral tests contained note transitions that either: 1) crossed no putative categorical boundary (4 to 8 ms), or 2) crossed the NY perceptual boundary but not the PA neural boundary (8 to 16 ms), or 3) crossed the PA neural boundary but not the NY perceptual boundary (16 to 32 ms). (c) Behavioral testing revealed that PA swamp sparrows (top, filled bars) perceived strong differences when the transition in note duration spanned the PA neural boundary detected in HVCX neurons (stimulus set 3, p < 0.001, Kruskal-Wallis test; p < 0.05 for stimulus set 3 vs. 1 and 3 vs. 2, Tukey’s HSD) but perceived little or no difference when the transition spanned the perceptual boundary detected in NY birds (stimulus set 2; p > 0.05, stimulus set 1 vs. 2, Tukey’s HSD) or spanned no putative boundary (stimulus set 1). Comparison of the behavioral data obtained from PA birds (top, filled bars, mean ± SE) and from NY birds7 (bottom, open bars, mean ± SE) makes clear the population-specific differences in categorical perception of note duration (NY data adapted from Nelson and Marler7, asterisk indicates p < 0.05, Kruskal-Wallis ANOVA, responses of individual PA birds shown in Supp. Fig. 6).

Swamp sparrows are territorial, and the introduction of novel song stimuli on their territory evokes robust aggressive displays, such as wing waves7,31. We quantified this territorial behavior in a habituation/dishabituation paradigm to determine whether the perceptual boundary of Pennsylvania birds more closely approximated the perceptual boundary measured behaviorally for New York birds7 or the neural response boundary we measured for Pennsylvania birds. We found that Pennsylvania birds responded as though they did not detect changes in note duration that spanned the perceptual boundary reported for New York birds (Fig. 4B-C, p > 0.05, stimulus transition 2 vs. 1; Tukey’s HSD). However, Pennsylvania birds clearly detected changes in note duration that spanned the neural response boundary detected in HVCX neurons, indicating that the changes in duration also spanned a categorical perceptual boundary (p < 0.001, Kruskal-Wallis test; p < 0.05, stimulus transition 3 vs. 1; p < 0.05, stimulus transition 3 vs. 2; Tukey’s HSD). Thus, categorical perception of note duration in swamp sparrows as measured in the field by behavioral testing is accurately predicted by the auditory response properties of HVCX neurons measured in the laboratory. Although it is extremely challenging to conduct behavioral tests that adequately capture the responsiveness of a wild territorial male songbird in a laboratory setting while simultaneously collecting neurophysiological data, we were able to obtain both behavioral and neurophysiological data at different times from one bird (Supp. Fig. 4, see also Figs. 2B-C (left and middle columns), Fig. 3A-B and Supp. Fig. 5). Although these findings are limited to a single individual, they indicate a direct within-individual correspondence between neurophysiological data and behavioral data related to those we measured in a field setting.
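The logic of the three stimulus transitions (Fig. 4B) can be made explicit with a short sketch. The boundary values of 13 ms (New York perceptual boundary) and 21 ms (Pennsylvania neural response boundary) and the transition durations come from the text. The function and variable names are hypothetical.

```python
NY_PERCEPTUAL_BOUNDARY_MS = 13.0  # behavioral estimate, New York population
PA_NEURAL_BOUNDARY_MS = 21.0      # HVCX response estimate, Pennsylvania birds


def crosses(start_ms, end_ms, boundary_ms):
    """Return True if a change in note duration spans the given boundary."""
    lo, hi = sorted((start_ms, end_ms))
    return lo < boundary_ms < hi


# Habituation and test note durations (ms) for stimulus sets 1 to 3.
transitions = {1: (4.0, 8.0), 2: (8.0, 16.0), 3: (16.0, 32.0)}

crossings = {
    stim_set: {
        "NY": crosses(start, end, NY_PERCEPTUAL_BOUNDARY_MS),
        "PA": crosses(start, end, PA_NEURAL_BOUNDARY_MS),
    }
    for stim_set, (start, end) in transitions.items()
}
```

Each transition doubles the note duration, so the three sets are matched in size on a logarithmic scale and differ only in which putative boundary, if any, they span. A categorical perceiver with a boundary near 21 ms should therefore dishabituate only to set 3.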

Categorical auditory responses are not influenced by frequency modulation or bandwidth

Although prior behavioral work implicated note duration as the primary salient acoustic feature for categorical perception7, the stimuli used in that study and the initial set of stimuli used here varied note duration without controlling for changes in other covariant features of the manipulated note, namely the rate of frequency modulation (FM) or frequency bandwidth (BW). Thus it remains to be established whether note duration, rather than FM or bandwidth, is the salient feature underlying categorical perception of Category I and Category VI notes by swamp sparrows and the categorical responses of their HVC neurons. Although the larger stimulus set required to distinguish between these alternatives is impractical to employ in field studies of wild sparrows, such measurements could be made for the responses of single neurons. Therefore, we tested a subset of HVCX neurons (N = 10 cells, 2 birds; different subset than cells used in high-resolution testing of boundary location) with variants of the primary song type containing computer-generated replacement notes. The use of these synthetic notes allowed us to systematically change note duration while holding constant either the note’s FM or BW (Figs. 5A-C). Syllables containing synthetic notes that closely replicated the features of the natural note they replaced were just as efficacious as the natural syllable at eliciting responses from HVCX neurons (p = 0.47; Wilcoxon signed rank test; Fig. 5A). Furthermore, syllables containing synthetic notes with durations similar to those of the natural notes they replaced, but with different FM or bandwidths, also were highly effective at eliciting responses from HVCX neurons (FM: p = 0.13; BW: p = 0.43; Wilcoxon signed rank test; Figs. 5B-C). 
Thus, these data indicate that neither FM nor BW plays a primary role in influencing our results, and that note duration, the song feature previously implicated in affecting categorical perception of these note types7, is the salient song feature in driving categorical responses in HVCX neurons.

FIGURE 5. Categorical responses were evoked by changes in specific features of individual notes in the syllable.

(a-c) A subset of 10 categorically responsive HVCX neurons (2 birds) was further tested to probe the acoustic stimulus features that directed categorical responsiveness. (a) In all cells, computer-generated duplicates of natural replacement notes were sufficient to evoke a categorical response (p = 0.47; Wilcoxon signed rank test), establishing the validity of further testing using synthetic stimuli. Synthetic stimuli in which note duration was varied while either (b) the rate of frequency modulation (FM; p = 0.13) or (c) the frequency bandwidth (BW; p = 0.43; Wilcoxon signed rank test) was held constant evoked responses like those evoked by natural replacement notes, implicating note duration as the salient song feature driving categorical responses of HVCX neurons.

Categorical responses are evident in neurons displaying a precise sensorimotor correspondence

In a prior study in which we recorded both auditory and singing-related activity in HVCX neurons of swamp sparrows, we established that individual cells display a precise sensorimotor correspondence20. Because neurons that display sensorimotor “mirroring” are hypothesized to mediate perception23, we investigated whether categorical responsiveness was evident in these same cells for which this sensorimotor correspondence had been established. Categorical auditory responsiveness was observed in all HVCX neurons for which we also recorded singing-related activity (N = 5 cells, 2 birds). Comparison of auditory and singing-related activity revealed that all of these neurons displayed a precise sensorimotor correspondence20 (e.g., Supp. Fig. 5). These results suggest that neurons displaying a form of sensorimotor mirroring convey perceptual information about song.

DISCUSSION

This study provides the first evidence of neurons that encode perceptual information about a phonological feature of learned vocal behavior, specifically information about a categorical perceptual boundary. Furthermore, our neural results accurately predicted a previously unknown geographic dialectal difference in this perceptual boundary, one that we subsequently confirmed using behavioral tests with wild birds. Variation in this perceptual boundary across swamp sparrow populations strongly suggests that both categorical perception and categorical neural responses in sparrows are affected by experience. Finally, by identifying a cell type that expresses categorical activity, we provide insights into the neural circuitry for categorical perception of note duration.

Categorical perception has been shown to play a prominent role in a wide variety of communication systems3-9, including human speech3,4. Although categorical processing of both human and non-human vocalizations has been demonstrated in a variety of animals3,4,6,8, neural correlates of this phenomenon await identification. Neurophysiological studies in other modalities, however, have identified that neurons can reliably encode information about perceptual categories. For example, neurons in the monkey inferotemporal cortex respond in a categorical manner to facial features32, an important component of social signaling in primates, and brain imaging studies in humans have detected regions that show categorical sensitivity to differences in facial expression33. These studies strengthen the idea that categorical perception is mediated by categorically responsive neurons, and that the activity of those cells could play a role in communication. The present results identify neurons that respond categorically to learned vocal signals, extending the analysis of the neural basis of perceptual categorization to another major facet of communication with special relevance to human speech.

To our initial surprise, we found that the categorical response boundary for note duration measured from neurons differed markedly from the previously reported categorical perceptual boundary for this system using behavioral methods7. We found, however, that the response boundary we determined from physiological recordings of HVCX cells accurately predicted a previously unidentified population difference in this perceptual boundary, indicating that the perceptual boundary varies across geographically distinct swamp sparrow populations and consistent with the view that, as is the case for human speech3,4, these boundaries can be influenced by early experience. The fact that bird songs vary geographically as a consequence of learning is widely documented11,34, but our data are the first to suggest an effect of learning on the categorical perception of a natural signal other than speech. The observations that human infants across cultures display an initially similar ability to categorize speech sounds35 and that categorical boundaries of human speech can be detected by mammals that do not learn their vocalizations6 suggest that categorization of vocal signals may exploit innate neural mechanisms. However, the role of social experience in shaping human speech perception, including the perception of categorical boundaries, is also well documented3,4, indicating that the underlying neural mechanisms must also be pliable. Future studies in songbirds will be especially fruitful for addressing how acoustic and perhaps social experience shapes the development of categorical perception and categorically responsive neurons.

The present study localized categorical responses to HVCX projection neurons, which innervate a striatal pathway that lesion studies implicate in song perception as well as song learning15,16,18,22 and may be regarded as analogous to mammalian corticostriatal neurons. Our finding that the auditory responses of these projection neurons are closely linked to song perception provides a physiological basis for understanding how lesions to this pathway could interfere with song recognition. Furthermore, the present findings show that categorical responses are exhibited by HVCX neurons that display precise sensorimotor mirroring20, providing a link between mirror neurons and perception. In the songbird, categorically responsive sensorimotor neurons could enable auditory perception to directly guide subsequent behavior, as evident in countersinging in response to hearing a neighbor’s song20. Intriguingly, neurons that represent learned visual categories have been detected in regions of monkey parietal cortex implicated in motor planning and decision-making36. The expression of categorical responses in sensorimotor neurons in both songbirds and primates could point to an efficient means through which the nervous system determines when variations in a sensory stimulus warrant similar or different behavioral responses.

Establishing that categorical responses are expressed by striatal-projecting HVCX neurons, but not by interneurons, may also provide a clue as to how and where encoding of perceptual attributes occurs. Prior studies have shown that although HVCX neurons and HVC interneurons receive shared excitatory auditory input37,38, the auditory selectivity of interneurons more closely parallels the activity of auditory afferents to HVC37, and the highly selective auditory responses of HVCX neurons require inhibitory sculpting through interneurons39. The presence of categorical auditory responses in HVCX neurons and the absence of such responses in HVC interneurons suggest that categorical activity emerges through the local circuitry of HVC40 and the processes of synaptic integration in HVCX cells. Future studies using intracellular recordings could test this idea by determining whether HVCX neurons receive broadly responsive excitatory input from which categorical responses are sculpted by local inhibition.

Precise mapping of the neuronal response boundary revealed that the action potential discharge of HVCX cells is exquisitely sensitive to changes in note duration on the millisecond timescale. Sensitivity to fine temporal features of song has been described for HVC neurons in other songbirds41, and intracellular and extracellular recordings from putative HVCX neurons reveal that their song-evoked auditory responses are strongly dependent on auditory context, such as sequences of notes or syllables42-44. This context sensitivity involves integration over hundreds of milliseconds42 and is thought to depend at least in part on circuit mechanisms local to HVC38,39. Although the present study cannot resolve whether categorical responsiveness to note duration arises in HVC, it does suggest that temporal aspects of song ranging from milliseconds to hundreds of milliseconds can be encoded by single neurons. The sensitivity of individual neurons to features over multiple timescales may constitute a strategy for optimizing encoding of complex stimuli that can vary in both local structure and global sequence.

Supplementary Material

SUPPLEMENTARY FIGURE 1

A small minority of HVCX cells expressed responses to changes in note duration that were not categorical in nature. In 3 of 22 HVCX cells tested (2 birds), the auditory response did not fulfill the criteria to establish a response as categorical (see Methods). These data reveal no evidence of neurons selectively tuned to intermediate note durations, indicating that the population of HVCX neurons does not represent note duration continuously, further supporting our claim that HVCX neurons encode note duration in a categorical manner.

SUPPLEMENTARY FIGURE 2

A linear model of auditory responses of HVCX neurons to notes of different duration does not account for the observed dataset. We compared the relations between note duration and strength of response for HVCX cells in our dataset against the null hypothesis that those cells represented note duration in a linear manner (i.e., HVCX cells expressed a linear relation rather than the step-like transition characteristic of categorical activity). In this noisy linear model, the neural response is expected to be minimal at the shortest note duration we tested (normalized response = 0 at note duration 4 ms) and is expected to be maximal at the longest note duration we tested (normalized response = 1 at note duration 31 ms; stated in equation form: Y = mX + b, where m = 0.037 and b = -0.148). Random variance is included in the model to simulate local transitions with slopes much steeper than the mean value or possibly negative slopes (see Supplementary Methods). Because our dataset included both positive and negative slopes at the categorical boundary (e.g., Fig. 3D, Supp. Fig. 3), we used the absolute value of the slope in the tuning curve of each HVCX neuron to compare the steepest slope of each transition in our dataset against the results of the model (shown in panel a). Artificial data created using this model were compared against the dataset of 10 HVCX neurons for which we had a high-resolution estimate of the categorical boundary (Supp. Fig. 3). Three epochs of note duration were considered: 1) within the putative short-note category (X <= 14 ms), 2) within the putative long-note category (X >= 27 ms), and 3) values around the putative boundary (14 < X < 27 ms). (a) Within the putative boundary region, the slopes of the steepest transition for cells in our dataset (filled bar, mean ± SE) were significantly steeper than the slope of the model (open bar, p = 0.002, paired t-tests). 
(b) Within each category, the slopes of responses in our dataset (filled bar) were significantly different than the slope of the model (open bar, p = 0.02) and were indistinguishable from a slope of zero (p = 0.94, no absolute values used in within-category comparison of model and actual values). Together, these data indicate that a linear model cannot account for our observations and that, consistent with a categorical representation, the population activity of HVCX neurons in response to changes in note duration expresses a steep transition between two regions with slopes indistinguishable from zero.
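The slope and intercept quoted for the linear null model follow directly from the two anchor points stated in this legend (normalized response 0 at 4 ms and 1 at 31 ms). The following is a minimal sketch of that arithmetic only; the function names are hypothetical, and the random variance term described in the Supplementary Methods is deliberately omitted because its form is not given here.

```python
# Noiseless part of the linear null model Y = mX + b.
# Anchor points are taken from the legend; function names are illustrative.
SHORTEST_MS = 4.0   # normalized response assumed to be 0 here
LONGEST_MS = 31.0   # normalized response assumed to be 1 here


def linear_null_model(short_ms=SHORTEST_MS, long_ms=LONGEST_MS):
    """Return (m, b) for the line through (short_ms, 0) and (long_ms, 1)."""
    m = 1.0 / (long_ms - short_ms)
    b = -m * short_ms
    return m, b


def predicted_response(duration_ms, m, b):
    """Normalized response predicted by the line at a given note duration."""
    return m * duration_ms + b


m, b = linear_null_model()
print(round(m, 3), round(b, 3))  # 0.037 -0.148
```

Under this model the predicted change in response across the putative boundary region (14 to 27 ms) is gradual, which is the prediction against which the steep measured transitions are compared.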

SUPPLEMENTARY FIGURE 3

The categorical boundary estimated from a subset of HVCX neurons tested using high-resolution stimuli corroborates the boundary estimated using the full dataset. A subset of 10 HVCX neurons (2 birds, all 10 cells shown here) was tested to probe the categorical boundary at a high resolution, revealing an estimated categorical boundary (filled triangle, 20 ± 4 ms) very similar to that estimated using the full dataset (21 ± 4 ms; p = 0.47, unpaired t-test). Categorical responses were observed in HVCX cells regardless of whether the note that was replaced was naturally short (thick lines) or naturally long (thin lines).

SUPPLEMENTARY FIGURE 4

Behavioral testing of one swamp sparrow for which neural data were also collected revealed a correspondence between the perceptual and neurophysiological boundaries. Song stimuli were presented as described in the Methods for experiments performed in the field, and the bird’s behavior was observed using a small camera placed inside the neurophysiological recording chamber. The bird was conditioned to song stimuli using the song of another swamp sparrow, and the number of aggressive displays (crest raises46 or wing waves7 quantified as described in the Methods) was strongly habituated at time 0. Changing the stimulus to the bird’s own song (12 minutes) evoked a dishabituation of response, indicating that the bird perceived the new song as different than the song to which it had been conditioned. In subsequent song changes in this habituation/dishabituation paradigm (see Methods), this bird responded as if he perceived differences between stimuli that crossed the neurophysiological boundary (i.e., changing note duration from 7 to 31 ms at 98 minutes and from 27 to 5 ms at 218 minutes) but not between stimuli that did not cross the boundary (i.e., changing from 31 to 27 ms at 192 minutes). The bird from which we obtained these behavioral results is the same bird from which we obtained the neural data in Figs. 2B-C (left and middle columns), Fig. 3A-B and Supp. Fig. 5.

SUPPLEMENTARY FIGURE 5

Categorical auditory responses were observed in HVCX cells that also expressed a precise sensorimotor correspondence20. (a) The auditory response of this HVCX neuron was selective for the primary song type (N = 5 cells, 2 birds; data from the same cell shown in all panels; top: auditory response PSTH, 10 ms binsize; bottom: stimulus syllable spectrogram; left: primary song type; center: reverse playback of the primary song type; right: another song type in the bird’s vocal repertoire). (b) Manipulation of the duration of note C in each syllable of the primary song type resulted in a categorical auditory response (data as in Fig. 3A). (c) This cell also expressed a precise correspondence in the timing of action potentials generated when the bird sang the primary song type (top) and during the auditory response (middle) to the primary song type (spectrogram, bottom). The presence of a neural correlate of the animal’s perception in auditory-vocal “mirror neurons20” strengthens the idea that cells expressing an auditory-vocal correspondence may facilitate perception of the signals used in vocal communication.

SUPPLEMENTARY FIGURE 6

Responses of individual birds in behavioral testing of the perceptual boundary. Swamp sparrows in the Pennsylvania population perceived strong differences when the transition in note duration spanned the neurophysiological boundary detected in HVCX neurons of Pennsylvania birds (right, N = 8 birds, statistics as reported in legend of Fig. 4C) but perceived little or no difference when the transition spanned the perceptual boundary detected in New York birds (middle, N = 8 birds) or spanned no putative boundary (control, left, N = 8 birds). The y-axis indicates the difference between the number of territorial displays evoked in the first block of testing using the dishabituation test stimulus and the number of displays evoked in the final block of testing using the habituation stimulus (thus negative values are possible, see Methods for details). Small values are consistent with the bird perceiving little or no difference between the habituation and dishabituation stimuli, and large positive values are consistent with the bird perceiving the two stimuli as different. These data are summarized in Fig. 4C.

Appendix

METHODS

Methods employed in collection and analysis of neurophysiological and behavioral data are described here briefly. Detailed information is available in the Supplementary Methods. All experiments were performed in accordance with protocols approved by the Duke University Animal Care and Use Committee.

Subjects

Male swamp sparrows used in neural testing were collected as adults (age > 1 year) from their breeding grounds. All field tests of song perception were performed in Crawford County, PA during May and June of 2007.

Song Stimulus Preparation

Exemplars of each song in the bird’s repertoire (2 to 5 song types) were recorded and digitized20 to be used as stimuli and in the construction of additional stimuli to test categorical responsiveness. Song notes of the type implicated previously in swamp sparrow categorical perception7 served as “target notes” for replacement (e.g., note C in Fig. 3A). Each replacement note had similar spectral characteristics but different duration than the target note (e.g., Fig. 3A), and only one target note in the syllable was replaced in any stimulus. The possible durations of replacement notes were 4, 5, 7, 8, 16, 19, 22, 25, 27, and 31 ms. These values were chosen to replicate as closely as possible the methods used in the previous behavioral assessment of categorical perception in swamp sparrows7 and to resolve the location of the categorical boundary. The resulting synthetic syllable contained a replacement note in place of the target note, and all other notes and intervals were identical to the natural song. This synthetic syllable was assembled into a trill with trill rate identical to the natural song and total duration closely approximating the duration of the natural song type (increasing or decreasing individual note durations changed the overall duration slightly). In a subset of cells, additional synthetic stimuli were presented in which either the rate of frequency modulation (FM) or the frequency bandwidth (BW) was controlled while the duration of the note was varied (Figs. 5A-C).

Experimental Protocol

All experiments were performed using awake and freely behaving birds. Individual neurons were isolated using a microdrive device24 implanted under isoflurane anesthesia. All birds were allowed to recover for three days following implantation before recording began. Neurons in HVC were identified as HVCX, HVCRA or HVCINT cells either using antidromic stimulation methods or according to their electrophysiological and auditory response properties (see Supplementary Methods). All neural data were amplified, filtered (band pass 500 Hz to 10 kHz), and digitized (25 kHz) to computer file (LabView), and all action potentials of individual units were discriminated using amplitude discrimination of the largest unit in a record (custom software) or discrimination based on waveform characteristics (WaveClus). In both cases, single unit isolation was verified using an interspike interval histogram to test for the presence of a refractory period.
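The refractory-period check on the interspike interval distribution can be illustrated with a short sketch. This is not the custom software or WaveClus procedure used in the study; the function name and the 1 ms refractory threshold are assumptions chosen for illustration, since no numerical criterion is stated here.

```python
def refractory_violation_fraction(spike_times_s, refractory_s=0.001):
    """Fraction of interspike intervals shorter than an assumed refractory period.

    spike_times_s: spike times in seconds (any order).
    refractory_s: assumed refractory period in seconds (hypothetical default).
    """
    times = sorted(spike_times_s)
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    if not intervals:
        return 0.0
    violations = sum(1 for isi in intervals if isi < refractory_s)
    return violations / len(intervals)


# Hypothetical spike train: the second interval (0.5 ms) violates the threshold.
example_spikes = [0.000, 0.010, 0.0105, 0.050]
print(refractory_violation_fraction(example_spikes))
```

A well-isolated single unit would be expected to yield a fraction at or near zero, whereas contamination by a second unit tends to produce intervals inside the refractory window.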

When a single unit had been isolated and identified, stimulus presentation was immediately initiated (10 sec quiet interval between each song presentation, stimuli presented in randomized order). Songs were played to the sparrow at 70 dB (peak RMS, A-weighted) through a speaker placed 20 to 35 cm away in the chamber (distance varied according to the bird’s location in the cage). Playback of the bird’s entire song repertoire, as well as their synthetic variants (see Supplementary Methods), was used to assess the auditory response of each neuron described in the Results. Natural song types (unaltered from the original recordings) were played through a speaker placed inside the recording chamber to assess the auditory selectivity of each neuron (i.e., to identify the “primary song type”). Extracellular recordings of action potentials in response to these stimuli were collected from 29 individual HVCX units (5 birds) and 18 HVCINT units (same 5 birds).

Quantification of Auditory Response

Auditory activity in HVCX neurons was taken as significant if the response to any natural song stimulus exceeded the mean + 5 SD of the baseline firing rate. Auditory activity in HVC interneurons was tested for significance using response strength (p < 0.05), a metric that also compares the response and baseline conditions37. The responses of HVCX neurons and HVCINT cells were normalized using the strongest response of each cell to any member of the set of synthetic stimuli, enabling comparison of auditory responses across cells (e.g., Fig. 3B, left) and birds (e.g., Fig. 3C, left).
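The significance criterion for HVCX neurons and the per-cell normalization can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the sample standard deviation is assumed for the baseline, and normalization is assumed to be division by the strongest response. The response strength metric used for interneurons is not reproduced here.

```python
from statistics import mean, stdev


def exceeds_baseline(response_rate, baseline_rates, n_sd=5.0):
    """True if the response exceeds the baseline mean + n_sd standard deviations.

    Uses the sample standard deviation (an assumption of this sketch).
    """
    threshold = mean(baseline_rates) + n_sd * stdev(baseline_rates)
    return response_rate > threshold


def normalize_to_peak(responses):
    """Divide each response by the strongest response of the cell."""
    peak = max(responses)
    if peak <= 0:
        raise ValueError("strongest response must be positive to normalize")
    return [r / peak for r in responses]


# Hypothetical firing rates (spikes/s) for one cell.
baseline = [1.0, 2.0, 3.0]            # mean 2, SD 1, so the threshold is 7
print(exceeds_baseline(7.5, baseline))
print(normalize_to_peak([2.0, 4.0, 1.0]))
```

Normalizing each cell to its own strongest response places all tuning curves on a common 0 to 1 scale, which is what allows comparison across cells and birds.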

Neural Assessment of Categorical Responsiveness

Noting that the dataset of auditory responses in HVCX cells tended to include transitions from strong responses (normalized responses closer to 1) to little or no responses (normalized responses closer to 0), we used interpolation to compute the note duration at which the auditory response crossed 0.5. In cases wherein interpolated data crossed 0.5 multiple times (e.g., green line in Fig. 3D), the transition point in the cell was represented by the mean of the note durations corresponding to those multiple crossing points. This method revealed a putative boundary in the auditory responsiveness of these cells (~ 21 ms), and this value was used to test whether cells expressed categorical responses to stimuli on either side of this boundary. In off-line analysis, neural responses were compared to well-established criteria7,25,27 to determine whether responses were categorical in nature (see Supplementary Methods). In short, responses were said to be categorical if activity was similar among members of a group of stimuli, but different between groups of stimuli, indicating a greater sensitivity to stimulus category than to the physical properties of the stimulus.
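The interpolation step described above can be sketched in a few lines. The sketch assumes linear interpolation between adjacent tested durations, which is one plausible reading of the description; the function names and the example responses are hypothetical, and only the tested durations are taken from the stimulus set.

```python
def half_level_crossings(durations_ms, responses, level=0.5):
    """Note durations at which the interpolated tuning curve crosses `level`.

    Assumes durations are sorted and linear interpolation between points.
    """
    points = list(zip(durations_ms, responses))
    crossings = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (y0 - level) * (y1 - level) < 0:
            # The level lies strictly between the two responses.
            crossings.append(x0 + (level - y0) * (x1 - x0) / (y1 - y0))
        elif y0 == level:
            # A tested point sits exactly on the level.
            crossings.append(x0)
    if points and points[-1][1] == level:
        crossings.append(points[-1][0])
    return crossings


def transition_point(durations_ms, responses, level=0.5):
    """Mean of all crossing points, or None if the curve never crosses."""
    crossings = half_level_crossings(durations_ms, responses, level)
    if not crossings:
        return None
    return sum(crossings) / len(crossings)


# Tested durations from the stimulus set; the responses are made up.
durations = [4, 5, 7, 8, 16, 19, 22, 25, 27, 31]
responses = [0.0, 0.1, 0.0, 0.1, 0.2, 0.4, 0.6, 0.9, 1.0, 0.9]
print(transition_point(durations, responses))
```

Averaging the crossing points gives a single transition estimate for cells whose interpolated curve passes through 0.5 more than once, matching the rule described for such cases.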

Behavioral Assessment of Categorical Perception

Behavioral methods (Crawford County, PA) closely paralleled the previous study of categorical perception in a New York population of swamp sparrows7. Song stimuli were created in the same manner described for neurophysiological testing (see above) and were presented in pairs (4:8 ms, 8:16 ms, 16:32 ms; one pair of songs to each bird; presentation sequence randomized) through a speaker placed inside of the bird’s breeding territory. Perception of song features was assessed by quantifying territorial wing-wave responses to song stimuli in a habituation/dishabituation paradigm employed previously to test auditory perception in swamp sparrows7.

References

  • 1. Miller EK, Nieder A, Freedman DJ, Wallis JD. Neural correlates of categories and concepts. Curr Opin Neurobiol. 2003;13:198–203. doi: 10.1016/s0959-4388(03)00037-0.
  • 2. Bregman A. Auditory Scene Analysis. MIT Press; Cambridge, MA: 1990.
  • 3. Liberman AM, Harris KS, Hoffman HS, Griffith BC. The discrimination of speech sounds within and across phoneme boundaries. J Exp Psychol. 1957;54:358–68. doi: 10.1037/h0044417.
  • 4. Liberman AM, Harris KS, Kinney JA, Lane H. The discrimination of relative onset-time of the components of certain speech and nonspeech patterns. J Exp Psychol. 1961;61:379–88. doi: 10.1037/h0049038.
  • 5. May B, Moody DB, Stebbins WC. Categorical perception of conspecific communication sounds by Japanese macaques, Macaca fuscata. J Acoust Soc Am. 1989;85:837–47. doi: 10.1121/1.397555.
  • 6. Kuhl PK, Miller JD. Speech-perception by chinchilla - phonetic boundaries for synthetic VOT stimuli. Journal of the Acoustical Society of America. 1975;57:S49–S50. doi: 10.1121/1.381770.
  • 7. Nelson DA, Marler P. Categorical perception of a natural stimulus continuum: birdsong. Science. 1989;244:976–8. doi: 10.1126/science.2727689.
  • 8. Baugh AT, Akre KL, Ryan MJ. Categorical perception of a natural, multivariate signal: mating call recognition in tungara frogs. Proc Natl Acad Sci U S A. 2008;105:8985–8. doi: 10.1073/pnas.0802201105.
  • 9. Nowak MA, Krakauer DC. The evolution of language. Proc Natl Acad Sci U S A. 1999;96:8028–33. doi: 10.1073/pnas.96.14.8028.
  • 10. Marler P, Peters S. Sparrows learn adult song and more from memory. Science. 1981;213:780–782. doi: 10.1126/science.213.4509.780.
  • 11. Catchpole CK, Slater PJB. Birdsong: Biological Themes and Variations. Cambridge University Press; New York, NY: 1995. p. 248.
  • 12. Marler P, Pickert R. Species - universal microstructure in the learned song of the swamp sparrow (Melospiza georgiana). Animal Behavior. 1984;32:673–689.
  • 13. Clark CW, Marler P, Beeman K. Quantitative analysis of animal vocal phonology: an application to swamp sparrow song. Ethology. 1987;76:101–115.
  • 14. Nottebohm F, Stokes TM, Leonard CM. Central control of song in the canary, Serinus canarius. J Comp Neurol. 1976;165:457–86. doi: 10.1002/cne.901650405.
  • 15. Gentner TQ, Hulse SH, Bentley GE, Ball GF. Individual vocal recognition and the effect of partial lesions to HVc on discrimination, learning, and categorization of conspecific song in adult songbirds. J Neurobiol. 2000;42:117–33. doi: 10.1002/(sici)1097-4695(200001)42:1<117::aid-neu11>3.0.co;2-m.
  • 16. Brenowitz EA. Altered perception of species-specific song by female birds after lesions of a forebrain nucleus. Science. 1991;251:303–5. doi: 10.1126/science.1987645.
  • 17. Yu AC, Margoliash D. Temporal hierarchical control of singing in birds. Science. 1996;273:1871–5. doi: 10.1126/science.273.5283.1871.
  • 18. Bottjer SW, Miesner EA, Arnold AP. Forebrain lesions disrupt development but not maintenance of song in passerine birds. Science. 1984;224:901–3. doi: 10.1126/science.6719123.
  • 19. Hahnloser RH, Kozhevnikov AA, Fee MS. An ultra-sparse code underlies the generation of neural sequences in a songbird. Nature. 2002;419:65–70. doi: 10.1038/nature00974.
  • 20. Prather JF, Peters S, Nowicki S, Mooney R. Precise auditory-vocal mirroring in neurons for learned vocal communication. Nature. 2008;451:305–10. doi: 10.1038/nature06492.
  • 21. Mooney R, Hoese W, Nowicki S. Auditory representation of the vocal repertoire in a songbird with multiple song types. Proc Natl Acad Sci U S A. 2001;98:12778–83. doi: 10.1073/pnas.221453298.
  • 22. Scharff C, Nottebohm F, Cynx J. Conspecific and heterospecific song discrimination in male zebra finches with lesions in the anterior forebrain pathway. J Neurobiol. 1998;36:81–90.
  • 23. Rizzolatti G, Craighero L. The mirror-neuron system. Annu Rev Neurosci. 2004;27:169–92. doi: 10.1146/annurev.neuro.27.070203.144230.
  • 24. Fee MS, Leonardo A. Miniature motorized microdrive and commutator system for chronic neural recording in small animals. J Neurosci Methods. 2001;112:83–94. doi: 10.1016/s0165-0270(01)00426-5.
  • 25. Studdert M, Liberman AM, Harris KS, Cooper FS. Theoretical Notes Motor Theory of Speech Perception - a Reply to Lanes Critical Review. Psychological Review. 1970;77:234. doi: 10.1037/h0029078.
  • 26. Wyttenbach RA, May ML, Hoy RR. Categorical perception of sound frequency by crickets. Science. 1996;273:1542–4. doi: 10.1126/science.273.5281.1542.
  • 27. Freedman DJ, Riesenhuber M, Poggio T, Miller EK. Categorical representation of visual stimuli in the primate prefrontal cortex. Science. 2001;291:312–6. doi: 10.1126/science.291.5502.312.
  • 28. Diehl RL, Lotto AJ, Holt LL. Speech perception. Annu Rev Psychol. 2004;55:149–79. doi: 10.1146/annurev.psych.55.090902.142028.
  • 29. Balaban E. Cultural and genetic variation in swamp sparrows (Melospiza georgiana) 2. behavioral salience of geographic song variants. Behaviour. 1988;105:292–322.
  • 30. Eimas PD, Siqueland ER, Jusczyk P, Vigorito J. Speech perception in infants. Science. 1971;171:303–6. doi: 10.1126/science.171.3968.303.
  • 31. Ballentine B, Searcy WA, Nowicki S. Reliable aggressive signalling in swamp sparrows. Animal Behaviour. 2008;75:693–703.
  • 32. Eifuku S, De Souza WC, Tamura R, Nishijo H, Ono T. Neuronal correlates of face identification in the monkey anterior temporal cortical areas. J Neurophysiol. 2004;91:358–71. doi: 10.1152/jn.00198.2003.
  • 33. Etcoff NL, Magee JJ. Categorical perception of facial expressions. Cognition. 1992;44:227–40. doi: 10.1016/0010-0277(92)90002-y.
  • 34. Marler P, Tamura M. Culturally transmitted patterns of vocal behavior in sparrows. Science. 1964;146:1483–6. doi: 10.1126/science.146.3650.1483.
  • 35. Kuhl PK, Tsao FM, Liu HM. Foreign-language experience in infancy: effects of short-term exposure and social interaction on phonetic learning. Proc Natl Acad Sci U S A. 2003;100:9096–101. doi: 10.1073/pnas.1532872100.
  • 36. Freedman DJ, Assad JA. Experience-dependent representation of visual categories in parietal cortex. Nature. 2006;443:85–8. doi: 10.1038/nature05078.
  • 37. Coleman MJ, Mooney R. Synaptic transformations underlying highly selective auditory representations of learned birdsong. J Neurosci. 2004;24:9251–65. doi: 10.1523/JNEUROSCI.0947-04.2004.
  • 38. Rosen MJ, Mooney R. Synaptic interactions underlying song-selectivity in the avian nucleus HVC revealed by dual intracellular recordings. J Neurophysiol. 2006;95:1158–75. doi: 10.1152/jn.00100.2005.
  • 39. Rosen MJ, Mooney R. Inhibitory and excitatory mechanisms underlying auditory responses to learned vocalizations in the songbird nucleus HVC. Neuron. 2003;39:177–94. doi: 10.1016/s0896-6273(03)00357-x.
  • 40. Mooney R, Prather JF. The HVC microcircuit: the synaptic basis for interactions between song motor and vocal plasticity pathways. J Neurosci. 2005;25:1952–64. doi: 10.1523/JNEUROSCI.3726-04.2005.
  • 41. Theunissen FE, Doupe AJ. Temporal and spectral sensitivity of complex auditory neurons in the nucleus HVc of male zebra finches. J Neurosci. 1998;18:3786–802. doi: 10.1523/JNEUROSCI.18-10-03786.1998.
  • 42. Margoliash D. Acoustic parameters underlying the responses of song-specific neurons in the white-crowned sparrow. J Neurosci. 1983;3:1039–57. doi: 10.1523/JNEUROSCI.03-05-01039.1983.
  • 43. Margoliash D, Fortune ES. Temporal and harmonic combination-sensitive neurons in the zebra finch’s HVc. J Neurosci. 1992;12:4309–26. doi: 10.1523/JNEUROSCI.12-11-04309.1992.
  • 44. Lewicki MS, Konishi M. Mechanisms underlying the sensitivity of songbird forebrain neurons to temporal order. Proc Natl Acad Sci U S A. 1995;92:5582–6. doi: 10.1073/pnas.92.12.5582.
  • 45. Mooney R. Different subthreshold mechanisms underlie song selectivity in identified HVc neurons of the zebra finch. J Neurosci. 2000;20:5420–36. doi: 10.1523/JNEUROSCI.20-14-05420.2000.
  • 46. Collins CE, Houtman AM. Tan and white color morphs of White-throated Sparrows differ in their non-song vocal responses to territorial intrusion. Condor. 1999;101:842–845.
  • 47. Ehret G. Categorical perception of sound signals: facts and hypotheses from animal studies. In: Harnad S, editor. Categorical Perception: the Groundwork of Cognition. Cambridge University Press; New York, NY: 1975. pp. 301–331.
  • 48. Ehret G, Haack B. Categorical perception of mouse pup ultrasound by lactating females. Naturwissenschaften. 1981;68:208–209. doi: 10.1007/BF01047208.
  • 49. Cutting J. Plucks and bows are categorically perceived, sometimes. Perception & Psychophysics. 1982;31:462–476. doi: 10.3758/bf03204856.
  • 50. Beletsky LD, Chao S, Smith DG. An investigation of song-based species recognition in the red-winged blackbird (Agelaius phoeniceus). Behaviour. 1980;73:189–203.
