Author manuscript; available in PMC: 2009 Feb 1.
Published in final edited form as: Exp Brain Res. 2007 Sep 21;184(4):493–509. doi: 10.1007/s00221-007-1115-9

Spectral Integration Plasticity in Cat Auditory Cortex Induced by Perceptual Training

M Diane Keeling 1, Barbara M Calhoun 2, Katharina Krüger 1, Daniel B Polley 3, Christoph E Schreiner 1
PMCID: PMC2474628  NIHMSID: NIHMS51589  PMID: 17896103

Abstract

We investigated the ability of cats to discriminate differences between vowel-like spectra, assessed their discrimination ability over time, and compared spectral receptive fields in primary auditory cortex (AI) of trained and untrained cats. Animals were trained to discriminate changes in the spectral envelope of a broad-band harmonic complex in a 2-alternative forced choice procedure. The standard stimulus was an acoustic grating consisting of a harmonic complex with a sinusoidally modulated spectral envelope ('ripple spectrum'). The spacing of spectral peaks was conserved at 1, 2, or 2.66 peaks/octave. Animals were trained to detect differences in the frequency location of energy peaks, corresponding to changes in the spectral envelope phase. Average discrimination thresholds improved continuously during the course of the testing from phase-shifts of 96° at the beginning to 44° after 4–6 months of training.

Responses of AI single units and small groups of neurons to pure tones and ripple spectra were modified during perceptual discrimination training with vowel-like ripple stimuli. The transfer function for spectral envelope frequencies narrowed and the tuning for pure tones sharpened significantly in discriminant versus naive animals. By contrast, control animals that used the ripple spectra only in a lateralization task showed broader ripple transfer functions and narrower pure-tone tuning than naïve animals.

Keywords: spectral modulation, ripple spectrum, behavioral training, frequency tuning, formants, discrimination, speech, vowels, perception, primary auditory cortex, spectral envelope, cortical representation, plasticity, adult cats

Introduction

Species-specific vocalizations and human speech signals share many spectro-temporal features. Accordingly, several physiological studies of the peripheral and central auditory system of animals have used speech-like stimuli to elucidate potential coding strategies of the human auditory system (e.g., Delgutte and Kiang, 1984; Eggermont, 1995; Sachs and Young, 1979; Schreiner, 1998; Steinschneider et al., 1990; Wong and Schreiner, 2003; Young and Sachs, 1979). Studies of general features of human and animal vocalizations, such as fundamental frequency of the sound source (e.g. Langner, 1992; Langner and Schreiner, 1988; Rhode and Greenberg, 1994) and vocal tract resonances (e.g., Klein et al., 2000; Schreiner and Calhoun, 1994; Shamma et al., 1995; Wang and Sachs, 1995) provide useful insights into auditory neuronal coding.

The formant structure of vowels is a fundamental spectral feature of the vocal tract in speech and animal vocalization. Auditory gratings or ripple spectra, i.e. broad-band stimuli with sinusoidal spectral envelopes similar to the structure of vowels, have been used to characterize spectral integration properties of central auditory neurons (e.g., Schreiner and Calhoun, 1994; Shamma et al., 1995; Klein et al., 2000; Miller et al., 2002; Escabi and Schreiner, 2002). Cat and ferret cortical neurons respond preferentially to a narrow range of formant ratios or spectral envelope frequencies (Calhoun and Schreiner, 1998; Klein et al., 2000; Kowalski et al., 1996; Schreiner and Calhoun, 1994; Shamma et al., 1995). Receptive field properties determined with broad-band acoustic gratings and pure tones appear to be related, although the precise nature of that relationship is still debated (Calhoun and Schreiner, 1998; Linden et al., 2003).

Psychophysical evaluations of the perception of specific speech features in animals have been reported for various conditioning/training procedures (e.g., Kuhl and Miller 1975, using the chinchilla; Kuhl and Padden 1982, 1983, monkey; Baru 1975, dog; and Dewson 1964, cat). These studies found that animals discriminate many speech-like features in a manner similar to humans. The ability to discriminate signals can improve with training and is accompanied by plastic changes in response characteristics of central auditory neurons (e.g., Allard et al., 1992; Bakin et al., 1996; Beitel et al., 2001; Edeline, 1999; Recanzone et al., 1993; Polley et al. 2004, 2006; Prosen et al., 1990; Weinberger et al., 1988). Since the process of behavioral assessment can influence the perceptual capacity of an animal, it may not be valid to equate physiological data from untrained animals with the psychophysical performance of trained animals. Behavioral training, combined with psychophysical and neurophysiological evaluation of animals, is a more appropriate animal model for the study of cortical sound processing, including speech sounds.

To elucidate the potential relationship between auditory cortical receptive field properties and the spectral nature of behaviorally relevant acoustic inputs, we trained animals with spectrally structured acoustic gratings and determined spectral integration properties in AI. We hypothesized that the shape and extent of spectral receptive fields of trained animals would undergo modifications during behavioral training that parallel the perceptual improvements of the animal and reflect more efficient processing of behaviorally relevant broad-band stimuli. The results support this hypothesis and show that neural spectral selectivity in AI improves with training and reflects perceptual performance in spectral envelope discrimination.

METHODS

Behavioral Procedures

Eight adult female cats were exposed to spectral envelope/ripple stimuli by training in a spatial and/or spectral discrimination task using a two-alternative forced choice (2-AFC) procedure. Stimulus generation, stimulus presentation, and response tabulation were controlled with the LabVIEW application (National Instruments, Version 2) in a Macintosh environment (Apple, Mac IIci). The animals were food restricted to 90–95% of their pre-fast weight; reward during testing was diluted canned cat food (Hill’s Prescription Diet feline c/d) dispensed from a syringe using a stepping motor, which pumped a small amount of food onto a fiberglass plate (Thompson et al., 1990). The cats were weighed daily and, if indicated, received a supplement of dry cat food after testing to maintain their total daily nutritional requirements. Training occurred in an acoustically transparent test cage hung from the ceiling of a large sound-attenuating chamber (IAC). The chamber walls were covered with sound absorbing acoustic foam. Behavior was monitored continuously with a video camera system.

All procedures were approved by the University Committee on Animal Research and adhered to national guidelines for the treatment of animals (“Principles of laboratory animal care”; NIH publication No. 82-23, revised 1985).

Stimulus

Stimuli were created using a DSP board (TMS32010) with a 16-bit DA converter. The signals were amplified (Crown D-75) and delivered through a loudspeaker located directly in front of the cat or, at the outset of the training, from two speakers located 75° to the right and left of midline. Signal intensity and spectra were assessed in the region occupied by the head of the experimental animals using a sound level meter (Bruel & Kjaer) and a spectrum analyzer (Nicolet Ubiquitous). Stimuli were ripple spectra consisting of a harmonic carrier signal with sinusoidal modulation (linear on a logarithmic amplitude and frequency scale) of the spectral envelope (see Schreiner and Calhoun, 1994). The stimulus spectrum was centered at 3.7 kHz, with a spacing between spectral maxima (the ripple density) of either 1, 2 or 2.66 ripples/octave, a fundamental frequency of 69 Hz, and a total bandwidth of 3 octaves. The stimulus contained 127 frequency components and extended from ~ 1.3 kHz (the 19th harmonic of the fundamental frequency) to ~ 10 kHz (the 145th harmonic). Thus, the stimulus with ripple density of 1 ripple/octave had three spectral maxima at approximately 1.85, 3.7 and 7.4 kHz (see Figure 1). The phase of each frequency component differed from the neighboring frequency components by 57° to avoid peaked temporal waveforms. An overall decrease of the component amplitudes of 6 dB/octave was implemented to maintain an equal amount of signal energy per octave. The frequency range of 1.3 – 10 kHz was chosen to facilitate cortical recordings from a largely planar region of the primary auditory cortex (AI). The main formants in cat vocalizations appear to be located between about 1 and 4 kHz (Shipley et al., 1991), spanning the mapped frequency region. The signal consisted of two segments, the standard stimulus followed by the test stimulus. The duration of the standard and the test signal were 500 ms including rise and fall times of 3 ms. 
A silent 500 ms interval was inserted between the standard and test stimuli. The inter-trial pause was controlled by the animal and varied from 1 to 10 seconds.
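The spectral construction of the ripple stimulus described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the envelope is taken to be a cosine on a log2 frequency axis with a maximum at the center frequency for a phase of 0°, and the 6 dB/octave tilt is referenced to the center frequency. None of these conventions is specified in the original methods.

```python
import math

F0_HZ = 69.0                  # fundamental frequency (from the text)
HARMONICS = range(19, 146)    # 19th to 145th harmonic (from the text)
CENTER_HZ = 3700.0            # center of the 3-octave spectrum (from the text)
DEPTH_DB = 30.0               # peak-to-trough envelope depth (from the text)
TILT_DB_PER_OCTAVE = 6.0      # overall amplitude decline (from the text)


def ripple_components(ripple_density, phase_deg):
    """Return (frequency_hz, envelope_db, tilt_db) for every harmonic.

    Assumed convention: cosine envelope on a log2 axis, maximum at
    CENTER_HZ when phase_deg is 0; tilt is 0 dB at CENTER_HZ.
    """
    components = []
    for k in HARMONICS:
        freq = k * F0_HZ
        octaves = math.log2(freq / CENTER_HZ)
        envelope = 0.5 * DEPTH_DB * math.cos(
            2.0 * math.pi * ripple_density * octaves + math.radians(phase_deg)
        )
        tilt = -TILT_DB_PER_OCTAVE * octaves
        components.append((freq, envelope, tilt))
    return components


def envelope_peaks_hz(components):
    """Frequencies of the harmonics that sit on local envelope maxima."""
    env = [c[1] for c in components]
    return [
        components[i][0]
        for i in range(1, len(env) - 1)
        if env[i - 1] < env[i] > env[i + 1]
    ]


components = ripple_components(ripple_density=1.0, phase_deg=0.0)
print(len(components), envelope_peaks_hz(components))
```

With a ripple density of 1 ripple/octave and a phase of 0°, the sketch yields 127 components whose envelope maxima fall on the harmonics closest to 1.85, 3.7 and 7.4 kHz, in line with the description above.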

Figure 1.


Ripple Spectrum Stimulus for Animal Testing. The spectral envelope of a three-octave signal is shown. The carrier signal was composed of a harmonic complex with a fundamental frequency of 69 Hz (for illustration only, the harmonics for a 333 Hz fundamental frequency are indicated for the 0° phase condition). Three examples of the spectral envelope are shown for ripple phases of 0°, 90°, and 180°. Corresponding frequency locations of the central spectral maximum or 'formants' are indicated by arrows: 2.6, 3.1, and 3.7 kHz, respectively.

At the start of testing, the frequency location of the spectral peaks in the test stimulus was shifted downward by half an octave relative to the standard stimulus, i.e., the envelope phase was shifted by 180°. The phase shift affected the spectral envelope only by altering the amplitudes of each frequency component, i.e., stimulus frequency range as well as frequency and phase of each spectral component were unchanged. During testing, the range of envelope phase differences between standard and test stimulus was gradually reduced as an animal’s discrimination performance improved. Since ripple density was constant during the training, the interval between spectral peaks (measured in octaves) remained the same while their frequency locations shifted. The depth of the spectral envelope modulation (the auditory contrast) was 30 dB. The intensity of the 2-AFC stimuli was constant within a trial but varied randomly in 1 dB steps between 61 and 67 dB SPL across trials.

Training procedure

Initially, cats were trained to press a nose key located at mid-line to indicate an observing (“ready”) response, for which they received food. This response ensured that the cat was attending, could initiate trials, and that the cat’s head position was relatively constant from trial to trial. Once the cat was reliably performing the observing response, auditory stimuli were introduced; reward then required a center key press, followed by auditory stimulus presentation, to which the cat was required to make a response either to the left or to the right side. Responses were indicated on two spatially separated nose-press keys, one to each side of the middle, observing key. During this phase of the training, by pressing the response key the animal indicated the side from which the stimulus had been presented.

At the beginning of training, standard and test stimuli were presented from one of two small lateral loudspeakers, at about 75° to the left and right of midline. When the stimuli were both presented from the left speaker (presentation designated AA), the ripple phase of standard and test stimuli was 0°. When the stimuli were presented from the right speaker (designated BB), the ripple phase was 180° (the maximal spectral difference from the left side). After the animals performed this task with high reliability, the stimuli presented on the right side were changed to an AB presentation, i.e. the test stimulus was maximally different from the preceding standard stimulus. The requirement was that the animals go left for reward following a left-side or ‘same’ stimulus presentation (AA) and go right for reward following a right-side or ‘different’ stimulus presentation (AB); this was usually readily learned in 2 to 3 weeks.

After the animal learned to respond to the location of the stimulus presented (lateralization stage), the speakers were gradually (over several weeks) moved toward the center location until, finally, the ripple stimuli were presented from a single speaker positioned directly in front of the cage (envelope stage). Stimulus pair AA required a left-side key press to receive reward; stimulus pair AB required a right-side key press for reward. During the lateralization phase, all trained animals had a hit rate of >90% and a very low false alarm rate, indicating that the animals were under behavioral control. The switch from localization discrimination to discrimination based solely upon the spectral quality difference between standard and test stimuli was quite difficult for some cats; three cats (out of five) hearing a ripple stimulus of 1 ripple/octave made the transition from the lateralization task to the envelope stage. Three cats were trained with ripple densities of 2 and/or 2.66 ripples/octave. These animals performed the spatial task with the same accuracy as the previous group of animals but failed to acquire the spectral discrimination task (see Discussion).

Testing procedure

Once an animal mastered the 2-AFC procedure with a single speaker, the testing phase began. The ripple phase of the test stimulus was randomly chosen from a set of values. The initial value range of ripple phase differences in the AB stimulus was large, i.e. 50° to 180° of the spectral envelope phase distributed over 10 to 15 fixed phase values. An initially wide range (~130°) was necessary to be certain to present phase differences above and below the yet unknown behavioral threshold. According to the discrimination threshold achieved on each day’s testing, the range of values was narrowed to about +/− 50% of the last 2–3 thresholds (but not narrower than ~40° near the end of testing) and centered around the most recent 2–3 threshold estimates for the subsequent test session.

Cats worked until they did not initiate any further trials, usually after 100 to 250 trials per day. The numbers of AA (same stimuli) and AB (second stimulus different) trials were approximately equal. Performance threshold for the envelope discrimination was defined as the phase shift (AB stimuli) correctly detected in 75% of the trial presentations and was computed by a linear regression formula applied to the raw psychometric function. Testing was terminated after four to six months when the animal’s performance plateaued over one to two weeks of testing. Subsequently, the cats underwent electrophysiological testing of neuronal responses in AI with similar stimuli.
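A minimal sketch of how a 75%-correct threshold can be read off a psychometric function with a straight-line fit is given below. The exact regression formula used in the study is not stated, so an ordinary least-squares line is assumed here; the function name `threshold_75` and the session data are hypothetical and invented for illustration only.

```python
def threshold_75(phase_shifts_deg, percent_correct, criterion=75.0):
    """Fit percent correct = intercept + slope * phase shift by ordinary
    least squares (an assumption) and solve for the criterion level."""
    n = len(phase_shifts_deg)
    mean_x = sum(phase_shifts_deg) / n
    mean_y = sum(percent_correct) / n
    sxx = sum((x - mean_x) ** 2 for x in phase_shifts_deg)
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(phase_shifts_deg, percent_correct)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return (criterion - intercept) / slope


# Hypothetical single-session data (not taken from the study).
shifts = [10.0, 20.0, 30.0, 40.0, 50.0]
correct = [60.0, 70.0, 80.0, 90.0, 100.0]
print(threshold_75(shifts, correct))  # 25.0 for this made-up, perfectly linear set
```

Real psychometric functions are sigmoidal, so a straight line is only a reasonable approximation when the tested phase shifts bracket the threshold closely, as the narrowing of the stimulus range described above would tend to ensure.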

Electrophysiological Procedures

The methods for the electrophysiological recording were the same as described in previous, related reports (Calhoun and Schreiner, 1998; Schreiner and Calhoun, 1994; Schreiner and Mendelson, 1990). Briefly, results were obtained in AI of the right hemisphere. Anesthesia was induced with an intramuscular injection of a 4:1 mixture of ketamine hydrochloride (10 mg/kg) and acetylpromazine maleate (0.10 mg/kg). After venous cannulation, an initial dose of pentobarbital sodium (30 mg/kg) was administered. Animals were maintained at a surgical level of anesthesia with a continuous infusion of pentobarbital sodium (~2 mg/kg per hour) in lactated Ringer's solution (~3.5 ml/h) and, if necessary, with supplementary intravenous injections of pentobarbital sodium. The cats were also given dexamethasone sodium phosphate (0.14 mg/kg im) to prevent brain edema and atropine sulfate (0.06 mg/kg im) to reduce salivation. Temperature was monitored and maintained at 37.5 °C by means of a heated water blanket. The head was fixed, leaving the external meati unobstructed; the temporal muscle on the right side was retracted and the lateral cortex exposed by a craniotomy. The dura overlying the middle ectosylvian gyrus was removed and the exposed cortex covered with silicone oil.

Experiments were conducted in a double-walled, sound-shielded room (Industrial Acoustics Company). Auditory stimuli were presented via calibrated headphones (STAX 54) enclosed in small chambers that were connected to sound delivery tubes sealed into the acoustic meati (Sokolich 1981; US Patent 4251 686). Parylene-coated tungsten microelectrodes (Microprobe; 0.8 – 1.3 MOhm at 1 kHz) were introduced into the auditory cortex with a hydraulic microdrive (Kopf) oriented approximately orthogonal to the cortical surface. Recordings were made at cortical depths ranging from 700 to 1100 µm, as determined by the microdrive setting, corresponding to cortical layers IIIb and IV.

For each recording site, frequency response areas (FRAs) were constructed from responses to 675 different tone bursts, which were presented in a pseudorandom sequence of different frequency-intensity combinations selected from 15 level values and 45 frequency values. Neuronal activity of single units or small groups of neurons (2–6 units) was amplified, band-pass filtered, and monitored on an oscilloscope and an audio monitor. Spike activity was separated from the background noise with a window discriminator (BAK DIS-1). The number of spikes per presentation and the arrival time of the first spike after the onset of the stimulus were recorded and stored in a computer (DEC 11/73). From the FRAs several parameters were extracted, including spontaneous discharge rate, characteristic frequency (CF), threshold, and bandwidth and Q-value at 10 dB and 40 dB above threshold.

Ripple stimuli similar to those used in behavioral testing were presented, with the following standard characteristics: a harmonic series (fundamental frequency from 35 to 200 Hz) with a 6 dB/octave decline of the component amplitudes (120 to 255 components) as carrier; a bandwidth of 3 octaves; a spectral envelope that was sinusoidal on a logarithmically scaled frequency axis; and an envelope modulation waveform (ripple depth) that was linear on a dB scale with a depth (contrast) of 30 dB. Stimulus duration was 50 ms including 3 ms rise/fall time; the interstimulus interval was 400 ms (recording window: 200 ms). For electrophysiological assessment of responses to different ripple densities, the ripple stimulus was centered on the CF of each unit. Then the ripple transfer function was found by systematically varying the ripple density (10 to 14 values) from 0.5 to 8 ripples/octave for a constant envelope phase of 0° (i.e., there was always a spectral maximum at the CF of the unit). Responses to 30 repetitions of the ripple stimulus presentation were obtained.

The magnitude of the ripple transfer function for each unit/cluster was normalized to the difference between maximally and minimally driven rate (strongest response designated as 100% and smallest response as 0%). For each unit/cluster the ripple density which produced the best response (best ripple density, BRD) was determined; additionally, for each animal the bandwidth halfway between the peak and the lowest value within the ripple transfer function (BW50) was obtained as a measure of the selectivity of ripple tuning. For purposes of statistical testing, it was necessary to implement a data transformation for BRD and RTF bandwidth values so that they conformed to the normality assumption inherent to parametric statistical tests. Obtaining the log of the BRD and RTF bandwidth values was sufficient to satisfy this assumption. However, since a fraction of the BRD and RTF bandwidth values were less than 1.0, we were required to first add a constant (1.0) to all values before taking the log.
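The normalization, BRD, and BW50 measures described above can be illustrated with a short Python sketch. The helper names are hypothetical, the half-height crossings are found by linear interpolation between sampled ripple densities, and the logarithm is taken to base 10; these three choices are assumptions, since the original text does not specify them. The example transfer function is invented.

```python
import math


def normalize_rtf(rates):
    """Rescale firing rates so the weakest response is 0% and the strongest 100%."""
    lo, hi = min(rates), max(rates)
    return [100.0 * (r - lo) / (hi - lo) for r in rates]


def best_ripple_density(densities, rates):
    """Ripple density that produced the largest response (BRD)."""
    return densities[rates.index(max(rates))]


def _half_height_edge(densities, norm, start, step):
    """Walk away from the peak until the response drops below 50%, then
    interpolate linearly between the bracketing samples (an assumption)."""
    i = start
    while 0 <= i + step < len(norm):
        j = i + step
        if norm[j] < 50.0:
            frac = (norm[i] - 50.0) / (norm[i] - norm[j])
            return densities[i] + frac * (densities[j] - densities[i])
        i = j
    return densities[i]  # never fell below 50%: use the edge of the sampled range


def bw50(densities, rates):
    """Width of the transfer function at half height, in ripples/octave."""
    norm = normalize_rtf(rates)
    peak = rates.index(max(rates))
    low = _half_height_edge(densities, norm, peak, -1)
    high = _half_height_edge(densities, norm, peak, +1)
    return high - low


def log_plus_one(value):
    """Add 1.0 before taking the log (base 10 assumed) so values below 1 stay non-negative."""
    return math.log10(value + 1.0)


# Hypothetical transfer function (spikes per presentation), for illustration only.
densities = [0.5, 1.0, 2.0, 4.0, 8.0]
rates = [10.0, 30.0, 50.0, 20.0, 10.0]
print(best_ripple_density(densities, rates), bw50(densities, rates))
```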

Neural response selectivity was also tested with the standard 1 ripple/octave stimulus used for behavioral training in a subset of recordings made in naïve cats (N = 3) and cats that were extensively trained in the spectral discrimination task (N = 5). The envelope phase of the ripple stimulus (presented at 65 dB SPL) was systematically shifted in 20° steps from −180° to +180°. For each animal, a population phase-response curve was reconstructed by referencing the phase of the ripple envelope to the CF of the unit/cluster. The response strength was normalized to the strongest response, providing a frequency-independent estimate of envelope waveform representation. The resulting phase-rate functions were fit with a sinusoid having the form f(x) = A sin(wx + p) + DC, for the three individual cases as well as the mean trained and untrained data. The amplitude (A), frequency (w), phase (p), and offset (DC) parameters were obtained by minimizing the squared error between the sinusoid and the data. The goodness of fit (i.e. regression coefficient) was then calculated for each function.
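One way to fit a sinusoid of the form A sin(wx + p) + DC by least squares is sketched below. The optimizer used in the study is not described, so this sketch swaps in a simple alternative: a grid search over candidate frequencies combined with a closed-form linear least-squares solution for the remaining parameters. The function names, the frequency grid, the use of R² as the goodness-of-fit value, and the synthetic data are all assumptions made for illustration.

```python
import math


def _solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))) / a[r][r]
    return x


def fit_sinusoid(x_deg, y, w_grid):
    """Fit y ~ A*sin(w*x + p) + DC, with x in degrees and w in cycles per 360 degrees.

    For each candidate w (must be nonzero), the model is linear in
    a = A*cos(p), b = A*sin(p) and DC, so those are solved exactly.
    """
    best = None
    for w in w_grid:
        basis = [
            (math.sin(w * math.radians(x)), math.cos(w * math.radians(x)), 1.0)
            for x in x_deg
        ]
        normal = [[sum(b[i] * b[j] for b in basis) for j in range(3)] for i in range(3)]
        rhs = [sum(b[i] * yi for b, yi in zip(basis, y)) for i in range(3)]
        a, b, dc = _solve3(normal, rhs)
        sse = sum((a * s + b * c + dc - yi) ** 2 for (s, c, _), yi in zip(basis, y))
        if best is None or sse < best[0]:
            best = (sse, math.hypot(a, b), w, math.atan2(b, a), dc)
    sse, amp, w, phase, dc = best
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return {"A": amp, "w": w, "p": phase, "DC": dc, "r2": 1.0 - sse / sst}


# Synthetic phase-rate function sampled in 20 degree steps (illustrative only).
phases = list(range(-180, 181, 20))
rates = [0.4 * math.sin(math.radians(x + 30.0)) + 0.5 for x in phases]
print(fit_sinusoid(phases, rates, w_grid=[0.5, 0.75, 1.0, 1.25, 1.5]))
```

A general nonlinear optimizer would let the frequency vary continuously; the grid is used here only to keep the sketch dependency-free.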

Experimental Groups

We compared electrophysiological results for cats trained with the ripple stimuli to cats that had been neither trained nor exposed to the stimuli (Group I, ‘naive’; N=5). Cats #218, #293, and #176 were trained with ripple spectra centered at 3.675 kHz, with a fundamental frequency of 69 Hz, a ripple bandwidth of 3 octaves, and a constant ripple density of 1.0 ripple/octave. These animals were able to learn the discrimination task and performed the discrimination based solely on the spectral quality of the two stimuli. These cats constituted Group II (‘discriminant’; N=3). Group III (‘lateral’; N=5) was extensively exposed to ripple stimuli of 0° and 180° envelope phases and successfully performed the lateralization task but was unable to transfer to discrimination based on spectral envelope differences alone. Cats #122 (lateralizing for two months) and #40–764 (lateralizing for five months) were exposed to a stimulus of 1.0 ripple/octave. Cats #92–1767, #184, and #92–1688 were trained on stimuli with a ripple density of 2.0 and/or 2.66 ripples/octave. Two of the animals in the lateralization group were initially tested at 2.66 ripples/octave, which was later reduced to 2 ripples/octave when they were not able to transfer to the spectral task. Neither cat could transfer to the spectral task at 2 ripples/octave either. While we were not able to obtain a spectral envelope discrimination threshold from these cats, the animals utilized the ripple stimuli for lateralization over a period of 4 to 6.5 months. Since no significant physiological differences among the ‘lateral’ group animals were found, the animals were grouped together.

These stimuli were centered at 3.703 kHz and had a fundamental frequency of 34.5 Hz.

RESULTS

Behavior

Three of five cats trained with a ripple stimulus of 1.0 ripple/octave learned to perform the spectral-envelope discrimination test for a single speaker (Group II, ‘discriminant’). The remaining two cats mastered the initial spatial lateralization task with high accuracy (>90% correct responses to lateralization) but did not learn to switch to the spectral envelope discrimination task. They were included in the lateralization group (Group III, ‘lateral’). Using the method of constant stimuli, the envelope discrimination threshold (75% correct) for Group II animals was calculated daily from psychometric functions, as shown in Figure 2. Arrows indicate the envelope-phase discrimination threshold for that day (21° or 5.8% of an octave).

Figure 2.


Typical psychophysical function (Cat #293). The percent correct values for a single day of testing in a constant-stimulus paradigm are plotted as a function of the phase shift (expressed in degrees re a ripple density of 1 ripple per octave). Threshold envelope phase shift (75% correct, see arrows) is approximately 21° corresponding to a formant shift of 0.06 octaves.

Envelope discrimination testing was repeated nearly daily for several weeks, with steadily improved discrimination thresholds. The time course of these thresholds for Group II animals is shown in Figure 3A. All three cats showed an improvement in envelope-phase discrimination with the most rapid changes in threshold, ~ 0.9° per session, during the first 10 to 20 sessions. One animal (#218; circles) had initial discrimination thresholds that were higher than for the other subjects. The average discrimination threshold for envelope phase over the first ten test sessions for that cat was 127°, corresponding to a frequency shift in the spectral maxima of about 35% of an octave (see Table 1). The other two cats had initial thresholds of 66° and 76°, respectively. The mean threshold for Group II cats at the beginning of testing was 96° (24.6% of an octave), which serves as an estimate of the pre-training discrimination ability of naive animals.
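The correspondence between an envelope phase shift and the resulting shift of the spectral maxima follows from the ripple density: one full envelope cycle (360°) spans 1/density octaves. A small Python sketch of this conversion is shown below; the function names are hypothetical, and the upward direction of the frequency shift in `shift_frequency` is an arbitrary choice made for illustration.

```python
def phase_shift_to_octaves(phase_deg, ripple_density=1.0):
    """Convert an envelope phase shift (degrees) to a peak shift in octaves.

    One envelope cycle of 360 degrees covers 1/ripple_density octaves.
    """
    return (phase_deg / 360.0) / ripple_density


def shift_frequency(freq_hz, phase_deg, ripple_density=1.0):
    """Frequency of a spectral peak after the shift (upward direction assumed)."""
    return freq_hz * 2.0 ** phase_shift_to_octaves(phase_deg, ripple_density)


for phase in (21.0, 66.0, 76.0, 127.0, 180.0):
    percent = 100.0 * phase_shift_to_octaves(phase)
    print(f"{phase:5.0f} deg -> {percent:4.1f}% of an octave")
```

At 1 ripple/octave this reproduces the correspondences quoted in the text, for example 21° to about 5.8% of an octave and 127° to about 35% of an octave. At higher ripple densities the same phase shift corresponds to a proportionally smaller frequency shift.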

Figure 3.


Time course of the discrimination performance for 3 trained cats. A: The average threshold of each daily session is plotted as a linear function of the session number. B: Comparison of the time course of the discrimination thresholds as a function of testing sessions on a logarithmic time scale. A logarithmic best fit is shown for each case. The regression coefficients are indicated.

Table 1.

Summary of psychophysical discrimination performance (Group II; discriminant)

Thresholds are envelope phase shifts; values in parentheses are the corresponding shifts in percent of an octave.

Cat # | Discrimination threshold, first 10 sessions | Improvement per day, first 10 sessions | Discrimination threshold, last 10 sessions | Improvement per day, last 10 sessions | Testing duration
#176 | 66° (18%) | 0.8° | 33° (9%) | 0.1° | 4 months
#218 | 127° (35%) | 1.0° | 80° (22%) | 0.25° | 4 months
#293 | 76° (21%) | 0.9° | 18° (5%) | 0.1° | 6 months
Mean +/− SD | 96°+/−32.4° (24.7+/−9.1%) | 0.9°+/−0.1° | 43.7°+/−32.3° (12.0+/−8.9%) | 0.15°+/−0.8° | 4.67+/−1.15 months

After training (average duration: 4.7 months) the discrimination performance asymptoted. The final threshold for the least able animal (average of the last ten test sessions) was a phase shift of 80°, corresponding to a formant frequency shift of 22% of an octave. The average final threshold for the best animal was 18° with a mean of 43.7° (12%) for all three animals. At this stage of the training, the average rate of improvement was ~ 0.15° per session. The best daily discrimination thresholds for the two cats with the lowest overall thresholds over the last four weeks of testing were 25° (7% of an octave) for cat #176 and 11° (3% of an octave) for cat #293.

The differences between the early and late rate of change in discrimination improvement suggested a non-linear learning curve. Plotting the changes in the discrimination thresholds on a logarithmic time scale (Figure 3B) reveals two aspects of the learning time course not clearly discernible from the linear plot (Figure 3A). First, all cases can be reasonably fitted with a logarithmic function (r = 0.78 (#218), 0.95 (#293), and 0.91 (#176)). The high correlation coefficients imply that the threshold improvement is constant for a constant factor in the increment of testing sessions. The slope of the improvement curves for the three animals was statistically indistinguishable; the discrimination improvement over time was independent of the initial threshold value. The average slope of the fit was 40°/decade (11% of an octave) with an average Y-intercept of 114° (32% of an octave). The latter value is another estimate of the discrimination ability for naive animals. Secondly, there is no indication that an absolute threshold minimum had been reached when training ended. Such a minimum threshold would be indicated by a horizontal portion of the threshold values versus logarithmic time, which was not observed (Figure 3B).
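A logarithmic learning curve of the kind described above can be fit with an ordinary linear regression of threshold against log10 of the session number. The sketch below assumes that form (threshold = intercept + slope × log10(session)), which is suggested by the per-decade slope but is not spelled out in the text. The function name and the three data points are hypothetical; the points were simply generated from the reported average intercept (114°) and slope (40°/decade).

```python
import math


def fit_log_curve(sessions, thresholds_deg):
    """Least-squares fit of threshold = intercept + slope * log10(session).

    Returns (slope in degrees per decade, intercept in degrees, Pearson r).
    """
    xs = [math.log10(s) for s in sessions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(thresholds_deg) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in thresholds_deg)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, thresholds_deg))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Illustrative points generated from the reported average fit, not measured data.
sessions = [1, 10, 100]
thresholds = [114.0, 74.0, 34.0]
slope, intercept, r = fit_log_curve(sessions, thresholds)
print(slope, intercept, r)
```

In this form the intercept is the predicted threshold at session 1, which is why it can serve as an estimate of naive performance, and r is negative for an improving animal; the correlation coefficients quoted in the text appear to be reported as magnitudes.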

Performance analysis based on signal detection theory allows the effects of sensory ability changes to be distinguished from those of response criterion changes on the threshold estimates (Green and Swets, 1966). The proportion of hits (correctly detected differences in the AB condition) and false alarms (incorrect responses in the AA condition) were computed for each cat early in the testing and for the last week of testing. During early testing, the average hit rate (over five days) was 78.4+/−5.9% with a false-alarm rate of 17.2+/−4.8%. Later in testing, the hit rate was 69.3+/−9.2%, and the false-alarm rate was 20.1+/−4.9%. There was no difference between the false-alarm rates, early vs. late (p = 0.12); there was, however, a significant difference between the hit rates (p = 0.01). Overall, the high hit rate combined with the low false-alarm rate indicates that the animals were under stimulus control.
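For readers unfamiliar with signal detection theory, the sketch below shows how a sensitivity index (d') and a response criterion (c) can be derived from a hit rate and a false-alarm rate under the standard equal-variance Gaussian yes/no model. The original analysis does not report d' or c, so this is purely an illustration of the technique; the function names are hypothetical, and applying the model to the group-mean rates quoted above is an assumption that ignores differences between animals and sessions.

```python
from statistics import NormalDist

_Z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index under the equal-variance Gaussian yes/no model."""
    return _Z(hit_rate) - _Z(false_alarm_rate)


def criterion(hit_rate, false_alarm_rate):
    """Response criterion c; positive values indicate a bias toward 'same'."""
    return -0.5 * (_Z(hit_rate) + _Z(false_alarm_rate))


# Group-mean rates quoted in the text, used here only as example inputs.
early = (0.784, 0.172)
late = (0.693, 0.201)
for label, (hits, false_alarms) in (("early", early), ("late", late)):
    print(label, round(d_prime(hits, false_alarms), 2), round(criterion(hits, false_alarms), 2))
```

Because the range of phase differences was narrowed toward threshold as testing progressed, the early and late values produced this way refer to different stimulus sets and should not be compared directly as measures of sensory ability.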

Thus, three cats were able to learn a two-alternative forced choice auditory discrimination procedure for vowel-like stimuli with a spacing of 1 ripple/octave that varied in the frequency position of three formants. Five more cats (Group III) were trained on the initial lateralization task for several months (2–6.5 months) but could not switch to the spectral discrimination task. These animals serve as a control group in the physiological evaluation since they were highly trained with behaviorally relevant ripple spectra without the necessity to use the spectral envelope phase as a cue in the lateralization task.

Electrophysiology

After behavioral training, spectral receptive fields were obtained in AI of the anesthetized animals. AI was identified by the sulcal pattern and the characteristic-frequency gradient of multiple unit responses. A total of 437 AI locations in 13 cats were sampled. The ratio of single units to multiple units was approximately 2:3. In each animal, a uniform spatial sample of locations across AI avoided sampling biases due to differences in the distribution of local spectral integration aggregates along the iso-frequency axis (e.g. Schreiner and Mendelson, 1990; Read et al., 2001). The sampled range of CFs was 2.3 to 11.5 kHz with no statistically significant differences among the cases (Kolmogorov-Smirnov test). A ripple transfer function and a pure-tone FRA were determined for each site. Four representative ripple transfer functions from AI of a Group II (discriminant) animal are shown in Figure 4. The neuronal firing rate depends on the ripple density and is indicative of a bandpass characteristic for spectral envelope spacing. The 'best ripple density’ (BRD) corresponds to the spectral spacing that produced the largest neural response for a ripple phase of 0°, i.e., a formant peak located at CF (Schreiner and Calhoun, 1994).

Figure 4.


Examples of ripple transfer functions. The firing rate as a function of ripple density is plotted for two single units (open symbols) and two multiple units (filled symbols) from AI of a trained animal (C218). Firing rate (in percent) is normalized to maximum rate. The animal was able to discriminate between phase shifted ripple spectra of 1 ripple/octave.

Comparing the BRD values between the three groups (Fig. 5) revealed that the trained groups had higher mean BRD values than the naive animals (mean ± SE, 1.6 ± 0.14, 1.92 ± 0.15 and 1.89 ± 0.11 ripples/octave for Group I, II and III respectively). BRD value distributions exhibited a pronounced positive skewness in all groups (Fig. 5a–c) that precluded direct comparison with parametric statistical tests. We addressed this problem by obtaining the log BRD value to eliminate the positive skewness and confirmed that the distribution of log BRD values conformed to a normal distribution for each group (skewness / SE skewness < 3.0 for all groups). We compared the log BRD values in each trained group to the naïve group and observed a significant increase in best ripple density in animals trained up to the lateralization phase (Group III vs. Group I; two-tailed t-test, p < 0.025) but not in discriminant animals compared to controls (Group II vs. Group I; two-tailed t-test, p = 0.08; Fig. 5d). No difference in the BRD distribution was detected between the two trained groups (Group II vs. Group III; two-tailed t-test, p = 0.68) despite the fact that only one of these groups showed significant perceptual improvements in spectral envelope discrimination.
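The skewness check mentioned above (skewness divided by its standard error) can be sketched as follows. The text does not say which skewness estimator was used, so the adjusted Fisher-Pearson sample skewness and its usual standard error are assumed here. The function name and the example values are hypothetical and were chosen only to show a positively skewed set becoming more symmetric after the log(x + 1) transform.

```python
import math


def skewness_ratio(values):
    """Adjusted Fisher-Pearson sample skewness divided by its standard error."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    g1 = m3 / m2 ** 1.5
    skew = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    se_skew = math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew / se_skew


# Hypothetical best ripple densities in ripples/octave (not data from the study).
brd = [0.5, 0.7, 1.0, 1.0, 1.4, 2.0, 2.0, 2.8, 4.0, 8.0]
log_brd = [math.log10(v + 1.0) for v in brd]  # base 10 is an assumption

print(skewness_ratio(brd), skewness_ratio(log_brd))
```

A ratio well above the chosen cut-off flags a distribution as too skewed for a parametric test; the cut-offs of 3.0 and 2.0 quoted in the text would be applied to the value returned here.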

Figure 5.

Comparison of best ripple density (BRD). BRD distributions for naïve (A), discriminant (B), and lateralization (C) groups. The Gaussian fit for each distribution (dashed black line) is superimposed onto each histogram. (D) The log BRD was obtained for each value and the mean ± se are compared between groups. Asterisk indicates a significant difference relative to naïve controls using an unpaired two-tailed t-test (p < 0.025).

The sharpness of tuning to spectral envelope frequencies, expressed by the width of the RTFs at half-height, also differed among the three groups (Fig. 6). RTF bandwidths in discriminant animals were narrower, on average, than those observed in naïve animals (Group II vs. Group I = 1.56 ± 0.09 vs. 2.09 ± 0.15 ripples/octave), but were wider in the lateralization group (2.25 ± 0.1 ripples/octave). The distributions of RTF bandwidth values also failed the normality assumption due to a positive skewness (Fig. 6a–c), but, again, this problem was eliminated by log-transforming the RTF bandwidth (skewness / SE skewness < 2.0 for all groups). A statistical comparison of the log RTF bandwidth values revealed that spectral tuning bandwidths were significantly narrower in discriminant animals compared to naïve animals (two-tailed t-test, p < 0.01) and significantly wider in animals that only performed the task up to the lateralization phase (two-tailed t-test, p < 0.025).
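
As a rough illustration of the two RTF measures used here, the sketch below extracts the BRD (ripple density at peak firing rate) and the half-height bandwidth from a sampled RTF. The sampled densities and rates are hypothetical, and linear interpolation between sample points is an assumption; the text does not specify how the half-height crossings were located.

```python
def best_ripple_density(densities, rates):
    """Ripple density at which the firing rate is largest."""
    peak = max(range(len(rates)), key=lambda i: rates[i])
    return densities[peak]

def half_height_bandwidth(densities, rates):
    """Width of the RTF at 50% of the peak rate, in ripples/octave.

    Crossings are found by linear interpolation. If the curve never
    falls below half height on one side, the sampled edge is used, so
    the result is then only a lower bound.
    """
    peak = max(range(len(rates)), key=lambda i: rates[i])
    half = 0.5 * rates[peak]

    def crossing(i_in, i_out):
        # Interpolate between a point at/above half height and one below.
        x0, y0 = densities[i_in], rates[i_in]
        x1, y1 = densities[i_out], rates[i_out]
        return x0 + (half - y0) * (x1 - x0) / (y1 - y0)

    low = densities[0]
    for i in range(peak, 0, -1):
        if rates[i - 1] < half:
            low = crossing(i, i - 1)
            break

    high = densities[-1]
    for i in range(peak, len(rates) - 1):
        if rates[i + 1] < half:
            high = crossing(i, i + 1)
            break

    return high - low

# Hypothetical RTF: firing rate (percent of maximum) vs. ripple density.
densities = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
rates = [20.0, 60.0, 100.0, 80.0, 40.0, 10.0]

brd = best_ripple_density(densities, rates)
bw = half_height_bandwidth(densities, rates)
print(brd, bw)
```

For this toy curve the half-height crossings fall at 0.875 and 2.75 ripples/octave, giving a bandwidth of 1.875 ripples/octave around a BRD of 1.5.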

Figure 6.

Comparison of ripple transfer function (RTF) bandwidth. Distribution of the RTF bandwidth at half-height for naïve (A), discriminant (B) and lateralization (C) groups. The Gaussian fit for each distribution (dashed black line) is superimposed onto each histogram. (D) The log RTF bandwidth was obtained for each value and the mean ± se are compared between groups. Asterisks indicate a significant difference relative to naïve controls using an unpaired two-tailed t-test (p < 0.025).

BRD and RTF bandwidth measurements in the lateralization group were pooled from two cats trained with a 1 ripple/octave spacing (Group III-low, n = 23 neurons) and from three cats trained with spectral densities ≥ 2.0 ripples/octave (Group III-high, n = 147 neurons). In order to test the possibility that differences in the spectral density of the training stimulus might have had a differential impact on BRD and RTF bandwidth measures, we compared recordings taken from Group III-low and Group III-high directly (Fig. 7). We did not find any difference in log BRD values (paired t-test, t = 0.3, p = 0.77), but we did observe that the mean RTF bandwidth in Group III-high was significantly greater than in Group III-low (mean ± SE, 1.57 ± 0.33 vs. 2.36 ± 0.1 ripples/octave for low and high, respectively; two-tailed t-test on log RTF bandwidth values, p < 0.005). Thus, animals trained in either the discrimination or lateralization tasks using low-density ripple stimuli exhibited similarly narrow RTF bandwidths (1.56 vs. 1.57 ripples/octave for Group II vs. Group III-low; two-tailed t-test, p = 0.71), whereas training animals in the lateralization task with high-density ripple stimuli produced a significant increase in RTF bandwidth (2.36 ripples/octave) compared to animals trained in the discrimination task (two-tailed t-test for Group II vs. Group III-high, p < 1 × 10⁻⁶).

Figure 7.

Comparison of BRD and RTF bandwidth between lateralization animals trained with high versus low spectral density stimuli. Mean ± se of raw BRD and RTF bandwidth values shown for cats trained to lateralize a 1.0 ripple/octave (low) and ≥2.0 ripple/octave stimulus. Asterisk indicates a significant difference in log RTF bandwidth between recordings taken from low and high animals using an unpaired two-tailed t-test (p < 0.025).

A complementary approach to the parametric analysis of individual receptive fields is provided by analysis of population ripple transfer functions. The population neuronal response is the averaged firing rate across all recording locations in an animal or experimental group and combines aspects of receptive field position, shape, and strength. Figure 8 shows three examples of population ripple transfer functions (RTFs) for each experimental group. Compared to the naïve animals, the population RTFs in trained animals (Groups II and III) demonstrate an increased response magnitude at the trained ripple densities (thick arrows in Fig. 8), resulting in a rightward shift in BRD and high-frequency slope. The overall change in the shape of the population RTFs is shown in the grand average RTFs for the three groups (Fig. 9). The BRD from the naïve population RTF was lower than that of the two trained groups (mean ± SE; 0.46 ± 0.18, 0.78 ± 0.39, 0.69 ± 0.27 for Groups I, II and III, respectively). This difference reached statistical significance between Group I and Group III (one-tailed t-test; p < 0.04). The mean population BRD was not statistically different between the two trained groups (two-tailed t-test, p > 0.05). When Groups II and III were combined and compared to the naïve animals, the best population ripple density was significantly higher for the trained animals (one-tailed t-test, p < 0.03, see Fig. 9). This result suggests that the population tuning in animals exposed to ripple densities equal to or greater than 1 ripple/octave is shifted toward these values.
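
A minimal sketch of the population RTF, assuming that every recording site was tested at the same set of ripple densities and that unnormalized spike counts are averaged (so that strongly responding sites weigh more, in line with the statement that the measure reflects response strength). The spike counts below are hypothetical.

```python
def population_rtf(site_rtfs):
    """Average the response across sites at each ripple density."""
    n_sites = len(site_rtfs)
    n_densities = len(site_rtfs[0])
    return [
        sum(site[j] for site in site_rtfs) / n_sites
        for j in range(n_densities)
    ]

# Ripple densities (ripples/octave) shared by all sites.
densities = [0.5, 1.0, 2.0, 4.0]

# Hypothetical spike counts for three recording sites, NOT study data.
sites = [
    [12, 9, 4, 1],
    [6, 10, 5, 2],
    [3, 8, 10, 3],
]

pop = population_rtf(sites)
# Population BRD: density at the peak of the averaged curve.
pop_brd = densities[pop.index(max(pop))]
print(pop, pop_brd)
```

Note that the population BRD need not equal the mean of the single-site BRDs, because the average is taken over whole response curves rather than over their peaks.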

Figure 8.

Population RTFs of AI in nine animals. For each experimental group, the population RTF is shown for three animals. Neuronal responses are averaged across all recorded units of an animal and plotted as a function of ripple density. The spike count is based on 30 repetitions of the stimuli. Arrows indicate the value of trained ripple densities in Group II (1 ripple per octave, thick arrow); and Group III (2 and 2.66 ripples/octave; thick arrows). The number of locations included in each population RTF is indicated.

Figure 9.

Summary-population RTFs for the three experimental groups. The average across all cats of each group is plotted: for five naïve cats (Group I), for the three cats trained and tested with a ripple density of 1 (Group II discriminant), and for the five cats exposed to a ripple stimulus of either 1, 2, or 2.66 (Group III lateral).

Some properties of spectral envelope tuning for broadband stimuli are related to the shape of pure-tone tuning profiles (e.g. Calhoun and Schreiner, 1998; Linden et al., 2003). Accordingly, the sharpness of frequency tuning, expressed as Q-40 dB (Q40) of pure-tone tuning curves, was assessed for each group. We found that Q40 values were higher, on average, in both trained groups than in naïve animals (mean ± se; 1.45 ± 0.07, 1.94 ± 0.16, 2.24 ± 0.16 for Groups I, II and III, respectively; Fig. 10a–c). As with the neuronal BRD and RTF bandwidth measurements, we first took the log of each Q40 value to meet the normality assumptions for parametric statistical testing and then compared the log Q40 values obtained from each trained group to the naïve animals. We found that Q40 values were significantly higher for both the discriminant group (two-tailed t-test, p < 0.025; Fig. 10d) and the lateralization group (two-tailed t-test, p < 0.0001; Fig. 10d) compared to naïve controls, reflecting increased pure-tone frequency selectivity for both training conditions. The Q40 values between the two trained groups were not significantly different (two-tailed t-test, p = 0.15). This result demonstrates that spectral integration measures of cortical neurons obtained with narrow-band and broad-band stimuli are both affected by training with structured broad-band stimuli.
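
For readers unfamiliar with the measure, the following sketch computes a Q value under the conventional definition, which is assumed here because this section does not restate it: the characteristic frequency divided by the width of the tuning curve measured 40 dB above the response threshold. The frequencies are hypothetical.

```python
import math

def q_factor(cf_khz, low_edge_khz, high_edge_khz):
    """CF divided by tuning-curve bandwidth (edges taken 40 dB above
    threshold under the assumed definition). Larger means sharper."""
    return cf_khz / (high_edge_khz - low_edge_khz)

def bandwidth_octaves(low_edge_khz, high_edge_khz):
    """The same bandwidth expressed on a logarithmic (octave) axis."""
    return math.log2(high_edge_khz / low_edge_khz)

# Two hypothetical sites with the same CF but different bandwidths.
q_broad = q_factor(8.0, 6.0, 10.0)   # 4 kHz wide at 40 dB above threshold
q_sharp = q_factor(8.0, 7.0, 9.0)    # 2 kHz wide at 40 dB above threshold
oct_broad = bandwidth_octaves(6.0, 10.0)
print(q_broad, q_sharp, oct_broad)
```

Halving the bandwidth at a fixed CF doubles Q, which is why higher Q40 values are read as sharper pure-tone tuning.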

Figure 10.

Comparison of pure tone tuning curve bandwidth (Q40). Distribution of Q40 values for naïve (A), discriminant (B) and lateralization (C) groups. The Gaussian fit for each distribution (dashed black line) is superimposed onto each histogram. Higher Q values are indicative of sharper tuning. (D) Log Q40 was obtained for each value and the mean ± se are compared between groups. Asterisks indicate a significant difference relative to naïve controls using an unpaired two-tailed t-test (p < 0.025).

Correlation of psychophysics and physiology

Comparing the perceptual performance of the discriminant animals and cortical neural receptive field properties indicated some potential relationships. Discriminant animals showed an increase in BRD compared to naive animals; however, this was also observed in the animals that only advanced to the lateralization phase of the task (Group III). Therefore, a shift in the peak of RTFs toward the trained ripple densities may be necessary, but is not sufficient, to explain the improvement in spectral envelope discrimination threshold. Another physiological property that improved with training was the selectivity of the receptive field for either pure tones (increase in Q-40dB) or RTFs (decrease in BW50). The two animals with the lower discrimination threshold had higher Q-40dB values than the animal with a high discrimination threshold. Similarly, the bandwidth of the RTFs was narrower, or more selective, for the two animals with better discrimination thresholds.

Further evidence for behaviorally influenced reshaping of the cortical encoding of ripple spectra stems from phase-response functions. These functions (Fig. 11) plot the average response of neurons to different phases of the ripple envelope, with the phase of 0° corresponding to a maximum of the ripple envelope at the CF of each unit/cluster. The two Group II (discriminant) animals with the lowest behavioral discrimination thresholds showed envelope-phase functions (Fig. 11A) that were well approximated by a sinusoidal fit, i.e., representing the original ripple waveform (r² = 0.69 (#293) and 0.58 (#176)). The envelope-phase function of the animal with the worst behavioral threshold showed no clear reflection of the envelope waveform (r² = 0.26, #218). The combined Group II envelope-phase function (Fig. 11B) showed a more faithful representation of the stimulus waveform (r² = 0.69) than the untrained Group I animals (r² = 0.45). The preferred phase was observed to be slightly higher than the phase corresponding to the CF of the recording site (0°) for each function (i.e., the maximum of each envelope-phase function was shifted towards frequencies lower than the CF). This simply reflects the fact that spectral tuning functions for AI units typically exhibit a low-frequency skewness, such that stimuli with spectral energy centered on frequencies just lower than the CF will often elicit higher firing rates than stimuli positioned at the CF.
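
The sinusoidal fit and its goodness of fit can be illustrated with the sketch below. It assumes a single-cycle model, r = a + b·cos(phase) + c·sin(phase), and phases that are evenly spaced over one full cycle without repeating the endpoint, in which case the least-squares coefficients reduce to simple projections. The fitting procedure actually used is not described in this section, and the response values are hypothetical.

```python
import math

def fit_sinusoid(phases_deg, responses):
    """Least-squares fit of a + b*cos(phase) + c*sin(phase).

    Only valid for phases evenly spaced over one full cycle (endpoint
    not repeated), where the three regressors are orthogonal.
    Returns (r_squared, preferred_phase_deg).
    """
    n = len(responses)
    rad = [math.radians(p) for p in phases_deg]
    a = sum(responses) / n
    b = 2.0 / n * sum(r * math.cos(t) for r, t in zip(responses, rad))
    c = 2.0 / n * sum(r * math.sin(t) for r, t in zip(responses, rad))
    fitted = [a + b * math.cos(t) + c * math.sin(t) for t in rad]
    ss_res = sum((r - f) ** 2 for r, f in zip(responses, fitted))
    ss_tot = sum((r - a) ** 2 for r in responses)
    r_squared = 1.0 - ss_res / ss_tot
    # b*cos(t) + c*sin(t) peaks at t = atan2(c, b).
    preferred = math.degrees(math.atan2(c, b))
    return r_squared, preferred

phases = list(range(-180, 180, 45))  # eight phases, -180 to +135 degrees

# Hypothetical noise-free response peaking 30 degrees from the CF phase.
clean = [0.6 + 0.4 * math.cos(math.radians(p - 30)) for p in phases]
r2_clean, pref_clean = fit_sinusoid(phases, clean)

# Same response with a fixed, non-sinusoidal perturbation added.
bumps = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05]
noisy = [r + e for r, e in zip(clean, bumps)]
r2_noisy, pref_noisy = fit_sinusoid(phases, noisy)
print(r2_clean, pref_clean, r2_noisy)
```

A response that follows the ripple envelope closely yields r² near 1, whereas components that do not follow the envelope lower r², which is how the fit quality is used above as an index of waveform fidelity.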

Figure 11.

Neuronal response selectivity to ripple stimuli of varying spectral envelope phase. Neural response strength was characterized for a 1 ripple/octave stimulus with spectral phase that varied from −180° to +180° relative to the CF of the recording site (0°). Response strengths for each phase value (gray circles) were normalized to the phase that elicited the greatest response (1.0). Responses were fit with a sinusoid (black lines) to emphasize the dependence of firing rate on stimulus phase and the goodness of fit (r²) is indicated for each function. Phase-response functions are presented for each individual cat trained in the Discriminant group (A) as well as for the mean of all recordings obtained from Naïve (Group I) and Discriminant (Group II) animals (B).

DISCUSSION

This combined behavioral and neurophysiological study of spectral envelope discrimination demonstrates three main points: 1) animals trained to discriminate differences in spectral phase profiles of sequential auditory stimuli exhibit significant learned improvement in psychophysical discrimination thresholds; 2) cortical receptive field characteristics that reflect spectral integration properties can change as a consequence of this perceptual training; and 3) spectral integration properties in AI are also influenced by tasks that are not explicitly based on, but are accompanied by, spectral envelope properties.

Psychophysics of Ripple Spectra

To determine how the processing of simple sounds relates to that of complex, naturally occurring sounds, it is advantageous to use increasingly complex signals with behaviorally relevant characteristics. The evaluation of spectral envelope characteristics of broadband stimuli relative to those of pure tones represents an intermediate step in the effort to understand the processing of complex natural signals, such as communication sounds. Psychophysical studies in humans and animals have investigated the spectral processing of broadband stimuli with parametrically varied spectral envelopes, especially formant position and modulation depth (e.g. Pick 1980; Festen & Plomp 1981; Yost 1982; van Veen & Houtgast, 1985; Shamma et al., 1992; Bernstein & Green 1987; Hillier and Miller, 1991; Sommers et al., 1992; Sinnott et al., 1976; Sinnott and Kreiter, 1991; Supin et al., 1994; Amagai et al., 1999; O'Connor et al., 2000). However, no attempt had been made to link psychophysical measurements with neurophysiological properties of cortical neurons.

This study demonstrates that cats can be trained to discriminate among speech-like stimuli that vary in spectral envelope phase or formant position. This task was chosen specifically to manipulate the spectral integration property of central auditory neurons without strong biases from absolute frequency position. The hypothesis was confirmed that prolonged exposure to a spectral profile with a fixed spectral periodicity (e.g., 1 ripple/octave) influences the distribution of neuronal ripple transfer functions and the related pure-tone tuning curves.

Cats in the discriminant group, on average, detected a difference in spectral envelope phase or formant position of about 12% of an octave after 4 to 6 months of behavioral testing; two of the cats reached best discrimination values of 5–9%. One human study, using the same phase-shift paradigm and the same ripple density, reported thresholds of ~7.5% (Keeling et al., 1992). Other human psychophysical studies of ripple discrimination ability used changes in modulation depth to establish discriminability and are not directly comparable (e.g., Hillier and Miller, 1991; Supin et al., 1994). However, several studies in humans and macaques evaluated the minimum discriminable shift in formant frequency and provide some useful comparisons. Flanagan (1955) reported formant frequency difference limens corresponding to 10° to 20° at ripple densities of ~0.5 to 1 ripple/octave, somewhat above the average envelope phase thresholds in cats. Similarly, Sommers et al. (1992) reported thresholds for Japanese macaques that were also slightly above the values in extensively trained cats. Among the potential reasons for lower than expected human formant shift values is that the experiments were performed at lower frequencies than in the cats and the spectral envelope shift in this study was not confined to a single periodicity but contained additional formant spacings that may have provided additional cues. Likely, the animals’ performance levels were influenced by differences in basic processing capacities as reflected, for example, in the number of cochlear elements available per frequency (Prosen et al., 1990) and different critical bandwidths (Scharf, 1970; Greenwood, 1974; Pickles, 1975). However, the main effect of critical bandwidth differences would be expected for ripple densities above 3 ripples/octave, well above the values used in this study.
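
The correspondence between envelope phase shifts and fractions of an octave follows from the ripple geometry: one full envelope period covers 360° of phase and spans 1/(ripple density) octaves. The short sketch below applies that relation; the function name is introduced here for illustration only.

```python
def phase_shift_to_octaves(phase_deg, ripples_per_octave):
    """Displacement of the spectral peaks, in octaves, produced by an
    envelope phase shift at a given ripple density."""
    return phase_deg / 360.0 / ripples_per_octave

# Average thresholds reported for 1 ripple/octave: 96 deg early, 44 deg late.
early = phase_shift_to_octaves(96.0, 1.0)
late = phase_shift_to_octaves(44.0, 1.0)

# The largest possible shift (180 deg) at the denser training spectra.
max_2 = phase_shift_to_octaves(180.0, 2.0)
max_266 = phase_shift_to_octaves(180.0, 2.66)
print(early, late, max_2, max_266)
```

A 44° shift at 1 ripple/octave moves the peaks by about 0.12 octave, matching the 12% figure quoted above. At higher ripple densities the same phase shift corresponds to a proportionally smaller displacement of the peaks.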

Discrimination thresholds for shifts in the spectral envelope improved steadily over the course of the behavioral training, similar to other auditory discrimination tasks (e.g., Prosen et al., 1990; Recanzone et al., 1993; Polley et al., 2006). Analysis of the hit rate and false-alarm rate early and late in training supports the conclusion that the observed improvements in discrimination reflect changes in perceptual capacity and not changes in response criterion. The lower hit rate at the end of the training is likely a consequence of the narrower range of constant stimuli that was used than at the beginning of training.
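
One common way to separate sensitivity from response criterion using hit and false-alarm rates is the equal-variance signal detection model, sketched below. This is offered as an illustration under that assumed model; the text does not state which index was computed, and the rates used here are hypothetical.

```python
from statistics import NormalDist

def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Sensitivity (d') and criterion (c) for a yes/no detection model.

    Rates must lie strictly between 0 and 1, because the inverse normal
    CDF is undefined at the extremes.
    """
    z = NormalDist().inv_cdf
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa
    criterion = -0.5 * (z_hit + z_fa)
    return d_prime, criterion

# Hypothetical rates, NOT values from the study.
d_early, c_early = dprime_and_criterion(0.80, 0.40)
d_late, c_late = dprime_and_criterion(0.70, 0.10)
print(d_early, c_early, d_late, c_late)
```

In this model a change in d' indicates a change in discriminability, whereas a change in c alone indicates that the animal has only become more or less willing to respond.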

On a linear time scale, the improvements appeared to asymptote after about two months of training. In fact, the apparent flattening of the learning curve was used as a criterion to end the training and to perform the physiological mapping experiments. However, plotting the threshold data on a logarithmic scale revealed that the sensory ability of the animals continued to improve, albeit at a slower pace. Since the slope of the logarithmic learning curve was quite similar for all three animals, the practical limit of sensory ability of an individual animal is strongly influenced by the starting point of the learning process.

These data suggest that perceptual abilities in naive animals can be quite different from those in perceptually trained animals and from human performances. This is relevant if the animal is used as a model of human perceptual abilities since it implies that animals need to be trained to approach human performance levels, at least for some perceptual tasks. If the goal is to ascertain the physiological conditions and mechanisms underlying perceptual abilities, it is necessary to determine the perceptual ability of each individual animal for the interpretation of the physiological data since the variance between animals can be large (see also Prosen et al., 1990). Similarly, comparable human performance estimates are also best derived from well-trained subjects.

While all trained animals mastered the initial lateralization task, only three out of eight animals succeeded in transferring to a spectral profile discrimination task. There are three potential reasons for this low success rate. 1) Perceptually, the maximal difference in spectral envelope phase (180°) may not have been sufficient for some animals to distinguish between the standard and test stimuli. Given the high initial discrimination threshold at a spectral envelope spacing of 1 ripple/octave in one animal (see case #218), a 180° phase shift for ripple densities of 2 and 2.66 ripples/octave may not have reached threshold in similar cases. 2) Behaviorally, the cues given to the animals at the time of transition from two speakers to one may have been ambiguous, reducing stimulus control for the animals that failed to discriminate between spectral envelopes of 1 ripple/octave. However, these animals constitute an important comparison since they were actively engaged in a behavioral training task (lateralization) with stimuli that had a specific, task-independent spectral profile but without requiring a fine spectral envelope analysis. 3) These animals were not successful because of cognitive limitations that prevented them from “understanding” the transfer.

Physiology of Ripple Spectra

Vowels in human and animal vocalizations have spectral envelopes similar to ripple stimuli used in this study. If the primary auditory cortex is organized to process the representation of behaviorally relevant vowel-like sounds, careful investigation using a similar but more easily controlled set of stimuli, such as ripple stimuli, may reveal basic computational features.

In visual cortex, it is possible to predict the neuron's response to complex stimuli from its response to different sinusoidal gratings (e.g. Wörgötter & Eysel 1987; Wörgötter et al. 1990; DeValois & DeValois 1990). These studies provide evidence that generalized stimuli covering large portions of the receptor surface can be well suited to predict responses to specific and/or more spatially restricted stimuli. In addition, these studies suggested that much of the transformation from input space to cortical representation can be described in terms of linear processing.

Previous studies in the auditory system utilized broadband acoustic gratings to compare complex and tonal stimuli responses in peripheral auditory nuclei. For example, tuning curves from cat cochlear nucleus neurons were predicted from the neuron's response to cosine noise (Bilsen et al. 1975; ten Kate & van Bekkum 1988). More recently, cat auditory cortical neurons have been shown to respond selectively and systematically to acoustic stimuli with sinusoidal spectral envelopes (Schreiner & Calhoun 1994; Calhoun and Schreiner, 1998; Miller et al., 2002). Studies in ferret AI find that ripple responses allow predictions of responses to pure tones and to spectrally complex natural sounds (Shamma et al. 1995; Versnel and Shamma 1998; Klein et al., 2000), suggesting that AI neurons analyze the shape of acoustic spectra in a substantially linear manner. The population analysis of the cortical ripple waveform representation in the current study also shows that the best-performing trained animals have an improved fidelity and more linear response to the spectral envelope compared to naïve animals.

In cats and New World monkeys, sharpness of tuning, or local frequency specificity, to pure-tones represents an important functional organization principle reflected in a modular or patchy cortical organization associated with specific horizontal connections (Recanzone et al., 1999; Cheung et al., 2001; Schreiner et al., 2000; Read et al., 2001). The spatial organization of tuning to broad-band ripple spectra should be related to sharpness of tuning assessed with pure-tones since both reflect aspects of spectral filtering (Calhoun and Schreiner, 1998; Wang and Shamma, 1995). However, there is only sparse experimental evidence for spatial organization of ripple transfer functions (Shamma et al., 1995; Kowalski et al., 1996). In this study, care was taken to sample the targeted frequency region evenly to avoid spatial biases due to sharpness of tuning clusters. A spatial analysis of the distribution of sharpness of pure-tone tuning (not shown) suggested an organization compatible with previous reports (Schreiner and Mendelson, 1990; Schreiner and Sutter, 1992; Read et al., 2001) but ripple maps were quite noisy and a significant correlation between ripple and pure-tone responses was evident in only two cases. This suggests that pure-tone estimates of spectral integration may be more robust and less influenced by other contributing elements, such as depth of modulation, overall intensity, frequency position etc, than broad-band measures.

Cortical Plasticity and Perceptual Learning

The receptive field characteristics of AI neurons can be modified through conditioning procedures in which an auditory stimulus is associated with the onset of a behavioral reinforcer (for review see Ohl et al., 2005). The present study made use of three stimulus-reward contingencies: 1) reward contingent upon correct discrimination of spectral envelope phase between a pair of low-density stimuli (1 ripple/octave) presented from a single sound source (Group II), 2) reward contingent upon correctly localizing a pair of low-density stimuli with 0° or 180° phase differences to the left or right, respectively (Group III-low), 3) reward contingent upon correctly localizing a pair of high-density stimuli (≥ 2.0 ripples/octave) with 0° or 180° phase differences to the left or right, respectively (Group III-high). Cats in Group II learned to attend to fine differences in spectral envelope phase to obtain food rewards. Performance of cats in Group III-low and Group III-high was exclusively dictated by the relative spatial position of the sound sources and not by spectral phase differences, demonstrating that they were using localization cues to obtain food reward.

In formal learning theory, stimuli that are used to reduce the uncertainty about the timing of an upcoming unconditioned stimulus are said to convey information (Gallistel, 2003). In this sense, spectral phase differences were informative for animals in Group II, but not for animals in Group III-low or Group III-high. Despite the substantial heterogeneity in stimulus composition, task demands and the informational value of spectral phase differences, all training conditions induced plasticity in AI spectral envelope selectivity relative to neural recordings in untrained control cats. BRD was elevated in all trained groups (but only reached statistical significance for the lateralization group). Cats trained in Group II or Group III-low exhibited significantly narrower RTF bandwidths whereas cats trained in Group III-high exhibited significantly broader RTF bandwidths. These results suggest that the bandwidth of spectral integration filters is strongly influenced by spectral density of the input stimuli regardless of whether the spectral properties are informative or relevant to the task demands.

The literature is currently divided between studies demonstrating that receptive field plasticity in AI is dictated by stimulus input statistics under conditions of varying (Polley et al., 2004) or absent (Kilgard et al., 2001) task demands and studies that demonstrate that AI receptive field plasticity is relatively independent of stimulus statistics, but instead is guided by the demands of the conditioning task (Polley et al., 2006). The present results most strongly support findings that receptive field reorganization is directed by statistics of sensory inputs paired with behavioral reinforcement. The difference between this class of experiments versus the latter class of experiments that support a role for top-down task-related inputs might hinge upon the manner in which the conditioning task is constructed. In the present study, the spectral profile of auditory stimuli in the lateralization task did not guide behavioral choices but, at the same time, variations in spectral density and phase did not actively interfere with the cat’s ability to localize the stimulus. In conditioning tasks in which the irrelevant sensory inputs need not be suppressed to accurately perform the task, receptive field reorganization might be directed by the bottom-up statistics of sensory stimuli that are: i) paired with reinforcement and ii) substantially different from the naïve animal’s ambient acoustic environment. In conditioning tasks where irrelevant stimulus features compete for limited attentional resources or are otherwise antithetical to accurate task performance, their influence might be suppressed through top-down influences (see, for example, Ahissar et al., 1993; Li et al., 2004; Polley et al., 2006).

From a clinical perspective, it is highly valuable to understand how simple auditory conditioning tasks might be used to improve speech processing and aural communication. For example, one of the major challenges in cochlear prosthetic research is to create a spatially differentiated pattern of excitation along the cochleotopic axis so that the spectral profile of complex sounds (e.g. speech) can be encoded with acceptable spatial/spectral fidelity. Most of the research effort for this question has been directed towards improving the signal – devising methods, in other words, to limit the spatial spread of electrical excitation on the basilar membrane (for review see Middlebrooks et al., 2005; Rubinstein, 2004). Given that the spectral receptive fields in the auditory cortex of congenitally or neonatally deaf individuals are also likely to be broad and poorly organized (Kral et al., 2002; Raggio and Schreiner, 1999), it would also be valuable to devise conditioning and/or stimulation protocols that would reduce the spectral integration bandwidth of central auditory neurons. In this way, the spatial resolution of both the signal and the receiver might be improved. The results of this study and others that have observed training-induced refinement of spectral receptive field bandwidth (Recanzone et al., 1993; Witte et al., 2005) might represent initial progress towards this goal.

Psychophysics versus Physiology

This study shows a correlation between the psychophysical discrimination ability for spectral envelopes and the spectral selectivity for pure-tones and ripple spectra. Training with a spectral discrimination task, the relative shift of formant peaks, is appropriately reflected in an improvement of the spectral resolution of cortical neurons. In contrast, the lateralization task sharpened the pure-tone tuning but broadened ripple tuning. This shows that differences in discrimination ability, either within an animal as a function of time or between animals, covary with differences in physiological properties, in particular with spectral tuning sharpness or selectivity.

Conclusions

The data in this report show that the auditory processing abilities of cats for spectral envelope discrimination can substantially improve with training and can at least approach if not reach values similar to those employed in the processing of speech features by humans. Accordingly, basic auditory processing capabilities can be similar in humans and in other mammals and experience with specific stimulus statistics refines the processing capability and the cortical representation of that stimulus. This suggests that perceptual evaluation should be a required component for the psychophysical and neurophysiological assessment of the processing of stimuli that have a low probability of occurrence and limited behavioral significance in the normal acoustical environment of an animal, such as aspects of speech and music. It also implies that the results of neurophysiological studies of speech-like stimuli in the CNS of naive animals that were not psychophysically assessed are difficult to extrapolate to humans since differences in the psychophysical abilities between the species cannot be taken into account.

ACKNOWLEDGMENTS

This research was supported by NIDCD 02260, NIMH077970, the Max Kade Foundation, the Coleman Memorial Fund, and Hearing Research Inc. We thank Dr. W. M. Jenkins for help with the LabVIEW behavioral testing program, Dr. Craig Atencio for analytical support, and Drs. R. Beitel and J. A. Winer for comments on the manuscript.

REFERENCES

  1. Ahissar M, Hochstein S. Attentional control of early perceptual learning. Proc Natl Acad Sci U S A. 1993;90:5718–5722. doi: 10.1073/pnas.90.12.5718.
  2. Allard TA, Clark SA, Jenkins WM, Merzenich MM. Reorganization of somatosensory area 3b representations in adult owl monkeys after digital syndactyly. J Neurophysiol. 1991;66:1048–1058. doi: 10.1152/jn.1991.66.3.1048.
  3. Amagai A, Dooling RJ, Shamma S, Kidd TL, Lohr B. Detection of modulation in spectral envelopes and linear-rippled noises by budgerigars (Melopsittacus undulatus). J Acoust Soc Am. 1999;105:2029–2035. doi: 10.1121/1.426736.
  4. Bakin JS, South DA, Weinberger NM. Induction of receptive field plasticity in the auditory cortex of the guinea pig during instrumental avoidance. Behav Neurosci. 1996;110:905–913. doi: 10.1037//0735-7044.110.5.905.
  5. Baru AV. Discrimination of synthesized vowels /a/ and /i/ with varying parameters (fundamental frequency, intensity, duration and number of formants) in dog. In: Fant G, Tatham MAA, editors. Auditory analysis and perception of speech. London: Academic Press; 1975. pp. 91–101.
  6. Beitel RE, Schreiner CE, Merzenich MM. Spectral receptive field plasticity in auditory cortex of owl monkeys trained with sinusoidally amplitude modulated (SAM) tones. Assoc Res Otolaryngol Abstr. 1999;21:201.
  7. Bernstein LR, Green DM. Detection of simple and complex changes of spectral shape. J Acoust Soc Am. 1987;82:1587–1592. doi: 10.1121/1.395147.
  8. Bilsen FA, ten Kate JH, Buunen TJ, Raatgever J. Responses of single units in the cochlear nucleus of the cat to cosine noise. J Acoust Soc Am. 1975;58:858–866. doi: 10.1121/1.380734.
  9. Calhoun B, Schreiner CE. Spectral envelope coding in cat primary auditory cortex: Linear and non-linear effects of stimulus characteristics. Eur J Neurosci. 1998;10:926–940. doi: 10.1046/j.1460-9568.1998.00102.x.
  10. Cheung SW, Nagarajan SS, Bedenbaugh PH, Schreiner CE, Wang X, Wong A. Auditory cortical neuron response differences under isoflurane versus pentobarbital anesthesia. Hear Res. 2001;56:115–127. doi: 10.1016/s0378-5955(01)00272-6. [DOI] [PubMed] [Google Scholar]
  11. Delgutte B, Kiang NYS. Speech coding in the auditory nerve. I. Vowel-like sounds. J Acoust Soc Am. 1984;75:866–878. doi: 10.1121/1.390596.
  12. DeValois RL, DeValois KK. Spatial Vision. Oxford: Oxford Science Publication; 1990.
  13. Dewson JH. Speech sound discrimination by cats. Science. 1964;144:555–556. doi: 10.1126/science.144.3618.555.
  14. Edeline J-M. Learning-induced physiological plasticity in the thalamo-cortical sensory system: a critical evaluation of receptive field plasticity, map changes and their potential mechanisms. Prog Neurobiol. 1999;57:165–224. doi: 10.1016/s0301-0082(98)00042-2.
  15. Eggermont JJ. Representation of a voice onset time continuum in primary auditory cortex of the cat. J Acoust Soc Am. 1995;98:911–920. doi: 10.1121/1.413517.
  16. Ehret G, Schreiner CE. Frequency resolution and spectral integration (critical band analysis) in single units of the cat primary auditory cortex. J Comp Physiol A. 1997;181:635–650. doi: 10.1007/s003590050146.
  17. Escabi M, Schreiner CE. Nonlinear processing of auditory neurons revealed with naturalistic spectro-temporal envelopes. J Neurosci. 2002;22:4114–4131. doi: 10.1523/JNEUROSCI.22-10-04114.2002.
  18. Festen JM, Plomp R. Relations between auditory functions in normal hearing. J Acoust Soc Am. 1981;70:356–369. doi: 10.1121/1.386771.
  19. Flanagan JL. A difference limen for vowel formant frequency. J Acoust Soc Am. 1955;27:613–617.
  20. Gallistel CR. Conditioning from an information processing perspective. Behav Processes. 2003;62:89–101. doi: 10.1016/s0376-6357(03)00019-6.
  21. Green DM, Swets JA. Signal Detection Theory. New York: Wiley; 1966.
  22. Greenwood DG. Critical bandwidth in man and some other species in relation to the traveling wave envelope. In: Moskowitz HR, et al., editors. Sensation and Measurement. Dordrecht: D Reidel Publishing Co; 1974. pp. 231–239.
  23. Hillier D, Miller J. Auditory-contrast sensitivity in normal and hearing-impaired listeners. J Acoust Soc Am. 1991;89:1938.
  24. Jenkins WM, Merzenich MM, Ochs M, Allard TT, Guic-Robles E. Functional organization of primary somatosensory cortex in adult owl monkeys after behaviorally controlled tactile stimulation. J Neurophysiol. 1990;63:82–104. doi: 10.1152/jn.1990.63.1.82.
  25. Keeling D, Schreiner CE, Jenkins WM. Discrimination of formant shifts in vowel-like ripple spectra. Assoc Res Otolaryngol Abstr. 1992;14:201.
  26. Kilgard MP, Pandya PK, Vazquez J, Gehi A, Schreiner CE, Merzenich MM. Sensory input directs spatial and temporal plasticity in primary auditory cortex. J Neurophysiol. 2001;86:326–338. doi: 10.1152/jn.2001.86.1.326.
  27. Klein DJ, Simon JZ, Shamma SA. Robust spectro-temporal reverse correlation for the auditory system: optimizing stimulus design. J Comput Neurosci. 2000;9:85–111. doi: 10.1023/a:1008990412183.
  28. Kowalski N, Depireux DA, Shamma SA. Analysis of dynamic spectra in ferret primary auditory cortex. 1. Characteristics of single-unit responses to moving ripple spectra. J Neurophysiol. 1996;76:3503–3523. doi: 10.1152/jn.1996.76.5.3503.
  29. Kral A, Hartmann R, Tillein J, Heid S, Klinke R. Hearing after congenital deafness: central auditory plasticity and sensory deprivation. Cereb Cortex. 2002;12:797–807. doi: 10.1093/cercor/12.8.797.
  30. Kuhl PK, Miller JD. Speech perception by the chinchilla: voiced-voiceless distinction in alveolar plosive consonants. Science. 1975;190:69–72. doi: 10.1126/science.1166301.
  31. Kuhl PK, Padden DM. Enhanced discriminability at the phonetic boundaries for the voicing feature in macaques. Perception and Psychophysics. 1982;32:542–550. doi: 10.3758/bf03204208.
  32. Kuhl PK, Padden DM. Enhanced discriminability at the phonetic boundaries for the place feature in macaques. J Acoust Soc Am. 1983;73:1003–1010. doi: 10.1121/1.389148.
  33. Langner G. Periodicity coding in the auditory system. Hear Res. 1992;60:115–142. doi: 10.1016/0378-5955(92)90015-f.
  34. Langner G, Schreiner CE. Periodicity coding in the inferior colliculus of the cat. I. Neuronal mechanisms. J Neurophysiol. 1988;60:1799–1822. doi: 10.1152/jn.1988.60.6.1799.
  35. Li W, Piech V, Gilbert CD. Perceptual learning and top-down influences in primary visual cortex. Nat Neurosci. 2004;7:651–657. doi: 10.1038/nn1255.
  36. Linden JF, Liu RC, Sahani M, Schreiner CE, Merzenich MM. Spectrotemporal structure of receptive fields in areas AI and AAF of mouse auditory cortex. J Neurophysiol. 2003;90:2660–2675. doi: 10.1152/jn.00751.2002.
  37. Mendelson JR, Schreiner CE, Sutter M, Grasse K. Functional topography of cat primary auditory cortex: representation of frequency modulation. Exp Brain Res. 1993;94:65–87. doi: 10.1007/BF00230471.
  38. Middlebrooks JC, Bierer JA, Snyder RL. Cochlear implants: the view from the brain. Curr Opin Neurobiol. 2005;15:488–493. doi: 10.1016/j.conb.2005.06.004.
  39. Miller LM, Escabi MA, Read HL, Schreiner CE. Spectrotemporal receptive fields in the lemniscal auditory thalamus and cortex. J Neurophysiol. 2002;87:516–527. doi: 10.1152/jn.00395.2001.
  40. O'Connor KN, Barruel P, Sutter ML. Global processing of spectrally complex sounds in macaques (Macaca mulatta) and humans. J Comp Physiol A. 2000;186:903–912. doi: 10.1007/s003590000145.
  41. Ohl FW, Scheich H. Learning-induced plasticity in animal and human auditory cortex. Curr Opin Neurobiol. 2005;15:470–477. doi: 10.1016/j.conb.2005.07.002.
  42. Polley DB, Heiser MA, Blake DT, Schreiner CE, Merzenich MM. Associative learning shapes the neural code for stimulus magnitude in primary auditory cortex. Proc Natl Acad Sci U S A. 2004;101:16351–16356. doi: 10.1073/pnas.0407586101.
  43. Polley DB, Steinberg EE, Merzenich MM. Perceptual learning directs auditory cortical map plasticity through top-down influences. J Neurosci. 2006;26:4970–4982. doi: 10.1523/JNEUROSCI.3771-05.2006.
  44. Pick GF. Level dependence of psychophysical frequency resolution and auditory filter shape. J Acoust Soc Am. 1980;68:1085–1095. doi: 10.1121/1.384979.
  45. Pickles JO. Normal critical bands in the cat. Acta Otolaryngol. 1975;80:245–254. doi: 10.3109/00016487509121325.
  46. Prosen CA, Moody DB, Sommers MS, Stebbins WC. Frequency discrimination in the monkey. J Acoust Soc Am. 1990;88:2152–2158. doi: 10.1121/1.400112.
  47. Raggio MW, Schreiner CE. Neuronal responses in cat primary auditory cortex to electrical cochlear stimulation. III: Activation patterns in long- and short-term deafness. J Neurophysiol. 1999;82:3506–3526. doi: 10.1152/jn.1999.82.6.3506.
  48. Read HL, Winer JA, Schreiner CE. Modular organization of intrinsic connections associated with spectral tuning in cat auditory cortex. Proc Natl Acad Sci U S A. 2001;98:8042–8047. doi: 10.1073/pnas.131591898.
  49. Recanzone GH, Schreiner CE, Merzenich MM. Plasticity in the frequency representation of primary auditory cortex following discrimination training in adult owl monkeys. J Neurosci. 1993;13:87–103. doi: 10.1523/JNEUROSCI.13-01-00087.1993.
  50. Recanzone GH, Schreiner CE, Sutter ML, Beitel R, Merzenich MM. Functional organization of spectral receptive fields in the primary auditory cortex of the owl monkey. J Comp Neurol. 1999;415:460–481. doi: 10.1002/(sici)1096-9861(19991227)415:4<460::aid-cne4>3.0.co;2-f.
  51. Rhode WS, Greenberg S. Encoding of amplitude modulation in the cochlear nucleus of the cat. J Neurophysiol. 1994;71:1797–1825. doi: 10.1152/jn.1994.71.5.1797.
  52. Rubinstein JT. How cochlear implants encode speech. Curr Opin Otolaryngol Head Neck Surg. 2004;12:444–448. doi: 10.1097/01.moo.0000134452.24819.c0.
  53. Sachs MB, Young ED. Encoding of steady-state vowels in the auditory nerve: representation in terms of discharge rate. J Acoust Soc Am. 1979;66:470–479. doi: 10.1121/1.383098.
  54. Scharf B. Critical Bands. In: Tobias JV, editor. Foundations of Modern Auditory Theory. I. New York: Academic Press; 1970. pp. 159–202.
  55. Schreiner CE. Spatial distribution of responses to simple and complex sounds in primary auditory cortex. Audiol Neurootol. 1998;3:104–122. doi: 10.1159/000013785.
  56. Schreiner CE, Calhoun BM. Spectral envelope coding in cat primary auditory cortex: Properties of ripple transfer functions. Auditory Neurosci. 1994;1:39–61.
  57. Schreiner CE, Mendelson JR. Functional topography of cat primary auditory cortex: distribution of integrated excitation. J Neurophysiol. 1990;64:1442–1459. doi: 10.1152/jn.1990.64.5.1442.
  58. Schreiner CE, Sutter ML. Topography of excitatory bandwidth in cat primary auditory cortex: single-neuron versus multiple-neuron recordings. J Neurophysiol. 1992;68:1487–1502. doi: 10.1152/jn.1992.68.5.1487.
  59. Schreiner CE, Read HL, Sutter ML. Modular organization of frequency integration in primary auditory cortex. Ann Rev Neurosci. 2000;23:501–529. doi: 10.1146/annurev.neuro.23.1.501.
  60. Shamma SA, Versnel H, Kowalski N. Ripple analysis in the ferret primary auditory cortex. I. Response characteristics of single units to sinusoidally rippled spectra. Auditory Neurosci. 1995;1:233–254.
  61. Shipley C, Carterette EC, Buchwald JS. The effects of articulation on the acoustical structure of feline vocalizations. J Acoust Soc Am. 1991;89:902–909. doi: 10.1121/1.1894652.
  62. Sinnott JM. Detection and discrimination of synthetic English vowels by Old World monkeys (Cercopithecus, Macaca) and humans. J Acoust Soc Am. 1989;86:557–565. doi: 10.1121/1.398235.
  63. Sinnott JM, Kreiter NA. Differential sensitivity to vowel continua in Old World monkeys (Macaca) and humans. J Acoust Soc Am. 1991;89:2421–2429. doi: 10.1121/1.400974.
  64. Sommers MS, Moody DB, Prosen CA, Stebbins WC. Formant frequency discrimination by Japanese macaques (Macaca fuscata). J Acoust Soc Am. 1992;91:3499–3510. doi: 10.1121/1.402839.
  65. Steinschneider M, Arezzo JC, Vaughan HG Jr. Tonotopic features of speech-evoked activity in primate auditory cortex. Brain Res. 1990;519:158–168. doi: 10.1016/0006-8993(90)90074-l.
  66. Supin AY, Popov VV, Milekhina ON, Tarakanov MB. Frequency resolving power measured by rippled noise. Hear Res. 1994;78:31–40. doi: 10.1016/0378-5955(94)90041-8.
  67. Thompson M, Porter B, O'Bryan J, Heffner HE, Heffner RS. A syringe-pump food-paste dispenser. Behav Res Meth Instr Comp. 1990;22:449–450.
  68. Van Veen TM, Houtgast T. Spectral sharpness and vowel dissimilarity. J Acoust Soc Am. 1985;77:628–634. doi: 10.1121/1.391880.
  69. Versnel H, Shamma SA. Spectral-ripple representation of steady-state vowels in primary auditory cortex. J Acoust Soc Am. 1998;103:2502–2514. doi: 10.1121/1.422771.
  70. Wang K, Shamma SA. Representation of acoustic signals in the primary auditory cortex. IEEE Trans Audio Speech Process. 1995;3:382–395.
  71. Wang X, Sachs MB. Neural encoding of single-formant stimuli in the cat II. Responses of anteroventral cochlear nucleus units. J Neurophysiol. 1994;71:59–78. doi: 10.1152/jn.1994.71.1.59.
  72. Wang X, Merzenich MM, Beitel R, Schreiner CE. Representation of a species-specific vocalization in the primary auditory cortex of the common marmoset: temporal and spectral characteristics. J Neurophysiol. 1995;74:2685–2706. doi: 10.1152/jn.1995.74.6.2685.
  73. Witte RS, Kipke DR. Enhanced contrast sensitivity in auditory cortex as cats learn to discriminate sound frequencies. Brain Res Cogn Brain Res. 2005;23:171–184. doi: 10.1016/j.cogbrainres.2004.10.018.
  74. Weinberger NM, Diamond DM. Dynamic modulation of the auditory system by associative learning. In: Edelman GM, Gall WE, Cowan WM, editors. Auditory function - neurobiological bases of hearing. New York: John Wiley & Sons; 1988. pp. 485–512.
  75. Wörgötter F, Eysel UT. Quantitative determination of orientational and directional components in the response of visual cortical cells to moving stimuli. Biol Cybern. 1987;57:349–355. doi: 10.1007/BF00354980.
  76. Wörgötter F, Grundel O, Eysel UT. Quantification and comparison of cell properties in cat's striate cortex determined by different types of stimuli. Eur J Neurosci. 1990;2:928–941. doi: 10.1111/j.1460-9568.1990.tb00005.x.
  77. Wong S, Schreiner CE. Representation of stop-consonants in cat primary auditory cortex: intensity dependence. Speech Communication. 2003;41:93–106.
  78. Yost WA. The dominance region and ripple noise pitch: a test of the peripheral weighting model. J Acoust Soc Am. 1982;72:416–425. doi: 10.1121/1.388094.
  79. Young ED, Sachs MB. Representation of steady-state vowels in the temporal aspects of the discharge patterns of populations of auditory-nerve fibers. J Acoust Soc Am. 1979;66:1381–1403. doi: 10.1121/1.383532.
