Abstract
Sensory systems are dynamic. They must process a wide range of natural signals that facilitate adaptive behaviors in a manner that depends on an organism's constantly changing goals. A full understanding of the sensory physiology that underlies adaptive natural behaviors must therefore account for the activity of sensory systems in light of these behavioral goals. Here we present a novel technique that combines in vivo electrophysiological recording from awake, freely moving songbirds with operant conditioning techniques that allow control over birds' recognition of conspecific song, a widespread natural behavior in songbirds. We show that engaging in a vocal recognition task alters the response properties of neurons in the caudal mesopallium (CM), an avian analog of mammalian auditory cortex, in European starlings. Compared with awake, passive listening, active engagement of subjects in an auditory recognition task results in neurons responding to fewer song stimuli and a decrease in the trial-to-trial variability in their driven firing rates. Mean firing rates also change during active recognition, but not uniformly. Relative to nonengaged listening, active recognition causes increases in the driven firing rates in some neurons, decreases in other neurons, and stimulus-specific changes in other neurons. These changes lead to both an increase in stimulus selectivity and an increase in the information conveyed by the neurons about the animals' behavioral task. This study demonstrates the behavioral dependence of neural responses in the avian auditory forebrain and introduces the starling as a model for real-time monitoring of task-related neural processing of complex auditory objects.
Keywords: behavioral electrophysiology, auditory cortex, birdsong, task engagement, behavioral state
Sensory signals must guide many different behaviors (e.g., feeding, predator avoidance, mating), and the mappings from neural sensory systems to behaviors are not fixed. As an organism interacts with the environment, both the sensory signals and the animal's behavioral goals for these signals change over time. For example, the salient color and smell of a tasty berry that an animal has previously eaten when hungry may be ignored when the animal is sated. These dynamic components make the challenge of understanding sensory encoding more difficult, as any complete functional description must account for changes in the stimulus representation in relation to each animal's goals. Indeed, goal-dependent modulation of sensory representations can be powerful and broad, ranging from presynaptic inhibition on sensory afferents only during certain behaviors in invertebrates (Gaudry and Kristan 2009) to attentional modulation in visual and auditory cortices (Hubel et al. 1959; Mesgarani and Chang 2012; Moran and Desimone 1985). It is likely that the goals of an organism are continuously shaping the sensory system transformations to highlight relevant stimulus features and attenuate distracting elements.
Studying behavioral goal-dependent modulation of sensory encoding requires tight control over an organism's behavior. One simple method for studying behavioral modulation of sensory representations is to observe neural activity when an animal is engaged in a task that requires the use of sensory information and then compare this activity to that observed when the animal is not actively engaged in the task. Any changes in the neural representation of the stimuli between these two conditions can be attributed to the animal's engagement in the task. In the auditory system, the effects of task engagement on the activity of neurons have been studied in a variety of species, including monkeys (Hocherman et al. 1976; Miller et al. 1972; Niwa et al. 2012), cats (Lee and Middlebrooks 2011), ferrets (Fritz et al. 2003), and rats (Otazu et al. 2009). Changes due to task engagement range from general increases and decreases in spontaneous and driven neural activity to more task-specific effects such as the suppression of responses to distracters (Otazu et al. 2009) or the increase in sensitivity to task-related target features (Fritz et al. 2003; Lee and Middlebrooks 2011; Niwa et al. 2012; see Sutter and Shamma 2011 for a detailed review). These studies provide important insights into the ways that task engagement can alter auditory representations but are limited by the use of artificial stimuli in organisms performing simple laboratory tasks. Because adaptive behaviors can powerfully modulate an animal's goals, understanding this modulation will likely benefit from the use of natural stimuli in realistic contexts.
The songbird auditory system provides an excellent model for studying the neural substrate of ethologically relevant behavior involving complex natural stimuli (Doupe and Kuhl 1999; Gentner and Ball 2005; Knudsen and Gentner 2010; Marler 2004; Pinaud and Terleph 2008; Theunissen and Shaevitz 2006). Birds participate in a variety of acoustically mediated natural behaviors such as territory defense, mate attraction (Catchpole and Slater 1995), and recognition of song type (Beecher et al. 1994) and conspecific individual identity (Gentner and Hulse 2000). These behaviors depend critically on differences in the acoustic features in other birds' songs, and recent studies suggest that the avian auditory system is poised to preferentially represent those acoustic features that are particularly relevant to an individual bird's experience. Moving from the songbird auditory periphery into more central auditory regions, neural receptive fields become increasingly complex; linear response models describe less and less of the variance in the neurons' song-evoked, time-varying firing rates (Woolley et al. 2005). That is, higher-order auditory regions are less likely to be driven by simple stimuli like pure tones or noise bursts, requiring more complex acoustic stimuli such as conspecific song to drive their responses. Many neurons in these regions show selectivity among different segments of song, which may arise as a function of the combination of feedforward inputs representing less complex features (Meliza et al. 2010). In addition, many of these secondary auditory regions show strong experience-dependent effects, displaying increased (Gentner and Margoliash 2003) or decreased (Thompson and Gentner 2010) responsiveness for songs that birds have learned to classify, and increased information about learned stimulus identity and task-relevant classification (Jeanne et al. 2011). 
These studies suggest that powerful changes can be effected in the representation of acoustic stimuli based upon a bird's behavioral experience but have focused on changes observed at one time point, after learning and during neurophysiological recordings in anesthetized animals. These regions are therefore strong candidates for displaying behavioral state-dependent modulation. The goal of the present research was to determine the nature of these modulations, if they exist.
Here we present a novel technique for studying extracellular neurophysiology in European starlings (Sturnus vulgaris) while birds perform behaviors mediated by natural stimuli. We trained birds in operant tasks where they learned to recognize a number of conspecific starling songs and then recorded extracellular action potentials from single units in the caudal mesopallium (CM), an avian analog of mammalian auditory cortex (Butler et al. 2011) that undergoes strong experience-dependent learning effects (Gentner and Margoliash 2003; Jeanne et al. 2011). We compared responses while birds actively engaged in the auditory recognition task to responses to the same stimuli while the birds were not engaged in the task and found that the behavioral state of the animal modulates the activity of the majority of neurons recorded in CM. Some neurons show systematic increases in their driven firing rate, others show systematic decreases, and yet others display stimulus-specific changes in firing rates. At the population level, task engagement causes neurons to be excited by fewer stimuli, leading to a corresponding increase in stimulus selectivity. Additionally, task engagement leads to a decrease in trial-to-trial firing rate variability at the population level. Although the changes due to task engagement are heterogeneous, we find that these changes allow the output from single neurons to better discriminate between relevant behavioral classes when birds are engaged in the behavioral task than when they are not. From these results, we conclude that engagement in an auditory recognition task alters the neural representation of auditory stimuli in the songbird auditory forebrain, and that these changes occur in such a way as to better transmit information regarding the task that birds are performing.
METHODS
All experiments were performed in accordance with a protocol (no. S05383) approved by the Institutional Animal Care and Use Committee of the University of California San Diego and followed the American Physiological Society “Guiding Principles for the Care and Use of Vertebrate Animals in Research and Training.”
Subjects
Eight (7 male and 1 female) European starlings served as subjects for this study. Both male and female starlings readily acquire laboratory operant behaviors involving conspecific vocal recognition (Gentner and Hulse 1998; Gentner et al. 2000), and previous studies of neural activity in medial CM show experience-dependent plasticity in both sexes (Gentner and Margoliash 2003). Subjects were wild-caught in southern California and had adult plumage at the time of capture. Prior to training and testing, birds were housed with conspecifics in large flight aviaries with ad libitum access to food and water. Light-dark cycles in the aviaries were matched to the naturally varying photoperiod. Prior to experiments, birds were naive to all stimuli and to the operant apparatus used in this study.
Operant Apparatus and Shaping
At the start of training, birds were removed from the aviary and acclimated to individual operant chambers. Detailed specifications of the custom-built operant chambers have been described elsewhere (Gentner 2008). Briefly, birds lived in weld-wire cages mounted inside sound-attenuation chambers (Acoustic Systems; Eckel Industries). One wall of the cage contained a metal panel with three response ports into which birds pecked their beaks to trigger various outcomes (Fig. 1A). Directly below the response ports was a feeding station at which food could be presented based on the reward contingencies of a given task. A speaker mounted behind the panel was used for auditory stimulus presentation. Above the cage, a recessed broad-spectrum compact fluorescent light bulb provided naturalistic illumination of the chamber (“house light”). Birds earned all food through successful completion of the operant tasks described below. Access to water was unrestricted. After quickly learning to eat from the feeding station, birds underwent an auto-shaping procedure that used visual cues (LEDs mounted at the back of the response ports) to teach them how to use the apparatus.
Stimuli
All auditory stimuli were segments of conspecific song recorded from six male starlings at 44.1 kHz and 16 bit (for a more detailed description of the recording parameters, see Gentner 2008). Stimuli were normalized to 65 dB mean SPL, and the first and last 20 ms of each stimulus were linearly ramped to zero to avoid onset and offset artifacts. Starling song can be decomposed into repeated groups of spectrotemporal features called motifs that are natural, behaviorally relevant subunits of song (Gentner 2008; Gentner and Hulse 2000; Seeba and Klump 2009). For four of the birds in this study, training stimuli comprised either 6 (3 birds) or 12 (1 bird) single motifs (range: 0.38–1.13 s), and for the remaining four birds, stimuli were 4 (3 birds) or 8 (1 bird) sections of song made up of 11–15 individual motifs (“long songs,” range: 9.08–10.16 s). During some neural recording sessions, a subset of birds (4/8) were presented with additional stimuli (similarly long songs or single motifs) that were not from the set of that bird's training stimuli. These additional stimuli are not discussed further; all analyses are restricted to a given bird's training stimuli to isolate the effects of task engagement on well-learned auditory-mediated behaviors. No bird heard its own song, nor was it familiar with any of the songs used in training or testing before the start of the experiments described here.
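The onset and offset ramping described above can be illustrated with a minimal sketch. It assumes the stimulus is available as a plain list of float samples at 44.1 kHz; the function name `apply_linear_ramps` and its parameters are hypothetical and are not taken from the software used in the study.

```python
def apply_linear_ramps(samples, sample_rate_hz=44100, ramp_ms=20.0):
    """Return a copy of `samples` with linear onset and offset ramps.

    Hypothetical helper: the first and last `ramp_ms` milliseconds are
    scaled by a gain that rises linearly from 0 at the stimulus edge.
    """
    n_ramp = int(round(sample_rate_hz * ramp_ms / 1000.0))
    if 2 * n_ramp > len(samples):
        raise ValueError("stimulus is shorter than the two ramps combined")
    out = list(samples)
    for i in range(n_ramp):
        gain = i / n_ramp      # 0.0 at the edge sample, approaching 1.0
        out[i] *= gain         # onset ramp
        out[-1 - i] *= gain    # offset ramp (mirror image of the onset)
    return out


# Example: a 1-s constant signal; the edges fall to zero and the interior is untouched.
ramped = apply_linear_ramps([1.0] * 44100)
```

At 44.1 kHz a 20-ms ramp spans 882 samples, so only the outermost 882 samples on each side are attenuated.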
Training
Six birds performed a simple two-alternative choice (2AC) recognition task (2 with the long songs and 4 with single motifs). In the 2AC task, birds initiated trials by pecking their beak into the center response port. This elicited the playback of a training stimulus from the speaker, after the end of which the birds had 2 s to make a response by pecking into either the left or right response port. Half of a bird's training stimuli were rewarded with access to food for pecking the left response port, and the other half were rewarded after pecking of the right response port. If the bird made an inappropriate response (e.g., pecking left when it should peck right), the lights in the operant apparatus were turned off for 5 s and the bird was restricted from initiating another trial during this time. If the bird made no response in the left or the right port during the 2-s response window, the trial was considered a “no-response” trial and was not included in analysis. Two additional birds performed a go/nogo (GNG) recognition task using the long songs, where they were trained to respond to half of the songs with a peck in the center response port and withhold responding to the second half of the songs. Correct responses (pecks to “go” songs) were rewarded with access to food, and incorrect responses (pecks to “nogo” songs) were punished by restricting access to trial initiation and briefly turning off the house lights (5 s). Withheld responses (not pecking to a “go” song) were never explicitly reinforced. In both the 2AC and GNG tasks, birds learned all reward-pairing contingencies through trial and error. Incorrect responses were always punished. Correct responses were rewarded at 100% during early training but at lower rates during later training sessions (typically 40–60%) to keep response rates high by maintaining motivation and delaying satiation.
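The 2AC contingencies described above can be summarized in a short sketch. This is an illustration under stated assumptions, not the behavioral control code used in the study: the function name `score_2ac_trial`, the dictionary fields, and the `reward_probability` parameter are all hypothetical.

```python
import random


def score_2ac_trial(correct_port, response, reward_probability=1.0, rng=random):
    """Classify one 2AC trial (hypothetical helper).

    correct_port: "left" or "right", the rewarded port for this stimulus.
    response: "left", "right", or None if no peck fell in the 2-s window.
    reward_probability: fraction of correct trials that deliver food
        (1.0 early in training; lower values later in training).
    """
    if response is None:
        # No-response trials are excluded from analysis.
        return {"outcome": "no_response", "feed": False, "lights_out_s": 0.0}
    if response == correct_port:
        feed = rng.random() < reward_probability
        return {"outcome": "correct", "feed": feed, "lights_out_s": 0.0}
    # Incorrect responses are always punished with a 5-s lights-out period.
    return {"outcome": "incorrect", "feed": False, "lights_out_s": 5.0}


result = score_2ac_trial("left", "right")
```

Keeping punishment deterministic while making food delivery probabilistic mirrors the description above, in which incorrect responses were always punished but correct responses were rewarded on only a fraction of later trials.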
Acclimation to Recording Apparatus
Once the birds reached stable and asymptotic performance on training stimuli, they were transferred to a modified operant apparatus that allowed simultaneous behavioral testing and extracellular recording of action potentials. This recording apparatus was identical in design to the training apparatus, except for modifications that allowed for electrical isolation of the animals and ensured high-quality, low-noise electrophysiological recording. Modifications included coating the wire cage in plastic, installing a plastic response panel, and insulating the feeding station with a layer of neoprene rubber. In addition, the house light was changed from an alternating current CFL to a 24-V DC-powered incandescent bulb, and a green LED (“cue light”) was installed above the center response port to signify trial availability and to act as a secondary reinforcer. The cue light remained on whenever the bird was allowed to initiate a trial and turned off as soon as a trial was initiated. Upon the successful completion of a trial, the cue light blinked five times in 0.5 s with a 50% duty cycle to indicate a correct response, even if food was not presented on that trial. A further modification consisted of the installation of a 32-channel motorized commutator (Plexon, Dallas, TX) through a hole in the ceiling of the sound-attenuation chamber. A small hole in the top of the wire cage admitted a multiwire tether that connected the bird to the commutator. This setup allowed the free movement of a tethered bird inside the cage without fear of tangling (Fig. 1A).
The behavioral control of the recording apparatus was managed by custom-written Spike2 [Cambridge Electronic Design (CED)] scripts in combination with a CED Power 1401 input-output device that handled digital-to-analog conversion for stimulus playback and managed digital inputs and outputs for control of the response ports and reward/punishment apparatus. In addition, the software controlled the analog-to-digital conversion performed by the 1401 and maintained the temporal registration of the neural data with the behavioral data.
Neurophysiology
Electrode microdrive.
Figure 1B depicts the electrode microdrive assembly developed in conjunction with the Scripps Institution of Oceanography Machine Shop (http://sioms.ucsd.edu/) to allow the recording of extracellular action potentials from awake, behaving starlings. The drive assembly consists of three main components: the base plate (a), the microdrive (b), and the outer housing (c) (Fig. 1B). The base plate attaches to the bird's skull with adhesive and dental acrylic and provides an anchor point for the rest of the assembly. The bottom flanges on the microdrive are fitted into the tracks in the base plate (Fig. 1B, d), and the microdrive is held in place by a set screw threaded through one of the holes in the base plate tracks. Finally, the bottom portion of the outer housing is lowered over the microdrive and screwed into the base plate, while the top portion of the outer housing is screwed directly into the lower outer housing.
Base plate.
The base plate is made of titanium, and its bottom aspect is machined to conform roughly to the curve of a starling skull. The flange around the perimeter of the base plate and the larger vertically oriented holes provide attachment points for the dental acrylic used to affix the drive to the skull. The smaller vertically oriented holes allow for the passage of wires (e.g., from reference or ground electrodes) from the implant site to the electrode connector. The large rectangular cutout in the base plate allows for access to the implant site by the electrode, visual inspection of the electrode insertion site by the researcher after implantation, and minor curettage and cleaning prior to later electrode penetrations. The horizontally oriented tracks (Fig. 1B, d) provide an attachment location for the microdrive as mentioned above.
Microdrive.
The microdrive slides into the tracks in the base plate (Fig. 1B, d) and is held in place with a set screw. This allows for adjustment along one horizontal axis, even after the base plate is cemented in place, permitting multiple electrode penetrations in the same subject. Within the titanium microdrive body is a vertically oriented threaded rod (Fig. 1B, e). Threaded onto this rod and fit tightly into the microdrive housing is a plastic shuttle (f) that moves up and down relative to the rest of the drive assembly when the rod is turned with the operating knob at the top (g) (Fig. 1B). The rod is threaded with 91 threads per inch, and one full rotation of the rod through 360° leads to a vertical displacement of the shuttle of 279.12 μm. Keeping the rod in place on the bottom is a restraining block containing a closed cylinder in which the rod is allowed to rotate freely, and in the top restraining block there is a channel in which a collar firmly attached to the threaded rod is allowed to rotate freely (Fig. 1B, h). A rubber washer (not depicted) is fitted into the space between the control knob and the top of the upper restraining block to keep tension on the rod in order to prevent it from falling out of the bottom restraining block. By attaching a recording electrode or electrode array to the shuttle with adhesive, and fixing the microdrive in place over the region of interest, the operating knob may be turned to raise and lower the electrode along a vertical track.
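The displacement per rotation follows directly from the thread pitch: one inch is 25,400 μm, so a rod with 91 threads per inch advances 25,400/91 ≈ 279.12 μm per turn. The arithmetic can be sketched as follows; the function names are hypothetical and introduced only for illustration.

```python
MICRONS_PER_INCH = 25_400.0


def shuttle_travel_um(turns, threads_per_inch=91):
    """Vertical shuttle displacement (in micrometers) for a number of knob turns."""
    return turns * MICRONS_PER_INCH / threads_per_inch


def turns_for_depth(depth_um, threads_per_inch=91):
    """Knob turns needed to advance the shuttle by `depth_um` micrometers."""
    return depth_um * threads_per_inch / MICRONS_PER_INCH


per_turn = round(shuttle_travel_um(1), 2)   # 279.12
```

Under the same arithmetic, advancing the electrode by 500–800 μm corresponds to roughly 1.8 to 2.9 full turns of the knob.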
Electrode array.
The recording electrodes used in this study were 16-channel electrode arrays (NeuroNexus Technologies) equipped with the F16 connector package (currently deprecated and superseded by the H-series connector package). This design consists of 1 or 2 silicon shanks that enter the brain (3–5 mm × 80 μm × 15 μm) and contain 16 iridium contact sites (arranged in either a 1 × 16 linear array or an array of four tetrodes; 121-μm², 312-μm², or 413-μm² site area) attached to a flexible cable (21 mm) that then attaches to a 20-channel connector. The microdrive shuttle is sized to accept both the 16-channel electrode arrays used in this study and 32-channel arrays available from NeuroNexus to increase channel counts in future studies. When the entire microdrive assembly is in place, this connector is attached to a 16-channel tethered headstage (HST/16V-G20, Plexon). The headstage is permanently attached by adhesive to the upper portion of the outer housing and provides 20× gain. The flexible cable on the electrode array allows the recording shanks to be raised and lowered on the shuttle without needing to move the connector/headstage assembly. Before implantation, the electrode array shanks are coated with a small amount of fluorescent dye (DiI) to aid electrode track localization during postmortem histology (DiCarlo et al. 1996).
Outer housing.
The outer housing (Fig. 1B, c) is machined from polyetheretherketone (PEEK), a strong, lightweight, and biocompatible plastic, and provides protection for the rest of the drive assembly from the movements of the bird and protection for the implant site from any external factors or foreign bodies. The outer housing consists of two separate pieces: the bottom section screws directly into the base plate, and the upper section screws into this bottom section. Once the outer housing is screwed into place on the base plate, a small window into the internals of the drive assembly remains open at the point where the rectangular cutout in the base plate is open. This is sealed with a removable silicone gel (Kwik-Cast, World Precision Instruments), so that the entirety of the microdrive assembly and the craniotomy are sealed from the external environment.
While the above describes the present iteration of the microdrive assembly, the eight birds included in this study were implanted either with drives that were identical to the one depicted in Fig. 1B or with earlier versions that shared main characteristics but differed in some small respects, such as the inability to remove the outer housing after implantation and thus make multiple penetrations in the same animal.
Electrode microdrive implantation surgery.
Once birds achieved high accuracy on their behavioral task, they underwent surgery to implant the microdrive and allow for the recording of single units from their auditory forebrain. Birds were anesthetized with isoflurane, and an incision was made in the scalp to expose the skull. A small opening in the top layer of the skull was made to allow visualization of the bifurcation of the Y sinus. This location was used to determine the proper stereotaxic coordinates for targeting the CM (Fig. 1C): 2,500 μm rostral and 500 μm lateral (all birds had implants to the left hemisphere). A small craniotomy was placed dorsal to the target location, and the dura was resected to expose the brain's surface over the target region. A layer of silicone gel (3-4680, Dow Corning, Midland, MI; Jackson and Muthuswamy 2008) was applied to act as an artificial dura, which sealed the entire durotomy and lower portion of the craniotomy. Two smaller craniotomies were made roughly 3 mm bilaterally from this main craniotomy and similarly sealed with artificial dura. These allowed the implantation of a reference electrode (custom 0.003-in.-diameter PtIr wire, etched to a point and glass coated to an impedance of ∼1.5 MΩ, 2–4 mm in length) and a ground electrode (custom 0.003-in.-diameter PtIr wire, etched to a point and uninsulated, 2–4 mm in length). These secondary electrodes were patched to the connector on the main electrode array via a fine PVC-insulated wire (Pacific Wire and Cable, Santa Ana, CA) soldered in place prior to implantation. The microdrive, fitted with an electrode array, was then screwed into the base plate, and the entire assembly was lowered into place over the skull so that the vertical trajectory of the electrode shank lined up with the durotomy dorsal to the target of interest. The base plate was then attached to the skull with adhesive and dental acrylic.
Once the dental acrylic hardened, the electrode array was slowly lowered into the brain, by turning the control knob on the microdrive, to a depth of ∼500–800 μm, dorsal to the CM target area. The outer housing of the microdrive assembly was screwed into place, the electrode connector was mated to the headstage attached to the outer housing, and the bird was allowed to recover with free access to food and water until the start of recording.
Recording procedure.
Recording sessions started by attaching the top of the headstage to the bottom of the motorized commutator via the tether described above. The output of the commutator was attached to a multichannel amplifier (either a Plexon PBX2 preamplifier or a model 3600 16-channel microelectrode amplifier, A-M Systems, Sequim, WA). The amplifier provided between 2,000× and 10,000× gain and band-pass filtered the signal (low cutoff: 300–500 Hz; high cutoff: 5–8 kHz). The output of the amplifier was sent to a CED Power 1401 analog-to-digital converter that could provide an additional amplification of the signals by up to 10× and digitized each channel at 19–25 kHz with the custom-written Spike2 scripts.
Once the bird was tethered to the recording apparatus, we followed the recording procedure diagrammed in Fig. 2A. A library of stimuli, including the training stimuli, was presented in randomized order to the bird while the operant apparatus was inactive and the bird was quiescent. The electrode array was slowly advanced by turning the control knob on the microdrive until the signal from a single unit could be isolated from background noise. Once a single unit was isolated on one or more channels, the testing sessions began. Testing always began with a non-task-engaged block (“nonengaged”), where the operant apparatus remained inactive and the training stimuli were pseudorandomly presented to the bird (mean 42.3 trials per block). The intertrial interval was drawn randomly on each trial from a uniform distribution between 1 and 5 s. Either the response ports were physically blocked or the LED cue light was turned off to indicate that no trial could be initiated. We used a video camera to monitor each bird's behavioral state continuously and ensure that it did not attempt to initiate or respond to trials during the nonengaged condition. Additionally, we recorded all pecking responses during the nonengaged condition and excluded individual trials in the nonengaged block during which a bird pecked into any response port during stimulus presentation. In practice, such trials were rare, as the lack of reinforcement quickly extinguished pecking during the nonengaged condition. After the nonengaged block completed, the apparatus was turned on, and the bird began a task-engaged (“engaged”) block of trials. Trials were allowed to continue while the neuron's isolation was stable, and while the bird continued to perform trials. Because the bird initiated all the trials during the engaged block, the intertrial intervals were variable and the total number of trials per block (mean 30.4) was generally lower than in the nonengaged condition. 
After the completion of an engaged block, another nonengaged block of trials was run. If the neuron was still separable from background noise and the bird was still motivated, another engaged block was run after this, followed by a final nonengaged block. For the analyses reported here, unless otherwise noted, we grouped the trials from the multiple nonengaged or multiple engaged blocks together.
Data Analysis
Histology.
At the conclusion of each experiment, the electrode array was retracted and the microdrive was removed from the base plate. The bird was deeply anesthetized with Nembutal (150 mg/kg) and then perfused with heparinized saline followed by 10% formalin. After cryoprotection in a solution of 30% sucrose in phosphate-buffered saline, brains were frozen, sectioned on a freezing microtome, and mounted on glass slides. Fluorescence images of each section were taken to visualize the DiI fluorescence along the electrode track(s). We then stained the sections for Nissl (which destroys the DiI fluorescence signal) and took another series of images. The Nissl and fluorescent images were then aligned, to visualize the electrode tracks in the context of established cytoarchitectonic boundaries between auditory forebrain regions (e.g., Fig. 1C). Single-unit recording locations were registered to the histologically verified electrode position, and only those neurons determined to be within the boundaries of CM were analyzed for this study.
Spike sorting.
Putative action potentials were detected and sorted from the raw voltage waveforms with the built-in spike sorting features in Spike2. Some noise artifacts due to birds' movements were observable in the raw waveforms despite attempts to electrically isolate the bird and ground the microdrive. These artifacts generally contained power in low-frequency bands relative to putative action potential shapes, and so could be greatly reduced by high-pass filtering at 300 Hz. Noise artifacts were also common to many channels, and so subtracting one or more averaged reference channels from the data channel of interest (Ludwig et al. 2009) also improved the signal greatly. Putative action potentials were sorted on the basis of spike shape similarity by a combination of template matching and clustering in principal component space. Only well-isolated single units are included in these analyses (see Fig. 2, B–D). For each neuron, <0.01% of the interspike intervals violated a 1-ms refractory period.
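Two of the steps above lend themselves to a short illustration: subtracting an averaged reference from the channel of interest, and computing the fraction of interspike intervals that violate a refractory period. The sketch below assumes voltage traces stored as equal-length lists and spike times in seconds; both function names are hypothetical, and the sorting itself was done in Spike2 rather than with code like this.

```python
def subtract_common_reference(data_channel, reference_channels):
    """Subtract the sample-by-sample mean of the reference channels
    from the data channel (all traces must have equal length)."""
    n_ref = len(reference_channels)
    cleaned = []
    for i, sample in enumerate(data_channel):
        ref_mean = sum(channel[i] for channel in reference_channels) / n_ref
        cleaned.append(sample - ref_mean)
    return cleaned


def refractory_violation_fraction(spike_times_s, refractory_s=0.001):
    """Fraction of interspike intervals shorter than `refractory_s`."""
    times = sorted(spike_times_s)
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    if not isis:
        return 0.0
    return sum(isi < refractory_s for isi in isis) / len(isis)


# Toy example: one of the four intervals (0.5 ms) is shorter than 1 ms.
fraction = refractory_violation_fraction([0.0, 0.0005, 0.1, 0.2, 0.3])
```

A well-isolated unit should yield a fraction close to zero, since a single neuron cannot fire twice within its refractory period.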
Data inclusion criteria.
Only neurons that were located in CM and that responded to at least one stimulus with a firing rate significantly different from the spontaneous rate in either the engaged or nonengaged condition were included in our analyses. Birds were trained and tested on a number of stimuli, but only responses to those training stimuli that were presented at least five times in both the engaged and nonengaged conditions are analyzed here, and only those neurons with at least two such stimuli are included. To facilitate analyses across subjects, neural data were analyzed at the scale of the motif. For neurons from birds trained with long songs (n = 24) there were on average 51.5 [standard deviation (SD) 3.2] motifs per neuron that reached the data inclusion criteria, and for neurons from birds trained with individual motifs (n = 41) the mean number of motifs was 3.4 (SD 2.0).
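The trial-count portion of these inclusion criteria can be expressed as a simple filter. The sketch below assumes per-stimulus trial counts are stored in dictionaries keyed by stimulus name; the function names and data layout are hypothetical, and the anatomical and responsiveness criteria are not covered here.

```python
def eligible_stimuli(engaged_counts, nonengaged_counts, min_trials=5):
    """Stimuli presented at least `min_trials` times in both conditions."""
    return sorted(
        stim
        for stim, n_engaged in engaged_counts.items()
        if n_engaged >= min_trials and nonengaged_counts.get(stim, 0) >= min_trials
    )


def meets_count_criterion(engaged_counts, nonengaged_counts,
                          min_trials=5, min_stimuli=2):
    """True if the neuron has at least `min_stimuli` eligible stimuli."""
    eligible = eligible_stimuli(engaged_counts, nonengaged_counts, min_trials)
    return len(eligible) >= min_stimuli


# Toy example with made-up counts: motif "C" fails in the engaged condition.
engaged = {"A": 7, "B": 5, "C": 3}
nonengaged = {"A": 12, "B": 9, "C": 10}
kept = eligible_stimuli(engaged, nonengaged)   # ["A", "B"]
```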
Spontaneous firing rate calculation.
Spontaneous firing rates were calculated during the intertrial intervals. Two estimates were taken for each stimulus presentation trial: one before the stimulus [from the start of recording until 250 ms before the stimulus start (mean duration 1.15 s)] and one after the end of the stimulus and trial reward/punishment phase (1 s in duration). Reported average spontaneous firing rates included both estimates from all trials in a given task engagement condition. Differences in spontaneous firing rates for a given neuron were determined by comparing the distribution of firing rates observed during the nonengaged condition to the distribution observed during the engaged condition with the Mann-Whitney U-test (α = 0.05). P values obtained from each neuron were corrected for multiple comparisons across neurons with the false discovery rate procedure (Benjamini 2001). To determine whether neurons in our population were more likely to show increases or decreases in spontaneous firing rate with task engagement, we compared the number of observed decreases and increases to the 95% confidence interval given by the binomial distribution under the assumption that increases and decreases were equally likely among the neurons showing a significant difference.
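The binomial comparison described above can be sketched as follows. This is a minimal illustration assuming an equal-tailed interval under a null probability of 0.5; the source does not state how the interval was constructed, and the function name `binomial_central_interval` is hypothetical.

```python
from math import comb


def binomial_central_interval(n, p=0.5, coverage=0.95):
    """Equal-tailed interval (lo, hi) for X ~ Binomial(n, p) such that
    P(X < lo) and P(X > hi) are each at most (1 - coverage) / 2."""
    tail = (1.0 - coverage) / 2.0
    pmf = [comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]

    # Smallest k whose lower cumulative probability exceeds the tail mass.
    lo, cumulative = 0, 0.0
    for k in range(n + 1):
        cumulative += pmf[k]
        if cumulative > tail:
            lo = k
            break

    # Largest k whose upper cumulative probability exceeds the tail mass.
    hi, cumulative = n, 0.0
    for k in range(n, -1, -1):
        cumulative += pmf[k]
        if cumulative > tail:
            hi = k
            break
    return lo, hi


# With 20 significant neurons, counts of increases between 6 and 14 are
# consistent with increases and decreases being equally likely.
lo, hi = binomial_central_interval(20)   # (6, 14)
```

An observed count of increases (or decreases) outside this interval would indicate a directional bias in the population.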
Determining significant responses to stimuli.
To determine whether a neuron's firing rate was significantly modulated by the presentation of a given motif, we compared the distribution of firing rates evoked by the motif in a given condition to the distribution of spontaneous firing rates in the same condition with the Mann-Whitney U-test (α = 0.05). P values were corrected for multiple comparisons of stimuli within a neuron with the false discovery rate procedure. Responses to motifs that were significantly above the spontaneous rate were termed “excitatory,” and responses to motifs that were significantly lower than the spontaneous rate were termed “suppressive.” Population comparisons between proportions of driven motifs across conditions were made with the Wilcoxon signed-rank test (α = 0.05). To determine whether single neurons were driven by a different number of motifs in the nonengaged and engaged conditions, we compared the proportion of motifs that evoked responses in the engaged condition to the upper and lower bounds of the 95% confidence interval (computed from the binomial) around the proportion of motifs that evoked responses in the nonengaged condition.
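The excitatory/suppressive labeling described above can be sketched as follows. This is an illustrative approximation, not the original analysis code: it uses the normal approximation to the Mann-Whitney U distribution without tie or continuity corrections, it omits the false discovery rate correction, and the function names and the "not driven" label are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided P) for samples x and y (normal approximation)."""
    # U counts pairs in which x exceeds y; ties contribute one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n1, n2 = len(x), len(y)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    # Two-sided tail probability of a standard normal deviate.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


def classify_response(driven_rates, spontaneous_rates, alpha=0.05):
    """Label a motif response relative to the spontaneous rate."""
    u, p = mann_whitney_u(driven_rates, spontaneous_rates)
    if p >= alpha:
        return "not driven"
    # U above its null mean means driven rates tend to exceed spontaneous rates.
    expected_u = len(driven_rates) * len(spontaneous_rates) / 2.0
    return "excitatory" if u > expected_u else "suppressive"


# Hypothetical firing rates (sp/s) for one motif and for intertrial intervals.
print(classify_response([12, 15, 11, 14, 13, 16], [2, 3, 1, 4, 2, 3]))
```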
Firing rate changes due to task engagement.
We compared changes in firing rates evoked by each motif presented to each neuron in the engaged and nonengaged conditions, using the Mann-Whitney U-test (α = 0.05, corrected for the false discovery rate). Because in most cases the nonengaged rate was computed by averaging rates across nonengaged blocks that preceded and followed the engaged block, any nonspecific effect of block order, such as a slow increase or decrease in rate, would work against our observation of a task-engagement effect. To rule out nonspecific effects of ordering more directly, we also compared the spike rates evoked in each neuron by each motif between two nonengaged blocks separated by an engaged block. We treated these three blocks as three levels of a single factor in a one-way ANOVA, looked for a significant main effect across blocks (P < 0.05), and then used post hoc comparisons [Tukey's honestly significant difference (HSD)] to compare firing rates in each block. As evidence of a nonspecific effect of block ordering, we looked for instances where either 1) rates differed significantly between the two nonengaged blocks, and both nonengaged rates differed significantly but in opposite directions from the engaged rate, or 2) rates differed significantly between the two nonengaged blocks, and only one differed significantly from the engaged rate. The same control analysis was done for the smaller subset of neurons where we recorded two engaged blocks separated by an intervening nonengaged block. Both sets of control analyses revealed little support for ordering effects.
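The two criteria for a block-ordering effect can be written as a small decision rule. The sketch below assumes the post hoc comparisons have already been reduced to signed outcomes (+1 or −1 for a significant difference in one direction or the other, 0 for no significant difference); this encoding and the function name are hypothetical.

```python
def block_order_effect(pre_vs_post, pre_vs_engaged, post_vs_engaged):
    """Return True if the post hoc pattern suggests a nonspecific ordering effect.

    Each argument is a hypothetical signed summary of one post hoc comparison:
    +1 (first block significantly higher), -1 (significantly lower), 0 (n.s.).
    """
    # Both criteria require the two nonengaged blocks to differ significantly.
    if pre_vs_post == 0:
        return False
    pre_differs = pre_vs_engaged != 0
    post_differs = post_vs_engaged != 0
    # Criterion 1: both nonengaged blocks differ from the engaged block,
    # but in opposite directions.
    if pre_differs and post_differs and pre_vs_engaged == -post_vs_engaged:
        return True
    # Criterion 2: only one nonengaged block differs from the engaged block.
    return pre_differs != post_differs


# A drifting rate (pre > engaged > post) is flagged under criterion 1.
print(block_order_effect(+1, +1, -1))
```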
Selectivity.
Stimulus selectivity describes a neuron's tendency to fire different numbers of action potentials in response to the presentation of different stimuli and is useful for describing the response characteristics of a neuron with respect to nonparametric stimuli. To investigate stimulus selectivity in our data set, we used a variant of the activity fraction as described by Vinje and Gallant (2000; see also Rolls and Tovee 1995), given by
$$S = \frac{1 - \left(\sum_{i=1}^{n} r_i / n\right)^2 \Big/ \sum_{i=1}^{n} \left(r_i^2 / n\right)}{1 - 1/n}$$
where r_i is the neuron's mean response to the ith stimulus (motif) and n is the number of stimuli presented to the neuron. This measure varies between 0 and 1; neurons with selectivity values near 0 will generally have similar responses to many stimuli, while neurons with selectivity values near 1 respond to fewer stimuli (but at least 1). We calculated the selectivity for each neuron in each behavioral condition and compared the engaged and nonengaged conditions across neurons with a paired t-test.
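A minimal sketch of the selectivity computation follows. It assumes the standard Vinje and Gallant (2000) form, S = (1 − A)/(1 − 1/n), with activity fraction A = (Σ r_i/n)² / Σ(r_i²/n); the function name and the example rates are hypothetical.

```python
def selectivity(mean_rates):
    """Selectivity index from a neuron's mean response to each of n motifs.

    Assumption: S = (1 - A) / (1 - 1/n), where A is the activity fraction
    (squared mean response divided by mean squared response).
    """
    n = len(mean_rates)
    if n < 2:
        raise ValueError("selectivity is undefined for fewer than two stimuli")
    mean_square = sum(r * r for r in mean_rates) / n
    if mean_square == 0:
        raise ValueError("selectivity is undefined when all responses are zero")
    mean_rate = sum(mean_rates) / n
    activity_fraction = mean_rate ** 2 / mean_square
    return (1.0 - activity_fraction) / (1.0 - 1.0 / n)


# Hypothetical mean firing rates (sp/s) for four motifs in one condition.
print(selectivity([8.0, 1.0, 0.5, 0.5]))
```

Equal responses to every motif give S = 0, and a response to exactly one motif gives S = 1, matching the bounds stated in the text.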
Variability.
One important measure of neural responsiveness is the reliability with which a neuron responds to repeated presentations of the same stimulus. We used the coefficient of variation (CV) to measure this trial-to-trial variability. The CV is equal to the SD of the firing rate across individual trials divided by the mean firing rate across those trials. Normalization by the mean allows comparisons across motifs that evoke different mean firing rates. The CV was calculated for each motif presented to a neuron, and a separate value was calculated for each behavioral condition. Motifs that elicited no spikes in any trial within a condition were not included in the analyses, and only neurons with CV calculated for at least four motifs in both conditions were analyzed, to allow statistical testing. To determine whether individual neurons showed increases or decreases in CV with behavioral state, we compared the distribution of each neuron's CV values over all motifs during the nonengaged condition to the distribution of CVs observed during the engaged condition, using a Wilcoxon signed-rank test. Additionally, a mean CV was calculated for each neuron in each condition by averaging the CVs across individual stimuli to obtain a mean nonengaged and a mean engaged CV for each neuron. The distribution of mean nonengaged CVs was then compared to the distribution of mean engaged CVs with a paired t-test. We additionally analyzed the population-level CV for each condition by calculating firing rates using submotif stimulus-response bins that were smaller than those given by motif boundaries (10, 20, 50, 100, 200 ms). Results for these population-level analyses were the same as when we used stimulus responses based on motif boundaries.
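The per-motif CV and the per-neuron mean CV can be sketched as below. The text does not state whether the sample or the population SD was used, so the sample SD is an assumption here; the function names and example data are hypothetical, and the requirement of at least four motifs per condition is not enforced.

```python
import statistics


def coefficient_of_variation(trial_rates):
    """CV of firing rates across trials for one motif, or None if undefined.

    Assumption: sample SD (n - 1 denominator). Motifs that evoked no spikes
    in any trial have a mean of zero and are skipped, as in the text.
    """
    mean_rate = statistics.mean(trial_rates)
    if mean_rate == 0:
        return None
    return statistics.stdev(trial_rates) / mean_rate


def mean_cv(rates_by_motif):
    """Average the defined per-motif CVs for one neuron in one condition."""
    cvs = [coefficient_of_variation(rates) for rates in rates_by_motif.values()]
    defined = [cv for cv in cvs if cv is not None]
    return statistics.mean(defined)


# Hypothetical trial-by-trial firing rates (sp/s) for three motifs.
nonengaged = {"motif_1": [4, 9, 2, 7, 3], "motif_2": [1, 0, 3, 0, 1], "motif_3": [0, 0, 0, 0, 0]}
print(mean_cv(nonengaged))
```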
Receiver operating characteristic.
Two receiver operating characteristic (ROC) curves were plotted for each neuron, one for each condition. For a given condition, we divided all of the trials presented to the neuron into class 1 (“go” stimuli for birds trained on GNG tasks or “left” stimuli for birds trained on 2AC tasks) or class 2 (“nogo” stimuli for birds trained on GNG tasks or “right” stimuli for birds trained on 2AC tasks). The firing rate distribution for the class with the higher mean was selected as the “target” class, because we were concerned only with discriminability and not with the directionality of the discrimination. We then stepped through all observed firing rate values, treating each one as a separate threshold, and plotted the proportion of false positive firing rate values (proportion of firing rates observed from the nontarget class that were above the current threshold) on the x-axis against the proportion of true positive firing rate values (proportion of firing rates observed from the target class that were above the current threshold) on the y-axis. We then calculated the area under this curve (AUC) as our dependent measure. The AUC represents the probability that one can correctly classify a pair of observed firing rates evoked by a class 1 and a class 2 motif. To determine whether a neuron's AUC in a given condition was different from that expected by chance, AUCs were calculated on distributions with shuffled trial-class relationships. That is, we randomly assigned each observed firing rate to either class 1 or class 2, regardless of the class of the stimulus that actually elicited it, and calculated the AUC for these distributions. This was repeated 1,000 times in order to build a null distribution. If the AUC calculated from the distributions with the proper trial-class relationships fell above the 95th percentile of this null distribution, it was considered a significant AUC.
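The AUC and its shuffle test can be sketched as follows. Instead of stepping through thresholds, the sketch uses the pairwise-comparison form of the AUC (ties counted as one half), which equals the trapezoidal area under the threshold-stepped ROC curve. Three points are assumptions: the same target-class rule is reapplied to every shuffle, shuffles preserve the original class sizes, and "above the 95th percentile" is implemented as fewer than 5% of null AUCs reaching the observed value. All names are hypothetical.

```python
import random


def pairwise_auc(target, nontarget):
    """Probability that a target-class rate exceeds a nontarget-class rate."""
    wins = sum(1.0 if t > n else 0.5 if t == n else 0.0
               for t in target for n in nontarget)
    return wins / (len(target) * len(nontarget))


def class_auc(class1, class2):
    """AUC with the higher-mean class treated as the target class."""
    mean1 = sum(class1) / len(class1)
    mean2 = sum(class2) / len(class2)
    if mean1 >= mean2:
        return pairwise_auc(class1, class2)
    return pairwise_auc(class2, class1)


def auc_is_significant(class1, class2, n_shuffles=1000, seed=0):
    """Return (observed AUC, True if it exceeds the shuffled null)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = class_auc(class1, class2)
    pooled = list(class1) + list(class2)
    n1 = len(class1)
    reached = 0
    for _ in range(n_shuffles):
        # Break the trial-class relationship, keeping the class sizes.
        rng.shuffle(pooled)
        if class_auc(pooled[:n1], pooled[n1:]) >= observed:
            reached += 1
    return observed, reached / n_shuffles < 0.05


# Hypothetical firing rates (sp/s) for "go" and "nogo" trials.
go_rates = [9, 11, 8, 12, 10, 13, 9, 11]
nogo_rates = [4, 6, 5, 7, 3, 6, 5, 4]
print(auc_is_significant(go_rates, nogo_rates))
```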
RESULTS
The analyses below include 65 well-isolated single units recorded from eight adult starlings. To these neurons, we presented an average of 3.6 (SD 1.6) training stimuli an average of 42.3 (SD 30.4) times each in the nonengaged condition and 30.4 (SD 27.0) times each in the engaged condition. Average behavioral performance classifying the training stimuli for the 2,000 trials preceding microdrive implant was 91% correct for birds trained with the GNG procedure and 85% for those trained with the 2AC procedure. All subjects maintained high levels of performance during neural recording sessions; the mean behavioral performance during the trials included in the neural analyses below was 87% correct for birds performing GNG tasks and 90% correct for birds performing 2AC tasks, suggesting that they tolerated the implants well. We confirmed that birds were not attempting to engage in the task during the nonengaged condition with video monitoring and by recording all pecks into response ports (see methods). We detected pecks during nonengaged trials for only 17/65 (26%) of recorded neurons, and for only 3 of these neurons were pecks recorded in >5% of nonengaged trials. Trials in which pecks were detected are excluded from the analyses below, but including these trials did not change any of the results.
Task Engagement Changes Average Firing Rates in Some CM Neurons
Median spontaneous firing rates measured during intertrial intervals did not differ between the nonengaged [1.43 spikes per second (sp/s), quartiles 0.72, 3.30] and engaged (1.48 sp/s, quartiles 0.69, 3.43) conditions at the population level (Wilcoxon signed-rank test, P = 0.20). However, at the level of individual neurons, 30/65 (46%) neurons showed a significant increase in their spontaneous firing rate in the engaged condition compared with the nonengaged condition, and 7 (11%) showed a significant decrease (Mann-Whitney U-test, P < 0.05 corrected for false discovery rate). These increases and decreases are not distributed evenly [probability of observing 7 decreases < 0.001, binomial distribution B(37,0.5)], suggesting that while there is a broad range of spontaneous firing rate differences induced by task engagement, CM neurons are more likely to increase, rather than decrease, their spontaneous firing rate with task engagement.
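The binomial probability quoted above can be checked directly. The sketch below computes the lower tail of B(37, 0.5) at 7 decreases, assuming the quoted probability refers to observing 7 or fewer decreases among the 37 neurons with a significant change; the function name is hypothetical.

```python
from math import comb


def binomial_lower_tail(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


# 7 decreases among the 37 neurons that changed spontaneous rate significantly.
p_tail = binomial_lower_tail(7, 37)
print(p_tail < 0.001)
```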
Neurons also tended to increase their stimulus-driven firing rates when birds were engaged in a task (median firing rate 2.42 sp/s, quartiles 0.70, 6.41) compared with the nonengaged trials (median firing rate 1.99 sp/s, quartiles 0.81, 6.18), though the difference in median stimulus-driven firing rates averaged across all the stimuli presented to all neurons did not meet our criterion for significance at the population level (Wilcoxon signed-rank test, P = 0.056). At the individual neuron level, 6/65 (9%) neurons showed a significant increase in driven firing rate averaged across all stimuli in the engaged condition and 6/65 (9%, see Fig. 4B for an example) showed a significant decrease (Wilcoxon signed-rank test, P < 0.05 corrected for false discovery rate). Although the population-level effect for the differences in driven firing rates seems stronger than the population-level effect for changes in spontaneous firing rates, fewer individual neurons show differences in driven firing rates than show differences in spontaneous firing rates. This is likely because the distribution of driven firing rates is broader than the distribution of spontaneous firing rates (note in both conditions the similar lower quartiles between the driven and spontaneous firing rates and the larger upper quartiles for the driven firing rates). The greater variability of the driven firing rate distributions makes small differences harder to detect in them than in the narrower spontaneous firing rate distributions. Taken together, these results show that CM neurons can show both generalized increases and decreases in spontaneous and stimulus-driven firing rates based on task engagement.
CM Neurons Respond to Fewer Stimuli During Task Engagement
As in anesthetized starlings (Gentner and Margoliash 2003; Jeanne et al. 2011; Meliza et al. 2010), CM neurons in awake birds responded to auditory stimulation with both increases and decreases in their firing rates relative to baseline levels. The responses of a typical neuron to two training stimuli are shown in Fig. 3A. To examine these responses in greater detail, we divided the training stimuli into their constituent motifs and determined for each neuron which motifs evoked a significant change in firing rate relative to spontaneous rate (Fig. 3A). Responses were designated as either excitatory for stimuli that evoked a response above the neuron's spontaneous rate or suppressive for stimuli that lowered the firing rate below the spontaneous rate. This was repeated for both the nonengaged and engaged task conditions.
We quantified these effects by calculating the proportion of motifs driving significant excitatory or suppressive responses in each condition for each neuron. To control for poor estimates of proportionality that come from discretization due to small numbers of stimuli, we restricted this analysis to only those neurons presented with at least four motifs (n = 38). The median proportion of motifs that evoked firing rates significantly different (either excitatory or suppressive) from the spontaneous rate was 0.60 (quartiles 0.43, 0.96) in the nonengaged condition and 0.50 (quartiles 0.22, 0.86) in the engaged condition. This difference is significant (Wilcoxon signed-rank test, P = 0.003) and remains significant if we include all neurons regardless of the number of stimuli presented (Wilcoxon signed-rank test, P = 0.02). Figure 3B depicts these proportions graphically for all neurons in both nonengaged (Fig. 3B, top) and engaged (Fig. 3B, bottom) conditions.
By subtracting the proportion of significantly driven motifs in the nonengaged condition from that observed in the engaged condition, we calculated differences in the proportion of driven motifs due to task engagement. The distribution of these differences across the population (median −0.07, quartiles −0.26, 0.00; see Fig. 3C, top) is significantly less than zero (Wilcoxon signed-rank test, P = 0.004), indicating that CM neurons respond to fewer motifs when birds are engaged in an auditory recognition task. When we asked whether individual neurons showed differences in the proportion of motifs that elicited a response in the two conditions, we found that 17/38 (45%) of neurons showed such a difference, with 15/17 (88%) of these neurons responding to fewer motifs in the engaged condition and 2/17 (12%) responding to more motifs in the engaged condition (binomial test per neuron, α = 0.05), suggesting that it was more likely that a given CM neuron responded to fewer motifs while a bird was engaged in a task than it did when the bird was not engaged.
CM Neurons Are Excited by Fewer Stimuli During Task Engagement
The observed reduction in the number of motifs that drive significant responses in CM neurons during task engagement may come from a decrease in the number of motifs that evoke excitatory responses or an increase in the number of motifs that suppress responding. To distinguish between these possibilities, we separately compared the proportions of motifs driving excitatory and suppressive responses in the engaged and nonengaged conditions (again restricting our analysis to neurons with at least 4 motifs presented in each behavioral condition, n = 38). The difference between the proportion of motifs that excited the neuron in the engaged condition and the nonengaged condition was significantly different from zero (median difference = −0.02, quartiles −0.13, 0.00; Wilcoxon signed-rank test, P = 0.008; Fig. 3C, middle). Thus CM neurons are excited by fewer stimuli in the engaged condition than in the nonengaged condition. The difference between the proportion of motifs that suppressed the neuron in the engaged and nonengaged conditions was not different from zero (median difference = 0.00, quartiles −0.06, 0.00; Wilcoxon signed-rank test, P = 0.8; Fig. 3C, bottom). Thus, across the population of neurons, the decrease in the number of motifs that evoke CM responses during task engagement comes from a decrease in net excitation rather than an increase in suppression.
Task Engagement Changes Motif-Evoked Firing Rates in Most CM Neurons
Although only a minority of CM neurons show a uniform change in firing rate for all stimuli, averaging over all of the motifs presented to a neuron can mask significant variability in the response to different stimuli. Many individual neurons show differences in the driven firing rate in response to one or more motifs. Figure 4A, left, shows a plot of one neuron's average responses to individual motifs in both nonengaged and engaged conditions for six training stimuli. For most motifs the neuron responds similarly in the two conditions, but for motif 4 the firing rate is significantly higher when the bird is engaged in the task and for motif 5 the neuron's firing rate is significantly lower during task engagement. Figure 4A, right, shows a comparison of the firing rates in each condition, and a histogram of these differences at top right shows that despite the changes in responses to individual stimuli there is no systematic increase or decrease in firing rate due to task engagement (Wilcoxon signed-rank test, P = 0.44). Figure 4B depicts another neuron, which shows a distribution of differences in the driven firing rate between the two conditions that is significantly shifted from zero (Wilcoxon signed-rank test, P < 0.001), indicating that this is one of the six neurons for which the mean driven firing rate shows a more general decrease with task engagement. Even for this neuron, however, not every motif evoked a different firing rate in the two conditions. Thus firing rate changes across the two conditions are not uniform for all motifs.
To characterize this response heterogeneity more fully, we asked how many neurons showed a difference in firing rate between the engaged and nonengaged conditions for any single motif. We found that 31 of 65 (48%) neurons showed significantly different firing rates between the two conditions for at least one individual motif. Ten neurons showed only decreases in the response to single motifs with task engagement, 13 showed only increases, and 8 showed both increases and decreases. These neurons are represented in Fig. 4C, where the proportion of increases (engaged firing rate higher) and decreases (nonengaged firing rate higher) due to task engagement can be seen for each neuron. For neurons that had differences in at least one motif and that were presented with at least four motifs (n = 17), the median proportion of motifs that showed modulation in the firing rate between the two conditions was 0.33 (quartiles 0.13, 0.55). For each of these neurons, we subtracted the proportion of stimuli that were higher in the engaged condition from the proportion that were higher in the nonengaged condition and compared these values across the population of neurons (Fig. 4D). This difference was not significant across the population (Wilcoxon signed-rank test, P = 0.55), indicating that there was not a consistent tendency to have significant increases in one or the other behavioral condition. In all, 39 of 65 (60%) neurons showed a difference in firing rate between the engaged and nonengaged conditions, either at the level of a single motif (31/65, 48%) or at the level of a general facilitation or suppression across all stimuli (12/65, 18%).
To confirm that the observed changes in stimulus-evoked spike rates were not attributable to a time-varying parameter other than behavioral state, we compared firing rates across the nonengaged recording blocks that preceded and followed each engaged block (see methods). In 28 of the 31 neurons in which the firing rate for at least one motif differed between engaged and nonengaged conditions, the mean firing for those motifs did not differ significantly between the pre- and postengaged blocks (ANOVA, Tukey's HSD post hoc, P > 0.05 all cases). Thus the observed change in spike rate is tied directly to the change in task engagement. In the remaining three neurons, we found a small number of motifs (3 of 23, 1 of 5, and 3 of 3, respectively) where the firing rates differed significantly between the pre- and postengaged blocks and moved in opposite directions relative to the intervening engaged block. Thus, for these seven motifs, we could not rule out the possibility that temporal ordering effects might explain the observed change in spike rate across the task engagement conditions. None of the 12 neurons showing a general facilitation or suppression of firing rates with task engagement was affected by the ordering of behavioral state. Thus most spike rates were stable across multiple, nonsequential behavioral conditions, and among the very few neurons that did show instabilities between conditions they were not uniform across all stimuli. The relative timing or ordering of recording sessions cannot account for the observed changes in firing rates.
Task Engagement Increases Stimulus Selectivity
Stimulus selectivity values ranged from 0.02 to 0.80 in the nonengaged condition (mean 0.36, SD 0.25), suggesting a wide range of representational densities; some neurons responded similarly to most stimuli, while others responded to only a few stimuli. Across the population, neurons showed a modest but significant increase in selectivity during task engagement (range 0.02–0.89, mean 0.39, SD 0.26) relative to the nonengaged condition (paired t-test, P = 0.04). Figure 5A shows these selectivity values for all neurons with at least four stimuli (n = 38) and depicts a distribution of the differences between conditions in the histogram at top right. The tendency for neurons to respond to fewer stimuli in the engaged condition as reported above, and to respond to those fewer stimuli with higher firing rates, leads to a general increase in selectivity at the population level and a modest sparsening of the stimulus representation.
Task Engagement Decreases Response Variability
We measured trial-to-trial response variability, using the CV (see methods). We found that in 6/37 neurons (16%) the CV was lower in the engaged condition, while only 1/37 neurons (3%) showed a higher CV in the engaged condition (Wilcoxon signed-rank test, P < 0.05 corrected for false discovery rate). When we averaged the CV values for all motifs presented to a given neuron and compared this across the engaged and nonengaged conditions for all neurons (Fig. 5B), we found that the CV was significantly higher in the nonengaged condition (mean 1.46, SD 1.18) than in the engaged condition (mean 1.22, SD 0.84; paired t-test, P = 0.001). Analyzing responses of neurons to submotif stimulus time bins instead of responses based on motif boundaries yielded similar results (see methods). Thus actively engaging in a task reduces the trial-to-trial variability in the spike rates of CM neurons, in principle permitting a more reliable representation of the stimulus.
Task Engagement Improves Stimulus Class Discrimination
Up to this point, our results describe a population of neurons that modify their activity in response to task engagement by responding to fewer stimuli and responding to those stimuli more strongly and with increased reliability. There are likely many top-down factors contributing to these phenomena that can be attributed to the animal's behavioral goals. In this task, birds need to discriminate between two classes of stimuli: either they must discriminate between stimuli that require a response (go stimuli) and those that do not (nogo stimuli) or they must discriminate between stimuli associated with a left response and stimuli associated with a right response. If the top-down mechanisms that birds employ when engaged in the task are affecting CM neurons, we hypothesized that they might be doing so in a way that increases the discriminability of the neural representation of the stimulus classes upon which the animal is performing the task. To test this hypothesis, we calculated the ROC curve for each behavioral condition in each neuron (Fig. 6, A and B). The area under the ROC curve (AUC) describes how well an ideal observer can discriminate between the two stimulus classes given the neuron's firing rate. In our present formulation, it represents the probability that an observer is able to use a neuron's distribution of firing rates to correctly categorize a pair of responses when one comes from a class 1 motif and the other from a class 2 motif. If neurons have a greater AUC in the engaged condition than they do in the nonengaged condition, it would demonstrate that there was more discriminative power in the neurons' responses when a bird was engaged in a task.
When we compared the AUC values during nonengaged recording epochs to engaged epochs, we found that CM neurons tended to have higher AUCs during task engagement (median 0.59, quartiles 0.54, 0.69) than during passive listening (median 0.58, quartiles 0.53, 0.66), though this difference was not statistically significant (Wilcoxon signed-rank test, P = 0.082). We then asked how many neurons could classify class 1 and class 2 stimuli significantly better than chance by comparing the measured AUC to a null distribution of AUCs created from the same data but with the associations between firing rate and stimulus class randomized. By this measure, 31/65 (48%) neurons had AUCs significantly above chance values in both the nonengaged and engaged conditions; that is, an observer could use the neuron's firing rate distributions to correctly determine motif classes at better than chance performance. For these neurons, we found the AUC values during engaged epochs (median 0.67, quartiles 0.59, 0.84) to be significantly higher than those calculated from nonengaged recording epochs (Fig. 6C; median 0.61, quartiles 0.55, 0.72; Wilcoxon signed-rank test, P = 0.002). From these analyses, we conclude that neurons that can discriminate between class 1 and class 2 stimuli can do so better when the animal is engaged in an auditory recognition task. In other words, engaging in auditory recognition increases the amount of information about the animal's task that is represented by these CM neurons.
DISCUSSION
Telencephalic sensory systems provide representations of the external sensory world that can be modulated by the real-time behavioral demands of an organism. This report describes techniques that allowed us to determine the effects of task engagement on neurons in the starling auditory system by recording single-unit activity in the CM in awake, behaving European starlings while they performed auditory recognition tasks using complex natural stimuli. We found that engaging in a task that requires using the auditory information present in complex natural stimuli modulates the responses of the majority of the neurons in starling CM. Of the 65 single units from which recordings were made, 39 (60%) showed significant differences in the stimulus-driven firing rate due to the behavioral state of the animal. We also observed a reduction in the proportion of stimuli that excite neurons above baseline, accompanied by an increase in selectivity and a decrease in response variability. In addition, for those neurons that reliably encode the behavioral class of (i.e., correct response to) the training stimuli, behavioral engagement increased the information about the stimulus classification that these neurons carried. Changing the behavioral goal of the animal alters the neural representation of sensory stimuli in CM neurons in real time.
Although studies of auditory-evoked neural activity in awake starlings have been performed for over three decades (Kirsch et al. 1980), most have made recordings in passively listening birds from the primary thalamorecipient region field L (but see Meliza et al. 2010 for recordings from awake, passively listening, starlings in medial CM) and were aimed at mapping response properties (Capsius and Leppelsack 1996, 1999; Cousillas et al. 2005) and neural correlates of perceptual phenomena such as categorization of natural sounds (Hausberger et al. 2000) and stream segregation (Bee et al. 2010; Bee and Klump 2004, 2005; Itatani and Klump 2009, 2011). As far as we are aware, this is the first report of recordings from auditory-responsive neurons made in awake songbirds engaged in a controlled auditory-mediated behavioral task. These awake-behaving recordings were enabled by the microdrive described here. Combined with commercially available microelectrode arrays, this drive provides a robust recording system that allows for the isolation of high-quality single units in awake, behaving animals. A motorized version of the microdrive is under development and will allow remote positioning of the electrode array and limit the need to physically handle the animal. These recording techniques will require less experimenter intervention and ultimately allow a broader range of physiological experimentation in the context of increasingly complex behaviors.
At least two previous studies have attempted to make comparisons of responses in the starling auditory forebrain across behavioral states. Recording from neurons in field L, Capsius and Leppelsack (1996) reported a decrease in the spontaneous rate of multineuron and single-neuron activity in the auditory forebrain of anesthetized compared with awake starlings. Meliza et al. (2010) did not see this general decrease in spontaneous firing rate with anesthesia in medial CM, although they did see differences in spike timing precision and facilitative interactions in awake compared with anesthetized birds. In agreement with the latter study, we observed no uniform change of the population average spontaneous firing rate between behavioral states. Although one should be careful of conflating changes in anesthesia with changes in natural behavior states, these reported differences may reflect differential regulation of spontaneous firing rate in field L and CM. One advantage of our data set is that we track changes in the same neurons across behavioral state, whereas the other studies recorded from different neurons in the two behavioral conditions. The fact that we observe one subpopulation that increases spontaneous firing rate and another that decreases spontaneous rate indicates that behavioral state modulation, at least in CM, may be different for different classes of neurons. Determining whether these two subpopulations correspond to physiologically different neuronal subtypes will require additional study.
In the Meliza et al. (2010) study, responses of CM neurons to motifs in awake-restrained and anesthetized starlings are described as being additive and suppressive combinations of responses to motif subcomponents called “notes.” The authors report that there are more facilitative interactions between combinations of these notes when birds are awake than when they are anesthetized and suggest that this should lead to increased stimulus selectivity. While they observe no change in selectivity across behavioral state using the same selectivity measure we used here, they suggest that the broad range of selectivity values measured and the relatively low sample size might not allow them to observe these selectivity effects. We too observed a broad range of selectivity values across neurons in our data set, but because we measure selectivity in the same neurons under two behavioral conditions the between-neuron variability can be removed from the analysis. Doing this reveals a somewhat modest, but consistent, increase in selectivity that coincides with task engagement.
The observed selectivity differences are due to differential modulation of responses to different motifs in the two behavioral conditions. Although we found no uniform difference in spontaneous or driven responses across all neurons due to task engagement, some individual neurons displayed systematic increases (6/65 neurons, 9%) or decreases (6/65 neurons, 9%) in driven firing rate as a result of task engagement. Moreover, about half (31/65, 48%) of the neurons in CM showed a firing rate difference for at least one motif between behavioral conditions. These differences were equally distributed between increases and decreases, demonstrating that modulatory mechanisms can act selectively on certain acoustic features while leaving others unchanged. This agrees with previous conclusions (Meliza et al. 2010) that CM neurons receive a wide variety of inputs tuned to complex motif subcomponents like notes and their complex receptive fields are built up by suppressive and facilitative combinations of these inputs. Although our knowledge of CM receptive fields is impoverished, the fact that response changes with task engagement are often motif specific suggests that individual inputs (or sets of inputs) to CM neurons may be independently altered in the short term by behavioral tasks and/or goals. Moreover, because the discriminability of responses evoked by different motifs improves during task engagement, but only in the subset of neurons that discriminate between behavioral classes to begin with, the modulatory mechanisms invoked by task engagement also appear able to target specific subsets of neurons.
The proportion of stimuli that neurons responded to decreased with task engagement, and this change was due to a reduction in the ability of stimuli to drive neurons above their baseline firing rates. This effect might be caused by a reduction in feedforward excitatory drive, a general increase in inhibitory drive that affects excitatory responses to a greater degree than suppressive responses, or an enhancement of the suppressive interactions between features that drive inputs onto CM neurons. Enhanced inhibition or suppression may explain, at least in part, the observed reduction in trial-to-trial firing rate variability. However, a global increase in inhibitory drive during the engaged condition is not supported by changes in spontaneous firing rate or by the trend toward higher driven firing rates, which should both be uniformly lower in such a scenario. A general caveat to these interpretations is that the effects of suppression are hard to measure in the extracellular responses of neurons that have low firing rates, and so may be underrepresented here. To adequately determine the relative contributions of inhibitory and excitatory drive to tuning properties and to the effects of behavioral engagement, it will be necessary to develop intracellular recording techniques that allow the monitoring of subthreshold currents in behaving animals.
Short-term modulation of auditory cortical responses across behavioral states has been observed in other systems, but its explicit functions remain poorly understood. In ferrets and cats, receptive fields of auditory cortical neurons shift with task engagement toward stimulus features that are diagnostic for the task at hand (Fritz et al. 2003, 2007; Lee and Middlebrooks 2011). In rats, responses of auditory cortical neurons to distracting, broadband clicks that precede tones are suppressed when rats are actively detecting the tones (Otazu et al. 2009). The present results extend this prior work to include songbirds and complex natural stimuli and tie these behavioral-state-dependent changes directly to the discriminability of the stimulus-driven neural responses. We speculate that the changes in neural responsiveness observed here are the product of short-term modulatory processes such as attention and ultimately contribute to the enhanced representations of learned acoustic material observed throughout CM at longer timescales (Gentner and Margoliash 2003; Jeanne et al. 2011). Activation of neuromodulatory centers associated with arousal (Froemke et al. 2007; Kilgard and Merzenich 1998; McLin et al. 2002; Metherate and Weinberger 1990) or reward (Bao et al. 2001) has been implicated in experience-dependent plasticity of auditory cortices, and receptive field changes in single neurons that result from stimulation of the nucleus basalis coincident with auditory signal presentation can last for multiple hours (Froemke et al. 2007). Now that short-term modulation of natural stimulus responses has been demonstrated, it will be important for future studies to directly control selective attention to specific acoustic features of the same stimulus, and to track changes in the response of single neurons (and populations) over the course of recognition learning.
In addition to the modulatory roles of internal processes such as attention and learning, the overt structure of the task is likely to have significant effects on neuronal encoding as well. We observed general effects of task engagement across different behavioral tasks (2AC, GNG) and stimuli (long songs, single motifs), but it would not be surprising if differences in training protocols (even those employed in the present study) led to differential task-related modulation. The reward contingencies differ dramatically between the 2AC and GNG procedures, and this may have a strong effect on the motivation of the animals during task engagement. Although the present experimental design lacked sufficient statistical power to test the independent contribution that different training might make to the observed task-engagement effects, we note that different training contingencies have been linked to CM neuron response differences in anesthetized birds (Gentner and Margoliash 2003). Thus it seems reasonable to predict that similar differences could be observed in awake birds given the appropriate design in future studies. Importantly, because the training conditions and stimulus sets were held constant within each neuron, differences between training and/or stimuli cannot explain the reported effects of task engagement within any single neuron.
The results of the present study highlight the importance of precise control over each animal's behavior during awake physiological recording. While it is clear that moving away from anesthetized preparations is a necessary step for systems neurophysiology, the benefits of making this transition are not likely to be realized by simply removing anesthesia. As soon as one wakes up an experimental subject, one must contend with a large number of previously fixed and unknown internal variables. Our results demonstrate that simple engagement in an auditory recognition task can produce significant changes in how neurons encode natural acoustic stimuli, but we note that this manipulation, while effective, leaves most of the ethologically relevant internal variables only loosely controlled at best. Although the results are interesting in their own right, showing how the auditory system can modify its representations in real time, they are also somewhat daunting. They remind us that providing a full account of auditory cortex will require tasks that bring stimulus type, reward contingencies, working memory loads, attention, and learning under exquisite behavioral control.
GRANTS
This work was supported by National Institute on Deafness and Other Communication Disorders Grant DC-008358.
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the author(s).
AUTHOR CONTRIBUTIONS
Author contributions: D.P.K. and T.Q.G. conception and design of research; D.P.K. performed experiments; D.P.K. analyzed data; D.P.K. and T.Q.G. interpreted results of experiments; D.P.K. prepared figures; D.P.K. drafted manuscript; D.P.K. and T.Q.G. edited and revised manuscript; D.P.K. and T.Q.G. approved final version of manuscript.
ACKNOWLEDGMENTS
The authors acknowledge David Malmberg for his helpful advice and technical expertise in designing and improving the novel microdrive, as well as the members of the Gentner Lab and two anonymous reviewers for helpful comments on the manuscript.
REFERENCES
- Bao S, Chan VT, Merzenich MM. Cortical remodelling induced by activity of ventral tegmental dopamine neurons. Nature 412: 79–83, 2001
- Bee MA, Klump GM. Primitive auditory stream segregation: a neurophysiological study in the songbird forebrain. J Neurophysiol 92: 1088–1104, 2004
- Bee MA, Klump GM. Auditory stream segregation in the songbird forebrain: effects of time intervals on responses to interleaved tone sequences. Brain Behav Evol 66: 197–214, 2005
- Bee MA, Micheyl C, Oxenham AJ, Klump GM. Neural adaptation to tone sequences in the songbird forebrain: patterns, determinants, and relation to the build-up of auditory streaming. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 196: 543–557, 2010
- Beecher MD, Campbell SE, Burt JM. Song perception in the song sparrow: birds classify by song type but not by singer. Anim Behav 47: 1343–1352, 1994
- Benjamini Y. The control of the false discovery rate in multiple testing under dependency. Ann Stat 29: 1165–1188, 2001
- Butler AB, Reiner A, Karten HJ. Evolution of the amniote pallium and the origins of mammalian neocortex. Ann NY Acad Sci 1225: 14–27, 2011
- Capsius B, Leppelsack HJ. Influence of urethane anesthesia on neural processing in the auditory cortex analogue of a songbird. Hear Res 96: 59–70, 1996
- Capsius B, Leppelsack HJ. Response patterns and their relationship to frequency analysis in auditory forebrain centers of a songbird. Hear Res 136: 91–99, 1999
- Catchpole CK, Slater PJ. Bird Song: Biological Themes and Variations. Cambridge, UK: Cambridge Univ. Press, 1995
- Cousillas H, Leppelsack HJ, Leppelsack E, Richard JP, Mathelier M, Hausberger M. Functional organization of the forebrain auditory centres of the European starling: a study based on natural sounds. Hear Res 207: 10–21, 2005
- DiCarlo JJ, Lane JW, Hsiao SS, Johnson KO. Marking microelectrode penetrations with fluorescent dyes. J Neurosci Methods 64: 75–81, 1996
- Doupe AJ, Kuhl PK. Birdsong and human speech: common themes and mechanisms. Annu Rev Neurosci 22: 567–631, 1999
- Fritz J, Shamma S, Elhilali M, Klein D. Rapid task-related plasticity of spectrotemporal receptive fields in primary auditory cortex. Nat Neurosci 6: 1216–1223, 2003
- Fritz JB, Elhilali M, Shamma SA. Adaptive changes in cortical receptive fields induced by attention to complex sounds. J Neurophysiol 98: 2337–2346, 2007
- Froemke RC, Merzenich MM, Schreiner CE. A synaptic memory trace for cortical receptive field plasticity. Nature 450: 425–429, 2007
- Gaudry Q, Kristan WB. Behavioral choice by presynaptic inhibition of tactile sensory terminals. Nat Neurosci 12: 1450–1457, 2009
- Gentner T, Ball G. A neuroethological perspective on the perception of vocal communication signals. In: The Handbook of Speech Perception, edited by Pisoni DB, Remez RE. Oxford, UK: Blackwell, 2005, p. 653–675
- Gentner T, Hulse S. Perceptual mechanisms for individual vocal recognition in European starlings, Sturnus vulgaris. Anim Behav 56: 579–594, 1998
- Gentner TQ, Hulse SH, Bentley GE, Ball GF. Individual vocal recognition and the effect of partial lesions to HVc on discrimination, learning, and categorization of conspecific song in adult songbirds. J Neurobiol 42: 117–133, 2000
- Gentner TQ, Hulse SH. Perceptual classification based on the component structure of song in European starlings. J Acoust Soc Am 107: 3369–3381, 2000
- Gentner TQ, Margoliash D. Neuronal populations and single cells representing learned auditory objects. Nature 424: 669–674, 2003
- Gentner TQ. Temporal scales of auditory objects underlying birdsong vocal recognition. J Acoust Soc Am 124: 1350–1359, 2008
- Hausberger M, Richard J, Leppelsack E, Leppelsack HJ. Neuronal bases of categorization in starling song. Behav Brain Res 114: 89–95, 2000
- Hocherman S, Benson DA, Goldstein MH, Heffner HE, Hienz RD. Evoked unit activity in auditory cortex of monkeys performing a selective attention task. Brain Res 117: 51–68, 1976
- Hubel DH, Henson CO, Rupert A, Galambos R. “Attention” units in the auditory cortex. Science 129: 1279–1280, 1959
- Itatani N, Klump GM. Auditory streaming of amplitude-modulated sounds in the songbird forebrain. J Neurophysiol 101: 3212–3225, 2009
- Itatani N, Klump GM. Neural correlates of auditory streaming of harmonic complex sounds with different phase relations in the songbird forebrain. J Neurophysiol 105: 188–199, 2011
- Jackson N, Muthuswamy J. Artificial dural sealant that allows multiple penetrations of implantable brain probes. J Neurosci Methods 171: 147–152, 2008
- Jeanne JM, Thompson JV, Sharpee TO, Gentner TQ. Emergence of learned categorical representations within an auditory forebrain circuit. J Neurosci 31: 2595–2606, 2011
- Kilgard MP, Merzenich MM. Cortical map reorganization enabled by nucleus basalis activity. Science 279: 1714–1718, 1998
- Kirsch M, Coles RB, Leppelsack HJ. Unit recordings from a new auditory area in the frontal neostriatum of the awake starling (Sturnus vulgaris). Exp Brain Res 38: 375–380, 1980
- Knudsen DP, Gentner TQ. Mechanisms of song perception in oscine birds. Brain Lang 115: 59–68, 2010
- Lee CC, Middlebrooks JC. Auditory cortex spatial sensitivity sharpens during task performance. Nat Neurosci 14: 108–114, 2011
- Ludwig KA, Miriani RM, Langhals NB, Joseph MD, Anderson DJ, Kipke DR. Using a common average reference to improve cortical neuron recordings from microelectrode arrays. J Neurophysiol 101: 1679–1689, 2009
- Marler P. Bird calls: their potential for behavioral neurobiology. Ann NY Acad Sci 1016: 31–44, 2004
- McLin DE, Miasnikov AA, Weinberger NM. Induction of behavioral associative memory by stimulation of the nucleus basalis. Proc Natl Acad Sci USA 99: 4002–4007, 2002
- Meliza CD, Chi Z, Margoliash D. Representations of conspecific song by starling secondary forebrain auditory neurons: toward a hierarchical framework. J Neurophysiol 103: 1195–1208, 2010
- Mesgarani N, Chang EF. Selective cortical representation of attended speaker in multi-talker speech perception. Nature 485: 233–236, 2012
- Metherate R, Weinberger NM. Cholinergic modulation of responses to single tones produces tone-specific receptive field alterations in cat auditory cortex. Synapse 6: 133–145, 1990
- Miller JM, Sutton D, Pfingst B, Ryan A, Beaton R, Gourevitch G. Single cell activity in the auditory cortex of rhesus monkeys: behavioral dependency. Science 177: 449–451, 1972
- Moran J, Desimone R. Selective attention gates visual processing in the extrastriate cortex. Science 229: 782–784, 1985
- Niwa M, Johnson JS, O'Connor KN, Sutter ML. Active engagement improves primary auditory cortical neurons' ability to discriminate temporal modulation. J Neurosci 32: 9323–9334, 2012
- Otazu GH, Tai LH, Yang Y, Zador AM. Engaging in an auditory task suppresses responses in auditory cortex. Nat Neurosci 12: 646–654, 2009
- Pinaud R, Terleph TA. A songbird forebrain area potentially involved in auditory discrimination and memory formation. J Biosci 33: 145–155, 2008
- Rolls ET, Tovee MJ. Sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. J Neurophysiol 73: 713–726, 1995
- Seeba F, Klump GM. Stimulus familiarity affects perceptual restoration in the European starling (Sturnus vulgaris). PLoS One 4: e5974, 2009
- Sutter ML, Shamma SA. The relationship of auditory cortical activity to perception and behavior. In: The Auditory Cortex, edited by Schreiner CE, Winer JA. New York: Springer, 2011, p. 626–628
- Theunissen FE, Shaevitz SS. Auditory processing of vocal sounds in birds. Curr Opin Neurobiol 16: 400–407, 2006
- Thompson JV, Gentner TQ. Song recognition learning and stimulus-specific weakening of neural responses in the avian auditory forebrain. J Neurophysiol 103: 1785–1797, 2010
- Vinje WE, Gallant JL. Sparse coding and decorrelation in primary visual cortex during natural vision. Science 287: 1273–1276, 2000
- Woolley SM, Fremouw TE, Hsu A, Theunissen FE. Tuning for spectro-temporal modulations as a mechanism for auditory discrimination of natural sounds. Nat Neurosci 8: 1371–1379, 2005