A Basal Ganglia Circuit Sufficient to Guide Birdsong Learning

Lei Xiao; Gaurav Chattree; Francisco Garcia Oscos; Mou Cao; Matthew J Wanat; Todd F Roberts

doi:10.1016/j.neuron.2018.02.020

Author manuscript; available in PMC: 2019 Apr 4.

Published in final edited form as: Neuron. 2018 Mar 15;98(1):208–221.e5. doi: 10.1016/j.neuron.2018.02.020

Lei Xiao ¹, Gaurav Chattree ¹, Francisco Garcia Oscos ¹, Mou Cao ², Matthew J Wanat ³, Todd F Roberts ¹,⁴

PMCID: PMC5918681 NIHMSID: NIHMS947976 PMID: 29551492

SUMMARY

Learning vocal behaviors, like speech and birdsong, is thought to rely on continued performance evaluation. Whether candidate performance evaluation circuits in the brain are sufficient to guide vocal learning is not known. Here, we test the sufficiency of ventral tegmental area (VTA) projections to the vocal basal ganglia in singing zebra finches, a songbird species that learns to produce a complex and stereotyped multi-syllabic courtship song during development. We optogenetically manipulate VTA axon terminals in singing birds contingent on how the pitch of an individual song syllable is naturally performed. We find that optical inhibition and excitation of VTA terminals are each sufficient to reliably guide learned changes in song. Inhibition and excitation have opponent effects on future performances of targeted song syllables, consistent with positive and negative reinforcement of performance outcomes. These findings define a central role for reinforcement mechanisms in learning vocalizations and demonstrate minimal circuit elements for learning vocal behaviors.

ETOC BLURB

The role of the basal ganglia (BG) in learning motor behaviors is poorly understood. Using optogenetic manipulation of axon terminals in singing songbirds, Xiao et al. demonstrate that the ventral tegmental area to BG pathway is sufficient to guide song learning.

INTRODUCTION

The ability to imitate vocal behaviors is exceedingly rare, with humans, songbirds and parrots providing canonical examples of this remarkable trait (Doupe and Kuhl, 1999; Konopka and Roberts, 2016a, b; Petkov and Jarvis, 2012). Fluent production of learned vocalizations requires continual evaluation of performances using auditory-feedback. Changes in neuronal activity, indicating whether vocal performances are well and/or poorly performed relative to performance goals, may provide general mechanisms for maintaining and shaping vocalizations (Fee and Goldberg, 2011; Gadagkar et al., 2016), as well as other skilled behaviors. Indeed, circuits involved in learning and production of behaviors are tightly linked with those associated with performance evaluation (den Ouden et al., 2012; Desrochers et al., 2015; Frank et al., 2004; Hosp et al., 2011; Lee and Tomblin, 2012; Leemburg et al., 2018; Mandelblat-Cerf et al., 2014; Smith and Graybiel, 2016; Stephenson-Jones et al., 2016; Yttri and Dudman, 2016). However, it is not known if evaluative signals indicating better performances or worse performances are sufficient to guide adaptive changes in vocal behaviors, or how such signals are implemented in the brain.

Zebra finches provide a useful model in which to test these questions because they learn a single courtship song during development and use extensive daily practice to maintain expert performance of this song in adulthood (Doupe and Kuhl, 1999; Immelmann, 1969; Price, 1979). Cortical (Bottjer and Altenau, 2010; Bottjer et al., 1984; London and Clayton, 2008; Mandelblat-Cerf et al., 2014; Roberts et al., 2012; Roberts et al., 2017) and basal ganglia circuits (Ali et al., 2013; Brainard and Doupe, 2000; Haesler et al., 2007; Hoffmann et al., 2016; Scharff and Nottebohm, 1991) have been uniquely implicated in song learning and neurons signaling aspects of performance evaluation have been identified in at least three regions in the songbird brain, including auditory and premotor cortical circuits (Achiro et al., 2017; Keller and Hahnloser, 2009; Mandelblat-Cerf et al., 2014) and in the ventral tegmental area (VTA) (Gadagkar et al., 2016).

Here we focus on the function of VTA neurons projecting to the striatopallidal vocal basal ganglia (Area X). In vertebrates, the basal ganglia are thought to play a central role in the volitional control of skilled motor behaviors (Dhawale et al., 2017; Dudman and Krakauer, 2016; Graybiel and Grafton, 2015; Hikosaka et al., 1998; Jin and Costa, 2015; Yttri and Dudman, 2016). The striatum is a main input hub of the basal ganglia and striatal medium spiny neurons (MSNs) integrate inputs from the cortex and from dopaminergic neurons in VTA and substantia nigra pars compacta (SNc) (Tritsch and Sabatini, 2012). Divergent striatal MSN output pathways play integral roles in the execution of motor behaviors by supporting action initiation, action selection, and action vigor (Cui et al., 2013; Jin and Costa, 2010; Jin et al., 2014; Nelson and Kreitzer, 2014; Panigrahi et al., 2015; Tecuapetla et al., 2016; Wall et al., 2013). Phasic changes in dopaminergic signaling are also sufficient to directly influence ongoing movements (Howe and Dombeck, 2016; Panigrahi et al., 2015).

In addition to their motor role, dopaminergic inputs to the basal ganglia also carry information about reward and aversion and have been shown to encode both positive and negative reward prediction errors (RPE) (Bromberg-Martin et al., 2010; Cohen et al., 2012; Eshel et al., 2015; Frank et al., 2004; Gadagkar et al., 2016; Graybiel, 2008; Hikosaka et al., 2006; Hollerman and Schultz, 1998; Hong and Hikosaka, 2011; Howard et al., 2017; Howe and Dombeck, 2016; Montague et al., 1996; Morales and Margolis, 2017; Reynolds et al., 2001; Schultz, 2015; Schultz et al., 1993; Schultz et al., 1997; Watabe-Uchida et al., 2017; Wise, 2004), a key component of reinforcement learning theory and associated temporal difference and actor-critic models (Barto, 1995; Bush and Mosteller, 1951a, b; Glimcher, 2011; Houk et al., 1995; Montague et al., 1996; Rescorla and Wagner, 1972; Suri and Schultz, 1998; Sutton and Barto, 1981; Sutton and Barto, 1998). These findings raise the possibility that dopaminergic inputs to the striatum may also play a role in learning of fine motor skills involved in vocal performance (Hoffmann et al., 2016). Indeed, lesion and pharmacological inactivation studies in songbirds indicate that the output of the vocal basal ganglia circuitry is important for initial song learning (Bottjer et al., 1984; Scharff and Nottebohm, 1991) and continued vocal plasticity (Andalman and Fee, 2009; Brainard and Doupe, 2000; Charlesworth et al., 2012; Olveczky et al., 2005; Tanaka et al., 2016; Warren et al., 2011). Moreover, inputs from the VTA encode ‘performance errors’, a signal thought to reflect if songs were correctly or incorrectly performed relative to performance goals (Gadagkar et al., 2016).

However, the functional significance of these evaluative signals in vocal learning is not known, and several fundamental questions regarding the function of VTA projections to the basal ganglia are left unresolved. For example, are phasic changes in VTA activity sufficient to instruct changes in vocal performance? Does VTA have direct effects on ongoing song performances, or primarily function to instruct changes in future performances? Are evaluative signals from VTA sufficient to guide learning of an individual vocal parameter even when embedded within a larger vocal sequence? To address these and other related questions we developed axon-targeted optogenetic methods to excite or inhibit VTA axon terminals in the vocal basal ganglia of freely singing zebra finches. On-line assessment of the fundamental frequency (pitch) of a targeted song-syllable permitted precise trial-to-trial targeting of optical manipulations based on natural variation in syllable pitch. We find that activation or inhibition of VTA axon terminals is sufficient to guide learned changes in the future performance of single song-syllables within the birds’ larger polysyllabic song. Moreover, activation and inhibition guide rapid and reliable pitch learning in opposing directions. These findings define a minimal synaptic circuit for vocal learning and highlight unexpected precision in how the VTA-basal ganglia pathway can guide changes to vocal behaviors.

RESULTS

Pitch Contingent Auditory Feedback Negatively Reinforces Learned Changes in Vocal Pitch

We first tested the ability of zebra finches to adaptively modify the pitch of a song syllable in a negative reinforcement learning task (Ali et al., 2013; Andalman and Fee, 2009; Roberts et al., 2017; Tumer and Brainard, 2007). Zebra finches practice their song hundreds to thousands of times each day and exhibit a small amount of natural, trial-to-trial, variability in how they produce the pitch of individual syllables. Pitch-contingent auditory feedback (pCAF, Figures 1A–1F) triggers playback of white-noise when the pitch of a syllable is below or above an experimenter-defined threshold. Brief pulses of white-noise playback are thought to function as an aversive cue, perhaps perceived as an error in vocal performance (Tumer and Brainard, 2007). Birds rapidly learned to shift the pitch of the pCAF targeted syllable throughout the day, exhibiting significant increases in the pitch of the targeted syllable and significant decreases in the number of syllables that fell below the threshold for white noise playback (Figures 1E–1F). These findings confirm the ability of zebra finches to rapidly and selectively modify their song in a negative reinforcement task.
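The contingency at the heart of pCAF can be sketched in a few lines. This is a minimal illustration and not the authors' acquisition software: the function names, the example pitch values, and the 600 Hz threshold are all hypothetical, and the default rule shown (white noise when pitch falls below threshold) corresponds to experiments that target lower pitch variants.

```python
from typing import List


def classify_rendition(pitch_hz: float, threshold_hz: float, target_low: bool = True) -> str:
    """Return 'hit' if this rendition would trigger white noise, else 'escape'."""
    if target_low:
        triggered = pitch_hz < threshold_hz
    else:
        triggered = pitch_hz > threshold_hz
    return "hit" if triggered else "escape"


def contingency_percent(pitches_hz: List[float], threshold_hz: float, target_low: bool = True) -> float:
    """Percentage of renditions that would receive white noise (the hit rate)."""
    hits = sum(classify_rendition(p, threshold_hz, target_low) == "hit" for p in pitches_hz)
    return 100.0 * hits / len(pitches_hz)


# Hypothetical pitch values (Hz) early and late in a training day.
early = [590.0, 596.0, 601.0, 605.0, 598.0, 603.0]
late = [602.0, 607.0, 599.0, 611.0, 604.0, 608.0]
THRESHOLD_HZ = 600.0  # assumed experimenter-defined threshold

print(contingency_percent(early, THRESHOLD_HZ))  # 50.0
print(contingency_percent(late, THRESHOLD_HZ))   # one hit out of six renditions
```

As the bird shifts its pitch upward, fewer renditions fall below the threshold, so the contingency percentage falls; this is the quantity plotted alongside pitch in the figure panels.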

Figure 1. A) Schematic of the experimental design for closed-loop pitch-contingent auditory feedback (pCAF). B) Sonogram from the bird used in the pCAF experiment illustrated in Figures 1C–1D. White noise (WN) bursts were delivered over syllable ‘d’ during lower pitch variants. ‘iabbcd’ indicates the introductory syllable and the syllables that comprise the bird’s motif. Arrowhead indicates a 5ms segment where the pitch of target syllable ‘d’ was measured; black box indicates target syllable; red line marks 100ms playback of white noise. C) Plot of the pitch of syllable ‘d’ across 1,500 motifs before and during pCAF; each point corresponds to one rendition of the syllable. Closed-loop targeting of WN to lower pitch variants (red dots, ‘hit’) but not higher pitch variants (black dots, ‘escape’) resulted in an increase in the number of ‘escape’ trials. D) Plot of the running average of the pitch and hit rate (contingency) during the day of closed-loop pCAF illustrates the rapid increases in running average of pitch (black line) and concomitant decreases in contingency percentage (red line). Each point corresponds to a single syllable rendition and shaded region indicates ± one standard deviation; gray box indicates the baseline period before WN was delivered. E) Changes in running average of pitch during baseline day (open) and WN day (filled) in 7 birds in which WN was delivered to lower pitch variants (downward pointing triangles). WN delivery elicited increases in the running average of pitch (p=0.016, n=7, Wilcoxon matched-pairs signed-rank test). F) Changes in contingency percentage during baseline day (open) and WN day (filled) in 7 birds in which WN was delivered to lower pitch variants. WN elicited decreases in the contingency (p=0.015, n=7, Wilcoxon matched-pairs signed-rank test). G) Hypothetical source of instructive signal in natural pitch learning. H) General hypothesis tested in this paper: phasic increases and decreases in the activity of VTA neurons projecting to Area X encode positive and negative reinforcement signals that are each sufficient to guide song learning.
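A brief aside on the statistics: with seven matched pairs, the smallest two-sided p-value an exact Wilcoxon matched-pairs signed-rank test can return is 2/2^7 ≈ 0.0156, which is reached when all seven differences share the same sign. The sketch below checks this by enumerating every possible sign assignment. It assumes no ties and no zero differences, and the function name is ours rather than anything used in the study.

```python
from itertools import product


def exact_wilcoxon_two_sided_p(n_pairs: int, w_plus_observed: int) -> float:
    """Exact two-sided p-value for the signed-rank statistic W+ (no ties, no zeros)."""
    ranks = range(1, n_pairs + 1)
    max_w = n_pairs * (n_pairs + 1) // 2
    # Under the null, W+ is symmetric about max_w / 2, so extremeness is
    # measured as the distance from that centre (doubled to stay in integers).
    observed_dev = abs(2 * w_plus_observed - max_w)
    extreme = 0
    total = 0
    for signs in product((0, 1), repeat=n_pairs):
        w_plus = sum(r for r, s in zip(ranks, signs) if s)
        total += 1
        if abs(2 * w_plus - max_w) >= observed_dev:
            extreme += 1
    return extreme / total


# All seven birds shift in the same direction: W+ = 1 + 2 + ... + 7 = 28.
p_min = exact_wilcoxon_two_sided_p(7, 28)
print(round(p_min, 4))  # 0.0156
```

In other words, a reported p of about 0.016 at n=7 is what this test yields when every bird changes in the same direction.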

Optogenetic Manipulation of VTA Terminals in the Songbird Vocal Basal Ganglia (Area X)

Learning to shift the pitch of an individual syllable embedded in a complex song motif is a form of motor skill learning that may depend on positive and/or negative reinforcement signals from VTA to Area X (Gadagkar et al., 2016) (Figures 1G–1H). To test if phasic increases or decreases in the activity of VTA neurons projecting to Area X (VTA_AX) are sufficient to guide song learning, we sought to optogenetically manipulate VTA_AX axon terminals in freely singing birds. To enhance axonal distribution and membrane trafficking of virally expressed opsins, the intracellular targeting sequence of neurexin 1α was attached to the C-terminal end of channelrhodopsin (ChR2) or archaerhodopsin (ArchT), resulting in AAV viral constructs referred to here as axChR2 and axArchT. Targeted viral injections into VTA revealed that we could efficiently infect dopaminergic VTA_AX neurons and robustly label their terminals in Area X (Figures 2A–2B, Figures S1–S2).

Figure 2. A) Schematic showing injection of AAV1-CAG-axChR2 into VTA. B) (Left image) Representative coronal section through VTA shows that most neurons infected with AAV1-CAG-axChR2-2a-YFP are TH positive and located in ventral and ventrolateral portions of VTA. Scale bar, 100 μm. (Middle image) Representative parasagittal section shows that Area X was well innervated with axonal terminals (green) arising from VTA. Dashed line circle outlines the border of Area X with darker staining of TH (red) relative to the surrounding basal ganglia region. Scale bar, 300 μm; D, dorsal; L, lateral; R, rostral. (Right image) Enlarged image from Area X shows that axChR2-positive axons (green) overlap with TH staining (red). Scale bar, 20 μm. C) Schematic showing projection-specific optical stimulation of VTA terminals in Area X and electrical stimulation of VTA in axChR2+ birds. D) Representative voltammetric color plot of dopamine (DA) release in Area X following electrical stimulation of either VTA (n=2) or Area X (n=3) within parasagittal brain slices. Electrical stimulation of either VTA or Area X gave rise to comparable voltammetry profiles (data not shown). Black arrow, 50 - 100ms, 50 - 60 Hz of electrical stimulation. E) Representative voltammetric color plot of DA release in Area X following optical stimulation of VTA terminals in Area X within parasagittal brain slices of axChR2+ birds (n=5). Blue arrow, 100ms, 470nm. F) Background-subtracted cyclic voltammogram from electrical (E.S.) and optical (O.S.) stimulated DA release in Area X, *ex vivo*. Single light pulse stimulation (O.S., blue, 100ms, n=5 birds) produced signature DA signals (E.S., black, 50 - 100ms, 50 - 60 Hz, n=5). Scale bar, 200 mV; shaded region indicates standard error of the mean. G) Schematic showing injection of AAV1-CAG-axArchT into VTA and projection-specific inhibition of VTA terminals in Area X paired with electrical stimulation of VTA. H) Representative voltammetric color plots of DA release in Area X following electrical stimulation of VTA within a parasagittal brain slice of either control (n=2) or axArchT+ (n=3) birds. Black arrow, 50ms, 60 Hz of electrical stimulation. I) Representative voltammetric color plots of DA release in Area X when electrical stimulation of VTA is paired with optical inhibition of VTA terminals in axArchT+ birds (n=3, green arrow, 100ms, 540nm). J) Averaged DA responses to electrical stimulation (E.S.) of VTA with (black) or without (orange) optical inhibition (O.I.) of VTA terminals in Area X of axArchT+ birds (n=3) as measured by FSCV. Scale bar, 2s, 2nA; shaded region indicates standard error of the mean. Inset panel, optical inhibition resulted in reduced peak DA levels evoked by electrical stimulation (p=0.0002; paired t test). See also Figures S1–S2.

We next tested our ability to optically manipulate axon terminals of VTA_AX neurons and the phasic release of dopamine (DA). To test if optical activation or inhibition of axon terminals in Area X was sufficient to control dopamine release, we made fast-scan cyclic voltammetry (FSCV) recordings from Area X several weeks following injection of either AAV1-axChR2 or AAV1-axArchT into VTA (Figures 2C–2J). We prepared brain slices maintaining axonal connections between VTA and Area X and used electrical stimulation of VTA somata to assess the electrochemical signature for dopamine in zebra finch Area X. We found that a 100ms pulse of blue (470nm) light was sufficient to reliably evoke dopamine release from VTA_AX axon terminals expressing axChR2 (recordings from 5 birds, Figures 2C–2F). In birds expressing axArchT we paired electrical stimulation of VTA somata with light inhibition of terminals in Area X in order to test the efficacy of axon-terminal optical inhibition. We found that a 100ms pulse of green (540nm) light was sufficient to significantly suppress electrically evoked dopamine release in Area X (Figures 2G–2J). Together, these findings indicate that brief optogenetic excitation or inhibition of VTA_AX axon terminals is sufficient to control phasic dopamine signaling in the songbird basal ganglia. Although these in vitro experiments cannot reveal the full complement of effects our optogenetic manipulations may have on downstream circuits in vivo, they nonetheless open the door for testing the role of VTA_AX dopaminergic projections in shaping performance of learned vocal behaviors.

Phasic Stimulation of VTA Axon Terminals Guides Bidirectional Learned Changes in Vocal Pitch

To test if manipulation of VTA_AX axon terminals is sufficient to guide learned changes in the pitch of a targeted syllable, we first used closed-loop, pitch-contingent optogenetic stimulation in freely singing adult zebra finches. Unlike pCAF experiments, these experiments do not rely on playback of an aversive auditory cue, but rather test if optogenetic manipulation alone is sufficient to guide learning (Figure 3A). We implanted adult male zebra finches with fiber optic cables overlying Area X 6-12 weeks after bilateral injections of axChR2 into VTA (Figures 3B–3C). Birds were allowed to acclimate for up to a week following implantation of fiber optic cables, as measured by a return to their normal daily singing behavior. On the first day of the experiment birds were continuously recorded to establish baseline pitch levels for the syllable to be targeted. The next morning, birds received optogenetic stimulation contingent on how they sang their syllable during individual song motifs. We targeted stimulation to natural syllable variants that fell within the lowest third of all pitch variants. Illumination onset occurred within 25ms (24 ±0.4ms) of the syllable-pitch measurement and persisted for 100ms, a temporal window overlapping with the production of the targeted song syllable. Birds readily sang through optical pulses and did not appear to have any overt changes in their singing behavior (Figure 3D).
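One way to picture the targeting rule is as a percentile cut on the baseline-day pitch distribution. The sketch below is illustrative only: the text here does not specify how the threshold was computed, so the tertile cut, the function names, and the pitch values are all assumptions.

```python
import statistics
from typing import Sequence


def lowest_third_threshold(baseline_pitches_hz: Sequence[float]) -> float:
    """Pitch below which roughly one third of baseline renditions fall."""
    # statistics.quantiles with n=3 returns the two tertile cut points.
    lower_cut, _upper_cut = statistics.quantiles(baseline_pitches_hz, n=3)
    return lower_cut


def should_stimulate(pitch_hz: float, threshold_hz: float) -> bool:
    """Trigger the light pulse only for renditions below the threshold."""
    return pitch_hz < threshold_hz


# Hypothetical baseline-day pitch measurements (Hz) for one syllable.
baseline = [850.0, 855.0, 860.0, 865.0, 870.0, 875.0, 880.0, 885.0, 890.0]
threshold = lowest_third_threshold(baseline)
targeted = [p for p in baseline if should_stimulate(p, threshold)]
print(targeted)  # [850.0, 855.0, 860.0]
```

Targeting the highest third instead would use the upper cut point and reverse the comparison.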

Figure 3. A) Schematic of closed-loop pitch-contingent optogenetic experimental paradigm. B) A zebra finch with optical cannula bilaterally implanted over Area X. C) Parasagittal section shows implantation track of fiber optic overlying Area X and anterior to lMAN. Dashed lines outline both the border of Area X and lMAN. Scale bar, 300 μm. D) Sonogram from the bird used in the closed-loop optical stimulation experiment illustrated in Figures 3E–3F. Light pulses (~460 nm, 100ms) were delivered over syllable ‘c’ during lower pitch variants (hit) and not during higher pitch variants (escape). Scale bar, 100ms. E) Plot of the pitch of syllable ‘c’ across 1,000 motifs before and during optical stimulation; each point corresponds to one rendition of the syllable. Closed-loop targeting of optical stimulation to lower pitch variants of syllable ‘c’ (blue dots) but not higher pitch variants (black dots) resulted in an increase in the number of ‘hit’ trials. F) Plot of the running average of the pitch and hit rate (contingency) during the day of closed-loop optical stimulation illustrates the rapid decreases in running average of pitch (black line) and concomitant increases in contingency percentage (blue line). Each point corresponds to one rendition of the syllable; gray box indicates the baseline period before optical stimulation. G) Changes in running average of pitch during baseline day (open) and stimulation day (filled) for experiments in which optical stimulation was delivered to variants with higher pitch (upward pointing triangles with black outline, n=4) or lower pitch (downward pointing triangles, n=3), resulting in significant upward or downward shift in pitch (p=0.016, n=7, Wilcoxon matched-pairs signed-rank test). Changes in running average of pitch are expressed in units of the standard deviation of the last baseline session (z score). H) Closed-loop optical stimulation of syllables with either higher pitch (upward pointing triangles with black outline, n=4) or lower pitch (downward pointing triangles, n=3) elicited increases in contingency (33.81 ±2.92% to 66.94 ±6.47%, p=0.016, n=7, Wilcoxon matched-pairs signed-rank test). Contingency on baseline day (open) and stimulation day (filled) was determined according to the same preset threshold for each individual experiment. I) Closed-loop optical illumination of either higher pitch variants (up) or lower pitch variants (down) elicited upward (p=0.0045) or downward (p=0.015, unpaired t test with Welch’s correction) shift in running average of pitch in axChR2 birds (filled, up n=4, down n=3) but not in GFP birds (open, up n=3, down n=3). Error bars indicate standard error of the mean. See also Figure S3.

We found rapid and reliable changes in the pitch of the targeted syllable following pitch-contingent optogenetic stimulation. For the experiments shown in Figures 3D–3F, we targeted optogenetic stimulation only to those renditions in which the bird sang syllable ‘c’ with a pitch lower than 875 Hz. The pitch of syllable ‘c’ decreased by 17.1 Hz, 1.2 times the standard deviation of its baseline values, during the first day of stimulation (Figure 3F, see Figure S3 for fiber optic targeting controls). Correspondingly, as the bird learned to shift the pitch of the targeted syllable, the proportion of renditions that reached threshold for light stimulation increased by 33 percentage points, from 46% to 79%. Across experiments, we found that targeting optogenetic stimulation to syllable renditions in the lower third of the normal pitch range resulted in decreases in the average pitch of that syllable, consistent with stimulation functioning as a positive reinforcement signal (downward pointing triangles always correspond to illumination targeting lower pitch variants and upward pointing triangles correspond to illumination targeting higher pitch variants, Figure 3G). The running average of pitch decreased by 1.39 ±0.25 standard deviations during the stimulation day while it changed only 0.02 ±0.06 standard deviations during the baseline day (Figure 3G). The rapid decreases in syllable pitch observed following optogenetic stimulation to lower-pitch syllable renditions stand in sharp contrast to the rapid increases in pitch seen in our pCAF experiments (Figures 1C–1F). These results support the idea that phasic activation of VTA_AX axon terminals functions as a teaching signal that guides song learning by positively reinforcing associated performances.
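The numbers reported for syllable ‘c’ can be unpacked with a little arithmetic. The sketch below uses only values stated in the text; the baseline standard deviation it prints is implied by those values rather than reported directly, and the function name is hypothetical.

```python
def z_change(pitch_change_hz: float, baseline_sd_hz: float) -> float:
    """Express a pitch change in units of the baseline standard deviation."""
    return pitch_change_hz / baseline_sd_hz


# Values reported in the text for syllable 'c'.
pitch_change_hz = -17.1  # pitch decreased by 17.1 Hz
reported_z = -1.2        # described as 1.2 baseline standard deviations

# Baseline SD implied by those two numbers (not stated directly in the text).
implied_sd_hz = pitch_change_hz / reported_z
print(round(implied_sd_hz, 2))  # 14.25

# Contingency rose from 46% to 79% of renditions.
before_pct, after_pct = 46.0, 79.0
print(after_pct - before_pct)                                 # 33.0
print(round(100 * (after_pct - before_pct) / before_pct, 1))  # 71.7
```

The last two lines separate the absolute change in contingency (33 percentage points) from the relative change (roughly 72% more renditions reaching threshold than at baseline).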

To test if stimulation of VTA_AX axon terminals is sufficient to guide bidirectional changes in vocal performance we targeted light stimulation to syllable renditions sung within the highest third of all pitch variants, rather than in the lowest third. We found that targeting optogenetic stimulation to syllable renditions in the highest third of the normal pitch range resulted in substantial increases in the pitch of that syllable throughout the training day (upward pointing triangles, Figure 3G). The running average of pitch increased by 0.84 ±0.02 standard deviations during the stimulation day, while it changed only −0.007 ±0.035 standard deviations during the baseline day (Figure 3G). These findings suggest that optogenetic stimulation of VTA_AX axon terminals can guide learned increases or decreases in syllable pitch and further support the idea that phasic activation of VTA_AX axon terminals is sufficient to positively reinforce associated performances. Consistent with this, as birds learned to shift the pitch of the targeted syllable during the stimulation day, the light stimulation contingency increased for all birds, regardless of targeting higher or lower pitch variants (Figure 3H). To test if changes in syllable pitch reflect a non-specific effect on song behavior we conducted identical pitch-contingent optical manipulations in birds injected in VTA with viral constructs only expressing GFP. Targeting light flashes to GFP-expressing VTA_AX axon terminals during performance of either the higher or lower pitch syllable variants did not drive pitch learning and resulted in significantly smaller changes in pitch than optogenetic activation (Figure 3I). These results show that phasic activation of VTA inputs to Area X is sufficient to reliably guide rapid pitch learning, and support the idea that phasic increases in activity signal better-than-expected performance outcomes (Gadagkar et al., 2016).

Inhibition of VTA Axon Terminals is Sufficient to Negatively Reinforce Changes in Vocal Pitch

The idea that VTA dopamine neurons encode a bidirectional teaching signal, capable of reflecting positive and negative evaluation of performance outcomes, is an essential component of prediction error based learning models (Chang et al., 2016; Glimcher, 2011; Montague et al., 1996; Schultz et al., 1997). Therefore, causal evidence showing that phasic decreases in activity are also sufficient to guide learning, and do so in a manner consistent with negative performance errors, is necessary to support the role of reinforcement learning models in vocal learning.

To examine this, we tested whether pitch-contingent optical inhibition of VTA_AX axon terminals was sufficient to guide learned changes in song. We implanted adult male zebra finches with fiber optic cables overlying Area X 6-12 weeks after bilateral injections of axArchT into VTA. On the experimental day, we targeted inhibition to all but the highest or lowest pitch variants of an individual syllable in the birds’ polysyllabic song (upper or lower 60 - 90% of variants were yoked to optical inhibition, see Methods for description of contingencies). In contrast to optical activation of VTA_AX axon terminals, we found that optical inhibition resulted in rapid shifts in syllable pitch which mirrored those seen in birds exposed to a negative reinforcement pCAF task (Figures 4A–4C, 1C–1D). For the experiments shown in Figures 4A–4C, we targeted optogenetic inhibition to renditions in which the bird sang syllable ‘e’ with a pitch lower than 595 Hz (orange dots in Figure 4B). The pitch of syllable ‘e’ increased by 7.5 Hz, 0.72 times the standard deviation of its baseline values, over the course of a single experimental day and the stimulation contingency rate decreased by 19.5% (Figure 4C). Across experiments, we found that targeting optogenetic inhibition to lower pitch renditions resulted in rapid and substantial increases in the pitch of the targeted syllable. The running average of pitch increased by 1.08 ±0.15 standard deviations during the experimental day while it changed only 0.11 ±0.18 standard deviations during the baseline day (Figure 4D). Moreover, we found that optogenetic inhibition of VTA_AX axon terminals was sufficient to guide bidirectional changes in the pitch of song syllables in a manner consistent with negative reinforcement signals. Targeting inhibition to low pitch variants resulted in birds increasing the pitch of their song syllable, while targeting inhibition to high pitch variants resulted in birds decreasing the pitch of their song syllable (targeting high pitch variants, upward facing triangles: running average of pitch decreased by 1.36 ±0.2 standard deviations during the experimental day while it only changed 0.33 ±0.08 standard deviations during the baseline day, p=0.016, n=7, Wilcoxon matched-pairs signed-rank test, Figure 4D). Consistent with this, as birds learned to shift the pitch of the targeted syllable during the inhibition day, the light contingency decreased for all birds, regardless of targeting higher or lower pitch variants (Figure 4E). Lastly, we found that birds expressing only GFP did not exhibit shifts in pitch following identical pitch-contingent illumination over Area X, and their changes in pitch were significantly smaller than those produced by optogenetic inhibition (Figure 4F).

Figure 4. A) Sonogram from the bird used in the closed-loop optical inhibition experiment illustrated in Figures 4B–4C. Pitch-contingent optical inhibition does not induce systematic changes in either song or syllable structure. Spectrograms of a song before (top) and during (bottom) an experiment in which a light pulse (~520nm, 100ms, orange line) was delivered over syllable ‘e’ during lower pitch variants. Scale bar, 100ms. B) Plot of the pitch of syllable ‘e’ across 1,701 motifs before and during optical inhibition; each point corresponds to one rendition of the syllable. Closed-loop targeting of optical inhibition to lower pitch variants of syllable ‘e’ (orange dots) but not higher pitch variants (black dots) resulted in a decrease in the number of ‘hit’ trials. C) Plot of the running average of the pitch and hit rate (contingency) during the day of closed-loop optical inhibition illustrates the rapid increases in running average of pitch (black line) and concomitant decreases in contingency percentage (orange line). Each point corresponds to one rendition of the syllable; gray box indicates the baseline period before optical inhibition. D) Changes in running average of pitch during baseline day (open) and inhibition day (filled) for experiments in which optical inhibition was delivered to variants with higher pitch (upward pointing triangles with black outline, n=3) or lower pitch (downward pointing triangles, n=4), resulting in significant upward or downward shift in pitch (p=0.016, n=7, Wilcoxon matched-pairs signed-rank test). Changes in running average of pitch are expressed in units of the standard deviation of the last baseline session (z score). E) Closed-loop optical inhibition of syllables with either higher pitch (upward pointing triangles with black outline, n=3) or lower pitch (downward pointing triangles, n=4) elicited decreases in contingency (p=0.016, n=7, Wilcoxon matched-pairs signed-rank test). Contingency on baseline day (open) and inhibition day (filled) was determined according to the same preset threshold for each individual experiment. F) Closed-loop optical illumination of syllables with either higher pitch (up) or lower pitch (down) elicited downward (p=0.014) or upward (p=0.014, unpaired t test with Welch’s correction) shift in running average of pitch in axArchT birds (filled, up n=3, down n=4) but not in GFP+ birds (open, up n=3, down n=4). Error bars indicate standard error of the mean. See also Figure S3.

These results demonstrate that optical inhibition of VTA_AX axon terminals is sufficient to guide changes in behavior consistent with negative reinforcement of vocal performances. Together with our optogenetic stimulation results, these findings provide strong causal support for VTA_AX dopamine neurons encoding positive and negative reinforcement signals that are each sufficient to guide rapid and selective learned changes to song.

Optogenetic Manipulation of VTA Axon Terminals Guides Changes in Future Performances of Song

Reinforcement signals could shape changes in song by directly influencing motor performances (Howard et al., 2017; Howe and Dombeck, 2016; Panigrahi et al., 2015; Yttri and Dudman, 2016) or through evaluation of performance outcomes (Gadagkar et al., 2016). We examined these ideas in several ways. First, if activation or inhibition had direct effects on syllable pitch, we would expect immediate and consistent effects on syllable behavior. Instead, we found accumulation of changes over the course of the day, consistent with iterative learning from an instructive signal (Figure 3F, Figure 4C). Second, direct motor effects on syllable pitch would likely result in an overall increase in the variance of pitch as optogenetically manipulated syllables diverge from the normal distribution of pitch. We found that the coefficient of variation of syllable pitch was not altered by our phasic activation or inhibition of VTA_AX axon terminals (Figures 5A–5B). This finding indicates that activation or inhibition of a subset of pitch variants resulted in learned changes across the entire distribution of pitch variants and supports the idea that learning resulted from incremental effects on future performances. Third, phasic activation or inhibition of VTA_AX axon terminals could result in direct motor effects on non-targeted vocal parameters (parameters other than pitch). To test this, we quantified several syllable performance parameters from interleaved light-illuminated and escape trials during the first day of stimulation or inhibition. We were unable to detect differences in any of the quantified features during optogenetic manipulation of VTA_AX axon terminals (Figures 5C–5D).
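The logic of the variance argument can be illustrated with a toy calculation. The sketch below is a deliberately simplified model with hypothetical pitch values: learning is modelled as a uniform shift of the whole pitch distribution, and a direct motor effect as a fixed offset applied only to light-paired (‘hit’) renditions. Both are assumptions made for illustration, not descriptions of the recorded data.

```python
import statistics
from typing import List


def cv_percent(values: List[float]) -> float:
    """Coefficient of variation: standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical baseline pitches (Hz); renditions below 870 Hz count as 'hits'.
baseline = [860.0, 865.0, 870.0, 875.0, 880.0, 885.0, 890.0]
THRESHOLD_HZ = 870.0
OFFSET_HZ = -15.0  # assumed size of the pitch change in both scenarios

# Scenario 1 (learning): the whole distribution moves together.
learned = [p + OFFSET_HZ for p in baseline]

# Scenario 2 (direct motor effect): only light-paired renditions are displaced.
direct = [p + OFFSET_HZ if p < THRESHOLD_HZ else p for p in baseline]

for name, data in [("baseline", baseline), ("learned", learned), ("direct", direct)]:
    print(f"{name}: CV = {cv_percent(data):.2f}%")
```

Under these assumptions the uniformly shifted distribution keeps the same spread, so its CV barely moves, whereas displacing only the hit renditions stretches the distribution and inflates the CV. An unchanged CV is therefore more consistent with the learning scenario.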

A) Variability in pitch of target syllables for baseline day (open, CV = 1.91±0.42%) and stimulation day (filled, CV = 1.73±0.27%). Closed-loop optical stimulation of target syllables did not change the coefficient of variation of syllable pitch of target syllables (p=0.81, n=7, Wilcoxon matched-pairs signed-rank test). B) Variability in pitch of target syllables for baseline day (open, CV = 1.71±0.29%) and inhibition day (filled, CV = 1.75±0.28%). Closed-loop optical inhibition of target syllables did not change variability in pitch of target syllables (p=0.47, n=7, Wilcoxon matched-pairs signed-rank test). C) Spectral characteristics of ‘hit’ (filled) and ‘escape’ (open) syllables during the first stimulation session (200 motifs). Across experiments (n=7), there were no differences in amplitude (p=0.69), duration (p=0.30), Wiener entropy (p=0.16), goodness (p=0.38), frequency modulation (FM, p=0.69) and amplitude modulation (AM, p=0.11, Wilcoxon matched-pairs signed-rank test) between ‘hit’ and ‘escape’. D) Spectral characteristics of ‘hit’ (filled) and ‘escape’ (open) syllables during the first inhibition session (200 motifs). Across experiments (n=7), there were no differences in amplitude (p=0.38), duration (p=0.81), Wiener entropy (p=0.81), goodness (p=0.93), frequency modulation (FM, p>0.99) and amplitude modulation (AM, p=0.078, Wilcoxon matched-pairs signed-rank test) between ‘hit’ and ‘escape’. E) Motif number for baseline day (open, 1,444 ±441 motifs) and stimulation day (filled, 1,215 ±425 motifs). Closed-loop optical stimulation of target syllables did not change singing rate (p=0.15, n=7, Wilcoxon matched-pairs signed-rank test). F) Motif number for baseline day (open, 1,544 ±204 motifs) and inhibition day (filled, 1,413 ±144 motifs). Closed-loop optical inhibition of target syllables did not change singing rate (p>0.999, n=7, Wilcoxon matched-pairs signed-rank test).

Beyond direct motor effects, repeated activation or inhibition of VTA_AX axon terminals could drive changes in pitch through changes in behavioral motivation. Dopamine signaling has been generally linked with reward, and successive activation or inhibition of VTA_AX axon terminals could lead to overall changes in the motivation to sing (Hu, 2016). We examined singing rate in our birds and found that neither stimulation nor inhibition of VTA_AX axon terminals altered singing rate (Figure 5E–5F). We have shown that closed-loop optogenetic manipulation of VTA_AX axon terminals is sufficient to guide bidirectional changes in the future performances of song, independent of direct influences on ongoing song performance or generalized motivational changes in singing. These results indicate that phasic increases and decreases in dopaminergic input to Area X are sufficient to guide rapid and opponent changes in learned song, and support the idea that this single synaptic input provides a minimal circuit sufficient to direct vocal learning.

VTA Axon Terminal Manipulations Drive Significant and Sustained Learned Changes in Song

The rapid, within-day changes in pitch, driven by optogenetic excitation or inhibition, reveal remarkable precision in the teaching signal that VTA conveys to Area X. Nonetheless, changes to syllable pitch in most of our birds remained within the natural range that the syllable could be produced prior to our optogenetic manipulations, raising concerns that manipulations of VTA_AX axon terminals are not sufficient to guide large-scale changes in behavior, akin to those needed during initial learning of a new vocalization or recovery of vocal behaviors following peripheral or central injuries. To test if this teaching signal is capable of guiding sustained and large-scale changes in vocal behavior, we extended our pitch-contingent optogenetic manipulations over several days, updating pitch-illumination thresholds each morning in order to continue driving changes in vocal behavior (3 – 12 days, n = 8 birds). Optogenetic stimulation and inhibition continued to have opponent effects on the direction of pitch learning. For example, successive targeting of lower pitch syllable renditions with optogenetic excitation resulted in cumulative decreases in syllable pitch, while optogenetic inhibition resulted in cumulative increases in syllable pitch over several days (Figure 6A). We found that changes in pitch learned on the first day of training were retained in the bird’s behavior the following morning, consistent with overnight consolidation of learned changes in song (Figure S4). Manipulations over several consecutive days, therefore, resulted in large changes in syllable pitch, with some birds shifting the pitch of their syllable by 3 – 6 standard deviations away from baseline values (quantified as z scores and absolute d-prime values (|d′|), see Methods, Figure 6A–E).

A) Difference in mean pitch between illumination day (filled) and baseline or recovery days (open) from one axChR2 (blue) and one axArchT (orange) bird in which variants with lower pitch (downward pointing triangles) were targeted. Changes in mean pitch are expressed in units of the standard deviation of the baseline distribution (z score). B) Frequency distribution of pitch for the axChR2 bird shown in A. After illumination (blue), mean pitch shifted by 4.68 SD of baseline distribution (black) over 3 days and recovered (gray) to 0.37 SD away from baseline distribution over 12 days. Pitch is expressed in units of the standard deviation of the baseline distribution (z score). C) Frequency distribution of pitch for the axArchT bird shown in A. After illumination (orange), mean pitch shifted by 2.67 SD of baseline distribution (black) over 10 days and recovered (gray) to 0.02 SD away from baseline distribution over 5 days. Pitch is expressed in units of the standard deviation of the baseline distribution (z score). D) Shifts in mean pitch, expressed in units of |d′| (see Methods), in 4 axChR2 birds over 3-6 days. E) Shifts in mean pitch, expressed in units of |d′|, in 4 axArchT birds over 5-10 days. F) Average shift in mean pitch, expressed in units of |d′|, for GFP birds (n=6), axChR2 birds (n=4) and axArchT birds (n=4). The average shift in mean pitch for both axChR2 and axArchT birds was higher than 0.75, and also significantly higher than that of the respective GFP control groups (axChR2, p=0.016; axArchT, p=0.0095; Mann-Whitney test). Error bars indicate standard error of the mean. G) Shifts in mean pitch for the last illumination day and a day within a week following termination of illumination for both axChR2 (n=3) and axArchT (n=4) birds. Shifts in mean pitch recovered toward baseline within a week (change in average pitch |d′|, 2.6 ±0.5 vs 0.75 ±0.33, p=0.016, n=7, Wilcoxon matched-pairs signed-rank test). See also Figure S4–S5.

To understand if these learned changes in pitch constitute behaviorally significant and sustained deviations, we assessed changes relative to a conservative threshold for naturally occurring variability of syllable pitch (|d′| = 0.75 significance threshold, Figure 6F) (Canopoli et al., 2014). Both excitation and inhibition of VTA_AX axon terminals were sufficient to guide sustained and significant changes in syllable pitch, while similar manipulations in GFP birds did not result in learned changes in pitch (Figure 6F). In addition, we found that the standard deviation of syllable pitch at baseline positively correlated with the maximum pitch shift that birds were able to achieve (Figure S5). This finding held even for within-day changes in syllable pitch from birds receiving either activation or inhibition of VTA_AX axon terminals. These results further underscore that our optogenetic manipulations do not alter syllable variability; rather, changes in pitch are brought about by shifting the entire distribution of syllable variants over time. Together, these results demonstrate that positive and negative reinforcement signals can guide large-scale changes in the pitch of a song syllable, constrained only by the intrinsic variability in how the bird naturally sings the targeted syllable.
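As a rough illustration of the two effect-size measures used above, the following Python sketch computes the shift in mean pitch as a z score against the baseline distribution and as |d′|. The z score follows the definition given in the Figure 6 legend. The |d′| formula (difference of means divided by the root mean of the two variances) is a common convention and is an assumption here, because the exact definition is given in the Methods rather than in this section. The use of the sample standard deviation, the function names and the example pitch values are likewise hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def z_shift(baseline_hz, test_hz):
    """Shift in mean pitch in units of the baseline standard deviation."""
    return (mean(test_hz) - mean(baseline_hz)) / stdev(baseline_hz)


def abs_d_prime(baseline_hz, test_hz):
    """Absolute d-prime between two pitch distributions.

    Assumed convention: |mean difference| / sqrt((var_a + var_b) / 2).
    """
    pooled_sd = sqrt((stdev(baseline_hz) ** 2 + stdev(test_hz) ** 2) / 2)
    return abs(mean(test_hz) - mean(baseline_hz)) / pooled_sd


# Hypothetical pitch measurements (Hz), not data from the study.
baseline = [598.0, 600.0, 602.0]
shifted = [608.0, 610.0, 612.0]

z = z_shift(baseline, shifted)
d = abs_d_prime(baseline, shifted)
exceeds_threshold = d > 0.75  # significance threshold cited in the text
```

Under this convention the two measures coincide when the baseline and test distributions have equal variance, and they diverge when variability changes between days.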

Although changes in performance, guided by optogenetic manipulation of VTA_AX axon terminals, appear to be consolidated in motor circuits, birds were still able to recover the baseline levels of syllable pitch once optogenetic manipulations were stopped. In 7 out of 8 cases in which we were able to monitor the recovery phase, the pitch of target syllables returned to its original range within 7 – 10 days after we ceased optogenetic manipulations, similar to recovery of normal song behavior at the end of pCAF training (Figure 6G) (Ali et al., 2013; Canopoli et al., 2014; Roberts et al., 2017; Tumer and Brainard, 2007).

Changes in Vocal Behavior are Spectrally and Temporally Precise

Fluent production of vocalizations involves the sequencing or concatenation of many small volitional movements. In birdsong this is reflected in the correct ordering of individual syllables and song notes, each with their own learned spectral and temporal features (Williams, 2004; Zann, 1996). For reinforcement signaling from VTA_AX neurons to be a viable mechanism for learning and maintenance of song it should be able to guide changes in syllable performance that are both spectrally and temporally precise, and not result in changes to other features of a song syllable, such as its frequency modulation, or to changes in other portions of the song. To quantify this spectral and temporal precision, we focused on birds in which we optogenetically shifted the pitch of a song syllable for several days and confined our analysis to the day in which they displayed the largest shift in behavior, typically the last day of our optogenetic manipulations. All birds used in these experiments had at least two harmonic syllables in their core song motif, only one of which was targeted for pitch-contingent optogenetic stimulation or inhibition (n = 8 birds). We first asked if learned changes in pitch also resulted in changes to other features in the song syllable, or if they resulted in coincident changes to a non-targeted harmonic syllable in the song motif. We quantified several features of the song syllable and found that none of these other features were significantly changed when compared to baseline values, confirming that only the pitch of the targeted syllable had been systematically modified in both our optogenetically stimulated and inhibited birds (Figure 7A – 7B). Additionally, we found that changes in the pitch of the target syllables did not result in any systematic changes in the features of the non-targeted harmonic syllables (control in Figure 7A – 7B). These findings show that positive and negative reinforcement signals are able to guide changes in a single spectral feature in a bird’s syllable, revealing remarkable precision in how these signals can influence learning of vocal behaviors.

A) Shift in song spectral features for target (filled) and control (open) syllables of axChR2 birds (n=4). Changes in song were restricted to the pitch of target syllables (p=0.024, Mann-Whitney test). Spectral characteristics including amplitude (p=0.65), duration (p=0.11), Wiener entropy (p=0.11), goodness (p=0.53), FM (p=0.72), and AM (p=0.16) were not altered for either target or control syllables. Error bars indicate standard error of the mean. B) Shift in song spectral features for target (filled) and control (open) syllables of axArchT birds (n=4). Changes in song were restricted to pitch of target syllables (p=0.0081, Mann-Whitney test). Spectral characteristics including amplitude (p=0.68), duration (p=0.46), Wiener entropy (p=0.57), goodness (p=0.15), FM (p=0.21), and AM (p=0.15) were not altered for either target or control syllables. Error bars indicate standard error of the mean. C) Shift in mean pitch for target (filled) and control (open) syllables from both axChR2 birds (blue, n=4) and axArchT+ birds (orange, n=4) at millisecond time scale. Changes in pitch are restricted to target syllables (ANOVA, F_4,19 = 13.62, P < 0.05, the diamonds denote s.e.m. and whiskers denote the 10–90% range). See also Figure S6.

To examine the temporal precision of behavioral changes guided by optogenetic manipulation of VTA_AX terminals, we measured changes to the pitch of harmonic syllables produced immediately before or after the targeted syllable in the same birds described above. Measuring out from the time point in the target syllable when the fundamental frequency was calculated, we measured changes in the mean pitch of adjacent harmonic syllables produced between ±100 – 200ms and ±200 – 300ms before or after the target syllable. We found our effects on vocal pitch are largely restricted to the target syllables, which in these birds were ~100ms in duration (range = 47.19 – 148.9ms, mean = 96.16 ±37.2ms, Figure 7C) and tend to not extend to neighboring syllables as little as 100ms removed from the onset of illumination. In 6 of 8 birds we were unable to detect any changes to the pitch of harmonic syllables produced within ±100 – 200ms of the targeted syllable (7 of 9 syllables) and no changes to syllables produced within ±200 – 300ms of the targeted syllable (6 syllables).

In reinforcement learning, all aspects of performance correlated with reinforcement are prone to modification (Charlesworth et al., 2011). We measured changes to pitch in harmonic syllables immediately following the targeted syllables in 4 birds and found that in two of these birds syllables shifted along with the target syllable, while in the two other birds they did not (Figure 7C, S6). We found that this shift in the pitch of non-targeted syllable was predicted by baseline correlations in the pitch of target and non-target syllables. Together, these findings show that barring covariation across performance variables, the VTA-Area X pathway can direct changes to an individual feature in a bird’s syllable and has the temporal resolution necessary to confine changes to only targeted syllables.

DISCUSSION

Birdsong is one of the best studied naturally learned skilled motor behaviors and the foremost model for investigating neural circuit mechanisms for learning vocalizations (Doupe and Kuhl, 1999; Konopka and Roberts, 2016b). Like birdsong, many fine motor skills are learned through extensive practice and require continued training to maintain expert performance. Gaining proficiency in motor performance is thought to require neural circuits capable of evaluating performance outcomes relative to motor goals and the ability to bias future performances in accordance with these evaluative signals. Our findings demonstrate that phasic manipulation of evaluative signals from VTA to Area X during natural performances is sufficient to guide song learning and provide causal support for positive and negative reinforcement signals in learning vocalizations.

Although the role of the basal ganglia in learning vocalizations has been intensively studied, the origin of instructive signals sufficient to guide changes in performance has remained elusive. We show that manipulating the activity of VTA axon terminals in the vocal basal ganglia of singing birds biases future performances of song in a manner consistent with positive and negative performance errors. Pitch-contingent activation and inhibition are each sufficient to guide either decreases or increases in the pitch of a targeted syllable (Figures 3, 4, 6). These effects are independent of direct motor effects or motivational influences on song performance (Figure 5), consistent with the idea that the VTA_AX circuit encodes an evaluation of performance outcome capable of instructing future performances. We find that phasic activation and inhibition have opponent effects on future performances of song. Birds change the pitch of their song in the direction that increases the number of song trials targeted with illumination following activation of VTA_AX axon terminals, while they bias future performances away from pitch ranges targeted with optical inhibition. These results highlight the importance of using both activation and inhibition to dissect neural circuit function and the use of optogenetic methods in the study of naturally performed skilled behaviors. Together, our findings underscore the role of a bidirectional teaching signal in reinforcement learning, such as those envisioned by reward prediction errors (Fee and Goldberg, 2011; Gadagkar et al., 2016; Hikosaka et al., 2006; Schultz et al., 1997; Suri and Schultz, 1998; Watabe-Uchida et al., 2017; Yttri and Dudman, 2016), and extend it to learning of fine motor skills associated with vocal performances.

Models of basal ganglia-dependent reinforcement learning postulate that song learning relies on the convergence of three signals onto MSNs in the vocal basal ganglia (Fee and Goldberg, 2011): a signal encoding information about the time-step in the song; a signal encoding information about motor variability (for example, if the pitch is going to be sung higher or lower at a specific moment in the song); and a reinforcement signal from VTA (see circuit schematics in Figure 2). Consistent with these models, lesion and pharmacological inactivation studies have shown that cortical vocal basal ganglia pathways play an essential role in biasing future performances of song (Ali et al., 2013; Andalman and Fee, 2009; Charlesworth et al., 2012; Gadagkar et al., 2016) and that dopaminergic input is necessary for learning during negative reinforcement auditory feedback tasks (Hoffmann et al., 2016). Simple formulations of these results support the view that input from VTA is permissive for learning spectral features of song. Our results instead argue that this circuit functions instructively in song learning. We show that phasic inhibition of VTA inputs to Area X is sufficient to guide song away from its natural performance range and that inhibition and excitation of VTA inputs guide song changes in opposing directions. Indeed, a bidirectional instructive signal is a cornerstone of RPE models and our results provide causal support for this bidirectional instructive signal in vocal learning. Nonetheless, it is still not known if dopamine neurons exhibit phasic excitation or inhibition during natural tutor song learning. Additional research will be needed to test if phasic changes in dopaminergic signaling are necessary and sufficient to guide normal imitation of a vocal model. The axon-targeted optogenetic tools developed here provide new avenues for starting to address these important questions.

The data presented here also extend beyond current models by highlighting an unappreciated temporal precision and execution parameter specificity in VTA-dependent song learning. Our manipulations were sufficient to guide spectrally specific changes to single song syllables as short as 50ms and confined changes in song to a 50-300ms time window (Figure 7). We suggest that this form of temporal precision is likely a common attribute of performance error signaling, and not a specialization associated with production and learning of birdsong. In rodents, dopaminergic inputs to the direct and indirect pathway have been shown to be involved in action selection, and capable of directing learned changes in movement kinematics (Howard et al., 2017; Yttri and Dudman, 2016). Subthreshold activation of the direct and indirect pathways for as little as 450ms is sufficient to bidirectionally control movement velocity and bias velocity on future performances (Yttri and Dudman, 2016), underscoring the role of basal ganglia circuitry and dopamine in learning precise behaviors. However, it is still unclear how dopaminergic signaling alone may be able to provide the temporal precision needed for learning the microstructure of fine motor behaviors. For example, reinforcement learning experiments in songbirds indicate that song can be modified on timescales as short as 10ms, implicating remarkable temporal precision in how the nervous system tracks and evaluates performances (Charlesworth et al., 2011). One possibility is that co-release of glutamate and/or GABA from either dopaminergic or non-dopaminergic neurons could serve as a fast synaptic tag, marking synapses for plasticity (Granger et al., 2017). Further experiments will be needed to understand how performance evaluation circuits guide learning on this fine temporal scale.

In addition to their role in learning, dopaminergic inputs to the dorsal striatum are associated with kinematic control (Howe and Dombeck, 2016). Our results, on the other hand, indicate that phasic manipulation of dopaminergic inputs to Area X does not result in overt changes in ongoing vocalizations. Anatomical segregation of dopaminergic inputs to the rodent dorsal striatum and songbird Area X may account for these differences. The majority of dopaminergic inputs to the rodent dorsal striatum arise from the SNc (Ikemoto, 2007), while a large fraction of inputs to the ventral striatum arise from VTA. Dopaminergic inputs to the ventral striatum in mice are thought to reflect reward signaling and are not known to elicit direct changes in movement kinematics. Although Area X anatomically resides in the dorsal striatum, a large fraction of its dopaminergic inputs arise from VTA (current results), perhaps accounting for the lack of direct motor effects on song. It will be of interest in future studies to examine if manipulation of SNc inputs to Area X is sufficient to directly modify song production. Indeed, pharmacological manipulations suggest that dopamine signaling in Area X can directly influence song. Infusion of D1-like receptor antagonists in Area X modulates the spectral variability of song syllables (Leblois and Perkel, 2012; Leblois et al., 2010). These results may reflect global manipulation of dopamine signaling, affecting neurons targeted by VTA and SNc inputs, or may also reflect the different functional roles of fast and slow dopamine signaling (Lapish et al., 2007). Singing directly to a female bird is associated with an increase in dopamine in Area X (Sasaki et al., 2006), while infusion of D1-like receptor antagonists into Area X blocks changes in song typically associated with female-directed singing (Leblois et al., 2010). These slow, context-dependent changes in dopamine signaling may provide a signaling regime separate from the phasic manipulations employed to guide changes in song spectral features.

In humans, basal ganglia-dependent reinforcement learning is thought to play an important role in speech and language learning (Konopka and Roberts, 2016b; Watkins, 2011). By applying closed-loop optogenetic methods in freely singing songbirds, we show that VTA inputs to the basal ganglia are sufficient to guide learning of a select vocal feature without disrupting other aspects of the learned vocalizations. We suggest that the temporal and movement parameter precision observed here is likely a common attribute of circuitry involved in evaluation of fine motor behaviors, like speech and language. Overall, our findings in the songbird reveal that positive and negative reinforcement signals from VTA are sufficient to guide temporally precise and movement parameter specific learned changes in song and highlight the utility of focusing on naturally learned, ethologically relevant behaviors when dissecting neural circuit function.

STAR METHODS

KEY RESOURCES TABLE

REAGENT or RESOURCE	SOURCE	IDENTIFIER
Antibodies
rabbit anti-tyrosine hydroxylase	Millipore	RRID: AB_390204
rabbit anti-GFP	Invitrogen	RRID: AB_221569
mouse anti-GFP	Invitrogen	RRID: AB_221568
Bacterial and Virus Strains
scAAV1-Cbh-GFP	Duke viral vector core	N/A
scAAV9-Cbh-GFP	Duke viral vector core	N/A
scAAV1-CBh-ChR2(H134R)-nrxn-EYFP	Duke viral vector core	N/A
AAV1-CAG-ChR2(H134R)-nrxn-2a-EYFP	This lab	N/A
AAV1-CAG-ArchT-nrxn	UNC Vector Core	N/A
AAV1-CAG-ArchT3.0-nrxn-2a-EYFP	This lab	N/A
Chemicals, Peptides, and Recombinant Proteins
L-DOPA	Tocris	Cat#3788
Dextran, Alexa Fluor™ 488	Invitrogen	Cat# D22910
Critical Commercial Assays
NeuroTrace™ 530/615 Red Fluorescent Nissl Stain	Invitrogen	Cat# N21482
Experimental Models: Organisms/Strains
zebra finches (Taeniopygia guttata)	This lab	N/A
Recombinant DNA
scAAV1-CBh-ChR2(H134R)-nrxn-EYFP	This paper	Request from Lead Contact
AAV1-CAG-ChR2(H134R)-nrxn-2a-EYFP	This paper	Request from Lead Contact
AAV1-CAG-ArchT-nrxn	This paper	Request from Lead Contact
AAV1-CAG-ArchT3.0-nrxn-2a-EYFP	This paper	Request from Lead Contact
Software and Algorithms
MATLAB	MATLAB	https://www.mathworks.com/
Custom-written MATLAB code for data analysis	This lab	Request from Lead Contact
Labview	National Instruments	http://www.ni.com/en-us/shop/labview.html
Custom-written Labview software (CAF)	Ali et al., 2013; this paper	Request from Lead Contact
Other
Fiber Optic Cannula	Prizmatix	www.goldstonescientific.com/prizmatix-optogenetics-implantable-cannula/
High NA Dual Polymer Optical Fiber for Bilateral Stimulation	Prizmatix	http://www.goldstonescientific.com/high-na-dual-optical-fiber-for-bilateral-stimulation-rotary-joint-to-cannulae/
Mating Sleeves	Prizmatix	http://www.goldstonescientific.com/prizmatix-mating-sleeves/

CONTACT FOR REAGENT AND RESOURCE SHARING

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Todd Roberts (Todd.Roberts@UTSouthwestern.edu).

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Animal models

Experiments described in this study were conducted using 66 adult male zebra finches (Taeniopygia guttata; > 90 days post hatch) that were raised in our breeding facility and housed with their parents until at least 50 days of age. During experiments, birds were housed individually in sound-attenuating recording chambers (Med Associates) on a 12/12 h day/night schedule and were given ad libitum access to food and water. All birds (range: 111 to 178 days post hatch at start of experiment) were subjected to undirected song recording (i.e., with no female present) with Sound Analysis Pro (SAP2011, http://soundanalysispro.com/) (Tchernichovski et al., 2000). All procedures were performed in accordance with established protocols approved by the UT Southwestern Medical Center Animal Care and Use Committee.

METHOD DETAILS

Viral Vectors

The scAAV1-CBh-ChR2(H134R)-nrxn-EYFP and AAV1-CBh-ArchT-nrxn vectors were synthesized by the Duke viral vector core facility (Durham, NC, USA). The scAAV1-CBh-ChR2(H134R)-nrxn-EYFP vector was prepared by ligating ChR2(H134R) and EYFP with an intervening DNA fragment encoding the intracellular domain of neurexin-1a (nrxn, aa 1,425–1,479), inserted into the scAAV2-Cbh backbone. The AAV1-CAG-ArchT-nrxn vector was prepared by ligating ArchT (archaerhodopsin from Halorubrum strain TP009) and the nrxn fragment, inserted into the AAV2-CAG backbone. The AAV1-CAG-ChR2(H134R)-nrxn-2a-EYFP vector was prepared by ligating ChR2(H134R)-nrxn and EYFP with an intervening DNA fragment encoding a 2a amino acid sequence from porcine teschovirus-1 ((GSG) ATNFSLLKQAGDVEENPGP) and insertion into the AAV2-CAG backbone. ArchT3.0-nrxn was constructed from ArchT3.0 (Mattis et al., 2012) with a C-terminal 2xHA (YPYDVPDYA) tag, followed by the nrxn fragment (between the enhanced trafficking signal (KSRITSEGEYIPLDQIDINV) and the ER export signal (FCYENEV)) (Stachniak et al., 2014). The AAV1-CAG-ArchT3.0-nrxn-2a-EYFP vector was prepared by replacing ChR2(H134R)-nrxn with ArchT3.0-nrxn. The recombinant AAV vectors were serotyped with AAV1 coat proteins and produced by the University of North Carolina vector core facility (Chapel Hill, NC, USA) with titer exceeding 10¹² vg/ml or in the lab with titer exceeding 5×10¹¹ vg/ml. The self-complementary AAV (scAAV) vectors were serotyped with AAV1 or AAV9 coat proteins and produced by the Duke viral vector core facility (Durham, NC, USA). AAV1-CAG-ChR2(H134R)-nrxn-2a-EYFP or scAAV1-CBh-ChR2(H134R)-nrxn-EYFP viruses were used interchangeably for targeted stimulation of VTA_AX axon terminals, abbreviated to axChR2. AAV1-CAG-ArchT3.0-nrxn-2a-EYFP or AAV1-CBh-ArchT-nrxn viruses were used interchangeably for targeted inhibition of VTA_AX axon terminals, abbreviated to axArchT. scAAV1-Cbh-GFP virus was used as an opsin-negative control. Both scAAV1 and scAAV9-Cbh-EGFP were used for tracing experiments. All viral vectors were aliquoted and stored at −80 °C until use.

Stereotaxic Surgery

Virus/tracer injection

All surgical procedures were performed under aseptic conditions. Birds were anesthetized using isoflurane inhalation (1.5-2%) and placed in a stereotaxic apparatus. Viral injections were performed using previously described procedures (Roberts et al., 2012; Roberts et al., 2017) at the following approximate stereotaxic coordinates relative to interaural zero and the brain surface (rostral, lateral, depth, in mm): Ov (2.8, 1.0, 5.75), where the center of Ov was located and mapped based on its robust white noise responses; VTA relative to the center of Ov (+0.3, −0.2, +1.8) (Mandelblat-Cerf et al., 2014); Area X (5.1, 1.6, 3.3) with 43-degree head angle or (5.8, 1.6, 3) with 20-degree head angle, where the boundary of Area X was verified using extracellular electrophysiological recordings. For behavioral experiments, 0.7 μl of AAV was injected into the VTA between 70-90 dph after identification of target syllables, and at least 4-6 weeks were allowed for expression. For tracing experiments, 0.12 μl of Alexa Fluor 488–conjugated dextran amines (D22910, Invitrogen, CA, USA) or scAAVs was injected into Area X or the VTA, and 3-5 days were allowed for sufficient retrograde transport and labelling.

Optical fiber placement

Birds were bilaterally implanted with optical fiber cannulae (Prizmatix, Israel), prior to behavioral training and following the surgical procedure for viral injection. Fiber implants (200 or 250 μm, NA=0.66, Prizmatix) were targeted to the dorsomedial aspect of Area X and were secured to the skull with C&B Metabond quick adhesive cement (Parkell Inc., NY, USA) followed by dental cement (Diamond Springs Inc., CA, USA). The optical fibers were connected to an LED source (λ = ~460 nm or ~520 nm; Prizmatix) via a rotary joint (Prizmatix) using an optic fiber sleeve (Prizmatix). LED power was adjusted to produce the desired output at the tip of the implanted optic probe (3-5 mW for 460 nm LED; 1.5-4 mW for 520 nm LED).

Behavioral Assays

CAF program

Custom LabView software (National Instruments) was used for online detection of target syllables and implementation of optogenetic manipulation (Ali et al., 2013). Pitch was computed on a 5ms sound segment located 15-80ms into the target syllable. The target segment was constant for a given bird but varied between birds. The running average of pitch was calculated as the average of the pitch over the last 200 renditions (one session) of the target.
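The running average described above can be sketched in a few lines of Python. This is only an illustration of a mean over the most recent 200 renditions. It is not the LabView implementation used in the study, and the class name, method name and example values are hypothetical. How the average is handled before 200 renditions have accumulated is not stated in the text; the sketch simply averages whatever is available.

```python
from collections import deque


class RunningPitchAverage:
    """Mean pitch over the most recent `window` renditions of a target syllable."""

    def __init__(self, window=200):
        if window < 1:
            raise ValueError("window must be at least 1")
        # A bounded deque discards the oldest rendition automatically.
        self._values = deque(maxlen=window)

    def update(self, pitch_hz):
        """Add one rendition and return the current running average in Hz."""
        self._values.append(float(pitch_hz))
        return sum(self._values) / len(self._values)


# Hypothetical example with a short window so the behavior is easy to follow.
tracker = RunningPitchAverage(window=3)
averages = [tracker.update(p) for p in (600.0, 610.0, 620.0, 630.0)]
```

With a window of 3, the fourth update drops the first rendition, so the final average covers only the last three values.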

Habituation

Before baseline recordings, birds were given at least 1 week to recover from cannula implantation and habituate to singing with attached optical fibers. Songs were recorded for several days to measure baseline statistics on the pitch of targeted syllables and develop spectral templates to detect syllables in real-time and trigger closed-loop optogenetic manipulation. False positive and false negative rates were quantified and maintained under 10% for all birds.

Threshold setup and contingency calculation

Thresholds for triggering the LED light source were set at the running average of the last baseline session for each animal, such that approximately one-third of the baseline range would be either supra-threshold or subthreshold. If the pitch met the escape criterion, no illumination was triggered (‘escape’). Otherwise, a 100-ms light pulse was delivered within 25ms of the measurement (‘hit’). Contingency was calculated as the percentage of ‘hit’ renditions out of total rendition numbers for a day or a session. During baseline periods, contingency was calculated in the absence of illumination. For optogenetic activation experiments we targeted the upper or lower third of naturally produced pitch variants (Yttri and Dudman, 2016). For optogenetic inhibition experiments we targeted the upper or lower two-thirds of pitch variants, similar to procedures used in white-noise playback experiments (Ali et al., 2013; Andalman and Fee, 2009; Roberts et al., 2017; Tumer and Brainard, 2007).
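The hit and escape classification and the contingency measure can be illustrated with a short Python sketch. The contingency function follows the definition given above (percentage of ‘hit’ renditions out of all renditions). The rank-based threshold helper is an assumption made for illustration, showing one way to target roughly a chosen fraction of baseline renditions; the study set thresholds from the running average of the last baseline session. All function names and pitch values are hypothetical.

```python
def is_hit(pitch_hz, threshold_hz, target="below"):
    """Return True when a rendition falls in the targeted pitch range ('hit')."""
    if target == "below":
        return pitch_hz < threshold_hz
    if target == "above":
        return pitch_hz > threshold_hz
    raise ValueError("target must be 'below' or 'above'")


def contingency(pitches_hz, threshold_hz, target="below"):
    """Percentage of 'hit' renditions out of all renditions in a day or session."""
    if not pitches_hz:
        raise ValueError("need at least one rendition")
    hits = sum(is_hit(p, threshold_hz, target) for p in pitches_hz)
    return 100.0 * hits / len(pitches_hz)


def threshold_for_fraction(baseline_hz, fraction, target="below"):
    """Assumed helper: threshold that makes about `fraction` of baseline renditions hits."""
    ordered = sorted(baseline_hz)
    n = len(ordered)
    k = round(fraction * n)  # number of renditions to target
    if not 0 < k < n:
        raise ValueError("fraction must target some but not all renditions")
    return ordered[k] if target == "below" else ordered[n - k - 1]


# Hypothetical baseline pitches (Hz), not data from the study.
baseline = [600, 605, 610, 615, 620, 625, 630, 635, 640]
low_third = threshold_for_fraction(baseline, 1 / 3, target="below")
high_two_thirds = threshold_for_fraction(baseline, 2 / 3, target="above")
baseline_contingency = contingency(baseline, low_third, target="below")
```

Computing contingency on baseline recordings with the same function mirrors the statement that baseline contingency was calculated in the absence of illumination.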

Closed-loop optogenetic manipulation

The threshold for triggering illumination was constant during each day, and was updated each morning. Optogenetic manipulations were maintained over several days and not ceased until the pitch approached an asymptotic state within a day (Figure 6B, blue columns) or constant state for three consecutive days (Figure 6A, last three filled orange downward pointing triangles). The same training parameters were used for axChR2, axArchT and GFP experiments.

Pitch-contingent auditory feedback (pCAF, Figures 1A–1F)

pCAF targets white-noise playback when the pitch of a syllable is below or above an experimenter-defined threshold. Brief pulses of white-noise playback are thought to function as an aversive cue, perhaps perceived as an error in vocal performance (Tumer and Brainard, 2007). We targeted 100 ms pulses of white-noise playback (60-80 dB SPL) to low-pitch variants of individual song syllables in seven birds (a single targeted syllable in each bird). Birds were acclimated to pCAF acoustic chambers for 3-4 days. On the first day of the experiment their song was continually recorded in order to establish baseline levels of the pitch for each syllable. Pitch-contingent auditory feedback was started on the morning of the second day.

Fast-Scan Cyclic Voltammetry

Birds were anesthetized using isoflurane and decapitated. The brain was quickly removed and immersed in ice-cold oxygenated zero-sodium ACSF containing the following (in mM): 225 sucrose, 3 KCl, 1.25 NaH2PO4, 26 NaHCO3, 10 D-(+)-glucose, 2 MgSO4, 2 CaCl2. The brain was then cut along the sagittal plane and the lateral side of the right or left hemisphere was glued onto a specimen tilting disc. The disc was tilted such that the vibratome blade entered the brain at a 10-15° angle. Slices (300 μm) containing Area X and the ventral tegmental area (VTA) were produced using a vibratome (Leica VT1200/VT1200S) at an advancing speed of 0.12 mm/s. Slices were incubated in a custom-made holding chamber saturated with a 95%/5% O2/CO2 mix, in reduced-sodium ACSF containing the following (in mM): 60 NaCl, 75 sucrose, 2.5 KCl, 1.2 NaH2PO4, 30 NaHCO3, 25 D-(+)-glucose, 20 HEPES, 2 MgSO4, 2 CaCl2, at a temperature of 32°C for 40 min. The slices remained in the recovery chamber for at least another 40 min at room temperature before FSCV recording.

Slices were transferred to a recording chamber perfused with ACSF (containing the following, in mM: 126 NaCl, 3 KCl, 1.25 NaH2PO4, 26 NaHCO3, 10 D-(+)-glucose, 2 MgSO4, 2 CaCl2, supplemented with 50 μM L-DOPA (CAT#3788, Tocris, USA); perfused at 3 ml min−1) at 31-33°C. Stimulation of dopamine (DA) release was typically initiated 30 min after transfer to the chamber. Recordings were made in slices for up to 5 h after cutting.

Recordings were conducted using carbon-fiber electrodes (CFEs; 7 μm fiber diameter) whose exposed carbon fiber tip was cut to a length of 30-150 μm. The tip of the CFE was gently lowered into the slice to a depth of 50-150 μm. During each voltammetric scan, the potential applied to the carbon fiber was ramped from −0.4 V (versus Ag/AgCl) to +1.3 V and back at a rate of 400 V/s; scans were repeated at a frequency of 10 Hz, and the potential was held at −0.4 V between scans. All extracellular solutions were adjusted to 310 mOsm, pH 7.3-7.4, and aerated with a 95%/5% O2/CO2 mix.
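As a worked check on the waveform timing, the sketch below computes how long each triangular scan lasts and how much of each 10 Hz cycle is spent at the holding potential. The function name is hypothetical; the voltages, scan rate, and repetition frequency are taken from the text, and an ideal symmetric ramp is assumed.

```python
def triangular_scan_duration(v_hold, v_peak, scan_rate_v_per_s):
    """Duration in seconds of a ramp from v_hold to v_peak and back."""
    return 2.0 * abs(v_peak - v_hold) / scan_rate_v_per_s


# Values from the text: -0.4 V to +1.3 V and back at 400 V/s, repeated at 10 Hz.
scan_s = triangular_scan_duration(v_hold=-0.4, v_peak=1.3, scan_rate_v_per_s=400.0)
period_s = 1.0 / 10.0        # one scan cycle at 10 Hz
hold_s = period_s - scan_s   # time at the holding potential between scans

print(f"scan: {scan_s * 1e3:.1f} ms, hold: {hold_s * 1e3:.1f} ms")
```

Under these assumptions each scan sweeps 3.4 V in total and lasts 8.5 ms, leaving 91.5 ms at −0.4 V before the next scan.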

DA release was evoked either by light emitted from a collimated light-emitting diode (470 nm) driven by a Cube LED Driver pE-300 (CoolLED) under the control of an Axon Digidata 1550B Data Acquisition System and Clampex 10.6, or through electrical stimulation. Light was delivered through the reflected light fluorescence illuminator port and the 40× objective. For electrical stimulation, a bipolar concentric stimulating electrode was placed in VTA and controlled by a stimulus isolator (A365, WP) triggered by the Axon Digidata 1550B Data Acquisition System and Clampex 10.6. Optical stimulation employed a single 100 ms light pulse. Electrical stimulation used a pulse train at 50 Hz, 200-300 μA, 1 ms pulses, for 50 ms. After establishment of DA release, a laser light source (540 nm; ~10 mW) was used to illuminate Area X via the light fluorescence illuminator port and the 40× objective with a single 100 ms pulse.

Histology

Immunohistochemistry

Immunohistochemistry experiments were performed following standard procedures. Briefly, birds were anesthetized with Euthasol (Virbac, TX, USA) and transcardially perfused with PBS, followed by 4% paraformaldehyde in PBS. Coronal sections (30 μm) were cut using a freezing microtome (Leica SM 2010R, Leica). Sections were first washed in PBS, incubated in PBST (0.3% Triton X-100 in PBS) for 15 min at room temperature (RT) and then washed with PBS. Next, sections were blocked in 5% donkey serum in PBST for 30 min at RT and then incubated with primary antibodies overnight at 4°C. Sections were washed with PBS and incubated with fluorescent secondary antibodies at RT for 1 h. After washing with PBS, sections were mounted onto slides with Fluoromount-G (eBioscience, CA, USA). Composite images were acquired and stitched using an LSM 710 laser-scanning confocal microscope (Carl Zeiss, Germany) or an upright compound microscope (Leica DM5500 B, Leica). The primary antibodies used were: rabbit anti-tyrosine hydroxylase (AB152, Millipore, Germany), rabbit anti-GFP (A11122, Invitrogen, CA, USA) and mouse anti-GFP (A11120, Invitrogen, CA, USA). Primary antibodies were detected with appropriate fluorophore-conjugated secondary antibodies (Life Technologies, Carlsbad, California, USA) depending on the desired fluorescence color.

Localization of probes

Post-mortem histology was performed on sectioned (30-50 μm thick, sagittal) tissue to confirm placement of the optical fiber overlying Area X. Tissue damage, revealed by Nissl (N21482, Invitrogen, CA, USA) and Hoechst stain, indicated the location of the optical fiber.

QUANTIFICATION AND STATISTICAL ANALYSIS

Behavioral Analysis

All behavioral events were recorded by computer systems (CAF and SAP2011 programs; see METHOD DETAILS). Data analysis for the pitch of targeted syllables was performed using custom software written in MATLAB (MathWorks; Natick, Massachusetts). All birds subjected to behavioral assays were pre-screened, and only birds with well-defined harmonic syllables that permitted reliable detection were included. Experiments described in Fig. 2 were randomized for allocation of virus or tracer injections. Other experiments were not randomized, but investigators were blinded to allocation and outcome assessments during analysis. Sample sizes were not predetermined but were comparable to previous related studies (Andalman and Fee, 2009; Tumer and Brainard, 2007; Yttri and Dudman, 2016). For our optogenetic experiments, 19 birds were excluded from analysis because either 1) their song syllables were difficult to reliably target using CAF software (birds with false positive/negative rates above 10%, n = 11 birds, 4 axChR2 birds and 7 axArchT birds), or 2) they were found to have mistargeted placement of fiber optic implants above Area X following processing of tissue sections (n = 8 birds, 5 axChR2 birds and 3 axArchT birds). To summarize effects across syllables, we expressed the daily changes in pitch of targeted syllables as z-scores:

z_{i} = \frac{p_{i} - p_{i - 1}}{σ_{b}}

where p_i is the running average of pitch from the last session on day i and σ_b is the standard deviation of pitch in the last baseline session.
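A minimal sketch of this z-score calculation is given below. The function name and the example values are hypothetical, and treating the first list entry as the last baseline session (so that the first z-score compares day 1 with baseline) is an assumption about indexing rather than something stated in the text.

```python
def daily_z_scores(running_means, baseline_sd):
    """z_i = (p_i - p_{i-1}) / sigma_b for consecutive days.

    running_means[0] is taken to be the last baseline session (assumption);
    later entries are the last-session running averages on successive days.
    """
    if baseline_sd <= 0:
        raise ValueError("baseline_sd must be positive")
    return [
        (today - yesterday) / baseline_sd
        for yesterday, today in zip(running_means, running_means[1:])
    ]


# Hypothetical values in Hz: baseline, then three days of manipulation.
p = [600.0, 605.0, 612.5, 615.0]
z = daily_z_scores(p, baseline_sd=10.0)
```

With these values the daily changes of 5, 7.5, and 2.5 Hz become z-scores of 0.5, 0.75, and 0.25. Dividing by the baseline standard deviation makes changes comparable across syllables with different natural pitch variability.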

d′ scores were computed to express the changes in mean daily pitch relative to last baseline day for each experimental day:

d_{i}^{'} = \frac{\sqrt{2} (P_{i} - P_{b})}{\sqrt{σ_{i}^{2} + σ_{b}^{2}}}

where P_i is the mean daily pitch on day i and $σ_{i}^{2}$ is the variance of pitch on day i. Day b refers to the last baseline day.

In the case of equal variances ( $σ_{i}^{2} = σ_{b}^{2}$ ), $d_{i}^{'}$ reports the change in mean pitch between experimental day i and the baseline day in units of SD.
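The d′ formula and its equal-variance special case can be checked numerically with a short sketch. The function name and the pitch values are hypothetical, and the use of the sample variance (n − 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics


def d_prime(day_pitches, baseline_pitches):
    """d'_i = sqrt(2) * (P_i - P_b) / sqrt(sigma_i^2 + sigma_b^2)."""
    mean_i = statistics.fmean(day_pitches)
    mean_b = statistics.fmean(baseline_pitches)
    var_i = statistics.variance(day_pitches)        # sample variance (assumption)
    var_b = statistics.variance(baseline_pitches)
    return math.sqrt(2.0) * (mean_i - mean_b) / math.sqrt(var_i + var_b)


# Hypothetical pitches in Hz. The experimental day is the baseline shifted
# by +10 Hz, so both days have the same variance (SD = 10 Hz).
baseline = [590.0, 600.0, 610.0]
day = [600.0, 610.0, 620.0]
score = d_prime(day, baseline)
```

Because the two variances are equal here, the denominator becomes √2·σ and the score reduces to the mean shift divided by the SD, that is 10 Hz / 10 Hz, or approximately 1.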

Maximum shift in mean daily pitch was computed as the difference in mean daily pitch between the baseline day and the day on which d′ reached its maximum value (day_max).

For analysis of spectral characteristics, SAP2011 was used to quantify longer segments (15–100 ms) of targeted or non-targeted harmonic stacks. Acoustic features were measured from one time-matched session from the baseline day and day_max. d′ scores for acoustic features were calculated using the same formula as for pitch shift.

For analysis of immediate motor effects, the first session of the stimulation or inhibition day and the time-matched session of the last baseline day were analyzed. Z-scores were computed for hit and escape renditions, using the mean and standard deviation of the last baseline session within each animal.

Statistical Analysis

To determine whether parametric tests could be used, the Shapiro-Wilk test was performed on all data as a test for normality. Unless otherwise noted, statistical significance was tested with non-parametric statistical tests; Wilcoxon signed-rank tests and Wilcoxon rank-sum tests were used where appropriate. Statistical significance is indicated as *P < 0.05, **P < 0.02. Statistical details for all experiments are included in their corresponding figure legends.

DATA AND SOFTWARE AVAILABILITY

Data and custom-written software are available upon request.

Supplementary Material

NIHMS947976-supplement.pdf (189.9 MB, pdf)

Highlights

Optogenetic excitation and inhibition of dopaminergic axon terminals in songbirds
Ventral tegmental area to basal ganglia pathway is sufficient to guide song learning
Activation and inhibition have opposing effects on song learning
Optogenetically learned changes in song are temporally and spectrally precise

Acknowledgments

The authors thank Drs. Joseph Takahashi, Samuel Sober and Richard Hahnloser, and members of the Roberts laboratory for discussions and comments on the manuscript, Dr. Jeremy Clark for advice on fast-scan cyclic voltammetry experiments, Dr. Claire Stelly and Merridee Lefner for fabrication of voltammetry electrodes, Dr. Bence Ölveczky for CAF software, Drs. Karl Deisseroth and Edward Boyden for providing optogenetic constructs, Marguerita Kline for cloning of initial axon-targeted constructs, and Jennifer Holdway and Andrea Guerrero for laboratory support and animal husbandry. The Roberts laboratory was supported by grants from the US National Institutes of Health (R01NS102488, R01DC014364, R03MH111319), the National Science Foundation (IOS-1457206, IOS-1451034), University of Texas BRAIN Initiative (362808, 362430), the Klingenstein-Simons Fellowship, and a NARSAD Young Investigator Grant (Essel Investigator) from the Brain & Behavior Research Foundation (T.F.R.).

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Author contributions: L.X., G.C. and T.F.R. conceived and designed all experiments. L.X. collected and analyzed the in vivo optogenetic behavioral, and the anatomical tracing data. G.C. collected and analyzed the pitch contingent auditory feedback data and optogenetic behavioral data. F.G.O. and T.F.R. collected the cyclic voltammetry data. M.C. analyzed and imaged the anatomical data and helped analyze optogenetic behavioral data. M.J.W. analyzed the voltammetry data and provided reagents and advice to help collect these data in songbirds. T.F.R. and L.X. wrote the manuscript. All authors read and commented on the manuscript.

Declaration of Interests: The authors declare no competing interests. Correspondence and requests for materials should be addressed to T.F.R. (todd.roberts@utsouthwestern.edu)

References

Achiro JM, Shen J, Bottjer S. Neural activity in cortico-basal ganglia circuits of juvenile songbirds encodes performance during goal-directed learning. Elife. 2017;6:e26973. doi: 10.7554/eLife.26973.
Ali F, Otchy TM, Pehlevan C, Fantana AL, Burak Y, Olveczky BP. The basal ganglia is necessary for learning spectral, but not temporal, features of birdsong. Neuron. 2013;80:494–506. doi: 10.1016/j.neuron.2013.07.049.
Andalman AS, Fee MS. A basal ganglia-forebrain circuit in the songbird biases motor output to avoid vocal errors. Proc Natl Acad Sci U S A. 2009;106:12518–12523. doi: 10.1073/pnas.0903214106.
Barto AG. Adaptive critics and the basal ganglia. In: Houk J, Davis J, Beiser D, editors. Models of information processing in the basal ganglia. MIT Press; 1995. pp. 215–232.
Bottjer SW, Altenau B. Parallel pathways for vocal learning in basal ganglia of songbirds. Nat Neurosci. 2010;13:153–155. doi: 10.1038/nn.2472.
Bottjer SW, Miesner EA, Arnold AP. Forebrain lesions disrupt development but not maintenance of song in passerine birds. Science. 1984;224:901–903. doi: 10.1126/science.6719123.
Brainard MS, Doupe AJ. Interruption of a basal ganglia-forebrain circuit prevents plasticity of learned vocalizations. Nature. 2000;404:762–766. doi: 10.1038/35008083.
Bromberg-Martin ES, Matsumoto M, Hikosaka O. Dopamine in motivational control: rewarding, aversive, and alerting. Neuron. 2010;68:815–834. doi: 10.1016/j.neuron.2010.11.022.
Bush RR, Mosteller F. A mathematical model for simple learning. Psychological review. 1951a;58:313–323. doi: 10.1037/h0054388.
Bush RR, Mosteller F. A model for stimulus generalization and discrimination. Psychological review. 1951b;58:413–423. doi: 10.1037/h0054576.
Canopoli A, Herbst JA, Hahnloser RH. A higher sensory brain region is involved in reversing reinforcement-induced vocal changes in a songbird. J Neurosci. 2014;34:7018–7026. doi: 10.1523/JNEUROSCI.0266-14.2014.
Chang CY, Esber GR, Marrero-Garcia Y, Yau HJ, Bonci A, Schoenbaum G. Brief optogenetic inhibition of dopamine neurons mimics endogenous negative reward prediction errors. Nat Neurosci. 2016;19:111–116. doi: 10.1038/nn.4191.
Charlesworth JD, Tumer EC, Warren TL, Brainard MS. Learning the microstructure of successful behavior. Nat Neurosci. 2011;14:373–380. doi: 10.1038/nn.2748.
Charlesworth JD, Warren TL, Brainard MS. Covert skill learning in a cortical-basal ganglia circuit. Nature. 2012;486:251–255. doi: 10.1038/nature11078.
Cohen JY, Haesler S, Vong L, Lowell BB, Uchida N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature. 2012;482:85–88. doi: 10.1038/nature10754.
Cui G, Jun SB, Jin X, Pham MD, Vogel SS, Lovinger DM, Costa RM. Concurrent activation of striatal direct and indirect pathways during action initiation. Nature. 2013;494:238–242. doi: 10.1038/nature11846.
den Ouden HE, Kok P, de Lange FP. How prediction errors shape perception, attention, and motivation. Frontiers in psychology. 2012;3:548. doi: 10.3389/fpsyg.2012.00548.
Desrochers TM, Amemori K, Graybiel AM. Habit Learning by Naive Macaques Is Marked by Response Sharpening of Striatal Neurons Representing the Cost and Outcome of Acquired Action Sequences. Neuron. 2015;87:853–868. doi: 10.1016/j.neuron.2015.07.019.
Dhawale AK, Smith MA, Olveczky BP. The Role of Variability in Motor Learning. Annu Rev Neurosci. 2017;40:479–498. doi: 10.1146/annurev-neuro-072116-031548.
Doupe AJ, Kuhl PK. Birdsong and human speech: common themes and mechanisms. Annu Rev Neurosci. 1999;22:567–631. doi: 10.1146/annurev.neuro.22.1.567.
Dudman JT, Krakauer JW. The basal ganglia: from motor commands to the control of vigor. Curr Opin Neurobiol. 2016;37:158–166. doi: 10.1016/j.conb.2016.02.005.
Eshel N, Bukwich M, Rao V, Hemmelder V, Tian J, Uchida N. Arithmetic and local circuitry underlying dopamine prediction errors. Nature. 2015;525:243–246. doi: 10.1038/nature14855.
Fee MS, Goldberg JH. A hypothesis for basal ganglia-dependent reinforcement learning in the songbird. Neuroscience. 2011;198:152–170. doi: 10.1016/j.neuroscience.2011.09.069.
Frank MJ, Seeberger LC, O’Reilly RC. By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science. 2004;306:1940–1943. doi: 10.1126/science.1102941.
Gadagkar V, Puzerey PA, Chen R, Baird-Daniel E, Farhang AR, Goldberg JH. Dopamine neurons encode performance error in singing birds. Science. 2016;354:1278–1282. doi: 10.1126/science.aah6837.
Glimcher PW. Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proc Natl Acad Sci U S A. 2011;108(Suppl 3):15647–15654. doi: 10.1073/pnas.1014269108.
Granger AJ, Wallace ML, Sabatini BL. Multi-transmitter neurons in the mammalian central nervous system. Curr Opin Neurobiol. 2017;45:85–91. doi: 10.1016/j.conb.2017.04.007.
Graybiel AM. Habits, rituals, and the evaluative brain. Annu Rev Neurosci. 2008;31:359–387. doi: 10.1146/annurev.neuro.29.051605.112851.
Graybiel AM, Grafton ST. The striatum: where skills and habits meet. Cold Spring Harbor perspectives in biology. 2015;7:a021691. doi: 10.1101/cshperspect.a021691.
Haesler S, Rochefort C, Georgi B, Licznerski P, Osten P, Scharff C. Incomplete and inaccurate vocal imitation after knockdown of FoxP2 in songbird basal ganglia nucleus Area X. PLoS Biol. 2007;5:e321. doi: 10.1371/journal.pbio.0050321.
Hikosaka O, Miyashita K, Miyachi S, Sakai K, Lu X. Differential roles of the frontal cortex, basal ganglia, and cerebellum in visuomotor sequence learning. Neurobiol Learn Mem. 1998;70:137–149. doi: 10.1006/nlme.1998.3844.
Hikosaka O, Nakamura K, Nakahara H. Basal ganglia orient eyes to reward. J Neurophysiol. 2006;95:567–584. doi: 10.1152/jn.00458.2005.
Hoffmann LA, Saravanan V, Wood AN, He L, Sober SJ. Dopaminergic Contributions to Vocal Learning. J Neurosci. 2016;36:2176–2189. doi: 10.1523/JNEUROSCI.3883-15.2016.
Hollerman JR, Schultz W. Dopamine neurons report an error in the temporal prediction of reward during learning. Nat Neurosci. 1998;1:304–309. doi: 10.1038/1124.
Hong S, Hikosaka O. Dopamine-mediated learning and switching in cortico-striatal circuit explain behavioral changes in reinforcement learning. Frontiers in behavioral neuroscience. 2011;5:15. doi: 10.3389/fnbeh.2011.00015.
Hosp JA, Pekanovic A, Rioult-Pedotti MS, Luft AR. Dopaminergic projections from midbrain to primary motor cortex mediate motor skill learning. J Neurosci. 2011;31:2481–2487. doi: 10.1523/JNEUROSCI.5411-10.2011.
Houk J, Adams J, Barto AG. A model of how the basal ganglia generate and use neural signals that predict reinforcement. In: Houk J, Davis J, Beiser D, editors. Models of information processing in the basal ganglia. MIT Press; 1995. pp. 249–270.
Howard CD, Li H, Geddes CE, Jin X. Dynamic Nigrostriatal Dopamine Biases Action Selection. Neuron. 2017;93:1436–1450 e1438. doi: 10.1016/j.neuron.2017.02.029.
Howe MW, Dombeck DA. Rapid signalling in distinct dopaminergic axons during locomotion and reward. Nature. 2016;535:505–510. doi: 10.1038/nature18942.
Hu H. Reward and Aversion. Annu Rev Neurosci. 2016;39:297–324. doi: 10.1146/annurev-neuro-070815-014106.
Ikemoto S. Dopamine reward circuitry: two projection systems from the ventral midbrain to the nucleus accumbens-olfactory tubercle complex. Brain research reviews. 2007;56:27–78. doi: 10.1016/j.brainresrev.2007.05.004.
Immelmann K. Song development in zebra finch and other Estrildid finches. In: Hinde RA, editor. Bird Vocalisations. Cambridge University Press; 1969. pp. 61–74.
Jin X, Costa RM. Start/stop signals emerge in nigrostriatal circuits during sequence learning. Nature. 2010;466:457–462. doi: 10.1038/nature09263.
Jin X, Costa RM. Shaping action sequences in basal ganglia circuits. Curr Opin Neurobiol. 2015;33:188–196. doi: 10.1016/j.conb.2015.06.011.
Jin X, Tecuapetla F, Costa RM. Basal ganglia subcircuits distinctively encode the parsing and concatenation of action sequences. Nat Neurosci. 2014;17:423–430. doi: 10.1038/nn.3632.
Keller GB, Hahnloser RH. Neural processing of auditory feedback during vocal practice in a songbird. Nature. 2009;457:187–190. doi: 10.1038/nature07467.
Konopka G, Roberts TF. Animal Models of Speech and Vocal Communication Deficits Associated With Psychiatric Disorders. Biological psychiatry. 2016a;79:53–61. doi: 10.1016/j.biopsych.2015.07.001.
Konopka G, Roberts TF. Insights into the Neural and Genetic Basis of Vocal Communication. Cell. 2016b;164:1269–1276. doi: 10.1016/j.cell.2016.02.039.
Lapish CC, Kroener S, Durstewitz D, Lavin A, Seamans JK. The ability of the mesocortical dopamine system to operate in distinct temporal modes. Psychopharmacology. 2007;191:609–625. doi: 10.1007/s00213-006-0527-8.
Leblois A, Perkel DJ. Striatal dopamine modulates song spectral but not temporal features through D1 receptors. Eur J Neurosci. 2012;35:1771–1781. doi: 10.1111/j.1460-9568.2012.08095.x.
Leblois A, Wendel BJ, Perkel DJ. Striatal dopamine modulates basal ganglia output and regulates social context-dependent behavioral variability through D1 receptors. J Neurosci. 2010;30:5730–5743. doi: 10.1523/JNEUROSCI.5974-09.2010.
Lee JC, Tomblin JB. Reinforcement learning in young adults with developmental language impairment. Brain and language. 2012;123:154–163. doi: 10.1016/j.bandl.2012.07.009.
Leemburg S, Canonica T, Luft A. Motor skill learning and reward consumption differentially affect VTA activation. Scientific reports. 2018;8:687. doi: 10.1038/s41598-017-18716-w.
London SE, Clayton DF. Functional identification of sensory mechanisms required for developmental song learning. Nat Neurosci. 2008;11:579–586. doi: 10.1038/nn.2103.
Mandelblat-Cerf Y, Las L, Denisenko N, Fee MS. A role for descending auditory cortical projections in songbird vocal learning. Elife. 2014;3:e02152. doi: 10.7554/eLife.02152.
Mattis J, Tye KM, Ferenczi EA, Ramakrishnan C, O’Shea DJ, Prakash R, Gunaydin LA, Hyun M, Fenno LE, Gradinaru V, et al. Principles for applying optogenetic tools derived from direct comparative analysis of microbial opsins. Nat Methods. 2012;9:159–172. doi: 10.1038/nmeth.1808.
Montague PR, Dayan P, Sejnowski TJ. A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J Neurosci. 1996;16:1936–1947. doi: 10.1523/JNEUROSCI.16-05-01936.1996.
Morales M, Margolis EB. Ventral tegmental area: cellular heterogeneity, connectivity and behaviour. Nat Rev Neurosci. 2017;18:73–85. doi: 10.1038/nrn.2016.165.
Nelson AB, Kreitzer AC. Reassessing models of basal ganglia function and dysfunction. Annu Rev Neurosci. 2014;37:117–135. doi: 10.1146/annurev-neuro-071013-013916.
Olveczky BP, Andalman AS, Fee MS. Vocal experimentation in the juvenile songbird requires a basal ganglia circuit. PLoS Biol. 2005;3:e153. doi: 10.1371/journal.pbio.0030153.
Panigrahi B, Martin KA, Li Y, Graves AR, Vollmer A, Olson L, Mensh BD, Karpova AY, Dudman JT. Dopamine Is Required for the Neural Representation and Control of Movement Vigor. Cell. 2015;162:1418–1430. doi: 10.1016/j.cell.2015.08.014.
Petkov CI, Jarvis ED. Birds, primates, and spoken language origins: behavioral phenotypes and neurobiological substrates. Frontiers in evolutionary neuroscience. 2012;4:12. doi: 10.3389/fnevo.2012.00012.
Price PH. Developmental determinants of structure in zebra finch song. J Comp Physiol Psychol. 1979;93:260–277.
Rescorla R, Wagner A. A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In: Black A, Prokasy W, editors. Classical Conditioning II: Current Research and Theory. Appleton Century Crofts; 1972. pp. 64–99.
Reynolds JN, Hyland BI, Wickens JR. A cellular mechanism of reward-related learning. Nature. 2001;413:67–70. doi: 10.1038/35092560.
Roberts TF, Gobes SM, Murugan M, Olveczky BP, Mooney R. Motor circuits are required to encode a sensory model for imitative learning. Nat Neurosci. 2012;15:1454–1459. doi: 10.1038/nn.3206.
Roberts TF, Hisey E, Tanaka M, Kearney MG, Chattree G, Yang CF, Shah NM, Mooney R. Identification of a motor-to-auditory pathway important for vocal learning. Nat Neurosci. 2017 doi: 10.1038/nn.4563.
Sasaki A, Sotnikova TD, Gainetdinov RR, Jarvis ED. Social context-dependent singing-regulated dopamine. J Neurosci. 2006;26:9010–9014. doi: 10.1523/JNEUROSCI.1335-06.2006.
Scharff C, Nottebohm F. A comparative study of the behavioral deficits following lesions of various parts of the zebra finch song system: implications for vocal learning. J Neurosci. 1991;11:2896–2913. doi: 10.1523/JNEUROSCI.11-09-02896.1991.
Schultz W. Neuronal Reward and Decision Signals: From Theories to Data. Physiological reviews. 2015;95:853–951. doi: 10.1152/physrev.00023.2014.
Schultz W, Apicella P, Ljungberg T. Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J Neurosci. 1993;13:900–913. doi: 10.1523/JNEUROSCI.13-03-00900.1993.
Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. Science. 1997;275:1593–1599. doi: 10.1126/science.275.5306.1593.
Smith KS, Graybiel AM. Habit formation coincides with shifts in reinforcement representations in the sensorimotor striatum. J Neurophysiol. 2016;115:1487–1498. doi: 10.1152/jn.00925.2015.
Stachniak TJ, Ghosh A, Sternson SM. Chemogenetic synaptic silencing of neural circuits localizes a hypothalamus–>midbrain pathway for feeding behavior. Neuron. 2014;82:797–808. doi: 10.1016/j.neuron.2014.04.008.
Stephenson-Jones M, Yu K, Ahrens S, Tucciarone JM, van Huijstee AN, Mejia LA, Penzo MA, Tai LH, Wilbrecht L, Li B. A basal ganglia circuit for evaluating action outcomes. Nature. 2016 doi: 10.1038/nature19845.
Suri RE, Schultz W. Learning of sequential movements by neural network model with dopamine-like reinforcement signal. Experimental brain research. 1998;121:350–354. doi: 10.1007/s002210050467.
Sutton RS, Barto AG. Toward a modern theory of adaptive networks: expectation and prediction. Psychological review. 1981;88:135–170.
Sutton RS, Barto AG. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press; 1998.
Tanaka M, Singh Alvarado J, Murugan M, Mooney R. Focal expression of mutant huntingtin in the songbird basal ganglia disrupts cortico-basal ganglia networks and vocal sequences. Proc Natl Acad Sci U S A. 2016;113:E1720–1727. doi: 10.1073/pnas.1523754113.
Tchernichovski O, Nottebohm F, Ho CE, Pesaran B, Mitra PP. A procedure for an automated measurement of song similarity. Anim Behav. 2000;59:1167–1176. doi: 10.1006/anbe.1999.1416.
Tecuapetla F, Jin X, Lima SQ, Costa RM. Complementary Contributions of Striatal Projection Pathways to Action Initiation and Execution. Cell. 2016;166:703–715. doi: 10.1016/j.cell.2016.06.032.
Tritsch NX, Sabatini BL. Dopaminergic modulation of synaptic transmission in cortex and striatum. Neuron. 2012;76:33–50. doi: 10.1016/j.neuron.2012.09.023.
Tumer EC, Brainard MS. Performance variability enables adaptive plasticity of ‘crystallized’ adult birdsong. Nature. 2007;450:1240–1244. doi: 10.1038/nature06390.
Wall NR, De La Parra M, Callaway EM, Kreitzer AC. Differential innervation of direct- and indirect-pathway striatal projection neurons. Neuron. 2013;79:347–360. doi: 10.1016/j.neuron.2013.05.014.
Warren TL, Tumer EC, Charlesworth JD, Brainard MS. Mechanisms and time course of vocal learning and consolidation in the adult songbird. J Neurophysiol. 2011;106:1806–1821. doi: 10.1152/jn.00311.2011.
Watabe-Uchida M, Eshel N, Uchida N. Neural Circuitry of Reward Prediction Error. Annu Rev Neurosci. 2017;40:373–394. doi: 10.1146/annurev-neuro-072116-031109.
Watkins K. Developmental disorders of speech and language: from genes to brain structure and function. Progress in brain research. 2011;189:225–238. doi: 10.1016/B978-0-444-53884-0.00027-0.
Williams H. Birdsong and singing behavior. Ann N Y Acad Sci. 2004;1016:1–30. doi: 10.1196/annals.1298.029.
Wise RA. Dopamine, learning and motivation. Nat Rev Neurosci. 2004;5:483–494. doi: 10.1038/nrn1406.
Yttri EA, Dudman JT. Opponent and bidirectional control of movement velocity in the basal ganglia. Nature. 2016;533:402–406. doi: 10.1038/nature17639.
Zann RA. The Zebra Finch: a Synthesis of Field and Laboratory Studies. Oxford University Press; 1996.

Associated Data

Supplementary Materials

NIHMS947976-supplement.pdf (189.9 MB, PDF)
