Abstract
Specific corticostriatal structures and circuits are important for flexibly shifting between goal-oriented and habitual behaviors. For example, the orbitofrontal cortex and dorsomedial striatum are critical for goal-directed action, while the dorsolateral striatum supports habits. To determine the role of neurotrophin signaling, we overexpressed a truncated, inactive form of tropomyosin receptor kinase B [also called tyrosine receptor kinase B (TrkB)], the high-affinity receptor for Brain-derived Neurotrophic Factor, in the orbitofrontal cortex, dorsomedial striatum and dorsolateral striatum. Overexpression of truncated TrkB interfered with phosphorylation of full-length TrkB and ERK42/44, as expected. In the orbitofrontal cortex and dorsomedial striatum, truncated TrkB overexpression also occluded the ability of mice to select actions based on the likelihood that they would be reinforced. Meanwhile, in the dorsolateral striatum, truncated TrkB blocked the development of habits. Thus, corticostriatal TrkB-mediated plasticity appears necessary for balancing actions and habits.
Introduction
Flexible action requires shifting between familiar and novel behavioral strategies. Extensive response training and exposure to stressors and certain drugs of abuse can lead to a bias towards habit-based behaviors that are by contrast inflexible. Maladaptive habits may contribute to illnesses characterized by impulse control deficits, such as addiction and obsessive-compulsive disorder1,2. Nevertheless, the mechanisms by which the brain balances actions and habits are still being identified.
During the initial acquisition of an instrumental behavior, organisms are typically sensitive to the predictive relationship between actions and their outcomes, and goal-directed action selection strategies dominate. After continued training, reward-related stimuli can gain control over behavior, and behavioral response strategies become automated, or “habitual,” and insensitive to action-outcome associations3–5. The posterior dorsomedial striatum (DMS) and orbitofrontal prefrontal cortex (oPFC) are necessary for goal-directed actions, while the dorsolateral striatum (DLS) controls habits3,6–8.
The primary neurotrophin Brain-derived Neurotrophic Factor (BDNF) appears to be a key cortical substrate coordinating goal-directed action selection, given that oPFC-selective Bdnf knockdown causes failures in action-outcome decision making and a deferral to habit-based behaviors7,9. Where, precisely, stimulation of the high-affinity BDNF receptor tyrosine receptor kinase B (TrkB) is important remains unclear, however, given that BDNF is subject to anterograde transport. For example, oPFC-selective Bdnf knockdown reduces BDNF protein in the dorsal striatum7, suggesting that TrkB activation in the oPFC, DMS, or both could support goal-directed action. Resolving these possibilities is important because upon BDNF binding, the intracellular domain of TrkB auto-phosphorylates, creating docking sites for effector proteins that initiate intracellular signaling cascades, e.g., the ERK42/44 and Akt pathways. TrkB impacts a diverse array of neuronal functions including cell survival and differentiation, axonal and dendritic growth and arborization and synapse formation and plasticity.
Here, mice were trained to generate two food-reinforced behaviors in operant conditioning chambers, then tested for sensitivity to action-outcome associations using a contingency degradation procedure. In this task, food pellets associated with one familiar behavior are delivered non-contingently (“for free”), regardless of the animal’s actions, while the other response remains reinforced (Fig. 1a). Mice that are sensitive to action-outcome contingencies decrease responding during the ‘degraded’ session since responding is not rewarded. By contrast, nose poking that has taken on habitual qualities remains robust.
Figure 1.
TrkB.t1 overexpression in the oPFC impedes goal-directed action selection. (a) Behavioral testing approach: Mice were trained to nose poke on two ports for food reinforcers. Then, one response was reinforced approximately 50% of the time (‘Non-degraded’), while the probability of reinforcement associated with the other response was greatly decreased (‘Degraded’), given that pellets were delivered non-contingently. Inhibiting responding in this condition is considered goal-directed, while insensitivity to non-contingent pellet delivery is considered habitual. (b) Experimental timeline: Mice were infused with viral vectors, then behaviorally tested. (c) Viral vector constructs (from ref.10) are shown. (d) A lentivirus expressing TrkB.t1, GFP, or a half-and-half mixture of both was infused bilaterally into the oPFC. Representative viral vector spread is represented on images from the Mouse Brain Library27. White represents the maximal spread and black the smallest. “VLO” refers to the ventrolateral oPFC. (e) Quantitative immunostaining revealed that full-titer lenti-TrkB.t1 infusions generated greater HA immunofluorescence than a half-and-half mixture of lenti-GFP and lenti-TrkB.t1 (“Half-TrkB.t1”) (GFP: n = 4; TrkB.t1: n = 5). Inset: Representative HA immunofluorescence. (f) Mice were trained to respond for food reinforcers. Full-titer TrkB.t1 overexpression reduced response rates (n = 6 mice/group). (g) Further, mice with full-titer lenti-TrkB.t1 were insensitive to action-outcome contingencies, failing to reduce responding when responding was not reinforced. (h) The same data were normalized to response rates generated on the last day of training, such that 0 reflects no change. Response rates increased in the ‘Non-degraded’ condition across groups. Meanwhile, full-titer TrkB.t1 overexpression interfered with response inhibition, such that these mice maintained high levels of responding even when responding was not reinforced (‘Degraded’ condition). (i) With additional exposure to noncontingent pellet delivery, full-titer TrkB.t1 mice were ultimately able to inhibit a nonreinforced response. Bars and symbols represent means + SEMs, *p < 0.05. Behavioral findings are concordant with independent unpublished pilot investigations and post-mortem experiments were conducted at least twice.
First, we infused into the oPFC a lentivirus expressing a truncated, inactive form of TrkB, TrkB.t1, which lacks an intracellular domain and therefore cannot initiate intracellular signaling pathways (from ref.10). We infused lenti-TrkB.t1 with an HA tag, lenti-Green Fluorescent Protein (GFP; a control), or a half-and-half mixture of the two in order to generate multiple lenti-TrkB.t1 “doses” (Fig. 1b–d). The full-titer lenti-TrkB.t1 infusion generated significantly greater HA immunoreactivity than the low-titer mixture (“Half TrkB.t1”) (t7 = −2.357, p = 0.05) (Fig. 1e), as expected. We then trained mice to nose poke for food reinforcers. While all groups initially acquired the responses, mice with full-titer lenti-TrkB.t1 generated lower response rates (day * group interaction F12,90 = 5.565, p < 0.001; main effects: day F6,90 = 86.345, p < 0.001; nose poke F1,15 = 0.284, p = 0.602; group F2,15 = 5.816, p = 0.013) (Fig. 1f), as also occurs with oPFC-selective Bdnf knockdown7,9 and oPFC damage more generally11. This profile is also consistent with impaired action-outcome decision making12. Indeed, lenti-TrkB.t1 interfered with the ability of mice to select actions based on their consequences during an instrumental contingency degradation procedure (Fig. 1g). Specifically, lenti-TrkB.t1 mice failed to inhibit a response that was unlikely to be rewarded. In contrast, GFP control and low-titer mice decreased responding when that behavior was unlikely to be reinforced (interaction F2,15 = 6.191, p = 0.011; main effects: nose poke F1,15 = 10.816, p = 0.005; group F2,15 = 2.042, p = 0.164) (Fig. 1g).
Given low response rates during instrumental response training (Fig. 1f), it is conceivable that full-titer lenti-TrkB.t1 mice were simply unable to energize a response that was reinforced (i.e., as opposed to being unable to inhibit an inappropriate response). We feel that this is unlikely, however, given that across all groups, response rates during the reinforced phase of testing were higher than those generated on the last day of training (main effect of test phase F1,15 = 14.696, p = 0.002; no interactions) (Fig. 1h). Groups differed only during the “degradation” phase when responding was no longer reinforced (Fig. 1h). Again, full-titer lenti-TrkB.t1 mice generated response rates higher than the last day of training (1-sample t-test against no change (0) t5 = 4.385, p = 0.007), even though responding was not reinforced. By contrast, the other groups did not (GFP control 1-sample t-test against no change (0) t4 = −2.414, p = 0.073; Half TrkB.t1 1-sample t-test against no change (0) t5 = −1.233, p = 0.272).
Does TrkB.t1 overexpression in the oPFC block, or instead delay, action-outcome learning and memory? To answer this question, we exposed mice to non-contingent pellet delivery for 2 additional sessions. Ultimately (in the final session), full-titer TrkB.t1 mice inhibited nonreinforced responding in a goal-directed manner, indicating that TrkB.t1 overexpression delayed, but did not fully occlude, action-outcome-based decision making (main effects: nose poke F1,15 = 61.688, p < 0.001; group F2,15 = 0.418, p = 0.666; no interactions) (Fig. 1i).
We next generated additional mice with either lenti-GFP or full-titer lenti-TrkB.t1 in the oPFC. We euthanized them 3 weeks following viral vector infusion and extracted the oPFC using a tissue punch. In this case, tissue samples would be expected to contain both infected and uninfected cells. Nevertheless, TrkB.t1 protein levels were detectably elevated in mice bearing the lenti-TrkB.t1 virus, as would be expected (t7 = −2.769, p = 0.028) (Fig. 2a). Phospho-TrkB was also diminished, consistent with the notion that over-expression of a truncated receptor interferes with signaling of the full-length receptor13–15 (t26 = 3.575, p = 0.001) (Fig. 2b). Meanwhile, total full-length TrkB protein was unaffected, indicating no gross compensatory changes in receptor expression (t26 = −1.453, p = 0.158) (Fig. 2c). Consistent with reductions in phospho-TrkB, phospho-ERK42/44 was also diminished (t26 = 3.218, p = 0.003) (Fig. 2d; agrees with ref.16, which uses the same viral vector in the hippocampus), while total ERK42/44 was unchanged (t26 = 0.493, p = 0.626) (Fig. 2e). Similarly, we identified a trend for reduced phospho-Akt (t26 = 3.218, p = 0.061) (Fig. 2f) and no changes in total Akt (t26 = 1.547, p = 0.134) (Fig. 2g).
Figure 2.
Validation of the TrkB.t1-overexpressing virus. (a) Virus-infected oPFC tissue was dissected by tissue punch and immunoblotted for TrkB.T1, revealing elevated TrkB.T1 protein in mice bearing the TrkB.t1-overexpressing virus, as expected (GFP: n = 4; TrkB.t1: n = 5). (b) Phospho-TrkB was also diminished (GFP: n = 12; TrkB.t1: n = 16, representing 2 independent cohorts; applies also to all following panels). (c) Full-length TrkB was unaffected. (d) Phospho-ERK42/44 was reduced, while (e) total ERK42/44 was unchanged. (f) Similarly, we identified a trend for reduced phospho-Akt and (g) no changes in total Akt. (h) Mature BDNF and (i) the pro-form were not significantly affected. (j) Finally, the astrocytic marker GFAP was reduced and (k) the synaptic marker PSD95 was not affected. Representative, unadjusted lanes from the same individual gels are shown with their corresponding loading controls. Molecular weights of each protein are indicated either in, or directly adjacent to, the protein name. Bars represent means + SEMs, *p < 0.05, #p = 0.06. Every gel was run at least twice.
BDNF levels can dynamically impact reward-related decision making. For example, microRNA regulation of BDNF in the prefrontal cortex mediates escalating alcohol intake in mice17. To address the possibility that TrkB.t1 overexpression led to an accumulation of cortical BDNF (e.g., by interfering with axonal transport), or alternatively, diminished local BDNF levels, we also quantified BDNF. Neither mature nor pro-BDNF was significantly affected, despite large group sizes (though downward trends were noted: t26 = 1.651, p = 0.111; t26 = 2.135, p = 0.090, respectively) (Fig. 2h and i). To address the potential concern that TrkB.t1 overexpression caused lesion-like tissue damage, we quantified the astrocytic marker GFAP, which increases upon lesion. TrkB.t1 overexpression reduced GFAP levels, however (t26 = 2.411, p = 0.027) (Fig. 2j). Our final finding that the postsynaptic marker PSD95 was not affected (t26 = 0.757, p = 0.457) (Fig. 2k) further supports our perspective that TrkB.t1 overexpression did not cause gross tissue damage.
As with cortical TrkB, striatal TrkB influences action selection strategies
Obstructing oPFC-striatal interactions causes the same impairments in goal-directed action as with oPFC-selective TrkB.t1 overexpression here7. Interfering with oPFC-striatal interactions also impedes an organism’s ability to modify instrumental behaviors when reward value changes8. The striatum contains very little Bdnf mRNA18, but abundant BDNF protein anterogradely transported from cortical sources19. We thus next examined whether TrkB in the dorsal striatum is similarly important for flexible action selection. In this case, we overexpressed TrkB.t1 selectively in the DMS or DLS (using the full-titer viral vector also used in Figs 1 and 2) (Fig. 3a,b). Response rates during initial nose poke training did not differ between groups (main effects: day F10,180 = 112.669, p < 0.001; nose poke F1,18 = 0.006, p = 0.937; group F2,18 = 0.674, p = 0.522) (Fig. 3c). TrkB.t1 overexpression in the DMS, however, induced failures in goal-oriented response selection, causing robust response rates despite non-contingent delivery of food pellets (interaction F2,18 = 8.14, p = 0.003; main effects: nose poke F1,18 = 65.625, p < 0.001; group F2,18 = 0.918, p = 0.417) (Fig. 3d). Thus, TrkB in the oPFC and downstream DMS appears to be essential for goal-directed action.
Figure 3.
TrkB.t1 overexpression in the striatum bidirectionally regulates actions and habits. (a) Lenti-TrkB.t1 or GFP was infused into the DMS or DLS, then sensitivity to action-outcome contingency was tested. (b) Viral vector infusions are represented, with white representing the largest infusion and black the smallest. (c) We detected no group differences during food-reinforced instrumental conditioning. (d) Overexpression of TrkB.t1 in the DMS, however, caused a bias towards inflexible habits, indicated by insensitivity to action-outcome contingencies. (e) Additional nose poke training induced habits in control mice, but overexpression of TrkB.t1 in the DLS blocked these habits from forming. n = 8, 6, 7 for GFP, DMS and DLS, respectively. Bars and symbols represent means + SEMs, *p < 0.05. Results are concordant with independent unpublished pilot investigations.
Next, we induced habit behavior using a random interval schedule of reinforcement (Fig. 3a). Following this training, both control and DMS TrkB.t1 mice generated inflexible habit-based responding as expected, indicated by insensitivity to action-outcome contingencies. By contrast, TrkB.t1 overexpression in the DLS interfered with habit formation – these mice remained sensitive to changes in action-outcome contingencies despite extensive behavioral experience (interaction F2,17 = 4.198, p = 0.033; main effects: nose poke F1,17 = 7.693, p = 0.013; group F2,17 = 0.495, p = 0.618) (Fig. 3e).
To summarize, TrkB appears to be essential to the functions of both the DMS (supporting goal-directed action) and DLS (supporting habits). Indeed, TrkB.t1 overexpression in these striatal sub-regions causes response patterns that bear remarkable resemblance to those following inactivation of each respective structure20,21. Although TrkB is expressed in both the DMS and DLS22, these patterns were nevertheless somewhat unexpected, given that systemic administration of a putative TrkB agonist blocks habits induced by extensive response training9 and excess glucocorticoids16, rather than facilitating this DLS-dependent behavior. TrkB stabilizes dendritic spine densities and morphologies throughout multiple brain regions23 and is essential for corticostriatal long-term potentiation24. The switch from goal-directed action to habits is thought to reflect a transition in the coordinated control of response strategies by multiple cortico-striatal regions to a predominantly DLS-controlled output (e.g.,3). Thus, broad-spread TrkB stimulation (i.e., due to systemic injection of a TrkB agonist) may energize goal-directed action by stimulating multiple cortico-striatal structures (such as the oPFC, DMS and prelimbic prefrontal cortex)4,12 competing with the DLS for control over behavior. Further understanding the molecular mechanisms mediating the balance between actions and habits could shed light onto treating disorders characterized by impairments in flexible action and decision making, such as obsessive-compulsive disorder and addiction1,2.
Methods
Subjects
Experiments used adult male wild-type C57BL/6 mice (≥postnatal day 60) (Jackson Laboratories, Bar Harbor, ME). Mice were housed 2–5 per cage and maintained on a 12-hour light cycle (on at 0800) and were experimentally naïve. Mice had ad libitum access to water and food, except during instrumental conditioning when body weights were maintained at ~90% of baseline. Procedures were approved by the Emory University Institutional Animal Care and Use Committee and were performed in concordance with The Guide for the Care and Use of Laboratory Animals.
Intracranial surgery
Mice were anaesthetized with ketamine/dexdomitor and then mounted onto a digital stereotaxic apparatus (Stoelting, Wood Dale, IL). Lentiviral vectors expressing TrkB.t1 and an HA tag or GFP under a CMV promoter were generated by the Emory University Viral Vector Core and have been described in detail previously10. Viral vectors were infused at a rate of 0.1 μL/minute, with a total volume of 0.5 μL and the microsyringe left in place for 5 minutes following infusion. In experiments targeting the oPFC, viral vectors were infused at +2.6 mm anteroposterior (AP), −2.85 mm dorsoventral (DV) and +/−1.2 mm mediolateral (ML). Viral vectors targeting the DMS were delivered to +0.74 mm AP, −3.0 mm DV and +/−2.2 mm ML. DLS coordinates were +0.5 mm AP, −3.5 mm DV and +/−2.7 mm ML.
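For readers who wish to keep the surgical parameters in machine-readable form, the following Python sketch records the coordinates listed above and computes the time the microsyringe occupies each site. It is an illustrative aid only: the class and function names are hypothetical, and it assumes that the 0.5 μL volume applies to each infusion site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfusionTarget:
    """Stereotaxic target in mm, as listed in the text."""
    region: str
    ap_mm: float
    dv_mm: float
    ml_mm: float  # bilateral: applied at +ml_mm and -ml_mm


TARGETS = [
    InfusionTarget("oPFC", 2.6, -2.85, 1.2),
    InfusionTarget("DMS", 0.74, -3.0, 2.2),
    InfusionTarget("DLS", 0.5, -3.5, 2.7),
]

RATE_UL_PER_MIN = 0.1   # infusion rate from the text
VOLUME_UL = 0.5         # assumed to be the volume per site
DWELL_MIN = 5.0         # microsyringe left in place after infusion


def minutes_per_site(volume_ul=VOLUME_UL, rate_ul_per_min=RATE_UL_PER_MIN,
                     dwell_min=DWELL_MIN):
    """Infusion time plus post-infusion dwell time for one site."""
    return volume_ul / rate_ul_per_min + dwell_min
```

Under these assumptions, each site requires 5 minutes of infusion followed by 5 minutes of dwell time.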
Action-outcome contingency degradation
Mice were trained to nose poke for food pellet reinforcers (20 mg grain-based pellets; Bioserv, Frenchtown, NJ) in Med-Associates (Georgia, VT) operant conditioning chambers. Mice were trained to nose poke on 2 available apertures using a fixed ratio 1 (FR1) schedule of reinforcement for 5 sessions. Next, mice were trained for 2 additional days using a random interval 30 second (RI30) schedule of reinforcement. Sessions lasted for 70 minutes or until the maximum 60 pellets (30 per nose poke) had been delivered.
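The logic of a random interval schedule can be illustrated with a short Python simulation. This is a sketch and not the Med-Associates program used here: it assumes a common implementation in which the schedule is evaluated once per second, a reinforcer is armed with probability 1/30, and the next response collects it. The function and parameter names are hypothetical, and a single aperture is modeled.

```python
import random


def simulate_random_interval(response_times, mean_interval_s=30.0,
                             max_pellets=30, session_s=70 * 60, seed=0):
    """Return the times (s) of responses that earn a pellet under RI.

    Assumption: once per second, an unarmed reinforcer becomes armed with
    probability 1 / mean_interval_s; the first response made while it is
    armed is reinforced. session_s must be an integer number of seconds.
    """
    rng = random.Random(seed)
    responses = sorted(t for t in response_times if 0 <= t < session_s)
    pellet_times = []
    armed = False
    idx = 0
    for second in range(session_s):
        if not armed and rng.random() < 1.0 / mean_interval_s:
            armed = True
        # Process every response that falls inside this one-second bin.
        while idx < len(responses) and responses[idx] < second + 1:
            if armed:
                pellet_times.append(responses[idx])
                armed = False
            idx += 1
        if len(pellet_times) >= max_pellets:
            break  # session ends once the pellet cap is reached
    return pellet_times
```

Because reinforcement depends on elapsed time rather than on the number of responses, responding faster earns few additional pellets in this simulation.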
Next, mice were tested for sensitivity to action-outcome contingencies using a modified version of classical action-outcome contingency degradation, the details of which are further discussed in refs25,26. Briefly, during the ‘non-degraded’ session, one nose poke aperture was occluded and responding on the other nose poke aperture was reinforced using a variable ratio 2 (VR2) schedule of reinforcement. The next day, during the ‘degraded’ session, pellets were delivered non-contingently at a rate yoked to the reinforcement rate from the previous session. Responses were recorded, but had no programmed consequences. The location of the ‘degraded’ aperture was counterbalanced across subjects. Mice that decrease their response rates during the ‘degraded’ session are considered goal-directed. Equivalent response rates during the ‘non-degraded’ and ‘degraded’ sessions are thought to reflect habitual responding4.
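The two test sessions can be sketched in Python as follows. This is an illustration rather than the chamber program used here: it assumes that VR2 is approximated by reinforcing each response with probability 0.5, and that yoking means delivering the same number of pellets at evenly spaced times regardless of responding. All names are hypothetical.

```python
import random


def non_degraded_session(n_responses, p_reinforce=0.5, seed=0):
    """Count pellets earned when each response is reinforced with p = 0.5.

    Assumption: VR2 is approximated as a probabilistic schedule in which,
    on average, one of every two responses is reinforced.
    """
    rng = random.Random(seed)
    return sum(rng.random() < p_reinforce for _ in range(n_responses))


def degraded_session_schedule(pellets_earned, session_s):
    """Return response-independent pellet delivery times (s).

    Assumption: the yoked rate is pellets_earned / session_s from the
    previous session, delivered at evenly spaced times.
    """
    if pellets_earned == 0:
        return []
    gap = session_s / pellets_earned
    return [gap * (i + 1) for i in range(pellets_earned)]
```

In the second function, responses play no role in determining pellet delivery, which is what degrades the action-outcome contingency.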
In experiments with oPFC infusions, mice were tested in the modified contingency degradation procedure 3 consecutive times. In experiments with striatal infusions, following the first contingency degradation test, mice were trained for an additional 4 days with 2 available nose poke recesses using an RI60-second schedule of reinforcement. Then, mice were again tested for sensitivity to action-outcome contingency degradation, as above.
Immunohistochemistry
Histology
Mice were anesthetized by isoflurane and euthanized by rapid decapitation. Brains were stored for 48 hours in 4% paraformaldehyde and then transferred to a 30% w/v sucrose solution. Brains were then sectioned at 50 μm. To verify infusion sites, sections were immunostained for the HA tag on the TrkB.t1 virus, or GFP was imaged. To visualize HA, sections were blocked, then incubated with the primary antibody [anti-HA; 1:1000; Sigma-Aldrich (Product #H6908), St. Louis, MO] overnight at 4 °C. The next day, sections were incubated with secondary antibody (Alexa Fluor 488 or 594 anti-rabbit; 1:500; Jackson ImmunoResearch Laboratories, West Grove, PA) and then mounted with Permount (Fisher Scientific, Hampton, NH) for fluorescence imaging. Mice with mislocalized infusions were excluded from analysis, resulting in the omission of 1 mouse from each of the TrkB.t1 groups and 2 mice from the GFP control groups in the oPFC infusion experiment and 2 mice from each group in the dorsal striatal infusion experiment.
Quantitative imaging
Sections were immunostained for the HA tag (as above). Sections were imaged on a Nikon 4550 s SMZ18 stereo microscope (Nikon Instruments, Melville, NY). All images were collected in the same session with settings held constant. A sampling area was drawn around the infusion site and the mean integrated intensity was quantified in NIS Elements (Nikon Instruments).
Western blotting
Behaviorally-naïve mice received oPFC-targeted infusions of full-titer lenti-TrkB.t1 or GFP as above. Approximately 3 weeks following infusion, matching the onset of behavioral studies, mice were rapidly decapitated and brains were stored at −80 °C, then later sectioned into 1-mm thick sections. The oPFC was dissected using a 1 mm tissue core. Tissue was homogenized in lysis buffer [200 μL; 137 mM NaCl, 20 mM Tris-HCl (pH = 8), 1% igepal, 10% glycerol, 1:100 Phosphatase Inhibitor Cocktails 1 and 2 (Sigma-Aldrich) and 1:1000 Protease Inhibitor Cocktail (Sigma-Aldrich)] and protein concentrations were determined by a Pierce BCA Protein Assay kit (Thermo Fisher Scientific). 15 μg of each sample was separated by SDS-PAGE on a 7.5% gradient Tris-glycine gel (Bio-Rad Laboratories, Inc., Hercules, CA). Next, samples were transferred to a PVDF membrane (Bio-Rad) and blocked with 5% nonfat dry milk for 1 hour. The membrane was incubated overnight at 4 °C in primary antibodies. Primary antibodies were TrkB [1:375; Cell Signaling Technology (Product #4606), Danvers, MA], phospho-Trk (Y706/Y707) [1:100; Cell Signaling (Product #4621)], Akt [1:500; Cell Signaling (Product #9271)], phospho-Akt (T308) [1:100; Cell Signaling (Product #2965)], ERK42/44 [1:500; Cell Signaling (Product #9102)], phospho-ERK42/44 (T202/Y204) [1:250; Cell Signaling (Product #4370)], BDNF [1:250; Sigma-Aldrich (Product #B9436)], GFAP [1:1000; Invitrogen (Product #180063)], PSD-95 [1:5000; Cell Signaling (Product #3450)] and HSP70 [1:5000 to 1:10000; Santa Cruz Biotechnology (Product #sc-7298), Dallas, TX]. Following 1 hour of incubation in secondary antibodies [goat anti-mouse and anti-rabbit peroxidase labeled IgG (Vector Laboratories, Burlingame, CA)], immunoreactivity was assessed using a chemiluminescence substrate (Thermo Fisher Scientific) and a ChemiDoc MP Imaging System (Bio-Rad). Immunoblot comparisons were generated at least twice.
Statistical analyses
All mice were randomly assigned to condition, and sample sizes were in line with prior reports using the same approaches (e.g., refs7,9). Behavioral response rates were compared by 2-factor mixed-design ANOVA and Bonferroni post-hoc comparisons in case of significant interactions. In an additional analysis, response rates during the instrumental contingency degradation testing phases were normalized to response rates associated with the same nose poke port generated on the final day of training. Fold-change values were compared by 2-factor ANOVA, as well as 1-sample t-tests against no change (0).
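The normalization and 1-sample comparison described above can be illustrated with a minimal Python sketch. It is not the SPSS or Prism procedure used here. The text specifies only that 0 reflects no change, so the formula (test − baseline)/baseline is an assumption, and the function names are hypothetical. Only the t statistic and degrees of freedom are computed; a p-value would additionally require the t distribution.

```python
import math
import statistics


def change_from_baseline(test_rate, last_training_rate):
    """Proportional change from the last training day (0 means no change).

    Assumption: the normalization is (test - baseline) / baseline.
    """
    return (test_rate - last_training_rate) / last_training_rate


def one_sample_t(values, mu=0.0):
    """Return (t, df) for a 1-sample t-test of the mean against mu."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # sample SD / sqrt(n)
    return (mean - mu) / sem, n - 1
```

For example, a mouse responding at 15 responses/min at test after 10 responses/min on the last training day (hypothetical values) yields a change score of 0.5. A group of 6 such scores gives 5 degrees of freedom, matching the t5 statistics reported for the full-titer group.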
For western blotting experiments, densitometry values were normalized to a loading control (HSP70) in the same lane and then to the control sample mean on the same gel to accommodate fluorescence variance across gels. Group means were then compared by a 2-tailed unpaired t-test.
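The two-step densitometry normalization can be sketched in Python as follows. This is an illustrative reading of the procedure, not the analysis files used here; the function name, argument names and the "GFP" label for control lanes are hypothetical.

```python
import statistics


def normalize_gel(bands, loading, groups, control_label="GFP"):
    """Normalize densitometry values from a single gel.

    bands, loading: per-lane densitometry for the target protein and the
    loading control (HSP70 in the text); groups: per-lane group labels.
    Each band is divided by the loading control in its own lane, then by
    the mean of the control-group lanes on the same gel, so that control
    lanes average 1.
    """
    ratios = [b / l for b, l in zip(bands, loading)]
    control_mean = statistics.mean(
        r for r, g in zip(ratios, groups) if g == control_label
    )
    return [r / control_mean for r in ratios]
```

Normalizing within each gel before pooling allows lanes from different gels to be compared on a common scale. The unpaired t-test on the pooled values then has n1 + n2 − 2 degrees of freedom, e.g., 26 for the groups of 12 and 16 in Fig. 2.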
Throughout, normality was confirmed using the Shapiro-Wilk test. Values >2 standard deviations above or below the mean were considered outliers and excluded, resulting in the omission of 1 mouse each from the fold-change calculations in Fig. 1h and the instrumental contingency degradation test 2 in Fig. 3. Statistical analyses were performed in SPSS or Prism with α ≤ 0.05. Data are presented as mean ± SEM and sample sizes are included in the associated figure legends.
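The outlier criterion can be expressed as a short Python sketch. It assumes that the sample standard deviation is used and that the rule is applied once to each group; these details are not specified in the text, and the function name is hypothetical.

```python
import statistics


def exclude_outliers(values, n_sd=2.0):
    """Drop values more than n_sd sample SDs above or below the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= n_sd * sd]
```

Note that with very small groups no value can lie far from the mean in SD units, so this rule only becomes effective as group size grows.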
Behavioral experiments were not performed blind to the condition, but response rates were collected via automated photobeam-based systems, minimizing bias. Similarly, equivalent amounts of protein were loaded in western blotting experiments, also minimizing bias.
Data availability statement
Data can be made available upon reasonable request.
Acknowledgements
We thank A. Allen, Mr. Michael Bower and Dr. Alonzo Whyte for their contributions. Thank you also to Dr. Kerry Ressler for the use of the TrkB.t1-expressing lentivirus. This work was supported by NIH MH101477. The Yerkes National Primate Research Center is supported by the Office of Research Infrastructure Programs/OD P51 OD011132. The Emory Viral Vector Core is supported by an NINDS Core Facilities grant, P30 NS055077.
Author Contributions
E.P. and S.G. designed the experiments and prepared the manuscript. E.P. and D.L. conducted the experiments and statistical analyses. D.L. assisted with manuscript editing.
Competing Interests
The authors declare no competing interests.
Footnotes
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Burguiere E, Monteiro P, Mallet L, Feng G, Graybiel AM. Striatal circuits, habits and implications for obsessive-compulsive disorder. Current opinion in neurobiology. 2015;30:59–65. doi: 10.1016/j.conb.2014.08.008.
2. Everitt BJ, Robbins TW. Drug addiction: Updating actions to habits to compulsions ten years on. Annual review of psychology. 2016;67:23–50. doi: 10.1146/annurev-psych-122414-033457.
3. Yin HH, Ostlund SB, Balleine BW. Reward-guided learning beyond dopamine in the nucleus accumbens: the integrative functions of cortico-basal ganglia networks. The European journal of neuroscience. 2008;28:1437–1448. doi: 10.1111/j.1460-9568.2008.06422.x.
4. Balleine BW, O’Doherty JP. Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology: official publication of the American College of Neuropsychopharmacology. 2010;35:48–69. doi: 10.1038/npp.2009.131.
5. Graybiel AM, Grafton ST. The striatum: where skills and habits meet. Cold Spring Harbor perspectives in biology. 2015;7:a021691. doi: 10.1101/cshperspect.a021691.
6. Yin HH, et al. Dynamic reorganization of striatal circuits during the acquisition and consolidation of a skill. Nature neuroscience. 2009;12:333–341. doi: 10.1038/nn.2261.
7. Gourley SL, et al. The orbitofrontal cortex regulates outcome-based decision-making via the lateral striatum. The European journal of neuroscience. 2013;38:2382–2388. doi: 10.1111/ejn.12239.
8. Gremel CM, Costa RM. Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nature communications. 2013;4:2264. doi: 10.1038/ncomms3264.
9. Zimmermann KS, Yamin JA, Rainnie DG, Ressler KJ, Gourley SL. Connections of the Mouse Orbitofrontal Cortex and Regulation of Goal-Directed Action Selection by Brain-Derived Neurotrophic Factor. Biological psychiatry. 2017;81:366–377. doi: 10.1016/j.biopsych.2015.10.026.
10. Rattiner LM, Davis M, French CT, Ressler KJ. Brain-derived neurotrophic factor and tyrosine kinase receptor B involvement in amygdala-dependent fear conditioning. The Journal of neuroscience: the official journal of the Society for Neuroscience. 2004;24:4796–4806. doi: 10.1523/JNEUROSCI.5654-03.2004.
11. Stalnaker TA, Cooch NK, Schoenbaum G. What the orbitofrontal cortex does not do. Nature neuroscience. 2015;18:620–627. doi: 10.1038/nn.3982.
12. Corbit LH, Balleine BW. The role of prelimbic cortex in instrumental conditioning. Behavioural brain research. 2003;146:145–157. doi: 10.1016/j.bbr.2003.09.023.
13. Eide FF, et al. Naturally occurring truncated trkB receptors have dominant inhibitory effects on brain-derived neurotrophic factor signaling. The Journal of neuroscience: the official journal of the Society for Neuroscience. 1996;16:3123–3129. doi: 10.1523/JNEUROSCI.16-10-03123.1996.
14. Saarelainen T, et al. Transgenic mice overexpressing truncated trkB neurotrophin receptors in neurons show increased susceptibility to cortical injury after focal cerebral ischemia. Molecular and cellular neurosciences. 2000;16:87–96. doi: 10.1006/mcne.2000.0863.
15. Haapasalo A, Koponen E, Hoppe E, Wong G, Castren E. Truncated trkB.T1 is dominant negative inhibitor of trkB.TK+ -mediated cell survival. Biochemical and biophysical research communications. 2001;280:1352–1358. doi: 10.1006/bbrc.2001.4296.
16. Barfield ET, et al. Regulation of actions and habits by ventral hippocampal trkB and adolescent corticosteroid exposure. PLoS Biol. 2017;15:e2003000. doi: 10.1371/journal.pbio.2003000.
17. Darcq E, et al. MicroRNA-30a-5p in the prefrontal cortex controls the transition from moderate to excessive alcohol consumption. Molecular psychiatry. 2015;20:1219–1231. doi: 10.1038/mp.2014.122.
18. Hofer M, Pagliusi SR, Hohn A, Leibrock J, Barde YA. Regional distribution of brain-derived neurotrophic factor mRNA in the adult mouse brain. The EMBO journal. 1990;9:2459–2464. doi: 10.1002/j.1460-2075.1990.tb07423.x.
19. Conner JM, Lauterborn JC, Gall CM. Anterograde transport of neurotrophin proteins in the CNS–a reassessment of the neurotrophic hypothesis. Reviews in the neurosciences. 1998;9:91–103. doi: 10.1515/REVNEURO.1998.9.2.91.
20. Yin HH, Knowlton BJ, Balleine BW. Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. The European journal of neuroscience. 2004;19:181–189. doi: 10.1111/j.1460-9568.2004.03095.x.
21. Yin HH, Ostlund SB, Knowlton BJ, Balleine BW. The role of the dorsomedial striatum in instrumental conditioning. The European journal of neuroscience. 2005;22:513–523. doi: 10.1111/j.1460-9568.2005.04218.x.
22. Altar CA, et al. In situ hybridization of trkB and trkC receptor mRNA in rat forebrain and association with high-affinity binding of [125I]BDNF, [125I]NT-4/5 and [125I]NT-3. The European journal of neuroscience. 1994;6:1389–1405. doi: 10.1111/j.1460-9568.1994.tb01001.x.
23. Bennett MR, Lagopoulos J. Stress and trauma: BDNF control of dendritic-spine formation and regression. Progress in neurobiology. 2014;112:80–99. doi: 10.1016/j.pneurobio.2013.10.005.
24. Park H, Popescu A, Poo MM. Essential role of presynaptic NMDA receptors in activity-dependent BDNF secretion and corticostriatal LTP. Neuron. 2014;84:1009–1022. doi: 10.1016/j.neuron.2014.10.045.
25. Hinton EA, Wheeler MG, Gourley SL. Early-life cocaine interferes with BDNF-mediated behavioral plasticity. Learning & memory (Cold Spring Harbor, N.Y.). 2014; pp. 253–257.
26. Swanson AM, Allen AG, Shapiro LP, Gourley SL. GABAAalpha1-mediated plasticity in the orbitofrontal cortex regulates context-dependent action selection. Neuropsychopharmacology. 2015;40:1027–1036. doi: 10.1038/npp.2014.292.
27. Rosen GD, et al. The Mouse Brain Library @ www.mbl.org. Int Mouse Genome Conference. 2000;14:166.