Abstract
Background
Distinguishing between actions that are more, or less, likely to be rewarded is a critical aspect of goal-directed decision-making. However, the underlying neuroanatomical and molecular mechanisms are not fully understood.
Methods
We used anterograde tracing, viral-mediated gene silencing, functional disconnection strategies, pharmacological rescue, and Designer Receptors Exclusively Activated by Designer Drugs (DREADDs) to determine the anatomical and functional connectivity between the orbitofrontal cortex (oPFC) and the amygdala in mice. In particular, we knocked down Brain-derived neurotrophic factor (Bdnf) bilaterally in the oPFC, or generated an oPFC-amygdala “disconnection” by pairing unilateral oPFC Bdnf knockdown with lesions of the contralateral amygdala. We characterized decision-making strategies using a task wherein mice select actions based on the likelihood that they will be reinforced. Additionally, we assessed the effects of DREADD-mediated oPFC inhibition on the consolidation of action-outcome conditioning.
Results
As in other species, the oPFC projects to the basolateral amygdala and dorsal striatum in mice. Bilateral Bdnf knockdown within the ventrolateral oPFC, and unilateral Bdnf knockdown accompanied by lesions of the contralateral amygdala, impede goal-directed response selection, implicating BDNF-expressing oPFC projection neurons in selecting actions based on their consequences. The TrkB agonist 7,8-dihydroxyflavone rescues action selection and increases dendritic spine density on excitatory neurons in the oPFC. Rho-kinase inhibition also rescues goal-directed response strategies, linking neural remodeling with outcome-based decision-making. Finally, DREADD-mediated oPFC inhibition weakens new action-outcome conditioning.
Conclusions
Activity- and BDNF-dependent neuroplasticity within the oPFC coordinates outcome-based decision-making through interactions with the amygdala. These interactions brake reward-seeking habits, a putative factor in multiple psychopathologies.
Keywords: orbital, habit, amygdala, action, outcome, striatum
Introduction
The orbitofrontal cortex (oPFC) is essential for encoding information about rewards and translating this information into behavioral response strategies. Accordingly, both rodents and non-human primates with lesions or inactivation of the oPFC fail to modify reward-seeking behaviors when a reinforcer loses value (e.g., 1,2). Further, the oPFC is essential to value judgment (3) and outcome expectancy (4). In other words, across species, the oPFC is critical for acquiring information relevant to salient outcomes.
These findings raise the possibility that the oPFC may guide decision-making strategies based not just on outcome value or reward-related cues, but also on other outcome-related information such as the likelihood that a given response will result in a desired outcome. In line with this perspective, recent reports indicate that oPFC-striatal interactions are preferentially engaged during goal-directed, as opposed to habitual, decision-making (5). Further, perturbations in oPFC-striatal interactions – through lesions, inactivation, hyper-activation, or targeted neurotrophin knockdown – result in involuntary motor movements, as well as inflexible habits (5–7).
In addition to the striatum, the oPFC innervates the basolateral nucleus of the amygdala (BLA) (8), which is also necessary for goal-directed decision-making – that is, selecting an action based on the value of an anticipated reinforcer, or based on the likelihood that it will be reinforced (9). From a circuit-level perspective, most reports in this domain have focused on BLA interactions with the nucleus accumbens and dorsal striatum (10–12), meaning top-down cortical regulation of BLA-dependent goal-directed decision-making is under-characterized. Further, these and related studies have largely used lesion approaches in rats, leaving molecular mechanisms unclear. Finally, most studies of the BLA utilize outcome devaluation procedures, which assess decision-making based on the value of a goal, rather than the predictive relationship between a response and a reinforcer.
In the present studies, we first report that mouse oPFC-amygdala and oPFC-striatal projection patterns are homologous to those of rats (8,13). Then, we use in vivo viral-mediated gene transfer in mice to inactivate the neuroplasticity-associated neurotrophin Brain-derived neurotrophic factor (Bdnf), or Designer Receptors Exclusively Activated by Designer Drugs (DREADDs) to dampen neural activity, and test a model in which plasticity in the ventrolateral oPFC (VLO) coordinates goal-directed action selection. We also used asymmetric infusion techniques to establish the functional necessity of VLO-BLA connectivity in selecting actions based on their consequences. We then attempted to augment goal-directed action selection using the TrkB agonist 7,8-dihydroxyflavone (7,8-DHF). Based on our evidence that 7,8-DHF induces dendritic spine proliferation, we last capitalized on the availability of a blood brain barrier-penetrant Rho-kinase inhibitor to rescue outcome-based decision-making following Bdnf knockdown. Together, our findings indicate that VLO Bdnf systems critically organize goal-directed decision-making via interaction with the downstream BLA.
Methods
For additional details, see Supplementary Materials.
Subjects
Mice were males, >8 weeks old. For studies involving Bdnf knockdown, mice were homozygous for a floxed allele (exon V) encoding the Bdnf gene (14). These mice were maintained on a mixed BALB/C background. For studies involving dendritic spine imaging, mice expressed thy1-derived YFP (15) and were fully back-crossed onto a C57BL/6 background. Other experiments used wildtype C57BL/6 mice, and all original breeding pairs were purchased from Jackson Labs. Throughout, littermates were represented in both control and experimental groups.
Mice were maintained on a 12-hour light cycle (0700 on) and provided food and water ad libitum except during instrumental conditioning, when body weights were maintained at ~93% of baseline to motivate responding. Procedures were approved by the Emory University IACUC.
Intracranial infusions
Using standard stereotaxic procedures and coordinates based on (16), the following were delivered: biotinylated dextran amine (BDA)-10,000 (0.15μl/site); lentiviral vectors expressing Cre-Recombinase or GFP under the CMV promoter (0.5μl/site) (Emory Viral Vector Core); or adeno-associated viruses (AAV5)-CaMKII-HA-hM4D(Gi)-IRES-mCitrine or AAV5-CaMKII-GFP (0.5μl/site) (UNC Viral Vector Core). For disconnection experiments, VLO infusions were unilateral, and NMDA (20μg/μl) or saline (0.1μl/site) was infused in the ipsilateral or contralateral BLA.
Instrumental conditioning
Mice were trained to nose poke for food reinforcement (20mg pellets; Bioserv) using Med-Associates conditioning chambers. Training was initiated with a fixed ratio 1 schedule of reinforcement; 30 pellets were available for responding on each of 2 distinct nose poke recesses located on opposite sides of a single wall within the chambers, resulting in 60 pellets/session. Sessions ended when all 60 pellets were delivered or at 135min. Unless otherwise indicated, after 5 sessions, mice were shifted to a random interval (RI) 30-second schedule of reinforcement for 2 sessions; again, 30 pellets were available for responding on each of 2 apertures. At this point, sensitivity to instrumental contingency degradation was tested, or in the case of extended training, mice were trained for an additional 6 RI30-second sessions and then 7 RI60-second sessions to promote the formation of stimulus-response habits (17). Response acquisition curves represent total responses/min.
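The random-interval schedule described above can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the software used in the conditioning chambers: the exponentially distributed arming interval, the function name `simulate_ri_session`, and its parameters are hypothetical. Only the 30-second mean interval and the 30-pellet cap per aperture are taken from the text.

```python
import random


def simulate_ri_session(response_times, mean_interval_s=30.0, max_pellets=30, seed=0):
    """Return the times (s) of reinforced responses under a random-interval schedule.

    Assumption: the delay until the next pellet becomes available is drawn
    from an exponential distribution with the given mean.
    """
    rng = random.Random(seed)
    reinforced = []
    # Time at which the next pellet becomes available ("armed").
    armed_at = rng.expovariate(1.0 / mean_interval_s)
    for t in sorted(response_times):
        if len(reinforced) >= max_pellets:
            break  # pellet cap for this aperture reached
        if t >= armed_at:
            # First response after arming is reinforced; start a new interval.
            reinforced.append(t)
            armed_at = t + rng.expovariate(1.0 / mean_interval_s)
    return reinforced
```

Under this scheme, responding faster than the mean interval yields little additional reinforcement, which is the property that distinguishes interval from ratio schedules.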
A modified version of classical instrumental contingency degradation was used. As previously described (e.g., 18,19), in the “non-degraded session”, one nose poke aperture was occluded, and responding on the other aperture was reinforced using a variable ratio 2 schedule of reinforcement for 25min. In the “degraded session”, the opposite aperture was occluded, and reinforcers were delivered into the magazine for 25min. at a rate matched to each animal’s reinforcement rate the previous day. Responding produced no programmed consequences. Thus, one response became significantly more predictive of reinforcement than the other (see 20). Both apertures were available during a subsequent 10min. probe test, conducted in extinction. In the “disconnection” experiment, this 3-day process was repeated.
Extinction conditioning
After testing as above, mice in one experiment were placed in the conditioning chambers for an additional 15min./day for 7 days. Responding was not reinforced, and mice were injected with vehicle or 7,8-DHF immediately after each session. A subset of these mice was thy1-YFP-expressing, to allow for dendritic spine imaging described below. YFP- and non-YFP-expressing mice did not differ in response extinction.
Drugs
7,8-DHF (Sigma; 5mg/kg, 17% DMSO), fasudil (LC Laboratories; 10mg/kg, PBS), ANA-12 (Sigma; 0.5mg/kg, 1% DMSO), CNO (Sigma; 1mg/kg, 2% DMSO), or the corresponding vehicle was administered i.p. immediately following action-outcome contingency degradation (7,8-DHF, CNO), immediately prior to contingency degradation (ANA-12), or immediately following extinction training (7,8-DHF). Groups were assigned by matching mice based on response rates during training.
Dendritic spine imaging and enumeration
Dendritic spine imaging was accomplished as described (18,21). 40μm-thick sections were generated from YFP-expressing brains, and unobstructed dendritic segments running parallel to the surface of the section were imaged using a 0.1μm step size. Collapsed z-stacks were analyzed using ImageJ: Each protrusion <4μm was considered a spine and counted (22). Each animal contributed a single density value (its average) to statistical analyses. A single blinded rater scored all spines.
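The counting rule and per-animal averaging described above can be sketched as follows. This is illustrative only: the input layout (a list of protrusion lengths plus a segment length), the function names, and the expression of density as spines per micrometer of dendrite are assumptions, not details given in the text. The <4μm criterion and the single averaged value per animal are from the text.

```python
from statistics import mean


def spine_density(protrusion_lengths_um, segment_length_um, max_spine_um=4.0):
    """Spines per micrometer for one dendritic segment (assumed unit)."""
    if segment_length_um <= 0:
        raise ValueError("segment length must be positive")
    # Only protrusions shorter than the cutoff are counted as spines.
    n_spines = sum(1 for length in protrusion_lengths_um if length < max_spine_um)
    return n_spines / segment_length_um


def animal_density(segments):
    """Average the segment densities so each animal contributes one value.

    `segments` is a list of (protrusion_lengths_um, segment_length_um) pairs.
    """
    return mean(spine_density(lengths, seg_len) for lengths, seg_len in segments)
```

Averaging within each animal before statistical testing keeps the animal, rather than the dendritic segment, as the unit of analysis.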
Histology
Brains were sectioned into 55μm-thick sections. BDA signal was amplified with a Vectastain Elite ABC kit and revealed by nickel-enhanced-diaminobenzidine staining. Maximum diffusion around the infusion site was mapped, and patterns of axon terminals downstream of each infusion site were transposed onto representative coronal sections from (16). Labeling from 2–3 mice was analyzed/site.
Following viral vector delivery, every third section was imaged for GFP or mCitrine, or immunostained for Cre (Sigma; 1:750) as appropriate.
To confirm lesion sites, every third section was immunostained for Glial Fibrillary Acidic Protein (GFAP) (Dakocytomation; 1:1000) as described (23).
BDNF quantification
Mice were rapidly decapitated, and brains were frozen on dry ice for BDNF quantification by enzyme-linked immunosorbent assay (ELISA). The VLO and amygdala were extracted with 1mm bilateral tissue punches. ELISA was performed in accordance with manufacturer’s instructions (Promega) except the extraction step was excluded. BDNF concentrations were normalized to the total protein content in each sample. Concentrations were normalized to the mean of the control samples on the same plate to control for fluorescence variance across plates.
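The two normalization steps described above (to total protein, then to the same-plate control mean) can be sketched as below. The dictionary keys, the group label "control", and the function name are hypothetical; units are left abstract because the text does not specify them.

```python
from collections import defaultdict
from statistics import mean


def normalize_bdnf(samples):
    """Return one normalized BDNF value per sample, in input order.

    Each sample is a dict with the (hypothetical) keys
    'plate', 'group', 'bdnf', and 'protein'.
    """
    # Step 1: express BDNF relative to total protein in each sample.
    ratios = [(s, s["bdnf"] / s["protein"]) for s in samples]

    # Step 2: compute the mean control ratio separately for each plate.
    control_ratios = defaultdict(list)
    for s, ratio in ratios:
        if s["group"] == "control":
            control_ratios[s["plate"]].append(ratio)
    plate_means = {plate: mean(vals) for plate, vals in control_ratios.items()}

    # Step 3: divide by the same-plate control mean.
    # A plate without control samples raises KeyError here.
    return [ratio / plate_means[s["plate"]] for s, ratio in ratios]
```

With this scaling, control samples average 1.0 on every plate, so values from different plates can be pooled.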
Statistical analyses
Two-tailed statistical analyses with α=0.05 were performed using SigmaStat or SPSS. Tukey’s post-hoc tests were utilized in the event of interaction effects; post-hoc comparisons are indicated graphically. Values lying >2 standard deviations from the mean were considered outliers and excluded (see Supplementary Materials). BDNF covariance with behavioral measures was tested using a linear regression analysis.
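The outlier criterion can be written as a short filter. This is a minimal sketch, assuming the sample standard deviation and a mean computed over all values in a group including the candidate outlier; the text does not specify either choice, and the function name is hypothetical.

```python
from statistics import mean, stdev


def exclude_outliers(values, n_sd=2.0):
    """Split values into (kept, excluded) using a mean +/- n_sd * SD rule.

    Assumption: SD is the sample standard deviation (n - 1 denominator).
    Requires at least two values.
    """
    m = mean(values)
    sd = stdev(values)
    kept = [v for v in values if abs(v - m) <= n_sd * sd]
    excluded = [v for v in values if abs(v - m) > n_sd * sd]
    return kept, excluded
```

For example, in the group `[10, 11, 9, 10, 12, 10, 11, 9, 10, 40]`, only the value 40 lies more than 2 standard deviations from the mean and would be excluded.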
Results
The mouse oPFC innervates the dorsal striatum and amygdala
We first compared projection patterns between the well-studied dorsolateral oPFC/agranular insula (DLO/AI) and the adjacent VLO. BDA infusion into the VLO (fig. 1a) revealed innervation of both the dorsal striatum and amygdala to be overwhelmingly ipsilateral. The central aspect of the dorsal striatum received heavy innervation broadly along the rostrocaudal axis (fig. 1b). By contrast, only light labeling was present in the ventral striatum. Fibers entered the rostral striatum through the genu of the corpus callosum (gcc) and the external capsule, then formed multiple fiber bundles that coursed through the dorsomedial terminal fields along the rostrocaudal axis.
Within the amygdala, VLO-originating fibers largely spared the lateral amygdala and instead targeted the BLA (fig. 1c–d). In rostral sections, innervation was widely distributed, but in more caudal sections, labeling became laterally oriented along the external capsule. Light innervation of the medial intercalated masses was noted, but the central nucleus was relatively devoid of labeled terminals.
The DLO/AI (fig. 2a) also innervated the central aspects of the dorsal striatum. Unlike the VLO, the DLO/AI also targeted aspects of the lateral and ventral striatum (fig. 2b). Fibers originating from the DLO/AI reached the rostral striatum through the gcc and the external capsule and were organized into fiber bundles. Projections from the DLO/AI to the amygdala were again topographically organized; the heaviest labeling was identified in the rostral BLA, and terminals were densest along the lateral wall (fig. 2c–d). As reported (8), mid-amygdaloid labeling was primarily evident in the lateral basal nucleus, along with the ventral lateral amygdala. The majority of posterior terminals were located in the ventrolateral field of the basal nucleus (fig. 2c–d).
Projections from the DLO/AI to the amygdala appeared ipsilateral; however, unlike the VLO, innervation of the striatum was evident in both hemispheres, strongest ipsilateral to the infusion site (fig. 3a), culminating in a massive innervation of the posterior caudate (fig. 3b). Additionally of note was the presence of terminals and fibers of passage in the perirhinal cortex (PRh) originating from the DLO/AI, suggesting a DLO/AI-perirhinal-hippocampus pathway in mice similar to that found in macaques (24,25) (fig. 3c).
Overall, VLO vs. DLO/AI innervation patterns resembled those in rats (8,26,27), as well as other reports in mice (5,28–30).
VLO BDNF coordinates outcome-based decision-making
The mouse VLO innervates the dorsal striatum and BLA, regions associated with goal-directed action selection (9,31). The VLO might thus itself regulate decision-making based on the predictive relationship between an action and an outcome. To test this, we used a task in which mice are trained to generate two food-reinforced responses, then the likelihood that one response will be reinforced is reduced (action-outcome contingency degradation). Meanwhile, the other response remains reinforced in a separate training session (fig. 4a). During a subsequent probe test, mice can generate both responses freely; preferential engagement of the response that is likely to be reinforced is considered “goal-directed,” while non-selective responding is considered habit-based. Throughout these experiments, response acquisition curves reflect both responses; mice did not generate response biases that would interfere with subsequent experimental stages.
The neuroplasticity-associated neurotrophin Bdnf was knocked down in the VLO using viral vector strategies, reducing regional BDNF expression [Mann-Whitney U=17, p=0.04](fig. 4b–c). During response training, response rates in the knockdown group lagged, particularly in later sessions when the reinforcement schedule escalated from a fixed ratio to RI [interaction F(6,66)=3.8, p=0.006](fig. 4c). This profile is associated with impaired action-outcome decision-making (32). Indeed, Bdnf knockdown mice subsequently failed to differentiate between responses that were more, or less, likely to be reinforced, instead relying on habit-based strategies, generating both responses equivalently [interaction F(1,22)=9, p=0.007](fig. 4d).
Cortical pyramidal neurons provide BDNF to downstream substrates (33,34). Accordingly, BDNF in the amygdala was reduced following knockdown in the VLO [t27=3, p=0.005](fig. 4e). Further, amygdala BDNF levels correlated with response strategies — higher levels of BDNF were associated with avoidance of the response that was unlikely to be reinforced (r=0.53, p=0.05), while “low” BDNF was associated with habits (fig. 4e).
BDNF-expressing amygdala-targeted VLO projection neurons may thus regulate goal-directed action selection. To test this model, we modified classical disconnection procedures, in which unilateral lesions would be placed in the VLO and in the contralateral BLA; instead, we knocked down Bdnf unilaterally in the VLO and placed a lesion in the contralateral amygdala (fig. 5a). All mice acquired the instrumental responses, with no differences between groups (F<1)(fig. 5b). Thus, the response acquisition deficits following bilateral Bdnf knockdown (fig. 4c) cannot obviously be attributed to effects on VLO-BLA interactions. Nonetheless, contralateral infusions resulted in habitual response patterns [interaction F(2,30)=4.9, p<0.05](fig. 5c). By contrast, ipsilateral infusions, leaving one cortico-amygdala circuit intact, spared response selection.
With additional contingency degradation training, mice with contralateral infusions ultimately differentiated between the responses [effect of choice p<0.05](fig. 5c). Thus, interfering with BDNF-dependent VLO-amygdala interactions delays, but does not fully block, goal-directed response selection.
TrkB regulates goal-directed decision-making
Next, we assessed the role of the high-affinity BDNF receptor TrkB using the small-molecule agonist 7,8-DHF (35). Intact mice were extensively trained such that they would develop stimulus-response habits by virtue of over-training. Response rates did not differ between mice designated to vehicle or 7,8-DHF groups (Fs<1)(fig. 6a). We then violated the predictive relationship between one response and the associated reinforcer and injected mice immediately following this training session, during the presumptive consolidation of new learning. Vehicle-treated mice failed to differentiate between the responses that were more, or less, likely to be reinforced the following day, relying instead on familiar habit-based strategies. By contrast, 7,8-DHF caused a 2-fold preference for the response likely to be reinforced [interaction F(1,12)=6.2, p=0.03](fig. 6b).
We replicated this experiment, additionally pretreating mice with the TrkB antagonist ANA-12 (36). Groups did not differ during training (Suppl. fig. S1). ANA-12 blocked 7,8-DHF [interaction F(1,29)=8, p=0.009](fig. 6c), evidence that 7,8-DHF enhances the consolidation of action-outcome conditioning in a TrkB-dependent manner. Unexpectedly, mice that received both ANA-12 and 7,8-DHF preferentially engaged the response that was unlikely to be reinforced, though this effect may be driven by relatively few mice (fig. 6c, right).
Separate mice were trained to nose poke using a fixed ratio 1 schedule of reinforcement (fig. 6d) to confirm that systemic 7,8-DHF had no effects at a time point when typical mice would be expected to be “goal-directed” [main effect F(1,13)=71.2, p<0.001](fig. 6e). This is important because prelimbic PFC-targeted BDNF microinfusions, under certain circumstances, cause habit-like behavior (18,37).
Sensitivity to action-outcome contingency degradation and nonreinforcement (extinction) are dissociable (38), and the oPFC does not appear to be a site of extinction consolidation in appetitive contexts (39). On the other hand, 7,8-DHF enhances the extinction of conditioned freezing (40), suggesting that it may also regulate the extinction of an appetitive response. We trained mice further until responding was robust (>4 responses/min), then withheld reinforcement. Despite injections following several training sessions, 7,8-DHF did not impact response extinction (Fs<1)(fig. 6f). Following response extinction, we enumerated dendritic spines in the VLO and found that 7,8-DHF increased dendritic spine density on excitatory neurons in layer V (fig. 6g).
Correction of response strategies following VLO Bdnf silencing
Stimulating TrkB enhances the ability of mice to select actions based on their consequences. We thus next assessed whether 7,8-DHF could recover response strategies in Bdnf knockdown mice. We additionally treated a group with the Rho-kinase inhibitor fasudil, motivated by evidence that TrkB stimulation suppresses p75-mediated signaling, which can otherwise inhibit neurite outgrowth via substrates such as Rho-kinase (41). Again, injections were administered immediately following action-outcome contingency degradation training, and response rates represent responding, drug-free, during a subsequent probe test.
Response rates did not differ during training [“to be 7,8-DHF” vs. “to be saline” vs. “to be fasudil” Fs<1](fig. 6h). As expected, Bdnf knockdown reduced rates (main effect of Bdnf, p=0.04). Subsequently, vehicle-treated mice with VLO-targeted Bdnf knockdown failed to differentiate between the responses that were more, or less, likely to be reinforced. By contrast, knockdown mice treated with 7,8-DHF or fasudil preferentially engaged the response likely to be reinforced in a goal-directed fashion [Bdnf × 7,8-DHF F(1,37)=4, p=0.05; fasudil t5=2.6, p=0.047](fig. 6i). During this probe test, control mice generated >60% of responses toward the intact action-outcome contingency; this preference dropped to chance levels in knockdown mice. Response preference was fully rescued by 7,8-DHF and fasudil [F(4,39)=6.8, p<0.001; all groups compared to Cre-only, p<0.04](fig. 6j).
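The preference measure reported above can be expressed as a simple percentage of probe-test responses directed at the aperture whose contingency remained intact. The function and argument names below are hypothetical, and returning `None` for a mouse that made no responses is an assumption, not a rule stated in the text.

```python
def percent_preference(nondegraded_responses, degraded_responses):
    """Percent of probe-test responses made on the non-degraded aperture.

    With two available responses, 50% corresponds to chance (no preference).
    """
    total = nondegraded_responses + degraded_responses
    if total == 0:
        return None  # no responses: preference is undefined
    return 100.0 * nondegraded_responses / total
```

A mouse making 30 responses on the non-degraded aperture and 10 on the degraded aperture would score 75%, whereas equal responding on both yields 50%, the chance level against which the knockdown group is described.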
Additionally, fasudil did not impact response choice in typical mice with the same training history (Suppl. fig. S2).
Gi-DREADD stimulation impairs goal-directed action selection
Last, a CaMKII-Gi-DREADD or GFP was expressed in the VLO (fig. 7a), allowing us to acutely manipulate neuroplasticity in glutamatergic VLO projection neurons. Response rates did not differ between groups during training (Fs<1)(fig. 7b). The synthetic ligand CNO was then administered systemically to all mice following instrumental contingency degradation training. GFP-expressing mice subsequently preferentially generated the response most likely to be reinforced [main effect F(1,6)=7.1, p=0.04](fig. 7c, left), while Gi-DREADD-expressing mice initially engaged the response most likely to be reinforced, but this effect decayed, and responding became non-selective [interaction F(1,5)=24.6, p=0.004](fig. 7c, right). An overall interaction further indicated that Gi-DREADD stimulation weakened goal-directed response strategies [F(1,11)=5.1, p<0.05](fig. 7d).
Discussion
Considerable evidence indicates that the oPFC encodes salient information regarding desirable outcomes, such as external cues signaling reinforcement, as well as the value of rewards (2). The oPFC may also guide outcome-based decision-making based on other reinforcement-related information such as the likelihood that a given behavior will be reinforced, but to date, relatively few investigations into oPFC function have focused on action-outcome associative learning and memory. We addressed this gap by first verifying that two important subregions of the oPFC — the VLO and DLO/AI — innervate the amygdala in mice in patterns similar to those reported in rats (8). Next we used a combination of site-selective gene silencing and pharmacological interventions to demonstrate that: 1) VLO-derived BDNF is required for selecting actions based on their consequences; 2) obstructing BDNF-dependent functional connectivity between the VLO and amygdala impairs this type of goal-directed decision-making, resulting in habit-like behaviors; 3) habitual response strategies induced by either extended training or selective VLO Bdnf knockdown can be reversed by the TrkB agonist, 7,8-DHF, or the Rho-kinase inhibitor, fasudil; and 4) DREADD-mediated inhibition of glutamatergic neuroplasticity in the VLO disrupts the consolidation of new information regarding the predictive relationship between actions and their outcomes, weakening goal-directed response strategies.
We have previously reported that knocking down Bdnf in the VLO impairs goal-directed response selection (7). Here, we first highlight that the VLO innervates the BLA, an amygdalar subdivision involved in multiple forms of associative conditioning (9,42,43). Although less dense than those originating from the adjacent DLO/AI subregions, these projections are consistent with those reported in monkeys (44,45) and rats (8), as well as other investigations in mice (5,28–30). Conservation across species suggests that these networks are essential for evolutionarily-conserved behaviors, such as learning that specific actions produce desired outcomes, and that the VLO is positioned to provide top-down regulation of these processes. This may occur via local plasticity within the VLO that in turn coordinates differential excitatory outputs. Additionally, the VLO may affect plasticity in the BLA via axonal transport of small peptides such as BDNF. Indeed, BDNF expression in the amygdala was diminished following VLO Bdnf knockdown, and BDNF protein levels predicted response selection strategies.
VLO-BLA projections are ipsilateral. We capitalized on this segregated neuroanatomy by knocking down VLO Bdnf unilaterally and ablating the contralateral amygdala, leaving the infected VLO to project to the one remaining healthy amygdala. This “disconnection” approach allowed us to assess the impact of BDNF-dependent VLO-BLA interactions on response choice. Disconnection caused outcome-insensitive habits, recapitulating the effects of bilateral VLO Bdnf knockdown. Goal-directed responding was intact in mice with ipsilateral infusions in which one oPFC-amygdala circuit remained intact, further indicating that BDNF-mediated plasticity between these structures is fundamental to selecting actions based on their consequences.
TrkB regulates the consolidation of action-outcome conditioning
VLO Bdnf knockdown impaired goal-directed action selection, raising the possibility that TrkB stimulation could rescue, or enhance, action-outcome conditioning. We tested this by first inducing habits in mice using a classical approach – response over-training (46) – then stimulated TrkB during the period immediately following action-outcome instrumental contingency degradation, when mice could be presumably consolidating new information regarding the predictive relationships between actions and their outcomes (namely, that one action is no longer likely to be reinforced). The TrkB agonist 7,8-DHF enhanced action-outcome conditioning, resulting in goal-directed action selection in a subsequent probe test. These and other reports suggest that a latent “goal-directed” system can be accessed even once habits have developed (47), and our findings indicate that this process is TrkB-sensitive.
7,8-DHF also enhanced action-outcome conditioning in mice with VLO-targeted Bdnf knockdown, suggesting that BDNF organizes action selection through its high-affinity receptor TrkB, as opposed to pro-BDNF binding to the p75 receptor. Additionally, 7,8-DHF could be blocked by pretreatment with ANA-12, a TrkB antagonist, indicating that the actions of 7,8-DHF can be attributed, at least in part, to TrkB stimulation, rather than off-target effects.
In these experiments, we administered 7,8-DHF immediately following action-outcome contingency degradation, rather than at the probe test when mice must choose between responses that are more, or less, likely to be reinforced. This experimental design was motivated by evidence that temporary inactivation of the BLA during outcome devaluation training occludes goal-directed response selection during a subsequent probe test, while inactivation during the probe test has no effects (48–50). Thus, the BLA is essential for learning about, but not necessarily expressing, goal-directed decision-making strategies. Injections immediately following the training sessions additionally allowed us to avoid drug effects on the acquisition of instrumental contingency degradation training and instead target the consolidation phase of new learning.
Considerable attention has been given to the functional significance of projections from the BLA to the oPFC (51,52). While we have instead focused on oPFC projections to the BLA, it seems probable that bidirectional interactions regulate BDNF-dependent action selection. For example, 7,8-DHF increased dendritic spine density in deep-layer VLO. This spine population is targeted by BLA projections (53), so it is conceivable that 7,8-DHF corrected decision-making strategies in a direct manner by restoring VLO TrkB binding following local Bdnf knockdown, and in an indirect manner by structurally remodeling these neurons to support greater synaptic connectivity and increased sensitivity to BLA inputs. Supporting this “indirect” model, Rho-kinase inhibition mimicked 7,8-DHF, also correcting decision-making strategies in Bdnf-deficient mice. This is significant because Rho-kinase provides a contractile force on the actin cytoskeleton, the structural lattice that forms the shape of neurons; this can be inhibited by TrkB-mediated interference with p75 signaling, allowing for structural plasticity (41). A final consideration is that 7,8-DHF may additionally regulate action selection strategies by increasing TrkB binding in the amygdala, particularly given that 7,8-DHF facilitates long-term potentiation in this region (54).
Based on the association between structural plasticity in the VLO and regulation of action selection strategies, as well as evidence that BDNF release from axons of pyramidal neurons is activity-dependent (55,56), we lastly applied a CaMKII-Gi-coupled DREADD to excitatory VLO neurons. When the synthetic ligand CNO is administered, an inhibitory Gi pathway is acutely activated, reducing the likelihood that Gi-DREADD-expressing neurons will generate activity-dependent action potentials or release glutamate from neuron terminals (57). Impeding activity-dependent excitatory transmission in this manner disrupted consolidation processes associated with developing goal-directed action selection strategies, rendering the memory of contingency degradation training inherently labile. The effect was, interestingly, weaker than that of Bdnf knockdown; whether Bdnf knockdown causes both acute and chronic neurobiological sequelae that contribute to habit formation will be a topic of future consideration.
Conclusions
Our findings implicate a BDNF-sensitive VLO-amygdala neurocircuit in the coordination of actions and habits. These findings may provide mechanistic insight into evidence implicating oPFC Bdnf in psychopathologies such as addiction. For instance, cocaine seeking in rats has been associated with elevated oPFC Bdnf (58), while diminished oPFC Bdnf increases sensitivity to cocaine-associated conditioned stimuli (7). Given our current findings, it is tempting to speculate that drug-related oPFC Bdnf overexpression drives goal-oriented drug seeking, while the atrophy of oPFC neurotrophin systems – caused by stressor exposure, for example (59) – can drive habitual drug seeking. A second important aspect of this report pertains to the identification of experimental techniques that reverse habits, which has otherwise proven challenging in the field. We find that TrkB- and Rho-kinase-targeted drugs, when administered during the consolidation of new action-outcome associative conditioning, may serve as promising adjuncts to behavioral therapies aimed at suppressing or reversing habitual, maladaptive thought or behavioral patterns.
Supplementary Material
Acknowledgments
We thank Ms. Amanda Allen and Mr. Zach Liang for their contributions. This work was supported by T32DA015040, P51OD11132, P30NS055077, DA034808, DA036737, Children’s Healthcare of Atlanta, the Brain and Behavior Research Foundation when Dr. Gourley was the Foundation’s Katherine Deschner Family Investigator, and an NIMH BRAINS award to Dr. Gourley (MH101477). We thank Ms. Lauren Shapiro, and Drs. Geoffrey Schoenbaum and Christopher Muly for guidance and valuable feedback.
Footnotes
All authors report no biomedical financial interests or potential conflicts of interest.
References
- 1. Rhodes SE, Murray EA. Differential effects of amygdala, orbital prefrontal cortex, and prelimbic cortex lesions on goal-directed behavior in rhesus macaques. J Neurosci. 2013;33:3380–3389. doi: 10.1523/JNEUROSCI.4374-12.2013.
- 2. McDannald MA, Jones JL, Takahashi YK, Schoenbaum G. Learning theory: a driving force in understanding orbitofrontal function. Neurobiol Learn Mem. 2014;108:22–27. doi: 10.1016/j.nlm.2013.06.003.
- 3. Padoa-Schioppa C. Neurobiology of economic choice: a good-based model. Annu Rev Neurosci. 2011;34:333–359. doi: 10.1146/annurev-neuro-061010-113648.
- 4. Schoenbaum G, Roesch MR, Stalnaker TA, Takahashi YK. A new perspective on the role of the orbitofrontal cortex in adaptive behaviour. Nat Rev Neurosci. 2009;10:885–892. doi: 10.1038/nrn2753.
- 5. Gremel CM, Costa RM. Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nat Commun. 2013;4:2264. doi: 10.1038/ncomms3264.
- 6. Ahmari SE, Spellman T, Douglass NL, Kheirbek MA, Simpson HB, Deisseroth K, et al. Repeated cortico-striatal stimulation generates persistent OCD-like behavior. Science. 2013;340:1234–1239. doi: 10.1126/science.1234733.
- 7. Gourley SL, Olevska A, Zimmermann KS, Ressler KJ, Dileone RJ, Taylor JR. The orbitofrontal cortex regulates outcome-based decision-making via the lateral striatum. Eur J Neurosci. 2013;38:2382–2388. doi: 10.1111/ejn.12239.
- 8. McDonald AJ, Mascagni F, Guo L. Projections of the medial and lateral prefrontal cortices to the amygdala: a Phaseolus vulgaris leucoagglutinin study in the rat. Neuroscience. 1996;71:55–75. doi: 10.1016/0306-4522(95)00417-3.
- 9. Balleine BW, Killcross AS, Dickinson A. The effect of lesions of the basolateral amygdala on instrumental conditioning. J Neurosci. 2003;23:666–675. doi: 10.1523/JNEUROSCI.23-02-00666.2003.
- 10. Wang SH, Ostlund SB, Nader K, Balleine BW. Consolidation and reconsolidation of incentive learning in the amygdala. J Neurosci. 2005;25:830–835. doi: 10.1523/JNEUROSCI.4716-04.2005.
- 11. Shiflett MW, Balleine BW. At the limbic-motor interface: disconnection of basolateral amygdala from nucleus accumbens core and shell reveals dissociable components of incentive motivation. Eur J Neurosci. 2010;32:1735–1743. doi: 10.1111/j.1460-9568.2010.07439.x.
- 12. Corbit LH, Leung BK, Balleine BW. The role of the amygdala-striatal pathway in the acquisition and performance of goal-directed instrumental actions. J Neurosci. 2013;33:17682–17690. doi: 10.1523/JNEUROSCI.3271-13.2013.
- 13. Mailly P, Aliane V, Groenewegen HJ, Haber SN, Deniau JM. The rat prefrontostriatal system analyzed in 3D: evidence for multiple interacting functional units. J Neurosci. 2013;33:5718–5727. doi: 10.1523/JNEUROSCI.5248-12.2013.
- 14. Rios M, Fan G, Fekete C, Kelly J, Bates B, Kuehn R, et al. Conditional deletion of brain-derived neurotrophic factor in the postnatal brain leads to obesity and hyperactivity. Mol Endocrinol. 2001;15:1748–1757. doi: 10.1210/mend.15.10.0706.
- 15. Feng G, Mellor RH, Bernstein M, Keller-Peck C, Nguyen QT, Wallace M, et al. Imaging neuronal subsets in transgenic mice expressing multiple spectral variants of GFP. Neuron. 2000;28:41–51. doi: 10.1016/s0896-6273(00)00084-2.
- 16. Paxinos G, Franklin KBJ. The Mouse Brain in Stereotaxic Coordinates. 2nd ed. San Diego: Academic Press; 2001.
- 17. Dickinson A, Nicholas D, Adams CD. The effect of the instrumental training contingency on susceptibility to reinforcer devaluation. The Quarterly Journal of Experimental Psychology. 1983;35:35–51.
- 18. Gourley SL, Swanson AM, Jacobs AM, Howell JL, Mo M, Dileone RJ, et al. Action control is mediated by prefrontal BDNF and glucocorticoid receptor binding. Proc Natl Acad Sci U S A. 2012;109:20714–20719. doi: 10.1073/pnas.1208342109.
- 19. Swanson AM, Shapiro LP, Whyte AJ, Gourley SL. Glucocorticoid receptor regulation of action selection and prefrontal cortical dendritic spines. Commun Integr Biol. 2013;6:e26068. doi: 10.4161/cib.26068.
- 20. Hinton EA, Wheeler MG, Gourley SL. Early-life cocaine interferes with BDNF-mediated behavioral plasticity. Learn Mem. 2014;21:253–257. doi: 10.1101/lm.033290.113.
- 21. Gourley SL, Swanson AM, Koleske AJ. Corticosteroid-induced neural remodeling predicts behavioral vulnerability and resilience. J Neurosci. 2013;33:3107–3112. doi: 10.1523/JNEUROSCI.2138-12.2013.
- 22. Peters A, Kaiserman-Abramof IR. The small pyramidal neuron of the rat cerebral cortex. The perikaryon, dendrites and spines. Am J Anat. 1970;127:321–355. doi: 10.1002/aja.1001270402.
- 23. Gourley SL, Lee AS, Howell JL, Pittenger C, Taylor JR. Dissociable regulation of instrumental action within mouse prefrontal cortex. Eur J Neurosci. 2010;32:1726–1734. doi: 10.1111/j.1460-9568.2010.07438.x.
- 24. Suzuki WA, Amaral DG. Cortical inputs to the CA1 field of the monkey hippocampus originate from the perirhinal and parahippocampal cortex but not from area TE. Neurosci Lett. 1990;115:43–48. doi: 10.1016/0304-3940(90)90515-b.
- 25. Van Hoesen G, Pandya DN, Butters N. Some connections of the entorhinal (area 28) and perirhinal (area 35) cortices of the rhesus monkey. II. Frontal lobe afferents. Brain Res. 1975;95:25–38. doi: 10.1016/0006-8993(75)90205-x.
- 26. Schilman EA, Uylings HB, Galis-de Graaf Y, Joel D, Groenewegen HJ. The orbital cortex in rats topographically projects to central parts of the caudate-putamen complex. Neurosci Lett. 2008;432:40–45. doi: 10.1016/j.neulet.2007.12.024.
- 27. Berendse HW, Galis-de Graaf Y, Groenewegen HJ. Topographical organization and relationship with ventral striatal compartments of prefrontal corticostriatal projections in the rat. J Comp Neurol. 1992;316:314–347. doi: 10.1002/cne.903160305.
- 28. Oh SW, Harris JA, Ng L, Winslow B, Cain N, Mihalas S, et al. A mesoscale connectome of the mouse brain. Nature. 2014;508:207–214. doi: 10.1038/nature13186.
- 29. Matyas F, Lee J, Shin HS, Acsady L. The fear circuit of the mouse forebrain: connections between the mediodorsal thalamus, frontal cortices and basolateral amygdala. Eur J Neurosci. 2014;39:1810–1823. doi: 10.1111/ejn.12610.
- 30. Dong H, et al. Mouse Connectome Project. @ http://www.mouseconnectome.org/
- 31. Yin HH, Ostlund SB, Balleine BW. Reward-guided learning beyond dopamine in the nucleus accumbens: the integrative functions of cortico-basal ganglia networks. Eur J Neurosci. 2008;28:1437–1448. doi: 10.1111/j.1460-9568.2008.06422.x.
- 32. Corbit LH, Balleine BW. The role of prelimbic cortex in instrumental conditioning. Behav Brain Res. 2003;146:145–157. doi: 10.1016/j.bbr.2003.09.023.
- 33. Altar CA, Cai N, Bliven T, Juhasz M, Conner JM, Acheson AL, et al. Anterograde transport of brain-derived neurotrophic factor and its role in the brain. Nature. 1997;389:856–860. doi: 10.1038/39885.
- 34. Conner JM, Lauterborn JC, Yan Q, Gall CM, Varon S. Distribution of brain-derived neurotrophic factor (BDNF) protein and mRNA in the normal adult rat CNS: evidence for anterograde axonal transport. J Neurosci. 1997;17:2295–2313. doi: 10.1523/JNEUROSCI.17-07-02295.1997.
- 35. Jang SW, Liu X, Yepes M, Shepherd KR, Miller GW, Liu Y, et al. A selective TrkB agonist with potent neurotrophic activities by 7,8-dihydroxyflavone. Proc Natl Acad Sci U S A. 2010;107:2687–2692. doi: 10.1073/pnas.0913572107.
- 36. Cazorla M, Premont J, Mann A, Girard N, Kellendonk C, Rognan D. Identification of a low-molecular weight TrkB antagonist with anxiolytic and antidepressant activity in mice. J Clin Invest. 2011;121:1846–1857. doi: 10.1172/JCI43992.
- 37. Graybeal C, Feyder M, Schulman E, Saksida LM, Bussey TJ, Brigman JL, et al. Paradoxical reversal learning enhancement by stress or prefrontal cortical damage: rescue with BDNF. Nat Neurosci. 2011;14:1507–1509. doi: 10.1038/nn.2954.
- 38. Hammond LJ. The effect of contingency upon the appetitive conditioning of free-operant behavior. J Exp Anal Behav. 1980;34:297–304. doi: 10.1901/jeab.1980.34-297.
- 39. Panayi MC, Killcross S. Orbitofrontal cortex inactivation impairs between-but not within-session Pavlovian extinction: an associative analysis. Neurobiol Learn Mem. 2014;108:78–87. doi: 10.1016/j.nlm.2013.08.002.
- 40. Andero R, Heldt SA, Ye K, Liu X, Armario A, Ressler KJ. Effect of 7,8-dihydroxyflavone, a small-molecule TrkB agonist, on emotional learning. Am J Psychiatry. 2011;168:163–172. doi: 10.1176/appi.ajp.2010.10030326.
- 41. Reichardt LF. Neurotrophin-regulated signalling pathways. Philos Trans R Soc Lond B Biol Sci. 2006;361:1545–1564. doi: 10.1098/rstb.2006.1894.
- 42. Davis M. The role of the amygdala in fear and anxiety. Annu Rev Neurosci. 1992;15:353–375. doi: 10.1146/annurev.ne.15.030192.002033.
- 43. Fanselow MS, LeDoux JE. Why we think plasticity underlying Pavlovian fear conditioning occurs in the basolateral amygdala. Neuron. 1999;23:229–232. doi: 10.1016/s0896-6273(00)80775-8.
- 44. Groenewegen HJ, Uylings HB. The prefrontal cortex and the integration of sensory, limbic and autonomic information. Prog Brain Res. 2000;126:3–28. doi: 10.1016/S0079-6123(00)26003-2.
- 45. Barbas H. Connections underlying the synthesis of cognition, memory, and emotion in primate prefrontal cortices. Brain Res Bull. 2000;52:319–330. doi: 10.1016/s0361-9230(99)00245-2.
- 46. Balleine BW, O’Doherty JP. Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology. 2010;35:48–69. doi: 10.1038/npp.2009.131.
- 47. Gourley SL, Olevska A, Gordon J, Taylor JR. Cytoskeletal determinants of stimulus-response habits. J Neurosci. 2013;33:11811–11816. doi: 10.1523/JNEUROSCI.1034-13.2013.
- 48. Wellman LL, Gale K, Malkova L. GABAA-mediated inhibition of basolateral amygdala blocks reward devaluation in macaques. J Neurosci. 2005;25:4577–4586. doi: 10.1523/JNEUROSCI.2257-04.2005.
- 49. West EA, Forcelli PA, Murnen AT, McCue DL, Gale K, Malkova L. Transient inactivation of basolateral amygdala during selective satiation disrupts reinforcer devaluation in rats. Behav Neurosci. 2012;126:563–574. doi: 10.1037/a0029080.
- 50. Parkes SL, Balleine BW. Incentive memory: evidence the basolateral amygdala encodes and the insular cortex retrieves outcome values to guide choice between goal-directed actions. J Neurosci. 2013;33:8753–8763. doi: 10.1523/JNEUROSCI.5071-12.2013.
- 51. Holland PC, Gallagher M. Amygdala-frontal interactions and reward expectancy. Curr Opin Neurobiol. 2004;14:148–155. doi: 10.1016/j.conb.2004.03.007.
- 52. Schoenbaum G, Setlow B, Saddoris MP, Gallagher M. Encoding predicted outcome and acquired value in orbitofrontal cortex during cue sampling depends upon input from basolateral amygdala. Neuron. 2003;39:855–867. doi: 10.1016/s0896-6273(03)00474-4.
- 53. Ghashghaei HT, Barbas H. Pathways for emotion: interactions of prefrontal and anterior temporal pathways in the amygdala of the rhesus monkey. Neuroscience. 2002;115:1261–1279. doi: 10.1016/s0306-4522(02)00446-3.
- 54. Li C, Dabrowska J, Hazra R, Rainnie DG. Synergistic activation of dopamine D1 and TrkB receptors mediate gain control of synaptic plasticity in the basolateral amygdala. PLoS ONE. 2011;6:e26065. doi: 10.1371/journal.pone.0026065.
- 55. Balkowiec A, Katz DM. Cellular mechanisms regulating activity-dependent release of native brain-derived neurotrophic factor from hippocampal neurons. J Neurosci. 2002;22:10399–10407. doi: 10.1523/JNEUROSCI.22-23-10399.2002.
- 56. Gartner A, Staiger V. Neurotrophin secretion from hippocampal neurons evoked by long-term-potentiation-inducing electrical stimulation patterns. Proc Natl Acad Sci U S A. 2002;99:6386–6391. doi: 10.1073/pnas.092129699.
- 57. Dong S, Allen JA, Farrell M, Roth BL. A chemical-genetic approach for precise spatio-temporal control of cellular signaling. Mol Biosyst. 2010;6:1376–1380. doi: 10.1039/c002568m.
- 58. Hearing MC, Miller SW, See RE, McGinty JF. Relapse to cocaine seeking increases activity-regulated gene expression differentially in the prefrontal cortex of abstinent rats. Psychopharmacology (Berl). 2008;198:77–91. doi: 10.1007/s00213-008-1090-2.
- 59. Gourley SL, Kedves AT, Olausson P, Taylor JR. A history of corticosterone exposure regulates fear extinction and cortical NR2B, GluR2/3, and BDNF. Neuropsychopharmacology. 2009;34:707–716. doi: 10.1038/npp.2008.123.
- 60. Rosen G, et al. The mouse brain library. 2000 @ http://www.Mbl.Org. International Mouse Genome Conference.