Abstract
An essential component of goal-directed decision-making is the ability to maintain flexible responding based on the value of a given reward, or “reinforcer.” The medial orbitofrontal cortex (mOFC), a subregion of the ventromedial prefrontal cortex, is uniquely positioned to regulate this process. We trained mice to nose poke for food reinforcers and then stimulated this region using CaMKII-driven Gs-coupled designer receptors exclusively activated by designer drugs (DREADDs). In other mice, we silenced the neuroplasticity-associated neurotrophin brain-derived neurotrophic factor (BDNF). Activation of Gs-DREADDs increased behavioral sensitivity to reinforcer devaluation, whereas Bdnf knockdown blocked sensitivity. These changes were accompanied by modifications in breakpoint ratios in a progressive ratio task, and they were recapitulated in Bdnf+/− mice. Replacement of BDNF selectively in the mOFC in Bdnf+/− mice rescued behavioral deficiencies, as well as phosphorylation of extracellular-signal regulated kinase 1/2 (ERK1/2). Thus, BDNF expression in the mOFC is both necessary and sufficient for the expression of typical effort allocation relative to an anticipated reinforcer. Additional experiments indicated that expression of the immediate-early gene c-fos was aberrantly elevated in the Bdnf+/− dorsal striatum, and BDNF replacement in the mOFC normalized expression. Also, systemic administration of an MAP kinase kinase inhibitor increased breakpoint ratios, whereas the addition of discrete cues bridging the response–outcome contingency rescued breakpoints in Bdnf+/− mice. We argue that BDNF–ERK1/2 in the mOFC is a key regulator of “online” goal-directed action selection.
SIGNIFICANCE STATEMENT Goal-directed response selection often involves predicting the consequences of one's actions and the value of potential payoffs. Lesions or chemogenetic inactivation of the medial orbitofrontal cortex (mOFC) in rats induces failures in retrieving outcome identity memories (Bradfield et al., 2015), suggesting that the healthy mOFC serves to access outcome value information when it is not immediately observable and thereby guide goal-directed decision-making. Our findings suggest that the mOFC also bidirectionally regulates effort allocation for a given reward and that expression of the neurotrophin BDNF in the mOFC is both necessary and sufficient for mice to sustain stable representations of reinforcer value.
Keywords: cue, dorsal striatum, neurotrophin, operant, orbital, progressive ratio
Introduction
An essential component of goal-directed decision-making is assessing the value of a reinforcer and engaging response strategies accordingly. Electrophysiological studies in nonhuman primates and neuroimaging studies in humans suggest that the ventral and medial orbitofrontal cortex (mOFC) encodes the value of rewards during real or hypothetical tasks (Arana et al., 2003; Paulus and Frank, 2003; Padoa-Schioppa and Assad, 2006, 2008; Plassmann et al., 2007; Valentin et al., 2007). Furthermore, mOFC neurons are sensitive to satiety-specific reinforcer devaluation and notably more so than centrolateral OFC neurons (Bouret and Richmond, 2010). This observation is consistent with the perspective that, across species, ventromedial prefrontal cortical structures are concerned with representations of outcome value, as determined by internal means such as hunger/satiety, as well as inhibitory control processes; meanwhile, the lateral OFC integrates multisensory information to build representations of task states to guide optimal response selection strategies (Wallis, 2012; Wilson et al., 2014; Stalnaker et al., 2015; Gourley and Taylor, 2016).
What are the molecular mechanisms regulating mOFC function? One candidate is BDNF. BDNF belongs to a family of structurally and functionally related peptide growth factors and is expressed throughout the brain. BDNF facilitates synaptic transmission (Kang and Schuman, 1995, 1996) and long-term potentiation (LTP) in the adult hippocampus (Figurov et al., 1996; Korte et al., 1996; Patterson et al., 1996). In the cerebral cortex, BDNF regulates AMPA receptor subunit expression, experience-dependent synaptogenesis, and dendritic modeling (McAllister et al., 1995, 1997; Gorski et al., 2003; Genoud et al., 2004; Nakata and Nakamura, 2007). Bdnf−/− mutant mice do not survive to adulthood, but Bdnf+/− mutants are viable and grossly normal on multiple tests of “emotionality” and memory (Montkowski and Holsboer, 1997; but see Linnarsson et al., 1997; MacQueen et al., 2001; Chourbaji et al., 2004). Nonetheless, hippocampal LTP is impaired in these animals (Korte et al., 1995), indicating that even incomplete loss of BDNF has functional consequences.
To assess the role of mOFC BDNF in reward-related decision-making, we first turned to reinforcer devaluation assays. In both primates and rodents, sensitivity to outcome value can be quantified using these procedures, for example, by allowing mice previously trained to respond for a food reinforcer unrestricted access to that food before testing. Sensitivity to outcome value is reflected by diminished responding for the now-devalued food (Dickinson, 1980; Colwill and Rescorla, 1986; Balleine and O'Doherty, 2010). In another task, the progressive ratio (PR) schedule of reinforcement can quantify perceived outcome value using instrumental response requirements that progressively increase with each reinforcer delivery (Hodos, 1961). In this case, the highest response/reinforcer ratio (the “breakpoint ratio”) can serve as an indicator of perceived “value.” We report that viral-mediated mOFC-selective Bdnf knockdown decreases behavioral sensitivity to reinforcer devaluation and inflates responding on a PR schedule of reinforcement. Constitutive Bdnf+/− mice also generate aberrant breakpoints, and mOFC-selective BDNF replacement in these mice rescues behavioral abnormalities and immediate-early gene expression in the downstream striatum.
Goal-directed response selection often involves predicting the consequences of one's actions and the value of future outcomes. Bradfield et al. (2015) recently reported that the mOFC sustains goal-directed behavior by retrieving outcome identity information to guide response strategies when reward value is not immediately observable (e.g., during certain stages of reinforcer devaluation tasks). Our data suggest that this property also helps to gate appropriate effort allocation (as in the PR task) and that the neurotrophin BDNF is a critical molecular substrate mediating these mOFC functions. To further support these perspectives, we additionally report that sensitivity to outcome value and PR schedules of reinforcement can be enhanced by stimulating CaMKII-driven Gs-coupled excitatory designer receptors exclusively activated by designer drugs (DREADDs) in the mOFC. These findings raise the possibility that the mOFC could be a viable target for therapeutic interventions aimed at correcting, normalizing, or enriching goal-directed behaviors. Examples include therapies aimed at bringing about behavioral change in depression or addiction, illnesses commonly characterized by behavioral rigidity and inflexibility.
Materials and Methods
Subjects
Mice included the following: (1) adult male C57BL/6 mice (10–12 weeks old) obtained from Charles River or The Jackson Laboratory; (2) Bdnf+/− and littermate wild-type controls bred on a C57BL/6 background (The Jackson Laboratory); or (3) adult male mice homozygous for a floxed Bdnf gene (exon V) bred on a mixed BALB/c background (The Jackson Laboratory).
Mutant mice were tested between 10 weeks and 6 months of age; by 6 months, body weights in Bdnf+/− animals increase significantly (Kernie et al., 2000), even under food restriction (S.L.G. and J.R.T., unpublished observations). Genotypes were determined by PCR of tail tissue and, in the case of Bdnf+/− mice, confirmed by postmortem analysis of homogenized hippocampal tissue by BDNF ELISA (25 μl/well; Promega; methods below). Mice were food-restricted during instrumental conditioning, maintaining ∼93% original body weight unless otherwise noted. All tests were conducted during the light phase of the 12 h light cycle (7:00 A.M. lights on). The Yale and Emory University Animal Care and Use Committees approved procedures as appropriate.
Viral vector delivery
AAV8–CaMKII–HA–rM3D(Gs)–IRES–mCitrine (AAV–Gs–DREADD–mCitrine) or AAV8–CaMKII–GFP (AAV–GFP) viral vectors were generated by the University of North Carolina Viral Vector Core and infused into wild-type mice. Lentiviral vectors expressing GFP or Cre recombinase (Cre) under the CMV promoter were generated by the Emory University Viral Vector Core and infused into floxed Bdnf mice. Mice were anesthetized with ketamine/dexdomitor. With needles centered at bregma, stereotaxic coordinates were located on the leveled skull using a digitized stereotaxic frame. Viral vectors were delivered to +2.3 mm anteroposterior (AP), −2.8 mm dorsoventral (DV), and ±0.1 mm mediolateral (ML; Gourley et al., 2010) in a volume of 0.5 μl/side. The microsyringe remained in place for 5 min after infusion. Mice were sutured and allowed to recover for at least 3 weeks before behavioral testing. After testing, mice were deeply anesthetized and transcardially perfused with 4% paraformaldehyde, then brains were sectioned into 40 μm sections, and mCitrine or GFP was imaged to confirm that infusions infected the mOFC.
Clozapine-N-oxide injection in DREADDs-expressing mice and general experimental design
The DREADDs ligand clozapine-N-oxide (CNO; 1 mg/kg, i.p., in 2% DMSO and PBS; Sigma) was prepared fresh daily and administered immediately before the 30 min prefeeding period for devaluation experiments and then again 30 min before testing for PR experiments. A final injection was delivered 30 min before a 1 h locomotor monitoring test. The 30 min inject-to-test interval allowed CNO time to penetrate the brain. All mice received CNO, which increases the excitability of infected neurons in Gs–DREADDs-expressing mice but does not affect control GFP-expressing neurons (Urban and Roth, 2015).
In these experiments, the acute stress resulting from injection would be expected to reduce sensitivity to reinforcer devaluation (Schwabe and Wolf, 2011), as was indeed observed. Accordingly, mice were trained for four additional sessions and then retested in the devaluation procedure to confirm that they could ultimately develop sensitivity to reinforcer devaluation with habituation to the injection.
Instrumental response training
Instrumental response training was conducted as reported previously (Gourley et al., 2008a,b). Briefly, experimenters used standard aluminum operant conditioning chambers for mice (16 × 14 × 12.5 cm) controlled by MedPC software and equipped with two or three nose-poke recesses (Med-Associates). Each chamber was housed in a sound-attenuating outer chamber equipped with a white noise generator, fan, and house light. A dispenser delivered grain-based food pellets (20 mg; Bio-Serv) into the magazine. Head entries into the single active nose-poke aperture and the magazine were detected by photocell. Mice were initially trained to perform the operant response over several 25 min sessions, during which one, two, or three responses resulted in food reinforcement (a variable ratio 2 schedule of reinforcement). Once responding stabilized, experiments proceeded as indicated below. Response rates were compared by two-factor (genotype or treatment × session) ANOVA with repeated measures (RM). In the case of interactions, Tukey's post hoc comparisons were used for all experiments.
Satiety-specific reinforcer devaluation
To investigate behavioral sensitivity to reinforcer devaluation, trained mice were allowed unlimited access to the reinforcer pellets in a clean cage for 30 min immediately before a 15 min probe test conducted in extinction. Response rates were compared by two-factor ANOVA to responding in a session when ad libitum access to the reinforcer pellets had not been available, but rather mice were given regular chow. These “devalued” (pellets) and “value” (chow) sessions were counterbalanced, with one test per day on sequential days.
In one experiment, a main effect of testing order indicated that mice extinguished responding during the second test regardless of devalued or value condition. To address this issue, response rates after the first prefeeding period are compared with those generated on the final day of training, referred to as “baseline.”
PR test
PR testing was conducted in mice trained to nose poke as described above. We then applied a linearly increasing response/reinforcement requirement (1, 5, 9, x + 4 responses/reinforcement). The test ended when no active responses were emitted for 5 min or, if the mouse had not “timed out” in this way, after 4 h. The highest response/reinforcement ratios achieved, termed breakpoint ratios, were compared by two-factor (genotype × session) RM-ANOVA or t test as appropriate. For mice used in viral vector experiments, this test followed reinforcer devaluation testing to conserve animal usage. In Bdnf+/− mice, independent groups were used for devaluation and PR tests. Breakpoint ratios >2 SDs outside of the group means were excluded (Gourley et al., 2008a). Across all experiments, these exclusions were as follows: n = 1 lenti-GFP; n = 3 intact wild-type; n = 1 Bdnf+/− plus vehicle; n = 2 Bdnf+/− plus BDNF. Additionally, two Bdnf knockdown mice were excluded because of experimenter error.
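The schedule and exclusion rule described above can be illustrated with a minimal Python sketch. It is not the MedPC program used in these experiments. The function names are hypothetical, the breakpoint is simplified to the highest requirement fully completed from a total count of active responses, and the SD is assumed to be the sample SD computed with all animals in the group included.

```python
from statistics import mean, stdev


def pr_requirements(n_steps, start=1, step=4):
    """Response requirement for each successive reinforcer: 1, 5, 9, ..."""
    return [start + step * i for i in range(n_steps)]


def breakpoint_ratio(total_active_responses, start=1, step=4):
    """Highest requirement fully completed before responding stopped.

    Simplifying assumption: every active response counts toward the
    current requirement, so only the total response count is needed.
    """
    completed = 0
    requirement = start
    remaining = total_active_responses
    while remaining >= requirement:
        remaining -= requirement
        completed = requirement
        requirement += step
    return completed


def exclude_outliers(breakpoints, n_sd=2.0):
    """Drop values more than n_sd sample SDs from the group mean.

    Assumption: mean and SD are computed with the candidate outlier included.
    """
    m, sd = mean(breakpoints), stdev(breakpoints)
    return [b for b in breakpoints if abs(b - m) <= n_sd * sd]
```

For instance, a mouse that emits 15 active responses completes the requirements of 1, 5, and 9 responses and therefore reaches a breakpoint ratio of 9 under these assumptions.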
The PR task is commonly used as an assay of “reward value.” Measuring post-reinforcement pausing (PRP) in animals with differing breakpoint ratios can, however, determine whether differences in primary motivation can instead account for group differences (Skjoldager et al., 1993). For example, with increased food restriction, presumably increasing “motivation,” PRPs shorten, whereas changes in reinforcer value (e.g., one pellet vs three pellets) do not affect PRPs (Skjoldager et al., 1993). Thus, this metric—the time between magazine head entry after reinforcer delivery and initiation of the next trial—was also compared between wild-type and Bdnf+/− mice, which experience adult-onset obesity (Kernie et al., 2000). The first, median, and final pauses for each session were extracted and compared by RM-ANOVA. The ratio of responses on the active versus inactive apertures was also compared by two-factor RM-ANOVA to verify that mice distinguished the active from inactive apertures.
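As an illustration of the PRP metric, the following Python sketch computes each pause as the time from magazine head entry after reinforcer delivery to the first response of the next trial, then extracts the first, median, and final pauses for a session. The function names and the input layout (two parallel lists of timestamps in seconds) are assumptions made for this example, not the format of the original data files.

```python
from statistics import median


def post_reinforcement_pauses(magazine_entry_times, next_response_times):
    """Pause for each reinforcer: next-trial response time minus magazine entry time."""
    return [r - m for m, r in zip(magazine_entry_times, next_response_times)]


def summarize_prps(pauses):
    """Return the first, median, and final pause of one session."""
    if not pauses:
        raise ValueError("session contains no post-reinforcement pauses")
    return {"first": pauses[0], "median": median(pauses), "final": pauses[-1]}
```

The three summary values per session can then be entered as the repeated measure in the RM-ANOVA.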
BDNF microinfusion
For BDNF microinfusions, mice were experimentally naive Bdnf+/− and littermate wild-type mice trained to perform the instrumental response as described and then subjected to three PR test sessions. Based on breakpoint ratios, mice were matched and assigned to groups (wild-type plus vehicle, wild-type plus BDNF, Bdnf+/− plus vehicle, Bdnf+/− plus BDNF). Mice were then anesthetized with 1:1 2-methyl-2-butanol and tribromoethanol diluted 40-fold with saline. The head was shaved and placed in a digitized stereotaxic frame. The scalp was incised, skin retracted, and bregma and lambda identified. The head was leveled, and recombinant human BDNF (Millipore Bioscience Research Reagents) dissolved in saline (0.4 μg/μl; Gourley et al., 2008b, 2009a, 2012) was infused into the mOFC (+2.3 mm AP, −2.8 mm DV, ±0.1 mm ML; Gourley et al., 2010) in a volume of 0.2 μl over 6 min using a digital coordinate system with mm resolution (David Kopf Instruments).
Pilot experiments indicated that a single infusion of BDNF affects PR responding up to 8 d after infusion, so mice here were allowed 4 d recovery, followed by 3 consecutive days of PR testing, with a single PR session per day. Each animal's breakpoint ratios were averaged, yielding a single value per mouse, and these values were compared between groups by two-factor (genotype × infusion) ANOVA.
Mice were killed by rapid decapitation immediately after the last session, brains were rapidly sectioned, bilateral needle entry sites were documented, and tissues were frozen on dry ice for Western blot analyses. Additional groups of behaviorally naive wild-type mice were infused and killed 24 h after infusion for Western blot and immunostaining analyses.
Western blotting
A single experimenter dissected frozen tissue punches from the ventromedial prefrontal cortex (vmPFC), dorsal striatum, dorsal hippocampus, and nucleus accumbens (NAc) core and shell using 1.2 and 0.50 mm tissue cores (Fine Science Tools). vmPFC samples were collected with a single midline punch with the tissue core aimed at the rostroventral-most part of the vmPFC, containing the mOFC. Samples likely included both mOFC and ventral prelimbic PFC to generate sufficient protein concentrations for multiple blots. Hence, these samples are referred to as “vmPFC” tissue.
Tissue was sonicated in lysis buffer [75–200 μl: 137 mM NaCl, 20 mM Tris-HCl, pH 8, 1% Igepal, 10% glycerol, and 1:100 Phosphatase Inhibitor Cocktails 1 and 2 (Sigma)] and stored at −80°C. Protein concentrations were determined using a Bradford colorimetric assay (Pierce), and 20 μg of each sample was separated by SDS-PAGE on an 8–16% gradient Tris-glycine gel (Invitrogen). After transfer to nitrocellulose membrane, blots were blocked with 5% nonfat milk for 1 h. The following primary antibodies were used: anti-phosphorylated (p) ERK1/2 (mouse; 1:1000; Cell Signaling Technology), anti-ERK1/2 (rabbit; 1:2000; Cell Signaling Technology), anti-trkB (mouse; 1:1000; BD Biosciences), anti-p-trk (rabbit; 1:500; Cell Signaling Technology), anti-GluR1 (rabbit; 1:500; Millipore Bioscience Research Reagents), and anti-c-fos (rabbit; 1:500; Santa Cruz Biotechnology). Membranes were incubated with primary antibodies at 4°C for 1 h or overnight and then incubated with IRDye 700 Dx Anti-Rb IgG and IRDye 800 Dx Anti-Ms IgG (both 1:5000; Rockland Immunochemicals) for 1 h.
Bands were then quantified using infrared densitometry analysis (Odyssey Infrared Imaging System). Membranes were reprobed with anti-GAPDH, which served as a loading control (mouse; 1:20,000; Advanced Immunochemical). pERK1/2 was normalized to total ERK1/2, which was not changed in any comparison. Infrared values were converted to a percentage of control samples from the same membrane to control for variance between gels. Group means were then compared by t test or two-factor ANOVA as appropriate.
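The two-step normalization described above (pERK1/2 relative to total ERK1/2, then expression as a percentage of control samples on the same membrane) can be sketched in Python as follows. The record layout, the field names, and the choice to use the mean of the control ratios on each membrane as the 100% reference are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean


def percent_of_control(samples, control_group="control"):
    """Normalize pERK to total ERK, then to the same-membrane control mean.

    samples: list of dicts with keys 'membrane', 'group', 'perk', 'total_erk'.
    Returns one percent-of-control value per sample, in input order.
    """
    ratios = [s["perk"] / s["total_erk"] for s in samples]

    # Collect control ratios separately for each membrane.
    controls = defaultdict(list)
    for s, ratio in zip(samples, ratios):
        if s["group"] == control_group:
            controls[s["membrane"]].append(ratio)

    # Express every sample relative to the control mean of its own membrane.
    return [
        100.0 * ratio / mean(controls[s["membrane"]])
        for s, ratio in zip(samples, ratios)
    ]
```

Normalizing within each membrane before pooling keeps gel-to-gel differences in signal intensity from contributing to the group comparison.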
BDNF ELISA
Fresh frozen brain tissue was dissected and homogenized as for Western blotting experiments. BDNF was quantified by ELISA (Promega) in duplicate in accordance with the instructions of the manufacturer except for exclusion of the extraction step. BDNF concentrations were normalized to total protein concentrations in each sample.
pERK1/2 immunostaining
One group of wild-type mice was infused with BDNF or saline as described. Then, 24 h later, brains were harvested and stored in 4% paraformaldehyde for 48 h and then transferred to 30% w/v sucrose before being sectioned into 45 μm sections. Sections were blocked in a PBS solution containing 2% normal goat serum, 1% bovine serum albumin (BSA), and 0.3% Triton X-100 (Sigma) for 1 h at room temperature. Sections were then incubated in primary antibody solution containing 0.3% normal goat serum, 1% BSA, and 0.3% Triton X-100 at 4°C for 48 h. pERK1/2 (1:400; Cell Signaling Technology) served as the primary antibody. Sections were incubated in secondary antibody solution containing 0.5% normal goat serum and 0.3% Triton X-100, with Alexa Fluor 633 (1:200; Life Technologies) serving as the secondary antibody.
Sections were imaged on a Nikon 4550s SMZ18 microscope with settings held constant. Concentric rings 35 μm apart were generated in NIH ImageJ with the center of the first ring positioned at the base of the infusion site. Integrated intensity was measured along the perimeter of each ring to assess the spread of ERK1/2 phosphorylation after BDNF infusion, and values were compared by RM-ANOVA. Each mouse contributed a single value.
Experimental design for experiments using a MAP kinase kinase inhibitor
To evaluate whether suppressing MAP kinase kinase (MEK), a kinase directly upstream of ERK1/2, could regulate PR responding, intact wild-type mice were trained to perform an instrumental response for food reinforcement as described. During this period, mice were also habituated to injection by nightly handling and mock intraperitoneal injection to ensure injections before PR tests did not interfere with performance. Mice were then matched based on reinforcers acquired during training and assigned to either vehicle or PD0184161 (5-bromo-2-[(2-chloro-4-iodophenyl)amino]-N-(cyclopropylmethoxy)-3,4-difluorobenzamide; 30 mg/kg, generously provided by Dr. David Russell, Yale University, New Haven, CT) groups. Because the drug necessitated 100% DMSO for dissolution, mice were maintained at 29–31 g body weight to allow for accurate injection of very small injection volume (30 μl/mouse). In contrast to other experiments, this translates to maintaining mice at ∼100% of their original body weight. PD0184161 was administered within 48 h of dissolution and kept at 4°C when not in use.
Thirty minutes before the PR tests, mice were injected with either drug or vehicle alone. An additional control group was injected at the end of the session to evaluate whether MEK inhibition interfered with post-session memory consolidation. Breakpoint ratios were compared by RM-ANOVA. Three weeks after the last test, mice were tested without drug to identify whether PD0184161 had long-term consequences.
Locomotor monitoring
Ambulation in a clean cage was quantified using the automated Omnitech Digiscan Micromonitor system equipped with 16 photocells or a customized Med-Associates locomotor monitoring system identically equipped with 16 photocells. Mice were food restricted overnight to recapitulate locomotor activity levels during instrumental response training and testing. Consecutive photobeams broken across 60 min were compared by two-factor (group × time bin) RM-ANOVA.
Mice expressing DREADDs and their GFP-expressing counterparts were administered CNO and then placed in the locomotor monitoring chambers 30 min later. Given evidence that the mOFC may regulate repetitive behavior (Ahmari et al., 2013), ambulatory counts were segregated from repetitive interruption of the same photobeam, an indicator of stereotypy-like behavior. Both types of photobeam breaks were compared between groups by RM-ANOVA.
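One way to picture the separation of ambulatory counts from repetitive photobeam breaks is the short Python sketch below. The classification rule is an assumption made for illustration: a break of a beam different from the previously broken beam is scored as ambulation, and a repeated break of the same beam is scored as repetitive. The commercial monitoring systems may apply different criteria, and the function name is hypothetical.

```python
def classify_beam_breaks(beam_ids):
    """Count ambulatory vs repetitive breaks from a sequence of beam IDs.

    Assumed rule: a break is 'repetitive' if it interrupts the same beam
    as the previous break, and 'ambulatory' otherwise. The first break in
    the sequence has no predecessor and is not counted.
    """
    ambulatory = 0
    repetitive = 0
    previous = None
    for beam in beam_ids:
        if previous is not None:
            if beam == previous:
                repetitive += 1
            else:
                ambulatory += 1
        previous = beam
    return ambulatory, repetitive
```

Applied per time bin, the two counts give the dependent measures for the group × time bin comparison.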
Additional behavioral testing in drug-naive mice
Cued reinforcer delivery.
A naive group of Bdnf+/− and wild-type mice was used. We added discrete stimuli (a 2 s, 2.9 kHz tone and extinction of the house light during the 2 s between nose-poke response and food pellet delivery) signaling reinforcer delivery. This stimulus was delivered during both training and PR testing. Otherwise, training, testing, and analytic procedures were identical.
Extinction conditioning.
In trained mice, responding in the absence of reinforcement (extinction) was evaluated. Here, the tube connecting the food hopper and magazine was disconnected; daily tests were otherwise identical to those during response training. Response rates were compared by two-factor (group × session) RM-ANOVA.
Sucrose consumption.
Mice were habituated to drinking a 1% (w/v) sucrose (Sigma) solution in place of water for 2 d as part of a protocol designed to evaluate hedonic responses to a palatable food. Mice were next fluid deprived for 4 h, followed by 1 h access to the sucrose solution. The deprivation periods were then extended to 14 and 19 h to habituate mice to water restriction. Finally, each mouse was allowed 1 h access to the solution in its home cage, whereas cage mates were housed in a clean cage in the colony room. Each mouse had 1 h access such that the average deprivation period was 14 h. ANOVA with testing order as the independent variable confirmed no effects on consumption (F < 1). Consumption values were then normalized to body weight and compared by t test. This procedure has been shown previously to be sensitive to other manipulations (Gourley et al., 2008b, 2013a; Gourley and Taylor, 2009).
Results
Stimulation of the mOFC enhances behavioral sensitivity to reinforcer devaluation
We first expressed CaMKII-driven AAV–GFP in the mOFC. GFP was mostly contained within the mOFC (Fig. 1a). Within the downstream dorsal striatum, labeled terminals were confined to the medial-most aspects of the rostral striatum, adjacent to the lateral ventricles (Fig. 1b). This innervation pattern is highly consistent with previous reports (Schilman et al., 2008; Hoover and Vertes, 2011). Innervation of the amygdala was also consistent with previous reports (Hoover and Vertes, 2011; Fig. 1c); relative to projections of the infralimbic and prelimbic cortices (McDonald et al., 1996), the mOFC has sparse innervation of the lateral capsular division of the central nucleus of the amygdala (CeA). Rather, projections mostly avoid the CeA, instead innervating the ventral portion of the lateral nucleus of the amygdala and the medial aspects of the basal nucleus (Fig. 1c).
Next, CaMKII-driven AAV–Gs–DREADD–mCitrine or AAV–GFP was infused into the mOFC of separate mice. Fluorescence distribution is represented in Figure 1d, with the majority of infusions selective to the mOFC. Mice were trained to nose poke for food pellets. We identified no group differences in response acquisition (interaction, F(5,40) = 1.7, p = 0.15; effect of group, F < 1; Fig. 1e). We then delivered the DREADDs ligand CNO via systemic injection to all mice, regardless of viral vector, and gave the mice access to food ad libitum: the reinforcer pellets in one session and regular chow in another session. Throughout, food intake did not differ between groups (t(8) = −1.44, p = 0.19; Fig. 1f). Mice were then placed in the operant conditioning chambers. Control mice generated robust response rates throughout, insensitive to reinforcer devaluation. In contrast, Gs–DREADDs-expressing mice inhibited responding after prefeeding with the reinforcer pellets (session × group, F(1,8) = 7.6, p = 0.03; Fig. 1g). Thus, Gs–DREADDs-mediated mOFC stimulation enhanced behavioral sensitivity to reinforcer devaluation.
Mice were then trained for four additional sessions (Fig. 1e), and the reinforcer devaluation procedure was repeated. With this additional identical test (and habituation to the injection stressor), all mice displayed sensitivity to reinforcer devaluation as expected, inhibiting responding after pellet prefeeding (main effect, F(1,16) = 22.9, p < 0.001; Fig. 1h). These findings indicate that CNO does not itself impair sensitivity to prefeeding devaluation (i.e., in GFP control mice) but rather enhances sensitivity to reinforcer devaluation in Gs–DREADDs-expressing mice. Activation of mOFC Gs–DREADDs also does not appear to obviously affect extinction conditioning because overall response rates were indistinguishable between groups during the probe tests (Fig. 1g,h), which are conducted in extinction.
We hypothesized that stimulation of the mOFC may increase sensitivity to effort requirements, i.e., the effort required to obtain a reinforcer, relative to the value of the reinforcer. Thus, we tested the same mice in a PR task. In this test, the response requirements progressively increase for a reinforcer of fixed value. Consistent with our hypothesis, activation of Gs–DREADDs decreased breakpoint ratios (t(14) = 2.2, p = 0.04; Fig. 1i).
There is some evidence that hyperexcitation of the mOFC causes repetitive stereotypy-like behaviors (Ahmari et al., 2013). Previous experiments used optogenetic stimulation of the mOFC rather than DREADDs approaches, which instead regulate the firing threshold of neurons (Urban and Roth, 2015). We quantified spontaneous locomotor activity for 1 h, beginning 30 min after CNO administration (matching the timing of devaluation and PR testing). Mice engaged in ambulatory behavior more than repetitive stereotypy-like behavior in general (F(1,28) = 103, p < 0.001), but we identified no group differences in either ambulation or repetitive stereotypy-like locomotor counts (F < 1; Fig. 1j). This same locomotor monitoring system has been used to document changes in psychostimulant-elicited locomotion and stereotypy-like behavior (Gourley et al., 2009b), increasing confidence in this null result.
Selective Bdnf knockdown in the mOFC decreases behavioral sensitivity to reinforcer devaluation
Excitatory pyramidal neurons—those targeted by CaMKII-driven AAV–Gs–DREADD—are a primary source of the pro-plasticity neurotrophin BDNF in the cortex. We hypothesized that BDNF may be a molecular mechanism by which the mOFC regulates reward-related decision-making. To test this perspective, we next reduced expression of Bdnf in the mOFC using a viral vector approach (for representation of viral vector spread, see Fig. 1d). Subsequently, mice acquired the nose-poke response with no group differences (interaction and main effect, F < 1; Fig. 2a). After pellet prefeeding (and in the absence of an injection stressor as in the studies above), control GFP-expressing mice decreased response rates as expected. Meanwhile, mice with selective Bdnf knockdown failed to modify response rates (interaction, F(1,13) = 5.3, p = 0.04; Fig. 2b), insensitive to reinforcer devaluation. Instead, these mice responded identically to those that had been prefed with regular chow, leaving the value of the food pellet intact (value groups). (Note that response rates generated during the probe tests were compared with each animal's own baseline because an effect of testing order in this experiment indicated that mice extinguished responding during a second probe test not shown, regardless of devalued or value condition.)
mOFC-selective Bdnf knockdown mice also generated higher breakpoint ratios in a PR test (t(13) = −2.1, p = 0.05; Fig. 2c), again the opposite pattern relative to mOFC Gs–DREADDs-expressing mice. This could not obviously be attributable to resistance to extinction because selective Bdnf knockdown did not affect extinction conditioning (main effect of session, F(2,26) = 14.2, p < 0.001; effect of group and interaction, F < 1; Fig. 2d). The lack of effect on extinction conditioning replicates our previous findings on this topic (Gourley et al., 2009a), and this pattern overall is consistent with the suggestion that the mouse mOFC, like the primate mOFC, regulates behavioral sensitivity to outcome value.
As with the Gs–DREADDs mice, locomotor activity counts did not differ between groups (ambulation, F(1,14) = 1.7, p = 0.2; repetitive photobeam breaks, F(1,14) = 1.2, p = 0.3; interactions, F < 1; Fig. 2e).
Bdnf+/− mice are behaviorally insensitive to reinforcer devaluation
Our findings suggest that mOFC BDNF serves as an inhibitory brake on reward-related responding. Bdnf+/− mice are viable, meaning that we were next able to assess whether these mice develop the same phenotype as selective Bdnf knockdown mice and whether it could be rescued by selective replacement of BDNF in the mOFC.
We first confirmed that brain (hippocampal) BDNF expression in Bdnf+/− mice was approximately half that in wild-type mice as expected (p < 0.001; Fig. 3a). Mice in this experiment successfully acquired the food-reinforced instrumental response. We detected no differences between groups (F < 1; Fig. 3b). Food consumption during the prefeeding periods was also unaffected by genotype (t(26) = 1.2, p = 0.25; Fig. 3c). Wild-type mice subsequently reduced responding associated with the now devalued reinforcer, but Bdnf+/− mice failed to inhibit responding, showing insensitivity to reinforcer devaluation (interaction, F(1,26) = 4.11, p = 0.05; within-group post hoc p = 0.8; Fig. 3d).
In addition to unchanged food intake during ad libitum feeding (Fig. 3c), consumption of a palatable sucrose solution was also indistinguishable between genotypes (p = 0.6; Fig. 3e). This pattern suggests that adult Bdnf+/− mice are behaviorally insensitive to reinforcer devaluation, as opposed to, for example, the hedonic valence of the reinforcer. This is an important distinction, given that Bdnf+/− mice develop late-life obesity (Kernie et al., 2000).
PR responding is elevated in Bdnf+/− mice
As in our experiments with mOFC-selective Bdnf knockdown, responding on a PR schedule was also assessed. Here, we tested mice daily over the course of 1 week, revealing an interaction between genotype and session (F(6,150) = 2.6, p = 0.02; Fig. 4a). Breakpoint ratios gradually grew in Bdnf+/− mice, coupled with a modest decline in typical mice. However, the ratio of responses on the active versus inactive apertures was unchanged (main effect of genotype, F(1,25) = 1; interaction, F(6,150) = 1.6, p = 0.17; Fig. 4b), indicating that responding was equally selective for the active nose-poke aperture between groups.
An increase in breakpoint ratios could conceivably be attributed to increased perceived value of the reinforcer or increased motivation to acquire the reinforcer. PRPs can dissociate these factors. PRPs decrease when the motivation to acquire an outcome increases; for example, rats more rapidly initiate a new trial after collecting a reinforcer when hungry (Skjoldager et al., 1993). In contrast, increasing the quantity of reinforcers, for example, by providing three pellets instead of one, also increases PR breakpoints but leaves PRPs unaffected. To determine whether motivational contributions potentially increased breakpoint ratios in Bdnf+/− mice, we extracted the first, median, and last PRP for each mouse for each test session (Skjoldager et al., 1993). Several analytic approaches failed to identify an effect of genotype. For example, we averaged PRPs for sessions 1–3 (when breakpoints did not significantly differ) and compared them with sessions 4–7 (when breakpoints differed). These analyses failed to reveal any effects of genotype (F values ≤1; Fig. 4c,d). Only main effects of time were detected as expected (sessions 1–3, F(2,23) = 6.8, p = 0.002; sessions 4–7, F(2,23) = 6.9, p = 0.002). As an additional example, PRPs did not differ between groups during session 1 when breakpoints also did not differ, nor during session 7 when breakpoints did differ (F values ≤1; data not shown). In fact, during no test session did the PRPs differ as a function of genotype (all p values >0.05). This pattern of responding (increased breakpoint ratios coupled with unaffected PRPs) suggests that differences in primary motivation do not account for breakpoint ratio differences between Bdnf+/− and littermate wild-type mice.
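The PRP summary described above can be illustrated with a minimal Python sketch. It assumes each session is available as an ordered list of post-reinforcement pauses in seconds; the function names and the example values are hypothetical and are not taken from the study's data or analysis software.

```python
from statistics import median


def summarize_prps(prps):
    """Return the first, median, and last post-reinforcement pause of one session."""
    if not prps:
        raise ValueError("session contains no post-reinforcement pauses")
    return {"first": prps[0], "median": median(prps), "last": prps[-1]}


def mean_over_sessions(summaries):
    """Average each summary value across a block of sessions (e.g., sessions 1-3)."""
    keys = ("first", "median", "last")
    return {k: sum(s[k] for s in summaries) / len(summaries) for k in keys}


# Made-up pauses (seconds) for one hypothetical mouse across three sessions.
session_prps = [[2.0, 3.0, 5.0, 9.0], [1.0, 4.0, 6.0], [3.0, 3.0, 12.0]]
summaries = [summarize_prps(p) for p in session_prps]
early_block = mean_over_sessions(summaries)
```

Block averages computed this way for each mouse could then be entered into a genotype × time analysis, as in the comparisons reported here.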
pERK2 is reduced in Bdnf+/− vmPFC
pERK1/2 has been proposed as a marker of neuronal activity. Furthermore, BDNF binding to its high-affinity receptor trkB activates the ERK MAP kinase signaling pathway. For these reasons, pERK1/2 was analyzed in vmPFC tissue samples, which include the mOFC, immediately after the last session. pERK1/2 was decreased in Bdnf+/− mice as expected (t(22) = 3.3, p = 0.004; Fig. 4e). Analysis of the individual pERK1/2 isoforms indicated that pERK2, which is preferentially associated with activity-dependent neuroplasticity (English and Sweatt, 1996), was significantly reduced (p = 0.006). Expression of the primary BDNF receptor trkB was unchanged (p > 0.6), but phosphorylation of the receptor was decreased as expected, as was expression of the GluR1 subunit of the AMPA receptor (p < 0.05; Fig. 4e). pERK1/2 was also analyzed in the dorsal hippocampus and NAc; notably, there were no differences in expression levels in these regions (data not shown).
BDNF replacement in the mOFC blocks behavioral abnormalities
We next aimed to block behavioral abnormalities in Bdnf+/− mice. We first developed a BDNF microinfusion protocol that, in genetically intact mice, increased levels of pERK1/2 and the immediate-early gene c-fos at the infusion site, detectable 24 h after infusion (all p < 0.05; Fig. 5a,b). To determine the anatomical distribution of BDNF-mediated ERK1/2 stimulation, we also immunostained for pERK1/2 24 h after infusion and quantified expression in 35 μm concentric rings around the infusion site in hemisected coronal sections (Fig. 5c). pERK1/2 expression was increased proximal to the infusion site in both groups, but pERK1/2 was higher in the BDNF group within 500 μm of the infusion terminus (interaction, F(19,95) = 1.9, p = 0.05; Fig. 5d).
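The concentric-ring quantification can be sketched as a simple distance-binning step. The following Python example is illustrative only: it assumes that labeled points are given as coordinates in micrometers relative to the infusion terminus, and the function names, the signal values, and the number of rings passed in are hypothetical. Only the 35 μm ring width comes from the text.

```python
import math

RING_WIDTH_UM = 35.0  # ring width stated in the text


def ring_index(x_um, y_um, width_um=RING_WIDTH_UM):
    """Index of the concentric ring (0 = innermost) containing a point."""
    distance = math.hypot(x_um, y_um)  # distance from the infusion terminus
    return int(distance // width_um)


def mean_signal_per_ring(points, n_rings):
    """Average a signal over each ring; points are (x_um, y_um, signal) tuples.

    Rings without any points are reported as None.
    """
    totals = [0.0] * n_rings
    counts = [0] * n_rings
    for x_um, y_um, signal in points:
        i = ring_index(x_um, y_um)
        if i < n_rings:  # ignore points beyond the outermost ring
            totals[i] += signal
            counts[i] += 1
    return [t / c if c else None for t, c in zip(totals, counts)]


# Made-up example: two points in ring 0, one in ring 1, none in ring 2.
profile = mean_signal_per_ring([(0, 10, 2.0), (0, 20, 4.0), (50, 0, 1.0)], 3)
```

A per-ring profile of this kind, computed for each animal, is the sort of input a group × distance interaction analysis would take.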
Next, we trained a naive cohort of Bdnf+/− and littermate wild-type mice to perform the instrumental response. We again detected no differences in response rates or reinforcement rates between groups (data not shown). Then, groups were assigned by matching breakpoint ratios collected during three initial PR test sessions (Fig. 6a). After BDNF infusion in the mOFC, mice were tested again, and an interaction between genotype and infusion was detected (F(1,37) = 10.3, p = 0.003; Fig. 6b). mOFC BDNF replacement in Bdnf+/− mice normalized breakpoint ratios compared with saline-infused Bdnf+/− mice (p = 0.01) and wild-type mice infused with BDNF (p = 0.009). In other words, selective BDNF replacement fully rescued responding in Bdnf+/− mice, and thus, mOFC BDNF is both necessary and sufficient for typical responding in this task. Interestingly, BDNF infusion in wild-type mice modestly increased breakpoint ratios (p = 0.08), suggesting that mOFC BDNF regulates action selection according to an inverted U-shaped curve. This finding, although unexpected, bears some similarity to the inverted U-shaped influence of BDNF met allele dosage on gray and white matter morphometry in humans (Forde et al., 2014). Also of note, synaptic scaling can occur after supra-physiological BDNF overexpression (Rutherford et al., 1998), which could conceivably impair optimal mOFC function here and account for a modest inflation of breakpoints in wild-type mice administered BDNF.
Immediately after the last session, mice were killed, and vmPFC tissue was homogenized to evaluate whether BDNF infusion normalized pERK2 expression, in parallel with behavioral responding. As expected, pERK2 was decreased in Bdnf+/− mice infused with saline (interaction, F(1,26) = 6.3, p = 0.02, post hoc p = 0.02; Fig. 6c). However, BDNF infusion restored pERK2 levels such that BDNF-infused Bdnf+/− mice did not differ from control mice (p = 0.3).
As shown, rodents can learn to select actions according to the value of a reinforcer. Meanwhile, behavioral insensitivity to reinforcer value is associated with a dorsolateral striatal “habit” circuit (Yin et al., 2008, 2009). Thus, the dorsal striatum was also extracted and immunoblotted for the immediate-early gene c-fos. Expression patterns strongly resembled behavioral response patterns (interaction, F(1,26) = 18, p = 0.003; Fig. 6d, compare with b): Bdnf+/− mice had high c-fos expression levels (p < 0.05), whereas BDNF infusion normalized expression (p = 0.005 compared with saline-infused Bdnf+/− mice). BDNF infusion in wild-type mice increased c-fos (p = 0.007). Even from a correlational perspective, high striatal c-fos was associated with high breakpoint ratios (r = 0.37, p = 0.045; Fig. 6e).
Discrete stimuli signaling reinforcer availability rescue responding
Dorsolateral striatal systems are associated with stimulus-dependent, as opposed to value-dependent, decision-making (Yin et al., 2008; Hart et al., 2014). Thus, we evaluated whether PR responding could be normalized if Bdnf+/− mice were provided with discrete stimuli signaling reinforcer availability; this would presumably access cue-sensitive striatal systems. A light/tone stimulus was coupled with reinforcer delivery during both the response training (Fig. 7a) and PR testing phases (Fig. 7b). Response rates and breakpoint ratios were indistinguishable between Bdnf+/− and wild-type mice, suggesting that Bdnf+/− mice were able to use pavlovian stimuli to regulate responding (PR test main effect, F < 1; interaction, F(4,56) = 1.6, p = 0.2; Fig. 7b). An alternative perspective is that stimulus–outcome associations energized responding in wild-type mice, but it does not necessarily account for the lack of group differences during training, when the stimuli were also present. Nevertheless, we replicated this experiment in a separate group of mice, and again, wild-type and Bdnf+/− mice did not differ (main effect and interaction F values <1; Fig. 7b′).
Extinction responding in Bdnf+/− mice
To rule out insensitivity to non-reinforcement or general behavioral inflexibility as causal factors in behavioral abnormalities in Bdnf+/− mice, reinforcement was withheld entirely (extinction). A main effect of session (F(5,130) = 14.5, p < 0.001), but no effect of genotype or interactions, was detected (F values <1), again indicating that responding was indistinguishable based on genotype (Fig. 7c).
Finally, we also confirmed that locomotor activity was unaffected by genotype (F values <1; Fig. 7d). Overall, activity decreased across the session (F(11,110) = 2.7, p = 0.004), indicating habituation to the novel environment, but we detected no differences between groups. Importantly, mice were food restricted during this test, recapitulating locomotor activity levels during PR and devaluation testing.
MEK inhibition elevates PR responding
ERK1/2 phosphorylation was blunted in Bdnf+/− mice. To evaluate whether suppressing MEK, a kinase directly upstream of ERK1/2, could mimic Bdnf heterozygosity, adult male C57BL/6 mice were trained to perform an instrumental response for food reinforcement. Mice were matched based on reinforcements earned during training and assigned to either vehicle or PD0184161 groups (Fig. 8a). When mice were treated with PD0184161, breakpoint ratios differed (F(3,27) = 4.1, p = 0.02). Specifically, wild-type mice injected with PD0184161 before the test achieved higher breakpoint ratios than mice injected with vehicle before the test (p = 0.02; pretest injections represented in Fig. 8b), whereas mice injected with PD0184161 after the test did not differ from corresponding control mice (p = 0.17; posttest injections represented in Fig. 8c). Also, when tested drug free, PD0184161-exposed mice were indistinguishable from control mice (p = 0.4; Fig. 8b, inset). This pattern suggests ERK1/2 regulates action selection online rather than via post-session memory consolidation (i.e., consolidating information regarding the PR schedule of reinforcement).
PR responding predicts vmPFC, but not striatal, BDNF in wild-type mice
Our findings suggest that BDNF within the mOFC regulates effortful instrumental response selection. However, an alternative possibility is that mOFC-derived BDNF acts via axonal transport from the mOFC to the dorsal striatum. To explore this possibility, we characterized responding on a PR schedule in several naive mice (Fig. 9a). Immediately after the final session, vmPFC and dorsal striatal samples were extracted. vmPFC BDNF significantly covaried with responding during the last session (r = 0.62, p = 0.02; Fig. 9b), whereas striatal BDNF did not (r = 0.05, p = 0.86; Fig. 9c). This outcome suggests that local mOFC BDNF is a determining factor in action selection strategies.
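For readers less familiar with the statistic reported here, the following Python sketch computes a Pearson correlation coefficient and the corresponding t statistic (with n − 2 degrees of freedom) used to test it. The function names are hypothetical and the input values are arbitrary illustrations, not data from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two sequences of equal length (at least 3 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Arbitrary illustration: a perfectly linear relationship gives r = 1.
r_example = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```

In practice, one value per mouse (for example, a breakpoint ratio and a regional BDNF measurement) would be supplied for each sequence.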
Discussion
The mOFC is a primary component of an anatomically interconnected medial prefrontal cortical network and is considered distinct relative to the central and lateral compartments of the OFC (Öngür and Price, 2000; Wallis, 2012). These compartments are instead part of a “sensory integration” network classically associated with behavioral flexibility during reversal conditioning (Iversen and Mishkin, 1970; Schoenbaum et al., 2002; McAlonan and Brown, 2003; for contemporary models, see Stalnaker et al., 2015). In contrast, vmPFC structures may be essential to determining behavioral response strategies based on general representations of outcome value (Wallis, 2012). Goal-directed action selection after reinforcer devaluation correlates significantly with neural activity in the human mOFC (Valentin et al., 2007). Furthermore, the mOFC is activated during willingness-to-pay calculations (Plassmann et al., 2007), a finding that may be particularly germane to our current study, because we find that the rodent mOFC is essential to behavioral sensitivity to outcome value and the appropriate “pay” (i.e., effort expenditure) for a given reinforcer in a PR task. Lesions or DREADD-mediated inactivation of the mOFC in rats induce failures in retrieving outcome identity memories (Bradfield et al., 2015), suggesting that the healthy mOFC serves to access outcome value information when it is not immediately observable and thereby guide goal-directed decision-making. We argue that BDNF is essential for this function, given that both tasks used in Bdnf-deficient mice here require mice to sustain a stable representation of reinforcer value to appropriately inhibit responding (after reinforcer devaluation) or gate responding when response requirements escalate (as in the PR task).
Bidirectional regulation of behavioral sensitivity to reinforcer value
We initiated these studies by expressing in the mOFC Gs-coupled DREADDs, engineered G-protein-coupled receptors activated by the otherwise inert ligand CNO (Urban and Roth, 2015). We then decreased reinforcer value using a prefeeding procedure wherein mice can freely consume reinforcer pellets before a probe test conducted in extinction. Additionally, instrumental responding according to a PR schedule of reinforcement was tested. Gs–DREADDs stimulation enhanced response inhibition after reinforcer devaluation, evidence of increased sensitivity to decreased outcome value. Mice were also more sensitive to the escalating demands of the PR schedule, achieving lower breakpoints. mOFC stimulation did not affect ambulation or stereotypy-like behaviors, contrary to evidence that optogenetic stimulation of a mOFC-striatal circuit induces repetitive stereotypy (Ahmari et al., 2013). Repeated burst-like activity caused by optogenetic stimulation, relative to slow depolarization caused by Gs–DREADDs, could account for different behavioral consequences.
What molecular factors might regulate mOFC function? Excitatory pyramidal neurons are a primary source of BDNF in the cortex, and high-frequency stimulation induces synaptic BDNF secretion (Hartmann et al., 2001). To determine the role of locally synthesized BDNF in mOFC-dependent decision-making, we reduced local Bdnf, revealing the opposite behavioral profile. Specifically, mOFC-selective Bdnf knockdown interfered with behavioral sensitivity to reinforcer devaluation, and PR responding was inflated. Bdnf+/− mice developed the same aberrantly elevated PR breakpoints as those with site-selective knockdown, which allowed us to next confirm that selective infusion of BDNF into the mOFC normalized responding in constitutive Bdnf+/− mice. Thus, mOFC BDNF is both necessary and sufficient for appropriate response inhibition. Notably, PRPs did not differ between Bdnf+/− and wild-type littermates, suggesting that exaggerated primary motivation to acquire the reinforcer does not account for breakpoint differences (Skjoldager et al., 1993). Moreover, elevated breakpoints could not be attributed to motoric hyperactivity or enhanced hedonic sensitivity to food. These are important measures, given that Bdnf+/− mice develop late-adulthood obesity (Kernie et al., 2000).
Involvement of ERK/MAP kinase
The ERK MAP kinase signaling cascade is coupled to multiple receptor systems, including trkB, the high-affinity receptor for BDNF. ERK1/2 is also implicated in several forms of learning, memory, and neuroplasticity (Mazzucchelli and Brambilla, 2000; Rodrigues et al., 2004), and furthermore, LTP induction preferentially activates ERK2 (English and Sweatt, 1996). Phosphorylation of both trkB and ERK1/2 was decreased in the Bdnf+/− vmPFC, driven specifically by decreased pERK2, suggesting that ERK2-mediated signaling may be essential to mOFC function. Pharmacologically inhibiting MEK, immediately upstream of ERK1/2, recapitulated the effects of mOFC-selective Bdnf knockdown, augmenting breakpoints in the PR task. Notably, post-session MEK inhibition had no effects, implicating ERK1/2 signaling in online action selection rather than post-session consolidation processes. This perspective is further supported by our DREADDs studies, given that CNO was on board during testing, and it is in general agreement with the argument that the mOFC retrieves outcome identity/value information during task performance (Bradfield et al., 2015). Interestingly, this selective function contrasts with that of the ventrolateral orbitofrontal cortex, in which BDNF–trkB appears to also regulate the consolidation or retention of response–outcome associative memory (Zimmermann et al., 2015).
Replacement of BDNF selectively within the mOFC rescued pERK1/2 in Bdnf+/− mice, and immunostaining for pERK1/2 revealed a sustained elevation in expression fanning from the infusion site. BDNF-induced pERK1/2 extended in some mice into the dorsally situated prelimbic cortex. We believe that the behavioral effects of BDNF infusion can nonetheless be attributed to actions in the mOFC, because Bdnf knockdown in the prelimbic PFC decreases, rather than increases, PR responding (Gourley et al., 2012). Prelimbic-selective knockdown also facilitates extinction conditioning, whereas extinction was spared here (Fig. 2; Gourley et al., 2009a). These opposing roles for BDNF in the mOFC and prelimbic cortex may account for why mOFC-selective Bdnf knockdown rapidly modified PR response patterns, whereas genotypic differences between Bdnf+/− mice and wild-type littermates emerged only with repeated testing.
Corticostriatal interactions in reward-related decision-making
BDNF is subject to anterograde transport. For example, cortical pyramidal neurons are a predominant source of BDNF in the striatum, which contains little Bdnf mRNA (Altar et al., 1997). Furthermore, BDNF infusion in the dorsal PFC can increase striatal and amygdalar BDNF (McGinty et al., 2010). Conversely, PFC-selective Bdnf knockdown reduces BDNF in downstream structures (Gourley et al., 2009a, 2013c; Zimmermann et al., 2015). Thus, it is conceivable that mOFC BDNF regulates action selection by binding locally or in distal targets, such as the dorsomedial striatum (Fig. 1; Schilman et al., 2008). However, we found that BDNF in the vmPFC, but not striatum, predicted response patterns. Thus, BDNF subjected to axonal transport from the mOFC to dorsal striatum may serve important purposes, but striatal BDNF was a poor predictor of response patterns here.
Unlike BDNF, striatal immediate-early gene expression closely mirrored response patterns, with high c-fos expression in Bdnf+/− mice that responded persistently, despite escalating response demands in the PR task. Why might this be? Classically, goal-directed behaviors are defined as those sensitive to outcome value. mPFC damage (lesions, inactivations) and stressors can induce a shift from goal-directed to “habitual” modes of response that are, in contrast, insensitive to reinforcer value (Balleine and O'Doherty, 2010; Schwabe and Wolf, 2011). Converging neuroanatomical models characterize this process as a transition from PFC–striatal systems that act in concert to a cue-sensitive dorsolateral striatum neurocircuit (Yin et al., 2008, 2009; Kimchi et al., 2009; Gourley et al., 2013b). Thus, c-fos expression here may reflect the recruitment of a dorsolateral striatal system that drives stimulus-dependent responding in the absence of BDNF-mediated signaling in the mOFC. To test this perspective, we provided discrete stimuli signaling reinforcer availability, “bridging” the association between the nose-poke response and reinforcer. These stimuli eliminated response differences between groups. This effect was robust, detected in multiple experiments and despite cohort variances, suggesting that stimulus-dependent response regulation is intact in Bdnf+/− mice.
In marmosets, vmPFC lesions including the mOFC increase breakpoints (Pears et al., 2003), and mice with mOFC lesions develop nearly identical patterns of responding on a PR schedule relative to Bdnf+/− mice (Gourley et al., 2010). However, we have also reported previously that mOFC lesions do not affect behavioral sensitivity to reinforcer devaluation (Gourley et al., 2010). How might we reconcile these findings? Previously, mice were trained before lesion placement, whereas here, Bdnf was knocked down first. It is thus possible that information acquired before insult can help to sustain inhibitory control.
Effects of acute BDNF infusion
We used multiple complementary genetic, pharmacological, viral, and chemogenetic approaches to provide evidence that the mOFC inhibits instrumental responding when response demands escalate and that BDNF is necessary for this function. One additional discovery was that a single BDNF infusion into the mOFC had sustained behavioral consequences, normalizing response patterns and pERK1/2 in Bdnf+/− mice multiple days after infusion (Fig. 6). In conceptually similar studies, BDNF infusion into the dorsomedial PFC after cocaine self-administration in rats reduced the reinstatement of cocaine seeking several days later (Berglind et al., 2007, 2009; Whitfield et al., 2011), a sustained response. However, the relationship between mPFC BDNF and reward seeking is likely quite complex. For example, cocaine abstinence causes progressive increases in mPFC BDNF above control levels (Lu et al., 2010; McGinty et al., 2010; Giannotti et al., 2014), and the precise behavioral significance of cocaine-induced BDNF overexpression is debated (Sadri-Vakili et al., 2010; Pitts et al., 2016). Ongoing studies using targeted manipulation of BDNF in specific circuits and structures such as the mOFC will help to clarify how BDNF regulates the function of corticolimbic regions in balancing reward seeking versus behavioral inhibition.
Footnotes
This work was supported by National Institutes of Health Grants DA011717, DA027844 (J.R.T.), and MH101477 (S.L.G.), the Children's Center for Neuroscience Research (S.L.G.), and the Connecticut Department of Mental Health and Addiction Services (J.R.T.). The Emory Viral Vector Core is supported by National Institute of Neurological Disorders and Stroke Core Facilities Grant P30NS055077. The Yerkes National Primate Research Center is supported by P51OD011132. We thank Alexia Kedves, Tendi Hungwe, and Courtni Andrews for valuable assistance and the Duman laboratory for providing Bdnf+/− mice used here. We also thank Dr. Glenn Schafe for critical comments on this manuscript and Dr. R. Jude Samulski at the University of North Carolina Viral Vector Core.
References
- Ahmari SE, Spellman T, Douglass NL, Kheirbek MA, Simpson HB, Deisseroth K, Gordon JA, Hen R. Repeated cortico-striatal stimulation generates persistent OCD-like behavior. Science. 2013;340:1234–1239. doi: 10.1126/science.1234733.
- Altar CA, Cai N, Bliven T, Juhasz M, Conner JM, Acheson AL, Lindsay RM, Wiegand SJ. Anterograde transport of brain-derived neurotrophic factor and its role in the brain. Nature. 1997;389:856–860. doi: 10.1038/39885.
- Arana FS, Parkinson JA, Hinton E, Holland AJ, Owen AM, Roberts AC. Dissociable contributions of the human amygdala and orbitofrontal cortex to incentive motivation and goal selection. J Neurosci. 2003;23:9632–9638. doi: 10.1523/JNEUROSCI.23-29-09632.2003.
- Balleine BW, O'Doherty JP. Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology. 2010;35:48–69. doi: 10.1038/npp.2009.131.
- Berglind WJ, See RE, Fuchs RA, Ghee SM, Whitfield TW, Jr, Miller SW, McGinty JF. A BDNF infusion into the medial prefrontal cortex suppresses cocaine seeking in rats. Eur J Neurosci. 2007;26:757–766. doi: 10.1111/j.1460-9568.2007.05692.x.
- Berglind WJ, Whitfield TW, Jr, LaLumiere RT, Kalivas PW, McGinty JF. A single intra-PFC infusion of BDNF prevents cocaine-induced alterations in extracellular glutamate within the nucleus accumbens. J Neurosci. 2009;29:3715–3719. doi: 10.1523/JNEUROSCI.5457-08.2009.
- Bouret S, Richmond BJ. Ventromedial and orbital prefrontal neurons differentially encode internally and externally driven motivational value in monkeys. J Neurosci. 2010;30:8591–8601. doi: 10.1523/JNEUROSCI.0049-10.2010.
- Bradfield LA, Dezfouli A, van Holstein M, Chieng B, Balleine BW. Medial orbitofrontal cortex mediates outcome retrieval in partially observable task situations. Neuron. 2015;88:1268–1280. doi: 10.1016/j.neuron.2015.10.044.
- Chourbaji S, Hellweg R, Brandis D, Zörner B, Zacher C, Lang UE, Henn FA, Hörtnagl H, Gass P. Mice with reduced brain-derived neurotrophic factor expression show decreased choline acetyltransferase activity, but regular brain monoamine levels and unaltered emotional behavior. Mol Brain Res. 2004;121:28–36. doi: 10.1016/j.molbrainres.2003.11.002.
- Colwill RM, Rescorla RA. Associative structures in instrumental learning. In: Bower GH, editor. Psychology of learning and motivation. Vol 20. New York: Elsevier; 1986.
- Dickinson A. Contemporary animal learning theory. Cambridge, MA: Cambridge UP; 1980.
- English JD, Sweatt JD. Activation of p42 mitogen-activated protein kinase in hippocampal long term potentiation. J Biol Chem. 1996;271:24329–24332. doi: 10.1074/jbc.271.40.24329.
- Figurov A, Pozzo-Miller LD, Olafsson P, Wang T, Lu B. Regulation of synaptic responses to high-frequency stimulation and LTP by neurotrophins in the hippocampus. Nature. 1996;381:706–709. doi: 10.1038/381706a0.
- Forde NJ, Ronan L, Suckling J, Scanlon C, Neary S, Holleran L, Leemans A, Tait R, Rua C, Fletcher PC, Jeurissen B, Dodds CM, Miller SR, Bullmore ET, McDonald C, Nathan PJ, Cannon DM. Structural neuroimaging correlates of allelic variation of the BDNF val66met polymorphism. Neuroimage. 2014;90:280–289. doi: 10.1016/j.neuroimage.2013.12.050.
- Genoud C, Knott GW, Sakata K, Lu B, Welker E. Altered synapse formation in the adult somatosensory cortex of brain-derived neurotrophic factor heterozygote mice. J Neurosci. 2004;24:2394–2400. doi: 10.1523/JNEUROSCI.4040-03.2004.
- Giannotti G, Caffino L, Calabrese F, Racagni G, Riva MA, Fumagalli F. Prolonged abstinence from developmental cocaine exposure dysregulates BDNF and its signaling network in the medial prefrontal cortex of adult rats. Int J Neuropsychopharmacol. 2014;17:625–634. doi: 10.1017/S1461145713001454.
- Gorski JA, Zeiler SR, Tamowski S, Jones KR. Brain-derived neurotrophic factor is required for the maintenance of cortical dendrites. J Neurosci. 2003;23:6856–6865. doi: 10.1523/JNEUROSCI.23-17-06856.2003.
- Gourley SL, Taylor JR. Recapitulation and reversal of a persistent depression-like syndrome in rodents. Curr Protoc Neurosci. 2009;Chapter 9:Unit 9.32. doi: 10.1002/0471142301.ns0932s49.
- Gourley SL, Taylor JR. Going and stopping: Dichotomies in behavioral control by the prefrontal cortex. Nat Neurosci. 2016; in press. doi: 10.1038/nn.4275.
- Gourley SL, Wu FJ, Kiraly DD, Ploski JE, Kedves AT, Duman RS, Taylor JR. Regionally specific regulation of ERK MAP kinase in a model of antidepressant-sensitive chronic depression. Biol Psychiatry. 2008a;63:353–359. doi: 10.1016/j.biopsych.2007.07.016.
- Gourley SL, Kiraly DD, Howell JL, Olausson P, Taylor JR. Acute hippocampal BDNF restores motivational and forced swim performance after corticosterone. Biol Psychiatry. 2008b;64:884–890. doi: 10.1016/j.biopsych.2008.06.016.
- Gourley SL, Howell JL, Rios M, DiLeone RJ, Taylor JR. Prelimbic cortex bdnf knock-down reduces instrumental responding in extinction. Learn Mem. 2009a;16:756–760. doi: 10.1101/lm.1547909.
- Gourley SL, Koleske AJ, Taylor JR. Loss of dendrite stabilization by the Abl-related gene (Arg) kinase regulates behavioral flexibility and sensitivity to cocaine. Proc Natl Acad Sci U S A. 2009b;106:16859–16864. doi: 10.1073/pnas.0902286106.
- Gourley SL, Lee AS, Howell JL, Pittenger C, Taylor JR. Dissociable regulation of goal-directed action within mouse prefrontal cortex. Eur J Neurosci. 2010;32:1726–1734. doi: 10.1111/j.1460-9568.2010.07438.x.
- Gourley SL, Swanson AM, Jacobs AM, Howell JL, Mo M, Dileone RJ, Koleske AJ, Taylor JR. Action control is mediated by prefrontal BDNF and glucocorticoid receptor binding. Proc Natl Acad Sci U S A. 2012;109:20714–20719. doi: 10.1073/pnas.1208342109.
- Gourley SL, Swanson AM, Koleske AJ. Corticosteroid-induced neural remodeling predicts behavioral vulnerability and resilience. J Neurosci. 2013a;33:3107–3112. doi: 10.1523/JNEUROSCI.2138-12.2013.
- Gourley SL, Olevska A, Gordon J, Taylor JR. Cytoskeletal determinant of stimulus-response habits. J Neurosci. 2013b;33:11811–11816. doi: 10.1523/JNEUROSCI.1034-13.2013.
- Gourley SL, Olevska A, Zimmermann KS, Ressler KJ, Dileone RJ, Taylor JR. The orbitofrontal cortex regulates outcome-based decision-making via the lateral striatum. Eur J Neurosci. 2013c;38:2382–2388. doi: 10.1111/ejn.12239.
- Hart G, Leung BK, Balleine BW. Dorsal and ventral streams: the distinct role of striatal subregions in the acquisition and performance of goal-directed actions. Neurobiol Learn Mem. 2014;108:104–118. doi: 10.1016/j.nlm.2013.11.003.
- Hartmann M, Heumann R, Lessmann V. Synaptic secretion of BDNF after high-frequency stimulation of glutamatergic synapses. EMBO J. 2001;20:5887–5897. doi: 10.1093/emboj/20.21.5887.
- Hodos W. Progressive ratio as a measure of reward strength. Science. 1961;134:943–944. doi: 10.1126/science.134.3483.943.
- Hoover WB, Vertes RP. Projections of the medial orbital and ventral orbital cortex in the rat. J Comp Neurol. 2011;519:3766–3801. doi: 10.1002/cne.22733.
- Iversen SD, Mishkin M. Perseverative interference in monkeys following selective lesions of the inferior prefrontal convexity. Exp Brain Res. 1970;11:376–386. doi: 10.1007/BF00237911.
- Kang H, Schuman EM. Long-lasting neurotrophin-induced enhancement of synaptic transmission in the adult hippocampus. Science. 1995;267:1658–1662. doi: 10.1126/science.7886457.
- Kang H, Schuman EM. A requirement for local protein synthesis in neurotrophin-induced hippocampal synaptic plasticity. Science. 1996;273:1402–1406. doi: 10.1126/science.273.5280.1402.
- Kernie SG, Liebl DJ, Parada LF. BDNF regulates eating behavior and locomotor activity in mice. EMBO J. 2000;19:1290–1300. doi: 10.1093/emboj/19.6.1290.
- Kimchi EY, Torregrossa MM, Taylor JR, Laubach M. Neuronal correlates of instrumental learning in the dorsal striatum. J Neurophysiol. 2009;102:475–489. doi: 10.1152/jn.00262.2009.
- Korte M, Carroll P, Wolf E, Brem G, Thoenen H, Bonhoeffer T. Hippocampal long-term potentiation is impaired in mice lacking brain-derived neurotrophic factor. Proc Natl Acad Sci U S A. 1995;92:8856–8860. doi: 10.1073/pnas.92.19.8856.
- Korte M, Griesbeck O, Gravel C, Carroll P, Staiger V, Thoenen H, Bonhoeffer T. Virus-mediated gene transfer into hippocampal CA1 region restores long-term potentiation in brain-derived neurotrophic factor mutant mice. Proc Natl Acad Sci U S A. 1996;93:12547–12552. doi: 10.1073/pnas.93.22.12547.
- Linnarsson S, Björklund A, Ernfors P. Learning deficit in BDNF mutant mice. Eur J Neurosci. 1997;9:2581–2587. doi: 10.1111/j.1460-9568.1997.tb01687.x.
- Lu H, Cheng PL, Lim BK, Khoshnevisrad N, Poo MM. Elevated BDNF after cocaine withdrawal facilitates LTP in medial prefrontal cortex by suppressing GABA inhibition. Neuron. 2010;67:821–833. doi: 10.1016/j.neuron.2010.08.012.
- MacQueen GM, Ramakrishnan K, Croll SD, Siuciak JA, Yu G, Young LT, Fahnestock M. Performance of heterozygous brain-derived neurotrophic factor knockout mice on behavioral analogues of anxiety, nociception, and depression. Behav Neurosci. 2001;115:1145–1153. doi: 10.1037/0735-7044.115.5.1145.
- Mazzucchelli C, Brambilla R. Ras-related and MAPK signaling in neuronal plasticity and memory formation. Cell Mol Life Sci. 2000;57:604–611. doi: 10.1007/PL00000722.
- McAllister AK, Lo DC, Katz LC. Neurotrophins regulate dendritic growth in developing visual cortex. Neuron. 1995;15:791–803. doi: 10.1016/0896-6273(95)90171-X.
- McAllister AK, Katz LC, Lo DC. Opposing roles for endogenous BDNF and NT-3 in regulating cortical dendritic growth. Neuron. 1997;18:767–778. doi: 10.1016/S0896-6273(00)80316-5.
- McAlonan K, Brown VJ. Orbital prefrontal cortex mediates reversal learning and not attentional set shifting in the rat. Behav Brain Res. 2003;146:97–103. doi: 10.1016/j.bbr.2003.09.019.
- McDonald AJ, Mascagni F, Guo L. Projections of the medial and lateral prefrontal cortices to the amygdala: a Phaseolus vulgaris leucoagglutinin study in the rat. Neuroscience. 1996;71:55–75. doi: 10.1016/0306-4522(95)00417-3.
- McGinty JF, Whitfield TW, Jr, Berglind WJ. Brain-derived neurotrophic factor and cocaine addiction. Brain Res. 2010;1314:183–193. doi: 10.1016/j.brainres.2009.08.078.
- Montkowski A, Holsboer F. Intact spatial learning and memory in transgenic mice with reduced BDNF. Neuroreport. 1997;8:779–782. doi: 10.1097/00001756-199702100-00040.
- Nakata H, Nakamura S. Brain-derived neurotrophic factor regulates AMPA receptor trafficking to post-synaptic densities via IP3R and TRPC calcium signaling. FEBS Lett. 2007;581:2047–2054. doi: 10.1016/j.febslet.2007.04.041.
- Öngür D, Price JL. The organization of networks within the orbital and medial prefrontal cortex of rats, monkeys, and humans. Cereb Cortex. 2000;10:206–219. doi: 10.1093/cercor/10.3.206.
- Padoa-Schioppa C, Assad JA. Neurons in the orbitofrontal cortex encode economic value. Nature. 2006;441:223–226. doi: 10.1038/nature04676.
- Padoa-Schioppa C, Assad JA. The representation of economic value in the orbitofrontal cortex is invariant for changes in menu. Nat Neurosci. 2008;11:95–102. doi: 10.1038/nn2020.
- Patterson SL, Abel T, Deuel TA, Martin KC, Rose JC, Kandel ER. Recombinant BDNF rescues deficits in basal synaptic transmission and hippocampal LTP in BDNF knockout mice. Neuron. 1996;16:1137–1145. doi: 10.1016/S0896-6273(00)80140-3.
- Paulus MP, Frank LR. Ventromedial prefrontal cortex activation is critical for preference judgments. Neuroreport. 2003;14:1311–1315. doi: 10.1097/01.wnr.0000078543.07662.02.
- Pears A, Parkinson JA, Hopewell L, Everitt BJ, Roberts AC. Lesions of the orbitofrontal but not medial prefrontal cortex disrupt conditioned reinforcement in primates. J Neurosci. 2003;23:11189–11201. doi: 10.1523/JNEUROSCI.23-35-11189.2003.
- Pitts EG, Taylor JR, Gourley SL. Prefrontal cortical BDNF: a regulatory key in cocaine- and food-reinforced behaviors. Neurobiol Dis. 2016. Advance online publication. Retrieved March 12, 2016. doi: 10.1016/j.nbd.2016.02.021.
- Plassmann H, O'Doherty J, Rangel A. Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. J Neurosci. 2007;27:9984–9988. doi: 10.1523/JNEUROSCI.2131-07.2007.
- Rodrigues SM, Schafe GE, LeDoux JE. Molecular mechanisms underlying emotional learning and memory in the lateral amygdala. Neuron. 2004;44:75–91. doi: 10.1016/j.neuron.2004.09.014.
- Rosen G, Williams AG, Capra JA, Connolly MT, Cruz B, Lu L, Airey DC, Kulkarni A, Williams RW. The mouse brain library at www.Mbl.Org. Presented at the 14th International Mouse Genome Conference; Crete, Greece. 2000. Accessed May 2, 2011.
- Rutherford LC, Nelson SB, Turrigiano GG. BDNF has opposite effects on the quantal amplitude of pyramidal neuron and interneuron excitatory synapses. Neuron. 1998;21:521–530. doi: 10.1016/S0896-6273(00)80563-2.
- Sadri-Vakili G, Kumaresan V, Schmidt HD, Famous KR, Chawla P, Vassoler FM, Overland RP, Xia E, Bass CE, Terwilliger EF, Pierce RC, Cha JH. Cocaine-induced chromatin remodeling increases brain-derived neurotrophic factor transcription in the rat medial prefrontal cortex, which alters the reinforcing efficacy of cocaine. J Neurosci. 2010;30:11735–11744. doi: 10.1523/JNEUROSCI.2328-10.2010.
- Schilman EA, Uylings HB, Galis-de Graaf Y, Joel D, Groenewegen HJ. The orbital cortex in rats topographically projects to central parts of the caudate-putamen complex. Neurosci Lett. 2008;432:40–45. doi: 10.1016/j.neulet.2007.12.024.
- Schoenbaum G, Nugent SL, Saddoris MP, Setlow B. Orbitofrontal lesions in rats impair reversal but not acquisition of go, no-go odor discriminations. Neuroreport. 2002;13:885–890. doi: 10.1097/00001756-200205070-00030.
- Schwabe L, Wolf OT. Stress-induced modulation of instrumental behavior: from goal-directed to habitual control of action. Behav Brain Res. 2011;219:321–328. doi: 10.1016/j.bbr.2010.12.038.
- Skjoldager P, Pierre PJ, Mittleman G. Reinforcer magnitude and progressive ratio responding in the rat: effects of increased effort, prefeeding, and extinction. Learn Motiv. 1993;24:303–343. doi: 10.1006/lmot.1993.1019.
- Stalnaker TA, Cooch NK, Schoenbaum G. What the orbitofrontal cortex does not do. Nat Neurosci. 2015;18:620–627. doi: 10.1038/nn.3982.
- Urban DJ, Roth BL. DREADDs (designer receptors exclusively activated by designer drugs): chemogenetic tools with therapeutic utility. Annu Rev Pharmacol Toxicol. 2015;55:399–417. doi: 10.1146/annurev-pharmtox-010814-124803.
- Valentin VV, Dickinson A, O'Doherty JP. Determining the neural substrates of goal-directed learning in the human brain. J Neurosci. 2007;27:4019–4026. doi: 10.1523/JNEUROSCI.0564-07.2007.
- Wallis JD. Cross-species studies of orbitofrontal cortex and value-based decision-making. Nat Neurosci. 2012;15:13–19. doi: 10.1038/nn.2956.
- Whitfield TW, Jr, Shi X, Sun WL, McGinty JF. The suppressive effect of an intra-prefrontal cortical infusion of BDNF on cocaine-seeking is Trk receptor and extracellular signal-regulated protein kinase mitogen-activated protein kinase dependent. J Neurosci. 2011;31:834–842. doi: 10.1523/JNEUROSCI.4986-10.2011.
- Wilson RC, Takahashi YK, Schoenbaum G, Niv Y. Orbitofrontal cortex as a cognitive map of task space. Neuron. 2014;81:267–279. doi: 10.1016/j.neuron.2013.11.005.
- Yin HH, Ostlund SB, Balleine BW. Reward-guided learning beyond dopamine in the nucleus accumbens: the integrative functions of the cortico-basal ganglia networks. Eur J Neurosci. 2008;28:1437–1448. doi: 10.1111/j.1460-9568.2008.06422.x.
- Yin HH, Mulcare SP, Hilário MR, Clouse E, Holloway T, Davis MI, Hansson AC, Lovinger DM, Costa RM. Dynamic reorganization of striatal circuits during the acquisition and consolidation of a skill. Nat Neurosci. 2009;12:333–341. doi: 10.1038/nn.2261.
- Zimmermann KS, Yamin JA, Rainnie DG, Kessler KJ, Gourley SL. Connections of the mouse orbitofrontal cortex and regulation of goal-directed action selection by BDNF-trkB. Biol Psychiatry. 2015. Advance online publication. Retrieved March 12, 2016. doi: 10.1016/j.biopsych.2015.10.026.