Author manuscript; available in PMC: 2013 Aug 1.
Published in final edited form as: Behav Brain Res. 2012 May 28;233(2):494–499. doi: 10.1016/j.bbr.2012.05.032

Impaired reward learning and intact motivation after serotonin depletion in rats

Alicia Izquierdo 1,*, Kathleen Carlos 1, Serena Ostrander 1, Danilo Rodriguez 1, Aaron McCall-Craddolph 1, Gargey Yagnik 2, Feimeng Zhou 2
PMCID: PMC3402622  NIHMSID: NIHMS381038  PMID: 22652392

Abstract

Aside from the well-known influence of serotonin (5-hydroxytryptamine, 5-HT) on emotional regulation, more recent investigations have revealed the importance of this monoamine in modulating cognition. Parachlorophenylalanine (PCPA) depletes 5-HT by inhibiting tryptophan hydroxylase, the enzyme required for 5-HT synthesis and, if administered at sufficiently high doses, can result in a depletion of at least 90% of the brain's 5-HT levels. The present study assessed the long-lasting effects of widespread 5-HT depletions on two tasks of cognitive flexibility in Long Evans rats: effort discounting and reversal learning. We assessed performance on these tasks after administration of either 250 or 500 mg/kg PCPA or saline (SAL) on two consecutive days. Consistent with a previous report investigating the role of 5-HT on effort discounting, pretreatment with either dose of PCPA resulted in normal effortful choice: All rats continued to climb tall barriers to obtain large rewards and were not work-averse. Additionally, rats receiving the lower dose of PCPA displayed normal reversal learning. However, despite intact motivation to work for food rewards, rats receiving the largest dose of PCPA were unexpectedly impaired relative to SAL rats on the pretraining stages leading up to reversal learning, ultimately failing to approach and respond to the stimuli associated with reward. High performance liquid chromatography (HPLC) with electrochemical detection confirmed 5-HT, and not dopamine, levels in the ventromedial frontal cortex were correlated with this measure of associative reward learning.

Keywords: Effort discounting, reversal learning, ventromedial frontal cortex, HPLC

1. Introduction

Serotonin (5-HT) is thought to modulate emotional states, circadian rhythm, food intake, reproduction, and some aspects of cognition. Recent studies suggest that emotional dysregulation and cognitive control failures may share a common neurobiological substrate, both heavily modulated by 5-HT [1]. Yet, because research on 5-HT has historically focused on emotional regulation, comparatively few studies have investigated the effects of 5-HT depletion on cognitive control and flexibility. In experimental animals, manipulations of 5-HT include pharmacological, genetic, lesion, and dietary approaches, with diverse assessments in the domains of learning, memory, decision making, and cognitive control (for review, see [2]). Only toxin-mediated depletions of 5-HT consistently yield impairments on cognitive flexibility [3–7].

Tryptophan is essential for 5-HT synthesis and its transformation to 5-HT is facilitated by the enzyme tryptophan hydroxylase. Acute tryptophan depletion in humans results in largely intact long-term memory, verbal recall, and recognition [8], and leaves cognitive flexibility, as measured by probabilistic reversal learning, unaffected [9]. Similarly, tryptophan hydroxylase-depleting drug treatments such as parachlorophenylalanine (PCPA) in rodents have not yielded consistent evidence of any measured cognitive impairment [7, 10, 11]. Specifically, depleting doses of PCPA do not result in sensorimotor impairments [12] and leave spatial learning unaffected in rats [13–15]. Fewer studies have investigated the effects of 5-HT depletions on cognitive flexibility: two such reports utilized comparatively smaller doses and found no impairments [10, 11].

Whereas attention and working memory have been linked to dopamine (DA), aspects of cognition such as response inhibition and flexibility of response are more heavily associated with 5-HT function [1, 16]. Both effort discounting and discrimination reversal learning are tasks frequently administered to measure cognitive flexibility in rats [17]. In the effort discounting paradigm described by Walton and colleagues [18], rats have the option to either work for a reward of greater magnitude or instead obtain a smaller reward at low effort. Previous studies have found that work tolerance and the motivation to choose a reward of greater value may not rely on 5-HT modulation [11], but instead be DA-dependent. Reversal learning, on the other hand, requires the rat to respond to changes in reward contingency and is highly linked to both DA and 5-HT modulation (for review, see [19]).

In the present study, we sought to compare performance on both of these tasks in rats treated with PCPA. We first assessed rats' effort discounting by using a T-maze task in which the rats were required to choose between a high reward (HR) and a low reward (LR). The rat had to exert greater effort by climbing a barrier of increasing height to procure the HR. The LR arm, however, could be freely accessed with no effort. We then treated rats with either 2 d of 250 mg/kg PCPA, 2 d of 500 mg/kg PCPA, or 2 d of saline (SAL). The highest dose of PCPA has been shown to deplete 5-HT by 80–90% in the brain [15]. After an assessment of post-treatment effort discounting, we also assessed reversal learning in PCPA-treated rats by using touchscreen-adapted operant chambers in which rats were required to nosepoke a stimulus for a reward. Finally, we used high performance liquid chromatography (HPLC) with electrochemical detection to quantify 5-HT and DA in ventromedial frontal cortex. We hypothesized that 5-HT depleted rats would not be impaired on effort discounting in accordance with previous research [11] and that rats would choose the HR at comparable rates to SAL-treated rats. We also hypothesized that PCPA-treated rats would be impaired on reversal learning following treatment, given this monoamine's important role in inhibitory control.

2. Materials and Methods

2.1 Subjects

Subjects were 24 male Long Evans rats (Charles River Laboratories), weighing between 275–300 g at the start of behavioral testing. All rats were pair-housed until food-restriction (21 d after arrival). To ensure motivation to work for food rewards, daily rations were decreased until rats were at approximately (and no less than) 85% of their free-feeding body weight. Water was available ad libitum. Housing conditions were maintained at a 12-hr light/12-hr dark cycle, with the temperature held constant at 22 °C. All behavioral testing took place between 0800h and 1600h 5–6 days per week in accordance with previous reports from our lab on effort discounting [20] and reversal learning [21]. Procedures for this study were reviewed and accepted by the Institutional Animal Care and Use Committee at California State University, Los Angeles.

2.2 Apparatus

2.2.1 Effortful T-maze

A commercially-available t-maze (Stoelting Co., Wood Dale, Illinois) with one start arm and two goal arms was used (Figure 1a). Each goal arm measured 41.9 cm in length, 10.2 cm wide, with walls 20.3 cm high. The start arm measured 50.3 cm in length. Located at the end of each goal arm was a white ceramic bowl measuring 5.1 cm in diameter in which the food reward was placed. “Froot loops” (Kellogg NA Co., Battle Creek, MI) were given as food rewards during testing: a “high reward” (HR) consisted of four ½ froot loops (i.e., 2 froot loops), while a “low reward” (LR) consisted of one ½ froot loop. To increase effort for the HR, wooden triangular blocks of 15, 20, 25 and 30 cm heights were constructed in the lab and used to impede access to the food reward in the HR arm (Figure 1b). Barrier heights were chosen from a previous study [18] and in keeping with recent reports from our laboratory [20, 22]. The wooden triangular blocks (barriers) were placed in the t-maze such that the rat was required to climb straight up the side (90°) and down at an angle to the food reward located at the end of the goal arm. The angle of decline to the food reward averaged 44.5° across the barriers. Between trials, the rat was removed from the t-maze and placed in a 30 cm h x 25 cm diameter cylindrical glass holding tank.

Figure 1. The effortful t-maze.

The t-maze (A) was equipped with one start arm and two goal arms, each baited with either a high reward, HR (four ½ froot loops) or a low reward, LR (one ½ froot loop). Wooden blocks (B) of varying heights (15, 20, 25, and 30 cm) were used to increase the effort required to access the high reward arm.

2.2.2 Operant Chambers

Operant chambers (#80004, Lafayette Instrument Co., Lafayette, IN), measuring 35.6 cm (length) x 27.9 cm (width) x 33.7 cm (height) were each housed within a sound and light-attenuating cubicle (#83018DDP Lafayette Instrument Co., Lafayette, IN). Each operant chamber was outfitted with a touch-sensitive, 12” LCD flat screen (EloTouch, Menlo Park, CA; Figure 2A). The chamber floor was covered with a clear Plexiglas sheet to facilitate ambulation. The touchscreen and a single houselight were located at one end of the chamber; a tone generator, a pellet receptacle and a pellet dispenser, at the other end. The pellet dispenser delivered 45 mg dustless sucrose pellets (BioServ, Frenchtown, NJ). Stimulus presentation, reward delivery and contingencies were controlled by custom-designed software developed for use in nonhuman primate experiments (Ryklin Software, Inc). The equiluminant stimuli were the same as those reported previously [21] (Figure 2B).

Figure 2. Operant apparatus and stimuli.

The operant testing chamber (A) was outfitted with a touchscreen on one end and a pellet dispenser on the other end. The stimuli (B) were displayed on the touchscreen for nosepoking during discrimination and reversal learning.

2.3 Handling and Food Restriction

2.3.1 Handling and acclimation to food rewards

Each rat was handled for a minimum of 10 min once per day for 5 days prior to behavioral testing. During the last two days of handling, animals were transported for approximately 5 min to habituate them to the transport cart prior to the commencement of testing. Upon returning to the vivarium, they were fed 10 froot loops in their homecage to accustom them to the food reward.

2.3.2 Food restriction

All rats were food-restricted to 85% of their free-feeding body weight to ensure motivation to work for food, while water was available ad libitum. Weights were monitored three times per week to ensure a healthy body weight.
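The 85% weight floor described above is simple arithmetic; a minimal sketch follows. The function names and any weights passed to them are hypothetical illustrations, not values or tools from the study.

```python
def target_weight(free_feeding_g, fraction=0.85):
    """Return the restriction floor (g) as a fraction of free-feeding weight."""
    return free_feeding_g * fraction


def below_floor(current_g, free_feeding_g, fraction=0.85):
    """True if the current weight has dropped below the restriction floor."""
    return current_g < target_weight(free_feeding_g, fraction)
```

For a hypothetical rat with a free-feeding weight of 300 g, the floor would be 255 g, so a weight of 250 g at one of the thrice-weekly checks would be flagged.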

2.4 Effort Discounting

2.4.1 Acclimation to the t-maze

A habituation and training protocol adapted from Walton and colleagues [18] was used to habituate the rats to the t-maze and familiarize them with the froot loops. During the acclimation phase, each rat was individually placed into the t-maze and allowed to explore and eat froot loops freely for 10 min. Criterion for advancement to the next phase was consuming fifteen ½ froot loops within 10 min for two consecutive days.

2.4.2 Phase 1 Discrimination training with free sampling

In this phase, one goal arm was baited with four ½ froot loops (HR arm), and the other with one ½ froot loop (LR arm). The rat was allowed to sample freely from both arms for five trials. Each trial lasted until the rat finished all the froot loops. Trials were separated by a 30 sec intertrial interval (ITI) on day 1, and a 60 sec ITI on day 2, during which time they were placed in an empty cylindrical glass holding tank. HR and LR arm designations were counterbalanced among rats (half the rats received HR in the left goal arm, the other half in the right), and remained constant throughout testing. This phase was administered for two consecutive days.

2.4.3 Phase 2 Discrimination training with forced trials

For this phase, each rat was administered ten “forced” trials, in which either the HR or LR arm was blocked by a white cardboard insert, forcing the rat to one side or the other according to a Gellerman (pseudorandom) schedule. This phase marked the beginning of learning to visit only one arm as well as continuing to learn each arm's associated reward values. Each trial was separated by a 30 sec ITI, during which time rats were placed in an empty cylindrical glass holding tank. This phase was administered for two consecutive days.
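To illustrate what a pseudorandom forced-trial schedule of this kind might look like, the sketch below generates a balanced left/right sequence in which no side repeats more than three times in a row. This is a simplification under stated assumptions: the published Gellerman series impose additional constraints, and the exact schedule used in the study is not given in the text. The function names and the `max_repeat` parameter are hypothetical.

```python
import random


def max_run(seq):
    """Length of the longest run of identical consecutive items."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def pseudorandom_schedule(n_trials=10, max_repeat=3, seed=None):
    """Balanced L/R sequence with no run longer than max_repeat (assumed rule)."""
    if n_trials % 2:
        raise ValueError("n_trials must be even for a balanced schedule")
    rng = random.Random(seed)
    sides = ["L", "R"] * (n_trials // 2)
    # Rejection sampling: reshuffle until the run-length constraint holds.
    while True:
        rng.shuffle(sides)
        if max_run(sides) <= max_repeat:
            return list(sides)
```

Calling `pseudorandom_schedule(10, seed=1)` would return ten forced-trial sides, five to each arm; fixing the seed simply makes the sequence reproducible.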

2.4.4 Phase 3 Discrimination training with free choice

Each rat was allowed to choose either the HR or LR arm, and was removed from the t-maze upon eating the food reward from the chosen arm. Ten trials were administered per day, with a forced trial administered after trials 5 and 10, forcing the rat to the arm not chosen on the most recent previous trial to prevent side biases (i.e., if the HR was chosen on trial 5, the forced trial would be to the LR). A 30 sec ITI was used between trials, during which time the t-maze was wiped clean with 70% ethanol solution to prevent the rat's use of scent-guided choice. Criterion for the next phase was choosing the HR arm 90% or more for 2 consecutive days.

2.4.5 Training phase with barriers

During this phase, rats were required to climb successively larger wooden barriers (beginning at 15 cm) to achieve the HR. Each session consisted of 10 free choice trials, with a forced trial after trials 5 and 10 to prevent side biases. Upon eating the food reward, the rat was placed in a holding tank for a 30 sec ITI, during which the maze was wiped clean with 70% ethanol. This phase continued for 12 days of daily testing, with barrier heights increasing every third day from 15 cm, 20 cm, 25 cm to 30 cm, irrespective of the rat's choice.
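The barrier progression can be written as a mapping from training day to barrier height. The sketch below assumes that "every third day" means three consecutive days at each of the four heights, which is the reading consistent with a 12-day phase; the constant and function names are hypothetical.

```python
BARRIER_HEIGHTS_CM = [15, 20, 25, 30]
DAYS_PER_HEIGHT = 3  # assumption: three consecutive days at each height


def barrier_height(day):
    """Barrier height (cm) for a 1-indexed training day (1..12)."""
    n_days = len(BARRIER_HEIGHTS_CM) * DAYS_PER_HEIGHT
    if not 1 <= day <= n_days:
        raise ValueError(f"day must be between 1 and {n_days}")
    return BARRIER_HEIGHTS_CM[(day - 1) // DAYS_PER_HEIGHT]
```

Under this reading, days 1 to 3 use the 15 cm barrier and days 10 to 12 use the 30 cm barrier, regardless of which arm the rat chose on earlier days.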

2.5 Drug Treatment

The tryptophan hydroxylase inhibitor, 4-Chloro-DL-phenylalanine methyl ester hydrochloride (PCPA; Sigma-Aldrich, St. Louis, MO) was dissolved by sonication in physiological saline (SAL). Rats were randomly assigned to PCPA treatment (n=8, 2 d of 500 mg/kg PCPA; n=8, 2 d of 250 mg/kg PCPA) or treatment with two days of SAL (n=8). Treatment groups are abbreviated as follows: 500 PCPA, 250 PCPA, or SAL. Following treatment with PCPA, rats were given 3–4 days rest without behavioral testing.
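Because doses are expressed per kilogram of body weight, the amount given to each rat scales with its weight. A minimal sketch of that conversion follows; the function name and the example body weight are hypothetical, and injection volume and solution concentration are not covered because the text does not report them.

```python
def dose_mg(body_weight_g, dose_mg_per_kg):
    """Absolute dose in mg for one injection, given body weight in grams."""
    return dose_mg_per_kg * body_weight_g / 1000.0
```

For a hypothetical 300 g rat, the 500 mg/kg dose corresponds to 150 mg per injection and the 250 mg/kg dose to 75 mg per injection.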

2.6 Post-treatment Effort Discounting

Following 3–4 days of no testing or food restriction, rats were placed back on food restriction and tested again on barriers of 15 cm, 20 cm, 25 cm, and 30 cm, following a protocol similar to that used for stage 2.4.5 above.

2.7 Reversal Learning

Rats were maintained on a food-restricted diet in their homecages for at least one week after effortful t-maze testing, before testing in the operant chambers. During acclimation, rats were required to eat pellets out of the pellet tray before exposure to any stimuli on the touchscreen.

2.7.1 Shaping

Shaping began with the display of white graphic stimuli on the black background of the touchscreen, the disappearance of which was paired with the onset of a “reward event:” a sucrose pellet, a 1 s tone and a 1 s illumination of the houselight. An intertrial interval (ITI) of 20 s was used, while stimuli remained on the screen for 8 s. At any time, rats could nosepoke the stimuli on the touchscreen and initiate the reward event. Criterion for this phase occurred when rats ate 60 sucrose pellets within 30 min for each of two consecutive days.

2.7.2 Pretraining

Pretraining consisted of different stages outlined in detail elsewhere [21]. In brief, rats progressively learned to: 1) nosepoke a “neutral” stimulus (i.e., a white circle); 2) track and nosepoke the stimulus as it appears on the left or right-hand side of the touchscreen; 3) initiate a trial; and 4) receive a “punishment” for nosepoking the incorrect (blank) window on the touchscreen. The punishment event consisted of the absence of a reward event and an inability to initiate a new trial for 5 s. The stimuli remained on the screen indefinitely until the rat nosepoked. Criterion for advancement for each phase of training was 60 correct nosepokes within 45 min, on each of two consecutive days. Rats were given a maximum of 30 sessions to complete this phase.

2.7.3 Visual Discrimination and Reversal Learning

Rats that completed all stages of 2.7.2 were then presented with two concurrently-presented two-dimensional, equiluminant white stimuli (Figure 2B) on a black background and trained according to predetermined reinforcement contingencies: Stimulus A resulted in a food reward (A+), whereas nosepoking the other stimulus, B, resulted in a 5 s timed-out punishment (B−). Designation of the rewarded stimulus (reward assignment) was counterbalanced across groups. The custom software enabled stimuli to be presented on the screen indefinitely until the rat nosepoked one of the stimuli. Only small preprogrammed response windows overlying the stimuli were sensitive to nosepoking: nosepoking outside of the response window was undetected; nosepoking within it was either correct or incorrect, depending on reward contingency. Left/right presentation of the S+ was pseudorandom, according to a Gellerman schedule generated by Ryklin Software Inc. There were 60 total trials per session (and one session per day) with a 10-s ITI. For this learning phase, rats were required to reach a criterion of 85% correct out of 60 trials across each of 2 consecutive days. Immediately following attainment of the discrimination criterion (i.e. the next session), rats were then required to respond to a reversal in reward contingency: nosepoking the previously incorrect stimulus was now rewarded by provision of a sucrose pellet. As in the earlier phase, criterion was set at a mean score of 85% correct out of 60 trials across two consecutive sessions.
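The sessions-to-criterion measure can be sketched as a scan over per-session correct counts. The sketch below follows the "85% correct on each of 2 consecutive days" reading stated for discrimination; the reversal criterion is worded as a mean across two sessions and could be scored slightly differently. The function and parameter names are hypothetical, and integer arithmetic is used so that 51 of 60 correct counts as exactly 85%.

```python
def sessions_to_criterion(correct_counts, n_trials=60, threshold_pct=85, consecutive=2):
    """Return the 1-indexed session on which criterion is met, or None."""
    streak = 0
    for session, correct in enumerate(correct_counts, start=1):
        # Integer comparison avoids floating-point edge cases at the threshold.
        if correct * 100 >= threshold_pct * n_trials:
            streak += 1
        else:
            streak = 0
        if streak == consecutive:
            return session
    return None
```

With hypothetical counts of 40, 52, 50, 51, and 55 correct across five sessions, criterion would be met on session 5, because session 3 breaks the first run of passing sessions.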

2.8 HPLC

The effect of PCPA on the brain was assessed by postmortem tissue analysis of 5-HT levels in ventromedial frontocortical regions collected 130 d after the administration of 500 PCPA and 112 d after treatment with 250 PCPA. Following behavioral testing, rats were euthanized with an overdose of euthasol. Once brains were extracted they were “flash frozen” (placed in −80°C isopentane for 2 min) and subsequently stored in aluminum foil at −80°C. Brains were then placed in a brain matrix and 2 mm coronal sections of ventromedial frontal cortex (to include orbital, infralimbic, and prelimbic sectors) were dissected on a stainless steel plate positioned on wet ice. No other brain regions were dissected for analysis. Tissue samples were homogenized in 200 μl of 0.2 M perchloric acid for 10 min and centrifuged at 20000 rpm for 10 min (4°C). The supernatant (50 μl) was subsequently analyzed using reversed-phase liquid chromatography separation and electrochemical detection. Samples were separated on a high retention analytical column (Phenomenex Luna C18(2), 250 × 4.6 mm, 5 μm, 100 Å) using a mobile phase (50 mM KH2PO4 and 5 mM octane sulfonic acid in 85:15 H2O:acetonitrile, pH 3.0) at a flow rate of 1 ml/min. Tissue levels of 5-HT were quantified using amperometric (i-t) detection (CHI Electrochemical Workstation instrument) with a glassy carbon electrode set at +85 mV. The resultant signal was integrated using CHI832 Electrochemical Analyzer software. The HPLC system was equipped with a Waters 600 delta pump with a Waters 600 controller. Quantification was performed against standards containing known amounts of 5-HT and DA (1000 nM and 100 nM stock solutions, respectively).

2.9 Data Analyses

Data were analyzed using StatView software. Statistical significance was set at p≤0.05. Effort discounting data (percent HR chosen) were analyzed using a repeated-measures ANOVA for barrier height (15, 20, 25, and 30 cm) by treatment group (250 PCPA, 500 PCPA, or SAL), for each phase separately (pretreatment and post-treatment). Session duration for effort discounting was analyzed using ANOVA. For learning in the operant chambers, the number of sessions to complete pretraining was analyzed using ANOVA. Sessions to criterion for visual discrimination and reversal learning phases were analyzed using independent samples t-tests. Additionally, for visual discrimination and reversal learning phases, session percent correct was analyzed using repeated-measures ANOVA across the first 10 sessions. HPLC data were also analyzed using ANOVAs for DA and 5-HT content in the ventromedial frontal cortex. Fisher's protected least significant difference (PLSD) tests were used for post-hoc pairwise comparisons of all groups. Finally, Pearson correlation matrices were generated for HPLC measures and learning measures.
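For readers unfamiliar with the statistics named above, the sketch below shows how a one-way between-groups F ratio and a Pearson correlation coefficient are computed from raw values. It is a plain-Python illustration only: the study used StatView, and the sketch does not compute p values, repeated-measures ANOVAs, or Fisher's PLSD comparisons. The function names are hypothetical.

```python
from math import sqrt


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of per-group samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

In this design, `groups` would hold one list per treatment group (SAL, 250 PCPA, 500 PCPA), for example sessions to complete pretraining, and `pearson_r` would pair each tissue sample's 5-HT content with the corresponding rat's sessions to criterion.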

3. Results

3.1 Effort discounting

There were no pre- or post-treatment differences in effort discounting (Figure 3A). However, an ANOVA revealed that there was a main effect of treatment with significant differences in session duration [F(2,18)=15.46, p<0.01]. SAL-treated rats completed testing in 12.3 ± 0.921 min, the 250 PCPA group completed sessions in 9.25 ± 0.164 min, and the 500 PCPA group in 15.0 ± 0.775 min. Fisher's PLSD showed that the 500 PCPA group was slower than both the SAL group (p<0.01) and 250 PCPA group (p<0.01). The 250 PCPA group also completed trials more quickly than SAL (p<0.01) (Figure 3B).

Figure 3. Effort Discounting.

(A) Mean ± SEM percent high reward (HR) choices as a function of increasing barrier heights (15, 20, 25, and 30 cm) before and after treatment with either 2 d of 250 mg/kg PCPA (250 PCPA) or 2 d of 500 mg/kg (500 PCPA) or 2 d of saline (SAL). There was no effect of PCPA treatment on effort discounting; all groups chose similarly. (B) Mean ± SEM session duration in minutes. 250 PCPA finished sessions faster than SAL, whereas 500 PCPA finished sessions more slowly than SAL. ** Different from SAL, p<0.01.

3.2 Reversal Learning

The 500 PCPA group was impaired at approaching (i.e. nosepoking) the stimuli associated with reward, whereas the 250 PCPA rats were unimpaired relative to SAL (Figure 4). An ANOVA on the number of sessions required to complete pretraining leading up to discrimination and reversal learning revealed there was a main effect of treatment with significant differences in sessions to complete pretraining [F(2,20)=3.92, p=0.036]. SAL-treated rats completed pretraining in 19.3 ± 2.963 sessions, the 250 PCPA group completed this stage in 17.3 ± 3.840 sessions, and the 500 PCPA group were allowed up to 30 sessions to complete this training, with all but one rat failing to learn to nosepoke (27.5 ± 0.756 sessions). Fisher's PLSD showed that the 250 PCPA group required fewer sessions to learn than the 500 PCPA group (p=0.02). Similarly, the SAL group required fewer sessions relative to the 500 PCPA group (p=0.04). There was no significant difference in learning between the 250 PCPA and SAL groups (p=0.60).

Figure 4. Operant Learning.

Mean ± SEM shaping and pretraining sessions to criterion after treatment with either 2 d of 250 mg/kg PCPA (250 PCPA) or 2 d of 500 mg/kg (500 PCPA) or 2 d of saline (SAL). 500 PCPA showed impaired nosepoking responses to stimuli on the touchscreen when given a maximum of 30 sessions. *Different from SAL and 250 PCPA, p≤0.05.

Of the PCPA-treated groups, only the 250 PCPA group went on to complete the visual discrimination and reversal phases, and did so no differently than SAL [sessions to criterion for visual discrimination: t(10)=0.545, p=0.598; sessions to criterion for reversal learning: t(10)=0.404, p=0.698]. For reversal learning, the 250 PCPA rats learned in 10.5 ± 1.893 sessions and the SAL-treated rats learned in 8.6 ± 3.881 sessions. To analyze rate of learning, repeated-measures ANOVAs on percent correct for the first 10 sessions of visual discrimination and reversal learning were conducted. Neither analysis revealed a main effect of treatment or a treatment x session interaction.

3.3 HPLC

An analysis of frontocortical tissue collected 130 d after the administration of 500 PCPA and 112 d after treatment with 250 PCPA was performed using ANOVAs. An ANOVA on DA content in ventromedial frontal cortex revealed no significant treatment group difference. An ANOVA on 5-HT content in the same tissue revealed a marginally significant effect of treatment [F(2,33)=2.951, p=0.06] (Figure 5). Fisher's PLSD post hoc pairwise comparisons showed significantly elevated 5-HT content in 250 PCPA relative to 500 PCPA (p=0.02). However, 250 PCPA was not elevated relative to SAL (p=0.42), and 500 PCPA was not depleted relative to SAL (p=0.22).

Figure 5. HPLC quantification of DA and 5-HT content in rat ventromedial frontal cortex.

Left panel: Treatment with either 2 d of 250 mg/kg PCPA (250 PCPA) or 2 d of 500 mg/kg (500 PCPA) or 2 d of saline (SAL) resulted in no difference in DA content in ventromedial frontal cortex. Right panel: Significantly elevated 5-HT content in 250 PCPA relative to 500 PCPA (*p=0.02). However, 250 PCPA was not elevated relative to SAL, and 500 PCPA was not depleted relative to SAL.

Correlation matrices were generated for HPLC measures and the learning measures identified as significant in 3.1 and 3.2. Pearson correlations were only significant for 5-HT and the number of sessions to criterion required to complete pretraining in the operant chambers (r=−0.416, p=0.017). No significant correlation was found for DA on this same measure (r=0.198, p=0.281). Thus, low 5-HT content in the ventromedial frontal cortex was associated with a greater number of sessions spent learning how to nosepoke the stimulus on the touchscreen.

4. Discussion

4.1 Impaired reward learning, intact motivation

This study is the first to show that treatment with high-dose PCPA, a 5-HT-depleting drug, impairs the ability to approach and respond to stimuli associated with reward. The conclusion that rats exhibit impaired reward learning and not simply a lack of motivation to procure reward is substantiated by the observation that the same dose had no effect on our measure of effort discounting, specifically, HR choice. We therefore corroborate previous findings by [11]: rats treated with the highest dose of PCPA continue to choose (and work for) the HR at greater effort. Additionally, the impaired reward learning cannot be explained by impaired locomotor responses, since choice of the HR on the effort discounting task requires costly physical demands and both PCPA-treated groups proved capable of eliciting the appropriate responses compared to SAL-treated rats.

4.2 Impaired attention in reward learning

Interestingly, though the number of HR choices did not differ across treatment groups, the 500 PCPA group exhibited longer session times on the effort discounting task. In contrast to the 500 PCPA group, the rats receiving the lower dose (250 PCPA) displayed quicker session times, even quicker than SAL rats.

It is possible that the 500 PCPA group was overall less attentive relative to the 250 PCPA and SAL groups. This idea accords well with a definition by [23], which posits two components of attention in reward learning: selection and executive control. The selection component allows the subject to assign stimuli to different levels of priority. Though this ability is generally believed to be the purview of DA, our findings suggest it may also involve 5-HT modulation. As described in the preceding section, in the effort discounting task, the 500 PCPA group displayed greater session times, ostensibly engaged in behaviors not related to reward procurement. Similarly in the operant chambers, the high-dose PCPA group spent the majority of their session time exploring the testing apparatus itself or adjacent to the pellet tray instead of approaching the reward-related stimulus like the 250 PCPA and SAL groups. A more robustly objective measure, sessions to completion of pretraining, was impaired in the 500 PCPA group, which precluded their advancement to discrimination and reversal learning. This learning was significantly correlated with 5-HT levels in the ventromedial frontal cortex. Importantly, this correlation was negative, with low levels of 5-HT associated with a greater number of sessions spent completing this phase. We acknowledge having sampled only from ventromedial frontal cortex; thus, it remains possible that 5-HT in other brain areas similarly correlated with this behavioral measure.

A second component of attention, executive control, refers to the active inhibition of irrelevant information and the “optimization” of reward-relevant responses [23]. Indeed, it has been shown that 5-HT in frontal cortex can increase attention selection, decrease impulsive responding, and most likely interacts with DA mechanisms in support of reward learning [24–27]. That said, we did not observe impairments in any measure of inhibitory control directly (e.g. reversal learning) nor did we measure attention in an attention-specific task (e.g., 5-choice serial reaction time task). Our apparatuses and software were also not equipped to isolate the attention component of learning. Ongoing experiments are aimed at the disambiguation of these sub-processes.

An impairment in learning to nosepoke the stimulus in the 500 PCPA group was unexpected and represents a limitation in our design that could have been addressed by an a priori counterbalancing of the order of tasks. If the 500 PCPA group was trained to nosepoke the stimulus before treatment, they might have advanced to discrimination and reversal learning. We can only speculate that these rats would likely then display intact discrimination and impaired reversal learning (if still lower in 5-HT), consistent with the putative role of 5-HT on cognitive flexibility and inhibitory control [1, 19]. A related possibility is that 5-HT more selectively modulates the formation of new stimulus-reward associations, and does little to support old or previously-learned associations. Thus, it would follow that 5-HT depletion would produce no impairment on a previously-learned association (e.g. no impairment in post-treatment effort discounting), but would prevent learning to respond to a novel rewarded stimulus (e.g. impaired stimulus nosepoking).

4.3 Plasticity after treatment with PCPA

The 250 PCPA group appeared to exhibit a “rebound” effect in 5-HT levels, with HPLC analyses revealing greater 5-HT content than the 500 PCPA group in the ventromedial frontal cortex. Interestingly, the profile for DA in the same tissue appeared also to be slightly greater on average in the 250 PCPA group relative to the SAL group (though this difference did not reach statistical significance). The 250 PCPA group also exhibited quicker session times in the t-maze than both the 500 PCPA and SAL groups. This measure was not, however, significantly correlated with 5-HT tissue content. Together these observations point to neuroplastic adaptations and behavioral recovery in the 250 PCPA group. Two consecutive days of 500 mg/kg PCPA, on the other hand, might have induced some neurotoxicity in this group, given the reductions observed in 5-HT content relative to the 250 PCPA group. Notably, neither PCPA-treated group displayed significantly different 5-HT content relative to the SAL group. Cognitive exercise and elapsed time likely aided in their overall recovery.

4.4 Conclusion

Here we report that pretreatment with high-dose PCPA resulted in impaired nosepoking to stimuli associated with reward. This learning was also correlated with 5-HT levels in the ventromedial frontal cortex. The same dose had no such effect on effort discounting, a test measuring motivation to procure reward administered shortly after drug treatment. Though a plentitude of studies have provided evidence for 5-HT's selective involvement in inhibitory control (for review, see [19]), fewer investigations have reported on the role of 5-HT in more general (associative) reward learning processes. One such study [6] reported that 5-HT content in areas of the medial frontal cortex is involved in learning to respond to stimuli predictive of reward on a go/no-go task. Another relevant study that accords well with our present finding is that of Chudasama and Robbins [28]. Using methods similar to ours, these authors found that pretraining lesions of the orbitofrontal cortex resulted in impaired autoshaping (i.e., a failure to approach the stimulus predictive of reward). Given our results, the autoshaping impairment in [28] could be due to a putative loss of 5-HT in the orbitofrontal cortex. Taken together, our data suggest that 5-HT can also modulate reward learning that involves little inhibitory control per se, other than the suppression of reward-irrelevant cues.

Research Highlights

  • Rats were treated with parachlorophenylalanine (PCPA), a drug that depletes 5-HT.

  • Rats were tested on effort discounting and reversal learning, tests of cognitive flexibility.

  • PCPA treatment resulted in unimpaired effort discounting, suggesting intact motivation.

  • Large-dose PCPA resulted in a failure to approach and respond to stimuli predictive of reward.

  • 5-HT levels in the ventromedial frontal cortex were correlated with associative reward learning.

Acknowledgments

Research was supported by the NIH MBRS-RISE program at California State University, Los Angeles. Additional support came from 1SC2MH087974 (Izquierdo).

References

  • 1. Robbins TW. Chemistry of the mind: neurochemical modulation of prefrontal cortical function. J Comp Neurol. 2005;493(1):140–6. doi: 10.1002/cne.20717.
  • 2. Cools R, Roberts AC, Robbins TW. Serotoninergic regulation of emotional and behavioural control processes. Trends Cogn Sci. 2008;12(1):31–40. doi: 10.1016/j.tics.2007.10.011.
  • 3. Clarke HF, et al. Cognitive inflexibility after prefrontal serotonin depletion is behaviorally and neurochemically specific. Cereb Cortex. 2007;17(1):18–27. doi: 10.1093/cercor/bhj120.
  • 4. Clarke HF, et al. Prefrontal serotonin depletion affects reversal learning but not attentional set shifting. J Neurosci. 2005;25(2):532–8. doi: 10.1523/JNEUROSCI.3690-04.2005.
  • 5. Clarke HF, et al. Cognitive inflexibility after prefrontal serotonin depletion. Science. 2004;304(5672):878–80. doi: 10.1126/science.1094987.
  • 6. Masaki D, et al. Relationship between limbic and cortical 5-HT neurotransmission and acquisition and reversal learning in a go/no-go task in rats. Psychopharmacology (Berl). 2006;189(2):249–58. doi: 10.1007/s00213-006-0559-0.
  • 7. Lapiz-Bluhm MD, et al. Chronic intermittent cold stress and serotonin depletion induce deficits of reversal learning in an attentional set-shifting test in rats. Psychopharmacology (Berl). 2009;202(1–3):329–41. doi: 10.1007/s00213-008-1224-6.
  • 8. Kulz AK, et al. Effects of tryptophan depletion on cognitive functioning, obsessive-compulsive symptoms and mood in obsessive-compulsive disorder: preliminary results. Neuropsychobiology. 2007;56(2–3):127–31. doi: 10.1159/000115778.
  • 9. Evers EA, et al. Serotonergic modulation of prefrontal cortex during negative feedback in probabilistic reversal learning. Neuropsychopharmacology. 2005;30(6):1138–47. doi: 10.1038/sj.npp.1300663.
  • 10. Brigman JL, et al. Pharmacological or genetic inactivation of the serotonin transporter improves reversal learning in mice. Cereb Cortex. 2010;20(8):1955–63. doi: 10.1093/cercor/bhp266.
  • 11. Denk F, et al. Differential involvement of serotonin and dopamine systems in cost-benefit decisions about delay or effort. Psychopharmacology (Berl). 2005;179(3):587–96. doi: 10.1007/s00213-004-2059-4.
  • 12. Dringenberg HC, et al. p-chlorophenylalanine-induced serotonin depletion: reduction in exploratory locomotion but no obvious sensory-motor deficits. Behav Brain Res. 1995;68(2):229–37. doi: 10.1016/0166-4328(94)00174-e.
  • 13. Beiko J, Candusso L, Cain DP. The effect of nonspatial water maze pretraining in rats subjected to serotonin depletion and muscarinic receptor antagonism: a detailed behavioural assessment of spatial performance. Behav Brain Res. 1997;88(2):201–11. doi: 10.1016/s0166-4328(97)02298-5.
  • 14. Dyer K, Cain DP. Water maze impairments after combined depletion of somatostatin and serotonin in the rat. Behav Brain Res. 2007;181(1):85–95. doi: 10.1016/j.bbr.2007.03.029.
  • 15. Petrasek T, Stuchlik A. Serotonin-depleted rats are capable of learning in active place avoidance, a spatial task requiring cognitive coordination. Physiol Res. 2009;58(2):299–303. doi: 10.33549/physiolres.931729.
  • 16. Arnsten AF. Catecholamine modulation of prefrontal cortical cognitive function. Trends Cogn Sci. 1998;2(11):436–47. doi: 10.1016/s1364-6613(98)01240-6.
  • 17. Izquierdo A, Belcher AM. Rodent models of adaptive decision making. Methods Mol Biol. 2012;829:85–101. doi: 10.1007/978-1-61779-458-2_5.
  • 18. Walton ME, Bannerman DM, Rushworth MF. The role of rat medial frontal cortex in effort-based decision making. J Neurosci. 2002;22(24):10996–1003. doi: 10.1523/JNEUROSCI.22-24-10996.2002.
  • 19. Izquierdo A, Jentsch JD. Reversal learning as a measure of impulsive and compulsive behavior in addictions. Psychopharmacology (Berl). 2012;219(2):607–20. doi: 10.1007/s00213-011-2579-7.
  • 20. Kosheleff AR, et al. Work aversion and associated changes in dopamine and serotonin transporter after methamphetamine exposure in rats. Psychopharmacology (Berl). 2012;219(2):411–20. doi: 10.1007/s00213-011-2367-4.
  • 21. Izquierdo A, et al. Reversal-specific learning impairments after a binge regimen of methamphetamine in rats: possible involvement of striatal dopamine. Neuropsychopharmacology. 2010;35(2):505–14. doi: 10.1038/npp.2009.155.
  • 22. Ostrander S, et al. Orbitofrontal cortex and basolateral amygdala lesions result in suboptimal and dissociable reward choices on cue-guided effort in rats. Behav Neurosci. 2011;125(3):350–9. doi: 10.1037/a0023574.
  • 23. Chudasama Y, Robbins TW. Psychopharmacological approaches to modulating attention in the five-choice serial reaction time task: implications for schizophrenia. Psychopharmacology (Berl). 2004;174(1):86–98. doi: 10.1007/s00213-004-1805-y.
  • 24. Harrison AA, Everitt BJ, Robbins TW. Central 5-HT depletion enhances impulsive responding without affecting the accuracy of attentional performance: interactions with dopaminergic mechanisms. Psychopharmacology (Berl). 1997;133(4):329–42. doi: 10.1007/s002130050410.
  • 25. Dalley JW, et al. Specific abnormalities in serotonin release in the prefrontal cortex of isolation-reared rats measured during behavioural performance of a task assessing visuospatial attention and impulsivity. Psychopharmacology (Berl). 2002;164(3):329–40. doi: 10.1007/s00213-002-1215-y.
  • 26. Dalley JW, et al. Deficits in impulse control associated with tonically-elevated serotonergic function in rat prefrontal cortex. Neuropsychopharmacology. 2002;26(6):716–28. doi: 10.1016/S0893-133X(01)00412-2.
  • 27. Robbins TW, Arnsten AF. The neuropsychopharmacology of fronto-executive function: monoaminergic modulation. Annu Rev Neurosci. 2009;32:267–87. doi: 10.1146/annurev.neuro.051508.135535.
  • 28. Chudasama Y, Robbins TW. Dissociable contributions of the orbitofrontal and infralimbic cortex to pavlovian autoshaping and discrimination reversal learning: further evidence for the functional heterogeneity of the rodent frontal cortex. J Neurosci. 2003;23(25):8771–80. doi: 10.1523/JNEUROSCI.23-25-08771.2003.
