Abstract
Previous research has indicated that reward-paired cues can enhance disadvantageous risky choice in both humans and rodents. Systemic administration of a serotonin 2C receptor antagonist can attenuate this cue-induced risk preference in rats. However, the neurocognitive mechanisms mediating this effect are currently unknown. We therefore assessed whether the serotonin 2C receptor antagonist RS 102221 is able to attenuate cue-enhanced risk preference via its actions in the lateral orbitofrontal cortex (lOFC) or prelimbic (PrL) area of the medial prefrontal cortex (mPFC). A total of 32 male Long–Evans rats were trained on the cued version of the rat gambling task (rGT), a rodent analog of the human Iowa gambling task, and bilateral guide cannulae were implanted into the lOFC or PrL. Intra-lOFC infusions of the 5-HT2C antagonist RS 102221 reduced risky choice in animals that showed a preference for the risky options of the rGT at baseline. This effect was not observed in optimal decision-makers, nor those that received infusions targeting the PrL. Given prior data showing that 5-HT2C antagonists also improve reversal learning through the same neural locus, we hypothesized that reward-concurrent cues may amplify risky decision-making through cognitive inflexibility. We therefore devalued the sugar pellet rewards used in the cued rGT (crGT) through satiation and observed that decision-making patterns did not shift unless animals also received intra-lOFC RS 102221. Collectively, these data suggest that the lOFC is one critical site through which reward-concurrent cues promote risky choice patterns that are insensitive to reinforcer devaluation, and that 5-HT2C antagonism may optimize choice by facilitating exploration.
Keywords: decision making, flexibility, orbitofrontal cortex, reward, risk, serotonin
Significance Statement
Lights and sounds signaling reward are used extensively in electronic gaming machines. Recent data indicate that these win-associated cues can increase disadvantageous risky choice. Administering a serotonin 2C receptor antagonist can ameliorate this effect in rats, potentially by increasing flexibility in decision-making. The orbitofrontal cortex (OFC) is critically involved in mediating flexible behavior. Thus, the present study evaluated whether serotonin 2C antagonism in the OFC can reduce disadvantageous risky choice via alterations to behavioral flexibility. Results implicate the OFC as one critical locus, and an increase in flexibility as a potential cognitive mechanism, through which cue-enhanced risky decision-making may improve. This could point to potential therapeutic interventions for problematic gambling that target the control of cues over behavior.
Introduction
Win-associated audiovisual cues are used extensively in electronic games, smartphone apps, and commercial gambling products. Although these cues seem harmless, this form of sensory enhancement can impair decision-making. Specifically, adding win-concurrent cues to laboratory-based gambling tasks increases risky choice in rats and humans (Barrus and Winstanley, 2016; Cherkasova et al., 2018). Risky decision-making can contribute to the onset and maintenance of addiction disorders (Bolla et al., 2003; Goudriaan et al., 2005; Stevens et al., 2013). As such, the ability of reward-synchronous cues to increase risky choice may facilitate the development of pathologic gambling. Determining the neural and cognitive basis of this effect may shed light on how electronic games can become addictive and identify potential therapeutic interventions.
In rats, we have studied cue-driven risky choice using the rat gambling task (rGT), which is loosely analogous to the Iowa gambling task used clinically (Bechara et al., 1994; Zeeb et al., 2009). In both tasks, maximal reward is attained by avoiding the high-risk, high-reward options and instead favoring the options associated with lower per-trial gains. On the rGT, these low-risk, low-reward options result in less frequent and shorter time-out penalties, and therefore more sugar pellets are earned overall. The addition of reward-paired audiovisual cues leads to greater risky choice on average (Barrus and Winstanley, 2016). Decision-making on the cued rGT (crGT) is subject to unique pharmacological regulation. Systemic administration of the serotonin (5-HT) 2C receptor antagonist SB242084 decreased risky choice selectively on the crGT while increasing premature responding, a reliable index of motor impulsivity, on both the cued and uncued versions of the task (Adams et al., 2017).
This finding was somewhat unexpected, given the previously described role of the 5-HT2C receptor in cue-mediated behaviors such as cue-induced reinstatement of cocaine seeking and responding for a conditioned reinforcer (Pentkowski et al., 2010; Browne et al., 2017). Modulation of both mesolimbic dopamine release and activity within the medial prefrontal cortex (mPFC) were implicated in these results, and the findings were attributed to alterations in the incentive salience of the cues. Increased premature responding induced by 5-HT2C antagonism on the rGT (Adams et al., 2017) and the 5-choice serial reaction time task (5CSRT; Winstanley et al., 2004) may also depend on the nucleus accumbens, as systemic administration enhances accumbal dopaminergic release (Browne et al., 2017), which can increase this form of impulsivity (Pattij et al., 2007; Economidou et al., 2012). Indeed, infusions of a 5-HT2C antagonist into the accumbens of rats increased premature responding on the 5CSRT, whereas microinjections into the mPFC had little effect (Robinson et al., 2008).
These findings suggest that 5-HT2C receptor antagonism in the nucleus accumbens would enhance the control of cues over behavior, and 5-HT2C agonism would attenuate it. However, we instead found that systemic administration of the antagonist ameliorated the risk-enhancing effect of cues on the rGT, while the agonist was without effect. As such, 5-HT2C antagonism may mitigate cue-driven risky choice through a different mechanism than that described above, and via distinct neural loci.
5-HT, particularly the 5-HT2C receptor, is critically involved in mediating flexible behavior (Barlow et al., 2015). Administration of a 5-HT2C antagonist into rats’ lateral orbitofrontal cortex (lOFC) but not the mPFC reduced perseveration during reversal learning (Boulougouris and Robbins, 2010). Several studies have identified the lOFC as a critical region for flexibility in decision-making (Baxter et al., 2000; Schoenbaum et al., 2002; Izquierdo et al., 2004; Amodeo et al., 2017). On the uncued rGT, the lOFC is involved in determining the optimal decision-making strategy (Zeeb and Winstanley, 2011). Interestingly, inactivating the lOFC during acquisition of the crGT may increase optimal choice (Ferland, J.-M. N., Barrus, M. M., Betts, G. D., and Winstanley, C. A., unpublished observations). As such, the inclusion of cues may impair decision-making by altering the establishment of accurate action-outcome contingencies in the lOFC. Theoretically, manipulating serotonergic activity in this region could reintroduce flexibility in the stored action-outcome contingencies and thereby ameliorate this effect.
We therefore administered the 5-HT2C antagonist RS 102221 directly into the lOFC and assessed performance on the crGT. To confirm the regional specificity of any observed effects, we also targeted the prelimbic (PrL) region of the mPFC. We hypothesized that intra-lOFC, but not intra-PrL, RS 102221 would attenuate risky decision-making. We also tested whether decision-making on the crGT is less flexible compared with the uncued rGT by evaluating sensitivity to reinforcer devaluation. Finally, we investigated whether intra-lOFC RS 102221 could restore sensitivity to this manipulation, as expected if 5-HT2C antagonism improves decision-making by reinstating flexibility.
Materials and Methods
Subjects
Subjects were 48 male Long–Evans rats (Charles River Laboratories) weighing 275–300 g on arrival to the facility. One to two weeks following arrival, rats were food-restricted to 14 g of rat chow per day and were maintained at a minimum of 85% of the body weight of an age-matched and sex-matched control (initial weight before food restriction: M = 353 g, SD = 51 g; weight before surgery: M = 399 g, SD = 31 g). Water was available ad libitum. All subjects were pair-housed or trio-housed in a climate-controlled colony room under a 12/12 h reverse light/dark cycle (21°C; lights off at 8 A.M.). Huts and paper towels were provided as environmental enrichment. Behavioral testing took place 5 d per week. Housing and testing conditions were in accordance with the Canadian Council on Animal Care, and experimental protocols were approved by the UBC Animal Care Committee.
Behavioral apparatus
Testing took place in 32 standard five-hole operant chambers, each of which was enclosed in a ventilated, sound-attenuating chamber (Med Associates Inc). Chambers were fitted with an array composed of five equidistantly spaced response holes. A stimulus light was located at the back of each hole, and nose-poke responses into these apertures were detected by vertical infrared beams. On the opposite wall, sucrose pellets (45 mg; Bioserv) were delivered to the magazine via an external pellet dispenser. The food magazine was also fitted with a tray light and infrared sensors to detect sucrose pellet collection. A house light could illuminate the chamber. The operant chambers were operated by software written in Med-PC by CAW, running on an IBM-compatible computer.
crGT training and testing
Details of training and testing have been reported previously (Zeeb et al., 2009; Barrus and Winstanley, 2016). Rats were first habituated to the operant chambers in two daily 30-min sessions, during which sucrose pellets were present in the nose-poke apertures and food magazine. Rats were then trained on a variant of the 5CSRT (Carli et al., 1983), in which they were required to make a nose-poke response in one of four apertures, indicated by a 10-s stimulus light. A correct response was rewarded by delivery of one sugar pellet to the food magazine. The location of the stimulus light varied between holes 1, 2, 4, and 5 across the session. Sessions lasted 30 min and consisted of ∼100 trials. Rats were trained until they reached a criterion of ≥50 correct responses with ≥80% accuracy and ≤20% omissions. Rats were then trained on a forced-choice variant of the crGT for seven sessions, in which they were presented with one of the four options per trial in a pseudo-random fashion. This ensured rats had equal exposure to each reinforcement contingency before training on the free-choice version of the program.
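The training criterion above can be expressed as a simple per-session check. The following is a minimal sketch, not the authors' Med-PC code. It assumes that accuracy is computed as correct responses over all responses made, and that the omission percentage is computed over all trials; neither definition is spelled out in this section, and the function name is hypothetical.

```python
def meets_training_criterion(correct, incorrect, omissions):
    """Check one session against: >=50 correct, >=80% accuracy, <=20% omissions."""
    responses = correct + incorrect
    trials = responses + omissions
    if responses == 0:
        return False
    # Assumed definition: accuracy over responded trials only.
    accuracy = 100.0 * correct / responses
    # Assumed definition: omissions as a percentage of all trials.
    omission_rate = 100.0 * omissions / trials
    return correct >= 50 and accuracy >= 80.0 and omission_rate <= 20.0


# Illustrative (made-up) session counts.
print(meets_training_criterion(correct=80, incorrect=10, omissions=10))
print(meets_training_criterion(correct=45, incorrect=5, omissions=5))
```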
A task schematic of the crGT is provided in Figure 1. During the 30-min session, trials were initiated by a nose-poke response within the illuminated food magazine. This response extinguished the light and started a 5-s intertrial interval (ITI). Any response at the five-hole array during the ITI was recorded as a premature response and punished by a 5-s time-out period, during which the house light was illuminated and no reward could be earned.
Following the ITI, apertures 1, 2, 4, and 5 in the five-hole array were illuminated for 10 s. If the rat failed to nose-poke in any illuminated hole within 10 s, the trial was recorded as an omission, the food magazine was re-illuminated, and rats were required to initiate a new trial. A nose-poke response within an illuminated aperture was either rewarded or punished according to that aperture’s reinforcement schedule. Probability of reward varied among options (0.9–0.4, P1–P4), as did reward size (one to four sucrose pellets). Punishments were signaled by the light within the chosen aperture flashing at a frequency of 0.5 Hz for the duration of a 5- to 40-s time-out penalty, depending on the aperture selected. Two-second compound tone/light cues occurred concurrently with reward delivery. Cue complexity and variability scaled with reward size. The task was designed such that the optimal strategy to earn the highest number of sucrose pellets during the 30-min session would be to exclusively select the P2 option, because of the relatively high probability of reward (0.8) and short, infrequent time-out penalties (10 s, 0.2 probability). While options P3 and P4 provide higher per-trial gains of three or four sucrose pellets, the longer and more frequent time-out penalties associated with these options greatly reduce the occurrence of rewarded trials, such that consistently selecting them results in fewer sucrose pellets earned across the session; these options are therefore considered disadvantageous.
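The reasoning behind P2 being the optimal option can be illustrated with a rough expected-value calculation. This is a sketch under simplifying assumptions: each trial costs only the 5-s ITI plus any time-out, response and collection latencies are ignored, and a rat is assumed to choose one option exclusively. The P3 reward probability (0.5) and time-out (30 s) are not stated in this section and are taken as assumptions from earlier published descriptions of the rGT; the other values follow from the ranges given above.

```python
SESSION_S = 30 * 60   # 30-min session
ITI_S = 5             # intertrial interval preceding every choice

# option: (pellets per win, probability of reward, time-out duration in s)
OPTIONS = {
    "P1": (1, 0.9, 5),
    "P2": (2, 0.8, 10),
    "P3": (3, 0.5, 30),   # probability and time-out assumed, see lead-in
    "P4": (4, 0.4, 40),
}


def expected_pellets_per_session(pellets, p_win, timeout_s):
    """Expected pellets if this option is chosen on every trial."""
    gain_per_trial = p_win * pellets
    time_per_trial = ITI_S + (1 - p_win) * timeout_s
    return SESSION_S * gain_per_trial / time_per_trial


expected = {
    name: expected_pellets_per_session(*params)
    for name, params in OPTIONS.items()
}
best_option = max(expected, key=expected.get)

for name, value in expected.items():
    print(f"{name}: {value:.1f} pellets per session")
print("Best option:", best_option)
```

Under these assumptions the larger per-trial gains of P3 and P4 are outweighed by the time lost to time-out penalties, which is why exclusive P2 choice maximizes pellets earned.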
The position of each option for the crGT was counterbalanced across rats such that half the animals were trained on version A (left to right arrangement: P1, P4, P2, P3) and the other half on version B (left to right arrangement: P4, P1, P3, P2) to mitigate potential side bias. Rats received five training sessions per week.
Surgery
When baseline performance was deemed statistically stable (following ∼40 training sessions), 32 animals were anesthetized with 2% isoflurane in O2 and 23-gauge stainless steel guide cannulae were implanted above the lOFC (n = 20; AP = +3.5 mm, ML = ±2.6 mm from bregma, DV = −2.9 mm from dura) or the PrL (n = 12; AP = +3.0 mm, ML = ±0.7 mm from bregma, DV = −2.8 mm from dura), using standard stereotaxic techniques. Guide cannulae were fixed to the skull via four stainless steel screws and dental acrylic, and obdurators flush with the end of the cannulae were inserted. Animals were given at least one week of recovery in their home cages before subsequent testing.
Drug preparation
Three concentrations of the compound RS 102221 hydrochloride (Tocris Bioscience) were prepared each dosing day. First, 1 mg of the drug was suspended in 300 μl of 0.1 M HCl via sonication. The pH was then adjusted to 6–7 with 1.0 and 0.1 M NaOH, and saline was added to reach a final concentration of 2 mg/ml. Two aliquots of the solution were further diluted to 0.2 and 0.6 mg/ml. The highest concentration was vortexed before each infusion to prevent precipitation of the drug during the procedure. The vehicle solution consisted of saline that was pH-adjusted to 6–7 with NaOH.
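As a quick arithmetic check, the three concentrations map onto the per-hemisphere doses used in the microinfusion procedure (0.1, 0.3, and 1.0 μg) once the 0.5-μl infusion volume is taken into account. The sketch below only restates that unit conversion; the variable and function names are illustrative.

```python
INFUSION_VOLUME_UL = 0.5                    # volume infused per hemisphere
CONCENTRATIONS_MG_PER_ML = [0.2, 0.6, 2.0]  # prepared solutions


def dose_ug(concentration_mg_per_ml, volume_ul):
    # 1 mg/ml is the same as 1 ug/ul, so mass (ug) = concentration x volume (ul).
    return concentration_mg_per_ml * volume_ul


doses_ug = [round(dose_ug(c, INFUSION_VOLUME_UL), 2)
            for c in CONCENTRATIONS_MG_PER_ML]

# Final volume needed to bring 1 mg of drug to 2 mg/ml.
stock_volume_ml = 1.0 / 2.0

print(doses_ug, stock_volume_ml)
```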
Microinfusion procedure
Following recovery, animals performed 10 free-choice sessions, after which all individuals displayed stable behavior. Animals were then habituated to the microinfusion process with two mock infusions, during which 30-gauge dummy injectors were inserted for 2 min but no infusion was performed, followed by a behavioral testing session initiated 10 min later. Infusions adhered to a 3-d cycle starting with a baseline session, followed by a drug or vehicle injection session, and then by a non-testing day. Injections of saline or RS 102221 (0.1, 0.3, or 1.0 μg of drug per hemisphere) were administered bilaterally in a volume of 0.5 μl per hemisphere at a rate of 0.3 μl/min, with injectors that extended 0.8 mm beyond the guide cannulae. Injectors were left in place for an additional minute to allow for diffusion. Animals received each dose of RS 102221 plus vehicle, counterbalanced in a Latin Square design (for doses A through D: ABCD, CADB, BDAC, DCBA). Once the microinfusions were completed, injectors were removed, obdurators replaced, and animals were placed in the operant chambers for 10 min before initiation of the crGT.
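The four dose orders listed above can be checked programmatically. The sketch below verifies that they form a Latin square (each treatment appears once per row and once per session position) and additionally tests whether every treatment is immediately preceded by every other treatment exactly once, a property sometimes called first-order carryover balance. The second check is an illustration only; the text does not state that the design was chosen for that property.

```python
ORDERS = ["ABCD", "CADB", "BDAC", "DCBA"]


def is_latin_square(orders):
    """Each symbol occurs exactly once in every row and every column."""
    symbols = set(orders[0])
    square = len(orders) == len(symbols)
    rows_ok = all(len(row) == len(symbols) and set(row) == symbols
                  for row in orders)
    cols_ok = all(set(col) == symbols for col in zip(*orders))
    return square and rows_ok and cols_ok


def is_carryover_balanced(orders):
    """Each ordered pair of distinct treatments occurs once as neighbours."""
    pairs = [(row[i], row[i + 1])
             for row in orders
             for i in range(len(row) - 1)]
    n = len(set(orders[0]))
    return len(pairs) == n * (n - 1) and len(set(pairs)) == len(pairs)


print(is_latin_square(ORDERS), is_carryover_balanced(ORDERS))
```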
Reinforcer devaluation
Twelve lOFC-cannulated rats and 16 surgically-naive rats underwent a reinforcer devaluation procedure. For the naive rats, this procedure took place across 2 d. On the first day, half of the rats were given ad libitum access to the sucrose pellets used as a reward on the crGT for 1 h before task initiation. The remaining rats completed the crGT without prior access to sucrose pellets. Following a baseline session day for which no sucrose pellets were administered before the task to any rats, the groups were then reversed and the other half were given 1-h access to sucrose pellets. To prevent the accumulation of damage of multiple infusions from impacting the results of this procedure, only one session of reinforcer devaluation was completed for cannulated rats. Ten minutes before task initiation, half of the rats received 1.0 μg of RS 102221 per hemisphere, according to the microinfusion procedure specified above. The other half received a vehicle dose. All rats in this group were given ad libitum access to sucrose pellets for 1 h before task initiation.
Histology
Following completion of all behavioral testing, animals were anesthetized with isoflurane and euthanized by carbon dioxide exposure. Brains were extracted and fixed in 4% formaldehyde for at least 24 h, transferred to a 30% sucrose solution, and then frozen and cut via cryostat into 40-μm coronal sections. These sections were stained with cresyl violet for visualization, and the projected locations of the injector tips protruding from the guide cannulae were mapped onto standard sections from Paxinos and Watson (1998).
Behavioral measures and data analysis
All statistical analyses were completed using SPSS Statistics 27.0 software (SPSS/IBM). As per previous reports, the following rGT variables were analyzed: percentage choice of each option (number of times option chosen/total number of choices × 100), risk score (calculated as percent choice of [(P1 + P2) − (P3 + P4)]), percentage of premature responses (number of premature responses/total number of trials initiated × 100), sum of omitted responses, sum of trials completed, and average latencies to choose an option and collect reward. Variables that were expressed as a percentage were subjected to an arcsine transformation to limit the effect of an artificially imposed ceiling (i.e., 100%). A statistically stable baseline was determined by a repeated-measures ANOVA across data from four consecutive sessions before surgery, following ∼40 training sessions, in which both the session factor and session × choice interaction were not significant. Animals with a mean positive baseline risk score were designated as “optimal,” whereas rats with negative risk scores were classified as “risk-preferring.”
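The derived variables described above can be sketched in a few lines. This is an illustration rather than the analysis code used in the study: the arcsine transformation is written in one common form, 2·arcsin(√p), because the exact form is not given in the text; the handling of a risk score of exactly zero is likewise an assumption; and the example counts are made up.

```python
import math


def percent_choice(counts):
    """Number of times each option was chosen / total choices x 100."""
    total = sum(counts.values())
    return {option: 100.0 * n / total for option, n in counts.items()}


def risk_score(pct):
    """Percent choice of (P1 + P2) minus percent choice of (P3 + P4)."""
    return (pct["P1"] + pct["P2"]) - (pct["P3"] + pct["P4"])


def percent_premature(premature, trials_initiated):
    return 100.0 * premature / trials_initiated


def arcsine_transform(percentage):
    # Assumed form of the transformation; input is a percentage (0-100).
    return 2.0 * math.asin(math.sqrt(percentage / 100.0))


def classify(mean_baseline_risk_score):
    # Positive scores are "optimal"; a score of exactly 0 is treated as
    # risk-preferring here, which is an assumption.
    return "optimal" if mean_baseline_risk_score > 0 else "risk-preferring"


# Hypothetical choice counts for one session.
example_counts = {"P1": 5, "P2": 55, "P3": 25, "P4": 15}
example_pct = percent_choice(example_counts)
example_score = risk_score(example_pct)
print(example_pct, example_score, classify(example_score))
```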
Choice data were analyzed with a two-way repeated measures ANOVA with dose (four levels: vehicle, 0.1 μg, 0.3 μg, and 1.0 μg) and choice (four levels: P1, P2, P3, and P4) as within-subject factors. For all other variables, dose was the only within-subjects factor. Risk status (two levels: optimal, risk-preferring) was included as a between-subjects factor for all statistical analyses. For the analysis of the reinforcer devaluation data, devaluation (two levels: baseline, devaluation) and choice (four levels: P1–P4) were the within-subject factors and group (three levels: surgically-naive, vehicle, drug) and risk status were the between-subjects factors. The baseline session used for each group was as follows: naive rats, session without experimental manipulation (i.e., no devaluation); vehicle rats, vehicle data from Latin Square dosing regimen; drug rats, highest concentration dose data from Latin Square dosing regimen. In isolated cases where data were missing because of technical issues, mean replacements were used.
For all analyses, if sphericity was violated as determined by Mauchly’s test, a Huynh–Feldt correction was applied, and the corrected degrees of freedom were rounded to the nearest integer. Results were deemed significant if p values were less than or equal to an α of 0.05. Any significant main effects or interactions were further analyzed via post hoc one-way ANOVA or paired-samples t tests, with a Bonferroni correction applied for the number of comparisons made. Any p value > 0.05 but < 0.09 was reported as a statistical trend.
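The decision rules in this paragraph can be summarized as follows. This is a minimal sketch with made-up p values: it shows a standard Bonferroni adjustment (multiplying each p value by the number of comparisons, capped at 1) and the significance and trend thresholds stated above. It makes no claim about which of the reported p values were adjusted in this way.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p values for a family of comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def label(p, alpha=0.05, trend_upper=0.09):
    """Classify a p value using the thresholds described in the text."""
    if p <= alpha:
        return "significant"
    if p < trend_upper:
        return "trend"
    return "not significant"


# Hypothetical raw p values from three post hoc comparisons.
raw = [0.01, 0.02, 0.20]
adjusted = bonferroni(raw)
labels = [label(p) for p in adjusted]
print(adjusted, labels)
```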
Results
Cannulae placements
The locations of all acceptable placements are depicted in Figure 2A for the lOFC cohort and Figure 2B for the PrL cohort. One animal in the lOFC condition did not survive surgery. One rat was excluded from the lOFC analyses because of inaccurate placement of the cannulae. All PrL cannulae placements were acceptable. This left a total of 18 (n = 9 risk-preferring; n = 9 optimal) and 12 (n = 6 risk-preferring; n = 6 optimal) rats for the lOFC and PrL analyses, respectively.
Baseline behavior
One rat was excluded from the PrL condition because of poor task performance (mean of 30 trials completed per session, >30 s reward collection latency). As expected, risk-preferring rats selected the risky options at a significantly higher proportion than those who developed an optimal decision-making strategy for all baseline and saline sessions (choice × risk preference: F(2,148) = 41.77, p < 0.0001; optimal vs risk-preferring: P1: t(75) = 1.60, p = 0.11; P2: t(75) = 12.89, p < 0.0001; P3: t(68) = −5.95, p < 0.0001; P4: t(66) = −3.97, p = 0.0002). Across both cohorts, risk-preferring rats completed significantly fewer trials (risk preference: F(1,25) = 38.43, p < 0.0001), had a significantly higher proportion of premature responses (risk preference: F(1,25) = 6.76, p = 0.02), and exhibited shorter latencies to collect reward (risk preference: F(1,25) = 14.67, p = 0.001).
The risk-preferring and optimal rats in the lOFC and PrL cohorts exhibited slightly different choice patterns (choice × brain region × risk preference: F(2,51) = 5.11, p = 0.009). This was particularly evident in optimal rats; risk-preferring rats exhibited only a trending difference in choice preference [risk-preferring rats: choice × brain region: F(2,17) = 3.45, p = 0.07 (Fig. 3A); optimal rats: choice × brain region: F(3,38) = 3.99, p = 0.02 (Fig. 3B)]. Optimal rats in the PrL group chose P2 at a significantly higher rate than rats in the lOFC group (t(12) = 2.92, p = 0.01).
To assess whether damage associated with cannula implantation and the microinfusion procedure impacted decision-making on the rGT, four presurgery sessions were binned and compared with baseline data collected between infusion days. No effect was observed on P1–P4 choice, indicating that procedure-associated damage did not significantly impact their decision-making (data bin × choice: F(3,66) = 0.40, p = 0.75).
Microinfusions into lOFC
Choice
We observed a significant shift in choice when comparing all doses in an omnibus ANOVA, and this shift depended on the rats’ choice patterns at baseline (dose × choice × risk status: F(7,117) = 7.34, p = 0.04). The effect was only present in risk-preferring rats [dose × choice, risk-preferring: F(9,72) = 3.09, p = 0.003 (Fig. 4A); optimal: F(9,72) = 0.01, p = 0.30 (Fig. 4B)]. Post hoc analyses revealed a significant reduction in P3 choice (t(8) = 2.49, p = 0.04) and a significant increase in P1 choice (t(8) = −2.74, p = 0.03) in risk-preferring rats when comparing vehicle to the highest dose. This resulted in a significant increase in risk score in these rats (dose: F(3,24) = 4.39, p = 0.01; vehicle vs 1.0 μg: t(8) = −2.53, p = 0.04). No effect was observed on P2 or P4 choice (P2: t(8) = −1.52, p = 0.17; P4: t(8) = −1.59, p = 0.15).
Premature responding and other variables
No effect on premature responding was observed (dose: F(3,48) = 1.10, p = 0.36). A significant shift in omissions was evident in optimal rats only (dose × risk status: F(3,38) = 3.66, p = 0.03; dose, optimal: F(3,21) = 3.56, p = 0.03; risk-preferring: F(1,11) = 1.20, p = 0.32). Post hoc analyses revealed a significant reduction in omitted trials in these rats, when comparing vehicle to the highest dose (t(8) = 2.40, p = 0.04). However, these rats showed a significantly higher level of omissions at vehicle compared with their baseline data (t(8) = 2.41, p = 0.04). No effect was observed on latency variables or trials completed (all F < 1.33; p > 0.27; Table 1).
Table 1.
Region | Dose | % Premature responses | Choice latency (s) | Collect latency (s) | Omissions | Trials completed |
---|---|---|---|---|---|---|
lOFC | 0 | 28.62 ± 4.05 | 1.28 ± 0.19 | 1.32 ± 0.07 | 1.50 ± 0.53 | 75.38 ± 5.28 |
lOFC | 0.1 | 29.66 ± 3.65 | 1.20 ± 0.12 | 1.19 ± 0.04 | 0.94 ± 0.22 | 77.56 ± 5.11 |
lOFC | 1.0 | 31.90 ± 4.71 | 1.22 ± 0.12 | 1.24 ± 0.03 | 0.39 ± 0.18 | 75.63 ± 5.05 |
lOFC | 3.0 | 25.92 ± 3.72 | 1.30 ± 0.13 | 2.01 ± 0.42 | 0.94 ± 0.51 | 83.63 ± 6.41 |
PrL | 0 | 34.05 ± 5.35 | 1.13 ± 0.12 | 1.42 ± 0.16 | 1.36 ± 0.72 | 69.98 ± 7.94 |
PrL | 0.1 | 31.94 ± 6.59 | 1.14 ± 0.12 | 1.29 ± 0.05 | 0.91 ± 0.44 | 73.37 ± 9.67 |
PrL | 1.0 | 28.64 ± 6.30 | 1.31 ± 0.17 | 1.21 ± 0.07 | 1.18 ± 0.46 | 75.41 ± 10.12 |
PrL | 3.0 | 24.19 ± 5.00 | 1.34 ± 0.15 | 1.24 ± 0.12 | 0.91 ± 0.44 | 78.66 ± 10.78 |
Data are mean ± SEM.
Reinforcer devaluation
Choice
Behavioral data from devaluation sessions for each group of rats (naive, vehicle, or drug) were compared with their baseline data (naive group: no manipulation; vehicle group: vehicle dosing data; drug group: highest RS 102221 dose data). In Figure 5A, choice of the P1–P4 options is depicted as a difference in % choice between baseline and devaluation sessions (baseline subtracted from devaluation) for each group. This was done to highlight shifts in choice separate from overall cohort differences in the selection of the different options. Mean values and SEMs for each session and group can be found in Table 2. Risky and optimal rats are grouped together because statistical analyses did not reveal any effects that were dependent on risk status. We observed a significant shift in P1–P4 choice in response to reinforcer devaluation that was dependent on group but not risk status (devaluation × choice × group: F(9,57) = 2.56, p = 0.02; Fig. 5A). Naive rats did not demonstrate a shift in their choice profile (devaluation × choice: F(3,36) = 1.58, p = 0.21). Similarly, we did not observe any change in the choice profile of rats that received a vehicle dose in addition to devaluation, versus vehicle alone (devaluation × choice: F(3,9) = 1.39, p = 0.31). Conversely, rats in the drug + devaluation condition exhibited a shift in their decision-making that was significantly different from the effect of the drug alone (devaluation × choice: F(3,12) = 5.64, p = 0.01). When comparing P1–P4 choice between the drug and drug + devaluation conditions with post hoc analyses, we observed a trending reduction in P2 choice (t(5) = 2.35, p = 0.07) and a significant increase in P4 choice (t(5) = −3.76, p = 0.01). A comparison of the drug + devaluation versus vehicle condition in these rats reached marginal significance (devaluation × choice: F(2,6) = 3.80, p = 0.08), resulting from a significant increase in P1 and P4 choice (P1: t(5) = −4.92, p = 0.004; P4: t(5) = −5.95, p = 0.002).
Thus, rats who received an infusion of the highest dose of RS 102221 into the lOFC were uniquely sensitive to the effects of reinforcer devaluation on choice, exhibiting a choice profile that differed from their decision-making patterns after receiving either vehicle or drug alone.
Table 2.
Group | Condition | P1 | P2 | P3 | P4 |
---|---|---|---|---|---|
Surgically naive | Baseline | 4.82 ± 1.33 | 51.72 ± 9.17 | 24.93 ± 8.68 | 18.53 ± 7.18 |
Surgically naive | Devaluation | 3.62 ± 1.33 | 53.50 ± 9.25 | 22.09 ± 8.12 | 20.79 ± 6.66 |
Drug + devaluation | Vehicle | 1.59 ± 0.79 | 56.57 ± 7.93 | 25.91 ± 11.55 | 15.93 ± 8.51 |
Drug + devaluation | Drug | 7.13 ± 1.67 | 58.16 ± 5.44 | 18.67 ± 8.12 | 16.04 ± 6.56 |
Drug + devaluation | Devaluation | 9.29 ± 1.73 | 37.67 ± 10.06 | 18.94 ± 8.34 | 34.10 ± 12.09 |
Vehicle + devaluation | Vehicle | 4.83 ± 2.46 | 41.01 ± 13.12 | 40.03 ± 19.23 | 14.13 ± 5.56 |
Vehicle + devaluation | Devaluation | 4.92 ± 1.72 | 39.91 ± 10.92 | 31.19 ± 17.17 | 23.98 ± 9.85 |
Data are mean ± SEM.
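The difference scores described for Figure 5A (baseline subtracted from devaluation) can be approximated from the group means in Table 2. The sketch below uses only those tabulated means, with the baseline for each group chosen as described in the text (no manipulation for naive rats, vehicle data for the vehicle group, highest-dose data for the drug group). It assumes the same animals contribute to both sessions, so that the difference of means equals the mean difference; the plotted values could differ if any data were missing.

```python
OPTIONS = ["P1", "P2", "P3", "P4"]

# Mean % choice per option, copied from Table 2.
TABLE_2_MEANS = {
    "naive": {
        "baseline": [4.82, 51.72, 24.93, 18.53],
        "devaluation": [3.62, 53.50, 22.09, 20.79],
    },
    "drug": {
        "baseline": [7.13, 58.16, 18.67, 16.04],   # drug alone
        "devaluation": [9.29, 37.67, 18.94, 34.10],
    },
    "vehicle": {
        "baseline": [4.83, 41.01, 40.03, 14.13],   # vehicle alone
        "devaluation": [4.92, 39.91, 31.19, 23.98],
    },
}


def difference_scores(group_means):
    """Devaluation minus baseline, per option."""
    return {
        option: round(dev - base, 2)
        for option, base, dev in zip(
            OPTIONS, group_means["baseline"], group_means["devaluation"]
        )
    }


diffs = {group: difference_scores(means)
         for group, means in TABLE_2_MEANS.items()}

for group, scores in diffs.items():
    print(group, scores)
```

On these means, the largest shifts are in the drug group (a drop in P2 choice and a rise in P4 choice), in line with the post hoc comparisons reported above.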
Other task variables
We observed a significant shift in trials completed that was dependent on group (group × devaluation: F(3,19) = 6.25, p = 0.004; Fig. 5B). Only rats that received intra-lOFC RS 102221 completed significantly fewer trials in response to devaluation (drug: F(1,4) = 24.12, p = 0.008; vehicle: F(1,3) = 4.64, p = 0.12; naive: F(1,14) = 2.50, p = 0.14). Rats in all groups exhibited decreased premature responding (F(1,19) = 39.37, p < 0.0001; Fig. 5C) and increased latencies to choose an option (devaluation: F(1,19) = 17.52, p = 0.001; Fig. 5D) in response to reinforcer devaluation. No effect was observed on omissions or latencies to collect reward (all F < 1.16, all p > 0.30; Fig. 5E). See Table 3 for the mean values and SEMs for the reported variables in each group and session.
Table 3.
Group | Condition | % Premature responses | Choice latency (s) | Collect latency (s) | Omissions | Trials completed |
---|---|---|---|---|---|---|
Surgically naive | Baseline | 27.96 ± 3.00 | 0.93 ± 0.08 | 1.12 ± 0.11 | 0.25 ± 0.11 | 79.89 ± 7.63 |
Surgically naive | Devaluation | 8.82 ± 2.09 | 1.71 ± 0.21 | 1.38 ± 0.15 | 1.31 ± 0.55 | 63.14 ± 9.87 |
Drug + devaluation | Vehicle | 26.72 ± 5.72 | 1.35 ± 1.20 | 1.57 ± 0.35 | 1.50 ± 1.11 | 80.87 ± 11.71 |
Drug + devaluation | Drug | 19.96 ± 6.94 | 1.26 ± 0.25 | 1.23 ± 0.08 | 0.50 ± 0.34 | 92.18 ± 9.46 |
Drug + devaluation | Devaluation | 6.61 ± 2.81 | 2.40 ± 0.52 | 1.51 ± 0.06 | 1.33 ± 0.49 | 32.00 ± 7.93 |
Vehicle + devaluation | Vehicle | 37.74 ± 5.72 | 1.10 ± 0.21 | 1.33 ± 0.19 | 1.20 ± 0.97 | 71.06 ± 7.63 |
Vehicle + devaluation | Devaluation | 14.00 ± 6.68 | 2.04 ± 0.43 | 1.31 ± 0.14 | 0.80 ± 0.37 | 51.62 ± 5.40 |
Data are mean ± SEM.
Microinfusions into PrL region
Choice
When examining choice in an omnibus ANOVA, we observed a significant effect of dose that did not interact with risk status or the different options (dose: F(3,27) = 3.39, p = 0.03), potentially indicative of increased variability or noise in rats’ response patterns that does not reliably load on one option or another. Figure 6A,B depicts the % choice of each option at each dose in risk-preferring and optimal rats, respectively. Comparing the choice of P1–P4 between vehicle and each dose did not reveal any significant effects (all t < 1.99, all p > 0.09). Correspondingly, there was no significant effect on risk score (dose: F(3,27) = 0.64, p = 0.60). These results indicate that while rats’ decision-making patterns became more variable across doses, there was no clear pattern of an increase or decrease of choice for any particular option.
Premature responding and other variables
No significant effect was observed on premature responding, latency variables, omissions, or trials completed (all F < 1.68; p > 0.20; see Table 1 for mean values and SEMs of task variables at each dose).
Discussion
Results from this study demonstrated that infusing a 5-HT2C antagonist directly into the lOFC, but not the PrL region, improved decision-making in risk-preferring rats without negatively impacting impulsivity measures. 5-HT2C antagonism in the lOFC also restored behavioral sensitivity to reinforcer devaluation, indicating that flexibility in reward valuation was increased by this manipulation. In contrast, intra-PrL infusions of RS 102221 did not increase optimal choice. The ability of 5-HT2C antagonism to ameliorate cue-enhanced risky choice therefore exhibits at least some regional specificity within the frontal cortices.
While the damage associated with surgeries and infusions is typical of studies using this technique, we cannot rule out that it may have altered the functioning of prefrontal circuitry and influenced the observed results. Nevertheless, within-subjects comparisons to vehicle dosing, along with counterbalanced orders of doses among the rats, support the conclusion that the effect of 5-HT2C antagonism in the lOFC is not simply attributable to procedure-associated damage. Furthermore, choice patterns during baseline sessions did not shift throughout the infusion procedure and were not significantly different from data collected before surgery.
Although the ratio of 5-HT2A to 5-HT2C receptors in the mPFC is positively correlated with levels of premature responding on a simplified version of the 5CSRT (Anastasio et al., 2015), we did not see any increase in premature responding when RS 102221 was delivered into the PrL, similar to a previous report (Robinson et al., 2008). There were fewer rats in the PrL cohort (n = 11) than the lOFC cohort (n = 18), but the sample size for both experiments was certainly comparable to other studies using this technique. Considerable data suggest an association between 5-HT2C receptor activity in the mPFC, motor impulsivity, and relapse-like cocaine-seeking in rats (Filip and Cunningham, 2003; Anastasio et al., 2014; Swinford-Jackson et al., 2016). Furthermore, increased risky decision-making following cocaine self-administration correlates with at least one measure of relapse vulnerability (Ferland and Winstanley, 2017). Given that both greater risky choice and higher levels of premature responding are associated with behavioral markers of cocaine addiction in rats and are significantly correlated at the population level (Barrus et al., 2015), we might expect a greater overlap in the neurobiological control of these cognitive processes. However, numerous studies now show that these phenomena are subject to differential neural and pharmacological regulation (Zeeb et al., 2009; Barrus and Winstanley, 2016; Adams et al., 2017; Betts et al., 2021; Chernoff et al., 2021). While impulsive action and risky decision-making may interact synergistically in the manifestation of impulse control and addiction disorders, they ultimately may represent somewhat dissociable pathways to addiction.
In contrast to the effects of systemic administration, which impacted both optimal and risk-preferring rats, intra-lOFC administration of a 5-HT2C antagonist only improved decision-making in those that exhibited a preference for the risky options at baseline. The neural locus whereby 5-HT2C antagonism further optimizes choice in animals already making advantageous decisions remains to be determined, but is clearly neither the lOFC nor the PrL. The selective effects of drug infusions in risk-preferring rats also suggest that the neural architecture underlying the decision-making process in optimal versus risky rats differs, either in the regions forming the network, the weight given to the output of those regions in guiding choice, or the computational analyses performed by key nodes. Indeed, data across species support the view that individual differences in choice preference can be attributed to differential activity across brain networks. Cues present in the environment can also influence the adoption of a behavioral strategy; the considerable literature on sign-trackers versus goal-trackers may best exemplify the significant individual differences in how such cues can be used to guide behavior (Flagel et al., 2009, 2011; Saunders and Robinson, 2010, 2013). Illuminating a cue light during lengthy delays between response and reward delivery can decrease delay discounting (i.e., the reduction in a reward’s subjective value because of a waiting period) in rats (Cardinal et al., 2000). However, the presence of this cue does not eliminate the large individual differences in animals’ preferences for smaller-sooner versus larger-later rewards. Inactivation of the lOFC only reduced choice of the larger-later reward in rats that showed a high baseline preference for this option, an effect that was also observed following local infusions of dopamine antagonists (Zeeb et al., 2010).
Thus, the hypothesis that recruitment of the lOFC into the decision-making process depends on the presence of cues, and that decision-making patterns are only lOFC-dependent in a subpopulation of individuals that use those cues to guide behavior, has some precedent.
It is interesting to note that the effect in risk-preferring animals was specific to a decrease in P3 (their preferred risky option) and an increase in P1, the optimal option offering the most frequent wins but the smallest reward size (one sucrose pellet). This could be because of either an increased sensitivity to the length and/or frequency of time-out penalties, or increased impact of frequent winning trials. This may help explain the specific effect observed in risk-preferring animals; for optimal rats, the difference between P1 and P2 in time-out penalty length/frequency and reward frequency may not be large enough to shift decision-making patterns away from their preferred option. Investigating neural activity following rewards and time-out penalties on each option within the lOFC could shed light on these hypotheses.
Indeed, identifying how audiovisual cues modulate lOFC neuronal firing as well as the impact of 5-HT2C antagonism are important next steps. The 5-HT2C receptor is an excitatory GPCR found throughout the rat central nervous system (Clemett et al., 2000). Previous studies have demonstrated that 5-HT2C receptors are primarily located in the deep layers of the PFC in rats (Pompeiano et al., 1994; Liu et al., 2007). In the mPFC, at least 50% of receptors are localized to GABAergic interneurons, and are hypothesized to regulate the output of pyramidal cells (Liu et al., 2007). Considerably less is known about the localization and function of 5-HT2C receptors in the OFC. Interneurons in this region play a key role in reversal learning, so it is possible that modulation of GABAergic interneuron activity by the 5-HT2C antagonist may drive the improvement in choice seen here (Bissonette et al., 2015).
As noted in the introduction, previous literature has strongly implicated the 5-HT2C receptor in cue-mediated behaviors such as cue-induced cocaine seeking and responding for a conditioned reinforcer through its regulation of the mesolimbic dopamine system (Pentkowski et al., 2010; Browne et al., 2017). Based on this work, 5-HT2C receptor antagonists should increase motor impulsivity and potentiate the risk-promoting effect of cues in the rGT. While the former is true, the latter is clearly not. 5-HT2C receptor antagonism must therefore alter decision-making through an alternate, yet concurrent mechanism. The current data implicate the lOFC as one critical locus, and an increase in behavioral flexibility as a potential cognitive mechanism, through which decision-making may improve. Serotonergic activity within the OFC, and the 2C receptor, have been implicated in cognitive flexibility by multiple previous studies (Boulougouris and Robbins, 2010; Alsiö et al., 2015, 2021; Barlow et al., 2015). Previous results have indicated that circuitry between the OFC and basolateral amygdala (BLA) supports shifts in choice following reinforcer devaluation on the uncued rGT (Zeeb and Winstanley, 2013). It may be that this circuitry is also involved in cue-induced inflexibility on the task. This pathway certainly plays a role in cue-based decision-making, as BLA-OFC projections are essential for guiding decision-making based on cue-triggered reward representations (Lichtenberg et al., 2017).
It is notable that decision-making on the cued task was not altered by reinforcer devaluation in naive or vehicle-treated rats, in contrast to the effects of this manipulation reported previously in the absence of the cues (Zeeb and Winstanley, 2013). Decision-making in other tasks that require considerably more sessions to train remains sensitive to changes in outcome value, indicating that simple repetition of actions in complex cognitive tasks is not sufficient to produce habitual behavior through procedural motor learning (Cocker et al., 2012). Indeed, goal-directed control can be maintained following prolonged training even if automatization of certain action sequences occurs (Garr and Delamater, 2019). Acute satiety with regular chow did not shift choice patterns on the uncued rGT, and thus the effect of reinforcer devaluation on choice can be attributed to shifts in goal-directed action rather than reduced motivation. While acute satiety with chow has not been tested on the crGT, it stands to reason that reduced motivation alone would similarly leave choice patterns unaffected. As such, decision-making on the crGT may fail one of the critical tests of true goal-directed behavior, in that it is insensitive to changes in the goal’s value (Balleine and Dickinson, 1992). If this is the case, then 5-HT2C antagonism in the lOFC may shift rats toward a goal-directed response strategy and therefore restore sensitivity to reinforcer value.
Regardless of experimental condition, motivation to engage in the task declined; premature responses decreased when animals were sated, while the latency to choose an option increased. Interestingly, while latency to collect reward is sensitive to satiety on the uncued rGT (Zeeb and Winstanley, 2013), this measure did not significantly increase with reinforcer devaluation in any group. Furthermore, the inclusion of cues on the rGT results in decreased collection latencies, particularly in risky rats (Hathaway et al., 2021). It therefore may be that the presence of the cues invigorates responding to reward delivery, and the effect is not dependent on the value of the reward or computation within the lOFC, as this measure was unaffected by devaluation with or without drug administration. In addition, the number of trials completed decreased in all rats in response to devaluation, but the effect only reached significance in rats who also received RS 102221. On the uncued rGT, trial completion was reduced in response to devaluation, but this was prevented by disconnecting the BLA and lOFC (Zeeb and Winstanley, 2013). Thus, the lOFC may play a key role in inhibiting perseverative responding on the rGT. That intra-lOFC infusion of RS 102221 selectively rendered decision-making and trial completion more sensitive to reinforcer devaluation, without altering the effects of satiety on other variables, further supports the hypothesis that local 5-HT2C antagonism facilitated some form of cognitive flexibility.
As alluded to above, if changes to the value of the reinforcer do not alter behavior, that behavior is thought to be under habitual rather than goal-directed control. However, it could be that the behavior in question is instead reinforced by another aspect of the environment. Given that the cues are concurrent with reward delivery, it could be argued that “reward + cue” has formed a compound reinforcer. Selective devaluation of only one component of this reinforcer may therefore fail to alter response patterns. Certainly, in the drug addiction literature, cues that are present when drug is taken acquire incentive motivational salience that does not decline even when users do not obtain pleasure from drug ingestion (Robinson and Berridge, 1993). However, this is believed to be a highly aberrant state, driven by supraphysiological drug-induced dopamine release amplifying associative learning between cues and drugs. Why sound and light cues should exert the same effect in the current task is unclear, given that rats typically show a dramatic reduction in responding for reward-paired cues following devaluation of that reward (Hatfield et al., 1996; Pickens et al., 2005; McDannald et al., 2014). This is not the first time that similarities have been observed between responding on the crGT and responding for addictive drugs, as there is some evidence that risky choice on-task and cocaine self-administration may cross-sensitize and/or substitute for one another (Ferland et al., 2019; Hynes et al., 2021). However, systemic 5-HT2C antagonism increases responding for drug (Fletcher et al., 2002), yet decreases risky choice here, indicating that the pharmacological regulation of these processes is not uniform.
Instead of the cues acquiring incentive motivational properties that are now independent of reward value, another alternative is that these highly salient audiovisual cues overshadow sucrose pellet delivery to some extent (Pavlov, 1927), such that a change in the sucrose pellet value does not dominate behavior in the presence of the cues. To our knowledge, neither the lOFC nor 5-HT2C receptor signaling has been evaluated in overshadowing experiments, and as such further discussion of this hypothesis is premature. If overshadowing is taking place, then the attenuation of learning about the devalued state of the reward is highly specific to the decision-making process, as devaluation still impacted latencies and motor impulsivity, indicating a reduction in the ability of the sucrose pellet rewards to motivate and invigorate behavior.
Given that 5-HT2C receptor antagonism can clearly increase incentive motivation for rewards and reward-paired cues, likely through actions in the mesolimbic dopamine pathway, the following question remains: why does this mechanism not dominate the decision-making process? One potential answer comes from human literature examining the impact of serotonin depletion on model-based behavior: goal-directed choice is impaired when learning from rewards but enhanced when learning from punishments (Worbe et al., 2016). Computational modeling analyses have revealed that the addition of cues to the rGT specifically impairs learning from the time-out penalties (Langdon et al., 2019). As such, manipulating serotonergic activity within the lOFC may promote the correct integration of punishments into the stored action-outcome contingencies for risk-preferring rats, rather than influencing the incentive motivation of the cues. This is in line with the finding that responding for a conditioned reinforcer does not correlate with risky choice on the crGT, at least in female rats (Winstanley and Tremblay, 2016), indicating that incentive motivation is not primarily responsible for cue-induced risky choice on this task. The activity of striatal neurons, although influenced by neuromodulators like dopamine, is still primarily driven by cortical inputs. It is therefore possible that 5-HT2C receptor-mediated modulation of lOFC output is sufficient to dominate the behavioral response.
Overall, these results indicate that 5-HT2C antagonism in the lOFC can ameliorate cue-induced disadvantageous risky choice in rats with preexisting preferences for these risky options. The lack of effect in optimal rats, together with recent computational modeling analyses, suggests there are underlying differences in the processing or storage of action-outcome contingencies between optimal and risk-preferring animals. It is currently unknown whether this is because of distinct activity patterns within the lOFC or differential involvement of downstream targets. Future studies could examine activity in the lOFC during crGT learning and performance in risk-preferring versus optimal rats to address this question.
Furthermore, these results suggest that targeting flexibility may be a viable approach to improving decision-making in individuals with impaired cost/benefit decision-making, specifically in the presence of cues. This would have implications for the treatment of behavioral addictions and substance use disorders, in which individuals show marked impairments in risky decision-making and processing of reward-associated cues (Goudriaan et al., 2005; Limbrick-Oldfield et al., 2017; Zilberman et al., 2019). Recent clinical trials of a 5-HT2C agonist in the treatment of substance use disorder have been unsuccessful. Results from the present studies indicate the necessity to attend to individual differences, as only rats with preexisting deficits in decision-making show improvements in response to the antagonist. Greater specificity of targeting to regions may also improve treatment, as the antagonist can increase impulsivity via other pathways. Allosteric modulators of the 5-HT2C receptor may be worth pursuing in this regard. Given that numerous effective psychoactive medications act on the serotonin system, there is every reason to feel cautiously optimistic that a viable serotonergic medication could be developed for disorders hallmarked or exacerbated by risky decision-making.
Acknowledgments
University of British Columbia is situated on the traditional, ancestral, and unceded land of the xwməθkwəy̓əm (Musqueam), sə̓lílwətaʔɬ/Selilwitulh (Tsleil-Waututh), and Sḵwx̱wú7mesh (Squamish) Peoples. We thank them for their stewardship of this land for thousands of years.
Synthesis
Reviewing Editor: Jonathan Lee, University of Birmingham
Decisions are customarily a result of the Reviewing Editor and the peer reviewers coming together and discussing their recommendations until a consensus is reached. When revisions are invited, a fact-based synthesis statement explaining their decision and outlining what is needed to prepare a revision will be listed below. The following reviewer(s) agreed to reveal their identity: Laura Bradfield, Emma Robinson.
The reviewers and reviewing editor were in agreement that the manuscript presents a logical extension of previous work in the field, integrating interesting insights into the function of serotonin in the lOFC. It is well designed and appropriately analysed, and largely reasonably interpreted.
The full reviews are appended below, but I wanted to highlight two points that warrant particular attention in any revision of the manuscript.
First, the presentation and interpretation of the “devaluation” experiment could be clearer and more carefully described. We acknowledge that there may be differences in the precise interpretation of certain terms within the field, but the use of the term “devaluation” when describing an experiment that involves pre-feeding the original outcome risks confusion in relation to general motivational effects vs true devaluation of the specific outcome (i.e. sensory-specific satiety). Please see the detailed comments from Reviewer 1.
Second, and somewhat related again to the devaluation experiment, but also more widely applicable to other areas of the results, the text could more sympathetically guide the reader through the various panels of the figures in order more clearly to elucidate how specific data support particular statements (please see Reviewer 1 comments for further details). One salient example of this is the data in figures 4 & 5 and their different modes of presentation that has confused us (see Reviewer 2 comments).
-----------------------------------------------
Reviewer 1 comments
In this manuscript, the authors investigate the roles of serotonin receptors in the lateral orbitofrontal cortex (lOFC) and prelimbic cortex (PrL) in risky decision-making using the rat gambling task - a version of the Iowa gambling task. They found that intra-lOFC infusions of the 5-HT2c receptor antagonist RS 102221 reduced risky choice in risk-preferring rats, but did not affect decisions in optimal decision-makers. Similar infusions in the PrL did not have the same effect, but apparently did have a blanket dose-dependent effect on responding. The authors also explored the effects of satiation on responding, but this only reduced overall responding as far as I can tell, without affecting the pattern of responding. The authors conclude that 5-HT2c receptors in the lOFC promote risky choice patterns.
Overall I think this is a nice paper, although perhaps only moderately novel and exciting. It is nicely written, with some interesting insights into serotonin function in the lOFC in the introduction and discussion. The authors have clearly thought through their work thoroughly. I do have a few issues and questions that should be addressed, which I will outline below.
Major
1. The ‘devaluation’ procedures used here lack the proper controls, and therefore cannot be used to infer anything about goal-directed action. That is, devaluation of a reward via sensory-specific satiety requires a control that is sated on a different outcome to that which is used in the task (hence the term ‘sensory-specific’ satiety). Without this control, the authors are both devaluing the outcome and altering the motivational state from hungry to sated in the ‘devalued’ animals, and either of these factors could influence responding. I would contend that reduced motivation is in fact more consistent with the observed results of this manipulation, which appears to have altered task engagement without altering the response profile/pattern of responding. An alteration in goal-directed action would, I think, be more likely to differentially affect the selection of optimal vs. risky options. I suggest that the authors alter their language throughout the manuscript to reflect the fact that these could be devaluation effects, or could simply be motivational, and that the latter is more likely given the general reduction in task engagement.
2. The way in which the results were reported I found quite confusing to follow. I would prefer it if each effect was referred to a specific graph, or particular part of a graph, rather than a blanket statement at the end saying “see Figure X for a summary of differences” (for example). That is, if there is an effect in risk-preferring rats, and these data are shown in the top panel of the figure, could the text please refer to the top panel of the figure? If it could be made clearer which reported effects and statistics relate to which graph/part of a graph, this would be much easier to follow.
3. The PrL data seem like a bit of an afterthought. For example, there’s a lot of introduction to why they would infuse into the lOFC, but almost no rationale given for the PrL infusions. Could the authors insert some more rationale for this, as it’s quite jarring when it is mentioned? I also found the results for Figure 6 difficult to understand. The authors say there is an effect of dose that did not interact with any other variable. Do they mean that general responding increased as dose increased? Across all rats? This would suggest the pattern stayed the same (in contrast to what is claimed) although I’m not sure it is consistent with what I see in Figure 6. It needs to be made clearer whether this is a main effect averaged across the other variables, and if so, what does it suggest? Finally, I also don’t think that ‘visual inspection’ (page 22) without statistics is sufficient to make differential claims about how decision-making was affected in risk-preferring vs. optimal decision-makers.
Minor points
4. On Page 4 it says “These findings suggest that 5-HT2c receptor antagonism enhances the control of cues over behaviour, and 5-HT2c agonism attenuates it.” Is this sentence specifically referring to antagonism/agonism in the nucleus accumbens? They need to say so otherwise the next immediate sentence appears to contradict it.
5. Page 11 - for the statistically stable baseline, when exactly did the 4 sessions occur that this number was taken from?
6. It took me some time to work out which holes in Figure 1 were P1 and P2. Initially I assumed it was the first two holes (i.e. they just went in order from P1-P4 from left to right). After some reading, I think that P1 and P2 are represented in Figure 1 as holes 1 and 3 - the ‘better options’, and P3 and P4 are holes 2 and 4 - the worse options. If this is correct, I’m not sure if there’s a way the authors could make this a bit clearer so that the reader can avoid confusion when trying to determine, for example, how the risk score is calculated?
7. It appears that risk-preferring rats did not alter their responding at P2 and P4, as a result of lOFC infusions (Figure 4A), but this is not reported on. I think it should be mentioned. Why might lOFC infusions selectively affect responding at P1 and P3?
Reviewer 2 comments
This study used a rat gambling task to investigate the role of 5-HT2C receptors in different sub-regions of the prefrontal cortex in risky decision-making behaviour. The authors link their findings to changes in flexible behaviour, suggesting this is a neurocognitive mechanism mediating cue-enhanced risky decision-making behaviour. Overall I found the work was clearly presented and aligns with other studies of this type in terms of the technical approach and data analysis. I have a few points which I think would be useful to include to clarify some aspects of the study design and task, and I also have a couple of general points about the data analysis and interpretation. I want to emphasise that the more general points are to me important to be discussed but are not things that I see as fundamental to the quality of the study as presented, if that makes sense.
In the description of the food restriction, 14 g per animal seems quite low. Could the authors clarify whether animals were managed to 85% of free-feeding weight matched to normal growth or maintained at 85% of their initial weight? Could the authors provide weight data for the animals at the start of the pharmacological studies?
What environmental enrichment was provided and if none, please state this and explain why.
I was not quite clear from the description of the training exactly how long the rats took to acquire the task before surgery and testing; perhaps this could be added to the methods section. More generally, if the animals are requiring multiple sessions to learn the different contingencies, how do the authors resolve this against the concepts that the behaviour is flexible and dynamic enough to be sensitive to acute drug effects and decision-making? I am not sure if this has been tested previously, but if the rules for each of the locations were changed, how long would the rats take to adapt their strategy?
The n numbers of the different sub-groups do seem low for this type of cognitive readout although I acknowledge that this is not an unusual n number to see reported for this type of operant method. If you consider the effect size expected for a clinical scenario, how do you relate this to what would seem to be a very large effect in these animals?
There does appear to be quite extensive damage associated with the infusion site and more than I would have expected to see given the methods described. I wonder if the authors might comment on this and the potential impacts on the prefrontal circuitry?
The authors have provided data from a study where they use a satiety test, but I am not sure I fully followed the interpretation of the subsequent dataset. The primary outcome in figure 4 is shown by group but the data in figure 5 is presented differently. If the aim is to relate the effects, then I am not sure why the data are presented differently? Also, in the discussion the focus is on goal-directed versus habitual behaviour, but what about procedural learning, which can also affect how rats perform these tasks with long training schedules?
Paragraph 2 of the discussion, n=12 versus n=11 earlier for prelimbic.
References
- Adams WK, Barkus C, Ferland JN, Sharp T, Winstanley CA (2017) Pharmacological evidence that 5-HT2C receptor blockade selectively improves decision making when rewards are paired with audiovisual cues in a rat gambling task. Psychopharmacology (Berl) 234:3091–3104. 10.1007/s00213-017-4696-4 [DOI] [PubMed] [Google Scholar]
- Alsiö J, Nilsson SR, Gastambide F, Wang RA, Dam SA, Mar AC, Tricklebank M, Robbins TW (2015) The role of 5-HT2C receptors in touchscreen visual reversal learning in the rat: a cross-site study. Psychopharmacology (Berl) 232:4017–4031. 10.1007/s00213-015-3963-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Alsiö J, Lehmann O, McKenzie C, Theobald DE, Searle L, Xia J, Dalley J, Robbins TW (2021) Serotonergic innervations of the orbitofrontal and medial-prefrontal cortices are differentially involved in visual discrimination and reversal learning in rats. Cereb Cortex 31:1090–1105. 10.1093/cercor/bhaa277 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Amodeo LR, McMurray MS, Roitman JD (2017) Orbitofrontal cortex reflects changes in response–outcome contingencies during probabilistic reversal learning. Neuroscience 345:27–37. 10.1016/j.neuroscience.2016.03.034 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Anastasio NC, Stutz SJ, Fox RG, Sears RM, Emeson RB, DiLeone RJ, O’Neil RT, Fink LH, Li D, Green TA, Moeller FG, Cunningham KA (2014) Functional status of the serotonin 5-HT2C receptor (5-HT2CR) drives interlocked phenotypes that precipitate relapse-like behaviors in cocaine dependence. Neuropsychopharmacology 39:370–382. 10.1038/npp.2013.199 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Anastasio NC, Stutz SJ, Fink LHL, Swinford-Jackson SE, Sears RM, DiLeone RJ, Rice KC, Moeller FG, Cunningham KA (2015) Serotonin (5-HT) 5-HT2A receptor (5-HT2AR):5-HT2CR imbalance in medial prefrontal cortex associates with motor impulsivity. ACS Chem Neurosci 6:1248–1258. 10.1021/acschemneuro.5b00094 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Balleine B, Dickinson A (1992) Signalling and incentive processes in instrumental reinforcer devaluation. Q J Exp Psychol B 45:285–301. [PubMed] [Google Scholar]
- Barlow RL, Alsiö J, Jupp B, Rabinovich R, Shrestha S, Roberts AC, Robbins TW, Dalley JW (2015) Markers of serotonergic function in the orbitofrontal cortex and dorsal raphé nucleus predict individual variation in spatial-discrimination serial reversal learning. Neuropsychopharmacology 40:1619–1630. 10.1038/npp.2014.335 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Barrus MM, Winstanley CA (2016) Dopamine D3 receptors modulate the ability of win-paired cues to increase risky choice in a rat gambling task. J Neurosci 36:785–794. 10.1523/JNEUROSCI.2225-15.2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Barrus MM, Hosking JG, Zeeb FD, Tremblay M, Winstanley CA (2015) Disadvantageous decision-making on a rodent gambling task is associated with increased motor impulsivity in a population of male rats. J Psychiatry Neurosci 40:108–117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Baxter MG, Parker A, Lindner CC, Izquierdo AD, Murray EA (2000) Control of response selection by reinforcer value requires interaction of amygdala and orbital prefrontal cortex. J Neurosci 20:4311–4319. 10.1523/JNEUROSCI.20-11-04311.2000 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bechara A, Damasio AR, Damasio H, Anderson SW (1994) Insensitivity to future consequences following damage to human prefrontal cortex. Cognition 50:7–15. 10.1016/0010-0277(94)90018-3 [DOI] [PubMed] [Google Scholar]
- Betts GD, Hynes TJ, Winstanley CA (2021) Pharmacological evidence of a cholinergic contribution to elevated impulsivity and risky decision making caused by adding win-paired cues to a rat gambling task. J Psychopharmacol 35:701–712. 10.1177/0269881120972421 [DOI] [PubMed] [Google Scholar]
- Bissonette GB, Schoenbaum G, Roesch MR, Powell EM (2015) Interneurons are necessary for coordinated activity during reversal learning in orbitofrontal cortex. Biol Psychiatry 77:454–464. 10.1016/j.biopsych.2014.07.023 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bolla KI, Eldreth DA, London ED, Kiehl KA, Mouratidis M, Contoreggi C, Matochik JA, Kurian V, Cadet JL, Kimes AS, Funderburk FR, Ernst M (2003) Orbitofrontal cortex dysfunction in abstinent cocaine abusers performing a decision-making task. Neuroimage 19:1085–1094. 10.1016/S1053-8119(03)00113-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Boulougouris V, Robbins TW (2010) Enhancement of spatial reversal learning by 5-HT2C receptor antagonism is neuroanatomically specific. J Neurosci 30:930–938. 10.1523/JNEUROSCI.4312-09.2010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Browne CJ, Ji X, Higgins GA, Fletcher PJ, Harvey-Lewis C (2017) Pharmacological modulation of 5-HT2C receptor activity produces bidirectional changes in locomotor activity, responding for a conditioned reinforcer, and mesolimbic DA release in C57BL/6 mice. Neuropsychopharmacology 42:2178–2187. 10.1038/npp.2017.124 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cardinal RN, Robbins TW, Everitt BJ (2000) The effects of d-amphetamine, chlordiazepoxide, alpha-flupenthixol and behavioural manipulations on choice of signalled and unsignalled delayed reinforcement in rats. Psychopharmacology (Berl) 152:362–375. 10.1007/s002130000536 [DOI] [PubMed] [Google Scholar]
- Carli M, Robbins TW, Evenden JL, Everitt BJ (1983) Effects of lesions to ascending noradrenergic neurones on performance of a 5-choice serial reaction task in rats; implications for theories of dorsal noradrenergic bundle function based on selective attention and arousal. Behav Brain Res 9:361–380. 10.1016/0166-4328(83)90138-9 [DOI] [PubMed] [Google Scholar]
- Cherkasova MV, Clark L, Barton JJS, Schulzer M, Shafiee M, Kingstone A, Stoessl AJ, Winstanley CA (2018) Win-concurrent sensory cues can promote riskier choice. J Neurosci 38:10362–10370. 10.1523/JNEUROSCI.1171-18.2018 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chernoff CS, Hynes TJ, Winstanley CA (2021) Noradrenergic contributions to cue-driven risk-taking and impulsivity. Psychopharmacology (Berl) 238:1765–1779. 10.1007/s00213-021-05806-x [DOI] [PubMed] [Google Scholar]
- Clemett D, Punhani T, Duxon MS, Blackburn TP, Fone KCF (2000) Immunohistochemical localisation of the 5-HT2C receptor protein in the rat CNS. Neuropharmacology 39:123–132. 10.1016/S0028-3908(99)00086-6 [DOI] [PubMed] [Google Scholar]
- Cocker PJ, Hosking JG, Benoit J, Winstanley CA (2012) Sensitivity to cognitive effort mediates psychostimulant effects on a novel rodent cost/benefit decision-making task. Neuropsychopharmacology 37:1825–1837. 10.1038/npp.2012.30 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Economidou D, Theobald DEH, Robbins TW, Everitt BJ, Dalley JW (2012) Norepinephrine and dopamine modulate impulsivity on the five-choice serial reaction time task through opponent actions in the shell and core sub-regions of the nucleus accumbens. Neuropsychopharmacology 37:2057–2066. 10.1038/npp.2012.53
- Ferland JN, Winstanley CA (2017) Risk-preferring rats make worse decisions and show increased incubation of craving after cocaine self-administration. Addict Biol 22:991–1001. 10.1111/adb.12388
- Ferland JN, Hynes TJ, Hounjet CD, Lindenbach D, Vonder Haar C, Adams WK, Phillips AG, Winstanley CA (2019) Prior exposure to salient win-paired cues in a rat gambling task increases sensitivity to cocaine self-administration and suppresses dopamine efflux in nucleus accumbens: support for the reward deficiency hypothesis of addiction. J Neurosci 39:1842–1854. 10.1523/JNEUROSCI.3477-17.2018
- Filip M, Cunningham KA (2003) Hyperlocomotive and discriminative stimulus effects of cocaine are under the control of serotonin(2C) (5-HT(2C)) receptors in rat prefrontal cortex. J Pharmacol Exp Ther 306:734–743. 10.1124/jpet.102.045716
- Flagel SB, Akil H, Robinson TE (2009) Individual differences in the attribution of incentive salience to reward-related cues: implications for addiction. Neuropharmacology 56 [Suppl 1]:139–148. 10.1016/j.neuropharm.2008.06.027
- Flagel SB, Clark JJ, Robinson TE, Mayo L, Czuj A, Willuhn I, Akers CA, Clinton SM, Phillips PE, Akil H (2011) A selective role for dopamine in stimulus-reward learning. Nature 469:53–57. 10.1038/nature09588
- Fletcher PJ, Grottick AJ, Higgins GA (2002) Differential effects of the 5-HT(2A) receptor antagonist M100907 and the 5-HT(2C) receptor antagonist SB242084 on cocaine-induced locomotor activity, cocaine self-administration and cocaine-induced reinstatement of responding. Neuropsychopharmacology 27:576–586. 10.1016/S0893-133X(02)00342-1
- Garr E, Delamater AR (2019) Exploring the relationship between actions, habits, and automaticity in an action sequence task. Learn Mem 26:128–132. 10.1101/lm.048645.118
- Goudriaan AE, Oosterlaan J, de Beurs E, van den Brink W (2005) Decision making in pathological gambling: a comparison between pathological gamblers, alcohol dependents, persons with Tourette syndrome, and normal controls. Brain Res Cogn Brain Res 23:137–151. 10.1016/j.cogbrainres.2005.01.017
- Hatfield T, Han JS, Conley M, Gallagher M, Holland P (1996) Neurotoxic lesions of basolateral, but not central, amygdala interfere with Pavlovian second-order conditioning and reinforcer devaluation effects. J Neurosci 16:5256–5265. 10.1523/JNEUROSCI.16-16-05256.1996
- Hathaway BA, Hrelja KM, Harris CBW, Hynes TJ, Winstanley CA (2021) Investigating the behavioural effects of win-associated cues, outcome-associated cues, and randomly occurring cues on risky decision making. Society for Neuroscience Global Connectome: A Virtual Event. January 11-13, 2021, Virtual.
- Hynes TJ, Hrelja KM, Hathaway BA, Hounjet CD, Chernoff CS, Ebsary SA, Betts GD, Russell B, Ma L, Kaur S, Winstanley CA (2021) Dopamine neurons gate the intersection of cocaine use, decision making, and impulsivity. Addict Biol 26:e13022.
- Izquierdo A, Suda RK, Murray EA (2004) Bilateral orbital prefrontal cortex lesions in rhesus monkeys disrupt choices guided by both reward value and reward contingency. J Neurosci 24:7540–7548. 10.1523/JNEUROSCI.1921-04.2004
- Langdon AJ, Hathaway BA, Zorowitz S, Harris C, Winstanley CA (2019) Relative insensitivity to time-out punishments induced by win-paired cues in a rat gambling task. Psychopharmacology (Berl) 236:2543–2556. 10.1007/s00213-019-05308-x
- Lichtenberg NT, Pennington ZT, Holley SM, Greenfield VY, Cepeda C, Levine MS, Wassum KM (2017) Basolateral amygdala to orbitofrontal cortex projections enable cue-triggered reward expectations. J Neurosci 37:8374–8384. 10.1523/JNEUROSCI.0486-17.2017
- Limbrick-Oldfield EH, Mick I, Cocks RE, McGonigle J, Sharman SP, Goldstone AP, Stokes PR, Waldman A, Erritzoe D, Bowden-Jones H, Nutt D, Lingford-Hughes A, Clark L (2017) Neural substrates of cue reactivity and craving in gambling disorder. Transl Psychiatry 7:e992. 10.1038/tp.2016.256
- Liu S, Bubar MJ, Lanfranco MF, Hillman GR, Cunningham KA (2007) Serotonin 2C receptor localization in GABA neurons of the rat medial prefrontal cortex: implications for understanding the neurobiology of addiction. Neuroscience 146:1677–1688. 10.1016/j.neuroscience.2007.02.064
- McDannald MA, Jones JL, Takahashi YK, Schoenbaum G (2014) Learning theory: a driving force in understanding orbitofrontal function. Neurobiol Learn Mem 108:22–27. 10.1016/j.nlm.2013.06.003
- Pattij T, Janssen MCW, Vanderschuren LJMJ, Schoffelmeer ANM, van Gaalen MM (2007) Involvement of dopamine D1 and D2 receptors in the nucleus accumbens core and shell in inhibitory response control. Psychopharmacology (Berl) 191:587–598. 10.1007/s00213-006-0533-x
- Pavlov IP (1927) Conditioned reflexes. Oxford: Oxford University Press.
- Paxinos G, Watson C (1998) The rat brain in stereotaxic coordinates, Ed 4. San Diego: Academic Press.
- Pentkowski NS, Duke FD, Weber SM, Pockros LA, Teer AP, Hamilton EC, Thiel KJ, Neisewander JL (2010) Stimulation of medial prefrontal cortex serotonin 2C (5-HT2C) receptors attenuates cocaine-seeking behavior. Neuropsychopharmacology 35:2037–2048. 10.1038/npp.2010.72
- Pickens CL, Saddoris MP, Gallagher M, Holland PC (2005) Orbitofrontal lesions impair use of cue-outcome associations in a devaluation task. Behav Neurosci 119:317–322. 10.1037/0735-7044.119.1.317
- Pompeiano M, Palacios JM, Mengod G (1994) Distribution of the serotonin 5-HT2 receptor family mRNAs: comparison between 5-HT2A and 5-HT2C receptors. Brain Res Mol Brain Res 23:163–178. 10.1016/0169-328x(94)90223-2
- Robinson ES, Dalley JW, Theobald DE, Glennon JC, Pezze MA, Murphy ER, Robbins TW (2008) Opposing roles for 5-HT2A and 5-HT2C receptors in the nucleus accumbens on inhibitory response control in the 5-choice serial reaction time task. Neuropsychopharmacology 33:2398–2406. 10.1038/sj.npp.1301636
- Robinson TE, Berridge KC (1993) The neural basis of drug craving: an incentive-sensitization theory of addiction. Brain Res Brain Res Rev 18:247–291. 10.1016/0165-0173(93)90013-P
- Saunders BT, Robinson TE (2010) A cocaine cue acts as an incentive stimulus in some but not others: implications for addiction. Biol Psychiatry 67:730–736. 10.1016/j.biopsych.2009.11.015
- Saunders BT, Robinson TE (2013) Individual variation in resisting temptation: implications for addiction. Neurosci Biobehav Rev 37:1955–1975. 10.1016/j.neubiorev.2013.02.008
- Schoenbaum G, Nugent SL, Saddoris MP, Setlow B (2002) Orbitofrontal lesions in rats impair reversal but not acquisition of go, no-go odor discriminations. Neuroreport 13:885–890.
- Stevens L, Betanzos-Espinosa P, Crunelle CL, Vergara-Moragues E, Roeyers H, Lozano O, Dom G, Gonzalez-Saiz F, Vanderplasschen W, Verdejo-García A, Pérez-García M (2013) Disadvantageous decision-making as a predictor of drop-out among cocaine-dependent individuals in long-term residential treatment. Front Psychiatry 4:149. 10.3389/fpsyt.2013.00149
- Swinford-Jackson SE, Anastasio NC, Fox RG, Stutz SJ, Cunningham KA (2016) Incubation of cocaine cue reactivity associates with neuroadaptations in the cortical serotonin (5-HT) 5-HT2C receptor (5-HT2CR) system. Neuroscience 324:50–61. 10.1016/j.neuroscience.2016.02.052
- Winstanley CA, Tremblay M (2016) Risky choice on a cued gambling task in rats is not associated with elevated responding for conditioned reinforcement. San Diego: Society for Neuroscience.
- Winstanley CA, Theobald DE, Dalley JW, Glennon JC, Robbins TW (2004) 5-HT2A and 5-HT2C receptor antagonists have opposing effects on a measure of impulsivity: interactions with global 5-HT depletion. Psychopharmacology (Berl) 176:376–385. 10.1007/s00213-004-1884-9
- Worbe Y, Palminteri S, Savulich G, Daw ND, Fernandez-Egea E, Robbins TW, Voon V (2016) Valence-dependent influence of serotonin depletion on model-based choice strategy. Mol Psychiatry 21:624–629. 10.1038/mp.2015.46
- Zeeb FD, Winstanley CA (2011) Lesions of the basolateral amygdala and orbitofrontal cortex differentially affect acquisition and performance of a rodent gambling task. J Neurosci 31:2197–2204. 10.1523/JNEUROSCI.5597-10.2011
- Zeeb FD, Winstanley CA (2013) Functional disconnection of the orbitofrontal cortex and basolateral amygdala impairs acquisition of a rat gambling task and disrupts animals’ ability to alter decision-making behavior after reinforcer devaluation. J Neurosci 33:6434–6443. 10.1523/JNEUROSCI.3971-12.2013
- Zeeb FD, Robbins TW, Winstanley CA (2009) Serotonergic and dopaminergic modulation of gambling behavior as assessed using a novel rat gambling task. Neuropsychopharmacology 34:2329–2343. 10.1038/npp.2009.62
- Zeeb FD, Floresco SB, Winstanley CA (2010) Contributions of the orbitofrontal cortex to impulsive choice: interactions with basal levels of impulsivity, dopamine signalling, and reward-related cues. Psychopharmacology (Berl) 211:87–98. 10.1007/s00213-010-1871-2
- Zilberman N, Lavidor M, Yadid G, Rassovsky Y (2019) Qualitative review and quantitative effect size meta-analyses in brain regions identified by cue-reactivity addiction studies. Neuropsychology 33:319–334. 10.1037/neu0000526