Abstract
Updated theoretical accounts of the role of serotonin (5-HT) in motivation propose that 5-HT operates at the intersection of aversion and inhibition, promoting withdrawal in the face of aversive predictions. However, the specific cognitive mechanisms through which 5-HT modulates withdrawal behavior remain poorly understood. Behavioral inhibition in response to punishments reflects at least two concurrent processes: instrumental aversive predictions linking stimuli, responses, and punishments, and Pavlovian aversive predictions linking stimuli and punishments irrespective of response. In the current study, we examined to what extent 5-HT modulates the impact of instrumental vs Pavlovian aversive predictions on behavioral inhibition. We used acute tryptophan depletion to lower central 5-HT levels in healthy volunteers, and observed behavior in a novel task designed to measure the influence of Pavlovian and instrumental aversive predictions on choice (response bias) and response vigor (response latencies). After placebo treatment, participants were biased against responding on the button that led to punishment, and they were slower to respond in a punished context, relative to a non-punished context. Specifically, participants slowed their responses in the presence of stimuli predictive of punishments. Tryptophan depletion removed the bias against responding on the punished button, and abolished slowing in the presence of punished stimuli, irrespective of response. We suggest that this set of results can be explained by a role for 5-HT in Pavlovian aversive predictions. These findings suggest additional specificity for the influence of 5-HT on aversively motivated behavioral inhibition and extend recent models of the role of 5-HT in aversive predictions.
Keywords: serotonin, aversion, inhibition, tryptophan, Pavlovian, instrumental
INTRODUCTION
Historically, the neurotransmitter serotonin (5-HT) has been associated with both aversive processing (Deakin, 1983) and behavioral inhibition (Soubrie, 1986), but scientists are only beginning to understand the mechanisms through which 5-HT modulates a vast range of normal and abnormal behaviors. Updated theoretical accounts of 5-HT function in motivation have adroitly pointed out that aversive processing and behavioral inhibition, though orthogonal in theory, are usually intertwined in practice (Dayan and Huys, 2008, 2009; Boureau and Dayan, 2011; Cools et al, 2011). Specifically, the inhibition of ongoing behavior is a reflexive and adaptive consequence of aversive predictions. This makes it difficult to disentangle effects of 5-HT manipulations on aversive processing and behavioral inhibition, as the two are almost always correlated in experiments. Understanding the effects of 5-HT on aversive processing is important because 5-HT is hypothesized to have a role in a range of psychiatric disorders, including depression, anxiety, obsessive-compulsive disorder, and aggression (Dayan and Huys, 2009).
Recently, we addressed this issue in an experiment that separately measured aversive processing, behavioral inhibition, and their interaction. We found that temporarily lowering 5-HT in healthy volunteers abolished punishment-related slowing of responding (‘punishment-induced inhibition’), without affecting overall motor response inhibition or general sensitivity to aversive outcomes (Crockett et al, 2009). Thus, 5-HT's role in motivation appears to operate at the interface of aversion and inhibition, reducing response vigor in the face of aversive predictions (Dayan and Huys, 2008, 2009; Boureau and Dayan, 2011; Cools et al, 2011). However, the specific cognitive mechanisms through which 5-HT modulates such withdrawal behavior remain poorly understood. As long as aversive outcomes are contingent on responses, punishment-induced inhibition reflects at least two concurrent processes: an instrumental process that inhibits behavior by virtue of the link between responses and the aversive outcomes they produce; and a Pavlovian process that reflexively suppresses behavior as a direct consequence of aversive predictions (Rescorla and Solomon, 1967; Bolles et al, 1980; Church et al, 1970). Although many studies have demonstrated a link between 5-HT and punishment-induced inhibition (Thiébot et al, 1982, 1983; Tye et al, 1977, 1979; Wise et al, 1972; Graeff and Schoenfeld, 1970; Crockett et al, 2009), no study has investigated whether this relationship depends on Pavlovian (stimulus-outcome) or instrumental (stimulus-response-outcome) aversive predictions. Here, we examine to what extent 5-HT modulates instrumental vs Pavlovian processes in punishment-induced inhibition. This question is particularly important in light of recent computational approaches to affective decision making emphasizing a distinction between instrumental and Pavlovian control of learning and choice (Dayan et al, 2006; Dayan, 2008).
In the current experiment, we used acute tryptophan depletion (ATD; Young et al, 1985) to temporarily lower 5-HT levels in healthy human volunteers, and tested the effects on behavior in a novel task designed to separate the effects of Pavlovian and instrumental aversive predictions on punishment-induced inhibition. Specifically, on every trial subjects had to categorize two types of stimuli. We compared reaction times (RTs) to both stimuli in a reward-only (RO) condition, in which both stimuli were rewarded if correctly categorized, with RTs in a reward+punishment (RP) condition, in which both stimuli were rewarded if correctly categorized but only one of the stimuli was punished if incorrectly categorized (Figure 1). Thus, in the RP condition only one of the stimuli (the ‘punished stimulus’) was associated with punishment, and only one of the responses (the ‘punished button’) led to punishment. Critically, this design allowed us to disentangle the effects of manipulating 5-HT on instrumental and Pavlovian aversive predictions. Specifically, in the RP condition, Pavlovian (stimulus-outcome) aversive predictions would be predicted to lead to slower responses in the presence of the punished stimulus regardless of response. Meanwhile, instrumental (stimulus-response-outcome) aversive predictions would be predicted to lead to slower responses in the presence of the punished stimulus specifically on the punished button.
Relative to the RO condition, we expected that response latencies would be slower in the RP condition, reflecting punishment-induced inhibition, and that this effect would be abolished by ATD, as in our previous experiment. Further, we hypothesized that if 5-HT modulates instrumental aversive predictions, the effects of ATD would be restricted to responses on the punished button in the presence of the punished stimulus. In contrast, if 5-HT modulates Pavlovian aversive predictions, then we would expect ATD to abolish slowing of all responses in the presence of the punished stimulus.
METHODS
Participants
Thirty healthy volunteers (13 males, mean age=25.1±3.2 years) participated. Exclusion criteria included history of cardiac, hepatic, renal, pulmonary, neurological, psychiatric or gastrointestinal disorders, medication/drug use, and personal or family history of major depression or bipolar affective disorder. Participants gave written informed consent before participating and were financially compensated. Two participants were excluded due to technical errors during data collection. Because the rewards and punishments used in this study consisted of monetary wins and losses, four additional participants were excluded for indicating at debriefing they did not believe they would be paid for their performance. Therefore, the final analysis was carried out in 24 participants.
General Procedure
The protocol was approved by the Cambridgeshire Research Ethics Committee (09/H0308/051). Participants attended two sessions, spaced at least 1 week apart, and were randomized to receive either ATD (N=14) or placebo (N=10) on the first session. The ATD procedure was carried out according to an established protocol (Crockett et al, 2009).
Upon arrival at the Wellcome Trust Clinical Research Facility (between 0830 and 1000 h), participants completed a baseline mood questionnaire, gave a blood sample, and ingested either the placebo or ATD drink (75 g). After 5.5 h, participants completed a number of cognitive tests, including the Reinforced Categorization Task (RCAT; described below). Participants completed the RCAT after completing an ultimatum game task, a reversal learning task, and a covert facial emotion recognition task (all completed in the fMRI scanner) and a third-party punishment task. Following the RCAT, participants completed an emotion regulation task and a delay-discounting task. Task order was consistent across treatments and subjects. Mood was assessed using the Positive and Negative Affect Scale (PANAS; Watson et al, 1988).
Reinforced Categorization Task
In the RCAT, subjects were instructed to categorize stimuli as quickly as possible to win points exchangeable for money. The RCAT was adapted from the Reinforced Go/No-go task used in our previous study (Crockett et al, 2009). As in the original Go/No-go task, the stimuli were checkerboards composed of blue and yellow squares (see Figure 1). These stimuli were designed to introduce a tradeoff between speed and accuracy, and to be able to vary task difficulty. Stimuli could be easy (blue/yellow ratios of 16 : 9 and 9 : 16) or difficult (blue/yellow ratios of 12 : 13 and 13 : 12). All task conditions contained 50% ‘yellow’ trials and 50% ‘blue’ trials, distributed evenly across difficulty level.
In our previous study, participants were assigned a ‘target color’ (blue or yellow), and instructed to respond via button-press (‘Go’) if the target color was in the majority on the checkerboard. In the current study, participants were instructed to press one button (eg, ‘right’) if yellow was in the majority, and another button (eg, ‘left’) if blue was in the majority. Thus, the RCAT is a ‘Go/Go’ task rather than a ‘Go/No-go’ task. We adapted the task from our previous study (Crockett et al, 2009) to have two separate responses, only one of which would eventually be punished.
In some of the task conditions, participants received feedback for their responses. Sometimes correct responses were rewarded with 10 points, a flourishing tone, and a happy face. Sometimes incorrect responses were punished with a loss of 10 points, a long buzzing tone, and an angry face. Throughout the task, feedback was presented for 750 ms. Participants were instructed that points would be exchangeable for money at the end of the experiment. Faces were taken from the NimStim set of facial expressions (Tottenham et al, 2009).
The task consisted of several phases. First, participants completed 48 practice trials without feedback to minimize learning and practice effects in the main task. Stimuli were presented for 2000 ms, with an inter-trial interval of 1500 ms. The mean RT for correct responses was extracted from the practice session and set as the stimulus duration for the main task, to match task difficulty across participants and sessions.
The main task began with a neutral block of 36 trials to obtain a baseline RT. Next, participants completed two key experimental blocks, each with 48 trials. In the RO block, participants were rewarded for correct responses. Incorrect responses received no feedback. In the RP block, participants were also rewarded for correct responses. However, in the RP block one of the stimuli (‘blue’ for half the participants, ‘yellow’ for the other half; henceforth the punished stimulus) was punished if incorrectly categorized, while the other stimulus (henceforth the non-punished stimulus) received no feedback if incorrectly categorized. The experimental blocks were separated by neutral blocks of 36 trials without feedback to allow response biases to return to baseline. The RP block took place after the RO block for all participants (see Supplementary Results for an analysis of potential order effects). At the start of each experimental block, participants were explicitly instructed about the response-outcome contingencies in the upcoming trials, and completed four guided practice trials to observe the consequences of correct and incorrect responses. For a summary of response-outcome contingencies in the experimental blocks, see Figure 1.
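The response-outcome contingencies described above can be sketched as a small lookup function. This is an illustrative sketch only, not the task code used in the study: the function name, the block labels, and the boolean arguments are hypothetical, and response deadlines and the auditory and facial components of feedback are ignored.

```python
from typing import Literal

Block = Literal["neutral", "RO", "RP"]

REWARD_POINTS = 10    # points won for a rewarded correct response
PUNISH_POINTS = -10   # points lost for a punished incorrect response


def feedback_points(block: Block, stimulus_is_punished: bool, correct: bool) -> int:
    """Return the point outcome for one trial (0 means no feedback).

    Hypothetical helper: encodes only the contingencies stated in the text.
    """
    if block == "neutral":
        return 0                      # neutral blocks carry no feedback
    if correct:
        return REWARD_POINTS          # correct responses are rewarded in RO and RP
    if block == "RP" and stimulus_is_punished:
        return PUNISH_POINTS          # only errors on the punished stimulus lose points
    return 0                          # all other errors receive no feedback


# An error on the punished stimulus is costly only in the RP block.
print(feedback_points("RO", True, False), feedback_points("RP", True, False))
```

Under these contingencies the punished button is simply the incorrect response to the punished stimulus, which is why only one stimulus and only one response are ever paired with punishment.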
Data Analysis
Raw data (response accuracy and RTs) are available in Supplementary Tables S1 and S2. For the response data, we computed measures from signal detection theory (Swets et al, 1961), including sensitivity (d′) and response bias (ln(β)). Formulae for calculating d′ and ln(β) are widely available (Stanislaw and Todorov, 1999). Sensitivity is a measure of discrimination accuracy (the ability to correctly categorize stimuli), and is independent from response bias, which measures subjects' tendency to favor one response over the other. Accurate calculation of discrimination and response bias measures requires that the raw proportions of false alarms and omission errors be non-zero. Because performance on easy trials was nearly perfect, we were unable to calculate discrimination and response bias measures for easy trials, so we restricted the analysis of d′ and ln(β) to difficult trials only. For completeness, we also repeated the analysis of response bias on all trials (easy and difficult combined). We predicted that the response bias data would reflect a shift in preference away from the punished response in the RP block. One explanation for such a shift is the input of instrumental aversive predictions (stimulus-response-outcome); however, Pavlovian aversive predictions can also influence instrumental actions via Pavlovian-instrumental transfer (PIT) (Huys et al, 2011; Overmier et al, 1971). Thus, any observed effects of ATD on response bias could reflect Pavlovian or instrumental processes.
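For readers unfamiliar with these measures, the standard formulae (as given by Stanislaw and Todorov, 1999) can be sketched as follows. The mapping of task events onto hits and false alarms is an assumption made here for illustration: a hit is taken to be a press of the punished button when it is the correct response, and a false alarm a press of the punished button when it is incorrect, so that positive ln(β) corresponds to a bias away from the punished button. The function name is hypothetical.

```python
from statistics import NormalDist
from typing import Tuple


def sdt_measures(hit_rate: float, false_alarm_rate: float) -> Tuple[float, float]:
    """Return (d_prime, ln_beta) from a hit rate and a false-alarm rate.

    Both rates must lie strictly between 0 and 1; otherwise the inverse
    normal transform is undefined, which is why near-perfect performance
    on easy trials prevents calculation.
    """
    if not (0.0 < hit_rate < 1.0 and 0.0 < false_alarm_rate < 1.0):
        raise ValueError("rates must be strictly between 0 and 1")
    z = NormalDist().inv_cdf          # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa            # sensitivity
    ln_beta = (z_fa ** 2 - z_hit ** 2) / 2.0   # response bias (log likelihood ratio)
    return d_prime, ln_beta


# Hypothetical rates: fewer presses of the assumed "yes" button give ln_beta > 0.
d_prime, ln_beta = sdt_measures(hit_rate=0.6, false_alarm_rate=0.1)
print(round(d_prime, 3), round(ln_beta, 3))
```

With symmetric rates (for example 0.8 and 0.2) ln(β) is zero, indicating no preference for either response.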
As in our previous study, we assessed punishment-induced inhibition by examining RTs for correct responses in the RP block relative to the RO and neutral blocks. This approach has been used in other studies of punishment-induced behavioral inhibition (Newman et al, 1997; Avila, 2001) and follows from the observation that the automatic response to aversive outcomes and their prediction is to freeze or depress responding (LeDoux, 1996; Gray and McNaughton, 2000). We reasoned that punishment-induced inhibition would result in slower responding in the RP block, relative to the RO and neutral blocks, as has been observed in previous studies (Newman et al, 1997; Avila, 2001; Crockett et al, 2009). RTs in the experimental conditions (RO and RP) were converted to z-scores by normalizing against matched RTs in the neutral condition as follows: first, we calculated means and standard deviations of RTs for correct responses in the neutral condition, separately for easy and difficult blue and yellow stimuli. Next, we calculated means of RTs for correct responses in the RO and RP conditions, again separately for easy and difficult blue and yellow stimuli. Finally, we computed z-scores for the RTs in the experimental conditions (RO and RP) by normalizing against the RTs in the neutral condition, again separately for easy and difficult blue and yellow stimuli. So for example, the normalized RT for RP/difficult/blue was computed by subtracting the mean RT for neutral/difficult/blue from the mean RT for RP/difficult/blue, and dividing by the standard deviation for neutral/difficult/blue. We employed this normalization procedure because we were primarily interested in how rewards and punishments influenced response vigor, relative to a neutral baseline.
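The normalization can be sketched for a single cell (for example, RP/difficult/blue) as below. The RT values are invented for illustration, the function name is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, as the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def normalized_rt(condition_rts, neutral_rts):
    """z-score the mean correct RT of one experimental cell against the
    matched neutral cell (same difficulty and same stimulus color)."""
    # stdev() is the sample standard deviation; this choice is an assumption.
    return (mean(condition_rts) - mean(neutral_rts)) / stdev(neutral_rts)


# Hypothetical correct-response RTs in ms for one participant and one cell.
neutral_difficult_blue = [500, 520, 480, 510, 490]
rp_difficult_blue = [530, 510, 520]

z = normalized_rt(rp_difficult_blue, neutral_difficult_blue)
print(round(z, 3))  # positive values mean slower than the neutral baseline
```

Negative normalized RTs, as reported in the Results, therefore indicate responding that was faster than in the matched neutral condition.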
We were also interested in whether the effects of punishment, and their modulation by ATD, were present at the start of the block or emerged only with learning. To assess these potential learning effects, we sorted the response bias and RT data within each block into ‘early’ (first 24 trials) and ‘late’ (second 24 trials) bins.
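A minimal sketch of this binning, assuming each trial is stored as a record in presentation order (the record fields `rt` and `correct` and both function names are hypothetical):

```python
def split_early_late(block_trials, bin_size=24):
    """Split one 48-trial block, in presentation order, into early and late bins."""
    early = block_trials[:bin_size]
    late = block_trials[bin_size:2 * bin_size]
    return early, late


def mean_correct_rt(trials):
    """Mean RT over correct trials only; None if the bin has no correct trials."""
    rts = [t["rt"] for t in trials if t["correct"]]
    return sum(rts) / len(rts) if rts else None


# Hypothetical block: 48 trials with made-up RTs, every second trial correct.
block = [{"rt": float(i), "correct": i % 2 == 0} for i in range(48)]
early, late = split_early_late(block)
print(len(early), len(late), mean_correct_rt(early), mean_correct_rt(late))
```

Binning by trial position rather than by count of correct responses keeps the two bins aligned with time on task, at the cost of unequal numbers of correct trials per bin.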
The transformed raw data (d′, ln(β), and normalized RTs) were analyzed using repeated-measures ANOVAs with treatment (ATD, placebo) and block (RO, RP) as within-subjects factors, and gender and treatment order as between-subjects factors. Additional analyses were conducted using, where appropriate, time (early trials, late trials), stimulus (punished stimulus, non-punished stimulus), response type (non-punished button, punished button), and stimulus difficulty (easy, difficult) as within-subjects factors. Factors were dropped from subsequent analyses when non-significant. Post hoc comparisons were conducted using paired t-tests.
In within-subject designs, the appropriate index of variation is not the standard error of the mean, but the standard error of the difference of the means (SED), which is used when one is interested in the relationship between variables rather than the variables themselves. The SED is therefore used in the figures as an index of variation. The SED is the denominator for Student's t-test and also provides a visual method of comparing mean values in graphical depictions of within-subject designs.
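The relationship between the SED and the paired t statistic can be made concrete with a short sketch. The data are invented and the function name is hypothetical; the computation itself is the standard one, in which the SED is the standard deviation of the per-subject differences divided by the square root of the number of subjects.

```python
from math import sqrt
from statistics import mean, stdev


def sed_and_paired_t(condition_a, condition_b):
    """Return (SED, t) for two within-subject conditions measured on the
    same participants, listed in the same order."""
    if len(condition_a) != len(condition_b):
        raise ValueError("conditions must contain the same participants")
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    sed = stdev(diffs) / sqrt(len(diffs))   # standard error of the difference
    t = mean(diffs) / sed                   # paired t with len(diffs) - 1 df
    return sed, t


# Hypothetical normalized RTs for four participants in two blocks.
rp = [1.0, 2.0, 3.0, 4.0]
ro = [0.5, 1.0, 2.0, 2.5]
sed, t = sed_and_paired_t(rp, ro)
print(round(sed, 4), round(t, 3))
```

Because the SED depends only on the per-subject differences, it is unaffected by stable between-subject variability in overall response speed.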
RESULTS
Serotonin Manipulation
Plasma samples were analyzed for tryptophan content according to the procedure described in Crockett et al (2009). ATD resulted in significant reductions in both plasma tryptophan levels and the TRP/ΣLNAA ratio. A repeated-measures ANOVA revealed a significant two-way interaction between treatment (ATD, placebo) and time point (baseline, +5.5 h), resulting from significant reductions in total tryptophan levels (F(1, 23)=108.524, p<0.0001) and the TRP/ΣLNAA ratio (F(1, 23)=28.605, p<0.0001), 5.5 h following ATD relative to placebo. Simple effects analyses showed a significant decrease in plasma tryptophan levels (t(23)=13.883, p<0.0001) on the ATD session, averaging 66%. There was also a significant decrease in TRP/ΣLNAA ratios (t(23)=12.404, p<0.001) on the ATD session, averaging 85%. On the placebo session, plasma tryptophan levels increased by an average of 88% (t(23)=−6.213, p<0.0001); there was no significant change in TRP/ΣLNAA ratios (t(23)=0.537, p=0.598).
Serotonin Modulates the Effects of Aversive Predictions on Choice
To examine the effects of ATD on the behavioral suppression of punished responding across conditions, we analyzed the effects of treatment (ATD, placebo) and block (RO, RP) on response bias (ln(β)). Lower (more negative) numbers indicate a bias toward responding on the punished button, while higher (more positive) numbers indicate a bias away from responding on the punished button. In the RO block, we did not expect there to be a bias toward one response or the other, since both responses yielded the same payoffs; however, in the RP block we expected there to be a bias away from the response that received punishments if incorrect. There was a significant interaction between treatment and block on response bias (F(1, 23)=7.455, p=0.012, difficult trials only; F(1, 23)=4.860, p=0.038, easy and difficult trials combined). On placebo, participants were biased away from responding on the punished button in the RP block (mean±SE, 0.210±0.102) relative to the RO block (mean±SE, −0.051±0.082; t(23)=−2.485, p=0.021; see Figure 2, left panel). However, this punishment-induced suppression of responding on the punished button was released by ATD; there was no difference in response bias between the RP block (mean±SE, −0.056±0.122) and the RO block (mean±SE, 0.078±0.089; t(23)=1.013, p=0.321; see Figure 2, right panel). ATD specifically influenced response bias in the RP block; bias away from the punished button was significantly reduced on ATD, relative to placebo, in the RP block (t(23)=−2.113, p=0.046), but not in the RO block (t(23)=1.289, p=0.210). The effect of ATD on response bias in the RP block was present at the start of the block and remained constant across trials.
When we sorted the data into early and late trials and repeated the above analysis with treatment, block and time as factors, the treatment × block interaction remained significant (F(1, 23)=4.989, p=0.034), but there were no significant effects of time (F(1, 23)=0.013, p=0.911), time × treatment (F(1, 23)=0.624, p=0.437), time × block (F(1, 23)=0.006, p=0.940), or time × treatment × block (F(1, 23)=0.369, p=0.549).
Serotonin Modulates the Effects of Aversive Predictions on Response Vigor
We first considered the influence of punishment expectations on response vigor by analyzing the effects of block (RO, RP) and treatment (ATD, placebo) on RTs for correct responses (irrespective of stimulus). This analysis indicated that on the placebo session, the possibility of punishment made subjects respond more slowly, but this effect was abolished by ATD (treatment × block interaction, F(1, 23)=4.734, p=0.042). On placebo, participants were slower to respond in the RP block (mean±SE, −0.219±0.082) than in the RO block (mean±SE, −0.367±0.083; t(23)=−2.254, p=0.034; see Figure 3, left panel). However, this punishment-induced inhibition of responding was absent following ATD; responses were not slower in the RP block (mean±SE, −0.371±0.099) compared with the RO block (mean±SE, −0.329±0.111; t(23)=0.456, p=0.653; see Figure 3, right panel). We therefore replicated our previous finding that ATD abolishes punishment-induced inhibition (Crockett et al, 2009). The effects of ATD on punishment-induced inhibition appeared early and were consistent across trials. Separating the RT data into early and late trials and including time as a factor in our model, the treatment × block interaction remained significant (F(1, 23)=4.306, p=0.049), but there were no significant effects of time (F(1, 23)=0.964, p=0.337), time × treatment (F(1, 23)=0.918, p=0.348), time × block (F(1, 23)=0.043, p=0.838), or time × treatment × block (F(1, 23)=0.287, p=0.598).
Serotonin Modulates the Effects of Pavlovian Aversive Predictions on Response Vigor
We next examined the RT data at a finer level of detail to test the hypothesis that 5-HT modulates the effects of Pavlovian (stimulus-outcome) aversive predictions. We analyzed RTs (regardless of response button) to the easy and difficult punished and non-punished stimuli in the RO and RP blocks. In a repeated-measures ANOVA with block (RO, RP), stimulus (non-punished, punished), difficulty (easy, difficult) and treatment (ATD, placebo) as factors, we found a significant main effect of difficulty (F(1, 23)=12.483, p=0.002); subjects were faster to respond to easy stimuli than to difficult stimuli. We also found a significant three-way interaction between treatment, block, and stimulus (F(1, 23)=4.534, p=0.047), and a trend-level four-way interaction between treatment, block, stimulus, and difficulty (F(1, 23)=3.713, p=0.069).
To explore these interactions, we examined the effects of treatment, block, and stimulus for easy and difficult stimuli separately. We suspected that Pavlovian effects would be weaker for the easy stimuli, since these were less predictive of punishments; mean accuracy for easy trials was over 90%, with more than half of subjects performing perfectly and thus never receiving punishments for easy stimuli. Consistent with this prediction, we did not find evidence of Pavlovian slowing to the punished easy stimulus; on the placebo session, participants were not slower to respond to the punished stimulus in the RP block, relative to the RO block (t(23)=−0.719, p=0.479), and within the RP block, they were not slower to respond to the punished stimulus, relative to the non-punished stimulus (t(23)=−0.641, p=0.528). Furthermore, the three-way interaction between treatment, block, and stimulus was not significant (F(1, 23)=0.318, p=0.579), indicating that ATD did not affect slowing to the punished stimulus for easy trials.
In contrast, when focusing on the difficult stimuli we found a significant three-way interaction between treatment, block, and stimulus (F(1, 23)=4.618, p=0.042). On the placebo session, participants were slower to respond to the punished stimulus in the RP block, compared with the RO block (t(23)=2.459, p=0.022), whereas they responded with equal speed to the non-punished stimulus in the RP and RO blocks (t(23)=−0.391, p=0.699). Following ATD, participants did not exhibit slowing in the RP block, relative to the RO block, for either the punished stimulus (t(23)=−0.986, p=0.334) or the non-punished stimulus (t(23)=0.184, p=0.856).
We next confirmed that ATD abolished punishment-induced slowing to the punished stimulus. We computed slowing scores for the punished and non-punished stimuli by taking the RT difference between the RO and RP blocks. Relative to placebo, tryptophan depletion significantly reduced punishment-induced slowing to the punished stimulus (t(23)=−2.353, p=0.028), without affecting slowing to the non-punished stimulus (t(23)=0.527, p=0.603); Figure 4.
Finally, we focused on responses to the punished stimulus (difficult trials, as in previous analysis) and examined the slowing of responses on the punished button vs non-punished button. We reasoned that slowing of responses on the non-punished button in the presence of the punished stimulus should only reflect Pavlovian (stimulus-outcome) processes, while slowing of responses on the punished button in the presence of the punished stimulus should reflect both Pavlovian and instrumental (stimulus-response-outcome) processes. Thus, if 5-HT influences Pavlovian inhibition we should see a main effect of ATD on all responses, whereas if 5-HT influences instrumental inhibition we should see effects of ATD on responses on the punished button only, resulting in a treatment-by-response interaction.
We tested these predictions by analyzing slowing scores (computed as in above analysis by taking the RT difference between the RO and RP blocks) in a repeated-measures ANOVA with the factors treatment (ATD, placebo) and response (non-punished button, punished button); Figure 5. We found a main effect of response (F(1, 23)=5.246, p=0.033); responses on the punished button showed more slowing than responses on the non-punished button, reflecting an instrumental inhibition of the punished response. We also found a main effect of treatment (F(1, 23)=4.812, p=0.040); across all responses, ATD reduced slowing to the punished stimulus in the RP block, relative to the RO block. Importantly, the treatment-by-response interaction was not significant (F(1, 23)=1.007, p=0.328); in other words, the effects of ATD on RTs in the presence of the punished stimulus were not restricted to responses on the punished button, as would be predicted by a role for 5-HT in instrumental (stimulus-response-outcome) aversive predictions. Instead, the main effect of treatment suggests a role for 5-HT in Pavlovian (stimulus-outcome) aversive predictions.
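The logic of this test can be illustrated with a simplified sketch. In a 2 × 2 design in which both factors are within-subject and there are no between-subject factors, each effect has one degree of freedom and its F(1, n − 1) equals the squared one-sample t statistic computed on a per-subject contrast score. The sketch below relies on that identity; it is not the analysis pipeline used in the study (which initially included gender and treatment order as between-subjects factors), and the data, dictionary keys, and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_f(scores):
    """F(1, n-1) for the null that the mean contrast score is zero (squared t)."""
    t = mean(scores) / (stdev(scores) / sqrt(len(scores)))
    return t * t


def two_by_two_within(slowing):
    """slowing: one dict per subject keyed by (treatment, button), holding the
    slowing score to the punished stimulus. Returns F for each effect."""
    treatment, response, interaction = [], [], []
    for s in slowing:
        pn, pp = s[("placebo", "non_punished")], s[("placebo", "punished")]
        an, ap = s[("ATD", "non_punished")], s[("ATD", "punished")]
        treatment.append((pn + pp) / 2 - (an + ap) / 2)   # Pavlovian account: ATD effect on all responses
        response.append((pp + ap) / 2 - (pn + an) / 2)    # punished vs non-punished button
        interaction.append((pp - pn) - (ap - an))         # instrumental account: ATD effect specific to punished button
    return {
        "treatment": one_sample_f(treatment),
        "response": one_sample_f(response),
        "treatment_x_response": one_sample_f(interaction),
    }


# Hypothetical slowing scores for three subjects.
data = [
    {("placebo", "non_punished"): 1.0, ("placebo", "punished"): 3.0,
     ("ATD", "non_punished"): 0.0, ("ATD", "punished"): 1.0},
    {("placebo", "non_punished"): 2.0, ("placebo", "punished"): 5.0,
     ("ATD", "non_punished"): 0.0, ("ATD", "punished"): 1.0},
    {("placebo", "non_punished"): 0.0, ("placebo", "punished"): 1.0,
     ("ATD", "non_punished"): 1.0, ("ATD", "punished"): 1.0},
]
print(two_by_two_within(data))
```

Framed this way, a main effect of treatment without a treatment-by-response interaction is the pattern expected if ATD reduces slowing on both buttons to a similar degree.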
For completeness, we repeated the above analysis for responses to the non-punished stimulus. We did not find any evidence of slowing of responses to the non-punished stimulus, on either the non-punished or punished button, on either ATD or placebo (see Supplementary Results).
As an additional test of the hypothesis that 5-HT modulates the effect of Pavlovian aversive predictions on response vigor, we examined the immediate after-effects of punishment. We reasoned that the effects of Pavlovian aversive predictions on response vigor should be strongest on trials that immediately follow punishment, resulting in slower responding on trials following punishment (vs trials following non-punishment). We expected this ‘post-punishment slowing’ effect to be reduced following ATD. As predicted, participants were significantly slower to respond following punishments than following correct responses, and this effect was abolished by ATD (see Supplementary Results and Supplementary Figure S1).
No Effect of Low Serotonin on Discrimination Performance or Mood
To rule out the possibility that ATD influenced performance via effects on attention or executive function, we assessed the effects of treatment on sensitivity (d′) in the experimental blocks in a repeated-measures ANOVA with treatment (ATD, placebo) and block (RO, RP) as within-subjects factors. There was a trend toward better discrimination performance in the RO block, relative to the RP block (F(1, 23)=3.105, p=0.091), but there were no significant effects of treatment (F(1, 23)=0.013, p=0.911) or treatment × block (F(1, 23)=0.325, p=0.574).
Consistent with previous studies in healthy volunteers, ATD did not affect subjects' self-reported mood. We analyzed PANAS scores obtained immediately before drink ingestion and immediately before cognitive testing. A repeated-measures ANOVA with treatment (ATD, placebo) and time point (baseline, +5.5 h) as within-subjects factors found no significant effects of treatment, time point, or their interaction on PANAS positive affect (all p>0.13) or negative affect (all p>0.15).
DISCUSSION
Temporarily lowering 5-HT in humans produced a selective reduction in aversively motivated behavioral inhibition, replicating our previous findings (Crockett et al, 2009). This effect was evident in terms of both response bias and response vigor (ie, latencies). Critically, our task design allowed us to separate instrumental (stimulus-response-outcome) and Pavlovian (stimulus-outcome) processes in punishment-induced inhibition. Our results suggest that 5-HT is specifically involved in translating Pavlovian aversive predictions into behavioral inhibition, a function consistent with updated theoretical accounts of 5-HT in motivation and action (Dayan and Huys, 2009; Boureau and Dayan, 2011; Cools et al, 2011).
Recent computational approaches to affective learning and decision making have emphasized a distinction between instrumental control systems, which learn to emit arbitrary responses in pursuit of optimal outcomes, and Pavlovian control systems, which emit evolutionarily pre-programmed reflexes (eg, approach rewards and avoid punishments) to biologically relevant outcomes and their anticipation (Dayan and Huys, 2008; Dayan et al, 2006). Since any experiment with an instrumental contingency between stimulus, response, and outcome also contains a Pavlovian contingency between stimulus and outcome, behavior reflects a combination of Pavlovian and instrumental processes (Rescorla and Solomon, 1967; Mackintosh, 1983). Thus, any explanation of 5-HT's involvement in punishment-induced inhibition is incomplete without considering both instrumental and Pavlovian processes, especially because in the case of punishment the two are usually aligned (Bolles et al, 1980; Dayan et al, 2008; Boureau and Dayan, 2011; one exception is negative automaintenance). This analysis becomes all the more important when considering that Pavlovian responses to predictions of reward and punishment may account for a significant portion of anomalies in human decision making (Dayan et al, 2006), as well as psychopathologies such as depression (Dayan and Huys, 2008), in which 5-HT has historically played a prominent role. For example, a recent theoretical account of 5-HT in depression posits that 5-HT normally mediates reflexive (Pavlovian) avoidance of distressing thoughts; in depression, low 5-HT leads to increased engagement with aversive mental states and inflated predictions of the likelihood of aversive outcomes (Dayan and Huys, 2008).
In the current study, we separately assessed the impact of aversive predictions on response bias for punished vs non-punished responses, and on response vigor, reflected in the speed of responses in the presence of stimuli predictive of punishments. Aversive predictions influenced response selection: in an aversive context, subjects were biased against responding on the punished button, but only on placebo; tryptophan depletion abolished the bias against the punished response. Aversive predictions also influenced response vigor: in an aversive context, subjects responded more slowly than in a non-aversive context, but again only on placebo. Tryptophan depletion abolished the influence of aversive predictions on response vigor, replicating our previous findings (Crockett et al, 2009). These observations strongly support a role for 5-HT in the processing of aversive predictions.
But aversive predictions can take several forms. Specifically, aversive predictions can be instrumental, linking stimuli, responses, and outcomes; or they can be Pavlovian, linking stimuli and outcomes. Many studies have shown that 5-HT influences the effects of aversive predictions on punishment-induced behavioral inhibition (Thiébot et al, 1982, 1983; Tye et al, 1977, 1979; Wise et al, 1972; Graeff and Schoenfeld, 1970; Crockett et al, 2009), but no study has examined to what extent 5-HT modulates Pavlovian vs instrumental aversive predictions. This is likely because the two are difficult to disentangle: in the case of punishment, where a specific combination of stimulus and response produces an aversive outcome, Pavlovian and instrumental aversive predictions operate in parallel (Bolles et al, 1980). One potential explanation of our finding that ATD abolished a bias against responses that lead to punishment is that 5-HT is necessary for instrumental punishment sensitivity. An alternative explanation is that 5-HT mediates the effects of Pavlovian aversive predictions on instrumental choice. In aversive Pavlovian-instrumental transfer (PIT), stimuli predictive of punishments increase the likelihood of selecting actions that avoid punishment (Overmier et al, 1971; Huys et al, 2011). Such effects could contribute to the response bias we observed in the RP block, and a role for 5-HT in aversive PIT could partly explain the effects of ATD on response bias. The current experiment was not designed to specifically examine PIT effects, but this would be a promising target for future research.
In analyzing the effects of aversive predictions on response vigor, our experimental design allowed us to separately assess Pavlovian aversive predictions, reflected in the slowing of all responses in the presence of punished stimuli, and instrumental aversive predictions, reflected in the slowing of punished responses in the presence of punished stimuli. ATD had a main effect on response latencies for both punished and non-punished responses in the presence of punished stimuli, but did not affect response latencies to non-punished stimuli, suggesting a role for 5-HT in modulating the effects of Pavlovian aversive predictions on behavioral inhibition. However, we note that the lack of a significant interaction effect in the presence of a main effect of ATD could reflect a lack of power, and further studies should seek to replicate our findings. Furthermore, ATD abolished the slowing of responses immediately following punishment, but did not affect slowing of responses following errors. The effects of ATD on post-punishment slowing were again concentrated on responses to punished stimuli, providing further evidence for serotonergic modulation of Pavlovian aversive predictions.
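The logic of the two latency contrasts described above can be illustrated with a minimal sketch. This is not the analysis code used in the study: the trial format, the labels "punished" and "unpunished", the function names, and the latency values are all hypothetical and chosen only for illustration. The Pavlovian contrast compares all responses made in the presence of punished stimuli with all responses to non-punished stimuli; the instrumental contrast compares punished with non-punished responses within punished stimuli.

```python
from statistics import mean


def mean_rt(trials, stimulus=None, response=None):
    """Mean latency (ms) over trials matching the given stimulus/response labels."""
    rts = [
        t["rt"]
        for t in trials
        if (stimulus is None or t["stimulus"] == stimulus)
        and (response is None or t["response"] == response)
    ]
    return mean(rts)


def slowing_contrasts(trials):
    """Return two illustrative slowing contrasts, in ms.

    pavlovian: slowing of all responses under punished vs non-punished stimuli.
    instrumental: extra slowing of the punished response within punished stimuli.
    """
    pavlovian = mean_rt(trials, stimulus="punished") - mean_rt(
        trials, stimulus="unpunished"
    )
    instrumental = mean_rt(
        trials, stimulus="punished", response="punished"
    ) - mean_rt(trials, stimulus="punished", response="unpunished")
    return {"pavlovian": pavlovian, "instrumental": instrumental}


# Made-up latencies (ms) for illustration only; not data from the study.
example_trials = [
    {"stimulus": "punished", "response": "punished", "rt": 520.0},
    {"stimulus": "punished", "response": "unpunished", "rt": 500.0},
    {"stimulus": "unpunished", "response": "punished", "rt": 450.0},
    {"stimulus": "unpunished", "response": "unpunished", "rt": 450.0},
]

contrasts = slowing_contrasts(example_trials)
```

In this toy example, a treatment that removed slowing for both response types under punished stimuli would shrink the Pavlovian contrast and leave the instrumental contrast unchanged, which is the qualitative pattern the paragraph above attributes to ATD.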
Reflexive withdrawal responses to Pavlovian aversive predictions involve the amygdala, in particular the central nucleus (CeA) (Maren, 2001; Killcross et al, 1997; Cardinal et al, 2002; Balleine and Killcross, 2006). Lesions of the CeA reduce conditioned suppression of non-punished responses without affecting suppression of punished responses (Killcross et al, 1997), paralleling the current findings. Another region likely involved in Pavlovian aversive predictions is the insula (Calder et al, 2001), which is extensively anatomically connected with the amygdala (Barbas, 2007; Stein et al, 2007). An fMRI study that specifically modeled the neural representation of Pavlovian aversive predictions during learning found aversive value-related activity in the insula, as well as a region of the brainstem consistent with the location of the dorsal raphé nucleus (Seymour et al, 2004). Both the CeA and the insula receive a high level of serotonergic input, measured by 5-HT transporter density (Smith et al, 1999; O'Rourke and Fudge, 2006; Way et al, 2007), supporting the idea that 5-HT may promote punishment-induced inhibition by modulating activity in these regions. Future studies combining tryptophan depletion with fMRI are needed to explore these possibilities.
The method we used to manipulate 5-HT function, acute tryptophan depletion, is well known to deplete central 5-HT levels (Crockett et al, 2011) by way of reducing 5-HT synthesis (Nishizawa et al, 1997) and release in projection regions (Stancampiano et al, 1997; Fadda et al, 2000; van der Stelt et al, 2004). As a caveat, though, this method reduces 5-HT levels to a modest extent, and does so globally, which precludes drawing conclusions about region-specific effects without the concurrent use of neuroimaging. It is possible that more profound loss of central 5-HT would affect other processes, including aversive instrumental prediction, possibly mediated by distinct terminal regions. It has been suggested that tryptophan depletion may selectively influence tonic, rather than phasic, serotonergic signaling, which has important consequences for the interpretation of our results. Tonic 5-HT has been hypothesized to report the average rate of punishments, and a reduction in response vigor serves as an adaptive response to increased expectations of punishments (Cools et al, 2011). This proposal is consistent with our results; response vigor was indeed reduced in an aversive context, and this effect was abolished by tryptophan depletion. Although there is certainly room for future research using more precise methods to pin down the role of phasic 5-HT signaling, we note that psychiatric disorders as well as their treatments involve changes in global, tonic levels of 5-HT rather than region-specific changes in phasic 5-HT. Thus, an understanding of 5-HT function at the global level is necessary for resolving disorders of 5-HT in psychopathology.
In summary, we replicated previous findings that 5-HT is critical for punishment-induced inhibition, and now assign further specificity to 5-HT in this process by demonstrating that 5-HT modulates Pavlovian aversive predictions. ATD specifically abolished the slowing of response latencies in the presence of punishment-predicting stimuli. Our data provide early empirical support for emerging ideas about the role of 5-HT in motivation: operating at the intersection of aversion and inhibition, it may function to reduce the vigor of responding in the face of aversive predictions (Dayan and Huys, 2009; Boureau and Dayan, 2011; Cools et al, 2011).
Acknowledgments
MJC, LC, and TWR designed experiment; MJC and AA-S collected data; MJC and SM-Z analyzed data; MJC, LC, and TWR wrote paper. We thank the staff at the Wellcome Trust Clinical Research Facility and M Franklin for their assistance. This work was supported by a James S McDonnell Foundation 21st Century Collaborative Award (Bridging Brain, Mind and Behavior; award number 22002015501) to EA Phelps and TW Robbins, and by a Wellcome Trust grant (089589/Z/09/Z) awarded to TW Robbins, BJ Everitt, AC Roberts, JW Dalley and BJ Sahakian. SM-Z and AA-S are also supported by the Wellcome Trust grant (089589/Z/09/Z). The work was completed within the Behavioural and Clinical Neuroscience Institute which is supported by a joint award from the Medical Research Council and Wellcome Trust (G00001354).
Dr Clark has consulted for Cambridge Cognition Ltd. Dr Robbins has consulted for Cambridge Cognition Ltd, Lundbeck, Pfizer, GlaxoSmithKline, and Lilly and has received research grants from the latter four companies.
Footnotes
Supplementary Information accompanies the paper on the Neuropsychopharmacology website (http://www.nature.com/npp)
References
- Avila C. Distinguishing BIS-mediated and BAS-mediated disinhibition mechanisms: a comparison of disinhibition models of Gray (1981, 1987) and of Patterson and Newman (1993). J Pers Soc Psychol. 2001;80:311–324. doi: 10.1037/0022-3514.80.2.311.
- Balleine BW, Killcross S. Parallel incentive processing: an integrated view of amygdala function. Trends Neurosci. 2006;29:272–279. doi: 10.1016/j.tins.2006.03.002.
- Barbas H. Specialized elements of orbitofrontal cortex in primates. Ann NY Acad Sci. 2007;1121:10–32. doi: 10.1196/annals.1401.015.
- Bolles RC, Holtz R, Dunn T, Hill W. Comparisons of stimulus learning and response learning in a punishment situation. Learn Motiv. 1980;11:78–96.
- Boureau Y, Dayan P. Opponency revisited: competition and cooperation between dopamine and serotonin. Neuropsychopharmacology. 2011;36:74–97. doi: 10.1038/npp.2010.151.
- Calder AJ, Lawrence AD, Young AW. Neuropsychology of fear and loathing. Nat Rev Neurosci. 2001;2:352–363. doi: 10.1038/35072584.
- Cardinal RN, Parkinson JA, Hall J, Everitt BJ. Emotion and motivation: the role of the amygdala, ventral striatum, and prefrontal cortex. Neurosci Biobehav Rev. 2002;26:321–352. doi: 10.1016/s0149-7634(02)00007-6.
- Church RM, Wooten CL, Matthews TJ. Discriminative punishment and the conditioned emotional response. Learn Motiv. 1970;1:1–17.
- Cools R, Nakamura K, Daw ND. Serotonin and dopamine: unifying affective, activational, and decision functions. Neuropsychopharmacology. 2011;36:98–113. doi: 10.1038/npp.2010.121.
- Cools R, Roberts AC, Robbins TW. Serotoninergic regulation of emotional and behavioural control processes. Trends Cogn Sci. 2008a;12:31–40. doi: 10.1016/j.tics.2007.10.011.
- Cools R, Robinson OJ, Sahakian B. Acute tryptophan depletion in healthy volunteers enhances punishment prediction but does not affect reward prediction. Neuropsychopharmacology. 2008b;33:2291–2299. doi: 10.1038/sj.npp.1301598.
- Crockett MJ, Clark L, Robbins TW. Reconciling the role of serotonin in behavioral inhibition and aversion: acute tryptophan depletion abolishes punishment-induced inhibition in humans. J Neurosci. 2009;29:11993–11999. doi: 10.1523/JNEUROSCI.2513-09.2009.
- Crockett MJ, Clark L, Roiser JP, Robinson OJ, Cools R, Chase HW, et al. Converging evidence for central 5-HT effects in acute tryptophan depletion. Mol Psychiatry. 2011;17:121–123. doi: 10.1038/mp.2011.106.
- Dayan P. 2008. The role of value systems in decision making. In: Engel C and Singer W (eds), Better Than Conscious? Decision Making, the Human Mind, and Implications for Institutions. MIT Press: Germany; 51–70.
- Dayan P, Huys QJ. Serotonin, inhibition, and negative mood. PLoS Comput Biol. 2008;4:e4. doi: 10.1371/journal.pcbi.0040004.
- Dayan P, Huys QJ. Serotonin in affective control. Annu Rev Neurosci. 2009;32:95–126. doi: 10.1146/annurev.neuro.051508.135607.
- Dayan P, Niv Y, Seymour B, Daw ND. The misbehavior of value and the discipline of the will. Neural Netw. 2006;19:1153–1160. doi: 10.1016/j.neunet.2006.03.002.
- Dayan P, Seymour B. 2008. Values and actions in aversion. In: P Glimcher, C Camerer, E Fehr and R Poldrack (eds), Neuroeconomics: Decision Making and the Brain. Academic Press: New York; 175–191.
- Deakin JFW. 1983. Roles of serotonergic systems in escape, avoidance and other behaviours. In: SJ Cooper (ed.), Theory in Psychopharmacology Vol 2. Academic Press: London, UK; 149–193.
- Fadda F, Cocco S, Stancampiano R. A physiological method to selectively decrease brain serotonin release. Brain Res Brain Res Protoc. 2000;5:219–222. doi: 10.1016/s1385-299x(00)00016-7.
- Graeff FG, Schoenfeld RI. Tryptaminergic mechanisms in punished and nonpunished behavior. J Pharmacol Exp Ther. 1970;173:277–283.
- Gray JA, McNaughton N. The Neuropsychology of Anxiety. Oxford UP: Oxford; 2000.
- Huys QJ, Cools R, Gölzer M, Friedel E, Heinz A, Dolan RJ, et al. Disentangling the roles of approach, activation and valence in instrumental and pavlovian responding. PLoS Comp Biol. 2011;7:e1002028. doi: 10.1371/journal.pcbi.1002028.
- Killcross S, Robbins TW, Everitt BJ. Different types of fear-conditioned behaviour mediated by separate nuclei within amygdala. Nature. 1997;388:377–380. doi: 10.1038/41097.
- LeDoux JE. The Emotional Brain. Simon & Schuster: New York; 1996.
- Mackintosh NJ. Conditioning and Associative Learning. Oxford UP: Oxford; 1983.
- Maren S. Neurobiology of Pavlovian fear conditioning. Annu Rev Neurosci. 2001;24:897–931. doi: 10.1146/annurev.neuro.24.1.897.
- Newman JP, Wallace JF, Schmitt WA, Arnett PA. Behavioral inhibition system functioning in anxious, impulsive and psychopathic individuals. Pers Individ Dif. 1997;23:583–592.
- Nishizawa S, Benkelfat C, Young SN, Leyton M, Mzengeza S, De Montigny C, et al. Differences between males and females in rates of serotonin synthesis in human brain. Proc Natl Acad Sci USA. 1997;94:5308. doi: 10.1073/pnas.94.10.5308.
- O'Rourke H, Fudge JL. Distribution of serotonin transporter labeled fibers in amygdaloid subregions: implications for mood disorders. Biol Psychiatry. 2006;60:479–490. doi: 10.1016/j.biopsych.2005.09.020.
- Overmier JB, Bull JA, Pack K. On instrumental response interaction as explaining the influences of Pavlovian CS+s upon avoidance behavior. Learn Motiv. 1971;2:103–112.
- Rescorla RA, Solomon RL. Two-process learning theory: relationships between Pavlovian conditioning and instrumental learning. Psychol Rev. 1967;74:151–182. doi: 10.1037/h0024475.
- Seymour B, O'Doherty JP, Dayan P, Koltzenburg M, Jones AK, Dolan RJ, et al. Temporal difference models describe higher-order learning in humans. Nature. 2004;429:664–667. doi: 10.1038/nature02581.
- Smith HR, Daunais JB, Nader MA, Porrino LJ. Distribution of [3H]citalopram binding sites in the nonhuman primate brain. Ann NY Acad Sci. 1999;877:700–702. doi: 10.1111/j.1749-6632.1999.tb09305.x.
- Soubrie P. Reconciling the role of central serotonin neurons in human and animal behavior. Behav Brain Sci. 1986;9:364.
- Stancampiano R, Melis F, Sarais L, Cocco S, Cugusi C, Fadda F. Acute administration of a tryptophan-free amino acid mixture decreases 5-HT release in rat hippocampus in vivo. Am J Physiol. 1997;272:R991–R994. doi: 10.1152/ajpregu.1997.272.3.R991.
- Stanislaw H, Todorov N. Calculation of signal detection theory measures. Behav Res Methods Instrum Comput. 1999;31:137–149. doi: 10.3758/bf03207704.
- Stein JL, Wiedholz LM, Bassett DS, Weinberger DR, Zink CF, Mattay VS, et al. A validated network of effective amygdala connectivity. Neuroimage. 2007;36:736–745. doi: 10.1016/j.neuroimage.2007.03.022.
- Swets J, Tanner WP, Birdsall TG. Decision processes in perception. Psychol Rev. 1961;68:301–340.
- Thiébot MH, Hamon M, Soubrié P. Attenuation of induced-anxiety in rats by chlordiazepoxide: role of raphe dorsalis benzodiazepine binding sites and serotoninergic neurons. Neuroscience. 1982;7:2287–2294. doi: 10.1016/0306-4522(82)90139-7.
- Thiébot MH, Hamon M, Soubrié P. The involvement of nigral serotonin innervation in the control of punishment-induced behavioral inhibition in rats. Pharmacol Biochem Behav. 1983;19:225–229. doi: 10.1016/0091-3057(83)90043-6.
- Tottenham N, Tanaka JW, Leon AC, McCarry T, Nurse M, Hare TA, et al. The NimStim set of facial expressions: judgments from untrained research participants. Psychiatry Res. 2009;168:242–249. doi: 10.1016/j.psychres.2008.05.006.
- Tye NC, Everitt BJ, Iversen SD. 5-Hydroxytryptamine and punishment. Nature. 1977;268:741–743. doi: 10.1038/268741a0.
- Tye NC, Iversen SD, Green AR. The effects of benzodiazepines and serotonergic manipulations on punished responding. Neuropharmacology. 1979;18:689–695. doi: 10.1016/0028-3908(79)90036-4.
- van der Stelt HM, Broersen LM, Olivier B, Westenberg HG. Effects of dietary tryptophan variations on extracellular serotonin in the dorsal hippocampus of rats. Psychopharmacology (Berl). 2004;172:137–144. doi: 10.1007/s00213-003-1632-6.
- Watson D, Clark LA, Tellegen A. Development and validation of brief measures of positive and negative affect: the PANAS scales. J Pers Soc Psychol. 1988;54:1063–1070. doi: 10.1037//0022-3514.54.6.1063.
- Way BM, Laćan G, Fairbanks LA, Melega WP. Architectonic distribution of the serotonin transporter within the orbitofrontal cortex of the vervet monkey. Neuroscience. 2007;148:937–948. doi: 10.1016/j.neuroscience.2007.06.038.
- Wise CD, Berger BD, Stein L. Benzodiazepines: anxiety-reducing activity by reduction of serotonin turnover in the brain. Science. 1972;177:180–183. doi: 10.1126/science.177.4044.180.
- Young SN, Smith SE, Pihl RO, Ervin FR. Tryptophan depletion causes a rapid lowering of mood in normal males. Psychopharmacology (Berl). 1985;87:173–177. doi: 10.1007/BF00431803.