Author manuscript; available in PMC: 2009 Jul 15.
Published in final edited form as: Biol Psychiatry. 2008 Feb 1;64(2):162–168. doi: 10.1016/j.biopsych.2007.12.001

Euthymic Patients with Bipolar Disorder Show Decreased Reward Learning in a Probabilistic Reward Task

Diego A Pizzagalli 1, Elena Goetz 1, Michael Ostacher 2, Dan V Iosifescu 2, Roy H Perlis 2
PMCID: PMC2464620  NIHMSID: NIHMS56990  PMID: 18242583

Abstract

Background

Bipolar disorder (BPD) features cycling mood states ranging from depression to mania with intermittent phases of euthymia. BPD subjects often show excessive goal-directed and pleasure-seeking behavior during manic episodes and reduced hedonic capacity during depressive episodes, indicating that BPD might involve altered reward processing. Our goal was to test the hypothesis that BPD is characterized by impairments in adjusting behavior as a function of prior reinforcement history, particularly in the presence of residual anhedonic symptoms.

Methods

Eighteen medicated BPD subjects and 25 demographically matched comparison subjects performed a probabilistic reward task. To identify putative dysfunctions in reward processing irrespective of mood state, primary analyses focused on euthymic BPD subjects (n=13). Using signal-detection methodologies, response bias toward a more frequently rewarded stimulus was used to objectively assess the participants’ propensity to modulate behavior as a function of reinforcement history.

Results

Relative to comparison subjects, euthymic BPD subjects showed a reduced and delayed acquisition of response bias toward the more frequently rewarded stimulus, which was partially due to increased sensitivity to single rewards of the disadvantageous stimulus. Analyses considering the entire BPD sample revealed that reduced reward learning correlated with self-reported anhedonic symptoms, even after adjusting for residual manic and anxious symptoms and general distress.

Conclusions

The present study provides preliminary evidence indicating that BPD, even during euthymic states, is characterized by dysfunctional reward learning in situations requiring integration of reinforcement information over time, and thus offers initial insights about the potential source of dysfunctional reward processing in this disorder.

Keywords: Bipolar disorder, reinforcement learning, dopamine, reward, depression, anhedonia

Introduction

Bipolar disorder (BPD) is a debilitating condition characterized by recurrent episodes of depression as well as mania or hypomania (1). BPD subjects often show hyperhedonia (e.g., excessive goal-directed and pleasure-seeking behavior) during manic episodes and anhedonia (e.g., reduced reactivity to rewards) during depressive episodes (2, 3). As such, BPD has been linked to altered reward processing (4, 5). Surprisingly, studies of reward processing in BPD have yielded inconsistent results. In particular, studies using both gambling (6–9) and reward-based decision-making (10) tasks have failed to detect abnormalities in reward processing in BPD. Of note, these findings have emerged from medicated euthymic (9), depressed (8), and acutely manic (6, 10) samples as well as from a medication-free sample with past hypomanic episodes (7), suggesting that clinical characteristics or medication status were unlikely to explain these negative results.

By contrast, Murphy et al. (11) reported that manic BPD patients made frequent suboptimal choices (i.e., selected more often the less favorable of two possible responses) in a gambling task that involved fluctuating favorability of two response options. Moreover, medicated euthymic BPD children were slower to learn variable stimulus-reward contingencies in two response-reversal studies (12, 13). Finally, manic BPD subjects showed increased behavioral switching in a two-choice selection task involving a high error rate, indicating that their decision-making might be impaired in situations in which the probability of successful outcome becomes uncertain (14). Taken together, these findings suggest that BPD might feature a diminished ability to adapt behavior in response to changing or intermittent reward, and thus patients might show impaired reward learning in situations requiring integration of reinforcements over time. The goal of the present study was to directly test this hypothesis. To this end, BPD and healthy participants were assessed in a probabilistic reward task that provides an objective assessment of an individual’s propensity to modulate behavior in response to reinforcement history (15, 16). To allow for the identification of putative dysfunctions in reward processing regardless of mood state, primary analyses focused on euthymic BPD subjects.

Based on prior findings (11–13), we hypothesized that euthymic BPD patients would exhibit blunted reward learning, as manifested by reduced response bias toward the more frequently rewarded stimulus due to impaired integration of cumulative reward information. Moreover, we hypothesized that, among the BPD sample, reward learning would be most reduced among patients reporting residual anhedonic symptoms. Our hypotheses were motivated by findings showing a link between reduced reward responsiveness and anhedonic symptoms in non-clinical samples (15, 17), reports in euthymic BPD subjects of decreased attentional biases toward positive stimuli that may negatively impact reward learning (18), as well as theoretical considerations postulating down-regulation of dopaminergic transmission and the emergence of anhedonia during euthymic and depressive states of BPD (4).

Methods and Materials

Participants

BPD participants were recruited from patients followed for long-term treatment at the Bipolar Clinic and Research Program at Massachusetts General Hospital (MGH) and were initially evaluated using the Affective Disorder Evaluation (ADE; 19), which includes modified mood and psychosis modules from the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID) (20). BPD patients were enrolled if the following inclusion criteria were met: 1) current diagnosis of bipolar I or bipolar II disorder based on the ADE (19); 2) absence of any other current primary Axis I or II diagnosis or lifetime history of substance dependence (lifetime abuse was permitted); 3) absence of ECT in the past 6 months; and 4) absence of past history of major depressive episodes (MDE) with psychotic features. Patients with a history of substance dependence were excluded to avoid potential confounds deriving from possible dopaminergic abnormalities associated with this disorder (21, 22). Additionally, the SCID mood module was administered on the day of the study session to confirm diagnosis.

A total of 25 BPD patients were enrolled, but 7 were excluded due to task non-compliance (n=2), performance at chance level (n=4), or misunderstanding of task instructions (n=1). Based on clinician ratings that occurred on the day of the study session, subjects with a score on the Young Mania Rating Scale (YMRS; 23) ≥ 12 were defined as being in a hypomanic state, whereas those with a score on the 17-item Hamilton Rating Scale for Depression (HRSD; 24) > 8 were defined as being in a depressed state. BPD participants who met neither of these conservative thresholds were classified as currently euthymic (all euthymic participants had a YMRS score ≤ 6). Based on these criteria, the BPD sample (n=18) included 13 euthymic, 2 currently depressed, and 3 currently hypomanic participants. With the exception of one euthymic BPD patient with a history of alcohol abuse, no BPD subjects had any past substance abuse or dependence. Thirteen of the 18 patients met criteria for bipolar I disorder (11/13 of the euthymic BPD patients), whereas the remaining patients met criteria for bipolar II disorder. As in prior studies (6, 8–14), all patients were on psychotropic medications at the time of testing (Table 1). All ratings were performed by psychiatrists certified and monitored for reliability as part of the Systematic Treatment Enhancement Program for Bipolar Disorder (STEP-BD) study (19, 25, 26). Inter-rater reliability for the HRSD (kappa=0.82), YMRS (kappa=0.79), and SCID-based BPD diagnosis (kappa > 0.80) was satisfactory among the authors’ team.
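The mood-state classification described above amounts to a small decision rule. The following is a minimal sketch of that rule, not the authors' procedure: the function name is hypothetical, and checking the YMRS threshold before the HRSD threshold is an assumption, since the text does not say how a participant exceeding both cut-offs would be labeled.

```python
def classify_mood_state(ymrs: int, hrsd: int) -> str:
    """Label a participant from same-day clinician ratings.

    Thresholds follow the text: YMRS >= 12 -> hypomanic,
    17-item HRSD > 8 -> depressed, otherwise euthymic.
    Testing YMRS first is an assumption of this sketch.
    """
    if ymrs >= 12:
        return "hypomanic"
    if hrsd > 8:
        return "depressed"
    return "euthymic"


# Example: ratings at the euthymic group means reported later (YMRS ~2, HRSD ~3).
state = classify_mood_state(ymrs=2, hrsd=3)
```

Note that an HRSD score of exactly 8 still counts as euthymic under this rule, because the depressed threshold is a strict inequality.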

Table 1.

Summary of psychotropic medications in BPD patients.

| Medications, n (%) | All BPD patients (n = 18) | Euthymic BPD patients (subset) (n = 13) |
|---|---|---|
| Lithium | 5 (27.8%) | 4 (30.8%) |
| VPA | 5 (27.8%) | 5 (38.5%) |
| Anticonvulsants | 11 (61.1%) | 8 (61.5%) |
| Antipsychotics | 8 (44.4%) | 4 (30.8%) |
| Antidepressants | 9 (50.0%) | 6 (46.2%) |

Note: Overlap among medication types was possible.

BPD patients were compared to 25 healthy comparison participants recruited through community advertisements. Comparison subjects were enrolled if they had no medical or neurological illness, no current or past Axis I diagnoses (SCID, Non-patient Edition), and no psychotropic medications (see Footnote 1).

For their participation, subjects were compensated $20, and received $5 in task earnings. All participants provided written informed consent to a protocol approved by Harvard University’s Committee on the Use of Human Subjects and the MGH Human Research Committee.

Task and Procedure

After administration of interview-based rating scales, participants completed a 25-min computer task, the Beck Depression Inventory II (27) (BDI-II), and the Mood and Anxiety Symptom Questionnaire (28) (MASQ). The MASQ is a self-report questionnaire assessing anxiety-specific symptoms (Anxious Arousal, AA), depression-specific symptoms (Anhedonic Depression, AD), and general distress (General Distress-Anxious Symptoms, GDA; General Distress-Depressive Symptoms, GDD), and has shown satisfactory validity and reliability (28, 29).

The task is a reward-based paradigm in which correct identifications of two ambiguous stimuli are differentially rewarded. This paradigm has been found to reliably produce a response bias in control participants, such that as the task proceeds, the more frequently rewarded stimulus is preferentially selected (15–17, 30). This pattern is consistent with the so-called “matching law” (31), which postulates that response selection relies on reinforcement history.

In brief, participants are instructed to identify whether a long or short mouth is presented within a schematic face by pressing one of two buttons on the keyboard (“z” or “/”, counterbalanced). The face first appears without a mouth, and then either a long (13mm) or short mouth (11.5mm) is presented for 100ms. Stimulus exposure and sizes were chosen based on prior studies (15) and after pilot testing to optimize the psychometric properties of the task (e.g., overall accuracy rates of 75–85%).

Participants are instructed that for some of their correct responses, they will be rewarded and see a message “Correct!! You won 5 Cents.” The task consists of three blocks of 100 trials each. These are referred to as block 1 (trials 1–100), block 2 (trials 101–200) and block 3 (trials 201–300). The two stimuli are presented with equal frequency. Importantly, correct identification of one stimulus (“rich stimulus”) is rewarded three times more often than the other (“lean stimulus”). The reward feedback is presented only 40 times, 30 times for the rich and 10 times for the lean stimulus. At the outset, participants are instructed that the goal of the task is to win as much money as possible, and that they would earn between $3 and $7 based on their performance. Importantly, they are explicitly informed that not all correct responses will receive a reward feedback. They are not informed, however, that one of the stimuli would be rewarded more frequently.

Data Reduction

Task performance was assessed by computing response bias, discriminability, and reaction times (RT). Hit rates [(number of hits)/(number of hits + number of misses)] and miss rates [1 − hit rate] were also computed but were considered secondary variables because they are imperfect measures of performance when response biases are present (32). Response bias and discriminability were computed using the following formulae (33):

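Response bias:

$$\log b = \frac{1}{2}\log\left(\frac{\mathrm{Rich}_{\mathrm{correct}} \times \mathrm{Lean}_{\mathrm{incorrect}}}{\mathrm{Rich}_{\mathrm{incorrect}} \times \mathrm{Lean}_{\mathrm{correct}}}\right)$$

Discriminability:

$$\log d = \frac{1}{2}\log\left(\frac{\mathrm{Rich}_{\mathrm{correct}} \times \mathrm{Lean}_{\mathrm{correct}}}{\mathrm{Rich}_{\mathrm{incorrect}} \times \mathrm{Lean}_{\mathrm{incorrect}}}\right)$$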

A high response bias emerges if a subject shows a high number of correct identifications (i.e., high hit rate) for the rich stimulus and a low number of correct identifications (i.e., high miss rate) for the lean stimulus. Accordingly, response bias indexes an individual’s preference towards the more frequently rewarded (rich) stimulus. Since reinforcers are stimuli that increase the likelihood of a given behavioral response (34), response bias towards the rich stimulus can be used to measure the extent to which behavior is modulated by reinforcement history. Discriminability, conversely, is a measure of the subjects’ ability to perceptually distinguish between the two stimulus types and thus serves as an assessment of task difficulty.
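To make the two signal-detection measures concrete, the sketch below computes response bias (log b) and discriminability (log d) from the four trial counts within a block. It is an illustration under stated assumptions rather than the authors' analysis code: the function names are hypothetical, the logarithm base is taken to be 10 (the text only writes "log"), and no correction for zero counts is applied.

```python
import math


def _half_log_ratio(numerator: float, denominator: float, base: float) -> float:
    # Shared form of both measures: 0.5 * log(numerator / denominator).
    # A zero count raises an error here; handling that case is left out
    # because the text does not describe any correction.
    return 0.5 * math.log(numerator / denominator, base)


def response_bias(rich_correct, rich_incorrect, lean_correct, lean_incorrect,
                  base=10.0):
    """log b: preference for the response tied to the rich stimulus."""
    return _half_log_ratio(rich_correct * lean_incorrect,
                           rich_incorrect * lean_correct, base)


def discriminability(rich_correct, rich_incorrect, lean_correct, lean_incorrect,
                     base=10.0):
    """log d: ability to tell the two stimuli apart."""
    return _half_log_ratio(rich_correct * lean_correct,
                           rich_incorrect * lean_incorrect, base)


# Hypothetical block of 100 trials (50 rich, 50 lean): 45 rich hits, 35 lean hits.
log_b = response_bias(45, 5, 35, 15)
log_d = discriminability(45, 5, 35, 15)
```

In this made-up block the rich hit rate (0.90) exceeds the lean hit rate (0.70), so log b is positive; with equal hit rates for the two stimuli it would be zero regardless of overall accuracy.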

To test the hypothesis that reduced reward learning would be associated with anhedonic symptoms, an “anhedonic” BDI-II subscore was computed by summing the following BDI-II items (15, 35): loss of pleasure (item #4), loss of interest (item #12), loss of energy (item #15), and loss of interest in sex (item #21).
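The subscore is a plain sum over four items, which can be sketched as follows. The function name and the dictionary layout (item number mapped to its 0–3 rating) are assumptions made for illustration only.

```python
from typing import Mapping

# BDI-II items named in the text: loss of pleasure (4), loss of interest (12),
# loss of energy (15), loss of interest in sex (21).
ANHEDONIC_ITEMS = (4, 12, 15, 21)


def anhedonic_subscore(item_scores: Mapping[int, int]) -> int:
    """Sum the four anhedonia-related BDI-II items (each rated 0-3)."""
    return sum(item_scores[item] for item in ANHEDONIC_ITEMS)


# Hypothetical respondent: all 21 items at 0 except three anhedonia items.
ratings = {item: 0 for item in range(1, 22)}
ratings.update({4: 1, 12: 2, 21: 3})
subscore = anhedonic_subscore(ratings)
```

With four items rated 0 to 3, the subscore ranges from 0 to 12.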

Statistics

To test whether BPD was characterized by abnormal reward processing irrespective of current mood state, the main statistical analyses focused on the euthymic BPD subjects (n=13). Chi-square tests and unpaired t-tests were performed to test for possible group differences in sociodemographic variables. To evaluate mood symptoms, a mixed analysis of variance (ANOVA) was run on MASQ scores, using Group (Comparison subjects, euthymic BPD subjects) and MASQ subscales (GDA, AA, GDD, AD) as factors.

For the reward task, separate mixed ANOVAs with Group and Block (1, 2, 3) as factors were run on response bias and discriminability scores. For RT and hit rate scores, Stimulus Type (Rich, Lean) was added as a repeated measure. When required, the Greenhouse-Geisser correction was used. Significant findings were followed up with post-hoc Newman-Keuls tests. Pearson correlations and hierarchical regression analyses between response bias and measures of depressive and manic symptoms (HRSD, BDI-II, MASQ scales) were run within the entire BPD sample (n=18). Throughout the analyses, two-tailed tests were used. Effect sizes are reported as partial eta2 and Cohen d values.

Results

Demographics, Symptom Severity, and Mood Variables

Comparison (n=25) and euthymic BPD (n=13) subjects did not differ significantly with respect to demographic variables (Table 2). [Also, no differences emerged when considering the entire BPD sample, n=18.] For the euthymic BPD patients, the mean HRSD and YMRS scores were 3.38 (SD=2.57) and 1.92 (SD=2.33), respectively. Relative to comparison subjects, euthymic BPD patients had significantly higher BDI-II scores (8.38±6.70 vs. 3.40±3.59, t(36) = 3.00, p < 0.005); the BPD subjects’ mean BDI-II score was, however, below the threshold for mild depression (BDI ≥ 14).

Table 2.

Sociodemographic and clinical data in comparison (n = 25) and euthymic bipolar (n = 13) participants

| Variable | Comparison subjects (n = 25): Mean | SD | Euthymic BPD patients (n = 13): Mean | SD | C vs. BPD: Statistics | P value |
|---|---|---|---|---|---|---|
| Age | 38.36 | 10.76 | 38.77 | 12.09 | t(36) = −0.11 | > .95 |
| Gender ratio (Female/Male) | 11/14 | N/A | 5/8 | N/A | χ2(1) = 0.11 | > .70 |
| Education (% college education) | 61.54% | N/A | 64.00% | N/A | χ2(1) = 0.02 | > .85 |
| Ethnicity (% Caucasian) | 68.0% | N/A | 100.0% | N/A | χ2(3) = 6.00 | > .11 |
| Marital status (% never married) | 64.0% | N/A | 69.2% | N/A | χ2(2) = 0.41 | > .82 |
| BDI-II | 3.40 | 3.59 | 8.38 | 6.70 | t(36) = −3.00 | < .005 |
| Anhedonic BDI-II subscore* | 0.72 | 1.02 | 1.77 | 2.20 | t(36) = −2.02 | .051 |
| HRSD (17-item) | N/A | N/A | 3.38 | 2.57 | N/A | N/A |
| YMRS | N/A | N/A | 1.92 | 2.33 | N/A | N/A |
| MASQ GDA | 14.16 | 4.34 | 17.00 | 4.69 | N/K | > .55 |
| MASQ AA | 18.76 | 5.19 | 23.00 | 4.47 | N/K | > .25 |
| MASQ GDD | 15.64 | 5.22 | 19.77 | 7.38 | N/K | > .45 |
| MASQ AD | 51.52 | 12.60 | 60.38 | 14.22 | N/K | < .003 |

BDI-II: Beck Depression Inventory II (27); HRSD: Hamilton Rating Scale for Depression (24); YMRS: Young Mania Rating Scale (23); MASQ: Mood and Anxiety Symptom Questionnaire (28). N/K: P value from post-hoc Newman-Keuls test.

* Sum of BDI-II items #4 (loss of pleasure), #12 (loss of interest), #15 (loss of energy), and #21 (loss of interest in sex).

The ANOVA on the MASQ scores revealed main effects of MASQ subscale [F(3,108) = 188.01, p < 0.001, ε = 0.44, partial eta2 = 0.84] and Group [F(1,36) = 15.53, p < 0.005, partial eta2 = 0.30], but no interaction (p > 0.35). Post-hoc Newman-Keuls tests indicated that group differences were driven by significantly higher anhedonic depression scores in euthymic BPD than in comparison subjects (p < 0.003), whereas the two groups did not differ in the other MASQ subscales (all ps > 0.28; Table 2).

Probabilistic Reward Task

Response Bias

The ANOVA revealed a significant effect of Block [F(2,72) = 6.57, p < 0.002, partial eta2 = 0.15], which was due to significantly higher response bias in Blocks 2 and 3 compared to Block 1 (Newman-Keuls ps < 0.005). The main effect of Group [F(1,36) = 6.28, p < 0.020, partial eta2 = 0.15] and the Group × Block interaction [F(2,72) = 3.30, p < 0.043, partial eta2 = 0.08] were also significant. As shown in Fig. 1, euthymic BPD patients had significantly lower overall response bias than comparison subjects (0.07±0.16 vs. 0.22±0.18; Cohen d = −0.87). Post-hoc Newman-Keuls tests further revealed significantly higher response biases in Blocks 2 and 3 compared to Block 1 for euthymic BPD patients (ps < 0.005). For comparison subjects, no changes in response bias occurred across the blocks (ps > 0.40). Relative to comparison subjects, euthymic BPD patients had significantly lower response bias in Block 1 only (p < 0.0004; Cohen d = −0.93) (see Footnote 2).
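As a plausibility check on the overall effect size, Cohen's d can be recomputed from the group means and standard deviations given above. The sketch assumes the pooled-standard-deviation form of d, which the text does not specify, and takes the lower mean (0.07±0.16) as the euthymic BPD group (n = 13) and the higher mean (0.22±0.18) as the comparison group (n = 25).

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d for two independent groups, using the pooled SD (assumed variant)."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Euthymic BPD minus comparison, overall response bias across the three blocks.
d_overall = cohens_d(0.07, 0.16, 13, 0.22, 0.18, 25)
```

With these rounded inputs the value comes out at roughly −0.86, in line with the reported −0.87; the small gap is the kind expected from rounding of the published means and SDs.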

Figure 1.

Response bias as a function of block (block 1: trials 1–100; 2: trials 101–200; 3: trials 201–300) for healthy comparison (n = 25) and euthymic BPD (n = 13) subjects. Error bars represent standard errors.

To further explore the timing of response bias acquisition, a one-way ANOVA that considered the first half (trials 1–50) and second half (trials 51–100) of Block 1, Block 2 (trials 101–200), and Block 3 (trials 201–300) was performed for comparison and BPD subjects separately. For both groups, the main effect of Block was significant (both Fs > 3.02, both ps < 0.035). Within-group post-hoc analyses revealed that comparison subjects had significantly higher response bias in the second half of Block 1, Block 2, and Block 3 compared to the first half of Block 1 (all ps < 0.05). For BPD patients, however, no differences emerged between the early and late phases of Block 1; instead, they showed significantly higher response biases in Blocks 2 and 3 compared to the first half of Block 1 (both ps < 0.036).

To exclude the possibility that group differences in response bias were due to elevated depressive symptoms in BPD patients, a set of hierarchical regression analyses was performed (see refs. 36, 37 for a rationale of using a regression analysis approach when covariates and independent variables are correlated). To this end, the total BDI-II scores were entered in the first step of the regression followed by Group (dummy coded) to predict response bias in Block 1, 2, and 3 as well as response bias across the 300 trials. Findings revealed that Group was a significant predictor of response bias in Block 1 (ΔR2 = 0.20), Block 2 (ΔR2 = 0.11), and over the entire 300 trials (ΔR2 = 0.16) after removing variance associated with BDI-II scores (all ΔFs > 4.24, df = 1,35; all ps < 0.048). Similar findings emerged when considering anhedonic BDI-II scores: Group was a unique predictor of response bias in Block 1 (p < 0.01), Block 2 (p = 0.085), and over the 300 trials (p < 0.035).
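The logic of a two-step hierarchical regression (covariate first, then Group) can be sketched with ordinary least squares. This is an illustrative sketch, not the authors' code: the helper names are hypothetical, the data are randomly generated placeholders rather than study data, and the ΔF formula is the standard F-change test for adding one predictor to a nested model.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an OLS fit of y on X (an intercept column is added here)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - float(resid @ resid) / float(ss_tot)


def hierarchical_step(X_step1, x_added, y):
    """Return (delta R^2, delta F, residual df) for adding one predictor."""
    r2_reduced = r_squared(X_step1, y)
    X_full = np.column_stack([X_step1, x_added])
    r2_full = r_squared(X_full, y)
    df_resid = len(y) - X_full.shape[1] - 1
    delta_r2 = r2_full - r2_reduced
    delta_f = delta_r2 / ((1.0 - r2_full) / df_resid)
    return delta_r2, delta_f, df_resid


# Placeholder data only: 25 comparison (0) and 13 euthymic BPD (1) subjects.
rng = np.random.default_rng(0)
group = np.r_[np.zeros(25), np.ones(13)]
bdi = rng.normal(5.0, 4.0, size=38)
bias = 0.2 - 0.15 * group + rng.normal(0.0, 0.15, size=38)

delta_r2, delta_f, df_resid = hierarchical_step(bdi, group, bias)
```

With 38 subjects and two predictors the residual degrees of freedom are 38 − 2 − 1 = 35, matching the df = 1,35 reported above.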

Discriminability

No significant effects emerged (all Fs < 2.22, all ps > 0.14).

Reaction Time

No effects involving Group emerged (all Fs < 1.46, all ps > 0.23).

Hit rate

The ANOVA on hit rates revealed a significant main effect of Condition [F(1,36) = 30.85, p < 0.001, partial eta2 = 0.46; rich stimulus > lean stimulus] and a significant Block × Condition interaction [F(2,72) = 7.69, p < 0.001, partial eta2 = 0.18]. As in a prior study using this paradigm (15), the Block × Condition interaction was due to significant hit rate differences between the stimuli, which increased across the blocks.

Critically, this effect was qualified by a significant Group × Condition × Block interaction, F(2,72) = 3.22, p = 0.046, partial eta2 = 0.08 (Fig. 2A,B). To evaluate this triple interaction, follow-up Group × Block ANOVAs were performed for the rich and lean hit rates separately. For the rich stimulus, the main effect of Group was reliable [F(1,36) = 8.13, p = 0.007, partial eta2 = 0.18] due to significantly higher hit rates (or conversely, significantly lower miss rates) for the comparison than euthymic BPD subjects (0.89±0.07 vs. 0.82±0.05; Cohen d = 1.04; Fig. 2C). Notably, comparison and BPD subjects had virtually identical lean accuracy scores (0.75±0.14 vs. 0.75±0.11), and neither the main effect of Group [F(1,36) = 0.003, p > 0.95] nor the Group × Block interaction [F(2,72) = 2.03, p > 0.13] was significant.

Figure 2.

Mean accuracy for the rich and lean stimulus across the three blocks (panels A and B) and averaged across the three blocks (panel C) for healthy comparison (n = 25) and euthymic BPD (n = 13) subjects. In (C), arrows denote significant post-hoc tests; error bars represent standard errors.

Probability analyses

The above analyses indicate that euthymic BPD patients had significantly lower response bias and significantly higher miss rate for the more frequently rewarded (rich) stimulus. To investigate these findings in more detail, we computed the probability of missing a rich stimulus as a function of the outcome in the immediately preceding trial. To this end, we first identified all trials in which a correct identification of the rich or lean stimulus was rewarded, and then computed the probability of a rich miss in the subsequent trial. Analogous computations were performed for trials immediately following a correct identification of the rich or lean stimulus that was not rewarded (because a reward was not scheduled). Note that these analyses allowed us to test the strength of the response bias toward the rich stimulus as a function of (a) which stimulus had been rewarded in the preceding trial; and (b) proximity of reward delivery. After an arcsine transformation was applied (38), these probability values were entered in a Group × Stimulus Type (rich vs. lean) × Preceding Trial (rewarded vs. not rewarded) ANOVA. For the sake of simplicity, only effects involving Group are reported, and untransformed values are shown.
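The conditional probabilities described above can be sketched as a single pass over consecutive trial pairs. The sketch rests on several assumptions that go beyond the text: the `Trial` record and function names are hypothetical, only pairs in which the current trial shows the rich stimulus are counted, trials are assumed to come from one uninterrupted sequence, and the arcsine transformation is taken to be arcsin(√p), one common variant.

```python
import math
from collections import defaultdict
from typing import NamedTuple


class Trial(NamedTuple):
    stimulus: str   # "rich" or "lean"
    correct: bool   # stimulus correctly identified
    rewarded: bool  # reward feedback shown (only possible when correct)


def rich_miss_probabilities(trials):
    """P(rich miss | preceding correct trial's stimulus and reward status)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [rich misses, rich trials]
    for prev, cur in zip(trials, trials[1:]):
        # Keep pairs where the preceding trial was a correct identification
        # and the current trial presents the rich stimulus.
        if not prev.correct or cur.stimulus != "rich":
            continue
        key = (prev.stimulus, prev.rewarded)
        counts[key][1] += 1
        if not cur.correct:
            counts[key][0] += 1
    return {key: misses / total for key, (misses, total) in counts.items()}


def arcsine_transform(p):
    """Variance-stabilizing transform for proportions (assumed variant)."""
    return math.asin(math.sqrt(p))


# Tiny made-up sequence, for illustration only.
demo = [
    Trial("lean", True, True),
    Trial("rich", False, False),
    Trial("rich", True, False),
    Trial("rich", True, True),
    Trial("rich", False, False),
]
probs = rich_miss_probabilities(demo)
```

Each key in the result is one of the four cells of the Stimulus Type × Preceding Trial design, for example `("lean", True)` for a rich trial that follows a rewarded lean trial.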

The ANOVA revealed a significant main effect of Group [F(1,36) = 7.19, p < 0.011, partial eta2 = 0.167] and a trend for the 3-way interaction [F(1,36) = 2.84, p = 0.10, partial eta2 = 0.073]. Post-hoc tests indicated that group differences were driven by significantly higher probability of a rich miss in euthymic BPD patients than comparison subjects in two of the four experimental conditions: when a rich trial was preceded by (a) a non-rewarded rich stimulus (p < 0.019; Cohen d = 0.74), or (b) a rewarded lean stimulus (p < 0.004; Cohen d = 0.80) (Fig. 3). Moreover, within-group analyses indicated that euthymic BPD patients but not comparison subjects had significantly higher probability of rich misses immediately after a non-rewarded rich stimulus than a non-rewarded lean stimulus (p < 0.028 vs. p > 0.19, respectively).

Figure 3.

Probability of rich misses for healthy comparison (n = 25) and euthymic BPD (n = 13) subjects as a function of whether the preceding rich or lean trial was rewarded or not. Arrows denote significant post-hoc tests; error bars represent standard errors.

Relationships with clinical symptoms

Within the entire BPD sample (n=18), overall reward learning [ΔResponse Bias = Response Bias(Block 3) − Response Bias(Block 1)] was negatively correlated with total HRSD (r = −0.51, p < 0.030), total BDI-II (r = −0.57, p < 0.015), and anhedonic BDI-II (r = −0.51, p < 0.030) scores but not with the YMRS score (r = −0.18, p > 0.45) or MASQ AD score (r = −0.22, p > 0.35). A hierarchical regression analysis indicated that the anhedonic BDI-II subscore (entered in the third step) was a significant predictor of ΔResponse Bias (standardized β coefficient = −0.609, t = −3.04, p < 0.010), even after removing variance associated with the total YMRS score (entered in the first step) and the two MASQ anxiety scores (entered concurrently in the second step) (ΔR2 = 0.35, ΔF = 9.22, df = 1,13, p < 0.010). Accordingly, patients reporting relatively elevated anhedonic symptoms were characterized by decreased reward learning even when controlling for their residual manic and anxious symptoms, or general distress (Fig. 4).

Figure 4.

Pearson correlation (r = −0.59, p < 0.010) for the entire BPD sample (n = 18) between ΔResponse Bias and the residualized BDI-II anhedonic subscore, which was computed by removing variance associated with the total YMRS score and the two MASQ anxiety scores (MASQ AA and MASQ GDA).

Discussion

The goal of this study was to test the hypotheses that (a) BPD patients are characterized by abnormal reward processing even during a euthymic state; and (b) the presence of residual anhedonic symptoms would exacerbate this dysfunction. Using a probabilistic reward task, which assesses how behavior is modulated by reinforcement history, we found that both euthymic and symptomatic BPD patients showed reduced and delayed acquisition of response bias toward the more frequently rewarded stimulus, even after controlling for residual depressive or anhedonic symptoms. In addition, BPD patients reporting anhedonic symptoms in their daily life (e.g., loss of pleasure) showed the most impaired reward learning. Highlighting the specificity of this finding, a relationship between anhedonic symptoms and reduced response bias remained after controlling for subjects’ residual manic and anxious symptoms as well as general distress, replicating recent findings from two non-clinical samples (15, 17) and a medication-free sample with major depressive disorder (Pizzagalli, Iosifescu, Hallett, Ratner, Fava, unpublished). Unlike our recent findings in major depression, BPD patients exhibited cumulative learning over the course of the three blocks; however, the blunted nature of the response bias and its delayed acquisition point to a dysfunctional integration of reward information in early phases of the experiment.

Of note, additional analyses indicated that reduced response bias in BPD patients was not due to general task deficits, as evident from the lack of group differences in discriminability or reaction time. Rather, the performance of BPD participants was characterized by their increased tendency to misclassify the more frequently rewarded (rich) stimulus, whereas they showed no differences from comparison subjects in classifying the lean stimulus. Interestingly, elevated miss rates for the rich stimulus emerged only when it was immediately preceded by either a non-rewarded rich stimulus, or by a rewarded lean stimulus, indicating that BPD subjects were impaired in developing a response bias toward the more frequently rewarded stimulus in the absence of a proximal rich reward or after receiving a reward for the less advantageous response.

The finding of increased misclassification of the rich stimulus immediately after a rewarded lean stimulus is intriguing, particularly in light of theoretical accounts linking BPD to dysregulation of the behavioral approach system (BAS; 39–41), a system assumed to regulate appetitive motivation and goal-directed behavior in response to signals of reward (42). Notably, in BPD, increased BAS sensitivity and experiences of goal-striving and goal-attainment events predicted future manic symptoms (43–46), and behavioral activation scores distinguished euthymic BPD patients from healthy controls (46). The current finding that reduced response bias toward the more frequently rewarded stimulus was partially explained by increased sensitivity and behavioral switching following rewards of the less advantageous (lean) stimulus is consistent with the general hypothesis of heightened responsivity to minimal environmental incentives in BPD (39–41, 46). Our results also extend a recent report of maladaptive BAS hypersensitivity in subjects with bipolar spectrum disorder (47). In the current study, an increased sensitivity to the infrequently occurring lean reward might, in turn, have led to impaired cumulative reward learning in BPD subjects.

Overall, the present findings of impaired integration of reward feedback are in line with recent reports showing that medicated euthymic and acutely manic BPD patients displayed an increased tendency to select the less favorable of two possible response options in a gambling task (11) and exhibited deficits in learning fluctuating stimulus-reward contingencies (12, 13). Moreover, since omission of reward could be interpreted by the participants as reflecting a potential erroneous response, the increased miss rate observed in trials immediately following a non-rewarded rich trial is consistent with a prior finding of increased behavioral switches after error feedback in mania (14). Unlike prior studies, however, the current findings provide direct evidence that BPD, even during euthymic states, is characterized by reduced and delayed integration of reinforcements over time, and thus provide novel insights about the potential source of dysfunctional reward processing in this disorder (see Footnote 3).

The limitations of the present study should be acknowledged. First, the sample of BPD patients, particularly euthymic patients, was relatively small, and all patients were medicated. We note that all prior studies investigating reward processing in BPD have also assessed medicated subjects (6, 8–14), highlighting the practical and ethical difficulties of investigating medication-free BPD subjects. Second, no patient was in a manic state, and the range of clinical symptomatology was limited. Thus, it is unclear whether acutely manic BPD patients might show potentiated, rather than blunted, reward learning. Third, patients with a history of substance dependence were excluded to avoid potential confounds deriving from possible dopaminergic abnormalities that characterize these disorders (21, 22). Thus, it is unclear whether our findings will generalize to other BPD samples. Fourth, only reward feedback was included, so future studies will be needed to evaluate whether BPD patients might show deficits in other types of incentive learning (e.g., punishment feedback).

In sum, BPD patients, particularly those with residual anhedonic symptoms, showed reduced behavioral bias toward a more frequently rewarded stimulus. Future studies will be required to evaluate whether this abnormality is associated with dysfunction in brain regions coding the representation of reward values (e.g., orbitofrontal cortex; 48, 49) and/or the down-regulation of dopaminergic synaptic mechanisms, which have been hypothesized to follow the hyperdopaminergic state observed in mania (4).

Acknowledgments

This work was supported by grants from NIMH (R01MH68376) to DAP. The authors are grateful to Darin Dougherty and Mariko Jameson for assistance with recruitment of control subjects investigated in this study; to Lindsay Hallett for assistance with subject enrollment; and to Jeffrey Birk, Kyle Ratner, and James O’Shea for their support of this study.

Footnotes

1. These participants also served as comparison subjects in a recent study investigating reward responsiveness in unmedicated subjects with unipolar depression (Pizzagalli, Iosifescu, Hallett, Ratner, Fava, unpublished).

2

A main effect of Group remained when considering the entire BPD sample (n = 18) irrespective of current clinical state or when considering only euthymic BPD subjects with BP I (n = 11) (Fs > 4.22, ps < 0.046). Moreover, exploratory analyses evaluating the potential effects of different classes of medication on response bias (e.g., drugs blocking dopaminergic effects) revealed no significant effects (see Supplementary Material for more detail).

3

Note that the BPD subjects achieved a comparable response bias by the third block of the task, indicating that they were able to integrate reinforcement information, albeit in a delayed way. The control group, by contrast, achieved their maximum response bias very early in the course of the task (by the second half of block 1), and their failure to show increasing biases over time may be due to ceiling effects.

Financial Disclosures

Dr. Pizzagalli has received research support from GlaxoSmithKline and Merck & Co., Inc. Ms. Goetz reports no competing interests. Dr. Ostacher has received research support from Pfizer, and honoraria, Speaker Bureau or travel support from AstraZeneca, Bristol-Myers Squibb, Concordant Rater Systems, Eli Lilly, and GlaxoSmithKline. Dr. Iosifescu has received research support from Aspect Medical Systems, Forest Laboratories, and Janssen Pharmaceutica, as well as honoraria from Aspect Medical Systems, Cephalon, Gerson Lehrman Group, Eli Lilly & Co., Forest Laboratories, and Pfizer, Inc. Dr. Perlis has received honoraria or consulting fees from AstraZeneca, Bristol-Myers Squibb, Eli Lilly & Co., GlaxoSmithKline, Pfizer, and Proteus. Dr. Perlis holds stock with Concordant Rater Systems; he discusses off-label uses of medications in his presentations and specifies when these occur.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • 1. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 4th ed., text revision. Washington, DC: American Psychiatric Press; 2000.
  • 2. Johnson SL. Mania and dysregulation in goal pursuit: A review. Clin Psychol Rev. 2005;25:241–262. doi: 10.1016/j.cpr.2004.11.002.
  • 3. Leibenluft E, Charney DS, Pine DS. Researching the pathophysiology of pediatric bipolar disorder. Biol Psychiatry. 2003;53:1009–1020. doi: 10.1016/s0006-3223(03)00069-6.
  • 4. Berk M, Dodd S, Kauer-Sant’Anna M, Malhi GS, Bourin M, Kapczinski F, et al. Dopamine dysregulation syndrome: implications for a dopamine hypothesis of bipolar disorder. Acta Psychiatr Scand. 2007;116(Suppl 434):41–49. doi: 10.1111/j.1600-0447.2007.01058.x.
  • 5. Hasler G, Drevets WC, Gould TD, Gottesman II, Manji HK. Toward constructing an endophenotype strategy for bipolar disorders. Biol Psychiatry. 2006;60:93–105. doi: 10.1016/j.biopsych.2005.11.006.
  • 6. Clark L, Iversen SD, Goodwin GM. A neuropsychological investigation of prefrontal cortex involvement in acute mania. Am J Psychiatry. 2001;158:1605–1611. doi: 10.1176/appi.ajp.158.10.1605.
  • 7. Clark L, Iversen SD, Goodwin GM. The influence of positive and negative mood states on risk taking, verbal fluency, and salivary cortisol. J Affect Disord. 2001;63:179–187. doi: 10.1016/s0165-0327(00)00183-x.
  • 8. Ernst M, Dickstein DP, Munson S, Eshel N, Pradella A, Jazbec S, et al. Reward-related processes in pediatric bipolar disorder: a pilot study. J Affect Disord. 2004;82S:S89–S101. doi: 10.1016/j.jad.2004.05.022.
  • 9. Rich BA, Bhangoo RK, Vinton DT, Berghorst LH, Dickstein DP, Grillon C, et al. Using affect-modulated startle to study phenotypes of pediatric bipolar disorder. Bipol Disord. 2005;7:536–545. doi: 10.1111/j.1399-5618.2005.00265.x.
  • 10. Rubinsztein JS, Fletcher PC, Rogers RD, Ho LW, Aigbirhio FI, Paykel ES, et al. Decision-making in mania: a PET study. Brain. 2001;124:2550–2563. doi: 10.1093/brain/124.12.2550.
  • 11. Murphy FC, Rubinsztein JS, Michael A, Rogers RD, Robbins TW, Paykel ES, Sahakian BJ. Decision-making cognition in mania and depression. Psychol Med. 2001;31:679–693. doi: 10.1017/s0033291701003804.
  • 12. Dickstein DP, Treland JE, Snow J, McClure EB, Mehta MS, Towbin KE, et al. Neuropsychological performance in pediatric bipolar disorder. Biol Psychiatry. 2004;55:32–39. doi: 10.1016/s0006-3223(03)00701-7.
  • 13. Gorrindo T, Blair RJR, Budhani S, Dickstein DP, Pine DS, Leibenluft E. Deficits on a probabilistic response-reversal task in patients with pediatric bipolar disorder. Am J Psychiatry. 2005;162:1975–1977. doi: 10.1176/appi.ajp.162.10.1975.
  • 14. Minassian A, Paulus MP, Perry W. Increased sensitivity to error during decision-making in bipolar disorder patients with acute mania. J Affect Disord. 2004;82:203–208. doi: 10.1016/j.jad.2003.11.010.
  • 15. Pizzagalli DA, Jahn AL, O’Shea JP. Toward an objective characterization of an anhedonic phenotype: A signal-detection approach. Biol Psychiatry. 2005;57:319–27. doi: 10.1016/j.biopsych.2004.11.026.
  • 16. Pizzagalli DA, Evins AE, Schetter Cowman E, Frank MJ, Pajtas PE, Santesso DL, Culhane M. Single dose of a dopamine agonist impairs reinforcement learning in humans: Behavioral evidence from a laboratory-based measure of reward responsiveness. Psychopharmacology (Berl). 2007. doi: 10.1007/s00213-007-0957-y.
  • 17. Bogdan R, Pizzagalli DA. Acute stress reduces hedonic capacity: Implications for depression. Biol Psychiatry. 2006;60:1147–1154. doi: 10.1016/j.biopsych.2006.03.037.
  • 18. Jongen EM, Smulders FT, Ranson SM, Arts BM, Krabbendam L. Attentional bias and general orienting processes in bipolar disorder. J Behav Ther Exp Psychiatry. 2007;38:168–183. doi: 10.1016/j.jbtep.2006.10.007.
  • 19. Sachs GS, Thase ME, Otto MW, Bauer M, Miklowitz D, Wisniewski SR, et al. Rationale, design, and methods of the Systematic Treatment Enhancement Program for Bipolar Disorder (STEP-BD). Biol Psychiatry. 2003;53:1028–1042. doi: 10.1016/s0006-3223(03)00165-3.
  • 20. First MB, Spitzer RL, Gibbon M, Williams JBW. Structured Clinical Interview for DSM-IV-TR Axis I Disorders, Research Version, Patient Edition (SCID-I/P). New York: Biometrics Research, New York State Psychiatric Institute; 2002.
  • 21. Chambers RA, Krystal JH, Self DW. A neurobiological basis for substance abuse comorbidity in schizophrenia. Biol Psychiatry. 2001;50:71–83. doi: 10.1016/s0006-3223(01)01134-9.
  • 22. Volkow ND, Fowler JS, Wang GJ, Swanson JM. Dopamine in drug abuse and addiction: results from imaging studies and treatment implications. Mol Psychiatry. 2004;9:557–569. doi: 10.1038/sj.mp.4001507.
  • 23. Young RC, Biggs JT, Ziegler VE, Meyer DA. A rating scale for mania: reliability, validity and sensitivity. Br J Psychiatry. 1978;133:429–435. doi: 10.1192/bjp.133.5.429.
  • 24. Hamilton M. A rating scale for depression. J Neurol Neurosurg Psychiatry. 1960;23:56–62. doi: 10.1136/jnnp.23.1.56.
  • 25. Perlis RH, Ostacher MJ, Patel JK, Marangell LB, Zhang H, Wisniewski SR, et al. Predictors of recurrence in bipolar disorder: primary outcomes from the Systematic Treatment Enhancement Program for Bipolar Disorder (STEP-BD). Am J Psychiatry. 2006;163:217–224. doi: 10.1176/appi.ajp.163.2.217.
  • 26. Perlis RH, Brown E, Baker RW, Nierenberg AA. Clinical features of bipolar depression versus major depressive disorder in large multicenter trials. Am J Psychiatry. 2006;163:225–231. doi: 10.1176/appi.ajp.163.2.225.
  • 27. Beck AT, Steer RA, Brown GK. Beck Depression Inventory Manual. 2nd ed. San Antonio: The Psychological Corporation; 1996.
  • 28. Watson D, Weber K, Assenheimer JS, Clark LA, Strauss ME, McCormick RA. Testing a tripartite model: I. Evaluating the convergent and discriminant validity of anxiety and depression symptom scales. J Abnorm Psychol. 1995;104:3–14. doi: 10.1037//0021-843x.104.1.3.
  • 29. de Beurs E, den Hollander-Gijsman ME, Helmich S, Zitman FG. The tripartite model for assessing symptoms of anxiety and depression: Psychometrics of the Dutch version of the mood and anxiety symptoms questionnaire. Behav Res Ther. 2006. doi: 10.1016/j.brat.2006.07.004.
  • 30. Tripp G, Alsop B. Sensitivity to reward frequency in boys with attention deficit hyperactivity disorder. J Clin Child Psychol. 1999;28:366–375. doi: 10.1207/S15374424jccp280309.
  • 31. Herrnstein RJ. Matching, melioration and rational choice. In: Herrnstein Richard J, Rachlin H, Laibson DI, editors. The Matching Law: Papers in Psychology and Economics. Cambridge, MA: Harvard University Press; 1997.
  • 32. Macmillan NA, Creelman DC. Detection Theory: A User’s Guide. New York: Cambridge University Press; 1991.
  • 33. McCarthy D, Davison M. Signal probability, reinforcement, and signal detection. J Exp Anal Behav. 1979;32:373–382. doi: 10.1901/jeab.1979.32-373.
  • 34. Spanagel R, Weiss F. The dopamine hypothesis of reward: past and current status. Trends Neurosci. 1999;22:521–527. doi: 10.1016/s0166-2236(99)01447-2.
  • 35. Joiner TE, Brown JS, Metalsky GI. A test of the tripartite model’s prediction of anhedonia’s specificity to depression: patients with major depression versus patients with schizophrenia. Psychiatry Res. 2003;119:243–50. doi: 10.1016/s0165-1781(03)00131-8.
  • 36. Miller GA, Chapman JP. Misunderstanding analysis of covariance. J Abnorm Psychol. 2001;110:40–48. doi: 10.1037//0021-843x.110.1.40.
  • 37. Cohen J, Cohen P. Applied multiple regression/correlation analysis for the behavioral sciences. Hillsdale, NJ: Erlbaum Associates; 1983.
  • 38. Sheskin DJ. Handbook of Parametric and Nonparametric Statistical Procedures. 2nd ed. Boca Raton, FL: Chapman & Hall/CRC; 2000.
  • 39. Depue R, Iacono W. Neurobehavioral aspects of affective disorders. Ann Rev Psychol. 1989;40:457–492. doi: 10.1146/annurev.ps.40.020189.002325.
  • 40. Fowles DC. Biological variables in psychopathology: a psychobiological perspective. In: Sutker PB, Adams HE, editors. Comprehensive handbook of psychopathology. New York: Plenum Press; 1993. pp. 57–82.
  • 41. Johnson SL. Mania and dysregulation in goal pursuit: A review. Clin Psychol Rev. 2005;25:241–262. doi: 10.1016/j.cpr.2004.11.002.
  • 42. Gray JA. Neural systems, emotion, and personality. In: Madden J IV, editor. Neurobiology of learning, emotion, and affect. New York: Raven Press; 1991. pp. 273–306.
  • 43. Johnson SL, Sandrow D, Meyer B, Winters R, Miller I, Solomon D, et al. Increases in manic symptoms after life events involving goal attainment. J Abnorm Psychol. 2000;109:721–727. doi: 10.1037//0021-843x.109.4.721.
  • 44. Meyer B, Johnson SL, Winters R. Responsiveness to threat and incentive in bipolar disorder: Relations of the BIS/BAS scales with symptoms. J Psychopathol Behav Assess. 2001;23:133–143. doi: 10.1023/A:1010929402770.
  • 45. Nusslock R, Abramson LY, Harmon-Jones E, Alloy LB, Hogan ME. A goal-striving life event and the onset of hypomanic and depressive episodes and symptoms: Perspective from the behavioral approach system (BAS) dysregulation theory. J Abnorm Psychol. 2007;116:105–115. doi: 10.1037/0021-843X.116.1.105.
  • 46. Salavert J, Caseras X, Torrubia R, Furest S, Aranz B, Dueñas R, et al. The functioning of the behavioral activation and inhibitions systems in bipolar I euthymic patients and its influence in subsequent episodes over an eighteen-month period. Pers Indiv Diff. 2007;42:1323–1331.
  • 47. Harmon-Jones E, Abramson LY, Nusslock R, Sigelman JD, Urosevic S, Turonie LD, et al. Effect of bipolar disorder on left frontal cortical responses to goals differing in valence and task difficulty. Biol Psychiatry. In press. doi: 10.1016/j.biopsych.2007.08.004.
  • 48. Blumberg HP, Leung H-C, Skudlarski P, Lacadie CM, Fredericks CA, Harris BC, et al. A functional magnetic resonance imaging study of bipolar disorder. Arch Gen Psychiatry. 2003;60:601–609. doi: 10.1001/archpsyc.60.6.601.
  • 49. Drevets WC, Ongur D, Price JL. Neuroimaging abnormalities in the subgenual prefrontal cortex: implications for the pathophysiology of familial mood disorders. Mol Psychiatry. 1998;3:220–226. doi: 10.1038/sj.mp.4000370.
