CNS Neuroscience & Therapeutics. 2017 Jan 18;23(3):248–256. doi: 10.1111/cns.12671

Behavioral and Electrophysiological Alterations for Reinforcement Learning in Manic and Euthymic Patients with Bipolar Disorder

Vin Ryu 1, Ra Yeon Ha 2, Su Jin Lee 3, Kyooseob Ha 4, Hyun‐Sang Cho 3,5
PMCID: PMC6492753  PMID: 28098430

Summary

Aims

Bipolar disorder is characterized by behavioral changes such as risk‐taking and increasing goal‐directed activities, which may result from altered reward processing. Patients with bipolar disorder show impaired reward learning in situations that require the integration of reinforced feedback over time. In this study, we examined the behavioral and electrophysiological characteristics of reward learning in manic and euthymic patients with bipolar disorder using a probabilistic reward task.

Methods

Twenty‐four manic and 20 euthymic patients with bipolar I disorder and 24 healthy control subjects performed the probabilistic reward task. We assessed response bias (RB) as a preference for the stimulus paired with the more frequent reward and feedback‐related negativity (FRN) to correct identification of the rich stimulus.

Results

Both manic and euthymic patients showed significantly lower RB scores in the early learning stage (block 1) in comparison with the late learning stage (block 2 or block 3) of the task, as well as significantly lower RB scores in the early stage compared to healthy subjects. Relatively more negative FRN amplitude is elicited by no presentation of an expected reward, compared to that elicited by presentation of expected feedback. The FRN became significantly more negative from the early (block 1) to the later stages (blocks 2 and 3) in both manic and euthymic patients, but not in healthy subjects. Changes in RB scores and FRN amplitudes between blocks 2 and 3 and block 1 correlated positively in healthy controls, but correlated negatively in manic and euthymic patients. The severity of manic symptoms correlated positively with reward learning scores and negatively with the FRN.

Conclusions

These findings suggest that patients with bipolar disorder during euthymic or manic states have behavioral and electrophysiological alterations in reward learning compared to healthy subjects. This dysfunctional reward processing may be related to the abnormal decision‐making or altered goal‐directed activities frequently seen in patients with bipolar disorder.

Keywords: Bipolar disorder, Feedback‐related negativity, Probabilistic learning, Response bias, Reward

Introduction

Bipolar disorder is characterized by recurrent mood episodes including manic and depressive periods during the patients' lifetime. Characteristically, patients with bipolar disorder seek extremely high goals, set high expectations for success, and participate excessively in pleasurable activities and risk‐taking behaviors during manic states, and show decreased interest or responses to pleasurable stimuli during depressive states 1. Abnormal reward sensitivity, which is a dysregulation of goal‐directed behaviors and motivation in response to a reward, has been observed in persons with bipolar disorder, as well as in those at risk for this disorder 2. Therefore, alterations in goal‐related and pleasurable activities in patients with bipolar disorder may be outcomes of abnormal reward processing.

Studies to date have revealed various aspects of impaired reward learning in bipolar disorder. Manic patients with bipolar disorder were more likely to make suboptimal and bad decisions for betting strategies, with an increased tendency to choose the less favorable of two response options in the Cambridge Gambling Task 3. High sensitivity to error rate changes and frequent switches at high error rates were also observed with a two‐choice prediction task in bipolar manic patients 4. On the Iowa Gambling Task, manic and depressive patients with bipolar disorder selected more cards from the risky decks than did healthy controls 5. Among euthymic bipolar patients, some studies using the Iowa Gambling Task showed heightened risk‐taking 5, 6, whereas others showed no differences in task performance 7, 8. On tasks modeling behavioral adjustment to changing reward contingencies, pediatric and adult individuals with bipolar disorder have been found to exhibit decreased reward learning, even during euthymic periods 9, 10. These results suggest that patients with bipolar disorder show a significantly diminished ability to adjust their responses to an intermittent reward. Therefore, poor probabilistic judgment in patients with bipolar disorder may be related to reward learning ability in situations requiring reward integration over time.

With event‐related potentials (ERPs), we can track the precise timing of reward learning processes in bipolar disorder. Feedback‐related negativity (FRN) has been used to monitor reward‐related activity in the anterior cingulate cortex (ACC) 11, 12. FRN, seen as a negative deflection, usually occurs 200–400 ms after feedback. FRN is thought to reflect the transmission of a dopaminergic signal arriving in the ACC, indicating that events have gone worse than expected 13. Therefore, FRN is involved in early appraisal of feedback, showing a larger (i.e., more negative) amplitude for worse outcomes than those expected, or an attenuated (i.e., more positive) amplitude for better outcomes than those expected 14, 15.

Learning, as one of three psychological reward components (emotion, motivation, and learning), involves predictive association and cognition of future rewards based on previous experiences 16. Patients with bipolar disorder appear to have impaired reward learning in response to changing rewards (i.e., they might readily change their response to rewards). Studies have found that patients with bipolar disorder are unable to make an optimal decision when they do not know the exact probability of a good outcome. The probabilistic reward task is an instrument designed to evaluate reward learning by assessing response bias to two differently rewarded, ambiguous stimuli 17.

Learning in the probabilistic reward task is expected to elicit a smaller (i.e., more positive) amplitude of FRN and greater activation of the anterior cingulate cortex upon being provided a reward for responses to rich stimuli, which increase the likelihood of a given behavioral response 18. The probabilistic reward task uses an asymmetrical reinforcement ratio to induce a response bias.

In this study, we investigated altered reward learning and related electrophysiological changes in bipolar I patients with manic or euthymic status. Using the probabilistic reward task, we measured the degree of response bias toward the more frequently rewarded stimulus as the extent to which behavior is modulated by reinforcement history 17. As electrophysiological correlates of reward learning, we measured the amplitude of FRN elicited by reward feedback delivered after correct identification of the rich stimuli 18. As previously reported, patients with bipolar disorder demonstrate cognitive impairments in the euthymic and manic phases, suggesting trait‐related impairments 19. Additionally, euthymic bipolar patients have been found to show dysfunctional reward learning in a probabilistic reward task, which is characterized by reduced acquisition of response bias in situations requiring reward information integration over time 10. We hypothesized that patients with bipolar disorder would show altered reward learning both behaviorally and neurophysiologically, observable as reduced response bias and more negative FRN amplitude during manic and euthymic phases.

Materials and Methods

Participants

Twenty‐four manic and twenty euthymic patients with bipolar I disorder were recruited from the inpatients and outpatients of the Severance Mental Health Hospital of Yonsei University Health System. Diagnostic work‐ups for bipolar disorder were performed according to the criteria of the Diagnostic and Statistical Manual of Mental Disorders, 4th edition. Diagnoses of bipolar disorder were assessed by the Mini‐International Neuropsychiatric Interview (MINI) 20. Diagnostic work‐ups were performed by two psychiatrists (R.Y.H. and H.S.C.). Patients with other psychiatric illnesses such as schizoaffective disorder, severe personality disorder, recent substance abuse or dependence, rapid cycling bipolar disorder, history of closed head injury, mental retardation, neurological disorders, or any other current axis I disorder were excluded. For healthy control subjects, we posted a recruitment notice on a website and selected 24 healthy subjects, sex‐ and age‐matched with the patient groups. We also performed diagnostic work‐ups for healthy controls using the MINI. These healthy volunteers had no history of bipolar disorder, schizophrenia, or other psychiatric illnesses and did not show any mood or thought symptoms during the interviews. All subjects were right‐handed as indicated by Annett's handedness questionnaire 21. Participants received 30,000 Won (approximately $30) for participating in the study, plus 10,000 Won (approximately $10) for performing the task. Written informed consent was obtained from all participants with adequate understanding. This study was approved by the Institutional Review Board of Severance Mental Health Hospital and was conducted in accordance with the Declaration of Helsinki.

Twenty‐four patients in the manic group were taking antipsychotics (Table 1). The average doses of lithium were 1050 ± 141 mg in manic patients and 964 ± 118 mg in euthymic patients. The average doses of valproate were 1210 ± 192 mg in the manic patients and 996 ± 246 mg in the euthymic patients.

Table 1.

Demographic and clinical characteristics of study subjects

| Characteristic | Healthy controls (n = 24) | Manic patients (n = 24) | Euthymic patients (n = 20) | F, t or χ² | P |
|---|---|---|---|---|---|
| Age (year) | 31.9 ± 6.96 | 34.5 ± 9.11 | 34.8 ± 4.41 | 1.11 | 0.34 |
| Sex (m/f) | 11/13 | 11/13 | 8/12 | 0.20 | 0.91 |
| IQ a | 106 ± 10.8 | 114 ± 12.9 | 110 ± 11.6 | 2.45 | 0.10 |
| YMRS | | 13.0 ± 10.1 | 2.00 ± 1.62 | 5.25 | <0.001 |
| MADRS | | 4.83 ± 2.96 | 2.40 ± 3.30 | 2.58 | 0.01 |
| Age at onset (year) | | 28.3 ± 10.0 | 24.7 ± 5.04 | 1.54 | 0.13 |
| Illness duration (year) | | 6.25 ± 7.12 | 14.0 ± 7.38 | 3.51 | 0.001 |
| Number of episodes | | 3.63 ± 2.02 | 5.65 ± 3.77 | 2.27 | 0.03 |
| Mood stabilizers (N, lithium/valproate) | | 10/14 | 7/13 | 0.21 | 0.76 |
| Antipsychotic dose b | | 830 ± 226 | 663 ± 225 | 2.44 | 0.02 |

IQ, intelligence quotient; YMRS, Young Mania Rating Scale; MADRS, Montgomery–Åsberg Depression Rating Scale. a Estimated by Korean version of Wechsler Adult Intelligence Test. b Chlorpromazine equivalent dose.

Clinical Assessment

We interviewed participants to assess their demographics including age, sex, and educational level. The YMRS, which is an 11‐item scale designed to measure the severity of manic symptoms (Cronbach's alpha: 0.66–0.92) 22, MADRS, which is a 10‐item scale used to measure depressive symptom severity (Cronbach's alpha: 0.84) 23, and the State‐Trait Anxiety Inventory (STAI), which comprises 40 self‐report items for measuring levels of state anxiety and trait anxiety (Cronbach's alpha: 0.86–0.95) 24, were utilized in this study. The Cronbach's alpha values in this study were 0.85 for YMRS, 0.74 for MADRS, and 0.92 for STAI.

Tasks

Participants performed the probabilistic reward task 17; permission to use this task in our study was obtained from Dr. Diego Pizzagalli through personal communication. This task consisted of three blocks, with 100 trials for each block. Each block was referred to as block 1 (trials 1–100), block 2 (trials 101–200), or block 3 (trials 201–300). Participants were instructed to focus on the cross‐sign in the center of the screen. Participants viewed a schematic mouthless face for 500 ms, followed by the presentation of this face with either a short mouth or long mouth for 1000 ms. The ambiguous mouth stimulus was shown for a short duration, after which the participants were asked whether the length of the mouth was short or long. The long mouth was 13 mm in length and the short mouth was 11.5 mm. The two stimuli were presented with equal frequency. During the task, participants were not rewarded for all correct responses. The participants were informed that the task goal was to win as much money as possible and that they would earn money based on their performance. For each block, only 40 correct responses were followed by positive feedback. They were rewarded for some of their correct responses with a message “Correct!! You won 100 Won.” The probability of the reward for the correct response was 3 of 5 for the rich stimulus and 1 of 5 for the lean stimulus. Participants were explicitly informed that they would not receive reward feedback for all correct responses. The detailed description of the task was previously published 10, 17.
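The reward schedule described above can be checked with a short arithmetic sketch. It assumes that "equal frequency" means exactly 50 rich and 50 lean trials per block and that enough correct responses occur for every scheduled reward to be delivered; the function and constant names are illustrative and are not taken from the original task software.

```python
from fractions import Fraction

TRIALS_PER_BLOCK = 100              # from the task description
RICH_REWARD_RATE = Fraction(3, 5)   # stated reward probability, rich stimulus
LEAN_REWARD_RATE = Fraction(1, 5)   # stated reward probability, lean stimulus


def scheduled_rewards(trials_per_block=TRIALS_PER_BLOCK):
    """Return the number of scheduled rewards per stimulus type in one block."""
    # Assumption: the two stimuli each make up exactly half of the block.
    per_stimulus = trials_per_block // 2
    rich = int(RICH_REWARD_RATE * per_stimulus)
    lean = int(LEAN_REWARD_RATE * per_stimulus)
    return {"rich": rich, "lean": lean, "total": rich + lean}


schedule = scheduled_rewards()
print(schedule)
```

Under these assumptions the schedule gives 30 rich and 10 lean rewards, which matches the stated 40 rewarded responses per block and corresponds to a 3:1 asymmetry between rich and lean stimuli.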

Behavioral Data for Reward Learning

The main variable for behavioral data was response bias (RB) during the probabilistic reward task performed during the ERP recording. RB, the preference for identifying the stimulus paired with the more frequent reward, was calculated using the following formula 17, 25, 26:

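$$\log b = \frac{1}{2} \log \left( \frac{\text{Rich}_{\text{correct}} \times \text{Lean}_{\text{incorrect}}}{\text{Rich}_{\text{incorrect}} \times \text{Lean}_{\text{correct}}} \right)$$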

The value of 0.5 was added to every cell of the detection matrix to allow the calculation of RB in case of a zero in one cell of the formula. RB increases as participants more often identify a stimulus as rich, that is, as the numbers of correct responses to rich stimuli and incorrect responses to lean stimuli increase. If a subject shows a high number of correct identifications for the rich stimulus and a low number of correct identifications for the lean stimulus, the response bias score will be higher. Positive reinforcement produces preference for more frequently rewarded stimuli as the task is executed. Therefore, RB may be an index of reward learning modulated by positive reinforcement 10, 17.
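As a minimal sketch of the RB calculation, the following function applies the 0.5 correction to each cell and then evaluates the formula. The base of the logarithm is not stated in the text; base 10 is assumed here, and the function name and example counts are hypothetical.

```python
import math


def response_bias(rich_correct, rich_incorrect, lean_correct, lean_incorrect):
    """Response bias (log b) with 0.5 added to every cell of the matrix."""
    rc, ri, lc, li = (
        n + 0.5 for n in (rich_correct, rich_incorrect, lean_correct, lean_incorrect)
    )
    # Assumption: base-10 logarithm; the source writes only "log".
    return 0.5 * math.log10((rc * li) / (ri * lc))


# Hypothetical counts for one block of 50 rich and 50 lean trials.
unbiased = response_bias(40, 10, 40, 10)      # same accuracy for both stimuli
rich_biased = response_bias(45, 5, 35, 15)    # more "rich" responses overall
zero_cell = response_bias(50, 0, 30, 20)      # a zero cell is still computable
print(unbiased, rich_biased, zero_cell)
```

Equal accuracy for the two stimuli yields an RB of zero, whereas a tendency to answer "rich" yields a positive value. The 0.5 correction keeps the ratio finite when a cell is empty.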

Electrophysiological Recordings and Analyses

The electroencephalograms (EEGs) were recorded continuously using a 64‐channel Neuroscan system (SynAmpsII) with an AgCl lead cap according to the international 10/10 system, with 0.05 and 100 Hz band‐pass filters and a sampling rate of 1000 Hz/channel. The impedance of all channels was kept below 5 kΩ. The recordings were referenced to linked electrodes placed on the left and right mastoid processes. Eye blinks and movements were monitored by electrodes placed near the outer canthus and beneath the left eye. Recording procedures were performed in a dimly lit, quiet, and electrically shielded EEG room. Subjects were seated in a comfortable reclining chair at an eye distance of 50 cm from the computer monitor (visual angle of 9° × 12°). Subjects were instructed to concentrate on the center of the monitor and to avoid eye‐blinking as much as possible. Each subject's performance was monitored by closed‐circuit camera, and subjects were not sleepy during the experiments.

EEG analysis was carried out on an off‐line basis. Gross movement artifacts on EEG were removed by inspection. The EEG was amplified by a 0.1–30 Hz band‐pass filter. To control for eye movement artifacts, trials were adjusted by regression from electro‐oculograms 27. Artifacts were rejected automatically if their amplitude exceeded ±50 μV. The minimal number of ERP epochs was 20. A low‐pass filter at 8.5 Hz was used to remove muscular movement, noise, and alpha‐wave activity. EEG epochs were extracted beginning 200 ms before and ending 600 ms after stimulus presentation. Only epochs in which correct responses were provided to richly rewarded stimuli were included in the analysis. A preresponse baseline between −200 ms and 0 ms was used for all ERP components. We measured FRN amplitude at electrodes Fz and FCz, where FRN has been reported to be maximal 11, 12, 28. FRN was defined as the most negative peak 200–400 ms after reward feedback following the correct identification of the rich stimulus.
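The peak measure described above can be illustrated with a small sketch that operates on one averaged waveform. It assumes a 1000 Hz sampling rate, an epoch running from −200 ms to 600 ms around the time-locking event, and that the minimum value in the 200–400 ms window is an acceptable stand-in for the most negative peak. The function name, parameters, and the synthetic waveform are illustrative, not the authors' analysis code.

```python
def frn_peak_amplitude(epoch_uv, sfreq_hz=1000, epoch_start_ms=-200,
                       baseline_ms=(-200, 0), window_ms=(200, 400)):
    """Baseline-correct an averaged waveform and return its most negative
    value (in microvolts) inside the FRN search window."""

    def to_index(t_ms):
        # Convert a latency in ms to a sample index within the epoch.
        return round((t_ms - epoch_start_ms) * sfreq_hz / 1000)

    b0, b1 = to_index(baseline_ms[0]), to_index(baseline_ms[1])
    baseline = sum(epoch_uv[b0:b1]) / (b1 - b0)

    w0, w1 = to_index(window_ms[0]), to_index(window_ms[1])
    # Assumption: the window minimum approximates the most negative peak.
    return min(v - baseline for v in epoch_uv[w0:w1 + 1])


# Synthetic example: a flat 1.0 uV waveform with a dip to -3.0 uV at 250 ms.
epoch = [1.0] * 801          # samples for -200 ms ... 600 ms at 1000 Hz
epoch[450] = -3.0            # index 450 corresponds to 250 ms
frn = frn_peak_amplitude(epoch)
print(frn)  # -4.0
```

In practice the waveform passed in would be the average of the accepted epochs for a given block, and the values from Fz and FCz would then be combined as described in the statistical analyses.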

Statistical Analyses

Behavioral data were analyzed by mixed analysis of variance (ANOVA) to assess effects of blocks (RB of three blocks, block 1 vs. block 2 vs. block 3) and groups (manic and euthymic patients and healthy controls), as analyzed by Pizzagalli et al. 10. FRN amplitudes were collapsed across Fz and FCz electrodes and were analyzed by mixed ANOVA to assess the effects of groups and blocks. The FRN was also collapsed across block 2 and block 3 and analyzed according to the learning phase (early [block 1] vs. late [blocks 2 and 3]), as analyzed by Santesso et al. 18. Significant findings were further analyzed using post hoc Newman–Keuls tests. Greenhouse–Geisser corrections for nonsphericity were applied. Pairwise comparisons for the response bias in each block were performed using one‐way ANOVA with post hoc analysis of Bonferroni corrections. SPSS version 17.0 was used for statistical analyses. We also calculated Pearson's correlation coefficient between FRN amplitude changes, reward learning (RB scores in blocks 2 and 3 − RB scores in block 1), and symptom severity.
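The reward learning score and its correlation with another measure can be sketched as follows. The text does not say how blocks 2 and 3 were combined; their mean is assumed here. The helper names and the example values are hypothetical and are not study data.

```python
import math


def reward_learning(rb_block1, rb_block2, rb_block3):
    """Late-minus-early change in response bias."""
    # Assumption: "blocks 2 and 3" is taken as the mean of the two blocks.
    return (rb_block2 + rb_block3) / 2 - rb_block1


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up values for four subjects: (RB block 1, RB block 2, RB block 3).
rb_scores = [(0.10, 0.10, 0.10), (0.05, 0.10, 0.20), (0.00, 0.15, 0.25), (0.00, 0.30, 0.30)]
learning = [reward_learning(*rb) for rb in rb_scores]
frn_change = [0.1, 0.9, 2.1, 2.9]   # made-up late-minus-early FRN change (uV)
r = pearson_r(learning, frn_change)
print(learning, r)
```

A positive r in this toy example means that subjects with larger gains in response bias also show a more positive change in FRN amplitude, which is the direction reported for healthy controls.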

Results

Demographic characteristics of the two patient groups and the control group are presented in Table 1. There were no significant differences between the groups with regard to age, sex, IQ, or education level. The Young Mania Rating Scale (YMRS) score was 13.0 ± 10.1 in manic patients and 2.0 ± 1.6 in euthymic patients. The Montgomery–Åsberg Depression Rating Scale (MADRS) score was 4.8 ± 3.0 in manic patients and 2.4 ± 3.3 in euthymic patients.

Response Bias

A two‐way group × block ANOVA revealed a significant effect of block (F(2,130) = 6.22, P = 0.005), in which the RB of block 1 was lower than that of block 3 (block 1 vs. block 3, P < 0.001; block 1 vs. block 2, P = 0.10; block 2 vs. block 3, P = 0.11). There was no main effect of group (F(2,65) = 0.71, P = 0.50). There was a significant group × block interaction (F(4,130) = 2.47, P = 0.05). We explored this interaction in two ways.

First, we assessed differences in RB among the blocks within each group separately. Within‐group analysis revealed significantly lower RB in block 1 compared to block 3 in euthymic patients (F(2,38) = 7.39, P = 0.002; block 1 vs. block 2, P = 0.08; block 1 vs. block 3, P = 0.001; block 2 vs. block 3, P = 0.06, Newman–Keuls test; RB scores: 0.46 ± 0.16 in block 1, 0.13 ± 0.16 in block 2, 0.21 ± 0.13 in block 3) and significantly lower RB in block 1 compared to block 2 and block 3 in manic patients (F(2,46) = 10.42, P = 0.001; block 1 vs. block 2, P = 0.01; block 1 vs. block 3, P < 0.001; block 2 vs. block 3, P = 0.56; RB scores: 0.47 ± 0.13 in block 1, 0.17 ± 0.20 in block 2, 0.19 ± 0.11 in block 3), but not in healthy controls (F(2,46) = 0.22, P = 0.75; RB scores: 0.19 ± 0.28 in block 1, 0.15 ± 0.23 in block 2, 0.18 ± 0.26 in block 3).

Second, we assessed group effects in each block separately. In block 1, ANOVA revealed that the RB scores of control subjects were higher than those of the two patient groups (F(2,67) = 4.11, P = 0.02; euthymic vs. control, P = 0.02; manic vs. control, P = 0.01; euthymic vs. manic, P = 0.99). There were no significant differences among the three groups in block 2 (F(2,67) = 0.22, P = 0.80) and block 3 (F(2,67) = 0.14, P = 0.87; Figure 1).

Figure 1.

Response bias and feedback‐related negativity amplitudes as a function of block in manic and euthymic patients and healthy controls. Arrows denote significant post hoc tests; error bars represent standard errors.

The ANOVA revealed a significantly higher probability of a rich miss in euthymic and manic bipolar patients than in healthy controls in one of the four experimental conditions: when a rich trial was preceded by a rewarded lean stimulus (F(2,67) = 4.83, P = 0.01; Figure 2).

Figure 2.

Probability of miss rates for healthy controls, euthymic patients, and manic patients. Arrows denote significant post hoc tests; error bars represent standard errors.

There were no significant differences in discriminability (F(2,130) = 2.38, P = 0.10), hit rate (F(2,130) = 0.87, P = 0.41), and response time (F(2,130) = 1.89, P = 0.16).

The Amplitude of FRN

Grand averages of FRN are shown in Figure 3. Two‐way group × block ANOVA revealed a significant effect of block (F(1,65) = 13.79, P < 0.001), in which the collapsed FRN amplitude at Fz and FCz electrodes in block 1 was more positive than that in blocks 2 and 3. There was no main effect of group (F(2,65) = 0.14, P = 0.87). There was a significant group × block interaction (F(2,65) = 11.50, P < 0.001; Figure 1). We explored this interaction in two ways.

Within‐group analyses revealed significantly more positive amplitude in block 1 than in blocks 2 and 3 in euthymic patients (2.02 ± 1.77 μV in block 1, 0.30 ± 2.81 μV in blocks 2 and 3; P < 0.001, Newman–Keuls test), and more positive amplitude in block 1 than in blocks 2 and 3 in manic patients (2.29 ± 3.70 μV in block 1, 0.21 ± 3.63 μV in blocks 2 and 3; P < 0.001, Newman–Keuls test), but not in healthy controls (F(1,23) = 2.24, P = 0.15; 0.47 ± 1.40 μV in block 1, 2.15 ± 3.18 μV in blocks 2 and 3). Between‐group comparisons in each block revealed that FRN amplitudes of healthy controls were more negative than those of manic and euthymic patients in block 1 (F(2,67) = 3.51, P = 0.04; euthymic vs. control, P = 0.05; manic vs. control, P = 0.02; euthymic vs. manic, P = 0.73). There were no differences among the groups in blocks 2 and 3 (F(2,67) = 0.74, P = 0.48; Figure 1).

Correlations among Behavioral and ERP Data and Clinical Symptoms

Reward learning, calculated as the RB difference between blocks 2 and 3 and block 1, was positively correlated with the difference in collapsed FRN amplitude at Fz and FCz electrodes between blocks 2 and 3 and block 1 in healthy controls (r = 0.49, P = 0.02). In contrast, euthymic and manic patients showed a significant negative correlation between reward learning scores and FRN amplitude differences (r = −0.35, P = 0.02; Figure 4). A significant positive correlation was found between the reward learning score and YMRS score in manic and euthymic patients (Pearson's coefficient = 0.31, P = 0.04). FRN amplitude differences between blocks 2 and 3 and block 1 were negatively correlated with YMRS scores (Pearson's coefficient = −0.50, P = 0.001). There was no significant correlation between MADRS and RB (r = 0.22, P = 0.15) or FRN amplitude differences in the patient groups (r = −0.16, P = 0.29; Figure 4).

Figure 3.

Grand averages of event‐related potential waveforms at Fz and FCz electrodes.

Figure 4.

Correlations between RB score differences (RB score in blocks 2 and 3 − RB score in block 1) and FRN amplitude differences (blocks 2 and 3 − block 1) (RB, response bias; FRN, feedback‐related negativity).

Probability of Missing Rich Stimuli after a Preceding Reward

We found significant group differences in the probability of a rich miss among euthymic patients, manic patients, and healthy controls (F(2,67) = 4.84, P = 0.01). Post hoc analyses revealed significant differences between manic patients and healthy controls (22.3 ± 17.6% [manic] vs. 10.8 ± 4.75% [control], P = 0.01) and between euthymic patients and healthy controls (20.4 ± 15.3% [euthymic] vs. 10.8 ± 4.75% [control], P = 0.01) in the probability of missing rich stimuli after a preceding rewarded lean stimulus. No group differences were identified in the probability of a rich miss preceded by a rewarded rich stimulus or by nonrewarded rich and lean stimuli.
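The conditional miss rate analyzed here can be sketched as a pass over consecutive trial pairs. The trial record layout (keys `stimulus`, `correct`, `rewarded`) and the function name are assumptions made for illustration; the text does not describe how trials at block boundaries were handled, so this sketch simply treats the trial list as one sequence.

```python
def rich_miss_after_rewarded_lean(trials):
    """Proportion of rich trials answered incorrectly when the immediately
    preceding trial was a rewarded lean stimulus; None if no such pair exists."""
    misses = 0
    total = 0
    for prev, cur in zip(trials, trials[1:]):
        preceded_by_rewarded_lean = prev["stimulus"] == "lean" and prev["rewarded"]
        if preceded_by_rewarded_lean and cur["stimulus"] == "rich":
            total += 1
            if not cur["correct"]:
                misses += 1
    return misses / total if total else None


# Hypothetical five-trial sequence.
trials = [
    {"stimulus": "lean", "correct": True, "rewarded": True},
    {"stimulus": "rich", "correct": False, "rewarded": False},  # miss, counted
    {"stimulus": "lean", "correct": True, "rewarded": True},
    {"stimulus": "rich", "correct": True, "rewarded": True},    # hit, counted
    {"stimulus": "rich", "correct": False, "rewarded": False},  # follows rich, ignored
]
miss_rate = rich_miss_after_rewarded_lean(trials)
print(miss_rate)  # 0.5
```

The other three conditions (rewarded rich, nonrewarded rich, nonrewarded lean as the preceding trial) would follow by changing the condition on the preceding trial.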

Discussion

In this study, we investigated the characteristics of reward learning in manic and euthymic patients with bipolar I disorder using a probabilistic reward task. Both patient groups showed significantly lower RB in the early learning stage (block 1) in comparison with the late learning stage (block 2 or block 3) and demonstrated lower RB in the early stage compared to healthy controls. FRNs became more negative from the early (block 1) to the later stages (blocks 2 and 3) of the task in both manic and euthymic patients, but not in healthy subjects. Both patient groups showed more positive FRNs than healthy controls only in block 1. These findings suggest that patients with bipolar disorder, even in the euthymic state, have impaired reward learning and integration of the repetitive reinforcement pattern of the task at behavioral and electrophysiological levels. To our knowledge, this is the first study to examine probabilistic reward learning and its related ERPs in bipolar I patients with different mood states of mania and euthymia.

Consistent with previous findings in euthymic and symptomatic bipolar patients using the probabilistic reward task 10, the current study showed that both manic and euthymic patients have decreased and delayed learning of the RB toward more frequently rewarded stimuli. As for FRNs, the patterns of change according to learning stage observed in our patients and healthy controls were similar to those observed in nonlearners and learners, respectively, in the study of healthy adults 18. In addition, a positive correlation was observed in healthy controls between the differences over time in RB and FRN in the current study. This correlation may support the amplitude change in FRN as a potential marker of reward learning. However, patients with bipolar disorder had a negative correlation between RB and FRN differences over time. In other words, as learning progresses, FRN, induced by feedback, becomes smaller (i.e., more positive) in healthy people whereas the FRN becomes larger (i.e., more negative) in patients with bipolar disorder.

These findings suggest that patients with bipolar disorder have impaired functions of their FRN‐related neuroanatomical and neurochemical systems, which are associated with the process of reward learning in healthy subjects. The FRN is generated in the ACC 11, 12, 29 and medial prefrontal cortex 30, 31. Recent reports strongly support the potential roles of these brain areas in affective regulation and emotional or reward processing and show their decreased activity or functional connectivity in bipolar disorder 32, 33, 34. Even during euthymic status in patients with bipolar disorder, reward‐related stimuli induced abnormal activity in the orbitofrontal area and ventral striatum 35, 36, which are densely connected to the ACC 37, 38. This altered neural function may be associated with abnormal reward learning and FRN amplitudes.

As for the neurochemical aspects, it has been assumed that FRN‐associated neuronal cells are usually regulated by dopamine neuron firing 39. Recently, a relationship was observed between a specific dopaminergic gene variant (e.g., COMT val158Met) and FRN 40, 41. There are also well‐replicated reports on the association of COMT val158met with reward learning measured by a probabilistic reward task 42, 43. The hyperdopaminergic state elicited by the reduced function of the dopamine transporter mimicked the abnormal behaviors of manic patients 44 and caused poor reward‐based decision‐making in mice 45. Furthermore, increased dopaminergic activity with the dopamine agonist pramipexole caused impairment of reward‐based learning on the Iowa Gambling Task in euthymic patients with bipolar disorder 46. As shown in the correlation between symptom severity and FRNs in this study, the more severe the manic symptoms, the more negative the pattern of FRN changes during the learning process. This interesting result might be indirect evidence of a role for dopamine in reward learning. Taken together, these neuroanatomical and chemical findings support the abnormal FRNs observed in patients with bipolar disorder in the current study.

In bipolar disorder, impulsivity and risk‐taking appear to be associated with altered activity of the prefrontal cortices and ventral striatum in response to rewards 47. Although they are more pronounced during the manic state, increased impulsivity and impulsive, risky decision‐making are found in both euthymic and symptomatic states over the course of illness 48, 49 and may be a vulnerability marker for bipolar disorder 50. Therefore, the behavioral and electrophysiological alterations of reward learning observed in manic and euthymic states may be trait abnormalities.

Compared to healthy persons, bipolar manic and euthymic patients exhibited significantly higher probabilities of a rich miss when a rich trial was preceded by rewarded lean stimuli. In a previous study of patients with bipolar disorder, an increased miss rate of a rich stimulus was also found immediately after a rewarded lean stimulus 10. This finding might reflect potential mechanisms linked to the delayed acquisition of response bias in patients with bipolar disorder.

This study has limitations. First, although our study recruited more participants than were included in previous studies using the probabilistic reward task 10, 18, the sample size was relatively small. Second, we did not control the use of psychotropic drugs. One study reported an effect of olanzapine on reward‐related brain activation in healthy subjects 29. In the current study, the chlorpromazine equivalent dose was not significantly correlated with RB scores (P values: 0.66–0.84) or FRN amplitudes (P values: 0.21–0.87) in each block. No significant difference was found between the lithium and divalproex groups in RB scores (P values: 0.49–0.76) or FRN amplitudes (P values: 0.41–0.87) in each block. There were no group differences in using mood stabilizers (lithium vs. valproate, χ² = 0.21, P = 0.76). However, we cannot completely rule out the potential effects of psychotropic medications. Third, we did not consider smoking effects on reward learning, although smoking status and cravings may modulate reward learning 51, 52.

Overall, we report evidence that bipolar I patients, during both normothymic and manic states, have behavioral and electrophysiological alterations in reward learning compared to healthy subjects. This dysfunctional reward processing may be related to the abnormal decision‐making or altered goal‐directed activities frequently seen in bipolar disorder subjects.

Conflict of Interest

The authors declare no conflict of interest.

Funding

This study was supported by a grant (A101915) from the Korea Healthcare Technology R&D Project of the Ministry of Health & Welfare of the Republic of Korea.

References

  • 1. American Psychiatric Association . Diagnostic and statistical manual of mental disorder, 4th edn Washington DC: American Psychiatric Association, 2000. [Google Scholar]
  • 2. Johnson SL, Edge MD, Holmes MK, Carver CS. The behavioral activation system and mania. Annu Rev Clin Psychol 2012;8:243–267.
  • 3. Murphy FC, Rubinsztein JS, Michael A, et al. Decision‐making cognition in mania and depression. Psychol Med 2001;31:679–693.
  • 4. Minassian A, Paulus MP, Perry W. Increased sensitivity to error during decision‐making in bipolar disorder patients with acute mania. J Affect Disord 2004;82:203–208.
  • 5. Adida M, Jollant F, Clark L, et al. Trait‐related decision‐making impairment in the three phases of bipolar disorder. Biol Psychiatry 2011;70:357–365.
  • 6. Malloy‐Diniz LF, Neves FS, de Moraes PH, et al. The 5‐HTTLPR polymorphism, impulsivity and suicide behavior in euthymic bipolar patients. J Affect Disord 2011;133:221–226.
  • 7. Martino DJ, Strejilevich SA, Torralva T, Manes F. Decision making in euthymic bipolar I and bipolar II disorders. Psychol Med 2011;41:1319–1327.
  • 8. Yechiam E, Hayden EP, Bodkins M, O'Donnell BF, Hetrick WP. Decision making in bipolar disorder: A cognitive modeling approach. Psychiatry Res 2008;161:142–152.
  • 9. Gorrindo T, Blair RJ, Budhani S, Dickstein DP, Pine DS, Leibenluft E. Deficits on a probabilistic response‐reversal task in patients with pediatric bipolar disorder. Am J Psychiatry 2005;162:1975–1977.
  • 10. Pizzagalli DA, Goetz E, Ostacher M, Iosifescu DV, Perlis RH. Euthymic patients with bipolar disorder show decreased reward learning in a probabilistic reward task. Biol Psychiatry 2008;64:162–168.
  • 11. Gehring WJ, Willoughby AR. The medial frontal cortex and the rapid processing of monetary gains and losses. Science 2002;295:2279–2282.
  • 12. Miltner WHR, Braun CH, Coles MGH. Event‐related brain potentials following incorrect feedback in a time‐estimation task: Evidence for a “Generic” neural system for error detection. J Cogn Neurosci 1997;9:788–798.
  • 13. Holroyd CB, Coles MGH. Dorsal anterior cingulate cortex integrates reinforcement history to guide voluntary behavior. Cortex 2008;44:548–559.
  • 14. Hajcak G, Holroyd CB, Moser JS, Simons RF. Brain potentials associated with expected and unexpected good and bad outcomes. Psychophysiology 2005;42:161–170.
  • 15. Holroyd CB, Pakzad‐Vaezi KL, Krigolson OE. The feedback correct‐related positivity: Sensitivity of the event‐related brain potential to unexpected positive feedback. Psychophysiology 2008;45:688–697.
  • 16. Berridge KC, Robinson TE. What is the role of dopamine in reward: Hedonic impact, reward learning, or incentive salience? Brain Res Rev 1998;28:309–369.
  • 17. Pizzagalli DA, Jahn AL, O'Shea JP. Toward an objective characterization of an anhedonic phenotype: A signal‐detection approach. Biol Psychiatry 2005;57:319–327.
  • 18. Santesso DL, Dillon DG, Birk JL, et al. Individual differences in reinforcement learning: Behavioral, electrophysiological, and neuroimaging correlates. NeuroImage 2008;42:807–816.
  • 19. Torres IJ, Boudreau VG, Yatham LN. Neuropsychological functioning in euthymic bipolar disorder: A meta‐analysis. Acta Psychiatr Scand Suppl 2007;434:17–26.
  • 20. Sheehan DV, Lecrubier Y, Sheehan KH, et al. The Mini‐International Neuropsychiatric Interview (MINI): The development and validation of a structured diagnostic psychiatric interview for DSM‐IV and ICD‐10. J Clin Psychiatry 1998;59:22–33.
  • 21. Annett M. A classification of hand preference by association analysis. Br J Psychol 1970;61:303–321.
  • 22. Young RC, Biggs JT, Ziegler VE, Meyer DA. A rating scale for mania: Reliability, validity and sensitivity. Br J Psychiatry 1978;133:429–435.
  • 23. Asberg M, Montgomery SA, Perris C, Schalling D, Sedvall G. A comprehensive psychopathological rating scale. Acta Psychiatr Scand Suppl 1978;57:5–27.
  • 24. Spielberger CD, Gorsuch R, Lushene R, Vagg P, Jacobs G. Manual for the state‐trait anxiety inventory, 1983.
  • 25. Macmillan NA, Creelman CD. Detection theory: A user's guide. Mahwah, NJ: Lawrence Erlbaum, 2004.
  • 26. McCarthy D, Davison M. Signal probability, reinforcement and signal detection. J Exp Anal Behav 1979;32:373.
  • 27. Hautus MJ. Corrections for extreme proportions and their biasing effects on estimated values of d′. Behav Res Methods 1995;27:46–51.
  • 28. Yeung N, Botvinick MM, Cohen JD. The neural basis of error detection: Conflict monitoring and the error‐related negativity. Psychol Rev 2004;111:931–959.
  • 29. Hauser TU, Iannaccone R, Stampfli P, et al. The feedback‐related negativity (FRN) revisited: New insights into the localization, meaning and network organization. NeuroImage 2014;84:159–168.
  • 30. Becker MP, Nitsch AM, Miltner WH, Straube T. A single‐trial estimation of the feedback‐related negativity and its relation to BOLD responses in a time‐estimation task. J Neurosci 2014;34:3005–3012.
  • 31. Carlson JM, Foti D, Mujica‐Parodi LR, Harmon‐Jones E, Hajcak G. Ventral striatal and medial prefrontal BOLD activation is correlated with reward‐related electrocortical activity: A combined ERP and fMRI study. NeuroImage 2011;57:1608–1616.
  • 32. Cerullo MA, Adler CM, Lamy M, et al. Differential brain activation during response inhibition in bipolar and attention‐deficit hyperactivity disorders. Early Interv Psychiatry 2009;3:189–197.
  • 33. Phillips ML, Swartz HA. A critical appraisal of neuroimaging studies of bipolar disorder: Toward a new conceptualization of underlying neural circuitry and a road map for future research. Am J Psychiatry 2014;171:829–843.
  • 34. Townsend J, Altshuler LL. Emotion processing and regulation in bipolar disorder: A review. Bipolar Disord 2012;14:326–339.
  • 35. Linke J, King AV, Rietschel M, et al. Increased medial orbitofrontal and amygdala activation: Evidence for a systems‐level endophenotype of bipolar I disorder. Am J Psychiatry 2012;169:316–325.
  • 36. Nusslock R, Almeida JR, Forbes EE, et al. Waiting to win: Elevated striatal and orbitofrontal cortical activity during reward anticipation in euthymic bipolar disorder adults. Bipolar Disord 2012;14:249–260.
  • 37. Haber SN, Kim KS, Mailly P, Calzavara R. Reward‐related cortical inputs define a large striatal region in primates that interface with associative cortical connections, providing a substrate for incentive‐based learning. J Neurosci 2006;26:8368–8376.
  • 38. Morecraft RJ, Van Hoesen GW. Convergence of limbic input to the cingulate motor cortex in the rhesus monkey. Brain Res Bull 1998;45:209–232.
  • 39. Holroyd CB, Coles MGH. The neural basis of human error processing: Reinforcement learning, dopamine, and the error‐related negativity. Psychol Rev 2002;109:679.
  • 40. Marco‐Pallares J, Muller SV, Munte TF. Learning by doing: An fMRI study of feedback‐related brain activations. NeuroReport 2007;18:1423–1426.
  • 41. Mason L, Trujillo‐Barreto NJ, Bentall RP, El‐Deredy W. Attentional bias predicts increased reward salience and risk taking in bipolar disorder. Biol Psychiatry 2015;79:311–319.
  • 42. Corral‐Frias N, Pizzagalli D, Carre J, et al. COMT Val158Met genotype is associated with reward learning: A replication study and meta‐analysis. Genes Brain Behav 2016;15:503–513.
  • 43. Lancaster T, Heerey E, Mantripragada K, Linden D. Replication study implicates COMT val158met polymorphism as a modulator of probabilistic reward learning. Genes Brain Behav 2015;14:486–492.
  • 44. Perry W, Minassian A, Paulus MP, et al. A reverse‐translational study of dysfunctional exploration in psychiatric disorders: From mice to men. Arch Gen Psychiatry 2009;66:1072–1080.
  • 45. van Enkhuizen J, Henry BL, Minassian A, et al. Reduced dopamine transporter functioning induces high‐reward risk‐preference consistent with bipolar disorder. Neuropsychopharmacology 2014;39:3112–3122.
  • 46. Burdick KE, Braga RJ, Gopin CB, Malhotra AK. Dopaminergic influences on emotional decision making in euthymic bipolar patients. Neuropsychopharmacology 2014;39:274–282.
  • 47. Mason L, O'Sullivan N, Montaldi D, Bentall RP, El‐Deredy W. Decision‐making and trait impulsivity in bipolar disorder are associated with reduced prefrontal regulation of striatal reward valuation. Brain 2014;137:2346–2355.
  • 48. Strakowski SM, Fleck DE, DelBello MP, et al. Impulsivity across the course of bipolar disorder. Bipolar Disord 2010;12:285–297.
  • 49. Swann AC, Lijffijt M, Lane SD, Steinberg JL, Moeller FG. Increased trait‐like impulsivity and course of illness in bipolar disorder. Bipolar Disord 2009;11:280–288.
  • 50. Wessa M, Kollmann B, Linke J, Schonfelder S, Kanske P. Increased impulsivity as a vulnerability marker for bipolar disorder: Evidence from self‐report and experimental measures in two high‐risk populations. J Affect Disord 2015;178:18–24.
  • 51. Janes AC, Pedrelli P, Whitton AE, et al. Reward responsiveness varies by smoking status in women with a history of major depressive disorder. Neuropsychopharmacology 2015;40:1940–1946.
  • 52. Pergadia ML, Der‐Avakian A, D'Souza MS, et al. Association between nicotine withdrawal and reward responsiveness in humans and rats. JAMA Psychiatry 2014;71:1238–1245.

Articles from CNS Neuroscience & Therapeutics are provided here courtesy of Wiley