Author manuscript; available in PMC: 2022 Dec 1.
Published in final edited form as: Eur Neuropsychopharmacol. 2021 Sep 10;53:89–100. doi: 10.1016/j.euroneuro.2021.08.002

Differential reinforcement learning responses to positive and negative information in unmedicated individuals with depression

Jenna M Reinen a, Alexis E Whitton b,c, Diego A Pizzagalli b, Mark Slifstein d,e, Anissa Abi-Dargham d,e, Patrick J McGrath f, Dan V Iosifescu g,h,i, Franklin R Schneier d,f,*
PMCID: PMC8633147  NIHMSID: NIHMS1745095  PMID: 34517334

Abstract

Major depressive disorder (MDD) is characterized by behavioral and neural abnormalities in processing both rewarding and aversive stimuli, which may impact motivational and affective symptoms. Learning paradigms have been used to assess reinforcement encoding abnormalities in MDD and their association with dysfunctional incentive-based behavior, but how the valence and context of information modulate this learning is not well understood. To address these gaps, we examined responses to positive and negative reinforcement across multiple temporal phases of information processing. While undergoing functional magnetic resonance imaging (fMRI), 47 participants (23 unmedicated, predominantly medication-naïve participants with MDD and 24 demographically-matched HC participants) completed a probabilistic, feedback-based reinforcement learning task that allowed us to separate neural activation during motor response (choice) from reinforcement feedback and monetary outcome across two independent conditions: pursuing gains and avoiding losses. In the gain condition, MDD participants showed overall blunted learning responses (prediction error) in the dorsal striatum when receiving monetary outcome, and reduced responses in ventral striatum for positive, but not negative, prediction error. The MDD group showed enhanced sensitivity to negative information, and symptom severity was associated with better behavioral performance in the loss condition. These findings suggest that striatal responses during learning are abnormal in individuals with MDD but vary with the valence of information.

Keywords: Major Depressive Disorder, Reinforcement, Learning, Reward, Ventral Striatum, Putamen

1. Introduction

Major depressive disorder (MDD) is characterized by motivational and affective symptoms that are highly disabling and predict poor treatment response (Treadway and Zald, 2010). Evidence suggests that these motivational disturbances, particularly anhedonia, are related to hyposensitive responses to rewarding experiences (Pizzagalli et al., 2005) and abnormalities in the neural circuitry supporting reinforcement processing (Nestler and Carlezon, 2006; Whitton et al., 2015). This blunting putatively contributes to deficits in outcome valuation, incentive salience, and effort calculation, which then manifest as behavioral and affective disturbances characterized by diminished reward pursuit and suboptimal action selection. To connect these processes, reinforcement learning paradigms have been used to examine the mechanisms underlying the ability to encode and maintain information during goal-directed behavior. In depression, converging findings indicate that responses in cortico-striatal regions critical for feedback-driven learning (Bartra et al., 2013; Daw et al., 2006; Haber and Knutson, 2009) are blunted when learning from rewarding feedback (Admon and Pizzagalli, 2015; Bakker et al., 2018; Kumar et al., 2018, 2008; Robinson et al., 2012; Steele et al., 2007). Although inconsistencies have been observed (e.g., Rutledge et al., 2017; Moutoussis et al., 2018), these are likely due to key aspects of the paradigms used, stage of learning assessed, and other contributing factors. For instance, not all paradigms allow reward anticipation- and outcome-related processes to be examined separately, despite evidence showing that these distinct learning stages draw on distinct neurochemical and functionally interconnected brain systems that are impacted in different ways in depression (Barch et al., 2015; Berridge and Robinson, 1998; Satterthwaite et al., 2015). 
Furthermore, depression-related deficits in these processes may be either enhanced or minimized by the time-course and situational demands imposed by the paradigm, or by the characteristics of the depression sample selected (e.g., symptom profile and/or medication use) (Heller et al., 2009; McCabe et al., 2010; Robinson et al., 2012).

Depression has also been associated with abnormal (often enhanced) responses to negative stimuli, and this has been linked to symptom severity (Gotlib et al., 2004; Hamilton and Gotlib, 2008). Learning studies have generally supported this, demonstrating hypersensitive, or less-impaired, behavioral and neural responses to negative feedback (Chase et al., 2010; Johnston et al., 2015; Ubl et al., 2015) alongside hyposensitive responses to positive feedback (for reviews, see Eshel and Roiser 2010; Whitton et al., 2015). While it remains unknown whether altered responses to negative stimuli in depression may be attributed to the contextual framing of affective information (Eshel and Roiser, 2010), comorbidity with anxiety, or global affective processing abnormalities (Bylsma et al., 2008), negative affect (including anxiety) and anhedonia often co-occur in MDD (Watson et al., 1995). Accordingly, gaining a better understanding of reinforcement processing in MDD is crucial for understanding maladaptive processes that may bias attention, learning, and memory towards negative (Doré et al., 2018; Eshel and Roiser, 2010) and away from positive information.

To elucidate the mechanisms supporting positive and negative information processing, reinforcement learning models estimate state-specific choice values derived from prior experience. The metric critical for updating behavior is the prediction error (PE), which signifies whether information received was better (positive PE) or worse (negative PE) than expected. Prior studies have implicated the striatum and its cortical connections in learning from positive and negative PEs (Hart et al., 2014), with ventral and anterior regions associated with processing positive information (Bartra et al., 2013), and dorsal and posterior regions implicated in processing negative or aversive information (Seymour et al., 2007). Although they overlap, striatal subdivisions are also posited to support distinct aspects of learning. Ventral regions and their midbrain dopaminergic inputs facilitate updating value predictions by evaluating experienced reward relative to expectations (Haber and Knutson, 2009; O’Doherty et al., 2004). Dorsal striatal regions may maintain action-reinforcement associations by connecting salient stimuli to appropriate approach or avoidance motor responses (Bartra et al., 2013; O’Doherty et al., 2004; Seymour et al., 2007). Consequently, localizing PE abnormalities in MDD may aid in linking regional distinction of PE deficits with symptoms related to value-based cognition and motivated action, respectively.

To address inconsistencies in the literature, including variability across motivational context (gain/loss conditions), PE valence, striatal localization, medication, and substance use, we administered a reinforcement learning task in a group of treatment-naïve individuals with MDD and a demographically-matched healthy control group (HCs) with functional magnetic resonance imaging (fMRI). To enhance precision with respect to identifying distinct incentive-related abnormalities, we specifically examined PE learning signals when participants first received feedback (“correct” or “incorrect”) about their choice, and then received a subsequent monetary gain or loss. Motivational context (Reinen et al., 2014) was varied using two conditions, one aimed at accumulating monetary gains, and another aimed at avoiding losses. We examined responses to positive, negative, and general (positive and negative) PEs in ventral and dorsal striatum, as well as at a whole-brain level, across learning stages, and according to motivational demands of the learning condition. Given prior findings of reward-related blunting in MDD (Kumar et al., 2018; Robinson et al., 2012), we expected that striatal learning signals would be reduced in MDD when learning to accumulate gains but relatively enhanced when learning to avoid losses, and we aimed to explore this disparity both at the level of discrete trials and across broader motivational context (experimental condition for seeking gain or avoiding loss). Further, we expected that learning behavior and striatal response would show a relationship with severity of motivational symptoms, including anhedonia and anxiety.

2. Experimental procedures

2.1. Participants

The data reported here were part of a multimodal, longitudinal study examining the neural bases of MDD, anhedonia, and symptom response to treatment. MDD participants were recruited from outpatient research clinics at the New York State Psychiatric Institute and Mount Sinai Medical Center, and HC participants were recruited from the community through online notices. At baseline, participants completed eligibility assessment, a battery of clinical ratings, functional imaging, and the reinforcement learning task described here. Findings from a PET study assessing capacity for dopamine release, followed by 6 weeks of clinical assessments during treatment with pramipexole, are reported elsewhere (see Fig. S1 in Schneier et al., 2018; Whitton et al., 2020). Study procedures were approved by the Institutional Review Boards of both institutions, and all participants provided informed consent prior to study procedures.

Fifty-two participants (26 MDD/26 HC, ages 18-50) were recruited to complete the functional imaging study. Only two of the MDD participants had ever taken psychotropic medication (both for < 2 weeks, and > 5 years prior to study participation). Two participants did not complete scanning, two were non-compliant, and for one participant there were scan-related technical issues (only behavioral data used). In total, the behavioral analyses included a sample of 24 HC participants and 24 MDD participants; scanning analyses included 24 HC participants and 23 MDD participants.

Diagnoses were assessed by psychiatric interview and confirmed by trained clinicians using the Structured Clinical Interview for the DSM-IV (First et al., 1996). Eligibility screening included blood/urine testing including urine toxicology, an electrocardiogram, and medical history and examination. Female participants were not pregnant, nursing, postmenopausal, or using hormonal contraception methods. MDD participants were experiencing a major depressive episode without psychotic features and had a Hamilton Rating Scale for Depression (Hamilton, 1986) 17-item total score of 17-28. They had no lifetime diagnosis of psychotic, attention-deficit, bipolar, or substance-use disorders (including alcohol and nicotine); no nicotine or illicit drug use in the past 3 months; no active suicidal ideation; and no family history of schizophrenia. MDD participants also had no more than two weeks of lifetime treatment with psychotropic medication (n = 22 had no lifetime psychotropic medication). HC participants were matched for age, sex, and race/ethnicity, and had no lifetime psychiatric disorders. Demographics, clinical characteristics, and clinical rating scales used are reported in Table 1 and in the Supplementary materials.

Table 1.

Participant Demographics. Mean and standard deviation (in parentheses) for group demographics and clinical symptoms are presented for each group. Asterisk denotes significant difference between groups based on two-sample t-tests (p < 0.05). Statistics are reported as Chi-square tests (for n) or t-statistics (for continuous metrics).

| Participant Demographics | HCs (n = 24) | MDD (n = 24) | Statistic | P-Value |
|---|---|---|---|---|
| Years of Education, M (SD) | 15.38 (1.50) | 14.54 (1.47) | 1.943 | 0.058 |
| Age at Consent in Years, M (SD) | 26.89 (5.50) | 26.58 (6.40) | 0.175 | 0.862 |
| Handedness (Edinburgh), M (SD) | 63.13 (46.54) | 61.88 (39.72) | 0.100 | 0.921 |
| NAART IQ, M (SD) | 111.19 (8.33) | 111.56 (8.09) | −0.152 | 0.88 |
| Sex, n (%) | | | <0.001 | >0.999 |
| Female | 12 (50) | 12 (50) | | |
| Male | 12 (50) | 12 (50) | | |
| Race, n (%) | | | 0.188 | 0.98 |
| Asian | 3 (12.5) | 3 (12.5) | | |
| African American | 4 (16.7) | 5 (20.8) | | |
| Other | 7 (29) | 6 (25) | | |
| Ethnicity, n (%) | | | 0.375 | 0.54 |
| Hispanic | 7 (29) | 9 (37.5) | | |
| Non-Hispanic | 17 (70.8) | 15 (62.5) | | |
| Hamilton Depression 17 Total*, M (SD) | 0.17 (0.38) | 20.08 (2.69) | −35.977 | <0.001 |
| Mood and Anxiety Questionnaire (MASQ) Anhedonic Depression*, M (SD) | 37.67 (9.71) | 81.96 (10.24) | −15.376 | <0.001 |
| Mood and Anxiety Questionnaire (MASQ) Anxious Arousal*, M (SD) | 18.92 (2.62) | 25.54 (7.75) | −3.967 | <0.001 |
| Mood and Anxiety Questionnaire (MASQ) General Distress Depressive*, M (SD) | 14.21 (2.81) | 39.21 (10.71) | −11.061 | <0.001 |
| Mood and Anxiety Questionnaire (MASQ) Total*, M (SD) | 83.75 (13.53) | 170.46 (25.80) | −14.582 | <0.001 |
| Apathy Evaluation Rating Scale (AES)*, M (SD) | 23.83 (4.98) | 41.09 (8.99) | −8.222 | <0.001 |
| Snaith Hamilton Pleasure Scale (SHAPS) Franken Total*, M (SD) | 20.33 (8.98) | 31.83 (6.72) | −5.024 | <0.001 |
| Temporal Experience of Pleasure Scale (TEPS) Anticipatory Total*, M (SD) | 48.96 (5.29) | 36.25 (8.25) | 6.355 | <0.001 |
| Temporal Experience of Pleasure Scale (TEPS) Consummatory Total*, M (SD) | 38.38 (7.35) | 30.38 (7.76) | 3.668 | 0.001 |
| MDD Age of Onset in Years, M (SD) | NA | 17.77 (7.02) | | |

2.2. Behavioral paradigm and learning model

Participants completed a probabilistic, feedback-based reinforcement learning task (Reinen et al., 2014) designed to separate choice, feedback, and outcome events while undergoing fMRI. Studies have shown that this task elicits responses in the striatum, prefrontal cortex, and limbic regions while participants actively learn, motivated separately by gains or losses (Insel et al., 2014; Reinen et al., 2016). The task structure was specifically designed to separate motor responses, which recruit striatal resources, from responses during the cognitive processes of anticipatory and consummatory learning, and to be analogous across gain and loss contexts, allowing us to compare motivational contexts of approach versus avoidance. In other words, the best loss avoidance ($0 loss) in the loss condition was comparable to the best win ($1) in the gain condition (Fig. 1A–D). Participants completed two counterbalanced phases (60 trials) of non-intermixed conditions: a gain condition, incentivized by earning money, and a loss condition, incentivized by avoiding losing money from an endowment. Through trial and error, participants first made a choice between two stimuli and next received stochastically-delivered verbal feedback (“correct” or “incorrect”, 70/30 contingency), which probabilistically predicted a subsequent monetary outcome (Fig. 1A–D). Although “correct” feedback was associated with a higher expected-value outcome than “incorrect” feedback, the exact magnitude of the reward on each trial was unpredictable, thereby generating a PE at both feedback and reward presentation. This allowed us to examine PEs both when individuals received feedback (reward anticipation) and at the temporally separate receipt of the reward outcome (consummatory stage).
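The trial structure described above can be sketched as a short simulation. This is a minimal illustration rather than the task code: the function name `simulate_trial`, the assumption that the non-optimal stimulus yields “correct” feedback on 30% of trials, and every dollar amount other than the best outcomes ($1 gain, $0 loss) are hypothetical choices made for the example.

```python
import random

# Hypothetical outcome magnitudes (assumptions, except that the best gain is $1
# and the best loss-avoidance outcome is $0, as stated in the text).
OUTCOMES = {
    "gain": {"correct": (1.00, 0.50), "incorrect": (0.50, 0.00)},
    "loss": {"correct": (0.00, -0.50), "incorrect": (-0.50, -1.00)},
}

def simulate_trial(chose_optimal, condition, rng):
    """Simulate one trial: choice -> verbal feedback -> monetary outcome."""
    # 70/30 contingency: assume the optimal stimulus yields "correct" on 70%
    # of trials and the other stimulus on 30% of trials.
    p_correct = 0.7 if chose_optimal else 0.3
    feedback = "correct" if rng.random() < p_correct else "incorrect"
    # 50% uncertainty about which of two magnitudes follows the feedback.
    high, low = OUTCOMES[condition][feedback]
    outcome = high if rng.random() < 0.5 else low
    return feedback, outcome

rng = random.Random(0)
trials = [simulate_trial(True, "gain", rng) for _ in range(60)]
```

Because the magnitude is uncertain even after feedback is known, a learner in this sketch can be surprised twice per trial, once at feedback and once at outcome.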

Fig. 1.

Reinforcement Learning Paradigm. Each participant completed a probabilistic reward-based learning task in two contexts, (A) incentivized by gaining money, and (B) incentivized by avoiding losing money. Condition order and stimuli were counterbalanced. (C) There was a 70/30 probabilistic contingency that linked the choice to verbal feedback. Though expected value was higher with “correct” feedback and lower with “incorrect” feedback, there was a (D) 50% uncertainty of reward receipt magnitude.

To assess behavior and compute parametric regressors for the imaging analysis, we implemented a variant of a basic Q-learning model used in multiple established paradigms (Daw, 2011). This model was adapted for this two-stage task and was utilized to acquire standard estimates of behavioral reinforcement learning parameters for each condition and participant based on a similar approach used previously with this same task (Reinen et al., 2014, 2016). The participant-specific values generated from this approach included learning rates and inverse temperature/beta, similar to a weight by which Q-values are related to participant choices (see Supplementary materials and the Supplementary data for Reinen et al., 2016, Reinforcement Learning Model Analyses, for details). From these estimates, group metrics were assessed and used to generate participant- and trial-specific learning signal prediction error (PE) regressors for the fMRI general linear model (GLM)-based analyses.
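As a rough illustration of the quantities named above, the following sketch shows a generic delta-rule Q-value update and a two-option softmax choice rule. It does not reproduce the two-stage adaptation used in the study, which is specified in the Supplementary materials; the function names and the parameter values (`alpha`, `beta`) are hypothetical.

```python
import math

def softmax_choice_prob(q_a, q_b, beta):
    """Probability of choosing option A given two Q-values and inverse temperature beta."""
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))

def q_update(q, reward, alpha):
    """Delta-rule update: returns (new Q-value, prediction error)."""
    pe = reward - q  # positive if better than expected, negative if worse
    return q + alpha * pe, pe

# Illustrative values only (hypothetical, not fitted parameters from the study).
alpha, beta = 0.3, 3.0
q = {"A": 0.0, "B": 0.0}
q["A"], pe = q_update(q["A"], reward=1.0, alpha=alpha)
p_a = softmax_choice_prob(q["A"], q["B"], beta)
```

A larger `beta` makes choices track the Q-value difference more deterministically, while `alpha` sets how strongly each prediction error shifts the stored value.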

Additional exploratory models were run to further assess behavior, including assessing learning rates from positive or negative feedback and sensitivity to the valence of PE (positive versus negative reward outcomes). These models were repeated to explore how the parameters varied by participant group as well as by patient symptom severity. These model specifications, alternative approaches, and parameter estimation are detailed in the Supplementary materials. Other behavioral metrics, including optimal choice (the proportion of time a participant chose the more frequently rewarded shape) and reaction time (RT) were analyzed for group and condition effects.

Several alternative imaging models were explored to ensure consistency of results across model complexity and parameter generation approach, and to confirm that the relevant conclusions from the neuroimaging results converged across these approaches. We compared across regressor generation and fMRI contrast analyses for (1) maximum likelihood and Bayesian regressor estimates (Supplementary materials, Fig. S3); (2) model-based regressors and qualitative regressors, including correct and incorrect outcomes for feedback and high and low rewards at outcome (Supplementary materials, Fig. S4); (3) various two-stage models used in prior tasks (see Supplementary materials, Fig. S5); and (4) various correction thresholds (see Supplementary materials, Fig. S6). Convergent findings are outlined and discussed in detail in the Supplementary materials, section on Convergence Across Alternative fMRI Models.

2.3. Scanning data acquisition and analysis

All scans were performed at NY State Psychiatric Institute on a GE Signa 3T scanner (Milwaukee, WI) using a Nova 32 channel head coil. Participants viewed images projected on a screen while in a supine position, and used a hand-held trackball to respond during the task. T1-weighted structural images (1 mm isotropic voxels, 200 slices, FOV = 25.6, flip angle = 12, TI = 450) and whole-brain functional EPI images (TR = 2000 ms, TE = 28 ms, flip angle = 77°, FOV = 19.2, 3 mm isotropic voxels, 42 slices) were acquired in 178 volumes, in six runs of 20 trials each. The first five volumes were discarded to allow for magnetic stabilization. Group differences in motion estimates were assessed based on the individual mean absolute displacement of raw measurements.

Functional images were preprocessed and analyzed with SPM8 (Wellcome Department of Imaging Neuroscience, London, UK) and NeuroElf (http://neuroelf.net/). Functional images were first slice-time corrected and realigned to the first volume of each run to correct for motion. Images were then warped to the Montreal Neurological Institute template and smoothed with a 6 mm Gaussian kernel. Data were forced to single precision to decrease the impact of rounding errors. Signal-to-noise ratio, motion, and alignment were examined for each participant (see Supplementary materials). After preprocessing, we implemented a first-level statistical analysis using a standard GLM. The model included five stick function regressors of interest per condition (choice, feedback, outcome), and two trial-specific parametric regressors (feedback PE and monetary reward PE) that were convolved with the hemodynamic response. Learning rates and beta parameters for the model-based fMRI analyses were estimated using a Bayesian framework based on a variant of a two-stage reinforcement learning model (Daw et al., 2011; Sutton and Barto, 1998) described above and in the Supplementary materials. As in prior analyses of this task, a high-pass temporal filter and motion parameters were incorporated into the model as regressors of no interest (Reinen et al., 2016). Gain and loss condition learning signals were examined as separate regressors in the same model. Additional GLM-based analyses were conducted using specifications similar to those detailed above. First, we examined positive and negative PEs separately. PE valence was taken from the model, according to whether an outcome was better or worse than expected.
To improve power, available regression beta values for feedback and outcome events were collapsed across each other producing a model with five stick-function regressors of interest per condition (choice, positive outcomes, negative outcomes), and trial-specific parametric regressors that were convolved with the hemodynamic responses (choice value, positive PE, and negative PE).
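The construction of a trial-specific parametric regressor can be sketched as follows. This is a simplified illustration, not the SPM8 implementation: the double-gamma shape parameters, the mean-centering of the modulator, sampling at TR resolution, and the onsets and PE values are all assumptions made for the example.

```python
import math

TR = 2.0  # seconds, matching the acquisition described above

def double_gamma_hrf(t):
    """Double-gamma HRF shape with an early peak and late undershoot (assumed parameters)."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0

def parametric_regressor(onsets_s, modulators, n_scans, tr=TR):
    """Sticks at event onsets, scaled by mean-centered modulators, convolved with the HRF."""
    mean_mod = sum(modulators) / len(modulators)
    sticks = [0.0] * n_scans
    for onset, m in zip(onsets_s, modulators):
        sticks[int(round(onset / tr))] += m - mean_mod
    hrf = [double_gamma_hrf(k * tr) for k in range(int(32 / tr))]
    reg = [0.0] * n_scans
    for i, s in enumerate(sticks):
        if s == 0.0:
            continue
        for k, h in enumerate(hrf):
            if i + k < n_scans:
                reg[i + k] += s * h
    return reg

# Hypothetical onsets (s) and trial-wise prediction errors, for illustration only.
pe_regressor = parametric_regressor([10, 30, 50], [0.7, -0.3, 0.2], n_scans=60)
```

The resulting time series rises after events with above-average PE and dips after events with below-average PE, which is what lets the GLM estimate how strongly a voxel tracks the learning signal.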

For both GLMs, we first addressed our hypotheses within specific regions. Responses were extracted from a priori-defined regions of interest (ROIs) identified by automated functional imaging meta-analysis (Yarkoni et al., 2011) using the terms “ventral striatum” and “dorsal striatum”. Values were extracted from a 6-mm sphere (radius) surrounding the peak coordinates and their bilateral counterparts. ANOVAs with factors of condition and group were calculated for feedback/reward and positive/negative PE. Next, results were examined using a whole-brain approach by masking out-of-brain and white matter voxels, and then selecting voxels that passed a test for family-wise error (FWE) of p < 0.005 using AlphaSim Monte Carlo simulation. To reduce the possibility of false positives, results were validated using conservative permutation (nonparametric) tests using cluster correction of at least p < 0.02 (see Supplementary materials). For each regressor, mean global signal was treated as a covariate, but was assessed for qualitative similarity with and without global signal.
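The spherical ROI step can be illustrated with a small sketch that lists the voxel centers falling within a 6-mm radius of a peak on a 3-mm grid and mirrors the peak across the midline for its bilateral counterpart. The peak coordinate below is hypothetical, and the sketch assumes the peak lies on the voxel grid; it is not the extraction code used in the study.

```python
import itertools

def sphere_voxels(peak_mm, radius_mm=6.0, voxel_mm=3.0):
    """Return voxel centers (mm) within radius_mm of peak_mm on a voxel_mm grid."""
    steps = int(radius_mm // voxel_mm)
    offsets = [i * voxel_mm for i in range(-steps, steps + 1)]
    return [
        (peak_mm[0] + dx, peak_mm[1] + dy, peak_mm[2] + dz)
        for dx, dy, dz in itertools.product(offsets, repeat=3)
        if dx * dx + dy * dy + dz * dz <= radius_mm ** 2
    ]

def mirror_x(coord_mm):
    """Contralateral counterpart: flip the left-right (x) coordinate."""
    return (-coord_mm[0], coord_mm[1], coord_mm[2])

peak = (12.0, 9.0, -6.0)  # hypothetical peak, not a coordinate from the study
roi = sphere_voxels(peak) + sphere_voxels(mirror_x(peak))
```

Averaging the regressor estimates over the voxels in `roi` would then give one value per participant and condition for the ANOVAs.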

Finally, we addressed the primary hypothesis that learning responses are related to affective and motivational symptoms of depression. From the battery of clinical assessments, we selected two measures a priori to analyze fMRI results for possible relationship to anhedonia and anxious arousal: the Anhedonic Depression and Anxious Arousal subscales of the Mood and Anxiety Symptoms Questionnaire (MASQ; see Supplementary materials). These measures were chosen because they offered co-validated metrics for anhedonia and anxiety, and have been shown to effectively discriminate between symptom dimensions (Watson et al., 1995). Using these two subscales therefore allowed us to test whether any effects observed were specific to anhedonia or whether they also extended to other separate forms of negative affect (i.e., anxiety). Accordingly, we calculated Pearson correlations between behavioral learning metrics (optimal choice, learning rates) and symptom scores. After behavioral relationships were assessed, we extracted values based on regions defined by the functional group results in striatum and calculated correlations with participants’ anhedonia and anxiety symptom scores.
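The correlation step can be sketched with a plain implementation of the Pearson coefficient and its conversion to a t statistic. The data values below are hypothetical and serve only to show the calculation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical symptom scores and optimal-choice percentages (not study data).
symptom_scores = [70, 75, 82, 88, 93]
optimal_choice = [55.0, 60.0, 58.0, 70.0, 72.0]
r = pearson_r(symptom_scores, optimal_choice)
t_stat = r_to_t(r, len(symptom_scores))
```

For instance, a correlation of r = 0.42 in a sample of 24 corresponds to a t statistic of about 2.17 on 22 degrees of freedom.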

3. Results

3.1. Demographic and behavioral measures

Groups did not differ on demographic variables or fMRI motion estimates (all ps > 0.05; see Table 1 and Supplementary materials). The MDD group was characterized by depressive symptoms of moderate severity (Table 1). Reaction time (RT) and optimal choice (defined as the percentage of trials each participant chose the shape that was more frequently rewarded) were first assessed by block (6 blocks of 10 trials). As expected, main effects of block were observed for both RT (F1,544 = 11.46, p < 0.01; Fig. 2A) and optimal choice (F1,544 = 6.50, p < 0.01, Fig. 2B), where participants responded more quickly and chose more optimally over time. However, contrary to our hypotheses, no effect of group or condition emerged for either RT (F1,144 group = 0.36, p = 0.55; F1,144 condition = 2.08, p = 0.15) or optimal choice (F1,544 group = 2.84, p = 0.09; F1,544 condition = 0.10, p = 0.75), indicating that learning-related changes in RT and accuracy were not any weaker in the MDD group relative to the controls.
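The block-wise optimal choice metric defined above can be computed as in the following sketch, which assumes a per-trial record coded 1 when the more frequently rewarded shape was chosen. The choice sequence is hypothetical.

```python
def optimal_choice_by_block(chose_optimal, block_size=10):
    """Percentage of optimal choices in consecutive blocks of block_size trials."""
    blocks = [
        chose_optimal[i:i + block_size]
        for i in range(0, len(chose_optimal), block_size)
    ]
    return [100.0 * sum(b) / len(b) for b in blocks]

# Hypothetical 60-trial record (1 = chose the more frequently rewarded shape).
choices = [0, 1] * 15 + [1] * 30
by_block = optimal_choice_by_block(choices)
# by_block == [50.0, 50.0, 50.0, 100.0, 100.0, 100.0]
```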

Fig. 2.

Behavioral Results. (A) Participants’ reaction time to choices became shorter as they progressed across blocks (10 trials/block) in both conditions. (B) Likewise, participants improved performance as reflected in the percentage of time they chose the correct (optimal) shape, or the one most associated with a rewarding outcome. (C) Model-based analyses indicate there is a group effect greater than 0 based on negative feedback PE valence (top panel; grp=group, con=condition). (D) A correlation between anhedonic depression severity and loss performance was observed in the MDD group (r = 0.42, p = 0.04). All group labels are consistent with colors in the top row legend.

Standard reinforcement learning metrics (Daw et al., 2011; Reinen et al., 2016) were used to assess beta, learning rate, and parameters examining the effect of positive or negative information on outcomes (for details, see Supplementary materials and Table S1). Results revealed a group effect for the parameter that captured the impact of negative feedback information on feedback, where this parameter was greater in the MDD group relative to the control group (P = 0.02) across conditions. This suggests that negative feedback may have a greater impact in updating learned values in individuals with MDD. Similarly, this negative feedback effect also varied positively with MASQ anhedonic and anxious arousal symptoms (both Ps = 0.01), as was observed in the model using symptoms to examine parameter variability instead of a group label. Additional findings, including group effects of beta and a trending effect of reduced influence of positive valence in the gain condition for the MDD group, are discussed in the Supplementary materials.

3.2. Group differences in overall (Positive and negative) PE response

3.2.1. Overall PE response in regions of interest (ROIs)

Our first analysis of PE used a condition (gain, loss) x group (HC, MDD) ANOVA to assess PE response in each of the a priori defined ROIs (Fig. 3A). ANOVAs were conducted for PE at feedback and PE at outcome in right and left striatum. For PE at feedback, we identified a main effect of condition in the left and right dorsal striatum (left: F1,90 = 4.29, p = 0.04; right: F1,90 = 7.7, p < 0.01). For PE at outcome, a main effect of condition (F1,90 = 5.79, p = 0.018, see Supplementary Fig. S2), as well as a main effect of group (F1,90 = 4.08, p = 0.046) emerged for the right dorsal striatum, and there was also a nonsignificant trend for group in the left dorsal striatum (F1,90 = 3.35, p = 0.07). Furthermore, for PE at outcome, there was a nonsignificant trend for a group-by-condition interaction of outcome PE in right dorsal striatum (F1,90 = 3.37, p = 0.069). Other effects of group were not significant (all ps > 0.05). Given these findings, we conducted post hoc t-tests to examine group differences in dorsal striatum PEs, in the gain and loss conditions separately. We found significant group differences for outcome PE only in the gain condition in the right (HC > MDD, t45 = 2.47, p = 0.017) and left dorsal striatum (HC > MDD, t45 = 2.11, p = 0.04), indicating greater response in these regions in controls (all other ps > 0.05 for feedback PE and for the loss condition).
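The post hoc comparisons above report 45 degrees of freedom, which is what a pooled-variance two-sample t-test gives for groups of 24 and 23 participants. The sketch below shows that calculation under the assumption that a pooled-variance (Student's) test was used; the data values are hypothetical.

```python
import math

def two_sample_t(a, b):
    """Student's t (pooled variance) and degrees of freedom for two independent groups."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, df

# Hypothetical ROI response estimates per participant (not study data).
hc = [0.9, 1.1, 1.4, 0.8, 1.2]
mdd = [0.5, 0.7, 0.9, 0.4]
t_stat, df = two_sample_t(hc, mdd)
```

A positive `t_stat` here corresponds to the HC > MDD direction reported in the text.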

Fig. 3.

Learning Signal Differences for HCs > MDD. (A) Regions of interest defined by automated meta-analysis (neurosynth.org) using terms “ventral” (teal) and “dorsal” (yellow) striatum. (B) Whole-brain corrected results (FWE corrected p < 0.005) for HC > MDD for prediction error learning signals during presentation of monetary outcome in the gain condition. (C) Results for HC > MDD for positive PE across feedback and outcome in the gain condition. No significant results were found for the loss condition or negative PE in the striatum.

3.2.2. Overall PE response across the whole-brain

We then evaluated whole-brain group differences for HC > MDD participants in response to overall (positive and negative) PE learning signals. For the gain condition, significant findings survived cluster correction (p < 0.005) only for the outcome PE, and greater response in HC was evident in bilateral thalamus (peak = 18, −24,12, k = 354, tmax = 4.6), extending to regions of right dorsal striatum (peak = 15,12,6, k = 36, tmax = 3.29; Fig. 3B), right inferior frontal gyrus (peak = 57,18, −6, k = 474, tmax = 4.51) extending to BA 11 (peak = 9,18, −21, k = 27, tmax = 3.52), and medial culmen (peak = 9, −42, −3, k = 338, tmax = 5.18). Notably, differences in the gain condition for HC > MDD were also identified in all alternative imaging models in dorsal striatum (see Supplementary materials and Figs. S3–S6), highlighting the robustness of these findings. In the loss condition, notably, no significant group differences were identified in the striatum, but we did find blunted feedback PE responses in MDD in the anterior cingulate extending to medial PFC (peak = 3,39,3, k = 114, tmax = 4.97), and enhanced outcome PE responses in MDD in the left cerebellum (peak = −21, −81, −36, k = 273, tmax = −4.43).

3.3. Group differences in positive and negative PE response

3.3.1. Group differences in positive and negative PE in regions of interest (ROIs)

We next examined positive versus negative PE response across all trial events. In a similar approach to prior analyses, we first used ANOVA with factors of condition, group, and valence to assess positive and negative PE response in a priori defined ROIs (Fig. 3A). A significant group-by-condition interaction was found for positive PE valence in left ventral (F1,180 = 5.92, p = 0.015) and dorsal striatum (F1,180 = 6.7, p = 0.01). Post hoc t-tests revealed significant group differences (HC > MDD), with greater response in HC only for positive PE valence in the gain condition in both regions (left ventral: t45 = 2.76, p < 0.01; left dorsal: t45 = 2.33, p = 0.03). No main effects of group or interactions emerged for the right striatum or for negative PE (all F1,180 < 2.99, all ps > 0.09).

3.3.2. Group differences in positive and negative PE across the whole-brain

Whole-brain group differences (HC > MDD) likewise revealed significant differences in the striatum only for positive PE in the gain condition. Here, we identified group differences indicating relatively enhanced responses in HC in a cluster in the medial prefrontal cortex (peak = −9,57,6, k = 434, tmax = 4.54) extending to left ventral striatum (peak = −15,9,12, k = 30, tmax = 3.81; Fig. 3C). MDD patients showed increased responses relative to HC for negative PE in the gain condition in the right cerebellum (peak = 18, −78, −33, k = 105, tmax = −4.05), and in the left parahippocampal gyrus (peak = −30, −3, −27, k = 141, tmax = −5.15) extending to cerebellum (peak = 18, −33, −12, k = 113, tmax = −4.76), and medial temporal gyrus (peak = 54,0, −27, k = 128, tmax = −4.97) for positive PE in the loss condition.

3.4. Learning signals and clinical symptoms

Pearson correlations were used to test the hypothesis that learning behavior is related to the clinical symptoms of MDD. First, we examined the correlation between learning rate (alpha) and optimal choice in the MDD group as they related to MASQ scores for anhedonic depression and anxious arousal. While symptoms were not correlated with learning rate (all ps > 0.05), increasing severity of anhedonia was associated with increases in optimal choice in the loss condition (r = 0.42, p = 0.04; Fig. 2D). Subsequent exploratory analyses were used to relate clinical symptoms to the imaging findings described above. We evaluated relationships between learning signals in functionally defined ROIs and MASQ scores. We extracted responses from participants based on group effects identified in the peak striatum cluster (peak = 15,12,6, Fig. 3B), and evaluated the correlation between the response in this region and self-reported levels of anhedonia and anxiety. In MDD participants, negative PE response in the left ventral striatum in the gain condition showed a trend correlation with severity of anxiety (r = 0.53, p = 0.009; corrected p = 0.07; all other ps > 0.05, see Supplementary materials for additional analyses).

4. Discussion

The present findings provide evidence that in unmedicated individuals with MDD, striatal learning signals are blunted in response to positive information. This was observed more generally in the gain condition at outcome and in responses time-locked to positive PEs at both feedback and outcome. Relative to HC participants, MDD participants showed reduced overall PE responses in the dorsal striatum when incentivized by rewards but not losses, and in more ventral regions in response to positive, but not negative, PE (note, however, that the group-by-condition interaction was not significant, limiting the specificity of the findings). In MDD, we did not find evidence for loss learning deficits in the striatum, and instead identified increased behavioral sensitivity to negative PE learning signals. Critically, the relationship between learning and anhedonia depended on motivational context, such that MDD participants with more severe motivational symptoms performed better at learning to avoid losses but showed greater blunting of striatal PE responses to positive outcomes. These results add to the growing body of literature supporting a reward processing deficit in MDD (Admon and Pizzagalli, 2015), and reveal that this deficit may differ based on the valence and motivational framing of learned content.
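The distinction between positive and negative PE can be illustrated with a generic delta-rule update of the textbook kind (Sutton and Barto, 1998). The Python sketch below is not necessarily the model fitted in this study. The function names, the learning rate of 0.5, and the outcome sequence are illustrative assumptions.

```python
def delta_rule_update(value, outcome, alpha):
    """One delta-rule step: return (prediction_error, updated_value)."""
    # PE is positive when the outcome is better than expected, negative when worse.
    prediction_error = outcome - value
    new_value = value + alpha * prediction_error
    return prediction_error, new_value


def run_trials(outcomes, alpha, initial_value=0.0):
    """Apply the update across a sequence of outcomes and collect PEs by sign."""
    value = initial_value
    positive_pes, negative_pes = [], []
    for outcome in outcomes:
        pe, value = delta_rule_update(value, outcome, alpha)
        if pe > 0:
            positive_pes.append(pe)
        elif pe < 0:
            negative_pes.append(pe)
    return value, positive_pes, negative_pes


# Hypothetical gain-condition outcomes (1 = reward delivered, 0 = reward omitted).
final_value, pos, neg = run_trials([1, 1, 0, 1], alpha=0.5)
```

In a loss-framed version of the same sketch, outcomes might be coded as 0 (loss avoided) and −1 (loss incurred), which is again an assumption about coding. Under that coding, avoiding an expected loss yields a positive PE, so the sign of the PE and the framing of the condition are separable.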

Although substantial evidence associates depression and depressive symptoms with reward processing abnormalities in the striatum (Admon et al., 2015; Admon and Pizzagalli, 2015; Heller et al., 2009; Pizzagalli et al., 2009; Satterthwaite et al., 2015), the current literature contains mixed evidence for striatal PE integrity in MDD (Gradin et al., 2011; Greenberg et al., 2015; Kumar et al., 2018; Moutoussis et al., 2018; Rutledge et al., 2017; Ubl et al., 2015). Inconsistent findings illustrate the importance of heterogeneity within MDD samples, as well as in the design and analysis of feedback processing studies. These differences include not accounting for the pharmacological impact on reward responses in medicated samples (McCabe et al., 2010), variability in the framing of task choices, which has long been known to influence choice behavior (Tom et al., 2007; Tversky and Kahneman, 1981), and restricting analyses to specific anatomical subdivisions of the striatum. We and others studying MDD have identified differences in PEs in the caudate and putamen (Insel et al., 2019; Kumar et al., 2018; Robinson et al., 2012), which lie dorsal to the regions typically selected as regions of interest for reward PE. Responses in dorsal striatum have been linked to stimulus-action encoding (Haruno and Kawato, 2006) and to forming the associations necessary to act upon learned value-based cues. It is therefore possible that our findings, which focused on affective symptoms, may reflect abnormalities in linking positive outcomes to motor system responses, which may be a mechanism implicated in biotypes of depression characterized by avolition and fatigue.

We also found evidence that motivational framing is important. Task structure was essentially identical across conditions, yet MDD-related attenuation in the striatum was only significantly different from HC participants in the gain condition, which involved learning to accumulate rewards. We did not identify group differences in PE response in the loss condition, although we and others have demonstrated enhanced or intact loss learning in behavior and imaging, with some proposing loss hypersensitivity to be related to anxious subtypes of depression (Eshel and Roiser, 2010; Henriques et al., 1994; Henriques and Davidson, 2000; Pizzagalli et al., 2011; Robinson et al., 2012; Ubl et al., 2015). In the present MDD cohort, we found anhedonia was related to better loss learning performance and negative PEs were related to anxious arousal, demonstrating an association between sensitivity to negative outcomes and symptom severity. This suggests that the affective symptoms of MDD may arise from multidimensional influences, including abnormal reward encoding and relatively preserved aversion responses. Collectively, these may coalesce to highlight information about negative outcomes, ultimately contributing to abnormal motivational processing and imbalanced approach and avoidance behavior.

The MDD group showed evidence of blunted striatal activation during the presentation of monetary reward outcome. A substantial body of literature has differentiated the underlying neural mechanisms of reward anticipation versus consummation (Berridge and Robinson, 1998). The task design allowed us to distinguish between reinforcement events, and the findings do suggest that blunted striatal responses occur at the time of consumption, but several caveats exist. First, this does not negate the possibility that group differences in PE response also occur at feedback or anticipation, as these were observable using region-specific correction and for positive PE when we collapsed across feedback and outcome events. Second, this paradigm does not enable us to discern between affective and perceptual processing, given that they are inseparable during the reward outcome event. Future studies using different task paradigms should seek to examine the neural processing systems in learning stages independently in order to understand their relationship to depressive symptoms.

Although this study hypothesized response differences in the striatum, the MDD group’s deficits in PEs in the gain condition extended to thalamus, posterior cingulate, orbitofrontal cortex, cerebellum, and temporal gyrus, which have been implicated in value processing for both positive and negative outcomes (Bartra et al., 2013; Garrison et al., 2013). Notably, MDD participants also showed marked differences in responses to positive PE, or better-than-expected information, based on condition. In the gain condition, MDD participants showed blunted positive PE response in medial PFC, striatum, parahippocampal gyrus, and cingulate (Supplementary materials, Table S4). However, in the loss condition, MDD participants showed enhanced responses in parahippocampal regions and temporal gyrus (Supplementary materials, Table S4). This finding supports the growing body of evidence showing that, in addition to having blunted reward processing, individuals with depression may experience negative stimuli as being relatively salient.

A major strength of this study was the assessment of an unmedicated and predominantly medication-naïve MDD sample (91.3% medication-naïve), which allowed us to infer that our findings were not modulated by pharmacological effects (McCabe et al., 2010). However, due to the costly nature of multimodal imaging (fMRI, PET), the study was limited by sample size, and the young age of this sample limits generalizability to other age groups. Furthermore, several of our observed effects fell to trend level when we corrected for multiple testing, indicating that the findings require replication. Although we were able to identify several clinically meaningful correlational relationships, future studies would benefit from examining learning effects in a large sample across a spectrum of symptom-based subgroups (Drysdale et al., 2017). A second consideration is that we did not observe the predicted overall group difference in learning from gains in the MDD group that has been reported in prior studies (Pizzagalli et al., 2008; Robinson et al., 2012; Vrieze et al., 2013), although we did observe some conditional effects and dimensional associations between learning and symptom severity (see the Supplementary materials, p5, and Supplementary Table S1). This means that while the present imaging results showed reduced response to positive information, the behavioral findings in this respect are inconclusive. The absence of these group and condition differences may be due to the simple nature of the reinforcement learning task used. Future studies may wish to use more difficult learning contingencies, including dynamic or reversal contingencies, to extract nuances of value updating behavior.

In summary, the present results demonstrate an MDD-related deficit in positive information processing in the striatum, and abnormal negative information processing in the cortex. Variability in anhedonic symptoms revealed a multi-dimensional pattern, relating both to improved behavioral performance in the loss condition and to blunted responses in the striatum in the gain condition. Collectively, these findings underscore the importance of affective valence in modulating learning signals in depression, and highlight the dorsal striatum as an emerging potential biomarker linked to the motivational symptoms of MDD.

Acknowledgements

The authors would like to thank Jochen Weber (Columbia University) for consulting on neuroimaging analyses.

Role of the funding source

This study was supported by Award Number R01 MH1099322 (awarded to Dr. Schneier) from the National Institute of Mental Health (NIMH). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIMH or National Institutes of Health.

Dr. Pizzagalli was partially supported by R37 MH068376 and R01 MH101521.

Dr. Whitton was partially supported by R01 MH1099322 and by grant number APP1110773 from the National Health and Medical Research Council.

None of the funding sources provided any further role in the study design, collection, analysis, interpretation of data, writing of the report, or in the decision to submit the paper for publication.

Author disclosures

Over the past 3 years, Dr. Pizzagalli has received consulting fees from Albright Stonebridge Group, BlackThorn Therapeutics, Boehringer Ingelheim, Compass Pathways, Concert Pharmaceuticals, Engrail Therapeutics, Otsuka Pharmaceuticals, and Takeda Pharmaceuticals; one honorarium from Alkermes; and research funding from NIMH, Dana Foundation, Brain and Behavior Research Foundation, and Millennium Pharmaceuticals. In addition, he has received stock options from BlackThorn Therapeutics. With the exception of NIMH, no funding from these entities was used to support the current work. Dr. Slifstein has provided consultation for Curasen Therapeutics and Neurocrine and has stock options in Storm Biosciences Inc. Dr. Abi-Dargham received consulting fees and/or honoraria from Sunovion, Otsuka, Merck, and Neurocrine. She holds stock options in Systems 1 Bio and in Terran Biosciences. In the past 5 years, Dr. Iosifescu was a consultant for Alkermes, Axsome, Allergan, Biogen, Centers of Psychiatric Excellence, MyndAnalytics, Jazz Pharmaceuticals, Lundbeck, Otsuka, Precision Neuroscience, Sage, and Sunovion; he has received research support (through his academic institution) from Alkermes, Astra Zeneca, Brainsway, Litecure, Neosync, Roche, and Shire. Dr. Schneier has received research support from Forest Laboratories, Otsuka Pharmaceuticals, and Compass Pathways, and consulting fees from Feelmore Rx. No other disclosures were reported.

Footnotes

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.euroneuro.2021.08.002.

References

  1. Admon R, Nickerson LD, Dillon DG, Holmes AJ, Bogdan R, Kumar P, Dougherty DD, Iosifescu DV, Mischoulon D, Fava M, Pizzagalli DA, 2015. Dissociable cortico-striatal connectivity abnormalities in major depression in response to monetary gains and penalties. Psychol. Med 45, 121–131. doi: 10.1017/S0033291714001123.
  2. Admon R, Pizzagalli DA, 2015. Dysfunctional reward processing in depression. Curr. Opin. Psychol 4, 114–118. doi: 10.1016/J.COPSYC.2014.12.011.
  3. Bakker JM, Goossens L, Kumar P, Lange IMJ, Michielse S, Schruers K, Bastiaansen JA, Lieverse R, Marcelis M, van Amelsvoort T, van Os J, Myin-Germeys I, Pizzagalli DA, Wichers M, 2018. From laboratory to life: associating brain reward processing with real-life motivated behaviour and symptoms of depression in non-help-seeking young adults. Psychol. Med 1–11. doi: 10.1017/S0033291718003446.
  4. Barch DM, Pagliaccio D, Luking K, 2015. Mechanisms underlying motivational deficits in psychopathology: similarities and differences in depression and schizophrenia. In: Simpson EH, Balsam P (Eds.), Behavioral Neuroscience of Motivation. Current Topics in Behavioral Neurosciences, 27. Springer, Cham, pp. 411–450. doi: 10.1007/7854_2015_376.
  5. Bartra O, McGuire JT, Kable JW, 2013. The valuation system: a coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. Neuroimage 76, 412–427. doi: 10.1016/j.neuroimage.2013.02.063.
  6. Berridge KC, Robinson TE, 1998. What is the role of dopamine in reward: hedonic impact, reward learning, or incentive salience? Brain Res. Rev 28, 309–369. doi: 10.1016/S0165-0173(98)00019-8.
  7. Bylsma LM, Morris BH, Rottenberg J, 2008. A meta-analysis of emotional reactivity in major depressive disorder. Clin. Psychol. Rev 28, 676–691. doi: 10.1016/j.cpr.2007.10.001.
  8. Chase HW, Frank MJ, Michael A, Bullmore ET, Sahakian BJ, Robbins TW, 2010. Approach and avoidance learning in patients with major depression and healthy controls: relation to anhedonia. Psychol. Med 40, 433–440. doi: 10.1017/S0033291709990468.
  9. Daw N, 2011. Trial-by-trial data analysis using computational models. In: Delgado M, Phelps E, Robbins T (Eds.), Decision Making, Affect, and Learning. Oxford University Press, New York, pp. 3–38.
  10. Daw ND, Gershman SJ, Seymour B, Dayan P, Dolan RJ, 2011. Model-based influences on humans’ choices and striatal prediction errors. Neuron 69, 1204–1215. doi: 10.1016/j.neuron.2011.02.027.
  11. Daw ND, O’Doherty JP, Dayan P, Seymour B, Dolan RJ, 2006. Cortical substrates for exploratory decisions in humans. Nature 441, 876–879. doi: 10.1038/nature04766.
  12. Doré BP, Rodrik O, Boccagno C, Hubbard A, Weber J, Stanley B, Oquendo MA, Miller JM, Sublette ME, Mann JJ, Ochsner KN, 2018. Negative autobiographical memory in depression reflects elevated amygdala-hippocampal reactivity and hippocampally-associated emotion regulation. Biol. Psychiatry Cognit. Neurosci. Neuroimaging 3, 358–366. doi: 10.1016/J.BPSC.2018.01.002.
  13. Drysdale AT, Grosenick L, Downar J, Dunlop K, Mansouri F, Meng Y, Fetcho RN, Zebley B, Oathes DJ, Etkin A, Schatzberg AF, Sudheimer K, Keller J, Mayberg HS, Gunning FM, Alexopoulos GS, Fox MD, Pascual-Leone A, Voss HU, Casey B, Dubin MJ, Liston C, 2017. Resting-state connectivity biomarkers define neurophysiological subtypes of depression. Nat. Med 23, 28–38. doi: 10.1038/nm.4246.
  14. Eshel N, Roiser JP, 2010. Reward and punishment processing in depression. Biol. Psychiatry 68, 118–124. doi: 10.1016/j.biopsych.2010.01.027.
  15. First MB, Spitzer RL, Gibbon M, Williams J, 1996. Structured Clinical Interview for DSM-IV Axis I Disorders – Patient Edition (SCID-I/P, Version 2.0). Biometrics Research Department, New York State Psychiatric Institute, New York.
  16. Garrison J, Erdeniz B, Done J, 2013. Prediction error in reinforcement learning: a meta-analysis of neuroimaging studies. Neurosci. Biobehav. Rev 37, 1297–1310. doi: 10.1016/J.NEUBIOREV.2013.03.023.
  17. Gotlib IH, Krasnoperova E, Yue DN, Joormann J, 2004. Attentional biases for negative interpersonal stimuli in clinical depression. J. Abnorm. Psychol 113, 127–135. doi: 10.1037/0021-843X.113.1.127.
  18. Gradin VB, Kumar P, Waiter G, Ahearn T, Stickle C, Milders M, Reid I, Hall J, Steele JD, 2011. Expected value and prediction error abnormalities in depression and schizophrenia. Brain 134, 1751–1764. doi: 10.1093/brain/awr059.
  19. Greenberg T, Chase HW, Almeida JR, Stiffler R, Zevallos CR, Aslam HA, Deckersbach T, Weyandt S, Cooper C, Toups M, Carmody T, Kurian B, Peltier S, Adams P, McInnis MG, Oquendo MA, McGrath PJ, Fava M, Weissman M, Parsey R, Trivedi MH, Phillips ML, 2015. Moderation of the relationship between reward expectancy and prediction error-related ventral striatal reactivity by anhedonia in unmedicated major depressive disorder: findings from the EMBARC study. Am. J. Psychiatry 172, 881–891. doi: 10.1176/appi.ajp.2015.14050594.
  20. Haber SN, Knutson B, 2009. The reward circuit: linking primate anatomy and human imaging. Neuropsychopharmacology 35, 4–26. doi: 10.1038/npp.2009.129.
  21. Hamilton JP, Gotlib IH, 2008. Neural substrates of increased memory sensitivity for negative stimuli in major depression. Biol. Psychiatry 63, 1155–1162. doi: 10.1016/J.BIOPSYCH.2007.12.015.
  22. Hamilton M, 1986. The Hamilton Rating Scale for Depression. In: Sartorius N, Ban TA (Eds.), Assessment of Depression. Springer Berlin, Heidelberg, pp. 143–152. doi: 10.1007/978-3-642-70486-4_14.
  23. Hart AS, Rutledge RB, Glimcher PW, Phillips PEM, 2014. Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term. J. Neurosci 34, 698–704. doi: 10.1523/JNEUROSCI.2489-13.2014.
  24. Haruno M, Kawato M, 2006. Different neural correlates of reward expectation and reward expectation error in the putamen and caudate nucleus during stimulus-action-reward association learning. J. Neurophysiol 95, 948–959. doi: 10.1152/jn.00382.2005.
  25. Heller AS, Johnstone T, Shackman AJ, Light SN, Peterson MJ, Kolden GG, Kalin NH, Davidson RJ, 2009. Reduced capacity to sustain positive emotion in major depression reflects diminished maintenance of fronto-striatal brain activation. Proc. Natl. Acad. Sci 106, 22445–22450. doi: 10.1073/pnas.0910651106.
  26. Henriques JB, Davidson RJ, 2000. Decreased responsiveness to reward in depression. Cognit. Emot 14, 711–724. doi: 10.1080/02699930050117684.
  27. Henriques JB, Glowacki JM, Davidson RJ, 1994. Reward fails to alter response bias in depression. J. Abnorm. Psychol 103, 460–466. doi: 10.1037/0021-843X.103.3.460.
  28. Insel C, Glenn CR, Nock MK, Somerville LH, 2019. Aberrant striatal tracking of reward magnitude in youth with current or past-year depression. J. Abnorm. Psychol 128, 44–56. doi: 10.1037/abn0000389.
  29. Insel C, Reinen J, Weber J, Wager TD, Jarskog LF, Shohamy D, Smith EE, 2014. Antipsychotic dose modulates behavioral and neural responses to feedback during reinforcement learning in schizophrenia. Cognit. Affect. Behav. Neurosci 14, 198–201. doi: 10.3758/s13415-014-0261-3.
  30. Johnston BA, Tolomeo S, Gradin V, Christmas D, Matthews K, Steele JD, 2015. Failure of hippocampal deactivation during loss events in treatment-resistant depression. Brain 138, 2766–2776. doi: 10.1093/brain/awv177.
  31. Kumar P, Goer F, Murray L, Dillon DG, Beltzer ML, Cohen AL, Brooks NH, Pizzagalli DA, 2018. Impaired reward prediction error encoding and striatal-midbrain connectivity in depression. Neuropsychopharmacology 43, 1581–1588. doi: 10.1038/s41386-018-0032-x.
  32. Kumar P, Waiter G, Ahearn T, Milders M, Reid I, Steele JD, 2008. Abnormal temporal difference reward-learning signals in major depression. Brain 131, 2084–2093. doi: 10.1093/brain/awn136.
  33. McCabe C, Mishor Z, Cowen PJ, Harmer CJ, 2010. Diminished neural processing of aversive and rewarding stimuli during selective serotonin reuptake inhibitor treatment. Biol. Psychiatry 67, 439–445. doi: 10.1016/j.biopsych.2009.11.001.
  34. Moutoussis M, Rutledge RB, Prabhu G, Hrynkiewicz L, Lam J, Ousdal O-T, Guitart-Masip M, Fonagy P, Dolan RJ, 2018. Neural activity and fundamental learning, motivated by monetary loss and reward, are intact in mild to moderate major depressive disorder. PLoS One 13, 1–20. doi: 10.1371/journal.pone.0201451.
  35. Nestler EJ, Carlezon WA, 2006. The mesolimbic dopamine reward circuit in depression. Biol. Psychiatry 59, 1151–1159. doi: 10.1016/j.biopsych.2005.09.018.
  36. Pizzagalli DA, Holmes AJ, Dillon DG, Goetz EL, Birk JL, Bogdan R, Dougherty DD, Iosifescu DV, Rauch SL, Fava M, 2009. Reduced caudate and nucleus accumbens response to rewards in unmedicated individuals with major depressive disorder. Am. J. Psychiatry 166, 702–710. doi: 10.1176/appi.ajp.2008.08081201.
  37. Pizzagalli DA, Iosifescu D, Hallett LA, Ratner KG, Fava M, 2008. Reduced hedonic capacity in major depressive disorder: evidence from a probabilistic reward task. J. Psychiatr. Res 43, 76–87. doi: 10.1016/j.jpsychires.2008.03.001.
  38. Pizzagalli DA, Jahn AL, O’Shea JP, 2005. Toward an objective characterization of an anhedonic phenotype: a signal-detection approach. Biol. Psychiatry 57, 319–327. doi: 10.1016/j.biopsych.2004.11.026.
  39. O’Doherty J, Dayan P, Schultz J, Deichmann R, Friston K, Dolan RJ, 2004. Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science 304, 452–454. doi: 10.1126/science.1094285.
  40. Pizzagalli DA, Dillon DG, Bogdan R, Holmes A, 2011. Reward and punishment processing in the human brain: clues from affective neuroscience and implications for depression research. In: Vartanian O, Mandel D (Eds.), Neuroscience of Decision Making. Psychology Press, New York, pp. 199–220.
  41. Reinen J, Smith EE, Insel C, Kribs R, Shohamy D, Wager TD, Jarskog LF, 2014. Patients with schizophrenia are impaired when learning in the context of pursuing rewards. Schizophr. Res 152, 309–310. doi: 10.1016/j.schres.2013.11.012.
  42. Reinen JM, Van Snellenberg JX, Horga G, Abi-Dargham A, Daw ND, Shohamy D, 2016. Motivational context modulates prediction error response in schizophrenia. Schizophr. Bull 42, 1467–1475. doi: 10.1093/schbul/sbw045.
  43. Robinson OJ, Cools R, Carlisi CO, Sahakian BJ, Drevets WC, 2012. Ventral striatum response during reward and punishment reversal learning in unmedicated major depressive disorder. Am. J. Psychiatry 169, 152–159. doi: 10.1176/appi.ajp.2011.11010137.
  44. Rutledge RB, Moutoussis M, Smittenaar P, Zeidman P, Taylor T, Hrynkiewicz L, Lam J, Skandali N, Siegel JZ, Ousdal OT, Prabhu G, Dayan P, Fonagy P, Dolan RJ, 2017. Association of neural and emotional impacts of reward prediction errors with major depression. JAMA Psychiatry 74, 790–797. doi: 10.1001/jamapsychiatry.2017.1713.
  45. Satterthwaite TD, Kable JW, Vandekar L, Katchmar N, Bassett DS, Baldassano CF, Ruparel K, Elliott MA, Sheline YI, Gur RC, Gur RE, Davatzikos C, Leibenluft E, Thase ME, Wolf DH, 2015. Common and dissociable dysfunction of the reward system in bipolar and unipolar depression. Neuropsychopharmacology 40, 2258–2268. doi: 10.1038/npp.2015.75.
  46. Schneier FR, Slifstein M, Whitton AE, Pizzagalli DA, Reinen J, McGrath PJ, Iosifescu DV, Abi-Dargham A, 2018. Dopamine release in antidepressant-naive major depressive disorder: a multimodal [11C]-(+)-PHNO positron emission tomography and functional magnetic resonance imaging study. Biol. Psychiatry 84, 563–573. doi: 10.1016/J.BIOPSYCH.2018.05.014.
  47. Seymour B, Daw N, Dayan P, Singer T, Dolan R, 2007. Differential encoding of losses and gains in the human striatum. J. Neurosci 27, 4826–4831. doi: 10.1523/JNEUROSCI.0400-07.2007.
  48. Steele JD, Kumar P, Ebmeier KP, 2007. Blunted response to feedback information in depressive illness. Brain 130, 2367–2374. doi: 10.1093/brain/awm150.
  49. Sutton RS, Barto AG, 1998. Reinforcement Learning: An Introduction. The MIT Press, Cambridge, MA. ISBN 0-262-19398-1.
  50. Tom SM, Fox CR, Trepel C, Poldrack RA, 2007. The neural basis of loss aversion in decision-making under risk. Science 315 (5811), 515–518. doi: 10.1126/science.1134239.
  51. Treadway MT, Zald DH, 2010. Reconsidering anhedonia in depression: lessons from translational neuroscience. Neurosci. Biobehav. Rev 35, 537–555. doi: 10.1016/j.neubiorev.2010.06.006.
  52. Tversky A, Kahneman D, 1981. The framing of decisions and the psychology of choice. Science 211, 453–458. doi: 10.1126/science.7455683.
  53. Ubl B, Kuehner C, Kirsch P, Ruttorf M, Diener C, Flor H, 2015. Altered neural reward and loss processing and prediction error signalling in depression. Soc. Cognit. Affect. Neurosci 10, 1102–1112. doi: 10.1093/scan/nsu158.
  54. Vrieze E, Pizzagalli DA, Demyttenaere K, Hompes T, Sienaert P, de Boer P, Schmidt M, Claes S, 2013. Reduced reward learning predicts outcome in major depressive disorder. Biol. Psychiatry 73, 639–645. doi: 10.1016/j.biopsych.2012.10.014.
  55. Watson D, Weber K, Assenheimer JS, Clark LA, Strauss ME, McCormick RA, 1995. Testing a tripartite model: I. Evaluating the convergent and discriminant validity of anxiety and depression symptom scales. J. Abnorm. Psychol 104, 3–14. doi: 10.1037/0021-843X.104.1.3.
  56. Whitton A, Reinen J, Slifstein M, Ang Y, McGrath P, Iosifescu D, Abi-Dargham A, Pizzagalli D, Schneier F, 2020. Baseline reward processing and ventrostriatal dopamine function are associated with pramipexole response in depression. Brain 143, 701–710. doi: 10.1093/brain/awaa002.
  57. Whitton AE, Treadway MT, Pizzagalli DA, 2015. Reward processing dysfunction in major depression, bipolar disorder and schizophrenia. Curr. Opin. Psychiatry 28, 7–12. doi: 10.1097/YCO.0000000000000122.
  58. Yarkoni T, Poldrack RA, Nichols TE, Van Essen DC, Wager TD, 2011. Large-scale automated synthesis of human functional neuroimaging data. Nat. Methods 8, 665–670. doi: 10.1038/nmeth.1635.
