Author manuscript; available in PMC: 2023 Aug 1.
Published in final edited form as: J Psychiatr Res. 2022 Jun 15;152:175–181. doi: 10.1016/j.jpsychires.2022.06.032

Reward-based reinforcement learning is altered among individuals with a history of major depressive disorder and psychomotor retardation symptoms

Allison M Letkiewicz a,*, Amy L Cochran b,c, Vijay A Mittal a,d, Sebastian Walther e, Stewart A Shankman a,d
PMCID: PMC10185002  NIHMSID: NIHMS1895082  PMID: 35738160

Abstract

Reward-based reinforcement learning (RL) impairments are common in major depressive disorder, but it is unclear which aspects of reward-based reinforcement learning are disrupted in remitted major depression (rMDD). Given that the neurobiological substrates that implement reward-based RL are also strongly implicated in psychomotor retardation (PmR), the present study sought to test whether reward-based reinforcement learning is altered in rMDD individuals with a history of PmR. Three groups of individuals, (1) rMDD with past PmR (PmR+, N = 34), (2) rMDD without past PmR (PmR−, N = 44), and (3) healthy controls (N = 90), completed a reward-based reinforcement learning task. Computational modeling was applied to test for group differences in model-derived parameters, specifically learning rates and reward sensitivity. Compared to controls, the rMDD PmR+ group exhibited lower learning rates, but not reduced reward sensitivity. By contrast, the rMDD PmR− group did not significantly differ from controls on either of the model-derived parameters. Follow-up analyses indicated that the results were not due to current psychopathology symptoms. Results indicate that a history of PmR predicts altered reward-based reinforcement learning in rMDD. Abnormal reward-related reinforcement learning may reflect a scar of past depressive episodes that contained psychomotor symptoms, or a trait-like deficit that preceded these episodes.

1. Introduction

Major Depressive Disorder (MDD) is a highly impairing and prevalent psychiatric disorder, where at least half of individuals experience multiple depressive episodes in their lifetime (Kessler and Walters, 1998; Kessler et al., 1997). Indeed, the single best predictor of future depression is past depression (Lewinsohn et al., 1989; Tram and Cole, 2006). Identifying processes that are disrupted in individuals with remitted depression could reveal key mechanisms of depression relapse.

One potentially important process through which depression develops, is maintained, and recurs is aberrant reward-based reinforcement learning (RL). RL refers to the ability to learn which actions are more likely to lead to rewarding outcomes. Reward-based RL is impaired in current MDD (Kumar et al., 2008; Pizzagalli et al., 2005, 2008) and has been shown to predict treatment response (Vrieze et al., 2013; Whitton et al., 2020). Specifically, individuals with current MDD often fail to exhibit a bias toward more rewarding stimuli relative to controls. Computational modeling, which is increasingly being utilized to identify specific cognitive processes that are impaired in various psychopathologies (Adams et al., 2016; Huys et al., 2016), indicates that reward-based RL deficits are, in part, related to reduced reward sensitivity (for a review, see Chen et al., 2015; Huys et al., 2013).

In contrast with current MDD, reward-based RL has been studied to a lesser extent in remitted MDD (rMDD), and even fewer studies have applied computational modeling to reward-based RL in rMDD. Similar to current MDD, there is emerging evidence of reward-based RL deficits among individuals with rMDD (Pechtel et al., 2013; Whitton et al., 2016). Although a history of MDD itself may predict impaired reward-based RL, MDD is a very heterogeneous disorder (Merikangas et al., 1994; Østergaard et al., 2011), and a history of specific symptoms may be especially detrimental for reward-based RL in rMDD.

One depression symptom that may be particularly critical for reward-based RL is psychomotor retardation (PmR), which broadly describes reduced speed and amplitude in gross and fine motor activity (Schrijvers et al., 2009; Sobin and Sackeim, 1997). Although PmR has been understudied in MDD, studies of PmR are increasingly recognizing its clinical significance (Shankman et al., 2020; Walther et al., 2019). Motor symptoms not only correspond to more severe MDD (Calugi et al., 2011; Ulbricht et al., 2016) and poorer treatment response (Janzing et al., 2020; Taylor et al., 2006), but also to greater MDD recurrence (Gorwood et al., 2014). Thus, PmR may be a critical factor that relates to the development and/or recurrence of MDD.

PmR may be critical for reward-based RL due to overlapping neurocircuitry. Broadly, reward-based RL is implemented in the basal ganglia, including the dorsal striatum and ventral striatum, and regions of the prefrontal cortex (PFC; Galvan et al., 2005; O’Doherty, 2011; Pujara et al., 2016). Notably, PmR in MDD is also related to structural and functional alterations of many of these regions and related circuits (Bracht et al., 2012; Martinot et al., 2001; Naismith et al., 2002). Additionally, reward-based RL is supported by dopaminergic functioning (Samson et al., 2010), which is impaired in individuals with PmR (Martinot et al., 2001). Taken together, a history of PmR, which may be indicative of trait-like corticostriatal and/or dopaminergic dysfunction, may relate to key deficits in rMDD, including reward-based RL.

The primary goal of the present study was to identify whether reward-based RL is altered among individuals with remitted MDD with a history of PmR symptoms (rMDD PmR+) relative to (a) healthy controls and (b) rMDD without a history of PmR symptoms (rMDD PmR−). To assess reward-based RL, participants completed the Probabilistic Reward Task (PRT), and computational modeling was applied to participants’ trial-wise choice behavior. It was hypothesized that individuals with rMDD would exhibit a reduced response bias (indicated by traditional PRT measures), as well as computational evidence of impaired reward-based RL relative to controls. Additionally, it was expected that differences in reward-based RL deficits would emerge for rMDD PmR+ relative to rMDD PmR−, but not for rMDD PmR− relative to controls. Given the specific links between dopamine and (a) psychomotor functioning and (b) learning rates in previous research (e.g., Martinot et al., 2001; Pizzagalli et al., 2020), it was anticipated that reward-based RL differences for the rMDD PmR+ group would most likely emerge for the learning rate. Finally, it was expected that group differences in reward-based RL would not be accounted for by current clinical symptoms, including anhedonia.

2. Methods

2.1. Participants

Young adults (N = 504) were recruited as part of a larger family study of transdiagnostic mechanisms of internalizing psychopathology from mental health clinics and the local Chicago, Illinois community. Primary eligibility criteria included (1) age 18-30 years old and (2) having a biological sibling age 18–30 who was also interested in participating (for full method details, see Gorka et al., 2016; Weinberg et al., 2015). Participants were oversampled during recruitment for severe internalizing psychopathology symptoms using the Depression, Anxiety, and Stress Scale (Lovibond and Lovibond, 1995; see Supplement for additional details).

Individuals were selected for the present study based on lifetime MDD and PmR symptoms using the Structured Clinical Interview for DSM-5 (SCID-5; First et al., 2015). Participants were included in one of three groups: (1) rMDD PmR+, if they met criteria for past but not current MDD and had PmR symptoms during their past major depressive episode, (2) rMDD PmR−, if they met criteria for past but not current MDD and did not have PmR symptoms during their past depressive episode, and (3) controls, if they had no lifetime history of any psychiatric disorder or PmR. Individuals who met criteria for current MDD and/or had current PmR and/or agitation (which broadly describes increases in purposeless motor activity; Sobin and Sackeim, 1997) on the SCID-5 were excluded from the study. Participants were also excluded if they did not have viable PRT data (Pizzagalli et al., 2005; Whitton et al., 2016; see Supplement). The final sample included 34 participants in the rMDD PmR+ group, 44 in the rMDD PmR− group, and 90 in the control group (total N = 168).

2.2. Assessments and measures

2.2.1. Clinical diagnoses and psychomotor symptoms

Diagnoses and psychomotor symptoms were assessed via the SCID-5 (see Supplement for additional details). In contrast with traditional SCID assessment procedures, interviewers assessed all symptoms of MDD for all participants, even if the cardinal symptom(s) were not endorsed. Thus, lifetime psychomotor symptoms were characterized for all participants, allowing for the exclusion of individuals who did not meet criteria for MDD but endorsed lifetime psychomotor agitation (across groups) and/or lifetime PmR in the rMDD PmR− and control groups.

2.2.2. Inventory of Depression and Anxiety Symptoms (IDAS-II)

The IDAS-II (Watson et al., 2012) is a 99-item questionnaire that measures current symptoms of depression and anxiety. In the present study, General Depression (Well-Being items removed), Well-Being (reverse scored to capture anhedonia), Panic, and Social Anxiety were included as covariates of current internalizing symptoms. The observed internal consistencies were in the good-excellent range (General Depression: α = 0.91, Well-Being: α = 0.86, Panic: α = 0.80, and Social Anxiety: α = 0.86).

2.2.3. Personality Inventory for DSM-5 (PID-5)

The PID-5 (Krueger et al., 2012) is a 220-item measure of pathological personality traits and has strong psychometric properties (Thomas et al., 2013; Katz et al., 2018). The PID-5’s Anhedonia facet was included as a covariate to account for the trait-like tendency for individuals to experience anhedonia. The observed internal consistency was in the excellent range (α = 0.93).

2.2.4. Psychomotor speed

Condition 5 (Motor Speed) from the Delis-Kaplan Executive Function System Trail Making Test (Delis et al., 2001) assessed current psychomotor speed. During this condition participants connect empty dots while tracing over a dotted line as quickly as possible. Reaction time was used as the dependent measure.

2.2.5. Probabilistic Reward Task (PRT)

The PRT (Pizzagalli et al., 2005) is a well-validated probe of reward-based RL. The task included 200 trials across two blocks (i.e., 100 trials per block) and was administered in E-prime using code based on Pizzagalli et al. (2005). The task administration followed the instructions that were outlined by Pizzagalli and Whitton (2015, unpublished manual). During this task, participants are first presented with a jittered fixation cross, followed by two dots (“eyes”) and a vertical line (“nose”). This is followed by either a short (10 mm) or long (11 mm) horizontal line (“mouth”; see Fig. 1). Participants must select the identity of each cue by pressing a corresponding keyboard key (‘z’ or ‘/’). The cues represent a “lean” and a “rich” stimulus that are rewarded probabilistically (counterbalanced across participants), with the rich stimulus 3x more likely to deliver a reward following a correct response than the lean stimulus. During the feedback phase, participants either receive feedback indicating that they were correct and won 20 cents, or they receive no feedback.

Fig. 1.

Schematic representation of a trial from the Probabilistic Reward Task.

Traditional PRT measures that were calculated included response bias and discrimination (Pizzagalli et al., 2005; Whitton et al., 2016; Pizzagalli and Whitton, 2015, unpublished manual; see Supplement for the response bias and discrimination formulas). In addition to looking at overall response bias and discrimination, differences in the two PRT measures between blocks (Block 2-Block 1) were computed for each participant (Δresponse bias and Δdiscrimination).
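The exact response bias and discrimination formulas are given in the Supplement and are not reproduced here. As a rough illustration only, the sketch below uses the form commonly reported in the PRT literature (log b and log d with 0.5 added to every cell to avoid zero counts). The function names, the base-10 logarithm, and the 0.5 correction are assumptions and may differ from what was actually computed in this study.

```python
import math


def prt_signal_detection(rich_correct, rich_incorrect, lean_correct, lean_incorrect):
    """Return (response_bias, discrimination) from trial counts.

    Assumed form (commonly used for the PRT, not taken from this paper):
      log b = 0.5 * log10((rich_correct * lean_incorrect) / (rich_incorrect * lean_correct))
      log d = 0.5 * log10((rich_correct * lean_correct) / (rich_incorrect * lean_incorrect))
    with 0.5 added to each count so that empty cells do not cause division by zero.
    """
    rc = rich_correct + 0.5
    ri = rich_incorrect + 0.5
    lc = lean_correct + 0.5
    li = lean_incorrect + 0.5
    response_bias = 0.5 * math.log10((rc * li) / (ri * lc))
    discrimination = 0.5 * math.log10((rc * lc) / (ri * li))
    return response_bias, discrimination


def block_difference(block1_value, block2_value):
    """Change score as described in the text: Block 2 minus Block 1."""
    return block2_value - block1_value
```

Under these assumptions, equal accuracy for the rich and lean stimuli yields a response bias of zero, and a preference for the rich response yields a positive value.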

2.2.6. Computational modeling of reinforcement learning

Several models that build on the standard Rescorla-Wagner (RW) model have been applied to the PRT task in previous research (Huys et al., 2013). These models were fit to participants’ behavior, including the ‘Action’, ‘Belief’, and ‘Stimulus-Action’ models. All models took the following general form:

$$Q_{t+1,c} = Q_{t,c} + \alpha \times \delta_t,$$

where $Q_{t,c}$ is the quality, or value estimation, of cue $c$ on trial $t$, $\alpha$ is the learning rate (which takes on the same value across all trials), and $\delta_t$ is the prediction error (PE; $\rho \times \mathrm{outcome}_t - Q_{t,c}$) on trial $t$. To capture reward sensitivity, parameter $\rho$ was included in the PE term, which scales the value of the outcome, with $0 \le \rho \le 1$ and values closer to 1 reflecting greater reward sensitivity.
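A minimal sketch of the update rule above, written out for a single value estimate. The function names and the starting value of 0 are illustrative assumptions; only the update equation and the definition of the prediction error come from the text.

```python
def rw_update(q, outcome, alpha, rho):
    """One Rescorla-Wagner step with reward sensitivity.

    delta = rho * outcome - q   (prediction error)
    q_new = q + alpha * delta   (value update)
    """
    delta = rho * outcome - q
    return q + alpha * delta, delta


def value_trajectory(outcomes, alpha, rho, q0=0.0):
    """Value estimate before each trial and after the last one.

    q0 = 0.0 is an assumed starting value, not one stated in the paper.
    """
    q = q0
    trajectory = [q]
    for outcome in outcomes:
        q, _ = rw_update(q, outcome, alpha, rho)
        trajectory.append(q)
    return trajectory
```

With repeated rewards, the value estimate converges toward `rho` rather than toward 1, which is how lower reward sensitivity caps the learned value, while `alpha` controls how quickly that ceiling is approached.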

The probability of choosing key 1 (‘z’) or 2 (‘/’) was linked to participants’ expectations of reward using a softmax function. Specifically, value expectations ($Q_{t,c}$) were transformed using a softmax that captures the difference between weighted ($W_t$) value expectations for each action (choosing the ‘z’ key and choosing the ‘/’ key), with larger differences leading to an increased likelihood of selecting one action over the other:

$$p(a_t \mid s_t) = \frac{1}{1 + \exp\left[-\left(W_t(a_t, s_t) - W_t(\bar{a}_t, s_t)\right)\right]},$$

where $a_t$ reflects the selected action and $\bar{a}_t$ reflects the unselected action for the presented stimulus $s_t$ on trial $t$. The weighted value estimates that are included in the softmax function for each action, $W_t(a_t, s_t)$ and $W_t(\bar{a}_t, s_t)$, reflect a weighted combination of value estimates for each stimulus:

$$W_t(a_t, s_t) = \zeta \times Q_t(a_t, s_t) + (1 - \zeta) \times Q_t(a_t, \bar{s}_t),$$

where $\bar{s}_t$ reflects the stimulus that was not presented on trial $t$. The parameter that scales the value estimates, $\zeta$, reflects the degree to which a learner is certain of the identity of the stimulus on screen, with $0 \le \zeta \le 1$ and values closer to 1 reflecting greater certainty.

For the Stimulus-Action model, $\zeta = 1$ (i.e., a learner is certain which stimulus is the long versus short cue), thus nullifying the second term from the model, $(1 - \zeta) \times Q_t(a_t, \bar{s}_t) = 0$. For the Action model, $\zeta = 0.5$ (i.e., a learner is uncertain which stimulus is the long versus short cue) and each term, $Q_t(a_t, s_t)$ and $Q_t(a_t, \bar{s}_t)$, is weighted equally. For the Belief model, $\zeta$ is a free parameter that is estimated for each learner, which captures individual differences in certainty about stimulus identity.
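To make the role of the certainty parameter concrete, the sketch below computes the weighted values and the choice probability for the three model variants. The data layout (`q[stimulus][action]` with stimuli and actions coded 0 and 1) and all function names are assumptions made for illustration, not the authors' implementation.

```python
import math


def weighted_value(q, action, stimulus, zeta):
    """W(a, s) = zeta * Q(a, s) + (1 - zeta) * Q(a, s_bar).

    q is a 2x2 nested list indexed as q[stimulus][action] (assumed layout).
    """
    other_stimulus = 1 - stimulus
    return zeta * q[stimulus][action] + (1 - zeta) * q[other_stimulus][action]


def choice_probability(q, action, stimulus, zeta):
    """Probability of the selected action given the presented stimulus."""
    unselected = 1 - action
    diff = weighted_value(q, action, stimulus, zeta) - weighted_value(
        q, unselected, stimulus, zeta
    )
    return 1.0 / (1.0 + math.exp(-diff))


# Fixed certainty values for two of the models; the Belief model instead
# treats zeta as a free parameter estimated per learner.
FIXED_ZETA = {"stimulus_action": 1.0, "action": 0.5}
```

One consequence visible in this sketch is that with `zeta = 0.5` the choice probability no longer depends on which stimulus was shown, so the Action model describes a learner who tracks the value of each key press regardless of the cue.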

Models were fit to participants’ choice behavior using maximum likelihood estimation and Akaike Information Criterion (AIC) values were used to identify the best-fitting model.
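The fitting step can be sketched as follows. The text does not say which optimizer was used, so this example substitutes a plain grid search for the maximum likelihood step. It also assumes that only the value of the chosen action for the presented stimulus is updated on each trial, that values start at 0, and that trials are coded as `(stimulus, action, outcome)` tuples. AIC is computed with its standard definition, 2k − 2 ln L.

```python
import math
from itertools import product


def neg_log_likelihood(trials, alpha, rho, zeta):
    """Negative log-likelihood of observed choices under the general model."""
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[stimulus][action], assumed start at 0
    nll = 0.0
    for stimulus, action, outcome in trials:
        s_bar, a_bar = 1 - stimulus, 1 - action
        w_sel = zeta * q[stimulus][action] + (1 - zeta) * q[s_bar][action]
        w_unsel = zeta * q[stimulus][a_bar] + (1 - zeta) * q[s_bar][a_bar]
        p = 1.0 / (1.0 + math.exp(-(w_sel - w_unsel)))
        nll -= math.log(p)
        # Assumed update target: chosen action for the presented stimulus.
        q[stimulus][action] += alpha * (rho * outcome - q[stimulus][action])
    return nll


def aic(nll, n_params):
    """Akaike Information Criterion: 2k - 2 ln L, with ln L = -nll."""
    return 2 * n_params + 2 * nll


def fit_grid(trials, zeta_values, steps=21):
    """Grid search over alpha and rho in [0, 1] (a stand-in for an optimizer).

    Returns (nll, alpha, rho, zeta) for the best grid point.
    """
    grid = [i / (steps - 1) for i in range(steps)]
    best = None
    for alpha, rho, zeta in product(grid, grid, zeta_values):
        nll = neg_log_likelihood(trials, alpha, rho, zeta)
        if best is None or nll < best[0]:
            best = (nll, alpha, rho, zeta)
    return best
```

Under these assumptions, the Action and Stimulus-Action models would be fit with `zeta_values=[0.5]` and `[1.0]` and scored with two free parameters, while the Belief model would search over a range of `zeta` values and be scored with three, so it must improve the likelihood enough to offset the extra parameter.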

2.2.7. Analytical approach

Analyses examined the effect of remitted MDD with a history of PmR on reinforcement learning using linear mixed effects models. Specifically, we tested whether remitted MDD with a history of PmR symptoms (rMDD PmR+) differed from controls or rMDD without a history of PmR symptoms (rMDD PmR−) on RL using dummy coding with rMDD PmR+ as the reference group. We also examined whether behavior or modeling parameters differed for rMDD PmR− relative to controls.

Analyses assessed for group differences in (1) traditional PRT behavioral measures and (2) model-estimated (a) learning rates, α, and (b) reward sensitivity, ρ (from the best-fitting model). Because these parameters are typically correlated (Huys et al., 2013), follow up tests examined unique effects for the parameters. Age and gender were included as covariates, and family was included as a random effects factor to account for sibling relatedness (i.e., familiality). Follow-up analyses examined whether the inclusion of psychopathology covariates altered the results.

3. Results

3.1. Sample characteristics

As shown in Table 1, there were no significant group differences in age, F(2,165) = 2.35, p = .098, or self-identified gender, χ2(2) = 1.76, p = .415, and the sample was quite diverse. Means and standard deviations for clinical measures and psychomotor speed are also reported in Table 1. Both rMDD groups endorsed more current symptoms of general depression, social anxiety, panic, and anhedonia than controls, but critically, the two rMDD groups did not differ on any of the clinical measures. Psychomotor speed did not significantly differ across groups, F(2,165) = 2.85, p = .061.

Table 1.

Demographic, cognitive, and clinical characteristics.

| | rMDD PmR+ (n = 34) | rMDD PmR− (n = 44) | Controls (n = 90) |
|---|---|---|---|
| Age | 22.4 (3.3) | 23.3 (3.6) | 22.0 (3.2) |
| Gender | 77% female | 68% female | 64% female |
| Ethnicity | | | |
| White | 38% | 63% | 38% |
| Black/African American | 24% | 8% | 7% |
| Hispanic | 32% | 17% | 19% |
| Asian | 6% | 6% | 26% |
| Middle Eastern | | | 8% |
| Mixed Race | | 6% | 2% |
| IDAS Subscales | | | |
| Depression (Well-Being items removed)^a | 36.2 (11.3) | 35.1 (12.0) | 25.0 (7.0)^a,b |
| Well-Being^a | 25.2 (5.8) | 25.3 (7.4) | 27.32 (6.4) |
| Social Anxiety^a | 11.4 (5.4) | 10.4 (4.5) | 7.1 (1.9)^a,b |
| Panic^a | 11.9 (4.6) | 10.3 (2.6) | 9.0 (2.4)^a,b |
| PID-5 Facets | | | |
| Anhedonia^b | 0.73 (.51) | 0.71 (.69) | 0.36 (.38)^a,b |
| Current Subthreshold Psychomotor Symptoms | | | |
| Psychomotor Retardation | n = 5 (15%) | n = 0 | n = 0 |
| Psychomotor Agitation | n = 3 (9%) | n = 1 (2%) | n = 0 |
| Lifetime Subthreshold Psychomotor Symptoms | | | |
| Psychomotor Retardation | - | n = 0 | n = 0 |
| Psychomotor Agitation | n = 0 | n = 9 (21%) | n = 1 (1%) |
| Motor Speed (D-KEFS Trail Making Test, Condition 5)^c | 26.2 (12.6) | 22.2 (6.5) | 22.0 (8.3) |

Note. IDAS = Inventory of Depression and Anxiety Symptoms; PID-5 = Personality Inventory for DSM-5.

^a,b (in the Controls column) Scores significantly differed for controls relative to the rMDD groups.

^a (on row labels) rMDD PmR+: n = 34, rMDD PmR−: n = 43, Controls: n = 89.

^b (on row labels) rMDD PmR+: n = 34, rMDD PmR−: n = 43, Controls: n = 90.

^c rMDD PmR+: n = 34, rMDD PmR−: n = 42, Controls: n = 85.

Given evidence that smoking status is associated with RL performance and related parameters (Baker et al., 2020; Reynolds, 2006), we examined whether current smoking status varied across groups. There were very few current smokers across all groups (i.e., those who smoked one or more cigarettes daily; see Table 2), and, importantly, the ratio of current smokers to non-smokers within each group did not differ between groups, χ2(2) = 1.40, p = .496. Additionally, because some psychiatric medications have been shown to relate to altered RL (e.g., Kumar et al., 2008), we tested for group differences in current psychiatric medication use. As shown in Table 2, the only group difference in medication usage was for the rMDD groups relative to controls, χ2(2) = 10.72, p = .005; both rMDD groups had higher rates of current SSRI use, but SSRI rates did not differ between the rMDD groups, χ2(1) = 0.04, p = .847.

Table 2.

Rates of current psychiatric medication use and smoking status across groups.

| | rMDD PmR+ | rMDD PmR− | Controls |
|---|---|---|---|
| Medication | | | |
| Anxiolytics | 3% (n = 1) | 2% (n = 1) | 0% |
| SSRI | 12% (n = 4) | 9% (n = 4) | 0%^a,b |
| SNRI | 0% | 2% (n = 1) | 0% |
| NDRI | 3% (n = 1) | 2% (n = 1) | 0% |
| MAOI | 0% | 0% | 0% |
| Sedatives | 0% | 0% | 0% |
| Hypnotics | 0% | 0% | 0% |
| # of Medications: 0 | 88% (n = 30) | 89% (n = 39) | 100% (n = 90) |
| # of Medications: 1 | 6% (n = 2) | 9% (n = 4) | |
| # of Medications: 2 | 6% (n = 2) | 2% (n = 1) | |
| Currently smoke at least 1 cigarette daily | 3% (n = 1) | 9% (n = 4) | 4% (n = 4) |

^a rMDD PmR+ vs. Controls: t(165) = 2.80, B = .22, p = .006.

^b rMDD PmR− vs. Controls: t(165) = 2.37, B = .19, p = .019.

3.2. Traditional PRT measures

Means and standard deviations for traditional PRT measures are provided in Table 3. Although accuracy for the rich stimulus, response bias, and discrimination measures were all numerically lower and reaction times were longer for the lean and rich stimuli for the rMDD PmR+ group, no significant group differences emerged for the traditional measures.

Table 3.

Traditional measures from probabilistic reward task.

| | rMDD PmR+ | rMDD PmR− | Controls | Group Comparison |
|---|---|---|---|---|
| Accuracy | .76 (.06) | .77 (.09) | .77 (.07) | F(2,163) = 0.04, p = .957 |
| Accuracy: Lean | .71 (.09) | .72 (.11) | .71 (.12) | F(2,163) = 0.24, p = .787 |
| Accuracy: Rich | .83 (.06) | .85 (.06) | .85 (.07) | F(2,163) = 0.80, p = .451 |
| Reaction Time: Rich | 433.63 (77.4) | 418.88 (101.9) | 421.76 (82.3) | F(2,163) = 0.19, p = .827 |
| Reaction Time: Lean | 473.61 (91.1) | 446.64 (107.0) | 458.33 (105.6) | F(2,163) = 0.40, p = .671 |
| Response Bias | .15 (.15) | .17 (.16) | .21 (.20) | F(2,163) = 1.37, p = .258 |
| ΔResponse Bias | .05 (.15) | .04 (.30) | .06 (.23) | F(2,163) = 0.07, p = .937 |
| Discrimination | .55 (.14) | .58 (.21) | .59 (.20) | F(2,163) = 0.57, p = .564 |
| ΔDiscrimination | .03 (.19) | .03 (.18) | .02 (.18) | F(2,163) = 0.30, p = .742 |
| Reward Ratio (Rich vs. Lean): Block 1 | 29.4/9.8 | 29.4/9.9 | 29.4/10 | F(2,163) = 1.67, p = .191 |
| Reward Ratio (Rich vs. Lean): Block 2 | 29.6/9.7 | 29.7/9.9 | 29.8/9.8 | F(2,163) = 0.90, p = .411 |
| Valid Trials (%) | 98 (.03) | 98 (.01) | 99 (1.0) | F(2,163) = 0.51, p = .601 |

3.3. Model fit

As shown in Fig. 2A, the Action model provided the best fit based on AIC across all participants (log-likelihood values are provided in the Supplement). For all groups, the Action model provided a better fit than the Belief and Stimulus-Action models (see Fig. 2B). Not only did the Action model provide a better fit relative to other models for all groups, the model fit did not significantly differ between groups, F(2,165) = 2.38, p = .096.

Fig. 2.

A. Summed Akaike information criterion (AIC) values for the Stimulus-Action, Action, and Belief Models across participants. AIC values indicate that participants’ behavior was better captured by the Action Model than the Stimulus-Action and Belief Models. B. Average AIC values for the Action, Stimulus-Action, and Belief Models for each group. AIC values indicate that for each group, participants’ behavior was better captured by the Action Model than the Stimulus-Action and Belief Models.

3.4. Learning curves

Value estimates, which are a function of model parameters and participants’ choices, were recovered using the best-fitting (Action) model. Trial-wise changes in value estimates across the task (i.e., learning curves) are depicted for each group in Fig. 3.

Fig. 3.

Model-derived trial by trial value expectations (V) averaged across the remitted major depressive disorder (rMDD) and control groups.

3.5. Learning rate and reward sensitivity

Across all groups, the learning rate was correlated with accuracy for the rich, r(166) = 0.18, p = .023, and lean stimuli, r(166) = 0.16, p = .030, but reward sensitivity did not reach the level of significance for either stimulus, r(166) = 0.14, p = .076 and r(166) = 0.13, p = .094 (note: the parameters were transformed due to non-Gaussianity and scaled such that larger values are more positive). When included as predictors in the same model, neither parameter accounted for a unique portion of variance in accuracy, rich: B = 0.14, p = .103 and B = 0.06, p = .559, and lean: B = 0.16, p = .089 and B = 0.05, p = .661.

For the learning rate (α), there was a significant effect of Group, F(2,163) = 4.19, p = .017, which was driven by higher learning rates for controls relative to the rMDD PmR+ group, t(163) = 2.85, B = 0.27, p = .005. Although numerically lower for rMDD PmR− relative to rMDD PmR+, learning rates did not significantly differ for rMDD groups, t(163) = 1.43, B = 0.14, p = .154. Additionally, learning rates did not differ for the rMDD PmR− group relative to controls, t(163) = −1.40, B = −0.11, p = .164. Follow-up tests revealed that the difference in learning rates between rMDD PmR+ and controls remained after including psychopathology covariates (e.g., current symptoms of depression, anhedonia, social anxiety, and panic), as well as current psychiatric medication use (see Table 4).

Table 4.

Effect of group (PmR+ versus controls) on learning rate, controlling for psychopathology and current psychiatric medication use covariates.

| DV: α | PmR+ vs. Controls | Covariate |
|---|---|---|
| 1. | B = .28, p = .005 | B = −.13, p = .096 |
| 2. | B = .27, p = .009 | B = .01, p = .911 |
| 3. | B = .29, p = .008 | B = .05, p = .536 |
| 4. | B = .24, p = .029 | B = −.05, p = .589 |
| 5. | B = .28, p = .009 | B = .02, p = .769 |
| 6. | B = .28, p = .005 | B = −.02, p = .971 |
| 7. | B = .28, p = .006 | B = .04, p = .634 |

1 = Inventory of Depressive and Anxiety Symptoms (IDAS), Well-being (reverse-scored).

2 = Personality Inventory for the DSM-5, Anhedonia.

3 = IDAS, Depression.

4 = IDAS, Social Anxiety.

5 = IDAS, Panic.

6 = Current SSRI medication use.

7 = Any current psychiatric medication use.

For reward sensitivity (ρ), there was no main effect of Group, F(2,163) = 1.45, p = .238. To identify whether the difference in learning rates between the rMDD PmR+ and control groups remained after controlling for reward sensitivity (which was significantly correlated with learning rate, r[166] = −0.34), the reward sensitivity parameter was included as a covariate. The difference in learning rates between controls and the rMDD PmR+ group remained significant after controlling for reward sensitivity, B = .22, p = .023.

3.6. Alternative model fitting

To identify whether the model that provided the best fit to the data varied as a function of estimation method, model fit was also examined using Expectation-Maximization (Huys et al., 2013; emfit toolbox, available at https://github.com/mpc-ucl/emfit). This estimation method also identified the Action model as the best-fitting model (see the Supplement). Multivariate regression analyses were conducted to test for group differences in the reward sensitivity and learning rate parameters. Similar to the results for the reward sensitivity and learning rate tests above, the learning rate parameter differed for rMDD PmR+ relative to controls (p = .023), but not relative to the rMDD PmR− group (p = .281). Additionally, the control and rMDD PmR− groups did not differ in the learning rate (p = .734). The reward sensitivity parameter did not significantly differ for any of the groups.

4. Discussion

Psychomotor retardation symptom history predicted altered reward-based RL in individuals with rMDD. Whereas group differences did not reach significance for traditional PRT measures, differences emerged using more specific parameters that were identified with computational modeling. These findings add to a growing literature of atypical reward-based RL in individuals with rMDD (Pechtel et al., 2013; Whitton et al., 2016), and, furthermore, highlight a potentially important role of psychomotor retardation in reward-based RL in MDD.

Despite comparable computational model fit across groups, indicating that participants completed the task using similar learning strategies, individuals with rMDD with a history of PmR exhibited lower learning rates compared to controls. Learning rates reflect the degree to which individuals revise their expectations based on the error in their predictions. Although higher learning rates are not inherently “better,” since persistently large updates can disrupt the convergence between a learner’s internal expectations and actual stimulus-outcome probabilities, there was a trend for higher learning rates to relate to greater accuracy. This suggests that, at least in the context of this implicit learning task in which signal detection was difficult, relatively higher learning rates were beneficial (see Pizzagalli et al., 2020 for a similar finding for learning rate and PRT accuracy). In contrast with learning rates, reward sensitivity, which reflects how much individuals weigh reward-related outcomes in their expectations, did not differ between the rMDD PmR+ and control groups. None of the results changed after controlling for current symptoms of psychopathology, suggesting that effects were not due to residual current symptoms.

In contrast with rMDD with a history of PmR, rMDD individuals without a history of PmR did not differ from controls on any of the traditional or computational task measures, including learning rates. Although neither learning rates nor reward sensitivity significantly differed between the rMDD groups, the learning curves show that individuals with rMDD with a history of PmR exhibited slower initial learning of action-outcome pairings relative to rMDD without a history of PmR and controls. Taken together, results indicate that individuals with rMDD with a history of PmR, rather than rMDD broadly, exhibit altered reward-based RL.

Results are in line with previous research showing that reward-based RL is disrupted in individuals with a history of MDD, even after accounting for symptoms of perceived stress and anhedonia (Pechtel et al., 2013), the latter of which correlates with the reward sensitivity parameter (Huys et al., 2013; Whitton et al., 2020). Although current MDD and anhedonia have been related to reduced sensitivity to rewards in previous work (Huys et al., 2013), a history of MDD has been found to only weakly relate to atypical model-derived reward sensitivity (Huys et al., 2013). That reward sensitivity is not the primary contributor to reward-based RL deficits in rMDD is further supported by the present results. While this is not the first study to identify a role of other reward-based RL processes in MDD (Chen et al., 2015), to our knowledge it is the first to identify a primary role of altered learning rates in reward-based RL in individuals with rMDD who have experienced PmR.

Despite not having clinically elevated current PmR symptoms or significantly slower psychomotor speed (as indicated by D-KEFS motor speed performance and RT during the PRT), a history of PmR during a past episode of MDD predicted current alterations in reward-based RL. This suggests that difficulties in rMDD with reward-based RL are not merely the result of psychomotor slowing. As discussed above, there is evidence that corticostriatal circuitry and dopaminergic functioning that support reward-based RL are disrupted in PmR (Bracht et al., 2012; Naismith et al., 2002). Although this is only indirect evidence of these links, administration of a low dose dopaminergic D2 agonist (which has antagonistic effects on behavior) has been found to disrupt learning rates, but not reward sensitivity, during reward-based RL (Huys et al., 2013; Pizzagalli et al., 2008), a pattern that also emerged in our study. These findings are particularly relevant for precision medicine, which aims to identify specific prevention efforts and treatments for specific individuals (Fernandes et al., 2017). For example, computationally-derived PRT parameters could help identify which individuals may benefit from enhancing dopaminergic functioning to prevent MDD relapse.

There are several limitations that should be noted. First, psychomotor symptom history was based on a retrospective clinical assessment via the SCID. Importantly, though, psychomotor retardation is one of the symptoms of lifetime MDD with the strongest test-retest reliability (k = .72; Kaiser et al., 2020). This suggests that specific symptoms of lifetime MDD, including psychomotor retardation, are reported consistently over time. Another limitation is that this study is unable to clarify whether abnormal reward-based RL processing reflects a scar of past MDD episodes that contained psychomotor symptoms or a trait-like deficit that preceded episodes. In the future, it will be important to conduct prospective studies that include fine-grained assessments of psychomotor symptoms, including laboratory-based and other instrumental measures (e.g., actigraphy, an ecologically sensitive approach to assessing movement in daily life; Shankman et al., 2020). Finally, while the research questions and groupings were determined prior to implementing the study, participants were drawn from a larger study that did not initially recruit participants specifically for rMDD and PmR history.

It is also important to note that our results did not replicate a previously established finding of an altered response bias for rMDD (Pechtel et al., 2013; Whitton et al., 2016). Our results are also in slight contrast with Huys et al. (2013), who found that although both the Action and Belief models provided a good fit to performance, the Belief model slightly outperformed the Action model. By contrast, we found that the Action model provided a better fit, which reflects the weighing of previous actions (e.g., pressing key 1 or key 2) to a greater degree than beliefs about stimulus identity. This suggests that participants generally did not have a high degree of certainty about stimulus identity during cue presentation. It is plausible that task design features may have played a role in these differences. First, Huys et al. (2013) included data from six studies and all of the studies included 3 blocks of 100 trials, which was also the number of blocks included in Pechtel et al. (2013), whereas our study included 2 blocks of 100 trials (i.e., 1/3 fewer trials). Second, for all but one study in Huys et al. (2013), the difference between the mouth size of the two cues was slightly larger (13 mm vs. 11.5 mm) than the one used in the present study (11 mm vs. 10 mm). Indeed, the one study that used the same mouth sizes as our task (Bogdan and Pizzagalli, 2006) exhibited the poorest fit to the Belief model. In the future, researchers could consider assessing the role of psychomotor symptom history in rMDD using different combinations of these task features to clarify mechanisms of altered RL.

Reward-based RL is a critical cognitive process that is impaired in current and remitted depression. In the present study, there was evidence of less efficient experience-dependent reward-based RL among individuals with rMDD who have a history of psychomotor slowing. In daily life, these difficulties could contribute to and/or perpetuate disengagement from activities and interpersonal relationships. Targeting (a) residual PmR symptoms in individuals with a history of MDD and/or (b) reward-based RL in individuals with a history of PmR may help to prevent relapse into future depressive episodes.

Supplementary Material

Reward-Based RL_rMDD_and_PMR_Supplement

Acknowledgments

This work was supported, in part, by the National Institute of Mental Health grants R01 MH098093 awarded to Dr. Shankman and R01 MH118741 awarded to Drs. Shankman, Mittal, and Walther and the National Institutes of Health’s National Center for Advancing Translational Sciences grant TL1 TR001423 awarded to Dr. Letkiewicz. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Footnotes

Declaration of competing interest

None.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.jpsychires.2022.06.032.

Data availability statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

  1. Adams RA, Huys QJ, Roiser JP, 2016. Computational psychiatry: towards a mathematically informed understanding of mental illness. J. Neurol. Neurosurg. Psychiatr. 87, 53–63.
  2. Baker TE, Zeighami Y, Dagher A, Holroyd CB, 2020. Smoking decisions: altered reinforcement learning signals induced by nicotine state. Nicotine Tob. Res. 22 (2), 164–171.
  3. Bogdan R, Pizzagalli DA, 2006. Acute stress reduces reward responsiveness: implications for depression. Biol. Psychiatr. 60 (10), 1147–1154.
  4. Bracht T, Federspiel A, Schnell S, Horn H, Höfle O, Wiest R, Dierks T, Strik W, Müller TJ, Walther S, 2012. Cortico-cortical white matter motor pathway microstructure is related to psychomotor retardation in major depressive disorder. PLoS One 7 (12), e52238.
  5. Calugi S, Cassano GB, Litta A, Rucci P, Benvenuti A, Miniati M, et al., 2011. Does psychomotor retardation define a clinically relevant phenotype of unipolar depression? J. Affect. Disord. 129 (1–3), 296–300.
  6. Chen C, Takahashi T, Nakagawa S, Inoue T, Kusumi I, 2015. Reinforcement learning in depression: a review of computational research. Neurosci. Biobehav. Rev. 55, 247–267.
  7. Delis DC, Kaplan E, Kramer JH, 2001. Delis-Kaplan Executive Function System (D-KEFS). Psychological Corporation.
  8. Fernandes BS, Williams LM, Steiner J, Leboyer M, Carvalho AF, Berk M, 2017. The new field of ‘precision psychiatry’. BMC Med. 15 (1), 1–7.
  9. First MB, Williams JB, Karg RS, Spitzer RL, 2015. Structured Clinical Interview for DSM-5 Disorders, Research Version (SCID-5-RV). American Psychiatric Association, Arlington, VA.
  10. Galvan A, Hare TA, Davidson M, Spicer J, Glover G, Casey BJ, 2005. The role of ventral frontostriatal circuitry in reward-based learning in humans. J. Neurosci. 25 (38), 8650–8656.
  11. Gorka SM, Nelson BD, Phan KL, Shankman SA, 2016. Intolerance of uncertainty and insula activation during uncertain reward. Cognit. Affect. Behav. Neurosci. 16 (5), 929–939.
  12. Gorwood P, Richard-Devantoy S, Baylé F, Cléry-Melun ML, 2014. Psychomotor retardation is a scar of past depressive episodes, revealed by simple cognitive tests. Eur. Neuropsychopharmacol. 24 (10), 1630–1640.
  13. Huys QJ, Maia TV, Frank MJ, 2016. Computational psychiatry as a bridge from neuroscience to clinical applications. Nat. Neurosci. 19 (3), 404.
  14. Huys QJ, Pizzagalli DA, Bogdan R, Dayan P, 2013. Mapping anhedonia onto reinforcement learning: a behavioural meta-analysis. Biol. Mood Anxiety Disord. 3 (1), 1–16.
  15. Janzing JG, Birkenhäger TK, van den Broek WW, Breteler LM, Nolen WA, Verkes RJ, 2020. Psychomotor Retardation and the prognosis of antidepressant treatment in patients with unipolar Psychotic Depression. J. Psychiatr. Res. 130, 321–326.
  16. Kaiser AJ, Funkhouser CJ, Mittal VA, Walther S, Shankman SA, 2020. Test-retest & familial concordance of MDD symptoms. Psychiatr. Res. 292, 113313.
  17. Katz AC, Hee D, Hooker CI, Shankman SA, 2018. A family study of the DSM-5 Section III personality pathology model using the Personality Inventory for the DSM-5 (PID-5). J. Pers. Disord. 32 (6), 753–765.
  18. Kessler RC, Walters EE, 1998. Epidemiology of DSM-III-R major depression and minor depression among adolescents and young adults in the national comorbidity survey. Depress. Anxiety 7 (1), 3–14.
  19. Kessler RC, Zhao S, Blazer DG, Swartz M, 1997. Prevalence, correlates, and course of minor depression and major depression in the National Comorbidity Survey. J. Affect. Disord. 45 (1–2), 19–30.
  20. Krueger RF, Derringer J, Markon KE, Watson D, Skodol AE, 2012. Initial construction of a maladaptive personality trait model and inventory for DSM-5. Psychol. Med. 42 (9), 1879.
  21. Kumar P, Waiter G, Ahearn T, Milders M, Reid I, Steele JD, 2008. Abnormal temporal difference reward-learning signals in major depression. Brain 131 (8), 2084–2093.
  22. Lewinsohn PM, Zeiss AM, Duncan EM, 1989. Probability of relapse after recovery from an episode of depression. J. Abnorm. Psychol. 98 (2), 107–116.
  23. Lovibond PF, Lovibond SH, 1995. The structure of negative emotional states: comparison of the depression anxiety stress scales (DASS) with the beck depression and anxiety inventories. Behav. Res. Ther. 33 (3), 335–343.
  24. Martinot MLP, Bragulat V, Artiges E, Dollé F, Hinnen F, Jouvent R, Martinot JL, 2001. Decreased presynaptic dopamine function in the left caudate of depressed patients with affective flattening and psychomotor retardation. Am. J. Psychiatr. 158 (2), 314–316.
  25. Merikangas KR, Wicki W, Angst J, 1994. Heterogeneity of depression: classification of depressive subtypes by longitudinal course. Br. J. Psychiatr. 164 (3), 342–348.
  26. Naismith S, Hickie I, Ward PB, Turner K, Scott E, Little C, Mitchell P, Wilhelm K, Parker G, 2002. Caudate nucleus volumes and genetic determinants of homocysteine metabolism in the prediction of psychomotor speed in older persons with depression. Am. J. Psychiatr. 159 (12), 2096–2098.
  27. O’Doherty JP, 2011. Contributions of the ventromedial prefrontal cortex to goal-directed action selection. Ann. N. Y. Acad. Sci. 1239, 118–129.
  28. Østergaard SD, Jensen SOW, Bech P, 2011. The heterogeneity of the depressive syndrome: when numbers get serious. Acta Psychiatr. Scand. 124, 495–496.
  29. Pechtel P, Dutra SJ, Goetz EL, Pizzagalli DA, 2013. Blunted reward responsiveness in remitted depression. J. Psychiatr. Res. 47 (12), 1864–1869.
  30. Pizzagalli DA, Jahn AL, O’Shea JP, 2005. Toward an objective characterization of an anhedonic phenotype: a signal-detection approach. Biol. Psychiatr. 57 (4), 319–327.
  31. Pizzagalli DA, Iosifescu D, Hallett LA, Ratner KG, Fava M, 2008. Reduced hedonic capacity in major depressive disorder: evidence from a probabilistic reward task. J. Psychiatr. Res. 43 (1), 76–87.
  32. Pizzagalli DA, Smoski M, Ang YS, Whitton AE, Sanacora G, Mathew SJ, Nurnberger J, Lisanby SH, Iosifescu DV, Murrough JW, Yang H, Weiner RD, Calabrese JR, Goodman W, Potter WZ, Krystal AD, 2020. Selective kappa-opioid antagonism ameliorates anhedonic behavior: evidence from the fast-fail trial in mood and anxiety spectrum disorders (FAST-MAS). Neuropsychopharmacology 45 (10), 1656–1663.
  33. Pizzagalli DA, Whitton AE, 2015. Probabilistic reward task: Procedures manual for obtaining behavioral and event-related potential data. Unpublished manual that is used both nationally and internationally by researchers using the Probabilistic Reward Task (developed by Pizzagalli) to obtain behavioral measures of reward learning.
  34. Pujara MS, Philippi CL, Motzkin JC, Baskaya MK, Koenigs M, 2016. Ventromedial prefrontal cortex damage is associated with decreased ventral striatum volume and response to reward. J. Neurosci. 36 (18), 5047–5054.
  35. Reynolds B, 2006. The Experiential Discounting Task is sensitive to cigarette-smoking status and correlates with a measure of delay discounting. Behav. Pharmacol. 17 (2), 133–142.
  36. Samson RD, Frank MJ, Fellous JM, 2010. Computational models of reinforcement learning: the role of dopamine as a reward signal. Cognit. Neurodynamics 4 (2), 91–105.
  37. Schrijvers D, Van Den Eede F, Maas Y, Cosyns P, Hulstijn W, Sabbe BGC, 2009. Psychomotor functioning in chronic fatigue syndrome and major depressive disorder: a comparative study. J. Affect. Disord. 115, 46–53.
  38. Shankman SA, Mittal VA, Walther S, 2020. An examination of psychomotor disturbance in current and remitted MDD: an RDoC study. J. Psychiatr. Brain Sci. 5, e200007.
  39. Sobin C, Sackeim HA, 1997. Psychomotor symptoms of depression. Am. J. Psychiatr. 154 (1), 4–17.
  40. Taylor BP, Bruder GE, Stewart JW, McGrath PJ, Halperin J, Ehrlichman H, Quitkin FM, 2006. Psychomotor slowing as a predictor of fluoxetine nonresponse in depressed outpatients. Am. J. Psychiatr. 163 (1), 73–78.
  41. Thomas KM, Yalch MM, Krueger RF, Wright AG, Markon KE, Hopwood CJ, 2013. The convergent structure of DSM-5 personality trait facets and five-factor model trait domains. Assessment 20 (3), 308–311.
  42. Tram JM, Cole DA, 2006. A multimethod examination of the stability of depressive symptoms in childhood and adolescence. J. Abnorm. Psychol. 115 (4), 674.
  43. Ulbricht CM, Rothschild AJ, Lapane KL, 2016. Functional impairment and changes in depression subtypes for women in STAR*D: a latent transition analysis. J. Wom. Health 25 (5), 464–472.
  44. Vrieze E, Pizzagalli DA, Demyttenaere K, Hompes T, Sienaert P, de Boer P, Schmidt M, Claes S, 2013. Reduced reward learning predicts outcome in major depressive disorder. Biol. Psychiatr. 73 (7), 639–645.
  45. Walther S, Bernard JA, Mittal VA, Shankman SA, 2019. The utility of an RDoC motor domain to understand psychomotor symptoms in depression. Psychol. Med. 49 (2), 212–216.
  46. Watson D, O’Hara MW, Naragon-Gainey K, Koffel E, Chmielewski M, Kotov R, Stasik SM, Ruggero CJ, 2012. Development and validation of new anxiety and bipolar symptom scales for an expanded version of the IDAS (the IDAS-II). Assessment 19 (4), 399–420.
  47. Weinberg A, Liu H, Hajcak G, Shankman SA, 2015. Blunted neural response to rewards as a vulnerability factor for depression: results from a family study. J. Abnorm. Psychol. 124 (4), 878.
  48. Whitton AE, Kakani P, Foti D, Van’t Veer A, Haile A, Crowley DJ, Pizzagalli DA, 2016. Blunted neural responses to reward in remitted major depression: a high-density event-related potential study. Biol. Psychiatr.: Cognit. Neurosci. Neuroimag. 1 (1), 87–95.
  49. Whitton AE, Reinen JM, Slifstein M, Ang YS, McGrath PJ, Iosifescu DV, Abi-Dargham A, Pizzagalli DA, Schneier FR, 2020. Baseline reward processing and ventrostriatal dopamine function are associated with pramipexole response in depression. Brain 143 (2), 701–710.
