Abstract
Background:
Motivational deficits in people with schizophrenia (PSZ) are associated with an inability to integrate the magnitude and probability of previous outcomes. The mechanisms that underlie probability-magnitude integration deficits, however, are poorly understood. We hypothesized that increased reliance on “value-less” stimulus-response associations, in lieu of expected value (EV)-based learning, could drive probability-magnitude integration deficits in PSZ with motivational deficits.
Methods:
Healthy volunteers (n=38) and PSZ (n=49) completed a learning paradigm consisting of four stimulus pairs. Reward magnitude (3/2/1/0 points) and probability (90%/80%/20%/10%) determined each stimulus’ EV. Following a learning phase, new and familiar stimulus pairings were presented. Participants were asked to select stimuli with the highest reward value.
Results:
PSZ with high motivational deficits made increasingly less optimal choices as the difference in reward value (probability*magnitude) between two competing stimuli increased. A previously-validated computational hybrid model indicated that PSZ relied less on EV (“Q-learning”) and more on stimulus-response learning (“actor-critic”), an imbalance that correlated with SANS motivational deficit severity. PSZ specifically failed to represent reward magnitude, consistent with model demonstrations showing that response tendencies in the actor-critic were preferentially driven by reward probability.
Conclusions:
Probability-magnitude deficits in PSZ with motivational deficits arise from underutilization of EV in favor of reliance on value-less stimulus-response associations. Our computational hybrid framework confirmed that these probability-magnitude integration deficits were driven specifically by a failure to represent reward magnitude. This work provides a first mechanistic explanation of complex EV-based learning deficits in PSZ with motivational deficits that arise from an inability to combine information from different reward modalities.
Keywords: schizophrenia, reinforcement learning, dopamine, basal ganglia, orbitofrontal cortex, anhedonia
INTRODUCTION
Many people with schizophrenia (PSZ) suffer from a reduced tendency to engage in goal-directed behavior (1, 2), termed amotivation, or avolition. These deficits in motivation can contribute substantially to poor functional capacity and quality of life (3-5). One explanatory account suggests that motivational deficits result from a specific impairment in the ability to precisely represent the value of an action or choice (expected value; EV) coupled with overreliance on “value-less” stimulus-response associations (6-8). Support for this computational account, however, comes from studies using reinforcement learning (RL) paradigms in which reward probability (the chance of obtaining a reward) solely determines EV. Importantly, other evidence indicates that EV estimation deficits in PSZ (with motivational deficits) are most prominent when EV depends on the successful integration of reward magnitude (size) and probability.
One important line of evidence suggesting prominent deficits in EV representation comes from the Iowa Gambling Task (IGT; (9)), in which participants select from four card decks with varying reward magnitude and probability. Performance deficits on this task in PSZ are driven by a reduced ability to integrate long-term outcome magnitude and probability (10-12). These deficits extend to contexts in which participants choose between earning a small certain reward or gambling for a larger reward (a “framing” task) (13). When reward magnitude and probability vary continuously, impaired EV computations in PSZ with motivational deficits are primarily driven by decreased sensitivity to reward magnitude (14). In contrast, PSZ can eventually optimize task performance if they learn about reward probability (15). Thus, the available evidence suggests that probability-magnitude integration deficits in PSZ (with motivational deficits) may be driven by impaired sensitivity to reward magnitude specifically, alongside relatively spared tracking of reward probability. To date, however, the computational mechanisms associated with an inability to integrate reward probability and magnitude into the estimation of EV have never been investigated. The goal of the current study was to use computational modeling to provide a mechanistic account of probability-magnitude integration deficits in PSZ.
There are several approaches to RL in computational models, which may in turn relate to different cognitive and neural mechanisms. While all RL models treat the reward prediction error (RPE) as a critical quantity that drives learning, the way in which the RPE is calculated and used to optimize behavior differs between classes. In the Q-learning framework (16), RPEs are used to update the EV of every action (here, choosing a stimulus), and choices are then executed directly based on the relative EV estimates. Computational models and neural data suggest that the dynamic representation of EV, specifically the integration of reward magnitude and probability, crucially involves orbitofrontal cortex (OFC) (17-22). In the actor-critic framework (23), however, RPEs are not used to directly update choice EV. Instead, the actor develops action propensities in a “value-less” space; choice preferences are only indirectly related to RPEs generated by the critic. The formation of such stimulus-response associations is thought to arise, among other mechanisms, from gradual tuning of basal ganglia (BG) synaptic weights in response to dopaminergic RPEs (24, 25).
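To make the distinction concrete, below is a minimal sketch of the two update rules in simplified tabular form (illustrative only; parameter names such as alpha_q, alpha_c, and alpha_a are generic and not taken from the model reported here):

```python
def q_learning_update(q_values, chosen, reward, alpha_q):
    """Q-learning: the RPE directly updates the expected value of the chosen stimulus."""
    rpe = reward - q_values[chosen]          # reward prediction error
    q_values[chosen] += alpha_q * rpe        # EV estimate moves toward the obtained reward
    return q_values

def actor_critic_update(state_value, actor_weights, chosen, reward, alpha_c, alpha_a):
    """Actor-critic: the critic's RPE updates a state value and, only indirectly,
    a 'value-less' response propensity for the chosen stimulus."""
    rpe = reward - state_value               # critic prediction error for the stimulus pair
    state_value += alpha_c * rpe             # critic tracks the value of the pair (state)
    actor_weights[chosen] += alpha_a * rpe   # actor accumulates a response tendency
    return state_value, actor_weights
```

Because the actor weights are response propensities rather than value estimates, two options that both generate positive critic RPEs can end up with similar propensities despite different EVs, which is the scenario described next.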
Although the differences between these two model classes may seem subtle, and both algorithms generally produce adaptive learning, they can make categorically different predictions in particular scenarios. For example, because the actor relies only on choice propensities, given a novel choice between an option that had yielded gains and another that had merely avoided losses, the actor will not exhibit a preference for the higher-EV option if both had given rise to positive RPEs in the critic. Using a hybrid computational model that allows for parametric mixing between Q-learning and actor-critic learning, we have previously shown that decreased reliance on the former and a relative increase in the latter can account for decreased gain-seeking behavior and overvaluation of contextual information in PSZ with motivational deficits (6, 7), both of which result in a poor representation of EV. While we have argued previously, using this approach, that decreased reliance on EV and overutilization of stimulus-response associations provide one computational account of impaired goal-directed behavior, the ability of this framework to offer a mechanistic explanation of complex probability-magnitude integration deficits has never been assessed. Moreover, probability-magnitude integration deficits in PSZ provide a unique opportunity to assess the generalizability of the computational hybrid framework to other EV estimation deficits.
In the current study, we hypothesized that the hybrid computational framework, via a reduction in Q-learning relative to actor-critic-type learning, could account for abnormal probability-magnitude integration. First, because actor weights increase with each critic RPE, in stochastic environments they do not converge to a true expected value and are therefore overly influenced by reward frequency. Second, neural network models of OFC and basal ganglia (26) have suggested that dysfunction of the OFC can bias BG choices to be primarily driven by reward probability, whereas an intact OFC is needed to incorporate reward magnitude into EV estimates. Similar probability biases have been observed in algorithmic versions of this BG model, such as the Opponent Actor Learning (OpAL) architecture, a modified actor-critic based on physiological properties of the BG (25). To the degree that Q-learning and the actor-critic capture OFC- and BG-like properties, we thus hypothesized that such a probability bias might be captured by our hybrid computational model in terms of an overreliance on actor-critic versus Q-learning.
In the study reported here, participants were presented with an RL task in which they learned to select stimuli with the highest reward value. Pairs of stimuli differed in both reward probability and magnitude. In a subsequent transfer phase, old and novel pairs of stimuli were presented. Optimal performance in this phase of the task crucially depended on one’s ability to integrate reward probability and magnitude, which is different from more common RL paradigms, where EV is a function of reward probability only.
Through computational modeling and statistical analyses, we tested for the first time whether probability-magnitude integration deficits in PSZ (with motivational deficits) could be explained by underutilization of EV and/or overreliance on stimulus-response associations. In line with previous work (6, 7, 27), we predicted that deficits in the representation of EV should become more apparent as choices become easier. That is, we predicted that performance differences between PSZ and controls would be largest when the difference in EV between two competing stimuli was greatest (note that greater deficits with increasing difficulty would point to a more general learning impairment). Secondly, we expected that probability-magnitude integration deficits in PSZ, as well as reliance on reward probability versus magnitude in the entire sample, would correlate with the degree to which individuals relied on Q- versus actor-critic type learning. Finally, we expected probability-magnitude deficits and computational evidence thereof to correlate with the severity of motivational deficits.
METHODS AND MATERIALS
Participants
Forty-nine participants with a diagnosis of schizophrenia or schizoaffective disorder (PSZ) and thirty-eight healthy volunteers (HV) took part; the groups did not differ on age, gender, or ethnicity, among other variables (Table 1). Inclusion criteria and cognitive and clinical assessment details are reported in the Supplemental Text (including cut-off scores for the less severe motivational deficit [LMD] and more severe motivational deficit [MMD] subgroups). Written informed consent was obtained from all participants prior to the experiment.
Table 1.
Sample Demographics
| | HV (n=38) | PSZ (n=49) | t/χ² | p |
|---|---|---|---|---|
| Age | 37.16 (12.65) | 38.63 (10.58) | −.59 | .56 |
| Gender [F, M] | [13, 25] | [16, 33] | .02 | .88 |
| Race | | | | |
| African American, Caucasian, Other | [15, 21, 2] | [19, 26, 4] | .28 | .87 |
| Education level (years) | 14.94 (1.97) | 13.06 (2.07) | 4.23 | <.001 |
| Maternal education level | 14.57 (2.16) | 13.48 (2.44) | 2.10 | .04 |
| Paternal education level | 14.40 (2.90) | 13.37 (3.60) | 1.38 | .17 |
| WTAR IQ | 112.18 (10.59) | 97.47 (16.97) | 4.68 | <.001 |
| WASI IQ | 117.47 (8.67) | 100.69 (13.84) | 6.54 | <.001 |
| MATRICS Domains | | | | |
| Processing Speed | 55.26 (8.71) | 37.55 (13.21) | 7.14 | <.001 |
| Attention/Vigilance | 53.21 (7.58) | 41.29 (11.22) | 5.63 | <.001 |
| Working Memory | 50.92 (9.25) | 39.47 (10.50) | 5.31 | <.001 |
| Verbal Learning | 54.00 (9.06) | 39.53 (9.04) | 7.40 | <.001 |
| Visual Learning | 50.47 (9.88) | 36.41 (13.68) | 5.35 | <.001 |
| Reasoning | 52.37 (10.31) | 42.22 (10.37) | 4.54 | <.001 |
| Social Cognition | 53.61 (9.09) | 39.37 (10.74) | 6.55 | <.001 |
| MATRICS Overall Score | 54.32 (8.00) | 32.71 (12.55) | 9.25 | <.001 |
| Antipsychotic Medication | | | | |
| Total Haloperidol | - | 11.57 (8.26) | - | - |
| Clinical Ratings | | | | |
| BPRS Positive (sum) | - | 7.92 (3.94) | - | - |
| SANS AA/RF (sum) | - | 16.16 (8.14) | - | - |
| SANS AFB/Alog (sum) | - | 11.02 (7.83) | - | - |
Education level missing for 2 HV; Maternal education level missing for 3 HV and 2 PSZ; Paternal education level missing for 3 HV and 3 PSZ
Probability-Magnitude Integration Task
Participants completed a probabilistic stimulus selection task consisting of a Learning (160 trials) and a Test/Transfer (64 trials) Phase (Figure 1). During each Learning Phase trial, two stimuli were presented, one on either side of a fixation cross. Participants were prompted to select one stimulus by pressing either the left or right trigger on a gamepad using their left or right index finger. Each choice was followed immediately by feedback, in the form of a number of points (+3, +2, +1, or +0). The eight stimuli differed in the probability and magnitude of the expected reward. For one pair, one stimulus resulted in a 3-point win on 90% of all trials and a 0-point win on 10% of trials, whereas the other option resulted in a 3-point win on 10% of all trials and a 0-point win on 90% of trials (the “90-10/3” pair). The other three pairs were a “90-10/1” pair (1-point win 90% of the time for the optimal stimulus, 1-point win 10% of the time for the non-optimal stimulus), an “80-20/2” pair (2-point win 80% of the time for the optimal stimulus, 2-point win 20% of the time for the non-optimal stimulus), and an “80-20/1” pair (1-point win 80% of the time for the optimal stimulus, 1-point win 20% of the time for the non-optimal stimulus). All pairs were presented 40 times in pseudo-randomized order. Stimulus location (optimal stimulus on the left/right side of the screen) was randomized and counterbalanced across trials, and stimulus-value pairings were fully randomized across participants. Participants were informed that the aim of the task was to accumulate as many points as possible and that they would be rewarded for good performance. All participants received the same monetary performance bonus at the end of the experiment.
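A small sketch summarizing the resulting design and the EV (probability × magnitude) of each stimulus follows; the dictionary layout is purely illustrative:

```python
# Learning Phase design: reward probability of the optimal/non-optimal stimulus and reward magnitude
pairs = {
    "90-10/3": {"p_optimal": 0.90, "p_nonoptimal": 0.10, "magnitude": 3},
    "80-20/2": {"p_optimal": 0.80, "p_nonoptimal": 0.20, "magnitude": 2},
    "90-10/1": {"p_optimal": 0.90, "p_nonoptimal": 0.10, "magnitude": 1},
    "80-20/1": {"p_optimal": 0.80, "p_nonoptimal": 0.20, "magnitude": 1},
}

for name, d in pairs.items():
    ev_optimal = d["p_optimal"] * d["magnitude"]         # e.g. 0.90 * 3 = 2.7 points
    ev_nonoptimal = d["p_nonoptimal"] * d["magnitude"]    # e.g. 0.10 * 3 = 0.3 points
    print(f"{name}: EV optimal = {ev_optimal:.1f}, EV non-optimal = {ev_nonoptimal:.1f}")
```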
Figure 1.
Graphical Overview Of The Probability-Magnitude Integration Task
The purpose of the Test/Transfer Phase was to assess participants’ ability to combine reward probability and magnitude into a representation of EV. Participants were presented with the four familiar Learning Phase pairs (“acquisition pairs”; 4 presentations per pair) and 24 novel pairs of stimuli (2 presentations per pair) and received the following instructions: “Please choose the picture that feels like it’s worth more points based on what you have learned during the previous block”. Stimulus location was randomized across trials. Crucially, for many of these trials, the optimal answer depended on the ability to combine the expected probability and magnitude of a stimulus (e.g. 80/2 vs. 90/1, or 10/3 vs. 20/2). No performance feedback was presented during the Test/Transfer Phase.
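As a worked example of such an integration trial (the numbers follow directly from the task design above): EV = probability × magnitude, so EV(80/2) = 0.80 × 2 = 1.6 points, whereas EV(90/1) = 0.90 × 1 = 0.9 points. The 80/2 stimulus is therefore optimal despite its lower reward probability, and a learner that tracks only reward frequency would tend to prefer 90/1.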
Hybrid Computational Model
Based on I) posterior predictive simulations (using the fitted parameters to simulate performance), II) demonstrations of learning on the basis of reward magnitude and probability, and III) quantifications of model evidence, the “hybrid-probability” model emerged as the optimal model. Crucially, in the hybrid-probability model, Q-learning had access to the reward magnitude of outcomes, while the actor-critic only received binary outcomes (i.e., 0 or 1 points). Thus, in the hybrid-probability model, learning in the actor-critic was restricted to reward frequency. Due to space restrictions, we report all computational model details (including rationale, simulations, model demonstrations, fitting procedure, and model evidence) in the Supplementary Materials.
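To illustrate this asymmetry, below is a sketch of one trial update in a hybrid-probability-style model; it assumes a simple convex combination of the two softmax policies via the mixing parameter m and is not the exact implementation, which is specified in the Supplementary Materials:

```python
import numpy as np

def softmax(values, beta):
    v = beta * np.asarray(values, dtype=float)
    e = np.exp(v - v.max())
    return e / e.sum()

def hybrid_probability_trial(q_values, state_value, actor_weights, chosen, points,
                             alpha_q, alpha_c, alpha_a):
    """One illustrative trial update: Q-learning sees the points earned (0-3),
    whereas the actor-critic only sees whether any points were earned."""
    # Q-learning branch: RPE based on the full reward magnitude
    q_values[chosen] += alpha_q * (points - q_values[chosen])

    # Actor-critic branch: the outcome is binarized, so learning tracks reward frequency only
    binary_outcome = 1.0 if points > 0 else 0.0
    rpe = binary_outcome - state_value
    state_value += alpha_c * rpe
    actor_weights[chosen] += alpha_a * rpe
    return q_values, state_value, actor_weights

def choice_probabilities(q_values, actor_weights, m, beta):
    """Mixing parameter m weights EV-based (Q) against stimulus-response (actor) policies."""
    return m * softmax(q_values, beta) + (1 - m) * softmax(actor_weights, beta)
```

Because the actor-critic branch receives identical binary feedback for, say, the 90-10/3 and 90-10/1 pairs, its weights cannot distinguish reward magnitude; this is the intuition behind the model demonstrations referenced above.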
Statistical Analyses
Effects of diagnostic group (PSZ vs. HV), trial block (4 blocks of 10 trials), and choice pair (90-10/3, 80-20/2, 90-10/1, 80-20/1) on acquisition pair performance were investigated using a repeated measures analysis of variance. Between-group differences in model parameters were ascertained using univariate analysis of variance and independent sample t-tests.
In accordance with previous work (7), novel stimulus pairs in the Test/Transfer Phase were ranked by the difference in EV between two competing stimuli (EV = probability*magnitude; see Table S1 for details regarding stimulus combinations). A value difference tracking slope was computed for every participant using a logistic regression with the value difference (value left − value right) as predictor of button press (left = 1, right = 0), thereby correcting for the tendency to select one side of the screen over the other.
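A minimal sketch of this per-participant regression, using statsmodels with illustrative variable names (the published analysis pipeline may differ in its exact specification):

```python
import numpy as np
import statsmodels.api as sm

def value_difference_tracking_slope(ev_left, ev_right, chose_left):
    """Fit P(choose left) ~ intercept + (EV_left - EV_right) for a single participant.

    The coefficient on the value difference is the tracking slope; the intercept
    absorbs any tendency to favor one side of the screen."""
    value_diff = np.asarray(ev_left) - np.asarray(ev_right)
    X = sm.add_constant(value_diff)              # column of ones plus the value difference
    y = np.asarray(chose_left, dtype=float)      # 1 = left stimulus chosen, 0 = right chosen
    fit = sm.Logit(y, X).fit(disp=0)
    return fit.params[1]                         # slope on the value difference
```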
Exploratory analyses were conducted on trials matched for reward magnitude but differing in reward probability (“probability discrimination trials”; 90/1-80/1) and trials matched for reward probability but differing in reward magnitude (“magnitude discrimination trials”; 90/3-90/1 and 80/2-80/1); we also tested a group by trial-type (probability/magnitude discrimination) interaction. The latter analysis investigated whether the performance advantage conferred by learning from reward magnitude versus learning from reward probability differed between the two diagnostic groups.
Correlation analyses were conducted using Spearman coefficients (due to non-normal distributions of many variables). Key findings, including group differences in Learning Phase accuracy, the value difference tracking slope, the mixing parameter (HV vs. MMD), and correlations between the mixing parameter and Test/Transfer Phase performance in the entire sample, all survived Bonferroni correction for the number of model parameters (p=.01). Skewness and kurtosis of all parameters were within the −2/+2 bounds, and HV vs. MMD parameter differences remained when repeating analyses using a Mann-Whitney U test.
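As a point of reference, the Bonferroni threshold reported above corresponds to α = .05 divided by the five model parameters listed in Table 2, i.e., .01; a correlation of this type could be computed as in the following sketch (variable names are illustrative):

```python
from scipy.stats import spearmanr

def bonferroni_spearman(x, y, n_tests=5):
    """Spearman correlation with a Bonferroni-corrected threshold (.05 / 5 = .01 here)."""
    rho, p = spearmanr(x, y)
    return rho, p, p < 0.05 / n_tests
```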
RESULTS
Demographics
Participant groups did not differ on key demographic variables, including age, gender, race, and paternal education, although PSZ scored lower on measures of IQ and all MATRICS subdomains (Table 1). LMD and MMD did not differ on IQ (WTAR: t(47)=.35, p=.73), MATRICS performance (overall: t(47)=1.42, p=.17), antipsychotic dosage (t(47)=−1.04, p=.31), the BPRS positive symptom factor (t(47)=−1.49, p=.14), or BPRS depression scores (t(47)=−.59, p=.56).
Performance on Acquisition Pairs
In the Learning Phase, HV selected the optimal stimulus more often than PSZ overall (F(1,85)=13.41, p<.001). There was no evidence for a group-by-trial block-by-pair (F(7.22,613.92)=.44, p=.88), group-by-pair (F(2.68,228.01)=.29, p=.81), or group-by-trial block (F(2.33,197.73)=.48, p=.77) interaction. HVs outperformed PSZ on all stimulus pairs (90-10/3, t(85)=3.17, p=.002; 80-20/2, t(85)=2.90, p=.005; 90-10/1, t(85)=1.90, p=.06; 80-20/1, t(85)=2.62, p=.01; Figure 2A). Performance in block 4 (trials 31-40) was above chance in both participant groups for every pair (all p<.001). There was also a main effect of pair (F(2.69,231)=15.17, p<.001), with post-hoc comparisons suggesting that both greater reward magnitude (90-10/3 vs. 90-10/1, t(86)=4.50, p<.001; 80-20/2 vs. 80-20/1, t(86)=1.58, p=.12) and greater probability (90-10/1 vs. 80-20/1, t(86)=2.68, p=.009; 90-10/3 vs. 80-20/2, t(86)=5.08, p<.001) conferred performance improvements.
Figure 2.
Performance on Acquisition Phase Pairs. 2A Trial-by-trial (large figure) and average (small figure) Learning Phase accuracy (below “learning”), plotted next to average accuracy on acquisition phase pairs in the Test/Transfer Phase (below “transfer”). 2B Test/Transfer Phase performance on novel pairs ranked on the EV (probability*magnitude) difference between two competing stimuli, presented once for every level of value difference separately (“novel pairs”) and once as the value difference tracking slope (for LMD and MMD separately; “value difference tracking slope”). 2 LMD and 1 MMD were removed from these analyses due to limited choice variability, which produced extreme performance slopes (marked in red). *p<.05, **p<.01, ***p<.001. Small asterisks above bars represent significance against chance. Error bars in all Figures reflect 95% CI, except for Figure 2A and Figure 5A, where they represent SEM.
In the Test/Transfer Phase, HV also selected the optimal stimulus more often than PSZ (F(1,85)=10.26, p=.002). Specifically, HVs outperformed PSZ on 90-10/3 (t(85)=2.68, p=.009) and 80-20/1 (t(85)=2.51, p=.01), but not 80-20/2 (t(85)=1.34, p=.19) and 90-10/1 (t(85)=1.64, p=.10) trials (Figure 2A). Nevertheless, PSZ and HVs performed above chance on all original pairs (all p<.001), suggesting that both groups had developed a preference for the optimal stimulus.
Subgroup analyses in LMD and MMD revealed no evidence for an effect of motivational deficit severity on acquisition pair accuracy during either experiment phase (Supplemental Results; Figure S6).
Performance on Novel Transfer Pairs
The value difference tracking slope was greater for HV than PSZ (t(82)=3.34, p=.001; 3 PSZ were removed due to limited choice variability; see Figure S7 for performance on every Test/Transfer Phase pair for each diagnostic group), and this difference remained significant when not correcting for a side bias (i.e., a logistic regression investigating correct choices as a function of value difference [optimal stimulus − suboptimal stimulus]; t(82)=4.72, p<.001). In line with predictions, these data indicate that PSZ performance improved less as the difference in EV between two competing stimuli increased. Importantly, the group difference in the value difference tracking slope was driven by motivational deficit (AA subscale) severity (HVs vs. LMD, t(56)=1.78, p=.08; HVs vs. MMD, t(62)=3.80, p<.001; LMD vs. MMD, t(44)=.65, p=.10; Figure 2B). These results suggest that MMD specifically were poorer at integrating reward probability and magnitude. BPRS positive symptom scores were not associated with the value difference tracking slope (Spearman’s rho=.15, p=.32).
Focusing on selective trials matched for probability and magnitude, we observed a group-by-trial type interaction (F(1,85)=4.35, p=.04; Figure 3). Post-hoc analyses revealed that HVs performed better on magnitude discrimination than probability discrimination trials (t(37)=3.59, p=.001), while PSZ performed similarly on magnitude and probability discrimination trials (t(48)=.73, p=.47) (Figure 3). The difference between performance on magnitude and probability discrimination trials (that is, the difference between the advantage conferred by higher reward magnitude versus higher reward probability) correlated strongly with the value difference tracking slope, suggesting that participants who performed better on magnitude discrimination trials also performed better overall in the Test/Transfer Phase (Spearman’s rho=.56, p<.001).
Figure 3.
Performance on magnitude and probability discrimination trials during the Test/Transfer Phase for each diagnostic group. 3A Performance on probability (90/1-80/1) and magnitude (90/3-90/1 and 80/2-80/1) discrimination trials. 3B The difference in performance on magnitude and probability trials. *p<.05, ***p<.001. Small asterisk above bars represents significance against chance.
Computational Modeling Analyses: Avolitional PSZ overutilize stimulus-response associations
Mean hybrid-probability model parameter estimates for each diagnostic group are reported separately in Table 2, with only the mixing parameter differing between HVs and PSZ (t(85)=2.04, p=.04; see Table S2 for individual model parameters). While a group (HV vs. PSZ) difference in the mixing parameter is in line with previous work (7), it did not survive correction for the number of parameters in the model. Crucially, however, the group difference in the mixing parameter was driven by motivational deficit (AA subscale) severity (HVs vs. LMD, t(58)=.70, p=.48, Cohen’s d=.19; HVs vs. MMD, t(63)=2.69, p=.009, Cohen’s d=.67; LMD vs. MMD, t(47)=1.55, p=.13, Cohen’s d=.44; Figure 4A) and correlated with SANS total severity (Spearman’s rho=−.29, p=.04, df=48) and SANS Avolition scores specifically (Spearman’s rho=−.28, p=.05). The low mean mixing parameter in MMD (.40) suggests that MMD over-utilized actor-critic-type learning, while LMD (.57) and HVs (.64) on average relied more on Q-learning. HV and MMD also marginally differed in αc (t(63)=1.99, p=.052), suggesting that PSZ with motivational deficits updated their state value less in response to recent outcomes. BPRS positive symptom scores were not associated with the mixing parameter (Spearman’s rho=−.03, p=.82).
Table 2.
Hybrid-probability model parameters
| | HV (n=38) | PSZ (n=49) | t | p |
|---|---|---|---|---|
| Critic learning rate (αc) | .49 (.42) | .32 (.37) | 1.95 | .35 |
| Actor learning rate (αa) | .66 (.41) | .56 (.43) | 1.06 | .29 |
| Q learning rate (αQ) | .29 (.31) | .23 (.32) | .86 | .39 |
| Mixing parameter (m) | .64 (.34) | .48 (.39) | 2.04 | .04 |
| Inverse temperature (β) | .34 (.41) | .29 (.42) | .50 | .62 |
β = inverse temperature multiplied by 100
Figure 4.
Hybrid-probability model parameters for HV, LMD and MMD. **p<.01
As predicted, the mixing parameter correlated strongly with the value difference tracking slope in the entire sample, suggesting that greater reliance on Q-learning, in which choice values converge to their true expected values (16), was associated with better probability-magnitude integration (Spearman’s rho=.39, p<.001). The mixing parameter additionally correlated with the difference between performance on magnitude and probability discrimination trials (Spearman’s rho=.38, p<.001). This finding is in line with our demonstration of the actor-critic being insensitive to reward magnitude (see “Model Selection and Comparison”) and suggests that better discrimination of reward magnitude was associated with more EV-based learning.
These results suggest that increased reliance on stimulus-response associations/decreased use of EV in PSZ with motivational deficits, as demonstrated by our hybrid-probability model, is associated with impaired probability-magnitude integration and may be driven by reduced sensitivity to reward magnitude specifically.
Hybrid-probability model simulations
Hybrid-probability model Learning Phase simulations closely approximated empirical data, with predicted group differences in accuracy for 90-10/3 (t(85)=3.21, p=.002), 80-20/2 (t(85)=3.52, p<.001), and 80-20/1 (t(85)=2.28, p=.03) trials, and a similar non-significant trend for 90-10/1 (t(85)=1.61, p=.11) trials (Figure 5A). Analyses of simulated data revealed significant or near-significant between-group differences on all acquisition pair Test/Transfer trials (90-10/3, t(85)=2.32, p=.02; 80-20/2, t(85)=1.85, p=.07; 90-10/1, t(85)=2.41, p=.02; 80-20/1, t(85)=1.89, p=.06; Figure 5A; n simulations=20).
Figure 5.
Hybrid-probability model simulations. 5A Simulated performance on Acquisition Phase pairs during the Learning and Test/Transfer Phase. 5B Simulated Test/Transfer Phase performance on novel pairs presented for every value difference separately (“novel pairs”) and the value difference tracking slope (2 LMD and 1 MMD outliers marked in red). *p<.05, **p<.01, ***p<.001. Small asterisks above bars represent significance against chance.
Consistent with the empirical Test/Transfer Phase data, simulations revealed numerically greater deficits in PSZ for easier choices (i.e., a greater EV difference). Greater deficits for trials on which successful probability-magnitude integration decreases choice difficulty were also observed in PSZ relative to HV (e.g., 90/3-80/1 and 90/3-20/1 [“1.9” and “2.5” in Figure 5B]). Crucially, the hybrid-probability model predicted a numerically smaller value difference tracking slope in MMD (HV vs. LMD, t(56)=.02, p=.2.44; HV vs. MMD, t(62)=3.44, p=.001; LMD vs. MMD, t(44)=1.29, p=.20; Figure 5B). Taken together, the hybrid-probability model could account for key aspects of the Learning and Test/Transfer Phase data, as well as highly specific performance deficits in PSZ with motivational deficits.
Associations with antipsychotic dosage, model fit, IQ scores, MATRICS performance and subgroup analyses in poor performers and poorly fit subjects are reported in the Supplemental Results.
DISCUSSION
In contrast to many other RL paradigms, EV-based decision-making in the current study relied on successful integration of reward probability and magnitude. PSZ were specifically impaired on trials with greater objective EV difference between two stimuli, as evidenced by the group difference in the Test/Transfer Phase value difference tracking slope, which was primarily driven by PSZ with motivational deficits. The inability to combine reward magnitude and probability in the service of generating adaptive estimates of EV is in line with a large body of previous work, including findings of performance deficits in PSZ on the Iowa Gambling Task (10-12). Additionally, the finding of greater impairments for easier choices is consistent with a previously-reported smaller value difference tracking slope in PSZ in the context of an RL paradigm where reward probability determined stimulus EV (7). Altogether, the current work reconfirms the notion that performance deficits in PSZ increase with demands placed on putative prefrontal processes involved in EV estimation, a rather robust finding in this population (6, 7, 27, 28).
Using a previously-validated computational approach, we showed that outcome probability-magnitude integration deficits in PSZ with motivational deficits were driven primarily by increased reliance on value-less stimulus-response associations (actor-critic), in lieu of EV-based decision-making (Q-learning). Moreover, individual value difference tracking slopes correlated significantly with estimates of individual mixing parameters (which capture the balance between Q- and actor-critic-type learning), suggesting a systematic relationship between EV-based learning and probability-magnitude integration. Crucially, individual mixing parameters correlated significantly and negatively with motivational deficit severity, thereby providing formal computational modeling evidence that impaired probability-magnitude integration in PSZ with motivational deficits arises from overutilization of stimulus-response associations.
The current results are noteworthy because they provide a first mechanistic explanation of complex EV estimation deficits in PSZ (with motivational deficits) that result from an inability to combine different dimensions of reward. Moreover, this work demonstrates that EV deficits in PSZ with motivational deficits are observable even when reward value differs subtly between stimuli, and not only when stimuli from diametrically-opposed contexts (e.g., gain-seeking/loss-avoidance) are combined (6, 29). Second, selective associations with negative symptom severity suggest that probability-magnitude integration deficits may play a role in the onset of motivational deficits. A failure to appropriately combine reward magnitude and probability into a single estimate of EV may lead to a decrease in perceived reward value. Underestimation of EV may change the trade-off between reward and effort cost, in line with findings that abnormal effort-cost computations are most pronounced in avolitional PSZ in conditions with high reward value (30-32). Thus, probability-magnitude integration deficits may reduce willingness to exert effort, thereby playing a role in the onset of avolition and anhedonia.
The framework of reduced EV-based learning/increased reliance on stimulus-response associations has been used previously to explain performance deficits in PSZ (with motivational deficits), such as insensitivity to gains (6, 29) and a reduced ability to update expectations in response to positive RPEs (28, 33, 34). Gold et al. (6) also reported a selective reduction in the mixing parameter in PSZ with high motivational deficits during a gain-seeking/loss-avoidance task. More recently, we employed this computational approach to explain increased context-dependent learning in PSZ (7). Although the mixing parameter did not correlate with motivational deficit severity in Hernaus et al. (7), the critic learning rate (αc; which directly related to the magnitude of context-dependent learning) was selectively increased in individuals with more severe motivational deficits. The observation that various manifestations of disrupted EV estimation coalesce in this single computational framework, and that parameters in this framework relate to motivational deficit severity, increases its generalizability and suggests potential for the computational hybrid model as a diagnostic tool for the detection of motivational deficits.
Interestingly, outcome probability-magnitude integration deficits seemed to be driven by underutilization of reward magnitude specifically. This interpretation is supported by the between-group difference in performance improvement due to outcome magnitude versus probability increases (while holding the other constant). This contrast (Figure 3) correlated with the value difference tracking slope in the entire sample, suggesting that better performance on magnitude trials was associated with more efficient probability-magnitude integration. This observation goes hand-in-hand with evidence for increased Q-learning in HV, a framework in which magnitude-driven RPEs also update EV. Moreover, in model simulations we demonstrated that reward magnitude was not used in the formation of response tendencies in the actor-critic architecture, the framework that could account for PSZ performance. Thus, performance and modeling evidence suggest that undervaluation of reward magnitude was specifically related to performance deficits in PSZ. This notion is in line with reports on reward processing in anhedonia (35, 36) and observations in PSZ with motivational deficits during an outcome probability-magnitude integration task (14).
Speculating on the neural mechanisms involved, pre-clinical work suggests that reward magnitude is encoded by, among other regions, neurons in the basolateral amygdala and OFC (37-41). Moreover, lesions to the basolateral amygdala alter reward magnitude encoding (42) and value-based decision-making (43-45) involving OFC. In contrast, tracking of reward probability may involve the midbrain (46, 47) and ventral striatum (41, 48, 49), although the latter has also been implicated in reward magnitude processing (41, 49). The specific inability to utilize reward magnitude information, in combination with previously-reported EV deficits in PSZ, may thus point to a deficit in the basolateral amygdala-OFC projection or ventral striatum. This is consistent with ventral striatal lesions being associated with performance deficits on a stimulus selection task (as employed in the current study), but not an action learning task, in non-human primates (50). This interpretation is further supported by reduced amygdala-hippocampal RPE and value signals in PSZ (51), as well as abnormal OFC and striatum outcome- and anticipation-related signals in the psychosis spectrum (52-55), the latter occasionally tracking motivational deficit severity. Moreover, deficits in amygdala-OFC coupling have been implicated in the psychosis spectrum and are associated with symptom severity (56).
In light of abnormal striatal reward processing signals in PSZ (54), but relatively intact striatal RPE signals in medicated PSZ (57), however, performance deficits in our medicated PSZ sample could also be associated with a change in cortical top-down projections that influence Q-values represented by striatal neurons (58, 59). This view is supported by reduced OFC-striatal connectivity during effort-based decision making (60) and anterior cingulate cortex-striatal connectivity during probabilistic reversal learning (61) in PSZ with motivational deficits. A better understanding of the neural mechanisms associated with probability-magnitude integration could be achieved by disambiguating the neural signals associated with reward magnitude, reward probability, and probability-magnitude integration in PSZ. Regardless of the exact neural mechanism involved, a reduction in RPE signals specifically, and reward processing signals more generally, in individuals with motivational deficits may be associated with a reduction in phasic DA release (62).
In conclusion, we provide formal evidence for the notion that a failure to integrate reward probability and magnitude in PSZ with motivational deficits is associated with overreliance on the learning of value-less stimulus-response associations. Performance improvements associated with increases in reward magnitude, in combination with analyses using our computational hybrid model, suggest that such deficits may be specifically associated with an impairment in the ability to precisely and adaptively represent reward magnitude. The results presented in this manuscript add to the generalizability of the computational hybrid model in capturing a broad range of EV estimation deficits in PSZ with motivational deficits.
LIMITATIONS
Only a small number of trials were available to directly investigate probability and magnitude processing. Moreover, magnitude and probability were not fully orthogonalized in this experiment, slightly reducing the total number of trials that could be used to study reward probability. Importantly, however, the key aim of this study was to study probability-magnitude integration deficits in an RL context, rather than specific deficits in reward magnitude or probability processing. Moreover, a selective deficit in learning from reward magnitude is in line with previous work (14, 36), and was predicted and demonstrated by our computational modeling framework. As outlined previously (7), there is a possibility that the observed deficits in PSZ with motivational deficits may reflect a decrease in model-based processing. One advantage to using Q-learning to capture deficits in performance is that this algorithm does not require one to assume that participants rely on any model-based expectations. Future tasks should focus on teasing apart model-based predictions from EV- (Q-) based learning in order to provide new insights into RL deficits in PSZ. Finally, our sample consisted of chronic, medicated PSZ. While antipsychotic drug doses were not associated with any outcome measures, a follow-up investigation in antipsychotic-naive PSZ could provide more information on symptom-specificity.
Supplementary Material
Acknowledgements
We thank Benjamin M Robinson for their contributions to the task design and data collection.
FINANCIAL DISCLOSURE
This work was supported by the NIMH (Grant No. MH80066 to JMG). JAW, JMG, and MJF report that they perform consulting for Hoffman La Roche. JMG has also consulted for Takeda and Lundbeck and receives royalty payments from the Brief Assessment of Cognition in Schizophrenia. JAW also consults for NCT Holdings. The current experiments were not related to any consulting activity. All other authors report no biomedical financial interests or potential conflicts of interest.
REFERENCES
1. Strauss GP, Waltz JA, Gold JM (2014): A review of reward processing and motivational impairment in schizophrenia. Schizophrenia bulletin. 40 Suppl 2:S107–116.
2. Fervaha G, Foussias G, Agid O, Remington G (2015): Motivational deficits in early schizophrenia: prevalent, persistent, and key determinants of functional outcome. Schizophrenia research. 166:9–16.
3. Dickinson D, Bellack AS, Gold JM (2007): Social/communication skills, cognition, and vocational functioning in schizophrenia. Schizophrenia bulletin. 33:1213–1220.
4. Velligan DI, Kern RS, Gold JM (2006): Cognitive rehabilitation for schizophrenia and the putative role of motivation and expectancies. Schizophrenia bulletin. 32:474–485.
5. Fervaha G, Foussias G, Agid O, Remington G (2014): Motivational and neurocognitive deficits are central to the prediction of longitudinal functional outcome in schizophrenia. Acta Psychiatr Scand. 130:290–299.
6. Gold JM, Waltz JA, Matveeva TM, Kasanova Z, Strauss GP, Herbener ES, et al. (2012): Negative symptoms and the failure to represent the expected reward value of actions: behavioral and computational modeling evidence. Archives of general psychiatry. 69:129–138.
7. Hernaus D, Gold JM, Waltz JA, Frank MJ (2018): Impaired Expected Value Computations Coupled With Overreliance on Stimulus-Response Learning in Schizophrenia. Biological psychiatry Cognitive neuroscience and neuroimaging.
8. Waltz JA, Gold JM (2016): Motivational Deficits in Schizophrenia and the Representation of Expected Value. Current topics in behavioral neurosciences. 27:375–410.
9. Bechara A, Damasio AR, Damasio H, Anderson SW (1994): Insensitivity to future consequences following damage to human prefrontal cortex. Cognition. 50:7–15.
10. Brown EC, Hack SM, Gold JM, Carpenter WT Jr., Fischer BA, Prentice KP, et al. (2015): Integrating frequency and magnitude information in decision-making in schizophrenia: An account of patient performance on the Iowa Gambling Task. Journal of psychiatric research. 66-67:16–23.
11. Brambilla P, Perlini C, Bellani M, Tomelleri L, Ferro A, Cerruti S, et al. (2013): Increased salience of gains versus decreased associative learning differentiate bipolar disorder from schizophrenia during incentive decision making. Psychological medicine. 43:571–580.
12. Kim MS, Kang BN, Lim JY (2016): Decision-making deficits in patients with chronic schizophrenia: Iowa Gambling Task and Prospect Valence Learning model. Neuropsychiatric disease and treatment. 12:1019–1027.
13. Brown JK, Waltz JA, Strauss GP, McMahon RP, Frank MJ, Gold JM (2013): Hypothetical decision making in schizophrenia: the role of expected value computation and "irrational" biases. Psychiatry research. 209:142–149.
14. Albrecht MA, Waltz JA, Frank MJ, Gold JM (2016): Probability and magnitude evaluation in schizophrenia. Schizophrenia research Cognition. 5:41–46.
15. Kasanova Z, Waltz JA, Strauss GP, Frank MJ, Gold JM (2011): Optimizing vs. matching: response strategy in a probabilistic learning task is associated with negative symptoms of schizophrenia. Schizophrenia research. 127:215–222.
16. Watkins C, Dayan P (1992): Q-learning. Mach Learning. 279–292.
17. Furuyashiki T, Gallagher M (2007): Neural encoding in the orbitofrontal cortex related to goal-directed behavior. Annals of the New York Academy of Sciences. 1121:193–215.
18. Padoa-Schioppa C, Cai X (2011): The orbitofrontal cortex and the computation of subjective value: consolidated concepts and new perspectives. Annals of the New York Academy of Sciences. 1239:130–137.
19. Rich EL, Wallis JD (2016): Decoding subjective decisions from orbitofrontal cortex. Nature neuroscience. 19:973–980.
20. Padoa-Schioppa C, Assad JA (2006): Neurons in the orbitofrontal cortex encode economic value. Nature. 441:223–226.
21. Gottfried JA, O'Doherty J, Dolan RJ (2003): Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science. 301:1104–1107.
22. Rolls ET, McCabe C, Redoute J (2008): Expected value, reward outcome, and temporal difference error representations in a probabilistic decision task. Cerebral cortex. 18:652–663.
23. Rescorla RA, Wagner AR (1972): in Classical Conditioning II: Current Research and Theory, eds Black AH, Prokasy WF. New York City, NY: Appleton–Century Crofts.
24. Joel D, Niv Y, Ruppin E (2002): Actor-critic models of the basal ganglia: new anatomical and computational perspectives. Neural networks: the official journal of the International Neural Network Society. 15:535–547.
25. Collins AG, Frank MJ (2014): Opponent actor learning (OpAL): modeling interactive effects of striatal dopamine on reinforcement learning and choice incentive. Psychological review. 121:337–366.
26. Frank MJ, Claus ED (2006): Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal. Psychological review. 113:300–326.
27. Waltz JA, Frank MJ, Robinson BM, Gold JM (2007): Selective reinforcement learning deficits in schizophrenia support predictions from computational models of striatal-cortical dysfunction. Biological psychiatry. 62:756–764.
28. Dowd EC, Frank MJ, Collins A, Gold JM, Barch DM (2016): Probabilistic Reinforcement Learning in Patients With Schizophrenia: Relationships to Anhedonia and Avolition. Biological psychiatry Cognitive neuroscience and neuroimaging. 1:460–473.
29. Hartmann-Riemer MN, Aschenbrenner S, Bossert M, Westermann C, Seifritz E, Tobler PN, et al. (2017): Deficits in reinforcement learning but no link to apathy in patients with schizophrenia (vol 7, 40352, 2017). Scientific reports. 7.
30. Culbreth AJ, Moran EK, Barch DM (2018): Effort-Based Decision-Making in Schizophrenia. Current opinion in behavioral sciences. 22:1–6.
31. Treadway MT, Peterman JS, Zald DH, Park S (2015): Impaired effort allocation in patients with schizophrenia. Schizophrenia research. 161:382–385.
32. Gold JM, Strauss GP, Waltz JA, Robinson BM, Brown JK, Frank MJ (2013): Negative symptoms of schizophrenia are associated with abnormal effort-cost computations. Biological psychiatry. 74:130–136.
33. Waltz JA, Xu Z, Brown EC, Ruiz RR, Frank MJ, Gold J (2017): Motivational Deficits in Schizophrenia Are Associated With Reduced Differentiation Between Gain and Loss-Avoidance Feedback in the Striatum. Biological Psychiatry: CNNI.
34. Deserno L, Heinz A, Schlagenhauf F (2017): Computational approaches to schizophrenia: A perspective on negative symptoms. Schizophrenia research. 186:46–54.
35. Huys QJ, Pizzagalli DA, Bogdan R, Dayan P (2013): Mapping anhedonia onto reinforcement learning: a behavioural meta-analysis. Biology of mood & anxiety disorders. 3:12.
36. Vignapiano A, Mucci A, Ford J, Montefusco V, Plescia GM, Bucci P, et al. (2016): Reward anticipation and trait anhedonia: An electrophysiological investigation in subjects with schizophrenia. Clin Neurophysiol. 127:2149–2160.
37. Smith BW, Mitchell DG, Hardin MG, Jazbec S, Fridberg D, Blair RJ, et al. (2009): Neural substrates of reward magnitude, probability, and risk during a wheel of fortune decision-making task. NeuroImage. 44:600–609.
38. Bermudez MA, Schultz W (2010): Reward magnitude coding in primate amygdala neurons. Journal of neurophysiology. 104:3424–3432.
39. Saez RA, Saez A, Paton JJ, Lau B, Salzman CD (2017): Distinct Roles for the Amygdala and Orbitofrontal Cortex in Representing the Relative Amount of Expected Reward. Neuron. 95:70–77 e73.
40. Burke SN, Thome A, Plange K, Engle JR, Trouard TP, Gothard KM, et al. (2014): Orbitofrontal cortex volume in area 11/13 predicts reward devaluation, but not reversal learning performance, in young and aged monkeys. J Neurosci. 34:9905–9916.
41. Stoppel CM, Boehler CN, Strumpf H, Heinze HJ, Hopf JM, Schoenfeld MA (2011): Neural processing of reward magnitude under varying attentional demands. Brain research. 1383:218–229.
42. Rudebeck PH, Mitz AR, Chacko RV, Murray EA (2013): Effects of amygdala lesions on reward-value coding in orbital and medial prefrontal cortex. Neuron. 80:1519–1531.
43. Lichtenberg NT, Pennington ZT, Holley SM, Greenfield VY, Cepeda C, Levine MS, et al. (2017): Basolateral Amygdala to Orbitofrontal Cortex Projections Enable Cue-Triggered Reward Expectations. J Neurosci. 37:8374–8384.
44. Fiuzat EC, Rhodes SE, Murray EA (2017): The Role of Orbitofrontal-Amygdala Interactions in Updating Action-Outcome Valuations in Macaques. J Neurosci. 37:2463–2470.
45. Rudebeck PH, Ripple JA, Mitz AR, Averbeck BB, Murray EA (2017): Amygdala Contributions to Stimulus-Reward Encoding in the Macaque Medial and Orbital Frontal Cortex during Learning. J Neurosci. 37:2186–2202.
46. Bayer HM, Glimcher PW (2005): Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron. 47:129–141.
47. Bayer HM (2004): A role for the substantia nigra in learning and motor control. New York: New York University.
48. Abler B, Walter H, Erk S, Kammerer H, Spitzer M (2006): Prediction error as a linear function of reward probability is coded in human nucleus accumbens. NeuroImage. 31:790–795.
49. Yacubian J, Sommer T, Schroeder K, Glascher J, Braus DF, Buchel C (2007): Subregions of the ventral striatum show preferential coding of reward magnitude and probability. NeuroImage. 38:557–563.
50. Rothenhoefer KM, Costa VD, Bartolo R, Vicario-Feliciano R, Murray EA, Averbeck BB (2017): Effects of Ventral Striatum Lesions on Stimulus-Based versus Action-Based Reinforcement Learning. J Neurosci. 37:6902–6914.
51. Gradin VB, Kumar P, Waiter G, Ahearn T, Stickle C, Milders M, et al. (2011): Expected value and prediction error abnormalities in depression and schizophrenia. Brain. 134:1751–1764.
52. Segarra N, Metastasio A, Ziauddeen H, Spencer J, Reinders NR, Dudas RB, et al. (2016): Abnormal Frontostriatal Activity During Unexpected Reward Receipt in Depression and Schizophrenia: Relationship to Anhedonia. Neuropsychopharmacology. 41:2001–2010.
53. Waltz JA, Schweitzer JB, Gold JM, Kurup PK, Ross TJ, Salmeron BJ, et al. (2009): Patients with schizophrenia have a reduced neural response to both unpredictable and predictable primary reinforcers. Neuropsychopharmacology. 34:1567–1577.
54. Radua J, Schmidt A, Borgwardt S, Heinz A, Schlagenhauf F, McGuire P, et al. (2015): Ventral Striatal Activation During Reward Processing in Psychosis: A Neurofunctional Meta-Analysis. JAMA psychiatry. 72:1243–1251.
55. de Leeuw M, Kahn RS, Vink M (2015): Fronto-striatal dysfunction during reward processing in unaffected siblings of schizophrenia patients. Schizophrenia bulletin. 41:94–103.
56. Anticevic A, Tang Y, Cho YT, Repovs G, Cole MW, Savic A, et al. (2014): Amygdala connectivity differs among chronic, early course, and individuals at risk for developing schizophrenia. Schizophrenia bulletin. 40:1105–1116.
57. Culbreth AJ, Westbrook A, Xu Z, Barch DM, Waltz JA (2016): Intact Ventral Striatal Prediction Error Signaling in Medicated Schizophrenia Patients. Biological psychiatry Cognitive neuroscience and neuroimaging. 1:474–483.
58. Samejima K, Ueda Y, Doya K, Kimura M (2005): Representation of action-specific reward values in the striatum. Science. 310:1337–1340.
59. Clarke HF, Cardinal RN, Rygula R, Hong YT, Fryer TD, Sawiak SJ, et al. (2014): Orbitofrontal dopamine depletion upregulates caudate dopamine and alters behavior via changes in reinforcement sensitivity. J Neurosci. 34:7663–7676.
60. Park IH, Lee BC, Kim JJ, Kim JI, Koo MS (2017): Effort-Based Reinforcement Processing and Functional Connectivity Underlying Amotivation in Medicated Patients with Depression and Schizophrenia. J Neurosci. 37:4370–4380.
61. Hernaus D, Xu Z, Brown EC, Ruiz RR, Frank MJ, Gold JM, et al. (2018): Motivational deficits in schizophrenia relate to abnormalities in cortical learning rate signals. Cognitive, affective & behavioral neuroscience.
62. Maia TV, Frank MJ (2017): An Integrative Perspective on the Role of Dopamine in Schizophrenia. Biological psychiatry. 81:52–66.