Author manuscript; available in PMC: 2020 Mar 15.
Published in final edited form as: Biol Psychiatry. 2018 Oct 18;85(6):506–516. doi: 10.1016/j.biopsych.2018.10.006

Value-based choice, contingency learning and suicidal behavior in mid-life and late-life depression

Alexandre Y Dombrovski 1,*, Michael N Hallquist 2, Vanessa M Brown 1,3,4, Jonathan Wilson 1, Katalin Szanto 1
PMCID: PMC6380943  NIHMSID: NIHMS1509914  PMID: 30502081

Abstract

Background:

Suicidal behavior is associated with impaired decision-making under uncertainty. Existing studies, however, do not definitively address whether there is an impairment in learning from experience and/or choice based on comparison of estimated option values. Our reinforcement-learning model-based behavioral study tested these hypotheses directly in middle-aged and older suicide attempters representative of those who die by suicide.

Methods:

Two samples (Sample 1 n = 135, Sample 2 n = 125) of depressed suicide attempters (attempters: n = 54/39), suicide ideators, non-suicidal depressed individuals and non-psychiatric controls completed a probabilistic three-choice decision-making task. A second experiment in Sample 2 experimentally dissociated long-term learned value from reward magnitude. Analyses combined computational reinforcement learning and mixed-effects models of decision times and choices.

Results:

Learning. Suicide attempters (vs. all comparison groups) were less sensitive to one-back reinforcement, as indicated by a reduced effect on both choices and decision times. Learning deficits scaled with attempt lethality and were partially explained by poor cognitive control. Value-based choice. Attempters (vs. all comparison groups) had abnormally long decision times when choosing between similarly valued options and were less able to distinguish between the best and second-best. Group differences in value-based choice were robust to controlling for cognitive performance, comorbidities, impulsivity, psychotropic exposure, and possible brain damage from attempts.

Conclusion:

Serious suicidal behavior is associated with impaired reward learning, likely undermining the search for alternative solutions. Attempted suicide is associated with impaired value comparison during the choice process, potentially interfering with the consideration of deterrents and alternatives in a crisis.

Keywords: reinforcement learning, suicide, decision-making, reward, expected value, depression

Introduction

Our society tends to view the decision to end one’s life as strategic (1). Yet, clinicians more commonly see that suicide follows a limited consideration of the present crisis, alternative solutions, and deterrents (2). In retrospect, survivors usually regret their suicide attempt (3). These observations inform the general hypothesis that, in a crisis, people vulnerable to suicide do not optimally incorporate moment-to-moment experiences into their decisions alongside their values, goals, and prior knowledge. For example, precipitating stressors that may appear relatively inconsequential to the clinician (e.g., an argument) can trigger a suicidal act, temporarily overshadowing the more significant deterrents (e.g., traumatizing one’s family). Explaining this phenomenon merely as a manifestation of impulsivity fails to capture the uncertainty and confusion characteristic of a suicidal crisis. More precise explanations may include poor integration of recent experiences with prior experience and values or improper comparison of the worth of alternative options when making the choice. Reinforcement learning (RL) models of neural computation (Table 1) distinguish between these two explanations: disrupted learning of expected value (Hypothesis 1) depends on the medial orbitofrontal/ventromedial prefrontal cortex [mOFC/vmPFC] (4), whereas impaired ability to compare learned values while making a choice (Hypothesis 2) maps to the lateral orbitofrontal cortex [lOFC], and more generally to the lateral prefrontal cortex [lPFC] (5).

Table 1.

Key terms

Expected value
In economics, a measure of benefit associated with an option expressed in a theoretical common currency used to compare disparate goods. In animal learning, expected reward associated with an action or stimulus (associative strength). Thought to be represented in the human ventromedial prefrontal cortex (vmPFC).
Reinforcement learning (RL)
Statistical account of learning, wherein the discrepancy between the actually received and expected reward (prediction error) is the learning signal used to update expected value. The goal of RL is to estimate the values of available options from experience in order to maximize reward.
Value-based choice
Selection of actions informed by their values. Choices require greater cognitive effort when action values are close. Choices can be exploitative (favoring options with known high value) or exploratory (sampling lower-value alternatives, typically under uncertainty).

In people who have attempted suicide, laboratory studies using the Iowa Gambling Task (IGT), a decision-making task involving both learning of action values and value-based choice, support the general hypothesis of impaired decision-making (6, 7). The IGT, however, cannot differentiate between impairments in action value learning vs. value comparison. Specific evidence for the impaired value learning hypothesis (H1) is mixed, with suicide attempters showing deficits on a probabilistic reversal learning task (8) and in some (9), but not all (meta-analysis: (7)), studies of deterministic learning using the Wisconsin Card Sort. The hypothesis of impaired value comparison (H2) is supported by impairment in suicide attempters on the Cambridge Gamble Task (CGT; (10)), which does not require learning. CGT deficits in these patients have also been linked to disrupted value representations in the vmPFC (11). Indirect support for this hypothesis comes from a growing number of studies where suicide attempters show altered value-based decision-making in contexts that do not involve uncertainty (e.g., delay discounting), tests of biases and heuristics, and social decision-making (12–15). In summary, while our understanding of the neurocomputational mechanisms of impaired decision-making in suicidal behavior is limited, there is preliminary support for impairments in both learning and value-based choice.

To test these hypotheses conclusively, we assessed learning (H1) and value-based choice (H2) in two non-overlapping samples of suicide attempters with a three-armed bandit task differentially sensitive to these functions (5). Our initial analysis of learning (H1) examined the impact of one-back reinforcement on behavior. Our primary analysis examined how past reinforcement was encoded (H1) and how learned values affected current choices (H2). We first estimated learned expected value and prediction errors using an RL model fitted to choices and reinforcement history. To avoid circularity, we then examined how decision times, a process index independent of inputs to the RL model, were modulated by recent reinforcement (H1) and previously learned values (H2). Decision field theory predicts slower responses on difficult trials involving a small difference between values (16). We used this slowing as an index of the influence of previously learned values on the current choice (H2). At the same time, we examined whether learning (H1) from recent reinforcement was altered in suicide attempters, as indexed by decision time (DT) modulation by rewards and absolute prediction errors (PE) (17). This analysis of decision times enabled us to assess how reinforcement modulated the decision process and not merely its outcome (i.e., the choice).

Methods

Study design: overview

We sought to sample individuals maximally representative of those who die by suicide. Given that older and middle-aged adults are at the highest risk for suicide and that attempt-to-death ratios decrease from 100:1 in young adulthood to around 2:1 in old age (18), our study enrolled middle-aged and older suicide attempters with current major depression. To isolate decision process alterations associated with suicidal behavior, we included comparison groups of non-suicidal depressed patients and depressed patients with serious suicidal ideation but no history of attempt. Furthermore, to test for a dose-response relationship between decision process deficits and suicidal behavior, we examined whether they scaled with the medical seriousness (lethality) of suicide attempts. Finally, the second sample was retested in a state of partial recovery from depression potentially revealing the stability of decision deficits. This second experiment involved an additional manipulation separating long-term learned value from transient effects of reward magnitude.

Sample and its characterization

Overall, 260 adults aged from 42 to 82 (Sample 1, n = 135) and from 47 to 79 (Sample 2, n = 125; Table s1-s2; see Procedures for sample descriptions) participated in a longitudinal study of suicidal behavior in late-life depression (19). Participants provided written informed consent and all study procedures were approved by the University of Pittsburgh Institutional Review Board. Participants were selected into one of four groups: suicide attempters with depression, suicide ideators with depression, non-suicidal depressed, and psychiatrically healthy controls. They were recruited at a psychogeriatric inpatient unit, late-life depression clinic, primary care, and through community advertisements in Pittsburgh, PA (15, 19).

Suicide attempters with depression had a history of a self-injurious act with the intent to die within a one-month period of completing the study assessments or had a history of a past suicide attempt with strong current suicidal ideation at the time of study enrollment. Suicide attempt history was verified by a psychiatrist (AYD or KSz) using all available information: participant’s report, medical records, and collateral information from the treatment team, family, and friends. Significant discrepancies between these sources led to exclusion from the study. The medical seriousness of attempts was assessed using the Beck Lethality Scale (BLS) (20); for participants with multiple attempts, data for the highest-lethality attempt are presented. Twenty-two participants in Sample 1 and eleven in Sample 2 had at least one high-lethality attempt (BLS score ≥ 4). High-lethality attempts resulted in coma, need for resuscitation, unstable vital signs, penetrating wounds of abdomen or chest, third-degree burns, or major bleeding. None of the participants had documented brain damage from suicide attempts. A review of medical records suggested that such damage could not be ruled out in 7/54 attempters in Sample 1 and in 2/36 in Sample 2. In sensitivity analyses, we excluded these individuals. Suicidal intent associated with the most lethal attempt was assessed using Beck’s Suicide Intent Scale (SIS) (20). Suicide ideators with depression, included to identify specific correlates of suicidal behavior, had suicidal ideation with a specific plan but no lifetime history of attempt. Individuals with passive death wish or transient or ambiguous suicidal ideas were excluded from this group. Non-suicidal depressed older adults were included in the study as a comparison group to identify associations with suicidal thoughts or behavior beyond the effects of depression.
Non-suicidal depressed participants had no lifetime history of self-injurious behavior, suicidal ideation, or suicide attempts, based on the clinical interview, review of medical records, SCID/DSM-IV, and a score of 0 on the HRSD-17 suicide item. Non-psychiatric controls had no lifetime history of psychiatric disorders, as determined by the SCID/DSM-IV. All participants except for non-psychiatric controls had a SCID/DSM-IV diagnosis of unipolar, non-psychotic major depression and a score of 14 or higher on the 17-item Hamilton Rating Scale for Depression (HRSD-17) at study entry. We excluded individuals with clinical dementia (previous diagnosis or score < 24 on the Mini-Mental State Examination), a history of neurological disorder, delirium, or sensory disorder that precluded the performance of a learning task. Clinical, cognitive and psychological characterization is detailed in Supplemental Methods.

Procedures (Fig. 1, top; Table s1-s2)

Figure 1.

Procedures, samples, and task

Legend. Top: Summary of experiments. Both samples completed Experiment 1 while only Sample 2 completed Experiment 2. Trial Structure: The options, three novel fractal stimuli, were presented on the screen of a tablet (Experiment 1) or back-projection screen (Experiment 2), using EPrime 2 (Psychology Software Tools, Sharpsburg, PA). Their location in each trial was randomized in a triangle pattern. The participants selected one option using the arrow keys (Experiment 1) or a response glove (Experiment 2). Win/loss feedback was then displayed for 1500 ms, followed by an inter-trial interval (ITI). To dissociate learned value, which integrates reinforcement across multiple trials, from reward magnitude, in Experiment 2 we independently manipulated the amount at stake at each trial (10¢, 25¢, 50¢). The stake was presented before the choice, making it clear to the participant that while the win/loss outcome depended on their choice, the amount won did not. In Experiment 2, the choice-feedback and inter-trial intervals were jittered, sampled from exponential distributions with means of 4000 ms and 2920 ms, respectively. Reinforcement Contingency: Curves depict probability of reward by stimulus over 300 trials. Both experiments used similar contingencies, modified from (5). The contingency was easier for Experiment 2. Behavior: Curves represent the probability of choosing a given stimulus, aggregated for the entire sample (means and error bands are estimated by LOESS regression).
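The Experiment 2 timing and stake manipulation described in the legend can be illustrated with a minimal Python sketch. The task itself was run in EPrime 2, so this is not the authors' implementation. The function names, the fixed seed, and the uniform sampling of stakes are assumptions made for illustration, and the legend does not say whether the exponential distributions were truncated.

```python
import random

STAKES_CENTS = (10, 25, 50)  # amounts at stake named in the legend

def sample_jittered_interval(mean_ms, rng):
    """Draw one interval (ms) from an exponential distribution with the given mean."""
    return rng.expovariate(1.0 / mean_ms)

def make_trial(rng):
    """Build one hypothetical Experiment 2 trial: a stake plus two jittered intervals."""
    return {
        # Assumption: stakes drawn uniformly and independently of the choice.
        "stake_cents": rng.choice(STAKES_CENTS),
        "choice_feedback_ms": sample_jittered_interval(4000.0, rng),
        "iti_ms": sample_jittered_interval(2920.0, rng),
    }

rng = random.Random(0)  # fixed seed, for reproducibility of the sketch only
trials = [make_trial(rng) for _ in range(300)]
```

Because the stake is drawn without reference to the chosen option, reward magnitude in such a design carries no information about which option to choose, which is the dissociation the legend describes.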

Sample 1 participants met all inclusion criteria but refused or were ineligible for the fMRI study and so completed the task once (Experiment 1) at study baseline. A non-overlapping Sample 2 first completed the same version of the task outside of the scanner (Experiment 1) and once during fMRI scanning at follow-up (Experiment 2, mean 114 days after baseline; imaging results will be reported separately).

Reinforcement learning task and model

Participants completed a 300-trial three-armed bandit task (Fig. 1, middle), shown to be sensitive to the effects of medial ventral prefrontal lesions on value-based choice and of lateral ventral prefrontal lesions on value learning in macaques (21, 22) and humans (4).

To estimate expected value and prediction error for participants at every trial based on their reinforcement and sampling history, we used a Q-learning model implemented using the Variational Bayesian Approach (VBA) in Matlab 2016b (23). The details of these models can be found in the Supplemental methods. VBA yields robust and precise estimates of not only individual model parameters but also evolving hidden states, i.e. expected value. We used an empirical Bayesian procedure leveraging the parameter estimates of an entire sample to constrain the estimates for individuals to reduce the risk that parameters for poorly performing participants would be misestimated (mixed-effects VBA_MFX procedure; estimation was blind to group membership, details in Supplemental methods) (24).
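As a rough illustration of how expected value and prediction error evolve trial by trial, the following Python sketch implements a generic delta-rule Q-learning update with a softmax choice rule. It is not the authors' model, whose specification and VBA-based estimation are given in the Supplemental methods. The single learning rate `alpha`, the inverse temperature `beta`, the zero initial values, and the function names are assumptions.

```python
import math

def softmax(values, beta):
    """Convert option values into choice probabilities (beta = inverse temperature)."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def q_learning_trace(choices, rewards, n_options=3, alpha=0.3, q0=0.0):
    """Return per-trial expected values (before the update) and prediction errors."""
    q = [q0] * n_options
    values_before, prediction_errors = [], []
    for choice, reward in zip(choices, rewards):
        values_before.append(list(q))
        pe = reward - q[choice]   # prediction error: received minus expected reward
        q[choice] += alpha * pe   # only the chosen option's value is updated
        prediction_errors.append(pe)
    return values_before, prediction_errors

# Toy data: option indices 0-2 and binary reward (1) vs. omission (0)
choices = [0, 0, 1, 0]
rewards = [1, 0, 1, 1]
values, pes = q_learning_trace(choices, rewards, alpha=0.5)
print(pes)                       # [1.0, -0.5, 1.0, 0.75]
print(softmax(values[-1], 2.0))  # choice probabilities before the last trial
```

In this sketch the hidden states (`values`) are deterministic given the choices, rewards, and `alpha`. The fitting step, which the paper handles with an empirical Bayesian procedure, is deliberately left out.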

We leveraged several methodological advances to expand on our earlier findings (8, 11). First, earlier studies of probabilistic learning in suicide did not explicitly separate the long-term value of options from more transient effects of reward magnitude. Thus, to isolate specific impairments in the computation of learned value, Experiment 2 separately manipulated long-term value and reward magnitude. Second, to obtain more precise estimates of model parameters and hidden states (option values) we employed a robust Bayesian approach to RL model-fitting. Third, to obtain independent inference regarding group differences in learning and value comparison, we examined the coupling of model-estimated signals and decision times not used in model-fitting.

Learning: initial analyses

We first looked for group differences in responses to reinforcement outside of the RL framework. To examine whether recent reinforcement had a differential effect in suicide attempters, we tested whether they were less likely to repeat a choice after a reward (vs. omission) on the last trial. These effects were estimated using binary logistic mixed models implemented in the lme4 package (25) in R 3.3.3 (26).
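The quantity behind this analysis, the tendency to repeat a choice after a reward versus an omission, can be sketched descriptively in Python. This computes raw stay proportions for a single participant only. It does not reproduce the logistic mixed-effects models used in the study, and the function name and toy data are hypothetical.

```python
def stay_rates(choices, rewards):
    """Proportion of repeated choices after rewarded vs. unrewarded trials."""
    counts = {1: [0, 0], 0: [0, 0]}  # previous reward -> [stays, transitions]
    # Pair each trial (choice, reward) with the choice made on the next trial.
    for prev_choice, prev_reward, choice in zip(choices, rewards, choices[1:]):
        counts[prev_reward][1] += 1
        if choice == prev_choice:
            counts[prev_reward][0] += 1
    return {
        "win_stay": counts[1][0] / counts[1][1] if counts[1][1] else None,
        "lose_stay": counts[0][0] / counts[0][1] if counts[0][1] else None,
    }

# Toy data: option indices 0-2 and binary reward (1) vs. omission (0)
choices = [0, 0, 1, 1, 2, 2]
rewards = [1, 0, 1, 0, 0, 1]
rates = stay_rates(choices, rewards)
```

A smaller gap between `win_stay` and `lose_stay` would correspond to the reduced sensitivity to one-back reinforcement that the models test for at the group level.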

Main analysis: decision time (DT) signatures of value-based choice and contingency learning

These analyses employed linear mixed effects models implemented in the lme4 package (25), modeling a random intercept of subject. Critically, this analysis was independent from RL model estimation, which did not utilize decision times (DTs). Trials with DT < 200ms or DT > 4000ms, comprising <4% of data, were removed (27). Following Ratcliff (28), DTs were inverse-transformed (the results were the same using raw DTs). We used the maximum value available on a trial (Vmax) as our measure of long-term value impacting choice (see Supplemental methods for rationale). To separate trial difficulty (indicated by maximum available value, Vmax) from whether the last choice was exploratory (low-value) vs. exploitative (high-value), the difference between the last chosen value and Vmax, termed Vchoice, was incorporated into analyses. Vchoice was zero when the best option was chosen (exploitation) and became negative when an inferior option was chosen (exploration). The final model was:

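\[
\frac{1}{DT_t} = \frac{1}{DT_{t-1}} + t + \text{switch}_t + \text{reward}_{t-1} + \text{Vchoice}_{t-1} + \text{prediction error}_{t-1} + \text{Vmax}_{t-1} + \text{reward}_{t-1} \times \text{Group} + \text{prediction error}_{t-1} \times \text{Group} + \text{Vmax}_{t-1} \times \text{Group}
\]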

where t indicates the current trial and t-1 the previous trial.
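The preprocessing and value predictors that enter this model can be sketched in Python as follows. The thresholds and the definitions of Vmax and Vchoice come from the text; the function names and the use of `None` to mark excluded trials are assumptions, and fitting the mixed-effects model itself is not shown.

```python
def prepare_dt(dt_ms, lower=200.0, upper=4000.0):
    """Return the inverse-transformed decision time, or None for excluded trials."""
    if dt_ms < lower or dt_ms > upper:
        return None  # trials with DT < 200 ms or DT > 4000 ms are removed
    return 1.0 / dt_ms

def value_predictors(option_values, chosen):
    """Return (Vmax, Vchoice) for one trial.

    Vmax is the highest value available; Vchoice is the chosen value minus
    Vmax, so it is zero for exploitation and negative for exploration.
    """
    v_max = max(option_values)
    v_choice = option_values[chosen] - v_max
    return v_max, v_choice

# Toy trial: model-estimated values for the three options, option 0 chosen
v_max, v_choice = value_predictors([0.25, 0.5, 0.0], chosen=0)
inv_dt = prepare_dt(500.0)
```

Here `v_choice` is -0.25, marking the choice as exploratory relative to the best available option, and a 500 ms decision time is retained and inverse-transformed.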

Sensitivity and exploratory analyses

We ascertained that group differences detected in our main analysis were not explained by the following confounds (included as covariates): demographic characteristics (particularly education), cognitive control, global cognitive function, comorbid conditions (anxiety and addiction), depressive severity, possible brain damage from suicide attempts, and exposure to antidepressants (including augmenting agents), antipsychotics, sedative/hypnotics, and opioids. Next, we verified that results were robust to model fit or fitting procedure (individual vs. empirical Bayesian RL model parameter estimation). Finally, we sought to rule out alternative explanations stemming from collinearity between reward on the previous trial and long-term value and from confounding of between-persons and within-person effects of value due to individual differences in learning.

Our exploratory analyses examined the patterns of value-based choices in suicide attempters using expected value estimates from the RL model. Such analyses should be interpreted with caution given that RL model parameters were originally fit to the same choices. Thus, we did not rely on them for detecting group differences, but only for describing them qualitatively.

Results

Group characteristics (Table s1-s2)

Across samples, suicide attempters were less educated than non-suicidal comparison groups. Attempters were similar to ideators on all other measures, while differing from non-suicidal depressed on depression severity and some measures of impulsivity, especially in Sample 1.

Learning, preliminary analysis: behavioral effects of reinforcement on the past trial (Figure 2)

Figure 2.

Learning: effect of most recent reward on choice in suicide attempters (coefficients from binary logistic mixed-effects models predicting stay/switch choice)

Legend. Suicide attempters are the reference group. “Reward” denotes whether the previous choice was reinforced regardless of reward magnitude, which was independently manipulated (D, Stake effect) but irrelevant to whether one should stay with the same choice. Central dot and horizontal lines denote estimated regression coefficient and 95% confidence interval. X-axis shows the log-odds of switching vs. staying. * p < .05, ** p < .01, *** p < .001

Across samples and experiments, participants learned the reinforcement contingency well (learning curves: Figure 1; statistics: Figure 2).

Across samples and experiments, suicide attempters were less responsive to reinforcement on the previous trial. Specifically, in both samples in Experiment 1, suicide attempters differed from healthy controls and suicide ideators, but only in Sample 1 did they differ from non-suicidal depressed. In Experiment 2, suicide attempters differed from non-suicidal depressed and ideators, but not from healthy controls (see Exploratory analyses for one possible explanation). A combined analysis of both samples in Experiment 1 revealed that Sample 1 was generally less sensitive to reinforcement (sample*reward, B [SE]: 0.64 [0.09], p < 10^−11), which may reflect its lower education level. In the combined sample, suicide attempters were less sensitive to reinforcement than all comparison groups (z > 5.12, p < 10^−6).

Learning, preliminary analysis: effects of attempt lethality (Figure 3)

Figure 3.

Learning: effect of most recent reward on choice in high-lethality and low-lethality suicide attempters (coefficients from binary logistic mixed-effects models predicting stay/switch choice)

Legend. High-lethality (HL) suicide attempters are the reference group. Group sizes for low- vs. high-lethality attempters were 32 low, 22 high in Sample 1 and 28 low, 11 high in Sample 2. Effects of reward at stake not shown (reported in Figure 2D). Central dot and horizontal lines denote estimated regression coefficient and 95% confidence interval. X-axis shows the log-odds of switching vs. staying. * p < .05, ** p < .01, *** p < .001

Across samples and experiments, high-lethality attempters were less responsive to reinforcement than controls, non-suicidal depressed (albeit marginally in Sample 2, Experiment 1), and ideators. Additionally, in Experiment 2 high-lethality attempters were more impaired than low-lethality attempters.

Learning, decision times (DTs; Figure 4)

Figure 4.

Modulation of decision times by value and current reinforcement (standardized coefficients from linear mixed-effects models)

Legend. PE: prediction error. Max value: value of the best available option; higher values correspond to easier choices. “Reward” denotes whether the previous choice was reinforced regardless of reward magnitude, which was independently manipulated (stake effect shown in D). “Exploratory choice” is the difference in value between the chosen and best available option. Central dot and horizontal lines denote estimated regression coefficient and 95% confidence interval. X-axis: decision times, arbitrary units. * p < .05, ** p < .01, *** p < .001

Larger unsigned prediction errors were followed by longer DTs in both samples in Experiment 1 but not Experiment 2 (Figure 4, green), with no consistent group differences. Rewards vs. omissions slowed DTs across samples and experiments (Figure 4, red). Suicide attempters slowed down more after a reward than healthy controls (across samples and experiments), suicide ideators (only Experiment 1, both samples), and non-suicidal depressed (only Experiment 2).

Value-based choice, DTs (Figure 4)

Choices are more difficult when the values of options are close. Accordingly, DTs were longer on such trials (indicated by low maximum available value, Vmax; Figure 4, blue). In Experiment 1, this slowing was less pronounced in the less-educated Sample 1 than in Sample 2 (sample*value, B [SE]: −0.119 [0.028], p<.001; Figure 4 A vs. B).

Across samples and experiments, suicide attempters exhibited greater slowing on difficult, low-Vmax trials, compared to healthy controls and suicide ideators. In Sample 2 (both experiments), suicide attempters differed from non-suicidal depressed (no significant difference in Sample 1).

DTs: effects of attempt lethality (Table s7)

Learning. Mirroring high-lethality attempters’ diminished behavioral sensitivity to one-back reinforcement, their DTs also slowed more after rewards and less after larger absolute PEs, differing from low-lethality attempters in Experiment 1 (both samples) but not Experiment 2. Value-based choice. There were no consistent differences in sensitivity to expected value between high- and low-lethality attempters.

Follow-up analysis: exploratory vs. exploitative choices in suicide attempters (Figure 5)

Figure 5.

Win-switches in suicide attempters: value of chosen option as a function of preceding reward and stay/switch.

Legend: Estimated marginal means from a linear mixed-effects model predicting the expected value of participant’s choice based on whether the previous choice was rewarded and repeated (stay [filled circles]/switch [open circles]). The black vertical line indicates the expected value of a random choice given that it is a win-switch (mean of the second-best and worst options). Exploratory choices lie near or to the left of this line. The value of suicide attempters’ win-switch choices (black arrows) is consistently higher than chance, suggesting that they favor the second-best over the under-sampled worst option. NB: these statistics should be taken with caution since RL models were fitted to the same choices; they are shown for illustration.

Our analysis of DTs suggested that suicide attempters struggled when choosing between options with close values. We aimed to understand how this difficulty affected their choices. An optimal choice policy needs to balance exploiting high-value options and exploring uncertain, previously under-sampled alternatives (29). Since one of the three options throughout most of the task was inferior in value, under-sampled, and thus more uncertain (Figure 1), we could determine to what extent participants’ choices were exploratory (sampling the uncertain, lowest-valued option) vs. exploitative (sampling the best or second-best option). We focused on win-switch choices, which normatively should be motivated by exploration. We examined the value of participants’ choices, controlling for the highest available value and for whether the preceding choice was exploratory. Across samples and experiments, the comparison groups were more likely to explore the most uncertain, lowest-value option on win-switch trials. Suicide attempters, by contrast, favored the second-best option: the value of their choice was consistently above chance (Figure 5, black horizontal line; t > 6.68, p < .0001).
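The chance level used for win-switch choices (the mean of the second-best and worst option values, per the Figure 5 legend) can be made concrete with a short Python sketch. The function name and the example values are hypothetical, and the sketch assumes a switch away from the best option, as the legend's definition implies.

```python
def win_switch_chance_value(option_values):
    """Expected value of a random win-switch: mean of the second-best and worst options."""
    ranked = sorted(option_values, reverse=True)
    return (ranked[1] + ranked[2]) / 2.0

# Toy trial with three model-estimated option values
option_values = [0.75, 0.5, 0.25]
chance = win_switch_chance_value(option_values)  # 0.375
second_best = sorted(option_values, reverse=True)[1]
worst = min(option_values)
```

A switch to the second-best option (0.5) lies above this chance level, while a switch to the worst, under-sampled option (0.25) lies below it; the latter is the pattern the text treats as exploratory.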

Sensitivity analyses (Tables s8-s22)

We verified that group differences in sensitivity to expected value were unaffected when controlling for possible confounds: demographics, cognitive function, medication exposure, severity of depression, and possible brain damage from suicide attempts (Supplemental Results: Sensitivity Analyses; Tables s8-s16). Group differences in sensitivity to recent reinforcement were less robust to inclusion of the above variables. Notably, controlling for cognitive control abolished group differences in post-reward slowing; poor cognitive control predicted diminished slowing (Table s9).

Furthermore, we ascertained that our main results were robust to individual differences in RL model fit (Table s11), transformations of decision times, possible confounding between expected value and one-back reinforcement, removal of covariates, and effects of between-subjects differences in expected value (Supplemental Results: Sensitivity Analyses; Tables s18-s22).

Discussion

Our reinforcement learning model-based study of decision-making in attempted suicide aimed to describe deficits potentially underlying the inability to find alternative solutions and consider deterrents in a suicidal crisis. We hypothesized the existence of deficits in learning from experience and in the ability to choose optimally among actions given their values. Both hypotheses were supported (Table 2).

Table 2.

Summary of group differences across samples and experiments.

| Domain | Effect | Reference group | Groups differing from the reference group: Sample 1, Experiment 1 | Sample 2, Experiment 1 | Sample 2, Experiment 2 |
| Learning | Diminished behavioral sensitivity to reinforcement | Suicide attempters | C, D, I | C, I | D |
| Learning | Diminished behavioral sensitivity to reinforcement | High-lethality attempters | C, D, I | C, I | C, D, I, LL |
| Learning | Exaggerated post-reward slowing* | Suicide attempters | C, D, I | C, I | C |
| Learning | Exaggerated post-reward slowing* | High-lethality attempters | C, D, I, LL | C, I, LL | C, D |
| Choice | Exaggerated slowing on low-value trials | Suicide attempters | C, I | C, D, I | C, D, I |
| Choice | Difficulty distinguishing between close-valued options (exploratory analysis) | Suicide attempters | C, D, I | C, D, I | C, D, I |

Table 2 Note: C = healthy controls, D = non-suicidal depressed, I = suicide ideators, LL = low-lethality suicide attempters. There were no differences between high- and low-lethality attempters in choice analyses.

* Group differences were partially explained by cognitive control.

Value-based choice

In analyses of value-based choice, suicide attempters displayed abnormally extended decision times (DTs) when choosing between similarly-valued options (Figure 4) and a tendency to confound the best and second-best options, revealed in their preference for the second-best option on win-switch trials (Figure 5). In short, they struggled to distinguish between options that were close in value. This excessive – rather than lacking – modulation of DTs by long-term values could not be explained by task-unrelated fluctuations, poor effort, or distraction. These group differences were robust to controlling for cognitive control, estimated premorbid IQ, and general cognitive ability, as well as medication exposure and possible brain damage from suicide attempts.

Our findings provide a bridge between data on decision-making in attempted suicide and the basic literature on its specific neurocomputational substrates, particularly impaired functioning in the ventromedial prefrontal cortex. Poor IGT performance in previous studies could result from both impaired learning and value comparison, or even from broader cognitive impairment. Lesion studies suggest that the core value comparison component of IGT performance depends on the vmPFC; conversely, the neural substrates of other aspects of IGT performance overlap significantly with those of cognitive control in the lateral PFC and the dorsal anterior cingulate (30). Human and monkey lesion studies have mapped value comparison deficits on the 3-armed bandit task of the type seen here to the mOFC/vmPFC (4, 21), also implicated in our earlier fMRI study of expected value in suicide attempters (11). Taken together, this evidence points toward impaired value comparison as a specific cognitive correlate of suicidal behavior, likely involving the vmPFC. Furthermore, value comparison deficits may be the primary factor underlying attempters’ poor IGT performance.

How might these deficits facilitate suicidal behavior? We view the choice between suicide and its alternatives as a temporally unfolding decision process (31, 32). In a given crisis, the choice process may unfold over hours to weeks or longer (33). In this process, one drifts stochastically toward or away from an attempt depending on the perceived long-term value of life versus escape, shaped by recent experiences and deterrents. Whereas in retrospect survivors of attempted suicide typically judge its subjective value to be inferior to alternatives, that value can be temporarily over-estimated in the confusion of a crisis. If suicide attempters indeed struggle to differentiate between close-valued options, they may be at risk for erroneously selecting suicide during periods of despair when its value approaches that of alternatives [cf. (34)]. To the extent that our findings are representative of real-life decision-making, they reinforce the clinical notion that suicidal acts often arise in the setting of marked ambivalence and can potentially be prevented by measures such as means restriction, bridging interventions, and contingency planning. In general, they reinforce the accident prevention approach to suicide (34), and choice abnormalities can be viewed as a form of accident-proneness.

Learning

Our analyses of learning revealed that suicide attempters were less sensitive to one-back reinforcement, as indicated by both choices (Figures 2, 3) and DTs (Figure 4, Table S7), suggesting a difficulty in learning the worth of actions from their outcomes. This effect was specific to reinforcement itself and not to whether reinforcement differed from what was predicted, as we found no group differences in DT slowing following absolute prediction errors (surprise). Deficits in learning scaled with suicide attempt lethality and were partially explained by cognitive control performance.
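As an illustration of what sensitivity to one-back reinforcement means behaviorally, the following sketch simulates a simple delta-rule learner with a softmax choice rule on a three-armed bandit and measures how much more often it repeats a choice after a reward than after a non-reward. This is a generic textbook learner, not the model fitted in this study. The function names, the binary rewards, the reward probabilities, and the parameter values are all assumptions made for illustration.

```python
import math
import random


def softmax_choice(values, inverse_temperature, rng):
    """Sample an arm with probability proportional to exp(beta * value)."""
    weights = [math.exp(inverse_temperature * v) for v in values]
    draw = rng.random() * sum(weights)
    cumulative = 0.0
    for arm, weight in enumerate(weights):
        cumulative += weight
        if draw < cumulative:
            return arm
    return len(values) - 1  # guard against floating-point rounding


def one_back_effect(learning_rate, inverse_temperature=5.0,
                    reward_probs=(0.8, 0.5, 0.2), n_trials=5000, seed=0):
    """Return P(stay | previous reward) - P(stay | previous non-reward)."""
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)
    stays = {True: 0, False: 0}
    counts = {True: 0, False: 0}
    prev_arm, prev_rewarded = None, None
    for _ in range(n_trials):
        arm = softmax_choice(values, inverse_temperature, rng)
        if prev_arm is not None:
            counts[prev_rewarded] += 1
            stays[prev_rewarded] += int(arm == prev_arm)
        rewarded = rng.random() < reward_probs[arm]
        # Delta rule: move the chosen arm's value toward the obtained outcome.
        values[arm] += learning_rate * (float(rewarded) - values[arm])
        prev_arm, prev_rewarded = arm, rewarded
    return stays[True] / counts[True] - stays[False] / counts[False]


if __name__ == "__main__":
    for alpha in (0.0, 0.8):  # hypothetical: no learning vs. fast learning
        print(f"learning rate {alpha}: one-back effect = {one_back_effect(alpha):.2f}")
```

In this toy setting, a learner that does not update its values shows essentially no difference in staying after rewards versus non-rewards, whereas a fast learner shows a large one. A reduced one-back effect in behavioral data is therefore compatible with, though not proof of, weaker updating from the most recent outcome.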

Interestingly, this pattern of group differences parallels that in earlier studies where high-lethality attempters displayed learning and cognitive control deficits (7, 9, 35, 36). Performance on cognitive control and learning tasks relies on a network encompassing the dorsal ACC (30) and the lateral PFC (37). Furthermore, the learning deficits observed among attempters in our study resemble the effects of lOFC/fronto-opercular lesions from earlier 3-armed bandit studies (4, 22). Altogether, impaired contingency learning in high-lethality suicide attempters is consistent with cingulo-opercular dysfunction, which may also contribute to poor IGT performance in earlier studies.

In a crisis, the search for alternatives to suicide may be undermined by learning impairments that lead to a lack of awareness or confidence in available options. Additionally, impaired learning and cognitive control make the search for alternative solutions more costly in terms of both time (38) and cognitive effort (39). It is thus possible that under such circumstances suicide gains appeal as a literal easy way out (40).

Correlates of suicidal behavior vs. ideation

Most previous studies of the neurocognitive diathesis to suicide did not include a group of suicide ideators, leaving it unclear whether any deficits are selectively associated with suicidal behavior rather than ideation. In the few that did, most cognitive markers – IQ, memory, and attention – did not differentiate between suicidal behavior and suicidal ideation (19, 41, 42). The two exceptions are decision-making and inhibition, although attempter/ideator differentiation is based on very few, mostly small, studies (8, 10, 14, 35, 43, 44). Our study contributes substantially to this inquiry: both impaired value comparison and learning were implicated in acting on suicidal thoughts and not merely suicidal ideation.

Between-sample differences. Limitations

Across samples and experiments, suicide attempters displayed some form of altered sensitivity to reinforcement (Table 2). However, in the less-educated sample 1, group differences in basic learning from one-back reinforcement were more prominent, whereas group differences in sensitivity to long-term value were more pronounced in sample 2. This is likely due to better learning in sample 2 (Figs 2, 4). In general, given the clinical heterogeneity of suicidal behavior and the modest group sizes, sampling variability is likely the main reason for our failure to replicate all findings. This sampling variability may also limit the generalizability of our findings. The retrospective, case-control design of our study is susceptible to unobserved confounds, even though we made every effort to measure observable confounds and examine their effects statistically. In particular, it was not possible to examine the effects of individual psychotropic medications. One should be cautious in extrapolating results to those who die by suicide, although the inclusion of high-lethality suicide attempters is reassuring in this respect. Finally, our models lacked a detailed account of the choice process, and future developments in the integration of reinforcement learning with serial sampling choice models on multi-alternative tasks will likely yield additional insights into choice abnormalities in attempted suicide (45).

In summary, we found that suicidal behavior in mid-life and late-life depression is associated with impaired ability to compare the values of alternative actions. In addition, medically serious suicide attempts are associated with deficient moment-to-moment learning from reinforcement. In a crisis, these low-level aberrations in decision processes likely undermine one’s ability to search for alternative solutions and consider deterrents.

Acknowledgements:

This work was funded by the National Institutes of Health, Bethesda, Maryland, USA (R01MH100095 and R01MH048463 to A.D.; K01MH097091 to M.N.H.; and R01MH085651 to K.Sz.). The funding agency had no role in the design and conduct of the study; the collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

Footnotes

Disclosures:

All authors report no biomedical financial interests or potential conflicts of interest.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

The de-identified behavioral data and all code used to obtain our results are publicly available at: https://github.com/DecisionNeurosciencePsychopathology/bandit_pub

References:

  • 1. Szasz T (1986): The case against suicide prevention. Am Psychol. 41: 806–12.
  • 2. Shneidman ES (1969): Suicide, lethality, and the psychological autopsy. Int Psychiatry Clin. 6: 225–250.
  • 3. Henriques G, Wenzel A, Brown GK, Beck AT (2005): Suicide attempters’ reaction to survival as a risk factor for eventual suicide. Am J Psychiatry. 162: 2180–2.
  • 4. Noonan MP, Chau BKH, Rushworth MFS, Fellows LK (2017): Contrasting Effects of Medial and Lateral Orbitofrontal Cortex Lesions on Credit Assignment and Decision-Making in Humans. J Neurosci. 37: 7023–7035.
  • 5. Walton ME, Behrens TE, Buckley MJ, Rudebeck PH, Rushworth MF (2010): Separable learning systems in the macaque brain and the role of orbitofrontal cortex in contingent learning. Neuron. 65: 927–39.
  • 6. Jollant F, Bellivier F, Leboyer M, Astruc B, Torres S, Verdier R, et al. (2005): Impaired decision making in suicide attempters. Am J Psychiatry. 162: 304–10.
  • 7. Richard-Devantoy S, Berlim M, Jollant F (2013): A meta-analysis of neuropsychological markers of vulnerability to suicidal behavior in mood disorders. Psychol Med. 1–11.
  • 8. Dombrovski AY, Clark L, Siegle GJ, Butters MA, Ichikawa N, Sahakian BJ, Szanto K (2010): Reward/Punishment reversal learning in older suicide attempters. Am J Psychiatry. 167: 699–707.
  • 9. McGirr A, Dombrovski AY, Butters M, Clark L, Szanto K (2012): Deterministic learning and attempted suicide among older depressed individuals: Cognitive assessment using the Wisconsin Card Sorting Task. J Psychiatr Res. 46: 226–232.
  • 10. Clark L, Dombrovski AY, Siegle GJ, Butters MA, Shollenberger CL, Sahakian BJ, Szanto K (2011): Impairment in risk-sensitive decision-making in older suicide attempters with depression. Psychol Aging. 26: 321–330.
  • 11. Dombrovski AY, Szanto K, Clark L, Reynolds CF, Siegle GJ (2013): Reward Signals, Attempted Suicide, and Impulsivity in Late-Life Depression. JAMA Psychiatry. 70: 1020–1030.
  • 12. Dombrovski AY, Szanto K, Siegle GJ, Wallace ML, Forman SD, Sahakian B, et al. (2011): Lethal Forethought: Delayed Reward Discounting Differentiates High- and Low-Lethality Suicide Attempts in Old Age. Biol Psychiatry. 70: 138–144.
  • 13. Liu RT, Trout ZM, Hernandez EM, Cheek SM, Gerlus N (2017): A behavioral and cognitive neuroscience perspective on impulsivity, suicide, and non-suicidal self-injury: Meta-analysis and recommendations for future research. Neurosci Biobehav Rev. 83: 440–450.
  • 14. Szanto K, Clark L, Hallquist M, Vanyukov P, Crockett M, Dombrovski A (2014): The cost of social punishment and high-lethality suicide attempts. Psychol Aging. 29: 84–94.
  • 15. Szanto K, de Bruin WB, Parker AM, Hallquist MN, Vanyukov PM, Dombrovski AY (2015): Decision-making competence and attempted suicide. J Clin Psychiatry. 76: 1590–1597.
  • 16. Busemeyer JR, Wang Z, Townsend JT, Eidels A (2015): The Oxford Handbook of Computational and Mathematical Psychology. Oxford University Press.
  • 17. Hayden BY, Heilbronner SR, Pearson JM, Platt ML (2011): Surprise Signals in Anterior Cingulate Cortex: Neuronal Encoding of Unsigned Reward Prediction Errors Driving Adjustment in Behavior. J Neurosci. 31: 4178–4187.
  • 18. De Leo D, Padoani W, Scocco P, Billebrahe U, Arensman E, Hjelmeland H, et al. (2001): Attempted and completed suicide in older subjects: results from the WHO/EURO multicentre study of suicidal behavior. Int J Geriatr Psychiatry. 16: 300–310.
  • 19. Gujral S, Ogbagaber S, Dombrovski AY, Butters MA, Karp JF, Szanto K (2015): Course of cognitive impairment following attempted suicide in older adults. Int J Geriatr Psychiatry. Retrieved July 26, 2016, from http://onlinelibrary.wiley.com/doi/10.1002/gps.4365/pdf.
  • 20. Beck AT, Beck R, Kovacs M (1975): Classification of suicidal behaviors: I. Quantifying intent and medical lethality. Am J Psychiatry. 132: 285–7.
  • 21. Noonan MP, Walton ME, Behrens TE, Sallet J, Buckley MJ, Rushworth MF (2010): Separate value comparison and learning mechanisms in macaque medial and lateral orbitofrontal cortex. Proc Natl Acad Sci U S A. 107: 20547–52.
  • 22. Walton ME, Behrens TE, Buckley MJ, Rudebeck PH, Rushworth MF (2010): Separable learning systems in the macaque brain and the role of orbitofrontal cortex in contingent learning. Neuron. 65: 927–39.
  • 23. Daunizeau J, Adam V, Rigoux L (2014): VBA: A Probabilistic Treatment of Nonlinear Models for Neurobiological and Behavioural Data. PLOS Comput Biol. 10: e1003441.
  • 24. Lee MD, Wagenmakers E-J (2013): Bayesian Cognitive Modeling: A Practical Course. Cambridge: Cambridge University Press. doi: 10.1017/CBO9781139087759.
  • 25. Bates D, Mächler M, Bolker B, Walker S (2015): Fitting Linear Mixed-Effects Models Using lme4. J Stat Softw. 67: 1–48.
  • 26. R Core Team (2017): R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
  • 27. Cavanagh JF, Wiecki TV, Kochar A, Frank MJ (2014): Eye tracking and pupillometry are indicators of dissociable latent decision processes. J Exp Psychol Gen. 143: 1476.
  • 28. Ratcliff R (1993): Methods for dealing with reaction time outliers. Psychol Bull. 114: 510.
  • 29. Sutton RS (1990): Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. Proc Seventh Int Conf Mach Learn. Austin, TX: Morgan Kaufmann, pp 216–224.
  • 30. Gläscher J, Adolphs R, Damasio H, Bechara A, Rudrauf D, Calamia M, et al. (2012): Lesion mapping of cognitive control and value-based decision making in the prefrontal cortex. Proc Natl Acad Sci U S A. 109: 14681–14686.
  • 31. Ratcliff R, McKoon G (2008): The diffusion decision model: theory and data for two-choice decision tasks. Neural Comput. 20: 873–922.
  • 32. Ratcliff R, Smith PL, Brown SD, McKoon G (2016): Diffusion Decision Model: Current Issues and History. Trends Cogn Sci. 20: 260–281.
  • 33. Millner AJ, Lee MD, Nock MK (2017): Describing and Measuring the Pathway to Suicide Attempts: A Preliminary Study. Suicide Life Threat Behav. 47: 353–369.
  • 34. Beskow J, Thorson J, Öström M (1994): National suicide prevention programme and railway suicide. Soc Sci Med, Suicide on Railways. 38: 447–451.
  • 35. Richard-Devantoy S, Szanto K, Butters MA, Kalkus J, Dombrovski AY (2014): Cognitive inhibition in older high-lethality suicide attempters. Int J Geriatr Psychiatry. doi: 10.1002/gps.4138.
  • 36. Keilp JG, Gorlyn M, Russell M, Oquendo MA, Burke AK, Harkavy-Friedman J, Mann JJ (2013): Neuropsychological function and suicidal behavior: attention control, memory and executive dysfunction in suicide attempt. Psychol Med. 43: 539–551.
  • 37. Yuan P, Raz N (2014): Prefrontal cortex and executive functions in healthy adults: A meta-analysis of structural neuroimaging studies. Neurosci Biobehav Rev. 42: 180–192.
  • 38. Lieder F, Plunkett D, Hamrick JB, Russell SJ, Hay N, Griffiths T (2014): Algorithm selection by rational metareasoning as a model of human strategy selection. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, editors. Adv Neural Inf Process Syst 27. Curran Associates, Inc., pp 2870–2878.
  • 39. Shenhav A, Musslick S, Lieder F, Kool W, Griffiths TL, Cohen JD, Botvinick MM (2017): Toward a Rational and Mechanistic Account of Mental Effort. Annu Rev Neurosci. 40: 99–124.
  • 40. Botvinick M, Braver T (2015): Motivation and Cognitive Control: From Behavior to Neural Mechanism. Annu Rev Psychol. 66: 83–113.
  • 41. Gujral S, Dombrovski AY, Butters M, Clark L, Reynolds CF 3rd, Szanto K (2013): Impaired Executive Function in Contemplated and Attempted Suicide in Late Life. Am J Geriatr Psychiatry. doi: 10.1016/j.jagp.2013.01.025.
  • 42. Klonsky ED, Qiu T, Saffer BY (2017): Recent advances in differentiating suicide attempters from suicide ideators. Curr Opin Psychiatry. 30: 15–20.
  • 43. Burton CZ, Vella L, Weller JA, Twamley EW (2011): Differential effects of executive functioning on suicide attempts. J Neuropsychiatry Clin Neurosci. 23: 173–179.
  • 44. Minzenberg MJ, Lesh TA, Niendam TA, Yoon JH, Cheng Y, Rhoades RN, Carter CS (2015): Control-related frontal-striatal function is associated with past suicidal ideation and behavior in patients with recent-onset psychotic major mood disorders. J Affect Disord. 188: 202–209.
  • 45. Pedersen ML, Frank MJ, Biele G (2017): The drift diffusion model as the choice rule in reinforcement learning. Psychon Bull Rev. 24: 1234–1251.
