Exploration-Exploitation and Suicidal Behavior in Borderline Personality Disorder and Depression

Aliona Tsypes; Michael N Hallquist; Angela Ianni; Aleksandra Kaurin; Aidan G C Wright; Alexandre Y Dombrovski

doi:10.1001/jamapsychiatry.2024.1796

JAMA Psychiatry. 2024 Jul 10;81(10):1010–1019. doi: 10.1001/jamapsychiatry.2024.1796

Aliona Tsypes ¹,✉, Michael N Hallquist ², Angela Ianni ¹, Aleksandra Kaurin ³, Aidan G C Wright ⁴,⁵, Alexandre Y Dombrovski ¹

¹Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania

²Department of Psychology and Neuroscience, University of North Carolina, Chapel Hill

³Department of Psychology, University of Wuppertal, Wuppertal, Germany

⁴Department of Psychology, University of Michigan, Ann Arbor

⁵Eisenberg Family Depression Center, University of Michigan, Ann Arbor

Accepted for Publication: April 25, 2024.

Published Online: July 10, 2024. doi:10.1001/jamapsychiatry.2024.1796

✉ Corresponding Author: Aliona Tsypes, PhD, Department of Psychiatry, University of Pittsburgh School of Medicine, 100 N Bellefield Ave, BT 748, Pittsburgh, PA 15213 (tsypesa@upmc.edu).

Author Contributions: Drs Tsypes and Dombrovski had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: Tsypes, Hallquist, Wright, Dombrovski.

Acquisition, analysis, or interpretation of data: All authors.

Drafting of the manuscript: Tsypes, Kaurin, Dombrovski.

Critical review of the manuscript for important intellectual content: All authors.

Statistical analysis: All authors.

Obtained funding: Hallquist, Wright, Dombrovski.

Administrative, technical, or material support: Dombrovski.

Supervision: Dombrovski.

Conflict of Interest Disclosures: Dr Wright reported grants from the National Institutes of Health during the conduct of the study. No other disclosures were reported.

Funding/Support: This research was supported by grants from the National Institute of Mental Health (K23MH130664, R01MH048463, R01MH100095, R01MH119399, and T32MH018269) and the University of Pittsburgh’s Clinical and Translational Science Institute, which is funded by the National Institutes of Health (NIH) Clinical and Translational Science Award (CTSA) program (UL1TR001857). The CTSA program is led by the NIH’s National Center for Advancing Translational Sciences.

Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Data Sharing Statement: See Supplement 2.

Additional Contributions: We thank Mandy Collier, BS; Michelle Perry, BA, BS; Tanya Shah, BA; Shreya Sheth, MA; Nathan Stimmel, MA; and Laura Taglioni, BA, for their contributions to data collection. We thank Morgan Buerke, MA; Jiazhou Chen, BA, BS; Bea Langer, BS; and Andrew Papale, PhD, for their contributions to data management. At the time of the study, all of these contributors were employed full-time by our team at the University of Pittsburgh. No additional compensation was provided beyond regular salary.

PMCID: PMC11238070 PMID: 38985462

Key Points

Question

Is the inability to explore multiple alternatives and take advantage of the best options associated with suicidal behavior?

Findings

In 2 case-control studies of adults with borderline personality disorder and depression, inability to fully explore available options was associated with medically serious suicide attempts. In an ambulatory study, this pattern predicted suicidal ideation.

Meaning

The findings suggest that the inability to explore a full range of solutions in a state of suicidal crisis may prevent one from discovering alternatives to attempting suicide; exploring novel ways to cope may help individuals build their safety plans.

This case-control study explores the exploration-exploitation dilemma in suicidal behavior.

Abstract

Importance

Clinical theory and behavioral studies suggest that people experiencing suicidal crisis are often unable to find constructive solutions or incorporate useful information into their decisions, resulting in premature convergence on suicide and neglect of better alternatives. However, prior studies of suicidal behavior have not formally examined how individuals resolve the tradeoffs between exploiting familiar options and exploring potentially superior alternatives.

Objective

To investigate exploration and exploitation in suicidal behavior from the formal perspective of reinforcement learning.

Design, Setting, and Participants

Two case-control behavioral studies of exploration-exploitation of a large 1-dimensional continuous space and a 21-day prospective ambulatory study of suicidal ideation were conducted between April 2016 and March 2022. Participants were recruited from inpatient psychiatric units, outpatient clinics, and the community in Pittsburgh, Pennsylvania, and underwent laboratory and ambulatory assessments. Adults diagnosed with borderline personality disorder (BPD) and midlife and late-life major depressive disorder (MDD) were included, with each sample including demographically equated groups with a history of high-lethality suicide attempts, low-lethality suicide attempts, individuals with BPD or MDD but no suicide attempts, and control individuals without psychiatric disorders. The MDD sample also included a subgroup with serious suicidal ideation.

Main Outcomes and Measures

Behavioral (model-free and model-derived) indices of exploration and exploitation, suicide attempt lethality (Beck Lethality Scale), and prospectively assessed suicidal ideation.

Results

The BPD group included 171 adults (mean [SD] age, 30.55 [9.13] years; 135 [79%] female). The MDD group included 143 adults (mean [SD] age, 62.03 [6.82] years; 81 [57%] female). Across the BPD (χ²₃ = 50.68; P < .001) and MDD (χ²₄ = 36.34; P < .001) samples, individuals with high-lethality suicide attempts discovered fewer options than other groups as they were unable to shift away from unrewarded options. In contrast, those with low-lethality attempts were prone to excessive behavioral shifts after rewarded and unrewarded actions. No differences were seen in strategic early exploration or in exploitation. Among 84 participants with BPD in the ambulatory study, 56 reported suicidal ideation. Underexploration also predicted incident suicidal ideation (χ²₁ = 30.16; P < .001), validating the case-control results prospectively. The findings were robust to confounds, including medication exposure, affective state, and behavioral heterogeneity.

Conclusions and Relevance

The findings suggest that narrow exploration and inability to abandon inferior options are associated with serious suicidal behavior and chronic suicidal thoughts. By contrast, individuals in this study who engaged in low-lethality suicidal behavior displayed a low threshold for taking potentially disadvantageous actions.

Introduction

People who survive suicide attempts usually come to regret their choice of suicide attempt over constructive alternatives,¹ suggesting that this choice is often an error of decision-making. Individuals who use substances or gamble in real life²,³ and do not make optimal value-based choices or effectively learn from rewards and punishments in the laboratory⁴,⁵,⁶,⁷,⁸,⁹,¹⁰,¹¹,¹² may be more vulnerable to suicidal behavior. However, our understanding of decision-making in a state of suicidal crisis is limited by the reliance on simple decision tasks used in case-control studies that leave out critical real-life demands. In a crisis, decisions are often made during a complex sensorimotor interaction, as one may take a phone call, read an upsetting message, walk, look around, and even begin to implement a suicidal plan. There is usually real or perceived time pressure, and a vast number of options may become available and vanish dynamically. Imagine a person experiencing unbearable distress who may consider drinking alcohol, taking an overdose, going for a walk while practicing a coping skill, or calling a friend. Drinking and overdose are always there, while the availability and worth of alternatives can change: the friend may not answer the call after a work shift starts, and a walk may bring no relief once the afternoon heat sets in. Many alternatives may remain, but it is hard to consider which ones would work when experiencing a sense of crisis.

Clinical theories describe the suicidal crisis as a myopic, passive, and constricted cognitive state, one of tunnel vision.¹³,¹⁴,¹⁵ While clinical accounts yield few predictions about neurobehavioral mechanisms, reinforcement learning¹⁶,¹⁷,¹⁸,¹⁹ provides a useful theoretical framework for understanding decision-making. Dynamic decision-making involves a continuous competition between available actions,²⁰,²¹,²² and adaptive behavior depends on resolving this competition.²³ Reinforcement learning frames option competition as a dilemma between exploiting options thought to be best and exploring potentially superior alternatives.²⁴ In this explore-exploit framework, we can view cognitive constriction as a narrow and ineffective exploration, yielding a subset of suboptimal choices. Returning to our example, one may end up drinking alcohol or even attempting suicide, having not explored constructive solutions when they were available and useful.

To understand how people who are vulnerable to suicide resolve option competition under time pressure, we investigated the exploration-exploitation of a continuous space where a large number of options become available and vanish dynamically. We used the clock task²⁵ (Figure 1A), where movement through a 1-dimensional environment is signaled by a dot rotating around a circle, rewards depend on the timing of the response, and the distance between consecutive choices provides a straightforward measure of exploration. To assess exploitation, we used a previously validated computational model that explores and exploits efficiently in a resource-rational manner.²⁶ During learning, when one chooses among discrete options, it is hard to infer that a given choice is exploratory without computational modeling. By contrast, shifts to far-away locations of a continuous space, particularly when unexplained by reward history, are likely to reflect exploration. Much exploration on the clock task results from shifting away immediately after unrewarded responses. Humans and other mammals consistently display these so-called win-stay/lose-shift responses alongside reinforcement learning.²⁷,²⁸,²⁹,³⁰ Critically, smaller win-stay/lose-shift responses were associated with attempted suicide in our earlier armed bandit studies of late-life depression.⁸ Here, we aimed to understand whether this impairment (and ostensibly the inability to find solutions in a suicidal crisis) may reflect deficits in exploration (including by shifting far away from unrewarded options) or in exploitation based on longer-term learned values.⁴,⁵,⁶,⁷,⁸,⁹,¹⁰,¹¹,¹²

Figure 1. A, The clock paradigm consists of decision and feedback phases. During the decision phase, a dot revolves 360° around a central stimulus over the course of 4 seconds. Participants press a button to stop the revolution and receive a probabilistic outcome. During the feedback phase, participants are informed about the number of points they won on this trial, with 0 points representing reward omission. Rewards are drawn from 1 of 2 monotonically time-varying contingencies in which expected values of choices either increase (increasing expected value [IEV]) or decrease (decreasing expected value [DEV]) with prolonged wait. Reward probabilities and magnitude varied independently (eFigure 1 in Supplement 1). B, Evolution of participants’ response times (RTs) and RT swings by contingency in the borderline personality disorder (BPD) and major depressive disorder (MDD) samples. Plotted data are smoothed using a generalized additive model (GAM) in the *ggplot2* package, version 3.4.4, for R (R Foundation). In subplots on the right, the smoothing used natural splines from the *splines* package in R version 4.3.2, with a basis of 5 knots. The shaded area around the lines represents 95% CIs. Participants learned to respond later in the IEV compared to the DEV condition, and RT swings generally decreased later in learning (especially in IEV). To ascertain that time courses were not distorted by smoothing, trial-averaged data are presented in eFigures 2 and 3 in Supplement 1. The difference between DEV and IEV at trial 1 is due to the alternation of IEV and DEV conditions, which change every 40 trials of the task.

Surveying diverse forms of suicidal behavior maximally representative of death by suicide, we examined exploration-exploitation in people with borderline personality disorder (BPD) and late-life depression who had made high-lethality vs low-lethality suicide attempts. Whereas BPD is characterized by affective instability, rash decisions, and recurrent suicidal thoughts and behaviors,³¹,³² suicidal acts in individuals with late-life depression are less frequent but more determined and lethal.³³,³⁴,³⁵ Finally, to validate our case-control findings prospectively, we examined whether behavioral exploration/exploitation predicted incident suicidal thoughts assessed via ecological momentary assessment. We hypothesized that high-lethality suicide attempts and incident suicidal thoughts would be associated with underexploration, particularly following unrewarded choices, and with inability to exploit.

Methods

Participants

Participants (Table 1; eTables 1 and 2 in Supplement 1) included 171 adults with BPD and 143 adults with major depressive disorder (MDD). We contrasted individuals with high-lethality suicide attempts with those with low-lethality attempts, patients with no history of suicide attempts, and psychiatrically healthy control individuals. To identify deficits specific to attempted rather than merely contemplated suicide, we included a group with suicidal ideation with a plan but no attempt history in the MDD sample only. See the eMethods in Supplement 1 for full clinical and psychological characterization of the samples. This study followed the Enhancing the Quality and Transparency of Health Research (EQUATOR) reporting guideline. The institutional review board of the University of Pittsburgh approved the study procedures. Written informed consent was obtained before participation.

Table 1. Case-Control Study Groups Across Samples.

Sample	Group	Age, mean (SD), y	Female, No. (%)	Male, No. (%)	Participant count
BPD (age range at enrollment: 18-45 y)^a	Healthy control	30.5 (9.0)	40 (74.1)	14 (25.9)	54
	BPD with no prior history of suicidal behavior	29.3 (6.5)	24 (75.0)	8 (25.0)	32
	BPD with low-lethality suicide attempt	29.6 (8.8)	39 (79.6)	10 (20.4)	49
	BPD with high-lethality suicide attempt	33.1 (11.4)	32 (88.9)	4 (11.1)	36
MDD (age range at enrollment: 50-80 y)	Healthy control	63.3 (8.2)	23 (53.5)	20 (46.5)	43
	MDD with no self-injurious behavior or ideation	61.9 (6.7)	15 (46.9)	17 (53.1)	32
	MDD with suicidal ideation and plan	61.5 (5.0)	18 (58.1)	13 (41.9)	31
	MDD with low-lethality suicide attempt	61.8 (7.1)	17 (70.8)	7 (29.2)	24
	MDD with high-lethality suicide attempt	59.8 (5.2)	8 (61.5)	5 (38.5)	13

Abbreviations: BPD, borderline personality disorder; MDD, major depressive disorder.

^a A subset of 119 participants in the BPD group also completed a 21-day ecological momentary assessment (EMA) protocol assessing daily instances of suicidal ideation. Due to low daily levels of suicidal ideation in the healthy control group, the EMA portion of analyses focused exclusively on 84 individuals with BPD (26 with no suicide attempts, 40 with low-lethality attempts, and 18 with high-lethality attempts). The medical seriousness of attempts was assessed using the Beck Lethality Scale (BLS). For individuals with multiple suicide attempts, data for the highest lethality attempt were used. High-lethality suicidal behavior was defined as a BLS score of 4 or greater.
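The grouping rule described above (highest-lethality attempt, cutoff of 4 on the BLS) can be illustrated with a minimal Python sketch. The function name and return labels are hypothetical and are not taken from the study's analysis code.

```python
from typing import Optional, Sequence

HIGH_LETHALITY_CUTOFF = 4  # BLS score of 4 or greater, as stated in the text


def classify_attempt_history(bls_scores: Sequence[int]) -> Optional[str]:
    """Return 'high', 'low', or None (no attempts) for one participant.

    bls_scores holds one Beck Lethality Scale score per past attempt
    (hypothetical input format).
    """
    if not bls_scores:
        return None  # no suicide attempts on record
    # For multiple attempts, the highest-lethality attempt defines the group.
    max_score = max(bls_scores)
    return "high" if max_score >= HIGH_LETHALITY_CUTOFF else "low"
```

For example, a participant with attempts scored 1, 5, and 2 would fall in the high-lethality group under this rule.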

Daily Assessments

Participants completed a 21-day ecological momentary assessment protocol (6 surveys per day) within predefined time windows. Suicidal ideation was assessed with 2 dichotomous items (1 = yes, 0 = no³⁶,³⁷) from the Columbia Suicide Severity Rating Scale³⁸: “Have you wished you were dead or wished you could go to sleep and not wake up?” and “Have you actually had any thoughts of killing yourself?” We averaged across the instances of endorsements of suicidal ideation over the duration of the ecological momentary assessment protocol to get an index of frequency.
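One way to read the frequency index is sketched below in Python. The text does not specify the unit of averaging, so this sketch assumes that a day counts as an ideation day if either item was endorsed on any completed survey, and that days with no completed surveys are skipped; both choices, and all names, are assumptions of the illustration.

```python
from typing import List, Optional, Sequence, Tuple

# One survey: (wish_to_be_dead, thoughts_of_killing_self), each coded 1 = yes, 0 = no.
# None marks a missed survey (hypothetical encoding).
Survey = Optional[Tuple[int, int]]


def ideation_frequency(days: Sequence[Sequence[Survey]]) -> Optional[float]:
    """Proportion of assessed days with any endorsed suicidal ideation item."""
    day_flags: List[int] = []
    for surveys in days:
        completed = [s for s in surveys if s is not None]
        if not completed:
            continue  # assumption: days without data do not enter the average
        endorsed = any(a == 1 or b == 1 for a, b in completed)
        day_flags.append(int(endorsed))
    if not day_flags:
        return None
    return sum(day_flags) / len(day_flags)
```

Averaging over surveys rather than days would be an equally plausible reading and would only change the inner loop.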

Clock Task

All participants explored and exploited a 1-dimensional continuous space on the clock task²⁵ (Figure 1; eMethods and eFigures 1-4 in Supplement 1) over the course of 240 trials. During the decision phase, a green dot revolved 360° around a central stimulus. Participants were informed that the timing of their response controlled the number of points they could win and that not responding during a single revolution (4 seconds in the BPD sample and 5 seconds in the MDD sample) would leave them with no points on that trial. They pressed a button to stop the revolution and received probabilistic feedback controlled by 2 difficult contingencies, such that expected values of choices either increased or decreased along the interval. Reward probabilities and magnitude varied independently. Contingencies reversed every 40 trials and, to rule out the effects of novelty on task behavior, the MDD participants were not signaled about these changes.
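The block structure of the task (240 trials, with the contingency reversing every 40 trials) can be sketched as follows. The starting contingency and the linear expected-value function are assumptions made for illustration; the actual contingencies are shown in eFigure 1 in Supplement 1 and are not reproduced here.

```python
from typing import List

N_TRIALS = 240     # total trials, from the text
BLOCK_LENGTH = 40  # contingencies reverse every 40 trials, from the text


def contingency_schedule(first: str = "IEV") -> List[str]:
    """Label each trial as IEV or DEV; which one comes first is an assumption."""
    if first not in ("IEV", "DEV"):
        raise ValueError("first must be 'IEV' or 'DEV'")
    other = "DEV" if first == "IEV" else "IEV"
    return [first if (t // BLOCK_LENGTH) % 2 == 0 else other for t in range(N_TRIALS)]


def toy_expected_value(rt_fraction: float, contingency: str) -> float:
    """Hypothetical monotonic expected value on a 0-1 scale.

    rt_fraction is the response time divided by the revolution length.
    IEV increases with waiting and DEV decreases, as in the task description;
    the linear shape is a placeholder, not the study's reward function.
    """
    if not 0.0 <= rt_fraction <= 1.0:
        raise ValueError("rt_fraction must lie in [0, 1]")
    return rt_fraction if contingency == "IEV" else 1.0 - rt_fraction
```

With these settings the schedule contains 6 blocks and 5 reversals.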

Computational Modeling

Computational modeling is illustrated in Figure 2 and the eMethods, eTable 3, and eFigure 5 in Supplement 1. Using reinforcement learning, our goal was to identify, from each participant’s sampling and reinforcement history, the highest-value region of the space that a successful agent would be exploiting on any given trial. Thus, we fitted our previously validated (across environments and levels of analysis²⁶,³⁹,⁴⁰) Strategic Exploration/Exploitation of Temporal Instrumental Contingencies (SCEPTIC) model to participants’ choices. SCEPTIC reduces the potentially infinite continuous options to a handful of discrete actions, using learning elements with staggered receptive fields implemented as gaussian temporal basis functions.³⁹ To explore and exploit efficiently, while reducing memory load, SCEPTIC selectively maintains the values of preferred actions and allows the nonpreferred alternatives to decay. To improve precision, model parameters were estimated by an empirical bayesian procedure using the variational bayesian approach,⁴¹ regularizing individual estimates by the group posterior.
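The two ideas in this description (gaussian temporal basis functions and selective maintenance with decay) can be illustrated with a toy learner in Python. This is a simplified sketch and not the published SCEPTIC implementation: the number of elements, their width, the learning and decay rates, and the choice to compute each element's prediction error against its own weight are all assumptions of the illustration.

```python
import math
from typing import List


class ToyBasisLearner:
    """Toy value learner over a time interval; NOT the published SCEPTIC code."""

    def __init__(self, n_elements: int = 8, interval_s: float = 4.0,
                 width_s: float = 0.5, alpha: float = 0.1, gamma: float = 0.3) -> None:
        # Evenly staggered centers of the gaussian receptive fields (hypothetical layout).
        self.centers: List[float] = [
            interval_s * (i + 0.5) / n_elements for i in range(n_elements)
        ]
        self.interval_s = interval_s
        self.width_s = width_s
        self.alpha = alpha  # learning rate (hypothetical value)
        self.gamma = gamma  # decay rate for nonpreferred elements (hypothetical value)
        self.weights: List[float] = [0.0] * n_elements

    def eligibility(self, rt: float) -> List[float]:
        """Gaussian overlap (between 0 and 1) of a response time with each element."""
        return [math.exp(-((rt - c) ** 2) / (2 * self.width_s ** 2)) for c in self.centers]

    def update(self, rt: float, reward: float) -> None:
        """Learn near the chosen RT; let weights far from it decay toward zero."""
        elig = self.eligibility(rt)
        self.weights = [
            w + e * self.alpha * (reward - w) - self.gamma * (1.0 - e) * w
            for w, e in zip(self.weights, elig)
        ]

    def value(self, rt: float) -> float:
        """Expected value at rt as an eligibility-weighted average of element weights."""
        elig = self.eligibility(rt)
        return sum(w * e for w, e in zip(self.weights, elig)) / sum(elig)

    def rt_vmax(self, step_s: float = 0.1) -> float:
        """Grid location of the global value maximum, analogous to RT(Vmax)."""
        n_steps = int(round(self.interval_s / step_s))
        grid = [i * step_s for i in range(n_steps + 1)]
        return max(grid, key=self.value)
```

Repeatedly rewarding responses near 3 seconds moves the value maximum toward that location, while a later response elsewhere lets the unvisited weights decay, which is the memory-saving behavior the paragraph describes.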

Data Analysis

Measures of exploration and exploitation are detailed in Table 2 and the eMethods in Supplement 1. As in our previous studies,²⁶,³⁹ we examined individual differences in exploration and exploitation in multilevel linear regression models predicting trial-by-trial response times (RTs) implemented in the lme4 package, version 1.1.35.1, for R (R Foundation), accounting for random intercepts for each participant and run and, in sensitivity analyses, participant-level random slopes of behavioral variables. Missed responses and RTs less than 200 ms were excluded (BPD sample: 341 of 41 040 trials [0.83%]; MDD sample: 517 of 34 320 trials [1.51%]).
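The exclusion rule and the reported percentages can be checked with a short Python sketch. The trial totals follow from 240 trials per participant (171 × 240 = 41 040 and 143 × 240 = 34 320); the function names and the use of None for a missed response are hypothetical.

```python
from typing import Iterable, List, Optional

MIN_RT_MS = 200.0          # RTs below this threshold are excluded, per the text
TRIALS_PER_PARTICIPANT = 240


def keep_valid_rts(rts_ms: Iterable[Optional[float]]) -> List[float]:
    """Drop missed responses (None) and RTs under 200 ms."""
    return [rt for rt in rts_ms if rt is not None and rt >= MIN_RT_MS]


def percent_excluded(n_excluded: int, n_participants: int) -> float:
    """Excluded trials as a percentage of all trials in a sample."""
    total_trials = n_participants * TRIALS_PER_PARTICIPANT
    return 100.0 * n_excluded / total_trials
```

Calling `percent_excluded(341, 171)` and `percent_excluded(517, 143)` reproduces the 0.83% and 1.51% given above after rounding to 2 decimals.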

Table 2. Definitions.

Concept	General definition	Operationalization: clock task and SCEPTIC model
Exploration	Sampling a broader range of options to find the ones with highest value.	RT swings represent the distance between consecutive responses and thus provide a model-free index of exploration. To reduce the confounds of sensorimotor precision, this is estimated as the effect of RT in the previous trial on RT in the current trial in multilevel models of behavior. Weaker RT autocorrelation reflects greater RT swings. As detailed in the eMethods in Supplement 1, RT swings mostly reflect random rather than strategic exploration.
Exploitation	Choosing options currently thought to yield the highest rewards.	The RT(Vmax) is the location of the best option, the response time with the highest expected value, or the global value maximum. RT(Vmax) is the best response location according to the SCEPTIC model, given what options the participant has sampled so far and the rewards received. By choosing RT(Vmax), one maximizes their short-term reward. Therefore, the effect of RT(Vmax) on choice indexes the rate of exploitation.
Entropy of the value function: global uncertainty	Value function is the expected reward (value) associated with each option that is being tracked by a learning agent. Its entropy (information content) scales with the number of competing options and tunes the explore-exploit balance. In other words, entropy reflects uncertainty as to which option yields the highest expected reward. Entropy is maximal when all options appear equally attractive, which is often the case when their true value is unknown. Therefore, high entropy promotes exploration and hence discovery of better options. For example, at a food market in an exotic city, a person may start with no idea which food is best (high entropy: many seemingly good options). They may then randomly sample (explore) different foods, but as they start liking certain foods more (low entropy: several best options dominate), they will focus on (exploit) those options more.	SCEPTIC approximates the value (expected reward) at each location on the clock with a set of learning elements whose temporal receptive fields cover the time interval. Each element updates its weight through the discrepancy between model-predicted reward at the chosen RT and the temporally proximal obtained reward. Shannon entropy (information content) of the normalized vector of element weights (values) is high earlier in learning in the presence of multiple competing options and decreases when a single most attractive option begins to dominate.

Abbreviations: RT, response time; SCEPTIC, Strategic Exploration/Exploitation of Temporal Instrumental Contingencies.
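The entropy measure defined in Table 2 can be illustrated with a few lines of Python. This sketch assumes nonnegative element weights and reports entropy in bits (base-2 logarithm); the logarithm base and the function name are choices made for the example.

```python
import math
from typing import Sequence


def value_entropy(weights: Sequence[float]) -> float:
    """Shannon entropy (bits) of the normalized vector of element weights."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative to be normalized")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    probs = [w / total for w in weights]
    # Terms with p = 0 contribute nothing to the entropy.
    return -sum(p * math.log2(p) for p in probs if p > 0)
```

Four equally valued options give the maximum of 2 bits, whereas a single dominant option gives 0 bits, matching the high-entropy and low-entropy stages of the food market example.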

Participants’ tendency to explore was measured by the decreased effect of RT in the previous trial (RT[t-1]) on RT in the current trial (RT[t]), conceptually corresponding to the tendency to alternate between early and late parts of the interval (RT swings) and primarily reflecting random rather than uncertainty-directed exploration.³⁹ Its interaction with reward quantified the tendency to shift away from unrewarded choices, while the interaction with trial specifically tested adaptive early exploration. These large RT swings of 1 to 2 seconds far exceed the threshold of sensorimotor precision. Participants’ ability to exploit was captured by the effect of RT with the highest expected value, as predicted by the SCEPTIC model (RT[Vmax]), on their choices. The RT(Vmax) × trial interaction tested the transition from earlier exploration to later exploitation.
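As a descriptive stand-in for these regression terms, RT swings can be computed directly and split by the preceding outcome. This Python sketch is not the multilevel model used in the study; it only shows the raw quantity (absolute change in RT after rewarded vs unrewarded trials) that the reward × RT[t-1] interaction captures, and the names are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence


def rt_swings_by_outcome(rts: Sequence[float], rewarded: Sequence[bool]) -> Dict[str, float]:
    """Mean absolute RT change following rewarded and unrewarded trials."""
    if len(rts) != len(rewarded):
        raise ValueError("rts and rewarded must have the same length")
    after_reward: List[float] = []
    after_omission: List[float] = []
    for t in range(1, len(rts)):
        swing = abs(rts[t] - rts[t - 1])
        # The outcome of the previous trial determines win-stay vs lose-shift.
        if rewarded[t - 1]:
            after_reward.append(swing)
        else:
            after_omission.append(swing)
    return {
        "after_reward": mean(after_reward) if after_reward else float("nan"),
        "after_omission": mean(after_omission) if after_omission else float("nan"),
    }
```

A participant who stays close after wins and moves far after reward omissions shows a small `after_reward` value and a large `after_omission` value; diminished lose-shifts would appear as a smaller gap between the two.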

Results

Exploration in the BPD Sample

Results for exploration in the BPD sample (mean [SD] age, 30.55 [9.13] years; 135 [79%] female and 36 [21%] male) are shown in Figure 3A and eTable 4 in Supplement 1. Levels of exploration differed across groups (group × RT[t-1]: χ²₃ = 50.68; P < .001). Follow-up analyses revealed smaller RT swings (lower exploration) in individuals with BPD and high-lethality suicide attempts vs those with BPD and low-lethality suicide attempts (t₃₉,₃₅₀ = −4.32; P < .001) even after accounting for the effects of reward described below. Further, both individuals with BPD and low-lethality attempts (t₃₉,₃₄₀ = −7.01; P < .001) and those with BPD and no suicide attempts (t₃₉,₃₁₀ = −2.63; P = .008) had larger RT swings compared to control individuals. Thus, we observed a striking heterogeneity among individuals with suicidal behavior, with relatively low exploration in individuals with BPD and high-lethality suicide attempts and relatively high exploration in those with BPD and low-lethality suicide attempts. We found no selective impairment in early exploration above and beyond these differences.

Much exploration on the clock task depends on RT swings away from unrewarded choices (lose-shifts).³⁹ Our analysis found a group × reward × RT[t-1] interaction (χ²₃ = 20.03; P < .001). Follow-up analyses revealed diminished win-stay/lose-shift responses in individuals with BPD and high-lethality suicide attempts vs all other groups (BPD with low-lethality attempts: t₃₈,₉₅₀ = 2.81; P = .004; those with BPD and no suicide attempts: t₃₉,₀₂₀ = 2.64; P = .008; control individuals: t₃₈,₉₆₀ = 4.46; P < .001). Qualitatively, while all BPD groups, particularly those with low-lethality attempts, displayed smaller win-stays, those with BPD and high-lethality attempts displayed smaller lose-shifts, particularly vs those with BPD and low-lethality attempts, but also vs those with BPD and no suicide attempts. Group differences persisted after controlling for the levels of depressive symptoms, suicide attempt recency (although recency predicted smaller lose-shifts), impulsivity, medication exposure, estimated premorbid IQ, and executive function (eTables 5-13 in Supplement 1). Smaller lose-shifts among individuals with BPD and high-lethality attempts could also be due to a working memory deficit; however, this alternative explanation was also ruled out (eTable 14 in Supplement 1). Group differences were not explained by individual heterogeneity of behavioral effects, as indicated by sensitivity analyses including random slopes of behavioral variables (eTables 15-17 in Supplement 1), or by suppressor effects (eTable 18 in Supplement 1).

To ascertain whether individuals with BPD and high-lethality suicide attempts indeed underexplored the option space, we used the SCEPTIC model to examine information dynamics reflective of option competition, finding that individuals with BPD and high-lethality attempts discovered fewer options than other groups (Figure 3B; eTable 19 in Supplement 1).

Replication: Exploration in the MDD Sample

Results for exploration in the MDD sample (mean [SD] age, 62.03 [6.82] years; 81 [57%] female and 62 [43%] male) are shown in Figure 3C and eTable 20 in Supplement 1. Levels of exploration differed across groups (group × RT[t-1]: χ²₄ = 36.34; P < .001). As in the BPD sample, follow-up analyses revealed smaller RT swings in individuals with MDD and high-lethality suicide attempts vs those with MDD and low-lethality suicide attempts (t₃₃,₁₇₀ = −5.12; P < .001), as well as vs those with MDD and suicidal ideation but no attempt (t₃₃,₁₇₀ = −2.13; P = .03) and control individuals (t₃₃,₁₇₀ = −2.20; P = .03). We found no group differences in early exploration specifically.

As in the BPD sample, group differences in exploration were further qualified by reward (group × reward × RT[t-1]: χ²₄ = 13.66; P = .008). Individuals with MDD and high-lethality suicide attempts had diminished win-stay/lose-shift responses compared to individuals with MDD and no suicide attempts (t₃₃,₁₆₀ = 2.33; P = .02) and those with MDD and low-lethality suicide attempts (t₃₃,₁₇₀ = 3.48; P < .001). Again, relative to other groups, whereas individuals with MDD and high-lethality suicide attempts displayed smaller lose-shifts, those with MDD and low-lethality attempts exhibited greater lose-shifts.

Exploitation in the BPD Sample

Results for exploitation in the BPD sample are shown in eTable 4 in Supplement 1. After accounting for the effects of the last reward described above, omnibus tests showed no overall group differences in exploitation; however, individuals with BPD and high-lethality suicide attempts displayed higher levels of exploitation vs control individuals (t₃₄,₉₈₀ = −2.31; P = .02) but not vs other groups (|t| ≤ 1.16). Since partialing out the effects of last reward (reward × RT[t-1]) from effects of long-term reinforcement (RT[Vmax]) constitutes overcontrolling, we tested a model (eTable 21 in Supplement 1) omitting the reward × RT[t-1] term, finding that levels of overall exploitation differed across groups (group × RT[Vmax]: χ²₃ = 9.20; P = .03): individuals with BPD and high-lethality suicide attempts exhibited higher exploitation vs control individuals (t₃₆,₅₁₀ = −2.70; P = .007) but not vs other groups.

Replication: Exploitation in the MDD Sample

Results for exploitation in the MDD sample are shown in eTable 20 in Supplement 1. We found no group differences in exploitation.

Exploration, Exploitation, and Prospectively Assessed Suicidal Thoughts

Results pertaining to suicidal thoughts are illustrated in Figure 3D and eTable 22 in Supplement 1. Eighty-four individuals with BPD completed the 21-day ecological momentary assessment study, with 56 of these reporting suicidal thoughts (present on average 7% of days; median [range], 2% [0%-94%]), enabling us to examine the associations between exploration-exploitation on the clock task and prospectively assessed suicidal thinking in daily life.

Prospective suicidal ideation was associated with the same pattern of lower exploration (χ²₁ = 30.16; P < .001) and smaller lose-shifts (χ²₁ = 11.31; P = .001) as individuals with high-lethality suicide attempts (Figure 3A, B, and C). No selective association was observed with early vs late exploration. Results remained qualitatively unchanged when excluding 2 participants with extreme frequencies of suicidal ideation (40% and 94% of days) or controlling for suicide attempt recency or affective predictors of suicidal ideation (negative internalizing, externalizing, and impulsive affect during ecological momentary assessment) (eTables 23-27 in Supplement 1).
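Readers who wish to relate the reported χ² statistics to their P values can do so with the closed-form upper-tail probability for integer degrees of freedom, sketched here in Python with the standard library only. The sketch assumes that the reported statistics are referred to the standard χ² distribution, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if x < 0 or df < 1:
        raise ValueError("x must be nonnegative and df a positive integer")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: complementary error function plus a finite half-integer series.
    tail = math.erfc(math.sqrt(half))
    series = sum(
        half ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, (df - 1) // 2 + 1)
    )
    return tail + math.exp(-half) * series
```

Under that assumption, `chi2_sf(13.66, 4)`, `chi2_sf(9.20, 3)`, and `chi2_sf(11.31, 1)` round to the reported values of .008, .03, and .001, respectively.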

Discussion

The behavioral experiments in this case-control study, augmented with reinforcement-learning modeling, found associations between serious suicidal behavior in both borderline personality disorder and late-life depression and an inability to shift away from unrewarded choices, resulting in the underexploration of a continuous option space. This narrow, inflexible behavior prospectively predicted daily suicidal ideation. By contrast, low-lethality suicidal behavior in individuals with BPD and in those with depression was associated with excessive shifts after rewarded as well as unrewarded actions. These associations were not explained by plausible confounds, including medication exposure, depressive symptoms, premorbid IQ, executive function, behavioral heterogeneity, and affective predictors of suicidal ideation.

Earlier studies using armed bandits found associations between high-lethality suicidal behavior in mid- and late-life depression and deficits in learning and behavioral adaptation.⁸,⁴² Supporting these associations, the present findings reveal that, given a choice among many uncertain options under time pressure, individuals at the highest risk explore only a limited subset, sticking with unrewarded choices. To our knowledge, this behavioral pattern has not been described in psychopathology research; it diverges from the performance of patients with schizophrenia, for example, on the same task.⁴³ It is equally distinct from win-shift behavior on bandit tasks we previously observed in individuals who attempted suicide,⁸ potentially indicating multiple deficits with additive or similar effects on suicide risk. What neurocomputational deficits may underlie an inability to shift away from unrewarded choices? In rodents, lose-shift behavior depends on the lateral striatum,²⁹,⁴⁴ a sensorimotor region approximately homologous to the primate dorsolateral putamen. In contrast, win-stay rodent behavior depends on the ventromedial striatum⁴⁴ and lateral habenula.⁴⁵ Interestingly, however, human lose-shift responding increases under cognitive load, suggesting that frontoparietal control may suppress automatic, striatum-mediated lose-shifts.⁴⁶ During exploration and learning in continuous spaces, dynamic maps of competing options are found in frontoparietal circuits, specifically the dorsal stream and caudal posterior parietal cortex.⁴⁰,⁴⁷ One intriguing possibility is that inappropriately rigid or exaggerated frontoparietal responses to option competition suppress adaptive lose-shift behavior in people who are prone to serious suicidal behavior. Conversely, excessive lose-shifts in individuals with low-lethality suicide attempts may be related to a disrupted encoding of the longer-term reinforcement history we previously described in attempted suicide.⁴,⁷

Our observations resonate with clinical notions of cognitive constriction and tunnel vision and provide a fine-grained behavioral and computational account of the suicide diathesis. Curiously, although the present study was not designed to distinguish between traitlike and statelike deficits, exploratory analyses of suicide attempt recency (eTables 6 and 24 in Supplement 1) suggest that decreased lose-shift responding may be a state-modulated trait. At the same time, our results highlight the role of trait impairments in decision capacity, which can facilitate serious suicidal behavior in a crisis, consistent with the view of suicide as an unintentional decision where the demands of a crisis exceed one’s decision-making capacity.^16,48 Specifically, individuals prone to underexploration are more likely to select an often-used (or often-considered) solution in a crisis in lieu of adaptively exploring potentially better alternatives. In psychotherapy, exploring solutions one has never tried before may be a useful skill to learn and practice both at an emotional baseline and when distressed.

It has been questioned whether phenomena such as passive death wish, suicidal thoughts, and more vs less medically serious suicidal acts belong to a single severity continuum and whether underlying risk factors differ only quantitatively or also qualitatively.^49,50 The consistent behavioral differences observed here between individuals with high-lethality and low-lethality suicide attempts argue against the continuum model and point instead to qualitatively distinct behavioral pathways. One is generally skeptical of studies where the performance of distinct clinical groups falls on both sides of healthy control individuals, since this pattern often reflects merely unexplained interindividual heterogeneity. However, here, the behavioral divergence between individuals with high-lethality vs low-lethality suicide attempts was replicated across different clinical populations and was robust to statistical controls for individual heterogeneity and plausible confounds. Furthermore, the behavioral distinctiveness of high-lethality suicide attempts from other forms of suicidal behavior and ideation has been observed in several previous studies across clinical populations and samples,^4,5,7,8,11 suggesting that this divergence is systematic. Thus, it is likely that qualitatively distinct behavioral pathways lead to high-lethality suicide attempts (and, by extension, many suicide deaths) vs lower-lethality suicide attempts. While the first pathway is marked by narrow and inflexible choices, the second is characterized by excessive behavioral plasticity in response to failures, which may correspond to a lower threshold for engaging in potentially disadvantageous and specifically suicidal behavior. Additionally, maximum attempt lethality, a hard outcome (relative to one’s level of intent or planning), should be considered a key dimension of past suicidal behavior in both research and practice.

Contrary to expectations, we found no evidence that people prone to suicide are unable to exploit the best of previously sampled options, with the caveat that the individuals at the highest risk were choosing from a more limited set than other participants. If anything, there was some evidence of overexploitation in the borderline personality disorder sample, with no group differences in the depression sample. Considering prior evidence associating suicidal behavior with disadvantageous value-based choices,^5,11,51 our findings suggest that such behavior may instead reflect an admixture of overly rigid and erratic behavioral patterns, challenging the notion of simple insensitivity to long-term value.
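The distinction between how well a participant exploits and how broadly they sample can be illustrated with a rough sketch. It assumes that each trial provides a response time and an obtained reward, and it uses the highest raw reward seen so far as a crude stand-in for the best previously sampled option. The study's model-based quantities are not reproduced here, and the function name `exploration_summary` and the example values are hypothetical.

```python
def exploration_summary(response_times, rewards):
    """Summarize breadth of sampling and closeness to the best sampled option.

    Returns (breadth, mean_distance), where breadth is the range of response
    times visited and mean_distance is the average distance between each
    choice and the best-rewarded response time sampled on earlier trials.
    """
    if len(response_times) != len(rewards):
        raise ValueError("response_times and rewards must have equal length")
    if len(response_times) < 2:
        raise ValueError("need at least two trials")
    breadth = max(response_times) - min(response_times)
    distances = []
    for t in range(1, len(response_times)):
        # Index of the highest-reward trial among trials 0..t-1
        # (ties resolve to the earliest such trial).
        best_prev = max(range(t), key=lambda i: rewards[i])
        distances.append(abs(response_times[t] - response_times[best_prev]))
    return breadth, sum(distances) / len(distances)


# Hypothetical participants (illustrative values, not study data).
narrow_rts = [1.0, 1.2, 1.1, 1.2, 1.2]
narrow_rewards = [10, 30, 20, 30, 30]
wide_rts = [0.5, 3.5, 2.0, 3.4, 3.5]
wide_rewards = [5, 60, 30, 58, 60]

print(exploration_summary(narrow_rts, narrow_rewards))
print(exploration_summary(wide_rts, wide_rewards))
```

In this toy example, the narrow sampler stays close to its best sampled option, so exploitation appears intact, yet it covers only a small part of the option space and never encounters the higher rewards available elsewhere.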

Limitations

The case-control design of our study limits causal inferences, a limitation partly offset by prospective validation. Future studies will also need to differentiate strategic from stochastic exploration, examine how affective states shape the set of options under consideration, particularly following adverse outcomes, and test formal accounts of affective meta-reasoning during learning and decision-making.^16,52

Conclusions

In summary, divergent behavioral signatures of high-lethality vs low-lethality suicide attempts likely expose distinct neurocognitive pathways. This divergence underscores the need for a broader taxonomy of clinically relevant individual differences in human decision-making.

Supplement 1.

eMethods

eTable 1. Demographic and clinical characterization of the BPD sample

eTable 2. Demographic and clinical characterization of the MDD sample

eTable 3. Model fits by diagnostic groups

eTable 4. BPD sample: Exploration and exploitation on the Clock task

eTable 5. BPD sample sensitivity analysis: Levels of depressive symptoms (excluding healthy controls)

eTable 6. BPD sample sensitivity analysis: Suicide attempt recency

eTable 7. BPD sample sensitivity analysis: Impulsivity

eTable 8. BPD sample sensitivity analysis: Medication exposure to antidepressants (excluding healthy controls)

eTable 9. BPD sample sensitivity analysis: Medication exposure to opioids (excluding healthy controls)

eTable 10. BPD sample sensitivity analysis: Medication exposure to sedatives or hypnotics (excluding healthy controls)

eTable 11. BPD sample sensitivity analysis: Medication exposure to antipsychotics (excluding healthy controls)

eTable 12. BPD sample sensitivity analysis: Estimated premorbid IQ

eTable 13. BPD sample sensitivity analysis: Executive function

eTable 14. BPD sample sensitivity analysis: Effects of working memory

eTable 15. BPD sample sensitivity analysis controlling for individual random slopes of RT lag

eTable 16. BPD sample sensitivity analysis controlling for individual random slopes of RT vmax

eTable 17. BPD sample sensitivity analysis controlling for individual random slopes of reward (reward lag)

eTable 18. BPD sample: model without RT vmax:Group:trial interaction as a covariate

eTable 19. BPD sample: Information dynamics reflective of option competition

eTable 20. MDD sample: Exploration and exploitation on the Clock task

eTable 21. BPD sample: model without reward lag:RT lag interaction as a covariate

eTable 22. BPD EMA subsample: Exploration and exploitation on the Clock task predicting the frequency of prospective daily suicidal ideation

eTable 23. BPD EMA subsample sensitivity analysis: Excluding 2 relatively extreme values on daily suicidal ideation measure

eTable 24. BPD EMA subsample sensitivity analysis: SA recency

eTable 25. BPD EMA subsample sensitivity analysis: Controlling for average levels of negative internalizing affect in daily life

eTable 26. BPD EMA subsample sensitivity analysis: Controlling for average levels of externalizing affect in daily life

eTable 27. BPD EMA subsample sensitivity analysis: Controlling for average levels of impulsive affect in daily life

eFigure 1. Reward magnitude and probability across time-varying contingencies

eFigure 2. Behavioral manipulation checks by group (generalized additive model)

eFigure 3. Behavioral manipulation checks (raw data, trial averages across groups)

eFigure 4. Behavioral manipulation checks (raw data, trial averages by group)

eFigure 5. Posterior predictive checks

jamapsychiatry-e241796-s001.pdf (653.9 KB, PDF)

Supplement 2.

Data sharing statement

jamapsychiatry-e241796-s002.pdf (16.2 KB, PDF)

References

1. Henriques G, Wenzel A, Brown GK, Beck AT. Suicide attempters’ reaction to survival as a risk factor for eventual suicide. Am J Psychiatry. 2005;162(11):2180-2182. doi: 10.1176/appi.ajp.162.11.2180
2. Wong PWC, Cheung DYT, Conner KR, Conwell Y, Yip PSF. Gambling and completed suicide in Hong Kong: a review of coroner court files. Prim Care Companion J Clin Psychiatry. 2010;12(6):PCC.09m00932. doi: 10.4088/PCC.09m00932blu
3. Vijayakumar L, Kumar MS, Vijayakumar V. Substance use and suicide. Curr Opin Psychiatry. 2011;24(3):197-202. doi: 10.1097/YCO.0b013e3283459242
4. Brown VM, Wilson J, Hallquist MN, Szanto K, Dombrovski AY. Ventromedial prefrontal value signals and functional connectivity during decision-making in suicidal behavior and impulsivity. Neuropsychopharmacology. 2020;45(6):1034-1041. doi: 10.1038/s41386-020-0632-0
5. Clark L, Dombrovski AY, Siegle GJ, et al. Impairment in risk-sensitive decision-making in older suicide attempters with depression. Psychol Aging. 2011;26(2):321-330. doi: 10.1037/a0021646
6. Dombrovski AY, Clark L, Siegle GJ, et al. Reward/punishment reversal learning in older suicide attempters. Am J Psychiatry. 2010;167(6):699-707. doi: 10.1176/appi.ajp.2009.09030407
7. Dombrovski AY, Szanto K, Clark L, Reynolds CF, Siegle GJ. Reward signals, attempted suicide, and impulsivity in late-life depression. JAMA Psychiatry. 2013;70(10):1020-1030. doi: 10.1001/jamapsychiatry.2013.75
8. Dombrovski AY, Hallquist MN, Brown VM, Wilson J, Szanto K. Value-based choice, contingency learning, and suicidal behavior in mid- and late-life depression. Biol Psychiatry. 2019;85(6):506-516. doi: 10.1016/j.biopsych.2018.10.006
9. Richard-Devantoy S, Berlim MT, Jollant F. A meta-analysis of neuropsychological markers of vulnerability to suicidal behavior in mood disorders. Psychol Med. 2014;44(8):1663-1673. doi: 10.1017/S0033291713002304
10. Tsypes A, Owens M, Gibb BE. Reward responsiveness in suicide attempters: an electroencephalography/event-related potential study. Biol Psychiatry Cogn Neurosci Neuroimaging. 2021;6(1):99-106. doi: 10.1016/j.bpsc.2020.04.003
11. Tsypes A, Szanto K, Bridge JA, Brown VM, Keilp JG, Dombrovski AY. Delay discounting in suicidal behavior: myopic preference or inconsistent valuation? J Psychopathol Clin Sci. 2022;131(1):34-44. doi: 10.1037/abn0000717
12. Vanyukov PM, Szanto K, Hallquist MN, et al. Paralimbic and lateral prefrontal encoding of reward value during intertemporal choice in attempted suicide. Psychol Med. 2016;46(2):381-391. doi: 10.1017/S0033291715001890
13. Baumeister RF. Suicide as escape from self. Psychol Rev. 1990;97(1):90-113. doi: 10.1037/0033-295X.97.1.90
14. Pollock LR, Williams JMG. Problem-solving in suicide attempters. Psychol Med. 2004;34(1):163-167. doi: 10.1017/S0033291703008092
15. Shneidman ES. Ten commonalities of suicide and their implications for response. Crisis. 1986;7(2):88-93.
16. Dombrovski AY, Hallquist MN. Search for solutions, learning, simulation, and choice processes in suicidal behavior. Wiley Interdiscip Rev Cogn Sci. 2022;13(1):e1561. doi: 10.1002/wcs.1561
17. Millner AJ, Robinaugh DJ, Nock MK. Advancing the understanding of suicide: the need for formal theory and rigorous descriptive research. Trends Cogn Sci. 2020;24(9):704-716. doi: 10.1016/j.tics.2020.06.007
18. Enfield NJ. Language vs. Reality: Why Language Is Good for Lawyers and Bad for Scientists. The MIT Press; 2022. doi: 10.7551/mitpress/12258.001.0001
19. van Rooij I. Psychological models and their distractors. Nat Rev Psychol. 2022;1(3):127-128. doi: 10.1038/s44159-022-00031-5
20. Gibson JJ. The Ecological Approach to Visual Perception. Houghton, Mifflin and Company; 1979.
21. Cisek P. Cortical mechanisms of action selection: the affordance competition hypothesis. Philos Trans R Soc Lond B Biol Sci. 2007;362(1485):1585-1599. doi: 10.1098/rstb.2007.2054
22. Cisek P, Kalaska JF. Neural mechanisms for interacting with a world full of action choices. Annu Rev Neurosci. 2010;33(1):269-298. doi: 10.1146/annurev.neuro.051508.135409
23. Hills TT, Todd PM, Lazer D, Redish AD, Couzin ID; Cognitive Search Research Group. Exploration versus exploitation in space, mind, and society. Trends Cogn Sci. 2015;19(1):46-54. doi: 10.1016/j.tics.2014.10.004
24. Sutton RS, Barto AG. Reinforcement Learning: An Introduction. 2nd ed. The MIT Press; 2018.
25. Moustafa AA, Cohen MX, Sherman SJ, Frank MJ. A role for dopamine in temporal decision making and reward maximization in parkinsonism. J Neurosci. 2008;28(47):12294-12304. doi: 10.1523/JNEUROSCI.3116-08.2008
26. Hallquist MN, Dombrovski AY. Selective maintenance of value information helps resolve the exploration/exploitation dilemma. Cognition. 2019;183:226-243. doi: 10.1016/j.cognition.2018.11.004
27. Amodeo DA, Jones JH, Sweeney JA, Ragozzino ME. Differences in BTBR T+ tf/J and C57BL/6J mice on probabilistic reversal learning and stereotyped behaviors. Behav Brain Res. 2012;227(1):64-72. doi: 10.1016/j.bbr.2011.10.032
28. Lee D, Conroy ML, McGreevy BP, Barraclough DJ. Reinforcement learning and decision making in monkeys during a competitive game. Brain Res Cogn Brain Res. 2004;22(1):45-58. doi: 10.1016/j.cogbrainres.2004.07.007
29. Skelin I, Hakstol R, VanOyen J, et al. Lesions of dorsal striatum eliminate lose-switch responding but not mixed-response strategies in rats. Eur J Neurosci. 2014;39(10):1655-1663. doi: 10.1111/ejn.12518
30. Frank MJ, Moustafa AA, Haughey HM, Curran T, Hutchison KE. Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning. Proc Natl Acad Sci U S A. 2007;104(41):16311-16316. doi: 10.1073/pnas.0706111104
31. Soloff P, Schmahl C. Suicide and nonsuicidal self-injury: prevalence in patients with personality disorders. In: Schmahl C, Phan KL, Friedel RO, eds; Siever LJ, collaborator. Neurobiology of Personality Disorders. Oxford University Press; 2018:237-255.
32. Black DW, Blum N, Pfohl B, Hale N. Suicidal behavior in borderline personality disorder: prevalence, risk factors, prediction, and prevention. J Pers Disord. 2004;18(3):226-239. doi: 10.1521/pedi.18.3.226.35445
33. Conwell Y, Duberstein PR, Cox C, Herrmann J, Forbes N, Caine ED. Age differences in behaviors leading to completed suicide. Am J Geriatr Psychiatry. 1998;6(2):122-126. doi: 10.1097/00019442-199805000-00005
34. De Leo D, Padoani W, Scocco P, et al. Attempted and completed suicide in older subjects: results from the WHO/EURO Multicentre Study of Suicidal Behaviour. Int J Geriatr Psychiatry. 2001;16(3):300-310. doi: 10.1002/gps.337
35. Dombrovski AY, Szanto K, Duberstein P, Conner KR, Houck PR, Conwell Y. Sex differences in correlates of suicide attempt lethality in late life. Am J Geriatr Psychiatry. 2008;16(11):905-913. doi: 10.1097/JGP.0b013e3181860034
36. Kaurin A, Dombrovski AY, Hallquist MN, Wright AGC. Suicidal urges and attempted suicide at multiple time scales in borderline personality disorder. J Affect Disord. 2023;329:581-588. doi: 10.1016/j.jad.2023.02.034
37. Tsypes A, Kaurin A, Wright AGC, Hallquist MN, Dombrovski AY. Protective effects of reasons for living against suicidal ideation in daily life. J Psychiatr Res. 2022;148:174-180. doi: 10.1016/j.jpsychires.2022.01.060
38. Posner K, Brown GK, Stanley B, et al. The Columbia-Suicide Severity Rating Scale: initial validity and internal consistency findings from three multisite studies with adolescents and adults. Am J Psychiatry. 2011;168(12):1266-1277. doi: 10.1176/appi.ajp.2011.10111704
39. Dombrovski AY, Luna B, Hallquist MN. Differential reinforcement encoding along the hippocampal long axis helps resolve the explore-exploit dilemma. Nat Commun. 2020;11(1):5407. doi: 10.1038/s41467-020-18864-0
40. Hallquist MN, Hwang K, Luna B, Dombrovski AY. Reward-based option competition in human dorsal stream and transition from stochastic exploration to exploitation in continuous space. Sci Adv. 2024;10(8):eadj2219. doi: 10.1126/sciadv.adj2219
41. Daunizeau J, Adam V, Rigoux L. VBA: a probabilistic treatment of nonlinear models for neurobiological and behavioural data. PLoS Comput Biol. 2014;10(1):e1003441. doi: 10.1371/journal.pcbi.1003441
42. McGirr A, Renaud J, Bureau A, Seguin M, Lesage A, Turecki G. Impulsive-aggressive behaviours and completed suicide across the life cycle: a predisposition for younger age of suicide. Psychol Med. 2008;38(3):407-417. doi: 10.1017/S0033291707001419
43. Strauss GP, Frank MJ, Waltz JA, Kasanova Z, Herbener ES, Gold JM. Deficits in positive reinforcement learning and uncertainty-driven exploration are associated with distinct aspects of negative symptoms in schizophrenia. Biol Psychiatry. 2011;69(5):424-431. doi: 10.1016/j.biopsych.2010.10.015
44. Gruber AJ, Thapa R, Randolph SH. Feeder approach between trials is increased by uncertainty and affects subsequent choices. eNeuro. 2017;4(6):ENEURO.0437-17.2017. doi: 10.1523/ENEURO.0437-17.2017
45. Thapa R, Donovan CH, Wong SA, Sutherland RJ, Gruber AJ. Lesions of lateral habenula attenuate win-stay but not lose-shift responses in a competitive choice task. Neurosci Lett. 2019;692:159-166. doi: 10.1016/j.neulet.2018.10.056
46. Ivan VE, Banks PJ, Goodfellow K, Gruber AJ. Lose-shift responding in humans is promoted by increased cognitive load. Front Integr Neurosci. 2018;12:9. doi: 10.3389/fnint.2018.00009
47. Cheadle S, Wyart V, Tsetsos K, et al. Adaptive gain control during human perceptual choice. Neuron. 2014;81(6):1429-1441. doi: 10.1016/j.neuron.2014.01.020
48. Dombrovski AY, Hallquist MN. The decision neuroscience perspective on suicidal behavior: evidence and hypotheses. Curr Opin Psychiatry. 2017;30(1):7-14. doi: 10.1097/YCO.0000000000000297
49. Sveticic J, De Leo D. The hypothesis of a continuum in suicidality: a discussion on its validity and practical implications. Ment Illn. 2012;4(2):73-78. doi: 10.4081/mi.2012.e15
50. Maris RW. Pathways to Suicide: A Survey of Self-Destructive Behaviors. Johns Hopkins University Press; 1981.
51. Perrain R, Dardennes R, Jollant F. Risky decision-making in suicide attempters, and the choice of a violent suicidal means: an updated meta-analysis. J Affect Disord. 2021;280(Pt A):241-249. doi: 10.1016/j.jad.2020.11.052
52. Huys QJM, Renz D. A formal valuation framework for emotions and their control. Biol Psychiatry. 2017;82(6):413-420. doi: 10.1016/j.biopsych.2017.07.003
