Author manuscript; available in PMC: 2020 Nov 4.
Published in final edited form as: JAMA Psychiatry. 2020 Jan 1;77(1):13–14. doi: 10.1001/jamapsychiatry.2019.2896

Machine Learning for Suicide Research–Can It Improve Risk Factor Identification?

Seena Fazel 1, Lauren O’Reilly 2
PMCID: PMC7116325  EMSID: EMS85712  PMID: 31642876

Machine learning is on the rise. According to Scopus (www2.scopus.com), the number of publications in medicine with machine learning in the title, abstract, or keywords increased from 1658 in 2016 to 3904 in 2018. In psychiatry, applications of machine learning have been proposed to improve the accuracy of diagnosis and prognosis and to determine treatment choice. At the same time, much of this research has given insufficient attention to high-quality methods, clinical applications, and ethical aspects. This is compounded by poor reporting of performance measures and misleading claims about the high accuracy of such approaches. In this issue of JAMA Psychiatry, the article by Gradus and colleagues1 raises important questions about the place of machine learning in research and practice.

Machine learning is useful when analyzing many predictors, particularly when there are nonlinear associations and interaction terms that are difficult to conceptualize theoretically and model practically. A strength of the study by Gradus et al1 is that it uses a robust data set of patient-level predictors from Danish national health care registers without imposing or relying on theoretical models. The authors tested 1365 parameters based on 334 individual predictors, most of which related to somatic and psychiatric diagnoses and were categorized by temporal proximity to the suicide event into 4 time points (6, 12, 24, and 48 months). Another strength is the outcome of death by suicide. As the study illuminates risk factors rather than prediction, this is important because risk factors for suicide differ from those for self-harm and suicidal ideation. For example, male sex is more strongly associated with suicide, whereas female sex is a risk factor for suicidal ideation and attempts.2 The use of high-quality Danish register data means that suicide is ascertained with negligible missing data; however, the authors did not consider injuries of undetermined intent as an outcome, which could lead to possible misclassification. Based on the 10 outcomes per predictor recommended by methodologists,3 the 14 463 suicides investigated also provided sufficient statistical power. A study of such large suicide numbers combined with hundreds of potentially modifiable predictors is an important advance for the field.
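As a back-of-the-envelope illustration of this rule of thumb, the following sketch checks the events-per-parameter ratio using the counts reported above (the check itself is not part of the original analysis):

```python
# Rule-of-thumb power check: outcome events per candidate parameter.
# Counts are taken from the study as described in this commentary.
n_suicides = 14_463    # outcome events (deaths by suicide)
n_parameters = 1_365   # candidate parameters tested

events_per_parameter = n_suicides / n_parameters
print(f"{events_per_parameter:.1f} events per parameter")

# The conventional minimum recommended by methodologists is roughly
# 10 events per parameter; the study clears this threshold, if only just.
assert events_per_parameter >= 10
```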

The most informative aspects of the study by Gradus et al1 lie in highlighting the importance of psychiatric disorders as risk factors for suicide. The analyses simultaneously accounted for somatic and psychiatric disorders and a few sociodemographic factors (eg, income, civil status, age bands, and being an immigrant). The study reported that diagnoses, such as schizophrenia and adjustment disorders, and markers of psychiatric morbidity, such as antidepressant and antipsychotic use, are strongly associated with suicide. This provides further evidence, in addition to findings from epidemiological and psychological autopsy studies, of the consistency and strength of modifiable risk factors. The advantage of machine learning in identifying predictors and their combinations opens up novel areas for future suicide research, including the role of comorbidity and the heterogeneity within disorders. One clear implication of the article is that prediction models will likely improve their accuracy and usability if they are developed for specific diagnostic groups. Clinicians who treat people with schizophrenia, and the interventions they can offer to prevent suicide,4 will differ from those who assess and manage adjustment disorders; many people with adjustment disorders will only present to primary care, where psychological therapies will be more important. Another implication of the findings is that, despite many strong predictors in the final model, these predictors are not necessarily causal. Future research will need to investigate causal associations between predictors identified in such models and suicide. For example, the finding that diagnoses occurring in the 48 months before suicide contributed more to prediction than those in the preceding 6 months does not negate the role of temporally close predictors; rather, it highlights that more research is required to examine the timing and chronicity of diagnoses associated with suicide.

The interpretation of these findings needs careful consideration to avoid the hype that frequently accompanies machine learning publications. Gradus et al1 provide no comparison with simpler modeling, which will need to be done to contextualize the findings. The authors used classification trees and random forest models, each of which attempts to address limitations of the other. Classification tree models, while more easily interpretable, are often unstable depending on the sample and variables available. Random forest models, while more stable, are often criticized for poor interpretability ("black box"). Despite such approaches, a recent review of 71 studies that compared machine learning prediction models with simpler regression models found no improvement from the machine learning ones,5 suggesting that simpler models would be preferred by clinical services that would struggle to implement complex algorithms into computer systems and the clinical workflow. Notably, the sensitivity of the model in Gradus et al1 is 32% based on a cutoff of the top 5% of men at risk, which means that for every 100 suicides, the model will detect 32. The resulting false-negative rate of 68% means that this model will not be acceptable to clinicians without further improvements or without adjunctive clinical decision-making. Nevertheless, it is possible that unstructured clinical judgment may be similarly predictive of suicide, and comparing these models with clinical decision-making will be informative.6 Even when the cutoff of the algorithm in Gradus et al1 was lowered to the top 10% of men at risk, the sensitivity was 49%, meaning that more than half of the suicides would be missed. For an outcome such as suicide, high false-positive rates may be tolerated if the linked interventions are not harmful to the individual (although the wider resource implications need to be considered), but the same will not be true of high false-negative rates.
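The arithmetic behind these sensitivity figures can be made explicit. A minimal sketch follows; the sensitivities at the two cutoffs are those quoted above, and the helper function is illustrative, not part of the study:

```python
# Translate a model's sensitivity into suicides detected vs missed
# per 100 suicides, at a given risk cutoff.

def detected_and_missed(sensitivity: float, n_suicides: int = 100):
    """Return (detected, missed) counts for a given sensitivity."""
    detected = round(sensitivity * n_suicides)
    return detected, n_suicides - detected

# Top 5% of men flagged as at risk: sensitivity 32%
print(detected_and_missed(0.32))  # (32, 68): a 68% false-negative rate

# Top 10% flagged: sensitivity 49%
print(detected_and_missed(0.49))  # (49, 51): over half still missed
```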
Additionally, calibration was not reported in Gradus et al,1 although it is an essential marker of performance: it assesses the reliability of the probabilistic risk predictions,7 in contrast to areas under the curve, sensitivities, and specificities, which measure discrimination between risk groups (such as the top 5% vs the rest).
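To make the distinction between discrimination and calibration concrete, the following self-contained sketch computes both on a toy data set (the data and function names are invented for illustration and do not come from the study):

```python
# Discrimination vs calibration, illustrated on toy data.
# y: observed outcomes (1 = event), p: model's predicted risks.
y = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
p = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.60, 0.70, 0.80, 0.90]

def auc(y, p):
    """Discrimination: probability a random case outranks a random
    noncase (rank-based AUC, the Mann-Whitney U statistic)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def calibration_in_the_large(y, p):
    """Calibration (weakest level): mean predicted risk vs the
    observed event rate; reliable predictions keep these close."""
    return sum(p) / len(p), sum(y) / len(y)

print(auc(y, p))                       # how well risks *rank* cases
print(calibration_in_the_large(y, p))  # how well risks match reality
```

A model can rank cases well (high AUC) while systematically over- or under-stating absolute risk, which is why both measures need reporting.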

Nevertheless, the article by Gradus et al1 stimulates research on how to improve these models. A prediction model can provide consistency, raise the ceiling of assessments, and enable the stratification of care transparently. Enhancing the model will come from optimizing the use of key predictors, such as age, which Gradus et al1 used as a categorical variable (with 4 cutoffs at ages 30, 40, 50, and 60 years), whereas a continuous variable would likely provide better accuracy. Predictors based on other background factors, such as educational attainment, may further improve models in light of research demonstrating links between IQ and suicide.8 While population-based registers are advantageous for adequate statistical power, they lack finer-grained measurements of the psychological constructs associated with suicide (eg, hopelessness) and of other time-sensitive environmental factors (eg, access to lethal means). Whether these add incremental validity over simpler measures requires research.
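The information lost by categorizing age can be seen directly. A small sketch, in which the banding function is a hypothetical reconstruction from the 4 cutoffs mentioned above:

```python
from bisect import bisect_right

# Hypothetical reconstruction of the age bands implied by cutoffs
# at 30, 40, 50, and 60 years (5 bands in total).
CUTOFFS = [30, 40, 50, 60]

def age_band(age: int) -> int:
    """Map a continuous age to a band index 0-4."""
    return bisect_right(CUTOFFS, age)

# Ages 31 and 39 fall in the same band, so any within-band risk
# gradient is invisible to a model fed the categorical version,
# whereas a model using age directly retains that gradient.
print(age_band(31), age_band(39))  # same band
print(age_band(29), age_band(61))  # lowest and highest bands
```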

As a necessary next step, Gradus and colleagues should make the algorithm available at no cost so that it can be independently validated, a key marker of validity for any prediction model. Future work needs to make available a prespecified protocol and clarify whether and how the final published study deviated from it. Reporting guidelines, such as the soon-to-be-released TRIPOD-ML,9 should be followed, including measures of discrimination and calibration, and journals could consider insisting on their completion at manuscript submission.

Going forward, the development of prediction models in specific, high-risk groups (eg, US Army soldiers after psychiatric treatment10) or those with certain psychiatric diagnoses (eg, schizophrenia spectrum disorders11) will likely have benefits of higher accuracy, more acceptability to clinicians, and simpler linkage to interventions. How they can be most effectively linked to evidence-based treatments needs careful examination; without such links, prediction models are unlikely to reduce suicidality outcomes.

Footnotes

Conflict of Interest Disclosures: None reported.

Contributor Information

Seena Fazel, Department of Psychiatry, University of Oxford, Oxford, Oxfordshire, England.

Lauren O’Reilly, Indiana University, Bloomington.

References

1. Gradus JL, Rosellini AJ, Horváth-Puhó E, et al. Prediction of sex-specific suicide risk using machine learning and single-payer health care registry data from Denmark. JAMA Psychiatry. Published online October 23, 2019. doi:10.1001/jamapsychiatry.2019.2905
2. Turecki G, Brent DA. Suicide and suicidal behaviour. Lancet. 2016;387(10024):1227–1239. doi:10.1016/S0140-6736(15)00234-2
3. Royston P, Sauerbrei W. Multivariable Model-Building: A Pragmatic Approach to Regression Analysis Based on Fractional Polynomials for Modelling Continuous Variables. Hoboken, NJ: John Wiley & Sons; 2008.
4. Chan SKW, Chan SWY, Pang HH, et al. Association of an early intervention service for psychosis with suicide rate among patients with first-episode schizophrenia-spectrum disorders. JAMA Psychiatry. 2018;75(5):458–464. doi:10.1001/jamapsychiatry.2018.0185
5. Christodoulou E, Ma J, Collins GS, Steyerberg EW, Verbakel JY, Van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12–22. doi:10.1016/j.jclinepi.2019.02.004
6. Whiting D, Fazel S. How accurate are suicide risk prediction models? Asking the right questions for clinical practice. Evid Based Ment Health. 2019;22(3):125–128. doi:10.1136/ebmental-2019-300102
7. Van Calster B, Nieboer D, Vergouwe Y, De Cock B, Pencina MJ, Steyerberg EW. A calibration hierarchy for risk models was defined: from utopia to empirical data. J Clin Epidemiol. 2016;74:167–176. doi:10.1016/j.jclinepi.2015.12.005
8. Gunnell D, Magnusson PK, Rasmussen F. Low intelligence test scores in 18 year old men and risk of suicide: cohort study. BMJ. 2005;330(7484):167. doi:10.1136/bmj.38310.473565.8F
9. Collins GS, Moons KGM. Reporting of artificial intelligence prediction models. Lancet. 2019;393(10181):1577–1579. doi:10.1016/S0140-6736(19)30037-6
10. Kessler RC, Warner CH, Ivany C, et al; Army STARRS Collaborators. Predicting suicides after psychiatric hospitalization in US Army soldiers: the Army Study To Assess Risk and Resilience in Servicemembers (Army STARRS). JAMA Psychiatry. 2015;72(1):49–57. doi:10.1001/jamapsychiatry.2014.1754
11. Fazel S, Wolf A, Larsson H, Mallett S, Fanshawe TR. The prediction of suicide in severe mental illness: development and validation of a clinical prediction rule (OxMIS). Transl Psychiatry. 2019;9(1):98. doi:10.1038/s41398-019-0428-3
