American Journal of Epidemiology. 2021 Apr 20;190(12):2517–2527. doi: 10.1093/aje/kwab112

Predicting Sex-Specific Nonfatal Suicide Attempt Risk Using Machine Learning and Data From Danish National Registries

Jaimie L Gradus, Anthony J Rosellini, Erzsébet Horváth-Puhó, Tammy Jiang, Amy E Street, Isaac Galatzer-Levy, Timothy L Lash, Henrik T Sørensen
PMCID: PMC8796814  PMID: 33877265

Abstract

Suicide attempts are a leading cause of injury globally. Accurate prediction of suicide attempts might offer opportunities for prevention. This case-cohort study used machine learning to examine sex-specific risk profiles for suicide attempts in Danish nationwide registry data. Cases were all persons who made a nonfatal suicide attempt between 1995 and 2015 (n = 22,974); the subcohort was a 5% random sample of the population at risk on January 1, 1995 (n = 265,183). We developed sex-stratified classification trees and random forests using 1,458 predictors, including demographic factors, family histories, psychiatric and physical health diagnoses, surgery, and prescribed medications. We found that substance use disorders/treatment, prescribed psychiatric medications, previous poisoning diagnoses, and stress disorders were important factors for predicting suicide attempts among men and women. Individuals in the top 5% of predicted risk accounted for 44.7% of all suicide attempts among men and 43.2% of all attempts among women. Our findings illuminate novel risk factors and interactions that are most predictive of nonfatal suicide attempts, while consistency between our findings and previous work in this area adds to the call to move machine learning suicide research toward the examination of high-risk subpopulations.

Keywords: Denmark, machine learning, National Registry, prediction, suicide attempts

Abbreviations

AUC, area under the curve; CART, classification and regression tree; CI, confidence interval; ICD-10, International Classification of Diseases, Tenth Revision; RF, random forest.

Editor’s note: An invited commentary on this article appears on page 2528, and the authors’ response appears on page 2534.

Suicide attempts are a global public health concern. The World Health Organization estimates that approximately 16,000,000 suicide attempts occur annually worldwide (1). Suicide attempts are a leading cause of injury (2), can lead to disability (3) and subsequent psychiatric illness (4), and are associated with death from suicide (5). Suicide attempts have a negative impact on the lives of family members, friends, colleagues, and others who might suffer emotional distress in response to the suicidal behavior (6–8). The economic cost of suicide attempts is estimated to be in the billions of dollars worldwide (9, 10). Given the magnitude of the global burden of suicide attempts and associated health, social, and economic costs, there is heightened interest in research to predict suicide attempts, with the goal of identifying high-risk individuals who could receive targeted preventive interventions (11).

Although many risk factors for suicide attempts have been identified, a 2017 meta-analysis of the past 50 years of research found that existing statistical models’ ability to predict suicide attempts remains only slightly better than chance (12). There are several potential explanations for these discouraging results. Most studies examine a small number of risk factors, whereas accurate prediction of suicide attempts likely requires consideration of hundreds of risk factors and their interactions (12, 13). Further, conventional parametric statistical techniques are not well-suited to develop prediction models involving a large number of risk factors (13). In contrast, supervised machine learning methods can detect complex patterns and return useful algorithms for predicting suicidal behavior (13–15). Some machine learning methods (e.g., recursive partitioning) also provide metrics of variable importance that can be used to identify novel predictors/interactions. Earlier studies have found that machine learning algorithms were able to predict future suicide attempts accurately among patients in electronic health record databases, soldiers with psychiatric disorders, and following outpatient mental health visits (13, 14, 16). To our knowledge, no study has applied machine learning to determine risk of suicide attempts in a nationwide sample of civilians.

Accurate prediction of suicide attempt risk requires separate algorithms for men and women; there are well-known sex differences in the incidence of suicidal behavior and risk factors (17). The rate of suicide attempts among women in Denmark increased from 138 (95% confidence interval (CI): 133, 142) per 100,000 person-years in 1994 to 153 (95% CI: 148, 158) per 100,000 person-years in 2011. Among Danish men, the rate of suicide attempts decreased from 95 (95% CI: 91, 99) per 100,000 person-years to 78 (95% CI: 75, 82) per 100,000 person-years over the same period (18). A systematic review and meta-analysis of gender differences in suicidal behavior found that some diagnostic risk factors were more important in women (e.g., eating disorders, posttraumatic stress disorder, abortion) compared with men (e.g., conduct problems, parental divorce, friend’s death from suicide) (19).

Our team has previously used machine learning to predict suicide death in the Danish population (20). There is compelling evidence that attempted suicide and death from suicide are different events (21, 22) with nonoverlapping etiology (23–25). Specifically, only a small proportion of people who attempt suicide and survive go on to die by suicide; estimates vary across studies, but generally less than 15% of people who make a nonfatal suicide attempt go on to die by suicide (26, 27). Research has shown that older age, male sex, suicide attempts, socioeconomic status, depression, alcohol use disorder, physical health problems, and recent psychiatric care distinguish persons who die by suicide from persons who made a nonfatal suicide attempt (23–25). Thus, the existing evidence supports examining nonfatal suicide attempt as a separate outcome from suicide death. The goal of the present study was to identify key predictors and develop machine learning algorithms for nonfatal suicide attempts in a large nationwide sample using data from the Danish national health-care and social registries.

METHODS

Study sample

Persons born or residing in Denmark as of January 1, 1995 (n = 5,303,674), served as the source population for this study. We chose 1995 as the beginning of the study period because it coincided with the inclusion of diagnosis codes from outpatient visits in the national registries and followed the 1994 implementation of International Classification of Diseases, Tenth Revision (ICD-10), for diagnostic coding in Denmark (28). Cases were persons who received an ICD-10 diagnostic code for a nonfatal suicide attempt between 1995 and 2015 (X60-X84; n = 22,974; only first diagnoses in the study period are included) without a death recorded in the national registries in the subsequent 30 days. The comparison cohort was a 5% random sample of individuals living in Denmark on January 1, 1995 (n = 265,183).

Data sources

Medical care is provided to all residents of Denmark through a tax-funded system, with receipt of health care recorded in national medical and administrative registries (28–30). The Central Personal Register number, a unique personal identifier assigned to all residents of Denmark at birth or upon immigration, was used to merge individual data across the registries described below. (See Web Table 1, available at https://doi.org/10.1093/aje/kwab112, for a summary of all predictors examined in the analyses.)

Demographic data were obtained from the Danish Civil Registration System, which has been updated daily since 1989 and provided data on sex, age, immigrant status (yes/no), generation of citizenship, family suicidal behavior (death by suicide or suicide attempt of a parent, spouse, or registered partner), and marital status (31–33). Data on income and employment were obtained from the Income Statistics Register and the Integrated Database for Labour Market Research, respectively (34, 35).

The Danish Psychiatric Central Research Register records psychiatric inpatient and outpatient care, including admission and discharge dates with up to 20 primary and secondary diagnoses per entry (36). Several studies have documented the high quality of diagnoses in this registry (36, 37). We used 2-digit ICD-10 codes to capture psychiatric diagnoses (e.g., code F32 was used to capture a depressive episode and code F33 was used to capture a recurrent depressive disorder).

The Danish National Patient Registry records diagnoses received in an inpatient or outpatient somatic treatment setting, including treatment dates and up to 20 primary and secondary diagnostic codes, surgery codes, and examination codes (28, 38). Validation studies have documented the high validity of many diagnoses recorded in this registry (28, 39). Diagnoses from this registry were obtained using second-level ICD-10 diagnosis groupings (e.g., codes G00–G09 for inflammatory diseases of the central nervous system). Surgery procedure codes were categorized by body system (e.g., code B was endocrine system surgeries).

The Danish National Prescription Registry contains data on prescription drugs sold in Danish pharmacies (40), including name, date of dispensing, and Anatomical Therapeutic Chemical (ATC) classification code. Data in this registry are considered complete and valid (41). Level 3 ATC codes (pharmacological subgroup) were used in the present study.

Analytical procedures

Some predictors (e.g., demographic factors) were generally kept in their registry-based form in the analyses. For other predictors, we created time-varying dummy codes (i.e., indicators for diagnoses and medications recorded in the 0–6-, 0–12-, 0–24-, and 0–48-month intervals before the suicide attempt) to examine proximal and distal predictors. For members of the comparison subcohort, we randomly selected a month during the study period and evaluated the prevalence of predictors in the same time intervals preceding the first day of that month.
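The look-back coding described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the study (the analyses were run in SAS and R). The function names, the whole-calendar-month convention, and the handling of the window boundaries are all assumptions made for the example.

```python
from datetime import date

# Look-back windows in months, as listed in the text.
WINDOWS_MONTHS = (6, 12, 24, 48)


def months_between(earlier: date, later: date) -> int:
    """Completed calendar months from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month has not been completed yet
    return months


def window_indicators(event_dates, index_date):
    """Return {window_in_months: 0 or 1} for one predictor and one person.

    A flag is 1 if any event (diagnosis or dispensing) falls strictly before
    the index date and fewer than `window` completed months before it.
    The boundary convention is an assumption of this sketch.
    """
    flags = {}
    for window in WINDOWS_MONTHS:
        flags[window] = int(any(
            d < index_date and months_between(d, index_date) < window
            for d in event_dates
        ))
    return flags
```

For a case, `index_date` would be the date of the suicide attempt; for a subcohort member, it would be the first day of the randomly selected month. Because the windows are nested, an event that sets the 0–6-month flag also sets the three longer ones.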

Data reduction

The data reduction process included elimination of rare predictors (fewer than 10 observations among cases and the subcohort (14, 42)) and of predictors with negligible associations with attempted suicide; we retained only predictors with an unadjusted odds ratio of <0.9 or ≥1.1. We eliminated emergency room diagnoses due to their low positive predictive value (43, 44). The initial data set included 2,559 predictor variables. Following data reduction, the analytical data set included 1,458 predictors. (See Web Table 1 for a summary of initial and final predictors.)
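The two screening rules above can be expressed as a short filter. The sketch below is illustrative only and is not the study's code. It assumes that "rare" means fewer than 10 exposed persons across cases and subcohort combined, and that the odds ratio is the simple cross-product from a 2 × 2 table; the function names and the treatment of zero cells are also assumptions.

```python
def odds_ratio(a, b, c, d):
    """Unadjusted odds ratio from a 2x2 table.

    a: exposed cases        b: unexposed cases
    c: exposed subcohort    d: unexposed subcohort
    """
    if b == 0 or c == 0:
        return float("inf")  # assumption: treat an undefined ratio as very large
    return (a * d) / (b * c)


def keep_predictor(a, b, c, d, min_count=10, lower=0.9, upper=1.1):
    """Apply the two data-reduction rules described in the text."""
    if a + c < min_count:  # rare predictor: too few exposed persons overall
        return False
    estimate = odds_ratio(a, b, c, d)
    return estimate < lower or estimate >= upper  # drop near-null associations
```

For example, `keep_predictor(100, 900, 500, 9500)` keeps a predictor whose odds ratio is about 2.1, whereas `keep_predictor(50, 950, 500, 9500)` drops one whose odds ratio is exactly 1.0.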

Main analyses

We estimated classification and regression tree (CART) models for an initial visual evaluation of the data structure (45, 46). We employed the R (R Foundation for Statistical Computing, Vienna, Austria) package rpart, which uses a 10-fold (internal) cross-validation procedure (47). To mitigate risk of model overfit and ensure that the trees would be interpretable, we set both the maximum tree depth and the minimum number of observations in terminal and parent nodes to 10. Given class imbalance (i.e., 92% of the sample did not have the outcome), CART was implemented using equal priors rather than the rpart default of priors proportional to the outcome frequency (48, 49). Risk of attempted suicide was computed for each identified combination (“branch”) of predictors.
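To illustrate why the choice of priors matters under class imbalance, the sketch below computes prior-adjusted class probabilities within a single tree node. It follows the node-probability formula described in the rpart documentation, p(i | A) = π_i (n_iA / n_i) / Σ_j π_j (n_jA / n_j), as we read it. It is written in Python for illustration and is not the study's code; the node counts are hypothetical, and only the class totals for men are taken from the text.

```python
def node_class_probs(node_counts, total_counts, priors):
    """Prior-adjusted class probabilities inside one tree node.

    node_counts[i]  : observations of class i that fall in the node
    total_counts[i] : observations of class i in the whole sample
    priors[i]       : prior probability assigned to class i
    """
    weighted = [
        prior * in_node / total
        for prior, in_node, total in zip(priors, node_counts, total_counts)
    ]
    norm = sum(weighted)
    return [w / norm for w in weighted]


# Class totals for men, from the text: (comparison subcohort, cases).
totals = (130_591, 9_546)
# Hypothetical node holding 1,000 subcohort members and 100 cases.
node = (1_000, 100)

n_all = sum(totals)
proportional = node_class_probs(node, totals, [t / n_all for t in totals])
equal = node_class_probs(node, totals, [0.5, 0.5])
```

With proportional priors the case probability in this hypothetical node is simply 100/1,100, so the node would be labeled as non-case. With equal priors the same node has a case probability above 0.5, which is how equal priors let a tree split toward the minority class.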

Next, we implemented a random forest (RF) classifier using the R package randomForest (50). Each RF was built with 1,000 trees, at least 10 observations were required to attempt a split, and 38 variables were selected as split candidates at each node (i.e., square root of total number of predictors, the randomForest default). Each individual tree was built using all suicide attempt observations plus an equal number of randomly selected comparison cohort observations (using the sampsize tuning parameter) to address the class imbalance (51, 52). We used 2-fold internal cross-validation to generate RF predicted values (i.e., predicted values for fold 1 calculated based on RF model estimated in fold 2, and vice versa). Mean decrease in accuracy was used to evaluate the importance of each variable (across all trees) in both folds. Mean decrease in accuracy reflects the extent of outcome misclassification if a variable were excluded, due either to main effects or interactions (i.e., because RF is a tree-based/recursive partitioning method) (48).
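The sampling design in this paragraph can be sketched in a few lines. The Python below is an illustration rather than the randomForest implementation: it shows the square-root rule for the number of split candidates, the per-tree balanced sample (all cases plus an equal number of comparison observations, drawn here without replacement, which is an assumption), and a random split into 2 folds. All function names are hypothetical.

```python
import math
import random


def default_mtry(n_predictors):
    """Split candidates per node under the square-root rule (rounded down)."""
    return math.floor(math.sqrt(n_predictors))


def balanced_tree_sample(case_ids, comparison_ids, rng):
    """IDs used to grow one tree: every case plus an equal-sized random
    draw of comparison observations (without replacement in this sketch)."""
    sampled = rng.sample(list(comparison_ids), k=len(case_ids))
    return list(case_ids) + sampled


def two_fold_split(ids, rng):
    """Randomly split IDs into 2 folds of (nearly) equal size."""
    shuffled = list(ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


rng = random.Random(2021)  # fixed seed so the sketch is reproducible
mtry = default_mtry(1458)  # 38, matching the value stated in the text
```

In the 2-fold scheme, a forest fitted on fold 2 would generate the predicted values for fold 1, and a forest fitted on fold 1 would generate those for fold 2, so that no observation is scored by a model that saw it during training.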

We evaluated prediction accuracy (i.e., discrimination) using receiver operating characteristic curve analysis conducted in 1,000 bootstrap replicates and calculated area under the curve (AUC) and its 95% confidence interval (53). We calculated additional operating characteristics (e.g., risk ratio, sensitivity) using “high risk” subgroups and predicted risk thresholds (e.g., based on CART terminal nodes and RF predicted values). Although other metrics exist to evaluate performance of a prediction model, we prioritized AUC and sensitivity: 1) to be consistent with prior supervised machine learning studies of suicide (14, 16) (e.g., for comparison purposes), 2) because we were most concerned with identifying true positive cases (sensitivity), and 3) given our additional goal of evaluating variable importance (i.e., evaluating all possible performance metrics was not a goal). Missing data were scarce (28), and demographic predictors with missing data were imputed using the software default of rpart (surrogate variables) and randomForest (modal value).
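A bootstrap confidence interval for the AUC can be sketched as below. This Python example is illustrative and is not the procedure from the cited software (53): it computes the AUC from ranks (equivalent to the Mann-Whitney statistic) and uses a simple percentile interval, and both the percentile method and the function names are assumptions.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random
    non-case, with ties counted as one half. Labels must be 0 or 1."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1  # pairs[i:j] share one score, so they share one rank
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        boot_labels = [labels[k] for k in idx]
        if 0 < sum(boot_labels) < n:  # skip resamples missing a class
            stats.append(auc([scores[k] for k in idx], boot_labels))
    stats.sort()
    lower = stats[round((alpha / 2) * (n_boot - 1))]
    upper = stats[round((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper
```

Here `scores` would be the cross-validated predicted probabilities and `labels` the observed outcome (1 for a case, 0 for a subcohort member), with the calculation run separately for men and women.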

As noted above, rates and trends in suicide attempt differ by sex. Machine learning approaches that rely on stratification for model development, such as our classification tree and random forest approaches, can use an a priori stratified analysis when patterns are known to differ among certain groups. Although these models might be capable of identifying sex as a predictor, stratified patterns of risk across men and women would only be explored from the point at which they enter the model, and not overall. Thus, analyses were performed separately by sex in SAS, version 9.4 (SAS Institute, Inc., Cary, North Carolina), and R, version 3.5.2 (R Foundation for Statistical Computing, Vienna, Austria) (54, 55). The study was approved by the institutional review board at Boston University and approved by the Danish Data Protection Agency (record number 2015-57-0002).

RESULTS

Among men, 9,546 had an incident nonfatal suicide attempt during the study period, and there were 130,591 men in the corresponding comparison subcohort (sample outcome prevalence = 6.8%). Among women, 13,428 had an incident nonfatal suicide attempt, and there were 134,592 women in the corresponding comparison subcohort (sample outcome prevalence = 9.1%). Cases were younger than members of the comparison cohort on average, and a greater proportion of cases were single, while a lower proportion of cases were in the highest income quartile (Table 1).

Table 1.

Characteristics of the Nonfatal Suicide Attempt Cases and the General Population Subcohort, Denmark, January 1, 1995

| Characteristic | Men: Suicide Attempt Cases (n = 9,546), %^a | Men: Comparison Subcohort (n = 130,591), %^a | Women: Suicide Attempt Cases (n = 13,428), %^a | Women: Comparison Subcohort (n = 134,592), %^a |
|---|---|---|---|---|
| Age, years^b | 28 (16–39) | 36 (20–53) | 23 (9–38) | 39 (21–57) |
| Marital status | | | | |
| Married/registered partner | 23 | 41 | 23 | 40 |
| Divorced | 8.9 | 6.3 | 10 | 7.6 |
| Single | 67 | 49 | 64 | 41 |
| Widowed | 0.8 | 2.9 | 2.3 | 11 |
| Unknown | 0.6 | 0.7 | 0.6 | 0.7 |
| Immigrant | 4.7 | 4.4 | 4.5 | 4.1 |
| Income quartile | | | | |
| <1 | 24 | 18 | 22 | 23 |
| 1–2 | 21 | 15 | 23 | 25 |
| 2–3 | 19 | 18 | 15 | 24 |
| >3 | 16 | 31 | 4.8 | 12 |
| Age ≤14 | 19 | 16 | 33 | 15 |
| Missing | 1.5 | 1.1 | 1.9 | 1.3 |

a Values reported to 2 significant digits.

b Values are expressed as median (interquartile range).

Classification and regression trees

Among men, the highest risk of attempted suicide was among persons with a diagnosis of poisoning by, adverse effect of, and underdosing of drugs, medicaments, and biological substances (referred to hereafter as “poisoning”) but without recorded prescriptions for antidepressants or an alcohol-related diagnosis in the preceding 48 months (n = 621; risk = 0.64). The second-highest risk group comprised men under age 50 with a stress disorder diagnosis in the preceding 48 months but without pharmacotherapy (e.g., no prescriptions for antidepressants, antipsychotics, drugs used to treat addictive disorders, or hypnotics or sedatives) and with no alcohol-related or poisoning diagnoses (n = 120; risk = 0.51). Other characteristic combinations of variables associated with a high risk of attempted suicide are displayed in Figure 1 (AUC = 0.83, 95% CI: 0.83, 0.84).

Figure 1.

Classification tree depicting suicide attempt predictors among men in Denmark, 1995–2015. Poisoning refers to poisoning by, adverse effect of, and underdosing of drugs, medicaments, and biological substances. Drugs refers to drugs used in addictive disorders. AD, adjustment disorders; RSS, reaction to severe stress.

Among women, the highest risk of attempted suicide was among persons who were under age 50, had retired early, and had a diagnosis of poisoning in the preceding 48 months but no antidepressant prescription (n = 717; risk = 0.85). Women who were under age 50, had retired early, and had a stress disorder diagnosis in the preceding 48 months, but no antidepressant prescription or diagnosis of poisoning, had the next highest risk (n = 291; risk = 0.65). Other characteristic combinations associated with a high risk of attempted suicide are displayed in Figure 2 (AUC = 0.86, 95% CI: 0.86, 0.87).

Figure 2.

Classification tree depicting suicide attempt predictors among women in Denmark, 1995–2015. Poisoning refers to poisoning by, adverse effect of, and underdosing of drugs, medicaments, and biological substances. The referent of income quartile 2 is income quartile 1. AD, adjustment disorders; MDD, major depressive disorder; RSS, reaction to severe stress.

Random forest

Among men, 79%/78% (fold 1/fold 2) of predictors had a mean decrease in accuracy above 0 (average values = 6.3/6.2). Twenty-five predictors were among the respective top 30 most important predictors in both folds (Figure 3). Removal of variables representing alcohol disorders, drugs used to treat addictive disorders (e.g., nicotine, alcohol, and opioid dependence), antidepressant pharmacotherapy, and stress or adjustment diagnoses in the preceding 48 months had the largest impact on prediction accuracy (mean decrease in accuracy >35 in both folds). Other predictors consistently in the top 30 list included poisoning diagnoses, schizophrenia or major depression diagnoses, and pharmacotherapy with antipsychotics, hypnotics or sedatives, and anxiolytics. Variables representing personality disorders and gastrointestinal problems (e.g., gastroesophageal reflux disease and endoscopy) also had consistently high mean decrease in accuracy values across folds (although some fell outside the top 30 threshold in one fold). All variables displayed in the CART figure (Figure 1) were among the top 30 most important RF predictors, except for early retirement (which had a negative RF mean decrease in accuracy value, possibly suggesting instability of the predictor, contributing to CART overfitting). The cross-validated AUC for the RF model was 0.89 (95% CI: 0.88, 0.89).

Figure 3.

Variable importance to suicide attempt prediction accuracy among men in Denmark, 1995–2015. The black dots represent the mean decrease in accuracy (MDA) value in fold 1, and the gray dots represent the MDA value in fold 2. The predictors that were in the top 30 predictors in folds 1 and 2 for men are shown in bold. The reference group for age ≤14 years is income quartile 1. The reference group for state pension is employed. Poisoning refers to poisoning by, adverse effect of, and underdosing of drugs, medicaments, and biological substances. GORD, drugs for peptic ulcer and gastro-esophageal reflux disease; RSS, reaction to severe stress.

Among women, 78%/79% (fold 1/fold 2) of the total number of predictors had a mean decrease in accuracy above 0 (average values = 5.6/5.5). Twenty-two predictors overlapped across the top 30 lists from each fold. Poisoning in the prior 48 months had the greatest impact on model accuracy. Similar to men, predictors reflecting alcohol-related disorders, major depression, stress/adjustment diagnoses, and use of drugs in common mental disorder pharmacotherapy classes (e.g., antidepressants) had among the largest impacts on prediction accuracy. One female-specific predictor, hormonal contraceptive use in the prior 6 months, also emerged as important across folds, although it fell just outside the top 30 predictors in fold 2 (Figure 4). Most variables displayed in the CART figure (Figure 2) emerged as being among the top 30 most important RF predictors, or fell just outside the top 30 (e.g., income quartile 2), with a few exceptions (e.g., missing income with a negative RF mean decrease in accuracy). The cross-validated AUC for the RF model was 0.91 (95% CI: 0.91, 0.92).

Figure 4.

Variable importance to suicide attempt prediction accuracy among women in Denmark, 1995–2015. The black dots represent the mean decrease in accuracy (MDA) value in fold 1, and the gray dots represent the MDA value in fold 2. The predictors that were in the top 30 predictors in folds 1 and 2 for women are shown in bold. The reference group for age ≤14 years is income quartile 1. The reference group for state pension is employed. Poisoning refers to poisoning by, adverse effect of, and underdosing of drugs, medicaments, and biological substances. GORD, drugs for peptic ulcer and gastro-esophageal reflux disease; RSS, reaction to severe stress.

Operating characteristics of high-risk thresholds

Cross-validated RF predicted probabilities were rank ordered, and operating characteristics were calculated among individuals in the top quintile of the predicted risk distribution. Men in the top 5%, 10%, and 20% of predicted risk accounted for 44.7%, 65.0%, and 79.8% of all male cases of attempted suicide (i.e., sensitivity), respectively (specificity = 97.9%, 94.0%, and 84.4%, respectively). Women in the top 5%, 10%, and 20% of predicted risk accounted for 43.2%, 65.0%, and 81.7% of all female attempted suicides, respectively (specificity = 98.8%, 95.5%, and 86.2%, respectively). The sensitivity among individuals in the top 5% of predicted risk was 8.9 times higher than the expected value among men (44.7%/5%) and 8.6 times higher than the expected value among women (43.2%/5%).
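The relationship among these quantities can be checked with simple arithmetic. The Python sketch below is illustrative only; it assumes that each "top x%" threshold is taken over cases plus subcohort members within a sex-specific sample, and the function name is hypothetical. The counts and sensitivities are those reported in the text.

```python
def operating_characteristics(n_cases, n_noncases, top_fraction, sensitivity):
    """Derive specificity and the ratio of observed to expected sensitivity
    for a high-risk group defined as the top `top_fraction` of predicted risk."""
    n_flagged = top_fraction * (n_cases + n_noncases)  # size of the high-risk group
    true_pos = sensitivity * n_cases                   # cases inside the group
    false_pos = n_flagged - true_pos                   # non-cases inside the group
    specificity = 1 - false_pos / n_noncases
    ratio = sensitivity / top_fraction                 # e.g., 44.7% / 5%
    return specificity, ratio


# Men, top 5% of predicted risk (counts and sensitivity from the text).
spec_men, ratio_men = operating_characteristics(9_546, 130_591, 0.05, 0.447)
# Women, top 5% of predicted risk.
spec_women, ratio_women = operating_characteristics(13_428, 134_592, 0.05, 0.432)
```

Under these assumptions the derived specificities round to 97.9% for men and 98.8% for women, and the ratios round to 8.9 and 8.6, in line with the values reported above.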

DISCUSSION

This study examined sex-specific models for nonfatal suicide attempts using machine learning and Danish national registry data. Variables included as potential predictors encompassed demographics, family history of suicidal behavior, psychiatric and physical health diagnoses, and medication. We found novel predictors and interactions between predictors of nonfatal attempted suicide risk, classified primarily at 24 and 48 months prior to suicide attempt, and our RF models achieved excellent prediction accuracy (i.e., AUC near or above 0.90) (56).

Consistent with the existing literature, psychiatric disorders and associated medications were important predictors of suicide attempts across models (57–59). Specifically, substance abuse–related disorders, use of psychopharmacological medications, and stress disorders appeared prominently in the RF results among both men and women. For substance abuse–related diagnoses, this result is consistent with literature documenting an association between these diagnoses and suicidal behavior (60, 61). For stress disorders, particularly posttraumatic stress disorder, the literature on suicidal behavior is somewhat more ambiguous. In our models, stress disorders were important for the accuracy of suicide attempt prediction in this population, highlighting the importance of the continued examination of stress disorders as risk factors for suicidal behavior.

A long-standing discussion in the literature is whether pharmacologic agents increase suicide risk (62–68). In this study, antidepressants, antipsychotics, sedative/hypnotics, and medications used to treat addictions were important to the accuracy of predicting nonfatal suicide attempt. The causal association between pharmacotherapeutic drug use and suicide attempts, and the scenarios under which they might increase or decrease risk, could not be clarified in this study. It is an additional important area for continued research.

For both men and women, several variables representing gastrointestinal problems and social factors were important to the accuracy of suicide attempt prediction. Although the literature is small, other studies have found elevated risk of suicidal behaviors among patients with chronic abdominal pain and irritable bowel syndrome (69), potentially indicating an opportunity for suicide screening in nonpsychiatric care settings. The inclusion of social variables is also noteworthy in the context of the relatively well-resourced social setting of Denmark. Our findings are consistent with a growing awareness of the impact of social factors on a person’s physical and mental health (70). Further, our results highlight the potential for nonmedical points of suicide intervention that are worth consideration.

In contrast to our recent work examining risk factors for death from suicide in the Danish population (20), this study focused on nonfatal suicide attempt outcomes, and a few differences in the results are worth noting. In our previous work, physical health diagnoses were important predictors of men’s risk of death from suicide. We found less evidence for this in the present study. In contrast, social variables (e.g., early retirement) were more prominent risk factors of suicide attempt than of suicide death. We also observed some expected consistency with our results on death from suicide, with psychiatric disorders and psychotropic medication use being important to risk of death and nonfatal attempts. In light of this consistency across large, population-based, machine learning studies of suicidal behavior (14, 16, 20, 71), leaders in the field have called for developing machine-learning models for suicide in specific high-risk populations, such as persons with diagnosed psychopathology (72). The present work adds to the evidence for an advancement of the field in this direction.

Our results should be interpreted in the context of several limitations. First, we relied on only 2 machine learning classifiers: CART and RF. Other classifiers or meta-classifiers (e.g., super learning (73)) might improve prediction performance. Expanding the classifiers used to examine suicide attempt prediction is an important area for further research. A second limitation is possible misclassification of attempted suicide in registry data. There is reason to believe, based on the results of our models, that some suicide attempts are likely classified as other diagnoses (e.g., injuries to the wrist and hand were found to be predictive of attempted suicide among women in our CART model). A recent validation study of attempted suicide codes in Denmark found that the ICD-10 codes used in the present study had the best positive predictive value (72.7%) of all examined methods of capturing suicide attempts (74). Despite being the best available option, our results must be interpreted within the context of misclassification of attempted suicide. The impact of misclassification on machine learning results is an underexplored area, so it is hard to judge how this bias might have affected our results (75). Third, we excluded emergency room data due to concerns about validity. Variables related to emergency care (e.g., frequency of use) are associated with suicidal behavior in other health-care systems, and thus, this might be a limitation of our models (76–78). Fourth, it is possible that the diagnostic codes we used to capture nonfatal suicide attempts also capture nonsuicidal self-injury (i.e., deliberately inflicting pain to one’s body without suicidal intent) (79). Thus, our outcome group might include a range of self-harm behavior. Finally, the extent to which these results apply outside of Denmark is unclear, although many of our results are consistent with earlier studies conducted in the United States (13, 80).

Our ability to predict suicide attempts clinically remains poor despite the wealth of research in this area. We developed prediction models for nonfatal attempted suicide based on data from a full civilian population that can be used as a basis for further research. Our results corroborate what is known and highlight novel risk factors that, upon replication, have the potential to contribute to new areas of research and prevention.

ACKNOWLEDGMENTS

Author affiliations: Aarhus University Hospital, Aarhus, Denmark (Jaimie L. Gradus, Erzsébet Horváth-Puhó, Timothy L. Lash, Henrik T. Sørensen); Department of Epidemiology, Boston University School of Public Health, Boston, Massachusetts, United States (Jaimie L. Gradus, Tammy Jiang); Department of Psychiatry, Boston University School of Medicine, Boston, Massachusetts, United States (Jaimie L. Gradus, Amy E. Street); Center for Anxiety and Related Disorders, Department of Psychological and Brain Sciences, Boston University, Boston, Massachusetts, United States (Anthony J. Rosellini); Department of Clinical Epidemiology, National Center for PTSD, VA Boston Healthcare System, Boston, Massachusetts, United States (Amy E. Street); Department of Psychiatry, New York University School of Medicine, New York, New York, United States (Isaac Galatzer-Levy); and Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, Georgia, United States (Timothy L. Lash).

This work was funded by the National Institute of Mental Health (grants R01MH109507 and 1R01MH110453-01A1). This work was also supported by the Lundbeck Foundation (grant R248-2017-521).

Conflict of interest: none declared.

REFERENCES

  • 1. World Health Organization. Mental Health and substance abuse: suicide data. World Health Organization. http://www.who.int/mental_health/prevention/suicide/suicideprevent/en/. Accessed August 20, 2019.
  • 2. Nock MK, Borges G, Bromet EJ, et al. Suicide and suicidal behavior. Epidemiol Rev. 2008;30(1):133–154.
  • 3. Biering-Sørensen F, Pedersen W, Müller PG. Spinal cord injury due to suicide attempts. Paraplegia. 1992;30(2):139–144.
  • 4. Stanley IH, Hom MA, Boffa JW, et al. PTSD from a suicide attempt: an empirical investigation among suicide attempt survivors. J Clin Psychol. 2019;75(10):1879–1895.
  • 5. Han B, Kott PS, Hughes A, et al. Estimating the rates of deaths by suicide among adults who attempt suicide in the United States. J Psychiatr Res. 2016;77:125–133.
  • 6. Rudestam KE. Physical and psychological responses to suicide in the family. J Consult Clin Psychol. 1977;45(2):162–170.
  • 7. Brent DA, Perper JA, Moritz G, et al. Psychiatric risk factors for adolescent suicide: a case-control study. J Am Acad Child Adolesc Psychiatry. 1993;32(3):521–529.
  • 8. Cerel J, Jordan JR, Duberstein PR. The impact of suicide on the family. Crisis. 2008;29(1):38–44.
  • 9. Shepard DS, Gurewich D, Lwin AK, et al. Suicide and suicidal attempts in the United States: costs and policy implications. Suicide Life Threat Behav. 2016;46(3):352–362.
  • 10. Kinchin I, Doran CM. The economic cost of suicide and non-fatal suicide behavior in the Australian workforce and the potential impact of a workplace suicide prevention strategy. Int J Environ Res Public Health. 2017;14(4):347.
  • 11. Research Prioritization Task Force, National Action Alliance for Suicide Prevention. A prioritized research agenda for suicide prevention: an action plan to save lives. 2014. https://www.sprc.org/resources-programs/prioritized-research-agenda-suicide-prevention-action-plan-save-lives. Accessed January 18, 2019.
  • 12. Franklin JC, Ribeiro JD, Fox KR, et al. Risk factors for suicidal thoughts and behaviors: a meta-analysis of 50 years of research. Psychol Bull. 2017;143(2):187–232.
  • 13. Walsh CG, Ribeiro JD, Franklin JC. Predicting risk of suicide attempts over time through machine learning. Clin Psychol Sci. 2017;5(3):457–469.
  • 14. Kessler RC, Warner CH, Ivany C, et al. Predicting suicides after psychiatric hospitalization in US Army soldiers: the Army Study to Assess Risk and Resilience in Service members (Army STARRS). JAMA Psychiat. 2015;72(1):49–57.
  • 15. Kessler RC, van Loo HM, Wardenaar KJ, et al. Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Mol Psychiatry. 2016;21(10):1366–1371.
  • 16. Kessler RC, Stein MB, Petukhova MV, et al. Predicting suicides after outpatient mental health visits in the Army Study to Assess Risk and Resilience in Service members (Army STARRS). Mol Psychiatry. 2017;22(4):544–551.
  • 17. Weissman MM, Bland RC, Canino GJ, et al. Prevalence of suicide ideation and suicide attempts in nine countries. Psychol Med. 1999;29(1):9–17.
  • 18. Morthorst B, Soegaard B, Nordentoft M, et al. Incidence rates of deliberate self-harm in Denmark 1994–2011. Crisis. 2016;37(4):256–264.
  • 19. Miranda-Mendizabal A, Castellví P, Parés-Badell O, et al. Gender differences in suicidal behavior in adolescents and young adults: systematic review and meta-analysis of longitudinal studies. Int J Public Health. 2019;64(2):265–283.
  • 20. Gradus JL, Rosellini AJ, Horváth-Puhó E, et al. Prediction of sex-specific suicide risk using machine learning and single-payer health care registry data from Denmark. JAMA Psychiat. 2020;77(1):25–34.
  • 21. Maris R, Berman A, Silverman M. Comprehensive Textbook of Suicidology. 1st ed. New York City, NY: The Guilford Press; 2000.
  • 22. Cavanagh JTO, Carson AJ, Sharpe M, et al. Psychological autopsy studies of suicide: a systematic review. Psychol Med. 2003;33(3):395–405.
  • 23. Joo S-H, Wang S-M, Kim T-W, et al. Factors associated with suicide completion: a comparison between suicide attempters and completers. Asia-Pac Psychiatry. 2016;8(1):80–86.
  • 24. Parra-Uribe I, Blasco-Fontecilla H, Garcia-Parés G, et al. Risk of re-attempts and suicide death after a suicide attempt: a survival analysis. BMC Psychiatry. 2017;17(1):163.
  • 25. Finkelstein Y, Macdonald EM, Hollands S, et al. Risk of suicide following deliberate self-poisoning. JAMA Psychiat. 2015;72(6):570–575.
  • 26. Bostwick JM, Pankratz VS. Affective disorders and suicide risk: a reexamination. Am J Psychiatry. 2000;157(12):1925–1932.
  • 27. Maris R, Berman A, Silverman M. Introduction to the study of suicide. In: Maris R, Berman A, Silverman M. Comprehensive Textbook of Suicidology. New York, NY: Guilford Press; 2000:3–25.
  • 28. Schmidt M, Schmidt SAJ, Sandegaard JL, et al. The Danish National Patient Registry: a review of content, data quality, and research potential. Clin Epidemiol. 2015;7:449–490.
  • 29. Frank L. Epidemiology. When an entire country is a cohort. Science. 2000;287(5462):2398–2399.
  • 30. Frank L. Epidemiology. The epidemiologist’s dream: Denmark. Science. 2003;301(5630):163.
  • 31. Pedersen CB, Gøtzsche H, Møller JO, et al. The Danish Civil Registration System. A cohort of eight million persons. Dan Med Bull. 2006;53(4):441–449.
  • 32. Pedersen CB. The Danish Civil Registration System. Scand J Public Health. 2011;39(7 suppl):22–25.
  • 33. Schmidt M, Pedersen L, Sørensen HT. The Danish Civil Registration System as a tool in epidemiology. Eur J Epidemiol. 2014;29(8):541–549.
  • 34. Baadsgaard M, Quitzau J. Danish registers on personal income and transfer payments. Scand J Public Health. 2011;39(7 suppl):103–105.
  • 35. Timmermans B. The Danish Integrated Database for Labor Market Research: Towards Demystification for the English Speaking Audience. DRUID, Copenhagen Business School, Department of Industrial Economics and Strategy/Aalborg University, Department of Business Studies; 2010. (DRUID Working Papers). Report No.: 10–16. https://ideas.repec.org/p/aal/abbswp/10-16.html. Accessed January 29, 2019.
  • 36. Mors O, Perto GP, Mortensen PB. The Danish Psychiatric Central Research Register. Scand J Public Health. 2011;39(7 suppl):54–57.
  • 37. Svensson E, Lash TL, Resick PA, et al. Validity of reaction to severe stress and adjustment disorder diagnoses in the Danish Psychiatric Central Research Registry. Clin Epidemiol. 2015;7:235–242.
  • 38. Lynge E, Sandegaard JL, Rebolj M. The Danish National Patient Register. Scand J Public Health. 2011;39(7 suppl):30–33.
  • 39. Vest-Hansen B, Riis AH, Christiansen CF. Registration of acute medical hospital admissions in the Danish National Patient Registry: a validation study. Clin Epidemiol. 2013;5:129–133.
  • 40. Kildemoes HW, Sørensen HT, Hallas J. The Danish National Prescription Registry. Scand J Public Health. 2011;39(7 suppl):38–41.
  • 41. Pottegård A, Schmidt SAJ, Wallach-Kildemoes H, et al. Data Resource Profile: the Danish National Prescription Registry. Int J Epidemiol. 2017;46(3):798–798f.
  • 42. Rosellini AJ, Monahan J, Street AE, et al. Using administrative data to identify U.S. Army soldiers at high-risk of perpetrating minor violent crimes. J Psychiatr Res. 2017;84:128–136.
  • 43. Lühdorf P, Overvad K, Schmidt EB, et al. Predictive value of stroke discharge diagnoses in the Danish National Patient Register. Scand J Public Health. 2017;45(6):630–636.
  • 44. Tuckuviene R, Kristensen SR, Helgestad J, et al. Predictive value of pediatric thrombosis diagnoses in the Danish National Patient Registry. Clin Epidemiol. 2010;2:107–122.
  • 45. Breiman L, Friedman J, Stone CJ, et al. Classification and Regression Trees. 1st ed. London, UK: Chapman and Hall; 1984.
  • 46. Breiman L. Statistical modeling: the two cultures (with comments and a rejoinder by the author). Stat Sci. 2001;16(3):199–231.
  • 47. Therneau T, Atkinson B, Ripley B. rpart: Recursive partitioning and regression trees. 2019. https://CRAN.R-project.org/package=rpart. Accessed April 18, 2019.
  • 48. Strobl C, Malley J, Tutz G. An introduction to recursive partitioning: rationale, application and characteristics of classification and regression trees, bagging and random forests. Psychol Methods. 2009;14(4):323–348.
  • 49. Kuhn M, Johnson K. Remedies for severe class imbalance. In: Kuhn M, Johnson K. Applied Predictive Modeling. New York, NY: Springer; 2013:419–43.
  • 50. Liaw A, Wiener M. Classification and regression by randomForest. https://www.r-project.org/doc/Rnews/Rnews_2002-3.pdf. Accessed April 18, 2019.
  • 51. Chen C, Breiman L. Using Random Forest to learn imbalanced data. 2004. https://statistics.berkeley.edu/sites/default/files/tech-reports/666.pdf. Accessed April 18, 2019.
  • 52. Huang BFF, Boutros PC. The parameter sensitivity of random forests. BMC Bioinformatics. 2016;17(1):331.
  • 53. Robin X, Turck N, Hainard A, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. 2011;12:77.
  • 54. SAS Institute Inc. SAS/GRAPH 9.4. Cary, NC: SAS Institute Inc; 2013.
  • 55. R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria; 2017. https://www.R-project.org/. Accessed April 19, 2019.
  • 56. Hosmer D, Lemeshow S. Applied Logistic Regression. 3rd ed. Hoboken, NJ: John Wiley & Sons; 2013.
  • 57. Nock MK, Hwang I, Sampson NA, et al. Mental disorders, comorbidity and suicidal behavior: results from the National Comorbidity Survey Replication. Mol Psychiatry. 2010;15(8):868–876.
  • 58. Nock MK, Green JG, Hwang I, et al. Prevalence, correlates and treatment of lifetime suicidal behavior among adolescents: results from the National Comorbidity Survey Replication–Adolescent Supplement (NCS-A). JAMA Psychiat. 2013;70(3):300–310.
  • 59. Bridge JA, Iyengar S, Salary CB, et al. Clinical response and risk for reported suicidal ideation and suicide attempts in pediatric antidepressant treatment: a meta-analysis of randomized controlled trials. JAMA. 2007;297(15):1683–1696.
  • 60. Borges G, Walters EE, Kessler RC. Associations of substance use, abuse, and dependence with subsequent suicidal behavior. Am J Epidemiol. 2000;151(8):781–789.
  • 61. Elizabeth Sublette M, Carballo JJ, Moreno C, et al. Substance use disorders and suicide attempts in bipolar subtypes. J Psychiatr Res. 2009;43(3):230–238.
  • 62. Jick H, Kaye JA, Jick SS. Antidepressants and the risk of suicidal behaviors. JAMA. 2004;292(3):338–343.
  • 63. Fergusson D, Doucette S, Glass KC, et al. Association between suicide attempts and selective serotonin reuptake inhibitors: systematic review of randomised controlled trials. BMJ. 2005;330(7488):396.
  • 64. Didham RC, McConnell DW, Blair HJ, et al. Suicide and self-harm following prescription of SSRIs and other antidepressants: confounding by indication. Br J Clin Pharmacol. 2005;60(5):519–525.
  • 65. Hall WD. How have the SSRI antidepressants affected suicide risk? Lancet. 2006;367(9527):1959–1962.
  • 66. Youssef NA, Rich CL. Does acute treatment with sedatives/hypnotics for anxiety in depressed patients affect suicide risk? A literature review. Ann Clin Psychiatry. 2008;20(3):157–169.
  • 67. Barbui C, Esposito E, Cipriani A. Selective serotonin reuptake inhibitors and risk of suicide: a systematic review of observational studies. CMAJ. 2009;180(3):291–297.
  • 68. Cheung K, Aarts N, Noordam R, et al. Antidepressant use and the risk of suicide: a population-based cohort study. J Affect Disord. 2015;174:479–484.
  • 69. Spiegel B, Schoenfeld P, Naliboff B. Systematic review: the prevalence of suicidal behaviour in patients with chronic abdominal pain and irritable bowel syndrome. Aliment Pharmacol Ther. 2007;26(2):183–193.
  • 70. Galea S, Hernán MA. Win-win: reconciling social epidemiology and causal inference. Am J Epidemiol. 2020;189(3):167–170.
  • 71. Simon GE, Johnson E, Lawrence JM, et al. Predicting suicide attempts and suicide deaths following outpatient visits using electronic health records. Am J Psychiatry. 2018;175(10):951–960.
  • 72. Fazel S, O’Reilly L. Machine learning for suicide research—can it improve risk factor identification? JAMA Psychiat. 2020;77(1):13–14.
  • 73. van der Laan MJ, Polley EC, Hubbard AE. Super learner. Stat Appl Genet Mol Biol. 2007;6(1):Article 25.
  • 74. Gasse C, Danielsen AA, Pedersen MG, et al. Positive predictive value of a register-based algorithm using the Danish National Registries to identify suicidal events. Pharmacoepidemiol Drug Saf. 2018;27(10):1131–1138.
  • 75. Jiang T, Gradus JL, Lash TL, et al. Addressing measurement error in random forests using quantitative bias analysis [published online ahead of print February 1, 2021]. Am J Epidemiol. (doi: 10.1093/aje/kwab010).
  • 76. Crandall C, Fullerton-Gleason L, Aguero R, et al. Subsequent suicide mortality among emergency department patients seen for suicidal behavior. Acad Emerg Med. 2006;13(4):435–442.
  • 77. Mercado MC, Holland K, Leemis RW, et al. Trends in emergency department visits for nonfatal self-inflicted injuries among youth aged 10 to 24 years in the United States, 2001–2015. JAMA. 2017;318(19):1931–1933.
  • 78. Zwald ML. Syndromic surveillance of suicidal ideation and self-directed violence—United States, January 2017–December 2018. MMWR Morb Mortal Wkly Rep. 2020;69(4):103–108.
  • 79. Nock MK, Favazza A. Nonsuicidal self-injury: Definition and classification. In: Nock MK, ed. Understanding Nonsuicidal Self-Injury: Origins, Assessment, and Treatment. Washington, DC: American Psychological Association; 2009:9–19.
  • 80. Breslau N, Davis GC, Andreski P. Migraine, psychiatric disorders, and suicide attempts: an epidemiologic study of young adults. Psychiatry Res. 1991;37(1):11–23.

Supplementary Materials

Web_Material_kwab112

Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press
