Key Points
Question
What are the sex-specific risk profiles for death from suicide in a general population sample?
Findings
In this case-cohort study of 14 103 individuals who died by suicide, risk profiles for suicide were different among men and women in a general population sample of 265 183 persons who did not die by suicide, with physical health more important to men’s than women’s suicide risk. Results suggest psychotropic medications and psychiatric disorders are important to suicide risk and, for many diagnostic variables and prescriptions, a longer vs shorter period of observation (eg, 48 vs 6 months prior to suicide) appeared to be more important.
Meaning
These findings suggest consistency with what is known about suicide risk but also point to potentially important, understudied risk factors, with evidence of unique suicide risk profiles among specific subpopulations.
Abstract
Importance
Suicide is a public health problem, with multiple causes that are poorly understood. The increased focus on combining health care data with machine-learning approaches in psychiatry may help advance the understanding of suicide risk.
Objective
To examine sex-specific risk profiles for death from suicide using machine-learning methods and data from the population of Denmark.
Design, Setting, and Participants
A case-cohort study nested within 8 national Danish health and social registries was conducted from January 1, 1995, through December 31, 2015. The source population was all persons born or residing in Denmark as of January 1, 1995. Data were analyzed from November 5, 2018, through May 13, 2019.
Exposures
Exposures included 1339 variables spanning domains of suicide risk factors.
Main Outcomes and Measures
Death from suicide from the Danish cause of death registry.
Results
A total of 14 103 individuals died by suicide between 1995 and 2015 (10 152 men [72.0%]; mean [SD] age, 43.5 [18.8] years and 3951 women [28.0%]; age, 47.6 [18.8] years). The comparison subcohort was a 5% random sample (n = 265 183) of living individuals in Denmark on January 1, 1995 (130 591 men [49.2%]; age, 37.4 [21.8] years and 134 592 women [50.8%]; age, 39.9 [23.4] years). With use of classification trees and random forests, sex-specific differences were noted in risk for suicide, with physical health more important to men’s suicide risk than women’s suicide risk. Psychiatric disorders and possibly associated medications were important to suicide risk, with specific results that may increase clarity in the literature. Generally, diagnoses and medications measured over the 48 months before suicide were more important indicators of suicide risk than those measured over the 6 months before suicide. Individuals in the top 5% of predicted suicide risk appeared to account for 32.0% of all suicide cases in men and 53.4% of all cases in women.
Conclusions and Relevance
Despite decades of research on suicide risk factors, understanding of suicide remains poor. In this study, the first to date to develop risk profiles for suicide based on data from a full population, apparent consistency with what is known about suicide risk was noted, as well as potentially important, understudied risk factors with evidence of unique suicide risk profiles among specific subpopulations.
This case-cohort study examines the risk of suicide in Danish men vs women with use of machine-learning tools.
Introduction
Suicide is a worldwide contributor to mortality.1 The age-standardized suicide rate of 13 per 100 000 per year in the United States has not decreased in the past 50 years.2,3 Suicide was recently identified as 1 of only 3 causes of death that is increasing in the United States.4 The age-standardized suicide rate in Denmark has been comparable to that in the United States in the past 2 decades, but decreased to 9.2 per 100 000 as of 2016.2
In 2014, the US National Institute of Mental Health and the National Action Alliance for Suicide Prevention Research Prioritization Task Force published a list of research priorities.3 Key questions were identified, including “How can we better or more optimally detect [suicide] risk?”3 This call for research is not surprising given that decades of research have not resulted in improved suicide prediction in clinical settings.5 Sociodemographic characteristics, psychiatric diagnoses, and physical health diagnoses are well-documented risk factors for suicide.6,7,8 However, most of the research that generated these findings used conventional null hypothesis testing statistical methods,5 and these methods are limited when the goal is accurate identification of a high-risk patient. For instance, logistic regression is not designed to examine large, highly correlated sets of predictors or to elucidate interactions among predictors without a priori specification. Additional research using flexible modeling procedures that capture the complexities of the causes of suicide is needed.
The shift in focus to supervised machine learning methods in psychiatry allows for the development of novel suicide risk profiles that include broad constellations of predictors. Machine learning methods have been used in some small, clinical civilian samples,9,10,11 and larger samples of Army members, veterans, or civilian hospital patients.12,13,14,15 To our knowledge, no study has used machine learning methods to examine suicide risk in a full civilian population. Thus, the goal of this study was to use population-based, prospectively recorded Danish medical and social registry data and supervised machine learning methods to identify risk profiles for death from suicide. Suicide risk differs by many demographic factors.16 Although all factors are important to examine, we chose to focus on sex differences for the purposes of this study, given well-established sex differences in suicide risk.17,18,19
Methods
Study Sample
The source population for this study was all individuals born or residing in Denmark as of January 1, 1995, coinciding with the switch from International Classification of Diseases, 8th Edition, to International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10) and the inclusion of outpatient visits in the registries (n = 5 078 382).20 We used a case-cohort study design, consistent with prior machine learning studies of suicide death.12,13,21,22 Individuals considered cases in this study died by suicide between January 1, 1995, and December 31, 2015 (n = 14 103). The comparison subcohort was a 5% random sample of living individuals in Denmark on January 1, 1995 (n = 265 183). Cases and subcohort members were unmatched to allow for maximum variability in predictors for the machine learning analyses. This work was deemed exempt from review by the institutional review board at Boston University because of deidentified data and approved by the Danish Data Protection Agency. Data were analyzed from November 5, 2018, through May 13, 2019.
Data Sources
Universal medical coverage is provided to all residents of Denmark through a tax-funded health care system.20 More than 90% of the Danish population has at least 1 contact with the health care system in a given year.23 Receipt of health care is recorded in medical and administrative registries at the national level.24,25 The 10-digit Central Personal Register number, a unique personal identifier assigned to all residents of Denmark, can be used to merge individual data across 8 registries.
The Danish Civil Registration System was established in 1968. The data have been updated daily since 1989 and are widely accepted as accurate.26,27,28 For this study we included sex at birth, age, immigration status (yes or no), generation of citizenship, family suicidal behavior (parent or spouse), and marital status.26,27
The Income Statistics Register began in 1970 and contains variables related to income (eg, salary, private pension contributions).29 The Population Education Register contains information on educational level attained for 96.4% of the Danish population.30 The Integrated Database for Labor Market Research contains information about employment beginning in 1980.31
The Danish Psychiatric Central Research Register records all psychiatric inpatient, outpatient, and emergency department treatment data since 1995, including treatment dates and primary and secondary diagnoses.32 We evaluated psychiatric disorders according to 2-digit ICD-10 codes (eg, code F20 was used to capture schizophrenia and its subtypes; code F25 was used to capture schizoaffective disorder). Diagnoses in this registry are documented as high quality.32,33
The Danish National Patient Registry includes treatment date and primary and secondary diagnostic codes, contact type (inpatient, outpatient, emergency), surgical procedure, and selected examination codes.20,34 Diagnoses from this registry were included as second-level ICD-10 groupings (eg, codes C15-C26 included as 1 variable: malignant neoplasms of digestive organs). Surgery procedure codes were included according to body system (eg, nervous system surgeries). Two validation studies have found high correlations between the data contained in the registry and medical records.20,35
The Danish National Prescription Registry has cataloged data on all prescription drugs sold in Danish pharmacies since 1994.36 The registry includes dispensing date, product name, and Anatomical Therapeutic Classification code. Data for this study were coded according to level 3 Anatomical Therapeutic Classification codes (eTable in the Supplement). Data in the registry are considered complete and valid from 1995.36
The Danish Cause of Death Registry includes age at death, manner of death (eg, natural, suicide), place of death, and autopsy results. Suicide cases were identified in this registry using ICD-10 codes X60 to X84.37 In a validation study, 90% of the deaths registered as suicides were confirmed by experts.38
Statistical Analysis
Some predictors (eg, natal sex) could be used in their registry-based form, while others (eg, diagnoses, medications) were dummy-coded to create time-varying predictors (ie, intervals of 0-6, 0-12, 0-24, and 0-48 months before the first day of the suicide month). To estimate the prevalence of each predictor in the person-time that gave rise to cases, we randomly selected a date for each member of the subcohort and evaluated the prevalence of predictors at the above time intervals in relation to that date. Predictors from all time points were evaluated simultaneously.
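The look-back coding described above can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses were run in SAS and R); the function names, the example dates, and the choice to treat the window as starting N months before the first day of the index month (inclusive) and ending on that day (exclusive) are assumptions made for illustration.

```python
from datetime import date

# Look-back windows in months, as listed in the text.
WINDOWS_MONTHS = (6, 12, 24, 48)


def months_back(anchor: date, months: int) -> date:
    """Return the first day of the month that lies `months` months before anchor's month."""
    total = anchor.year * 12 + (anchor.month - 1) - months
    return date(total // 12, total % 12 + 1, 1)


def window_indicators(event_dates, index_date):
    """Dummy-code one predictor: 1 if any event falls in the 0-N month window.

    Assumption: each window ends at the first day of the index month
    (exclusive) and starts N months before that day (inclusive).
    """
    window_end = date(index_date.year, index_date.month, 1)
    flags = {}
    for n in WINDOWS_MONTHS:
        window_start = months_back(window_end, n)
        flags[f"0-{n}m"] = int(
            any(window_start <= d < window_end for d in event_dates)
        )
    return flags


# Hypothetical person with a single dispensing about 10 months before the index month.
example = window_indicators([date(2009, 8, 15)], index_date=date(2010, 6, 20))
print(example)
```

For a case, the index date would be the date of suicide; for a subcohort member, it would be the randomly selected date. Because the windows are nested, an event counted in a shorter window is also counted in every longer one.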
The data reduction process included elimination of rare predictors (≤10 observations13,39) and predictors with negligible associations with suicide (unadjusted odds ratio between 0.9 and 1.1). We eliminated diagnoses occurring in the emergency department owing to low positive predictive value.40,41 The initial analytic data set contained 2554 predictor variables. Following data reduction, the final number of included predictors was 1339. The eTable in the Supplement provides considered and retained predictors.
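As an illustration of this kind of screening, the following Python sketch drops rare binary predictors and predictors whose unadjusted odds ratio is close to 1 (taken here as 0.9 to 1.1, inclusive). The function names, the example counts, the reading of "observations" as the number of exposed persons, and the handling of zero cells are all assumptions; this is not the authors' implementation.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_noncases, unexposed_noncases):
    """Unadjusted odds ratio from a 2x2 table; None if any cell is zero."""
    if 0 in (exposed_cases, unexposed_cases, exposed_noncases, unexposed_noncases):
        return None
    return (exposed_cases * unexposed_noncases) / (unexposed_cases * exposed_noncases)


def keep_predictor(table, min_count=10, lower=0.9, upper=1.1):
    """Apply both screening rules to one binary predictor.

    `table` is (exposed_cases, unexposed_cases, exposed_noncases, unexposed_noncases).
    """
    exposed_total = table[0] + table[2]
    if exposed_total <= min_count:            # rare predictor: drop
        return False
    or_value = odds_ratio(*table)
    if or_value is None:                      # undefined OR: kept, by assumption
        return True
    return not (lower <= or_value <= upper)   # near-null association: drop


# Hypothetical 2x2 counts for three candidate predictors.
candidates = {
    "strong_association": (300, 9700, 2000, 198000),   # OR about 3.1
    "rare_predictor": (5, 9995, 3, 199997),            # 8 exposed persons
    "null_association": (500, 9500, 10000, 190000),    # OR = 1.0
}
retained = [name for name, table in candidates.items() if keep_predictor(table)]
print(retained)
```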
Given our dual interests in identifying novel predictor/interactions and developing algorithms that may accurately predict suicide, we used recursive partitioning machine learning methods that automatize detection of associations between predictors and outcomes and interactions among predictors and provide metrics of predictor importance.42
First, we estimated classification tree (CART) models as an initial evaluation of the data structure.43,44 Classification tree modeling is a nonparametric method that builds a decision tree based on predictors and their combinations that result in the highest probability of differentiating cases from noncases. Classification tree modeling was implemented using the R package rpart (R, version 3.5.2; R Foundation), which uses a 10-fold cross-validation procedure.45 To mitigate risk of model overfit and increase visual interpretability, maximum tree depth and minimum number of observations in any node (terminal or parent) were set to 10. Given the class imbalance (sample suicide rate, 5%), CART modeling was implemented using equal priors rather than the rpart default of priors proportional to the outcome frequency.42,46 Risk of suicide was computed for each identified combination of predictors.
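To show what equal priors do to the risk reported in a tree node, the sketch below computes a prior-weighted node probability in Python. The formula is the standard prior-weighted class probability used in recursive partitioning; the function name and the node counts are hypothetical, and only the overall sample sizes for men are taken from the text.

```python
def node_case_probability(cases_in_node, noncases_in_node,
                          cases_total, noncases_total, prior_case):
    """Prior-weighted probability of being a case within one tree node."""
    case_part = prior_case * cases_in_node / cases_total
    noncase_part = (1.0 - prior_case) * noncases_in_node / noncases_total
    return case_part / (case_part + noncase_part)


# Sample sizes for men reported in the text; the node counts are hypothetical.
CASES, NONCASES = 10152, 130591
node_cases, node_noncases = 50, 200

# Priors proportional to the outcome frequency reproduce the raw node proportion.
proportional = node_case_probability(
    node_cases, node_noncases, CASES, NONCASES,
    prior_case=CASES / (CASES + NONCASES),
)
# Equal priors treat cases and noncases as if they were equally common.
equal = node_case_probability(
    node_cases, node_noncases, CASES, NONCASES, prior_case=0.5,
)

print(f"proportional priors: {proportional:.3f}")
print(f"equal priors:        {equal:.3f}")
```

Under equal priors the same node yields a much higher probability than the raw proportion of cases in it, which is one reason node-level risks from a model fitted this way should not be read as population risks.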
Second, we implemented random forest using the R package randomForest.47 Random forest is less likely to produce an overfit model than CART modeling because it uses bootstrap aggregation. Each forest was built with 1000 trees with a minimum of 10 observations needed to attempt a split and 37 variables sampled as split candidates at each node (ie, square root of total number of predictors; randomForest default). Given the class imbalance, each tree was built using all observations of suicide plus a random equally sized number of subcohort observations using the sampsize tuning parameter.48,49 Split-sample cross-validation was used to generate individual-level random forests–predicted values (the analytic server would not permit 10-fold cross-validation of 1000 trees). We used mean decrease in accuracy to evaluate each variable in terms of main effects and interactions across all trees.42
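The balanced per-tree sampling described above can be sketched in Python as follows. This is a simplified illustration rather than the randomForest implementation: the function name, identifiers, and seed are hypothetical, and the subcohort draw is made without replacement as an assumption.

```python
import random


def balanced_tree_sample(case_ids, subcohort_ids, rng):
    """Rows used to grow one tree: every case plus an equally sized
    random draw (without replacement) from the subcohort."""
    controls = rng.sample(subcohort_ids, k=len(case_ids))
    return list(case_ids), controls


rng = random.Random(2019)  # arbitrary seed, for reproducibility only
case_ids = [f"case{i}" for i in range(50)]        # hypothetical identifiers
subcohort_ids = [f"sub{i}" for i in range(1000)]  # hypothetical identifiers

# A fresh subcohort draw is made for each tree, so over many trees most
# subcohort members contribute to at least some of them.
per_tree = [balanced_tree_sample(case_ids, subcohort_ids, rng) for _ in range(3)]
for cases, controls in per_tree:
    print(len(cases), len(controls), controls[:3])
```

Each tree then sees a 1:1 ratio of cases to subcohort members, so no single tree is dominated by the majority class.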
Prediction accuracy was evaluated using receiver operating characteristic curve analysis conducted in 1000 bootstrap replicates to estimate area under the curve (AUC) and its 95% CI.50 Additional operating characteristics were evaluated using high-risk subgroups and thresholds of predicted risk (eg, based on CART terminal nodes; random forests–predicted values). Sensitivity was prioritized in accordance with the goal of identifying suicide cases. Analyses were conducted separately for men and women in SAS, version 9.4 (SAS Institute Inc) and R, version 3.5.2.51,52
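A bootstrap estimate of the AUC and a 95% CI can be sketched in Python as below. The percentile interval, the function names, and the toy data are assumptions; the article does not state which type of bootstrap interval was used, and the pairwise AUC calculation shown here is suitable only for small examples.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random noncase
    (ties count one half); equivalent to the Mann-Whitney U statistic."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate and percentile bootstrap CI for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:                          # need both classes present
            estimates.append(auc([scores[i] for i in idx], ys))
    estimates.sort()
    lower = estimates[round((alpha / 2) * n_boot)]
    upper = estimates[round((1 - alpha / 2) * n_boot) - 1]
    return auc(scores, labels), (lower, upper)


# Toy data: 20 cases with somewhat higher predicted risk than 20 noncases.
scores = [0.20 + 0.03 * i for i in range(20)] + [0.05 + 0.03 * i for i in range(20)]
labels = [1] * 20 + [0] * 20

point, (lower, upper) = bootstrap_auc_ci(scores, labels)
print(f"AUC {point:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```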
Results
Of the 14 103 persons who were suicide cases, 10 152 were men (72.0%), with 130 591 men (49.2%) in the corresponding comparison cohort. The suicide cases included 3951 women (28.0%), with 134 592 women (50.8%) in the corresponding comparison cohort. Persons who died by suicide were slightly older than subcohort members (mean [SD] for men: 43.5 [18.8] years vs 37.4 [21.8] years; women: 47.6 [18.8] vs 39.9 [23.4] years), more frequently divorced (men: 1225 [12.1%] vs 8228 [6.3%]; women: 682 [17.3%] vs 10 266 [7.6%]), and more frequently in the second income quartile (men: 2516 [24.8%] vs 19 980 [15.3%]; women: 1364 [34.5%] vs 34 114 [25.3%]) (Table).
Table. Characteristics of the Suicide Cases and the General Population Subcohort, Denmark, January 1, 1995.
Variable | Men: Suicide Cases (n = 10 152) | Men: Comparison Subcohort (n = 130 591) | Women: Suicide Cases (n = 3951) | Women: Comparison Subcohort (n = 134 592) |
---|---|---|---|---|
Age, mean (SD), y | 43.5 (18.8) | 37.4 (21.8) | 47.6 (18.8) | 39.9 (23.4) |
Marital status, No. (%) | | | | |
Married or registered partner | 4000 (39.4) | 53 640 (41.1) | 1665 (42.1) | 53 856 (40.0) |
Divorced | 1225 (12.1) | 8228 (6.3) | 682 (17.3) | 10 266 (7.6) |
Single | 4395 (43.3) | 63 962 (49.0) | 1103 (27.9) | 55 238 (41.0) |
Widow | 467 (4.6) | 3802 (2.9) | 473 (12.0) | 14 327 (10.6) |
Unknown^a | 65 (0.6) | 959 (0.7) | 28 (0.7) | 905 (0.7) |
Immigrant, No. (%) | 312 (3.1) | 5698 (4.4) | 167 (4.2) | 5566 (4.1) |
Income quartile, No. (%) | | | | |
<1 | 1766 (17.4) | 23 895 (18.3) | 976 (24.7) | 30 870 (22.9) |
1 to <2 | 2516 (24.8) | 19 980 (15.3) | 1364 (34.5) | 34 114 (25.3) |
2 to <3 | 2387 (23.5) | 23 962 (18.3) | 962 (24.3) | 31 841 (23.7) |
≥3 | 2964 (29.2) | 41 069 (31.4) | 481 (12.2) | 16 217 (12.0) |
Unknown^a | 519 (5.1) | 21 685 (16.6) | 168 (4.3) | 21 550 (16.0) |
Given the coverage of the Danish national registries, missing data were scarce. The few predictors with minimal missing data were imputed using the default approaches of rpart (surrogate variables) and randomForest (modal value).
CART Modeling
Among men, the highest risk for suicide was found among those who were not being treated with pharmacotherapy (eg, antidepressants, antipsychotics, or anxiolytics), had made a suicide attempt in the prior 4 years, and were in the second income quartile (n = 18; risk, 1.0). Similarly, men who had received a diagnosis of poisoning by, adverse effect of, and underdosing of drugs but did not have a coded prescription for antidepressants, antipsychotics, medications for addictions (eg, methadone), or hypnotics/sedatives in the prior 4 years had a risk of 0.42 (n = 251). Other combinations of importance are displayed in Figure 1 (AUC, 0.77; 95% CI, 0.77-0.78).
The highest risk among women was among those who did not have a recorded prescription for anxiolytics, antipsychotics, hypnotics or sedatives, or antidepressants, but made a suicide attempt in the previous 4 years (n = 16; risk, 1.0). Women who were prescribed hypnotic/sedatives and were diagnosed with poisoning by, adverse effect of, and underdosing of drugs in the 4 years before suicide had the next highest risk (n = 79; risk, 0.41). Other combinations of importance are displayed in Figure 2 (AUC, 0.87; 95% CI, 0.86-0.88).
Random Forest
Among men, 90% to 91% (fold 1-fold 2) of the predictors had a mean decrease in accuracy above 0 (mean [SD], 7.8 [5.8]). Eighteen predictors were among the top 30 most important predictors in both folds (Figure 3). Removal of drugs used in addictive disorders 4 years before suicide from our models would have the largest association with accuracy. Other predictors in the top 30 included antidepressants, hypnotics/sedatives, and antipsychotics in the past 4 years, age, physical health diagnoses, and stress disorders. The AUC for the random forest across folds was 0.80 (95% CI, 0.79-0.81).
Among women, 87% to 89% (fold 1-fold 2) of the predictors had a mean decrease in accuracy above 0 (mean [SD], 4.6 [3.8]). Twenty-one predictors were among the top 30 most important predictors in both folds, most of which involved psychiatric diagnoses and medications (Figure 4). Specifically, alcohol-related disorders, prior suicide attempts, drugs used in addictive disorders, schizophrenia, recurrent major depression, and stress disorders had the largest association with accuracy. The AUC for the random forest model across folds was 0.88 (95% CI, 0.88-0.89).
Operating Characteristics of High-Risk Thresholds
Cross-validated random forests–predicted probabilities were rank ordered, and operating characteristics were calculated among individuals in the top 5%, 10%, and 20% of the predicted risk distribution. Men in the top 5%, 10%, and 20% of predicted risk accounted for 32.0%, 49.4%, and 65.9% of all male cases of suicide death, respectively (ie, sensitivity; specificity, 97.1%, 93.1%, and 83.6%). Women in the top 5%, 10%, and 20% of predicted risk accounted for 53.4%, 68.1%, and 81.0% of all female suicide deaths, respectively (specificity, 96.4%, 91.7%, and 81.8%). The sensitivity among individuals in the top 5% of predicted risk was 6.4 times the expected value among men (32.0%/5%) and 10.7 times the expected value among women (53.4%/5%).
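The operating characteristics of a top-fraction threshold can be computed as in the Python sketch below. The function name and the toy data are hypothetical; ties in predicted risk are broken by input order here, which is an assumption.

```python
def top_fraction_metrics(scores, labels, fraction):
    """Sensitivity, specificity, and sensitivity relative to chance when the
    top `fraction` of predicted risks is flagged as high risk."""
    n = len(scores)
    k = max(1, round(fraction * n))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    flagged = order[:k]
    true_pos = sum(labels[i] for i in flagged)
    false_pos = k - true_pos
    n_cases = sum(labels)
    n_noncases = n - n_cases
    sensitivity = true_pos / n_cases
    specificity = (n_noncases - false_pos) / n_noncases
    # If risk were unrelated to the outcome, the top fraction would be
    # expected to contain that same fraction of the cases.
    ratio_to_expected = sensitivity / fraction
    return sensitivity, specificity, ratio_to_expected


# Toy data: 100 people ranked by predicted risk, 10 of whom are cases.
case_positions = {0, 1, 2, 5, 9, 15, 30, 45, 60, 90}
labels = [1 if i in case_positions else 0 for i in range(100)]
scores = [100 - i for i in range(100)]

for fraction in (0.05, 0.10, 0.20):
    sens, spec, ratio = top_fraction_metrics(scores, labels, fraction)
    print(f"top {fraction:.0%}: sensitivity {sens:.2f}, "
          f"specificity {spec:.3f}, ratio to expected {ratio:.1f}")
```

The ratio to the expected value is simply sensitivity divided by the fraction flagged, which is the calculation shown in parentheses in the paragraph above.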
Discussion
We used supervised machine learning to develop sex-specific risk models for suicide using population-based data from the Danish national health care and social registries. From 2554 predictors, characterized across 4 different timeframes and spanning demographics, family history of suicide, previous suicidal behavior, psychiatric disorders, physical health disorders, and medications, we derived machine learning models that highlight important predictors and combinations.
Across all models, psychiatric disorders emerged as the most important predictors of suicide. This finding is largely consistent with what is known about psychiatric disorders and suicide risk.7,53,54 However, some specific disorders that appeared in our results were novel. While schizophrenia and depression are well-established risk factors for suicide,17,55,56,57,58,59,60 there is some controversy in the literature with regard to whether stress disorders are a risk factor for suicide aside from depression.61 In these results, stress disorders are an important predictor of suicide among both men and women in models simultaneously evaluating depression. Furthermore, antidepressants, antipsychotics, hypnotics/sedatives, and medications used to treat addictions were important to the accuracy of predicting suicide across analyses. Although our results indicate that pharmacotherapy is important to suicide prediction, our noncausal models could not elucidate the direction or magnitude of associations.
Physical health diagnoses appeared to contribute more to suicide prediction for men than women. Conversely, psychiatric diagnoses and associated medications appeared to contribute more to the prediction of women’s suicide risk. It is possible that clinicians may more frequently assess for, recognize, and diagnose mental health symptoms among women, while these same symptoms may be attributed to somatic concerns among men.62 Previous machine learning work on suicide prediction among men relied on relatively healthy populations (eg, soldiers) and thus did not incorporate a detailed examination of physical health diagnoses.12,13 We believe research will be needed to further examine physical health risk factors for suicide within a causal framework.
Limitations
This study has some important limitations worth noting. With regard to time-varying prediction, a longer period of observation before suicide (eg, diagnoses that occurred 48 months earlier) was more important to suicide prediction than a shorter period of observation before suicide (eg, diagnoses that occurred 6 months earlier). A limitation of this study is that nondiagnostic predictors occurring in the immediate days before suicide may be particularly important to risk (eg, acute interpersonal stress, loneliness), but these data are not available in the Danish registries. A challenge to epidemiologic studies of suicide in general is obtaining data on the sample size needed to study the causes of this relatively rare event (usually medical registry data), while also obtaining data on negative life factors not typically found in data sources of this size. It is likely that long-term diagnostic risk factors and acute life events and emotional states interact to most accurately characterize suicide risk. How to best capture long-term and acute risk factors in large, population-based samples has been an ongoing challenge to suicide epidemiologic research.
Another limitation to the present study is the reliance on only 2 machine learning classifiers: CART and random forest. We used these approaches to prediction for substantive reasons (ie, variable importance), but other classifiers or meta-classifiers (eg, super learning) could achieve better prediction performance.63 Furthermore, despite our bootstrap and cross-validation approach, spurious findings are still possible. In addition, it is possible that observed sex differences are the result of differences in modeling; thus, these findings should be considered exploratory. Given our case-cohort study design, the risks presented in the CART terminal nodes may not reflect the true population risk. Results from this study should not be interpreted as causal effects. In addition, although there is a rigorous and thorough process for suicide death classification in Denmark (ie, a full inquest into the cause of death), suicide death misclassification is possible. The extent to which these results may generalize to populations outside of Denmark is unclear, although concerns about generalization may be assuaged given that our results are consistent with the existing, primarily US-based, suicide literature.
Conclusions
Despite decades of suicide research, our understanding of suicide risk remains poor. Machine learning may allow for the development of prediction models that can evaluate many relevant predictors of suicide simultaneously. To our knowledge, this study is the first to develop prediction models for suicide based on data from a full population. We found what appears to be consistency with what is known about suicide risk but also potentially important, understudied predictors for future study. Results of this study can possibly be replicated in other novel data sets and used to inform the further development of general population prediction models for suicide.
References
- 1.Naghavi M; Global Burden of Disease Self-Harm Collaborators . Global, regional, and national burden of suicide mortality 1990 to 2016: systematic analysis for the Global Burden of Disease Study 2016. BMJ. 2019;364:l94. doi: 10.1136/bmj.l94 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.World Health Organization Suicide rate estimates, age-standardized estimates by country. http://apps.who.int/gho/data/node.main.MHSUICIDEASDR?lang=en. Updated July 17, 2018. Accessed January 18, 2019.
- 3.Suicide Prevention Resource Center. Research Prioritization Task Force, National Action Alliance for Suicide Prevention A prioritized research agenda for suicide prevention: an action plan to save lives. https://www.sprc.org/resources-programs/prioritized-research-agenda-suicide-prevention-action-plan-save-lives. Published 2014. Accessed January 18, 2019.
- 4.Murphy S, Xu J, Kochanek K, Arias E Mortality in the United States, 2017. Report No.: NCHS Data Brief, no 328. https://www.cdc.gov/nchs/products/databriefs/db328.htm. Published November 2018. Accessed January 18, 2019 [PubMed]
- 5.Franklin JC, Ribeiro JD, Fox KR, et al. . Risk factors for suicidal thoughts and behaviors: a meta-analysis of 50 years of research. Psychol Bull. 2017;143(2):187-232. doi: 10.1037/bul0000084 [DOI] [PubMed] [Google Scholar]
- 6.Crump C, Sundquist K, Sundquist J, Winkleby MA. Sociodemographic, psychiatric and somatic risk factors for suicide: a Swedish national cohort study. Psychol Med. 2014;44(2):279-289. doi: 10.1017/S0033291713000810 [DOI] [PubMed] [Google Scholar]
- 7.Harris EC, Barraclough B. Suicide as an outcome for mental disorders: a meta-analysis. Br J Psychiatry. 1997;170:205-228. doi: 10.1192/bjp.170.3.205 [DOI] [PubMed] [Google Scholar]
- 8.Heikkinen ME, Isometsä ET, Marttunen MJ, Aro HM, Lönnqvist JK. Social factors in suicide. Br J Psychiatry. 1995;167(6):747-753. doi: 10.1192/bjp.167.6.747 [DOI] [PubMed] [Google Scholar]
- 9.Walsh CG, Ribeiro JD, Franklin JC. Predicting risk of suicide attempts over time through machine learning, predicting risk of suicide attempts over time through machine learning. Clin Psychol Sci. 2017;5(3):457-469. doi: 10.1177/2167702617691560 [DOI] [Google Scholar]
- 10.Poulin C, Shiner B, Thompson P, et al. . Predicting the risk of suicide by analyzing the text of clinical notes. PLoS One. 2014;9(1):e85733. doi: 10.1371/journal.pone.0085733 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Metzger M-H, Tvardik N, Gicquel Q, Bouvry C, Poulet E, Potinet-Pagliaroli V. Use of emergency department electronic medical records for automated epidemiological surveillance of suicide attempts: a French pilot study. Int J Methods Psychiatr Res. 2017;26(2). doi: 10.1002/mpr.1522 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Kessler RC, Stein MB, Petukhova MV, et al. ; Army STARRS Collaborators . Predicting suicides after outpatient mental health visits in the Army Study to Assess Risk and Resilience in Servicemembers (Army STARRS). Mol Psychiatry. 2017;22(4):544-551. doi: 10.1038/mp.2016.110 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Kessler RC, Warner CH, Ivany C, et al. ; Army STARRS Collaborators . Predicting suicides after psychiatric hospitalization in US Army soldiers: the Army Study To Assess Risk and rEsilience in Servicemembers (Army STARRS). JAMA Psychiatry. 2015;72(1):49-57. doi: 10.1001/jamapsychiatry.2014.1754 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Barak-Corren Y, Castro VM, Javitt S, et al. . Predicting suicidal behavior from longitudinal electronic health records. Am J Psychiatry. 2017;174(2):154-162. doi: 10.1176/appi.ajp.2016.16010077 [DOI] [PubMed] [Google Scholar]
- 15.McCoy TH Jr, Castro VM, Roberson AM, Snapper LA, Perlis RH. Improving prediction of suicide and accidental death after discharge from general hospitals with natural language processing. JAMA Psychiatry. 2016;73(10):1064-1071. doi: 10.1001/jamapsychiatry.2016.2172 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Maris R, Berman A, Silverman M. Introduction to the Study of Suicide In: Comprehensive Textbook of Suicidology. New York: Guilford Press; 2000:3-25. [Google Scholar]
- 17.Hawton K. Sex and suicide. Gender differences in suicidal behaviour. Br J Psychiatry. 2000;177(6):484-485. doi: 10.1192/bjp.177.6.484 [DOI] [PubMed] [Google Scholar]
- 18.Skogman K, Alsén M, Öjehagen A. Sex differences in risk factors for suicide after attempted suicide—a follow-up study of 1052 suicide attempters. Soc Psychiatry Psychiatr Epidemiol. 2004;39(2):113-120. doi: 10.1007/s00127-004-0709-9 [DOI] [PubMed] [Google Scholar]
- 19.Gradus JL, King MW, Galatzer-Levy I, Street AE. Gender differences in machine learning models of trauma and suicidal ideation in veterans of the Iraq and Afghanistan wars. J Trauma Stress. 2017;30(4):362-371. doi: 10.1002/jts.22210 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Schmidt M, Schmidt SAJ, Sandegaard JL, Ehrenstein V, Pedersen L, Sørensen HT. The Danish National Patient Registry: a review of content, data quality, and research potential. Clin Epidemiol. 2015;7:449-490. doi: 10.2147/CLEP.S91125 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. Philadelphia: Lippincott Williams & Wilkins; 2008. [Google Scholar]
- 22.Noma H, Tanaka S. Analysis of case-cohort designs with binary outcomes: improving efficiency using whole-cohort auxiliary information. Stat Methods Med Res. 2017;26(2):691-706. doi: 10.1177/0962280214556175 [DOI] [PubMed] [Google Scholar]
- 23.Statistics Denmark Visits to physicians. https://www.dst.dk/en/Statistik/emner/levevilkaar/sundhed/laegebesoeg. Accessed February 4, 2019.
- 24.Frank L. Epidemiology. When an entire country is a cohort. Science. 2000;287(5462):2398-2399. doi: 10.1126/science.287.5462.2398 [DOI] [PubMed] [Google Scholar]
- 25.Frank L. Epidemiology. The epidemiologist’s dream: Denmark. Science. 2003;301(5630):163. doi: 10.1126/science.301.5630.163 [DOI] [PubMed] [Google Scholar]
- 26.Pedersen CB, Gøtzsche H, Møller JO, Mortensen PB. The Danish Civil Registration System: a cohort of eight million persons. Dan Med Bull. 2006;53(4):441-449. [PubMed] [Google Scholar]
- 27.Pedersen CB. The Danish Civil Registration System. Scand J Public Health. 2011;39(7)(suppl):22-25. doi: 10.1177/1403494810387965 [DOI] [PubMed] [Google Scholar]
- 28.Schmidt M, Pedersen L, Sørensen HT. The Danish Civil Registration System as a tool in epidemiology. Eur J Epidemiol. 2014;29(8):541-549. doi: 10.1007/s10654-014-9930-3 [DOI] [PubMed] [Google Scholar]
- 29.Baadsgaard M, Quitzau J. Danish registers on personal income and transfer payments. Scand J Public Health. 2011;39(7)(suppl):103-105. doi: 10.1177/1403494811405098 [DOI] [PubMed] [Google Scholar]
- 30.Jensen VM, Rasmussen AW. Danish education registers. Scand J Public Health. 2011;39(7)(suppl):91-94. doi: 10.1177/1403494810394715 [DOI] [PubMed] [Google Scholar]
- 31.Timmermans B. The Danish Integrated Database for Labor Market Research: towards demystification for the english speaking audience. Report No. 10–16. https://ideas.repec.org/p/aal/abbswp/10-16.html. Published 2010. Accessed January 29, 2019.
- 32.Mors O, Perto GP, Mortensen PB. The Danish Psychiatric Central Research Register. Scand J Public Health. 2011;39(7)(suppl):54-57. doi: 10.1177/1403494810395825
- 33.Svensson E, Lash TL, Resick PA, Hansen JG, Gradus JL. Validity of reaction to severe stress and adjustment disorder diagnoses in the Danish Psychiatric Central Research Registry. Clin Epidemiol. 2015;7:235-242. doi: 10.2147/CLEP.S80514
- 34.Lynge E, Sandegaard JL, Rebolj M. The Danish National Patient Register. Scand J Public Health. 2011;39(7)(suppl):30-33. doi: 10.1177/1403494811401482
- 35.Vest-Hansen B, Riis AH, Christiansen CF. Registration of acute medical hospital admissions in the Danish National Patient Registry: a validation study. Clin Epidemiol. 2013;5:129-133. doi: 10.2147/CLEP.S41905
- 36.Pottegård A, Schmidt SAJ, Wallach-Kildemoes H, Sørensen HT, Hallas J, Schmidt M. Data Resource Profile: The Danish National Prescription Registry. Int J Epidemiol. 2017;46(3):798-798f.
- 37.Helweg-Larsen K. The Danish Register of Causes of Death. Scand J Public Health. 2011;39(7)(suppl):26-29. doi: 10.1177/1403494811399958
- 38.Tøllefsen IM, Helweg-Larsen K, Thiblin I, et al. Are suicide deaths under-reported? Nationwide re-evaluations of 1800 deaths in Scandinavia. BMJ Open. 2015;5(11):e009120. doi: 10.1136/bmjopen-2015-009120
- 39.Rosellini AJ, Monahan J, Street AE, et al. Using administrative data to identify US Army soldiers at high-risk of perpetrating minor violent crimes. J Psychiatr Res. 2017;84:128-136. doi: 10.1016/j.jpsychires.2016.09.028
- 40.Lühdorf P, Overvad K, Schmidt EB, Johnsen SP, Bach FW. Predictive value of stroke discharge diagnoses in the Danish National Patient Register. Scand J Public Health. 2017;45(6):630-636. doi: 10.1177/1403494817716582
- 41.Tuckuviene R, Kristensen SR, Helgestad J, Christensen AL, Johnsen SP. Predictive value of pediatric thrombosis diagnoses in the Danish National Patient Registry. Clin Epidemiol. 2010;2:107-122. doi: 10.2147/CLEP.S10334
- 42.Strobl C, Malley J, Tutz G. An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychol Methods. 2009;14(4):323-348. doi: 10.1037/a0016973
- 43.Breiman L, Friedman J, Stone CJ, Olshen RA. Classification and Regression Trees. Milton Park, UK: Taylor & Francis; 1984.
- 44.Breiman L. Statistical modeling: the two cultures (with comments and a rejoinder by the author). Stat Sci. 2001;16(3):199-231. doi: 10.1214/ss/1009213726
- 45.rpart: Recursive partitioning and regression trees. https://CRAN.R-project.org/package=rpart. Published April 12, 2019. Accessed April 18, 2019.
- 46.Kuhn M, Johnson K. Remedies for severe class imbalance. In: Kuhn M, Johnson K, eds. Applied Predictive Modeling. New York, NY: Springer New York; 2013:419-443.
- 47.Liaw A, Wiener M. Classification and regression by randomForest. R News. 2002;2(3):18-22.
- 48.Chen C, Liaw A, Breiman L. Using random forest to learn imbalanced data. Report No. 666. https://statistics.berkeley.edu/tech-reports/666. Published July 2004. Accessed January 18, 2019.
- 49.Huang BFF, Boutros PC. The parameter sensitivity of random forests. BMC Bioinformatics. 2016;17(1):331. doi: 10.1186/s12859-016-1228-x
- 50.Robin X, Turck N, Hainard A, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. 2011;12:77. doi: 10.1186/1471-2105-12-77
- 51.SAS Institute Inc. SAS/GRAPH 9.4. Cary, NC; 2013.
- 52.R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria; 2017. https://www.R-project.org/
- 53.Henriksson MM, Aro HM, Marttunen MJ, et al. Mental disorders and comorbidity in suicide. Am J Psychiatry. 1993;150(6):935-940. doi: 10.1176/ajp.150.6.935
- 54.Gibbons RD, Hur K, Bhaumik DK, Mann JJ. The relationship between antidepressant medication use and rate of suicide. Arch Gen Psychiatry. 2005;62(2):165-172. doi: 10.1001/archpsyc.62.2.165
- 55.Brown GK, Beck AT, Steer RA, Grisham JR. Risk factors for suicide in psychiatric outpatients: a 20-year prospective study. J Consult Clin Psychol. 2000;68(3):371-377. doi: 10.1037/0022-006X.68.3.371
- 56.Angst J, Angst F, Stassen HH. Suicide risk in patients with major depressive disorder. J Clin Psychiatry. 1999;60(suppl 2):57-62.
- 57.Brent DA, Perper JA, Moritz G, et al. Psychiatric risk factors for adolescent suicide: a case-control study. J Am Acad Child Adolesc Psychiatry. 1993;32(3):521-529. doi: 10.1097/00004583-199305000-00006
- 58.Inskip HM, Harris EC, Barraclough B. Lifetime risk of suicide for affective disorder, alcoholism and schizophrenia. Br J Psychiatry. 1998;172(1):35-37. doi: 10.1192/bjp.172.1.35
- 59.Palmer BA, Pankratz VS, Bostwick JM. The lifetime risk of suicide in schizophrenia: a reexamination. Arch Gen Psychiatry. 2005;62(3):247-253. doi: 10.1001/archpsyc.62.3.247
- 60.Hor K, Taylor M. Suicide and schizophrenia: a systematic review of rates and risk factors. J Psychopharmacol (Oxf). 2010;24(4 suppl):81-90. doi: 10.1177/1359786810385490
- 61.Gradus JL. Posttraumatic stress disorder and death from suicide. Curr Psychiatry Rep. 2018;20(11):98. doi: 10.1007/s11920-018-0965-0
- 62.Levant RF, Wong YJ, eds. The Psychology of Men and Masculinities. Washington, DC: American Psychological Association; 2017. xxiv, 417.
- 63.van der Laan MJ, Polley EC, Hubbard AE. Super learner [published online September 16, 2007]. Stat Appl Genet Mol Biol. doi: 10.2202/1544-6115.1309
Associated Data