Author manuscript; available in PMC: 2025 Jan 1.
Published in final edited form as: Pharmacoepidemiol Drug Saf. 2023 Dec 19;33(1):e5734. doi: 10.1002/pds.5734

Predicting risk of suicidal behavior from insurance claims data vs. linked data from insurance claims and electronic health records

Gregory E Simon 1,6, Susan M Shortreed 1,7, Eric Johnson 1, Zimri S Yaseen 2, Marc Stone 2, Andrew D Mosholder 2, Brian K Ahmedani 3, Karen J Coleman 4,6, R Yates Coley 1,7, Robert B Penfold 1, Sengwee Toh 5
PMCID: PMC10843611  NIHMSID: NIHMS1945234  PMID: 38112287

Abstract

Purpose –

Observational studies assessing effects of medical products on suicidal behavior often rely on health records data to account for pre-existing risk. We assess whether high-dimensional models predicting suicide risk using data derived from insurance claims and electronic health records (EHRs) are superior to models using data from insurance claims alone.

Methods –

Data from seven large health systems identified outpatient mental health visits by patients aged 11 or older between 1/1/2009 and 9/30/2017. Data for the five years prior to each visit identified potential predictors of suicidal behavior typically available from insurance claims (e.g., mental health diagnoses, procedure codes, medication dispensings) and additional potential predictors available from EHRs (self-reported race and ethnicity, responses to Patient Health Questionnaire or PHQ-9 depression questionnaires). Nonfatal self-harm events following each visit were identified from insurance claims data and fatal self-harm events were identified by linkage to state mortality records. Random forest models predicting nonfatal or fatal self-harm over 90 days following each visit were developed in a 70% random sample of visits and validated in a held-out sample of 30%. Performance of models using linked claims and EHR data was compared to models using claims data only.

Results –

Among 15,845,047 encounters by 1,574,612 patients, 99,098 (0.6%) were followed by a self-harm event within 90 days. Overall classification performance did not differ between the best-fitting model using all data (area under the receiver operating characteristic curve or AUC = 0.846, 95% CI 0.839-0.854) and the best-fitting model limited to data available from insurance claims (AUC = 0.846, 95% CI 0.838-0.853). Competing models showed similar classification performance across a range of cutpoints and similar calibration performance across a range of risk strata. Results were similar when the sample was limited to health systems and time periods where PHQ-9 depression questionnaires were recorded more frequently.

Conclusion –

Investigators using health records data to account for pre-existing risk in observational studies of suicidal behavior need not limit that research to databases including linked EHR data.

Keywords: suicide, self-harm, epidemiology, machine learning, prediction

PLAIN LANGUAGE SUMMARY:

Evaluating the effects of medical products on suicidal behavior will often rely on observational studies using large databases derived from health records. Most databases available for large pharmacoepidemiology studies are derived from insurance claims, with only some including data from electronic health records (such as patient-reported outcomes or PROs). Machine learning-derived prediction models can account for pre-existing risk when studying effects of medical products on suicidal behavior. In a sample of over 15 million mental health visits, machine learning models predicting self-harm over the following 90 days performed equally well with and without use of PROs and other data only available from electronic health records.

INTRODUCTION

Reducing risk of suicidal behavior is a public health priority. In the US, self-harm leads to over 45,000 deaths annually1, and suicidal ideation or behavior results in approximately 300,000 emergency department visits per year2.

A variety of medical products may have therapeutic or adverse effects on suicidal behavior. Among medications prescribed for treatment of mental health conditions, lithium3, 4, clozapine5, 6, and glutamate receptor modulator drugs (such as ketamine and esketamine)7, 8 have been reported or hypothesized to decrease risk of suicidal behavior, while some antidepressant9 and anticonvulsant10 drugs have been reported to increase or precipitate suicidal behavior. Among medications used to treat general medical conditions, concerns regarding increased risk for suicidal behavior have been raised regarding montelukast11, isotretinoin12, and varenicline13.

Traditional clinical trials evaluating efficacy or safety of medical products are unlikely to have adequate statistical power to evaluate either therapeutic or adverse effects on suicidal behavior. Among the potential beneficial effects described above, only the beneficial effect of clozapine is supported by at least one randomized trial demonstrating effect on suicidal behavior14. Addressing concerns regarding precipitation of suicidal behavior by antidepressants required meta-analyses of hundreds of randomized trials9. In all other cases, regulation and clinical practice have depended on findings from observational comparisons11, 12, 15, 16 or on randomized trials considering suicidal ideation as a surrogate for suicidal behavior8, 17. Among patients known to be at increased risk for suicidal behavior, a single randomized trial with sufficient precision to detect a moderate increase or decrease in suicidal behavior would require several thousand participants18, 19. A randomized trial to detect increased risk due to a nonpsychiatric medication would typically require a sample ten or more times as large. While meta-analyses of randomized trials can examine effects for drug classes9, gaining statistical power via meta-analysis is less feasible for novel medical products. Consequently, regulatory and clinical decisions, especially for newer products, must often rely on evidence from observational comparisons using large records databases.

Observational studies regarding therapeutic or adverse effects of medical products on suicidal behavior are liable to significant confounding. Especially when considering medications for treatment of mental health conditions, prescribing decisions may be related to severity or type of mental symptoms or directly related to prescribers’ estimates of suicide risk. Previous research has attempted to account for this potential confounding using covariates or propensity scores including known predictors of self-harm such as mental health diagnoses and record of prior psychiatric hospitalization or suicide attempt11, 15, 20.

Recent research demonstrates that high-dimensional prediction scores including a wide range of predictors extracted from insurance claims and electronic health records (EHR) data can accurately stratify people at risk for suicidal behavior21-23. Machine learning methods can combine hundreds of predictors, each with only moderate or small effect, to classify risk of suicidal behavior with overall performance (as measured by area under the receiver operating characteristic curve or AUC) exceeding 85%21-23. Prediction models including large numbers of correlated predictors are not suited for causal inference regarding those included predictors. But predictions derived from those models can more accurately estimate pre-existing risk of suicidal behavior. In future observational studies of medication effects on suicidal behavior, these methods could be used to create high-dimensional covariate vectors to account for confounding using propensity score or disease risk score methodology.

Health records databases available for observational studies of medication effects on suicidal behavior differ in the availability of specific covariates or predictors. Data typically available from insurance claims or billing records include encounter diagnoses, procedure codes, and outpatient medication dispensings, while data available from EHRs may also include race, ethnicity, patient-reported outcomes, laboratory results, and clinical text. The Centers for Medicare & Medicaid Services Medicare and Medicaid databases are limited to data typically available from insurance claims. The FDA Sentinel System includes insurance claims data for approximately 70 million individuals in any year, with linked EHR data for fewer than 10% of those. Other aggregated records databases, such as MarketScan24, also include linked EHR data on small proportions of covered populations.

Here we examine how additional data available from EHRs do or do not improve prediction of suicidal behavior compared to predictions limited to data typically available from insurance claims. These analyses intend to address a practical question regarding future observational studies of medication effects on suicidal behavior: Should such studies be limited to data sources including linked insurance claims and EHR data?

METHODS

Data for these analyses were extracted from records of seven health systems participating in the Mental Health Research Network (HealthPartners; Henry Ford Health; and the Colorado, Hawaii, Northwest, Southern California and Washington regions of Kaiser Permanente) serving a combined population of approximately eight million members in nine states25. Each system provides insurance coverage and comprehensive health care (including general medical and specialty mental health care) to a defined population enrolled through employer-sponsored insurance, individual insurance, capitated Medicaid or Medicare, and subsidized low-income programs. Members are representative of each system’s service area in age, race/ethnicity, and socioeconomic status. All systems recommend using the Patient Health Questionnaire depression scale or PHQ-926 at mental health visits and primary care visits for depression, but implementation varied across systems during the study period. Each health system maintains a research data warehouse following the Health Care Systems Research Network Virtual Data Warehouse model27. This resource combines data from insurance enrollment records, EHRs, insurance claims, pharmacy dispensings, state mortality records, and census-derived neighborhood characteristics. Responsible institutional review boards for each health system approved use of these de-identified records data for this research.

To represent the potential use of prediction models in observational studies following a target trial emulation design28, this study considered mental health visits as occasions where a new treatment might or might not be prescribed, evaluating confounders prior to that visit and outcomes following it. The study sample included visits to specialty mental health clinicians between 1/1/2009 and 9/30/2017 by health system members aged 11 or older. To ensure ascertainment of subsequent self-harm diagnoses, the sample was limited to those enrolled in a participating health system insurance plan at the time of the sampled visit and for at least 90 days after, excluding those who disenrolled or died from causes other than suicide prior to 90 days.

For each eligible encounter, potential predictors recorded during the prior 60 months were extracted from research data warehouses. Predictors typically available from insurance claims data included age, sex, prior mental health and substance use disorder diagnoses (in 13 categories), dispensed prescriptions for mental health medications (in 8 categories), prior injury or poisoning diagnoses (in 4 categories), prior emergency department or inpatient encounters with mental health diagnoses (in 3 categories), and 17 indicators of chronic medical illness following the Charlson Comorbidity score. For each category of mental health or substance use diagnosis, prescription dispensing, or service use, count data for each of the prior 60 months (e.g., number of days a diagnosis was recorded in each of the 60 months, number of days' supply of medication dispensed during each of the 60 months) were transformed into 48 possible time patterns reflecting various permutations of first onset, most recent occurrence, and increase or decrease over time. Predictors typically available from EHR data included self-reported race, self-reported Hispanic ethnicity, current and prior responses to PHQ-9 depression questionnaires (both total scores and response to the 9th item regarding thoughts of death or self-harm), and census-block-level indicators of household income and educational attainment. Responses to PHQ-9 item 9 were represented as five indicator variables (representing four possible responses as well as a missing response) for the sampled encounter and for up to three prior recordings of the PHQ-9. A data dictionary describing all potential predictors can be found in Appendix A.
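The indicator coding of PHQ-9 item 9 described above could be sketched as follows. This is an illustrative sketch only, not the study's code: the numeric coding (0 to 3), the function names, the column order, and the choice to treat unused prior-recording slots as "missing" are all assumptions.

```python
from typing import List, Optional, Sequence

# Assumed coding (hypothetical): 0 = "Not at all", 1 = "Several days",
# 2 = "More than half the days", 3 = "Nearly every day", None = not recorded.
ITEM9_LEVELS = ["missing", "not_at_all", "several_days",
                "more_than_half", "nearly_every_day"]


def item9_indicators(response: Optional[int]) -> List[int]:
    """Return five 0/1 indicators: one for missing plus one per response level."""
    if response is None:
        index = 0
    elif response in (0, 1, 2, 3):
        index = response + 1
    else:
        raise ValueError(f"unexpected item 9 response: {response!r}")
    return [1 if i == index else 0 for i in range(len(ITEM9_LEVELS))]


def encode_visit(current: Optional[int],
                 prior: Sequence[Optional[int]],
                 max_prior: int = 3) -> List[int]:
    """Encode the sampled visit plus up to `max_prior` prior recordings.

    `prior` is assumed to be ordered most recent first. Slots with no prior
    recording are coded as missing (an assumption of this sketch).
    """
    kept = list(prior[:max_prior])
    padded = kept + [None] * (max_prior - len(kept))
    features: List[int] = []
    for response in [current] + padded:
        features.extend(item9_indicators(response))
    return features
```

With one sampled visit and three prior slots, each encounter contributes 20 indicator columns under these assumptions.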

Nonfatal self-harm events occurring within 90 days of the sampled encounter were ascertained from health system EHRs and insurance claims data using ICD-9-CM external cause codes indicating self-harm intent (E950-E958) or undetermined intent (E980-E988) and ICD-10-CM diagnoses coded as having self-harm intent (typically indicated by a modifier in the sixth character position). Including ICD-9-CM diagnoses of undetermined intent increased ascertainment of self-harm events under ICD-9-CM (prior to October 2015) by approximately 25%. Reviews of full-text medical records in these health systems found that ICD-9-CM diagnoses of self-harm intent among people with mental health diagnoses were accompanied by clear documentation of self-harm intent in 90% of cases and that ICD-9-CM diagnoses of undetermined intent among people with mental health diagnoses were accompanied by clear documentation of self-harm intent in over 80% of cases21. Reviews of full-text medical records in these health systems found that ICD-10-CM injury and poisoning diagnoses indicating self-harm among people with history of suicidal ideation were accompanied by clear documentation of self-harm intent in 89% of cases and that reliance on ICD-10-CM self-harm coding would under-estimate overall rates of self-harm by approximately 15%29.

Fatal self-harm events within 90 days of the sampled encounter were ascertained by linking health system enrollment records to state vital statistics data, including ICD-10 cause of death codes indicating self-harm intent (X60-X84) or undetermined intent (Y10-Y34).
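The code ranges named above can be expressed as a small classifier. The sketch below covers only the ICD-9-CM external cause ranges and the ICD-10 cause-of-death ranges stated in the text; the ICD-10-CM rule for nonfatal events is not specified in enough detail here to implement, so it is deliberately left out. The function name and the `system` labels are hypothetical.

```python
import re


def classify_intent(code: str, system: str) -> str:
    """Classify a code as 'self_harm', 'undetermined', or 'other'.

    system: 'icd9cm' for ICD-9-CM external cause codes (nonfatal events),
            'icd10' for ICD-10 cause-of-death codes (fatal events).
    """
    normalized = code.strip().upper().replace(".", "")
    if system == "icd9cm":
        # E-codes: 'E' + three digits + optional fourth digit, e.g. E950.0
        match = re.fullmatch(r"E(\d{3})\d?", normalized)
        if match:
            number = int(match.group(1))
            if 950 <= number <= 958:
                return "self_harm"
            if 980 <= number <= 988:
                return "undetermined"
        return "other"
    if system == "icd10":
        # Letter + two digits + optional subdivision digits, e.g. X70 or X60.0
        match = re.fullmatch(r"([XY])(\d{2})\d*", normalized)
        if match:
            letter, number = match.group(1), int(match.group(2))
            if letter == "X" and 60 <= number <= 84:
                return "self_harm"
            if letter == "Y" and 10 <= number <= 34:
                return "undetermined"
        return "other"
    raise ValueError(f"unknown coding system: {system!r}")
```

In this study both intent categories counted as self-harm outcomes, so a caller would treat either non-"other" result as an event.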

Random forest models predicting any self-harm event (fatal or nonfatal) within 90 days of a sampled encounter were developed in a random sample of 70% of encounters and then validated in the remaining 30%, with all encounters for any individual patient assigned to the same set (either training or validation). Models were developed using R version 4.4.1 and the ranger R package30, with tuning parameters (minimal node size and number of predictors considered at each split) selected by 5-fold cross-validation to maximize AUC. Confidence limits for AUC statistics were calculated using 10,000 bootstrap iterations resampled at the person level to account for the effects of correlation due to inclusion of multiple encounters per person31. In addition to AUC, positive predictive value and sensitivity were calculated for each model at cut-points ranging from the 50th to the 99.5th percentiles of risk in the training data.
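Two design choices in this paragraph, keeping each patient's encounters on one side of the split and resampling whole persons in the bootstrap, can be illustrated with a short sketch. The study itself used R and ranger; the Python below is a standard-library illustration with hypothetical function names, a simple percentile interval, and far fewer bootstrap iterations than the 10,000 reported. Sampling 70% of patients only approximates a 70% share of encounters.

```python
import random
from collections import defaultdict


def auc(scores, labels):
    """AUC via the Mann-Whitney rank-sum statistic, averaging ranks over ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def split_by_patient(patient_ids, train_fraction=0.7, seed=0):
    """Assign whole patients (not single encounters) to training or validation."""
    rng = random.Random(seed)
    unique = sorted(set(patient_ids))
    rng.shuffle(unique)
    train_patients = set(unique[:round(train_fraction * len(unique))])
    train_idx = [i for i, p in enumerate(patient_ids) if p in train_patients]
    valid_idx = [i for i, p in enumerate(patient_ids) if p not in train_patients]
    return train_idx, valid_idx


def cluster_bootstrap_auc_ci(patient_ids, scores, labels,
                             n_boot=200, alpha=0.05, seed=0):
    """Percentile CI for AUC, resampling patients with replacement."""
    rng = random.Random(seed)
    by_patient = defaultdict(list)
    for i, p in enumerate(patient_ids):
        by_patient[p].append(i)
    patients = sorted(by_patient)
    stats = []
    for _ in range(n_boot):
        idx = []
        for p in rng.choices(patients, k=len(patients)):
            idx.extend(by_patient[p])  # every encounter of a drawn patient
        value = auc([scores[i] for i in idx], [labels[i] for i in idx])
        if value == value:  # skip NaN from degenerate resamples
            stats.append(value)
    stats.sort()
    lower = stats[int((alpha / 2) * (len(stats) - 1))]
    upper = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lower, upper
```

Resampling encounters independently would treat repeated visits by one person as unrelated observations and would tend to produce intervals that are too narrow; drawing persons keeps that within-person correlation intact.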

Routine use of PHQ-9 questionnaires was implemented at different times and at different rates in participating health systems. Because the added value of EHR data might depend on the availability of PHQ-9 results, additional analyses compared accuracy of models with and without EHR-sourced data in subsets of encounters with higher rates of PHQ-9 recording. To avoid biases due to individual patients choosing whether to respond to PHQ-9 questions or individual clinicians choosing whether to administer or record questionnaires, these subgroup analyses selected encounters for each health system by calendar year – including health systems and calendar years where rates of PHQ-9 recording for sampled encounters were either greater than 20% or greater than 50%.
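The subgroup selection described above works at the level of health system and calendar year rather than the individual encounter. A minimal sketch, assuming each encounter is a hypothetical `(site, year, phq9_recorded)` tuple:

```python
from collections import defaultdict


def select_site_years(encounters, threshold):
    """Return the (site, year) pairs whose PHQ-9 recording rate exceeds threshold."""
    total = defaultdict(int)
    recorded = defaultdict(int)
    for site, year, phq9_recorded in encounters:
        total[(site, year)] += 1
        recorded[(site, year)] += int(phq9_recorded)
    return {key for key in total if recorded[key] / total[key] > threshold}


def restrict_to_site_years(encounters, threshold):
    """Keep every encounter from the selected site-years, with or without a PHQ-9."""
    encounters = list(encounters)
    keep = select_site_years(encounters, threshold)
    return [e for e in encounters if (e[0], e[1]) in keep]
```

Note that encounters without a PHQ-9 result stay in the subgroup as long as their site-year qualifies. Filtering on the encounter's own PHQ-9 status instead would reintroduce the patient-level and clinician-level selection this design is meant to avoid.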

RESULTS

The eligibility criteria described above identified 15,845,047 encounters by 1,574,612 patients during the study period. Table 1 displays characteristics of all eligible encounters as well as subsets of encounters when and where PHQ-9 depression scores were more frequently recorded. Limiting to health systems and years with PHQ-9 scores recorded for more than half of encounters decreased the sample of encounters by approximately 90% and selected a larger proportion of White and non-Hispanic patients. Self-harm diagnoses were recorded within 90 days for 99,098 visits by 19,112 patients. Compared to the full visit sample, visits followed by self-harm were more often accompanied by reports of frequent suicidal ideation at the sampled visits and more often preceded by reports of suicidal ideation at prior visits, by encounters with a range of mental health diagnoses, by receipt of specialty mental health care, by inpatient or emergency care for mental health diagnoses, and by previous self-harm events.

Table 1 –

Characteristics of sampled encounters, including all eligible encounters and subsets limited to health systems and time periods with higher rates of PHQ-9 depression questionnaire results recorded.

Characteristic | All encounters: All Sites/All Years | All encounters: >20% Use of PHQ-9 | All encounters: >50% Use of PHQ-9 | Self-harm event within 90 days: All Sites/All Years | Self-harm event within 90 days: >20% Use of PHQ-9 | Self-harm event within 90 days: >50% Use of PHQ-9
# of Encounters | 15,845,047 | 5,971,662 | 1,323,835 | 99,098 | 41,379 | 12,118
# of Unique patients | 1,574,612 | 841,505 | 196,420 | 19,112 | 9,130 | 3,086
Female | 63.7% | 64.4% | 64.9% | 68.7% | 71.1% | 70.8%
Self-Reported Race
 Asian | 4.2% | 3.6% | 2.7% | 3.8% | 3.4% | 3.1%
 Black | 8.1% | 6.7% | 4.1% | 5.6% | 5.1% | 4.3%
 Hawaiian/PI | 0.5% | 0.3% | 0.3% | 0.6% | 0.3% | 0.3%
 American Indian/Alaskan Native | 0.5% | 0.5% | 0.7% | 0.5% | 0.7% | 0.9%
 More than one | 2.7% | 2.6% | 3.8% | 2.8% | 3.4% | 5.0%
 Other | 0.5% | 1.1% | 2.6% | 0.7% | 1.6% | 3.0%
 Unknown | 15.8% | 13.8% | 5.4% | 13.9% | 11.9% | 5.5%
 White | 67.8% | 71.4% | 80.4% | 72.1% | 73.6% | 77.9%
Self-Reported Hispanic Ethnicity
 No | 64.1% | 63.5% | 84.5% | 64.3% | 64.2% | 85.0%
 Unknown | 11.6% | 16.8% | 7.7% | 12.0% | 17.2% | 6.3%
 Yes | 24.3% | 19.7% | 7.8% | 23.7% | 18.5% | 8.7%
Depression diagnosis in prior year | 64% | 64.9% | 66.9% | 81.8% | 81.7% | 81.1%
Anxiety Disorder diagnosis in prior year | 61.8% | 68.1% | 70.7% | 74.5% | 79.9% | 79.9%
Bipolar Disorder diagnosis in prior year | 10.7% | 11.7% | 14.4% | 26.9% | 26.4% | 28.3%
Schizophrenia diagnosis in prior year | 3.4% | 3.3% | 3.2% | 8.0% | 7.3% | 6.5%
Personality Disorder diagnosis in prior year | 10.6% | 10.1% | 12.1% | 25.7% | 25.3% | 26.8%
Self-Harm diagnosis in prior year | 2.4% | 2.6% | 3.1% | 26.2% | 27.8% | 29.4%
Specialty Mental Health visit in prior year | 87.2% | 86.9% | 85.8% | 92.9% | 92.3% | 91.2%
Psychiatric hospitalization in prior year | 13.3% | 12.6% | 11.5% | 50.0% | 45.1% | 37.7%
Mental Health Emergency Dept. visit in prior year | 20.6% | 20.6% | 16.2% | 57.3% | 55.2% | 47.3%
PHQ-9 Item 9 score at sampled visit
 Absent | 84.2% | 62.8% | 32.0% | 83.4% | 63.3% | 34.9%
 Not at all | 11.9% | 27.9% | 51.2% | 6.4% | 14.2% | 25.1%
 Several days | 2.5% | 6.0% | 10.6% | 4.6% | 10.2% | 17.3%
 More than half | 0.8% | 1.9% | 3.7% | 2.6% | 5.8% | 11.0%
 Nearly every day | 0.6% | 1.3% | 2.6% | 2.9% | 6.6% | 11.6%
Highest Item 9 score in prior 5 years
 Absent | 64.8% | 30.9% | 12.3% | 60.0% | 23.9% | 10.1%
 Not at all | 20% | 36.2% | 39.4% | 9.6% | 14.4% | 13.5%
 Several days | 7.4% | 15.6% | 20% | 9.0% | 17.2% | 15.3%
 More than half | 3.7% | 8% | 11.7% | 7.2% | 14.5% | 17.2%
 Nearly every day | 4.1% | 9.4% | 16.6% | 14.2% | 30.0% | 43.9%
Self-Harm diagnosis in following 90 days | 0.6% | 0.7% | 0.9% | 100% | 100% | 100%

Table 2 displays overall classification performance in the held-out validation set for competing models assessed by AUC or c-statistic. For the entire sample, overall performance was essentially identical for a model limited to data typically available from insurance claims and a model using all data available from claims and EHR data. For the sample limited to settings with 50% or greater PHQ-9 recording, a model using all available data had numerically superior performance (AUC 0.856 vs. 0.846), but 95% confidence limits for AUC metrics were wider for this smaller sample and were substantially overlapping for the competing models (0.842 to 0.870 for the model using all data vs. 0.830 to 0.860 for the model limited to claims-sourced data). Analyses stratified by sex, race, and Hispanic ethnicity (Appendix Table) showed similar findings across all subgroups. Classification performance was similar for models using and not using EHR-sourced data with confidence limits largely overlapping.

Table 2 –

Overall classification performance (area under the receiver operating characteristic curve with 95% confidence limits) for competing prediction models using data available from either claims or electronic health records (EHRs) vs. only data typically available from insurance claims.

Sample | EHR and Claims | Claims only
All eligible years | 0.846 (0.839, 0.854) | 0.846 (0.838, 0.853)
Limited to settings with >20% PHQ-9 recording | 0.845 (0.835, 0.855) | 0.840 (0.829, 0.850)
Limited to settings with >50% PHQ-9 recording | 0.856 (0.842, 0.870) | 0.846 (0.830, 0.860)

ROC curves for competing models in the held-out validation set are shown in Figure 1, for both the entire sample and limited to health systems and years with PHQ-9 results recorded for over half of encounters. For the entire sample, curves were essentially identical. For the sample limited to settings with 50% or greater PHQ-9 recording, curves appear to diverge in the lower half of the risk distribution (right side of the ROC curve), indicating that inclusion of data from EHRs may improve ordering of risk in the lower part of the risk distribution. That portion of the ROC curve, however, represents a small proportion of self-harm events (only approximately 20% of all events occurred following visits below the 75th percentile).

Figure 1 –

Receiver operating characteristic curves for competing prediction models using data available from either claims or electronic health records (EHRs) vs. only data typically available from insurance claims. Left panel shows results for all eligible mental health encounters and right panel shows results limited to settings with PHQ-9 results recorded for >50% of encounters.

Table 3 displays classification performance in the held-out validation set for a variety of cut-points for the full sample and the sample restricted to settings with 50% or greater PHQ-9 recording. For competing models with and without EHR-sourced data, sensitivity and positive predictive value were similar across the risk distribution, both in the full sample and in the subset with more frequent PHQ-9 recording.

Table 3 –

Classification performance for competing prediction models using data available from either claims or electronic health records (EHRs) vs. only data typically available from insurance claims.

ALL MENTAL HEALTH ENCOUNTERS
Percentile Cut-point | Predicted Risk Cut-point | Sensitivity: EHR and Claims | Sensitivity: Claims only | Positive Predictive Value: EHR and Claims | Positive Predictive Value: Claims only
50% | 0.002 | 0.924 | 0.920 | 0.012 | 0.012
75% | 0.006 | 0.787 | 0.788 | 0.020 | 0.020
90% | 0.014 | 0.566 | 0.567 | 0.036 | 0.036
95% | 0.023 | 0.422 | 0.423 | 0.053 | 0.053
99% | 0.073 | 0.173 | 0.167 | 0.107 | 0.107
99.5% | 0.097 | 0.115 | 0.108 | 0.139 | 0.139
LIMITED TO HEALTH SYSTEMS AND YEARS WITH >50% RECORDING OF PHQ-9 RESULTS
Percentile Cut-point | Predicted Risk Cut-point | Sensitivity: EHR and Claims | Sensitivity: Claims only | Positive Predictive Value: EHR and Claims | Positive Predictive Value: Claims only
50% | 0.003 | 0.956 | 0.916 | 0.015 | 0.017
75% | 0.007 | 0.853 | 0.779 | 0.026 | 0.029
90% | 0.016 | 0.624 | 0.555 | 0.046 | 0.050
95% | 0.026 | 0.436 | 0.402 | 0.064 | 0.073
99% | 0.093 | 0.176 | 0.162 | 0.153 | 0.151
99.5% | 0.133 | 0.099 | 0.107 | 0.195 | 0.190
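The quantities in Table 3 can be computed as in the sketch below, which takes the cut-point from a percentile of training-set risk and then evaluates sensitivity and positive predictive value in the validation set. The function names, the nearest-rank percentile rule, and the choice to flag scores strictly above the cut-point are assumptions of this sketch, not details reported by the study.

```python
import math


def percentile_cutpoint(train_scores, percentile):
    """Nearest-rank percentile (0-100) of predicted risk in the training data."""
    ordered = sorted(train_scores)
    rank = math.ceil(percentile * len(ordered) / 100)
    index = max(0, min(len(ordered) - 1, rank - 1))
    return ordered[index]


def sensitivity_ppv(valid_scores, valid_labels, cutpoint):
    """Sensitivity and PPV for validation visits scoring above the cut-point."""
    flagged = [y for s, y in zip(valid_scores, valid_labels) if s > cutpoint]
    true_positives = sum(flagged)
    total_events = sum(valid_labels)
    sensitivity = true_positives / total_events if total_events else float("nan")
    ppv = true_positives / len(flagged) if flagged else float("nan")
    return sensitivity, ppv
```

As a rough consistency check on the tabulated values, PPV is approximately sensitivity × prevalence / fraction of visits flagged. Assuming the validation set shares the overall event rate (99,098 / 15,845,047 ≈ 0.00625) and that about 5% of validation visits exceed the 95th-percentile cut-point, 0.422 × 0.00625 / 0.05 ≈ 0.053, in line with the full-sample row for the 95th percentile.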

Table 4 displays calibration performance (in the held-out validation set) for a variety of risk strata for the full sample and the subset with 50% or greater PHQ-9 recording. All models showed good agreement between predicted and observed risk across strata. Competing models with and without EHR-sourced data showed similar ability to stratify or concentrate risk, with observed rates of suicidal behavior following visits above the 99th percentile more than 100 times greater than for those below the 50th percentile.

Table 4 –

Calibration performance for competing prediction models using data available from either claims or electronic health records (EHRs) vs. only data typically available from insurance claims.

ALL MENTAL HEALTH ENCOUNTERS
Risk stratum (percentile) | # of Self-harm events: EHR and Claims | # of Self-harm events: Claims only | Predicted Risk: EHR and Claims | Predicted Risk: Claims only | Observed Risk: EHR and Claims | Observed Risk: Claims only
<50% | 2287 | 2389 | 0.001 | 0.001 | 0.001 | 0.001
50%-75% | 4091 | 3956 | 0.004 | 0.004 | 0.003 | 0.003
75%-90% | 6610 | 6606 | 0.009 | 0.009 | 0.010 | 0.010
90%-95% | 4295 | 4306 | 0.018 | 0.018 | 0.019 | 0.018
95%-99% | 7449 | 7652 | 0.039 | 0.039 | 0.039 | 0.040
99%-99.5% | 1736 | 1766 | 0.084 | 0.083 | 0.076 | 0.075
>99.5% | 3442 | 3235 | 0.133 | 0.136 | 0.144 | 0.139
LIMITED TO HEALTH SYSTEMS AND YEARS WITH >50% RECORDING OF PHQ-9 RESULTS
Risk stratum (percentile) | # of Self-harm events: EHR and Claims | # of Self-harm events: Claims only | Predicted Risk: EHR and Claims | Predicted Risk: Claims only | Observed Risk: EHR and Claims | Observed Risk: Claims only
<50% | 156 | 297 | 0.001 | 0.001 | 0.001 | 0.002
50%-75% | 364 | 486 | 0.005 | 0.004 | 0.004 | 0.005
75%-90% | 812 | 792 | 0.010 | 0.010 | 0.012 | 0.014
90%-95% | 667 | 542 | 0.020 | 0.020 | 0.028 | 0.028
95%-99% | 921 | 851 | 0.046 | 0.047 | 0.046 | 0.054
99%-99.5% | 272 | 195 | 0.109 | 0.109 | 0.120 | 0.108
>99.5% | 350 | 379 | 0.173 | 0.169 | 0.195 | 0.190
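Calibration by risk stratum, as summarized in Table 4, compares the mean predicted risk in each stratum with the observed event rate. A minimal sketch follows; the function name, the tuple layout of the result, and the convention that a score equal to a boundary falls in the lower stratum are assumptions made for illustration.

```python
import bisect


def calibration_by_stratum(scores, labels, boundaries):
    """Summarize calibration within risk strata.

    boundaries: ascending score cut-points (for example, the training-set
    50th, 75th, 90th, 95th, 99th, and 99.5th percentiles).
    Returns one tuple per stratum:
    (stratum index, n visits, n events, mean predicted risk, observed rate).
    """
    n_strata = len(boundaries) + 1
    counts = [0] * n_strata
    events = [0] * n_strata
    predicted_sum = [0.0] * n_strata
    for score, label in zip(scores, labels):
        # bisect_left places a score equal to a boundary in the lower stratum
        k = bisect.bisect_left(boundaries, score)
        counts[k] += 1
        events[k] += label
        predicted_sum[k] += score
    rows = []
    for k in range(n_strata):
        if counts[k]:
            rows.append((k, counts[k], events[k],
                         predicted_sum[k] / counts[k], events[k] / counts[k]))
        else:
            rows.append((k, 0, 0, float("nan"), float("nan")))
    return rows
```

A well-calibrated model shows mean predicted risk close to the observed rate in every stratum, and the ratio of observed rates between the top and bottom strata indicates how strongly the model concentrates risk.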

DISCUSSION

Using data regarding over 15 million mental health encounters from seven large health systems, we found that machine learning-derived models to predict suicidal behavior using data typically available from insurance claims perform as well as models using additional data (such as race, ethnicity, and patient-reported outcomes) available from EHRs. Models developed with and without EHR-sourced data showed similar performance in both classification and calibration, even when analyses were restricted to health systems and time periods where PHQ-9 depression scores were recorded for over 50% of mental health encounters.

These findings do not imply that responses to PHQ-9 questionnaires do not predict subsequent suicidal behavior. In this sample, and in previous research in these settings32, 33, response to the ninth item of the PHQ-9 was strongly associated with risk of subsequent self-harm. These new analyses imply that responses to PHQ-9 questionnaires do not add to high-dimensional prediction scores derived from comprehensive insurance claims data covering the prior 5 years. Self-report measures are still useful tools for identifying or assessing risk of suicidal behavior in clinical care32, 34, especially in settings where records data or technical resources are not available for calculation of high-dimensional prediction scores.

While models developed with or without data derived from EHR had overall accuracy (as indicated by AUC) of approximately 85%, we would not claim that these prediction scores can completely account for any confounding when evaluating effects of medical products on suicidal behavior. But these results do demonstrate that high-dimensional prediction scores can account for baseline risk of suicidal behavior much more accurately than self-report measures alone32, 35 or predictions based on a small number of risk factors derived from records36, 37.

These analyses focus on the overall accuracy of prediction models that could be used to account for confounding in observational studies of therapeutic or adverse effects on suicidal behavior. Prediction models can also be used to direct outreach or prevention efforts38, 39. Even if models using and not using data derived from EHRs have similar overall accuracy, use of EHR data might identify different individual patients at high risk of self-harm.

These findings are generally consistent with those of our earlier work focused on timeliness of data availability for clinical implementation of suicide risk prediction models40. In that smaller sample with lower rates of PHQ-9 recording, prediction model performance was not significantly affected by availability of diagnoses and PHQ-9 questionnaire results recorded on the visit day.

LIMITATIONS

Interpretation of these findings should consider several limitations. Data were drawn from large integrated health systems with well-organized records databases, and findings may not generalize to other settings. These analyses did not consider additional relevant data that might be extracted from clinical text, such as notation regarding suicidal ideation, negative life events, or social determinants of health. Processing of clinical text for millions of encounters across multiple health systems, however, could pose computational challenges and raise privacy concerns. Ascertainment of self-harm events from health system records is subject to both false negative and false positive errors. While previous research suggests that error rates are low, we cannot determine how false positive or false negative errors would differentially affect predictions that do or do not consider EHR-sourced data.

CONCLUSIONS

Observational research using prediction models to account for pre-existing risk when assessing effects of medical products on suicidal behavior need not limit studies to databases including linked EHR data.

Key Points:

  • Evaluating the effects of medical products on suicidal behavior will often rely on observational studies using large databases derived from health records.

  • Most databases available for large pharmacoepidemiology studies are derived from insurance claims, with only some including data from electronic health records (such as patient-reported outcomes or PROs).

  • Machine learning-derived prediction models can account for pre-existing risk when studying effects of medical products on suicidal behavior.

  • In a sample of over 15 million mental health visits, machine learning models predicting self-harm over the following 90 days performed equally well with and without use of PROs and other data only available from electronic health records.

Acknowledgments

This work was supported by contract HHSF223201810201C with the US Food and Drug Administration and cooperative agreement U19 MH121738 with the US National Institute of Mental Health. This article reflects the views of the authors and should not be construed to represent FDA’s views or policies.

Footnotes

ETHICAL APPROVAL STATEMENT

Responsible Institutional Review Boards for each health system contributing data approved waivers of consent for use of deidentified records data in this research.


