Author manuscript; available in PMC: 2025 Dec 24.
Published in final edited form as: Nat Med. 2025 Apr 3;31(6):1863–1872. doi: 10.1038/s41591-025-03603-z

Clinical implementation of AI-based screening for risk for opioid use disorder in hospitalized adults

Majid Afshar 1,, Felice Resnik 2, Cara Joyce 3, Madeline Oguss 1, Dmitriy Dligach 4, Elizabeth S Burnside 2, Anne Gravel Sullivan 2, Matthew M Churpek 1, Brian W Patterson 5, Elizabeth Salisbury-Afshar 6, Frank J Liao 7, Cherodeep Goswami 7, Randy Brown 6, Marlon P Mundt 6
PMCID: PMC12723583  NIHMSID: NIHMS2124689  PMID: 40181180

Abstract

Adults with opioid use disorder (OUD) are at increased risk for opioid-related complications and repeated hospital admissions. Routine screening for patients at risk for an OUD to prevent complications is not standard practice in many hospitals, leading to missed opportunities for intervention. The adoption of electronic health records (EHRs) and advancements in artificial intelligence (AI) offer a scalable approach to systematically identify at-risk patients for evidence-based care. This pre–post quasi-experimental study evaluated whether an AI-driven OUD screener embedded in the EHR was non-inferior to usual care in identifying patients for addiction medicine consultations, aiming to provide a similarly effective but more scalable alternative to human-led ad hoc consultations. The AI screener used a convolutional neural network to analyze EHR notes in real time, identifying patients at risk and recommending consultations. The primary outcome was the proportion of patients who completed a consultation with an addiction medicine specialist, which included interventions such as outpatient treatment referral, management of complicated withdrawal, medication management for OUD and harm reduction services. The study period consisted of a 16-month pre-intervention phase followed by an 8-month post-intervention phase, during which the AI screener was implemented to support hospital providers in identifying patients for consultation. Consultations did not change between periods (1.35% versus 1.51%, P < 0.001 for non-inferiority). In secondary outcome analysis, the AI screener was associated with a reduction in 30-day readmissions (odds ratio: 0.53, 95% confidence interval: 0.30–0.91, P = 0.02) with an incremental cost of US$6,801 per readmission avoided, demonstrating its potential as a scalable, cost-effective solution for OUD care. ClinicalTrials.gov registration: NCT05745480.


Unintentional overdose involving synthetic opioids and polydrug toxicity increased exponentially between 2010 and 2022 and remains a public health crisis1. Many individuals with nonfatal opioid toxicity and related complications, such as wound infections and endocarditis, frequently engage in hospital care2. In the United States, from 2018 to 2021, emergency department visits for substance use surged by nearly 40%, with opioids being the second leading cause of these visits after alcohol2,3. Hospitalized adults with OUD face an elevated risk of overdose or complications, making hospitalization an important touchpoint for interventions to prevent rehospitalization or death4,5.

Hospital-based addiction care has been shown to increase the adoption of life-saving medications for OUD6,7, boost post-hospital treatment engagement8,9, reduce stigma10,11, improve experiences for patients and clinicians12 and lower mortality rates13. However, many hospitals still struggle to provide consistent, high-quality OUD care, resulting in negative patient experiences, delayed presentations, premature discharges, and high morbidity and mortality14. Patients frequently leave the hospital before seeing an addiction specialist, a factor linked to a tenfold increase in overdose rates15. While interprofessional addiction consultation services can increase medications for OUD and patient engagement, overall rates remain low16. Identifying and prioritizing these patients for specialized care continues to challenge health systems, leading to inconsistent treatment12.

With over 35 million hospitalizations annually in the United States, detection rates for OUD are lower than those for other substances and vary substantially across demographic groups17. Organizations such as the Society of Hospital Medicine recommend that all hospitalized patients in the United States with unhealthy opioid use be assessed for OUD7. Although validated tools are available (for example, Drug Abuse Screening Test (DAST), Substance Use Brief Screen (SUBS) and the Tobacco, Alcohol, Prescription medication, and other Substance use (TAPS) tool)18-20, manual screening is resource intensive, making it difficult to scale to all hospitalized adults. In the digital era, the integration of EHRs with AI enables a systematic, automated approach to screening patients at risk for OUD. Unlike traditional manual assessments, which depend on provider availability and workflow constraints, AI-driven solutions can operate without interruption and across all patient encounters within clinical systems, providing consistent, real-time risk identification at scale without human effort21. EHRs, which routinely capture detailed narratives of substance use, can be leveraged for automated AI-driven screening22,23.

We previously developed and validated an AI-driven screener designed to identify hospitalized patients at risk for OUD, defined as the use of opioids for non-prescribed or illicit purposes. The model was trained on a reference dataset derived from structured interviews conducted by clinical staff, who manually screened patients using the DAST across more than 50,000 hospitalizations. The training and validation datasets encompassed a comprehensive, hospital-wide cohort, reflecting real-world EHR data and the true prevalence of OUD in the hospital setting. Ground-truth labels were established based on elevated DAST scores in patients with self-reported opioid use, providing accurate identification of individuals with OUD. The model demonstrated excellent discrimination, calibration and fairness metrics across multiple sites24-26, paving the way for its integration into a real-time, EHR-embedded workflow27.

The primary outcome of this study was the proportion of adult hospitalizations that resulted in a completed addiction medicine consultation involving outpatient treatment referral, complicated withdrawal management, medication management for OUD or harm reduction services. During the post-intervention period, the AI screener served as the intervention, identifying patients at high risk for OUD and providing a recommendation for consultation with the addiction medicine service, along with prompting the initiation of the Clinical Opiate Withdrawal Scale and order set. Using a pre–post study design, we conducted a non-inferiority test to assess whether the addition of the AI screener could match the effectiveness of the pre-period’s provider-driven ad hoc consultation workflow, while providing a scalable and automated alternative to human-led processes. The secondary outcome included the rehospitalization rate between the periods, along with a cost-effectiveness analysis of the AI screener.

Results

AI screener implementation and optimization

The AI screener comprised the AI prediction model, its real-time integration into the EHR and the recommendation for consultation with the inpatient addiction medicine service and withdrawal management orders, delivered as a best practice alert (BPA) to any hospital provider upon opening the patient’s chart. Before the AI screener’s hospital-wide deployment, a hybrid effectiveness–implementation framework was applied. Interviews were conducted with advanced practice providers, residents and attending physicians from surgery, internal medicine and family medicine to identify potential barriers to using the AI screener. The interviews were guided by the Consolidated Framework for Implementation Research28, and, following the Expert Recommendations for Implementing Change29, barriers were addressed through additional educational initiatives, including newsletters and instructional flyers for care teams.

To further optimize utilization, two rapid Plan-Do-Study-Act (PDSA) cycles were conducted between December 2022 and February 2023. The first cycle aimed to reduce the latency of the BPA and minimize the need for ad hoc addiction consult orders. The AI screener workflow was updated to incorporate notes from the emergency department, allowing for an earlier opportunity to reach the BPA threshold for a positive screen.

The second PDSA cycle aimed to improve the information provided to addiction medicine specialists regarding the reason for consultation. Based on focus group feedback, the BPA was updated to preselect the Clinical Opiate Withdrawal Scale for cases where withdrawal symptoms were a potential concern, allowing for the simultaneous ordering of the consultation and the Clinical Opiate Withdrawal Scale and associated treatment order set. Over a 4-week test period, this update increased the number of cases in which the Clinical Opiate Withdrawal Scale assessment and addiction medicine consultation were ordered simultaneously (P < 0.02).

Following these cycles, interviews were conducted with providers who had interacted with the BPA to evaluate its acceptability and usability. Six respondents, comprising two hospitalist physicians, an internal medicine resident, an advanced practice provider, a family medicine physician and a surgical resident, reported that the BPA provided helpful recommendations without disrupting their workflow. Positive feedback emphasized the BPA’s effectiveness as a prompt for identifying at-risk patients. While some providers expressed concerns about alert fatigue, particularly in high-demand settings, the majority valued the screener’s capacity to highlight patients who might otherwise have been overlooked. Overall, the BPA was well received by users, and following these refinements, the AI screener was successfully deployed into production hospital-wide.

Patient characteristics during the study period

The inclusion criteria encompassed all adults hospitalized at the University of Wisconsin Hospital (UW Health). Patients were excluded if no clinical documentation was recorded in the EHR. The study included 51,760 adult hospitalizations, with 66% occurring during the pre-intervention period and 34% during the post-intervention period. The AI screener was deployed from 1 March 2023 to 31 October 2023, following two seasonally matched pre-intervention periods (1 March 2021–31 October 2021 and 1 March 2022–31 October 2022). A total of 727 addiction medicine consultations were completed during the entire study period (Fig. 1). No clinical adverse events related to the study were recorded from the first patient on 1 March 2021 through the last patient on 31 October 2023.

Fig. 1 ∣. Flow diagram of patient hospitalizations for pre-implementation and post-implementation periods of BPA in the EHR for screening unhealthy opioid use with associated interventions.


Post-implementation addiction medicine consults are shown regardless of the AI screener. Multiple hospitalizations per patient are possible. MOUD, medications for OUD. Each completed consultation could include: (1) opioid use assessment and brief behavioral intervention with motivational interviewing; (2) initiation, continuation or adjustment of MOUD; (3) harm reduction services including naloxone and fentanyl test strips; and/or (4) discharge planning with referral to community-based treatment.

Hospitalized adults in the pre-intervention period were more likely to have chronic conditions such as hypertension, diabetes and liver disease (P < 0.01) and exhibited higher Elixhauser comorbidity scores and longer hospital stays (P < 0.01). Additionally, a higher proportion of International Classification of Diseases (ICD)-10 drug misuse diagnosis codes were recorded during the pre-intervention period (P < 0.01). There were no differences in hospital readmission rates between the two periods when examining all hospitalized adults (P = 0.75; Table 1).

Table 1 ∣. Patient-level characteristics and demographics (unique patients and not encounter-level).

All hospitalized patients (n = 35,494)  Pre-period (n = 23,586)  Post-period (n = 11,908)  P value
Age, median (IQR) 62 (47–73) 62 (47–73) 63 (47–73) 0.122
Male sex, n (%) 18,360 (51.7) 12,229 (51.8) 6,131 (51.5) 0.519
Race + ethnicity, n (%)
 NH white 30,718 (86.5) 20,402 (86.5) 10,316 (86.6) 0.361
 NH Black 2,058 (5.8) 1,365 (5.8) 693 (5.8)
 Hispanic 1,086 (3.1) 724 (3.1) 362 (3.0)
 Mixed 270 (0.8) 167 (0.7) 103 (0.9)
 Other/unknown 1,362 (3.8) 928 (3.9) 434 (3.6)
Insurance, n (%)
 Medicare 15,199 (42.8) 9,691 (41.1) 5,508 (46.2) <0.001
 Medicaid 3,889 (11.0) 2,561 (10.9) 1,328 (11.1)
 Private 13,716 (38.6) 9,335 (39.6) 4,381 (36.8)
 Other 2,690 (7.6) 1,999 (8.5) 691 (5.8)
Comorbidities, n (%)
 Hypertension 2,983 (8.7) 2,090 (9.2) 893 (7.8) <0.001
 Renal failure 1,601 (4.7) 1,135 (5.0) 466 (4.0) <0.001
 Neurologic 1,522 (4.4) 1,079 (4.8) 443 (3.8) <0.001
 CHF 2,692 (7.9) 1,838 (8.1) 854 (7.4) 0.027
 Diabetes 3,493 (10.2) 2,474 (10.9) 1,019 (8.9) <0.001
 Liver disease 974 (2.8) 697 (3.1) 277 (2.4) <0.001
 Chronic lung disease 747 (2.2) 512 (2.3) 235 (2.0) 0.199
 Psychiatric disorders 507 (1.5) 354 (1.6) 153 (1.3) 0.095
 Depression 1,018 (3.0) 710 (3.1) 308 (2.7) 0.020
 Alcohol misuse 835 (2.4) 582 (2.6) 253 (2.2) 0.038
 Drug misuse 236 (0.7) 179 (0.8) 57 (0.5) 0.002
 AIDS 46 (0.1) 32 (0.1) 14 (0.1) 0.644
 Elixhauser score, mean (s.d.) 2.87 (4.97) 2.98 (5.10) 2.63 (4.68) <0.001
 Length of stay, mean (s.d.) 5.25 (7.62) 5.34 (8.18) 5.05 (6.35) <0.001
 Readmission to hospital, n (%) 4,931 (13.9) 3,267 (13.9) 1,664 (14.0) 0.753
Disposition, n (%)
 Home 29,180 (82.2) 19,258 (81.7) 9,922 (83.3) <0.001
 Death 1,084 (3.1) 770 (3.3) 314 (2.6)
 LT RC/ST PA 4,949 (13.9) 3,383 (14.3) 1,566 (13.2)
 AMA 235 (0.7) 142 (0.6) 93 (0.8)
 Other 46 (0.1) 33 (0.1) 13 (0.1)

Comparisons across all variables were significant with a P value < 0.01; AMA, against medical advice; AIDS, acquired immunodeficiency syndrome; LT RC/ST PA, long-term residential care or short-term post-acute care; CHF, congestive heart failure; NH, non-Hispanic; Mixed, Asian, Native American or Alaskan Native, Native Hawaiian or Other Pacific Islander, Other, or refuse/unknown. Statistical comparisons of baseline characteristics were conducted using the chi-squared test for proportions and the Mann–Whitney U test for non-normally distributed continuous or integer variables. A significance threshold of P < 0.05 was used.

In the subgroup of patients who received an addiction medicine consultation, demographic characteristics were similar, with few differences in comorbidities, but there was a lower proportion of unhealthy alcohol use and a higher proportion of cannabis use in the post-period (Table 2). In the first 72 h of admission, the median time to an addiction medicine consult order was longer in the post-period than pre-period (11.7 h versus 8.4 h, P < 0.001), but the overall median length of stay was similar (4.6 days versus 4.2 days, P = 0.79).

Table 2 ∣. Characteristics and outcomes for hospitalizations with an addiction medicine consultation for risk of OUD.

All patients (n = 727)  Pre-period (n = 460)  Post-period (n = 267)  P value
Age, median (IQR) 41 (32–55) 40 (32–54) 41 (32–56) 0.284
Male sex, n (%) 465 (64.0) 284 (61.7) 181 (67.8) 0.101
Race + ethnicity, n (%)
 NH white 589 (81.0) 385 (83.7) 204 (76.4) 0.295
 NH Black 97 (13.3) 53 (11.5) 44 (16.5)
 Hispanic 16 (2.2) 9 (2.0) 7 (2.6)
 Mixed 2 (0.3) 1 (0.2) 1 (0.4)
 Other/unknown 23 (3.2) 12 (2.6) 11 (4.1)
Insurance, n (%)
 Medicare 123 (16.9) 75 (16.3) 48 (18.0) 0.447
 Medicaid 388 (53.4) 241 (52.4) 147 (55.1)
 Private 171 (23.5) 117 (25.4) 54 (20.2)
 Other 45 (6.2) 27 (5.9) 18 (6.7)
Comorbidities, n (%)
 Hypertension 40 (5.5) 33 (7.2) 7 (2.6) 0.009
 Renal failure 6 (0.8) 4 (0.9) 2 (0.8) 0.863
 Neurologic 46 (6.3) 36 (7.8) 10 (3.8) 0.029
 CHF 47 (6.5) 25 (5.4) 22 (8.2) 0.138
 Diabetes 31 (4.3) 18 (3.9) 13 (4.9) 0.539
 Liver disease 63 (8.7) 49 (10.7) 14 (5.2) 0.012
 Chronic lung disease 28 (3.9) 17 (3.7) 11 (4.1) 0.774
 Psychiatric disorders 38 (5.2) 19 (4.1) 19 (7.1) 0.081
 Depression 52 (7.2) 29 (6.3) 23 (8.6) 0.244
 Alcohol misuse 207 (28.5) 147 (32.0) 60 (22.5) 0.006
 Drug misuse 149 (20.5) 102 (22.2) 47 (17.6) 0.141
 AIDS 3 (0.4) 2 (0.4) 1 (0.4) 0.903
Elixhauser score, mean (s.d.) 1.1 (5.4) 1.4 (5.6) 0.8 (4.9) 0.124
Urine drug screen, n (%)
 Amphetamines 0 (0.0) 0 (0.0) 0 (0.0) 0.999
 Barbiturates 6 (0.8) 4 (0.9) 2 (0.8) 0.863
 Benzodiazepines 1 (0.1) 1 (0.2) 0 (0.0) 0.446
 Cocaine 174 (23.9) 103 (22.4) 71 (26.6) 0.201
 Cannabinoid 109 (15.0) 51 (11.1) 58 (21.7) <0.001
 Opioids 152 (20.9) 93 (20.2) 59 (22.1) 0.548
Intervention treatment, n (%)
 Buprenorphine 237 (32.6) 157 (34.1) 80 (30.0) 0.248
 Naltrexone 105 (14.4) 77 (16.7) 28 (10.5) 0.021
 Methadone 100 (13.8) 67 (14.6) 33 (12.4) 0.405
 NA 342 (47.0) 200 (43.5) 142 (53.2) 0.011
Hours to consult order, median (IQR) 9.2 (5.4–21.1) 8.4 (4.9–19.7) 11.7 (6.7–24.9) <0.001
Length of stay, days, median (IQR) 4.4 (2.9–8.1) 4.2 (2.8–8.0) 4.6 (2.9–8.2) 0.785
30-day readmission to same hospital, n (%) 84 (11.6) 63 (13.7) 21 (7.9) 0.018
Any 30-day post-discharge hospital/ED, n (%) 246 (33.8) 171 (37.2) 75 (28.1) 0.013
Disposition, n (%)
 Home 610 (83.9) 391 (85.0) 219 (82.0) 0.727
 Death 3 (0.4) 2 (0.4) 1 (0.4)
 LT RC/ST PA 54 (7.4) 33 (7.2) 21 (7.9)
 AMA 56 (7.7) 31 (6.7) 25 (9.4)
 Other 4 (0.6) 3 (0.7) 1 (0.4)

ED, emergency department; NA, not applicable. Statistical comparisons of baseline characteristics were conducted using the chi-squared test for proportions and the Mann–Whitney U test for non-normally distributed continuous or integer variables. A significance threshold of P < 0.05 was used.

AI screener’s performance and BPA utilization

As care teams entered EHR notes, the AI screener dynamically recalculated the model’s score in real time, integrating all available clinical documentation up to that moment. The BPA was triggered whenever a provider accessed the patient’s chart, contingent upon the AI model’s score exceeding the predefined threshold. The BPA was automatically deactivated if the provider marked it as inappropriate in the EHR or if an addiction medicine consult order was placed. A total of 4,328 BPAs were triggered by the AI model across 157 hospitalizations, with a median of 10 BPAs per hospitalization (interquartile range (IQR) 4–23). Among these 157 hospitalizations, 21.7% (n = 34) directly resulted in an addiction medicine consultation, as documented by the ordering provider in the EHR. The BPA-attributed consultations accounted for 12.7% (n = 34 of 267) of all addiction medicine consultations in the post-intervention period. The BPA may also have indirectly influenced provider decision-making, leading to consultations that were not explicitly documented in the EHR as attributable to the BPA. The majority of BPAs (90.6%, n = 3,789) were dismissed without a documented reason. The remaining dismissals were attributed to perceived inappropriateness (2.7%, n = 102), deferred action (2.6%, n = 98), canceled consult orders (2.6%, n = 97), involvement of a non-primary team (1.0%, n = 38) and other reasons (0.5%, n = 20). The Clinical Opiate Withdrawal Scale and order set for withdrawal management were ordered in 108 hospitalizations, with 29 cases (26.9%) directly triggered by the BPA. The finalized version of the BPA is shown in Fig. 2. An example of an EHR note, along with the feature importance scores for the most important medical concepts identified by the AI screener, is shown in Fig. 3.

Fig. 2 ∣. Image of BPA in the EHR for recommending addiction medicine consult order and Clinical Opiate Withdrawal Scale.


The finalized BPA includes recommendations for both an addiction medicine consultation and the use of the Clinical Opiate Withdrawal Scale (COWS). The versions presented here were deployed into production and fully embedded within the Epic EHR system. The alert is triggered only when the AI model’s screening threshold is met. This final version incorporates user feedback gathered during the initial PDSA cycles of the hybrid effectiveness–implementation phase, ensuring alignment with clinical workflow and provider needs.

Fig. 3 ∣. Single-note illustration of AI screener.


A visualization example demonstrates the individual-level clinical utility of the extracted medical concepts as concept unique identifiers (CUIs) mapped from the note to the Unified Medical Language System Metathesaurus at the National Library of Medicine. Integrated gradients assign an importance score to each input CUI by approximating the integral of the gradients of the AI model’s output with respect to its inputs. This method compares the model’s prediction on the actual input with a baseline generic note containing random CUIs, which represents a neutral state. The gradients are calculated at several points along the path from the baseline to the actual input, and the integral of these gradients gives an attribution score for each CUI, indicating its contribution to the final prediction. Warmer background colors indicate an increased likelihood of unhealthy opioid use, while cooler colors reflect a decreased probability. The final predicted probability was 0.6753, above the threshold for a positive screen, whereas a baseline note with generic tokens for medical concepts had an expectedly lower probability of 0.0693. Approximating the integral introduces some error, which depends on the number of steps used in the integration; more steps typically reduce this error but increase computational cost. In our case, the CUIs with higher attribution scores strongly influenced the predicted probability of unhealthy opioid use, highlighting their clinical relevance.
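The attribution method described above can be sketched numerically. The following is a minimal, self-contained illustration rather than the production model: a toy logistic scorer stands in for the note-level convolutional model, binary CUI indicator features stand in for the extracted concepts, and the gradients are taken numerically; all weights and feature values are hypothetical.

```python
import numpy as np

def integrated_gradients(f, x, baseline, steps=200):
    """Approximate integrated gradients of a scalar model f at input x.

    attribution_i = (x_i - baseline_i) * integral over a in [0, 1] of
    df/dx_i(baseline + a * (x - baseline)), approximated with a midpoint
    Riemann sum over `steps` points and central-difference gradients.
    """
    alphas = (np.arange(steps) + 0.5) / steps   # midpoints in (0, 1)
    total_grad = np.zeros_like(x)
    eps = 1e-6
    for a in alphas:
        point = baseline + a * (x - baseline)
        grad = np.zeros_like(x)
        for i in range(x.size):
            d = np.zeros_like(x)
            d[i] = eps
            grad[i] = (f(point + d) - f(point - d)) / (2 * eps)
        total_grad += grad
    return (x - baseline) * total_grad / steps

# Toy stand-in for the note-level model: logistic score over CUI indicators.
w = np.array([2.5, 1.8, -0.7, 0.1])                 # hypothetical per-CUI weights
f = lambda v: 1.0 / (1.0 + np.exp(-(v @ w - 2.0)))  # sigmoid with a bias term

x = np.array([1.0, 1.0, 1.0, 0.0])                  # CUIs present in the note
baseline = np.zeros(4)                              # generic, neutral baseline note

attr = integrated_gradients(f, x, baseline)
# Completeness axiom: attributions sum to f(x) - f(baseline)
print(attr, attr.sum(), f(x) - f(baseline))
```

The completeness property shown in the final line is what makes the per-CUI scores interpretable as shares of the overall shift in predicted probability from the neutral baseline to the actual note.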

Primary outcome: completion of addiction medicine consult with intervention

For the primary outcome, only consult orders that led to a full addiction medicine consultation visit were included. Each completed consultation was identified by an authored consultation note in the EHR from an addiction medicine specialist and an opioid-related ICD-10 code associated with the service provided. This approach was applied to both the pre-intervention and post-intervention periods for direct comparison. In the post-intervention period, an integrated daily workbench report was implemented within the EHR to allow research staff to track all hospitalizations and verify that consultations resulted in a documented intervention, such as medication for OUD, outpatient treatment referral, complicated withdrawal management or harm reduction services. This additional auditing process confirmed that interventions were completed and represented accurately in the EHR.

During the post-intervention period, 1.51% of hospitalized adults received an addiction medicine consultation compared to 1.35% in the pre-intervention period (z = −1.49, P < 0.001 for non-inferiority). The adjusted odds ratio (aOR) for receiving an opioid-related addiction medicine consult after intervention was 1.09 (95% confidence interval (CI): 0.93–1.28), indicating comparable odds between the two periods after adjusting for age, sex, race/ethnicity, insurance status and comorbidity score. Similarly, 0.71% of patients received medication for OUD after intervention, compared to 0.76% before intervention (z = 0.69, P < 0.001 for non-inferiority), with an aOR of 0.87 (95% CI: 0.69–1.09). All adjusted variables and results are shown in Supplementary Tables 1 and 2.
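For illustration, the non-inferiority comparison of consultation proportions can be sketched as a one-sided two-proportion z-test. The denominators below are back-calculated from the reported 66%/34% split of 51,760 hospitalizations, and the 0.5 percentage-point margin is a hypothetical choice (the margin is not stated in this excerpt), so the z value will not reproduce the reported −1.49 exactly.

```python
from math import sqrt
from statistics import NormalDist

# Completed addiction medicine consultations per hospitalization
x_pre, n_pre = 460, 34_162      # pre-period (66% of 51,760 hospitalizations)
x_post, n_post = 267, 17_598    # post-period (34% of 51,760 hospitalizations)

p_pre, p_post = x_pre / n_pre, x_post / n_post   # ~1.35% and ~1.52%

# One-sided non-inferiority test: H0: p_post - p_pre <= -margin
margin = 0.005                  # hypothetical 0.5 percentage-point margin
se = sqrt(p_pre * (1 - p_pre) / n_pre + p_post * (1 - p_post) / n_post)
z = (p_post - p_pre + margin) / se
p_value = 1 - NormalDist().cdf(z)   # small p rejects H0, i.e. non-inferior

print(round(p_pre * 100, 2), round(p_post * 100, 2), round(z, 2), p_value)
```

Because the observed post-period proportion was actually slightly higher than the pre-period proportion, any reasonable margin yields a highly significant non-inferiority result, consistent with the reported P < 0.001.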

Secondary outcome: rehospitalization rates

The AI screener’s implementation was associated with a reduction in 30-day readmission rates among patients who received an addiction medicine consultation, with an aOR of 0.53 (95% CI: 0.30–0.91, P = 0.02). This mixed-effects analysis included a random intercept for repeat hospitalizations by the same patient and adjusted for age, sex, race/ethnicity, insurance status and comorbidity score. Additionally, there was a reduced aOR of 0.67 (95% CI: 0.47–0.94, P = 0.02) for any 30-day post-discharge hospitalization or emergency department visit. In a sensitivity analysis limited to a single hospitalization per patient, a reduction in 30-day readmission rate was still observed among patients who received an addiction medicine consultation, with an aOR of 0.36 (95% CI: 0.24–0.88, P = 0.02). There was also a reduction in any 30-day post-discharge hospital or emergency department visit, with an aOR of 0.60 (95% CI: 0.47–0.87, P = 0.01). Importantly, there was no change in the overall 30-day readmission rate when examining all adult hospitalizations during the same study period (aOR 1.00, 95% CI: 0.94–1.07, P = 0.99). All adjusted variables and results are shown in Supplementary Tables 3-5.
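Because a Wald 95% confidence interval for an odds ratio is symmetric on the log scale, the z-statistic and p-value can be recovered from the reported aOR and CI alone. A small sketch using the 30-day readmission result (aOR 0.53, 95% CI 0.30–0.91):

```python
from math import log
from statistics import NormalDist

def z_and_p_from_or(or_hat, lo, hi):
    """Recover the Wald z-statistic and two-sided p-value from an odds
    ratio and its 95% CI, which is symmetric on the log scale."""
    se = (log(hi) - log(lo)) / (2 * 1.96)   # half-width of the log-scale CI
    z = log(or_hat) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# 30-day readmission among consulted patients: aOR 0.53 (95% CI 0.30-0.91)
z, p = z_and_p_from_or(0.53, 0.30, 0.91)
print(round(z, 2), round(p, 2))   # p rounds to the reported 0.02
```

This back-calculation is a quick consistency check on reported effect estimates; it assumes a symmetric Wald interval, which holds for standard mixed-effects logistic regression output.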

Cost-effectiveness analysis

The cost-effectiveness analysis estimated the incremental costs of the AI screener during the 8 months following its implementation (1 March 2023–31 October 2023) compared to the corresponding 8-month period in the 2 years prior. This analysis evaluated the incremental costs in relation to the intervention’s effectiveness in achieving both primary and secondary outcomes.

The development and implementation of the AI screener incurred several costs. Personnel expenses for building the AI model into the EHR included one principal analytics consultant, one senior analytics consultant, one data scientist, two machine learning engineers and one data science and machine learning architect. The EHR build began in October 2020 and continued until completion in January 2023. On average, the AI screener development required approximately 0.65 full-time equivalent personnel over 28 months, with total personnel salary and benefits amounting to US$234,300. The cost for storage and computing equipment during the development phase was estimated at US$109,800. Additionally, training hospital providers and addiction medicine specialists on the use of the AI screener incurred a total training cost of US$11,600. In total, the development and implementation expenses reached US$355,700. The resource costs used for developing and implementing the AI screener are detailed in Supplementary Table 6.

The post-intervention incremental costs included ongoing storage and computation expenses, support for natural language processing (NLP) and machine learning components, staff time associated with screening, counseling initiated by the screener, and incremental costs for medications for OUD. The estimated incremental personnel costs for supporting the AI screener during the post-intervention period were US$101,400, with additional storage and computing costs of US$2,800. Hospital and counseling staff time, as well as medication expenses, contributed another US$1,900, bringing the total incremental costs to US$106,100 for the post-intervention period. Given that 267 patients received an addiction medicine consultation during this period, the incremental cost per patient was calculated at US$397.

The AI screener’s effectiveness in reducing 30-day readmission was notable, with a 5.8 percentage point difference between pre-intervention and post-intervention rates of 30-day readmission (13.7% and 7.9%, respectively). This equated to an estimated reduction of 15.6 readmissions (95% CI: 2.1–27.1) in the post-intervention period compared to the pre-intervention baseline. Consequently, the incremental cost-effectiveness ratio was determined to be US$6,801 (95% CI: US$3,915–50,524) for each 30-day readmission avoided, indicating an economic benefit of the AI screener in lowering rehospitalization rates.
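The cost-effectiveness arithmetic above can be reproduced directly from the reported figures; the small difference from the published US$6,801 reflects rounding of the readmissions-avoided estimate to 15.6 before dividing.

```python
# Incremental cost-effectiveness for the 8-month post-intervention period
incremental_cost = 106_100              # US$ of ongoing post-intervention costs
cost_per_patient = incremental_cost / 267
print(round(cost_per_patient))          # ~US$397 per consulted patient

# Readmissions avoided: apply the pre-period 30-day readmission rate
# (63/460 = 13.7%) to the 267 consulted post-period patients, then
# subtract the 21 readmissions actually observed
avoided = 267 * 63 / 460 - 21           # ~15.6 readmissions avoided
icer = incremental_cost / avoided
print(round(avoided, 1), round(icer))   # ~US$6,800 per readmission avoided
```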

Discussion

We evaluated a fully EHR-embedded AI screener designed to identify hospitalized adults at risk for OUD. Our findings demonstrated that the AI screener was non-inferior to traditional ad hoc addiction medicine consultations, was cost-effective and was associated with a reduction in 30-day readmissions. Unlike many predictive models that fail to progress beyond the development stage due to poor study design, limited training data or lack of real-world validation, this study successfully deployed an EHR-integrated clinical decision support tool and assessed its impact on healthcare delivery outcomes. By integrating AI-driven workflows directly into clinical practice, this study demonstrates that automated approaches can achieve effectiveness comparable to standard-of-care, human-led workflows while providing scalability. This is particularly beneficial for healthcare systems facing staffing shortages, as AI can provide continuous screening beyond regular business hours and alleviate the burden on limited personnel. Additionally, during periods of high demand, such as the coronavirus disease 2019 pandemic, AI-driven automation ensures screening remains available despite workforce constraints and competing clinical priorities.

Many AI innovations fail to advance beyond the prototype stage due to challenges in user-centered design, interoperability, workflow integration and organizational readiness30. Unlike prior studies conducted in controlled environments31, our approach integrated the AI screener within the clinical workflow of a large health system, capturing the complexities of real-world implementation. This study provides a pragmatic evaluation of the tool’s usability and effectiveness in a hospital setting, considering logistical challenges such as staff availability, patient willingness and alignment with clinical workflows. By addressing these barriers, our findings highlight the feasibility and potential impact of AI-driven interventions in tackling public health crises like the opioid epidemic.

The quasi-experimental pre–post design was chosen due to operational constraints and directives prioritizing a hospital-wide rollout to address gaps in OUD screening, guided by the health system’s AI governance committee32. The pragmatic approach aligns with a hybrid effectiveness–implementation framework33, which integrates implementation strategies with clinical outcomes assessment to test interventions within their intended context and influence of situational factors34. While this design introduces limitations, such as temporal confounding and heightened provider awareness, it demonstrates the relevance of real-world methodologies in rapidly evaluating AI systems at scale.

The 1.5% prevalence of OUD in the study population aligned with the prevalence observed in an independent health system used for training and testing the AI model26, as well as with epidemiologic studies. An analysis of the National Inpatient Sample estimated that 741,275 hospitalized Americans with OUD accounted for 2.0% of the 35.7 million national discharges among adults aged 18 years or older35. Another study found that 1.2% of all adult emergency department visits had opioid-related diagnoses2. Despite the low prevalence, the AI screener previously demonstrated strong test characteristics26; however, practical deployment revealed challenges beyond predictive accuracy. Some BPAs were overlooked when consult orders were placed before the AI screener triggered or when patients were discharged before the BPA could prompt action. These real-world complexities illustrate the need for AI systems to adapt to dynamic clinical workflows and to move past prediction model studies that cannot address the effectiveness of AI-driven clinical decision support tools36,37. This study surfaces some of the inherent complexities of clinical workflows that can influence the effectiveness of an addiction medicine consult service initiated through AI38,39.

The hospital-wide implementation likely yielded broader effects beyond the direct outcomes of consultations and medications prescribed. Important workflow changes included the automation of order sets for monitoring and managing opioid withdrawal, triggered by the AI screener, and frequent BPAs that increased awareness of addiction medicine services. Unlike manual screening or ad hoc consultations, the AI screener repeatedly alerted providers until action was taken, with a median of ten alerts per hospitalization. Providers explicitly cited the AI screener as the reason for placing consult orders in approximately 13% of all addiction medicine consultations. It is also possible that other consultations were indirectly influenced by the screener, although these associations were not explicitly documented or easily discernible from the EHR logs. Although BPAs were frequent and often dismissed, distributing the alerts across multiple providers reduced alert fatigue while maintaining a focus on high-risk patients. During PDSA cycles and follow-up interviews, providers acknowledged the high frequency of alerts but valued the reminders for identifying patients at risk of OUD. These alerts were distributed across all providers accessing the patient’s chart, mitigating the burden on any single provider’s workflow and promoting team-based care40.

The longer median time to consult in the post-period likely reflects the persistent BPAs, which kept OUD-related needs visible throughout hospitalization. In contrast, pre-period consultations typically occurred once and earlier in the hospitalization. The BPAs served as reminders and prioritized addiction medicine services at later time points, closer to discharge, when post-discharge planning with referrals occurs, potentially contributing to reduced readmissions. The AI screener may have also contributed to the shift toward higher odds of Medicaid patients receiving consultations in the post-period, addressing potential disparities in care access. Although the aOR for medication for OUD was slightly lower in the post-intervention period (aOR 0.87), this does not indicate a decline in the effectiveness of the AI screener. The primary outcome was not limited to medication management for OUD but encompassed completed consultations that led to clinically meaningful interventions, including outpatient treatment referrals, acute withdrawal management and harm reduction strategies. The proportion of patients receiving addiction medicine consultations was non-inferior in the post-intervention period (aOR 1.09), demonstrating that the AI screener was at least as effective as usual care in supporting providers in utilizing addiction medicine services. By identifying at-risk patients systematically, the AI screener may have facilitated interventions addressing both acute withdrawal symptoms and long-term treatment planning. This broader approach may also have contributed to the reduction in 30-day readmissions, as timely addiction-related care can help stabilize patients and improve care transitions. Additionally, the AI screener likely reinforced awareness among non-addiction providers, who may be less familiar with the full spectrum of addiction treatment options beyond medications for OUD.

Recent evidence demonstrates that initiating medications for OUD during hospitalization and incorporating addiction medicine consultation teams can improve post-hospital health outcomes6,13,41,42. Addiction medicine specialists provide comprehensive care during hospitalization, including harm reduction strategies, motivational interviewing, withdrawal management, discharge planning and linkage to outpatient services. Their holistic approach addresses not only medical needs but also social determinants of health. The integration of the AI screener with consultations as dual interventions complicates the ability to disentangle their individual contributions to the observed reduction in 30-day inpatient readmissions as well as all rehospitalizations (emergency department, observation stay and hospital inpatient). For instance, over 25% of Clinical Opiate Withdrawal Scale order sets were directly attributable to the AI screener’s BPA, an additional intervention coupled with the recommendation for a consultation with the addiction medicine service. No other major systemic changes occurred during the post-intervention period that could explain the observed reduction in readmissions. Additionally, analyses of all 51,760 hospitalizations across the pre-intervention and post-intervention periods showed no significant change in overall readmission rates, further supporting the inference that the AI screener contributed to the observed change in readmissions among the patients who received an addiction medicine consultation.

The cost-effectiveness analysis revealed an incremental cost of approximately US$6,800 per readmission avoided, with an estimated reduction of approximately 16 readmissions in the post-period, a substantial saving considering that the average cost of a 30-day readmission is estimated at US$16,300 (ref. 43). Patients with OUD typically have longer hospital stays, higher comorbidity burdens and more disadvantaged payor mixes, making the cost of care for these patients higher than average. The AI screener offers an opportunity to identify and intervene with these high-risk individuals, who are disproportionately affected by the overdose crisis44-46. This study adds to the growing evidence of the cost-effectiveness of addiction medicine consult services47,48. With total start-up costs of approximately US$350,000 for implementing the BPA in a real-time AI workflow, such AI tools could prove economically viable over time.

Several limitations of this study should be acknowledged. First, the study was conducted within a single health system using a specific EHR platform, which may limit its generalizability to other systems with differing addiction services, workflows or patient populations. However, the AI screener ensured that all providers, regardless of their familiarity with the hospital’s longstanding inpatient addiction medicine team, were reminded of the availability of these services.

Second, the AI screener and the resulting addiction medicine consultations were interdependent, making it challenging to isolate the impact of the screener itself from the subsequent consults on reducing 30-day readmissions. Because the AI screener was fully integrated into the hospital’s EHR system, provider behavior may have been influenced by its presence even when consultations were initiated without explicitly selecting the BPA as the reason. This limits our ability to definitively separate traditional ad hoc consults from those indirectly influenced by the addition of the AI screener. Future studies could explore provider decision-making patterns more explicitly to assess how AI-driven screening affects clinical workflows beyond direct BPA utilization.

While seasonality matching was used to mitigate temporal confounding, the dynamic and evolving nature of the opioid epidemic, marked by multiple waves and shifting substance use patterns, may have introduced residual biases. The pre–post study design, while pragmatic and suitable for hospital-wide implementation, inherently limits the ability to fully account for temporal trends that may have influenced the observed outcomes. Although key variables were adjusted for in the analysis, unmeasured confounders could still have contributed to the findings.

The high rate of BPA dismissals reflects the diverse roles of providers accessing the patient’s chart. Alerts were often dismissed by specialists whose scope of practice did not directly involve addiction-related concerns, highlighting a potential misalignment between alert delivery and provider responsibilities. While the persistent alerting approach consistently flagged at-risk patients, it underscores the need for more targeted notification systems that prioritize delivery to the most relevant care team members. Excessive or misdirected alerts have been shown in prior studies to reduce the salience of actionable notifications, contributing to alert fatigue across healthcare institutions49,50.

Finally, the cost-effectiveness analysis in this study was limited to short-term outcomes from a health system perspective. It did not capture the potential long-term economic benefits of AI-driven addiction screening interventions, such as reductions in healthcare utilization, improved patient outcomes and societal cost savings. Future research should adopt a lifetime perspective to evaluate the broader impact of these interventions and inform scalable implementation strategies.

In conclusion, our findings demonstrate the feasibility of deploying AI-driven interventions in real-world healthcare settings, offering a replicable model for hospital-wide screening programs. While the outcomes reflect the inherent challenges of pragmatic implementation, they highlight the transformative potential of AI to augment care delivery, particularly for vulnerable populations. By implementing a hospital-wide, AI-driven screening tool, every hospitalized adult is systematically evaluated, ensuring that no patient is overlooked due to provider awareness or prioritization differences. The screener standardizes risk identification and provides a consistent nudge to providers when a patient meets the screen threshold, reducing the variability associated with ad hoc decision-making. While we do not explicitly analyze fairness metrics in this study, the fundamental shift from provider-driven to AI-driven screening inherently reduces reliance on subjective decision-making. This study highlights the potential of an EHR-embedded AI workflow to support hospital-wide screening efforts and offers valuable insights for health systems seeking to leverage AI to optimize care delivery. Moreover, the scalable and cost-effective AI solution addresses the complexities of translating predictive models into actionable clinical decision support tools, bridging the gap between innovation and real-world impact. This study lays the groundwork for broader adoption of EHR-embedded AI screeners for OUD. Future developments include multicenter trials with analytic designs to isolate AI’s impact on long-term outcomes and cost-effectiveness.

Methods

Hospital setting and study period

The AI screener was implemented in the EHR (Epic Systems Corp, 2023) at the UW Health University Hospital across all adult inpatient wards. The study evaluated the screener’s effectiveness in hospitalized adults (aged ≥18 years) using a pre–post quasi-experimental design with a hybrid effectiveness–implementation framework33. This design compared seasonally matched pre-intervention periods (1 March 2021–31 October 2021 and 1 March 2022–31 October 2022) with a post-intervention period (1 March 2023–31 October 2023). The patient flowchart is shown in Fig. 1. This study was reviewed by the UW Institutional Review Board for ethics approval and registered on ClinicalTrials.gov (NCT05745480). As part of a broader quality improvement initiative at UW Health, it qualified for exemption from human participant research. Classified as secondary research using existing EHR data, the study met category 4 exemption criteria (45 CFR 46.104(d)(4)) and was granted a waiver of consent. The full study protocol was published previously27 and is included in the Supplementary Information. This study adhered to the CONSORT-AI guidelines51 for reporting the AI-driven intervention. The completed checklist is available in the Supplementary Information.

Pre-intervention period: usual care with ad hoc addiction consultations

The addiction medicine inpatient consult service at UW Hospital was established in 1991. Among substances, only alcohol had a formal screening program, which used the Alcohol Use Disorders Identification Test–Concise (AUDIT-C)52. During the pre-intervention period, consultations for patients with OUD were initiated at the discretion of the primary provider, with orders placed in the EHR. The addiction medicine consult team, consisting of three addiction counselors and four physicians, conducted the consultations. Each completed consultation could include: (1) opioid use assessment and brief behavioral intervention; (2) initiation, continuation or adjustment of medication for OUD; (3) harm reduction services; and/or (4) referral to community-based treatment. Given the absence of a formal screening program for OUD, the health system’s Center for Clinical Knowledge Management and the Clinical AI and Predictive Analytics Committees reviewed the model and its performance to evaluate its suitability for system-wide implementation. Approval for the post-intervention period was granted after confirming that no competing protocols were in place and that internal testing demonstrated adequate screening performance.

Post-intervention period: AI screener as a clinical decision support tool embedded in the EHR

To integrate the AI model into a screening tool as a BPA, the Applied Data Science team at UW Health developed a comprehensive Development and Operations (DevOps) framework. The system was previously described in detail with pipeline components, technical infrastructure and shared pseudocode27. Briefly, Health Level 7 standards were used to exchange patient notes from the EHR into the UW Health cloud computing environment. Data were stored in the HIPAA-secure Microsoft Azure cloud data lake and processed using an NLP engine to extract key features as coded medical concepts from EHR notes, utilizing the Unified Medical Language System from the National Library of Medicine. The medical concepts were input into a trained convolutional neural network (CNN) model, which had a vocabulary of 37,317 medical concepts and 12.5 million parameters. The trained model is available at git.doit.wisc.edu/smph-public/dom/uw-icu-data-science-lab-public/smart-ai/, and the model’s development, internal validation, and external validation were previously described26. In the prior work, multiple model architectures were evaluated, including logistic regression, feed-forward neural networks, CNNs, deep averaging networks, and transformer-based neural networks. Larger models, including architectures with parameter sizes reaching 746 million and incorporating auxiliary task labeling with ICD-10 codes for opioid misuse, did not demonstrate improved performance over the 12.5 million-parameter CNN model26. Following the emergence of large language models (LLMs), additional testing was conducted using open-source LLMs on UW Health data; however, no appreciable gain in test characteristics was observed to justify replacing the existing CNN model. A key advantage of our approach is the use of concept mapping to semantically similar terms via the Unified Medical Language System Metathesaurus, updated on a 1–2-year cycle. This allows the model’s vocabulary to remain current with evolving language on OUD. Furthermore, by openly providing the trained model and its weights, we allow other institutions to adapt the model to their data, extending its utility across diverse healthcare settings.
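As a concrete illustration of this input representation, the mapping from extracted medical concepts to the padded token sequences a concept-based classifier consumes can be sketched as follows. This is a minimal sketch only; the vocabulary entries, concept identifiers and function name are hypothetical examples, not the deployed pipeline:

```python
# Illustrative sketch of the input representation for a concept-based
# classifier: extracted concepts (e.g., UMLS concept unique identifiers)
# are mapped to integer token IDs and padded to a fixed length.
# The vocabulary and CUIs below are hypothetical examples.

def encode_concepts(concepts, vocab, max_len=16, pad_id=0, unk_id=1):
    """Map concept strings to token IDs, then truncate/pad to max_len."""
    ids = [vocab.get(c, unk_id) for c in concepts]
    ids = ids[:max_len]
    return ids + [pad_id] * (max_len - len(ids))

# Toy vocabulary (IDs 0 and 1 are reserved for padding and unknown concepts)
vocab = {"C0524662": 2, "C0002766": 3, "C0030193": 4}

encoded = encode_concepts(["C0524662", "C9999999", "C0030193"], vocab)
```

In deployment the vocabulary would hold the model's 37,317 concepts and the sequence length would match the CNN's expected input; the padding token also serves as the neutral baseline for the attribution analysis described below.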

The AI model was originally trained on data from the first 24 h of an admission, allowing enough time during an average hospital length of stay to still perform the consultation and necessary related interventions. The AI screener recalculated the risk score continuously as new notes were added to the patient’s EHR. If a patient’s updated risk score exceeded the predetermined threshold at any point during the first 24 h of admission, the BPA would trigger. Importantly, once a BPA was triggered, it persisted throughout the hospital stay until a consultation was either ordered or the provider dismissed the alert. Therefore, even if the patient’s risk score fluctuated below the threshold in subsequent updates, the BPA did not automatically turn off. This ensured that once a positive screen was detected, the alert remained active. The AI screener’s BPA was designed to trigger every time a patient’s record was opened by any provider involved in that patient’s care, including not just the primary team but also consultants and other healthcare providers who accessed the patient’s chart. This approach was based on our implementation strategy during our PDSA cycles, which aimed to maximize the opportunity for interventions during the average length of hospital stay for an adult53. If a consultation had been ordered before the BPA triggered, the alert was automatically deactivated to prevent unnecessary alerts.
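The trigger-and-persist alert logic described above can be summarized as a small state machine. The sketch below is an illustrative model of that logic only; the class and method names are our own and do not reflect the production EHR build:

```python
# Illustrative state machine for the trigger-and-persist BPA logic;
# class and method names are hypothetical, not the production EHR build.

class BpaAlert:
    """Fires if the risk score crosses the threshold within the first 24 h,
    then stays active until a consult is ordered or the alert is dismissed,
    even if later scores fall back below the threshold."""

    def __init__(self, threshold=0.05):
        self.threshold = threshold
        self.active = False
        self.resolved = False

    def update_score(self, score, hours_since_admit):
        # Trigger only during the first 24 h; once active, stay active
        if not self.resolved and not self.active:
            if hours_since_admit <= 24 and score >= self.threshold:
                self.active = True
        return self.active

    def order_consult(self):
        self.active = False
        self.resolved = True

    def dismiss(self):
        self.active = False
        self.resolved = True
```

Keeping the trigger window and the resolution state separate captures the two behaviors described in the text: scores that dip below threshold do not silence an active alert, and an ordered consult (or dismissal) silences it permanently.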

The BPA triggered when the model’s risk score exceeded a predefined threshold of 0.05 at the time the provider opened the patient’s chart. This threshold was established following model development and training on a reference dataset from a hospital-wide screening program in Illinois, which used structured diagnostic interviews to manually assess admitted patients over a 27-month period, covering 54,915 adult hospitalizations. Temporal validation was conducted on a subsequent 12-month cohort of 16,917 hospitalized adults, while external validation was performed using data from an independent health system comprising 1,991 adult hospitalizations26. At the 0.05 threshold, the model demonstrated a sensitivity of 0.87 (95% CI 0.84–0.90) and a positive predictive value of 0.76 (95% CI 0.72–0.88). The number needed to evaluate (1/positive predictive value) was 1.4, translating to approximately 26 addiction medicine consult recommendations per 1,000 hospitalized patients, a volume deemed manageable for the UW addiction medicine providers. To examine model stability before live deployment, additional silent testing was conducted at UW Health in 2023, confirming no major drift in test characteristics27.

The final AI screener was designed to balance the early identification of at-risk patients with the clinical team’s capacity to respond effectively. The BPA recommended an addiction medicine consultation and preselected the Clinical Opiate Withdrawal Scale and order set (Fig. 2) to support withdrawal management. This fully integrated clinical decision support system, encompassing the AI-driven prediction model, DevOps integration, and EHR-embedded BPA, is collectively referred to as the ‘AI Screener’.

The AI model’s predictions were assessed using the integrated gradients (IGs) method to understand feature contributions from the medical concepts54. IGs operate by comparing the model’s prediction on a given input with a baseline input that represents the absence of features (neutral state). This method calculates the gradients of the model’s output relative to the input, tracing a path from the baseline to the actual input. By integrating these gradients, attribution scores were derived to highlight the contribution of each text feature to the final prediction. For this analysis, padded tokens were used as the baseline input for medical concepts. The analytics team examined the IGs post hoc to monitor the model during deployment. An example note displaying medical concepts and their attribution scores is shown in Fig. 3.
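For readers unfamiliar with the method, the IG computation can be sketched numerically for a toy logistic model (not the deployed CNN). The standard sanity check is the completeness property: the attributions sum to the difference between the model’s outputs at the input and at the baseline. The inputs, weights and step count below are arbitrary illustrative values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def integrated_gradients(x, baseline, w, steps=200):
    """Midpoint Riemann-sum approximation of IG for f(x) = sigmoid(w @ x).

    For this toy model the gradient is analytic:
    df/dx_i = sigmoid(z) * (1 - sigmoid(z)) * w_i, with z = w @ x.
    """
    alphas = (np.arange(steps) + 0.5) / steps  # midpoints on the path
    grad_sum = np.zeros_like(x)
    for a in alphas:
        point = baseline + a * (x - baseline)  # straight path from baseline
        s = sigmoid(w @ point)
        grad_sum += s * (1.0 - s) * w
    return (x - baseline) * grad_sum / steps

x = np.array([1.0, 2.0, -1.0])   # toy "input features"
baseline = np.zeros_like(x)      # neutral baseline (cf. padded tokens)
w = np.array([0.5, -0.3, 0.8])   # toy model weights

attributions = integrated_gradients(x, baseline, w)
# completeness: attributions.sum() ~ f(x) - f(baseline)
```

In the deployed workflow the same idea applies with padded tokens as the baseline and automatic differentiation in place of the analytic gradient.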

To address the need for updates as advancements in AI and NLP evolve, the hospital’s Clinical AI and Predictive Analytics governance committee, which initially approved the deployment into clinical practice, continues to monitor the model’s performance32. The committee conducts annual reviews to assess model drift, evaluate performance and ensure alignment with current clinical data, mitigating the risk of model obsolescence.

Statistical analysis

The analytic cohort included all adult hospitalizations during the study period. The primary outcome was the proportion of patients who received an inpatient addiction medicine consultation, confirmed by a completed consultation note in the EHR with a documented intervention and an opioid-related ICD-10 code associated with the service provided. To assess the clinical impact of AI-facilitated consultations, a completed consultation required at least one of the following interventions: (1) outpatient treatment referral, (2) complicated withdrawal management, (3) medication management for OUD, or (4) harm reduction services such as naloxone distribution or fentanyl test strips. Consultations that were deferred, dismissed or not completed before discharge were excluded from the primary outcome, allowing the analysis to reflect clinical interventions rather than consultation orders alone.
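The outcome definition above can be expressed as a simple rule. The sketch below is illustrative only, with hypothetical function and field names rather than the study’s actual phenotyping code; F11 is the ICD-10 category for opioid-related disorders:

```python
# Illustrative encoding of the completed-consultation primary outcome;
# function and field names are hypothetical. ICD-10 codes beginning with
# F11 denote opioid-related disorders.

QUALIFYING_INTERVENTIONS = {
    "outpatient_treatment_referral",
    "complicated_withdrawal_management",
    "moud_medication_management",
    "harm_reduction_services",
}

def is_completed_consultation(note_filed, interventions, icd10_codes):
    """True only if a consult note was filed, an opioid-related ICD-10 code
    was documented, and at least one qualifying intervention was delivered;
    deferred or dismissed consult orders therefore do not count."""
    has_opioid_code = any(c.upper().startswith("F11") for c in icd10_codes)
    has_intervention = bool(QUALIFYING_INTERVENTIONS & set(interventions))
    return bool(note_filed) and has_opioid_code and has_intervention
```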

Hypothesis testing used independent sample z-tests to compare intervention effects, with non-inferiority assessed at a one-sided significance level of 0.025. We hypothesized that the AI screener would be as effective as usual care ad hoc consultations. Based on prior analyses, we anticipated a 3% prevalence of OUD among inpatients. To achieve 85% power to detect a 0.75% increase in the post-intervention period, we required a sample size of 12,500 patients (10,000 from the pre-intervention period and 2,500 from the post-intervention period). The non-inferiority margin was set at 0.5%, corresponding to a post-intervention proportion of less than 2.5% under the null hypothesis of inferiority.
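The non-inferiority comparison can be sketched with a standard one-sided z-test for two independent proportions. In the illustration below, the proportions are those reported in the abstract (1.35% pre versus 1.51% post), while the group sizes are the planned sample sizes from the power calculation, not the actual analytic counts:

```python
from math import sqrt
from statistics import NormalDist

def noninferiority_z(p_pre, p_post, n_pre, n_post, margin):
    """One-sided z-test for non-inferiority of two independent proportions.

    H0 (inferiority):      p_post <= p_pre - margin
    H1 (non-inferiority):  p_post >  p_pre - margin
    """
    se = sqrt(p_pre * (1 - p_pre) / n_pre + p_post * (1 - p_post) / n_post)
    z = (p_post - p_pre + margin) / se
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value

# Abstract proportions with the *planned* group sizes, for illustration only
z, p = noninferiority_z(0.0135, 0.0151, 10_000, 2_500, margin=0.005)
# non-inferiority is declared when p < 0.025 (one-sided alpha)
```

With these inputs the shifted z statistic is about 2.45, comfortably rejecting the null hypothesis of inferiority at the one-sided 0.025 level.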

Secondary outcomes included the 30-day unplanned hospital readmission rate among those receiving an addiction medicine consultation. The Centers for Medicare and Medicaid Services’ criteria for unplanned hospital readmissions were applied55. Sensitivity analyses also included all post-discharge hospitalizations, such as observation stays and ED visits, since nearly one in five rehospitalizations could be missed if non-inpatient visits were excluded56.

A mixed-effects logistic regression model was used to estimate the odds ratio for receiving a consult after the intervention, adjusting for potential confounders, including age, sex, race/ethnicity, insurance status and comorbidity score. The Elixhauser comorbidity score, which accounts for 31 comorbidities to predict in-hospital mortality, length of stay and other outcomes, was used to measure overall patient health status57. The random intercept accounted for repeated hospitalizations in the same patient. To address potential biases introduced by patients with multiple hospitalizations across the study period, we conducted a sensitivity analysis at the patient level. This analysis included only a single index hospitalization for each unique patient and excluded any overlap of patients between the pre-intervention and post-intervention periods.

Cost analysis

The economic evaluation considered: (1) opportunity start-up costs for implementing the AI screener; (2) incremental medical costs comparing usual care to the addition of the AI screener; and (3) ongoing costs for administering and maintaining the AI screener. All costs were adjusted to 2024 US dollars and analyzed from a healthcare system perspective.

The start-up costs for establishing the AI screener included expenses related to supporting the NLP and machine learning components, building the BPA within the EHR, and training healthcare professionals on how to use the tool. To identify the administration and maintenance costs associated with the AI screening workflow changes, we used the following approach: (1) conducting in-depth interviews with hospital administrators; (2) performing activity-based observations of healthcare personnel using the AI screener; and (3) querying the clinician messaging system within the EHR. We used average hospital compensation rates to value healthcare personnel time costs. Incremental costs between usual care and the AI screener were determined by calculating medical care costs before and after the implementation of the AI screener. These costs included those associated with the hospitalization stay and all subsequent medical expenses for the 30 days following hospital admission using the hospital billing records for all the hospitalized adults during the study period.

The cost-effectiveness analysis was reported in terms of the incremental cost-effectiveness ratio for each additional patient who received substance use treatment. For this study, the incremental cost-effectiveness ratio was calculated as the difference in intervention costs between the pre-intervention and post-intervention periods, divided by the difference in intervention effectiveness between these periods, as measured by 30-day hospital readmission and any 30-day post-discharge hospital or emergency department visit.
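The incremental cost-effectiveness ratio reduces to a simple quotient. The figures in the sketch below are hypothetical round numbers consistent with the reported results (roughly 16 readmissions avoided at approximately US$6,800 each), not the study’s exact accounting:

```python
# Illustrative ICER arithmetic; the totals are hypothetical round numbers
# consistent with the reported results, not the study's accounting.

def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio: extra cost per unit of extra
    effectiveness (here, per 30-day readmission avoided)."""
    return delta_cost / delta_effect

# Hypothetical: ~US$108,800 net incremental cost, ~16 readmissions avoided
cost_per_readmission_avoided = icer(delta_cost=108_800, delta_effect=16)
```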

Supplementary Material

Supplementary Tables
CONSORT-AI Checklist
Clinical Trial Protocol

Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41591-025-03603-z.

Acknowledgements

We acknowledge the UW Health operations team in Information Systems from the Applied Data Science team and the Epic Builders team. Of note are the Epic programmers and data scientists B. Schnapp and M. Chao and senior architect G. Wills. Research reported in this publication was supported by the National Institute on Drug Abuse of the National Institutes of Health under award numbers R01-DA051464 (to M.A., D.D., C.J., R.B. and M.O.), R01-LM012973 (to M.A. and D.D.), R01-HL157262 (to M.M.C. and M.O.) and UL1TR002373 (to M.A., F.R., A.G.S., E.S.B. and B.W.P.).

Footnotes

Online content

Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41591-025-03603-z.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Competing interests

The authors declare no competing interests.

Data availability

The minimal data underlying the results of this study are available upon request due to ethical and legal restrictions imposed by the University of Wisconsin-Madison Institutional Review Board. The data are derived from the institution’s EHR and contain protected health information on patients with addiction disease and their documented addiction treatment in the clinical notes, so the data are not publicly available. Data are available from the University of Wisconsin-Madison for researchers who meet the criteria for access to confidential data and have a data usage agreement with the health system. Please contact M.A. for access requests. With a data use agreement, a limited dataset can be made available in response to an inquiry. Please note that the time frame for responding to requests is approximately 2 weeks.

Code availability

The trained model with its stored vocabulary is open source and available at https://git.doit.wisc.edu/smph-public/dom/uw-icu-data-science-lab-public/smart-ai/. This repository also contains the pseudocode for running the model in a health system EHR using Health Level 7 version 2 data standards.

References

  • 1.Nguyen A. et al. Trends in drug overdose deaths by intent and drug categories, United States, 1999–2022. Am. J. Public Health 114, 1081–1085 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Langabeer JR et al. Prevalence and charges of opioid-related visits to US emergency departments. Drug Alcohol Depend. 221, 108568 (2021). [DOI] [PubMed] [Google Scholar]
  • 3.Drug Abuse Warning Network (DAWN): findings from drug-related emergency department visits, CBHSQ data. Accessed 24 September 2024. https://www.samhsa.gov/data/report/2022-findings-drug-related-emergency-department-visits (2022).
  • 4.King C, Cook R, Korthuis PT, Morris CD & Englander H Causes of death in the 12 months after hospital discharge among patients with opioid use disorder. J. Addict. Med 16, 466–469 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Englander H & Davis CS Hospital standards of care for people with substance use disorder. N. Engl. J. Med 387, 672–675 (2022). [DOI] [PubMed] [Google Scholar]
  • 6.Nordeck CD et al. Opioid agonist treatment initiation and linkage for hospitalized patients seen by a substance use disorder consultation service. Drug Alcohol Depend. Rep 2, 100031 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Calcaterra SL et al. Management of opioid use disorder and associated conditions among hospitalized adults: a consensus statement from the Society of Hospital Medicine. J. Hosp. Med 17, 744–756 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Moyo P. et al. Discharge locations after hospitalizations involving opioid use disorder among medicare beneficiaries. Addict. Sci. Clin. Pract 17, 57 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Wakeman SE, Metlay JP, Chang Y, Herman GE & Rigotti NA Inpatient addiction consultation for hospitalized patients increases post-discharge abstinence and reduces addiction severity. J. Gen. Intern. Med 32, 909–916 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Englander H. et al. ‘We’ve learned it’s a medical illness, not a moral choice’: qualitative study of the effects of a multicomponent addiction intervention on hospital providers’ attitudes and experiences. J. Hosp. Med 13, 752–758 (2018). [DOI] [PubMed] [Google Scholar]
  • 11.Hoover K, Lockhart S, Callister C, Holtrop JS & Calcaterra SL Experiences of stigma in hospitals with addiction consultation services: a qualitative analysis of patients’ and hospital-based providers’ perspectives. J. Subst. Abuse Treat 138, 108708 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Callister C, Lockhart S, Holtrop JS, Hoover K & Calcaterra SL Experiences with an addiction consultation service on care provided to hospitalized patients with opioid use disorder: a qualitative study of hospitalists, nurses, pharmacists, and social workers. Subst. Abus 43, 615–622 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Wilson JD et al. Inpatient addiction medicine consultation service impact on post-discharge patient mortality: a propensity-matched analysis. J. Gen. Intern. Med 37, 2521–2525 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Englander H. et al. Caring for hospitalized adults with opioid use disorder in the era of fentanyl: a review. JAMA Intern. Med 184, 691–701 (2024). [DOI] [PubMed] [Google Scholar]
  • 15.Khan M. et al. ‘Before medically advised’ departure from hospital and subsequent drug overdose: a population-based cohort study. CMAJ 196, E1066–E1075 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.McNeely J. et al. Addiction consultation services for opioid use disorder treatment initiation and engagement: a randomized clinical trial. JAMA Intern. Med 184, 1106–1115 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Serowik KL et al. Substance use disorder detection rates among providers of general medical inpatients. J. Gen. Intern. Med 36, 668–675 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.McNeely J. et al. Performance of the tobacco, alcohol, prescription medication, and other substance use (TAPS) tool for substance use screening in primary care patients. Ann. Intern. Med 16, 690–699 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.McNeely J. et al. A brief patient self-administered substance use screening tool for primary care: two-site validation study of the substance use brief screen (SUBS). Am. J. Med 128, e9–e19 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Yudko E, Lozhkina O & Fouts A A comprehensive review of the psychometric properties of the Drug Abuse Screening Test. J. Subst. Abuse Treat 32, 89–198 (2007). [DOI] [PubMed] [Google Scholar]
21. Vasey B et al. Reporting guideline for the early stage clinical evaluation of decision support systems driven by artificial intelligence: DECIDE-AI. BMJ 377, e070904 (2022).
22. Afshar M et al. Subtypes in patients with opioid misuse: a prognostic enrichment strategy using electronic health record data in hospitalized patients. PLoS ONE 14, e0219717 (2019).
23. Sharma B et al. Publicly available machine learning models for identifying opioid misuse from the clinical notes of hospitalized patients. BMC Med. Inform. Decis. Mak. 20, 79 (2020).
24. Afshar M et al. External validation of an opioid misuse machine learning classifier in hospitalized adult patients. Addict. Sci. Clin. Pract. 16, 19 (2021).
25. Thompson HM et al. Bias and fairness assessment of a natural language processing opioid misuse classifier: detection and mitigation of electronic health record data disadvantages across racial subgroups. J. Am. Med. Inform. Assoc. 28, 2393–2403 (2021).
26. Afshar M et al. Development and multimodal validation of a substance misuse algorithm for referral to treatment using artificial intelligence (SMART-AI): a retrospective deep learning study. Lancet Digit. Health 4, e426–e435 (2022).
27. Afshar M et al. Deployment of real-time natural language processing and deep learning clinical decision support in the electronic health record: pipeline implementation for an opioid misuse screener in hospitalized adults. JMIR Med. Inform. 11, e44977 (2023).
28. Damschroder LJ et al. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implement. Sci. 4, 50 (2009).
29. Powell BJ et al. A refined compilation of implementation strategies: results from the Expert Recommendations for Implementing Change (ERIC) project. Implement. Sci. 10, 21 (2015).
30. Ahmed MI et al. A systematic review of the barriers to the implementation of artificial intelligence in healthcare. Cureus 15, e46454 (2023).
31. Wan P et al. Outpatient reception via collaboration between nurses and a large language model: a randomized controlled trial. Nat. Med. 30, 2878–2885 (2024).
32. Liao F, Adelaine S, Afshar M & Patterson BW. Governance of clinical AI applications to facilitate safe and equitable deployment in a large health system: key elements and early successes. Front. Digit. Health 4, 931439 (2022).
33. Curran GM, Bauer M, Mittman B, Pyne JM & Stetler C. Effectiveness-implementation hybrid designs: combining elements of clinical effectiveness and implementation research to enhance public health impact. Med. Care 50, 217–226 (2012).
34. Ullman AJ, Beidas RS & Bonafide CP. Methodological progress note: hybrid effectiveness-implementation clinical trials. J. Hosp. Med. 17, 912–916 (2022).
35. Alemu BT, Olayinka O & Martin BC. Characteristics of hospitalized adults with opioid use disorder in the United States: nationwide inpatient sample. Pain Physician 24, 327–334 (2021).
36. Lo-Ciganic W-H et al. Evaluation of machine-learning algorithms for predicting opioid overdose risk among Medicare beneficiaries with opioid prescriptions. JAMA Netw. Open 2, e190968 (2019).
37. Wynants L et al. Prediction models for diagnosis and prognosis of COVID-19: systematic review and critical appraisal. BMJ 369, m1328 (2020).
38. Carayon P et al. Application of human factors to improve usability of clinical decision support for diagnostic decision-making: a scenario-based simulation study. BMJ Qual. Saf. 29, 329–340 (2020).
39. Schwartz JM et al. Factors influencing clinician trust in predictive clinical decision support systems for in-hospital deterioration: qualitative descriptive study. JMIR Hum. Factors 9, e33960 (2022).
40. Chaparro JD et al. Reducing interruptive alert burden using quality improvement methodology. Appl. Clin. Inform. 11, 46–58 (2020).
41. Weiner SG et al. Opioid overdose after medication for opioid use disorder initiation following hospitalization or ED visit. JAMA Netw. Open 7, e2423954 (2024).
42. James H, Morgan J, Ti L & Nolan S. Transitions in care between hospital and community settings for individuals with a substance use disorder: a systematic review. Drug Alcohol Depend. 243, 109763 (2023).
43. Agency for Healthcare Research and Quality, Healthcare Cost and Utilization Project, Nationwide Readmissions Database. Characteristics of 30-Day All-Cause Hospital Readmissions, 2016–2020. https://hcup-us.ahrq.gov/reports/statbriefs/sb304-readmissions-2016-2020.jsp (2024). Accessed 24 September 2024.
44. Cano M & Sparks CS. Drug overdose mortality by race/ethnicity across US-born and immigrant populations. Drug Alcohol Depend. 232, 109309 (2022).
45. Magee T et al. Inequities in the treatment of opioid use disorder: a scoping review. J. Subst. Use Addict. Treat. 152, 209082 (2023).
46. Nguyen T et al. Racial and ethnic disparities in buprenorphine and extended-release naltrexone filled prescriptions during the COVID-19 pandemic. JAMA Netw. Open 5, e2214765 (2022).
47. Barocas JA et al. Clinical impact, costs, and cost-effectiveness of hospital-based strategies for addressing the US opioid epidemic: a modelling study. Lancet Public Health 7, e56–e64 (2022).
48. Fairley M et al. Cost-effectiveness of treatments for opioid use disorder. JAMA Psychiatry 78, 767–777 (2021).
49. Chen H et al. Facilitation or hindrance: physicians’ perception on Best Practice Alerts (BPA) usage in an electronic health record system. Health Commun. 34, 942–948 (2019).
50. Orenstein EW et al. Alert burden in pediatric hospitals: a cross-sectional analysis of six academic pediatric health systems using novel metrics. J. Am. Med. Inform. Assoc. 28, 2654–2660 (2021).
51. Liu X, Cruz Rivera S, Moher D, Calvert MJ, Denniston AK & SPIRIT-AI and CONSORT-AI Working Group. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat. Med. 26, 1364–1374 (2020).
52. Bradley KA et al. AUDIT-C as a brief screen for alcohol misuse in primary care. Alcohol. Clin. Exp. Res. 31, 1208–1217 (2007).
53. Vavalle JP et al. Hospital length of stay in patients with non-ST-segment elevation myocardial infarction. Am. J. Med. 125, 1085–1094 (2012).
54. Sundararajan M, Taly A & Yan Q. Axiomatic attribution for deep networks. In Proc. 34th International Conference on Machine Learning (eds Precup D & Teh YW) Vol. 70, 3319–3328 (PMLR, 2017).
55. Zuckerman RB, Sheingold SH, Orav EJ, Ruhter J & Epstein AM. Readmissions, observation, and the Hospital Readmissions Reduction Program. N. Engl. J. Med. 374, 1543–1551 (2016).
56. Sheehy AM et al. The Hospital Readmissions Reduction Program and observation hospitalizations. J. Hosp. Med. 16, 409–411 (2021).
57. Elixhauser A, Steiner C, Harris DR & Coffey RM. Comorbidity measures for use with administrative data. Med. Care 36, 8–27 (1998).

Associated Data


Supplementary Materials

Supplementary Tables
CONSORT-AI Checklist
Clinical Trial Protocol

Data Availability Statement

The minimal data underlying the results of this study are available upon request owing to ethical and legal restrictions imposed by the University of Wisconsin-Madison Institutional Review Board. The data are derived from the institution’s EHR and contain protected health information on patients with addiction-related diagnoses, including documentation of their addiction treatment in the clinical notes, so they are not publicly available. Data are available from the University of Wisconsin-Madison to researchers who meet the criteria for access to confidential data and who hold a data use agreement with the health system; please contact M.A. for access requests. Under a data use agreement, a limited dataset can be made available in response to an inquiry. The time frame for responding to requests is approximately 2 weeks.

The trained model with its stored vocabulary is open source and available at https://git.doit.wisc.edu/smph-public/dom/uw-icu-data-science-lab-public/smart-ai/. This repository also contains pseudocode for running the model in a health system EHR using Health Level 7 (HL7) version 2 data standards.
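To make the integration pattern concrete, the sketch below shows one common way a screener can receive note text over an HL7 version 2 interface: the message is split into segments, and the observation-value field (OBX-5) of each OBX segment is collected as the note text to score. This is a minimal illustration of the HL7 v2 field layout only, not the authors' pipeline; the function name and the sample message are invented for the example.

```python
# Minimal sketch: pulling clinical note text out of an HL7 v2 message so it
# can be passed to a downstream screening model. Hypothetical example, not
# the SMART-AI repository code.

def extract_note_text(hl7_message: str) -> str:
    """Concatenate the OBX-5 (observation value) fields of an HL7 v2 message."""
    chunks = []
    # HL7 v2 segments are carriage-return delimited; fields are pipe-delimited.
    for segment in hl7_message.strip().split("\r"):
        fields = segment.split("|")
        # Index 0 is the segment name, so field OBX-5 sits at index 5.
        if fields[0] == "OBX" and len(fields) > 5:
            chunks.append(fields[5])
    return " ".join(chunks)

# Fabricated MDM-style message carrying a two-line note in OBX segments.
sample = "\r".join([
    "MSH|^~\\&|EHR|HOSP|NLP|LAB|202501010000||MDM^T02|1|P|2.5",
    "PID|1||12345",
    "OBX|1|TX|NOTE^Clinical Note||Patient reports daily opioid use.",
    "OBX|2|TX|NOTE^Clinical Note||Discussed treatment options.",
])

print(extract_note_text(sample))
# → Patient reports daily opioid use. Discussed treatment options.
```

In a production deployment the extracted text would be tokenized against the model's stored vocabulary before scoring; a dedicated HL7 parsing library is preferable to manual string splitting for handling escape sequences and repeating fields.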
