Abstract
Antibiotic resistance poses a significant global health challenge, with its rapid emergence driven by inappropriate antibiotic use. This study aimed to develop and compare machine learning models to predict resistance to nine antibiotic classes in hospitalized patients, addressing the high-cost labeling challenge of medical data. We conducted a retrospective study using electronic medical records from three Korean tertiary care institutions (n = 59,551). Single-task learning models (Logistic Regression, XGBoost, LightGBM, CatBoost, Multi-Layer Perceptron) and multi-task learning models (Hard Parameter Sharing, Soft Parameter Sharing) were developed. Model performance was evaluated using Area Under the ROC Curve (AUC), Precision-Recall Curve (PRC), and calibration slope. In external validation, the hard parameter sharing model achieved the highest AUC and PRC for five of nine antibiotic classes, with average values of 79.63 and 80.26, respectively. Shapley Additive Explanations analysis revealed previous antibiotic resistance as the most important predictor. Subgroup analyses showed that the hard parameter sharing model performed best when previous culture test results were available, whereas the soft parameter sharing model performed best when they were missing. Our multi-task learning approach achieved generalized, stable, and superior performance in predicting antibiotic resistance compared with traditional models, providing a novel solution for handling partial targets in medical data.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-026-41185-z.
Keywords: Antibiotic resistance prediction, Multi-task learning, Machine learning, Electronic health records, Clinical decision support, Explainable artificial intelligence
Subject terms: Machine learning, Learning algorithms, Computational models, Network models, Computer science, Predictive markers
Introduction
Antibiotics are crucial therapeutic tools in modern medicine for bacterial infections1. When selected and administered appropriately, antibiotics can enhance treatment efficacy and reduce mortality in patients with infectious diseases2. However, the situation is compounded by two concurrent issues: the rapid emergence of antibiotic resistance and the gradual decline in new antibiotic development3. Antibiotic resistance, induced by the excessive and improper use of antibiotics, leads to adverse effects, such as prolonged hospitalization, deterioration of patient prognosis, and increased medical costs4. According to the O’Neill report, the rise in antibiotic resistance is predicted to cause approximately 10 million additional deaths annually by 2050, potentially resulting in substantial economic losses5.
In Korea, antibiotic use is of particular concern. Around 50% of hospitalized patients are prescribed antibiotics, with 27% receiving two or more antibiotics simultaneously6. The rate of inappropriate antibiotic prescriptions was 25.8% in tertiary hospitals, 31.7% in general hospitals, and 30.8% in long-term care hospitals, indicating that one-quarter to one-third of antibiotic use stems from inappropriate prescriptions6. Although culture tests are crucial for selecting appropriate antibiotics, their results take 3–5 days to become available7. The nine antibiotic classes examined in this study, including penicillins, cephalosporins, fluoroquinolones, and carbapenems, were selected based on both their clinical importance in treating infections and their high prevalence in Korean hospital prescribing patterns, as these represent the primary agents used for empirical therapy6.
Recent advancements in medical artificial intelligence have enhanced accessibility to healthcare through various platforms and technologies (including mobile apps, wearables, and clinical decision support systems) and extended their applications to rare and intractable diseases8. It is essential to acquire sufficient data and develop generalized models. In this context, efforts to address data acquisition and model generalization issues persist in infectious disease research, focusing on predicting antibiotic resistance9–15.
However, studies that predict antibiotic resistance have several limitations. These include restricted applicability to specific patient groups9–14, reliance on single-institution validation9–15, and predominant use of conventional machine learning models (Logistic Regression [LR], Decision Trees, Random Forest [RF], Support Vector Machines, Gradient Boosted Trees, and Multi-Layer Perceptron [MLP])9–15. Jun et al.9 developed models using University of Florida data, but critically, these focused on mortality outcomes rather than actionable resistance prediction. Oonsivilai et al.10 applied RF algorithms to a small pediatric cohort in Cambodia (N = 243, area under the ROC curve [AUC]: 0.74–0.85), demonstrating proof-of-concept but suffering from severe limitations including minimal sample size, single-institution data, and absence of external validation. Dan et al.11 proposed a fluoroquinolone resistance score using conventional LR (AUC: 0.73), but this narrow approach addressed only a single antibiotic class at a single institution, limiting its clinical applicability. Yelin et al.12 analyzed a large dataset of 711,099 community-acquired Urinary Tract Infection (UTI) samples from Israel using gradient boosting and LR (AUC: 0.70–0.83), demonstrating the value of large-scale data but remaining within a single healthcare system without external validation. Both İlhanlı et al.13 and Kim et al.14 focused on UTI populations in South Korea, with İlhanlı examining a single institution and Kim extending to two institutions (AUC: 0.589–0.895 for eight antibiotics); while Kim’s incorporation of daily non-susceptibility rate calculations represents an advancement, both studies remained clinically constrained to UTI settings. Lewin-Epstein et al.15 developed sophisticated ensemble models for hospitalized Israeli patients (N = 16,198, AUC: 0.73–0.79 without species data, 0.80–0.88 with species identification), yet validation was notably restricted to a single institution.
The restriction to single institutions and specific patient populations inherently limits the available training data, creating data scarcity challenges. Advanced techniques such as few-shot learning can address these challenges16. Multi-task learning (MTL), a few-shot learning methodology, allows optimal parameter exploration by sharing parameters across tasks in data-scarce situations17,18. However, when MTL is applied to unrelated tasks, negative transfer can occur: competition between tasks during learning prevents convergence to optimal parameters19. Because previous studies indicate relationships between antibiotic classes, MTL is a plausible way to predict antibiotic resistance while avoiding this issue13–15.
This study aimed to develop and compare the performance of various machine learning (logistic regression [LR], extreme gradient boosting [XGB], light gradient boosting machine [LGBM], CatBoost) and deep learning (MLP, MTL) models in predicting resistance to nine classes of antibiotics in hospitalized patients. Additionally, we aimed to evaluate the generalization capability of the developed models through internal and external validations using data from three independent institutions. Compared to previous antibiotic resistance prediction studies9–15, our study presents methodological innovations in three key aspects. First, to address the partial label missing problem commonly encountered in medical artificial intelligence research, we modified the loss function of MTL to effectively handle missing labels (NaN). This reflects the characteristics of real-world clinical data where culture tests are performed in panel formats, resulting in partially missing labels for each antibiotic class. Second, utilizing a large-scale multi-institutional cohort of 59,551 patients, we achieved a substantially larger dataset compared to previous studies, enhancing the model’s generalization capability. Third, unlike previous studies that were limited to single-institution validation, we conducted rigorous external validation using data from three independent medical institutions to verify clinical applicability. Thus, we expect to improve the accuracy of antibiotic resistance prediction and ultimately contribute to resolving the problems of antibiotic misuse and overuse.
Results
Study population and cohort characteristics
From November 1, 2011, to July 31, 2022, 59,551 individuals meeting the inclusion criteria were recruited from three institutions (Fig. 1). The internal data consisted of the Korea University Anam Hospital and Korea University Guro Hospital cohorts, comprising 19,841 and 23,560 individuals, respectively. The external data comprised the Korea University Ansan Hospital cohort of 16,150 individuals. The baseline characteristics of the study population are summarized in Supplementary Table S5, and the detailed baseline characteristics for each antibiotic class are listed in Supplementary Tables S6–S14. The distribution of missing labels varied substantially across antibiotic classes and institutions (Supplementary Table S20).
Fig. 1.
Flowchart for developing nine antibiotic prediction models. Anam: Korea University Anam Hospital in South Korea, Guro: Korea University Guro Hospital in South Korea, Ansan: Korea University Ansan Hospital in South Korea.
Performance comparison of MTL and STL models
The MTL model demonstrated higher predictive performance in distinguishing resistance to nine antibiotics than the single-task learning (STL) models (Tables 1 and 2, Supplementary Table S15). The MTL model with hard parameter sharing (HS) achieved the following mean AUC [95% CI] on the internal test data (Table 1): Third Generation Cephalosporins (TGC), 80.54 [80.42–80.66]; Macrolides (MAC), 78.28 [78.13–78.42]; Fourth Generation Cephalosporins (FGC), 79.67 [79.55–79.78]; Fluoroquinolones (FLU), 78.32 [78.23–78.42]; Carbapenems (CAR), 80.12 [80.03–80.22]; β-Lactams and β-Lactamase Inhibitors (BLA), 81.35 [81.23–81.46]; Aminoglycosides (AMI), 71.33 [71.13–71.53]; Amikacin/Gentamicin (AG), 72.39 [72.30–72.49]; Glycopeptides (GLY), 79.16 [79.03–79.29]. On the external test data (Table 2), the following mean AUC [95% CI] was obtained: TGC, 79.67 [79.62–79.72]; MAC, 80.96 [80.89–81.03]; FGC, 77.21 [77.15–77.26]; FLU, 77.85 [77.80–77.89]; CAR, 79.85 [79.81–79.89]; BLA, 80.88 [80.84–80.93]; AMI, 69.67 [69.41–69.93]; AG, 73.86 [73.81–73.91]; GLY, 81.23 [81.13–81.32]. Among the MTL approaches, HS showed higher AUC, precision-recall curve (PRC), and calibration slope (CS) performances than soft parameter sharing (SS).
Table 1.
Performance of multi–task learning model for early prediction of nine classes of antibiotics: internal validation results.
| Antibiotic class | HS AUC | HS PRC | HS CS | SS AUC | SS PRC | SS CS |
|---|---|---|---|---|---|---|
| TGC | 80.54 [80.42–80.66] | 88.90 [88.78–89.02] | 0.82 [0.74–0.89] | 79.72 [79.59–79.85] | 88.27 [88.14–88.40] | 0.81 [0.76–0.86] |
| MAC | 78.28 [78.13–78.42] | 89.85 [89.72–89.97] | 1.00 [0.98–1.02] | 77.14 [77.01–77.27] | 89.19 [89.05–89.33] | 0.87 [0.83–0.91] |
| FGC | 79.67 [79.55–79.78] | 77.26 [77.06–77.47] | 0.98 [0.97–0.99] | 78.95 [78.83–79.07] | 76.30 [76.09–76.51] | 0.94 [0.94–0.95] |
| FLU | 78.32 [78.23–78.42] | 85.21 [85.10–85.33] | 0.98 [0.95–1.01] | 77.61 [77.51–77.71] | 84.60 [84.45–84.75] | 0.89 [0.85–0.93] |
| CAR | 80.12 [80.03–80.22] | 86.00 [85.88–86.12] | 0.91 [0.85–0.98] | 79.47 [79.37–79.58] | 85.37 [85.23–85.51] | 0.88 [0.84–0.92] |
| BLA | 81.35 [81.23–81.46] | 76.59 [76.39–76.79] | 0.99 [0.98–1.00] | 80.76 [80.65–80.87] | 75.84 [75.60–76.08] | 0.97 [0.96–0.98] |
| AMI | 71.33 [71.13–71.53] | 53.62 [53.23–54.01] | 0.80 [0.73–0.87] | 70.04 [69.79–70.29] | 52.05 [51.63–52.47] | 0.78 [0.75–0.82] |
| AG | 72.39 [72.30–72.49] | 69.22 [69.05–69.38] | 0.87 [0.80–0.93] | 71.59 [71.49–71.70] | 68.28 [68.08–68.48] | 0.80 [0.76–0.84] |
| GLY | 79.16 [79.03–79.29] | 49.67 [49.26–50.07] | 0.90 [0.83–0.96] | 78.70 [78.56–78.85] | 48.78 [48.39–49.17] | 0.86 [0.82–0.90] |
HS hard parameter sharing, SS soft parameter sharing, AUC area under the ROC curve, PRC precision recall curve, CS calibration slope, TGC third generation cephalosporins, MAC macrolides, FGC fourth generation cephalosporins, FLU fluoroquinolones, BLA β–lactams and β–lactamase inhibitors, CAR carbapenems, AMI aminoglycosides, AG amikacin and gentamicin, GLY glycopeptides. Mean (CI 95%).
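The AUC values in Table 1 are reported on a 0–100 scale. As a reminder of what the metric measures, the following minimal sketch computes the ROC AUC as the proportion of resistant/susceptible pairs in which the resistant case receives the higher predicted score, counting ties as one half. The function name `roc_auc_percent` and the toy data are illustrative assumptions and are not taken from the study's evaluation code.

```python
def roc_auc_percent(labels, scores):
    """ROC AUC on a 0-100 scale via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # resistant case ranked above susceptible case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return 100.0 * wins / (len(pos) * len(neg))


# Toy example with made-up values: 1 = resistant, 0 = susceptible.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
toy_auc = roc_auc_percent(toy_labels, toy_scores)  # 8 of 9 pairs are ordered correctly
```

The pairwise form is quadratic in the number of samples and is intended only to make the definition concrete; rank-based implementations are preferable for cohorts of the size used here.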
Table 2.
Performance of multi–task learning model for early prediction of nine classes of antibiotics: external validation results.
| Antibiotic class | HS AUC | HS PRC | HS CS | SS AUC | SS PRC | SS CS |
|---|---|---|---|---|---|---|
| TGC | 79.67 [79.62–79.72] | 88.06 [88.02–88.10] | 0.83 [0.78–0.89] | 78.60 [78.54–78.66] | 87.44 [87.40–87.47] | 0.72 [0.66–0.77] |
| MAC | 80.96 [80.89–81.03] | 91.93 [91.90–91.97] | 1.03 [1.01–1.05] | 79.37 [79.25–79.50] | 91.10 [91.03–91.17] | 0.86 [0.80–0.91] |
| FGC | 77.21 [77.15–77.26] | 74.38 [74.30–74.46] | 0.93 [0.92–0.94] | 76.04 [75.96–76.12] | 73.13 [73.03–73.24] | 0.87 [0.87–0.87] |
| FLU | 77.85 [77.80–77.89] | 83.04 [83.00–83.08] | 1.01 [0.99–1.03] | 76.99 [76.94–77.05] | 82.26 [82.18–82.34] | 0.88 [0.84–0.92] |
| CAR | 79.85 [79.81–79.89] | 86.04 [85.99–86.08] | 0.83 [0.77–0.90] | 78.93 [78.88–78.99] | 85.32 [85.24–85.40] | 0.84 [0.82–0.87] |
| BLA | 80.88 [80.84–80.93] | 75.08 [74.99–75.17] | 0.98 [0.98–0.99] | 79.84 [79.75–79.93] | 73.76 [73.61–73.90] | 0.95 [0.95–0.96] |
| AMI | 69.67 [69.41–69.93] | 41.33 [41.08–41.57] | 0.76 [0.67–0.84] | 67.80 [67.33–68.26] | 38.99 [38.60–39.37] | 0.73 [0.68–0.78] |
| AG | 73.86 [73.81–73.91] | 65.51 [65.43–65.58] | 0.86 [0.78–0.95] | 72.90 [72.82–72.99] | 64.36 [64.24–64.48] | 0.78 [0.76–0.80] |
| GLY | 81.23 [81.13–81.32] | 52.31 [52.09–52.52] | 0.96 [0.91–1.02] | 80.69 [80.57–80.81] | 51.33 [51.06–51.60] | 0.83 [0.80–0.86] |
HS hard parameter sharing, SS soft parameter sharing, AUC area under the ROC curve, PRC precision recall curve, CS calibration slope, TGC third generation cephalosporins, MAC macrolides, FGC fourth generation cephalosporins, FLU fluoroquinolones, BLA β–lactams and β–lactamase inhibitors, CAR carbapenems, AMI aminoglycosides, AG amikacin and gentamicin, GLY glycopeptides. Mean (CI 95%).
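Tables 1 and 2 report each metric as a mean with a 95% CI. The text does not state how the intervals were computed, so the sketch below shows only one common choice: a Student-t interval over repeated validation runs. The function name, the synthetic AUC values, and the use of 30 runs (the number of repetitions mentioned for the paired t-tests) are assumptions made for illustration.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 29 degrees of freedom (30 runs).
T_CRIT_DF29 = 2.045


def mean_with_ci(values, t_crit=T_CRIT_DF29):
    """Return (mean, lower, upper); pass a different t_crit if len(values) != 30."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two repeated measurements")
    m = statistics.mean(values)
    half_width = t_crit * statistics.stdev(values) / math.sqrt(n)
    return m, m - half_width, m + half_width


# Synthetic AUC values standing in for 30 repeated validations (not study data).
repeated_auc = [79.5 + 0.01 * i for i in range(30)]
auc_mean, auc_low, auc_high = mean_with_ci(repeated_auc)
```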
To rigorously evaluate the performance of the HS model against other approaches, we conducted paired t-tests based on 30 repeated validations (Supplementary Table S16, Supplementary Table S17). For AMI, the HS model demonstrated the highest performance among all models in both internal (AUC difference vs. LGBM: 0.81, p < 0.001; vs. other STL models: 1.29–3.83, all p < 0.001) and external validation (AUC difference vs. LGBM: 0.52, p = 0.0021; vs. other STL models: 2.62–5.40, all p < 0.001). The PRC metrics for AMI further confirmed this superiority, with the HS model showing significant advantages in internal (difference range: 1.16–4.97, p = 0.0137 for LGBM, p < 0.001 for others) and external validation (difference range: 1.11–4.82, all p < 0.001).
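For readers unfamiliar with the test, the following minimal sketch computes the paired t statistic from per-run metric values of two models evaluated on the same repetitions. The function name and the toy numbers are illustrative assumptions. The p-value would then be read from a Student-t distribution with n − 1 degrees of freedom, for example with a library routine such as `scipy.stats.ttest_rel`.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    if sd_diff == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, t_stat, len(diffs) - 1


# Toy per-run scores for two models (made-up values, four paired runs).
model_a = [1.0, 2.0, 3.0, 4.0]
model_b = [0.0, 1.0, 1.0, 2.0]
mean_diff, t_stat, dof = paired_t_statistic(model_a, model_b)
```

Pairing matters here because both models are evaluated on the same repeated splits, so run-to-run variation that affects both models cancels in the differences.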
The computational efficiency analysis, detailed in Supplementary Table S4, compared inference speed and storage efficiency. The HS model achieved an inference time of 39.76 ms per 100 samples with a model size of 2.58 MB, while the SS model recorded 42.79 ms and 10.80 MB, respectively. The two models thus had similar inference times, although the SS model was roughly four times larger.
We developed resistance prediction models for each antibiotic and present the mean (95% confidence interval [CI]) of the AUC and PRC for the external test data (Supplementary Figure S4). MTL demonstrated higher predictive accuracy than the corresponding STL models for FGC, AMI, and GLY. Among the STL models, LGBM demonstrated high predictive accuracy, with AUC and PRC values similar to those of HS.
Table 3 shows the models with the highest resistance prediction accuracy for each antibiotic class for internal and external data (Supplementary Figure S3, Supplementary Figure S4, and Supplementary Table S15). For MAC, FLU, CAR, and BLA, the STL model LGBM demonstrated the highest accuracy. The MTL HS model achieved the highest predictive accuracy, as measured by AUC and PRC in both internal and external validation, for TGC, FGC, AMI, AG, and GLY.
Table 3.
Best performing models for antibiotic resistance prediction using internal and external validation.
| Antibiotics | Model | Internal AUC | Internal PRC | External AUC | External PRC |
|---|---|---|---|---|---|
| TGC | HS | 81.27 | 89.56 | 79.78 | 88.04 |
| MAC | LGBM | 78.76 | 90.31 | 80.88 | 91.90 |
| FGC | HS | 80.13 | 78.29 | 77.47 | 74.43 |
| FLU | LGBM | 78.90 | 85.75 | 77.71 | 82.97 |
| CAR | LGBM | 80.43 | 86.25 | 79.85 | 86.01 |
| BLA | LGBM | 82.03 | 77.47 | 80.83 | 75.10 |
| AMI | HS | 72.02 | 55.42 | 70.48 | 42.28 |
| AG | HS | 72.93 | 70.11 | 73.99 | 65.48 |
| GLY | HS | 79.51 | 51.22 | 81.66 | 52.94 |
HS hard parameter sharing, LGBM light gradient boosting machine, TGC third generation cephalosporins, MAC macrolides, FGC fourth generation cephalosporins, FLU fluoroquinolones, BLA β–lactams and β–lactamase inhibitors, CAR carbapenems, AMI aminoglycosides, AG amikacin and gentamicin, GLY glycopeptides. The multi–task learning model achieved the highest AUC and PRC values for TGC, FGC, AMI, AG, and GLY. MTL demonstrated superior performance in five of the nine antibiotic classes evaluated, indicating its robust predictive capability across multiple antibiotic resistance scenarios.
Model interpretability using SHAP analysis
A Shapley Additive Explanations (SHAP) analysis for explainable AI (XAI) was conducted, as shown in Fig. 2. The top 10 SHAP values showed that previous resistance to a given antibiotic increased the predicted probability of resistance, and they revealed the following resistance associations (each class is followed by its associated classes): TGC: FGC, BLA, CAR, and FLU; MAC: FLU, AG, and BLA; FGC: FLU, BLA, CAR, and TGC; FLU: FGC and CAR; CAR: FGC; BLA: CAR and FGC; AMI: FGC, TGC, BLA, CAR, and MAC; AG: FLU, BLA, FGC, GLY, TGC, and MAC; GLY: AMI, BLA, MAC, AG, FLU, FGC, and CAR. The SHAP values demonstrated that previous antibiotic resistance status was the most important factor.
Fig. 2.
SHAP Values by antibiotics. TGC third generation cephalosporins, MAC macrolides, FGC fourth generation cephalosporins, FLU fluoroquinolones, BLA β–lactams and β–lactamase inhibitors, CAR carbapenems, AMI aminoglycosides, AG amikacin and gentamicin, GLY glycopeptides, pre_[antibiotics]: Previous antibiotic culture test results (0–missing, 1–susceptible, 2–resistant), Administration of [antibiotics]: Antibiotic administration information (0–missing, 1–not administered, 2–administered).
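The three-level coding described in the caption can be sketched as follows. The code values (0, 1, 2) and the feature name patterns come from the caption; the raw input representation (a dictionary per patient with `None` for missing entries) and the function name `encode_patient` are assumptions made for illustration.

```python
# Codes from the Fig. 2 caption.
PRE_CULTURE_CODES = {None: 0, "susceptible": 1, "resistant": 2}
ADMINISTRATION_CODES = {None: 0, False: 1, True: 2}


def encode_patient(prev_culture, administered, classes):
    """Map raw history (hypothetical dictionaries) to the coded features."""
    features = {}
    for c in classes:
        # .get returns None when the class was never tested or recorded -> code 0
        features[f"pre_{c}"] = PRE_CULTURE_CODES[prev_culture.get(c)]
        features[f"Administration of {c}"] = ADMINISTRATION_CODES[administered.get(c)]
    return features


# Example: previous TGC resistance, TGC administered, FGC not administered.
example = encode_patient(
    prev_culture={"TGC": "resistant"},
    administered={"TGC": True, "FGC": False},
    classes=["TGC", "FGC"],
)
```

Keeping "missing" as its own level lets the model distinguish a patient who was never tested from one who tested susceptible.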
Subgroup analysis and comprehensive model evaluation
We report the results of the subgroup analyses based on whether culture tests were conducted for the antibiotics we aimed to predict. The AUC and PRC were higher when culture tests were conducted for the measured antibiotics than when they were not. HS exhibited the highest performance when previous culture test results were available, whereas SS performed best when previous culture test results were unavailable (Table 4, Supplementary Tables S17 and S18).
Table 4.
Subgroup analyses based on the presence or absence of previous culture test results for a specific antibiotic.
| Antibiotic class | HS AUC | HS PRC | SS AUC | SS PRC |
|---|---|---|---|---|
| Presence of a specific antibiotic culture test | | | | |
| TGC | 81.55 [81.49–81.62] | 92.81 [92.76–92.85] | 78.51 [78.38–78.64] | 92.02 [91.95–92.09] |
| MAC | 87.43 [87.35–87.51] | 96.33 [96.28–96.37] | 80.97 [80.78–81.16] | 95.38 [95.31–95.45] |
| FGC | 79.85 [79.80–79.90] | 80.74 [80.65–80.84] | 74.36 [74.24–74.48] | 82.29 [82.18–82.41] |
| FLU | 65.28 [65.13–65.43] | 90.63 [90.59–90.67] | 77.68 [77.59–77.77] | 87.83 [87.74–87.91] |
| CAR | 80.39 [80.35–80.44] | 90.86 [90.81–90.90] | 79.64 [79.57–79.71] | 90.32 [90.24–90.40] |
| BLA | 80.40 [80.35–80.45] | 80.76 [80.67–80.86] | 77.07 [76.97–77.17] | 80.47 [80.33–80.61] |
| AMI | 80.78 [80.59–80.97] | 56.47 [56.07–56.87] | 72.07 [71.61–72.52] | 44.08 [43.60–44.55] |
| AG | 76.81 [76.77–76.85] | 72.07 [71.98–72.16] | 75.15 [75.05–75.25] | 71.24 [71.11–71.38] |
| GLY | 84.17 [84.10–84.25] | 61.63 [61.34–61.93] | 80.89 [80.74–81.04] | 49.48 [49.08–49.89] |
| Missing a specific antibiotic culture test | | | | |
| TGC | 66.64 [66.52–66.76] | 66.64 [66.53–66.76] | 73.04 [72.96–73.12] | 79.28 [79.21–79.34] |
| MAC | 67.82 [67.64–67.99] | 78.29 [78.18–78.41] | 71.76 [71.52–72.00] | 82.03 [81.90–82.15] |
| FGC | 65.57 [65.42–65.71] | 49.53 [49.38–49.68] | 65.38 [65.25–65.51] | 49.21 [49.06–49.36] |
| FLU | 65.13 [65.05–65.22] | 57.14 [57.03–57.25] | 67.87 [67.79–67.95] | 64.40 [64.29–64.51] |
| CAR | 67.03 [66.91–67.15] | 59.91 [59.78–60.04] | 65.12 [64.97–65.26] | 58.06 [57.89–58.22] |
| BLA | 70.47 [70.31–70.63] | 42.43 [42.21–42.64] | 77.70 [77.59–77.80] | 67.86 [67.70–68.02] |
| AMI | 51.86 [51.48–52.24] | 19.43 [19.22–19.64] | 59.27 [58.89–59.65] | 29.43 [29.04–29.82] |
| AG | 62.14 [61.98–62.29] | 40.61 [40.48–40.74] | 67.00 [66.88–67.11] | 52.50 [52.36–52.64] |
| GLY | 69.71 [69.48–69.94] | 16.45 [16.19–16.70] | 78.75 [78.58–78.92] | 52.47 [52.22–52.72] |
HS hard parameter sharing, SS soft parameter sharing, AUC area under the ROC curve, PRC precision recall curve, TGC third generation cephalosporins, MAC macrolides, FGC fourth generation cephalosporins, FLU fluoroquinolones, BLA β–lactams and β–lactamase inhibitors, CAR carbapenems, AMI aminoglycosides, AG amikacin and gentamicin, GLY glycopeptides.
The early antibiotic resistance prediction models using MTL showed better performance in predicting resistance to most antibiotics than the STL models in internal and external validations. As shown in Tables 1 and 2, and Supplementary Table S15, the average AUC and PRC for all antibiotic early prediction models are as follows. Internal test: LR (AUC: 74.16, PRC: 77.69), XGB (AUC: 74.15, PRC: 77.80), LGBM (AUC: 77.33, PRC: 80.33), Cat (AUC: 76.26, PRC: 79.45), MLP (AUC: 76.21, PRC: 79.37), HS (AUC: 78.64, PRC: 81.32), SS (AUC: 77.17, PRC: 78.60). External test: LR (AUC: 75.57, PRC: 77.17), XGB (AUC: 74.84, PRC: 76.28), LGBM (AUC: 78.19, PRC: 79.07), Cat (AUC: 77.01, PRC: 79.94), MLP (AUC: 77.63, PRC: 78.45), HS (AUC: 79.63, PRC: 80.26), SS (AUC: 77.37, PRC: 77.01). Overall, the HS model, an MTL approach, outperformed the traditional machine learning and MLP models for early antibiotic resistance prediction.
SHAP analysis based on the HS model for early antibiotic resistance prediction can indicate the risk factors for antibiotic resistance. Analysis of SHAP values revealed that antibiotic resistance was associated with resistance to other antibiotics and revealed various risk factors, such as age, length of hospital stay, and vital signs. The SHAP values for the previous culture test results were the most significant predictors of antibiotic resistance (Fig. 2).
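To make the attribution idea concrete, the sketch below computes exact Shapley values for a tiny scoring function by enumerating all feature subsets, with absent features replaced by a baseline value. The toy model, its coefficients, and the two indicator features are hypothetical and are not the HS model. SHAP libraries typically approximate these values, because exhaustive enumeration is feasible only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their observed value, the rest the baseline.
        mixed = [x[j] if j in subset else baseline[j] for j in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # subset sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi


def toy_model(features):
    """Hypothetical risk score from two 0/1 indicators of previous resistance."""
    prev_tgc, prev_fgc = features
    return 0.3 * prev_tgc + 0.2 * prev_fgc + 0.1 * prev_tgc * prev_fgc


phi = shapley_values(toy_model, x=[1, 1], baseline=[0, 0])  # about [0.35, 0.25]
```

The attributions sum to the difference between the prediction and the baseline prediction (0.6 in this toy case), and the interaction term is split evenly between the two features.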
Based on the SHAP analysis, we conducted subgroup analyses according to the presence or absence of previous culture test results. When previous culture test results were available, HS showed the highest predictive accuracy in both the internal test (AUC: 78.64, PRC: 81.32) and the external test (AUC: 79.63, PRC: 80.26). Conversely, when previous culture test results were not available, SS demonstrated the highest performance in both the internal test (AUC: 70.07, PRC: 61.78) and the external test (AUC: 69.54, PRC: 59.47) (Table 4, Supplementary Tables S18 and S19).
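The external-test subgroup averages quoted above equal the unweighted means of the per-class values in Table 4. The short sketch below reproduces them; the numbers are copied from Table 4 (HS with previous culture results available, SS with them missing), while the dictionary layout and the function name `macro_average` are illustrative choices.

```python
from statistics import mean

# (AUC, PRC) per antibiotic class, copied from Table 4.
hs_available = {
    "TGC": (81.55, 92.81), "MAC": (87.43, 96.33), "FGC": (79.85, 80.74),
    "FLU": (65.28, 90.63), "CAR": (80.39, 90.86), "BLA": (80.40, 80.76),
    "AMI": (80.78, 56.47), "AG": (76.81, 72.07), "GLY": (84.17, 61.63),
}
ss_missing = {
    "TGC": (73.04, 79.28), "MAC": (71.76, 82.03), "FGC": (65.38, 49.21),
    "FLU": (67.87, 64.40), "CAR": (65.12, 58.06), "BLA": (77.70, 67.86),
    "AMI": (59.27, 29.43), "AG": (67.00, 52.50), "GLY": (78.75, 52.47),
}


def macro_average(rows):
    """Unweighted mean of each metric across antibiotic classes, rounded to 2 dp."""
    return tuple(round(mean(column), 2) for column in zip(*rows.values()))


hs_avg = macro_average(hs_available)  # (79.63, 80.26)
ss_avg = macro_average(ss_missing)    # (69.54, 59.47)
```

Because every class carries equal weight, classes with few labelled samples such as AMI influence these averages as much as well-sampled classes.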
Discussion
In this multi-institutional study on antibiotic resistance prediction, we developed MTL (HS and SS) models that outperformed traditional machine learning (LR, XGB, LGBM, and Cat), deep learning (MLP), and STL-based models for certain classes of antibiotics. This study constructed an MTL model that effectively handled partially missing targets in each task and showed the highest predictive accuracy when model performance was evaluated according to the presence of culture tests. The averages of the internal and external AUC values of the early prediction models, by antibiotic class, are as follows: TGC, 80.53; MAC, 79.82; FGC, 78.8; FLU, 82.33; CAR, 78.31; BLA, 81.43; AMI, 71.25; AG, 73.46; GLY, 80.59. Notably, for the AMI class with limited data availability, the MTL HS model consistently demonstrated superior performance compared to other models. Our approach addresses critical limitations in prior machine learning-based antibiotic resistance prediction studies9–15, which were predominantly restricted by single-institution validation9–15, narrow patient populations (pediatric10, UTI-specific12–14, or bloodstream infections9,11,15), and modest sample sizes (N = 243–16,198)9–11,13,15. Previous STL-based approaches9–15 handled missing labels through data exclusion, causing substantial information loss and failure to exploit cross-antibiotic resistance patterns. In contrast, our multi-task learning framework simultaneously predicts resistance across nine antibiotic classes using three-institution data with rigorous external validation; addresses severe label sparsity, particularly for aminoglycosides (up to 69% missing labels), through a modified loss function that leverages inter-task relationships rather than discarding data; and demonstrates robust generalizability across geographically distinct institutions with diverse resistance patterns.
Early and appropriate use of antibiotics in infectious disease treatment effectively improves patient outcomes and reduces costs. Empirical antibiotic use in clinical settings raises concerns about unnecessary broad-spectrum or inappropriate antibiotic use due to unknown resistance patterns20. The HS model addressed this issue by achieving the highest AUC and PRC for five out of nine antibiotics in the early antibiotic resistance prediction. In contrast, the SS model demonstrated the highest AUC and PRC when no previous culture tests were available.
The factors contributing to the generalization and superior performance of the MTL model in this study were multi-institutional data and learning-task relationships. The three-institution cohort, collected from three regions of Korea, formed similar clusters and can be considered representative of Koreans and Asians. However, because antibiotic resistance rates and data volumes differ across institutions, multi-institutional data alone do not ensure the robustness of the AI model. To overcome this limitation, we implemented an MTL method that considers associations in antibiotic resistance. We verified that learning between related tasks produced superior results compared with traditional machine learning and deep learning methodologies. MTL can be used to construct advanced deep-learning models through learning that considers the associations between specific events, such as resistance and diseases, that exist in actual clinical settings.
A primary clinical implication of our findings, underscored by the predictive performance demonstrated by the HS and SS models under different conditions, is the model’s potential to help optimize empirical antibiotic therapy. Even modest improvements in predictive accuracy, such as the AUC gains of 1–2 points observed in our models, can translate into meaningful clinical benefits when applied at scale. By providing predictive insights into resistance patterns across major antibiotic classes, the developed ML model can guide clinicians toward the selection of more targeted, narrower-spectrum agents. Such model-guided decisions may facilitate earlier administration of appropriate therapy, which has been associated with reduced mortality in bacterial infections21–23, and enable more timely de-escalation from broad-spectrum to narrow-spectrum antibiotics once susceptibility patterns are predicted24,25. This capability is particularly valuable, as current empirical treatment strategies often default to broad-spectrum antibiotics to mitigate the risk of therapeutic failure. This defensive approach, while common, is known to induce significant collateral damage, including an increased incidence of Clostridioides difficile infections (CDI) and the accelerated selection for multidrug-resistant pathogens24–26. Furthermore, by supporting more judicious antibiotic selection, the model has the potential to strengthen antimicrobial stewardship programs, which have demonstrated effectiveness in reducing both inappropriate prescribing and healthcare-associated infections such as CDI26,27. Nonetheless, the hypothesis that this model-guided approach is superior to standard care must be rigorously tested and demonstrated through further prospective studies, such as randomized controlled trials.
Medical data often have missing targets for various diseases due to high-cost labeling and omission of disease records28. Previous antibiotic resistance prediction studies handled this partial target missing problem by removing data with missing labels and constructing a separate STL model for each antibiotic9–15, which resulted in low data efficiency and failed to leverage potential relationships between antibiotic classes. In this study, targets were partially missing across the nine antibiotic classes. The proportion of missing labels varied substantially across antibiotic classes, with aminoglycosides (AMI) exhibiting the highest proportion (47% in Anam hospital, 69% in Guro hospital, and 69% in Ansan hospital) (Supplementary Table S20). Despite this label sparsity, the MTL model maintained robust predictive performance across all antibiotic classes, demonstrating its capability to leverage available data even under substantial label incompleteness. To address this issue, we modified the MTL loss function: for each task, only the outputs with missing targets were excluded, and the binary cross-entropy was then calculated on the remaining outputs, enabling maximum utilization of all available data. MTL with this modified loss showed superior prediction accuracy compared with the STL models for five of the nine antibiotic classes (TGC, FGC, AMI, AG, and GLY). Most strikingly, for AMI, which had the highest proportion of missing labels, MTL achieved an AUC of 69.67 (95% CI: 69.41–69.93) in external validation, surpassing the STL approaches. This finding suggests that inter-task knowledge sharing in MTL mitigates the adverse effects of label scarcity, a critical advantage in real-world clinical settings where complete labeling is often impractical.
Rigorous statistical validation through paired t-tests (Supplementary Tables S16 and S17) demonstrated that MTL’s superior performance over STL was statistically significant, providing compelling evidence for the effectiveness of our multi-task learning framework even in the presence of severe label sparsity.
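The modified loss described above can be sketched as a masked binary cross-entropy in which entries with a missing (NaN) target are skipped. This is a minimal illustration in plain Python rather than the training code used in the study. In particular, averaging within each task and then across tasks is an assumption, because the text does not specify the normalization, and the function name `masked_bce` is hypothetical.

```python
import math


def masked_bce(targets, probs, eps=1e-7):
    """Binary cross-entropy over observed labels only.

    targets, probs: lists of rows (one per patient), one column per task.
    A NaN target marks a missing label and is excluded from the loss.
    """
    n_tasks = len(targets[0])
    task_losses = []
    for k in range(n_tasks):
        losses = []
        for row_t, row_p in zip(targets, probs):
            t = row_t[k]
            if math.isnan(t):
                continue  # missing label: no contribution for this patient and task
            p = min(max(row_p[k], eps), 1.0 - eps)  # clip to avoid log(0)
            losses.append(-(t * math.log(p) + (1.0 - t) * math.log(1.0 - p)))
        if losses:  # tasks with no observed label in the batch are skipped
            task_losses.append(sum(losses) / len(losses))
    return sum(task_losses) / len(task_losses) if task_losses else 0.0


# Two patients, two tasks; the first patient has no label for the second task.
nan = float("nan")
toy_targets = [[1.0, nan], [0.0, 1.0]]
toy_probs = [[0.5, 0.9], [0.5, 0.5]]
toy_loss = masked_bce(toy_targets, toy_probs)  # equals ln 2 for these values
```

A patient with a label for even one antibiotic class still contributes to training, which is the data-efficiency argument made in the text; the prediction for a missing label has no influence on the loss.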
The differential performance patterns observed between the HS and SS models in relation to prior culture test availability can be attributed to their distinct architectural characteristics and learning mechanisms. The HS model employs shared layers followed by task-specific separated layers for each antibiotic class. In this architecture, both shared and task-specific layers are updated based on losses calculated from all nine task-specific loss functions. However, the task-specific layers enable specialized learning for each individual antibiotic prediction task. When previous culture test results for the task-specific antibiotic are available, the task-specific layers can focus on learning from the highly predictive historical resistance information for that particular antibiotic, resulting in superior performance (Table 4). Conversely, when previous culture test results for the task-specific antibiotic are absent, the model has limited capacity to leverage information from other antibiotics due to the task-separated layer structure. In contrast, the SS model utilizes fully shared layers across all nine antibiotic prediction tasks without any task-specific separation. In this architecture, all layers are updated based on losses calculated from all task-specific loss functions simultaneously without task separation, enabling comprehensive cross-task information sharing. When the task-specific antibiotic’s historical data is unavailable, this fully shared architecture allows the model to leverage resistance patterns from other antibiotics more effectively, compensating for the missing information and resulting in superior performance under data-sparse conditions (Table 4). These performance differences fundamentally arise from the architectural distinction between task-separated specialized learning and fully-shared integrated learning18.
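The architectural contrast described in this paragraph can be sketched with two small forward passes. The layer counts, layer widths, initialisation, and class names below are assumptions chosen for brevity, and the second class follows this section's description of the SS model (all layers shared, no task-specific layers) rather than any particular published definition of soft parameter sharing.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (illustrative only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class HardSharingNet:
    """One shared layer, then a separate hidden and output layer per task."""

    def __init__(self, n_features, n_hidden, n_tasks, rng):
        self.shared = init_layer(n_features, n_hidden, rng)
        self.heads = [
            (init_layer(n_hidden, n_hidden, rng), init_layer(n_hidden, 1, rng))
            for _ in range(n_tasks)
        ]

    def predict(self, x):
        h = relu(dense(x, self.shared))
        probs = []
        for head_hidden, head_out in self.heads:
            t = relu(dense(h, head_hidden))  # task-specific representation
            probs.append(sigmoid(dense(t, head_out)[0]))
        return probs


class FullySharedNet:
    """All hidden layers shared; one output unit per task on the last layer."""

    def __init__(self, n_features, n_hidden, n_tasks, rng):
        self.hidden1 = init_layer(n_features, n_hidden, rng)
        self.hidden2 = init_layer(n_hidden, n_hidden, rng)
        self.out = init_layer(n_hidden, n_tasks, rng)

    def predict(self, x):
        h = relu(dense(relu(dense(x, self.hidden1)), self.hidden2))
        return [sigmoid(z) for z in dense(h, self.out)]


rng = random.Random(0)
x = [rng.random() for _ in range(12)]          # 12 made-up input features
hs_probs = HardSharingNet(12, 8, 9, rng).predict(x)
ss_probs = FullySharedNet(12, 8, 9, rng).predict(x)
```

In the first network each antibiotic class has its own hidden representation after the shared layer, whereas in the second every class reads from the same final representation, which is the distinction the paragraph draws on to explain the subgroup results.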
The computational performance evaluation demonstrates that our MTL-based HS model exhibits favorable characteristics for potential clinical implementation. With an inference time of approximately 39.76 ms per 100 samples (0.40 ms per patient) and a compact model size of 2.58 MB, the HS model achieves computational efficiency substantially superior to the MLP deep learning baseline (342.08 ms inference time, 10.14 MB size) while maintaining storage requirements 45-fold smaller than XGB and 17-fold smaller than Cat. These computational characteristics suggest the potential for integration into hospital EMR systems, where sub-second response times are typically expected for clinical decision support applications. The compact model size is compatible with deployment on standard hospital servers without requiring specialized hardware infrastructure. However, actual clinical deployment would require prospective validation to assess real-world performance, system integration complexity, and clinical workflow compatibility.
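The ratios implied by these figures can be checked with simple arithmetic. The short calculation below uses only the numbers quoted in the paragraph above; it assumes that the MLP inference time was measured on the same 100-sample basis as the HS model, and the implied XGB and Cat sizes are back-calculated from the reported fold differences rather than reported directly.

```python
# Figures quoted in the text.
hs_ms_per_100 = 39.76    # HS inference time per 100 samples (ms)
mlp_ms_per_100 = 342.08  # MLP inference time (assumed: same 100-sample basis)
hs_size_mb = 2.58
mlp_size_mb = 10.14

per_patient_ms = hs_ms_per_100 / 100               # about 0.40 ms per patient
speedup_vs_mlp = mlp_ms_per_100 / hs_ms_per_100    # about 8.6-fold faster
size_ratio_vs_mlp = mlp_size_mb / hs_size_mb       # about 3.9-fold smaller

# Sizes implied by the reported 45-fold and 17-fold differences.
implied_xgb_mb = 45 * hs_size_mb   # about 116 MB
implied_cat_mb = 17 * hs_size_mb   # about 44 MB

print(f"{per_patient_ms:.2f} ms per patient")
print(f"{speedup_vs_mlp:.1f}x faster than MLP, {size_ratio_vs_mlp:.1f}x smaller")
print(f"implied XGB ~{implied_xgb_mb:.0f} MB, implied Cat ~{implied_cat_mb:.0f} MB")
```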
The SHAP analysis revealed that the most important factors influencing resistance to a given antibiotic class were a history of use of, or resistance to, antibiotics in the same class, together with use of or resistance to other antibiotic classes. This result is consistent with the information that clinicians refer to when prescribing empirical antibiotics. Vital signs did not significantly influence resistance, except for MAC, FGC, FLU, CAR, and BLA, indicating that it is difficult to determine resistance based solely on patient severity. These findings are consistent with previous studies predicting antibiotic resistance in urinary tract infections. This study therefore reaffirms that resistance can be predicted from actual antibiotic use history and previous resistance rather than merely presumed from patient severity13. To illustrate the practical application of these findings in clinical decision-making, consider the following example: A patient previously demonstrated resistance to TGC. According to our SHAP analysis (Fig. 2), this patient would exhibit increased risk of resistance to FGC, BLA, CAR, and FLU. In this clinical scenario, a physician could leverage these SHAP-derived insights to avoid empirically prescribing antibiotics from these cross-resistant classes and instead consider alternative therapeutic options such as aminoglycosides or glycopeptides, which demonstrated lower cross-resistance associations in our analysis. This SHAP-guided approach could potentially reduce treatment failure rates and minimize the risk of further resistance development.
This study had several limitations. First, the MTL model was not evaluated in a prospective cohort. Second, during MTL training, the loss was calculated after removing outputs with missing targets. This method may bias learning because it calculates the loss without considering differences in data volume between tasks. Although this did not introduce learning bias in this study, further research is needed on loss calculations that account for the data volume of each target. Third, although our results demonstrate that MTL handles missing labels effectively across varying proportions of label availability (ranging from 0% missing for some antibiotics to 69% for aminoglycosides as shown in Supplementary Table S20), a systematic sensitivity analysis that deliberately varies the proportion of missing labels under controlled conditions could further validate the robustness of our approach and should be undertaken in future work. Fourth, prediction accuracy was poor when previous culture test results were absent. This issue arose because of missing prior culture test records in the EMR and could be mitigated by using medical records that contain patient histories. Finally, the study used a single data modality and did not include natural language text, genomics, or other data types. Clinical notes may contain detailed symptom records in natural language that are not present in the structured EMR. Genomic data are also important for prediction because antibiotic resistance is influenced by genomics29. Models that combine such modalities have been reported to aid antibiotic resistance prediction studies30.
In conclusion, considering antibiotic resistance associations, MTL achieved generalized, stable, and superior performance compared to traditional machine learning and deep learning (LR, XGB, LGBM, Cat, and MLP) based on STL. It also provides a solution for partial targets in each MTL task, thereby addressing the high-cost labeling problem. However, it is important to improve the accuracy of antibiotic resistance prediction through additional research on loss calculations and missing-value-handling methods. For clinical implementation, the model could be integrated into hospital EMR systems as a clinical decision support tool, though this would require prospective validation, appropriate regulatory approval (e.g., FDA), and ongoing performance monitoring. The results of this study are expected to contribute significantly to the early diagnosis and management of antibiotic resistance in clinical settings.
Methods
Study design and participants
This study used the EMR databases of three tertiary care institutions in Korea: Korea University Anam Hospital, Korea University Guro Hospital, and Korea University Ansan Hospital. This study was approved by the Institutional Review Boards (IRB) of Korea University Medical Center: Korea University Anam Hospital (IRB number: 2023AN0544), Korea University Guro Hospital (IRB number: 2023GR0575), and Korea University Ansan Hospital (IRB number: 2024AS0009). We confirm that all methods were carried out in accordance with relevant guidelines and regulations.
The hospital databases from these three institutions included EMR data from November 1, 2011, to July 31, 2022. The study population was defined as patients aged 19 years or older, hospitalized for at least 3 days, and with available culture test results. The outcome variable was antibiotic resistance status after 3 days, specifically targeting nine antibiotics commonly used in clinical practice: TGC, MAC, FGC, FLU, BLA, CAR, AMI, AG, and GLY.
Data preprocessing
We used input data comprising demographic information (Table S3), vital signs, antibiotic administration details (Table S2), and previous culture test results to construct the antibiotic resistance prediction models (Table S1), with the output value set as the current antibiotic resistance status of the hospitalized patients (Figure S1, Figure S2). The bacterial strains were categorized into nine classes. Data scaling was performed using Standard Scaler, and missing data were imputed using multiple imputation by chained equations (MICE) and missing indicators31–33. Detailed preprocessing steps are shown in the supplementary appendix p2–8.
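As a rough illustration of the scaling and imputation steps, the sketch below chains scikit-learn's `StandardScaler` with `IterativeImputer`, a chained-equations style imputer, and appends missing indicators through `add_indicator=True`. Several points are assumptions rather than details from the study: the order of the two steps, the imputer settings, and the synthetic three-column feature matrix. `IterativeImputer` as used here returns a single imputed dataset, whereas MICE in the strict sense draws multiple imputations.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the import below)
from sklearn.impute import IterativeImputer
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for three continuous inputs (not study data).
rng = np.random.default_rng(0)
X = rng.normal(loc=[36.8, 80.0, 120.0], scale=[0.5, 12.0, 15.0], size=(200, 3))
X[rng.random(X.shape) < 0.2] = np.nan      # roughly 20% missing at random
X[0, 0] = X[1, 1] = X[2, 2] = np.nan       # guarantee every column has a gap

# StandardScaler ignores NaN when fitting and leaves NaN in place on transform.
X_scaled = StandardScaler().fit_transform(X)

# Each incomplete feature is regressed on the others in turn; add_indicator
# appends one binary "was missing" column per feature that had missing values.
imputer = IterativeImputer(max_iter=10, random_state=0, add_indicator=True)
X_out = imputer.fit_transform(X_scaled)

print(X_out.shape)  # (200, 6)
```

In a train/test setting, both the scaler and the imputer would normally be fitted on the training split only and then applied to the test split.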
Model
Previous studies on medical artificial intelligence model development using tabular data have primarily adopted Machine Learning techniques9–15. This trend is due to the tendency of Deep Learning models to show relatively lower performance than Machine Learning models when dealing with tabular data34,35. Bridging this performance gap requires either expanding the data scale or improving the model algorithm36. However, security and privacy concerns associated with medical data impose practical limitations on data scale expansion. Consequently, algorithm improvement approaches, such as few-shot learning, are emerging as more feasible solutions.
We adopted MTL, a paradigm that jointly learns related tasks through shared representations (Figure S2). MTL is known to be particularly effective when there are correlations between tasks18. Figure 3 illustrates our MTL architecture compared to traditional STL approaches used in previous studies. We conducted comparative analyses with various STL-based algorithms (LR, XGB, LGBM, CatBoost, MLP) to validate the effectiveness of MTL. These STL methods, which have been the standard approach in prior antibiotic resistance prediction research, require complete label information for each antibiotic category and train separate models for each of the nine antibiotic classes.
Fig. 3.
Comparative analysis of Multi-Task Learning and Single-Task Learning architectures for antibiotic resistance prediction. Schematic representation of the proposed Multi-Task Learning (MTL) framework with hard and soft parameter sharing architectures versus traditional Single-Task Learning (STL) approaches. The MTL model handles missing target labels through masking during backpropagation, enabling effective learning from partially labeled culture test data. In hard parameter sharing (HS), shared layers receive gradient updates from all tasks, while in soft parameter sharing (SS), all layers including task-specific ones receive updates, potentially enhancing learning efficiency. The STL baseline models process nine antibiotic category datasets separately using conventional machine learning algorithms (LR, XGB, LGBM, CatBoost, MLP).
One of the main challenges encountered in developing antibiotic resistance prediction models stems from the characteristics of culture test data. Culture tests selectively report resistance to antibiotics based on a test panel, which can lead to partially missing data for the target variables. Our MTL framework addresses this challenge through a masking mechanism during backpropagation, where targets with missing data are excluded from loss computation (Fig. 3). In the HS architecture, this approach enables shared layers to receive gradient updates from all available tasks, while in SS, the entire network including task-specific layers benefits from cross-task learning signals. This approach ensures the stability of model training while maximizing the utilization of the available data, providing a significant advantage over STL methods that must discard samples with any missing labels (Supplementary appendix p9–12).
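The masking mechanism can be sketched as a binary cross-entropy in which untested antibiotics contribute neither loss nor gradient. The NumPy example below is a minimal illustration, not the study's TensorFlow code: encoding missing labels as `NaN`, pooling the loss over all observed labels, and the function name `masked_bce` are assumptions made for this sketch.

```python
import numpy as np


def masked_bce(y_true, y_prob, eps=1e-7):
    """Binary cross-entropy averaged over observed labels only.

    y_true: (n_samples, n_tasks) array of 0/1, with np.nan where the
            antibiotic was not on the test panel (assumed encoding).
    y_prob: (n_samples, n_tasks) predicted probabilities.
    Returns the scalar loss and its gradient with respect to the logits.
    """
    mask = ~np.isnan(y_true)
    y = np.where(mask, y_true, 0.0)          # placeholder value, masked out below
    p = np.clip(y_prob, eps, 1.0 - eps)
    elementwise = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    n_observed = max(int(mask.sum()), 1)
    loss = (elementwise * mask).sum() / n_observed
    # For a sigmoid output, d(loss)/d(logit) = (p - y); masked entries get 0.
    grad_logits = (p - y) * mask / n_observed
    return loss, grad_logits


y_true = np.array([[1.0, np.nan, 0.0],
                   [0.0, 1.0, np.nan]])
y_prob = np.array([[0.9, 0.5, 0.2],
                   [0.1, 0.8, 0.7]])
loss, grad = masked_bce(y_true, y_prob)
print(round(loss, 4))  # 0.1643
print(grad[0, 1], grad[1, 2])  # 0.0 0.0
```

A sample with only some antibiotics tested therefore still contributes to training through its observed targets, while predictions for untested antibiotics leave the loss unchanged.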
Evaluation metrics
The accuracy of early antibiotic resistance prediction was evaluated using metrics that can express the model’s generalization ability: AUC, PRC, and CS according to the Hosmer–Lemeshow method37. We compared predictive performance between models using the following scenario: (1) We divided the internal data, consisting of Korea University Anam Hospital and Korea University Guro Hospital, into training and test sets. External data collected from Korea University Ansan Hospital were designated as the test set. (2) The models were trained using an internal dataset. (3) We calculated the models’ AUC, PRC, and CS based on the internal and external data test sets. (4) This process was repeated 30 times to compare the models statistically.
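For readers less familiar with these metrics, the following NumPy sketch shows one common way to compute each of them. It rests on assumptions that may differ from the study's own calculation: PRC is summarised here as average precision, and the calibration slope is taken as the coefficient obtained when the observed outcome is regressed on the logit of the predicted probability (logistic recalibration). The simulated predictions are synthetic and are not study results.

```python
import numpy as np


def roc_auc(y_true, y_score):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos, neg = y_score[y_true == 1], y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Mean of the precision values at the rank of each positive (ties not pooled)."""
    order = np.argsort(-np.asarray(y_score, dtype=float), kind="stable")
    y_sorted = np.asarray(y_true)[order]
    precision = np.cumsum(y_sorted) / np.arange(1, len(y_sorted) + 1)
    return precision[y_sorted == 1].mean()


def calibration_slope(y_true, y_prob, n_iter=25, eps=1e-7):
    """Slope b of logit(P(y=1)) = a + b * logit(p), fitted by Newton-Raphson."""
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    X = np.column_stack([np.ones_like(p), np.log(p / (1.0 - p))])
    beta = np.zeros(2)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - mu)
        hess = X.T @ (X * (mu * (1.0 - mu))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta[1]


# Toy ranking example.
auc_toy = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])            # 0.75
ap_toy = average_precision([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])   # about 0.83

# Synthetic calibration example: outcomes drawn from the stated probabilities.
rng = np.random.default_rng(0)
logit = rng.normal(0.0, 1.5, size=5000)
p_true = 1.0 / (1.0 + np.exp(-logit))
y_sim = (rng.random(5000) < p_true).astype(float)
slope_good = calibration_slope(y_sim, p_true)                           # near 1
slope_over = calibration_slope(y_sim, 1.0 / (1.0 + np.exp(-2 * logit)))  # near 0.5
print(auc_toy, round(ap_toy, 2))
```

Under this definition, a slope near 1 indicates well-calibrated predictions, and a slope below 1 indicates predictions that are too extreme.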
Feature importance calculation
We evaluated the accuracy of early antibiotic resistance prediction for each model algorithm. We then applied SHAP, an XAI technique, to the model with the highest sum of AUC and PRC values from the internal and external tests to identify the risk factors for antibiotic resistance (Supplementary appendix p11–12).
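The idea behind SHAP can be illustrated by computing exact Shapley values for a very small model by brute force. The example below is a didactic sketch and does not use the SHAP library or the study's model: the toy risk function, its coefficients, and the use of a single all-zero baseline (rather than an expectation over background data) are hypothetical simplifications.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a single baseline point."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the patient's values, the rest the baseline's.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                s = set(s)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score: z = [previous resistance, prior use, unrelated flag]."""
    prev_resistance, prior_use, _unrelated = z
    return 0.1 + 0.5 * prev_resistance + 0.2 * prior_use + 0.1 * prev_resistance * prior_use


x = [1, 1, 1]
baseline = [0, 0, 0]
phi = shapley_values(toy_risk, x, baseline)
print([round(v, 2) for v in phi])  # [0.55, 0.25, 0.0]
```

The attributions sum to the difference between the prediction for the patient and the prediction at the baseline (0.9 minus 0.1 in this toy case), and a feature that never changes the output receives zero.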
Statistical analysis for performance comparison between models
We split the original dataset into training and test sets (7:3) 30 times, developing and evaluating the models on each split to compare their performance. This approach reflects a characteristic of medical data, namely that model performance varies depending on the patient cohort used for training. The evaluation metrics of the models are expressed as the mean (95% CI).
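The repeated-split procedure and the summary statistics can be sketched as follows. This is an illustration under assumptions: the 95% CI is computed here from a t distribution with 29 degrees of freedom (critical value 2.045), which may differ from the interval used in the study; the paired t statistic corresponds to the paired t-tests mentioned in the Discussion; and the per-repetition scores are synthetic placeholders, not study results.

```python
import numpy as np

N_REPEATS = 30
T_CRIT_DF29 = 2.045  # two-sided 95% critical value of t with 29 degrees of freedom


def split_indices(n_samples, rng, train_frac=0.7):
    """Random 7:3 partition of row indices into training and test sets."""
    perm = rng.permutation(n_samples)
    cut = int(round(train_frac * n_samples))
    return perm[:cut], perm[cut:]


def mean_ci(values, t_crit=T_CRIT_DF29):
    """Mean and 95% confidence interval across repetitions."""
    v = np.asarray(values, dtype=float)
    half = t_crit * v.std(ddof=1) / np.sqrt(len(v))
    return v.mean(), v.mean() - half, v.mean() + half


def paired_t(a, b):
    """Paired t statistic for two models scored on the same repetitions."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))


rng = np.random.default_rng(0)
train_idx, test_idx = split_indices(1000, rng)

# Placeholder scores standing in for one metric of two models over 30 splits.
scores_a = 0.80 + rng.normal(0.0, 0.01, N_REPEATS)
scores_b = scores_a - 0.02 + rng.normal(0.0, 0.005, N_REPEATS)

mean_a, low_a, high_a = mean_ci(scores_a)
t_stat = paired_t(scores_a, scores_b)
print(f"{mean_a:.3f} ({low_a:.3f}, {high_a:.3f}), paired t = {t_stat:.1f}")
```

Pairing the scores by repetition removes the split-to-split variation that both models share, so the test focuses on the difference between models.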
Acknowledgements
This work was supported by the Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0023675, HRD Program for Industrial Innovation), the IITP (Institute of Information & Communications Technology Planning & Evaluation)-ICAN (ICT Challenge and Advanced Network of HRD) grant funded by the Korea government (Ministry of Science and ICT) (IITP-2026-RS-2022-00156439), and the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (RS-2024-00466053). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. We would like to thank the medical staff of Korea University Anam Hospital, Korea University Guro Hospital, and Korea University Ansan Hospital for their support in data collection and validation.
Abbreviations
- MTL
Multi-task learning
- LR
Logistic regression
- RF
Random Forest
- MLP
Multi-Layer Perceptron
- XGB
Extreme gradient boosting
- LGBM
Light gradient boosting machine
- EMR
Electronic medical record
- IRB
Institutional review boards
- TGC
Third generation cephalosporins
- MAC
Macrolides
- FGC
Fourth generation cephalosporins
- FLU
Fluoroquinolones
- BLA
β-lactams and β-lactamase inhibitors
- CAR
Carbapenems
- AMI
Aminoglycosides
- AG
Amikacin/gentamicin
- GLY
Glycopeptides
- MICE
Multiple imputation by chained equations
- STL
Single-task learning
- AUC
Area under the receiver operating characteristic curve
- PRC
Precision–recall curve
- CS
Calibration slope
- SHAP
Shapley additive explanations
- XAI
Explainable artificial intelligence
- HS
Hard parameter sharing
- SS
Soft parameter sharing
Author contributions
- Study concept and design: SP and YK
- Literature search: All authors
- Data collection: CJ, JS, HP, and HL
- Statistical analysis: JP, TJ, YK, and SP
- Funding: TJ and HL
- Administrative, technical, or material support: SP and HL
- Data interpretation: IJ, SP, and HL
- Writing: All authors
Funding
This work was supported by the Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0023675, HRD Program for Industrial Innovation). This work was supported by the IITP (Institute of Information & Communications Technology Planning & Evaluation)-ICAN (ICT Challenge and Advanced Network of HRD) grant funded by the Korea government (Ministry of Science and ICT) (IITP-2026-RS-2022-00156439). This work was supported by the National Research Foundation of Korea (NRF) funded by the Ministry of Education (RS-2024-00466053).
Data availability
Data supporting the findings of this study are available from the corresponding author upon reasonable request. The data are not publicly available due to containing sensitive medical information and personal information of patients that could compromise research participant privacy and confidentiality under the data protection regulations.
Code availability
The code used for developing and evaluating the antibiotic resistance prediction models in this study was implemented using Python 3.10.0, TensorFlow 2.10.0, and R 4.3.3. The complete Python implementation, including the multi-task learning models, data preprocessing, and analysis scripts, is publicly available at https://github.com/YeongM/antibiotics_resistance_MTL. This repository contains all necessary code to reproduce our results, including model architectures, training procedures, and evaluation metrics. The implementation details of both HS and SS approaches are included. Additional R scripts used for statistical analysis and visualization are also available in the same repository. The code is released under an open-source license to ensure reproducibility and to facilitate further research in this field.
Declarations
Competing interests
The authors declare no competing interests.
Ethics approval and consent to participate
This retrospective study was conducted after obtaining approval from the Institutional Review Boards of Korea University Anam Hospital (IRB No. 2023AN0544), Korea University Guro Hospital (IRB No. 2023GR0575), and Korea University Ansan Hospital (IRB No. 2024AS0009). The requirement for informed consent was waived due to the retrospective nature of the study using existing medical records.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Se Yoon Park, Email: livinwill2@gmail.com.
Hwamin Lee, Email: hwamin@korea.ac.kr.
References
- 1. Kim, Y. C. et al. Prescriptions patterns and appropriateness of usage of antibiotics in non-teaching community hospitals in South Korea: a multicentre retrospective study. Antimicrob. Resist. Infect. Control. 11, 40 (2022).
- 2. Gaieski, D. F. et al. Impact of time to antibiotics on survival in patients with severe sepsis or septic shock in whom early goal-directed therapy was initiated in the emergency department. Crit. Care Med. 38, 1045–1053 (2010).
- 3. Miethke, M. et al. Towards the sustainable discovery and development of new antibiotics. Nat. Reviews Chem. 5, 726–749 (2021).
- 4. Spoorenberg, V., Hulscher, M. E., Akkermans, R. P., Prins, J. M. & Geerlings, S. E. Appropriate antibiotic use for patients with urinary tract infections reduces length of hospital stay. Clin. Infect. Dis. 58, 164–169 (2014).
- 5. O’Neill, J. Report on antimicrobial resistance. Report Antimicrob. Resistance (2016).
- 6. Park, S. Y. et al. Appropriateness of antibiotic prescriptions during hospitalization and ambulatory care: a multicentre prevalence survey in Korea. J. Global Antimicrob. Resist. 29, 253–258 (2022).
- 7. Benkova, M., Soukup, O. & Marek, J. Antimicrobial susceptibility testing: currently used methods and devices and the near future in clinical practice. J. Appl. Microbiol. 129, 806–822 (2020).
- 8. Johnson, K. B. et al. Precision medicine, AI, and the future of personalized health care. Clin. Transl. Sci. 14, 86–93 (2021).
- 9. Jun, I. et al. Moving from predicting hospital deaths by antibiotic-resistant bloodstream bacteremia toward actionable risk reduction using machine learning on electronic health records. AMIA Summits on Translational Science Proceedings 274 (2022).
- 10. Oonsivilai, M. et al. Using machine learning to guide targeted and locally-tailored empiric antibiotic prescribing in a children’s hospital in Cambodia. Wellcome Open. Res. 3 (2018).
- 11. Dan, S. et al. Prediction of fluoroquinolone resistance in gram-negative bacteria causing bloodstream infections. Antimicrob. Agents Chemother. 60, 2265–2272 (2016).
- 12. Yelin, I. et al. Personal clinical history predicts antibiotic resistance of urinary tract infections. Nat. Med. 25, 1143–1152 (2019).
- 13. İlhanlı, N. et al. Prediction of Antibiotic Resistance in Patients With a Urinary Tract Infection: Algorithm Development and Validation. JMIR Med. Inf. 12, e51326 (2024).
- 14. Kim, C. et al. Translation of Machine Learning-Based Prediction Algorithms to Personalised Empiric Antibiotic Selection: A Population-Based Cohort Study. Int. J. Antimicrob. Agents. 62, 106966 (2023).
- 15. Lewin-Epstein, O., Baruch, S., Hadany, L., Stein, G. Y. & Obolski, U. Predicting antibiotic resistance in hospitalized patients by applying machine learning to electronic medical records. Clin. Infect. Dis. 72, e848–e855 (2021).
- 16. Kotia, J., Kotwal, A., Bharti, R. & Mangrulkar, R. Few shot learning for medical imaging. Mach. Learn. Algorithms Ind. Appl. 107–132 (2021).
- 17. Wang, Y., Yao, Q., Kwok, J. T. & Ni, L. M. Generalizing from a few examples: A survey on few-shot learning. ACM Comput. Surv. (csur). 53, 1–34 (2020).
- 18. Caruana, R. Multitask learning. Mach. Learn. 28, 41–75 (1997).
- 19. Song, X., Zheng, S., Cao, W., Yu, J. & Bian, J. Efficient and effective multi-task grouping via meta learning on task combinations. Adv. Neural. Inf. Process. Syst. 35, 37647–37659 (2022).
- 20. Yoon, C. et al. Relationship between the appropriateness of antibiotic treatment and clinical outcomes/medical costs of patients with community-acquired acute pyelonephritis: a multicenter prospective cohort study. BMC Infect. Dis. 22, 112 (2022).
- 21. Im, Y. et al. Time-to-antibiotics and clinical outcomes in patients with sepsis and septic shock: a prospective nationwide multicenter cohort study. Crit. Care. 26, 19 (2022).
- 22. Leung, L. Y. et al. Door-to-antibiotic time and mortality in patients with sepsis: Systematic review and meta-analysis. Eur. J. Intern. Med. 129, 48–61 (2024).
- 23. Liu, V. X. et al. The timing of early antibiotics and hospital mortality in sepsis. Am. J. Respir. Crit Care Med. 196, 856–863 (2017).
- 24. Crowther, G. S. & Wilcox, M. H. Antibiotic therapy and Clostridium difficile infection–primum non nocere–first do no harm. Infection Drug Resistance 333–337 (2015).
- 25. Alm, R. A. & Lahiri, S. D. Narrow-spectrum antibacterial agents: benefits and challenges. Antibiotics 9, 418 (2020).
- 26. Talpaert, M. J., Gopal Rao, G., Cooper, B. S. & Wade, P. Impact of guidelines and enhanced antibiotic stewardship on reducing broad-spectrum antibiotic usage and its effect on incidence of Clostridium difficile infection. J. Antimicrob. Chemother. 66, 2168–2174 (2011).
- 27. Patton, A. et al. Impact of antimicrobial stewardship interventions on Clostridium difficile infection and clinical outcomes: segmented regression analyses. J. Antimicrob. Chemother. 73, 517–526 (2018).
- 28. Segars, W. et al. Population of anatomically variable 4D XCAT adult phantoms for imaging research and optimization. Med. Phys. 40, 043701 (2013).
- 29. Su, M., Satola, S. W. & Read, T. D. Genome-based prediction of bacterial antibiotic resistance. J. Clin. Microbiol. 57, e01405-18 (2019).
- 30. Behrad, F. & Abadeh, M. S. An overview of deep learning methods for multimodal medical data mining. Expert Syst. Appl. 200, 117006 (2022).
- 31. Guguloth, S., Telu, A., Sairam, U. & Voruganti, S. in International Conference on Soft Computing and Pattern Recognition. 844–851 (Springer).
- 32. Mera-Gaona, M., Neumann, U., Vargas-Canas, R. & López, D. M. Evaluating the impact of multivariate imputation by MICE in feature selection. Plos one. 16, e0254720 (2021).
- 33. Costantini, E., Lang, K. M., Sijtsma, K. & Reeskens, T. Solving the many-variables problem in MICE with principal component regression. Behav. Res. Methods. 56, 1715–1737 (2024).
- 34. Borisov, V. et al. Deep neural networks and tabular data: A survey. IEEE Trans. Neural Networks Learn. Syst. (2022).
- 35. Cho, N. J. et al. A machine learning-based approach for predicting renal function recovery in general ward patients with acute kidney injury. Kidney Res. Clin. Pract. 43, 538 (2024).
- 36. Jeong, I., Kim, Y., Cho, N. J., Gil, H. W. & Lee, H. A novel method for medical predictive models in small data using out-of-distribution data and transfer learning. Mathematics 12, 237 (2024).
- 37. Hosmer, D. W. & Lemeshow, S. Goodness of fit tests for the multiple logistic regression model. Commun. Stat. Theory Methods 9, 1043–1069 (1980).