Table 4. Machine learning applications for the prediction of healthcare-associated infections: ventilator-associated pneumonia (VAP), central line-associated bloodstream infections (CLABSI), surgical site infections (SSI), and sepsis.

Authors (Year) | Dataset | Input | Model/Analysis | Objective | Results |
---|---|---|---|---|---|
Liang et al. [129] (2022) | Medical Information Mart for Intensive Care (MIMIC)-III dataset | 42 VAP risk factors at admission plus routinely measured vital signs and laboratory results from 38,515 ventilation sessions | Random forest compared with a clinical pulmonary infection score (CPIS)-based model | Early prediction of ventilator-associated pneumonia in critical care patients | AUC of 84% in validation; 74% sensitivity and 71% specificity 24 h after intubation |
Giang et al. [130] (2021) | Medical Information Mart for Intensive Care (MIMIC)-III dataset | Data from 6126 adult ICU encounters | Five ML models: logistic regression, multilayer perceptron, random forest, support vector machine, and gradient-boosted trees | Prediction of ventilator-associated pneumonia with ML | The best-performing model achieved an AUROC of 0.854 |
Samadani et al. [131] (2023) | Philips eRI dataset | 9204 presumed VAP events | XGBoost gradient boosting, random forest, logistic regression, AdaBoost, k-nearest neighbors (KNN) | Early prediction and hospital phenotyping of ventilator-associated pneumonia | The model predicts the development of VAP 24 h in advance with an AUC of 76% and an AUPRC of 75% |
Jeon et al. [132] (2023) | SNU-SMG Boramae Medical Center database | Data from 816 patients, including the period from hospital admission to ICU admission, age, APACHE II score, PaO2/FiO2 ratio, history of chronic respiratory disease, history of cerebrovascular accident (CVA) or dementia, mechanical ventilation, and use of vasopressors | Logistic regression with L2 regularization, gradient-boosted decision tree (LightGBM), multilayer perceptron (MLP) | ML-based prediction of in-ICU mortality in pneumonia patients | ML models significantly outperformed the Simplified Acute Physiology Score II (AUROC 0.650 vs. 0.820 for logistic regression, 0.827 for LightGBM, and 0.838 for MLP) |
Wang et al. [133] (2023) | MIMIC-IV and eICU databases | MIMIC-IV (n = 4697) and eICU (n = 13,760) databases; six variables included: metastatic solid tumor, Charlson Comorbidity Index, readmission, congestive heart failure, age, and Acute Physiology Score II | Logistic regression, decision tree, random forest, multilayer perceptron, XGBoost | Prediction of mortality in pneumonia patients on intensive care unit admission | AUC values for predicting 1-year and hospital mortality were 0.784–0.797 and 0.691–0.780, respectively |
Wang et al. [134] (2023) | Medical Information Mart for Intensive Care-III (MIMIC-III) database | 786 VAP incidences among traumatic brain injury (TBI) patients | Random forest, XGBoost, and AdaBoost | Development of algorithms for prediction of ventilator-associated pneumonia in traumatic brain injury patients | Random forest performed best in the training cohort (AUC 1.000); AdaBoost performed best in the validation cohort (AUC 0.706) |
Rahmani et al. [136] (2022) | National longitudinal electronic health records | Demographics, number of days a patient had been hospitalized before placement of a central line, laboratory values, and vital signs (n = 27,619) | XGBoost, logistic regression, decision tree | Early prediction of central line-associated bloodstream infection (CLABSI) using ML | XGBoost was the highest-performing model, with an AUROC of 0.762 for CLABSI risk prediction at 48 h after the recorded time of central line placement |
Beeler et al. [137] (2018) | Indiana University Health Academic Health Center (IUH AHC) database | Intrinsic and extrinsic risk factors (n = 70,218) | Logistic regression and random forest | ML-based assessment of patient risk for central line-associated bacteremia | Random forest achieved an AUROC of 0.82, compared with 0.79 for the logistic regression model |
Parreco et al. [138] (2018) | Medical Information Mart for Intensive Care (MIMIC)-III database | Variables included six severity-of-illness scores calculated on the first day of ICU admission, together with their components and comorbidities; the outcomes of interest were in-hospital mortality, central line placement, and CLABSI (n = 57,786) | Logistic regression, gradient-boosted trees, and deep learning | Prediction of central line-associated bloodstream infections and mortality using supervised ML | Deep learning classifiers achieved the highest AUCs for mortality (0.885) and central line placement (0.816); logistic regression predicted CLABSI with an AUC of 0.722 |
Bonello et al. [139] (2022) | Boston Children’s Hospital database | Patient-level risk factors, encounter-level risk factors, demographics, vital sign measurements from the preceding 24 h, recent course-related risk factors, laboratory values, and CVC-associated risk factors (n = 7468) | Generalized linear modeling, random forest, lasso regression | Prediction of impending CLABSI infections in hospitalized cardiac patients | ML predicted 25% of patients with impending CLABSI, with an FPR of 0.11% and AUC of 0.82 |
Hu et al. [141] (2015) | Surgical patient database at the University of Minnesota Medical Center | Clinical data comprising six data types: demographics, diagnosis codes, orders, lab results, vital signs, and medications; demographics included each patient’s gender, race, and age at the time of surgery | Single-task learning, hierarchical classification, offset method, propensity-weighted observations (PWO), multi-task learning with penalties (MTLP), partial least squares regression (PLS) | Automated detection of postoperative complications using EHR data | The models demonstrated high detection performance, supporting the feasibility of accelerating manual chart review (MCR) |
Kuo et al. [142] (2018) | Kaohsiung Chang Gung Memorial Hospital database | Dataset including 1836 patients with 1854 free-flap reconstructions and 438 postoperative SSIs | Feed-forward artificial neural network (ANN) and logistic regression (LR) models | Artificial neural network approach to predict surgical site infection after free-flap reconstruction in patients receiving surgery for head and neck cancer | ANN achieved significantly higher AUCs than LR for postoperative (0.892) and preoperative (0.808) prediction |
Sohn et al. [143] (2017) | American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) cohort | Cohort data | Bayesian network coupled with natural language processing (NLP) | Detection of clinically important colorectal surgical site infection | The Bayesian network detected ACS-NSQIP-captured SSIs with a receiver operating characteristic AUC of 0.827 |
Soguero-Ruiz et al. [144] (2015) | EHR of the Department of Gastrointestinal Surgery at the University Hospital of North Norway | A cohort based on relevant International Classification of Diseases (ICD-10) or NOMESCO Classification of Surgical Procedures (NCSP) codes related to severe post-operative complications (101 cases and 904 controls) | Gaussian process (GP) regression, support vector machine (SVM) | Data-driven temporal prediction of surgical site infection | Demonstrated real-time prediction and identification of patients at risk of developing SSI |
Mamlook et al. [145] (2023) | American College of Surgeons National Surgical Quality Improvement Program (ACS-NSQIP) database | Data from 2,882,526 surgical procedures | Logistic regression (LR), naïve Bayes (NB), random forest (RF), decision tree (DT), support vector machine (SVM), artificial neural network (ANN), and deep neural network (DNN) | Prediction of surgical site infections using patient pre-operative risk and surgical procedure factors | The DNN model offered the best predictive performance under 10-fold cross-validation compared with the other six approaches (AUC 0.8518, accuracy 0.8518, precision 0.8517, sensitivity 0.8527, F1-score 0.8518) |
Cho et al. [146] (2023) | Samsung Medical Center clinical data warehouse (CDW) | Clinical data | Random forest (RF), gradient boosting (GB), and neural network (NN) with or without recursive feature elimination (RFE), implemented in Python (TensorFlow, Keras, scikit-learn) | Development of ML models for the surveillance of colon surgical site infections | NN with RFE using 29 variables performed best, with an AUC of 0.963, PPV of 21.1%, and sensitivity of 95% |
Petrosyan et al. [147] (2021) | The Ottawa Hospital database | Patients aged 18 years and older who underwent surgery and were included in the American College of Surgeons National Surgical Quality Improvement Program (NSQIP) data collection | Random forest algorithm, high-performance logistic regression | Prediction of postoperative surgical site infection with administrative data | The final model, including hospitalization diagnostic, physician diagnostic, and procedure codes, demonstrated excellent discrimination (C statistic 0.91, 95% CI 0.90–0.92) |
Wu et al. [148] (2023) | Calgary, Canada acute care hospital database | Adult patients (age ≥ 18 years) who underwent primary elective total hip (THA) or knee (TKA) arthroplasty | XGBoost models | ML-aided detection of surgical site infections following total hip and knee arthroplasty | XGBoost models combining administrative data and text data to identify complex SSIs achieved the best performance, with an F1 score of 0.788 and ROC AUC of 0.906 |
Chen et al. [149] (2023) | The First Affiliated Hospital of Guangxi Medical University, Department of Spine and Osteopathy Ward database | Patients who underwent lumbar internal fixation surgery (n = 4019) | Lasso regression analysis, support vector machine, random forest | Application of ML to predict surgical site infection after lumbar spine surgery | The C-index of the model was 0.986, with an ROC AUC of 0.988 |
Wang et al. [157] (2021) | Observational cohort from the Intensive Care Unit of the First Affiliated Hospital of Zhengzhou University | Electronic medical record data, a set of 55 features (variables) from 4449 infected patients | Random forest | Application of ML for accurate prediction of sepsis in ICU patients | ROC AUC of 0.91, with 87% sensitivity and 89% specificity for sepsis prediction |
Lauritsen et al. [158] (2020) | Retrospective data from multiple Danish hospitals | EHR data, including biochemistry, medicine, microbiology, medical imaging, and the patient administration system (PAS) | Combination of a convolutional neural network and a long short-term memory network | Early detection of sepsis utilizing deep learning on EHR event sequences | Model performance ranged from an AUROC of 0.856 (3 h before sepsis onset) to 0.756 (24 h before sepsis onset) |
Yuan et al. [159] (2020) | Prospective open-label cohort study conducted at Taipei Medical University Hospital | Data including vital signs, laboratory results, examination reports, text data, and images for every ICU patient | Logistic regression, support vector machine, XGBoost, and neural network | Development of an AI algorithm for early sepsis diagnosis in the intensive care unit | The established AI algorithm achieved 82% accuracy, 65% sensitivity, 88% specificity, 67% precision, and an F1 of 0.66 ± 0.02; AUROC was 0.89 |
Fagerström et al. [160] (2019) | Medical Information Mart for Intensive Care database | Vital signs, laboratory data, and journal entries (n = 59,000 ICU patients) | LiSep LSTM, a long short-term memory neural network built in Keras with Google TensorFlow | Application of an ML algorithm for early detection of septic shock | LiSep LSTM outperformed a less complex model using the same features and targets, with an AUROC of 0.8306 |
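Most of the tabular-EHR studies summarized in Table 4 share one modeling pattern: assemble a feature vector per encounter (demographics, severity scores, vital signs, laboratory values), train a tree-ensemble or linear classifier against a labeled infection or mortality outcome, and report discrimination as AUROC (and sometimes AUPRC) on a held-out split. The sketch below illustrates that pattern with scikit-learn; the cohort, feature names, and label are synthetic placeholders, not variables from MIMIC, eICU, or any other cited dataset.

```python
# Minimal sketch of the tabular workflow shared by most studies in Table 4:
# a tree-ensemble classifier evaluated by held-out AUROC/AUPRC.
# All data below are synthetic placeholders, not real EHR variables.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic cohort: one row per ICU encounter / ventilation session.
n = 5_000
X = np.column_stack([
    rng.normal(65, 15, n),   # age (years)
    rng.normal(20, 6, n),    # APACHE II-like severity score
    rng.normal(250, 80, n),  # PaO2/FiO2 ratio
    rng.normal(12, 4, n),    # white blood cell count (10^9/L)
    rng.integers(0, 2, n),   # vasopressor use (0/1)
])
# Synthetic outcome loosely tied to severity (~10% prevalence).
logit = 0.06 * (X[:, 1] - 20) - 0.005 * (X[:, 2] - 250) - 2.2
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]
print(f"AUROC: {roc_auc_score(y_test, proba):.3f}")
print(f"AUPRC: {average_precision_score(y_test, proba):.3f}")
```

The published studies layer cohort-specific preprocessing, temporal alignment (e.g., predicting 24 or 48 h before onset), feature selection, and calibration on top of this skeleton, all of which the sketch deliberately omits.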
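The deep-learning entries (Lauritsen et al. [158], Fagerström et al. [160]) instead learn directly from EHR time series. Below is a hedged sketch of that sequence-model pattern, assuming hourly resampled vitals packed into a fixed-length tensor; the architecture and hyperparameters are illustrative, not the published CNN-LSTM or LiSep LSTM configurations.

```python
# Illustrative sequence model for early sepsis warning, in the spirit of
# the CNN + LSTM approaches in Table 4. Shapes, data, and hyperparameters
# are assumptions, not the published architectures.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

rng = np.random.default_rng(0)

# Synthetic stays: 24 hourly steps x 6 vital/lab channels each.
n, steps, channels = 2_000, 24, 6
X = rng.normal(size=(n, steps, channels)).astype("float32")
# Synthetic label driven by a late-window drift in channel 0.
y = (X[:, -6:, 0].mean(axis=1) > 0.2).astype("float32")

model = tf.keras.Sequential([
    layers.Input(shape=(steps, channels)),
    # Convolution extracts local temporal patterns before the LSTM.
    layers.Conv1D(32, kernel_size=3, padding="causal", activation="relu"),
    layers.LSTM(32),
    layers.Dense(1, activation="sigmoid"),  # P(onset within horizon)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auroc")])
model.fit(X, y, validation_split=0.25, epochs=5, batch_size=64, verbose=0)
loss, auroc = model.evaluate(X, y, verbose=0)
print(f"training-set AUROC: {auroc:.3f}")
```

In the real systems, the prediction horizon (e.g., 3–24 h before onset) and irregularly sampled event sequences require more careful windowing and label alignment than this fixed-length tensor suggests.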