Table 2.
Reference | Validation | Comparison | Best Performer | BP’s family | Metrics Used | Results | Most Important Laboratory Features for the Model | Issues/Notes | |
---|---|---|---|---|---|---|---|---|---|
Awad et al. (2017) [18] | CV | RF, DT, NB, PART, Scores (SOFA, SAPS-I, APACHE-II, NEWS, qSOFA) | RF | Trees | AUROC | RF best performance (VS subset) predicting hospital mortality: 0.90 ± 0.01 AUROC; AUROC of RF (15 variables) at 6 h: 0.82 ± 0.04; SAPS at 24 h (best performer among scores): 0.650 ± 0.012 | Vital signs, age, serum urea nitrogen, respiratory rate max, heart rate max, heart rate min, creatinine max, care unit name, potassium min, GCS min and systolic blood pressure min | Performance metrics for comparison referred to cross-validation results | |
Escobar et al. (2017) [19] | CV | 3 LoR models, Zilberberg model | LoR (automated model) | Regression | AUROC, pseudo-R2, Sensitivity, Specificity, PPV, NPV, NNE, NRI, IDI | AUROC; R2. Age ≥ 65 years: 0.546; −0.1131. Basic model: 0.591; −0.0910. Zilberberg model: 0.591; −0.0875. Enhanced model: 0.587; −0.0924. Automated model: 0.605; −0.1033 | | Performance metrics for comparison referred to cross-validation results | |
Richardson and Lidbury (2017) [20] | CV | RF (variables selection) + SVM *** | NE | SVM *** | AUROC, F1, Sensitivity, Specificity, Precision | For both HBV and HCV, 3 balancing methods and 2 feature selectors were tested, showing how they can change SVM performance | HBV: ALT, age and sodium; HCV: age, ALT and urea | ||
Zhang et al. (2017) [21] | CV | GBT *** | NE | Ensemble *** | RI, H-statistic (features); AUROC, Sensitivity, Specificity (model) | WBC count ≥ 15 × 10⁹/L (RI: 49.47, p < 0.001), spinal cord involvement (RI: 26.62, p < 0.001), spinal nerve roots involvement (RI: 10.34, p < 0.001), hyperglycaemia (RI: 3.40, p < 0.001), brain or spinal meninges involvement (RI: 2.45, p = 0.003), EV-A71 infection (RI: 2.24, p < 0.001). Interaction between elevated WBC count and hyperglycaemia (H statistic: 0.231, 95% CI: 0–0.262, p = 0.031), between spinal cord involvement and duration of fever (H statistic: 0.291, 95% CI: 0.035–0.326, p = 0.035), and between brainstem involvement and body temperature (H statistic: 0.313, 95% CI: 0–0.273, p = 0.017). GBT model: 92.3% prediction accuracy, AUROC 0.985, Sensitivity 0.85, Specificity 0.97 | |||
Takeuchi et al. (2017) [22] | OOB | Scores (Gunma Score, Kurume Score and Osaka Score), RF | RF | Trees | AUROC, Sensitivity, Specificity, PPV, NPV, out-of-bag error estimation | RF: AUROC 0.916, Sensitivity 79.7%, Specificity 87.3%, PPV 85.2%, NPV 82.1%, OOB error rate 15.5%. Sensitivity and Specificity were: 69.8% and 60.0% (GS); 60.6% and 55.4% (KS); 24.1% and 77.0% (OS). PPV (28.2%–45.1%), NPV (82.0%–86.8%) | Aspartate aminotransferase, lactate dehydrogenase concentrations, percent neutrophils | Performance metrics for comparison referred to cross-validation results | |
Hernandez et al. (2017) [23] | CV | DT, RF, SVM, Naive Bayes | SVM | SVM | AUROC, AUPRC, Sensitivity, Specificity, PPV, NPV, TP, FP, TN, FN | SVM with the SMOTE sampling method and 6 features obtained the best results. AUROC, AUPRC, Sensitivity, Specificity: 0.830, 0.884, 0.747, 0.912 | |||
Bertsimas et al. (2018) [24] | VS | LoR, Regularized LoR, Optimal Classification Tree, CART, GBT | Optimal Classification Tree * | Trees | Accuracy (threshold 50%), PPV at Sensitivity of 0.6, AUC | Optimal Classification Tree results for 60-day, 90-day, 120-day mortality. Accuracy: 94.9, 93.3, 86.1; PPV: 20.2, 27.5, 43.1; AUC: 0.86, 0.84, 0.83 | Albumin, change in weight, pulse, WBC count, haematocrit, according to the kind of cancer | The validation set was used only for NN, KNN, and SVM | |
Jeong et al. (2018) [25] | CV | CERT, CLEAR, PACE, RF, L1-regularized LoR, SVM, NN | RF | Trees | AUROC, F1, Sensitivity, Specificity, PPV, NPV | ML models produced higher averaged F1-measures (0.629–0.709) and AUROC (0.737–0.816) than the original methods: AUROC (0.020–0.597) and F1 (0.475–0.563) | |||
Rosenbaum and Baron (2018) [26] | NA | Univariate models, LoR, SVM | SVM | SVM | AUC, Specificity, PPV | AUROC on testing set (simulated WIBT). Best univariate (BUN): 0.84 (interquartile range 0.83–0.84); SVM (difference and values): 0.97 (0.96–0.97); LoR (difference and values): 0.93 | Difference and values together | No data available from the comparison among machines | |
Ge et al. (2018) [27] | CV | RNN-LSTM + LoR vs LoR | RNN-LSTM | DL | AUROC, TP, FP | AUROC cross-validation, AUROC testing set. Logistic Regression: 0.7751, 0.7412; RNN-LSTM model: 0.8076, 0.7614 | Associated with ICU Mortality: Do Not Reanimate, Prednisolone, Disseminated intravascular coagulation; Associated with ICU Survival: Arterial blood gas pH, Oxygen saturation, Pulse | ||
Jonas et al. (2018) [28] | CV | LoR (LASSO), RF *** | NE | NE | NE | LASSO identified as the most predictive of a positive response to vasoreactivity test: 6-MWD, diabetes, HDL-C, creatinine, right atrial pressure, and cardiac index. RF identified as the most predictive: NT-proBNP, HDL-C, creatinine, right atrial pressure, and cardiac index. 6-MWD, HDL-C, hs-CRP, and creatinine levels best discriminated between long-term responders and non-responders | | Performance metrics for comparison referred to cross-validation results. Tool available online | |
Sahni et al. (2018) [29] | NA | LoR, RF | RF | Trees | AUROC | AUROC. RF (demographics, physiological, lab, all comorbidities): 0.85 (0.84–0.86); LoR (demographics, physiological, lab, all comorbidities): 0.91 (0.90–0.92) | Age, BUN, platelet count, haemoglobin, creatinine, systolic blood pressure, BMI, and pulse oximetry readings | Performance metrics for comparison referred to cross-validation results | |
Rahimian et al. (2018) [30] | CV | CPH, RF, GBC | GBC | Ensemble | AUROC | AUROC (95% CI) by variable set for CPH, RF, GBC. Internal validation: QA: 0.740 (0.739, 0.741), 0.752 (0.751, 0.753), 0.779 (0.777, 0.781); T: 0.805 (0.804, 0.806), 0.825 (0.824, 0.826), 0.848 (0.847, 0.849). External validation: QA: 0.736, 0.736, 0.796; T: 0.788, 0.810, 0.826 | Age, cholesterol ratio, haemoglobin, and platelets, frequency of lab tests, systolic blood pressure, number of admissions during the last year | Tool available online | |
Foysal et al. (2019) [31] | CV | Regression analysis and SVM *** | NE | SVM | R2 score, Standard error of detection, Accuracy | Accuracy: 98% | NE | Performance metrics for comparison referred to cross-validation results | |
Xu et al. (2019) [32] | CV | L1 Logistic Regression, Regress and Round, Naive Bayes, NN-MLP, DT, RF, AdaBoost, XGBoost | XGBoost, RF | NA | AUROC, Sensitivity, Specificity, NPV, PPV | Mean AUROC: 0.77 on testing set. AUROC > 0.90 on 22 lab tests out of 43. On external validation, results differed according to the lab test considered | NE | DL missed Albumin as OS predictor | |
Burton et al. (2019) [33] | CV | Heuristic model (LoR) with microscopy thresholds, NN, RF, XGBoost | XGBoost * | Ensemble | AUROC, Accuracy, PPV, NPV, Sensitivity, Specificity, Relative Workload Reduction (%) | AUC, Accuracy, PPV, NPV, Sensitivity (%), Specificity (%), Relative Workload Reduction (%). Pregnant patients: 0.828, 26.94, 94.6 [±0.56], 26.84 [±1.88], 25.29 [±0.92]. Children (<11 years): 0.913, 62.00, 94.8 [±0.88], 55.00 [±2.12], 46.24 [±1.48]. Pregnant patients: 0.894, 71.65, 95.3 [±0.24], 60.93 [±0.65], 43.38 [±0.41]. Combined performance: 0.749, 65.65, 47.64 [±0.51], 97.14 [±0.28], 95.2 [±0.22], 60.93 [±0.60], 41.18 [±0.39] | WBC count, bacterial count, age, epithelial cell count, RBC count | ||
Fillmore et al. (2019) [34] | CV | L1 LoR (LASSO), SVM, RF | RF | Trees | Accuracy | Accuracy by lab test (LR, SVM, RF). ALP: 0.98, 0.97, 0.98; ALT: 0.98, 0.94, 0.92; ALB: 0.97, 0.92, 0.98; HDLC: 0.98, 0.91, 0.98; Na: 0.97, 0.98, 0.99; Mg: 0.97, 0.95, 0.99; HGB: 0.97, 0.95, 0.99 | | Precise performance data on the testing set not provided | |
Zimmerman et al. (2019) [35] | CV | LiR, LoR, RF, NN-MLP | NN-MLP | DL | AUROC, Accuracy, Sensitivity, Specificity, PPV, NPV | LiR regression task, RMSEV: Linear Backward Selection Model 0.224; Linear All Variables Model 0.224. AUROC, Accuracy, Sensitivity, Specificity, PPV, NPV: LR, Backward Selection Model: 0.780, 0.724, 0.697, 0.730, 0.337, 0.924; LR, All Variables Model: 0.783, 0.729, 0.698, 0.736, 0.342, 0.925; RF, Backward Selection Model: 0.772, 0.739, 0.660, 0.754, 0.346, 0.918; RF, All Variables Model: 0.779, 0.742, 0.673, 0.756, 0.352, 0.921; MLP, Backward Selection Model: 0.792, 0.744, 0.684, 0.756, 0.356, 0.924; MLP, All Variables Model: 0.796, 0.743, 0.694, 0.753, 0.357, 0.926 | Sex, age, ethnicity, hypoxemia, mechanical ventilation, coagulopathy, calcium, potassium, creatinine level | Performance metrics for comparison referred to cross-validation results | |
Sharafoddini et al. (2019) [36] | CV | LASSO for choosing most important variables. DT, LoR, RF, SAPS-II (score) | Logistic Regression | Regression | AUROC | Including indicators improved the AUROC in all modelling techniques, on average by 0.0511; the maximum improvement was 0.1209 | BUN, RDW, anion gap on all 3 days; day 1: TBil, phosphate, Ca, and Lac; days 2 and 3: Lac, BE, PO2, and PCO2; day 3: PTT and pH | ||
Matsuo et al. (2019) [37] | CV | NN, CPH, CoxBoost, CoxLasso, Random Survival Forest | NN | DL | Concordance Index, Mean Absolute Error | Concordance index, mean absolute error (mean ± standard error). Progression-free survival (PFS): CPH: 0.784 ± 0.069, 316.2 ± 128.3; DL: 0.795 ± 0.066, 29.3 ± 3.4. Overall survival (OS): CPH: 0.607 ± 0.039, 43.6 ± 4.3; DL: 0.616 ± 0.041, 30.7 ± 3.6 | PFS: BUN, Creatinine, Albumin, (Only DL) WBC, Platelet, Bicarbonate, Haemoglobin; OS: BUN (only DL) Bicarbonate (only CPH) Platelet, Creatinine, Albumin | ||
Yang et al. (2019) [38] | OOB | RF *** | NE | Trees *** | OOB | Predicting outcome (discharge/death). Out-of-bag error: 0.073; Accuracy: 0.927; Recall/sensitivity: 0.702; Specificity: 0.973; Precision: 0.840 | Bicarbonate, phosphate, anion gap, white cell count (total), PTT, platelet, total calcium, chloride, glucose and INR | Not clear how they split the dataset and which results are reported | |
Daunhawer et al. (2019) [39] | CV | L1 Regularized LoR (LASSO), RF | RF+LASSO | NE | AUROC | AUROC on cross-validation, test set, external set. RF: 0.933 ± 0.019, 0.927, 0.9329; LASSO: 0.947 ± 0.015, 0.939, 0.9470; RF + LASSO: 0.952 ± 0.013, 0.939, 0.9520 | Gestational age, weight, bilirubin level, and hours since birth | ||
Estiri et al. (2019) [40] | Pl | CAD (Standard deviation and Mahalanobis distance), Hierarchical k-means | Hierarchical k-means | Clustering | FP, TP, FN, TN, Sensitivity, Specificity, and fallout across the eight thresholds | Specificity increases as threshold decreases; the lowest was 0.9938. Sensitivity > 0.85 in 39/41 variables, Troponin I = 0.0545, LDL = 0.4867. For sensitivity, 39/41 CAD~ML, 9/41 CAD > ML. For FP, in 45/50 ML had fewer FP than CAD | |||
Kayhanian et al. (2019) [41] | CV | LoR, SVM | SVM | SVM | Sensitivity, Specificity, AUC, J-statistic | Sensitivity, Specificity, J-statistic, AUC. Linear model, all variables: 0.75, 0.99, 0.7, 0.9; Linear model, three variables: 0.71, 0.99, 0.74, 0.83; SVM, all variables: 0.63, 1, 0.79, N/A; SVM, three variables: 0.8, 0.99, 0.63, N/A | Lactate, pH and glucose | ||
Wang et al. (2019) [42] | CV | Auto-Weka (39 ML algorithms) | RF | Trees | Sensitivity, Specificity, AUROC, Accuracy | Time after ICH, case number, best algorithm: Sensitivity, Specificity, Accuracy, AUC. 1 month: 307, Random forest: 0.774, 0.869, 0.831, 0.899; 6 months: 243, Random forest: 0.725, 0.906, 0.839, 0.917 | 1 month: ventricle compression, GCS, ICH volume, location, Hgb; 6 months: GCS, location, age, ICH volume, gender, DBP, WBC | Connection between HDL-C and reactivity of the pulmonary vasculature is a novel finding | |
Ye et al. (2019) [43] | NA | Retrospective: RF, XGBoost, Boosting, SVM, LASSO, KNN; Prospective: RF | RF | Trees | AUROC, PPV, Sensitivity, Specificity | RF’s AUROC: 0.884 (highest among all ML models). High-risk sensitivity, PPV, low–moderate risk sensitivity, PPV: EWS: 26.7%, 69%, 59.2%, 35.4%; ViEWS: 13.7%, 35%, 35.7%, 21.4% | Diagnoses of cardiovascular diseases, congestive heart failure, or renal diseases | No information about tuning | |
Yang et al. (2020) [44] | CV | LoR, DT (CART), RF, and GBDT | GBDT | Ensemble | AUROC, sensitivity, specificity, agreement with RT-PCR (Agr-PCR) | AUROC; Sensitivity; Specificity; Agr-PCR. GBDT on cross-validation: 0.854 (0.829–0.878); 0.761 (0.744–0.778); 0.808 (0.795–0.821); 0.791 (0.776–0.805). GBDT on independent testing set: 0.838; 0.758; 0.740 | LDH, CRP, Ferritin | No information about model, training, validation, test | |
Ma et al. (2020) [45] | CV | RF, XGBoost, LoR for selecting variables for the new model; New Model vs Score (CURB-65), XGBoost | New Model | Other | AUROC | AUROC on testing set (13 patients), AUROC on cross-validation. New Model: 0.9667, 0.9514; CURB-65: 0.5500, 0.8501; XGBoost: 0.3333, 0.4530 | LDH, CRP, Age | Tool available online | |
Hyun et al. (2020) [46] | NE | k-means *** | NE | Clustering *** | NE | 3 clusters. Cluster 2: abnormal haemoglobin and RBC; Cluster 3: highest mortality, intubation, cardiac medications and blood administration | BUN, creatinine, potassium, haemoglobin, and red blood cell | ||
Lee et al. (2020) [47] | CV | RF, SVM, LASSO, Ridge, Elastic Net Regularization, MEWS | RF | Trees | AUROC, AUPRC, BA, Sensitivity, Specificity, F1, PLR, and NLR | AUROC; AUPRC; Sensitivity; Specificity. RF OSO: 0.80 (0.76 to 0.84); 0.25 (0.18 to 0.33); 0.70 (0.62 to 0.82); 0.78 (0.66 to 0.83). RF OSR: 0.88 (0.85 to 0.91); 0.39 (0.30 to 0.47); 0.81 (0.76 to 0.89); 0.81 (0.75 to 0.83) | OSO: Troponin I, creatine kinase and CK-MB; OSR: Lactic Acid | Performance metrics for comparison referred to cross-validation results | |
Morid et al. (2020) [48] | CV | RF, XGBT, Kernel-based Bayesian Network, SVM, LoR, Naive Bayes, KNN, ANN | RF | Trees | AUC, F1, Accuracy | RF model performance according to the detection method (Accuracy, AUC). Last recorded value: 0.581, 0.589; Symbolic pattern detection: 0.706, 0.694; Local structural pattern: 0.781, 0.772; Global structural pattern: 0.744, 0.730; Local & Global: 0.813, 0.809 | NE | ||
Yu et al. (2020) [49] | NA | ANN *** | NE | DL *** | Checking Proportions (CP), Prediction Accuracy, Aggregated Accuracy (AA) | Threshold for performing test: CP; AA. 0.15: 90.14%; 95.83%. 0.25: 85.78%; 95.05%. 0.35: 79.71%; 93.32%. 0.45: 71.70%; 90.95%. 0.6: 50.46%; 85.30% | NE | Performance data not included, only a graph of the AUROC of prediction to 1 month (with 4-month history) | |
Chicco and Jurman (2020) [50] | VS | LiR, RF, One-Rule, DT, ANN, SVM, KNN, Naive Bayes, XGBoost | RF | Trees | MCC, F1, Accuracy, TP, TN, PRAUC, AUROC | MCC, F1, Accuracy, TP, TN, PRAUC, AUROC. All features, RF: +0.384, 0.547, 0.740, 0.491, 0.864, 0.657, 0.800; Cr + EF, RF: +0.418, 0.754, 0.585, 0.541, 0.855, 0.541, 0.698; Cr + EF + FU time, LoR: +0.616, 0.719, 0.838, 0.785, 0.860, 0.617, 0.822 | Serum Creatinine and Ejection Fraction | ||
Ye et al. (2020) [51] | CV | GBDT, AdaBoost, LGB, Logistic, Vote, XGB, Decision Tree, and Random Forest, stepwise LoR, LoR with RCS | GBDT | Ensemble | AUROC, Recall, Precision, F1 | Discrimination (AUC): GBDT 73.51%, 95% CI 71.36%–75.65%; LoR with RCS 70.9%, 95% CI 68.68%–73.12%. 0.3 and 0.7 were set as cut-off points for predicting outcomes (GDM or adverse pregnancy outcomes) | GBDT: Fasting blood glucose, HbA1c, triglycerides, and maternal BMI; LoR: HbA1c and high-density lipoprotein | ||
Macias et al. (2020) [52] | CV | RF (features) + RNN-LSTM, RF | RNN-LSTM (all variables) | DL | AUROC | AUROC for mortality prediction at 1 month. RF: 0.737; RNN (many) expert variables: 0.781 ± 0.021; RNN RF variables: 0.820 ± 0.015; RNN all variables: 0.873 ± 0.021 | |||
Lobo et al. (2020) [53] | VS | RNN-LSTM + NN + RNN-LSTM *** | NE | DL | Mean Error (ME), Mean Absolute Error (MAE), Mean Squared Error (MSE) | Best model performance: ME: 0.017; MAE: 0.527; MSE: 0.489; predicting to 1 month with 5 months of history data | |||
Roimi et al. (2020) [54] | CV | 6 RF+2 XGBoost, RF, XGBoost, LoR | 6 RF+2 XGBoost | Other | AUROC, Brier score | AUROC by modelling approach (BIDMC derivation set CV, BIDMC validation set, RHCC derivation set CV, RHCC validation set). Logistic regression: 0.75 ± 0.06, 0.70 ± 0.02, 0.80 ± 0.08, 0.72 ± 0.02; Random Forest: 0.82 ± 0.03, 0.85 ± 0.01, 0.90 ± 0.03, 0.88 ± 0.02; Gradient Boosting Trees: 0.84 ± 0.04, 0.84 ± 0.02, 0.93 ± 0.04, 0.88 ± 0.01; Ensemble of models: 0.87 ± 0.03, 0.89 ± 0.01, 0.93 ± 0.03, 0.92 ± 0.01. When validating the models of BIDMC over the RHCC dataset and vice versa, the AUROCs of the models deteriorated to 0.59 ± 0.07 and 0.60 ± 0.06 for BIDMC and RHCC | Most of the strongest features included patterns of change in the time-series variables | Performance metrics for comparison referred to cross-validation results | |
Kirk et al. (2020) [55] | NA | SVM (cut-offs features), LoR, Random Forest regression algorithm | RF | Trees | AUROC | AUROC. Baseline clinical and demographic values: 0.52; inclusion of laboratory value thresholds from the day of discharge: 0.54; adding daily postoperative laboratory thresholds to the demographic and clinical variables: 0.59; adding postoperative complications: 0.62; random forest regression, all features: 0.68 | White blood cell count, bicarbonate, BUN, and creatinine | ||
Li et al. (2020) [56] | VS | RF, LoR | LoR | Regression | AUROC, Accuracy, Precision, F1, Recall | Prospective cohort results (AUROC, Accuracy, Precision, F1 score, Recall). RF: 0.830 (0.770–0.887), 0.916 (0.891–0.936), 0.907 (0.881–0.928), 0.901 (0.874–0.922), 0.917 (0.892–0.937); LoR: 0.858 (0.808–0.903), 0.905 (0.879–0.926), 0.887 (0.859–0.910), 0.883 (0.855–0.906), 0.905 (0.879–0.926) | RBC, SI, BE, Lac, DBP, pH | ||
Balamurugan et al. (2020) [57] | CV | Auto-Weka (Naive Bayes, DT-J48, MLP, SVM) & 4 features selectors *** | NE | NE | AUROC, F1, Precision, Accuracy, Recall, MCC, TPR, FPR | Proposed model: features selected; Accuracy; TP rate; FP rate. GA + J48: 9; 94.32; 0.925; 0.118. PSO + J48: 9; 96.25; 0.963; 0.163. CFS + J48: 11; 84.63; 0.861; 0.871. EWSORA + J48: 4; 98.72; 0.950; 0.165 | RBC, HGB, HCT, WBC | Performance metrics for comparison referred to cross-validation results | |
Hu et al. (2020) [58] | CV | XGBoost, RF, LR, Score (APACHE II, PSI) | XGBoost | Ensemble | AUROC | AUROC. XGBoost: 0.842 (95% CI 0.749–0.928); RF: 0.809 (95% CI 0.629–0.891); LR: 0.701 (95% CI 0.573–0.825); APACHE II: 0.720 (95% CI 0.653–0.784); PSI: 0.720 (95% CI 0.654–0.7897) | Fluid balance domain, Laboratory data domain, severity score domain, Management domain, Demographic and symptom domain, Ventilation domain | ||
Aydin et al. (2020) [59] | CV | Naïve Bayes, KNN, SVM, GLM, RF, and DT | DT * | Trees | AUC, Accuracy, Sensitivity, Specificity | AUC (%); Accuracy (%); Sensitivity (%); Specificity (%). RF: 99.67; 97.45; 97.79; 97.21. KNN: 98.68; 95.58; 95.08; 95.93. NB: 98.71; 94.76; 94.06; 95.25. DT: 93.97; 94.69; 93.55; 96.55. SVM: 96.76; 91.24; 90.32; 91.86. GLM: 96.83; 90.96; 90.66; 91.16 | Platelet distribution width (PDW), white blood cell count (WBC), neutrophils, lymphocytes | ||
Metsker et al. (2020) [60] | CV | KNN for clustering data and then comparison among Linear Regression, Logistic Regression, ANN, DT, and SVM | ANN | DL | AUROC, F1, Precision, Accuracy, Recall | Precision, Recall, F1 score, Accuracy, AUC. Linear Regression (29 variables): 0.6777, 0.7911, 0.7299, 0.7472; ANN (31 variables): 0.7982, 0.8152, 0.8064, 0.8261, 0.8988 | Age, Mean Platelet Volume | ||
Voglis et al. (2020) [61] | Bt | Generalized Linear Models (GLM), GLMBoost, Naïve Bayes classifier, and Random Forest | GLMBoost | Ensemble | AUROC, Accuracy, F1, PPV, NPV, Sensitivity, Specificity | AUROC: 84.3% (95% CI 67.0–96.4); Accuracy: 78.4% (95% CI 66.7–88.2); Sensitivity: 81.4%; Specificity: 77.5%; F1 score: 62.1%; NPV: 93.9%; PPV: 50% | Preoperative serum prolactin, preoperative serum insulin-like growth factor 1 level (IGF-1), BMI, preoperative serum sodium level | ||
* It was chosen as the most useful, although it was not the best performer; ** Different models were trained with a different number of features; *** A comparison of the ML models was not made; NA: Not available; NE: Not evaluable (meaning not pertinent). For all the other abbreviations, see Appendix B.