Cureus. 2025 Dec 28;17(12):e100254. doi: 10.7759/cureus.100254

Table 2. PROBAST-informed appraisal of primary prediction model studies contributing to Table 1.

Items are recorded as “Yes/No/NR/Unclear” based on study reporting; when methodological details (e.g., missing-data handling, class imbalance handling, calibration/utility) were not reported, they are conservatively coded as “NR/Unclear.” This appraisal is intended to contextualize AUC-based performance comparisons with attention to validation design, reporting completeness, and clinical readiness rather than to assign formal risk-of-bias scores.

THA: total hip arthroplasty; TKA: total knee arthroplasty; SSI: surgical site infection; DCA: decision curve analysis; NR: not reported; AUC: area under the curve; EMR: electronic medical record; CV: cross-validation; NSQIP: National Surgical Quality Improvement Program; RF: Random Forest; kNN: k-nearest neighbors; SMOTE: Synthetic Minority Over-sampling Technique; PROBAST: Prediction model Risk Of Bias ASsessment Tool

| Reference | Study (Author, Year) | Outcome | Data source | Validation | Missing data handling | Class imbalance | Calibration/Utility | PROBAST domains flagged | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [13] | Huang et al., 2023 | Surgical site infection | Ortho patients (train + external validation cohort) | External validation (separate external set) | NR | NR | Calibration + DCA reported | Analysis | Classic regression nomogram w/ calibration + DCA |
| [40] | Huang et al., 2021 | Blood transfusion after THA/TKA | Multi-hospital EMR cohort | Random subsampling + 10-fold CV | Excluded missing/incorrect (0.73%) | NR | NR | Analysis | Reports AUC-focused comparison; calibration not highlighted |
| [47] | Zang et al., 2024 | Perioperative blood transfusion (hip surgery) | Retrospective hip surgery cohort | Temporal split (first 70% train / last 30% test) | Excluded missing data | NR | Calibration + Brier + DCA reported | Analysis | Time split is stronger than random split; still single-system retrospective |
| [59] | Jiang et al., 2024 | Pedicle screw loosening | Lumbar fixation cohort | Random split (8:2) + 10-fold CV | Imputed (<20%) w/ RF regression; otherwise excluded | NR | Calibration plots + Brier | Analysis | Strong reporting on calibration/Brier; still internal validation only |
| [60] | Xiong et al., 2023 | Spine outcome model; per paper | Spine surgery imaging/clinical cohort | Train/validation split (0.75/0.25) | Excluded incomplete imaging | NR | Calibration + DCA reported | Participants / Analysis | Imaging exclusion can introduce selection bias; no external validation shown |
| [61] | Chen et al., 2023 | Surgical site infection prediction | Retrospective cohort | Train/test split (reported) | Excluded incomplete clinical data | NR | Calibration curves reported | Analysis | Calibration reported, but imbalance handling not clearly described |
| [62] | Zhang et al., 2024 | SSI following spine surgery | 986 pts “complete data” cohort | 5-fold CV (4 folds train / 1 validate) | Complete-case (“complete data” only) | NR | NR | Participants / Analysis | Internal CV only; AUC-heavy evaluation; missing handling = exclusion |
| [71] | Gupta et al., 2025 | 30-day mortality (femoral shaft fracture surgery) | NSQIP dataset | Stratified 80/20 split + 10-fold CV | Dropped vars >5% missing; kNN imputation for remaining | SMOTE + Tomek | Calibration slope/intercept + Brier | Analysis | Best reported “methods hygiene” among these (imbalance + calibration + imputation) |
| [72] | Han et al., 2025 | Postoperative infection (nosocomial infection paper) | Retrospective cohort (2011–2024) | Random 70/30 split + 10-fold CV | Imputation reported (details in supplement) | NR | Brier reported | Analysis | Strong model-development description; still internal validation only |
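
For readers less familiar with the quantities named in the Validation and Calibration/Utility columns, the following is a minimal Python sketch of three of them: the Brier score, the net benefit at a single threshold probability (the quantity plotted in a decision curve analysis), and a temporal train/test split. It is illustrative only and does not reproduce any appraised study's code; the function names, the `(date, payload)` record layout, and the toy data are assumptions made for this example.

```python
from typing import Sequence, Tuple, List, Any


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the observed 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and equal in length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating everyone with predicted risk >= threshold.

    NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def temporal_split(
    records: Sequence[Tuple[str, Any]], train_fraction: float = 0.7
) -> Tuple[List[Tuple[str, Any]], List[Tuple[str, Any]]]:
    """Order records by date and assign the earliest fraction to training.

    Assumes each record is (iso_date_string, payload); ISO dates sort
    chronologically as plain strings. This layout is hypothetical.
    """
    ordered = sorted(records, key=lambda r: r[0])
    cut = round(len(ordered) * train_fraction)  # round() guards against float error
    return ordered[:cut], ordered[cut:]


# Toy data (hypothetical, not taken from any appraised study).
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
print("Brier score:", brier_score(y_true, y_prob))
print("Net benefit at 0.3:", net_benefit(y_true, y_prob, 0.3))

records = [(f"2024-01-{day:02d}", day) for day in range(10, 0, -1)]
train, test = temporal_split(records)
print("Latest training date:", train[-1][0], "| earliest test date:", test[0][0])
```

A lower Brier score indicates predicted risks closer to observed outcomes, whereas net benefit is compared against the "treat all" and "treat none" strategies across a range of thresholds. Unlike a random split, the temporal split guarantees that every test record postdates every training record.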