2024 Apr 25;16(9):1645. doi: 10.3390/cancers16091645

Table 2.

Performance of clinical prediction models for prognosis of patients with colorectal liver metastases.

| First Author (Year) | Discrimination (AUC) | Calibration Measures | Calibration Performance | DCA |
|---|---|---|---|---|
| Buisman (2022) [20] | 0.73 | Calibration curve | Good calibration (MSKCC model)/slight underprediction (Erasmus MC model) | NR |
| Bertsimas (2022) [21] | KRAS-variant: 0.76 (both training and testing)/external validation: 0.78/wild-type, training: 0.79/wild-type, testing: 0.57 | NR | NR | NR |
| Bao (2021) [22] | Mean time-dependent: 0.75 | NR | NR | NR |
| Lam (2023) [23] | 0.65 (both for OS and RFS) | NR | NR | NR |
| Reijonen (2023) [24] | 0.62 (OS) | NR | NR | NR |
| Margonis (2018) [25] | 0.625 | NR | NR | NR |
| Paredes (2020) [26] | Model without KRAS: 0.649–0.662 (validation cohort)/model with KRAS: 0.642–0.667 (validation cohort) | Calibration curve | No KRAS: good calibration/KRAS: fair | NR |
| Fruhling (2021) [27] | 1-, 3-, 5-year OS: 0.71, 0.67, 0.67/internal validation: 0.62 | Calibration curve | Excellent calibration in development cohort | NR |
| Taghavi (2021) [28] | Training: 0.64/validation: 0.71 | NR | NR | NR |
| Brudvik (2019) [29] | Development, 5-y OS: 0.69/development, 5-y RFS: 0.66 | NR | NR | NR |
| Moaven (2023) [30] | GBT, OS: 0.77/GBT, recurrence: 0.63/LRB, OS: 0.64/LRB, recurrence: 0.57 | NR | NR | NR |
| Villard (2022) [31] | Development: 0.74/validation: 0.69/simplified model, development: 0.74, validation: 0.66 | Calibration curve, CITL, slope, HL test | CITL: 0.36, slope: 0.89 (validation), good overall fit | NR |
| Chen (2020) [32] | Development: 0.69 at 24 months and 0.65 at 33 months/internal validation: 0.63/cohort 2: 0.81 at 15 months | Calibration curve | Good calibration | NR |
| Chen (2022) [33] | 1-, 3-, 5-year OS: 0.828, 0.740, 0.700 in the solitary LM group; 0.747, 0.714, 0.753 in the 2–4 LM group; 0.728, 0.741, 0.792 in the ≥5 LM group | Calibration curve | Fair calibration only in the 2–4 LM group | NR |
| Dai (2021) [34] | Training: 0.866/validation: 0.792 | Calibration curve | Poor calibration in the validation cohort | Clinical utility with lift curves |
| Liu (2021) [35] | 0.707 | Calibration curve | Fair | NR |
| Liang (2021) [36] | Training: 0.742/validation: 0.773 | Calibration curve | Fair in both training and validation cohorts | NR |
| Wu (2021) [37] | 0.71 (both neoadjuvant and non-neoadjuvant groups) | NR | NR | NR |
| Sasaki (2022) [38] | Development: 0.61 (model as a continuous variable), 0.60 (model as a categorical variable)/Asian external validation cohort: 0.62 (model as a continuous variable), 0.60 (model as a categorical variable)/European external validation cohort: 0.57 (model as a continuous variable), 0.57 (model as a categorical variable) | NR | NR | NR |
| Huiskens (2019) [39] | Stage 1 model: 0.70/Stage 2 model: 0.72 | HL test | Stage 1 model: chi-square: 3.5, p = 0.63/Stage 2 model: chi-square: 7.8, p = 0.18 | NR |
| Bai (2022) [40] | 5-year OS, development: 0.721/5-year OS, validation: 0.665/2-year RFS, development: 0.728/2-year RFS, validation: 0.640 | NR | NR | NR |
| Fang (2022) [41] | 0.715 | NR | NR | NR |
| Qin (2022) [42] | 1-, 2-, 3-year ihPFS: 0.695, 0.764, 0.782 | Calibration curve | Fair calibration | Yes |
| Kawaguchi (2021) [43] | RAS mutant, development: 0.629/RAS mutant, validation: 0.644/wild type, development: 0.625/wild type, validation: 0.624 | Calibration curve | Fair calibration (development and validation cohort) | NR |
| Zhang (2023) [44] | Risk score: 1, 3, 5 years, training: 0.624, 0.630, 0.662/testing: 0.610, 0.646, 0.688/validation: 0.612, 0.622, 0.652/full model: 0.783, corrected: 0.772 | Calibration curve | Fair calibration | Yes |
| Chen (2021) [45] | Complications: 0.658/PFS: 0.676/OS: 0.700 | Calibration curve, HL test | Complications: fair, HL test: chi-square 3.99, p = 0.91/PFS: fair/OS: good | Yes (for complications) |
| Jin (2022) [46] | Training: 0.826/validation: 0.820/external validation: 0.763 | Calibration curve | Poor calibration (internal validation), fair (external validation) | Yes |
| Zhai (2022) [47] | 0.659 | NR | NR | NR |
| Liu (2021) [48] | Development: 0.696/validation: 0.682 | Calibration curve | Development: fair/validation: poor | NR |
| Moro (2020) [49] | AIC: wtKRAS: 1356, mtKRAS: 1356 | Brier scores after bootstrapping | Brier: 0.1741 (wtKRAS), 0.1793 (mtKRAS) | NR |
| Chen (2021) [50] | Complications: 0.750/PFS: 0.663/OS: 0.684 | Calibration curves and HL test | Complications: fair/PFS: fair/OS: fair | Yes |
| Yao (2021) [51] | Presence of LN metastases: 0.655/PFS: 0.656 | Calibration curves and HL test | Presence of LN metastases: fair/PFS: fair | NR |
| Kazi (2023) [52] | 0.692 | Calibration table | Good calibration (small group numbers) | NR |
| Meng (2021) [53] | 1 yr OS, training: 0.788/3 yr OS, validation: 0.702/3 yr OS, training: 0.752/3 yr OS, validation: 0.848 | Calibration curve | 1 yr OS: fair, 3 yr OS: good (small numbers) | NR |
| Imai (2016) [54] | 0.66 | Calibration curve | 3 and 5 yr OS: fair | NR |
| Chen (2022) [55] | Development: 0.754/validation: 0.882 | Calibration curve, HL test | HL: chi-square: 1.36, p = 0.998, calibration curve: good calibration in development and validation cohorts | Yes |
| Cheng (2022) [56] | Training: 0.709/validation: 0.735 | Calibration curve | CSS: fair in training and validation/OS: fair in training and validation | NR |
| Kulik (2018) [57] | Preoperative: 0.716/preop- and perioperative: 0.761 | NR | NR | NR |
| Bai (2021) [58] | LDH-CRS: 0.674/mCRS: 0.681 | NR | NR | NR |
| Wang (2021) [59] | 1st score, 1, 3, 5 yr OS, training: 0.84, 0.73, 0.70/1, 3, 5 yr OS, int. validation: 0.75, 0.70, 0.70/1, 3, 5 yr OS, ext. validation: 0.77, 0.78, 0.72/2nd score, 3 yr OS, training: 0.76/5 yr OS, training: 0.75/3 yr OS, validation: 0.74/5 yr OS, validation: 0.66 | Calibration curve | Merged score: fair | NR |
| Xu (2021) [60] | Training: 0.746/validation: 0.764 | Calibration curve, slope, intercept | Validation: fair, calibration slope 1.09, intercept: −0.006 | NR |
| Sasaki (2018) [61] | 0.669 | NR | NR | NR |
| Wada (2022) [62] | Training: 0.83/validation: 0.81/mixed model: 0.85 | NR | NR | NR |
| Kim (2020) [63] | Training: 0.824/validation: 0.898 | HL test | p = 0.831 | NR |
| Dupre (2019) [64] | Preoperative: 0.619/postoperative: 0.637 | NR | NR | NR |
| Qi (2023) [65] | SOF, 5 yr: 0.63/SOF, 8 yr: 0.74/combined, 5 yr: 0.69/combined, 8 yr: 0.79 | Calibration curve | Fair calibration | NR |
| Wu (2021) [66] | 0.705 | Calibration curve | Fair calibration | NR |
| Dasari (2023) [67] | Development, 1, 2, 3, 5 yr: 0.756, 0.745, 0.706, 0.698/validation, 1, 2, 3, 5 yr: 0.679, 0.659, 0.678, 0.732 | NR | NR | NR |
| Liu (2023) [68] | DEG risk score, development, 5 yr: 0.74/validation, 5 yr: 0.64/mixed model: 0.69 | Calibration curve | Good calibration | Yes |
| Amygdalos (2023) [69] | 0.70 | NR | NR | NR |
| Chen (2023) [70] | 0.732 | Calibration curve | Fair | NR |
| Wu (2018) [71] | OS, 1 and 3 yr: 0.621, 0.661/CSS, 1 and 3 yr: 0.621, 0.660 | Calibration curve | Fair in training and validation, both for OS and CSS | NR |
| Deng (2023) [72] | Training: 0.720/validation: 0.740 | Calibration curve, HL test | Training: fair calibration, chi-square: 4.97, p = 0.7612/validation: poor calibration, chi-square: 3.89, p = 0.8671 | Yes (utility in a narrow range of thresholds) |
| Berardi (2023) [73] | Training: 0.68/validation: 0.60 | Calibration curve | Fair | NR |
| Liu (2019) [74] | Development: 0.675/validation: 0.77 | Calibration curve | Development: 1 yr poor, 3 yr good/validation: 1 yr poor, 3 yr poor, 5 yr poor | NR |
| Welsh (2008) [75] | 0.781 | Calibration plot, HL test | Validation: chi-square = 6.03, p = 0.196 | NR |
| Famularo (2023) [76] | RF model: 0.66 | NR | NR | NR |
| He (2023) [77] | Training: 0.801/validation: 0.739 | Calibration curve, slope, intercept | Development: good calibration/validation: fair calibration, slope: 1.0, intercept: 0.0 | Yes |
| Kattan (2008) [78] | Optimism-corrected: 0.612 | Calibration curve | Fair | NR |
| Wensink (2023) [79] | Optimism-corrected, 6 m: 0.643, 12 m: 0.641 | Calibration curve, slope | Fair at 6 and 12 months, optimism-corrected slope: 0.86 | Yes |
| Fendler (2015) [80] | Training: 0.81/validation: 0.83 | NR | NR | NR |
| Marfa (2016) [81] | Training: 0.903 | NR | NR | NR |
| Jiang (2023) [82] | CSS, training, 1 and 3 yr: 0.77, 0.70/validation, 1 and 3 yr: 0.72, 0.68/OS, training, 1 and 3 yr: 0.78, 0.70/validation, 1 and 3 yr: 0.74, 0.70 | Calibration curve | Training: fair/validation: poor | Yes (superior to AJCC stage) |
| Endo (2023) [83] | OS-OPT, training: 0.68/testing: 0.69/RFS-OPT, training: 0.68/testing: 0.69 | NR | NR | NR |
| Rees (2008) [84] | Preoperative: 0.781/postoperative: 0.805 | HL test | Preoperative: chi-square: 8.125, p = 0.087/postoperative: chi-square: 7.453, p = 0.114 | NR |
| Zakaria (2007) [85] | DSS: 0.61/recurrence: 0.58 | NR | NR | NR |
| Tan (2008) [86] | 0.59 | NR | NR | NR |
| Hill (2012) [87] | Apparent: 0.69/optimism-corrected: 0.67 | NR | NR | NR |
| Takeda (2021) [88] | Development: 0.65 | NR | NR | NR |
| Wang (2017) [89] | 0.642 | NR | NR | NR |
| Spelt (2013) [90] | ANN: 0.72/Cox model: 0.66 | NR | NR | NR |

AUC: area under the curve; DCA: decision curve analysis; MSKCC: Memorial Sloan Kettering Cancer Center; KRAS: Kirsten rat sarcoma virus; NR: not reported; OS: overall survival; RFS: recurrence-free survival; GBT: gradient-boosted trees; LRB: logistic regression with bootstrapping; CITL: calibration-in-the-large; HL: Hosmer–Lemeshow; LM: liver metastases; ihPFS: intrahepatic progression-free survival; PFS: progression-free survival; AIC: Akaike information criterion; LN: lymph node; CSS: cancer-specific survival; LDH: lactate dehydrogenase; mCRS: modified clinical risk score; SOF: spatial organization feature; DEG: differentially expressed gene; RF: random forest; AJCC: American Joint Committee on Cancer; OPT: optimal policy tree; DSS: disease-specific survival; ANN: artificial neural network.
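
For readers less familiar with the metrics in this table, the following is a minimal Python sketch of how two of them, the AUC and the Brier score, can be computed for a binary outcome. It is an illustration only, not the method of any study listed above: the function names `auc` and `brier_score` and the toy data are hypothetical, and many of the listed studies report time-dependent AUCs or concordance indices for censored survival data, which this simplified binary version does not handle.

```python
from itertools import product


def auc(y_true, y_prob):
    """AUC as the probability that a randomly chosen event case receives a
    higher predicted risk than a randomly chosen non-event case.
    Tied predictions count as half a concordant pair."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return concordant / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome.
    0 is perfect; lower values are better."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data: 1 = event occurred, 0 = no event.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]

print(f"AUC:   {auc(y_true, y_prob):.4f}")          # AUC:   0.9375
print(f"Brier: {brier_score(y_true, y_prob):.4f}")  # Brier: 0.1200
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking. The Brier score reflects both discrimination and calibration, which is one reason it is sometimes reported alongside or instead of a calibration curve.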