Table 4.
Model Performance at Different Institutions: Predictions Using Full Information without Calibration*
Institution | Prevalence of N2 Disease in that Cohort, n/N (%) | AUC ROC (95% CI) | Hosmer-Lemeshow P Value† | Brier Score‡ |
---|---|---|---|---|
MD Anderson Cancer Center (development cohort) | 160/633 (25) | 0.86 (0.82–0.89) | 0.62 | 0.125 |
Cleveland Clinic (validation) | 87/310 (28) | 0.87 (0.83–0.91) | 0.03 | 0.129 |
Johns Hopkins (validation) | 107/186 (58) | 0.82 (0.76–0.89) | <0.001 | 0.181 |
Henry Ford Hospital (validation) | 102/226 (45) | 0.92 (0.88–0.95) | <0.001 | 0.139 |
Definition of abbreviations: AUC ROC = area under the receiver operating characteristic curve; CI = confidence interval.
*Full model as specified in Table 2 (age, location, histology, and computed tomography/positron emission tomography interaction). The Hosmer-Lemeshow P values and Brier scores in this table are computed from the uncalibrated model's predictions.
†Hosmer-Lemeshow goodness-of-fit test; P < 0.05 indicates poor calibration.
‡The Brier score is the mean squared difference between the predicted probabilities and the actual (observed) outcomes. Brier scores range from 0 to 1, with lower scores being better.
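To make the Brier score footnote concrete, the calculation can be sketched as follows. The patient probabilities and outcomes below are purely illustrative and are not data from any of the cohorts in this table.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)

# Illustrative example: four hypothetical patients, each with a model-predicted
# probability of N2 disease and the actual outcome (1 = N2 disease, 0 = none).
preds = [0.10, 0.80, 0.30, 0.60]
actual = [0, 1, 0, 1]
print(round(brier_score(preds, actual), 4))  # → 0.075
```

A model that assigned probability 1 to every patient with N2 disease and 0 to every patient without would score 0; a model that always predicted 0.5 would score 0.25, which is why the cohort scores near 0.13 above indicate reasonably sharp predictions.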