2025 Oct 27;29(11):531. doi: 10.1007/s00784-025-06590-0

Table 2.

Prognostic performance of the algorithms' models and of the senior surgeon in the test set (n = 400)

Model/Surgeon	Balanced Accuracy	Accuracy (95% CI^c)	Sensitivity	Specificity	AUC^a	Brier Score
LR^b	0.580	0.90 (0.86–0.92)	0.235	0.924	0.624	0.041
RF^d	0.591	0.81 (0.77–0.85)	0.353	0.830	0.671	0.043
XGB^e	0.608	0.84 (0.80–0.88)	0.353	0.864	0.646	0.042
KNN^f	0.616	0.70 (0.65–0.74)	0.529	0.702	0.623	0.042
Senior Surgeon	0.527	0.74 (0.69–0.78)	0.294	0.760	-	-

^aAUC: area under the receiver operating characteristic curve; ^bLR: logistic regression; ^cCI: confidence interval; ^dRF: random forest; ^eXGB: eXtreme Gradient Boosting; ^fKNN: K-nearest neighbors.

95% confidence intervals are provided for accuracy only. No confidence interval is reported for balanced accuracy because it is a composite metric (the mean of sensitivity and specificity) whose uncertainty depends on both components, and no standard analytical method is established for its estimation.
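To make the composite nature of balanced accuracy concrete, the following minimal Python sketch recomputes it from the sensitivity and specificity values in Table 2. It then illustrates one common resampling alternative to an analytical interval, a percentile bootstrap. The bootstrap is not the authors' method, the helper name `bootstrap_balanced_accuracy_ci` is hypothetical, and the labels it is applied to are synthetic, not the study data.

```python
import random


def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy is the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


# (sensitivity, specificity, reported balanced accuracy) copied from Table 2.
TABLE_2 = {
    "LR": (0.235, 0.924, 0.580),
    "RF": (0.353, 0.830, 0.591),
    "XGB": (0.353, 0.864, 0.608),
    "KNN": (0.529, 0.702, 0.616),
    "Senior Surgeon": (0.294, 0.760, 0.527),
}

# Recomputed values agree with the reported ones up to rounding.
recomputed = {
    name: balanced_accuracy(se, sp) for name, (se, sp, _) in TABLE_2.items()
}


def bootstrap_balanced_accuracy_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for balanced accuracy (hypothetical helper).

    Cases are resampled with replacement, so sensitivity and specificity
    vary together in each replicate.
    """
    rng = random.Random(seed)
    n = len(y_true)
    estimates = []
    for _replicate in range(n_boot):
        tp = fn = tn = fp = 0
        for _draw in range(n):
            i = rng.randrange(n)
            if y_true[i] == 1:
                if y_pred[i] == 1:
                    tp += 1
                else:
                    fn += 1
            else:
                if y_pred[i] == 0:
                    tn += 1
                else:
                    fp += 1
        if tp + fn == 0 or tn + fp == 0:
            continue  # skip replicates that lack one of the two classes
        estimates.append(balanced_accuracy(tp / (tp + fn), tn / (tn + fp)))
    estimates.sort()
    lower = estimates[int(alpha / 2 * len(estimates))]
    upper = estimates[min(len(estimates) - 1, int((1 - alpha / 2) * len(estimates)))]
    return lower, upper


# Synthetic labels for illustration only (NOT the study data):
# 20 positives with 10 detected, 80 negatives with 64 correctly rejected.
y_true = [1] * 20 + [0] * 80
y_pred = [1] * 10 + [0] * 10 + [0] * 64 + [1] * 16
point = balanced_accuracy(10 / 20, 64 / 80)
lower, upper = bootstrap_balanced_accuracy_ci(y_true, y_pred)
print(f"balanced accuracy {point:.3f}, bootstrap 95% interval ({lower:.3f}, {upper:.3f})")
```

With few positive cases, the sensitivity term dominates the width of such an interval, which is one reason an interval for balanced accuracy can be much wider than the one for plain accuracy.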