2023 Jan 5;13(3):1899–1913. doi: 10.21037/qims-22-220

Table 2. Performance comparison of the proposed MIL method and the Mayo criteria for LNM in the internal test dataset, and performance of the MIL model in the external validation dataset.

Standard metrics	Mayo criteria (internal test dataset)	MIL-model (internal test dataset)	MIL-model (external validation dataset)
AUC	0.666	0.938	0.770
Accuracy	0.635	0.847	0.630
Recall	0.878	0.830	0.814
Precision (PPV)	0.544	0.881	0.706
F1_score	0.672	0.807	0.756
Sensitivity (TPR)	0.878	0.830	0.814
Specificity	0.455	0.911	0.520
FPR	0.545	0.143	0.479
NPV	0.833	0.872	0.665

MIL, multiple instance learning; LNM, lymph node metastasis; AUC, area under the curve; PPV, positive predictive value; TPR, true positive rate; FPR, false positive rate; NPV, negative predictive value.
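
Several rows of Table 2 are linked by standard definitions: recall and sensitivity (TPR) are the same quantity, F1 is the harmonic mean of precision and recall, and FPR equals 1 minus specificity. The sketch below recomputes F1 and FPR from the other reported values in each column as a consistency check. It is only an illustration: the function names, the dictionary layout, and the 0.002 rounding tolerance are assumptions made here, not part of the source, and the source does not state how each metric was actually computed.

```python
# Values copied from Table 2. The column labels and the dictionary layout
# are illustrative choices, not taken from the source.
reported = {
    "Mayo criteria (internal test)": dict(
        precision=0.544, recall=0.878, specificity=0.455, f1=0.672, fpr=0.545),
    "MIL-model (internal test)": dict(
        precision=0.881, recall=0.830, specificity=0.911, f1=0.807, fpr=0.143),
    "MIL-model (external validation)": dict(
        precision=0.706, recall=0.814, specificity=0.520, f1=0.756, fpr=0.479),
}


def f1_from(precision, recall):
    """F1 as the harmonic mean of precision (PPV) and recall (TPR)."""
    return 2 * precision * recall / (precision + recall)


def fpr_from(specificity):
    """False positive rate as the complement of specificity."""
    return 1.0 - specificity


def check(tolerance=0.002):
    """Compare derived F1 and FPR with the reported ones.

    The tolerance is an assumed allowance for three-decimal rounding.
    """
    rows = {}
    for name, m in reported.items():
        f1 = f1_from(m["precision"], m["recall"])
        fpr = fpr_from(m["specificity"])
        rows[name] = {
            "f1_derived": round(f1, 3),
            "fpr_derived": round(fpr, 3),
            "f1_matches": abs(f1 - m["f1"]) <= tolerance,
            "fpr_matches": abs(fpr - m["fpr"]) <= tolerance,
        }
    return rows


if __name__ == "__main__":
    for name, row in check().items():
        print(name, row)
```

Under these assumptions, the Mayo criteria column and the external validation column reproduce to within rounding. For the internal-test MIL-model column, the derived values are about 0.855 for F1 and 0.089 for FPR, compared with the reported 0.807 and 0.143. The source does not explain the difference; it may reflect a different averaging scheme or decision threshold for those two rows.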