2023 Feb 21;13(4):1342–1354. doi: 10.7150/thno.81784

Table 1.

Comparison between the supervised CNN and the weakly supervised MIL-CNN models. Patient-level model performance was evaluated on the 3-split dataset for comparison. Bold numbers indicate the highest scores among the MIL-CNN models. All values are the mean ± standard deviation on the test sets under 5-fold cross-validation.

Method       Accuracy   Precision  Recall     F-score    AUC
MIL-1-split  0.95±0.05  1.00±0.00  0.91±0.08  0.95±0.04  0.95±0.05
MIL-2-split  0.93±0.05  0.97±0.03  0.91±0.08  0.94±0.05  0.92±0.06
MIL-3-split  0.95±0.05  1.00±0.00  0.91±0.08  0.95±0.04  0.95±0.05
CNN-3-split  0.77±0.09  0.90±0.09  0.71±0.13  0.77±0.11  0.79±0.08
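
The "mean ± standard deviation under 5-fold cross-validation" entries can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `binary_metrics` and `summarize_folds` and the toy labels are hypothetical, and the use of the population standard deviation is an assumption, since the caption does not say which estimator was used. AUC is left out because it requires continuous prediction scores rather than hard labels.

```python
from math import isclose
from statistics import mean, pstdev


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F-score for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Guard against empty denominators (e.g. no positive predictions).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


def summarize_folds(folds):
    """Return {metric: (mean, sd)} over per-fold (y_true, y_pred) pairs.

    Assumption: population standard deviation (pstdev); swap in
    statistics.stdev if the sample estimator is wanted.
    """
    per_fold = [binary_metrics(y_true, y_pred) for y_true, y_pred in folds]
    summary = {}
    for name in per_fold[0]:
        values = [m[name] for m in per_fold]
        summary[name] = (mean(values), pstdev(values))
    return summary


if __name__ == "__main__":
    # Hypothetical patient-level labels for two folds, for illustration only.
    toy_folds = [
        ([1, 1, 0, 0], [1, 0, 0, 0]),
        ([1, 1, 0, 0], [1, 1, 0, 0]),
    ]
    for name, (m, sd) in summarize_folds(toy_folds).items():
        print(f"{name}: {m:.2f}±{sd:.2f}")
```

Each metric is computed per fold first and only then averaged, which is the convention the caption describes; pooling all folds' predictions before computing a metric would give a single number with no spread to report.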