2021 Mar 23;19:76. doi: 10.1186/s12916-021-01942-5

Table 3.

Patch-level (Dataset-A and Dataset-B) and patient-level (Dataset-C and Dataset-D) performance summary

Source              Sensitivity  Specificity  Accuracy  AUC
Dataset-A (patch-level testing)
  XH                96.99%       99.22%       98.11%    99.83%
Dataset-B (patch-level validation)
  NCT-CRC-HE-100K   92.03%       96.74%       96.07%    98.32%
  CRC-VAL-HE-7K     94.24%       94.87%       94.76%    98.45%
Dataset-C (patient-level validation)
  XH                98.80%       99.51%       99.02%    99.16%
  TCGA-Frozen       94.04%       88.06%       93.44%    91.05%
  TCGA-FFPE         97.96%       100.00%      97.98%    98.98%
  SYU-CGH           98.90%       92.45%       95.43%    95.68%
Dataset-D (patient-level Human-AI contest)
  XH                97.96%       100%         98.97%    98.99%
  SYU               98.90%       100%         98.97%    99.45%
Dataset-C and Dataset-D (patient-level validation and Human-AI contest)
  PCH               96.00%       97.83%       96.88%    97.91%
  TXH               100%         97.92%       98.96%    99.20%
  HPH               97.96%       97.96%       97.96%    98.98%
  FUS               100%         97.96%       98.99%    99.99%
  GPH               100%         97.65%       98.91%    99.15%
  NJD               92.93%       97.94%       95.41%    95.84%
  SWH               98.99%       97.00%       97.99%    99.42%
  AMU               97%          97.06%       97.04%    98.37%
  ACL               100%         97.20%       98.55%    99.83%
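
For readers unfamiliar with the four reported metrics, the following is a minimal sketch of their standard definitions, not the authors' evaluation code. The function names, the confusion-matrix counts, and the scores are all hypothetical and chosen only for illustration; none of them come from the study's data. AUC is computed here by its pairwise-ranking interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one), which is one common way to obtain it.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # fraction of all cases classified correctly
    return sensitivity, specificity, accuracy


def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts and scores, for illustration only (not from the study).
sens, spec, acc = binary_metrics(tp=48, fp=5, tn=45, fn=2)
auc = auc_from_scores([0.9, 0.8, 0.4], [0.5, 0.3, 0.1])
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%} AUC={auc:.2%}")
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, it always falls between the two, which is a quick consistency check for any row of a table like the one above.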