Table 2.
a. Accuracy, sensitivity, and specificity of the diagnostic model (GastroMIL) with images at different magnifications.

Dataset | Magnification | Accuracy (95% CI) | Sensitivity (95% CI) | Specificity (95% CI)
---|---|---|---|---
Training set | 5× | 0.999 (0.998, 1.000) | 0.998 (0.993, 1.000) | 1.000 (0.997, 1.000)
Training set | 10× | 1.000 (1.000, 1.000) | 1.000 (0.997, 1.000) | 1.000 (0.997, 1.000)
Training set | 20× | 0.996 (0.994, 0.999) | 0.998 (0.993, 1.000) | 1.000 (0.997, 1.000)
Internal validation set | 5× | 0.976 (0.966, 0.986) | 0.969 (0.949, 0.981) | 0.985 (0.970, 0.993)
Internal validation set | 10× | 0.976 (0.966, 0.986) | 0.981 (0.965, 0.990) | 0.971 (0.952, 0.983)
Internal validation set | 20× | 0.979 (0.970, 0.988) | 0.969 (0.949, 0.981) | 0.985 (0.970, 0.993)


b. Accuracy, sensitivity, and specificity of the diagnostic model (GastroMIL) with images at 10× magnification.

Dataset | Accuracy (95% CI) | Sensitivity (95% CI) | Specificity (95% CI)
---|---|---|---
Training set | 1.000 (1.000, 1.000) | 1.000 (0.997, 1.000) | 1.000 (0.997, 1.000)
Internal validation set | 0.976 (0.966, 0.986) | 0.981 (0.965, 0.990) | 0.971 (0.952, 0.983)
External validation set | 0.920 (0.879, 0.961) | 0.934 (0.864, 0.969) | 0.905 (0.823, 0.951)


c. Diagnostic performance of the GastroMIL model and human pathologists in the external validation set with images at 10× magnification.

Rater | Accuracy (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | P-value* | Kappa#
---|---|---|---|---|---
GastroMIL model | 0.920 (0.879, 0.961) | 0.934 (0.864, 0.969) | 0.905 (0.823, 0.951) | - | -
Expert Pathologist D | 0.971 (0.947, 0.996) | 1.000 (0.960, 1.000) | 0.952 (0.884, 0.981) | 1.000 | 0.805
Expert Pathologist E | 0.983 (0.963, 1.002) | 0.967 (0.908, 0.991) | 1.000 (0.956, 1.000) | 0.332 | 0.806
Expert Pathologist F | 0.983 (0.963, 1.002) | 0.967 (0.908, 0.991) | 1.000 (0.956, 1.000) | 0.332 | 0.806
Junior Pathologist G | 0.874 (0.825, 0.924) | 0.758 (0.661, 0.835) | 1.000 (0.956, 1.000) | <0.0001 | 0.617

*Difference in accuracy between the GastroMIL model and each human pathologist, tested by the paired chi-square test (McNemar's test).
#Inter-observer agreement between the GastroMIL model and each human pathologist, evaluated by Cohen's kappa coefficient.
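
The two statistics named in the footnotes can be illustrated with a minimal sketch, assuming binary per-case calls (1 = positive, 0 = negative) from two raters on the same cases. The function names and the toy labels are hypothetical and are not the study data. The footnote does not say whether a continuity correction was applied, so it is left as an optional flag here.

```python
import math


def mcnemar_chi2(correct_a, correct_b, continuity=False):
    """Return (statistic, p_value) for McNemar's chi-square test.

    correct_a and correct_b are paired booleans: whether rater A
    (or rater B) was correct on the same case.
    """
    # Only discordant pairs carry information about a difference in accuracy.
    b = sum(1 for a, r in zip(correct_a, correct_b) if a and not r)
    c = sum(1 for a, r in zip(correct_a, correct_b) if r and not a)
    if b + c == 0:
        return 0.0, 1.0
    diff = abs(b - c)
    if continuity:  # optional continuity correction (an assumption)
        diff = max(diff - 1, 0)
    stat = diff ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def cohen_kappa(labels_a, labels_b):
    """Return Cohen's kappa for two raters' labels on the same cases."""
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    # Observed agreement: fraction of cases with identical labels.
    p_obs = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    p_exp = sum(
        (labels_a.count(k) / n) * (labels_b.count(k) / n) for k in categories
    )
    if p_exp == 1.0:
        return 1.0
    return (p_obs - p_exp) / (1 - p_exp)


# Toy data, invented for illustration only.
truth = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
model = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
reader = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

model_correct = [m == t for m, t in zip(model, truth)]
reader_correct = [r == t for r, t in zip(reader, truth)]

stat, p_value = mcnemar_chi2(model_correct, reader_correct)
kappa = cohen_kappa(model, reader)
print(stat, p_value, round(kappa, 3))
```

In this toy case each rater is wrong on one case the other gets right, so the discordant counts balance and the test finds no difference in accuracy, while kappa measures how often the two raters give the same call beyond chance.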