Table 5.
Performance metrics of the ResNet34 model compared with those of junior and senior nuclear medicine physicians on the external validation set.
| Metric | ResNet34: Graves’ disease | ResNet34: Normality | ResNet34: SAT | ResNet34: Tumor | Junior: Graves’ disease | Junior: Normality | Junior: SAT | Junior: Tumor | Senior: Graves’ disease | Senior: Normality | Senior: SAT | Senior: Tumor |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Recall (%) | 88.0 (82.0-92.2) | 94.2 (89.0-97.0) | 97.5 (93.9-99.0) | 92.9 (87.7-96.0) | 70.3 (62.7-76.8) | 75.4 (67.6-81.8) | 82.8 (76.3-87.8) | 51.9 (44.1-59.7) | 88.6 (82.7-92.7) | 94.2 (89.0-97.0) | 92.0 (86.8-95.3) | 72.7 (65.2-79.1) |
| Specificity (%) | 100.0 (99.2-100.0) | 94.5 (92.1-96.2) | 99.8 (98.8-100.0) | 96.7 (94.7-98.0) | 93.8 (91.2-95.7) | 78.1 (74.2-81.6) | 96.4 (94.3-97.8) | 92.4 (89.6-94.5) | 96.9 (94.9-98.2) | 92.4 (89.7-94.5) | 97.3 (95.4-98.5) | 95.9 (93.6-97.3) |
| PPV (%) | 100.0 (97.3-100.0) | 83.3 (76.7-88.4) | 99.4 (96.5-100.0) | 90.5 (84.9-94.2) | 79.9 (72.4-85.7) | 50.0 (43.3-56.7) | 89.4 (83.5-93.4) | 69.6 (60.6-77.2) | 90.9 (85.3-94.5) | 78.3 (71.4-83.9) | 92.6 (87.5-95.7) | 85.5 (78.5-90.5) |
| NPV (%) | 96.0 (93.8-97.4) | 98.2 (96.6-99.1) | 99.1 (97.8-99.7) | 97.6 (95.7-98.6) | 90.1 (87.1-92.5) | 91.6 (88.5-93.9) | 93.9 (91.4-95.8) | 85.1 (81.7-88.0) | 96.1 (93.9-97.5) | 98.2 (96.5-99.1) | 97.1 (95.1-98.3) | 91.3 (88.4-93.5) |
| AUC | 0.969 (0.952-0.981) | 0.985 (0.972-0.993) | 0.996 (0.988-0.999) | 0.981 (0.966-0.990) | 0.820 (0.788-0.850) | 0.767 (0.732-0.800) | 0.896 (0.869-0.919) | 0.722 (0.684-0.757) | 0.928 (0.904-0.947) | 0.933 (0.910-0.952) | 0.947 (0.926-0.963) | 0.843 (0.812-0.871) |
| F1 | 0.936 | 0.884 | 0.985 | 0.917 | 0.747 | 0.601 | 0.860 | 0.595 | 0.897 | 0.855 | 0.923 | 0.786 |
| κ value | 0.909 | | | | 0.603 | | | | 0.824 | | | |

SAT, subacute thyroiditis; AUC, area under the curve; κ value, Fleiss’s κ value; NPV, negative predictive value; PPV, positive predictive value.
Values in parentheses are 95% CIs. The κ value is reported once per reader (ResNet34, junior, senior), not per class.
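The F1 row can be cross-checked against the PPV and recall rows. The sketch below assumes the standard definition of F1 as the harmonic mean of PPV (precision) and recall. The table does not state the formula the authors used, so this is an assumption, and the function name `f1_from_ppv_recall` is hypothetical. Because the inputs are the rounded percentages from the table, the recomputed values can differ from the reported ones in the third decimal place.

```python
def f1_from_ppv_recall(ppv_pct: float, recall_pct: float) -> float:
    """Harmonic mean of PPV and recall, both given as percentages (assumed F1 definition)."""
    p = ppv_pct / 100.0
    r = recall_pct / 100.0
    if p + r == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * p * r / (p + r)


# (PPV %, recall %, reported F1) copied from the ResNet34 columns of Table 5
resnet34_rows = {
    "Graves' disease": (100.0, 88.0, 0.936),
    "Normality": (83.3, 94.2, 0.884),
    "SAT": (99.4, 97.5, 0.985),
    "Tumor": (90.5, 92.9, 0.917),
}

for label, (ppv, recall, reported) in resnet34_rows.items():
    computed = f1_from_ppv_recall(ppv, recall)
    print(f"{label}: computed {computed:.3f}, reported {reported:.3f}")
```

Under this assumption the recomputed values stay within about 0.002 of the reported F1 scores, which is consistent with rounding of the inputs.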