Table 3.
Performance comparisons of the proposed EMV-3D-CNN model and the six radiologists on the validation dataset for Task 3.
| Task 3 | Model | D1 | D2 | D3 | D4 | D5 | D6 |
|---|---|---|---|---|---|---|---|
| Overall accuracy (%) | 77.6 | 67.1 | 69.7 | 61.8 | 59.2 | 56.6 | 52.6 |

| Metric | Model G1 | Model G2 | Model G3 | D1 G1 | D1 G2 | D1 G3 | D2 G1 | D2 G2 | D2 G3 | D3 G1 | D3 G2 | D3 G3 | D4 G1 | D4 G2 | D4 G3 | D5 G1 | D5 G2 | D5 G3 | D6 G1 | D6 G2 | D6 G3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy (%) | 88.2 | 78.9 | 88.2 | 78.9 | 72.4 | 82.9 | 73.7 | 78.9 | 86.8 | 75.0 | 71.1 | 77.6 | 72.4 | 68.4 | 77.6 | 71.1 | 63.2 | 78.9 | 73.7 | 53.9 | 77.6 |
| Sensitivity (%) | 93.1 | 69.2 | 66.7 | 89.7 | 38.5 | 71.4 | 93.1 | 50.0 | 61.9 | 79.3 | 42.3 | 61.9 | 86.2 | 34.6 | 52.4 | 75.9 | 30.8 | 61.9 | 34.5 | 42.3 | 90.5 |
| Specificity (%) | 85.1 | 84.0 | 96.4 | 72.3 | 90.0 | 87.3 | 61.7 | 94.0 | 96.4 | 72.3 | 86.0 | 83.6 | 63.8 | 86.0 | 87.3 | 68.1 | 80.0 | 85.5 | 97.9 | 60.0 | 72.7 |
| PPV (%) | 79.4 | 69.2 | 87.5 | 66.7 | 66.7 | 68.2 | 60.0 | 81.3 | 86.7 | 63.9 | 61.1 | 59.1 | 59.5 | 56.3 | 61.1 | 59.5 | 44.4 | 61.9 | 90.9 | 35.5 | 55.9 |
| NPV (%) | 95.2 | 84.0 | 88.3 | 91.9 | 91.9 | 88.9 | 93.5 | 78.3 | 86.9 | 85.0 | 74.1 | 85.2 | 88.2 | 71.7 | 82.8 | 82.1 | 69.0 | 85.5 | 70.8 | 66.7 | 95.2 |
| F1 (%) | 85.7 | 69.2 | 75.7 | 76.5 | 76.5 | 69.8 | 73.0 | 61.9 | 72.2 | 70.1 | 50.0 | 60.5 | 70.4 | 42.9 | 56.4 | 66.7 | 36.4 | 61.9 | 50.0 | 38.6 | 69.1 |
PPV, positive predictive value; NPV, negative predictive value; D, doctor (radiologist); G, grade.
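
For readers who want to recompute the per-grade rows, the following is a minimal sketch of the standard definitions, assuming each grade is scored one-vs-rest (the grade in question counts as positive, the other two grades as negative). It is not the authors' evaluation code: the names `BinaryCounts` and `one_vs_rest_metrics` are hypothetical, and the counts in the example are not reported in Table 3. They are illustrative values over an assumed 76 cases, chosen because they reproduce the Model G1 column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryCounts:
    """One-vs-rest confusion counts for a single grade (hypothetical helper)."""
    tp: int  # grade present, predicted as this grade
    fp: int  # other grade, predicted as this grade
    fn: int  # grade present, predicted as another grade
    tn: int  # other grade, predicted as another grade


def one_vs_rest_metrics(c: BinaryCounts) -> dict[str, float]:
    """Return the six per-grade metrics as percentages.

    Assumes every denominator is non-zero; no zero-division handling.
    """
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "accuracy": 100 * (c.tp + c.tn) / total,
        "sensitivity": 100 * c.tp / (c.tp + c.fn),
        "specificity": 100 * c.tn / (c.tn + c.fp),
        "ppv": 100 * c.tp / (c.tp + c.fp),
        "npv": 100 * c.tn / (c.tn + c.fn),
        # F1 is the harmonic mean of PPV (precision) and sensitivity (recall).
        "f1": 100 * 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }


# Assumed counts, not taken from the paper: they were picked to match
# the Model G1 column of Table 3.
example = BinaryCounts(tp=27, fp=7, fn=2, tn=40)
for name, value in one_vs_rest_metrics(example).items():
    print(f"{name}: {value:.1f}")
```

With these assumed counts the sketch prints 88.2, 93.1, 85.1, 79.4, 95.2 and 85.7 for accuracy, sensitivity, specificity, PPV, NPV and F1, which agrees with the Model G1 entries above.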