Table 4. Balanced Accuracy of Each of the Three Experts and the Automated Method against the Majority-Opinion-Based Ground Truth on Database A
| Pathology | Expert 1 | Expert 2 | Expert 3 | Automated |
|---|---|---|---|---|
| NM | 99.8 (100, 99.6) | 98.4 (98.8, 98.0) | 99.4 (100, 98.8) | 95.5 (99.4, 91.5) |
| MH | 99.4 (100, 98.8) | 98.3 (98.6, 98.0) | 86.5 (73.0, 100) | 89.7 (89.1, 90.3) |
| ME | 92.4 (99.5, 85.4) | 94.9 (94.6, 95.1) | 91.7 (89.2, 94.3) | 87.3 (87.5, 87.0) |
| AMD | 94.2 (93.2, 95.2) | 94.0 (89.2, 98.8) | 92.0 (85.1, 98.8) | 89.3 (89.7, 88.8) |
| Average | 96.5 (98.2, 94.8) | 96.4 (95.3, 97.5) | 92.4 (86.8, 98.0) | 90.5 (91.4, 89.4) |
Shown is the balanced accuracy (sensitivity, specificity), in percent. For the automated method, the best feature setting for each pathology was adopted (TS, t = 0.4; S, t = 0.4; TS, t = 0.4; and TS, t = 0.2, for NM, MH, ME, and AMD, respectively). The best balanced accuracy was computed as the mean of the outputs of the six runs.
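For reference, balanced accuracy here is the arithmetic mean of sensitivity and specificity, which is consistent with the tabulated values (e.g., Expert 3 on MH: (73.0 + 100)/2 = 86.5):

$$
\text{Balanced accuracy} = \frac{\text{sensitivity} + \text{specificity}}{2}
$$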