Table 5.
Diagnostic performance and level of agreement in each trial.
| Trial | Rater | Accuracy | Sensitivity | Specificity | Cohen's kappa | Level of agreement | McNemar's test (*P*) |
|---|---|---|---|---|---|---|---|
| Trial 1 | Expert | 0.81 | 0.61 | 0.91 | 0.54 | Moderate | .001 |
| | AI | 0.80 | 0.52 | 0.94 | 0.51 | Moderate | <.001 |
| Trial 2 | Expert | 0.69 | 0.57 | 0.93 | 0.42 | Moderate | <.001 |
| | AI | 0.73 | 0.97 | 0.23 | 0.25 | Fair | <.001 |
| Trial 3 | Expert | 0.85 | 0.72 | 0.97 | 0.69 | Substantial | <.001 |
| | AI | 0.78 | 0.73 | 0.82 | 0.56 | Moderate | .366 |
AI, artificial intelligence.