Table 2.
Experiment values without SemClinBr data: confusion matrix and performance indicators for each GPT model combination with and without the WAO criteria
| GPT model | Confusion matrix | Precision | Sensitivity | Specificity | Accuracy | Kappa agreement |
|---|---|---|---|---|---|---|
| 4 Turbo | TP: 48 FP: 5 TN: 46 FN: 0 | 90.6% | 100% | 90.2% | 95.0% | 0.90 almost perfect |
| 4 Turbo W/criteria | TP: 48 FP: 5 TN: 46 FN: 0 | 90.6% | 100% | 90.2% | 95.0% | 0.90 almost perfect |
| 3.5 + 4 | TP: 47 FP: 9 TN: 42 FN: 1 | 83.9% | 97.9% | 82.4% | 89.9% | 0.80 substantial |
| 3.5 + 4 W/criteria | TP: 48 FP: 9 TN: 42 FN: 0 | 84.2% | 100% | 82.4% | 90.9% | 0.82 almost perfect |
| 3.5 | TP: 48 FP: 19 TN: 32 FN: 0 | 71.6% | 100% | 62.7% | 80.8% | 0.62 substantial |
| 3.5 W/criteria | TP: 48 FP: 19 TN: 32 FN: 0 | 71.6% | 100% | 62.7% | 80.8% | 0.62 substantial |
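
The indicators in Table 2 can be recomputed from the four confusion-matrix counts. The following is a minimal sketch, assuming the standard definitions of precision, sensitivity, specificity, and accuracy, and assuming the kappa column is Cohen's kappa for a 2×2 table; the function name `confusion_metrics` is hypothetical and is not taken from the study.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Recompute the Table 2 indicators from confusion-matrix counts.

    Assumes standard definitions and Cohen's kappa for a 2x2 table.
    No guard against empty rows or columns (zero denominators).
    """
    n = tp + fp + tn + fn
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)      # recall / true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    accuracy = (tp + tn) / n          # observed agreement
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Counts from the "4 Turbo" row of Table 2.
metrics = confusion_metrics(tp=48, fp=5, tn=46, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Running the same function on the other rows is a quick way to check that each confusion matrix agrees with the percentages and kappa value reported beside it.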