2025 Jan 8;15(3):153–158. doi: 10.5415/apallergy.0000000000000179

Table 2.

Experiment values without SemClinBr data: confusion matrix and performance indicators for each GPT model combination with and without the WAO criteria

| GPT model | Confusion matrix | Precision | Sensitivity | Specificity | Accuracy | Kappa agreement |
|---|---|---|---|---|---|---|
| 4 Turbo | TP: 48, FP: 5, TN: 46, FN: 0 | 90.6% | 100% | 90.2% | 95.0% | 0.90 (almost perfect) |
| 4 Turbo W/criteria | TP: 48, FP: 5, TN: 46, FN: 0 | 90.6% | 100% | 90.2% | 95.0% | 0.90 (almost perfect) |
| 3.5 + 4 | TP: 47, FP: 9, TN: 42, FN: 1 | 83.9% | 97.9% | 82.4% | 89.9% | 0.80 (substantial) |
| 3.5 + 4 W/criteria | TP: 48, FP: 9, TN: 42, FN: 0 | 84.2% | 100% | 82.4% | 90.9% | 0.82 (almost perfect) |
| 3.5 | TP: 48, FP: 19, TN: 32, FN: 0 | 71.6% | 100% | 62.7% | 80.8% | 0.62 (substantial) |
| 3.5 W/criteria | TP: 48, FP: 9, TN: 42, FN: 0 | 71.6% | 100% | 62.7% | 80.8% | 0.62 (substantial) |

TP: true positive; FP: false positive; TN: true negative; FN: false negative.
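
The performance indicators in Table 2 can be recomputed from the confusion-matrix counts. The following is a minimal Python sketch, assuming the standard definitions of precision, sensitivity, specificity, accuracy, and Cohen's kappa; the table does not state the formulas the authors used, and the function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return standard binary-classification indicators from confusion-matrix counts.

    Assumes the usual textbook definitions; not taken from the authors' code.
    """
    n = tp + fp + tn + fn
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)      # recall, true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    accuracy = (tp + tn) / n          # observed agreement

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)   # both label "positive" by chance
    p_no = ((tn + fn) / n) * ((tn + fp) / n)    # both label "negative" by chance
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Counts for the "4 Turbo" row of Table 2.
gpt4_turbo = confusion_metrics(tp=48, fp=5, tn=46, fn=0)
for name, value in gpt4_turbo.items():
    print(f"{name}: {value:.3f}")
```

Running the same function on the counts of each row gives a quick consistency check between the confusion matrix and the reported percentages, up to rounding.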