2023 Dec 29;25:e51501. doi: 10.2196/51501

Table 2.

Diagnostic evaluation indicators.

| Dataset | Model | TP^a | FN^b | FP^c | TN^d | Sensitivity | Specificity | PPV^e | NPV^f | PLR^g | NLR^h | Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Training set | GPT-4 Model 1 | 10 | 34 | 5 | 73 | 0.2273 | 0.9359 | 0.6667 | 0.6822 | 3.5455 | 0.8257 | 0.6803 |
| Training set | GPT-4 Model 2 | 20 | 24 | 6 | 72 | 0.4545 | 0.9231 | 0.7692 | 0.7500 | 5.9091 | 0.5909 | 0.7541 |
| Training set | GPT-4 Model 3 | 26 | 18 | 12 | 66 | 0.5909 | 0.8462 | 0.6842 | 0.7857 | 3.8409 | 0.4835 | 0.7541 |
| Training set | GPT-4 Model 4 | 16 | 28 | 13 | 65 | 0.3636 | 0.8333 | 0.5517 | 0.6989 | 2.1818 | 0.7636 | 0.6639 |
| Training set | GPT-4 Model 5 | 38 | 6 | 4 | 74 | 0.8636 | 0.9487 | 0.9048 | 0.9250 | 16.8409 | 0.1437 | 0.9180 |
| Training set | GPT-3.5 Model 1 | 13 | 31 | 14 | 64 | 0.2955 | 0.8205 | 0.4815 | 0.6737 | 1.6461 | 0.8587 | 0.6311 |
| Training set | GPT-3.5 Model 2 | 22 | 22 | 24 | 54 | 0.5000 | 0.6923 | 0.4783 | 0.7105 | 1.6250 | 0.7222 | 0.6230 |
| Training set | GPT-3.5 Model 3 | 26 | 18 | 28 | 50 | 0.5909 | 0.6410 | 0.4815 | 0.7353 | 1.6461 | 0.6382 | 0.6230 |
| Training set | GPT-3.5 Model 4 | 22 | 22 | 31 | 47 | 0.5000 | 0.6026 | 0.4151 | 0.6812 | 1.2581 | 0.8298 | 0.5656 |
| Training set | GPT-3.5 Model 5 | 25 | 19 | 21 | 57 | 0.5682 | 0.7308 | 0.5435 | 0.7500 | 2.1104 | 0.5909 | 0.6721 |
| Test set | GPT-4 Model 5 | 17 | 5 | 5 | 25 | 0.7727 | 0.8333 | 0.7727 | 0.8333 | 4.6364 | 0.2727 | 0.8077 |
| Test set | GPT-3.5 Model 5 | 12 | 10 | 9 | 21 | 0.5455 | 0.7000 | 0.5714 | 0.6774 | 1.8182 | 0.6494 | 0.6346 |

^a TP: true positive.

^b FN: false negative.

^c FP: false positive.

^d TN: true negative.

^e PPV: positive predictive value.

^f NPV: negative predictive value.

^g PLR: positive likelihood ratio.

^h NLR: negative likelihood ratio.
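
Each indicator in Table 2 can be derived from the four counts (TP, FN, FP, TN) using the standard definitions. The following is a minimal sketch of that arithmetic, not the authors' code: the names `ConfusionCounts` and `diagnostic_indicators` are hypothetical, and the function assumes every denominator is nonzero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for one row of counts from the table."""
    tp: int  # true positives
    fn: int  # false negatives
    fp: int  # false positives
    tn: int  # true negatives


def diagnostic_indicators(c: ConfusionCounts) -> dict[str, float]:
    """Compute the table's indicators using the standard definitions.

    Assumes all denominators are nonzero; no zero-division handling.
    """
    sensitivity = c.tp / (c.tp + c.fn)          # true positive rate
    specificity = c.tn / (c.tn + c.fp)          # true negative rate
    false_positive_rate = c.fp / (c.fp + c.tn)  # 1 - specificity
    false_negative_rate = c.fn / (c.tp + c.fn)  # 1 - sensitivity
    total = c.tp + c.fn + c.fp + c.tn
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "plr": sensitivity / false_positive_rate,
        "nlr": false_negative_rate / specificity,
        "accuracy": (c.tp + c.tn) / total,
    }


# Counts taken from the "Test set, GPT-4 Model 5" row of the table.
row = ConfusionCounts(tp=17, fn=5, fp=5, tn=25)
for name, value in diagnostic_indicators(row).items():
    print(f"{name}: {value:.4f}")
```

Rounded to four decimal places, the output for this row matches the table: sensitivity 0.7727, specificity 0.8333, PPV 0.7727, NPV 0.8333, PLR 4.6364, NLR 0.2727, and accuracy 0.8077.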