2024 Dec 11;26:e63892. doi: 10.2196/63892

Table 1.

Performance comparison of GPT-4 zero-shot and few-shot on refined categories.


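| Metric | Category | LSTM^b (discriminative) | BERT^c (discriminative) | GPT-4 zero-shot (with CoT^a) | GPT-4 few-shot (with CoT^a) |
|---|---|---|---|---|---|
| Recall score | Treatment | 0.486 | 0.492 | 0.724 | 0.740 |
| Recall score | Records | 0.715 | 0.702 | 0.893 | 0.902 |
| Recall score | Routine | 0.764 | 0.779 | 0.834 | 0.899 |
| Recall score | Rescheduling | 0.769 | 0.721 | 0.896 | 0.837 |
| Recall score | Symptoms | 0.927 | 0.878 | 0.750 | 0.878 |
| Total | Accuracy | 0.737 | 0.713 | 0.828 | 0.852 |

^a CoT: chain of thought.

^b LSTM: long short-term memory.

^c BERT: bidirectional encoder representations from transformers.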
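
For readers unfamiliar with the two metrics reported in Table 1, the following is a minimal sketch of how per-category recall and overall accuracy are conventionally computed from true and predicted labels. It assumes the standard definitions; the table itself does not state how the scores were computed. The function names and the toy labels are hypothetical and are not taken from the study data.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Recall per category: items correctly predicted as the category,
    divided by all items whose true label is that category."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def accuracy(y_true, y_pred):
    """Overall accuracy: fraction of items whose prediction matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels for illustration only (not study data).
y_true = ["Treatment", "Treatment", "Records", "Routine", "Symptoms", "Symptoms"]
y_pred = ["Treatment", "Records", "Records", "Routine", "Symptoms", "Treatment"]

recall = per_class_recall(y_true, y_pred)
acc = accuracy(y_true, y_pred)
```

Under these definitions, a model can have high overall accuracy while still showing low recall on a single category, which is why the table reports both.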