Table 1.
| Metric | Category | Discriminative models: LSTMᵇ | Discriminative models: BERTᶜ | GPT-4 (with CoTᵃ): Zero-shot | GPT-4 (with CoTᵃ): Few-shot |
| --- | --- | --- | --- | --- | --- |
| Recall score | Treatment | 0.486 | 0.492 | 0.724 | 0.740 |
| Recall score | Records | 0.715 | 0.702 | 0.893 | 0.902 |
| Recall score | Routine | 0.764 | 0.779 | 0.834 | 0.899 |
| Recall score | Rescheduling | 0.769 | 0.721 | 0.896 | 0.837 |
| Recall score | Symptoms | 0.927 | 0.878 | 0.750 | 0.878 |
| Total | Accuracy | 0.737 | 0.713 | 0.828 | 0.852 |
ᵃ CoT: chain of thought.
ᵇ LSTM: long short-term memory.
ᶜ BERT: bidirectional encoder representations from transformers.