Table 1.
Study | Models tested | Additional modifiers | Data source | Evaluation metric |
---|---|---|---|---|
1 | GPT-3 (davinci), BART (cnn), T5, LED | None | CNN/DailyMail, BookSum, sec-litigation, MIMIC-III | ROUGE, BERTScore, text length reduction |
2 | GPT-4, GPT-3, FLAN-T5, FLAN-UL2, Llama-2, Vicuna | ICL, QLoRA | Open-i | ROUGE, BERTScore, clinician evaluation |
3 | FLAN-T5, BERT, pegasus-xsum | SPeC | MIMIC-CXR | ROUGE, BERTScore |
4 | Med-BERT | Prediction head, fine-tuning on smaller datasets | Cerner, Truven | Disease prediction AUC |