Table 4.
Examples of the most frequently used training (pretraining/fine-tuning) and testing datasets
| Training dataset | Count | Testing dataset | Count |
|---|---|---|---|
| PubMed articles | 6 | PubMedQA (53) | 8 |
| MedMCQA (52) | 4 | BC5CDR (135) | 7 |
| MIMIC-III (48) | 4 | MedMCQA (52) | 7 |
| PubMedQA (53) | 4 | MedQA (136) | 6 |
| MedQA (136) | 3 | BioASQ (51) | 4 |
| MIMIC-CXR (50) | 3 | DDI (137) | 4 |
| MIMIC-IV (49) | 3 | National Center for Biotechnology Information (NCBI) disease corpus (138) | 4 |
| HiTZ/multilingual (113) | 3 | MIMIC-CXR (50) | 3 |
| PromptCBLUE (139) | 3 | MIMIC-III (48) | 3 |