Author manuscript; available in PMC: 2025 Aug 25.
Published in final edited form as: Annu Rev Biomed Data Sci. 2025 Apr 1;8(1):251–274. doi: 10.1146/annurev-biodatasci-102224-074736

Table 4.

Examples of the most frequently used training (pretraining/fine-tuning) and testing datasets

| Training dataset | Count | Testing dataset | Count |
|---|---|---|---|
| PubMed articles | 6 | PubMedQA (53) | 8 |
| MedMCQA (52) | 4 | BC5CDR (135) | 7 |
| MIMIC-III (48) | 4 | MedMCQA (52) | 7 |
| PubMedQA (53) | 4 | MedQA (136) | 6 |
| MedQA (136) | 3 | BioASQ (51) | 4 |
| MIMIC-CXR (50) | 3 | DDI (137) | 4 |
| MIMIC-IV (49) | 3 | National Center for Biotechnology Information disease (138) | 4 |
| HiTZ/multilingual (113) | 3 | MIMIC-CXR (50) | 3 |
| PromptCBLUE (139) | 3 | MIMIC-III (48) | 3 |