2025 Jan 4;18:2. doi: 10.1186/s13040-024-00414-9

Table 2.

Overview of pre-trained language models in biomedicine with release dates

| Model Name | Corpora | LLM Backbone | Release Date |
|---|---|---|---|
| BioBERT | PubMed abstracts, PMC articles | BERT | 2020 |
| MedBERT | Medical texts, EHRs | BERT | 2021 |
| ClinicalBERT | MIMIC-III clinical notes | BERT | 2019 |
| SciBERT | Scientific papers (82% biomedical) | BERT | 2019 |
| COVID-Twitter-BERT | Tweets about COVID-19 | BERT | 2023 |
| MedGPT | Electronic health records (EHRs) | GPT | 2021 |
| SciFive | Biomedical corpora | T5 | 2021 |
| LLMBiomedicine | Biomedical texts (NER [214] tasks) | GPT-4 | 2024 |
| ClinicalGPT | Diverse medical data | GPT | 2023 |
| MultiMedQA | Medical QA datasets | PaLM [46] | 2023 |
| ChatDoctor | Patient-physician conversations | LLaMA | 2023 |
| Taiyi | Biomedical texts, multilingual | Qwen [215] | 2024 |