Author manuscript; available in PMC: 2025 Aug 25.
Published in final edited form as: Annu Rev Biomed Data Sci. 2025 Apr 1;8(1):251–274. doi: 10.1146/annurev-biodatasci-102224-074736

Table 3.

Architectural details of all models (n = 91) from all 82 articles included in the study

| Backbone model (references) | Architecture | Number of models (% of total) | Model size | Number of models (% of backbone) | Variants as backbone | Number of models (% of backbone) |
|---|---|---|---|---|---|---|
| Llama (7, 8, 14, 15, 62–91) | Decoder-only | 36 (39.6%) | 7B | 17 (47.2%) | Llama-base | 28 (77.8%) |
| | | | 13B | 14 (38.9%) | Alpaca | 3 (8.3%) |
| | | | 70B | 2 (5.6%) | Vicuna | 2 (5.6%) |
| | | | 33B | 2 (5.6%) | AlpaCare | 1 (2.8%) |
| | | | 65B | 1 (2.8%) | Orca | 1 (2.8%) |
| | | | | | Ziya | 1 (2.8%) |
| GPT (17, 73, 74, 84, 92–104) | Decoder-only | 16 (17.6%) | 1.5B | 8 (50.0%) | GPT-base | 14 (87.5%) |
| | | | 175B | 2 (12.5%) | BioGPT | 2 (12.5%) |
| | | | 6.7B | 2 (12.5%) | | |
| | | | 2.7B | 1 (6.3%) | | |
| | | | 20B | 1 (6.3%) | | |
| | | | 6B | 1 (6.3%) | | |
| | | | 1.3B | 1 (6.3%) | | |
| ChatGLM (105–111) | Encoder–decoder | 7 (7.7%) | 6B | 7 (100.0%) | ChatGLM-base | 7 (100.0%) |
| T5 (88, 112–116) | Encoder–decoder | 6 (6.6%) | 11B | 4 (66.7%) | Flan-T5 | 3 (50.0%) |
| | | | 3B | 2 (33.3%) | T5-base | 1 (16.7%) |
| | | | | | mT5 | 1 (16.7%) |
| | | | | | ProtT5 | 1 (16.7%) |
| Baichuan (75, 117–121) | Decoder-only | 6 (6.6%) | 7B | 4 (66.7%) | Baichuan-base | 6 (100.0%) |
| | | | 13B | 2 (33.3%) | | |
| From scratch (122–124) | Decoder-only | 3 (3.3%) | 6.4B | 1 (33.3%) | ProGen | 1 (33.3%) |
| | | | 2.5B | 1 (33.3%) | ProGen2 | 1 (33.3%) |
| | | | 1.2B | 1 (33.3%) | Nucleotide Transformer | 1 (33.3%) |
| BLOOM (125–127) | Decoder-only | 3 (3.3%) | 7B | 2 (66.7%) | BLOOM-base | 3 (100.0%) |
| | | | 1B | 1 (33.3%) | | |
| Qwen (9, 128) | Decoder-only | 2 (2.2%) | 14B | 1 (50.0%) | Qwen-base | 2 (100.0%) |
| | | | 7B | 1 (50.0%) | | |
| PaLM (11, 129) | Decoder-only | 2 (2.2%) | 540B | 1 (50.0%) | PaLM-base | 2 (100.0%) |
| | | | 340B | 1 (50.0%) | | |
| LongNet (130) | Decoder-only | 1 (1.1%) | 1B | 1 (100.0%) | LongNet-base | 1 (100.0%) |
| GenSLM (131) | Decoder-only | 1 (1.1%) | 25B | 1 (100.0%) | GenSLM-base | 1 (100.0%) |
| Hyena (132) | Decoder-only | 1 (1.1%) | 7B | 1 (100.0%) | StripedHyena | 1 (100.0%) |
| Multimodal (14, 87, 91, 96, 130, 133, 134) | Mixed | 7 (7.7%) | NA | NA | ViT | 4 (57.1%) |
| | | | | | BLIP | 1 (14.3%) |
| | | | | | CLIP | 1 (14.3%) |
| | | | | | LLaVA | 1 (14.3%) |

Abbreviation: NA, not applicable.
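
The percentages in Table 3 appear to use two different denominators: the share next to each backbone is relative to all 91 models, whereas the shares next to model sizes and variants are relative to that backbone's own model count. The following minimal sketch recomputes a few of them from the counts in the table to illustrate this reading. The names `backbone_counts`, `llama_sizes`, and `percent` are illustrative and not taken from the article, and rounding to one decimal place is an assumption about how the reported values were produced.

```python
# Counts per backbone family, copied from Table 3.
backbone_counts = {
    "Llama": 36, "GPT": 16, "ChatGLM": 7, "T5": 6, "Baichuan": 6,
    "From scratch": 3, "BLOOM": 3, "Qwen": 2, "PaLM": 2,
    "LongNet": 1, "GenSLM": 1, "StripedHyena-based": 1, "Multimodal": 7,
}

# Model-size counts for the Llama row group, copied from Table 3.
llama_sizes = {"7B": 17, "13B": 14, "70B": 2, "33B": 2, "65B": 1}


def percent(count, total):
    """Share of `count` in `total`, as a percentage rounded to one decimal (assumed)."""
    return round(100 * count / total, 1)


# Denominator 1: all models in the review.
total_models = sum(backbone_counts.values())
backbone_share = {name: percent(n, total_models) for name, n in backbone_counts.items()}

# Denominator 2: models within a single backbone family (here, Llama).
llama_total = backbone_counts["Llama"]
llama_size_share = {size: percent(n, llama_total) for size, n in llama_sizes.items()}

if __name__ == "__main__":
    print("total models:", total_models)
    for name, share in backbone_share.items():
        print(f"{name}: {backbone_counts[name]} ({share}%)")
    for size, share in llama_size_share.items():
        print(f"Llama {size}: {llama_sizes[size]} ({share}%)")
```

Under these assumptions the backbone counts sum to the 91 models stated in the table title, and the Llama size counts sum to the 36 Llama models.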