Table 3.
Architectural details of all models (n = 91) from all 82 articles included in the study
Backbone model | Architecture | Models by backbone, n (%) | Model size | Models by size, n (%) | Variants as backbone | Models by variant, n (%) |
---|---|---|---|---|---|---|
Llama (7, 8, 14, 15, 62–91) | Decoder-only | 36 (39.6%) | 7B | 17 (47.2%) | Llama-base | 28 (77.8%) |
| | | | 13B | 14 (38.9%) | Alpaca | 3 (8.3%) |
| | | | 70B | 2 (5.6%) | Vicuna | 2 (5.6%) |
| | | | 33B | 2 (5.6%) | AlpaCare | 1 (2.8%) |
| | | | 65B | 1 (2.8%) | Orca | 1 (2.8%) |
| | | | | | Ziya | 1 (2.8%) |
GPT (17, 73, 74, 84, 92–104) | Decoder-only | 16 (17.6%) | 1.5B | 8 (50.0%) | GPT-base | 14 (87.5%) |
| | | | 175B | 2 (12.5%) | BioGPT | 2 (12.5%) |
| | | | 6.7B | 2 (12.5%) | | |
| | | | 2.7B | 1 (6.3%) | | |
| | | | 20B | 1 (6.3%) | | |
| | | | 6B | 1 (6.3%) | | |
| | | | 1.3B | 1 (6.3%) | | |
ChatGLM (105–111) | Encoder–decoder | 7 (7.7%) | 6B | 7 (100.0%) | ChatGLM-base | 7 (100.0%) |
T5 (88, 112–116) | Encoder–decoder | 6 (6.6%) | 11B | 4 (66.7%) | Flan-T5 | 3 (50.0%) |
| | | | 3B | 2 (33.3%) | T5-base | 1 (16.7%) |
| | | | | | mT5 | 1 (16.7%) |
| | | | | | ProtT5 | 1 (16.7%) |
Baichuan (75, 117–121) | Decoder-only | 6 (6.6%) | 7B | 4 (66.7%) | Baichuan-base | 6 (100.0%) |
| | | | 13B | 2 (33.3%) | | |
From scratch (122–124) | Decoder-only | 3 (3.3%) | 6.4B | 1 (33.3%) | ProGen | 1 (33.3%) |
| | | | 2.5B | 1 (33.3%) | ProGen2 | 1 (33.3%) |
| | | | 1.2B | 1 (33.3%) | Nucleotide Transformer | 1 (33.3%) |
BLOOM (125–127) | Decoder-only | 3 (3.3%) | 7B | 2 (66.7%) | BLOOM-base | 3 (100.0%) |
| | | | 1B | 1 (33.3%) | | |
Qwen (9, 128) | Decoder-only | 2 (2.2%) | 14B | 1 (50.0%) | Qwen-base | 2 (100.0%) |
| | | | 7B | 1 (50.0%) | | |
PaLM (11, 129) | Decoder-only | 2 (2.2%) | 540B | 1 (50.0%) | PaLM-base | 2 (100.0%) |
| | | | 340B | 1 (50.0%) | | |
LongNet (130) | Decoder-only | 1 (1.1%) | 1B | 1 (100.0%) | LongNet-base | 1 (100.0%) |
GenSLM (131) | Decoder-only | 1 (1.1%) | 25B | 1 (100.0%) | GenSLM-base | 1 (100.0%) |
Hyena (132) | Decoder-only | 1 (1.1%) | 7B | 1 (100.0%) | StripedHyena | 1 (100.0%) |
Multimodal (14, 87, 91, 96, 130, 133, 134) | Mixed | 7 (7.7%) | NA | NA | ViT | 4 (57.1%) |
| | | | | | BLIP | 1 (14.3%) |
| | | | | | CLIP | 1 (14.3%) |
| | | | | | LLaVA | 1 (14.3%) |
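Abbreviation: NA, not applicable. Backbone percentages are of all 91 models; model-size and variant percentages are within each backbone.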