Extended Data Table 1 | Language models evaluated in this study.
Model | Context (tokens) | Parameters | Proprietary? | Seq2seq | Autoregressive
---|---|---|---|---|---
FLAN-T5 | 512 | 2.7B | - | ✔ | - |
FLAN-UL2 | 2,048 | 20B | - | ✔ | - |
Alpaca | 2,048 | 7B | - | - | ✔ |
Med-Alpaca | 2,048 | 7B | - | - | ✔ |
Vicuna | 2,048 | 7B | - | - | ✔ |
Llama-2 | 4,096 | 7B, 13B | - | - | ✔ |
GPT-3.5 | 16,384 | 175B | ✔ | - | ✔ |
GPT-4 | 32,768* | unknown | ✔ | - | ✔ |
*The context length of GPT-4 has since been increased to 128,000 tokens.
We quantitatively evaluated eight models, spanning state-of-the-art sequence-to-sequence (seq2seq) and autoregressive architectures. Unless marked proprietary, models are open source.
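
To make the architectural distinction in the last two columns concrete, below is a minimal sketch (not from the paper) using the Hugging Face transformers library. The model identifiers and prompt are illustrative assumptions; the two model classes reflect how the two families in the table are typically loaded and prompted.

```python
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
)

prompt = "Summarize: the patient presented with chest pain."

# Seq2seq (encoder-decoder), e.g. FLAN-T5: the encoder reads the full
# input, and the decoder generates the output conditioned on it.
tok = AutoTokenizer.from_pretrained("google/flan-t5-xl")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xl")
out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))

# Autoregressive (decoder-only), e.g. Llama-2: the prompt and the
# generated text share one left-to-right context window, so the
# context lengths in the table bound prompt plus output combined.
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```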