[Preprint]. 2024 Jul 25:arXiv:2408.00588v1. [Version 1]

Figure 4:

Human and GPT4-simulated evaluation of LLM-generated summaries. a, Performance of different summarization systems in human evaluation, measured by win-rate against zero-shot Llama-2 (Llama-2-zs). The dotted line marks the default 50% win-rate of Llama-2-zs. b, Performance of different summarization systems in GPT4-simulated evaluation, measured by win-rate. The dotted line marks the default win-rate of Llama-2-zs. zs: zero-shot learning; ft: fine-tuning.