2024 Feb 26;382(2270):20230159. doi: 10.1098/rsta.2023.0159

Figure 3.

The y-axis shows the accuracy of each experimental setting, averaged across the different question sub-types. The ‘mega_run’ experimental set-up for GPT-4, which combines few-shot and chain-of-thought (CoT) prompting with ‘gold truth’ legal sources, yields the best overall accuracy on both the CFR and US Code exams. CoT boosts GPT-4 performance in the retrieval settings that provide no legal text (‘bypass retrieval’ and ‘few shot’) and in those that provide the most relevant possible legal text (‘gold truth’ and ‘mega run’). (Online version in colour.)