2024 Feb 26;382(2270):20230159. doi: 10.1098/rsta.2023.0159

Figure 4.

For both the CFR and US Code exams, overall answer accuracy increases clearly with each successive OpenAI model release. The most capable model, GPT-4, given both prompting enhancements (CoT and few-shot) and the most relevant ‘gold_truth’ legal text in the prompt, performs extremely well, far better than any other set-up in the experiments (see ‘mega_run’ in figure 3). (Online version in colour.)