2024 Jul 16;3(7):pgae233. doi: 10.1093/pnasnexus/pgae233

Fig. 5.

Detailed results on the Wason task. Human performance is low, even on realistic rules. However, the subset of subjects who answer more slowly shows above-chance accuracy on the realistic rules (cyan), but not on the arbitrary ones (pink). Furthermore, each of the language models reproduces this pattern of advantage for the realistic rules. In addition, two of the larger models perform above chance on the arbitrary rules. (The dashed line corresponds to chance: a random choice of two cards. LMs and humans were forced to choose exactly two cards.)
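
As a rough illustration of the chance level marked by the dashed line, the sketch below counts the possible two-card selections. It assumes the standard four-card version of the Wason task with exactly one correct pair of cards, which the caption itself does not state; `chance_accuracy` is a hypothetical helper, not code from the paper.

```python
from fractions import Fraction
from itertools import combinations


def chance_accuracy(n_cards: int = 4, n_chosen: int = 2) -> Fraction:
    """Probability that a uniformly random selection of `n_chosen` cards
    matches the single correct selection.

    Assumption: four cards per problem (the classic Wason setup) and
    exactly one correct pair; neither number is given in the caption.
    """
    # Enumerate every distinct way to pick n_chosen cards out of n_cards.
    selections = list(combinations(range(n_cards), n_chosen))
    # Only one of those selections is correct, so chance is 1 / count.
    return Fraction(1, len(selections))


chance = chance_accuracy()
print(chance)  # 1/6 under the four-card assumption
```

Under these assumptions, forcing exactly two choices puts the chance level at one in six, well below the 50% that a single binary judgment would give.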