Fig. 8.
Answer patterns for the Wason tasks, broken down by the pair of cards that each participant chose (AT: antecedent true; CF: consequent false; etc.). Behavior is not random, even when performance is near chance. As above, humans do not usually choose the correct answer (AT, CF; dark blue); instead, they more frequently exhibit the matching bias (AT, CT; light blue). Humans also make other errors, e.g. choosing, surprisingly often, two cards that correspond to a single rule component (pink). Language models answer correctly more often than humans but, intriguingly, also more often choose options that pair an antecedent-false card with a consequent card (yellow/orange).