2022 Mar 7;25(3):369–380. doi: 10.1038/s41593-022-01026-4

Fig. 2. Behavioral assessment of the human ability to predict forthcoming words in a natural context.

a, The stimulus was transcribed for the behavioral experiment. b, A ten-word sliding window was presented in each trial, and participants were asked to type their prediction of the next word. Once a prediction was entered, the correct word was presented, and the window slid forward by one word. c, For each word, we calculated the proportion of participants who predicted the forthcoming word correctly. d, Human predictability scores across words. e, Human predictability scores versus GPT-2’s predictability scores for each upcoming word in the podcast. f, Match between the probability that humans and GPT-2 assigned to their top-one predictions and the actual accuracy of those predictions. g, Correlation between human predictions and GPT-2 predictions (as reported in d) for different context window lengths ranging from 2 to 1,024 preceding tokens (blue). Correlation between human predictions and 2- to 5-gram model predictions (orange).
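
The procedure in panels b and c can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `sliding_windows` and `predictability` are hypothetical, and two choices are assumptions that the caption does not specify: the first trial starts once ten words of context are available, and responses are matched to the correct word after trimming whitespace and ignoring case.

```python
from typing import Iterator, List, Sequence, Tuple


def sliding_windows(
    words: Sequence[str], window: int = 10
) -> Iterator[Tuple[List[str], str]]:
    """Yield (context, target) pairs for each trial.

    Assumption: trials begin only once `window` preceding words exist;
    the caption does not say how earlier words were handled.
    """
    for i in range(window, len(words)):
        yield list(words[i - window:i]), words[i]


def predictability(target: str, responses: Sequence[str]) -> float:
    """Proportion of participant responses that match the target word.

    Assumption: matching ignores case and surrounding whitespace.
    """
    if not responses:
        raise ValueError("at least one response is required")
    normalized = target.strip().lower()
    hits = sum(r.strip().lower() == normalized for r in responses)
    return hits / len(responses)


if __name__ == "__main__":
    # Hypothetical transcript and responses, for illustration only.
    transcript = [f"w{i}" for i in range(12)]
    for context, target in sliding_windows(transcript):
        print(context[-3:], "->", target)
    print(predictability("dog", ["dog", "Dog ", "cat", "bird"]))
```

Applying `predictability` to every target word yields one score per word, which corresponds to the per-word human predictability scores described in panel d.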