The Typability Index: A tool for measuring and controlling for typing difficulty in text stimuli

2026 Feb 12;58(2):61. doi: 10.3758/s13428-025-02877-y

#	Candidate predictor variable	Calculation
1	Total characters	Total number of characters including spaces.
2	Total keystrokes	The minimum number of keystrokes needed to type the text accurately, assuming shift is used rather than caps lock.
3	Total words	The number of words in the text, defined as groups of characters separated by spaces, rather than the five-character "standard word" typically used when calculating typing speed.
4	Keystrokes per word	Total keystrokes divided by total words.
5	Characters per word	Total characters divided by total words.
6	Mean word proportion	1 divided by characters per word.
7	Proportion of words within high-frequency words	Number of words from the text that appear in the top 1000 words list* divided by total words.
8	Proportion of characters within high-frequency words	Number of characters that are contained in words from the text that appear in the top 1000 words list* divided by total characters.
9	Mean word frequency	Sum of the language frequencies of each word in the text, divided by number of words. Frequencies from SUBTLEX-US (Brysbaert & New, 2009, ‘FREQcount’ variable).
10	Proportion of non-words	Number of words in the text that are not recognised in UK, US, AU or CA Hunspell English dictionaries (according to the {hunspell} package; Ooms, 2022) divided by total words.
11	Proportion of characters within non-words	Number of characters that are contained in words that are not recognised in UK, US, AU or CA dictionaries divided by total characters.
12	Syllables per word	Total number of syllables (according to the {quanteda.textstats} package; Benoit et al., 2018), divided by total words. This package uses the CMU Pronouncing Dictionary (Carnegie Mellon University, n.d.), and counts vowel clusters for words not in this dictionary.
13	Bigram frequency	Sum of the language frequencies of each letter pair in the text, divided by number of letter pairs. Frequencies based on Behmer and Crump (2017; 'Frequency' variable). This includes letter pairs only, with no spaces, and is based on approximately 3000 English language eBooks from Project Gutenberg.
14	Proportion of high frequency bigrams	Number of letter pairs from the text that appear in the top 15 bigrams, divided by number of letter pairs (an alternative approach akin to the proportion of high-frequency words). Frequencies from Behmer and Crump (2017).
15	Proportion of character repetitions	Number of character pairs relating to character repetitions (e.g. ‘rr’, ‘..’), divided by number of character pairs.
16	Proportion of finger repetitions	Number of character pairs relating to finger repetitions (e.g. ‘ed’, ‘k,’), assuming standard touch typing, divided by number of character pairs.
17	Proportion of hand repetitions	Number of character pairs relating to hand repetitions (e.g. ‘se’, ‘hi’), assuming standard touch typing, divided by number of character pairs.
18	Proportion of hand alternations	Number of character pairs relating to hand alternations (e.g. ‘qu’, ‘ty’), assuming standard touch typing, divided by number of character pairs.
19	Proportion of lowercase letter characters	Number of lowercase letters divided by total characters.
20	Proportion of uppercase letter characters	Number of uppercase letters divided by total characters.
21	Proportion of numbers	Number of numeric characters (digits) divided by total characters.
22	Proportion of symbols	Number of symbols (including both punctuation and non-punctuation symbols) divided by total characters.
23	Proportion of spaces	Number of spaces divided by total characters.
24	Proportion of lowercase letter non-space characters	Number of lowercase letters divided by total non-space characters.
25	Proportion of uppercase letter non-space characters	Number of uppercase letters divided by total non-space characters.
26	Proportion of number non-space characters	Number of numeric characters (digits) divided by total non-space characters.
27	Proportion of symbol non-space characters	Number of symbols divided by total non-space characters.
28	Keystrokes per character	Total keystrokes divided by total characters.
29	Proportion of right-side keys	Number of characters relating to keys on the right-hand side of the keyboard, assuming standard touch typing, divided by total characters.
30	Mean distance from home row	Sum of each character’s key distance from the eight finger resting keys on the home row, divided by total characters. Distances are based on Krzywinski (n.d.).
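
The character-pair predictors (15 to 18) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the finger map `FINGER_KEYS` and the function names are hypothetical, and several choices are assumptions, namely that the four categories are mutually exclusive (a finger repetition is the same finger on a different key; a hand repetition is the same hand with a different finger), that only unshifted ANSI QWERTY keys are mapped, and that pairs involving an unmapped character such as a space fall into no category while still counting in the denominator.

```python
# Assumed finger assignment for standard touch typing on ANSI QWERTY.
# Keys are (hand, finger); values are the unshifted keys that finger strikes.
FINGER_KEYS = {
    ("L", "pinky"): "`1qaz",
    ("L", "ring"): "2wsx",
    ("L", "middle"): "3edc",
    ("L", "index"): "45rtfgvb",
    ("R", "index"): "67yuhjnm",
    ("R", "middle"): "8ik,",
    ("R", "ring"): "9ol.",
    ("R", "pinky"): "0-=p[];'/\\",
}
KEY_TO_FINGER = {key: hf for hf, keys in FINGER_KEYS.items() for key in keys}

CATEGORIES = (
    "character_repetition",
    "finger_repetition",
    "hand_repetition",
    "hand_alternation",
)


def classify_pair(a, b):
    """Classify one adjacent character pair; return None if a key is unmapped."""
    if a == b:
        return "character_repetition"
    # Letters are lowercased so that 'E' and 'e' resolve to the same key.
    fa = KEY_TO_FINGER.get(a.lower())
    fb = KEY_TO_FINGER.get(b.lower())
    if fa is None or fb is None:
        return None  # e.g. spaces or shifted symbols (assumption: uncategorised)
    if fa == fb:
        return "finger_repetition"
    if fa[0] == fb[0]:
        return "hand_repetition"
    return "hand_alternation"


def pair_proportions(text):
    """Return each category's share of all adjacent character pairs in text."""
    pairs = list(zip(text, text[1:]))
    counts = dict.fromkeys(CATEGORIES, 0)
    for a, b in pairs:
        category = classify_pair(a, b)
        if category is not None:
            counts[category] += 1
    total = len(pairs)
    return {name: (n / total if total else 0.0) for name, n in counts.items()}


if __name__ == "__main__":
    print(pair_proportions("seed"))
    print(pair_proportions("quty"))
```

Under these assumptions, the table's own examples land in the expected categories: ‘ed’ and ‘k,’ are finger repetitions, ‘se’ and ‘hi’ are hand repetitions, and ‘qu’ and ‘ty’ are hand alternations.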

Predictors were calculated according to American English spellings and keyboard layout (ANSI) unless stated otherwise. *We used the 1,000 most frequent English words list from the Corpus of Contemporary American English (Davies, 2008-), including lemmatisations. For example, “do” is on the core list, so variations such as “doing,” “did,” and “done” were also considered. Including lemmatisations allows the Typability Index to capture familiarity with core concepts, not just the specific forms of words. This helps reflect both the cognitive familiarity and the ease of typing frequent or commonly recognised words.
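
The lemma-based matching behind predictors 7 and 8 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `TOY_CORE_LEMMAS` is a hypothetical three-entry stand-in for the COCA list, the function name is invented, and stripping surrounding punctuation and lowercasing before lookup are assumed preprocessing steps that the table does not specify.

```python
import string

# Hypothetical toy stand-in for the top-1,000 list: lemma -> accepted forms.
TOY_CORE_LEMMAS = {
    "do": {"do", "does", "doing", "did", "done"},
    "the": {"the"},
    "be": {"be", "am", "is", "are", "was", "were", "been", "being"},
}
# Flatten to a single lookup set so any listed form of a core lemma matches.
HIGH_FREQ_FORMS = set().union(*TOY_CORE_LEMMAS.values())


def high_frequency_proportions(text, forms=HIGH_FREQ_FORMS):
    """Return (proportion of words, proportion of characters) in listed forms.

    Words are whitespace-separated groups of characters; total characters
    include spaces, following the definitions of predictors 1 and 3.
    """
    words = text.split()
    if not words:
        return 0.0, 0.0
    matched_words = 0
    matched_chars = 0
    for word in words:
        # Assumption: ignore surrounding punctuation and case when matching.
        core = word.strip(string.punctuation).lower()
        if core in forms:
            matched_words += 1
            matched_chars += len(core)
    return matched_words / len(words), matched_chars / len(text)


if __name__ == "__main__":
    print(high_frequency_proportions("She did the work."))
```

Here “did” matches through the lemma “do”, so two of the four words in the example sentence count as high frequency even though “did” itself is not a list headword.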