Abstract
This special issue was expressly designed to illustrate how item-level analytic models can be employed to answer new and exciting questions in the field of reading. To accomplish this, we present a compilation of six empirical studies that explore how item-level analyses can be used to advance the field's understanding of word reading development. In this introduction to the special issue, I summarize the unique advantages of item-level analyses and discuss how the papers in this special issue are valuable examples of how item-focused analyses can be used in a variety of ways to address important questions across orthographies, word characteristics, and tasks.
The hallmark of developmental dyslexia and one of the most reliable indicators of reading disabilities is difficulty with the acquisition of context-free word identification skills (Lovett et al., 1994; Share & Stanovich, 1995; Torgesen, 2000; Vellutino, 1979). While there is general agreement regarding the component skills of word reading, there is still much to be learned about specific word reading difficulties and the intervention elements that target them. Much of the work on word recognition to date is grounded in the lexical quality hypothesis (Perfetti, 2007), which proposes that lexical representations of words, both within and across individuals, vary in the extent and strength with which aspects of their form (phonology, morphosyntax, orthography) and meaning (semantics) are represented (Harm & Seidenberg, 2004). Work on the self-teaching hypothesis (Share, 1995, 1998) suggests that these representations likely proceed in an item-based fashion (Nation & Castles, 2017) and that both child- and item-level characteristics affect the speed at which representations are formed in developing readers. For this reason, a statistical method that allows for the simultaneous modeling of child and item influences on word reading processes is both timely and promising for answering new questions in the area of reading development.
In recent years, item-level analytic approaches have afforded opportunities to examine the underlying processes of word and nonword reading development (e.g., Gilbert, Compton, & Kearns, 2011), letter acquisition (Kim, Petscher, Foorman, & Zhou, 2010), vocabulary acquisition (e.g., Elleman et al., 2017), reading comprehension (e.g., Miller et al., 2014), and specific underlying mechanisms of word reading interventions (e.g., Steacy, Elleman, Lovett, & Compton, 2016). These studies have primarily employed item response theory (IRT) based item-level analyses (commonly referred to as explanatory item response models, crossed random effects models, and multilevel linear mixed effects models), which have afforded researchers the opportunity to examine both child and item factors, and in some cases child x item interactions, that contribute to item-level variance in reading outcomes. For the purposes of this introduction, I will refer to this class of models as explanatory item response models. Explanatory item response models treat item-level responses as the dependent measure and typically include both child and item random effects along with child and/or item predictors. These models allow researchers to simultaneously include child characteristics/skills (e.g., phonological awareness, rapid automatized naming, vocabulary, etc.) and word or item characteristics (e.g., frequency, length, imageability, etc.) as predictors in the same model. As such, item-level analytic models are particularly well-suited to study both the development of word reading skills and mechanisms that undergird word reading instruction and intervention.
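The general form of such a model can be sketched as a cross-classified logistic regression; the notation below is illustrative rather than drawn from any single study:

```latex
\operatorname{logit} P(Y_{pi} = 1)
  = \beta_0
  + \sum_{k} \beta_k X_{pk}   % child predictors (e.g., phonological awareness)
  + \sum_{m} \gamma_m Z_{im}  % item predictors (e.g., frequency, length)
  + \theta_p + b_i,
\qquad
\theta_p \sim N(0, \sigma^2_{\theta}), \quad b_i \sim N(0, \sigma^2_{b})
```

where \(Y_{pi}\) is child \(p\)'s response to item \(i\), \(\theta_p\) is a child random effect, and \(b_i\) is an item random effect; a child x item interaction is modeled by including a predictor indexed by both \(p\) and \(i\).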
Traditional regression models allow researchers to explore either child or item predictors, but do not allow for both to be included in the same model. Furthermore, traditional analyses require data to be aggregated up to either the child (through total scores) or the item (often as the percentage of the sample responding correctly to each item). For example, Compton, Appleton, and Hosp (2004) explored passage-level predictors (e.g., readability, decodability, average words per sentence) of passage reading accuracy and fluency. To answer this question using a traditional regression approach, the authors aggregated up to average accuracy and fluency for each passage for their sample. In doing so, their unit of analysis for exploring the relation between text-level variables and text accuracy and fluency was the passage. Alternatively, explanatory item-response models would allow authors interested in such questions to use passage performance as the unit of analysis and include both passage and child predictors in the same model. In a similar example at the word level, the English Lexicon Project (Balota et al., 2007) aggregates data up to the word, with the unit of analysis being the average performance on a word (lexical decision or speeded naming task) across participants. In this case, the use of explanatory item response models would allow for item performance as the unit of analysis (e.g., word reading accuracy) and would allow for person and item predictors in the same model. Below, I outline examples of how item-based models have been used to date in multiple areas of reading research. An exhaustive review is beyond the scope of this introduction, but a few examples will serve to situate this special issue within the broader literature.
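A small sketch with invented data makes the unit-of-analysis difference concrete: traditional analyses collapse item-level responses into child totals or item percentages, whereas explanatory item response models keep one record per child-by-item response:

```python
# Each record is one child-by-item response (1 = correct), the unit of
# analysis in an explanatory item response model. Data are invented.
responses = [
    {"child": "c1", "word": "goat", "correct": 1},
    {"child": "c1", "word": "yacht", "correct": 0},
    {"child": "c2", "word": "goat", "correct": 1},
    {"child": "c2", "word": "yacht", "correct": 1},
]

# Traditional child-level aggregation: one total score per child.
child_totals = {}
for r in responses:
    child_totals[r["child"]] = child_totals.get(r["child"], 0) + r["correct"]

# Traditional item-level aggregation: proportion correct per word.
word_counts = {}
for r in responses:
    n, k = word_counts.get(r["word"], (0, 0))
    word_counts[r["word"]] = (n + 1, k + r["correct"])
word_pct = {w: k / n for w, (n, k) in word_counts.items()}

print(child_totals)  # {'c1': 1, 'c2': 2}
print(word_pct)      # {'goat': 1.0, 'yacht': 0.5}
```

Either aggregation discards the other dimension; modeling `responses` directly retains both, which is what permits child and item predictors in the same model.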
There has been considerable work done using these models in the areas of nonword reading, letter knowledge, and word reading. Gilbert, Compton, and Kearns (2011) used an item-level approach to explore nonword reading in developing readers. They used explanatory item response models to partition variance between predictors at the nonword level and predictors at the child level. They also included an item-specific (child-by-word) predictor for grapheme-phoneme correspondence knowledge. They used this modeling technique to explore whether students needed to know all grapheme-phoneme correspondences in a nonword to successfully read the nonword. The modeling technique allowed them to show that knowledge of GPCs is necessary but not sufficient for correctly reading unfamiliar nonwords. Kim, Petscher, Foorman, and Zhou (2010) also used an item-level approach to explore predictors of letter-sound knowledge at both the letter level (consonant-vowel letters (e.g., b and d), vowel-consonant letters (e.g., l and m), letters with no sound cues (e.g., h and y), and vowel letters) and the child level (letter-name knowledge and phonological awareness). They found that letter-name knowledge and phonological awareness impacted letter-sound knowledge and that students were more likely to know consonant-vowel letters than other letters. These models have also been used to explore predictors of polymorphemic word reading (Kearns et al., 2015) and irregular word reading (Steacy et al., 2017). Both studies included child predictors such as phonological awareness and rapid naming and word predictors such as frequency, transparency, and length.
In the areas of vocabulary and reading comprehension, an item-level approach has been used to explore child, item, and passage predictors in several ways. In terms of vocabulary, Elleman et al. (2017) explored child and word predictors of item-based vocabulary acquisition. They used item-based models to determine that students who received vocabulary instruction were more likely to know the meanings of taught words at posttest than students who received a comprehension-strategy-focused intervention. They also found main effects at the child level for general vocabulary knowledge and at the word level for frequency and imageability (a word-level predictor representing the ease of eliciting an image for a word). An example of how these models have been used to explore questions related to comprehension is the work of Miller et al. (2014). They used item-level analyses to explore passage and student characteristics as predictors of reading comprehension question responses and reading fluency. They explored predictors associated with the child (e.g., word recognition, language, and executive function), the question (critical analysis, inferential, comprehension monitoring, and strategy questions vs. literal questions), and the passage (e.g., comparing baseline passages to cohesion-manipulated, decoding-manipulated, syntax-manipulated, and vocabulary-manipulated passages).
Finally, these models have been used to explore the impact of specific intervention elements. Steacy, Elleman, Lovett, and Compton (2016) used item-level analyses to test the impact of specific decoding elements in a word reading intervention on transfer to word reading. They used an experimenter-designed, treatment-aligned word reading outcome measure to explore growth among children with reading disabilities enrolled in Phonological and Strategy Training (PHAST) or Phonics for Reading (PFR). While PHAST includes synthetic phonics and several other word reading strategies, PFR is a traditional synthetic phonics program. Words were coded for the program's word reading strategies that could help students successfully decode the word (e.g., the "vowel alert" strategy to read words with variant vowels). These dummy codes were included in the model as word-level predictors. Interaction terms were then included between these word predictors and treatment. This use of item-level analyses allowed the authors to identify specific intervention elements that differentially impacted word-reading performance at posttest, with children in PHAST better able to read words with variant vowel pronunciations.
The papers in this special issue exemplify the versatility of these models and how they can be used to push our understanding of multiple phenomena forward. As noted above, these models partition item-level variance across child and item/word predictors. Importantly, these models are cross-classified and not strictly hierarchical, with responses nested within persons and words and persons and words crossed on the same level. These models can be used to predict either binary outcomes (i.e., logistic) or continuous outcomes (e.g., exposures to mastery). These new analytic approaches have proven to be fairly robust to sample size and number of items in simulation studies, resulting in more power than typical individual regression models (see Cho, Partchev & De Boeck, 2012).
The six studies in this special issue provide a comprehensive overview of how the simultaneous consideration of person and word characteristics can lead to a deeper understanding of word reading acquisition. These papers address new and interesting considerations related to grapheme-phoneme correspondence (GPC) acquisition in English and German (Schmalz et al., current issue); child and word predictors of item-based word reading of Arabic (Tibi et al., current issue) and Chinese characters (Guan et al., current issue); new explorations of the orthographic choice task (Compton et al., current issue); efficiency of word learning (Steacy et al., current issue); and how these models can be pushed even further to answer new and exciting questions in the field of reading (Petscher et al., current issue).
In their methodologically focused paper, Petscher et al. (current issue) offer an in-depth description of the various types of explanatory item response models that are currently available and the advantages of each. This paper serves as an excellent resource for researchers who are interested in using these models to answer their own research questions. The authors explain the traditional Rasch-based models that have been used extensively in the literature and how these one-parameter models can be expanded into two-parameter models. As the authors explain, traditional Rasch models fix the discrimination parameter to 1.0. By modeling the discrimination parameter, two-parameter models allow researchers to go beyond explaining word-level difficulty (one parameter) to explore both word-level difficulty and discrimination (two parameters). They argue that modeling the discrimination parameter can allow researchers to answer new and important questions. For instance, by including the discrimination parameter, researchers and educators can better understand what types of words can best distinguish high- and low-ability readers. This addition of a discrimination parameter can be particularly useful in test development (see also Cho et al., 2014). In an interesting exploration of these models, Petscher et al. use data from a sample of fifth grade students to compare one- and two-parameter models. This is a helpful tutorial on the specifics of these models and some considerations that researchers should take into account for their own analytic plans.
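In generic IRT notation (not necessarily Petscher et al.'s exact specification), the contrast between the two models can be written as:

```latex
\text{1PL (Rasch):}\quad
P(Y_{pi}=1 \mid \theta_p) = \frac{\exp(\theta_p - b_i)}{1 + \exp(\theta_p - b_i)}
\\[6pt]
\text{2PL:}\quad
P(Y_{pi}=1 \mid \theta_p) = \frac{\exp\bigl(a_i(\theta_p - b_i)\bigr)}{1 + \exp\bigl(a_i(\theta_p - b_i)\bigr)}
```

where \(b_i\) is item difficulty and \(a_i\) is item discrimination; the Rasch model fixes \(a_i = 1\) for all items, while the 2PL model estimates it, allowing items to differ in how sharply they separate high- and low-ability readers.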
In the second paper, Schmalz et al. (current issue) explore simple and context-sensitive GPCs and report three separate experiments across two languages (English and German). This thought-provoking paper focuses specifically on experimentally creating pseudoword items to test specific hypotheses about the interaction between word characteristics and person skills. The authors use a measure of entropy to explore children's pseudoword reading across grades. In this context, entropy (a concept from information theory) is used to quantify the diversity of GPC pronunciations: at the item level, if all students provide the same pronunciation of a vowel, entropy is low; similarly, at the student level, if a student provides the same pronunciation of a vowel across various types of words, entropy is low. Specifically, entropy is used to examine individual differences in the extent to which vowel pronunciations might be generated by unsystematic, random processes. Results indicate that younger students' pseudoword reading responses were variable and that entropy decreased across grades in both English and German. Taken together, their findings suggest that GPCs become increasingly refined as children move through school. A particularly interesting aspect of their paper is the item- and person-level entropies they propose. By exploring this variable at the item and person levels, they are able to explore variability within vowels and within persons. This novel design pushes the field forward in terms of our understanding of how children assign GPC pronunciations and the influence of context on these pronunciations as children develop.
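As a rough illustration of the entropy idea, Shannon entropy over response categories (a generic formulation, not necessarily the exact computation Schmalz et al. use) is zero when all responses agree and grows with response diversity:

```python
import math
from collections import Counter

def pronunciation_entropy(pronunciations):
    """Shannon entropy (in bits) of a set of pronunciations given to one
    vowel grapheme; 0 means every response was identical."""
    counts = Counter(pronunciations)
    total = len(pronunciations)
    # H = sum over categories of p * log2(1/p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

# All students give the same vowel pronunciation: entropy is 0 bits.
print(pronunciation_entropy(["i:", "i:", "i:", "i:"]))  # 0.0

# Responses split evenly between two pronunciations: entropy is 1 bit.
print(pronunciation_entropy(["i:", "ai", "i:", "ai"]))  # 1.0
```

The same function applies at either level of the authors' design: pooling all children's responses to one item gives an item-level entropy, while pooling one child's vowel responses across items gives a person-level entropy.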
The Tibi et al. (current issue) contribution uses explanatory item response models to explore the role of morphology in Arabic word reading. Arabic is an interesting example because it has a complex morphological structure that lends itself well to a nuanced analysis using explanatory IRT. The authors partition variance on Arabic word reading between multiple word predictors (number of letters, number of syllables, number of morphemes, ligaturing, concreteness, orthographic frequency, root type frequency, and part of speech) and one child predictor (morphological awareness). The authors use a diverse set of word predictors, some of which are specific to the Arabic orthography (i.e., ligaturing/connectivity) to answer a new and interesting question related to Arabic word reading. They found a significant main effect for child-level morphological awareness and word-level morphology (number of morphemes). The authors highlight that they did not find a main effect for root type frequency and discuss possible reasons for this finding. They conclude that the morphological complexity of Arabic words presents an obstacle for children learning to read in Arabic and this complexity may have implications for instruction.
In an exploration of a non-alphabetic writing system, Guan et al. (current issue) explore the contribution of child-level skills and Chinese character-level attributes to accuracy and reaction time on a lexical decision task. Interestingly, the authors also explore the contribution of these characteristics across time from Grades 1-6 by including time in their models. They report a main effect for orthographic and phonological awareness on accuracy and a main effect for orthographic awareness on response time. Their results suggest that character-level orthographic and phonological effects contribute to character recognition development from grades 1-6 in an asymptotic way, with the magnitude of their effect declining across grades. They also explored interactions between child and character characteristics across time. At the character level, they explored interactions between consistency and transparency and at the child level, they explored interactions between orthographic awareness and phonological awareness. They report a significant interaction between orthographic awareness and phonological awareness in poor readers when predicting lexical decision accuracy. This paper serves as an interesting example of how item-based analyses can be used to explore development over time.
In the fifth paper, Compton et al. (current issue) explore item-level performance on the orthographic choice task, which has been used widely in the reading literature as a predictor of literacy skills and as an outcome measure of self-teaching experiments. For this task, individuals are asked to choose the correct spelling between a word (e.g., goat) and a pseudohomophone foil (e.g., gote). The authors use explanatory item response models to address the debate around using the orthographic choice task as a measure of orthographic processing skill. Within the literature, some see the orthographic choice task as a measure of orthographic knowledge while others suggest that it measures reading skill. The authors use the modeling technique at the heart of this special issue to explore item performance on the task. Critically, they include an item-specific (child-by-word) predictor representing word reading to predict item-level performance on the orthographic choice task. Using this modeling technique, they find that performance on the orthographic choice task is not fully dependent on students' ability to read the target words in isolation. For example, students who did not read the words correctly still had a high probability of correctly identifying the word on the orthographic choice task (.79), but students who read the words correctly had a higher probability of getting the items correct (.90). This study serves as a nice example of how item-specific (child-by-word) predictors, in addition to child and word predictors, can be used to answer novel and relatively unexplored questions in the literature.
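To put the size of that gap on the scale such models actually estimate, the two reported probabilities can be converted to log-odds; the sketch below is my illustration, not a computation from the paper:

```python
import math

def logit(p):
    """Convert a probability to log-odds, the natural scale of a
    logistic (item response) model."""
    return math.log(p / (1 - p))

p_without_reading = 0.79  # reported P(correct) when the word was not read correctly
p_with_reading = 0.90     # reported P(correct) when the word was read correctly

# The word-reading advantage expressed in logits (log-odds units).
advantage = logit(p_with_reading) - logit(p_without_reading)
print(round(advantage, 2))  # roughly 0.87 logits
```

Framing the difference in logits rather than raw probabilities makes it directly comparable to the coefficients an explanatory item response model would report for the item-specific predictor.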
In the final paper, Steacy et al. (current issue) also use item-level analyses to explore a new question in the literature by focusing on efficiency of learning in a sample of first grade students at risk for reading disabilities participating in a word reading intervention. This paper exemplifies the use of these models to predict a continuous outcome (number of exposures to mastery). The authors include a diverse set of both person and word predictors to explore factors related to word learning efficiency. They found that semantic word properties (imageability and vocabulary grade) were significantly predictive of the number of exposures required for mastery, particularly for students who started the intervention with the poorest word reading skills. Their results suggest that students at the lower end of the distribution may be overly reliant on the semantic features of words. This paper serves as a good example of how partitioning variance between child and word predictors allows child x word interactions to be explored that can help to elucidate important factors related to how quickly students acquire words in their orthographic lexicons.
In conclusion, it has been a pleasure to edit this special issue. The authors have provided thought-provoking studies that illustrate the potential and promise of these models. By focusing their inquiries and analyses at the item level, the authors explore new and unanswered questions related to word reading in individuals across the full distribution of reading skill. I believe this special issue demonstrates how this analytic approach affords opportunities for new and important discoveries. As noted throughout this special issue, there clearly remains much to be explored through this technique.
Acknowledgments
This research was supported in part by Grant R324B190025 awarded to Florida State University by the Institute of Education Sciences (IES) and Grant P20HD091013 awarded to Florida State University by the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD). The content is solely the responsibility of the author and does not necessarily represent the official views of IES or NICHD.
Footnotes
Publisher's Disclaimer: This Author Accepted Manuscript is a PDF file of an unedited peer-reviewed manuscript that has been accepted for publication but has not been copyedited or corrected. The official version of record that is published in the journal is kept up to date and may therefore differ from this version.
References
*denotes special issue papers
- Balota DA, Yap MJ, Cortese MJ, Hutchison KA, Kessler B, Loftis B, Neely JH, Nelson DL, Simpson GB, & Treiman R (2007). The English Lexicon Project. Behavior Research Methods, 39, 445–459.
- Cho S-J, De Boeck P, Embretson S, & Rabe-Hesketh S (2014). Additive multilevel item structure models with random residuals: Item modeling for explanation and item generation. Psychometrika, 79, 84–104.
- Cho S-J, Partchev I, & De Boeck P (2012). Parameter estimation of multiple item response profile model. British Journal of Mathematical and Statistical Psychology, 65(3), 438–466.
- Compton DL, Appleton AC, & Hosp MK (2004). Exploring the relationship between text-leveling systems and reading accuracy and fluency in second-grade students who are average and poor decoders. Learning Disabilities Research & Practice, 19(3), 176–184.
- *Compton DL, Gilbert JK, Kearns DM, & Olson RK (2020). Using an item-specific predictor to test the dimensionality of the orthographic choice task. Annals of Dyslexia.
- Elleman AM, Steacy L, Olinghouse NG, & Compton DL (2017). Examining child by word characteristics in vocabulary learning of struggling readers. Scientific Studies of Reading, 21(2), 133–145.
- Gilbert JK, Compton DL, & Kearns DM (2011). Word and person effects on decoding accuracy: A new look at an old question. Journal of Educational Psychology, 103, 489–507.
- *Guan CQ, Fraundorf SH, & Perfetti CA (2020). Character and child factors contribute to character recognition development among good and poor Chinese readers from Grade 1 to 6. Annals of Dyslexia.
- Harm MW, & Seidenberg MS (2004). Computing the meanings of words in reading: Cooperative division of labor between visual and phonological processes. Psychological Review, 111(3), 662–720.
- Kim YS, Petscher Y, Foorman BR, & Zhou C (2010). The contributions of phonological awareness and letter-name knowledge to letter-sound acquisition—a cross-classified multilevel model approach. Journal of Educational Psychology, 102(2), 313.
- Lovett MW, Borden SL, DeLuca T, Lacerenza L, Benson NJ, & Brackstone D (1994). Testing the core deficits of developmental dyslexia: Evidence of transfer of learning after phonologically- and strategy-based reading training programs. Developmental Psychology, 30, 805–822.
- Miller AC, Davis N, Gilbert JK, Cho S-J, Toste JR, Street J, & Cutting LE (2014). Novel approaches to examine passage, student, and question effects on reading comprehension. Learning Disabilities Research & Practice, 29(1), 25–35.
- Nation K, & Castles A (2017). Putting the learning into orthographic learning. Theories of Reading Development, 148–168.
- Perfetti C (2007). Reading ability: Lexical quality to comprehension. Scientific Studies of Reading, 11(4), 357–383.
- *Schmalz X, Robidoux S, Castles A, & Marinus E (2020). Variations in the use of simple and context-sensitive grapheme-phoneme correspondences in English and German developing readers. Annals of Dyslexia.
- Share DL (1995). Phonological recoding and self-teaching: Sine qua non of reading acquisition. Cognition, 55(2), 151–218.
- Share DL, & Stanovich KE (1995). Cognitive processes in early reading development: A model of acquisition and individual differences. Issues in Education: Contributions From Educational Psychology, 1, 1–57.
- Steacy LM, Elleman AM, Lovett MW, & Compton DL (2016). Exploring differential effects across two decoding treatments on item-level transfer in children with significant word reading difficulties: A new approach for testing intervention elements. Scientific Studies of Reading, 20(4), 283–295.
- *Steacy LM, Fuchs D, Gilbert JK, Kearns DM, Elleman AM, & Edwards AA (2020). Sight word acquisition in first grade students at-risk for reading disabilities: An item-level exploration of the number of exposures required for mastery. Annals of Dyslexia.
- *Tibi S, Edwards AA, Schatschneider C, & Kirby JR (2020). Predicting Arabic word reading: A cross-classified generalized random-effects analysis showing the critical role of morphology. Annals of Dyslexia.
- Torgesen JK (2000). Individual differences in response to early interventions in reading: The lingering problem of treatment resisters. Learning Disabilities Research & Practice, 15(1), 55–64.
