Abstract
The processing of abbreviations in reading was examined with an eye movement experiment. Abbreviations were of two distinct types: Acronyms (abbreviations that can be read with the normal grapheme-phoneme correspondence rules, such as NASA) and initialisms (abbreviations in which the grapheme-phoneme correspondences are letter names, such as NCAA). Parafoveal and foveal processing of these abbreviations was assessed with the use of the boundary change paradigm (Rayner, 1975). Using this paradigm, previews of the abbreviations were either identical to the abbreviation (NASA or NCAA), orthographically legal (NUSO or NOBA), or illegal (NRSB or NRBA). The abbreviations were presented as capital letter strings within normal, predominantly lowercase sentences and also sentences in all capital letters such that the abbreviations would not be visually distinct. The results indicate that acronyms and initialisms undergo different processing during reading, and that readers can modulate their processing based on low-level visual cues (distinct capitalization) in parafoveal vision. In particular, readers may be biased to process capitalized letter strings as initialisms in parafoveal vision when the rest of the sentence is normal, lower case letters.
Reading is an incredibly complex task, despite appearing effortless to the literate population, who often forget the difficulty they had learning to read. Although reading involves the planning and execution of eye movements, grapheme-phoneme conversion, access of word meanings, syntactic parsing, and the construction of discourse representations, researchers have learned a lot about skilled reading (for a review see Rayner, Foorman, Perfetti, Pesetsky, & Seidenberg, 2001). However, many questions remain regarding reading. For instance, it is well established that processing both orthography and phonology is important for word identification. However, different languages are characterized by a tighter or looser correspondence between their text representation (orthography) and sound representation (phonology) and those with a tighter link lead to faster acquisition of reading and spelling skills (Thorstadt, 1991). Indeed, the effects of phonology in natural reading, which are quite strong in alphabetic languages, are argued to be comparatively small in Chinese (Feng, Miller, Shu & Zhang, 2001) in which the orthography does not always represent the phonology of the word. In a deep orthography like English, which has an extremely inconsistent correspondence between orthography and phonology (and therefore multiple mappings), to what extent does the orthographic appearance of a word affect the way it is phonologically coded? Abbreviated words are a good test case for this question, as they are typographically distinct in normal English text (i.e., they are presented in all capital letters) but consist of two seemingly disparate orthographic to phonological mapping schemes (i.e., with normal grapheme-phoneme correspondence rules as acronyms or with a series of letter names as initialisms; see discussion below).
Furthermore, abbreviations are important to study because their use is rapidly increasing and their processing is still poorly understood by researchers who study reading and word recognition.
Abbreviations in Language Use
Research on the processing of abbreviations has increased recently (Brysbaert, Speybroeck, & Dieter, 2009; Laszlo & Federmeier, 2007a, 2007b; McWilliam, Schepman, & Rodway, 2009; Perea, Acha, & Carreiras, 2009; Slattery, Pollatsek, & Rayner, 2006) largely due to the use of abbreviations in text messaging and online chatting. Abbreviations in informal communication have attracted much negative media attention, laced with apparent fear that these ‘textisms’ will negatively impact literacy rates (see Thurlow, 2006; Thurlow & Bell, 2009). While this fear is sometimes supported by scientific research, the converse has also been found: a positive correlation between the ability to create text abbreviations and spelling ability (Plester, Wood, & Bell, 2008), as well as reading ability (Plester, Wood, & Joshi, 2009). Thus, disagreement over the value as opposed to the cost of texting on literacy has created a controversy in the popular media.
The lexical representation of abbreviations
Textisms are not the only abbreviations encountered during reading. In fact, the increased media attention to textisms only highlights the fact that we know little about the processing of formal abbreviations, even though they have been used for decades to provide a savings in number of letters or words used. For example, the use of abbreviations is pervasive in government (e.g., CIA, FBI), the military (e.g., ROTC, AWOL), and science (e.g., DNA, STM). Thus, it is important to start addressing the issue of how abbreviations are recognized as they are commonly used (and their frequency of use is increasing).
Several studies show that common abbreviations are similar to regular words (i.e., have become lexicalized); compared to meaningless letter strings, abbreviations show lower recognition thresholds with brief presentation durations (Gibson, Bishop, Schiff, & Smith, 1964), have faster same-different reaction times (Carr, Posner, Pollatsek, & Snyder, 1979; Henderson, 1974), are subject to greater feature integration errors (Prinzmetal & Wright, 1984), show increased letter identification accuracy in a Reicher-Wheeler task (Besner, Davelaar, Alcott, & Parry, 1984; Laszlo & Federmeier, 2007a; Noice & Hock, 1987), and benefit from associatively related primes in masked priming tasks (Brysbaert et al., 2009). In addition there is also some recent work with event-related potentials (ERPs) demonstrating an N400 repetition effect for familiar abbreviations, similar to that for words (Laszlo & Federmeier, 2007b), but no such effect for illegal letter strings that were non-existent or unfamiliar abbreviations. While these studies may lead one to conclude that abbreviations are processed in the same way as words, they largely ignore an important distinction between two types of abbreviations (see discussion of acronyms and initialisms below).
Phonological Processing in Natural Reading
It is well established that many linguistic properties of a word such as its frequency (Inhoff & Rayner, 1986; Rayner & Duffy, 1986), orthography (Sereno & Rayner, 1992), and phonology (Lee, Binder, Kim, Pollatsek, & Rayner, 1999) affect the ease and speed with which it is identified. Not only is phonological information accessed early during word identification in reading (Lee et al., 1999) but there is now an extensive literature that indicates that phonological processing begins before a word is even fixated (Ashby & Rayner, 2004; Ashby, Treiman, Kessler, & Rayner, 2006; Chace, Rayner, & Well, 2005; Liu, Inhoff, Ye, & Wu, 2002; Miellet & Sparrow, 2004; Pollatsek, Lesch, Morris, & Rayner, 1992; Rayner, Sereno, Lesch, & Pollatsek, 1995; Tsai, Lee, Tzeng, Hung, & Yen, 2004). Phonological preview benefits have been demonstrated with heterographic homophones (Pollatsek et al., 1992) and have been reported across different languages, such as French (Miellet & Sparrow, 2004) and Chinese (Liu et al., 2002; Tsai et al., 2004). Preview benefits are also provided by pseudohomophones, which share phonology with the target, but do not have lexical representations themselves (Ashby et al., 2006).
Not only is phonology accessed early, but the nature of the phonology affects the processing of a word. For instance, words that employ more regular or consistent orthography-to-phonology mappings are processed more easily than words for which the mapping is more opaque in lexical decision and naming tasks (Bauer & Stanovich, 1980; Glushko, 1979; Seidenberg, Waters, Barnes & Tanenhaus, 1984; Waters & Seidenberg, 1985) as well as during silent reading (Sereno & Rayner, 2000). In fact, when the orthography of two different words is equivalent (e.g., homographs) but the phonology is different (e.g., heterophones) readers incur a large processing cost. Folk and Morris (1995) found a significant processing cost for homographic heterophones but no processing cost for heterographic homophones, providing evidence that the system is quite sensitive to ambiguity in phonological decoding. Since phonological information is processed early in reading, how are abbreviations treated given that there is even greater disparity in their orthography-to-phonology mappings than there is with irregular words? How does one choose which system to use: canonical grapheme-phoneme conversion (as with acronyms) or letter-by-letter pronunciation (as with initialisms)?
The Distinction between Acronyms and Initialisms
As noted above, abbreviations can be split into two distinct groups based on the way they are phonologically decoded: acronyms, which are pronounced with the same grapheme-phoneme correspondence rules as words (e.g., NASA) and initialisms, which are pronounced as a sequence of letter names (e.g., NCAA). Both types are referred to as acronyms in common vernacular, but the acronym/initialism distinction is likely an important one for distinguishing lexicalized representations and determining the time course of phonological processing. Perhaps, because they can be phonologically decoded in the same way, acronyms and words are read similarly. However, because initialisms cannot utilize the same grapheme-phoneme correspondence rules as words, they may be represented and read differently. Adding to the ambiguity of representation of abbreviations, there is no salient and consistent cue in the typography that indicates which phonological coding strategy should be used; both acronyms and initialisms are printed in all capital letters. Therefore, determining whether the letter ‘N’ should be pronounced as “n” as in the acronym ‘NASA’ or as “en” in the initialism ‘NCAA’ cannot be done based solely on the appearance of the letter string in all capital letters.
Slattery et al. (2006) indicated that the nature of phonological codes for familiar initialisms is one consisting of letter names. They utilized the indefinite articles ‘a’ and ‘an,’ which in English are used prior to consonant-beginning and vowel-beginning words, respectively. In initialisms, an orthographic consonant such as ‘N’ will be pronounced as the vowel phoneme ‘en,’ creating a conflict between whether “a” or “an” should precede it. Initial eye fixations on these abbreviations were shorter when the preceding article agreed with the pronunciation corresponding to the name of the initial letter.
Since Slattery et al. (2006) only used initialisms there is still an open question as to whether they would have found different effects had they compared phonological cuing of the articles ‘a’ and ‘an’ between initialisms and acronyms. In the present study we assess the natural reading of both initialisms and acronyms to determine whether they are processed similarly.
No current models of word recognition seriously address the issue of abbreviations. According to dual route models of word recognition like the DRC (Coltheart, Rastle, Perry, Langdon & Ziegler, 2001), legal, word-like strings like acronyms can be processed through both the grapheme-phoneme route and the direct visual route while illegal letter strings like initialisms can only be processed through the latter since grapheme-phoneme conversions would yield unpronounceable output¹. However, this direct visual route would only be successful for familiar abbreviations. This would be problematic as readers encounter novel abbreviations quite frequently. Consider, for instance, a novice to the field of cognitive psychology who is faced with abbreviations such as DRC, ERP, and STM.
How then, does a reader decide whether to phonologically decode an abbreviation as an acronym or initialism? Although uncertainty related to processing abbreviations may hurt readers, the orthography itself may provide a cue to the coding of these letter strings. Upon seeing a string of letters in all capital letters the reader may be cued that the proper correspondence rules use letter names. Importantly, while this would only produce the correct output for initialisms, as acronyms follow normal grapheme-phoneme correspondence rules, it will always produce some sort of usable output. On the other hand, attempting to decode capitalized strings as acronyms will always produce correct output for acronyms, but rarely produce useful output for initialisms (for example, try pronouncing FBI as a word). Therefore, it may be more prudent for the system to have a bias to process capitalized strings as initialisms. Another possibility is that readers use parafoveal information about the orthography of the word (i.e., the legality of the letter string) as a cue to whether it should be decoded as an acronym or initialism.
In the present study, we used the boundary paradigm (Rayner, 1975) to test the possibility that readers use typographic cues and/or orthographic legality to infer how to phonologically decode acronyms and initialisms. This paradigm has been very useful in determining the types of information that are processed in the parafovea (see Rayner, 1998, 2009 for reviews). Readers' eye movements were monitored while they read sentences containing either an acronym or initialism. Three types of previews were used: identical (NASA), orthographically legal (NUSO), or illegal (NRSB). Both the legal and illegal previews had the same amount of orthographic overlap with the target abbreviation and differed only in whether or not they could be pronounced as a word-like unit. The preview characters were changed to their respective target abbreviations (in this case, NASA) once readers' eyes crossed an invisible boundary located just before the abbreviation thus enabling us to examine both the foveal and parafoveal processing of such letter strings. Finally, we also manipulated the typographical cue that the upcoming string was an abbreviation as half the subjects read the capitalized abbreviations in normal, predominantly lower case sentences while the other half read the capitalized abbreviations in all capital sentences.
If readers can (and always do) utilize the legality of the letter string in parafoveal vision to guide their phonological processing, we should find a cross-over interaction between the type of abbreviation (acronym vs. initialism) and the legality of the preview regardless of the sentence presentation condition (lower case vs. capital). That is, the previews consisting of illegal sequences of letters would cue the system to process the letter strings as a sequence of letter names, which would be beneficial for initialism targets but not acronym targets. Conversely, previews consisting of a legal sequence of letters would cue the system to process the letter strings similarly to words, which would be beneficial for acronym targets but not initialism targets. However, if readers default to processing distinct capitalized strings as initialisms, we would expect a different type of interaction: no effect of legality in the lower case sentence condition, but a significant interaction between preview legality and abbreviation type in the capital sentence condition. That is, assuming this type of default initialism processing, both legal and illegal previews should be treated the same during parafoveal processing under lower case sentence conditions; there would be no benefit to processing a target if the preview were illegal compared to legal. This default strategy would not be useful, however, if the entire sentence were in all capital letters, making the abbreviation indistinguishable from other words in the sentence. Therefore, if readers use the distinct capitalization in parafoveal vision of abbreviations as a cue to alter their phonological processing we should see interactions between the legality of the preview and the case in which the sentences were presented.
Method
Subjects
Seventy-eight undergraduate students from the University of California, San Diego received course credit for participation. They were all native English speakers, had normal or corrected vision, and were naïve regarding the purpose of the experiment.
Apparatus
Eye movements were recorded using an SR Eyelink 1000 eyetracker, sampling at 2000 Hz. Viewing was binocular, but only the right eye was recorded. Subjects read sentences on an Iiyama Vision Master Pro 454 video monitor refreshing at 150 Hz. The display change occurred within 7 ms after the reader's eyes crossed the boundary location; thus the display change typically occurred during the saccade (when vision is suppressed). The subjects' eyes were 60 cm from the video monitor and 2.8 letters equaled one degree of visual angle.
Materials
Target stimuli consisted of 27 acronyms and 27 initialisms familiar to the UCSD undergraduate population (see Appendix A)³. The pronunciation of acronyms followed grapheme-phoneme correspondences of words (NASA) and the pronunciation of initialisms was as a sequence of letter names (NCAA). Acronyms were 3-6 letters (mean 4.03) and had a mean frequency of 7.51 occurrences per million in COCA (Corpus of Contemporary American English; Davies, 2008). Initialisms were 3-4 letters (mean 3.33), with a mean COCA frequency of 7.77 occurrences per million. Since many of our abbreviations had frequencies of zero, we collected data from a separate group of subjects to ascertain their familiarity with these abbreviations, which was very high (M = 5.6, SD = .98) on a 7 point scale where 1 was not familiar and 7 was very familiar. Additionally, these subjects pronounced the abbreviations as we anticipated 97% of the time, indicating that there was little ambiguity as to how they should be pronounced. Cloze probability ratings were collected from another group of 10 subjects; none of these subjects predicted the target abbreviation in any of the sentences.
In the eye movement experiment, abbreviations were embedded in sentences and were preceded and followed by a minimum of 3 words. Each abbreviation appeared once, with one of three previews: identical, legal, and illegal (see Table 1 and Appendix A). For each target, the same letters were manipulated to generate legal and illegal previews, with the first letter never being changed. The average frequency of the legal and illegal previews was 0.24 and 0.31 occurrences per million respectively, and did not significantly differ, t < 1. Three lists of experimental items were created using Latin square counterbalancing such that each list contained an equal number of items from each condition and no item appeared more than once in any list. Each list was presented along with 63 filler sentences in a new random order to one third of the subjects. All sentences were presented on a single line in 14 point Courier New font. For half of the subjects, the sentences were presented normally with abbreviations in all capitals surrounded by predominantly lower case characters. For the other half of the subjects, the sentences were presented entirely in capital letters⁴. For 50 of the sentences a comprehension question followed with two answer choices.
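The Latin square assignment described above can be sketched as follows. This is only an illustration under stated assumptions, not the script used in the experiment: the helper name `build_lists`, the use of item indices in place of the actual abbreviations, and the particular rotation rule are all hypothetical.

```python
PREVIEWS = ["identical", "legal", "illegal"]

def build_lists(n_items, conditions=PREVIEWS):
    """Rotate preview conditions across items (hypothetical helper).

    List k assigns item i the condition at position (i + k) mod 3, so
    every item appears exactly once per list and, across the three
    lists, appears once in every preview condition.
    """
    n_cond = len(conditions)
    return [
        {item: conditions[(item + k) % n_cond] for item in range(n_items)}
        for k in range(n_cond)
    ]

# 27 acronyms + 27 initialisms = 54 experimental items
lists = build_lists(54)
```

With 54 items and three preview conditions, each list contains 18 items per condition; fillers and randomization of presentation order would be added separately.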
Table 1.
Target | Sentence Case | Preview | Example Sentence |
---|---|---|---|
Acronym | Lower Case | Identical | Metal debris damaged the expensive NASA probe while it was in orbit. |
Acronym | Lower Case | Legal | Metal debris damaged the expensive NUSO probe while it was in orbit. |
Acronym | Lower Case | Illegal | Metal debris damaged the expensive NRSB probe while it was in orbit. |
Acronym | All Capital | Identical | METAL DEBRIS DAMAGED THE EXPENSIVE NASA PROBE WHILE IT WAS IN ORBIT. |
Acronym | All Capital | Legal | METAL DEBRIS DAMAGED THE EXPENSIVE NUSO PROBE WHILE IT WAS IN ORBIT. |
Acronym | All Capital | Illegal | METAL DEBRIS DAMAGED THE EXPENSIVE NRSB PROBE WHILE IT WAS IN ORBIT. |
Initialism | Lower Case | Identical | The floor was slippery at the crowded NCAA championship so a time out was taken. |
Initialism | Lower Case | Legal | The floor was slippery at the crowded NOBA championship so a time out was taken. |
Initialism | Lower Case | Illegal | The floor was slippery at the crowded NRBA championship so a time out was taken. |
Initialism | All Capital | Identical | THE FLOOR WAS SLIPPERY AT THE CROWDED NCAA CHAMPIONSHIP SO A TIME OUT WAS TAKEN. |
Initialism | All Capital | Legal | THE FLOOR WAS SLIPPERY AT THE CROWDED NOBA CHAMPIONSHIP SO A TIME OUT WAS TAKEN. |
Initialism | All Capital | Illegal | THE FLOOR WAS SLIPPERY AT THE CROWDED NRBA CHAMPIONSHIP SO A TIME OUT WAS TAKEN. |
Procedure
Subjects were familiarized with the experimental equipment and given verbal task instructions (which control pad buttons to press, to blink before and after rather than during reading, and to read for comprehension). After a 3-point calibration was performed, subjects completed six practice trials (none of which contained abbreviations); three had comprehension questions. On each trial, subjects fixated a small box on the left of the screen; once a stable fixation was detected the box disappeared and the sentence was presented.
Previews were replaced by the target abbreviation once the subject's saccade crossed an invisible boundary located just in front of the space before the abbreviation (see Table 1). The subject pressed a button on the control pad when reading was completed and responded to the comprehension questions by pressing a button on the control pad corresponding to their answer choice.
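The gaze-contingent logic of the boundary change can be illustrated with a short sketch. It assumes left-to-right reading and a stream of horizontal gaze samples in arbitrary screen units; the function name `display_change_sample` and the sample values are hypothetical, and real experiment software would also handle timing against the monitor refresh.

```python
def display_change_sample(gaze_x_samples, boundary_x):
    """Return the index of the first gaze sample past the boundary.

    Hypothetical helper: the preview would be replaced by the target
    at this sample. Returns None if the boundary is never crossed.
    """
    for index, x in enumerate(gaze_x_samples):
        if x > boundary_x:
            return index
    return None

# Illustrative samples only: the eye moves rightward across a boundary at x = 30
change_at = display_change_sample([10, 20, 35, 50], boundary_x=30)
```

Because the change is triggered by the first sample beyond the boundary, it normally falls within the saccade that carries the eyes onto the target.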
Results
Fixations shorter than 80 ms within 1 character of a previous or subsequent fixation were combined with that fixation. All other fixations less than 80 ms were eliminated (2.1% of all fixations). Trials in which there was a blink or track loss on the target word or during an immediately adjacent fixation were removed prior to analysis (4.4% of trials), as were trials in which the display change was completed late or was triggered early by a saccade that ended left of the boundary⁵ (7.5% of trials). The remaining data were evenly distributed over the conditions, p > .4. Accuracy on the comprehension questions was 95.1% on average, and did not significantly differ across conditions, p > .2.
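The fixation cleaning rule can be sketched as below. This is one possible reading of the rule, not the authors' procedure: fixations are assumed to be `(position_in_characters, duration_ms)` pairs in temporal order, the helper name `clean_fixations` is hypothetical, and the choice to try the preceding fixation before the following one is an assumption the text does not specify.

```python
def clean_fixations(fixations, min_dur=80, max_dist=1.0):
    """Merge or drop short fixations (hypothetical helper).

    A fixation shorter than min_dur is added to a neighbouring fixation
    within max_dist characters (preceding neighbour first, by assumption);
    a short fixation with no such neighbour is dropped.
    """
    fixes = [[pos, dur] for pos, dur in fixations]  # mutable copies
    kept = []
    for i, (pos, dur) in enumerate(fixes):
        if dur >= min_dur:
            kept.append([pos, dur])
        elif kept and abs(kept[-1][0] - pos) <= max_dist:
            kept[-1][1] += dur       # combine with the previous fixation
        elif i + 1 < len(fixes) and abs(fixes[i + 1][0] - pos) <= max_dist:
            fixes[i + 1][1] += dur   # combine with the next fixation
        # otherwise the short fixation is eliminated
    return [tuple(f) for f in kept]

# Illustrative data only
cleaned = clean_fixations([(10, 200), (10.5, 50), (15, 60), (20, 40), (20.5, 250)])
```

In this example the 50 ms fixation is absorbed by the one before it, the isolated 60 ms fixation is eliminated, and the 40 ms fixation is absorbed by the one after it.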
Target regions were defined as the abbreviation and the space before and after it. We analyzed three fixation duration measures (Rayner, 1998): the duration of the fixation immediately prior to the display change, as well as the first fixation duration (the first fixation on the abbreviation, regardless of how many total fixations there were) and gaze duration (the sum of all first pass fixations on the abbreviation before leaving it) on the target. We also analyzed skipping rate (the percentage of trials in which the abbreviation received no first pass fixation). The means for these measures over the various experimental conditions appear in Table 2. The current experimental design was not intended to directly assess the main effect of abbreviation type. This was due to the uncontrolled between item nature of the manipulation, which was necessary given the scarcity of highly familiar acronyms. However, for completeness, we will report this main effect despite the fact that we are more concerned with the interactions involving this variable.
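The measures defined above can be made concrete with a small sketch. It assumes fixations are `(position_in_characters, duration_ms)` pairs in temporal order and that the target region is given by its first and last character positions; the function name `first_pass_measures` and the example values are hypothetical.

```python
def first_pass_measures(fixations, region_start, region_end):
    """Compute first fixation duration, gaze duration, and skipping.

    Hypothetical helper: first pass ends when the eyes leave the region,
    and a trial counts as a skip if the eyes land beyond the region
    before fixating it.
    """
    first_pass = []
    for pos, dur in fixations:
        if region_start <= pos <= region_end:
            first_pass.append(dur)
        elif first_pass:
            break  # the eyes left the region, so first pass is over
        elif pos > region_end:
            break  # the region was passed without a first pass fixation
    if not first_pass:
        return {"skipped": True, "first_fixation": None, "gaze": None}
    return {
        "skipped": False,
        "first_fixation": first_pass[0],  # first fixation only
        "gaze": sum(first_pass),          # all first pass fixations
    }

# Illustrative data only: target region spans characters 11 to 15
fixated = first_pass_measures([(5, 200), (12, 250), (13, 100), (20, 220), (12, 300)], 11, 15)
skipped = first_pass_measures([(5, 200), (20, 220), (12, 300)], 11, 15)
```

In the first trial the later regression back to the target does not contribute to gaze duration; in the second trial that regression does not undo the skip.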
Table 2.
Measure | Sentence Case | Acronym: Identical | Acronym: Legal | Acronym: Illegal | Initialism: Identical | Initialism: Legal | Initialism: Illegal |
---|---|---|---|---|---|---|---|
Pre-change Fixation Location (characters) | Lower Case | 4.5 (2.6) | 4.6 (2.5) | 4.6 (2.5) | 4.6 (2.8) | 4.5 (2.8) | 4.3 (2.9) |
Pre-change Fixation Location (characters) | All Capital | 4.7 (2.5) | 4.5 (2.9) | 4.8 (3.1) | 4.4 (2.8) | 4.5 (2.7) | 4.6 (2.6) |
Pre-change Fixation Duration (ms) | Lower Case | 223 (88) | 236 (98) | 227 (82) | 226 (86) | 233 (89) | 234 (87) |
Pre-change Fixation Duration (ms) | All Capital | 231 (85) | 221 (82) | 221 (78) | 226 (89) | 224 (78) | 226 (90) |
First Fixation Duration (ms) | Lower Case | 257 (94) | 266 (97) | 272 (92) | 267 (101) | 283 (122) | 277 (120) |
First Fixation Duration (ms) | All Capital | 261 (93) | 296 (113) | 291 (104) | 281 (118) | 288 (120) | 275 (110) |
Gaze Duration (ms) | Lower Case | 306 (130) | 325 (141) | 321 (114) | 311 (146) | 337 (179) | 340 (154) |
Gaze Duration (ms) | All Capital | 330 (158) | 349 (135) | 369 (149) | 341 (168) | 370 (178) | 346 (164) |
Skipping Rate (%) | Lower Case | 6.9 (25.3) | 5.3 (22.5) | 5.4 (22.6) | 13.6 (34.3) | 12.3 (32.8) | 11.9 (32.4) |
Skipping Rate (%) | All Capital | 9.4 (29.2) | 10.1 (30.2) | 8.1 (27.3) | 12.1 (32.6) | 15.0 (35.7) | 13.2 (33.9) |
Note: Pre-change fixation location is in characters relative to the boundary; all duration measures are in milliseconds; skipping rate is a percentage. Standard deviations are reported in parentheses.
Each of the fixation duration measures was first log normalized then analyzed via linear mixed models (LMM) using the lme4 package of the R statistical software (Bates & Maechler, 2010; R Development Core Team, 2010). These linear mixed models predicted the duration measures from the crossing of abbreviation type (acronym vs. initialism), sentence case (normal vs. upper), and preview type (identical, legal, illegal), plus the independent influence of abbreviation length, as fixed effects, and subjects and items as crossed random effects. Target type, sentence case, and target length were all centered. We tested two specific orthogonal contrasts involving the preview type. The first contrast (identity) tested the benefit of having an identical preview compared to the average of the two invalid previews (legal and illegal). The second contrast (legality) directly compared the legal to the illegal preview condition. We report coefficient and standard error estimates as well as p-values estimated from Markov chain Monte Carlo (MCMC) simulations (see Baayen, 2008 for a discussion as to why MCMC methods are preferred to estimate p-values for this type of analysis).
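The two preview contrasts can be written out as numeric codes to show that they are orthogonal. The specific weights below are an assumption (the text does not report the coding used); they are one conventional choice in which a positive identity coefficient means longer durations after invalid previews. The names `CONTRASTS`, `dot`, and `encode` are hypothetical.

```python
import math

LEVELS = ["identical", "legal", "illegal"]

# Assumed contrast weights, not taken from the source
CONTRASTS = {
    # identical preview vs. the average of the two invalid previews
    "identity": {"identical": -2 / 3, "legal": 1 / 3, "illegal": 1 / 3},
    # legal preview vs. illegal preview
    "legality": {"identical": 0.0, "legal": -0.5, "illegal": 0.5},
}

def dot(a, b):
    """Sum of products of two contrasts' weights over the preview levels."""
    return sum(a[level] * b[level] for level in LEVELS)

def encode(preview, duration_ms):
    """Return the log duration and contrast codes for one observation."""
    return {
        "log_duration": math.log(duration_ms),
        "identity": CONTRASTS["identity"][preview],
        "legality": CONTRASTS["legality"][preview],
    }

# With equal cell sizes, a zero dot product means the contrasts are orthogonal
orthogonality = dot(CONTRASTS["identity"], CONTRASTS["legality"])
row = encode("legal", 266)
```

Each contrast sums to zero across the three preview levels, so each tests a difference between conditions rather than the overall mean.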
Skipping represents binary outcome data and therefore we used multi-level logistic regression for this measure; the coefficients we report are changes in the log odds of skipping, with the p-values derived from z distributions rather than MCMC. Finally, the model for the skipping data included one additional variable not included in the analyses of the fixation duration measures: the location of the fixation prior to the display change in characters (centered). This variable is sometimes referred to as launch site and has been shown to have a strong influence on skipping behavior; the closer the launch site is to the target, the more likely the target will be skipped. We will present our analyses in the order that parallels the time course of processing during reading. For completeness the different measures are reported, but the most important measure is gaze duration since it is a reasonable reflection of the amount of time needed to process the target abbreviations (Rayner, 1998).
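Coefficients from a logistic regression are on the log-odds scale, so their size is easier to judge after conversion to probabilities. The sketch below shows that conversion; the helper names and the baseline and coefficient values are hypothetical and are not estimates from this experiment.

```python
import math

def log_odds_to_probability(log_odds):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def shifted_probability(baseline_p, coefficient):
    """Probability after adding a coefficient to the baseline log odds."""
    baseline_log_odds = math.log(baseline_p / (1.0 - baseline_p))
    return log_odds_to_probability(baseline_log_odds + coefficient)

# Hypothetical values: a 10% baseline skipping rate and a coefficient of 0.5
new_rate = shifted_probability(0.10, 0.5)
```

The same coefficient produces a different change in percentage points depending on the baseline rate, which is why effects on skipping are reported as coefficients rather than as raw differences.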
Fixation Prior to Crossing the Boundary
While it is widely agreed that processing of upcoming words begins prior to these words being fixated (Rayner, 1998, 2009), there is still considerable debate regarding the influence that properties of upcoming words have on the current fixation duration (Rayner, 2009). Such influences are often termed parafoveal on foveal effects or successor effects. In the present experiment, we assessed such parafoveal on foveal effects by analyzing the fixation duration immediately prior to crossing the display change boundary. For these fixations, there was a significant interaction between the sentence case manipulation and the identity preview contrast, b = -.049, SE = .023, p < .05. This interaction suggests that readers treated the abbreviation previews differently when they were typographically distinct from the surrounding sentence. None of the other experimental factors or the length of the target abbreviations significantly influenced these fixation durations, all p's > 0.1.
Skipping
Analyses of the rate of skipping the target revealed many standard effects on skipping, such as target length, b = -1.204, SE = .177, p < .001, and location of the previous fixation, b = -.229, SE = .028, p < .001. Additionally, there was more skipping of the target abbreviation in the all capital condition than the lower case condition, b = .789, SE = .378, p < .05, suggesting that, when the target was not typographically distinct from the other words in the sentence, readers were more likely to skip it. There was no main effect of abbreviation type on skipping likelihood after accounting for target length, b = .298, SE = .259, p > .20. However, there was a significant interaction between abbreviation type and sentence case, b = -.507, SE = .255, p < .05, such that the difference in skipping likelihood between abbreviation types was smaller when the sentences were presented in all capital letters. The significant main effect of sentence case, as well as its interaction with abbreviation type, suggests that readers process these letter strings differently when they are typographically distinct; these findings are consistent with the notion of a processing bias under normal presentation conditions, triggered during parafoveal processing.
First fixation durations
First fixation duration on the target abbreviation (which had to occur after the display change) was strongly influenced by the experimental manipulations. These fixation durations were strongly affected by the length of the abbreviation, b = .030, SE = .012, p < .05, with longer abbreviations yielding longer first fixation durations. First fixation durations were also longer on initialism targets than on acronym targets, b = .042, SE = .020, p < .05, and were marginally longer in the all capital sentences than in lower case sentences, b = .062, SE = .036, p = .068. There was no significant main effect of either the identity contrast, b = .026, SE = .016, p = .105, or the legality contrast, b = -.011, SE = .014, p = .448. However, the identity contrast did significantly interact with sentence case, b = .049, SE = .023, p < .05, with the size of the identity preview benefit being greater in the all capital sentences. Additionally, the three-way interaction between the identity contrast, sentence case, and abbreviation type was significant, b = -.069, SE = .033, p < .05. In order to better explore this three-way interaction we conducted two additional LMM analyses: one for each sentence case.
In the lower case sentence condition where the abbreviations were typographically distinct, first fixation durations were longer on initialism targets than on acronym targets, b = .038, SE = .020, p < .05. However, there were no other significant effects, p's > .09. The data pattern was very different in the all capital sentences where the abbreviations were not typographically distinct. Here, first fixation durations did not differ between acronym and initialism targets, b = .007, SE = .023, p > .5. Additionally, the identity contrast was significant, b = .074, SE = .017, p < .001, indicating that valid previews yielded shorter fixations than invalid previews. Additionally, the interaction between the identity contrast and abbreviation type was also significant, b = -.077, SE = .025, p < .005. This interaction was driven by a large identity benefit for acronym targets with virtually no identity benefit for initialism targets.
Gaze Duration
As mentioned before, the most important measure to assess processing of the target abbreviations is gaze duration (the sum of all first pass fixations on the abbreviation prior to fixating elsewhere). Gaze durations increased with target length, b = .098, SE = .046, p < .001. After accounting for length effects, gaze durations were significantly longer on initialisms than on acronyms, b = .078, SE = .026, p < .005. Gaze durations were also significantly longer in the all capital sentence condition than in the lower case condition, b = .095, SE = .047, p < .05. Additionally, the identity contrast was significant, b = .036, SE = .017, p < .01, indicating a preview benefit for the identical condition. Finally, there was a significant three-way interaction between target type, sentence case, and the legality contrast, b = .073, SE = .031, p < .05. Again, in order to better explore the nature of this interaction we conducted two additional LMMs, one for each sentence case, and report only the effects of interest.
In the lower case sentence condition, there was a significant effect of the identity contrast, b = .035, SE = .016, p < .05, indicating a preview benefit for the identical condition. However, there was neither a main effect of the legality contrast nor any interactions p's > 0.4. The lack of a main effect of the legality contrast or an interaction with target type indicates that, when these letter strings were typographically distinct they were processed similarly, based on parafoveal information.
In the all capital sentence condition the identity contrast was significant, b = .062, SE = .018, p < .001, indicating a preview benefit for the identical condition. Unlike in the lower case sentence condition, there was also a significant interaction between target type and the legality contrast, b = .056, SE = .023, p < .05. This interaction indicates that for acronyms, legal previews resulted in shorter gaze durations than illegal previews but for initialisms, illegal previews resulted in shorter gaze durations. Therefore, when orthographically indistinguishable from other words, readers decode these letter strings as they do words and use the legality of the upcoming letter string to guide orthography-to-phonology mappings.
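The two fixation measures analyzed above can be illustrated with a minimal sketch that derives first fixation duration and gaze duration from an ordered fixation record. The data layout (word index paired with a duration in ms) and the function name are hypothetical, and the sketch does not model the display change or the exclusion criteria applied in the experiment.

```python
from typing import List, Optional, Tuple

# Hypothetical record format: (index of the fixated word, duration in ms),
# listed in the order in which the fixations occurred.
Fixation = Tuple[int, float]


def first_pass_measures(
    fixations: List[Fixation], target: int
) -> Optional[Tuple[float, float]]:
    """Return (first fixation duration, gaze duration) for the target word.

    Gaze duration is the sum of all first pass fixations on the target
    before the eyes move to any other word. Returns None when the target
    is skipped on first pass (a later word is fixated first).
    """
    first_pass = []
    for word, duration in fixations:
        if word == target:
            first_pass.append(duration)
        elif first_pass:
            break  # the eyes left the target, so first pass is over
        elif word > target:
            return None  # a later word was fixated first: target skipped
    if not first_pass:
        return None
    return first_pass[0], sum(first_pass)


# Made-up trial: the target (word 2) receives two first pass fixations
# and one later refixation that does not count toward gaze duration.
trial = [(0, 210), (1, 180), (2, 230), (2, 150), (3, 200), (2, 190)]
measures = first_pass_measures(trial, target=2)
```

In this made-up trial the later return to the target is excluded from both measures, which is what distinguishes gaze duration from total viewing time.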
General Discussion
The current experiment indicates that readers can use typographical information about letter strings in parafoveal vision to alter their processing of those strings when they are ultimately fixated. Effects of this nature can be seen as early as the fixation prior to crossing the display change boundary where there was a significant interaction between sentence case and preview identity; these fixations were generally longer for invalid previews in lower case sentences but shorter for invalid previews in all capital sentences. Additionally, the skipping rate of the target abbreviation was also significantly impacted by the sentence case manipulation. First, readers were more likely, overall, to skip an abbreviation when it was embedded in an all capital sentence than when it stood out in a lower case sentence. However, there was also a significant interaction between sentence case and target type, with skipping rates for the two abbreviation types being more similar in the all capital sentences. All of these data point to the use of typographical information by readers to modulate processing of upcoming letter strings.
These results further highlight the delicate interplay between bottom-up visual cues and higher-level word processing. When readers were provided with a parafoveal cue that an upcoming letter string was an abbreviation, they were able to qualitatively alter their processing of this string. Under these conditions, readers obtained no greater benefit from legal than from illegal previews even though the illegal previews could not be successfully processed with normal GPC rules. One possible reason for this apparent insensitivity to preview legality may be the use of biases in parafoveal processing that occur when the abbreviations stand out, as they did in the lower case sentence condition. That is, readers may have been biased to process all capitalized letter strings as initialisms. The existence of such a parafoveal initialism bias is intriguing in that such biases usually develop as a means to save time or resources. One possible advantage to such a bias is that it will always produce a coherent phonological representation, as all letter strings can be pronounced using letter names (which would be particularly advantageous in cases where the abbreviation is novel to the reader). Since both the legal and illegal previews differed from the target abbreviation to the same extent, the same amount of reprocessing would be required for each. However, when the sentences were presented in all capitals, there would be no cue to bias parafoveal processing and instead readers used the legality of the upcoming letter string to guide phonological processing.
Acronyms and initialisms have markedly different grapheme-phoneme correspondences and the interaction between preview legality and target type in the all capital sentence condition highlights the importance of the distinction between them. It also strengthens the case for the use of an initialism bias in the normal case sentences as it indicates that readers are capable of using the orthographic legality of previews but don't always do so. Usually, acronyms such as NASA are comprised of orthographically legal letter sequences while initialisms like NCAA contain illegal letter sequences. However, many initialisms, such as IRS, are legal strings that could be pronounced with normal grapheme-phoneme correspondence rules (“ers”) but are pronounced as a sequence of letter names (“I-R-S”). Furthermore, rare abbreviations such as NHTSA (National Highway Traffic Safety Administration) are comprised of illegal letter sequences but are commonly pronounced as words (“nitsa”) by those who are familiar with the abbreviation.
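The point that a letter-name strategy yields a pronunciation for any capitalized string, whether or not the string is orthographically legal, can be illustrated with a minimal sketch. The informal respellings of the English letter names and the function names are assumptions made for illustration; they are not phonological transcriptions and are not part of the experiment.

```python
# Informal respellings of English letter names (illustrative assumptions,
# not phonetic transcriptions).
LETTER_NAMES = {
    "A": "ay", "B": "bee", "C": "see", "D": "dee", "E": "ee", "F": "ef",
    "G": "jee", "H": "aitch", "I": "eye", "J": "jay", "K": "kay", "L": "el",
    "M": "em", "N": "en", "O": "oh", "P": "pee", "Q": "cue", "R": "ar",
    "S": "ess", "T": "tee", "U": "you", "V": "vee", "W": "double-you",
    "X": "ex", "Y": "why", "Z": "zee",
}


def spell_out(abbreviation: str) -> str:
    """Pronounce a letter string as a sequence of letter names.

    This works for any string of the letters A-Z, including strings that
    could not be read with normal grapheme-phoneme correspondence rules.
    Other characters raise a KeyError.
    """
    return " ".join(LETTER_NAMES[ch] for ch in abbreviation.upper())


def letter_name_syllables(abbreviation: str) -> int:
    """Count syllables when reading by letter names (W has three, others one)."""
    return sum(3 if ch == "W" else 1 for ch in abbreviation.upper())
```

For example, `spell_out("NCAA")` and `spell_out("NRBA")` both succeed, whereas only some strings (such as NASA) can also be read as words, and `letter_name_syllables("WWW")` shows why a letter-name reading can be phonologically longer than the phrase it abbreviates.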
At some level, all abbreviations abstract the meaning of text in order to reduce orthographic and/or phonological length (see Footnote 6). For instance, many people know that BMW is a car brand but few realize that the name stands for Bavarian Motor Works. Perhaps the processing of such lexicalized abbreviations would not be influenced by the lexical/semantic properties of their parent words. However, to what extent do these properties influence the processing of recently learned abbreviations? For instance, in languages that mark grammatical gender, such as Italian and Spanish, abbreviations are generally marked with the same grammatical gender as their parent word, even though the gender itself may not be evident from the letters comprising it (Lepschy & Lepschy, 1991). Is this property of the abbreviation learned by rote memorization or is it abstracted from knowledge of the parent words? As the prevalence of abbreviation use continues to rise, these and other questions surrounding their processing will become increasingly important. It is essential that such future research and computational models of word recognition take into account the stark differences in grapheme-phoneme correspondences of acronyms and initialisms, as well as biases that may be present in their phonological processing.
Acknowledgments
This research was supported by Grant HD26765 from the National Institutes of Health and by a Chancellor's Research Scholarship from UCSD to the third author. We thank Tara Chaloukian and Jullian Zlatarev for help with data collection and Reinhold Kliegl, Manuel Perea, Kathy Rastle, and two anonymous reviewers for helpful comments.
Appendix A
List of experimental sentences presented as in the normal case condition. The legal and illegal previews appear in parentheses following the target abbreviations. The first 27 items represent the acronym targets and the last 27 represent the initialism targets.
My friend John claims a proud WASP (WESP, WSSP) heritage complete with pilgrim ancestors.
The sailor faced the angry JAG (JEG, JRG) officer while in the courtroom.
During our trip abroad, the fast acting NATO (NOTO, NKTO) response team kept us safe.
Because of performance problems, Tom upgraded RAM (ROM, RBM) improving his computer.
Brian watched the highly competitive FIFA (FUFA, FNFA) qualifying game standing.
Because of Michael's lengthy AWOL (AWEL, AWML) from the Marines, an arrest warrant was issued.
Construction for additional RIMAC (RAMEC, RLMBC) structures is creating traffic problems.
Experts recommend having complex PIN (PON, PLN) numbers to protect against theft.
There was gum on the dirty BART (BORT, BKRT) train's floor and I stepped in it.
Interpreting the lengthy ANOVA (ANUVU, ANRVB) results took a while.
Metal debris damaged the expensive NASA (NUSO, NRSB) probe while it was in orbit.
Jan's legs were tired after walking the massive EPCOT (EPCUT, EPCGT) center in Florida.
Frankie was associated with the early FEMA (FIMU, FBMK) efforts in New Orleans.
Franklin listened as the attractive PETA (PATU, PBTR) representative delivered her speech.
Despite tired legs, the passionate MADD (MEDD, MRDD) protesters stood outside the bar.
After careful planning the large SWAT (SWUT, SWBT) agent burst through the door.
The new legislation will help the devastating AIDS (AERS, AVRS) epidemic affecting the area.
Kim had trouble with the strict NAFTA (NUFTA, NBFTA) regulations while importing goods.
Unfortunately, the image retained incompatible GIF (GEF, GTF) formatting and could not be seen.
Josh was surprised the narrow CAT (CET, CBT) scan machine made him claustrophobic.
Stephanie did not fear the virulent SARS (SURS, SBRS) virus because of her facemask.
Many benefit from the companionate UNICEF (UNOCAF, UNLCRF) workers diligent efforts.
Sammy joined the afternoon CAD (CUD, CRD) class because she wanted to be an architect.
Shaun was happy the friendly GLADD (GLEDD, GLBDD) members invited everyone to gay pride day.
John fell asleep reading a lengthy BLOG (BLUG, BLDG) entry about his favorite type of cheese.
As she walked into the beautiful MOMA (MUMI, MPMB) building Jan gazed at the strange sculptures.
Frank was gazing at the attractive MILF (MALF, MVLF) walking her dog in front of him.
Profits run high at crowded NFL (NEL, NRL) concession stands around half time.
Jane was surprised when the quiet FBI (FUI, FDI) agent stood and spoke.
Fans often complain about hard working NBA (NEB, NGC) officials making bad calls.
The economic problems of large FDIC (FELA, FGLB) insured banks affect us all.
The young man underestimated the dangerous TNT (TUT, TMT) stick and was too close to the blast.
Nick was relieved the oppressive USSR (UMSO, URSC) had dissolved after the Cold War.
Mark was surprised by the intelligent CBS (CES, CFS) reporter who used a large vocabulary.
The floor was slippery at the crowded NCAA (NOBA, NRBA) championships so a time out was taken.
John learned nothing from the secretive NSA (NOA, NCA) agents who would not share information.
Tim had seen the compelling NCIS (NOLS, NRLS) episode which aired this week twice already.
The nationally acclaimed UNLV (UNUM, UNKM) marching band will be playing today.
Car alarms can be heard in noisy NYC (NIC, NKC) regardless of the time of day.
Freedom is important to active NRA (NEP, NPE) members concerned about their right to bear arms.
Many believe the popular MTV (MEV, MBV) channel has strayed from its roots in music.
Matt was ranting about the careless IRS (IES, IBS) agent who lost his tax forms.
Tony laughed at the amusing UCI (UCO, UCL) anteater as the mascot walked by.
I was grateful the responsive EMT (EMI, EML) quickly responded to the accident.
Jose's rapid actions and quick CPR (CER, CBR) probably saved the girl's life.
Paul was pleased the responsive ACLU (AFLO, AFLD) attorney was willing to take his case.
Diana could finally join the large AARP (AREP, ARJP) organization now that she was fifty-five.
After helping the lady, the friendly LAPD (LUBO, LBTO) officer waved to the neighborhood kids.
Luke burst into tears after hearing the tragic DOA (DAF, DKR) announcement after the accident.
Mom does not usually tell her actual DOB (DIB, DDB) because she is embarrassed about her age.
News of the fast acting DEA (DOP, DBN) agents was on every channel.
I thoroughly enjoyed the delicious BLT (BEP, BCP) and side salad I had for lunch today.
Rose could not follow the fast paced ESPN (ESON, ESBN) show which constantly changed topics.
Sam was glad the underfunded PBS (PES, PCS) could produce shows like Sesame Street.
Footnotes
1. The DRC was not intended to model the processing of orthographically illegal input.
2. Brysbaert et al. (2009) found no difference in associative priming for capital vs. lowercase abbreviations. Therefore, it should also be possible to remove this apparent bias by leaving the surrounding sentences unchanged and using lowercase abbreviations. We chose to use all capitals so the appearance of the abbreviations would be the same across conditions.
3. The study was run with 60 sentences (30 acronyms and 30 initialisms). However, it was later discovered that there were typos in the previews of 3 of these target abbreviations (1 acronym and 2 initialisms). Therefore, these sentences were dropped from analysis along with 3 other randomly picked sentences (2 acronyms and 1 initialism) in order to balance the dataset (9 sentences per condition). The removal of these items did not influence the statistical conclusions.
4. Case was manipulated between subjects. While it would have been statistically preferable to manipulate all variables within subjects, there were not enough familiar acronyms available for such a design.
5. Late display changes are those that did not complete within 7 ms after the start of the post boundary fixation. Saccades that trigger the display change early are sometimes referred to as hooks and are believed to be the result of saccadic overshoots and corrections.
6. Note that the initialism WWW presents an orthographic savings at a heavy phonological cost, as each letter name contains three syllables but the word it replaces contains only one.
Publisher's Disclaimer: The following manuscript is the final accepted manuscript. It has not been subjected to the final copyediting, fact-checking, and proofreading required for formal publication. It is not the definitive, publisher-authenticated version. The American Psychological Association and its Council of Editors disclaim any responsibility or liabilities for errors or omissions of this manuscript version, any version derived from this manuscript by NIH, or other third parties. The published version is available at www.apa.org/pubs/journals/xlm
References
- Baayen RH, Davidson DH, Bates DM. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language. 2008;59:390–412.
- Bates D, Maechler M, Dai B. lme4: Linear mixed-effects models using S4 classes. R package version 0.999375–28. 2008 Retrieved October 21, 2008, from http://lme4.r-forge.r-project.org/
- Baayen RH. Analyzing linguistic data: A practical introduction to statistics using R. Cambridge University Press; Cambridge, UK: 2008.
- Brysbaert M, Speybroeck S, Deiter V. Is there room for the BBC in the mental lexicon? On the recognition of acronyms. The Quarterly Journal of Experimental Psychology. 2009;62:1832–1842. doi: 10.1080/17470210802585471.
- Coltheart M, Rastle K, Perry C, Langdon R, Ziegler J. DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review. 2001;108:204–256. doi: 10.1037/0033-295x.108.1.204.
- Davies M. The Corpus of Contemporary American English (COCA): 400+ million words, 1990-present. 2008 Available online at http://www.americancorpus.org.
- Henderson JM, Dixon P, Petersen A, Twilley LC, Ferreira F. Evidence for the use of phonological representations during transsaccadic word recognition. Journal of Experimental Psychology: Human Perception & Performance. 1995;21:82–97.
- Laszlo S, Federmeier KD. The acronym superiority effect. Psychonomic Bulletin & Review. 2007a;14:1158–1163. doi: 10.3758/bf03193106.
- Laszlo S, Federmeier KD. Better the DVL you know. Acronyms reveal the contribution of familiarity to single-word reading. Psychological Science. 2007b;18:122–126. doi: 10.1111/j.1467-9280.2007.01859.x.
- Lee Y, Binder KS, Kim J, Pollatsek A, Rayner K. Activation of phonological codes during eye fixations in reading. Journal of Experimental Psychology: Human Perception and Performance. 1999;25:948–964. doi: 10.1037//0096-1523.25.4.948.
- Lepschy A, Lepschy G. The Italian language today (Reprinted Ed) London: Routledge; 1991.
- Mcwilliam L, Schepman A, Rodway P. The linguistic status of text message abbreviations: An exploration using a Stroop task. Computers in Human Behavior. 2009;25:970–974.
- Miellet S, Sparrow L. Phonological codes are assembled before word fixation: Evidence from boundary paradigm in sentence reading. Brain & Language. 2004;90:299–310. doi: 10.1016/S0093-934X(03)00442-5.
- Perea M, Acha J, Carreiras M. Eye movements when reading text messaging (txt msgng) The Quarterly Journal of Experimental Psychology. 2009;62:1560–1567. doi: 10.1080/17470210902783653.
- Plester B, Wood C, Bell V. Txt msg n school literacy: does texting and knowledge of text abbreviations adversely affect children's literacy attainment? Literacy. 2008;42:137–144.
- Plester B, Wood C, Joshi P. Exploring the relationship between children's knowledge of text message abbreviations and school literacy outcomes. British Journal of Developmental Psychology. 2009;27:145–161. doi: 10.1348/026151008x320507.
- Pollatsek A, Lesch M, Morris RK, Rayner K. Phonological codes are used in integrating information across saccades in word identification and reading. Journal of Experimental Psychology: Human Perception & Performance. 1992;18:148–162. doi: 10.1037//0096-1523.18.1.148.
- R Development Core Team. R: A language and environment for statistical computing. 2010 Retrieved from http://www.R-project.org.
- Rayner K. Eye movements in reading and information processing: 20 years of research. Psychological Bulletin. 1998;124:372–422. doi: 10.1037/0033-2909.124.3.372.
- Rayner K. Eye movements and attention in reading, scene perception, and visual search. The Quarterly Journal of Experimental Psychology. 2009;62:1457–1506. doi: 10.1080/17470210902816461.
- Rayner K, Foorman BR, Perfetti CA, Pesetsky D, Seidenberg MS. How Psychological Science informs the teaching of reading. Psychological Science in the Public Interest. 2001;2:31–74. doi: 10.1111/1529-1006.00004.
- Rayner K, Pollatsek A, Binder KS. Phonological codes and eye movements in reading. Journal of Experimental Psychology: Learning, Memory, & Cognition. 1998;24:476–497. doi: 10.1037//0278-7393.24.2.476.
- Slattery TJ, Pollatsek A, Rayner K. The time course of phonological and orthographic processing of acronyms in reading: Evidence from eye movements. Psychonomic Bulletin & Review. 2006;13:412–417. doi: 10.3758/bf03193862.
- Tagliamonte SA, Denis D. Linguistic ruin? lol! Instant messaging and teen language. American Speech. 2008;83:3–34.
- Thurlow C. From statistical panic to moral panic: The metadiscursive construction and popular exaggeration of new media language in the print media. Journal of Computer-Mediated Communication. 2006;11:667–701.
- Thurlow C, Bell K. Against technologization: Young people's new media discourse as creative cultural practice. Journal of Computer-Mediated Communication. 2009;14:1038–1049.