Author manuscript; available in PMC: 2015 Apr 1.
Published in final edited form as: J Exp Psychol Gen. 2013 Jul 8;143(2):895–913. doi: 10.1037/a0033580

Reading is fundamentally similar across disparate writing systems: A systematic characterization of how words and characters influence eye movements in Chinese reading

Xingshan Li 1,*, Klinton Bicknell 2,*, Pingping Liu 1, Wei Wei 1, Keith Rayner 2
PMCID: PMC3885613  NIHMSID: NIHMS524572  PMID: 23834023

Abstract

While much previous work on reading in languages with alphabetic scripts has suggested that reading is word-based, reading in Chinese has been argued to be less reliant on words. This is primarily because in the Chinese writing system words are not spatially segmented, and characters are themselves complex visual objects. Here, we present a systematic characterization of the effects of a wide range of word and character properties on eye movements in Chinese reading, using a set of mixed-effects regression models. The results reveal a rich pattern of effects of the properties of the current, previous, and next words on a range of reading measures, which is strikingly similar to the pattern of effects of word properties reported in spaced alphabetic languages. This finding provides evidence that reading shares a word-based core and may be fundamentally similar across languages with highly dissimilar scripts. We show that these findings are robust to the inclusion of character properties in the regression models, and are equally reliable when dependent measures are defined in terms of characters rather than words, providing strong evidence that word properties have effects in Chinese reading above and beyond characters. This systematic characterization of the effects of word and character properties in Chinese advances our knowledge of the processes underlying reading and informs the future development of models of reading. More generally, however, this work suggests that differences in script may not alter the fundamental nature of reading.

Keywords: Chinese reading, eye movements, mixed-effects regression


The past four decades of eye movement research have demonstrated that readers’ eye movements are sensitive to a range of properties of the words being read. As a result, dominant models of eye movement control in reading take words to be the basic units of ongoing processing and of saccade targeting (Engbert, Longtin, & Kliegl, 2002; Engbert, Nuthmann, Richter, & Kliegl, 2005; Reichle, Pollatsek, Fisher, & Rayner, 1998; Reichle, Pollatsek, & Rayner, 2012; Reichle, Rayner, & Pollatsek, 2003; Reichle, Warren, & McConnell, 2009; Reilly & Radach, 2006; but see S. N. Yang & McConkie, 2001). However, the majority of this research has examined readers of alphabetic languages such as English, in which words are salient perceptual tokens, separated from each other by spaces. In contrast, in Chinese orthography, words are not spatially segmented, and the characters that compose them are themselves quite visually complex, leading a number of researchers to suggest that characters are the more important unit of processing (e.g., Chen, 1996; Chen, Song, Lau, Wong, & Tang, 2003; Hoosain, 1991, 1992).

Studies of eye movements of Chinese readers have shown that properties of both words and characters have effects on eye movements (e.g., G. Yan, Tian, Bai, & Rayner, 2006), suggesting that eye movements in Chinese are driven by a complex process generally sensitive to linguistic properties at both word and character levels (see a recent 2013 Special Issue of the Journal of Research in Reading for discussion of relevant issues). The present work takes a step towards elucidating this process by systematically characterizing the ways in which the eye movement record in Chinese is sensitive to word and character properties. To do this, we employed mixed-effects regression modeling of an eye movement corpus of Chinese text, simultaneously measuring the influence of a range of word and character properties1. The results of this analysis revealed that, while character properties clearly play a large role in determining Chinese readers’ eye movements, the pattern of effects of word properties in Chinese is remarkably similar to that in languages written with alphabetic scripts, suggesting that the underlying processes driving eye movements across very different orthographies may in fact be highly analogous.

Eye movement studies in alphabetic languages have shown that a word’s linguistic properties, such as its frequency and predictability, affect both the number and duration of fixations it will receive. For example, low frequency words are fixated longer than high frequency words (Inhoff & Rayner, 1986; Miellet, Sparrow, & Sereno, 2007; O’Regan & Jacobs, 1992; Rayner, Ashby, Pollatsek, & Reichle, 2004; Rayner & Duffy, 1986; Rayner, Reichle, Stroud, & Pollatsek, 2006; Rayner, Sereno, & Raney, 1996; Slattery, Pollatsek, & Rayner, 2007; Vanyukov, Warren, Wheeler, & Reichle, 2012; White, 2008), and words that are less predictable in context are fixated longer than more predictable words (Balota, Pollatsek, & Rayner, 1985; Kliegl, Grabner, Rolfs, & Engbert, 2004; Kliegl, Nuthmann, & Engbert, 2006; Miellet et al., 2007; Rayner et al., 2004; Rayner et al., 2006; Rayner, Slattery, Drieghe, & Liversedge, 2011; Rayner & Well, 1996; Vainio, Hyönä, & Pajunen, 2009).

Furthermore, in alphabetic languages, it has also been demonstrated that fixation times on a word are affected by the linguistic properties of at least some other nearby words. For example, a difficult preceding word can lead to more and longer fixations on the next word; this is referred to as a spill-over effect (Henderson & Ferreira, 1990; Kliegl et al., 2006; Pollatsek, Reichle, Juhasz, Machacek, & Rayner, 2008; Rayner & Duffy, 1986). Moreover, some studies have even found that fixation durations are affected by the properties of the subsequent word, termed parafoveal-on-foveal effects (Drieghe, Brysbaert, & Desmet, 2005; Inhoff, Starr, & Shindler, 2000; Kennedy & Pynte, 2005; Kliegl, Risse, & Laubrock, 2007; Pynte, Kennedy, & Ducrot, 2004; M. Yan, Richter, Shu, & Kliegl, 2009; J. Yang, Wang, Xu, & Rayner, 2009). However, these results have not always been replicated (Rayner, Juhasz, & Brown, 2007; Schotter, Angele, & Rayner, 2012; Schotter, Blythe, et al., 2012; White, 2008; White & Liversedge, 2004).

The fact that linguistic properties of words exert such influence over eye movement control in reading has been taken as evidence that words are the basic units of ongoing processing in reading. Further support for this notion comes from analyzing the eyes’ initial landing positions on words. The data show that landing positions cluster at or just left of the center of words, suggesting that words may be not only the basic units of perceptual encoding, but also the functional targets of saccades (McConkie, Kerr, Reddix, & Zola, 1988; Rayner, 1979).

In Chinese, it is much less clear that such a word-based view of reading would apply as Chinese orthography differs from alphabetic languages in many respects. One reason for this is that the character system is very different. There are more than 5000 characters in Chinese – orders of magnitude higher than the number of characters in alphabetic scripts – and the information density in each Chinese character is much higher than in alphabetic scripts (Hoosain, 1991). Whereas in alphabetic languages, all characters are visually simple and all occur in text with high frequencies, Chinese characters exhibit substantial diversity in both their frequency and their visual complexity, being composed of anywhere from 1 to more than 20 strokes. It would be surprising if eye movements in reading were not a sensitive index of this diversity, and indeed, effects of character complexity (H. Yang & McConkie, 1999) and frequency (Cui, Bai, Yan, Hyönä, Wang & Liversedge, 2013; G. Yan et al., 2006) on eye movements in reading in Chinese have been reported, despite the lack of such effects in alphabetic languages2.

These differences in character orthography are necessarily also reflected by differences in word orthography. As characters in Chinese each contribute more information than characters in alphabetic scripts, words are much shorter, the vast majority being composed of just one or two characters. In one published source (Lexicon of common words in contemporary Chinese, 2008), 6% of word types are single-character words, 72% are 2-character words, 12% are 3-character words, 10% are 4-character words, and less than 0.3% are longer than 4 characters. A more critical difference between Chinese and alphabetic scripts in this regard is that there are no physical cues between words (i.e., spaces) in Chinese text to mark word boundaries. Rather, text written in Chinese is formed by strings of equally spaced box-shaped characters. Chinese readers thus have to depend on lexical knowledge to segment characters into words (Li, Rayner, & Cave, 2009), and so characters – not words – are the perceptually salient tokens in a line of text.

These facts have led a number of researchers to suggest that characters are more important than words for Chinese readers. Chen and colleagues (Chen, 1996; Chen & Zhou, 1999) argued that characters function as the perceptual encoding units for Chinese readers, because individual characters have such high complexity, exhibit character superiority effects, and are the physically segmented units of Chinese text. Additionally, Chen et al. (2003) described a regression analysis of an eye movement corpus of Chinese text assessing the contributions of both character and word properties, similar in spirit to that presented here. They argued that their analysis showed evidence that – at least for adult readers – character properties play a larger role in determining eye movements than word properties. However, these results are not conclusive because the word properties they analyzed in their model did not include two of the word properties with the largest effects on eye movements: word frequency and predictability. Similarly, Feng (2008) argued that the fact that reading appears to be word-based in alphabetic languages with spaces is just a reflection of the fact that the spaces between words in the orthography provide a useful cue that readers learn to take advantage of. He suggested that if his hypothesis is correct, the implications for Chinese reading are that we should not expect reading to be similarly word-based, since Chinese orthography does not provide such cues.

Finally, apart from the properties of the orthography itself, there are also a number of reasons to believe that the concept of the word is less salient in Chinese than the character: it is characters rather than words that are the basic units in Chinese dictionaries, and native speakers often have some disagreement on the locations of word boundaries in text (Hoosain, 1991, 1992; Liu, Li, Lin, & Li, 2013). A number of Chinese linguists even argue that the concept of a word is mostly borrowed from Indo-European languages, and the concept may not be applicable in Chinese (H. J. Wang, 2007; J. Wang, 2009; Xu, 1994, 2005). These arguments and experimental results suggesting that characters are more important than words in Chinese suggest that the processes underlying reading behavior may be qualitatively different because of these differences in orthography.

Perhaps some of the most striking evidence that reading in Chinese is different from reading in languages with alphabetic orthographies comes from studying the eyes’ initial landing position within words, the preferred viewing location (Rayner, 1979). In languages with alphabetic scripts, the strongest evidence that word centers are the targets of saccades comes from analyses showing that initial fixations cluster just left of the center of words (McConkie et al., 1988; Rayner, 1979). In Chinese, there is disagreement about whether readers adopt a word-based targeting strategy. While some studies reported flat preferred viewing location curves (Tsai & McConkie, 2003; H. Yang & McConkie, 1999), M. Yan, Kliegl, Richter, Nuthmann, and Shu (2010) presented evidence that initial fixations similarly clustered around the center of words in Chinese when only one fixation was made on a word, but peaked toward the beginning when there were multiple fixations. However, Li, Liu, and Rayner (2011) presented simulation results showing that even a simple model that assumes that saccades travel constant distances could generate the same kinds of initial fixation distributions as observed by M. Yan et al. Thus, Li et al. concluded that saccade targeting in Chinese may not be word-based, as it appears to be in other languages. Moreover, Zang, Liang, Bai, Yan, and Liversedge (in press) examined how inter-word spaces influence the eye movement behavior of both adults and children by inserting spaces between Chinese words. They found that initial fixations tended to land near the word center more in the spaced condition than in the unspaced condition, suggesting that inserting spaces between words does affect target selection in Chinese reading. There are some suggestions in the literature, however, that saccade targeting may be at least somewhat sensitive to word properties. Specifically, word skipping rates in Chinese have been shown to vary with a word’s frequency (G. Yan et al., 2006; H. Yang & McConkie, 1999) and predictability (Rayner, Li, Juhasz, & Yan, 2005).

While the literature cited above argues that reading in Chinese may be qualitatively different, and specifically less word-based, than reading in languages with alphabetic scripts, there is evidence that words do have psychological reality in Chinese. First, similar to findings in languages with alphabetic scripts (Reicher, 1969; Wheeler, 1970), Chinese characters are identified more accurately in a word than in a string of characters that do not constitute a word (Cheng, 1981). Second, Li et al. (2009) found a word boundary effect, wherein character recognition accuracy dropped at the word boundary when Chinese readers were briefly presented Chinese characters consisting of either two 2-character words or a 4-character word. Third, Li and Logan (2008) demonstrated that Chinese characters belonging to a word could be perceived as an object and affect attentional deployment.

Additionally, there is some evidence for word-level processing in reading. Bai, Yan, Liversedge, Zang, and Rayner (2008) found that while inserting spaces between words did not facilitate or interfere with reading, inserting spaces between characters did interfere with reading. Later studies showed that inserting spaces between words could help beginning readers of Chinese to read more efficiently and to learn new words (Blythe et al., 2012; Shen et al., 2012). Moreover, other studies found that reading speed was slowed when Chinese readers could not view two characters belonging to a word simultaneously, compared with when they could do so (Li, Gu, Liu, & Rayner, in press; Li, Zhao, & Pollatsek, 2012). Other eye movement studies demonstrated that the frequency and predictability of a Chinese word affect eye movements on it during reading: high-frequency words are fixated for less time than low-frequency words (G. Yan et al., 2006; H. Yang & McConkie, 1999) and more predictable words are fixated for less time than less predictable words (Rayner et al., 2005). In addition, Rayner, Li, and Pollatsek (2007) extended the word-based E-Z Reader model of eye movement control in English reading to Chinese. The model accounted for fixation durations and word skipping rates (Rayner et al., 2005) during Chinese reading quite well, suggesting that word properties are an important factor in eye movement control for Chinese readers.

In summary, there is substantial reason to believe that reading in Chinese is characterized by qualitatively different underlying processes than reading in languages with alphabetic scripts. Specifically, it is clear that individual characters play a larger role in Chinese reading, and exert their own influence on eye movements, and in addition, there are arguments and evidence that words – while clearly having some effect on eye movements in reading – may play less of a role, and perhaps a qualitatively different role, than in languages with alphabetic scripts.

The purpose of the current study

In order to deepen our insight into the processes underlying reading in Chinese, we present here a systematic characterization of the effects of a wide range of both word and character properties on eye movements in Chinese reading, using a set of mixed-effects regression models. Specifically, the word properties we assessed include the length, frequency, and predictability of the current, previous, and following word, and the character properties we assessed include the frequency and complexity of a range of characters around the point of fixation. Including both word and character properties within a single mixed-effects regression model allows us to determine the effects of word properties above and beyond the character properties included in the model, and vice versa.

This work has a number of goals. First and primarily, as such models have already been reported for effects of word properties on eye movements in alphabetic languages (e.g., Kliegl et al., 2006), this allows us to evaluate the qualitative effects of word properties on eye movements in Chinese, and to determine whether the pattern is similar to that reported for alphabetic languages. To the extent that the pattern of effects of word properties is similar, it would provide evidence that the processes underlying reading are the same even across disparate orthographies, and that words play a prominent role in reading, even when not explicitly marked in the text. While previous work has already shown that word frequency and predictability have effects on eye movements, here we also investigate the influence of the preceding and following words (which have never been studied in Chinese), providing a broader investigation of the ways in which reading may be similar across languages.

Additionally, in order to provide a more stringent test for the effects of word properties on reading in Chinese, we go beyond previous work, which has typically only analyzed the effects of word properties on word-based eye movement measures (such as the total duration of fixations a word receives), to also analyze the effect of word properties on character-based eye movement measures. To the extent that word properties still influence eye movements in the same way even when the measure of interest is defined in terms of characters, this provides some of the strongest evidence to date that word properties do have effects on Chinese reading above and beyond character properties, and that these effects are similar to those in other languages.

Finally, by providing a systematic analysis of a range of character and word properties on eye movements in reading, we take our results to provide benchmark data for the development and evaluation of computational models in Chinese reading. Our knowledge of the processes underlying reading in alphabetic languages has in recent years been substantially refined by a range of successful computational models (e.g., Engbert et al., 2002; Engbert et al., 2005; Reichle et al., 1998, 2003, 2012; Richter, Engbert, & Kliegl, 2006). It is important to investigate how current eye movement models can be modified to account for Chinese reading or whether new models are needed. We trust that the data reported in the current work will contribute to this development.

Methodology of the current study

In examining the effects of a range of linguistic variables on eye movements in reading with a set of regression models, the present work fits into a long tradition of multiple regression analyses in eye movement research, beginning with Just and Carpenter (1980), who applied a regression model to the mean gaze durations (the sum of all first-pass fixations on a word before moving the eyes to another word) on each word in a series of texts, and found that they were affected by factors such as encoding and lexical access, case role assignment, and inter-clause integration. Later work has used repeated measures regression (Lorch & Myers, 1990) to analyze corpora of eye movement data in reading (Juhasz & Rayner, 2003; Kliegl et al., 2006) and documented effects of a large range of variables. More recently, researchers have begun to use mixed-effects regression models (Baayen, Davidson, & Bates, 2008; Pinheiro & Bates, 2000), which can provide a more powerful method of analysis. The present work follows this last class of approaches, using generalized linear mixed effects regression models to analyze the effects of a range of factors on eye movement reading measures (Baayen et al., 2008; Engbert et al., 2005; Faraway, 2006; Kliegl, 2007; Kliegl, Masson, & Richter, 2010; Rayner et al., 2011).

Specifically, in this study, we collected eye movement data when Chinese readers read Chinese sentences, and we used generalized linear mixed-effects models to explore how different word properties and character properties affect eye movement control in Chinese reading. In addition to these fixed effects, the models included subject and word token as crossed random effects.3 The word properties included in the model were (log-transformed) frequency, (log-transformed) predictability, and length in characters. The character properties included complexity (number of strokes in a character), and (log-transformed) frequency. We also examined the distance between the character of interest and the nearest fixated character to the left in some of the analyses. In some models with character measures, we also include the relative character position within a word. To perform the analyses, we used the lme4 package in the R system for statistical computing (Bates, 2010; Bates & Maechler, 2010).
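As a reading aid, the model structure described in this paragraph can be written out schematically. The notation below is an assumption introduced here for illustration (it does not appear in the original): y_{ij} is the dependent measure for subject i on word token j, x_{k,ij} is the k-th word or character predictor, and s_i and w_j are the crossed random intercepts for subject and word token. For continuous measures such as gaze duration, a sketch of the linear form is:

```latex
y_{ij} = \beta_0 + \sum_{k} \beta_k \, x_{k,ij} + s_i + w_j + \varepsilon_{ij},
\qquad
s_i \sim \mathcal{N}(0, \sigma_s^2), \quad
w_j \sim \mathcal{N}(0, \sigma_w^2), \quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

For binary measures such as whether a word was fixated, the same linear predictor would enter through a logit link, again as a sketch under the same assumed notation:

```latex
\operatorname{logit} P(y_{ij} = 1) = \beta_0 + \sum_{k} \beta_k \, x_{k,ij} + s_i + w_j
```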

We used mixed-effects regression models to analyze the effects of word and character properties on five eye movement measures. The first analysis examines gaze durations on words. Because of concern that using such a word-based measure may bias results to showing evidence of word properties rather than character properties, we performed a similar analysis on the first fixation durations4 on characters (hereafter fixation duration on characters), testing for effects of words n−1, n, n+1, and character properties. We next analyzed measures of fixation locations: word and character fixation probability, our third and fourth analyses. The final model used saccade length as the dependent variable, yielding insight into how the properties of the fixated word affect the planning of the next saccade.
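To make the word-based measures concrete, the following minimal sketch computes gaze duration (the sum of all first-pass fixations on a word before the eyes move to another word) from a chronological fixation sequence. The data layout, the function name `gaze_duration`, and the rule that a word counts as skipped once a later word has been fixated first are assumptions made for illustration, not a description of the authors' analysis code.

```python
from typing import List, Optional, Tuple

# Hypothetical layout: each fixation is (word_index, duration_ms),
# listed in the order in which the fixations occurred.
Fixation = Tuple[int, int]


def gaze_duration(fixations: List[Fixation], word: int) -> Optional[int]:
    """Sum the first-pass fixation durations on `word`.

    Returns None when the word received no first-pass fixation, i.e. it
    was never fixated or a later word was fixated before it (assumption).
    """
    total = 0
    entered = False
    for idx, dur in fixations:
        if idx == word:
            entered = True
            total += dur
        elif entered:
            break  # the eyes left the word, so first pass is over
        elif idx > word:
            return None  # a later word was fixated first: treated as skipped
    return total if entered else None


# Made-up example: word 1 gets two first-pass fixations, word 2 is skipped
# and only fixated later after a regression.
example = [(0, 220), (1, 250), (1, 180), (3, 240), (2, 200), (3, 210)]
print(gaze_duration(example, 1))  # 430
print(gaze_duration(example, 2))  # None
```

Under this sketch, a binary first-pass fixation indicator for a word is simply whether `gaze_duration` returns a value rather than None.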

Significance testing in the models was performed as follows. For binary dependent variables such as whether a word was skipped, we report the Wald z, obtained from dividing the coefficient estimate by its standard error, and its associated p value. For continuous dependent variables such as gaze duration, we report Student’s t, also obtained by dividing the coefficient estimate by its standard error. Because there is no consensus on the appropriate number of degrees of freedom for this t distribution for mixed-effects models, however, we do not report degrees of freedom nor a p value. Instead, since the t statistic will be approximately normally distributed for datasets of this size, we count as significant cases in which |t|>1.96 (see Baayen et al., 2008).
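The decision rule in this paragraph can be sketched in a few lines. The function names are hypothetical, and the normal-approximation p value is shown only for the Wald z case. The example coefficient and standard error are the rounded values for word n frequency in Table 1, so the recomputed statistic differs slightly from the reported t.

```python
import math


def wald_statistic(estimate: float, std_error: float) -> float:
    """Coefficient estimate divided by its standard error (Wald z or t)."""
    return estimate / std_error


def two_sided_normal_p(z: float) -> float:
    """Two-sided p value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def is_significant(statistic: float, threshold: float = 1.96) -> bool:
    """Apply the |t| > 1.96 criterion described in the text."""
    return abs(statistic) > threshold


# Rounded Table 1 values for word n frequency: b = -6.40, SE = 1.32.
t_value = wald_statistic(-6.40, 1.32)
print(round(t_value, 2), is_significant(t_value))
```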

To test whether including an independent variable significantly improves the fit of the model to the data, we also performed likelihood ratio tests (LRTs). The LRT statistic is the difference in deviance between the full model and the constrained model in which one of the independent variables is removed. An LRT statistic approximately follows a χ2 distribution with ν degrees of freedom, where ν is determined by the difference in the number of free parameters between the two models. When the p value of the test is smaller than a specific value (.05 or .01), we can reject the null hypothesis that the more complex model fits the data better by chance. For binary predictor variables, the p value derived from an LRT will match that derived from Wald z as described above.
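The likelihood ratio test described here can be illustrated with a small standard-library sketch. The deviance values in the example are invented; only the resulting statistic (4.36 on 1 degree of freedom) is taken from Table 1. The helper `chi2_sf` is a hypothetical name and computes the chi-square upper tail for integer degrees of freedom through the recurrence for the regularized upper incomplete gamma function, which is an implementation choice made here rather than anything specified in the paper.

```python
import math
from typing import Tuple


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with
    integer degrees of freedom df, using
    Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1), with y = x / 2."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)               # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))    # Q(1/2, y)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(deviance_reduced: float, deviance_full: float,
                          df_diff: int) -> Tuple[float, float]:
    """Return (LRT statistic, p value) for two nested models."""
    statistic = deviance_reduced - deviance_full
    return statistic, chi2_sf(statistic, df_diff)


# Invented deviances chosen so that the statistic equals 4.36 with 1 df.
stat, p = likelihood_ratio_test(1004.36, 1000.0, 1)
print(round(stat, 2), round(p, 3))
```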

We also report the increase of Akaike information criterion (AIC) when an independent variable is removed from the full model. AIC is a measure of the relative goodness of fit of a statistical model (Akaike, 1974). It offers a relative measure of the information lost when a given model is used to describe reality. Since it considers both the goodness of fit and the number of free parameters, it is widely used to compare the nested models. When comparing models, a model with a smaller AIC value is usually considered better since it has less information loss. Hence the increase of AIC provides a relative measure on how much information an independent variable contributed to the variance of the dependent variable.
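A brief sketch of the AIC comparison follows, assuming the standard definition AIC = deviance + 2k, where the deviance is −2 times the maximized log-likelihood and k is the number of free parameters. The parameter counts and deviances in the example are invented. With one parameter removed, the AIC increase is the LRT statistic minus 2, before any rounding.

```python
def aic(deviance: float, n_params: int) -> float:
    """Akaike information criterion from a deviance (-2 log-likelihood)."""
    return deviance + 2 * n_params


def aic_increase(deviance_full: float, k_full: int,
                 deviance_reduced: float, k_reduced: int) -> float:
    """AIC of the reduced model minus AIC of the full model.

    Positive values mean that removing the variable(s) made the model worse.
    """
    return aic(deviance_reduced, k_reduced) - aic(deviance_full, k_full)


# Invented example: removing one predictor raises the deviance by 4.36.
delta = aic_increase(1000.0, 20, 1004.36, 19)
print(round(delta, 2))  # 2.36
```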

Methods

Participants

Forty-six native Chinese speakers, who were undergraduate students at universities in Beijing near the Institute of Psychology, Chinese Academy of Sciences, were paid 30 RMB (about 5 US Dollars) to participate in the experiment. All of them had normal or corrected-to-normal vision, and all were naive regarding the purpose of the experiment.

Apparatus

Eye movements were recorded by an SR EyeLink II tracker, which has a resolution of approximately 30′ of arc. Participants read the sentences (which were printed horizontally from left to right) on a 19-in. CRT monitor connected to a DELL PC. They wore a lightweight helmet that is part of the eye-tracking system. The eye-tracking system samples at 250 Hz and provides eye movement data for further analysis via another PC. Although the Eyelink II system is able to compensate for head movements, the participants rested their heads on a chinrest to minimize head movements during the experimental trials. Viewing was binocular, but eye movement data were collected only from the right eye. The participants were seated 70 cm from the video monitor; at this distance, one character subtended 0.8° of visual angle.
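The viewing geometry reported above can be checked with the usual visual-angle relation, size = 2 · d · tan(θ/2). The sketch below is an illustration with hypothetical function names. It converts the reported 0.8° at 70 cm into an approximate on-screen character size, and the 250 Hz sampling rate into a sample interval.

```python
import math


def size_from_visual_angle(angle_deg: float, distance_cm: float) -> float:
    """Physical size (cm) subtending angle_deg at the given viewing distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (degrees) subtended by an object of size_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


char_size_cm = size_from_visual_angle(0.8, 70.0)  # just under 1 cm
sample_interval_ms = 1000.0 / 250.0               # one sample every 4 ms
print(round(char_size_cm, 3), sample_interval_ms)
```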

Materials

The materials consisted of 80 sentences, which were obtained from an online corpus5. We slightly modified some of the sentences to make the sentence more concise. The sentences were 20 to 36 characters long (M = 29 characters, and SD = 3.7 characters) and were shown in a single line on the display. More information about, and analysis of, the materials is given in Appendix 1.

Procedure

When participants arrived for the experiment, they were given instructions for the experiment and a description of the apparatus. The eye tracker was calibrated at the beginning of the experiment and the calibration was validated as needed. For calibration and validation, participants looked at a dot that was presented at various locations in a 3 × 3 grid in a random order. Then each participant read 10 sentences for practice and the 80 experimental sentences in a different random order. The participants were told to read silently, and that they would periodically be asked to answer questions about the sentences. These questions were asked after one third of the 90 sentences that were read; the participants were correct over 90% of the time.

Each trial started with a fixation box (1° × 1° in size) at the location of the first character of the sentence. The sentence was shown after participants successfully fixated on the box. After reading a sentence, the participant pressed a response button to start the next trial.

Data analysis

Across all of the trials, approximately 3% of the data were lost due to track loss. Sentences were parsed into words using a popular Chinese word parsing software package (ICTCLAS2010). Since the software’s performance was not perfect, we also asked 10 subjects to evaluate the parsing results and to recommend modification of the parsing results. The final word boundaries were determined when at least 6 out of 10 subjects agreed. As a result, 1633 words were recognized in the sentences.

Words that involved the first two characters and the last two characters in a sentence were removed from analysis, as were all of the punctuation marks and the words involving two characters to the left and to the right of punctuation. All of the names (of people or places) were excluded from the analyses. In total, 1592 characters and 963 words were included in the analyses.

Blinks and fixations shorter than 40 ms (66 fixations) or longer than 1000 ms (59 fixations) were removed from analyses. In total, 42,766 fixations were analyzed. For the word-based dependent measures of gaze duration and word fixation probability, we analyzed only words that are shorter than 3 characters (representing over 92% of words in our corpus), in order to make a more homogenous dataset of character properties.
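The duration-based exclusion rule can be expressed as a simple filter. This is a minimal sketch with a hypothetical function name. It assumes that fixations of exactly 40 ms or exactly 1000 ms are retained, since the text excludes only those shorter or longer than these limits.

```python
from typing import List


def filter_fixations(durations_ms: List[float],
                     min_ms: float = 40.0,
                     max_ms: float = 1000.0) -> List[float]:
    """Keep fixations whose duration lies within [min_ms, max_ms]."""
    return [d for d in durations_ms if min_ms <= d <= max_ms]


# Made-up durations: the 35 ms and 1200 ms fixations are dropped.
print(filter_fixations([35, 40, 244, 1000, 1200]))  # [40, 244, 1000]
```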

Results

Overall analyses

Average fixation duration was 244 ms, with a standard deviation of 27 ms. The distribution is shown in Figure 1.

Figure 1. Distribution of fixation durations.

Character fixation probability was 42.8%, with a standard deviation of 8.5%. Regression rate was .12, with a standard deviation of .07. Average saccade length was 3.15 characters, with a standard deviation of .93. The distribution of forward saccade length is shown in Figure 2. Average regressive saccade length was 2.98 characters (SD = .94). The distribution is shown in Figure 2.

Figure 2. Distribution of saccade length. Top panel: forward saccade length distribution. Bottom panel: regression saccade length distribution.

Gaze duration on words

In the first model, the dependent variable was gaze duration on words, and the independent variables included word and character properties. The word properties were the (log-transformed) frequency, (log-transformed) predictability, and the length of words n−1, n, and n+1; the character properties were the complexity and (log-transformed) frequency of the characters before and after the word, and the average complexity and (log-transformed) frequency of the characters within the word. The results of the analysis are presented in Table 1. We only discuss significant effects in the following discussions (ts>1.96 for continuous dependent measures or ps<.05 for binary dependent measures). Interested readers can refer to the statistics reported in the tables for more detailed information.

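Table 1. Linear mixed-effects regression results on word gaze duration

Column groups: model estimates (b, SE, t); values in ms (Low, Median, High); model comparison (AIC+, χ2, p).

| Predictor | b | SE | t | Low (ms) | Median (ms) | High (ms) | AIC+ | χ2 | p |
|---|---|---|---|---|---|---|---|---|---|
| (Intercept) | 167.26 | 19.5 | 8.56 | | | | | | |
| Word n−1: frequency | −2.12 | 1.02 | −2.08 | 236 | 241 | 243 | 3 | 4.36 | .037 |
| Word n−1: predictability | −4.07 | .95 | −4.27 | 241 | 236 | 245 | 17 | 18.31 | <.001 |
| Word n−1: length | −11.08 | 2.73 | −4.06 | 244 | 241 | 222 | 15 | 16.56 | <.001 |
| Word n: frequency | −6.40 | 1.32 | −4.87 | 275 | 255 | 220 | 22 | 23.73 | <.001 |
| Word n: predictability | −4.07 | .95 | −4.26 | 251 | 232 | 214 | 17 | 18.25 | <.001 |
| Word n: length | 9.98 | 4.62 | 2.16 | 213 | 257 | 357 | 3 | 4.72 | .030 |
| Word n+1: frequency | .66 | .97 | .68 | 239 | 242 | 241 | −1 | 0.45 | .503 |
| Word n+1: predictability | −2.78 | .95 | −2.94 | 238 | 243 | 245 | 7 | 8.72 | .003 |
| Word n+1: length | 2.42 | 2.60 | .93 | 241 | 242 | 237 | −1 | 0.86 | .352 |
| Character before word: frequency | −.81 | 1.12 | −.72 | 244 | 240 | 241 | −1 | .51 | .473 |
| Character before word: complexity | .95 | .43 | 2.20 | 238 | 244 | 241 | 3 | 4.89 | .027 |
| Average of characters of word n: frequency | 1.81 | 1.61 | 1.12 | 268 | 254 | 232 | 0 | 1.26 | .261 |
| Average of characters of word n: complexity | 3.24 | .53 | 6.16 | 228 | 244 | 259 | 36 | 37.83 | <.001 |
| Character after word: frequency | .08 | 1.08 | .07 | 241 | 240 | 241 | −2 | 0 | 1 |
| Character after word: complexity | .36 | .42 | .85 | 239 | 245 | 237 | −1 | .72 | .396 |
| Nearest fixation distance | 17.47 | .36 | 48.25 | 150 | 271 | 253 | 205 | 2206.6 | <.001 |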

Note: For word frequency, low: 0–20 occurrences per million, median: 20–180 occurrences per million, high: more than 180 occurrences per million. For character frequency, low: 0–300 occurrences per million, median: 300–1000 occurrences per million, high: more than 1000 occurrences per million. For predictability, low: 0–0.1, median: 0.1–0.5, high: > 0.5. For character complexity, low: 1–6 strokes, median: 7–9 strokes, high: > 9 strokes. For nearest fixation distance and in-word position, low: 0–1 character, median: 1–2 characters, high: > 2 characters. AIC+ represents the amount of AIC increase when an independent variable was removed from the model. These definitions were also used for all of the other tables.
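The low, median, and high categories in the note can be sketched as a simple binning function. The dictionary keys and function name are hypothetical. Because the note lists adjacent ranges that share an endpoint (for example 0–20 and 20–180), the sketch assumes that each upper bound is inclusive, which the original does not specify.

```python
# Upper bounds of the "low" and "median" bins for each property, taken from
# the table note. Treating the bounds as inclusive is an assumption.
BIN_EDGES = {
    "word_frequency": (20, 180),          # occurrences per million
    "character_frequency": (300, 1000),   # occurrences per million
    "predictability": (0.1, 0.5),
    "character_complexity": (6, 9),       # strokes
    "nearest_fixation_distance": (1, 2),  # characters
}


def bin_label(prop: str, value: float) -> str:
    """Return "low", "median", or "high" for a value of the given property."""
    low_max, median_max = BIN_EDGES[prop]
    if value <= low_max:
        return "low"
    if value <= median_max:
        return "median"
    return "high"


print(bin_label("word_frequency", 100))     # median
print(bin_label("character_complexity", 12))  # high
```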

Effects of word properties

There were spillover effects from word n−1 on word n for frequency and predictability, and inverse spillover effects for word length: gaze durations on word n decreased when word n−1 was more frequent, more predictable, or longer. The effects of frequency and predictability are similar to those found in English reading (e.g., Pollatsek et al., 2008; Rayner et al., 2004; Rayner et al., 2006; White, 2008) and German reading (e.g., Kliegl et al., 2004; Kliegl et al., 2006). The inverse spillover effect for length has also been reported for German (Kliegl et al., 2006), and may be related to skipping of word n−1. If word n−1 is short, it will be more likely to be skipped, so fixations are longer on word n when it is fixated (Rayner, 2009; Rayner et al., 2011).

There were normal effects of word n frequency, predictability, and length. Gaze duration on word n decreased with the increase of word frequency and predictability of word n, and with the decrease of the length of word n. These effects are similar to those found in English reading (Pollatsek et al., 2008; Rayner, 2009; Rayner et al., 2004; Rayner et al., 2011; Slattery et al., 2007; White, 2008), and previously reported in Chinese reading (Li, Liu, & Rayner, 2011; Rayner et al., 2005; G. Yan et al., 2006). These effects reflect that the properties of word n affect gaze duration on word n. Gaze duration was longer when word n was more difficult.

There was also a parafoveal-on-foveal effect of predictability. Gaze duration on word n decreased with the increase of the predictability of word n+1. This effect is similar to that reported in German (Kliegl, 2007; Kliegl et al., 2006). The distance of the last fixation to the left of word n affected gaze duration on word n: the longer the saccade, the longer the gaze duration.

In summary, the effects of word properties on gaze durations in Chinese reading appear to be completely analogous to those found in alphabetic languages. This includes not just the effects of word n on gaze durations, as has previously been reported, but also extends to effects of the two adjacent words. Notably, this pattern of results holds despite the fact that this word gaze duration model also includes character properties, meaning that the results cannot be easily explained as effects of character properties that happen to be correlated with the word properties. (A separate analysis not reported here in which only the word properties were included in the model revealed exactly the same qualitative pattern of effects, providing further evidence that these effects are not being driven by correlations between word and character properties.) Moreover, when the word properties were removed from the full model, the fit was significantly poorer than the full model [χ2 (9) = 208.74, p < .001], suggesting that word properties do affect gaze duration on words in Chinese reading in the same ways as in languages with alphabetic scripts.

In Table 1, we also report the results of the LRT statistic and the increase of AIC when removing one of the variables from the model. The results are generally consistent with the results reported above. Hence, we put these values in the tables as a reference for interested readers, but will not discuss them further. In the table, we also report the mean values of the dependent variable at three ranges of values for each independent variable without further discussion.
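For readers who wish to relate the two model-comparison columns, the sketch below shows how a likelihood ratio χ2 maps onto a p value and onto an AIC difference. It assumes that each predictor contributes a single fixed-effect parameter (1 degree of freedom) and that both nested models were fit by maximum likelihood; neither assumption is stated explicitly in this section. The function names are illustrative.

```python
import math

def lrt_p_value_1df(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)), so the standard library suffices.
    """
    return math.erfc(math.sqrt(chi2 / 2.0))

def aic_increase(chi2, n_removed=1):
    """AIC(reduced) - AIC(full) for nested maximum-likelihood fits.

    AIC = 2k - 2 log L, so dropping n_removed parameters changes AIC by
    chi2 - 2 * n_removed, where chi2 = 2 * (log L_full - log L_reduced).
    """
    return chi2 - 2.0 * n_removed

# Chi-square values copied from Table 1.
for name, chi2 in [("word n-1 frequency", 4.36),
                   ("word n length", 4.72),
                   ("word n frequency", 23.73)]:
    print(name, round(lrt_p_value_1df(chi2), 3), round(aic_increase(chi2), 2))
```

Under these assumptions, the χ2 values of 4.36 and 4.72 reproduce the tabulated p values of .037 and .030, and the χ2 of 23.73 for the frequency of word n corresponds to an AIC increase of 21.73, close to the whole-number value of 22 given in the table.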

Effects of character properties

At the same time, the model also revealed effects of character properties on eye movements. Gaze durations were significantly longer when the character preceding the word or the characters within the word were more complex. None of the effects of other character properties were significant. While the complexity of characters in the current word has previously been demonstrated to affect duration measures on the word (e.g., H. Yang & McConkie, 1999), this is the first demonstration that the complexity of characters in word n−1 also affects gaze duration on word n. It is somewhat surprising that we did not see a reliable effect of character frequency here, as previous results have shown an effect of character frequency independent of word frequency (G. Yan et al., 2006). However, because there is a substantial correlation of these two variables in our naturalistic stimuli, it is possible that the analysis did not have the power to establish this effect. Finally, note that the model with character properties predicts the data significantly better than models without character properties [χ2 (6) = 46.46, p < .001].

Fixation durations on characters

Above, we showed that gaze durations on a word are affected by its properties and the properties of the surrounding words. It may be argued, however, that word properties played such a prominent role in the model because gaze duration is a measure defined in terms of a word. Because of this possibility, we also examined first fixation durations on individual characters. In this model, we included the same word properties as used previously (the frequency, predictability, and length of words n−1, n, and n+1), but defined the character properties in relation to the point of fixation, including the complexities and frequencies of characters n−1, n, n+1, and n+2 (where character n is the character of interest). The properties of these four characters were selected since they fall within the perceptual span, the range of characters known to have robust effects on eye movements in Chinese reading (Inhoff & Liu, 1998). We also included two other factors in the analyses: previous saccade length and the position of the character within a word. The results of the analysis are presented in Table 2.

Table 2.

Linear mixed-effects regression results for fixation duration

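Column groups: Model (b, SE, t); Values in ms (Low, Median, High); Model comparison (AIC+, χ2, p).

| Predictor | Property | b | SE | t | Low | Median | High | AIC+ | χ2 | p |
|---|---|---|---|---|---|---|---|---|---|---|
| (Intercept) | | 259.74 | 12.32 | 21.08 | | | | | | |
| Word n−1 | Frequency | −1.69 | .52 | −3.24 | 249 | 248 | 247 | 8 | 10.59 | .001 |
| Word n−1 | Predictability | −1.61 | .59 | −2.748 | 249 | 244 | 248 | 5 | 7.55 | .006 |
| Word n−1 | Length | −8.87 | 1.75 | −5.07 | 248 | 248 | 240 | 16 | 25.75 | .000 |
| Word n | Frequency | −3.91 | .56 | −7.02 | 254 | 250 | 241 | 47 | 48.94 | .000 |
| Word n | Predictability | −4.42 | .64 | −6.93 | 252 | 241 | 231 | 45 | 47.84 | .000 |
| Word n | Length | −8.36 | 1.62 | −5.18 | 242 | 251 | 246 | 24 | 26.86 | .000 |
| Word n+1 | Frequency | .36 | .53 | .68 | 247 | 250 | 247 | 2 | .46 | .499 |
| Word n+1 | Predictability | −2.13 | .60 | −3.593 | 248 | 247 | 247 | 11 | 13.00 | .000 |
| Word n+1 | Length | 2.38 | 1.69 | 1.41 | 247 | 249 | 248 | 0 | 2.00 | .157 |
| Character n−1 | Frequency | −1.63 | .56 | −2.89 | 252 | 249 | 246 | 6 | 8.40 | .004 |
| Character n−1 | Complexity | −.08 | .27 | −.31 | 246 | 249 | 249 | 2 | .089 | .765 |
| Character n | Frequency | −.10 | .67 | −.15 | 259 | 250 | 244 | 2 | .01 | .909 |
| Character n | Complexity | 1.43 | .28 | 5.04 | 243 | 249 | 255 | 23 | 25.48 | .000 |
| Character n+1 | Frequency | .09 | .53 | .18 | 252 | 248 | 247 | 2 | .02 | .887 |
| Character n+1 | Complexity | .13 | .27 | .48 | 247 | 248 | 249 | 2 | .22 | .638 |
| Character n+2 | Frequency | 1.47 | .47 | 3.13 | 247 | 246 | 249 | 7 | 9.85 | .002 |
| Character n+2 | Complexity | .00 | .26 | .01 | 248 | 250 | 245 | 1 | .55 | .457 |
| Nearest fixation distance | | 5.53 | .51 | 10.78 | 247 | 252 | 237 | 95 | 95.33 | .000 |
| In-word position | | −5.68 | 1.38 | −4.11 | 248 | 247 | 249 | 15 | 17.01 | .000 |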

Effects of word properties

The results of this analysis showed a nearly identical qualitative pattern of results to the word gaze duration analysis. The only difference between the two is the effect of the length of the currently fixated word.6 Specifically, in the word gaze duration model, longer words received longer gaze durations, but in the character-based analysis, fixations on characters had shorter durations when the word was longer. Given that longer words are more likely to receive multiple fixations, this may be a result analogous to that known in other languages, in which each of two fixations on a word when it is fixated twice will be shorter than a single fixation made on the word (Kliegl et al., 2006; Schilling, Rayner, & Chumbley, 1998).

Effects of character properties

The pattern of effects of character properties on individual fixations was quite different from that for word gaze durations. Presumably, this is at least partially related to the fact that character properties are defined differently: whereas previously we examined the effects of properties of the character before the word, the character after the word, and the average properties of characters within the word, here we examined the effects of the properties of the fixated character, the character to its left, and the two characters to its right. First, fixation duration on character n decreased as the frequency of character n−1 increased, a spillover effect completely analogous to the effect of the frequency of word n−1. Second, fixation duration was also affected by the complexity of character n, but not by the frequency of character n; fixation duration increased as the complexity of character n increased. Third, none of the other properties of any characters to the right of the fixation affected the fixation duration on character n except the frequency of character n+2; fixation duration on character n increased as the frequency of character n+2 increased.

It is interesting that fixation duration on character n was not affected by character frequency of characters n and n+1, but by the frequency of characters n−1 and n+2. The explanation for this pattern of results is unclear, but one possible explanation relates to the notion that character frequencies may be less relevant for the fixated word, but more relevant for non-fixated words, for which all of the characters may not be visible (cf. Li et al., 2009).

Fixation duration was also affected by incoming saccade length; the longer the saccade, the longer the fixation duration on character n. The position of a character in a word also affected the fixation duration on character n. Fixation durations on the character were longer when the fixation was on a character at the beginning of a word than at the end of a word.

Fixation probability on words

Given that evidence from initial fixation locations within words in Chinese does not suggest a word-based targeting mechanism (Li et al., 2011), one possibility is that the properties of Chinese words influence primarily the “when” component of eye movement control, and have less influence on the “where” component. To investigate this possibility, we performed two analyses analogous to those described above on fixation location measures: word and character fixation probability. The first of these is a model of fixation probability on words, which includes as independent variables the frequency, predictability, and length of word n−1, word n, and word n+1; the complexity and (log-transformed) frequency of the characters before and after the word; the average complexity and (log-transformed) frequency of the characters within the word; and the distance from the current word to the nearest last fixation.

Fixation probability was affected by word properties. The properties of word n−1 affected fixation probabilities on word n. Fixation probability on word n decreased with increasing predictability and length of word n−1. Fixation probability on word n also decreased with increasing predictability of word n, and with decreasing length of word n. Interestingly, there were stable parafoveal-on-foveal effects for each property of word n+1 we investigated. Fixation probability on word n was lower for more predictable and longer word n+1, but higher for more frequent word n+1. (Each of these effects was qualitatively identical in a separate model that did not include character properties.)

Fixation probability was also affected by the complexity of the characters belonging to the word and its surrounding characters. Words with more complex characters within the word or directly preceding it were more likely to be fixated, and words with a more complex character directly following were less likely to be fixated. This suggests that the more complex a word’s characters are, the more likely the word is to be fixated. The fact that there is a significant effect of the character immediately following the word suggests some word-level parallelism in Chinese reading; however, it is unclear why this effect would be in the opposite direction of that for the character immediately preceding the word. No significant effects of character frequency were found. Finally, and unsurprisingly, words were less likely to be fixated the closer the previous fixation was to the word.
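Because word fixation probability was modeled with logistic regression, the coefficients in Table 3 are on the log-odds scale. The sketch below shows one way to read them: as odds ratios, and as the change they imply from a given baseline probability. The baseline of 0.5 is hypothetical and chosen only for illustration, and a "unit" of each predictor is whatever scale the predictor had when entered in the model (for example, log-transformed), which is not restated in this section.

```python
import math

def odds_ratio(b):
    """Multiplicative change in the odds of fixation per one-unit increase in a predictor."""
    return math.exp(b)

def inv_logit(x):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def shifted_probability(baseline_p, b, delta=1.0):
    """Probability after moving a predictor by delta units, starting from baseline_p."""
    baseline_logit = math.log(baseline_p / (1.0 - baseline_p))
    return inv_logit(baseline_logit + b * delta)

# Coefficients copied from Table 3 (log-odds scale).
print(odds_ratio(0.99))    # length of word n
print(odds_ratio(-0.28))   # predictability of word n-1
# Hypothetical baseline probability of 0.5, for illustration only.
print(shifted_probability(0.5, 0.99))
```

An odds ratio above 1 (as for the length of word n) means the word becomes more likely to be fixated as the predictor increases, and an odds ratio below 1 (as for the predictability of word n−1) means it becomes less likely.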

Character fixation probability

Character fixation probability is a good index of landing position, so it is important to explore how word properties and character properties affect it. In this model, character fixation probability was the dependent variable. The independent variables were the frequency, predictability, and length of words n−1, n, and n+1; the complexity and frequency of characters n−1, n, n+1, and n+2; the distance to the nearest fixation to the left of the character; and the position of the character within the word.

There was a spillover effect of word length and word predictability. The fixation probability of character n decreased with an increase of the predictability of word n−1, and decreased with the increase of the length of word n−1. The properties of the word containing the character of interest also affected its fixation probability. Fixation probability decreased with the increase of the word’s frequency, predictability, and its length. The effects of frequency and predictability may be interpreted as reflecting the fact that more frequent and predictable words are themselves less likely to be fixated (see previous analysis). The effect of word length is more interesting, and completely analogous to that reported above for fixation durations on characters. It may suggest that characters belonging to a word are processed as a unit in Chinese, as it means that longer words receive fewer fixations per character than shorter ones. There was also evidence of a parafoveal-on-foveal effect. The fixation probability on character n decreased with increasing predictability and (marginally) frequency of word n+1.

Character fixation probability was also affected by character properties. Specifically, fixation probability decreased with increasing frequency and decreasing complexity of character n−1. When character n−1 is easier to process (when character frequency is high or character complexity is low), it will be more likely to be processed in parafoveal vision, and hence will be less likely to be fixated. As a result, character n−1 will be more likely to be skipped, and so the eyes will land at character n, and hence character n will be more likely to be fixated. The complexity of character n affected the probability of being fixated, more strokes meaning more likely fixations, but character frequency did not. It is possible that characters with fewer strokes can be recognized via parafoveal vision so that they are fixated less often, or possibly that readers direct their eyes to locations of especially high visual complexity for efficient foveal processing.

In this analysis, none of the other properties of the words or characters to the right of the character affected the fixation probability except the predictability of word n+1. The more predictable word n+1 was, the less likely character n was to be fixated. Fixation probability was also affected by the distance between the previously fixated character and the target character. The longer the distance, the more likely a character was to be fixated. The effect of character position within a word did not reach significance, suggesting that it did not affect the probability of a character being fixated. This is consistent with previous work (Li et al., 2011).

To summarize, character fixation probabilities were mainly affected by the properties of the target character and the properties of the words and characters to the left of the target character, as well as by the predictability of the following word. The fixation probabilities were determined by both the properties of words and those of the characters.

Saccade length

Our final analysis is of forward saccade length, which measures how long a saccade travels after leaving a fixated position of interest. By being based on properties of the character at the beginning of the saccade rather than the end, forward saccade length reflects information about where to move the eyes from a different perspective than the previous fixation probability analysis.
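To make the measure concrete, the following minimal sketch computes forward saccade lengths from a sequence of fixated character positions. The input format (a list of character indices in temporal order) and the function name are hypothetical and chosen for this illustration; they are not a description of the corpus format used here.

```python
def forward_saccade_lengths(positions):
    """For each fixation except the last, return the length in characters of the
    outgoing saccade, or None when that saccade does not move forward.

    positions: character indices of successive fixations, in temporal order
    (a hypothetical input format chosen for this sketch).
    """
    lengths = []
    for current, nxt in zip(positions, positions[1:]):
        step = nxt - current
        # Regressions and refixations of the same character are not forward saccades.
        lengths.append(step if step > 0 else None)
    return lengths

# Hypothetical fixation sequence: characters 0, 2, 5, a regression to 4, then 7.
print(forward_saccade_lengths([0, 2, 5, 4, 7]))
```

Each value is indexed by the launch position, so the predictors of interest are the properties of the word and characters at and to the right of the fixation from which the saccade departs.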

The results of this model can be stated simply: readers made longer saccades when the current word, the next word, and the next two characters were easier to process (in predictability and frequency for words, and in frequency and complexity for characters), and also when the current word was longer. Specifically, saccade length increased with the increase of the frequency and the increase of the length of word n, which is consistent with the results of a recent experiment (Wei, Li, & Pollatsek, 2013). Saccade length also increased with the increase of the frequency and the predictability of word n+1. Saccade length was also affected by the properties of the characters to the right of fixation; saccade length increased with the increase of the frequencies of characters n+1 and n+2, and increased with the decrease of the complexity of characters n+1 and n+2. All these effects of predictability, frequency, and complexity suggest that easier words and characters are more likely to be skipped, and demonstrate that at least some processing of these items occurs on the previous fixation. Finally, saccade lengths were also affected by the length of the last saccade; the saccade was longer if the last saccade was long. These results provide further support for the notion that the saccade targeting system in Chinese is sensitive to both word and character properties.

Discussion

Chinese orthography is quite different from most alphabetic scripts: words are not spatially segmented and the individual characters composing words can be very complex. Because of this, it has been suggested that reading in Chinese may operate in a qualitatively different fashion from reading in alphabetic languages, in which words play a dominant role. In this study, we sought to advance our knowledge of reading in Chinese by systematically characterizing the ways in which both word and character properties affect the eye movement record in Chinese. To do this, we fit a series of generalized linear mixed-effects models to a large corpus of Chinese reading eye movements. The results of these analyses provided evidence for a wide range of effects of both word and character properties on both word- and character-defined measures.

At the outset of this article, we described three goals of this work. The first goal was to assess how word properties such as length, frequency, and predictability affect eye movement behavior in Chinese, and to compare the pattern of results to those found in alphabetic languages. Specifically, we examined how the properties of the current, previous, and following words affect eye movements in Chinese, parallel to the investigation of these properties performed on German by Kliegl et al. (2006). In an analysis of word gaze durations in a corpus of eye movements in Chinese reading, we showed effects of the length, frequency, and predictability of words n−1, n, and n+1 that replicate those found by Kliegl et al. (2006) for German. Specifically, we found standard effects of all three properties of word n, spillover effects of the frequency and predictability of word n−1, inverse spillover effects of the length of word n−1, and parafoveal-on-foveal effects of the predictability of word n+1. This pattern of effects is identical to that obtained by Kliegl et al. (2006), except that we failed to detect effects of the frequency or length of word n+1 (and found only effects due to its predictability). Our analysis of word fixation probabilities generally echoed these findings, with standard effects of the properties of word n, spillover effects from word n−1, and parafoveal-on-foveal effects of word n+1. Because of the major role that word properties are known to play in alphabetic languages, this demonstration that the properties of the previous, current, and following words affect eye movements in Chinese reading in such a similar way as in alphabetic languages like German provides evidence for a word-based core of reading that is shared across languages with highly dissimilar scripts. 
That is, it appears the clearly larger role of character processing in Chinese does not alter the fundamental nature of reading, but rather that word-based processes completely analogous to those in languages with alphabetic scripts underlie Chinese reading.

The second goal we set out for this work was to provide one of the strongest tests to date of whether word properties have effects on Chinese reading above and beyond character properties. We tested for this in two ways. First, we included a range of character properties in our regression models, and second, we performed analyses on dependent measures defined in terms of words as well as in terms of characters. The general pattern of effects of word properties on word-based measures such as gaze duration and word fixation probability was very reliable. They were significant and remained qualitatively similar whether or not character properties were included in the model. Further, when we performed analogous analyses on character-based dependent measures (character fixation duration and character fixation probability), the pattern of effects of word properties looked nearly identical to the results obtained for word-based dependent measures. Finally, effects of word properties were also apparent when analyzing the length of forward saccades: saccades were longer when words n and n+1 were more frequent, more predictable, and longer. In summary, the pattern of effects of the properties of words n−1, n, and n+1 appears to be highly robust in our dataset, remaining significant with and without character properties included in the model and even for character-defined dependent measures. Crucially, in all cases, this pattern highly resembles that found in languages with spaced alphabetic scripts, providing further evidence for underlying similarity between reading processes across languages with highly dissimilar scripts.

The final goal we set out for this work was to document the full pattern of effects of both word and character properties on a range of eye movement measures in Chinese reading, in order to provide “benchmark phenomena” (Reichle et al., 2003) on which to evaluate future models of reading in Chinese. In addition to the effects of word properties already described, which look similar to those found in other languages, our analyses documented a range of effects of character properties on eye movements in reading. We saw evidence for character complexity – a low-level visual property of characters – both increasing fixation durations and affecting saccade targeting by attracting fixations. Additionally, our analyses demonstrated that higher character frequency led to shorter fixations and also affected saccade targeting. For both types of character properties, we saw evidence of properties of non-fixated characters affecting eye movements, yielding a complex pattern worthy of further study.7 Our analyses also revealed effects of previous saccade length that replicate those found in languages with alphabetic scripts (e.g., Kliegl et al., 2006). Finally, while we replicated the findings of other studies that Chinese readers are no more likely to fixate any specific position within a word (e.g., Li et al., 2011), we did find that fixation position within a word does affect fixation duration and outgoing saccade length, suggesting that within-word position is relevant for Chinese readers. Taken together, these findings provide a rich set of data suitable for the development of future models of eye movement control in Chinese reading.

Implications for modeling eye movement control in Chinese reading

Given the foregoing summary of our results, we describe in this section their implications for building computational models of eye movements in Chinese reading. On the one hand, the fact that effects of word properties on eye movements in reading appear nearly identical between Chinese and alphabetic reading suggests that models of eye movement control originally developed for alphabetic languages, many of them word-based, may serve as useful starting points for modeling Chinese reading. This result is consistent with that of Rayner et al. (2007), who fit the E-Z Reader model to Chinese reading data and showed that it can capture a number of aspects of Chinese reading. On the other hand, our results also catalogued a number of effects that cannot be explained by current models of reading in alphabetic languages. We divide these effects into those that affect the durations of fixations on the currently fixated word or character and those that affect saccade targeting decisions about where to move the eyes forward, and we describe each of these next. It is also clear from the fact that words in Chinese are not delimited by spaces, and thus require online word segmentation, that some architectural change is required when applying existing models of reading to Chinese. We discuss the architectural possibilities below.

How do Chinese readers decide how long to continue fixating a word or character? Across our duration analyses, we showed that durations were longer when the current character was more complex, when the characters in the current word had higher average complexity, and when the character preceding the current word was more complex. We also saw effects of the frequency of characters on the edge of the Chinese perceptual span. While we did not find evidence in our analyses for the frequency of characters within the current word having an effect on durations above and beyond word frequency, such effects have been reported for Chinese in controlled experiments (G. Yan et al., 2006). It is possible that all of these effects on durations could be reproduced by existing word-based processing models such as E-Z Reader (Reichle et al., 1998) and SWIFT (Engbert et al., 2005) by changing the word processing functions used by the models. Specifically, these models currently assume that the time taken to process a word is a function only of its frequency, predictability, and the eccentricity of its letters from the position of fixation (if we ignore the influences of adjacent words). Word processing functions for Chinese would need to be extended to allow for an interaction with character-level processing independent of words to reproduce character frequency effects, and for an interaction with the visual system to reproduce effects of character visual complexity. Such an extension of a model like E-Z Reader or SWIFT should be able to reproduce all the effects of the characters in the current word, but it remains to be seen if it would be able to reproduce effects of characters in adjacent words. To the extent that these models cannot, it may indicate that processing in Chinese at the character level is more parallel than in other languages (perhaps demanded by the necessity of online word segmentation), and may require a different model architecture.
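The kind of extension described above can be illustrated with a toy word processing function. The sketch below is not the E-Z Reader or SWIFT equation: the functional form, the parameter names, and every numerical value are made up for illustration. It only shows how a function of word frequency, predictability, and eccentricity could be given optional character-level terms for character frequency and stroke count.

```python
import math

def mean(values):
    """Arithmetic mean, or 0.0 for an empty sequence."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0

def processing_time(word_freq, predictability, eccentricity,
                    char_freqs=(), char_strokes=(),
                    base=250.0, w_word_freq=10.0, w_pred=50.0, ecc_rate=1.1,
                    w_char_freq=0.0, w_strokes=0.0):
    """Toy processing time (ms) for one word; all default values are made up.

    With w_char_freq = w_strokes = 0 the result depends only on word frequency,
    predictability, and eccentricity. Nonzero weights add character-level terms.
    """
    word_part = base - w_word_freq * math.log(word_freq) - w_pred * predictability
    char_part = (-w_char_freq * mean(math.log(f) for f in char_freqs)
                 + w_strokes * mean(char_strokes))
    # Processing slows multiplicatively with distance from the point of fixation.
    return max(0.0, word_part + char_part) * ecc_rate ** eccentricity

# Hypothetical two-character words that differ only in stroke count.
word_only = processing_time(100.0, 0.2, 1.0)
simple = processing_time(100.0, 0.2, 1.0, char_strokes=[3, 4], w_strokes=2.0)
complex_ = processing_time(100.0, 0.2, 1.0, char_strokes=[12, 15], w_strokes=2.0)
print(word_only, simple, complex_)
```

In this toy version, setting the character weights to zero recovers a purely word-based function, so the character-level terms can be evaluated as an increment over the word-based baseline.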

How do character properties in Chinese affect decisions about where to move the eyes forward? In terms of decisions about which words to fixate, it seems that reading in Chinese operates very similarly to that in other languages, as word properties affect word and character fixation probabilities in similar ways. While current models of reading in alphabetic languages can account for these effects, there are a number of results that are more problematic for these models. The fact that character properties within a word affect its fixation probability may also be explicable in terms of models such as E-Z Reader and SWIFT by changing the word processing functions, as described above. It is possible that such a modification of word processing functions would be all that is required to capture these effects, but it is also possible that character properties such as complexity influence saccade targeting in a way not mediated by word processing. The absence of a preferred viewing location in Chinese (Li et al., 2011; Tsai & McConkie, 2003; H. Yang & McConkie, 1999), which was also true in our dataset, provides some evidence that saccade targeting may operate in a very different manner in Chinese. In order to capture this effect, a model of reading in Chinese would require a very different architecture that is not solely word-based, as is also required to segment words.

To summarize, the future is promising for modeling Chinese reading data. Our results indicate that the underlying reading architecture may be quite similar across scripts and languages, meaning that computational models of eye movements in reading developed for alphabetic languages may serve as useful starting points in developing models of reading in Chinese. To capture the range of effects we have documented in this analysis, however, such models would need to be augmented, at minimum, in two ways. First, the simple word processing functions used in these models would need to be replaced by models of word processing that involve processing at the character and visual levels. (Note that this first step is also required for modeling reading in alphabetic languages in order to reproduce effects such as those of visual neighborhood size.) Second, the word targeting mechanism must be changed to be sensitive to the fact that words to the right of fixation are not spatially segmented, and the model must include a model of word segmentation. Future research will determine whether these modifications are sufficient to capture the range of effects we show here, or whether, as mentioned above, a model of reading in Chinese may require more architectural changes.

As mentioned above, one architectural change demanded of any model of reading in Chinese is the need to segment and process words simultaneously. One possible way to implement such an architecture is given by the model of Li et al. (2009). In that model, character processing continues on all characters in the perceptual span simultaneously, but multiple word units compete for a single winner for lexical access, suggesting that only one word is being processed (at the word level) at a given time. Word identification then entails segmentation of the identified word, and the reference character (and thus word processing) is advanced. Such an architecture could naturally combine with a serial word-based model of eye movements in reading such as E-Z Reader. Another architectural possibility is that suggested by Bicknell and Levy (2010). In that model, reading is not taken to be explicitly word-based, but rather, readers work to identify all the text about which they have received useful visual information via Bayesian inference, combining the visual information with probabilistic knowledge of the statistics of the language. Many signatures of word-based reading still appear in the model’s reading behavior, however, because words are important units in the statistical structure of language. Such an architecture works without modification for a script without spaces, such as in Chinese. In that case, the model’s Bayesian inference component would solve the identification problem simultaneously with the segmentation problem. Future work is required to establish whether either of these architectures would provide a useful characterization of reading in Chinese.

Conclusion

In conclusion, we presented evidence based on a range of analyses that word-based processes underlie reading behavior in Chinese, in a way highly analogous to languages with alphabetic scripts. Specifically, we showed that the effects of the properties of the current, previous, and next words are strikingly similar between Chinese and alphabetic languages on a range of eye movement measures. Despite the fact that words are not spatially segmented in Chinese, and that characters are themselves complex visual objects, our results suggest that reading appears just as reliant on words in Chinese as in other languages. In addition, we documented a rich pattern of effects of character properties, which demonstrate the need for developing new models of reading in Chinese. This first attempt at systematic characterization of the effects of word and character properties in Chinese in and of itself advances our knowledge of the processes underlying reading in Chinese, and we hope it will inform the future development of models of reading in the language, and eventually contribute to understanding how reading behavior varies with script and to articulating language-universal models of reading.

Table 3.

Logistic mixed-effects regression results for word fixation probability

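| Predictor | Model: b | Model: SE | Model: z | Model: p | Values: low | Values: median | Values: high | Comparison: AIC+ | Comparison: χ2 | Comparison: p |
|---|---|---|---|---|---|---|---|---|---|---|
| (Intercept) | −5.75 | .31 | −18.78 | .000 | | | | | | |
| word n−1 frequency | .02 | .02 | .94 | .35 | .43 | .49 | .54 | −1 | .88 | .348 |
| word n−1 predictability | −.28 | .02 | −16.86 | .000 | .49 | .50 | .54 | 294 | 296.0 | <.001 |
| word n−1 length | −.21 | .04 | −4.68 | .000 | .55 | .49 | .36 | 20 | 22.12 | <.001 |
| word n frequency | .03 | .02 | 1.22 | .22 | .71 | .67 | .40 | −1 | 1.48 | .223 |
| word n predictability | −.18 | .02 | −11.49 | .000 | .58 | .47 | .36 | 132 | 133.78 | <.001 |
| word n length | .99 | .08 | 12.66 | .000 | .34 | .69 | .88 | 158 | 159.79 | <.001 |
| word n+1 frequency | .06 | .02 | 3.62 | .000 | .49 | .48 | .52 | 11 | 13.16 | <.001 |
| word n+1 predictability | −.15 | .02 | −9.70 | .000 | .48 | .52 | .56 | 93 | 95.48 | <.001 |
| word n+1 length | −.11 | .04 | −2.57 | .010 | .54 | .49 | .46 | 5 | 6.63 | .010 |
| character before word: frequency | .01 | .02 | .51 | .610 | .47 | .48 | .52 | −2 | .26 | .610 |
| character before word: complexity | .05 | .00 | 7.09 | .000 | .50 | .53 | .48 | 48 | 50.36 | <.001 |
| average of characters of word n: frequency | −.01 | .03 | −.21 | .840 | .65 | .62 | .46 | −2 | .04 | .837 |
| average of characters of word n: complexity | .06 | .01 | 6.88 | .000 | .45 | .51 | .62 | 45 | 47.52 | <.001 |
| character after word: frequency | −.02 | .02 | −1.34 | .180 | .49 | .48 | .52 | 0 | 1.80 | .180 |
| character after word: complexity | −.02 | .01 | −2.66 | .008 | .50 | .53 | .48 | 5 | 7.06 | .008 |
| nearest fixation distance | 2.40 | .02 | 98.53 | .000 | .09 | .97 | .97 | 30334 | 30336 | <.001 |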
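The two leading model-comparison columns of Table 3 appear to be linked by a simple identity, which the sketch below checks. It assumes that AIC+ denotes the decrease in AIC obtained by adding the predictor as a single extra parameter, and that χ2 is the corresponding likelihood-ratio statistic; the table itself states neither, so both are assumptions made here. Under the standard definition AIC = 2k − 2 ln L (Akaike, 1974), adding one parameter lowers AIC by χ2 − 2. The helper name `aic_improvement` is hypothetical.

```python
# (predictor, reported AIC+, reported chi-square), copied from Table 3.
TABLE3 = [
    ("word n-1 frequency", -1, 0.88),
    ("word n-1 predictability", 294, 296.0),
    ("word n-1 length", 20, 22.12),
    ("word n frequency", -1, 1.48),
    ("word n predictability", 132, 133.78),
    ("word n length", 158, 159.79),
    ("word n+1 frequency", 11, 13.16),
    ("word n+1 predictability", 93, 95.48),
    ("word n+1 length", 5, 6.63),
    ("character before word: frequency", -2, 0.26),
    ("character before word: complexity", 48, 50.36),
    ("characters of word n: frequency", -2, 0.04),
    ("characters of word n: complexity", 45, 47.52),
    ("character after word: frequency", 0, 1.80),
    ("character after word: complexity", 5, 7.06),
    ("nearest fixation distance", 30334, 30336.0),
]


def aic_improvement(chi2, extra_params=1):
    """Decrease in AIC when `extra_params` added parameters raise 2*logLik by chi2."""
    return chi2 - 2 * extra_params


for name, reported, chi2 in TABLE3:
    implied = aic_improvement(chi2)
    print(f"{name:36s} reported {reported:6d}   implied {implied:9.2f}")
```

Read this way, a predictor improves the model by the AIC criterion only when its χ2 exceeds 2, which is why the non-significant frequency predictors show zero or negative AIC+ values.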

Table 4.

Logistic mixed-effects regression results for the probability of a character being fixated in first pass

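| Predictor | Model: b | Model: SE | Model: z | Model: p | Values: low | Values: median | Values: high | Comparison: AIC+ | Comparison: χ2 | Comparison: p |
|---|---|---|---|---|---|---|---|---|---|---|
| (Intercept) | −.10 | .16 | −.64 | .525 | | | | | | |
| word n−1 frequency | −.01 | .01 | −1.29 | .199 | .38 | .38 | .38 | 0 | 1.64 | .200 |
| word n−1 predictability | −.03 | .01 | −3.81 | .000 | .39 | .38 | .37 | 13 | 14.39 | .000 |
| word n−1 length | −.11 | .02 | −4.74 | .000 | .39 | .38 | .34 | 12 | 22.22 | .000 |
| word n frequency | −.04 | .01 | −5.38 | .000 | .40 | .40 | .36 | 27 | 28.45 | .000 |
| word n predictability | −.04 | .01 | −5.16 | .000 | .40 | .36 | .34 | 25 | 26.33 | .000 |
| word n length | −.11 | .02 | −4.88 | .000 | .36 | .40 | .38 | 24 | 23.52 | .000 |
| word n+1 frequency | −.01 | .01 | −1.68 | .094 | .38 | .38 | .38 | 1 | 2.79 | .095 |
| word n+1 predictability | −.02 | .02 | −2.08 | .037 | .39 | .38 | .37 | 3 | 4.31 | .038 |
| word n+1 length | −.03 | .02 | −1.43 | .154 | .38 | .38 | .38 | 0 | 2.02 | .155 |
| character n−1 frequency | −.04 | .01 | −4.95 | .000 | .41 | .39 | .37 | 23 | 24.15 | .000 |
| character n−1 complexity | .01 | .00 | 3.01 | .003 | .37 | .39 | .41 | 2 | 9.02 | .003 |
| character n frequency | .00 | .01 | .48 | .631 | .42 | .41 | .37 | 1 | .23 | .632 |
| character n complexity | .03 | .00 | 7.08 | .000 | .37 | .38 | .42 | 47 | 49.06 | .000 |
| character n+1 frequency | −.00 | .01 | −.32 | .750 | .40 | .38 | .38 | 1 | .10 | .751 |
| character n+1 complexity | −.00 | .00 | −.40 | .687 | .38 | .38 | .39 | 1 | .16 | .688 |
| character n+2 frequency | .00 | .01 | .49 | .624 | .39 | .38 | .38 | 1 | .30 | .583 |
| character n+2 complexity | −.00 | .01 | −1.27 | .206 | .38 | .38 | .38 | 0 | 1.59 | .207 |
| nearest fixation distance | .02 | .00 | 8.03 | .000 | .22 | .46 | .53 | 61 | 62.90 | .000 |
| in word position | −.02 | .02 | −1.27 | .205 | .37 | .40 | .36 | 0 | 1.58 | .209 |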

Table 5.

Linear mixed-effects regression results for forward saccade length

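| Predictor | Model: b | Model: SE | Model: t | Values (characters): low | Values (characters): median | Values (characters): high | Comparison: AIC+ | Comparison: χ2 | Comparison: p |
|---|---|---|---|---|---|---|---|---|---|
| (Intercept) | 2.02 | .23 | 8.90 | | | | | | |
| word n−1 frequency | −.00 | .01 | −.88 | 2.73 | 2.69 | 2.75 | 1 | .76 | .387 |
| word n−1 predictability | .00 | .01 | −.18 | 2.72 | 2.71 | 2.77 | 2 | .02 | .888 |
| word n−1 length | −.03 | .03 | −.93 | 2.73 | 2.74 | 2.68 | 1 | .86 | .353 |
| word n frequency | .03 | .01 | 2.75 | 2.72 | 2.74 | 2.73 | 6 | 7.62 | .006 |
| word n predictability | .02 | .01 | 1.60 | 2.70 | 2.78 | 2.80 | 1 | 2.60 | .11 |
| word n length | .10 | .03 | 3.42 | 2.68 | 2.72 | 2.85 | 10 | 11.79 | .001 |
| word n+1 frequency | .02 | .01 | 2.70 | 2.55 | 2.65 | 2.82 | 19 | 23.16 | .000 |
| word n+1 predictability | .02 | .01 | 2.31 | 2.63 | 2.79 | 2.89 | 4 | 5.41 | .020 |
| word n+1 length | .06 | .03 | 2.01 | 2.80 | 2.67 | 2.60 | 2 | 4.08 | .043 |
| character n−1 frequency | −.01 | .01 | −.98 | 2.69 | 2.80 | 2.71 | 1 | .97 | .326 |
| character n−1 complexity | .01 | .00 | 1.22 | 2.75 | 2.69 | 2.75 | 0 | 1.50 | .221 |
| character n frequency | −.00 | .01 | −.13 | 2.64 | 2.77 | 2.74 | 2 | .01 | .942 |
| character n complexity | −.01 | .00 | −1.08 | 2.77 | 2.70 | 2.69 | 1 | 1.17 | .279 |
| character n+1 frequency | .02 | .01 | 2.05 | 2.56 | 2.68 | 2.80 | 2 | 4.25 | .039 |
| character n+1 complexity | −.03 | .00 | −6.27 | 2.82 | 2.69 | 2.59 | 37 | 29.23 | .000 |
| character n+2 frequency | .04 | .01 | 4.56 | 2.53 | 2.66 | 2.81 | 19 | 20.87 | .000 |
| character n+2 complexity | −.01 | .00 | −2.99 | 2.79 | 2.72 | 2.61 | 7 | 9.03 | .003 |
| nearest fixation distance | .16 | .01 | 22.44 | 2.27 | 2.47 | 3.16 | 496 | 497.7 | .000 |
| in word position | .06 | .02 | 2.70 | 2.68 | 2.77 | 2.94 | 6 | 7.38 | .007 |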

Acknowledgments

This research was supported by the Knowledge Innovation Program of the Chinese Academy of Sciences (KSCX2-YW-BR-6), by a grant from the Natural Science Foundation of China (31070904), and by NIH grant HD065829. We thank Simon Liversedge and Alexander Pollatsek for their helpful discussion and comments.

Appendix 1: Material analyses

The 80 experimental sentences comprised 1305 words. Of these, 565 were 1 character long, 622 were 2 characters long, 56 were 3 characters long, and 62 were 4 characters long. Some of the words were used more than once, so only 779 different words were used (154 1-character words, 515 2-character words, 53 3-character words, and 57 4-character words).

When we analyzed the eye movement data, the following words were excluded: 1) any words including the first two characters or the last two characters of a sentence; 2) Arabic numerals; and 3) names of people or places. As a result, 953 words were included in the analyses (460 1-character words, 420 2-character words, 27 3-character words, and 46 4-character words). As noted above, some of the words were used more than once. Among the included words, there were 556 different words (128 1-character words, 361 2-character words, 26 3-character words, and 41 4-character words).

The properties of the words and characters are shown in Table A1. 48% of the words were 1-character words, 44% were 2-character words, 3% were 3-character words, and 5% were 4-character words. As in English, word frequency decreased as a function of word length (F(3,552) = 55.02, p < 0.001, MSE = 2,119,030).

The number of strokes differed across the four word lengths (F(3,552) = 5.93, p < 0.001, MSE = 5.66). There were fewer strokes for 1-character words than for longer words. There was a hint that character frequency was higher for 1-character words than for the characters of longer words (F(3,552) = 2.17, p = .09, MSE = 4,245,240). The properties of words were not independent of the properties of the characters constituting them. Word frequency was negatively correlated with the mean number of strokes of the characters of a word (−.16), which was significantly less than 0 (p < .001). Word frequency was positively correlated with mean character frequency (.64), which was significantly larger than 0 (p < .001). The mean number of strokes was negatively correlated with mean character frequency (−.37), which was significantly smaller than 0 (p < .001).
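The counts reported in this appendix can be cross-checked with a few lines of arithmetic. The sketch below recomputes the totals, the rounded percentages, and the degrees of freedom of the F tests. It assumes that the tests are one-way analyses over the different included words with word length as a four-level factor, which is consistent with the reported F(3,552) but is an inference rather than something the appendix states. Variable names are illustrative.

```python
# Counts by word length (1, 2, 3, 4 characters), copied from Appendix 1 and Table A1.
all_tokens = [565, 622, 56, 62]       # all words in the 80 sentences
all_types = [154, 515, 53, 57]        # different words among them
included_tokens = [460, 420, 27, 46]  # words kept for the eye movement analyses
included_types = [128, 361, 26, 41]   # Table A1, "number of different words"

total_included = sum(included_tokens)
percentages = [round(100 * count / total_included) for count in included_tokens]

# Degrees of freedom of a one-way ANOVA over word types (assumed design).
df_between = len(included_types) - 1
df_within = sum(included_types) - len(included_types)

print("all tokens:", sum(all_tokens), "all types:", sum(all_types))
print("included tokens:", total_included, "included types:", sum(included_types))
print("percent by length:", percentages)
print(f"F({df_between},{df_within})")
```

The 556 different included words and four length levels give 552 within-group degrees of freedom, matching the F statistics reported above.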

Table A1.

Properties of the words and characters included in the eye movement analyses

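| Property | Character position | Word length 1 | Word length 2 | Word length 3 | Word length 4 |
|---|---|---|---|---|---|
| number of occurrences | | 460 | 420 | 27 | 46 |
| number of different words | | 128 | 361 | 26 | 41 |
| word frequency | | 1979 | 114 | 14 | 2 |
| stroke number | char 1 | 6.62 | 7.67 | 8.08 | 6.95 |
| stroke number | char 2 | | 7.61 | 6.62 | 6.76 |
| stroke number | char 3 | | | 7.38 | 6.90 |
| stroke number | char 4 | | | | 8.17 |
| character frequency | char 1 | 2374 | 1821 | 1861 | 2064 |
| character frequency | char 2 | | 2007 | 1734 | 1607 |
| character frequency | char 3 | | | 2951 | 1595 |
| character frequency | char 4 | | | | 1172 |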

Appendix 2. Results of by-participant multiple regressions

Table A2.

Multiple regression results for eye movement measures on words

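| Predictor | Gaze duration: coef | Gaze duration: t(45) | Gaze duration: p | Fixation probability: coef | Fixation probability: t(45) | Fixation probability: p |
|---|---|---|---|---|---|---|
| (Intercept) | 176.12 | 14.09 | <.001 | −7.95 | −17.27 | <.001 |
| word n−1 frequency | −2.43 | −2.96 | .005 | 0.03 | 1.46 | .151 |
| word n−1 predictability | −3.91 | −5.84 | <.001 | −0.43 | −12.53 | <.001 |
| word n−1 length | −11.46 | −5.38 | <.001 | −0.29 | −5.30 | <.001 |
| word n frequency | −6.19 | −5.80 | <.001 | 0.05 | 1.70 | .095 |
| word n predictability | −3.87 | −5.22 | <.001 | −0.24 | −13.84 | <.001 |
| word n length | 7.20 | 2.20 | .033 | 1.26 | 13.86 | <.001 |
| word n+1 frequency | 0.72 | 1.02 | .315 | 0.09 | 5.13 | <.001 |
| word n+1 predictability | −2.92 | −3.26 | .002 | −0.21 | −11.61 | <.001 |
| word n+1 length | 1.76 | 1.09 | .283 | −0.15 | −3.13 | .003 |
| character before word: frequency | −0.47 | −.53 | .597 | 0.05 | 2.45 | .018 |
| character before word: complexity | 1.15 | 3.54 | <.001 | 0.07 | 8.71 | <.001 |
| average of characters of word n: frequency | 1.25 | 1.04 | .305 | 0.01 | .20 | .844 |
| average of characters of word n: complexity | 3.21 | 8.54 | <.001 | 0.08 | 9.06 | <.001 |
| character after word: frequency | −0.02 | −.03 | .982 | −0.03 | −1.78 | .081 |
| character after word: complexity | 0.16 | .51 | .609 | −0.03 | −3.47 | .001 |
| nearest fixation distance | 18.61 | 15.40 | <.001 | 4.30 | 5.56 | <.001 |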

Table A3.

Multiple regression results for eye movement measures on characters

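| Predictor | Fixation duration: coef | Fixation duration: t(45) | Fixation duration: p | Fixation probability: coef | Fixation probability: t(45) | Fixation probability: p | Saccade length: coef | Saccade length: t(45) | Saccade length: p |
|---|---|---|---|---|---|---|---|---|---|
| (Intercept) | 269.59 | 32.36 | <.001 | −1.02 | −4.41 | <.001 | 1.94 | 11.26 | <.001 |
| word n−1 frequency | −2.08 | −5.80 | <.001 | −0.02 | −3.06 | .004 | 0.01 | .59 | .556 |
| word n−1 predictability | −0.70 | −1.53 | .132 | −0.05 | −5.18 | <.001 | −0.00 | −.24 | .810 |
| word n−1 length | −9.55 | −6.56 | <.001 | −1.17 | −7.18 | <.001 | 0.01 | .26 | .799 |
| word n frequency | −3.52 | −8.50 | <.001 | −0.05 | −6.02 | <.001 | 0.03 | 4.12 | <.001 |
| word n predictability | −3.63 | −6.73 | <.001 | −0.06 | −6.94 | <.001 | 0.02 | 2.13 | .039 |
| word n length | −7.66 | −6.20 | <.001 | −0.14 | −7.08 | <.001 | 0.10 | 5.33 | <.001 |
| word n+1 frequency | −0.09 | −.24 | .810 | −0.01 | −2.05 | .046 | 0.02 | 2.61 | .012 |
| word n+1 predictability | −0.98 | −1.68 | .099 | −0.02 | −2.82 | .007 | 0.03 | 2.25 | .029 |
| word n+1 length | 1.22 | .944 | .350 | −0.04 | −1.93 | .060 | 0.05 | 2.71 | .009 |
| character n−1 frequency | −1.44 | −3.52 | <.001 | −0.04 | −6.45 | <.001 | −0.01 | −1.63 | .109 |
| character n−1 complexity | −0.10 | −.454 | .652 | 0.02 | 5.81 | <.001 | 0.00 | .039 | .696 |
| character n frequency | 0.18 | .303 | .763 | 0.01 | 1.34 | .188 | 0.00 | .12 | .905 |
| character n complexity | 1.60 | 6.78 | <.001 | 0.03 | 9.46 | <.001 | −0.00 | −1.15 | .256 |
| character n+1 frequency | −0.17 | −.44 | .661 | −0.00 | −.41 | .681 | 0.01 | 1.91 | .062 |
| character n+1 complexity | 0.06 | .28 | .778 | −0.00 | −.49 | .628 | −0.03 | −5.17 | <.001 |
| character n+2 frequency | 1.64 | 4.70 | <.001 | 0.00 | .83 | .413 | 0.05 | 6.13 | <.001 |
| character n+2 complexity | 0.18 | .90 | .370 | −0.00 | −.63 | .531 | −0.02 | −3.19 | .003 |
| nearest fixation distance | 4.63 | 4.66 | <.001 | 0.50 | 5.29 | <.001 | 0.14 | 8.40 | <.001 |
| in word position | −4.34 | −4.34 | <.001 | −0.01 | −.68 | .501 | 0.08 | 2.84 | .007 |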

Footnotes

1

The method of analysis we use in this work, a statistical analysis of a large eye movement corpus in which a number of measures are analyzed for most words in the text, has yielded a few results that do not seem to be found in controlled experiments that analyze a single target word. Most notably, some parafoveal-on-foveal effects (i.e., the influence of the word to the right of fixation on the currently fixated word) appear to only have robust support from statistical corpus analyses. Unfortunately, the reasons for such differences are still poorly understood (Kliegl, 2007; Rayner, Pollatsek, Drieghe, Slattery, & Reichle, 2007). Given this, we believe that the results we report here should also be examined using controlled experiments.

2

Note that although we compare Chinese characters and English characters, we do not argue that they are linguistically similar. We compare them just because they are both salient units, since there are small spaces between characters in both writing systems. Many Chinese characters carry some semantic information, and as such it may be argued that Chinese characters are analogous to morphemes in English.

3

Fitting mixed-effects regression models without random slopes can be anti-conservative in the presence of differences in effect sizes between levels of grouping variables (i.e., between subjects or items; Barr, Levy, Scheepers, & Tily, 2013). However, with models as large as those we are fitting, it is not practical to fit random slopes for each predictor variable of interest. For this reason, we also analyzed the data with by-participant regression (Lorch & Myers, 1990). This method is in general less powerful than mixed-effects regression (Baayen et al., 2008), but is robust to differences in effect size across participants. The results of these additional analyses are given in Appendix 2. They revealed that every significant effect in our main analyses (mixed-effects regression) was also significant under by-participant regression, with just two exceptions: the effects of the predictability of word n−1 and word n+1 on character fixation durations, which we mark in the results with a footnote. This suggests that the rest of the results reported in our main analyses are robust to possible differences in effect size across participants, despite our main analyses not including random slopes.
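The logic of by-participant regression can be sketched in a few lines: fit a separate regression for each participant, then test whether the resulting coefficients differ from zero across participants. The sketch below is a simplified illustration, not the analysis reported in Appendix 2. It uses a single predictor rather than the full set, and the data are simulated with arbitrary parameter values. The figure of 46 participants is inferred from the t(45) statistics in Tables A2 and A3, and the function names are hypothetical.

```python
import math
import random
import statistics


def ols_slope(xs, ys):
    """Least-squares slope of ys on xs (one predictor plus an intercept)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def by_participant_test(data):
    """data maps participant -> (xs, ys); returns (mean slope, t, df)."""
    # Step 1: one regression per participant.
    slopes = [ols_slope(xs, ys) for xs, ys in data.values()]
    # Step 2: one-sample t test of the slopes against zero.
    n = len(slopes)
    mean_slope = statistics.fmean(slopes)
    standard_error = statistics.stdev(slopes) / math.sqrt(n)
    return mean_slope, mean_slope / standard_error, n - 1


# Simulated data only: each participant has their own true effect size.
random.seed(1)
data = {}
for participant in range(46):
    true_slope = random.gauss(-5.0, 2.0)
    xs = [random.gauss(0.0, 1.0) for _ in range(200)]
    ys = [250.0 + true_slope * x + random.gauss(0.0, 40.0) for x in xs]
    data[participant] = (xs, ys)

mean_slope, t, df = by_participant_test(data)
print(f"mean slope = {mean_slope:.2f}, t({df}) = {t:.2f}")
```

Because the test statistic is computed over participant-level coefficients, variation in effect size between participants enters the standard error directly, which is what makes the method robust in the sense described above.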

4

Readers made more than one fixation on only 1.7% of the characters.

5

http://ccl.pku.edu.cn:8080/ccl_corpus/index.jsp?dir=xiandai. Center for Chinese Linguistics PKU.

6

In addition, follow-up analyses performed with by-participant regression failed to recover the effects of the predictability of words n−1 and n+1, indicating that these effects may not be robust to differences between participants (see footnote 3). Under this analysis, each effect is still estimated as being in the same direction, but the effects fail to reach significance, with p-values of .10 and .15 respectively.

7

To investigate the possibility of interactions between word and character properties, we performed follow-up analyses in which we added six interactions between word and character properties to each of our five regression models. Specifically, we added interactions between the three properties of the current word (length, frequency, and predictability) and the frequency and complexity of the current character (for character-defined measures) or of the characters of the current word (for word-defined measures). Of the 30 interaction terms we tested, only 3 were significant, all of them in the two word-based models: there was a significant negative interaction between word length and mean character frequency on gaze duration and on word fixation probability, and a significant negative interaction between word predictability and mean character frequency on word fixation probability. Crucially, including these interactions in the models did not change the pattern of main effects.
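The count of 30 tested terms follows from crossing three word properties with two character properties in each of five models, as the short enumeration below shows. The model labels are inferred from the measures reported in the tables and appendices, and the variable names are illustrative.

```python
from itertools import product

word_properties = ["length", "frequency", "predictability"]
character_properties = ["frequency", "complexity"]
# Labels inferred from the reported measures (an assumption, not the authors' naming).
models = [
    "gaze duration (word)",
    "fixation probability (word)",
    "fixation duration (character)",
    "fixation probability (character)",
    "forward saccade length",
]

# Each word property crossed with each character property: 3 x 2 terms.
interactions = [
    f"word {w} x character {c}"
    for w, c in product(word_properties, character_properties)
]
# The same six terms are added to every model.
tested = [(model, term) for model in models for term in interactions]

print(len(interactions), "interactions per model,", len(tested), "terms in total")
```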

8

These two effects may not be robust to between-subject differences in effect sizes. See footnote 3.

References

  1. Akaike H. A new look at the statistical model identification. IEEE Transactions on Automatic Control. 1974;19:716–723. [Google Scholar]
  2. Baayen RH, Davidson DJ, Bates DM. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language. 2008;59(4):390–412. doi: 10.1016/j.jml.2007.12.005. [DOI] [Google Scholar]
  3. Bai X, Yan G, Liversedge SP, Zang C, Rayner K. Reading spaced and unspaced Chinese text: Evidence from eye movements. Journal of Experimental Psychology: Human Perception and Performance. 2008;34(5):1277–1287. doi: 10.1037/0096-1523.34.5.1277. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Balota DA, Pollatsek A, Rayner K. The interaction of contextual constraints and parafoveal visual information in reading. Cognitive Psychology. 1985;17(3):364–390. doi: 10.1016/0010-0285(85)90013-1. [DOI] [PubMed] [Google Scholar]
  5. Barr DJ, Levy R, Scheepers C, Tily HJ. Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language. 2013;68:255–278. doi: 10.1016/j.jml.2012.11.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Bates D. lme4: Mixed-effects modeling with R. 2010. Prepublication version at: http://lme4.r-Forge.r-project.org/book/
  7. Bates D, Maechler M. lme4: Linear mixed-effects models using S4 classes. 2010. R package version 0.999375–36/r1083. http://r-Forge.r-project.org/projects/lme4/
  8. Bicknell K, Levy R. A rational model of eye movement control in reading. In: Hajič J, Carberry S, Clark S, Nivre J, editors. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL). Uppsala, Sweden: Association for Computational Linguistics; 2010. pp. 1168–1178. [Google Scholar]
  9. Blythe HI, Liang F, Zang C, Wang J, Yan G, Bai X, Liversedge SP. Inserting spaces into Chinese text helps readers to learn new words: An eye movement study. Journal of Memory and Language. 2012;67:241–254. [Google Scholar]
  10. Chen H. Chinese reading and comprehension: A cognitive psychology perspective. In: Bond MH, editor. Handbook of Chinese psychology. Hong Kong: Oxford University Press; 1996. pp. 43–62. [Google Scholar]
  11. Chen H, Song H, Lau WY, Wong KFE, Tang SL. Chinese reading and comprehension: A cognitive psychology perspective. In: McBride-Chang C, Chen H, editors. Reading development in Chinese children. Westport, CT: Praeger Publishers; 2003. pp. 157–169. [Google Scholar]
  12. Chen H, Zhou X. Processing East Asian languages: An introduction. Language and Cognitive Processes. 1999;14(5/6):425–428. [Google Scholar]
  13. Cheng C. Perception of Chinese character. Chinese Journal of Psychology. 1981;23(2):137–153. [Google Scholar]
  14. Cui L, Bai X, Yan G, Hyönä J, Wang S, Liversedge SP. Parallel processing of compound word characters in reading Chinese: An eye movement contingent display change study. Quarterly Journal of Experimental Psychology. 2013;66:403–416. doi: 10.1080/17470218.2012.667423. [DOI] [PubMed] [Google Scholar]
  15. Drieghe D, Brysbaert M, Desmet T. Parafoveal-on-foveal effects on eye movements in text reading: Does an extra space make a difference? Vision Research. 2005;45(13):1693–1706. doi: 10.1016/j.visres.2005.01.010. [DOI] [PubMed] [Google Scholar]
  16. Engbert R, Longtin A, Kliegl R. A dynamical model of saccade generation in reading based on spatially distributed lexical processing. Vision Research. 2002;42(5):621–636. doi: 10.1016/s0042-6989(01)00301-7. [DOI] [PubMed] [Google Scholar]
  17. Engbert R, Nuthmann A, Richter EM, Kliegl R. SWIFT: A dynamical model of saccade generation during reading. Psychological Review. 2005;112(4):777–813. doi: 10.1037/0033-295x.112.4.777. [DOI] [PubMed] [Google Scholar]
  18. Faraway JJ. Extending the linear model with R: Generalized linear, mixed effects and nonparametric regression models. Boca Raton, FL: Chapman & Hall/CRC; 2006. [Google Scholar]
  19. Feng G. Orthography and eye movements: The paraorthographic linkage hypothesis. In: Rayner K, Shen D, Bai X, Yan G, editors. Cognitive and cultural influences on eye movements. Tianjin, China: Tianjin People’s Publishing House; 2008. pp. 395–420. [Google Scholar]
  20. Henderson JM, Ferreira F. Effects of foveal processing difficulty on the perceptual span in reading: Implications for attention and eye movement control. Journal of Experimental Psychology: Learning, Memory, and Cognition. 1990;16(3):417–429. doi: 10.1037//0278-7393.16.3.417. [DOI] [PubMed] [Google Scholar]
  21. Hoosain R. Aspects of the Chinese Language. In: Hoosain R, editor. Psycholinguistic implications for linguistic relativity: A case study of Chinese. Hillsdale, New Jersey: Lawrence Erlbaum Associates, Inc; 1991. pp. 5–21. [Google Scholar]
  22. Hoosain R. Psychological reality of the word in Chinese. In: Chen HC, Tzeng OJL, editors. Language processing in Chinese. North-Holland: Elsevier; 1992. pp. 111–130. [Google Scholar]
  23. Inhoff AW, Liu W. The perceptual span and oculomotor activity during the reading of Chinese sentences. Journal of Experimental Psychology: Human Perception and Performance. 1998;24(1):20–34. doi: 10.1037//0096-1523.24.1.20. [DOI] [PubMed] [Google Scholar]
  24. Inhoff AW, Rayner K. Parafoveal word processing during eye fixations in reading: Effects of word frequency. Perception & Psychophysics. 1986;40:431–439. doi: 10.3758/bf03208203. [DOI] [PubMed] [Google Scholar]
  25. Inhoff AW, Starr M, Shindler KL. Is the processing of words during eye fixations in reading strictly serial? Perception & Psychophysics. 2000;62:1474–1484. doi: 10.3758/bf03212147. [DOI] [PubMed] [Google Scholar]
  26. Juhasz BJ, Rayner K. Investigating the effects of a set of intercorrelated variables on eye fixation durations in reading. Journal of Experimental Psychology: Learning, Memory, and Cognition. 2003;29(6):1312–1318. doi: 10.1037/0278-7393.29.6.1312. [DOI] [PubMed] [Google Scholar]
  27. Just MA, Carpenter PA. A theory of reading: From eye fixations to comprehension. Psychological Review. 1980;87(4):329–354. [PubMed] [Google Scholar]
  28. Kennedy A, Pynte J. Parafoveal-on-foveal effects in normal reading. Vision Research. 2005;45(2):153–168. doi: 10.1016/j.visres.2004.07.037. [DOI] [PubMed] [Google Scholar]
  29. Kliegl R. Toward a perceptual-span theory of distributed processing in reading: A reply to Rayner, Pollatsek, Drieghe, Slattery, and Reichle (2007). Journal of Experimental Psychology: General. 2007;136(3):530–537. doi: 10.1037/0096-3445.136.3.530. [DOI] [PubMed] [Google Scholar]
  30. Kliegl R, Grabner E, Rolfs M, Engbert R. Length, frequency, and predictability effects of words on eye movements in reading. European Journal of Cognitive Psychology. 2004;16(1–2):262–284. doi: 10.1080/09541440340000213. [DOI] [Google Scholar]
  31. Kliegl R, Masson MEJ, Richter EM. A linear mixed model analysis of masked repetition priming. Visual Cognition. 2010;18(5):655–681. [Google Scholar]
  32. Kliegl R, Nuthmann A, Engbert R. Tracking the mind during reading: The influence of past, present, and future words on fixation durations. Journal of Experimental Psychology: General. 2006;135(1):12–35. doi: 10.1037/0096-3445.135.1.12. [DOI] [PubMed] [Google Scholar]
  33. Kliegl R, Risse S, Laubrock J. Preview benefit and parafoveal-on-foveal effects from word n+2. Journal of Experimental Psychology: Human Perception and Performance. 2007;33(5):1250–1255. doi: 10.1037/0096-1523.33.5.1250. [DOI] [PubMed] [Google Scholar]
  34. Lexicon of common words in contemporary Chinese research team. Lexicon of common words in contemporary Chinese. The Commercial Press; Beijing, China: 2008. [Google Scholar]
  35. Li X, Gu J, Liu P, Rayner K. The advantage of word-based processing in Chinese reading: Evidence from eye movements. Journal of Experimental Psychology: Learning, Memory, and Cognition. doi: 10.1037/a0030337. in press. [DOI] [PubMed] [Google Scholar]
  36. Li X, Liu P, Rayner K. Eye movement guidance in Chinese reading: Is there a preferred viewing location? Vision Research. 2011;51:1146–1156. doi: 10.1016/j.visres.2011.03.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Li X, Logan G. Object-based attention in Chinese readers of Chinese words: Beyond Gestalt principles. Psychonomic Bulletin & Review. 2008;15(5):945–949. doi: 10.3758/PBR.15.5.945. [DOI] [PubMed] [Google Scholar]
  38. Li X, Rayner K, Cave KR. On the segmentation of Chinese words during reading. Cognitive Psychology. 2009;58:525–552. doi: 10.1016/j.cogpsych.2009.02.003. [DOI] [PubMed] [Google Scholar]
  39. Li X, Zhao W, Pollatsek A. Dividing lines at the word boundary position helps reading in Chinese. Psychonomic Bulletin & Review. 2012;19(5):929–934. doi: 10.3758/s13423-012-0270-6. [DOI] [PubMed] [Google Scholar]
  40. Liu P, Li W, Lin N, Li X. Do Chinese readers follow the National Standard Rules for word segmentation during reading? PLoS One. 2013;8:e55440. doi: 10.1371/journal.pone.0055440. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Lorch RF, Myers JL. Regression analyses of repeated measures data in cognitive research. Journal of Experimental Psychology: Learning, Memory, and Cognition. 1990;16(1):149–157. doi: 10.1037//0278-7393.16.1.149. [DOI] [PubMed] [Google Scholar]
  42. McConkie GW, Kerr PW, Reddix MD, Zola D. Eye movement control during reading: I. The location of initial eye fixations on words. Vision Research. 1988;28(10):1107–1118. doi: 10.1016/0042-6989(88)90137-x. [DOI] [PubMed] [Google Scholar]
  43. Miellet S, Sparrow L, Sereno SC. Word frequency and predictability effects in reading French: An evaluation of the E-Z Reader model. Psychonomic Bulletin & Review. 2007;14(4):762–769. doi: 10.3758/bf03196834. [DOI] [PubMed] [Google Scholar]
  44. O’Regan JK, Jacobs AM. Optimal viewing position effect in word recognition: A challenge to current theory. Journal of Experimental Psychology: Human Perception and Performance. 1992;18(1):185–197. [Google Scholar]
  45. Pinheiro JC, Bates D. Mixed-effects models in S and S-PLUS. Springer-Verlag New York, Inc; 2000. [Google Scholar]
  46. Pollatsek A, Reichle ED, Juhasz BJ, Machacek D, Rayner K. Immediate and delayed effects of word frequency and word length on eye movements in reading: A reversed delayed effect of word length. Journal of Experimental Psychology: Human Perception and Performance. 2008;34(3):726–750. doi: 10.1037/0096-1523.34.3.726. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Pynte J, Kennedy A, Ducrot S. The influence of parafoveal typographical errors on eye movements in reading. European Journal of Cognitive Psychology. 2004;16(1–2):178–202. doi: 10.1080/09541440340000169. [DOI] [Google Scholar]
  48. Rayner K. Eye guidance in reading: fixation locations within words. Perception. 1979;8:21–30. doi: 10.1068/p080021. [DOI] [PubMed] [Google Scholar]
  49. Rayner K. Eye movements in reading and information processing: 20 years of research. Psychological Bulletin. 1998;124(3):372–422. doi: 10.1037/0033-2909.124.3.372. [DOI] [PubMed] [Google Scholar]
  50. Rayner K. The Thirty-fifth Sir Frederick Bartlett Lecture: Eye movements and attention in reading, scene perception, and visual search. The Quarterly Journal of Experimental Psychology. 2009;62(8):1457–1506. doi: 10.1080/17470210902816461. [DOI] [PubMed] [Google Scholar]
  51. Rayner K, Ashby J, Pollatsek A, Reichle ED. The effects of frequency and predictability on eye fixations in reading: Implications for the E-Z reader model. Journal of Experimental Psychology: Human Perception and Performance. 2004;30(4):720–732. doi: 10.1037/0096-1523.30.4.720. [DOI] [PubMed] [Google Scholar]
  52. Rayner K, Duffy SA. Lexical complexity and fixation times in reading: Effects of word frequency, verb complexity, and lexical ambiguity. Memory & Cognition. 1986;14(3):191–201. doi: 10.3758/bf03197692. [DOI] [PubMed] [Google Scholar]
  53. Rayner K, Juhasz BJ, Brown SJ. Do readers obtain preview benefit from word n+2? A test of serial attention shift versus distributed lexical processing models of eye movement control in reading. Journal of Experimental Psychology: Human Perception and Performance. 2007;33(1):230–245. doi: 10.1037/0096-1523.33.1.230. [DOI] [PubMed] [Google Scholar]
  54. Rayner K, Li X, Juhasz BJ, Yan G. The effect of word predictability on the eye movements of Chinese readers. Psychonomic Bulletin & Review. 2005;12(6):1089–1093. doi: 10.3758/bf03206448. [DOI] [PubMed] [Google Scholar]
  55. Rayner K, Li X, Pollatsek A. Extending the E-Z Reader model of eye movement control to Chinese readers. Cognitive Science. 2007;31:1021–1033. doi: 10.1080/03640210701703824. [DOI] [PubMed] [Google Scholar]
  56. Rayner K, Pollatsek A, Drieghe D, Slattery TJ, Reichle ED. Tracking the mind during reading via eye movements: Comments on Kliegl, Nuthmann, and Engbert (2006). Journal of Experimental Psychology: General. 2007;136(3):520–529. doi: 10.1037/0096-3445.136.3.520. [DOI] [PubMed] [Google Scholar]
  57. Rayner K, Reichle ED, Stroud MJ, Pollatsek A. The effect of word frequency, word predictability, and font difficulty on the eye movements of young and older readers. Psychology and Aging. 2006;21(3):448–465. doi: 10.1037/0882-7974.21.3.448. [DOI] [PubMed] [Google Scholar]
  58. Rayner K, Sereno SC, Raney GE. Eye movement control in reading: A comparison of two types of models. Journal of Experimental Psychology: Human Perception and Performance. 1996;22(5):1188–1200. doi: 10.1037//0096-1523.22.5.1188. [DOI] [PubMed] [Google Scholar]
  59. Rayner K, Slattery TJ, Drieghe D, Liversedge SP. Eye movements and word skipping during reading: Effects of word length and predictability. Journal of Experimental Psychology: Human Perception and Performance. 2011;37(2):514–528. doi: 10.1037/a0020990. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Rayner K, Well AD. Effects of contextual constraint on eye movements in reading: A further examination. Psychonomic Bulletin & Review. 1996;3(4):504–509. doi: 10.3758/bf03214555. [DOI] [PubMed] [Google Scholar]
  61. Reicher GM. Perceptual recognition as a function of meaningfulness of stimulus material. Journal of Experimental Psychology. 1969;81(2):275–280. doi: 10.1037/h0027768. [DOI] [PubMed] [Google Scholar]
  62. Reichle ED, Pollatsek A, Fisher DL, Rayner K. Toward a model of eye movement control in reading. Psychological Review. 1998;105(1):125–157. doi: 10.1037/0033-295x.105.1.125. [DOI] [PubMed] [Google Scholar]
  63. Reichle ED, Pollatsek A, Rayner K. Using E-Z Reader to simulate eye movements in nonreading tasks: A unified framework for understanding the eye-mind link. Psychological Review. 2012;119(1):155–185. doi: 10.1037/a0026473. [DOI] [PubMed] [Google Scholar]
  64. Reichle ED, Rayner K, Pollatsek A. The E-Z Reader model of eye-movement control in reading: Comparisons to other models. Behavioral and Brain Sciences. 2003;26:445–526. doi: 10.1017/s0140525x03000104. [DOI] [PubMed] [Google Scholar]
  65. Reichle ED, Warren T, McConnell K. Using E-Z reader to model effects of higher-level language processing on eye movements during reading. Psychonomic Bulletin & Review. 2009;16(1):1–21. doi: 10.3758/PBR.16.1.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Reilly R, Radach R. Some empirical tests of an interactive activation model of eye movement control in reading. Cognitive Systems Research. 2006;7(1):34–55. doi: 10.1016/j.cogsys.2005.07.006. [DOI] [Google Scholar]
  67. Richter EM, Engbert R, Kliegl R. Current advances in SWIFT. Cognitive Systems Research. 2006;7(1):23–33. doi: 10.1016/j.cogsys.2005.07.003. [DOI] [Google Scholar]
  68. Schilling HEH, Rayner K, Chumbley JI. Comparing naming, lexical decision, and eye fixation times: Word frequency effects and individual differences. Memory & Cognition. 1998;26(6):1270–1281. doi: 10.3758/bf03201199. [DOI] [PubMed] [Google Scholar]
  69. Schotter ER, Angele B, Rayner K. Parafoveal processing in reading. Attention Perception & Psychophysics. 2012;74:5–35. doi: 10.1037/A0023215. [DOI] [PubMed] [Google Scholar]
  70. Schotter ER, Blythe HI, Kirkby JA, Rayner K, Holliman NS, Liversedge SP. Binocular coordination: Reading stereoscopic sentences in depth. PLoS One. 2012;7(4) doi: 10.1371/journal.pone.0035608. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Shen D, Liversedge SP, Tian J, Zang C, Cui L, Bai X, Yan G, Rayner K. Eye movements of second language learners when reading spaced and unspaced Chinese text. Journal of Experimental Psychology: Applied. 2012;18:192–202. doi: 10.1037/a0027485. [DOI] [PubMed] [Google Scholar]
  72. Slattery TJ, Pollatsek A, Rayner K. The effect of the frequencies of three consecutive content words on eye movements during reading. Memory & Cognition. 2007;35(6):1283–1292. doi: 10.3758/bf03193601. [DOI] [PubMed] [Google Scholar]
  73. Tsai JL, McConkie GW. Where do Chinese readers send their eyes? In: Hyönä J, Radach R, Deubel H, editors. The mind’s eye: Cognitive and applied aspects of eye movement research. Amsterdam: Elsevier; 2003. pp. 159–176. [Google Scholar]
  74. Vainio S, Hyönä J, Pajunen A. Lexical predictability exerts robust effects on fixation duration, but not on initial landing position during reading. Experimental Psychology. 2009;56(1):66–74. doi: 10.1027/1618-3169.56.1.66. [DOI] [PubMed] [Google Scholar]
  75. Vanyukov PM, Warren T, Wheeler ME, Reichle ED. The emergence of frequency effects in eye movements. Cognition. 2012;123(1):185–189. doi: 10.1016/j.cognition.2011.12.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Wang HJ. Sinigram-based theory and L2 Chinese teaching. Chinese Teaching Academic Journal. 2007;3:58–71. (in Chinese) [Google Scholar]
  77. Wang J. A study on the relative factors of foreign students’ Chinese character learning. Language Teaching and Linguistic Studies. 2009;31(1):9–16. (in Chinese) [Google Scholar]
  78. Wei W, Li X, Pollatsek A. Word properties of a fixated region affect outgoing saccade length in Chinese reading. Vision Research. 2013;80:1–6. doi: 10.1016/j.visres.2012.11.015. [DOI] [PubMed] [Google Scholar]
  79. Wheeler DD. Processes in word recognition. Cognitive Psychology. 1970;1(1):59–85.
  80. White SJ. Eye movement control during reading: Effects of word frequency and orthographic familiarity. Journal of Experimental Psychology: Human Perception and Performance. 2008;34(1):205–223. doi: 10.1037/0096-1523.34.1.205.
  81. White SJ, Liversedge SP. Orthographic familiarity influences initial eye fixation positions in reading. European Journal of Cognitive Psychology. 2004;16(1–2):52–78. doi: 10.1080/09541440340000204.
  82. Xu TQ. Character and syntactic structures in Chinese. Chinese Teaching in the World. 1994;8(2):1–9. (in Chinese)
  83. Xu TQ. Character as the basic structural unit and linguistic studies. Language Teaching and Linguistic Studies. 2005;6:1–11. (in Chinese)
  84. Yan G, Tian H, Bai X, Rayner K. The effect of word and character frequency on the eye movements of Chinese readers. British Journal of Psychology. 2006;97:259–268. doi: 10.1348/000712605X70066.
  85. Yan M, Kliegl R, Shu H, Pan J, Zhou X. Parafoveal load of word N+1 modulates preprocessing effectiveness of word N+2 in Chinese reading. Journal of Experimental Psychology: Human Perception and Performance. 2010;36(6):1669–1676. doi: 10.1037/a0019329.
  86. Yan M, Richter EM, Shu H, Kliegl R. Readers of Chinese extract semantic information from parafoveal words. Psychonomic Bulletin & Review. 2009;16(3):561–566. doi: 10.3758/PBR.16.3.561.
  87. Yang H, McConkie GW. Reading Chinese: Some basic eye-movement characteristics. In: Wang J, Inhoff AW, Chen HC, editors. Reading Chinese Script. Mahwah, New Jersey: Lawrence Erlbaum Associates; 1999. pp. 207–222.
  88. Yang J, Wang S, Xu Y, Rayner K. Do Chinese readers obtain preview benefit from word n+2? Evidence from eye movements. Journal of Experimental Psychology: Learning, Memory, and Cognition. 2009;35(4):1192–1204. doi: 10.1037/a0013554.
  89. Yang SN, McConkie GW. Eye movements during reading: a theory of saccade initiation times. Vision Research. 2001;41(25–26):3567–3585. doi: 10.1016/s0042-6989(01)00025-6.
  90. Zang C, Liang F, Bai X, Yan G, Liversedge SP. Inter-word spacing and landing position effects during Chinese reading in children and adults. Journal of Experimental Psychology: Human Perception and Performance. In press. doi: 10.1037/a0030097.
