Author manuscript; available in PMC: 2012 Nov 1.
Published in final edited form as: J Cogn Psychol (Hove). 2011 Nov 4;23(7):795–810. doi: 10.1080/20445911.2011.570257

Processing Novel and Lexicalized Finnish Compound Words

Alexander Pollatsek, Raymond Bertram, Jukka Hyönä
PMCID: PMC3327474  NIHMSID: NIHMS289701  PMID: 22518273

Abstract

Participants read sentences in which novel and lexicalized two-constituent compound words appeared while their eye movements were measured. The frequency of the first constituent of the compounds was also varied factorially and the frequency of the lexicalized compounds was equated over the two conditions. The sentence frames prior to the target word were matched across conditions. Both lexicality and first constituent frequency had large and significant effects on gaze durations on the target word; moreover the constituent frequency effect was significantly larger for the novel words. These results indicate that first constituent frequency has an effect in two stages: in the initial encoding of the compound and in the construction of meaning for the novel compound. The difference between this pattern of results and those for English prefixed words (Pollatsek, Slattery, & Juhasz, 2008) is apparently due to differences in the construction of meaning stage. A general model of the relationship of the processing of polymorphemic words to how they are fixated is presented.

Keywords: compound words, eye movements, effects of novelty


Interest has grown in how morphemically complex words are encoded since it has been demonstrated with a number of paradigms that such words are not merely “looked-up” in the lexicon, but that the morphemic components play a role in the accessing of such words. A great deal of the work has employed the lexical decision task, perhaps inspired by the landmark study by Taft and Forster (1976). Moreover, recent experiments using the masked priming paradigm of Forster and Davis (1984) have shown that units that are potentially morphemic components are extracted early in the word recognition process, sometimes even when these units are not morphemes in the actual word (Longtin & Meunier, 2005; McCormick, Rastle, & Davis, 2008; Rastle, Davis, & New, 2004). Although the lexical decision paradigm has indeed been a powerful tool for examining the roles of morphemic components in word identification, it is limited as a tool in understanding how the meaning of a morphologically complex word is accessed. For that reason, examining how morphologically complex words are processed in sentences seems necessary if one wants to understand how the meaning of a morphologically complex word is arrived at.

An important point needs to be stressed. If the reader knows the meaning of a morphologically complex word, there is no necessity that the morphemes – as morphemes in their linguistically defined sense – play a role in the access of the meaning of the word. That is, morphemes may just be highly frequently encountered subword strings of letters that are extracted as orthographic or phonological forms that are part of the process of constructing the orthographic or phonological form of the whole word. Indeed, experiments examining whether semantically transparent compound words (e.g., sunlight) are processed more easily in a sentence context than semantically opaque compound words (e.g., nightmare) have generally found no effect (e.g., Pollatsek & Hyönä, 2005, in Finnish; Frisson, Niswander-Klement, & Pollatsek, 2008, in English); however, Juhasz (2007) did obtain a semantic transparency effect in English.

As a result, the processing of novel morphologically complex words, whose meaning cannot be “looked up” in a mental lexicon, seems like an ideal venue for studying how the meaning of a morphologically complex word is constructed from its parts. Before discussing the processing of such novel morphologically complex words, a review of what has been found out about morphologically complex words using eye movement measures seems in order, especially as many of these studies have used longer Finnish compound words, which are the stimuli employed in the current experiments.

The initial eye movement studies examining the processing of morphologically complex words employed long Finnish compound words. One reason for this is that in Finnish, like in German, compound words are always written without spaces; thus there is a large number of compound words that can be used as stimuli and one can arrive at a stimulus set of reasonable size where one variable is manipulated and other potentially relevant variables can be controlled fairly well. In the typical experiment described below, two morphologically complex target words are embedded in a sentence frame that is identical up to the word following the target word. The two sets of target words, as indicated above, are equated on a number of indices but vary on a key index, such as the mean frequency of a constituent or morpheme of the word. In addition, in a separate offline study, the sentence frame up through the target word is judged to be equally felicitous with the two target words. In the typical experiment, the focus is on gaze duration, which is the sum of the fixation durations on the target word on the first pass through the text, and is a measure often assumed to be closely related to the time to encode a word. However, other measures are used, such as the duration of the first fixation on the target word (first fixation duration), which assesses more directly whether a manipulation of a property of the word is having an effect on early processing of that word.

The initial experiments in Finnish (Hyönä & Pollatsek, 1998, Pollatsek, Hyönä, & Bertram, 2000) showed that the gaze duration on the word was not only affected by the frequency of the whole two-constituent compound word but also by the frequency of the first and second constituents. These three frequency effects were all large (on the order of 100 ms for gaze duration), indicating that the constituents were having a significant role in the encoding of these compound words. It should also be mentioned that these frequency effects were not limited to the gaze duration on the word. That is, it has been found several times that the frequency of the initial constituent has a small but significant effect on the first fixation duration (see e.g., Bertram & Hyönä, 2003; Hyönä & Pollatsek, 1998; Kuperman, Bertram, & Baayen, 2008), and there is some evidence that even whole word frequency effects may emerge on the first fixation duration on the word (Kuperman et al., 2008; Pollatsek et al., 2000). The initial effect of the frequency of the second constituent is typically on the second fixation duration (see e.g. Pollatsek et al., 2000). However, all these effects are obtained for relatively long compounds (over 10 characters); subsequently Bertram and Hyönä (2003) found that, for shorter compound words (7–9 characters long), the effect of the first constituent frequency was minimal, but the frequency of the whole compound word strongly affected gaze durations on the target words.

Subsequent experiments on English compound words have also found effects of the first constituent on eye movement measures in reading. For instance, several studies have shown that the frequency of the first constituent influences early fixation duration measures (Andrews, Miller, & Rayner, 2004; Juhasz, Starr, Inhoff & Placke, 2003; Juhasz, 2007). Inhoff, Starr, Solomon and Placke (2008) found solid effects for first constituent frequency on first fixation duration, gaze duration and total fixation duration for so-called headed compounds: compounds whose meaning is more closely related to the first than the second constituent, e.g., humankind. Juhasz and Berkowitz (in press) also found that the morphological family size of the first constituent, the number of existing derivations and compounds formed by a given constituent, affects compound word processing with shorter gaze duration corresponding with larger family sizes.

In addition, there are findings that the frequency of the second constituent affects fixation duration measures (Andrews et al., 2004; Juhasz et al., 2003; Juhasz, 2007; Inhoff et al., 2008). Inhoff et al. (2008), for example, found shorter first fixation durations, gaze durations, and total fixation times for high-frequency second constituent tailed compounds than for low-frequency ones. Andrews et al. (2004) also found a solid effect of whole-word frequency in English in both gaze duration and total fixation duration. Furthermore, Juhasz (2008) found that the rated word frequency of the entire compound influences processing times on the compound word. Thus, the pattern of results for English compounds is quite similar to that for Finnish compounds. However, the constituent frequency effects in Finnish are a bit more reliable, possibly because the Finnish compounds used in the experiments of Hyönä and colleagues were generally longer than the ones used in English.

The morpheme frequency effect observed in sentence contexts is not limited to compound words. In a series of experiments by Niswander-Klement and Pollatsek (2006) on English prefixed words, the pattern of results was similar to that in the Finnish compound words described above. These experiments employed prefixed words with semantically transparent prefixes (e.g., un, mis, re) as target words, and as with the Finnish compound word experiments, the frequency of the root morpheme was varied for one set of materials and the frequency of the whole word was varied for another set of materials. The two prefixed words in a given sentence frame always had the same prefix. In addition, Niswander-Klement and Pollatsek also varied, across experiments, the length of the words (the long words were on average about 8.5 characters and included words of 9 letters and more; the short words were on average about 6.5 characters). There were significant root morpheme frequency effects for the longer prefixed words but not for the shorter prefixed words. In contrast, the whole-word frequency effects were significant for the shorter prefixed words but not for the longer prefixed words. Thus, the pattern of results mimics, to some extent, the one found by Bertram and Hyönä (2003) for short and long compound words. However, morpheme effects for prefixed words may appear for slightly shorter words because the prefix is typically a more frequent, smaller, and presumably more easily encodable unit than a constituent of a compound word.

In sum, these experiments generally indicate that, for longer morphologically complex words, the constituents are playing an active role in the encoding of the word, but that there is less evidence that they play a similar role for shorter words. One possible way to explain this pattern of results is a dual-route race model in which there is a whole-word encoding route operating in parallel with a compositional route. Thus, if a word is long and many characters have to be processed in parallel, the whole-word route may be quite slow and thus the compositional route will win the “race” and be the process that actually encodes the word most of the time. In contrast, if the word is shorter, the whole-word route wins the race most of the time, so that the effects of compositional processing would not be seen in the eye movement record.
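The race idea can be illustrated with a toy simulation. The sketch below is not a model fitted to any of the data discussed here: the normal finishing-time distributions and every parameter value (`base_ms`, `per_char_ms`, `comp_ms`, `sd_ms`) are hypothetical, chosen only so that the whole-word route slows down with word length while the compositional route does not.

```python
import random

def whole_word_win_rate(word_length, n_trials=10_000, seed=0,
                        base_ms=150.0, per_char_ms=25.0,
                        comp_ms=400.0, sd_ms=50.0):
    """Proportion of simulated trials on which the whole-word route finishes first.

    All parameters are illustrative assumptions, not estimates from data.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        # Whole-word route: assumed to slow down as more letters are processed in parallel.
        whole = rng.gauss(base_ms + per_char_ms * word_length, sd_ms)
        # Compositional route: assumed here to be independent of total word length.
        comp = rng.gauss(comp_ms, sd_ms)
        if whole < comp:
            wins += 1
    return wins / n_trials

short_rate = whole_word_win_rate(word_length=8)    # whole-word route wins most races
long_rate = whole_word_win_rate(word_length=13)    # compositional route wins most races
```

Under these assumed settings the whole-word route wins the majority of races for an 8-letter word and a minority for a 13-letter word, which is the qualitative pattern the race account needs; the specific proportions carry no empirical weight.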

As indicated earlier, however, the results and the model we have presented so far do not speak to the role of constituent morphemes in the encoding of the meaning of a word. That is, all of the effects we have reported so far are consistent with the hypothesis that the morphemes are merely orthographic or phonological segments of a word that are aiding in constructing the orthographic or phonological form of the word (especially for long words) but that the access of meaning occurs after all of this occurs. One way to probe more deeply into whether morphemes play a role in the construction of meaning is to use novel morphemically complex words. As most of the subjects in the experiment presumably have never seen these words before, they must construct the meaning of the word from its parts.

A recent set of experiments by Pollatsek, Slattery, and Juhasz (2008) examined the processing of novel and lexicalized prefixed words (for an eye movement study on reading novel morphologically simple words, see Chaffin, Morris, & Seely, 2001). As with Niswander-Klement and Pollatsek (2006), they employed prefixed words with semantically transparent prefixes. (Otherwise, the reader could not be expected to construct the meaning of a novel word.) In one set of materials, there were longer lexicalized prefixed words in which the whole word frequency was matched but the frequency of the root morpheme differed. This was similar to the manipulation in the longer set used by Niswander-Klement and Pollatsek. The other set of materials employed pairs of novel prefixed words (e.g., mispaid and mislent), where each pair of words shared the same prefix, but the frequencies of the root morpheme differed. The results were quite clear, but one result was unexpected. First, there was a large novelty effect: about 100 ms difference in gaze duration between the novel prefixed words and the lexicalized ones. Second, there was a significant root frequency effect, which was about the same size as that observed by Niswander-Klement and Pollatsek before. The surprising result was that the root frequency effect was no bigger for the novel words than for the lexicalized words (the frequency effect was slightly, but not significantly, smaller for the novel words).

The reason that this result seemed surprising is that it seems as though access to the meaning of the novel words has to come through a compositional route in which the component morphemes have to be involved. In contrast, for the lexicalized words, it would seem as though the meaning of the word could be obtained by some sort of direct access from the lexicon (at least some of the time) and thus the effects of all compositional processing would be reduced relative to when a novel word is being processed. That is, if we assume that in the case of lexicalized words the whole-word route wins some of the time and that there is direct mapping of the lexical representation of a complex word with its semantic representation, the constituent frequency effect should be attenuated for lexicalized prefixed words in comparison to novel ones for which no lexical representation can exist. However, it may well be that the construction of meaning for novel prefixed words on the basis of their constituents is a relatively easy process and as fast as the direct look-up process. In contrast, the construction of the meaning of a compound word from its constituents is likely to be less straightforward. For example, in Gagné and Spalding (2009), there were twelve different relations between the two constituents that were tested. For example, a snowman is a man MADE OF snow, whereas a snowshovel is a shovel FOR snow. Accordingly, we thought it would be of interest to determine whether the additive pattern of novelty and root frequency extends to compound words as well. As will become evident, the pattern with Finnish compounds is quite different, and we think quite revealing about how their meaning is accessed.

METHOD

Participants

Twenty-seven students of the University of Turku participated in the experiment. All were native speakers of Finnish, and had normal or corrected-to-normal vision.

Apparatus

Eye movements were monitored by the EYELINK II eyetracker manufactured by SR Research Ltd (Canada). The eyetracker is an infra-red video-based tracking system combined with hyperacuity image processing. There are two cameras mounted on a headband (one for each eye) including two infrared LEDs for illuminating each eye. The headband weighs 450 g in total. The cameras sample pupil location and pupil size at the rate of 500 Hz. Registration is monocular and is performed for the selected eye by placing the camera and the two-infra-red light sources at a distance of 4–6 cm from the eye. The spatial accuracy is better than 0.5 degrees. The spatial resolution (i.e., the differential accuracy) of the system is 15 min of arc. Head position with respect to the computer screen is tracked with the help of a head-tracking camera mounted on the center of the headband at the level of the forehead. Four LEDs are attached to the corners of the computer screen, and are viewed by the head-tracking camera when the subject sits directly facing the screen. Possible head motion is detected as movements of the four LEDs and is compensated for on-line from the eye position records. The system allows free head motion within a 100 cm cube.

Materials

Forty existing and 36 novel two-noun compounds were selected from an unpublished computerized newspaper corpus of 22.7 million word forms with the help of the WordMill database program of Laine and Virtanen (1999). Initially, there were 40 novel compounds selected, but due to a spelling error appearing prior to the target word, two sentence pairs containing a novel compound had to be excluded. Twenty existing compounds had a high-frequency first constituent with a mean of 321 occurrences per million (range 73–1143), whereas the other twenty had a (relatively) low-frequency first constituent with a mean of 4 per million (range 1–13; all frequency counts reported are scaled to one million). Eighteen novel compounds had a high-frequency first constituent with a mean of 288 occurrences per million (range 74–689), whereas the other eighteen had a (relatively) low-frequency first constituent with a mean of 4 per million (range 1–14). The existing words had an average whole word frequency of about 2 per million, whereas the novel compounds did not appear in our database. However, all the novel compounds were to be found on the internet (Google frequency from 1 to 5) and were, according to our own estimation, interpretable. In order to ensure that our estimation was correct, 14 persons from the Department of Psychology in Turku rated on a 4-point scale a preselected set of novel compounds on their comprehensibility (1: I don’t know what it means; 2: I have an idea what it might mean; 3: I think I know what the word means; 4: I know what the word means). Only compounds which scored on average 2 or higher were selected for the experiment proper. Almost all of the selected compounds were semantically transparent: the meaning of the constituents as separate words was the same as their meaning in the target compounds, so that the meaning of the compound could be composed on the basis of constituent meanings. (There was one compound word in each of the four conditions that was slightly opaque.) The high- and low-frequency first constituent conditions were matched for surface frequency, second constituent frequency, bigram frequency, word length in letters, and length of the first constituent. The existing compounds were closely matched with the novel compounds in all aspects except for whole-word frequency and comprehensibility ratings. The lexical-statistical properties of all conditions including the comprehensibility ratings are listed in Table 1 and the compounds themselves and their translations are listed in the Appendix.

Table 1.

Lexical-Statistical Properties of the Four Conditions in the Experiment

| Word property | Existing compounds with high-frequency first constituent | Existing compounds with low-frequency first constituent | Novel compounds with high-frequency first constituent | Novel compounds with low-frequency first constituent |
| --- | --- | --- | --- | --- |
| Mean 1st-constituent frequency (a) | 321 | 4 | 287 | 4 |
| Mean 2nd-constituent frequency (a) | 123 | 114 | 87 | 121 |
| Mean surface frequency (a) | 1.78 | 1.53 | 0.00 | 0.00 |
| Mean comprehensibility rating (b) | 3.88 | 3.79 | 2.94 | 3.15 |
| Mean bigram frequency (c) | 7.0 | 7.6 | 7.4 | 7.0 |
| Mean word length (d) | 13.1 | 13.1 | 13.2 | 13.2 |
| Mean 1st constituent length (d) | 7.2 | 7.4 | 7.3 | 7.5 |

(a) All values scaled to one million words.
(b) Rating scale from 1 to 4.
(c) Scaled to one thousand.
(d) Length in characters (the range of word length in all conditions was 12–16 characters).

The target words were embedded in sentences with each target word appearing in a separate sentence. Each of the compound words starting with a high-frequency first constituent was paired with a compound word with a low-frequency first constituent, and a sentence frame was constructed that was identical up to the word following the target word; the rest of the sentence was different. To match for semantic plausibility, another rating study was conducted, in which both versions of the sentence pairs were listed underneath each other, and 9 participants who did not participate in the other rating task or in the experiment proper rated the naturalness of the sentences by indicating whether one of the sentences was more plausible or whether they were equally plausible. If at least 4 out of 9 rated one sentence to be more plausible than the other, a new sentence frame was constructed and rated again. This only happened on a few occasions (7 out of 80). In sum, the final target sentences in all pairs were equally plausible. An example of a sentence pair is shown below; the target word is shown in bold.

Low-frequency first constituent, existing compound:

Päätöksen mukaan konttisatama uudistetaan vastaamaan paremmin nykyisiä liikennemääriä.

‘It has been decided that the container port will be renewed in order to deal better with current traffic volumes’

High-frequency first constituent, existing compound:

Päätöksen mukaan koulurakennus uudistetaan täydellisesti vuoden 2011 loppuun mennessä.

‘It has been decided that the school building will be repaired by the end of 2011’

Low-frequency first constituent, novel compound:

Grönroosin käyttämä shakkitemppu tuotti vastustajalle paljon päänvaivaa.

‘The chess trick used by Grönroos caused a lot of trouble for the opponent’

High-frequency first constituent, novel compound:

Grönroosin käyttämä vauhtiresepti tuotti hänelle kaksi rallin maailmanmestaruutta.

‘The speed formula used by Grönroos brought him two rally world championships’

The target sentences were presented in Courier one at a time, starting from the center left position on the computer screen. The sentences took up a maximum of 1 line of text, the critical word never appearing as the initial or final word of a text line. With a viewing distance of about 65 cm, one character space subtended approximately 0.5 degrees of visual angle. The 80 target sentences were mixed with 88 filler sentences. The sentences were presented in two blocks, so that paired sentences never appeared in the same block. The order of the blocks was counterbalanced across participants, and within a block the order of sentences was randomized.
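The display geometry can be checked with the standard visual angle relation s = 2d·tan(θ/2). The sketch below uses only the viewing distance and per-character angle stated above; the function name is ours, and the 13-character word length is the approximate mean from Table 1.

```python
import math

def size_on_screen_cm(angle_deg, distance_cm):
    """Physical extent that subtends angle_deg at the given viewing distance."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

# One character space: 0.5 degrees at about 65 cm.
char_width_cm = size_on_screen_cm(0.5, 65.0)

# A 13-character target word; adding per-character angles is a small-angle approximation.
word_angle_deg = 13 * 0.5
```

On these figures a character space is a little under 0.6 cm wide, and a typical target word extends over roughly 6.5 degrees, well beyond what a single fixation can resolve in detail.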

Procedure

Prior to the experiment, the eyetracker was calibrated using a 3-point calibration grid that extended over the horizontal axis of the computer screen. Prior to each sentence, the calibration was checked by presenting a fixation point in the center left position of the screen; if needed, calibration was automatically corrected, after which a sentence was presented to the right of the fixation point.

Participants were instructed to read the sentences for comprehension at their own pace. They were further told that periodically they would be asked to paraphrase the last sentence they had read to make sure that they were attending to what they read. It was emphasized that the task was to comprehend, not to memorize the sentences. Participants were asked to paraphrase a sentence approximately after every seven sentences. The experimental session lasted a maximum of 45 minutes.

RESULTS

Effects prior to target word

We first examined fixation times on the word prior to the target word (word n−1) to determine whether there were any effects of the target word manipulation that surfaced in the eye-movement record prior to landing on the target word (so-called parafoveal-on-foveal effects). We primarily examined two global measures of processing on this word. Of course no eye movement measure is a magic indicator of processing time, but the following two measures are most frequently employed to assess the time to process a word. The first is gaze duration, the sum of the durations of all fixations on a word between the time it is entered from the left until the word is exited in either direction. The second is a measure that includes additional time spent before moving on to the next word. Here, we report selective regression path duration, which includes the durations of fixations on the target word after a regression was made to a prior word, but before a saccade is made to the right of the target word. This is slightly different from what is often termed go-past time, which also includes the durations of the fixations in the regression path (however, it will be less cumbersome if we refer to this measure as go-past time).
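To make the difference between these measures concrete, the following sketch computes them from a single fixation sequence. It is an illustration of the verbal definitions above, not the analysis software used in the experiment; the data layout (a list of `(word_index, duration_ms)` pairs), the function name, and the example trial are all hypothetical.

```python
def reading_time_measures(fixations, target):
    """Compute first-pass measures for one word from a fixation sequence.

    fixations: list of (word_index, duration_ms) tuples in temporal order.
    target:    index of the word of interest.
    Returns None if the word was never fixated or was first reached
    after the eyes had already been to its right.
    """
    # Locate the first fixation on the target word.
    first = next((i for i, (w, _) in enumerate(fixations) if w == target), None)
    if first is None:
        return None
    # First pass requires entry from the left: no earlier fixation beyond the target.
    if any(w > target for w, _ in fixations[:first]):
        return None

    first_fixation = fixations[first][1]

    # Gaze duration: consecutive fixations on the target until it is left in either direction.
    gaze = 0
    for w, d in fixations[first:]:
        if w != target:
            break
        gaze += d

    # Scan on until the eyes move to the right of the target.
    selective = 0   # fixations on the target only
    go_past = 0     # also counts fixations on earlier words (the regression path)
    for w, d in fixations[first:]:
        if w > target:
            break
        go_past += d
        if w == target:
            selective += d

    return {"first_fixation": first_fixation, "gaze": gaze,
            "selective_regression_path": selective, "go_past": go_past}

# Hypothetical trial: word 3 is the target; the reader regresses to word 2 and returns.
trial = [(1, 210), (2, 240), (3, 230), (3, 250), (2, 180), (3, 200), (4, 220)]
measures = reading_time_measures(trial, target=3)
```

In this made-up trial, gaze duration covers only the first two fixations on the target, the selective regression path duration adds the refixation after the regression, and conventional go-past time additionally counts the fixation on word 2.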

As can be seen in Table 2, there was little effect of either 1st constituent frequency or novelty on either gaze duration or go-past time on word n−1. The main effect of the 1st constituent frequency of the target word, which would be the most likely effect to find on a prior word, was not significant on either measure (F1s, F2s < 1). There was an indication of a novelty effect in the participant analysis for both measures; however, the effect was not close to significant in the item analyses, F1(1,26) = 4.007, p < .06; F2 <1, F1(1,26) = 7.056, p < .025; F2 <1, respectively. Moreover, such a novelty effect would be hard to interpret as the words n−1 in the novel compound conditions and existing compound conditions were different. The interaction effects were also not significant (Fs < 1). Thus, we think it is safe to conclude that there is no indication that the target word has any effect on eye movement behavior before it was fixated.

Table 2.

Global Mean Fixation Duration Measures on Word N-1 and the Target Word in Milliseconds

| Eye Movement Measure (ms) | Existing compounds, High | Existing compounds, Low | Novel compounds, High | Novel compounds, Low |
| --- | --- | --- | --- | --- |
| Gaze Duration on Word N-1 | 235 | 236 | 251 | 241 |
| Go Past Time on Word N-1 | 248 | 247 | 269 | 269 |
| Gaze Duration on Target Word | 529 | 591 | 671 | 824 |
| Go Past Time on Target Word | 571 | 625 | 723 | 874 |
| Total Fixation Duration on Target Word | 719 | 747 | 964 | 1072 |

Note: High = frequent 1st constituent; Low = infrequent 1st constituent.

Target word

Global measures of processing

The first question of interest was the effect of the two variables on the time to process the target compound word. A common measure to assess global processing time is gaze duration on a word. Gaze duration not only reflects processes related to lexical access, but also seems to be a promising measure of accessing word meaning. For example, a series of experiments (Duffy, Morris, & Rayner, 1988; Rayner & Duffy, 1986) showed that gaze durations on semantically ambiguous words (e.g., bank) were affected by whether the prior context was consistent with the less frequent meaning of the word or was ambiguous as to which meaning was intended. In short, these studies indicated that gaze duration was not only indexing the time to access the orthographic or phonological form of the word but was also indexing the time to arrive at a meaning consistent with the prior context.

As can be seen in Table 2, there was a 107 ms main effect of 1st constituent frequency and a 187 ms main effect of novelty on gaze duration, F1(1,26) = 46.36, p < .001; F2(1,36) = 22.33, p < .001, F1(1,26) = 127.3, p < .001; F2(1,36) = 6.691, p < .025, respectively. However, even more central to the focus of this experiment there was a large interaction, with the 1st constituent frequency effect being 92 ms larger for the novel words, F1(1,26) = 9.73, p < .005; F2(1,36) = 22.80, p < .001. The 62 ms 1st constituent frequency effect for the existing words just missed being significant in the item analysis, t1(26) = 2.925, p < .01, t2(19) = 2.018, p < .06, but the 153 ms 1st constituent frequency effect for the novel words was significant, t1(26) = 6.821, p < .001, t2(17) = 4.363, p < .001.

As can also be seen in Table 2, the pattern of results for go-past time is quite similar to that for gaze duration. Both the 103 ms main effect of 1st constituent frequency and the 200 ms main effect of novelty were significant, F1(1,26) = 39.41, p < .001; F2(1,36) = 17.49, p < .001, F1(1,26) = 123.9, p < .001; F2(1,36) = 9.020, p < .005, respectively, as was the 96 ms interaction, F1(1,26) = 10.56, p < .005; F2(1,36) = 22.03, p < .001. On this measure, the 54 ms 1st constituent frequency effect for the existing words was again not significant in the item analysis, t1(26) = 2.532, p < .02, t2(19) = 1.617, p < .20, but the 151 ms 1st constituent frequency effect for the novel words was significant, t1(26) = 6.660, p < .001, t2(17) = 3.964, p < .001.

A third measure commonly used to measure global processing time on a word is total fixation time, which is the sum of all fixations on a word. This measure may be less reliable in the present study, because it includes refixations on the word after later text has been read, and this later text varies between low- and high-frequency 1st constituent items even in the matched sets. Nonetheless, the pattern of data on this measure is fairly similar to that of the two prior measures (see Table 2). That is, both the 68 ms main effect of 1st constituent frequency and the 284 ms main effect of novelty were significant, F1(1,26) = 13.02, p < .001; F2(1,36) = 5.779, p < .025, F1(1,26) = 156.4, p < .001; F2(1,36) = 6.178, p < .025, respectively; however the 80 ms interaction was not significant by items, F1(1,26) = 7.221, p < .02; F2(1,36) = 1.772, p < .20. On this measure, the 28 ms 1st constituent frequency effect for the existing words was not close to significant, t1(26) = 1.407, p < .20, t2(19) < 1, but the 108 ms 1st constituent frequency effect for the novel words was significant, t1(26) = 3.890, p < .001, t2(17) = 2.595, p < .025.

In sum, the data pattern on all three measures was clear. There were large main effects of both 1st constituent frequency and novelty and a large interaction between the two. We should again point out that the novelty main effects were “between-item” effects and, even though we tried to control the two novelty conditions closely and even though it seems implausible for there to be no main effect of novelty, these effects could possibly be caused by uncontrolled differences in the prior sentence frames for the existing and novel compounds. However, both the 1st constituent frequency effect and the interaction were within-item effects and not subject to this criticism. Thus, the overall pattern on measures that plausibly assess the time to understand the compound word, most notably the large frequency by novelty interactions, is quite different from that observed for the English prefixed words.
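The main effects and the interaction can be recovered from the cell means in Table 2 as simple differences. The sketch below does this arithmetic for the three global measures on the target word. The function and key names are ours, and the unweighted averaging over cells is an assumption; because the inputs are rounded cell means, the outputs need not match the values reported in the text exactly.

```python
# Cell means in ms from Table 2: (existing-high, existing-low, novel-high, novel-low)
table2 = {
    "gaze duration": (529, 591, 671, 824),
    "go-past time": (571, 625, 723, 874),
    "total fixation duration": (719, 747, 964, 1072),
}

def effects(existing_high, existing_low, novel_high, novel_low):
    """Frequency effects, main effects, and interaction from four cell means."""
    freq_existing = existing_low - existing_high   # frequency effect, existing compounds
    freq_novel = novel_low - novel_high            # frequency effect, novel compounds
    return {
        "frequency_existing": freq_existing,
        "frequency_novel": freq_novel,
        # Main effects: unweighted averages over the other factor (an assumption).
        "frequency_main": (freq_existing + freq_novel) / 2,
        "novelty_main": ((novel_high + novel_low) - (existing_high + existing_low)) / 2,
        # Interaction: how much larger the frequency effect is for novel compounds.
        "interaction": freq_novel - freq_existing,
    }

results = {measure: effects(*cells) for measure, cells in table2.items()}
```

Writing the interaction as a difference of differences makes the key contrast explicit: on every measure, the frequency effect for novel compounds exceeds that for existing compounds.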

More detailed analyses of the time course of processing the target word

As we have seen above, there is no evidence of any 1st constituent frequency or novelty effect on fixations prior to the target word. Hence, the earliest measure on which one would expect to find effects is the first fixation on the target word. The first fixation duration data are quite clear: there is a 1st constituent frequency effect and nothing else (see Table 3). The 25 ms main effect of 1st constituent frequency was significant, F1(1,26) = 33.11, p < .001; F2(1,36) = 26.00, p < .001, whereas neither the main effect of novelty nor the interaction was close to significant (all Fs < 1). In addition, the suggestive difference between novel and lexicalized low-frequency 1st constituent words was not close to significant, t1(26) = 1.008, p < .20, t2 < 1. As can be seen from Table 3, there are no significant effects on the location of the first fixation. There is a hint of a novelty effect, F1(1,26) = 3.72, p < .001; F2 < 1; however, this effect could be due to differences in the length of word N-1 and other differences in the text prior to the target word in the novel and lexicalized conditions. The main effect of 1st constituent frequency and the interaction were not close to significant, F1(1,26) = 1.080, 1.445, ps > .20; F2s < 1. Thus, at the earliest stages of processing, 1st constituent frequency has a clear effect on processing time but novelty has virtually no effect.

Table 3.

More Detailed Eye Movement Measures

Eye Movement Measure (condition means) | Existing, High | Existing, Low | Novel, High | Novel, Low
First Fixation Duration (ms) | 229 | 249 | 229 | 258
First Fixation Location (characters) | 3.73 | 3.92 | 3.65 | 3.64
Second Fixation Duration (ms) | 237 | 212 | 237 | 248
Second Fixation Location (characters) | 7.70 | 7.09 | 7.81 | 7.13
Gaze Duration on Constituent 1 (ms) | 307 | 375 | 318 | 460
Gaze Duration minus Gaze Duration on Constituent 1 (ms) | 228 | 218 | 359 | 371
# of Fixations on whole compound | 2.32 | 2.61 | 2.90 | 3.31
# of Fixations on Constituent 1 | 1.35 | 1.58 | 1.42 | 1.74

Note: High = frequent 1st constituent; Low = infrequent 1st constituent.

The next best candidates for early processing effects are the corresponding measures on the second fixation on the word. For the duration of the second fixation on the word, there was a 36 ms interaction between 1st constituent frequency and novelty that was significant, F1(1,26) = 13.54, p < .001; F2(1,36) = 8.366, p < .01. In contrast, the main effect of 1st constituent frequency was not close to significant, F1(1,26) = 1.784, p < .20; F2 < 1, and the main effect of novelty was only significant over participants, F1(1,26) = 8.772, p < .01; F2 < 1. Although the direction of the interaction is the same as in the global processing measures, the pattern is quite different (see Table 3). That is, the interaction is largely driven by the fact that fixation time was 25 ms less in the low-frequency existing word condition than in the high-frequency existing word condition, t1(26) = −4.027, p < .001, t2(19) = −3.201, p < .005. Thus, the interaction on this measure cannot be the total explanation for the interaction in the gaze duration (and the other global measures); for those, the major effect was that there was a large difference between the two novel word conditions. (We will return to consider why this apparently anomalous result occurred for the existing compound words.) In contrast, the pattern of data for the location of the second fixation was quite simple. Readers, on average, fixated about two-thirds of a character further into the word when the frequency of the 1st constituent was high, F1(1,26) = 18.73, p < .001; F2(1,36) = 21.35, p < .001, but there was no hint of a novelty effect or interaction (all Fs < 1). This difference could be viewed as a facilitative effect as it indicates that, on average, the reader has processed more of the word and hence wants to shift attention further into the word. Thus, although 1st constituent frequency did not have a facilitative effect on the duration of the second fixation, it did on the location of the second fixation.

The next index that is a plausible measure of relatively early processing on the compound word is the gaze duration on the first constituent. This measure is defined exactly as gaze duration on a word except that the region of interest is just the letters of the first constituent. As can be seen from Table 3, the pattern of data on this measure, to a large extent, mirrors the pattern of the gaze duration on the whole word. That is, there is a 106 ms main effect of 1st constituent frequency, F1(1,26) = 75.05, p < .001; F2(1,36) = 67.26, p < .001, a 50 ms effect of novelty, F1(1,26) = 24.26, p < .001; F2(1,36) = 3.034, p < .10, and a 74 ms interaction, F1(1,26) = 17.04, p < .001; F2(1,36) = 53.32, p < .001. In addition, the 1st constituent frequency effects for the existing words (68 ms, from the Table 3 means) and the novel words (142 ms) were both significant, t1(26) = 4.185, p < .001, t2(19) = 4.191, p < .001, and t1(26) = 10.908, p < .001, t2(17) = 7.202, p < .001, respectively. The chief difference between these data and the gaze duration data on the whole word is that the main effect of novelty in these data was considerably smaller and only marginally significant in the item analyses. (The first constituent was skipped about 2.5% of the time, but there were virtually no differences among the conditions, Fs < 1.)

We also computed a measure that assessed later processing: the gaze duration on the entire compound minus the gaze duration on the 1st constituent. This assesses the time spent on the target word after the participant left the 1st constituent. This is not the gaze duration on the 2nd constituent as (a) it does not include the few trials on which the first constituent was skipped (but when the 2nd constituent was skipped, that trial was counted as a zero fixation time rather than missing data) and (b) it includes the durations of regressive fixations back to the first constituent and possible re-inspection of the second constituent. As can be seen in Table 3, these data are quite straightforward: there was a 142 ms novelty effect, F1(1,26) = 79.12, p < .001; F2(1,36) = 16.77, p < .001, but virtually no 1st constituent frequency effect, and only the slightest hint of an interaction (Fs <1). Thus, the entire 1st constituent frequency effect for the gaze duration for existing words is captured by the gaze duration on the 1st constituent, so that further processing on these words is largely concerned with other things.
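The two first-pass measures just defined can be illustrated with a minimal sketch. It assumes a hypothetical trial format that is not taken from the original analysis: each first-pass fixation on the target word is a pair of (letter position within the word, duration in ms), listed in temporal order. The function name and the example trial are likewise illustrative only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical trial format: (letter position within the word, 1-based; duration in ms)
Fixation = Tuple[int, int]


def first_pass_measures(fixations: List[Fixation], c1_length: int) -> Optional[Dict[str, int]]:
    """Split whole-word gaze duration into constituent-1 gaze and the remainder.

    `fixations` holds every first-pass fixation on the target word, from first
    entering the word until the eyes leave it. Returns None when there are no
    fixations or when the first constituent was skipped (trial excluded).
    """
    if not fixations or fixations[0][0] > c1_length:
        return None
    word_gaze = sum(duration for _, duration in fixations)
    c1_gaze = 0
    for position, duration in fixations:
        if position > c1_length:  # the eyes have left the first constituent
            break
        c1_gaze += duration
    # The remainder includes regressions back to constituent 1 and any
    # re-inspection of constituent 2; it is 0 if constituent 2 was never fixated.
    return {"word_gaze": word_gaze, "c1_gaze": c1_gaze, "remainder": word_gaze - c1_gaze}


# Illustrative trial: two fixations on constituent 1 (8 letters long),
# one on constituent 2, then a regression back to constituent 1.
trial = [(3, 229), (6, 100), (10, 240), (5, 150)]
print(first_pass_measures(trial, c1_length=8))
```

In this sketch the regression to letter 5 counts toward the remainder rather than toward the gaze duration on the first constituent, which is why the remainder is not simply the gaze duration on the second constituent.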

Finally, of interest is the extent to which the two more global first-pass time measures reported above, gaze duration and gaze duration on the first constituent, could be accounted for by the number of fixations in the relevant region. As can be seen in Table 3, the answer is that these variables explain most of the differences between conditions, but not all. First, consider the total number of fixations on the word. Here there was a 0.350 1st constituent frequency main effect and a 0.634 novelty main effect in the number of fixations that were significant, F1(1,26) = 23.50, p < .001; F2(1,36) = 16.50, p < .001, and F1(1,26) = 114.4, p < .001; F2(1,36) = 14.45, p < .001, respectively, but the 0.12 interaction was far from significant (Fs < 1). The pattern was somewhat similar for the number of first-pass fixations on the 1st constituent. There was a 0.28 1st constituent frequency main effect in the number of fixations that was significant, F1(1,26) = 35.54, p < .001; F2(1,36) = 16.50, p < .001, but the 0.12 novelty main effect was not significant over items, F1(1,26) = 13.47, p < .001; F2(1,36) = 1.867, p < .20, and again the 0.10 interaction was not close to significant, F1(1,26) = 1.953, p < .20; F2 < 1. It thus appears that a substantial part of the 1st constituent frequency by novelty interaction on both gaze duration on the word and gaze duration on the 1st constituent is not accounted for by the number of fixations, and is likely accounted for by differences in the durations of fixations after the second fixation.
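The effect sizes quoted in this section are simple contrasts among the four cell means of the 2 × 2 design. The following sketch recomputes them from the Table 3 means; the dictionary layout and function name are assumptions made for illustration. Because the tabled means are rounded, the results can differ by a millisecond or two from the values reported in the text.

```python
from typing import Dict, Tuple

# Cell means copied from Table 3, ordered as
# (existing-high, existing-low, novel-high, novel-low).
TABLE3: Dict[str, Tuple[float, float, float, float]] = {
    "first fixation duration (ms)": (229, 249, 229, 258),
    "second fixation duration (ms)": (237, 212, 237, 248),
    "gaze duration on constituent 1 (ms)": (307, 375, 318, 460),
    "gaze duration minus constituent 1 (ms)": (228, 218, 359, 371),
    "fixations on whole compound": (2.32, 2.61, 2.90, 3.31),
}


def effects(existing_high: float, existing_low: float,
            novel_high: float, novel_low: float) -> Dict[str, float]:
    """Main effects and interaction for a 2 x 2 table of cell means."""
    freq_existing = existing_low - existing_high  # frequency effect, existing words
    freq_novel = novel_low - novel_high           # frequency effect, novel words
    return {
        "frequency": (freq_existing + freq_novel) / 2,
        "novelty": (novel_high + novel_low) / 2 - (existing_high + existing_low) / 2,
        "interaction": freq_novel - freq_existing,
    }


for measure, cells in TABLE3.items():
    summary = {name: round(value, 2) for name, value in effects(*cells).items()}
    print(measure, summary)
```

For the second fixation duration, the interaction contrast is positive even though the frequency main effect is slightly negative, which reflects the reversed frequency effect for the existing compounds discussed in the text.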

Accounting for the second fixation duration data

Most of the findings above are quite straightforward. The only one that is not is the interaction on the second fixation duration, which was largely driven by the fact that fixation durations on low frequency 1st constituent existing words were shorter than those on high frequency 1st constituent existing words. As this seems anomalous, it seems likely that there is a confounding that is causing the problem. The most likely candidate for such a confounding is the following. Compound words with high frequency 1st constituents are more likely to have many other compound words beginning with the same 1st constituent than compounds with low frequency 1st constituents. As a result, for existing low frequency first constituent compound words, it is plausible that there is some sort of facilitation on processing that is akin to a predictability effect (see Hyönä, Bertram, & Pollatsek, 2004, for a similar argument and data supporting it). Similarly, for the novel words, if there are not many existing compound words beginning with that 1st constituent, the fact that this is a novel compound may be signaled early (e.g., no existing compound has a 2nd constituent beginning with the 1st letter of the 2nd constituent).

There is some suggestive evidence for this hypothesis. We computed what we thought to be the most relevant family size measure for each compound word: the number of existing compound words that begin with the first constituent and also share the first letter of the second constituent. Given that this value was high for virtually all the high-frequency first constituent words, we concentrated on this measure for the low-frequency first constituent words. Also, given the other variation over items due to sentence context and other variables, the best dependent variable to examine was plausibly the difference in gaze duration on the first constituent between the low and high frequency items in a pair of items (averaged over participants). If our above hypothesis is true, one would expect the correlation between this difference and family size to be negative for novel words (i.e., inflated fixation times when there were few, or in many cases no, existing compounds that had that first constituent and beginning letter of the second constituent). Similarly, one might expect a facilitative effect of small family size for the existing compounds, as the second constituent thus becomes fairly predictable. Our analysis was consistent with these predictions; the correlations between this difference in gaze on the first constituent and this family size measure were −.582 and +.211 for the novel and existing compound words, respectively. (Both the first correlation and the difference between the correlations were significant, p < .02.) Thus, it appears that some of this relatively early novelty effect may be due to processing the first letter of the second constituent and its implications for finding a second constituent rather than encoding the entire second constituent and realizing that it is novel.
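The family size measure described above can be sketched as a simple count over a lexicon of compounds. The lexicon representation, the function name, and the decision to count the target itself when it is in the lexicon are all assumptions made for illustration; the toy lexicon below reuses a few existing compounds from the Appendix and does not represent real family size counts.

```python
from typing import Iterable, Tuple

Compound = Tuple[str, str]  # (first constituent, second constituent)


def family_size(first: str, second: str, lexicon: Iterable[Compound]) -> int:
    """Count existing compounds that share the first constituent and the
    first letter of the second constituent with the target compound."""
    return sum(
        1
        for c1, c2 in lexicon
        if c1 == first and c2[:1] == second[:1]
    )


# Toy lexicon (illustrative only): a handful of existing compounds from the Appendix.
toy_lexicon = [
    ("tutkimus", "retki"),
    ("tutkimus", "osasto"),
    ("koulu", "rakennus"),
    ("teatteri", "sali"),
]

print(family_size("tutkimus", "retki", toy_lexicon))  # existing target, counts itself
print(family_size("koulu", "sali", toy_lexicon))      # no compound starts with koulu + s
```

A novel compound whose first constituent has no existing family member with the same following letter yields a count of zero, which is the situation the text suggests may signal novelty early.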

These results are in agreement with other studies that examined the role of left constituent family size in compound processing. For example, Hyönä et al. (2004) found a predictability effect within Finnish compounds, such that the second constituent of a compound was processed faster when the first constituent was low frequency and had few family members than when the first constituent was high frequency and had many family members. Similarly, Kuperman, Schreuder, Bertram and Baayen (2009) observed an interaction between right constituent frequency and left constituent family size in Dutch compounds, such that the effect of right constituent frequency was strongest in compounds with large left constituent families and decreased with decreasing morphological family size. They argued that this interaction reflects that the ease of accessing the right constituent (diagnosed by its frequency effect) speeds up compound recognition more when there is more uncertainty about which candidate to choose from a larger number of possible right constituents. They further argued that if the competition in the family is relatively weak due to a small number of possible family members, the right constituent may be relatively easy to predict, and additional morphological information in the form of right constituent frequency is not as useful for the lexical processor.

Discussion

The pattern of results in the current experiment is relatively straightforward. Both whether the compound word is novel and the frequency of the first constituent have substantial effects on the gaze duration on the word, and hence, presumably the time to encode the word up to some level. Moreover, the effects are strongly interactive, with the first constituent frequency having a much larger effect for the novel compound words. In addition, the time course of the effects is somewhat different, as the novelty effect occurs somewhat later in the eye movement record. This time-course difference makes sense, as the novelty of the word would not generally be detected until significant processing of the second constituent has occurred. However, the fact that novelty effects appear as early as on the second fixation duration is a bit of a puzzle. We will discuss these relatively early novelty effects below, but we will assume that what might be considered “true novelty effects” (i.e., getting to the meaning of the novel word) occur relatively late in processing the compound word.

At first, it may seem that there is a very simple explanation for the pattern of data in terms of a dual route “race” model between one process that accesses the word through its whole form (i.e., without reference to the component morphemic constituents) and another that constructs the word from its constituents. In such a model, as indicated earlier, the existing compound words can be accessed, at least part of the time, through their whole form, or, to put it differently, processing the constituents could be bypassed some of the time. In contrast, according to such a model, novel compounds always have to be processed by the constituent process (due to a lack of whole-word representations) and thus any effect due to the frequency of a constituent would be larger for the novel words than for the existing words.
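The prediction of such a race model can be illustrated with a deterministic toy calculation. This is only a sketch under stated assumptions: the finishing times below are arbitrary placeholder values, not estimates from the experiment, and a realistic race model would use distributions of finishing times rather than fixed numbers.

```python
from typing import Optional


def race_time(whole_word_ms: Optional[float], constituent_ms: float) -> float:
    """Recognition time is set by whichever route finishes first.

    A novel compound has no whole-word representation, so `whole_word_ms`
    is None and only the constituent route is available.
    """
    if whole_word_ms is None:
        return constituent_ms
    return min(whole_word_ms, constituent_ms)


# Arbitrary placeholder times (ms), chosen only to illustrate the logic.
WHOLE_WORD = 400.0          # whole-word route, existing compounds only
CONSTITUENT_HIGH = 350.0    # constituent route, high-frequency 1st constituent
CONSTITUENT_LOW = 450.0     # constituent route, low-frequency 1st constituent

existing_effect = race_time(WHOLE_WORD, CONSTITUENT_LOW) - race_time(WHOLE_WORD, CONSTITUENT_HIGH)
novel_effect = race_time(None, CONSTITUENT_LOW) - race_time(None, CONSTITUENT_HIGH)

print(existing_effect, novel_effect)
```

With these placeholder values the whole-word route caps the cost of a low-frequency constituent for existing compounds, so the constituent frequency effect is smaller for existing than for novel compounds, which is the qualitative pattern the race account predicts.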

Although such a model can give an adequate account of the data from the present experiment, it cannot explain the difference between the pattern of results from the present experiment and the pattern in the parallel experiment using English prefixed words (Pollatsek et al., 2008), where the root frequency effect was the same for the novel and existing prefixed words. Of course, one could always pass over this difference by saying that Finnish is processed differently than English and leave it at that. However, such a vague non-explanation seems deeply unsatisfying and is also inconsistent with the findings from English and Finnish compound experiments that indicate that constituent and whole-word frequency effects are quite similar across languages (see, e.g., Andrews et al., 2004; Hyönä & Pollatsek, 1998; Inhoff et al., 2008; Juhasz & Berkowitz, in press; Pollatsek et al., 2000). Moreover, we think there is a coherent theory that explains the difference and may be helpful in understanding how morphemically complex words are processed in general, and more specifically, at what level of processing various effects are occurring.

The key idea of our theory is that the effects we are observing occur at two different levels, which, to simplify the exposition, will be treated as two serial stages (for a similar proposal, see Libben, 1998). The first stage is to arrive at an identification of the orthographic and phonological form of the stimulus. For words, this would be roughly equivalent to what is termed ‘lexical access’ in many models, but we are making the first stage more general to cover nonwords as well. (We realize that in many PDP models, there is no fundamental distinction between how such a process occurs for words and nonwords.) The second stage in our model is the computation of the meaning of the entity. Obviously, using an expression like “the meaning” is quite loaded, as there are many levels of meaning that can be extracted from a word. We will return to this point later, but for the moment, what is meant by this stage is encoding the meaning of the word well enough to go on to the next word in the sentence.

According to our model, constituent frequency effects can occur in both stages. In the first (identification) stage, the lower the frequency a constituent has, the longer it takes to complete this stage. Moreover, in what follows, we assume, as stated above, (at least as a first approximation) that both the long Finnish compounds and the relatively long English prefixed words employed in Pollatsek et al. (2008) are long enough to be identified by a componential process virtually all the time. Thus, we are assuming that the frequency effects that occur in the first stage of processing for both the long Finnish compound words and the long English prefixed words occur for similar reasons, although the details of the composition process undoubtedly differ for the two types of words.

In contrast, we assume that the second stage of processing is quite different for the two types of words. As indicated earlier, the English prefixed words employed all had semantically transparent prefixes, and thus plausibly, the meaning of the word can be constructed from the root morpheme and the prefix by something like a “rule”. That is, the meaning of misX means that you did X wrongly in some way regardless of what X is (e.g., the meaning of miscircled can be computed as something like “circled something wrongly”). In contrast, as argued earlier, for compound words the process of constructing the meaning of the word from its constituents, even for “transparent compounds”, is not nearly as straightforward. This is partly because there are different types of semantic relationships between compound word constituents (Gagné & Spalding, 2009). However, whereas the process of constructing a meaning of an existing compound may be aided by previous experience (that is, readers have constructed the meaning in earlier encounters with the word and the relationship between constituents is more or less established), for a novel compound the interpretation of the relationship between the constituents and with that the construction of the whole meaning may be less obvious. To use an English example (where a novel compound would be written with a space), monkey medicine could mean “medicine for monkeys”, “medicine designed by monkeys”, “medicine made out of monkeys”, etc. To decide on which is a plausible meaning, and possibly also to decide whether this meaning is consistent with the prior context, is clearly a less automatic and rule-governed process. Moreover, it is quite plausible that the time that this process takes is correlated with the frequency of the first constituent: less frequent first constituents tend to engage in the formation of fewer compounds than more frequent first constituents, so there is no firm knowledge about the prototypical semantic relationships in which they may become involved.

In the model, the additive effects of constituent frequency and novelty for the English prefixed words are a result of the two effects occurring in different stages. The root morpheme frequency effect occurs because stage 1, the identification stage, takes longer when the frequency of the root morpheme is low. The novelty effect occurs because stage 2 for the existing prefixed words is fairly automatic: the construction of its meaning has taken place on previous encounters with the word and thus the relationship between the constituents is more or less established. It may even be the case that this stage occurs simultaneously with the completion of stage 1. In contrast, for the novel prefixed words, a second, time-consuming process occurs in which the meaning must be constructed. However, our assumption is that this process is rule-governed and does not depend on previous experiences with the root and therefore we assume that the time it takes to construct the meaning of a novel prefixed word does not depend on the frequency of the root morpheme.

For the Finnish compound words in the present study, however, the effects of the variables on the processing stages are different. The frequency of the first constituent affects processing time in the first stage the same way as for the English prefixed words; thus both the novel and existing compounds take longer in stage 1 when the first constituent is infrequent. Moreover, as for the lexicalized English prefixed words, when the compound word is a lexical item, its meaning is composed (or perhaps retrieved) fairly automatically relying on previous experiences with the whole compound, so stage 2 is not influenced by the frequency of the first constituent. Instead, this process will be influenced by the frequency of the compound word, since the more experiences one has had of constructing the word meaning out of the constituents, the more quickly one can complete this process. However, for the novel Finnish compounds, because the meaning composition is not a rule-governed process, stage 2 takes appreciable time, and this time is longer when the frequency of the first constituent is low, since the prototypical relationship a low-frequency first constituent is engaged in is not firmly established. Thus, there is not only a novelty effect, but the size of the novelty effect is modulated by the frequency of the first constituent.
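The contrast between the two word types can be made concrete with a minimal sketch of the two serial stages. Everything numeric here is an assumption: the millisecond values are arbitrary placeholders rather than estimates from either experiment, whole-word frequency is held constant, and the function names are illustrative.

```python
def stage1_ms(high_freq: bool) -> float:
    """Identification stage: slower when the first constituent (or root) is infrequent."""
    return 300.0 if high_freq else 360.0


def stage2_ms(novel: bool, high_freq: bool, rule_governed: bool) -> float:
    """Meaning stage.

    Existing words: meaning is retrieved fairly automatically, independent of
    constituent frequency. Novel words: meaning must be constructed; the time
    depends on constituent frequency only when composition is not rule-governed
    (compounds), not when it is rule-governed (transparent prefixed words).
    """
    if not novel:
        return 150.0
    if rule_governed:
        return 300.0
    return 300.0 if high_freq else 380.0


def total_ms(novel: bool, high_freq: bool, rule_governed: bool) -> float:
    """Stages are treated as strictly serial, so their durations add."""
    return stage1_ms(high_freq) + stage2_ms(novel, high_freq, rule_governed)


def interaction_ms(rule_governed: bool) -> float:
    """Frequency effect for novel words minus frequency effect for existing words."""
    novel = total_ms(True, False, rule_governed) - total_ms(True, True, rule_governed)
    existing = total_ms(False, False, rule_governed) - total_ms(False, True, rule_governed)
    return novel - existing


print(interaction_ms(rule_governed=True))   # prefixed words: additive effects
print(interaction_ms(rule_governed=False))  # compounds: frequency by novelty interaction
```

Under these assumptions the rule-governed case produces purely additive frequency and novelty effects, whereas the compound case produces a larger frequency effect for novel words, because frequency enters both stages only there.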

We think the above model is a reasonably satisfactory explanation of the data of the two experiments and deserves consideration as a theory of how morphemically complex words are processed. However, it may not fully explain the lack of transparency effects for compounds obtained by Pollatsek and Hyönä (2005). Pollatsek and Hyönä found that transparent Finnish compounds like sunlight are processed as quickly as opaque compounds like hogwash. If the second phase is about the construction of meaning on the basis of the meaning of the constituents (note that White, Bertram & Hyönä, 2008, found evidence for semantic activation of second constituents for transparent Finnish compounds), processing opaque compounds during the second stage would be quite problematic. However, if the model assumes whole-word representations at the lexical level become activated via its constituents and these whole-word representations directly map onto the semantic representation of the compound, the second stage of processing opaque compounds becomes much less problematic. This kind of model architecture fits in with the results of seven lexical decision experiments conducted by Ji (2008), who found that opaque and transparent compounds are recognized equally quickly. However, when she ‘forced’ participants to arrive at the meaning of compounds via the decomposition route by for instance inserting a space at the constituent boundary or coloring one of the constituents, opaque compounds elicited much longer response latencies than transparent compounds. Similarly, Frisson et al. (2008) found a transparency effect in the second experiment of their eye movement study when a space was inserted between the constituents of their transparent and opaque compounds.

The model that we propose is akin to the model of Libben (1998), who proposed a two-stage model in which access to a compound’s semantic representation is mediated by access to the constituents at the lexical level, regardless of whether the compound is transparent or opaque. However, his model differs from our model because constituents in his model first have to activate a whole-word lexical representation before constituent meanings and the whole-word meaning are retrieved. In contrast, our model allows for the retrieval of constituent meanings via constituent representations at the lexical level and the subsequent composition of the whole-word meaning on the basis of these constituent meanings.

We realize that further work needs to be done in computational modeling in order for such a model to be fully accepted. Among other things, in the above presentation, we have merely discussed how the processing of the words occurs and have not attempted to tie it closely to the pattern of eye movements, except by the tacit assumption that the more quickly stage 2 is accomplished, the shorter the gaze duration on the word is. It is clear that additional thought is needed to flesh out the model to tie the above processing assumptions to eye movements, especially to how refixations are directed within a word.

The most obvious difficulty for the theory, as we have sketched it, is whether it can account for the relatively early effects of whole word frequency or novelty in the eye movement record. We have already briefly discussed one explanation for such early effects: “family size of the compound”. In our discussion, we are using the term somewhat differently from its usual sense: the number of words that have the 1st constituent and the first letter of the second constituent. This is a symptom of a more general “missing piece” in the model we have sketched, which is how the reader “parses” a longer morphemically complex word into its components. Some of the time, readers might attend to the series of letters at the beginning, and if the beginning of the segment is a legitimate morpheme or constituent, they encode it as such and then attend to the rest of the word. But what if the first morpheme is longer than this attended segment? Also, for prefixed words, how, exactly, does “prefix stripping” (Taft & Forster, 1976) occur? For example, if the prefix is under, does the reader sometimes take un as the prefix and then have to correct the error? Obviously, one has to go beyond the correlational analyses presented at the end of the results section to see whether such an explanation is satisfactory.

Another explanation for some of the novelty effects on gaze duration on the first constituent is that at least some of these fixations are intended to be on the second constituent (and hence the hypothesized stage 2 may already have begun). As Table 3 indicates, the mean location for second fixations is right around the constituent boundary for most of the compound words, so it is plausible that, for some of the fixations that are recorded as landing on the first constituent, attention is already on the 2nd constituent. Indeed, 32% of the second fixations that are recorded as being on the first constituent are on the last letter of the first constituent. This mismatch between recorded fixation location and where attention is focused could happen for one of two reasons: (a) the participant’s saccade undershoots the intended target of the saccade; or (b) recording error.

In sum, the model proposed has some problems, many of which are due to the fact that one cannot really control everything in an experiment. However, the model seems like a reasonable start at trying to explain, in detail, how compound words are processed in reading. A key ingredient that is missing in the model above and in all models of reading that we know about is a detailed theory about how refixations on a word are programmed. For example, in the E-Z Reader model (e.g., Pollatsek, Reichle, & Rayner, 2006), refixations are semi-automatically programmed to the center of a word, and the speed with which the program is initiated is a function of how close the initial fixation is to the center of the word. If identification of the word proceeds sufficiently rapidly, the program to fixate the next word cancels the refixation program. Clearly, this is too simplistic for long words like the compound words examined in the present study, for which at least some of the refixations have to be deliberate moves further into the word, possibly intended to be on the middle of the 2nd constituent. This indicates that there are many issues to be dealt with in understanding the pattern of eye movements on long morphemically complex words, which include how accurate such refixation targeting is given that there is no space between the constituents to guide the saccade. We are hoping to be able to construct and test such a model in the near future; however, we think the current research indicates the general guidelines on what an adequate model would look like.

Acknowledgments

The first author’s research is supported by grant HD26765 from the National Institutes of Health and the second author’s research is supported by grant 118404 from the Academy of Finland.

Appendix: Target Words1

Existing compounds and their translations
High Frequency First Constituent Low Frequency First Constituent
näytelmä/kerho drama club koliikki/vauva colic child
kaupunki/lehti community newspaper naapuruus/suhde neighbor relationship
talous/saarto economic blockage alttari/taulu altarpiece
säästö/vinkki saving tip rypäle/lajike grape variety
kulttuuri/aarre culture treasure silinteri/hattu silk hat
rakennus/taide building art = architecture laakerin/lehti laurel leaf
vieras/kiintiö visitor quota äitiys/huolto prenatal care
musiikki/osasto music department kristalli/kruunu cut-glass chandelier
puolue/kartta (political) party spectrum kesanto/pelto fallow field = fallow
teatteri/sali theatre hall pastelli/sävy pastel shade
paperi/silppu paper mash akilles/jänne Achilles heel
ammatti/kunta profession municipality = guild viisumi/pakko visa obligation
tutkimus/retki research trip kalvosin/nappi cufflink
liikenne/ruuhka traffic jam vanilja/kastike vanilla sauce
potilas/järjestö patient organization mellakka/poliisi riot policeman
koulu/rakennus school building kontti/satama container port
tilanne/arvio situation assessment pilotti/jakso pilot episode
kierros/nopeus rotation speed poraus/lautta drilling rig
budjetti/komitea budget committee korvike/alkoholi substitute (for) alcohol
tutkimus/osasto research department kansleri/ehdokas Chancellor candidate

Novel compounds and their translations
High Frequency First Constituent Low Frequency First Constituent

menestys/kenkä success shoe kiisseli/nälkä dessert hunger
joukkue/myynti team sales pasifisti/poika pacifist boy
kirkko/liitos church union parturi/video barber video
mestaruus/riemu championship joy kemikalio/murto pharmacy burglary
sairaala/kieli hospital language loikkari/perhe defector family
vauhti/resepti speed formula shakki/temppu chess trick
tuotanto/haitta production drawback sabotaasi/osasto sabotage department
joukkue/keikka team gig kokelas/sauna cadet sauna
mestari/kuoro master choir hurmuri/pappi charmer priest
asukas/hölkkä resident jog juniori/disko junior disco
hotelli/pakko hotel obligation varikko/surma depot death
lauantai/ostos Saturday purchase hienosto/varas high society thief
asiakas/kollega customer colleague kaivuri/tuttava digger acquaintance
tohtori/ehdokas doctor candidate kaappari/ongelma hijacker problem
kevät/kokemus spring experience laatta/kurssi tile course
sotilas/unelma soldier dream samurai/heitto samurai throw
konsertti/turisti concert tourist basisti/kuningas bassist king
maanantai/kriisi Monday crisis boikotti/perinne boycott tradition

Footnotes

1

Each line contains the two target words that appeared in the same sentence frame with the high-frequency first-constituent word first.

Contributor Information

Alexander Pollatsek, Email: pollatsek@psych.umass.edu.

Raymond Bertram, Email: rayber@utu.fi.

Jukka Hyönä, Email: hyona@utu.fi.

References

  1. Andrews S, Miller B, Rayner K. Eye movements and morphological segmentation of compound words: There is a mouse in mousetrap. European Journal of Cognitive Psychology. 2004;16:285–311.
  2. Bertram R, Hyönä J. The length of a complex word modifies the role of morphological structure: Evidence from eye movements when reading short and long Finnish compounds. Journal of Memory and Language. 2003;48:615–634.
  3. Chaffin R, Morris RK, Seely RE. Learning new word meanings from context: A study of eye movements. Journal of Experimental Psychology: Learning, Memory, and Cognition. 2001;27:225–235.
  4. Duffy SA, Morris RK, Rayner K. Lexical ambiguity and fixation times in reading. Journal of Memory and Language. 1988;27:429–446.
  5. Forster KI, Davis C. Repetition priming and frequency attenuation in lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition. 1984;10:680–698.
  6. Frisson S, Niswander-Klement E, Pollatsek A. The role of semantic transparency in the processing of English compound words. British Journal of Psychology. 2008;99:87–107. doi: 10.1348/000712607X181304.
  7. Gagné CL, Spalding TL. Constituent integration during the processing of compound words: Does it involve the use of relational structures? Journal of Memory and Language. 2009;60:20–35.
  8. Hyönä J, Bertram R, Pollatsek A. Are long compounds identified serially via their constituents? Evidence from an eye-movement contingent display change study. Memory & Cognition. 2004;32:523–532. doi: 10.3758/bf03195844.
  9. Hyönä J, Pollatsek A. Reading Finnish compound words: Eye fixations are affected by component morphemes. Journal of Experimental Psychology: Human Perception and Performance. 1998;24:1612–1627. doi: 10.1037//0096-1523.24.6.1612.
  10. Inhoff AW, Starr MS, Solomon M, Placke L. Eye movements during the reading of compound words and the influence of lexeme meaning. Memory & Cognition. 2008;36:675–687. doi: 10.3758/mc.36.3.675.
  11. Ji H. The influence of morphological complexity on word processing. Doctoral dissertation. University of Alberta; 2008.
  12. Juhasz BJ. The influence of semantic transparency on eye movements during English compound word recognition. In: van Gompel R, Fischer M, Murray W, Hill R, editors. Eye movements: A window on mind and brain. New York, NY: Elsevier; 2007. pp. 373–389.
  13. Juhasz BJ. The processing of compound words in English: Effects of word length on eye movements during reading. Language and Cognitive Processes. 2008;23:1057–1088.
  14. Juhasz BJ, Berkowitz RN. Effects of morphological families on English compound word recognition: A multi-task investigation. Language and Cognitive Processes. In press.
  15. Juhasz BJ, Starr M, Inhoff AW, Placke L. The effects of morphology on the processing of compound words: Evidence from naming, lexical decisions and eye fixations. British Journal of Psychology. 2003;94:223–244. doi: 10.1348/000712603321661903.
  16. Kuperman V, Bertram R, Baayen RH. Morphological dynamics in compound processing. Language and Cognitive Processes. 2008;23:1089–1132.
  17. Kuperman V, Schreuder R, Bertram R, Baayen RH. Reading polymorphemic Dutch compounds: Toward a multiple route model of lexical processing. Journal of Experimental Psychology: Human Perception and Performance. 2009;35:876–895. doi: 10.1037/a0013484.
  18. Laine M, Virtanen P. WordMill Lexical Search Program. Center for Cognitive Neuroscience, University of Turku, Finland; 1999.
  19. Libben G. Semantic transparency in the processing of compounds: Consequences for representation, processing and impairment. Brain and Language. 1998;61:30–44. doi: 10.1006/brln.1997.1876.
  20. Longtin C, Meunier F. Morphological decomposition in early visual word processing. Journal of Memory and Language. 2005;53:26–41.
  21. McCormick SF, Rastle K, Davis M. Is there a ‘fete’ in ‘fetish’? Effects of orthographic opacity on morpho-orthographic segmentation in visual word recognition. Journal of Memory and Language. 2008;58:307–326.
  22. Niswander-Klement E, Pollatsek A. The effects of root frequency, word frequency, and length on the processing of prefixed English words during reading. Memory & Cognition. 2006;34:685–702. doi: 10.3758/bf03193588.
  23. Pollatsek A, Hyönä J. The role of semantic transparency in the processing of Finnish compound words. Language and Cognitive Processes. 2005;20:261–290.
  24. Pollatsek A, Hyönä J, Bertram R. The role of morphological constituents in reading Finnish compound words. Journal of Experimental Psychology: Human Perception and Performance. 2000;26:820–833. doi: 10.1037//0096-1523.26.2.820.
  25. Pollatsek A, Reichle ED, Rayner K. Tests of the E-Z Reader model: Exploring the interface between cognition and eye-movement control. Cognitive Psychology. 2006;52:1–52. doi: 10.1016/j.cogpsych.2005.06.001.
  26. Pollatsek A, Slattery TJ, Juhasz BJ. The processing of novel and lexicalized prefixed words in reading. Language and Cognitive Processes. 2008;23:1133–1158.
  27. Rastle K, Davis MH, New B. The broth in my brother’s brothel: Morpho-orthographic segmentation in visual word recognition. Psychonomic Bulletin & Review. 2004;11:1090–1098. doi: 10.3758/bf03196742.
  28. Rayner K, Duffy SA. Lexical complexity and fixation times in reading: Effects of word frequency, verb complexity, and lexical ambiguity. Memory & Cognition. 1986;14:191–201. doi: 10.3758/bf03197692.
  29. Taft M, Forster KI. Lexical storage and retrieval of polymorphemic and polysyllabic words. Journal of Verbal Learning and Verbal Behavior. 1976;15:607–620.
  30. White SJ, Bertram R, Hyönä J. Semantic processing of previews within compound words. Journal of Experimental Psychology: Learning, Memory, and Cognition. 2008;34:988–993. doi: 10.1037/0278-7393.34.4.988.
