Abstract
Traditional models of adult language processing and production include two levels of representation: lexical and sublexical. The current study examines the influence of the inclusion of a lexical representation (i.e., a visual referent and/or object function) on the stability of articulation as well as on phonetic accuracy and variability in typically developing children and children with specific language impairment (SLI). A word learning paradigm was developed so that we could compare children’s production with and without lexical representation. The variability and accuracy of productions were examined using speech kinematics as well as traditional phonetic accuracy measures. Results showed that phonetic forms with lexical representation were produced with more articulatory stability than phonetic forms without lexical representation. Using more traditional transcription measures, a paired lexical referent generally did not influence segmental accuracy (percent consonant correct and type token ratio). These results suggest that lexical and articulatory levels of representation are not completely independent. Implications for models of language production are discussed.
Introduction
Most models of language production assume that lexical and phonological processing are distinct from motor implementation. Separate levels are generally proposed, including a semantic level where conceptual information related to word meaning is stored and a phonological level where specific sound sequences are mapped to the intended concept. There are significant differences among contemporary psycholinguistic models. In one prominent approach, there are discrete levels of processing, with lexical representations stored independently of their phonetic forms, so there is no direct relationship between basic speech motor processes and representation (e.g., Levelt, Roelofs, & Meyer, 1999). Alternatively, connectionist models are interactive, with proponents arguing for bidirectional spreading activation between semantically and phonologically related nodes (e.g., Dell, 1986). Despite the differences in these models, there is convergence in that semantic, lexical, and phonological levels provide input to speech motor implementation, or articulation (e.g., Dell, 1986; Dell & O’Seaghdha, 1992; Levelt, 1989; Levelt, Roelofs, & Meyer, 1999).
A contemporary theoretical alternative to multi-level psycholinguistic models applies exemplar theory to speech perception and production (Pierrehumbert, 2001; Pisoni, 1997; Johnson, 2006). These investigators propose a bundled storage of semantic, lexical, phonological, and phonetic information, so that fine-grained phonetic information is stored directly with the lexical item rather than accessed in a cascading fashion. In speech perception, each instance of a label that is heard is stored, and several instances of a similar label can be collected to form a category. Pierrehumbert (2001) argued that traditional connectionist models do not permit phonetic or phonological variability in the output, whereas a bottom-up approach to perception and production allows for variation in production.
Recently, several investigators have begun to establish a more complex relationship between speech output and higher levels of processing (Baese-Berk & Goldrick, 2009; Frisch & Wright, 2002; Goldrick & Blumstein, 2006; McMillan, Corley, & Lickley, 2009). Relying on detailed acoustic and kinematic analysis, these investigators have shown that there are gradations in articulatory processes as a function of lexical properties. Interactions between neighborhood density and phonetic variation are well documented. For example, Munson and Solomon (2004) found that vowel space was expanded in words from high density neighborhoods. Baese-Berk and Goldrick (2009) showed that these lexical effects were inherent to the speech production process. These investigators controlled for the frequently observed confound between a phonetic variable, phonotactic frequency, and a lexical variable, neighborhood density. They found that voice onset time (VOT) of words with minimal pair neighbors was longer than that of words without such competitors. This was interpreted as a truly lexical effect, since VOT varied systematically as a function of neighborhood density, independent of phonetic level variables such as phonotactic probability.
Other paradigms have revealed gradations in phonetic properties as a function of higher order language processes. Goldrick and Blumstein (2006) found that errors in tongue twisters produced by adult speakers showed traces of the intended target. Post hoc analysis of these data revealed lexical effects, in that traces were larger for errors resulting in non-words compared with words. Focusing explicitly on lexical effects, Frisch and Wright (2002) asked participants to produce tongue twisters where the elicited error would either result in a real word or a non-word. They found that there were lexical effects on errors; when the tongue twister elicited an error that was a real word, there were more errors. These studies reveal that more interactivity is in place than had been assumed.
McMillan and colleagues (2009) tested adult speakers using a slip of the tongue paradigm in which target pairs were either completely non-lexical or mixed (lexical and non-lexical) and where potential slips could result in real words or non-words. They used electropalatographic data to measure motor variability in speech production as well as a more traditional transcription analysis. Their transcription analysis revealed that relatively more substitutions were made in the presence of real word competitors. This is reported as further support for a lexical bias in the speech production mechanism. The electropalatographic results of articulatory variability concurred with the transcription results, in that a more variable production arose when the elicitation pair consisted of only non-words. These authors suggest that the use of articulatory data makes a “direct link between the cognitive processes involved in speech production and the resulting movements that produce speech” (McMillan et al., 2009, p. 62).
The present study, like McMillan et al. (2009), combines transcription and kinematic analyses to ask about interactions between lexical and articulatory levels of speech production, but now in young children. Virtually nothing is known about how such interactivity emerges in development, when speech motor capacities may even more tightly constrain lexical and phonological aspects of speech production. A word learning paradigm makes possible a direct comparison between targets with and without lexical representation.
We include children who are normally developing as well as those notoriously at risk for difficulties with mapping words to phonological forms, namely children with specific language impairment (SLI; Leonard, 1998). Children with SLI were included to replicate findings observed in the normally developing group and to investigate word learning effects in children who often show both lexical and articulatory deficits. Findings for these children may help support the idea that the word learning task included here is functioning as intended and that the results are generalizable. The interaction between articulation and word learning may be particularly critical during development when “word production emerges from a coupling of two independent systems, a conceptual system and an articulatory system” (Levelt et al., 1999, pg. 1). Like McMillan et al. (2009), we combine direct measures of speech kinematics with traditional transcription analysis to assess how lexical and articulatory levels interact.
In earlier work, we have established that the ability to produce stable and patterned articulatory output improves developmentally (e.g., Goffman & Smith, 1998; 1999; Smith & Zelaznik, 2004). Further, movement output variability changes within individuals as a function of speech production target; somewhat surprisingly, children acquiring English produce the earlier developing trochaic stress pattern with more speech motor variability than the later developing iambic stress pattern (Goffman, 1999; Goffman, Heisler, & Chakraborty, 2006). In the present work, we again consider changes in articulatory movement variability, this time as children acquire word forms which incorporate lexical specification. We will analyze and compare the variability of children’s speech movements when they produce phonetic strings that have no referent attached compared with phonetic strings that are assigned a visual referent.
As in adults, it is known that there are interactions between lexical and phonological factors in development. For example, children acquire words with highly frequent phonotactic structure more readily than those with low-frequency phonotactics (Storkel, 2001; Storkel, 2003; Storkel & Morrisette, 2002). That is, during learning children are more likely to comprehend and to accurately produce novel words of high phonotactic probability. However, in traditional models of lexical access and speech production, speech motor representations and processes are modularly separated from lexical and semantic representations and processes. This allows for non-independence (i.e., one module specifies the output to another) without assuming interaction (i.e., interplay between lexical/semantic and speech motor processes; Hillis & Caramazza, 1995; see Rapp & Goldrick, 2006 for a full review). As schematized in Figure 1, models of speech production, although different in detail, include conceptual, lexical, grammatical, and phonological levels (e.g., Dell, 1986; Levelt et al., 1999; Rapp & Goldrick, 2006).
Figure 1.
Schematic of speech production model (adapted from Levelt et al., 1999).
In the present study we ask whether the existence of a lexical representation for a word form interacts with a child’s capacity to produce stable articulatory movement trajectories associated with that word form. If production variability is differentiated for novel phonetic strings that are meaningless in comparison with lexical word forms (which may include a semantic representation) we may conclude that information about articulatory components of speech production is not completely independent of the lexicon.
In this study, children produced novel phonetic strings. In the course of the experiment, some of these phonetic strings were assigned a visual referent and some were not. Thus, some novel phonetic strings were provided with only phonological form and others were provided with a visual referent so that a lexical representation could be expected to begin to be established. Because the learning experience was relatively short-term and sketchy, the influence of semantic and lexical levels of representation could not be distinguished. Novel phonetic strings and lexical forms were compared for phonetic error (transcription analysis) and for articulatory variability (kinematic analysis). This may not provide information as to the integrity of competing models (independent and discrete levels versus independent levels that can influence other levels through spreading activation), but we could conclude that there is a direct connection between lexical and articulatory levels of representation if articulatory variability changes as a function of lexical status.
Method
Twenty-six children participated in this study, 13 typically developing and 13 meeting the exclusionary criteria for SLI (Leonard, 1998). The typically developing group consisted of 6 males and 7 females (M = 4 years, 2 months; SD = 3 months; range = 4;1 to 4;11). The SLI group consisted of 8 males and 5 females (M = 5 years, 0 months; SD = 8 months; range = 4;1 to 6;5).
Stimuli
Stimuli were four bisyllabic nonsense words: “fushpim,” “pafkub,” “bapkif,” and “mofpum.” All initial syllables had frequencies equal to 0, and all final syllables were identified at least one time but not more than six times in any of the English language databases (Burnage, 1990; Moe et al., 1982; Pisoni, Nusbaum, Luce, & Slowiaczek, 1985). All medial clusters were chosen because of their low frequency of co-occurrence within the language (Munson, 2001).
Signals Recorded
The Optotrak (Northern Digital Inc., Waterloo, Ontario, Canada) was used to obtain recordings of lip and jaw movement during production. Three infrared light emitting diodes (IREDs) were placed on each participant’s face (upper lip, lower lip/jaw, and forehead). Only the lower lip/jaw movement was analyzed. The forehead marker was used as a reference for the 3-dimensional subtraction of head movement. Kinematic data were collected at a sampling rate of 250 samples/second and a time locked acoustic signal was collected at a sampling rate of 16,000 Hz.
Procedure
Stimuli were delivered using the Habit software (Cohen, Atkinson, & Chaput, 2000) running on a Macintosh PowerBook. The PowerBook was connected to a thirty-inch Dell monitor and a set of external speakers that faced the child. Stimuli were presented at a comfortable loudness level.
To test the influence of word learning on speech production, a paradigm was developed which incorporated three phases: a pre-test (production), a learning phase (perception), and a post-test (production). A comprehension probe was completed after these three phases.
Phase 1: Pre-test
The first phase of the procedure was a pre-test designed to elicit production of the four stimulus items without any level of lexical representation. During this time children were instructed to look at the large monitor and repeat what they heard. The monitor displayed a checkerboard pattern as each auditory stimulus item was delivered. Each stimulus item was presented fifteen times in random order. Items were presented at the child’s pace with the experimenter controlling stimulus presentation.
Phase 2: Learning phase
The second phase of the procedure was a learning phase. Of the four phonetic strings used as stimulus items, two (counterbalanced across children) served as novel phonetic strings and two as lexical word forms for each child. The lexical word forms were provided with referential status during the learning phase in one of two ways. One item was given only a static visual referent. The other item was given a visual referent in combination with a dynamic presentation of object function. The two control items were presented in the same way that they were in the pre-test phase, as an auditory stimulus with a checkerboard pattern displayed on the monitor. Three-second videos of the experimental items were presented while the children heard a short phrase. When the lexical form was assigned a visual referent the child saw the static video display of the object and heard the statement, “________, oh look at it” (different counterbalancing conditions meant that a different word could fill the blank). When the lexical form was assigned a referent-plus-function, the child saw the video of the object being manipulated and heard the statement, “_______, you can squeeze it.” Visual referents were unfamiliar objects (e.g., the object used for the referent plus function condition was the squeezable top of a turkey baster). During the learning phase, each item was presented five times for a total of twenty tokens. Items were presented in a random order that was fixed across subjects.
Phase 3: Post-test
The third phase of the procedure was a post-test, again eliciting productions of the four stimulus items. Like the pre-test, all items were elicited fifteen times in random order. The visual and function referents were excluded from the post-test phase in an effort to minimize the immediate influence of attentional factors on the results. As in the pre-test phase, children saw the checkerboard presented on the monitor while they produced each of the novel strings.
Comprehension Probe
As a final task, a comprehension probe was used to test whether or not the children had learned the word-object mappings for the experimental stimulus items during the learning phase of the procedure. The probe was conducted with real objects that were presented on a table in front of the child. Two of the items were objects used during the learning phase and two of the objects were foils. When the objects were placed in front of the child, the examiner would instruct the child to “find the ______.” Two probe trials were administered, one for the object that was introduced with only a visual referent and one for the object that was introduced with the referent plus function.
Tokens selected for analysis
Twenty tokens of each word form were selected for analysis, 10 from the pre-test and 10 from the post-test phase. Within each phase, the first 10 consecutive fluent productions of each stimulus item were analyzed.
Kinematic Signals
Following data collection, kinematic waveforms were analyzed using the Matlab signal processing program (Mathworks, 2001). Displacement data were low pass filtered using a Butterworth filter with a cutoff frequency of 10 Hz (both forward and backward). We were interested in analyzing the superior-inferior movements of the lower lip. The 3-point difference method was used to calculate velocity from displacement.
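As a rough illustration of this preprocessing step, the sketch below applies a forward-and-backward (zero-phase) low-pass filter and a 3-point central difference in plain Python. The source does not state the filter order, so the second-order design, the function names, and the edge handling are assumptions made for illustration; this is not the Matlab routine used in the study.

```python
import math


def butter2_lowpass_coeffs(cutoff_hz, fs_hz):
    """Second-order Butterworth low-pass coefficients (bilinear transform).

    The filter order is an assumption; the source gives only the cutoff.
    """
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)
    a = (2.0 * (k * k - 1.0) * norm,
         (1.0 - math.sqrt(2.0) * k + k * k) * norm)
    return b, a


def _filter_once(x, b, a):
    # Start the filter state at the first sample so that a constant
    # signal passes through unchanged (unit gain at 0 Hz).
    x1 = x2 = y1 = y2 = x[0]
    out = []
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        out.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return out


def zero_phase_lowpass(x, cutoff_hz=10.0, fs_hz=250.0):
    """Filter forward, then backward, so the phase shifts cancel."""
    b, a = butter2_lowpass_coeffs(cutoff_hz, fs_hz)
    forward = _filter_once(x, b, a)
    backward = _filter_once(forward[::-1], b, a)
    return backward[::-1]


def three_point_velocity(x, fs_hz=250.0):
    """Central difference: v[i] = (x[i+1] - x[i-1]) / (2 * dt).

    The result is two samples shorter than the input because the
    first and last samples have no neighbor on one side.
    """
    dt = 1.0 / fs_hz
    return [(x[i + 1] - x[i - 1]) / (2.0 * dt) for i in range(1, len(x) - 1)]
```

The default arguments mirror the 10 Hz cutoff and 250 samples/second rate reported above. A typical call would be `three_point_velocity(zero_phase_lowpass(displacement))`, where `displacement` is a hypothetical list of lower lip/jaw positions.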
Extraction of movement sequences
As shown in Figure 2, individual words were extracted from the stream of continuous articulatory movement. Opening and closing movements were selected by a visual examination of the patterning seen in the kinematic record. An algorithm then determined the maximum displacement value, corresponding with a zero crossing in velocity that occurred within a 25 ms window of the experimenter-selected point. The time locked acoustic signal was then played to confirm the selection.
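The refinement of an experimenter-selected point can be sketched as a search for the largest displacement near that point. The sketch assumes the 25 ms window extends on both sides of the selected sample and that the peak of interest is a displacement maximum; the function name and signature are hypothetical.

```python
def refine_peak(displacement, selected_index, fs_hz=250.0, window_ms=25.0):
    """Return the index of the largest displacement near selected_index.

    Assumption: the window spans +/- window_ms around the selected
    sample. At 250 samples/second, 25 ms is about 6 samples.
    """
    half = int(round(window_ms / 1000.0 * fs_hz))
    lo = max(0, selected_index - half)
    hi = min(len(displacement), selected_index + half + 1)
    # A displacement maximum coincides with a zero crossing in velocity.
    return max(range(lo, hi), key=lambda i: displacement[i])
```

For a minimum (for example, a peak of closing), the same search could be run on the negated signal.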
Figure 2.
Illustration of how the kinematic data are extracted. This is lower lip/jaw movement from a child with specific language impairment (SLI) producing the string “pafkub.”
Movement variability
To determine the stability of movement patterns, trajectories first were amplitude- and time-normalized so that variation in the underlying patterning could emerge (see Figure 3 for an illustration). Amplitude normalization was accomplished by subtracting the mean and dividing by the standard deviation of each displacement record. For time normalization, a spline function was used to interpolate each displacement record onto a time base of 1000 points (for a detailed description of this analysis, see Smith, Johnson, McGillem, & Goffman, 2000). Normalization removes differences associated with rate and loudness and allows the underlying movement patterning to emerge.
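The two normalization steps can be sketched as follows. To keep the example dependency-free, linear interpolation stands in for the spline function described above, and the sample standard deviation is used because the source does not say which estimator was applied; both choices, and the function names, are assumptions.

```python
import statistics


def amplitude_normalize(record):
    """Subtract the mean and divide by the standard deviation (z-score)."""
    mean = statistics.fmean(record)
    sd = statistics.stdev(record)  # sample SD; an assumption
    return [(v - mean) / sd for v in record]


def time_normalize(record, n_points=1000):
    """Resample a record onto a fixed time base of n_points samples.

    Linear interpolation is used here in place of the spline
    interpolation reported in the study. Needs at least two samples.
    """
    last = len(record) - 1
    out = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)   # fractional index into record
        i = min(int(pos), last - 1)       # left neighbor
        frac = pos - i
        out.append(record[i] * (1.0 - frac) + record[i + 1] * frac)
    return out
```

After both steps, every production has the same length and unit variance, so records of different durations and amplitudes can be compared point by point.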
Figure 3.
Data from a normally developing child producing “mofpum” in the pre-test and then in the post-test condition when the word acquired referential status. The top two panels show the extraction of 10 individual productions from the stream of speech. The middle two panels show these productions time- and amplitude-normalized and the bottom two panels show the calculation of the spatiotemporal index (STI).
The spatiotemporal index (STI; Smith, Goffman, Zelaznik, et al., 1995) was used as a measure of movement patterning variability. The STI is a measure of the stability of movement patterns when duration and amplitude are normalized. Ten productions of each stimulus item were time and amplitude normalized and then standard deviations were computed at 2% intervals across the productions. All of the standard deviations for a particular stimulus item (novel phonetic string or lexical word form) were summed. This sum is the STI, with a higher value indicating increased movement patterning variability. The STI was computed for all pre-test and all post-test items.
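The computation just described can be sketched in a few lines, taking records that are already time- and amplitude-normalized. The exact sample positions for the 2% intervals and the use of the sample standard deviation are assumptions; the function name is hypothetical.

```python
import statistics


def spatiotemporal_index(normalized_records, n_intervals=50):
    """Sum of across-production standard deviations at 2% intervals.

    normalized_records: list of equal-length lists, one per production.
    With 1000-point records and 50 intervals, the SD is taken at
    samples 0, 20, 40, ..., 980 (an assumed placement).
    """
    n_points = len(normalized_records[0])
    if any(len(r) != n_points for r in normalized_records):
        raise ValueError("all records must have the same length")
    total = 0.0
    for k in range(n_intervals):
        idx = (k * n_points) // n_intervals
        column = [r[idx] for r in normalized_records]
        total += statistics.stdev(column)  # sample SD; an assumption
    return total
```

Identical productions give an index of 0, and the index grows as productions diverge from one another at the sampled points.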
Transcription Analysis
Digital audio recordings were phonetically transcribed for all tokens produced by each child. Reliability was completed on 25% of transcripts with a minimum of 93% inter-rater agreement. Following transcription, two analyses were completed to examine the phonetic variability of the productions. The first analysis was a count of total errors produced by the child in both the pre-test and post-test. Each individual token was examined in terms of six potential error sites: onset, nucleus, and coda for each of two syllables. If a child substituted an initial segment, that would count as one error. If a child added a segment to the onset and produced a substitution in the coda, that would count as two errors. The number of errors produced in the pre-test and the post-test was compared across subjects and conditions. The second transcription analysis was a type token ratio (TTR) analysis. The TTR is a measure of the number of types produced to the number of tokens produced, so it is a metric for the number of different ways that the child produces a single target form. The number of different word forms was tabulated for each condition (pre-test and post-test) and averaged across subjects.
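Both transcription measures can be illustrated with a short sketch. It assumes each token has already been aligned by hand to the six error sites (onset, nucleus, and coda of each syllable), so that any mismatch at a site counts as one error; the slot representation, the example transcriptions, and the function names are hypothetical.

```python
def count_errors(target_slots, produced_slots):
    """Count error sites where the production differs from the target.

    Each argument is a sequence of six strings:
    (onset1, nucleus1, coda1, onset2, nucleus2, coda2).
    """
    if len(target_slots) != 6 or len(produced_slots) != 6:
        raise ValueError("expected six error sites per token")
    return sum(1 for t, p in zip(target_slots, produced_slots) if t != p)


def type_token_ratio(produced_forms):
    """Number of distinct produced forms divided by number of tokens."""
    return len(set(produced_forms)) / len(produced_forms)


target = ("f", "u", "sh", "p", "i", "m")            # "fushpim"
substitution = ("p", "u", "sh", "p", "i", "m")      # onset substituted
two_errors = ("sf", "u", "sh", "p", "i", "n")       # onset addition, coda substitution

print(count_errors(target, substitution))
print(count_errors(target, two_errors))
print(type_token_ratio(["fushpim", "fushpim", "pushpim", "fushpin"]))
```

A ratio near the minimum (one type over all tokens) means the child produced the target the same way every time, whether or not that form was correct.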
Results
For all analyses, a mixed ANOVA was performed, with group (normally developing [ND], SLI) as the between-subjects factor. The within-subjects factor was lexical status (i.e., novel phonetic string vs. lexical word). Preliminary analyses revealed no main effects or interactions across the visual-only and function referent categories. Therefore, in most cases analyses are collapsed. Because we were interested in whether articulatory or phonetic performance changed as a function of including lexical information, the primary analysis was of difference scores between the pre- and the post-test phases. Difference scores were calculated by subtracting the pre-test values from the corresponding post-test values. We also present the raw pre- and post-test scores (STI, percent consonants correct (PCC), and type token ratio (TTR)) to assess whether children with SLI performed more poorly than their ND peers. In the comprehension analysis, 11 of the 13 typically developing children learned the word in the referent-only condition and 11 learned the word in the referent-plus-function condition. Of the 13 children with SLI, 7 demonstrated learning in the referent-only condition and 8 demonstrated learning in the referent-plus-function condition. Secondary analyses comparing learners to non-learners are reported below.
Kinematic Analysis
The STI was calculated for each child for each stimulus item in both the pre-test and the post-test and difference scores were derived. These data are presented in Figure 4. For this analysis, data for one child from each language group were excluded because of missing data points. Difference scores were calculated by subtracting the pre-test STIs from the corresponding post-test STIs. A negative difference score thus reflects a decrease in kinematic variability from pre-test to post-test. Difference scores were analyzed using a repeated measures ANOVA that looked at lexical status by group. Results showed an effect for lexical status, F(1,22) = 7.29, p = .01, but no effect for group, F(1,22) = .99, p = .33, and no lexical status × group interaction, F(1,22) = .13, p = .73.
Figure 4.
The mean difference score of the spatiotemporal index (STI) by word status category for children who are normally developing (ND) and children with specific language impairment (SLI). The closed shapes refer to words with lexical referents, with the circle a visual referent and the diamond a visual referent along with function. The open squares refer to control strings that were assigned no lexical referents.
Visual inspection of Figure 4 suggests that the children with SLI only showed an effect for lexical status when a function was also assigned during the perceptual learning phase. To assess this possibility, a repeated measures ANOVA was completed on the SLI group. There was a trend toward an effect of training type, F(3,33) = 2.44, p = .08.
To ascertain whether children who did not pass the comprehension probe still showed the effect of lexical status, an additional analysis was performed, this time with the group comparison being of learners vs. non-learners. Overall, the effect of lexical status was observed in non-learners as well as learners [referential non-learners vs. learners, group, F(1,22) = .002, p = .96; lexical status, F(1,22) = 8.65, p = .007; function non-learners vs. learners, group, F(1,22) = .57, p = .459; lexical status, F(1,22) = 8.65, p = .008].
STI raw values were also assessed, primarily to evaluate developmental differences between children with SLI and their ND peers. There was a group difference, F(1,22) = 5.32, p = .03, with children with SLI showing more articulatory variability than 4-year-olds who were developing language normally. There was also a learning effect, with greater articulatory variability observed (i.e., higher STI values) in the pre-test compared with the post-test, F(1,22) = 11.09, p = .003. Critically, there was an interaction between this learning effect and lexical status, F(1,22) = 6.19, p = .02. The learning effect was only observed in the word condition. The control condition, in which lexical status was not assigned, showed no differences between pre- and post-test STIs. These results are presented in Table 1.
Table 1.
Pre- and post-test values for the STI (means and standard errors).
| Group: Phase | Word 1 object | Word 2 function | Control 1 | Control 2 |
|---|---|---|---|---|
| SLI: Pre-test | 27.5 (2.1) | 30.8 (2.4) | 29.6 (1.5) | 28.4 (1.4) |
| SLI: Post-test | 26.6 (1.8) | 23.3 (1.8) | 29.4 (1.1) | 26.0 (1.6) |
| ND: Pre-test | 28.2 (2.1) | 25.0 (2.3) | 22.7 (1.3) | 21.5 (2.1) |
| ND: Post-test | 24.3 (1.9) | 21.7 (1.8) | 23.0 (1.7) | 21.7 (1.5) |
It is notable in Table 1 that, for the ND children, the pre-test STI values for forms that were eventually assigned lexical status were larger (i.e., 28.2 and 25.0) than those of forms that were eventually controls (i.e., 22.7 and 21.5). This may be a result of individual variation across children, not a floor effect. While the STI can theoretically be 0 if there is absolutely no variability, in fact individual children in this age range are extremely variable (e.g., Smith & Zelaznik, 2004). In the present study, STI values ranged from 11.2 to 40.7 in the ND group and from 13.9 to 42.5 in the SLI group. In addition, the words themselves may have inherently different STI values, due to phonetic effects. This is why individual words were counterbalanced across children as controls or lexical items. Because of this observed inter-subject variability and possible phonetic effects, what is critical for the present question about learning is change within individual children. Because of the counterbalancing, there is no clear reason why control words would be inherently less variable than those assigned lexical status, and therefore the difference score was emphasized in interpretation of the results.
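As a worked illustration of the sign convention for difference scores (post-test minus pre-test), the sketch below applies the subtraction to the group means in Table 1. These are differences of group means, shown only to make the arithmetic concrete; the analyses reported here were run on per-child difference scores, which are not available in the table. The dictionary layout and function name are hypothetical.

```python
# Group mean STI values copied from Table 1 as (pre-test, post-test).
sti_means = {
    ("SLI", "Word 1 object"): (27.5, 26.6),
    ("SLI", "Word 2 function"): (30.8, 23.3),
    ("SLI", "Control 1"): (29.6, 29.4),
    ("SLI", "Control 2"): (28.4, 26.0),
    ("ND", "Word 1 object"): (28.2, 24.3),
    ("ND", "Word 2 function"): (25.0, 21.7),
    ("ND", "Control 1"): (22.7, 23.0),
    ("ND", "Control 2"): (21.5, 21.7),
}


def difference_scores(table):
    """Post-test minus pre-test; negative means variability decreased."""
    return {key: round(post - pre, 1) for key, (pre, post) in table.items()}


for (group, item), diff in difference_scores(sti_means).items():
    print(f"{group:3s} {item:16s} {diff:+.1f}")
```

For the ND group, the two lexical items decrease by 3.9 and 3.3 while the controls change by +0.3 and +0.2, which matches the pattern described in the text.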
Transcription Analyses
To analyze the overall accuracy of the transcription data, a repeated measures ANOVA was run on the PCC difference score data. Difference scores were averaged across children, and these data are presented in Figure 5. Results showed no significant main effects for group, F(1,24) = 1.08, p = .31, or lexical status, F(1,24) < 1, p = .98. These data indicate that there is no difference in the number of errors produced between pre-test and post-test. This is an important finding because it suggests that the kinematic results are not being driven by degree of phonetic accuracy.
Figure 5.
The mean difference score of phonetic accuracy by word status category for children who are normally developing (ND) and children with specific language impairment (SLI).
To analyze the TTR, a repeated measures ANOVA was completed on the difference scores. Results showed no effect for group, F(1,24) = 3.21, p = .09. There was also no effect for word status, F(1,24) = .58, p = .45. There was a group by word status interaction, F(1,24) = 4.64, p = .04. As shown in Figure 6, children who are typically developing show a difference between words and non-words, while children with SLI do not.
Figure 6.
The mean difference score of type token ratio by word status category for children who are normally developing (ND) and children with specific language impairment (SLI).
The raw PCC and TTR scores were also assessed. Again, the emphasis here was on developmental differences between the groups. As shown in Table 2, children with SLI performed more poorly than their ND peers on their segmental accuracy, F(1,24) = 38.05, p < .0001. There were no effects of lexical status, F(1,24) = .22, p = .64, and no learning effects, F(1,24) = 2.52, p = .13.
Table 2.
Pre- and post-test values for PCC (means and standard errors).
| Group: Phase | Word 1 object | Word 2 function | Control 1 | Control 2 |
|---|---|---|---|---|
| SLI: Pre-test | 60.1 (5.4) | 55.9 (6.8) | 66.2 (6.3) | 66.4 (5.7) |
| SLI: Post-test | 60.4 (5.6) | 56.6 (7.8) | 66.0 (6.6) | 64.8 (6.5) |
| ND: Pre-test | 87.6 (2.3) | 94.6 (2.3) | 84.9 (3.5) | 89.9 (2.4) |
| ND: Post-test | 91.5 (2.4) | 96.3 (1.7) | 87.1 (2.7) | 91.5 (2.0) |
For the TTR, there were again group effects, F(1,24) = 13.67, p = .001, with children with SLI more variable than those who were ND. There were no effects of lexical status, F(1,24) = 2.42, p = .13, but there were learning effects, F(1,24) = 18.01, p < .001. There was an interaction between group and learning effects, F(1,24) = .05, p = .05. As illustrated in Table 3, children with SLI showed increased learning effects.
Table 3.
Pre- and post-test values for the TTR (means and standard errors).
| Group: Phase | Word 1 object | Word 2 function | Control 1 | Control 2 |
|---|---|---|---|---|
| SLI: Pre-test | 43.1 (6.7) | 45.4 (6.0) | 46.2 (5.6) | 44.6 (5.4) |
| SLI: Post-test | 36.9 (6.1) | 26.2 (4.2) | 30.0 (5.1) | 30.8 (8.0) |
| ND: Pre-test | 24.6 (5.4) | 19.2 (4.7) | 30.8 (5.6) | 25.4 (5.4) |
| ND: Post-test | 14.6 (2.7) | 11.5 (1.9) | 31.5 (4.6) | 23.1 (5.5) |
Discussion
The purpose of the current study was to examine the influence of word learning on speech production. We used a novel word learning task to assess phonetic accuracy and variability as well as kinematic variability in “lexical words” and “phonetic strings” for children who are typically developing and those with SLI. Our results showed that, when a novel phonetic string is paired with a visual and/or functional referent, it becomes less variable in its motor implementation. Strings that are not assigned such lexical status do not change in variability with motor practice.
These results may provide support for the idea that specifications for motor implementation of word production are not completely independent of lexical information. Examination of more conventional phonetic accuracy measures did not show a change in percent consonants correct from pre-test to post-test, thus revealing a disconnect between the phonetic system and more basic motor implementation. Segmental variability, as measured using the TTR, did show learning effects, but these were usually not tied to whether a word had lexical status, only to articulatory practice. Transcription and kinematic measures both served as indices of learning. Along with several studies of adult speakers (e.g., Baese-Berk & Goldrick, 2009; Frisch & Wright, 2002; McMillan et al., 2009), we find that there are critical gradations in production processes as a function of lexical status and learning. We show that these relationships between lexical and articulatory levels are also observed in children, both those developing language normally and those with SLI.
Children with SLI were included in this study because they are poor language learners and thus may be less likely to show lexical effects on their learning, or may be less likely to learn at all. In fact, several participants, especially those with SLI, did not demonstrate comprehension of the experimental targets. This finding could be interpreted in two ways. One possibility is that there were covert lexical effects. It may be that children who did not recall the lexical items in the learning task developed sketchy lexical representations, but performance limitations influenced their behavior on the comprehension probes. This is not an unsupported possibility as it is documented that children with SLI demonstrate reduced lexical knowledge compared with their ND peers (McGregor et al., 2002).
A second possibility is that the more variable visual stimulus included in the lexical condition increased the participants’ attention to the phonetic string, resulting in a more stable production. This does not discount the natural role of attention in word learning, as attention to a referent is a precursor to lexical integration and a significant part of the word learning process (Samuelson & Smith, 1998). However, for children whose language skills are less well developed, the typical process of word learning is likely disrupted, and a more basic cognitive behavior such as attention to a moving stimulus could be influencing the results. The finding that children with SLI tended to show better learning for items presented with a dynamic video representation than for items presented with a static picture lends further support to this idea. On this interpretation, attention takes on heightened importance for children whose language skills are not sufficiently strong to support word learning.
Our results showed a decrease in kinematic variability for both subject groups for words but not for novel phonetic strings, regardless of evidence of overt comprehension. If the SLI group were simply showing a more general practice effect not related to lexical or attentional components of processing, this should have been seen in both the referential and non-referential stimulus items. Gradations of lexical and/or semantic information did not generally influence the effect of word learning on articulatory stability. Lexical word status appears to be the significant factor. While increased functional information likely increases the richness of the semantic representation, it does not lead to more efficient motor implementation, perhaps especially for learners who are typically developing. One cannot ignore that attention could be used to explain this result; as previously mentioned, further study is needed to determine the relative contributions of attention and lexical and/or semantic representation to the change in articulatory stability.
As expected, children with SLI did perform more poorly than their typically developing peers. They showed increased articulatory variability (STI), as well as a greater proportion of consonants in error (as indexed by PCC) and greater segmental variability (TTR). However, the difference score analyses revealed that these decrements in overall performance were generally not related to their capacity to represent lexical and non-lexical strings. It is notable in Figure 4 that children with SLI showed a particularly marked effect of lexical status when a visual referent was supplemented with a description of function. Perhaps as a result of the variability of these data, this was not a statistically significant effect (p = .08). However, follow-up studies are needed to assess the degree of lexical and/or semantic information required to learn a word for children with SLI compared with their typically developing peers.
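For readers less familiar with the kinematic measure, the sketch below illustrates one common formulation of the spatiotemporal index (STI) described by Smith et al. (1995): each movement record is amplitude normalized (z-scored), time normalized by linear interpolation onto a fixed number of points, and the standard deviations across records at 2% intervals are summed. This is an illustration under those assumptions, not the analysis code used in this study; the function names, the choice of 1000 interpolated points, and the toy records are hypothetical.

```python
from statistics import mean, pstdev, stdev


def z_normalize(record):
    """Amplitude normalization: subtract the mean, divide by the SD."""
    m, s = mean(record), pstdev(record)
    return [(x - m) / s for x in record]


def time_normalize(record, n_points=1000):
    """Linearly interpolate a record onto n_points equally spaced samples."""
    last = len(record) - 1
    out = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)   # fractional index into record
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(record[lo] * (1 - frac) + record[hi] * frac)
    return out


def spatiotemporal_index(records, n_points=1000, n_samples=50):
    """Sum of across-record SDs at n_samples evenly spaced time points.

    Assumes at least two records, each with at least two samples and
    nonzero variance. Lower values mean more consistent movement.
    """
    normalized = [time_normalize(z_normalize(r), n_points) for r in records]
    step = n_points // n_samples          # 2% steps when n_samples is 50
    return sum(
        stdev([rec[k * step] for rec in normalized])
        for k in range(n_samples)
    )


# Hypothetical displacement records for repeated productions of one target.
stable = [[0, 1, 2, 3, 2, 1, 0]] * 3
variable = [[0, 1, 2, 3, 2, 1, 0], [0, 2, 3, 3, 1, 0, 0]]
sti_stable = spatiotemporal_index(stable)
sti_variable = spatiotemporal_index(variable)
```

On this formulation, identical repetitions yield an STI of zero, and a reduction in STI from pre-test to post-test corresponds to the convergence of articulatory movements onto a more consistent pattern.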
The present results reveal an interaction between the lexicon and articulation. Previous models describe distinct levels for lexical and phonological information. The models differ as to whether there is interaction between the distinct levels or whether all categorical information is maintained concurrently. In the present study, manipulation of input at the lexical level led to the increased convergence of articulatory movements onto a common template. These results do not support any one particular model over another, but do provide further support for an interaction between lexical levels and speech production.
As mentioned, there are several possible explanations for the finding that a change in lexical status has a direct effect on articulation. One possibility, consistent with relatively new findings from adult speakers (Baese-Berk & Goldrick, 2009; Goldrick & Blumstein, 2006; McMillan et al., 2009), is that during development there is a rather direct link between lexical and articulatory components. Another potential explanation of the finding involves attention. The presentation of the checkerboard pattern throughout the post-test phase was intended to control for attentional factors operating concurrently with production of word forms that had or had not acquired lexical status in the learning phase. However, attentional influences may have been at work during the learning phase, especially for children with SLI. It is notable that for the normally developing children there were no differences in performance following the presentation of a more salient function (and related motion) cue. Regardless of the origin of the effect, these are the first developmental data that we know of that reveal articulatory learning effects as a function of lexical or perhaps attentional variables.
The present results fit in with a growing body of literature from adults revealing that there are important interactions between lexical and phonetic processing. Critically, it has proven difficult to integrate data from children into models of speech production. An approach that incorporates fine-grained production processes, such as the kinematic analysis used here, helps bridge this gap, providing another window into children’s language processing capacities.
Acknowledgments
We are grateful to Janna Berlin, David Kemmerer, and Laurence Leonard for invaluable assistance with many phases of this work. This research was supported by the National Institutes of Health (National Institute on Deafness and Other Communication Disorders) grant DC04826 as well as the Bamford-Lahey Children’s Foundation.
References
- Baese-Berk M, Goldrick M. Mechanisms of interaction in speech production. Language and Cognitive Processes. 2009;24:527–554. doi: 10.1080/01690960802299378.
- Burnage G. CELEX: A guide for users. Nijmegen, The Netherlands: CELEX; 1990.
- Cohen L, Atkinson D, Chaput H. Habit 2000. 2000. [computer program]
- Frisch SA, Wright R. The phonetics of phonological speech errors: An acoustic analysis of slips of the tongue. Journal of Phonetics. 2002;30:139–162.
- Goffman L. Prosodic influences on speech production in children with specific language impairment and speech deficits: Kinematic, acoustic, and transcription evidence. Journal of Speech, Language, and Hearing Research. 1999;42:1499–1517. doi: 10.1044/jslhr.4206.1499.
- Goffman L, Heisler L, Chakraborty R. Mapping of prosodic structure onto words and phrases in children’s and adults’ speech production. Language and Cognitive Processes. 2006;21:25–47.
- Goffman L, Smith A. Development and differentiation of speech movement patterns. Journal of Experimental Psychology: Human Perception and Performance. 1999;25:649–660. doi: 10.1037//0096-1523.25.3.649.
- Goldrick M, Blumstein SE. Cascading activation from phonological planning to articulatory processes: Evidence from tongue twisters. Language and Cognitive Processes. 2006;21:649–683.
- Hillis AE, Caramazza A. Converging evidence for the interaction of semantic and phonological information in accessing lexical information for spoken output. Cognitive Neuropsychology. 1995;12:187–227.
- Johnson K. Resonance in an exemplar-based lexicon: The emergence of social identity and phonology. Journal of Phonetics. 2006;34:485–499.
- Leonard LB. Children with Specific Language Impairment. Cambridge, MA: MIT Press; 1998.
- Levelt WJM, Roelofs A, Meyer AS. A theory of lexical access in speech production. Behavioral & Brain Sciences. 1999;22:1–75. doi: 10.1017/s0140525x99001776.
- Mathworks, Inc. Matlab: High performance numeric computation and visualization software. Natick, MA: Author; 2001. [computer program]
- McGregor KK, Newman R, Reilly R, Capone N. Semantic representation and naming in children with specific language impairment. Journal of Speech, Language, and Hearing Research. 2002;45:998–1014. doi: 10.1044/1092-4388(2002/081).
- Moe S, Hopkins M, Rush L. A vocabulary of first grade children. Springfield, IL: Thomas; 1982.
- Munson B. Phonological pattern frequency and speech production in adults and children. Journal of Speech, Language, and Hearing Research. 2001;44:778–792. doi: 10.1044/1092-4388(2001/061).
- Munson B, Solomon PN. The effect of phonological density on vowel articulation. Journal of Speech, Language, and Hearing Research. 2004;47:1048–1058. doi: 10.1044/1092-4388(2004/078).
- Pierrehumbert J. Word frequency, lenition, and contrast. In: Bybee J, Hopper P, editors. Frequency effects and the emergence of lexical structure. Amsterdam: John Benjamins; 2001. pp. 137–157.
- Pisoni D. Some thoughts on “normalization” in speech perception. In: Johnson K, Mullennix J, editors. Talker Variability in Speech Processing. San Diego: Academic Press; 1997. pp. 9–32.
- Pisoni D, Nusbaum H, Luce P, Slowiaczek L. Speech perception, word recognition, and the structure of the lexicon. Speech Communication. 1985;4:75–95. doi: 10.1016/0167-6393(85)90037-8.
- Rapp B, Goldrick M. Speaking words: Contributions of neuropsychological research. Cognitive Neuropsychology. 2006;23:39–73. doi: 10.1080/02643290542000049.
- Samuelson L, Smith LB. Memory and attention make smart word learning: An alternative account of Akhtar, Carpenter, & Tomasello. Child Development. 1998;69:99–104.
- Smith A, Goffman L. Stability and patterning of movement sequences in children and adults. Journal of Speech, Language, and Hearing Research. 1998;41:18–30. doi: 10.1044/jslhr.4101.18.
- Smith A, Goffman L, Zelaznik HN, Ying G, McGillem CM. Spatiotemporal stability and the patterning of speech movement sequences. Experimental Brain Research. 1995;104:493–501. doi: 10.1007/BF00231983.
- Smith A, Johnson M, McGillem C, Goffman L. On the stability and patterning of speech movements. Journal of Speech, Language, and Hearing Research. 2000;43:277–286. doi: 10.1044/jslhr.4301.277.
- Smith A, Zelaznik H. The development of functional synergies for speech motor coordination in childhood and adolescence. Developmental Psychobiology. 2004;45:22–33. doi: 10.1002/dev.20009.
- Storkel HL. Learning new words: Phonotactic probability in language development. Journal of Speech, Language, and Hearing Research. 2001;44:1321–1337. doi: 10.1044/1092-4388(2001/103).
- Storkel HL. Learning new words II: Phonotactic probability in verb learning. Journal of Speech, Language, and Hearing Research. 2003;46:1312–1323. doi: 10.1044/1092-4388(2003/102).
- Storkel HL, Morrisette ML. The lexicon and phonology: Interactions in language acquisition. Language, Speech, and Hearing Services in Schools. 2002;33:24–37. doi: 10.1044/0161-1461(2002/003).






