Abstract
Purpose
The goal was to assess the effects of maturation and phonological development on performance, by normally hearing children, on an imitative test of auditory capacity (On-Line Imitative Test of Speech-Pattern Contrast Perception [OlimSpac]; Boothroyd, Eisenberg, & Martinez, 2006; Eisenberg, Martinez, & Boothroyd, 2003, 2007).
Method
Thirty-four hearing children (aged between 1;8 [years;months] and 6;7) were asked to imitate nonword utterances. Responses were evaluated by a blinded listener in an 8-alternative forced-choice task, giving information on the children’s ability to convey, by imitation, information about 6 binary phonemic contrasts.
Results
Four children declined participation. Among 30 children aged 2;7 or older, performance improved significantly with age and varied with contrast. All children 3 years of age or older attained passing scores (7 or 8 correct responses in 8 binary trials) on at least 5 of the 6 contrasts. Post-alveolar consonant place was the contrast most often failed.
Conclusions
When evaluated on a pass/fail basis, normally hearing children 3 years of age or older are likely to demonstrate auditory perception of most phonemic contrasts using this imitative test. Phonological development and other task-related factors have only a modest effect on performance by normally hearing children after 3 years of age. The effects of hearing loss, hearing age, sensory assistance, and listening experience in children with hearing loss remain to be determined.
Keywords: imitation, hearing, auditory capacity, speech perception, children, phonological development, hearing tests, speech audiometry
The On-Line Imitative Test of Speech-Pattern Contrast Perception (OlimSpac; Boothroyd, Eisenberg, & Martinez, 2006; Eisenberg, Martinez, & Boothroyd, 2003, 2007) was developed as a tool for exploring auditory speech-perception capacity in children with hearing loss. In young children, however, developmental and other task-related factors can affect performance. The goal of the present study was to measure the performance of normally hearing children on this test and to determine the effects of age and contrast. The underlying premise was that normally hearing children have, by definition, the auditory capacity needed to perceive all phonologically significant contrasts among speech sounds. Less than perfect performance must, therefore, be attributed to the influence of other factors. The results were intended to provide a frame of reference against which to evaluate results obtained by children with hearing loss.
For present purposes, auditory capacity is defined as the ability of the peripheral auditory system to deliver, to higher auditory centers, patterns of neural excitation that convey consistent, detailed information about acoustic stimulus patterns. This capacity may be qualified as unassisted, aided, or implanted. It may also be qualified in terms of the type of acoustic stimulus. The present concern is with auditory speech-perception capacity—that is, the ability of the peripheral auditory system to deliver sufficient information for the detection of phonologically significant contrasts among the sound patterns of speech. We refer to these as speech pattern contrasts. Auditory capacity involves a combination of audibility and suprathreshold resolution. This study was concerned with the latter. Note that auditory capacity, as the term is used here, is different from auditory skill, which may be defined as the ability to interpret and use the information made available by auditory capacity.
There are several justifications for assessing auditory speech-perception capacity in young children with hearing loss (Kirk, Diefendorf, Pisoni, & Robbins, 1997). The information might be used, for example, to inform decisions about the need for, choice of, and prescription of sensory assistance (hearing aids and cochlear implants); to guide the fitting of hearing aids and the mapping of cochlear implants; to assess the outcome of sensory assistance; and to plan cognitive/linguistic intervention.
Unfortunately, capacity cannot be measured directly with behavioral tests. Instead, it must be inferred from performance. However, performance also depends on numerous other factors. Examples include auditory skill, attention span, interest, fatigue, cooperation, motivation, cognitive status, psychosocial development, and language development. One way to address this problem is to design tests in which task-related factors have little or no effect—an approach that has a long tradition in psychophysical research (Werner & Rubel, 1992). In reality, however, these factors can never be completely eliminated. As a result, one cannot confidently infer capacity from performance until the latter has approached asymptote. This is a serious drawback for a test that might be used for prescriptive or prognostic purposes.
Kirk et al. (1997) have offered an excellent review of the many behavioral tests that have been developed for assessing speech perception in children. Some involve detection of minimal phonemic contrasts. Others require word recognition in isolation or in sentences. Stimuli are presented in open or closed sets, and responses include conditioned head turns, pointing, button pushing, following instructions, and repetition. Because performance is likely to be influenced by cognitive and/or linguistic status, as well as by other task-related variables, the minimum age at which these tests can be considered suitable varies widely.
Among the response tasks used for assessing speech perception in children with hearing loss, repetition and imitation are regarded as suspect because they require both phonological knowledge and motor-speech skill. On the other hand, the imitative task is cognitively appropriate and familiar to young children, whereas nonspeech response tasks often call for higher levels of cognitive development and/or vocabulary knowledge. Moreover, the effects of phonological development are virtually impossible to eliminate in any test that uses speech stimuli, even if no production is required.
The present article deals with a test in which the perceptual task is the detection of phonemic contrasts, and the response task is imitation. Performance is measured in terms of the child’s ability to relay contrastive information to a hearing adult who responds in a forced-choice task. The original Imitative Test of Speech-Pattern Contrast Perception (ImSpac) was one of a battery of tests developed at the City University of New York for studies of auditory capacity and performance in children who use hearing aids and cochlear implants (Boothroyd, 1997a, 2009; Boothroyd & Eran, 1994; Boothroyd, Eran, & Hanin, 1996). There were several considerations behind the design of this test. First, as just pointed out, imitation is natural and cognitively appropriate for young children (Meltzoff, 2002). Second, the use of nonword utterances reduces the effects of vocabulary and higher levels of language. Third, a forced-choice approach to evaluating performance eliminates the need for the listener to make judgments of accuracy or quality (Boothroyd, 1985). Fourth, two or more binary contrasts can be evaluated at the same time, thereby reducing testing time and the demands on attention and cooperation. Finally, performance depends only on the child’s ability to detect and convey contrasts and not on the precision of phoneme production. In other words, he or she need not have fully mastered production of the speech sounds.
There were, however, two serious concerns with the original ImSpac. First, off-line editing and judging of recorded imitations (by teams of four paid listeners) were too time-consuming and too expensive for clinical application. This problem was addressed with an online procedure that involved masking a single listener during stimulus presentation (Kosky & Boothroyd, 2003). The approach was further refined at the House Ear Institute (HEI), leading to the on-line version of the test (OlimSpac) used in the study to be described (Boothroyd et al., 2006; Eisenberg et al., 2003, 2007).
The second concern was the possible influence of phonological development and motor-speech skill. When working with hearing-impaired children who have had the opportunity to use their auditory capacity for the development of spoken language, it has been found that performance is usually better with audio-visual than with audio-only models (Boothroyd, 1998; Boothroyd & Boothroyd-Turner, 2002; Boothroyd et al., 1996). From this, it has been concluded that the audio-alone score in these children is not being limited by motor-speech skill; they demonstrated that they could perform better when given more sensory input. There was, however, no guarantee that audio-only or audio-visual performance had reached the highest levels of which the child was capable. The potential influence of phonological and motor-speech development remains a concern.
Before rejecting the imitative approach, however, it is appropriate to ask what the contributions of phonological development, motor-speech skill, and other task-related factors might be in normally hearing, normally developing children. We begin with the premise that these children can be assumed to have normal auditory speech-perception capacity and to have used this capacity for phonological development. If these assumptions are true, any deficits in performance must be attributed to one or more of the task-related factors discussed earlier. Specifically, age effects implicate development in a general sense, whereas contrast effects and Age × Contrast interactions implicate phonological development in particular. As indicated earlier, the data can provide a frame of reference against which to interpret results obtained from young children with hearing loss. They also provide a test of our ability to make inferences about perception capacity from measures of imitation performance.
In carrying out this research, we sought answers to two questions:
By what age should we expect normally hearing children to demonstrate, in this imitative task, the ability to detect and convey phonologically significant speech pattern contrasts?
Does this age depend on the contrast?
Method
Participants
Thirty-four children participated in this study. All had hearing thresholds within normal limits at octave intervals from 250 to 8000 Hz. English was the first or only language, and there were no known developmental delays. Seven of the children were hearing siblings of hearing-impaired children attending the Children’s Auditory Research and Evaluation Center at the HEI. Eight were children of HEI employees. Thirteen were the children of friends of HEI employees. Six were not otherwise associated with HEI.
Four children were unable or unwilling to perform the imitative task. Their ages ranged from 1;10 (years; months) to 2;10. This left 30 children for whom performance data were obtained. The ages of these 30 children ranged from 2;7 to 6;7, with a mean of 4;2. Fourteen were boys, and 16 were girls. Participation was by parental informed consent and child assent. Information on age, gender, HEI connection, and race/ethnicity is included with performance results in Table 1.
Table 1.
ID code | Age (months) | Age group (years;months) | Race/ethnicity | Gender | HEI connection | Audio-visual (AV): VH | AV: VP | AV: CV | AV: CC | AV: CPf | AV: CPr | Average AV (% re. chance) | Audio-only (AO): VH | AO: VP | AO: CV | AO: CC | AO: CPf | AO: CPr | Average AO (% re. chance) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
03 | 31 | 2;7–3;0 | W | F | a | 7 | 6 | 8 | 5 | 6 | 5 | 54 | 7 | 8 | 7 | 6 | 7 | 6 | 71 |
11 | 33 | 2;7–3;0 | W | F | c | 8 | 8 | 8 | 8 | 7 | 7 | 92 | 8 | 8 | 8 | 7 | 6 | 8 | 88 |
21 | 34 | 2;7–3;0 | W | M | c | 8 | 8 | 4 | 4 | 8 | 6 | 58 | 8 | 8 | 4 | 6 | 8 | 4 | 58 |
29 | 35 | 2;7–3;0 | A | F | a | 8 | 8 | 8 | 7 | 8 | 8 | 96 | 8 | 8 | 7 | 6 | 8 | 6 | 79 |
02 | 36 | 2;7–3;0 | W | M | a | 8 | 7 | 8 | 7 | 8 | 6 | 83 | 8 | 8 | 7 | 8 | 8 | 7 | 92 |
04 | 36 | 2;7–3;0 | W | F | a | 8 | 8 | 8 | 7 | 8 | 4 | 79 | 8 | 8 | 7 | 7 | 8 | 4 | 75 |
05 | 39 | 3;1–3;6 | W | M | b | 8 | 8 | 8 | 7 | 8 | 8 | 96 | 8 | 8 | 8 | 8 | 8 | 6 | 92 |
42 | 40 | 3;1–3;6 | W | M | c | 8 | 8 | 8 | 8 | 8 | 7 | 96 | 8 | 8 | 8 | 8 | 8 | 6 | 92 |
50 | 41 | 3;1–3;6 | W | F | a | 8 | 8 | 7 | 8 | 8 | 8 | 96 | 8 | 8 | 7 | 8 | 8 | 8 | 96 |
16 | 42 | 3;1–3;6 | B | M | b | 8 | 8 | 7 | 8 | 8 | 7 | 92 | 8 | 8 | 8 | 6 | 8 | 7 | 88 |
17 | 42 | 3;1–3;6 | B | F | b | 8 | 8 | 8 | 8 | 8 | 6 | 92 | 8 | 8 | 8 | 8 | 8 | 6 | 92 |
35 | 43 | 3;7–4;0 | H | F | d | 8 | 8 | 7 | 7 | 8 | 5 | 79 | 8 | 8 | 8 | 8 | 8 | 6 | 92 |
38 | 45 | 3;7–4;0 | H | F | c | 8 | 8 | 8 | 7 | 8 | 7 | 92 | 8 | 8 | 7 | 8 | 7 | 8 | 92 |
20 | 46 | 3;7–4;0 | A | M | a | 8 | 8 | 8 | 8 | 8 | 7 | 96 | 8 | 8 | 8 | 7 | 8 | 7 | 92 |
24 | 46 | 3;7–4;0 | W | F | c | 8 | 8 | 8 | 8 | 8 | 8 | 100 | 8 | 8 | 8 | 8 | 8 | 7 | 96 |
26 | 48 | 3;7–4;0 | W | M | a | 8 | 8 | 8 | 8 | 8 | 7 | 96 | 8 | 8 | 8 | 8 | 8 | 8 | 100 |
41 | 48 | 3;7–4;0 | W | F | b | 8 | 8 | 8 | 8 | 7 | 8 | 96 | 8 | 8 | 8 | 8 | 8 | 8 | 100 |
57 | 49 | 4;1–5;0 | H | M | d | 8 | 8 | 8 | 8 | 8 | 8 | 100 | 8 | 8 | 8 | 8 | 8 | 8 | 100 |
06 | 55 | 4;1–5;0 | H | F | d | 8 | 8 | 8 | 8 | 8 | 8 | 100 | 8 | 8 | 8 | 8 | 8 | 8 | 100 |
23 | 56 | 4;1–5;0 | W | M | b | 8 | 8 | 8 | 8 | 8 | 8 | 100 | 8 | 8 | 7 | 8 | 8 | 8 | 96 |
10 | 57 | 4;1–5;0 | W | M | b | 8 | 8 | 7 | 8 | 8 | 7 | 92 | 8 | 8 | 8 | 8 | 8 | 7 | 96 |
33 | 58 | 4;1–5;0 | W | M | c | 8 | 8 | 8 | 8 | 8 | 7 | 96 | 8 | 8 | 8 | 8 | 8 | 5 | 88 |
46 | 59 | 4;1–5;0 | W | M | d | 8 | 8 | 8 | 8 | 8 | 8 | 100 | 8 | 8 | 8 | 8 | 8 | 8 | 100 |
19 | 61 | 5;1–6;7 | W | F | b | 8 | 8 | 8 | 7 | 8 | 8 | 96 | 8 | 8 | 8 | 8 | 8 | 8 | 100 |
12 | 63 | 5;1–6;7 | W | F | b | 8 | 8 | 8 | 8 | 8 | 8 | 100 | 8 | 8 | 8 | 8 | 8 | 8 | 100 |
09 | 71 | 5;1–6;7 | W | M | b | 8 | 8 | 7 | 7 | 8 | 7 | 88 | 8 | 8 | 7 | 8 | 8 | 7 | 92 |
18 | 71 | 5;1–6;7 | H | M | a | 8 | 8 | 8 | 8 | 8 | 8 | 100 | 8 | 8 | 8 | 8 | 8 | 8 | 100 |
47 | 74 | 5;1–6;7 | W | F | d | 8 | 8 | 8 | 8 | 8 | 8 | 100 | 8 | 8 | 8 | 8 | 8 | 8 | 100 |
49 | 76 | 5;1–6;7 | W | F | b | 8 | 8 | 8 | 8 | 8 | 8 | 100 | 8 | 8 | 8 | 8 | 8 | 8 | 100 |
48 | 79 | 5;1–6;7 | H | F | d | 8 | 8 | 8 | 8 | 8 | 8 | 100 | 8 | 8 | 8 | 8 | 8 | 8 | 100 |
Note. Scores for individual contrasts are the number of correct binary responses in eight trials. Composite scores are in percentage correct after correction for guessing, using Equation 1. ID = identification; Race/ethnicity: W = White/non-Hispanic (n = 20), A = Asian (n = 2), B = Black (n = 2), H = White/Hispanic (n = 6); Gender: F = female (n = 16), M = male (n = 14); House Ear Institute (HEI) connection: a = child of HEI employee (n = 8), b = child of friend of HEI employee (n = 10), c = sibling of HEI patient (n = 6), d = none (n = 6); VH = vowel height; VP = vowel place; CV = consonant voicing; CC = consonant continuance; CPf = pre-alveolar consonant place; CPr = post-alveolar consonant place; re. = relative to.
Speech Stimuli
The stimuli to be imitated consisted of vowel–consonant–vowel (VCV) utterances. Medial placement of the consonant was used to provide the child with both pre- and postvocalic coarticulatory cues. Productions of the stimuli by a female talker were video recorded using studio-quality digital equipment and a lavaliere microphone. Individual VCVs were extracted and edited for incorporation into the test software. The video properties were 30 frames per second with 24-bit color resolution and a display of 720 × 480 pixels. The Cinepak Codec was used for file compression, and the resulting video clips were saved as .avi files. The audio signals were extracted from the video clips and saved separately as .wav files. Audio properties were 44,100 samples per second with 16-bit resolution. One of the stereo channels was replaced with a speech-shaped random noise that coincided with the utterances and was used to mask the tester/listener during stimulus presentation to the child.
Software
The OlimSpac software provides three options for stimulus presentation: visual-only, audio-only, and audio-visual (Boothroyd et al., 2006). Figure 1 shows a typical test setup. The video signal is presented to the child via a slave monitor. The audio signal is presented via headphones or loudspeaker. Two types of stimuli are available: the VCV utterances described earlier, and consonant–vowel utterances produced by a different female talker and used in an earlier pilot study (Eisenberg et al., 2003). During stimulus presentation, the tester is prevented from hearing the audio stimulus by the masking noise described earlier. The tester is also prevented from seeing the video stimulus by covering that part of the monitor showing the talker’s face. The child’s imitation is relayed to the tester, at which time the audio masking has ended. The tester is presented with eight alternatives, as illustrated in Figure 2, and is required to indicate which one the child is thought to be imitating, guessing if not sure. Each of the tester’s responses provides information on three binary contrasts. A set of eight trials is used for three contrasts. A second set of eight trials is used for three different contrasts, giving six contrasts in all. Two of the six are vowel contrasts, and four are consonant contrasts, reflecting the higher importance of consonants in speech perception. Among the consonant contrasts, place of articulation is represented twice, reflecting its higher importance in consonant recognition. The six contrasts are as follows:
Vowel height (VH; e.g., “oodoo” vs. “aadaa”),
Consonant voicing (CV; e.g., “aataa” or “aapaa” vs. “aadaa” or “aabaa”),
Pre-alveolar consonant place (CPf; e.g., “oopoo” or “ooboo” vs. “ootoo” or “oodoo”),
Vowel place (VP; e.g., “ootoo” vs. “eetee”),
Consonant continuance (CC; e.g., “eetee” or “eechee” vs. “eesee” or “eeshee”), and
Post-alveolar consonant place (CPr; e.g., “oosoo” or “ootoo” vs. “ooshoo” or “oochoo”).
The qualifiers front and rear for the consonant place contrasts refer to pre- and post-alveolar place. Note that the scores are based only on the contrast and not on its phonemic context. In the case of CV, for example, if the target consonant is voiced, any voiced consonant is scored correct for voicing, even if it is the wrong one. This is an important feature of the test. The child does not need to be imitating a speech sound perfectly to be scored correct on a specific contrast.
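The contrast-based scoring rule can be illustrated with a short sketch. It assumes that the first response set crosses the two vowels "oo" and "aa" with the four consonants /p/, /b/, /t/, and /d/, which is inferred from the examples listed above rather than taken from the OlimSpac software; the function names and feature labels are hypothetical.

```python
# Hypothetical reconstruction of the first eight-alternative response set.
# The feature tables are inferred from the examples in the text, not taken
# from the OlimSpac software itself.
VOWEL_HEIGHT = {"oo": "high", "aa": "low"}
VOICING = {"p": "voiceless", "t": "voiceless", "b": "voiced", "d": "voiced"}
PLACE = {"p": "labial", "b": "labial", "t": "alveolar", "d": "alveolar"}


def features(utterance):
    """Split a VCV string such as 'oodoo' into its three binary features."""
    vowel, consonant = utterance[:2], utterance[2]
    return {
        "VH": VOWEL_HEIGHT[vowel],
        "CV": VOICING[consonant],
        "CPf": PLACE[consonant],
    }


def score_trial(stimulus, tester_choice):
    """Score each contrast separately: a contrast is correct when the
    tester's choice shares that feature with the stimulus, even if the
    chosen utterance as a whole is the wrong one."""
    target, response = features(stimulus), features(tester_choice)
    return {contrast: target[contrast] == response[contrast] for contrast in target}


# The eight alternatives offered to the tester under this assumption.
ALTERNATIVES = [v + c + v for v in VOWEL_HEIGHT for c in VOICING]

# Stimulus "aadaa" judged as "aabaa": vowel height and voicing are conveyed,
# place is not.
print(score_trial("aadaa", "aabaa"))  # {'VH': True, 'CV': True, 'CPf': False}
```

In this sketch a single tester response yields three binary outcomes, which is why eight trials suffice for three contrasts.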
The software provides four test forms in which the utterances embodying the contrasts and/or the order of presentation of the contrasts are changed. The score for a single contrast and a single presentation modality is based on eight binary trials with an expected score of four if the tester is guessing at random. A score of seven or eight is significantly better than chance at the .035 level (on the basis of the true binomial distribution). For this reason, the score of a single child on a single contrast is best thought of as pass/fail rather than as a parametric measure of proficiency. Increasing the number of trials would increase the precision of assessment but at the expense of testing time and the likelihood that a young child will not complete the test. The composite score, averaged across the six contrasts, is, however, based on a total of 48 trials and may be treated as a parametric index of overall performance.
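The .035 figure can be checked directly from the binomial distribution. The sketch below assumes independent trials with a guessing probability of .5; the function name is illustrative.

```python
from math import comb


def prob_at_least(k, n=8, p=0.5):
    """Probability of k or more correct responses in n independent binary
    trials when each response is correct with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Seven or eight correct out of eight by guessing: 9/256.
print(round(prob_at_least(7), 3))  # 0.035

# Six or more correct out of eight by guessing: 37/256, too likely under
# chance to serve as a passing criterion.
print(round(prob_at_least(6), 3))  # 0.145
```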
Testing via a single presentation modality requires 16 trials and usually takes less than 5 min. Although the visual-only modality is available, testing is generally limited to two modalities: audio-visual and audio-only. The audio-visual condition is usually presented first. This provides training for the audio-only condition—familiarizing the child with the task and the talker. It can also serve as a screening test. If the child cannot perform under the audio-visual condition or if he or she scores at chance levels, there is little point in continuing with the audio-only condition. If the audio-visual score is significantly better than the audio-only score, one can conclude that audio-only performance is not being limited by motor-speech skill.
Procedure
In the present study, testing took place in a sound-attenuating enclosure with a parent and a researcher present. The tester remained outside the room, controlled stimulus presentation, and scored the responses. Sound stimuli were delivered at an average level of 70 dBA via loudspeaker. Each child was given an explanation and demonstration of what was required. Once it was clear that he or she understood the task and was prepared to participate, formal testing began. Only the VCV stimuli were used. Testing began with audio-visual presentation, followed by audio-only. Because form equivalence had not been established, testing was limited to the first of the four forms. For each presentation modality, the contrasts of VH, CV, and CPf were tested in the first eight trials. The contrasts of VP, CC, and CPr were tested in the second eight trials. As a reward, regardless of participation in the test, all children chose a small toy from a “treasure” chest and were also compensated with $10.
Results
Raw Data
Table 1 shows the number of correct listener judgments in eight trials for each child and each contrast under both audio-visual and audio-only conditions. Also shown are composite percentage correct scores. These were averaged across the six contrasts and were corrected for guessing using the following formula:
$$c = \frac{100\,(r - g)}{100 - g} \tag{1}$$
where c = corrected score in percentage, r = raw score in percentage, and g = expected guessing score (in this case, 50%). Also shown in Table 1 are the demographic data mentioned earlier.
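As a worked check, the sketch below applies the usual correction for guessing, c = 100(r − g)/(100 − g), to the six contrast scores of Child 03 in Table 1. The form of the correction is an assumption, but it reproduces the composite values printed in the table; the function names are illustrative.

```python
def correct_for_guessing(raw_percent, guess_percent=50.0):
    """Rescale a raw percentage so that the guessing level maps to 0
    and a perfect score maps to 100."""
    return 100.0 * (raw_percent - guess_percent) / (100.0 - guess_percent)


def composite(scores, trials_per_contrast=8):
    """Composite corrected score from per-contrast counts of correct trials."""
    raw = 100.0 * sum(scores) / (trials_per_contrast * len(scores))
    return correct_for_guessing(raw)


# Child 03 in Table 1 (order: VH, VP, CV, CC, CPf, CPr).
audio_visual = [7, 6, 8, 5, 6, 5]
audio_only = [7, 8, 7, 6, 7, 6]

print(round(composite(audio_visual)), round(composite(audio_only)))  # 54 71
```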
Main Effects of Age and Presentation Modality
Figure 3 shows the composite corrected scores as a function of age and presentation modality. Also shown are least squares fits of an exponential growth function to the data. The function is as follows:
$$y = a\left[1 - e^{-(x - b)/c}\right] \tag{2}$$
where y = predicted score, a = asymptote, e = base of natural logarithms, x = age, b = age at which score first rises above zero, and c = a growth constant.
The best-fit parameters (±1 SE) were as follows:
Audio-visual: a = 97% (±1.9), b = 27 months (±1.7), c = 4.3 months (±1.3)
Audio-only: a = 99% (±2.1), b = 22 months (±3.4), c = 7.6 months (±2.3)
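To show what these parameters imply, the sketch below evaluates the growth function at a few ages. It assumes that Equation 2 has the form y = a[1 − e^(−(x − b)/c)], which matches the parameter definitions given above, and that the predicted score is zero at or below age b; both are assumptions rather than details confirmed by the test software.

```python
import math


def growth(age_months, a, b, c):
    """Exponential growth toward asymptote a, starting from zero at age b,
    with growth constant c (in months). Scores at or below age b are set
    to zero, which is an assumption of this sketch."""
    if age_months <= b:
        return 0.0
    return a * (1.0 - math.exp(-(age_months - b) / c))


# Best-fit parameters reported in the text.
AUDIO_VISUAL = dict(a=97.0, b=27.0, c=4.3)
AUDIO_ONLY = dict(a=99.0, b=22.0, c=7.6)

for age in (30, 36, 48, 60):
    av = growth(age, **AUDIO_VISUAL)
    ao = growth(age, **AUDIO_ONLY)
    print(f"{age} months: audio-visual {av:.1f}%, audio-only {ao:.1f}%")
```

Under this form, both fitted curves lie within about 4 percentage points of their asymptotes at 48 months.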
The variance accounted for by these nonlinear functions was 60% and 65% for audio-only and audio-visual presentation, respectively. As one might expect from these data, a repeated-measures analysis of variance (ANOVA) in the arcsine transforms of the corrected percentage correct scores failed to show a significant effect of presentation modality, F(1, 29) = 0.002, p = .97. Because so many of the older children scored 100%, the test was repeated for the 15 children who were younger than 4 years of age. Again, there was no evidence of an effect of modality, F(1, 14) = 0.84, p = .37.
From these data, we conclude that composite performance, averaged over six contrasts, improved with age, that growth was virtually complete by 4 years of age, and that there was no evidence of deterioration in performance when testing switched from audio-visual to audio-only.
Effects of Demographic Variables
As in many studies of this type, the participant sample was one of convenience, leading to the possibility that age and contrast effects might be confounded with the effects of demographic variables. Because all children had English as their first language, there was no a priori reason to expect an effect of race/ethnicity. A gender effect is always a possibility, however, especially when developmental factors are involved. Furthermore, the associations of most of the children with the HEI might lead one to suspect an influence of enhanced spoken-language input and, perhaps, socioeconomic status.
The three panels of Figure 4 show the distributions of audio-only composite scores divided according to the three demographic factors. The top panel shows division by race/ethnicity (White/non-Hispanic vs. other). The center panel shows division by gender, and the bottom panel shows division by HEI connection. In no case is there obvious evidence that the two groups differed significantly. This observation was supported by three separate one-way ANOVAs in the arcsine-transformed data using age as a covariate: for race/ethnicity, F(1, 27) = 0.15, p = .71; for gender, F(1, 27) = 0.41, p = .53; for HEI connection, F(1, 27) = 2.35, p = .14. Interestingly, a one-way analysis of the HEI-connection data without the covariate of age showed significantly better performance by the children without an HEI connection; however, this was clearly attributable to there being no children younger than 3;6 in this group.
The data provide no evidence of an effect of these three demographic variables on the ability of young normally hearing children to convey phonetic contrasts by imitation of auditory patterns. They do not, however, rule out the possibility that such effects might be demonstrated in studies with larger, carefully selected samples.
Effect of Contrast
To examine differences between contrasts and potential Age × Contrast interactions, the children were divided into five age groups, as shown in Table 1. The groups consisted of the 6-month range from 2;7 to 3;0, the 6-month range from 3;1 to 3;6, the 6-month range from 3;7 to 4;0, the 1-year range from 4;1 to 5;0, and the 2-year range from 5;1 to 7;0. Contrast scores were corrected for guessing, expressed in percentage, arcsine transformed to increase homogeneity of variance, and subjected to a repeated-measures ANOVA. The between-groups variable was age group at five levels. The within-subject variable was contrast at six levels. The main effect of age was highly significant, F(4, 25) = 12.7, p < .000005, as was the effect of contrast, F(5, 125) = 12.7, p < .000005. There was also a significant Age × Contrast interaction, F(20, 125) = 1.9, p = .016. Because of the high number of perfect scores, the assumption of homogeneity of variance was violated. Multivariate analysis, using the Wilks’s lambda statistic, however, confirmed the findings, albeit with fewer degrees of freedom and, consequently, somewhat higher probability levels.
Figure 5 illustrates the Age × Contrast interaction. The curves were obtained by least squares fits of Equation 2 to individual data for the audio-only condition. It can be seen that the differences among the contrasts are large for the younger participants but diminish with increasing age as the curves approach asymptote. The age at which mean performance reaches 90% is less than 3 years for the contrasts of VH, VP, and CPf; is around 3;6 for CV and CC; and is around 4 years for CPr. Contrast-dependence of the age effect supports the conclusion that phonological development is a contributing factor.
Individual Data for Audio-Only Presentation
The results presented so far represent mean data for this sample of participants. The statistical analyses have been used to make inferences about the average performance in the population of normally hearing children from which these 30 participants are assumed to have been randomly selected. However, if OlimSpac is to have clinical value as an index of auditory capacity, we must be able to draw conclusions from the data for a single participant and a single contrast.
Figure 6 shows individual audio-only data for the six contrasts as a function of age. Unlike the earlier data, these results are shown as the number of correct responses in eight trials. Also shown is the passing criterion of seven or eight correct responses. As indicated earlier, the probability of seven or eight correct responses in eight binary trials, if the listener is responding at random, is .035. Scores failing to meet this criterion are shown with filled symbols. The following points should be noted:
Among the 30 children who completed the test, all obtained passing scores on the two vowel contrasts.
Only one child failed on CPf. This child was 2;9 of age.
One child (a different child) failed on CV. This child was 2;10 of age.
Four children failed on CC. The oldest was 3;6 of age.
Nine children failed on CPr. The oldest was 4;10 of age.
Every child 3 years of age or more obtained a passing score on at least five of the six contrasts.
This last point is made more explicit in Figure 7, which shows the number of contrasts passed as a function of age. Data are shown for both audio-only and audiovisual presentations. Of the 26 children 3 years of age or older, all completed the test, and all obtained passing scores on at least five of the six contrasts—regardless of presentation modality. With only one exception, the failed contrast was CPr.
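The pass/fail tallies listed above can be reproduced from the audio-only columns of Table 1. The sketch below transcribes those columns and applies the passing criterion of seven or eight correct responses; the variable and function names are illustrative.

```python
CONTRASTS = ("VH", "VP", "CV", "CC", "CPf", "CPr")
PASS_MIN = 7  # seven or eight correct in eight trials counts as a pass

# Audio-only scores from Table 1: (age in months, VH, VP, CV, CC, CPf, CPr).
AO_SCORES = [
    (31, 7, 8, 7, 6, 7, 6),
    (33, 8, 8, 8, 7, 6, 8),
    (34, 8, 8, 4, 6, 8, 4),
    (35, 8, 8, 7, 6, 8, 6),
    (36, 8, 8, 7, 8, 8, 7),
    (36, 8, 8, 7, 7, 8, 4),
    (39, 8, 8, 8, 8, 8, 6),
    (40, 8, 8, 8, 8, 8, 6),
    (41, 8, 8, 7, 8, 8, 8),
    (42, 8, 8, 8, 6, 8, 7),
    (42, 8, 8, 8, 8, 8, 6),
    (43, 8, 8, 8, 8, 8, 6),
    (45, 8, 8, 7, 8, 7, 8),
    (46, 8, 8, 8, 7, 8, 7),
    (46, 8, 8, 8, 8, 8, 7),
    (48, 8, 8, 8, 8, 8, 8),
    (48, 8, 8, 8, 8, 8, 8),
    (49, 8, 8, 8, 8, 8, 8),
    (55, 8, 8, 8, 8, 8, 8),
    (56, 8, 8, 7, 8, 8, 8),
    (57, 8, 8, 8, 8, 8, 7),
    (58, 8, 8, 8, 8, 8, 5),
    (59, 8, 8, 8, 8, 8, 8),
    (61, 8, 8, 8, 8, 8, 8),
    (63, 8, 8, 8, 8, 8, 8),
    (71, 8, 8, 7, 8, 8, 7),
    (71, 8, 8, 8, 8, 8, 8),
    (74, 8, 8, 8, 8, 8, 8),
    (76, 8, 8, 8, 8, 8, 8),
    (79, 8, 8, 8, 8, 8, 8),
]


def failures_per_contrast(rows):
    """Count, for each contrast, the children scoring below the criterion."""
    counts = {name: 0 for name in CONTRASTS}
    for _age, *scores in rows:
        for name, score in zip(CONTRASTS, scores):
            if score < PASS_MIN:
                counts[name] += 1
    return counts


def contrasts_passed(scores):
    """Number of contrasts on which one child met the passing criterion."""
    return sum(score >= PASS_MIN for score in scores)


older = [row for row in AO_SCORES if row[0] >= 36]  # 3 years of age or older

print(failures_per_contrast(AO_SCORES))
# {'VH': 0, 'VP': 0, 'CV': 1, 'CC': 4, 'CPf': 1, 'CPr': 9}
print(len(older), min(contrasts_passed(row[1:]) for row in older))  # 26 5
```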
Discussion
This study was not intended to provide definitive information on phonological development in normally hearing children. Its purpose was to explore the effects of normal maturation and phonological development on performance on a specific test. We set out to answer two questions. By what age should we expect normally hearing children to demonstrate, in this imitative task, the ability to detect and convey phonologically significant speech pattern contrasts? Does this age depend on the contrast? Furthermore, in a more general sense, to what extent can we use production as an indicator of perception?
It would appear from these data that the probability of participation in this imitative test is high once a child reaches 3 years of age. This probability falls with decreasing age below 3 years. Even after 3 years of age, mean performance improves measurably, at least up to 4 years of age. The age effect implicates developmental factors but tells us nothing about which factors are responsible. We do, however, exclude emerging auditory capacity. We began with the premise that auditory speech-perception capacity in hearing children is normal, a premise supported by evidence that children as young as 3 months of age respond to acoustic differences that are finer than those responsible for phonemic contrasts in their native language (Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992).
We also exclude the effect of cognitive development. This conclusion is based on the fact that performance on the VP contrast was perfect for all children who completed the test. It was clear that these children understood what was required. Among the four children who did not take the test, we do not know whether they were unwilling or unable.
The main effect of contrast and the Age × Contrast interaction implicate phonological development. There are two possible reasons that the vowel scores were almost perfect for the younger children. First, vowel articulation is mastered early in development. Second, the vowel contrasts used here were extreme, crossing at least two phoneme categories. Note, however, that performance on the minimal-pair contrast of CPf was also near-perfect. This contrast was represented by the sounds /p/, /b/, /t/, and /d/. According to Smit, Hand, Freilinger, Bernthal, and Bird (1990), 75% of normally developing children master these consonants before 3 years of age. It seems likely, therefore, that phonological development is more important than the magnitude of the contrasts. This conclusion is supported by the poor performance on CPr. In this case, two of the sounds used to represent the contrast (/sh/ and /ch/) are not mastered by 75% of normally developing children until 4 or 5 years of age (Smit et al., 1990).
It is possible that fatigue, boredom, and inattention influenced results and that these factors became less serious with increasing age; however, we have no way of exploring this possibility. It should be pointed out, however, that the time taken for the test is very short, around 5 min, and imitation is a familiar activity for developing children. Moreover, because the audio-visual condition is always presented first, any fatigue effect would have shown up as an apparent modality effect, for which we found no evidence.
The absence of a significant difference between imitation of audio-visual and audio-only models is counter to findings obtained when using this test procedure with hearing-impaired children (Boothroyd & Boothroyd-Turner, 2002; Boothroyd et al., 1996; Eisenberg et al., 2003). Although visual information can be important in normal phonological development, it is probable that children with hearing loss, if acquiring spoken-language competence, place more reliance on visual cues. Less easily explained is the fact that Hnath-Chisolm, Laippley, and Boothroyd (1998) did find a visual enhancement effect in normally hearing children as young as 4 years of age. In that study, however, a three-interval, forced-choice test was used (Boothroyd, 1997b). This test was both boring and cognitively demanding, and the audio-only scores of the younger children were low enough to leave room for visual enhancement. Average performance in the present study was close to 100% by 4 years of age, leaving little or no room for demonstration of a visual enhancement effect even if it was present. Of course, failure to demonstrate a statistically significant effect does not mean that none exists or that one could not be demonstrated with more extensive testing and a larger sample size. To explore this issue further, we examined the audio-visual minus audio-only difference for the 15 children younger than 4 years of age. The mean difference in composite score, after correction for guessing, was 0.56 percentage points. The standard error was 2.07 percentage points. For 14 degrees of freedom, a t value of 2.14 is required for 95% confidence in a two-tailed test. From these data, we conclude that the design was capable of demonstrating an effect of medium size (Cohen’s d = 0.55). As pointed out previously, had we found a significant modality effect, we would not have been able to distinguish this from an effect of fatigue or loss of attention. 
Note, however, that visual enhancement was not the focus of this study. The audio-visual condition was used here for training and screening.
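The sensitivity calculation described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming that the reported Cohen's d is the smallest standardized mean difference that would just reach the critical t value in a paired comparison (d = t_crit / sqrt(n)). The function and variable names are illustrative and are not part of the OlimSpac software.

```python
import math

# Values reported in the text for the 15 children younger than 4 years.
N_CHILDREN = 15
MEAN_DIFF = 0.56    # audio-visual minus audio-only, percentage points
STD_ERROR = 2.07    # standard error of the mean difference, percentage points
T_CRITICAL = 2.14   # two-tailed, 95% confidence, 14 degrees of freedom (from the text)


def observed_t(mean_diff: float, std_error: float) -> float:
    """Paired t statistic: mean difference divided by its standard error."""
    return mean_diff / std_error


def smallest_detectable_diff(t_critical: float, std_error: float) -> float:
    """Mean difference (percentage points) that would just reach t_critical."""
    return t_critical * std_error


def smallest_detectable_d(t_critical: float, n: int) -> float:
    """Assumed definition: detectable difference over the SD of the differences.

    SD = std_error * sqrt(n), so the standard error cancels
    and d = t_critical / sqrt(n).
    """
    return t_critical / math.sqrt(n)


t_obs = observed_t(MEAN_DIFF, STD_ERROR)
min_diff = smallest_detectable_diff(T_CRITICAL, STD_ERROR)
min_d = smallest_detectable_d(T_CRITICAL, N_CHILDREN)

print(f"observed t = {t_obs:.2f}")
print(f"smallest detectable difference = {min_diff:.2f} percentage points")
print(f"smallest detectable Cohen's d = {min_d:.2f}")
```

Under this assumption, the observed t is about 0.27 and the smallest detectable d is about 0.55, which agrees with the value given in the text.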
One concern about this test procedure is the potential role of the tester/listener. The results of the OlimSpac depend on the ability of this individual to maintain attention and to make rapid judgments on the basis of the often immature productions of young children. Had different listeners been used for different children, listener differences could have been confounded with the effects of age. In this study, however, all the children were judged by the same listener. Note, also, that the forced-choice task is intended to minimize the effect of listener differences (Boothroyd, 1985). It has yet to be determined whether inter-listener differences have a significant effect on results obtained with OlimSpac.
It should be noted that the option “none of the above” was not available to the tester, even if she could not identify the child’s imitation or identified it as something other than one of the eight alternatives available. We stress again, however, that the purpose of this test is to measure the child’s ability to convey contrastive information—not to match phonemic targets. It may well be that the tester will correctly respond with a voiced consonant, for example, even if it is the wrong one. Moreover, the requirement of the null hypothesis is that responses are made at random and not withheld.
The study was not designed to investigate effects other than age and contrast, and the sample was one of convenience. It did, however, provide an opportunity to look for evidence of other effects. Because English as a first or sole language was a criterion for inclusion, there was no a priori reason to expect an effect of race/ethnicity, and none was observed. We might have expected a gender effect because girls are known to develop articulation skills somewhat more rapidly than do boys (Smit et al., 1990). Again, however, the data do not demonstrate a gender effect. Most of the participants were associated in some way with the HEI, and it has been suggested that this might bias the results toward better performance because of parental awareness of the value of an enriched spoken language environment, their socioeconomic status, their educational level, or some combination of these. There was, however, no evidence for such an effect. The failure to find evidence of these effects does not mean, of course, that they do not exist. However, a more powerful design would be needed to explore this possibility. From the one-way ANOVAs on the arcsine transforms of the audio-only composite scores reported earlier, it was determined that the current design was only capable of detecting, at the 95% level of confidence, large or very large effects (Cohen’s d = 0.95, 0.79, and 2.07 for the binary classifications of race, gender, and HEI connection, respectively).
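Proportion scores near ceiling have compressed variance, which is the usual reason for transforming them before an ANOVA. The sketch below illustrates one common form of the arcsine transform, arcsin(sqrt(p)). The exact variant used in this study is not stated in this section, so this form, the function name, and the example scores are assumptions for illustration only.

```python
import math


def arcsine_transform(percent_correct: float) -> float:
    """Return arcsin(sqrt(p)) in radians for a score given in percent (0-100).

    Assumption: this is the plain arcsine-square-root form; other variants
    (for example, 2 * arcsin(sqrt(p))) differ only by a constant factor.
    """
    if not 0.0 <= percent_correct <= 100.0:
        raise ValueError("score must lie between 0 and 100 percent")
    return math.asin(math.sqrt(percent_correct / 100.0))


# Hypothetical composite scores (percent), not data from the study.
example_scores = [50.0, 75.0, 90.0, 100.0]
for score in example_scores:
    print(f"{score:5.1f}% -> {arcsine_transform(score):.3f} rad")
```

The transform stretches the scale near 100%, so a step from 90% to 100% covers a larger transformed distance than a step from 50% to 60%.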
The original purpose of this imitative test was to provide clinically useful information about the auditory speech-perception capacity of children with hearing impairment. It has always been recognized that if auditory capacity is to be inferred from imitative performance, two conditions must be satisfied. First, the child has had every opportunity to use hearing in the development of auditory and spoken language skills. Second, performance has approached an asymptote. We assume that, by definition, normally hearing children meet the first condition. The cross-sectional data from the present study suggest that the second condition is met in normally hearing children by around 3 years of age, after which the various task-related factors listed in the introduction have a relatively small effect. When working with children who have hearing loss, however, it is seldom possible to determine how well either of these conditions is met. Moreover, it is not clear which of the potential task-related factors will be primarily a function of chronological age and which will be a function of hearing age (i.e., the length of time since the child was provided with sensory assistance). A single performance measure on the audio-only portion of this test can only be taken, therefore, as a lower estimate of auditory speech-perception capacity. In other words, the capacity is not poorer than that indicated but could possibly be better. Repeated testing over time will help determine whether a performance asymptote has been approached (Boothroyd, 2009). As indicated in the introduction, the requirement for repeated testing severely restricts the test’s value as a predictive or prognostic tool. Nevertheless, the test may still have value as a measure of performance, progress, and outcome, especially if used as one component of a battery of tests. The test may also provide useful information on phonological development. 
Support for this suggestion comes from a study in which scores on OlimSpac have been shown to have predictive value in relation to higher levels of spoken language acquisition in children with hearing loss (DesJardin, Ambrose, Martinez, & Eisenberg, 2009).
We do not, at the time of writing, have a direct measure of auditory capacity in young children that can be used as a validating criterion for behavioral tests. There has, however, been preliminary work on the development of electrophysiological techniques. Much of this work has focused on the acoustic change complex, which is a response to change during an ongoing sound stimulus (Brown et al., 2008; Martin, 2007; Martin, Ali, Leach-Berth, & Boothroyd, in press; Martin & Boothroyd, 1999, 2000; Ostroff, Martin, & Boothroyd, 1999). In the future, it may be possible to measure auditory speech-perception capacity electrophysiologically before a child has developed auditory and spoken language skills. In the meantime, we must rely on behavioral tests while recognizing and accounting for their limitations.
Conclusions
We reached the following conclusions from this study:
Although the OlimSpac, as implemented in this study, is cognitively appropriate for normally developing children as young as 2;7, there is a high probability of nonparticipation in children younger than 3 years of age.
After 3 years of age, participation is probable, but the performance of normally hearing children is likely to improve with age and to vary with contrast.
The contrast effect and the Age × Contrast interaction implicate phonological development as a factor.
Nevertheless, after 3 years of age, the combined effect of phonological development and other task-related factors on performance is relatively small.
Using a passing criterion of seven correct responses in eight trials, it is highly probable that a normally developing child, 3 years of age or older, will obtain a passing score on at least five of the six contrasts used here.
The contrast most likely to be failed is post-alveolar consonant place (CPr).
The effects of age, hearing loss, sensory assistance, phonological development, and task-related factors on the performance of children with hearing loss have yet to be determined. We can say with reasonable confidence, however, that poor performance on this test by a child with hearing loss who is younger than 3 years of age should not be attributed solely to poor auditory capacity.
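The passing criterion used in these conclusions can be related to chance performance with a short binomial calculation. The following sketch assumes that, under the null hypothesis, each of the eight binary trials for a contrast is scored correct with probability 0.5 and that trials and contrasts are independent. The function names are illustrative.

```python
from math import comb


def prob_at_least(k_min: int, n: int, p: float) -> float:
    """Binomial probability of k_min or more successes in n independent trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Chance of a passing score (7 or 8 correct in 8 binary trials) by guessing.
p_pass_by_chance = prob_at_least(7, 8, 0.5)

# Chance of passing at least 5 of the 6 contrasts by guessing alone,
# assuming the contrasts are independent (an assumption of this sketch).
p_five_of_six = prob_at_least(5, 6, p_pass_by_chance)

print(f"P(pass one contrast by chance) = {p_pass_by_chance:.4f}")
print(f"P(pass >= 5 of 6 contrasts by chance) = {p_five_of_six:.2e}")
```

Under these assumptions, a single contrast is passed by guessing with probability 9/256, or about 3.5%, and passing at least five of the six contrasts by guessing alone has a probability below one in a million.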
Acknowledgments
A portion of these data was reported earlier in Eisenberg, Martinez, and Boothroyd’s (2007) study. This work was funded by National Institutes of Health (NIH) Grant DC006238 to the House Ear Institute (Los Angeles, CA). Development of the prototype of OlimSpac was supported by NIH Grant DC004433. Jessica Barlow provided valuable input to this article.
Footnotes
We say “reduces” rather than “eliminates” because some nonwords may be similar enough to real words to influence perception. Moreover, some nonwords may have real words embedded in them, as was the case in the present study.
For evidence of visual enhancement in other speech perception tests, see Blamey et al. (2001).
For data showing both audio-visual and audio-only scores improving over time in pediatric cochlear implantees, see Boothroyd (1998).
During formative evaluation, we found it necessary to terminate the audio portion of the final vowel, and the accompanying masking noise, after about 300 ms. This was done to deal with the problem of eager children who began their imitation before the utterance and, therefore, the noise were finished, making it impossible for the tester to hear the full response.
References
- Blamey PJ, Sarant JZ, Paatsch LE, Barry JG, Bow CP, Wales RJ, Tooher R. Relationships among speech perception, production, language, hearing loss, and age in children with impaired hearing. Journal of Speech, Language, and Hearing Research. 2001;44:264–285. doi: 10.1044/1092-4388(2001/022).
- Boothroyd A. Evaluation of speech production in the hearing-impaired: Some benefits of forced-choice testing. Journal of Speech and Hearing Research. 1985;28:185–196. doi: 10.1044/jshr.2802.185.
- Boothroyd A. Auditory capacity of hearing-impaired children using hearing aids and cochlear implants: Issues of efficacy and assessment. Scandinavian Audiology. 1997a;26(Suppl 46):17–25.
- Boothroyd A. Speech perception tests and hearing-impaired children. In: Plant G, Spens KE, editors. Profound deafness and speech communication. London, England: Whurr Publishers; 1997b. pp. 345–371.
- Boothroyd A. Evaluating the efficacy of hearing aids and cochlear implants in children who are hearing-impaired. In: Bess FH, editor. Children with hearing impairment: Contemporary trends. Nashville, TN: Vanderbilt Bill Wilkerson Center Press; 1998. pp. 249–260.
- Boothroyd A. Assessment of auditory speech-perception capacity. In: Eisenberg L, editor. Clinical management of children with cochlear implants. San Diego, CA: Plural Publishing; 2009. pp. 189–216.
- Boothroyd A, Boothroyd-Turner D. Post-implant audition and educational attainment in children with prelingually-acquired profound deafness. Annals of Otology, Rhinology and Laryngology. 2002;111(Suppl 189):79–84. doi: 10.1177/00034894021110s517.
- Boothroyd A, Eisenberg LS, Martinez AS. Manual for OlimSpac 4.0. Los Angeles, CA: House Ear Institute; 2006.
- Boothroyd A, Eran O. Auditory speech-perception capacity of child implant users expressed as equivalent hearing loss [Monograph] Volta Review. 1994;96:151–168.
- Boothroyd A, Eran O, Hanin L. Speech perception and production in children with hearing impairment. In: Bess FH, Gravel JS, Tharpe AM, editors. Amplification for children with auditory deficits. Nashville, TN: Vanderbilt Bill Wilkerson Center Press; 1996. pp. 55–74.
- Brown CJ, Etler C, He S, O’Brien S, Erenberg S, Kim JR, Abbas PJ. The electrically evoked auditory change complex: Preliminary results from nucleus cochlear implant users. Ear and Hearing. 2008;29:704–717. doi: 10.1097/AUD.0b013e31817a98af.
- DesJardin J, Ambrose S, Martinez A, Eisenberg L. Relationships between speech perception abilities and spoken language skills in young children with hearing loss. International Journal of Audiology. 2009;48:248–259. doi: 10.1080/14992020802607423.
- Eisenberg LS, Martinez AS, Boothroyd A. Auditory-visual and auditory-only perception of phonetic contrasts in children [Monograph] Volta Review. 2003;102:327–346.
- Eisenberg LS, Martinez AS, Boothroyd A. Assessing auditory capabilities in young children. International Journal of Pediatric Otorhinolaryngology. 2007;71:1339–1350. doi: 10.1016/j.ijporl.2007.05.017.
- Hnath-Chisolm T, Laippley E, Boothroyd A. Age-related changes on a children’s test of sensory-level speech perception capacity. Journal of Speech, Language, and Hearing Research. 1998;41:94–106. doi: 10.1044/jslhr.4101.94.
- Kirk EU, Diefendorf AO, Pisoni DB, Robbins AM. Assessing speech perception in children. In: Mendel LL, Danhauer JL, editors. Audiological evaluation and management and speech perception assessment. San Diego, CA: Singular; 1997. pp. 101–132.
- Kosky C, Boothroyd A. Validation of an on-line implementation of the Imitative Test of Speech-Pattern Contrast Perception (IMSPAC) Journal of the American Academy of Audiology. 2003;14:72–83. doi: 10.3766/jaaa.14.2.3.
- Kuhl PK, Williams KA, Lacerda F, Stevens KN, Lindblom B. Linguistic experience alters phonetic perception in infants by 6 months of age. Science. 1992 January 31;255:606–608. doi: 10.1126/science.1736364.
- Martin B. Can the acoustic change complex be recorded in an individual with a cochlear implant? Separating neural responses from the cochlear implant artifact. Journal of the American Academy of Audiology. 2007;18:126–140. doi: 10.3766/jaaa.18.2.5.
- Martin B, Ali D, Leach-Berth T, Boothroyd A. Detection of the speech-evoked acoustic change complex (ACC) in individual listeners: Improving efficiency. Ear and Hearing in press.
- Martin B, Boothroyd A. Cortical, auditory, evoked potentials in response to periodic and aperiodic stimuli with the same spectral envelope. Ear and Hearing. 1999;20:33–44. doi: 10.1097/00003446-199902000-00004.
- Martin B, Boothroyd A. Cortical, auditory, evoked potentials in response to changes of spectrum and intensity. The Journal of the Acoustical Society of America. 2000;107:2155–2161. doi: 10.1121/1.428556.
- Meltzoff AN. Elements of a developmental theory of imitation. In: Meltzoff AN, Prinz W, editors. The imitative mind: Development, evolution, and brain bases. Cambridge, England: Cambridge University Press; 2002. pp. 19–41.
- Ostroff J, Martin B, Boothroyd A. Cortical evoked response to acoustic change within a syllable. Ear and Hearing. 1999;19:290–297. doi: 10.1097/00003446-199808000-00004.
- Smit AB, Hand L, Freilinger JJ, Bernthal JE, Bird A. The Iowa Articulation Norms Project and its Nebraska replication. Journal of Speech and Hearing Disorders. 1990;55:779–798. doi: 10.1044/jshd.5504.779.
- Werner LA, Rubel EW. Developmental psycho-acoustics. Washington, DC: American Psychological Association; 1992.