Abstract
Purpose
Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystem approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems.
Method
Nine acoustic variables reflecting different subsystems, and speech intelligibility, were measured in 22 children with CP. These children included 13 with a clinical diagnosis of dysarthria (SMI), and nine judged to be free of dysarthria (NSMI). Data from children with CP were compared to data from age-matched typically developing children (TD).
Results
Multiple acoustic variables reflecting the articulatory subsystem were different in the SMI group, compared to the NSMI and TD groups. A significant speech intelligibility prediction model was obtained with all variables entered into the model (Adjusted R-squared = .801). The articulatory subsystem showed the most substantial independent contribution (58%) to speech intelligibility. Incremental R-squared analyses revealed that any single variable explained less than 9% of speech intelligibility variability.
Conclusions
Children in the SMI group have articulatory subsystem problems as indexed by acoustic measures. As in the adult literature, the articulatory subsystem makes the primary contribution to speech intelligibility variance in dysarthria, with minimal or no contribution from other systems.
Keywords: speech acoustics, intelligibility, dysarthria, cerebral palsy
Previous research suggests that communication problems are present in a significant number of children with cerebral palsy, but estimates of how many children with CP have speech and/or language problems vary substantially across studies (e.g., 58% of over 400 children from the European registry study of Bax, Tydeman, & Flodmark, 2006; 50% of 68 children from the western Sweden registry study of Himmelmann, Lindh, & Hidecker, 2013; 31–88% of children from the older studies of Achilles, 1955, and Wolfe, 1950). There is little question as to the potential presence of speech and language problems in children with CP, but the specific types of communication problems have only recently begun to receive attention (see Hustad, Gorton, & Lee, 2010). A number of studies (Achilles, 1955; Wolfe, 1950; Platt, Andrews, Young & Quinn, 1980; Platt, Andrews, & Howie, 1980; Ansel & Kent, 1992) show clearly that dysarthria is a prominent speech problem in many adults with cerebral palsy. Because the disease and its behavioral manifestations are generally regarded to be non-progressive (Bax, Goldstein, Rosenbaum, Leviton, Paneth, Dan, Jacobsson, & Damiano, 2005), dysarthria is expected to be a prominent and permanent speech problem in children with CP (Otapowicz, Sobaniec, Kulak, & Sendrowski, 2007) and to persist into adulthood. The nature of the dysarthria in adults with CP, however, cannot be generalized to a straightforward description of dysarthria in the developing child with CP. This is because the speech motor control deficits present in a child almost certainly interact with typical developmental processes of speech motor control. The nature of the dysarthria in the developing child with CP, however, has not been studied in much detail. The current study examined speech acoustic and intelligibility variables of children with dysarthria secondary to CP, using a multiple speech subsystem approach and comparison to typically developing children. 
In addition, a speech intelligibility prediction model was tested with obtained speech acoustic and intelligibility data.
Many studies examining speech characteristics in children with CP are dated with regard to methods of study and tools for analysis. Studies in this area were mostly reported in the 1950s to 1980s (Achilles, 1955; Byrne, 1959; Clement & Twitchell, 1959; Farmer & Lencione, 1977; Hardy, 1961; Hixon & Hardy, 1964; Irwin, 1955a, 1955b; Kent & Netsell, 1978; Netsell, 1969; Wolfe, 1950; Workinger, 1986). A summary of this work is difficult to write because some studies focused on a particular subsystem (Netsell, 1969; Hardy, 1961), reported observations on only a few selected participants (Kent & Netsell, 1978) or an isolated speech behavior (Farmer & Lencione, 1977), or used maximum performance tasks to differentiate children with the spastic form of CP from typically-developing children (Wit, Maassen, Gabreëls, & Thoonen, 1993). What is clear from this literature, however, is that in children with dysarthria due to CP, any and all speech subsystems may be affected, as suggested by research on adult speakers with CP and dysarthria. The specific nature and impact of subsystem deficits, and their possible independent effects on speech intelligibility in children with CP, however, are minimally understood.
The general purpose of the present study was to address this gap in the literature by obtaining speech acoustic and speech intelligibility measures from relatively young children with CP and from typically-developing controls. The acoustic measures were carefully chosen to represent each of three speech subsystems (articulatory, resonatory, and laryngeal). The acoustic measures were compared across three groups of children: those with CP and a clinical diagnosis of dysarthria, those with CP whose speech motor control capabilities were judged to be typically-developing, and a control group of typically-developing (neurologically typical) children. Single-word intelligibility scores were collected from each child as well, and as described below a specific aim of the present work was to determine the ability of the acoustic measures to jointly and singly predict the speech intelligibility scores.
Research relevant to knowledge on speech production deficits in children with CP, and the relations of those deficits to perception of the speech of these children, is, as mentioned above, scarce. In a group of 50 children with CP, aged 6;4–19 years, Clarke and Hoops (1980) found that the number of articulatory errors on a standardized test predicted a scaled measure of “speech proficiency”, the latter being highly correlated with formal measures of speech intelligibility. Clarke and Hoops (1980) included measures of fundamental frequency (F0) and speech sound pressure level (SPL) in their prediction exercise, but neither of the acoustic variables made a significant contribution to the variation in scaled speech proficiency. In another multiple regression analysis, Pirila, van der Meere, Pentikainen, Ruussu-Niemi, Korpela, Kilpinen, & Nieminen (2007) failed to find a significant relationship between very gross estimates of speech motor impairment (based on a three-category designation of “normal”, “immature”, and “deviant”) and a similarly general estimate of severity-of-involvement in 36 children with CP, aged 1;10–9 years.
Improving speech intelligibility in individuals with dysarthria is often a crucial target for intervention. In the adult dysarthria literature, analyses of multiple acoustic variables and their relative functions as speech intelligibility predictors have been conducted (Kent, Kent, Weismer, Martin, Sufit, Brooks, & Rosenbek, 1989; Kim, Kent, & Weismer, 2011). Studies have revealed several important acoustic features that differ between speakers with dysarthria and control speakers without disorders, and that may contribute to intelligibility deficits. Two measures consistently identified as different between speakers with dysarthria and control speakers, and which seem to contribute significantly to variation in speech intelligibility scores, are the size of the vowel space and measures of second formant (F2) change (amount or rate) along major transitions (see review in Weismer, 2008). Lee and Hustad (2013), reporting on a group of 22 young children with CP and widely varying overall and speech severities, reported a moderately strong correlation between size of the acoustic vowel space and speech intelligibility. The children were studied first at approximately four years of age, and at six-month intervals thereafter until the children were five and a half years of age (four sampling points); the aforementioned correlation was consistent at each of these sampling points. Children in the Lee and Hustad (2013) study were substantially younger than children with CP from previous studies of speech production and speech intelligibility, and the obtained relationship between vowel space area and speech intelligibility was consistent with previously reported effects for adults with dysarthria.
It is obvious that a large number of variables in speech production (i.e., voice, nasality, formant movement) potentially contribute to speech intelligibility (de Bodt, Huici, & Van De Heyning, 2002). This follows from the results of previous literature showing that CP may affect multiple speech subsystems. Acoustic measures are attractive for prediction studies because they are non-invasive and can be interpreted both in terms of speech production deficits and the effect of the acoustic signal on speech intelligibility. Investigation of a broad set of acoustic variables in children with CP, and in typically developing children, can provide 1) quantification of the difference of various aspects of speech production between children with CP and typically developing children; 2) quantification of speech subsystem impairment in children with CP; and 3) the functional effect of these deviations on measures of speech intelligibility.
In the current study, acoustic variables reflecting different speech subsystems were examined to identify differences among groups of children who had an average age of 67 months. The children with CP were separated into clinically defined groups based on the presence or absence of speech motor impairment (Hustad, Gorton, & Lee, 2010). In addition, a speech intelligibility prediction model was tested using a multiple speech subsystem approach. Specific research questions addressed were as follows:
What are the segmental, voice, resonance, and intelligibility characteristics of speech in children with speech motor impairment (SMI) secondary to CP and in children with CP and no diagnosed speech motor impairment (NSMI), and how do they compare to the same characteristics in typically-developing children (TD)?
When including multiple acoustic variables reflecting different speech subsystems in a prediction model, which acoustic variables are the best predictors of intelligibility in children with CP? And what is the independent contribution of each acoustic variable to speech intelligibility among multiple acoustic variables reflecting different speech subsystems?
Method
Participants
Speakers
Twenty-two children with CP and 19 typically developing children participated as speakers. These children were participating in a longitudinal project on communication development in children with CP (Hustad et al., 2010; Lee & Hustad, 2013). Inclusion criteria for children with CP required that each child: (a) have a medical diagnosis of cerebral palsy; (b) be a native speaker of American English; (c) have hearing within normal limits; and (d) be able to produce single words in imitation. Eleven boys and eleven girls with CP participated. The average chronological age of children with CP was 67 months (range of 48–82 months, SD = 9.9).
Children with CP were assigned into two groups by two certified speech-language pathologists: those with dysarthria (Speech Motor Impairment (SMI group)), and those who had no clinical evidence of speech motor impairment (No Speech Motor Impairment (NSMI))1. Children with CP who presented clinical evidence of speech motor impairment in any one or more of the speech subsystems (articulation, phonation, resonation, respiration) that could be observed visually and/or audibly were assigned to the SMI group. Clinical evidence of speech motor impairment was operationalized to include any obvious audible evidence of dysarthria, as well as visual identification of abnormal orofacial and/or respiratory movements during speech. Children with CP who did not present any such evidence were assigned to the NSMI group. The agreement rate between the two sets of classifications (SMI vs. NSMI) by the first and second judges was 95% (21 of 22 children). The single disagreement was resolved by discussion and a joint decision between the two judges. Ultimately, the child in question was determined to best fit the SMI group.
Among the 22 children with CP, 13 had a clinical diagnosis of dysarthria (SMI group) and the remaining nine children did not have a clinical diagnosis of dysarthria or any other speech disorder (NSMI group). In addition, nineteen typically developing children (TD group) participated. Inclusion criteria for typically developing children were that each child: (a) had no known disabilities, based on parent report and examiner judgment; (b) was a native speaker of American English; (c) had hearing within normal limits; (d) passed the Preschool Language Scale-4 Screening Test (Zimmerman, Steiner, & Pond, 2005); and (e) obtained standard scores within normal limits on the Arizona Articulation Proficiency Scale, Third Edition (Fudala, 2000). Typically developing children were matched with children with CP based on chronological age and sex. The average chronological age of typically developing children was 64 months (range of 47–84 months, SD = 10.4). All speakers were recruited from the upper Midwest portion of the United States. Table 1 shows demographics of children with CP. The current study was approved by the Institutional Review Board of the University of Wisconsin-Madison.
Table 1.
Child | Group | CP Diagnosis | CA (months) | Sex |
---|---|---|---|---|
1 | SMI | mixed dyskinetic/spastic | 78.3 | F |
2 | SMI | Dyskinetic | 61.4 | M |
3 | SMI | Diplegia | 82.3 | M |
4 | SMI | Hemiplegia | 59.2 | F |
5 | SMI | Quadriplegia | 50.7 | F |
6 | SMI | Quadriplegia | 78.4 | M |
7 | SMI | Ataxia | 59.4 | M |
8 | SMI | Diplegia | 61.9 | F |
9 | SMI | Dyskinetic | 72.1 | F |
10 | SMI | Diplegia | 66.4 | F |
11 | SMI | Diplegia | 56.8 | F |
12 | SMI | Unknown | 76.3 | M |
13 | SMI | Hemiplegia | 66.6 | F |
14 | NSMI | Hemiplegia | 66.8 | F |
15 | NSMI | Hemiplegia | 48 | M |
16 | NSMI | Unknown | 49.5 | F |
17 | NSMI | Diplegia | 72.4 | M |
18 | NSMI | Diplegia | 69.9 | M |
19 | NSMI | Hemiplegia | 67.0 | F |
20 | NSMI | Diplegia | 70.0 | M |
21 | NSMI | Diplegia | 77.2 | M |
22 | NSMI | Hemiplegia | 76.3 | M |
Note. CA = Chronological Age; SMI = children with dysarthria secondary to CP; NSMI = children with CP and without diagnosed speech disorders; TD = typically developing children
Listeners
Eighty-two individuals participated as listeners in this study. Two listeners were randomly assigned to each child (41 children × 2 listeners = 82 listeners). Each listener heard only one child. Inclusion criteria required that listeners: (a) be a native speaker of American English; (b) pass a pure tone hearing screening at 25 dB HL for 250 Hz, 500 Hz, 1 kHz, 4 kHz, and 6 kHz bilaterally; (c) be between 18 and 40 years of age; (d) have no identified language, learning, or cognitive disabilities per self-report; and (e) have no more than incidental experience listening to people with communication disorders. Compensation was provided for all participants.
Materials and Procedures
Acquisition of speech samples from children
Single word stimuli from the Test of Children’s Speech (TOCS) (Hodge & Daniels, 2007) were used for this study. Children produced 38 different words that were lexically and phonetically appropriate for young children. The 38 words were used to generate speech intelligibility data for each child. Among these words, 13 were subjected to acoustic analysis. Five repetitions of each of these 13 words were included in random order on the lists used to elicit word productions from the children. The words selected for acoustic analysis included sheet, seat, hoot, boot, top, hot, bad, hat, pipe, whip, toys, big, and the nonsense word /mIm/. /mIm/ was included to facilitate an acoustic measure of nasality, described below. On average, four or five analyzable productions of each target word were obtained from each child.
A certified speech-language pathologist collected data from each child. Delayed imitation was employed to obtain productions of the target words. To ensure consistency across modeled productions, recordings of each target word were presented concurrently with a picture of the target word via a laptop computer. For the nonsense word, the symbols “mIm” were shown to the child, followed by an audio sample which the child repeated as a delayed imitation. Recordings of children were made in a sound attenuating suite using professional-quality audio recording equipment (Marantz PMD 570 recorder; Mackie 1202 VLZ Pro Mixer; Audio-Technica (AT4040) studio microphone). Audio samples were recorded at a sampling rate of 44.1 kHz (16-bit quantization). The speech signals were recorded with a condenser studio microphone placed approximately 18 inches from the child’s mouth. The level of the signal was monitored and adjusted to obtain optimized recordings and to avoid peak clipping.
Acquisition of speech intelligibility data from listeners
Listeners heard recordings of the children’s word productions in a sound attenuating room and, for each production, typed what they thought they heard into the computer. Listeners were seated in front of a 19-inch flat panel computer screen with a keyboard placed in front of them. The average audio output level for free-field listening was calibrated to approximately 75 dB SPL at the location of the listener. Speech stimuli were delivered via an in-house computer program that presented audio samples and stored typed data (orthographic transcriptions). Listeners were allowed to listen to each word once. The order of presentation of stimulus words was randomized for each listener. Listeners were instructed that children would be producing real words and to make their best guess if they were unsure about what the child said. Prior to the experiment, listeners were provided with instructions on how to use the experimental software. Each listener listened to only a single child to prevent learning effects that might be associated with hearing the same stimulus items produced by different children. This paradigm has been utilized regularly in previous studies (Hustad, Schueler, Schultz, & DuHadway, 2012; Hustad & Lee, 2008; Lee & Hustad, 2013).
Analysis of Data: Speech Acoustics and Intelligibility
Speech acoustics
Temporal, vowel spectral, nasality, and voice measures described below were selected as speech acoustic variables, to represent articulatory, velopharyngeal and laryngeal speech subsystems. A variable representing the respiratory subsystem (e.g., voice SPL) was not included for two reasons. First, a previous study of older children with CP, in which average voice SPL and its variability across utterances served as predictor variables for the dependent variable “speech proficiency,” failed to show a significant contribution to variation in the perceptual dependent variable (Clarke & Hoops, 1980). Second, technical aspects of obtaining voice SPL data that are comparable across children are complex, requiring absolute knowledge of equivalent mouth-to-microphone distances across children. The experimental setting for the current study did not permit absolute fixed mouth-to-microphone distances, and children’s voice SPLs differed. Variability in voice SPL of the current samples therefore includes the influence of fluctuating mouth-to-microphone distances, as well as gain adjustments for recording the speech samples. For these reasons, an acoustic variable representing the respiratory subsystem, such as voice SPL, was not included in the current study.
The speech acoustic data were obtained from the digital speech samples using a wideband spectrographic display, fast Fourier transform (FFT), and linear predictive coding (LPC) analyses in TF32 (Milenkovic, 2002), following established measurement criteria (Chen, 1995; Kent & Read, 2001; Kent et al., 1989; Klatt, 1976; Turner, Tjaden, & Weismer, 1995; Weismer & Berry, 2003).
Articulatory subsystem
First and second formant frequencies (F1 and F2) for the vowels /i/, /u/, /a/, and /æ/ were determined using both wideband spectrographic and spectrum displays from a 30 ms window centered at the temporal midpoint of each vowel. Linear predictive coding was used to generate formant tracks, which were hand corrected, as necessary, based on visual inspection of the spectrogram. A total of 1422 tokens (8 words × 41 children × 4 to 5 analyzable repetitions) were measured to obtain formant frequency data. Vowel space was calculated using the formula published by Johnson, Flemming, and Wright (2004).
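As a rough illustration of how a vowel space area in Hz² can be obtained from four corner vowels, the sketch below applies the shoelace formula to (F1, F2) pairs. The shoelace formula is used here as a stand-in and is an assumption; the exact published formula cited above is not reproduced in this text. The function name and the formant values are hypothetical and are not data from the study.

```python
def vowel_space_area(corners):
    """Return the area (Hz^2) of the polygon whose vertices are (F1, F2) pairs.

    `corners` must list the vowels in order around the perimeter,
    for example /i/ -> /ae/ -> /a/ -> /u/. Uses the shoelace formula
    (an assumption; not necessarily the formula used in the study).
    """
    twice_area = 0.0
    n = len(corners)
    for k in range(n):
        f1_a, f2_a = corners[k]
        f1_b, f2_b = corners[(k + 1) % n]  # wrap around to close the polygon
        twice_area += f1_a * f2_b - f1_b * f2_a
    return abs(twice_area) / 2.0


# Hypothetical vowel-midpoint formants (Hz) for one child.
child = [
    (400.0, 3000.0),   # /i/
    (1000.0, 2400.0),  # /ae/
    (1100.0, 1600.0),  # /a/
    (450.0, 1400.0),   # /u/
]
area = vowel_space_area(child)
print(area)
```

The vertices must be supplied in perimeter order; listing them in an arbitrary order would describe a self-intersecting shape and give a misleading area.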
Duration of the vowels /i/, /u/, /a/ and /æ/ was determined by measuring the interval between the first and last glottal pulse where both F1 and F2 were visible on the spectrogram. A total of 1424 tokens were measured to obtain vowel duration data. The vowel durations in this study were measured from single words. There is evidence in the adult literature on dysarthria (Weismer, Martin, Kent, & Kent, 1992) that single-word vowel durations increase as speech intelligibility decreases. Moreover, segment durations derived from single word productions seem to be moderately to highly correlated with speech intelligibility even in neurologically-normal speakers (Hazan & Markham, 2004) and in some cases in adults with CP (Ansel & Kent, 1992; Rong, Loucks, Kim, & Hasegawa-Johnson, 2012). Clarke and Hoops (1980) showed that speaking rate made a significant contribution to speech intelligibility in older children with CP, and it is well known that dysarthria in general, as well as neurological immaturity (a speech motor control system still under development), may be associated with slower-than-normal speaking rates (see reviews in Kent, Weismer, Kent, Vorperian, & Duffy, 1999; Kent, 1983; and Kim & Stoel-Gammon, 2010).
F2 slope in transition was included in the variable set because of the consistent finding of shallower-than-normal slopes in adult speakers with dysarthria (Weismer, Yunusova, & Bunton, 2012), and previously reported correlational links between F2 slope reduction and speech intelligibility (Weismer, Jeng, Laures, Kent, & Kent, 2001). The three words chosen for the slope measures (“pipe”, “toys”, and “whip”) all require relatively rapid, large changes in vocal tract configuration for successful production, and are therefore associated with steep and extensive F2 transitions. F2 transitions in the sonorant parts of each of these words are all rising; in the case of the diphthongs (“pipe” and “toys”) and the labio-lingual glide to the following vowel /I/ (“whip”), the major rising transition sometimes follows a brief steady-state in F2. The onset and offset of the major transition were defined by the “20/20” rule (Weismer & Berry, 2003), which marked the boundaries for extraction of transition duration and the F2 change across that time interval. Computed this way, all F2 slopes were average slopes for the whole transition. A total of 515 tokens were measured to obtain F2 slope data.
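The average-slope computation described above can be sketched as transition extent divided by transition duration. The sketch assumes the onset and offset of the major transition have already been located (for example by the 20/20 rule, which is not implemented here); the function name and token values are hypothetical.

```python
def average_f2_slope(onset_ms, offset_ms, f2_onset_hz, f2_offset_hz):
    """Return (duration_ms, extent_hz, slope_hz_per_ms) for one F2 transition.

    Onset and offset are assumed to have been marked beforehand;
    the slope is the average over the whole transition.
    """
    duration_ms = offset_ms - onset_ms
    if duration_ms <= 0:
        raise ValueError("offset must come after onset")
    extent_hz = abs(f2_offset_hz - f2_onset_hz)
    return duration_ms, extent_hz, extent_hz / duration_ms


# Two hypothetical tokens: (onset ms, offset ms, F2 at onset Hz, F2 at offset Hz).
tokens = [(0.0, 100.0, 1000.0, 2000.0), (0.0, 200.0, 1000.0, 1800.0)]
results = [average_f2_slope(*t) for t in tokens]

# Mean of the per-token slopes.
mean_slope = sum(r[2] for r in results) / len(results)

# Ratio of mean extent to mean duration, which is generally a different number.
mean_duration = sum(r[0] for r in results) / len(results)
mean_extent = sum(r[1] for r in results) / len(results)
ratio_of_means = mean_extent / mean_duration

print(mean_slope, ratio_of_means)
```

Because slope is computed per token and then averaged, a group's mean slope need not equal its mean extent divided by its mean duration, which is worth keeping in mind when reading slope, duration, and extent means side by side.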
Velopharyngeal subsystem
Degree of nasalization in oral vowels was estimated using Chen’s (1995) extra-pole analysis. The difference between the amplitudes of the first formant and the extra peak (A1-P1) introduced by oro-nasal coupling was measured as an index of the degree of nasalization in oral vowels. Chen’s index correlates with the perception of hypernasality, and can serve as a non-invasive measure of velopharyngeal function (Chen, 1995). To measure the amplitude of the extra peak, the frequency location of the peak must be identified in advance. Chen (1995) identified this extra peak frequency to be located typically, in adults, around 950 Hz. Because children served as participants in this study, Chen’s adult-based estimate of this frequency might be inappropriate for the current analyses. A procedure to estimate the frequency of the extra peak associated with oro-nasal coupling in each child was therefore developed and implemented as follows. A nonsense word containing the labial nasal consonants /m/ surrounding the high-front lax vowel /I/ (/mIm/) was produced by each child. The reasoning was that measurement of the extra peak during the vocalic portion of this nonsense word would allow clear identification of the extra peak induced by oro-nasal coupling, as a result of the nasal consonant coarticulation effect on the vowel. The peak was estimated in /mIm/ for each child and used in the A1-P1 measurement.
To quantify the degree of nasalization of oral vowels in a non-nasal environment, A1-P1 of /I/ in “big” was measured, using the child-specific estimate of P1 derived from /mIm/. A1-P1 was measured with a 10 ms window at five successive locations across the vowel, at the 10%, 30%, 50%, 70%, and 90% time points of total vowel duration, in both /mIm/ and “big.” When P1 was not identifiable from the spectra, the second harmonic after the first formant peak was designated as the location of the extra peak (Chen, 1995). On average, four repetitions of both words were measured in each child. A total of 314 tokens were measured to obtain A1-P1 data.
A non-invasive estimate of velopharyngeal function was included in the prediction variable set because of the assumption that there is a relationship between velopharyngeal incompetency and speech intelligibility. Little if any work has been done in the area of dysarthria on possible relationships between velopharyngeal incompetency and speech intelligibility, but literature on children with craniofacial anomalies (e.g., Kummer, 2011) suggests, at the least, an ordinal difference in speech intelligibility between the insufficiently versus well-functioning velopharyngeal port.
Laryngeal subsystem
Mean fundamental frequency (F0) and signal-to-noise ratio (SNR) of the vowel /a/ in the target word “Top” were measured. A single vowel was chosen for the laryngeal subsystem analyses to avoid mixing vowels which may have different intrinsic F0 (Sussman & Sapienza, 1994; Whalen & Levitt, 1995) or differential effects on harmonic-to-noise measures (Maccallum, Zhang, & Jiang, 2011). The vowel /a/ was chosen for the laryngeal-subsystem analyses because it has been used frequently in the voice literature for measuring F0 (Campisi, Tewfik, Pelland-Blais, Husein, & Sadeghi, 2000; Parsa & Jamieson, 2001). F0 and signal-to-noise ratio were obtained from the TF32 voice analysis algorithms, solely for the voiced interval of /a/ in “top”. Collectively, these measures were chosen because of their ability to reflect the general integrity of laryngeal mechanisms for voicing, including basic postural settings of laryngeal musculature (e.g., F0). A total of 186 tokens were measured to obtain F0 and SNR data.
Speech intelligibility
Word intelligibility scores were calculated as the number of words identified correctly divided by the number of possible words multiplied by 100. The word intelligibility score of each child was based on the average value of word intelligibility scores obtained from two listeners. If the average difference in word intelligibility scores between the two listeners (per child) was more than 10%, data were obtained from a third listener and the two intelligibility data points that differed by less than 10% were used. This occurred in 5 instances among 82 intelligibility data points.
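The scoring rule above can be sketched as follows. The helper names are hypothetical, and one detail is an assumption: when a third listener is added and more than one pair of scores falls within the 10-point limit, the sketch uses the closest pair, which the text does not specify.

```python
from itertools import combinations


def word_intelligibility(n_correct, n_words):
    """Percentage of words a listener identified correctly."""
    return 100.0 * n_correct / n_words


def child_score(listener_scores, max_gap=10.0):
    """Average the two closest listener scores for one child.

    With two listeners this is their mean, provided they differ by no more
    than `max_gap` points. With three listeners, the closest pair is used
    (tie-breaking by closeness is an assumption, not stated in the source).
    """
    a, b = min(combinations(listener_scores, 2),
               key=lambda pair: abs(pair[0] - pair[1]))
    if abs(a - b) > max_gap:
        raise ValueError("no pair of listeners agrees within max_gap; "
                         "another listener is needed")
    return (a + b) / 2.0


# Hypothetical examples; 38 is the number of TOCS words scored per child.
one_listener = word_intelligibility(19, 38)
two_listeners = child_score([50.0, 55.0])
three_listeners = child_score([40.0, 60.0, 62.0])
print(one_listener, two_listeners, three_listeners)
```

In this sketch a two-listener pair that disagrees by more than 10 points raises an error rather than returning a score, mirroring the point at which a third listener would be recruited.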
Relationship of Current Data to Lee and Hustad (2013)
Data reported in this study overlap to a small extent with data reported by Lee and Hustad (2013). Specifically, Lee and Hustad (2013) reported vowel space and intelligibility data collected from children with CP at four sampling points, from an average age of 50 months to an average age of 67 months. The acoustic vowel space and intelligibility data reported in the current study are from the fourth sampling point in Lee and Hustad (2013), and are included here for the prediction part of the study. All other measures in the current study, including all data from the TD group, have not been previously reported.
Reliability
Inter-judge reliability was obtained for all acoustic measures. A second judge, who was trained in speech acoustic analysis, made an independent set of acoustic measures for 10% of the stimuli. Correlation values across the initial and second measurements of the nine acoustic variables ranged between 0.86 and 0.99. The mean absolute difference values were 15.6 Hz (F1), 20.8 Hz (F2), 3.6 ms (vowel duration), 1.1 Hz/ms (F2 slope), 18.56 Hz (extra peak frequency), 0.23 dB (A1-P1), 3.1 Hz (F0), and 0.50 dB (SNR). The reliability data for formant frequency and vowel duration measures are consistent with those reported in prior investigations and were judged to be within an acceptable range for measurement error for these kinds of variables (Monsen & Engebretson, 1983; Tjaden & Weismer, 1998). Detailed reliability data are available in Lee (2010).
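A minimal sketch of the two reliability summaries mentioned above, mean absolute difference and a correlation between the two judges' measurements, is given below. The type of correlation is not stated in the text, so Pearson's r is assumed; the function names and the duration values are hypothetical.

```python
from math import sqrt


def mean_absolute_difference(first, second):
    """Mean of |a - b| over paired measurements from two judges."""
    return sum(abs(a - b) for a, b in zip(first, second)) / len(first)


def pearson_r(first, second):
    """Pearson correlation between paired measurements (assumed statistic)."""
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    return cov / sqrt(var_a * var_b)


# Hypothetical vowel durations (ms) measured by two judges on the same tokens.
judge_1 = [250.0, 180.0, 210.0, 300.0]
judge_2 = [254.0, 177.0, 212.0, 295.0]

mad = mean_absolute_difference(judge_1, judge_2)
r = pearson_r(judge_1, judge_2)
print(mad, r)
```

The two summaries answer different questions: the correlation reflects whether the judges rank tokens similarly, while the mean absolute difference reflects how far apart their measurements are in the measure's own units.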
Experimental Design and Analysis
To address the research questions, a one-way analysis of variance (ANOVA) across groups (TD, NSMI, SMI) was conducted for each variable. Fisher’s LSD post-hoc tests were employed to examine pairwise group differences for significant variables. Because the study was exploratory in nature, an alpha level of 0.05 was employed for each test. A simultaneous method of multiple linear regression was then employed to investigate acoustic predictors of speech intelligibility. Incremental R² change was examined using hierarchical analysis to investigate the independent contribution of each variable to speech intelligibility.
Results
Acoustic Variables
Descriptive data for each acoustic variable, in the form of group means and standard deviations, are presented in Table 2. Variables in Table 2 are organized according to the speech subsystem they are assumed to reflect. For the articulatory subsystem variables, average values for the components (transition duration and transition extent) of the slope measures are also reported in Table 2. Transition duration and transition extent were not tested statistically because of their partial redundancy with the slope measures.
Table 2.
Speech Subsystem | Measure | Acoustic Variable | SMI Group average (SD) | NSMI Group average (SD) | TD Group average (SD) |
---|---|---|---|---|---|
Articulatory Subsystem | Vowel Space | Vowel space (Hz²) | 542914 (307412) | 907432 (193302) | 957023 (159125) |
Articulatory Subsystem | Duration | Vowel duration (ms) | 250.6 (145.9) | 173.7 (28) | 176.8 (32.5) |
Articulatory Subsystem | F2 Slope | “Pipe” (Hz/ms) | 6.9 (3.6) | 9.4 (0.9) | 9.9 (1.8) |
Articulatory Subsystem | F2 Slope | “Whip” (Hz/ms) | 7.1 (4.5) | 12.7 (2.3) | 11.4 (2.4) |
Articulatory Subsystem | F2 Slope | “Toys” (Hz/ms) | 7.5 (3.5) | 12.3 (5.2) | 10.1 (1.8) |
Articulatory Subsystem | F2 Transitional Duration | “Pipe” (ms) | 152.3 (197.5) | 111.7 (25.4) | 128.9 (26.2) |
Articulatory Subsystem | F2 Transitional Duration | “Whip” (ms) | 143.5 (108.5) | 105.2 (20.8) | 107.7 (23.3) |
Articulatory Subsystem | F2 Transitional Duration | “Toys” (ms) | 209.7 (63.2) | 188.5 (49.2) | 199.8 (19.3) |
Articulatory Subsystem | F2 Transitional Extent | “Pipe” (Hz) | 771 (399) | 1068 (261) | 1237 (239) |
Articulatory Subsystem | F2 Transitional Extent | “Whip” (Hz) | 884 (549) | 1335 (388) | 1197 (297) |
Articulatory Subsystem | F2 Transitional Extent | “Toys” (Hz) | 1285 (553) | 2048 (263) | 1962 (289) |
Velopharyngeal Subsystem | A1-P1 | “Big” (dB) | 19.0 (4.9) | 22.2 (3.2) | 19.7 (2.9) |
Laryngeal Subsystem | F0 | “Top” (Hz) | 274.4 (53.9) | 249.5 (47.5) | 237.6 (24.8) |
Laryngeal Subsystem | SNR | “Top” (dB) | 14.4 (3.2) | 13.0 (3.9) | 14.2 (1.9) |
Word Intelligibility | | Word intelligibility (%) | 44.1 (27.4) | 80.3 (8.7) | 82.0 (9.6) |
The ANOVAs revealed the following variables to be significantly different among the three groups: vowel space (F(2, 38) = 14.310, p < 0.0001), vowel duration (F(2, 38) = 3.368, p = 0.045), and F2 slopes for all three words (“Pipe”: F(2, 38) = 6.507, p = 0.0037; “Whip”: F(2, 38) = 10.158, p = 0.0003; “Toys”: F(2, 38) = 5.518, p = 0.0079). Effect sizes for the significant group effects, estimated by means of η², ranged from a relatively weak 0.15 for the vowel duration variable to a moderate 0.43 for the acoustic vowel space variable. The specifics of the ANOVA analyses are presented in Table 3. Pairwise post-hoc tests using Fisher’s LSD approach indicated the pairwise contrasts that contributed to the significant main effects; these are summarized in Table 4. In the following section, findings are further described according to each speech subsystem.
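For readers who wish to check the arithmetic, the F ratio and η² for a one-way ANOVA follow directly from the group and error sums of squares. The sketch below uses the standard definitions (F as the ratio of mean squares, η² as the group sum of squares over the total) with the vowel space and word intelligibility sums of squares reported in Table 3; the function name is hypothetical.

```python
def f_and_eta_squared(ss_group, df_group, ss_error, df_error):
    """Return (F, eta squared) from one-way ANOVA sums of squares."""
    mean_square_group = ss_group / df_group
    mean_square_error = ss_error / df_error
    f_ratio = mean_square_group / mean_square_error
    eta_squared = ss_group / (ss_group + ss_error)
    return f_ratio, eta_squared


# Sums of squares and degrees of freedom as reported in Table 3.
f_vowel_space, eta_vowel_space = f_and_eta_squared(
    1422537649612, 2, 1888722377633, 38)
f_intelligibility, eta_intelligibility = f_and_eta_squared(
    12383, 2, 11279, 38)

print(round(f_vowel_space, 3), round(eta_vowel_space, 2))
print(round(f_intelligibility, 2), round(eta_intelligibility, 2))
```

Both rows reproduce the tabled values to rounding (F of about 14.31 with η² of about .43 for vowel space, and F of about 20.86 with η² of about .52 for word intelligibility).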
Table 3.
Source (Group difference) | Acoustic Variable | Sum of squares | df | F | Sig | η² |
---|---|---|---|---|---|---|
Vowel Space | Vowel space (Hz²) | 1422537649612 | 2 | 14.310 | <0.0001** | .43 |
| Error | 1888722377633 | 38 | | | |
Duration | Vowel duration (ms) | 49731 | 2 | 3.368 | 0.0450* | .15 |
| Error | 280553 | 38 | | | |
F2 Slope | “Pipe” (Hz/ms) | 75 | 2 | 6.507 | 0.0037** | .26 |
| Error | 218 | 38 | | | |
| “Whip” (Hz/ms) | 208 | 2 | 10.158 | 0.0003** | .35 |
| Error | 388 | 38 | | | |
| “Toys” (Hz/ms) | 122 | 2 | 5.518 | 0.0079** | .23 |
| Error | 421 | 38 | | | |
A1-P1 (dB) | “Big” (dB) | 59 | 2 | 2.137 | 0.1320 | .10 |
| Error | 525 | 38 | | | |
F0 | “Top” (Hz) | 10556 | 2 | 3.313 | 0.0551 | .14 |
| Error | 64067 | 38 | | | |
SNR | “Top” (dB) | 11 | 2 | 0.710 | 0.4978 | .04 |
| Error | 305 | 38 | | | |
Word Intelligibility (%) | | 12383 | 2 | 20.860 | <0.0001** | .52 |
| Error | 11279 | 38 | | | |
* p-value < 0.05;
** p-value < 0.01
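The F ratios and η² values in Table 3 follow directly from the tabulated sums of squares. The short Python sketch below recomputes them for three rows whose sums of squares are reported without heavy rounding; the function and variable names are illustrative and are not part of the original analysis. Rows with small, rounded sums of squares (for example, the F2 slope rows) would reproduce the published F values only approximately.

```python
# Sums of squares copied from Table 3: (between-groups SS, error SS).
TABLE3_SS = {
    "vowel_space": (1422537649612, 1888722377633),
    "vowel_duration": (49731, 280553),
    "word_intelligibility": (12383, 11279),
}
# Degrees of freedom as reported in Table 3 (3 groups, 41 children in total).
DF_BETWEEN, DF_ERROR = 2, 38


def anova_f(ss_between, ss_error, df_between=DF_BETWEEN, df_error=DF_ERROR):
    """F ratio = mean square between groups / mean square error."""
    return (ss_between / df_between) / (ss_error / df_error)


def eta_squared(ss_between, ss_error):
    """Eta squared = SS_between / (SS_between + SS_error)."""
    return ss_between / (ss_between + ss_error)


for name, (ss_b, ss_e) in TABLE3_SS.items():
    print(f"{name}: F = {anova_f(ss_b, ss_e):.2f}, eta^2 = {eta_squared(ss_b, ss_e):.2f}")
```

For these three rows the recomputed values agree with the tabulated F and η² to two decimal places.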
Table 4.
Group comparisons | Statistically Significant Variables | Direction of the effect | Mean Difference | Std. Error | p-value |
---|---|---|---|---|---|
NSMI vs. TD | None | | | | |
SMI vs. TD | Vowel Space | SMI < TD | −372739 | 83749 | <0.0001 |
| Vowel Duration | SMI > TD | 67.5 | 30.6 | 0.0337 |
| F2 Slope “Pipe” | SMI < TD | −2.7 | 0.9 | 0.0043 |
| F2 Slope “Whip” | SMI < TD | −4.1 | 1.1 | 0.0007 |
| Word intelligibility | SMI < TD | −37.87 | 6.20 | <0.0001 |
SMI vs. NSMI | Vowel Space | SMI < NSMI | 296313 | 105382 | 0.0078 |
| Vowel Duration | SMI > NSMI | −76.887 | 37.259 | 0.0459 |
| F2 Slope “Pipe” | SMI < NSMI | 2.1 | 1.1 | 0.048 |
| F2 Slope “Whip” | SMI < NSMI | 5.9 | 1.4 | 0.0001 |
| F2 Slope “Toys” | SMI < NSMI | 4.6 | 1.5 | 0.0042 |
| Word intelligibility | SMI < NSMI | 36.15 | 7.47 | <0.0001 |
Articulatory subsystem
The mean group differences and their direction for the pairwise group contrasts are presented in Table 4. There were no significant differences between the TD and NSMI groups, but many significant differences between the TD and SMI groups and between the NSMI and SMI groups. The TD and NSMI groups had significantly larger vowel spaces, shorter vowel durations, and steeper F2 slopes as compared to the SMI group. The pattern of significant pairwise contrasts was essentially the same for the TD versus SMI and NSMI versus SMI groups. In the case of the F2 slope differences between the SMI and the two other groups, examination of the transition duration and transition extent means in Table 2 suggests that both contributed to the shallower slopes in the former group. Transition duration was consistently longer and transition extent consistently smaller for the children in the SMI group, as compared to children in both the NSMI and TD groups.
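As a rough arithmetic check on this interpretation, the ratio of mean transition extent to mean transition duration can be compared with the mean F2 slope for each group. The sketch below does this for “Whip” using the Table 2 means. It assumes that slope is extent divided by duration, and the ratio of two group means is only an approximation to the mean of the per-child slopes. All names are illustrative.

```python
# Group means for "Whip" copied from Table 2.
WHIP_MEANS = {
    # group: (transition extent in Hz, transition duration in ms, reported mean slope in Hz/ms)
    "SMI": (884.0, 143.5, 7.1),
    "NSMI": (1335.0, 105.2, 12.7),
    "TD": (1197.0, 107.7, 11.4),
}


def slope_from_means(extent_hz, duration_ms):
    """Approximate slope (Hz/ms) as mean extent divided by mean duration."""
    return extent_hz / duration_ms


for group, (extent, duration, reported) in WHIP_MEANS.items():
    approx = slope_from_means(extent, duration)
    print(f"{group}: extent/duration = {approx:.1f} Hz/ms, reported mean slope = {reported} Hz/ms")
```

In each group the ratio falls within about 1 Hz/ms of the reported mean slope, and the SMI ratio is roughly half that of the other two groups, which fits the reading that both the longer duration and the smaller extent contribute to the shallower slope.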
Velopharyngeal subsystem
The extra peak frequency (P1) determined empirically for each child’s production of /mIm/ ranged from 870 Hz to 1714 Hz, with an average of 1198 Hz.
Descriptive data (see Table 2) for A1-P1 showed average values of 19.7, 22.2, and 19.0 dB for the TD, NSMI, and SMI groups, respectively. Each of these mean values is well within the normal range of A1-P1 values reported for typically-developing teenagers by Chen (1995, see her Figure 3). ANOVA showed that the A1-P1 variable was not significantly different across the three groups.
Laryngeal subsystem
Descriptive data (see Table 2) for the laryngeal subsystem variables suggested that the greatest differences tended to occur between children in the SMI group and children in the TD group. Higher F0 was observed for children in the SMI group compared to children in the TD and NSMI groups. ANOVA results failed to reveal statistically significant group effects for either of the laryngeal subsystem variables.
Subsystems analysis: a summary
In the current subsystem analysis, only articulatory variables (vowel space, vowel duration, and F2 slopes) statistically differentiated the children in the SMI group from children in the other two groups. For all variables, the NSMI and TD groups were statistically equivalent. When statistical effects were found between the SMI and TD groups or between the SMI and NSMI groups, they were in the direction expected from the adult literature. Specifically, vowel spaces were smaller, vowel durations longer, and F2 slopes shallower in the SMI group, as compared to either of the other two groups.
Speech Intelligibility
Table 2 shows the group means and standard deviations for the word intelligibility measure. ANOVA showed significant differences among the three groups (F(2, 38) = 20.860, p < 0.0001) and a moderate effect size of 0.52 (Table 3) for the significant group effect. Post-hoc LSD tests revealed significant group contrasts between the SMI and TD groups, and the SMI and NSMI groups (Table 4). The significant differences between the SMI and both the NSMI and TD groups, and the absence of a significant difference between the NSMI and TD groups, are consistent with the clinical assignment of children with CP to the two subgroups.
Contribution of Acoustic Variables to Speech Intelligibility
Pearson product moment correlation analyses were performed among all pairwise combinations of variables; regression analyses were conducted to identify the independent contribution of the acoustic variables to variations in speech intelligibility. For the purposes of the correlation and regression analyses, data from the NSMI and SMI groups were combined (n = 22) to yield greater statistical power. Correlation analyses of the acoustic variables against word intelligibility revealed that F2 slope of “Whip” was most highly and positively correlated with word intelligibility (r = 0.85; R² = 0.72) in children with CP.
Multiple linear regression was performed to determine predictors of speech intelligibility. For this analysis, a reduced set of predictor variables was selected. Six predictor variables were chosen according to the following criteria: a) at least one measure to represent each of the three subsystems, b) low correlations with other potential predictor variables, and c) previous evidence in the literature of sensitivity of the variable to dysarthria.
The selected measures included vowel space, vowel duration, and average F2 slope (articulatory subsystem), A1-P1 index (velopharyngeal subsystem), and F0 and SNR (laryngeal subsystem). Average F2 slope of the three target words was employed as a variable instead of choosing an F2 slope from one of the three words. As shown in Table 5, correlations were still observed among the six selected acoustic variables even after applying the criteria for the reduced set of predictor variables.
Table 5.
| Vowel Space | Vowel Duration | Average F2 Slope | A1-P1 | F0 | SNR |
---|---|---|---|---|---|---|
Vowel Space | 1 | −.599** | .603** | .002 | −.419 | −.345 |
Vowel Duration | | 1 | −.728** | −.311 | −.064 | .528* |
Average F2 Slope | | | 1 | .427* | .002 | −.427* |
A1-P1 | | | | 1 | .093 | −.264 |
F0 | | | | | 1 | .325 |
SNR | | | | | | 1 |
* p-value < 0.05
** p-value < 0.01
Multiple linear regression model
Three approaches to multiple linear regression modeling were completed in this analysis. First, all six predictor variables were treated as a single block for prediction of intelligibility. Second, predictor variables were entered in blocks representing subsystems. Third, each variable was entered in the second block and the remaining five variables were entered in the first block.
All Variables
The present study employed a simultaneous method of multiple linear regression. This method enters all six variables simultaneously to predict speech intelligibility. In previous literature investigating predictors of speech intelligibility (Ansel & Kent, 1992; de Bodt et al., 2002; Neel, 2008; Whitehill & Ciocca, 2000) or speech proficiency (Clarke & Hoops, 1980), various multiple regression methods have been used (e.g., stepwise regression, selecting and entering variables that were highly correlated with speech intelligibility). In this study, it was crucial to enter all acoustic variables simultaneously, in a single block, to represent the combined influence of the three subsystems on speech intelligibility scores.
Using the simultaneous method, a significant model for all children with CP (n = 22) emerged (F(6, 15) = 15.101, p < 0.0001, adjusted R² = .801). See Table 6 for statistical results of the model including the beta coefficients. Average F2 slope and F0 were significant predictors of speech intelligibility based on the beta coefficients in this model. The variance inflation factor (VIF) values well below 10 in the table indicate that the multiple regression assumption regarding multicollinearity was not violated even with the observed inter-correlations among the few variables described above (Cohen, Cohen, West, & Aiken, 2003, p. 423). Post-hoc statistical power of this model was 0.999.
Table 6.
Predictor Variable | Unstandardized Coefficients B | Standardized Coefficients Beta | t | p-value | VIF+ |
---|---|---|---|---|---|
Vowel Space | <0.0001 | 0.200 | 1.137 | .273 | 3.268 |
Vowel Duration | −.074 | −.313 | −1.743 | .102 | 3.410 |
A1-P1 | .657 | .106 | .901 | .382 | 1.453 |
Average F2 Slope | 4.350 | .509 | 3.035 | .008* | 2.969 |
F0 of “Top” | −.225 | −.414 | −2.968 | .010* | 2.058 |
SNR of “Top” | 2.240 | .275 | 2.066 | .057 | 1.873 |
* p-value under 0.05.
+ VIF = the Variance Inflation Factor
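The summary statistics reported for the simultaneous model are linked by standard formulas, which the following sketch applies to the reported total R² (0.858, from Tables 7 and 8), n = 22, and six predictors. It also inverts the usual definition VIF = 1 / (1 − R²_j) to show how much of one predictor's variance is shared with the other five. Function names are illustrative, and small discrepancies from the published values reflect rounding of R².

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def overall_f(r2, n, k):
    """Overall model F with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def r2_from_vif(vif):
    """R^2 of one predictor regressed on the others, recovered from its VIF."""
    return 1.0 - 1.0 / vif


R2_TOTAL, N_CHILDREN, N_PREDICTORS = 0.858, 22, 6

print(f"adjusted R^2 = {adjusted_r2(R2_TOTAL, N_CHILDREN, N_PREDICTORS):.3f}")
print(f"F(6, 15)     = {overall_f(R2_TOTAL, N_CHILDREN, N_PREDICTORS):.2f}")
# Largest VIF in Table 6 (vowel duration, 3.410).
print(f"shared variance for vowel duration = {r2_from_vif(3.410):.2f}")
```

Under these formulas the largest VIF in Table 6 corresponds to about 71% of the variance in vowel duration being shared with the remaining predictors, which is substantial but still below the VIF threshold of 10 cited in the text.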
Subsystems Blocks
Incremental R² change for two successive subsystems blocks was examined using hierarchical analysis (Cohen et al., 2003, p. 168) to investigate the independent contribution of each speech subsystem to speech intelligibility. For this analysis, two blocks were employed. To examine the contribution of each speech subsystem, variables of each subsystem were entered simultaneously in the second block and the variables of the remaining two subsystems were entered in the first block. All three speech subsystems were rotated in the second block.
The independent contribution of each speech subsystem in this model is reported in Table 7, where the rank of each speech subsystem’s incremental R² change is provided. The third column of Table 7 shows the increment in R² of the second block as an independent contribution to the prediction model. The sum of the R² change with the single speech subsystem in the second block and the R² with the remaining two speech subsystems in the first block yields the total R² of the model described above. A large contribution of the articulatory subsystem to speech intelligibility in children with CP was observed.
Table 7.
Rank | Second block Speech Subsystem | R² change with second block speech subsystem | R² with the remaining two speech subsystems in the first block |
---|---|---|---|
1 | Articulatory Subsystem | 0.579 | 0.279 |
2 | Laryngeal Subsystem | 0.088 | 0.770 |
3 | Velopharyngeal Subsystem | 0.008 | 0.850 |
Total R² | 0.858 | | |
Adjusted R² | 0.801 | | |
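The block-rotation procedure can be sketched in a few lines of Python. The example below is a minimal illustration on synthetic data, not a reanalysis: the predictor values, the weights used to generate the outcome, and all function names are assumptions made for demonstration. Each block in turn is entered last, and its R² change is the total R² minus the R² of the model containing only the other blocks.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit of y on the predictor columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def rotate_blocks(blocks, y):
    """Enter each named block last; return total R^2 and {block: (first-block R^2, R^2 change)}."""
    total = r_squared([c for cols in blocks.values() for c in cols], y)
    result = {}
    for name in blocks:
        first = [c for other, cols in blocks.items() if other != name for c in cols]
        r2_first = r_squared(first, y)
        result[name] = (r2_first, total - r2_first)
    return total, result


# Synthetic stand-in data: 22 cases, with 3 + 1 + 2 predictors grouped as in the text.
random.seed(1)
n_cases = 22
noise = lambda: [random.gauss(0.0, 1.0) for _ in range(n_cases)]
blocks = {
    "articulatory": [noise(), noise(), noise()],
    "velopharyngeal": [noise()],
    "laryngeal": [noise(), noise()],
}
y = [2.0 * blocks["articulatory"][0][i] + blocks["articulatory"][2][i]
     + 0.5 * blocks["laryngeal"][0][i] + random.gauss(0.0, 1.0) for i in range(n_cases)]

total, contributions = rotate_blocks(blocks, y)
for name, (r2_first, delta) in contributions.items():
    print(f"{name}: first-block R^2 = {r2_first:.3f}, R^2 change = {delta:.3f}, total = {total:.3f}")
```

By construction, each first-block R² and its R² change add up to the same total, which is the additive relationship shown in Table 7. In practice a statistics package would normally be used instead of the hand-written solver.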
Single Variables
Hierarchical analysis was used to examine the independent contribution of each variable to the variance in intelligibility scores. Among the six variables, five variables were entered simultaneously in the first block, and the sixth variable was entered in the second block. All six variables were individually rotated in the second block. R² change between the first and second blocks showed the independent contribution of the variable entered in the second block relative to the model specified by the first block.
The independent contribution of each variable in this model is reported in Table 8, where the rank of each variable’s incremental R² change is provided. The sum of the R² change with the acoustic variable in the second block and the R² with five variables in the first block yields the total R² of the model described above. The added variance accounted for by any single variable was relatively small, the maximum being 8.7% for average F2 slope added in the second block.
Table 8.
Rank | Sixth acoustic variable in the second block | R² change with the sixth variable in the second block | R² with remaining five variables in the first block |
---|---|---|---|
1 | Average F2 Slope | 0.087 | 0.771 |
2 | F0 | 0.083 | 0.775 |
3 | SNR | 0.040 | 0.818 |
4 | Vowel Duration | 0.029 | 0.829 |
5 | Vowel Space | 0.012 | 0.846 |
6 | A1-P1 | 0.008 | 0.850 |
Total R² | 0.858 | | |
Adjusted R² | 0.801 | | |
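For a single predictor entered last, the R² change is tied to that predictor's t statistic in the full model: the partial F for the change equals t². The sketch below applies this standard relationship to the Table 8 increments and compares the implied |t| with the values in Table 6. Names are illustrative, and agreement is only approximate because the R² values are rounded to three decimals.

```python
import math

R2_FULL = 0.858   # total R^2 of the six-predictor model
N, K = 22, 6      # children with CP, number of predictors

# variable: (R^2 change when entered last, Table 8; |t| from Table 6)
SINGLE_VARIABLES = {
    "Average F2 Slope": (0.087, 3.035),
    "F0": (0.083, 2.968),
    "SNR": (0.040, 2.066),
    "Vowel Duration": (0.029, 1.743),
    "Vowel Space": (0.012, 1.137),
    "A1-P1": (0.008, 0.901),
}


def partial_f(delta_r2, m=1, r2_full=R2_FULL, n=N, k=K):
    """F for an R^2 change from m predictors entered last: (dR2 / m) / ((1 - R2) / (n - k - 1))."""
    return (delta_r2 / m) / ((1.0 - r2_full) / (n - k - 1))


for name, (delta_r2, reported_t) in SINGLE_VARIABLES.items():
    implied_t = math.sqrt(partial_f(delta_r2))
    print(f"{name}: implied |t| = {implied_t:.2f}, reported |t| = {reported_t}")
```

The implied and reported |t| values differ by less than 0.02 for every predictor, so the ranking in Table 8 and the significance pattern in Table 6 are two views of the same information. Calling `partial_f` with `m` set to the number of variables in a subsystem would give the corresponding F for a whole block entered last.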
Discussion
The first research question concerned the acoustically-inferred articulatory, velopharyngeal, and laryngeal characteristics of speech in children with CP, both with (SMI) and without (NSMI) clinically-identified dysarthria, and how these characteristics compared to those observed in typically-developing children (TD). The findings showed that children with SMI had a statistically significant impairment only in the articulatory subsystem. Children in the NSMI group had articulatory, velopharyngeal, and laryngeal characteristics that did not differ from those of typically developing children. A1-P1, F0, SNR, and speech intelligibility data of typically developing children in the current study were broadly consistent with similar data from previous studies (Chen, 1996; van Doorn & Purcell, 1998; Glaze, Bless, Milenkovic, & Susser, 1988; Gordon-Brannan & Hodson, 2000; Higgins & Hodge, 2002; Lee, Potamianos, & Narayanan, 1999). Vowel spaces of typically developing children from the current study were somewhat larger than those from a well-known study of children’s formant frequencies (Lee et al., 1999; and see Flipsen & Lee, 2012), but this is almost certainly explained by the higher F2’s in the Missouri dialect spoken by children in the Lee et al. (1999) study, as compared to the Wisconsin dialect of children in the current study (Clopper & Pisoni, 2005).
The following findings in children in the SMI group are broadly consistent with previously published results from the literature on adults with CP and dysarthria: smaller vowel space (Liu, Tsao, & Kuhl, 2005, Mandarin speakers), longer vowel durations (Jeng, Weismer, & Kent, 2006, Mandarin speakers; Patel, 2003), higher mean F0 (Jeng et al., 2006, Mandarin speakers; Patel, 2003), and lower speech intelligibility scores (Platt et al., 1980). In addition, reduced F2 transition rate (slope) among children in the SMI group, compared to the typically-developing group of children, has been consistently observed as a characteristic of adults with dysarthria secondary to other etiologies (Weismer et al., 1992). The assumption is that shallower F2 slopes reflect a general articulatory slowness, a characteristic of speech motor control deficits observed especially for tongue motions during speech in adults with dysarthria (Weismer, Yunusova, & Bunton, 2012). As shown in Table 2, reduced F2 slopes reflect both lengthened transitions and reduced transition extents. The inference from shallower F2 slopes to slow articulatory motions seems to be consistent with the current finding that articulatory variables explain variation in speech intelligibility to a much larger degree than variables reflecting the velopharyngeal and laryngeal subsystems. Both speed and extent of change in vocal tract configuration appear to be affected by dysarthria, regardless of the age of the speakers.
The second research question concerned how different acoustic variables contribute to intelligibility in children with CP. When predicting speech intelligibility from multiple acoustic variables reflecting different speech subsystems, a significant multiple regression model that accounted for 80% of the variance was obtained. The bulk of this prediction, however, is from the articulatory subsystem variables, which account independently for 58% of the variance in intelligibility scores. To verify this finding, an equal number of variables per subsystem, one from each (F2 slope, F0, and A1-P1), was employed and tested in the regression model. This post hoc analysis revealed the consistent pattern of a larger contribution of the articulatory subsystem than the other two subsystems. Hence, the findings indicate that, regardless of the number of variables under each subsystem block, those variables related to the articulatory subsystem made the most substantial, independent contribution to speech intelligibility scores. This is consistent with the conclusions of de Bodt et al. (2002) for adults with dysarthria, based on a multiple regression analysis of perceptual predictor variables and a criterion variable of scaled intelligibility: de Bodt et al. found that scaled articulatory proficiency made the most significant contribution to variance in intelligibility values.
Group Contrasts
In the current study, even though all acoustic variables representing different speech subsystems showed descriptive differences between children with dysarthria and typically-developing children in the expected directions, post-hoc tests showed that only speech intelligibility, spectral (formant-related), and vowel duration variables were statistically different between children in the SMI and TD groups. Statistical results further validated the clinical diagnosis of speech motor impairment for the children with CP, by showing no significant differences for any measure between the NSMI and TD groups. Other speech subsystem variables, such as the nasality index and SNR measures, were not significantly different in any group comparisons. These findings suggest that children with speech motor impairment secondary to CP have more distinct speech production differences in the articulatory subsystem than in the other two subsystems, when compared to typically developing children. The findings could also suggest, however, that the laryngeal and velopharyngeal measures were not sufficiently sensitive or adequate to reflect the function of the respective subsystems. More broadly, perhaps acoustic measures are not the best indices of the performance of these two subsystems. In addition, the stimuli upon which the findings were based were single words, and analysis of connected speech might yield different findings for voice and resonance.
Average differences between children in the SMI and NSMI groups for acoustic and word intelligibility variables were in the same direction as seen for the SMI vs. TD group comparisons. No acoustic or speech intelligibility variables were found to be significantly different between children in the NSMI and TD groups. These findings indicate that children with CP and no diagnosed speech disorders have speech production similar to that of typically developing children at the segment level. This finding is consistent with data reported by Hustad et al. (2012) for a comparison of single-word intelligibility in children with CP and no speech motor impairment versus typically-developing children, but may not apply to longer utterance lengths. Examination of Figure 2 in Hustad et al. (2012, p. 1183) shows that at longer utterance lengths, children with CP who do not receive a diagnosis of dysarthria (i.e., children in the NSMI group) may have lower speech intelligibility as compared to typically-developing children of the same age. A future need is to coordinate acoustic measures with measures of speech intelligibility for multi-word utterances produced by children with CP. It is possible, for example, that the velopharyngeal and laryngeal subsystems may make substantial contributions to speech intelligibility of multi-word utterances.
Contribution of Acoustic Variables to Speech Intelligibility in Children with Speech Motor Impairment
The multiple regression model employed in the present study was developed to treat the speech signal as a product of the combination of multiple speech subsystems. Acoustic variables were selected for the model by attempting to minimize substantial inter-correlations among predictor variables. However, correlations among variables were observed even among the selected six variables representing behavior of the different speech subsystems. The persistent correlations among acoustic variables representing different speech subsystems may suggest that all speech subsystems tend to co-vary in children with speech motor impairment. The persistent correlations among the variables are also likely to reflect the fact that in CP highly selective areas of damage, particularly in the periventricular white matter, are not common (Hoon, 2005; Yoshida, Hayakawa, et al., 2011; Yoshida, Faria, et al., 2013). Rather, damage that affects orofacial fibers (in the corticobulbar tracts, running roughly through the genu of the internal capsule) is likely to affect much if not all of laryngeal, velopharyngeal, jaw, and labial musculature. Damage encroaching on the posterior limb of the internal capsule might also affect respiratory (trunk) muscles. In other words, highly-specific, differential subsystem involvement is probably the exception rather than the rule when speech production is affected by CP. The substantial contribution of the articulatory subsystem and the significant shared variance across all variables may indicate that the other subsystems are affected but that speech intelligibility is more resistant to decrements in voice and resonance than to disruption of articulatory behavior. For example, in de Bodt et al. (2002), many of the speakers had abnormal ratings on resonance and voice quality, but these ratings did not make the same contribution as articulatory function to speech intelligibility.
F2 slope and F0
Based on the beta coefficients of the multiple regression model for the 22 children with CP, the average F2 slope and F0 were significant contributors to speech intelligibility. The average F2 slope has a positive relationship, and F0 has a negative relationship, with word intelligibility as suggested by the signs of the standardized beta coefficients. As noted above, in the adult dysarthria literature (Weismer et al., 1992; Weismer, Jeng, Laures, Kent, & Kent, 2001), F2 slope has been reported as an important predictor of speech intelligibility. The articulatory speed impairment implied by the shallower F2 slopes in adults or children with dysarthria has been argued to be a likely, general index of severity of speech motor control involvement in speakers with neuromotor speech disorders (see Weismer et al., 2012).
Although F0 did not reveal significant differences between groups (Table 3), it did appear to make a small, independent contribution to the prediction of speech intelligibility (Tables 6 and 7). Higher F0 among adult speakers with dysarthria has been observed in some studies, but the finding is not consistent (Patel, 2003; Patel, 2004). Higgins and Hodge (2002) reported higher F0 in children with dysarthria secondary to various etiologies, consistent with the present study. To the extent that higher F0 may reflect overall severity of impairment in CP, as suggested indirectly by the work of Ohata, Tsuboyama, Haruta, Ichihashi, and Nakamura (2009), F0 may be another, albeit weak, index of overall speech motor impairment.
Limitations and Future Study
Findings of the present study should be interpreted with caution considering a) the relatively small number of participants, b) the wide age range in each group, c) the use of single words to estimate speech intelligibility, and d) the heterogeneous population of children with CP. Different measurements should be examined to represent laryngeal and velopharyngeal functions in future studies. As noted above, the absence of an explicit measure reflecting the respiratory subsystem may be considered a shortcoming of the current study, although a rationale was provided for exclusion of such measures.
Clinical Implications
The current study showed that, in children with dysarthria secondary to CP, the articulatory subsystem is most prominently involved; the current analysis may also suggest a small role of F0 as an index of speech motor control impairment and as a contributor to speech intelligibility. In the Introduction to this paper, the authors suggested that an understanding of childhood dysarthria in CP should not be based a priori on an adaptation of the much more extensive data on dysarthria in adults. In particular, the interaction of a speech neuromotor disorder with developing speech motor control capabilities may very well produce characteristics of childhood dysarthria different from those observed in adults with dysarthria. In fact, the current results on speech acoustic differences between children with CP and dysarthria and typically developing children, and on the acoustic measures that make substantial contributions to single-word speech intelligibility scores, are very similar to those reported in the adult literature. At this point in time, and at least with respect to the measures studied in the current investigation, it seems that evidence-based practice in treatment of childhood dysarthria in CP can use not only the results of the present study, but also those of the more extensive literature on adults. The evidence supports primary attention to the articulatory subsystem in the case of both children and adults, when the goal is to improve speech intelligibility.
Acknowledgments
We thank Kris Gorton, Ok-Bun Lee, Kelly McCourt Hayes, Therese Wycklendt, and Amy Kramper for assistance with data collection and data analysis. Portions of these data are from the first author’s dissertation and were presented at the 2010 Biennial Conference on Motor Speech, Savannah, GA. Research was funded by grants K23DC007114 and R01DC009411 from the National Institute on Deafness and Other Communication Disorders, National Institutes of Health (United States Department of Health and Human Services), and by a New Century Scholars Doctoral Scholarship from the American Speech-Language-Hearing Foundation. Support was also provided by grant P30HD03352 to the Waisman Center from the National Institute of Child Health and Human Development, National Institutes of Health (United States Department of Health and Human Services).
Footnotes
Note that in our previous work, children were further separated based on the presence or absence of language / cognitive impairment. In the present study, this differentiation was not made and all children with dysarthria were grouped together, regardless of whether there were other co-occurring impairments.
References
- Achilles R. Communication anomalies of individuals with cerebral palsy: I. Analysis of communication processes in 151 cases of cerebral palsy. Cerebral Palsy Review. 1955;16:15–24. [Google Scholar]
- Ansel BM, Kent RD. Acoustic-phonetic contrasts and intelligibility in the dysarthria associated with mixed cerebral palsy. Journal of Speech and Hearing Research. 1992;35:296–308. doi: 10.1044/jshr.3502.296. [DOI] [PubMed] [Google Scholar]
- Bax M, Goldstein M, Rosenbaum P, Leviton A, Paneth N, Dan B, Jacobsson B, Damiano D. Proposed definition and classification of cerebral palsy, April 2005. Developmental Medicine and Child Neurology. 2005;47(8):571–576. doi: 10.1017/s001216220500112x. [DOI] [PubMed] [Google Scholar]
- Bax M, Tydeman C, Flodmark O. Clinical and MRI correlates of cerebral palsy. Journal of the American Medical Association. 2006;296(13):1602–1608. doi: 10.1001/jama.296.13.1602. [DOI] [PubMed] [Google Scholar]
- Byrne MC. Speech and language development of athetoid and spastic children. Journal of Speech and Hearing Disorders. 1959:231–240. doi: 10.1044/jshd.2403.231. [DOI] [PubMed] [Google Scholar]
- Campisi P, Tewfik T, Pelland-Blais E, Husein M, Sadeghi N. Multidimentional voice program analysis in children with vocal cord nodules. Journal of Otolaryngology. 2000;29(5):302–308. [PubMed] [Google Scholar]
- Chen MY. Acoustic parameters of nasalized vowels in hearing -impaired and normal-hearing speakers. Journal of Acoustical Society of America. 1995;98(5):2443–2453. doi: 10.1121/1.414399. [DOI] [PubMed] [Google Scholar]
- Chen MY. Acoustic correlates of nasality in speech. Massachusetts Institute of Technology; 1996. Unpublished doctoral dissertation. [Google Scholar]
- Clarke WM, Hoops HR. Predictive measures of speech proficiency in cerebral palsied speakers. Journal of Communication Disorders. 1980;13:385–394. doi: 10.1016/0021-9924(80)90007-6. [DOI] [PubMed] [Google Scholar]
- Clement M, Twitchell TE. Dysarthria in cerebral palsy. Journal of Speech and Hearing Disorders. 1959;24(2):118–122. doi: 10.1044/jshd.2402.118. [DOI] [PubMed] [Google Scholar]
- Clopper CG, Pisoni DB. Acoustic characteristics of the vowel systems of six regional varieties of American English. Journal of Acoustical Society of America. 2005;118(3):1661–1676. doi: 10.1121/1.2000774. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cohen J, Cohen P, West SG, Aiken LS. Applied nmultiple regression/correlation analysis for the behavioral sciences. 3rd. Mahwah, NJ: Erlbaum; 2003. [Google Scholar]
- de Bodt MS, Huici MEH, Van De Heyning PH. Intelligibility as a linear combination of dimensions in dysarthric speech. Journal of Communication Disorders. 2002;35:283–292. doi: 10.1016/s0021-9924(02)00065-5. [DOI] [PubMed] [Google Scholar]
- Farmer A, Lencione R. An extraneous vocal behavior in cerebral palsied speakers. British Journal of Disorders of Communication. 1977;12(2):109–118. doi: 10.3109/13682827709011315. [DOI] [PubMed] [Google Scholar]
- Flipsen P, Lee S. Reference data for the American English acoustic vowel space. Clinical Linguistics & Phonetics. 2012;26:923–933. doi: 10.3109/02699206.2012.720634. [DOI] [PubMed] [Google Scholar]
- Fudala JB. Arizona Articulation Proficiency Scale. 3rd. Los Angeles, CA: Western Psychological Services; 2000. [Google Scholar]
- Glaze LE, Bless DM, Milenkovic P, Susser RD. Acoustic characteristics of children's voice. Journal of Voice. 1988;2(4):312–319. [Google Scholar]
- Gordon-Brannan M, Hodson BW. Intelligibility/severity measurements of prekindergarten children's speech. American Journal of Speech-Language Pathology. 2000;9:141–150. [Google Scholar]
- Hardy JC. Intraoral breath pressure in cerebral palsy. Journal of Speech and Hearing Disorders. 1961;26:309–319. doi: 10.1044/jshd.2604.309. [DOI] [PubMed] [Google Scholar]
- Hazan V, Markham D. Acoustic-phonetic correlates of talker intelligibility for adults and children. Journal of Acoustical Society of America. 2004;116(5):3108–3118. doi: 10.1121/1.1806826. [DOI] [PubMed] [Google Scholar]
- Higgins CM, Hodge MM. Vowel area and intelligibility in children with and without dysarthria. Journal of Medical Speech-Language Pathology. 2002;10(4):271–277. [Google Scholar]
- Himmelmann K, Lindh K, Hidcker MJ. Communication ability in cerebral palsy" A study from the CP register of western Sweden. European Journal of Paediatric Neurology. 2013 doi: 10.1016/j.ejpn.2013.04.005. http://dx.org/10.1016/j.ejpn.2013.04.005. [DOI] [PubMed]
- Hixon TJ, Hardy JC. Restricted motility of the speech articulators in cerebral palsy. Journal of Speech and Hearing Disorders. 1964;29:293–306. doi: 10.1044/jshd.2903.293. [DOI] [PubMed] [Google Scholar]
- Hodge M, Daniels J. TOCS+ Intelligibility Measures. Edmonton, AB: University of Alberta; 2007. [Google Scholar]
- Hoon AH. Neuroimaging in cerebral palsy: patterns of brain dysgenesis and injury. Journal of Child Neurology. 2005;20(12):936–939. doi: 10.1177/08830738050200120201. [DOI] [PubMed] [Google Scholar]
- Hustad KC, Gorton K, Lee J. Classification of speech and language profiles in 4-year-old children with cerebral palsy: A prospective preliminary study. Journal of Speech, Language, and Hearing Research. 2010;53:1496–1513. doi: 10.1044/1092-4388(2010/09-0176). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hustad KC, Schueler B, Schultz L, DuHadway C. Intelligibility of 4-year-old children with and without cerebral palsy. Journal of Speech, Language, and Hearing Research. 2012;55(4):1177–1189. doi: 10.1044/1092-4388(2011/11-0083). [DOI] [PMC free article] [PubMed] [Google Scholar]
- Irwin OC. Phonetic equipment of spastic and athetoid children. American Journal of Physical Medicine. 1955a;34(2):54–57. doi: 10.1044/jshd.2001.54. [DOI] [PubMed] [Google Scholar]
- Irwin OC. Phonetic speech development in cerebral palsied children. American Journal of Physical Medicine. 1955b;34(2):325–334. [PubMed] [Google Scholar]
- Jeng J, Weismer G, Kent RD. Production and perception of Mandarin tone in adults with cerebral palsy. Clinical Linguistics & Phonetics. 2006;20(1):67–87. doi: 10.1080/02699200400016539. [DOI] [PubMed] [Google Scholar]
- Johnson K, Flemming E, Wright R. Response to Whalen et al. Language. 2004;80:646–648. [Google Scholar]
- Kent RD. The segmental organization of speech. In: MacNeilage PF, editor. The production of speech. New York: Springer-Verlag; 1983. pp. 57–89. [Google Scholar]
- Kent RD, Kent JF, Weismer G, Martin RE, Sufit RL, Brooks BR, Rosenbek JC. Relationship between speech intelligibility and the slope of second-formant transitions in dysarthric subjects. Clinical Linguistics & Phonetics. 1989;3(4):347–358. [Google Scholar]
- Kent R, Netsell R. Articulatory abnormalities in athetoid cerebral palsy. Journal of Speech and Hearing Disorders. 1978;43(3):353–373. doi: 10.1044/jshd.4303.353. [DOI] [PubMed] [Google Scholar]
- Kent R, Read C. Acoustic Analysis of Speech. 2nd ed. Albany, NY: Singular / Thomson Learning; 2001. [Google Scholar]
- Kent RD, Weismer G, Kent JF, Vorperian HK, Duffy JR. Acoustic studies of dysarthric speech: Methods, progress, and potential. Journal of Communication Disorders. 1999;32:141–186. doi: 10.1016/s0021-9924(99)00004-0. [DOI] [PubMed] [Google Scholar]
- Kim M, Stoel-Gammon C. Segmental timing of young children and adults. International Journal of Speech-Language Pathology. 2010;12(3):221. doi: 10.3109/17549500903477363. [DOI] [PubMed] [Google Scholar]
- Kim Y, Kent RD, Weismer G. An acoustic study of the relationships among neurologic disease, dysarthria type, and severity of dysarthria. Journal of Speech, Language, and Hearing Research. 2011;54:417–429. doi: 10.1044/1092-4388(2010/10-0020). [DOI] [PubMed] [Google Scholar]
- Klatt D. Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. Journal of the Acoustical Society of America. 1976;59:1208–1221. doi: 10.1121/1.380986. [DOI] [PubMed] [Google Scholar]
- Kummer AW. Disorders of resonance and airflow secondary to cleft palate and/or velopharyngeal dysfunction. Seminars in Speech and Language. 2011;32(2):141–149. doi: 10.1055/s-0031-1277716. [DOI] [PubMed] [Google Scholar]
- Lee S, Potamianos A, Narayanan S. Acoustics of children's speech: developmental changes of temporal and spectral parameters. Journal of the Acoustical Society of America. 1999;105(3):1455–1468. doi: 10.1121/1.426686. [DOI] [PubMed] [Google Scholar]
- Lee J. Development of vowels and their relationship with speech intelligibility in children with cerebral palsy. University of Wisconsin-Madison; 2010. Unpublished dissertation. [Google Scholar]
- Lee J, Hustad KC. A preliminary investigation of longitudinal changes in speech production over 18 months in young children with cerebral palsy. Folia Phoniatrica et Logopaedica. 2013;65(1):32–39. doi: 10.1159/000334531. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liu H, Tsao F, Kuhl PK. The effect of reduced vowel working space on speech intelligibility in Mandarin-speaking young adults with cerebral palsy. Journal of the Acoustical Society of America. 2005;117(6):3879–3889. doi: 10.1121/1.1898623. [DOI] [PubMed] [Google Scholar]
- Maccallum JK, Zhang Y, Jiang JJ. Vowel selection and its effects on perturbation and nonlinear dynamic measures. Folia Phoniatrica et Logopaedica. 2011;63:88–97. doi: 10.1159/000319786. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Milenkovic P. TF32. Madison, WI: University of Wisconsin-Madison; 2002. [Google Scholar]
- Monsen RB, Engebretson AM. The accuracy of formant frequency measurements: a comparison of spectrographic analysis and linear prediction. Journal of Speech and Hearing Research. 1983;26:89–97. doi: 10.1044/jshr.2601.89. [DOI] [PubMed] [Google Scholar]
- Netsell R. Evaluation of velopharyngeal function in dysarthria. Journal of Speech and Hearing Disorders. 1969;34(2):113–122. doi: 10.1044/jshd.3402.113. [DOI] [PubMed] [Google Scholar]
- Ohata K, Tsuboyama T, Haruta T, Ichihashi N, Nakamura T. Longitudinal change in muscle and fat thickness in children and adolescents with cerebral palsy. Developmental Medicine & Child Neurology. 2009;51:943–948. doi: 10.1111/j.1469-8749.2009.03342.x. [DOI] [PubMed] [Google Scholar]
- Otapowicz D, Sobaniec W, Kulak W, Sendrowski K. Severity of dysarthric speech in children with infantile cerebral palsy in correlation with the brain CT and MRI. Advances in Medical Sciences. 2007;52:188–190. [PubMed] [Google Scholar]
- Parsa V, Jamieson DG. Acoustic discrimination of pathological voice: sustained vowels versus continuous speech. Journal of Speech, Language, and Hearing Research. 2001;44:327–339. doi: 10.1044/1092-4388(2001/027). [DOI] [PubMed] [Google Scholar]
- Patel R. Acoustic characteristics of the question-statement contrast in severe dysarthria due to cerebral palsy. Journal of Speech, Language, and Hearing Research. 2003;46:1401–1415. doi: 10.1044/1092-4388(2003/109). [DOI] [PubMed] [Google Scholar]
- Patel R. The acoustics of contrastive prosody in adults with cerebral palsy. Journal of Medical Speech-Language Pathology. 2004;12(4):189–193. [Google Scholar]
- Pirila S, van der Meere J, Pentikainen T, Ruussu-Niemi P, Korpela R, Kilpinen J, Nieminen P. Language and motor speech skills in children with cerebral palsy. Journal of Communication Disorders. 2007;40:116–128. doi: 10.1016/j.jcomdis.2006.06.002. [DOI] [PubMed] [Google Scholar]
- Platt LJ, Andrews G, Howie PM. Dysarthria of adult cerebral palsy. II. Phonemic analysis of articulation errors. Journal of Speech and Hearing Research. 1980;23:41–55. doi: 10.1044/jshr.2301.41. [DOI] [PubMed] [Google Scholar]
- Platt LJ, Andrews G, Young M, Quinn PT. Dysarthria of adult cerebral palsy. I. Intelligibility and articulatory impairment. Journal of Speech and Hearing Research. 1980;23:28–40. doi: 10.1044/jshr.2301.28. [DOI] [PubMed] [Google Scholar]
- Rong P, Loucks T, Kim H, Hasegawa-Johnson M. Relationship between kinematics, F2 slope, and speech intelligibility in dysarthria due to cerebral palsy. Clinical Linguistics & Phonetics. 2012;26(9):806–822. doi: 10.3109/02699206.2012.706686. [DOI] [PubMed] [Google Scholar]
- Sussman JE, Sapienza C. Articulatory, developmental, and gender effects on measures of fundamental frequency and jitter. Journal of Voice. 1994;8(2):145–156. doi: 10.1016/s0892-1997(05)80306-6. [DOI] [PubMed] [Google Scholar]
- Tjaden K, Weismer G. Speaking-rate-induced variability in F2 trajectories. Journal of Speech, Language, and Hearing Research. 1998;41:976–989. doi: 10.1044/jslhr.4105.976. [DOI] [PubMed] [Google Scholar]
- Turner G, Tjaden K, Weismer G. The influence of speaking rate on vowel space and speech intelligibility for individuals with amyotrophic lateral sclerosis. Journal of Speech and Hearing Research. 1995;38:1001–1013. doi: 10.1044/jshr.3805.1001. [DOI] [PubMed] [Google Scholar]
- van Doorn J, Purcell A. Nasalance levels in the speech of normal Australian children. Cleft Palate-Craniofacial Journal. 1998;35(4):287–292. doi: 10.1597/1545-1569_1998_035_0287_nlitso_2.3.co_2. [DOI] [PubMed] [Google Scholar]
- Weismer G. Speech intelligibility. In: Ball MJ, Perkins MR, Müller N, Howard S, editors. The Handbook of Clinical Linguistics. Oxford, UK: Blackwell; 2008. pp. 568–582. [Google Scholar]
- Weismer G, Berry J. Effects of speaking rate on second formant trajectories of selected vocalic nuclei. Journal of the Acoustical Society of America. 2003;113(6):3362–3378. doi: 10.1121/1.1572142. [DOI] [PubMed] [Google Scholar]
- Weismer G, Jeng J, Laures JS, Kent RD, Kent JF. Acoustic and intelligibility characteristics of sentence production in neurogenic speech disorders. Folia Phoniatrica et Logopaedica. 2001;53:1–18. doi: 10.1159/000052649. [DOI] [PubMed] [Google Scholar]
- Weismer G, Martin R, Kent RD, Kent JF. Formant trajectory characteristics of males with amyotrophic lateral sclerosis. Journal of the Acoustical Society of America. 1992;91(2):1085–1098. doi: 10.1121/1.402635. [DOI] [PubMed] [Google Scholar]
- Weismer G, Yunusova Y, Bunton K. Measures to evaluate the effects of DBS on speech production. Journal of Neurolinguistics. 2012;25:74–94. doi: 10.1016/j.jneuroling.2011.08.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Whalen DH, Levitt AG. The universality of intrinsic F0 of vowels. Journal of Phonetics. 1995;23:349–366. [Google Scholar]
- Whitehill TL, Ciocca V. Perceptual-phonetic predictors of single-word intelligibility: a study of Cantonese dysarthria. Journal of Speech, Language, and Hearing Research. 2000;43:1451–1465. doi: 10.1044/jslhr.4306.1451. [DOI] [PubMed] [Google Scholar]
- Wit J, Maassen B, Gabreels FJM, Thoonen G. Maximum performance tests in children with developmental spastic dysarthria. Journal of Speech and Hearing Research. 1993;36:452–459. doi: 10.1044/jshr.3603.452. [DOI] [PubMed] [Google Scholar]
- Wolfe WG. A comprehensive evaluation of fifty cases of cerebral palsy. Journal of Speech and Hearing Disorders. 1950;15(3):234–251. doi: 10.1044/jshd.1503.234. [DOI] [PubMed] [Google Scholar]
- Workinger MS. Acoustic analysis of the dysarthrias in children with athetoid and spastic cerebral palsy. University of Wisconsin-Madison; 1986. Unpublished doctoral dissertation. [Google Scholar]
- Yoshida S, Faria AV, Oishi K, Kanda T, Yamori Y, Yoshida N, Hirota H, Iwami M, Okano S, Hsu J, Jiang H, Li Y, Hayakawa K, Mori S. Anatomical characterization of athetotic and spastic cerebral palsy using an atlas-based analysis. Journal of Magnetic Resonance Imaging. 2013;38:288–298. doi: 10.1002/jmri.23931. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yoshida S, Hayakawa K, Oishi K, Mori S, Kanda T, Yamori Y, Yoshida N, Hirota H, Iwami M, Okano S, Matsushita H. Athetotic and spastic cerebral palsy: anatomic characterization based on diffusion-tensor imaging. Radiology. 2011;260(2):511–520. doi: 10.1148/radiol.11101783. [DOI] [PubMed] [Google Scholar]
- Zimmerman I, Steiner V, Pond R. Preschool Language Scale-4 Screening Test. San Antonio, TX: Psychological Corporation; 2005. [Google Scholar]