Abstract
Purpose
The objectives of this study were to identify acoustic characteristics of connected speech that differentiate children with dysarthria secondary to cerebral palsy (CP) from typically developing children and to identify acoustic measures that best detect dysarthria in children with CP.
Method
Twenty 5-year-old children with dysarthria secondary to CP were compared to 20 age- and sex-matched typically developing children on 5 acoustic measures of connected speech. A logistic regression approach was used to derive an acoustic model that best predicted dysarthria status.
Results
Results indicated that children with dysarthria secondary to CP differed from typically developing children on measures of multiple segmental and suprasegmental speech characteristics. An acoustic model containing articulation rate and the F2 range of diphthongs differentiated children with dysarthria from typically developing children with 87.5% accuracy.
Conclusion
This study serves as a first step toward developing an acoustic model that can be used to improve early identification of dysarthria in children with CP.
Dysarthria is a barrier to effective communication for many children with neurodevelopmental disorders, including more than 50% of children with cerebral palsy (CP; Nordberg, Miniscalco, Lohmander, & Himmelmann, 2013). Congenital dysarthria generally results in a lifelong disability associated with reduced speech intelligibility and negatively impacts participation in functional contexts (Pennington & McConachie, 2001). Although dysarthria can have pervasive effects on all aspects of speech motor control, including articulation, respiratory control, phonation, resonance, and prosody (De Bodt, Hernandez-Diaz, & Van De Heyning, 2002; Duffy, 2013; Workinger & Kent, 1991), the manner in which dysarthria manifests in children who are developing speech has received limited study. The current lack of information regarding speech characteristics and early indicators of dysarthria in young children who are still acquiring speech sounds limits early diagnosis and intervention for pediatric dysarthria.
Diagnosis of Dysarthria in Children
Dysarthria can be difficult to diagnose in young children for a variety of reasons. In adults, acquired dysarthria manifests as a change from previously normal speech function and is clinically diagnosed based on evidence of weakness, abnormal tone, or impaired coordination in speech musculature and the presence of deviant speech features in one or more speech domains (i.e., articulation, phonation, respiration, resonance, and prosody; Darley, Aronson, & Brown, 1969; Duffy, 2013). Because speech development is complete in adults, deviant speech characteristics cannot be attributed to ongoing development; however, for children, the coinciding influences of development and speech motor impairment (SMI) on speech characteristics can make dysarthria difficult to identify. First, children are not expected to have mastered all speech sounds until the age of 8, and there is a wide range of acceptable normal variation in speech sound production at young ages (Sander, 1972; Smit, Hand, Freilinger, Bernthal, & Bird, 1990). In addition, there is an overlap in the speech characteristics typically associated with dysarthria and developmental immaturity. Specifically, both are associated with difficulty producing complex, later-developing speech sounds, slower rates of speech, and greater variability in speech movements (Grigos, 2009; Hawkins, 1984; Kim, Martin, Hasegawa-Johnson, & Perlman, 2010; Lee, Potamianos, & Narayanan, 1999). This is particularly an issue in very young children, as some of the key perceptual characteristics currently used by speech-language pathologists to identify dysarthria, such as imprecise articulation, sound distortions, slow rate, and reduced intelligibility, are characteristics of typically developing (TD) speech at early ages, and little normative data exist on such measures of speech motor performance at young ages (McCauley & Strand, 2008). 
Currently, these difficulties disentangling the effects of SMI from development in young children are a critical barrier to early diagnosis of pediatric dysarthria.
For children with CP, early diagnosis of dysarthria is a particularly important clinical issue. Not all children with CP have dysarthria, though the diagnosis itself is a risk factor (Hustad, Gorton, & Lee, 2010; Mei, Reilly, Reddihough, Mensah, & Morgan, 2014; Sigurdardottir & Vik, 2011). Thus, early identification of those who do have dysarthria is crucial for enabling access to early intervention services that may lead to better speech outcomes. One recent study demonstrated that intelligibility was a sensitive and specific marker of dysarthria in 5-year-old children with CP (Hustad, Oakes, & Allison, 2015); however, this metric may be less diagnostically efficacious at younger ages, when a larger range of intelligibility is expected in TD children (Flipsen, 2006).
An additional challenge to early diagnosis of dysarthria in children with CP is the heterogeneity in speech characteristics across individual children. Many factors contribute to this heterogeneity, including severity of involvement, variations in underlying neuropathology that can differentially impact speech subsystems, and individual differences in speaking style that are also present in TD children (Vick et al., 2012). One study by Allison and Hustad (2014) demonstrated individual differences in the sentence characteristics that significantly predicted intelligibility among children with CP and dysarthria and suggested that this variability may be related to differences in underlying patterns of SMI. In order to translate research findings to improved clinical diagnosis at an individual level, it is essential to characterize the range of variability in speech profiles across children with CP and to identify the speech features that best differentiate children who have dysarthria from those who do not.
Acoustic Characteristics of Pediatric Dysarthria
Acoustic measures have the potential to help characterize speech features associated with dysarthria in young children and to assist in early diagnosis, as they (a) are sensitive to differences between dysarthric and typical speech in children (Higgins & Hodge, 2002; Hustad et al., 2010; Lee, Hustad, & Weismer, 2014), (b) allow for objective quantification of speech features (e.g., Kent & Kim, 2003), and (c) can be used to detect speech differences associated with dysarthria at a more fine-grained level than perceptual measures (Green et al., 2013; Rong, Yunusova, Wang, & Green, 2015). Because dysarthria in children with CP can involve perceptual deviations in multiple speech dimensions, including articulatory precision, voice quality, nasality, and rate (Workinger & Kent, 1991), it is important to consider acoustic measures that provide objective quantification of deviations across these different speech dimensions when selecting candidate measures for improving early diagnosis of dysarthria.
Prior acoustic studies of children with dysarthria have primarily focused on vowel articulation and speaking rate in single-word productions. Specifically, children with dysarthria secondary to CP have smaller vowel space areas, slower speaking rates, longer segment durations, and shallower F2 slopes than TD children in single-word contexts (Higgins & Hodge, 2002; Hustad et al., 2010; Lee et al., 2014). Acoustic measures of vowel articulation and rate also tend to be correlated with intelligibility (Higgins & Hodge, 2002; Hustad et al., 2010; Lee et al., 2014); however, there are many other key speech dimensions that theoretically need to be considered when the goal is to identify markers of dysarthria. To our knowledge, only one previous study has examined acoustic measures of multiple speech dimensions in children with CP. Lee and colleagues (2014) examined acoustic measures of multiple speech subsystems from single-word productions. Results showed that, together, acoustic measures of phonation, resonance, and articulation explained a high degree of the variance in intelligibility in children with CP; however, only acoustic measures of articulation significantly differed between groups (Lee et al., 2014).
Extending studies to connected speech is also important, because even children with CP who have reduced intelligibility often speak in multiword sentences, and speech deviations may emerge in connected speech that are not apparent in single-word productions. Although Lee and colleagues (2014) did not find differences between children with and without dysarthria on acoustic measures of phonation and resonance in single words, studies examining connected speech have shown that children with CP are rated as having impairments in loudness and breath support and that these dimensions improve with intensive intervention (Fox & Boliek, 2012; Levy, Ramig, & Camarata, 2012). Group differences in acoustic measures of these dimensions may emerge in longer utterances that place greater demands on speech motor control. For example, increased sentence length imposes greater demands on respiratory control, which may affect phonation in longer utterances in a way that is not apparent in single-word productions (e.g., deviant voice quality may result from reduced subglottal pressure near the end of longer breath groups; Yorkston, Beukelman, Strand, & Hakel, 2010). Prior studies of 4- to 5-year-old children with dysarthria secondary to CP have demonstrated effects of utterance length on intelligibility. Specifically, although additional linguistic context tends to enhance intelligibility of sentences compared to single words, longer sentences begin to show declining intelligibility relative to shorter sentences, which is likely due to increased motor control demands (Allison & Hustad, 2014; Hustad, Schueler, Schultz, & DuHadway, 2012). Together, these findings suggest that speech characteristics of children with dysarthria may differ in connected speech compared to single words and motivate the need for extending acoustic studies of speech features to connected speech.
Collectively, prior research suggests that acoustic measures are sensitive to differences between dysarthric and typical speech in children and have the potential to aid in early diagnosis; however, the diagnostic efficacy of acoustic features for detecting dysarthria in children with CP has not previously been investigated. The first step toward developing improved diagnostic methods is to identify acoustic measures that best differentiate TD children from children with CP who have a clear diagnosis of dysarthria. This method could then be extended to children at younger ages and with speech patterns that are harder to categorize.
In order to obtain comprehensive information about children's speech profiles, it is important to select acoustic measures that reflect a range of speech dimensions. For speakers with dysarthria, deficits in physiological control of speech subsystems result in alterations to the acoustic signal (Kent & Kim, 2003). These acoustic alterations then have perceptual consequences, resulting in commonly recognized features of dysarthria (Duffy, 2013); however, there is not a one-to-one mapping across levels. Speech acoustic theories explaining relationships between physiological subsystem control, acoustic measures, and phonetic/perceptual speech characteristics have been well-developed over decades of research (Fant, 1960; Stevens, 1998) and provide a theoretical basis for studying how speech is affected by dysarthria (e.g., Kent, Weismer, Kent, Vorperian, & Duffy, 1999). Physiological impairments at multiple subsystem levels can theoretically contribute to acoustic and perceptual features of dysarthria. For example, weak articulatory closure and reduced respiratory support can both contribute to inadequate intraoral pressure for producing stop consonants. Perceptually, this may contribute to a perception of imprecise articulation, and acoustically, it could be measured as a reduction in realization of bursts. A summary of the theoretical mapping between physiological, acoustic, and perceptual levels of speech analysis is presented in Figure 1. In this study, our approach was to select acoustic measures that correspond to common perceptual features of dysarthria in children with CP at both segmental and suprasegmental levels, could be measured from connected speech, and sample the subsystem space. This approach allowed us to examine group differences in quantitative measures of multiple speech dimensions as well as to obtain an overall impression of the acoustic speech profile of each child. We aimed to investigate the following questions:
Which acoustic measures of connected speech differentiate 5-year-old children with CP and dysarthria from TD children at a group level, and how do acoustic profiles of individual children with dysarthria fit the group trend?
Which acoustic measures of connected speech best detect dysarthria in 5-year-old children with CP?
We hypothesized that, as a group, children with CP and dysarthria would differ from TD children on all acoustic measures of connected speech but would show wide variability in individual profiles of speech features. In addition, we hypothesized that dysarthria diagnosis would be best predicted by a combination of measures reflecting multiple speech dimensions.
Method
Participants
Children With CP
Twenty 5-year-old children with CP were included in this study (13 girls, seven boys). All children with CP were participants in a larger ongoing longitudinal study. Five-year-old children were selected because children at this age are generally able to produce multiword sentences; however, speech development is not yet complete. Thus, studying children at this age allows for characterization of acoustic speech features associated with dysarthria within a controlled developmental window.
Inclusion criteria for the present research required children with CP to (a) have a diagnosis of dysarthria, as determined by speech-language pathologists' assessment; (b) be able to repeat sentences of at least five words in length; and (c) pass a hearing screening. Dysarthria diagnosis was determined for each child by two experienced research speech-language pathologists. The speech-language pathologists independently reviewed audio and video recordings of an oral mechanism exam, word and sentence repetition task, and spontaneous speech samples from data collection sessions as a basis for making judgments regarding dysarthria diagnosis. Dysarthria diagnosis was made in accordance with standard clinical procedures, specifically based on the presence of any obvious audible signs of dysarthria in one or more speech subsystems (e.g., articulatory imprecision, breathy or harsh voice quality, hypernasality, slow rate, short breath groups), as well as visual evidence of abnormal orofacial and/or respiratory movements during speech associated with abnormal tone or weakness. There was 100% agreement between the speech-language pathologists regarding the presence of dysarthria. Prior research from this longitudinal study has been used to develop a communication classification scheme for children with CP based on the presence of SMI with or without co-occurring language impairment (Hustad et al., 2010). In accordance with this classification scheme, children with CP and dysarthria will be subsequently referred to as the SMI group. For this study, children with SMI with and without comorbid language impairment were pooled together. Data from one of these children have been included in previous publications (Lee & Hustad, 2013; Lee et al., 2014). The mean age of children in the SMI group was 64.2 months (SD = 3.5).
Gross motor, language, and dysarthria severity, as indexed by speech intelligibility scores (Kent et al., 1989; Kim, Kent, & Weismer, 2011), were not explicitly controlled in this sample of children, as we were interested in characterizing speech features of a representative sample of children with dysarthria and CP within a narrow developmental age window. Gross motor characteristics varied widely across individuals, with Gross Motor Function Classification System (GMFCS; Palisano et al., 1997) levels ranging from I (indicating no or minimal mobility impairment) to V (indicating complete dependence on a wheelchair for mobility as well as impaired trunk and head control). The GMFCS is a standard scale for classifying gross motor impairment in individuals with CP. Although it does not reflect speech motor skills, moderate correlations between GMFCS scores and communication skills of children with CP have been reported (Hidecker et al., 2012) due to relationships between skills at the extreme ends of the severity continuum. Six of the children in the SMI group (30%) had co-occurring language impairment, with receptive language standard scores on the Test for Auditory Comprehension of Language–Fourth Edition (Carrow-Woolfolk, 2014) ranging from 68 to 85 (TD: M = 100, SD = 15). Intelligibility of children in the SMI group, based on transcription of utterances from the Test of Children's Speech Plus (TOCS+; Hodge & Daniels, 2007) as described below, spanned a wide range from 9% to 86%, indicating that the sample included children with dysarthria severity levels ranging from mild to severe. The majority of children in the SMI group had overall intelligibility between 26% and 75%, indicating that the sample was largely composed of children with moderate to severe dysarthria. Individual characteristics of children in the SMI group are listed in Table 1.
Table 1.
Child ID | Sex | Age (in months) | GMFCS a | Anatomic involvement | TACL-4 SS b | Overall intelligibility |
---|---|---|---|---|---|---|
CP01 | F | 67 | I | left hemiplegia | 83 | 85.55 |
CP02 | F | 62 | IV | quadriplegia | 106 | 35.70 |
CP03 | F | 66 | II | diplegia | 76 | 35.49 |
CP04 | F | 62 | I | left hemiplegia | 85 | 57.56 |
CP05 | F | 63 | II | right hemiplegia | 119 | 9.49 |
CP06 | F | 63 | III | quadriplegia | 68 | 31.91 |
CP07 | F | 61 | I | right hemiplegia | 102 | 58.87 |
CP08 | F | 60 | III | right hemiplegia | 128 | 55.58 |
CP09 | F | 71 | V | quadriplegia | 76 | 25.61 |
CP10 | F | 62 | IV | diplegia | 74 | 60.74 |
CP11 | F | 60 | IV | diplegia | 124 | 63.97 |
CP12 | F | 62 | II | right hemiplegia | 87 | 40.24 |
CP13 | F | 61 | I | unknown | 115 | 86.29 |
CP14 | M | 63 | IV | quadriplegia | 87 | 71.87 |
CP15 | M | 68 | IV | quadriplegia | 98 | 31.66 |
CP16 | M | 62 | I | diplegia | 100 | 57.16 |
CP17 | M | 69 | IV | quadriplegia | 94 | 59.88 |
CP18 | M | 71 | III | right hemiplegia | 72 | 71.56 |
CP19 | M | 65 | I | right hemiplegia | 104 | 36.48 |
CP20 | M | 66 | I | right hemiplegia | 87 | 38.62 |
TD01 | F | 66 | | | | 90.61 |
TD02 | F | 62 | | | | 84.67 |
TD03 | F | 65 | | | | 94.15 |
TD04 | F | 67 | | | | 90.55 |
TD05 | F | 70 | | | | 94.25 |
TD06 | F | 66 | | | | 92.54 |
TD07 | F | 64 | | | | 91.31 |
TD08 | F | 61 | | | | 94.19 |
TD09 | F | 64 | | | | 93.64 |
TD10 | F | 64 | | | | 91.77 |
TD11 | F | 60 | | | | 95.72 |
TD12 | F | 60 | | | | 89.22 |
TD13 | F | 63 | | | | 88.33 |
TD14 | M | 68 | | | | 94.24 |
TD15 | M | 62 | | | | 96.09 |
TD16 | M | 60 | | | | 85.00 |
TD17 | M | 63 | | | | 91.59 |
TD18 | M | 69 | | | | 89.52 |
TD19 | M | 62 | | | | 91.23 |
TD20 | M | 60 | | | | 91.46 |
Note. GMFCS = Gross Motor Function Classification System; TACL-4 = Test for Auditory Comprehension of Language–Fourth Edition; SS = standard score; CP = cerebral palsy; TD = typically developing; F = female; M = male.
a GMFCS rating: I = no/mild impairment, V = severe impairment.
b TACL-4 standard score: M = 100, SD = 15.
TD Children
Twenty 5-year-old TD children were included as a control group. Children in this group were matched to the participants in the SMI group for age and sex. Inclusion criteria were as follows: (a) have no reported history of speech, language, or learning problems; (b) pass the Preschool Language Scale–Fourth Edition Screening Test (Zimmerman, Steiner, & Pond, 2005); (c) achieve a standard score within the average range on the Arizona Articulation Proficiency Scale–Third Revision (Fudala, 2000); and (d) pass a hearing screening. Any child who could not repeat sentences of at least five words in length on the speech task was excluded from this study. The mean age of children in the TD group was 63.8 months (SD = 3.09). Overall intelligibility of TD children on the TOCS+ ranged from 85% to 96% (M = 91.5%, SD = 3.1%). Although not the focus of this study, it is noteworthy that the children in the TD group had intelligibility levels below 100%. This is consistent with previous findings of children in this age group (Allison & Hustad, 2014) and is a further indication that children in this control group had incompletely developed speech systems. Individual characteristics of children in the sample are listed in Table 1.
Acquisition of Speech Samples
All children in the study participated in a standard research protocol for obtaining speech samples that consisted of repeating an identical set of 42 single words and 60 sentences that were two to seven words in length (10 sentences of each length) taken from the TOCS+ (Hodge & Daniels, 2007) for research purposes. All data collection sessions were conducted in a sound-attenuated room by a speech-language pathologist. Audio-recorded adult models of TOCS+ stimuli were presented to children, accompanied by related pictures to help maintain their engagement in the activity. Children were asked to repeat each stimulus item following its presentation. Productions were audio-recorded with a condenser studio microphone (Audio-Technica AT4040) positioned near the child's mouth, using a floor stand. Speech samples from children were recorded using a digital audio recorder (Marantz PMD 570) at a 44.1-kHz sampling rate (16-bit quantization). The level of the signal was monitored and adjusted on a mixer (Mackie 1202 VLZ) to obtain optimized recording levels and to avoid peak clipping.
Intelligibility
In order to obtain intelligibility data for children in the sample, 200 adult listeners (5 listeners per child × 40 children) provided orthographic transcriptions of children's recorded word and sentence productions from the TOCS+. Audio recordings of the children's speech samples were segmented into separate sound files for each utterance and presented to listeners in a sound-attenuated booth. Each listener heard all 102 utterances (42 single words + 60 sentences) produced by one child and was asked to transcribe orthographically what they thought the child said according to a previously published protocol (Hustad et al., 2010). The order of presentation for stimulus items was randomized for each listener, and listeners were only able to listen to each utterance one time. Custom software was used to compare listeners' transcriptions to the actual utterances produced by the child. For each utterance, intelligibility was calculated as the percentage of stimulus words correctly identified by the listener. These utterance-level intelligibility scores were averaged across utterances of the same length to obtain an intelligibility score for single words and sentences of each length for each listener (e.g., intelligibility scores of all two-word sentences from the TOCS+ were averaged to obtain a mean two-word intelligibility score). For each child, overall intelligibility was determined by averaging the single-word and sentence intelligibility scores for each listener and then calculating the mean overall intelligibility scores across the five listeners.
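The scoring steps described above can be illustrated with a minimal Python sketch. It is not the custom software used in the study: the function names are hypothetical, the word-matching rule (case-insensitive, order-free, each transcribed word matching at most one stimulus word) is an assumption, and each utterance length is assumed to carry equal weight in a listener's overall score.

```python
from collections import Counter, defaultdict
from statistics import mean


def utterance_intelligibility(target, transcription):
    """Percentage of stimulus words that the listener identified.

    Assumption: matching ignores case and word order, and each
    transcribed word can be credited to only one stimulus word.
    """
    target_words = target.lower().split()
    available = Counter(transcription.lower().split())
    correct = 0
    for word in target_words:
        if available[word] > 0:
            available[word] -= 1
            correct += 1
    return 100.0 * correct / len(target_words)


def listener_overall(scored_utterances):
    """Average scores within each utterance length, then across lengths.

    scored_utterances: iterable of (length_in_words, percent_correct).
    Assumption: every length category is weighted equally.
    """
    by_length = defaultdict(list)
    for length, score in scored_utterances:
        by_length[length].append(score)
    return mean(mean(scores) for scores in by_length.values())


def child_overall(per_listener_utterances):
    """Mean of the listener-level overall scores for one child."""
    return mean(listener_overall(u) for u in per_listener_utterances)
```

For instance, a listener who transcribes "the dog went home" for the stimulus "the dog ran home" would receive 75% for that utterance under this matching rule.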
Acoustic Analysis
Children's productions of 10 five-word sentences from the TOCS+ were used for acoustic analysis. Children produced each of the five-word sentences one time. For 5-year-old children, mean length of utterance is expected to be between four and five words (Rice et al., 2010). Thus, five-word sentences were chosen for analysis, as they were representative of habitual speech for children of this age, while maintaining a controlled speech context.
Five acoustic measures were obtained from this sentence set for all children. Measures were selected to quantify segmental and suprasegmental speech characteristics known to be associated with dysarthria in children with CP and to capture aspects of SMI that sample multiple speech subsystem domains. Included acoustic measures and their theoretical relationships to perceptual features and physiological speech subsystem impairments are summarized in Figure 1. Measures chosen for this study included acoustic correlates of articulatory imprecision, voice quality disturbance, and slow speaking rate that have been previously shown to be sensitive to dysarthria (Kent & Kim, 2003; Weismer, Jeng, Laures, Kent, & Kent, 2001) or were theorized to reveal an important aspect of deviant speech motor control in children with CP (Ansel & Kent, 1992; Nordberg, Miniscalco, & Lohmander, 2014; Workinger & Kent, 1991). Although including measures of all speech subsystems would be ideal, the current study involved analysis of existing speech recordings, and therefore, measures were chosen that could be readily obtained through acoustic analysis of connected speech samples. Although hypernasality and reduced loudness are also common perceptual features of dysarthria in children with CP (Workinger & Kent, 1991), acoustic measures of these speech dimensions were not obtained in this study, because there are no established acoustic metrics of hypernasality from recorded connected speech samples and we did not have a constant mouth-to-microphone distance needed for accurate estimates of loudness.
Acoustic measurements were made by the first author and a research assistant, both of whom were blinded to the subjects' diagnosis. A complete list of stimuli is included in the Appendix. The following measures were obtained:
F2 range of diphthongs was selected as an index of vowel articulation. Measures of F2 excursion have been shown to differentiate speakers with dysarthria from healthy controls in adult populations (Kent et al., 1992; Rosen, Goozée, & Murdoch, 2008; Rosen, Kent, Delaney, & Duffy, 2006; Yunusova, Weismer, Kent, & Rusche, 2005) and in children (Lee et al., 2014). F2 range was calculated as the difference between the maximum and minimum F2 frequencies within each vocalic segment containing a diphthong. Across the set of five-word sentences, six diphthongs were present. For each child, F2 trajectories for the vocalic segments containing diphthongs were generated using linear predictive coding analysis in TF32 and then visually examined and hand-corrected as needed. Approximately 80% of linear predictive coding tracks required hand correction to eliminate formant tracking errors. Beginning and end points of vocalic segments were defined as the first and last glottal pulses in which both F1 and F2 were present. Descriptive statistics from F2 histories were used to calculate the F2 range of each vocalic segment with a diphthong. An example F2 trajectory is shown in Figure 2 to illustrate this measurement procedure. F2 ranges were then averaged across the six target diphthongs to obtain an average F2 range measurement for each child. Of the 240 possible diphthongs (40 children × 6 tokens), 99.6% were measured; one diphthong from a child in the SMI group could not be measured because the child truncated the word and the formant could not be traced.
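As an illustration of the F2 range calculation, the following Python sketch takes hand-corrected F2 tracks (one list of frequencies in Hz per vocalic segment) and returns the per-child average. The function names and the input layout are hypothetical, and the sketch assumes that an unmeasurable token (represented here as `None`) is simply left out of the average; it does not reproduce the TF32 workflow itself.

```python
from statistics import mean


def f2_range(f2_track_hz):
    """Difference between the maximum and minimum F2 (Hz) in one segment."""
    if not f2_track_hz:
        raise ValueError("F2 track is empty")
    return max(f2_track_hz) - min(f2_track_hz)


def mean_f2_range(tracks):
    """Average F2 range across a child's measurable diphthong tokens.

    Assumption: tokens that could not be traced are passed as None
    (or an empty list) and are excluded from the average.
    """
    ranges = [f2_range(track) for track in tracks if track]
    if not ranges:
        raise ValueError("no measurable diphthongs")
    return mean(ranges)
```

A track of 1200, 1500, 2100, and 1900 Hz, for example, yields an F2 range of 900 Hz.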
Proportion of observable bursts was selected as an indicator of consonant precision. In children with dysarthria, imprecise articulation may result in incomplete closure during production of plosive consonants, thus leading to a reduction in burst realization. To our knowledge, no studies have previously examined burst production in children with dysarthria, but studies examining burst production in adults suggest that realization of bursts may be reduced in speakers with dysarthria (Ackermann & Ziegler, 1991; Ansel & Kent, 1992; Liu, Tseng, & Tsao, 2000; Ozsancak, Auzou, Jan, & Hannequin, 2001). This measure was included, as we were interested in a measure of consonant production to complement the measure of vowel production (F2 range).
The presence of bursts was determined through visual judgments of the spectrogram and waveform. For each child, screenshots of the spectrogram and waveform were captured for all 10 five-word sentences and embedded in a PowerPoint presentation with a transcription of the sentence at the top of each slide. The locations of target plosive consonants were marked using red lines along the bottom of the spectrogram. Researchers viewed PowerPoint slides on a computer monitor and made binary judgments regarding the presence of each burst based on whether or not a distinct line of energy extending across at least 50% of the frequency range was present within the target window. Burst judgments were made without accompanying audio recordings in order to eliminate any potential influence of auditory information on researchers' decisions. An example image used by researchers to make burst judgments is shown in Figure 3. Across the set of five-word sentences, 18 initial and medial plosive consonants in the target sentences were analyzed for each child (720 total consonants analyzed). For each target consonant, the presence or absence of a visible burst on the spectrogram was recorded. The proportion of plosive consonants with an observable burst was calculated for each child by dividing the number of bursts produced by the total number of target consonants. Of the 720 target plosives, 99.3% of tokens were measured; two tokens from children in the TD group and three tokens from children in the SMI group were omitted because the children did not produce the target consonant.
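The proportion calculation can be sketched in a few lines of Python. The function name and the encoding of judgments are hypothetical: `True` and `False` stand for a visible or absent burst, and `None` marks a target consonant the child did not produce. The sketch assumes that such omitted tokens are dropped from the denominator.

```python
def burst_proportion(judgments):
    """Proportion of scored plosives with an observable burst.

    judgments: list of True (burst visible), False (no burst), or
    None (target consonant not produced; assumed to be excluded).
    """
    scored = [j for j in judgments if j is not None]
    if not scored:
        raise ValueError("no scorable plosives")
    return sum(scored) / len(scored)
```

Under this assumption, a child with 12 visible bursts among 17 scorable targets would receive a proportion of 12/17, or about 0.71.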
Duration of closure interval voicing was examined as a measure of precision for making voicing contrasts. When vowels are followed by a voiceless obstruent, glottal pulses briefly continue into the closure interval of the consonant as a normal result of coarticulation (Chasaide & Gobl, 1993; Löfqvist, 1992). Control of the offset of voicing is one of several cues to the perception of voicing status of poststressed stops (Lisker, 1986). Deficits in precise timing of voicing offset may result from dysarthria and contribute to the reduction of these distinctions in speakers with CP (Ansel & Kent, 1992). Thus, the duration of persistent voicing in closure intervals may provide information about how children with dysarthria regulate their phonation to make voicing distinctions differently than TD children. Duration of persistent voicing in closure intervals was measured for postvocalic voiceless stop consonants.
Across the set of five-word sentences, eight postvocalic voiceless stops were measured for each child. For each target consonant, TF32 (Milenkovic, 2002) was used to measure the duration of persistent voicing and the duration of closure intervals. Duration of persistent voicing was measured as the time between the point of closure and the end of voicing (i.e., the point at which glottal pulses terminated). Closure interval durations were measured as the time between the point of closure and the onset of the following burst. An example of this measurement procedure is shown in Figure 4. As the duration of persistent voicing was expected to vary according to the rate of the speaker's production, the duration of persistent voicing was divided by the closure interval duration to yield a proportion of persistent closure interval voicing for each target consonant. Proportions were then averaged across the set of target consonants to obtain an average proportion of closure interval voicing for each child. Of the 320 target final consonants (40 children × 8 target consonants), 89% were measured; six tokens from children in the TD group could not be measured and 29 tokens from children in the SMI group could not be measured due to final consonant deletion, aphonia, or cluster reduction (e.g., if the child said “shoos” instead of “shoots”).
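The rate normalization described above amounts to a simple ratio, sketched here in Python. The function names and the time-stamp inputs (in seconds) are hypothetical, and tokens that could not be measured are assumed to be passed as `None` and left out of the child-level average.

```python
from statistics import mean


def voicing_proportion(closure_onset, voicing_end, burst_onset):
    """Persistent voicing duration divided by closure interval duration.

    All arguments are time stamps in seconds for one target stop.
    """
    closure_duration = burst_onset - closure_onset
    if closure_duration <= 0:
        raise ValueError("burst onset must follow closure onset")
    voicing_duration = voicing_end - closure_onset
    return voicing_duration / closure_duration


def mean_voicing_proportion(tokens):
    """Average proportion across a child's measurable target stops.

    tokens: list of (closure_onset, voicing_end, burst_onset) tuples;
    None marks an unmeasurable token (assumed to be excluded).
    """
    proportions = [voicing_proportion(*t) for t in tokens if t is not None]
    if not proportions:
        raise ValueError("no measurable tokens")
    return mean(proportions)
```

For a stop with closure at 0.100 s, voicing offset at 0.130 s, and burst onset at 0.200 s, voicing persists for roughly 30% of the closure interval.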
Proportion of utterance durations with deviant voice quality was examined as a quantitative measure of vocal quality. Clinical impairments in vocal quality, including breathiness, harshness, and irregular articulatory breakdown, are known perceptual features associated with dysarthria in children (van Mourik, Catsman-Berrevoets, Paquier, Yousef-Bak, & van Dongen, 1997; van Mourik, Catsman-Berrevoets, Yousef-Bak, Paquier, & van Dongen, 1998; Workinger & Kent, 1991), and acoustic studies have shown greater vocal instability in children with acquired dysarthria than typical children (Cornwell, Murdoch, Ward, & Morgan, 2003). Therefore, we expected that children with dysarthria would exhibit a greater proportion of deviant voice quality in their speech than TD children.
Periods of deviant voice quality were identified and measured using a hybrid perceptual–acoustic two-step process:
Three raters (the first author and two research assistants) independently listened to the 10 five-word sentences produced by each child and made binary perceptual judgments as to whether or not any audible deviant voice quality was present in each sentence. Raters listened for occurrences of the following types of deviant voice quality: glottal fry, breathiness, aphonia, diplophonia, wet/gurgly voice, rough or hoarse voice quality, and phonation breaks. These voice characteristics have been associated with dysarthria in speakers with CP (Fox & Boliek, 2012; Schölderle, Staiger, Lampe, Strecker, & Ziegler, 2016; Workinger & Kent, 1991). Out of 400 total sentences across the 40 participants, 236 (59%) were identified as containing periods of audible deviant voice quality by at least two raters (142/200 sentences from children in the SMI group [72%] and 94/200 sentences from children in the TD group [47%]).
For each of the 236 sentences identified as containing deviant voice quality, the durations of deviant voice segments were measured acoustically. Beginning and end points of deviant voice segments were determined by using visual information in the waveform to locate disruptions in the normal periodicity pattern that corresponded with audible disruptions in voice quality. Beginning and end points of deviant voice segments were manually recorded using Praat (Boersma & Weenink, 2015), and a custom script was used to sum the duration of deviant voice segments within each sentence. An example waveform with marked deviant voicing segments is shown in Figure 5. The summed duration of deviant voice segments was divided by the utterance duration (exclusive of pauses) to yield a proportion of deviant voice quality for each sentence. The proportions were averaged across the set of 10 sentences to yield an average proportion of speech produced with deviant voice quality for each child.
Articulation rate was selected as a global index of speech production ability, as it is influenced by coordination and timing at all subsystem levels and is a key characteristic of dysarthria (Yorkston et al., 2010). Articulation rate was quantified as rate of speech (in syllables per second), exclusive of silent intervals longer than 200 ms. A custom Praat (Boersma & Weenink, 2015) script was used to calculate articulation rate. Rate was calculated for each sentence as the number of syllables produced divided by the corrected utterance duration (total utterance duration − summed pause duration). Articulation rate was averaged across the 10 sentences to yield an average articulation rate for each child. Articulation rate was calculated for 100% of sentences.
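The rate calculation described above can be sketched in a few lines. This is a minimal Python illustration, not the custom Praat script used in the study; the function names and the input layout (syllable count, total utterance duration, list of pause durations) are assumptions made for the example.

```python
def articulation_rate(n_syllables, utterance_dur_s, pause_durs_s, min_pause_s=0.2):
    """Syllables per second, excluding silent intervals longer than min_pause_s."""
    # Only silent intervals above the 200 ms threshold are removed.
    pause_total = sum(p for p in pause_durs_s if p > min_pause_s)
    corrected_dur = utterance_dur_s - pause_total
    if corrected_dur <= 0:
        raise ValueError("corrected utterance duration must be positive")
    return n_syllables / corrected_dur


def mean_articulation_rate(sentences):
    """Average the per-sentence rates for one child.

    `sentences` is a list of (n_syllables, utterance_dur_s, pause_durs_s) tuples.
    """
    rates = [articulation_rate(n, dur, pauses) for n, dur, pauses in sentences]
    return sum(rates) / len(rates)
```

For a hypothetical 5-syllable sentence lasting 2.5 s with one 0.5 s pause and one 0.1 s pause, only the longer pause is subtracted, giving 5 / 2.0 = 2.5 syllables per second.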
Reliability
Interjudge reliability and intrajudge reliability were obtained for all acoustic measures. Twenty percent of the children were randomly selected for reliability (four TD children, four SMI children). Speech samples were independently measured by a second researcher trained in acoustic analysis and were also remeasured by the first author. For interjudge reliability, Pearson product moment correlations showed strong agreement between judges for all measures (r = .93–.99), except for the duration of intervals produced with deviant voice quality, for which the correlation between judges was moderately high (r = .74). Mean absolute differences in measurements between judges were as follows: F2 range (109 Hz), voicing duration in closure intervals (0.004 s), duration of deviant voice quality (0.048 s), and articulation rate (0.1 syllable/s). For intrajudge reliability, Pearson product moment correlations showed strong agreement between first and second ratings for all measures (r = .94–.99). Mean absolute differences between the first and second measurements were as follows: F2 range (145 Hz), voicing duration in closure intervals (0.003 s), duration of deviant voice quality (0.024 s), and articulation rate (0.05 syllable/s). Interjudge reliability and intrajudge reliability were consistent with previous literature (Auzou et al., 2000; Hustad et al., 2010; Rosen et al., 2008) and within an acceptable range. The moderate interjudge reliability for deviant voice quality is consistent with previous literature on perceptual ratings of voice quality, which has identified many factors that impact reliability across raters (Kreiman & Gerratt, 2000). In this study, we combined perceptual judgments of voice quality with visual information in the speech signal in order to increase the objectivity of deviant voice quality measurements.
Statistical Analysis
We used an a priori planned contrasts approach to examine group differences on the specific measures. This approach is considered more conservative than traditional multilevel analysis of variance because only the contrasts of interest are tested statistically (Kirk, 2013). Because of the small number of participants and the considerable variability among children with dysarthria, we used a nonparametric approach. To determine whether children with SMI and TD children differed on acoustic measures of connected speech, Mann–Whitney U tests were conducted comparing groups on each of the five acoustic measures (proportion of bursts, F2 range of diphthongs, proportion of deviant voice quality, proportion of closure interval voicing, and articulation rate). In order to reduce the familywise error rate for multiple comparisons, a Holm–Bonferroni correction was applied to adjust significance levels for the five tests (Holm, 1979). The Holm–Bonferroni method is a modification of the Bonferroni correction that uses a sequential method for rejecting null hypotheses. Obtained p values are ranked from smallest to largest and compared to significance levels α/n, α/(n − 1), … α/1, where α is the target alpha level and n is the total number of tests performed. For the five contrasts in this study, the test with the lowest p value was compared to a significance level of α = .05/5 = .01, the test with the second lowest p value was compared to a significance level of α = .05/(5 − 1) = .0125, and the subsequent three p values were compared to significance levels of α = .0167, .025, and .05, respectively. This method provides robust protection against Type I errors while maintaining higher power than a classic Bonferroni correction (Holm, 1979). Given the small sample size in this study, this more powerful method of correcting for multiple comparisons was deemed an appropriate approach.
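The sequential procedure can be illustrated with a short Python sketch. The function names are hypothetical and the p values in the usage note are invented for illustration; they are not data from this study.

```python
def holm_thresholds(n_tests, alpha=0.05):
    """Significance levels alpha/n, alpha/(n - 1), ..., alpha/1."""
    return [alpha / (n_tests - k) for k in range(n_tests)]


def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans (same order as p_values): True = null rejected."""
    n = len(p_values)
    # Rank the tests from smallest to largest p value.
    order = sorted(range(n), key=lambda i: p_values[i])
    reject = [False] * n
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (n - rank):
            reject[idx] = True
        else:
            # Step-down rule: once one test fails, all larger p values fail too.
            break
    return reject
```

With five tests, `holm_thresholds(5)` gives .01, .0125, .0167, .025, and .05 after rounding. For the made-up p values `[0.001, 0.04, 0.011, 0.03, 0.2]`, the two smallest values pass their thresholds, the third smallest (.03) exceeds .0167, and testing stops there.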
To characterize individual differences in acoustic profiles, each child in the SMI group was compared to the distribution of scores in the TD group on each of the speech measures. As standard developmental norms do not exist for included speech measures, data from children in the TD group were used to derive a “normal range” for each acoustic measure, defined as the TD group mean ± 2 SDs. This is a conservative criterion for defining a normal range; however, given the small sample size of the TD group, it was deemed appropriate for this analysis. For each child in the SMI group, speech measurements that were outside the “normal range” were identified descriptively.
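A minimal Python sketch of this descriptive criterion follows. The function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, as the text does not state which estimator was used.

```python
from statistics import mean, stdev


def normal_range(td_scores, k=2.0):
    """Return (lower, upper) bounds defined as the TD group mean +/- k SDs."""
    m = mean(td_scores)
    s = stdev(td_scores)  # sample SD; an assumption, see lead-in
    return m - k * s, m + k * s


def outside_normal_range(score, bounds):
    """True if a child's score falls below the lower or above the upper bound."""
    lower, upper = bounds
    return score < lower or score > upper
```

Applied once per acoustic measure, `outside_normal_range` flags each SMI child's score against the bounds derived from the TD group.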
To determine acoustic markers of dysarthria in 5-year-old children with CP, multiple logistic regression was used to test models with dysarthria status as the outcome variable and the five acoustic measures as predictor variables. In order to build the multiple logistic regression model that best fit the data, an empirical approach was used that involved entering independent variables in order of decreasing strength in predicting dysarthria status. First, a series of univariate logistic regression models was constructed to look at each variable as an independent predictor of dysarthria (Hosmer & Lemeshow, 2000). McFadden's pseudo-R² and Akaike information criterion were calculated for each univariate model to assess goodness of fit. As the objective of this analysis was to construct a model that predicted dysarthria diagnosis most accurately and with the fewest predictors, independent variables were then entered into a multiple logistic regression model one at a time, in order of highest to lowest pseudo-R² (see Table 2), until added variables no longer resulted in a significant improvement in model fit. After each variable was added, likelihood ratio tests were used to compare nested models to determine whether the added variable significantly improved the model.
Table 2.
Variable | B | SE | Wald | p | AIC (goodness of fit) | Pseudo-R² (goodness of fit) |
---|---|---|---|---|---|---|
Nasality rating | 4.22 | 1.51 | 2.80 | .01 | 18.09 | .75 |
Articulation rate | −8.75 | 3.17 | −2.76 | .01 | 27.80 | .57 |
F2 range of diphthongs | −0.01 | 0.00 | −2.53 | .01 | 49.02 | .19 |
Proportion of deviant voice quality | 14.53 | 6.25 | 2.33 | .02 | 50.71 | .16 |
Proportion of bursts | −5.96 | 3.18 | −1.88 | .06 | 55.13 | .08 |
Proportion of closure interval voicing | −1.71 | 2.23 | −0.77 | .44 | 57.86 | .01 |
Note. Measures are listed in order of highest to lowest pseudo-R². AIC = Akaike information criterion.
Results
Acoustic Measures: Group Comparisons
Descriptive data, presented in Figure 6, suggested that children in the SMI group showed impaired performance on four out of the five acoustic measures (proportion of bursts, F2 range of diphthongs, proportion of deviant voice quality, and articulation rate) compared to children in the TD group. In addition, standard deviations suggested that there was more variability across children in the SMI group than in the TD group for all four of these acoustic measures. Results of Mann–Whitney U tests revealed that group differences on the following measures were statistically significant, using a Holm–Bonferroni correction for multiple comparisons: articulation rate (U = 25, Z = −4.73, p < .001), F2 range of diphthongs (U = 95, Z = −2.84, p = .005), proportion of deviant voice quality (U = 112, Z = −2.37, p = .018), and proportion of bursts produced (U = 115, Z = −2.31, p = .021). Effect sizes (Field, 2009) were large for articulation rate (r = .75) and medium for the F2 range of diphthongs (r = .45), the proportion of bursts produced (r = .36), and the proportion of deviant voice quality (r = .37). No significant difference was found between groups for the proportion of closure interval voicing (U = 168, Z = −0.87, p = .387).
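The effect sizes above are consistent with the conversion r = |Z| / √N, where N is the total number of participants. The short Python sketch below assumes that this is the formula applied (the text cites Field, 2009, without spelling it out); the function name is hypothetical.

```python
import math


def effect_size_r(z, n_total):
    """Effect size r for a Mann-Whitney U test, computed from its Z statistic."""
    return abs(z) / math.sqrt(n_total)
```

With N = 40, the reported Z values for articulation rate (−4.73) and F2 range (−2.84) reproduce the reported effect sizes of .75 and .45. Values recomputed from rounded Z statistics may differ from reported values in the last digit.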
Individual Data
Descriptive analysis of individual data indicated variability in the number of children with dysarthria whose measurements were outside the “normal range” (± 2 SDs) for each acoustic measure. The speech patterns of each child in the SMI group relative to the “normal range” of the TD group are displayed in Figure 7. Articulation rate had the largest number of children with SMI outside the “normal range” of the TD group (13 of 20 children), followed by proportion of deviant voice quality (nine of 20 children), and then F2 range (four of 20 children) and proportion of bursts produced (four of 20 children). The majority of children in the SMI group (80%) showed impairment on between one and three acoustic measures relative to the TD group; however, the constellation of co-occurring impaired speech features varied substantially across individual children. Importantly, the number and pattern of impaired acoustic features did not appear related to intelligibility; a Spearman's rank order correlation revealed no significant correlation between the number of impaired acoustic features and intelligibility score (ρ = −.21, p = .36).
Logistic Regression
The best-fitting model to emerge from the logistic regression analysis included articulation rate and the F2 range of diphthongs as predictors of dysarthria status. This model was statistically significant (χ² = 36.51, p < .001, −2LL = 18.96). Articulation rate made an independent significant contribution to the prediction of dysarthria status (B = −10.118, p < .01), but F2 range was not an independent significant predictor (B = −.007, p = .075). Results of a likelihood ratio test indicated that including the F2 range of diphthongs significantly improved the model over articulation rate alone (χ² = 4.84, p = .02). The F2 range of diphthongs and articulation rate were moderately correlated (r = .50), but the variance inflation factor indicated multicollinearity was not a concern (variance inflation factor = 1.36).
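The relation between the reported −2LL, the model chi-square, and McFadden's pseudo-R² can be checked with a brief Python sketch. It assumes the null model is an intercept-only logistic regression fit to the 20 children with dysarthria and 20 TD children; the function names are hypothetical.

```python
import math


def null_neg2ll(n_pos, n_neg):
    """-2 log-likelihood of an intercept-only logistic model."""
    n = n_pos + n_neg
    p = n_pos / n  # the fitted probability is the base rate
    log_lik = n_pos * math.log(p) + n_neg * math.log(1.0 - p)
    return -2.0 * log_lik


def model_chi_square(reduced_neg2ll, full_neg2ll):
    """Likelihood ratio statistic comparing two nested models."""
    return reduced_neg2ll - full_neg2ll


def mcfadden_pseudo_r2(model_neg2ll, null_neg2ll_value):
    """McFadden's pseudo-R2 = 1 - LL_model / LL_null."""
    return 1.0 - model_neg2ll / null_neg2ll_value
```

Under these assumptions the null −2LL is about 55.45, so a model with −2LL = 18.96 has a chi-square of about 36.49 and a pseudo-R² of about .66, in line with the reported values up to rounding.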
This acoustic model classified children into dysarthria/no dysarthria groups with 87.5% accuracy and explained approximately 65% (McFadden pseudo-R²) of the variance in dysarthria status. The model had 85% sensitivity (95% confidence interval [CI] [62%, 96%]) and 90% specificity (95% CI [68%, 98%]) in identifying dysarthria. The positive predictive value of the model was 89% (95% CI [67%, 98%]), indicating the probability that children classified as having dysarthria actually had the disorder. The negative predictive value was 86% (95% CI [64%, 97%]), indicating the probability that children classified as not having dysarthria truly did not have the disorder. In order to make the odds ratios interpretable, the units were adjusted for articulation rate (to syllables per minute) and the F2 range of diphthongs (to 100 Hz increments). The odds ratio for articulation rate was 0.845, indicating that for a one syllable per minute decrease in rate, the odds of having dysarthria increased by a factor of 1.18, controlling for F2 range. The odds ratio for the F2 range of diphthongs was 0.481, indicating that for each 100 Hz reduction in F2 range, the odds of having dysarthria increased by a factor of 2.08, controlling for articulation rate.
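The classification metrics and the rescaled odds ratio can be reproduced with a short Python sketch. The cell counts used in the usage note (17 true positives, 3 false negatives, 18 true negatives, 2 false positives) are not reported in the text; they are inferred here from 85% sensitivity and 90% specificity with 20 children per group. The function names are hypothetical.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy metrics from a 2 x 2 classification table."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def rescaled_odds_ratio(b, unit_scale):
    """Odds ratio for a change of `unit_scale` original units in the predictor."""
    return math.exp(b * unit_scale)
```

With the inferred counts, `classification_metrics(17, 3, 18, 2)` gives 87.5% accuracy, a positive predictive value of about 89%, and a negative predictive value of about 86%. Rescaling the articulation rate coefficient from syllables per second to syllables per minute, `rescaled_odds_ratio(-10.118, 1 / 60)`, gives approximately 0.845, and its reciprocal is approximately 1.18. The F2 coefficient is reported to only one significant digit (−.007), so the same calculation yields a value near, but not equal to, the reported 0.481.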
Discussion
This study aimed to examine how acoustic characteristics of connected speech in children with CP and dysarthria compared to TD children at both group and individual levels and to identify acoustic measures that best predicted the presence of dysarthria. There were two primary findings: (a) In connected speech, children in the SMI group differed from TD children on acoustic measures of multiple speech dimensions but showed wide individual variability in the pattern of dimensions that showed impairment. (b) An acoustic model containing articulation rate and the F2 range of diphthongs differentiated children in the SMI group from TD children with a high degree of accuracy based on their production of multiword sentences. These findings are discussed in detail below.
Acoustic Characteristics of Connected Speech in Children With Dysarthria
The speech measures examined in this study were selected to quantify multiple deviant speech dimensions in children with dysarthria secondary to CP. These measures were selected to represent a range of speech features known to be impaired in children with CP; however, there are many other potential acoustic measures of speech dimensions that could have been included as well. Results of group comparisons indicated that in sentence production, 5-year-old children in the SMI group differed from TD children on acoustic measures of articulatory precision, voice quality, and articulation rate.
Acoustic measures of articulatory precision indicated that children in the SMI group used smaller F2 ranges for diphthongs and had reduced realization of bursts in connected speech, compared to children in the TD group. These results support and extend findings from extant literature on single-word productions in children with dysarthria. The reduced F2 ranges of children in the SMI group in this study of connected speech are consistent with prior research demonstrating that children with dysarthria had reduced F2 extents in single-word productions (Lee et al., 2014) and reduced vowel space areas in single-word productions (Higgins & Hodge, 2002; Hustad et al., 2010) compared to TD children. This reduction in F2 range suggests that children with dysarthria use smaller ranges of articulatory movement when producing phonemes requiring large changes in vocal tract configuration and supports the theory that children with dysarthria tend to undershoot these phonetic targets (Higgins & Hodge, 2002; Liu, Tsao, & Kuhl, 2005). In addition, results of this study demonstrated that children in the SMI group produced a significantly lower proportion of bursts in connected speech than TD children. To our knowledge, realization of bursts has not previously been investigated in children with dysarthria, and this finding provides evidence for one way in which dysarthria may affect children's precision of consonant production in connected speech. Reduced burst production in children with SMI may be due to impaired ability to generate adequate intraoral pressure for stop production compared to typical children. Such a deficit could be due to impaired strength or coordination at the articulatory level (i.e., incomplete closure leading to air leakage), impairment of the respiratory subsystem (i.e., reduced breath support for speech), and/or impaired velopharyngeal movement (i.e., incomplete closure leading to reduced intraoral pressure). 
Individual data suggested that children who had the greatest reductions in F2 range were not necessarily the same children who showed the greatest impairment in burst production and that impairments in these two measures were present in children with widely varying intelligibility levels. As shown in Figure 7, two children (CP05 and CP08) showed impairment in both burst production and F2 range relative to the TD group; however, an additional four children showed impairment in one of these measures but not the other. This suggests that the factors contributing to articulatory imprecision may differ across individual children with dysarthria.
Group comparisons also demonstrated that children in the SMI group exhibited more deviant voice quality in connected speech than TD children. Although voice quality impairment, including breathiness, hoarseness, and strained/strangled voice quality, is commonly associated with dysarthria in children with CP (Workinger & Kent, 1991), few studies have attempted to quantify deviant voice quality in this population. One recent study by Lee and colleagues examined phonatory stability via measures of jitter and shimmer in single-word productions of children with dysarthria secondary to CP compared to TD children but did not find significant differences between groups (Lee et al., 2014). The discrepancy between findings of Lee and colleagues and this study may be due to the higher speech motor demands of producing sentences compared to single words. Although voice quality deficits may not be apparent in production of single words for most children with dysarthria, deviant voice quality may emerge in connected speech as demands on breath support and coordination between respiration and phonation increase. Alternatively, it is possible that the perceptual–acoustic measure of percentage of deviant voice quality used in this study was more sensitive to the voice impairment of children with SMI than the jitter and shimmer used by Lee and colleagues (2014). Individual data, shown in Figure 7, suggest that the children with SMI who had the most deviant voice quality had widely ranging intelligibility levels and were not necessarily the same children as those who showed the greatest impairment on measures of articulatory precision.
Children in the SMI and TD groups did not differ in the proportion of voicing in closure intervals in connected speech. Children with dysarthria secondary to CP are known to have impairments in making voicing distinctions in their speech (Irwin, 1968; Nordberg et al., 2014; Workinger & Kent, 1991). Voicing during closure intervals was examined, as it was hypothesized to reflect impairments in control of voicing offset that could contribute to deficits in making voicing distinctions. There are several possible reasons why this measure did not show differences between groups. The proportions of voicing in closure intervals spanned a wide range, both within and across children in the SMI and TD groups. Although an identical set of eight postvocalic stops was measured for each child, the stimulus set contained a range of phonetic contexts (i.e., multiple vowel–consonant combinations), which may have contributed to the high within-child variability. The lack of group differences could reflect that voicing during closure intervals is not a well-controlled cue in either speakers with dysarthria or typical children. Another possibility is that temporal control of voicing offset is not yet developed in 5-year-old children. Studies have shown that children develop control of timing for different aspects of voicing contrasts over a protracted time that continues until after the age of 6 (Hawkins, 1984). Thus, it is possible that, in older children or adults with dysarthria, this measure would differentiate groups.
Results also indicated that children in the SMI group had significantly slower articulation rates than TD children in sentence production. This finding is consistent with previous literature showing reduced speaking rate (Hustad et al., 2010) and longer segment durations (Lee et al., 2014) in children with dysarthria due to CP compared to TD children and with perceptual studies demonstrating slow rate as a feature of dysarthria associated with CP in children (Workinger & Kent, 1991). As articulation rate is measured excluding pauses, reduction in articulation rate suggests that children with dysarthria secondary to CP may require more time to execute articulatory gestures than typical children (Nip, 2013). Individual data demonstrated that articulation rate was the acoustic measure on which the most children in the SMI group deviated from the “normal range” of the TD group, additionally suggesting the sensitivity of this measure to dysarthria in connected speech of children with CP.
Overall, results of group comparisons indicated that multiple segmental and suprasegmental acoustic measures of connected speech are sensitive to SMIs in children with dysarthria due to CP. These findings imply impairment in physiological control of multiple speech subsystems in children with CP and provide information about the acoustic basis of phonetic and perceptual features of dysarthria in children with CP, as described in Table 1. In addition, analysis of individual data, shown in Figure 7, yielded valuable information about the variability in the presence and magnitude of impairment in individual speech features among children. For example, the child with the lowest intelligibility (CP05) had scores outside the “normal range” of the TD group on all acoustic measures, but for all other children in the SMI group, the pattern and number of variables that were outside the “normal range” varied greatly, even among children with similar intelligibility levels. While preliminary, these data suggest that patterns of SMI vary across children and that individual children may have different profiles of relative strengths and weaknesses across speech motor domains.
Acoustic Markers of Pediatric Dysarthria
Results of logistic regression analysis indicated that with only two acoustic measures, articulation rate and the F2 range of diphthongs, 5-year-old children with dysarthria due to CP could be distinguished from TD children with a high level of accuracy. Although theoretically, quantitative measures of all deviant speech dimensions should contribute to identification of dysarthria, the goal of this study was to identify a model containing the fewest predictors that best separated children with dysarthria from TD children. Assessment of this model's performance indicated high sensitivity and specificity, as well as strong positive and negative predictive values. These results suggest that, together, articulation rate and the F2 range of diphthongs have strong potential diagnostic value in clinical identification of dysarthria; however, the wide 95% confidence intervals for these metrics suggest the need for further validation of these findings.
In this model, articulation rate emerged as the only independently significant acoustic predictor of dysarthria. Articulation rate reflects the product of carefully timed and coordinated movements across all speech subsystem levels. The rate at which words are produced depends on the speed with which a speaker can execute movements of all parts of the vocal tract, including the vocal folds, articulators (i.e., lips, jaw, and tongue), velum, and respiratory musculature (Stevens, 1998). Thus, for speakers with dysarthria, reduced articulation rate may reflect the cumulative effects of SMI across physiological speech subsystems. In this way, articulation rate may be a more robust index of SMI than segmental acoustic measures or voice quality. Articulation rate has also been shown to reflect variance in speech motor abilities among TD children (Redford, 2014), further suggesting the potential value of this measure for detecting SMI in children with more subtle dysarthria symptoms.
The F2 range of diphthongs significantly improved the model, suggesting that this variable also made a valuable contribution to identifying dysarthria in this sample of 5-year-old children, even though it was not a significant independent predictor of dysarthria status when articulation rate was controlled. Second formant movement has been repeatedly shown to be a sensitive metric of dysarthria in children (Lee et al., 2014) and adults (Kim et al., 2011; Weismer et al., 2001; Weismer, Yunusova, & Bunton, 2012). Kim and colleagues (2011) found that F2 slope was the strongest predictor of intelligibility across multiple dysarthria subtypes in adult speakers and suggested that it was a sensitive metric of SMI, regardless of underlying etiology. Present findings support the importance of measures of F2 movement for identifying dysarthria in children. In the current study, children in the SMI group had severity levels ranging from mild to severe as indicated by intelligibility scores. The moderate correlation between articulation rate and the F2 range of diphthongs suggests some covariation between these variables as a function of severity, which may have decreased the independent effect of F2 range in this analysis. However, overall results indicated that the F2 range of diphthongs and articulation rate together distinguished children with dysarthria from TD children better than either variable in isolation.
The diagnostic accuracy of this acoustic model suggests that articulation rate and the F2 range of diphthongs are good candidate measures to examine with regard to their ability to identify dysarthria in younger children and those for whom a dysarthria diagnosis is unclear. Because of the heterogeneous nature of dysarthria in children with CP, it is clear that a multidimensional approach is needed to identify young children with dysarthria accurately. As the sample size was small, the number of predictors that could be included in the logistic regression model for this study was limited; however, in a larger sample, it is possible that additional acoustic measures could further improve the model's diagnostic accuracy.
Limitations and Future Directions
This study was preliminary in nature and conducted on a relatively small sample of children with CP who had a clinical dysarthria diagnosis. For the current analyses, we focused on measures of connected speech that were expected to be sensitive to SMI and reflected different aspects of speech motor control. However, it is possible that other acoustic measures might additionally enhance objective diagnosis of dysarthria in children, including measures of hypernasality, respiration, and prosody, which were not directly considered in the current study. Controlling for dysarthria severity in future studies is also important in order to reduce heterogeneity and examine the extent to which patterns of acoustic features may be related to severity.
As the long-term objective of this line of work is to improve diagnosis of dysarthria through use of objective measures that can be translated to clinical practice, future studies are needed that investigate the ability of this initial acoustic model to identify SMI in children for whom a dysarthria diagnosis is unclear. This would include younger children for whom developmental speech errors can make dysarthria diagnosis difficult, children with more subtle dysarthria signs, and children for whom combined influences (i.e., SMI + phonological impairment) result in speech patterns that are harder to categorize. In order to extend acoustic models to a clinical setting, it would be important to establish normative ranges for articulation rate and F2 range for children at different ages and to evaluate the ability of these measures to differentiate children with dysarthria from children with other speech sound disorders. This study examined children within a narrow age window, which is advantageous in terms of experimental control when comparing groups; however, studying children at one time point only provides a snapshot of their developing systems. Smith and Zelaznik (2004) showed that synergies in speech movement develop over a protracted time, lasting until adolescence. Anatomically, children's vocal tracts grow nonlinearly through the end of the teenage years (Vorperian & Kent, 2007). In addition, children are simultaneously making gains in phonological knowledge, language skills, and cognitive skills throughout childhood. All of these factors are likely to affect speech motor performance of both TD children and children with dysarthria. Thus, the magnitude of differences in speech variables between children with and without dysarthria may change over time, as children progress through different developmental stages. Furthermore, it is also likely that these developmental factors interact within individual children in different ways (Vick et al., 2012).
In order to understand how features of dysarthria emerge and change throughout childhood, longitudinal studies that examine trajectories of development in speech motor characteristics are needed.
Summary and Clinical Implications
Findings of this study contribute novel information regarding how acoustic characteristics of connected speech differ between children with CP and dysarthria of varying severity levels and TD children and illustrate the wide individual variability in speech patterns across children. In addition, results provided a preliminary acoustic model, consisting of articulation rate and the F2 range of diphthongs, that shows promise for aiding in identification of dysarthria in children with CP. These findings provide an initial step toward developing an objective method to improve diagnosis of dysarthria in young children and for children with unclear speech diagnoses. This has important clinical implications, as early and accurate diagnosis may lead to improved intervention and better functional speech outcomes for children with dysarthria.
Acknowledgments
This study was funded by Grants R01DC009411 (awarded to Katherine Hustad) and 1F31DC013925-01 (awarded to Kristen Allison) from the National Institute on Deafness and Other Communication Disorders, National Institutes of Health. Support was also provided by the Waisman Center Core Grant P30HD03352 from the National Institute of Child Health and Human Development, National Institutes of Health. We would like to thank the children and families who participated in this study. In addition, we would like to thank Gary Weismer for his insight and advice on this project and Luke Annear for his assistance with data analysis.
Appendix
Stimuli used for acoustic measurement. Underlined and bolded segments indicate analyzed tokens for each acoustic measure. Proportion of deviant voice quality and articulation rate were utterance-level measures obtained for each sentence produced.
Sentence | F2 range of diphthongs | Proportion of bursts | Proportion of deviant voice quality | Closure interval voicing | Articulation rate |
---|---|---|---|---|---|
Baby likes his new toy. | likes, toy | Baby, toy | X | likes | X |
They'll eat those hotdogs soon. | | hotdogs | X | eat | X |
His fingers are in wrong. | | | X | | X |
They are singing happy birthday. | | happy birthday | X | happy | X |
Water shoots from that gun. | | gun | X | shoots, that | X |
This cheese doesn't smell good. | | doesn't, good | X | | X |
The sign says ‘keep out.’ | out | keep | X | keep out | X |
Tie up the garbage bag. | tie | tie, garbage bag | X | up | X |
Give the flowers some water. | | Give | X | | X |
Put all the toys away. | toys away | Put, toys | X | | X |
Total tokens measured per child | 6 | 18 | 10 | 8 | 10 |
References
- Ackermann H., & Ziegler W. (1991). Articulatory deficits in parkinsonian dysarthria: An acoustic analysis. Journal of Neurology, Neurosurgery, & Psychiatry, 54(12), 1093–1098.
- Allison K. M., & Hustad K. C. (2014). Impact of sentence length and phonetic complexity on intelligibility in 5-year-old children with cerebral palsy. International Journal of Speech-Language Pathology, 16(4), 396–407.
- Ansel B. M., & Kent R. D. (1992). Acoustic–phonetic contrasts and intelligibility in the dysarthria associated with mixed cerebral palsy. Journal of Speech and Hearing Research, 35(2), 296–308.
- Auzou P., Ozsancak C., Morris R. J., Jan M., Eustache F., & Hannequin D. (2000). Voice onset time in aphasia, apraxia of speech, and dysarthria: A review. Clinical Linguistics & Phonetics, 14(2), 131–150.
- Boersma P., & Weenink D. (2015). Praat: Doing phonetics by computer (Version 5.4.08). Retrieved from http://www.praat.org
- Carrow-Woolfolk E. (2014). Test for Auditory Comprehension of Language–Fourth Edition. Austin, TX: PRO-ED, Inc.
- Chasaide A. N., & Gobl C. (1993). Contextual variation of the vowel voice source as a function of adjacent consonants. Language and Speech, 36(Pt 2–3), 303–330.
- Cornwell P. L., Murdoch B. E., Ward E. C., & Morgan A. (2003). Dysarthria and dysphagia as long-term sequelae in a child treated for posterior fossa tumour. Pediatric Rehabilitation, 6(2), 67–75.
- Darley F. L., Aronson A. E., & Brown J. R. (1969). Clusters of deviant speech dimensions in the dysarthrias. Journal of Speech and Hearing Research, 12(3), 462–496.
- De Bodt M. S., Hernandez-Diaz H. M., & Van De Heyning P. H. (2002). Intelligibility as a linear combination of dimensions in dysarthric speech. Journal of Communication Disorders, 35(3), 283–292.
- Duffy J. R. (2013). Motor speech disorders: Substrates, differential diagnosis, and management. New York, NY: Elsevier Health Sciences.
- Fant G. (1960). Acoustic theory of speech production. The Hague, the Netherlands: Mouton.
- Field A. (2009). Discovering statistics using SPSS. Thousand Oaks, CA: Sage Publications.
- Flipsen P., Jr. (2006). Measuring the intelligibility of conversational speech in children. Clinical Linguistics & Phonetics, 20(4), 303–312.
- Fox C. M., & Boliek C. A. (2012). Intensive voice treatment (LSVT® LOUD) for children with spastic cerebral palsy and dysarthria. Journal of Speech, Language, and Hearing Research, 55(3), 930–945.
- Fudala J. B. (2000). Arizona Articulation Proficiency Scale–Third Revision. Torrance, CA: Western Psychological Services.
- Green J. R., Yunusova Y., Kuruvilla M. S., Wang J., Pattee G. L., Synhorst L., … Berry J. D. (2013). Bulbar and speech motor assessment in ALS: Challenges and future directions. Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration, 14(7–8), 494–500.
- Grigos M. I. (2009). Changes in articulator movement variability during phonemic development: A longitudinal study. Journal of Speech, Language, and Hearing Research, 52(1), 164–177.
- Hawkins S. (1984). On the development of motor control in speech: Evidence from studies of temporal coordination. Speech and Language: Advances in Basic Research and Practice (Vol. 11, pp. 317–374). Cambridge, MA: Academic Press.
- Hidecker M. J., Ho N. T., Dodge N., Hurvitz E. A., Slaughter J., Workinger M. S., … Paneth N. (2012). Inter-relationships of functional status in cerebral palsy: Analyzing gross motor function, manual ability, and communication function classification systems in children. Developmental Medicine and Child Neurology, 54(8), 737–742.
- Higgins C. M., & Hodge M. M. (2002). Vowel area and intelligibility in children with and without dysarthria. Journal of Medical Speech-Language Pathology, 10(4), 271–277.
- Hodge M., & Daniels J. (2007). TOCS+ intelligibility measures. Edmonton, Canada: University of Alberta.
- Holm S. (1979). A simple sequential rejective multiple test procedure. Scandinavian Journal of Statistics, 6, 65–70.
- Hosmer D. W., & Lemeshow S. (2000). Applied Logistic Regression–Second Edition. New York, NY: Wiley.
- Hustad K. C., Gorton K., & Lee J. (2010). Classification of speech and language profiles in 4-year-old children with cerebral palsy: A prospective preliminary study. Journal of Speech, Language, and Hearing Research, 53(6), 1496–1513.
- Hustad K. C., Oakes A., & Allison K. (2015). Variability, stability, and diagnostic accuracy of speech intelligibility scores in children. Journal of Speech, Language, and Hearing Research, 58, 1695–1707.
- Hustad K. C., Schueler B., Schultz L., & DuHadway C. (2012). Intelligibility of 4-year-old children with and without cerebral palsy. Journal of Speech, Language, and Hearing Research, 55(4), 1177–1189.
- Irwin O. C. (1968). Correct status of vowels and consonants in the speech of children with cerebral palsy as measured by an integrated test. The Cerebral Palsy Journal, 29(1), 9–12, 15.
- Kent J. F., Kent R. D., Rosenbek J. C., Weismer G., Martin R., Sufit R., & Brooks B. R. (1992). Quantitative description of the dysarthria in women with amyotrophic lateral sclerosis. Journal of Speech and Hearing Research, 35(4), 723–733.
- Kent R. D., Kent J. F., Weismer G., Martin R., Sufit R. L., Brooks B. R., & Rosenbek J. C. (1989). Relationships between speech intelligibility and the slope of second formant transitions in dysarthric subjects. Clinical Linguistics & Phonetics, 3, 347–358.
- Kent R. D., & Kim Y. J. (2003). Toward an acoustic typology of motor speech disorders. Clinical Linguistics & Phonetics, 17(6), 427–445.
- Kent R. D., Weismer G., Kent J. F., Vorperian H. K., & Duffy J. R. (1999). Acoustic studies of dysarthric speech: Methods, progress, and potential. Journal of Communication Disorders, 32(3), 141–186.
- Kim H., Martin K., Hasegawa-Johnson M., & Perlman A. (2010). Frequency of consonant articulation errors in dysarthric speech. Clinical Linguistics & Phonetics, 24(10), 759–770.
- Kim Y., Kent R. D., & Weismer G. (2011). An acoustic study of the relationships among neurologic disease, dysarthria type, and severity of dysarthria. Journal of Speech, Language, and Hearing Research, 54(2), 417–429.
- Kirk R. E. (2013). Experimental design: Procedures for the behavioral sciences (4th ed.). Thousand Oaks, CA: Sage Publications.
- Kreiman J., & Gerratt B. R. (2000). Sources of listener disagreement in voice quality assessment. The Journal of the Acoustical Society of America, 108(4), 1867–1876.
- Lee J., & Hustad K. C. (2013). A preliminary investigation of longitudinal changes in speech production over 18 months in young children with cerebral palsy. Folia Phoniatrica et Logopaedica, 65(1), 32–39.
- Lee J., Hustad K. C., & Weismer G. (2014). Predicting speech intelligibility with multiple speech subsystem approach in children with cerebral palsy. Journal of Speech, Language, and Hearing Research, 57(5), 1666–1678.
- Lee S., Potamianos A., & Narayanan S. (1999). Acoustics of children's speech: Developmental changes of temporal and spectral parameters. The Journal of the Acoustical Society of America, 105(3), 1455–1468.
- Levy E. S., Ramig L. O., & Camarata S. M. (2012). The effects of two speech interventions on speech function in pediatric dysarthria. Journal of Medical Speech-Language Pathology, 20, 82–87.
- Lisker L. (1986). “Voicing” in English: A catalogue of acoustic features signaling /b/ versus /p/ in trochees. Language and Speech, 29(Pt. 1), 3–11.
- Liu H. M., Tsao F. M., & Kuhl P. K. (2005). The effect of reduced vowel working space on speech intelligibility in Mandarin-speaking young adults with cerebral palsy. The Journal of the Acoustical Society of America, 117(6), 3879–3889.
- Liu H. M., Tseng C.-H., & Tsao F. M. (2000). Perceptual and acoustic analysis of speech intelligibility in Mandarin-speaking young adults with cerebral palsy. Clinical Linguistics & Phonetics, 14(6), 447–464.
- Löfqvist A. (1992). Acoustic and aerodynamic effects of interarticulator timing in voiceless consonants. Language and Speech, 35(Pt 1–2), 15–28.
- McCauley R. J., & Strand E. A. (2008). A review of standardized tests of nonverbal oral and speech motor performance in children. American Journal of Speech-Language Pathology, 17(1), 81–91.
- Mei C., Reilly S., Reddihough D., Mensah F., & Morgan A. (2014). Motor speech impairment, activity, and participation in children with cerebral palsy. International Journal of Speech-Language Pathology, 16, 427–435.
- Milenkovic P. (2002). TF32: University of Wisconsin-Madison. Retrieved from http://userpages.chorus.net/cspeech/TFpamph.pdf
- Nip I. S. (2013). Kinematic characteristics of speaking rate in individuals with cerebral palsy: A preliminary study. Journal of Medical Speech-Language Pathology, 20(4), 88–94.
- Nordberg A., Miniscalco C., & Lohmander A. (2014). Consonant production and overall speech characteristics in school-aged children with cerebral palsy and speech impairment. International Journal of Speech-Language Pathology, 16(4), 386–395.
- Nordberg A., Miniscalco C., Lohmander A., & Himmelmann K. (2013). Speech problems affect more than one in two children with cerebral palsy: Swedish population-based study. Acta Paediatrica, 102(2), 161–166.
- Ozsancak C., Auzou P., Jan M., & Hannequin D. (2001). Measurement of voice onset time in dysarthric patients: Methodological considerations. Folia Phoniatrica et Logopaedica, 53(1), 48–57.
- Palisano R., Rosenbaum P., Walter S., Russell D., Wood E., & Galuppi B. (1997). Gross Motor Function Classification System. Developmental Medicine and Child Neurology, 39, 214–223.
- Pennington L., & McConachie H. (2001). Interaction between children with cerebral palsy and their mothers: The effects of speech intelligibility. International Journal of Language & Communication Disorders, 36(3), 371–393.
- Redford M. A. (2014). The perceived clarity of children's speech varies as a function of their default articulation rate. The Journal of the Acoustical Society of America, 135(5), 2952–2963.
- Rice M. L., Smolik F., Perpich D., Thompson T., Rytting N., & Blossom M. (2010). Mean length of utterance levels in 6-month intervals for children 3 to 9 years with and without language impairments. Journal of Speech, Language, and Hearing Research, 53(2), 333–349.
- Rong P., Yunusova Y., Wang J., & Green J. R. (2015). Predicting early bulbar decline in amyotrophic lateral sclerosis: A speech subsystem approach. Behavioural Neurology, 2015, 183027.
- Rosen K. M., Goozée J. V., & Murdoch B. E. (2008). Examining the effects of multiple sclerosis on speech production: Does phonetic structure matter? Journal of Communication Disorders, 41(1), 49–69.
- Rosen K. M., Kent R. D., Delaney A. L., & Duffy J. R. (2006). Parametric quantitative acoustic analysis of conversation produced by speakers with dysarthria and healthy speakers. Journal of Speech, Language, and Hearing Research, 49(2), 395–411.
- Sander E. K. (1972). When are speech sounds learned? Journal of Speech and Hearing Disorders, 37(1), 55–63.
- Schölderle T., Staiger A., Lampe R., Strecker K., & Ziegler W. (2016). Dysarthria in adults with cerebral palsy: Clinical presentation and impacts on communication. Journal of Speech, Language, and Hearing Research, 59(2), 216–229.
- Sigurdardottir S., & Vik T. (2011). Speech, expressive language, and verbal cognition of preschool children with cerebral palsy in Iceland. Developmental Medicine & Child Neurology, 53(1), 74–80.
- Smit A. B., Hand L., Freilinger J. J., Bernthal J. E., & Bird A. (1990). The Iowa Articulation Norms Project and its Nebraska replication. Journal of Speech and Hearing Disorders, 55(4), 779–798.
- Smith A., & Zelaznik H. N. (2004). Development of functional synergies for speech motor coordination in childhood and adolescence. Developmental Psychobiology, 45(1), 22–33.
- Stevens K. N. (1998). Acoustic phonetics. Cambridge, MA: MIT Press.
- van Mourik M., Catsman-Berrevoets C. E., Paquier P. F., Yousef-Bak E., & van Dongen H. R. (1997). Acquired childhood dysarthria: Review of its clinical presentation. Pediatric Neurology, 17(4), 299–307.
- van Mourik M., Catsman-Berrevoets C. E., Yousef-Bak E., Paquier P. F., & van Dongen H. R. (1998). Dysarthria in children with cerebellar or brainstem tumors. Pediatric Neurology, 18(5), 411–414.
- Vick J. C., Campbell T. F., Shriberg L. D., Green J. R., Abdi H., Rusiewicz H. L., … Moore C. A. (2012). Distinct developmental profiles in typical speech acquisition. Journal of Neurophysiology, 107, 2885–2900.
- Vorperian H. K., & Kent R. D. (2007). Vowel acoustic space development in children: A synthesis of acoustic and anatomic data. Journal of Speech, Language, and Hearing Research, 50(6), 1510–1545.
- Weismer G., Jeng J. Y., Laures J. S., Kent R. D., & Kent J. F. (2001). Acoustic and intelligibility characteristics of sentence production in neurogenic speech disorders. Folia Phoniatrica et Logopaedica, 53(1), 1–18.
- Weismer G., Yunusova Y., & Bunton K. (2012). Measures to evaluate the effects of DBS on speech production. Journal of Neurolinguistics, 25(4), 74–94.
- Workinger M. S., & Kent R. D. (1991). Perceptual analysis of the dysarthrias in children with athetoid and spastic cerebral palsy. In Moore C. A., Yorkston K. M., & Beukelman D. R. (Eds.), Dysarthria and apraxia of speech: Perspectives on management (pp. 109–126). Baltimore, MD: Brookes.
- Yorkston K. M., Beukelman D. R., Strand E., & Hakel M. E. (2010). Management of motor speech disorders in children and adults. Austin, TX: Pro-Ed.
- Yunusova Y., Weismer G., Kent R. D., & Rusche N. M. (2005). Breath-group intelligibility in dysarthria: Characteristics and underlying correlates. Journal of Speech, Language, and Hearing Research, 48(6), 1294–1310.
- Zimmerman I., Steiner V., & Pond R. (2005). Preschool Language Scale–Fourth Edition Screening Test. San Antonio, TX: The Psychological Corporation.