Abstract
Objective
This study examined the extent to which articulatory rate reduction and increased loudness were associated with adjustments in utterance-level measures of fundamental frequency (F0) variability for speakers with dysarthria and healthy controls. The focus was on F0 measures that previously published studies have shown to impact on intelligibility. More generally, the current study sought to compare and contrast how a slower-than-normal rate and increased vocal loudness impact on a variety of utterance-level F0 characteristics for speakers with dysarthria and healthy controls.
Patients and Methods
Eleven speakers with Parkinson's disease, 15 speakers with multiple sclerosis, and 14 healthy control speakers were audio recorded while reading a passage in habitual, loud, and slow conditions. Magnitude production was used to elicit variations in rate and loudness. Acoustic measures of duration, intensity and F0 were obtained.
Results and Conclusions
For all speaker groups, a slower-than-normal articulatory rate and increased vocal loudness had distinct effects on F0 relative to the habitual condition, including a tendency for measures of F0 variation to be greater in the loud condition and reduced in the slow condition. These results suggest implications for the treatment of dysarthria.
Key Words: Dysarthria, Fundamental frequency, Speaking rate, Loudness
Introduction
The contribution of fundamental frequency (F0) to speech intelligibility has been examined in studies of normal and disordered speech, including studies of dysarthria [1,2,3,4,5,6,7,8]. These studies suggest that sentences or phrases characterized by relatively greater F0 variation, as indexed by measures such as F0 range or F0 standard deviation, tend to be associated with relatively better speech intelligibility. Spitzer et al. [6], for example, used speech resynthesis to flatten the F0 contour of phrases produced by a neurologically normal speaker. Phrases for which the intonation contour had been flattened were less intelligible compared to the original phrases for which the F0 contour was unaltered. Relatedly, Bunton [9] explored the contribution of F0 to vowel identity for speakers with Parkinson's disease (PD) and healthy controls. Speech resynthesis was used to flatten or enhance F0 contours. Results indicated that the enhanced F0 condition was associated with improved vowel identification accuracy for words produced by speakers with PD.
As illustrated by Bunton [9], one approach to investigating the contribution of F0 to intelligibility in dysarthria is to manipulate F0 using speech resynthesis [3,9,10]. These types of controlled studies have yielded much valuable information, but it also would seem important to pursue parallel studies investigating the extent to which speakers with dysarthria actually produce the kinds of F0 adjustments that have been linked to intelligibility in studies employing speech resynthesis. Thus, the present study explored the extent to which two behavioral therapy techniques for dysarthria were associated with F0 adjustments that have the potential to impact on intelligibility. By investigating two therapeutic techniques in the same speakers with dysarthria, the present study also helps to address the need for group studies comparing the relative merits of intervention techniques [11].
Rate reduction and increased vocal loudness are widely used in the treatment of dysarthria [12]. Both techniques have the potential to improve intelligibility, although a slower-than-normal rate has been shown in at least some studies to have quite variable effects on intelligibility in dysarthria, and studies investigating increased loudness in dysarthria have only recently begun to include formal, quantitative measures of intelligibility. Studies from our lab investigating rate and loudness effects in dysarthria have focused on segmental adjustments associated with these therapeutic techniques [13,14]. As reviewed below, other studies have explored the effects of increased loudness and rate reduction on F0, albeit not in the form of group comparison studies.
The impact of increased loudness on F0 characteristics for neurologically normal speakers and speakers with dysarthria has been examined in studies using a variety of speech materials [e.g., [15,16,17,18,19]; for a review of dysarthria studies, see [12]]. These studies indicate a tendency for increased loudness to be associated with an increase in mean F0 as well as increased F0 variation or range. It is reasonable to speculate that this enhanced prosodic variation helps to explain the improved intelligibility accompanying increased vocal loudness reported for at least some speakers with dysarthria, although the appropriate empirical studies have yet to be conducted. Group dysarthria studies investigating F0 adjustments accompanying increased vocal loudness have mostly focused on PD, however, and research suggests the importance of studying a variety of populations, as rate and loudness manipulations may not uniformly affect speech characteristics for all neurological diagnoses and dysarthrias [20,21,22].
F0 adjustments associated with a slower-than-normal articulatory rate have not been studied much in dysarthria, at least in studies where rate effects can be readily teased apart from other factors or were the primary focus of study [23,24]. For example, Wang et al. [24] reported descriptive measures of F0 in the form of means and standard deviations for sentences produced at habitual, fast, and slow rates by speakers with dysarthria secondary to traumatic brain injury. Even for neurologically normal talkers, F0 characteristics associated with a slower-than-normal articulatory rate are not well understood, and speaker numbers and experimental speech materials in existing studies are generally quite limited. Nonetheless, the available published studies suggest a tendency for a slower-than-normal articulatory rate to be associated with a lower mean F0, lower maximum F0, and reduced F0 variation [15,25,26,27]. This latter finding is especially interesting and suggests the possibility that a slower-than-normal rate may be associated with F0 adjustments potentially detrimental to intelligibility.
In summary, the purpose of the current study was to examine the extent to which two behavioral therapy techniques for dysarthria were associated with F0 adjustments that have the potential to impact on intelligibility. More generally, the present study sought to compare and contrast the impact of a slower-than-normal articulatory rate and increased loudness on a variety of F0 measures for a reading passage produced by individuals with dysarthria secondary to PD or multiple sclerosis (MS).
Methods
Speakers
A total of 40 speakers were studied. The MS group was comprised of 5 men and 10 women ranging in age from 25 to 62 years (mean age = 49 years; SD = 10), the PD group was comprised of 6 men and 5 women ranging in age from 42 to 74 years (mean age = 61 years; SD = 11), and the control group was comprised of 7 men and 7 women ranging in age from 20 to 72 years (mean age = 54 years; SD = 14). Supralaryngeal adjustments associated with increased loudness and rate reduction have been previously reported for a larger group of speakers [13,14]. Additional details regarding inclusionary criteria may be found in these studies [13,14].
As summarized in tables 1 and 2, a dysarthria diagnosis, prominent deviant perceptual characteristics, and an estimate of dysarthria severity were identified for individuals with MS and PD based on consensus, auditory-perceptual judgments of 3 speech-language pathologists. Scaled estimates of intelligibility for the ‘Grandfather Passage’ also were provided by 5 graduate students in speech-language pathology for the purpose of characterizing speech severity. Direct magnitude estimation was used to obtain a relative ranking of intelligibility, with higher scale values indicating relatively better intelligibility and lower scale values indicating relatively poorer intelligibility. Scaled intelligibility estimates in tables 1 and 2 represent the geometric mean for the 5 listeners. Intelligibility was not estimated for PDF2 due to technical difficulties.
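The geometric mean used to pool listener estimates can be illustrated with a minimal sketch. The five listener values below are hypothetical and are not data from the study; the function name is likewise an illustrative choice.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive scale values, computed via logs."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical direct magnitude estimates from 5 listeners for one speaker.
listener_estimates = [50, 100, 100, 200, 100]
scaled_intelligibility = geometric_mean(listener_estimates)
print(round(scaled_intelligibility, 1))
```

The geometric mean is commonly preferred for magnitude estimation data because listeners tend to judge ratios rather than differences, so averaging in the log domain limits the influence of a single listener who uses unusually large numbers.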
Table 1.
Speaker characteristics for participants diagnosed with MS
Subject code | Age, years | Years after diagnosis | Dysarthria diagnosis | Dysarthria severity | Deviant perceptual characteristics | Scaled intelligibility |
---|---|---|---|---|---|---|
MSF1 | 60 | 12 | spastic | moderate | strain-strangled, slow rate, short phrases | 29 |
MSF2 | 42 | 8 | spastic-ataxic | mild | low pitch, imprecise consonants, excess and equal stress | 238 |
MSF3 | 33 | 5 | spastic | moderate | strain-strangled, slow rate, voice tremor | 57 |
MSF4 | 50 | 9 | ataxic | mild/moderate | hyponasal, excess and equal stress, monopitch | 70 |
MSF5 | 41 | 12 | ataxic | mild/moderate | hyponasal, irregular articulatory breakdown, harsh | 100 |
MSF6 | 56 | 7 | spastic | mild | low pitch, harsh, slow rate | 183 |
MSF7 | 59 | 15 | ataxic | moderate | hyponasal, excess and equal stress, imprecise consonants | 65 |
MSF8 | 25 | 5 | spastic-ataxic | moderate | excess and equal stress, slow rate, short phrases | 112 |
MSF9 | 50 | 9 | ataxic | moderate | slow rate, imprecise consonants, irregular articulatory breakdown | 61 |
MSF10 | 54 | 5 | spastic | mild | slow rate, strain-strangled, pitch breaks | 241 |
MSM2 | 45 | 4 | ataxic | moderate | slow rate, monopitch, irregular articulatory breakdown | 91 |
MSM3 | 58 | 8 | spastic | mild | strain-strangled, harsh | 170 |
MSM5 | 62 | 5 | ataxic | mild | excess and equal stress, harsh, voice tremor | 168 |
MSM6 | 47 | 2 | ataxic | mild | hyponasal, imprecise consonants, voice tremor | 97 |
MSM7 | 48 | 21 | ataxic | moderate | hyponasal, monopitch, monoloud | 74 |
Table 2.
Speaker characteristics for participants diagnosed with PD
Subject code | Age, years | Years after diagnosis | Dysarthria diagnosis | Dysarthria severity | Deviant perceptual characteristics | Scaled intelligibility |
---|---|---|---|---|---|---|
PDF1 | 42 | 6 | hypokinetic | moderate | monoloud, reduced loudness, variable rate | 146 |
PDF2 | 62 | 3 | hypokinetic | mild | imprecise consonants, slow rate | not available |
PDF3 | 50 | 3 | hypokinetic | moderate/severe | hypernasal, imprecise consonants, short rushes | 38 |
PDF4 | 72 | 9 | hypokinetic | moderate | reduced loudness, variable rate, short rushes | 115 |
PDF6 | 45 | 13 | hypokinetic | moderate/severe | fast rate, breathy voice, monoloud | 95 |
PDM1 | 69 | 12 | hypokinetic | moderate | monopitch, monoloud, reduced stress | 168 |
PDM2 | 74 | 1 | hypokinetic | mild | breathy, low pitch, slow rate | 100 |
PDM3 | 72 | 4 | hyperkinetic | mild | harsh, forced inspiration/expiration, low pitch | 178 |
PDM4 | 64 | 17 | hypokinetic | moderate | monopitch, monoloud, short rushes | 78 |
PDM5 | 60 | 8 | hypokinetic | moderate | breathy, short rushes, repeated phonemes | 70 |
PDM6 | 64 | 8 | hypo-/hyperkinetic | mild/moderate | breathy, fast rate, voice stoppages | 81 |
Speech Sample and Speaking Task
Participants were audio recorded while reading the ‘John Passage’, a 192-word passage developed to include a variety of consonants and vowels [13]. The present analyses were restricted to the first half of the reading passage, which is six sentences in length or 98 words and comparable in length to the ‘Grandfather Passage’ [11]. The reading passage was produced in habitual, loud, and slow speaking conditions. Magnitude production was used to elicit the variations in loudness and rate [20,22,28]. Details of the recording equipment and procedures may be found elsewhere [13,14].
Acoustic Analyses
Articulatory rate was measured for each speech run using the combined waveform and wideband (300–350 Hz) digital spectrographic displays of CSpeechSp 4.0 [29]. A run was operationally defined as a stretch of speech bounded by silent periods between words of at least 200 ms [13,30]. Conventional acoustic criteria were used to identify run onsets and offsets, such as stop release bursts, frication, or voicing energy. The number of syllables actually produced in each run was counted and articulatory rate was computed in syllables per second. Articulatory rates for speech runs in a given passage reading were averaged, yielding a mean value for each speaker and speaking condition for use in the statistical analysis.
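The run definition and rate computation described above can be sketched as follows. This is only an illustration under stated assumptions: in the study, run onsets and offsets were identified by hand from waveform and spectrographic displays, whereas the sketch assumes word-level time stamps and syllable counts are already available. The `Word` class, the function names, and the example timings are all hypothetical.

```python
from dataclasses import dataclass

MIN_PAUSE_S = 0.200  # a silent period of at least 200 ms bounds a speech run


@dataclass
class Word:
    onset_s: float   # word onset time in seconds (assumed annotation)
    offset_s: float  # word offset time in seconds (assumed annotation)
    syllables: int   # syllables actually produced


def split_into_runs(words, min_pause_s=MIN_PAUSE_S):
    """Group words into runs separated by between-word pauses >= min_pause_s."""
    runs, current = [], []
    for word in words:
        if current and word.onset_s - current[-1].offset_s >= min_pause_s:
            runs.append(current)
            current = []
        current.append(word)
    if current:
        runs.append(current)
    return runs


def articulatory_rate(run):
    """Syllables per second for one run (run onset to run offset)."""
    duration_s = run[-1].offset_s - run[0].onset_s
    return sum(w.syllables for w in run) / duration_s


def mean_articulatory_rate(words):
    """Average the per-run rates, giving one value per passage reading."""
    rates = [articulatory_rate(run) for run in split_into_runs(words)]
    return sum(rates) / len(rates)


# Hypothetical word annotations: a 350-ms pause separates two runs.
example_words = [
    Word(0.00, 0.30, 1), Word(0.35, 0.75, 2),
    Word(1.10, 1.50, 2), Word(1.55, 1.90, 1),
]
example_runs = split_into_runs(example_words)
example_mean_rate = mean_articulatory_rate(example_words)
```

Because pauses of 200 ms or longer are excluded by construction, the resulting value reflects articulatory rate rather than overall speaking rate.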
Sound pressure level (SPL) was used to index variation in vocal intensity. SPL values for runs associated with a given passage reading were averaged, yielding a mean SPL for each speaker and condition. These averages were used in the statistical analyses.
Using TF32 [31], F0 traces were generated for each speech run and were visually inspected for errors by 2 trained research assistants unfamiliar with the purpose of the study. Computer-generated F0 tracking errors were hand-corrected on a pitch period-by-pitch period basis. A broad approach was used to describe F0, based on previous studies [15,25,32]. Summary F0 statistics computed for each speech run included F0 mean, standard deviation, minimum, maximum, range, interquartile range and slope. Most measures were obtained directly in TF32. Text files containing the time-by-F0 frequency values also were exported from TF32 into Excel to allow for computation of F0 range and interquartile range. F0 slope also was computed in Excel using linear regression analysis. For each speaker and condition, global F0 measures were averaged across runs for use in the statistical analyses.
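The global F0 measures listed above can be computed from an exported time-by-F0 trace along the following lines. This is a sketch and not the TF32 or Excel procedure used in the study. It assumes time is in milliseconds and F0 in Hz, that unvoiced frames are coded as non-positive values and dropped, that the sample standard deviation is intended, and that quartiles follow the inclusive method. The function name and example values are hypothetical.

```python
import statistics


def f0_summary(times_ms, f0_hz):
    """Summary F0 statistics for one speech run.

    Slope is the ordinary least-squares regression of F0 on time (Hz/ms).
    """
    # Keep voiced frames only (assumption: unvoiced frames are coded <= 0).
    pairs = [(t, f) for t, f in zip(times_ms, f0_hz) if f > 0]
    if len(pairs) < 2:
        raise ValueError("need at least two voiced frames")
    t_vals = [t for t, _ in pairs]
    f_vals = [f for _, f in pairs]

    mean_t = sum(t_vals) / len(t_vals)
    mean_f = sum(f_vals) / len(f_vals)
    sxx = sum((t - mean_t) ** 2 for t in t_vals)
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in pairs)

    q1, _, q3 = statistics.quantiles(f_vals, n=4, method="inclusive")
    return {
        "mean": mean_f,
        "sd": statistics.stdev(f_vals),  # sample SD (assumption)
        "min": min(f_vals),
        "max": max(f_vals),
        "range": max(f_vals) - min(f_vals),
        "iqr": q3 - q1,
        "slope_hz_per_ms": sxy / sxx,
    }


# Hypothetical trace: F0 falls linearly from 200 to 160 Hz over 40 ms.
example_stats = f0_summary([0, 10, 20, 30, 40], [200, 190, 180, 170, 160])
```

A negative slope indicates declining F0 across the run, and values closer to 0 indicate a shallower decline.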
Figure 1 shows the combined waveform and wideband (300-Hz) digital spectrographic displays for the speech run ‘a seed in his garden’ produced in the habitual condition by a male speaker with PD. The F0 contour also is shown. Figure 1 shows that the intonation contour is comprised of an initial, middle, and terminal segment, demarcated by vertical cursors. These components of the larger intonation contour may be referred to as focal segments [32]. A decline in the terminal portion of an intonation contour is one acoustic cue to syntactic structure in spoken English [e.g., [33,34]]. The nonterminal portion of an intonation contour does not necessarily show a decline, however, and this potential difference in the slope of terminal and nonterminal focal F0 segments is not captured by global F0 slope. Thus, linear regression analysis also was used to compute the slope of focal segments for comparison of nonterminal and terminal segments. For each speaker and condition, focal slope measures were averaged across terminal or nonterminal segments for use in the statistical analyses.
Fig. 1.
The waveform, wideband digital spectrogram (300 Hz), and F0 trace are shown for the speech run ‘a seed in his garden’ spoken by a male speaker with PD in the habitual condition. Vertical cursors denote the terminal focal F0 segment. Global F0 measures for this run were as follows: F0 mean = 160 Hz, F0 SD = 19 Hz, F0 minimum = 95 Hz, F0 maximum = 195 Hz, F0 range = 100 Hz, F0 slope = −0.028 Hz/ms.
Measurement Reliability
Intrajudge and interjudge measurement reliability were determined for approximately 10% of the speech runs for each speaker and condition. Intrajudge reliability for SPL yielded a mean absolute error of 0.12 dB SPL and a Pearson product-moment correlation coefficient of 0.99 for the first and second set of measures. Interjudge reliability also yielded a correlation of 0.99 and a mean error of 0.09 dB SPL. Intrajudge reliability for speech run duration yielded an average measurement error of 10 ms and a correlation of 0.99. Interjudge reliability for speech run duration yielded a mean error of 23 ms as well as a correlation of 0.99. Reliability estimates can be summarized as follows for F0 mean, standard deviation, range, minimum and maximum. For intrajudge reliability, average absolute measurement error ranged from a low of 1.87 Hz (F0 mean) to a high of 15.2 Hz (F0 range). Correlation coefficients ranged from 0.92 (F0 minimum) to 0.99 (F0 mean). For interjudge reliability, measures of average absolute measurement error ranged from a low of 2.4 Hz (F0 mean) to a high of 15.55 Hz (F0 range), and correlation coefficients ranged from 0.90 (F0 minimum) to 0.99 (F0 mean). For slope, intrajudge reliability yielded a mean measurement error of 0.005 Hz/ms with a correlation of 0.95 for the two sets of measures. Interjudge reliability for slope yielded a mean error of 0.007 Hz/ms and a correlation of 0.95.
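The two reliability indices reported above, mean absolute error and the Pearson product-moment correlation between a first and second set of measures, can be sketched as follows. The measurement values are hypothetical, and the function names are illustrative only.

```python
import math


def mean_absolute_error(first, second):
    """Average absolute difference between paired measurements."""
    if len(first) != len(second) or not first:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(first, second)) / len(first)


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient for paired values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must be paired and contain at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical F0 mean values (Hz) for four runs, measured twice.
first_pass = [100, 120, 140, 160]
second_pass = [102, 119, 141, 158]
example_mae = mean_absolute_error(first_pass, second_pass)
example_r = pearson_r(first_pass, second_pass)
```

The two indices are complementary: the correlation captures whether remeasurement preserves the ordering of runs, while the mean absolute error expresses disagreement in the units of the measure itself.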
Data Analysis
For global F0 measures, a multivariate linear model was fit to each dependent variable in this repeated measures design. All models included the main effects of Condition (Habitual, Loud, Slow), Group (MS, PD, Control), as well as a Condition × Group interaction. To control for sex differences in dependent measures, gender was included as a covariate in all analyses. A similar model was fit to the dependent measure of focal F0 slope, but this model included an additional main effect of Position (Terminal, Nonterminal) as well as all possible two- and three-way interactions of main effects. Pairwise comparisons were performed based on the fitted models in conjunction with a Tukey-Kramer adjustment for multiple comparisons. A nominal significance level of 0.05 was used in all hypothesis testing.
Results
Articulatory Rate and SPL
Table 3 reports articulatory rate and SPL measures. Analyses for articulatory rate indicated a significant Condition effect [F(2, 36) = 70.86, p < 0.0001] as well as a Group × Condition interaction [F(4, 36) = 2.81, p = 0.0395]. The main effect of Group was not significant. Post hoc testing further indicated a reduced articulatory rate for the Slow-Habitual (p < 0.0001) and Slow-Loud (p < 0.0001) contrasts. Articulatory rate also was significantly slower in the Loud versus Habitual condition (p = 0.0016). Within each speaker group, only the Slow-Habitual and Slow-Loud contrasts were statistically significant. For SPL, there also was a significant Condition effect [F(2, 36) = 123.85, p < 0.0001], but no Group effect or Group × Condition interaction. Post hoc testing indicated higher SPLs for the Loud-Habitual and Loud-Slow (p < 0.0001) contrasts. The Habitual-Slow contrast also was significant (p = 0.0419), but the mean difference for these conditions was only about 1 dB.
Table 3.
Means and standard deviations (values in parentheses) are reported for number of speech runs, articulatory rate and SPL
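Group | Speech runs (n), habitual | Speech runs (n), loud | Speech runs (n), slow | Articulatory rate (syllables/s), habitual | Articulatory rate (syllables/s), loud | Articulatory rate (syllables/s), slow | SPL (dB SPL), habitual | SPL (dB SPL), loud | SPL (dB SPL), slow |
---|---|---|---|---|---|---|---|---|---|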
Control | 11 (4) | 12 (3) | 24 (8) | 4.25 (0.27) | 3.93 (0.34) | 2.91 (0.54) | 81.21 (2.85) | 86.61 (2.78) | 80.02 (3.36) |
MS | 17 (9) | 19 (10) | 32 (19) | 3.69 (0.64) | 3.48 (0.73) | 2.87 (0.61) | 80.72 (2.78) | 86.59 (2.44) | 79.80 (4.20) |
PD | 18 (7) | 16 (7) | 26 (10) | 4.08 (0.81) | 3.86 (0.59) | 3.33 (0.57) | 80.30 (3.92) | 85.67 (3.58) | 80.06 (4.04) |
To summarize, on average, articulatory rate was reduced to about 75% of Habitual in the Slow condition. The Slow condition further elicited a reduced articulatory rate relative to the Habitual and Loud conditions, and this trend held for 36 of the 40 speakers. The Loud condition elicited about a 5-dB increase in vocal intensity relative to the Habitual and Slow conditions. This trend held for each of the 40 speakers.
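The summary figures above can be approximated from the group means in table 3. The sketch below weights each group mean by its number of speakers (14 control, 15 MS, 11 PD). This weighting is an assumption about how the pooled values were obtained, so the output is a rough check, not a reproduction of the original computation.

```python
# Number of speakers per group, from the Speakers section.
GROUP_N = {"Control": 14, "MS": 15, "PD": 11}

# Group means from table 3, ordered (habitual, loud, slow).
RATE_SYLL_PER_S = {
    "Control": (4.25, 3.93, 2.91),
    "MS": (3.69, 3.48, 2.87),
    "PD": (4.08, 3.86, 3.33),
}
SPL_DB = {
    "Control": (81.21, 86.61, 80.02),
    "MS": (80.72, 86.59, 79.80),
    "PD": (80.30, 85.67, 80.06),
}
HABITUAL, LOUD, SLOW = 0, 1, 2


def weighted_mean(values_by_group, condition):
    """Speaker-weighted mean of the group means for one condition."""
    total_n = sum(GROUP_N.values())
    weighted_sum = sum(
        GROUP_N[group] * values_by_group[group][condition] for group in GROUP_N
    )
    return weighted_sum / total_n


slow_rate_ratio = (
    weighted_mean(RATE_SYLL_PER_S, SLOW) / weighted_mean(RATE_SYLL_PER_S, HABITUAL)
)
loud_spl_gain_db = weighted_mean(SPL_DB, LOUD) - weighted_mean(SPL_DB, HABITUAL)

print(f"Slow rate as a proportion of habitual: {slow_rate_ratio:.2f}")
print(f"Loud minus habitual SPL: {loud_spl_gain_db:.1f} dB")
```

Under these assumptions, the pooled slow rate comes out near three quarters of the habitual rate and the loud condition between 5 and 6 dB above habitual, in line with the summary in the text.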
F0 Measures
Figure 2 reports F0 mean, minimum, maximum and slope as a function of group and condition. Each symbol represents the average value for an individual speaker. Table 4 reports descriptive statistics for measures of F0 variation as a function of group, sex, and condition.
Fig. 2.
F0 mean, minimum, maximum and slope data are reported as a function of speaker group and condition. Each symbol represents the average value for an individual speaker. Symbol shape indicates group affiliation, with unfilled and filled symbols corresponding to female and male speakers, respectively.
Table 4.
Summary statistics in the form of means and standard deviations (SD, values in parentheses) are reported for measures of F0 variation
Group | Sex | F0 range (Hz), habitual | F0 range (Hz), loud | F0 range (Hz), slow | F0 IQR (Hz), habitual | F0 IQR (Hz), loud | F0 IQR (Hz), slow | F0 SD (Hz), habitual | F0 SD (Hz), loud | F0 SD (Hz), slow |
---|---|---|---|---|---|---|---|---|---|---|
Control | male | 111 (27) | 129 (28) | 101 (21) | 28 (12) | 35 (12) | 27 (14) | 22 (7) | 26 (6) | 22 (7) |
MS | male | 86 (38) | 92 (32) | 73 (40) | 20 (17) | 24 (9) | 17 (6) | 17 (6) | 19 (7) | 15 (7) |
PD | male | 72 (18) | 88 (23) | 70 (29) | 17 (6) | 23 (9) | 17 (9) | 14 (4) | 19 (16) | 14 (7) |
Control | female | 185 (62) | 209 (52) | 146 (34) | 36 (19) | 42 (17) | 30 (11) | 37 (13) | 42 (11) | 32 (9) |
MS | female | 178 (59) | 203 (52) | 148 (60) | 40 (18) | 44 (13) | 34 (16) | 39 (18) | 43 (11) | 34 (13) |
PD | female | 141 (26) | 164 (17) | 121 (29) | 27 (6) | 37 (6) | 26 (10) | 27 (6) | 33 (3) | 27 (11) |
IQR = Interquartile range.
With the exception of F0 minimum, the main effect of Condition was significant in analyses for all dependent variables. The typical pattern of results was such that F0 measures were highest or greatest in the Loud condition, followed by the Habitual and Slow conditions. For example, collapsing data across groups and speakers, F0 range was greatest in the Loud condition (male mean = 105 Hz; female mean = 196 Hz), followed by the Habitual (male mean = 91 Hz; female mean = 172 Hz) and Slow (male mean = 83 Hz; female mean = 141 Hz) conditions. For slope, however, measures were greatest or closest to 0 (i.e., indicating shallower slopes) in the Slow condition (male mean = −0.008 Hz/ms; female mean = −0.009 Hz/ms) as compared to the Loud (male mean = −0.011 Hz/ms; female mean = −0.022 Hz/ms) and Habitual (male mean = −0.014 Hz/ms; female mean = −0.020 Hz/ms) conditions. Post hoc tests further indicated that the Habitual-Loud and Loud-Slow comparisons were usually significant (p < 0.05). Many F0 measures also tended to differ in the Habitual and Slow conditions, but this difference was not always statistically significant. Analyses for slope and F0 range were exceptions. Finally, the main effect of Group was only significant in the statistical analysis for F0 minima [F(2, 36) = 4.69, p = 0.0155], where values were highest for the PD group (male mean = 98 Hz; female mean = 87 Hz) followed by the MS (male mean = 69 Hz; female mean = 79 Hz) and Control (male mean = 60 Hz; female mean = 73 Hz) groups. Post hoc testing indicated that only the PD-Control comparison was significant (p = 0.0128). The Group × Condition interaction was not significant in any of the analyses.
Upwards of 80% of speakers followed these group trends. For example, 34/40 speakers used a greater F0 range in the Loud versus Habitual condition, 33/40 speakers used a greater F0 range in the Habitual versus Slow condition, and 37/40 speakers used a greater F0 range in the Loud versus the Slow condition. F0 slope was the exception as only 26 of 40 speakers exhibited a shallower F0 slope in the Slow condition as compared to the Habitual condition. Speakers in all groups departed from the predominant trend or pattern for global F0 slope. Lastly, statistical analyses of focal F0 slope data failed to reveal any significant main effects or interactions. As such, these data are not considered further.
Discussion
An increase in vocal intensity of approximately 5 dB SPL relative to habitual or typical speech was sufficient to elicit an increase in F0 mean, maximum, range, standard deviation, and interquartile range. These results are consistent with findings reported in previous studies of normal speech and studies of PD [12,15,16,17,18,19]. Findings also are consistent with a study examining the impact of increased loudness on F0 characteristics for 2 speakers with dysarthria secondary to MS [35]. In contrast, a slowing of articulatory rate to about 75% of habitual had a very different effect on F0. Relative to the habitual condition, a slower-than-normal articulatory rate was associated with a reduction in F0 mean, maximum and range as well as a more gradually declining F0 across speech runs, as indicated by measures of F0 slope. These findings for the slow condition also support findings from previous studies of normal speech [15,25,26,27], and even more importantly extend our understanding of how a slower-than-normal articulatory rate affects utterance-level F0 characteristics for a reading passage produced by speakers with dysarthria.
Measures of F0 variation for speakers with dysarthria were of particular interest owing to their potential contribution to intelligibility. When speakers with PD and MS in the present study increased vocal intensity, utterance-level prosodic variation also increased. In contrast, utterance-level prosodic variation was reduced relative to typical speech when speakers with dysarthria used a slower-than-normal articulatory rate. Table 4 further suggests that when habitual F0 range was the most compressed, a slower-than-normal rate had the least impact on prosodic variation. For example, the average F0 range for male speakers with PD in the habitual condition was 72 Hz, as compared to 111 Hz for control males and 86 Hz for the MS males. In the slow condition, F0 range decreased only to an average of 70 Hz for PD males (about a 3% decrease), as compared to 101 Hz for control males (about a 9% decrease) and 73 Hz for MS males (about a 15% decrease). A similar trend is evident for MS and PD females. Thus, concerns that articulatory rate reduction will negatively impact on prosodic variation in dysarthria seem to be somewhat mitigated for speakers having a relatively more monotonous habitual speech pattern. Because a variety of techniques may be used clinically to reduce articulatory rate in dysarthria, however, future studies would help to determine whether certain rate reduction techniques are preferred for preserving utterance-level F0 variation.
Results of the current study also support existing studies suggesting that an increased vocal intensity or loud speech may be useful for enhancing F0 variation in dysarthria [12]. Inspection of the data in figure 2 indicates that all of the F0 adjustments for loud speech appear to be within the range of what would be considered appropriate for speakers of both genders. The point is not trivial because speech naturalness or perceived severity may be reduced if global dysarthria treatment techniques are not implemented in such a way that an appropriate overall prosodic profile is achieved [36,37].
In the present study, the reduction in articulatory rate in the loud condition relative to the habitual condition was not statistically significant within any speaker group, but other studies have reported that an increased vocal intensity is accompanied by a reduction in articulatory rate as well as enhanced F0 variation in dysarthria [12]. Thus, a slowing of articulatory rate in dysarthria need not preclude the possibility of enhanced F0 variation, if the reduction in rate is accomplished within the context of increased vocal loudness. Clear speech also shows promise as a therapeutic technique for simultaneously slowing articulatory rate and enhancing prosodic variation. Studies of clear speech in neurologically normal talkers, for example, suggest that clear speech is associated with enhanced F0 variation, lengthened segment durations and reduced articulatory rate, as well as improved intelligibility [38,39,40,41]. The few published studies investigating clear speech in dysarthria also suggest that relative to habitual speech, clear or hyperarticulate speech is associated with improved intelligibility, reduced articulatory rate or lengthened segment durations, and increased phrase or sentence-level F0 variation [42,43,44]. It remains to be determined whether clear or loud speech is relatively more effective in simultaneously reducing articulatory rate and enhancing or at least maintaining utterance-level F0 variation for speakers with a variety of neurological diagnoses and dysarthrias.
Maximizing intelligibility is an important treatment goal for many patients with dysarthria. Based on the present findings of increased F0 variation in the loud condition and a tendency toward reduced F0 variation in the slow condition relative to habitual speech, it is tempting to conclude that therapeutic techniques focusing on increasing vocal loudness might be preferred to techniques focusing on rate reduction if dysarthria treatment aims to maximize intelligibility. The current study was an investigation of speech production characteristics, however, and future studies are needed to quantitatively evaluate the relationship between perceptual judgments of intelligibility and F0 adjustments accompanying a slower-than-normal rate and increased loudness for speakers with dysarthria. Intelligibility is a complex construct reflecting at minimum the combined influence of segmental, suprasegmental, linguistic and listener variables. F0 – or prosodic variation – appears to interact with segmental integrity to impact on intelligibility in ways that are only beginning to be understood [e.g., [6,9]]. A variety of studies suggest that a slower-than-normal articulatory rate and increased vocal loudness may impact segmental characteristics in dysarthria. The manner in which these types of segmental adjustments may interact with F0 to ultimately impact perceptual judgments of intelligibility or naturalness in dysarthria requires further study. Studies employing speech resynthesis will likely prove helpful in this regard.
Finally, speakers with MS and PD in the present study were judged to exhibit dysarthria. However, with the exception of measures of F0 minima, there were no group differences in measures of F0 or in how F0 measures were affected by a slower-than-normal rate or increased vocal intensity. The implication is that prosody – at least as indexed by measures of F0 – was only mildly affected for speakers with dysarthria in the current study. Future studies are needed to determine how increased loudness and a slower-than-normal rate affect F0 characteristics in more severe dysarthria.
In conclusion, increased vocal intensity and a slower-than-normal rate were found to have opposite effects on F0 characteristics for a reading passage produced by speakers with PD, speakers with MS, and healthy controls. An increased vocal intensity was associated with adjustments in measures of F0 variation that have the potential to be beneficial to intelligibility, and a slower-than-normal rate was associated with changes in measures of F0 variation potentially detrimental to intelligibility. It is important for future studies to determine the relationship – if any – of these types of F0 adjustments to perceptual impressions of intelligibility and/or speech naturalness.
Acknowledgements
We thank Beth Hilczmayer and Erin Szajta for their assistance with the acoustic analyses as well as Miranda Crumb and Grace Liu for their assistance with data reduction and entry. Portions of this study were reported at the 2008 Conference on Motor Speech. Research supported by NIDCD R01DC004689.
References
- 1. Binns C, Culling JF. The role of fundamental frequency contours in the perception of speech against interfering speech. J Acoust Soc Am. 2007;122:1765–1776. doi: 10.1121/1.2751394.
- 2. Bradlow AR, Torretta GM, Pisoni DB. Intelligibility of normal speech. I. Global and fine-grained acoustic-phonetic talker characteristics. Speech Commun. 1996;20:255–272. doi: 10.1016/S0167-6393(96)00063-5.
- 3. Bunton K, Kent RD, Kent JF, Duffy JR. The effects of flattening fundamental frequency contours on sentence intelligibility in speakers with dysarthria. Clin Linguist Phon. 2001;15:181–193.
- 4. Laures J, Weismer G. The effect of a flattened f0 on intelligibility at the sentence-level. J Speech Lang Hear Res. 1999;42:1148–1156. doi: 10.1044/jslhr.4205.1148.
- 5. Maassen B, Povel D. The effect of correcting fundamental frequency on the intelligibility of deaf speech and its interaction with temporal aspects. J Acoust Soc Am. 1984;76:1673–1681. doi: 10.1121/1.391614.
- 6. Spitzer SM, Liss JM, Mattys SL. Acoustic cues to lexical segmentation: a study of resynthesized speech. J Acoust Soc Am. 2007;122:3678–3687. doi: 10.1121/1.2801545.
- 7. Watson PJ, Schlauch RS. The effect of fundamental frequency on the intelligibility of speech with flattened intonation contours. Am J Speech Lang Pathol. 2008;17:348–355. doi: 10.1044/1058-0360(2008/07-0048).
- 8. Wingfield A, Lombardi L, Sokol S. Prosodic features and the intelligibility of accelerated speech: syntactic versus periodic segmentation. J Speech Hear Res. 1984;27:125–134. doi: 10.1044/jshr.2701.128.
- 9. Bunton K. Fundamental frequency as a perceptual cue for vowel identification in speakers with Parkinson disease. Folia Phoniatr Logop. 2006;58:323–339. doi: 10.1159/000094567.
- 10. Kain A, Hosom JP, Niu X, van Santen J, Fried-Oken M, Staehely J. Improving the intelligibility of dysarthric speech. Speech Commun. 2007;49:743–759.
- 11. Duffy JR. Motor Speech Disorders: Substrates, Differential Diagnosis, and Management. ed 2. St. Louis: Mosby; 2005.
- 12. Yorkston KM, Hakel M, Beukelman DR, Fager S. Evidence for effectiveness of treatment of loudness, rate, or prosody in dysarthria: a systematic review (ANCDS Bulletin Board). J Med Speech Lang Pathol. 2007;15:xi–xxxvi.
- 13. Tjaden K, Wilding GE. Rate and loudness manipulations in dysarthria: acoustic and perceptual findings. J Speech Lang Hear Res. 2004;47:766–783. doi: 10.1044/1092-4388(2004/058).
- 14. Tjaden K, Wilding GE. Effect of rate reduction and increased loudness on acoustic measures of anticipatory coarticulation in multiple sclerosis and Parkinson's disease. J Speech Lang Hear Res. 2005;48:261–277. doi: 10.1044/1092-4388(2005/018).
- 15.Dromey C, Ramig LO. Intentional changes in sound pressure level and rate: their impact on measures of respiration, phonation, and articulation. J Speech Lang Hear Res. 1998;41:1003–1018. doi: 10.1044/jslhr.4105.1003. [DOI] [PubMed] [Google Scholar]
- 16.Holmberg EB, Hillman RE, Perkell JS. Glottal airflow and transglottal air pressure measurements for male and female speakers in soft, normal, and loud voice. J Acoust Soc Am. 1988;84:511–529. doi: 10.1121/1.396829. [DOI] [PubMed] [Google Scholar]
- 17.Patel R, Schell KW. The influence of linguistic content on the Lombard effect. J Speech Lang Hear Res. 2008;51:209–220. doi: 10.1044/1092-4388(2008/016). [DOI] [PubMed] [Google Scholar]
- 18.Stathopoulos ET, Sapienza CM. Respiratory and laryngeal function of women and men during vocal intensity variation. J Speech Hear Res. 1993;36:64–75. doi: 10.1044/jshr.3601.64. [DOI] [PubMed] [Google Scholar]
- 19.Stathopoulos ET, Sapienza CM. Developmental changes in laryngeal and respiratory function with variations in sound pressure level. J Speech Lang Hear Res. 1997;40:595–614. doi: 10.1044/jslhr.4003.595. [DOI] [PubMed] [Google Scholar]
- 20.Kleinow J, Smith A, Ramig LO. Speech motor stability in IPD: effects of rate and loudness manipulations. J Speech Lang Hear Res. 2001;44:1041–1051. doi: 10.1044/1092-4388(2001/082). [DOI] [PubMed] [Google Scholar]
- 21.McHenry MA. The effect of pacing strategies on the variability of speech movement sequences in dysarthria. J Speech Lang Hear Res. 2003;46:702–710. doi: 10.1044/1092-4388(2003/055). [DOI] [PubMed] [Google Scholar]
- 22.McHenry MA, Liss JM. The impact of stimulated vocal loudness of nasalance in dysarthria. J Me Speech Lang Pathol. 2006;14:197–205. [Google Scholar]
- 23.LeDorze G, Dionne L, Ryalls J, Julien M, Ouellet L. The effects of speech and language therapy for a case of dysarthria associated with Parkinson's disease. Eur J Disord Commun. 1992;27:313–324. doi: 10.3109/13682829209012043. [DOI] [PubMed] [Google Scholar]
- 24.Wang YT, Kent RD, Duffy JR, Thomas JE. Dysarthria in traumatic brain injury: a breath group and intonational analysis. Folia Phoniatr Logop. 2005;57:59–89. doi: 10.1159/000083569. [DOI] [PubMed] [Google Scholar]
- 25.Cooper WE, Sorensen JM. Fundamental Frequency in Sentence Production. New York: Springer; 1981. [Google Scholar]
- 26.Steppling ML, Montgomery AA. Perception and production of rise-fall intonation in American speech. Percept Psychophys. 2002;64:451–461. doi: 10.3758/bf03194717. [DOI] [PubMed] [Google Scholar]
- 27.Ladd DR, Faulkner D, Faulkner H, Schepman A. Constant segmental anchoring of F0 movements under changes in speech rate. J Acoust Soc Am. 1999;106:1543–1554. doi: 10.1121/1.427151. [DOI] [PubMed] [Google Scholar]
- 28.Weismer G, Jeng J, Laures JS, Kent R. Acoustic and intelligibility characteristics of sentence production in neurogenic speech disorders. Folia Phoniatr Logop. 2001;53:1–18. doi: 10.1159/000052649. [DOI] [PubMed] [Google Scholar]
- 29.Milenkovic P. Cspeech, version 4.0, computer program. Madison: University of Wisconsin; 1997. [Google Scholar]
- 30.Turner GS, Weismer G. Characteristics of speaking rate in the dysarthria associated with amyotrophic lateral sclerosis. J Speech Hear Res. 1993;36:1134–1144. doi: 10.1044/jshr.3606.1134. [DOI] [PubMed] [Google Scholar]
- 31.Milenkovic P. Computer program. Madison: University of Wisconsin; 2003. TF32. [Google Scholar]
- 32.Kim HH: Monotony of speech production in Parkinson's disease: acoustic entities and their perceptual relations; unpublished PhD diss University of Wisconsin-Madison, 1994.
- 33.Lieberman P. Intonation, perception and language. Cambridge: MIT Press; 1967. [Google Scholar]
- 34.Lieberman P, Katz W, Jongman A, Zimmerman R, Miller M. Measures of the sentence intonation of read and spontaneous speech in American English. J Acoust Soc Am. 1985;77:649–657. doi: 10.1121/1.391883. [DOI] [PubMed] [Google Scholar]
- 35.Sapir S, Pawless AA, Ramig LO, Seeley E, Fox C, Corby J. Effects of intensive phonatory-respiratory treatment (LSVT) on voice in two individuals with multiple sclerosis. J Med Speech Lang Pathol. 2001;9:141–151. [Google Scholar]
- 36.Liss J. The role of speech perception in motor speech disorders. In: Weismer G, editor. Motor Speech Disorders. San Diego: Plural; 1967. pp. 187–220. [Google Scholar]
- 37.Yorkston K, Beukelman D, Strand E, Bell K. Management of Motor Speech Disorders in Children and Adults. Austin: Pro-Ed; 1999. [Google Scholar]
- 38.Bradlow A, Kraus N, Hayes E. Speaking clearly for children with learning disabilities: sentence perception in noise. J Speech Lang Hear Res. 2003;46:80–97. doi: 10.1044/1092-4388(2003/007). [DOI] [PubMed] [Google Scholar]
- 39.Krause JC, Braida LD. Acoustic properties of naturally produced clear speech at normal speaking rates. J Acoust Soc Am. 2004;15:362–378. doi: 10.1121/1.1635842. [DOI] [PubMed] [Google Scholar]
- 40.Ferguson SH, Kewley-Port D. Talker differences in clear and conversational speech: acoustic characteristics of vowels. J Speech Hear Res. 2007;50:1241–1255. doi: 10.1044/1092-4388(2007/087). [DOI] [PubMed] [Google Scholar]
- 41.Picheny MA, Durlach NI, Braida LD. Speaking clearly for the hard of hearing. II. Acoustic characteristics of clear and conversational speech. J Speech Hear Res. 1986;29:434–446. doi: 10.1044/jshr.2904.434. [DOI] [PubMed] [Google Scholar]
- 42.Beukelman DR, Fager S, Ullman C, Hanson E, Logemann J. The impact of speech supplementation and clear speech on the intelligibility and speaking rate of people with traumatic brain injury. J Med Speech Lang Pathol. 2002;10:237–242. [Google Scholar]
- 43.Dromey C. Articulatory kinematics in patients with Parkinson disease using different speech treatment approaches. J Med Speech Lang Pathol. 2000;8:155–161. [Google Scholar]
- 44.Goberman AM, Elmer LW. Acoustic analysis of clear versus conversational speech in individuals with Parkinson disease. J Commun Disord. 2004;38:215–230. doi: 10.1016/j.jcomdis.2004.10.001. [DOI] [PubMed] [Google Scholar]