Author manuscript; available in PMC: 2017 Aug 16.
Published in final edited form as: Clin Linguist Phon. 2014 May 29;28(11):857–878. doi: 10.3109/02699206.2014.921839

Relationship between acoustic measures and judgments of intelligibility in Parkinson’s disease: A within-speaker approach

LYNDA FEENAUGHTY 1, KRIS TJADEN 1, JOAN SUSSMAN 1
PMCID: PMC5558195  NIHMSID: NIHMS892429  PMID: 24874184

Abstract

This study investigated the acoustic basis of within-speaker, across-utterance variation in sentence intelligibility for 12 speakers with dysarthria secondary to Parkinson’s disease (PD). Acoustic measures were also obtained for 12 healthy controls for comparison to speakers with PD. Speakers read sentences using their typical speech style. Acoustic measures of speech rate, articulatory rate, fundamental frequency, sound pressure level and F2 interquartile range (F2 IQR) were obtained. A group of listeners judged sentence intelligibility using a computerized visual-analog scale. Relationships between judgments of intelligibility and acoustic measures were determined for individual speakers with PD. Relationships among acoustic measures were also quantified. Although considerable variability was noted, articulatory rate, fundamental frequency and F2 IQR were most frequently associated with within-speaker variation in sentence intelligibility. Results suggest that diversity among speakers with PD should be considered when interpreting results from group analyses.

Keywords: Acoustics, intelligibility, Parkinson’s disease

Introduction

Intelligibility has been defined as the degree to which an individual acoustic signal is understood by a listener (Weismer, 2008). Various metrics are used to assess sentence-level intelligibility in dysarthria including the Assessment of Intelligibility of Dysarthric Speech (AIDS; Yorkston & Beukelman, 1981) or the computerized version of this published test, the Sentence Intelligibility Test (SIT; Yorkston, Beukelman, & Tice, 1996). These tests as well as other intelligibility metrics yield a numerical score. However, while numerical scores provide an estimate of the overall speech mechanism involvement in dysarthria (Tjaden & Wilding, 2011a; Weismer, Jeng, Laures, Kent, & Kent, 2001; Weismer, Yunusova, & Bunton, 2012), numerical scores alone do not provide insight concerning production variables contributing to intelligibility. As discussed in the following section, many studies therefore have sought to determine the relationship of acoustic measures to intelligibility.

Potential acoustic correlates of intelligibility

Numerous acoustic measures have been proposed to contribute to variations in intelligibility in dysarthria including supra-segmental and segmental acoustic measures. The following review focuses on dysarthria studies investigating acoustic measures of interest to this study. Relevant studies of normal speech are also considered. A more comprehensive review of possible acoustic correlates of intelligibility, including consonant spectral measures and rhythm metrics that were not included in this study, may be found in several sources (Kim, Hasegawa-Johnson, & Perlman, 2011a; Liss, Utianski, & Lansford, 2013; Weismer et al., 2012; Weismer, 2008; Yunusova, Weismer, Kent, & Rusche, 2005).

Global speech timing

Speech rate has been defined as the number of output units (i.e. syllables) per unit time including pauses, while articulation rate is the number of syllables per unit time excluding pauses (Tsao, Weismer, & Iqbal, 2006). The relationship between measures of global speech timing and intelligibility varies across studies. Some studies report improved speech intelligibility when speech rate is slowed (Hammen, Yorkston, & Minifie, 1994; Yorkston, Beukelman, Strand, & Bell, 1999; Yorkston, Hammen, Beukelman, & Traynor, 1990), while other studies report decreased or no improvement in intelligibility when speech or articulation rate is slowed (McRae, Tjaden, & Schoonings, 2002; Van Nuffelen, De Bodt, Vanderwegen, Van de Heyning, & Wuyts, 2010; Van Nuffelen, De Bodt, Wuyts, & Van de Heyning, 2009). Thus, the relationship between global speech timing and intelligibility appears to be complex.

Fundamental frequency (F0)

Prosodic variables, such as sentence-level measures of F0, have also been related to intelligibility. Bunton, Kent, Kent and Duffy (2001) examined sentence-level F0 for speakers with dysarthria. Results suggested that when F0 range was reduced using resynthesis techniques, decrements in intelligibility were observed. Studies investigating healthy talkers further suggest a relationship between increased sentence-level F0 range and intelligibility. For example, healthy speakers judged to be more intelligible have been shown to have a greater sentence-level F0 range (Bradlow, Torretta, & Pisoni, 1996; but see Miller, Schlauch, & Watson, 2010). Laures and Weismer (1999) as well as Watson and Schlauch (2008) reported that when natural F0 variations (i.e. F0 contour) were flattened using speech resynthesis, sentence intelligibility was reduced. Studies investigating clear speech and speech produced with increased vocal intensity also suggest that an increased F0 range is associated with higher intelligibility (Picheny, Durlach, & Braida, 1986; Sapir, Ramig, & Fox, 2011; Tjaden, Kain, & Lam, 2014a).

In addition to overall sentence-level measures of F0, F0 measures of adjacent syllables reflecting reduced F0 variation have also been linked to reductions in intelligibility. Liss, Spitzer, Caviness, Adler, and Edwards (1998, 2000) investigated the degree to which acoustic manifestations (i.e. phrase duration, F0, amplitude variation and formant frequencies) of reduced syllabic contrast correlated with lexical boundary errors in dysarthria. Results were interpreted to suggest that reduced syllabic contrasts contribute to reduced intelligibility in PD by hindering a listener’s ability to locate lexical boundaries within an utterance.

Sound pressure level (SPL)

Studies have also evaluated the link between SPL and speech intelligibility (Neel, 2009; Ramig, Countryman, Thompson, & Horii, 1995; Tjaden & Wilding, 2004). Ramig, Bonitati, Lemke, and Horii (1994) and Cannito et al. (2012) reported the effects of Lee Silverman Voice Treatment (i.e. LSVT™) on intelligibility. Results from Ramig et al. (1994) and Cannito et al. (2012) suggested that following treatment, mean SPL increased and intelligibility also improved. The basis for the improved intelligibility is unclear, however. Acoustic and kinematic studies suggest that intelligibility improvements associated with increased vocal intensity could also be related to changes in source characteristics such as an increase in F0 or F0 range (Dromey & Ramig, 1998; Picheny et al., 1986; Tjaden et al., 2014a) or improved segmental articulation (Dromey & Ramig, 1998; Dromey, Ramig, & Johnson, 1995) in the form of an expanded vowel space area (Tjaden & Wilding, 2004) and increased spectral distinctiveness of stop consonants (Tjaden & Turner, 1997). Furthermore, the positive influence of a greater-than-normal vocal intensity on intelligibility may be related to an improved signal-to-noise ratio or audibility, although improved audibility alone does not fully account for improvements in intelligibility associated with an increased intensity (Neel, 2009; Kim & Kuo, 2012). Thus, while an increased SPL may be associated with improved intelligibility, the precise explanation for this relationship is a topic of ongoing study.

SPL modulation is another variable that has been shown to be associated with intelligibility. Bunton, Kent, Kent, and Rosenbek (2000) investigated intensity range of individuals with Amyotrophic Lateral Sclerosis (ALS), Cerebellar Disease (CD) and healthy controls. Significant differences in intensity range were not reported between the control and clinical groups. However, a restricted intensity range tended to be associated with reduced speech intelligibility for the ALS group with moderate intelligibility. Studies of clear speech for normal talkers as well as dysarthria also suggest the importance of energy modulation to intelligibility (Krause & Braida, 2009; Tjaden et al., 2014a). In contrast, other dysarthria studies have found no relationship between intensity range and intelligibility (Kim, Kent, & Weismer, 2011b).

Vowel segmental integrity

Various measures of vowel segmental production have been shown to be related to speech intelligibility including vowel space area and F2 slope. Vowel space area is generated by plotting the frequencies of the first two formants (F1, F2) (Turner, Tjaden, & Weismer, 1995). In this manner, vowel space area provides an overall estimate of a speaker’s vowel articulatory-acoustic working space, with greater vowel space areas suggesting a greater difference in articulatory position for vowels (Weismer et al., 2012). Similarly, reduced spectral contrasts among vowels as partially indexed by vowel space area have been reported to be associated with reduced intelligibility (Kim et al., 2011a; McRae et al., 2002; Weismer et al., 2001; but see also Weismer, Laures, Jeng, Kent, & Kent, 2000).

Studies have also shown a relationship between F2 slope and intelligibility. F2 slope corresponds to the rate of vocal tract shape change for vowels, with shallower slopes associated with slower tongue movement speeds (Yunusova et al., 2012). Recently, Kim et al. (2011b) investigated the relationship between various acoustic measures, including F2 slope, and speech intelligibility in speakers varying in dysarthria severity. Results indicated that shallower F2 slopes were associated with poorer speech intelligibility (see also Kim, Weismer, Kent, & Duffy, 2009; Tjaden, Richards, Kuo, Wilding, & Sussman, 2013; Yunusova et al., 2012).

Group versus within-speaker designs

Many of the dysarthria studies reviewed in the preceding section employed a group design, wherein data were pooled across speakers prior to quantifying the strength of the association between a given acoustic measure and intelligibility. Because speakers in these studies typically spanned a range of dysarthria severity, the association between acoustic measures and intelligibility could reflect the influence of a third, mediating variable (see discussion in Kim et al., 2011b; Weismer et al., 2001; Yunusova et al., 2005). Yunusova et al. (2005), however, investigated the acoustic basis of within-speaker variation in intelligibility for speakers with dysarthria secondary to PD and ALS. A novel acoustic measure, F2 interquartile range (F2 IQR) was generated by extracting second formant time histories for all voiced segments within a breath group and calculating the difference between the first and third quartiles. Results suggest that F2 IQR partially explained within-speaker variation in intelligibility for some speakers with PD. The results illustrate the feasibility of the within-speaker approach to identify acoustic variables that explain variation in intelligibility in dysarthria.

Summary and purpose

Intelligibility is an important construct in dysarthria. Although studies using a group design have made progress toward identifying acoustic variables potentially contributing to intelligibility in dysarthria, our understanding of the acoustic basis of intelligibility in dysarthria is incomplete. Yunusova et al.’s (2005) study demonstrates how a within-speaker, across-utterance design avoids the third-variable effect and thus may help to determine acoustic variables integral to intelligibility. However, additional studies are needed to determine the utility of the within-speaker approach. Thus, this study used a single subject design to investigate potential acoustic variables associated with within-speaker variations in intelligibility in dysarthria secondary to PD. Healthy controls were included to provide a baseline against which to compare acoustic measures for speakers with PD and to aid in interpreting the results. As in Yunusova et al.’s (2005) study, the relationship between intelligibility and acoustic variables was not of interest for healthy speakers. Previous research with healthy speakers suggests a limited range of within-speaker, across-utterance intelligibility variation (Monsen, 1983). Further, the relationship between acoustic measures and intelligibility was also not of interest for healthy speakers due to the prospect of ceiling effects.

Although many potential correlates of intelligibility could be investigated, articulatory rate and measures of F0, SPL and F2 IQR were selected for study because previous research suggests that these measures at least moderately correlate with intelligibility. Care was taken to include a diverse array of segmental and suprasegmental acoustic measures. Further, F2 IQR was used to index vowel segmental production because measures such as vowel space area and F2 slope may vary as a function of overall dysarthria severity rather than variation in intelligibility (Weismer et al., 2001). In addition, F2 IQR also does not require utterances to contain specific phonetic content. For each speaker with PD, the main research questions were whether greater modulations in F0 and SPL as well as relatively greater F2 IQR were associated with greater scaled sentence intelligibility.

Methods

Speakers

Twelve speakers (7 men, 5 women) with idiopathic PD were studied as well as 12 age and sex-matched healthy speakers (7 men, 5 women). Healthy speakers ranged in age from 50 to 77 years (Mean = 63 years; SD = 10 years). Selected perceptual and acoustic characteristics have been previously reported for these speakers, who are part of a larger research project (Sussman & Tjaden, 2012; Tjaden, Lam, & Wilding, 2013; Tjaden, Sussman, & Wilding, 2014b). Characteristics for the PD and control groups are summarized in Table 1. Although five speakers with PD had a history of dysarthria therapy, control speakers reported no history of speech-language therapy. All speakers were native speakers of standard American English and had achieved at least a high school diploma. In addition, participants did not use a hearing aid, had thresholds of at least 40 dB HL in one ear at 1000 and 2000 Hz (Ventry & Weinstein, 1983), and scored 26 or greater on the Standardized Mini-Mental State Examination (Molloy, 1999). Participants with PD reported no history of other neurological disease or neurosurgical treatment.

Table 1.

Speaker characteristics are summarized.

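| Speaker | Age | Years Post Diagnosis | History of Speech Therapy | SIT % | Scaled Severity Grandfather Passage |
|---|---|---|---|---|---|
| CM01 | 55 | | | 97 | 0.43 |
| CM02 | 53 | | | 95 | 0.23 |
| CM03 | 70 | | | 90 | 0.31 |
| CM05 | 58 | | | 91 | 0.28 |
| CM06 | 52 | | | 94 | 0.47 |
| CM08 | 68 | | | 93 | 0.39 |
| CM09 | 63 | | | 92 | 0.27 |
| CF01 | 73 | | | 94 | 0.49 |
| CF02 | 77 | | | 91 | 0.42 |
| CF05 | 50 | | | 94 | 0.28 |
| CF17 | 75 | | | 95 | 0.32 |
| CF22 | 65 | | | 94 | 0.19 |
| MEAN | 63 | | | 93 | 0.34 |
| SD | 10 | | | 2 | 0.09 |
| RANGE | 50–77 | | | 90–97 | 0.19–0.49 |
| PDM01 | 76 | 12 | No | 87 | 0.46 |
| PDM02 | 65 | 8 | No | 89 | 0.50 |
| PDM03 | 58 | 13 | No | 89 | 0.65 |
| PDM04 | 55 | 5 | No | 92 | 0.10 |
| PDM06 | 66 | 3 | No | 91 | 0.20 |
| PDM07 | 67 | 32 | Yes | 90 | 0.63 |
| PDM08 | 78 | 4 | Yes | 90 | 0.80 |
| PDF01 | 76 | 20 | Yes | 84 | 0.85 |
| PDF04 | 48 | 11 | Yes | 90 | 0.54 |
| PDF05 | 74 | 2 | Yes | 87 | 0.66 |
| PDF06 | 75 | 5 | No | 90 | 0.62 |
| PDF08 | 63 | 2 | No | 89 | 0.26 |
| MEAN | 68 | 10 | | 89 | 0.52 |
| SD | 10 | 9 | | 2 | 0.23 |
| RANGE | 48–78 | 2–32 | | 84–92 | 0.10–0.85 |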

Participant code: Control (C), Parkinson’s Disease (PD), Male (M) and Female (F). The Sentence Intelligibility Test (SIT) score is the average sentence perceptual judgment of 10 inexperienced listeners, as reported in Sussman & Tjaden (2012).

Procedures for obtaining SIT scores (Yorkston et al., 1996) as well as scaled speech severity for the Grandfather Passage, an operationally defined perceptual construct that aims to tap into speech naturalness and prosody, were reported in Sussman and Tjaden (2012) as part of a larger project. Mean SIT scores and scaled speech severity for the Grandfather Passage are included in Table 1 to describe the speakers in this study. Mean scaled speech severity for the Grandfather Passage could range from 0 (“not impaired”) to 1.0 (“severely impaired”). Mann–Whitney U tests confirmed a significant difference in SIT scores (p<0.001) and scaled speech severity (p<0.05) for the Control and PD groups. The mildly reduced intelligibility for speakers with PD (i.e. 89%) compared to controls (i.e. 93%) coupled with moderate scale values for speech severity is consistent with mild dysarthria (Yorkston, Beukelman, Strand, & Hakel, 2010). Mean SIT scores and scaled speech severity for the reading passage were also somewhat reduced for the control group. Because the average age of the control group was 63 years, one possible explanation is that these scores represent listeners’ perception of speech characteristics consistent with normal aging adults. In addition, although listeners were blinded to the neurological status of the speakers, they were told that some speakers had a neurological diagnosis, which may have biased listeners to judge speech samples as being more impaired. As noted in Sussman and Tjaden (2012), PD speakers were also anecdotally noted to have reduced segmental precision and a breathy, monotonous voice.

Speech task and procedure

Experimental stimuli from which the acoustic and perceptual measures were obtained were 10 sentences randomly selected per speaker from a pool of 25 syntactically and semantically normal IEEE Harvard Psychoacoustic Sentences (IEEE, 1969). Sentences included both declaratives and imperatives and ranged in length from 7 to 9 words, with five key or content words.

Speakers were audio-recorded in a sound-treated room reading Harvard sentences (IEEE, 1969) in habitual, clear, fast, loud and slow conditions. For the habitual condition, which was of interest to the current study and was always recorded first, speakers were simply instructed to read sentences in their habitual or typical speaking style. The acoustic signal was transduced using an AKG C410 head mounted microphone positioned at a 45–50° angle and 10 cm from the center of the lips. The signal was pre-amplified, low pass-filtered at 9.8 kHz and digitized at 22 kHz using TF32 (Milenkovic, 2002). A 1000 Hz calibration tone was also recorded for use in calculating SPL (Lam, Tjaden, & Wilding, 2012). All speakers were paid $10 per hour. Speakers with PD were recorded approximately 1 h prior to taking anti-Parkinsonian medications.

Perceptual task and procedure

Methodological details concerning the perceptual task are reported in other studies (Tjaden et al., 2013, 2014b). Readers are referred to these previous studies for a more detailed treatment of this material.

Listeners

Fifty inexperienced listeners recruited from the University at Buffalo participated. Listeners were native speakers of standard American English, had at least a high school diploma or equivalent, were between 18 and 30 years of age, and reported no history of speech, language or hearing problems. All listeners passed a hearing screening administered at 20 dB HL bilaterally for octave frequencies ranging from 250 to 8000 Hz (ANSI, 1989). Listeners were paid $10 per hour.

Perceptual measure and procedure

As described in Tjaden et al. (2014a, b), listeners judged intelligibility for Harvard sentences using a continuous vertical 150 mm computerized Visual Analog Scale (VAS) (Cannito et al., 1997; Sussman & Tjaden, 2012), with scale endpoints labeled as “Understand everything” (0 = intelligible) and “Cannot understand anything” (1.0 = unintelligible).

As speakers with PD had mild dysarthria (Table 1), the experimental stimuli (i.e. 10 Harvard sentences per speaker) were mixed with 20-talker babble (Frank & Craig, 1984) to increase task difficulty and prevent ceiling effects. Various studies of dysarthria (Bunton, 2006; Cannito et al., 2012; McAuliffe, Schafer, O’Beirne, & LaPointe, 2009) and normal speech (Smiljanić & Bradlow, 2009; Uchanski, 2005) have also employed background noise when studying intelligibility. Sentences first were equated for peak amplitude using Goldwave, version 5.1 (2007) and then were mixed with babble at a signal-to-noise ratio of −3 dB. This level was determined via pilot testing to minimize both ceiling and floor effects and has been used in previous studies of normal speech as well as dysarthria (e.g. Ferguson & Kewley-Port, 2002; Maniwa, Jongman, & Wade, 2008; McAuliffe et al., 2009). Stimuli were presented to individual listeners via Sony MDRV300 headphones in a sound-treated booth at 75 dB HL for peak vowel amplitude of key words in sentences. Listeners judged each sentence without knowledge of speaker identity or whether a speaker had a neurological diagnosis.
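The mixing step can be illustrated with a minimal sketch. It assumes that the signal-to-noise ratio is defined on whole-sentence RMS amplitudes, which the text does not specify; the function name `mix_at_snr` and the toy signals are hypothetical and stand in for the actual recordings and babble.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mix_at_snr(speech, babble, snr_db):
    """Scale babble so the speech-to-babble RMS ratio equals snr_db, then add.

    Assumption: SNR is defined on whole-sentence RMS amplitudes.
    Returns (mixture, scaled_babble).
    """
    if len(babble) < len(speech):
        raise ValueError("babble must be at least as long as the speech")
    noise = babble[:len(speech)]
    # Solve 20*log10(rms(speech) / (gain * rms(noise))) = snr_db for gain.
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20.0))
    scaled = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled)]
    return mixture, scaled

# Toy signals standing in for a sentence and multi-talker babble (not real data).
speech = [math.sin(2 * math.pi * 0.05 * n) for n in range(200)]
babble = [0.3 * math.sin(2 * math.pi * 0.13 * n + 1.0) for n in range(300)]
mixture, scaled_babble = mix_at_snr(speech, babble, snr_db=-3.0)
achieved_snr = 20 * math.log10(rms(speech) / rms(scaled_babble))
```

At a negative signal-to-noise ratio the scaled babble has a larger RMS amplitude than the speech, which is what makes the task harder for listeners.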

Sentences were pooled and divided into 10 sets. Five listeners were assigned to judge each set. Sentence sets contained one sentence produced by each talker in each condition. Thus, a given listener judged a subset of sentences for each talker. Following a brief practice task, listeners judged the intelligibility of each sentence. Using a customized computer program, listeners were instructed to place the indicator of the computer mouse on the continuous 150 mm visual analog scale shown on a monitor. Scale endpoints were labeled “understand everything” and “cannot understand anything”. Following completion of the experiment, the computer program transformed the position of the mouse indicator to scores ranging from 0 to 1.0, with lower values indicating better intelligibility. Judgments were averaged across listeners to provide an average intelligibility judgment for each sentence.
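The transformation from scale position to score, and the averaging across listeners, can be sketched as follows. This is an illustration only: it assumes that position is measured in millimeters from the “understand everything” end of the scale and that the mapping is linear; the function names and the marks are hypothetical.

```python
import statistics

SCALE_LENGTH_MM = 150.0  # length of the visual analog scale described in the text

def vas_score(position_mm):
    """Map a mark on the scale to a score from 0 to 1.0 (lower = more intelligible)."""
    if not 0.0 <= position_mm <= SCALE_LENGTH_MM:
        raise ValueError("position must lie on the scale")
    return position_mm / SCALE_LENGTH_MM

def mean_intelligibility(marks_by_sentence):
    """Average the scores of all listeners who judged each sentence."""
    return {
        sentence: statistics.mean(vas_score(mm) for mm in marks)
        for sentence, marks in marks_by_sentence.items()
    }

# Hypothetical marks (mm from the "understand everything" end) from three listeners.
marks = {
    "sentence_01": [30.0, 45.0, 60.0],
    "sentence_02": [120.0, 135.0, 150.0],
}
scores = mean_intelligibility(marks)
```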

Listener reliability

Each listener judged a random selection of 10% of sentences twice for determining intrajudge reliability. Pearson product-moment correlation coefficients for the first and the second presentation of sentences ranged from 0.60 to 0.88 across the 50 listeners, with a mean of 0.71 (SD = 0.07). All correlations were also significant (p<0.001). As for the larger project (Tjaden et al., 2014a,b), a conservative approach was taken wherein listeners with intrajudge coefficients less than r = 0.70 were excluded from further consideration. Remaining analyses therefore reflect intelligibility judgments for 29 listeners (Mean intrajudge Pearson r = 0.76; SD = 0.05; Range 0.70–0.88). Interjudge reliability was assessed using the Intraclass correlation coefficient (ICC) following Neel (2009). Average ICCs ranged from 0.63 to 0.91 (Mean = 0.83; SD = 0.09) and were statistically significant (p<0.001). Reliability metrics compare favorably with those reported in previous dysarthria studies employing scaling tasks to measure intelligibility (e.g. Bunton et al., 2001; Kim et al., 2011b; Kim & Kuo, 2012; Neel, 2009; Van Nuffelen et al., 2009; Weismer et al., 2001; Yunusova et al., 2005).

Acoustic measures and procedures

Acoustic measures were performed by the first author or a trained research assistant using TF32 (Milenkovic, 2002). Each measure is considered in detail in the following sections.

Global speech timing

Sentences were segmented into speech runs and pauses. A speech run was defined as a stretch of speech bounded by silent pauses of more than 200 ms (Tjaden & Wilding, 2004; Turner & Weismer, 1993). Standard acoustic criteria were used to identify run onsets and offsets (Tjaden & Wilding, 2004). For each speaker and sentence, articulatory rates were calculated. Speech run durations for each sentence were summed to yield a total articulatory time. The number of syllables for each speech run was also obtained. Syllable counts were summed for each sentence, and the total was divided by the total articulatory time to calculate the articulatory rate in syllables per second. Over 90% of sentences did not contain interword pauses. Thus, measures of speech rate were not obtained.
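The articulatory rate calculation described above amounts to a ratio of two sums. A minimal sketch, with a hypothetical function name and made-up run durations and syllable counts:

```python
def articulatory_rate(speech_runs):
    """Syllables per second over a sentence, excluding pauses.

    speech_runs: list of (duration_in_seconds, syllable_count) tuples,
    one per speech run in the sentence.
    """
    total_time = sum(duration for duration, _ in speech_runs)
    total_syllables = sum(count for _, count in speech_runs)
    if total_time <= 0:
        raise ValueError("total articulatory time must be positive")
    return total_syllables / total_time

# Hypothetical sentence with two speech runs separated by a pause longer than 200 ms.
runs = [(1.20, 6), (0.80, 4)]
rate = articulatory_rate(runs)  # 10 syllables over 2.0 s of speech
```

Because pause durations never enter the denominator, the result differs from speech rate only for sentences that contain pauses.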

Fundamental frequency (F0)

F0 time histories were generated for each sentence. Computer-generated F0 traces were visually checked for errors and errors were corrected manually on a pitch period by pitch period basis using the combined waveform and wideband spectrographic (300–400 Hz) displays. The first full glottal pulse was used to determine the onset and offset of voiced segments. The F0 trace continued until the last discernible glottal pulse indicative of voicing seen in both waveform and wideband spectrographic displays. For sentences that contained areas of possible diplophonia or glottal fry, the lowest frequency was marked for inclusion in the pitch traces and was included in all F0 calculations. These segments were observed in about half (i.e. about 60 sentences) of the total sentences produced by all speakers with PD. Voiced speech segments that could not be identified by periodic voicing were omitted from all F0 calculations. Time histories were extracted into Excel for conversion from Hertz to semitones (de Pijper, 2007). Mean F0, F0 SD (i.e. the standard deviation of data points from mean F0) and F0 Range (i.e. 90th percentile minus 10th percentile; Tjaden & Wilding, 2010) for each sentence were calculated. Two measures, F0 Range and F0 SD, were included to index F0 variation following previous studies (Bunton et al., 2000; Laures & Weismer, 1999). In addition, a slightly curtailed F0 Range (i.e. 90th percentile minus 10th percentile) may be more representative of a person’s ability to consistently modulate F0 because an absolute F0 Range metric is highly dependent on the two extreme values in a data set.
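The three F0 summary measures can be sketched as follows. Several details are assumptions rather than the authors' procedure: the semitone reference frequency (`ref_hz`, set here to 1 Hz), the use of the sample standard deviation, and percentiles computed by linear interpolation between closest ranks. The F0 trace is made up for illustration.

```python
import math
import statistics

def hz_to_semitones(f0_hz, ref_hz=1.0):
    """Convert F0 values in Hz to semitones relative to ref_hz (assumed reference)."""
    return [12.0 * math.log2(f / ref_hz) for f in f0_hz]

def percentile(values, p):
    """Percentile (p in 0..100) with linear interpolation between closest ranks."""
    xs = sorted(values)
    k = (len(xs) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)

def f0_summary(f0_hz, ref_hz=1.0):
    """Mean F0, F0 SD and F0 Range (90th minus 10th percentile), all in semitones."""
    st = hz_to_semitones(f0_hz, ref_hz)
    return {
        "mean": statistics.mean(st),
        "sd": statistics.stdev(st),  # sample SD; an assumption
        "range": percentile(st, 90) - percentile(st, 10),
    }

# Hypothetical corrected F0 trace for one sentence, in Hz.
f0_trace = [110.0, 115.0, 120.0, 130.0, 125.0, 118.0, 105.0, 100.0]
summary = f0_summary(f0_trace)
```

Changing the reference frequency shifts every semitone value by a constant, so it affects mean F0 but leaves F0 SD and F0 Range unchanged.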

Sound pressure level (SPL)

Mean SPL and SPL SD were used to index vocal intensity and intensity modulations. A Root-Mean-Squared (RMS) voltage trace was generated for all speech runs in a given sentence. RMS trace voltages were exported to an Excel file and converted to dB SPL using the calibration tone for each speaker to calculate Mean SPL. SPL SD was exported to Excel directly from the RMS trace voltage for each sentence.
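A calibration-based conversion of this kind is commonly written as the known level of the calibration tone plus 20 times the base-10 logarithm of the voltage ratio. The sketch below follows that general formula under explicit assumptions: the calibration tone level (`cal_db_spl`) and voltages are hypothetical, Mean SPL is taken as the mean of the per-sample dB values, and SPL SD as their sample standard deviation. The text does not state which of these conventions the authors used.

```python
import math
import statistics

def rms_to_db_spl(v_rms, v_cal, cal_db_spl):
    """dB SPL of an RMS voltage relative to a calibration tone of known level."""
    return cal_db_spl + 20.0 * math.log10(v_rms / v_cal)

def spl_summary(rms_trace, v_cal, cal_db_spl):
    """Mean and SD of the dB SPL values along an RMS voltage trace (assumed convention)."""
    levels = [rms_to_db_spl(v, v_cal, cal_db_spl) for v in rms_trace if v > 0]
    return statistics.mean(levels), statistics.stdev(levels)

# Hypothetical values: calibration tone measured at 0.1 V RMS with a known level of 80 dB SPL.
trace = [0.02, 0.05, 0.08, 0.06, 0.03]
mean_spl, spl_sd = spl_summary(trace, v_cal=0.1, cal_db_spl=80.0)
```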

F2 interquartile range (F2 IQR)

Segmental integrity for vocalic segments was indexed using F2 IQR (Yunusova et al., 2005). This measure is suggested for use in within-speaker, across-utterance analyses because, unlike the more commonly used measure of vowel space area, F2 IQR does not rely heavily on the phonetic context of an utterance (Yunusova et al., 2005). Linear predictive coding (LPC) formant trajectories for F2 were generated for all vowels, liquids, and glides and manually corrected for errors. Voiced segments were identified using voicing energy from the wideband spectrographic display (Tjaden & Wilding, 2010). Nasalized vowels were included, but nasal consonants (e.g. /m/) were excluded if segmentation could be achieved from the surrounding acoustic signal based on observed changes in spectrographic energy such as from the nasal resonance. F2 time histories were imported into Excel and F2 IQR was calculated by subtracting the 1st quartile from the 3rd quartile over each sentence.
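The final step, pooling F2 samples over a sentence and subtracting the first quartile from the third, can be sketched as below. The quartile method (`method="inclusive"` in Python's `statistics.quantiles`, which interpolates linearly and treats the data as including its own minimum and maximum) is an assumption, as are the function name and the F2 values.

```python
import statistics

def f2_iqr(f2_segments_hz):
    """F2 interquartile range (Hz) over all vocalic segments of one sentence.

    f2_segments_hz: list of lists, one list of F2 samples per vowel, liquid or glide.
    """
    pooled = [f for segment in f2_segments_hz for f in segment]
    q1, _median, q3 = statistics.quantiles(pooled, n=4, method="inclusive")
    return q3 - q1

# Hypothetical F2 time histories (Hz) for three vocalic segments of one sentence.
segments = [
    [1000.0, 1100.0, 1200.0],
    [1300.0, 1400.0, 1500.0],
    [1600.0, 1700.0, 1800.0],
]
iqr_hz = f2_iqr(segments)
```

Because the samples are pooled across whatever vocalic segments a sentence happens to contain, the measure can be computed for any utterance without requiring specific vowels.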

Acoustic reliability

Intra-judge and inter-judge reliability of acoustic measures were calculated for approximately 10% of the stimuli (10 sentences × 24 speakers = 240 total sentences). Twenty-four sentences were randomly selected (i.e. one sentence from each speaker) to be re-measured by the first author or a trained research assistant. Average absolute measurement error and standard deviation were used to measure reliability. Pearson product moment correlation coefficients were also obtained (Tjaden, 2003; Tjaden & Wilding, 2004).
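The reliability metrics named above can be sketched as follows: the mean and standard deviation of the absolute differences between two measurement passes, plus a Pearson correlation between the passes. The function names and values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import math
import statistics

def measurement_error(first_pass, second_pass):
    """Mean and SD of absolute differences between two sets of measurements."""
    diffs = [abs(a - b) for a, b in zip(first_pass, second_pass)]
    return statistics.mean(diffs), statistics.stdev(diffs)

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical articulatory rates (syll/s) measured twice for four sentences.
first = [5.10, 4.80, 6.20, 5.50]
second = [5.12, 4.75, 6.20, 5.56]
mean_err, sd_err = measurement_error(first, second)
r = pearson_r(first, second)
```

Reporting both metrics is informative because a high correlation alone would not reveal a constant offset between the two passes, whereas the absolute error would.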

For intra-judge reliability, the absolute average measurement error and standard deviations were 0.03 syll/s (0.05 syll/s) for articulatory rate. The absolute average measurement error and standard deviations in semitones were 0.25 (0.26), 0.26 (0.24) and 0.62 (1.15) for mean F0, F0 SD and F0 Range, respectively. The absolute average measurement error and standard deviation for F2 IQR were 0.02 kHz (0.03 kHz). Finally, the absolute average measurement error and standard deviation for mean SPL and SPL SD were 0.11 dB (0.23 dB) and 0.10 dB (0.20 dB), respectively. For all acoustic measures, the Pearson correlations were greater than 0.96.

For inter-judge reliability, the absolute average measurement error and standard deviations were 0.05 syll/s (0.10 syll/s) for articulatory rate. The absolute average measurement error and standard deviations in semitones were 0.72 (1.24), 0.47 (0.42) and 0.87 (1.51) for mean F0, F0 SD and F0 Range, respectively. The absolute average measurement error and standard deviation for F2 IQR were 0.03 kHz (0.04 kHz). Finally, the absolute average measurement error and standard deviation for mean SPL and SPL SD were 0.11 dB (0.18 dB) and 0.16 dB (0.36 dB), respectively. For all acoustic measures, the Pearson correlations were all greater than 0.91.

Statistical analysis

Levene’s test for homogeneity of variance was used to assess the equality of variance among samples for each acoustic measure. For many of the acoustic variables, this assumption was not met. Therefore, non-parametric Mann–Whitney U tests were used to compare acoustic measures for the two groups. Spearman rank order correlations were used to evaluate the strength of association between the various acoustic variables and scaled sentence intelligibility within each PD speaker. A non-parametric model was used due to the small number of observations (n = 10). As suggested by Cohen (1988), relationships between variables exhibiting a correlation of ±0.30 or stronger have a moderately strong association. Thus, moderate correlations among acoustic measures and intelligibility are considered meaningful in this study, regardless of whether correlations met standard criteria for statistical significance (p<0.05). Similar magnitudes of correlations between acoustic variables and intelligibility have been interpreted to be meaningful in past dysarthria studies (Kim et al., 2011b; Tjaden & Wilding, 2011b; Weismer et al., 2001; Yunusova et al., 2005, 2012). In addition, only one other published dysarthria study has investigated the acoustic basis of within-speaker variation in intelligibility. Thus, a conservative approach to data interpretation was deemed appropriate so as not to miss potential trends of interest. Correlation analysis was also used to investigate the relationship among all acoustic measures for the PD group.
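The within-speaker analysis can be illustrated with a minimal sketch of a Spearman rank order correlation over one speaker's 10 sentences, followed by the ±0.30 criterion. The implementation below (average ranks for ties, then a Pearson correlation on the ranks) is a standard formulation, not necessarily the software the authors used; the function names and the data are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

def is_at_least_moderate(rho, threshold=0.30):
    """Apply the criterion of a correlation of +/-0.30 or stronger."""
    return abs(rho) >= threshold

# Hypothetical data for one speaker: F2 IQR (Hz) and scaled intelligibility, 10 sentences.
f2_iqr_hz = [310.0, 420.0, 280.0, 390.0, 450.0, 300.0, 360.0, 410.0, 270.0, 340.0]
scaled_intel = [0.62, 0.35, 0.70, 0.48, 0.30, 0.55, 0.52, 0.41, 0.66, 0.58]
rho = spearman_rho(f2_iqr_hz, scaled_intel)
meaningful = is_at_least_moderate(rho)
```

With only 10 observations per speaker, a correlation can meet the ±0.30 criterion without reaching statistical significance, which is why the two are reported separately.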

Results

Comparison of scaled sentence intelligibility and acoustic measures for PD and control groups

Descriptive characteristics for scaled judgments of intelligibility of Harvard sentences for individuals with PD are summarized in Table 2. Average scaled sentence intelligibility for the PD group was 0.47 (SD = 0.14). For comparison, the average scaled sentence intelligibility for the 12 control speakers was 0.28 (SD = 0.10). A Mann–Whitney U test indicated that this difference was significant (p<0.001). In addition, Table 2 indicates that all speakers with PD demonstrated a range of sentence intelligibility for Harvard sentences such that some Harvard sentences were judged more intelligible than others within each speaker with PD.

Table 2.

Descriptive characteristics are summarized for scaled judgments of intelligibility for individuals with PD.

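| Speaker | Average Scaled Intelligibility Harvard Sentences | Min Scaled Intelligibility Harvard Sentences | Max Scaled Intelligibility Harvard Sentences |
|---|---|---|---|
| PDM01 | 0.64 | 0.10 | 0.92 |
| PDM02 | 0.37 | 0.10 | 0.69 |
| PDM03 | 0.45 | 0.07 | 0.86 |
| PDM04 | 0.43 | 0.12 | 0.91 |
| PDM06 | 0.62 | 0.09 | 0.86 |
| PDM07 | 0.68 | 0.07 | 0.99 |
| PDM08 | 0.55 | 0.07 | 0.93 |
| PDF01 | 0.65 | 0.28 | 0.93 |
| PDF04 | 0.52 | 0.12 | 0.67 |
| PDF05 | 0.46 | 0.12 | 0.95 |
| PDF06 | 0.41 | 0.04 | 0.87 |
| PDF08 | 0.21 | 0.02 | 0.74 |
| MEAN | 0.47 | 0.10 | 0.86 |
| SD | 0.14 | 0.06 | 0.10 |
| RANGE | 0.21–0.68 | 0.02–0.28 | 0.67–0.99 |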

Scaled intelligibility is the average across 10 Harvard sentences for each speaker. Participant code: Parkinson’s disease (PD), Male (M), and Female (F).

Figure 1 shows group means and standard deviations for acoustic measures. On average, the PD group had slightly faster articulatory rates (i.e. upper left panel) and reduced F2 IQR (i.e. upper right panel) as well as greater across-speaker F2 IQR variability (SD = 105.02 Hz) compared to the control group (SD = 45.62 Hz). As indicated by the height of the standard deviation bars (i.e. lower left panel), the PD group also tended to have a greater F0 Range (8.32 semitones), on average, compared to the control group (6.81 semitones). On average, F0 SD was slightly greater for the PD group (3.61 semitones) relative to the control group (3.36 semitones). Finally, the PD group demonstrated a modest reduction in Mean SPL (M = 71.87 dB, SD = 2.36 dB) compared to the control group (M = 72.97 dB, SD = 3.67 dB). SPL SD was also reduced for the PD group (8.29 dB) relative to controls (8.72 dB). Despite these descriptive differences, Mann–Whitney U tests indicated that acoustic measures for the PD and control groups were not statistically different.

Figure 1.

Mean and standard deviation bars are reported for Articulatory Rate in syllables per second (upper left panel), F2 IQR in Hertz (upper right panel), F0 Range in semitone (lower left panel) and Mean SPL in decibels (lower right panel) for the PD and Control groups.

Correlations between acoustic measures and intelligibility for speakers with PD

Table 3 reports Spearman rank order correlations reflecting the strength of association between acoustic variables and scaled sentence intelligibility for each speaker with PD. Speakers are listed from least (PDM07 = 0.68) to most (PDF08 = 0.21) intelligible, as indexed by average scaled intelligibility for Harvard sentences. Bold text indicates correlations of ±0.30 or stronger and, for completeness, asterisks indicate statistically significant correlations (p<0.05). For every speaker except PDF08, at least one acoustic variable was at least moderately correlated with scaled sentence intelligibility.

Table 3.

Spearman rank order correlation results are reported.

Speaker Articulatory Rate (syll/sec) Mean F0 (Semitone) F0 SD (Semitone) F0 Range (Semitone) Mean SPL (dB) SPL SD (dB) F2 IQR (Hz)
PDM07 0.370 0.055 −0.139 −0.127 0.261 −0.115 −0.091
PDF01 −0.152 0.103 0.091 0.055 0.164 0.370 0.067
PDM01 0.018 0.503 0.479 0.527 −0.030 0.292 0.139
PDM06 0.794* 0.806* 0.491 0.442 0.188 0.134 0.673*
PDM08 0.576 0.176 0.515 −0.176 0.018 0.127 0.503
PDF05 −0.115 0.442 0.418 0.503 −0.238 0.042 0.321
PDF04 0.430 0.333 0.127 0.103 0.559 −0.243 0.236
PDF06 −0.212 −0.030 0.127 0.321 0.012 −0.043 −0.006
PDM04 0.195 0.517 0.109 0.334 0.128 0.571 −0.195
PDM03 0.333 0.636* 0.188 0.224 0.285 −0.115 0.709*
PDM02 0.115 −0.285 0.321 0.115 0.006 0.366 0.358
PDF08 0.091 0.139 0.188 −0.152 0.188 0.049 0.091

F0 Range = 90th percentile minus 10th percentile. Alpha level 0.05.

Results summarize the association between acoustic variables and scaled sentence intelligibility within speakers. The sign of the correlation indicates the direction of the association. Bold text indicates correlations of ±0.30 or stronger and an asterisk indicates significant correlations (p<0.05). Speakers are listed from least intelligible (PDM07) to most intelligible (PDF08).
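For readers who want to see how a within-speaker Spearman correlation of this kind could be computed, a minimal sketch follows. It ranks the values with ties averaged and then takes the Pearson correlation of the ranks, which is the standard definition of Spearman's rho. The sentence-level values for the single speaker are hypothetical, and the helper names (`average_ranks`, `spearman_rho`) are assumptions of this sketch rather than the authors' analysis code.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# HYPOTHETICAL data for one speaker's 10 sentences.
artic_rate = [3.1, 3.4, 2.9, 3.8, 3.6, 3.0, 3.3, 3.9, 3.5, 3.2]   # syll/sec
scaled_intel = [0.20, 0.45, 0.15, 0.80, 0.50, 0.35, 0.30, 0.70, 0.65, 0.25]

rho = spearman_rho(artic_rate, scaled_intel)
is_at_least_moderate = abs(rho) >= 0.30   # threshold used for bold text in Table 3
print(f"rho = {rho:.3f}, at least moderate: {is_at_least_moderate}")
```

Repeating the calculation for each acoustic variable and each speaker would yield a table with the same layout as Table 3.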

Articulatory rate, F0 and SPL

As reported in Table 3, articulatory rate was at least moderately correlated with scaled sentence intelligibility for five of the 12 speakers. The absolute magnitude of correlations ranged from 0.333 to 0.794. The direction of the effect varied across speakers.

Mean F0 was at least moderately correlated with scaled sentence intelligibility for six of the 12 speakers. The absolute magnitude of correlations ranged from 0.333 to 0.806. The direction of the relationship varied across speakers. As indicated in the upper left panel of Figure 2, greater Mean F0 was correlated with better scaled intelligibility for PDM01 (ρ = −0.503) while the upper right panel indicates lower Mean F0 was correlated with better scaled sentence intelligibility for PDM03 (ρ = 0.636).

Figure 2.

The relationship between Mean F0 and scaled sentence intelligibility is reported for speakers PDM01 (upper left panel) and PDM03 (upper right panel). The relationship between SPL SD and scaled sentence intelligibility is reported for speakers PDM04 (lower left panel) and PDM02 (lower right panel). Each symbol (e.g. sn16) represents the corresponding sentence number produced by a given male speaker with PD.

F0 SD was at least moderately correlated with scaled sentence intelligibility for five of 12 speakers. The absolute magnitude of correlations ranged from 0.321 to 0.515. With the exception of one speaker (PDM01) the direction of the relationship was such that lower values of F0 SD were associated with better intelligibility. Similarly, correlation analysis indicated at least moderate associations between F0 Range and scaled sentence intelligibility for five of the 12 speakers. The absolute magnitude of correlations ranged from 0.321 to 0.527. The direction of the relationship was the same for all five speakers such that reduced or restricted modulations in F0 were associated with better intelligibility.
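The F0 measures discussed here are expressed in semitones, and the Table 3 note defines F0 Range as the 90th percentile minus the 10th percentile. The sketch below shows one way these sentence-level values might be derived from an F0 track in Hertz. The 1 Hz reference frequency, the linear-interpolation percentile and the example F0 track are all assumptions made for illustration; this section does not state the reference or the percentile method actually used.

```python
import math
import statistics

REF_HZ = 1.0  # ASSUMED semitone reference frequency

def hz_to_semitones(f_hz, ref_hz=REF_HZ):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f_hz / ref_hz)

def percentile(values, q):
    """Percentile (q in 0..100) with linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower, upper = math.floor(pos), math.ceil(pos)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

def f0_summary(f0_hz):
    """Sentence-level Mean F0, F0 SD and F0 Range, all in semitones."""
    st = [hz_to_semitones(f) for f in f0_hz]
    return {
        "mean_f0": statistics.mean(st),
        "f0_sd": statistics.stdev(st),
        "f0_range": percentile(st, 90) - percentile(st, 10),
    }

# HYPOTHETICAL voiced-frame F0 values (Hz) for one sentence.
f0_track_hz = [110, 118, 125, 131, 140, 122, 115, 108, 104, 99]
summary = f0_summary(f0_track_hz)
print({name: round(value, 2) for name, value in summary.items()})
```

Because the range is computed from percentiles, a few frames of very low F0 (for example during glottal fry) pull the 10th percentile down and widen F0 Range without any change in the upper end of the distribution.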

Mean SPL was at least moderately correlated with scaled sentence intelligibility for one speaker, for whom a lower mean sound pressure level was associated with better scaled sentence intelligibility. SPL SD was at least moderately correlated with scaled sentence intelligibility for three of 12 speakers: PDF01 (ρ = 0.370), PDM04 (ρ = −0.571) and PDM02 (ρ = 0.366). These correlations were modest, and the direction of the relationship differed across the three speakers. For example, as indicated in the lower left panel of Figure 2, greater sentence-level variation in SPL was associated with better intelligibility for PDM04 while the lower right panel indicates greater sentence-level variation in SPL was associated with reduced intelligibility for speaker PDM02.

F2 IQR

F2 IQR was at least moderately correlated with sentence intelligibility for five of the 12 speakers. The absolute magnitude of correlations ranged from 0.321 to 0.709. The direction of the effect was such that reduced F2 IQR was associated with better intelligibility for four speakers. For one speaker, a greater F2 IQR was associated with higher intelligibility.
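As a brief illustration of the measure itself, F2 IQR is the interquartile range of second-formant values within a sentence. The sketch below computes it from a hypothetical F2 track. The quartile method (`inclusive`) and the example values are assumptions of this sketch; this section does not specify how the quartiles were computed.

```python
import statistics

def f2_iqr(f2_hz):
    """Interquartile range (Q3 - Q1) of F2 values in Hz."""
    q1, _median, q3 = statistics.quantiles(f2_hz, n=4, method="inclusive")
    return q3 - q1

# HYPOTHETICAL F2 samples (Hz) across the vocalic portions of one sentence.
f2_track_hz = [1100, 1250, 1400, 1600, 1750, 1900, 1650, 1450, 1300, 1200]
print(f"F2 IQR = {f2_iqr(f2_track_hz):.1f} Hz")
```

A smaller value indicates that F2 moved over a narrower frequency span during the sentence.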

Summary of within-speaker correlations

The strength of association as well as the direction of the relationship between most acoustic measures and intelligibility varied. Measures of articulatory rate, fundamental frequency and F2 IQR emerged as possible acoustic variables contributing to within-speaker variations in sentence intelligibility. With the exception of F0 Range, the direction of the relationship varied. For about half of the speakers, articulatory rate, measures of fundamental frequency and F2 IQR were at least moderately correlated with sentence intelligibility. Although the direction of the relationship varied, SPL SD was at least moderately correlated with sentence intelligibility for three speakers with PD. In sum, half of the speakers had three moderately strong correlations while three additional speakers had one or two moderately strong correlations.
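The counts in this summary can be checked directly against Table 3. The sketch below transcribes the table as printed and tallies, per measure and per speaker, how many correlations reach the 0.30 threshold in absolute value. The variable names are illustrative; the numbers are copied from Table 3.

```python
MEASURES = ["Articulatory Rate", "Mean F0", "F0 SD", "F0 Range",
            "Mean SPL", "SPL SD", "F2 IQR"]

# Spearman correlations transcribed from Table 3 (same column order as MEASURES).
TABLE3 = {
    "PDM07": [0.370, 0.055, -0.139, -0.127, 0.261, -0.115, -0.091],
    "PDF01": [-0.152, 0.103, 0.091, 0.055, 0.164, 0.370, 0.067],
    "PDM01": [0.018, 0.503, 0.479, 0.527, -0.030, 0.292, 0.139],
    "PDM06": [0.794, 0.806, 0.491, 0.442, 0.188, 0.134, 0.673],
    "PDM08": [0.576, 0.176, 0.515, -0.176, 0.018, 0.127, 0.503],
    "PDF05": [-0.115, 0.442, 0.418, 0.503, -0.238, 0.042, 0.321],
    "PDF04": [0.430, 0.333, 0.127, 0.103, 0.559, -0.243, 0.236],
    "PDF06": [-0.212, -0.030, 0.127, 0.321, 0.012, -0.043, -0.006],
    "PDM04": [0.195, 0.517, 0.109, 0.334, 0.128, 0.571, -0.195],
    "PDM03": [0.333, 0.636, 0.188, 0.224, 0.285, -0.115, 0.709],
    "PDM02": [0.115, -0.285, 0.321, 0.115, 0.006, 0.366, 0.358],
    "PDF08": [0.091, 0.139, 0.188, -0.152, 0.188, 0.049, 0.091],
}
THRESHOLD = 0.30  # "at least moderate" cutoff used in Table 3

per_measure = {m: 0 for m in MEASURES}
per_speaker = {}
for speaker, rhos in TABLE3.items():
    flags = [abs(r) >= THRESHOLD for r in rhos]
    per_speaker[speaker] = sum(flags)
    for measure, flag in zip(MEASURES, flags):
        per_measure[measure] += flag

print(per_measure)
print(per_speaker)
```

The tallies give five speakers for articulatory rate, F0 SD, F0 Range and F2 IQR, six for Mean F0, one for Mean SPL and three for SPL SD. Six speakers have exactly three correlations at or above the threshold, and three speakers have exactly one.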

Correlations between acoustic variables

Table 4 reports correlations among acoustic variables in the PD group. This table suggests significant associations among most of the acoustic variables when data were pooled across speakers. Overall, Mean F0 tended to be more strongly correlated with other acoustic variables while articulation rate, SPL and F0 modulation, as indexed by both standard deviation and range, tended to be less strongly correlated with other acoustic variables.

Table 4.

Spearman rank order inter-correlations among acoustic variables are reported.

Acoustic variables (1) Speech rate (syll/sec) (2) Articulation rate (syll/sec) (3) SPL SD (dB) (4) SPL Mean (dB) (5) F2 IQR (Hz) (6) F0 Mean (Semitone) (7) F0 SD (Semitone) (8) F0 Range (Semitone)
(1) Speech rate (syll/sec) 1
(2) Articulation rate (syll/sec) 0.978* 1
(3) SPL SD (dB) −0.501* −0.505* 1
(4) SPL Mean (dB) 0.223* 0.244* −0.130 1
(5) F2 IQR (Hz) −0.199* −0.174 −0.010 0.005 1
(6) F0 Mean (Semitone) −0.338* −0.370* 0.200* −0.258* 0.193* 1
(7) F0 SD (Semitone) 0.028 0.012 −0.090 0.066 0.237* 0.240* 1
(8) F0 Range (Semitone) 0.084 0.083 −0.090 0.283* 0.186* 0.020 0.857* 1

Note. Alpha level 0.05.

The sign of the correlation indicates the direction of the association. An asterisk indicates significant correlations (p<0.05).

Qualitative analysis of linguistic variables and scaled intelligibility for speakers with PD

Due to the modest strength of association between acoustic measures and within-speaker variation in intelligibility, a qualitative post-hoc analysis was undertaken to explore whether word frequency might be contributing to within-speaker variation in intelligibility. Recall that 10 sentences were randomly selected for each speaker from a larger pool of 25 for use in performing acoustic measures and scaled judgments of intelligibility, as previously described. Thus, each of the 25 Harvard sentences was represented in the post-hoc analysis, albeit not for each speaker. Figure 3 shows mean judgments of intelligibility for the 29 listeners for each of the 25 Harvard Sentences. In Figure 3, Harvard sentences are indicated by sentence number in descending order on the x-axis. With the exception of sentences 15, 20 and 21, all sentences were included in the subset of 10 sentences for at least three different speakers with PD. Figure 3 shows that sentences 2, 11 and 23 were judged to be relatively more intelligible while sentences 14 and 18 were judged to be relatively less intelligible compared to the other sentences. Word frequency data (Brysbaert & New, 2009) further suggest that the keywords in sentences 14 and 18 occur less frequently (e.g. “dune” occurs 51 times) than the words that comprise sentences 2, 11 and 23 (e.g. “box” occurs 4577 times) out of 51 million words based on American subtitles.
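To put the two example counts on a common scale, raw counts can be converted to occurrences per million words. The short sketch below does this for the counts quoted above ("dune" = 51, "box" = 4577, corpus of 51 million words); the function name `per_million` is illustrative.

```python
CORPUS_SIZE = 51_000_000  # words in the subtitle corpus, as stated in the text

def per_million(count, corpus_size=CORPUS_SIZE):
    """Convert a raw corpus count to occurrences per million words."""
    return count * 1_000_000 / corpus_size

dune_pm = per_million(51)     # 1.0 per million
box_pm = per_million(4577)    # about 89.7 per million
print(f"dune: {dune_pm:.1f} per million, box: {box_pm:.1f} per million")
print(f"box is about {box_pm / dune_pm:.0f} times more frequent than dune")
```

On these figures "box" occurs roughly 90 times as often as "dune", which gives a sense of the size of the frequency gap between the more and less intelligible sentences.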

Figure 3.

Average scaled intelligibility estimates for Harvard Sentences produced by speakers with PD are reported. Sentences are arranged in descending order.

To further explore whether word frequency contributed to the results of the correlation analyses in Table 3, the within-speaker correlation analyses were repeated excluding sentences 2, 11, 23, 14 and 18. The two analyses yielded similar results. The mean absolute strength of association among acoustic measures and intelligibility for the initial and repeated analyses was 0.484 and 0.509, respectively. Mean SPL and SPL SD emerged as additional acoustic variables contributing to within-speaker variations in sentence intelligibility for about half of the speakers with PD compared to the initial analysis summarized in Table 3. In contrast, no more than three speakers in the original analysis exhibited at least a moderate correlation between sentence intelligibility and measures of SPL. The direction of the relationship also varied. In sum, measures of SPL were the most affected by excluding these sentences.
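The value of 0.484 for the initial analysis appears to correspond to the mean absolute value of the Table 3 correlations that reach the 0.30 threshold. This reading is an inference from the table rather than something the text states, and the sketch below simply checks the arithmetic. The 30 values are the at-least-moderate correlations from Table 3, grouped by measure; signs are ignored because only absolute strength is averaged. The value for the repeated analysis (0.509) cannot be reproduced here because those correlations are not reported.

```python
# Absolute values of Table 3 correlations at or above 0.30, grouped by measure.
moderate_rhos = {
    "Articulatory Rate": [0.370, 0.794, 0.576, 0.430, 0.333],
    "Mean F0": [0.503, 0.806, 0.442, 0.333, 0.517, 0.636],
    "F0 SD": [0.479, 0.491, 0.515, 0.418, 0.321],
    "F0 Range": [0.527, 0.442, 0.503, 0.321, 0.334],
    "Mean SPL": [0.559],
    "SPL SD": [0.370, 0.571, 0.366],
    "F2 IQR": [0.673, 0.503, 0.321, 0.709, 0.358],
}

values = [abs(r) for rhos in moderate_rhos.values() for r in rhos]
mean_abs = sum(values) / len(values)
print(len(values), round(mean_abs, 3))   # 30 0.484
```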

Discussion

The purpose of this study was to investigate potential acoustic variables associated with across-utterance, within-speaker variation in intelligibility for speakers with mild dysarthria associated with PD. The strength of association as well as the direction of the relationship between most acoustic measures and intelligibility varied. Before further discussing these results and their implications, acoustic characteristics of speakers with PD relative to controls are considered.

Comparison of acoustic measures for speakers with PD and healthy controls

While differences in acoustic measures for the PD and control groups were not statistically significant, the PD group demonstrated trends consistent with previously published acoustic studies (Dromey, 2003; Weismer et al., 2001; Yunusova et al., 2005). That is, on average, the PD group had a faster articulatory rate, reduced mean fundamental frequency, lower mean intensity level and reduced F2 IQR compared to the control group (Figure 1). SPL modulation (SPL SD) also tended to be reduced for the PD group, while F0 variation (F0 SD, F0 Range) tended to be greater for the PD group. The lack of statistically significant differences may be explained by the modest sample size and the fact that the PD group had mild dysarthria (Table 1). As discussed below, a trend toward greater F0 Range and F0 modulation for speakers with PD as a whole may be explained by voicing instabilities.

Relationship between F0 and scaled sentence intelligibility

The direction of the relationship between sentence-level F0 modulation and scaled sentence intelligibility varied across the speakers with PD (Table 3). However, for most speakers, reduced F0 SD (i.e. 4 of 5 speakers) and F0 Range (i.e. 5 of 5 speakers) were associated with higher intelligibility. This finding is at odds with at least some previous studies (Bunton et al., 2001; but see Miller et al., 2010). One explanation for the trend toward reduced F0 modulation being associated with better sentence intelligibility is the occurrence of low frequency vocal instability. Phonatory instability in the form of glottal fry and diplophonia, suggesting an extreme lower F0 Range, was observed in four of the five female speakers with PD. For example, sentence one produced by PDF05 had the greatest F0 Range (14.94 semitones) and the lowest F0 10th percentile value (75.49 semitones). Importantly, for this speaker, average F0 10th percentile values were lower compared to controls while F0 90th percentile values were similar across sentences and consistent with age- and sex-matched controls. This trend held for both female speakers with PD for whom results yielded at least a moderately strong relationship between metrics used to index F0 modulation and intelligibility.

In contrast, sentences with a reduced F0 Range for men that were judged to have better intelligibility appeared to be explained by restricting higher vocal frequencies (i.e. F0 90th percentile). This trend held for all male speakers with PD (i.e. 3 speakers) for whom results yielded at least a moderately strong relationship between F0 Range and intelligibility. For example, for PDM01, sentence 23 was judged to be more intelligible, had the most restricted F0 Range (4.17 semitones) and one of the lowest F0 90th percentile values (85.70 semitones). For this speaker, F0 10th percentile values were consistent with age- and sex-matched controls. Similar to previous studies, these findings suggest that F0 modulation aberrancies may manifest differently for men and women in PD (Hertrich & Ackermann, 1995; Holmes, Oates, Phyland, & Hughes, 2000; MacPherson, Huber, & Snow, 2011; Skodda, Rinsche, & Schlegel, 2009; Skodda, Visser, & Schlegel, 2011).

Most speakers with PD were judged to have the best intelligibility for sentences with reduced F0 modulation. Although the underlying reason for the varied results for F0 modulation is still unknown, synthetically enhanced F0 contours have been reported to increase vowel intelligibility for speakers with PD (Bunton, 2006). Results from the current study together with Bunton’s (2006) study suggest that additional perceptual studies investigating the effect of increased F0 Range attributable to low frequency instabilities are needed given that voice disruptions or instabilities are a common feature of dysarthria (Bunton & Weismer, 2002; Darley, Aronson, & Brown, 1969; Kent, Kent, Duffy, & Weismer, 1998).

Finally, Mean F0 was moderately correlated with scaled sentence intelligibility for six speakers (Table 3). For two of the six speakers, a lower Mean F0 was associated with better scaled sentence intelligibility. However, in Dromey and Ramig’s (1998) study, higher Mean F0 was associated with better scaled sentence intelligibility. It is also important to note that higher Mean F0 was associated with better intelligibility for two female speakers, while for male speakers the relationship varied. Aside from the large number of acoustic variables correlated with Mean F0 (Table 4), no obvious explanation emerged for why higher Mean F0 was associated with better intelligibility for some speakers, while for other speakers higher Mean F0 was associated with lower intelligibility. Thus, sentence-level mean F0 is likely linked to other acoustic variables and independently may not be a reliable indicator of within-speaker variations in scaled intelligibility for speakers with mild dysarthria in PD.

Relationship between SPL and sentence intelligibility

SPL modulation and intelligibility were moderately associated for three of 12 speakers, but the direction of the relationship varied (Figure 2). On average, two of the three speakers had greater mean scaled intelligibility (Table 1) and similar SPL SD (8.56 and 8.11 dB). Results suggest that the different relationships between SPL SD and sentence intelligibility cannot be attributed to overall differences in SPL modulation. Thus, the relationship between SPL modulation and within-speaker variation in scaled sentence intelligibility appears to be complex. Since speakers in the current study had mild dysarthria, studies investigating more severely impaired speakers are also needed as the results may differ for more severe dysarthria (Kim et al., 2011b; Weismer et al., 2012).

For one speaker (Table 3), sentences characterized by a reduced Mean SPL were associated with better scaled sentence intelligibility. Using a slightly different approach, Kim and Kuo (2012) reported that increased intensity levels for sentences produced by healthy controls were associated with a significant decrease in scaled intelligibility. Thus, the current findings may reflect the fairly preserved sentence-level intelligibility for speakers with mild dysarthria in PD. Although further research is needed to investigate the relationship between Mean SPL and sentence intelligibility, it appears that at least for some speakers with dysarthria, an increased SPL may not be associated with improved intelligibility (Cannito et al., 2012; Ramig et al., 1994).

Relationship between F2 IQR and sentence intelligibility

Five speakers with PD had at least a moderate relationship between F2 IQR and scaled sentence intelligibility, but the direction of the relationship varied. PDM06 had a relatively restricted F2 IQR (395.05 Hz) compared to PDM08 (536.70 Hz). However, for both speakers a reduced F2 IQR was associated with better scaled sentence intelligibility. Thus, an explanation was not obvious for why reduced F2 IQR tended to be associated with better scaled sentence intelligibility for these speakers. A reduced F2 IQR is presumed to reflect diminished vocalic segmental integrity. Although vocalic integrity may have been reduced for some speakers with PD, listeners may have perceived the intended message as intelligible because of a relatively slower speaking rate for these speakers. For example, on average, PDM06 produced about three syllables per second relative to other speakers with PD, who averaged four to five syllables per second. Thus, speakers for whom a reduced F2 IQR was associated with better intelligibility tended to use a slower than normal rate compared to other speakers with PD.

Aside from a limited number of data points per speaker, the equivocal findings for F2 IQR may also be explained by the limited variation in F2 IQR. That is, despite the fact that a reduced F2 IQR tended to be associated with better intelligibility, speakers with PD had relatively restricted F2 IQR compared to healthy controls. With the exception of one speaker with PD, F2 IQR tended to vary between 400 and 550 Hz. F2 IQR for healthy controls varied between 500 and 650 Hz. Thus, it is possible that the variation in F2 IQR was not of sufficient magnitude for a relationship to intelligibility to emerge. Future studies are needed to investigate the relationship between segmental integrity and variations in scaled sentence intelligibility for individuals who naturally produce more sentence-to-sentence variation in F2 IQR.

Relationship between articulatory rate and scaled sentence intelligibility

Articulatory rate was at least moderately correlated with scaled sentence intelligibility for five speakers. However, the direction of the relationship varied. These findings as well as those from past studies suggest that the relationship between global speech timing and intelligibility is complex (e.g. McRae et al., 2002; Van Nuffelen et al., 2010; Weismer et al., 2000; Yorkston et al., 1999). Speakers for whom increased articulatory rate was associated with better intelligibility tended to have slower mean articulatory rates (i.e. approximately three syllables per second). Speakers for whom decreased articulatory rate was associated with better scaled sentence intelligibility (e.g. PDM03) tended to have faster mean articulatory rates (i.e. approximately 3.52–5.06 syllables per second). Thus, the direction of the relationship may have varied because of differences in habitual speaking rates.

Relationship between linguistic variables and scaled sentence intelligibility

Experimental sentences were somewhat heterogeneous with respect to phonetic content, syntax, semantics and word frequency. A qualitative post-hoc analysis showed that some sentences were consistently judged to be more intelligible, while others were consistently judged to be less intelligible (Figure 3). Sentences judged to be less intelligible were composed of words that occurred less frequently in American English (Brysbaert & New, 2009). Thus, listeners may have been less able to use semantic knowledge to recover the intended message for these sentences. Linguistic variables and, by inference, the semantic predictability of sentences can contribute to intelligibility (Hustad, 2007; Hustad & Beukelman, 2001; Tjaden & Wilding, 2010; Yunusova et al., 2005). Thus, word frequency trends may also help to explain some of the within-speaker variation in intelligibility for speakers with mild dysarthria. Indeed, when analyses were repeated excluding sentences that were consistently judged to be more or less intelligible, results suggested that word frequency was not a factor contributing to the overall absolute strength of the associations, but may mask some acoustic variables, such as measures of SPL, that may contribute to within-speaker variations in sentence intelligibility.

Caveats and future directions

Several factors should be kept in mind when interpreting results. The limited number of statistically significant correlations may be related to the acoustic variables studied. Therefore, in addition to larger numbers of data points per speaker, future studies should also investigate the relationship of other acoustic measures to within-speaker variation in intelligibility. For example, spectral measures of consonants as well as rhythm metrics could be studied as these types of measures have been shown to be associated with intelligibility (Chen & Stevens, 2001; Kewley-Port, Pisoni, & Studdert-Kennedy, 1983; Liss et al., 2009; Kim et al., 2011a). Most speakers in the current study also had mild dysarthria. Results may differ for individuals with more severe dysarthria. However, studies of mild dysarthria are still relevant because even mild dysarthria can have major implications for maintaining employment, social relationships and quality of life (Yorkston et al., 2010). Speakers with mild dysarthria can also benefit from treatment aimed at maximizing speech intelligibility and efficiency (Yorkston et al., 2010). In addition, the inherent nature of the speech task may have constrained the range of variation in both acoustic and perceptual measures, thus contributing to the modest association between acoustic measures and intelligibility. The nature of the scaling task also may have resulted in speech or voice qualities other than intelligibility (i.e. naturalness, clarity) influencing perceptual judgments. Finally, judgments of scaled intelligibility for Harvard sentences were obtained in background noise (i.e. 20-talker babble noise). Preliminary evidence suggests that the acoustics of intelligibility in background noise for speakers with dysarthria may differ from quiet (McAuliffe, Good, O’Beirne, & LaPointe, 2008; McAuliffe et al., 2009). Thus, findings may not generalize to other perceptual environments.

In conclusion, results suggest that a within-speaker analysis shows promise for identifying acoustic variables likely contributing to variations in sentence intelligibility for speakers with PD. Certain acoustic variables, especially articulatory rate, fundamental frequency and F2 IQR, show promise for capturing within-speaker variation in intelligibility, at least for speakers with mild dysarthria secondary to PD in the current study. One possible clinical implication is that the current speakers with PD may benefit from therapy focused on the underlying acoustic variable associated with intelligibility. Finally, the fact that findings appear to be at odds with at least some previously published studies emphasizes the heterogeneity among speakers with PD. This diversity suggests that pooling data across speakers may mask the contribution of some acoustic measures to variations in scaled speech intelligibility. Although additional studies are needed, results suggest care is warranted when interpreting results from group analyses until there is a better understanding of the acoustic basis of intelligibility.

Acknowledgments

We thank Jessica Sam and Helena Rosenstrauch for assistance with various aspects of this study. We also thank Dr. Christina Kuo for her contribution on portions of this study, and Dr. Stathopoulos and Dr. Higginbotham for comments on previous versions of this document.

Footnotes

Declaration of interest

The authors report no conflict of interest. This research was supported by R01DC004689.

References

  1. ANSI. ANSI S3.6-1989, American National Standard Specification for Audiometers. New York: American National Standards Institute; 1989.
  2. Bradlow AR, Torretta GM, Pisoni DB. Intelligibility of normal speech I: Global and fine-grained acoustic-phonetic talker characteristics. Speech Communication. 1996;20:255–272. doi: 10.1016/S0167-6393(96)00063-5.
  3. Brysbaert M, New B. Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods. 2009;41:977–990. doi: 10.3758/BRM.41.4.977.
  4. Bunton K. Fundamental frequency as a perceptual cue for vowel identification in speakers with Parkinson’s disease. Folia Phoniatrica et Logopaedica. 2006;58:323–339. doi: 10.1159/000094567.
  5. Bunton K, Kent RD, Kent JF, Duffy JR. The effects of flattening fundamental frequency contours on sentence intelligibility in speakers with dysarthria. Clinical Linguistics & Phonetics. 2001;15:181–193.
  6. Bunton K, Kent RD, Kent JF, Rosenbek JC. Perceptuo-acoustic assessment of prosodic impairment in dysarthria. Clinical Linguistics & Phonetics. 2000;14:13–24. doi: 10.1080/026992000298922.
  7. Bunton K, Weismer G. Segmental level analysis of laryngeal function in persons with motor speech disorders. Folia Phoniatrica et Logopaedica. 2002;54:223–239. doi: 10.1159/000065199.
  8. Cannito MP, Burch AR, Watts C, Rappold PW, Hood SB, Sherrard K. Disfluency in spasmodic dysphonia: A multivariate analysis. Journal of Speech, Language, and Hearing Research. 1997;40:627–641. doi: 10.1044/jslhr.4003.627.
  9. Cannito MP, Suiter DM, Beverly D, Chorna L, Wolf T, Pfeiffer RM. Sentence intelligibility before and after voice treatment in speakers with idiopathic Parkinson’s disease. Journal of Voice. 2012;26:214–219. doi: 10.1016/j.jvoice.2011.08.014.
  10. Chen H, Stevens KN. An acoustical study of the fricative /s/ in the speech of individuals with dysarthria. Journal of Speech, Language, and Hearing Research. 2001;44:1300–1314. doi: 10.1044/1092-4388(2001/101).
  11. Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. New Jersey: Lawrence Erlbaum; 1988.
  12. de Pijper JR. Semitone conversions. 2007. Retrieved September 17, 2012, from http://users.utu.fi/jyrtuoma/speech/semitone.html.
  13. Darley FL, Aronson AE, Brown JR. Clusters of deviant speech dimensions in the dysarthrias. Journal of Speech and Hearing Research. 1969;12:462–496. doi: 10.1044/jshr.1203.462.
  14. Dromey C. Spectral measures and perceptual ratings of hypokinetic dysarthria. Journal of Medical Speech Language Pathology. 2003;11:85–94.
  15. Dromey C, Ramig LO. Intentional changes in sound pressure level and rate: Their impact on measures of respiration, phonation, and articulation. Journal of Speech, Language, and Hearing Research. 1998;41:1003–1018. doi: 10.1044/jslhr.4105.1003.
  16. Dromey C, Ramig LO, Johnson AB. Phonatory and articulatory changes associated with increased vocal intensity in Parkinson disease: A case study. Journal of Speech and Hearing Research. 1995;38:751–764. doi: 10.1044/jshr.3804.751.
  17. Ferguson SH, Kewley-Port D. Vowel intelligibility in clear and conversational speech for normal-hearing and hearing-impaired listeners. The Journal of the Acoustical Society of America. 2002;112:259–271. doi: 10.1121/1.1482078.
  18. Frank T, Craig CH. Comparison of the Auditec and Rintelmann recordings of the NU-6. Journal of Speech and Hearing Disorders. 1984;49:267–271. doi: 10.1044/jshd.4903.267.
  19. Hammen VL, Yorkston KM, Minifie FD. Effects of temporal alterations on speech intelligibility in Parkinsonian dysarthria. Journal of Speech and Hearing Research. 1994;37:244–253. doi: 10.1044/jshr.3702.244.
  20. Hertrich I, Ackermann H. Gender-specific vocal dysfunctions in Parkinson’s disease: Electroglottic and acoustic analyses. Annals of Otology, Rhinology, and Laryngology. 1995;104:197–202. doi: 10.1177/000348949510400304.
  21. Holmes RJ, Oates JM, Phyland DJ, Hughes AJ. Voice characteristics in the progression of Parkinson’s disease. International Journal of Language and Communication Disorders. 2000;35:407–418. doi: 10.1080/136828200410654.
  22. Hustad KC. Contribution of two sources of listener knowledge to intelligibility of speakers with Cerebral Palsy. Journal of Speech, Language, and Hearing Research. 2007;50:1228–1240. doi: 10.1044/1092-4388(2007/086).
  23. Hustad KC, Beukelman DR. Listener comprehension of severely dysarthric speech: Effects of linguistic cues and stimulus cohesion. Journal of Speech, Language, and Hearing Research. 2001;45:545–558. doi: 10.1044/1092-4388(2002/043).
  24. IEEE. IEEE recommended practice for speech quality measures. New York: Institute of Electrical and Electronics Engineers; 1969.
  25. Kent RD, Kent JF, Duffy J, Weismer G. The dysarthrias: Speech-voice profiles, related dysfunctions, and neuropathology. Journal of Medical Speech-Language Pathology. 1998;6:165–211.
  26. Kewley-Port D, Pisoni DB, Studdert-Kennedy M. Perception of static and dynamic acoustic cues to place of articulation in initial stop consonants. Journal of the Acoustical Society of America. 1983;73:1779–1793. doi: 10.1121/1.389402.
  27. Kim H, Hasegawa-Johnson M, Perlman A. Temporal and spectral characteristics of fricatives in dysarthria. Journal of the Acoustical Society of America. 2011a;130:2446.
  28. Kim Y, Kent RD, Weismer G. An acoustic study of the relationships among neurologic disease, dysarthria type, and severity of dysarthria. Journal of Speech, Language, and Hearing Research. 2011b;54:417–429. doi: 10.1044/1092-4388(2010/10-0020).
  29. Kim Y, Kuo C. Effect of level of presentation to listeners on scaled speech intelligibility of speakers with dysarthria. Folia Phoniatrica et Logopaedica. 2012;64:26–33. doi: 10.1159/000328642.
  30. Kim Y, Weismer G, Kent R, Duffy JR. Statistical models of F2 slope in relation to severity of dysarthria. Folia Phoniatrica et Logopaedica. 2009;61:329–335. doi: 10.1159/000252849.
  31. Krause JC, Braida LD. Evaluating the role of spectral and envelope characteristics in the intelligibility advantage of clear speech. Journal of the Acoustical Society of America. 2009;125:3346–3357. doi: 10.1121/1.3097491.
  32. Lam J, Tjaden K, Wilding G. Acoustics of clear speech: Effect of instruction. Journal of Speech, Language, and Hearing Research. 2012;55:1807–1821. doi: 10.1044/1092-4388(2012/11-0154).
  33. Laures JS, Weismer G. The effects of a flattened fundamental frequency on intelligibility at the sentence level. Journal of Speech, Language, and Hearing Research. 1999;42:1148–1156. doi: 10.1044/jslhr.4205.1148.
  34. Liss JM, Spitzer S, Caviness JN, Adler C, Edwards B. Syllabic strength and lexical boundary decisions in perception of hypokinetic dysarthric speech. Journal of the Acoustical Society of America. 1998;104:2457–2466. doi: 10.1121/1.423753.
  35. Liss JM, Spitzer S, Caviness JN, Adler C, Edwards B. Lexical boundary error analysis in hypokinetic and ataxic dysarthria. Journal of the Acoustical Society of America. 2000;107:3415–3424. doi: 10.1121/1.429412.
  36. Liss JM, White L, Mattys SL, Lansford K, Lotto AJ, Spitzer SM, Caviness JN. Quantifying speech rhythm abnormalities in dysarthrias. Journal of Speech, Language, and Hearing Research. 2009;52:1334–1352. doi: 10.1044/1092-4388(2009/08-0208).
  37. Liss JM, Utianski R, Lansford K. Crosslinguistic application of English-centric rhythm descriptors in motor speech disorders. Folia Phoniatrica et Logopaedica. 2013;65:3–19. doi: 10.1159/000350030.
  38. MacPherson MK, Huber JE, Snow DP. The intonation-syntax interface in the speech of individuals with Parkinson’s disease. Journal of Speech, Language, and Hearing Research. 2011;54:19–32. doi: 10.1044/1092-4388(2010/09-0079).
  39. Maniwa K, Jongman A, Wade T. Perception of clear fricatives by normal-hearing and simulated hearing-impaired listeners. Journal of the Acoustical Society of America. 2008;123:1114–1125. doi: 10.1121/1.2821966. [DOI] [PubMed] [Google Scholar]
  40. McAuliffe MJ, Good PV, O’Beirne GA, LaPointe LL. Influence of auditory distraction upon intelligibility ratings in dysarthria. Poster presented at the Conference on Motor Speech; 2008. Available from: http://hdl.handle.net/10092/3398.
  41. McAuliffe MJ, Schafer M, O’Beirne GA, LaPointe LL. Effect of noise upon the perception of speech intelligibility in dysarthria. Poster presented at the American Speech-Language and Hearing Association Convention; 2009. Available from: http://hdl.handle.net/10092/3410.
  42. McRae PA, Tjaden K, Schoonings B. Acoustic and perceptual consequences of articulatory rate change in Parkinson disease. Journal of Speech, Language, and Hearing Research. 2002;45:35–50. doi: 10.1044/1092-4388(2002/003). [DOI] [PubMed] [Google Scholar]
  43. Milenkovic P. TF 32 [Computer program] Madison, WI: University of Wisconsin-Madison; 2002. [Google Scholar]
  44. Miller SE, Schlauch RS, Watson PJ. The effects of fundamental frequency contour manipulations on speech intelligibility in background noise. Journal of the Acoustical Society of America. 2010;128:435–443. doi: 10.1121/1.3397384. [DOI] [PubMed] [Google Scholar]
  45. Molloy D. Standardized Mini-Mental State Examination. Troy, NY: Grange Press; 1999. [Google Scholar]
  46. Monsen RB. The oral speech intelligibility of hearing-impaired talkers. Journal of Speech and Hearing Disorders. 1983;48:286–296. doi: 10.1044/jshd.4803.286. [DOI] [PubMed] [Google Scholar]
  47. Neel AT. Effects of loud and amplified speech on sentence and word intelligibility in Parkinson disease. Journal of Speech, Language, and Hearing Research. 2009;52:1021–1033. doi: 10.1044/1092-4388(2008/08-0119). [DOI] [PubMed] [Google Scholar]
  48. Picheny MA, Durlach NI, Braida LD. Speaking clearly for the hard of hearing. II. Acoustic characteristics of clear and conversational speech. Journal of Speech and Hearing Research. 1986;29:434–446. doi: 10.1044/jshr.2904.434. [DOI] [PubMed] [Google Scholar]
  49. Ramig LO, Bonitati C, Lemke J, Horii Y. Voice treatment for patients with Parkinson disease: Development of an approach and preliminary efficacy data. Journal of Medical Speech-Language Pathology. 1994;2:191–209. [Google Scholar]
  50. Ramig LO, Countryman S, Thompson LL, Horii Y. Comparison of two forms of intensive speech treatment for Parkinson disease. Journal of Speech and Hearing Research. 1995;38:1232–1251. doi: 10.1044/jshr.3806.1232. [DOI] [PubMed] [Google Scholar]
  51. Sapir S, Ramig LO, Fox CM. Intensive voice treatment in Parkinson’s disease: Lee Silverman voice treatment. Expert Review of Neurotherapeutics. 2011;11:815–830. doi: 10.1586/ern.11.43. [DOI] [PubMed] [Google Scholar]
  52. Skodda S, Rinsche H, Schlegel U. Progression of dysprosody in Parkinson’s Disease over time – A longitudinal study. Movement Disorders. 2009;24:716–722. doi: 10.1002/mds.22430. [DOI] [PubMed] [Google Scholar]
  53. Skodda S, Visser W, Schlegel U. Gender-related patterns of dysprosody in Parkinson’s Disease and correlation between speech variables and motor symptoms. Journal of Voice. 2011;25:76–82. doi: 10.1016/j.jvoice.2009.07.005. [DOI] [PubMed] [Google Scholar]
  54. Smiljanić R, Bradlow AR. Speaking and hearing clearly: Talker and listener factors in speaking style changes. Language and Linguistics Compass. 2009;3:236–264. doi: 10.1111/j.1749-818X.2008.00112.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Sussman JE, Tjaden K. Perceptual measures of speech from individuals with Parkinson’s disease and Multiple Sclerosis: Intelligibility and beyond. Journal of Speech, Language, and Hearing Research. 2012;55:1208–1219. doi: 10.1044/1092-4388(2011/11-0048). [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Tjaden K. Anticipatory coarticulation in Multiple Sclerosis and Parkinson’s disease. Journal of Speech, Language, and Hearing Research. 2003;46:990–1008. doi: 10.1044/1092-4388(2003/077). [DOI] [PubMed] [Google Scholar]
  57. Tjaden K, Kain A, Lam J. Hybridizing conversational and clear speech to investigate the source of intelligibility variation in Parkinson’s disease. Journal of Speech, Language and Hearing Research. 2014a doi: 10.1044/2014_JSLHR-S-13-0086. [Epub ahead of print] [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Tjaden K, Lam J, Wilding G. Vowel acoustics in Parkinson’s disease and multiple sclerosis: Comparison of clear, loud, and slow speaking conditions. Journal of Speech, Language and Hearing Research. 2013;56:1485–1502. doi: 10.1044/1092-4388(2013/12-0259). [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Tjaden K, Richards E, Kuo C, Wilding G, Sussman J. Acoustic and perceptual consequences of clear and loud speech. Folia Phoniatrica et Logopaedica. 2013;65:214–220. doi: 10.1159/000355867. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Tjaden K, Sussman J, Wilding G. Impact of clear, loud and slow speech on scaled intelligibility and speech severity in Parkinson’s Disease and Multiple Sclerosis. Journal of Speech, Language and Hearing Research. 2014b doi: 10.1044/2014_JSLHR-S-12-0372. [Epub ahead of print] [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Tjaden K, Turner GS. Spectral properties of fricatives in Amyotrophic Lateral Sclerosis. Journal of Speech, Language, and Hearing Research. 1997;40:1358–1372. doi: 10.1044/jslhr.4006.1358. [DOI] [PubMed] [Google Scholar]
  62. Tjaden K, Wilding GE. Rate and loudness manipulations in dysarthria: Acoustic and perceptual findings. Journal of Speech, Language, and Hearing Research. 2004;47:766–783. doi: 10.1044/1092-4388(2004/058). [DOI] [PubMed] [Google Scholar]
  63. Tjaden K, Wilding G. The impact of rate reduction and increased loudness on fundamental frequency characteristics in dysarthria. Folia Phoniatrica et Logopaedica. 2010;63:178–186. doi: 10.1159/000316315. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Tjaden K, Wilding G. Effects of speaking task on intelligibility in Parkinson’s disease. Clinical Linguistics & Phonetics. 2011a;25:155–168. doi: 10.3109/02699206.2010.520185. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Tjaden K, Wilding G. Speech and pause characteristics associated with voluntary rate reduction in Parkinson’s disease and Multiple Sclerosis. Journal of Communication Disorders. 2011b;44:655–665. doi: 10.1016/j.jcomdis.2011.06.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Tsao YC, Weismer G, Iqbal K. Interspeaker variation in habitual speaking rate: Additional evidence. Journal of Speech, Language, and Hearing Research. 2006;49:1156–1164. doi: 10.1044/1092-4388(2006/083). [DOI] [PubMed] [Google Scholar]
  67. Turner GS, Tjaden K, Weismer G. The influence of speaking rate on vowel space and speech intelligibility for individuals with Amyotrophic Lateral Sclerosis. Journal of Speech and Hearing Research. 1995;38:1001–1013. doi: 10.1044/jshr.3805.1001. [DOI] [PubMed] [Google Scholar]
  68. Turner GS, Weismer G. Characteristics of speaking rate in the dysarthria associated with Amyotrophic Lateral Sclerosis. Journal of Speech and Hearing Research. 1993;36:1134–1144. doi: 10.1044/jshr.3606.1134. [DOI] [PubMed] [Google Scholar]
  69. Uchanski RM. Clear speech. In: Pisoni DB, Remez R, editors. The handbook of speech perception. Malden/Oxford: Blackwell; 2005. pp. 207–235. [Google Scholar]
  70. Van Nuffelen G, De Bodt M, Vanderwegen J, Van de Heyning P, Wuyts F. Effect of rate control on speech production and intelligibility in dysarthria. Folia Phoniatrica et Logopaedica. 2010;62:110–119. doi: 10.1159/000287209. [DOI] [PubMed] [Google Scholar]
  71. Van Nuffelen G, De Bodt M, Wuyts F, Van de Heyning P. The effect of rate control on speech rate and intelligibility of dysarthric speech. Folia Phoniatrica et Logopaedica. 2009;61:69–75. doi: 10.1159/000208805. [DOI] [PubMed] [Google Scholar]
  72. Ventry IM, Weinstein BE. Identification of elderly people with hearing problems. Asha. 1983;25:37. [PubMed] [Google Scholar]
  73. Watson PJ, Schlauch RS. The effect of fundamental frequency on the intelligibility of speech with flattened intonation contours. American Journal of Speech-Language Pathology. 2008;17:348–355. doi: 10.1044/1058-0360(2008/07-0048). [DOI] [PubMed] [Google Scholar]
  74. Weismer G. Speech intelligibility. In: Ball MJ, Perkins MR, Müller N, Howard S, editors. The handbook of clinical linguistics. Oxford: Blackwell; 2008. pp. 568–582. [Google Scholar]
  75. Weismer G, Jeng J, Laures JS, Kent RD, Kent JF. Acoustic and intelligibility characteristics of sentence production in neurogenic speech disorders. Folia Phoniatrica et Logopaedica. 2001;53:1–18. doi: 10.1159/000052649. [DOI] [PubMed] [Google Scholar]
  76. Weismer G, Laures JS, Jeng J, Kent RD, Kent JF. Effect of speaking rate manipulations on acoustic and perceptual aspects of the dysarthria in Amyotrophic Lateral Sclerosis. Folia Phoniatrica et Logopaedica. 2000;52:201–219. doi: 10.1159/000021536. [DOI] [PubMed] [Google Scholar]
  77. Weismer G, Yunusova Y, Bunton K. Measures to evaluate the effects of DBS on speech production. Journal of Neurolinguistics. 2012;25:74–94. doi: 10.1016/j.jneuroling.2011.08.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Yorkston KM, Beukelman DR. Assessment of intelligibility of dysarthric speech. Tigard, OR: C.C. Publications; 1981. [Google Scholar]
  79. Yorkston KM, Beukelman DR, Tice R. Sentence intelligibility test for Macintosh [Computer Software] Lincoln, NE: Tice Technology Services; 1996. [Google Scholar]
  80. Yorkston KM, Hammen VL, Beukelman DR, Traynor CD. The effect of rate control on the intelligibility and naturalness of dysarthric speech. Journal of Speech and Hearing Disorders. 1990;55:550–560. doi: 10.1044/jshd.5503.550. [DOI] [PubMed] [Google Scholar]
  81. Yorkston KM, Beukelman DR, Strand EA, Bell KR. Management of motor speech disorders in children and adults. 2nd ed. Austin, TX: Pro-Ed; 1999. [Google Scholar]
  82. Yorkston KM, Beukelman DR, Strand EA, Hakel M. Management of motor speech disorders in children and adults. 3rd ed. Austin, TX: Pro-Ed; 2010. [Google Scholar]
  83. Yunusova Y, Green JR, Greenwood L, Wang J, Pattee GL, Zinman L. Tongue movements and their acoustic consequences in Amyotrophic Lateral Sclerosis. Folia Phoniatrica et Logopaedica. 2012;64:94–102. doi: 10.1159/000336890. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Yunusova Y, Weismer G, Kent RD, Rusche NM. Breath-group intelligibility in dysarthria: Characteristics and underlying correlates. Journal of Speech, Language, and Hearing Research. 2005;48:1294–1310. doi: 10.1044/1092-4388(2005/090). [DOI] [PubMed] [Google Scholar]
