Abstract
Delayed auditory feedback (DAF) has been assessed as a rate reduction and intelligibility enhancing tool in patients with Parkinson's Disease (PD) for some time. However, there are contradictory results in the literature regarding the success of this device. Also, little is known about the effects of DAF on speech other than influences on speech rate and intelligibility. Frequency shifted feedback (FSF) is known to produce more natural sounding speech than DAF and to improve the fluency of persons who stutter. However, there are currently no studies reporting how PD speakers perform under FSF.
The aim of this study was to investigate the effects of both types of altered feedback on the speech of PD and control participants on a broad range of measures. The performance of 16 PD speakers and 11 control speakers in a reading task under DAF, FSF and no altered feedback (NAF) is reported in this paper.
The results showed that all groups responded to altered feedback in a similar way and showed a prominent reduction of speech rate. The conditions evoked changes in pause frequency (increases), loudness levels (increases), pitch variation (increases) and intelligibility and naturalness (decreases) for all or some of the groups. Few effects could be observed on articulation/pause time ratio, pause duration, pitch range, and speech rhythm. Previous reports on differences in the susceptibility of PD speakers to altered feedback were confirmed, and some speakers benefited from the system despite the negative group results for intelligibility and naturalness. In general, FSF resulted in performance closer to the NAF state than DAF on all variables, and for those PD speakers who benefited from altered feedback, the FSF condition evoked the greatest improvement.
Introduction
Hypokinetic dysarthria associated with Parkinson's Disease (PD) often results in increased speech rate or acceleration during utterances (Duffy, 1995), and speech rate reduction is therefore a common therapeutic approach to improve intelligibility. One rate reduction technique, delayed auditory feedback (DAF), does not require any direct control by the speaker and is thought to be most successful in maintaining the naturalness of speech (Yorkston, Beukelman, Strand & Bell, 1999). However, despite the apparent advantages of DAF for intelligibility improvements, a number of questions remain open. The most taxing of these is the question of why some speakers do not appear to benefit from this altered feedback. Various explanations have been proposed, with the speakers' cognitive skills playing a central role (Dagenais, Southwood & Mallonee, 1999). However, the exact factors influencing susceptibility to DAF are still not clear. Secondly, most studies on DAF in PD speakers have concentrated on a relatively small number of parameters to evaluate the device, predominantly speech rate and intelligibility. On the other hand, reports that provide information on a wider range of measures only investigated a few participants (Hanson & Metter, 1983; Yorkston et al., 1999). Research conducted on DAF in persons who stutter (PWS) indicates that this device can have negative effects on speech production, including reduction in pitch variation (Howell, 2004). Given that some of these problems are a common feature of PD, further knowledge about any possible aggravation with DAF would be important for therapeutic considerations. Finally, research on PWS has shown that frequency shifted feedback (FSF) can have fewer negative side effects than DAF (Howell, 2004). However, this system has not been studied in speakers with PD before, and it is thus not known whether it compares favourably with DAF in this population as well.
This paper presents pilot data from a large scale project into the susceptibility of PD speakers to altered feedback and their responses to DAF and FSF. The question addressed here is: What are the effects of DAF and FSF on PD speakers in relation to intelligibility, naturalness, as well as a range of acoustic prosodic parameters?
Methods
Participants
Sixteen speakers with a neurological diagnosis of idiopathic PD participated in the study (Table 1). All but one PD speaker had speech problems, ranging from mild to moderate-severe dysarthria. Eleven control speakers (CON, 3 female, 9 male, 61-77 years, mean: 66.8 years, intelligibility range: 180-200, mean: 186) without any neurological impairment have also been analysed to date. The PD subjects had no history of neurological disorders other than their PD, and none of the speakers had a history of speech and language therapy. All participants were native speakers of British English and their hearing level was appropriate to carry out the tasks.
Table 1.
Subject | Gender | Age | Intelligibility Score | Hoehn & Yahr scale | Medication |
---|---|---|---|---|---|
LPD1 | m | 75 | 173 | 1.5 | 2 |
LPD2 | m | 63 | 104 | 3.5 | 1,2,3 |
LPD3 | m | 73 | 100 | 4 | nil |
LPD4 | m | 62 | 161 | 3 | 1,6,7 |
LPD5 | m | 71 | 125 | 3 | 1,5 |
LPD6 | m | 62 | 145 | 2.5 | 5,6 |
LPD7 | f | 59 | 169 | 3 | 2, 13, 14, 15 |
HPD1 | m | 71 | 184 | 2 | 2 |
HPD2 | m | 62 | 185 | 2.5 | 6, 8, 13, 16, 17, 18, 19 |
HPD3 | f | 69 | 191 | 3 | 1,2,3,4 |
HPD4 | m | 71 | 181 | 1 | 1 |
HPD5 | m | 60 | 184 | 2 | 16, 20, 21, 22 |
HPD6 | m | 71 | 180 | 2 | nil |
HPD7 | m | 66 | 176 | 1 | 1,3 |
HPD8 | f | 63 | 179 | 2.5 | 2, 5, 6, 7, 8, 9, 10, 11 |
HPD9 | m | 67 | 176 | 3 | 1,3,4, 12 |
1 = Sinemet; 2 = Madopar; 3 = Entacapone; 4 = Selegiline; 5 = Domperidone; 6 = Ropinirole; 7 = Amantadine; 8 = Amlodipine; 9 = Zispin; 10 = Co-amilofruse; 11 = Mirtazapine; 12 = Pergolide; 13 = Benzhexol; 14 = Quinine sulphate; 15 = Amitriptyline; 16 = Co-codamol; 17 = Bendrofluazide; 18 = Aspirin; 19 = Sinemet-Plus; 20 = Pramipexole; 21 = Finasteride; 22 = Quinine bisulphate
Experimental Procedure
Each speaker read a text passage (adapted from Lowit-Leuschel & Docherty 2001) in three randomly presented conditions: no altered feedback (NAF), DAF and FSF (½ octave upward shift), using a Casafutura™ Desktop Fluency System. Studies on PWS (Howell 2004) and PD speakers (Hanson & Metter 1983) have identified that greater auditory delay settings resulted in greater rate reduction. In addition, Rousseau & Watts (2002) found a trend for the 150 ms delay setting to be the most beneficial for rate reduction and intelligibility improvement. This delay setting was therefore chosen in the current experiment (see Footnotes). Ideally, the speakers' performance would have been assessed on a variety of delay settings; however, the number of tasks participants had to carry out in addition to those reported here did not permit this.
FSF shifts all speech frequencies up or down by a specified magnitude. A ½ octave upward shift setting was chosen for this condition, as it has been shown to be most effective in PWS (Howell 2004).
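As a rough illustration of the two manipulations, the sketch below expresses a ½ octave upward shift as a frequency ratio and a feedback delay as a number of samples. It is not a description of how the Casafutura device works internally; the function names and the sample rate passed in the usage note are assumptions made for the example only.

```python
def shift_frequency(f_hz, octaves):
    """Return the frequency after shifting by a number of octaves.

    One octave doubles the frequency, so a half-octave upward shift
    multiplies every frequency by 2 ** 0.5 (about 1.414).
    """
    return f_hz * 2.0 ** octaves


def delay_in_samples(delay_ms, sample_rate_hz):
    """Convert a feedback delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sample_rate_hz / 1000)


def delayed_feedback(signal, delay_samples):
    """Return the signal as the speaker would hear it under DAF.

    The output is padded with silence at the start and truncated to the
    original length (a plain delay line, with no other processing).
    """
    padded = [0.0] * delay_samples + list(signal)
    return padded[:len(signal)]
```

For example, `shift_frequency(100.0, 0.5)` gives roughly 141.4 Hz, and `delay_in_samples(147, 20000)` gives 2940 samples for a 147 ms delay at a hypothetical 20 kHz processing rate.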
Data were recorded with a DAT recorder (Tascam DA-P1; Beyerdynamic Microphone M58) and digitized using CSL (Kay Elemetrics, Model 4300B) at a sampling rate of 20 kHz.
Analysis
Intelligibility and naturalness ratings were obtained by Direct Magnitude Estimation (Whitehill, Lee, & Chun, 2002) where listeners compared all speakers to a standard reference stimulus (defined as a score of 100). Four SLT professionals chose the standard out of the current data pool as a typical example of a moderate PD speaker. The speech samples were then randomised across all speakers and conditions and presented in groups of five, i.e. the standard and four samples for rating. Four final year SLT students then scored intelligibility and naturalness on separate occasions (inter-rater reliability p <.001). They had experience in dealing with dysarthric clients and received training in using DME to evaluate the speech samples, as well as advice on the distinction between intelligibility and naturalness ratings. Acoustic analysis was carried out with the Kay Elemetrics Multispeech system (Version 2.4). The following variables were measured:
Speaking rate (syllables per second): Based on the total time taken to read the passage, regardless of dysfluencies or pauses.
Articulation / pause time ratio (%): The ratio of time spent for articulation and pauses relative to speaking time. Pauses were specified acoustically as silences in the signal that lasted more than 200 ms.
Mean pause frequency (pauses per syllable): The number of pauses divided by the number of syllables.
Mean pause duration (ms): The sum of all pause durations divided by the number of pauses.
Pitch range (Hz): The difference between the highest and lowest pitch value per utterance, averaged over the signal.
Pair-wise Variability Index (PVI; Low, Grabe & Nolan, 2000): The mean durational variability of vowels in the signal.
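The timing measures defined above can be sketched in a few lines of code. The example below is a minimal illustration, not the procedure used with the Multispeech system: it assumes that silences of 200 ms or less count as articulation time, that the articulation / pause ratio is expressed as percentages of total reading time, and that the PVI is computed in its normalised form (nPVI), which the text does not specify. All function and key names are hypothetical.

```python
PAUSE_THRESHOLD_S = 0.200  # silences longer than 200 ms count as pauses


def timing_measures(total_time_s, n_syllables, silences_s):
    """Compute rate and pause measures for one reading of the passage.

    total_time_s: total time taken to read the passage, pauses included
    n_syllables:  number of syllables produced
    silences_s:   durations (in seconds) of all silent intervals
    """
    pauses = [s for s in silences_s if s > PAUSE_THRESHOLD_S]
    pause_time = sum(pauses)
    mean_pause_ms = 1000.0 * pause_time / len(pauses) if pauses else 0.0
    return {
        "speaking_rate_syll_per_s": n_syllables / total_time_s,
        "articulation_pct": 100.0 * (total_time_s - pause_time) / total_time_s,
        "pause_pct": 100.0 * pause_time / total_time_s,
        "pause_frequency_per_syll": len(pauses) / n_syllables,
        "mean_pause_duration_ms": mean_pause_ms,
    }


def npvi(durations):
    """Normalised Pairwise Variability Index over successive vowel durations.

    Each pair of neighbouring durations contributes its absolute difference
    divided by the pair mean; the result is the average of these terms
    multiplied by 100.
    """
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two durations")
    terms = [
        abs(a - b) / ((a + b) / 2.0)
        for a, b in zip(durations, durations[1:])
    ]
    return 100.0 * sum(terms) / len(terms)
```

A sequence of identical vowel durations yields an nPVI of 0, and the value grows as neighbouring vowels differ more in length.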
For the statistical analysis non-parametric tests were used (Wilcoxon for two-related-samples and Mann-Whitney-U-Test for two independent-samples). The level of significance was set to p < .05.
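To make the within-group comparison concrete, the sketch below implements an exact two-sided Wilcoxon signed-rank test in plain Python by enumerating all sign assignments. It is an illustration of the test named above, not the software used in the study; statistical packages may use a normal approximation or handle ties and zero differences differently, and the function names are hypothetical.

```python
from itertools import product


def average_ranks(values):
    """Rank values in ascending order, giving ties their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank_exact(x, y):
    """Return (W, p) for paired samples x and y.

    W is the smaller of the positive and negative rank sums. The p-value
    is the share of all 2**n sign assignments whose W is at most the
    observed W. Zero differences are dropped before ranking.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed + 1e-12:
            extreme += 1
    return observed, extreme / 2 ** n
```

With six pairs that all change in the same direction, the smallest attainable two-sided p-value is 2/64 = .031, which shows how small subgroups limit the significance levels that can be reached. The enumeration is feasible for the group sizes in this study but grows exponentially with the number of pairs.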
Results
As previous research suggested differences in response to altered feedback according to the severity of the dysarthria (Rousseau & Watts 2002), the PD group was divided into two subgroups, a high and a low intelligibility group (HPD & LPD). The CON group range (180-200) was taken as the decisive factor in this division, i.e. HPD speakers fell within this range, LPD speakers below it. However, as there were some PD speakers whose score fell just below the normal range (e.g. 179), it was decided to use the CON group minimum minus one SD, i.e. a score of 175, as the threshold. This included a further three speakers in the HPD group. Statistical analyses were performed for both cut-off points (range vs. range minus SD). Although the second scenario resulted in fewer significant differences between groups and conditions, this had no major implication for the overall results and it was therefore adopted. The analysis focused on across-group and across-condition comparisons. As expected from the split of the PD participants, the group comparison showed that the LPD speakers had lower values compared to the HPD and in particular to the CON group. These differences were evident across all feedback conditions (Table 2, Figures 1-3), although they did not always reach statistical significance.
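The split described above can be checked against the intelligibility scores in Table 1. The sketch below is only an illustration: the text does not state whether a score of exactly 175 would count as HPD (no speaker has that score, so the choice made here does not affect the outcome), and the function and variable names are hypothetical.

```python
CON_MINIMUM = 180  # lower end of the control group's intelligibility range
THRESHOLD = 175    # CON minimum minus one SD, as stated in the text

# Intelligibility scores copied from Table 1.
PD_SCORES = {
    "LPD1": 173, "LPD2": 104, "LPD3": 100, "LPD4": 161,
    "LPD5": 125, "LPD6": 145, "LPD7": 169,
    "HPD1": 184, "HPD2": 185, "HPD3": 191, "HPD4": 181, "HPD5": 184,
    "HPD6": 180, "HPD7": 176, "HPD8": 179, "HPD9": 176,
}


def split_groups(scores, threshold):
    """Return (high, low) lists of speaker IDs split at the threshold.

    Assumption: a score equal to the threshold is treated as high.
    """
    high = sorted(s for s, v in scores.items() if v >= threshold)
    low = sorted(s for s, v in scores.items() if v < threshold)
    return high, low


high, low = split_groups(PD_SCORES, THRESHOLD)

# Speakers who are HPD only because of the relaxed cut-off.
added_by_sd_rule = sorted(s for s in high if PD_SCORES[s] < CON_MINIMUM)
```

Run on the Table 1 data, this yields nine HPD and seven LPD speakers, with HPD7, HPD8 and HPD9 being the three speakers admitted to the HPD group by the relaxed threshold.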
Table 2.
Measure | NAF: CON-LPD | NAF: HPD-LPD | DAF: CON-LPD | DAF: HPD-LPD | FSF: CON-LPD | FSF: HPD-LPD |
---|---|---|---|---|---|---|
Intelligibility | .000 | .000 | .001 | .016 | .000 | .005 |
Naturalness | .015 | .005 | .003 | .007 | | |
Speech rate | .020 | .050 | | | | |
Articulation/pause time ratio | .004 | .042 | .000 | .031 | | |
Pause duration | .011 | .042 | .004 | .005 | .007 | .018 |
Pause frequency | | | | | | |
Speech rhythm | | | | | | |
F0 range | .004 | | | | | |
Loudness Level | | | | | | |
The results of the across-feedback condition comparisons, which form the main focus of this paper, are outlined in more detail below.
Intelligibility & Naturalness
All speaker groups were given lower intelligibility ratings in the DAF compared to the NAF condition (Figure 1). This was significant for the CON (p = .003) and HPD (p = .021) speakers. The scores for DAF were also lower than those for the FSF condition for these groups (CON: p = .005; HPD: p = .050). In addition the HPD group had significantly lower scores for FSF compared to NAF (p = .021).
All groups were rated most natural during the NAF condition, followed by FSF and DAF (Figure 1). The difference between NAF and DAF was significant for the LPD (p = .028) and HPD groups (p = .025). In the FSF condition the scores were significantly higher compared to DAF for all groups (CON: p = .041; HPD: p = .050; LPD: p = .027). A significant difference between NAF and FSF could only be found in the HPD group (p = .021).
Despite the relatively homogeneous performance across the groups, there was a considerable amount of variation between individuals. Most participants showed small changes across the conditions, for the better as well as for the worse. Although it is difficult to quantify from the DME results how much change results in significant perceptual improvement, three of the speakers stand out by showing a combined effect of intelligibility and naturalness enhancement (Table 3). These improvements were perceptually validated by the four SLTs who had not been involved in the DME scoring.
Table 3. I = intelligibility, N = naturalness, C = combined score (I + N).
Condition | LPD3 I | LPD3 N | LPD3 C | LPD5 I | LPD5 N | LPD5 C | LPD6 I | LPD6 N | LPD6 C |
---|---|---|---|---|---|---|---|---|---|
NAF | 100 | 100 | 200 | 125 | 119 | 244 | 145 | 119 | 264 |
DAF | 124 | 80 | 204 | 111 | 88 | 199 | 161 | 106 | 267 |
FSF | 131 | 96 | 227 | 139 | 123 | 262 | 158 | 133 | 291 |
The data in Table 3 show that despite variations concerning which condition resulted in the best score for an individual parameter, the FSF condition produced the best performance in relation to the combined score in all three cases.
The table also shows that listeners made a distinction between intelligibility and naturalness rating, e.g. LPD3 and LPD6 were rated as improving in intelligibility but being less natural in the DAF condition.
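The combined scores in Table 3 are consistent with a simple sum of the intelligibility and naturalness ratings, and the comparison can be reproduced directly from the table. The sketch below copies the I and N values from Table 3; the data structure and function names are hypothetical.

```python
# (intelligibility, naturalness) ratings per speaker and condition, from Table 3.
TABLE_3 = {
    "LPD3": {"NAF": (100, 100), "DAF": (124, 80), "FSF": (131, 96)},
    "LPD5": {"NAF": (125, 119), "DAF": (111, 88), "FSF": (139, 123)},
    "LPD6": {"NAF": (145, 119), "DAF": (161, 106), "FSF": (158, 133)},
}


def combined_scores(ratings):
    """Sum intelligibility and naturalness for each feedback condition."""
    return {cond: i + n for cond, (i, n) in ratings.items()}


def best_condition(ratings):
    """Return the condition with the highest combined score."""
    scores = combined_scores(ratings)
    return max(scores, key=scores.get)


best = {speaker: best_condition(r) for speaker, r in TABLE_3.items()}
```

For all three speakers the highest combined score falls in the FSF condition, even though DAF gives the highest intelligibility rating for LPD6.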
Speech Timing Characteristics
Speech rate
All subject groups reduced their speech rate during DAF (Figure 2). The difference between NAF and DAF was significant for all subject groups (CON: p = .003; HPD: p = .008; LPD: p = .028). The reduction from NAF to FSF was smaller, but still significant for the CON (p = .021) and HPD (p = .011) groups. In addition these groups had significantly slower rates in DAF than FSF (CON: p = .008; HPD: p = .028).
Articulation / pause time ratio
The group data showed similar patterns in articulation / pause time ratio changes across the conditions for all groups, with a greater proportion of articulation time in the altered feedback conditions (Figure 2). However, these differences were not statistically significant.
A more detailed analysis of the pause characteristics showed that there was a trend for a change in pausing behaviour for the HPD and CON speakers, i.e. NAF was associated with a lower number but longer duration of pauses than the altered feedback conditions. However, this was only significant for the HPD group in relation to pause frequency (NAF vs. DAF: p = .008; NAF vs. FSF: p = .021).
F0 Variability and Loudness Level
Whilst the CON and HPD speakers did not show any significant changes regarding F0 variability across the three conditions, the LPD speakers had significantly higher results in the FSF condition compared to DAF (p = .046) (Figure 3). None of the groups showed increased mean F0 levels with FSF.
In relation to intensity, the group data indicated increased levels for the altered feedback conditions. The statistical analysis showed a significant increment from NAF to DAF (see Figure 3) for the CON (p = .003) and HPD (p = .015) groups.
Speech Rhythm
None of the groups showed significant changes regarding the PVI across the conditions (Figure 3).
Discussion
The between-group comparisons indicated differences between the LPD and other speaker groups in the areas of naturalness and pausing behaviour. The absence of significant differences in the other measures suggests that intelligibility ratings cannot necessarily predict the degree of impairment in other speech parameters. When individual speakers were considered, all showed the symptoms of hypokinetic dysarthria reported in the literature (Duffy, 1995), and the current sample can be regarded as representative of the PD population.
In relation to the analysis of the different feedback conditions, a prominent feature in the results was the fact that, irrespective of absolute performance values and despite individual variations, the groups were affected by the feedback manipulations to the same degree for most parameters. One could thus say that all speaker groups showed the same susceptibility to altered feedback.
In relation to the individual parameters, all groups showed a prominent reduction of speaking rate during DAF. This is in line with several other studies (Downie, Low & Lindsay, 1981; Hanson & Metter, 1983; Rousseau & Watts, 2002; Yorkston et al., 1999). In addition, the effects of FSF on speaking rate were less pronounced than in DAF. This has also been observed in persons who stutter (Howell, 2004).
The analysis of pausing characteristics indicated that this rate reduction was largely achieved by drawing out the speech sounds and placing more frequent but shorter pauses in the signal. Yorkston et al. (1999) observed a similar pattern in their speakers. The increase in pause frequency was particularly helpful to intelligibility, as word boundaries were signalled more clearly.
The F0 data indicated that the DAF tool had no negative effects on the speakers' degree of pitch modulation as reported in studies with persons who stutter (Howell, 2004). These data suggest that DAF might not aggravate problems such as monopitch in patients with PD as also observed by Hanson & Metter (1983) and Yorkston et al. (1999) in a smaller number of speakers. The fact that there was a higher degree of pitch variation for the LPD speakers during FSF is also in line with findings on PWS reported by Howell (2004). The perceptual analysis of the three speakers who benefited most from FSF indicated that this increase in variability had positive implications, i.e. speakers produced more extensive but still appropriate intonation contours rather than showing unnatural patterns. A greater sample of speakers needs to be analysed before firm conclusions can be drawn about the benefits of FSF on pitch and intonation though.
The finding that the loudness level increased during DAF could be an indication that the Lombard effect was elicited, as subjects only heard their speech through headphones during the altered feedback conditions whereas during NAF no headphones were used. On the other hand this increase was less marked in the FSF condition compared to the DAF condition, even though no changes in input level were made between the conditions. The increase in speech intensity could thus also be related more directly to the effects of DAF rather than being a general side effect of altered feedback (Howell, 1990). Hanson & Metter (1983) hypothesized that subjects spoke with greater physiological effort in the DAF conditions which could explain the increased loudness level observed in this study.
A rather unexpected result was the general lack of rhythmic differences between the conditions. Other authors have described a lengthening of vowels and other phonemes during DAF which should affect the rhythmic structure of utterances (Howell, 2004; Yorkston et al., 1999). There was some suggestion that certain aspects of speech production (such as connected speech processes, the effects of rate reduction) cancelled out these timing changes in the calculation of the PVI. On the other hand, the perceptual analysis indicated that not all speakers showed the “drawling” of vowels during DAF. The perceptual analysis also indicated that vowel length and quality were not as affected in the FSF as in the DAF condition in the current speaker groups, which is in line with previous research on PWS (Howell, 2004). A more detailed acoustic and perceptual analysis of vowel production is necessary to see whether the PVI is the most appropriate method of capturing rhythmic distortions during altered feedback.
Finally, intelligibility and naturalness ratings indicated notable decreases in the altered feedback conditions for all speaker groups. Although Yorkston et al. (1999) discussed that DAF maintained naturalness well, this related to comparisons with other rate reduction techniques rather than the speaker's normal production. It was thus not contrary to the literature that naturalness ratings decreased with altered feedback. Similarly, the reduction in intelligibility levels does not necessarily contradict studies by Downie et al. (1981), Hanson & Metter (1983) and Yorkston et al. (1999). These studies were reporting on speakers for whom DAF was a successful treatment technique and intelligibility benefits were to be expected. Later studies by Dagenais et al. (1999) and Rousseau & Watts (2002) indicate that only some speakers are susceptible to altered feedback, which has also been a result of the current study. One should also consider that only one DAF setting was used in this study. Although this setting has previously been reported to result in the greatest improvement, this does not necessarily have to apply to all speakers. It is thus possible that a greater number of speakers might have shown improvements in intelligibility if they had been recorded with their optimum setting in DAF.
When the acoustic profile of the three speakers who benefited most from altered feedback was looked at in greater detail, no particular parameters or combinations thereof came to light that could have caused that response. That is, these speakers did not show any greater rate reductions, or increases in loudness or pitch variation etc. than the rest of the speakers. In addition, they varied in baseline intelligibility level and PD severity despite all being part of the LPD group, whereas other speakers with similar levels did not respond as well. Although the current results thus follow the trend identified by Rousseau & Watts (2002) with more severe speakers benefiting to a greater extent, severity was not the sole determining factor. The acoustic parameters that were currently included could thus not identify the reasons for the disparate results. A more detailed acoustic phonetic analysis of speakers, focusing more on their articulatory characteristics and how they change in the various conditions might help to determine some of the reasons. In addition, Dagenais et al.'s (1999) hypothesis of a cognitive influence should be considered further.
In summary, the current results showed that altered auditory feedback could result in changes in most of the investigated parameters, although this did not reach significance for all groups all the time. There were also no obvious relationships between the different variables and the current set of measures was unable to identify a relationship between DAF/FSF induced rate reduction and its effects on intelligibility, naturalness, or other prosodic aspects.
With regard to the effects of FSF compared to DAF, the current results largely confirmed those reported in the literature on PWS, i.e. that FSF has less dramatic effects on speech rate characteristics and a more positive effect on naturalness than DAF (Howell, 2004). There were no obvious differences in susceptibility, i.e. speakers who changed performance with DAF also did so with FSF. However, there were some differences in relation to intelligibility and naturalness. Those speakers whose intelligibility improved with altered feedback had higher or similar ratings with FSF than DAF, combined with better scores for naturalness. This suggests that FSF is more beneficial than DAF. However, to make a more reliable statement concerning the different effects of the two devices a greater number of speakers who improve with altered feedback would have to be analysed.
Acknowledgements
This study was funded by the Parkinson Society of the UK. We are grateful to all those who participated in this study.
Footnotes
1. The nearest setting available to this on the currently used equipment was 147 ms.
Contributor Information
Bettina Brendel, Dept. of Speech and Language Therapy Strathclyde University 76 Southbrae Drive Glasgow G13 1PP, Scotland Tel. 44-141-950 3531 Fax: 44-141-9503762
Anja Lowit, Dept. of Speech and Language Therapy Strathclyde University 76 Southbrae Drive Glasgow G13 1PP, Scotland e-mail: a.lowit@strath.ac.uk Tel. 44-141-950 3531 Fax: 44-141-9503762
Prof. Peter Howell, Dept. of Psychology University College London Gower Street London WC1E 6BT e-mail: p.howell@ucl.ac.uk Tel. 44-207-679 7566 Fax: 44-207-436 4276
Literature
- Dagenais PA, Southwood MH, Mallonee KO. Assessing processing skills in speakers with Parkinson's Disease using delayed auditory feedback. Journal of Medical Speech-Language Pathology. 1999;7:297–313.
- Downie AW, Low JM, Lindsay DD. Speech disorders in parkinsonism. Use of delayed auditory feedback in selected cases. British Journal of Disorders of Communication. 1981;16:135–139. doi: 10.3109/13682828109011394.
- Duffy JR. Motor speech disorders. St. Louis: Mosby; 1995.
- Hanson WR, Metter JM. DAF speech rate modification in Parkinson's Disease: A report of two cases. In: Berry WR, editor. Clinical Dysarthria. Boston: College Hill Press; 1983. pp. 232–251.
- Howell P. Effects of delayed auditory feedback and frequency-shifted feedback on speech control and some potentials for future development of prosthetic aids for stammering. Stammering Research. 2004;1:31–46.
- Howell P. Changes in voice level caused by several forms of altered feedback in normal speakers and stutterers. Language and Speech. 1990;33:325–338. doi: 10.1177/002383099003300402.
- Low EL, Grabe E, Nolan F. Quantitative characterizations of speech rhythm: Syllable-timing in Singapore English. Language and Speech. 2000;43:377–401. doi: 10.1177/00238309000430040301.
- Lowit-Leuschel A, Docherty GJ. Prosodic variation across sampling tasks in normal and dysarthric speakers. Logopedics, Phoniatrics and Vocology. 2001;26:151–164. doi: 10.1080/14015430127772.
- Rousseau B, Watts CR. Susceptibility of speakers with Parkinson Disease to delayed feedback. Journal of Medical Speech-Language Pathology. 2002;10:41–49.
- Whitehill TL, Lee ASY, Chun JC. Direct Magnitude Estimation and Interval Scaling of hypernasality. Journal of Speech, Language, and Hearing Research. 2002;45:80–88. doi: 10.1044/1092-4388(2002/006).
- Yorkston KM, Beukelman DR, Strand EA, Bell KR. Management of Motor Speech Disorders in Children and Adults. 2nd ed. Austin, Texas: Pro-Ed; 1999.