Folia Phoniatrica et Logopaedica. 2008 Dec 17;61(1):1–11. doi: 10.1159/000184539

Analysis of Diadochokinesis in Ataxic Dysarthria Using the Motor Speech Profile Program™

Yu-Tsai Wang a,*, Ray D Kent b, Joseph R Duffy c, Jack E Thomas c
PMCID: PMC2790744  PMID: 19088478

Abstract

Aims

The Diadochokinetic Rate Analysis (DRA) in the KayPENTAX Motor Speech Profile is a computer program for the analysis of diadochokinesis (DDK). The objective of this study is to evaluate the suitability, reliability, and concurrent validity of the results from the DRA protocol and hand measurement for individuals with ataxic dysarthria, which is characteristically associated with dysdiadochokinesis.

Methods

Twenty-one participants with ataxic dysarthria were recorded as they repeated various syllables as quickly and steadily as possible. The DDK samples were executed by the DRA protocol at different thresholds and were also hand-measured. Analyses were based on the percentage of nonexecutable DDK samples, defined as samples in which the lowest peak intensity during CV syllables is lower than the highest peak intensity during intersyllable pauses, and the comparisons of the results between repeated analyses at different thresholds and between automatic and manual measuring methods.

Results

(1) More than one third of the DDK samples were nonexecutable; (2) the reliability at different thresholds and concurrent validity between different measuring methods were both satisfactory, and (3) temporal variation parameters were more inconsistent between different measuring methods than intensity variation parameters.

Conclusion

DRA has notable limitations in its clinical application but there is a considerable potential for improving its performance.

Key Words: Diadochokinesis, Reliability, Validity

Introduction

Diadochokinesis (DDK), the rapid repetition of alternating movements, has long been used for the clinical assessment of motor function. Recently, there has been an interest in instrumental quantitative analysis of this task in both clinical neurology [1, 2] and speech-language pathology [3, 4]. Such an analysis potentially offers an economy of time, greater reliability, and more detailed information than can be accomplished by the traditional methods, such as counting movements produced in a given time interval.

In speech-language pathology, DDK (also known as syllable alternating motion rate, AMR) is used to assess the rate and regularity of repetitive movements of the oral articulators. The maximum syllable repetition task is applied clinically to assess reciprocal movements of the lips and anterior and posterior tongue [5]. A fairly substantial normative database has been published for these syllable trains [6]. Average speech AMRs for normal adults range between five and seven syllables per second, with slower repetition rate of /kE/ than /tE/ or /pE/. In addition to place of articulation, AMR also varies with the age of the speaker [6]. A variant of the task, laryngeal or vocal fold DDK, typically requires the subject to repeat [h] + vowel syllables as an assessment of cyclic abductory-adductory vocal fold motions [7]. Although DDK performance often is described simply in terms of average rate, much more information can be derived from the task, which is one reason why an automated quantitative analysis is appealing.

DDK is suited to the examination of basic motor capabilities and is relatively insensitive to concomitant language impairments [8]. It is also easily performed by subjects with speech disorders of different severity levels and some authors believe that it has value for the clinical assessment of neurogenic speech disorders [5, 6, 9]. DDK rate appears to be a useful index of oral motor development in children [10,11,12,13,14]. DDK has been used as a measure of the speech deterioration in amyotrophic lateral sclerosis [15,16,17], cerebellar and spinocerebellar damage [8,18,19,20,21,22], Friedreich disease [23], multiple sclerosis [24, 25], Parkinson's disease [25, 26], hemispheric stroke [20, 27], traumatic brain injury [28], apraxia of speech [29], and cerebellar mutism syndrome [30]. Slow rate, temporal and intensity irregularities, and inaccuracy in the DDK task are reported in a number of studies on speech impairment associated with a neurogenic disease or damage, including cerebral palsy, developmental dyspraxia, stroke, amyotrophic lateral sclerosis, cerebellar diseases, and basal ganglia disorders [19].

DDK rate probably relates to dimensions of overall speech performance in the dysarthrias, but the relationships reported were not consistent. Wessel and Ziegler [18] indicated a high relationship between DDK rate and the intelligibility and severity of ataxic dysarthria, but Kent et al. [8] did not find a significant relationship between DDK rate and overall severity. Wang et al. [28] reported significant relationships between DDK rate and overall severity, overall intelligibility, word intelligibility, and overall prosody of dysarthria in traumatic brain injury, but Ozawa et al. [20] reported that DDK rate was not correlated with overall intelligibility and bizarreness of spastic and ataxic dysarthrias. Furthermore, the deterioration of speech over the course of amyotrophic lateral sclerosis was not paralleled by a deterioration of DDK rate [15, 17]. In addition, DDK rate tended to relate significantly to speaking rate [8, 18, 28]. Nishio and Niimi [31] reported significant correlations between DDK rate, speaking rate, and articulation rate, and indicated that DDK rate was more sensitive to the detection of abnormal articulation than speaking rate and articulation rate.

Because DDK abnormalities are frequently observed in the dysarthrias, this task has been used in both differential diagnosis and the identification of muscle system impairments. Extended quantitative analyses may augment the value of DDK testing. For example, variation in peak intensity during the syllable interval (voice onset time plus vocalic interval) may indicate difficulties with respiratory control or a voice problem, whereas the variation of peak intensity during intersyllable pauses (the closure interval) is likely to reflect continuous voicing (periodic energy) or poor oral articulatory control (aperiodic energy appearing as spirantization) [27]. Tjaden and Watling [25] concluded that the temporal parameters of the acoustic analysis protocol for DDK were better than intensity parameters in differentiating the motor speech disorders in multiple sclerosis and Parkinson's disease from normal speech. Moreover, a component analysis can be useful for differential diagnosis. Ozawa et al. [20] reported that lengthened intersyllable pauses accounted for the DDK slowness in ataxic dysarthria, whereas lengthened syllable durations contributed mainly to the DDK slowness in spastic dysarthria. Therefore, DDK analysis might have clinical usefulness as a measure of speech impairment.

Because of its intrinsic cyclicity and relative simplicity, DDK lends itself to automatic analysis. The simplicity of the DDK task makes it suitable even for severely dysarthric individuals who cannot effectively produce complex utterances such as phrases or sentences. With appropriate analysis, DDK is sensitive to both temporal and energy regularities [25, 27]. The variables measured in the acoustic analysis of the DDK task in previous studies include number of syllables produced per train, percentage of incomplete closures, median syllable duration, intratrain variation coefficient of syllable duration, syllable rate, mean syllable and gap durations, intraspeaker temporal variability, and intraspeaker maximum and minimum energy variability.

Although acoustic analysis can improve the quantitative value of DDK, such analysis can be time-consuming and tedious. Greater efficiency can be achieved through the development of automatic diadochokinetic rate analysis, such as the Motor Speech Profile (MSP) Model 5141 (KayPENTAX, Lincoln Park, N.J., USA), a software option for the KayPENTAX Computerized Speech Lab (CSL) model 4500. The Diadochokinetic Rate Analysis (DRA), a part of the MSP, is a protocol that measures the rate and regularity of consonant-vowel (CV) syllables repeated in a task involving maximum-rate repetition on one deep breath. The DRA protocol generates 11 temporal and intensity parameters automatically and simultaneously. Table 1 lists the DRA measures, and their symbols and abbreviations.

Table 1.

Acoustic parameters extracted by DRA

Parameter Symbol Unit
Average DDK period DDKavp ms
Average DDK rate DDKavr /s
Standard deviation of DDK period DDKsdp ms
Coefficient of variation of DDK period DDKcvp %
Perturbation of DDK period DDKjit %
Average DDK peak intensity DDKavi dB
Standard deviation of DDK peak intensity DDKsdi dB
Coefficient of variation of DDK peak intensity DDKcvi %
Maximum intensity of DDK sample DDKmxa dB
Average intensity of DDK sample DDKava dB
Average syllable intensity DDKsla dB

The DRA protocol can work with either on-line recording (wherein the participant hears a prerecorded DDK sample and then performs the DDK task) or a digital audiotape recording of DDK prepared in advance. The typical procedure in the MSP-DRA protocol is as follows: (a) capture a token from the client in real time or open a prepared digitized sound file, and then select an appropriate 7-second DDK sample, (b) select the gender of the participant to compare the results to suitable normative values, (c) perform the analysis itself, and (d) if the threshold provided by the program is not appropriate, then the operator can reposition it to an appropriate level and perform a reanalysis. This system generates a graph of the results of all the parameters against its database. It also displays the analyzed parameters on a table that incorporates normative values to help identify potentially important clinical differences.

Several factors can complicate the use of DRA. For example, inability to reposition an appropriate threshold in the MSP-DRA conceivably may occur when the lowest peak intensity during syllable intervals is lower than the highest peak intensity during intersyllable pauses (between the end of the vocalic nucleus and the following burst onset), or when the DDK trains are blurred, such as in participants with Parkinson's disease. Another possible complication is spirantization, resulting from articulatory undershoot or a weak articulatory force that is insufficient to achieve or maintain articulatory closure, which occurs during stop productions especially in hypokinetic dysarthria [32].

Ataxic dysarthria, characterized especially by a scanning pattern of speech, irregular articulatory breakdown, dysdiadochokinesis, prosodic excess, and phonatory-prosodic insufficiency, is associated with lesions in the cerebellar circuit [5, 32]. Abnormal DDK is a hallmark of ataxic dysarthria [5, 33]. Among the various types of dysarthria, ataxic dysarthria is well suited to the examination of automatic analysis of DDK samples because it exhibits typical dysdiadochokinesis, evident amplitude fluctuation, excess loudness, and irregular articulatory breakdown, but it is rarely affected by the ‘blurred’ DDK or spirantization commonly seen in hypokinetic dysarthria. It is challenging but still reliable to analyze. There is considerable potential for the automatic analysis of DDK in ataxia. Although automated processing, such as MSP-DRA, often is assumed to be inherently effective and reliable, it is important to see if automated processing applies well to dysarthric speech, given that temporal and intensity irregularities are common in dysarthrias. Furthermore, since the threshold level lines need to be repositioned for most executable DDK samples, confirmation of the reliability of the outputs between different thresholds using the same DRA protocol, so-called alternate-forms reliability [34], is legitimate. Finally, the concurrent validity, referring to prediction or agreement between two independent methods measuring the same attribute [35], between the results of computational algorithms and of hand measuring for the same DDK data has not been reported.

Despite the apparent usefulness of DDK in determining basic speech motor functions in ataxic speech, there are no published studies on the suitability of MSP-DRA on this task, the alternate-forms reliability of repositioned thresholds, and the concurrent validity between different measuring methods, in the form of a comparison between the results of computational algorithm and of hand measuring. This paper reports on the suitability, reliability and concurrent validity of the DRA protocol for use with samples of ataxic speech recorded under conditions that equal or exceed the typical clinical environment. It answers three main questions:

  • (1) Is DRA suitable for the analysis of DDK samples from speakers with ataxic dysarthria?

  • (2) How reliable is the repeated analysis of the same signal at varying thresholds using the DRA protocol?

  • (3) How does the DRA protocol compare with hand-measured analysis for the same DDK sample?

Method

Participants

Table 2 shows the characteristics of the participants, who were 21 individuals (9 men and 12 women) with ataxic dysarthria. Most participants had diagnoses of cerebellar degeneration or cerebellar neoplasm. Ataxic dysarthria was diagnosed perceptually by speech-language pathologists at the Mayo Clinic in Rochester, Minn., based on the presence of more than one of the following speech features: irregular AMR; unstable vowel prolongation; perceived irregular articulatory breakdown; excess and equal stress; abnormal variations in loudness, pitch, and duration; and vowel distortion. Moreover, participants did not exhibit deviant speech features and oral mechanism findings associated with other dysarthria types but not with ataxic dysarthria [5, 33]. These criteria are consistent with those used in most studies of this speech disorder.

Table 2.

Participant characteristics

Subject Sex Age Diagnosis
AD01 F 26 CD
AD02 F 46 CD
AD03 F 59 CD
AD04 F 73 CD
AD05 M 54 CD
AD06 M 60 CD
AD07 M 66 CD
AD08 M 67 CD
AD09 F 39 CP
AD10 F 40 CP
AD11 F 55 CP
AD12 F 57 CP
AD13 F 69 CP
AD14 M 61 SCS
AD15 M 51 SCS
AD16 F 74 CCA
AD17 F 52 OPCA
AD18 M 17 T
AD19 F 44 T
AD20 M 67 BCS
AD21 M 39 AVM

CD = Cerebellar degeneration; CP = cerebellar paraneoplasm; SCS = spinal cerebellar syndrome; CCA = cerebral and cerebellar atrophy; OPCA = olivopontocerebellar atrophy; T = tumor; BCS = brainstem and cerebellar syndrome; AVM = arteriovenous malformation.

Data Collection

The speaking task of DDK was selected for analysis from a larger research study of dysarthria, collected between 1995 and 2006 at the Mayo Clinic. Each participant performed DDK tasks in the same order following the examiner's instructions and demonstration (‘take a deep breath and repeat /pEpEpE…/ as fast and steady as you can and keep it up for a while’). Three CV syllables, /pE/, /tE/, and /kE/, were selected for analysis in this study because they offered variations in place of consonant articulation and because they are frequently used in clinical speech assessments.

The speech samples for the subjects with ataxic dysarthria were recorded at the Mayo Clinic with either an Audio-Technica ATM71 cardioid condenser head-mounted microphone (Audio-Technica U.S., Stow, Ohio, USA) or an Audio-Technica PRO 8 HEx hypercardioid dynamic head-mounted microphone at a sampling rate of 44.1 kHz and with 16-bit quantization in a soundproof room. The microphone-to-mouth distance was 8–10 cm at an angle of 45°. While recording, the experimenter adjusted the input to an appropriate level at first and monitored the output through the recording. The input level was kept constant during recording. The recorded DDK samples were then digitized as sound files to be analyzed using the MSP-DRA protocol.

Suitability

The MSP-DRA protocol in CSL requires a 7-second DDK sample. When the DDK sample was longer than 7 s, the most steady and rhythmic 7-second sample was selected by the first author; otherwise, the whole DDK sample was used for further analysis. The shortest sample, produced by AD09, was 4 s long and contained 7 /tE/ syllables. Seven DDK trains were shorter than 5 s, 13 trains were between 5 and 6 s, and 7 trains were between 6 and 7 s. The collected DDK samples were then analyzed using the MSP-DRA protocol in CSL. When the lowest peak intensity during CV syllables was lower than the highest peak intensity during intersyllable pauses, the DDK sample was defined as nonexecutable by the MSP-DRA protocol. When a DDK trial was nonexecutable, the program could still run, but it gave an incorrect result.
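The executability criterion defined above can be sketched in a few lines of Python. This is only an illustration of the stated definition, not the MSP-DRA implementation; the function names and the example peak values are hypothetical, and the peak intensities are assumed to have been measured already.

```python
def is_executable(syllable_peaks_db, pause_peaks_db):
    """Apply the executability criterion to one DDK train.

    A train is treated as nonexecutable when the lowest peak intensity
    during CV syllables is lower than the highest peak intensity during
    intersyllable pauses, because no single threshold can then separate
    every syllable from every pause.
    """
    if not syllable_peaks_db or not pause_peaks_db:
        raise ValueError("both peak lists must be non-empty")
    # A tie counts as executable only because the stated definition
    # uses a strict "lower than" comparison.
    return not (min(syllable_peaks_db) < max(pause_peaks_db))


def percent_nonexecutable(executable_flags):
    """Percentage of trains whose flag is False (nonexecutable)."""
    if not executable_flags:
        raise ValueError("no trains supplied")
    failures = sum(1 for ok in executable_flags if not ok)
    return 100.0 * failures / len(executable_flags)


# Hypothetical peak values in dB for two trains.
clean_train = is_executable([70.0, 68.5, 72.1], [55.0, 60.2])
broken_train = is_executable([70.0, 58.0, 72.1], [55.0, 60.2])
```

Here `clean_train` is `True` and `broken_train` is `False`, because the 58.0-dB syllable peak falls below the 60.2-dB pause peak.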

Several factors may contribute to the difficulty of using the MSP-DRA protocol to analyze the DDK in ataxic dysarthria, including irregular articulatory breakdowns and the presence of a dip or valley in the energy contour of CV syllables. Figure 1 shows an example of the energy contour of DDK for /kE/ by subject AD05; the CV combinations have the kind of appearance that is typical of ataxic dysarthria. The horizontal line in the middle of the figure (indicated by arrow a) is the adjustable threshold for recognizing a CV production. Each CV combination is identified in the brackets at the top of the window. The durations and peak intensities for all CV syllables were then computed to generate temporal and intensity parameters instantly. In the plot of the energy contour, the maximum of the energy between the end of the vocalic nucleus and the following burst onset (as indicated by arrow b) is larger than the minimum of the peak intensities during the CV syllables (as indicated by arrow c), which made this DDK train nonexecutable by the MSP-DRA protocol. The reduced peak intensity during syllable intervals indicated by arrow c was caused by articulatory breakdown, which is not uncommon in ataxic dysarthric speech and probably causes difficulties in the accurate execution of the DRA protocol. Furthermore, a dip in energy between the consonant and the vowel segments as indicated by the arrows in figure 2 also probably causes inaccurate results of the execution of the DRA protocol. The executable percentage of DDK samples in the MSP-DRA protocol was then calculated to gauge the suitability of the MSP-DRA protocol.
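The two failure modes described above (a syllable peak that stays below the threshold, and an energy dip inside a syllable) can be illustrated with a simple threshold-crossing counter. This sketch is an assumption-laden stand-in, not the MSP-DRA algorithm, whose internals are not described here; `count_syllables` and the toy energy contours are hypothetical.

```python
def count_syllables(energy_db, threshold_db):
    """Count upward crossings of a fixed threshold in an energy contour."""
    count = 0
    above = False
    for value in energy_db:
        if value >= threshold_db and not above:
            count += 1          # contour has just risen above the threshold
            above = True
        elif value < threshold_db:
            above = False       # contour has fallen back below the threshold
    return count


THRESHOLD_DB = 55.0  # hypothetical threshold level

# Three well-formed syllables separated by low-energy pauses.
regular = [40, 70, 40, 70, 40, 70, 40]

# Articulatory breakdown: the second syllable peaks below the threshold.
breakdown = [40, 70, 40, 50, 40, 70, 40]

# Dip between consonant and vowel: the first syllable drops below the
# threshold mid-syllable and is therefore counted twice.
dip = [40, 68, 52, 70, 40, 70, 40, 70, 40]

regular_count = count_syllables(regular, THRESHOLD_DB)      # 3 of 3 detected
breakdown_count = count_syllables(breakdown, THRESHOLD_DB)  # 2 of 3 detected
dip_count = count_syllables(dip, THRESHOLD_DB)              # 4 detected for 3 intended
```

In this toy setting, a breakdown leads to a skipped syllable and a dip leads to an added one, which mirrors the two kinds of inaccurate execution discussed in the text.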

Fig. 1.


Example of DDK for /kE/ by subject AD05. Arrow a points to a horizontal line, which represents the repositioned threshold level. Arrow b indicates the maximum valley intensity during intersyllable pauses. Arrows c and d indicate the reduced peak intensity during CV syllables caused by articulatory breakdown, which is not uncommon in ataxic dysarthric speech and probably causes difficulties in the accurate execution of DRA protocol.

Fig. 2.


Example of DDK for /pE/ by subject AD12. The arrows indicate the locations where the energy dips down between the consonant and the vowel segments, which causes inaccurate execution of the DRA protocol on this DDK train.

Reliability

Reliability of the analysis for the DDK samples was determined by rerunning the MSP-DRA protocol for the same digitized signal at two other selected thresholds. For the executable DDK samples, the first author first estimated the optimal threshold and then repositioned the thresholds 2 dB higher or lower than the optimal threshold level to test the reliability of the MSP-DRA protocol at different threshold levels for the same signal. We selected an increment of 2 dB because normal control speakers usually had less than 2 dB standard deviation for both energy maxima and minima [8]. Finally, DRA-derived parameters were compared at high and low thresholds for all data that could be repositioned. Paired t tests were used to test the significance of the differences between different thresholds at α = 0.05 level. The intraclass correlation coefficient (ICC) was also used to gauge the agreement between different thresholds.
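As a rough illustration of the paired comparison described above, the following sketch computes a paired t statistic for one parameter measured at the high and low thresholds. The study does not state which software was used, so this is a generic textbook formulation using only the Python standard library; the function name and the sample values are hypothetical, and the lookup of the p value is left out.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(high_values, low_values):
    """Return (t statistic, degrees of freedom) for paired observations."""
    if len(high_values) != len(low_values):
        raise ValueError("paired samples must have the same length")
    diffs = [h - l for h, l in zip(high_values, low_values)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / sqrt(n)), n - 1


# Hypothetical DDKavp values (ms) for four trains at the two thresholds.
t_stat, dof = paired_t([285.0, 301.0, 250.0, 312.0],
                       [284.0, 301.0, 249.0, 312.0])
```

With 22 paired samples, as in the reliability analysis, the degrees of freedom would be 21.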

Concurrent Validity

Concurrent validity of the analysis for the DDK samples was determined by comparing the results between the automatic protocol and hand measurements for all DDK samples that were executable by the DRA method. For each executable DDK sample file, the CV syllables and the peak intensity within each CV syllable were manually measured by the first author using the software system TF32 [36], formerly known as CSpeech, through its functions of labeling and measurement sequence. The measurements of burst onset and the end of the vocalic nucleus are generally straightforward. When double bursts occurred, the first burst was defined as the onset of burst [37,38,39]. The same criterion was used in the case of multiple bursts. The end of the vocalic nucleus was determined by the presence of energy in the main vowel formants, i.e., the first formant (F1) combined with energy for another higher formant (F2 or F3).

In MSP-DRA, DDKavp is the average period between CV syllables. The periods are measured between the voicing offsets of the syllables, i.e. between the negative slopes of the end of the syllables at the points crossing the threshold. Therefore, each period includes intersyllabic interval time and syllable duration. DDKavr is the inverse of DDKavp. DDKsdp times 100 divided by DDKavp is DDKcvp. DDKsdi is the standard deviation of DDK peak intensity. The principle and calculations of hand-measured values are identical to those in the MSP-DRA protocol:

Waveform, RMS trace, wideband and narrowband spectrograms were used in a composite display to determine locations of burst onset and the end of the vocalic nucleus. The first syllable duration of each DDK sample was excluded. The duration between the first end of the vocalic nucleus and the following end of the vocalic nucleus was the first CV syllable duration, and so on. The average of all CV syllable durations was calculated as hand-measured DDKavp. DDKavr was the inverse of DDKavp. The standard deviation of all CV syllable durations was calculated as hand-measured DDKsdp. DDKcvp was DDKsdp times 100 divided by DDKavp.
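The hand-measured temporal parameters described above can be sketched as follows. The input is assumed to be a list of times (in ms) marking successive ends of the vocalic nucleus, so that each difference is one CV period. The function name is hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def temporal_parameters(nucleus_end_times_ms):
    """Compute DDKavp, DDKavr, DDKsdp, and DDKcvp from nucleus end times."""
    periods = [later - earlier
               for earlier, later in zip(nucleus_end_times_ms,
                                         nucleus_end_times_ms[1:])]
    if len(periods) < 2:
        raise ValueError("at least three time points are required")
    avp = mean(periods)            # average DDK period, ms
    sdp = stdev(periods)           # SD of DDK period, ms (sample SD assumed)
    return {
        "DDKavp": avp,
        "DDKavr": 1000.0 / avp,    # inverse of the period, syllables per second
        "DDKsdp": sdp,
        "DDKcvp": 100.0 * sdp / avp,  # coefficient of variation, %
    }


# Hypothetical nucleus end times for a short train.
params = temporal_parameters([0.0, 180.0, 400.0, 600.0])
```

For these hypothetical times the periods are 180, 220, and 200 ms, giving an average period of 200 ms, a rate of 5 syllables per second, a standard deviation of 20 ms, and a coefficient of variation of 10%.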

Afterwards, the DDKavp, DDKavr, DDKsdp, and DDKcvp for each DDK sample were measured and calculated. For the temporal parameters, since the inverse of DDKavp is DDKavr, only DDKavp was included in further analysis. For the intensity parameters, the peak intensity during each syllable interval was measured from the TF32 energy contour, and the standard deviation of all the measures was calculated as DDKsdi. Because the peaks of the energy contour of CV syllables were relative values in the present study, the DDKavi and DDKcvi between DRA and hand-measurement results were not directly comparable. However, adding a constant, such as 96.33 dB, to every peak intensity reading taken from TF32 does not affect its standard deviation, so DDKsdi between DRA and hand-measurement results is comparable. Finally, comparisons for the three temporal parameters (DDKavp, DDKsdp, and DDKcvp) and the one intensity parameter (DDKsdi) were tested for the results generated by the MSP-DRA protocol and by hand measurement for the same data to determine the concurrent validity between different measuring methods. The ICC was used to gauge the agreement between different measuring methods. A paired t test was also used to test the significance of the differences between the two measuring methods at α = 0.05 level.
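The reason why DDKsdi remains comparable while DDKavi and DDKcvi do not can be checked numerically. In the sketch below, the relative peak readings are hypothetical values, the 96.33-dB constant is taken from the text only as an example offset, and the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def intensity_summary(peaks_db):
    """Return (average, standard deviation, coefficient of variation in %)."""
    avi = mean(peaks_db)
    sdi = stdev(peaks_db)
    cvi = 100.0 * sdi / avi
    return avi, sdi, cvi


# Hypothetical relative peak readings (dB) from an energy contour.
relative_peaks = [-12.4, -10.1, -13.0, -11.2]

# The same readings shifted by a constant offset.
shifted_peaks = [p + 96.33 for p in relative_peaks]

rel_avi, rel_sdi, rel_cvi = intensity_summary(relative_peaks)
abs_avi, abs_sdi, abs_cvi = intensity_summary(shifted_peaks)
```

The two standard deviations agree up to floating-point rounding, whereas the averages differ by the offset and the coefficients of variation differ because they divide by the average.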

Hand-Measurement Agreement

About 1 month after completion of the acoustic analysis, eight DDK trains (20% of the executable data selected by a random number table) were remeasured by the first author and a second-year graduate student in communication disorders with experience in the acoustic analysis of speech to gauge intra- and interanalyst agreement. The number of intensity peaks identified in the two measures was identical for both intra- and interanalyst agreement. The Pearson correlation coefficient of CV durations between the two measures was 0.997 and 0.988 for intra- and interanalyst agreement, respectively. The mean and standard deviation of absolute differences between the two measures were 2.1 and 3.0 ms for intra-analyst agreement, and 3.9 and 6.2 ms for interanalyst agreement, respectively. The proportion of measurements that agreed within 10 ms was 98.7 and 93% for intra- and interanalyst agreement, respectively. The ICC between the two measures was 0.997 and 0.988 for intra- and interanalyst agreement, respectively. The intra- and interanalyst agreement of hand measurement was judged to be satisfactory.
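The agreement statistics reported above (mean absolute difference, percentage of measurements within 10 ms, and Pearson correlation) can be sketched as follows. The function name and the example durations are hypothetical, and the ICC is omitted because the text does not specify which ICC form was used.

```python
from math import sqrt


def agreement_summary(first_ms, second_ms, tolerance_ms=10.0):
    """Summarize agreement between two sets of paired duration measurements."""
    if len(first_ms) != len(second_ms) or len(first_ms) < 2:
        raise ValueError("need two equally long lists with at least two items")
    n = len(first_ms)
    abs_diffs = [abs(a - b) for a, b in zip(first_ms, second_ms)]
    mean_abs_diff = sum(abs_diffs) / n
    pct_within = 100.0 * sum(1 for d in abs_diffs if d <= tolerance_ms) / n

    # Pearson correlation computed from centered sums of products.
    mean_a = sum(first_ms) / n
    mean_b = sum(second_ms) / n
    sxy = sum((a - mean_a) * (b - mean_b) for a, b in zip(first_ms, second_ms))
    sxx = sum((a - mean_a) ** 2 for a in first_ms)
    syy = sum((b - mean_b) ** 2 for b in second_ms)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return {
        "mean_abs_diff_ms": mean_abs_diff,
        "pct_within_tolerance": pct_within,
        "pearson_r": sxy / sqrt(sxx * syy),
    }


# Hypothetical CV durations (ms) from a first and a second measurement pass.
summary = agreement_summary([200.0, 210.0, 190.0, 205.0],
                            [202.0, 209.0, 191.0, 220.0])
```

For these hypothetical values the mean absolute difference is 4.75 ms and 75% of the paired measurements agree within 10 ms.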

Results

Suitability

Twenty-three out of the 63 (36.5%) DDK trains (7 /pE/, 8 /kE/, and 8 /tE/) for the ataxic DDK samples were nonexecutable due to the impossibility of repositioning a threshold on the intensity plot that allowed the algorithm to execute correctly. The distribution of nonexecutable DDK trains was comparable for different consonants. All three DDK trains produced by AD06, AD10, AD15, AD16, and AD21 were nonexecutable. Additionally, /pE/ produced by AD12 and AD17, /tE/ produced by AD02, AD05, and AD09, and /kE/ produced by AD05, AD07, and AD13 were nonexecutable.

Reliability of Analysis at Different Threshold Levels

When the intensity difference is less than 4 dB between the maximum of the energy between the end of vocalic nucleus and the following burst onset and the minimum of the peak intensities during the CV syllables, the DDK intensity plot cannot be repositioned up and down in 2-dB increments. This limitation affected 18 samples. Consequently, there were 22 out of 40 ataxic DDK samples that could be compared between high and low thresholds. All of the intensity parameters generated with the MSP-DRA protocol are the same at different thresholds, except for DDKsla, which is not of interest in this study. Therefore, only the temporal parameters were compared at high and low thresholds for all data that could be repositioned. Table 3 reports means and standard deviations of the output of the DRA temporal parameters at high and low thresholds and correlation coefficients between the two thresholds for each parameter. Paired t test statistical results showed no significant differences between different thresholds for all tested parameters at α = 0.05 level. The ICC of the two measures between different thresholds was 1.0, 1.0, 0.996, 0.989, and 0.993 for DDKavp, DDKavr, DDKsdp, DDKcvp, and DDKjit, respectively.

Table 3.

Output of DRA temporal parameters at high and low thresholds

Parameter  Mean ± SD (high)  Mean ± SD (low)  AMD (ASD)  rHL  t(21)
DDKavp  284.13±81.44  284.16±81.61  0.37 (0.43)  1.00  0.233
DDKavr  3.76±0.945  3.76±0.946  0.0048 (0.0058)  1.00  0.235
DDKsdp  35.07±26.66  34.89±27.38  1.38 (1.14)  0.996  0.348
DDKcvp  11.96±5.84  11.87±5.92  0.49 (0.43)  0.989  0.466
DDKjit  3.87±3.94  3.84±3.99  0.229 (0.196)  0.993  0.305

Shown for each DRA temporal parameter in column 1 are: columns 2 and 3, the mean and standard deviation (SD) data at high and low thresholds; column 4, the absolute mean discrepancy (AMD) and the standard deviations of the absolute difference (ASD) between the output at high and low thresholds; column 5, correlation coefficients between the output at high and low thresholds (rHL), and column 6, the paired t test results.
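The 4-dB requirement for threshold repositioning described in this section can be expressed as a small check. In the study the optimal threshold was estimated by the first author; placing it at the midpoint of the available margin, as done below, is purely an assumption for illustration, and the function name and example values are hypothetical.

```python
def threshold_pair(min_syllable_peak_db, max_pause_peak_db, step_db=2.0):
    """Return (low, high) thresholds, or None when repositioning is impossible.

    Repositioning by +/- step_db around an optimal level needs a margin of at
    least 2 * step_db between the highest pause peak and the lowest syllable
    peak (4 dB for the 2-dB step used in the study).
    """
    margin = min_syllable_peak_db - max_pause_peak_db
    if margin < 2 * step_db:
        return None
    optimal = max_pause_peak_db + margin / 2.0  # midpoint: an assumption
    return optimal - step_db, optimal + step_db


wide_margin = threshold_pair(66.0, 60.0)    # 6-dB margin: (61.0, 65.0)
narrow_margin = threshold_pair(63.0, 60.0)  # 3-dB margin: None
```

Under this check, the 18 executable samples with a margin below 4 dB would return `None`, leaving 22 of the 40 executable samples for the comparison of high and low thresholds.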

Concurrent Validity of Analysis between Different Measuring Methods

There were 40 executable DDK samples in the present study. Figure 3 shows the scatter plots of hand-measured and DRA data for the four parameters DDKavp, DDKsdp, DDKcvp, and DDKsdi. Note that temporal variation parameters, DDKsdp and DDKcvp, were more inconsistent between different measuring methods than intensity variation parameter DDKsdi. The ICC of the two measures between different measuring methods was 0.999, 0.940, 0.865, and 0.983 for DDKavp, DDKsdp, DDKcvp, and DDKsdi, respectively. Table 4 shows the means and standard deviations of the four parameters generated from hand measurement and the DRA protocol. Paired t test statistical results showed no significant differences between different measuring methods for all tested parameters at α = 0.05 level.

Fig. 3.


Scatter plots of hand-measured and DRA data for the four parameters DDKavp, DDKsdp, DDKcvp, and DDKsdi.

Table 4.

Means and SD for DDKavp, DDKsdp, DDKcvp, and DDKsdi derived from hand measurements and DRA protocol

Parameter  Mean ± SD (hand)  Mean ± SD (DRA)  AMD (ASD)  rHD  t(39)
DDKavp  284.31±99.18  284.42±99.71  1.82 (1.94)  1.00  0.240
DDKsdp  35.53±31.35  36.74±39.22  7.2 (10.0)  0.964  0.622
DDKcvp  11.66±5.36  11.70±6.59  2.24 (2.15)  0.884  0.065
DDKsdi  2.13±0.80  2.10±0.78  0.11 (0.09)  0.983  0.979

Shown for each DRA parameter in column 1 are: columns 2 and 3, the mean and SD data generated from hand measurement and DRA protocol; column 4, the absolute mean discrepancy (AMD) and SD values of the absolute difference (ASD) between the output from hand measurement and DRA protocol; column 5, correlation coefficients between the output from hand measurement and DRA protocol (rHD); and column 6, the paired t test results.

Discussion

Suitability

Executability of the DDK samples by the DRA algorithm was a major limiting factor in this study. More than one third of the DDK samples from the participants with ataxic speech were nonexecutable, and the distributions of nonexecutable DDK trains for different consonants were comparable. Certain participants tended to produce nonexecutable DDK trains; their trains accounted for about 70% of all the nonexecutable DDK trains in the present study.

Nonexecutable DDK trains were due to skipped or added syllables that were related especially to intersyllable pauses containing substantial energy and/or syllable peaks with reduced energy. Several different factors accounted for nonexecutable syllable trains. First, it was observed that high-energy explosive consonants occurred with considerable frequency in ataxic dysarthria. These events probably relate to the frequent perceptual observation of explosive loudness associated with cerebellar dysfunction [5]. Although the presence of such consonants did not always affect the DRA procedure, they did interfere often enough to constitute a problem for automated analysis. Second, articulatory breakdowns, characterized acoustically by abruptly reduced energy of syllables in the DDK sample, were a frequent cause for inaccurate execution of the DRA protocol on DDK trains in ataxic dysarthria. Arrows c and d in figure 1 indicate CV syllables with reduced peak intensity caused by articulatory breakdown. The result obtained from the DRA protocol was inaccurate because these two CV syllables were not detected within the brackets. Third, another possible reason for nonexecutable trains was the presence of a dip in energy between the consonant and the vowel segments (arrows in fig. 2), presumably reflecting incoordination of intrasyllabic movements or idiosyncratic speaking style. The result of the DRA protocol was inaccurate because more than the expected number of CV syllables was detected within the brackets.

The suitability of the MSP-DRA protocol for data analysis in this study was only fair. About 37% of the dysarthric speech samples were not admissible for this protocol. The large number of nonexecutable DDK trains in ataxic dysarthria indicates a limited application of MSP-DRA for ataxic dysarthria, and perhaps also for other kinds of dysarthria. Users of this system should be cautious and ascertain the suitability of samples for analysis by inspecting whether the lowest peak intensity during syllable intervals has a lower value than the highest peak intensity during intersyllable pauses. This inspection can be performed by observation of the waveform.

Since only a few syllables in some DDK trains made the file nonexecutable, one possible way to increase the percentage of executable data is to adjust the protocol to allow for semiautomation of analysis, which would allow the users to insert markers on those problematic syllables and rerun the protocol using a flexible threshold.

Reliability and Concurrent Validity

If the DDK samples are suitable for the DRA protocol, the reliability of the temporal parameters is acceptable for most analysis purposes: the results were consistent across different threshold levels for the same signal. With regard to the agreement between the two measuring methods, the differences were not statistically significant, but discrepancies in the temporal variation parameters (DDKsdp and DDKcvp) were noted for some participants with ataxic dysarthria. These discrepancies indicate an important source of variation whose causes need to be understood through a more detailed component analysis within the CV syllable period of the DRA protocol. Larger discrepancies in temporal variation between the DRA protocol and the manual measurements are probably due to factors that cause unexpected fluctuation of the energy contour in dysarthric speech and therefore require a higher threshold level for executing the DRA protocol. These factors include energy variation between the consonant and vowel segments, cycle-to-cycle variation in the timing between the oral articulators and laryngeal phonation (voice onset time) across the DDK train, and unexpected discoordination during the maximum performance task [38]. With regard to the intensity variation parameter DDKsdi, the manual measurements of peak intensity were made in the same way as in the MSP-DRA protocol for all the executable DDK samples, so the difference between the manual and automatic measurements was slight or absent; any residual difference could be explained by algorithmic differences between the MSP and TF32.
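The sensitivity of the temporal variation parameters to a single misplaced boundary can be shown with a short sketch. It reads DDKsdp as the standard deviation of the syllable period and DDKcvp as its coefficient of variation, and it assumes the sample standard deviation (n - 1); the exact formulas used by MSP may differ. The function name, dictionary keys, and onset times are hypothetical.

```python
from statistics import mean, stdev


def period_stats(onsets_s):
    """Compute period statistics from syllable onset times in seconds.

    Assumption: sample standard deviation (n - 1); MSP may differ.
    """
    if len(onsets_s) < 3:
        raise ValueError("need at least three onsets (two periods)")
    periods_ms = [(b - a) * 1000.0 for a, b in zip(onsets_s, onsets_s[1:])]
    average_ms = mean(periods_ms)
    sd_ms = stdev(periods_ms)
    return {
        "avp_ms": average_ms,                  # average period
        "sdp_ms": sd_ms,                       # SD of period
        "cvp_pct": 100.0 * sd_ms / average_ms  # coefficient of variation
    }


# Hypothetical onsets: the automatic method places one boundary 50 ms late.
manual = period_stats([0.00, 0.20, 0.40, 0.60, 0.80])
automatic = period_stats([0.00, 0.20, 0.45, 0.60, 0.80])
sdp_difference_ms = automatic["sdp_ms"] - manual["sdp_ms"]
```

In this example the average period is nearly identical for the two methods, while the variation measures differ markedly, which is consistent with the pattern of discrepancies described above.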

Although the concurrent validity of temporal and intensity variation parameters between the DRA protocol and manual measurement was satisfactory, the protocol ignores other important features. For example, a component analysis within each CV syllable interval defined in the DRA protocol could serve as a useful index to differentiate spastic dysarthria from ataxic dysarthria [20]. Such an analysis would separate the duration from burst onset to the end of the following vocalic nucleus from the duration between the end of the vocalic nucleus and the next burst onset. Moreover, the temporal parameters of the DDK protocol proposed by Kent et al. [27] differentiate dysarthric speech from normal speech better than the intensity parameters do. However, the DRA protocol defines the CV period as spanning both the intersyllabic interval and the syllable interval, and it includes no parameters for such a component analysis.
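The component analysis described above can be sketched as follows, assuming that burst onsets and vocalic nucleus ends have been marked by hand for each syllable. The function name and the time values are hypothetical.

```python
def component_durations(burst_onsets_s, nucleus_ends_s):
    """Split each CV period into a syllable part and a gap part.

    syllable part: burst onset to the end of the following vocalic nucleus
    gap part: end of the vocalic nucleus to the next burst onset
    """
    if len(burst_onsets_s) != len(nucleus_ends_s):
        raise ValueError("each burst onset needs a matching nucleus end")
    syllable_s = [end - onset
                  for onset, end in zip(burst_onsets_s, nucleus_ends_s)]
    gap_s = [next_onset - end
             for end, next_onset in zip(nucleus_ends_s, burst_onsets_s[1:])]
    return syllable_s, gap_s


# Hypothetical hand-marked times in seconds for three syllables.
syllable_s, gap_s = component_durations(
    [0.00, 0.25, 0.50],
    [0.12, 0.38, 0.61],
)
```

The sum of a syllable part and the gap that follows it equals the CV period that the DRA protocol reports as a single value.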

Furthermore, explosive consonants during syllables and high-frequency energy during intersyllabic pauses in DDK were noted in dysarthric speech, especially in ataxic and hypokinetic dysarthria. It is therefore important for the protocol to include intensity parameters for both syllables and intersyllable pauses. Such parameters could help to detect respiratory incoordination, articulatory undershoot, or weak articulatory force, and thereby support a preliminary clinical differential diagnosis.

The DRA protocol generates a total of 11 parameters, but it may be helpful to refine its output and to add new clinically useful parameters suggested by the literature, such as syllable duration, intrasyllable pause, and intensity parameters during syllables and intersyllable pauses. These additions would help the protocol to achieve its long-term goal of characterizing motor speech impairment.

Despite the difficulties that were encountered in the automatic acoustic analysis of speech, there is a reasonable prospect for improved performance. A major first step in evaluating the clinical suitability of DRA is to determine how well the analysis performs with samples of disordered speech. We chose samples of ataxic dysarthria because of the frequently noted abnormalities that ataxic speakers have in the DDK task. Other types of dysarthria may present their own distinctive challenges, and it would be helpful to have data from a variety of types and severity levels of dysarthria. The application of the DRA to general clinical purposes is justified only if it can be demonstrated that this protocol works for a substantial proportion of the kinds of samples that are seen in routine clinical assessment. It may also be necessary to incorporate other, more reliable, valid, or discriminative parameters in the DRA protocol in order to analyze motor disordered speech for purposes such as tracking disease progression, examining speech therapy efficacy, monitoring pharmaceutical effects, and characterizing motor speech disorders. However, achieving these goals will require a more general consideration of the procedural and interpretive issues surrounding DDK [6,40,41,42]. Automatic quantitative analysis has the potential to enhance the clinical utility of DDK, but improvements in this aspect will not resolve other problems that require solutions in their own right. There is a need for a standardized protocol that ensures concurrent validity in procedures across different clinical settings. DDK has a long history in speech research and clinical application, and it seems opportune to define new standards and implement more reliable methods to obtain the best information from this task.

The present study prompts a larger question about the development of automatic quantitative analyses of speech disorders. The question is: What kinds of evaluation are needed to ensure that the analyses are suitable for clinical application? Surely, it is not sufficient to demonstrate satisfactory validity and reliability for samples of healthy speech. As the current results show, pathological speech may have characteristics that are not present in healthy speech, and these characteristics can severely limit the performance of automatic analysis. It would be helpful if a large corpus of disordered speech, suitably annotated with respect to etiology and perceptual description, were available for general use in comparing analysis systems. Unless and until analysis programs like DRA are tested on representative samples of disordered speech, little confidence can be placed in their results as far as clinical application is concerned.

Conclusions

The DRA in the KayPENTAX MSP is designed to analyze the rate and regularity of maximally rapid syllable repetitions automatically and instantly. Although its performance on samples of DDK from speakers with ataxic dysarthria was limited, the reliability at different thresholds and the concurrent validity between different measuring methods were both satisfactory. Additional clinically useful parameters are suggested for incorporation into the protocol. In brief, DRA has notable limitations in its clinical application, but there is considerable potential for improving its performance.

Acknowledgements

This work was supported in part by Research Grant No. 5 R01 DC00319 from the National Institute on Deafness and Other Communication Disorders (NIDCD-NIH) and NSC 95-2314-B-010-095 from the National Science Council, Taiwan.

References

  • 1. Fimbel EJ, Domingo PP, Lamoureux D, Beuter A. Automatic detection of movement disorders using recordings of rapid alternating movements. J Neurosci Methods. 2005;146:183–190. doi: 10.1016/j.jneumeth.2005.02.007.
  • 2. Taylor Tavares AL, Jefferis GS, Koop M, Hill BC, Hastie T, Heit G, Bronte-Stewart HM. Quantitative measurements of alternating finger tapping in Parkinson's disease correlate with UPDRS motor disability and reveal the improvement in fine motor control from medication and deep brain stimulation. Mov Disord. 2005;20:1286–1298. doi: 10.1002/mds.20556.
  • 3. Lundy DS, Roy S, Xue JW, Casiano RR, Jassir D. Spastic/spasmodic vs. tremulous vocal quality: motor speech profile analysis. J Voice. 2004;18:146–152. doi: 10.1016/j.jvoice.2003.12.001.
  • 4. Meurer EM, Wender MC, von Eye CH, Capp E. Phono-articulatory variations of women in reproductive age and postmenopausal. J Voice. 2004;18:369–374. doi: 10.1016/j.jvoice.2003.02.001.
  • 5. Duffy JR. Motor Speech Disorders: Substrates, Differential Diagnosis, and Management. St. Louis: Mosby; 2005.
  • 6. Kent RD, Kent JF, Rosenbek JC. Maximum performance tests of speech production. J Speech Hear Disord. 1987;52:367–387. doi: 10.1044/jshd.5204.367.
  • 7. Renout KA, Leeper HA, Bandur DL. Vocal fold diadochokinetic function of individuals with amyotrophic lateral sclerosis. Am J Speech Lang Pathol. 1995;4:73–80.
  • 8. Kent RD, Kent JF, Duffy JR, Thomas JE, Weismer G, Stuntebeck S. Ataxic dysarthria. J Speech Lang Hear Res. 2000;43:1275–1289. doi: 10.1044/jslhr.4305.1275.
  • 9. Darley FL, Aronson AE, Brown JR. Motor Speech Disorders. Philadelphia: Saunders; 1975.
  • 10. Hale ST, Kellum GD, Richardson JF, Messer SC, Gross AM, Sisakun S. Oral motor control, posturing, and myofunctional variables in 8-year-olds. J Speech Hear Res. 1992;35:1203–1218. doi: 10.1044/jshr.3506.1203.
  • 11. Henry CE. The development of oral diadochokinesia and non-linguistic rhythmic skills in normal and speech-disordered young children. Clin Linguist Phonet. 1990;4:121–137. doi: 10.3109/02699209008985476.
  • 12. Potter NL. Oral/Speech and Manual Motor Development in Preschool Children. Unpublished PhD dissertation, University of Wisconsin-Madison; 2005.
  • 13. Thoonen G, Maassen B, Wit J, Gabreels F, Schreuder R. The integrated use of maximum performance tasks in differential diagnostic evaluations among children with motor speech disorders. Clin Linguist Phonet. 1996;10:311–336.
  • 14. Williams P, Stackhouse J. Rate, accuracy and consistency: diadochokinetic performance of young, normally developing children. Clin Linguist Phonet. 2000;14:267–293.
  • 15. Kent RD, Sufit RL, Rosenbek JC, Kent JF, Weismer G, Martin RE, Brooks BR. Speech deterioration in amyotrophic lateral sclerosis: a case study. J Speech Hear Res. 1991;34:1269–1275. doi: 10.1044/jshr.3406.1269.
  • 16. Samlan RA, Weismer G. The relationship of selected perceptual measures of diadochokinesis to speech intelligibility in dysarthric speakers with amyotrophic lateral sclerosis. Am J Speech Lang Pathol. 1995;4:9–13.
  • 17. Nishio M, Niimi S. Changes over time in dysarthric patients with amyotrophic lateral sclerosis (ALS): a study of changes in speaking rate and maximum repetition rate (MRR). Clin Linguist Phonet. 2000;14:485–497.
  • 18. Wessel K, Ziegler W. Speech timing in ataxic disorders: sentence production and rapid repetitive articulation. Neurology. 1996;47:208–214. doi: 10.1212/wnl.47.1.208.
  • 19. Ackermann H, Hertrich I, Hehr T. Oral diadochokinesis in neurological dysarthrias. Folia Phoniatr Logop. 1995;47:15–23. doi: 10.1159/000266338.
  • 20. Ozawa Y, Shiromoto O, Ishizaki F, Watamori T. Symptomatic differences in decreased alternating motion rates between individuals with spastic and with ataxic dysarthria: an acoustic analysis. Folia Phoniatr Logop. 2001;53:67–72. doi: 10.1159/000052656.
  • 21. Kent RD, Kent JF, Rosenbek JC, Vorperian HK, Weismer G. A speaking task analysis of the dysarthria in cerebellar disease. Folia Phoniatr Logop. 1997;49:63–82. doi: 10.1159/000266440.
  • 22. Schalling E, Hartelius L. Acoustic analysis of speech tasks performed by three individuals with spinocerebellar ataxia. Folia Phoniatr Logop. 2004;56:367–380. doi: 10.1159/000081084.
  • 23. Gentil M. Dysarthria in Friedreich disease. Brain Lang. 1990;38:438–448. doi: 10.1016/0093-934x(90)90126-2.
  • 24. Hartelius L, Lillvik M. Lip and tongue function differently affected in individuals with multiple sclerosis. Folia Phoniatr Logop. 2003;55:1–9. doi: 10.1159/000068057.
  • 25. Tjaden K, Watling E. Characteristics of diadochokinesis in multiple sclerosis and Parkinson's disease. Folia Phoniatr Logop. 2003;55:241–259. doi: 10.1159/000072155.
  • 26. Rosen KM, Kent RD, Duffy JR. Task-based profile of vocal intensity decline in Parkinson's disease. Folia Phoniatr Logop. 2005;57:28–37. doi: 10.1159/000081959.
  • 27. Kent RD, Duffy JR, Kent JF, Vorperian HK, Thomas JE. Quantification of motor speech abilities in stroke: time-energy analyses of syllable and word repetition. J Med Speech Lang Pathol. 1999;7:83–90.
  • 28. Wang YT, Kent RD, Duffy JR, Thomas JE, Weismer G. Alternating motion rate as an index of speech motor disorder in traumatic brain injury. Clin Linguist Phonet. 2004;18:57–84. doi: 10.1080/02699200310001596160.
  • 29. Ziegler W. Task-related factors in oral motor control: speech and oral diadochokinesis in dysarthria and apraxia of speech. Brain Lang. 2002;80:556–575. doi: 10.1006/brln.2001.2614.
  • 30. Wang YT, Kent RD, Duffy JR, Thomas JE, Fredericks GV. Dysarthria following cerebellar mutism secondary to resection of a fourth ventricle medulloblastoma: a case study. J Med Speech Lang Pathol. 2006;14:109–122.
  • 31. Nishio M, Niimi S. Comparison of speaking rate, articulation rate and alternating motion rate in dysarthric speakers. Folia Phoniatr Logop. 2006;58:114–131. doi: 10.1159/000089612.
  • 32. Weismer G. Motor speech disorders. In: Hardcastle WJ, Laver J, editors. The Handbook of Phonetic Science. Cambridge: Blackwell; 1997. pp. 191–219.
  • 33. Brown JR, Darley FL, Aronson AE. Ataxic dysarthria. Int J Neurol. 1970;7:302–309.
  • 34. Nunnally JC, Bernstein I. Psychometric Theory. St. Louis: McGraw-Hill; 1994.
  • 35. Guilford JP. Fundamental Statistics in Psychology and Education. St. Louis: McGraw-Hill; 1973.
  • 36. Milenkovic P. Time-Frequency Analysis for 32-Bit Windows. Madison: Wisconsin; 2001.
  • 37. Auzou P, Özsancak C, Morris RJ, Jan M, Eustache F, Hannequin D. Voice onset time in aphasia, apraxia of speech and dysarthria: a review. Clin Linguist Phonet. 2000;14:131–150.
  • 38. Davis K. Phonetic and phonological contrasts in the acquisition of voicing: voice onset time production in Hindi and English. J Child Lang. 1995;22:275–305. doi: 10.1017/s030500090000979x.
  • 39. Özsancak C, Auzou P, Jan M, Hannequin D. Measurement of voice onset time in dysarthric patients: methodological considerations. Folia Phoniatr Logop. 2001;53:48–57. doi: 10.1159/000052653.
  • 40. Wang YT, Kent RD, Duffy JR, Thomas JE. Dysarthria associated with traumatic brain injury: speaking rate and emphatic stress. J Commun Disord. 2005;38:231–260. doi: 10.1016/j.jcomdis.2004.12.001.
  • 41. Cohen W, Waters D, Hewlett N. DDK rates in the paediatric clinic: a methodological minefield. Int J Lang Commun Disord. 1998;33:428–433. doi: 10.3109/13682829809179463.
  • 42. Yaruss JS, Logan KJ. Evaluating rate, accuracy, and fluency of young children's diadochokinetic productions: a preliminary investigation. J Fluency Disord. 2002;27:65–85. doi: 10.1016/s0094-730x(02)00112-2.

Articles from Folia phoniatrica et logopaedica : official organ of the International Association of Logopedics and Phoniatrics (IALP) are provided here courtesy of Karger Publishers
