Author manuscript; available in PMC: 2012 Mar 27.
Published in final edited form as: J Commun Disord. 2008 Feb 29;41(6):485–500. doi: 10.1016/j.jcomdis.2008.02.001

Perturbation and nonlinear dynamic analysis of acoustic phonatory signal in Parkinsonian patients receiving deep brain stimulation

Victoria S Lee a, Xiao Ping Zhou b, Douglas A Rahn III a, Emily Q Wang c, Jack J Jiang a,1,*
PMCID: PMC3313602  NIHMSID: NIHMS363621  PMID: 18433765

Abstract

Nineteen PD patients who received deep brain stimulation (DBS), 10 non-surgical (control) PD patients, and 11 non-pathologic age- and gender-matched subjects performed sustained vowel phonations. The following acoustic measures were obtained on the sustained vowel phonations: correlation dimension (D2), percent jitter, percent shimmer, SNR, F0, vF0, and vAm. The results indicated the following: The mean D2 of control PD patients was significantly higher than the mean D2 of non-pathologic subjects and patients who received deep brain stimulation. These results suggest an improvement in PD voice in treated patients. Many PD vocal samples in this study have type 2 signals containing subharmonics that may not be suitable for perturbation analysis but are suitable for nonlinear dynamic analysis, making the D2 results more reliable. These findings show that DBS may provide measurable improvement in patients with severe vocal impairment.

Learning outcomes

Readers will be able to: (1) identify the advantages of nonlinear dynamic analysis as a clinical tool to evaluate the aperiodic voice commonly found in patients with Parkinson’s disease, (2) describe in general the method of obtaining a correlation dimension measure from a voice sample and the significance of this measure in terms of specific voice signal properties, (3) consider the preliminary implications from nonlinear dynamic analysis of a positive DBS effect on Parkinsonian voice and the potential for further investigations using nonlinear dynamic analysis on the influence of gender, severity of disease, and combined treatments on Parkinsonian voice improvement.

1. Introduction

Parkinson’s disease (PD) is a degenerative neurological disease. Symptoms of PD include both motor and vocal impairment. Impaired Parkinsonian voice has been described as breathy, tremulous, high-pitched, monotone, soft, and hoarse (Hanson, Gerratt, & Ward, 1984; Hoffman-Ruddy, Schulz, Vitek, & Evatt, 2001; Ramig, Scherer, Titze, & Ringel, 1988). Within the last decade, deep brain stimulation (DBS) of the subthalamic nucleus (STN) has emerged as a promising surgical option for individuals with advanced PD. The procedure involves stereotactic implantation of electrode(s) in the STN unilaterally or bilaterally. A pulse generator, usually located in the subcutaneous subclavicular area, provides chronic stimulation at the electrode site(s), affecting the loop that involves the cortical areas connected to the putamen altered in pathologic patients (Benabid, 2003; Broggi, Franzini, Marras, Romito, & Albanese, 2003).

The purpose of this study is to explore the effects of DBS of the STN on the vocal characteristics of Parkinsonian patients. Although DBS of the STN has been shown to greatly improve motor symptoms of PD, the results of this treatment on speech symptoms are inconsistent (Dromey, Kumar, Lang, & Lozano, 2000; Krack et al., 2003; Vaillancourt et al., 2004). Previous studies have shown that bilateral DBS of the STN improves maximal phonation time, vocal intensity level, and fundamental frequency variability, which may reflect increased subglottal pressure generation and greater laryngeal muscle coordination (Gentil, Chauvin, Pinto, Pollak, & Benabid, 2001; Gentil, Pinto, Pollak, & Benabid, 2003; Hoffman-Ruddy et al., 2001). Decreases in acoustic measures such as percent jitter, percent shimmer, and noise-to-harmonics ratio (NHR) may reflect less hoarseness and breathiness, two cardinal symptoms of PD voice (Dejonckere et al., 1996; Gentil et al., 2003; Hoffman-Ruddy et al., 2001; Reijonen, Soderlund, & Rihkanen, 2002). Other studies, however, have found that DBS of the STN may decrease speech intelligibility and production (Krause, Fogel, Mayer, Kloss, & Tronnier, 2004; Rousseaux et al., 2004).

Short-term fluctuations in phonatory signal in nearly periodic voice samples can be quantified using perturbation methods like jitter and shimmer, but these methods are less useful for severely disordered voices from which a period of sustained phonation is harder to extract (Karnell, Chang, Smith, & Hoffman, 1997; Titze, 1995). Nonlinear dynamic analysis, which provides a correlation dimension (D2) value, has recently been shown to be a valuable way to study phonation with aperiodic segments (Hertrich, Lutzenberger, Spieker, & Ackerman, 1997; Herzel, Berry, Titze, & Saleh, 1994; Rahn, Chou, Zhang, & Jiang, 2007; Titze, Baken, & Herzel, 1993; Zhang & Jiang, 2003; Zhang, McGilligan, Zhou, Vig, & Jiang, 2004). Aperiodic phonation is usually perceived as hoarse or breathy and has been found to be more prevalent in acoustic samples from PD subjects (Hertrich et al., 1997; Rahn et al., 2007; Ramig et al., 1988). In this study, both perturbation and nonlinear dynamic analysis are used.

The main hypothesis is that non-surgical (control) patients have significantly higher correlation dimension (D2) values than non-pathologic subjects and patients receiving DBS of the STN. The primary outcome measure is correlation dimension. The secondary outcome measures are indices of perturbation analysis, specifically percent jitter, percent shimmer, signal-to-noise ratio (SNR), F0 (fundamental frequency), vF0 (variability in fundamental frequency), and vAm (peak-to-peak amplitude variation). Previous studies have shown gender-related differences in the human STN and in the effects of PD on phonation. In particular, female PD voices often exhibit higher aperiodicity and therefore higher D2 values than male PD voices, which often have a lack of harmonic source energy, lower aperiodicity, and a lower D2 value (Hertrich & Ackerman, 1995; Marceglia et al., 2006). In order to investigate these gender-related differences, D2 is also analyzed by gender.

2. Materials and methods

2.1. Participants

The University of Wisconsin IRB and the Committee of Ethics at Shanghai Second Military Medical University Hospital approved the testing protocol and the informed consent procedure used in this study. The attending neurologist recruited 19 patients diagnosed with PD, 11 males and 8 females with an average age of 63.84 years, to undergo STN DBS surgery. The surgical procedure was identical to the procedure discussed by Benabid (2003). Nine patients had bilateral electrode placement, 7 had left electrode placement, and 3 patients had right electrode placement. Table 1 shows their demographic information and the disease characteristics prior to surgery. The decision to undergo DBS surgery was made by the subjects as part of their clinical care independent from the interests of this research study. Ten patients who did not undergo surgery were also selected to serve as the non-surgical (control) group. The 10 patients consisted of 6 females and 4 males with a mean age of 66.80 years (Table 2).

Table 1.

Gender, age, side of stimulation (STN), Hoehn-Yahr score, UPDRS-III overall motor score, UPDRS-III Item 18 (speech) motor score before surgery, and approximate disease duration in years at the time the study was conducted for patients who received DBS of the STN

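| Patient # | Gender | Age | Side of STN | Hoehn-Yahr | UPDRS-III motor score | UPDRS-III Item 18 | PD duration since diagnosis (years) |
|---|---|---|---|---|---|---|---|
| 1 | Female | 68 | Bilateral | III | 51 | 0 | 12 |
| 2 | Female | 62 | Bilateral | III | 60 | 0 | 12 |
| 3 | Male | 67 | Bilateral | IV | 108 | 2 | 14 |
| 4 | Male | 61 | Left | III | 51 | 0 | 18 |
| 5 | Male | 50 | Bilateral | III | 78 | 1 | 5 |
| 6 | Male | 56 | Bilateral | III | 58 | 0 | 10 |
| 7 | Male | 65 | Bilateral | IV | 82 | 1 | 8 |
| 8 | Female | 65 | Bilateral | III | 79 | 1 | 4 |
| 9 | Female | 57 | Left | V | 100 | 2 | 11 |
| 10 | Male | 72 | Left | III | 67 | 1 | 6 |
| 11 | Male | 76 | Left | III | 80 | 1 | 13 |
| 12 | Male | 63 | Left | III | 62 | 1 | 8 |
| 13 | Male | 58 | Left | IV | 84 | 1 | 10 |
| 14 | Male | 77 | Bilateral | III | 65 | 1 | 7 |
| 15 | Female | 48 | Left | III | 62 | 0 | 10 |
| 16 | Female | 66 | Right | III | 65 | 0 | 7 |
| 17 | Female | 65 | Right | III | 65 | 0 | 6 |
| 18 | Female | 68 | Bilateral | IV | 78 | 1 | 10 |
| 19 | Male | 69 | Right | III | 62 | 1 | 6 |
| Mean | N/A | 63.84 | N/A | 3.32 (III = 3, IV = 4, V = 5) | 71.42 | 0.74 | N/A |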

Mean values are listed for Hoehn-Yahr and UPDRS-III scores. As a result of lack of regular health care, a few patients have very short disease durations that are inconsistent with the stage of PD because of late diagnosis or diagnosis at the time of the study. Therefore, mean values were not calculated for disease duration.

Table 2.

Gender, age, Hoehn-Yahr score, UPDRS-III overall motor score, UPDRS-III Item 18 (speech) motor score, and approximate disease duration since diagnosis in years at the time the study was conducted for non-surgical (control) patients

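| Patient # | Gender | Age | Hoehn-Yahr | UPDRS-III motor score | UPDRS-III Item 18 (speech) | PD duration since diagnosis (years) |
|---|---|---|---|---|---|---|
| 1 | Female | 74 | III | 88 | 1 | 2 |
| 2 | Female | 74 | IV | 100 | 2 | 5 |
| 3 | Female | 73 | III | 76 | 1 | 3 |
| 4 | Female | 59 | IV | 94 | 1 | 7 |
| 5 | Female | 57 | III | 64 | 0 | 6 |
| 6 | Female | 77 | III | 68 | 0 | <1 |
| 7 | Male | 61 | IV | 96 | 1 | 9 |
| 8 | Male | 68 | III | 72 | 1 | 4 |
| 9 | Male | 49 | III | 62 | 0 | 2 |
| 10 | Male | 76 | IV | 86 | 2 | 4 |
| Mean | N/A | 66.80 | 3.40 | 80.60 | 0.9 | N/A |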

Mean values are listed for Hoehn-Yahr and UPDRS-III scores. As a result of lack of regular health care, a few patients have very short disease durations that are inconsistent with the stage of PD because of late diagnosis or diagnosis at the time of the study. Therefore, mean values were not calculated for disease duration.

The attending neurologist selected patients by age, gender, Hoehn-Yahr score, UPDRS-III overall motor score, UPDRS-III Item 18 (speech) score, and PD duration since diagnosis. Because diagnosis of PD may vary depending on the frequency of doctor visits, Hoehn-Yahr and UPDRS-III motor scores were more important in the selection process than PD duration since diagnosis. Patients were selected to be as consistent as possible within each group and between the surgical and non-surgical groups. Both groups had similar mean UPDRS-III Item 18 scores, which may indicate similar levels of vocal impairment prior to the study (Tables 1 and 2).

Patients with vocal deficits caused by diseases other than PD, such as pulmonary diseases and diseases of the trachea and larynx, were excluded. Before having their voices recorded, all subjects had a laryngeal endoscopy to screen for any symptoms outside of those common to PD. Other exclusion criteria included cognitive and hearing impairment and a clinical diagnosis of depression.

2.2. Participants without Parkinson’s disease

Voice data from 11 non-pathologic subjects were used for comparison purposes. These participants were part of a separate study that used an identical voice testing protocol to produce sustained vowel phonation. These subjects gave informed consent approved by the Institutional Review Board at Shanghai EENT Hospital. There were 6 females and 5 males with an average age of 65.36 years (Table 3). They were free of any speech or voice disorders as determined by an otolaryngologist via a flexible endoscopic examination.

Table 3.

Gender and age for non-pathologic subjects

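| Patient # | Gender | Age |
|---|---|---|
| 1 | Female | 63 |
| 2 | Female | 71 |
| 3 | Female | 62 |
| 4 | Female | 66 |
| 5 | Female | 64 |
| 6 | Female | 70 |
| 7 | Male | 56 |
| 8 | Male | 63 |
| 9 | Male | 67 |
| 10 | Male | 60 |
| 11 | Male | 77 |
| Mean | N/A | 65.36 |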

2.3. Recording procedure

All recordings were taken in the medication-off condition, i.e., withholding medication for 12 h overnight. For patients who received DBS, recordings were first taken in the stimulator-on condition. Then, the stimulator was turned off for 30 min, and recordings were taken in the stimulator-off condition. The stimulator-off recordings, however, are not reported in this paper because the stimulator-off period was deemed too short. Previous studies indicate that a stimulator must be turned off for a minimum of 12 h to achieve the true “stimulator-off” condition needed to detect voice changes (Santens, De Letter, Van Borsel, De Reuck, & Caemaert, 2003). For patients in the control group, the recordings were taken once.

At each time period, sustained /a/ vowel phonations of no less than 5 s were recorded in a sound-attenuated room using a head-mounted microphone (AKG Acoustics, Vienna, Austria) positioned at 15 cm from the mouth at a 45° angle. Audio files were recorded at a sampling rate of 25 kHz using Multispeech software (Kay Elemetrics Corporation, Lincoln Park, NJ). Patients were directed to perform sustained phonations within their normal vocal range. For each patient, five replicate recordings in each of the two conditions were taken, and three replicates were randomly selected for vocal analysis. One-second segments were cut from the middle of these sustained phonations, eliminating the onset and offset of the sustained phonation, and processed using nonlinear dynamic and perturbation analysis.
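The segment extraction described above can be illustrated with a minimal Python sketch. It assumes the recording is already available as a flat sequence of samples; the function name `middle_segment` and the synthetic placeholder recording are hypothetical and are not part of the software used in the study.

```python
def middle_segment(samples, fs=25_000, duration_s=1.0):
    """Return the central `duration_s` seconds of a recording.

    `fs` defaults to the 25 kHz sampling rate stated in the text.
    Taking the segment from the middle discards onset and offset.
    """
    n = int(round(fs * duration_s))
    if len(samples) < n:
        raise ValueError("recording is shorter than the requested segment")
    start = (len(samples) - n) // 2
    return samples[start:start + n]


# Placeholder for a 5 s recording at 25 kHz (125,000 samples); the values
# are just sample indices, not real audio.
recording = list(range(5 * 25_000))
segment = middle_segment(recording)
```

With a 5 s recording, the sketch keeps samples 50,000 through 74,999, which is the 1 s (25,000 sample) window analyzed in the following sections.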

2.4. Blindness to the treatment factor

Different individuals were involved in data collection and data analysis. The neurologist selected all non-pathologic subjects and PD patients, both non-surgical and surgical. Two research assistants assisted in data collection, one controlling the stimulator and another, blind to the stimulator condition, directing the patient to perform the sustained phonations. The patient was also unaware of whether the stimulator was on or off. Therefore, the data collection satisfied double blindness. Another research assistant analyzed the data and was also blinded to the stimulator conditions. Therefore, blindness was also achieved at this stage of data processing.

2.5. Nonlinear dynamic analysis

The theory and usage of nonlinear dynamic methods, including phase space reconstructions and correlation dimensions, have been elaborately described in previous literature (Herzel et al., 1994; Jiang, Zhang, & Ford, 2003; Kumar & Mullick, 1996; Narayanan & Alwan, 1995; Titze et al., 1993). The reconstructed phase space shows the vibrations of the vocal folds as a function of time, with a periodic signal appearing as a closed trajectory and an aperiodic signal irregular and chaotic (Rahn et al., 2007; Jiang, Zhang, & McGilligan, 2006). Plotting a time series against itself at some time delay τ produces the reconstructed phase space. Fig. 1 shows phonatory time series

$$x(t_i),\quad t_i = t_0 + i\,\Delta t,\quad (i = 1, 2, \ldots, N) \tag{1}$$

from (Fig. 1a) a control patient and (Fig. 1b) a stimulated patient, sampled at intervals of

$$\Delta t = \frac{1}{f_s} \tag{2}$$

Fig. 1. Parkinsonian voice acoustic waveforms of patients that (a) did not receive DBS in the medication-off state (control) and (b) received DBS in the stimulator-on and medication-off state. Note the presence of subharmonics in the waveform of control patients. Figure was magnified.

The duration of the analysis window was 1 s (or 25,000 samples). Fig. 2 depicts the phase space reconstruction (x(t), x(t + τ)) with τ as the time delay calculated by Fraser and Swinney’s mutual information method (1986).

Fig. 2. The reconstructed phase space of a Parkinsonian voice in this study.
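As a rough illustration of the phase space reconstruction (x(t), x(t + τ)), the following Python sketch builds delay vectors from a time series. It is not the authors' implementation: the helper name `delay_embed` is hypothetical, the test signal is a synthetic sine wave rather than a voice sample, and τ is set by hand to a quarter period instead of being chosen by the mutual information method.

```python
import math


def delay_embed(x, tau, m=2):
    """Return delay vectors (x[i], x[i + tau], ..., x[i + (m - 1) * tau])."""
    if tau < 1 or m < 1:
        raise ValueError("tau and m must be positive integers")
    n_vectors = len(x) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("time series too short for this tau and m")
    return [tuple(x[i + k * tau] for k in range(m)) for i in range(n_vectors)]


fs = 25_000   # sampling rate from the text
f0 = 250.0    # assumed test frequency: exactly 100 samples per period
signal = [math.sin(2 * math.pi * f0 * i / fs) for i in range(1000)]

tau = 25      # a quarter period, chosen by hand for this synthetic signal
points = delay_embed(signal, tau, m=2)

# For a pure sine and a quarter-period delay the pairs are (sin, cos),
# so every point lies on the unit circle: a closed trajectory.
radii = [math.hypot(a, b) for a, b in points]
```

The periodic test signal traces a closed curve, matching the description of a periodic signal as a closed trajectory; an aperiodic voice sample would not close on itself in this way.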

The correlation dimension procedure measures the correlation of any two points in the phase space trajectory and therefore the complexity and irregularity of the phase space trajectory. Based on correlation dimension, trajectories can be classified in order of increasing D2 in one of four states: (1) zero-dimensional fixed point (static states), (2) one-dimensional limit cycle (periodic oscillations), (3) two-dimensional quasi-periodic torus (two or more oscillations with no rationally dependent frequencies), and (4) fractal-dimensional chaotic (aperiodic oscillations). As a result, higher dimensionality D2 is characteristic of higher aperiodic vocal pathology and more severely impaired Parkinsonian voice. The estimated D2 of a chaotic system converges to a finite value given enough degrees of freedom whereas the estimated D2 of random white noise does not, and thus, correlation dimension can distinguish chaos from random white noise. Therefore, a higher D2 value implies aperiodicity and a higher degree of chaos in the voice, not increased randomness or noisiness (Jiang et al., 2006).

To analyze the PD voices in this study, correlation dimension calculations were determined based on past research of excised larynx phonations and live human voices (Jiang et al., 2003; Jiang et al., 2006; Zhang & Jiang, 2003; Zhang et al., 2004; Zhang, Jiang, Biazzo, & Jorgensen, 2005). The mathematical details of this method are presented in previous studies (Jiang et al., 2003; Zhang, Jiang, & Rahn, 2005; Zhang, Jiang, Wallace, & Zhou, 2005). Briefly, Grassberger and Procaccia’s correlation dimension (1983) was calculated based on the definition,

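$$D_2 = \lim_{r \to 0} \frac{\log C(r)}{\log r} \tag{3}$$

$$C(W, N, r) = \frac{2}{(N + 1 - W)(N - W)} \sum_{n=W}^{N-1} \sum_{i=0}^{N-1-n} \theta\left(r - \lVert X_i - X_{i+n} \rVert\right) \tag{4}$$

where r is the radius around X_i and C was calculated using Theiler’s formula (1986). W was set as the time delay τ and θ(x) satisfies

$$\theta(x) = \begin{cases} 1, & x > 0 \\ 0, & x \le 0 \end{cases} \tag{5}$$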

The correlation dimension is obtained with a linear curve fit to D2 vs. r in the scaling region where the slopes of these two curves increase transiently and then converge as embedding dimension m is increased. Fig. 3 shows the curves D2 vs. r from the same voice sample of the stimulated patient as shown in Fig. 1b. The slopes of the D2 vs. r curves approach 3.038 ± 0.004 in the indicated scaling region, which is the estimated D2 of this voice. Using the steps outlined above, phonatory time series from stimulated and control patients were analyzed.

Fig. 3. The estimated D2 value vs. r. The curves from bottom to top correspond to the embedding dimension m = 1, 2, 3, …, 10.
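The correlation sum with a Theiler window and the slope estimate of Eqs. (3) and (4) can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the procedure used to produce the reported values: the function names are hypothetical, the input is a synthetic sine wave, only one embedding dimension (m = 2) is used rather than checking convergence over m = 1 to 10, and the scaling region (the list of radii) is picked by hand.

```python
import math


def delay_embed(x, tau, m):
    """Delay vectors (x[i], x[i + tau], ..., x[i + (m - 1) * tau])."""
    n_vectors = len(x) - (m - 1) * tau
    return [tuple(x[i + k * tau] for k in range(m)) for i in range(n_vectors)]


def correlation_sums(vectors, w, radii):
    """Correlation sum C(W, N, r) for each r, skipping pairs closer than w steps in time."""
    n_total = len(vectors)
    counts = [0] * len(radii)
    for n in range(w, n_total):              # temporal separation, n = W .. N-1
        for i in range(n_total - n):         # i = 0 .. N-1-n
            d = math.dist(vectors[i], vectors[i + n])
            for k, r in enumerate(radii):
                if d < r:                    # theta(r - d) = 1 only when r - d > 0
                    counts[k] += 1
    n_pairs = (n_total + 1 - w) * (n_total - w) / 2
    return [c / n_pairs for c in counts]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Synthetic periodic signal (a limit cycle), so the slope should be close to 1.
signal = [math.sin(0.1 * i) for i in range(600)]
tau = 16                                     # roughly a quarter period, set by hand
vectors = delay_embed(signal, tau, m=2)
radii = [0.1, 0.15, 0.2, 0.3, 0.4]           # assumed scaling region
c_values = correlation_sums(vectors, w=tau, radii=radii)
d2_estimate = least_squares_slope([math.log(r) for r in radii],
                                  [math.log(c) for c in c_values])
```

For this periodic test signal the estimate comes out near 1, consistent with the one-dimensional limit cycle described earlier. A real voice analysis would repeat the calculation for increasing m and read D2 from the region where the slopes converge.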

2.6. Perturbation analysis

The three 1-s segments of sustained phonation were analyzed using Cspeech 4.0 software (Paul Milenkovic, Madison, WI). In Cspeech, an analysis window is constructed based on an estimate of the fundamental frequency of the vocal waveform entered by the data analyzer. Cspeech then uses a least mean square fit of a waveform model to estimate the pitch period. This process is repeated for all points in the waveform (Karnell, Hall, & Landahl, 1995).

Percent jitter, percent shimmer, and signal-to-noise ratio values were obtained for the vocal recordings of each patient. Cspeech continues to run the algorithm to extract a pitch period after repeated failed attempts to “compute a pitch period consistent with the peak of the autocorrelation function,” which may occur in aperiodic waveforms (Milenkovic & Read, 1992). Err, which is calculated by Cspeech for each waveform, indicates the number of times the algorithm failed. An inaccurate pitch estimate or aperiodic vocal sample may have err values greater than 10, indicating that perturbation analysis may be unreliable (Milenkovic & Read, 1992). Percent jitter, percent shimmer, and SNR for a sample were eliminated if the err was greater than 10 (33% of the samples).
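To make the quantities concrete, the sketch below computes a percent perturbation measure from a list of cycle-to-cycle values and applies the err > 10 exclusion rule. It uses the common textbook definition (mean absolute difference between consecutive cycles, as a percentage of the mean), which is an assumption here; Cspeech's exact formulas may differ. The function names, the sample records, and the period values are all made up for illustration.

```python
ERR_LIMIT = 10  # samples with err greater than this were excluded in the text


def percent_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean.

    Applied to pitch periods this is a common definition of percent jitter;
    applied to peak amplitudes, of percent shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def usable_samples(samples):
    """Keep only samples whose err count does not exceed ERR_LIMIT."""
    return [s for s in samples if s["err"] <= ERR_LIMIT]


# Hypothetical pitch periods in milliseconds (not data from the study).
periods_ms = [4.00, 4.02, 3.98, 4.00]
jitter = percent_perturbation(periods_ms)

# Hypothetical analysis records (not data from the study).
records = [{"id": "A", "err": 3}, {"id": "B", "err": 14}, {"id": "C", "err": 10}]
kept_ids = [s["id"] for s in usable_samples(records)]
```

In this toy example the jitter is about 0.67%, and record B is dropped because its err count exceeds 10.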

The Multi-Dimensional Voice Program (MDVP), model 5105, Version 2.0 (Kay Elemetrics Corporation), was used to obtain the perturbation measures of F0, vF0, and vAm from the sustained vowel phonations.

2.7. Statistical analysis

Means were calculated for each of the seven voice indices (D2, percent jitter, percent shimmer, SNR, F0, vF0, vAm) for each of the groups (non-pathologic, non-surgical, and surgical). Means were also calculated by gender for D2 and F0 values. An unpaired Wilcoxon rank sum test was used to test for differences between the groups. Statistical p-values less than 0.05 were considered significant for testing the main hypothesis as well as for the secondary outcome measures. Statistical computations were run on STATA 10.0 (Statacorp, College Station, TX).

To test the main hypothesis, the Wilcoxon rank sum test was run to compare the mean D2 values of the non-pathologic, non-surgical, and surgical groups. This was the primary outcome measure. Additional comparisons were run to explore differences related to gender. Because of the additional comparisons, the significance level was adjusted to p < 0.0167 using the Bonferroni correction.

The secondary outcome measures were indices of perturbation analysis, specifically percent jitter, percent shimmer, signal-to-noise ratio, F0 (fundamental frequency), vF0 (variability in fundamental frequency), and vAm (peak-to-peak amplitude variation). Wilcoxon rank sum tests were run for each of the six indices, comparing non-pathologic, non-surgical, and surgical groups. Because of group comparisons across six indices, the significance level was adjusted to p < 0.0083 using the Bonferroni correction.
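The two adjusted significance levels follow from dividing 0.05 by the number of comparisons (3 and 6). The sketch below reproduces that arithmetic and shows one way an unpaired rank sum test can be computed. It is a pure-Python illustration using the normal approximation with a tie correction and no continuity correction; it is not the STATA routine used in the study, the function names are hypothetical, and the two input groups are toy numbers rather than study data.

```python
import math


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    return alpha / n_comparisons


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank sum test (normal approximation, tie-corrected).

    Returns (z, p), where z is based on the rank sum of group `a`.
    """
    pooled = sorted((v, g) for g, grp in enumerate((a, b)) for v in grp)
    n1, n2 = len(a), len(b)
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2           # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided p-value
    return z, p


alpha_d2 = bonferroni_alpha(0.05, 3)             # three D2 comparisons
alpha_perturbation = bonferroni_alpha(0.05, 6)   # six perturbation indices

# Toy groups with complete separation (not data from the study).
z, p = rank_sum_test([1, 2, 3], [4, 5, 6])
significant_for_d2 = p < alpha_d2
```

The corrected levels round to 0.0167 and 0.0083, matching the values quoted in the text. For very small groups an exact test would normally be preferred over the normal approximation shown here.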

3. Results

3.1. Aperiodicity of vocal samples and waveform analysis

Fig. 1 shows representative acoustic waveforms of PD voices for a control patient and a stimulated patient. The signals are classified as type 2 signals, meaning that modulations and subharmonics are present, as shown in the waveforms. Type 2 signals may be unsuitable for perturbation analysis. As many of the voices in this study had similar waveforms, many samples had type 2 signals or even type 3 signals, which are described as aperiodic and chaotic and unsuitable for perturbation analysis. The waveform of the control voice exhibited stronger modulations and subharmonics than the waveform of the stimulated voice. Most waveform comparisons of control voices to stimulated voices followed this trend.

3.2. Perturbation and nonlinear dynamic analysis

Table 4 shows the mean D2, percent jitter, percent shimmer, SNR, F0, vF0, and vAm values for non-pathologic subjects and non-surgical and surgical PD patients. Table 5 summarizes the p-values for the statistical comparisons of these groups.

Table 4.

Mean and standard deviation of D2, percent jitter, percent shimmer, signal-to-noise ratio (SNR), F0, vF0, and vAm for non-pathologic patients (n = 11), non-surgical (control) PD patients (n = 10), and surgical (DBS of the STN) PD patients (n = 19)

| Measure | Non-pathologic males (n = 5) | Non-pathologic females (n = 6) | Non-pathologic overall (n = 11) | Non-surgical males (n = 4) | Non-surgical females (n = 6) | Non-surgical overall (n = 10) | Surgical males (n = 11) | Surgical females (n = 8) | Surgical overall (n = 19) |
|---|---|---|---|---|---|---|---|---|---|
| D2 | 2.282 (±0.671) | 2.064 (±0.327) | 2.151 (±0.494) | 3.475 (±0.287) | 3.642 (±0.716) | 3.575 (±0.565) | 2.424 (±0.449) | 2.957 (±0.881) | 2.648 (±0.698) |
| Percent jitter | | | 0.325 (±0.117) | | | 0.427 (±0.240) | | | 0.609 (±0.410) |
| Percent shimmer | | | 2.434 (±1.724) | | | 7.009 (±3.405) | | | 6.615 (±2.841) |
| SNR | | | 22.282 (±3.902) | | | 14.062 (±5.252) | | | 14.120 (±4.329) |
| F0 | 206.94 (±34.84) | 262.84 (±61.99) | 226.70 (±64.28) | 157.20 (±31.97) | 194.96 (±26.21) | 182.37 (±32.27) | 143.18 (±16.27) | 178.34 (±27.53) | 157.98 (±27.56) |
| vF0 | | | 1.355 (±0.431) | | | 6.708 (±8.322) | | | 1.816 (±1.198) |
| vAm | | | 9.107 (±3.107) | | | 10.885 (±2.289) | | | 9.894 (±4.800) |

Mean D2 and F0 values are also given by gender.

Table 5.

p-Values of statistical comparisons for mean D2, percent jitter, percent shimmer, SNR, F0, vF0, and vAm

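| Measure | Non-pathologic vs. non-surgical: males | Non-pathologic vs. non-surgical: females | Non-pathologic vs. non-surgical: overall | Non-pathologic vs. surgical: males | Non-pathologic vs. surgical: females | Non-pathologic vs. surgical: overall | Non-surgical vs. surgical: males | Non-surgical vs. surgical: females | Non-surgical vs. surgical: overall |
|---|---|---|---|---|---|---|---|---|---|
| D2 | 0.0108* | 0.0010* | <0.0001* | 0.5893 | 0.0258 | 0.0544 | 0.0003* | 0.0213 | <0.0001* |
| Percent jitter | | | 0.3204 | | | 0.1245 | | | 0.1741 |
| Percent shimmer | | | 0.0004* | | | 0.0002* | | | 0.7393 |
| SNR | | | 0.0007* | | | 0.0001* | | | 0.4253 |
| F0 | | | 0.0533 | | | 0.0001* | | | 0.0041* |
| vF0 | | | 0.4286 | | | 0.6466 | | | 0.1781 |
| vAm | | | 0.1564 | | | 0.8379 | | | 0.0370 |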

Comparisons are made with non-pathologic, non-surgical (control), and surgical groups. The p-values are also given for statistical comparisons of D2 values by gender. The p-values are determined using an unpaired Wilcoxon rank sum test.

* p-Values that are statistically significant at the adjusted significance levels of 0.0167 for D2 and 0.0083 for perturbation measures. Significance levels are adjusted using a Bonferroni correction.

Using the adjusted p-values for comparisons of D2, the mean D2 value of the non-surgical group was significantly higher than the mean D2 value of the non-pathologic group (p < 0.0001) and the surgical group (p < 0.0001). The mean D2 values were further analyzed by gender. The mean D2 value of non-surgical males and females was significantly higher than the mean D2 value of non-pathologic males (p = 0.0108) and females (p = 0.0010) respectively. The mean D2 value of non-surgical males was significantly higher than the mean D2 value of surgical males (p = 0.0003).

Using the adjusted p-values for comparisons of perturbation measures, the mean percent shimmer of the non-pathologic group was significantly lower than the mean percent shimmer of the non-surgical (p = 0.0004) and surgical (p = 0.0002) groups. The mean SNR of the non-pathologic group was significantly higher than the mean SNR of the non-surgical (p = 0.0007) and surgical (p = 0.0001) groups. Mean F0 was significantly higher in the non-pathologic group than in the surgical group (p = 0.0001). Mean F0 was significantly higher in the non-surgical group than in the surgical group (p = 0.0041).

4. Discussion

The mean D2 value was significantly higher for the non-surgical group than for the non-pathologic subjects, supporting the validity of D2 in discerning vocal signal quality differences. The mean D2 value of the non-surgical group was significantly higher than the mean D2 value of the surgical group, reflecting an improvement in voice signal quality with DBS of the STN.

Correlation dimension (D2) represents a distinct and important property of vocal signal, i.e. complexity in terms of degrees of freedom, and is useful for evaluating vocal irregularity in PD where traditional perturbation measures may be unsuitable. The decreased D2 value of stimulated patients may indicate a decrease in vocal fold rigidity and stiffness, two vocal symptoms commonly associated with PD. The simultaneous contraction of the opposing thyroarytenoid and cricothyroid muscles to shorten and lengthen the vocal folds is believed to cause the vocal stiffness found in PD patients (Gallena, Smith, Zeffiro, & Ludlow, 2001). This simultaneous contraction may result in asymmetric stiffness and non-coordinated movement of the vocal folds, which may induce subharmonics and chaos in the voice quantified by correlation dimension (Behrman & Baken, 1996; Rahn et al., 2007).

Mean D2 values were also statistically compared by gender. The mean D2 values of non-surgical males and females were significantly higher than mean D2 values of non-pathologic males and females with no significant differences by gender between surgical males and females and non-pathologic males and females, suggesting an improvement in voice signal quality for both genders in treated patients. Only the mean D2 value of non-surgical males, however, was significantly higher than the mean D2 value of surgical males, indicating greater vocal improvements from DBS in males. Previous studies have shown gender-related differences in PD voices, perhaps because of differences in laryngeal size and differences in the STN (Hertrich & Ackerman, 1995; Marceglia et al., 2006). Gender-specific vocal signal effects of DBS deserve further analysis. The results of our study should be regarded cautiously because of the small sample sizes.

Previous studies have shown that nonlinear dynamic analysis can differentiate patients with normal voices from patients with unilateral laryngeal paralysis or Parkinson’s disease (Rahn et al., 2007; Zhang, Jiang, Biazzo, et al., 2005). Another study used nonlinear dynamic analysis to distinguish voice before and after surgical excision of vocal polyps (Zhang et al., 2004). This study extends the use of nonlinear dynamic analysis to evaluating the effectiveness of DBS of the STN on Parkinsonian voice.

Fundamental frequency was significantly higher in the non-pathologic and non-surgical groups than in the surgical group, indicating a shift in fundamental frequency away from normal values in treated patients. As mean fundamental frequency was not tested by gender, these results should be interpreted with caution and deserve further analysis to determine their validity.

Mean percent shimmer was significantly higher, and mean SNR significantly lower, in the non-surgical and surgical groups than in the non-pathologic group; mean percent jitter was also higher in both PD groups, but not significantly so (Tables 4 and 5). In the case of the surgical PD and non-pathologic group comparison, the perturbation measures of percent shimmer and SNR detect vocal differences that D2 does not. These results suggest the use of D2 as a complementary rather than substitute measure for traditional perturbation measures. A possible explanation for the lack of significant differences in perturbation measures between the non-surgical and surgical groups may be the vocal signals used in this study. As shown in Fig. 1, many of the signals can be classified as type 2 or even type 3 signals. Type 2 signals contain modulations and subharmonics, and type 3 signals are aperiodic and chaotic. Both signal types are unsuitable for perturbation analysis (Milenkovic & Read, 1992). As a result, greater emphasis may be placed on the results from nonlinear dynamic analysis, which are valid for both nearly periodic and aperiodic voice, rather than on the results of traditional perturbation analysis, which are unreliable for aperiodic voice (Hertrich & Ackerman, 1995; Herzel et al., 1994; Rahn et al., 2007; Titze et al., 1993; Zhang & Jiang, 2003; Zhang et al., 2004).

Other studies have also evaluated traditional perturbation measures, such as percent jitter, percent shimmer, and noise-to-harmonics ratio, and found that they decreased significantly after DBS, reflecting an improvement in PD voice (Dejonckere et al., 1996; Gentil et al., 2003; Hoffman-Ruddy et al., 2001; Reijonen et al., 2002). The PD voices in these studies, however, may have been less impaired and less aperiodic, and therefore, would be suitable for traditional perturbation measures.

The non-surgical group is composed of different individuals from the surgical group, although great care was taken to match the two groups as closely as possible, not only in age and gender ratios but also in Hoehn-Yahr and UPDRS-III motor scores (Tables 1 and 2). By establishing similar baseline levels of vocal impairment between non-surgical and surgical patients, we may better attribute differences in vocal measures between the groups to the effect of the stimulation. Sample sizes of surgical studies are usually relatively small because of the stringent inclusion/exclusion criteria for patients to qualify for DBS surgery (Dromey et al., 2000; Gentil, Garcia-Ruiz, Pollak, & Benabid, 1999; Taha, Janszen, & Favre, 1999).

Future studies could investigate the effects of DBS treatment used in conjunction with levodopa treatment as they may improve different symptoms of Parkinson’s disease. Other studies could also investigate whether severity of disease (in terms of UPDRS motor score, years since onset, or another measure) is related to aperiodicity of voice in Parkinson’s disease and whether the severity affects the usefulness of DBS treatment.

Acknowledgments

This study was supported by NIH Grant No. RO1DC006019 from the NIDCD and by NSFC Grant No. 30328029. Alejandro Munoz at the University of Wisconsin-Madison Department of Surgery provided invaluable statistical analysis expertise.

Appendix A. Continuing education

  1. Which voice outcome measure has been shown to be more reliable for evaluating aperiodic voice?

    1. Percent jitter.

    2. Signal-to-noise ratio.

    3. Correlation dimension.

    4. Percent shimmer.

  2. Why are perturbation methods often unreliable for evaluating treatment effects on Parkinsonian voice?

    1. Clinical treatments like deep brain stimulation often target aspects of Parkinsonian voice not quantified by perturbation methods.

    2. Perturbation methods require the extraction of a stable pitch period from the sustained phonation, which is often difficult for Parkinsonian voice.

    3. Perturbation methods cannot be associated to perceptual aspects of Parkinsonian voice, and as a result have no clinical relevance.

    4. The computer programs for perturbation methods are not standardized, making these methods difficult to use for evaluating treatment effects.

  3. What specific property of the voice signal does nonlinear dynamic analysis evaluate?

    1. Vocal intensity level.

    2. Complexity in terms of degrees of freedom (vocal irregularity).

    3. Fundamental frequency variability.

    4. Laryngeal muscle coordination.

  4. Which type(s) of voice signals have modulations and subharmonics?

    1. Type 1.

    2. Type 2.

    3. Type 3.

    4. Both 2 and 3.

    5. All of the above.

  5. What does a decreased correlation dimension value for a Parkinsonian voice most directly indicate?

    1. Decreased vocal fold stiffness.

    2. Improved vocal intensity.

    3. Improved maximal phonation time.

    4. Increased subglottal pressure generation.

References

  1. Behrman A, Baken RJ. Correlation dimension of electroglottographic data from healthy and pathologic subjects. The Journal of the Acoustical Society of America. 1996;100:615–629. doi: 10.1121/1.419621.
  2. Benabid AL. Deep brain stimulation for Parkinson’s disease. Current Opinion in Neurobiology. 2003;13:696–706. doi: 10.1016/j.conb.2003.11.001.
  3. Broggi G, Franzini A, Marras C, Romito L, Albanese A. Surgery of Parkinson’s disease: Inclusion criteria and follow-up. Neurological Sciences. 2003;24(Suppl 1):S38–S40. doi: 10.1007/s100720300037.
  4. Dejonckere P, Remacle M, Frensel-Elbaz E, Woisard V, Crevier-Buchman L, Millet B. Differentiated perceptual evaluation of pathological voice quality: Reliability and correlations with acoustic measurements. Revue de laryngologie-otologie-rhinologie. 1996;117:219–224.
  5. Dromey C, Kumar R, Lang A, Lozano A. An investigation of the effects of subthalamic nucleus stimulation on acoustic measures of voice. Movement Disorders. 2000;15:1132–1138. doi: 10.1002/1531-8257(200011)15:6<1132::aid-mds1011>3.0.co;2-o.
  6. Fraser AM, Swinney HL. Independent coordinates for strange attractors from mutual information. Physical Review A. 1986;33:1134–1140. doi: 10.1103/physreva.33.1134.
  7. Gallena S, Smith PJ, Zeffiro T, Ludlow CL. Effects of levodopa on laryngeal muscle activity for voice onset and offset in Parkinson disease. Journal of Speech, Language, and Hearing Research. 2001;44:1284–1299. doi: 10.1044/1092-4388(2001/100).
  8. Gentil M, Chauvin P, Pinto S, Pollak P, Benabid AL. Effect of bilateral stimulation of the subthalamic nucleus on Parkinsonian voice. Brain and Language. 2001;78:233–240. doi: 10.1006/brln.2001.2466.
  9. Gentil M, Garcia-Ruiz P, Pollak P, Benabid AL. Effect of stimulation of the subthalamic nucleus on oral control of patients with Parkinsonism. Journal of Neurology, Neurosurgery, and Psychiatry. 1999;67:329–333. doi: 10.1136/jnnp.67.3.329.
  10. Gentil M, Pinto S, Pollak P, Benabid AL. Effect of bilateral stimulation of the subthalamic nucleus on Parkinsonian dysarthria. Brain and Language. 2003;85:190–196. doi: 10.1016/s0093-934x(02)00590-4.
  11. Grassberger P, Procaccia I. Measuring the strangeness of strange attractors. Physica D. 1983;9:189–208.
  12. Hanson DG, Gerratt BR, Ward PH. Cinegraphic observations of laryngeal function in Parkinson’s disease. Laryngoscope. 1984;94:348–353. doi: 10.1288/00005537-198403000-00011.
  13. Hertrich I, Ackermann H. Gender-specific vocal dysfunctions in Parkinson’s disease: Electroglottographic and acoustic analyses. The Annals of Otology, Rhinology, and Laryngology. 1995;104:197–202. doi: 10.1177/000348949510400304.
  14. Hertrich I, Lutzenberger W, Spieker S, Ackermann H. Fractal dimension of sustained vowel productions in neurological dysphonias: An acoustic and electroglottographic analysis. The Journal of the Acoustical Society of America. 1997;102:652–654. doi: 10.1121/1.419711.
  15. Herzel H, Berry D, Titze IR, Saleh M. Analysis of vocal disorders with methods from nonlinear dynamic analysis. Journal of Speech and Hearing Research. 1994;37:1008–1019. doi: 10.1044/jshr.3705.1008.
  16. Hoffman-Ruddy B, Schulz G, Vitek J, Evatt M. A preliminary study of the effects of subthalamic nucleus (STN) deep brain stimulation (DBS) on voice and speech characteristics in Parkinson’s Disease (PD). Clinical Linguistics and Phonetics. 2001;15:97–101. doi: 10.3109/02699200109167638.
  17. Jiang JJ, Zhang Y, Ford CN. Nonlinear dynamics of phonations in excised larynx experiments. The Journal of the Acoustical Society of America. 2003;114:2198–2205. doi: 10.1121/1.1610462.
  18. Jiang JJ, Zhang Y, McGilligan C. Chaos in voice, from modeling to measurement. Journal of Voice. 2006;20:2–17. doi: 10.1016/j.jvoice.2005.01.001.
  19. Karnell MP, Chang A, Smith A, Hoffman H. Impact of signal type on validity of voice perturbation measures. NCVS Status and Progress Report. 1997;11:91–94.
  20. Karnell MP, Hall KD, Landahl KL. Comparison of fundamental frequency and perturbation measurements among three analysis systems. Journal of Voice. 1995;9:383–393. doi: 10.1016/s0892-1997(05)80200-0.
  21. Krack P, Batir A, Van Blercom N, Chabardes S, Fraix V, Ardouin C, et al. Five-year follow-up of bilateral stimulation of the subthalamic nucleus in advanced Parkinson’s disease. The New England Journal of Medicine. 2003;349:1925–1934. doi: 10.1056/NEJMoa035275.
  22. Krause M, Fogel W, Mayer P, Kloss M, Tronnier V. Chronic inhibition of the subthalamic nucleus in Parkinson’s disease. Journal of the Neurological Sciences. 2004;219:119–124. doi: 10.1016/j.jns.2004.01.004.
  23. Kumar A, Mullick SK. Nonlinear dynamical analysis of speech. The Journal of the Acoustical Society of America. 1996;100:615–629.
  24. Marceglia S, Mrakic-Sposta S, Foffani G, Cogiamanian F, Caputo E, Egidi M, et al. Gender-related differences in the human subthalamic area: A local field potential study. European Journal of Neuroscience. 2006;24:3213–3222. doi: 10.1111/j.1460-9568.2006.05208.x.
  25. Milenkovic P, Read C. CSpeech version 4 user’s manual. Madison, WI: University of Wisconsin; 1992.
  26. Narayanan SS, Alwan AA. A nonlinear dynamical systems analysis of fricative consonants. The Journal of the Acoustical Society of America. 1995;97:2511–2524. doi: 10.1121/1.411971.
  27. Rahn DA, Chou M, Zhang Y, Jiang J. Phonatory impairment in Parkinson’s disease: Evidence from nonlinear dynamic analysis and perturbation analysis. Journal of Voice. 2007;21:64–71. doi: 10.1016/j.jvoice.2005.08.011.
  28. Ramig LA, Scherer RC, Titze IR, Ringel SP. Acoustic analysis of voices of patients with neurologic disease: Rationale and preliminary data. The Annals of Otology, Rhinology, and Laryngology. 1988;97:164–172. doi: 10.1177/000348948809700214.
  29. Reijonen P, Soderlund SL, Rihkanen H. Results of fascial augmentation in unilateral vocal fold paralysis. The Annals of Otology, Rhinology, and Laryngology. 2002;111:523–529. doi: 10.1177/000348940211100608.
  30. Rousseaux M, Krystkowiak P, Kozlowski O, Ozsancak C, Blond S, Destee A. Effects of subthalamic nucleus stimulation on Parkinsonian dysarthria and speech intelligibility. Journal of Neurology. 2004;251:327–334. doi: 10.1007/s00415-004-0327-1.
  31. Santens P, De Letter M, Van Borsel J, De Reuck J, Caemaert J. Lateralized effects of subthalamic nucleus stimulation on different aspects of speech in Parkinson’s disease. Brain and Language. 2003;87:253–258. doi: 10.1016/s0093-934x(03)00142-1.
  32. Taha J, Janszen M, Favre J. Thalamic deep brain stimulation for the treatment of head, voice, and bilateral tremor. Journal of Neurosurgery. 1999;91:68–72. doi: 10.3171/jns.1999.91.1.0068.
  33. Theiler J. Spurious dimension from correlation algorithms applied to limited time series data. Physical Review A. 1986;34:2427–2432. doi: 10.1103/physreva.34.2427.
  34. Titze IR. Summary statement: Workshop on acoustic voice analysis. USA: Denver, CO: National Center for Voice and Speech; 1995.
  35. Titze IR, Baken R, Herzel H. Evidence of chaos in vocal fold vibration. In: Titze IR, editor. Vocal fold physiology: Frontiers in basic science. San Diego USA: Singular; 1993. pp. 143–188.
  36. Vaillancourt D, Prodoehl J, Verhagen Metman L, Bakay R, Corcos D. Effects of deep brain stimulation and medication on bradykinesia and muscle activation in Parkinson’s disease. Brain. 2004;127(Pt. 3):491–504. doi: 10.1093/brain/awh057.
  37. Zhang Y, Jiang JJ. Nonlinear dynamic analysis in signal typing of pathological human voices. Electronics Letters. 2003;39:1021–1023.
  38. Zhang Y, Jiang JJ, Biazzo L, Jorgensen M. Perturbation and nonlinear dynamic analyses of voices from patients with unilateral laryngeal paralysis. Journal of Voice. 2005a;19:519–528. doi: 10.1016/j.jvoice.2004.11.005.
  39. Zhang Y, Jiang J, Rahn DA. Studying vocal fold vibrations in Parkinson’s disease with a nonlinear model. Chaos. 2005b;15:33903. doi: 10.1063/1.1916186.
  40. Zhang Y, Jiang J, Wallace S, Zhou L. Comparison of nonlinear dynamic methods and perturbation methods for voice analysis. The Journal of the Acoustical Society of America. 2005c;118:2551–2560. doi: 10.1121/1.2005907.
  41. Zhang Y, McGilligan C, Zhou L, Vig M, Jiang J. Nonlinear dynamic analysis of voices before and after surgical excision of vocal polyps. The Journal of the Acoustical Society of America. 2004;115:2270–2277. doi: 10.1121/1.1699392.
