Abstract
The purpose of this paper is to review and evaluate measures of speech production that could be used to document effects of Deep Brain Stimulation (DBS) on speech performance, especially in persons with Parkinson disease (PD). A small set of evaluative criteria for these measures is presented first, followed by consideration of several speech physiology and speech acoustic measures that have been studied frequently and reported on in the literature on normal speech production, and speech production affected by neuromotor disorders (dysarthria). Each measure is reviewed and evaluated against the evaluative criteria. Embedded within this review and evaluation is a presentation of new data relating speech motions to speech intelligibility measures in speakers with PD, amyotrophic lateral sclerosis (ALS), and control speakers (CS). These data are used to support the conclusion that at the present time the slope of second formant transitions (F2 slope), an acoustic measure, is well suited to make inferences to speech motion and to predict speech intelligibility. The use of other measures should not be ruled out, however, and we encourage further development of evaluative criteria for speech measures designed to probe the effects of DBS or any treatment with potential effects on speech production and communication skills.
Keywords: dysarthria, intelligibility, deep brain stimulation, speech physiology
1.0 Introduction
Speech production is a complex motor behavior, implemented by multiple structures moving simultaneously and often asynchronously. The movements occur within the respiratory system, larynx, and vocal tract, the latter a flexible tube extending from the upper level of the vocal folds to the lips. The vocal tract plays the major role in the production of speech sounds such as vowels and consonants. Two intertwined results of this time-varying ensemble of movements are 1) the airstream generated by movements of the respiratory system and modulated by the nearly-periodic valving at the level of the vocal folds is intermittently and briefly interrupted, partially or completely, by movements of vocal tract structures (such as the lips, tongue, jaw), and 2) the time-varying shapes and valvings within the vocal tract, often combined with quasi-periodic energy generated by the vibrating vocal folds but sometimes independently of this energy, generate the speech acoustic signal.
The generation of the speech acoustic signal by the underlying movements of the speech mechanism defines one side of the communication process. The other side is a listener’s understanding of the speech acoustic signal. Depending on the specific design of an experimental test, the identity of the speaker and listener, and additional factors, lawful relations of varying strength can be demonstrated between the “goodness” of the speech acoustic signal and a listener’s ability to understand a talker’s speech movements (Weismer, 2008).
When a speaker experiences some loss of ability to control speech movements, a measure of listener ability to identify the intended speech message should reflect that loss. Moreover, this listener measure might be expected to vary systematically with the magnitude of the speech movement problem: the greater the problem, the lower the index of speech understanding. Correspondingly, the index of speech understanding should improve with improvements in speech movements resulting from disease recovery or any of several therapies designed to ameliorate the problem. The opposite side of this expectation is of deteriorating speech movements with disease progression or as an unintended result of certain contemporary drug or surgical therapies whose primary purpose is relief from nonspeech (limb) symptoms in persons with neurological disease. Some loss of speech movement control, of course, might not affect the recovery by the listener of the intended message, but may still be detectable by a listener as a deviation from “normal” speech; because this paper is concerned with speech movement effects on intelligibility (rather than, say, “acceptability”), we will not be concerned with this issue further, although it is worthy of careful consideration in the future.
An open question concerns the effects of Deep Brain Stimulation (DBS) on speech production and intelligibility, primarily in persons with Parkinson’s disease (PD) but in patients with other neurological diseases as well. The current paper considers different ways to measure changes in speech production, and by implication speech intelligibility, resulting from disease progression, speech therapy, or drug and/or surgical interventions whose primary goal is alleviation of nonspeech (e.g., limb) symptoms. As reviewed by Tripoliti and Limousin (2010), some studies of patients with PD have shown limited improvement in speech measures as a result of DBS, but other studies show an absence of a speech effect or a worsening of speech production and speech intelligibility. Many published evaluations of DBS and speech production/intelligibility have made use of relatively crude measures, or measures whose relevance to speech can be questioned; these issues are treated below. The discussion that follows focuses on DBS and its potential effects on speech production/intelligibility, but the considerations and conclusions may be applicable to any treatment- or disease-progression changes in speech production and intelligibility.
2.0 Outline of the Essay
This paper first sketches a small set of criteria for evaluation of candidate measures to assess the effects of DBS on speech production. Individual candidate measures are then reviewed under the general headings of speech physiology and acoustic measures. Consideration of speech physiology includes electromyographic, oromotor nonverbal, electropalatographic, speech movement, and aerodynamic measures. The potential utility of each candidate measure in both clinical and research evaluation of DBS effects is considered with reference to the criteria mentioned above and described more fully below. The purpose of the critiques of the candidate measures is to offer a reasoned identification of a measure or measures that are likely to provide effective, clinically feasible evaluation of the effects of DBS (or any other therapy) on speech production and intelligibility. A novel contribution of the paper is the presentation of new data on speech movement in speakers with Parkinson’s disease (PD) and amyotrophic lateral sclerosis (ALS), and how those data may be inferred from simpler measures of speech acoustics; in addition, a systematic relationship is described between specific speech movement measures and estimates of speech intelligibility. These data are presented here because in our opinion they provide an empirical basis for choosing a measure that is likely to be effective in judging the effects of DBS on speech production skills in treated patients. The paper concludes with very specific recommendations for a DBS study that could demonstrate whether or not the recommendations offered here are empirically defensible.
3.0 Criteria for Evaluative Measures
Some general criteria are presented here for measures designed to evaluate the effects of DBS on speech production and intelligibility. These criteria are offered as qualitative considerations in the choice of a measure (or measures), and serve to organize the critical consideration of each measure. When individual measures are considered below as candidates to evaluate the effects of DBS on speech production and intelligibility, the degree to which they meet each one of the four criteria described here is judged and justified.
First, the measure should have a demonstrated sensitivity to dysarthria. This sensitivity should be documented in currently available, published research showing that the measure routinely separates speakers with dysarthria from well-matched, control speakers. Optimally, the sensitivity should extend to gradations in severity of dysarthria, and possibly to differences in type of dysarthria. Sensitivity to severity variation is particularly important, because the purpose of making a measurement to document effects of DBS on speech production and/or intelligibility is, in effect, measuring possible variations in severity (either increasing or decreasing) consequent to stimulation.
Second, the measure should have relevance to functional speech communication. The link between a measure of therapeutic effect and a metric of “functional communication” may be established (or argued) in several ways. Most obviously, previous demonstrations in the research literature of a measure’s ability to index something about a patient’s speech production ability—that is, where the measure is shown to predict a more global index of speech success, such as speech intelligibility, or even a change during speech production that indicates an adjustment in the direction of potentially improved intelligibility—are the best foundation for use of a specific measure. Less obviously, but in the current opinion of substantial importance, theoretical and logical considerations can be brought to bear on the relevance of a particular measure to functional communication.
Third, because dysarthria is, at one level, a speech movement disorder wherein the disordered movements result in a degraded acoustic signal and difficulties with speech intelligibility, a direct measure of movement, or a measure that can be interpreted in terms of speech movement, seems desirable. Measures of limb control, obtained directly or inferred from clinical tools such as the UPDRS motor scale (Fahn, Elton, et al., 1987) are explicit in the evaluation of DBS effects (e.g., Klosterman et al., 2010; Sturman et al., 2010). It therefore seems desirable to investigate the speech component of neurological disease and its treatment with measures of (or inferences to) speech movement.
Fourth and finally, ease and simplicity of application and interpretation are desirable features for these measures. As described below, some measures (and especially of speech physiology) involve fairly elaborate technical instrumentation and patient preparation. Data collection with these techniques can be time consuming and require (in some cases) highly specialized expertise. A measure or measures that reduce these complications without sacrificing the ultimate goal—a stable, reproducible index of change that has interpretative power—is highly valued.
These criteria are only a subset of ones that could be discussed. For example, it is possible that a measure could be effective in demonstrating an improved physiological “underpinning” of the speech mechanism—such as increased tongue strength—even if this improvement has little immediate bearing on a patient’s ability to produce more intelligible speech. To cover this possibility, a criterion of “any evidence of improvement in speech mechanism performance for any task” would need to be added to the four stated above. We choose a more constrained set of criteria for this essay, but recognize the possibility of other views of what may or may not make a particular measure desirable.
4.0 Candidate Measures for the Evaluation of DBS Effects on Speech Production/Intelligibility
4.1 Overview
Review chapters (see, for example, chapters in Hardcastle, Laver, & Gibbon, 2010; and Pisoni & Remez, 2005) are readily available for technical and interpretative details associated with the many measures that have been applied to speech production and perception performance. The measures discussed below were chosen partly because they have been the subject of a good deal of research in the general literature on speech production and perception, and partly because they enjoy some history in the particular case of dysarthria. In all cases, the question of the degree to which a measure predicts speech intelligibility is considered, consistent with the criterion of functional relevance stated above.
4.2 Physiological Measures
4.2.1 Electromyography (EMG)
Electromyography (EMG) involves the recording of electrical activity associated with muscle contraction. Because EMG provides a more or less “direct” evaluation of muscle activity, it may seem like an attractive measure for the evaluation of DBS effects; after all, at one important level, DBS seeks to modify problems with muscle contraction resulting from a disease process such as PD. Perhaps, as some investigators have suggested, leakage of electrode current from a target site (e.g., subthalamic nucleus [STN]) into regions of the corticobulbar tract (Narayana et al., 2009; Pinto et al., 2005; Tassorelli et al., 2009) or other nearby pathways (Åström et al., 2010) is responsible for the stimulation-induced reduction of speech intelligibility reported in some patients; perhaps the basis of this reduction could be identified by EMG recordings from relevant speech muscles. EMG has been used to investigate speech production in dysarthria (see, for example, Leanderson, Meyerson, & Persson, 1972; Hunker, Abbs, & Barlow, 1982; O’Dwyer, Neilson, Guitar, Quinn, & Andrews, 1983; Barlow & Abbs, 1984; O’Dwyer & Neilson, 1988), but over the last twenty years the approach has not enjoyed much attention.
A substantial body of work on the sensitivity of EMG to dysarthria has never taken shape, probably because the proper studies are exceedingly complicated. For example, the speech mechanism includes a large number of muscles, not all of which can be sampled simultaneously in a single experiment. Investigators who used EMG to study dysarthria typically obtained signals only from the relatively accessible lip, jaw, and other facial musculature (and in rare cases, the tongue: see O’Dwyer and Neilson, 1988). The absence of a substantial data base on speech EMG in normal talkers and talkers with dysarthria does not allow EMG to rise to the criterion of having empirically-demonstrated sensitivity to dysarthria. It is also unknown how a particular aspect of an EMG signal can be used to infer the specifics of a particular kind of speech movement disorder. The very few data available on this issue (see O’Dwyer and Neilson, 1988) suggest the great complexity of EMG as a potential measure for evaluation of change due to treatment. These considerations, plus the requirements for recording EMG signals and interpreting what they mean relative to a speech event (defined in movement, acoustic, and/or perceptual terms) do not allow such measures to meet the “ease” criterion stated above. The absence of data to estimate the sensitivity of EMG measures to dysarthria also means the relevance of EMG measures to measures of functional communication remains unknown. EMG measures of the tongue, arguably the most influential articulator, are difficult to obtain and interpret, yet another reason to question the ease and relevance of such measures (e.g., see Gérard, Perrier, & Payan, 2006, p. 86).
We would conclude that based on currently available data, EMG may not be a suitable measure for evaluation of DBS treatment effects on speech because the measure fails to meet at least three of the evaluation criteria. This conclusion does not necessarily apply to the evaluation of DBS treatment effects on limb function (see, for example, Journee, Postma, & Staal, 2007; and Levin, Krafczyk, Valkovic, Eggert, Claassen, & Bötzel, 2009; compare to Vaillancourt, Prodoehl, Sturman, Bakay, Metman, & Corcos, 2006).
4.2.2 Oromotor, Nonverbal (Nonspeech) Measures
For the purposes of this discussion, oromotor, nonverbal measures include measures of orofacial behavior performed in the absence of a speech acoustic signal generated by the behavior and that could reasonably be judged as a natural speech utterance. Such nonverbal measures include (but are not limited to) instrumental transduction of forces exerted by the lips, tongue, and jaw, as well as the various derivatives of these measures including force change per unit time, target force accuracy, force fatigue, and so forth. The requirement for absence of a “speech acoustic signal that could be…judged as a natural speech utterance” to characterize an oromotor, nonverbal measure means, in the current view, that sustained vowels and speech alternating motion rates (speech diadochokinetic sequences) fit into the nonspeech category. Perhaps such tasks are better termed “quasi-speech tasks” than oromotor, nonverbal tasks (Weismer, 2006a).
Oromotor, nonverbal measures are discussed here because they have been used to evaluate the effects of DBS on orofacial structures, and have a fairly extensive history in the speech research literature (see review in Barlow & Bradford, 1992; and Weismer, 2006a). There are findings of increased lip, tongue, or jaw maximum strength, or of improved force control, as an outcome of DBS in subthalamic nucleus (STN-DBS: Gentil et al., 1999; Pinto et al., 2003; Gentil et al., 2003), although worsening of lip and tongue force control has also been reported (Gentil et al., 2000). The theory underlying such measurements as a window to speech motor control seems to be that motor control capabilities of the articulators are independent of specific task requirements. For various theoretical and methodological reasons discussed in Ziegler (2003) and Weismer (2006a), oromotor nonverbal measures should not be expected to provide meaningful insight to speech motor control processes (for an opposing view, see Robin et al., 1997; and Ballard et al., 2003). More importantly, when a careful analysis is made of published empirical tests of relationships between oromotor nonverbal performance and speech performance, the results are dramatically negative, even down to very specific details (e.g., lip performance in an oromotor, nonverbal task does not relate in a meaningful way to production of labial speech sounds: see Weismer, 2006a, for the analysis of the research literature on this issue).
These considerations lead us to the conclusion that oromotor, nonverbal measures are not good candidates to evaluate the effects of DBS on speech production. With respect to the evaluation criteria, measures of oromotor, nonverbal behaviors are relatively simple to obtain provided the right equipment is available. The measures also have shown sensitivity to the presence of dysarthria. Group performance for patients with neurological diseases known to produce dysarthria is typically worse than performance of control subjects. Interestingly, there is no evidence for systematically unique abnormalities in oromotor, nonverbal control as a function of type of neurological disease (or for the different dysarthria types assumed to vary by neurological disease). The evidence does not exist—one way or the other— because the correct experiment has apparently never been performed (Weismer & Kim, 2010).
Pinto et al. (2005), summarizing work on the effects of DBS on oromotor nonverbal control, argued that there is likely a substantial disconnection between an assessment of the “components” of speech and the integrated process of producing speech. By “components” Pinto et al. (2005) refer to the isolation of a particular articulator, such as lips, or tongue, and an assessment of its control capabilities independently of other articulators. By “integrated process” they mean the coordinated movements of all articulators to shape the vocal tract for the production of a time-varying target acoustic signal that is listener-directed in the sense of serving the needs of message transmission. As argued elsewhere (Weismer, 2006a), a measurement approach dedicated to isolation and evaluation of specific articulators commits, by technical necessity, to a nonspeech task. Pinto et al. (2005, p. 1512) expressed concern about the applicability of nonspeech measures to “functional” speech by noting for several patients, “…the inconsistent conclusions that assessment of intelligibility and speech subsystems may lead to”. Other examples of discordant results for nonspeech and speech performance in response to DBS or other brain stimulation can be found in Isaias, Alterman, and Tagliati (2009) and Narayana et al. (2009).
In summary, oromotor nonverbal measures meet the sensitivity criterion, as well as the “ease” criterion. These measures do not appear to meet the criterion of relevance to functional communication, nor have they been demonstrated to provide insight to speech movements. In fact, performance on oromotor, nonverbal tasks does not seem to have much relevance to articulatory motions and their speech signal products.
4.2.3 Electropalatography
The modern version of electropalatography (EPG) uses a minimally-invasive pseudopalate, custom-fit for each subject/patient and instrumented with an array of sensors for detection of tongue-palate contact (or contact pressure: see Murdoch, Goozée, Veidt, Scott, & Meyers, 2004) during speech and/or swallowing. The technique has been used to describe tongue-palate contact patterns for lingual consonants in a number of different languages, and to study coarticulation (the influence of surrounding sounds on the articulation of a “target” sound) (see Gibbon & Nikolaidis, 1999, for a review). EPG patterns may be characterized by the number of sensors contacted during a consonant closure interval, the primary place of contact, the symmetry of contact, and/or the openness of the contact pattern for consonants whose manner of articulation requires a narrow, open conduit (as in fricatives, for which the presence, length, and depth of a groove between the tongue and palate is critical to the production of frication noise).
EPG has been used to study lingual consonant production in speakers with dysarthria, including speakers with PD (see, for example, McAuliffe, Ward, & Murdoch, 2006a, b; Kuruvilla, Murdoch, & Goozée, 2009; Timmins et al., 2009; and review in McAuliffe & Ward, 2006). As reviewed by McAuliffe and Ward (2006), several of these studies show differences in EPG patterns between speakers with dysarthria and neurologically-normal speakers, but the differences are often difficult (if not impossible) to interpret. For example, reports of apparently “atypical” palatal contact patterns associated with perception of a correct lingual consonant are fairly common (see cases reviewed by McAuliffe and Ward, 2006; and similar mismatches between EPG patterns and the perception of /ʃ/ reported by Timmins et al. (2009) for speakers with Down syndrome). McAuliffe, Ward, and Murdoch (2006) report the opposite effect for speakers with Parkinson disease, wherein apparently “normal” EPG patterns for lingual consonants are associated with consonants perceived as “imprecise”. McAuliffe et al. (2006a) also described cases of correctly perceived /l/ with atypical EPG patterns, and of /t/ judged as poorly articulated when the EPG pattern was “normal”.
These observations are presented here to support the conclusion that EPG measures do not, at least as reported to date, support the criteria of sensitivity to dysarthria or relevance to functional communication. EPG measures do not meet these criteria because of the mismatches between EPG patterns and perception of segmental (articulatory) integrity. There are several studies in which EPG pattern differences have been reported between speakers with dysarthria and control groups, or between specific patients and one or two control speakers (see McAuliffe & Ward, 2006), but group effects are not consistent across studies. A recent study by Hartinger, Tripoliti, Hardcastle, and Limousin (2011) reported EPG changes for two patients with Parkinson disease pre- and post-DBS stimulation, in which tongue-to-palate contact patterns for stop consonants were better for one patient, but worse for the other following stimulation. Although Hartinger et al. (2011) interpreted the EPG data in terms of articulatory speeds, the specific relationship between variations in speech movements and specific EPG patterns is unknown, largely because the proper studies have not been done. To date, then, the EPG literature does not seem consistent with the third criterion of providing insight to the underlying speech movement disorder and its potential modification by a treatment such as DBS. Finally, the collection of EPG data requires a custom-fabricated pseudopalate, proper equipment and software for recording and processing the multichannel (multisensor) data. For these reasons, EPG measures do not meet the “ease” criterion. Based on current data, EPG does not seem to be a good candidate for evaluating the effects of DBS on speech production.
4.2.4 Speech Movement
In this paper the term “speech movement” refers to motions of the upper articulators sensed and recorded using an electromagnetic (Perkell et al., 1992) or x-ray (Westbury, 1994) device. These devices use small markers placed on the lips, jaw, and tongue to represent motions of an entire structure or different parts of a single structure such as the tongue. Other techniques, including strain-gauge and optical transduction, have been limited to observation of the lips and jaw (Hunker, Abbs, & Barlow, 1982; Dromey, Ramig, & Johnson, 1995; Ackermann et al., 1997; Yunusova, Green, Lindstrom, Ball, Pattee, & Zinman, 2010). The measures derived from these motions include displacements, speeds, and accelerations of individual markers, or more complex inter-marker measures to capture coordination patterns between two or more articulators. Here the discussion is confined to marker displacements and speeds.
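As an illustration of how displacement and speed might be derived from a sampled marker trajectory, the following minimal sketch computes net displacement, path length, and peak speed for a single marker. The function name `marker_kinematics`, the dictionary keys, and the sampling rate used in the example are hypothetical and are not taken from the studies cited above; coordinates are assumed to be in millimeters in a two-dimensional (midsagittal) plane.

```python
import math

def marker_kinematics(xs, ys, sample_rate_hz):
    """Summarize one marker trajectory (hypothetical helper, for illustration).

    xs, ys: equally long sequences of marker coordinates in mm.
    sample_rate_hz: sampling rate of the trajectory (an assumed input).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) samples of equal length")
    # Euclidean distance travelled between consecutive samples (mm).
    steps = [
        math.hypot(x1 - x0, y1 - y0)
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    ]
    # Speed for each step: distance per sample times samples per second (mm/s).
    speeds = [d * sample_rate_hz for d in steps]
    return {
        "net_displacement_mm": math.hypot(xs[-1] - xs[0], ys[-1] - ys[0]),
        "path_length_mm": sum(steps),
        "peak_speed_mm_s": max(speeds),
    }

# Example: a marker moving 5 mm per sample along a straight line at 100 Hz.
summary = marker_kinematics([0.0, 3.0, 6.0], [0.0, 4.0, 8.0], 100.0)
```

In practice, trajectories would typically be smoothed before differentiation, a step omitted here for brevity.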
The status of speech movement measures with respect to the first criterion, of having demonstrated sensitivity to dysarthria, is straightforward. As reviewed by Ackermann et al. (1997), Yunusova, Weismer, Westbury, & Lindstrom (2008), and Weismer (1997), speech movements in persons with dysarthria are typically smaller and slower than speech movements from control speakers. To date, there is little evidence that speech movements are differentially smaller or slower among different types of dysarthria; on average, adults with neuromotor speech disorders have reduced speech movements regardless of the underlying disease or dysarthria type (Weismer & Kim, 2010).
The second criterion— the relevance of a measure to an estimate of functional communication— has not been studied much for speech movements. Forrest, Weismer, and Turner (1989) reported an analysis of lower lip and jaw motions among persons with Parkinson disease, for speakers perceived with mild versus severe speech problems. Based on a formal perceptual measure to separate Parkinsonian participants into speech severity groups, speakers perceived as “severe” had smaller displacements and velocities of the lower lip when compared to speakers perceived as “mild”; the same pattern was observed for the jaw but only for one type of syllable. This relatively coarse-grained analysis of the relationship of lip/jaw speech motions to a functional measure of communication (perceptual scaling of speech severity) served as the point of departure for a larger study of the possible relationship between speech motions and scaled estimates of speech intelligibility in speakers with PD, amyotrophic lateral sclerosis (ALS), and control speakers (CS). Findings from this study are reported in print for the first time here; the results have a strong bearing on the conclusions reached at the end of the paper.
4.2.4.1 Speakers
Speakers included 22 persons with PD, 10 with ALS, and 20 control subjects with no history of speech, language, and hearing disorders and no history of neurological disease. Motions of the articulators were collected using the x-ray Microbeam Facility at the University of Wisconsin-Madison. The instrument and techniques for data collection have been described fully in Westbury (1994) as well as in publications on articulatory motions in both neurologically normal speakers and speakers with dysarthria (see, for example, Hashii, Honda, & Westbury, 1997; Weismer, Yunusova, & Westbury, 2003; Yunusova, Weismer, Westbury, & Lindstrom, 2008).
4.2.4.2 Apparatus, Procedures, and Measurements
Briefly, an array of gold-pellet, flesh-point markers was attached to the articulators and face, including four points along the midline of the tongue, two points on the jaw, two on the lips, and three on the head. The most forward marker of the four points on the tongue was placed approximately 10 millimeters (mm) back from the tongue tip when it was extended from the mouth, and the most posterior of the markers as far back as possible (that is, without eliciting a gag). The middle two tongue markers were then placed between these two extreme locations so the distance between any two markers was roughly equivalent. In Figure 1, discussed more completely below, the approximate locations of these four tongue markers are shown by the points marked T1, T2, T3, T4; T1 is the most forward marker, closest to the lips. The two jaw (mandible) markers were placed on the outer surface of the gums at the central incisors (MI) and between the first and second molars (MM). A marker was placed on the external surface of the lower lip (LL), at the vermilion border, and another on the external surface of the upper lip (UL), also at the vermilion border. Three other markers, not shown in Figure 1, were placed on the head to enable a computational elimination of head motions from the motions of the articulatory markers. Marker motions during speech were sampled and stored digitally.
The coordinate space is shown in Figure 1, with the x-axis labeled MAX OP (maxillary occlusal plane), and the y-axis labeled CMI (central maxillary incisors). The MAX OP was established by scanning participants as they held a custom-designed biteplate between closed jaws. The CMI plane was defined by a line running through the central maxillary incisors, orthogonal to the MAX OP. The intersection of the two planes is the origin of the coordinate system. An x-ray beam directed at the subject’s head was interrupted by the markers attached to the tongue, jaw, and lips, as well as by the reference markers on the head; the coordinates of the interrupted beam were stored as locations over time. The outline of the hard palate was derived empirically by tracking a pellet’s location as an experimenter moved it back and forth along the palatal midline. The location of the posterior pharyngeal wall for each participant was estimated from a scan performed at the beginning of the experiment.
Speech intelligibility was estimated in two ways. First, each speaker recorded a set of words adapted from a multiple-choice, speech intelligibility test (Kent, Weismer, Kent, & Rosenbek, 1989). Ten listeners heard the word lists from each speaker and their responses were summed and averaged to yield a percentage word-intelligibility score per speaker. A second speech intelligibility score was obtained by direct magnitude estimates (DMEs; see Schiavetti, 1992). Ten listeners (different from those who generated the word intelligibility scores) heard each of three sentences recorded by each speaker, and scaled the intelligibility of a sentence in ratio to a modulus (standard). The modulus was a sentence selected from a speaker with ALS who was judged as having a moderate speech intelligibility deficit. The modulus sentence was assigned a value of 100, and all other sentences were scaled as ratios relative to this standard, with higher numbers indicating more intelligible sentences.
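The two intelligibility scores described above can be sketched computationally as follows. This is an illustration only: the function names are hypothetical, and the use of a geometric mean to pool DME responses across listeners is an assumption (a common choice for ratio-scaled judgments), not a procedure stated in the text.

```python
import math

def word_intelligibility_percent(correct_by_listener, n_words):
    """Average percentage of words correctly identified across listeners."""
    percents = [100.0 * c / n_words for c in correct_by_listener]
    return sum(percents) / len(percents)

def pooled_dme(estimates):
    """Pool one sentence's DME responses across listeners.

    ASSUMPTION: a geometric mean is used here; the source does not say
    how individual listener estimates were combined.
    Estimates are ratios relative to a modulus assigned the value 100.
    """
    if any(e <= 0 for e in estimates):
        raise ValueError("magnitude estimates must be positive")
    return math.exp(sum(math.log(e) for e in estimates) / len(estimates))

# Example with made-up listener data (three listeners, 50-word list).
word_score = word_intelligibility_percent([45, 50, 40], 50)
# One sentence judged half as intelligible as the modulus by one listener
# and twice as intelligible by another pools to the modulus value.
sentence_score = pooled_dme([50.0, 200.0])
```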
The procedures used in this experiment were approved by the Clinical Sciences Center Institutional Review Board (IRB) at the University of Wisconsin-Madison. Consistent with the Declaration of Helsinki, each participant received a full description and explanation of the experimental procedures and risks, and each signed an informed consent document.
The speech intelligibility data can be summarized as follows. When all speakers are combined, a quadratic function accounted for 81% of the variance between the two measures of speech intelligibility; the overall correlation of 0.90 is statistically significant. Significant linear correlations were found for each of the three groups of speakers (ALS, r = 0.90; PD, r = 0.66; control speakers, r = 0.62). In general, word and scaling intelligibility scores covaried strongly across speakers.
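For readers who wish to reproduce this kind of summary with their own data, the following minimal sketch fits a quadratic function by ordinary least squares and reports the proportion of variance accounted for. The function names are hypothetical, the example data are synthetic, and the fitting routine is a generic one rather than the specific software used in the study.

```python
def fit_quadratic(x, y):
    """Least-squares coefficients (a, b, c) for y ~ a + b*x + c*x**2."""
    # Build the 3x3 normal equations for the columns 1, x, x**2.
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

def variance_accounted_for(x, y, coeffs):
    """R squared: 1 minus residual sum of squares over total sum of squares."""
    a, b, c = coeffs
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi + c * xi ** 2)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Synthetic example: y is exactly 1 + 2x + 3x^2, so the fit is perfect.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * v + 3.0 * v ** 2 for v in xs]
coeffs = fit_quadratic(xs, ys)
r2 = variance_accounted_for(xs, ys, coeffs)
```

Note that a correlation of 0.90 corresponds to 0.90 squared, or 81%, of the variance, which matches the two figures reported above.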
DME intelligibility data were used in all subsequent analyses for two reasons. First, the movement analysis was derived from multiword utterances, as were the DME scaling data; the multiword (i.e., phrasal) nature of the movement and intelligibility data seemed a more logical match than pairing movement data with single word intelligibility data. Second, data toward the top of the word-intelligibility scale (between roughly 88–100%) were spread out on the DME scale. This makes sense because the estimate of intelligibility in a scaling task is likely to reflect more than the perceptual recovery of sound segments and the words they form in the single word, multiple choice test. For example, a listener may recover the correct word even for a production characterized by individual speech sound distortions and possibly voice quality abnormalities; this same word (or the word in a sentence) may be scaled as less intelligible when the listener responds with an overall impression of the “goodness” of the utterance. Finally, the greater spread in values on the DME scale also provided a better opportunity to detect significant covariation between movement and intelligibility measures. The “ceiling” effect of percentage values for single word intelligibility could mask true covariation.
For an index of speech movement, a measure called Phonetic Working Space (PWS) was developed. Figure 1 illustrates the approach to estimating PWS for each of the tongue, jaw, and lip markers. Marker coordinates accumulated across a connected speech sample (the “Hunter Passage”: Crystal & House, 1982) are shown for a speaker with PD. When the collection of discrete coordinates is connected in sequence, as a continuous pathway, the coordinates form a “tangle” across the reading passage, whose most extreme positions can be connected to form the perimeter of the motion space for that marker. This perimeter is outlined in red for T3 (the third tongue marker); the mean T3 coordinates for this task are marked by the red dot. Lines extended from the mean coordinate to the perimeter of a marker’s accumulated locations were constructed to form triangles having widths of 10 degrees. The summed area of the 36 triangles formed in this way provided an index of a marker’s PWS, expressed in square millimeters (Bunton, Westbury, & Weismer, 2000). PWS was obtained in this way, when possible, for each speaker and each marker.
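The triangle construction just described can be sketched in code. The following is a minimal illustration rather than the procedure actually used for these data: the function name `phonetic_working_space` is hypothetical, and the rule for locating the perimeter (the farthest sample within each 10-degree sector around the mean position) is an assumption made for the example.

```python
import math

def phonetic_working_space(points, n_sectors=36):
    """Approximate PWS (mm^2) for one marker from (x, y) samples in mm."""
    if len(points) < 3:
        raise ValueError("need at least three samples")
    # Mean marker position: the apex shared by all triangles.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    width = 2 * math.pi / n_sectors  # 10 degrees when n_sectors is 36
    # ASSUMPTION: the perimeter in each sector is the farthest sample in it.
    # A sector with no samples keeps radius 0 and contributes no area.
    radii = [0.0] * n_sectors
    for x, y in points:
        dx, dy = x - cx, y - cy
        angle = math.atan2(dy, dx) % (2 * math.pi)
        k = int(angle / width) % n_sectors
        radii[k] = max(radii[k], math.hypot(dx, dy))
    # Area of the triangle spanned by two adjacent perimeter rays:
    # 0.5 * r1 * r2 * sin(width).
    return sum(
        0.5 * radii[k] * radii[(k + 1) % n_sectors] * math.sin(width)
        for k in range(n_sectors)
    )
```

For samples lying on a circle of radius 10 mm, this sketch returns a value slightly below the true circle area, as expected when a circle is approximated by 36 triangles.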
Marker velocities were also measured across the sampled motion histories, using a three-point, central difference algorithm applied across the entire sequence of sampled coordinates. The measured velocities were expressed for each marker as averaged marker speeds. As presented below, there was a relationship between the size of a marker’s working space and its average speed; speed, in turn, had an interesting counterpart in an acoustic measure.
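A three-point central difference estimates the velocity at sample i from its two neighbors, (p[i+1] - p[i-1]) / (2·dt). The sketch below shows one way to turn a sampled marker history into an average speed. The function name `average_marker_speed` and the sampling interval `dt` are assumptions for illustration; the sampling rate of the recordings is not restated here.

```python
import math

def average_marker_speed(points, dt):
    """Mean speed (mm/s) from (x, y) samples in mm spaced dt seconds apart."""
    if len(points) < 3:
        raise ValueError("need at least three samples")
    speeds = []
    # The central difference is defined only for interior samples.
    for i in range(1, len(points) - 1):
        vx = (points[i + 1][0] - points[i - 1][0]) / (2 * dt)
        vy = (points[i + 1][1] - points[i - 1][1]) / (2 * dt)
        speeds.append(math.hypot(vx, vy))  # speed is the velocity magnitude
    return sum(speeds) / len(speeds)
```

Speed is taken as the magnitude of the two-dimensional velocity, so the average reflects how fast the marker moved regardless of direction.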
4.2.4.3. Size of Phonetic Working Space and Average Marker Speed
Across speaker groups and marker locations, there was a fairly strong tendency for the magnitude of average marker speed (expressed in millimeters/second) to be correlated with the averaged PWS (expressed in mm²). Data are shown in Figure 2, where PWS is on the x-axis and marker speed on the y-axis. Values for the three groups are shown in different colors (black = CS; red = PD; green = ALS). The data are grouped according to marker location: the ellipse labeled “T” collects all markers from the tongue (T1 through T4), “LL” the markers on the lower lip, “UL” on the upper lip, and “MI” the more anterior marker on the mandible. The “T” and “LL” points reflect their coupling to the mandible (that is, the mandible motions have not been subtracted from them). Within the “T”, “LL” and “MI” ellipses, data from the control speakers are typically toward the upper right of the enclosure, data from the speakers with PD and ALS toward its lower left. Articulatory motions of control speakers tended to have greater PWS and higher speeds as compared to motions of speakers with PD and ALS. The tendency of average marker speed to increase with averaged PWS can be seen both within articulators and across them. When correlations were obtained across speakers within groups, for each marker, the strongest relationships were found for the tongue markers, and in particular for T3. Correlations between PWS and average marker speeds for T3 were 0.97 for control speakers (N = 20 speakers), 0.63 for speakers with PD (N = 16 speakers), and 0.93 for speakers with ALS (N = 9 speakers).
4.2.4.4. Marker Speeds and DME Estimates of Speech Intelligibility
Correlations between marker speeds from UL, LL, and MI and scaled speech intelligibility ranged between −0.13 and 0.43, when computed across speakers within each of the three groups. For the four tongue markers (T1–T4), speed-intelligibility correlations computed within each group ranged from 0.38 to 0.94; among these 12 correlations, 8 were significant at p = 0.05 or less. All three groups had significant correlations for T3 and T4. A plot of scaled intelligibility against T3 speed is shown in Figure 3, together with fitted linear functions; as shown in the figure, correlations ranged between 0.42 and 0.80. When data from all speakers were pooled together, a significant quadratic function accounted for 45% of the variance between speech intelligibility and T3 average speed.
The findings of positive and significant relationships between tongue speed and speech intelligibility, in which the significant within-group relationships accounted for between 18 and 64% of the variance in speech intelligibility scores, considered together with the lack of similar findings for the lips and jaw, suggest at least two conclusions. First, the presence of significant relations between tongue motions and speech intelligibility, even in simple bivariate expressions, supports the general outlines of the hypothesis advanced above: a coarse measure of articulatory reduction, and specifically of tongue motions, contributes to deficits in speech intelligibility. The variance accounted for in the significant relationships is modest at best, and not all of the tongue motions (and especially motions of T1, the marker near the tongue tip) yielded significant functions. The emphasis in these analyses, however, was on an exploration of a crude measure of articulatory reduction entered into a simple, two-variable relationship, in full recognition of the likely multivariate nature of articulatory, phonatory, and speech breathing foundations of speech intelligibility. From this perspective, the findings of significant linear relations between tongue speeds and speech intelligibility within each of the three groups of speakers, and a significant quadratic function for all speakers combined, are intriguing. They suggest that among speakers with dysarthria and perhaps speakers in general, an estimate of either PWS or averaged tongue speed, or a more easily-obtained measure that reflects these speech motion characteristics, is a good index of the neuromotor integrity of the speech mechanism. Importantly, this conclusion is not specific to the dysarthria in PD, in which much of the work has been performed on the effects of DBS on speech production.
Second, the absence of significant relationships between UL, LL, and MI marker speeds (or working spaces) and speech intelligibility raises important questions concerning the insight to motor speech disorders, and therapeutic effects on them, afforded by speech motion observations restricted to the lips and jaw. Among the relatively small number of publications on articulatory motions in dysarthria, most are reports of lip and/or jaw motion only (e.g., Ackermann et al., 1997; Kleinow, Smith, & Ramig, 2001; Loh, Goozée, & Murdoch, 2005; and see review in Weismer, 1997 of earlier work). A liberal interpretation of the current findings is that neuromotor pathology may not affect lip and jaw function for speech in the same way as tongue function, or at least that whatever effects are seen in the lip/jaw complex and the tongue may not bear equally on speech intelligibility deficits. A stronger interpretation is that experiments restricted to observations of lip and jaw function for speech should not be expected to yield good insight to the speech motor control problems—where “motor control” implies the control of articulatory gestures in the service of producing intelligible speech—in speakers with dysarthria.
Based on these data, we can offer a more definitive answer to the relevance of speech movement data to functional communication in dysarthria. To the extent that measures of speech intelligibility provide an index of functional communication ability, the relationships between tongue-marker speed and intelligibility suggest the significance of speech movement data in estimating functional communication ability. It follows that the measures meet the criterion of serving as a measure of therapeutic effects of DBS or other treatments for persons with neuromotor disorders. Speech movement measures clearly (and tautologically) meet the third criterion as well. If a movement disorder is evaluated most effectively by movement measures, the effects of DBS (or other treatments) on speech production are most directly evaluated by either the phonetic working space or average motion speed measures reported above. This conclusion is supported by the clear demonstration of lingual speech movement differences between the speaker groups with PD and ALS, as compared to the control speakers. A limitation of this conclusion, of course, is that the speech movement data we report serve as only a partial representation of factors that may be associated with speech intelligibility deficits. Voice quality (including voice loudness), prosody, and specific segmental characteristics (such as the misarticulation of fricatives or semivowels) are not represented in the measure of PWS or speed. Future research should focus on how these factors add to the understanding of the bases of speech intelligibility fluctuations in dysarthria due to neuromotor disease or the treatment of it by interventions such as DBS.
This conclusion must also be tempered by the failure of speech movement measures to meet the criterion of ease of application. Speech movement measures, especially those of the tongue, are difficult to obtain. Electromagnetic devices are now available to collect tongue movements during speech, and although magnetic fields, unlike x-rays, are not likely to pose a health hazard to participants under normal circumstances, they have the potential to disturb stimulator performance in implanted patients (e.g., Dustin, 2008). For more general use in evaluating therapeutic effects, including those of therapies other than DBS, the electromagnetic instrument is expensive and requires a fair amount of expertise for experimental setup and data interpretation. As noted above, motions of the lip-jaw complex are more easily and safely obtained than motions of the tongue, but do not appear to have a strong link to speech intelligibility variations in speakers with dysarthria. In section 4.2.6 an easily-obtained measure is proposed that reflects tongue speed in what we believe to be a fairly direct way.
4.2.5. Aerodynamic Measures
Aerodynamic measures include air pressures and flows during speech production, as well as measures of speech breathing derived from chest wall kinematics. The procedures for collecting speech aerodynamic data typically involve at least two, and sometimes three or four measures simultaneously (such as intraoral and intranasal pressures with oral and nasal flows: see Hammer, Barlow, Lyons, & Pahwa, 2011, for an experiment using oral pressure and nasal flow data to assess changes in the size of the velopharyngeal port in response to DBS in speakers with PD). A large, mostly older literature has shown how measures of transglottal flow, intraoral and intranasal air pressure, and oral and nasal flow can be used to assess valving efficiency at various vocal tract locations from larynx, to velopharyngeal port, to lips (Peterson-Falzone, Hardin-Jones, & Karnell, 2010, pp. 306–311). Reliable, normative data are available for transglottal (oral) flow during vowel production, as well as intraoral air pressures for stops and fricatives (see Baken, 1987, pp. 241–313). These measures could be useful in the evaluation of therapeutic effects of DBS, and perhaps especially in hypokinetic dysarthria for which imprecise consonants are often regarded as a perceptual highlight. As outlined by McAuliffe, Murdoch, and Ward (2006), imprecise consonants in hypokinetic dysarthria are assumed to result from articulatory undershoot (Lindblom, 1963) because of “weakened” speech movements. Weakened stops or fricatives can be conceptualized as having incomplete (stops) or “loose” (fricatives) supralaryngeal constrictions. These weakened constrictions do not block the airstream efficiently, an effect that will contribute to reduced intraoral air pressures. Stops articulated with an incomplete, leaking constriction have some oral flow during the closure interval, and “loose” fricative constrictions are likely to have oral flows higher than those for normally-articulated fricatives.
Oral pressure and oral flow collected simultaneously have special relevance to the evaluation of DBS effects on speech and voice in persons with PD and hypokinetic dysarthria. A primary, perhaps for many patients the earliest, speech production symptom in PD is a breathy or hoarse voice (Logemann, Fisher, Boshes, & Blonsky, 1978). Normal vocal fold vibration for the production of voice in connected speech, or phonation, is a series of nearly-periodic opening and closing motions of the vocal folds. Each of the nearly-periodic cycles of vibration has an opening phase, a closing phase, and a closed phase. During the closed phase the vocal folds are in contact and more or less block the pulmonary airstream. This flow is driven by the positive (greater than atmospheric) pressure developed by muscular and recoil forces of the respiratory system; each time the vocal folds close completely (or nearly so) the pressure differential across the vocal folds causes the folds to reopen and go through yet another cycle of motion.
If each vocal fold cycle is thought of as a sequence of opening phase, closing phase, and closed phase, the opening plus closing phases constitute the open phase, or the fraction of the cycle during which air flows from lower to upper airways. The closed phase blocks this flow for a very brief interval. In fact, the closed phase of the typical vibratory cycle occupies roughly 30–60% of the entire cycle time (Holmberg, Hillman, & Perkell, 1988). The rapid modulation of flow between the open and closed phases is best expressed as an average transglottal flow across some unit of time, driven through the vibrating vocal folds by an average pressure differential across the vocal folds (termed subglottal or tracheal pressure) for the same unit time. Averaged pressure divided by the averaged flow provides an index of laryngeal resistance (LR), a measure inspired by Ohm’s law. Given a constant ratio of open to closed time during vocal fold vibration, transglottal flow increases with increases in the transglottal pressure differential. When the closed phase of vocal fold vibration forms an abnormally large fraction of a cycle, the laryngeal resistance is relatively high; an abnormally low laryngeal resistance indicates cycles of vocal fold vibration that do not have an adequate closed interval. The latter case is particularly relevant to the “weak voice” characteristic of speakers with PD; their problems with good closure during vocal fold vibration, and therefore low laryngeal resistance, are a primary cause of their breathy, hoarse, and/or asthenic-sounding voice.
In practice, estimates of LR are relatively easy to obtain. However, direct estimates of the pressure beneath the vocal folds (and therefore the transglottal pressure differential for the case of vowels when oral pressure is known to be equal to atmospheric pressure) are very difficult to obtain in a noninvasive way. Smitheran and Hixon (1981) argued that the pressure beneath the vocal folds could be estimated from the easily-measured intraoral pressure peaks during a series of consonant-vowel repetitions in which the consonant is a voiceless stop, and the vowel either /a/ or /i/. Voiceless stops are produced with the vocal folds spread apart and a complete blockage of air at the point of vocal tract constriction (in the case of /p/, at the lips). Under these conditions, the volume of air behind the vocal tract constriction is continuous with that of the lower airways, and the compressed air within that volume should yield roughly equivalent pressures at any location. Under the assumption that the pressure beneath the vocal folds is more or less constant throughout a speech utterance, and especially throughout an utterance with no special emphasis on particular syllables or other dramatic changes in voice intensity, pressure peaks of the /p/s in the series of /pa/s should provide good estimates of the pressure beneath the vocal folds during the vowel (voiced) segments of the syllable repetitions. The flow measured during the (open-vocal tract) vowels in the syllable sequence is essentially the same as the flow through the vibrating vocal folds. The measurement of intraoral pressure peaks for /p/s and oral flow for vowels therefore provides the data for estimates of average laryngeal resistance.
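The arithmetic behind this estimate is a single ratio, and a small sketch may make it concrete. The function below is an illustration under stated assumptions, not a published procedure: the name `laryngeal_resistance`, the units (cm H2O for pressure, liters per second for flow), and the simplification of averaging all /p/ pressure peaks and all vowel flows across the syllable train are choices made here for the example.

```python
def laryngeal_resistance(peak_pressures_cmh2o, vowel_flows_lps):
    """Estimate average LR as mean /p/ pressure peak over mean vowel flow.

    peak_pressures_cmh2o: intraoral pressure peaks for the /p/ closures,
        taken as estimates of the pressure beneath the vocal folds.
    vowel_flows_lps: oral flows measured during the vowels, taken as
        estimates of transglottal flow.
    """
    if not peak_pressures_cmh2o or not vowel_flows_lps:
        raise ValueError("need at least one pressure peak and one flow value")
    mean_pressure = sum(peak_pressures_cmh2o) / len(peak_pressures_cmh2o)
    mean_flow = sum(vowel_flows_lps) / len(vowel_flows_lps)
    if mean_flow <= 0:
        raise ValueError("mean flow must be positive")
    # Ohm's-law analogy: resistance = pressure / flow.
    return mean_pressure / mean_flow
```

With the pressure held fixed, a larger average flow gives a smaller ratio, which is the pattern expected for a breathy voice with inadequate vocal fold closure.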
Not surprisingly, evidence from normal speakers who simulate breathy and normal voice qualities shows lower LR values for the breathy quality (Grillo, Perta, & Smith, 2009). This suggests that the breathy, weak voice observed in many patients with Parkinsonian dysarthria should be associated with lower-than-normal LR, and lower-than-normal voice intensity. Voice intensity, an acoustic measure discussed more fully below, has been studied as a measure in the evaluation of DBS effects on speech (see review in Tripoliti & Limousin, 2010). Indeed, a prominent perspective in the contemporary literature focuses on voice intensity deficiencies as the signature speech mechanism deficit of, and therapy target for, Parkinsonian dysarthria (e.g., Fox et al., 2006).
Measures of LR currently do not meet several of the evaluation criteria. First, the sensitivity of the measure to dysarthria is not well documented, even if in the case of Parkinsonian dysarthria such sensitivity can be reasonably inferred from voice intensity data showing lower values for persons with PD, as compared to controls. As discussed in section 4.2.6, however, the typical voice intensity differences between persons with PD and control subjects are small enough to make reliable inferences of low LR in PD somewhat doubtful. Moreover, the small amount of data available on LR in speakers with PD (Hammer, Barlow, Lyons, & Pahwa, 2010) suggests that only about half of the participants (N = 18) have LR values less than the normal range in a DBS “off” condition; in the DBS “on” condition several had excessively high LR values and several others had little or no change (see Hammer et al., p. 1696). Second, the relationship between LR and functional measures of communication, such as speech intelligibility, has not been established. The third criterion, the ability of the measure to permit a reasonable inference to the underlying movement problem, seems to be met by estimates of LR, which depend primarily on inefficient motions of the vocal folds resulting from weakened muscular “posturing” of the intrinsic laryngeal muscles. However, concern has been raised about the ability to separate laryngeal from respiratory contributions to LR measures, especially because respiratory and laryngeal contributions to subglottal pressure and transglottal flow (the measures required to derive LR) may depend on the nature of the vocal fold vibration problem in a particular disorder (Makiyama, Oshihashi, Mogitate, & Kida, 2005). Stated otherwise, extrapolations from normal speakers who simulate varying voice qualities to persons with actual vocal fold vibration problems may not be straightforward.
Fourth, LR measures are relatively easy to make in a technical sense, but may be compromised by large fluctuations in voice intensity during the repeated-syllable task, a tendency on the part of the patient to produce voicing within the stop closure interval (which negates the equality between oral pressure and pressure below the vocal folds that can be assumed when the stop is truly voiceless), and/or incomplete closures of the vocal tract for the stop consonant which adds variability into the oral-pressure estimate of pressure below the vocal folds. Despite these current gaps in knowledge of LR, the ease of application and possible insight to laryngeal mechanisms and their contribution to speech intelligibility suggest further exploration of the measure as a potential index of DBS effects on speech production.
4.2.6. Acoustic Measures
Speech acoustic measures have been studied extensively in both normal speakers and speakers with speech disorders, including dysarthria. Among the dysarthrias, a relatively large amount of speech acoustic data have been reported for speakers with PD (see reviews in Kent et al., 1999; Goberman & Coelho, 2002; Duez, 2009).
Speech acoustic measures include segmental, suprasegmental, and voice measures. Segmental measures are designed to represent partly or completely the different speech sound categories of a language. Measures have been described for the normal production of vowels, diphthongs, semivowels (liquids and glides), nasals, and obstruents (stops, fricatives, and affricates; see Hixon, Weismer, & Hoit, 2008, pp. 473–548). Segmental measures may be temporal (such as vowel or consonant durations: see Crystal & House, 1988a,b), spectral (such as formant frequencies for vowels or power spectra for fricatives: see Hillenbrand et al., 1995, 2000; Blumstein & Stevens, 1979; Forrest, Weismer, Milenkovic, & Dougall, 1987), or spectro-temporal, as in formant transition properties of diphthongs (Holbrook & Fairbanks, 1962). Measures of formant transitions can also reflect changes in vocal tract configuration between obstruents and vowels, nasals and vowels, semivowels and vowels, and for diphthongs. Suprasegmental measures include those associated with prosody, such as fundamental frequency (F0) and voice intensity pattern or variation across a phrase, as well as measures of speech rhythm. Finally, voice measures include average speaking F0, average voice intensity, and measures of the stability of vocal fold vibration (such as jitter, shimmer, and signal-to-noise ratio). Because there are so many acoustic measures of speech, the following discussion is confined to those more frequently investigated in dysarthria and which lend themselves to evaluation within the framework of the criteria discussed throughout this paper.
Many speech acoustic measures clearly satisfy the “sensitivity” criterion for the documentation of change due to treatment or disease progression. Segmental, prosodic, and voice measures have been shown to be sensitive to the presence of dysarthria, and this sensitivity is solidified, in a statistical sense, by comprehensive data available for normal speakers. As reviewed by Weismer and Kim (2010), measures of voice-onset time (VOT), speaking rate, vowel space, formant transition rate, and the power spectrum difference between lingual fricatives such as /s/ and /ʃ/ differentiate speakers with dysarthria from control speakers. The evidence for sensitivity of acoustic measures of prosody to dysarthria is less convincing than that for segmental measures, partially (or largely) because there is much less published work in this area (but see Skodda, Visser, & Schlegel, 2010; and Watson & Schlauch, 2008, for a good review of the effects of loss of F0 variation on speech intelligibility). A measure of speech rhythm, the paired variability index (PVI: Low, Grabe, & Nolan, 2000; or its normalized version: see Grabe & Low, 2002), has been shown to differentiate utterance productions of speakers with dysarthria from those of controls, and is even sensitive to different dysarthria types (Liss et al., 2009). Acoustic measures of voice production seem fairly sensitive to dysarthria, although the measures may not distinguish well among different dysarthria types (Kent & Kim, 2003).
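For readers unfamiliar with the rhythm measure mentioned above, the normalized PVI is commonly computed as 100 times the mean, over successive interval pairs, of the absolute duration difference divided by the pair's mean duration. The sketch below follows that commonly cited definition; the function name `normalized_pvi` and the choice of interval type (for example, successive vocalic intervals) are assumptions made for illustration, and the durations in any call are supplied by the user rather than drawn from the studies cited.

```python
def normalized_pvi(durations):
    """Normalized pairwise variability index for successive interval durations.

    durations: positive interval durations in any consistent unit (e.g., ms).
    """
    if len(durations) < 2:
        raise ValueError("need at least two intervals")
    terms = []
    for a, b in zip(durations, durations[1:]):
        # Difference between neighbors, scaled by their mean duration so the
        # index is less dependent on overall speaking rate.
        terms.append(abs(a - b) / ((a + b) / 2))
    return 100 * sum(terms) / len(terms)
```

Perfectly even intervals give a value of 0, and larger values indicate greater alternation between long and short intervals.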
A good deal of attention has been devoted to voice intensity (also referred to as voice sound pressure level [SPL], consistent with the actual units measured), primarily because a well-known therapy approach for persons with PD focuses on treatment of phonatory loudness (see, for example, Ramig, Countryman, Thompson, & Horii, 1995; Ramig, Sapir, Fox, & Countryman, 2001; and review in Fox et al., 2006). The treatment strategy is best described as an attempt to recalibrate the function relating sense of effort in producing phonatory energy to the actual speech acoustic energy generated at the output of the lips. The “scaling up” of effort is assumed to spread to the rest of the speech mechanism, an idea first expressed in careful detail by Rosenbek and LaPointe (1985). The general scaling up of speech effort produces beneficial effects beyond the (apparently) obvious one of a signal having greater intensity and improved audibility. For example, therapy directed to scaling up phonatory effort is assumed to spread to motions of the articulators, increasing their displacements, their extreme positions in the vocal tract for corner vowels, and possibly their speeds (Dromey & Ramig, 1998; Sapir, Spielman, Ramig, Story, & Fox, 2007). A successful therapy outcome of greater voice intensity is considered a fortiori to be associated with better speech intelligibility, clearly a desirable outcome of treatment for any speech disorder.
These considerations support voice intensity as a good measure to evaluate the effects of DBS on speech production in persons with PD. There are several reasons, however, to express caution about relying too much on voice intensity as an evaluative measure of the effect of DBS on speech production. First, available data on typical differences in voice intensity between speakers with PD and properly-matched controls suggest an effect of about 1–4 dB for connected speech material (see, for example, Ramig et al., 2001; Tjaden & Wilding, 2004). The magnitude of this difference is far less than the typical treatment effects of 10–15 dB reported by (for example) Ramig et al. (1995; 2001). If there were a tight linkage between voice intensity and articulatory behavior on the one hand, and therefore voice intensity and speech intelligibility on the other, the smaller typical differences in voice intensity between speakers with PD and control speakers may not be associated with dramatic differences in articulatory behavior and speech intelligibility.1 Of greater importance, however, is the finding in the DBS literature of dissociation between stimulation-induced changes in voice intensity and speech intelligibility. As shown by Tripoliti et al. (2006; 2008; 2010; see discussion in Tripoliti & Limousin, 2010), increased vocal intensity following DBS may be associated with no change, or even a decrement, in speech intelligibility. D’Alatri et al. (2008) have reported a related finding of improved acoustic measures of phonation such as jitter, noise-to-harmonics ratio, and tremor following DBS with little change in speech intelligibility. At the current time, then, it seems reasonable to conclude that measures of source (phonatory) function do not satisfy the second criterion very well because they do not map on to functional estimates of communication in a straightforward way.
The third criterion is the degree to which a measure reflects some aspect of speech movement more or less directly. Among the many speech acoustic measures shown to be different between speakers with dysarthria and control speakers, several may be related to articulatory behavior although not necessarily speech movement per se. For example, the literature on the size of the acoustic vowel space, obtained by computing the planar area of a plot of the first formant frequency (F1) against the second formant frequency (F2) for the corner vowels of a language (Turner, Tjaden, & Weismer, 1995; Liu, Tsao, & Kuhl, 2005), assumes at least an ordinal relationship between the magnitude of the acoustic area and articulatory “distinctiveness” of vowels. In articulatory terms, this distinctiveness is most accurately described as differences in positions of operationally-defined, point coordinates on the tongue, lips, and jaw, obtained at a specified time during vowel articulation. The greater the acoustic vowel space area, the more different the articulatory positions of the corner vowels. The magnitude of the acoustic vowel space, a measure whose articulatory inference is to positional distinctiveness, provides no straightforward information on movement magnitude, speed, or any other derivative of motion. Future research may show intercorrelations between position and movement measures, although the correlations should not be expected on logical grounds alone.2
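The planar area of the acoustic vowel space can be computed with the standard shoelace formula once the corner-vowel formant values are listed in order around the perimeter. The sketch below is a minimal illustration; the function name `vowel_space_area` and the (F1, F2) ordering of each pair are assumptions, and any formant values passed to it are the user's own measurements.

```python
def vowel_space_area(corners_hz):
    """Area (Hz^2) of the polygon whose vertices are (F1, F2) pairs in Hz.

    corners_hz: corner-vowel points listed in order around the perimeter,
        e.g. /i/, /ae/, /a/, /u/ for a four-vowel space. Listing them out
        of perimeter order gives a self-crossing polygon and a wrong area.
    """
    if len(corners_hz) < 3:
        raise ValueError("need at least three corner vowels")
    twice_area = 0.0
    n = len(corners_hz)
    for i in range(n):
        f1_a, f2_a = corners_hz[i]
        f1_b, f2_b = corners_hz[(i + 1) % n]  # wrap around to close the polygon
        twice_area += f1_a * f2_b - f1_b * f2_a
    return abs(twice_area) / 2
```

Because the result depends only on the positions of the corner points, it carries no information about how fast or how far the articulators moved between vowels, which is the distinction between position and movement drawn above.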
Even with these qualifications to interpretation of the acoustic vowel space as a reflection of articulatory position, rather than movement, the measure is still attractive for evaluation of DBS effects on speech production. First, the inference to articulatory behavior, even if only to positional distinctiveness, is an inference concerning control of the articulators. Second, the magnitude of the acoustic vowel space predicts variation in speech intelligibility scores. Although the goodness of this prediction varies across studies (see Sapir, Ramig, Spielman, & Fox, 2010, for a review), it is often statistically significant and may account for up to 50% of the variance between the two variables. Third, the measure of acoustic vowel space is simple, noninvasive, and technically far easier to implement than measures of articulatory positions. This summary of how the acoustic vowel space can be interpreted, and its potential as a measure to evaluate DBS effects on speech production, applies equally to acoustic measures of the “distance” between certain consonants (Tjaden & Wilding, 2004; Tjaden & Turner, 1997). The tendency for measures of vowel space area to covary with measures of speech intelligibility, or spectral distinctions for lingual consonants to covary with perceptual measures of consonant precision (Tjaden & Turner, 1997), suggests the promise of these measures to meet the second criterion of functional relevance. Additional work on the relationship of these measures to speech intelligibility is required because of the variable findings on acoustic vowel space noted above, and Tjaden & Wilding’s (2004) failure to find meaningful covariation between any of the acoustic measures they studied and scaled speech intelligibility measures in speakers with PD and multiple sclerosis (MS).
Voice-onset time (VOT), the time interval between the release of a stop consonant and the onset of vocal fold vibration, is a frequently-studied measure in speech production research and especially in speakers with dysarthria and other speech disorders (see review in Weismer, 2006b). VOT is part of the phonetic implementation of the voicing distinction for stops, with shorter values associated with voiced or unaspirated stops and longer values with voiceless or aspirated stops (in English); specific implementation of VOT differences for phonological voicing contrasts varies widely across languages (see Cho & Ladefoged, 1999). Part of the attraction of VOT as a measure of speech motor control is its potential to reflect the integrity of coordination between laryngeal and supralaryngeal gestures. As described by Weismer (2006b, pp. 108–117), however, articulatory interpretation of VOT differences between speakers with neurologically-based speech and/or language disorders and “normal” speakers, or between the same speaker pre- and post-stimulation, is exceedingly complex and indeterminate. A shortening or lengthening of an intended “long-lag” VOT, for example, may result from changes in timing (phasing) between gestures, or duration and/or scale of the component gestures. Based solely on the acoustic record it is impossible to know which articulatory change (or which combination of changes) produced the difference in VOT. For this reason, very specific articulatory interpretations of acoustically-measured VOT changes resulting from DBS in patients with multiple sclerosis (MS) (Pützer, Barry, & Moringlane, 2007) do not seem justified. More broadly, at least with respect to the third criterion used to organize the current analysis, the complicated and indeterminate inferences from VOT to underlying articulatory behavior suggest caution in regarding it as a good measure to evaluate the effects of DBS on speech production.
Formant transitions have been studied a fair amount in dysarthria. Formant transitions, the changes in formant frequencies as a function of time, reflect modifications in vocal tract configuration throughout an utterance. Whereas formant transitions can be found almost anywhere in a spectrographic record of an utterance, certain types of transitions have been the focus of studies of normal and neurologically-impaired speech production. Formant transitions associated with diphthongs (such as /ɑɪ/ in “high”), stop-vowel and fricative-vowel sequences, and semivowels (/w/, /r/, /l/, /j/) have been studied among speakers with dysarthria and compared to transitions produced by control speakers (Weismer, 1991; Weismer, Martin, Kent, & Kent, 1992; Kim, Weismer, Kent, & Duffy, 2009; Kim, Kent, & Weismer, 2011; Weismer, Kuo, & Allen, 2010). Typical measures of formant transitions include transition duration (TD), transition extent (TE, the amount of frequency change along a transition), and transition rate or slope (TR, the rate of change of frequency, TE/TD). The articulatory interpretation of these measures, and especially in the case of diphthong transitions, is fairly straightforward. Within the same speaker, the greater the TE the greater the change in vocal tract configuration; within the same speaker, the greater the TR, the greater the speed of change in vocal tract configuration (see Stevens, 2000, for the speech acoustic theory supporting these inferences from TE and TR).
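The three transition measures defined above can be sketched in a few lines of Python. This is a minimal illustration that assumes the onset and offset of the transition have already been identified by the analyst; the function name `transition_measures`, the units (ms and Hz), and the example values are hypothetical and are not taken from any of the studies cited. TE is computed here as an unsigned magnitude, which is one reasonable reading of “the amount of frequency change”.

```python
def transition_measures(t_onset_ms, f_onset_hz, t_offset_ms, f_offset_hz):
    """Return (TD, TE, TR) for one formant transition.

    TD: transition duration in ms
    TE: transition extent in Hz (unsigned frequency change)
    TR: transition rate (slope) in Hz/ms, computed as TE / TD
    """
    td = t_offset_ms - t_onset_ms
    if td <= 0:
        raise ValueError("transition offset must follow transition onset")
    te = abs(f_offset_hz - f_onset_hz)
    tr = te / td
    return td, te, tr


# Hypothetical F2 transition: 1200 Hz at 50 ms rising to 1900 Hz at 150 ms.
td, te, tr = transition_measures(50, 1200, 150, 1900)
print(td, te, tr)  # 100 700 7.0  (ms, Hz, Hz/ms)
```

With these made-up values, a transition covering the same 700 Hz extent in 200 ms rather than 100 ms would yield a slope of 3.5 Hz/ms, that is, a shallower slope for a slower change in vocal tract configuration.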
These articulatory interpretations of formant transition data are subject to two important qualifications. First, formant frequencies, whether measured at a single point in time as “targets” to obtain the acoustic vowel space, or as a changing function of time, are dependent on vocal tract size. Vowel spaces or transition data, averaged across speakers, reflect this source of variability whose magnitude is generally unknown. The variability can be minimized to a certain degree by studying speakers only of a particular sex, age, and possibly race (see, for example, Yang, 1996). In the case of formant transitions, the influence of vocal tract size may be minimized with a focus on transition rates (slopes), which as shown for diphthongs by Weismer, Kent, Hodge, & Martin (1988) have very little variability across adult speakers of the same sex. Second, formant transitions (or the individual F1–F2 values of a “target” measurement) have the limitation of an ambiguous mapping back to individual articulators. Although the theory of speech acoustics (Stevens, 2000) regards F1 as primarily reflecting changes in tongue height and F2 changes in tongue advancement (and lip configuration), each formant frequency is affected by the overall vocal tract shape and its configurational changes over time. On the one hand, then, a formant transition does not provide direct information concerning a particular articulator, but on the other it does reflect the joint motions of articulators in creating vocal tract shapes. The uncertainty of inferring specific articulatory behavior from a formant transition measure is a qualification that can be cast in a positive light, because different vocal tract shapes resulting from the concerted motions of the several articulators are the important variables in creating linguistically contrastive acoustic signals.
Given the current state-of-the-science, F2 slopes in diphthongs such as /eɪ/ in “hail”, /ɑɪ/ in “sigh”, or in glide-vowel sequences such as /wæ/ in “wax” seem to best meet each of the four criteria stated at the outset of this paper. F2 slope measures are recommended because 1) they are highly sensitive to the presence of dysarthria, and specifically are more shallow in dysarthria than in control speakers (Kim et al., 2009; Kim et al., 2011); 2) they reflect rather directly an aspect of speech motor control, that is, speed of change in vocal tract configuration. Speech movement data presented above demonstrate that tongue speeds in PD, as well as in ALS, are significantly reduced relative to tongue speeds produced by age-matched control speakers; these reduced speeds are consistent with the relatively shallow F2 slopes observed among speakers with dysarthria. Given the substantial role of the tongue in shaping the vocal tract, the significantly slower tongue speeds in neurologically-impaired speakers, including those with PD, are likely to be associated with slower speeds of change in vocal tract configuration. F2 slopes are therefore a good proxy for the articulatory speeds described above; 3) both formant transition rates and tongue speeds covary significantly with measures of speech intelligibility (Kim et al., 2011; Weismer, Jeng, Laures, & Kent, 2001), suggesting a possible articulatory index of functional communication skill; and 4) F2 slope measures are easy to implement with speech analysis software (such as TF32 [Milenkovic, 2001] or Praat [Boersma & Weenink, 2009]).
This endorsement of F2 slope as a potentially effective measure to probe the effects of DBS on speech production does not mean it is without problems, or that it is the only measure that should be used. In fact, we feel strongly that a formal measure of speech intelligibility, and specifically one more sensitive and reliable (and interpretable) than item 18 from UPDRS (part III) or similar ordinal scales (see Pinto et al., 2004, for a similar conclusion), is indispensable to any evaluation of therapeutic effects in the treatment of dysarthria. F2 slope is an objective measure to index articulatory function (it can be tied more or less directly to speeds of vocal tract change) and may be used as a more specific index of speech motor control changes. A possible drawback of the F2 slope measure is that it does not seem to differentiate between different types of dysarthria: slopes are typically reduced in all types of dysarthria studied to date (Kim et al., 2009; Kim et al., 2011).
The most demanding test of the potential utility of F2 slope as a measure of the effects of DBS on speech production would be an experiment in which the measure is obtained pre- and post-stimulation, as well as post-surgery with the stimulator turned on and off (and possibly at varying voltages when “on”, since there is evidence for the most negative effects of DBS in general, and on speech in particular, when stimulating voltage is relatively high [e.g., Tripoliti et al., 2008; Tommasi et al., 2008]). A proper control group, perhaps treated with some other method, should have data taken with the identical protocol and with an identical speech-sampling schedule as the surgical group. If DBS has an effect on speech movements, either positive or negative, the F2 slope measures should reveal the effect in this experiment. The same experiment should allow for collection of speech intelligibility measures for the utterances collected at each sampling point.
5.0 Summary and Conclusions
This paper offers a set of criteria for evaluating measures that have potential for assessing the effect of DBS on speech production. Several well-known speech-physiology and speech-acoustic measures were reviewed and evaluated against these criteria. Evaluation of the measures included considerations from previously-published work, as well as new data on speech movements and their relation to speech intelligibility. Our conclusion is that the measure that currently best meets the criteria in terms of sensitivity to dysarthria, ease of application, inference to speech movement, and relationship to measures of functional communication such as speech intelligibility scores, is the slope of the second formant transition (F2 slope) for phonetic events involving relatively rapid changes in vocal tract configuration. To support this view, the new speech movement data presented in this paper are linked to speech intelligibility scores and F2 slopes. This conclusion does not rule out the use of multiple measures in the evaluation of DBS effects on speech, including some that may not meet the criteria as completely as F2 slope measures. In addition, our conclusions should be regarded as tentative because additional research may reveal new information about measures reviewed here, or measures not reviewed here. Finally, the present evaluative criteria are subject to debate and modification. We suggested these criteria based on logic and trends in the literature, but other criteria may be applicable to an evaluation of treatment probes.
Acknowledgements
The work reported here, and specifically the research reported in Section 4.2.4., was supported by a grant from the National Institutes of Health (NIH), “Articulatory Kinematics in Neurogenic Speech Disorders” (Award # R01 DC 003723, G. Weismer, Principal Investigator). John Westbury designed the measure of phonetic working space described in this paper, and Clarissa Weiss assisted in data analysis.
Footnotes
The issue of the magnitude of voice intensity effects produced by therapy efforts, and the possible relationship of those intensity changes to “untreated” articulatory and speech intelligibility changes, is complicated and requires far more extensive consideration than is possible in the present paper. “Typical”, group-averaged voice intensity differences between patients with PD and neurologically-normal controls for connected speech samples are apparently between 1–4 dB, a difference no greater than the standard deviation of voice intensity among a group of neurologically-normal controls (see data reported in Ramig et al., 2001, their Table 1, p. 81; also see Tjaden & Wilding, 2004, their Table 3, p. 772). The larger voice intensity training effect of 10–15 dB for speakers with PD is obtained for sustained vowel tasks (Ramig et al., 1995; Ramig et al., 2001). Interestingly, the magnitude of this treatment effect is nearly the same as the dynamic range (difference between softest and loudest phonation) for voice SPL reported by Goberman, Coelho, & Robb (2002) among persons with PD. If the magnitude of the SPL treatment effect is close to the untreated, dynamic range for speakers with PD, questions are raised concerning whether the reported treatment effects are merely demonstrations of the SPL dynamic range in this population, or more generally whether sustained vowel data, being so different from connected speech data, are useful for understanding the dysarthria and its response to treatment. These data, along with the potential dissociation of voice intensity and/or voice quality from speech intelligibility as a result of DBS (discussed in text), point to the need for a careful examination of the specific claims made for voice intensity as a centerpiece for speech therapy efforts in Parkinson disease. Increased voice intensity may of course have certain benefits not easily captured by articulatory measures or speech intelligibility measures. The benefits may include a sense of patient well-being (e.g., because of “being heard” or a sense of vitality), enhanced voice quality, and a greater willingness on the part of the patient to talk in social situations, among other things.
The lack of complete redundancy between, for example, the distinctiveness of articulatory positions for corner vowels and the displacement and/or speed of the motions reaching those positions has to do with many factors, including (at least) differences in the vocal tract size of speakers, dialect variations, and the specific time point at which position measures are made and the influence on these of varying durations across vowels. When a patient is his or her own control, as in the evaluation of the effects of DBS on speech production skills pre- and post-stimulation, factors such as vocal tract size and dialect are nullified. The potential influence of vowel duration remains, however. A given speaker can reach the same tongue coordinates for vowels like /i/ or /ɑ/ at very different tongue speeds depending on the overall duration of the vowel and the time point at which the position measurement is made (typically at the temporal middle of a vowel; a longer vowel allows more time to reach the maximum displacement). Interestingly, the literature does not include a study of the relationship between articulatory positions measured at vowel “targets” for the corner vowels, and the acoustic vowel space corresponding to those positions; even the assumed ordinal relationship between acoustic vowel area and articulatory position distinctiveness awaits empirical validation.
Contributor Information
Gary Weismer, Dept. Communicative Disorders, UW-Madison, Waisman Center, UW-Madison, Madison, WI USA.
Yana Yunusova, Department of Speech-Language Pathology, University of Toronto, Toronto, Ontario, CANADA.
Kate Bunton, Speech, Language, and Hearing Sciences, University of Arizona, Tucson, AZ USA.
References
- Ackermann H, Hertrich I, Daum I, Scharf G, Spieker S. Kinematic analysis of articulatory movements in central motor disorders. Movement Disorders. 1997;12:1019–1027. doi: 10.1002/mds.870120628.
- Åström M, Tripoliti E, Hairz MI, Zrinzo LU, Martinez-Torres I, Limousin P, Wårdell K. Patient specific model-based investigation of speech intelligibility and movement during deep-brain stimulation. Stereotactic and Functional Neurosurgery. 2010;88:224–233. doi: 10.1159/000314357.
- Baken RJ. Clinical measurement of speech and voice. San Diego, CA: College-Hill Press; 1987.
- Ballard KJ, Robin DA, Folkins JW. An integrative model of speech motor control: A response to Ziegler. Aphasiology. 2003;17:37–48.
- Barlow SM, Bradford PT. Measurement and implications of orofacial muscle performance in speech disorders. Journal of Human Muscle Performance. 1992;1:1–31.
- Barlow SM, Abbs JH. Orofacial fine motor control impairments in congenital spasticity: Evidence against hypertonus-related performance deficits. Neurology. 1984;34:145–150. doi: 10.1212/wnl.34.2.145.
- Boersma P, Weenink D. Praat: doing phonetics by computer (Version 5.1.05) [Computer program] 2009.
- Blumstein SE, Stevens KN. Acoustic invariance in speech production: Evidence from measurements of the spectral characteristics of stop consonants. The Journal of the Acoustical Society of America. 1979;66:1001–1017. doi: 10.1121/1.383319.
- Bunton K, Westbury J, Weismer G. An evaluation of phonetic working space in normal geriatrics and persons with motor speech disorders; Paper presented at the 140th meeting of the Acoustical Society of America; Newport Beach, California. 2000.
- Cho T, Ladefoged P. Variations and universals in VOT: evidence from 18 languages. Journal of Phonetics. 1999;27:207–229.
- Crystal TH, House AS. Segmental durations in connected speech samples: preliminary results. Journal of the Acoustical Society of America. 1982;72:705–716. doi: 10.1121/1.388251.
- Crystal TH, House AS. The duration of American-English vowels: An overview. Journal of Phonetics. 1988a;16:263–284.
- Crystal TH, House AS. Segmental durations in connected-speech signals: Current results. Journal of the Acoustical Society of America. 1988b;83:1553–1573. doi: 10.1121/1.388251.
- D’Alatri L, Paludetti G, Contarino MF, Galla S, Marchese MR, Bentivoglio AR. Effects of bilateral subthalamic nucleus stimulation and medication on parkinsonian speech impairment. Journal of Voice. 2008;22:365–372. doi: 10.1016/j.jvoice.2006.10.010.
- Dromey C, Ramig LO. Intentional changes in sound pressure level and rate: their impact on measures of respiration, phonation, and articulation. Journal of Speech, Language, and Hearing Research. 1998;41:1003–1018. doi: 10.1044/jslhr.4105.1003.
- Dromey C, Ramig LO, Johnson AB. Phonatory and articulatory changes associated with increased vocal intensity in patients with Parkinson disease. Journal of Speech and Hearing Research. 1995;38:751–764. doi: 10.1044/jshr.3804.751.
- Duez D. Segmental duration in Parkinsonian French speech. Folia Phoniatrica et Logopaedica. 2009;61:239–246. doi: 10.1159/000228001.
- Dustin K. Evaluation of electromagnetic incompatibility concerns for deep brain stimulators. Journal of Neuroscience Nursing. 2008;40:299–303. doi: 10.1097/01376517-200810000-00008.
- Fahn S, Elton RL. UPDRS Development Committee. Unified Parkinson’s disease rating scale. In: Fahn S, Marsden CD, Calne DB, Goldstein M, editors. Recent Developments in Parkinson’s Disease. Florham Park, NJ: Macmillan; 1987. pp. 153–163.
- Forrest K, Weismer G, Milenkovic P, Dougall RN. Statistical analysis of word-initial voiceless obstruents: preliminary data. Journal of the Acoustical Society of America. 1988;84:115–123. doi: 10.1121/1.396977.
- Forrest K, Weismer G, Turner GS. Kinematic, acoustic, and perceptual analyses of connected speech produced by Parkinsonian and normal geriatric adults. Journal of the Acoustical Society of America. 1989;85:2608–2622. doi: 10.1121/1.397755.
- Fox CM, Ramig LO, Ciucci MR, Sapir S, McFarland DH, Farley BG. The science and practice of LSVT/LOUD: Neural plasticity-principled approach to treating individuals with Parkinson disease and other neurological disorders. Seminars in Speech and Language. 2006;27:283–299. doi: 10.1055/s-2006-955118.
- Gentil M, Garcia-Ruiz P, Pollak P, Benabid AL. Effect of stimulation of the subthalamic nucleus on oral control of patients with parkinsonism. Journal of Neurology, Neurosurgery, and Psychiatry. 1999;67:329–333. doi: 10.1136/jnnp.67.3.329.
- Gentil M, Garcia-Ruiz P, Pollak P, Benabid AL. Effect of bilateral deep-brain stimulation on oral control of patients with parkinsonism. European Neurology. 2000;44:147–152. doi: 10.1159/000008224.
- Gentil M, Pinto S, Pollak P, Benabid A. Effect of bilateral stimulation of the subthalamic nucleus on parkinsonian dysarthria. Brain and Language. 2003;85:190–196. doi: 10.1016/s0093-934x(02)00590-4.
- Gérard J-M, Perrier P, Payan Y. 3D biomechanical tongue modeling to study speech production. In: Harrington J, Tabain M, editors. Speech production: Models, phonetic processes, and techniques. New York, NY: Psychology Press; 2006. pp. 85–101.
- Gibbon F, Nikolaidis K. Palatography. In: Hardcastle WJ, Hewitt N, editors. Coarticulation: Theory, data, and techniques. Cambridge, UK: Cambridge University Press; 1999. pp. 229–245.
- Goberman AM, Coelho C. Acoustic analysis of Parkinsonian speech. I: Speech characteristics and L-Dopa therapy. NeuroRehabilitation. 2002;17:237–246.
- Goberman AM, Coelho C, Robb M. Phonatory characteristics of Parkinsonian speech before and after morning medication: the ON and OFF states. Journal of Communication Disorders. 2002;35:217–239. doi: 10.1016/s0021-9924(01)00072-7.
- Grabe E, Low EL. Durational variability in speech and the rhythm class hypothesis. In: Gussenhoven C, Warner N, editors. Laboratory phonology. New York: Mouton de Gruyter; 2002. pp. 515–546.
- Grillo EU, Perta K, Smith L. Laryngeal resistance distinguished pressed, normal, and breathy voice in vocally untrained females. Logopedics, Phoniatrics, Vocology. 2009;34:43–48. doi: 10.1080/14015430802587835.
- Hammer MJ, Barlow SM, Lyons KE, Pahwa R. Subthalamic nucleus deep brain stimulation changes velopharyngeal control in Parkinson’s disease. Journal of Communication Disorders. 2011;44:37–48. doi: 10.1016/j.jcomdis.2010.07.001.
- Hammer MJ, Barlow SM, Lyons KE, Pahwa R. Subthalamic nucleus deep brain stimulation changes speech respiratory and laryngeal control in Parkinson’s disease. Journal of Neurology. 2010;257:1692–1702. doi: 10.1007/s00415-010-5605-5.
- Hardcastle WJ, Laver J, Gibbon FE. The handbook of phonetic sciences. Oxford, UK: Blackwell Publishing; 2010.
- Hartinger M, Tripoliti E, Hardcastle WJ, Limousin P. Effects of medication and subthalamic nucleus deep brain stimulation on tongue movements in speakers with Parkinson’s disease using electropalatography: A pilot study. Clinical Linguistics & Phonetics. 2011;25:210–230. doi: 10.3109/02699206.2010.521877.
- Hashii M, Honda K, Westbury JR. Time varying acoustic and articulatory characteristics of American English [ɹ]: a cross-speaker study. Journal of Phonetics. 1997;31:3–22.
- Hillenbrand J, Getty LA, Clark MJ, Wheeler K. Acoustic characteristics of American English vowels. Journal of the Acoustical Society of America. 1995;97:3099–3111. doi: 10.1121/1.411872.
- Hillenbrand JM, Clark MJ, Nearey TM. Effects of consonant environment on vowel formant patterns. Journal of the Acoustical Society of America. 2000;109:748–763. doi: 10.1121/1.1337959.
- Hixon TJ, Weismer G, Hoit JD. Preclinical speech science: Anatomy, physiology, acoustics, perception. San Diego, CA: Plural Publishing; 2008.
- Holbrook A, Fairbanks G. Diphthong formants and their movements. Journal of Speech and Hearing Research. 1962;5:38–58. doi: 10.1044/jshr.0501.38.
- Holmberg E, Hillman R, Perkell J. Glottal airflow and transglottal air pressure measurements for male and female speakers in soft, normal, and loud voice. Journal of the Acoustical Society of America. 1988;84:511–529. doi: 10.1121/1.396829.
- Hunker CJ, Abbs JH, Barlow SM. The relationship between Parkinsonian rigidity and hypokinesia in the orofacial system: A quantitative analysis. Neurology. 1982;32:749–754. doi: 10.1212/wnl.32.7.749.
- Isaias IU, Alterman RL, Tagliati M. Deep brain stimulation for primary generalized dystonia. Archives of Neurology. 2009;66:465–470. doi: 10.1001/archneurol.2009.20.
- Journee HL, Postma AA, Staal MJ. Intraoperative neurophysiological assessment of disabling symptoms in DBS surgery. Neurophysiologie Clinique/Clinical Neurophysiology. 2007;37:467–475. doi: 10.1016/j.neucli.2007.10.006.
- Kent RD, Kim Y-J. Toward an acoustic typology of motor speech disorders. Clinical Linguistics and Phonetics. 2003;17:427–445. doi: 10.1080/0269920031000086248.
- Kent RD, Weismer G, Kent JF, Rosenbek JC. Toward phonetic intelligibility testing in dysarthria. Journal of Speech and Hearing Disorders. 1989;54:482–499. doi: 10.1044/jshd.5404.482.
- Kent RD, Weismer G, Kent JF, Vorperian HK, Duffy JR. Acoustic studies of dysarthric speech: Methods, progress, and potential. Journal of Communication Disorders. 1999;32:141–180. doi: 10.1016/s0021-9924(99)00004-0.
- Kim Y-J, Kent RD, Weismer G. An acoustic study of the relationships among neurological disease, dysarthria type, and severity of dysarthria. Journal of Speech, Language, and Hearing Research. 2011;54:417–429. doi: 10.1044/1092-4388(2010/10-0020).
- Kim Y-J, Weismer G, Kent RD, Duffy JR. Statistical models of F2 slope in relation to severity of dysarthria. Folia Phoniatrica et Logopaedica. 2009;61:329–335. doi: 10.1159/000252849.
- Kleinow J, Smith A, Ramig LO. Speech motor stability in IPD: Effects of rate and loudness manipulations. Journal of Speech, Language, and Hearing Research. 2001;44:1041–1051. doi: 10.1044/1092-4388(2001/082).
- Klosterman F, Wahl M, Marzinzik F, Vesper J, Sommer W, Curio G. Speed effects of deep brain stimulation for Parkinson’s disease. Movement Disorders. 2010 doi: 10.1002/mds.23381. [EPub ahead of print, October 11].
- Kuruvilla MS, Murdoch BE, Goozée JV. Electropalatographic (EPG) assessment of tongue-to-palate contacts in dysarthric speakers following TBI. Clinical Linguistics & Phonetics. 2008;22:703–725. doi: 10.1080/02699200802176378.
- Leanderson R, Meyerson BA, Persson A. Lip muscle function in Parkinsonian dysarthria. Acta Otolaryngologica. 1972;74:350–357. doi: 10.3109/00016487209128462.
- Levin J, Krafczyk S, Valkovic P, Eggert T, Claassen J, Bötzel K. Objective measurement of muscle rigidity in Parkinsonian patients treated with subthalamic stimulation. Movement Disorders. 2009;24:57–63. doi: 10.1002/mds.22291.
- Liss JM, White L, Mattys SL, Lansford K, Lotto AJ, Spitzer SM, Caviness JN. Quantifying speech rhythm abnormalities in the dysarthrias. Journal of Speech, Language, and Hearing Research. 2009;52:1334–1352. doi: 10.1044/1092-4388(2009/08-0208).
- Liu H-M, Tsao F-M, Kuhl PK. The effect of reduced vowel working space on speech intelligibility in Mandarin-speaking young adults with cerebral palsy. Journal of the Acoustical Society of America. 2005;117:3879–3889. doi: 10.1121/1.1898623.
- Logemann JA, Fisher HB, Boshes B, Blonsky ER. Frequency and cooccurrence of vocal tract dysfunctions in the speech of a large sample of Parkinson patients. Journal of Speech and Hearing Disorders. 1978;43:47–57. doi: 10.1044/jshd.4301.47.
- Loh EW-L, Goozée JV, Murdoch BE. Kinematic analysis of jaw function in children following traumatic brain injury. Brain Injury. 2005;19:529–538. doi: 10.1080/02699050400025083.
- Low EL, Grabe E, Nolan F. Quantitative characterization of speech rhythm: syllable timing in Singapore English. Language and Speech. 2000;43:377–401. doi: 10.1177/00238309000430040301.
- Makiyama K, Yoshihashi H, Mogitate M, Kida A. The role of adjustment of expiratory effort in the control of vocal intensity: Clinical assessment of phonatory function. Otolaryngology—Head and Neck Surgery. 2005;132:641–646. doi: 10.1016/j.otohns.2005.01.017.
- McAuliffe MJ, Ward EC. The use of electropalatography in the assessment and treatment of acquired motor speech disorders in adults: Current knowledge and future directions. NeuroRehabilitation. 2006;21:189–203.
- McAuliffe MJ, Ward EC, Murdoch BE. Speech production in Parkinson’s disease: I. An electropalatographic investigation of tongue-palate contact patterns. Clinical Linguistics & Phonetics. 2006a;20:1–18. doi: 10.1080/02699200400001044.
- McAuliffe MJ, Ward EC, Murdoch BE. Speech production in Parkinson’s disease: II. Acoustic and electropalatographic investigation of sentence, word and segment durations. Clinical Linguistics and Phonetics. 2006b;20:19–33. doi: 10.1080/0269-9200400001069.
- McClean MD, Beukelman DR, Yorkston KM. Speech-muscle visuomotor tracking in dysarthric and nonimpaired speakers. Journal of Speech and Hearing Research. 1987;30:276–282. doi: 10.1044/jshr.3002.276.
- Milenkovic P. TF32 [Computer program] 2001.
- Murdoch BE, Goozee JV, Veidt M, Scott DH, Meyers IA. Introducing the pressure-sensing palatograph—the next frontier in electropalatography. Clinical Linguistics and Phonetics. 2004;18:433–445. doi: 10.1080/02699200410001703628.
- Narayana S, Jacks A, Robin DA, Poizner H, Zhang W, Franklin C, Liotti M, Vogel D, Fox PT. A non-invasive imaging approach to understanding speech changes following deep brain stimulation in Parkinson’s disease. American Journal of Speech-Language Pathology. 2009;18:146–161. doi: 10.1044/1058-0360(2008/08-0004).
- O’Dwyer NJ, Neilson PD. Voluntary muscle control in normal and athetoid dysarthric speakers. Brain. 1988;111:877–899. doi: 10.1093/brain/111.4.877.
- O’Dwyer NJ, Neilson PD, Guitar BE, Quinn PT, Andrews G. Control of upper airway structures during nonspeech tasks in normal and cerebral-palsied subjects: EMG findings. Journal of Speech and Hearing Research. 1983;26:162–170. doi: 10.1044/jshr.2602.162.
- Perkell JS, Cohen MH, Svirsky M, Matthies M, Garabieta I, Jackson M. Electro-magnetic midsagittal articulometer (EMMA) systems for transducing speech articulatory movements. Journal of the Acoustical Society of America. 1992;92:3078–3096. doi: 10.1121/1.404204.
- Peterson-Falzone SJ, Hardin-Jones MA, Karnell MP. Cleft palate speech. 4th ed. St. Louis, MO: Mosby Elsevier; 2010.
- Pinto S, Gentil M, Fraix V, Benabid AL, Pollak P. Bilateral subthalamic stimulation effects on oral force control in Parkinson’s disease. Journal of Neurology. 2003;250:179–187. doi: 10.1007/s00415-003-0966-7.
- Pinto S, Gentil M, Krack P, Sauleau P, Fraix V, Benabid A-L, Pollak P. Changes induced by Levodopa and subthalamic nucleus stimulation on Parkinsonian speech. Movement Disorders. 2005;20:1507–1515. doi: 10.1002/mds.20601. [DOI] [PubMed] [Google Scholar]
- Pinto S, Ozsancack C, Tripoliti E, Thobois S, Limousin-Dowsey P, Auzou P. Treatments for dysarthria in Parkinson’s disease. The Lancet Neurology. 2004;3:547–556. doi: 10.1016/S1474-4422(04)00854-3. [DOI] [PubMed] [Google Scholar]
- Pisoni DB, Remez RE. The handbook of speech perception. Oxford, UK: Blackwell Publishing; 2005. [Google Scholar]
- Pützer M, Barry WJ, Moringlane JR. Effect of deep brain stimulation on different speech subsystems in patients with multiple sclerosis. Journal of Voice. 2007;21:741–753. doi: 10.1016/j.jvoice.2006.05.007. [DOI] [PubMed] [Google Scholar]
- Ramig L, Countryman S, Thompson L, Horii Y. Comparison of two forms of intensive speech treatment for Parkinson disease. Journal of Speech and Hearing Research. 1995;38:1232–1251. doi: 10.1044/jshr.3806.1232. [DOI] [PubMed] [Google Scholar]
- Ramig LO, Sapir S, Fox C, Countryman S. Changes in vocal loudness following intensive voice treatment (LSVT) in individuals with Parkinson’s disease: a comparison with untreated patients and normal age-matched controls. Movement Disorders. 2001;16:79–83. doi: 10.1002/1531-8257(200101)16:1<79::aid-mds1013>3.0.co;2-h. [DOI] [PubMed] [Google Scholar]
- Robin DA, Solomon NP, Moon JB, Folkins JW. Nonspeech measurement of the speech production mechanism. In: McNeil MR, editor. Clinical management of sensorimotor speech disorders. New York: Thieme; 1997. pp. 49–62. [Google Scholar]
- Rosenbek JC, LaPointe LL. The dysarthrias: description, diagnosis, and treatment. In: Johns DF, editor. Clinical management of neurogenic communicative disorders. Boston, MA: Little, Borwn and Co; 1985. pp. 97–152. [Google Scholar]
- Sapir S, Spielman JL, Ramig LO, Story BH, Fox C. Effects of intensive voice treatment (the Lee Silverman Voice Treatment [LSVT]) on vowel articulation in dysarthric individuals with idiopathic Parkinson disease: Acoustic and perceptual findings. Journal of Speech, Language, and Hearing Research. 2007;50:899–912. doi: 10.1044/1092-4388(2007/064).
- Schiavetti N. Scaling procedures for the measurement of speech intelligibility. In: Kent RD, editor. Intelligibility in speech disorders. Amsterdam, The Netherlands: John Benjamins Publishing Co.; 1992. pp. 11–34.
- Skodda S, Visser W, Schlegel W. Gender-related patterns of dysprosody in Parkinson’s disease and correlation between speech variables and motor symptoms. Journal of Voice. 2010. doi: 10.1016/j.jvoice.2009.07.005. [EPub ahead of publication, 12 April 2010]
- Smitheran JR, Hixon TJ. A clinical method for estimating laryngeal airway resistance during vowel production. Journal of Speech and Hearing Disorders. 1981;46:138–146. doi: 10.1044/jshd.4602.138.
- Stevens KN. Acoustic phonetics. Cambridge, MA: MIT Press; 2000.
- Sturman MM, Vaillancourt DE, Metman LV, Bakay RA, Corcos DM. Effects of five years of chronic STN stimulation on muscle strength and movement speed. Experimental Brain Research. 2010;205:435–443. doi: 10.1007/s00221-010-2370-8.
- Sussman HM, Westbury JR. A laterality effect in isometric and isotonic labial tracking. Journal of Speech and Hearing Research. 1978;21:563–579. doi: 10.1044/jshr.2103.563.
- Tassorelli C, Buscone S, Sandrini G, Pacchetti C, Furnari A, Zangaglia R, Bartolo M, Nappi G, Martignoni E. The role of rehabilitation in deep brain stimulation of the subthalamic nucleus for Parkinson’s disease: A pilot study. Parkinsonism and Related Disorders. 2009;15:675–681. doi: 10.1016/j.parkreldis.2009.03.006.
- Timmins C, Cleland J, Wood SE, Hardcastle WJ, Wishart JG. A perceptual and electropalatographic study of /ʃ/ in young people with Down’s syndrome. Clinical Linguistics and Phonetics. 2009;23:911–925. doi: 10.3109/02699200903141271.
- Tjaden K, Turner GS. Spectral properties of fricatives in amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research. 1997;40:1358–1372. doi: 10.1044/jslhr.4006.1358.
- Tjaden K, Wilding GE. Rate and loudness manipulations in dysarthria: acoustic and perceptual findings. Journal of Speech, Language, and Hearing Research. 2004;47:766–783. doi: 10.1044/1092-4388(2004/058).
- Tommasi G, Krack P, Fraix V, Le Bas J-F, Chabardes S, Benabid A-L, Pollak P. Pyramidal tract side effects induced by deep brain stimulation of the subthalamic nucleus. Journal of Neurology, Neurosurgery, and Psychiatry. 2008;79:813–819. doi: 10.1136/jnnp.2007.117507.
- Tripoliti E, Dowsey-Limousin P, Tisch S, Borrell E, Hariz MI. Speech in Parkinson’s disease following subthalamic nucleus deep brain stimulation: preliminary results. Journal of Medical Speech-Language Pathology. 2006;14:309–315.
- Tripoliti E, Limousin P. Electrical stimulation of deep brain structures and speech. In: Maassen B, van Lieshout P, editors. Speech motor control: New developments in basic and applied research. Oxford, UK: Oxford University Press; 2010. pp. 297–313.
- Tripoliti E, Zrinzo L, Martinez-Torres I, Frost E, Pinto S, Foltynie T, Holl E, Petersen E, Roughton M, Hariz MI, Limousin P. Effects of contact location and voltage amplitude on speech and movement in bilateral subthalamic nucleus deep brain stimulation. Movement Disorders. 2008;23:2377–2383. doi: 10.1002/mds.22296.
- Tripoliti E, Zrinzo L, Martinez-Torres I, Tisch S, Frost E, Borrell E, Hariz MI, Limousin P. Effects of subthalamic stimulation on speech of consecutive patients with Parkinson disease. Neurology. 2010. doi: 10.1212/WNL.0b013e318203e7d0. [E-print ahead of publication, November 12, 2010]
- Vaillancourt DE, Prodoehl J, Sturman MM, Bakay RAE, Metman LV, Corcos DM. Effects of deep brain stimulation and medication on strength, bradykinesia, and electromyographic patterns of the ankle joint in Parkinson’s disease. Movement Disorders. 2006;21:50–58. doi: 10.1002/mds.20672.
- Watson PJ, Schlauch RJ. The effect of fundamental frequency on the intelligibility of speech with flattened intonation contours. American Journal of Speech-Language Pathology. 2008;17:348–355. doi: 10.1044/1058-0360(2008/07-0048).
- Weismer G. Assessment of articulatory timing. In: Cooper J, editor. Assessment of speech and voice production: Research and clinical applications, NIDCD Monograph Vol. 1. Bethesda: NIH; 1991. pp. 84–95.
- Weismer G. Motor speech disorders. In: Hardcastle WJ, Laver J, editors. The handbook of phonetic sciences. London: Blackwell; 1997. pp. 191–219.
- Weismer G. Philosophy of research in motor speech disorders. Clinical Linguistics and Phonetics. 2006a;20:315–349. doi: 10.1080/02699200400024806.
- Weismer G. Speech disorders. In: Traxler M, Gernsbacher M, editors. Handbook of psycholinguistics. 2nd ed. Amsterdam: Elsevier; 2006b. pp. 93–124.
- Weismer G, Kim Y-J. Classification and taxonomy of motor speech disorders: What are the issues? In: Maassen B, van Lieshout PHHM, editors. Speech motor control: New developments in basic and applied research. Oxford, UK: Oxford University Press; 2010. pp. 229–241.
- Weismer G, Kuo C, Allen P. Transition characteristics in speakers with dysarthria and in healthy controls: Part IV: Additional data on vowel-consonant transitions and stroke patients. Paper presented at the 159th meeting of the Acoustical Society of America; Baltimore, MD. 2010.
- Weismer G, Jeng J-Y, Laures JS, Kent RD. Acoustic and intelligibility characteristics of sentence production in neurogenic speech disorders. Folia Phoniatrica et Logopaedica. 2001;53:1–18. doi: 10.1159/000052649.
- Weismer G, Kent RD, Hodge M, Martin R. The acoustic signature for intelligibility test words. Journal of the Acoustical Society of America. 1988;84:1281–1291. doi: 10.1121/1.396627.
- Weismer G, Martin RE, Kent RD, Kent JF. Formant trajectory characteristics of males with amyotrophic lateral sclerosis. Journal of the Acoustical Society of America. 1992;91:1085–1098. doi: 10.1121/1.402635.
- Weismer G, Yunusova Y, Westbury JR. Interarticulator coordination in dysarthria: An x-ray microbeam study. Journal of Speech, Language, and Hearing Research. 2003;46:1247–1261. doi: 10.1044/1092-4388(2003/097).
- Westbury JR. X-ray microbeam speech production database user's handbook. Madison, WI: University of Wisconsin-Madison; 1994.
- Yang B. A comparative study of American English and Korean vowels produced by male and female speakers. Journal of Phonetics. 1996;24:245–261.
- Yunusova Y, Green JR, Lindstrom MJ, Ball LJ, Pattee GL, Zinman L. Kinematics of disease progression in bulbar ALS. Journal of Communication Disorders. 2010;43:6–20. doi: 10.1016/j.jcomdis.2009.07.003.
- Yunusova Y, Weismer G, Westbury J, Lindstrom M. Articulatory movements during vowels in speakers with dysarthria and in normal controls. Journal of Speech, Language, and Hearing Research. 2008;51:596–611. doi: 10.1044/1092-4388(2008/043).
- Ziegler W. Speech motor control is task-specific: Evidence from dysarthria and apraxia of speech. Aphasiology. 2003;17:3–36.