Abstract
Purpose
This study examined the relationship between the acoustic measure relative fundamental frequency (RFF) and a kinematic estimate of laryngeal stiffness.
Method
Twelve healthy adults (mean age = 22.7 years, SD = 4.4; 10 women, 2 men) produced repetitions of /ifi/ while varying their vocal effort during simultaneous acoustic and video nasendoscopic recordings. RFF was determined from the last 10 voicing cycles before the voiceless obstruent (RFF offset) and the first 10 cycles of revoicing (RFF onset). A kinematic stiffness ratio was calculated for the vocal fold adductory gesture during revoicing by normalizing the maximum angular velocity by the maximum glottic angle during the voiceless obstruent.
Results
A linear mixed effect model indicated that RFF offset and onset were significant predictors of the kinematic stiffness ratios. The model accounted for 52% of the variance in the kinematic data. Individual relationships between RFF and kinematic stiffness ratios varied across participants, with at least moderate negative correlations in 83% of participants for RFF offset but only 40% of participants for RFF onset.
Conclusions
RFF significantly predicted kinematic estimates of laryngeal stiffness in healthy speakers and has the potential to be a useful clinical indicator of laryngeal tension. Further research is needed in individuals with voice disorders.
Laryngeal tension has been implicated in a variety of voice disorders, both functional (e.g., vocal hyperfunction, puberphonia; Gökdoğan, Gökdoğan, Tutar, Aydil, & Yilmaz, 2015; Hillman, Holmberg, Perkell, Walsh, & Vaughan, 1989) and neurological (e.g., spasmodic dysphonia, Parkinson's disease; Gallena, Smith, Zeffiro, & Ludlow, 2001; Ludlow, 2009). Laryngeal tension and muscular stiffness, a biomechanical correlate of tension, can result in a strained vocal quality and, often, increased vocal effort (Roy, Mazin, & Awan, 2014). Although laryngeal tension is a common symptom in individuals with voice disorders, there is currently no gold standard objective measure of it. This gap can lead to inconsistent and ambiguous clinical evaluation and rehabilitation of laryngeal tension in individuals with voice disorders.
Laryngeal Tension
Excessive laryngeal tension is a frequent symptom reported by individuals with dysphonia. In functional voice disorders, laryngeal tension can be present in the absence of any laryngeal pathology, meaning that these individuals appear to have anatomically normal vocal folds when viewed during laryngoscopy (Angsuwarangsee & Morrison, 2002; Roy, 2003). Vocal therapy with these individuals often focuses on decreasing tension at and around the larynx (e.g., laryngeal massage; Roy & Leeper, 1993) and decreasing the behaviors associated with hyperfunctional voice use (Pannbacker, 1998). However, individuals with vocal fold lesions, such as vocal fold nodules, often exhibit excessive laryngeal tension as well. In these cases, it is not known whether laryngeal tension led to the development of vocal lesions (Hsiao, Liu, Hsu, Lee, & Lin, 2001; Johns, 2003) or whether increased laryngeal tension is an adaptive behavior to compensate for the anatomical changes of the vocal folds themselves (Belafsky, Postma, Reulbach, Holland, & Koufman, 2002). Thus, laryngeal tension can be considered a primary or secondary cause of dysphonia in individuals with voice disorders (Roy, 2008). Individuals with primary and secondary laryngeal tension can be grouped together into a larger class known as vocal hyperfunction, described as excessive or imbalanced activation of the laryngeal and/or extralaryngeal muscles (Hillman et al., 1989). Vocal hyperfunction constitutes approximately 65% of voice disorder diagnoses (Ramig & Verdolini, 1998; Van Houtte, Van Lierde, & Claeys, 2011).
At present, voice clinicians have many diagnostic approaches available for assessing vocal function (Awan et al., 2014). Diagnostic techniques that attempt to assess laryngeal tension in particular include specific acoustic measures (e.g., fundamental frequency, cepstral peak prominence, spectral tilt; Maryn & Weenink, 2015), visual inspection via laryngoscopy, perceptual assessment (e.g., Consensus Auditory-Perceptual Evaluation of Voice; Kempster, Gerratt, Abbott, Barkmeier-Kraemer, & Hillman, 2009), and manual palpation of the neck musculature (Roy, Ford, & Bless, 1996). However, to date, no single acoustic measure has been shown to be indicative of laryngeal tension (Bhuta, Patrick, & Garnett, 2004), and visual estimates of tension, such as supraglottal compression identified during laryngoscopy, are not specific to voice disorders because they have been observed in healthy individuals as well (Milstein, 1999; Stager, Bielamowicz, Regnell, Gupta, & Barkmeier, 2000). Furthermore, perceptual ratings of vocal strain and increased vocal effort during phonation have variable inter- and intrarater reliability, especially in attempting to quantify and distinguish the degree of impairment when differences are slight (i.e., typical voice vs. mild dysphonia; Eadie et al., 2010; Kelchner et al., 2010; Schaeffer & Sidavi, 2011; Wuyts, De Bodt, & Van de Heyning, 1999). Last, manual palpation of the hyolaryngeal complex may be inconsistent and has questionable interrater reliability (Angsuwarangsee & Morrison, 2002; Stepp, Heaton, et al., 2011). Thus, standardized objective measures that are specific to laryngeal tension would enhance assessment and evaluation of therapeutic outcomes in individuals with voice disorders. In this study, we aimed to evaluate one such potential objective measure that is based on speech acoustics: relative fundamental frequency (RFF).
Acoustic Measure: RFF
RFF is an acoustic measure that was developed to characterize voicing during the transition of voicing offset and onset around a voiceless obstruent. It is calculated from the instantaneous fundamental frequencies (f₀) during the last 10 cycles of the voicing offset before the obstruent (known as the offset cycles) and the first 10 voicing onset cycles after the obstruent (see Figure 1). The f₀ of each cycle is normalized against a reference f₀ from the steady state of the vowel, thus adjusting for the speaker's own f₀ and the variation inherent within speakers (see Equation 1). The reference f₀ for the voicing offset is taken from offset cycle 1, and the reference f₀ for the voicing onset is taken from onset cycle 10. In this study, RFF offset cycle 10 and onset cycle 1 were specifically targeted for analysis because they are the furthest from the reference cycles and are hypothesized to reflect the greatest differences in laryngeal tension, with lower cycle values noted in individuals with tension-based voice disorders (Eadie & Stepp, 2013; Stepp, Merchant, Heaton, & Hillman, 2011).
$$\text{RFF (ST)} = 39.86 \times \log_{10}\!\left(\frac{f_0}{\text{reference } f_0}\right) \tag{1}$$
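The normalization referred to as Equation 1 can be illustrated with a minimal Python sketch. It is not the authors' MATLAB program; it assumes the standard semitone definition, 12 × log2 of the frequency ratio (equivalent to about 39.86 × log10 of the ratio), and the function name and example frequencies are hypothetical.

```python
import math

def rff_semitones(periods_s, reference_index):
    """Return per-cycle RFF in semitones (ST) relative to one reference cycle.

    periods_s: durations in seconds of consecutive voicing cycles (hypothetical input).
    reference_index: position of the reference cycle, e.g. 0 for offset cycle 1
    or 9 for onset cycle 10 when ten cycles are supplied.
    """
    f0 = [1.0 / p for p in periods_s]   # instantaneous f0 is the inverse of each period
    ref = f0[reference_index]
    return [12.0 * math.log2(f / ref) for f in f0]

# Illustrative offset cycles: f0 drifts down from 200 Hz toward the obstruent.
offset_f0 = [200, 200, 199, 199, 198, 197, 196, 195, 193, 190]
offset_rff = rff_semitones([1.0 / f for f in offset_f0], reference_index=0)
print(round(offset_rff[-1], 2))  # RFF offset cycle 10, in ST
```

For onset cycles, the same function would be called with `reference_index=9`, so that onset cycle 1 is expressed relative to onset cycle 10.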
Figure 1.
A waveform of a single /ifi/ instance with relative fundamental frequency offset and onset cycles identified.
Multiple factors are thought to contribute to RFF values, including laryngeal muscle tension, vocal fold abductory movement, and aerodynamic forces (Lien, Michener, Eadie, & Stepp, 2015; Stepp, Hillman, & Heaton, 2010b). Laryngeal tension, in particular, is hypothesized to influence RFF values for both offset and onset. When the effects of tension are disrupted, such as in individuals with tension-based voice disorders, there is a reduction in RFF, with offset values becoming more negative and onset values becoming less positive. Thus, researchers have identified RFF as a potential acoustic indicator of laryngeal tension. Significantly lower RFF values are apparent in individuals with functional (vocal hyperfunction) and neurological (spasmodic dysphonia, Parkinson's disease) voice disorders compared with healthy controls (Eadie & Stepp, 2013; Goberman & Blomgren, 2008; Stepp, 2013; Stepp, Hillman, & Heaton, 2010a; Stepp, Sawin, & Eadie, 2012). Furthermore, when individuals with vocal hyperfunction completed successful voice therapy, RFF values improved to ranges of those observed in healthy speakers (Stepp, Merchant, et al., 2011). The authors argued that the course of vocal therapy decreased laryngeal tension, which was then reflected as increases in participants' RFF values.
A relationship between RFF and the modulation of vocal effort has also been reported. Lien et al. (2015) examined RFF in healthy speakers across five speaker-modulated levels of effort: relaxed, normal, slightly strained, moderately strained, and maximally strained. The authors found no statistically significant differences among the relaxed, normal, and slightly strained productions, but the moderately and maximally strained productions resulted in reductions in RFF values similar to those reported previously in individuals with vocal hyperfunction (Stepp et al., 2010a). These results suggest that RFF can be manipulated on the basis of vocal effort in those with otherwise healthy vocal mechanisms, resulting in an RFF similar to that observed in individuals with tension-based voice disorders.
Although these studies are promising, further research is needed to validate RFF as an acoustic correlate of laryngeal tension and to determine its utility as a clinical measure for use with individuals with voice disorders.
Kinematic Measure: Estimate of Laryngeal Stiffness
Directly quantifying laryngeal tension has been challenging due to the limitations of measuring tension in vivo in an ecologically valid manner. Thus, a kinematic estimate is a more feasible way to characterize laryngeal function and, more specifically, intrinsic laryngeal muscle function.
Stiffness, defined as the measurement of resistance to displacement (Shiller, Laboissiere, & Ostry, 2002), is a biomechanical property of muscle that influences movement (Chu & Barlow, 2009). The ratio of maximum velocity to the extent of the movement, termed the kinematic estimate of stiffness or kinematic stiffness ratio (1/time), was first developed in the exercise physiology literature as a clinical correlate to tension during limb movement (J. D. Cooke, 1980; Feldman, 1980; Kelso & Holt, 1980). It was then adopted to characterize articulatory gestures of select oral structures (Hertrich & Ackermann, 2000; Kelso, Vatikiotis-Bateson, Saltzman, & Kay, 1985; Ostry, Cooke, & Munhall, 1987; Ostry, Keller, & Palmer, 1983; Ostry & Munhall, 1985) and laryngeal structures (A. Cooke, Ludlow, Hallett, & Selbie, 1997; Dailey et al., 2005; Munhall & Ostry, 1983; Stepp, Hillman, & Heaton, 2010b). In particular, the kinematic stiffness of gross vocal fold adductory gestures during voicing onset has been identified as an indicator of laryngeal tension during various voice modulations. A. Cooke et al. (1997) examined the vocal fold velocity profiles and associated kinematic stiffness ratios of healthy individuals who modulated their onset type (i.e., breathy, normal, hard glottal attack). The researchers described differences in the shape of the velocity profiles; longer gesture duration and shallow slopes were noted with breathy and normal onsets, whereas hard voicing onsets were shorter in duration and had steeper slopes. Furthermore, the authors reported that kinematic stiffness ratios were consistently lower for breathy onsets compared with hard onsets. The authors argued that hard onsets resulted in greater laryngeal tension and thus greater kinematic estimates of stiffness.
Stepp et al. (2010b) investigated kinematic stiffness during vocal fold abduction and adduction using a computational biomechanical model of laryngeal dynamics. By adjusting parameters of their one-joint virtual trajectory model, they were able to mimic intrinsic laryngeal muscle contraction by increasing tension of the thyroarytenoid (TA), lateral cricoarytenoid (LCA), and posterior cricoarytenoid muscles. The model found strong positive correlations between muscle tension and kinematic stiffness ratios. On the basis of the model, Stepp and colleagues hypothesized that kinematic estimates of stiffness could reflect laryngeal tension in individuals with vocal hyperfunction. Their experimental protocol used variable rates of speech, as prior research has correlated faster speech rate with increased maximum velocity during vocal fold adduction (Dailey et al., 2005) and increased tension in specific speech articulators (Hertrich & Ackermann, 2000; Ostry & Munhall, 1985). During slow speech rates, Stepp et al. (2010b) reported a significant difference in kinematic estimates of stiffness between those with vocal hyperfunction and healthy peers. However, the researchers also reported that when using faster speech rates, individuals with vocal hyperfunction did not show significant increases in kinematic stiffness ratios like those noted in healthy individuals. The authors suggested that individuals with vocal hyperfunction may have increased laryngeal tension at their baseline, which ultimately affected their ability to increase tension during faster speech rates.
Although the biomechanical model assisted in validating the use of the kinematic stiffness ratios as an indirect indication of laryngeal tension, the technique for gathering and processing the laryngeal kinematic data is invasive and time consuming and therefore has not been widely adopted into clinical evaluation. Here, we propose a combined analysis of kinematic stiffness ratios and the more practical acoustic measure of RFF in order to assist in validating RFF as an objective acoustic measure indicative of laryngeal tension.
Aim and Hypotheses
Voice clinicians currently lack a clinically feasible, objective measure of laryngeal tension. This study aimed to investigate the relationship between RFF and a kinematic estimate of laryngeal stiffness during speaker-modulated effort in healthy individuals in order to assist in validating RFF as an indicator of laryngeal tension. We hypothesized that both RFF offset cycle 10 and RFF onset cycle 1 would be significantly related to laryngeal tension (as reflected in the kinematic estimate of stiffness) within participants. We furthermore hypothesized that there would be a negative relationship between RFF and the kinematic stiffness ratios, with higher kinematic stiffness ratios resulting in lower RFF values.
Method
Participants
Twelve young adults (10 women, two men) aged 18 to 31 years (M = 22.7, SD = 4.4) participated in the study. A certified speech-language pathologist (SLP) completed a perceptual screening of vocal quality and reviewed medical history prior to participation. All participants were vocally healthy and had no current voice symptoms, including those of upper respiratory infection or similar. Participants had no history of voice disorder, laryngeal pathology, or any other known condition affecting vocal function (e.g., neurologic disease). Informed consent was obtained prior to participation, in compliance with the Boston University Institutional Review Board.
Experimental Design
Participants were seated for the duration of the study. They repeated the token /ifi/ in trains of three to capture voicing cycles surrounding a voiceless obstruent. The phoneme /f/ was chosen to limit the amount of individual variation within each speaker (Lien, Gattuccio, & Stepp, 2014), whereas the phoneme /i/ was chosen to provide the most open pharynx and thus the least obstructed view of the vocal folds.
In order to minimize the nasendoscopy examination time, participants were trained on vocal tasks prior to recordings. Participants were advised to maintain the pitch, volume, and rate of their typical speaking voice unless the task specifically stated otherwise; they were also advised to maintain relatively flat pitch contours, without clearly rising or falling inflection. A certified SLP determined task compliance throughout training, and verbal models were provided prior to each vocal task for all participants. Participants were provided the same descriptions for the vocal tasks and given feedback on their productions. Participants were considered to be appropriately trained once they correctly produced the vocal task for three consecutive /ifi/ productions.
In order to generate voice with varying degrees of tension, seven different voice tasks were elicited (see Table 1). First, participants began with their typical speaking voice at typical pitch and loudness. Next, participants were instructed to produce a moderate vocal strain, described as “twice as hard” as their typical speaking voice, followed by maximal strain, described as “as much strain as possible.” Participants then repeated tokens with a breathy voice while attempting to maintain their typical loudness. Next, a metronome set to the speed of largo (50 beats/min) assisted in cueing a steady, slow tempo while the participants repeated tokens with typical pitch and volume. Participants were then asked to create a hard glottal attack at the onset of each token. Last, a push–pull exercise was used to increase subglottal pressure and vocal fold adduction. Participants were instructed to pull up on the arms of the chair while straining their voice to elicit a strained vocal quality.
Table 1.
Voice tasks.
| Task | Description |
|---|---|
| Typical speaking voice | Typical pitch and loudness of conversational speech |
| Moderate vocal strain | Twice the speaker-perceived strain as their typical voice |
| Maximal vocal strain | As much speaker-perceived strain as possible |
| Breathy voice | Allowing extra air to escape while maintaining typical loudness |
| Controlled speed | Largo (50 beats/min) |
| Hard glottal attack | Overemphasizing the first sound of each token |
| Push–pull exercise | Pulling up on the arms of the chair while straining the voice |
Once participants were appropriately trained, a headset microphone was placed 7 cm from the lips at 45° from midline, and a flexible endoscope was passed transnasally through the nasopharynx and past the soft palate in order to visualize the vocal folds. A numbing agent was not provided. A certified SLP screened for typical voice production and vocal function by visually inspecting the larynx during the nasendoscopy exam; all participants presented within normal limits. A range of three to nine /ifi/ sets (median = 4) were elicited for each voice task depending on the participant's compliance and nasendoscopy tolerance. Total nasendoscopy time was approximately 5 min. The entire session (including consent, training, and data collection) was less than 30 min.
Data Collection
The headset microphone (WH20; Shure, Niles, IL) was connected to a portable digital audio recorder (H4N Handy Portable Digital Recorder; Zoom, Hauppauge, NY). High-quality acoustic data were digitized at a sampling rate of 44.1 kHz and a resolution of 16 bits.
Digital video (via a distal imaging chip; EPK-1000; Pentax, Tokyo, Japan) and acoustic signals were recorded with the Digital Stroboscopy System (Kay Elemetrics, Lincoln Park, NJ) with a halogen light source. Video was digitized at 30 frames/s with a frame size of 480 × 360 pixels. Acoustic data, captured with a lapel microphone, were digitized at a sampling rate of 44.1 kHz. The video and high-quality acoustic waveforms were time aligned using the two acoustic signals in order to extract simultaneous acoustic and kinematic measures.
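The text does not state how the two acoustic signals were aligned. One common approach, assumed here purely for illustration, is to find the lag that maximizes the cross-correlation between the headset recording and the lapel recording that accompanies the video. The function name and the toy signals below are hypothetical.

```python
def best_lag(reference, other, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive lag means the event appears `lag` samples later in `other`
    than in `reference`.
    """
    def score(lag):
        total = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                total += r * other[j]
        return total
    return max(range(-max_lag, max_lag + 1), key=score)

SAMPLE_RATE_HZ = 44100  # both signals were digitized at 44.1 kHz

# Toy signals: the same short event, delayed by 3 samples in the second recording.
headset = [0, 0, 1, 3, -2, 1, 0, 0, 0, 0]
lapel = [0, 0, 0, 0, 0, 1, 3, -2, 1, 0]

lag = best_lag(headset, lapel, max_lag=5)
offset_seconds = lag / SAMPLE_RATE_HZ  # time shift to apply when aligning
```

This brute-force search is quadratic in signal length, so for full recordings an FFT-based correlation would be the more practical choice.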
Acoustic Data Analysis
RFF was calculated using an automated MATLAB program, tuned to both healthy and dysphonic voices, developed by the third author (Version R2013a, The MathWorks, Natick, MA; Lien, 2015). The program identified the 10 vocal cycles before and after the voiceless obstruent, computed the inverse of each period, and compared the calculated f₀ values to the reference f₀ in semitones (ST) for normalization. Similar to previous studies, instances with excessive glottalization, fewer than 10 periodic cycles, large deflections, or unstable waveforms were automatically excluded during the analysis (Lien et al., 2014; Lien & Stepp, 2014). On average, participants had 9.1 usable RFF offsets and onsets per task. Values were averaged within each participant and task. Last, RFF offset cycle 10 and onset cycle 1 were targeted for further analysis.
Kinematic Data Analysis
Video recordings, captured via nasendoscopy, were divided into the gross adductory gestures that occur with the onset of voicing following the voiceless obstruent in each /ifi/ instance. Kinematic stiffness ratios were defined only for these adductory movements because previous literature supports calculating kinematic estimates of stiffness for gestures associated with adductory movements during voicing onset (A. Cooke et al., 1997; Dailey et al., 2005; Stepp et al., 2010b). Video clips were then converted to images at a sampling rate of 30 frames/s for further analysis. From these, glottic angles were manually identified from the maximum abductory angle to the fully adducted point and then semiautomatically estimated using a custom MATLAB program (see Figure 2; Dailey et al., 2005; Stepp et al., 2010b). A single investigator (the first author) visually inspected each image and marked the edge of the visual portions of the right and left vocal folds. In some instances, other laryngeal structures (e.g., epiglottis, arytenoid cartilages) impaired the full view of the vocal folds and anterior commissure. In these cases, the glottic angle was automatically determined by the software by extending the straight edge of the visible portion of the vocal folds to a single intersecting point. The results of this process yielded a series of angles that were then fit with an asymmetric sigmoidal function (see Figure 3A) as in Stepp et al. (2010b) and Britton et al. (2012). The maximum angular velocity was determined from the sigmoidal fit (see Figure 3B) and then divided by the maximum angle to create a kinematic stiffness ratio (A. Cooke et al., 1997; Dailey et al., 2005; Munhall & Ostry, 1983; Stepp et al., 2010b). Stiffness ratios were calculated for each adductory instance and averaged for each participant across each task.
Figure 2.
(A) A video nasendoscopic image of vocal folds with the glottic angle identified. After the vocal fold edges are manually identified, the angle is estimated semiautomatically with use of a custom MATLAB program. (B) A series of video nasendoscopic images of the vocal folds during a single adductory gesture. The images begin at the maximum abductory point during the voiceless obstruent /f/ and progress toward the voicing onset of /i/ in a single /ifi/ instance.
Figure 3.
(A) An asymmetric sigmoid function that is based on the adductory glottic angles (seen as circles on the curve) for a single /ifi/ instance. The maximum angle is the largest glottic angle during the voiceless obstruent. (B) Angular velocity curve determined from the sigmoidal fit. The maximum angular velocity is identified and divided by the maximum angle to calculate the kinematic stiffness ratio.
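The ratio described above (maximum angular velocity divided by maximum glottic angle) can be sketched in a few lines of Python. Note the substitution: instead of the asymmetric sigmoidal fit used in the study, this sketch estimates angular velocity from frame-to-frame differences, which is much coarser at 30 frames/s. The function name and the angle series are hypothetical.

```python
FRAME_RATE = 30.0  # frames per second, matching the video recordings

def kinematic_stiffness_ratio(angles_deg, frame_rate=FRAME_RATE):
    """Peak angular velocity divided by the maximum glottic angle, in 1/s.

    angles_deg: glottic angles (degrees) for one adductory gesture, from the
    maximum abductory angle to full adduction, one value per video frame.
    """
    # Differences between consecutive frames approximate angular velocity (deg/s).
    velocities = [(b - a) * frame_rate for a, b in zip(angles_deg, angles_deg[1:])]
    max_velocity = max(abs(v) for v in velocities)  # speed of closing, sign ignored
    max_angle = max(angles_deg)                     # largest angle during the obstruent
    return max_velocity / max_angle

# Illustrative gesture: the folds close from 40 degrees to 0 over six frame intervals.
angles = [40.0, 38.0, 30.0, 18.0, 8.0, 2.0, 0.0]
ratio = kinematic_stiffness_ratio(angles)
```

Because degrees cancel in the division, the result has units of 1/s, the same units as the stiffness ratios reported in this study.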
Reliability and Statistical Analyses
Interrater and intrarater reliability of angle identification was assessed using data from two of the 12 participants. Raters (the first and second authors) identified vocal fold edges for the semiautomatic determination of the vocal fold angles. Raters were blind to prior angle identifications. Statistical analysis of the angles using a relative intraclass correlation coefficient (ICC) indicated high reliability for both interrater and intrarater calculations: ICC(2, 1) = .91 and .92, respectively.
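For readers unfamiliar with the "(2, 1)" notation, the sketch below computes a two-way random-effects, single-rater intraclass correlation in the Shrout and Fleiss form. This is an assumption about which variant is meant; the text does not name the software or the exact formula used, and the ratings shown are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per target and one column per rater."""
    n = len(ratings)        # targets, e.g. individual glottic angles
    k = len(ratings[0])     # raters (interrater) or sessions (intrarater)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: targets, raters, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical glottic angles (degrees) measured by two raters on the same frames.
angles_by_rater = [[12.0, 13.5], [25.0, 24.0], [33.5, 35.0], [41.0, 40.0], [8.5, 9.0]]
reliability = icc_2_1(angles_by_rater)
```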
With Minitab statistical software (Version 17; Minitab Inc., State College, PA), a linear mixed-effects analysis of the relationship between kinematic stiffness ratios and RFF was performed. RFF offset cycle 10 and RFF onset cycle 1 were analyzed as fixed effects, whereas participant was entered as a random effect due to the repeated measures experimental design. The coefficient of determination (R²) and effect sizes (ηp²) were determined. Significance was set a priori at p < .05. Then, in order to determine the strength and direction of the relationship, per-participant Pearson product–moment correlation coefficients (r) were determined separately for RFF offset cycle 10 and RFF onset cycle 1 against the kinematic stiffness ratios.
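One plausible way to write the model described above is as a random-intercept model, shown here only as a sketch. The symbols are assumptions introduced for illustration: K is the kinematic stiffness ratio for participant i on task j, u is the participant-level random effect, and ε is the residual. The exact specification used in Minitab is not given in the text.

```latex
K_{ij} = \beta_0
       + \beta_1 \,\mathrm{RFF}^{\mathrm{offset}}_{ij}
       + \beta_2 \,\mathrm{RFF}^{\mathrm{onset}}_{ij}
       + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Under this reading, the hypothesized negative relationship corresponds to negative values of β₁ and β₂.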
Results
The 12 participants produced /ifi/ instances across seven tasks, generating data for RFF offset cycle 10, RFF onset cycle 1, and kinematic stiffness ratios. Of the possible 252 data points (12 participants × 7 voice tasks × 3 variables), 248 were analyzed. The four missing data points consisted of RFF offset cycle 10 only, in which there were fewer than three usable offsets to average for a given task. The missing data points were spread across participants and did not appear to influence the distribution of the data set. All data met the assumptions of the selected statistical model.
Across all voice tasks, RFF offset cycle 10 values ranged from −2.65 to 1.81 ST. For productions of typical speaking voice, the average RFF offset cycle 10 value was −0.69 ST (SD = 0.8, range = −1.34 to 1.48 ST), which is within typical ranges reported previously in healthy adults (Goberman & Blomgren, 2008; Robb & Smith, 2002; Stepp et al., 2010a, 2012). RFF onset cycle 1 ranged from −1.45 to 3.26 ST across all tasks. The typical speaking voice task produced an onset cycle 1 average of 1.83 ST (SD = 0.98, range = −0.32 to 3.03 ST), which is also within the typical range reported in recent studies with healthy individuals (Lien, 2015; Lien et al., 2015).
Kinematic stiffness ratios ranged from 7.3 to 31.9 1/s across all vocal tasks. Similar to other studies (A. Cooke et al., 1997; Stepp et al., 2010b), the breathy, typical speaking voice, and controlled rate (largo) tasks produced the lowest kinematic stiffness ratios (M = 14.0, 14.6, and 15.2 1/s, respectively). The maximal strain task produced the greatest kinematic stiffness ratios, with an average ratio of 22.3 1/s; the push–pull exercise elicited the next greatest, with an average of 21.6 1/s.
The linear mixed-effects analysis (see Table 2) revealed that the predictor variables accounted for 52% of the variance in the kinematic stiffness ratios (adjusted R² = .52). RFF offset cycle 10 was a significant predictor with a large effect size (ηp² = .29). RFF onset cycle 1 was also significant; however, analysis yielded only a medium effect size (ηp² = .08).
Table 2.
Linear mixed-effects analysis with kinematic stiffness ratios as the dependent variable.
| Variable | df | F | p |
|---|---|---|---|
| RFF offset cycle 10 | 1, 79 | 27.5 | <.001 |
| RFF onset cycle 1 | 1, 79 | 6.1 | .016 |
| Participant | 11, 79 | 8.2 | <.001 |
Note. RFF = relative fundamental frequency.
RFF offset cycle 10 and RFF onset cycle 1 were each regressed against the kinematic stiffness ratios for each participant (see Figures 4 and 5). Pearson correlation coefficients were calculated and reported for those that exhibited at least a moderate negative linear relationship (predetermined cutoff of r ≤ −.5). The range of correlation coefficients for RFF offset cycle 10 was r = −.90 to .20, with 83% of participants exhibiting at least a moderate negative relationship. RFF onset cycle 1 relationships were not as strong (r = −.79 to .46), with only 40% of participants exhibiting at least a moderate negative relationship between the measures.
Figure 4.
Scatter plots of relative fundamental frequency (RFF) offset cycle 10 and the kinematic stiffness ratios for each participant (S1 to S12). Data points have been averaged across each task. Individual correlation coefficients were determined, and regression lines were placed for participants with at least moderate negative correlations (r ≤ −.5). ST = semitones.
Figure 5.
Scatter plots of relative fundamental frequency (RFF) onset cycle 1 and the kinematic stiffness ratios for each participant (S1 to S12). Data points have been averaged across each task. Individual correlation coefficients were determined, and regression lines were placed for participants with at least moderate negative correlations (r ≤ −.5). ST = semitones.
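The per-participant screening described above (Pearson r for each participant, then a check against the r ≤ −.5 cutoff) can be sketched as follows. All values are invented for illustration and are not study data; the participant labels and variable names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

CUTOFF = -0.5  # "at least moderate negative" relationship

# Invented task-averaged values (seven tasks) for two hypothetical participants.
data = {
    "P1": {"rff_st": [-0.5, -1.2, -2.0, -0.3, -0.6, -1.0, -1.8],
           "stiffness": [14.5, 18.0, 22.0, 14.0, 15.0, 17.0, 21.5]},
    "P2": {"rff_st": [-1.0, -0.8, -0.6, -0.4, -0.2, 0.0, 0.2],
           "stiffness": [14.0, 15.0, 15.0, 16.0, 18.0, 17.0, 19.0]},
}

# Flag each participant whose correlation meets the cutoff, then take the share.
flags = {pid: pearson_r(d["rff_st"], d["stiffness"]) <= CUTOFF for pid, d in data.items()}
share_meeting_cutoff = sum(flags.values()) / len(flags)
```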
Discussion
RFF as an Estimate of Laryngeal Tension
Prior studies have identified the acoustic measure RFF as a potential objective indicator of laryngeal tension (Eadie & Stepp, 2013; Goberman & Blomgren, 2008; Stepp, 2013; Stepp et al., 2010a, 2012). The purpose of this study was to determine the relationship between RFF and a kinematic estimate of laryngeal stiffness to further elucidate the potential of RFF as a valid measure of laryngeal tension. The linear mixed-effects model supported our hypothesis that RFF offset cycle 10 and onset cycle 1 are significantly predictive of a kinematic stiffness ratio, with RFF offset cycle 10 accounting for a larger proportion of the variance in the model (large effect size) compared with RFF onset cycle 1 (medium effect size). Furthermore, individual direct regressions of RFF offset cycle 10 with kinematic stiffness ratios resulted in more than 80% of individuals exhibiting at least a moderate negative relationship. However, RFF onset cycle 1 correlations were not nearly as robust, with only 40% of participants exhibiting a moderate negative relationship. Therefore, our prediction of the negative linear relationship between RFF values and kinematic stiffness ratios was consistently observed only in RFF offset cycle 10 analyses. The relationship between RFF onset cycle 1 and kinematic stiffness ratios was less clear.
In agreement with the findings presented here, researchers have postulated that RFF offset cycles and onset cycles may capture different physiological phenomena. RFF offset cycle 10 appears to have more predictive power in distinguishing between healthy individuals and those with voice disorders. Stepp et al. (2012) reported significant differences in RFF offset cycle 10 between healthy controls and individuals with vocal hyperfunction, whereas RFF onset cycle 1 did not show any significant differences. In the same study, RFF offset cycle 10 had a moderately strong ability to accurately and specifically predict vocal hyperfunction in comparison with controls, whereas the prediction by RFF onset cycle 1 was closer to chance. However, RFF onset cycle 1 has been shown to significantly correlate with listener perceptions of vocal effort in individuals with spasmodic dysphonia and in healthy speakers who modulated their vocal effort (Eadie & Stepp, 2013; Lien et al., 2015). Therefore, although RFF onset cycle 1 may be more specific to listener perception of vocal tension, it is possible that RFF offset cycle 10 may be more sensitive to the small changes in tension that are not yet detected by a naïve listener (Eadie et al., 2010; Kelchner et al., 2010). Here, all of the participants who exhibited at least a moderate negative correlation (r ≤ −.5) between RFF onset cycle 1 and kinematic stiffness ratios also exhibited a moderate negative correlation between RFF offset cycle 10 and kinematic stiffness ratios. It should be noted, however, that an additional five participants exhibited moderate negative relationships in RFF offset cycle 10 regressions but did not have moderate correlations in their RFF onset cycle 1 regressions. The difference may be an indication of RFF offset cycle 10 being more sensitive to subtle changes in tension that are not yet perceptible to the listener.
To test this hypothesis, future work should analyze the relationship between RFF and both listener perception of effort and self-reported effort. Self-reports of vocal effort are thought to be the most accurate because the individual is able to account for both auditory and somatosensory feedback during voice production (Rosenthal, Lowell, & Colton, 2014). To date, reported relationships between listener- and self-reported vocal effort have been weak to moderate, with R² values ranging from .23 to .35 (Eadie et al., 2010). Therefore, examining the relationship between listener perceptions and self-perceptions, alongside the acoustic measure, may assist in determining whether RFF offset or onset is more accurate at predicting small amounts of change. One obstacle to implementing a perceptual component to the study as described here is the background noise emitted by the video nasendoscopy equipment. Although that did not affect the robustness of our RFF analysis, a perceptual study would require isolation of the acoustic signal from the light source.
Physiological Basis of RFF
RFF is hypothesized to rely on a combination of three physiological factors: aerodynamics, vocal fold kinematics (abduction), and tension (Lien et al., 2015; Stepp, Merchant, et al., 2011). In the model proposed by Stepp, Merchant, et al. (2011), these factors influence all aspects of the transition (voicing offset, the voiceless obstruent, and voicing onset) and are additive in nature. During typical voice productions, increased tension is evident just prior to, during, and just following the voiceless obstruent. Abduction of the vocal folds is crucial to the offset of voicing, whereas increased airflow is evidenced during the voiceless obstruent and in the onset of revoicing.
In healthy individuals, devoicing for unvoiced consonants is associated with elongation of the vocal folds due to tension of the cricothyroid (CT) muscle (Lofqvist, Baer, McGarr, & Story, 1989; Stevens, 1977), which would contribute to increases in both offset and onset RFF values (Halle & Stevens, 1971; Stevens, 1977). During the offset of voicing, the increase in tension of the CT muscle is hypothesized to be counterbalanced by the simultaneous effect of vocal fold abduction (Fukui & Hirose, 1983), which acts to decrease RFF. Thus, in healthy speakers, tension and abduction effects during voicing offset balance with one another, leading to RFF offset values that are relatively stable around 0 ST. In contrast, during voicing onset, the tension-driven increases in RFF are hypothesized to constructively sum with the increases in RFF thought to be created by initially increased airflow (Lofqvist, Koenig, & McGowan, 1995; Lofqvist & McGowan, 1992), resulting in high RFF onset values that gradually decrease to steady state (0 ST).
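As a concrete illustration of the semitone (ST) scale referred to above, the following minimal Python sketch converts per-cycle f0 values to RFF. It assumes the conventional semitone conversion, 12 × log2(f0 / reference f0), and assumes that offset cycles are normalized to the first offset cycle and onset cycles to the last onset cycle (the cycles closest to steady-state voicing). The function name and all f0 values are hypothetical and are not data from this study.

```python
import math


def rff_semitones(cycle_f0_hz, reference_f0_hz):
    """Express each cycle's f0 (Hz) in semitones relative to a reference f0 (Hz)."""
    return [12.0 * math.log2(f0 / reference_f0_hz) for f0 in cycle_f0_hz]


# Hypothetical f0 values (Hz) for the last 10 cycles before the voiceless
# obstruent (offset) and the first 10 cycles of revoicing (onset).
offset_f0 = [200.0, 200.5, 200.2, 199.8, 199.9, 200.1, 199.5, 199.0, 198.5, 198.0]
onset_f0 = [215.0, 212.0, 209.5, 207.0, 205.0, 203.5, 202.0, 201.0, 200.5, 200.0]

# Assumption: the reference is the cycle farthest from the obstruent,
# i.e., offset cycle 1 and onset cycle 10.
rff_offset = rff_semitones(offset_f0, offset_f0[0])
rff_onset = rff_semitones(onset_f0, onset_f0[-1])

print("RFF offset cycle 10 (ST):", round(rff_offset[-1], 2))
print("RFF onset cycle 1 (ST):", round(rff_onset[0], 2))
```

With these illustrative values, the offset cycles stay close to 0 ST, whereas onset cycle 1 is positive and the onset cycles decay toward 0 ST, mirroring the pattern described for healthy speakers.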
Individuals with increased laryngeal tension may have reductions in the typical effect tension has on the aerodynamic and abduction influences during both devoicing and voicing gestures, resulting in decreased RFF values (negative offset values and less positive onset values). This effect has been observed across a variety of populations and manipulations. Compared with healthy control speakers, RFF values are lower in individuals with vocal hyperfunction (Lien, 2015; Stepp et al., 2010a, 2012), Parkinson's disease (Goberman & Blomgren, 2008; Stepp, 2013), and spasmodic dysphonia (Eadie & Stepp, 2013). Likewise, individuals in our study exhibited the same reductions in RFF values while increasing their laryngeal tension. Although this effect was more consistent and pronounced in offset cycle 10 analyses than in onset cycle 1, the sensitivity of RFF offset cycles has yet to be fully understood. We hypothesize that the changes in offset cycle 10 may be due to the interaction of specific intrinsic laryngeal muscle tension during devoicing gestures.
Although at this time it is infeasible to measure individual muscle activation, computational models allow simulations of select intrinsic laryngeal muscle activations and, ultimately, allow us to make predictions about potential physiological interactions. The intrinsic laryngeal muscles of particular interest to this work are the CT and TA pairs. The CT muscle, which contributes to tension during voicing transitions in RFF (Lien et al., 2015; Stepp, Merchant, et al., 2011), acts to increase the length and tension of the vocal folds and raise f0 (Lowell & Story, 2006; Story, 2015). The TA, when activated without the influence of the CT, also raises f0 during typical speech (Titze, Luschei, & Hirano, 1989). However, cocontraction of these muscles does not necessarily result in increased f0, as one might expect. Two computational models examined this CT–TA relationship (Lowell & Story, 2006; Yin & Zhang, 2013) and reported that during instances of strong CT activation, simultaneous activation of the TA muscle resulted in a decreased f0. The authors argued that the simultaneous activation changed the patterns of stress on the vocal folds, increased stiffness in the body of the vocal folds, and resulted in reduced vibrational frequency. Thus, during instances of high CT tension, cocontraction of the TA can act in opposition and ultimately decrease f0. This is significant in light of the computational biomechanical model by Stepp et al. (2010b), which revealed a relationship between TA tension (in conjunction with the LCA) and kinematic stiffness ratios. Their model showed that increased tension in the TA muscle resulted in simultaneous increases in kinematic stiffness ratios during adduction. Therefore, we propose that increased activation of the TA would dysregulate the effects CT tension has during the transition of devoicing and voicing, ultimately leading to decreases in the frequency of vibration of the vocal folds. The result would be evident as reduced RFF values, as we observed here in offset cycle 10 and onset cycle 1, and simultaneous increases in kinematic stiffness ratios. We hypothesize that the dysregulation in the CT and TA muscles is a primary contributor to the changes observed in RFF values and kinematic estimates of stiffness in this study, although careful modeling of both the adductory kinematics and vibrational dynamics is warranted.
It should be noted, however, that some individuals did not exhibit the expected relationship during individual analyses between the selected RFF cycles and kinematic stiffness ratios. Although participants were able to produce variation in laryngeal tension, as evidenced by the range of kinematic stiffness ratios, RFF values did not always concurrently change (e.g., Figure 4, S11). It may be possible that these participants relied on different adductory muscle pairs (i.e., larger contribution of the LCA, as modeled in Stepp et al., 2010b), which may have resulted in changes in the kinematic estimates of stiffness but not changes in our acoustic measure.
Further research would benefit from analysis of muscle activation, in isolation as well as in combination, to determine how specific intrinsic laryngeal muscles may be contributing to the symptoms observed in different diagnoses of dysphonia (i.e., muscle tension dysphonia vs. adductory spasmodic dysphonia). A laryngeal model that takes cocontractions into account may elucidate the relationship between muscle activation patterns during devoicing and voicing gestures and further expand our theoretical understanding of laryngeal biomechanics.
Individual Versus Group Analysis
Acoustic, aerodynamic, and electroglottographic measures of voice commonly vary on an individual basis (Holmberg, Hillman, Perkell, & Gress, 1994; Holmberg, Hillman, Perkell, Guiod, & Goldman, 1995; Traunmuller & Eriksson, 2000). Researchers have attributed this variation to anatomical and physiological factors, which result in a wide range of acceptable productions that depend on the speaker and the speech task. RFF likewise appears to vary on an individual basis, with a wide range of correlations reported during individual regressions. Our results are similar to findings in Lien et al. (2015), who reported stronger relationships (reflected in coefficient of determination values) between RFF and aerodynamic measures during different levels of self-modulated vocal effort when analyzed by individual rather than by group. Similar to our study, Lien et al. (2015) reported a range of individual variation across speakers: Some individuals exhibited very weak relationships (R² = .04), whereas others had much stronger relationships (R² = .95). Thus, modulations of effort can vary by individual not only in the strategy used to produce effortful voice but also in the relationship to other physiologic measures suspected to indicate laryngeal tension. Therefore, at this time, RFF values may not be appropriate for a group-level analysis but rather should be interpreted on an individual basis.
RFF may be most appropriate for monitoring progress and change within an individual, as there is no current evaluation criterion with clear cutoff values to indicate a diagnosis of laryngeal tension. The clinical application of RFF is still in the preliminary stages, with only one study by Stepp, Merchant, et al. (2011) examining RFF as a monitoring tool for individuals with vocal hyperfunction during voice therapy. It is important to note that the authors argued that the difference within an individual, from before to after therapy, was the key component of their analysis, providing insight into how this acoustic measure may be used as an objective clinical measure. Future work is needed to determine the clinical application of this type of acoustic measure to those with tension-based voice disorders.
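The individual-level analysis discussed in this section can be sketched as a per-participant correlation between RFF and kinematic stiffness ratios. The minimal Python example below computes Pearson's r and the corresponding R² (which equals r² for a one-predictor linear regression) for each participant and flags relationships that meet the moderate negative criterion (r ≤ −.5). The participant labels and data values are hypothetical, and the sketch does not reproduce the linear mixed effect model or any statistical software used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-participant data: RFF (ST) paired with kinematic
# stiffness ratios (1/s) across productions of varying vocal effort.
participants = {
    "P1": {"rff": [-0.2, -0.8, -1.5, -2.1, -2.9], "stiffness": [10, 14, 19, 22, 30]},
    "P2": {"rff": [-0.5, -0.4, -0.6, -0.5, -0.4], "stiffness": [10, 14, 19, 22, 30]},
}

MODERATE_NEGATIVE = -0.5  # criterion: r <= -.5

summary = {}
for pid, data in participants.items():
    r = pearson_r(data["rff"], data["stiffness"])
    summary[pid] = {
        "r": r,
        "r_squared": r * r,  # simple regression: R^2 equals r^2
        "moderate_negative": r <= MODERATE_NEGATIVE,
    }

for pid, stats in summary.items():
    print(pid, round(stats["r"], 2), round(stats["r_squared"], 2), stats["moderate_negative"])
```

In this illustration, one hypothetical participant shows a strong negative relationship and the other does not, even though both produce the same range of stiffness ratios. Pooling such speakers into one group-level regression would obscure this difference.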
Limitations and Future Work
The current study reports findings on a small group of healthy speakers only. Although prior research has supported using healthy participants to identify changes in voice during modulated effort (Lien et al., 2015; Rosenthal et al., 2014), further research is needed in order to determine whether these measures continue to be predictive in individuals with tension-based voice disorders. The strategies utilized by healthy individuals to create vocal effort may not be the same as those used by individuals with tension-based voice disorders. Strategies for producing vocal effort include increased subglottic pressure, increased force during adductory vocal fold gestures at voicing onset, variable activation of the intrinsic laryngeal muscles, or a combination thereof (Rosenthal et al., 2014). Although these have the potential to vary between healthy speakers and those with dysphonia, dysphonic individuals can also exhibit anatomical changes, swelling, and pain that may result in physiologic consequences during voice production. Thus, whether individuals with voice disorders produce vocal effort in the same way as healthy individuals is not known and may ultimately be a factor in the generalizability of these results to individuals with voice disorders. To our knowledge, no study has provided differential information about the techniques for increasing vocal effort in healthy individuals and those with voice disorders; this information is essential to the further development of RFF.
Although RFF can be captured in both single utterances and running speech, it requires a voiceless obstruent situated between two sonorants. Therefore, our results reflect only findings from specific voicing offset–onset transitions and thus may not be consistent with voice transitions near a breath or pause. Studies that account for respiratory variability, voicing onset following a pause or inhalation, and voicing transitions during conversational speech would be an interesting way to expand the current work.
Additionally, the kinematic estimate of stiffness can be developed further as an indirect measure of tension. At this time, the biomechanical variable is only an estimate of stiffness. Quantification of vocal fold kinematics without direct in vivo measurement results in a ratio that is mass normalized across individuals and does not have units of stiffness. Although other indirect methods of quantifying laryngeal tension have been developed, such as electromyography, they have yielded results that are, at best, conflicting (Hocevar-Boltezar, Janko, & Zargi, 1998; Redenbaugh & Reich, 1989; Stepp, Heaton, et al., 2011; Stepp, Heaton, Jette, Burns, & Hillman, 2010; Van Houtte, Claeys, D'Haeseleer, Wuyts, & Van Lierde, 2013). Thus, kinematic estimates of stiffness are currently the most feasible indirect method of assessing laryngeal tension at the level of the vocal folds during speaker productions. Although it is not ideal to compare two indirect methods to each other (RFF vs. kinematic stiffness ratios), we believe this is the first step in working toward a better understanding of laryngeal tension.
We suggest that further development of advanced biomechanical models (e.g., Stepp et al., 2010b) that include multiple muscle pairings during tension simulations would enhance our understanding of kinematic vocal fold movements and laryngeal physiology. Furthermore, increasing the sampling rate of the video nasendoscopy would allow for more specific time-sensitive information on points of abduction and adduction during voicing offset and onset. The sampling rate provided using typical nasendoscopy equipment is only 30 frames/s, whereas utilization of high-speed video nasendoscopy can provide laryngeal images at more than 10,000 frames/s. Use of a high-speed system may also allow for development of an automated algorithm, which would help expedite the time-consuming technique currently used to extract and process the data. Additional work should be completed on increasing the accuracy and efficiency in collection of these data in individuals with voice disorders.
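To make the sampling-rate limitation concrete, the kinematic stiffness ratio (maximum adductory angular velocity normalized by the maximum glottic angle during the voiceless obstruent) can be sketched in Python as follows. The sketch assumes that glottic angles have already been extracted frame by frame and estimates velocity with simple frame-to-frame differences. That differencing scheme, the function name, and the angle values are assumptions for illustration and are not the extraction procedure used in this study.

```python
def kinematic_stiffness_ratio(angles_deg, frame_rate_hz):
    """Peak adductory (closing) angular velocity divided by maximum glottic angle.

    angles_deg: glottic angle per video frame, in degrees.
    frame_rate_hz: video sampling rate, in frames per second.
    Returns a ratio in 1/s; as noted in the text, it is not in units of stiffness.
    """
    dt = 1.0 / frame_rate_hz
    # Frame-to-frame angular velocity (deg/s); negative values indicate adduction.
    velocities = [(b - a) / dt for a, b in zip(angles_deg, angles_deg[1:])]
    max_adduction_velocity = max(-v for v in velocities)  # fastest closing speed
    max_angle = max(angles_deg)
    return max_adduction_velocity / max_angle


# Hypothetical glottic angle trace (degrees) sampled at 30 frames/s:
# the folds abduct for the voiceless obstruent, then adduct for revoicing.
angles = [2.0, 10.0, 20.0, 24.0, 15.0, 4.0, 1.0]
ratio = kinematic_stiffness_ratio(angles, frame_rate_hz=30.0)
print("Kinematic stiffness ratio (1/s):", round(ratio, 2))
```

In this hypothetical trace, the entire adductory gesture spans only three frame intervals at 30 frames/s, so the peak velocity rests on very few samples. A higher frame rate would give a finer estimate of the true maximum.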
Conclusions
RFF offset cycle 10 and RFF onset cycle 1 significantly predicted a kinematic estimate of laryngeal stiffness in healthy speakers during speaker-modulated effort, adding support for the validity of RFF as an acoustic indicator of laryngeal tension. RFF offset cycle 10 had more consistent and stronger negative correlations during individual regression against kinematic stiffness ratios compared with RFF onset cycle 1. RFF may be a feasible clinical tool that indicates laryngeal tension, but further research is needed to translate this acoustic measure to clinical populations.
Acknowledgments
This work was supported by National Institute on Deafness and Other Communication Disorders Grant DC012651. We thank Carolyn Michener, Talia Mittelman, and Meredith Cler of Boston University for their assistance with data processing.
References
- Angsuwarangsee T., & Morrison M. (2002). Extrinsic laryngeal muscular tension in patients with voice disorders. Journal of Voice, 16, 333–343.
- Awan S., Barkmeier-Kraemer J., Courey M., Deliyski D. D., Eadie T., Svec J., … Paul D. (2014, November). Standard clinical protocols for endoscopic, acoustic, and aerodynamic voice assessment: Recommendations from ASHA expert committee. Paper presented at the Annual Convention of the American Speech-Language-Hearing Association, Orlando, FL.
- Belafsky P. C., Postma G. N., Reulbach T. R., Holland B. W., & Koufman J. A. (2002). Muscle tension dysphonia as a sign of underlying glottal insufficiency. Otolaryngology—Head & Neck Surgery, 127, 448–451.
- Bhuta T., Patrick L., & Garnett J. D. (2004). Perceptual evaluation of voice quality and its correlation with acoustic measurements. Journal of Voice, 18, 299–304.
- Britton D., Yorkston K. M., Eadie T., Stepp C. E., Ciol M. A., Baylor C., & Merati A. L. (2012). Endoscopic assessment of vocal fold movements during cough. Annals of Otology, Rhinology & Laryngology, 121, 21–27.
- Chu S. Y., & Barlow S. M. (2009). Orofacial biomechanics and speech motor control. SIG 5 Perspectives on Speech Science and Orofacial Disorders, 19, 37–43.
- Cooke A., Ludlow C. L., Hallett N., & Selbie W. S. (1997). Characteristics of vocal fold adduction related to voice onset. Journal of Voice, 11, 12–22.
- Cooke J. D. (1980). The organization of simple, skilled movements. In Stelmach G. E. & Requin J. (Eds.), Tutorials in motor behavior (pp. 199–212). Amsterdam, the Netherlands: North-Holland.
- Dailey S. H., Kobler J. B., Hillman R. E., Tangrom K., Thananart E., Mauri M., & Zeitels S. M. (2005). Endoscopic measurement of vocal fold movement during adduction and abduction. Laryngoscope, 115, 178–183.
- Eadie T. L., Kapsner M., Rosenzweig J., Waugh P., Hillel A., & Merati A. (2010). The role of experience on judgments of dysphonia. Journal of Voice, 24, 564–573.
- Eadie T. L., & Stepp C. E. (2013). Acoustic correlate of vocal effort in spasmodic dysphonia. Annals of Otology, Rhinology & Laryngology, 122, 169–176.
- Feldman A. G. (1980). Superposition of motor programs—I. Rhythmic forearm movements in man. Neuroscience, 5, 81–90.
- Fukui N., & Hirose H. (1983). Laryngeal adjustments in Danish voiceless obstruent production. Annual Bulletin, Research Institute of Logopedics and Phoniatrics, 17, 61–71.
- Gallena S., Smith P. J., Zeffiro T., & Ludlow C. L. (2001). Effects of levodopa on laryngeal muscle activity for voice onset and offset in Parkinson disease. Journal of Speech, Language, and Hearing Research, 44, 1284–1299.
- Goberman A. M., & Blomgren M. (2008). Fundamental frequency change during offset and onset of voicing in individuals with Parkinson disease. Journal of Voice, 22, 178–191.
- Gökdoğan C., Gökdoğan O., Tutar H., Aydil U., & Yilmaz M. (2015). Speech range profile (SRP) findings before and after mutational falsetto (puberphonia). Journal of Voice, 30, 448–451.
- Halle M., & Stevens K. N. (1971). A note on laryngeal features. MIT Research Laboratory of Electronics Quarterly Progress Report, 101, 198–213.
- Hertrich I., & Ackermann H. (2000). Lip-jaw and tongue-jaw coordination during rate-controlled syllable repetitions. The Journal of the Acoustical Society of America, 107, 2236–2247.
- Hillman R. E., Holmberg E. B., Perkell J. S., Walsh M., & Vaughan C. (1989). Objective assessment of vocal hyperfunction: An experimental framework and initial results. Journal of Speech, Language, and Hearing Research, 32, 373–392.
- Hocevar-Boltezar I., Janko M., & Zargi M. (1998). Role of surface EMG in diagnostics and treatment of muscle tension dysphonia. Acta Oto-Laryngologica, 118, 739–743.
- Holmberg E. B., Hillman R. E., Perkell J. S., & Gress C. (1994). Relationships between intra-speaker variation in aerodynamic measures of voice production and variation in SPL across repeated recordings. Journal of Speech, Language, and Hearing Research, 37, 484–495.
- Holmberg E. B., Hillman R. E., Perkell J. S., Guiod P. C., & Goldman S. L. (1995). Comparisons among aerodynamic, electroglottographic, and acoustic spectral measures of female voice. Journal of Speech, Language, and Hearing Research, 38, 1212–1223.
- Hsiao T. Y., Liu C. M., Hsu C. J., Lee S. Y., & Lin K. N. (2001). Vocal fold abnormalities in laryngeal tension-fatigue syndrome. Journal of the Formosan Medical Association, 100, 837–840.
- Johns M. M. (2003). Update on the etiology, diagnosis, and treatment of vocal fold nodules, polyps, and cysts. Current Opinion in Otolaryngology & Head and Neck Surgery, 11, 456–461.
- Kelchner L. N., Brehm S. B., Weinrich B., Middendorf J., deAlarcon A., Levin L., & Elluru R. (2010). Perceptual evaluation of severe pediatric voice disorders: Rater reliability using the consensus auditory perceptual evaluation of voice. Journal of Voice, 24, 441–449.
- Kelso J. A., & Holt K. G. (1980). Exploring a vibratory systems analysis of human movement production. Journal of Neurophysiology, 43, 1183–1196.
- Kelso J. A., Vatikiotis-Bateson E., Saltzman E. L., & Kay B. (1985). A qualitative dynamic analysis of reiterant speech production: Phase portraits, kinematics, and dynamic modeling. The Journal of the Acoustical Society of America, 77, 266–280.
- Kempster G. B., Gerratt B. R., Abbott K. V., Barkmeier-Kraemer J., & Hillman R. E. (2009). Consensus Auditory-Perceptual Evaluation of Voice: Development of a standardized clinical protocol. American Journal of Speech-Language Pathology, 18(2), 124–132.
- Lien Y. A. (2015). Optimization and automation of relative fundamental frequency for objective assessment of vocal hyperfunction (Doctoral dissertation). Boston University, Boston.
- Lien Y. A., Gattuccio C. I., & Stepp C. E. (2014). Effects of phonetic context on relative fundamental frequency. Journal of Speech, Language, and Hearing Research, 57, 1259–1267.
- Lien Y. A., Michener C. M., Eadie T. L., & Stepp C. E. (2015). Individual monitoring of vocal effort with relative fundamental frequency: Relationships with aerodynamics and listener perception. Journal of Speech, Language, and Hearing Research, 58, 566–575.
- Lien Y. A., & Stepp C. E. (2014). Comparison of voice relative fundamental frequency estimates derived from an accelerometer signal and low-pass filtered and unprocessed microphone signals. The Journal of the Acoustical Society of America, 135, 2977–2985.
- Lofqvist A., Baer T., McGarr N. S., & Story R. S. (1989). The cricothyroid muscle in voicing control. The Journal of the Acoustical Society of America, 85, 1314–1321.
- Lofqvist A., Koenig L. L., & McGowan R. S. (1995). Vocal tract aerodynamics in /aCa/ utterances: Measurements. Speech Communication, 16, 49–66.
- Lofqvist A., & McGowan R. S. (1992). Influence of consonantal environment on voice source aerodynamics. Journal of Phonetics, 20, 93–110.
- Lowell S. Y., & Story B. H. (2006). Simulated effects of cricothyroid and thyroarytenoid muscle activation on adult-male vocal fold vibration. The Journal of the Acoustical Society of America, 120, 386–397.
- Ludlow C. L. (2009). Treatment for spasmodic dysphonia: Limitations of current approaches. Current Opinion in Otolaryngology & Head and Neck Surgery, 17, 160–165.
- Maryn Y., & Weenink D. (2015). Objective dysphonia measures in the program Praat: Smoothed cepstral peak prominence and acoustic voice quality index. Journal of Voice, 29, 35–43.
- Milstein C. F. (1999). Laryngeal function associated with changes in lung volume during voice and speech production in normal speaking women (Doctoral dissertation). University of Arizona, Tucson.
- Munhall K. G., & Ostry D. J. (1983). Ultrasonic measurement of laryngeal kinematics. In Titze I. R. & Scherer R. (Eds.), Vocal fold physiology: Biomechanics, acoustics and phonatory control (pp. 145–162). Denver, CO: Denver Center for the Performing Arts.
- Ostry D. J., Cooke J. D., & Munhall K. G. (1987). Velocity curves of human arm and speech movements. Experimental Brain Research, 68, 37–46.
- Ostry D. J., Keller E., & Palmer P. M. (1983). Similarities in the control of the speech articulators and the limbs: Kinematics of tongue dorsum movement in speech. Journal of Experimental Psychology: Human Perception and Performance, 9, 622–636.
- Ostry D. J., & Munhall K. G. (1985). Control of rate and duration of speech movements. The Journal of the Acoustical Society of America, 77, 640–648.
- Pannbacker M. (1998). Voice treatment techniques: A review and recommendations for outcome studies. American Journal of Speech-Language Pathology, 7, 49–64.
- Ramig L. O., & Verdolini K. (1998). Treatment efficacy: Voice disorders. Journal of Speech, Language, and Hearing Research, 41, S101–S116.
- Redenbaugh M. A., & Reich A. R. (1989). Surface EMG and related measures in normal and vocally hyperfunctional speakers. Journal of Speech and Hearing Disorders, 54, 68–73.
- Robb M. P., & Smith A. B. (2002). Fundamental frequency onset and offset behavior: A comparative study of children and adults. Journal of Speech, Language, and Hearing Research, 45, 446–456.
- Rosenthal A. L., Lowell S. Y., & Colton R. H. (2014). Aerodynamic and acoustic features of vocal effort. Journal of Voice, 28, 144–153.
- Roy N. (2003). Functional dysphonia. Current Opinion in Otolaryngology & Head and Neck Surgery, 11, 144–148.
- Roy N. (2008). Assessment and treatment of musculoskeletal tension in hyperfunctional voice disorders. International Journal of Speech-Language Pathology, 10, 195–209.
- Roy N., Ford C. N., & Bless D. M. (1996). Muscle tension dysphonia and spasmodic dysphonia: The role of manual laryngeal tension reduction in diagnosis and management. Annals of Otology, Rhinology & Laryngology, 105, 851–856.
- Roy N., & Leeper H. A. (1993). Effects of the manual laryngeal musculoskeletal tension reduction technique as a treatment for functional voice disorders: Perceptual and acoustic measures. Journal of Voice, 7, 242–249.
- Roy N., Mazin A., & Awan S. N. (2014). Automated acoustic analysis of task dependency in adductor spasmodic dysphonia versus muscle tension dysphonia. Laryngoscope, 124, 718–724.
- Schaeffer N., & Sidavi A. (2011). Toward a more quantitative measure to assess severity of dysphonia posttherapy: Preliminary observations. Journal of Voice, 25, E159–E165.
- Shiller D. M., Laboissiere R., & Ostry D. J. (2002). Relationship between jaw stiffness and kinematic variability in speech. Journal of Neurophysiology, 88, 2329–2340.
- Stager S. V., Bielamowicz S. A., Regnell J. R., Gupta A., & Barkmeier J. M. (2000). Supraglottic activity: Evidence of vocal hyperfunction or laryngeal articulation? Journal of Speech, Language, and Hearing Research, 43, 229–238.
- Stepp C. E. (2013). Relative fundamental frequency during vocal onset and offset in older speakers with and without Parkinson's disease. The Journal of the Acoustical Society of America, 133, 1637–1643.
- Stepp C. E., Heaton J. T., Braden M. N., Jette M. E., Stadelman-Cohen T. K., & Hillman R. E. (2011). Comparison of neck tension palpation rating systems with surface electromyographic and acoustic measures in vocal hyperfunction. Journal of Voice, 25, 67–75.
- Stepp C. E., Heaton J. T., Jette M. E., Burns J. A., & Hillman R. E. (2010). Neck surface electromyography as a measure of vocal hyperfunction before and after injection laryngoplasty. Annals of Otology, Rhinology & Laryngology, 119, 594–601.
- Stepp C. E., Hillman R. E., & Heaton J. T. (2010a). The impact of vocal hyperfunction on relative fundamental frequency during voicing offset and onset. Journal of Speech, Language, and Hearing Research, 53, 1220–1226.
- Stepp C. E., Hillman R. E., & Heaton J. T. (2010b). A virtual trajectory model predicts differences in vocal fold kinematics in individuals with vocal hyperfunction. The Journal of the Acoustical Society of America, 127, 3166–3176.
- Stepp C. E., Merchant G. R., Heaton J. T., & Hillman R. E. (2011). Effects of voice therapy on relative fundamental frequency during voicing offset and onset in patients with vocal hyperfunction. Journal of Speech, Language, and Hearing Research, 54, 1260–1266.
- Stepp C. E., Sawin D. E., & Eadie T. L. (2012). The relationship between perception of vocal effort and relative fundamental frequency during voicing offset and onset. Journal of Speech, Language, and Hearing Research, 55, 1887–1896.
- Stevens K. N. (1977). Physics of laryngeal behavior and larynx modes. Phonetica, 34, 264–279.
- Story B. H. (2015). Mechanisms of voice production. In Redford M. (Ed.), The handbook of speech production (pp. 34–58). West Sussex, United Kingdom: Wiley.
- Titze I. R., Luschei E. S., & Hirano M. (1989). Role of the thyroarytenoid muscle in regulation of fundamental frequency. Journal of Voice, 3, 213–224.
- Traunmuller H., & Eriksson A. (2000). Acoustic effects of variation in vocal effort by men, women, and children. The Journal of the Acoustical Society of America, 107, 3438–3451.
- Van Houtte E., Claeys S., D'Haeseleer E., Wuyts F., & Van Lierde K. (2013). An examination of surface EMG for the assessment of muscle tension dysphonia. Journal of Voice, 27, 177–186.
- Van Houtte E., Van Lierde K., & Claeys S. (2011). Pathophysiology and treatment of muscle tension dysphonia: A review of the current knowledge. Journal of Voice, 25, 202–207.
- Wuyts F. L., De Bodt M. S., & Van de Heyning P. H. (1999). Is the reliability of a visual analog scale higher than an ordinal scale? An experiment with the GRBAS scale for the perceptual evaluation of dysphonia. Journal of Voice, 13, 508–517.
- Yin J., & Zhang Z. Y. (2013). The influence of thyroarytenoid and cricothyroid muscle activation on vocal fold stiffness and eigenfrequencies. The Journal of the Acoustical Society of America, 133, 2972–2983.





