Author manuscript; available in PMC: 2023 Jan 16.
Published in final edited form as: J Voice. 2021 Jul 17;37(6):897–906. doi: 10.1016/j.jvoice.2021.05.019

Characterization of Primary Muscle Tension Dysphonia Using Acoustic and Aerodynamic Voice Metrics

Adrianna C Shembel, Jeon Lee, Joshua R Sacher, Aaron M Johnson
PMCID: PMC9762233  NIHMSID: NIHMS1849100  PMID: 34281751

Abstract

Objectives/Hypothesis.

The objectives of this study were to (1) identify optimal clusters of 15 standard acoustic and aerodynamic voice metrics recommended by the American Speech-Language-Hearing Association (ASHA) to improve characterization of patients with primary muscle tension dysphonia (pMTD) and (2) identify combinations of these 15 metrics that could differentiate pMTD from other types of voice disorders.

Study Design.

Retrospective multiparametric

Methods.

Random forest modeling, independent t-tests, logistic regression, and affinity propagation clustering were implemented on a retrospective dataset of 15 acoustic and aerodynamic metrics.

Results.

Ten percent of patients seen at the New York University (NYU) Voice Center over two years met the study criteria for pMTD (92 out of 983 patients); 65 patients with pMTD and 701 non-pMTD patients had complete data across all 15 acoustic and aerodynamic voice metrics. PCA plots and affinity propagation clustering demonstrated substantial overlap between the two groups on these parameters. The five parameters ranked highest in importance by the random forest models were (1) mean airflow during voicing (L/sec), (2) mean SPL during voicing (dB), (3) mean peak air pressure (cmH2O), (4) highest F0 (Hz), and (5) CPP mean vowel (dB); together they accounted for only 65% of variance. T-tests showed that three of these parameters were statistically significant: (1) CPP mean vowel (dB), (2) highest F0 (Hz), and (3) mean peak air pressure (cmH2O). However, the log2 fold change for each parameter was minimal.

Conclusion.

Computational models and multivariate statistical testing on 15 acoustic and aerodynamic voice metrics were unable to adequately characterize pMTD and determine differences between the two groups (pMTD and non-pMTD). Further validation of these metrics is needed with voice elicitation tasks that target physiological challenges to the vocal system from baseline vocal acoustic and aerodynamic output. Future work should also place greater focus on validating metrics of physiological correlates (eg, neuromuscular processes, laryngeal-respiratory kinematics) across the vocal subsystems over traditional vocal output measures (eg, acoustics, aerodynamics) for patients with pMTD.

Level of Evidence.

II

Keywords: Muscle tension dysphonia, instrumental acoustic, aerodynamic, voice diagnostics, random forest, regression, affinity clustering

INTRODUCTION

Recommended Protocols for Instrumental Assessment of Voice by the American Speech-Language-Hearing Association (ASHA) Expert Panel include laryngoscopic, acoustic, and aerodynamic assessment.1 Acquisition of these metrics in the clinic is standard for individuals with primary Muscle Tension Dysphonia (pMTD), a voice disorder without overt injury, illness, structural, or neurological causes.2–5 Although variants on pMTD definitions and nomenclature exist, the Classification Manual for Voice Disorders-1 defines the most commonly characterized presentation of pMTD in the literature.4–16 Specifically, the Classification Manual for pMTD includes vocal impairments that impact activities of daily living within the context of absent organic pathology but presence of supraglottic constriction or vocal fold hyper- or hypo-adduction during phonation on laryngoscopy.5 Laryngoscopy has been advocated as a diagnostic tool to define pMTD and distinguish pMTD from other types of voice disorders.14,17

However, laryngoscopy is not readily available in many clinical and academic institutions, which forces clinicians to resort to acoustic and aerodynamic metrics of vocal output to characterize pMTD. Furthermore, there is discrepancy in the literature on whether laryngoscopy can adequately differentiate laryngeal kinematic patterns in patients with pMTD from laryngeal patterns in individuals without voice disorders.18–20 Acoustic and aerodynamic metrics are standard protocols in the clinic. However, the ability of these metrics to define pMTD and reliably distinguish pMTD from other types of voice disorders has not been well-vetted. Previous studies on pMTD have either not utilized standard metrics recommended by ASHA as study outcomes or have selected one or two outcomes from the list of recommendations. These methodological shortcomings have created incongruence between expert recommendations and empirical evidence for the pMTD population. To further confound the issue, variations in voice elicitation tasks, poorly defined inclusion criteria, and small sample sizes have led to substantial disagreement across pMTD research studies.21–25

The conventional use of univariate, mean-based statistical methods on a clinical population with known heterogeneity is yet another methodological shortcoming.21,22,25–27 The use of univariate designs and linear statistical models (eg, t-tests, ANOVAs) is appealing because it simplifies complex biological processes like voice production while aligning with the traditional scientific method. However, isolating dependent variables can lead to missed nuances across the multidimensionality of vocal production. Therefore, conceptual and analytic strategies that address complexities in how the voice is produced are needed. Furthermore, heterogeneity within the pMTD population yields study outcomes with large within-group statistical variance and robust overlap between healthy and disordered voices. Between the multidimensional nature of the voice and heterogeneity in pMTD clinical presentations, it is unlikely that any one voice metric (especially the mean and standard deviation of that metric) can sufficiently define the pMTD population in its entirety.

Taken together, all of these methodological shortcomings have made it challenging to identify reliable inclusion criteria parameters for research studies and valid outcome measures for experimental designs for the pMTD population.28–36 These shortcomings have also hindered clinical care. They have led to a heavy reliance on exclusionary diagnostics and delays in effective treatment for patients with pMTD, all of which place financial and resource burdens on the patient and medical system.4,37–40

Fortunately, computational modeling and multivariate methods can address some of these methodological shortcomings. Computational modeling can identify combinations of metrics that best define the pMTD population from a comprehensive list of acoustic and aerodynamic metrics recommended by ASHA. Multivariate analysis can also inform the strength of these voice metric clusters and determine the ability of these groups of metrics to reliably distinguish pMTD from other types of voice disorders. The use of accurate metrics with specificity to the target population is most cost-effective and empirically sound. Results of these methodological approaches can inform clinicians whether acquisition of all recommended metrics is necessary for the pMTD population or whether certain metrics are more important than others for pMTD. Not only can these methods improve characterization of patients diagnosed with pMTD but they can also streamline the clinical voice assessment process. Improved characterization via validated metrics can also help guide investigators in determining which subjects best meet study inclusion criteria for research studies using the pMTD population.

The first study objective was to identify clusters of parameters that could define patients with pMTD using 15 standard acoustic and aerodynamic voice metrics recommended by ASHA. The second study objective was to identify combinations of these recommended metrics that could distinguish pMTD from other types of voice disorders. Using four different computational modeling and multivariate methods on a retrospective dataset of patients at the NYU Voice Center over a two-year time span, we aimed to address these objectives.

METHODS

Data acquisition

Participants

Retrospective clinical data collected at the NYU Voice Center in the Department of Otolaryngology, NYU Langone Medical Center from October 2017 to May 2019 were queried from the NYU Langone electronic medical record system (EPIC) using in-house software by the NYU DataCore. The protocol was approved by NYU Langone’s Institutional Review Board. Out of the total 983 patients who were seen at the NYU Voice Center with voice-related complaints across the two-year time span, 92 patients met the study inclusion criteria for the pMTD group (criteria detailed in Data Preparation). The 10% prevalence in our dataset aligns with previous literature demonstrating that the percentage of patients with muscle tension dysphonia seen in voice clinics ranges from 10% to 40%.3,11,41–45 The low end of the range represented in our dataset is likely due to the conservative approaches used to bin patients into the pMTD group (described in detail in the Data Preparation section). Of all the possible patients seen at the NYU Voice Center over two years, 65 patients in the pMTD cohort and 701 patients diagnosed with other voice disorders had full acoustic and aerodynamic data across all 15 voice assessment metrics. The patients without complete data sets were those whose recordings were excluded at the time of their initial voice evaluation appointment, either due to faulty equipment or abnormal data output (282 missing from non-pMTD cohort; 27 missing from pMTD cohort). Average age of patients with pMTD was 40.15 years (SD = 17.21) and 64% of these patients were women.

Diagnostics

“Muscle tension dysphonia” was defined by the essential features and classification criteria from the Classification Manual for Voice Disorders-1.5 Patients received a diagnosis of pMTD if they had self-described vocal impairments impacting activities of daily living in the absence of organic pathology and presence of supraglottic constriction or vocal fold hyper- or hypo-adduction during phonation on laryngoscopy. Many patients also had self-reported complaints of pain on swallowing and/or phonation and vocal fatigue (see Associated Features in Classification Manual).5 Laryngeal structure as well as supraglottic and vocal fold movement patterns were confirmed with halogen and stroboscopic light, respectively, using a standard rigid or flexible laryngoscope (PENTAX Medical). Laryngeal tasks included sustained vowels on comfortable pitch and loudness, pitch glides, loud and soft phonation, and sniff-/i/. Vibratory opening, closing, and closed phase patterns were confirmed with at least three videostroboscopic glottal cycles.1 All diagnoses were made by one of two board certified, fellowship trained laryngologists at the NYU Voice Center. Because laryngoscopy is not available to many SLPs, we used this parameter for inclusionary criteria only. Patients with structural (eg, vocal fold lesions) or neurological (eg, vocal fold immobility) voice-related impairments or other systemic issues that could contribute to voice deficits (eg, allergic rhinitis) were excluded from the pMTD cohort. No patients were excluded from the analysis as long as they had a complete acoustic and aerodynamic dataset.

Acoustic and aerodynamic data acquisition

The 15 standard acoustic and aerodynamic vocal output parameters collected at the NYU Voice Center are listed in Table 1. Methods to acquire these data were chosen based on ASHA’s Recommended Protocols for Instrumental Assessment of Voice (for comprehensive description of acoustic and aerodynamic data acquisition, processing, and analysis methods, refer to Patel et al, 2018).1

TABLE 1.

Fifteen Acoustic and Aerodynamic Voice Assessment Metrics

| Parameters | Voice Elicitation Task | Metric |
|---|---|---|
| I. Acoustics | (a) Sustained /a/ vowel | 1. Mean fundamental frequency (F0) of sustained /a/ vowel |
| | | 2. Mean cepstral peak prominence (CPP) (dB) of sustained /a/ vowel |
| | | 3. Standard deviation CPP (SD dB) of sustained /a/ vowel |
| | | 4. Cepstral-Spectral Index of Dysphonia (CSID) vowel |
| | (b) Pitch glide | 5. Highest F0 achieved with upward pitch glide |
| | | 6. Lowest F0 achieved with a downward pitch glide |
| | | 7. Pitch glide range in semitones (ST) |
| | (c) All-voiced CAPE-V sentence | 8. Mean CPP (dB) of all-voiced speech sample |
| | | 9. Standard deviation CPP (SD dB) of all-voiced speech sample |
| | | 10. Cepstral-Spectral Index of Dysphonia (CSID) speech |
| | (d) Spontaneous speech sample | 11. Mean speaking F0 of spontaneous speech sample |
| | | 12. Semitone (ST) range of spontaneous speech sample |
| II. Aerodynamics | /pa:pa:pa:pa:pa/ utterance | 13. Mean airflow during voicing (L/sec) |
| | | 14. Mean peak air pressure (cmH2O) |
| | | 15. Mean SPL during voicing (dB) |

Voice recordings.

In brief, acoustic voice samples with (1) three sustained /a/, (2) pitch glides on a sustained /a/ vowel to the comfortably lowest and highest pitches, (3) the six Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) sentences, and (4) brief spontaneous speech sample (“tell me how you got here today”) were recorded with a head-mounted cardioid condenser microphone, microphone preamplifier, and digital converter using the Computerized Speech Lab system (CSL Model 4500B, PENTAX Medical). Samples were acquired at a distance of 4–10 cm from the lips at a 45° angle at a sampling rate of 44.1 kHz. These voice samples are collected on all patients with voice-related complaints at the time of their initial voice evaluation by a speech-language pathologist at the NYU Voice Center.

Acoustics.

Frequency measures (fundamental frequency [F0], semitone range [ST]) were acquired from the steady state vowel, pitch glides, and spontaneous speech sample using the Computerized Speech Lab (Model 4500B, PENTAX Medical). Cepstral measures (cepstral peak prominence [CPP], cepstral peak prominence standard deviation [CPP SD], and cepstral spectral index of dysphonia [CSID]) were acquired from the steady state vowel and all-voiced CAPE-V sentence “we were away a year ago” using the Analysis of Dysphonia in Speech and Voice (ADSV) software (PENTAX medical). These metrics were all collected and stored in each patient’s electronic medical record at the time of the initial voice assessment and were available in numerical format for automated query.

Aerodynamics.

Aerodynamic measures were based on the middle three peaks of the /pa:pa:pa:pa:pa/ utterance through the pneumotachograph and catheter between the lips, resting on top of the tongue intraorally. Estimated oral airflow from the vowel, peak subglottal pressure from the bilabial voiceless stop consonant, and sound pressure level (dB SPL) were acquired automatically using the Phonatory Aerodynamic System (PAS, Model 6600, PENTAX medical). Aerodynamic metrics are all collected at the time of the initial voice assessment and numerical data were available at the time of the EPIC data query for this study.

Data preparation

Data preparation, computational modeling, and statistical analysis were performed in RStudio (version 1.2.5033) with R (version 4.0.0). Rows with missing data were first removed from the dataset. Data with ICD-9 and ICD-10 codes demarcating pMTD (defined as patients with voice complaints without structural or neurological voice issues or other pathology that could explain vocal impairments) were coded as 1 (see Supplemental Table 1A for the complete list of MTD-specific diagnoses). Patients with no more than one other concomitant diagnosis known to co-occur with pMTD were still included in the pMTD cohort (ie, coded as 1; complete list of potential concomitant disorders in Supplemental Table 1B).3,44 If more than one diagnosis listed in Supplemental Table 1B was included with the pMTD-specific ICD codes in the patient’s chart, the patient was coded as 0. This decision was made in an attempt to keep the group as homogeneous as possible and to reduce the confounding effects that other common co-occurring conditions (eg, psychological disorders, laryngeal breathing dysfunction, chronic cough) could have on outcome metrics. Patients who were diagnosed with any other structural, neurological, or anatomical voice disorders or other pathologies that could account for their vocal impairments were also coded as 0 (see Supplemental Table 1C for the complete list).

Computational modeling and statistical analysis

The complete dataset was tested for normality using the Shapiro-Wilk test. Once it was confirmed that the assumption of normality had been met (P > 0.05), the data were z-transformed so that all voice metrics were on the same dimensional scale. Data clipping was implemented on extremely low or high values (beyond ±3 standard deviations from the mean, corresponding to the 0.0013 and 0.9987 quantiles, ie, the 0.13th and 99.87th percentiles, respectively). Four statistical and computational methods were used to determine whether any combinations of acoustic and aerodynamic parameters could characterize pMTD and distinguish pMTD from other voice disorders: random forest modeling, independent t-tests, logistic regression, and affinity propagation clustering.
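The scaling and clipping steps can be illustrated with a minimal sketch. The study itself was run in R; this Python version, the function names (`z_transform`, `clip`, `normal_cdf`), and the airflow values are illustrative assumptions rather than the authors' code.

```python
import math
import statistics

def z_transform(values):
    """Rescale values to mean 0 and standard deviation 1 (sample SD)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def clip(z_scores, limit=3.0):
    """Pull any z-score beyond +/-limit back to the limit."""
    return [max(-limit, min(limit, z)) for z in z_scores]

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical mean airflow values (L/sec); the last one is an extreme recording.
airflow = [0.15] * 10 + [0.18] * 9 + [1.20]
clipped = clip(z_transform(airflow))

print(max(clipped))                      # largest z-score after clipping
print(normal_cdf(-3.0), normal_cdf(3.0)) # tail quantiles at -3 SD and +3 SD
```

Under a normal distribution, the quantiles printed on the last line are approximately 0.0013 and 0.9987, which is where the clipping thresholds quoted in the text come from.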

Random forest computational modeling

Random forest algorithms are classification methods that work by constructing numerous decision trees to distinguish groups. Parameters that are frequently detected in the constructed decision trees are ranked by their importance values. The level of importance of each classifying parameter is determined by averaging across all the decision trees and finding the parameters with the highest values across the generated data matrix.46,47 The random forest model consisted of 500 decision trees built with 10 randomly selected variables at each split. The random forest model was replicated 1000 times using an independent random selection of non-pMTD data points equal to the pMTD dataset size for each replicate. The average importance of each acoustic and aerodynamic parameter was calculated and the features were ranked from most important to least important in the model. MeanDecreaseGini (mean decrease in Gini Index) was used to determine the importance score of each parameter within the set of 15 total parameters. This method determines parameter importance by calculating the average of the parameter’s total decrease in node impurity weighted by the proportion of samples reaching that node in each individual decision tree. Principal component analysis (PCA) plots with the parameters of top-five importance were generated based on the random forest model results.
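Two ingredients of this procedure can be sketched briefly: drawing a balanced replicate (a random non-pMTD subsample equal in size to the pMTD group) and computing the weighted Gini impurity decrease for a single split. This is a hedged Python illustration of the quantities described above, not the R implementation used in the study; the function names and the 0/1 label coding are assumptions.

```python
import random

def gini_impurity(labels):
    """Gini impurity of a node holding binary labels (1 = pMTD, 0 = non-pMTD)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def gini_decrease(parent, left, right, n_total):
    """Impurity decrease of one split, weighted by the share of samples reaching the node."""
    n = len(parent)
    children = (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
    return (n / n_total) * (gini_impurity(parent) - children)

def balanced_replicate(pmtd_rows, non_pmtd_rows, rng):
    """Combine all pMTD rows with an equally sized random draw of non-pMTD rows."""
    return pmtd_rows + rng.sample(non_pmtd_rows, len(pmtd_rows))

rng = random.Random(0)                # fixed seed so the sketch is repeatable
pmtd_ids = list(range(65))            # placeholder row ids for the 65 pMTD patients
non_pmtd_ids = list(range(65, 766))   # placeholder row ids for the 701 non-pMTD patients
replicate = balanced_replicate(pmtd_ids, non_pmtd_ids, rng)

# A split that separates the two classes perfectly removes all impurity at that node.
print(gini_decrease([0, 0, 1, 1], [0, 0], [1, 1], n_total=4))
```

Averaging such decreases per parameter over all trees, and then over the 1000 replicates, yields an importance ranking of the kind reported in the Results.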

Statistical analysis with independent t-tests

Independent t-tests were run on the complete acoustic and aerodynamic dataset to identify vocal output parameters with statistically significant differences between pMTD and non-pMTD patients. P-values were adjusted for false discovery rate, log2 fold changes were calculated, and the parameters were ranked by their adjusted P-values. PCA plots on acoustic and aerodynamic metrics that were significantly different (P < 0.05) were generated based on the results of these t-tests.
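The text does not name the specific false discovery rate procedure or how fold change was computed. As a sketch under stated assumptions, the Benjamini-Hochberg adjustment (a common FDR method) and a log2 ratio of raw group means could look like this in Python; the function names and example P-values are hypothetical.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices from smallest to largest P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so each adjusted value never exceeds the next one up.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def log2_fold_change(group_a, group_b):
    """log2 of the ratio of group means (assumes positive, unscaled values)."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return math.log2(mean_a / mean_b)

# Hypothetical raw P-values for four metrics.
raw_p = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw_p))
print(log2_fold_change([2.0, 2.0], [1.0, 1.0]))
```

A parameter can therefore pass the unadjusted 0.05 threshold yet fail after adjustment, which is the pattern Table 2B shows for several metrics.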

Model optimization with logistic regression

The acoustic and aerodynamic parameters of importance from the random forest model and the statistically significant parameters identified with t-tests were compared to determine commonalities. Logistic regression was then implemented using the common parameters from the random forest model (importance ranked) and t-tests (statistically significant) as a baseline model. Additional investigations were conducted on whether adding any of the parameters not already among the common parameters improved predictive power. Finally, optimal parameters were defined as the union of the common parameters and any parameters that added predictive power. The Hosmer and Lemeshow goodness of fit (GOF) chi-square test was calculated for the model. A PCA plot was also generated based on these optimal parameters.

Affinity propagation clustering (APCluster)

Cluster analysis works by organizing parameters into clusters based on group similarities across the entire data set. AP clustering, specifically, involves an algorithm that measures similarities among data points and forms clusters of data points based on their similarities. First, this model is advantageous because clustering is done based on the similarities of observed data samples and not on hypothetical averages of cluster samples (eg, k-means).48 Second, it allows for agnostic identification of discrete groups or relevant variables without a priori selection bias. Finally, it is beneficial when the number of clusters is unknown and an unsupervised learning algorithm is therefore required to detect groups. This approach was used to determine whether acoustic and aerodynamic metrics could inform diagnosis and distinguish pMTD from other types of voice disorders. AP clustering was run on the optimal parameters previously identified from the complete dataset. The algorithm automatically determined the optimal number of clusters to be three, which was the same number of clusters determined by optimal gap statistics (Figure 1A). UMAP and PCA plots were generated across the AP clusters to visualize pMTD versus non-pMTD data points on these metrics across and within cluster groups (Figures 1B and 1C).
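For readers unfamiliar with affinity propagation, the message-passing idea can be sketched in plain Python. This is a simplified illustration, not the APCluster implementation used in the study: the similarity (negative squared Euclidean distance), the median-like preference, the damping value, the fixed iteration count, and the toy data points are all assumptions.

```python
def affinity_propagation(points, damping=0.5, iterations=200):
    """Return, for each point, the index of the exemplar (cluster center) it selects."""
    n = len(points)
    # Similarity between two points: negative squared Euclidean distance.
    s = [[-sum((a - b) ** 2 for a, b in zip(points[i], points[k])) for k in range(n)]
         for i in range(n)]
    # Preference (self-similarity): a median-like value of the off-diagonal similarities.
    off_diag = sorted(s[i][k] for i in range(n) for k in range(n) if i != k)
    preference = off_diag[len(off_diag) // 2]
    for k in range(n):
        s[k][k] = preference
    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # Responsibility: how well suited k is as exemplar for i, relative to rivals.
        for i in range(n):
            scores = [a[i][k] + s[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - best_other)
        # Availability: accumulated support from other points for choosing k.
        for k in range(n):
            positive = [max(0.0, r[i][k]) for i in range(n)]
            support = sum(positive) - positive[k]  # positive responsibilities from points other than k
            for i in range(n):
                if i == k:
                    new_value = support
                else:
                    new_value = min(0.0, r[k][k] + support - positive[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new_value
    # Each point selects the candidate with the largest combined message.
    return [max(range(n), key=lambda k: a[i][k] + r[i][k]) for i in range(n)]

# Toy two-metric z-scores forming two well-separated groups (illustrative values only).
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (6.0, 0.0), (7.0, 0.0), (8.0, 0.0)]
labels = affinity_propagation(points)
print(labels)
```

Note that the number of clusters is never passed in; it emerges from the preference value and the data, which is the property the text highlights as an advantage over methods such as k-means.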

FIGURE 1.

Gap statistics and AP clustering model. (A) Gap statistics analysis results for determining optimal number of clustering, (B) UMAP for AP Clustering, and (C) PCA Plot for AP Clustering. Outcomes were based on the four optimal parameters (highest F0, CPP vowel mean, mean airflow, and mean peak pressure) identified with computational and statistical modeling from the 15 total acoustic and aerodynamic voice metrics across pMTD and non-pMTD groups.

RESULTS

Figure 2 represents PCA plots of all parameters (full dataset, Figure 2A), important parameters identified with random forest models (Figure 2B), statistically significant parameters identified with independent t-tests (Figure 2C), and optimal predictor parameters based on logistic regression (Figure 2D). These PCA plots demonstrate substantial overlap in acoustic and aerodynamic metrics across the two groups (pMTD and non-pMTD), regardless of the computational or statistical approach implemented on the dataset.

FIGURE 2.

PCA plots of (A) all acoustic and aerodynamic parameters (full dataset); (B) top five most important parameters determined by random forest model; (C) statistically significant parameters determined by independent t-tests; and (D) optimal predictor parameters based on logistic regression. These data show robust overlap between pMTD and non-pMTD groups and bring into question the clinical utility of standard acoustic and aerodynamic voice assessment metrics for pMTD diagnostics in the clinical setting and as inclusionary criteria for research studies.

The top five parameters by ranked importance in the random forest model were, from most to least important: (1) mean airflow during voicing (L/sec), (2) mean SPL during voicing (dB), (3) mean peak air pressure (cmH2O), (4) highest F0 (Hz), and (5) CPP mean vowel (dB) (Table 2A). However, these parameters only accounted for 65% of the variance within the dataset (Goodness of Fit test: chi-squared = 9.94, P = 0.269). Additionally, three parameters remained statistically significant after FDR adjustment of the P-values from the independent samples t-tests: (1) CPP mean vowel (dB), (2) highest F0 (Hz), and (3) mean peak air pressure (cmH2O); of note, pitch glide range (in semitones), CSID vowel, mean airflow during voicing (L/sec), and speaking range (in semitones) were significant only prior to FDR P-value adjustment (Table 2B). However, the log2 fold change for each parameter was minimal (see Table 2B for details). The four parameters that were deemed “optimal” in that they both met the 0.05 level of statistical significance for the independent samples t-test and were among the top five parameters in the random forest model were (1) highest F0, (2) CPP vowel mean, (3) mean airflow, and (4) mean peak pressure.

TABLE 2.

Acoustic and Aerodynamics Voice Metrics Ranked by Level of Importance and Statistical Significance

Columns marked (A) are from the random forest model; columns marked (B) are from the independent t-tests.

| Vocal Output Parameters | (A) Rank | (A) Importance score | (A) Importance in model | (B) P-value | (B) Adjusted P-value (FDR) | (B) Log2 fold change |
|---|---|---|---|---|---|---|
| Mean airflow during voicing (L/sec) | 1 | 7.49 | Yes | 0.04* | 0.09 | 0.05 |
| Mean SPL during voicing (dB) | 2 | 5.43 | Yes | 0.73 | 0.86 | 0.01 |
| Mean peak air pressure (cmH2O) | 3 | 4.63 | Yes | 0.00* | 0.02* | 0.34 |
| Highest F0 (Hz) | 4 | 4.56 | Yes | 0.00* | 0.00* | −0.26 |
| Mean CPP vowel (dB) | 5 | 4.07 | Yes | 0.00* | 0.00* | −0.42 |
| CPP speech standard deviation (SD dB) | 6 | 3.23 | No | 0.40 | 0.67 | −0.03 |
| Pitch glide range (semitones) | 7 | 3.09 | No | 0.03* | 0.09 | −0.30 |
| Speaking range (semitones) | 8 | 3.03 | No | 0.04* | 0.09 | −0.28 |
| CPP speech mean (dB) | 9 | 3.00 | No | 0.80 | 0.86 | 0.00 |
| Speaking F0 standard deviation (Hz SD) | 10 | 2.97 | No | 0.47 | 0.71 | 0.07 |
| Speaking F0 mean (Hz) | 11 | 2.85 | No | 0.78 | 0.86 | 0.00 |
| CSID speech | 12 | 2.79 | No | 0.07 | 0.14 | 0.19 |
| CPP vowel standard deviation (SD dB) | 13 | 2.71 | No | 0.60 | 0.82 | −0.02 |
| Lowest F0 (Hz) | 14 | 2.45 | No | 0.88 | 0.88 | −0.02 |
| CSID vowel | 15 | 2.18 | No | 0.04* | 0.09 | 0.18 |

* statistically significant (P < 0.05)

Results with AP clustering on those four optimal parameters (highest F0, CPP vowel mean, mean airflow, and mean peak pressure) showed the majority of patients with pMTD were in Cluster 2, with substantial overlap with non-pMTD patients in the same cluster (c.f., Figures 1B and 1C) (Supplemental Table 2 for complete list of all ICD codes in dataset, divided by cluster). Optimal features divided by group and cluster demonstrated Cluster 3 had lower CPP mean vowels and Cluster 1 higher mean peak pressures (Figure 3).

FIGURE 3.

Box and whisker plots of optimal parameters, divided by (A) cluster and (B) cluster and voice disorder group (pMTD, non-pMTD). The higher mean peak air pressure values in Cluster 1 (the cluster without pMTD) and the lower CPP values in Cluster 3 (the cluster with few pMTD patients) are in contrast to the relatively normal vocal output parameters in Cluster 2 (the cluster with the most pMTD patients). The high overlap between pMTD and non-pMTD in Cluster 2 (c.f., Figure 1) and mean z-scored values closer to zero suggest their commonalities could have more to do with the mild nature of their vocal output measures and less to do with the parameter itself.

DISCUSSION

The overarching objectives of this study were to identify multiparametric clusters of acoustic and aerodynamic voice metrics that could characterize patients with pMTD and differentiate these patients from those with other types of voice disorders. Random forest models, independent t-tests, logistic regression, and affinity propagation clustering were implemented on a complete dataset of 15 acoustic and aerodynamic voice metrics recommended by ASHA. The breadth of overlap between the pMTD and non-pMTD groups across metrics seen with PCA plots and AP clustering suggests these metrics are insufficient in their abilities to define pMTD and distinguish patients with pMTD from those with other types of voice disorders.

Although the majority of pMTD patients were in Cluster 2 with AP clustering, this cluster also included voice disorders that were organic (eg, nodules, edema, neoplasms), neurological (eg, vocal fold paralysis/paresis, Parkinson’s Disease), and structural (eg, laryngeal web, stenosis) in nature. The majority of the voice-disorder-specific codes in Cluster 2 were also present in Clusters 1 and 3. These findings suggest acoustic and aerodynamic voice metrics have little ability to cluster patients by medical diagnosis and exemplify the inability of these metrics to distinguish pMTD from other types of voice disorders. The minimal fold changes and only moderate contributions of the optimal parameters to the computational models also bring into question the utility of these metrics for these same purposes. These findings suggest acoustic and aerodynamic metrics may not serve to improve diagnostic specificity in the clinic or inclusion criteria for future studies (although previous work has shown that these metrics may be robust in determining pre- and post-treatment outcomes).49

Of note, although there were no pMTD patients in Cluster 1, the information provided with this approach is no different from the exclusionary diagnostic approaches currently available in the clinic (eg, identifying vocal fold nodules on laryngoscopy, and thereby excluding pMTD as a diagnosis). Stated differently, we continue to lack robust inclusionary criteria to distinguish pMTD from other disorders, as evidenced by the high overlap between pMTD and non-pMTD voice disordered groups in Cluster 2.

One reason for the lack of robust differences found with these metrics may have to do with the type of vocal elicitation task used to obtain these metrics (eg, brief steady state vowels or simple sentence productions on comfortable pitch and loudness). These targets are likely not sensitive enough to capture vocal deficits in pMTD that typically occur when vocal demands exceed baseline functional capacity (eg, increased vocal duration, intensity, or pitch range).8 Although effects were moderate, the fact the highest F0—a measure that briefly challenges baseline physiological boundaries—was both an important ranked factor in the Random Forest Model and met the adjusted P-value with standard statistical t-test, exemplifies this concept.

The lack of substance in findings with these tasks is further supported by the majority of pMTD patients clustering into Cluster 2, which had z-scores centered at or near 0 (c.f., Figure 3). In other words, the tasks may not have been sufficiently targeting deficits and may have made vocal output parameters appear more “normal.” The key difference between AP clusters as they relate to pMTD appears to be the relative absence of abnormal vocal output in pMTD compared to other voice disorders in the other two clusters. These cluster patterns also align with previous studies that have suggested not all patients with pMTD exhibit abnormal vocal outputs (eg, acoustic vocal quality). Instead of issues with the “sound” of the voice (eg, hoarseness), issues may stem from the physiologic “feel” of the voice (eg, vocal effort, fatigue) in some patients, which acoustic and aerodynamic measures may not adequately capture.29

Another reason for the lack of differences found between groups could be that these metrics capture vocal output, or the end products of physiologic processes. They do not, however, inform neuromuscular biomechanics or subsystem kinematics that may differ between patients with pMTD and other voice disorders. Examples include respiratory hyperfunction, overactivation of the intrinsic and extrinsic laryngeal muscles in the absence of structural, anatomical, or neurological vocal fold deficits, or overactivity of the abdominal muscles, to name a few, all of which have been proposed as underlying mechanisms specific to muscle tension dysphonias.8,13,50–52 Many of these different underlying physiological mechanisms can result in similar aberrant acoustic vocal qualities or aerodynamic inefficiencies in patients with pMTD. Vocal outputs resulting from different underlying deficits in other types of voice disorders may also overlap in their vocal output presentation with pMTD. This concept is supported by patterns found in Figure 1B and 1C and Supplemental Table 2.

These study findings contrast with several previous works that found differences using acoustic and aerodynamic metrics, including comparisons of subjects with pMTD and vocal fold lesions,24 muscle tension dysphonia and healthy controls,23 and different types of muscle tension dysphonia.53 As mentioned in the Introduction, inconsistencies in results across studies could have to do with the types of statistical approaches used to assess group differences within these studies. Statistical analysis has traditionally relied on probability to determine significance of results, or the likelihood of there being group differences (ie, P < 0.05 cutoff), but not the strength of evidence, such as importance of the metric on the model. Although several parameters were statistically significant in the present study, in that they met the P < 0.05 cutoff, these metrics only accounted for a little more than half of the data’s behavior (65%) and their respective fold changes were minimal (c.f., Table 2B). With these modest effects and the known heterogeneity in pMTD clinical presentation, it is no surprise some studies found significant differences while others did not.21,22,26

Differences in findings across studies highlight the utility of computational modeling over traditional statistical tests. With a heterogeneous population like pMTD, the multidimensional aspects of vocal production, and the multitude of potential outcome parameters that can be used to examine group differences, statistical models that test unidimensional metrics will likely be met with reproducibility challenges. The use of computational modeling, such as neural networks and machine learning, allows for more precise identification of diagnostic metrics and treatment outcomes, especially when dealing with clinical heterogeneity.54,55 Findings in the present study also underscore the importance of using parameters that adequately target the population in question and suggest current standard metrics and their methods for elicitation may not be robust for the pMTD population. Using maximum function tasks across vocal range, duration, and intensity when assessing pMTD could be more revealing of underlying vocal pathophysiology, considering patients with pMTD commonly complain of difficulties with vocal range, endurance, and projection, respectively.12

Validation of biophysiologic correlates of pMTD in combination with multivariate modeling is also needed. Fortunately, validation of metrics that inform underlying physiology in pMTD is already underway. Kinematic stiffness ratios have been proposed as a measure of intrinsic laryngeal muscle biomechanics.56,57 Eulerian Video Magnification is being investigated as a measure of extrinsic muscle activity via blood flow perfusion.58 Several studies are investigating central nervous system responses in patients with muscle tension dysphonia.4,59 These methods will likely need to be performed in tandem using dynamic systems models to inform interplay across the phonatory, respiratory, and resonatory vocal subsystems.60 Physiology-based metrics will be vital for improving precision of diagnosis in pMTD and for managing this diagnosis using approaches that are physiologically based instead of symptom-focused.

Several limitations are worth mentioning. The first is the retrospective nature of the study. The four speech-language pathologists (SLPs) at NYU who acquired acoustic and aerodynamic data at the time of the voice assessment all had similar training. However, there is still a chance of inter-clinician variability in how the SLPs elicited vocal tasks and in the order of data acquisition; there is also the potential for data input errors with manual entry into the EPIC system, especially in the midst of a busy clinic day. The second limitation is that SPL was not directly calibrated prior to each voice evaluation. That being said, the acoustic recordings all used the same equipment with approximately the same mouth-to-microphone distance (±2–4 cm). The gain on the microphone was kept at a constant level and was only reduced for very loud speech production. Previous studies have demonstrated that acoustic measures related to fundamental frequency and cepstral peak prominence are robust to small variations in recording conditions.61

CONCLUSION

The inability to distinguish pMTD from non-pMTD using standard acoustic and aerodynamic metrics, even with multivariate analysis and computational modeling, suggests caution should be exercised when using these metrics for clinical diagnosis in patients and as inclusionary criteria for research participants with pMTD. Validation of metrics and deterministic models that directly assess vocal subsystem biomechanics, kinematics, and dynamics is needed. Despite their shortcomings, previous work has shown acoustic and aerodynamic metrics may still be helpful in determining pre-post treatment outcomes and may be robust in tracking changes within individuals.49

Supplementary Material

Supp Table 1
Supp Table 2

ACKNOWLEDGMENTS

The authors would like to thank Nichole Houle for her contributions to data acquisition in EPIC and her help with data deidentification, cleaning, and processing; the authors would also like to thank Elizabeth Fosmire for her help with identifying ICD code descriptions. Author J.L. was supported by the Cancer Prevention and Research Institute of Texas (CPRIT; RP150596).

Footnotes

SUPPLEMENTARY DATA

Supplementary data related to this article can be found online at doi:10.1016/j.jvoice.2021.05.019.

CONFLICT OF INTEREST

None of the authors have a conflict of interest to declare.

REFERENCES

  • 1. Patel R, Awan SN, Barkmeier-Kraemer J, et al. Recommended protocols for instrumental assessment of voice: American Speech-Language-Hearing Association expert panel to develop a protocol for instrumental assessment of vocal function. Am J Speech Lang Pathol. 2018;27:887–905. doi: 10.1044/2018_AJSLP-17-0009.
  • 2. Andreassen ML, Litts JK, Randall DR. Emerging techniques in assessment and treatment of muscle tension dysphonia. Curr Opin Otolaryngol Head Neck Surg. 2017;25:447–452. doi: 10.1097/MOO.0000000000000405.
  • 3. Morrison MD, Rammage LA, Belisle GM, et al. Muscular tension dysphonia. J Otolaryngol. 1983;12:302.
  • 4. Kunduk M, Fink DS, McWhorter AJ. Primary muscle tension dysphonia. Curr Otorhinolaryngol Rep. 2016;4:175–182. doi: 10.1007/s40136-016-0123-3.
  • 5. Verdolini K, Rosen CA, Branski RC. Classification Manual for Voice Disorders-I. Psychology Press; 2005. Accessed November 17, 2013. http://books.google.com/books?hl=en&lr=&id=Wkj-tExDSvBwC&oi=fnd&pg=PP1&dq=Classification+Manual+for+Voice+Disorders&ots=IGrnIh_YXz&sig=2wYztwkX5NLzqfF5MZS3GBak97o.
  • 6. Roy N. Assessment and treatment of musculoskeletal tension in hyperfunctional voice disorders. Int J Speech Lang Pathol. 2008;10:195–209. doi: 10.1080/17549500701885577.
  • 7. Morrison MD, Rammage L, Nichol H, et al. The management of voice disorders. Chapman & Hall Medical, London; 1994. Accessed July 15, 2013. http://www.getcited.org/pub/103099243.
  • 8. Spencer ML. Muscle tension dysphonia: a rationale for symptomatic subtypes, expedited treatment, and increased therapy compliance. Perspect Voice Voice Dis. 2015;25:5–15. doi: 10.1044/vvd25.1.5.
  • 9. Tomlinson CA, Archer KR. Manual therapy and exercise to improve outcomes in patients with muscle tension dysphonia: a case series. Phys Ther. 2015;95:117–128. doi: 10.2522/ptj.20130547.
  • 10. Van Houtte E, Van Lierde K, Claeys S. Pathophysiology and treatment of muscle tension dysphonia: a review of the current knowledge. J Voice. 2011;25:202–207.
  • 11. Van Houtte E, Van Lierde K, D’haeseleer E, et al. The prevalence of laryngeal pathology in a treatment-seeking population with dysphonia. Laryngoscope. 2010;120:306–312.
  • 12. Morrison MD, Nichol H, Rammage LA. Diagnostic criteria in functional dysphonia. Laryngoscope. 1986;96:1–8.
  • 13. Altman KW, Atkinson C, Lazarus C. Current and emerging concepts in muscle tension dysphonia: a 30-month review. J Voice. 2005;19:261–267.
  • 14. Morrison MD, Rammage LA. Muscle misuse voice disorders: description and classification. Acta Oto-Laryngologica. 1993;113:428–434. doi: 10.3109/00016489309135839.
  • 15. Angsuwarangsee T, Morrison M. Extrinsic laryngeal muscular tension in patients with voice disorders. J Voice. 2002;16:333–343.
  • 16. da Cunha Pereira G, de Oliveira Lemos I, Dalbosco Gadenz C, et al. Effects of voice therapy on muscle tension dysphonia: a systematic literature review. J Voice. 2018;32:546–552. doi: 10.1016/j.jvoice.2017.06.015.
  • 17. Barkmeier JM, Case JL. Differential diagnosis of adductor-type spasmodic dysphonia, vocal tremor, and muscle tension dysphonia. Curr Opin Otolaryngol Head Neck Surg. 2000;8:174–179.
  • 18. Sama A, Carding PN, Price S, et al. The clinical features of functional dysphonia. Laryngoscope. 2001;111:458–463. doi: 10.1097/00005537-200103000-00015.
  • 19. Stager SV, Bielamowicz SA, Regnell JR, et al. Supraglottic activity: evidence of vocal hyperfunction or laryngeal articulation? J Speech Lang Hear Res. 2000;43:229–238.
  • 20. Behrman A, Dahl LD, Abramson AL, et al. Anterior-posterior and medial compression of the supraglottis: signs of nonorganic dysphonia or normal postures? J Voice. 2003;17:403–410.
  • 21. Gillespie AI, Gartner-Schmidt J, Rubinstein EN, et al. Aerodynamic profiles of women with muscle tension dysphonia/aphonia. J Speech Lang Hear Res. 2013;56:481–488.
  • 22. Belsky MA, Rothenberger SD, Gillespie AI, et al. Do phonatory aerodynamic and acoustic measures in connected speech differ between vocally healthy adults and patients diagnosed with muscle tension dysphonia? J Voice. 2020. doi: 10.1016/j.jvoice.2019.12.019.
  • 23. Zheng Y-Q, Zhang B-R, Su W-Y, et al. Laryngeal aerodynamic analysis in assisting with the diagnosis of muscle tension dysphonia. J Voice. 2012;26:177–181. doi: 10.1016/j.jvoice.2010.12.001.
  • 24. Espinoza VM, Zañartu M, Van Stan JH, et al. Glottal aerodynamic measures in women with phonotraumatic and nonphonotraumatic vocal hyperfunction. J Speech Lang Hear Res. 2017;60:2159–2169. doi: 10.1044/2017_JSLHR-S-16-0337.
  • 25. Marks KL, Lin JZ, Burns JA, et al. Estimation of subglottal pressure from neck surface vibration in patients with voice disorders. J Speech Lang Hear Res. 2020;63:2202–2218. doi: 10.1044/2020_JSLHR-19-00409.
  • 26. Lopes LW, Batista Simões L, Delfino da Silva J, et al. Accuracy of acoustic analysis measurements in the evaluation of patients with different laryngeal diagnoses. J Voice. 2017;31:382. doi: 10.1016/j.jvoice.2016.08.015.
  • 27. McKenna VS, Hylkema JA, Tardif MC, et al. Voice onset time in individuals with hyperfunctional voice disorders: evidence for disordered vocal motor control. J Speech Lang Hear Res. doi: 10.1044/2019_JSLHR-19-00135.
  • 28. Roy N. Differential diagnosis of muscle tension dysphonia and spasmodic dysphonia. Curr Opin Otolaryngol Head Neck Surg. 2010;18:165–170. doi: 10.1097/MOO.0b013e328339376c.
  • 29. Gillespie AI, Gartner-Schmidt J. Immediate effect of stimulability assessment on acoustic, aerodynamic, and patient-perceptual measures of voice. J Voice. 2016;30:507–5e9.
  • 30. Valentino WL, Park J, Alnouri G, et al. Diagnostic value of acoustic and aerodynamic measurements in vocal fold movement disorders and their correlation with laryngeal electromyography and voice handicap index. J Voice. 2019. doi: 10.1016/j.jvoice.2019.10.008.
  • 31. McMullan PM. A comparison of acoustic and aerodynamic measurements of laryngeal function using low-cost and high-cost systems. 2016. Accessed October 16, 2020. https://search.proquest.com/docview/1802256112/abstract/1F3D4932E80140F6PQ/1.
  • 32. Niebudek-Bogusz E, Kotylo P, Śliwińska-Kowalska M. Evaluation of voice acoustic parameters related to the vocal-loading test in professionally active teachers with dysphonia. 2007. Accessed January 3, 2014. http://www.degruyter.com/view/j/ijmh.2007.20.issue-1/v10001-007-0001-9/v10001-007-0001-9.xml.
  • 33. Côrtes Gama AC, Camargo Z, Rocha Santos MA, et al. Discriminant capacity of acoustic, perceptual, and vocal self: the effects of vocal demands. J Voice. 2015;29:260.e45–260.e50. doi: 10.1016/j.jvoice.2014.06.012.
  • 34. Gillespie AI, Dastolfo C, Magid N, et al. Acoustic analysis of four common voice diagnoses: moving toward disorder-specific assessment. J Voice. 2014;28:582–588. doi: 10.1016/j.jvoice.2014.02.002.
  • 35. Dastolfo C, Gartner-Schmidt J, Yu L, et al. Aerodynamic outcomes of four common voice disorders: moving toward disorder-specific assessment. J Voice. 2016;30:301–307. doi: 10.1016/j.jvoice.2015.03.017.
  • 36. Brockmann-Bauser M, Drinnan MJ. Routine acoustic voice analysis: time to think again? Curr Opin Otolaryngol Head Neck Surg. 2011;19:165–170. doi: 10.1097/MOO.0b013e32834575fe.
  • 37. Roy N, Barkmeier-Kraemer J, Eadie T, et al. Evidence-based clinical voice assessment: a systematic review. Am J Speech Lang Pathol. 2013;22:212–226. doi: 10.1044/1058-0360(2012/12-0014.
  • 38. Cohen SM, Garrett CG. Hoarseness: is it really laryngopharyngeal reflux? Laryngoscope. 2008;118:363–366. doi: 10.1097/MLG.0b013e318158f72d.
  • 39. Thomas JP, Zubiaur FM. Over-diagnosis of laryngopharyngeal reflux as the cause of hoarseness. Eur Arch Otorhinolaryngol. 2013;270:995–999. doi: 10.1007/s00405-012-2244-8.
  • 40. Misono S, Dietrich M, Piccirillo JF. The puzzle of medically unexplained symptoms—a holistic view of the patient with laryngeal symptoms. JAMA Otolaryngol Head Neck Surg. 2020. doi: 10.1001/jamaoto.2020.0559.
  • 41. Kiakojoury K, Dehghan M, Hajizade F, et al. Etiologies of dysphonia in patients referred to ENT clinics based on videolaryngoscopy. Iran J Otorhinolaryngol. 2014;26:169–174.
  • 42. Sliwinska-Kowalska M, Niebudek-Bogusz E, Fiszer M, et al. The prevalence and risk factors for occupational voice disorders in teachers. Folia Phoniatrica et Logopaedica. 2006;58:85–101. doi: 10.1159/000089610.
  • 43. Bhattacharyya N. The prevalence of voice problems among adults in the United States. Laryngoscope. 2014;124:2359–2362.
  • 44. Roy N. Functional dysphonia. Curr Opin Otolaryngol Head Neck Surg. 2003;11:144–148.
  • 45. Koufman JA, Blalock PD. Classification and approach to patients with functional voice disorders. Ann Otol Rhinol Laryngol. 1982;91(4 Pt 1):372.
  • 46. Breiman L. Random forests. Mach Learn. 2001;45:5–32. doi: 10.1023/A:1010933404324.
  • 47. Ho TK. The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell. 1998;20:832–844. doi: 10.1109/34.709601.
  • 48. Bodenhofer U, Kothmeier A, Hochreiter S. APCluster: an R package for affinity propagation clustering. Bioinformatics. 2011;27:2463–2464. doi: 10.1093/bioinformatics/btr406.
  • 49. Reetz S, Bohlender JE, Brockmann-Bauser M. Do standard instrumental acoustic, perceptual, and subjective voice outcomes indicate therapy success in patients with functional dysphonia? J Voice. 2019;33:317–324. doi: 10.1016/j.jvoice.2017.11.014.
  • 50. Gillespie AI. Effects of hyper- and hypocapnia on phonatory laryngeal resistance. 2013. Accessed August 13, 2014. http://d-scholarship.pitt.edu/17318/.
  • 51. Morrison M. Pattern recognition in muscle misuse voice disorders: how I do it. J Voice. 1997;11:108–114.
  • 52. Rubin JS, Macdonald I, Blake E. The putative involvement of the transabdominal muscles in dysphonia: a preliminary study and thoughts. J Voice. 2011;25:218–222.
  • 53. Cesari U, Apisa P, Frasci M, et al. Aerodynamic analysis in quantitative evaluation of voice disorders. J Otol Rhinol. 2016;05. doi: 10.4172/2324-8785.1000268.
  • 54. Hewitson L, Mathews JA, Devlin M, et al. Blood biomarker discovery for autism spectrum disorder: a proteomic analysis. PLoS One. 2021;16:e0246581. doi: 10.1371/journal.pone.0246581. [Retracted]
  • 55. Shah N, Farhat A, Tweed J, et al. Neural networks to predict radiographic brain injury in pediatric patients treated with extracorporeal membrane oxygenation. J Clin Med. 2020;9:2718. doi: 10.3390/jcm9092718.
  • 56. Diaz-Cadiz M, McKenna VS, Vojtech JM, et al. Adductory vocal fold kinematic trajectories during conventional versus high-speed videoendoscopy. J Speech Lang Hear Res. 2019;62:1685–1706. doi: 10.1044/2019_JSLHR-S-18-0405.
  • 57. McKenna VS, Murray ESH, Lien Y-AS, et al. The relationship between relative fundamental frequency and a kinematic estimate of laryngeal stiffness in healthy adults. J Speech Lang Hear Res. 2016;59:1283–1294. doi: 10.1044/2016_JSLHR-S-15-0406.
  • 58. Adleberg J, O’Connell Ferster AP, Benito DA, et al. Detection of muscle tension dysphonia using Eulerian video magnification: a pilot study. J Voice. 2019. doi: 10.1016/j.jvoice.2019.02.006.
  • 59. Roy N, Fetrow RA, Merrill RM, et al. Exploring the clinical utility of relative fundamental frequency as an objective measure of vocal hyperfunction. J Speech Lang Hear Res. 2016;59:1002–1017. doi: 10.1044/2016_JSLHR-S-15-0354.
  • 60. Croake DJ, Andreatta RD, Stemple JC. Descriptive analysis of the interactive patterning of the vocalization subsystems in healthy participants: a dynamic systems perspective. J Speech Lang Hear Res. 2019;62:215–228. doi: 10.1044/2018_JSLHR-S-17-0466.
  • 61. van der Woerd B, Wu M, Parsa V, et al. Evaluation of acoustic analyses of voice in nonoptimized conditions. J Speech Lang Hear Res. 2020;63:3991–3999. doi: 10.1044/2020_JSLHR-20-00212.
