Author manuscript; available in PMC: 2026 Apr 4.
Published before final editing as: J Voice. 2026 Mar 11:S0892-1997(26)00049-4. doi: 10.1016/j.jvoice.2026.01.045

Automated creak detection in Spanish speakers with and without AdLD

M Eugenia Castro a,b, Fermin M Zubiaur Gomar c, Katherine L Marks d,*
PMCID: PMC13048833  NIHMSID: NIHMS2156280  PMID: 41820120

Abstract

Objectives:

Adductor laryngeal dystonia (AdLD) is a neurological voice disorder characterized by involuntary spasms of the adductor laryngeal muscles during phonation. An automated creak detector has shown promise in differentiating English speakers with AdLD from controls. However, no study has yet investigated creak in AdLD in Spanish speakers. In fact, there is a paucity of research validating tools to identify LD in languages other than English. The purpose of this study was to determine whether creak differentiates Spanish-speaking individuals with and without AdLD.

Methods:

Twenty speakers with AdLD, twenty speakers without voice disorders (controls), and twenty speakers with glottic insufficiency were recorded in a clinical environment. Each participant read a set of recently developed Spanish stimuli designed for LD screening, containing voiced and voiceless loaded sentences. An open-source creak detector was used to calculate the percentage of creak in each speaker’s recording. Mean smoothed cepstral peak prominence (CPPS) per speaker was calculated in Praat.

Results:

A Pearson’s correlation revealed a moderate negative relationship between creak and CPPS across groups (r = −.56). An analysis of covariance (ANCOVA) revealed a statistically significant effect of group on creak (F(3, 56) = 13.52, p < .05, R2 = .39). Three receiver operating characteristic curve analyses indicated that creak differentiated AdLD and Controls (AUC = .88), as well as AdLD and Glottic Insufficiency with acceptable diagnostic accuracy (AUC = .73), but did not differentiate the control and glottic insufficiency groups.

Conclusions:

Creak differentiated Spanish speakers with and without AdLD with moderate discrimination. Although creak and CPPS were moderately correlated, when controlling for CPPS, creak was statistically different between speakers with AdLD and speakers with glottic insufficiency, and between speakers with AdLD and controls. Further work is needed to determine the clinical utility of creak in aiding a differential diagnosis of AdLD and muscle tension dysphonia in Spanish-speaking individuals.

Keywords: Dysphonia, Laryngeal Dystonia, Spanish Speakers, Assessment Stimuli, Spasmodic Dysphonia

1. Introduction

Laryngeal dystonia (LD), alternatively known as Spasmodic Dysphonia (SD), is a focal dystonia that affects the intrinsic laryngeal muscles, causing involuntary spasms during speech tasks. The disorder has three main types: adductor, abductor, and mixed.1–3 Adductor laryngeal dystonia (AdLD) produces abrupt voice stops or breaks caused by intermittent hyperadduction of the vocal folds due to laryngeal muscle spasms. Abductor laryngeal dystonia (AbLD) results in breathy voice breaks, which are aphonic segments due to intermittent abduction of the vocal folds during spasms.1 In mixed LD, there is intermittent hyperadduction on voiced phonemes and abduction during transitions from voiceless to voiced phonemes, resulting in a combination of stops, breaks, and breathy breaks. The most common type of LD is AdLD,2 which causes symptoms resulting in auditory perceptual features of strain, roughness, asthenia, and vocal fry.3–5 AdLD symptoms include discontinuities in running speech, including phonatory breaks (perceived as voice stops or breaks), frequency shifts (perceived as pitch breaks), and aperiodicity6–9 or creak.10,11 Consequently, individuals with AdLD experience reduced quality of life secondary to the negative impact on communication effectiveness and participation barriers in activities of daily living.12–14

AdLD is a rare neurological voice disorder, with an approximate incidence of one out of every 100,000 individuals15. There are currently several challenges to obtaining a diagnosis for individuals with AdLD, which also delays proper treatment. Prior studies have shown that patients experience a substantial delay in LD diagnosis, reporting an average of 4.5 years from onset until obtaining formal diagnosis.15,16 This issue is related to the lack of objective diagnostic criteria for LD, as well as poor clinician awareness and education in the diagnostic protocols. General practitioners may not have sufficient experience with the disorder, and AdLD signs may be easily confused with signs of more common voice disorders.15,16 The differentiation between AdLD and other voice disorders, such as glottic insufficiency or muscle tension dysphonia, is critical to providing accurate treatment options. For example, voice therapy is effective as a treatment for muscle tension dysphonia but is less effective for AdLD. Since AdLD is neurological in nature, it does not respond to behavioral voice therapy in the way that other common voice disorders do.15,16 When patients with AdLD are misdiagnosed and offered inappropriate treatment, this leads to greater frustration, financial burden, and negative impact on quality of life that could be avoided with improved screening tools.

There is currently no diagnostic test specifically developed for LD, and clinical diagnosis requires subjective auditory-perceptual evaluation by experienced clinicians.15–18 Symptoms of LD become perceptually evident during speech tasks (e.g., speaking vs. laughing or crying) and are phoneme-specific. For example, individuals with AbLD will demonstrate breathy voice breaks more predominantly when producing transitions between voiceless consonants and vowels, and individuals with AdLD will demonstrate symptoms of voice stops or breaks during vowel onsets and voiced consonants.17,19 Therefore, task- and phoneme-specific stimuli are used to improve differential diagnosis of LD. The absence of an objective diagnostic test and the overreliance on perceptual evaluation represent challenges when the patient and the clinician differ in language or dialect. An automated, objective measure could reduce bias and provide an additional quantitative tool for accurate assessment. Currently, there are no automated, clinically feasible quantitative measures that identify primary signs of AdLD in voice samples. Within the literature, instances of LD discontinuities, such as phonatory breaks, frequency shifts, aperiodicity, and creak, in speech samples have been manually identified and found to be sensitive and specific to AdLD,6–9,20 but these analyses are time-consuming and therefore not clinically feasible. Acoustic measures have been used to differentiate AdLD from muscle tension dysphonia with the use of specialized stimuli or complex calculations.21–24 More recently, a new outcome measure was proposed in the literature called the spectral aggregate of the high-passed fo contour (SAHfo).25 SAHfo was designed to be used as an automated acoustic objective outcome measure specific to AdLD. Follow-up studies confirmed a positive association between SAHfo and the number of laryngeal discontinuities in the acoustic signal of individuals with AdLD, which are caused by adductory laryngeal spasms.10 However, further validation and feasibility studies are needed before SAHfo can be used clinically.

To validate SAHfo, manual labels of acoustic discontinuities (phonatory breaks, frequency shifts, and aperiodicity) were compared against the measure, consistent with prior literature.8,9 During the labeling process, Marks et al. (2022)10 noted instances in which discontinuities did not meet the criteria of either aperiodicity or frequency shifts but were still not typical periodic phonation. For these instances, the umbrella term creak was adopted based on Keating et al. (2015), who described prototypical creaky voice as phonation characterized by low and irregular fundamental frequency and a constricted glottal configuration marked by small peak openings, prolonged closure, and reduced airflow, with other subtypes of creak including different combinations of the aforementioned features.26 In a subsequent study, Marks et al. (2023)11 applied an existing open-source, automated creak detector27 that was developed to automatically detect instances of creak in connected speech.28–30 Specifically, the automated creak detector31 is based on an artificial neural network model that was trained to detect specific acoustic features associated with creak: first harmonic minus second harmonic (H1–H2), fundamental frequency (fo), residual peak prominence, power peak parameters, inter-pulse similarity, intra-frame periodicity and energy norm, power standard deviation, and ZeroXRate.31 Marks et al. (2023)11 found that creak (%) differentiated English speakers with AdLD from controls with high sensitivity and specificity, providing initial evidence of discriminant validity for creak in AdLD. The authors concluded that creak has the potential to be used as a screening tool in English, with further research required.11

Unfortunately, there is a paucity of validated tools to identify LD in languages other than English. One study described the development of vocal tasks to assess LD in Spanish.32 This study was conducted in Chile and included participants and diagnosticians who were all native Spanish speakers.32 However, dialectal and cultural variations in Spanish across regions make it challenging for those vocal tasks to be applicable to Spanish speakers globally.33 Of note, voiced and voiceless phonemes are sometimes pronounced differently depending on the Spanish dialect spoken. For example, the word “Llave” in Spanish (“key” in English) is pronounced with a voiceless initial phoneme in some South American dialects of Spanish, but it is produced with a voiced initial phoneme in some Central American dialects as well as European dialects of Spanish. To address this issue, Castro et al. (2023)34 developed phoneme-loaded sentences in Spanish to assess LD with particular attention to these dialectal differences.34 Clinicians familiar with a specific patient population (e.g., dysarthria) have been shown to benefit from using standardized diagnostic protocols that improve consistency and diagnostic accuracy.35 In line with this rationale, we developed Spanish stimuli that mirror the auditory-perceptual tasks commonly used by English-speaking clinicians to promote standardization and facilitate cross-linguistic application in LD assessment.34

Considering the wide diversity of dialectal backgrounds of Spanish-speaking individuals residing in the United States, it is critical that tools be developed to detect LD based on Spanish stimuli. The current gold-standard assessment of LD remains auditory-perceptual evaluation; however, monolingual American English speakers may have difficulty rating dysphonia in languages other than English when compared to bilingual listeners, suggesting that clinicians’ unfamiliarity with the language could negatively impact accurate diagnosis.36 In LD, spasms are phoneme-specific and affected by coarticulation and linguistic context, making auditory-perceptual ratings potentially more susceptible to rater error if the rater is unfamiliar with the language. Based on these challenges, an automated, objective tool that has the potential to reliably identify signs of LD in Spanish-speaking individuals is of clinical significance. Our previous work has demonstrated the potential of creak to aid in the diagnosis of AdLD;11 however, to our knowledge, no study has yet investigated creak in Spanish-speaking individuals with AdLD. The purpose of this study was to determine whether creak differentiates Spanish-speaking individuals with and without AdLD using Spanish stimuli that maintain phonemic consistency across Spanish dialects.

2. Methods

2.1. Participants

Participants in this study included 60 Spanish-speaking individuals who resided in Mexico, as approved by the Institutional Review Board (IRB) of the University of Southern California (UP-21–00340). Demographics for each group are outlined in Table 1. Twenty participants were included in the Adductor-subtypes of LD (ADsLD) group: 16 with AdLD, two with AdLD and concomitant tremor, and two with mixed LD. Patients in the ADsLD group were diagnosed by a board-certified otolaryngologist (FZG) based on consensus criteria from Ludlow et al. (2018).17 Seven of the participants in the ADsLD group had never received Botox, and 13 had received Botox in the past, at least 76 days prior to the recording (mean = 316 days). Participants in the ADsLD group were matched by sex with 20 patients with glottic insufficiency and 20 individuals without voice disorders (controls). All patients with glottic insufficiency were diagnosed via comprehensive evaluation by a board-certified otolaryngologist in Mexico (FZG). In this study, patients with glottic insufficiency were included to allow accurate comparison with the gold standard (i.e., videostroboscopy) and minimize confounding factors stemming from expert variability in perceptual assessment of AdLD and muscle tension dysphonia (MTD). Individuals without voice disorders were adult volunteers who reported no history of speech, voice, language, or hearing disorders. Typical vocal function was confirmed through participant report and perceptual voice screening by a board-certified otolaryngologist and a speech-language pathologist with more than six years of experience in voice disorders.

Table 1.

Demographics and Clinical Characteristics of the Participant-Speakers

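Characteristic | Laryngeal Dystonia | Glottic Insufficiency | Control
Age, mean (SD), years | 56.90 (7.89) | 57.43 (12.31) | 45.57 (17.41)
Sex (n): Female | 15 | 15 | 15
Sex (n): Male | 5 | 5 | 5
Dialectal Background (n): Mexico | 19 | 20 | 20
Dialectal Background (n): Venezuela | 1 | 0 | 0
Diagnosis (n): AdLD | 16 | -- | --
Diagnosis (n): AdLD + tremor | 2 | -- | --
Diagnosis (n): Mixed LD | 2 | -- | --
Diagnosis (n): Vocal fold immobility | -- | 16 | --
Diagnosis (n): Presbyphonia | -- | 4 | --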

SD = standard deviation, n = number of participants, AdLD = adductor laryngeal dystonia

2.2. Audio Sample Acquisition and Analysis

Voice samples were recorded in a quiet room. A wireless lavalier microphone (Alvoxcon Audio) was placed 10 cm from the mouth at a 45-degree angle. Acoustic signals were recorded with a Mac computer (iMovie, Apple Inc software) using an external audio interface (Talent MIX-R Portable 3-Channel Mixer). Participants were instructed in Spanish to produce sustained vowels /a/ and /i/ and read aloud the set of phonemically loaded sentences34 (Appendix 1). Audio files were digitized (44100 Hz, 32-bit) and converted to .wav format for analysis. As in previous work,11 an automated creak detector31 that is currently available open-source [Covarep]37 (v1.3.2) was implemented in an automated MATLAB38 script. The creak detector is a neural network that was trained to employ a combination of acoustic features to detect at least three patterns found in creaky voice: highly irregular temporal characteristics, fairly regular temporal characteristics with strong excitation peaks, and fairly regular temporal characteristics without strong secondary excitations.31 Mean Cepstral Peak Prominence Smoothed (CPPS) was used as an objective surrogate of dysphonia severity and was calculated in Praat for each speaker.39 CPPS was controlled for to ensure that the creak detector was not simply reflecting differences in overall dysphonia severity between groups, but rather measuring a unique feature.
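The per-speaker creak percentage described above can be illustrated with a minimal Python sketch. It assumes the detector yields one binary creak decision per analysis frame and that the percentage is taken over all frames in the recording; the text does not state the denominator (all frames versus voiced frames only), so that choice is an assumption. The function name `percent_creak` and the example decision track are hypothetical and are not part of Covarep or the study's MATLAB script.

```python
from typing import Sequence


def percent_creak(frame_is_creak: Sequence[bool]) -> float:
    """Return the share of analysis frames flagged as creak, in percent.

    Assumption: every frame of the recording counts toward the denominator.
    """
    if len(frame_is_creak) == 0:
        raise ValueError("no frames to summarize")
    flagged = sum(1 for frame in frame_is_creak if frame)
    return 100.0 * flagged / len(frame_is_creak)


# Hypothetical 10-frame decision track with 2 frames flagged as creak.
decisions = [False, False, True, True, False, False, False, False, False, False]
print(percent_creak(decisions))  # 20.0
```

If the percentage were instead computed over voiced frames only, the same function would apply after filtering the decision track to those frames.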

2.3. Statistical Analysis

Open-source statistical software Jamovi (version 2.6) was used for correlation and analysis of covariance (ANCOVA).40 First, Pearson’s correlation was used to test the strength of the relationship between creak and CPPS. Based on the results (r = −.56, p < .001), an ANCOVA was used to test the effect of group on creak, controlling for CPPS. Levene’s test indicated unequal variances of creak across groups (F(2,57) = 9.21, p < .001), so a nonparametric ranked sums-based ANCOVA was used. Tukey’s post-hoc tests were calculated to assess pairwise comparisons of groups. Next, using a custom script in MATLAB,41 three separate receiver-operating characteristic (ROC) curves were used to test the diagnostic accuracy of creak in differentiating each pair of groups. The ROC curve analysis plots sensitivity over one minus specificity. Sensitivity is defined as a test’s ability to correctly identify those with the disorder, and specificity is defined as a test’s ability to correctly identify those without the disorder. The area under the curve (AUC) is a measure of diagnostic accuracy, where 1 = perfect diagnostic accuracy, .9 to 1 is outstanding, .8 to .9 is excellent, .7 to .8 is acceptable, .6 to .7 is poor, and .5 = no diagnostic accuracy.42 Likelihood ratios were calculated as the probability of creak identifying participants with ADsLD divided by the probability of the same finding in participants without ADsLD. Likelihood ratios were interpreted following McGee (2002), where a +LR greater than 10 provides strong evidence to rule in a diagnosis, 5–10 indicates moderate evidence, and 2–5 suggests weak evidence. Conversely, a −LR less than 0.1 provides strong evidence to rule out a diagnosis, 0.1–0.2 indicates moderate evidence, and 0.2–0.5 suggests weak evidence.43
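To make the AUC and its descriptive bands concrete, the following Python sketch computes the AUC as the proportion of (disorder, comparison) score pairs in which the disorder score is higher, counting ties as one half, which equals the area under the empirical ROC curve. This is an illustration only and not the authors' MATLAB script. The function names, the example creak values, the handling of band boundaries (lower bound inclusive), and the label for values below .6 are all assumptions.

```python
from typing import Sequence


def auc(positives: Sequence[float], negatives: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    if len(positives) == 0 or len(negatives) == 0:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def interpret_auc(value: float) -> str:
    """Map an AUC to the descriptive bands quoted in the text."""
    if value >= 0.9:
        return "outstanding"
    if value >= 0.8:
        return "excellent"
    if value >= 0.7:
        return "acceptable"
    if value >= 0.6:
        return "poor"
    return "no useful discrimination"  # assumption: label for values below .6


# Hypothetical % creak values, invented purely for illustration.
adsld = [12.0, 30.5, 4.1, 18.2]
controls = [1.0, 3.2, 5.0, 2.4]
score = auc(adsld, controls)
print(score, interpret_auc(score))
```

Under these bands, the AUC values reported in this study (.88 and .73) map to "excellent" and "acceptable", respectively.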

3. Results

Descriptive statistics of mean, standard deviation, minimum, and maximum % creak and CPPS are listed in Table 2 for each group. A ranked sums-based analysis of covariance (ANCOVA) revealed a statistically significant effect of group on creak (F(3, 56) = 13.52, p < .05, R2 = .39). Controlling for CPPS, there was a statistically significant difference in creak (%) between speakers with ADsLD and controls (t(56) = −2.62, p < .05, r = .33), as well as between speakers with ADsLD and speakers with glottic insufficiency (t(56) = −2.87, p < .05, r = .36), with small-to-medium effect sizes. Figure 1 illustrates results of the ANCOVA. ROC curve analyses indicated that creak differentiated ADsLD and Controls (AUC = .88), as well as ADsLD and glottic insufficiency with acceptable diagnostic accuracy (AUC = .73). As expected, creak did not differentiate between speakers with glottic insufficiency and controls. Figure 2 illustrates the ROC curve results for ADsLD vs. controls and ADsLD vs. glottic insufficiency, and Table 3 displays statistics associated with the ROC curve analyses.
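The text does not state how the effect size r was derived from the post-hoc tests. One common conversion from a t statistic is r = sqrt(t² / (t² + df)), and the short Python sketch below shows that this conversion reproduces the reported values. That the authors used this formula is an assumption, and the function name `r_from_t` is hypothetical.

```python
import math


def r_from_t(t: float, df: int) -> float:
    """Effect size r from a t statistic: sqrt(t^2 / (t^2 + df))."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return math.sqrt(t * t / (t * t + df))


# t values and degrees of freedom as reported in the Results.
for label, t in [("ADsLD vs controls", -2.62),
                 ("ADsLD vs glottic insufficiency", -2.87)]:
    print(label, round(r_from_t(t, 56), 2))
# prints 0.33 and 0.36, matching the reported effect sizes
```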

Table 2.

Descriptive Statistics by Group

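Statistic | % Creak: ADsLD | % Creak: Control | % Creak: GI | CPPS: ADsLD | CPPS: Control | CPPS: GI
Mean | 16.41 | 3.05 | 7.87 | 5.67 | 6.72 | 5.61
SD | 16.30 | 3.48 | 10.26 | 1.26 | 0.82 | 1.61
Min | 1.36 | 0.21 | 0.00 | 3.82 | 5.50 | 3.43
Max | 47.63 | 15.35 | 36.81 | 8.80 | 8.19 | 8.90

ADsLD = adductor subtypes of laryngeal dystonia, GI = glottic insufficiency, CPPS = smoothed cepstral peak prominence, SD = standard deviation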

Figure 1.

Results of ANCOVA plotted as mean % creak for each group with error bars indicating 95% confidence intervals.

Figure 2.

Receiver-operating characteristic curve analyses plotted as sensitivity over one minus specificity. Purple solid line indicates creak differentiating AdLD and Controls, with an AUC of .88. Light blue dotted line indicates creak differentiating AdLD and Glottic Insufficiency, with an AUC of .73.

Table 3.

Receiver-operating characteristic curve statistics

Statistic | ADsLD vs Controls | ADsLD vs Glottic Insufficiency
AUC | 0.88 | 0.73
+LR | 9 | 5
−LR | 0.58 | 0.79
Sensitivity | 0.45 | 0.25
Specificity | 0.95 | 0.95
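The likelihood ratios in Table 3 follow from the reported sensitivity and specificity through the standard definitions +LR = sensitivity / (1 − specificity) and −LR = (1 − sensitivity) / specificity. The Python sketch below recomputes them as a consistency check; the function name `likelihood_ratios` is hypothetical and this is not the authors' MATLAB script.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (+LR, -LR) from sensitivity and specificity.

    +LR = sensitivity / (1 - specificity)
    -LR = (1 - sensitivity) / specificity
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie in [0, 1]")
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must lie strictly between 0 and 1")
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr, negative_lr


# Sensitivity and specificity values as reported in Table 3.
for label, sens, spec in [("ADsLD vs controls", 0.45, 0.95),
                          ("ADsLD vs glottic insufficiency", 0.25, 0.95)]:
    pos, neg = likelihood_ratios(sens, spec)
    print(label, round(pos, 2), round(neg, 2))
# prints 9.0 and 0.58 for the first comparison, 5.0 and 0.79 for the second
```

Rounding is applied only for display, because floating-point division can return values such as 8.999999 rather than exactly 9.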

4. Discussion

This study extends our previous work11 by investigating differences in creak in Spanish speakers, and by comparing adductor subtypes of LD to a different voice disorder: glottic insufficiency. Controlling for CPPS, creak was significantly higher in Spanish-speaking individuals in the ADsLD group than in both the control and glottic insufficiency groups, demonstrating construct validity of creak. Combined with the moderate negative relationship found between CPPS and creak, this finding indicates that creak may reflect a different but complementary physiological phenomenon, based on this small sample. Further investigation with a larger sample size is warranted.

Results of the ROC curve analyses indicate that automated estimates of creak can differentiate Spanish speakers with adductor subtypes of LD from controls with excellent diagnostic accuracy and from glottic insufficiency with acceptable diagnostic accuracy, which provides evidence of discriminant validity for creak. The +LR of 9 for ADsLD versus controls provides moderate-to-strong evidence of discriminant validity of creak. The −LR values of 0.58 and 0.79 provide, at best, weak evidence for ruling out ADsLD. High specificity (0.95 for both comparisons) suggests that the creak detector is highly accurate in identifying speakers without ADsLD, whereas lower sensitivity (0.45 and 0.25) indicates that some cases of ADsLD may not be detected. In sum, creak differentiated ADsLD from controls with excellent discrimination and from glottic insufficiency with acceptable diagnostic accuracy; in both comparisons, the likelihood ratios provide stronger evidence for confirming ADsLD than for ruling it out.

Glottic insufficiency was introduced as a different comparison group in this study, so the results are not directly comparable to the studies that have differentiated AdLD from primary MTD.23,24,44 For reference, creak had excellent diagnostic accuracy (AUC = .86), with .73 sensitivity and .93 specificity in 16 speakers with AdLD and 16 speakers with primary MTD. In a larger dataset of 50 speakers per group, Dragicevic et al. (2024)45 found more moderate statistics (AUC = .72, sensitivity = .70, specificity = .66). Their findings were consistent with other acoustic measures that have been used to differentiate AdLD from primary MTD.23,24 Roy et al. (2024) investigated creak and the Cepstral Spectral Index of Dysphonia (CSID) to detect task dependency (calculated as the difference of values from a voiced-loaded sentence and a voiceless-loaded sentence) in AdLD and primary MTD. They found a lower diagnostic accuracy for creak (AUC = .60, sensitivity = .48, specificity = .76) and for the normalized CSID (AUC = .59, sensitivity = .38, specificity = .91). However, the diagnostic accuracy increased for the normalized CSID of only the voiceless sentence (AUC = .70, sensitivity = .79, specificity = .58). These results imply that creak is not sensitive to task specificity but rather may reflect a different, complementary diagnostic feature of AdLD. Further work is needed to determine whether creak differentiates AdLD from primary MTD in Spanish.

The excellent diagnostic accuracy (AUC of .88) between the AdLD and control groups in the present study was consistent with results of similar studies in English speakers. In Marks et al. (2023), creak differentiated 16 speakers with AdLD from 16 controls with outstanding diagnostic accuracy (AUC of .94).11 However, the sensitivity (.87) and +LR (13.0) found in Marks et al. (2023) were higher than the sensitivity (.45) and +LR (9.0) found in the current study. Specificity was high (.93 and .95, respectively) in both studies. In a larger sample of 50 speakers per group, Dragicevic et al. found more moderate values of AUC (.75), sensitivity (.70), and specificity (.66). In addition to differences in language spoken, it is possible that our current results were influenced by the length of the stimuli: i.e., the sentences of the Rainbow Passage used in English are longer than the Spanish stimuli used in the current study. There is preliminary evidence in studies of English that sentence length may play a role in the prevalence of creak across breath groups.11,46,47 Further work is needed to determine to what extent stimuli impact the discriminative accuracy of creak in AdLD.

The impact of the work is two-fold: first, it provides further evidence of validity for creak as a quantitative discriminative measure for adductor subtypes of LD. Second, the measure appears to be robust across Spanish and English, such that it could be used even when the clinician and patient do not share a language.

Given the lack of measures validated for Spanish-speaking individuals with AdLD, automated estimates of creak may fill a gap in the evaluation of AdLD and potentially lead to faster diagnosis and treatment of AdLD. However, no single automated measure should be used to diagnose. Comprehensive assessment of voice disorders necessitates the inclusion of multiple complementary measures to improve diagnostic accuracy (i.e., detailed case history, laryngeal imaging, perceptual analysis, and acoustic and aerodynamic assessments). At present, there remains a lack of validated instrumental or automated tools for the assessment of LD. Our results support the validity of creak as one potential measure for AdLD in Spanish speakers that would warrant further laryngological workup for the disorder. For example, if implemented as a screening tool, it could be used to determine when a comprehensive workup for AdLD, including use of task-specific stimuli, is warranted.

Creak was able to detect differences between adductor subtypes of LD and glottic insufficiency from stimuli that included both voiced and voiceless phoneme-loaded sentences. This is important because other quantitative measures used to differentiate AdLD from other voice disorders have been based on differences in sign expression (i.e., how prominently symptoms such as voice breaks, strain, or aperiodicity manifest across different speech tasks and in certain phonemic contexts).22–24 AdLD is phoneme-specific: because the muscles that spasm are adductory, the spasms occur primarily on voiced phonemes rather than voiceless phonemes.19 Many outcomes for AdLD therefore depend on sign expression, wherein AdLD signs would be more apparent on voiced phoneme-loaded sounds than voiceless-loaded sounds, whereas signs of other voice disorders would typically be consistent across sounds.10,21,22,24 Specific stimuli have been developed to aid a differential diagnosis; however, these stimuli are often only employed when an AdLD diagnosis is already suspected. Results from Marks et al. (2023)11 were based on a common mixed-phoneme reading passage in English that is frequently used in clinical voice evaluations. In the current study, we used both voiced and voiceless phoneme-loaded sentences that were validated for AdLD in Spanish. In both studies, creak differentiated AdLD from other voice disorders and controls based on the mixed-phoneme stimuli.

Because creak does not necessarily require specialized stimuli, in the long term, a creak detector could be used as a quick, easily implemented screening tool for AdLD, particularly in generalized settings, to indicate when additional specialized workup is needed. However, it is important to acknowledge that although the creak detector is open access, it still requires use of a program such as MATLAB, which limits its clinical application. To maximize the sensitivity of the creak detector in differentiating AdLD from other voice disorders, thresholds could be applied as a clinical indicator that an AdLD differential diagnosis workup is warranted. Larger samples are needed to estimate accurate thresholds, and additional research and development are needed to offer an accessible clinical screening tool for AdLD in both English and Spanish.

5. Limitations

Limitations primarily pertain to group selection. First, since LD is a rare voice disorder, to optimize our sample size, two speakers with AdLD and concomitant tremor and two speakers with mixed LD were included in the ADsLD group. Second, although glottic insufficiency is an important voice disorder to rule out during a differential diagnosis, AdLD may be less likely to be misdiagnosed as glottic insufficiency than as other voice disorders, such as MTD. This is because glottic insufficiency is easily observed under laryngoscopy, while AdLD and MTD may appear similar. Although our results provide evidence of discriminant validity for creak in Spanish, the results are not directly comparable to our previous results in English, as the earlier study included patients with MTD, whereas the current study included speakers with glottic insufficiency. Inclusion of Spanish-speaking individuals with muscle tension dysphonia would strengthen the validity and clinical utility of an automated creak detector for use as a screening tool for AdLD; further investigation is warranted. Future work is also warranted to study creak in AbLD. Finally, it is important to note that the creak detector is a neural network that was trained on speakers of US-English, Finnish, Swedish, and Japanese,27,31 so further investigation and/or neural network training is needed to apply the creak detector to other languages. It is currently unknown how well creak would differentiate voice disorder groups in languages other than English or Spanish.

6. Conclusions

Creak differentiated Spanish speakers with ADsLD from controls with excellent discrimination. Based on the AUC, creak differentiated ADsLD from glottic insufficiency with acceptable discrimination based on stimuli including mixed phonemes, with greater specificity than sensitivity. Although creak and CPPS are moderately correlated, when controlling for CPPS, creak was statistically different between speakers with AdLD and speakers with glottic insufficiency, and between speakers with AdLD and controls. Findings from the current study support the clinical use of automated creak to identify signs of AdLD in Spanish-speaking individuals. Speech-language pathologists who do not speak Spanish but have experience in the assessment and treatment of AdLD may be capable of reliably identifying AdLD in Spanish-speaking individuals using this automated tool. This tool may improve diagnostic accuracy by voice care teams in the US and increase access to timely diagnosis and treatment for Spanish-speaking patients with LD. Further work is needed to determine the clinical utility of creak in aiding a differential diagnosis of AdLD and muscle tension dysphonia in both Spanish- and English-speaking individuals.

Disclosures

Katherine L. Marks receives funding from NIH NIDCD grants R21DC023280 (PI Marks) and R01DC009029 (PI Eddins).

Appendix 1.

Stimuli in Spanish for audio sample collection.

Stimuli in Spanish English translation (as reference)
Por favor sostenga el sonido /i/ por 4–5 segundos. Please hold an /i/ sound for 4–5 seconds
Por favor sostenga el sonido /a/ por 4–5 segundos. Please hold an /ah/ sound for 4–5 seconds
Lea las siguientes frases en voz alta:
 • Coco techa su choza con chapa
 • Pepa pasea su perrito por el patio
 • Susi se saca sus zapatos
 • Jacinta tiene sus zapatos puestos
Please read the following sentences aloud:
• Coco roofs his hut with metal sheets*
• Pepa walks her dog in the yard*
• Susi takes off her shoes*
• Jacinta has her shoes on*
Por favor cuente desde 60 hasta 65 Please count from 60 to 65.
Lea las siguientes frases en voz alta:
 • La abuela habla con la niña
 • Veo el agua y las olas en el mar
 • Alba hace unos huevos batidos
 • El avión vuela en el aire
Please read the following sentences aloud:
• The grandmother talks to the little girl*
• I see the water and the waves in the sea*
• Alba makes some beaten eggs*
• The plane flies in the air*
Por favor cuente desde 80 hasta 85 Please count from 80 to 85
* Translation to English is provided only for descriptive purposes. The English translations are not intended to be used as stimuli, as they are not phonemically equivalent.

Footnotes

Declaration of interests

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

Katherine Marks is funded by the following grants from the NIH NIDCD: R21DC023280 (PI: Marks) and R01DC009029 (PI: Eddins).

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

