Abstract
Purpose:
Despite the general agreement that dysarthria characteristics are largely language-independent, few efforts have attempted a systematic comparison across languages. To examine the role of native language in the perception of speech characteristics of dysarthria secondary to Parkinson's disease (PD), auditory-perceptual ratings of dysarthria and the confidence level of those judgments were compared between two listener groups: language-matched and language-crossed.
Method:
A total of 60 listeners (35 native speakers of Korean and 25 native speakers of American English) estimated speech abnormality for 20 speech dimensions using a visual analog scale method for both language-matched and language-crossed speech stimuli. Speech stimuli were passage readings of the respective languages obtained from individuals with and without PD.
Results:
For speech dimension ratings, eight of the 20 speech dimensions revealed significant differences in response to PD speech between the two listener groups; for most of these, language-crossed listeners' ratings were lower (i.e., judged more impaired) than those of language-matched listeners. For confidence-level ratings, language-matched listeners were less confident in their ratings of speakers with PD than were the language-crossed listeners.
Conclusions:
The data support both language-universal and language-specific aspects in perceiving dysarthria characteristics, such that native language plays a role, especially when rating articulatory- and rhythmic-related characteristics. The findings are discussed with respect to the role of linguistic information, such as phonetic inventories and prosodic structures, in perceiving dysarthria characteristics.
Despite advances in speech technology providing various tools for assessing motor speech disorders, auditory-perceptual methods remain the primary standard for describing and identifying speech characteristics of dysarthria, frequently used for clinical judgments such as speech severity and functional change over the course of the disease or treatment (Bunton et al., 2007; Duffy, 2005; Hirsch et al., 2022). The framework of such perceptual evaluation is rooted in the Mayo Clinic classification system (hereafter, the Mayo Clinic system), which lists 38 dimensions across five aspects of speech production: respiration, phonation, resonance, articulation, and prosody (Darley et al., 1969a, 1969b). Since then, the dysarthria labels (i.e., types) and the primary, distinctive characteristics of each type in the Mayo Clinic system have provided the framework and vocabulary for describing speech characteristics of certain types of dysarthria (e.g., ataxic dysarthria: Hilger et al., 2023; Kent et al., 2000; Yorkston & Beukelman, 1981) and diseases (e.g., Parkinson's disease [PD]: Canter, 1963; Ho et al., 1998; Kim, 2017; Skodda & Schlegel, 2008).
Such language and terms, which are predominantly based on studies of American English, have been translated into many languages and are widely used for research and clinical purposes in those countries. For example, according to Liss et al. (2013), at least 23 languages across 50 papers have characterized dysarthric speech using the Mayo Clinic classification system. A decade later, we replicated the search using the same tools (PubMed, Medline, and Google Scholar) and key words, which revealed an increase in both the number of languages studied (n = 28) and the number of papers (n = 88). Despite this steady growth, a timely question remains open: should our current language and scheme of dysarthria be expected to fit all languages? Furthermore, data obtained from languages other than English at any level (perceptual, acoustic, and physiologic) have not been incorporated into dysarthria research and clinical practice in a meaningful way. Therefore, it is as yet uncertain whether knowledge obtained from studies of American English can be generalized to other languages. If it cannot, current theory regarding the speech characteristics of dysarthria, as well as current practice in the evaluation and management of individuals with dysarthria, will need revision. We argue that the role of language has been generally underestimated in understanding dysarthria, both academically and clinically. A better understanding of the interaction between native language and dysarthria is important, especially considering the increasingly diverse populations served in clinical practice.
Increasing Questions About the Language Universality of Dysarthria
Within the field of speech-language pathology, dysarthria research has been relatively late in attending to the role of language in the nature and characteristics of the speech disorder (Moya-Galé et al., 2023). By comparison, studies of aphasia, the language side of neurogenic communication disorders in adults, had identified effects of language on aphasia characteristics by the late 1980s, finding that language differences accounted for more variance than aphasia type differences (Menn & Obler, 1990; Vaid & Pandit, 1991; Wulfeck et al., 1989). Studies of crosslinguistic effects in aphasia have since advanced further, exploring language- and culture-specific methodologies for the assessment and treatment of adults with aphasia (Armstrong et al., 2020; Brewer et al., 2020; Lerman et al., 2020).
As pointed out by Miller et al. (2014), the relative lack of cross-language studies in dysarthria may stem from the assumption that speech motor control and its disorders are universal and uniform across languages. Ostensibly, it seems reasonable to assume that speech motor control, such as speech breathing, vocal fold vibration, and tongue and lip movement, must be the same for a speaker of any language. However, these physical activities are recruited during speech production to meet specific requirements that are determined by the language. For example, out of the wide range of possible voice onset times (from −150 to +115 ms across languages; Lisker & Abramson, 1964), each language utilizes particular values to serve its phonetic/phonemic contrasts (e.g., two stop categories in English vs. four in Hindi). Two speakers with dysarthria who have similar degrees of difficulty controlling vocal fold vibration and oral constriction but who speak different languages may therefore not be described in the same way with respect to their stop consonant errors. Furthermore, the abnormal distribution of voice onset time for stop consonants is an important articulatory problem in dysarthria, as it has been reported as a potential language-specific contributor to intelligibility (Kim & Choi, 2017).
Another situation in which language difference arises as an issue is the use of perceptual descriptors of dysarthria for clients whose native language is different from English. It is easy to assume that a speech-language pathologist (SLP) may be able to identify certain dysarthria characteristics, such as audible inspirations and pitch breaks, even without extensive knowledge of the client's language. However, other characteristics, such as the stress and rhythm patterns or the sound (articulatory) accuracy appropriate to the language, may be difficult for nonnative speakers to evaluate. To the best of our knowledge, two studies have examined the role of an SLP's native language in the assessment of nonnative speakers with dysarthria. First, Hartelius et al. (2003) obtained passage recordings from 10 native speakers of Swedish and 10 native speakers of Australian English who had been diagnosed with multiple sclerosis (MS). Two native speakers of Australian English and two native speakers of Swedish, all of whom had clinical experience in dysarthria ratings, assessed speech characteristics. The judges used 4-, 5-, and 7-point scales to rate 33 dimensions, mostly focused on phonatory-prosodic characteristics. The authors concluded that perceptual assessments of speakers with MS can be performed with high interrater reliability (a mean rho of 85.7 and 84.3 for Australian and Swedish speakers, respectively), irrespective of the judge's knowledge of the speaker's language.
Second, Näsström and Schalling (2020) explored whether a Swedish-speaking SLP could assess Arabic-speaking participants with the assistance of an interpreter. Based on the general agreement between a Swedish-speaking (nonnative) SLP and an Arabic-speaking (native) SLP across seven speakers with dysarthria secondary to amyotrophic lateral sclerosis, stroke, MS, and PD, the authors claimed that an SLP who does not speak the patient's native language can perform the assessment in collaboration with an interpreter. However, this study included a small number of participants (one Swedish-speaking SLP, one Arabic-speaking SLP, and one interpreter), and the speech stimuli employed in the assessment were limited to sustained sounds (/a/, /s/) and syllable repetition. The assessment categories were broad, such as “respiration/phonation,” “oromotor/velopharyngeal function,” and “articulation,” rather than detailed, specific speech characteristic items. Although the two studies provide important information on the role of the SLP's language in dysarthria assessment, given the limited number of studies and methods (e.g., number of listeners, speech stimuli, speech dimensions, speaker etiologies), we do not have a clear understanding of how native language interacts with dysarthria when evaluated perceptually, especially by naïve listeners (i.e., those with no clinical experience in motor speech disorders).
The Current Study: Within- and Cross-Language Ratings of Dysarthria Characteristics
Motivated by the question, “What is the role of the listener's native language in perceiving dysarthria characteristics?” the current study provides a within- and cross-language data set of perceptual ratings of dysarthria between American English and Korean. Given the increasing data that support language-specific and language-universal characteristics identified in the literature on speech production of dysarthria, it is natural to ask about language universality or specificity in perceiving speech characteristics of dysarthria.
To answer the question, we conducted an auditory-perceptual experiment in which speech characteristics of PD were judged by listeners who responded to utterances in both their native and nonnative languages. PD was chosen as a model because of its relatively straightforward association with speech deterioration (i.e., hypokinetic dysarthria; Duffy, 2012). Korean was selected because it differs from American English in several interesting respects, including its relatively simple vowel system, complex stop consonant cognates, and lack of lexical stress. Primary differences between the two languages and supporting studies are summarized in Table 1. Furthermore, existing comparisons between the two languages support both language-universal and language-specific characteristics of dysarthria (Kim & Choi, 2017).
Table 1.
Primary differences in speech sounds and prosody between the selected languages: English and Korean.
| Category | Aspect | English | Korean | Supporting studies |
|---|---|---|---|---|
| Phonetic inventory | Vowels | 11 or 12 monophthongs; tense-lax contrast | 7 monophthongs; no tense-lax contrast | Shin et al. (2013), Kim & Choi (2017) |
| | Consonants | 2 stop cognates (voiced, voiceless) | 3 stop cognates (lax, aspirated, tense) | Cho et al. (2002) |
| Prosody | Rhythm | Stress-timed; lexical stress | Syllable-timed; no lexical stress | Arvaniti (2012), Mok & Lee (2008) |
| | Articulation rate | Similar (4–5 syllables/s) | Similar (4–5 syllables/s) | Cha (2001), Solomon & Hixon (1993) |
Figure 1 visualizes our two listener groups, which were categorized depending on whether the listener's native language matched the language spoken by the speaker they were rating. Ratings for speakers who shared the same language as the listener, whether they were neurologically healthy controls (HCs) or individuals with PD, fell into the “language-matched” group. Conversely, if listeners rated the speech of a speaker whose language differed from their own, their ratings were assigned to the “language-crossed” group.
Figure 1.
The language-pairing categorization schema for the listener groups employed in this study. AE = American English; KO = Korean.
Along with speech characteristics of dysarthria, one listener-intrinsic variable, listeners' confidence, was included for rating, as confidence level may serve as another index of listener attributes in ratings of dysarthric speech. When information is insufficient for decision making (i.e., when linguistic information in the speech signal is lacking), perceptual judgment may rely more on confidence in estimating the degree of speech abnormality in dysarthria (Baranski & Petrusic, 1994). Although limited, previous studies support a mismatch between listeners' own perception of their ratings and their actual perceptual ratings of dysarthria (Hustad, 2006, 2007). For example, based on the lack of correlation between confidence ratings and intelligibility scores, Hustad (2007) speculated that confidence ratings may be a proxy for some other phenomenon, such as processing load or working memory. In our study, confidence ratings may provide information on the burden of processing a foreign language in addition to atypical speech signals from talkers with dysarthria. Specifically, we posed the following two research questions:
Speech characteristics of dysarthria: Do language-matched and language-crossed listeners perceive speech characteristics of dysarthria in the same way?
Confidence level: Is the confidence of listeners higher when providing perceptual ratings for language-matched speakers than for language-crossed speakers?
Based on findings supporting both language-specific and language-universal characteristics in the speech production of dysarthria, we hypothesized that some, but not all, speech dimensions would show differences in ratings between language-matched and language-crossed listeners. We also hypothesized that language-crossed listeners would show a lower confidence level than language-matched listeners due to the limited information available and the increased burden of foreign language processing.
Method
Perceptual experiments were conducted at two data collection sites: Florida State University (Tallahassee, FL) and Hallym University (Chuncheon, Gangwon-do, South Korea). The study protocol was approved by both universities, and all participants gave informed consent.
Speakers
Speech samples from four speakers who were previously recorded for a larger project (Kim & Choi, 2017) were used in the study. The four speakers were carefully selected from the database by the first author, a native speaker of Korean, in consideration of several factors. First, male speakers were chosen to control for sex effects that may affect overall speech characteristics and PD symptoms (Gillies et al., 2014). Second, the PD speakers from each language were estimated to have equivalent speech intelligibility and other speech characteristics. For example, averaged intelligibility scores for the American English and Korean speakers with PD were 7.3 and 7.5, respectively, on a 10-point equal-appearing interval scale where 1 represented totally unintelligible and 10 completely intelligible (Kim & Choi, 2017). Table 2 provides demographic information and primary dysarthria characteristics (as determined by the first author) for each of the speakers.
Table 2.
Speaker information (Parkinson's disease and healthy controls).
| Speaker ID | Age | Sex | Native language | Intelligibility | Primary dysarthria characteristics |
|---|---|---|---|---|---|
| EnglishPD | 73 | M | American English | 7.3 | Reduced volume, deteriorating speech over time, monopitch, monoloudness |
| EnglishHC | 68 | M | American English | — | — |
| KoreanPD | 71 | M | Korean | 7.5 | Reduced volume, increasing articulation errors over time, monopitch, monoloudness |
| KoreanHC | 67 | M | Korean | — | — |
Note. ID = identification; PD = Parkinson's disease; M = male; HC = healthy control.
Speech Stimuli
Passage recordings in native languages were used for perceptual ratings of speech characteristics and confidence levels: “The Caterpillar passage” for American English (Patel et al., 2013) and “The Autumn passage” for Korean speakers (Kim, 2012). Both paragraphs were designed to sample the phonetic inventory of the language with multiple repetitions in different contexts and have often been used in the studies of motor speech disorders.
Listeners
A total of 68 young adult listeners participated in the perceptual ratings: (a) 31 English-speaking listeners (Mage = 23.0 years, SD = 2.8; two males, 29 females) and (b) 37 Korean-speaking listeners (Mage = 26.2 years, SD = 5.2; seven males, 30 females). Inclusion criteria were (a) between 18 and 40 years of age; (b) no current or prior speech, language, or hearing disorders based on self-report; (c) native speaker of the respective language group; and (d) little-to-no understanding of the counterpart language (e.g., English-speaking listeners did not understand spoken Korean). All listeners were undergraduate and graduate students majoring in speech-language pathology. Although they were not exposed to dysarthric speech on a daily basis, both groups had general knowledge of dysarthria and speech production. Our recruitment aimed to include naïve listeners with no extensive background in motor speech disorders and no advanced education in the counterpart language.
Prior to participation, each listener's proficiency in the counterpart language was examined by questionnaire, including history of living abroad, familiarity with the counterpart language, and educational background in foreign language training. In addition, a brief screening test of auditory comprehension was conducted, in which listeners were asked to orthographically transcribe a short sentence in the counterpart language. This was to ensure that lexical information did not play a role in their perceptual ratings of speech disturbances in the counterpart language. None of our participants were excluded based on the transcription responses.
Data Collection
Listeners were asked to rate 20 dimensions using a visual analog scale method for both language-matched and language-crossed speech stimuli. For feasibility, 20 speech dimensions were selected from the 38 used in the original studies (Darley et al., 1969a), motivated by several factors, including (a) the language contrasts between English and Korean, (b) prominent speech characteristics of hypokinetic dysarthria, and (c) inclusion of comprehensive speech subsystems. For example, although resonance is not frequently affected by PD, hypernasality was included in the list. On the other hand, although low volume is frequently reported for speakers with PD, it was not included because precise control of volume was not possible due to the nature of remote data collection. Table 3 summarizes the selected speech dimensions and their primary component of the speech system.
Table 3.
Selected dimensions of speech disturbance in hypokinetic dysarthria and their component of the speech system associated with each speech characteristic derived from Darley et al. (1969a).
| Primary component | Speech characteristics |
|---|---|
| Phonatory | Harsh voice, continuous breathiness |
| Articulatory | Imprecise consonants, distorted vowels, irregular articulatory breakdowns |
| Prosodic | Reduced stress, short phrases, inappropriate silences, increase of rate |
| Articulatory-prosodic | Short rushes of speech, rate |
| Phonatory-prosodic | Monopitch, pitch level |
| Phonatory-respiratory-prosodic | Monoloudness, loudness decay, excess and equal stress, hypernasality, audible inspiration |
| Overall | Overall intelligibility, overall naturalness |
The listening experiment was conducted remotely using Qualtrics. Listeners were asked to rate the degree of each speech dimension on a horizontally oriented continuous scale (i.e., no tick marks) with end points labeled normal and severe (Kim & Thompson, 2022). Given the number of dimensions to rate (n = 20), listeners were allowed to play the passage recordings multiple times. After rating each speech dimension, listeners were queried about their confidence using the same method, but with end points labeled not at all confident and extremely confident. Prior to the experiments, the listeners received basic instruction that focused on the definition of each perceptual dimension and practiced the ratings on a speech sample. The terms and definitions of each dimension were obtained from Duffy (2012) for English-speaking listeners and from Kim et al. (2016), a translation of Duffy (2012), for Korean-speaking listeners. This reflected our effort to minimize the effects of vocabulary choices that may arise from interlanguage translation. However, we acknowledge the possibility that perceptual nuances might have been lost in translation, with potential consequences for the ratings made in different languages. During the instruction, the listeners were also asked to perform the ratings in a quiet environment at a loudness level comfortable to them. For both ratings, the scale was numerically transformed after the experiment to values from 0 (severe for speech dimensions, not at all confident for confidence level) to 100 (normal for speech dimensions, extremely confident for confidence level). These values were not displayed to the listeners.
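The numeric transformation of the visual analog scale amounts to a linear mapping of cursor position onto the 0–100 range. The sketch below is purely illustrative (Qualtrics performs this conversion internally); the function and parameter names are ours.

```python
def vas_to_score(position: float, scale_length: float = 1.0) -> float:
    """Linearly map a cursor position on a continuous visual analog scale
    to the 0-100 range used in the study: 0 = severe (or 'not at all
    confident'), 100 = normal (or 'extremely confident').

    Illustrative sketch only; not the study's actual implementation."""
    if not 0 <= position <= scale_length:
        raise ValueError("cursor position must lie on the scale")
    return 100.0 * position / scale_length
```

For example, a cursor placed at the midpoint of the scale maps to a score of 50, regardless of the scale's on-screen length.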
Statistical Analysis
The data formatting, preparation, and analyses were conducted in R (R Core Team, 2023). The first research question aimed to determine whether language-matched and language-crossed listeners perceive speech characteristics of dysarthria in the same way. For this analysis, we transformed the ratings for the 20 perceptual dimensions into a long-format data set, creating a new variable labeled Rating, with Speech Dimension configured as a factor with 20 distinct levels. We then built a linear mixed-effects (LME) model with Rating (i.e., the listener perceptual ratings across all dimensions) as the outcome and the Speaker–Listener Language Pairing (i.e., language-matched, language-crossed), the Speaker Group (i.e., HC vs. PD), and their interaction as predictors. Additionally, ListenerID and Speech Dimension were entered separately as random effects to account for listener- and dimension-related variation in the intercepts. The language-matched dimension ratings of the HC speakers served as the reference level. The LME approach was used instead of a linear model or t-test approach because the assumption of independence (required for linear models and t tests) was violated by the current data.
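The wide-to-long transformation described above can be sketched as follows. The actual analysis was run in R; this Python/pandas analogue uses hypothetical column names and values (only 2 of the 20 dimensions shown) to illustrate the reshaping step.

```python
import pandas as pd

# Hypothetical wide-format excerpt: one row per listener x speaker,
# one column per speech dimension (values are made up).
wide = pd.DataFrame({
    "ListenerID": ["L01", "L01"],
    "Pairing": ["matched", "crossed"],
    "SpeakerGroup": ["PD", "PD"],
    "Monopitch": [61.4, 54.6],
    "HarshVoice": [56.4, 73.2],
})

# Melt into long format: one row per rating, with the dimension name
# stored as a categorical factor and the score in a single Rating column.
long = wide.melt(
    id_vars=["ListenerID", "Pairing", "SpeakerGroup"],
    var_name="SpeechDimension",
    value_name="Rating",
)
long["SpeechDimension"] = long["SpeechDimension"].astype("category")
```

In the full data set, the same reshaping yields one Rating row per listener, speaker, and dimension, which is the structure the LME model expects.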
A follow-up LME model was created to examine specific language pairing and speaker group effects for each of the 20 perceptual dimensions. This model built on the previous model by adding Speech Dimension as a fixed effect. Therefore, in this model, the Rating variable served as the outcome, with the interaction among Speech Dimension (i.e., the 20 perceptual dimensions), Speaker Group, and Speaker–Listener Language Pairing (i.e., matched vs. crossed) as the predictor. Again, the model included ListenerID as a random effect to account for variation in the intercepts among listeners. To examine the pairwise comparisons between the matched and crossed pairings for each dimension and speaker group, we used the emmeans package (Lenth et al., 2019). The pairwise comparisons were evaluated using a conservative Bonferroni correction to account for multiple comparisons (i.e., .05/20 = .0025).
The second research question examined whether listeners' confidence is higher when providing perceptual ratings for language-matched speakers than for language-crossed speakers. For this model, we took an approach similar to the models built for our first research question. Specifically, we transformed the confidence ratings for the 20 perceptual dimensions into a long-format data set, again creating a variable named Rating, with Speech Dimension designated as a factor with 20 distinct levels. We then built an LME model with Rating (i.e., the listener confidence ratings across all dimensions) as the outcome and the Speaker–Listener Language Pairing (i.e., language-matched, language-crossed), the Speaker Group (i.e., HC vs. PD), and the interaction between Language Pairing and Speaker Group as predictors. Additionally, ListenerID and perceptual dimension were entered separately as random effects to account for listener- and dimension-related variability in the intercepts. The language-matched confidence ratings of the HC speakers served as the reference level.
Finally, as for our first research question, a follow-up model was built that included Speech Dimension as a fixed effect. In this model, the Rating variable served as the outcome, with the interaction among Speech Dimension (i.e., the 20 perceptual dimensions), Speaker Group, and Speaker–Listener Language Pairing (i.e., matched vs. crossed) as the predictor, and ListenerID as a random effect to account for variation in the intercepts among listeners. Again, the emmeans package was used to examine pairwise comparisons between the matched and crossed pairings for each dimension and speaker group (Lenth et al., 2019). The pairwise comparisons were evaluated using a conservative Bonferroni correction to account for multiple comparisons (i.e., .05/20 = .0025).
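The Bonferroni criterion applied in both follow-up analyses reduces to a single per-comparison threshold. A minimal sketch (the constant and helper names are ours):

```python
# Bonferroni correction: divide the familywise alpha (.05) by the number
# of per-dimension comparisons (20), giving the .0025 threshold used in
# the analyses.
ALPHA = 0.05
N_COMPARISONS = 20
THRESHOLD = ALPHA / N_COMPARISONS  # 0.0025

def is_significant(p_value: float) -> bool:
    """Flag a pairwise comparison only if p falls below the corrected threshold."""
    return p_value < THRESHOLD
```

Under this criterion, a comparison with p = .01, which would pass an uncorrected .05 cutoff, is not flagged as significant.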
Listener Reliability
Both inter- and intralistener reliability were checked for listener ratings. For intralistener reliability, 10% of randomly selected speech dimensions were rated again by the listeners. Listeners whose absolute difference between first and second ratings exceeded 40 for more than three dimensions were removed from the study (Kim & Thompson, 2022). This process identified 35 Korean-speaking listeners and 25 English-speaking listeners who met the inclusion criterion. After this process, intralistener reliability was computed using the Pearson correlation coefficient (r = 0.87). For interlistener reliability, we calculated the intraclass correlation coefficient for each of the four speakers: 0.936 (English speaker without PD), 0.764 (English speaker with PD), 0.941 (Korean speaker without PD), and 0.915 (Korean speaker with PD). Interlistener reliability was thus considered good to excellent (Koo & Li, 2016).
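One reading of the intralistener exclusion criterion above can be sketched as follows; the function and parameter names are ours, while the thresholds (a 40-point difference, more than three flagged dimensions) follow the text.

```python
def keep_listener(first_ratings, second_ratings,
                  max_diff: float = 40.0, max_flags: int = 3) -> bool:
    """Retain a listener unless more than `max_flags` of the re-rated
    dimensions differ from the original rating by more than `max_diff`
    points on the 0-100 scale.

    Sketch of one interpretation of the study's criterion, not the
    authors' actual code."""
    flags = sum(abs(a - b) > max_diff
                for a, b in zip(first_ratings, second_ratings))
    return flags <= max_flags
```

For example, a listener whose re-ratings all fall within 40 points of the originals is retained, whereas one with four or more re-ratings off by more than 40 points is excluded.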
Results
Research Question 1: Ratings of Speech Characteristics of Dysarthria
Descriptive statistics for the perceptual ratings across all 20 dimensions are presented in Table 4. Statistical findings are presented on the left of Table 5, labeled Dimension Ratings. The findings revealed that the language-matched and language-crossed groups did not differ in their ratings of the HC speakers, t(4716) = 1.05, p = .293. Additionally, and not surprisingly, the language-matched ratings were significantly lower for speakers with PD compared to the HC speakers, t(4716) = −28.90, p < .001. However, there was a significant interaction between language pairing and speaker group, such that language-matched ratings of speakers with PD were significantly higher (i.e., less impaired) than the language-crossed ratings of speakers with PD, t(4716) = −6.09, p < .001.
Table 4.
Descriptive statistics for the perceptual measures, presented by language pairing and speaker group. A higher value indicates closer to normal for each dimension and vice versa.
| Perceptual dimension | Matched: HC M | Matched: HC SD | Matched: PD M | Matched: PD SD | Crossed: HC M | Crossed: HC SD | Crossed: PD M | Crossed: PD SD |
|---|---|---|---|---|---|---|---|---|
| Distorted vowels | 97.23 | 6.97 | 62.97 | 39.94 | 96.63 | 7.93 | 42.93 | 35.68 |
| Excess and equal stress | 94.43 | 13.01 | 73.43 | 28.64 | 94.87 | 12.81 | 67.23 | 30.13 |
| Imprecise consonants | 95.87 | 10.89 | 56.77 | 42.18 | 94.90 | 11.29 | 35.80 | 32.77 |
| Irregular articulatory breakdowns | 97.17 | 8.50 | 62.57 | 35.41 | 95.10 | 10.31 | 41.43 | 34.23 |
| Loudness decay | 95.37 | 12.42 | 63.47 | 27.73 | 94.57 | 12.49 | 49.87 | 25.83 |
| Overall intelligibility | 95.63 | 15.18 | 59.53 | 41.32 | 93.33 | 17.62 | 30.60 | 34.92 |
| Overall naturalness | 95.67 | 9.20 | 57.47 | 35.38 | 94.17 | 12.74 | 47.83 | 25.86 |
| Short rushes of speech | 93.53 | 14.12 | 72.67 | 31.00 | 90.67 | 16.26 | 55.30 | 32.31 |
| Audible inspiration | 91.97 | 13.71 | 83.37 | 23.91 | 91.97 | 12.32 | 78.00 | 23.18 |
| Continuous breathiness | 91.17 | 14.33 | 50.20 | 26.80 | 89.47 | 17.60 | 61.50 | 30.56 |
| Harsh voice | 87.60 | 17.67 | 56.43 | 22.77 | 87.90 | 16.04 | 73.20 | 26.76 |
| Hypernasality | 97.53 | 8.69 | 84.47 | 22.70 | 95.77 | 10.13 | 75.17 | 23.38 |
| Inappropriate silences | 95.22 | 10.60 | 78.57 | 20.88 | 95.10 | 10.25 | 72.27 | 28.01 |
| Increase of rate | 94.97 | 13.47 | 85.53 | 22.54 | 93.73 | 12.72 | 74.17 | 28.09 |
| Monoloudness | 93.50 | 14.68 | 59.30 | 31.60 | 89.97 | 21.96 | 49.10 | 28.61 |
| Monopitch | 88.53 | 20.74 | 61.40 | 33.60 | 91.60 | 18.92 | 54.63 | 30.55 |
| Pitch | 91.47 | 17.86 | 70.00 | 26.33 | 93.87 | 15.21 | 68.57 | 24.62 |
| Rate | 93.46 | 14.96 | 71.57 | 31.10 | 89.97 | 18.51 | 70.03 | 29.88 |
| Reduced stress | 93.03 | 13.86 | 63.00 | 33.70 | 93.70 | 15.20 | 49.33 | 30.65 |
| Short phrases | 93.10 | 14.54 | 73.30 | 24.33 | 90.20 | 15.93 | 71.60 | 24.71 |
Note. HC = healthy control; PD = Parkinson's disease.
Table 5.
Model findings for predicting the listeners' perceptual ratings based on listener pairing (Model 1) and the interaction between listener pairing and speaker group (Model 2).
| Predictors | Dimension ratings: Estimates | CI | p | Confidence ratings: Estimates | CI | p |
|---|---|---|---|---|---|---|
| (Intercept) | 93.84 | [90.22, 97.46] | < .001 | 78.64 | [74.01, 83.26] | < .001 |
| Language-matched × HC | Reference | | | Reference | | |
| Language-crossed × PD | −7.91 | [−10.45, −5.36] | < .001 | 3.20 | [1.06, 5.33] | .003 |
| Language-crossed | −0.97 | [−2.77, 0.83] | .293 | −11.05 | [−12.56, −9.54] | < .001 |
| PD | −26.54 | [−28.34, −24.74] | < .001 | −14.42 | [−15.93, −12.91] | < .001 |
| Random effects | | | | | | |
| σ² | 505.73 | | | 355.26 | | |
| τ00 | 88.56 (ListenerID) | | | 304.05 (ListenerID) | | |
| | 30.09 (Dimension) | | | 4.00 (Dimension) | | |
| ICC | 0.19 | | | 0.46 | | |
| N | 20 (Dimension); 60 (ListenerID) | | | 20 (Dimension); 60 (ListenerID) | | |
| Observations | 4798 | | | 4800 | | |
| Marginal R² / conditional R² | 0.280/0.417 | | | 0.088/0.512 | | |
Note. Bold values indicate significance at p < .0025. CI = confidence interval; HC = healthy control; PD = Parkinson's disease; ICC = intraclass correlation coefficient.
To further explore the language pairing and speaker group effects on 20 individual speech dimensions, we added the perceptual dimension to our model as a fixed effect. Specifically, we examined the three-way interaction between language pairing (language-matched vs. language-crossed), speaker group (HC vs. PD), and perceptual dimension (each of the 20 dimensions). For this analysis, we used the emmeans package (Lenth et al., 2019).
The average ratings between language-matched and language-crossed listeners are visualized in Figure 2. Across the 20 perceptual speech dimensions, differences between language-matched and language-crossed ratings were significant for eight speech dimensions for PD speakers. These dimensions include harsh voice, short rushes of speech, reduced stress, loudness decay, irregular articulatory breakdowns, distorted vowels, imprecise consonants, and overall intelligibility. No differences were found in speech characteristics ratings for HC speakers.
Figure 2.
Perceptual ratings for 20 speech dimensions for HC and PD across the two listener groups: language-matched and language-crossed. Speech dimensions are listed in order of averaged speech ratings across the groups (from high to low). Higher values indicate speech closer to normal on each dimension. Speech dimensions that showed a significant difference between the listener groups are highlighted in bold and with an asterisk. HC = healthy control; PD = Parkinson's disease.
Specifically, for the perceptual ratings of speakers with PD, the language-matched ratings were significantly higher than the language-crossed ratings for short rushes of speech, t(4659) = 4.42, p < .001; reduced stress, t(4659) = 3.48, p < .001; loudness decay, t(4659) = 3.46, p < .001; irregular articulatory breakdowns, t(4659) = 5.38, p < .001; distorted vowels, t(4659) = 5.10, p < .001; imprecise consonants, t(4659) = 5.34, p < .001; and overall intelligibility, t(4659) = 7.37, p < .001. In contrast, for the perceptual ratings of harsh voice in speakers with PD, the language-matched ratings were significantly lower than the language-crossed ratings, t(4659) = −4.27, p < .001.
Research Question 2: Ratings of Confidence Level
The findings for our second research question are presented on the right side of Table 4, labeled Confidence Ratings. For the ratings of the HC speakers, language-matched listeners were more confident than language-crossed listeners, t(4718) = −14.37, p < .001. Additionally, language-matched listeners were less confident in their ratings of speakers with PD than in their ratings of the HC speakers, t(4718) = −18.74, p < .001. However, there was a significant interaction between language pairing and speaker group, such that language-matched listeners were relatively less confident in their ratings of speakers with PD compared to language-crossed listeners, t(4718) = 2.94, p = .003. Visual examination of Figure 3 helps interpret this interaction: the difference in confidence between ratings of speakers with PD and HC speakers was reduced among language-crossed listeners compared to language-matched listeners.
Figure 3.
Confidence ratings for 20 speech dimensions for HC and PD across the two listener groups: language-matched and language-crossed. Speech dimensions are listed following the order of averaged speech ratings across the groups (from high to low). Speech dimensions that showed a significant difference between the listener groups are highlighted in bold and with an asterisk. HC = healthy control; PD = Parkinson's disease.
To further explore the language pairing and speaker group effects on the confidence ratings of each of the 20 dimensions, we added perceptual dimension to our model as a fixed effect. Specifically, we examined the three-way interaction between language pairing (language-matched vs. language-crossed), speaker group (HC vs. PD), and perceptual dimension (each of the 20 dimensions) using the emmeans package (Lenth et al., 2019).
The average confidence ratings between language-matched and language-crossed listeners are visualized in Figure 3. Differences between the confidence ratings of language-matched and language-crossed listeners across the 20 dimensions were primarily observed for the ratings of HC speakers. However, a handful of dimensions were sensitive to the listener/speaker pairing for the PD speakers. All language pairing effects were in the expected direction (i.e., language-matched listeners were more confident than language-crossed listeners).
Specifically, for the confidence ratings of HC speakers, the language-matched ratings were significantly higher than the language-crossed ratings for rate, t(4661) = 3.37, p < .001; increase of rate, t(4661) = 3.20, p = .001; inappropriate silences, t(4661) = 3.51, p < .001; overall naturalness, t(4661) = 4.44, p < .001; short rushes of speech, t(4661) = 3.87, p < .001; overall intelligibility, t(4661) = 6.94, p < .001; imprecise consonants, t(4661) = 6.03, p < .001; irregular articulatory breakdowns, t(4661) = 5.91, p < .001; and distorted vowels, t(4661) = 6.97, p < .001. Additionally, for the confidence ratings of the speakers with PD, the language-matched ratings were significantly higher than the language-crossed ratings for overall intelligibility, t(4661) = 3.54, p < .001; imprecise consonants, t(4661) = 3.62, p < .001; irregular articulatory breakdowns, t(4661) = 3.33, p < .001; and distorted vowels, t(4661) = 5.29, p < .001.
Discussion
Cross-language studies on speech characteristics of dysarthria are emerging, but few systematically compare results across languages, and even fewer have examined the effect of native language on the perception of dysarthria. Consistent with findings from the speech production literature, our results support both language-universal and language-specific aspects of perceiving dysarthric speech, as detailed below.
Language-Universal Versus Language-Specific Speech Dimensions to Ears
Overall, speech characteristics were perceived similarly by language-matched and language-crossed listeners. For the healthy speakers' stimuli, none of the speech dimensions revealed different ratings between the two listener groups, which likely reflects a ceiling effect.
In addition, for PD speakers' stimuli, ratings did not differ between the two listener groups for 12 of the 20 speech dimensions. Specifically, regardless of the listeners' language background, abnormal speech patterns such as audible inspiration, hypernasality, inappropriate silences, short phrases, monopitch, and monoloudness were estimated to the same degree. Greater variability was observed in perceptual ratings of PD speakers than of HC speakers, as indicated by the standard deviations (see Table 4), which is consistent with previous research (Bunton et al., 2007; Patel & Campellone, 2009). However, this interlistener variability was comparable between the two listener groups, language-matched and language-crossed (see Table 4).
More important, eight speech dimensions revealed a significant difference in perceptual ratings contingent on the listener's native language. We argue that these are likely language-specific aspects of auditory-perceptual ratings of dysarthria, in which the listener's native language plays a role. Seven of these dimensions were rated lower (i.e., more abnormal) by language-crossed listeners than by language-matched listeners: short rushes of speech, reduced stress, loudness decay, irregular articulatory breakdowns, distorted vowels, imprecise consonants, and overall intelligibility. One dimension, harsh voice, was rated higher (i.e., more normal) by language-crossed listeners than by language-matched listeners.
It is notable that all articulatory-related perceptual dimensions (irregular articulatory breakdowns, distorted vowels, and imprecise consonants) were among the language-specific dimensions. This finding aligns with the intuition that assessing segmental integrity (i.e., the goodness of the sounds) is challenging without knowing the given language's phonetic/phonemic system. Speech perception is described as an interaction between acoustic signals (bottom–up processing) and the listener's prior linguistic knowledge (top–down processing) as listeners attempt to match the acoustic input onto stored sound representations (Luce & McLennan, 2005). When a perception task specifically requires a good representation of vowels and consonants, as in our task for the articulatory-related dimensions, listeners' lack of knowledge of the phonetic/phonemic inventories interferes with their responses. In this sense, differing estimates between the listener groups for overall intelligibility ratings are understandable, as speech intelligibility is commonly defined as the degree to which sound sequences are decoded by listeners. However, the other “overall” index of speech function, naturalness, did not reveal different ratings between the two listener groups. We speculate that this reflects the different nature of the two constructs: articulatory disturbances affect judgments of speech intelligibility to a greater degree, whereas prosodic deficits affect judgments of speech naturalness (Anand & Stepp, 2015; Hilger et al., 2023; Kent & Rosenbek, 1982).
One possible explanation for the language-specific ratings of reduced stress is the difference in stress and rhythmic patterns between the two languages in our study (see Table 1). For example, American English is a stress-timed language, and stress within a word can serve as a grammatical cue (e.g., address as a noun /ˈadres/ or a verb /əˈdres/). Korean, in contrast, is a syllable-timed language and does not have lexical stress (but see Heo, 1965). The differing functional weight that stress carries in the two languages may result in language-specific estimates of reduced stress. Differing rhythmic patterns could potentially affect several prosodic dimensions, such as excess and equal stress, monopitch, and monoloudness. Among these, reduced stress may be the dimension to which listeners (especially naïve listeners) react most sensitively when rhythmic patterns differ.
What is unexpected is that language-crossed listeners gave lower ratings than language-matched listeners for these language-specific speech dimensions (except one dimension, harsh voice). One might assume that perceptual ratings of dysarthria obtained from a listener who does not know or speak the language would underestimate the degree of disordered speech dimensions. However, our data suggest the opposite. Relevant discussion is found in the literature on foreign-language speech perception in adverse listening conditions. Foreign speech perception has been extensively studied to understand the role of linguistic factors in speech decoding. Moreover, foreign speech perception in adverse conditions, frequently created by combining speech signals with noise or reverberation, has been a long-standing topic, as it resembles the everyday listening situations that foreign-language learners cope with (Lecumberri et al., 2010). Over decades, strong experimental evidence has shown that the effect of noise on speech perception is much greater for nonnative than for native listeners. For example, nonnative listeners suffer more from increasing noise than native listeners in consonant identification (Nábělek & Donahue, 1984; Takata & Nábělek, 1990). The difficulties are exacerbated when the perception task demands higher level linguistic information, such as word or sentence perception/recognition (Black & Hast, 1962; Meador et al., 2000; Rogers et al., 2006). Some perceptual errors in noise made by nonnative listeners are due to the influence of the first-language sound system on listener responses (MacKay et al., 2001).
Although the type and nature of the challenge differ between foreign speech perception in noise and foreign speech perception of persons with dysarthria, the lower ratings of language-crossed listeners may reflect a similar double challenge (i.e., atypical speech due to dysarthria and limited knowledge of the foreign language) in the perceptual processing and rating of dysarthria.
Further studies are necessary to explain why one dimension, harsh voice, was rated higher (i.e., less impaired) for the speakers with PD by language-crossed listeners than by language-matched listeners. Although the result did not achieve statistical significance, the continuous breathiness dimension was also rated higher by language-crossed than by language-matched listeners. In fact, these are the only two dimensions rated higher by language-crossed listeners (see Figure 2). Auditory-perceptual estimation of voice quality may differ from other dimensions of speech production, as supported by a study of dysphonia (Ghio et al., 2011). Albeit limited, evidence supports an influence of native language on the perception of voice quality (e.g., Kreiman & Gerratt, 2010). Cross-language effects on the perception of voice quality in dysarthric speech warrant future studies with a design focusing specifically on foreign language and voice quality.
It should be noted that, as an initial attempt to study the effect of language background on perceptual assessments of dysarthria, this study treated the language-crossed listeners as one group, without splitting them into the two respective language groups (i.e., native speakers of American English vs. native speakers of Korean). A follow-up analysis in progress considers the directional language specificity between the two language groups, which further supports the need to consider native language in the perceptual assessment of dysarthria. For example, English-speaking listeners provided lower ratings than Korean-speaking listeners for the dimensions reduced stress and excess and equal stress regardless of the speakers' language, whereas Korean-speaking listeners tended to provide lower ratings for irregular articulatory breakdowns. Although a detailed analysis is warranted, the varying sensitivity to different speech dimensions between the two listener language groups may reflect characteristics of the listeners' native language, such as the use of lexical stress (hence, greater sensitivity to reduced prosodic variation) and phonetic complexity (hence, greater sensitivity to reduced phonetic sharpness). Relatedly, interlistener reliability was notably lower for the English-speaking speaker with PD than for the other three speakers included in the study. Given the small number of speakers and the merged data from language-matched and language-crossed listeners, it is too soon to conclude whether this finding is another aspect of language-specific characteristics (e.g., Korean-speaking listeners' ratings vary to a large degree in response to an English-speaking speaker with PD, whereas English-speaking listeners' ratings are as consistent as Korean-speaking listeners' in response to a Korean-speaking speaker with PD).
A follow-up analysis focusing on the language-directionality may shed light on this point.
Greater Confidence Loss for Foreign Language Ratings Than for Dysarthria Speech Ratings
As predicted, language-matched listeners felt more confident in their ratings than language-crossed listeners, and raters felt more confident rating the speech characteristics of healthy speakers than of speakers with PD. However, our data suggest that foreign speech is a greater source of confidence loss in perceptual ratings than dysarthric speech. This statement is based on two observations. First, language-matched listeners were less confident in their ratings of speakers with PD compared to language-crossed listeners. Possibly, knowledge of and fluency in the language in fact amplify the uncertainty of ratings when listeners rate the degree of deviant speech dimensions. Second, as visualized in Figure 3, language-crossed listeners showed significantly reduced confidence compared to language-matched listeners for a greater number of speech dimensions for HC speech (n = 9) than for PD speech (n = 4).
Despite the generally reduced loss of confidence when rating PD speech, language-crossed listeners were still less confident for four speech dimensions: distorted vowels, irregular articulatory breakdowns, excess and equal stress, and overall intelligibility. We believe this further highlights the language specificity of articulatory and rhythmic aspects of dysarthria.
Limitations and Future Directions
It should be noted that the terms language-universal and language-specific are confined to the two languages included in this study and may apply differently to other languages. For example, hypernasality may be a language-specific speech dimension in a language that employs phonemic contrasts between nasal and oral vowels (e.g., French and Portuguese). Furthermore, our listeners were undergraduate and graduate students majoring in speech-language pathology, an approach primarily based on access to the recruitment pool. Practicing speech-language pathologists, especially those with extensive experience with dysarthria, might show a greater degree of language-independent ratings of dysarthria characteristics. Last, we included only one male speaker for each speaker group (e.g., Korean speaker groups with and without PD, English speaker groups with and without PD), mainly for two reasons. First, special effort was made to match the speech characteristics of PD between the two languages as closely as possible to minimize interspeaker differences, in light of increasing research reporting substantial heterogeneity in the speech characteristics of PD, possibly reflecting factors such as PD subtype and sex (Paulus & Jellinger, 1991; Stebbins et al., 2013). Second, to minimize the potential effects of interlistener variability on perceptual ratings, our listeners rated the entire set of speech stimuli (i.e., no subgroups of listeners were assigned to different sets of stimuli); therefore, a small number of speakers was preferred for feasibility. However, future studies are warranted that increase the numbers of speakers and listeners and factor in speaker and listener variability.
Conclusions
This study is an initial attempt to examine the role of the listener's native language in perceptual ratings of speech characteristics of dysarthria. The results obtained from language-matched and language-crossed listeners support the hypothesis that some speech dimensions, especially those characterizing the given language (i.e., its phonetic/phonemic inventories and prosodic structures), are rated differently contingent on the listener's native language. Our data challenge the general assumption of language universality in the perceptual characteristics of motor speech disorders and emphasize the need to consider and adjust for language-specific characteristics in the assessment and treatment of dysarthria. This has clear practical applications for clinical practice, where linguistic diversity in neurorehabilitation caseloads is expected to increase considerably.
Author Contributions
Yunjung Kim: Conceptualization (Lead), Data curation (Lead), Formal analysis (Lead), Writing – original draft (Lead). Austin Thompson: Formal analysis (Equal), Writing – original draft (Equal). Seung Jin Lee: Conceptualization (Supporting), Data curation (Equal), Writing – review & editing (Supporting).
Data Availability Statement
Due to the nature of the study, the speech recordings used for this study are not publicly available to be compliant with the institutional review board requirements. However, spreadsheets of perceptual ratings with de-identified participant information may be available from the first author upon reasonable request.
Acknowledgments
This study was in part supported by the National Institutes of Health Grants R03 DC012405 and R01 DC0204068 (awarded to principal investigator [PI]: Yunjung Kim). This work was also supported by the Ministry of Education of South Korea and the National Research Foundation of Korea Grant NRF-2022S1A5A8049773 (awarded to PI: Seung Jin Lee). The authors would like to thank the study participants in the United States and South Korea, both speakers and listeners, for their time and contribution to the study. Last, they would like to thank Courtney Wilkinson for her assistance with the data collection and analysis.
Funding Statement
This study was in part supported by the National Institutes of Health Grants R03 DC012405 and R01 DC0204068 (awarded to principal investigator [PI]: Yunjung Kim). This work was also supported by the Ministry of Education of South Korea and the National Research Foundation of Korea Grant NRF-2022S1A5A8049773 (awarded to PI: Seung Jin Lee).
References
- Anand, S., & Stepp, C. E. (2015). Listener perception of monopitch, naturalness, and intelligibility for speakers with Parkinson's disease. Journal of Speech, Language, and Hearing Research, 58(4), 1134–1144. https://doi.org/10.1044/2015_JSLHR-S-14-0243
- Armstrong, E., McAllister, M., Hersh, D., Katzenellenbogen, J. M., Thompson, S. C., Coffin, J., Flicker, L., Woods, D., Hayward, C., & Ciccone, N. (2020). A screening tool for acquired communication disorders in Aboriginal Australians after brain injury: Lessons learned from the pilot phase. Aphasiology, 34(11), 1388–1412. https://doi.org/10.1080/02687038.2019.1678107
- Arvaniti, A. (2012). The usefulness of metrics in the quantification of speech rhythm. Journal of Phonetics, 40(3), 351–373. https://doi.org/10.1016/j.wocn.2012.02.003
- Baranski, J. V., & Petrusic, W. M. (1994). The calibration and resolution of confidence in perceptual judgments. Perception & Psychophysics, 55(4), 412–428. https://doi.org/10.3758/BF03205299
- Black, J. W., & Hast, M. H. (1962). Speech reception with altering signal. Journal of Speech and Hearing Research, 5(1), 70–75. https://doi.org/10.1044/jshr.0501.70
- Brewer, K. M., McCann, C. M., & Harwood, M. L. N. (2020). Working with Māori adults with aphasia: An online professional development course for speech-language therapists. Aphasiology, 34(11), 1413–1431. https://doi.org/10.1080/02687038.2020.1738329
- Bunton, K., Kent, R. D., Duffy, J. R., Rosenbek, J. C., & Kent, J. F. (2007). Listener agreement for auditory-perceptual ratings of dysarthria. Journal of Speech, Language, and Hearing Research, 50(6), 1481–1495. https://doi.org/10.1044/1092-4388(2007/102)
- Canter, G. J. (1963). Speech characteristics of patients with Parkinson's disease: I. Intensity, pitch, and duration. Journal of Speech and Hearing Disorders, 28(3), 221–229. https://doi.org/10.1044/jshd.2803.221
- Cha, J. (2001). Comparison of speech rates between syllable repetition and reading tasks [Unpublished master's thesis]. Yonsei University, Seoul, South Korea.
- Cho, T., Jun, S.-A., & Ladefoged, P. (2002). Acoustic and aerodynamic correlates of Korean stops and fricatives. Journal of Phonetics, 30(2), 193–228. https://doi.org/10.1006/jpho.2001.0153
- Darley, F. L., Aronson, A. E., & Brown, J. R. (1969a). Differential diagnostic patterns of dysarthria. Journal of Speech and Hearing Research, 12(2), 246–269. https://doi.org/10.1044/jshr.1202.246
- Darley, F. L., Aronson, A. E., & Brown, J. R. (1969b). Clusters of deviant speech dimensions in the dysarthrias. Journal of Speech and Hearing Research, 12(3), 462–496. https://doi.org/10.1044/jshr.1203.462
- Duffy, J. R. (2005). Pearls of wisdom—Darley, Aronson, and Brown and the classification of the dysarthrias. Perspectives on Neurophysiology and Neurogenic Speech and Language Disorders, 15(3), 22–27. https://doi.org/10.1044/nnsld15.3.22
- Duffy, J. R. (2012). Motor speech disorders: Substrates, differential diagnosis, and management. Elsevier Health Sciences.
- Ghio, A., Weisz, F., Baracca, G., Cantarella, G., Robert, D., Woisard, V., Fussi, F., & Giovanni, A. (2011). Is the perception of voice quality language-dependant? A comparison of French and Italian listeners and dysphonic speakers. Proceedings of Interspeech 2011, 525–528.
- Gillies, G. E., Pienaar, I. S., Vohra, S., & Qamhawi, Z. (2014). Sex differences in Parkinson's disease. Frontiers in Neuroendocrinology, 35(3), 370–384. https://doi.org/10.1016/j.yfrne.2014.02.002
- Hartelius, L., Theodoros, D., Cahill, L., & Lillvik, M. (2003). Comparability of perceptual analysis of speech characteristics in Australian and Swedish speakers with multiple sclerosis. Folia Phoniatrica et Logopaedica, 55(4), 177–188. https://doi.org/10.1159/000071017
- Heo, W. (1965). Kwuke Umwoonhak [Korean phonology]. Cengumsa.
- Hilger, A., Cloud, C., & Fahey, T. (2023). Speech impairment in cerebellar ataxia affects naturalness more than intelligibility. The Cerebellum, 22(4), 601–612. https://doi.org/10.1007/s12311-022-01427-y
- Hirsch, M. E., Thompson, A., Kim, Y., & Lansford, K. L. (2022). The reliability and validity of speech-language pathologists' estimations of intelligibility in dysarthria. Brain Sciences, 12(8), Article 1011. https://doi.org/10.3390/brainsci12081011
- Ho, A. K., Iansek, R., Marigliani, C., Bradshaw, J. L., & Gates, S. (1998). Speech impairment in a large sample of patients with Parkinson's disease. Behavioural Neurology, 11(3), 131–137. https://doi.org/10.1155/1999/327643
- Hustad, K. C. (2006). Estimating the intelligibility of speakers with dysarthria. Folia Phoniatrica et Logopaedica, 58(3), 217–228. https://doi.org/10.1159/000091735
- Hustad, K. C. (2007). Effects of speech stimuli and dysarthria severity on intelligibility scores and listener confidence ratings for speakers with cerebral palsy. Folia Phoniatrica et Logopaedica, 59(6), 306–317. https://doi.org/10.1159/000108337
- Kent, R. D., Kent, J. F., Duffy, J. R., Thomas, J. E., Weismer, G., & Stuntebeck, S. (2000). Ataxic dysarthria. Journal of Speech, Language, and Hearing Research, 43(5), 1275–1289. https://doi.org/10.1044/jslhr.4305.1275
- Kent, R. D., & Rosenbek, J. C. (1982). Prosodic disturbance and neurologic lesion. Brain and Language, 15(2), 259–291. https://doi.org/10.1016/0093-934X(82)90060-8
- Kim, H.-H. (2012). Neurologic speech language disorders. Sigma Press.
- Kim, H.-H., Seo, M., Kim, Y., & Yoon, J. (2016). Motor speech disorders. Bakhaksa.
- Kim, Y. (2017). Acoustic characteristics of fricatives /s/ and /∫/ produced by speakers with Parkinson's disease. Clinical Archives of Communication Disorders, 2(1), 7–14. https://doi.org/10.21849/cacd.2016.00080
- Kim, Y., & Choi, Y. (2017). A cross-language study of acoustic predictors of speech intelligibility in individuals with Parkinson's disease. Journal of Speech, Language, and Hearing Research, 60(9), 2506–2518. https://doi.org/10.1044/2017_JSLHR-S-16-0121
- Kim, Y., & Thompson, A. (2022). An acoustic–phonetic approach to effects of face masks on speech intelligibility. Journal of Speech, Language, and Hearing Research, 65(12), 4679–4689. https://doi.org/10.1044/2022_JSLHR-22-00245
- Koo, T. K., & Li, M. Y. (2016). A guideline of selecting and reporting intraclass correlation coefficients for reliability research. Journal of Chiropractic Medicine, 15(2), 155–163. https://doi.org/10.1016/j.jcm.2016.02.012
- Kreiman, J., & Gerratt, B. R. (2010). Effects of native language on perception of voice quality. Journal of Phonetics, 38(4), 588–593. https://doi.org/10.1016/j.wocn.2010.08.004
- Lecumberri, M. L. G., Cooke, M., & Cutler, A. (2010). Non-native speech perception in adverse conditions: A review. Speech Communication, 52(11–12), 864–886. https://doi.org/10.1016/j.specom.2010.08.014
- Lenth, R., Singmann, H., Love, J., Buerkner, P., & Herve, M. (2019). Package “emmeans” [Computer software]. R Foundation for Statistical Computing.
- Lerman, A., Goral, M., & Obler, L. K. (2020). The complex relationship between pre-stroke and post-stroke language abilities in multilingual individuals with aphasia. Aphasiology, 34(11), 1319–1340. https://doi.org/10.1080/02687038.2019.1673303
- Lisker, L., & Abramson, A. S. (1964). A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20(3), 384–422. https://doi.org/10.1080/00437956.1964.11659830
- Liss, J. M., Utianski, R., & Lansford, K. (2013). Crosslinguistic application of English-centric rhythm descriptors in motor speech disorders. Folia Phoniatrica et Logopaedica, 65(1), 3–19. https://doi.org/10.1159/000350030
- Luce, P. A., & McLennan, C. T. (2005). Spoken word recognition: The challenge of variation. In D. B. Pisoni & R. E. Remez (Eds.), The handbook of speech perception (pp. 590–609). https://doi.org/10.1002/9780470757024.ch24
- MacKay, I. R., Flege, J. E., Piske, T., & Schirru, C. (2001). Category restructuring during second-language speech acquisition. The Journal of the Acoustical Society of America, 110(1), 516–528. https://doi.org/10.1121/1.1377287
- Meador, D., Flege, J. E., & MacKay, I. R. (2000). Factors affecting the recognition of words in a second language. Bilingualism: Language and Cognition, 3(1), 55–67. https://doi.org/10.1017/S1366728900000134
- Menn, L., & Obler, L. K. (1990). Agrammatic aphasia: Cross-language narrative sourcebook. John Benjamins. https://doi.org/10.1075/z.39
- Miller, N., Lowit, A., & Kuschmann, A. (2014). Introduction: Cross-language perspectives on motor speech disorders. In N. Miller & A. Lowit (Eds.), Motor speech disorders: A cross-language perspective (Vol. 12, pp. 7–28). Multilingual Matters. https://doi.org/10.21832/9781783092338-003
- Mok, P., & Lee, S. I. (2008, July 21–26). Korean speech rhythm using rhythmic measures [Paper presentation]. 18th International Congress of Linguists (CIL18), Seoul, South Korea.
- Moya-Galé, G., Kim, Y., & Fabiano, L. (2023). Raising awareness about language- and culture-specific considerations in the management of dysarthria associated with Parkinson's disease within the United States. Journal of Speech, Language, and Hearing Research. Advance online publication. https://doi.org/10.1044/2023_JSLHR-23-00365
- Nábělek, A. K., & Donahue, A. M. (1984). Perception of consonants in reverberation by native and non-native listeners. The Journal of the Acoustical Society of America, 75(2), 632–634. https://doi.org/10.1121/1.390495
- Näsström, A. K., & Schalling, E. (2020). Development of a method for assessment of dysarthria in a foreign language: A pilot study. Logopedics Phoniatrics Vocology, 45(1), 39–48. https://doi.org/10.1080/14015439.2019.1650392
- Patel, R., & Campellone, P. (2009). Acoustic and perceptual cues to contrastive stress in dysarthria. Journal of Speech, Language, and Hearing Research, 52(1), 206–222. https://doi.org/10.1044/1092-4388(2008/07-0078)
- Patel, R., Connaghan, K., Franco, D., Edsall, E., Forgit, D., Olsen, L., Ramage, L., Tyler, E., & Russell, S. (2013). “The Caterpillar”: A novel reading passage for assessment of motor speech disorders. American Journal of Speech-Language Pathology, 22(1), 1–9. https://doi.org/10.1044/1058-0360(2012/11-0134)
- Paulus, W., & Jellinger, K. (1991). The neuropathologic basis of different clinical subgroups of Parkinson's disease. Journal of Neuropathology & Experimental Neurology, 50(6), 743–755. https://doi.org/10.1097/00005072-199111000-00006
- R Core Team. (2023). R: A language and environment for statistical computing [Computer software]. R Foundation for Statistical Computing.
- Rogers, C. L., Lister, J. J., Febo, D. M., Besing, J. M., & Abrams, H. B. (2006). Effects of bilingualism, noise, and reverberation on speech perception by listeners with normal hearing. Applied Psycholinguistics, 27(3), 465–485. https://doi.org/10.1017/S014271640606036X
- Shin, J., Kiaer, J., & Cha, J. (2013). The sounds of Korean. Cambridge University Press.
- Skodda, S., & Schlegel, U. (2008). Speech rate and rhythm in Parkinson's disease. Movement Disorders, 23(7), 985–992. https://doi.org/10.1002/mds.21996
- Solomon, N. P., & Hixon, T. J. (1993). Speech breathing in Parkinson's disease. Journal of Speech, Language, and Hearing Research, 36(2), 294–310. https://doi.org/10.1044/jshr.3602.294
- Stebbins, G. T., Goetz, C. G., Burn, D. J., Jankovic, J., Khoo, T. K., & Tilley, B. C. (2013). How to identify tremor dominant and postural instability/gait difficulty groups with the Movement Disorder Society Unified Parkinson's Disease Rating Scale: Comparison with the Unified Parkinson's Disease Rating Scale. Movement Disorders, 28(5), 668–670. https://doi.org/10.1002/mds.25383
- Takata, Y., & Nábělek, A. K. (1990). English consonant recognition in noise and in reverberation by Japanese and American listeners. The Journal of the Acoustical Society of America, 88(2), 663–666. https://doi.org/10.1121/1.399769
- Vaid, J., & Pandit, R. (1991). Sentence interpretation in normal and aphasic Hindi speakers. Brain and Language, 41(2), 250–274. https://doi.org/10.1016/0093-934X(91)90155-T
- Wulfeck, B., Bates, E., Juarez, L., Opie, M., Friederici, A., Macwhinney, B., & Zurif, E. (1989). Pragmatics in aphasia: Crosslinguistic evidence. Language and Speech, 32(4), 315–336. https://doi.org/10.1177/002383098903200402
- Yorkston, K. M., & Beukelman, D. R. (1981). Ataxic dysarthria: Treatment sequences based on intelligibility and prosodic considerations. Journal of Speech and Hearing Disorders, 46(4), 398–404. https://doi.org/10.1044/jshd.4604.398
Data Availability Statement
Due to the nature of the study and in compliance with institutional review board requirements, the speech recordings used in this study are not publicly available. However, spreadsheets of perceptual ratings with de-identified participant information may be available from the first author upon reasonable request.