Abstract
Purpose:
The purpose of this study was twofold: (a) to determine whether there are speech rhythm differences among preschool-age children who stutter and were eventually diagnosed as persisting (CWS-Per) or recovered (CWS-Rec) and children who do not stutter (CWNS), using envelope spectral analysis and empirical mode decomposition of the speech amplitude envelope, and (b) to determine whether speech rhythm characteristics close to onset are predictive of later persistence.
Method:
Fifty children (3–4 years of age) participated in the study. Approximately 2–2.5 years after the experimental testing took place, children were assigned to the following groups: CWS-Per (nine boys, one girl), CWS-Rec (18 boys, two girls), and CWNS (18 boys, two girls). All children produced a narrative based on a text-free storybook. From the audio recordings of these narratives, fluent utterances were selected for each child from which seven envelope-based measures were extracted. Group-based differences on each measure as well as predictive analyses were conducted to identify measures that discriminate CWS-Per versus CWS-Rec.
Results:
CWS-Per were found to have a relatively higher degree of power in suprasyllabic oscillations and greater variability in the timing of syllabic rhythms especially for longer utterances. A logistic regression model using two speech rhythm measures was able to discriminate the eventual outcome of recovery versus persistence, with 80% sensitivity and 75% specificity.
Conclusion:
Findings suggest that envelope-based speech rhythm measures are a promising approach to assess speech rhythm differences in developmental stuttering, and their potential for identifying children at risk of developing persistent stuttering should be investigated further.
Developmental stuttering is characterized by frequent occurrences of repetitions and prolongations of sounds, syllables, or words that disrupt the rhythmic flow of speech. The onset of developmental stuttering typically occurs around the ages of 30–36 months, and approximately 5%–8% of preschool-age children meet the diagnostic criteria for stuttering at some point in early childhood (see the review by Yairi & Ambrose, 2013). While up to 75%–85% of children who stutter (CWS) eventually recover from the disorder, the remaining 15%–25% continue to stutter into adulthood (i.e., persist), yielding a prevalence of approximately 1% across the general population (Yairi & Ambrose, 2005, 2013). Recovery rates decline to 50%–60% by the time a child reaches the age of 5 years (Walsh et al., 2018, 2020; Yairi & Ambrose, 2005), and beyond the age of 7 years, stuttering children are at significant risk for persistent stuttering with potentially negative psychological and academic consequences (Blumgart et al., 2010; Craig et al., 2009; Klein & Hood, 2004; O'Brian et al., 2011).
Developmental stuttering is a complex and multifaceted disorder (Conture & Walden, 2012; Smith, 1999; Yairi & Ambrose, 2005) affected by multiple, dynamic, and interacting factors (Smith & Weber, 2017). To date, research pinpoints various epidemiological, clinical, physiological, and behavioral factors associated with stuttering persistence and recovery in preschool-age children (for a review, see Walsh et al., 2018). Among such factors are sex (Reilly et al., 2009; Yairi & Ambrose, 2005), family history of stuttering (Ambrose et al., 1997; Kidd et al., 1981), age of onset (Yairi & Ambrose, 2005), time since onset (TSO; Yairi & Ambrose, 2005), linguistic and/or phonological factors (Mohan & Weber, 2015; Singer et al., 2020; Spencer & Weber-Fox, 2014; Yairi et al., 1996; Usler & Weber-Fox, 2015), speech motor skills (Spencer & Weber-Fox, 2014; Usler et al., 2017), neuroanatomical characteristics (Garnett et al., 2018), and emotional/temperamental differences (Ambrose et al., 2015; Erdemir et al., 2018; Zengin-Bolatkale et al., 2018). Although a variety of factors have been found to be associated with stuttering persistence in young children, empirical support of approaches to predict risk of persistence is scarce and just beginning to emerge (e.g., Singer et al., 2022; Walsh et al., 2021). These newer reports point to a cumulative approach using a combination of (possibly co-related) risk factors to help more accurately predict whether a child's stuttering will persist. However, there is still a lack of well-established and strong predictors of risk of persistence especially close to onset, and this continues to impact the clinical decision-making process, such as determining when to initiate treatment with a young child who stutters. Therefore, it is of utmost importance to successfully identify the factors that may be easily utilized for predicting the probability for persistence in children who begin to stutter. 
This study was designed to contribute to this body of research by investigating “speech rhythm” as a potential predictor of risk for stuttering persistence.
Evidence of Deviant Timing Processes in Developmental Stuttering
A growing body of recent neurophysiological and behavioral research highlights the possibility of a dysfunctional internal timing network as a core deficit underlying developmental stuttering (Chang et al., 2016; Chang & Zhu, 2013; Etchell et al., 2015; Hickok et al., 2011; Max & Yudman, 2003; cf. Hilger et al., 2016). This theory and related evidence are relevant to this study because such a deficit could potentially affect the ability to sequence the timing of speech movements. The basal ganglia-thalamo-cortical network (also referred to as the “timing network”) represents the neural circuitry thought to be involved in internal generation of periodic timing signals and rhythm processing (Grahn, 2009; Grahn & Brett, 2007; Grahn & McAuley, 2009). This network is composed of the basal ganglia, supplementary motor area, premotor, and auditory regions. In support of a salient role in developmental stuttering, findings from functional and structural magnetic resonance imaging and electrophysiological studies point to a neurophysiological deviance (e.g., weaker functional connectivity or delayed beta-band oscillations) in the cortical and subcortical regions of the timing network in CWS (Chang et al., 2016; Chang & Guenther, 2020; Chang & Zhu, 2013; Etchell et al., 2016). Furthermore, based in part on these findings, theoretical perspectives have proposed that a malfunction in these neural structures may represent the primary impairment underlying stuttering (Chang & Guenther, 2020).
In the behavioral domain, empirical evidence on timing processes in stuttering has come from studies using a variety of nonspeech rhythm–related paradigms. For example, CWS, compared to children who do not stutter (CWNS), were shown to exhibit greater rhythmic timing variability in their nonspeech lip movements (Howell et al., 1997) and poorer accuracy when synchronizing finger taps to periodic metronome and musical beats (Falk et al., 2015). Of particular interest to this study, inferior performance was noted in CWS that were eventually diagnosed as persisting (CWS-Per) compared to CWS that were eventually diagnosed as recovered (CWS-Rec) and CWNS, in a finger sequencing task (Tendera et al., 2020). Other empirical evidence came from the work of Wieland et al. (2015), who examined rhythm discrimination abilities of CWS (aged 6–11 years) and found evidence of a rhythm perception deficit in developmental stuttering, which may result from an internal timing deficit.
Relative to speech, difficulty with temporal processing is also expected to result in poor coordination of interarticulatory movements (e.g., the spatiotemporal coupling of upper and lower lip movements), which may explain findings of immature articulatory movement patterns in CWS. For example, in favor of a less refined and immature speech motor coordination system, CWS, compared to CWNS, were shown to have greater lip aperture trial-to-trial variability during fluent speech productions (MacPherson & Smith, 2013; Walsh et al., 2015). Of particular interest to this study, CWS-Per, compared to CWS-Rec and CWNS, were shown to have increased trial-to-trial variability of articulatory movement patterns across sentence repetitions (Usler et al., 2017) and inferior performance in a speech task composed of consonant production and nonword repetition (Spencer & Weber-Fox, 2014). This suggested a less refined and mature speech motor coordination mechanism in CWS-Per, where CWS-Rec were more like CWNS, entailing that the speech motor system of children on the pathway of recovery might resemble that of nonstuttering children more closely than their nonfluent peers on the pathway of persistence. Given the present evidence, whether the presence of a motor timing deficit in CWS (which could manifest itself in speech rhythm characteristics) may be an early indicator of persistent stuttering is of great interest for understanding developmental stuttering and its pathways into persistence or recovery.
Evidence of Deviant Speech Rhythm in Stuttering
Although speech rhythm is an understudied area in stuttering in general, a limited number of studies point to deviant speech rhythm in adults who stutter (AWS). These studies have documented the fluent speech of AWS, compared to that of controls, to have a less typical rate and be less rhythmic (Wendahl & Cole, 1961); to have higher durational variability of vocalic and consonantal intervals, indicative of greater timing variability (Boutsen et al., 2000; Maruthy et al., 2017); and to differ in speech rhythm as characterized by newer envelope-based rhythm measures (Dechamma & Maruthy, 2018). Another study found 6- to 8-year-old CWS, compared to CWNS, to have increased variability of speech segments at the sentence level (Dokoza et al., 2011). Although some differences in speech timing and rhythm patterns in AWS have been noted before, there is a scarcity of research looking into speech rhythm in young CWS, let alone into its potential as an early risk marker of persistence at a time before years of learned coping behaviors and therapeutic strategies have potentially shaped the way speech is produced. Therefore, a novel investigation of speech rhythm differences as potential predictors of risk for stuttering persistence is required to fill this gap in our knowledge.
Measures of Speech Rhythm
Speech production is inherently tied to time, and we can define speech rhythm broadly as the arrangement of speech sounds and movements in a way to alternate between stressed and unstressed elements (for emerging speech production models involving rhythm, see Poeppel & Assaneo, 2020; Tilsen, 2019). However, despite a long history of research that has focused on the measurement of speech rhythm, there is still no consensus on exactly how rhythm is encoded in speech. The oldest approaches to define and quantify speech rhythm were driven by the assumption that speech rhythm was encoded in duration and involved analyzing the temporal regularity of linguistic units (i.e., syllables, moras, or feet; Abercrombie, 1967; James, 1940; Pike, 1945), followed by those that focused on the durational variability of vocalic and consonantal intervals, that is, so-called duration-based rhythm metrics such as pairwise variability index (PVI; Dellwo, 2006; Grabe & Low, 2002; Ling et al., 2000; Nolan & Jeon, 2014). Duration-based metrics have previously been used to capture cross-linguistic rhythmic differences (stress-timed vs. syllable-timed) in 2- to 6-year-old children (Mok, 2013; Payne et al., 2012) and to examine acquisition of speech rhythm by monolingual versus bilingual 4- to 5-year-old children (Bunta & Ingram, 2007). These studies also found increased durational variabilities of consonantal and vocalic intervals as a function of age (Polyanskaya & Ordin, 2015) and language proficiency (Ordin & Polyanskaya, 2015). On the other hand, when intensity was taken into consideration, an opposite pattern was observed—intensity variability was reduced (He, 2018), and coupling of speech articulators was increased as a function of age and maturation (Smith & Zelaznik, 2004). 
Compared to durational measures, intensity measures were also shown to be more sensitive to between-speakers rhythmic differences and supposedly play a more important role in speech rhythm variability in general (He & Dellwo, 2016). This suggested that the organization of speech segments at duration dimension versus intensity could represent processes that contribute to speech rhythm in different ways.
Indeed, speech rhythm is a complex multidimensional percept that is not amenable to the analysis of a single dimension of the speech signal such as interval durations or timing of events (Cummins, 2009; Kohler, 2009). Focusing solely on the timing of prominences would be misleading because percepts of duration are known to interact with other variables of the speech signal such as amplitude (Kohler, 2009) or fundamental frequency (fo) modulations (Yu et al., 2010). It has also been postulated that duration-based metrics might not be sensitive enough to detect nuanced speech rhythm differences in clinical populations because they were originally designed to compare cross-linguistic differences (Liss et al., 2010). Speech rhythm is influenced by characteristics of the amplitude envelopes of energy between the stressed locations (Morton & Chambers, 1976; Pompino-Marschall, 1989), and quantifying the bursts and lulls via amplitude envelopes could reveal the hierarchical temporal structure of speech (e.g., phonemes, syllables, words, phrases) more reliably (Ravignani et al., 2019). Durational metrics neglect this type of acoustic energy that carries important information about the periodicities in speech. Duration-based metrics were also shown to display substantial interspeaker variability and to be very sensitive to syllable complexity, making rhythmic classification using such metrics less reliable (Arvaniti, 2012). Other drawbacks of these measures include that they are influenced by speaking rate (White & Mattys, 2007) and that segmenting the speech signal into linguistic units (i.e., vocalic and consonantal intervals) is labor intensive and challenging, especially when analyzing pathologic speech (Liss et al., 2010).
Given the limitations of duration-based metrics, in the last 2 decades, a newer approach called envelope spectral analysis (ESA) has dominated speech rhythm research (Dechamma & Maruthy, 2018; Liss et al., 2010; Tilsen & Arvaniti, 2013; Tilsen & Johnson, 2008). This involves spectral analysis of amplitude envelopes (derived from the filtered speech waveform) that exhibit relatively slow fluctuations of acoustic energy that tend to arise from alternations between vowels and consonants (Tilsen & Arvaniti, 2013) as well as from alterations in voicing, frication noise, bursts, and so forth (Liss et al., 2010). However, these slow fluctuations do not correspond precisely to vocalic and consonantal intervals; thus, this approach is not based on any linguistic assumptions. Empirical mode decomposition (EMD) analysis (Tilsen & Arvaniti, 2013) extends ESA, and it is based on an instantaneous frequency analysis of signal components using a Hilbert transform rather than a fast Fourier transform as used in ESA (Huang et al., 1998). EMD extracts several functions from the amplitude envelope that capture oscillations at various timescales that reflect syllabic and suprasyllabic periodicities (Tilsen & Arvaniti, 2013). Ultimately, this approach provides metrics for various dimensions of speech rhythm, such as (a) power distribution metrics, which capture the relative contributions of syllabic versus suprasyllabic oscillations in the envelope; (b) rate metrics, which capture the frequencies of these oscillations; and (c) rhythmic stability metrics, which capture the stability or variability of these oscillations. The most significant advantages of the envelope-based approaches include the lack of a priori assumptions about the relation of linguistic units to rhythm as well as the lack of labor-intensive parsing of the speech signal into consonants and vowels (Liss et al., 2010).
The application of the envelope-based metrics to adult speech has shown that they are sufficiently flexible to capture information about periodicities in speech that likely correspond to different linguistic constructs such as the syllable, foot, and phrase (Liss et al., 2010; Tilsen & Arvaniti, 2013), and they have previously been used in adult speech to categorize and differentiate dysarthria (Liss et al., 2010), stuttering (Dechamma & Maruthy, 2018), and different languages (Tilsen & Arvaniti, 2013). Although envelope-based metrics have not been used in child speech before, they are arguably more valid for a broader use in adult and child speech as well as in clinical and nonclinical populations (Liss et al., 2010; Tilsen & Arvaniti, 2013). This is because envelope-based metrics primarily rely on fluctuations in acoustic energy resulting from vocal fold vibration and the opening and closing of the jaw, without reliance on any kind of linguistic construct (i.e., phonemes, syllables); such constructs likely mature throughout the course of development and likely differ among clinical populations.
Purpose
Thus, this study serves as an initial and novel investigation of speech rhythm in young CWS using envelope-based measures, with the aim of identifying those at higher risk of developing persistent stuttering. Specifically, we conducted this study to determine whether acoustic features of speech rhythm indexed by ESA and EMD can distinguish between the speech rhythm of young CWS-Per and CWS-Rec in the critical developmental age range when stuttering begins, at an age close to onset and before the pathways for persistence versus recovery were established.
Method
Participants
Participants were part of a large-scale longitudinal investigation of emotional and linguistic contributions to childhood stuttering (e.g., Erdemir et al., 2018; Jones et al., 2014; Zengin-Bolatkale et al., 2018) conducted by Vanderbilt University's Developmental Stuttering Project. They included 50 preschool-age children between the ages of 3;0 and 4;10 (years;months) at the initial visit. The children were classified as CWS-Per (n = 10; nine boys, one girl), CWS-Rec (n = 20; 18 boys, two girls), or CWNS (n = 20; 18 boys, two girls) based on data from diagnostic evaluations conducted over a 2- to 2.5-year period (see the Classification and Inclusion Criteria section for additional description). The experimental data used in this study were from the initial visit of the longitudinal study. The total possible number of CWS-Per participants from the initial data was included. CWS-Rec and CWNS groups were formed to be approximately similar to the persistent group in age and sex distribution as well as other relevant speech-language variables (described below). Chronological age did not significantly differ across the three groups as determined by one-way analysis of variance, F(2, 47) = 1.82, p = .173, and across each pair of groups as determined by subsequent pairwise comparisons (p = .20, CWS-Rec vs. CWS-Per; p = .09, CWS-Rec vs. CWNS; p = .76, CWS-Per vs. CWNS). All participants were monolingual English speakers. They demonstrated normal hearing on a bilateral hearing screening at pure-tone frequencies of 1000, 2000, and 4000 Hz at 20 dB HL with no parental reports of neurological disorders. Participants were paid volunteers whose parents were informed about the study via advertisement in a local monthly parent magazine, local health providers, or self-referrals/professional referrals to the Vanderbilt Bill Wilkerson Center.
Classification and Inclusion Criteria
For talker group classification, the children were evaluated by a speech-language pathologist (SLP) as they engaged in a conversational free-play prior to experimental testing. A conversational speech sample of 300 words was elicited, and stuttered and nonstuttered disfluencies were counted (in line with Conture, 2001; Conture & Walden, 2012; Jones et al., 2017; Meyers, 1986; Riley, 1994; Tumanova et al., 2014; Yaruss, 1997a, 1997b; Yaruss et al., 1998). The participants were classified as CWS by the SLP if (a) they exhibited three or more stuttering-like disfluencies (SLDs; i.e., sound/syllable repetitions, monosyllabic whole-word repetitions, audible and inaudible sound prolongations) per 100 words of conversational speech (Conture, 2001; Jones et al., 2017; Tumanova et al., 2014; Yaruss, 1997a, 1997b; Yaruss et al., 1998), (b) they scored 11 or higher (i.e., severity equivalent of at least “mild”) on the Stuttering Severity Instrument for Children and Adults–Third Edition (Riley, 1994) and the Stuttering Severity Instrument for Children and Adults–Fourth Edition (Riley, 2009; both editions hereinafter referred to as SSI), and (c) a parental concern for stuttering was reported. Participants were classified as CWNS if (a) they exhibited two or fewer stuttered disfluencies per 100 words of conversational speech, (b) they scored 10 or lower on the SSI (i.e., severity equivalent of less than “mild”), and (c) no parental concern for stuttering was reported.
Persistence and recovery status were based on diagnostic evaluations that took place 4–5 times (time points), each approximately 8 months apart, over a 2- to 2.5-year period. Participants were classified as CWS-Per if they met the “stuttering” criteria (as described above) at the initial and final diagnostic evaluations and parental concern of continued stuttering was reported at the final time point. Participants were classified as CWS-Rec if they were classified as CWS at the initial diagnostic evaluation and as CWNS at subsequent diagnostic evaluations and no parental concern of continued stuttering was reported at the final time point. Finally, participants were classified as CWNS if they met CWNS criteria at all the time points and no parental concern of stuttering was reported at any of the time points. In the case of a discrepancy between parental report and SLP evaluation based on speech sample, the participant was excluded from the study.
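The classification rules above can be summarized as a small decision procedure. The following Python sketch is illustrative only: the `Evaluation` record, its field names, and both function names are hypothetical, and the sketch reads "subsequent diagnostic evaluations" literally as every visit after the first, which is an assumption about how the criteria were applied.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Evaluation:
    """One diagnostic visit; field names are illustrative, not from the study."""
    sld_per_100_words: float   # stuttering-like disfluencies per 100 words
    ssi_score: int             # SSI total score
    parental_concern: bool     # parent reports concern about stuttering


def classify_visit(ev: Evaluation) -> Optional[str]:
    """Label one visit as 'CWS' or 'CWNS'; None when the criteria disagree."""
    if ev.sld_per_100_words >= 3 and ev.ssi_score >= 11 and ev.parental_concern:
        return "CWS"
    if ev.sld_per_100_words <= 2 and ev.ssi_score <= 10 and not ev.parental_concern:
        return "CWNS"
    return None  # discrepancy between sources: treated as an exclusion


def classify_outcome(visits: List[Evaluation]) -> Optional[str]:
    """Return 'CWS-Per', 'CWS-Rec', 'CWNS', or None (excluded/unclassifiable)."""
    labels = [classify_visit(v) for v in visits]
    if len(labels) < 2 or None in labels:
        return None
    if labels[0] == "CWS" and labels[-1] == "CWS":
        return "CWS-Per"
    # Assumption: recovery requires CWNS status at every visit after the first.
    if labels[0] == "CWS" and all(lab == "CWNS" for lab in labels[1:]):
        return "CWS-Rec"
    if all(lab == "CWNS" for lab in labels):
        return "CWNS"
    return None
```

A child who meets the stuttering criteria at the first visit and the nonstuttering criteria at all later visits would be labeled CWS-Rec under this reading; any visit on which the three criteria disagree leads to exclusion.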
There is a possibility that stuttering severity and the total duration of time since onset (TSO) of stuttering may impact speech rhythm differences between CWS-Per and CWS-Rec groups; therefore, we aimed to achieve similar severity and TSO at the time of measurement of speech rhythm for the two stuttering groups. TSO information for CWS was obtained from parents using a bracketing technique described by Yairi and Ambrose (1992) and Anderson et al. (2003). Using independent t tests, we confirmed that there were no significant differences in the frequency of SLDs, t(28) = 0.67, p = .50; SSI scores, t(28) = −0.25, p = .81; or TSO of reported stuttering, t(28) = 0.46, p = .65, at the initial diagnostic visit between the two groups. Table 1 presents participants' age, gender, frequency of SLDs, and SSI scores from the initial screening, as well as TSO of stuttering reported by the parent at the initial time point.
Table 1.
Mean and standard deviation of age, frequency of stuttering-like disfluencies (SLDs) and Stuttering Severity Instrument (SSI) scores from the initial screening, and time since onset (TSO) of stuttering as reported by the parent at the initial visit in each group.
Group | Age (in months) | SLD per 100 words | SSI score | TSO (in months) |
---|---|---|---|---|
CWS-Per | 46.2 (4.6) | 9.9 (6.5) | 19.3 (6.8) | 10.6 (4.9) |
CWS-Rec | 43.5 (5.5) | 8.7 (4.0) | 19.9 (5.9) | 9.7 (5.5) |
CWNS | 46.5 (6.3) | 1.1 (0.5) | 6.7 (1) | |
The participants were administered several standardized speech-language assessments. The children who scored below the 16th percentile (approximately 1 SD below the mean) on any test were not included in the study to avoid any potential confounds with clinically significant speech-language concerns.
Care was also taken to avoid any potential between-groups differences in speech-language skills since differences in speech-language skills (such as expressive language and articulation) might be interacting with speech rhythm differences. Statistical testing did not reveal any significant between-groups findings across each pair of groups for scores of standardized speech and language assessments of articulation (Goldman-Fristoe Test of Articulation–Second Edition; Goldman & Fristoe, 2000; p = .11, CWNS vs. CWS-Per; p = .35, CWS-Rec vs. CWNS; p = .34, CWS-Per vs. CWS-Rec), receptive vocabulary (Peabody Picture Vocabulary Test; Dunn & Dunn, 2007; p = .92, CWNS vs. CWS-Per; p = .07, CWS-Rec vs. CWNS; p = .19, CWS-Per vs. CWS-Rec), expressive vocabulary (Expressive Vocabulary Test–Second Edition; Williams, 1997; p = .39, CWNS vs. CWS-Per; p = .08, CWS-Rec vs. CWNS; p = .56, CWS-Per vs. CWS-Rec), and receptive (p = .42, CWNS vs. CWS-Per; p = .56, CWS-Rec vs. CWNS; p = .23, CWS-Per vs. CWS-Rec) and expressive language (Test of Early Language Development–Third Edition; Hresko et al., 1999; p = .73, CWNS vs. CWS-Per; p = .86, CWS-Rec vs. CWNS; p = .69, CWS-Per vs. CWS-Rec) at the initial diagnostic evaluation.
Procedure
As part of the experimental visit at the initial time point, participants were seated in front of a computer monitor and engaged in a narrative task. The task involved telling a story based on the pictures from a “text-less” storybook displayed on the monitor. Each child told a story based on a picture book about a boy, a dog, and a frog by Mercer Mayer (e.g., Frog, Where Are You?; Mayer, 1969). A lapel microphone placed on the child's shirt captured the audio signal, which was recorded on a desktop computer at a sampling rate of 48 kHz with 16-bit resolution. The elicited narratives varied in length depending on the number of utterances children generated during this task. Research on speech rhythm has traditionally involved reading or sentence repetition tasks with the advantage that linguistic factors can be controlled (Dechamma & Maruthy, 2018; Liss et al., 2009, 2010). On the other hand, use of conversational and narrative speech has the advantage of being highly naturalistic and more ecologically valid (Tilsen & Arvaniti, 2013). The ability to use such tasks is particularly critical for the purpose of this study because the participants were of preschool age and not able to read.
Measures
Transcriptions of the Utterances
A trained SLP watched the audio/video recordings of children's speech and identified fluent utterances that had no perceptible disfluencies—including both stuttered and nonstuttered disfluencies—from the narrative samples of each participant. The use of only fluent utterances provided an unbiased representation of speech rhythm in the three groups, since the presence of speech disfluencies would potentially bias the speech rhythm measures. Following Tilsen and Arvaniti's (2013) conservative approach, any utterance that included pauses > 100 ms was not included in the analyses since pauses interrupt rhythmicity and therefore affect the reliability of the rhythmic measures.
Speech rhythm measures are known to depend on utterance duration (Liss et al., 2010; Tilsen & Arvaniti, 2013). Short utterances are problematic because they do not contain enough stressed syllables to provide rhythmic information, and long utterances are likewise problematic because they contain a mixture of rhythmic patterns, resulting in more variable and complex rhythmicity (Tilsen & Arvaniti, 2013). Following Tilsen and Arvaniti (2013), we included utterances between 1 and 3 s in duration in the analyses. This utterance length allowed us to include enough utterances per participant to conduct reliable spectral power analyses. Based on this selection criterion, the mean utterance duration was 1.79 s (SD = 0.13) for nonstuttering, 1.80 s (SD = 0.15) for persisting, and 1.81 s (SD = 0.11) for recovered children. Mean utterance durations did not differ among the three pairs of groups.
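The utterance selection criteria described in this section (fluent utterances only, no pause longer than 100 ms, duration between 1 and 3 s) amount to a simple filter. The sketch below is a minimal illustration; the `Utterance` record and its field names are hypothetical, and treating the 1-s and 3-s bounds as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    """Hypothetical per-utterance annotations from the transcription step."""
    duration_s: float        # total utterance duration in seconds
    longest_pause_ms: float  # longest silent pause inside the utterance
    is_fluent: bool          # no stuttered or nonstuttered disfluencies


def select_utterances(
    utterances: List[Utterance],
    min_s: float = 1.0,
    max_s: float = 3.0,
    max_pause_ms: float = 100.0,
) -> List[Utterance]:
    """Keep fluent utterances of 1 to 3 s with no pause longer than 100 ms."""
    return [
        u for u in utterances
        if u.is_fluent
        and u.longest_pause_ms <= max_pause_ms
        and min_s <= u.duration_s <= max_s  # inclusive bounds: an assumption
    ]
```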
ESA and EMD
The speech rhythm measures used in this study were directly adopted from the study by Tilsen and Arvaniti (2013). This approach is termed “envelope-based” because rhythm is conceptualized as periodicity in the envelope of the speech signal; it is based on analysis of amplitude envelopes derived from filtered speech waveforms. Two envelope-based approaches have been developed to quantify speech rhythm: ESA (Liss et al., 2010; Tilsen & Johnson, 2008) and EMD (Huang et al., 1998; Tilsen & Arvaniti, 2013).
As a first step, the “amplitude envelope” was extracted from the speech signal using the following steps: (a) The speech signal was bandpass-filtered (fourth-order Butterworth) using cutoff values of [450, 4500] Hz. The cutoff values were slightly higher than the ones used in the work of Tilsen and Arvaniti (2013) based on an exploratory analysis of the fo distribution of the current data as well as the knowledge that children have higher fo and formant frequencies compared to adults (Hillenbrand et al., 1995). (b) A fourth-order Butterworth low-pass filter with a 10-Hz cutoff was applied to obtain an envelope that varies on a syllable timescale, implying that the duration of a syllable was expected to be no less than 100 ms. (c) Finally, the envelope was normalized, down-sampled, and windowed using a Tukey window (r = .2) to aid subsequent spectral analyses of ESA and EMD.
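Steps (a) through (c) can be sketched in Python with NumPy and SciPy. This is a minimal illustration under stated assumptions rather than the authors' implementation: taking the magnitude of the bandpassed signal before low-pass filtering, zero-phase filtering, the 100-Hz envelope sampling rate, and the normalization (mean removal followed by scaling to a maximum absolute value of 1) are all assumptions, and the function and parameter names are hypothetical.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, resample_poly
from scipy.signal.windows import tukey


def amplitude_envelope(x, fs, band=(450.0, 4500.0), lp_cutoff=10.0,
                       fs_env=100, alpha=0.2):
    """Return a normalized, down-sampled, Tukey-windowed amplitude envelope."""
    # (a) Butterworth band-pass with N=4 passed to the designer, cutoffs in Hz.
    sos_bp = butter(4, band, btype="bandpass", fs=fs, output="sos")
    banded = sosfiltfilt(sos_bp, x)  # zero-phase filtering: an assumption

    # Assumption: rectify (take the magnitude) before smoothing.
    rectified = np.abs(banded)

    # (b) Fourth-order Butterworth low-pass with a 10-Hz cutoff.
    sos_lp = butter(4, lp_cutoff, btype="lowpass", fs=fs, output="sos")
    env = sosfiltfilt(sos_lp, rectified)

    # (c) Down-sample (target rate is an assumption), normalize, and window.
    env = resample_poly(env, int(fs_env), int(fs))
    env = env - env.mean()
    env = env / np.max(np.abs(env))
    return env * tukey(len(env), alpha)
```

For a 2-s recording sampled at 48 kHz, `amplitude_envelope(x, 48000)` returns 200 envelope samples at the assumed 100-Hz rate.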
ESA is based on a discrete Fourier transform (DFT) of the envelope of the speech waveform, which exhibits relatively slow fluctuations of acoustic energy that tend to arise from alternations between vowels and consonants (Tilsen & Arvaniti, 2013) as well as from alterations in voicing, frication noise, bursts, and so forth (Liss et al., 2010). The envelope spectrum was calculated by taking the squared magnitude of the fast Fourier transform of the zero-padded processed envelope. In contrast to the complex sinusoid basis functions of the DFT, EMD extracts orthogonal basis functions from an empirical signal, using a sifting algorithm. These basis functions are called intrinsic mode functions (IMFs). Subsequently, a Hilbert transform was applied to each IMF to characterize its instantaneous phase and frequency (Huang et al., 1998). To mitigate the effects of rapid changes in instantaneous phases, the phases were unwrapped where jumps occurred, each data point was smoothed by averaging over nearest neighbors, and extreme frequencies were trimmed on both lower and higher ends using the 1.5 times the interquartile range criterion. To further avoid window-related edge effects, the first and last 100 ms of frequencies were also excluded. Each IMF represents an oscillation in the signal at a particular timescale. We focus on the first two IMFs in this study since they have been observed to reflect syllabic (IMF1, the fastest timescale of oscillation) and suprasyllabic (IMF2, the next fastest timescale of oscillation, representing foot, phrase, etc.) fluctuations in the envelope, respectively (Tilsen & Arvaniti, 2013). Figure 1 depicts an example speech waveform and the resultant ESA and EMD components. Finally, to quantify the multidimensional and complex nature of speech rhythm, seven rhythm measures were derived using ESA and EMD analysis. These measures represent three power distribution metrics, two rate metrics, and two rhythm stability metrics.
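The envelope spectrum and the post-processing of instantaneous frequencies described above can be sketched as follows. The sketch assumes that the IMFs have already been obtained from some EMD (sifting) implementation, which is not shown. The function names, the FFT length, the three-point smoothing window, and the choice to smooth the frequency track rather than the phase track are assumptions made for illustration.

```python
import numpy as np
from scipy.signal import hilbert


def envelope_spectrum(env, fs_env, n_fft=2048):
    """Squared-magnitude spectrum of the zero-padded envelope (ESA)."""
    n_fft = max(n_fft, len(env))           # never truncate the envelope
    spec = np.abs(np.fft.rfft(env, n=n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs_env)
    return freqs, spec


def instantaneous_frequency(imf, fs_env, smooth_pts=3, edge_s=0.1):
    """Cleaned instantaneous frequency (Hz) of one IMF via the Hilbert transform."""
    # Unwrap the instantaneous phase of the analytic signal.
    phase = np.unwrap(np.angle(hilbert(imf)))
    # Frequency is the phase derivative, converted from rad/sample to Hz.
    freq = np.diff(phase) * fs_env / (2.0 * np.pi)
    # Smooth each point by averaging over its nearest neighbors.
    freq = np.convolve(freq, np.ones(smooth_pts) / smooth_pts, mode="same")
    # Trim values outside 1.5 times the interquartile range.
    q1, q3 = np.percentile(freq, [25, 75])
    spread = 1.5 * (q3 - q1)
    keep = (freq >= q1 - spread) & (freq <= q3 + spread)
    # Exclude the first and last 100 ms to limit window-related edge effects.
    edge = int(round(edge_s * fs_env))
    keep[:edge] = False
    keep[len(keep) - edge:] = False
    return freq[keep]
```

Applied to IMF1 and IMF2 of an utterance, `instantaneous_frequency` yields the frequency tracks from which the rate and stability metrics are later computed.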
Figure 1.
A sample speech waveform and the resultant envelope spectral analysis and empirical mode decomposition components. Left: The waveform of a sample utterance from the data set (“He jumped into the water”). Middle left: Corresponding vocalic energy amplitude envelope (top) and the power spectrum from envelope spectral analysis (bottom). Middle right: Intrinsic mode functions (IMFs) obtained from empirical mode decomposition of envelopes. Right: Corresponding instantaneous frequencies of the IMFs.
Power distribution metrics. The relative amount of power in the amplitude envelope on syllabic versus suprasyllabic timescales is quantified via power distribution metrics. The first two of the power distribution metrics are derived from the envelope spectrum: (a) Spectral band power ratio (SBPr) is computed by taking the ratio of the amount of spectral power in the 1- to 3.5-Hz band (suprasyllabic oscillations) to power in the 3.5- to 10-Hz band (syllabic oscillations), consistent with the 1/3.5/10-Hz band edges listed in Table 2; (b) envelope spectral centroid (CNTR; spectral center of gravity) is computed by taking the power-weighted mean of frequencies calculated over 1–10 Hz. The third metric is derived from the intrinsic mode functions: (c) IMF ratio (IMFr) is computed by taking the ratio of power in IMF2 (suprasyllabic timescale oscillations) to IMF1 (syllabic timescale oscillations). These three metrics represent the relative amount of power in suprasyllabic (longer, lower frequency) and syllabic (shorter, higher frequency) timescale periodicities.
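As a hedged illustration, the three power distribution metrics could be computed from an envelope spectrum and a pair of IMFs as shown below. The function names are hypothetical. The default band edges follow the 1/3.5/10-Hz values listed in Table 2 and can be overridden, and defining IMF power as the sum of squared amplitudes is an assumption.

```python
import numpy as np


def band_power(freqs, spec, lo, hi):
    """Sum spectral power over lo <= f < hi (half-open to avoid double counting)."""
    mask = (freqs >= lo) & (freqs < hi)
    return spec[mask].sum()


def sbpr(freqs, spec, low_band=(1.0, 3.5), high_band=(3.5, 10.0)):
    """Spectral band power ratio: suprasyllabic band over syllabic band."""
    return band_power(freqs, spec, *low_band) / band_power(freqs, spec, *high_band)


def centroid(freqs, spec, lo=1.0, hi=10.0):
    """Envelope spectral centroid: power-weighted mean frequency over lo..hi Hz."""
    mask = (freqs >= lo) & (freqs <= hi)
    return np.sum(freqs[mask] * spec[mask]) / np.sum(spec[mask])


def imf_ratio(imf1, imf2):
    """IMF ratio: power in IMF2 (suprasyllabic) over power in IMF1 (syllabic)."""
    return np.sum(np.square(imf2)) / np.sum(np.square(imf1))
```

Under these definitions, larger SBPr and IMFr values and a lower centroid all indicate relatively more power at suprasyllabic timescales.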
Rate metrics. The oscillations in the speech envelope are quantified using IMFs at the syllabic (IMF1) and suprasyllabic (IMF2) timescale. These constructs can be conceptualized as the fastest timescale of oscillation in the envelope containing syllable timescale oscillations (IMF1) and the next fastest oscillation containing suprasyllable timescale (e.g., foot, phrase) oscillations (IMF2). The mean frequency of IMF1 and IMF2 represents the rate metrics of ω1 and ω2, respectively.
Rhythm stability metrics. The variability of periodicities in the envelope is quantified by calculating the variance of IMF frequencies at the syllabic (var.ω1) and suprasyllabic (var.ω2) timescales. These measures quantify variability of the frequency of envelope oscillations within an utterance and can be conceptualized as the degree to which rhythmic oscillations stay consistent throughout a stretch of speech. Variability of IMF frequencies quantifies temporal variation on syllabic and suprasyllabic timescales within a phrase or sentence production, and thereby, they can be thought of as indexing rhythmicity at the syllabic and suprasyllabic levels (higher variability entailing less stability and rhythmicity). Table 2 defines each of the seven variables obtained from the power spectra and the amplitude envelopes that are thought to define various aspects of speech rhythm.
Table 2.
Types of envelope modulation metrics, descriptions, and interpretations (after Tilsen & Arvaniti, 2013).
Type | Metric | Description | Interpretation |
Power distribution metrics | SBPr3.5 | Ratio between power in envelope spectrum bands (1/3.5/10 Hz) | Relative amount of spectral power in suprasyllabic vs. syllabic timescale oscillations |
Power distribution metrics | CNTR1-10 | Envelope spectrum centroid calculated over 1- to 10-Hz band | Spectral center of gravity over a range of suprasyllabic to syllabic timescale oscillations |
Power distribution metrics | IMFr12 | Ratio between IMF2 and IMF1 | Relative amount of power in suprasyllabic vs. syllabic timescale envelope oscillations |
Rate metrics | ω1 | Mean within-utterance instantaneous freq. of IMF1 | Rate of syllabic oscillations |
Rate metrics | ω2 | Mean within-utterance instantaneous freq. of IMF2 | Rate of suprasyllabic oscillations |
Rhythmic stability metrics | var. ω1 | Variance of within-utterance instantaneous freq. of IMF1 | Variability of syllabic oscillations |
Rhythmic stability metrics | var. ω2 | Variance of within-utterance instantaneous freq. of IMF2 | Variability of suprasyllabic oscillations |
Note. SBPr = spectral band power ratio; CNTR = envelope spectral centroid; IMFr = intrinsic mode function ratio; freq. = frequency.
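The rate and rhythm stability metrics can be sketched in a few lines once an IMF is available. The example below is an assumption-laden illustration, not the original analysis code: it builds the analytic signal with a plain FFT construction of the Hilbert transform, skips the empirical mode decomposition and the nearest-neighbor smoothing step, uses the population variance, and applies the procedure to a synthetic sinusoid standing in for IMF1. All function names are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT (discrete Hilbert-transform construction)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)

def instantaneous_frequency(imf, fs, edge_s=0.1):
    """Instantaneous frequency (Hz) from the unwrapped phase, minus 100-ms edges."""
    phase = np.unwrap(np.angle(analytic_signal(imf)))
    freq = np.diff(phase) * fs / (2.0 * np.pi)
    edge = int(round(edge_s * fs))
    return freq[edge:freq.size - edge]

def trim_iqr(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    keep = (values >= q1 - k * iqr) & (values <= q3 + k * iqr)
    return values[keep]

def rate_and_stability(imf, fs):
    """Return (mean, variance) of the trimmed instantaneous frequency."""
    freq = trim_iqr(instantaneous_frequency(imf, fs))
    return freq.mean(), freq.var()   # assumption: population variance (ddof=0)

# Synthetic stand-in for IMF1 (hypothetical): a steady 5-Hz oscillation.
fs = 100
t = np.arange(0, 2.0, 1.0 / fs)
imf = np.sin(2 * np.pi * 5 * t)

omega, var_omega = rate_and_stability(imf, fs)
print("omega (Hz):", omega, "var.omega:", var_omega)
```

A perfectly steady oscillation yields a mean frequency equal to its rate and a variance near zero; frequency drift within the utterance would raise the variance without necessarily changing the mean.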
Statistical Analysis
We used seven linear mixed-effects models (Diggle et al., 2013; Pinheiro & Bates, 2000; West et al., 2006) with cubic regression splines to test for a possible effect of stuttering status (CWS-Per, CWS-Rec, CWNS) on each one of the speech rhythm measures gathered from the individual utterances. All analyses were conducted using the software R (R Core Team, 2020) with the lmerTest package (Kuznetsova et al., 2017). The models included one of the seven speech rhythm measures (SBPr, CNTR, IMFr, ω1, ω2, var.ω1, and var.ω2) as the dependent variable and talker group (CWS-Per, CWS-Rec, CWNS) and duration as the independent variables. Since utterance duration was known to significantly interact with metric values (Tilsen & Arvaniti, 2013) and we were interested in the way the speech rhythm interacts with duration for the three groups, it was used as a fixed factor in the models. We used restricted cubic splines with three knots (at the default fixed percentiles of .05, .5, and .95 of the duration distribution) to model the flexible (nonlinear) relationship between utterance duration and each speech rhythm measure. Only three knots, all placed within the data range, were used because of the small sample size and to avoid overfitting. Cubic polynomials are used in many forms of regression analysis (Desquilbet & Mariotti, 2010; Durrleman & Simon, 1989; Grajeda et al., 2016), and they offer sufficient flexibility to capture the shape of most data. Using cubic splines rather than standard linear terms significantly improved the model fits for all of the measures. Since each child contributed a different number of utterances, we also included the total number of utterances as a covariate. Participant-level random intercepts were included in the models to account for within-subject correlation of utterances collected on a particular child.
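To make the spline term concrete, the sketch below builds a restricted cubic spline basis for utterance duration with three knots. It is an illustration only: the analyses were run in R, whereas this uses Python with a commonly used truncated-power parameterization, and the function name is hypothetical. The knot values are the durations reported in the Results for the .05, .5, and .95 percentiles (1.11, 1.76, and 2.78 s).

```python
import numpy as np

def rcs_basis(x, knots):
    """Restricted cubic spline basis: x plus (k - 2) nonlinear columns.

    The nonlinear terms are zero below the first knot and linear above
    the last knot, which is what "restricted" refers to.
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    k = t.size
    scale = (t[-1] - t[0]) ** 2        # keeps columns on a comparable scale
    gap = t[k - 1] - t[k - 2]

    def cube_plus(u):
        return np.maximum(u, 0.0) ** 3

    columns = [x]
    for j in range(k - 2):
        term = (cube_plus(x - t[j])
                - cube_plus(x - t[k - 2]) * (t[k - 1] - t[j]) / gap
                + cube_plus(x - t[k - 1]) * (t[k - 2] - t[j]) / gap)
        columns.append(term / scale)
    return np.column_stack(columns)

# Durations on an even grid (hypothetical) and the three reported knot values.
x = np.linspace(0.5, 3.5, 61)
knots = [1.11, 1.76, 2.78]
basis = rcs_basis(x, knots)
print(basis.shape)   # one linear column and one nonlinear column
```

With three knots the duration effect costs two degrees of freedom, which matches the numerator degrees of freedom of 2 reported for the main effect of duration.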
The assumptions for linear mixed-effect models were visually examined using the sjPlot package from R (Lüdecke et al., 2019), which revealed violation of normality of the residuals for three measures: SBPr, IMFr, and var.ω2. The Kolmogorov–Smirnov and Shapiro–Wilk tests were employed as a measure of normality, and we used logarithms to transform variables that were not normally distributed. Model predictions were transformed back to the response scale (package emmeans) in the figures.
In the second part of the analysis, we conducted a prospective analysis using a binary logistic regression model to examine whether there was a statistically significant relationship between the speech rhythm measures and stuttering outcome for the stuttering group (CWS-Per vs. CWS-Rec). This analysis was limited to utterances with longer duration (above the 75th percentile of the duration distribution) because duration interacted with the speech measures significantly in most cases and more pronounced differences were observed for longer utterances in retrospective analyses. A stepwise procedure was used for model selection with the objective of minimizing the Akaike information criterion (AIC) value. Odds ratios were calculated from logits and transformed into probabilities for ease of visualization and interpretation. Weights were used to account for the unequal class numbers. A receiver operating characteristic (ROC) curve analysis was also employed to reveal the diagnostic ability of the speech rhythm measures to classify persistence and recovery by providing the trade-off between sensitivity and specificity. A confusion matrix was generated from the predictions of the fitted logistic regression model, using a threshold value gathered from the ROC curve.
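The ROC quantities described above can be illustrated with a small, self-contained sketch. The predicted probabilities and labels below are invented for illustration and are not the study's data; the function names are hypothetical. The area under the curve is computed as the proportion of persisting-recovered pairs in which the persisting child receives the higher predicted probability.

```python
def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called 'persisting' (1)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg

def roc_points(scores, labels):
    """(cutoff, sensitivity, specificity) for every distinct score used as a cutoff."""
    return [(c,) + sens_spec(scores, labels, c) for c in sorted(set(scores))]

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted probabilities of persistence (1 = persisted, 0 = recovered).
scores = [0.9, 0.7, 0.45, 0.2, 0.6, 0.35, 0.3, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

print("AUC:", auc(scores, labels))
print("Sensitivity, specificity at cutoff .4:", sens_spec(scores, labels, 0.4))
for cutoff, sens, spec in roc_points(scores, labels):
    print(cutoff, sens, spec)
```

Lowering the cutoff trades specificity for sensitivity, which is the trade-off a threshold chosen from the ROC curve is meant to balance.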
Results
Group Differences on Speech Rhythm Measures
Power Distribution Metrics
For SBPr, we found significant main effects for group, F(2, 1470) = 8.08, p < .001, and duration, F(2, 1566) = 5.33, p < .01, and a significant interaction between duration and group, F(4, 1565) = 4.39, p < .01. We found CWS-Per to have significantly higher SBPr scores than both CWS-Rec (β = −.37, SE = .17, t = −2.23, p < .05) and CWNS (β = −.68, SE = .17, t = −3.929, p < .0001) overall. Following up the interaction effects, multiple comparisons were performed at short, moderate, and long utterance lengths corresponding to the duration percentiles of .05, .5, and .95 and utterance durations of 1.11, 1.76, and 2.78 s, respectively. CWS-Per had significantly higher SBPr scores than both CWS-Rec and CWNS at short durations (CWS-Per vs. CWS-Rec, β = .92, SE = .33, t = 2.76, p < .05; CWS-Per vs. CWNS, β = 1.08, SE = .32, t = 3.3, p < .01, Tukey adjusted) and long durations (CWS-Per vs. CWS-Rec, β = 1.24, SE = .46, t = 2.67, p < .05; CWS-Per vs. CWNS, β = 1.48, SE = .46, t = 3.2, p < .01, Tukey adjusted), whereas the CWS-Rec and CWNS did not differ from one another at any duration point. However, it should be noted that CWS-Per tended to score higher than CWS-Rec at moderate duration (β = .56, SE = .25, t = 2.27, p = .06).
A consistent pattern was observed for CNTR. We found significant main effects for group, F(2, 1466) = 9.91, p < .0001, and duration, F(2, 1566) = 10.38, p < .0001, and a significant interaction between duration and group, F(4, 1564) = 5.12, p < .0005. CWS-Per overall had significantly lower CNTR scores than both CWS-Rec (β = .63, SE = .29, t = 2.12, p < .05) and CWNS (β = 1.27, SE = .29, t = −4.28, p < .0001). Follow-up of the interaction effect indicated that CWS-Per had significantly lower CNTR scores than both CWS-Rec and CWNS at short duration (CWS-Per vs. CWS-Rec, β = −.32, SE = .10, t = −2.97, p < .01; CWS-Per vs. CWNS, β = −.41, SE = .10, t = −3.90, p < .001, Tukey adjusted) and long duration (CWS-Per vs. CWS-Rec, β = −.39, SE = .12, t = −2.51, p < .05; CWS-Per vs. CWNS, β = −.30, SE = .96, t = −3.17, p < .01, Tukey adjusted), whereas CWS-Rec and CWNS did not differ from one another at any duration point.
Results on IMFr strongly resembled those of SBPr, although the effects were smaller and did not reach statistical significance for most comparisons. A main effect of duration was observed, F(2, 1573) = 3.92, p < .05, and CWS-Per had higher IMFr scores than CWS-Rec (β = .33, SE = .15, t = 2.22, p < .05) at long duration only, whereas no significant difference was observed between CWS-Per and CWNS (β = .26, SE = .14, t = 1.75, p = .07), although CWS-Per still tended to have higher scores than CWNS.
The three power distribution metrics were also found to be highly correlated (SBPr vs. CNTR, r = −.8; SBPr vs. IMFr, r = .56; CNTR vs. IMFr, r = −.49, p < .0001). Higher SBPr and IMFr along with lower CNTR observed for CWS-Per indicate a relatively higher degree of low-frequency (suprasyllabic) periodicity, that is, more power/energy on the lower end of the spectrum—in suprasyllabic oscillations—especially for shorter and longer utterances. See Figure 2 for the marginal effects plots of the models on power distribution metrics—generated using the sjPlot's “plot_model” function (Lüdecke et al., 2019).
Figure 2.
Marginal effects plots of the regression models, depicting the power distribution metrics of (a) spectral band power ratio (SBPr), (b) envelope spectral centroid (CNTR), and (c) intrinsic mode function ratio (IMFr) as a function of duration for children who stutter that were eventually diagnosed as persisting (CWS-Per), children who stutter that were eventually diagnosed as recovered (CWS-Rec), and children who do not stutter (CWNS). The estimates are depicted as the solid line, with the confidence interval depicted as the shaded region.
Rate Metrics
Results on the rate metrics (ω1 and ω2—instantaneous frequencies of IMF1 and IMF2, respectively) did not yield significant between-groups differences overall; however, a significant Group × Duration interaction was observed for ω2, F(2, 1574) = 3.88, p < .01. Follow-up comparisons revealed that CWS-Per had lower suprasyllabic rates (ω2) compared to CWNS at lower durations (β = −.26, SE = .08, t = −3.20, p < .01). CWS-Per were also slightly slower than CWS-Rec, but it did not reach statistical significance. A similar trend was observed for ω1 at lower durations, but none of the comparisons reached statistical significance. At lower durations, there was a trend for CWS-Per to be slightly slower at both syllabic and suprasyllabic timescales—see Figure 3. The rate metrics were also found to be moderately correlated (r = .47, p < .0001).
Figure 3.
Marginal effects plots of the regression models, depicting the rate metrics of (a) ω1 and (b) ω2 as a function of duration for children who stutter that were eventually diagnosed as persisting (CWS-Per), children who stutter that were eventually diagnosed as recovered (CWS-Rec), and children who do not stutter (CWNS). The estimates are depicted as the solid line, with the confidence interval depicted as the shaded region.
Rhythmic Stability Metrics
Rhythmic stability metrics correspond to the within-utterance variance of ω1 and ω2, which represent the instability of the instantaneous frequencies of IMF1 and IMF2, respectively. Results revealed a significant main effect of duration for both var.ω1, F(2, 1532) = 9.62, p < .0001, and var.ω2, F(2, 1532) = 59.00, p < .0001, and a marginally significant interaction between duration and group, F(4, 1534) = 2.2, p = .06, for var.ω1 only. The children in general exhibited higher syllabic and suprasyllabic variability with increased duration whereby the instantaneous frequencies became more variable in longer utterances, which was in line with Tilsen and Arvaniti (2013). CWS-Per had significantly higher var.ω1 scores than both CWS-Rec and CWNS at long duration only (CWS-Per vs. CWS-Rec, β = .70, SE = .30, t = 2.31, p < .05; CWS-Per vs. CWNS, β = .94, SE = .31, t = 3.02, p < .01), indicative of greater variability in the timing of syllabic rhythms when the utterances are long. The three groups did not differ from one another on var.ω2 at any duration point. See Figure 4.
Figure 4.
Marginal effects plots of the regression models, depicting the rhythm stability metrics of (a) var.ω1 and (b) var.ω2 as a function of duration for children who stutter that were eventually diagnosed as persisting (CWS-Per), children who stutter that were eventually diagnosed as recovered (CWS-Rec), and children who do not stutter (CWNS). The estimates are depicted as the solid line, with the confidence interval depicted as the shaded region.
Predicting Persistence Based on the Speech Rhythm Metrics
The final binary logistic model to predict the likelihood that a child would be classified as persistent or recovered was selected from a stepwise procedure with the objective of minimizing the AIC value. This model included two predictor variables, SBPr and var.ω1. These measures were also found to have meaningful group-based differences at longer durations in the retrospective analysis. This analysis was limited to utterances longer than 2.14 s (with a maximum length of 3 s), which corresponds to the 75th percentile of the duration distribution from the two stuttering groups. This subset of data was selected based on the retrospective analysis results revealing that duration interacts with most of the speech measures and more pronounced differences are present for longer utterances. Results indicated that SBPr predicted persistence (B = 1.28, 95% CI [0.46, 2.37], p < .01), indicating that for each one-unit increase in SBPr, the log odds of persisting increases by 1.28, and that var.ω1 predicted persistence (B = 1.97, 95% CI [0.53, 3.87], p < .05), indicating that for each one-unit increase in var.ω1, the log odds of persisting increases by 1.97. Figure 5 displays predictor effect plots (Fox & Weisberg, 2018), which provide a graphical summary of the fitted regression model with the two predictor variables.
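Because log odds are hard to read directly, it can help to exponentiate the reported coefficients into odds ratios. The short sketch below does only that arithmetic on the values reported above; the helper name is hypothetical, and a "one-unit increase" refers to whatever scale each predictor was entered on in the model. Predicted probabilities are not computed because the model intercept is not reported here.

```python
import math

# Reported coefficients: (B, lower 95% CI bound, upper 95% CI bound), in log odds.
coefficients = {
    "SBPr": (1.28, 0.46, 2.37),
    "var.ω1": (1.97, 0.53, 3.87),
}

def odds_ratio(log_odds):
    """Convert a log-odds coefficient into an odds ratio."""
    return math.exp(log_odds)

for name, (b, lower, upper) in coefficients.items():
    print(f"{name}: OR = {odds_ratio(b):.2f}, "
          f"95% CI [{odds_ratio(lower):.2f}, {odds_ratio(upper):.2f}]")
```

An odds ratio above 1 with a confidence interval that excludes 1 corresponds to a positive coefficient whose interval excludes 0, as is the case for both predictors.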
Figure 5.
Predictor effect plots for the fitted regression model with the two predictor variables of spectral band power ratio (SBPr) and var.ω1. The y-axis represents the probability of stuttering persistence; a value of 1 represents a probability of 100% of being classified as children who stutter that were eventually diagnosed as persisting. The shaded areas are pointwise confidence intervals for the fitted values (based on standard errors computed from the fitted regression coefficients).
The corresponding ROC curve, which represents sensitivity (proportion of children with persisting stuttering who are correctly identified as CWS-Per) and specificity (proportion of recovered children who are correctly identified as CWS-Rec), of the model is shown in Figure 6. The area under the ROC curve for predicting persistence was 0.87; that is, a randomly chosen child who was eventually diagnosed as persisting would receive a higher predicted risk than a randomly chosen child who was eventually diagnosed as recovered about 87% of the time. Values between 0.80 and 0.89 are considered “good” predictive validity according to Carter et al. (2016) and “excellent” predictive validity according to Mandrekar (2010), while a value of 0.5 suggests no discriminative ability. Walsh et al. (2021) reported a predictive probability cutoff value of .4 as most appropriate for identifying persistence based on various clinical risk factors given that an identified persistence risk of 40% or higher resulted in better diagnostic validity. A cutoff value of .4 would mean that if a child's risk of persistence was 40% or higher, then that child would be considered a candidate for immediate intervention. Sensitivity is also often prioritized over specificity because failing to identify a true persisting child could have profound negative consequences, whereas recommending treatment for a child who would recover would not have effects that are as adverse (Walsh et al., 2021). The visual inspection of the ROC curve (see Figure 6) in our data set also identified a cutoff value of .4 as meaningful to calculate the predictive probabilities and report the accuracy measures for the model. This cutoff value yielded a sensitivity (true-positive rate) of 80% (correctly predicted to be CWS-Per) and a specificity of 75% (correctly predicted to be CWS-Rec). A 1 − specificity value of 25% refers to the proportion of CWS-Rec incorrectly identified as CWS-Per (false-positive rate).
This threshold value prioritizes sensitivity (the true-positive rate) while keeping the false-positive rate low. The model was able to correctly classify 80% of CWS-Per (eight out of 10 were correctly classified) and 75% of CWS-Rec (15 out of 20 were correctly classified), for an overall success rate of 76.6% (see Table 3). The ROC analysis and the confusion matrix provide evidence for the robustness of a model using speech rhythm measures in predicting whether a preschool child's stuttering will persist.
Figure 6.
Receiver operating characteristic curve of the two speech rhythm measures (blue line) to discriminate eventual stuttering recovery and persistence. Sensitivity (true-positive) is plotted along the y-axis against 1 – specificity (false-positive) along the x-axis. The 45° diagonal line serves as null reference denoting no discrimination.
Table 3.
Confusion matrix for predicting stuttering outcome using the logistic regression model with a cutoff value of .4.
Observed group membership | Predicted CWS-Per | Predicted CWS-Rec | Overall |
---|---|---|---|
CWS-Per | 80% (8) | 20% (2) | 76.6% |
CWS-Rec | 25% (5) | 75% (15) | |
Note. CWS-Per = children who stutter that were eventually diagnosed as persisting; CWS-Rec = children who stutter that were eventually diagnosed as recovered.
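The accuracy measures follow directly from the counts in the confusion matrix (eight of 10 CWS-Per and 15 of 20 CWS-Rec correctly classified). The sketch below simply reproduces that arithmetic; the variable names and the dictionary layout are illustrative choices.

```python
# Counts from the confusion matrix: confusion[observed][predicted].
confusion = {
    "CWS-Per": {"CWS-Per": 8, "CWS-Rec": 2},
    "CWS-Rec": {"CWS-Per": 5, "CWS-Rec": 15},
}

tp = confusion["CWS-Per"]["CWS-Per"]   # persisting, predicted persisting
fn = confusion["CWS-Per"]["CWS-Rec"]   # persisting, predicted recovered
fp = confusion["CWS-Rec"]["CWS-Per"]   # recovered, predicted persisting
tn = confusion["CWS-Rec"]["CWS-Rec"]   # recovered, predicted recovered

sensitivity = tp / (tp + fn)           # proportion of CWS-Per correctly identified
specificity = tn / (tn + fp)           # proportion of CWS-Rec correctly identified
false_positive_rate = 1 - specificity
accuracy = (tp + tn) / (tp + fn + fp + tn)

print(f"sensitivity = {sensitivity:.3f}")
print(f"specificity = {specificity:.3f}")
print(f"1 - specificity = {false_positive_rate:.3f}")
print(f"overall accuracy = {accuracy:.3f}")
```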
Discussion
Speech rhythm is known to be complex and multidimensional, thereby not amenable to a simple analysis (Tilsen & Arvaniti, 2013). One of the most significant challenges in the research of speech rhythm is that there is no a priori or universally agreed scientific definition of speech rhythm from which a ground truth measure of speech rhythm could be extracted. Despite this, there is a common intuition that signals representing recurring events can vary in the extent to which those events recur in a regular pattern. Envelope-based analyses have been developed to quantitatively characterize speech rhythm more comprehensively than traditional acoustic metrics derived from durational relations of linguistic units (e.g., PVI). Envelope-based measures have been used more heavily than duration-based metrics in the last decade because they capture fluctuations (temporal regularities) in acoustic energy emerging from the movement of articulatory gestures without depending on constructional linguistic units, which likely mature throughout the course of development (especially after children start to read) and likely differ among various clinical populations. Although more studies are needed to verify the utility of envelope-based measures across different age groups and clinical populations, they are arguably more useful for broader applications in both adult and child speech as well as in clinical and nonclinical populations owing to the lack of linguistic assumptions (Liss et al., 2010; Tilsen & Arvaniti, 2013). Therefore, this study used ESA and EMD methods (adapted from Tilsen & Arvaniti, 2013) to examine the speech rhythm characteristics of young children with the aim of improving our understanding of developmental stuttering and its persistence. It revealed interesting contrasts, especially between the speech rhythm of CWS-Per and CWS-Rec, at a time when both groups were still stuttering and their final stuttering status was not yet identified.
The ESA and EMD measures used in the study comprised power distribution metrics, which capture the relative power in syllabic versus suprasyllabic oscillations in the envelope; rate metrics, which capture the frequencies of those oscillations; and rhythmic stability metrics, which capture the stability (i.e., variability) of these oscillations. Results on the power distribution metrics overall showed that persisting children exhibited (a) relatively more spectral power in the lower frequency band (1.5–3 Hz) compared to the higher frequency band (3.5–10 Hz; SBPr), indicative of relatively more low-frequency timescale oscillations; (b) a higher ratio of power in IMF2 relative to IMF1 (IMFr), indicative of a relative influence of stress timescale periodicity; and (c) a lower centroid (CNTR), indicative of an overall concentration of frequencies toward the lower end of the spectrum (low-frequency periodicity). These trends collectively point to a relatively higher degree of suprasyllabic timescale periodicity in the speech of CWS-Per, while no differences were observed between CWS-Rec and CWNS. These differences were especially pronounced for shorter and longer durations (i.e., utterances of less typical length), suggesting that the speech rhythm differences of CWS-Per might lie more in the utterances in upper and lower ranges of the duration distribution. This could arise from a difference in the phonetic manifestations of prominence (in terms of intensity and duration) at the suprasyllable level for short and long utterances in the case of CWS-Per. Alternatively, there could be distinct explanations for the between-groups differences observed at relatively short and long utterances. Shorter utterances likely involve a single prosodic phrase, in which case a phrase-level rhythm cannot be established, thereby increasing attention to suprasyllabic periodicity. 
In contrast, longer utterances are likely to involve several phrases, which may make the maintenance of a phrase-level rhythm more complex and challenging, also increasing attention to suprasyllabic periodicity. Medium-duration utterances could be less prone to exhibiting rhythmic differences between groups because they reflect a compromise as they allow a phrase-level rhythm to be established but do not require that rhythm to be maintained for more than a couple phrases.
It was previously suggested that rate metrics (instantaneous frequencies of IMF1 and IMF2) do not capture the same kind of information as power distribution metrics, because several languages have shown different patterns for power distribution metrics and rate metrics (Tilsen & Arvaniti, 2013). In this study, unlike the power distribution metrics, no meaningful differences were observed for rate of syllabic (ω1) and suprasyllabic (ω2) periodicities, except the lower suprasyllabic rates (ω2) observed in shorter utterances of CWS-Per. One likely explanation for this observation is that there could be more demand on the motor or semantic/conceptual planning processes for short utterances for CWS-Per, which could translate into lower suprasyllabic rates, but then it is less clear why longer utterances did not result in such similar decrease when a similar increase in motor or linguistic demand would be expected.
Rhythmic stability metrics reflect the degree to which the instantaneous frequency of each IMF remains constant throughout a stretch of an utterance, indexing the stability of syllabic and suprasyllabic rhythms within utterances. CWS-Per displayed a higher degree of variability in the timing of syllabic rhythms (IMF1), indicative of a lower degree of rhythmicity in syllabic timescale oscillations for longer utterances. This might reflect a higher degree of variation in syllable duration when the utterances are long. On the other hand, the finding of no group-based differences in variability at the suprasyllabic level indicates that the children did not differ in the regularity with which stresses appear in speech (stress-related periodicity). Given that rate metrics did not display group-based differences, unlike the rhythmic stability metric at the syllable level, this indicates that it was the variability of repeating syllabic oscillations and not the rate at which they occurred that differentiated CWS-Per from the other two groups. When the mean instantaneous frequency of IMF1 (ω1) is constant but the variability of the frequency of IMF1 (var.ω1) is relatively higher, a lower degree of rhythmicity can be assumed. In line with Tilsen and Arvaniti (2013), both IMFs were more variable as the utterance length increased, which suggests decreased rhythmicity with increased utterance length. The increase of variability (decrease of rhythmicity), however, was significantly more pronounced for CWS-Per than CWS-Rec and CWNS, whereas CWS-Rec and CWNS performed in a similar fashion.
The model focusing on long utterances has shown that both SBPr and var.ω1 were associated with stuttering persistence—specifically, more low-frequency power on the suprasyllabic level and higher variability of syllabic oscillations were associated with an increased probability of persistence. Exactly how increased low-frequency power on the suprasyllabic level might be related to higher variability of syllabic oscillations is less clear and is subject to future research. However, one possible explanation is that when speakers devote more effort or attention to regulating suprasyllabic timing, they may sacrifice consistency in the regulation of syllable timing. A similar idea has been proposed in the context of models of speech rhythm in which coupled syllable- and foot-timescale oscillators govern syllable timing (O'Dell & Nieminen, 1999; Saltzman et al., 2008).
In this study, unlike CWNS or CWS-Rec, CWS-Per displayed a higher degree of syllabic variability (instability) in longer sentences. CWS tend to stutter more on longer and syntactically more complex utterances (Buhr & Zebrowski, 2009; Logan & Conture, 1995; Sawyer et al., 2008) where length and complexity are usually correlated. Longer utterances likely place more demand on speech motor planning and execution processes due to more complex linguistic and grammatical programming (Maner et al., 2000; Yaruss, 1999). Speech motor stability of CWS is known to decrease when the length and syntactic complexity of utterances increase, resulting in increased articulatory coordination variability in repeated productions as assessed by kinematic approaches (Smith et al., 2012; Usler et al., 2017; Walsh et al., 2015; cf. Max & Yudman, 2003). Using acoustic measures, Dokoza et al. (2011) also documented such an increase in acoustic variability of various speech segments (i.e., voice onset times and syllables) at the sentence level in 6- to 8-year-old CWS. Greater kinematic (i.e., lip aperture) variability in speech production was previously found to be associated with stuttering persistence in 5- to 7-year-old children (Usler et al., 2017), and lip aperture was shown to correlate with its corresponding acoustic signal intensity (Chandrasekaran et al., 2009; He & Dellwo, 2017). On the other hand, from a developmental perspective, syllable intensity variability was found to decrease as a function of age, indicative of maturation of articulatory motor control (He, 2018), which might also be a consequence of decreased lip aperture variability (He & Dellwo, 2017). 
Accordingly, higher variability at the syllabic level in the speech of persisting children could be related to such kinematic variabilities and speech motor vulnerabilities observed in other studies, and it may be indicative of a vulnerable speech–motor system that is especially susceptible to and taxed by the linguistic demands required to produce longer and more complex sentences.
One explanation for why the speech rhythm of CWS-Rec was more like that of CWNS (than that of CWS-Per) is that the speech motor mechanism of children who are on the pathway of recovery is already more refined and mature than that of children whose stuttering would later persist into adulthood. If such deviances in speech motor coordination and timing can indeed be shown to be a reliable predictor of persistent stuttering, this might lend support for the notion that persistent stuttering in particular is more closely associated with a timing/rhythm deficit. Indeed, the notion of differential developmental pathways in persistent and recovered stuttering is not new (Ambrose et al., 2015; Yairi & Ambrose, 2013). Such a view indicates that the mechanisms that lead to stuttering in children who eventually recover and in children whose stuttering persists operate differently in the epidemiological (Yairi & Ambrose, 2005), language (Watkins & Yairi, 1997; Yairi & Ambrose, 2005), motor control (Spencer & Weber-Fox, 2014), and/or temperamental/emotional (Ambrose et al., 2015; Erdemir et al., 2018) domains. Despite the growing empirical support for reliable distinctions between persistent and recovered groups as subtypes, the relative prominence of epidemiological, linguistic, motor, or emotional factors pertaining to the differentiation of their developmental pathways is still not clear. However, given that natural recovery is (at least partially) linked to maturation of the neural mechanisms of speech motor control (Forster & Webster, 2001), it is plausible that neuromotor patterns facilitating fluent speech develop differently in CWS-Per, which in turn would be expected to align with the characteristics of persistent stuttering in adults.
Given the multifactorial nature of developmental stuttering, it is also possible that processes such as phonological (e.g., reduced nonword repetition and rhyming abilities; Spencer & Weber-Fox, 2014; Usler & Weber-Fox, 2015) and/or coarticulatory (e.g., differences in second formant transition rates; Chang et al., 2002; Subramanian et al., 2003) processes shown to be associated with stuttering persistence affect the way speech rhythm manifests. In particular, difficulties with phonological and/or coarticulatory processes would be expected to result in overall lower frequency oscillations on both syllabic and suprasyllabic timescales (i.e., slower pace). In this study, no significant speech rate differences were observed across groups, except for the lower suprasyllabic rates of CWS-Per for the short utterances only, which might potentially be related to such vulnerabilities. In theory, phonological and/or coarticulatory differences could also interact with the sonority of coarticulatory gestures, affecting the speech amplitude envelope and its temporal variation at syllabic or suprasyllabic timescales to some degree. However, understanding how phonological processing and/or coarticulation impacts speech rhythm measures used in this study requires future research with a highly controlled set of stimuli. Moreover, even if such differences exist between the speaker groups, this is a chicken-or-the-egg problem and an open theoretical question as to whether they are the cause or effect of rhythmic differences observed. Ultimately, the speech motor system cannot be viewed in isolation and separate from auditory, linguistic, cognitive, and emotional processes. Auditory and other forms of feedback are crucial aspects of motor control, higher level cognitive/semantic representations govern the behavior of the motor system, and the motor system of speech production undoubtedly shapes the language structure and mediates the way it is perceived. 
In this respect, language and motor development are inherently tied, and they may play a collective role in speech rhythm.
Along these lines, there is growing evidence suggesting that atypical motor development could also serve as a risk factor for various speech/language disorders such as dyslexia (Capellini et al., 2010; Gooch et al., 2014) and developmental language disorder (DiDonato Brumbach & Goffman, 2014; Finlay & McPhillips, 2013; Flapper & Schoemaker, 2013; Jäncke et al., 2007). Converging evidence also supports the presence of comorbidities between speech/language disorders and motor disorders in children and has suggested that certain motor impairments have been associated with difficulties in language processing (Lense et al., 2021; Mirabella et al., 2017) and stuttering (Pruett et al., 2021). One possible explanation for such association is that deviant rhythm and/or timing processes represent a common underlying biological risk factor that may lead to comorbid impairments in speech/language processing, including developmental stuttering (see Atypical Rhythm Risk Hypothesis by Ladányi et al., 2020). Indeed, it has been proposed that internal timing deficits could result in not only impaired processing of rhythmic structures but also impaired syntactic processing of language where speech-language processing, rhythm processing, and/or motor impairments are all associated with one another (Fiveash et al., 2021; Ladányi et al., 2020). Future research should continue to consider the complex nature of these rhythm and timing processes and their potential contributions to speech-language development and disorders.
Limitations
It must be emphasized that this study was designed as an initial exploratory study and therefore has notable limitations. One major limitation is its relatively small sample size, particularly for those children whose stuttering persisted. Although this may limit the generalizability of the findings to the broader population, a low number of CWS-Per was expected given typical persistence and recovery rates, with only 15%–25% of CWS persisting (for a review, see Yairi & Ambrose, 2005, 2013). This is a challenge faced by most empirical research on persistence in developmental stuttering. This study included all the children identified as persistent from a larger group of stuttering children in the study, but going forward, a larger sample would permit a more robust comparison between the speech rhythm characteristics of persistent and recovered children.
A second concern is the length of tracking for stuttering status: each child was assessed 4–5 times over a 2- to 2.5-year period until about age 6–6.5 years. Although it is possible that a child classified as persisting in this study went on to recover following participation, past work suggests that the chances are relatively small, given that the highest rates of recovery occur during the first 12–24 months post onset and most natural recovery takes place within 3–4 years post onset and before the age of 7 years, after which the chance of recovery drops to 5% (Yairi & Ambrose, 1999, 2005).
A third limitation is related to the narrative speech task, which may have introduced confounding effects of linguistic factors (language and phonology) that might differ among the groups. We controlled for this to a certain degree by asking the children to talk about the same story (rather than speaking freely) and by accounting for utterance length in our analyses; however, it is still possible that linguistic factors contributed, at least partially, to the differential trends observed for the three groups.
Fourth, this study focuses on speech rhythm in isolation from other risk factors (such as stuttering severity and linguistic factors) known to be predictive of stuttering persistence when assessed cumulatively (e.g., Singer et al., 2022; Walsh et al., 2021). To isolate speech rhythm qualities, we matched the three groups on stuttering and speech-language abilities. Although we may thereby have avoided the confounding effects of stuttering severity and linguistic factors on speech rhythm, we may also have concealed some of the naturally occurring differences in speech rhythm across the groups. Given the multifactorial nature of developmental stuttering, higher stuttering severity and/or lower speech-language abilities may be associated with more pronounced speech rhythm differences if left to vary freely. Additional studies are needed in which multiple risk factors are considered collectively, rather than in isolation, to account for interactions and to yield more ecologically valid and possibly stronger predictions of stuttering persistence.
Finally, given the novelty of the approach, we did not have specific hypotheses about which of the envelope-based measures would be most discriminative of persistent stuttering; the study is therefore exploratory in nature. It has yet to be established how these metrics correlate with duration-based speech rhythm metrics or perceptual judgments of rhythmicity, which would help to develop a comprehensive understanding of the rhythmic qualities of speech in persistent stuttering.
Conclusions
This article has presented a new method for characterizing speech rhythm in young CWS-Per, CWS-Rec, and CWNS based on ESA and EMD of the vocalic energy amplitude envelope. This study is a critical first step in establishing speech rhythm differences with the aim of predicting risk of stuttering persistence. The speech rhythm of CWS who were later identified as persistent differed from that of recovered or nonstuttering children. Two measures provided high accuracy in classifying CWS who persist versus those who recover, supporting the notion that there is important rhythm-related information in the speech of these children. Future research is warranted to further explore envelope-based measures for characterizing speech rhythm in children and establishing a systematic association with stuttering persistence. If robust associations can be established, efforts could also be directed toward the development of clinically relevant approaches. Ultimately, these results should also be considered within the multifactorial nature of developmental stuttering to establish a comprehensive understanding of the factors that contribute to its onset and development. Speech rhythm could be a strong candidate measure for determining an individual child's risk for stuttering persistence in young CWS before the pathways for recovery and persistence are established.
Data Availability Statement
Participant consent to share de-identified data was not obtained at the time of data collection. Thus, based on current regulations, the Vanderbilt Institutional Review Board determined that de-identified data used in this study are not eligible to be shared without reconsenting the past participants, and the researchers of this study are no longer in contact with these past participants.
Acknowledgments
This work was supported by grants from the National Institute on Deafness and Other Communication Disorders (NIDCD) to Vanderbilt University (5R01DC000523-19, 2R56DC000523-20A1) and to Vanderbilt University Medical Center (R21DC016723, R01DC020311), as well as Vanderbilt CTSA grants from National Center for Advancing Translational Sciences (UL1RR024975, UL1TR00044506, UL1TR002243), and a Vanderbilt Kennedy Center Hobbs Discovery Grant. This research was also supported by the Wilker-Ellis Stuttering Research Fund at Vanderbilt University Medical Center. The research and content reported herein is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health, NIDCD, Vanderbilt University, Vanderbilt University Medical Center, the Vanderbilt Kennedy Center, or the generous donors that supported this work. The authors extend sincere appreciation to the young children and their caregivers who participated in this study, individuals without whose cooperation this project would not have been possible to conduct.
References
- Abercrombie, D. (1967). Elements of general phonetics. Edinburgh University Press.
- Ambrose, N. G., Cox, N. J., & Yairi, E. (1997). The genetic basis of persistence and recovery in stuttering. Journal of Speech, Language, and Hearing Research, 40(3), 567–580. https://doi.org/10.1044/jslhr.4003.567
- Ambrose, N. G., Yairi, E., Loucks, T. M., Seery, C. H., & Throneburg, R. (2015). Relation of motor, linguistic and temperament factors in epidemiologic subtypes of persistent and recovered stuttering: Initial findings. Journal of Fluency Disorders, 45, 12–26. https://doi.org/10.1016/j.jfludis.2015.05.004
- Anderson, J. D., Pellowski, M. W., Conture, E. G., & Kelly, E. M. (2003). Temperamental characteristics of young children who stutter. Journal of Speech, Language, and Hearing Research, 46(5), 1221–1233. https://doi.org/10.1044/1092-4388(2003/095)
- Arvaniti, A. (2012). The usefulness of metrics in the quantification of speech rhythm. Journal of Phonetics, 40(3), 351–373. https://doi.org/10.1016/j.wocn.2012.02.003
- Blumgart, E., Tran, Y., & Craig, A. (2010). Social anxiety disorder in adults who stutter. Depression and Anxiety, 27(7), 687–692. https://doi.org/10.1002/da.20657
- Boutsen, F. R., Brutten, G. J., & Watts, C. R. (2000). Timing and intensity variability in the metronomic speech of stuttering and nonstuttering speakers. Journal of Speech, Language, and Hearing Research, 43(2), 513–520. https://doi.org/10.1044/jslhr.4302.513
- Buhr, A., & Zebrowski, P. (2009). Sentence position and syntactic complexity of stuttering in early childhood: A longitudinal study. Journal of Fluency Disorders, 34(3), 155–172. https://doi.org/10.1016/j.jfludis.2009.08.001
- Bunta, F., & Ingram, D. (2007). The acquisition of speech rhythm by bilingual Spanish- and English-speaking 4- and 5-year-old children. Journal of Speech, Language, and Hearing Research, 50(4), 999–1014. https://doi.org/10.1044/1092-4388(2007/070)
- Capellini, S. A., Coppede, A. C., & Valle, T. R. (2010). Fine motor function of school-aged children with dyslexia, learning disability and learning difficulties. Pro-Fono: Revista De Atualizacao Cientifica, 22(3), 201–208. https://doi.org/10.1590/s0104-56872010000300008
- Carter, J. V., Pan, J., Rai, S. N., & Galandiuk, S. (2016). ROC-ing along: Evaluation and interpretation of receiver operating characteristic curves. Surgery, 159(6), 1638–1645. https://doi.org/10.1016/j.surg.2015.12.029
- Chandrasekaran, C., Trubanova, A., Stillittano, S., Caplier, A., & Ghazanfar, A. A. (2009). The natural statistics of audiovisual speech. PLOS Computational Biology, 5(7), Article e1000436. https://doi.org/10.1371/journal.pcbi.1000436
- Chang, S.-E., Chow, H. M., Wieland, E. A., & McAuley, J. D. (2016). Relation between functional connectivity and rhythm discrimination in children who do and do not stutter. NeuroImage: Clinical, 12, 442–450. https://doi.org/10.1016/j.nicl.2016.08.021
- Chang, S.-E., & Guenther, F. H. (2020). Involvement of the cortico-basal ganglia-thalamocortical loop in developmental stuttering. Frontiers in Psychology, 10, 3088. https://doi.org/10.3389/fpsyg.2019.03088
- Chang, S.-E., Ohde, R. N., & Conture, E. G. (2002). Coarticulation and formant transition rate in young children who stutter. Journal of Speech, Language, and Hearing Research, 45(4), 676–688. https://doi.org/10.1044/1092-4388(2002/054)
- Chang, S.-E., & Zhu, D. C. (2013). Neural network connectivity differences in children who stutter. Brain, 136(12), 3709–3726. https://doi.org/10.1093/brain/awt275
- Conture, E. G. (2001). Stuttering: Its nature, diagnosis, and treatment. Pearson College Division.
- Conture, E. G., & Walden, T. (2012). Dual diathesis-stressor model of stuttering. In Filatova Y. O. (Ed.), Theoretical issues of fluency disorders (pp. 94–127). National Book Centre.
- Craig, A., Blumgart, E., & Tran, Y. (2009). The impact of stuttering on the quality of life in adults who stutter. Journal of Fluency Disorders, 34(2), 61–71. https://doi.org/10.1016/j.jfludis.2009.05.002
- Cummins, F. (2009). Rhythm as entrainment: The case of synchronous speech. Journal of Phonetics, 37(1), 16–28. https://doi.org/10.1016/j.wocn.2008.08.003
- Dechamma, D., & Maruthy, S. (2018). Envelope modulation spectral (EMS) analyses of solo reading and choral reading conditions suggest changes in speech rhythm in adults who stutter. Journal of Fluency Disorders, 58, 47–60. https://doi.org/10.1016/j.jfludis.2018.09.002
- Dellwo, V. (2006). Rhythm and speech rate: A variation coefficient for deltaC. In Karnowski P. & Szigeti I. (Eds.), Language and language-processing (pp. 231–241). Peter Lang. https://doi.org/10.5167/uzh-111789
- Desquilbet, L., & Mariotti, F. (2010). Dose-response analyses using restricted cubic spline functions in public health research. Statistics in Medicine, 29(9), 1037–1057. https://doi.org/10.1002/sim.3841
- DiDonato Brumbach, A. C., & Goffman, L. (2014). Interaction of language processing and motor skill in children with specific language impairment. Journal of Speech, Language, and Hearing Research, 57(1), 158–171. https://doi.org/10.1044/1092-4388(2013/12-0215)
- Diggle, P., Heagerty, P., Liang, K.-Y., & Zeger, S. (2013). Analysis of longitudinal data (2nd ed.). Oxford University Press. https://pesquisa.bvsalud.org/portal/resource/pt/biblio-941525
- Dokoza, K. P., Hedever, M., & Sarić, J. P. (2011). Duration and variability of speech segments in fluent speech of children with and without stuttering. Collegium Antropologicum, 35(2), 281–288.
- Dunn, L. M., & Dunn, D. M. (2007). Peabody Picture Vocabulary Test–Fourth Edition (PPVT-4). Pearson Assessments. https://www.pearsonassessments.com/store/usassessments/en/Store/Professional-Assessments/Academic-Learning/Brief/Peabody-Picture-Vocabulary-Test-%7C-Fourth-Edition/p/100000501.html
- Durrleman, S., & Simon, R. (1989). Flexible regression models with cubic splines. Statistics in Medicine, 8(5), 551–561. https://doi.org/10.1002/sim.4780080504
- Erdemir, A., Walden, T. A., Jefferson, C. M., Choi, D., & Jones, R. M. (2018). The effect of emotion on articulation rate in persistence and recovery of childhood stuttering. Journal of Fluency Disorders, 56, 1–17. https://doi.org/10.1016/j.jfludis.2017.11.003
- Etchell, A. C., Johnson, B. W., & Sowman, P. F. (2015). Beta oscillations, timing, and stuttering. Frontiers in Human Neuroscience, 8, Article 1036. https://doi.org/10.3389/fnhum.2014.01036
- Etchell, A. C., Ryan, M., Martin, E., Johnson, B. W., & Sowman, P. F. (2016). Abnormal time course of low beta modulation in non-fluent preschool children: A magnetoencephalographic study of rhythm tracking. NeuroImage, 125, 953–963. https://doi.org/10.1016/j.neuroimage.2015.10.086
- Falk, S., Müller, T., & Dalla Bella, S. (2015). Non-verbal sensorimotor timing deficits in children and adolescents who stutter. Frontiers in Psychology, 6, 847. https://doi.org/10.3389/fpsyg.2015.00847
- Finlay, J. C. S., & McPhillips, M. (2013). Comorbid motor deficits in a clinical sample of children with specific language impairment. Research in Developmental Disabilities, 34(9), 2533–2542. https://doi.org/10.1016/j.ridd.2013.05.015
- Fiveash, A., Bedoin, N., Gordon, R. L., & Tillmann, B. (2021). Processing rhythm in speech and music: Shared mechanisms and implications for developmental speech and language disorders. Neuropsychology, 35(8), 771–791. https://doi.org/10.1037/neu0000766
- Flapper, B. C. T., & Schoemaker, M. M. (2013). Developmental coordination disorder in children with specific language impairment: Co-morbidity and impact on quality of life. Research in Developmental Disabilities, 34(2), 756–763. https://doi.org/10.1016/j.ridd.2012.10.014
- Forster, D. C., & Webster, W. G. (2001). Speech–motor control and interhemispheric relations in recovered and persistent stuttering. Developmental Neuropsychology, 19(2), 125–145. https://doi.org/10.1207/S15326942DN1902_1
- Fox, J., & Weisberg, S. (2018). Visualizing fit and lack of fit in complex regression models with predictor effect plots and partial residuals. Journal of Statistical Software, 87(9), 1–27. https://doi.org/10.18637/jss.v087.i09
- Garnett, E. O., Chow, H. M., Nieto-Castañón, A., Tourville, J. A., Guenther, F. H., & Chang, S.-E. (2018). Anomalous morphology in left hemisphere motor and premotor cortex of children who stutter. Brain, 141(9), 2670–2684. https://doi.org/10.1093/brain/awy199
- Goldman, R., & Fristoe, M. (2000). Goldman-Fristoe Test of Articulation–Second Edition (GFTA-2). AGS. https://onlinelibrary.wiley.com/doi/abs/10.1002/9780470373699.speced0936
- Gooch, D., Hulme, C., Nash, H. M., & Snowling, M. J. (2014). Comorbidities in preschool children at family risk of dyslexia. Journal of Child Psychology and Psychiatry and Allied Disciplines, 55(3), 237–246. https://doi.org/10.1111/jcpp.12139
- Grabe, E., & Low, E. L. (2002). Durational variability in speech and the rhythm class hypothesis. Papers in Laboratory Phonology, 7(1982), 515–546.
- Grahn, J. A. (2009). The role of the basal ganglia in beat perception: Neuroimaging and neuropsychological investigations. Annals of the New York Academy of Sciences, 1169(1), 35–45. https://doi.org/10.1111/j.1749-6632.2009.04553.x
- Grahn, J. A., & Brett, M. (2007). Rhythm and beat perception in motor areas of the brain. Journal of Cognitive Neuroscience, 19(5), 893–906. https://doi.org/10.1162/jocn.2007.19.5.893
- Grahn, J. A., & McAuley, J. D. (2009). Neural bases of individual differences in beat perception. NeuroImage, 47(4), 1894–1903. https://doi.org/10.1016/j.neuroimage.2009.04.039
- Grajeda, L. M., Ivanescu, A., Saito, M., Crainiceanu, C., Jaganath, D., Gilman, R. H., Crabtree, J. E., Kelleher, D., Cabrera, L., Cama, V., & Checkley, W. (2016). Modelling subject-specific childhood growth using linear mixed-effect models with cubic regression splines. Emerging Themes in Epidemiology, 13(1), Article 1. https://doi.org/10.1186/s12982-015-0038-3
- He, L. (2018). Development of speech rhythm in first language: The role of syllable intensity variability. The Journal of the Acoustical Society of America, 143(6), EL463–EL467. https://doi.org/10.1121/1.5042083
- He, L., & Dellwo, V. (2016). The role of syllable intensity in between-speaker rhythmic variability. International Journal of Speech, Language and the Law, 23(2), 243–273. https://doi.org/10.1558/ijsll.v23i2.30345
- He, L., & Dellwo, V. (2017). Amplitude envelope kinematics of speech: Parameter extraction and applications. The Journal of the Acoustical Society of America, 141(5), 3582–3582. https://doi.org/10.1121/1.4987638
- Hickok, G., Houde, J., & Rong, F. (2011). Sensorimotor integration in speech processing: Computational basis and neural organization. Neuron, 69(3), 407–422. https://doi.org/10.1016/j.neuron.2011.01.019
- Hilger, A. I., Zelaznik, H., & Smith, A. (2016). Evidence that bimanual motor timing performance is not a significant factor in developmental stuttering. Journal of Speech, Language, and Hearing Research, 59(4), 674–685. https://doi.org/10.1044/2016_JSLHR-S-15-0172
- Hillenbrand, J., Getty, L. A., Clark, M. J., & Wheeler, K. (1995). Acoustic characteristics of American English vowels. The Journal of the Acoustical Society of America, 97(5), 3099–3111. https://doi.org/10.1121/1.411872
- Howell, P., Au-Yeung, J., & Rustin, L. (1997). Clock and motor variances in lip-tracking: A comparison between children who stutter and those who do not. In Hulstijn W., Peters H. F. M., & van Lieshout P. H. H. M. (Eds.), Speech production: Motor control, brain research and fluency disorders (pp. 573–578). Elsevier Scientific.
- Hresko, W., Beal, D., & Hamill, D. (1999). Test of Early Language Development-3 (TELD-3). Pro-Ed. https://www.proedinc.com/Products/14645/teld4-test-of-early-language-developmentfourth-edition.aspx
- Huang, N. E., Shen, Z., Long, S. R., Wu, M. C., Shih, H. H., Zheng, Q., Yen, N.-C., Tung, C. C., & Liu, H. H. (1998). The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings: Mathematical, Physical and Engineering Sciences, 454(1971), 903–995.
- James, A. L. (1940). Speech signals in telephony. Sir I. Pitman & Sons, Ltd.
- Jäncke, L., Siegenthaler, T., Preis, S., & Steinmetz, H. (2007). Decreased white-matter density in a left-sided fronto-temporal network in children with developmental language disorder: Evidence for anatomical anomalies in a motor-language network. Brain and Language, 102(1), 91–98. https://doi.org/10.1016/j.bandl.2006.08.003
- Jones, R. M., Buhr, A. P., Conture, E. G., Tumanova, V., Walden, T. A., & Porges, S. W. (2014). Autonomic nervous system activity of preschool-age children who stutter. Journal of Fluency Disorders, 41, 12–31. https://doi.org/10.1016/j.jfludis.2014.06.002
- Jones, R. M., Walden, T. A., Conture, E. G., Erdemir, A., Lambert, W. E., & Porges, S. W. (2017). Executive functions impact the relation between respiratory sinus arrhythmia and frequency of stuttering in young children who do and do not stutter. Journal of Speech, Language, and Hearing Research, 60(8), 2133–2150. https://doi.org/10.1044/2017_JSLHR-S-16-0113
- Kidd, K. K., Heimbuch, R. C., & Records, M. A. (1981). Vertical transmission of susceptibility to stuttering with sex-modified expression. Proceedings of the National Academy of Sciences, 78(1), 606–610. https://doi.org/10.1073/pnas.78.1.606
- Klein, J. F., & Hood, S. B. (2004). The impact of stuttering on employment opportunities and job performance. Journal of Fluency Disorders, 29(4), 255–273. https://doi.org/10.1016/j.jfludis.2004.08.001
- Kohler, K. J. (2009). Rhythm in speech and language. Phonetica, 66(1–2), 29–45. https://doi.org/10.1159/000208929
- Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2017). lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software, 82(13). https://doi.org/10.18637/jss.v082.i13
- Ladányi, E., Persici, V., Fiveash, A., Tillmann, B., & Gordon, R. L. (2020). Is atypical rhythm a risk factor for developmental speech and language disorders? Wiley Interdisciplinary Reviews. Cognitive Science, 11(5), e1528. https://doi.org/10.1002/wcs.1528
- Lense, M. D., Ladányi, E., Rabinowitch, T.-C., Trainor, L., & Gordon, R. (2021). Rhythm and timing as vulnerabilities in neurodevelopmental disorders. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences, 376(1835), 20200327. https://doi.org/10.1098/rstb.2020.0327
- Ling, L. E., Grabe, E., & Nolan, F. (2000). Quantitative characterizations of speech rhythm: Syllable-timing in Singapore English. Language and Speech, 43(4), 377–401. https://doi.org/10.1177/00238309000430040301
- Liss, J. M., LeGendre, S., & Lotto, A. J. (2010). Discriminating dysarthria type from envelope modulation spectra. Journal of Speech, Language, and Hearing Research, 53(5), 1246–1255. https://doi.org/10.1044/1092-4388(2010/09-0121)
- Liss, J. M., White, L., Mattys, S. L., Lansford, K., Lotto, A. J., Spitzer, S. M., & Caviness, J. N. (2009). Quantifying speech rhythm abnormalities in the dysarthrias. Journal of Speech, Language, and Hearing Research, 52(5), 1334–1352. https://doi.org/10.1044/1092-4388(2009/08-0208)
- Logan, K. J., & Conture, E. G. (1995). Length, grammatical complexity, and rate differences in stuttered and fluent conversational utterances of children who stutter. Journal of Fluency Disorders, 20(1), 35–61. https://doi.org/10.1016/0094-730X(94)00008-H
- Lüdecke, D., Bartel, A., Schwemmer, C., Powell, C., Djalovski, A., & Titz, J. (2019). sjPlot: Data Visualization for Statistics in Social Science (2.8.9) [Computer software]. https://CRAN.R-project.org/package=sjPlot
- MacPherson, M. K., & Smith, A. (2013). Influences of sentence length and syntactic complexity on the speech motor control of children who stutter. Journal of Speech, Language, and Hearing Research, 56(1), 89–102. https://doi.org/10.1044/1092-4388(2012/11-0184)
- Mandrekar, J. N. (2010). Receiver operating characteristic curve in diagnostic test assessment. Journal of Thoracic Oncology, 5(9), 1315–1316. https://doi.org/10.1097/JTO.0b013e3181ec173d
- Maner, K. J., Smith, A., & Grayson, L. (2000). Influences of utterance length and complexity on speech motor performance in children and adults. Journal of Speech, Language, and Hearing Research, 43(2), 560–573. https://doi.org/10.1044/jslhr.4302.560
- Maruthy, S., Venugopal, S., & Parakh, P. (2017). Speech rhythm in Kannada speaking adults who stutter. International Journal of Speech-Language Pathology, 19(5), 529–537. https://doi.org/10.1080/17549507.2016.1221459
- Max, L., & Yudman, E. A. (2003). Accuracy and variability of isochronous rhythmic timing across motor systems in stuttering versus nonstuttering individuals. Journal of Speech, Language, and Hearing Research, 46(1), 146–163. https://doi.org/10.1044/1092-4388(2003/012)
- Mayer, M. (1969). Frog, where are you? Dial.
- Meyers, S. C. (1986). Qualitative and quantitative differences and patterns of variability in disfluencies emitted by preschool stutterers and nonstutterers during dyadic conversations. Journal of Fluency Disorders, 11(4), 293–306. https://doi.org/10.1016/0094-730X(86)90017-3
- Mirabella, G., Del Signore, S., Lakens, D., Averna, R., Penge, R., & Capozzi, F. (2017). Developmental coordination disorder affects the processing of action-related verbs. Frontiers in Human Neuroscience, 10, Article 661. https://doi.org/10.3389/fnhum.2016.00661
- Mohan, R., & Weber, C. (2015). Neural systems mediating processing of sound units of language distinguish recovery versus persistence in stuttering. Journal of Neurodevelopmental Disorders, 7(1), Article 28. https://doi.org/10.1186/s11689-015-9124-7
- Mok, P. P. K. (2013). Speech rhythm of monolingual and bilingual children at age 2;6: Cantonese and English. Bilingualism, 16(3), 693–703. https://doi.org/10.1017/S1366728912000636
- Morton, J., & Chambers, S. M. (1976). Some evidence for ‘speech’ as an acoustic feature. British Journal of Psychology, 67(1), 31–45. https://doi.org/10.1111/j.2044-8295.1976.tb01495.x
- Nolan, F., & Jeon, H.-S. (2014). Speech rhythm: A metaphor. Philosophical Transactions of the Royal Society B: Biological Sciences, 369(1658), 20130396. https://doi.org/10.1098/rstb.2013.0396
- O'Brian, S., Jones, M., Packman, A., Menzies, R., & Onslow, M. (2011). Stuttering severity and educational attainment. Journal of Fluency Disorders, 36(2), 86–92. https://doi.org/10.1016/j.jfludis.2011.02.006
- O'Dell, M., & Nieminen, T. (1999). Coupled oscillator model of speech rhythm. In Proceedings of The XIVth International Congress of Phonetic Sciences (pp. 1075–1078).
- Ordin, M., & Polyanskaya, L. (2015). Acquisition of speech rhythm in a second language by learners with rhythmically different native languages. The Journal of the Acoustical Society of America, 138(2), 533–544. https://doi.org/10.1121/1.4923359
- Payne, E., Post, B., Astruc, L., Prieto, P., & Vanrell, M. M. (2012). Measuring child rhythm. Language and Speech, 55(2), 203–229. https://doi.org/10.1177/0023830911417687
- Pike, K. L. (1945). The intonation of American English. University of Michigan Press.
- Pinheiro, J. C., & Bates, D. M. (2000). Linear mixed-effects models: Basic concepts and examples. In Mixed-effects models in S and S-PLUS (pp. 3–56). Springer. https://doi.org/10.1007/978-1-4419-0318-1_1
- Poeppel, D., & Assaneo, M. F. (2020). Speech rhythms and their neural foundations. Nature Reviews Neuroscience, 21(6), 322–334. https://doi.org/10.1038/s41583-020-0304-4
- Polyanskaya, L., & Ordin, M. (2015). Acquisition of speech rhythm in first language. The Journal of the Acoustical Society of America, 138(3), EL199–EL204. https://doi.org/10.1121/1.4929616
- Pompino-Marschall, B. (1989). On the psychoacoustic nature of the P-center phenomenon. Journal of Phonetics, 17(3), 175–192. https://doi.org/10.1016/S0095-4470(19)30428-0
- Pruett, D. G., Shaw, D. M., Chen, H.-H., Petty, L. E., Polikowsky, H. G., Kraft, S. J., Jones, R. M., & Below, J. E. (2021). Identifying developmental stuttering and associated comorbidities in electronic health records and creating a phenome risk classifier. Journal of Fluency Disorders, 68, 105847. https://doi.org/10.1016/j.jfludis.2021.105847
- Ravignani, A., Dalla Bella, S., Falk, S., Kello, C. T., Noriega, F., & Kotz, S. A. (2019). Rhythm in speech and animal vocalizations: A cross-species perspective. Annals of the New York Academy of Sciences, 1453(1), 79–98. https://doi.org/10.1111/nyas.14166
- R Core Team. (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
- Reilly, S., Onslow, M., Packman, A., Wake, M., Bavin, E. L., Prior, M., Eadie, P., Cini, E., Bolzonello, C., & Ukoumunne, O. C. (2009). Predicting stuttering onset by the age of 3 years: A prospective, community cohort study. Pediatrics, 123(1), 270–277. https://doi.org/10.1542/peds.2007-3219
- Riley, G. D. (1994). Stuttering Severity Instrument for children and adults (3rd ed.). Pro-Ed.
- Riley, G. D. (2009). Stuttering Severity Instrument for children and adults (4th ed.). Pro-Ed. https://www.proedinc.com/Products/13025/ssi4-stuttering-severity-instrument--fourth-edition.aspx
- Saltzman, E., Nam, H., Krivokapic, J., & Goldstein, L. (2008). A task-dynamic toolkit for modeling the effects of prosodic structure on articulation. Speech Prosody, 10.
- Sawyer, J., Chon, H., & Ambrose, N. G. (2008). Influences of rate, length, and complexity on speech disfluency in a single-speech sample in preschool children who stutter. Journal of Fluency Disorders, 33(3), 220–240. https://doi.org/10.1016/j.jfludis.2008.06.003
- Singer, C. M., Hessling, A., Kelly, E. M., Singer, L., & Jones, R. M. (2020). Clinical characteristics associated with stuttering persistence: A meta-analysis. Journal of Speech, Language, and Hearing Research, 63(9), 2995–3018. https://doi.org/10.1044/2020_JSLHR-20-00096
- Singer, C. M., Otieno, S., Chang, S.-E., & Jones, R. M. (2022). Predicting persistent developmental stuttering using a cumulative risk approach. Journal of Speech, Language, and Hearing Research, 65(1), 70–95. https://doi.org/10.1044/2021_JSLHR-21-00162
- Smith, A. (1999). Stuttering: A unified approach to a multifactorial, dynamic disorder. In N. B. Ratner & E. C. Healey (Eds.), Stuttering research and practice (Vol. 27). Psychology Press.
- Smith, A., Goffman, L., Sasisekaran, J., & Weber-Fox, C. (2012). Language and motor abilities of preschool children who stutter: Evidence from behavioral and kinematic indices of nonword repetition performance. Journal of Fluency Disorders, 37(4), 344–358. https://doi.org/10.1016/j.jfludis.2012.06.001
- Smith, A., & Weber, C. (2017). How stuttering develops: The multifactorial dynamic pathways theory. Journal of Speech, Language, and Hearing Research, 60(9), 2483–2505. https://doi.org/10.1044/2017_JSLHR-S-16-0343
- Smith, A., & Zelaznik, H. N. (2004). Development of functional synergies for speech motor coordination in childhood and adolescence. Developmental Psychobiology, 45(1), 22–33. https://doi.org/10.1002/dev.20009
- Spencer, C., & Weber-Fox, C. (2014). Preschool speech articulation and nonword repetition abilities may help predict eventual recovery or persistence of stuttering. Journal of Fluency Disorders, 41, 32–46. https://doi.org/10.1016/j.jfludis.2014.06.001
- Subramanian, A., Yairi, E., & Amir, O. (2003). Second formant transitions in fluent speech of persistent and recovered preschool children who stutter. Journal of Communication Disorders, 36(1), 59–75. https://doi.org/10.1016/S0021-9924(02)00135-1
- Tendera, A., Wells, R., Belyk, M., Varyvoda, D., Boliek, C. A., & Beal, D. S. (2020). Motor sequence learning in children with recovered and persistent developmental stuttering: Preliminary findings. Journal of Fluency Disorders, 66, 105800. https://doi.org/10.1016/j.jfludis.2020.105800
- Tilsen, S. (2019). Space and time in models of speech rhythm. Annals of the New York Academy of Sciences, 1453(1), 47–66. https://doi.org/10.1111/nyas.14102
- Tilsen, S., & Arvaniti, A. (2013). Speech rhythm analysis with decomposition of the amplitude envelope: Characterizing rhythmic patterns within and across languages. The Journal of the Acoustical Society of America, 134(1), 628–639. https://doi.org/10.1121/1.4807565
- Tilsen, S., & Johnson, K. (2008). Low-frequency Fourier analysis of speech rhythm. The Journal of the Acoustical Society of America, 124(2), EL34–EL39. https://doi.org/10.1121/1.2947626
- Tumanova, V., Conture, E. G., Lambert, E. W., & Walden, T. A. (2014). Speech disfluencies of preschool-age children who do and do not stutter. Journal of Communication Disorders, 49, 25–41. https://doi.org/10.1016/j.jcomdis.2014.01.003
- Usler, E., Smith, A., & Weber, C. (2017). A lag in speech motor coordination during sentence production is associated with stuttering persistence in young children. Journal of Speech, Language, and Hearing Research, 60(1), 51–61. https://doi.org/10.1044/2016_JSLHR-S-15-0367
- Usler, E., & Weber-Fox, C. (2015). Neurodevelopment for syntactic processing distinguishes childhood stuttering recovery versus persistence. Journal of Neurodevelopmental Disorders, 7(1), 4. https://doi.org/10.1186/1866-1955-7-4
- Walsh, B., Bostian, A., Tichenor, S. E., Brown, B., & Weber, C. (2020). Disfluency characteristics of 4- and 5-year-old children who stutter and their relationship to stuttering persistence and recovery. Journal of Speech, Language, and Hearing Research, 63(8), 2555–2566. https://doi.org/10.1044/2020_JSLHR-19-00395
- Walsh, B., Christ, S., & Weber, C. (2021). Exploring relationships among risk factors for persistence in early childhood stuttering. Journal of Speech, Language, and Hearing Research, 64(8), 2909–2927. https://doi.org/10.1044/2021_JSLHR-21-00034
- Walsh, B., Mettel, K. M., & Smith, A. (2015). Speech motor planning and execution deficits in early childhood stuttering. Journal of Neurodevelopmental Disorders, 7(1), Article 27. https://doi.org/10.1186/s11689-015-9123-8
- Walsh, B., Usler, E., Bostian, A., Mohan, R., Gerwin, K. L., Brown, B., Weber, C., & Smith, A. (2018). What are predictors for persistence in childhood stuttering? Seminars in Speech and Language, 39(4), 299–312. https://doi.org/10.1055/s-0038-1667159
- Watkins, R. V., & Yairi, E. (1997). Language production abilities of children whose stuttering persisted or recovered. Journal of Speech, Language, and Hearing Research, 40(2), 385–399. https://doi.org/10.1044/jslhr.4002.385
- Wendahl, R. W., & Cole, J. (1961). Identification of stuttering during relatively fluent speech. Journal of Speech and Hearing Research, 4(3), 281–286. https://doi.org/10.1044/jshr.0403.281
- West, B. T., Welch, K. B., & Galecki, A. T. (2006). Linear mixed models: A practical guide using statistical software. Chapman and Hall/CRC. https://doi.org/10.1201/9781420010435
- White, L., & Mattys, S. L. (2007). Calibrating rhythm: First language and second language studies. Journal of Phonetics, 35(4), 501–522. https://doi.org/10.1016/j.wocn.2007.02.003
- Wieland, E. A., McAuley, J. D., Dilley, L. C., & Chang, S.-E. (2015). Evidence for a rhythm perception deficit in children who stutter. Brain and Language, 144, 26–34. https://doi.org/10.1016/j.bandl.2015.03.008
- Williams, K. (1997). Expressive Vocabulary Test (EVT-2). AGS. https://fcon_1000.projects.nitrc.org/indi/cmi_healthy_brain_network_old/assessments/evt.html
- Yairi, E., & Ambrose, N. (1992). A longitudinal study of stuttering in children. Journal of Speech and Hearing Research, 35(4), 755–760. https://doi.org/10.1044/jshr.3504.755
- Yairi, E., & Ambrose, N. (2013). Epidemiology of stuttering: 21st century advances. Journal of Fluency Disorders, 38(2), 66–87. https://doi.org/10.1016/j.jfludis.2012.11.002
- Yairi, E., & Ambrose, N. G. (1999). Early childhood stuttering I: Persistency and recovery rates. Journal of Speech, Language, and Hearing Research, 42(5), 1097–1112. https://doi.org/10.1044/jslhr.4205.1097
- Yairi, E., & Ambrose, N. G. (2005). Early childhood stuttering. Pro-Ed.
- Yairi, E., Ambrose, N. G., Paden, E. P., & Throneburg, R. N. (1996). Predictive factors of persistence and recovery: Pathways of childhood stuttering. Journal of Communication Disorders, 29(1), 51–77. https://doi.org/10.1016/0021-9924(95)00051-8
- Yaruss, J. S. (1997a). Clinical implications of situational variability in preschool children who stutter. Journal of Fluency Disorders, 22(3), 187–203. https://doi.org/10.1016/S0094-730X(97)00009-0
- Yaruss, J. S. (1997b). Clinical measurement of stuttering behaviors. Contemporary Issues in Communication Science and Disorders, 24(Spring), 27–38. https://doi.org/10.1044/cicsd_24_S_27
- Yaruss, J. S. (1999). Utterance length, syntactic complexity, and childhood stuttering. Journal of Speech, Language, and Hearing Research, 42(2), 329–344. https://doi.org/10.1044/jslhr.4202.329
- Yaruss, J. S., LaSalle, L. R., & Conture, E. G. (1998). Evaluating stuttering in young children. American Journal of Speech-Language Pathology, 7(4), 62–76. https://doi.org/10.1044/1058-0360.0704.62
- Yu, K., Thomson, B., & Young, S. (2010). From discontinuous to continuous F0 modelling in HMM-based speech. Synthesis, 6.
- Yu, K., & Young, S. (2011). Continuous F0 modeling for HMM based statistical parametric speech synthesis. IEEE Transactions on Audio, Speech, and Language Processing, 19(5), 1071–1079. https://doi.org/10.1109/TASL.2010.2076805
- Zengin-Bolatkale, H., Conture, E. G., Walden, T. A., & Jones, R. M. (2018). Sympathetic arousal as a marker of chronicity in childhood stuttering. Developmental Neuropsychology, 43(2), 135–151. https://doi.org/10.1080/87565641.2018.1432621
Data Availability Statement
Participant consent to share de-identified data was not obtained at the time of data collection. Thus, based on current regulations, the Vanderbilt Institutional Review Board determined that de-identified data used in this study are not eligible to be shared without reconsenting the past participants, and the researchers of this study are no longer in contact with these past participants.