Journal of Speech, Language, and Hearing Research (JSLHR)
2025 Aug 7;68(9):4447–4459. doi: 10.1044/2025_JSLHR-24-00701

Temporal and Spectral Cues for Phoneme Perception in School-Age Children and Adults

Stacey L. G. Kane, Lori J. Leibold, Heather L. Porter, John H. Grose, and Emily Buss
PMCID: PMC12453024  PMID: 40774257

Abstract

Purpose:

This study considered the impact of spectral and temporal smearing on vowel and consonant discrimination in school-age children and adults with normal hearing (NH). The overall purpose of this work was to test the hypothesis that degraded spectral cues preferentially impact vowel discrimination, while reduced access to temporal cues preferentially affects consonant discrimination. This work is a first step toward understanding how the effects of poor spectral and temporal resolution may affect phonological awareness and speech perception in children with cochlear hearing loss (C-HL) and auditory neuropathy (AN).

Method:

Participants were 10 young adults and 18 school-age children with NH. Speech perception testing included vowel and consonant minimal pair discrimination for stimuli that were either unprocessed, spectrally smeared, or temporally smeared. All participants completed psychophysical estimates of spectral, temporal, and intensity resolution as well as standardized assessments of phonological awareness and receptive vocabulary.

Results:

Psychophysical estimates of spectral, temporal, and intensity resolution for unprocessed stimuli were consistent with previous literature, including improvement in thresholds as a function of child age. As predicted for both age groups, spectral smearing had greater effects on vowel discrimination, while temporal smearing had greater effects on consonant discrimination for minimal pairs differentiated by either the presence/absence of a stop consonant or voicing. All participants demonstrated normal, age-adjusted phonological awareness and receptive vocabulary skills.

Conclusions:

For both children and adults, degraded spectral and temporal cues differentially affected access to vowel and consonant information. These results suggest the need for further investigations evaluating the effects of long-term reductions in access to spectral and temporal cues in children with hearing loss. This topic is particularly relevant to hearing losses such as C-HL and AN, which are primarily characterized by reduced perception of spectral and temporal acoustic cues, respectively.

Supplemental Material:

https://doi.org/10.23641/asha.29660819


Behavioral estimates of spectral and temporal resolution have been shown to develop during school age. Between the ages of approximately 5 and 12 years, the ability to detect temporal gaps, a measure of temporal envelope resolution, improves substantially in children with normal hearing (NH; Buss et al., 2014, 2017; Wightman et al., 1989). Detection of changes in spectrally modulated stimuli, a measure of spectral resolution, improves on a similar timeline (Allen & Wightman, 1992; also see Hall & Grose, 1991). These skills are relevant to speech perception because the availability of temporal and spectral cues has been implicated in consonant and vowel discrimination in adults with NH (Boothroyd et al., 1996; Drullman et al., 1994; Rance et al., 2008; ter Keurs et al., 1992; Xu & Pfingst, 2008) and with hearing loss (Rance et al., 2008). However, little is known about how children use temporal and spectral cues for the specific task of vowel and consonant discrimination. This developmental question is important because young children tend to employ listening strategies that weight acoustic cues differently than adults (Nittrouer, 1996). The current study explores how degradations to temporal and spectral cues affect vowel and consonant perception in school-age children with NH. This work is a first step toward understanding how hearing losses thought to differentially affect access to temporal and spectral cues—auditory neuropathy (AN) and cochlear hearing loss (C-HL), respectively—may also differentially affect vowel and consonant perception in spoken language.

In adults with NH, diminished access to spectral and temporal information has differing effects on vowel and consonant perception. For example, spectral smearing has a greater detrimental influence on vowel recognition than consonant recognition (ter Keurs et al., 1992). Conversely, temporal smearing has greater effects on consonant recognition (particularly stops) compared to vowels (Drullman et al., 1994). For vocoded speech signals, Xu and Pfingst (2008) suggest that consonant perception is more susceptible to reductions in temporal envelope bandwidth compared to vowels. Moreover, vowel perception is better when listeners are provided a greater number of spectral channels compared to increased temporal envelope bandwidth (Xu & Pfingst, 2008).

The relationships between degraded temporal and spectral cues, auditory skill development, and phoneme recognition become particularly relevant when considering the roles that vowels and consonants play in speech perception and language development. Evidence suggests that vowels are more important for recognizing prosody and syntax, while consonants play a larger role in word recognition and language acquisition (see Nazzi & Cutler, 2019). Vowels are thought to provide cues for recognizing phrase boundaries, syllable structure, word order, and syntax (Nespor et al., 2003). On the other hand, consonants are thought to provide cues that distinguish words from one another (Creel et al., 2006; Nespor et al., 2003) and are better at facilitating lexical growth (Escudero et al., 2016) compared to vowels.

Different types of hearing loss may uniquely influence how children use spectral and temporal cues. AN is the result of dyssynchronous coding of acoustic information along the auditory pathway (Moser & Starr, 2016; Starr et al., 1996). In the literature, AN is overwhelmingly associated with poor temporal resolution resulting in elevated thresholds for timing-related tasks such as gap detection and amplitude modulation detection (Michalewski et al., 2005; Rance et al., 2004; Rance & Starr, 2015; Zeng et al., 1999, 2005). In contrast, C-HL is overwhelmingly associated with broadened auditory filters and subsequently poor spectral resolution (Glasberg & Moore, 1986). Despite inconclusive results regarding temporal resolution in listeners with C-HL (Moore et al., 1992; Zwicker et al., 1982), the literature suggests that when the signal is audible, speech perception is more limited by frequency selectivity than by temporal resolution for listeners belonging to this population (Festen & Plomp, 1990; Turner et al., 1995). In adults with hearing loss, listeners with AN are less likely than those with C-HL to confuse phonemes that differ predominantly with respect to spectral features, while those with C-HL are better at distinguishing phonemes that differ predominantly with respect to temporal features (Rance et al., 2008). Considering the factors outlined above—auditory skill development for temporal and spectral resolution, availability of temporal and spectral cues for individuals with AN and C-HL, and the differing roles of vowels and consonants in spoken language—it becomes clear that the unique perceptual experiences associated with AN and C-HL may also have distinct downstream effects on language development and speech perception in children. The current study is a first step in exploring these complicated interactions between auditory skill development and the availability of temporal and spectral cues in childhood.

The following report describes the effects of spectral and temporal smearing on discrimination of minimal word pairs that differ by vowel or consonant content in school-age children and young adults with NH. Psychophysical resolution for spectral and temporal envelope cues was evaluated to characterize underlying access to the acoustic properties of interest. The role of intensity resolution was additionally considered in each of these measures. Psychophysical thresholds were also used to verify the effectiveness of smearing strategies among adult listeners. Finally, standardized assessments of phonological awareness and receptive vocabulary considered the hypothesized relationship between phonological structure and vocabulary size. Results from this study lay the groundwork for future investigations into the differential effects of diminished temporal and spectral resolution on consonant and vowel discrimination and speech perception in children with AN and C-HL.

Method

All procedures were approved by the institutional review board (IRB) at the University of North Carolina at Chapel Hill and performed with the consent/assent of all participants and parents/guardians (21–1698). A portion of the discussion section will compare results from this study to those from a small cohort of school-age pilot subjects with AN and C-HL. Data from children with AN and C-HL are limited as they were collected in the context of a dissertation completed during the COVID-19 pandemic. The University of North Carolina at Chapel Hill served as the IRB of record for pilot data collected in North Carolina as well as at Boys Town National Research Hospital.

Participants

Participants in the primary experiment were 18 children, 5.5–15.0 years of age (M = 10.4, SD = 3.0; eight females), and 10 young adults, 19.5–29.2 years of age (M = 23.8, SD = 3.4; 10 females). Children were recruited across a wide age range to characterize the effects of age on each perceptual task. All of these participants had hearing thresholds ≤ 20 dB HL at octave frequencies from 250 to 8000 Hz in both ears and were native speakers of American English. There were no reports of middle ear dysfunction within 1 month of testing. Of note, two children did not complete any psychophysical testing due to timing constraints (ages 5.5 and 7.9 years). Additionally, two children were missing data for one psychophysical condition each due to scheduling/attention constraints (6.4 years, spectral ripple; 11.9 years, intensity). These data are considered missing at random in statistical analyses.

Stimuli and Procedures

Testing was completed in a quiet room with participants seated at a table facing a touchscreen monitor and two powered loudspeakers (QSC Pro Audio, CP8). Standard placement of the speakers and tablet was ensured by using a template mat that marked the exact locations for each piece of equipment. The speakers were approximately 24 in. from the participant and placed at ±45° azimuth. All stimuli were calibrated for a presentation level of 75 dB SPL. Participants were tested in the sound field to accommodate comparison to data collected from children with hearing loss while wearing hearing aids.

Psychophysical Tasks

Participants provided psychophysical estimates of temporal, spectral ripple, and intensity resolution. Intensity discrimination was measured due to its potential effect on performance in both the spectral and temporal tasks. The temporal task was gap detection, where threshold represents the smallest temporal gap (in ms) that can be perceived by the listener. This acuity presumably reflects the perceptual salience of short gaps in speech, such as voice onset time (Hoover et al., 2015). Ramps at gap boundaries were 10 ms. The spectral task was spectral ripple phase reversal detection, where sinusoidal modulation (“ripple”) imposed on the spectral profile of a wide-band noise is inverted in the target stimulus relative to the foils, resulting in a pitch change. Threshold represents the smallest modulation depth (in dB) at which the spectral ripple phase reversal can be detected. Ripple density was held constant at 2 ripples per octave (RPO) because this spectral modulation rate has been shown to be the most sensitive predictor of speech perception (Davies-Venn et al., 2015; Liu & Eddins, 2008; van Veen & Houtgast, 1985). Finally, the intensity discrimination task required participants to detect intensity increments (in dB) from a standard stimulus at 75 dB SPL.
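The ripple stimulus can be sketched as broadband noise whose log-magnitude spectrum is sinusoidally modulated across log frequency; shifting the ripple phase by π produces the phase-reversed target. A minimal NumPy sketch, using a flat-spectrum carrier and illustrative edge frequencies rather than the study's speech-shaped noise (all parameter values here are assumptions for illustration):

```python
import numpy as np

def rippled_noise(dur=0.5, fs=44100, density=2.0, depth_db=10.0,
                  phase=0.0, f_lo=100.0, f_hi=10000.0, seed=1):
    """Broadband noise with a sinusoidal spectral ripple (in dB)
    across log2 frequency. density is in ripples per octave (RPO);
    phase=0 vs. phase=np.pi gives the standard/phase-reversed pair.
    Edge frequencies and depth are illustrative, not the study's values."""
    rng = np.random.default_rng(seed)
    n = int(dur * fs)
    spec = np.zeros(n // 2 + 1, dtype=complex)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    octaves = np.log2(freqs[band] / f_lo)            # distance above f_lo
    gain_db = (depth_db / 2.0) * np.sin(2 * np.pi * density * octaves + phase)
    mags = 10.0 ** (gain_db / 20.0)                  # rippled magnitudes
    phases = rng.uniform(0, 2 * np.pi, band.sum())   # random component phases
    spec[band] = mags * np.exp(1j * phases)
    x = np.fft.irfft(spec, n)
    return x / np.max(np.abs(x))                     # normalize peak to 1
```

Because the magnitudes are imposed directly, the ratio between spectral peaks and troughs in the output equals the ripple depth, which is what the adaptive track varies.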

The carrier for all three psychophysical tasks was a speech-shaped noise. To create the speech-shaped noise, a 10-s noise file was generated using a 64-tap finite impulse response filter to shape Gaussian noise such that it matched the combined long-term average spectrum of all speech stimuli used in the minimal pairs task. A new random sample was selected from the stimulus WAV file for each trial. Stimuli were 500 ms in duration including 50 ms onset/offset ramps; the interstimulus interval was 500 ms. All psychophysical tasks were presented in a three-alternative forced-choice format using a 3-down/1-up stepping rule to estimate 79.4% correct performance (Levitt, 1971). At the beginning of each track, signal strength (gap duration, ripple modulation depth, or level increment) was adjusted by a factor of two; after the first two reversals, step sizes were adjusted by a factor of √2. The track continued for a total of eight reversals. Thresholds were calculated as the geometric mean of the last six reversals. Basic instructions were to listen to all three sounds and identify the sound that was different by selecting a button on a touchscreen monitor.
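The adaptive track just described can be sketched as follows. The `p_correct` listener model is hypothetical, standing in for a real participant; the step rules and threshold calculation follow the description above:

```python
import math
import random

def run_track(p_correct, start=16.0, n_reversals=8, seed=0):
    """Simulate one 3-down/1-up adaptive track (Levitt, 1971),
    converging near 79.4% correct. Signal strength moves by a factor
    of 2 until two reversals have occurred, then by sqrt(2); threshold
    is the geometric mean of the last six reversals. p_correct maps
    signal strength to probability of a correct response."""
    rng = random.Random(seed)
    level, run, direction, reversals = start, 0, None, []
    while len(reversals) < n_reversals:
        if rng.random() < p_correct(level):
            run += 1
            move = 'down' if run == 3 else None   # 3 correct in a row -> harder
            if move:
                run = 0
        else:
            run, move = 0, 'up'                   # any error -> easier
        if move:
            if direction is not None and move != direction:
                reversals.append(level)           # track direction reversed
            direction = move
            step = 2.0 if len(reversals) < 2 else math.sqrt(2.0)
            level = level / step if move == 'down' else level * step
    tail = reversals[-6:]
    return math.exp(sum(math.log(v) for v in tail) / len(tail))
```

For a deterministic listener who is always correct above some signal strength and always wrong below it, the track oscillates around that value and the geometric mean lands between the bracketing reversal points.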

Adult participants completed each psychoacoustic task for unprocessed, spectrally smeared, and temporally smeared stimuli; children completed the psychoacoustic tasks in the unprocessed condition only. The rationale for adults completing both the unprocessed and smeared conditions was to verify the effectiveness of spectral and temporal smearing for degrading spectral and temporal cues, respectively. Two threshold runs were completed for each task unless thresholds differed by more than a factor of 1.5, in which case a third threshold was obtained. Final estimates were calculated as the geometric mean of the two or three thresholds collected. Task type was blocked and, for adults, presented in quasirandom order within each degradation condition.

Speech Discrimination

Three minimal speech pairs were recorded by one female speaker of Midland American English (average fundamental frequency = 209.5 Hz). All recorded stimuli were sampled at 44.1 kHz. The vowel task was to discriminate the word pair Net/Nut (/nɛt/ vs. /nʌt/), words differing by phonemes commonly confused by listeners with C-HL (DiNino et al., 2016; Owens et al., 1968). For consonant discrimination, listeners discriminated Sick and Stick (/sɪk/ vs. /stɪk/), which differ by the presence or absence of the stop consonant /t/ and, therefore, contain differences in voice onset time. Finally, the word pair Coat/Goat (/koʊt/ vs. /goʊt/) was used because it contains a voiced/voiceless stop contrast, a distinction that is particularly difficult for individuals with poor temporal resolution (e.g., AN; Rance et al., 2008). Each word was recorded four times, twice with a rising intonation and twice with flat intonation. Differences in intonation were introduced to reduce the opportunity for a listener to rely on cues within a single recording that were not based on the vowel and consonant distinctions of interest. The masker was the same speech-shaped noise described above; a new sample was drawn from the masker file for each trial. The combined target and masker were always presented at 75 dB SPL, irrespective of signal-to-noise ratio (SNR). This level was chosen because it is at the upper end of the intensity range for conversational speech (Pearsons et al., 1977). Two children (ages 5 and 7 years) reported loudness discomfort for the speech tasks. For these participants, the signal intensity was lowered by 2–3 dB.
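Holding the combined level fixed while varying SNR amounts to trading power between target and masker: for a target/masker power ratio r = 10^(SNR/10) and fixed total power P, the masker gets P/(1 + r) and the target the remainder. A minimal sketch under the assumption of uncorrelated signals (so powers add); `total_rms` is an arbitrary digital stand-in for the 75 dB SPL calibration:

```python
import numpy as np

def mix_at_fixed_level(target, masker, snr_db, total_rms=0.1):
    """Scale target and masker so the mixture has a fixed RMS
    (i.e., a fixed overall presentation level) at any SNR.
    Assumes uncorrelated signals so their powers add."""
    p_total = total_rms ** 2
    r = 10.0 ** (snr_db / 10.0)                  # target/masker power ratio
    p_masker = p_total / (1.0 + r)
    p_target = p_total - p_masker
    t = target * np.sqrt(p_target / np.mean(target ** 2))
    m = masker * np.sqrt(p_masker / np.mean(masker ** 2))
    return t + m
```

Under this rule, lowering the SNR raises the masker level and lowers the target level in lockstep, so loudness stays roughly constant across the adaptive track.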

For each trial, participants heard a word presented in masking noise while pictures corresponding to each word in the pair were presented on the tablet screen. Participants were told which minimal pair they were listening for ahead of each adaptive track, and they were instructed to select the picture corresponding to the word they heard. The SNR was manipulated using a 3-down/1-up stepping rule to estimate the SNR associated with 79.4% correct performance (Levitt, 1971). Each track continued for eight reversals in SNR. The step size was 8 dB for the first two reversals and 4 dB for the remaining six reversals; each track began at a signal level above the participant's anticipated threshold. Threshold was calculated as the mean SNR over the last six reversals. Participants completed this task for unprocessed, spectrally smeared, and temporally smeared stimuli. Participants completed a practice run prior to data collection for any instance where the word pair or smearing condition was changed. Two threshold estimates were collected for each word pair unless thresholds differed by more than 5 dB, in which case a third threshold was collected; final estimates were calculated as the average of the two or three thresholds in each task. If a participant was unable to distinguish word pairs in a given degradation condition after four attempts at 30 dB SNR, threshold was recorded as 30 dB SNR, and testing for that condition was discontinued.

Figure 1.


Spectrograms for unprocessed, temporally smeared, and spectrally smeared recordings of words Sick, Stick, Net, and Nut.

Spectral Smearing

Spectral smearing was modeled after Baer and Moore (1993) to simulate asymmetric broadening of auditory filters to approximate the effects of severe C-HL; smeared filters were broadened by a factor of 3 above and 6 below the characteristic frequency (see Figure 1 for examples of smeared and unprocessed stimuli).
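Baer and Moore's technique convolves each short-term excitation pattern with broadened, asymmetric auditory filters. As a loose, hypothetical illustration of the idea (not the study's implementation), one can smooth STFT magnitude spectra with an asymmetric kernel that is twice as wide below each bin as above it, echoing the 6×/3× asymmetry, and resynthesize with the original phases; all numeric constants below are illustrative:

```python
import numpy as np

def spectral_smear(x, fs, frame=1024, hop=256):
    """Toy single-resolution sketch of spectral smearing: smooth each
    STFT frame's magnitude spectrum with an asymmetric triangular
    kernel (wider skirt below the bin than above), then overlap-add
    resynthesize with the original phases."""
    win = np.hanning(frame)
    nbins = frame // 2 + 1
    # Precompute smoothing matrix: row b is a kernel centered on bin b.
    M = np.zeros((nbins, nbins))
    for b in range(nbins):
        lo = max(0, b - int(0.18 * b) - 1)            # broader skirt below
        hi = min(nbins - 1, b + int(0.09 * b) + 1)    # narrower skirt above
        M[b, lo:b + 1] = np.linspace(0.2, 1.0, b - lo + 1)
        M[b, b:hi + 1] = np.linspace(1.0, 0.2, hi - b + 1)
        M[b] /= M[b].sum()                            # unity-gain kernel
    y = np.zeros(len(x))
    norm = np.zeros(len(x))
    for start in range(0, len(x) - frame + 1, hop):
        seg = x[start:start + frame] * win
        spec = np.fft.rfft(seg)
        mag = M @ np.abs(spec)                        # smear magnitudes
        out = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), frame)
        y[start:start + frame] += out * win           # weighted overlap-add
        norm[start:start + frame] += win ** 2
    return y / np.maximum(norm, 1e-12)
```

The effect is the vertical blurring visible in the spectrally smeared column of Figure 1: formant peaks are flattened while the temporal envelope is largely preserved.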

Temporal Smearing

Temporal smearing was accomplished by passing each stimulus through a fourth order gammatone filter bank with equivalent rectangular bandwidth spacing (Moore & Glasberg, 1983). For each filter, the envelope and carrier were extracted, and the temporal envelope was low-pass filtered at 3 Hz via convolution using a 300-ms Hanning window. The smeared envelope and original carrier were then recombined to create the final temporally smeared stimulus. The relatively aggressive 3-Hz low-pass filter cutoff was selected based on pilot testing in adult participants with NH that resulted in gap detection thresholds comparable to published data from adults with AN (Zeng et al., 1999, 2005; see Figure 1 for examples).
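The envelope-smearing step can be illustrated in a single broadband channel (the study applied it within each gammatone channel); a minimal sketch assuming a Hilbert envelope/carrier decomposition, with the analytic signal built via FFT:

```python
import numpy as np

def smear_envelope(x, fs, win_ms=300.0):
    """Single-band sketch of the envelope-smearing step: extract the
    Hilbert envelope, low-pass it by convolving with a Hanning window
    (a 300-ms window gives roughly the 3-Hz cutoff described above),
    and recombine with the original fine-structure carrier."""
    n = len(x)
    # Analytic signal via FFT (equivalent to scipy.signal.hilbert)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(X * h)
    env = np.abs(analytic)                      # temporal envelope
    carrier = np.cos(np.angle(analytic))        # fine structure
    win = np.hanning(int(fs * win_ms / 1000.0))
    win /= win.sum()                            # unity-gain smoothing
    smeared_env = np.convolve(env, win, mode='same')
    return smeared_env * carrier
```

A slowly varying signal passes through nearly unchanged, while fast envelope fluctuations (such as the silent gap cueing a stop consonant) are flattened out, which is the horizontal blurring visible in the temporally smeared column of Figure 1.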

Language Measures

All participants completed the Peabody Picture Vocabulary Test–Fourth Edition (PPVT-4), an assessment of receptive vocabulary in children as young as 2 years through adulthood (Dunn et al., 1965). The Elision subtest of the Comprehensive Test of Phonological Processing–Second Edition (CTOPP-2) was used to evaluate phonological awareness (Wagner et al., 2013).

Analysis

Language assessment scores were converted from raw values to scaled scores for analysis. Age was logarithmically transformed (base 10) for analyses involving child age to account for more rapid maturation in younger children compared to older children. Likewise, gap detection thresholds were evaluated using logarithmically transformed values.

Correlation analysis was used to evaluate the relationship between scaled scores for receptive vocabulary and phonological awareness. Linear mixed models with random intercepts for each participant compared thresholds for unprocessed and smeared stimuli to verify the effects of temporal and spectral smearing on resolution measures in adults. Documentation for calculating denominator degrees of freedom can be found in Pinheiro and Bates (2000). Friedman's signed-rank tests accounting for repeated measures evaluated performance for vowel and consonant minimal pairs as a function of smearing condition among children.
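The Friedman statistic used for the child speech data can be computed from within-subject ranks; a minimal NumPy sketch (no tie correction), with fabricated illustrative thresholds rather than the study's data:

```python
import numpy as np

def friedman_chi2(data):
    """Friedman chi-squared statistic for an n-subjects x k-conditions
    array of repeated measures (minimal sketch; assumes no ties)."""
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    # Within-subject ranks 1..k via double argsort
    ranks = data.argsort(axis=1).argsort(axis=1) + 1
    r = ranks.sum(axis=0)                       # rank sum per condition
    return 12.0 / (n * k * (k + 1)) * np.sum(r ** 2) - 3.0 * n * (k + 1)
```

With k = 3 conditions the statistic is referred to a chi-squared distribution with 2 degrees of freedom; pairwise contrasts like those in Table 3 apply the same test with k = 2.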

Results and Interim Discussion

Psychoacoustic Tasks

Figure 2 shows psychophysical threshold data for individual children plotted as a function of age (left column) and individual data overlaying boxplots for adult participants (right column). Recall that children listened in the unprocessed condition only. Threshold estimates for temporal gap detection, spectral ripple discrimination, and intensity discrimination significantly improved as a function of child age (r = −.66, p = .005; r = −.64, p = .011; r = −.63, p = .021, respectively). These findings are consistent with the existing literature that also shows significant improvement in thresholds for gap detection (Buss et al., 2014, 2017; Wightman et al., 1989), spectral ripple discrimination (Allen & Wightman, 1992), and intensity discrimination (Buss et al., 2013; Maxon & Hochberg, 1982) as children approach adolescence.

Figure 2.

[Figure 2 data summary: Adult median thresholds by smearing condition (unprocessed, spectrally smeared, temporally smeared) were: gap detection, 2.8, 3, and 14 ms; spectral ripple discrimination, 3, 4.2, and 3.5 dB; intensity discrimination, 0.9, 1.2, and 1 dB.]

Gap detection (top), spectral ripple discrimination (middle), and intensity discrimination (bottom) thresholds for children (left) and adults (right) with normal hearing. Child data (unsmeared) are plotted as a function of log age. Dotted lines show the association between child age (log transformed) and threshold; associated correlations are reported at the upper right. Individual adult participants are represented by symbol shape. Smearing conditions are unsmeared, spectrally smeared, and temporally smeared from left to right.

Psychophysical Thresholds—Adult Participants

For gap detection thresholds in adult listeners, temporal smearing elevated thresholds compared to unprocessed and spectrally smeared conditions by a factor of 4.6 and 4.1, respectively. For spectral ripple discrimination, spectral smearing elevated thresholds compared to unprocessed and temporally smeared conditions by factors of 2.6 and 1.5, respectively. In contrast, intensity discrimination thresholds appear similar across smearing conditions. A linear mixed model examining intensity discrimination as a function of smearing condition did not suggest a significant effect of spectral or temporal smearing, F(2, 18) = 0.50, p = .616. Because intensity discrimination was not affected by temporal and spectral smearing, intensity resolution thresholds were averaged across smearing conditions for each adult participant. This averaged value was used in subsequent analyses to control for individual differences in listeners' use of intensity cues.

For temporal resolution measures, a linear mixed model considered the effects of smearing on gap detection thresholds while controlling for individual effects of intensity resolution. Results of this model revealed a significant effect of smearing condition on log-transformed gap detection thresholds, F(2, 18) = 775.6, p < .001. Table 1 contains coefficients, t, and significance values for this model. An observed main effect of intensity resolution suggests that intensity discrimination plays a significant role in overall gap detection thresholds, β = .04, t(8) = 3.21, p = .013. Additionally, gap detection thresholds for temporally smeared stimuli are significantly poorer (higher) than when stimuli are unprocessed or when they are spectrally smeared, β = −.67, t(18) = −35.43, p < .001; β = −.61, t(18) = −32.61, p < .001, respectively. Post hoc testing also revealed significantly poorer gap detection thresholds for spectrally smeared stimuli compared to the unprocessed condition, β = .05, t(18) = 2.82, p = .011.

Table 1.

Linear mixed model results evaluating the effects of spectral and temporal smearing on log10-transformed gap detection threshold estimates (ms) in adults.

Variable Value SE df t p
(Intercept) 1.05 0.02 18 51.41 < .001
Intensity 0.04 0.01 8 3.21 .013
Unprocessed −0.67 0.02 18 −35.43 < .001
Spectrally smeared −0.61 0.02 18 −32.61 < .001

Note. The model contains a random intercept for each adult participant, and the temporally smeared condition is the reference. SE = standard error; df = degrees of freedom.

A third linear mixed model compared the effects of spectral and temporal smearing on spectral ripple discrimination thresholds while controlling for intensity discrimination among individual listeners. Once again, modeling found significant effects of smearing, F(2, 18) = 11.30, p < .001. Table 2 contains coefficients, t, and significance values for this model. Results suggest significantly higher (poorer) spectral ripple discrimination thresholds for spectrally smeared stimuli compared to the unprocessed or temporally smeared stimuli, β = −3.29, t(18) = −4.74, p < .001; β = −1.82, t(18) = −2.62, p = .017, respectively. Post hoc testing also suggested significantly lower spectral ripple discrimination thresholds for unprocessed stimuli compared to those that were temporally degraded, β = 1.47, t(18) = 2.12, p = .048.

Table 2.

Linear mixed model results evaluating the effects of spectral and temporal smearing on spectral ripple phase reversal threshold estimates (dB) in adults.

Variable Value SE df t p
(Intercept) 6.19 0.99 18 6.22 < .001
Intensity −0.63 0.65 8 −0.97 .359
Unprocessed −3.29 0.69 18 −4.74 < .001
Temporally smeared −1.82 0.69 18 −2.62 .017

Note. The model contains a random intercept for each adult participant, and the spectrally smeared condition is the reference. SE = standard error; df = degrees of freedom.

Psychophysical data from adult listeners are consistent with the expectation that temporal smearing preferentially impairs gap detection and spectral smearing preferentially impairs detection of spectral ripple phase reversal. For adults with NH, intensity resolution was not significantly impacted by degrading the availability of either spectral or temporal cues. These results suggest that the temporal smearing approach implemented in this study successfully degrades the temporal envelope, while spectral smearing leaves perception of temporal gaps relatively intact. Conversely, spectral smearing appeared to preferentially affect access to frequency cues compared to temporal cues.

Speech Discrimination

Figure 3 shows average SNR values corresponding to 79.4% correct performance for each word pair for children and adults. Recall that child and adult participants completed testing for each of the word pairs (Net/Nut, Sick/Stick, Coat/Goat) under three smearing conditions (unprocessed, temporally smeared, and spectrally smeared). Notably, several child and adult participants performed at or near ceiling (30 dB SNR) under the smeared conditions. Data from this task were not normally distributed, given the substantial ceiling effects in the more challenging of the two smeared conditions for each word pair. As a result, Friedman's signed-rank tests were used to assess differences in median performance among children for all conditions.

Figure 3.


Minimal pair discrimination for words Net/Nut (top left), Sick/Stick (bottom left), and Coat/Goat (bottom right) for children with normal hearing (NH; see color map) and adults with NH. The dotted horizontal line corresponds to ceiling-level performance. SNR = signal-to-noise ratio.

Table 3 displays chi-squared and significance values for all analyses. An overall model evaluating the effects of smearing condition (unprocessed, temporal smearing, spectral smearing) and word pair (Net/Nut, Sick/Stick, Coat/Goat) on discrimination thresholds revealed significant differences among conditions, Χ2(8) = 123.84, p < .001. Similarly, models evaluating the overall effects of smearing on each word pair were also significant. Finally, paired contrasts compared the effects of all smearing conditions for each word pair; all contrasts revealed significant differences among all three smearing conditions for all word pairs. Most notable, and consistent with published literature in adults, were the detrimental effects of temporal smearing on consonant discrimination (Drullman et al., 1994) and spectral smearing on vowel discrimination (ter Keurs et al., 1992).

Table 3.

Chi-squared values for Friedman's signed-rank tests to assess differences between median speech discrimination results in children for Net/Nut, Sick/Stick, and Coat/Goat in the unprocessed, spectrally smeared, and temporally smeared conditions.

Net/Nut Χ2 df p Median difference
Full model 30.03 2 < .001
Unprocessed/spectral 18 1 < .001 −40.58
Unprocessed/temporal 7.12 1 .007 −2.5
Spectral/temporal 17 1 < .001 35.25
Sick/Stick Χ2 df p Median difference
Full model 32.44 2 < .001
Unprocessed/spectral 10.89 1 < .001 −4.44
Unprocessed/temporal 18 1 < .001 −41.89
Spectral/temporal 18 1 < .001 −38.42
Coat/Goat Χ2 df p Median difference
Full model 32.14 2 < .001
Unprocessed/spectral 18 1 < .001 −6.42
Unprocessed/temporal 18 1 < .001 −24.08
Spectral/temporal 9.94 1 .002 −10.22

Note. df = degrees of freedom.
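The repeated-measures comparisons reported in Table 3 can be sketched with a Friedman test; the sketch below uses SciPy's `friedmanchisquare`, and the listener thresholds are illustrative values, not the study's data.

```python
from scipy.stats import friedmanchisquare

# Illustrative thresholds (dB SNR) for eight listeners under three
# smearing conditions; made-up values, not the study's data.
unprocessed = [-12, -10, -14, -11, -13, -9, -12, -15]
spectral = [28, 25, 30, 27, 29, 26, 28, 30]
temporal = [-10, -8, -12, -9, -11, -7, -10, -13]

# Friedman's test ranks the three conditions within each listener and
# asks whether the rank sums differ across conditions.
stat, p = friedmanchisquare(unprocessed, spectral, temporal)
```

With consistent within-listener ordering, as in the fabricated values above, the chi-squared statistic is large and the test is significant, mirroring the pattern in Table 3.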

The minimal pair discrimination task described here was designed to evaluate the effects of spectral and temporal smearing on vowel and consonant discrimination in adults and children with NH. It is important to note that although the Net/Nut distinction contains spectral differences relating to the first and second formants, the phonemes (/ɛ/ and /ʌ/) also differ in duration. While the difference in duration for these phonemes may contribute to how they are perceived, it is notable that the substantial degradation of timing cues in this study had minimal effects on discrimination of the Net/Nut word pair. Conversely, the finding that spectral smearing severely limited adult and child listeners' ability to discriminate Net and Nut supports the hypothesis that differences in spectral information are the prevailing cues that listeners used to discriminate this word pair.

Regarding consonant discrimination in the Coat/Goat pair, it is reasonable that both spectral and temporal cues present in voiceless (/k/) and voiced (/g/) stops contribute to accurate perception of these phonemes. Importantly, degrading temporal cues had significantly greater effects on listeners' ability to differentiate Coat/Goat compared to when the stimuli were spectrally smeared, emphasizing the greater relative importance of temporal cues when making this voiced/voiceless distinction. Finally, for the Sick/Stick minimal pair, temporal smearing had substantial negative effects on discrimination, while effects of spectral smearing were small. This finding also supports the idea that temporal information is the dominant cue for distinguishing the words in this pair.

Language Measures

Figure 4 shows scaled scores for the PPVT-4 plotted as a function of scaled scores for the Elision subtest of the CTOPP-2 for children in this study. Each symbol represents an individual listener, and child age is represented by color. Note that scaled scores are calculated relative to age and are based on a normal distribution with a mean of 10 and a standard deviation of 3. Results indicate a significant positive relationship between receptive vocabulary and phonological awareness among children in this sample (r = .471, p = .048). These results are consistent with published data for preschoolers (Metsala, 1999) and school-age children with NH (Klein et al., 2017), and they support the idea that phonological awareness is positively related to vocabulary size.
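A correlation of this kind can be computed as in the following sketch; SciPy's `pearsonr` is the assumed tool, and the scaled scores are hypothetical placeholders, not the children's actual scores.

```python
from scipy.stats import pearsonr

# Hypothetical PPVT-4 and CTOPP-2 Elision scaled scores for ten children
# (scaled scores are normed to a mean of 10 and an SD of 3).
ppvt = [8, 9, 10, 11, 12, 13, 14, 10, 9, 12]
elision = [7, 10, 9, 12, 11, 14, 13, 10, 8, 13]

# Pearson correlation coefficient and two-tailed p value.
r, p = pearsonr(ppvt, elision)
```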

Figure 4.

A scatterplot of CTOPP-2 Elision subtest scaled scores (6–20) against PPVT-4 scaled scores (6–20) for children ages 6–16 years (color coded), showing a positive, upward trend (r = .471, p = .048).

Receptive vocabulary (Peabody Picture Vocabulary Test–Fourth Edition [PPVT-4]) scaled scores versus phonological awareness (Comprehensive Test of Phonological Processing–Second Edition [CTOPP-2]; Elision subtest) scaled scores for children with normal hearing. Symbols represent individual children, with age represented by color. The dotted line indicates the association between these two variables; the correlation is also reported.

Pilot Data in Children With Hearing Loss

Whereas the goal of this study was to evaluate the impacts of reduced access to temporal and spectral cues on consonant and vowel discrimination in children and young adults with NH, the long-term goal of this line of research is to evaluate the relationships between these cues and the development of language skills that support speech perception among children with hearing loss. Taken together, results from this experiment are consistent with the hypothesis that spectral and temporal smearing differentially affect vowel and consonant perception in adults and children with NH, and that phonological awareness supports vocabulary acquisition. The stimuli and smearing conditions included in this experiment were designed to simulate the temporal and spectral deficits commonly associated with AN and C-HL, respectively. Because AN and C-HL have been shown to exhibit greater relative effects on temporal and spectral processing, respectively, it is reasonable to expect that the smearing procedures described in this experiment would mimic at least some of the perceptual consequences of these hearing disorders.

To test the idea that degraded temporal and spectral cues preferentially affect consonant and vowel discrimination in children with AN and C-HL, pilot testing using the experimental procedures described above was performed with a small cohort of school-age listeners with AN (n = 4; 13.1–16.4 years [M = 15.0, SD = 1.42]) and C-HL (n = 8; 7.4–16.4 years [M = 12.3, SD = 3.36]). This testing was only completed using unprocessed stimuli. Children with hearing loss who participated in this pilot study were hearing aid users, each with several years' experience. See Supplemental Materials S1 and S2 for specific information regarding medical history, unaided hearing thresholds, and hearing aid fitting data (where available) for each child. Also included in Supplemental Material S1 are data relating to nonverbal intelligence, receptive vocabulary, and phonological awareness.

Figure 5 displays results for psychophysical thresholds among children with hearing loss compared to similarly aged peers with NH from the main study. Most children with C-HL demonstrated poorer spectral ripple discrimination in comparison to their peers with NH. In contrast, gap detection and intensity discrimination thresholds were similar for children with C-HL and children with NH. While results from the four children with AN were highly variable, a general pattern of poor temporal resolution was observed for three participants. Surprisingly, substantially elevated spectral ripple discrimination thresholds were observed for three of four children with AN (see Footnote 1). Finally, elevated intensity discrimination thresholds were observed for two children with AN.

Figure 5.

Three panels plotting spectral ripple discrimination (0–25 dB), gap detection (0–40 ms), and intensity discrimination (0–8 dB) thresholds against age (7 years to adult) for listeners with normal hearing, auditory neuropathy, and cochlear hearing loss. Thresholds for listeners with normal hearing improve with age; adult performance is shown as boxplots with medians of 2.5 dB, 2.8 ms, and 1 dB, respectively.

Spectral ripple discrimination (top left), gap detection (bottom left), and intensity discrimination (bottom right) for participants with normal hearing (NH; gray), cochlear hearing loss (purple), and auditory neuropathy (teal). Boxplots represent adult performance. The shaded region depicts the 95% confidence interval around a line fit to data from children with NH who were between 7.3 and 15.0 years of age.

One possible explanation for the relatively poor psychophysical thresholds among children with AN is that they used less stable listening strategies or were more prone to lapses in attention during the listening tasks due to developmental or cognitive delays. To test this, an analysis of adaptive tracks was undertaken to evaluate whether children with AN provided more variable responses than their peers with NH or C-HL. The geometric standard deviation was calculated for the last six reversals in every threshold run for children with AN, C-HL, and NH (see Footnote 2) for gap detection, spectral ripple discrimination, and intensity discrimination. For data pooled across all tasks, the 50th percentile of the geometric standard deviation was 1.27 for children with NH, 1.24 for children with C-HL, and 1.22 for children with AN. These results indicate that track variability was similar across groups and that thresholds for children with AN were not likely influenced by variable attention or listening strategy.
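The track-variability metric described above, the geometric standard deviation of the final reversal levels, can be sketched as follows. This is a minimal illustration: the function name and reversal values are hypothetical, and the sample (ddof = 1) standard deviation is an assumption, since the paper does not specify the estimator.

```python
import numpy as np

def geometric_sd(reversals):
    """Geometric SD: exponentiate the sample SD of log-transformed levels."""
    logs = np.log(np.asarray(reversals, dtype=float))
    return float(np.exp(np.std(logs, ddof=1)))

# Hypothetical last six reversal levels (ms) from one gap-detection track.
track = [4.0, 5.0, 4.0, 6.0, 5.0, 4.0]
gsd = geometric_sd(track)  # values near 1 indicate a stable track
```

Because the metric is multiplicative, a perfectly stable track (identical reversal levels) yields a geometric standard deviation of exactly 1, which is why the group medians of roughly 1.2–1.3 indicate similar, modest variability across groups.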

Figure 6 displays results for minimal pair discrimination among children with hearing loss compared to similarly aged peers with NH from the main study. Children with C-HL had thresholds that were on average 2.8–7.1 dB higher than their peers with NH. Thresholds in children with C-HL appear to approach performance for children with NH with increasing age. For children with AN, average discrimination thresholds for Sick/Stick and Coat/Goat were 8.7 and 6.2 dB poorer than those collected from children with C-HL, and 15.9 and 9.0 dB poorer than those for children with NH. Surprisingly, children with AN also demonstrated the poorest performance among all three participant groups for the Net/Nut discrimination task. Average thresholds for Net/Nut discrimination for children with AN were 20.8 and 24.3 dB worse than average SNR scores for children with C-HL and NH, respectively.

Figure 6.

Three panels plotting Net/Nut, Sick/Stick, and Coat/Goat discrimination thresholds (SNR in dB, −20 to 30) against age (7 years to adult), with adult performance shown as boxplots (medians of −12, −14, and −11 dB, respectively). No age trend is apparent for the groups with hearing loss, and most data points for listeners with normal hearing fall within the shaded region.

Minimal pair discrimination for words Net/Nut (top left), Sick/Stick (bottom left), and Coat/Goat (bottom right) for participants with normal hearing (NH; gray), cochlear hearing loss (purple), and auditory neuropathy (teal). Boxplots represent adult performance. The shaded region depicts the 95% confidence intervals around the line fit to data from children with NH who were between 7.3 and 15.0 years of age. SNR = signal-to-noise ratio.

Receptive vocabulary scaled scores were similar across participants with AN and generally poorer than those of children with NH and C-HL (see Figure 7). Phonological awareness scaled scores for participants with AN spanned a range comparable to those from children with C-HL. Receptive vocabulary and phonological awareness scaled scores were significantly correlated among children with C-HL, r(9) = .76, p = .027, but not among children with AN, r(4) = .79, p = .212. The slopes of the best-fitting lines relating phonological awareness and receptive vocabulary in children with C-HL (y = .27x + 9.16) and NH (y = .41x + 10.37) are steeper than in children with AN (y = .11x + 7.51), suggesting a stronger positive relationship between phonological awareness and receptive vocabulary for the C-HL and NH groups. The best-fitting line for data from children with AN is likely affected by the lack of variability in their receptive vocabulary scores. The associations between phonological awareness and receptive vocabulary size in this population should be explored further.
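Group-specific best-fitting lines of this kind come from an ordinary least-squares fit; the sketch below uses NumPy's `polyfit`, with placeholder scores rather than the study's data.

```python
import numpy as np

# Placeholder phonological awareness (x) and receptive vocabulary (y)
# scaled scores for one group; not the study's data.
elision = np.array([6.0, 8.0, 10.0, 12.0, 14.0])
ppvt = np.array([9.0, 10.0, 11.5, 12.0, 14.0])

# Ordinary least-squares line y = slope * x + intercept.
slope, intercept = np.polyfit(elision, ppvt, deg=1)
```

Fitting one such line per group and comparing slopes, as in the paragraph above, indicates how strongly vocabulary tracks phonological awareness within each group, though with only four children with AN any slope estimate is fragile.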

Figure 7.

A scatterplot of PPVT-4 scaled scores (0–20) against CTOPP-2 Elision subtest scaled scores (0–20). Scores for participants with normal hearing and cochlear hearing loss show a positive association and fall largely within the shaded region; scores for participants with auditory neuropathy show no clear association.

Scaled scores for receptive vocabulary plotted against phonological awareness for participants with normal hearing (NH; gray), cochlear hearing loss (purple), and auditory neuropathy (teal). The shaded region depicts the 95% confidence interval around a line fit to data from children with NH who were between 7.3 and 15.0 years of age. PPVT-4 = Peabody Picture Vocabulary Test–Fourth Edition; CTOPP-2 = Comprehensive Test of Phonological Processing–Second Edition.

Comparing Results From Children With NH to Pilot Data From Children With AN and C-HL

Overall, the results among children and adults with NH in this study support the idea that spectral and temporal cues preferentially support vowel and consonant perception, respectively. The rationale for examining the effects of temporal and spectral smearing stems from the hypothesis that poor perceptual outcomes resulting from AN are primarily related to temporal resolution loss, while perception with C-HL is primarily related to spectral resolution loss. Importantly, pilot testing among children with AN and C-HL did not entirely conform to the original hypothesis. Results from these pilot data suggest the need for additional consideration of the perceptual experiences of children with AN as it relates to access to both spectral and temporal cues necessary for speech perception. One such consideration may be the possible presence of both AN and C-HL in the same listener.

Limitations to the Current Study

It is important to note that the speech perception measure used in this study was minimal pair discrimination, a task that assesses consonant and vowel distinctions in single words. In addition, while consonant distinctions are important for lexical access and vocabulary acquisition for words in isolation, vowel perception is crucial for speech recognition in sentences (Fogerty et al., 2012; Kewley-Port et al., 2007). Future studies should consider the long-term effects of degraded perception of both vowels and consonants for word- and sentence-level speech recognition.

Finally, pilot data collected from children with AN are limited to four listeners, each with very similar medical histories, including extreme prematurity and extended neonatal hospitalization. It is unknown how site of lesion (i.e., pre- or postsynaptic), potential combined presence of AN and C-HL, and other etiologies associated with AN would impact vowel and consonant perception. Further work is needed to elucidate the effects of site of lesion and etiology on acoustic speech perception with AN.

Conclusions

Overall, results from this study suggest that spectral and temporal smearing preferentially degrade access to vowel and consonant cues, respectively, in children and adults with NH. At the outset, it was expected that the use of spectral and temporal smearing would mimic the prevailing perceptual consequences of C-HL and AN, respectively; however, pilot data collected in children with AN and C-HL did not conform to this hypothesis. Further work is necessary to explore possible temporal and spectral resolution deficits in children with AN.

Data Availability Statement

The data sets generated during the current study are available in the Open Science Framework repository (https://osf.io/EHGB7).

Supplementary Material

Supplemental Material S1. Hearing health, pertinent medical history, hearing aid fitting and use data (when available), K-BIT (nonverbal intelligence), PPVT-4 (receptive vocabulary), and CTOPP-2 (phonological awareness, Elision) scaled scores for children with hearing loss.
JSLHR-68-4447-s001.pdf (611.4KB, pdf)
Supplemental Material S2. Better-ear hearing thresholds for individual participants from 250 to 8000 Hz for children with C-HL (left panel) and AN (right panel). Better ear was calculated using a three-frequency pure tone average at 500, 1000, and 2000 Hz. Where the pure tone average is equal across both ears, the right ear is reported.
JSLHR-68-4447-s002.pdf (521KB, pdf)

Acknowledgments

This work was sponsored by the National Institute on Deafness and Other Communication Disorders Grant F32 DC020341 (Kane). The authors would like to thank all participants for their contributions to this work. The authors would also like to thank (a) the members of the Human Auditory Development Lab at Boys Town National Research Hospital (BTNRH) for their contributions, particularly Maggie Miller for her assistance with recruitment and data collection, (b) Caitlin Sapp for her insights and support for recruitment at University of North Carolina (UNC), (c) the clinical audiology teams at UNC and BTNRH for recruitment support, and (d) the members of the Speech Perception and Auditory Research at Carolina Lab at UNC for assistance with data collection. Statistical support for this work was provided by Chris Wiessen at the Odum Institute for Research in Social Sciences at UNC. Images representing minimal pairs in the study were used with permission and can be viewed at the Open Science Framework link provided in the following manuscript: Buss et al. (2022).

Funding Statement

This work was sponsored by the National Institute on Deafness and Other Communication Disorders Grant F32 DC020341 (Kane).

Footnotes

1. One child with auditory neuropathy completed only one threshold run for the spectral ripple task due to the difficulty of the task and parent concern.

2. Reversal data were missing for one child with normal hearing in this sample.

References

1. Allen, P., & Wightman, F. (1992). Spectral pattern discrimination by children. Journal of Speech and Hearing Research, 35(1), 222–233. https://doi.org/10.1044/jshr.3501.222
2. Baer, T., & Moore, B. C. J. (1993). Effects of spectral smearing on the intelligibility of sentences in noise. The Journal of the Acoustical Society of America, 94(3), 1229–1241. https://doi.org/10.1121/1.408176
3. Boothroyd, A., Mulhearn, B., Gong, J., & Ostroff, J. (1996). Effects of spectral smearing on phoneme and word recognition. The Journal of the Acoustical Society of America, 100(3), 1807–1818. https://doi.org/10.1121/1.416000
4. Buss, E., Felder, J., Miller, M. K., Leibold, L. J., & Calandruccio, L. (2022). Can closed-set word recognition differentially assess vowel and consonant perception for school-age children with and without hearing loss? Journal of Speech, Language, and Hearing Research, 65(10), 3934–3950. https://doi.org/10.1044/2022_JSLHR-20-00749
5. Buss, E., Hall, J. W., III, & Grose, J. H. (2013). Factors affecting the processing of intensity in school-aged children. Journal of Speech, Language, and Hearing Research, 56(1), 71–80. https://doi.org/10.1044/1092-4388(2012/12-0008)
6. Buss, E., Hall, J. W., III, Porter, H., & Grose, J. H. (2014). Gap detection in school-age children and adults: Effects of inherent envelope modulation and the availability of cues across frequency. Journal of Speech, Language, and Hearing Research, 57(3), 1098–1107. https://doi.org/10.1044/2014_JSLHR-H-13-0132
7. Buss, E., Porter, H. L., Hall, J. W., III, & Grose, J. H. (2017). Gap detection in school-age children and adults: Center frequency and ramp duration. Journal of Speech, Language, and Hearing Research, 60(1), 172–181. https://doi.org/10.1044/2016_JSLHR-H-16-0010
8. Creel, S. C., Aslin, R. N., & Tanenhaus, M. K. (2006). Acquiring an artificial lexicon: Segment type and order information in early lexical entries. Journal of Memory and Language, 54(1), 1–19. https://doi.org/10.1016/j.jml.2005.09.003
9. Davies-Venn, E., Nelson, P., & Souza, P. (2015). Comparing auditory filter bandwidths, spectral ripple modulation detection, spectral ripple discrimination, and speech recognition: Normal and impaired hearing. The Journal of the Acoustical Society of America, 138(1), 492–503. https://doi.org/10.1121/1.4922700
10. DiNino, M., Wright, R. A., Winn, M. B., & Bierer, J. A. (2016). Vowel and consonant confusions from spectrally manipulated stimuli designed to simulate poor cochlear implant electrode-neuron interfaces. The Journal of the Acoustical Society of America, 140(6), 4404–4418. https://doi.org/10.1121/1.4971420
11. Drullman, R., Festen, J. M., & Plomp, R. (1994). Effect of temporal envelope smearing on speech reception. The Journal of the Acoustical Society of America, 95(2), 1053–1064. https://doi.org/10.1121/1.408467
12. Dunn, L., Dunn, L., Bulheller, S., & Häcker, H. (1965). Peabody Picture Vocabulary Test. American Guidance Service.
13. Escudero, P., Mulak, K. E., & Vlach, H. A. (2016). Cross-situational learning of minimal word pairs. Cognitive Science, 40(2), 455–465. https://doi.org/10.1111/cogs.12243
14. Festen, J. M., & Plomp, R. (1990). Effects of fluctuating noise and interfering speech on the speech-reception threshold for impaired and normal hearing. The Journal of the Acoustical Society of America, 88(4), 1725–1736. https://doi.org/10.1121/1.400247
15. Fogerty, D., Kewley-Port, D., & Humes, L. E. (2012). The relative importance of consonant and vowel segments to the recognition of words and sentences: Effects of age and hearing loss. The Journal of the Acoustical Society of America, 132(3), 1667–1678. https://doi.org/10.1121/1.4739463
16. Glasberg, B. R., & Moore, B. C. J. (1986). Auditory filter shapes in subjects with unilateral and bilateral cochlear impairments. The Journal of the Acoustical Society of America, 79(4), 1020–1033. https://doi.org/10.1121/1.393374
17. Hall, J. W., III, & Grose, J. H. (1991). Notched-noise measures of frequency selectivity in adults and children using fixed-masker-level and fixed-signal-level presentation. Journal of Speech and Hearing Research, 34(3), 651–660. https://doi.org/10.1044/jshr.3403.651
18. Hoover, E., Pasquesi, L., & Souza, P. (2015). Comparison of clinical and traditional gap detection tests. Journal of the American Academy of Audiology, 26(6), 540–546. https://doi.org/10.3766/jaaa.14088
19. Kewley-Port, D., Burkle, T. Z., & Lee, J. H. (2007). Contribution of consonant versus vowel information to sentence intelligibility for young normal-hearing and elderly hearing-impaired listeners. The Journal of the Acoustical Society of America, 122(4), 2365–2375. https://doi.org/10.1121/1.2773986
20. Klein, K. E., Walker, E. A., Kirby, B., & McCreery, R. W. (2017). Vocabulary facilitates speech perception in children with hearing aids. Journal of Speech, Language, and Hearing Research, 60(8), 2281–2296. https://doi.org/10.1044/2017_JSLHR-H-16-0086
21. Levitt, H. (1971). Transformed up-down methods in psychoacoustics. The Journal of the Acoustical Society of America, 49(2B), 467–477. https://doi.org/10.1121/1.1912375
22. Liu, C., & Eddins, D. A. (2008). Effects of spectral modulation filtering on vowel identification. The Journal of the Acoustical Society of America, 124(3), 1704–1715. https://doi.org/10.1121/1.2956468
23. Maxon, A. B., & Hochberg, I. (1982). Development of psychoacoustic behavior: Sensitivity and discrimination. Ear and Hearing, 3(6), 301–308. https://doi.org/10.1097/00003446-198211000-00003
24. Metsala, J. L. (1999). Young children's phonological awareness and nonword repetition as a function of vocabulary development. Journal of Educational Psychology, 91(1), 3–19. https://doi.org/10.1037/0022-0663.91.1.3
25. Michalewski, H. J., Starr, A., Nguyen, T. T., Kong, Y.-Y., & Zeng, F.-G. (2005). Auditory temporal processes in normal-hearing individuals and in patients with auditory neuropathy. Clinical Neurophysiology, 116(3), 669–680. https://doi.org/10.1016/j.clinph.2004.09.027
26. Moore, B. C. J., Shailer, M. J., & Schooneveldt, G. P. (1992). Temporal modulation transfer functions for band-limited noise in subjects with cochlear hearing loss. British Journal of Audiology, 26(4), 229–237. https://doi.org/10.3109/03005369209076641
27. Moore, B. C. J., & Glasberg, B. R. (1983). Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. The Journal of the Acoustical Society of America, 74(3), 750–753. https://doi.org/10.1121/1.389861
28. Moser, T., & Starr, A. (2016). Auditory neuropathy—Neural and synaptic mechanisms. Nature Reviews Neurology, 12(3), 135–149. https://doi.org/10.1038/nrneurol.2016.10
29. Nazzi, T., & Cutler, A. (2019). How consonants and vowels shape spoken-language recognition. Annual Review of Linguistics, 5(1), 25–47. https://doi.org/10.1146/annurev-linguistics-011718-011919
30. Nespor, M., Pena, M., & Mehler, J. (2003). On the different role of vowels and consonants in speech processing and language acquisition. Lingue e Linguaggio. https://doi.org/10.1418/10879
31. Nittrouer, S. (1996). Discriminability and perceptual weighting of some acoustic cues to speech perception by 3-year-olds. Journal of Speech and Hearing Research, 39(2), 278–297. https://doi.org/10.1044/jshr.3902.278
32. Owens, E., Talbott, C. B., & Schubert, E. D. (1968). Vowel discrimination of hearing-impaired listeners. Journal of Speech and Hearing Research, 11(3), 648–655. https://doi.org/10.1044/jshr.1103.648
33. Pearsons, K. S., Bennett, R. L., & Fidell, S. A. (1977). Speech levels in various noise environments. Office of Health and Ecological Effects, Office of Research and Development, US EPA.
34. Pinheiro, J., & Bates, D. (2000). Mixed-effects models in S and S-PLUS. Springer. https://doi.org/10.1007/b98882
35. Rance, G., Fava, R., Baldock, H., Chong, A., Barker, E., Corben, L., & Delatycki, M. B. (2008). Speech perception ability in individuals with Friedreich ataxia. Brain, 131(8), 2002–2012. https://doi.org/10.1093/brain/awn104
36. Rance, G., McKay, C., & Grayden, D. (2004). Perceptual characterization of children with auditory neuropathy. Ear and Hearing, 25(1), 34–46. https://doi.org/10.1097/01.AUD.0000111259.59690.B8
37. Rance, G., & Starr, A. (2015). Pathophysiological mechanisms and functional hearing consequences of auditory neuropathy. Brain, 138(11), 3141–3158. https://doi.org/10.1093/brain/awv270
38. Starr, A., Picton, T. W., Sininger, Y., Hood, L. J., & Berlin, C. I. (1996). Auditory neuropathy. Brain, 119(3), 741–753. https://doi.org/10.1093/brain/119.3.741
39. ter Keurs, M., Festen, J. M., & Plomp, R. (1992). Effect of spectral envelope smearing on speech reception. I. The Journal of the Acoustical Society of America, 91(5), 2872–2880. https://doi.org/10.1121/1.402950
40. Turner, C. W., Souza, P. E., & Forget, L. N. (1995). Use of temporal envelope cues in speech recognition by normal and hearing-impaired listeners. The Journal of the Acoustical Society of America, 97(4), 2568–2576. https://doi.org/10.1121/1.411911
41. van Veen, T. M., & Houtgast, T. (1985). Spectral sharpness and vowel dissimilarity. The Journal of the Acoustical Society of America, 77(2), 628–634. https://doi.org/10.1121/1.391880
42. Wagner, R., Torgesen, J., Rashotte, C., & Pearson, N. (2013). Comprehensive Test of Phonological Processing–Second Edition (CTOPP-2) [Database record]. https://doi.org/10.1037/t52630-000
43. Wightman, F., Allen, P., Dolan, T., Kistler, D., & Jamieson, D. (1989). Temporal resolution in children. Child Development, 60(3), Article 611. https://doi.org/10.2307/1130727
44. Xu, L., & Pfingst, B. E. (2008). Spectral and temporal cues for speech recognition: Implications for auditory prostheses. Hearing Research, 242(1–2), 132–140. https://doi.org/10.1016/j.heares.2007.12.010
45. Zeng, F.-G., Kong, Y.-Y., Michalewski, H. J., & Starr, A. (2005). Perceptual consequences of disrupted auditory nerve activity. Journal of Neurophysiology, 93(6), 3050–3063. https://doi.org/10.1152/jn.00985.2004
46. Zeng, F.-G., Oba, S., Garde, S., Sininger, Y., & Starr, A. (1999). Temporal and speech processing deficits in auditory neuropathy. NeuroReport, 10(16), 3429–3435. https://doi.org/10.1097/00001756-199911080-00031
47. Zwicker, E., Schorn, K., Ashoor, A. A., & Prochazka, T. (1982). Temporal resolution in hard-of-hearing patients. Audiology, 21(6), 474–492. https://doi.org/10.3109/00206098209072760
