Abstract
Advancing age is associated with decreased sensitivity to temporal cues in word segments, particularly when target words follow non-informative carrier sentences or are spectrally degraded (e.g., vocoded to simulate cochlear-implant stimulation). This study investigated whether age, carrier sentences, and spectral degradation interacted to cause undue difficulty in processing speech temporal cues. Younger and older adults with normal hearing performed phonemic categorization tasks on two continua: a Buy/Pie contrast with voice onset time changes for the word-initial stop and a Dish/Ditch contrast with silent interval changes preceding the word-final fricative. Target words were presented in isolation or after non-informative carrier sentences, and were unprocessed or degraded via sinewave vocoding (2, 4, and 8 channels). Older listeners exhibited reduced sensitivity to both temporal cues compared to younger listeners. For the Buy/Pie contrast, age, carrier sentence, and spectral degradation interacted such that the largest age effects were seen for unprocessed words in the carrier sentence condition. This pattern differed from the Dish/Ditch contrast, where reducing spectral resolution exaggerated age effects, but introducing carrier sentences largely left the patterns unchanged. These results suggest that certain temporal cues are particularly susceptible to aging when placed in sentences, likely contributing to the difficulties of older cochlear-implant users in everyday environments.
I. INTRODUCTION
Processing the temporal features of complex sounds is critical for speech recognition. For instance, listeners appear to primarily utilize temporal cues to identify and discriminate word contrasts such as Buy/Pie (voice onset time for the word-initial stop) (Lisker and Abramson, 1964) and Dish/Ditch (silence interval preceding the word-final fricative, i.e., silence duration) (Dorman et al., 1979). When such temporal cues are distorted, speech recognition tends to decline; such distortions include rapid speech (Wingfield et al., 1985), reverberation (Ding et al., 2023), and background noise (Wong et al., 2009). To further exemplify the importance of temporal cues to speech recognition, many cochlear-implant users can achieve robust speech recognition, particularly in quiet conditions (Friesen et al., 2001; Shannon et al., 1995), even though cochlear implants primarily convey the temporal envelope information from speech signals (Loizou, 2006). For cochlear-implant users, speech recognition is particularly impacted by rapid speech (Ji et al., 2013; Tinnemore et al., 2022), reverberation (Kressner et al., 2018), and the presence of competing sounds, such as background noise (e.g., Oxenham and Kreft, 2014).
Chronological age is another factor that is associated with decreased sensitivity to temporal cues in acoustic-hearing listeners (e.g., Gordon-Salant et al., 2006; Gordon-Salant et al., 2008; Goupell et al., 2017; Strouse et al., 1998) and cochlear-implant users (Johnson et al., 2021; Shader et al., 2020a,b; Xie et al., 2019). Temporal processing deficits in older adult listeners likely originate from age-related changes in temporal cue encoding in the peripheral auditory system (Lopez-Poveda, 2014; Lopez-Poveda and Barrios, 2013; Otte et al., 1978; Sergeyenko et al., 2013; Wu et al., 2019) and in the central auditory system (Anderson et al., 2012; Hughes et al., 2010; Roque et al., 2019; Tremblay et al., 2003; Walton et al., 1998). Given the importance of temporal processing to speech recognition, age-related temporal processing deficits are widely recognized as one potential mechanism underlying the speech recognition difficulties in older adult listeners that can occur in the absence of substantive hearing loss (Füllgrabe et al., 2015; Gordon-Salant et al., 2011; Schneider and Pichora-Fuller, 2001). Understanding the relationships among age, stimulus degradation, and temporal processing is critical for understanding how to maximize speech recognition outcomes for cochlear-implant users, particularly middle-aged and older adults who are experiencing age-related temporal processing deficits.
Evidence for age-related temporal processing deficits includes results from highly controlled laboratory experiments such as phonemic categorization tasks. In this paradigm, listeners categorize single-word contrasts that vary along a single temporal dimension (e.g., Gordon-Salant et al., 2006; Winn et al., 2016). For example, listeners can be presented with a Dish/Ditch contrast with varying silence durations and are required to categorize each stimulus as the word “Dish” or “Ditch.” Temporal processing performance is quantified with the slope (rate of response change) and 50% crossover point metrics [the point along the temporal cue continuum where a participant's responses are equally split (50%) between the two phoneme choices], estimated from the categorization responses as a function of the temporal cue duration. Older (vs younger) adult listeners typically exhibit shallower slopes and/or later crossover points, reflecting reduced sensitivity to temporal cues (i.e., a temporal processing deficit) (Gordon-Salant et al., 2006; Gordon-Salant et al., 2008). Furthermore, these age-related declines in temporal processing appear to be exacerbated by decreasing spectral resolution (e.g., unprocessed vs 8-channel vocoded stimuli) in normal-hearing listeners presented with cochlear-implant simulations (Goupell et al., 2017).
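For illustration, the sketch below (in R, with simulated responses; the variable names and the simulated listener are assumptions for demonstration only, not data from the studies cited above) shows how a logistic fit yields these two metrics.

```r
# Minimal sketch: estimate slope and 50% crossover from one listener's categorization responses.
set.seed(1)
dur  <- rep(seq(0, 60, by = 10), each = 10)               # 7-step continuum (ms), 10 trials per step
resp <- rbinom(length(dur), 1, plogis((dur - 30) / 8))    # simulated responses: 1 = "Ditch", 0 = "Dish"

fit <- glm(resp ~ dur, family = binomial)                 # logistic categorization function
slope     <- unname(coef(fit)["dur"])                     # rate of response change (logits per ms)
crossover <- -unname(coef(fit)["(Intercept)"]) / slope    # duration (ms) at 50% "Ditch" responses
```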
Extant research using phonemic categorization tasks has primarily focused on isolated words. Listeners, however, typically encounter longer utterances (e.g., sentences) in natural speech and conversations. Hence, understanding the relationships among age, stimulus degradation, and temporal processing for isolated words is necessary to help older adult cochlear-implant users with their speech recognition difficulties, but it is even more important to understand how these relationships operate in more natural, sentence-length speech. This study aims to utilize the phonemic categorization paradigm, a method for assessing temporal processing in a controlled manner, while also incorporating a non-informative carrier sentence. Specifically, we asked how advancing age affects temporal processing of a target word preceded by a sentence. The use of a preceding sentence context may reveal possible effects of age-related differences in forward masking, as suggested previously (Xie et al., 2022).
Few studies have investigated age-related temporal processing deficits in carrier sentence contexts. Gordon-Salant et al. (2008) examined age-related changes in temporal processing with the phonemic categorization task on word contrasts containing one of four temporal cues: voice onset time (e.g., Buy/Pie contrast), silence duration (e.g., Dish/Ditch contrast), transition duration, and vowel duration. Words with these temporal cues were inserted at the end of carrier sentences that were contextually uninformative because they had limited semantic information to predict the target words. Results showed that older (vs younger) normal-hearing listeners had later crossover points when categorizing many of the word contrasts in sentences than when categorizing them as isolated words. The age-related temporal processing deficits were exaggerated most for the Buy/Pie contrast, and less so or not at all for other word contrasts. These results suggest that carrier sentences may interact with age to worsen age-related temporal processing deficits for some of the word contrasts.
Recently, Xie et al. (2022) adopted the stimuli from Gordon-Salant et al. (2008) and investigated the effect of carrier sentence on auditory temporal processing in adult cochlear-implant users across two temporal cues: voice onset time in the Buy/Pie contrast and silence duration in the Dish/Ditch contrast. Partly consistent with Gordon-Salant et al. (2008), they found that placing a carrier sentence before target words reduced the saliency of the voice onset time cue (Buy/Pie contrast), but not the silence duration cue (Dish/Ditch contrast) for word categorization. Because of the heterogeneity of cochlear-implant performance and the small sample size in Xie et al. (2022), the relationship between age and carrier sentence on temporal processing in cochlear-implant users, who typically experience poor spectral resolution with their sound processor, remains unclear.
Therefore, the current study examined the relationships among age, spectral degradation, and carrier sentence on temporal processing in acoustic-hearing listeners with and without cochlear-implant simulations. We tested acoustic-hearing listeners instead of actual cochlear-implant users to reduce participant heterogeneity. Temporal processing was measured via the aforementioned phonemic categorization task for two speech temporal cues: voice onset time for the word-initial stop of the Buy/Pie contrast and silence duration preceding the word-final fricative of the Dish/Ditch contrast. Two temporal cues were included to replicate previous work indicating that the effects of age and carrier sentence on temporal processing are cue-specific (Gordon-Salant et al., 2008; Xie et al., 2022). Three hypotheses based on previous studies were evaluated in this study. First, it was hypothesized that older listeners would show reduced sensitivity to the temporal cues of the temporally based word contrasts compared to younger listeners (Anderson et al., 2020; Gordon-Salant et al., 2006; Goupell et al., 2017; Xie et al., 2019). Second, it was hypothesized that the age-related temporal processing deficits would be exaggerated for stimuli with reduced spectral resolution (Goupell et al., 2017), because listeners rely more heavily on temporal cues when recognizing spectrally degraded stimuli (Winn et al., 2012). This may place older listeners at a greater disadvantage due to their reduced temporal processing abilities (Anderson et al., 2020; Goupell et al., 2017; Strouse et al., 1998). The third hypothesis was the critical factor examined in this study; it was hypothesized that the age-related temporal processing deficits would be exaggerated with the introduction of a non-informative carrier sentence, particularly for the Buy/Pie contrast where the critical temporal cue is at the word onset (Gordon-Salant et al., 2008; Xie et al., 2022).
II. METHOD
A. Participants
Sixteen younger adults with normal hearing (YNH) (18–23 years, two males) and 17 older adults with normal hearing (ONH) or near-normal hearing (65–78 years, one male) participated in this study. The sample size was selected based on prior work demonstrating the effect of age on auditory temporal processing in acoustic-hearing listeners (e.g., Gordon-Salant et al., 2006; Gordon-Salant et al., 2008; Goupell et al., 2017; Xie et al., 2019). Nominally, normal hearing was defined as ≤25 dB hearing level (HL) at octave frequencies from 250 to 4000 Hz in the better-hearing ear. The average and individual thresholds of test ears are displayed in Fig. 1. Two ONH participants had near-normal hearing thresholds because their 4000 Hz thresholds were 30 and 35 dB HL. The 8000 Hz threshold was missing for one ONH participant. All participants were native speakers of American English. The ONH participants scored 22 of 30 or better (13 of 17 scored 26 or better; data are missing for one ONH) on a cognitive screener, the Montreal Cognitive Assessment (Nasreddine et al., 2005), suggesting they had normal or near-normal cognitive function at the time of the study (Cecato et al., 2016; Dupuis et al., 2015) (see Ethics Approval for details.)
FIG. 1.
(Color online) Mean pure-tone thresholds (dB HL) of test ears for YNH (black squares) and ONH (red, open circles) groups. The horizontal dashed line indicates 25 dB HL. Error bars denote ±1 standard deviation. Transparent lines (black lines for YNH and red lines for ONH) denote thresholds for individual participants.
B. Stimuli
The stimuli included unprocessed and vocoded versions of target words and carrier sentences. The unprocessed and vocoded versions of the target words and the unprocessed version of the carrier sentences were directly obtained from previous studies (Gordon-Salant et al., 2006; Gordon-Salant et al., 2008; Goupell et al., 2017). The target words and carrier sentences were recorded separately by the same speaker. The vocoded version of the carrier sentences was made from the unprocessed version for the current study. The following paragraphs provide a summary of the stimulus creation methods. Readers can refer to relevant prior studies for additional details (Gordon-Salant et al., 2006; Gordon-Salant et al., 2008; Goupell et al., 2017; Xie et al., 2019).
The target words were two continua of word contrasts: Buy/Pie (Fig. 2) and Dish/Ditch (Fig. 3). The Buy/Pie contrast varied in the voice onset time (VOT) (in ms) duration of the word-initial consonant sound. The Dish/Ditch contrast varied in the duration of the silent interval preceding the word-final consonant sound (i.e., silence duration). A seven-step continuum was created for each word contrast by systematically manipulating the temporal cue from 0 (Buy or Dish) to 60 ms (Pie or Ditch) with a 10 ms step size. Thus, the target words were primarily differentiated by temporal cues.
FIG. 2.
(Color online) Spectrograms for the end points (0 and 60 ms) and midpoint (30 ms) from the Buy/Pie contrast. The Buy/Pie contrast varies in the duration of the onset of voicing (i.e., voice onset time) for the word-initial stop consonant.
FIG. 3.
(Color online) Spectrograms for the end points (0 and 60 ms) and midpoint (30 ms) from the Dish/Ditch contrast. The Dish/Ditch contrast varies in the duration of a silent interval preceding the word-final fricative (i.e., silence duration).
Each target word was presented alone (isolated word condition) and after a carrier sentence (carrier sentence condition). There were 70 carrier sentences taken from Gordon-Salant et al. (2008). The sentences did not explicitly cue any target word (Buy, Pie, Dish, or Ditch) from their semantic information. For example, one carrier sentence was, “I had not thought about the.” The carrier sentence before each target word was not fixed, and all carrier sentences had an equal probability of being paired with a target word. The target words were inserted at the end of a carrier sentence such that they played immediately after the carrier sentence.
The vocoding procedures were similar to those of Goupell et al. (2017). Target words and carrier sentences were tone-vocoded with 2, 4, or 8 contiguous channels. Each stimulus was bandpass filtered (forward–backward third-order Butterworth filters) into 2, 4, or 8 bands with logarithmically distributed cutoff frequencies from 200 to 8000 Hz. The Hilbert envelopes were derived from each band and low-pass filtered (forward–backward second-order Butterworth filter) at 400 Hz. Tonal carriers at the geometric mean frequency of the bandpass filters were modulated with the processed envelopes and combined to form the final vocoded stimulus. Because we were primarily interested in temporal processing and effects of aging, tonal carriers were chosen to minimize any influence from random envelope modulations that occur with noise carriers (Cychosz et al., 2024; Whitmal et al., 2007). Forward–backward filtering minimized any temporal delays that would occur across frequency. There are factors to consider in such an implementation. The use of a relatively small number of channels (i.e., ≤8 channels) and a relatively high envelope cutoff of 400 Hz produces relatively good speech recognition compared to noise carriers because of the resolved sidebands of the tonal carriers (Souza and Rosen, 2009). We assume that this effect should not differ across the age groups because the resolvability of components should be approximately the same given that all the listeners had good hearing thresholds. For sentence perception, which does not target the processing of temporal cues as narrowly as the categorization task used in the current study, effects of aging observed by manipulating the envelope cutoff of noise-vocoded sentences appear to be small to non-existent (Shader et al., 2020c). This further motivates the use of tonal carriers and a categorization task. In summary, the vocoder implementation was targeted at investigating the effects of age-related temporal processing deficits.
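As a point of reference, the sketch below outlines this style of tone vocoder in R. It is a simplified reconstruction under stated assumptions (the signal package for forward–backward Butterworth filtering, logarithmically spaced channel edges between 200 and 8000 Hz, an FFT-based Hilbert envelope, and simple RMS matching), not the exact processing used to create the stimuli.

```r
library(signal)  # butter(), filtfilt(); assumed here for the Butterworth filters

# FFT-based analytic signal, used to obtain the Hilbert envelope without extra packages.
analytic <- function(x) {
  n <- length(x); X <- fft(x); h <- numeric(n)
  if (n %% 2 == 0) { h[c(1, n / 2 + 1)] <- 1; h[2:(n / 2)] <- 2 } else { h[1] <- 1; h[2:((n + 1) / 2)] <- 2 }
  fft(X * h, inverse = TRUE) / n
}

tone_vocode <- function(x, fs, n_channels) {
  stopifnot(fs > 16000)                                                # upper band edge is 8000 Hz
  edges <- exp(seq(log(200), log(8000), length.out = n_channels + 1))  # log-spaced cutoff frequencies
  lp  <- butter(2, 400 / (fs / 2), type = "low")                       # 400-Hz envelope low-pass
  t   <- (seq_along(x) - 1) / fs
  out <- numeric(length(x))
  for (ch in seq_len(n_channels)) {
    bp   <- butter(3, edges[ch:(ch + 1)] / (fs / 2), type = "pass")    # third-order band-pass
    band <- filtfilt(bp, x)                                            # forward-backward filtering
    env  <- pmax(filtfilt(lp, Mod(analytic(band))), 0)                 # smoothed Hilbert envelope
    fc   <- sqrt(edges[ch] * edges[ch + 1])                            # carrier at geometric mean frequency
    out  <- out + env * sin(2 * pi * fc * t)                           # amplitude-modulated tonal carrier
  }
  out * sqrt(mean(x^2) / mean(out^2))                                  # match overall RMS to the input
}
```

In this sketch, a 4-channel version of a word sampled at 44.1 kHz would be produced by tone_vocode(waveform, 44100, 4).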
C. Design
Participants completed four types of blocks: the Buy/Pie contrast in the isolated word and carrier sentence conditions, and the Dish/Ditch contrast in the isolated word and carrier sentence conditions. In each block, the unprocessed and vocoded stimuli (2, 4, and 8 channels) were mixed and presented in a randomized order. The order of blocks was counterbalanced across participants. The rationale for presenting word contrasts and carrier conditions in separate blocks was to minimize confusion about the target words. Mixing unprocessed and vocoded stimuli would not cause confusion about target words and could control for order effects. In most conditions, each stimulus was repeated 10 times. In rare cases (about 2.1%), repetitions ranged from 6 to 9 because of computer error. In other cases (about 1.3%), the stimuli were repeated more than 10 times (11–17). In such cases, the first ten repetitions were selected to be consistent with the majority of the conditions.
D. Procedure
Participants completed testing in a sound-attenuating booth (Industrial Acoustics, Inc., North Aurora, IL). The testing was controlled by custom scripts in MATLAB (MathWorks, Natick, MA). Stimuli were presented monaurally at 75 dB sound pressure level (SPL) through one insert earphone (ER2, Etymotic, Elk Grove Village, IL) to the right ear of YNH participants and the better ear of ONH participants (11 right ears and six left ears). This level was chosen to maximize audibility for the older participants and to be consistent with presentation levels used in related investigations (e.g., Anderson et al., 2020). The better ear was the ear with a lower mean threshold across 500, 1000, 2000, and 4000 Hz. In six ONH participants, circumaural headphones (Sennheiser HD650, Wedemark, Germany) were used for testing, because of experimenter error. Transducer type (insert earphone vs headphone) was included as a factor in the statistical models to control for its effect on temporal processing (see Statistical Analysis and Results). Results did not show that transducer type modulated the effects of age group and related interactions.
Participants were instructed to categorize each stimulus (i.e., the word in the isolated word condition or the final word in the carrier sentence condition) as being “Buy” or “Pie” for the Buy/Pie contrast and as being “Dish” or “Ditch” for the Dish/Ditch contrast. Participants initiated each trial via a button press. They made their judgment without a time limit by clicking one of the two boxes on the screen. No feedback was provided. Participants completed the experiment in a session that lasted approximately 2 h. They were encouraged to have breaks between blocks to minimize fatigue.
E. Statistical analysis
Mixed-effects logistic regression models were fit to the trial-level data of each word contrast, using the buildmer package (Voeten, 2019) and lme4 package (Bates et al., 2015) in R (R Core Team, 2022).
For the Buy/Pie contrast, the dependent variable was the trial-level response to target words: 1 for a “Buy” response or 0 for a “Pie” response. The initial full model incorporated the following fixed factors: VOT (0–60 ms), age group (sum coded with YNH coded as −0.5 and ONH coded as +0.5), number of channels (factor coded with the unprocessed condition as the reference level), carrier sentence (or carrier; sum coded with the isolated word condition coded as −0.5 and the carrier sentence condition coded as +0.5), transducer type (or transducer; sum coded with the insert earphone presentation coded as −0.5 and the headphone presentation coded as +0.5), and their interactions. The VOT factor was centered and scaled using the scale function, and was treated as a continuous variable. The pure-tone average (500, 1000, 2000, and 4000 Hz) of the hearing thresholds was included as a covariate to control for audibility effects. The random effects were initially set as (VOT * number of channels * carrier | participant).
Model testing was implemented using the buildmer function from the buildmer package with the binomial family, the default direction for stepwise elimination (the combination of “order” and “backward”), and the default criterion (likelihood-ratio test). Briefly, the buildmer function took the initial full model to identify the maximal model that could converge and then performed stepwise elimination to find the best-fitting model for the observed data. The by-participant random intercept was not tested for elimination and was always included in the model. Results from the final best-fitting model are reported.
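To make these modeling steps concrete, a sketch of the model specification and buildmer call is given below. The data frame and column names (dat, resp, vot, group, carrier, channels, transducer, pta, participant) are illustrative assumptions rather than the actual analysis script, and buildmer is run with its defaults, as described above.

```r
library(lme4)
library(buildmer)

# Assumed trial-level data frame `dat`, one row per trial:
#   resp: 1 = "Buy", 0 = "Pie"; vot: VOT in ms (0-60); pta: pure-tone average (dB HL);
#   group, carrier, transducer: two-level factors; channels: factor with "unprocessed" as
#   the reference level; participant: listener ID.
dat$vot <- as.numeric(scale(dat$vot))                 # center and scale the VOT continuum
contrasts(dat$group)      <- cbind(c(-0.5, 0.5))      # assumes level order YNH, ONH
contrasts(dat$carrier)    <- cbind(c(-0.5, 0.5))      # assumes level order isolated, carrier sentence
contrasts(dat$transducer) <- cbind(c(-0.5, 0.5))      # assumes level order insert, headphone

full <- resp ~ vot * group * channels * carrier * transducer + pta +
  (vot * channels * carrier | participant)

# buildmer identifies the maximal converging model and then backward-eliminates terms
# using its default order and likelihood-ratio criterion.
m <- buildmer(full, data = dat, family = binomial)
summary(m)
```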
The above analysis procedures were applied to the Dish/Ditch contrast data with two modifications. First, the dependent variable was “Dish” (coded as 1) or “Ditch” (coded as 0) as the response. Second, the silence duration cue (0–60 ms) was used as the temporal cue of interest.
Figures to illustrate significant interaction effects were created as follows. First, the ggemmeans function from the ggeffects package (Lüdecke, 2018) was used to compute the predicted probabilities of response (“Buy” for the Buy/Pie contrast or “Dish” for the Dish/Ditch contrast) from the interaction terms of the final best-fitting models, with the remaining arguments using the default parameters. Then, the ggplot2 package (Wickham, 2011) was used to plot the predicted response probabilities as a function of the interaction terms.
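A sketch of this plotting step is shown below, assuming a fitted lme4 model object fit (e.g., extracted from the buildmer result) and the illustrative term names used in the sketches above.

```r
library(ggeffects)
library(ggplot2)

# Predicted probability of a "Buy" response for the VOT x Group x Channels interaction.
pred <- ggemmeans(fit, terms = c("vot [all]", "group", "channels"))

ggplot(as.data.frame(pred), aes(x = x, y = predicted, colour = group)) +
  geom_line() +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = group), alpha = 0.2, colour = NA) +
  facet_wrap(~facet) +
  labs(x = "Scaled voice onset time", y = "Predicted probability of \"Buy\"")
```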
III. RESULTS
A. Perception of the VOT cue in the Buy/Pie contrast
Figure 4 displays the percentage of “Buy” responses as a function of VOT. While both groups (YNH and ONH) were able to discriminate the words “Buy” and “Pie,” the ONH group tended to perceive the words as “Buy,” particularly when the words were preceded by carrier sentences or when the words were vocoded. Table I shows the results from the final best-fitting model. The description of the results focuses on factors directly related to the research questions: VOT, Group, Carrier (isolated word or carrier sentence), and Channels.
FIG. 4.
(Color online) Identification functions (i.e., the average percentage of trials identified as “Buy”) for the perception of voice onset time in the Buy/Pie contrast in YNH (black squares) and ONH (red, open circles) groups. Both unprocessed and vocoded (8, 4, and 2 channels) stimuli were presented in isolation (isolated word; upper graphs) and after a carrier sentence (carrier sentence; lower graphs). Error bars show ±1 standard error. The horizontal dashed line corresponds to 50% “Buy” responses.
TABLE I.
Results for the final best-fitting model: “Buy” (vs “Pie”) response = VOT * Carrier + VOT * Group * Channels + Group * Carrier * Channels + Carrier * Transducer + Channels * Transducer + (1 + VOT | participant). Group, Carrier, and Transducer were sum-coded with the level associated with the –0.5 contrast shown in parentheses.
| Fixed effects | β | Standard error | z | p |
|---|---|---|---|---|
| (Intercept) | −0.272 | 0.094 | −2.893 | 0.004 |
| VOT | −2.110 | 0.098 | −21.591 | < 0.001 |
| Group (YNH) | 1.603 | 0.152 | 10.525 | < 0.001 |
| Carrier (Isolated Word) | 1.692 | 0.098 | 17.243 | < 0.001 |
| Channels: 2 channels | 1.093 | 0.081 | 13.442 | < 0.001 |
| Channels: 4 channels | 0.292 | 0.080 | 3.636 | < 0.001 |
| Channels: 8 channels | 0.875 | 0.084 | 10.409 | < 0.001 |
| Transducer (Insert) | 0.306 | 0.189 | 1.618 | 0.106 |
| VOT * Group | 1.042 | 0.195 | 5.339 | < 0.001 |
| VOT * Carrier | 0.085 | 0.044 | 1.953 | 0.051 |
| VOT * Channels: 2 channels | 1.281 | 0.071 | 17.993 | < 0.001 |
| VOT * Channels: 4 channels | 0.808 | 0.073 | 11.047 | < 0.001 |
| VOT * Channels: 8 channels | 0.422 | 0.078 | 5.426 | < 0.001 |
| Group * Carrier | 1.240 | 0.182 | 6.809 | < 0.001 |
| Group * Channels: 2 channels | −1.210 | 0.126 | −9.620 | < 0.001 |
| Group * Channels: 4 channels | −1.373 | 0.127 | −10.780 | < 0.001 |
| Group * Channels: 8 channels | −0.900 | 0.134 | −6.726 | < 0.001 |
| Carrier * Channels: 2 channels | −0.981 | 0.113 | −8.706 | < 0.001 |
| Carrier * Channels: 4 channels | −0.892 | 0.114 | −7.825 | < 0.001 |
| Carrier * Channels: 8 channels | −0.891 | 0.119 | −7.490 | < 0.001 |
| Carrier * Transducer | 0.856 | 0.114 | 7.525 | < 0.001 |
| Transducer * Channels: 2 channels | 0.485 | 0.158 | 3.073 | 0.002 |
| Transducer * Channels: 4 channels | −0.337 | 0.156 | −2.159 | 0.031 |
| Transducer * Channels: 8 channels | −0.121 | 0.163 | −0.740 | 0.459 |
| VOT * Group * Channels: 2 channels | −0.411 | 0.142 | −2.890 | 0.004 |
| VOT * Group * Channels: 4 channels | −0.583 | 0.146 | −3.984 | < 0.001 |
| VOT * Group * Channels: 8 channels | −0.529 | 0.155 | −3.406 | < 0.001 |
| Group * Carrier * Channels: 2 channels | −1.873 | 0.226 | −8.302 | < 0.001 |
| Group * Carrier * Channels: 4 channels | −0.916 | 0.229 | −4.005 | < 0.001 |
| Group * Carrier * Channels: 8 channels | −1.197 | 0.237 | −5.053 | < 0.001 |
| Random effects | Variance | Standard deviation | ||
| (Intercept) | 0.085 | 0.291 | ||
| VOT | 0.190 | 0.436 | ||
| Marginal R2 | 47.3% | |||
| Conditional R2 | 51.4% |
There were significant interactions between VOT, Group, and Channels: VOT by Group by Channels: 2 channels interaction, VOT by Group by Channels: 4 channels interaction, and VOT by Group by Channels: 8 channels interaction (all ps < 0.01). These results indicate that, as shown in Fig. 5(A), the slopes of the logistic fits to “Buy” responses as a function of VOT were shallower for the ONH group than the YNH group for unprocessed stimuli. This age-related difference in slopes was smaller for vocoded than for unprocessed stimuli. The interaction was explored further by re-running the same model using each level of Channels as the reference level. Results showed that the age-related difference in slopes (shallower in ONH than YNH) was significant for all vocoded stimuli (significant VOT by Group interactions, all ps < 0.01). The magnitudes of age effects were not significantly different between these levels of Channels (8, 4, and 2 channels; all ps > 0.05). Thus, the interactions between VOT, Group, and Channels were driven by the fact that the age-related difference in slopes was reduced for vocoded compared to unprocessed stimuli.
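These follow-up comparisons amount to refitting the same model after changing the reference level of the Channels factor; a minimal sketch, using the illustrative objects from the Method sketches:

```r
# Simple effects at 8 channels: make "8" the reference level of Channels and refit.
dat$channels <- relevel(dat$channels, ref = "8")
fit_8ref <- update(fit, data = dat)
summary(fit_8ref)   # Group and VOT-by-Group terms now test effects for 8-channel stimuli
```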
FIG. 5.
(Color online) Illustrations of interaction effects on the perception of voice onset time in the Buy/Pie contrast: voice onset time by age group by channels interaction (A) and age group by carrier by channels interaction (B). Data in (A) were predicted values from the model, with the factor of Carrier averaged across levels. Data in (B) were predicted values from the model at the mean VOT. Error bars show ± 95% confidence interval.
There were significant interactions between Group, Carrier, and Channels: Group by Carrier by Channels: 2 channels interaction, Group by Carrier by Channels: 4 channels interaction, and Group by Carrier by Channels: 8 channels interaction (all ps < 0.001). These results indicate that, as shown in Fig. 5(B), the effect of age (ONH listeners reporting more “Buy” responses than YNH listeners) became smaller for vocoded than unprocessed stimuli. This change in the age effect with reduced stimulus spectral resolution (unprocessed vs vocoded) was more salient for the carrier sentence condition than for the isolated word condition. The interaction was explored further by re-running the same model using each level of Channels as the reference level. Results showed that the effect of age was significant for the 2- and 8-channel stimuli (significant Group effects, both ps < 0.01), but not for the 4-channel stimuli (p = 0.096). The effect of age was greater for 8-channel stimuli than 2- and 4-channel stimuli (significant Group by Channels interactions, both ps < 0.01). Further, the change in the magnitude of the age effect from 8- to 2-channel stimuli was greater for the carrier sentence condition than the isolated word condition (significant Group by Carrier by Channels interaction, p = 0.001). The (non-significant) change in the magnitude of the age effect from 4- to 2-channel stimuli was smaller for the carrier sentence condition than for the isolated word condition (significant Group by Carrier by Channels interaction, p < 0.001).
In summary, consistent with the first hypothesis, when categorizing the words Buy and Pie, the ONH (vs YNH) group exhibited reduced sensitivity to VOT changes (shallower slopes and reporting more “Buy” responses). Contrary to the second hypothesis, the effect of age became smaller with reduced stimulus spectral resolution. Consistent with the third hypothesis, introducing a carrier sentence before target words exaggerated the age effect. Further, the modulation of the age effect by spectral resolution appeared to be more prominent when a carrier sentence was introduced.
B. Perception of the silence duration cue in the Dish/Ditch contrast
Figure 6 displays the percentage of “Dish” responses as a function of silence duration. While both groups (YNH and ONH) could discriminate the words “Dish” and “Ditch,” the ONH group tended to perceive the words as “Dish,” particularly for stimuli with lower spectral resolution. Table II shows the results from the final best-fitting model. The description focuses on factors directly related to the research questions: Silence duration, Group, Carrier, and Channels.
FIG. 6.
(Color online) Identification functions (i.e., the average percentage of trials identified as “Dish”) for the perception of silence duration in the Dish/Ditch contrast in YNH (black squares) and ONH (red/open circles) groups. Both unprocessed and vocoded (8, 4, and 2 channels) stimuli were presented in isolation (isolated word; upper graphs) and after a carrier sentence (carrier sentence; lower graphs). Error bars show ±1 standard error. The horizontal dashed line corresponds to 50% “Dish” responses.
TABLE II.
Results for the final best-fitting model: “Dish” (vs “Ditch”) response = Silence duration * Group * Carrier + Silence duration * Group * Channels + Group * Carrier * Channels + (1 + Silence duration | participant). Group and Carrier were sum-coded with the level associated with the –0.5 contrast shown in parentheses.
| Fixed effects | β | Standard error | z | p |
|---|---|---|---|---|
| (Intercept) | 0.310 | 0.103 | 2.999 | 0.003 |
| Silence duration | −3.247 | 0.153 | −21.178 | <0.001 |
| Group (YNH) | 0.437 | 0.207 | 2.117 | 0.034 |
| Carrier (Isolated Word) | 0.926 | 0.102 | 9.113 | <0.001 |
| Channels: 2 channels | 0.290 | 0.064 | 4.557 | <0.001 |
| Channels: 4 channels | −0.507 | 0.072 | −7.005 | <0.001 |
| Channels: 8 channels | −0.753 | 0.076 | −9.900 | <0.001 |
| Silence duration * Group | −0.381 | 0.306 | −1.246 | 0.212 |
| Silence duration * Carrier | −0.420 | 0.068 | −6.199 | <0.001 |
| Silence duration * Channels: 2 channels | 1.920 | 0.094 | 20.347 | <0.001 |
| Silence duration * Channels: 4 channels | −0.090 | 0.120 | −0.751 | 0.452 |
| Silence duration * Channels: 8 channels | −0.440 | 0.129 | −3.404 | <0.001 |
| Group * Carrier | −0.640 | 0.203 | −3.156 | 0.002 |
| Group * Channels: 2 channels | 0.569 | 0.127 | 4.471 | <0.001 |
| Group * Channels: 4 channels | 0.746 | 0.145 | 5.151 | <0.001 |
| Group * Channels: 8 channels | 0.995 | 0.152 | 6.545 | <0.001 |
| Carrier * Channels: 2 channels | −0.607 | 0.127 | −4.777 | <0.001 |
| Carrier * Channels: 4 channels | 0.003 | 0.145 | 0.022 | 0.983 |
| Carrier * Channels: 8 channels | −0.352 | 0.148 | −2.381 | 0.017 |
| Silence duration * Group * Carrier | −0.353 | 0.136 | −2.599 | 0.009 |
| Silence duration * Group * Channels: 2 channels | 1.644 | 0.189 | 8.717 | <0.001 |
| Silence duration * Group * Channels: 4 channels | 0.977 | 0.241 | 4.061 | <0.001 |
| Silence duration * Group * Channels: 8 channels | 0.721 | 0.259 | 2.787 | 0.005 |
| Group * Carrier * Channels: 2 channels | 0.805 | 0.254 | 3.175 | 0.001 |
| Group * Carrier * Channels: 4 channels | −0.025 | 0.291 | −0.085 | 0.933 |
| Group * Carrier * Channels: 8 channels | 0.458 | 0.295 | 1.549 | 0.121 |
| Random effects | Variance | Standard deviation | ||
| (Intercept) | 0.266 | 0.516 | ||
| Silence duration | 0.530 | 0.728 | ||
| Marginal R2 | 71.1% | |||
| Conditional R2 | 76.7% |
There was a significant interaction between Silence duration, Group, and Carrier (p = 0.009). This indicates that the effect of age on slopes was reduced for the carrier sentence condition compared to the isolated word condition (Fig. 6).
There were significant interactions between Silence duration, Group, and Channels: Silence duration by Group by Channels: 2 channels interaction, Silence duration by Group by Channels: 4 channels interaction, and Silence duration by Group by Channels: 8 channels interaction (all ps < 0.01). These results indicate that the effect of age on slopes was larger for vocoded than for unprocessed stimuli [Fig. 7(A)]. The interaction was explored further by re-running the same model using each level of Channels as the reference level. Results showed that the effect of age on slopes was only significant for the 2-channel stimuli (significant Silence duration by Group interaction, p < 0.001), such that the slope was shallower for the ONH group than the YNH group. The age effect for the 2-channel stimuli was significantly larger than for 4- and 8-channel stimuli (significant Silence duration by Group by Channels interactions, both ps < 0.001). The age effect was not significantly different between 4- and 8-channel stimuli (p = 0.334).
FIG. 7.
(Color online) Illustrations of interaction effects on the perception of silence duration in the Dish-Ditch contrast: silence duration by age group by channels interaction (A) and age group by carrier by channels interaction (B). Data in (A) were predicted values from the model, with the factor of Carrier averaged across levels. Data in (B) were predicted values from the model at the mean silence duration. Error bars show ±95% confidence interval.
There was a significant interaction between Group, Carrier, and Channels: Group by Carrier by Channels: 2 channels interaction (p = 0.001). This indicates that, as shown in Fig. 7(B), the effect of age (ONH reporting more “Dish” responses than YNH) became larger for vocoded than unprocessed stimuli. The increase in the age effect with reduced stimulus spectral resolution (unprocessed vs vocoded) was greater for the carrier sentence condition than for the isolated word condition, but only when comparing unprocessed and 2-channel stimuli. The interaction was explored further by re-running the same model using each level of Channels as the reference level. Results showed that the effect of age (more “Dish” responses for ONH than YNH) was significant for all types of vocoded stimuli (significant effect of Group, all ps < 0.001). The age effect was reduced from 8- to 2-channel stimuli (significant Group by Channels interaction, p = 0.002). There was no significant difference in the age effect between 2- and 4-channel stimuli or between 4- and 8-channel stimuli (non-significant Group by Channels interactions, both ps > 0.1). Further, the (non-significant) reduction in the age effect from 4- to 2-channel stimuli was smaller for the carrier sentence condition than for the isolated word condition (significant Group by Carrier by Channels interaction, p = 0.001).
In summary, consistent with the first hypothesis, when categorizing the words Dish and Ditch, the ONH (vs YNH) group exhibited reduced sensitivity to silence duration changes (primarily reflected as reporting more “Dish” responses). Consistent with the second hypothesis, the age effect became larger with reduced stimulus spectral resolution. Contrary to the third hypothesis, introducing a carrier sentence before target words reduced the age effect. Further, the increased age effect observed by reducing the spectral resolution (e.g., unprocessed vs 8- and 4-channel stimuli) appears to be largely unaffected by introducing a carrier sentence.
IV. DISCUSSION
This study examined the relationships among age, spectral degradation, and carrier sentence on the ability to perceive temporal cues in word segments. Consistent with the first hypothesis, results showed that older (vs younger) listeners exhibited reduced sensitivity to both VOT changes (Buy/Pie contrast; shallower slopes and reporting more “Buy” responses; Figs. 4 and 5) and silence duration changes (Dish/Ditch contrast; shallower slopes and reporting more “Dish” responses; Figs. 6 and 7). For the Buy/Pie contrast, the largest age effects were seen for unprocessed words in carrier sentence conditions [Fig. 5(B)]. Specifically, age interacted with spectral degradation and carrier sentence on temporal processing, such that spectral degradation reduced the age-related temporal processing deficits, and the age-by-spectral degradation effect became more salient in carrier sentence conditions. In contrast, for the Dish/Ditch continuum, spectral degradation exaggerated the age-related temporal processing deficits, but introducing carrier sentences largely left the patterns unchanged (Figs. 6 and 7). Together, these results are partly consistent with the second and third hypotheses, suggesting that spectral degradation and the presence of a carrier sentence increase age-related temporal processing deficits in a cue-specific manner.
The findings of reduced ability to perceive temporal cues in word segments with advancing age in acoustic-hearing listeners agree with the findings of previous studies that examined age effects using similar phonemic categorization tasks to probe temporal processing abilities (Anderson et al., 2020; Gordon-Salant et al., 2006; Gordon-Salant et al., 2008; Goupell et al., 2017; Roque et al., 2019; Xie et al., 2019). Importantly, the current study reinforces previous findings that stimulus-related factors may modulate the degree of age-related temporal processing deficits (Gordon-Salant et al., 2008; Goupell et al., 2017; Xie et al., 2019).
First, the current study demonstrated that spectral degradation via vocoding exaggerates age-related temporal processing deficits for the Dish/Ditch contrast, but not the Buy/Pie contrast. Previous studies have primarily used the Dish/Ditch contrast to investigate the effect of spectral degradation on the ability to perceive temporal cues in word segments, and the results are mixed. Goupell et al. (2017) showed that age-related differences may become larger with reduced spectral resolution (e.g., 8-channel vocoded vs unprocessed stimuli). However, Anderson et al. (2020) did not find an exaggeration of age-related differences with reduced spectral resolution. This discrepancy might be attributed to differences in the experimental design. For example, studies have used different levels of vocoding [e.g., both Goupell et al. (2017) and the current study included three vocoding levels, while Anderson et al. (2020) used only one vocoding level]. The broader range of word recognition difficulty associated with more vocoding levels may put older listeners at a greater disadvantage and increase the likelihood of observing age-related differences (Sheldon et al., 2008). Moreover, these studies recruited different subject groups, and differences in listener characteristics (e.g., hearing thresholds) might contribute to the discrepancy in findings (Anderson et al., 2020; Gordon-Salant et al., 2008). Further, these studies used different stimulus presentation levels [65 dBA in Goupell et al. (2017) vs 75 dB SPL in Anderson et al. (2020); also see Experiment 3 in Xie et al. (2019) for an investigation of the stimulus presentation level effect]. The current study used the same presentation level (75 dB SPL) as Anderson et al. (2020), but showed different results. This suggests that presentation level may not be the dominant factor contributing to the mixed findings. Future research is needed to clarify under what conditions reduced spectral resolution may or may not exaggerate age-related temporal processing deficits.
Second, the current study showed that introducing a carrier sentence exacerbated age-related temporal processing deficits for the Buy/Pie contrast, but not the Dish/Ditch contrast. This finding is consistent with Gordon-Salant et al. (2008), showing that carrier sentences worsened age effects on temporal processing for the Buy/Pie contrast but not the Dish/Ditch contrast. One candidate mechanism for the carrier sentence effect is forward masking (Xie et al., 2022). The VOT cue is at the word-initial position of the Buy/Pie contrast, and the silence duration cue is at the word-final position of the Dish/Ditch contrast. Hence, forward masking from the preceding carrier sentence is likely more prominent for the Buy/Pie contrast than the Dish/Ditch contrast. Advancing age is associated with increased susceptibility to forward masking (Dubno et al., 2003; Grose et al., 2016), which may exaggerate age-related temporal processing deficits with carrier sentences, particularly for the Buy/Pie contrast. The account of forward masking is consistent with Xie et al. (2022), who demonstrated that a carrier sentence affected the sensitivity to categorize the Buy/Pie contrast but not the Dish/Ditch contrast in cochlear-implant users. In Xie et al. (2022), the detrimental effect was even greater with a steady-state noise carrier. The account of forward masking, however, is only partially consistent with Gordon-Salant et al. (2008), who showed that the presence of carrier sentences affected performance on a variety of temporal cues of different word positions (i.e., word-initial, word-medial, and word-final) and for carrier sentences presented both before and after target words.
Third, the modulation of age-related temporal processing deficits by spectral degradation depends on the presence of carrier sentences and the type of temporal cue. For the Buy/Pie contrast, the effect of age tended to decrease with spectral degradation for stimuli with relatively higher resolution (4-channel or greater). Such a decrement in the age effect (with spectral degradation) was more salient in the carrier sentence conditions. This three-way interaction between age, spectral degradation, and carrier sentence may be attributed to the fact that carrier sentences exacerbated the age-related temporal processing deficits to a greater extent for unprocessed (vs vocoded) stimuli. For the Dish/Ditch contrast, the age group effects tended to increase with reduced stimulus spectral resolution. However, the increment appears to be largely unaffected by introducing a carrier sentence, at least for stimuli with a relatively high resolution (4- and 8-channel stimuli). As discussed above, the lack of an effect of carrier sentence may be explained by relatively limited forward masking from carrier sentences on the Dish/Ditch contrast. In other words, the critical temporal cue for Dish/Ditch is located in the middle of the word, relatively far away in time from the end of the carrier sentence that has the potential to mask the silence duration cue. Instead, the silence duration cue may be masked by the onset /dɪ/ sound that is present in all conditions, rather than the preceding carrier sentence. The cue-specific effect regarding the modulation of age-related temporal processing deficits by spectral degradation and carrier sentence again appears to align with the forward-masking account. Together, these studies suggest that forward masking is probably one of several mechanisms underlying the carrier sentence effect, which may exacerbate age-related temporal processing deficits. These findings also reinforce the importance of using sentence-length stimuli, rather than single words, to evaluate temporal processing ability in acoustic-hearing listeners and cochlear-implant users (Gordon-Salant et al., 2008; Xie et al., 2022).
The age-related temporal processing deficits are not exclusive to speech stimuli and have been widely observed with non-speech stimuli (e.g., gap detection; Harris et al., 2010; Snell, 1997; Strouse et al., 1998). This raises the question of how age-related changes in general auditory processing relate to age-related deficits in the perception of temporal cues with speech stimuli. There is some evidence that gap detection performance predicted the use of VOT cues in categorizing stop consonants in young adults with normal hearing (Elangovan and Stuart, 2008). Future studies should compare the effects of aging on temporal processing on tasks with non-speech (e.g., gap detection) and speech stimuli (e.g., phonemic categorization of words with temporal cues) (e.g., Grose et al., 2006).
This study has limitations that warrant future investigation. First, the extent to which the current findings regarding spectral degradation effects generalize to actual cochlear-implant users requires further investigation. Second, while this study primarily manipulated temporal cues of the word contrasts, these contrasts (e.g., Buy/Pie) may also contain spectral cues (e.g., onset formant) that contribute to word recognition, even when spectrally degraded (Roberts et al., 2011). Third, for the Buy/Pie contrast, Buy is most often used as a verb and is less plausible at the end of an actual sentence than Pie. Such an expectation might bias listeners toward Pie rather than Buy, contributing to the carrier sentence effects. Future studies should use contrasts of unambiguous nouns (e.g., Beak/Peak) to investigate the effects of the carrier sentence. Fourth, given that the normal-hearing threshold criteria extended only to 4 kHz and the word contrasts contain frequencies above 4 kHz, age group differences in hearing thresholds above 4 kHz might contribute to apparent age-related temporal processing deficits. Low-pass filtering the stimuli at 4 kHz is a common approach to address this issue, although doing so would not represent the frequency range conveyed by actual cochlear-implant processors. Fifth, this study only tested two temporal cues at different word positions (VOT at the word-initial position and silence duration at the word-final position). Future studies could assess the generalizability of findings to other temporally based word contrasts and clarify mechanisms underlying the potential temporal processing differences associated with different word positions.
Finally, the presence of a neutral carrier sentence is only one type of speech context that may affect word recognition. For example, Gordon-Salant et al. (2008) also showed carrier sentence effects when the carrier sentences followed the target word. Future studies could examine the effect of the location of the carrier sentences on recognition of contrasting target words. In addition, natural speech typically contains sentential context that is semantically coherent with upcoming words and would enhance their recognition (Dubno et al., 2000; Kalikow et al., 1977; Pichora‐Fuller et al., 1995; Sommers and Danielson, 1999). The semantic context influences the predictability of individual words in a sentence by constraining the pool of likely lexical matches (e.g., Grant and Seitz, 2000), thereby potentially reducing the acoustic–phonetic cues required for word recognition. Hence, coherent semantic context might override the negative effects of the preceding stimulus context observed here using non-informative carriers. Future studies could examine the effect of semantic context on age-related temporal processing changes using target words with temporal cues embedded in sentences with varying semantic contexts (e.g., Bushong and Jaeger, 2019). Overall, the current study and future work would contribute to the knowledge of the extent of age-related temporal processing deficits in natural speech and conversations among older listeners with acoustic and electric hearing.
V. CONCLUSION
This study demonstrated that advancing age was associated with decreased sensitivity to temporal cues in word segments among acoustic-hearing listeners. The age-related temporal processing deficits depend on stimulus-related factors, including spectral degradation, carrier sentence, and type of temporal cue. Future work could extend this study to cochlear-implant users to examine how age- and stimulus-related factors influence their temporal processing abilities. Understanding the types of sounds and words affected by age-related temporal processing deficits, as well as the impact of context information, would be beneficial for aural rehabilitation for older cochlear-implant users.
ACKNOWLEDGMENTS
Research reported in this publication was supported by the National Institute on Deafness and Other Communication Disorders of the National Institutes of Health under Award Nos. R01AG051603 and R01DC020316 (M.J.G.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Ethics Approval
All materials and procedures were approved by the Institutional Review Board at the University of Maryland. All participants provided written informed consent and received monetary compensation for their participation.
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.
References
- 1. Anderson, S., Parbery-Clark, A., White-Schwoch, T., and Kraus, N. (2012). “Aging affects neural precision of speech encoding,” J. Neurosci. 32, 14156–14164. 10.1523/JNEUROSCI.2176-12.2012
- 2. Anderson, S., Roque, L., Gaskins, C. R., Gordon-Salant, S., and Goupell, M. J. (2020). “Age-related compensation mechanism revealed in the cortical representation of degraded speech,” J. Assoc. Res. Otolaryngol. 21, 373–391. 10.1007/s10162-020-00753-4
- 3. Bates, D., Mächler, M., Bolker, B., and Walker, S. (2015). “Fitting linear mixed-effects models using lme4,” J. Stat. Softw. 67, 1–48. 10.18637/jss.v067.i01
- 4. Bushong, W., and Jaeger, T. F. (2019). “Dynamic re-weighting of acoustic and contextual cues in spoken word recognition,” J. Acoust. Soc. Am. 146, EL135–EL140. 10.1121/1.5119271
- 5. Cecato, J. F., Martinelli, J. E., Izbicki, R., Yassuda, M. S., and Aprahamian, I. (2016). “A subtest analysis of the Montreal cognitive assessment (MoCA): Which subtests can best discriminate between healthy controls, mild cognitive impairment and Alzheimer's disease?,” Int. Psychogeriatr. 28, 825–832. 10.1017/S1041610215001982
- 6. Cychosz, M., Winn, M. B., and Goupell, M. J. (2024). “How to vocode: Using vocoders for cochlear-implant research,” J. Acoust. Soc. Am. 155, 2407–2437. 10.1121/10.0025274
- 7. Ding, N., Gao, J., Wang, J., Sun, W., Fang, M., Liu, X., and Zhao, H. (2023). “Speech recognition in echoic environments and the effect of aging and hearing impairment,” Hear. Res. 431, 108725. 10.1016/j.heares.2023.108725
- 8. Dorman, M. F., Raphael, L. J., and Liberman, A. M. (1979). “Some experiments on the sound of silence in phonetic perception,” J. Acoust. Soc. Am. 65, 1518–1532. 10.1121/1.382916
- 9. Dubno, J. R., Ahlstrom, J. B., and Horwitz, A. R. (2000). “Use of context by young and aged adults with normal hearing,” J. Acoust. Soc. Am. 107, 538–546. 10.1121/1.428322
- 10. Dubno, J. R., Horwitz, A. R., and Ahlstrom, J. B. (2003). “Recovery from prior stimulation: Masking of speech by interrupted noise for younger and older adults with normal hearing,” J. Acoust. Soc. Am. 113, 2084–2094. 10.1121/1.1555611
- 11. Dupuis, K., Pichora-Fuller, M. K., Chasteen, A. L., Marchuk, V., Singh, G., and Smith, S. L. (2015). “Effects of hearing and vision impairments on the Montreal Cognitive Assessment,” Aging Neuropsychol. Cogn. 22, 413–437. 10.1080/13825585.2014.968084
- 12. Elangovan, S., and Stuart, A. (2008). “Natural boundaries in gap detection are related to categorical perception of stop consonants,” Ear Hear. 29, 761–774. 10.1097/AUD.0b013e318185ddd2
- 13. Friesen, L. M., Shannon, R. V., Baskent, D., and Wang, X. (2001). “Speech recognition in noise as a function of the number of spectral channels: Comparison of acoustic hearing and cochlear implants,” J. Acoust. Soc. Am. 110, 1150–1163. 10.1121/1.1381538
- 14. Füllgrabe, C., Moore, B. C., and Stone, M. A. (2015). “Age-group differences in speech identification despite matched audiometrically normal hearing: Contributions from auditory temporal processing and cognition,” Front. Aging Neurosci. 6, 347. 10.3389/fnagi.2014.00347
- 15. Gordon-Salant, S., Fitzgibbons, P. J., and Yeni-Komshian, G. H. (2011). “Auditory temporal processing and aging: Implications for speech understanding of older people,” Audiol. Res. 1(e4), 9–15. 10.4081/audiores.2011.e4
- 16. Gordon-Salant, S., Yeni-Komshian, G., and Fitzgibbons, P. (2008). “The role of temporal cues in word identification by younger and older adults: Effects of sentence context,” J. Acoust. Soc. Am. 124, 3249–3260. 10.1121/1.2982409
- 17. Gordon-Salant, S., Yeni-Komshian, G. H., Fitzgibbons, P. J., and Barrett, J. (2006). “Age-related differences in identification and discrimination of temporal cues in speech segments,” J. Acoust. Soc. Am. 119, 2455–2466. 10.1121/1.2171527
- 18. Goupell, M. J., Gaskins, C. R., Shader, M. J., Walter, E. P., Anderson, S., and Gordon-Salant, S. (2017). “Age-related differences in the processing of temporal envelope and spectral cues in a speech segment,” Ear Hear. 38(6), e335–e342. 10.1097/AUD.0000000000000447
- 19. Grant, K. W., and Seitz, P. F. (2000). “The recognition of isolated words and words in sentences: Individual variability in the use of sentence context,” J. Acoust. Soc. Am. 107, 1000–1011. 10.1121/1.428280
- 20. Grose, J. H., Hall, J. W., III, and Buss, E. (2006). “Temporal processing deficits in the pre-senescent auditory system,” J. Acoust. Soc. Am. 119, 2305–2315. 10.1121/1.2172169
- 21. Grose, J. H., Menezes, D. C., Porter, H. L., and Griz, S. (2016). “Masking period patterns and forward masking for speech-shaped noise: Age-related effects,” Ear Hear. 37, 48–54. 10.1097/AUD.0000000000000200
- 22. Harris, K. C., Eckert, M. A., Ahlstrom, J. B., and Dubno, J. R. (2010). “Age-related differences in gap detection: Effects of task difficulty and cognitive ability,” Hear. Res. 264, 21–29. 10.1016/j.heares.2009.09.017
- 23. Hughes, L. F., Turner, J. G., Parrish, J. L., and Caspary, D. M. (2010). “Processing of broadband stimuli across A1 layers in young and aged rats,” Hear. Res. 264, 79–85.
- 24. Ji, C., Galvin, J. J., III, Xu, A., and Fu, Q.-J. (2013). “Effect of speaking rate on recognition of synthetic and natural speech by normal-hearing and cochlear implant listeners,” Ear Hear. 34, 313–323. 10.1097/AUD.0b013e31826fe79e
- 25. Johnson, K. C., Xie, Z., Shader, M. J., Mayo, P. G., and Goupell, M. J. (2021). “Effect of chronological age on pulse rate discrimination in adult cochlear-implant users,” Trends Hear. 25, 23312165211007367. 10.1177/23312165211007367
- 26. Kalikow, D. N., Stevens, K. N., and Elliott, L. L. (1977). “Development of a test of speech intelligibility in noise using sentence materials with controlled word predictability,” J. Acoust. Soc. Am. 61, 1337–1351. 10.1121/1.381436
- 27. Kressner, A. A., Westermann, A., and Buchholz, J. M. (2018). “The impact of reverberation on speech intelligibility in cochlear implant recipients,” J. Acoust. Soc. Am. 144, 1113–1122. 10.1121/1.5051640
- 28. Lisker, L., and Abramson, A. S. (1964). “A cross-language study of voicing in initial stops: Acoustical measurements,” Word 20, 384–422. 10.1080/00437956.1964.11659830
- 29. Loizou, P. C. (2006). “Speech processing in vocoder-centric cochlear implants,” Cochlear Brainstem Implants 64, 109–143. 10.1159/000094648
- 30. Lopez-Poveda, E. A. (2014). “Why do I hear but not understand? Stochastic undersampling as a model of degraded neural encoding of speech,” Front. Neurosci. 8, 348. 10.3389/fnins.2014.00348
- 31. Lopez-Poveda, E. A., and Barrios, P. (2013). “Perception of stochastically undersampled sound waveforms: A model of auditory deafferentation,” Front. Neurosci. 7, 124. 10.3389/fnins.2013.00124
- 32. Lüdecke, D. (2018). “ggeffects: Tidy data frames of marginal effects from regression models,” J. Open Source Softw. 3(26), 772. 10.21105/joss.00772
- 33. Nasreddine, Z. S., Phillips, N. A., Bédirian, V., Charbonneau, S., Whitehead, V., Collin, I., Cummings, J. L., and Chertkow, H. (2005). “The Montreal Cognitive Assessment, MoCA: A brief screening tool for mild cognitive impairment,” J. Am. Geriatr. Soc. 53, 695–699. 10.1111/j.1532-5415.2005.53221.x
- 34. Otte, J., Schuknecht, H. F., and Kerr, A. G. (1978). “Ganglion cell populations in normal and pathological human cochleae. Implications for cochlear implantation,” Laryngoscope 88, 1231–1246. 10.1288/00005537-197808000-00004
- 35. Oxenham, A. J., and Kreft, H. A. (2014). “Speech perception in tones and noise via cochlear implants reveals influence of spectral resolution on temporal processing,” Trends Hear. 18, 2331216514553783. 10.1177/2331216514553783
- 36. Pichora-Fuller, M. K., Schneider, B. A., and Daneman, M. (1995). “How young and old adults listen to and remember speech in noise,” J. Acoust. Soc. Am. 97, 593–608. 10.1121/1.412282
- 37. R Core Team (2022). “R: A language and environment for statistical computing,” https://www.R-project.org (Last viewed February 29, 2024).
- 38. Roberts, B., Summers, R. J., and Bailey, P. J. (2011). “The intelligibility of noise-vocoded speech: Spectral information available from across-channel comparison of amplitude envelopes,” Proc. R. Soc. B Biol. Sci. 278, 1595–1600. 10.1098/rspb.2010.1554
- 39. Roque, L., Gaskins, C., Gordon-Salant, S., Goupell, M. J., and Anderson, S. (2019). “Age effects on neural representation and perception of silence duration cues in speech,” J. Speech Lang. Hear. Res. 62, 1099–1116. 10.1044/2018_JSLHR-H-ASCC7-18-0076
- 40. Schneider, B. A., and Pichora-Fuller, M. K. (2001). “Age-related changes in temporal processing: Implications for speech perception,” Semin. Hear. 22, 227–240. 10.1055/s-2001-15628
- 41. Sergeyenko, Y., Lall, K., Liberman, M. C., and Kujawa, S. G. (2013). “Age-related cochlear synaptopathy: An early-onset contributor to auditory functional decline,” J. Neurosci. 33, 13686–13694. 10.1523/JNEUROSCI.1783-13.2013
- 42. Shader, M. J. , Gordon-Salant, S. , and Goupell, M. J. (2020a). “ Impact of aging and the electrode-to-neural interface on temporal processing ability in cochlear-implant users: Amplitude-modulation detection thresholds,” Trends Hear. 24, 2331216520936160. 10.1177/2331216520936160 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43. Shader, M. J. , Gordon-Salant, S. , and Goupell, M. J. (2020b). “ Impact of aging and the electrode-to-neural interface on temporal processing ability in cochlear-implant users: Gap detection thresholds,” Trends Hear. 24, 2331216520956560. 10.1177/2331216520956560 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44. Shader, M. J. , Yancey, C. M. , Gordon-Salant, S. , and Goupell, M. J. (2020c). “ Spectral-temporal trade-off in vocoded sentence recognition: Effects of age, hearing thresholds, and working memory,” Ear Hear. 41, 1226–1235. 10.1097/AUD.0000000000000840 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45. Shannon, R. V. , Zeng, F.-G. , Kamath, V. , Wygonski, J. , and Ekelid, M. (1995). “ Speech recognition with primarily temporal cues,” Science 270, 303–304. 10.1126/science.270.5234.303 [DOI] [PubMed] [Google Scholar]
- 46. Sheldon, S. , Pichora-Fuller, M. K. , and Schneider, B. A. (2008). “ Effect of age, presentation method, and learning on identification of noise-vocoded words,” J. Acoust. Soc. Am. 123, 476–488. 10.1121/1.2805676 [DOI] [PubMed] [Google Scholar]
- 47. Snell, K. B. (1997). “ Age-related changes in temporal gap detection,” J. Acoust. Soc. Am. 101, 2214–2220. 10.1121/1.418205 [DOI] [PubMed] [Google Scholar]
- 48. Sommers, M. S. , and Danielson, S. M. (1999). “ Inhibitory processes and spoken word recognition in young and older adults: The interaction of lexical competition and semantic context,” Psychol. Aging 14, 458–472. 10.1037/0882-7974.14.3.458 [DOI] [PubMed] [Google Scholar]
- 49. Souza, P. , and Rosen, S. (2009). “ Effects of envelope bandwidth on the intelligibility of sine- and noise-vocoded speech,” J. Acoust. Soc. Am. 126, 792–805. 10.1121/1.3158835 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50. Strouse, A. , Ashmead, D. H. , Ohde, R. N. , and Grantham, D. W. (1998). “ Temporal processing in the aging auditory system,” J. Acoust. Soc. Am. 104, 2385–2399. 10.1121/1.423748 [DOI] [PubMed] [Google Scholar]
- 51. Tinnemore, A. R. , Montero, L. , Gordon-Salant, S. , and Goupell, M. J. (2022). “ The recognition of time-compressed speech as a function of age in listeners with cochlear implants or normal hearing,” Front. Aging Neurosci. 14, 887581. 10.3389/fnagi.2022.887581 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52. Tremblay, K. L. , Piskosz, M. , and Souza, P. (2003). “ Effects of age and age-related hearing loss on the neural representation of speech cues,” Clin. Neurophysiol. 114, 1332–1343. 10.1016/S1388-2457(03)00114-7 [DOI] [PubMed] [Google Scholar]
- 53. Voeten, C. C. (2019). “ Using ‘buildmer’ to automatically find & compare maximal (mixed) models,” R package version, 1(6), pp. 1–7. [Google Scholar]
- 54. Walton, J. P. , Frisina, R. D. , and O'Neill, W. E. (1998). “ Age-related alteration in processing of temporal sound features in the auditory midbrain of the CBA mouse,” J. Neurosci. 18, 2764–2776. 10.1523/JNEUROSCI.18-07-02764.1998 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55. Whitmal, N. A. , Poissant, S. F. , Freyman, R. L. , and Helfer, K. S. (2007). “ Speech intelligibility in cochlear implant simulations: Effects of carrier type, interfering noise, and subject experience,” J. Acoust. Soc. Am. 122, 2376–2388. 10.1121/1.2773993 [DOI] [PubMed] [Google Scholar]
- 56. Wickham, H. (2011). “ ggplot2,” Wiley Interdiscip. Rev. Comput. Stat. 3, 180–185. 10.1002/wics.147 [DOI] [Google Scholar]
- 57. Wingfield, A. , Poon, L. W. , Lombardi, L. , and Lowe, D. (1985). “ Speed of processing in normal aging: Effects of speech rate, linguistic structure, and processing time,” J. Gerontol. 40, 579–585. 10.1093/geronj/40.5.579 [DOI] [PubMed] [Google Scholar]
- 58. Winn, M. B. , Chatterjee, M. , and Idsardi, W. J. (2012). “ The use of acoustic cues for phonetic identification: Effects of spectral degradation and electric hearing,” J. Acoust. Soc. Am. 131, 1465–1479. 10.1121/1.3672705 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59. Winn, M. B. , Won, J. H. , and Moon, I. J. (2016). “ Assessment of spectral and temporal resolution in cochlear implant users using psychoacoustic discrimination and speech cue categorization,” Ear Hear. 37(6), e377–e390. 10.1097/AUD.0000000000000328 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60. Wong, P. C. , Jin, J. X. , Gunasekera, G. M. , Abel, R. , Lee, E. R. , and Dhar, S. (2009). “ Aging and cortical mechanisms of speech perception in noise,” Neuropsychologia 47, 693–703. 10.1016/j.neuropsychologia.2008.11.032 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61. Wu, P. , Liberman, L. , Bennett, K. , de Gruttola, V. , O'Malley, J. , and Liberman, M. (2019). “ Primary neural degeneration in the human cochlea: Evidence for hidden hearing loss in the aging ear,” Neuroscience 407, 8–20. 10.1016/j.neuroscience.2018.07.053 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62. Xie, Z. , Anderson, S. , and Goupell, M. J. (2022). “ Stimulus context affects the phonemic categorization of temporally based word contrasts in adult cochlear-implant users,” J. Acoust. Soc. Am. 151, 2149–2158. 10.1121/10.0009838 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63. Xie, Z. , Gaskins, C. R. , Shader, M. J. , Gordon-Salant, S. , Anderson, S. , and Goupell, M. J. (2019). “ Age-related temporal processing deficits in word segments in adult cochlear-implant users,” Trends Hear. 23, 2331216519886688. 10.1177/2331216519886688 [DOI] [PMC free article] [PubMed] [Google Scholar]
Data Availability Statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.