Author manuscript; available in PMC: 2015 Nov 2.
Published in final edited form as: Adv Exp Med Biol. 2013;787:501–510. doi: 10.1007/978-1-4614-1590-9_55

How Early Aging and Environment Interact in Everyday Listening: From Brainstem to Behavior Through Modeling

Barbara Shinn-Cunningham, Dorea R Ruggles, Hari Bharadwaj
PMCID: PMC4629495  NIHMSID: NIHMS734124  PMID: 23716257

Abstract

We recently showed that listeners with normal hearing thresholds vary in their ability to direct spatial attention and that ability is related to the fidelity of temporal coding in the brainstem. Here, we recruited additional middle-aged listeners and extended our analysis of the brainstem response, measured using the frequency-following response (FFR). We found that even though age does not predict overall selective attention ability, middle-aged listeners are more susceptible to the detrimental effects of reverberant energy than young adults. We separated the overall FFR into orthogonal envelope and carrier components and used an existing model to predict which auditory channels drive each component. We find that responses in mid- to high-frequency auditory channels dominate envelope FFR, while lower-frequency channels dominate the carrier FFR. Importantly, we find that which component of the FFR predicts selective attention performance changes with age. We suggest that early aging degrades peripheral temporal coding in mid-to-high frequencies, interfering with the coding of envelope interaural time differences. We argue that, compared to young adults, middle-aged listeners, who do not have strong temporal envelope coding, have more trouble following a conversation in a reverberant room because they are forced to rely on fragile carrier ITDs that are susceptible to the degrading effects of reverberation.

1 Introduction

The cacophony of voices, noises, and other sounds that bombards our ears in many social settings makes it challenging to focus selective auditory attention. Various acoustic cues allow us to group sound components into perceptual objects to which we can direct attention (Darwin 1997; Shinn-Cunningham 2008; Shamma and Micheyl 2010; Shamma et al. 2011). In most common settings, reflected sound energy intensifies the problem of separating sound sources and selecting the source of interest by blurring the sound features that support source segregation and selection.

Many listeners report difficulties in everyday situations demanding selective attention, especially as they age (Leigh-Paffenroth and Elangovan 2011; Noble et al. 2012). We wondered if these problems are most evident when reverberant energy challenges the auditory system. We designed a task in which listeners had to focus spatial attention on a central target speech stream in a mixture of three otherwise identical streams of spoken digits, and then varied the level of reverberation (Ruggles et al. 2011; Ruggles and Shinn-Cunningham 2011). By design, listeners are likely to rely on interaural time differences (ITDs) to perform this task (Ruggles et al. 2011). Because reverberant energy causes interaural decorrelation, selective attention performance got worse with reverberation, as expected. We also found that individual ability on our task was correlated both with perceptual sensitivity to frequency modulation (FM) and with the overall strength of the frequency-following response (FFR; see also Strelcyk and Dau 2009). However, we had too few middle-aged listeners to explore age effects.

Here, we recruited additional middle-aged listeners so that we could look for aging effects. We extended our analysis of the FFR by separating the response into the portion phase locking to the stimulus envelope (FFRENV) and that phase locking to the stimulus carrier (FFRCAR; similar to approaches described in Aiken and Picton 2008; Gockel et al. 2011). We used existing brainstem response models (Dau 2003; Harte et al. 2010) to investigate which acoustic frequencies contribute to FFRENV and FFRCAR.

2 Methods

2.1 Subjects

A total of 22 listeners ranging in age from 20.9 to 54.7 years participated in the experiments. All listeners had average audiometric hearing thresholds of 20 dB HL or better for frequencies from 250 to 8,000 Hz and left-right ear asymmetry of 15 dB or less at all frequencies. Of the 22 listeners, 17 had participated in earlier studies; the five newly recruited listeners were all over 40 years of age. All gave informed consent and were paid for their participation.

2.2 FFR Measurement

FFRs were measured in response to a /dah/ syllable presented in positive polarity for 2,000 trials and in inverted polarity for 2,000 trials (Ruggles et al. 2011). Trials containing eyeblinks or other artifacts were removed, leaving at least 1,800 clean trials for each subject, condition, and stimulus polarity. The time series from each trial was windowed with a first-order Slepian taper (Thomson 1982) and the Fourier transform was computed. We generated distributions of phase-locking values (PLVs) for different conditions using a bootstrapping procedure to produce 200 independent PLVs, each computed from a draw (with replacement) of 800 trials (Ruggles et al. 2011). We broke the PLV into orthogonal envelope and carrier components (FFRENV and FFRCAR) at every frequency from 30 to 3,000 Hz. FFRENV was calculated with equal draws from responses to each polarity, treating positive- and negative-polarity trials identically. FFRCAR was determined with equal draws from responses to each polarity but inverting the phase of negative-polarity trials (see also Aiken and Picton 2008; Gockel et al. 2011). For each harmonic of 100 Hz, we computed the proportion of the total FFR in FFRENV and in FFRCAR.
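The envelope/carrier split described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names `plv` and `env_car_plv` are hypothetical, the inputs are assumed to be per-trial complex Fourier coefficients already computed from the tapered trials, and "equal draws" is interpreted here as half of each 800-trial draw coming from each polarity.

```python
import numpy as np

def plv(phasors):
    """Phase-locking value: magnitude of the mean unit phasor across trials."""
    unit = phasors / np.abs(phasors)          # keep phase only, discard amplitude
    return np.abs(unit.mean(axis=0))

def env_car_plv(pos, neg, n_draw=800, n_boot=200, seed=0):
    """Bootstrap envelope and carrier PLVs (hypothetical helper).

    pos, neg: complex arrays of shape (trials, frequencies) holding the
    Fourier coefficients of positive- and negative-polarity trials.
    Returns two arrays of shape (n_boot, frequencies).
    """
    rng = np.random.default_rng(seed)
    half = n_draw // 2                        # assumption: equal split per polarity
    env = np.empty((n_boot, pos.shape[1]))
    car = np.empty_like(env)
    for b in range(n_boot):
        # Draw trials with replacement from each polarity.
        p = pos[rng.integers(0, pos.shape[0], half)]
        n = neg[rng.integers(0, neg.shape[0], half)]
        env[b] = plv(np.concatenate([p, n]))   # polarities treated identically
        car[b] = plv(np.concatenate([p, -n]))  # negative-polarity phase inverted
    return env, car
```

A response component that keeps its phase when the stimulus is inverted (envelope-like) survives in `env` and cancels in `car`; a component whose phase flips with polarity (carrier-like) does the opposite.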

2.3 FFR Modeling

We used an existing model of brainstem responses (Dau 2003; Harte et al. 2010) to analyze the sources of the different components of the FFR. We presented the model with our /dah/ syllable, then calculated the FFR by summing model outputs across peripheral channels with CFs spanning the range from 100 up to 10,000 Hz. At each harmonic (multiple of 100 Hz), we then computed the proportion of the total FFR phase locked to the envelope and the proportion phase locked to the carrier (FFRENV and FFRCAR). We then considered the output of each peripheral channel to explore which acoustic frequencies contributed to which components of the FFR. Finally, we analyzed the relative strength of the contribution of each peripheral channel to FFRENV at the fundamental frequency (100 Hz).

2.4 Spatial Attention Task

Subjects were asked to report a sequence of four digits appearing to come from in front while ignoring two competing digit streams, spoken by the same talker, from +15° and −15° azimuth (Ruggles and Shinn-Cunningham 2011). Spatial cues were simulated using a rectangular-room model with three different wall characteristics (Ruggles and Shinn-Cunningham 2011). Prior to statistical analyses, percent correct scores were transformed using a rationalized arcsine unit (RAU; Studebaker 1985). In the task, listeners report one of the three presented words nearly 95 % of the time; errors arise because of failures of selective attention, rather than memory limitations (Ruggles and Shinn-Cunningham 2011). Therefore, scores in the range 0.33–1.0 (chance to perfect performance) were linearly rescaled to 0–1.0 (scores < 0.33 set to 0) prior to applying the RAU transform.
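The score transformation can be sketched as follows, assuming the standard form of the rationalized arcsine transform from Studebaker (1985). The function names are hypothetical, and the number of trials per score (`n_trials`) is a parameter the caller must supply; the text above does not state the value used in the transform.

```python
import math

def rescale_from_chance(p, chance=1.0 / 3.0):
    """Map proportion correct from [chance, 1] onto [0, 1]; below chance -> 0."""
    return max(0.0, (p - chance) / (1.0 - chance))

def rau(p, n_trials):
    """Rationalized arcsine units for proportion p scored over n_trials items.

    Uses the usual two-term arcsine form; x is the (possibly non-integer)
    number of items correct implied by p.
    """
    x = p * n_trials
    theta = (math.asin(math.sqrt(x / (n_trials + 1)))
             + math.asin(math.sqrt((x + 1) / (n_trials + 1))))
    return (146.0 / math.pi) * theta - 23.0

# Example: a raw score of 0.8 is first rescaled, then transformed.
# n_trials=600 is an illustrative assumption, not a value from the paper.
score_rau = rau(rescale_from_chance(0.8), n_trials=600)
```

Rescaling first means that a listener performing at chance maps to the bottom of the RAU scale, so the transform stabilizes variance over the range in which performance actually reflects selective attention.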

2.5 FM Detection Task

Listeners indicated which of three 750-Hz tones (interstimulus interval 750 ms) contained 2-Hz frequency modulation (Strelcyk and Dau 2009). A two-down, one-up adaptive procedure (step size 1 Hz) estimated the 70.7 % correct FM threshold. Individual thresholds were computed by averaging the last 12 reversals per run, then averaging across six runs.
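A two-down, one-up track converges on the level at which two consecutive correct responses are as likely as not, that is, where p² = 0.5 and p ≈ 0.707. The sketch below illustrates the tracking rule and the reversal averaging. The function name, the `respond` callback, the starting level, and the total number of reversals per run are assumptions for illustration; only the 1-Hz step size and the averaging of the last 12 reversals come from the text.

```python
def two_down_one_up(respond, start, step=1.0, n_reversals=16, n_average=12, floor=0.0):
    """Run one adaptive track and return the mean of the last reversals.

    respond(level) -> bool reports whether the listener was correct at `level`.
    n_reversals (total reversals before stopping) is an assumed stopping rule.
    """
    level = start
    n_correct = 0
    direction = 0                 # -1 moving down, +1 moving up, 0 not yet moved
    reversals = []
    while len(reversals) < n_reversals:
        if respond(level):
            n_correct += 1
            if n_correct < 2:
                continue          # need two in a row before making it harder
            n_correct = 0
            move = -1
        else:
            n_correct = 0
            move = +1             # a single error makes the task easier
        if direction != 0 and move != direction:
            reversals.append(level)   # track changed direction at this level
        direction = move
        level = max(floor, level + move * step)
    last = reversals[-n_average:]
    return sum(last) / len(last)

# Idealized listener who is correct whenever the FM excursion is >= 3.5 Hz.
estimate = two_down_one_up(lambda level: level >= 3.5, start=10.0)
```

An individual threshold in the sense used above would then be the average of such estimates across six runs.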

3 Results

3.1 Generators of FFRENV and FFRCAR

Figure 55.1 compares measurements and model predictions of the relative strengths of FFRENV and FFRCAR at harmonics of a periodic input (F0 = 100 Hz). The lowest frequencies of the FFR are dominated by FFRENV and the higher harmonics are dominated by FFRCAR. Both FFR components approach the noise floor in the empirical measurements by 800–900 Hz, which may help explain why the percentages of FFRENV and FFRCAR in the total FFR both asymptote to 0.5 as frequency increases and why the measured FFRENV does not drop as completely or as rapidly as the modeled FFRENV as frequency increases.

Fig. 55.1. Proportion of total FFR contained in FFRENV and in FFRCAR at each harmonic of 100 Hz from (a) experimental measures and (b) model predictions

Modeling results also suggest that different acoustic frequencies contribute to FFRENV and FFRCAR. In the model, peripheral channels with the lowest characteristic frequencies (CFs) tend to contribute to FFRCAR and peripheral channels with the highest CFs contribute to FFRENV, with a crossover point of about 2 kHz (Fig. 55.2a). The model also predicts that the channels that contribute the most to the 100-Hz FFRENV for our /dah/ syllable have CFs in the mid-to-high frequency range, around 1 kHz (Fig. 55.2b).

Fig. 55.2. (a) Relative strength of FFRENV and FFRCAR generated by each peripheral channel as a function of characteristic frequency (CF). (b) Relative contribution of each CF channel to strength of FFRENV at stimulus F0 of 100 Hz

3.2 Effects of Reverberation and Age on Selective Attention

Selective attention performance decreases as reverberant energy increases, reaching chance levels for all but five listeners in the highest reverberation level (Fig. 55.3; chance performance is one-third; modeling performance as a binomial distribution of 600 independent trials, we computed the 95 % confidence interval around this level).
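The chance interval mentioned above can be reproduced with a short sketch using only the standard library. The function name is hypothetical, and an exact central binomial interval is used here as one reasonable reading of "95 % confidence interval"; the authors may have used an approximation instead. Under the normal approximation the standard deviation of the proportion is sqrt((1/3)(2/3)/600) ≈ 0.019, so the interval extends roughly 0.038 to either side of one-third.

```python
import math

def binomial_interval(n, p, coverage=0.95):
    """Central interval for the proportion correct under Binomial(n, p).

    Returns (low, high) as proportions; probabilities are accumulated in
    log space so the computation stays stable for large n.
    """
    tail = (1.0 - coverage) / 2.0
    cdf = 0.0
    low = high = None
    for k in range(n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log(1.0 - p))
        cdf += math.exp(log_pmf)
        if low is None and cdf >= tail:
            low = k                      # lower quantile
        if cdf >= 1.0 - tail:
            high = k                     # upper quantile
            break
    return low / n, high / n

# Chance performance of one-third over 600 independent trials.
chance_low, chance_high = binomial_interval(600, 1.0 / 3.0)
```

A listener whose proportion correct falls at or below `chance_high` cannot be distinguished from guessing among the three presented digits.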

Fig. 55.3. Percentage of target digits correctly reported as a function of individual listener age for the three room conditions. Open symbols show subjects not in Ruggles et al. (2011)

We quantified the fidelity of envelope temporal structure encoding for each listener as the FFRENV at 100 Hz (henceforth denoted FFRENV-100). To quantify coding of the temporal fine structure in the input stimulus, we took the average of FFRCAR for four harmonics (600–900 Hz, henceforth denoted FFRCAR-AVE). Importantly, these two statistics are not significantly correlated (r = 0.03, p = 0.905, N = 22), supporting the modeling prediction that each component reflects different aspects of temporal coding precision driven by different tonotopic portions of the auditory pathway.

We performed a multi-way, repeated-measures ANOVA on the selective attention results with factors of reverberation, age, FFRENV-100, and FFRCAR-AVE (treating reverberation as categorical and all other factors as continuous). Although there is no statistically significant main effect of age on selective attention performance (Fig. 55.3; F(1, 16) = 1.42, p = 0.251), there is a significant interaction between age and reverberation (F(1, 16) = 5.88, p = 0.025) and a significant main effect of reverberation (F(1, 16) = 155.17, p = 7.01 × 10^−11). Thus, although age does not predict how well an individual performs overall, the toll that reverberation takes increases with age.

3.3 Relationship Between FFR Components and Performance

Consistent with previous results showing that the total FFR strength at 100 Hz (a measure dominated by envelope phase locking; see Fig. 55.1) predicted selective attention ability (Ruggles et al. 2011), we find a significant main effect of FFRENV-100 on performance (F(1, 16) = 5.03, p = 0.040). Importantly, however, there is a significant interaction between age and FFRENV-100 (F(1, 16) = 4.64, p = 0.048). There is also a significant interaction between age and FFRCAR-AVE (F(1, 16) = 4.64, p = 0.047), with no main effect of FFRCAR-AVE (F(1, 16) = 0.216, p = 0.649). The regression coefficients of the ANOVA reveal that the younger a listener is, the better FFRENV-100 predicts selective attention, whereas the older a listener is, the better FFRCAR-AVE predicts it. These results suggest that FFRENV-100 and FFRCAR-AVE reflect different perceptual cues that each aid in selective auditory attention but that are weighted differently as listeners age.

3.4 Individual Differences in FFR

Figure 55.4 plots FFRENV-100 and FFRCAR-AVE as a function of age. While both components tend to decrease as age increases, age is not significantly correlated with either FFRENV-100 or FFRCAR-AVE. Notably, many of the younger adult listeners have strong FFRs (particularly for FFRENV-100), whereas nearly all the older listeners have weak FFRs. Thus, most of the variance in the FFRs comes from the younger listeners and cannot be explained by age alone.

Fig. 55.4. (a) FFRENV-100 as a function of age. (b) FFRCAR-AVE as a function of age. Open symbols show subjects not in Ruggles et al. (2011)

3.5 Relationship Between FM Detection Threshold and Performance

We previously found that FM detection threshold, a measure thought to reflect coding fidelity of temporal fine structure (Moore and Sek 1996), was also related to attention performance (Ruggles et al. 2011). This relationship remains significant with our additional subjects, as shown in Fig. 55.5.

Fig. 55.5. Percentage of target digits correctly reported as a function of FM threshold for the two levels of reverberation where performance is above chance. Open symbols show subjects not in Ruggles et al. (2011)

4 Discussion

Some previous studies have found that aging reduces FFR strength (Clinard et al. 2010); however, not all studies have found group age effects (Vander Werff and Burns 2011). Moreover, even studies that find age-related group differences have not consistently found corresponding age-related differences in auditory perceptual abilities (Clinard et al. 2010). The current study helps explain these discrepant findings, in that there is a large variation in brainstem responses even among young adults. By looking at individual subjects and considering different components of the FFR, we find reliable interactions between aging, perceptual ability, and specific components of the FFR.

Our results suggest that the FFR envelope component at the fundamental frequency of the stimulus tends to become weak as listeners reach middle age, possibly because the neural response to suprathreshold sound at acoustic frequencies in the mid-to-high frequency range (e.g., around 1,000 Hz) is reduced in overall strength. Physiological results show that noise exposure can reduce the magnitude of neural responses that are suprathreshold, even when thresholds are “normal” (Kujawa and Liberman 2009). These changes may come about because low-spontaneous-rate nerve fibers are particularly vulnerable to damage (Schmiedt et al. 1996).

In our task, performance is primarily limited by the ability to successfully direct spatial auditory attention, which may help explain why performance depends on the fidelity of envelope temporal coding. Envelope ITD cues in high-frequency sounds are known to carry spatial information; however, a number of classic laboratory experiments establish that for wideband, anechoic sounds, low-frequency carrier ITDs perceptually dominate over high-frequency spatial cues (Wightman and Kistler 1992; Macpherson and Middlebrooks 2002). The current results suggest that in reverberant settings, high-frequency ITD cues, encoded in signal envelopes, may be more important for spatial perception of wideband sounds than past laboratory studies suggest.

In anechoic conditions, temporal fine structure cues and temporal envelope cues both provide reliable information for directing selective spatial auditory attention. However, in reverberant settings, interaural decorrelation of temporal fine structure is more severe than interaural decorrelation of envelope structure; thus, high-frequency envelope ITD cues may be crucial to spatial perception in everyday settings. This possibility points to the importance of providing high-frequency amplification in assistive listening devices, which have typically focused on audibility of frequencies below 8 kHz.

Our results hint that middle-aged listeners, who have generally weak encoding of mid- to high-frequency temporal cues, rely on temporal fine structure cues to direct selective spatial auditory attention. This reliance on carrier ITD cues, which are relatively fragile in ordinary listening environments, may explain why middle-aged listeners report difficulty when trying to converse in everyday social settings. In contrast, younger listeners appear to give great perceptual weight to envelope ITD cues when directing selective attention, providing them with a more reliable cue for selective spatial auditory attention.

Acknowledgments

This work was sponsored by the National Institutes of Health (NIDCD R01 DC009477 to BGSC and NIDCD F31DC011463 to DR) and the National Security Science and Engineering Faculty Fellowship to BGSC.

References

  1. Aiken SJ, Picton TW. Envelope and spectral frequency-following responses to vowel sounds. Hear Res. 2008;245:35–47. doi: 10.1016/j.heares.2008.08.004.
  2. Clinard CG, Tremblay KL, Krishnan AR. Aging alters the perception and physiological representation of frequency: evidence from human frequency-following response recordings. Hear Res. 2010;264:48–55. doi: 10.1016/j.heares.2009.11.010.
  3. Darwin CJ. Auditory grouping. Trends Cogn Sci. 1997;1:327–333. doi: 10.1016/S1364-6613(97)01097-8.
  4. Dau T. The importance of cochlear processing for the formation of auditory brainstem and frequency following responses. J Acoust Soc Am. 2003;113:936–950. doi: 10.1121/1.1534833.
  5. Gockel HE, Carlyon RP, Mehta A, Plack CJ. The frequency following response (FFR) may reflect pitch-bearing information but is not a direct representation of pitch. J Assoc Res Otolaryngol. 2011;12:767–782. doi: 10.1007/s10162-011-0284-1.
  6. Harte JM, Ronne F, Dau T. Modeling human auditory evoked brainstem responses based on nonlinear cochlear processing. Proceedings of the 20th International Congress on Acoustics; 2010; Sydney.
  7. Kujawa SG, Liberman MC. Adding insult to injury: cochlear nerve degeneration after “temporary” noise-induced hearing loss. J Neurosci. 2009;29:14077–14085. doi: 10.1523/JNEUROSCI.2845-09.2009.
  8. Leigh-Paffenroth ED, Elangovan S. Temporal processing in low-frequency channels: effects of age and hearing loss in middle-aged listeners. J Am Acad Audiol. 2011;22:393–404. doi: 10.3766/jaaa.22.7.2.
  9. Macpherson EA, Middlebrooks JC. Listener weighting of cues for lateral angle: the duplex theory of sound localization revisited. J Acoust Soc Am. 2002;111:2219–2236. doi: 10.1121/1.1471898.
  10. Moore BC, Sek A. Detection of frequency modulation at low modulation rates: evidence for a mechanism based on phase locking. J Acoust Soc Am. 1996;100:2320–2331. doi: 10.1121/1.417941.
  11. Noble W, Naylor G, Bhullar N, Akeroyd MA. Self-assessed hearing abilities in middle- and older-age adults: a stratified sampling approach. Int J Audiol. 2012;51:174–180. doi: 10.3109/14992027.2011.621899.
  12. Ruggles D, Shinn-Cunningham B. Spatial selective auditory attention in the presence of reverberant energy: individual differences in normal-hearing listeners. J Assoc Res Otolaryngol. 2011;12:395–405. doi: 10.1007/s10162-010-0254-z.
  13. Ruggles D, Bharadwaj H, Shinn-Cunningham BG. Normal hearing is not enough to guarantee robust encoding of suprathreshold features important in everyday communication. Proc Natl Acad Sci. 2011;108:15516–15521. doi: 10.1073/pnas.1108912108.
  14. Schmiedt RA, Mills JH, Boettcher FA. Age-related loss of activity of auditory-nerve fibers. J Neurophysiol. 1996;76:2799–2803. doi: 10.1152/jn.1996.76.4.2799.
  15. Shamma SA, Micheyl C. Behind the scenes of auditory perception. Curr Opin Neurobiol. 2010;20:361–366. doi: 10.1016/j.conb.2010.03.009.
  16. Shamma SA, Elhilali M, Micheyl C. Temporal coherence and attention in auditory scene analysis. Trends Neurosci. 2011;34:114–123. doi: 10.1016/j.tins.2010.11.002.
  17. Shinn-Cunningham BG. Object-based auditory and visual attention. Trends Cogn Sci. 2008;12:182–186. doi: 10.1016/j.tics.2008.02.003.
  18. Strelcyk O, Dau T. Relations between frequency selectivity, temporal fine-structure processing, and speech reception in impaired hearing. J Acoust Soc Am. 2009;125:3328–3345. doi: 10.1121/1.3097469.
  19. Studebaker GA. A “rationalized” arcsine transform. J Speech Hear Res. 1985;28:455–462. doi: 10.1044/jshr.2803.455.
  20. Thomson DJ. Spectrum estimation and harmonic-analysis. Proc IEEE. 1982;70:1055–1096.
  21. Vander Werff KR, Burns KS. Brain stem responses to speech in younger and older adults. Ear Hear. 2011;32:168–180. doi: 10.1097/AUD.0b013e3181f534b5.
  22. Wightman FL, Kistler DJ. The dominant role of low-frequency interaural time differences in sound localization. J Acoust Soc Am. 1992;91:1648–1661. doi: 10.1121/1.402445.
