Abstract
Although repetition is the most commonly used conversational repair strategy, little is known about its relative effectiveness among listeners spanning the adult age range. The purpose of this study was to identify differences in how younger, middle-aged, and older adults were able to use immediate repetition to improve speech recognition in the presence of different kinds of maskers. Results suggest that all groups received approximately the same amount of benefit from repetition. Repetition benefit was largest when the masker was fluctuating noise and smallest when it was competing speech.
1. Introduction
Conversations in adverse listening situations often do not proceed smoothly. Listeners (especially older adults) often fail to understand what was said, resulting in a breakdown in the communication process. The most commonly used remedy for communication breakdowns is repetition (e.g., Tye-Murray, 1991). Repetition is of obvious potential benefit because it allows the listener a second exposure to the message of interest. It also buys additional time to process information before the conversation proceeds.
A phenomenon that is likely related to repetition is the benefit to speech recognition obtained when the listener is pre-cued or primed to the content of a message immediately prior to its presentation. Recent results from our lab (Costanzi et al., 2015) showed that older adults (as compared to younger adults) benefitted less from a written prime when the masker was background speech, with no apparent age effects when the masker was noise. While an age-related reduction in the processing or use of primes has also been uncovered in some other work (Wu et al., 2012; Getzmann et al., 2014), it does not appear to be a universal finding (Murphy et al., 1999; Sheldon et al., 2008; Ezzatian et al., 2011).
Based on limited previous research regarding the effect of repetition as well as on conflicting results in terms of age effects in the priming phenomenon, we conducted a study to directly examine how repetition influences speech recognition in older, middle-aged, and younger adults. The additional time to process information that occurs when a message is repeated might be especially beneficial for older adults (e.g., Tun et al., 2009; Heinrich and Schneider, 2011). On the other hand, older adults may be less able to benefit from repetition if they have difficulty holding the initial presentation in memory (Wingfield et al., 1994; Wu et al., 2012).
Our recent finding of reduced priming effects in older adults in the presence of a speech masker (but not a noise masker) suggests that age-related changes might be influenced by the nature of the background sound. Hence, we investigated repetition benefit in the presence of three types of maskers: a competing speech sentence, a steady-state spectrally shaped noise, and a temporally fluctuating noise. During speech masking trials, both the target sentence and the masking sentence were repeated. Repeating the entire stimulus should act to decrease uncertainty and so potentially could be useful for reducing informational masking. Results of some priming studies suggest that priming is more beneficial in reducing informational masking (that is, masking that is caused by perceptual confusion or uncertainty regarding target versus masker) than energetic masking (the more traditional view of masking as a peripheral process caused by physical interference), perhaps because it helps segregate the target utterance from the masking utterance (Freyman et al., 2004; Ezzatian et al., 2011). On the other hand, repetition of both target and masker may lead to priming of the masking speech message, making it more salient. This effect might be even larger for older adults, who are likely at a disadvantage in terms of cognitive skills needed to segregate target from masking speech and inhibit the distracting message (e.g., Neher et al., 2009; Neher et al., 2012). In terms of ecological validity, this type of scenario (where both target and masker are repeated) occurs when listening to a repetition of recorded speech (for example, re-playing a voicemail message that contains background speech). It does differ, however, from other real-life situations where the person repeating the message does so in a manner that makes it more understandable (e.g., louder, clearer, and/or slower) and when the background sounds (either speech or noise) differ to some extent between initial presentation and repetition.
We also compared repetition benefit in the presence of two types of noise maskers: steady-state speech spectrum noise and temporally fluctuating noise. We reasoned that repetition might be more beneficial in a fluctuating versus in a steady-state noise masker as, in the former case, listeners could get another chance to use glimpses of the target during low-energy epochs. Since a potential priming effect in the noise maskers was not an issue, we used different tokens of noise during initial presentation and repetition.
2. Methods
Participants in this study were older (61–81 years, mean 69 years), middle-aged (40–59 years, mean 53 years) and younger (19–22 years, mean 20 years) adults (n = 16 per group). Exclusion criteria included a history of neurological or otologic disorder, English not learned as a first language, and hearing aid use. Middle-aged and older participants were required to score at least 26 out of 30 points on the Mini-Mental State Exam (Folstein et al., 1975). Pure-tone testing demonstrated mean high-frequency average (2 kHz–6 kHz) values of 25 dB hearing level (HL) in the older group (range: 6–44 dB HL), 13 dB HL in the middle-aged group (range: 5–44 dB HL), and 4 dB HL in the younger group (range: 0–15 dB HL). All participants were required to have bilaterally normal tympanograms on the test day in order to rule out a middle-ear component to any observed hearing loss.
Stimuli for this study were selected sentences from the TVM (Theo-Victor-Michael) corpus (Helfer and Freyman, 2009). Sentences for the present experiment each contained two two-syllable nouns that were used for scoring (e.g., “Theo discussed the army and the toothpaste today”). Target sentences were presented in three types of maskers: a single-talker same-sex masker reciting a TVM sentence, a steady-state noise (SSN) that was spectrally shaped from a sample of running speech from a female talker, and a single-channel envelope modulated (SEM) noise. The SEM noise was generated by extracting the wideband temporal envelope from the speech sample via rectification and low-pass filtering at 20 Hz, then using this envelope to modulate the SSN. During trials in which the masker was speech, the masking sentence and target sentence differed by both talker and cue name (the first word of the sentence, which began with Theo, Victor, or Michael). The cue name for the target sentence was presented on a computer monitor just prior to and during each trial.
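The SEM noise construction described above can be illustrated with a minimal sketch. The text specifies only rectification and low-pass filtering at 20 Hz; the full-wave rectifier, the first-order (one-pole) filter, the RMS equalization step, and all function names below are assumptions made for illustration, not the procedure actually used to generate the stimuli.

```python
import math
import random


def lowpass_one_pole(signal, cutoff_hz, fs):
    """Smooth a signal with a first-order (one-pole) low-pass filter (assumed filter type)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    out, state = [], 0.0
    for x in signal:
        state += alpha * (x - state)
        out.append(state)
    return out


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def make_sem_noise(speech, ssn, fs, cutoff_hz=20.0):
    """Modulate steady-state noise with the wideband envelope of a speech sample."""
    rectified = [abs(x) for x in speech]  # full-wave rectification (assumption)
    envelope = lowpass_one_pole(rectified, cutoff_hz, fs)
    modulated = [e * n for e, n in zip(envelope, ssn)]
    # Rescale so the modulated noise has the same RMS as the unmodulated noise (assumption).
    gain = rms(ssn) / rms(modulated)
    return [gain * m for m in modulated]


# Toy input: a stand-in for speech that is "on" for 0.5 s and silent for 0.5 s.
fs = 8000
rng = random.Random(0)
speech = [rng.gauss(0, 1) if i < fs // 2 else 0.0 for i in range(fs)]
ssn = [rng.gauss(0, 1) for _ in range(fs)]
sem = make_sem_noise(speech, ssn, fs)
```

In this toy case the resulting noise carries most of its energy in the first half second, mirroring the envelope of the stand-in speech, which is the property that gives listeners low-energy epochs in which to glimpse the target.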
Testing was conducted in an IAC double-walled sound booth (#1604A). Target sentences were presented from a front loudspeaker located 1.3 meters from the listener at a height approximating that of a seated adult (1.2 m). The masker was presented from both the front and a side (60° to the right) loudspeaker with a 4-ms time delay favoring the side loudspeaker. Due to the precedence effect, this spatial configuration led to the perception of the masker being spatially separated from the target (Freyman et al., 1999). We verified that all participants indeed localized the masker to the right prior to the initiation of data collection.
Target sentences were presented at an average RMS of 68 dBA. We chose to use different (but overlapping) sets of signal-to-noise ratios (SNRs) for the listener groups to enable comparison of repetition benefit at approximately equivalent levels of performance. SNRs for the younger participants were −8, −4, and 0 dB. SNRs for the middle-aged and older listeners were −4, 0, and +4 dB. SNRs are expressed in relation to the total masker energy. For example, in the 0 dB SNR condition, the combined energy of the masker from the front and side loudspeakers was equal to the level of the target.
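As a worked illustration of the SNR convention, the sketch below computes the total masker level implied by each SNR for a 68-dBA target. The per-loudspeaker value additionally assumes that the front and side loudspeakers contribute equal levels that sum in power; that is a simplification introduced here (the two outputs are delayed copies of the same signal, so the actual summation depends on frequency), and the function names are hypothetical.

```python
import math


def total_masker_level(target_db, snr_db):
    """Total masker level (dB) that yields the requested SNR re: the target."""
    return target_db - snr_db


def per_loudspeaker_level(total_db, n_speakers=2):
    """Level of each loudspeaker if equal-level sources sum in power (assumption)."""
    return total_db - 10.0 * math.log10(n_speakers)


TARGET_DBA = 68.0
for snr in (-8, -4, 0, 4):
    total = total_masker_level(TARGET_DBA, snr)
    each = per_loudspeaker_level(total)
    print(f"SNR {snr:+d} dB: total masker {total:.1f} dBA, each loudspeaker {each:.1f} dBA")
```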
On half of all trials, the target sentence and its accompanying masker were presented twice in a row. On the other half of trials, stimuli were presented just once. When speech was the masker, the repetition used the exact same stimulus (target + masker) that was played during the initial presentation. During noise masking trials, a different sample of noise was played during the repetition. In order to reduce the possibility that listeners would not try fully on the first presentation if they knew they would get a second chance to hear the stimulus, participants were not informed a priori regarding whether or not the subsequent trial was going to be repeated. Participants were instructed to report the target sentence after each individual presentation (so they responded twice during repeated trials). Trials were blocked by masker type and SNR, with 30 sentences per block (20 unique sentences, with ten of them repeated) and three blocks per masker/SNR combination, for a total of 540 trials. Repeated and non-repeated trials were presented randomly within a block. Before data collection began, a practice block of six trials was completed.
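The trial structure can be checked with a short sketch. It assumes that each listener completed three maskers at three SNRs with three blocks per combination, and that the ten repeated sentences in a block were chosen at random; the helper names and the seeded random generator are illustrative only.

```python
import random


def make_block(sentences, n_repeated=10, seed=None):
    """Return a shuffled list of (sentence, is_repeated) trials for one block."""
    rng = random.Random(seed)
    repeated = set(rng.sample(range(len(sentences)), n_repeated))
    trials = [(s, i in repeated) for i, s in enumerate(sentences)]
    rng.shuffle(trials)  # repeated and non-repeated trials are intermixed
    return trials


def count_presentations(trials):
    """Repeated trials contribute two presentations; the others contribute one."""
    return sum(2 if is_repeated else 1 for _, is_repeated in trials)


N_MASKERS, N_SNRS, BLOCKS_PER_CONDITION = 3, 3, 3
block = make_block([f"sentence_{i:02d}" for i in range(20)], seed=1)
n_blocks = N_MASKERS * N_SNRS * BLOCKS_PER_CONDITION
total_trials = n_blocks * len(block)
total_presentations = n_blocks * count_presentations(block)
print(total_trials, total_presentations)
```

Under these assumptions, 27 blocks of 20 trials give the 540 trials stated above, and because half of the trials in each block are presented twice, each listener hears 30 sentence presentations per block.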
It should be noted that participants also completed a memory task at the end of each block of speech perception measurement as well as a battery of cognitive tests. Results of those measures are not reported in this paper.
3. Results
We compared performance across the groups on all first attempts (trials that were not repeated as well as the first presentation of sentences that were repeated). Figure 1 shows these data. Performance of older subjects was poorer than that of younger participants in all conditions. Notable is the finding that the middle-aged subjects performed almost identically to the younger participants in the presence of the two types of noise maskers (Fig. 1, top and middle panels) but performed more poorly when the masker was a competing sentence (Fig. 1, bottom panel). A repeated-measures analysis of variance (ANOVA) on the proportional data transformed into rationalized arcsine units (Studebaker, 1985) was completed using only data from the two SNRs that were run on all participants (0 and −4 dB) in order to reduce the influence of floor/ceiling effects. This analysis showed significant main effects for all factors (masker type, SNR, and group) and two significant interactions: masker type × group [F(4, 88) = 3.73, p < 0.008] and masker type × SNR [F(2, 44) = 41.03, p < 0.001].
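For readers unfamiliar with the transform, the following is a minimal sketch of the rationalized arcsine unit (RAU) computation as defined by Studebaker (1985). The function name and the example item count are assumptions; the number of scored items per condition in the analysis above may differ.

```python
import math


def rau(n_correct, n_items):
    """Rationalized arcsine units for n_correct out of n_items scored items."""
    theta = (math.asin(math.sqrt(n_correct / (n_items + 1)))
             + math.asin(math.sqrt((n_correct + 1) / (n_items + 1))))
    return (146.0 / math.pi) * theta - 23.0


# Hypothetical example: 50 of 100 key words correct maps to approximately 50 RAU.
print(round(rau(50, 100), 2))
```

The transform is nearly linear with percent correct in the mid range but stretches values near 0% and 100%, which stabilizes the variance of proportional scores before they are submitted to ANOVA.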
Fig. 1.
Speech recognition performance for first attempts for each subject group, by masker type (SEM noise: top panel; SSN: middle panel; speech masker: bottom panel). Error bars represent the standard error.
The primary purpose of this research was to compare benefit from repetition between groups in the different types of maskers. Figure 2 shows repetition benefit, defined as the simple difference in percent-correct performance between first and second attempts. There was little evidence that aging negatively affected benefit from repetition. Figure 2 also demonstrates that repetition benefit was most substantial in the presence of the SEM masker and smallest when the masker was competing speech. Repeated-measures ANOVA was conducted to examine the effects of group, SNR, and masker type (using −4 dB and 0 dB SNRs, the two values for which data were collected for all participants) on repetition benefit. This analysis confirmed that there was no evidence that aging led to a reliable decrease in repetition benefit. Results indicated significant effects of SNR [F(1, 45) = 19.61, p < 0.001] and masker [F(1, 45) = 13.33, p = 0.001] with no significant main or interaction effects involving listener group.
Fig. 2.
Repetition benefit (second attempt – first attempt) by subject group, SNR, and masker type. The top, middle, and bottom panels show repetition benefit obtained in the SEM, SSN, and speech maskers, respectively. Error bars represent the standard error.
We also examined repetition benefit for each type of masker at the SNR for each group that led to performance closest to 60% correct recognition on first attempts. This criterion was selected because data were available for all three groups in all three maskers in the vicinity of the 60% correct level (see Fig. 1), and because it avoided floor and ceiling effects. For the two noise maskers, this occurred at −4 dB SNR for all groups. Benefit in the presence of the speech masker was derived from data obtained at 0 dB for the older participants, −4 dB for the middle-aged participants, and −8 dB for the younger participants. As seen in Table 1, when benefit is considered at performance approximating 60% correct on first presentations, the finding of minimal differences between groups persists. Repeated-measures ANOVA with masker type as a within-subjects variable and group as the between-subjects factor confirmed this observation, with a significant main effect of masker [F(1, 45) = 38.89, p < 0.001] and non-significant main and interaction effects involving group. This supports our contention that benefit from repetition, at least in the conditions measured in this study, is stable into older adulthood.
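The selection rule described above (take, for each group and masker, the SNR at which first-attempt performance was closest to 60% correct, then compute benefit at that SNR) can be sketched as follows. The scores in the example are hypothetical values chosen for illustration and are not data from this study; the function names are likewise assumptions.

```python
def snr_closest_to(first_attempt_pct, target_pct=60.0):
    """Return the SNR whose first-attempt score is nearest to target_pct."""
    return min(first_attempt_pct,
               key=lambda snr: abs(first_attempt_pct[snr] - target_pct))


def repetition_benefit(first_pct, second_pct):
    """Simple difference in percent correct: second attempt minus first attempt."""
    return second_pct - first_pct


# Hypothetical percent-correct scores keyed by SNR in dB (not the study's data).
first = {-4: 35.0, 0: 58.0, 4: 80.0}
second = {-4: 40.0, 0: 64.0, 4: 84.0}
snr = snr_closest_to(first)
benefit = repetition_benefit(first[snr], second[snr])
print(snr, benefit)
```

Equating first-attempt performance in this way keeps the comparison of benefit across groups from being driven by differences in how much room for improvement each group had.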
Table 1.
Repetition benefit (difference in percent correct for second attempt − first attempt) at SNRs leading to approximately 60% correct during first attempts in the presence of each type of masker. Values in parentheses represent the standard error.
| Group | SSN | SEM | Speech |
|---|---|---|---|
| Older | 11 (2) | 16 (2) | 6 (2) |
| Middle-aged | 10 (1) | 16 (1) | 6 (2) |
| Younger | 12 (3) | 15 (1) | 9 (2) |
4. Discussion
The primary purpose of this study was to examine how repetition affects the recognition of speech information across the adult age range. There was little evidence that aging negatively influences repetition benefit. Younger, middle-aged, and older participants were able to take advantage of immediate repetition to a similar extent, with maximum repetition benefit of 16–19 percentage points (in the presence of the temporally fluctuating noise masker at the most adverse SNRs). These values were considerably larger than the benefit from repetition obtained with the steady-state noise, which was, at most, 10–12 percentage points. We believe that the relatively large benefit when listening in the presence of the fluctuating noise was because it provided listeners with additional glimpses of the target in time epochs of low masker energy.
Repetition benefit in the presence of the speech masker was substantially smaller than what was obtained with either noise masker (3–6 percentage points improvement for the older and middle-aged subjects, and 5–9 percentage points for the younger participants). This comparison should be interpreted with caution because the maskers differed not only acoustically, but in how repetition was implemented: the speech masker itself was repeated while, in trials with noise maskers, the precise noise sample differed between initial presentation and repetition. Repetition might have been less effective in the presence of competing speech because the masker also was presented a second time, leading to additional interference. Consistent with this idea, Rivenez et al. (2006) demonstrated that repetition priming can occur for stimuli in an unattended channel, which leads to the possibility that the masker also was primed in our paradigm. Based on our data, it does not appear that older adults are particularly susceptible to any potential increase in saliency produced when the masker is repeated, as there were no statistically significant main or interaction effects in repetition benefit involving subject group. However, since both the target and speech masker were repeated in the present study, disentangling repetition effects for the target from those for the masker will require further study.
Results of this experiment demonstrate that individuals spanning the adult age range benefit from the immediate repetition of a speech message. Additional research will need to be conducted to elucidate mechanisms involved in repetition benefit in the presence of different types of maskers. In the present study, repetition benefit was smallest in the presence of a competing speech message. Of interest is determining if this trend persists when the target speech is repeated while the speech masker varies between initial presentation and repetition. More work also needs to be completed to explain the large repetition benefit that was found in the presence of a temporally fluctuating noise masker. Since different tokens of the masker were played during first and second presentations, it seems that listeners may be able to integrate information from glimpses of different segments of a message.
Acknowledgments
We thank Angela Costanzi, Sarah Laakso, Gabrielle Merchant, Michael Rogers, and Eva Goldwater for their assistance with this project. This work was supported by Grant No. R01 DC012057 from NIH/NIDCD.
References
- 1. Costanzi, A., Helfer, K. S., and Freyman, R. L. (2015). “Effects of hearing and cognitive skills on priming in older and younger adults,” presented at the American Auditory Society Conference, Scottsdale, AZ.
- 2. Ezzatian, P., Schneider, B. A., Pichora-Fuller, M. K., and Li, L. (2011). “The effect of priming on release from informational masking is equivalent for younger and older adults,” Ear Hear. 32, 84–96. doi:10.1097/AUD.0b013e3181ee6b8a
- 3. Folstein, M. F., Folstein, S. E., and McHugh, P. R. (1975). “Mini-mental state: A practical method for grading the cognitive state of patients for the clinician,” J. Psych. Res. 12, 189–198. doi:10.1016/0022-3956(75)90026-6
- 4. Freyman, R. L., Balakrishnan, U., and Helfer, K. S. (2004). “Effect of number of masking talkers and auditory priming on informational masking in speech recognition,” J. Acoust. Soc. Am. 115, 2246–2256. doi:10.1121/1.1689343
- 5. Freyman, R. L., Helfer, K. S., McCall, D. D., and Clifton, R. K. (1999). “The role of perceived spatial separation in the unmasking of speech,” J. Acoust. Soc. Am. 106, 3578–3588. doi:10.1121/1.428211
- 6. Getzmann, S., Lewald, J., and Falkenstein, M. (2014). “Using auditory pre-information to solve the cocktail-party problem: Electrophysiological evidence for age-specific differences,” Front. Neurosci. 8, 413. doi:10.3389/fnins.2014.00413
- 7. Heinrich, A., and Schneider, B. A. (2011). “Elucidating the effects of ageing on remembering perceptually distorted word pairs,” Quart. J. Exp. Psychol. 64, 186–205. doi:10.1080/17470218.2010.492621
- 8. Helfer, K. S., and Freyman, R. L. (2009). “Lexical and indexical cues in masking by competing speech,” J. Acoust. Soc. Am. 125, 447–456. doi:10.1121/1.3035837
- 9. Murphy, D. R., McDowd, J. M., and Wilcox, K. A. (1999). “Inhibition and aging: Similarities between younger and older adults as revealed by the processing of unattended auditory information,” Psych. Aging 14, 44–59. doi:10.1037/0882-7974.14.1.44
- 10. Neher, T., Behrens, T., Carlile, S., Jin, C., Kragelund, L., Petersen, A. S., and van Schaik, A. (2009). “Benefit from spatial separation of multiple talkers in bilateral hearing aid users: Effects of hearing loss, age, and cognition,” Int. J. Aud. 48, 758–774. doi:10.3109/14992020903079332
- 11. Neher, T., Lunner, T., Hopkins, K., and Moore, B. C. J. (2012). “Binaural temporal fine structure sensitivity, cognitive function, and spatial speech recognition of hearing-impaired listeners,” J. Acoust. Soc. Am. 131, 2561–2564. doi:10.1121/1.3689850
- 12. Rivenez, M., Darwin, C. J., Bourgeon, L., and Guillaume, A. (2006). “Unattended speech processing: Effect of vocal-tract length,” J. Acoust. Soc. Am. 121, EL90–EL95. doi:10.1121/1.2430762
- 13. Sheldon, S., Pichora-Fuller, M. K., and Schneider, B. A. (2008). “Priming and sentence context support listening to noise-vocoded speech by younger and older adults,” J. Acoust. Soc. Am. 123, 489–499. doi:10.1121/1.2783762
- 14. Studebaker, G. A. (1985). “A ‘rationalized’ arcsine transform,” J. Speech Hear. Res. 28, 455–462. doi:10.1044/jshr.2803.455
- 15. Tun, P. A., McCoy, S., and Wingfield, A. (2009). “Aging, hearing acuity, and the attentional costs of effortful listening,” Psych. Aging 24, 761–766. doi:10.1037/a0014802
- 16. Tye-Murray, N. (1991). “Repair strategy usage by hearing-impaired adults and changes following communication therapy,” J. Speech Hear. Res. 34, 921–928. doi:10.1044/jshr.3404.921
- 17. Wingfield, A., Alexander, A. H., and Cavigelli, S. (1994). “Does memory constrain utilization of top-down information in spoken word recognition? Evidence from normal aging,” Lang. Speech 37, 221–235. doi:10.1177/002383099403700301
- 18. Wu, M., Li, H., Hong, Z., Xian, X., Li, J., Wu, X., and Li, L. (2012). “Effects of aging on the ability to benefit from prior knowledge of message content in masked speech recognition,” Speech Commun. 54, 529–542. doi:10.1016/j.specom.2011.11.003


