Word identification with temporally-interleaved competing sounds by younger and older adult listeners

Karen S Helfer; Sarah F Poissant; Gabrielle R Merchant

doi:10.1097/AUD.0000000000000786

Author manuscript; available in PMC: 2021 May 1.

Published in final edited form as: Ear Hear. 2020 May-Jun;41(3):603–614. doi: 10.1097/AUD.0000000000000786

PMCID: PMC7080604 NIHMSID: NIHMS1534127 PMID: 31567564

Abstract

Objective

The purpose of this experiment was to contribute to our understanding of the nature of age-related changes in competing speech perception using a temporally-interleaved task.

Design

Younger and older adults (n=16/group) participated in this study. The target was a five-word sentence. The masker was one of the following: another five-word sentence; five brief samples of modulated noise; or five brief samples of environmental sounds. The stimuli were presented in a temporally interleaved manner, where the target and masker alternated in time, always beginning with the target. Word order was manipulated in the target (and in the masker during trials with interleaved words) to compare performance when the five words in each stream did vs. did not create a syntactically correct sentence. Talker voice consistency also was examined by contrasting performance when each word in the target was spoken by the same talker or by different talkers; a similar manipulation was used for the masker when it consisted of words. Participants were instructed to repeat back the target words and ignore the intervening words or sounds. Participants also completed a subset of tests from the NIH Cognitive Toolbox.

Results

Performance on this interleaved task was significantly associated with listener age and with a metric of cognitive flexibility, but it was not related to degree of high-frequency hearing loss. Younger adults’ performance on this task was better than that of older adults, especially for words located toward the end of the sentence. Both groups of participants were able to take advantage of correct word order in the target, and both were negatively affected, to a modest extent, when the masker words were in correct syntactic order. The two groups did not differ in how phonetic similarity between target and masker words influenced performance, and interleaved environmental sounds or noise had only a minimal effect for all listeners. The most robust difference between listener groups was found for the use of voice consistency: older adults, as compared to younger adults, were less able to take advantage of a consistent target talker within a trial.

Conclusions

Younger adults outperformed older adults when masker words were interleaved with target words. Results suggest that this difference was unlikely to be related to energetic masking and/or peripheral hearing loss. Rather, age-related changes in cognitive flexibility as well as problems encoding voice information appeared to underlie group differences. These results support the contention that, in real-life competing speech situations that produce both energetic and informational masking, older adults’ problems are due to both peripheral and non-peripheral changes.

INTRODUCTION

Older adults have particular difficulty in situations when more than one person is speaking simultaneously (e.g., Tun et al. 2002; Humes et al. 2006; Helfer & Freyman 2008, 2014; Rossi-Katz & Arehart 2009; Koelewijn et al. 2012, 2014; Lee & Humes 2012; Jesse & Janse 2012; Anderson et al. 2013; Woods et al. 2013; Füllgrabe et al. 2015; Helfer & Jesse 2015; Souza & Arehart 2015). Why are these situations so problematic as people age? When target and masking speech are presented simultaneously, there is the potential for both energetic masking (when elements of the target acoustic signal are obscured by the masker) and informational masking (higher-level perceptual interference related to target-masker confusion or distraction from the masker) to be produced. The hearing loss that typically accompanies the aging process leads to increased susceptibility to energetic masking, at least in part because of reduced ability to use brief instances of low energy within a fluctuating masker to glimpse the target (e.g., Takahashi & Bacon 1992; Dubno et al. 2002, 2003; Summers & Molis 2004; George et al. 2006). It also is likely that higher-level factors contribute to greater interference from informational masking. In complex listening situations individuals must attend to the message of interest while inhibiting distractions from environmental sounds and competing speech, which requires cognitive mediation. There is ample evidence for strong connections between age-related declines in cognitive abilities and speech understanding in competing speech situations (e.g., Tun & Wingfield 1999; Humes et al. 2006; Koelewijn et al. 2012; Anderson et al. 2013; Desjardins & Dougherty 2013; Woods et al. 2013).

Cognitive contributors to speech understanding in older adults (for example, inhibitory deficits and limitations in working memory) are difficult to tease apart from problems related to peripheral changes in auditory functioning when using conventional tasks of masked speech understanding. This is because both cognitive changes and the sensorineural hearing loss that is ubiquitous in older adults can influence speech perception. One way to reduce or eliminate energetic masking (and therefore the impact of peripheral factors) is by using a non-simultaneous masking task. In the task used in the present study, participants hear two temporally interleaved target and masker streams and are asked to repeat every other word and ignore intervening words or sounds. Since the to-be-ignored stimuli and the to-be-attended words do not overlap in time, little or no energetic masking is produced, with the exception of the possibility of a small amount of forward or backward masking. This paradigm allows for a more direct view of whether aging affects the ability to ignore competing speech, without the confounding factor of age-related susceptibility to energetic masking or other peripheral factors that influence the perception of simultaneously-masked speech.

Interleaved word tasks have been used primarily to study the cues that listeners can exploit to link together words into streams. For example, with this type of paradigm Kidd et al. (2008) found that young, normally-hearing adults can use a variety of cues (talker consistency, syntactically-correct word order, and spatial location) to assist in correctly recalling target words and ignoring masker words. Best et al. (2011) also used interleaved stimuli to compare performance of younger normally-hearing and younger hearing-impaired adults. Their results showed little difference in perception between these two groups of participants when the words were presented with no temporal overlap. In both of these studies, performance when masker words were temporally interleaved with target words was poorer than when only target words were presented, a pattern consistent with the task requiring some level of selective attention (Broadbent 1952) and/or with the idea that the to-be-ignored interleaved words interfered with processes required to remember the target words (Kidd et al. 2008).

A study from our lab (Helfer et al. 2013) designed to contrast the performance of older and younger adults on an interleaved word task suggested that differences do occur between these two groups of participants. In that study, older individuals (who, for the most part, had some degree of age-related high-frequency hearing loss) were found to perform significantly poorer than younger adults, even though scores were close to ceiling for all listeners when the target words were presented with no intervening words. Older adults also made more “masker errors” (reporting a masker word instead of a target word) than did younger participants, perhaps pointing to difficulty inhibiting irrelevant information. Also of note was that short-term memory, as measured by a forward digit span task, was significantly associated with task performance, while degree of pure-tone hearing loss was not. However, older adults were just as capable of using linkage cues (correct word order, consistent talker voice, spatial cues) as were younger participants. These results raised the question of whether older adults showed reduced performance because they were more prone to interference from the masking words, as compared to younger listeners. That question could not be directly addressed in the Helfer et al. (2013) study because linkage factors (i.e., whether words were presented in correct syntactic order vs. random order, or whether the same talker spoke each word on a given trial vs. when different talkers spoke the words) co-varied between target and masker. For example, in syntactically-correct conditions, both the target and masking words were presented in correct syntactic order, and when the target talker was the same for each word within the trial, the masking talker also was consistent within the trial. This led to an inability to examine these factors in the to-be-attended stimuli separately from how they affected performance when applied to the to-be-ignored words.

The present study was designed to address several questions: were differences in performance noted in our earlier study due to differences in processing the target words and/or differences in processing the masking words? To what extent could the results be explained by increased susceptibility to non-simultaneous masking on the part of older participants, and to what extent were they due to cognitive factors? Were differences between older and younger listeners related to general susceptibility to distraction from competing sounds, or were they caused by lexical interference produced by competing speech? And digging deeper into this question, are masker words that are lexical neighbors with target words more disruptive than masker words that are phonetically different from the target? These questions, discussed in more detail below, are addressed in the current paper.

One still-unresolved issue regarding age-related changes in speech understanding is the extent to which problems experienced in competing speech situations are due to difficulty inhibiting the masker. Some work using simultaneous speech-on-speech tasks supports the view that older adults have greater difficulty inhibiting meaningful speech maskers, as compared to younger adults (e.g., Sommers & Danielson 1999; Tun & Wingfield 1999; Tun et al. 2002; Helfer & Freyman 2008). In the present study, we independently manipulated linkage variables in the to-be-attended and to-be-ignored streams in order to gain a better understanding of age-related differences in the processing of masked speech. Previous research using interleaved word tasks suggests robust effects of word order, with performance greatly enhanced when target words are in correct syntactic order vs. when they are presented in random order. However, this same manipulation has little effect for young, normal-hearing listeners when applied to masking speech (Kidd et al. 2008). Our hypothesis is that listeners who are efficient at ignoring the masker should be less affected by masker word order, while performance would be poorer when masker words are in correct (vs. random) order in listeners who have difficulty inhibiting the masker. A similar prediction could be made for voice consistency: talker consistency in the masker should matter little to listeners who can inhibit those words. In the present study we also examined the proportion of masker errors (when participants reported a to-be-ignored word instead of a target word) as a secondary indicator of the extent to which participants processed the masker.

Although the temporally-interleaved nature of this task greatly reduced energetic masking, there is the possibility of forward and/or backward masking influencing performance. Results of the Kidd et al. (2008) study suggest that forward and/or backward masking was unlikely to explain why interleaved words produce interference, since interleaved samples of noise had little effect for younger, normal-hearing listeners. However, Best et al. (2011) concluded that they could not rule out the influence of non-simultaneous masking on performance by their younger participants with hearing impairment. Indeed, there is evidence that, compared to younger adults with normal hearing, older adults with hearing loss (Svec et al. 2016; Fogerty et al. 2017) and without hearing loss (Fogerty et al. 2017) are more susceptible to forward masking, raising questions about the extent to which performance of older adults in our previous study (Helfer et al. 2013) was influenced by non-simultaneous masking. In the present study we used trials in which samples of noise were temporally interleaved with target words in order to identify potential effects of non-simultaneous masking. Based on results of Kidd et al. (2008), we anticipate that interleaved noise will have little to no effect on speech understanding in our younger (normally-hearing) participants. If this is not the case for our older participants, it would suggest that non-simultaneous masking contributed to performance on the interleaved word task for these individuals.

Another question of interest was whether older adults’ decline in performance on the interleaved task was due to increased lexical activation of to-be-ignored words or to a general increase in distraction. One way to parse this out is by manipulating the to-be-ignored interleaved sounds. According to the duplex-mechanism theory of auditory distraction, there are at least two ways that background sound can lead to a reduction in performance on a primary task: by interfering with deliberate processing of the target task, and by diverting attention away from the task (Hughes 2014). The former happens when background sounds are processed semantically (e.g., Marsh & Jones 2011) while the latter can occur with any sound that captures attention. In the present study, we compared interleaved masker words with interleaved samples of environmental sounds. If older adults’ performance is more disrupted by non-speech sounds than young listeners’, it would provide support for a generalized susceptibility to distraction from sounds that is not necessarily based upon lexical competition or other speech-related factors. Conversely, if group differences are only found for the interleaved speech maskers, it suggests that lexical factors play a role in this aging effect.

Although there is a body of work addressing the ability to identify meaningful environmental sounds by both younger (e.g., Shafiro 2008) and older (e.g., Saygin et al. 2005) adults, little is known about how the presence of environmental sounds influences speech understanding. Previous work has demonstrated that environmental sounds that are congruent with a visually-presented picture (e.g., the sound of a dog barking paired with a picture of a dog) can enhance picture naming (e.g., Chen & Spence 2010, 2011) while environmental sounds that are semantically related to the picture (e.g., the sound of a horse neighing with a picture of a dog) can interfere with picture naming, as compared to neutral sounds (e.g., drumming) (Madebach et al. 2017). This pattern of facilitation/interference is similar to what occurs when the distractors are printed words rather than semantically related sounds (e.g., Glaser & Dungelhoff 1984; Schriefers et al. 1990; Jescheniak et al. 2009). Hence, it seems that background environmental sounds can carry meaning (and cause distraction/interference) in a way similar to what occurs with competing speech. In the present study we used samples of environmental sounds interleaved with target words to compare how older and younger adults cope with any potential interference from these sounds.

Finally, the present study addressed the influence of phonetic similarity between target and masker. A common way of conceptualizing phonetic similarity is in terms of lexical neighborhood (e.g., Luce & Pisoni 1998; Dirks et al. 2001). Words that differ from each other by one phoneme are considered lexical neighbors. Determining lexical neighborhood effects in competing speech tasks is complicated when using simultaneous competing speech techniques because spectral overlap between targets and maskers leads to greater energetic masking for lexical neighbors than for non-neighbors, which do not share most phonemes. We took advantage of the lack of energetic masking in the interleaved word paradigm to investigate the extent to which this kind of similarity between target and masker mediates performance in older vs. younger adults. We hypothesized that older adults would be at a greater disadvantage than younger adults when to-be-attended and to-be-ignored words were lexical neighbors because of age-related difficulty inhibiting lexical competitors (e.g., Sommers 1996; Sommers & Danielson 1999; Lash et al. 2013). Previous work in our lab using a simultaneous competing speech task indicated that lexical characteristics of words in a to-be-ignored speech stream can influence target word recognition (Helfer & Jesse 2015). Using the interleaved word paradigm allowed us to examine this potential lexical competition without the influence of energetic masking.

In summary, the study described in this paper used an interleaved word task to examine several aspects of age-related changes in speech understanding. We probed age differences in susceptibility to the masker by manipulating linkage cues (syntactically-correct vs. random word order and consistent vs. varied talker voice) independently in the target and masker words, and by examining the proportion of masker intrusion errors produced by each participant group. In regard to whether group differences were due to general distraction or to lexical interference, we compared performance in trials with interleaved words with trials containing interleaved environmental sounds. We assessed how phonetic similarity between target and masking words impacted perception in younger vs. older adults by contrasting performance when target and masker words were lexical neighbors vs. when they shared fewer phonemes. Finally, we quantified the extent to which performance on this interleaved task could be explained by non-simultaneous masking vs. the extent to which it could be attributed to cognitive abilities in two ways: by examining performance when samples of noise were interspersed with the target words (to determine whether non-simultaneous masking influenced performance), and by using correlation analysis to identify connections among interleaved task performance, cognitive task performance, hearing loss, and age.

MATERIALS AND METHODS

Participants

Participants in this study were 16 younger (19–24 years, mean 20 years) and 16 older (60–76 years, mean 67 years) native-English-speaking adults with a negative history of ear problems or neurologic disorders. Younger adults had normal pure-tone thresholds (less than 25 dB HL from 250 Hz to 8000 Hz) in both ears. Their mean better-ear high-frequency average threshold, or HFPTA (for 2000 Hz to 6000 Hz pure tones), was 3 dB HL, with a range of −5 dB HL to 9 dB HL. Older participants were required to have HFPTA in each ear no greater than 60 dB HL and symmetric thresholds (defined as no greater than 10 dB interaural difference in HFPTA). The mean better-ear HFPTA for these individuals was 18 dB HL (range = 6 dB HL to 39 dB HL). Figure 1 shows audiometric thresholds for both participant groups. Tympanograms measured on the test day were normal (Type A) for all participants. Additionally, older participants were required to score at least 26 on the Mini-Mental State Examination (Folstein et al. 1975). This project was approved by the University of Massachusetts Amherst Institutional Review Board.
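The audiometric inclusion criteria described above can be sketched in a few lines of Python. This is only an illustration: the text gives the HFPTA band (2000 Hz to 6000 Hz) but not the exact test frequencies averaged, so the set of frequencies below is an assumption, as are the function names and the audiogram values.

```python
from statistics import mean

# Assumed audiometric frequencies inside the 2000-6000 Hz band; the paper
# gives only the band edges, so this exact set is an assumption.
HF_FREQS_HZ = (2000, 3000, 4000, 6000)

def hfpta(thresholds_db_hl):
    """Average the thresholds (dB HL) at the assumed high frequencies."""
    return mean(thresholds_db_hl[f] for f in HF_FREQS_HZ)

def better_ear_hfpta(left, right):
    """The better ear is the one with the lower HFPTA."""
    return min(hfpta(left), hfpta(right))

def older_adult_eligible(left, right, max_hfpta=60, max_asymmetry=10):
    """Stated criteria: HFPTA no greater than 60 dB HL in each ear and an
    interaural HFPTA difference no greater than 10 dB."""
    l, r = hfpta(left), hfpta(right)
    return l <= max_hfpta and r <= max_hfpta and abs(l - r) <= max_asymmetry

# Hypothetical audiograms, for illustration only.
left_ear = {2000: 15, 3000: 20, 4000: 30, 6000: 35}
right_ear = {2000: 10, 3000: 15, 4000: 25, 6000: 30}
print(better_ear_hfpta(left_ear, right_ear), older_adult_eligible(left_ear, right_ear))
```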

Figure 1. — Average pure tone thresholds for younger (left panel) and older adult (right panel) participants. Error bars represent the standard error.

Cognitive Tasks

All participants completed four subtests from the NIH Cognitive Toolbox (Weintraub et al. 2013). In the Picture Vocabulary Test (a measure of crystalized intelligence) participants were presented with a printed word and were required to select the picture corresponding to that word. For the Flanker Task (attention and inhibitory ability) participants indicated the direction of an arrow in the middle of three arrows and ignored the arrows to the right and left. The Dimension Change Card Sort Task (which assesses cognitive flexibility) required participants to match an object based on either color or shape, while ignoring the other dimension. Finally, subjects completed the List Sorting Task (a measure of working memory), in which they viewed a series of pictures of food items and animals and were asked to repeat them back in order of size, separately for each category of word. A full description of each of these tasks can be found in Weintraub et al. (2013). For each of these subtests, the unadjusted scale score (which compares the participant’s score to data in the entire NIH toolbox normative sample, with no adjustment for age or demographic information) was used for analysis. A score of 100 represents average performance in relation to the national sample with higher scores reflecting better performance. Table 1 shows group performance for each cognitive task. Younger participants obtained significantly higher scores than older participants for each subtest except Vocabulary.

Table 1.

Group performance on each of the cognitive tasks. Values in parentheses are the standard errors.

	Vocabulary	Flanker**	CardSort*	ListSort*

Older	136.47 (2.44)	108.76 (1.63)	111.11 (3.03)	108.39 (2.48)
Younger	114.89 (2.40)	126.19 (2.44)	126.58 (3.00)	114.86 (2.83)

** = group differences at the p < .01 level; * = group differences at the p < .05 level.

Stimuli

The speech recognition task used in this study was a modified version of the interleaved sentence task described in Kidd et al. (2008), consisting of a total of 50 words that fell into one of five categories: names, verbs, numbers/numeral descriptors, adjectives, and nouns (see Table 2). The noun category consisted of two five-word sets of words that were minimal pairs (bats/cats/hats/mats/rats and bones/cones/phones/tones/zones) to allow for examination of the effect of phonetic similarity. Each of the 50 words was audio recorded at least three times in isolation from 10 female and 10 male talkers. The most exemplary tokens of each word from each talker were selected for use in this study.

Table 2.

Target/masker words used in this study.

Names	Verbs	Number	Adjective	Nouns
Ann	bought	no	big	bats
Bob	dropped	one	brown	bones
Chris	found	two	cute	cats
Dave	gave	three	green	cones
Ed	held	four	new	hats
Jean	lost	five	old	mats
Paul	made	six	pink	phones
Sue	saw	eight	red	rats
Tom	sold	nine	thin	tones
Will	took	twelve	white	zones

Environmental sound tokens and brief samples of speech-envelope-modulated noise (SEM) were used in place of interleaved masker words on some trials. Environmental sounds were drawn from a set of 57 brief tokens taken from the freesound.org website. These included human sounds (e.g., sneeze, cough), household sounds (e.g., glass breaking, door knock), outdoor sounds (e.g., car crash, fire engine siren), musical instruments (e.g., piano, banjo), and animal sounds (e.g., coyote howl, duck quack). Each of these tokens was edited to be the same average length as the target words (0.963 sec). In order to assure that tokens were identifiable, each one was played at a comfortable level to two young adults, who were asked to provide the names of the sounds. Two potential sound stimuli were eliminated based on these ratings; the rest were identified correctly and thus were retained for the experiment. On trials that used these environmental sound maskers, five sound tokens were selected randomly for each trial (see below for more detail about experimental procedures). The SEM noise was created by extracting the wideband temporal envelope from a set of sentences recorded from a female talker, rectifying and low-pass filtering at 20 Hz, and modulating speech spectrum noise with the resulting envelope. Noise samples also were 0.963 sec in length.
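The SEM noise recipe above (rectify, low-pass filter at 20 Hz, modulate a noise carrier) can be illustrated with a small pure-Python sketch. Several details are assumptions rather than the authors' method: the filter type is not named in the text, so a first-order low-pass is used here; white Gaussian noise stands in for speech-spectrum noise; and an amplitude-modulated tone stands in for the recorded sentences.

```python
import math
import random

def lowpass_one_pole(x, cutoff_hz, fs_hz):
    """First-order IIR low-pass. The paper does not name its filter,
    so this choice is an assumption."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs_hz)
    y, state = [], 0.0
    for sample in x:
        state += alpha * (sample - state)
        y.append(state)
    return y

def speech_envelope(speech, fs_hz, cutoff_hz=20.0):
    """Full-wave rectify the wideband signal, then low-pass filter it."""
    return lowpass_one_pole([abs(s) for s in speech], cutoff_hz, fs_hz)

def sem_noise(speech, carrier, fs_hz):
    """Multiply the carrier noise by the speech envelope, sample by sample."""
    env = speech_envelope(speech, fs_hz)
    return [e * c for e, c in zip(env, carrier)]

# Toy input (assumed): a 500 Hz tone with slow amplitude modulation in place
# of speech, and white Gaussian noise in place of speech-spectrum noise.
fs = 8000
n = round(0.963 * fs)  # token duration reported in the text
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 4 * t / fs) * math.sin(2 * math.pi * 500 * t / fs)
          for t in range(n)]
carrier = [rng.gauss(0.0, 1.0) for _ in range(n)]
noise = sem_noise(speech, carrier, fs)
```

A steeper filter or a true speech-shaped carrier would change the spectral detail of the result but not the overall structure of the procedure.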

Procedures

Pure-tone thresholds, tympanograms, and cognitive test measures were obtained first. Participants then completed the interleaved task, which was conducted in 20 blocks of trials. The first two trials of each block were considered practice and were not scored; each block contained 12 scored trials (60 scored words). The target utterance always consisted of five words (one from each of the columns displayed in Table 2). It should be noted that target and masker words from the last column of the grid were always minimal pairs (that is, if the target word was “cones”, the masker word could not be “hats”). Participants were shown this word grid at the beginning of data collection but it was not available to them once the experiment began. In the first block, only target words were presented on each trial. In each of the remaining blocks (which were presented in randomized order), there were masker words or sounds (environmental sounds or SEM noise) temporally interleaved with the target words, with the initial word of the target sentence always occurring first. On trials where the masker consisted of words, participants were instructed to repeat back every other word, beginning with the first word, and to ignore the intervening words. When the masker consisted of noise or environmental sounds, listeners were told to repeat each word. An example of what participants might have heard on an interleaved speech trial is “Jean Dave found gave six eight brown red cones stones”, in which the target words are “Jean”, “found”, “six”, “brown”, and “cones” and the masker words are “Dave”, “gave”, “eight”, “red”, and “stones”. Words/sounds were concatenated. Targets and maskers were presented at a root-mean-square (RMS) level of 68 dBA from a single loudspeaker located directly in front of the listener. A custom MATLAB program was used to control stimulus presentation and to score participants’ responses.
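The construction of an interleaved speech trial can be sketched as follows. The word lists come from Table 2; everything else is illustrative. In particular, the function names are hypothetical, the sketch covers only syntactically ordered streams, and it assumes that the target and masker never share a word within a trial, which the text does not state explicitly. The original experiment was run in MATLAB; Python is used here only for illustration.

```python
import random

# Word grid from Table 2; the two noun sets are the minimal-pair families.
NAMES = ["Ann", "Bob", "Chris", "Dave", "Ed", "Jean", "Paul", "Sue", "Tom", "Will"]
VERBS = ["bought", "dropped", "found", "gave", "held", "lost", "made", "saw", "sold", "took"]
NUMBERS = ["no", "one", "two", "three", "four", "five", "six", "eight", "nine", "twelve"]
ADJECTIVES = ["big", "brown", "cute", "green", "new", "old", "pink", "red", "thin", "white"]
NOUN_SETS = [["bats", "cats", "hats", "mats", "rats"],
             ["bones", "cones", "phones", "tones", "zones"]]

def draw_streams(rng):
    """Draw one target and one masker sentence in syntactic order.
    Assumption: target and masker never share a word within a trial."""
    noun_set = rng.choice(NOUN_SETS)  # both nouns come from one minimal-pair set
    target, masker = [], []
    for column in (NAMES, VERBS, NUMBERS, ADJECTIVES, noun_set):
        t, m = rng.sample(column, 2)
        target.append(t)
        masker.append(m)
    return target, masker

def interleave(target, masker):
    """Alternate the two streams word by word, always starting with the target."""
    trial = []
    for t, m in zip(target, masker):
        trial += [t, m]
    return trial

rng = random.Random(0)
target, masker = draw_streams(rng)
trial = interleave(target, masker)
print(" ".join(trial))
```

Listeners would be asked to report the words at even positions of `trial` (indices 0, 2, 4, 6, 8) and to ignore the rest.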

Table 3 lists the conditions used in this experiment. During the 16 blocks in which the competing sounds were words, each of the following factors was manipulated: target word order (syntactically correct or random); masker word order (syntactically correct or random); target voice fixed (one talker for all target words on a given trial) or varied (each target word in a trial spoken by a different talker); and masker voice fixed or varied. In these latter two conditions, the target and/or masking talker was selected randomly (without replacement) for each word from the set of 20 talkers’ utterances. An example of a trial with syntactically correct target and masker word order can be found above; an example of one with random target word order and syntactically correct masker word order is “found Dave cones gave six eight Jean red brown stones”, in which the target words are “found”, “cones”, “six”, “Jean”, and “brown” and the masker words are “Dave”, “gave”, “eight”, “red”, and “stones”. Additionally, two blocks were completed in which the intervening sounds were samples of environmental noises, as described above, and two blocks were run with samples of SEM noise as the intervening sounds. During these four blocks the target words were always presented in random syntactic order, and for each masker type one block was run with a fixed target talker and the other with varied target talkers.

Table 3.

Conditions used in this study. Block order was randomized across participants.

	Target Syntax Random				Target Syntax Correct
	Masker Syntax Random	Masker Syntax Correct	Noise	Environmental Sounds	Masker Syntax Random	Masker Syntax Correct
T voice fixed/M voice varied	X	X	X	X	X	X
T voice varied/M voice fixed	X	X	X	X	X	X
T and M voice fixed	X	X			X	X
T and M voice varied	X	X			X	X

T = target; M = masker.
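The block design summarized in Table 3 can be enumerated programmatically as a check on the condition counts. The sketch below crosses the four two-level factors for the speech-masker blocks and adds the four non-speech blocks; the dictionary keys and the `assign_talkers` helper are hypothetical names introduced for illustration, and the sketch does not model whether target and masker talkers were constrained to differ.

```python
import random
from itertools import product

SYNTAX = ("random", "correct")
VOICE = ("fixed", "varied")

# 2 x 2 x 2 x 2 speech-masker blocks.
speech_blocks = [
    {"masker": "speech", "target_syntax": ts, "masker_syntax": ms,
     "target_voice": tv, "masker_voice": mv}
    for ts, ms, tv, mv in product(SYNTAX, SYNTAX, VOICE, VOICE)
]

# Non-speech blocks: target syntax always random, target voice fixed or varied.
nonspeech_blocks = [
    {"masker": kind, "target_syntax": "random", "target_voice": tv}
    for kind, tv in product(("SEM noise", "environmental sounds"), VOICE)
]

def assign_talkers(voice, rng, n_words=5, n_talkers=20):
    """Return one talker index per word: a single talker repeated when the
    voice is fixed, or distinct talkers drawn without replacement when varied."""
    if voice == "fixed":
        return [rng.randrange(n_talkers)] * n_words
    return rng.sample(range(n_talkers), n_words)

blocks = speech_blocks + nonspeech_blocks
rng = random.Random(1)
rng.shuffle(blocks)  # block order was randomized across participants
print(len(speech_blocks), len(nonspeech_blocks))
```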

RESULTS

For all analyses, order of responses was not taken into account during scoring. That is, an accurate-word response was considered correct regardless of whether or not it was reported by the participant in the order in which it was presented.
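As a minimal sketch of this order-independent scoring rule, a target word can be counted as correct whenever it appears anywhere in the listener's response. The function name and the case-insensitive matching are assumptions made for illustration; the authors' MATLAB scoring program is not described in further detail.

```python
def score_trial(target_words, response_words):
    """Count target words that appear anywhere in the response,
    regardless of the order in which they were reported."""
    reported = {w.lower() for w in response_words}
    return sum(1 for w in target_words if w.lower() in reported)

# Hypothetical response: three target words reported, out of order.
target = ["Jean", "found", "six", "brown", "cones"]
response = ["six", "Jean", "red", "cones"]
print(score_trial(target, response))  # 3
```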

Although performance on the first block of trials (in which only five target words were presented in the correct syntactic order) was high for both groups, one participant from each group performed more than two standard deviations below the mean. These subjects’ data were therefore not considered in any of the analyses described below. For the remaining participants, performance on this control block averaged 94.88% for the older listeners and 95.63% for the younger listeners.
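The exclusion rule can be illustrated with a short sketch. The text does not specify whether the mean and standard deviation were computed within each group or across all listeners, or whether the sample or population standard deviation was used; the sketch assumes a single list of scores and the sample standard deviation, and the scores shown are invented for illustration.

```python
from statistics import mean, stdev

def flag_low_outliers(scores, n_sd=2.0):
    """Return indices of scores more than n_sd standard deviations
    below the mean of the supplied scores."""
    cutoff = mean(scores) - n_sd * stdev(scores)
    return [i for i, s in enumerate(scores) if s < cutoff]

# Hypothetical control-block scores (percent correct), not the study's data.
control_scores = [96, 95, 97, 94, 98, 96, 95, 97, 70]
print(flag_low_outliers(control_scores))  # [8]
```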

Interleaved Speech

One aim of this study was to examine the effect of linkage variables (syntactically correct vs. random word order, fixed talker within a trial vs. varied talkers within a trial) when they were applied independently to the target and masker words. These data can be seen in Fig. 2. First, it is clear that performance was better for younger participants than for older participants. Both groups obtained higher scores when the target was syntactically correct (mean percent-correct: younger listeners 86%, older listeners 76%) vs. when it was random (mean percent-correct: younger listeners 71%, older listeners 63%). When the target syntax was random, both groups obtained higher scores when the target was in the fixed voice condition (mean percent-correct: younger listeners 78%, older listeners 65%) vs. the varied voice condition (mean percent-correct: younger listeners 65%, older listeners 61%). Performance also was modulated, but to a lesser extent, by syntax condition in the masker, with lower scores when the masker words comprised a syntactically-correct sentence (mean percent-correct: younger listeners 77%, older listeners 68%) versus when the masker words were in random order (mean percent-correct: younger listeners 80%, older listeners 71%).

Figure 2. — Percent-correct recognition of target words for each listener group by target and masker syntax and voice condition (T-Fix = target voice fixed; T-Var = target voice varied; M-Fix = masker voice fixed; M-Var = masker voice varied).

Repeated-measures ANOVA on these data with target syntax, masker syntax, target voice condition, and masker voice condition as within-subjects factors and listener group as a between-subjects factor showed significant main effects for target syntax (F (1,28) = 63.78, p < .001), masker syntax (F (1,28) = 10.54, p = .003), target voice consistency (F (1,28) = 38.39, p < .001) and masker voice consistency (F (1,28) = 9.62, p = .004). There were two significant interactions: target voice condition x group (F (1,28) = 9.50, p = .005) and target syntax x target voice (F (1,28) = 9.59, p = .004). Post-hoc t-tests were used to explore the former of these interactions and found that the difference between target-talker fixed and target-talker varied was significant for younger participants (t = 4.01, p = .001) but not for older participants (t = .28, p = .784). Hence, younger listeners could benefit from a consistent target talker voice, but the same was not true for older participants.

Another way to look at processing of the to-be-ignored stimuli is by examining masker intrusion responses (when participants reported a word from the masker stream instead of a target word), since these types of responses could reflect problems with inhibiting the masker words. Figure 3 shows data on the proportion of these errors by target/masker syntax condition (top panel) and by target/masker voice condition (bottom panel). Two separate repeated-measures ANOVAs were conducted (one for target and masker syntax condition, the other for target and masker voice consistency condition) to examine these effects with syntax or voice condition as the within-subjects variable and group as a between-subjects variable. For the target and masker syntax analysis, there was a significant main effect of syntax condition (target syntax: F [1,28] = 10.42, p = .003; masker syntax: F [1,28] = 10.21, p = .003) and non-significant main and interaction effects for subject group. For both groups, masker intrusion errors were more common when the target syntax was random and when the masker syntax was correct. ANOVA for fixed vs. varied target/masker voice uncovered somewhat different results, with a significant main effect of voice condition (F [3,26] = 5.64, p = .004) and a significant interaction between voice condition and subject group (F [3,26] = 2.87, p = .047). Post-hoc paired-samples t-tests found significant differences in the proportion of masker errors among voice conditions for the younger listeners (p values between .002 and .009) but not for the older listeners (p values between .257 and .417). When the masker voice was fixed, younger individuals could use this cue to decrease the number of intrusion errors (possibly because they were now better able to segregate the masker from the target talker). The same was not true for the older participants.

Figure 3: — Percentage of all responses that were words from the masker (T = target; M = masker). Top panel: percentage of masker errors by target/masker syntax condition; bottom panel: percentage of masker errors by voice consistency condition. Error bars represent the standard error.
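The masker intrusion measure can be illustrated with a minimal scoring sketch. It assumes a position-by-position rule: a response counts as correct if it matches the target word in the same position, and as an intrusion if it is not correct but matches any masker word from that trial. The function names, the scoring rule, and the example words are hypothetical illustrations, not the scoring procedure or stimuli used in the study.

```python
def score_trial(targets, maskers, responses):
    """Return (n_correct, n_intrusions) for one trial.

    Assumed rule (hypothetical): a response is correct if it equals the
    target word in the same position; otherwise it is an intrusion if it
    matches any masker word presented on that trial.
    """
    masker_set = {word.lower() for word in maskers}
    correct = 0
    intrusions = 0
    for target, response in zip(targets, responses):
        word = response.lower()
        if word == target.lower():
            correct += 1
        elif word in masker_set:
            intrusions += 1
    return correct, intrusions


def intrusion_percentage(trials):
    """Percentage of all responses that were masker words."""
    total = sum(len(t["responses"]) for t in trials)
    intrusions = sum(
        score_trial(t["targets"], t["maskers"], t["responses"])[1]
        for t in trials
    )
    return 100.0 * intrusions / total if total else 0.0


# Made-up trial for illustration only.
example_trial = {
    "targets": ["Jane", "found", "three", "red", "shoes"],
    "maskers": ["Bob", "took", "five", "old", "hats"],
    "responses": ["Jane", "found", "five", "red", "hats"],
}
print(score_trial(**example_trial))
print(intrusion_percentage([example_trial]))
```

Under this assumed rule, responses that match neither the target nor a masker word are simply errors of another type and do not enter the intrusion count.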

Interleaved Noise and Interleaved Environmental Sounds

Recall that one research question was the extent to which group differences in susceptibility to non-simultaneous masking contributed to group differences in performance on the interleaved word task. Another question was whether distraction produced by interleaved environmental sounds would be disruptive to performance, and how this might interact with age. Fig. 4 displays performance when the interleaved stimuli were samples of noise or environmental sounds, along with performance averaged across all target-word random-syntax conditions for speech maskers (as data with interleaved noise and environmental sounds were only collected for random syntax conditions). Results showed minimal effects of noise and environmental sound maskers for both groups. Repeated-measures ANOVA with masker type and target talker consistency condition as within-subjects variables and listener group as a between-subjects variable indicated significant main effects of group, masker type, and talker consistency (all p < .001) with significant interactions between masker type and target voice consistency (F (2,27) = 8.97, p = .001) as well as voice consistency and group (F (1, 28) = 11.51, p = .002). Post-hoc t-tests with Bonferroni corrections for the masker main effect indicated that average performance for the noise and sounds maskers did not differ from each other (p = .63) but both were significantly different from speech maskers (p < .001). The voice condition x group interaction was due to a significant difference between groups only for speech maskers when the target voice was fixed (p = .045). The groups did not differ significantly for speech maskers when the target voice was varied, or for noise or sound maskers in either voice condition.
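For readers unfamiliar with the Bonferroni correction applied to the pairwise masker comparisons, the following is a minimal sketch of the adjustment: each raw p value is multiplied by the number of comparisons and capped at 1. The function name and the raw p values are hypothetical and are not the values obtained in this study.

```python
def bonferroni(p_values):
    """Bonferroni-adjust raw p values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Three pairwise comparisons among three masker types.
# Raw p values below are made up for illustration only.
comparisons = ["noise vs. sounds", "noise vs. speech", "sounds vs. speech"]
raw_p = [0.30, 0.004, 0.020]

for label, p_adj in zip(comparisons, bonferroni(raw_p)):
    verdict = "significant" if p_adj < 0.05 else "not significant"
    print(f"{label}: adjusted p = {p_adj:.3f} ({verdict})")
```

Equivalently, the raw p values can be compared against an alpha of .05 divided by the number of comparisons; the two formulations lead to the same decisions.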

Figure 4. — Percent correct performance when noise (left), environmental sounds (middle), and words (right) were interleaved with the target words. Error bars represent the standard error.

Phonetic similarity of target and masker

We speculated that older adults would be at a greater disadvantage than younger listeners when target and masker words were phonetically similar. This idea was based on the fact that masker words that are lexical neighbors of target words would cause greater competition, and that this would be a more significant factor for older adults because they are less capable of suppressing lexical competitors (e.g., Sommers 1996; Sommers & Danielson 1999; Lash et al. 2013).

Analysis of the impact of phonetic similarity depended upon determining if there were consistent effects related to word position, since when rhyming targets/maskers occurred in correct target syntax trials it was always at the end of the trial. Figure 5 shows percent-correct performance by word position for both types of target syntactic conditions, with data averaged across target voice consistency and masker syntax/voice consistency conditions. It is apparent that variability in older (but not younger) subjects’ performance was much greater for words at the end of the sentence as compared to words at the beginning and middle of the sentence. Repeated measures ANOVA with syntax condition and word position as within-subjects factors and group as a between-subjects factor confirmed that there was a significant two-way interaction between word position and syntax (F [4,25] = 19.51, p < .001) as well as a significant word position x group interaction (F [4,25] = 3.48, p = .010).

Figure 5. — Percent correct performance by word position for both types of target syntax conditions.

We contrasted percent-correct performance for words in which the targets and maskers on a given trial were minimal pairs (that is, from the last column in Table 2) versus when they were not (columns 1–4 in Table 2). This was only done for random target syntax conditions, as when the target words were in correct syntactic order the rhyming target word was always at the end of the trial, and our preliminary analysis described above showed significant word position effects that were mediated by subject group.

Figure 6 shows results of this analysis. Several trends are apparent. First, there were only subtle differences in the patterns of responses between groups. For both groups, performance was poorer for rhyming words (vs. non-rhyming words) when they occurred at the beginning of the trial. This finding persisted for older adults through the middle of the trial, but for younger participants it was only apparent for the first target word. For both groups, there was little effect of rhyme/no rhyme for words at or near the end of the trial. Repeated-measures ANOVA with word position and word type (rhyme vs. no-rhyme) as within subjects factors and subject group as a between subjects factor showed a significant interaction between word position and word type (F [4,25] = 5.56, p = .002) with no main or interaction effects involving group. Hence, there is little support for older adults being more susceptible to lexical interference by similar-sounding words in the current paradigm. Instead, both groups of participants had more difficulty identifying target words when there were rhyming masking words in the same trial during the beginning (both groups) and middle (older group) of the trial.

Influence of age, hearing loss, and cognitive abilities

A Pearson r correlation analysis was completed to help explain individual variability in performance on the interleaved word task. The analysis used the following variables: scores on each of the cognitive tasks; age; better-ear high-frequency pure-tone average (average of thresholds at 2 kHz, 3 kHz, 4 kHz, and 6 kHz); and percent-correct averaged across all conditions using words as the interleaved stimuli. Results of this analysis can be seen in Table 4. Age was significantly (and negatively) associated with all other variables, including each cognitive variable (except for vocabulary) and interleaved task performance. Additionally, the Dimension CardSort task (a measure of cognitive flexibility) was significantly related to interleaved task performance, as depicted in the scattergram in Fig. 7. Of note was that degree of high-frequency hearing loss, while significantly associated with both age and with scores on the combined cognitive variable, was not significantly related to performance on the interleaved word task. The outcome of this analysis supports the idea that factors other than degree of hearing loss led to the observed differences between groups on the interleaved word task.

Table 4.

Results of Pearson r correlation analyses.

	Age	HFPTA	ListSort	CardSort	Flanker	Vocabulary	PCSpeech
Age	---	.77^**	−.40^*	−.57^**	−.73^**	−.18	−.40^*
HFPTA		---	−.33	−.31	−.46^*	−.19	−.24
ListSort			---	−.08	.23	.17	.22
CardSort				---	.68^**	−.08	.41^*
Flanker					---	−.13	.31
Vocabulary						---	.11
PCSpeech							---

HFPTA = average of better-ear pure tone thresholds at 2 kHz, 3 kHz, 4 kHz, and 6 kHz; ListSort = performance on NIH Cognitive Toolbox ListSort (working memory) task; CardSort = performance on NIH Cognitive Toolbox Dimension CardSort (cognitive flexibility) task; Flanker = performance on NIH Cognitive Toolbox Flanker (attention and inhibitory ability) task; Vocabulary = performance on NIH Cognitive Toolbox Vocabulary task; PCSpeech = percent-correct scores on the interleaved task averaged across all conditions with speech as the masker (* = p < .05; ** = p < .01).
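The two quantities underlying Table 4 can be sketched as follows: the better-ear high-frequency pure-tone average (mean threshold at 2, 3, 4, and 6 kHz, taking the ear with the lower average) and the Pearson correlation coefficient. The function names, the dictionary layout for the audiogram, and the threshold values are assumptions made for illustration; they are not the study's data or analysis code.

```python
from math import sqrt

HF_FREQS_HZ = (2000, 3000, 4000, 6000)


def hf_pta(thresholds_db):
    """Mean threshold (dB HL) at 2, 3, 4, and 6 kHz for one ear."""
    return sum(thresholds_db[f] for f in HF_FREQS_HZ) / len(HF_FREQS_HZ)


def better_ear_hf_pta(left, right):
    """High-frequency PTA of the ear with the lower (better) average."""
    return min(hf_pta(left), hf_pta(right))


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up audiogram for one hypothetical listener (dB HL).
left_ear = {2000: 20, 3000: 30, 4000: 40, 6000: 50}
right_ear = {2000: 10, 3000: 20, 4000: 30, 6000: 40}
print(better_ear_hf_pta(left_ear, right_ear))
```

Each cell in Table 4 would correspond to one call to a function like `pearson_r` on a pair of per-listener score lists; the significance markers additionally depend on the number of listeners contributing to each pair.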

Figure 7. — Scatterplot displaying the association between target word identification and score on the CardSort task. Solid markers represent data from the younger participants; open markers represent data from the older participants.

DISCUSSION

This work sought to help clarify factors that contribute to age-related changes in competing speech perception using a temporally-interleaved task that essentially eliminates energetic masking. Overall, younger adults outperformed older adults when the task was to repeat back words interleaved with other words, even though performance by the two groups was essentially equivalent when there were no stimuli or non-word stimuli interleaved between the target words. Both older and younger participants were able to benefit from target words being presented in correct syntactic order. These findings are consistent with results of our previous study (Helfer et al. 2013).

In many speech-on-speech recognition paradigms, successful task completion requires the listener to both correctly identify target words and to ignore masker words. Previous research using simultaneous competing speech tasks suggests that older adults may be more affected than younger adults by the nature of the masker. For example, older adults have more difficulty understanding speech in the presence of a meaningful (vs. non-meaningful) masker, as compared to younger adults (e.g., Tun et al. 2002; Rossi-Katz and Arehart 2009). However, one potential difficulty with interpreting results of simultaneous competing speech tasks is that the masker produces both energetic and (potentially) informational masking, making it difficult to conclude that results are not due to a combination of these two factors. In the present study, younger and older participants appeared to be influenced similarly by the interleaved (masker) words. For both groups, having the to-be-ignored masker words in correct syntactic order (leading to a masker that was a syntactically-viable sentence) led to a slight reduction in performance, as compared to when masker words were presented in random order. Although the effect of masker syntax was statistically significant, it was very small, more or less concordant with results of Kidd et al. (2008) who showed no consistent effect of masker syntax among their four young normally-hearing participants.

We had hypothesized that an age-related increase in problems inhibiting the masker might be manifested as increased susceptibility to masker manipulations and/or as an increase in masker intrusion errors. Our results only partially support this hypothesis. Older participants did make more masker intrusion errors than younger participants (see Fig. 3). However, the effect of manipulating masker syntax did not differ between the two groups. This might be seen as contradictory to prior research using simultaneous competing speech tasks in which older adults were found to be more sensitive than younger adults to the understandability of a masker (e.g., Tun et al., 2002; Rossi-Katz and Arehart, 2009). This discrepancy could be a reflection of differences in the overall difficulty of simultaneous vs. interleaved competing speech tasks, as the former typically lead to lower levels of performance than the latter. It also should be noted that the masker manipulations used in the present study (word order and voice consistency) were different from what has been used in simultaneous competing speech tasks (e.g., different languages or forward vs. backward speech). A more direct comparison of results across studies could be made by using those types of masker manipulations with an interleaved word task.

There was some evidence that the masker was processed differentially by younger and older listeners. The most robust age difference in the present study was related to the effect of varied vs. fixed target and masker voices. Specifically, younger participants were able to use a consistent masker talker cue to decrease the number of intrusion responses, while older adults were not. On the surface, this seems to suggest that younger adults were less able to ignore the masker. An alternate way of looking at this is that they were better able to process and use information within the to-be-ignored words to determine that they were not targets. Prior research has demonstrated that individuals with higher working memory are better able to process multiple sources of information than are those with poorer working memory (e.g., Ronnberg et al., 2013). Among the participants in the present study, younger listeners had higher working memory spans than older listeners, which might have allowed them to devote more processing resources to the masker words.

Not only were younger adults better than older adults at using consistent voice information in the masker words to reduce the number of intrusion errors, they also benefited more than older participants from having a fixed target talker voice. This is inconsistent with results from our earlier study (Helfer et al., 2013) which found that older adults were as capable of using voice information as younger listeners. One major difference between these two studies is that our previous experiment had trials in which the voices were either all male or all female during voice-varied trials, while in the present study all voice-varied trials had a mixture of male and female voices. This increase in variability likely affected our older participants to a greater extent than younger participants. Further bolstering this idea is the significant association found between the cognitive task measuring mental flexibility and performance on the interleaved task (discussed in more detail below). However, the pattern of deficits experienced by older adults in the present study also could be due to difficulty encoding voice information, as research has shown that the learning and/or encoding of voice information is poorer in older than in younger adults (e.g., Yonan & Sommers 2000; Pilotti et al. 2001; Pilotti & Beyer 2002; Helfer & Freyman 2008; Rossi-Katz & Arehart 2009; Best et al. 2018).

Another question addressed in this study was the extent to which differences in performance between older and younger listeners could be explained by non-simultaneous (i.e., energetic) masking. We believe that the results strongly support our conclusion that non-simultaneous masking had little effect on task performance, consistent with what was noted by Kidd et al. (2008) using a similar task with young, normal-hearing listeners. In the present study, interleaved SEM noise had little effect on performance for either listener group. In addition, the differential effects of word position (in which larger group differences were found for words toward the end of the trial vs. in other positions) cannot be attributed to non-simultaneous masking, which should be similar across word positions. Moreover, the correlation analysis indicated that degree of high-frequency hearing loss was not significantly associated with performance on the interleaved word task.

Taken together, this evidence suggests that factors other than peripheral hearing loss determined the performance of older listeners. A significant association was found between scores on the Dimension Card Sort Task (which measures cognitive flexibility) and identification of target words. This subtest relies heavily on executive function, and it is likely that the interleaved task also tapped into participants’ ability to control and shift attention. Numerous previous studies have found that cognitive skills contribute to explaining individual variability in simultaneous speech-on-speech masking tasks (e.g., Anderson et al. 2013; Woods et al. 2013; Helfer & Freyman 2014; Helfer & Jesse 2015; Souza & Arehart, 2015). Results of the present paper extend this finding to a situation in which speech understanding is not limited by energetic masking or other peripheral factors.

We also were interested in knowing whether age-related decline in performance on the interleaved word task was due to older adults being more susceptible to general distraction produced by to-be-ignored stimuli. The present results do not support this idea, as both our younger and our older participants were only minimally affected by the presence of interleaved environmental sounds. Hence, differences in performance between older and younger participants noted when words were maskers could not be attributed to a general problem of attention being captured by competing sounds (although it is possible that our environmental sound samples were not sufficiently compelling to capture attention or cause distraction), nor was it caused by interleaved non-word sounds leading to a decrease in the ability to rehearse and/or remember the target words. It is possible that differences between the two groups in task difficulty contributed to our inability to identify any effect of environmental sound maskers. Previous work suggests that increasing task demands by making the primary task more difficult (for example, using a hard-to-read font) reduces the processing of background sounds, in essence protecting against attentional capture by those sounds (e.g., SanMiguel et al. 2008; Hughes et al. 2013; Sorqvist & Marsh 2015; Halin 2016). Assuming this finding extends to non-simultaneous tasks, it is possible that the lack of group differences when non-speech sounds were interleaved was at least in part due to the fact that the primary task was perceptually more difficult for older than for younger participants. That is, if understanding the target words was perceptually more taxing for older than for younger participants, the environmental sounds would have been processed to a lesser extent by older listeners, making them less prone to interfering with the primary task.

Since our results demonstrated that differences between groups were only found with interleaved words (and not with interleaved environmental sounds or noise samples), we believe that it is likely that lexical interference contributed to the deficit experienced by older participants. One caveat, however, is that results of the phonetic similarity analysis did not support the presence of increased lexical interference, since both groups of listeners were influenced to the same extent when words in the target and masker rhymed. It is possible that the way we examined lexical interference (by contrasting performance when targets and maskers were or were not minimal pairs) was not sufficiently sensitive to identify age-related differences in lexical processing that have been found using other tasks (e.g., Sommers 1996; Sommers & Danielson 1999; Lash et al. 2013). Regardless, we cannot conclusively either confirm or rule out that lexical interference led to older adults’ reduced performance on this interleaved word task.

Finally, it is feasible that by the end of the trial, older adults could no longer correctly determine which word was from the masker and which was from the target due to a breakdown in the ability to keep track of the sequence of words. This idea is supported by the finding that group differences were larger for words toward the end of the trial than for those at the beginning (see Fig. 5). We cannot exclude the possibility that an age-related deficit in temporal sequencing (e.g., Trainor & Trehub 1989; Fitzgibbons & Gordon-Salant 2001) led to our older adult participants having difficulty tracking over time which words were targets and which were maskers.

In summary, results of this study indicate that older adults are more impaired than younger adults when masker words are interleaved with target words. This deficit is significantly associated with cognitive flexibility but not with degree of high-frequency hearing loss. The fact that performance of older listeners was only minimally influenced by samples of interleaved noise supports the idea that non-simultaneous masking could not explain the findings obtained when words were used as maskers. Older adults appear to have particular difficulty using talker voice information during this task. Overall, even though this interleaved word task does not represent how communication takes place in “the real world”, the results support a growing body of evidence that hearing loss experienced by older adults does not entirely explain the problems they experience in competing speech situations. This suggests that remediation of hearing loss that centers around improving audibility of speech will not solve all of the difficulties encountered by older adults in complex listening environments.

ACKNOWLEDGEMENTS

We thank Sarah Laakso, Kimberly Adamson-Bashaw, Peter Wasiuk, and Michael Rogers for their assistance with this project. This work was supported by NIH NIDCD R01 012057.

Financial disclosures/Conflicts of Interest: This research was funded by the NIH-NIDCD.

Contributor Information

Karen S. Helfer, University of Massachusetts Amherst, Department of Communication Disorders

Sarah F. Poissant, University of Massachusetts Amherst, Department of Communication Disorders

Gabrielle R. Merchant, Boys Town National Research Hospital

REFERENCES

Akeroyd MA (2008). Are individual differences in speech recognition related to individual differences in cognitive ability? A survey of twenty experimental studies with normal and hearing-impaired adults. Int J Aud, 47 (suppl. 2), S53–S71.
Anderson S, White-Schwoch T, Parbery-Clark A, et al. (2013). A dynamic auditory cognitive system supports speech-in-noise perception in older adults. Hear Res, 300, 18–32.
Best V, Mason CR, and Kidd G Jr. (2011). Spatial release from masking in normally hearing and hearing-impaired listeners as a function of the temporal overlap of competing talkers. J Acoust Soc Am, 129, 1616–1625.
Best V, Ahlstrom JB, Mason CR, et al. (2018). Talker identification: Effects of masking, hearing loss, and age. J Acoust Soc Am, 143, 1085–1092.
Chen Y-C, and Spence C (2010). When hearing the bark helps to identify the dog: Semantically-congruent sounds modulate the identification of masked pictures. Cog, 114, 389–404.
Chen Y-C, and Spence C (2011). Crossmodal semantic priming by naturalistic sounds and spoken words enhances visual sensitivity. J Exp Psych: Hum Percep Perform, 37, 1554–1568.
Desjardins JL, and Doherty KA (2013). Age-related changes in listening effort for various types of masker noises. Ear Hear, 34, 261–272.
Dirks DD, Takayanagi S, and Moshfegh A (2001). Effects of lexical factors on word recognition among normal-hearing and hearing-impaired listeners. J Amer Acad Aud, 12, 233–244.
Dubno JR, Horwitz AR, and Ahlstrom JB (2003). Recovery from prior stimulation: Masking of speech by interrupted noise for younger and older adults with normal hearing. J Acous Soc Am, 113, 2084–2094.
Dubno JR, Horwitz AR, and Ahlstrom JB (2002). Benefit of modulated maskers for speech recognition by younger and older adults with normal hearing. J Acous Soc Am, 111, 2897–2907.
Fitzgibbons PJ, and Gordon-Salant S (2001). Aging and temporal discrimination in auditory sequences. J Acous Soc Am, 109, 2955–2963.
Fogerty D, Bologna WJ, Ahlstrom JB, and Dubno JR (2017). Simultaneous and forward masking of vowels and stop consonants: Effects of age, hearing loss, and spectral shaping. J Acous Soc Am, 141, 1133–1143.
Folstein MF, Folstein SE, and McHugh PR (1975). Mini-mental state: A practical method for grading the cognitive state of patients for the clinician. J Psych Res, 12, 189–198.
Füllgrabe C, Moore BC, and Stone MA (2015). Age-group differences in speech identification despite matched audiometrically normal hearing: contributions from auditory temporal processing and cognition. Front Aging Neurosci, 6, 347.
George ELJ, Festen JM, and Houtgast T (2006). Factors affecting masking release for speech in modulated noise for normal-hearing and hearing-impaired listeners. J Acous Soc Am, 120, 2295–2311.
Glaser WR and Dungelhoff FJ (1984). The time course of picture-word interference. J Exp Psych Hum Percep Perf, 10, 640–654.
Halin N (2016). Distracted while reading? Changing a hard-to-read font shields against the effects of environmental noise and speech on text memory. Front Psych, 7, 1196.
Helfer KS, and Freyman RL (2008). Aging and speech-on-speech masking. Ear Hear, 29, 87–98.
Helfer KS, and Freyman RL (2014). Stimulus and listener factors affecting age related changes in competing speech perception. J Acous Soc Am, 136, 748–759.
Helfer KS, and Jesse A (2015). Lexical influences on competing speech perception in younger, middle-aged, and older adults. J Acous Soc Am, 138, 363–76.
Helfer KS, Mason C, and Marino C (2013). Aging and the perception of temporally-interleaved words. Ear Hear, 34, 160–167.
Hughes R (2014). Auditory distraction: A duplex-mechanism account. Psych J, 3, 30–41.
Hughes R, Hurlstone MJ, Marsh JE, et al. (2013). Cognitive control of auditory distraction: impact of task difficulty, foreknowledge, and working memory capacity support duplex-mechanism account. J Exp Psych Hum Percep Perf, 39, 539–553.
Humes LE, Lee JH, and Coughlin MP (2006). Auditory measures of selective and divided attention in young and older adults using single-talker competition. J Acous Soc Am, 120, 2926–2937.
Jescheniak JD, Oppermann F, Hantsch A, et al. (2009). Do perceived context pictures automatically activate their phonological code? Exp Psych, 56, 56–65.
Jesse A, and Janse E (2012). Audiovisual benefit for recognition of speech presented with single-talker noise in older listeners. Lang Cog Proc, 27, 1167–1191.
Kidd G Jr., Best V, and Mason CR (2008). Listening to every other word: Examining the strength of linkage variables in forming streams of speech. J Acous Soc Am, 124, 3793–3802.
Koelewijn T, Zekveld AA, Festen JM, et al. (2012). Pupil dilation uncovers extra listening effort in the presence of single-talker masker. Ear Hear, 33, 291–300.
Koelewijn T, Zekveld AA, Festen JM, et al. (2014). The influence of informational masking on speech perception and pupil response in adults with hearing impairment. J Acous Soc Am, 135, 1596–1606.
Lash A, Rogers CS, Zoller A, et al. (2013). Expectation and entropy in spoken word recognition: effects of age and hearing acuity. Exp Aging Res, 39, 235–253.
Lee JH, and Humes LE (2012). Effect of fundamental-frequency and sentence-onset differences on speech-identification performance of young and older adults in a competing-talker background. J Acous Soc Am, 132, 1700–1717.
Luce PA, and Pisoni DB (1998). Recognizing spoken words: the neighborhood activation model. Ear Hear, 19, 1–36.
Madebach A, Wohner S, Kieseler M-L, et al. (2017). Neighing, barking, and drumming horses—Object related sounds help and hinder picture naming. J. Exp. Psych: Hum Percep Perform, 43, 1629–1646.
Marsh JE, and Jones D (2011). Cross-modal distraction by background speech: what role for meaning? Noise Health, 45, 210–216.
Pilotti M, and Beyer T (2002). Perceptual and lexical components of auditory repetition priming in young and older adults. Mem Cog, 30, 226–236.
Pilotti M, Beyer T, and Yasunami M (2001). Encoding tasks and the processing of perceptual information in young and older adults. J Geron: Psych Sci, 56B, P119–P128.
Ronnberg J, Lunner T, Zekveld A, et al. (2013). The Ease of Language Understanding (ELU) model: theoretical, empirical, and clinical advances. Front Syst Neurosci, 7, 31.
Rossi-Katz J, and Arehart KH (2009). Message and talker identification in older adults: Effects of task, distinctiveness of the talker’s voices, and meaningfulness of the competing message. J Speech Lang Hear, 52, 435–453.
SanMiguel I, Corral M-J, and Escera C (2008). When loading working memory reduces distraction: behavioral and electrophysiological evidence from an auditory-visual distraction paradigm. J Cog Neurosci, 20, 1131–1145.
Saygin AP, Dick F, and Bates E (2005). An on-line task for contrasting auditory processing in the verbal and nonverbal domains and norms for younger and older adults. Behav Res Meth, 37, 99–110.
Schriefers HJ, Meyer AS, and Levelt WJM (2000). Exploring the time course of lexical access in language production—Picture-word interference studies. J Mem Lang, 29, 86–102.
Shafiro V (2008). Identification of environmental sounds with varying spectral resolution. Ear Hear, 29, 401–420.
Sommers MS, and Danielson SM (1999). Inhibitory processes and spoken word recognition in young and older adults: The interaction of lexical competition and semantic context. Psych Aging, 14, 458–472.
Sommers MS (1996). The structural organization of the mental lexicon and its contribution to age-related declines in spoken-word recognition. Psych Aging, 11, 333–341.
Sorqvist P, and Marsh JE (2015). How concentration shields against distraction. Curr Dir Psychol Sci, 24, 267–272.
Souza P and Arehart K (2015). Robust relationship between reading span and speech recognition in noise. Int J Audiol, 54, 705–713.
Summers V, and Molis MR (2004). Speech recognition in fluctuating and continuous maskers: Effects of hearing loss and presentation level. J Speech Lang Hear Res, 47, 245–256.
Svec A, Dubno JR, and Nelson PB (2016). Inherent envelope fluctuations in forward maskers: Effects of masker-probe delay for listeners with normal and impaired hearing. J Acous Soc Am, 139, 1195–1203.
Takahashi GA and Bacon SP (1992). Modulation detection, modulation masking, and speech understanding in noise in the elderly. J Speech Lang Hear Res, 35, 1410–1421.
Trainor LJ, and Trehub SE (1989). Aging and auditory temporal sequencing: Ordering the elements of repeating tone patterns. Percept Psychophys, 45, 417–426.
Tun PA and Wingfield A (1999). One voice too many: Adult age differences in language processing with different types of distracting sounds. J Gerontol, 54B, P317–P327.
Tun PA, O’Kane G, and Wingfield A (2002). Distraction by competing speech in young and older adult listeners. Psychol Aging, 17, 453–467.
Weintraub S, Dikmen SS, Heaton RK, et al. (2013). Cognition assessment using the NIH Cognitive Toolbox. Neurol, 80(Suppl 3), S54–S64.
Woods WS, Kalluri S, Pentony S, et al. (2013). Predicting the effect of hearing loss and audibility on amplified speech perception in a multi-talker listening scenario. J Acous Soc Am, 133, 4268–4278.
Yonan CA, and Sommers MS (2000). The effects of talker familiarity on spoken word identification in younger and older listeners. Psychol Aging, 15, 88–99.

