Frequent False Hearing by Older Adults: The Role of Age Differences in Metacognition

Chad S Rogers; Larry L Jacoby; Mitchell S Sommers

doi:10.1037/a0026231

Author manuscript; available in PMC: 2013 Mar 1.

Published in final edited form as: Psychol Aging. 2011 Dec 12;27(1):33–45. doi: 10.1037/a0026231

PMCID: PMC3319693 NIHMSID: NIHMS357618 PMID: 22149253

Abstract

In two experiments testing age differences in the subjective experience of listening, which we call meta-audition, young and older adults were first trained to learn pairs of semantic associates. Following training, both groups were tested on identification of words presented in noise, with the critical manipulation being whether the target item was congruent, incongruent or neutral with respect to prior training. Results of both experiments revealed that older as compared to young adults were more prone to “false hearing,” defined as mistaken high confidence in the accuracy of perception when a spoken word had been misperceived. These results were obtained even when performance was equated across age groups on control items by reducing the noise level for older adults. Such false hearing is shown to reflect older adults’ heavier reliance on context. Findings suggest that older adults’ greater ability to benefit from semantic context reflects their bias to respond consistently with the context, rather than their greater skill in using context. Procedures employed are unique in measuring the subjective experience of hearing as well as its accuracy. Both theoretical and applied implications of the findings are discussed. Convergence of results with those showing higher false memory, and false seeing are interpreted as showing that older adults are less able to constrain their processing in ways that are optimal for performance of a current task. That lessened constraint may be associated with decline in frontal-lobe functioning.

Keywords: metacognition, aging, context, speech perception, confidence, meta-audition

During a hike in coastal Michigan, I (C.R.) approached a large fir tree to take a picture. A small songbird that was nesting in the tree swooped toward me aggressively, defending its territory. When later relating this incident to my family, I told them that I was attacked by a lark. My grandmother was alarmed, asking “how are you possibly alright?!” After some discussion, it became clear that she was absolutely certain that she had “heard” me say that I had been attacked by a shark. My grandmother’s error is an example of what we refer to as “false hearing”, a high-confidence, subjective experience of having actually “heard” a misperceived word (e.g., shark). This paper presents evidence that false hearing is more common among older adults than young adults. We argue that measures of subjective experience, as reflected by false hearing, are a critical yet underutilized assessment tool in audition and that age differences in subjective experience provide novel insight into the mechanisms that mediate perceptual experience.

When perceiving a spoken word in naturalistic listening situations, listeners can base their perceptual experience on two distinct sources of information: sensation and context (e.g., Nittrouer & Boothroyd, 1990). Sensation refers to the acoustic and phonetic characteristics of the word as processed by the peripheral auditory system. Context refers to the mental and environmental circumstances within which the word is perceived. In the above example, sensory information refers to phonetic cues, including formant frequencies, voice-onset times, burst frequencies, and other phonetic information that listeners use to identify the linguistic content of speech signals. Context, on the other hand, refers to the information contained in the sentence prior to presentation of the target word. Given that sharks are known to attack people, in the above example sensory and contextual information are incongruent in that the sensory information strongly suggests “lark” whereas the contextual information strongly suggests “shark”. Based on evidence soon described, we predicted that older adults would be more likely to falsely hear words presented in a misleading context than are young adults.

While subjective experience measures such as confidence ratings have not been strongly emphasized in audition research, they have frequently been used in investigations of metacognition. Of particular interest is the extent to which confidence ratings distinguish between correct and incorrect responses. We employ a measure called monitoring resolution to assess the correlation between confidence in identification of a spoken word and identification accuracy. As suggested by monitoring and control frameworks of metacognition (e.g., Koriat & Goldsmith, 1996; Nelson & Narens, 1990), such resolution is important because people control their actions on the basis of their confidence. The goal of this work is to integrate findings from metacognition, memory, and audition with the aim of understanding age differences in false hearing. We use the term “meta-audition” to refer to the metacognitive aspect of audition. Just as studies of metamemory have helped further understanding of memory processes (e.g. Dunlosky & Metcalfe, 2009; Koriat & Goldsmith, 1996), research on meta-audition has the potential to advance understanding of auditory processing. We highlight further similarities between age differences in meta-audition and age differences in metamemory.

We begin by considering research on differential use of contextual information by young and older adults. Prior work aimed at understanding age differences in the use of context (e.g., Nittrouer & Boothroyd, 1990) has been limited to situations in which context supports correct identification. Importantly, those experiments have not examined situations in which context could give rise to false hearing, such as the “shark/lark” example described above. Next, we review parallels between the processes underlying age differences in metamemory and meta-audition by relating false hearing to false memory. Finally, two experiments are described that assessed context effects on meta-audition in young and older adults.

Aging and the Facilitative Use of Context in Speech Perception

As people grow older, the relative contribution of context to speech perception typically increases (e.g., Nittrouer & Boothroyd, 1990). In particular, older adults rely more on top-down information such as semantic context to compensate for hearing impairment (Wingfield, Tun, & McCoy, 2005). When young and older adults are compared under degraded listening conditions (e.g., with moderate to high levels of background noise present), age differences in spoken word identification diminish significantly in the presence of supportive semantic context (Dubno, Ahlstrom, & Horwitz, 2000; Hutchinson, 1989; Pichora-Fuller, Schneider, & Daneman, 1995; Sommers & Danielson, 1999). Given these findings, some have suggested that older adults are more skilled than young adults at using context, resulting from heavier reliance on context when listening in degraded conditions in daily life (Pichora-Fuller, 2008). Sommers and Danielson (1999) also presented evidence showing that the addition of context reduced age differences in lexical discrimination (Sommers, 1996). They argued that context constrains activation to a smaller set of candidate words, and that because of older adults’ deficit in ability to inhibit alternative responses (e.g., Hasher & Zacks, 1988), context is more beneficial for older than young adults.

An alternative account is that reliance on context produces a bias effect rather than serving to increase discrimination. A bias effect would show itself by increasing false hearing when context and the sensory signal were incongruent as well as by increasing correct hearing when the sensory signal and context were congruent. In contrast, models that attribute age differences in effects of context to differences in enhanced discrimination alone would predict an increase in correct hearing without an increase in false hearing (e.g., NAM, Luce & Pisoni, 1998, and PARSYN, Luce et al., 2000).

Reliance on contextual information is generally adaptive because context may only rarely be misleading. Consequently, older adults’ greater reliance on context is generally useful as a means of compensating for hearing deficits. We characterize reliance on context as focusing on a larger unit of the word-in-context rather than focusing on the individual word. These two levels serve as qualitatively different bases for auditory judgments, analogous to the letter and word levels in investigations of visual perception aimed at the word superiority effect (e.g., Reicher, 1969; Wheeler, 1970). For hearing, focusing at the word level is a more effortful method for parsing heard messages than is focusing at the word-in-context level, but is necessary for correctly identifying words that are spoken in an incongruent context. Such focus was called a “close look” by Bruner (1957), who noted that participants must constrain their perception to specific features of an object in order to avoid top-down biases that result in perceptual illusions.

Supportive context has been shown to reduce perceptual effort in older listeners (McCoy et al., 2005); thus, older adults might be less able than young adults to effortfully focus attention at the word level so as to avoid false hearing. This may be true even when they are warned that context will often be misleading, and even when ability to identify words without supportive context has been equated by reducing background noise for older adults. Similar hypotheses have been supported by findings in the visual domain reported by Jacoby, Rogers, Bishara, and Shimizu (submitted). In their procedure, older and young adults had to identify briefly flashed words. They found that if a word was preceded by a misleading prime, older adults were more likely to report subjectively “seeing” the primed word (false seeing).

For the same reasons that older adults show greater false memory and false seeing in tasks where the fluency of a response can be misleading (e.g., Hay & Jacoby, 1999; Jacoby, Bishara, Hessels, & Toth, 2005a; Jacoby, et al., submitted), older adults were expected to show greater false hearing from reliance on misleading context. Jacoby and Rhodes (2006) review experiments demonstrating that older adults are much more prone to false memory than are young adults, and describe those findings in terms of a dual-process model of memory that distinguishes between recollection and accessibility bias. Recollection is described as a consciously controlled, effortful basis for responding that is tightly constrained by retrieval cues. In contrast, accessibility bias is a less effortful, more automatic basis for responding that reflects more global factors such as prior experience in the form of habits and context. They argue that the controlled processes necessary for supporting recollection diminish with age, rendering bias effects more influential with the result that the probability of false memory is increased. Just as avoiding false memory requires a close look at the past (constraining processing in ways required for recollection), avoiding false seeing and false hearing requires a close look at, or a close listen to, the present.

Present Experiments

The present experiments investigated age differences in false hearing with procedures akin to those used by Jacoby and colleagues (e.g., Hay & Jacoby, 1999; Jacoby et al., 2005a) to investigate false memory. Congruent and incongruent associative contexts were created by utilizing a cue-target training procedure that required participants to learn semantically related pairs of words (e.g. BARN-HAY). At test, participants listened to the cue word (e.g., BARN) presented in the clear and then listened to a word masked by white noise. For congruent trials, the word in noise was the same as the trained target (HAY). For incongruent trials, the word in noise was a phonological neighbor that formed a minimal pair (see Luce & Pisoni, 1998) with the trained target (PAY). For baseline trials (the control condition), the word in noise was unrelated to the training target (FUN). Experiment 1 manipulated the level of noise during presentation of the target word and utilized a two-alternative, forced-choice test (HAY/PAY). After selecting an alternative, participants indicated how confident they were that they had identified the word correctly.

In Experiment 2, the false hearing procedure was generalized to a more naturalistic listening situation. Instead of white noise, Experiment 2 utilized a 6-talker babble-noise masking procedure and an open-set, cued identification task, in which the participant said aloud the word that was presented in the noise. To control for age-related differences in speech perception ability, the noise level for each participant was set to their 50% speech reception threshold (SRT, ASHA, 1988). This was important to allow certainty that any age differences in false hearing that were observed did not result from age-related sensory differences in hearing.

In both experiments, participants’ meta-audition was assessed by analyzing mean confidence data, high confidence errors, and monitoring resolution. Confidence ratings have been commonly employed in studies of aging and metamemory (e.g., Jacoby, Wahlheim, Rhodes, Daniels, & Rogers, 2010; Kelley & Sahakyan, 2003; Lovelace & Marsh, 1985; Perfect & Stollery, 1993), and were chosen to assess how well participants subjectively thought they were hearing. As mentioned earlier, resolution measures the extent to which a person’s confidence discriminates between correct and incorrect responses. Resolution was assessed using Goodman-Kruskal (Goodman & Kruskal, 1954) gamma correlations at the item-level to examine the correspondence between confidence and accuracy (see Nelson, 1984, for a discussion of the advantages of using gamma as a measure of metamemory). Like a Pearson’s correlation coefficient, a gamma correlation ranges from −1 to +1, where the absolute value reflects the degree of association, and the direction of the association is indicated by positive or negative values. A strong gamma correlation indicates that confidence strongly distinguishes between correct and incorrect responses, whereas a weak gamma correlation implies little association between confidence and accuracy.
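As an illustration of the resolution measure, the following minimal Python sketch computes a Goodman-Kruskal gamma from one participant's item-level confidence ratings and accuracy scores, using the standard definition gamma = (C − D) / (C + D), where C and D are the numbers of concordant and discordant item pairs. The function name and data layout are hypothetical and are not taken from the authors' analysis software.

```python
from itertools import combinations


def goodman_kruskal_gamma(confidence, correct):
    """Return gamma = (C - D) / (C + D), or None when it is undefined.

    confidence: per-item confidence ratings (e.g., 50-100).
    correct:    per-item accuracy, 1 for a correct response and 0 otherwise.
    """
    concordant = 0
    discordant = 0
    for (c1, a1), (c2, a2) in combinations(zip(confidence, correct), 2):
        product = (c1 - c2) * (a1 - a2)
        if product > 0:
            concordant += 1   # the more confident item is the correct one
        elif product < 0:
            discordant += 1   # the more confident item is the incorrect one
        # pairs tied on confidence or on accuracy are ignored
    if concordant + discordant == 0:
        # e.g., a single confidence value was used, or accuracy was 0% or 100%
        return None
    return (concordant - discordant) / (concordant + discordant)
```

With this definition, a participant whose high-confidence responses are all correct and whose low-confidence responses are all incorrect receives a gamma of +1, and the reverse pattern yields −1.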

For resolution, we expected to find an interaction between age and context type. For congruent contexts, we expected the resolution of confidence judgments to be higher for older than for young adults. When context is congruent, reliance on context provides a valid basis for accuracy, as does reliance on the audibility of the word. In contrast, for incongruent items we expected the resolution of confidence judgments to be lower for older than for young adults. In the incongruent condition context provides invalid cues for accuracy; thus older adults’ greater reliance on context should result in poorer resolution for their confidence judgments. An interaction of this sort would provide strong evidence for a qualitative difference in the basis for confidence used by older versus young adults.

Experiment 1

Methods

Participants

Sixteen undergraduate students were recruited through the Washington University subject pool and received either $15 or course credit for their participation. These young participants ranged in age from 18 to 22 years (M = 19.75, SD = 1.18). Sixteen older adults were recruited through the Washington University Older Adult subject pool. These older participants ranged in age from 65 to 82 years (M = 75.63, SD = 4.49), and received $15 for their participation. The mean score on the Vocabulary subtest of the Shipley Institute of Living Scale (Shipley, 1967) was lower for young participants (M = 33.75, SD = 2.77) than for older participants (M = 35.81, SD = 2.64), t(30) = 2.15, p < .05. All participants reported normal or corrected-to-normal vision.

Pure Tone Audiometric thresholds were obtained for all participants, and these thresholds were used to screen for hearing loss. Participants were tested using an audiometer in a double-walled sound-attenuating booth. None of the older adults or young adults had thresholds exceeding 25dB HL for frequencies of 500, 1000, and 2000Hz.

Materials and Design

Signal-to-noise ratio (SNR; -10 or -15) and trial type (Congruent, Incongruent, or Baseline) were manipulated within participants. A total of 72 three-word sets including one cue word (e.g. barn), one associatively related monosyllabic target word (e.g. hay), and one non-associatively related monosyllabic alternate word that was phonologically confusable with the target word (e.g. pay) were generated using the Washington University Neighborhood Database (Sommers, 2000) to create the congruent and incongruent trials. The three-word sets were divided into four groups of eighteen, which were balanced for word frequency and phonological confusability. These groups were rotated across participants through each of the combinations of congruent/incongruent trial types and SNR levels (e.g. congruent -10, congruent -15, incongruent -10, and incongruent -15). A total of thirty-six three-word sets were used for constructing baseline trials. Those three-word sets contained a cue word (e.g. cloud) and two monosyllabic words that were not associatively related to the cue, but were phonologically confusable with one another (e.g. cash, dash). Two groups of eighteen of these baseline word-sets were balanced for word frequencies and phonological confusability, so that they could be rotated across participants through the two SNR levels.
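The rotation of the four item groups through the four congruent/incongruent × SNR combinations can be sketched as a simple Latin-square style assignment. The exact rotation scheme is not stated in the text, so the offset rule below (group index plus participant index, modulo four) is an assumption made only for illustration, as are all names in the code.

```python
# The four combinations of trial type and SNR named in the text.
CONDITIONS = [
    ("congruent", -10),
    ("congruent", -15),
    ("incongruent", -10),
    ("incongruent", -15),
]


def assign_conditions(participant_index, n_groups=4):
    """Map each item-group index to a (trial type, SNR) condition.

    Assumed scheme: shift the group-to-condition mapping by one position
    for each successive participant, so that every group appears in every
    condition once per cycle of four participants.
    """
    return {
        group: CONDITIONS[(group + participant_index) % len(CONDITIONS)]
        for group in range(n_groups)
    }
```

Under this assumed scheme, sixteen participants per age group would complete four full rotation cycles.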

The auditory stimuli were spoken versions of the above word sets recorded at 11025 Hz using a 16-bit Digital-to-Analog converter with a Shure microphone in a double-walled sound attenuating booth. Words were spoken by a female speaker with a standard American dialect. Root-mean-square (RMS) amplitude of the stimuli was equated. Stimuli masked by noise were generated by taking the clear speech file (65dB SPL) and mixing it with a corresponding white noise file (75dB SPL for the -10 SNR condition and 80 dB SPL for the -15 SNR condition) using Adobe Audition v1.5 (Adobe, 2004).
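The relation between the presentation levels and the SNR values can be made concrete with a short Python sketch. The SNR in dB is the speech level minus the noise level (65 − 75 = −10; 65 − 80 = −15), and a masker can be scaled to a target SNR by adjusting its RMS amplitude. The stimuli in the experiment were mixed in Adobe Audition; the code below is only an illustrative sketch with hypothetical function names, and it uses generic white noise instead of the actual noise files.

```python
import math
import random


def snr_from_levels(speech_db, noise_db):
    """SNR in dB is the difference between speech and noise levels."""
    return speech_db - noise_db


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def white_noise(n_samples, seed=0):
    """Gaussian white noise; a stand-in for the experiment's noise files."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that 20*log10(rms(speech)/rms(noise)) == snr_db,
    then add it to the speech sample by sample.

    Returns the mixture and the scaled noise.
    """
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / rms(noise)
    scaled = [n * gain for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled
```

At −10 dB SNR the noise RMS amplitude is about 3.16 times that of the speech, and at −15 dB it is about 5.62 times.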

Procedure

Training Phase

The procedure for the experiment was broken into two phases: the training phase and the perception test phase. The purpose of the training phase was to create a strong context by training the cue-target pairs to a high level of accuracy. During the training phase, participants were seated 18 inches away from the front of a computer screen and learned a series of word pairs that they were told to remember for a later memory test. For each pair, the cue word (e.g. barn) was presented on screen, and then 100 ms later was presented aurally via headphones. Fifty ms later, the associatively related target word (e.g. hay) was presented, adjacent to the cue word, aurally and visually in the same fashion. Both words presented visually remained on the screen for the entirety of aural presentation. Each cue-target pair was presented a total of five times. Pairs were presented in random order, with the limitation that all 72 pairs were presented once before any pair was presented an additional time.
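One way to satisfy the ordering constraint described above is block randomization: shuffle all pairs, present them, and repeat for each of the five passes. The sketch below assumes this reading, in which every pass is completed before the next begins; the function name and parameters are hypothetical.

```python
import random


def training_order(pairs, repetitions=5, rng=None):
    """Return a presentation order in which every pair appears once
    within each block before any pair is repeated in the next block."""
    rng = rng or random.Random()
    order = []
    for _ in range(repetitions):
        block = list(pairs)   # copy so the caller's list is not reordered
        rng.shuffle(block)
        order.extend(block)
    return order
```

For 72 pairs and five repetitions this yields 360 training presentations.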

The final component of the training phase was a 72-item cued recall test to assess training. On each trial, the cue word was presented visually and aurally, but with a question mark following the word (e.g. BARN -?). Participants had five seconds to provide the target word and were encouraged to guess if they did not know. After a response was provided, or five seconds elapsed, the target word was presented visually adjacent to the cue word (e.g. BARN – HAY), and the word was played over the headphones. All participants correctly recalled 80% or more of these words.

Perception test phase

During the 108-trial perception test phase, there were three different trial types: congruent, incongruent, and baseline trials. There were 36 of each trial type, half of which had target words presented at an SNR of -10, the other half at -15. Order of conditions in the perceptual test phase was randomized for each participant, with the limitation that no more than three trials of a given type were presented consecutively. Participants were informed that they would again be hearing a series of cue-target pairs, but that during this portion of the experiment the target word would be masked by noise. Participants were told that after the word in noise was played, two words would appear on the screen. Their task was to pick which of the two words was presented in the noise by saying the word aloud (i.e., two alternative forced-choice test, 2AFC). Participants were warned that some of the pairs in the perception test phase would be the same as the pairs in the training phase (e.g. BARN – HAY) but that some of the pairs would be different (e.g. BARN – PAY), and because of this they should only respond on the basis of what they heard in the noise, not on what they had learned earlier. This last point was printed in capital letters on the computer screen and was emphasized by the experimenter when recapitulating the instructions.
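The constraint that no more than three trials of a given type occur consecutively can be met by rejection sampling: shuffle the trial list and reshuffle until the longest run is short enough. The sketch below is one possible implementation, not the authors' procedure; it ignores the SNR factor for simplicity, and all names are hypothetical.

```python
import random


def max_run_length(labels):
    """Length of the longest run of identical consecutive labels."""
    longest = 0
    current = 0
    previous = object()   # sentinel that equals no real label
    for label in labels:
        current = current + 1 if label == previous else 1
        longest = max(longest, current)
        previous = label
    return longest


def constrained_order(trial_types, per_type=36, max_run=3,
                      rng=None, max_attempts=10000):
    """Shuffle trials until no type occurs more than max_run times in a row."""
    rng = rng or random.Random()
    trials = [t for t in trial_types for _ in range(per_type)]
    for _ in range(max_attempts):
        rng.shuffle(trials)
        if max_run_length(trials) <= max_run:
            return list(trials)
    raise RuntimeError("no valid order found within max_attempts")
```

Calling `constrained_order(["congruent", "incongruent", "baseline"])` returns a 108-trial sequence satisfying the run-length limit.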

After providing an identification judgment, participants were instructed to indicate how confident they were that they had provided the correct response. As in the 2AFC perception test, the participants gave their rating aloud and the experimenter recorded the response. The 50-point scale for this judgment ranged from 50-100. Participants were encouraged to use the full range of the scale. The scale began at 50 because with the 2AFC test, participants had a 50% chance of providing the correct response based on pure guessing. As with the identification judgments, participants were instructed to make their confidence judgments only on the basis of what they heard in the noise.

After participants received all instructions for the perceptual test phase, they were asked to explain the procedure in their own words. Participants’ reports had to include 1) the identification judgment, 2) the confidence rating, and 3) the misleading nature of context. The instructor verbally repeated instructions and questioned participants until each participant’s procedure report was complete. All participants’ procedure reports were complete before the beginning of the perceptual test phase.

The timing for each trial was as follows: 200 ms before the first member of a pair (the cue) was presented over the headphones, a single asterisk “*” was presented visually in the top center portion of the screen until the offset of the aurally presented word. Following a 1000 ms inter-stimulus interval, two asterisks “**” were presented visually in the top center of the computer screen. 200 ms later the target word, masked by noise, was presented aurally. The asterisks were used so that participants would have a visual indication of which word was being played over the headphones, but were offset so that they did not distract the participants while the word was being played.

Results and Discussion

Unless otherwise specified, only effects that were found to be significant at the α < .05 level and that were not involved in a higher-order interaction are reported. When Mauchly’s test of sphericity was significant, the Greenhouse-Geisser correction for MSE and degrees of freedom was used¹. Furthermore, when Levene’s test for equality of variances was significant during post-hoc t-tests, the degrees of freedom were corrected.

Hit rates

Identification accuracy was measured as the proportion of trials on which participants correctly identified the word in noise (hits). Figure 1 shows that while young adults had more hits than older adults on baseline trials, older adults had more hits than young adults on congruent trials. This finding is consistent with the notion that older adults effectively utilize context to compensate for age-related hearing loss (Hutchinson, 1989; Pichora-Fuller, Schneider, & Daneman, 1995; Pichora-Fuller, 2008; Sommers & Danielson, 1999; Wingfield et al., 2005). However, in our novel incongruent condition, it is clear that this contextual facilitation came at a cost: older adults had fewer hits than young adults on incongruent trials. This pattern of greater contextual facilitation for older adults on congruent trials and greater contextual interference on incongruent trials was consistent across the -10 and -15 SNR conditions, indicating a stronger reliance upon context for older adults than young adults.

To confirm the statistical reliability of these findings, hit rates were analyzed using a 2 (age: young, older) × 2 (SNR: -10, -15) × 3 (trial type: congruent, baseline, incongruent) mixed-model analysis of variance (ANOVA). The age × trial type interaction was significant, F(1.85, 55.59) = 7.71, MSE = .37, p<.001, η_p² =.20. Post-hoc F-tests applying the Bonferroni Type I error correction revealed that across SNRs older adults showed more hits on congruent trials (M = .83, SD = .14) than did young adults (M = .72, SD =.16), F(1, 30) = 4.04, p<.05, but fewer hits on incongruent trials (M= .24, SD = .17) than did the young adults (M = .43, SD = .16), F(1, 30) = 10.97, p<.01. Baseline performance was lower for older adults (M = .55, SD = .084) than for young adults (M = .62, SD = .085), F(1, 30) = 5.02, p <.05, indicating an age group difference in context-free speech perception ability. As expected, participants had greater hits at better SNRs, as revealed by a significant main effect of SNR, F(1, 60) = 31.28, MSE = .386, p<.001, η_p² =.51.

Confidence data

For the incongruent condition, choosing the alternative favored by context (i.e., the incorrect response) and holding high confidence in its selection served as a measure of false hearing. If participants were aware of cases in which they had failed to identify the word presented in noise and instead responded on the basis of context, then a choice predicated upon context might be considered a low-confidence “best guess”. In contrast, if reliance on context resulted in the chosen word being subjectively experienced as “heard”, then confidence in context-favored responses should be high. We expected older adults, as compared to young adults, to show greater confidence in words favored by context in both the congruent and incongruent test conditions.

The confidence pattern depicted in Figure 2 shows the mean confidence rating ascribed to responses that were favored by context (e.g. congruent hits and incongruent false alarms). The baseline condition serves as a reference point for correct identification made without prior context. The most striking finding that emerges from an examination of Figure 2 is a “V-shaped” function indicating that older adults were very confident when choosing a response favored by context. In contrast, young adults’ confidence judgments were minimally influenced by context. Young adults were more confident in their choices made in the baseline condition than were older adults. The 2 (age: young, older) × 2 (SNR: -10, -15) ×3 (trial type: congruent, baseline, incongruent) mixed-model ANOVA on mean confidence ratings revealed a significant 3-way interaction of trial type, SNR, and age, F(1.74, 60) = 3.40, MSE = 68.97, p<.05, η_p² =.10. Separate analyses done for the two SNRs revealed that in the -10 condition, the age × trial type interaction was highly significant, F(1.58, 60) = 25.803, MSE = 851.11, p<.001, η_p² =.46, but was smaller in the -15 condition, F(1.21, 60) = 6.76, MSE = 396.54, p<.01, η_p² =.18. We attribute this attenuation of the age × trial type interaction to potential floor effects on baseline trials for the −15 SNR condition.

Figure 2. Confidence in responses favored by context in Experiment 1. Confidence in hits is plotted for congruent and baseline trials. Confidence in false alarms is plotted for incongruent trials. SNR denotes signal-to-noise ratio. Error bars represent standard errors.

High Confidence Errors (Dramatic False Hearing)

We define “dramatic false hearing” as erroneously selecting the alternative favored by context in the incongruent condition and expressing 100% confidence that the selected word was the one presented in noise. Older adults (M = .20, SD = .27) tended to be more likely than young adults (M = .05, SD = .05) to exhibit dramatic false hearing across SNRs, as shown by a main effect of age that approached significance, F(1, 30) = 3.60, MSE = 72.25, p < .07, η_p² = .11. Dramatic false hearing was also less likely to occur in the -15 (M = .04, SD = .14) than in the -10 (M = .12, SD = .25) SNR condition, as indicated by a significant main effect of SNR, F(1, 30) = 9.23, MSE = 33.06, p < .01, η_p² = .24. In Experiment 2, age differences in dramatic false hearing are further examined with a more powerful design and a greater number of participants.
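The scoring of dramatic false hearing can be sketched in a few lines of Python. The sketch assumes that the reported means are proportions of incongruent trials, which the text does not state explicitly; the trial record format and function name are also hypothetical.

```python
def dramatic_false_hearing_rate(trials):
    """Proportion of incongruent trials with an incorrect (context-favored)
    response given at the maximum confidence rating of 100.

    Each trial is a dict with keys "type", "correct", and "confidence".
    In the 2AFC test, an incorrect response on an incongruent trial is
    necessarily the alternative favored by context.
    Returns None if there are no incongruent trials.
    """
    incongruent = [t for t in trials if t["type"] == "incongruent"]
    if not incongruent:
        return None
    dramatic = sum(
        1 for t in incongruent
        if not t["correct"] and t["confidence"] == 100
    )
    return dramatic / len(incongruent)
```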

Resolution

Recall that resolution is a measure of metacognitive monitoring that assesses the extent to which confidence in a response can discriminate whether the response was correct or not. As stated in the introduction, we expected the resolution to be higher for older than for young adults on congruent trials, and to be lower for older adults than for young adults on incongruent trials. This interaction would show that age groups differ in their bases for responding, with young adults more likely to respond on the basis of sensory information and older adults being more reliant upon context.

Resolution was assessed using gamma correlations. When a participant used only one point on the confidence scale or achieved either 0% or 100% accuracy, a gamma correlation could not be calculated. For these reasons, gamma correlations could not be calculated for 2 young adults and 6 older adults, who were excluded from the analysis. Figure 3 shows the resolution data from the remaining 14 young adults and 10 older adults².

Figure 3. Gamma (γ) correlation data from Experiment 1. Values above the zero line correspond to a positive relationship between confidence and accuracy (good monitoring), whereas values below the zero line correspond to a negative relationship between confidence and accuracy (poor monitoring). SNR denotes signal-to-noise ratio. Error bars represent standard errors.

The resolution data presented in Figure 3 reveal the predicted interaction, where older adults had better monitoring than young adults on congruent trials, but poorer monitoring than young adults on incongruent trials. The 2 (age: young, older) × 2 (SNR: -10, -15) ×3 (trial type: congruent, baseline, incongruent) mixed-model ANOVA on the gamma correlations revealed that the predicted age × trial type interaction was highly significant, F(2,44) = 10.47, MSE = 2.44, p<.001, η_p² =.32. Post-hoc F-tests using the Bonferroni correction revealed age group differences to be significant on congruent trials, F(1,22) = 20.82, MSE = 1.05, p<.001, η_p² =.49, and incongruent trials, F(1,22) = 10.55, MSE = 1.06, p<.01, η_p² =.32, but not baseline trials, F(1,22) = 2.81, p>.10, ns. There was also a main effect of SNR, F(1,22) = 4.44, MSE = 1.18, p<.05, η_p² =.19, which suggests that resolution was poorer in the −15 SNR condition.

The resolution results provide strong evidence of a qualitative difference between young and older adults in bases for responding. In the congruent condition, both sensory and context information were valid, and older adults showed a high positive correspondence between confidence and accuracy. In the incongruent condition, context was invalid, and older adults showed a strong negative correspondence between confidence and accuracy. This negative resolution implies that the more confident the listener was in his or her response, the more likely he or she was to be incorrect. The magnitude of these correlations is important as an indication of the extent to which older adults relied on context. Although young adults did show negative gammas in the incongruent condition, they were not as strong as those of the older adults.

Summary

Experiment 1 employed a 2AFC procedure and demonstrated that older adults were more reliant on context than were young adults when selecting an item as having been presented in noise. Older adults also were more prone to show dramatic false hearing, by being maximally confident in their erroneous selection of an item in the incongruent context condition. The finding of a significant age × trial type interaction for resolution provided strong evidence that young and older adults relied on qualitatively different bases for judging confidence.

The larger increase in correct responding in combination with the larger increase in false hearing shown by older adults provides evidence that the greater advantage of providing facilitative context for older adults found in prior experiments (Dubno, Ahlstrom, & Horwitz, 2000; Hutchinson, 1989; Pichora-Fuller, Schneider, & Daneman, 1995; Sommers & Danielson, 1999) results from a bias effect that is akin to that responsible for higher false seeing (Jacoby, et al., submitted) and false remembering by older adults (e.g., Hay & Jacoby, 1999).

Experiment 2

Experiment 2 included several changes from Experiment 1. The most important change is that a noise-adjustment procedure we call titration was used to equate performance of young and older adults on control trials. Because young adults had a higher baseline hit rate than older adults in both the −10 and −15 SNR listening conditions, some might argue that the greater false hearing found for older adults in Experiment 1 simply reflects age-related deficits in hearing. This difference in baseline accuracy does not truly compromise conclusions from Experiment 1 because of the general lack of interactions between SNR and age. However, in Experiment 2, the SNR was adjusted for each participant to a level that would produce approximately 50% correct identification, often referred to as the speech reception threshold (SRT; ASHA, 1988). SRT is a common clinical measure used to assess an individual’s ability to understand speech in noise. Our goal was to show that results from Experiment 1 could be replicated even when young and older adults were equated in their performance on baseline items (cf. for false memory, Jacoby, et al., 2005a).
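As an illustration only, the sketch below shows one generic way an SNR could be adjusted toward the 50% correct point: a simple 1-up/1-down staircase, which makes listening harder after each correct response and easier after each error. This is an assumption made for exposition and is not a description of the titration procedure actually used; the function name, starting SNR, and step size are hypothetical.

```python
def staircase_track(responses, start_snr=0.0, step_db=2.0):
    """Return the SNR (in dB) before each trial and after the last one.

    responses: sequence of booleans, True when the word was identified.
    A correct response lowers the SNR by step_db (more noise);
    an incorrect response raises it by step_db (less noise).
    A 1-up/1-down rule of this kind hovers around 50% correct.
    """
    snr = start_snr
    track = [snr]
    for correct in responses:
        snr += -step_db if correct else step_db
        track.append(snr)
    return track


# Hypothetical run: two correct identifications followed by one error.
example_track = staircase_track([True, True, False])
```

Under these assumed settings, the example run moves from 0 dB to −2 dB and −4 dB after the two correct responses and back to −2 dB after the error.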

Another change is that Experiment 2 gave participants the opportunity to act on the basis of their subjective experience. Participants were given a volunteer/withhold response option (Kelley & Sahakyan, 2003; Koriat & Goldsmith, 1996) after producing an identification and confidence response. Participants were told that the computer was keeping score by giving one point for correct responses that were volunteered, and penalizing one point for incorrect responses that were volunteered. Participants were told to attempt to maximize their score. Further, they were instructed that they could improve their score by withholding responses that they produced that were of uncertain accuracy. Based on the resolution data from Experiment 1, we expected older adults to be less able to successfully monitor their accuracy in the incongruent condition. Such monitoring is critical for determining which responses to withhold (Koriat & Goldsmith, 1996), and therefore we expected older adults to benefit less from the option of withholding responses than would young adults. In particular, compared with young adults, we expected older adults to show an increased probability of false hearing as well as increased volunteering of words that were falsely heard. In naturalistic settings, older adults do have the ability to withhold responses and could enhance their hearing performance by doing so. Consequently, lessened ability to withhold erroneously heard items is of applied as well as of theoretical interest.
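The scoring rule described above can be sketched as follows. The rule itself (one point gained for a volunteered correct response, one point lost for a volunteered error) comes from the instructions given to participants; the assumption that withheld responses contribute zero points, the function name, and the example trials are hypothetical.

```python
def volunteer_score(trials):
    """Score a list of (correct, volunteered) trials.

    +1 for each volunteered correct response,
    -1 for each volunteered incorrect response,
     0 for each withheld response (assumed here).
    """
    score = 0
    for correct, volunteered in trials:
        if not volunteered:
            continue  # withheld responses neither gain nor lose points
        score += 1 if correct else -1
    return score


# Hypothetical listener who volunteers a falsely heard word.
volunteers_error = volunteer_score([(True, True), (False, True)])

# Hypothetical listener who withholds the same falsely heard word.
withholds_error = volunteer_score([(True, True), (False, False)])
```

In this example the listener who volunteers the error scores 0, whereas the listener who withholds it scores 1, which is why accurate monitoring of one's own errors is needed to benefit from the withhold option.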

A third change was a shift from a closed-set (e.g., 2AFC) to an open-set identification procedure to simulate more naturalistic hearing situations. Unlike in 2AFC, individuals in typical listening situations are not presented with a list of potential responses. In Experiment 2, instead of two viable alternatives appearing on the screen, a single question mark “?” appeared. Participants were instructed to say aloud the word that they believed was presented in the noise. As in Experiment 1, participants followed their identification attempt with a confidence judgment, this time ranging from 0 to 100. The type of noise was changed from white noise to 6-talker babble to more closely simulate real-world situations of listening in noise.