Abstract
Stuart Gatehouse was one of the pioneers of cognitive hearing science. The ease of language understanding (ELU) model (Rönnberg) is one example of a cognitive hearing science model where the interplay between memory systems and signal processing is emphasized. The mismatch notion is central to ELU and concerns how phonological information derived from the signal matches/mismatches phonological representations in lexical and semantic long-term memory (LTM). When signals match, processing is rapid, automatic, and implicit, and lexical activation proceeds smoothly. Given a mismatch, lexical activation fails, and working or short-term memory (WM/STM) is assumed to be invoked to engage in explicit repair strategies to disambiguate what was said in the conversation. In a recent study, negative long-term consequences of mismatch were found by relating hearing loss to episodic LTM in a sample of older hearing-aid wearers; STM was intact (Rönnberg et al.). Beneficial short-term consequences of a binary masking noise reduction scheme on STM were obtained in 4-talker babble for individuals with high WM capacity, but not in stationary noise backgrounds (Ng et al.). This suggests that individuals with high WM capacity inhibit semantic auditory distraction in 4-talker babble while exploiting the phonological benefits, in terms of speech quality, provided by binary masking (Wang). Both long-term and short-term mismatch effects, apparent in data sets including behavioral as well as subjective (Rudner et al.) data, need to be taken into account in the design of future hearing instruments.
Keywords: cognition, hearing aids, working memory, long-term memory, signal processing
Introduction
Stuart Gatehouse was a true pioneer of cognitive hearing science. He forcefully argued that cognition plays a vital role in hearing, especially when it comes to the interaction between signal processing in hearing aids and cognitive function, and that this should be reflected in the field of audiology.
This article is about cognitive hearing science. Cognitive hearing science recognizes the impact of cognitive functions on hearing and speech understanding. When Gatehouse made his major contributions to the area, it had not yet been named. Nevertheless, he was a pioneer in demonstrating the importance of acknowledging individual differences in certain cognitive skills when it comes to fitting different signal-processing rationales in hearing aids (see Gatehouse, Naylor, & Elberling, 2003, 2006a, 2006b).
The core research question of cognitive hearing science is the nature of the interaction between bottom-up and top-down processes that promote understanding in a variety of communication conditions (Arlinger, Lunner, Lyxell, & Pichora-Fuller, 2009). The interaction can be manifest at different levels in the brain, from the peripheral organ to the cortex; communication may be in any modality, including both spoken and signed languages as well as their visual and tactile derivatives. The focus of this article is on memory systems and how memory systems help us understand the cognitive side of dealing with signal processing in hearing aids and listening in the real world.
Memory Systems
Memory is a multifaceted concept including a number of well-defined and researched subsystems that can be described within a unifying framework (e.g., Squire, 2009). The fundamental division is between short-term memory (STM) and long-term memory (LTM). STM is characterized by its limited capacity and brief duration, whereas LTM is more capacious and of longer duration. LTM can be usefully subdivided into episodic and semantic memory. Episodic memory is memory of personally experienced events (tagged by time, place, space, emotions, and context; Tulving, 1983). An example is recalling what you had for breakfast this morning. Semantic memory refers to general conceptual knowledge, without personal reference, that may or may not be available for conscious awareness, for example, general world knowledge, vocabulary, and phonology (Tulving, 1983; Wheeler, Stuss, & Tulving, 1997). For example, you probably have no episodic recollection of when or how you learned the concept “memory” but it is nonetheless firmly embedded in your general conceptual knowledge in semantic memory. Working memory (WM) and STM refer to an individual’s capacity to hold and manipulate a set of items currently in mind (see Baddeley, 1990, 2000). A well-known example is the capacity to hold telephone numbers in mind while dialing them. STM is the older theoretical concept that focuses on storage (Atkinson & Shiffrin, 1968), whereas WM focuses on the dual storage and processing aspects (Baddeley, 2000; Daneman & Carpenter, 1980; Rönnberg et al., 2011). However, the two terms are sometimes used interchangeably. Whereas episodic and semantic memory belong to LTM and are concerned with the “there” and “then” aspects of memory, WM and STM are concerned with the “here” and “now” aspects of memory processing. Typically, “STM” refers to a time-window of 5 to 10 s, whereas “LTM” in experimental and everyday situations refers to a timescale of minutes, hours, and beyond. These two aspects of memory are assumed to be important in outcome evaluations of signal processing in hearing aids and for targeting hearing aid interventions.
In general, when a speech signal is made audible via amplification, effortless, automatic, fast, and precise understanding is promoted (i.e., implicit processing, see further under ELU model). It is no easy task, however, to make sounds audible without adding distortions. Modern hearing aids use wide dynamic range compression (WDRC) amplification, which amplifies weak input sounds more than loud input sounds. Thus, gain regulation systems may cause distortions of the original speech cues in different ways and may themselves generate a need for extra use of explicit WM resources by the listener (see further under ELU model). If the regulation of gain during rapid drops in the input level is too slow (long release time in the compressor), weak speech cues may be underamplified, and speech cues will be less salient. Likewise, if the regulation of gain during fast increases in the input level is too slow (long attack time), sounds may be too loud, attracting unnecessary attention, and thus consuming WM resources. However, if the regulation system is fast, secondary speech sources may also introduce artifacts in the regulation.
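To make the role of attack and release times concrete, the following minimal sketch implements a single-channel, WDRC-style gain regulator. It is a generic textbook-style illustration, not the algorithm of any particular hearing aid, and all parameter values are arbitrary: the input level is tracked with separate attack and release time constants, and above a knee point the output grows only 1/ratio dB per input dB, so a long release time keeps the gain depressed after a loud sound and weak cues that follow it are underamplified.

```python
import numpy as np

def wdrc_gain(x, fs, threshold_db=-40.0, ratio=3.0, attack_ms=5.0, release_ms=200.0):
    """Single-channel WDRC sketch: a slow release keeps the gain low after a
    loud sound, so weak speech cues that follow may be underamplified."""
    x = np.asarray(x, dtype=float)
    # Convert attack/release times to per-sample smoothing coefficients.
    att = np.exp(-1.0 / (fs * attack_ms / 1000.0))
    rel = np.exp(-1.0 / (fs * release_ms / 1000.0))
    level_db = -100.0                      # running input-level estimate (dB)
    out = np.empty_like(x)
    for n, sample in enumerate(x):
        inst_db = 20 * np.log10(abs(sample) + 1e-9)
        # Attack smoothing when the level rises, release smoothing when it falls.
        coeff = att if inst_db > level_db else rel
        level_db = coeff * level_db + (1 - coeff) * inst_db
        # Above the knee point, reduce gain so output grows 1/ratio dB per input dB.
        excess = max(level_db - threshold_db, 0.0)
        gain_db = -excess * (1 - 1 / ratio)
        out[n] = sample * 10 ** (gain_db / 20)
    return out

# Hypothetical example: a loud tone burst followed by a weak segment.
fs = 16000
t = np.arange(fs) / fs
sig = np.where(t < 0.5, 0.5, 0.01) * np.sin(2 * np.pi * 1000 * t)
slow_release = wdrc_gain(sig, fs, release_ms=500.0)
fast_release = wdrc_gain(sig, fs, release_ms=50.0)
```

In this toy example, the long release time leaves the weak segment right after the burst with less gain than the short release time does, the kind of loss of cue salience described above; conversely, very fast regulation reacts to every level fluctuation, including those caused by secondary sources.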
In terms of the perceptual challenges that a hearing-impaired person faces, this article reports cognitive data pertinent to online, short-term (“here” and “now”) processing of speech in noise, for example, showing the importance of individual WM capacity under adverse conditions (Foo, Rudner, Rönnberg, & Lunner, 2007; Rudner, Foo, Rönnberg, & Lunner, 2009), especially in fluctuating noise backgrounds with fast-acting compression in the hearing aids (Lunner & Sundewall-Thorén, 2007; Rönnberg, Rudner, Lunner, & Zekveld, 2010). We also show, as reported in a recent study, that noise reduction signal processing (i.e., binary masking) enhances memory recall of sentences presented in a competing-talker background for persons with high WM capacity, even when perceptual accuracy has been accounted for (Ng, Rudner, Lunner, & Rönnberg, 2010). We start (see Empirical Studies) by reporting on long-term (“there” and “then”) negative consequences of hearing loss over a period of years, where we have shown that semantic and episodic LTM representations fade as a result of long-term hearing impairment, despite the use of existing hearing-aid technology (Rönnberg et al., 2011). This suggests that hearing-aid technology needs to be improved to prevent cognitive decline and, ultimately, the risk of dementia (Lin et al., 2011).
These examples show the significance of addressing STM/WM and LTM systems during communication under adverse conditions as well as the importance of measuring top-down cognitive processing abilities in addition to bottom-up sensory and perceptual processing abilities when the outcomes of hearing interventions are being considered.
Memory Systems Seen From a Communicative Perspective
The approach we have taken to memory systems at Linnaeus Centre HEAD in Linköping, Sweden, is from the perspective of how they contribute to facilitation or enrichment of poorly perceived elements of language. The general assumption is that a hearing impairment, a noisy environment, or even the signal processing in the hearing aid may push the perceptual system to rely more on knowledge-driven, compensatory storage and processing capacities of the individual hearing-aid user. The main reason is that under such conditions, the input signal does not automatically activate knowledge stored in semantic LTM, thus hindering lexical access. This is why we assume a continuous interaction between perception of heard linguistic input, semantic LTM, and WM, where WM plays the role of filling in missing pieces of information. This interaction is spelled out in the ease of language understanding (ELU) model.
The ELU Model
The ELU model (Rönnberg, 2003; Rönnberg, Rudner, & Foo, 2010; Rönnberg, Rudner, Foo, & Lunner, 2008) is about the flow (and bottlenecks) of information processing under adverse speech-understanding conditions. Skilled performance is characterized by high lexical access speed, mediated by phonological skills, and by complex WM capacity (storage and processing; Daneman & Carpenter, 1980). Lexical access speed refers to how smoothly and rapidly the meaning of a linguistic token can be retrieved from semantic LTM. For example, old persons are typically slower in deciding whether a letter string is a real word or not (Rönnberg, 1990). When we refer to phonological skills, we refer to the ability to process sublexical units (e.g., syllables) that helps unlock the lexicon (Pulvermüller et al., 2001). Complex WM capacity is not only about storage or updating of information but also about how information currently held in mind is processed in relation to what is stored, for example, what inferences can be drawn and which information can be inhibited when a competing talker is present (Rudner, Rönnberg, & Lunner, 2011).
Thus, we have observed that high lexical access speed, phonological skills, and complex WM capacity generally are useful predictors of lip-reading skill as well as audiovisual, auditory, and tactile speech understanding (Andersson, Lyxell, Rönnberg, & Spens, 2001a, 2001b; Lunner, 2003; Lyxell et al., 1996; Lyxell & Rönnberg, 1987, 1989, 1991, 1992, 1993; Lyxell, Rönnberg, & Samuelsson, 1994; Rönnberg, 1990, 1993; Rönnberg, Andersson, Lyxell, & Spens, 1998; Rönnberg, Arlinger, Lyxell, & Kinnefors, 1989; Rönnberg et al., 1999; see review in Rönnberg, 2003).
The ELU model (see Figure 1) starts at a linguistic-cognitive level by assuming that processing of spoken input involves RApid Multimodal Binding of PHOnology (RAMBPHO; for neural correlates of binding sign and speech tokens, see Rudner, Fransson, Nyberg, Ingvar, & Rönnberg, 2007). The term multimodal refers to audiovisual integration of phonological information, typically syllabic information (see Pulvermüller et al., 2001).
Figure 1.
The ease of language understanding (ELU) model.
Note: For details, see Rönnberg (2003) and Rönnberg et al. (2008, 2011).
If there is a mismatch between incoming linguistic stimuli, processed in a RAMBPHO mode, and phonological representations in semantic LTM, then lexical activation fails, and explicit, WM-based, inference-making processes are assumed to come into play to reconstruct what was said. The phonological mismatch may be due to the type of hearing impairment, unfavorable signal-to-noise ratios (SNRs), distortions created by the signal processing in the hearing aid, or excessive demands on cognitive processing speed. Empirically, we have focused on rather straightforward manipulations of compression release parameters, either switching from habitually used preexperimental settings to other experimental settings (Foo et al., 2007) or training with one setting and then testing with another (Rudner et al., 2009). However, the parametric detail of what changes in, for example, signal processing (and its effects on phonological processing) constitute the minimal changes needed to elicit mismatch with phonological representations in semantic LTM remains to be investigated (see further under Mismatch and WM).
As is illustrated in Figure 1, after an initial mismatch, explicit WM resources are invoked and used—sometimes after successive retrievals of semantic LTM knowledge—to infer and reconstruct what was uttered in, for example, the dialogue. The timescale of WM-based storage and processing is in seconds, running in parallel with the much faster implicit and automatic processes (in milliseconds). By implication, there is a processing economy built into the assumption of different modes of processing information, the implicit and explicit processing. Implicit processing is needed to unload the cognitive system until focused attention and explicit resources are recruited. If there is a match, however, processing runs smoothly by successive lexical activations and grammatical coconstruction (Rönnberg et al., 2008).
To extract meaning from a signal in noise, there is always an interaction between the implicit, automatic, and rapid perceptual processes and the more controlled, slow explicit processes (Rönnberg, 2003; Rönnberg et al., 2008). In other words, there is always an interaction and a relationship between WM/STM and semantic LTM, depending on the listener and the listening task. Context, talker, and dialogue characteristics set the frame for how the ratio of explicit over implicit processes may vary from time to time during discourse. The timescale of this change in ratio may be down to less than 400 ms because of the upper time limits of lexical access; that is, the time window for lexical access varies between 200 and 400 ms, during which a match/mismatch occurs (Stenfelt & Rönnberg, 2009). This in turn demands signal processing in hearing instruments that rapidly and adaptively follows the ratio function over time. Thus, apart from amplification, compression speed, and noise reduction issues, optimization of the hearing instrument may have to take into account factors such as the WM capacity and lexical processing speed of the individual. Also, direct neurophysiological estimates of cognitive load can be one part of a solution to how a future cognitive hearing aid should adapt its signal processing to the fluctuations in the explicit/implicit ratio function (see, for example, Lunner, Rudner, & Rönnberg, 2009).
Language specificity
We have shown in several studies comparing WM for signed and spoken language that there are sign-specific cortical representations in parietal areas and that there are speech-specific WM areas as well (Rudner et al., 2007; Rönnberg, Rudner, & Ingvar, 2004). Interestingly, however, there are no indications that inferior frontal areas subserving phonological analysis differ between the languages (Rönnberg et al., 2004, for Swedish sign language [SSL]; cf. Petitto et al., 2000, for American sign language [ASL]; MacSweeney, Waters, Brammer, Woll, & Goswami, 2008, for British sign language [BSL]). This pattern of data—translated into the language of the ELU model—suggests that the RAMBPHO function operates at a relatively abstract level, common to both sign and speech. Similar neural correlates in other relatively implicit and automatic linguistic functions, such as semantic retrieval (Emmorey et al., 2003, for ASL) and grammatical construction (i.e., Japanese sign language [JSL] compared with spoken Japanese; Sakai, Tatsuno, Suzuki, Kimura, & Ichida, 2005; see also MacSweeney et al., 2006, for BSL), have also been demonstrated empirically. Similarities in cortical activation for lipread, audiovisual, and tactilely mediated speech speak to the same possibility (e.g., Balk et al., 2010; Beauchamp, Yasar, Frye, & Ro, 2008; Calvert, Campbell, & Brammer, 2000; Lee, Truy, Mamou, Sappey-Marinier, & Giraud, 2007; Levänen, 1998; MacSweeney et al., 2004; see also Okada & Hickok, 2009). Thus, there seems to be no language or modality specificity for the implicit RAMBPHO function of language processing, excluding primary sensory processing differences, whereas language specificity is manifest in slower, explicit kinds of language processing such as WM.
Mismatch and WM
We have tested the mismatch assumption by changing the signal processing in the experimental hearing aid relative to the processing habitually used in the hearing aid (Foo et al., 2007), or by means of an intervention period in which the participants became acclimatized to a certain kind of signal processing but were then tested with another signal-processing algorithm (i.e., compression speed manipulations; Rudner et al., 2009). The prediction based on the ELU model is that in conditions of mismatch, an increased dependence on explicit, WM-based processes should occur, whereas in match conditions no such dependence on WM is expected. This basic prediction has been confirmed and also generalized to another Scandinavian language, Danish (Rudner, Foo, Sundewall Thorén, Lunner, & Rönnberg, 2008).
One further piece of evidence makes this very clear: In multiple regression analyses, variance in aided speech-recognition-in-noise performance has been shown to be mainly explained by WM performance as the predictor variable (tested by the reading span test, tapping both the storage and processing aspects of WM; see, for example, Andersson et al., 2001a; Rönnberg et al., 1989) in mismatching conditions, whereas age is the dominating factor explaining the variance in matching conditions (Rudner et al., 2009). In these studies, peripheral hearing loss could not be shown to be of importance for success in aided listening in mismatching conditions, which suggests that even with amplification, top-down, explicit processing mechanisms are what drive speech recognition as long as speech understanding is accomplished under adverse, mismatching conditions. Similar findings were obtained by Foo et al. (2007), who demonstrated that under mismatching conditions, even when partialling out the effects of hearing loss and age, the correlation between WM and aided speech perception in noise remained significant (see also Lunner & Sundewall-Thorén, 2007).
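The analysis logic described above, asking whether the association between WM and aided speech recognition survives when age and hearing loss are held constant, can be illustrated as a partial correlation computed from regression residuals. The variable names and data below are invented for illustration; this is a sketch of the general technique, not the authors' analysis script.

```python
import numpy as np

def partial_corr(x, y, covariates):
    """Correlation between x and y after regressing out the covariates,
    e.g., WM span vs. an aided speech-in-noise score, controlling for
    age and average hearing loss."""
    Z = np.column_stack([np.ones(len(x))] + list(covariates))
    # Residualize x and y on the covariates via least squares.
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

# Hypothetical data: does WM relate to speech-in-noise performance when
# age and pure-tone average (PTA) hearing loss are partialled out?
rng = np.random.default_rng(0)
n = 40
age = rng.uniform(60, 80, n)
pta = rng.uniform(35, 70, n)              # dB HL, invented values
wm_span = rng.normal(20, 4, n)            # reading span score, invented
srt = 0.1 * pta - 0.3 * wm_span + rng.normal(0, 1, n)
print(partial_corr(wm_span, srt, [age, pta]))
```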
Mismatch may still pertain in noisy situations for persons with aided hearing even after acclimatization to signal processing, and it seems that the ability to capitalize on the potential benefits of signal processing, such as those offered by WDRC when listening in modulating noise, is also contingent on WM capacity (Gatehouse et al., 2003; Lunner & Sundewall-Thorén, 2007; Rudner et al., 2011a). When persons with hearing impairment listen to speech in modulating noise at low SNRs without their hearing aids, good WM capacity makes a significant contribution to their ability to report what they hear (Rudner, Rönnberg, & Lunner, 2011a). These effects were independent of the degree of peripheral hearing loss.
Thus, again we find that top-down, explicit processing mechanisms play a crucial role in driving speech recognition and understanding, also under adverse conditions that are independent of the phonological mismatch engendered by hearing-aid signal-processing manipulations. Therefore, it is important for the hearing-aid industry to realize that cognition counts as a relatively general determinant of listening in adverse speech-in-noise conditions as well as in the particular conditions created by manipulations of different types of signal processing. We describe below three recent studies that pertain to mismatch, memory systems, and effort.
Empirical Studies
Study 1: Long-Term Mismatch Effects on Memory Systems
The mismatch function of the ELU model concerns the short-term consequences of mismatch, namely, the importance of being able to switch to an explicit mode of information processing at different points in a conversation. The capacity to compensate for mismatch depends on, for example, WM capacity, the speed with which the lexicon is activated, and the quality of the phonological representations in LTM.
In a recent study (Rönnberg et al., 2011), we studied long-term—as opposed to short-term—consequences of mismatch. This was accomplished by studying the relationships between degree of hearing loss and memory performance in a subsample (n = 160) of relatively old persons (mean age = 75 years) with hearing impairment. The average hearing loss for the poorer ear was moderate to severe (44 dB to 73 dB), with the loss increasing successively from low to high frequencies. The corresponding range for the better ear was 34 dB to 63 dB. The sample was drawn from the Swedish prospective cohort aging study called Betula (Nilsson et al., 2004). The sample was screened for dementia and had verbal and nonverbal IQ scores representative of the overall Betula sample.
ELU predictions
As phonological representations belong to semantic LTM, we assumed that the status of semantic LTM would not be affected by mismatch because semantic LTM is always used in the matching process. Nevertheless, the prediction for episodic LTM was that its status would be related to the frequency of mismatches. If mismatches occur, encoding and storage into episodic memory will fail and, hence, episodic LTM will suffer.
Imagine the number of times per day that you actually encode and retrieve information from episodic LTM: Perhaps you activate your lexicon (i.e., a portion of semantic LTM) 30,000 times per day; with mismatches, perhaps you can only successfully unlock the lexicon, say, 20,000 times per day. With a relatively smaller number of lexical activations, episodic LTM will be less engaged, trained, and maintained. This relative difference, then, causes a relative disuse of episodic LTM.
WM or STM is, on the same account, not expected to be affected, as WM/STM is continuously occupied by retrospectively disambiguating what has been said in a dialogue while simultaneously switching to predictions about what is to come. This dual demand on WM actually forces a continuous compensatory use of WM/STM compared with episodic LTM (Rönnberg et al., 2011). In addition, the ELU model predicts a tight relationship between the status of semantic LTM and episodic LTM. The status of phonological representations in semantic LTM will be decisive for how the match–mismatch functions operate, and, hence, if phonological representations have deteriorated in semantic LTM, episodic LTM functions—encoding and retrieval—will mirror those of semantic LTM. In addition, it is argued that RAMBPHO-delivered information that matches a precise phonological representation in LTM will lead to better encoding into episodic LTM than a fuzzy RAMBPHO representation that matches a fuzzy LTM representation: Although “matching” in both cases, “precise-precise” is better than “fuzzy-fuzzy.” This is why the ELU model predicts a positive relationship between semantic LTM status and episodic LTM status (Rönnberg et al., 2011).
Tests
Episodic memory was indexed by recall of (a) subject-performed tasks (SPTs, two-word imperatives were printed on index cards, the participants enacted the imperatives, one action per 8 s; free recall was oral for the duration of 2 min), (b) sentence recall (text + auditory presentation/encoding of imperative; free recall as for SPTs), and (c) auditorily presented word lists at a rate of one word every 2 s. Participants took the tests with their hearing aids switched on. Episodic LTM was operationally defined by a lag of >7 items intervening between presentation and recall of a certain item (including presentations and recall of other items, see Tulving & Colotla, 1970). Consequently, episodic STM was defined as having a lag of 7 or less. Semantic LTM was indexed by word fluency (i.e., initial letter fluency) and vocabulary.
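The Tulving and Colotla (1970) lag criterion can be expressed as a small scoring rule: for each correctly recalled item, the lag is the number of items presented after it plus the number of items recalled before it, and a lag greater than 7 is scored as retrieval from episodic LTM, otherwise from STM. The sketch below, with invented list items, illustrates that rule; it is not the original scoring code.

```python
def classify_recalls(presented, recalled, ltm_lag=7):
    """Score each correctly recalled item as an STM or LTM retrieval,
    following the lag definition of Tulving and Colotla (1970):
    lag = items presented after the target + items recalled before it."""
    scores = {}
    for out_pos, item in enumerate(recalled):
        if item not in presented:
            continue                      # intrusion; not scored here
        in_pos = presented.index(item)
        lag = (len(presented) - 1 - in_pos) + out_pos
        scores[item] = "LTM" if lag > ltm_lag else "STM"
    return scores

# Hypothetical 8-item list of two-word imperatives and a recall protocol.
presented = ["lift box", "open door", "ring bell", "fold map",
             "tap glass", "wave flag", "roll ball", "shake hand"]
recalled = ["shake hand", "roll ball", "lift box", "tap glass"]
print(classify_recalls(presented, recalled))
# "shake hand" (presented last, recalled first) has lag 0 -> STM;
# "lift box" (presented first, recalled third) has lag 9 -> LTM.
```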
Results
Structural equation modeling (SEM) demonstrated that (a) episodic LTM and semantic LTM are negatively associated with degree of hearing loss, whereas STM is not, (b) semantic LTM is strongly related to episodic LTM, and (c) age and hearing loss contribute independently to episodic LTM deficits. Thus, the ELU model receives support on all points, apart from the effect of hearing loss on semantic LTM.
Alternative accounts
Semantic LTM deficits may be predicted by an account that assumes a successive deterioration of phonological representations that influence phonological neighborhoods with increasing age (e.g., Luce & Pisoni, 1998; Sommers, 1996). The interesting part here is that our SEM models show that not only age but also the hearing loss per se contributes to a deterioration of semantic LTM. This result has been shown before with more profound impairments (Andersson, 2002; Lyxell et al., 2009) but is now shown for moderate hearing loss as well.
The episodic LTM results may also be predicted by an information degradation account (e.g., Heinrich & Schneider, 2010; Schneider, Daneman, & Pichora-Fuller, 2002) because the episodic LTM constructs in the SEM models depended on perception of auditory information—information that might have been sufficiently degraded to affect encoding into episodic LTM, even though the participants wore their hearing aids switched on during testing. The attentional allocation account by Tun, McCoy, and Wingfield (2009) may also be a candidate explanation for the episodic LTM deficit because in our data too, attentional costs stand in proportion to losses in episodic recall tasks, and the interaction is reinforced in old and hearing-impaired persons. For a more detailed discussion of these alternative accounts, SEM models and partial correlation patterns, see the original publication (Rönnberg et al., 2011). Overall, the ELU model does a good job in predicting relationships between the semantic and episodic LTM systems and between STM and episodic LTM.
It should be noted that we did not evaluate the hearing aids per se in the current study, and two comments are important to make. First, we actually have a conservative estimate of the effects of hearing impairment in the current study, as all participants wore hearing aids, which presumably compensate to some degree for the hearing loss, and still we find negative effects related to hearing impairment. Second, a proper evaluation of a hearing aid in this context would require a well-matched group of nonusers, where each hearing-aid user has a “twin” nonuser matched for hearing loss, IQ, gender, and schooling. Only then can we approach a conclusion regarding the potential effects of hearing aids per se.
Study 2: Immediate Memory Effects: Noise Reduction and STM
In a recent study by Sarampalis, Kalluri, Edwards, and Hafter (2009), a noise reduction algorithm (Ephraim & Malah, 1984) was found to provide beneficial effects on recall and listening effort for participants with normal hearing. In a study by Ng et al. (2010) on 20 hearing-impaired participants (symmetrical sensorineural hearing loss), and with a free recall memory procedure similar to the one used by Sarampalis et al. (2009), we evaluated “aggressive” ideal binary masking (IBM, see Wang, 2008; Wang, Kjems, Pedersen, Boldt, & Lunner, 2009) as a proof-of-concept, not only for improving speech comprehension but also for its potential release of masking effects on memory systems. Furthermore, binary masks estimated from directional microphones (Kjems, Boldt, Pedersen, Lunner, & Wang, 2009) were used as a realistic version of binary mask processing. For each participant, we used an individualized SNR yielding 95% speech recognition. The SNR was kept constant across conditions for each participant.
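As a rough illustration of the principle behind ideal binary masking (following the general description in Wang, 2008, rather than the exact processing used in the study), the mixture is analyzed into time-frequency units, each unit is kept when its local speech-to-noise ratio exceeds a criterion and zeroed otherwise, and the masked representation is resynthesized. It is “ideal” because the clean speech and the noise must be known separately, which is possible in the laboratory and only approximated (e.g., via directional microphones) in a wearable device. Frame length, local criterion, and windowing below are arbitrary choices.

```python
import numpy as np

def ideal_binary_mask(speech, noise, lc_db=0.0, frame=512, hop=256):
    """Apply an ideal binary mask to the mixture speech + noise: a
    time-frequency unit is retained when its local SNR exceeds the
    local criterion (lc_db), otherwise it is set to zero."""
    win = np.hanning(frame)

    def stft(x):
        frames = [win * x[i:i + frame] for i in range(0, len(x) - frame, hop)]
        return np.array([np.fft.rfft(f) for f in frames])

    S, N = stft(speech), stft(noise)
    local_snr_db = 20 * np.log10((np.abs(S) + 1e-12) / (np.abs(N) + 1e-12))
    mask = (local_snr_db > lc_db).astype(float)

    # Mask the mixture spectrogram and resynthesize by overlap-add.
    masked = stft(speech + noise) * mask
    out = np.zeros(len(speech))
    for k, spec in enumerate(masked):
        start = k * hop
        out[start:start + frame] += np.fft.irfft(spec, frame) * win
    return out

# Demo with synthetic signals (a tone as stand-in "speech" plus white noise).
fs = 16000
rng = np.random.default_rng(1)
speech = 0.5 * np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
noise = 0.5 * rng.standard_normal(fs)
cleaned = ideal_binary_mask(speech, noise)
```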
The procedure involved two steps. First, the participants listened to Swedish Hearing In Noise Test (HINT) sentences (Hällgren, Larsby, & Arlinger, 2006), at an individually adapted speech reception threshold (SRT) corresponding to 95% intelligibility, in different background conditions (stationary noise and 4-talker babble) and with different signal-processing algorithms applied, and completed a perceptual speech recognition task: they repeated the final word of each sentence immediately after hearing it (to verify audibility and speech recognition). Second, the participants recalled, in any order, as many final words as possible from a set of eight sentences after the set had been completed.
Results showed that free recall memory performance was lowered by a competing talker background but that this effect was less for persons with high WM capacity when noise reduction was applied. This effect was shown to pertain to STM recency positions (i.e., the last two positions in a list). Thus, individuals with high WM capacity seem to be able to exploit noise reduction to enhance cognitive performance.
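Operationally, the recency analysis amounts to computing recall accuracy per serial position and averaging over the last two list positions; a minimal sketch with invented recall protocols follows (not the study's actual scoring script).

```python
import numpy as np

def serial_position_curve(trials):
    """Proportion of final words recalled at each serial position,
    averaged over trials; each trial is a list with one entry per
    presented sentence (1 = its final word was recalled, 0 = not)."""
    return np.array(trials, dtype=float).mean(axis=0)

def recency_score(trials, n_last=2):
    """Mean recall over the last n_last positions (the recency portion)."""
    return serial_position_curve(trials)[-n_last:].mean()

# Hypothetical protocols for one listener: 8-sentence sets in 4-talker
# babble, with and without noise reduction (values are invented).
babble_no_nr = [[0, 0, 0, 1, 0, 0, 1, 1], [0, 0, 1, 0, 0, 0, 0, 1]]
babble_nr = [[0, 0, 0, 1, 0, 0, 1, 1], [0, 1, 0, 0, 0, 1, 1, 1]]
print(recency_score(babble_no_nr), recency_score(babble_nr))
```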
Study 3: WM and Effort
Explicit cognitive processing mechanisms are by definition conscious and may be perceived as effortful. Thus, the perceived effort of listening in adverse conditions may be informative as to the degree of explicit demands on WM. Although the subjectively rated effort involved in listening to speech in noise increases with decreasing SNR, there is no linear relation between rated effort and WM (Zekveld, Kramer, & Festen, 2010). Rönnberg (2003) postulated that the contribution of explicit cognitive processing mechanisms to ease of language understanding can be described as a U-shaped function, whereby the greatest contribution is at moderate levels of listening challenge, where there is room for interaction between the level of clarity of the input signal and WM capacity. Too much or too little information extracted from the signal will leave less room for the bottom-up and top-down processes to interact optimally. Thus, it might be expected that perceived effort would correlate with WM capacity under moderately difficult listening conditions but not necessarily under more or less adverse conditions. In a recent study, we found that WM, rather than influencing relative rating of effort between different SNRs, modulated the relative rating of effort between different types of noise (Rudner, Lunner, Behrens, Sundewall Thorén & Rönnberg, 2011b).
There is a growing interest within the hearing-aid industry in using perceived listening effort as a measure of the efficacy of hearing aids. Thus, it is important to understand the relationship between listening effort and cognitive measures and how they relate to the ability to understand speech in noise. The ELU model provides a framework for understanding these relationships.
Summary and Future Challenges for the Hearing-Aid Industry
The findings from the Rönnberg et al. (2011), the Ng et al. (2010), and the Rudner et al. (2011b) studies suggest that there are several cognitive hearing science challenges for the hearing-aid industry:
1. To improve signal-processing options for individuals such that negative long-term effects (“there and then”) on LTM systems are reduced or minimized.
2. To improve such options for the immediate (“here and now”) perception and encoding into episodic STM, and to use memory paradigms for outcome evaluations.
3. To improve subjective measurements and measurement paradigms of listening effort such that the explicit, “here and now” processing mechanisms that surface as effortful are also the ones tapped into clinically.
At a theoretical level, it can be stated that the ELU model provides a framework for understanding (1) to (3) above. However, future studies will target in even more detail the subcomponents of explicit WM processes (e.g., inhibition, switching) to come to grips with the mechanism(s) most sensitive to short-term and long-term mismatch effects on memory systems and with how to measure these effects both behaviorally and subjectively.
At a more general clinical and industrial level, all three studies (Ng et al., 2010; Rönnberg et al., 2011; Rudner et al., 2011b) collectively suggest that it is important for the industry to look beyond pure speech-recognition-in-noise measures, as such measures may underestimate or misrepresent the consequences of hearing loss and of signal processing in hearing instruments for memory load, deterioration of LTM systems, and perceived effort. Furthermore, the results of these three studies indicate that cognitive consequences should always be taken into account before new hearing-aid signal-processing algorithms are introduced.
Footnotes
Authors’ Note: The article is based on an invited talk by the first author, The Stuart Gatehouse Memorial lecture, at IHCON, Lake Tahoe, California in August 2010.
Declaration of Conflicting Interests: The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The authors disclosed that they received the following support for their research and/or authorship of this article: The research was supported by a Linnaeus Centre HEAD grant (349-2007-8654) from the Swedish Research Council to the first author.
References
- Andersson U. (2002). Deterioration of the phonological processing skills in adults with an acquired hearing loss. European Journal of Cognitive Psychology, 14, 335-352
- Andersson U., Lyxell B., Rönnberg J., Spens K.-E. (2001a). Cognitive predictors of visual speech understanding. Journal of Deaf Studies and Deaf Education, 6, 103-115
- Andersson U., Lyxell B., Rönnberg J., Spens K.-E. (2001b). A follow-up study on the effects of speech tracking training on visual speechreading of sentences and words: Cognitive prerequisites and chronological age. Journal of Deaf Studies and Deaf Education, 6, 116-129
- Arlinger S., Lunner T., Lyxell B., Pichora-Fuller M. K. (2009). The emergence of cognitive hearing science. Scandinavian Journal of Psychology, 50, 371-384
- Atkinson R. C., Shiffrin R. M. (1968). Human memory: A proposed system and its control processes. In Spence K. W., Spence J. T. (Eds.), The psychology of learning and motivation (Vol. 2, pp. 89-195). New York, NY: Academic Press
- Baddeley A. D. (1990). Human memory: Theory and practice. Hove, UK: Lawrence Erlbaum
- Baddeley A. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences, 4, 417-423
- Balk M. H., Ojanen V., Pekkola J., Autti T., Sams M., Jääskeläinen I. P. (2010). Synchrony of audio–visual speech stimuli modulates left superior temporal sulcus. NeuroReport, 21, 822-826
- Beauchamp M. S., Yasar N. E., Frye R. E., Ro T. (2008). Touch, sound and vision in human superior temporal sulcus. NeuroImage, 41, 1011-1020
- Calvert G. A., Campbell R., Brammer M. J. (2000). Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex. Current Biology, 10, 649-657
- Daneman M., Carpenter P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19, 450-466
- Emmorey K., Grabowski T., McCullough S., Damasio H., Ponto L. B., Hichwa D., Bellugi U. (2003). Neural systems underlying lexical retrieval for sign language. Neuropsychologia, 41, 85-95
- Ephraim Y., Malah D. (1984). Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32, 1109-1121
- Foo C., Rudner M., Rönnberg J., Lunner T. (2007). Recognition of speech in noise with new hearing instrument compression release settings requires explicit cognitive storage and processing capacity. Journal of the American Academy of Audiology, 18, 553-566
- Gatehouse S., Naylor G., Elberling C. (2003). Benefits from hearing aids in relation to the interaction between user and the environment. International Journal of Audiology, 42(Suppl. 1), S77-S85
- Gatehouse S., Naylor G., Elberling C. (2006a). Linear and nonlinear hearing aid fittings—1. Patterns of benefit. International Journal of Audiology, 45, 130-152
- Gatehouse S., Naylor G., Elberling C. (2006b). Linear and nonlinear hearing aid fittings—2. Patterns of candidature. International Journal of Audiology, 45, 153-171
- Heinrich A., Schneider B. A. (2010). Elucidating the effects of ageing on remembering perceptually distorted word pairs. Quarterly Journal of Experimental Psychology. Advance online publication. doi:10.1080/17470218.2010.492621
- Hällgren M., Larsby B., Arlinger S. (2006). A Swedish version of the Hearing In Noise Test (HINT) for measurement of speech recognition. International Journal of Audiology, 45, 227-237
- Kjems U., Boldt J. B., Pedersen M. S., Lunner T., Wang D. (2009). Role of mask pattern in intelligibility of ideal binary-masked noisy speech. Journal of the Acoustical Society of America, 126, 1415-1426
- Lee H.-J., Truy E., Mamou G., Sappey-Marinier D., Giraud A.-L. (2007). Visual speech circuits in profound acquired deafness: A possible role for latent multimodal connectivity. Brain, 130, 2929-2941
- Levänen S. (1998). Neuromagnetic studies of human auditory cortex function and reorganization. Scandinavian Audiology, 27(Suppl. 49), 1-6
- Lin F. R., Metter E. J., O'Brien R. J., Resnick S. M., Zonderman A. B., Ferrucci L. (2011). Hearing loss and incident dementia. Archives of Neurology, 68(2), 214-220. doi:10.1001/archneurol.2010.362
- Luce P. A., Pisoni D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19(1), 1-36
- Lunner T. (2003). Cognitive function in relation to hearing aid use. International Journal of Audiology, 42(Suppl. 1), S49-S58
- Lunner T., Sundewall-Thorén E. (2007). Interactions between cognition, compression, and listening conditions: Effects on speech-in-noise performance in a two-channel hearing aid. Journal of the American Academy of Audiology, 18, 539-552
- Lunner T., Rudner M., Rönnberg J. (2009). Cognition and hearing aids. Scandinavian Journal of Psychology, 50, 395-403
- Lyxell B., Arlinger S., Andersson J., Harder H., Näsström E., Svensson H., Rönnberg J. (1996). Information-processing capabilities and cochlear implants: Pre-operative predictors for speech understanding. Journal of Deaf Studies and Deaf Education, 1, 190-201
- Lyxell B., Rönnberg J. (1987). Guessing and speechreading. British Journal of Audiology, 21, 13-20
- Lyxell B., Rönnberg J. (1989). Information-processing skills and speechreading. British Journal of Audiology, 23, 339-347
- Lyxell B., Rönnberg J. (1991). Visual speech processing: Word decoding and word discrimination related to sentence-based speechreading and hearing-impairment. Scandinavian Journal of Psychology, 32, 9-17
- Lyxell B., Rönnberg J. (1992). The relationship between verbal ability and sentence-based speechreading. Scandinavian Audiology, 21, 67-72
- Lyxell B., Rönnberg J. (1993). The effects of background noise and working memory capacity on speechreading performance. Scandinavian Audiology, 22, 67-70
- Lyxell B., Rönnberg J., Samuelsson S. (1994). Internal speech functioning and speechreading in deafened and normal hearing adults. Scandinavian Audiology, 23, 179-185
- Lyxell B., Wass M., Sahlén B., Samuelsson C., Asker-Arnarson L., Ibertsson T., . . . Hällgren M. (2009). Cognitive development, reading and prosodic skills in children with cochlear implants. Scandinavian Journal of Psychology, 50, 463-474
- MacSweeney M., Campbell R., Woll B., Giampietro V., David A. S., McGuire P. K., . . . Brammer M. J. (2004). Dissociating linguistic and nonlinguistic gestural communication in the brain. NeuroImage, 22, 1605-1618
- MacSweeney M., Campbell R., Woll B., Brammer M. J., Giampietro V., David A. S., . . . McGuire P. K. (2006). Lexical and sentence processing in British sign language. Human Brain Mapping, 27, 63-76
- MacSweeney M., Waters D., Brammer M. J., Woll B., Goswami U. (2008). Phonological processing in deaf signers and the impact of age of first language acquisition. NeuroImage, 40, 1369-1379
- Ng E., Rudner M., Lunner T., Rönnberg J. (2010, August 15-21). Effects of hearing aid signal processing on cognitive outcome measurements. Poster presented at IHCON, Lake Tahoe, CA
- Nilsson L.-G., Adolfsson R., Bäckman L., de Frias C. M., Molander B., Nyberg L. (2004). Betula: A prospective cohort study on memory, health and aging. Aging, Neuropsychology, and Cognition, 11(2-3), 134-148
- Okada K., Hickok G. (2009). Two cortical mechanisms support the integration of visual and auditory speech: A hypothesis and preliminary data. Neuroscience Letters, 452, 219-223
- Petitto L. A., Zatorre R. J., Gauna K., Nikelski E. J., Dostie D., Evans A. C. (2000). Speech-like cerebral activity in profoundly deaf people processing signed languages: Implications for the neural basis of human language. Proceedings of the National Academy of Sciences of the United States of America, 97(25), 13961-13966
- Pulvermüller F., Kujala T., Shtyrov Y., Simola J., Tiitinen H., Alku P., Alho K., Martinkauppi S., Ilmoniemi R. J., Näätänen R. (2001). Memory traces for words as revealed by the mismatch negativity. NeuroImage, 14, 607-616
- Rönnberg J. (1990). Cognitive and communicative function: The effects of chronological age and “handicap age.” European Journal of Cognitive Psychology, 5, 19-33
- Rönnberg J. (1993). Cognitive characteristics of skilled tactiling: The case of GS. European Journal of Cognitive Psychology, 5, 19-33
- Rönnberg J. (2003). Cognition in the hearing impaired and deaf as a bridge between signal and dialogue: A framework and a model. International Journal of Audiology, 42(Suppl. 1), S68-S76
- Rönnberg J., Andersson J., Samuelsson S., Söderfeldt B., Lyxell B., Risberg J. (1999). A speechreading expert: The case of MM. Journal of Speech, Language, and Hearing Research, 42, 5-20
- Rönnberg J., Andersson U., Lyxell B., Spens K. (1998). Vibrotactile speechreading support: Cognitive prerequisites for training. Journal of Deaf Studies and Deaf Education, 3, 143-156
- Rönnberg J., Arlinger S., Lyxell B., Kinnefors C. (1989). Visual evoked potentials: Relation to adult speechreading and cognitive function. Journal of Speech and Hearing Research, 32, 725-735
- Rönnberg J., Danielsson H., Rudner M., Arlinger S., Sternäng O., Wahlin Å., Nilsson L.-G. (2011). Hearing loss is negatively related to episodic and semantic long-term memory but not to short-term memory. Journal of Speech, Language, and Hearing Research, 54, 705-726
- Rönnberg J., Rudner M., Foo C. (2010). Cognitive neuroscience of signed language: Applications to working memory. In Bäckman L., Nyberg L. (Eds.), Festschrift in honor of L.-G. Nilsson (pp. 263-286). London, UK: Psychology Press
- Rönnberg J., Rudner M., Foo C., Lunner T. (2008). Cognition counts: A working memory system for ease of language understanding (ELU). International Journal of Audiology, 47, S171-S177
- Rönnberg J., Rudner M., Ingvar M. (2004). Neural correlates of working memory for sign language. Cognitive Brain Research, 20, 165-182
- Rönnberg J., Rudner M., Lunner T., Zekveld A. (2010). When cognition kicks in: Working memory and speech understanding in noise. Noise and Health, 12, 263-269
- Rudner M., Fransson P., Nyberg L., Ingvar M., Rönnberg J. (2007). Neural representation of binding lexical signs and words in the episodic buffer of working memory. Neuropsychologia, 45, 2258-2276
- Rudner M., Foo C., Rönnberg J., Lunner T. (2009). Cognition and aided speech recognition in noise: Specific role for cognitive factors following nine-week experience with adjusted compression settings in hearing aids. Scandinavian Journal of Psychology, 50, 405-418
- Rudner M., Foo C., Sundewall Thorén E., Lunner T., Rönnberg J. (2008). Phonological mismatch and explicit cognitive processing in a sample of 102 hearing aid users. International Journal of Audiology, 47(Suppl. 2), S163-S170
- Rudner M., Lunner T., Behrens T., Sundewall Thorén E., Rönnberg J. (2011b). Working memory capacity may influence perceived effort during aided speech recognition in noise. Manuscript submitted for publication
- Rudner M., Rönnberg J., Lunner T. (2011a). Working memory supports listening in noise for persons with hearing impairment. Journal of the American Academy of Audiology, 22, 1-12
- Sakai K. L., Tatsuno Y., Suzuki K., Kimura H., Ichida Y. (2005). Sign and speech: Amodal commonality in left hemisphere dominance for comprehension of sentences. Brain, 128, 1407-1417
- Sarampalis A., Kalluri S., Edwards B., Hafter E. (2009). Objective measures of listening effort: Effects of background noise and noise reduction. Journal of Speech, Language, and Hearing Research, 52, 1230-1240
- Schneider B. A., Daneman M., Pichora-Fuller M. K. (2002). Listening in aging adults: From discourse comprehension to psychoacoustics. Canadian Journal of Experimental Psychology, 56, 139-152
- Sommers M. S. (1996). The structural organization of the mental lexicon and its contribution to age-related declines in spoken-word recognition. Psychology and Aging, 11, 333-341
- Squire L. R. (2009). Memory and brain systems: 1969-2009. Journal of Neuroscience, 29, 12711-12716
- Stenfelt S., Rönnberg J. (2009). The signal-cognition interface: Interactions between degraded auditory signals and cognitive processes. Scandinavian Journal of Psychology, 50, 385-393
- Tulving E. (1983). Elements of episodic memory. Oxford, UK: Oxford University Press
- Tulving E., Colotla V. A. (1970). Free recall of trilingual lists. Cognitive Psychology, 1, 86-98
- Tun P. A., McCoy S., Wingfield A. (2009). Aging, hearing acuity, and the attentional costs of effortful listening. Psychology and Aging, 24, 761-766
- Wang D. (2008). Time-frequency masking for speech separation and its potential for hearing aid design. Trends in Amplification, 12, 332-353
- Wang D., Kjems U., Pedersen M. S., Boldt J. B., Lunner T. (2009). Speech intelligibility in background noise with ideal binary time-frequency masking. Journal of the Acoustical Society of America, 125, 2336-2347
- Wheeler M. A., Stuss D. T., Tulving E. (1997). Toward a theory of episodic memory: The frontal lobes and autonoetic consciousness. Psychological Bulletin, 121, 331-354
- Zekveld A. A., Kramer S. E., Festen J. M. (2010). Pupil response as an indication of effortful listening: The influence of sentence intelligibility. Ear and Hearing, 31, 480-490