Twenty-five years ago, many people thought that auditory prostheses were too crude to ever benefit auditory neuroscience. One skeptic quipped that trying to understand auditory processing through the artificial pattern of neural activation produced by a cochlear implant was like trying to understand how a television tuner works by applying a lightning bolt to the antenna. Recent papers in PNAS (1, 11) show how far auditory prostheses have come in those 25 years, both as a practical benefit (restoring hearing to deaf people) and as a valuable tool for neuroscience. In this issue of PNAS, Rouger et al. (1) show that deaf people have superior lip-reading abilities and superior audiovisual integration compared with those with normal hearing, and that they maintain superior lip-reading performance even after cochlear implantation. The development of the cochlear implant made this unique perspective on auditory and visual integration possible.
Cochlear Implants and Neuroscience
The everyday act of listening to someone talk presents a complex problem for neuroscience. The auditory signal initiates an intricate cascade of neural events, culminating in the parsing and categorization of the auditory neural stream into words and sentences. In parallel, the visual system processes images of the talker. The auditory and visual streams are merged into a multisensory signal that allows for better recognition in difficult conditions (e.g., noise) than either modality affords alone. How the auditory and visual streams are parsed and combined has been the subject of research for decades, but auditory perception through cochlear implants provides a perspective that gives new leverage on these questions.
Cochlear implants are sensory prostheses that restore hearing to deafened individuals by electrically stimulating the remaining auditory nerve. Contemporary cochlear implants generally use 16–22 electrodes placed along the tonotopic axis of the cochlea. Each electrode is designed to stimulate a discrete neural region and thereby present a coarse representation of the frequency-specific neural activation in a normal cochlea. However, within each region of stimulated neurons, the fine spectro-temporal structure of the neural response is quite different from that of the normal ear. Despite these differences, modern cochlear implants provide high levels of speech understanding, with most recipients capable of telephone conversation.
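Researchers mimic this coding acoustically for normal-hearing listeners with a noise-band vocoder (2): speech is divided into a few frequency bands, only each band's slow amplitude envelope is kept, and the envelopes modulate matching bands of noise, preserving coarse spectral and envelope cues while discarding temporal fine structure. The Python sketch below is illustrative only; the band count, band edges, and envelope cutoff are assumptions chosen for demonstration, not parameters from the cited studies.

```python
# Minimal noise-band vocoder sketch (the simulation technique of ref. 2).
# Parameters below (8 bands, 100-6000 Hz, 50-Hz envelope cutoff) are
# illustrative assumptions, not values taken from the cited studies.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def vocode(speech, fs=16000, n_bands=8, env_cutoff=50.0):
    """Replace speech fine structure with noise, keeping band envelopes."""
    edges = np.geomspace(100.0, 6000.0, n_bands + 1)  # log-spaced band edges
    noise = np.random.default_rng(0).standard_normal(len(speech))
    env_lp = butter(4, env_cutoff, btype="lowpass", fs=fs, output="sos")
    out = np.zeros(len(speech))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, speech)
        # Slow Hilbert envelope: the only speech cue this channel keeps,
        # analogous to the current envelope delivered on one electrode.
        env = sosfiltfilt(env_lp, np.abs(hilbert(band)))
        carrier = sosfiltfilt(band_sos, noise)  # noise carrier, same band
        out += np.clip(env, 0.0, None) * carrier
    return out / (np.max(np.abs(out)) + 1e-12)  # normalize to about +/-1

# Usage (hypothetical input): vocoded = vocode(waveform_16khz_float_array)
```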
Auditory prostheses, including the cochlear implant, are now regularly used for research in neuroscience. The emerging picture shows the relative roles of the sensory end-organ, brainstem nuclei, and cortex in the processing of complex patterns of auditory information. For example, recent research with cochlear implants and simulations of cochlear implants (2) has shown that the lack of temporal fine-structure information causes speech performance to decline in noisy listening environments (3–5). Temporal fine structure is particularly important in music (6). In development, speech pattern recognition takes >5 years of normal-hearing experience to master, even in childhood, when the cortex is most plastic (7). Imaging studies with simulations of cochlear implants have been able to distinguish cortical areas that are speech-specific from general auditory areas (8). Research with cochlear implant users has shown that combined auditory and visual information allows listeners to function better in high-noise environments (9, 10); however, the optimal coordination of auditory and visual information requires considerable experience (11). One common theme in these studies is the codevelopment of sensory input and cortex. Children implanted at later ages are at a relative disadvantage, because the auditory cortex has been appropriated by other modalities and functions (12). This finding implies that early implantation allows the auditory system to compete for cortical real estate, whereas late implantation may be unable to dislodge existing cortical “squatters” (13).
Deafness and Audiovisual Integration
Rouger et al. (1) show that deaf implant listeners are better than normal-hearing listeners at combining auditory and visual cues, particularly when the auditory signal is degraded in a way that removes temporal fine-structure cues. Deaf people necessarily maximize their use of visual cues; normal-hearing people rely on them less heavily. Cochlear implant users combined auditory and visual information synergistically, i.e., performance was better than would be predicted by a simple combination of the independent streams. In some conditions, normal-hearing listeners could also combine auditory and visual cues synergistically: when the auditory signal was degraded by sufficient noise masking to produce 30% correct speech recognition, adding visual cues improved performance to ≈80% correct. In this case, temporal fine-structure cues were preserved, and the synergy between the auditory and visual streams was similar to that observed in implant listeners. In contrast, when the speech signal was degraded by reducing the spectral resolution and removing temporal fine-structure cues to simulate a cochlear implant in normal-hearing listeners (again, reducing performance to 30% correct), audiovisual performance improved only to ≈55% correct. With a similar auditory signal from the cochlear implant, deaf listeners were able to improve performance to >90% correct. The superior audiovisual performance of cochlear implant users reflects both better performance with visual cues alone and better integration of the visual signal with the degraded auditory signal. Normal-hearing listeners integrate auditory and visual information as well as cochlear implant listeners only when temporal fine-structure cues are preserved; their audiovisual performance is poorer when fine-structure cues are removed.
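The term "synergy" here has a quantitative reading. One common benchmark (a reasonable choice for illustration; the exact model used by Rouger et al. is not specified here) is probability summation, which predicts audiovisual performance from independent auditory and visual channels:

\[ P_{AV}^{\mathrm{pred}} = P_A + P_V - P_A\,P_V . \]

For example, with the auditory-alone score fixed at \(P_A = 0.30\) (as above) and a hypothetical visual-alone lipreading score of \(P_V = 0.45\) (the visual-alone scores are not given here), independence predicts \(P_{AV}^{\mathrm{pred}} = 0.30 + 0.45 - 0.135 \approx 0.62\). Observed audiovisual scores well above such a prediction, as for the implant users at >90% correct, indicate synergistic integration; scores at or below it indicate a failure to combine the two streams.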
Many recent implant studies have shown the importance of temporal fine-structure cues for auditory speech perception; the Rouger et al. (1) study extends that lesson to multimodal speech perception. Cochlear implant listeners are able to compensate for the lost temporal information with visual cues, whereas normal-hearing listeners given similarly degraded signals cannot. It is unclear which aspect of temporal fine structure is most important for normal-hearing integration of auditory and visual information, e.g., harmonic pitch or temporal periodicity. It is possible that normal-hearing listeners could be trained to better integrate audiovisual cues, given degraded auditory information.
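The envelope/fine-structure distinction invoked throughout this discussion has a standard formal definition, used, for example, in the chimera experiments of ref. 6: a band-limited signal \(s(t)\) can be factored via its Hilbert transform \(\hat{s}(t)\) as

\[ s(t) = a(t)\,\cos\phi(t), \qquad a(t) = \left| s(t) + i\,\hat{s}(t) \right| , \]

where the slowly varying envelope \(a(t)\) is roughly what each implant channel transmits, and the rapidly oscillating carrier \(\cos\phi(t)\) is the temporal fine structure that the implant discards.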
A recent paper (11) demonstrated that congenitally deaf children failed to integrate visual cues with the auditory cues from the cochlear implant if they were implanted later than 30 months of age. The extended period of deafness before implantation may require longer experience with implant hearing before auditory segments (i.e., phonemes) can be correctly identified, and this lack of experience may contribute to the failure to fuse the auditory and visual streams. Whatever the mechanism, early implantation seems to be key for normal sensory development and for the synergistic effects of multisensory processing. Some, but not all, congenitally deaf adults who receive cochlear implants demonstrate synergistic audiovisual integration (14).
The work of Rouger et al. (1), combined with many recent implant studies, greatly advances our understanding of complex auditory signal processing and multisensory integration. Considerably better than a lightning bolt to the antenna, auditory prostheses are now routinely used to benefit auditory neuroscience. The “modern miracle” of restoring hearing to the deaf has provided a powerful research tool that has been used to better understand the development and plasticity of sensory processing. Of course, the ultimate synergy will arise when better understanding of complex sensory processing contributes to the design of the improved prostheses of the future.
Footnotes
The author declares no conflict of interest.
See companion article on page 7295.
References
- 1. Rouger J, Lagleyre S, Fraysse B, Deneve S, Deguine O, Barone P. Proc Natl Acad Sci USA. 2007;104:7295–7300. doi: 10.1073/pnas.0609419104.
- 2. Shannon RV, Zeng F-G, Kamath V, Wygonski J, Ekelid M. Science. 1995;270:303–304. doi: 10.1126/science.270.5234.303.
- 3. Fu Q-J, Nogaki G. J Assoc Res Otolaryngol. 2005;6:19–27. doi: 10.1007/s10162-004-5024-3.
- 4. Lorenzi C, Gilbert G, Carn H, Garnier S, Moore BCJ. Proc Natl Acad Sci USA. 2006;103:18866–18869. doi: 10.1073/pnas.0607364103.
- 5. Qin MK, Oxenham AJ. J Acoust Soc Am. 2003;114:446–454. doi: 10.1121/1.1579009.
- 6. Smith ZM, Delgutte B, Oxenham AJ. Nature. 2002;416:87–90. doi: 10.1038/416087a.
- 7. Eisenberg L, Shannon RV, Martinez AS, Wygonski J, Boothroyd A. J Acoust Soc Am. 2000;107:2704–2710. doi: 10.1121/1.428656.
- 8. Scott SK, Rosen S, Lang H, Wise RJS. J Acoust Soc Am. 2006;120:1075–1083. doi: 10.1121/1.2216725.
- 9. Bernstein LE, Auer ET, Takayanagi S. Speech Commun. 2004;44:5–18.
- 10. van Wassenhove V, Grant KW, Poeppel D. Proc Natl Acad Sci USA. 2005;102:1181–1186. doi: 10.1073/pnas.0408949102.
- 11. Schorr EA, Fox NA, van Wassenhove V, Knudsen EI. Proc Natl Acad Sci USA. 2005;102:18748–18750. doi: 10.1073/pnas.0508862102.
- 12. Lee DS, Lee JS, Oh SH, Kim S-K, Kim J-W, Chung J-K, Lee MC, Kim CS. Nature. 2001;409:149–150. doi: 10.1038/35051653.
- 13. Doucet ME, Bergeron F, Lassonde M, Ferron P, Lepore F. Brain. 2006;129:3376–3383. doi: 10.1093/brain/awl264.
- 14. Moody-Antonio S, Takayanagi S, Masuda A, Auer ET, Fisher L, Bernstein LE. Otol Neurotol. 2005;26:649–654. doi: 10.1097/01.mao.0000178124.13118.76.