Abstract
Here we report the methods and output of a workshop examining possible futures of speech and hearing science out to 2030. Using a design thinking approach, a range of human-centered problems in communication were identified that could provide the motivation for a wide range of research. Nine main research programs were distilled and are summarized: (a) measuring brain and other physiological parameters, (b) auditory and multimodal displays of information, (c) auditory scene analysis, (d) enabling and understanding shared auditory virtual spaces, (e) holistic approaches to health management and hearing impairment, (f) universal access to evolving and individualized technologies, (g) biological intervention for hearing dysfunction, (h) understanding the psychosocial interactions with technology and other humans as mediated by technology, and (i) the impact of changing models of security and privacy. The design thinking approach attempted to link the judged level of importance of different research areas to the “end in mind” through empathy for the real-life problems embodied in the personas created during the workshop.
Keywords: hearing assistance, human communication, design thinking, future-focused workshop
Introduction
In basic research, practitioners typically narrow their focus to a relatively small segment of a disciplinary area, selecting questions that arise from previous work that has left important gaps in knowledge, including anomalies or ambiguity in the literature. By contrast, applied research begins with a specific problem and works to apply state-of-the-art techniques in search of a solution. An important feature of this approach is that the return on the investment for a solution of a particular problem can be estimated before the investment is made. From this perspective, a real tension arises in basic research when decisions need to be made in resource allocation and funding. When the “end in mind” is not articulated, the potential return on an investment is impossible to assess. The exercise reported here was an attempt to bridge that gap for one area of pure research and to visualize and articulate some “ends in mind” informed by current research at the cutting edge of hearing sciences and human communication.
At a 2-day workshop held in Berkeley, CA, in September 2016, more than 40 researchers from disciplines related to speech and hearing science worked together to identify a range of key issues in human communication and the most relevant research questions to drive progress toward those ends. Most of the participants were senior researchers, leading highly successful, multidisciplinary research teams (Supplementary Material). Eight early career researchers (ECRs) in their doctoral and early postdoctoral years also participated. The ECRs played a key role as rapporteurs for each of the small discussion groups and contributed significantly to the analysis and writing of this report.
The workshop had three lofty goals: first, to take a human-centered perspective in envisioning the key problems in hearing and human communication that will capture the attention of the field by 2030; second, to identify potential forward-looking solutions to these issues and the principal research areas that could deliver on those solutions; and third, to create a research community with a collective vision and awareness of these possible “ends in mind.”
Given the practical difficulties of bringing together a senior group of technically focused researchers for an extended period and the desire to focus the discussion on human needs as the end in mind, elements of a “design thinking” (DT) approach (Methods section) were employed to drive the discussion in a human-centered and forward-looking direction. Much of the brainstorming and discussion was carried out in small groups of six or seven researchers including at least one ECR. Each day opened with a diverse range of short, future-focused, and provocative presentations, affectionately referred to as “rants” (see https://www.listeninginto2030.org/future-focus for abstracts). Their intent was to stimulate innovative and intellectually adventurous thinking.
The scene was set by opening remarks that introduced the Knowledge Navigator (Skully, 1987), a video produced by the Apple Corporation as a marketing piece in 1987—before the invention of the World Wide Web and before some workshop participants were even born! When viewed nearly 30 years later, this piece of marketing is prescient, addressing shared digital workspaces, artificial intelligence (AI)-based assistants, networked telecommunications, real-time and massive data search, and analytics. An aspirational challenge put to the workshop was, “As a research community interested in human hearing and communication, what is our Knowledge Navigator?” Further challenges included, “What is the collective vision for our discipline area for just 15 years hence?” “What will be the social legacy or the return on investment in the efforts of our research community?” and “What are the use-cases or real-world solutions that our work will advance?”
This article is not intended as a review or presentation of scientific data but has two quite different objectives. Future-focused ideation using a relatively large group of specialist researchers is not common in the hearing and communication science research community, so the first objective is to briefly describe the design and conduct of the workshop should others be interested in adopting such an approach. Likewise, a problem definition in pure research is often driven and supported by the preceding research. By contrast, a future-focused exercise needs also to draw on the possible technological capabilities of the future. A second objective is to describe the main research themes that emerged from discussions between some of the leading researchers in this community. The value in these ideas is driven by the assumption that the deep knowledge and intuitions of these researchers might provide uncommon insights into future technological capability.
Methods
DT is a dynamic methodology used to reframe and solve problems. Typically, when faced with a problem to solve, we race to ideas or solutions, often based on what we know or think given previous data. DT encourages the practice of stepping back into the problem and observing or discovering the problem from a different perspective. The notion of DT was popularized by Tim Brown at IDEO in 2008 (Brown, 2009) and has since become a key tool for strategic thinking at major business organizations wishing to differentiate themselves in the marketplace (Kolko, 2015; Martin, 2009). The challenge is to allow design for the future, traditionally addressed from only an organizational or technical perspective, to be reframed in a way that has the end user or consumer at its center. Its tools also allow researchers to explore the contradictions and tensions that exist for people outside the laboratory when faced with the solutions we create. Inventing a possible future using intuitive thinking and abductive logic (i.e., seeking the simplest and most likely explanation for an observation, see Dew, 2007) allows for new insights and knowledge. Doing this early promotes outcomes that can then be more easily translated and applied to everyday lives. Using a wide range of case studies, both Brown (2009) and Verganti (2008) emphasize the importance of understanding the real problem from an end user’s perspective. More specifically, Verganti uses examples from Alessi, Nintendo, and Swatch to illustrate how innovation is relevant to end users through the meaning they place on products and services, not just the technology or product offerings themselves.
While we recognize that there are a range of methodologies that take a human-centered design approach, in the absence of any strong arguments about the relative benefits of one over another in the sort of context in which we wished to apply it, we chose DT because of the availability of DT practitioners who were passionate and experienced in applying this approach to ideation in the medical devices and audio industries. By starting with an understanding of the problems people and society face every day, possible focus areas that can enhance life and outcomes can be identified. It was this latter perspective that we were very keen to instill in the participants as they explored the potential impact of their own research as it extends to 2030.
To begin, participants were introduced to DT in a plenary session and then divided into small groups to develop seven personas (briefly described in the following; see also Supplementary Material). Jane Cockburn (one of the authors), a professional DT practitioner and trainer with a background in cochlear implant development (see http://kairosnow.com.au/), led the DT training. No restrictions were placed on the development of the personas. As these were intended as a vehicle for building empathy, it was important that each group felt some connection to the persona. Each of these small groups consisted of five to six senior researchers and one ECR who reported back to the plenary group. Three other DT practitioners from Ammunition Group, LLC (see http://www.ammunitiongroup.com/) also participated at the small-group level to support and assist in the development and application of the group personas to the ideation exercises that followed. Given the short time frame (2 days), it was not possible to implement the full DT framework; however, the tools that were introduced aimed to provoke empathy and stimulate curiosity.
The idea of using ECRs as rapporteurs developed from the thought that, with their particular investment in the future, the ECRs might be more sensitized than senior researchers to issues that are relevant for future success and less invested in traditional models or approaches. If so, this could lead to broader synthesis and greater diversity in subsequent reporting.
The personas developed by each group were then used as vehicles for identifying key “real-world” listening and communication issues with relevance to research and development into 2030. Goals of the first day of the workshop were to (a) create and refine personas and (b) produce “journey maps” describing a day in the life of the persona character to help crystalize an understanding of their individual problems. Generating a fictive but realistic and crisp articulation of the listening and communication challenges faced by each persona was an essential outcome of the first day. In the first half of the second day, the small groups were reconvened and this time focused on how current and future scientific advances and technology could be applied in solving the challenges faced by each group’s persona. The personas, the challenges, and the potential solution areas are presented in the following Results section.
Next, each group’s technology and research solutions were pitched by their ECR rapporteur for discussion in a plenary session. Participants then voted on the most important areas likely to deliver on solutions to the sorts of problems identified by this method. Research areas were clustered by consensus in the plenary session and resulted in five major themes. These, plus four additional important areas that emerged during the plenary discussion, are explored in detail in the Discussion section.
Results: Persona Narratives and Journey-Map Challenges
The personas emerged progressively over the first day of the workshop. Participants’ understanding of the problems faced by each persona matured following construction of journey maps and refinement of each individual story. Each persona served as a real-world embodiment of a range of “real-life” problems that could be addressed through research and innovation. While the personas reflected to some extent the societal backgrounds of the participants, the communication issues they embodied were felt to be universal. For practicality, on the second day, each group focused on only the top three challenges faced by their persona and on developing research and technology solutions relevant for meeting these challenges. The current report focuses on the most important themes identified by this group of researchers. Following the workshop, each ECR prepared a summary of the persona and journey map from his or her own discussion group. Concise versions are listed below, while the full-length and more colorful descriptions are included as Supplementary Material.
Margo: A Mother and Professional With Multiple Demands on Her Time
Margo is a middle-aged professional who is beginning to realize that the demands of her day take a greater toll on her than they did when she was younger, but she does not recognize any connection with her hearing problem. Education and overcoming the stigma of hearing aids are important goals (see, for instance, Meyer & Hickson, 2012).
The following are the key areas for research:
- Improving quality of life by
  - Monitoring and cataloging the auditory scene,
  - Providing listening assistance regardless of hearing status,
  - Catching errors in communication mediated by an AI agent individualized to each listener using the listener’s life as context (see, for instance, Simonnet, Ghannay, Camelin, Estève, & De Mori, 2017), and
  - Monitoring and advice regarding the health state of the listener.
- Staying connected with those we love, those whose presence we want to feel
  - Overcoming distance: virtual presence in telecommunications,
  - Increasing the sense of presence of the talkers in shared auditory spaces for collaboration (see https://en.wikipedia.org/wiki/Virtual_collaboration), and
  - How listening devices can adapt to listening environments with reverberation, noise, accents, and languages.
- Development of a proposed solution nicknamed JIMINY (Juxtaposed Integrated Machine IN Your ear), which could include
  - A master knowledge interface by acting as
    - translator (babel fish [Adams, 2010; see, for instance, https://www.itranslate.com/]) and voice diagnostic monitor (see, for instance, http://www.sondehealth.com/),
    - whisperer (annotating communication with data [e.g., Google Assistant, Microsoft Cortana, Amazon Alexa, and Apple Siri]), and
    - an intelligent machine providing a customized conscience (Wallach & Allen, 2008).
  - A mind–body monitor (e.g., https://spire.io/), and
  - Optimized communication with the listener depending on environmental and behavioral context and the information being transmitted (e.g., http://www.cogitocorp.com/).
Paul: A Young Male With Normal Hearing Who Is Very Busy With Lots of Things Going on in Various Areas Across His Life
Paul is a normally hearing 25-year-old man who is looking to simplify his busy, distracted life. He is a technologically sophisticated “early adopter” looking for ways to improve productivity but is nervous about lack of privacy.
The following are the key problem areas for research:
- Physiological measures obtained by an ear-level device and physiological and environmental status of the user:
  - What data can be reliably obtained from wearable sensors?
  - What information can be inferred for short, medium, and longer views?
- How can information be effectively delivered to the listener wearing an ear-level interface?
- How can listening augmentation systems integrate environmental data and listener intent inferred from a worn device to support attention? For example, technology can
  - Alert to potential environmental dangers,
  - Augment signals/speakers of interest, and
  - Decrease isolation, for example, by “smart” mixing of external and earphone-delivered information.
- How can we deliver acoustic information in a way that
  - Does not distract from important tasks at hand or eases cognitive load?
  - Increases attentional capacity or at least minimizes distraction?
  - Inculcates trust between the listener and the device and the backend systems?
Bruce: A Violinist With Increasing Hearing Impairment
Bruce is a violinist who developed a significant unilateral hearing impairment after failure to wear hearing protection, a situation that motivates him to raise awareness of this issue. Bruce’s persona was inspired by the “rant” talk by Dr. Konstantina Stankovic (see https://www.listeninginto2030.org/future-focus), which focused on biological and therapeutic interventions.
The following are the key areas for research:
- Prevention and rehabilitation
  - Increasing awareness of noise-induced hearing loss using smart marketing and promoting dose monitoring (“sound diet”),
  - Early hearing loss detection and a deeper understanding of the impact on interpersonal relationships leading to more public education and individual counseling, and
  - Affordable, simple, and effective sound control (ear plugs and smart hearing protection).
- Medical research on biological interventions to restore hearing and improve neural interfaces (e.g., Mizutari et al., 2013).
- Development of smart audio devices, speech enhancement technologies, smart mixing systems, and so forth (for an early example, see https://hereplus.me/).
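The “sound diet” idea of dose monitoring listed above can be made concrete with a small calculation. The following is a minimal sketch, not a method proposed at the workshop: it assumes the NIOSH recommended exposure limit (85 dBA for 8 hr with a 3-dB exchange rate), and the function names and example exposure values are hypothetical.

```python
# Illustrative sketch only. The criterion level and exchange rate follow the
# NIOSH recommended exposure limit (85 dBA over 8 hours, 3-dB exchange rate);
# the workshop itself did not specify any particular standard.
CRITERION_LEVEL_DBA = 85.0
CRITERION_HOURS = 8.0
EXCHANGE_RATE_DB = 3.0


def allowed_hours(level_dba):
    """Permissible daily exposure time at a given A-weighted sound level."""
    excess_db = level_dba - CRITERION_LEVEL_DBA
    return CRITERION_HOURS / 2 ** (excess_db / EXCHANGE_RATE_DB)


def daily_dose_percent(exposures):
    """Sum fractional doses over (level_dba, hours) pairs; 100 is the full daily limit."""
    return 100.0 * sum(hours / allowed_hours(level) for level, hours in exposures)


# Hypothetical day for a musician such as Bruce (all values invented for illustration).
day = [
    (94.0, 1.0),  # orchestra rehearsal
    (88.0, 2.0),  # individual practice
    (70.0, 5.0),  # everyday background sound
]
dose = daily_dose_percent(day)
print(f"Daily noise dose: {dose:.1f}% of the recommended limit")
```

Under these assumptions, the example day amounts to roughly 152% of the recommended daily limit, with the 1-hr rehearsal at 94 dBA alone using the full allowance. A smart hearing protector or ear-level device could, in principle, report a figure of this kind to the wearer.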
Om: A 60-Year-Old Farmer in India With Progressive Hearing Impairment
Om’s persona came to life after conversations about how science and technology might affect people in the developing world. Om, a middle-class Indian farmer in a remote village, is concerned that his hearing loss is affecting his ability to interact as well as his overall sense of well-being. Remote living also means that Om has to travel a long distance to see a doctor.
The following are the key problem areas for research:
- Improve Om’s overall well-being (for one view of the impact of AI on individualized health care, see https://www.linkedin.com/pulse/healthcare-embraces-artificial-intelligence-rohit-talwar, and for “all-in-one” health assessment, see http://tricorder.xprize.org/)
  - Use AI and a range of biomarkers that are smart enough to adapt to his lifestyle,
  - Understand ways to provide information to Om in a manner helpful to him,
  - Take a holistic approach that addresses both hearing loss and its associated comorbidities such as cognitive decline and tinnitus,
  - Determine what physiological indicators can be used with an ear-level device, and
  - Determine how analytics can be applied to obtain bioindicators of health and well-being.
- Find ways to deliver these sorts of technical solutions into Om’s hands
  - Deal with the high cost of technology that is adaptable to Om’s specific needs,
  - Provide information that is meaningful, useful, and constitutes a compelling case for continued use, and
  - Design an intelligent system to meet Om’s changing requirements, level of understanding, and medical needs.
Nancy: A Busy Mother With a Large Family and a Partner With Some Hearing Disability
Social isolation is interfering with Nancy’s interactions with people. Miscommunication with her husband, who denies his hearing is failing, is stressful and complicated. Can ear-level devices provide confirmation that a message has been received? How can they be used to communicate in a multilayer, multichannel mode, regardless of proximity, communicating not only the raw information but also multiple layers of supporting cues?
The following are the key areas for research:
- To enable such multilayer, augmented reality environments, we need to understand the following:
  - Auditory scene analysis of the environment as well as the information in the message,
  - Visual interactions at the input and display ends of the communication, and
  - The attentional capabilities of the listener, especially with augmented listening.
- Closing the loop by creating an active listening experience in a shared social reality (for instance, see Facebook’s development of virtual social spaces, https://www.newscientist.com/article/2128391-facebook-banks-on-virtual-reality-as-the-future-of-socialising/):
  - Understanding how listeners register a message received and measure their understanding (not just hearing) and providing feedback to the sender (see, e.g., Schuller & Batliner, 2013; Schuller et al., 2013),
  - Deriving attentional engagement, directivity, and so forth using biomarkers such as electroencephalogram (EEG) (e.g., Simon, 2015),
  - Effective listener feedback with mechanisms sensitive to context and brain state, and
  - Understanding how to ensure that the intended message is the received message.
- Enabling multilayered communications through human–system interaction to
  - Communicate or augment emotion and intention, not just information (e.g., Mauss & Robinson, 2009),
  - Work both locally and over a distance,
  - Connect multimodally within and across a range of communication channels,
  - Support “chat rooms” shared among groups (in this case Nancy’s family) wearing hearables that are easy to use and affordable, and
  - Provide hearing protection and amplification as required.
Jane: A Schoolteacher With Normal Hearing
The persona of Jane, a teacher in her mid-thirties, evolved from a discussion of the ways in which social and behavioral issues could be addressed within the context of a world comfortable with auditory augmentation. In such a world, ear-level devices could coordinate augmented group learning activities, be used to monitor communication efficacy, and provide individualized support.
The following are the key areas for research:
- Improving the teaching environment with active and passive acoustic treatments,
- Dynamically modifying classroom sounds using wearable technology to improve student engagement and cognition by
  - Filtering unwanted sounds,
  - Enhancing salience of relevant auditory features (e.g., Kim, Lin, Walther, Hasegawa-Johnson, & Huang, 2014), and
  - Generating soundscapes or personalized immersive environments conducive to learning.
- Real-time monitoring of student engagement with biophysical markers to help reinforce learning activities.
Jessica and Fernando: Traumatic Brain Injury, Relationship Building, and Reinforcement
In contrast to the other single-person personas, one working group conceived of a married couple persona to emphasize that communication is a two-party activity. Fernando has a military service-related traumatic brain injury that has resulted in a spectrum of challenges to his ability to communicate with his wife, Jessica. Both Jessica and Fernando struggle with how best to communicate their feelings and limitations with each other.
The following are the key areas for research:
- Understanding people in the relationship by developing mind reading/imaging technologies to
  - Understand cognitive state and intent and monitor short- and long-term aspects of the relationship,
  - Identify a suite of linguistic and paralinguistic biomarkers, and
  - Understand and model relationship dynamics to enable an effective display for users.
- Understanding the auditory scene in the context of a relationship by
  - Decomposing preferred sources from the complex auditory scene, providing a processed or “cleaned-up” scene for improved understanding, and
  - Understanding the acoustic environment in terms of sources, locations, and meaning (incorporating the history of the speakers and listeners).
- Considering how to display information to the users, which
  - Requires understanding of human perceptual challenges and how to optimize display with those in mind, including the ability to attend to a scene without inducing cognitive overload, so as to improve the quality of life of the listener, and
  - Requires a man–machine interface that uses a variety of modes to convey information to people in an empowering, easy-to-use way.
Discussion
Once the areas for research focus had been pitched by each ECR rapporteur and discussed in a plenary session, participants voted for the most important problem areas. Many of the persona groups had overlapping or related key problem areas, so plenary discussion involved further grouping of the problem areas and the identification of five main themes. Final plenary discussion led to four other main themes. Each theme is described and briefly discussed below.
It is important to emphasize that these themes represent the informed intuitions and opinions of successful and eminent researchers, shaped by an empathetic approach to perceived human needs rather than arising from systematically based predictions about the future. Nonetheless, we see these nine research themes as important starting points to inform discussion about research and funding priorities that address the listening and communication needs of a diverse cross-section of the population and carry into 2030.
Principal Themes
Theme 1: Measuring brain activity and other relevant physiological parameters
Scope of the theme
Effective measurement of the brain and other physiological parameters was the highest voted research theme. This theme broadly covers devices, algorithms, and substrates that would be quantified to provide information about cognitive and physiological status. It takes a “data first” approach to solving the communication problem by centralizing the importance of accurately assessing the current situation in communication.
Discussion
This was the most commonly discussed research area, and not surprisingly, all aspects of speech were highlighted as points worthy of further measurement and analysis, including linguistic content (words, sentences, syntax, and diction) and paralinguistic content inferred from speech (e.g., as in identification of psychological state and emotions through vocal analytics). Technical challenges include dealing with background and competing noise, diverse audio quality, and speech recognition at a distance and in noise. This research domain, however, meant far more than simply quantifying the audible “vital signs” of a person.
Directly measuring the brain state was also commonly mentioned during the workshop as an advance likely to reach nonscientists by 2030, with EEG often cited, along with not-entirely-facetious remarks about a “Google hat” that would “read the mind.” These remarks were inspired, no doubt, by Google’s track record of organizing the world’s information as well as the science-fiction-turned-reality described in Jack Gallant’s rant, where functional magnetic resonance imaging is being used to create visual and semantic reconstructions from thoughts (see https://www.listeninginto2030.org/future-focus). Other sensing modalities discussed included galvanic skin response, electrocardiograms, blood pressure, temperature, and respiration waveforms.
Theme 2: Display of auditory and multimodal information
Scope of the theme
The focus of this theme was twofold: first, to understand how to deliver information to listeners so that it integrates seamlessly with their lives in ways that are not distracting (see also Josh Miele and Maria Chait’s workshop rants) and second, to understand what information to display and which sensory modality to use.
Discussion
Existing research on acoustic notifications and their attentional cost and impact on cognitive load was discussed. Understanding how to integrate the output of an augmentation device with existing auditory environments is critical to how such technologies might be useful in a real-world context.
Other discussions addressed the safety implications of filtering out any external sound from our environments, for example, discarding the sound of dangerous sources (a truck backing up) so that we lose awareness of their presence. Safety concerns create a challenge for simply displaying information in a way that is not distracting while keeping in mind that seemingly irrelevant sounds can sometimes be lifesaving.
Discussion also centered on the issue of which sensory modality is best for displaying specific information, with primary focus on audition and vision. The choice might depend on individual preference, current activities, and other concurrent information processing, but safety issues related to when and where the information is needed (priority) also need to be accounted for. These requirements could be met by a flexible multimodal device using both audition and vision that learns user preferences for delivering information.
Theme 3: A picture in sounds: Analyzing the auditory scene
Scope of the theme
Our auditory landscape is rich and complex, and it is not fully captured by current technology. Humans organize their sound environment into meaningful components. A device that mimics this capability would have a richer set of information on which to operate than that used by current technology. In 2030, we foresee auditory augmentation that recognizes salience, distinguishing between important signals and meaningless noise. Whether enhancing learning experiences or fostering deeper and more engaged interpersonal interactions, a more complete accounting of the rich soundscapes in which we live is a prerequisite.
Discussion
Once we can reconstruct auditory scenes, the next step will be to manipulate these soundscapes to accentuate relevant information. Jane was a persona born from this workshop: a schoolteacher striving to provide her students with a multimodal learning experience. She could lead her students on a virtual field trip to the Great Pyramids, exposing them to the wonders of the ancient world through immersive sight and sound. Jane could highlight important information, for example, focusing on a street musician while dampening the excited chatter of her students.
Consider this scenario. Nancy, our middle-aged mother persona from Middle America, stands in her kitchen. Her children stampede in, earbuds blasting, followed by her husband, whose hearing had begun to falter years earlier. She says, “Dinner’s at 7,” but wonders if anyone heard her. Nancy’s narrative points to deeper issues of social isolation that stem from frustrated communication. In 2030, the children’s earbuds and her husband’s hearing aid may recognize and prioritize Nancy’s voice and dampen competing sounds, allowing Nancy to be heard and giving her opportunities for dialogue and connection.
Finally, consider the personas Jessica and Fernando. Fernando is a veteran who suffered a traumatic brain injury and is now struggling both to hear and to parse his auditory world. His marriage to Jessica is fracturing because Fernando cannot effectively attend to pertinent information. More than a simple hearing aid, Fernando needs a device that can help him stay focused on the object of his attention and tune out the rest. A device that works for him would need to adapt and learn what information is most important to Fernando and Jessica, using feedback to improve over time (for AI-based personality models, see, e.g., Zhang, Zheng, & Magnenat-Thalmann, 2016).
Theme 4: Shared auditory space
Scope of the theme
This theme revolved around the idea that just as we can share physical space by being in the same room, we can also share auditory space through technology, without being in the same place. This theme also encompassed ideas such as focusing collective attention on the same information; fostering a fullness of communication in auditory space by combining verbalized content with metadata, such as indications of emotional intent; AI-driven categorization and storage of important auditory information for later reference; and support for multichanneled and multilayered “chat rooms” that move us toward something that could rightfully be called a space.
Discussion
The social aspect of communication was a perspective that pervaded all our discussions regarding the future of technology. Reflecting this, the themes of adaptively and nonintrusively presenting auditory and multimodal information (Theme 2), and parsing of the auditory scene (Theme 3), would provide the technological foundations that would enable these shared spaces. The overt acoustic content of a conversation is just a sliver of the entirety of a conversation, and our current listening devices fail to fully capitalize on this rich space in which we communicate. Other cues such as posture, facial expression, and the spatial locations of the communicators all lend crucial nonverbal information about the emotional states and intentions of those in the space. While concurrent visual accompaniments may convey some of this, it was recognized that there is an entire level of interaction that we have yet to incorporate into our audio communication technology.
Technology that could adaptively enhance auditory objects to foster shared attention and that could catalog information for later reference was immediately recognized as having applications to the health-care field, as well as supporting social and interpersonal relationships. In the persona of Fernando, it allowed for the feeling of once again sharing the world with his spouse.
Devices were imagined that could enhance information conveying emotional valence beyond what is conveyed in the acoustic elements of speech, such as prosody. The emerging in-ear EEG devices are a potential technology precursor that could glean emotional content directly from brain activity. This might modulate the auditory signal or utilize some other sensory modality to convey this information.
A new kind of “chat room” was discussed—an auditory space shared by many but going well beyond a conference call. It was a space merging many of the innovations discussed into a multilayered communication platform conveying speech, nonverbal emotional content, user control over attentional enhancement, and further conveying a sense of embodiment, perhaps accomplished through 3D augmented listening. With multichannel “chat rooms,” users could switch seamlessly between conversations among friends, family, and coworkers. There was palpable optimism that this type of technology could help break down the barriers of isolation. Ray Goldsworthy commented that he could envision a technology like this being used to connect the elderly or homebound and prevent the psychological and neural decline that often accompanies social isolation.
Theme 5: Holistic approaches to wellness and health management, particularly in managing hearing impairment
Scope of the theme
This theme covered the prevention and rehabilitation of hearing loss using education and training together with signal processing and therapeutic biomedical interventions. Much work still needs to be done on understanding the downstream effects of cochlear dysfunction on the auditory system and on higher functions such as speech processing and understanding, as well as its psychological and sociological impact. Given the complexity, these need to be considered holistically rather than seen as separate subsystems.
Discussion
As expected in a gathering of engineers and scientists, advances in technology were the favored approach to improving speech communication. Tempting visions were floated of hearing repaired with bioregenerative techniques and of lifetime-use, adaptable hearing aids that self-tune over decades, along with devices that could tell speakers just what to say or what tone to take to match the emotional state of a conversation partner. The idealistic slogan, although not stated as such, seemed to be “never again be misunderstood” or “always say the right thing.”
One dissenting group, represented by the persona Bruce, injected a voice of caution into this techno-panacea. First, this group felt that prevention of hearing damage should be emphasized and that our noisy environments should be rigorously characterized, from homes, to public transportation systems, to places of entertainment. By protecting our ears, we acknowledge the limitations of what can be done to repair hearing now, or even in 2030.
Second, when it is a hearing device that analyzes the emotions and intent of others to aid in communication, we also invade privacy and perhaps impair our own social abilities. If a person does not wish to share their emotions, what right is there for technology to penetrate their facade? Also, relying on a world where interpersonal interactions are mediated by technology may be pushing ourselves into a “Wall-E” style universe (see https://en.wikipedia.org/wiki/WALL-E and http://movies.disney.com/wall-e) where a mediator is invading privacy and lessening the intensity of human–human interactions. As discussed in other themes, each person’s take on technology is not the sole determinant of their environment. Individuals will have to navigate the choices of those with whom they interact.
Theme 6: Universal access to technology and its evolution to meet individual needs
Scope of the theme
This theme identified the need for technology in 2030 that evolves along with the user, increasing its overall usability over an entire technological life cycle. In addition, this theme addresses the need for hardware and software solutions that maximize the potential impact across a diverse set of communities around the world. The group identified several research areas in both technology and product development that could maximize the benefits of such a device, such as AI and machine-learning applications for self-adapting technology and advanced hardware engineering of small and cost-effective devices that provide increased processing power with minimal power consumption.
Discussion
The discussion of this theme centered on ways to ensure that the technologies identified during the workshop could have the greatest impact. One of the personas most relevant to this discussion was that of Om, the middle-class rural farmer from India. Out of this persona arose the issue of how to get the technology of 2030, meant to solve pressing needs related to hearing loss, into the hands of those who need it most in both the developed and the developing world. This includes getting relevant technology to people who do not have immediate access to specialist health-care providers (e.g., rural areas) and visualizes the creation of a “measure everything” approach, in which many biometric signals are recorded in one low-cost, high-performance adaptive device that will provide long-lasting benefit to the end users.
Theme 7: Biological intervention for hearing dysfunction
Scope of the theme
Much of the workshop discussion centered on ear-level technology focused on problems of communicating under difficult circumstances regardless of the hearing status. In the context of the hearing impaired, Theme 7 emphasizes a special place for therapeutic and biological interventions.
Discussion
The promise of new transformative biological interventions for hearing dysfunction surfaced as a direct result of a “rant” talk by surgeon-researcher Konstantina Stankovic. She highlighted the fact that as of 2016, the cochlea cannot be biopsied when establishing a medical diagnosis; indeed, the cochlea is not satisfactorily captured on even the most spatially sensitive clinical scans and cannot be visualized without subjecting a person to high-risk brain surgery. Therefore, all medical diagnoses of auditory function are made indirectly using a constellation of other clinical measures, each limited in resolution and specificity.
In her presentation, Stankovic detailed novel optical, genetic, and surgical approaches to acquired and congenital auditory anomalies in various stages of development. Exciting new directions under study include gene therapy to restore function in deafness-causing mutations, microscopy techniques by which to view previously unseen cellular structures within the cochlea without causing structural damage, and fully implantable cochlear implants powered by electrical gradients that exist within the inner ear. By combining techniques of computational biology and engineering with traditional principles of molecular biology, physician-scientists like Stankovic hope to push the field of otology to the cutting edge of translational medicine. Although these ideas did not take center stage during the persona-based DT session, their importance reemerged in the plenary outcomes discussion, during which participants articulated the value of biological innovation and the need for these technologies to reach patients.
Theme 8: Psychosocial interactions with technology and other humans as mediated by technology
Scope of the theme
After much focus on the individual user and on how to manage and assist communications in the immediate term, the question arose as to the psychological, intellectual, and emotional effects of technologies that mediate communications in the longer term. If everyday communication is mediated by these technologies, it is likely that they will have other lasting and potentially deeper effects.
Discussion
The brain is highly plastic and responds to persistent changes in the patterns of environmental stimuli. Devices that focus attention on a particular sound source or information channel shape those stimuli and could profoundly alter their composition. This in turn could affect the manner in which the brain uses that information in the longer term. Virtually nothing is known about the long-term effects of enhancing informational contrast to support attention. Likewise, the collective and social effects of such technologies are also a mystery. Barbara Shinn-Cunningham prompted discussion of the potential costs of this techno-panacea in her workshop rant, “The seduction of technology.” To some extent, this scenario held parallels to the contemporary debate about the online “filter bubble”: for example, the impact of focusing advertising, news, and other stories using predictive algorithms (e.g., Pariser, 2011). In this case, the concern is mainly about the self-reinforcing nature of such information: services show us what they decide we want to see based on our previous behavior rather than what we may need to see in the dynamically changing world in which we live. Going beyond the level of the individual, the cumulative sociological effects of these perceptual and informational filters need to be better understood.
Theme 9: Security and privacy
Scope of the theme
This theme considered significant societal issues that may result from listening devices with the ability to quantify and analyze the biophysical self.
Discussion
A hearing aid with keen capability to analyze a scene in order to assist the user might also spy on an exchange that other conversants intended to keep private. This is a general problem of building trust with machines that hear, which was directly addressed by Richard Lyon in his workshop rant. Biophysical measures that capture short- and long-term health information or that reveal state of mind may contain deeply private information that would be harmful in unintended hands. How might privacy be preserved or the multiplicity of privacy expectations in shared social spaces be negotiated in the face of invasive auditory technology? Will the norms or expectations for privacy evolve in coming years as people grow accustomed to features of their devices that improve their daily lives? How might personal thoughts or actions be kept private, while being utilized by our technology in a positive way? If intermediaries are to secure our data and negotiate our privacy with others, how do we establish trust with them? Privacy, security, and trust should be considered from the outset rather than dealt with as an afterthought as we build devices that are increasingly capable of listening effectively to the auditory world and to the human body.
For some, security and privacy concerns were arguably the most important and wide-reaching topic relevant to all aspects of life in the future. The simplest position to take is to give up privacy in exchange for convenience or else live in a Faraday cage. As one researcher pointed out, even if you opt to remove all sensing devices from your home, there is nothing to stop another person from being covered in sensing devices that acquire your data as well as that of the wearer, or even to signal that this is happening. This scenario has been called the “open mic” problem, wherein sensors in our environment are always on and always listening, whether we want them to or are even aware of them (e.g., Amazon’s Echo, https://www.amazon.com/Amazon-Echo-And-Alexa-Devices/b?ie=UTF8&node=9818047011). Ultimately, establishing mutual trust between users and companies will prove imperative. There was a general sense of discontent with this solution, but stronger alternatives did not surface.
Conclusions
As an experiment in future-focused ideation with a group of scientists and engineers, the workshop was successful as judged by the first two lofty goals outlined in the Introduction section. The DT approach delivered a range of fictive personas that helped shape a wide-ranging discussion of research themes focused on human-centered needs as the end in mind. Indeed, the principal themes described earlier represent only a fraction of what was discussed, albeit the fraction judged most important by these participants. Some themes (1 to 3) were very technically focused, while others had a strong focus on health and well-being (5 and 7); one was strongly motivated by a social good (6), while others focused on psychological (8) and sociological elements (9). While one theme (4) specifically referred to communication in a social context, social interaction through human communication strongly influenced most of the discussions.
Limiting factors included the short (2-day) duration of the workshop as well as the fact that, for the group, this was a significantly different mode of structured ideation from what they were used to. Preparatory work might have ameliorated some of these limitations, but the need to introduce DT tools as a hands-on exercise necessarily occupied a substantial fraction of time. Although formal feedback from participants was not sought, informal comments by participants ranged from very enthusiastic support for the exercise to skepticism as to its worth. The Supplementary Material includes commentary from the ECR participants (who also contributed to this report) and some senior participants.
We hope that the themes identified here focus discussion about prioritization and resource allocation to a number of important areas of human communication. Critically, by using DT methods, we have tried to link the level of judged importance of different research areas to the “end in mind” through empathy for the real-life problems embodied in the personas created during the workshop. In most cases, the most beneficial end remains the enhancement of human communication for both the hearing-impaired and the normally hearing individual as research and technology move into 2030.
Acknowledgments
The authors gratefully acknowledge Jennifer He for handling workshop logistics and for helping manage the inputs of the numerous contributors to the article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The workshop was supported by the National Science Foundation (award number 1637368), Starkey Hearing Technologies, and Google.
Supplementary Material
Supplementary material is available for this article online.
References
- Adams D. (2010) The ultimate hitchhiker’s guide to the galaxy, New York, NY: Del Rey.
- Brown T. (2009) Change by design, New York, NY: HarperBusiness.
- Dew N. (2007) Abduction: A pre-condition for the intelligent design of strategy. Journal of Business Strategy 28(4): 38–45. doi: 10.1108/02756660710760935.
- Kim K., Lin K.-H., Walther D. B., Hasegawa-Johnson M. A., Huang T. S. (2014) Automatic detection of auditory salience with optimized linear filters derived from human annotation. Pattern Recognition Letters 38: 78–85. doi: 10.1016/j.patrec.2013.11.010.
- Kolko J. (2015) Design thinking comes of age. Harvard Business Review 93(9): 66–71.
- Martin R. L. (2009) The design of business: Why design thinking is the next competitive advantage, Brighton, MA: Harvard Business Press.
- Mauss I. B., Robinson M. D. (2009) Measures of emotion: A review. Cognition and Emotion 23(2): 209–237. doi: 10.1080/02699930802204677.
- Meyer C., Hickson L. (2012) What factors influence help-seeking for hearing impairment and hearing aid adoption in older adults? International Journal of Audiology 51(2): 66–74. doi: 10.3109/14992027.2011.611178.
- Mizutari K., Fujioka M., Hosoya M., Bramhall N., Okano H. J., Okano H., Edge A. S. (2013) Notch inhibition induces cochlear hair cell regeneration and recovery of hearing after acoustic trauma. Neuron 77(1): 58–69. doi: 10.1016/j.neuron.2012.10.032.
- Pariser E. (2011) The filter bubble: How the new personalized web is changing what we read and how we think, New York, NY: Penguin.
- Schuller B., Batliner A. (2013) Computational paralinguistics: Emotion, affect and personality in speech and language processing, Hoboken, NJ: John Wiley & Sons.
- Schuller, B., Steidl, S., Batliner, A., Vinciarelli, A., Scherer, K., Ringeval, F.,…Marchi, E. (2013). The INTERSPEECH 2013 computational paralinguistics challenge: Social signals, conflict, emotion, autism. In Proceedings of Interspeech, Lyon, France: International Speech Communication Association (ISCA).
- Simon J. Z. (2015) The encoding of auditory objects in auditory cortex: Insights from magnetoencephalography. International Journal of Psychophysiology 95(2): 184–190. doi: 10.1016/j.ijpsycho.2014.05.005.
- Simonnet E., Ghannay S., Camelin N., Estève Y., De Mori R. (2017) ASR error management for improving spoken language understanding. arXiv. 1705.09515.
- Skully, J. (1987). Knowledge navigator. Retrieved from https://en.wikipedia.org/wiki/Knowledge_Navigator.
- Verganti R. (2008) Design, meanings, and radical innovation: A metamodel and a research agenda. Journal of Product Innovation Management 25(5): 436–456. doi: 10.1111/j.1540-5885.2008.00313.x.
- Wallach W., Allen C. (2008) Moral machines: Teaching robots right from wrong, New York, NY: Oxford University Press.
- Zhang, J., Zheng, J., & Magnenat-Thalmann, N. (2016). Modeling personality, mood, and emotions. In N. Magnenat-Thalmann, J. Yuan, D. Thalmann, & B. J. You (Eds.), Human–Computer Interaction Series. Context aware human-robot and human-agent interaction (pp. 211–236). Cham, Switzerland: Springer. doi: 10.1007/978-3-319-19947-4_10.