PLoS One. 2022 Jan 7;17(1):e0261354. doi: 10.1371/journal.pone.0261354

Effects of mild-to-moderate sensorineural hearing loss and signal amplification on vocal emotion recognition in middle-aged–older individuals

Mattias Ekberg 1,*, Josefine Andin 1, Stefan Stenfelt 2, Örjan Dahlström 1
Editor: Qian-Jie Fu
PMCID: PMC8740977  PMID: 34995305

Abstract

Previous research has shown deficits in vocal emotion recognition in sub-populations of individuals with hearing loss, making this a high-priority research topic. However, previous research has only examined vocal emotion recognition using verbal material, in which emotions are expressed through emotional prosody. There is evidence that older individuals with hearing loss suffer from deficits in general prosody recognition, not specific to emotional prosody. No study has examined the recognition of non-verbal vocalizations, which constitute another important source for the vocal communication of emotions. It might be the case that individuals with hearing loss have specific difficulties in recognizing emotions expressed through prosody in speech, but not through non-verbal vocalizations. We aim to examine whether vocal emotion recognition difficulties in middle-aged to older individuals with sensorineural mild-to-moderate hearing loss are better explained by deficits in vocal emotion recognition specifically, or by deficits in prosody recognition generally, by including both sentences and non-verbal expressions. Furthermore, some of the studies which have concluded that individuals with mild-to-moderate hearing loss have deficits in vocal emotion recognition ability have also found that the use of hearing aids does not improve recognition accuracy in this group. We aim to examine the effects of linear amplification and audibility on the recognition of different emotions expressed both verbally and non-verbally. Besides examining accuracy for different emotions, we will also look at patterns of confusion (which specific emotions are mistaken for other specific emotions, and at which rates) during both amplified and non-amplified listening, and we will analyze all material acoustically and relate the acoustic content to performance. Together, these analyses will provide clues to the effects of amplification on the perception of different emotions. For these purposes, a total of 70 middle-aged to older individuals, half with mild-to-moderate hearing loss and half with normal hearing, will perform a computerized forced-choice vocal emotion recognition task with and without amplification.

1. Introduction

Hearing loss is among the leading causes of disability globally, and its prevalence increases with age. It leads to difficulties in communication which can contribute to social isolation and diminished well-being [1]. In Sweden, it has been estimated that approximately 18% of the population have hearing loss. While the prevalence is much higher in the oldest age groups (44.6% among those aged 75–84 and 56.5% among those aged 85+ years), the prevalence in the 55–64 and 65–74 age groups (middle-aged to old) is also relatively high (24.2% and 32.6%, respectively) [2]. This shows that hearing loss is not only a concern for the oldest in the population. Because of the importance of emotion recognition for interpersonal interaction and, thus, indirectly for measures of well-being such as perceived quality of life and the presence or absence of depression, the effects of hearing loss, amplification, and auditory rehabilitation on the recognition and experience of vocal emotions have been identified as important research topics [3]. In the present study, we will investigate how having a mild-to-moderate hearing loss and using hearing aids with linear amplification influence vocal emotion recognition for verbal and non-verbal materials.

Vocal emotion recognition

Emotions can be defined as “episodic, relatively short-term, biologically-based patterns of perception, experience, physiology, action and communication that occur in response to specific physical and social challenges and opportunities” ([4], p. 468). Perceiving and recognizing emotions is important, and deficits in emotion recognition can have negative effects on interpersonal relationships and in different social contexts such as job environments [3]. Vocal emotion expressions are one source through which we perceive others’ emotions. Vocal emotion recognition depends on a complex interplay between different factors (see [5] for a review of several such factors). Both the producer and observer of an emotion expression are influenced by their psychobiological architecture (e.g. the structure and function of the auditory system) and neurophysiological mechanisms, as well as by context and codes for expressing and interpreting emotion expressions. In the receiver, attributions concerning the emotional state behind the expressions are made based upon schematic recognition and inference rules, which are influenced by sociocultural context [5]. Schemata can be understood as cognitive organizing systems that simplify information to make it manageable to process. These schemata can guide perception by providing categories, for example different emotions, and by summarizing the most common and distinguishing attributes of the different categories [6]. The expresser produces distal cues, which in the case of vocal expressions are sounds with particular acoustic properties. These distal cues are transmitted to the perceiver and transformed into proximal cues: perceived voice features on which inferences are made [5]. In support of such a model, Bänziger, Hosoya, and Scherer have shown that the recognition of emotion expressions embedded in speech prosody is predicted by combinations of perceived voice features, which in turn are best explained as products of different complex combinations of acoustic features such as intensity, fundamental frequency, and speech rate [7]. However, neither acoustic features nor perceived voice features fully predict vocal emotion recognition in normal hearing individuals [6], which indicates that recognition also depends upon other factors, for example differences in schematic knowledge of emotion expressions. Nevertheless, analysis of acoustic features may provide cues for understanding why some emotion expressions are more easily recognized than others, and why particular emotions are commonly mistaken for one another (see for example [8] for such an analysis). Therefore, in the present study, we will also examine acoustic features of different emotional expressions and relate these features to vocal emotion recognition.

Emotions are vocally expressed through emotional prosody as well as through non-verbal vocalizations [9]. Chin, Bergeson, and Phan define prosody as “the suprasegmental features of speech that are conveyed by the parameters of fundamental frequency, intensity, and duration” ([10], p. 356). Non-verbal vocalizations, or affect bursts, are brief non-linguistic sounds, such as laughter to express happiness or amusement, and crying to express sadness [9]. Prosody, including emotional prosody, evolves relatively slowly in speech, requiring temporal tracking and integration over time in the perceiver [11]. In the emotion expresser, patterns of emotional prosody are likely affected to a high degree by socioculturally grounded norms of communication, while the expressive patterns of non-verbal vocalizations are driven, involuntarily, by psychobiological changes in the expresser [4]. Generally, non-verbal vocalizations are recognized more accurately than emotional speech prosody [5, 12], and recognition accuracy for different emotions varies between emotional prosody in speech and non-verbal vocalizations [12–15]. Research on how well different emotions of both stimulus types are recognized compared to other emotions has so far not yielded consistent results (see for example [4] and [15]). However, with regard to emotional prosody specifically, Scherer [5] and Castiajo and Pinheiro [16] argue that anger and sadness are more accurately recognized compared to fear, surprise, and happiness. Anger, fear, sadness, happiness, and interest, expressed through prosody, are not commonly confused for other emotions [5]. It is more common, however, for emotions which can be considered variants of a broader category to be confused for one another, such as pleasure and amusement within the category of happiness [5]. With regard to non-verbal vocalizations, Lima et al. showed that normal hearing younger adults could rapidly recognize expressions of eight different emotions with high accuracy, even under concurrent cognitive load, indicating that, at least for this population, the recognition of non-verbal vocalizations may depend on predominantly automatic mechanisms [17]. The ability to accurately recognize non-verbal emotion expressions declines with aging [18], as does the ability to recognize emotional prosody in speech [19]. Amorim et al. suggest that this decline happens due to age-related changes in brain regions involved in the processing of emotional cues [18]. Hearing loss and aging both independently and conjointly contribute to a diminished ability to accurately recognize speech-embedded vocal emotion expressions [19]. To our knowledge, however, no previous study has examined the effects of mild-to-moderate sensorineural hearing loss on the recognition of non-verbal emotion expressions. Therefore, in the present study, recognition of non-verbal expressions as well as emotional prosody will be investigated.

Vocal emotion recognition under mild-to-moderate hearing loss

Damage to any part of the auditory system can result in hearing loss [20]. The most common form of hearing loss is cochlear hearing loss [21], a type of peripheral sensorineural hearing loss involving damage to structures within the cochlea, most commonly to the outer hair cells (OHCs), but also to the inner hair cells (IHCs) [21–23]. Although not yet fully described, emotional expressions are characterized by different acoustic parameters. When these acoustic cues are correctly identified, their processing and integration help the brain to recognize and differentiate between emotions. Perceiving them relies on abilities such as frequency perception, frequency discrimination, pitch perception, and the processing of speech-related cues.

Consequences of sensorineural hearing loss include reduced sensitivity, reduced dynamic range, and reduced frequency selectivity [21]. Reduced sensitivity is a decreased ability to perceive sounds, often at certain frequencies. The loss of dynamic range entails that quiet sounds become inaudible while loud sounds remain largely unaffected. Reduced frequency selectivity is a decreased ability to resolve the spectral components of complex sounds [21], which is likely related to poorer pitch perception [21]. The presence of hearing loss and its degree of severity are commonly determined by pure-tone audiometry, in which hearing threshold levels, measured in hearing level (dB HL), are determined at frequencies between 125 and 8000 Hz [24]. The pure tone average (PTA) is the average hearing threshold at specified frequencies, and one often reported average is PTA4, which includes the hearing thresholds at 500, 1000, 2000, and 4000 Hz. Degrees of hearing loss are categorized based on hearing thresholds. There is no consensus about the exact categorization of degrees of hearing loss based on hearing thresholds. However, one common categorization, used for example by the World Health Organization (WHO), is to divide degrees of hearing loss into mild (PTA4 of 26 to 40 dB HL), moderate (PTA4 of 41 to 60 dB HL), severe (PTA4 of 61 to 80 dB HL), and profound (PTA4 of more than 81 dB HL) [24]. Mild-to-moderate hearing loss is the most common, making up 92% of cases [1].
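To make this grading concrete, below is a minimal Python sketch, assuming a plain dictionary audiogram of our own invention, that computes PTA4 and maps it onto the cut-offs cited above.

```python
# Minimal sketch: PTA4 and WHO-style grading, using the cut-offs cited above.
# The audiogram dictionary and function names are illustrative assumptions.

def pta4(thresholds_db_hl: dict) -> float:
    """Average hearing threshold (dB HL) at 500, 1000, 2000, and 4000 Hz."""
    return sum(thresholds_db_hl[f] for f in (500, 1000, 2000, 4000)) / 4.0

def degree_of_hearing_loss(pta: float) -> str:
    """Map a PTA4 value onto the categories described in the text."""
    if pta <= 25:
        return "no hearing loss"
    if pta <= 40:
        return "mild"
    if pta <= 60:
        return "moderate"
    if pta <= 80:
        return "severe"
    return "profound"

audiogram = {500: 35, 1000: 40, 2000: 45, 4000: 55}  # hypothetical thresholds
print(degree_of_hearing_loss(pta4(audiogram)))  # moderate (PTA4 = 43.75 dB HL)
```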

The most common intervention for people with hearing loss is the use of hearing aids [21]. However, many individuals with hearing loss do not have access to, or do not use, hearing aids [25]. Hearing aids using linear amplification can restore the audibility of quiet sounds but do not solve the problems of a reduced dynamic range and reduced frequency selectivity [19]. Modern hearing aids aim to restore the function of the outer hair cells through non-linear amplification and compression of sounds, which involves selectively amplifying quieter sounds, but they do not fully restore the normal neural activity patterns in the ear and brain, and thus do not fully restore normal perception [26].

Several studies have found mild-to-moderate hearing loss to be related to deficits in vocal emotion recognition ability [19, 27, 28], deficits that are not mitigated by hearing aid use [27] (but see [29, 30]). However, it has been suggested that what is interpreted as vocal emotion recognition deficits should rather be interpreted as general deficits in prosody recognition (including linguistic and non-linguistic prosody; [30, 31]). Concerning recognition of different vocal emotions, Christensen et al. found no significant interaction between emotion category and hearing loss [19]. Goy et al. [27], however, found such an interaction, with sadness being recognized accurately significantly more often compared to anger, disgust, fear, and happiness, but not compared to surprise or neutrality. In individuals with hearing loss, confusions between different vocal emotions appear to be much more common than in normal hearing individuals [31].

Aims

The overall aim is to deepen our understanding of the effects of hearing loss and signal amplification on vocal emotion recognition. It is unclear how mild-to-moderate hearing loss and linear amplification through a hearing aid affect the perception of auditory cues important for (a) being able to perceive and (b) being able to discriminate between emotions.

More specifically, by examining accuracy for different emotions (what emotions are easiest to accurately identify) and patterns of confusion (which emotions are mixed up when inaccurately identified), for individuals with normal hearing and for individuals with hearing loss (listening with and without linear amplification), and by comparing performance with acoustic analysis of sentences and non-verbal expressions with different emotional emphasis, the aims are to examine:

  • Which acoustic parameters are important for detection of and discrimination between different emotions (by examination of performance of normal hearing participants)

  • How hearing loss affects emotion recognition (by comparing performance of participants with and without hearing loss)

  • How emotion recognition is affected by linear amplification (by examination of performance of participants with hearing loss using linear amplification)

Based on the literature, we predict that:

  1. individuals with hearing loss will have poorer recognition compared to normal hearing individuals for emotions expressed verbally, regardless of acoustic features and regardless of the use of linear amplification, and for emotions expressed non-verbally when linear amplification is not used

  2. individuals with and without hearing loss will not differ in accuracy for non-verbal expressions when linear amplification is used

  3. patterns of confusion will differ between individuals with and without hearing loss for both verbally and non-verbally expressed emotions

  4. vocal emotions which are more distinct in terms of acoustic parameter measures will be recognized with higher accuracy by both groups, but emotions that are distinguished mainly based on frequency parameters will be less correctly identified by the hearing loss group

  5. the more salient the amplitude-related acoustic parameters are for an emotion, the better that emotion will be identified when linear amplification is used compared to not.

2. Methods

2.1. Participants

Using a 2 x 6 mixed design (group x emotion), 80% power, a 5% significance level, a correlation between repeated measures of 0.5, and N = 56 participants (a reasonable number of participants from a recruitment point of view), we will be able to detect an effect as small as η2 = .02 (f = 0.14), i.e., an effect size only slightly above what is conventionally considered a small effect. The power analysis was performed using G*Power version 3.1.9.7 [32].
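For transparency, the sketch below shows the standard conversion between (partial) η² and Cohen's f that such sensitivity figures rest on; it reproduces the reported f ≈ 0.14 but is not a substitute for the full G*Power calculation.

```python
# Minimal sketch: converting between (partial) eta squared and Cohen's f,
# f = sqrt(eta2 / (1 - eta2)); for eta2 = .02 this gives f of roughly 0.14.
import math

def eta_squared_to_f(eta2: float) -> float:
    """Convert (partial) eta squared to Cohen's f."""
    return math.sqrt(eta2 / (1.0 - eta2))

def f_to_eta_squared(f: float) -> float:
    """Convert Cohen's f back to (partial) eta squared."""
    return f ** 2 / (1.0 + f ** 2)

print(round(eta_squared_to_f(0.02), 2))  # 0.14, as reported in the text
```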

Twenty-eight native Swedish-speaking participants, aged 50–75, with mild-to-moderate, bilateral, symmetric sensorineural hearing loss (PTA4 of 30–60 dB HL), who have been using hearing aids for at least one year, and 28 age-matched native Swedish-speaking participants with normal hearing will be recruited. Approximately equal numbers of men and women will be included in both groups. The choice of a PTA4 of ≥30 dB HL as a criterion for the hearing loss group ensures that these participants benefit from hearing aid usage. The audiometric profiles of participants with hearing loss, and information about which type of hearing aids they use, will be obtained through an audiological clinic from participants who give their consent. Normal hearing will be determined through audiometric testing, with the criteria of thresholds equal to or better than 20 dB HL at all frequencies between 125 and 4000 Hz and no worse than 30 dB HL at 8000 Hz. Since an association between general cognitive ability (G) and emotion recognition ability has been established [33], the subtest Matrices from the Swedish version of WAIS-IV, which is strongly correlated with G, will be administered to all participants [34]. Participants who perform more than two standard deviations below the mean for their age span will be excluded from analysis. In addition, participants who have any of the following self-reported diagnoses or problems will be excluded from participation: hyperacusis, neurological disorders affecting the brain (e.g. multiple sclerosis or epilepsy), severe tinnitus (which is perceived to cause impairment and disability), developmental psychiatric disorders (e.g. ADHD, autism spectrum disorders, or intellectual disability), mood and anxiety disorders (e.g. social anxiety disorder or depression), and the experience of great difficulties in identifying and describing one’s own emotions. Before being invited to participate in the study, interested individuals will fill out an online questionnaire, including questions about the health problems and diagnoses described above, educational attainment, age, gender, and native language. Those who fulfill the inclusion criteria will be invited to Linköping University (Linköping, Sweden), where they will perform the experiment. All participants will sign a letter of informed consent. We will follow the Declaration of Helsinki, and the project is approved by the Swedish Ethical Review Authority (Dnr: 2020–03674).

2.2. Task and study design

Stimuli material

The stimulus material is based on fourteen emotionally neutral sentences from the Swedish version of the hearing in noise test (HINT, [35]) and on non-verbal emotion expressions. Four actors, an older female (69 years old), an older male (73 years old), a young female (19 years old), and a young male (29 years old), were recorded while reading the sentences and while producing non-verbal expressions (sound expressions without using language), all with emotional prosody expressing different emotions at high and low intensity. The emotions included in the recordings are anger, happiness, sadness, fear, surprise, and interest. For the sentences, prosodically neutral versions were also recorded. For the non-verbal expressions, the actors were instructed to imagine themselves experiencing the different emotions and to make expressions with sounds, without language, which match those emotions. This entails some variation in the specific sounds expressed for specific emotions by the different actors. The stimuli were recorded in a studio at the Audiology clinic at Linköping University Hospital, with the aid of a sound technician. Recordings were made in Audacity™ [36] using high-quality equipment, 24-bit resolution, and a 44.1 kHz sampling rate. From each actor, the clearest and cleanest recording out of two or more repetitions of each sentence and non-verbal vocalization was selected. With very few exceptions, the sentences are approximately 2–3 seconds long and the non-verbal expressions are 1–2 seconds long.

Validation of stimulus material

Sentences and non-verbal expressions are validated in an online-administered task, where participants classify the perceived emotion by selecting from a list including the different emotions and a neutral choice (“none of the described emotions/other emotion”) to reduce bias in the responses. Only sentences and non-verbal expressions which a majority of participants (>50%) classify as expressing the emotion intended by the actor will be included in the experiment. Preliminary results from the validation show that interest, as a category, is overall not sufficiently well recognized in our material. Interest will therefore not be included in the study.
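As an illustration of this selection rule, the sketch below applies the >50% majority criterion to hypothetical validation data; the column names and responses are invented for the example and do not reflect the actual validation dataset.

```python
# Minimal sketch: keep only stimuli that a majority (>50%) of validation
# raters classified as the emotion intended by the actor.
import pandas as pd

# Hypothetical long-format ratings: one row per rater x stimulus.
ratings = pd.DataFrame({
    "stimulus_id":      ["s01", "s01", "s01", "s02", "s02", "s02"],
    "intended_emotion": ["anger", "anger", "anger", "fear", "fear", "fear"],
    "response":         ["anger", "anger", "sadness", "surprise", "fear", "sadness"],
})

ratings["correct"] = ratings["response"] == ratings["intended_emotion"]
recognition_rate = ratings.groupby("stimulus_id")["correct"].mean()

selected = recognition_rate[recognition_rate > 0.5].index.tolist()
print(selected)  # ['s01'] -- s02 was recognized by only 1 of 3 raters
```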

Study design

Stimuli will be presented through headphones. In the aided listening conditions, participants with hearing loss will use a master hearing aid system with linear amplification according to the Cambridge formula for linear hearing aids [37], in which each stimulus is tailored to each participant’s audiogram during presentation. Participants will sit in front of a computer screen and will be presented with the written question “Which emotion was expressed in the recording you just heard?” with the options happiness, anger, fear, sadness, surprise, and neutral for sentences, and the same categories excluding neutral for non-verbal expressions. The task is to identify the correct emotion by button-press. Two seconds after the participant’s response, the next trial will be presented. The purpose of the lag between response and stimulus presentation is to allow for shifting of attention from responding to listening. Stimuli will be presented using PsychoPy version 3.0 (see [38] for a description of an earlier version).
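The following is a minimal sketch of a single forced-choice trial as described above, written against PsychoPy's Python API; the stimulus file name, key mapping, and window settings are illustrative assumptions rather than the actual experiment script.

```python
# Minimal sketch of one forced-choice trial (sentence condition).
from psychopy import core, event, sound, visual

win = visual.Window(fullscr=False, color="black")
question = visual.TextStim(
    win,
    text=("Which emotion was expressed in the recording you just heard?\n"
          "1 = happiness, 2 = anger, 3 = fear, 4 = sadness, "
          "5 = surprise, 6 = neutral"),
)

stimulus = sound.Sound("stimuli/sentence_anger_high.wav")  # hypothetical path
stimulus.play()
core.wait(stimulus.getDuration())  # let the recording finish playing

question.draw()
win.flip()
keys = event.waitKeys(keyList=["1", "2", "3", "4", "5", "6"])  # forced choice
response = keys[0]

core.wait(2.0)  # two-second lag before the next trial, as described above
win.close()
core.quit()
```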

Sentences and non-verbal expressions will be divided into separate sessions, and participants will have the opportunity to pause briefly between sessions. Participants with hearing loss will perform a total of four sessions: two sessions with hearing aids (aided listening condition), one for sentences and one for non-verbal expressions, and two sessions without hearing aids (unaided listening condition), one for sentences and one for non-verbal expressions. The order of listening conditions will be balanced across participants. To equate test time and load, participants with normal hearing will perform all four sessions, two with sentences as stimuli and two with non-verbal expressions, but only one of each, randomly chosen, will be included in the analyses. Stimuli within sessions will be presented in balanced order. Two different sets of sentences and non-verbal expressions will be used for comparison within each listening condition.

2.3. Analyses

Acoustic analyses

For each recording of verbal and non-verbal emotion expressions, the acoustic parameters of the Geneva Minimalistic Acoustic Parameter Set (GeMAPS, [39]) will be extracted using the openSMILE toolkit v2.3 [40]. The GeMAPS consists of a set of acoustic parameters which has been proposed as a standard for different areas of automatic voice analysis, including the analysis of vocal emotions. The parameters of GeMAPS have been shown to be of value for analyzing emotions in speech in previous research [39] and are described in Table 1.

Table 1. A description of the acoustic parameters of the GeMAPS as discussed in Eyben et al. [39].
Parameter: Explanation
Frequency related
    Fundamental frequency (F0) / Pitch (PT): the logarithmic fundamental frequency, F0, on a semitone scale starting at 27.5 Hz
    Jitter: deviations in individual consecutive F0 period lengths
    Frequency – formant 1: the center frequency of the first formant
    Frequency – formant 2: the center frequency of the second formant
    Frequency – formant 3: the center frequency of the third formant
Energy/Amplitude/Intensity related
    Shimmer: difference of the peak amplitudes of consecutive F0 periods
    Loudness: an estimate of the perceived signal intensity from an auditory spectrum
    Harmonics-to-noise ratio: relation of energy in harmonic components to energy in noise-like components
Spectral (balance)-related components
    Alpha ratio: ratio of the summed energy from 50–1000 Hz and 1–5 kHz
    Hammarberg index: ratio of the strongest energy peak in the 0–2 kHz region to the strongest energy peak in the 2–5 kHz region
    Spectral slope 0–500 Hz: linear regression slope of the logarithmic power spectrum within the given band
    Spectral slope 500–1500 Hz: linear regression slope of the logarithmic power spectrum within the given band
    Relative energy – formant 1: the ratio of the energy of the spectral harmonic peak at the first formant’s center frequency to the energy of the spectral peak at the fundamental frequency
    Relative energy – formant 2: the ratio of the energy of the spectral harmonic peak at the second formant’s center frequency to the energy of the spectral peak at the fundamental frequency
    Relative energy – formant 3: the ratio of the energy of the spectral harmonic peak at the third formant’s center frequency to the energy of the spectral peak at the fundamental frequency
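As an illustration of the extraction step described before Table 1, the sketch below computes GeMAPS functionals for a single recording using the opensmile Python wrapper; the wrapper (rather than the command-line SMILExtract shipped with openSMILE v2.3) and the file path are assumptions for illustration.

```python
# Minimal sketch: extract GeMAPS functionals for one recording.
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.GeMAPSv01b,       # GeMAPS parameter set
    feature_level=opensmile.FeatureLevel.Functionals,  # one row per file
)

features = smile.process_file("stimuli/sentence_anger_high.wav")  # hypothetical path
print(features.filter(like="F0").columns.tolist())  # e.g. the F0-related functionals
```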

To control for baseline differences between speakers’ voices, raw values for each parameter will be centered on each speaker’s neutral voice [8], while means and standard deviations will allow for z-transformations of the separate parameters within speaker and emotion. Acoustic parameters will be calculated for each recording separately. The mean and standard deviation of the z-transformed values of specific parameters within separate emotions will, for each emotion, generate a coordinate in a multi-parameter-dimensional space (where the origin represents neutral). For illustrative purposes, a three-dimensional space where the axes represent weighted values of frequency-, intensity-, and spectral-balance-related parameters will be outlined. For the non-verbal expressions, the same procedure will be followed, with the exception that each emotion will be compared to the mean of all other emotions.
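A minimal sketch of the speaker-wise centering and z-transformation follows, with invented column names and values; the actual pipeline may differ in exactly how means and standard deviations are taken within speaker and emotion.

```python
# Minimal sketch: center each parameter on the speaker's neutral recording,
# then z-transform within speaker.
import pandas as pd

# Hypothetical data: one row per recording, one column per GeMAPS parameter
# (only two parameters shown for brevity).
df = pd.DataFrame({
    "speaker":     ["f_old"] * 4,
    "emotion":     ["neutral", "anger", "sadness", "happiness"],
    "F0_semitone": [28.0, 34.5, 26.0, 33.0],
    "loudness":    [0.40, 0.95, 0.30, 0.80],
})
params = ["F0_semitone", "loudness"]

def center_and_scale(group: pd.DataFrame) -> pd.DataFrame:
    """Subtract the speaker's neutral baseline, then z-score within speaker."""
    out = group.copy()
    baseline = group.loc[group["emotion"] == "neutral", params].mean()
    centered = group[params] - baseline
    out[params] = (centered - centered.mean()) / centered.std(ddof=0)
    return out

df_z = df.groupby("speaker", group_keys=False).apply(center_and_scale)
print(df_z)
```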

Behavioural analyses

First, comparisons between the normal hearing group and the hearing loss group without amplification will be made using one 2 x 6 mixed ANOVA for the verbal stimuli and one 2 x 5 mixed ANOVA for the non-verbal stimuli, both with group as the between-subjects variable and emotion as the within-subjects variable. To relate the effect of hearing loss to performance, differences in performance across groups and emotions will be described in relation to the distance matrices and patterns of confusion from the acoustic analyses.

Second, effects of linear amplification will be analyzed in the hearing loss group using one 2 x 6 repeated measures ANOVA for the verbal stimuli and one 2 x 5 repeated measures ANOVA for the non-verbal stimuli, both with listening condition (amplified and non-amplified) and emotion as factors. To relate linear amplification and acoustic parameters to changes in performance, differences in performance across listening conditions (and emotions) will be described in relation to the distance matrices and patterns of confusion from the acoustic analyses.
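As a sketch of how these models could be fitted, the code below uses the pingouin package (one option, not necessarily the software that will be used) with assumed column names for long-format accuracy data.

```python
# Minimal sketch of the planned ANOVAs; column names and file are assumptions.
import pandas as pd
import pingouin as pg

# Hypothetical columns: id, group, condition, emotion, accuracy
df = pd.read_csv("accuracy_sentences.csv")

# Group comparison (normal hearing vs hearing loss) in unaided listening:
# 2 (group, between) x 6 (emotion, within) mixed ANOVA.
unaided = df[df["condition"] == "unaided"]
mixed = pg.mixed_anova(data=unaided, dv="accuracy", within="emotion",
                       between="group", subject="id")

# Amplification effect within the hearing loss group:
# 2 (condition) x 6 (emotion) repeated measures ANOVA.
hl = df[df["group"] == "hearing_loss"]
rm = pg.rm_anova(data=hl, dv="accuracy", within=["condition", "emotion"],
                 subject="id")

print(mixed, rm, sep="\n")
```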

The dependent/outcome variable in all analyses will be raw accuracy. In addition, we will report Rosenthal’s Proportion Index (PI, [41]) for each emotion expression of both stimulus types, to facilitate future comparison between studies.
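For reference, one common formulation of Rosenthal and Rubin's proportion index rescales accuracy on a k-alternative forced choice so that chance performance maps to 0.5; a minimal sketch, with an invented example value:

```python
# Minimal sketch: Rosenthal and Rubin's proportion index (PI) for raw
# accuracy p with k response alternatives: PI = p(k - 1) / (1 + p(k - 2)).
def proportion_index(p: float, k: int) -> float:
    """Equivalent proportion correct on a two-alternative choice."""
    return p * (k - 1) / (1 + p * (k - 2))

print(round(proportion_index(1 / 6, 6), 2))  # 0.5: chance with six options
print(round(proportion_index(0.60, 6), 2))   # 0.88: 60% accuracy, six options
```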

3. Timeline

Ongoing to December 2021: Pilot study. Validation of recordings of emotion expressions; analysis and selection of material for the presented study.

December 2021-November 2022: Data collection and acoustic analyses.

December 2022-March/April 2023: Analyses of results.

June 2023: Stage 2 report. Submission of article.

Data Availability

This is a stage one registered report and we have not yet begun data collection; therefore, there are no data to be made available. Data will be stored in a public repository after data collection is finished, and a link (URL) will be made publicly available upon publication of the stage two manuscript (full article).

Funding Statement

The author(s) received no specific funding for this work.

References

  • 1. Ferguson MA, Kitterick PT, Chong LY, Edmondson-Jones M, Barker F, Hoare DJ. Hearing aids for mild to moderate hearing loss in adults. Cochrane Database Syst Rev. 2017 Sep 25;9(9):CD012023. doi: 10.1002/14651858.CD012023.pub2
  • 2. Hörselskadades Riksförbund (HRF). Hörselskadade i siffror 2017: Statistik om hörselskadade och hörapparatutprovning från Hörselskadades Riksförbund (HRF). Available from: https://hrf.se/app/uploads/2016/06/Hsk_i_siffror_nov2017_webb.pdf
  • 3. Picou EM, Singh G, Goy H, Russo F, Hickson L, Oxenham AJ, et al. Hearing, Emotion, Amplification, Research, and Training Workshop: Current understanding of hearing loss and emotion perception and priorities for future research. Trends Hear. 2018 Jan-Dec;22:2331216518803215. doi: 10.1177/2331216518803215
  • 4. Keltner D, Gross JJ. Functional accounts of emotions. Cognit Emot. 1999;13(5):467–480. doi: 10.1080/026999399379140
  • 5. Scherer K. Acoustic patterning of emotion vocalizations. In: Frühholz S, Belin P, editors. The Oxford Handbook of Voice Perception. Oxford, UK: Oxford University Press; 2019. p. 61–93.
  • 6. Phillips JS, Lord RG. Schematic information processing and perceptions of leadership in problem-solving groups. J Appl Psychol. 1982 Aug;67(4):486–492.
  • 7. Bänziger T, Hosoya G, Scherer KR. Path models of vocal emotion communication. PLoS One. 2015 Sep 1;10(9):e0136675. doi: 10.1371/journal.pone.0136675
  • 8. Nordström H. Emotional communication in the human voice [dissertation on the Internet]. Stockholm: Stockholm University; 2019 [cited 2021 Mar 25]. Available from: http://www.diva-portal.org/smash/record.jsf?pid
  • 9. Cowen AS, Elfenbein HA, Laukka P, Keltner D. Mapping 24 emotions conveyed by brief human vocalization. Am Psychol. 2019 Sep;74(6):698–712. doi: 10.1037/amp0000399
  • 10. Chin SB, Bergeson TR, Phan J. Speech intelligibility and prosody production in children with cochlear implants. J Commun Disord. 2012 Sep-Oct;45(5):355–66. doi: 10.1016/j.jcomdis.2012.05.003
  • 11. Liebenthal E, Silbersweig DA, Stern E. The language, tone and prosody of emotions: Neural substrates and dynamics of spoken-word emotion perception. Front Neurosci. 2016 Nov 8;10:506. doi: 10.3389/fnins.2016.00506
  • 12. Hawk ST, Van Kleef GA, Fischer AH, Van Der Schalk J. “Worth a thousand words”: Absolute and relative decoding of nonlinguistic affect vocalizations. Emotion. 2009 Jun;9(3):293–305. doi: 10.1037/a0015178
  • 13. Banse R, Scherer KR. Acoustic profiles in vocal emotion expression. J Pers Soc Psychol. 1996 Mar;70(3):614–36. doi: 10.1037//0022-3514.70.3.614
  • 14. Juslin PN, Laukka P. Communication of emotions in vocal expression and music performance: Different channels, same code? Psychol Bull. 2003 Sep;129(5):770–814. doi: 10.1037/0033-2909.129.5.770
  • 15. Sauter DA, Eisner F, Ekman P, Scott SK. Cross-cultural recognition of basic emotions through nonverbal emotional vocalizations. Proc Natl Acad Sci U S A. 2010 Feb 9;107(6):2408–12. doi: 10.1073/pnas.0908239106
  • 16. Castiajo C, Pinheiro AP. Decoding emotions from nonverbal vocalizations: How much voice signal is enough? Motiv Emot. 2019 Oct;43(5):803–813. doi: 10.1007/s11031-019-09783-9
  • 17. Lima CF, Anikin A, Monteiro AC, Scott SK, Castro SL. Automaticity in the recognition of nonverbal emotional vocalizations. Emotion. 2019 Mar;19(2):219–233. doi: 10.1037/emo0000429
  • 18. Amorim M, Anikin A, Mendes AJ, Lima CF, Kotz SA, Pinheiro AP. Changes in vocal emotion recognition across the life span. Emotion. 2021 Mar;21(2):315–325. doi: 10.1037/emo0000692
  • 19. Christensen JA, Sis J, Kulkarni AM, Chatterjee M. Effects of age and hearing loss on the recognition of emotions in speech. Ear Hear. 2019 Sep/Oct;40(5):1069–1083. doi: 10.1097/AUD.0000000000000694
  • 20. Cunningham LL, Tucci DL. Hearing loss in adults. N Engl J Med. 2017 Dec 21;377(25):2465–2473. doi: 10.1056/NEJMra1616601
  • 21. Oxenham AJ. How we hear: The perception and neural coding of sound. Annu Rev Psychol. 2018 Jan 4;69:27–50. doi: 10.1146/annurev-psych-122216-011635
  • 22. Moore BCJ. Cochlear Hearing Loss: Physiological, Psychological and Technical Issues. 2nd ed. Chichester, West Sussex: John Wiley & Sons Ltd; 2007.
  • 23. Zeng F-G, Djalilian H. Hearing impairment. In: Plack CJ, editor. The Oxford Handbook of Auditory Science: vol. 3, Hearing. New York: Oxford University Press; 2010. p. 325–348.
  • 24. Timmer BHB, Hickson L, Launer S. Adults with mild hearing impairment: Are we meeting the challenge? Int J Audiol. 2015;54(11):786–95. doi: 10.3109/14992027.2015.1046504
  • 25. Löhler J, Walther LE, Hansen F, Kapp P, Meerpohl J, Wollenberg B, et al. The prevalence of hearing loss and use of hearing aids among adults in Germany: a systematic review. Eur Arch Otorhinolaryngol. 2019 Apr;276(4):945–956. doi: 10.1007/s00405-019-05312-z
  • 26. Lesica NA. Why do hearing aids fail to restore normal auditory perception? Trends Neurosci. 2018 Apr;41(4):174–185. doi: 10.1016/j.tins.2018.01.008
  • 27. Goy H, Pichora-Fuller MK, Singh G, Russo FA. Hearing aids benefit recognition of words in emotional speech but not emotion identification. Trends Hear. 2018 Jan-Dec;22:2331216518801736. doi: 10.1177/2331216518801736
  • 28. Singh G, Liskovoi L, Launer S, Russo F. The Emotional Communication in Hearing Questionnaire (EMO-CHeQ): Development and evaluation. Ear Hear. 2019 Mar/Apr;40(2):260–271. doi: 10.1097/AUD.0000000000000611
  • 29. Orbelo DM, Grim MA, Talbott RE, Ross ED. Impaired comprehension of affective prosody in elderly subjects is not predicted by age-related hearing loss or age-related cognitive decline. J Geriatr Psychiatry Neurol. 2005 Mar;18(1):25–32. doi: 10.1177/0891988704272214
  • 30. Mitchell RL, Kingston RA, Barbosa Bouças SL. The specificity of age-related decline in interpretation of emotion cues from prosody. Psychol Aging. 2011 Jun;26(2):406–14. doi: 10.1037/a0021861
  • 31. Pralus A, Hermann R, Cholvy F, Aguera PE, Moulin A, Barone P, et al. Rapid assessment of non-verbal auditory perception in normal-hearing participants and cochlear implant users. J Clin Med. 2021 May 13;10(10):2093. doi: 10.3390/jcm10102093
  • 32. Faul F, Erdfelder E, Lang A-G, Buchner A. G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods. 2007 May;39(2):175–91. doi: 10.3758/bf03193146
  • 33. Connolly HL, Lefevre CE, Young AW, Lewis GJ. Emotion recognition ability: Evidence for a supramodal factor and its links to social cognition. Cognition. 2020 Apr;197:104166. doi: 10.1016/j.cognition.2019.104166
  • 34. Wechsler D. WAIS-IV: Manual, Swedish version. Stockholm: Pearson; 2011.
  • 35. Hällgren M, Larsby B, Arlinger S. A Swedish version of the Hearing In Noise Test (HINT) for measurement of speech recognition. Int J Audiol. 2006 Apr;45(4):227–37. doi: 10.1080/14992020500429583
  • 36. Audacity Team. Audacity(R): Free Audio Editor and Recorder, version 2.4.1 [computer application]. The Audacity Team; 2020 [cited 2021 Mar 25]. Available from: https://audacityteam.org/
  • 37. Moore BC, Glasberg BR. Use of a loudness model for hearing-aid fitting. I. Linear hearing aids. Br J Audiol. 1998 Oct;32(5):317–35. doi: 10.3109/03005364000000083
  • 38. Peirce J, Gray JR, Simpson S, MacAskill M, Höchenberger R, Sogo H, et al. PsychoPy2: Experiments in behavior made easy. Behav Res Methods. 2019 Feb;51(1):195–203. doi: 10.3758/s13428-018-01193-y
  • 39. Eyben F, Scherer KR, Schuller BW, Sundberg J, André E, Busso C, et al. The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for voice research and affective computing. IEEE Trans Affect Comput. 2016 Apr-Jun;7(2):190–202. doi: 10.1109/TAFFC.2015.2457417
  • 40. Eyben F, Wöllmer M, Schuller B. openSMILE: the Munich versatile and fast open-source audio feature extractor. In: Proceedings of ACM Multimedia (MM); Florence, Italy: ACM; 2010. p. 1459–1462.
  • 41. Rosenthal R, Rubin DB. Effect size estimation for one-sample multiple-choice-type data: Design, analysis, and meta-analysis. Psychol Bull. 1989 Sep;106(2):332–337.

Decision Letter 0

Qian-Jie Fu

11 Aug 2021

PONE-D-21-13058

Effects of mild-to-moderate sensorineural hearing loss and signal amplification on vocal emotion recognition in middle-aged–older individuals.

PLOS ONE

Dear Dr. Ekberg,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Sep 25 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Qian-Jie Fu, Ph.D.

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf.

2. Please include 'Registered Report Protocol' in the title of your manuscript.

3. Please include your full ethics statement in the ‘Methods’ section of your manuscript file. In your statement, please include the full name of the IRB or ethics committee who approved or waived your study, as well as whether or not you obtained informed written or verbal consent. If consent was waived for your study, please include this information in your statement as well.

Additional Editor Comments (if provided):


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Does the manuscript provide a valid rationale for the proposed study, with clearly identified and justified research questions?

The research question outlined is expected to address a valid academic problem or topic and contribute to the base of knowledge in the field.

Reviewer #1: Partly

Reviewer #2: Partly

**********

2. Is the protocol technically sound and planned in a manner that will lead to a meaningful outcome and allow testing the stated hypotheses?

The manuscript should describe the methods in sufficient detail to prevent undisclosed flexibility in the experimental procedure or analysis pipeline, including sufficient outcome-neutral conditions (e.g. necessary controls, absence of floor or ceiling effects) to test the proposed hypotheses and a statistical power analysis where applicable. As there may be aspects of the methodology and analysis which can only be refined once the work is undertaken, authors should outline potential assumptions and explicitly describe what aspects of the proposed analyses, if any, are exploratory.

Reviewer #1: Partly

Reviewer #2: Partly

**********

3. Is the methodology feasible and described in sufficient detail to allow the work to be replicable?

Reviewer #1: No

Reviewer #2: Yes

**********

4. Have the authors described where all data underlying the findings will be made available when the study is complete?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception, at the time of publication. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above and, if applicable, provide comments about issues authors must address before this protocol can be accepted for publication. You may also include additional comments for the author, including concerns about research or publication ethics.

You may also provide optional suggestions and comments to authors that they might find helpful in planning their study.

(Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Abstract

The authors’ rationale for the current study is that existing research cannot distinguish a specific emotion-perception deficit from a general prosody-perception deficit in listeners with hearing loss, because emotion identification of verbal materials depends on prosody perception. The authors plan to use both verbal and nonverbal emotional speech materials to examine if listeners with hearing loss have a true emotion-specific deficit, and not just a general prosody-related deficit.

The abstract could be edited to make the rationale clearer and more coherent. To improve the flow of the abstract, I suggest focusing on the specific- vs general-deficit issue before moving on to the topics of audibility and hearing aids, instead of introducing information about hearing aid studies and then switching back to the first issue. It is also not clear in the abstract how changing the audibility would help to answer the question of whether listeners have a specific or general deficit, or how examining “mix-ups” would be helpful.

Sentences that need editing include: “…deficits in individuals hearing loss also ha are not ameliorated…” and “…mix-ups between different vocal of individuals…”.

Introduction

The study aim of relating acoustic features to emotion recognition was mentioned in the “vocal emotion recognition” section and included in the Aims section, but was not described in the abstract.

“Patterns of confusion” is a slightly different concept than “differences in accuracy among emotions”. The authors should consider which is most relevant for their purposes and be consistent in their terminology, instead of using these terms interchangeably.

The authors could explain what they mean by “examining patterns of confusion [will lead to] deeper knowledge of emotion recognition” (end of Introduction); e.g., the relevant explanation at the end of the Aims section could be brought up earlier.

- missing ’s’ in ‘material’ at the end of the first paragraph

- pg 3 line 8: clarify what “in contrast” is contrasting

- pg 3 line 10: “however” implies the pattern for prosody and non-verbal vocalizations should be different, but they seem more alike than different

- proofread for extra punctuation, e.g., ‘;,’ at the end of page 3, or missing punctuation, e.g., missing period after ‘poorer pitch perception (18)’

Aims

- the phrase “for emotions expressed non-verbally when linear amplifications is not used” seems to be redundant, given that the first part of the sentence already both types of materials would be more poorly recognized regardless of amplification

- extra ’s’ in ‘amplifications’

Method - participants

Regarding the a priori power calculation, the meaning of the following statement is unclear: “we will be able to conduct separate analyses for different stimulus types and outcome measures”. I assume that it means that the same number of participants is appropriate for a 2 x 7 within-subjects design, for comparing amplified and non-amplified speech materials, and for comparing non-verbal vocalizations to sentences, but it would be clearer to say so explicitly.

Method - task and study design

The sentence materials are described in detail, but there is no description of what the “non-verbal vocalizations” consist of. I see that the “neutral” option is also excluded; is there no possibility of a “neutral” non-verbal vocalization?

What is the accuracy criterion for emotions to be recognized “well above chance” for pilot testing of the speech materials, and will it be in line with previous studies?

The authors should clarify which parts of the text describe pilot testing and which parts describe the procedures for the actual study, e.g., by using a separate sub-header.

For reader unfamiliar with this master hearing aid system, the authors could clarify if the “amplified” stimuli will be pre-processed and tailored to each participant’s audiogram before presentation in the session — is this the case?

If participants are presented with 8 options instead of 7, including the extra option of “other emotion”, wouldn’t this technically create 8 levels in the condition (instead of 7 as in the power calculation)? This also seems to add an extra complication to the calculation of chance levels for recognition, given that actors were directed to create only 7 emotions.

Method - analyses

A very brief description of GeMAPS would help readers to understand why the authors chose to use this set of speech measures.

How will the distance matrices of the acoustic measurements be integrated with the behavioral data (accuracy)?

Reviewer #2: Introduction

-page 2, first paragraph, they’ve been identified as an important question… why? Especially auditory rehabilitation (do you mean use of amplification devices? “auditory rehab” usually pertains to behavioural strategies).

-maybe in this first paragraph you want to discuss the number of individuals (esp older adults) with hearing loss, the prevalence of this condition in that population makes addressing issues related to age-related hearing loss a pressing concern. You actually don’t really talk about hearing loss in middle-aged/older individuals (your target population for the proposed study) at all, or why it is so important to study these people.

-page 3, first paragraph. There are a lot of data here about differences in recognition rates of emotions, and how they vary between emotional prosody and non-verbal vocalizations. Could be helpful to provide a table?

End of first paragraph, page 3, reword “why we…”

If less is known, is there any information that is known? Any previous studies on this?

Add a line about how hearing aids are most common intervention but still such a low uptake. Suggests that many people (esp. older adults) with hearing loss are not accessing amplification.

Top of page 4, Christensen and Goy papers, are these with older individuals experiencing hearing loss? Is there a confusion between age x emotion x hearing loss? Any expectations why aging might change ability to recognize emotion?

Method

2.2- can you break down into sub-heading first describing the pilot study, then another section describing the proposed full set of stimuli

2.3 Analyses- can you break down into behavioural data (participant responses) and acoustical analysis of your stimuli, starting with “for each recording…”

It would be helpful to describe where your data will be stored (institutional website?).

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 Jan 7;17(1):e0261354. doi: 10.1371/journal.pone.0261354.r002

Author response to Decision Letter 0


22 Oct 2021

Reviewer #1

Abstract

Reviewer comment 1:1

The authors’ rationale for the current study is that existing research cannot distinguish a specific emotion-perception deficit from a general prosody-perception deficit in listeners with hearing loss, because emotion identification of verbal materials depends on prosody perception. The authors plan to use both verbal and nonverbal emotional speech materials to examine if listeners with hearing loss have a true emotion-specific deficit, and not just a general prosody-related deficit.

The abstract could be edited to make the rationale clearer and more coherent. To improve the flow of the abstract, I suggest focusing on the specific- vs general-deficit issue before moving on to the topics of audibility and hearing aids, instead of introducing information about hearing aid studies and then switching back to the first issue. It is also not clear in the abstract how changing the audibility would help to answer the question of whether listeners have a specific or general deficit, or how examining “mix-ups” would be helpful.

Sentences that need editing include: “…deficits in individuals hearing loss also ha are not ameliorated…” and “…mix-ups between different vocal of individuals…”.

Response 1:1

Thank you for the suggestions about the abstract, we have now rewritten the abstract to make the rationale clearer and more coherent. The new abstract now reads as follows:

” Previous research has shown deficits in vocal emotion recognition in sub-populations of individuals with hearing loss, making this a high priority research topic. However, previous research has only examined vocal emotion recognition using verbal material, in which emotions are expressed through emotional prosody. There is evidence that older individuals with hearing loss suffer from deficits in general prosody recognition, not specific to emotional prosody. No study has examined the recognition of non-verbal vocalization, which constitutes another important source for the vocal communication of emotions. It might be the case that individuals with hearing loss have specific difficulties in recognizing emotions expressed through prosody in speech, but not non-verbal vocalizations. We aim to examine whether vocal emotion recognition difficulties in middle- aged-to older individuals with sensorineural mild-moderate hearing loss are better explained by deficits in vocal emotion recognition specifically, or deficits in prosody recognition generally by including both sentences and non-verbal expressions. Furthermore a,, some of the studies which have concluded that individuals with mild-moderate hearing loss have deficits in vocal emotion recognition ability have also found that the use of hearing aids does not improve recognition accuracy in this group. We aim to examine the effects of linear amplification and audibility on the recognition of different emotions expressed both verbally and non-verbally. Besides examining accuracy for different emotions we will also look at patterns of confusion (which specific emotions are mistaken for other specific emotion and at which rates) during both amplified and non-amplified listening, and we will analyze all material acoustically and relate the acoustic content to performance. Together these analyses will provide clues to effects of amplification on the perception of different emotions. For these purposes, a total of 70 middle-aged-older individuals, half with mild-moderate hearing loss and half with normal hearing will perform a computerized forced-choice vocal emotion recognition task with and without amplification.”

Introduction

Reviewer comment 1:2

The study aim of relating acoustic features to emotion recognition was mentioned in the “vocal emotion recognition” section and included in the Aims section, but was not described in the abstract.

Response 1:2

Thank you for noticing and pointing this out. This aim is now included in the abstract. See Response 1:1.

Reviewer comment 1:3

“Patterns of confusion” is a slightly different concept than “differences in accuracy among emotions”. The authors should consider which is most relevant for their purposes and be consistent in their terminology, instead of using these terms interchangeably.

Response 1:3

We agree that there is a difference between the two terms. We distinguish between accuracy for different emotions and patterns of confusion throughout the manuscript.

Reviewer comment 1:4

The authors could explain what they mean by “examining patterns of confusion [will lead to] deeper knowledge of emotion recognition” (end of Introduction); e.g., the relevant explanation at the end of the Aims section could be brought up earlier.

Response 1:4

Thank you for pointing out the need for clarification here. In our revised aims section we now write (end of pg.4- pg.5):

“More specifically, by examining accuracy for different emotions (what emotions are easiest to accurately identify) and patterns of confusion (which emotions are mixed up when inaccurately identified), for individuals with normal hearing and for individuals with hearing loss (listening with and without linear amplification), and by comparing performance with acoustic analysis of sentences and non-verbal expressions with different emotional emphasis, the aims are to examine:

• Which acoustic parameters are important for detection of and discrimination between different emotions (by examination of performance of normal hearing participants)?

• How the effect of hearing loss affects that pattern (by comparing performance of participants with and without hearing loss)?

• How that pattern is affected by linear amplification (by examination of performance of participants with hearing loss using linear amplification)?”

We also introduce the topic of confusion between different emotions in individuals with hearing loss in the section “Vocal emotion recognition under mild-to-moderate hearing loss” in the introduction:

“In individuals with hearing loss, confusions between different vocal emotions appear to be much more common in comparison to normal hearing individuals (31)” (pg. 4, §3)

Reviewer comment 1:5

- missing ’s’ in ‘material’ at the end of the first paragraph

Response 1:5

Thanks for noticing this error. The ’s’ has been added.

Reviewer comment 1:6

- pg 3 line 8: clarify what “in contrast” is contrasting

Response 1:6

Thanks for pointing this inconsistency out. This part of the text has been removed from the current manuscript.

Reviewer comment 1:7

- pg 3 line 10: “however” implies the pattern for prosody and non-verbal vocalizations should be different, but they seem more alike than different

Response 1:7

Thank you for this accurate observation. The wording has been changed to:

”However, with regard to emotional prosody specifically, Scherer (5) and Castiajo and Pinheiro (16) argue that anger and sadness are more accurately recognized compared to fear, surprise, and happiness.”

Reviewer comment 1:8

- proofread for extra punctuation, e.g., ‘;,’ at the end of page 3, or missing punctuation, e.g., missing period after ‘poorer pitch perception (18)’

Response 1:8

Thank you for noticing. Missing period after ’poorer pitch perception’ has been added. The extra semicolon on pg. 3 has been removed.

Aims

Reviewer comment 1:9

- the phrase “for emotions expressed non-verbally when linear amplifications is not used” seems to be redundant, given that the first part of the sentence already states that both types of materials would be more poorly recognized regardless of amplification

Response 1:9

Thank you for noticing this mistake. The preceding sentence refers to verbal expressions and this one to non-verbal expressions. We have rewritten the sentence and clarified it:

“...we predict that:

1. individuals with hearing loss will have poorer recognition compared to the normal hearing group for emotions expressed verbally, regardless of acoustic features and regardless of the use of linear amplification, and for emotions expressed non-verbally when linear amplifications is not used”

Reviewer comment 1:10

- extra ’s’ in ‘amplifications’

Response 1:10

Thank you for noticing. The extra ’s’ has been removed (pg. 5, top of page and lines below).

Method - participants

Reviewer comment 1:11

Regarding the a priori power calculation, the meaning of the following statement is unclear: “we will be able to conduct separate analyses for different stimulus types and outcome measures”. I assume that it means that the same number of participants is appropriate for a 2 x 7 within-subjects design, for comparing amplified and non-amplified speech materials, and for comparing non-verbal vocalizations to sentences, but it would be clearer to say so explicitly.

Response 1:11

Thank you. We agree that this section was unclear. This text has been changed to:

“Using a 2 x 6 mixed design (group x emotion), 80% power, 5% significance level, correlation between repeated measures of 0.5 and with N=56 participants (a reasonable number of participants from a recruitment point of view), we will be able to detect an effect as small as η²=.02 (f=0.14); a next-to-small effect size. Power analysis was performed using G*Power version 3.1.9.7 (32).” (pg. 5, §3)

As is discussed in the analyses section, several of the planned analyses will be 2 x 6 mixed ANOVAs, but not all. However, we consider this the most demanding analysis in terms of number of participants, so we based our estimation on it, together with considerations of how many participants we will realistically be able to recruit.
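For illustration, the effect-size conversion behind this sensitivity figure can be verified with a few lines of Python; the actual mixed-design power computation is done in G*Power as stated above, so this sketch only checks that η² = .02 corresponds to Cohen's f ≈ 0.14.

    # Minimal check of the eta-squared to Cohen's f conversion used in the
    # G*Power sensitivity analysis quoted above (eta^2 = .02 -> f ~= 0.14).
    import math

    def eta_sq_to_cohens_f(eta_sq: float) -> float:
        """Cohen's f = sqrt(eta^2 / (1 - eta^2))."""
        return math.sqrt(eta_sq / (1.0 - eta_sq))

    print(round(eta_sq_to_cohens_f(0.02), 2))  # 0.14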

Method - task and study design

Reviewer comment 1:12

The sentence materials are described in detail, but there is no description of what the “non-verbal vocalizations” consist of. I see that the “neutral” option is also excluded; is there no possibility of a “neutral” non-verbal vocalization?

Response 1:12

Thank you for noticing this omission. We have added the following description in the Task and study design section (pg. 5, §2):

”For the non-verbal expressions the actors were instructed to imagine themselves experiencing the different emotions and to make expressions with sounds, without language which match those emotions. This entails some variation of the specific sounds expressed for specific emotions by different participants. The sounds which were perceived as most clearly expressing a given emotion were selected by the authors, and some were slightly edited (omitting pauses) such that all sounds are between 1-2 seconds long.”

With regard to the exclusion of neutral, we believe that such expressions would be very difficult to achieve, if they are achievable. We have not found any previous study including neutral for non-verbal expressions.
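Purely to illustrate the kind of editing described above (the authors appear to have edited the sounds manually, and no toolchain is specified), a silence-trimming pass with a duration check might look as follows in Python, assuming the librosa and soundfile packages and a hypothetical file name:

    # Sketch: trim leading/trailing silence from a non-verbal expression and
    # check that the edited sound falls within the stated 1-2 s range.
    import librosa
    import soundfile as sf

    y, sr = librosa.load("nonverbal_anger_01.wav", sr=None)   # hypothetical file
    y_trimmed, _ = librosa.effects.trim(y, top_db=30)         # drop quiet edges
    duration = len(y_trimmed) / sr
    if 1.0 <= duration <= 2.0:
        sf.write("nonverbal_anger_01_edited.wav", y_trimmed, sr)
    else:
        print(f"Needs manual editing: {duration:.2f} s")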

Reviewer comment 1:13

What is the accuracy criterion for emotions to be recognized “well above chance” for pilot testing of the speech materials, and will it be in line with previous studies?

Response 1:13

There does not seem to be a generally accepted criterion for validation at the level of individual stimuli. The commonly used criterion of above-chance accuracy seems to be used at a category level such as overall accuracy for expressions of a particular category. We have simply chosen stimuli which are accurately classified by a majority of participants (>50%). The new description under the subheading Validation of stimulus material reads:

“Only sentences and non-verbal expressions which a majority of participants (>50%) classify as expressing the emotion intended by the actor will be included in the experiment.” (pg. 6, §3)
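To make the criterion concrete, the selection step could be expressed as follows; this is a sketch only, with hypothetical file and column names for the validation data:

    # Sketch: keep only stimuli that a majority (>50%) of pilot raters labeled
    # with the emotion intended by the actor. File and column names are hypothetical.
    import pandas as pd

    ratings = pd.read_csv("validation_ratings.csv")    # one row per rater x stimulus
    hit = ratings["chosen_emotion"] == ratings["intended_emotion"]
    accuracy = hit.groupby(ratings["stimulus_id"]).mean()
    selected = accuracy[accuracy > 0.5].index.tolist()
    print(f"{len(selected)} stimuli retained")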

Reviewer comment 1:14

The authors should clarify which parts of the text describe pilot testing and which parts describe the procedures for the actual study, e.g., by using a separate sub-header.

Response 1:14

Thank you for this suggestion. The pilot study is now described under the new heading “Validation of stimulus material”. (pg. 6, §3)

Reviewer comment 1:15

For readers unfamiliar with this master hearing aid system, the authors could clarify if the “amplified” stimuli will be pre-processed and tailored to each participant’s audiogram before presentation in the session — is this the case?

Response 1:15

The master hearing aid system will process the auditory stimuli tailored to the participant’s audiogram during presentation, such that the amplified stimuli will be presented via headphones to the participants, who at the time will not wear their own hearing aids. We have added the following description (pg. 6, §4):

“… with linear amplification using the Cambridge formula for linear hearing aids (37) in which each stimulus is tailored to each participant’s audiogram during presentation”
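To give readers unfamiliar with master hearing aid systems a rough picture, audiogram-tailored linear gain can be sketched as below. This is an illustration only: the gain rule and threshold values are placeholders, not the actual Cambridge prescription (37), and the sampling rate and filter design are assumptions.

    # Illustration only: frequency-dependent linear gain applied to a stimulus,
    # tailored to a hypothetical audiogram. The 0.5 x threshold rule is a
    # placeholder, not the published Cambridge formula.
    import numpy as np
    from scipy.signal import firwin2, lfilter

    fs = 44100                                          # assumed sampling rate
    audiogram_hz = np.array([250, 500, 1000, 2000, 4000, 8000])
    thresholds_db = np.array([20, 25, 30, 40, 50, 55])  # hypothetical mild-moderate loss

    gain_db = 0.5 * thresholds_db                       # placeholder prescription rule
    gain_lin = 10.0 ** (gain_db / 20.0)

    # Target magnitude response from 0 Hz to Nyquist for linear-phase FIR design.
    freqs = np.concatenate(([0.0], audiogram_hz, [fs / 2]))
    gains = np.concatenate(([gain_lin[0]], gain_lin, [gain_lin[-1]]))
    fir = firwin2(1025, freqs, gains, fs=fs)

    def amplify(stimulus: np.ndarray) -> np.ndarray:
        """Apply the participant-specific linear amplification to one stimulus."""
        return lfilter(fir, 1.0, stimulus)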

Reviewer comment 1:16

If participants are presented with 8 options instead of 7, including the extra option of “other emotion”, wouldn’t this technically create 8 levels in the condition (instead of 7 as in the power calculation)? This also seems to add an extra complication to the calculation of chance levels for recognition, given that actors were directed to create only 7 emotions.

Response 1:16

The option of ’other emotion’ is only included in the pilot/validation study. We agree that adding the option of “other emotion” does complicate the calculation of chance levels, although it seems to be standard to include such an option in validation studies to reduce bias. Our new criterion, as described above, does not depend on the number of response options.

Method - analyses

Reviewer comment 1:17

A very brief description of GeMAPS would help readers to understand why the authors chose to use this set of speech measures.

Response 1:17

The following description has been added (pg. 7, §2): “The GeMAPS consists of a set of acoustic parameters which have been proposed as a standard for different areas of automatic voice analysis, including the analysis of vocal emotions. The parameters of GeMAPS have been shown to be of value for analyzing emotions in speech in previous research (40) and are described in Table 1.”
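For concreteness, extracting the GeMAPS functionals for a single stimulus might look as follows, assuming the open-source opensmile Python wrapper for openSMILE (the manuscript does not specify the extraction toolchain) and a hypothetical file name:

    # Sketch: extract GeMAPS functionals for one stimulus with the opensmile
    # Python package (an assumption about tooling; the file name is hypothetical).
    import opensmile

    smile = opensmile.Smile(
        feature_set=opensmile.FeatureSet.GeMAPSv01b,
        feature_level=opensmile.FeatureLevel.Functionals,
    )
    features = smile.process_file("sentence_happy_01.wav")
    print(features.shape)   # one row, one column per GeMAPS parameter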

Reviewer comment 1:18

How will the distance matrices of the acoustic measurements be integrated with the behavioral data (accuracy)?

Response 1:18

We will discuss differences in accuracy and patterns of confusion in relation to descriptive graphical plots of distance matrices for the central acoustic parameters. We have added the following under Behavioral analyses, pg. 8, §1 of the subsection:

“To relate linear amplification and acoustic parameters to changes in performance, differences in performance over hearing condition (and emotions) will be described in relation to the distance matrices and patterns of confusion from the acoustic analyses.”
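To make the planned link between behavior and acoustics concrete, the two kinds of matrices could be computed roughly as follows; this is a sketch under assumed file layouts and column names, not the authors' analysis code:

    # Sketch: a response-level confusion matrix and a Euclidean distance matrix
    # over emotion-category centroids in (standardized) GeMAPS feature space.
    # File and column names are hypothetical.
    import pandas as pd
    from scipy.spatial.distance import pdist, squareform

    responses = pd.read_csv("responses.csv")            # trial-level behavioral data
    confusion = pd.crosstab(responses["intended"], responses["chosen"],
                            normalize="index")          # row-normalized confusion rates

    features = pd.read_csv("gemaps_features.csv")       # one row per stimulus
    numeric = features.select_dtypes("number")
    z = (numeric - numeric.mean()) / numeric.std()      # standardize to equalize scales
    z["emotion"] = features["emotion"]
    centroids = z.groupby("emotion").mean()
    distances = pd.DataFrame(squareform(pdist(centroids)),
                             index=centroids.index, columns=centroids.index)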

Reviewer #2:

Reviewer comment 2:1

Introduction

-page 2, first paragraph, they’ve been identified as an important question… why? Especially auditory rehabilitation (do you mean use of amplification devices? “auditory rehab” usually pertains to behavioural strategies).

Response 2:1

Picou et al. (2018) identified that the subject of emotion recognition in hearing loss has been slightly neglected in audiology. They highlight the importance of emotion recognition for interpersonal interaction and social functioning and discuss a number of topics for which there is a need for research, including emotion recognition (more knowledge of how it is affected by hearing loss), the effects of amplification (long-term and short-term), and different rehabilitation strategies (what works and how) with regard to vocal emotion recognition.

With “auditory rehabilitation” we initially meant all types of interventions, including amplification. We now realize, however, that auditory rehabilitation (referring primarily to behavioral interventions) should be distinguished from amplification (through devices such as hearing aids and CIs). Therefore, we now distinguish between the two in the text.

We have added the following to pg. 2, §1:

”Because of the importance of emotion recognition for interpersonal interaction and, thus, indirectly to measurements of well-being such as perceived quality of life and the presence/absence of depression, the effects of hearing loss, amplification, and auditory rehabilitation on the recognition and experience of vocal emotions, have been identified as important research topics (3)”

Reviewer comment 2:2

-maybe in this first paragraph you want to discuss the number of individuals (esp. older adults) with hearing loss; the prevalence of this condition in that population makes addressing issues related to age-related hearing loss a pressing concern. You actually don’t really talk about hearing loss in middle-aged/older individuals (your target population for the proposed study) at all, or why it is so important to study these people.

Response 2:2

Thank you for this suggestion. We agree that this will be a good intro and have therefore added such information to the first paragraph of the Introduction. We have here also clarified the importance of investigating hearing loss in middle-aged individuals. As a consequence, the part introducing the meaning of emotions has been moved down under the subheading ’vocal emotion recognition’. The first part of the introduction now reads:

”Hearing loss is among the leading causes of disability globally with an increasing prevalence with age. It leads to difficulties in communication which can contribute to social isolation and diminished well-being (1). In Sweden, it has been estimated that approximately 18% of the population have a hearing loss. While the prevalence is comparatively much higher in the oldest age groups (44.6 % in the age group of 75-84 and 56.5% in the age group of 85+ years) the prevalence in the age groups of 55-64 and 65-74 (middle-age-to-old) is also relatively high (24.2 and 32.6%) (2). This shows that hearing loss is not only a concern for the oldest in the population. Because of the importance of emotion recognition for interpersonal interaction and, thus, indirectly to measurements of well-being such as perceived quality of life and the presence/absence of depression, the effects of hearing loss, amplification, and auditory rehabilitation on the recognition and experience of vocal emotions, have been identified as important research topics (3).”

Reviewer comment 2:3

-page 3, first paragraph. There are a lot of data here about differences in recognition rates of emotions, and how they vary between emotional prosody and non-verbal vocalizations. Could be helpful to provide a table?

Response 2:3

We agree that this paragraph was dense and difficult to follow. We have considered providing a table but decided to keep the information as text. However, we have revised this section to make it clearer, as follows:

”Research on how well different emotions of both stimulus types are recognized compared to other emotions has so far not yielded consistent results (see for example 4 and 15). However, with regard to emotional prosody specifically, Scherer and Castiajo and Pinheiro argue that anger and sadness are more accurately recognized compared to fear, surprise, and happiness (4, 15)” (pg. 3, §1, lines 12-16)

Reviewer comment 2:4

End of first paragraph, page 3, reword “why we…”

Response 2:4

This part has now been rewritten and reads as follows:

“Therefore, in the present study, recognition of non-verbal expressions as well as emotional prosody will be investigated.” (pg.3, last two lines of §1)

Reviewer comment 2:5

If less is known, is there any information that is known? Any previous studies on this?

Response 2:5

We have not found any study using non-verbal expressions with older individuals with hearing loss. We have added the following, with the addition of a new reference (no. 18), on pg. 3, end of §1, lines 29-32:

”The ability to accurately recognize non-verbal emotion expressions declines with aging (18), as does the ability to recognize emotional prosody in speech (19). Amorim et al. suggest that this decline happens due to age-related changes in brain regions which are involved in the processing of emotional cues (18). Hearing loss and aging both independently and conjointly contribute to a diminished ability to accurately recognize speech embedded vocal emotion expressions generally (19). To our knowledge, however, no previous study has examined the effects of mild-to-moderate sensorineural hearing loss on non-verbal emotion expressions.”

Reviewer comment 2:7

Add a line about how hearing aids are most common intervention but still such a low uptake. Suggests that many people (esp. older adults) with hearing loss are not accessing amplification.

Response 2:7

We have added the following at pg. 4, §2, along with a new reference:

”However, many individuals with hearing loss do not have access to or do not use hearing aids (25)”.

Reviewer comment 2:8

Top of page 4, Christensen and Goy papers, are these with older individuals experiencing hearing loss? Is there a confusion between age x emotion x hearing loss? Any expectations why aging might change ability to recognize emotion?

Response 2:8

The Goy et al. paper examines the correlations between age and emotion identification and between PTA and emotion identification independently; both age and PTA correlate significantly with emotion identification. In Christensen et al., participants are divided into different age groups. They found that aging impacts vocal emotion recognition negatively, that aging and hearing loss independently have negative effects on vocal emotion recognition accuracy, and that the effects of hearing loss and old age are also additive. The suggested reasons for the age-related decline include both biological and social factors. We have added a brief discussion of one such proposed explanation, along with a very brief discussion of the effects of hearing loss and aging (see response 2:5):

”The ability to accurately recognize non-verbal emotion expressions declines with aging (18), as does the ability to recognize emotional prosody in speech (19). Amorim et al. suggest that this decline happens due to age-related changes in brain regions which are involved in the processing of emotional cues (18). Hearing loss and aging both independently and conjointly contribute to a diminished ability to accurately recognize speech embedded vocal emotion expressions generally (19).”

Method

Reviewer comment 2:9

2.2- can you break down into sub-heading first describing the pilot study, then another section describing the proposed full set of stimuli

Response 2:9

The pilot study, stimuli material and the experimental study design are now presented under separate sub-headers (see pg. 6).

Reviewer comment 2:10

2.3 Analyses- can you break down into behavioural data (participant responses) and acoustical analysis of your stimuli, starting with “for each recording…”

Response 2:10

Analyses have been broken down into ’behavioral data’ and ’acoustic analyses’ under separate sub-headings (see pg. 7 for Acoustic analyses and pg.8 for Behavioral data)

Reviewer comment 2:11

It would be helpful to describe where your data will be stored (institutional website?).

Response 2:11

We intend to store data on the Open Science Framework (OSF) website. However, we have not yet created a link for this as the study has not started.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Qian-Jie Fu

1 Dec 2021

Effects of mild-to-moderate sensorineural hearing loss and signal amplification on vocal emotion recognition in middle-aged–older individuals.

PONE-D-21-13058R1

Dear Dr. Ekberg,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Qian-Jie Fu, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Does the manuscript provide a valid rationale for the proposed study, with clearly identified and justified research questions?

The research question outlined is expected to address a valid academic problem or topic and contribute to the base of knowledge in the field.

Reviewer #1: Yes

**********

2. Is the protocol technically sound and planned in a manner that will lead to a meaningful outcome and allow testing the stated hypotheses?

The manuscript should describe the methods in sufficient detail to prevent undisclosed flexibility in the experimental procedure or analysis pipeline, including sufficient outcome-neutral conditions (e.g. necessary controls, absence of floor or ceiling effects) to test the proposed hypotheses and a statistical power analysis where applicable. As there may be aspects of the methodology and analysis which can only be refined once the work is undertaken, authors should outline potential assumptions and explicitly describe what aspects of the proposed analyses, if any, are exploratory.

Reviewer #1: Yes

**********

3. Is the methodology feasible and described in sufficient detail to allow the work to be replicable?

Reviewer #1: Yes

**********

4. Have the authors described where all data underlying the findings will be made available when the study is complete?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception, at the time of publication. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above and, if applicable, provide comments about issues authors must address before this protocol can be accepted for publication. You may also include additional comments for the author, including concerns about research or publication ethics.

You may also provide optional suggestions and comments to authors that they might find helpful in planning their study.

(Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors' revisions have strengthened the rationale and improved the clarity of the manuscript overall; I have no further suggestions.

There is a minor typo in the abstract, "Furthermore a,," and a missing period before "Preliminary results" on p6.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Associated Data


    Supplementary Materials

    Attachment

    Submitted filename: Response to Reviewers.docx

    Data Availability Statement

    This is a stage one registered report and we have not yet begun data collection, so there are no data to be made available. Data will be stored in a public repository after data collection is finished, and a link (URL) will be made publicly available upon publication of the stage two manuscript (full article).

