Short abstract
Head movements can improve sound localization performance and speech intelligibility in acoustic environments with spatially distributed sources. However, they can also affect the performance of hearing aid algorithms, either when adaptive algorithms have to adjust to changes in the acoustic scene caused by head movement (the so-called maladaptation effect) or when directional algorithms are no longer facing in the optimal direction because the head has turned away (the so-called misalignment effect). In this article, we investigated the mechanisms behind these maladaptation and misalignment effects for a set of six standard hearing aid algorithms using acoustic simulations based on premade databases, which allowed the effects to be studied in detail. Experiment 1 investigated the maladaptation effect by analyzing hearing aid benefit after simulated rotational head movement in simple anechoic noise scenarios. The effects of movement parameters (start angle and peak velocity), noise scenario complexity, and adaptation time were studied, as well as the recovery time of the algorithms. However, a significant maladaptation effect was found only in the most unrealistic anechoic scenario with one noise source. Experiment 2 investigated the effects of maladaptation and misalignment using previously recorded natural head movements in acoustic scenes resembling everyday life situations. In line with the results of Experiment 1, no effect of maladaptation was found in these more realistic acoustic scenes. However, a significant effect of misalignment on the performance of directional algorithms was found. This demonstrates the need to take head movement into account in the evaluation of directional hearing aid algorithms.
Keywords: head movement, hearing aids, maladaptation, misalignment, directional microphones
When listening and talking, people naturally move their head, and these movements often give benefits. For example, rotational and translational head movements can improve localization performance (Lu & Cooke, 2011; Wallach, 1940), and optimizing head orientation can lead to speech intelligibility benefits from better-ear listening, both in normal-hearing listeners (Grange & Culling, 2016) and in hearing-impaired listeners with asymmetric hearing loss (Brimijoin et al., 2012). By optimizing head orientation, people make use of the head shadow effect, which refers to the phenomenon that the head casts an acoustic shadow, causing level differences depending on the frequency and the direction of incidence of a sound source relative to the head (Van Wanrooij & Van Opstal, 2004). The head shadow effect is therefore important for speech perception and hearing aid performance and is influenced by head and source movement. There are several examples of hearing aid algorithms that can use head movement to improve their performance. For example, in algorithms that estimate the direction-of-arrival of sound sources, knowledge of head movement has been used to reduce front-back confusion (Archer-Boyd et al., 2015). Other examples are algorithms that use head movement and eye movement to estimate the spatial auditory attention of the hearing aid user, which is then used to enhance attended sources or suppress unattended ones (Best et al., 2017; Favre-Félix et al., 2017; Grimm et al., 2018; Grimm, Luberadzka, et al., 2016; Hart et al., 2009; Tessendorf et al., 2011).
However, other data demonstrate that head movements can also reduce the performance of hearing aid algorithms. Ricketts (2000) found a significantly greater speech intelligibility benefit with directional microphones when the hearing aid user was facing 30° off-axis, implying that other head orientations result in suboptimal performance. Abdipour et al. (2015) showed that the performance of existing localization-based source separation methods was reduced by moving sources, whose apparent movement can also be caused by head movement, and Boyd et al. (2013) showed that the performance of direction-of-arrival estimators can degrade due to head movements. Finally, Hamacher et al. (2005) described the problem that adaptive beamformers have to readapt after head turns, which could limit their benefit in everyday life.
Two consequences of head movement can be identified. First, the head movement might result in a head orientation that is not optimal. As shown by Ricketts, this could have consequences for directional algorithms, as their beam might be misaligned with the direction of optimal benefit. Second, the head movement causes dynamic changes to the scene. This could have consequences for hearing aid algorithms that use temporal integration to estimate certain properties of the scene because the head movement can cause smearing of the estimated properties during the temporal integration. Moreover, the update rate of the algorithm also determines how fast it can adapt to the new situation after head movement. Instantaneous adaptation is usually not possible because of artifacts. Therefore, adaptive algorithms need time to adapt to the new situation, which could lead to maladaptation if the algorithms are not fast enough or if the situation is constantly changing. Note that misalignment can apply to any algorithm that uses direction, but maladaptation is only of relevance to adaptive algorithms.
The goal of this study was to evaluate the effect of head movement on noise suppression by hearing aid algorithms in terms of maladaptation and misalignment (other properties of the algorithm output, such as sound quality and speech intelligibility, could also be affected by maladaptation and misalignment, but they are outside our current scope). To understand the impact of the effect of head movement, it is important to relate it to the benefit provided by the algorithms in terms of noise suppression. The study consisted of two experiments both using entirely simulated methods and analyses so that the mechanisms behind maladaptation and misalignment could be investigated in detail. Experiment 1 used synthetic (cosine) head movements and analyzed the noise suppression performance after the end of the movement (when the head was facing in the direction of the target source) so that the maladaptation effect could be singled out. The experiment evaluated how large the maladaptation effect is and how it is influenced by movement parameters and the complexity of the noise scenario. It also examined how long it took for the algorithms to recover from maladaptation. Experiment 2 focused on the combined effect of maladaptation and misalignment during natural head movements measured in virtual audiovisual environments (VEs) resembling everyday life situations (Hendrikse et al., 2019b). It was shown in an earlier study that the movement behavior recorded when the participants were watching the animated characters used in these VEs was similar to the movements made when watching video recordings of real persons (Hendrikse et al., 2018).
For Experiment 1, it was expected that the adaptation time would be the most important factor to determine whether an algorithm is affected by maladaptation and how long this effect lasts. It was hypothesized that a fast-adapting algorithm would adapt quickly to the new situation after head movement, so any potential effect of maladaptation would be small. A larger effect of maladaptation was expected for algorithms with an adaptation time in the same order of magnitude as the movement duration or for slowly adaptive algorithms. The duration of the movement was expected to be a determining factor when the adaptation time was of the same order of magnitude because this determined how much the algorithms could adapt during the movement. A slowly adaptive algorithm would not have time to adapt during the movement and would take longer to adapt to the new situation after head movement. In this case, the maladaptation effect would probably depend on the difference between the initial and the new situation. To preview the results in Experiment 1, an effect of maladaptation after head movement was only found in the most unrealistic noise scenario. Because more realistic acoustic scenes were used in Experiment 2, maladaptation was not expected to affect the algorithms’ performance here. To check this, the dependency of the performance on the rotational head speed was evaluated. As maladaptation is affected by dynamic changes, performance should be independent of the rotational head speed if there is no effect of maladaptation. However, misalignment was expected to affect the algorithm performance in Experiment 2, because the performance was analyzed during real head movements, so the head direction was not always the same. In this case, the algorithm performance would depend on the head direction.
In our previous paper (Hendrikse et al., 2019b), differences in movement behavior were found between younger and older normal-hearing listeners. The older normal-hearing listeners tended to do more of the movement with their heads and less with their eyes compared with younger normal-hearing listeners. If more of the movement is done with the head, fewer misalignment problems are expected for the directional algorithms. Therefore, in Experiment 2, the performance of the directional algorithms would be expected to be better for the movement data of the older listeners than for that of the younger listeners.
Simulation Procedure
The virtual acoustic environments and head movements were implemented in an acoustic model, giving the simulated hearing aid microphone recordings of a simulated moving listener (Figure 1). These recordings were then processed with a set of representative hearing aid algorithms, resulting in a simulated binaural hearing aid output. The performance of each algorithm was analyzed using a measure of short-time signal-to-noise ratio (SNR) calculated in a defined window. Details of the acoustic model, the hearing aid processing, the analysis method, and of the hearing aid algorithms are given in the following sections.
Figure 1.
Schematic Drawing of the Simulation Procedure. The input signals and head movement trajectories were implemented in an acoustic model, which provided the simulated hearing aid microphone recordings of a simulated listener making the defined head movements in the virtual acoustic environment. These recordings were processed with hearing aid algorithms, resulting in a binaural hearing aid output. Analysis windows were extracted after compensating for the delay of the system, and the input SNR and output SNR were calculated in those windows. The difference between these two is the algorithm benefit, quantified by the SNR improvement. Note that the HA algorithms always processed the superposition of target and noise. For SNR analysis, target and noise were separated after processing using the method of Hagerman and Olofsson (2004). HA = hearing aid; SNR = signal-to-noise ratio.
Acoustic Model
The virtual acoustic environments were implemented in TASCAR (Toolbox for Acoustic Scene Creation And Rendering; Grimm et al., 2019). This played back the input signals in the simulated acoustic environment and applied the head movement to a simulated listener at a sampling rate of 44.1 kHz. The simulated sound field was rendered to a circular (Experiment 1) or spherical (Experiment 2) array of simulated loudspeakers, matching the loudspeaker positions in a database of hearing aid head-related impulse responses. The loudspeaker signals were then convolved with the hearing aid head-related impulse responses. For this, two different impulse response databases were used in the experiments (see the Methods of each experiment for details), but both databases used the same head-and-torso simulator (Brüel & Kjaer Type 4128C with artificial ears: 4158C right and 4159C left, pre-amplifier 2669) with hearing aid dummies behind the ears. These hearing aid dummies had three microphones each (front, center, and rear), as described in Kayser et al. (2009). Because the acoustic model is linear, we could process the target and noise signals separately to allow calculation of the SNR.
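As an illustration of the final rendering step, the sketch below convolves simulated loudspeaker signals with hearing aid head-related impulse responses and sums them per microphone. It is a minimal sketch under assumed array shapes (the function name and shapes are not part of TASCAR or the databases); because this step is linear, target and noise can be rendered separately and summed afterwards, which is what allows the SNR calculation.

```python
import numpy as np
from scipy.signal import fftconvolve

def render_hearing_aid_mics(ls_signals, hrirs):
    """Convolve each simulated loudspeaker signal with the corresponding
    hearing aid head-related impulse response and sum over loudspeakers,
    giving one simulated recording per hearing aid microphone.
    ls_signals: (n_loudspeakers, n_samples); hrirs: (n_loudspeakers, n_mics, ir_len).
    Hypothetical shapes; the actual databases store the responses differently."""
    n_ls, n_samples = ls_signals.shape
    n_mics, ir_len = hrirs.shape[1], hrirs.shape[2]
    mic_signals = np.zeros((n_mics, n_samples + ir_len - 1))
    for ls in range(n_ls):
        for mic in range(n_mics):
            mic_signals[mic] += fftconvolve(ls_signals[ls], hrirs[ls, mic])
    return mic_signals
```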
Hearing Aid Processing and Performance Measures
The sum of the noise and target signals was processed by the different hearing aid algorithms to give the simulated binaural hearing aid output. Here, contrary to the linear acoustic model, a separate processing of target and noise signals was not possible because of the nonlinearity of the hearing aid processing. We took 200-ms windows of these outputs for the better ear, after compensating for the delay of the system (see next section), and then calculated the SNR using Hagerman and Olofsson’s (2004) method. This method provides access to the processed target and noise separately despite the nonlinearity of the algorithms: the superposition of target and noise is processed twice, once with normal and once with inverted phase for the noise, and the two processed signals are then added or subtracted. The difference between the SNR from the front hearing aid microphone recordings (“input” SNR) and the binaural output (“output” SNR) quantifies the “algorithm benefit” in terms of SNR improvement. By comparing the algorithm benefit for the different head movements, the effect of head movement on the noise suppression performance of the algorithms could be investigated. The method of Hagerman and Olofsson assumes short-time linearity of the algorithms; whether this assumption is valid for all algorithms was checked in Supplementary Materials A using Olofsson and Hansen’s (2006) method. The results in the Supplementary Materials show that the assumption is valid for all algorithms except the adaptive minimum variance distortionless response beamformer (“AMVDRb,” see next section): for the other algorithms, the level of the estimated nonlinear distortion was more than 20 dB below the level of the processed superposition of target and noise signals. For the AMVDRb algorithm, different adaptation time settings were tested (details in Experiment 1); the fast and intermediate settings showed high levels of nonlinear distortion, whereas for the slowest setting the assumption was valid.
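The phase-inversion principle can be sketched as follows (a minimal illustration in Python, not the actual analysis code; `process` stands in for any of the openMHA algorithms and the window parameters are placeholders):

```python
import numpy as np

def hagerman_olofsson_snr(target, noise, process, win_start, win_len):
    """Estimate the output SNR of a (possibly nonlinear) algorithm by
    processing the target+noise mixture twice, once with the noise phase
    inverted, and adding/subtracting the two outputs (Hagerman & Olofsson, 2004)."""
    out_plus = process(target + noise)    # pass 1: normal noise phase
    out_minus = process(target - noise)   # pass 2: inverted noise phase

    # Adding the two outputs isolates the processed target, subtracting them
    # isolates the processed noise (exact only under short-time linearity).
    target_out = 0.5 * (out_plus + out_minus)
    noise_out = 0.5 * (out_plus - out_minus)

    # Short-time output SNR in the analysis window (e.g., 200 ms)
    t = target_out[win_start:win_start + win_len]
    n = noise_out[win_start:win_start + win_len]
    return 10.0 * np.log10(np.sum(t ** 2) / np.sum(n ** 2))
```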
Hearing Aid Algorithms
Six hearing aid algorithms from different classes, representative of algorithms currently used in hearing aids (Hamacher et al., 2005), were selected for analysis: a delay-and-subtract beamformer (labelled “D&Sb”), an adaptive differential microphone (“ADMb”), a static binaural beamformer (“SBb”), an adaptive minimum variance distortionless response beamformer (“AMVDRb”), a binaural noise reduction algorithm (“BNR”), and a single-channel noise reduction algorithm (“SCNR”). Descriptions of the algorithms follow in the next paragraphs.
The hearing aid microphone recordings were processed with the hearing aid algorithms using the open Master Hearing Aid (openMHA version 4.8.0, http://www.openmha.org/; Herzke et al., 2017) with the default settings. Some additional settings were tested for the ADMb and AMVDRb, as explained in Experiment 1. The implementations of all algorithms have been used previously (Baumgärtel, Hu, et al., 2015; Baumgärtel, Krawczyk-Becker, et al., 2015; Grimm, Kollmeier, et al., 2016).
Processing signals with the openMHA induces an algorithm-dependent processing delay. The processing delays for the algorithms are reported in Table 1, as calculated by cross-correlating the input and output for the target signals from Experiment 1. This delay was compensated for in order to align the hearing aid microphone recordings with the binaural hearing aid output. Table 1 also lists the most important properties of the algorithms regarding their adaptivity, directionality, and the number of microphones used monaurally or binaurally. The adaptivity and directionality properties can be verified in the polar plots of the SNR benefit, target gain, and noise gain provided in Supplementary Materials C.
Table 1.
Summary Table of the Most Important Properties of the Hearing Aid Algorithms.
| Algorithm | Processing delay | Adaptivity | Directionality | Microphones (monaural/binaural) |
|---|---|---|---|---|
| D&Sb | 173 samples (3.9 ms) | Fixed | Beamformer, cardioid | 2 (monaural) |
| ADMb (all settings) | 17 samples (0.39 ms) | Adaptive | Beamformer, cardioid | 2 (monaural) |
| SBb | 2,822 samples (64 ms) | Fixed | Beamformer, narrow | 6 (binaural) |
| AMVDRb (all settings) | 88 samples (2.0 ms) | Adaptive | Beamformer, narrow | 4 (binaural) |
| BNR | 176 samples (4.0 ms) | Adaptive | Omni | 2 (binaural) |
| SCNR | 705 samples (16 ms) | Adaptive | Omni | 2 (binaural) |
Note. The table displays the processing delay of the hearing aid algorithms in the openMHA, their adaptivity properties, directionality properties, the number of microphones used by the algorithms, and whether they are monaural or binaural (microphones used on one or two sides). ADMb = adaptive differential microphone; AMVDRb = adaptive minimum variance distortionless response beamformer; D&Sb = delay-and-subtract beamformer; SBb = static binaural beamformer; BNR = binaural noise reduction; SCNR = single-channel noise reduction.
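The delay estimation underlying this compensation can be illustrated with a simple cross-correlation, as sketched below (assumed signal names; the values in Table 1 were obtained from the target signals of Experiment 1):

```python
import numpy as np
from scipy.signal import correlate

FS = 44100  # sampling rate of the simulations

def estimate_processing_delay(reference, processed):
    """Estimate the algorithm-dependent processing delay (in samples) as the
    lag of the peak of the cross-correlation between input and output."""
    xcorr = correlate(processed, reference, mode="full")
    lag = int(np.argmax(np.abs(xcorr))) - (len(reference) - 1)
    return max(lag, 0)

# Example: a lag of 173 samples corresponds to 173 / FS ≈ 3.9 ms (D&Sb row in Table 1).
```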
Delay-and-Subtract Beamformer
The D&Sb is a fixed (nonadaptive) monaural beamformer, which was applied to the left and right hearing aid microphone recordings separately. On each side, it consisted of a first-order differential microphone using the front and rear omnidirectional microphones that were separated by a distance of 14.9 mm. The rear microphone signal was delayed by the time it takes sound to travel over the microphone separation distance (4.3824e–5 s) and subtracted from the front signal to achieve a cardioid pattern. The delay was achieved by applying a corresponding linear phase in the frequency domain using an overlap-add method as described in Grimm et al. (2006). The resulting high-pass characteristics were not equalized.
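A minimal sketch of this processing is given below. It applies the rear-microphone delay as a linear phase over a whole-signal FFT rather than the overlap-add scheme of Grimm et al. (2006), and the function and constant names are illustrative only:

```python
import numpy as np

FS = 44100.0           # sampling rate of the simulations
MIC_DISTANCE = 0.0149  # front-rear microphone spacing in metres
C = 340.0              # assumed speed of sound, giving a delay of ~4.38e-5 s

def delay_and_subtract(front, rear, fs=FS):
    """First-order differential (cardioid) microphone: delay the rear signal
    by the acoustic travel time over the microphone spacing and subtract it
    from the front signal. The resulting high-pass characteristic is not
    equalized, as in the algorithm described above."""
    delay = MIC_DISTANCE / C
    n = len(front)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    rear_delayed = np.fft.irfft(np.fft.rfft(rear) *
                                np.exp(-2j * np.pi * freqs * delay), n)
    return front - rear_delayed
```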
Adaptive Differential Microphone
The ADMb is a monaural algorithm based on two D&Sbs using a single pair of omnidirectional microphones as described earlier. These two beamformers generated a front-facing and a back-facing microphone signal (Elko & Pong, 1995). A mixing weight was adapted to steer a spatial zero toward the most prominent sound source in the rear hemisphere, as the target was assumed to be in the front and the distractor in the back hemisphere. The optimum value of the mixing weight was found by minimizing the mean-square amplitude value of the output. The algorithm was applied to the left and right hearing aid microphone recordings separately.
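The adaptation of the mixing weight can be sketched as a simple sample-by-sample gradient descent on the output power (a simplified broadband sketch, not the openMHA implementation; the step-size handling and normalization of the real algorithm differ):

```python
import numpy as np

def adaptive_differential_mic(c_front, c_back, mu=1e-4):
    """Combine a front-facing and a back-facing cardioid (each produced by a
    delay-and-subtract beamformer as above) with an adaptive mixing weight
    beta. Minimizing the mean-square output steers a spatial null toward the
    strongest source in the rear hemisphere. mu is the adaptation step size;
    smaller values give slower adaptation (cf. the settings in Experiment 1)."""
    beta = 0.0
    out = np.zeros_like(c_front)
    for i in range(len(c_front)):
        y = c_front[i] - beta * c_back[i]
        out[i] = y
        beta += mu * y * c_back[i]               # gradient step on E[y^2]
        beta = float(np.clip(beta, 0.0, 1.0))    # keep the null in the rear hemisphere
    return out
```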
Static Binaural Beamformer
The binaural six-microphone beamformer (SBb) algorithm (Rohdenburg et al., 2007) aims to minimize the overall noise output power while preserving the desired speech component in the frontal hearing aid microphone channels. It therefore assumed that the target was coming from the front. The beamformer is a fixed minimum variance distortionless response (MVDR) beamformer without a generalized sidelobe canceller. Binaural cues of both target and noise signals were preserved by applying a real-valued time-variant postfilter, controlled by the beamformer output, to the front microphone signals instead of returning the beamformer output directly. A wave propagation model with sampled head-related impulse responses for the frontal direction from the database described in Kayser et al. (2009) was used as the propagation vector.
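The MVDR core shared by the SBb and the adaptive variant described next can be written per frequency bin as the standard closed-form solution (a minimal sketch, not the openMHA code; matrix and vector names are assumptions):

```python
import numpy as np

def mvdr_weights(noise_cov, d):
    """Per-frequency MVDR weights: minimize the output noise power subject to
    a distortionless response in the look direction. noise_cov is the M x M
    noise covariance matrix and d the propagation (steering) vector, here taken
    from the frontal hearing aid impulse responses of Kayser et al. (2009)."""
    r_inv_d = np.linalg.solve(noise_cov, d)       # R^{-1} d
    return r_inv_d / (np.conj(d) @ r_inv_d)       # w = R^{-1} d / (d^H R^{-1} d)

# Beamformer output for one bin: y = np.conj(w) @ x, with x the stacked microphone
# spectra. The SBb derives a real-valued postfilter from y and applies it to the
# front microphones so that binaural cues are preserved.
```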
Adaptive MVDR Beamformer
This is the adaptive version (AMVDRb) of the binaural multimicrophone beamformer algorithm described earlier (Rohdenburg et al., 2007). It used the generalized sidelobe canceller, which could adapt to a varying noise field, and used only four microphones instead of six (front and rear on each side). In our version, the beamformer output was returned directly, resulting in a diotic output signal.
Binaural Noise Reduction
The BNR scheme filtered out signals based on their interaural coherence (Grimm et al., 2009; Luts et al., 2010). The algorithm used the left and right front hearing aid microphones. Target sources were assumed to have a high interaural coherence because their position changes only slowly and their direct signal path dominates the reverberant part (at least within the critical distance; Grimm et al., 2009). Distractor sources were assumed to be incoherent because they are either beyond the critical distance or consist of many similar uncorrelated sources distributed around the listener. This algorithm was thus designed to filter out diffuse noise. The interaural coherence was estimated from fluctuations in the interaural phase difference, in third-octave bands with a time constant of 40 ms. From the measure of coherence, the attenuation for the corresponding frequency band and time frame was calculated so that incoherent signals were filtered out.
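The coherence estimate and the resulting attenuation can be sketched per third-octave band as below. The direct use of the coherence as a gain is a simplification of the actual openMHA gain rule, and the complex band signals are assumed inputs:

```python
import numpy as np

def coherence_gain(left_band, right_band, frame_rate, tau=0.040):
    """Estimate the interaural coherence in one third-octave band from the
    stability of the interaural phase difference (IPD), smoothed with a 40-ms
    time constant, and use it as an attenuation gain so that incoherent
    (diffuse) signal parts are suppressed. left_band/right_band: complex band
    signals per frame; frame_rate: frames per second."""
    alpha = np.exp(-1.0 / (tau * frame_rate))   # recursive smoothing coefficient
    phasor_avg = 0.0 + 0.0j
    gains = np.zeros(len(left_band))
    for i, (l, r) in enumerate(zip(left_band, right_band)):
        ipd_phasor = np.exp(1j * (np.angle(l) - np.angle(r)))
        phasor_avg = alpha * phasor_avg + (1.0 - alpha) * ipd_phasor
        gains[i] = np.abs(phasor_avg)           # 1: stable IPD (coherent), 0: fluctuating
    return gains
```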
Single-Channel Noise Reduction
The SCNR algorithm (Breithaupt et al., 2008) used the left and right front hearing aid microphones. The algorithm aimed to improve the SNR by identifying speech components in the signal and filtering out nonspeech components. This was done by applying temporal smoothing in the cepstral domain. In the cepstral domain, the noisy speech signal was decomposed into coefficients related to the speech envelope, the excitation, and noise. Cepstral coefficients related to the speech excitation were identified using fundamental frequency estimation. Strong temporal adaptive recursive smoothing was applied to the cepstral coefficients that are dominated by noise and little smoothing to the cepstral coefficients representing speech.
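The core idea of selective temporal smoothing in the cepstral domain can be sketched as follows (an illustration in the spirit of Breithaupt et al., 2008, not their implementation; the smoothing constants and the number of envelope coefficients are placeholder values):

```python
import numpy as np

def cepstral_smoothing(log_spec_frames, pitch_bins, env_bins=20, alpha_noise=0.97):
    """Smooth a sequence of log spectra in the cepstral domain: low quefrencies
    (speech envelope) and the bin at the fundamental period (speech excitation)
    are smoothed only lightly, while the remaining, noise-dominated coefficients
    are strongly recursively smoothed over time."""
    n_fft = 2 * (len(log_spec_frames[0]) - 1)
    smoothed = np.zeros(n_fft)
    out = []
    for log_spec, pitch_bin in zip(log_spec_frames, pitch_bins):
        ceps = np.fft.irfft(log_spec, n_fft)      # real cepstrum of this frame
        alpha = np.full(n_fft, alpha_noise)       # strong smoothing by default
        alpha[:env_bins] = 0.2                    # speech-envelope quefrencies
        if pitch_bin is not None:
            alpha[pitch_bin] = 0.2                # speech-excitation quefrency
        smoothed = alpha * smoothed + (1.0 - alpha) * ceps
        out.append(np.fft.rfft(smoothed).real)    # back to the log-spectral domain
    return out
```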
Experiment 1
Method
Three virtual acoustic scenarios of increasing acoustic complexity were implemented. All the scenarios were anechoic and included a target source at 0° and 1 m distance and, in addition, some noise as follows:
1. one speech-shaped noise source at 120°, 1 m distant
2. five speech-shaped noise sources (−120°, −60°, 60°, 120°, and 180°), 1 m distant
3. five speakers of the opposite sex to the target speaker (−120°, −60°, 60°, 120°, and 180°), 1 m distant
In all three scenarios, a range of synthetic rotational head movements was implemented to simulate the orienting movement of a listener toward a target speaker. The head movements all ended at specific times such that the listener faced the target sound source. The analysis windows were positioned immediately after the end of the movements, after compensating for the delays caused by the acoustic model (130 samples), HRIR convolution (400 samples), and hearing aid processing (see second column of Table 1). Because the analysis windows were positioned right after the end of the movement, any potential effect of the head movement on the algorithm benefit in these analysis windows can only be caused by maladaptation. Details of the head movements, the stimuli, the database of hearing aid head-related impulse responses, and the analysis are given in the following sections.
Head Movement Traces
The simulated head movement traces were synthesized such that the azimuth followed a cosine ramp from the starting angle to the target angle (0°). Although a listener’s head movements might be better characterized by a sigmoidal curve (Brimijoin et al., 2014), a cosine was chosen here, because a precise time was needed for when the movement ended. The head movement traces started between −180° and 180°, in steps of 30° away from the target and ended at 0°, facing the target. In all cases, the head stayed oriented toward the initial direction for 5 s before the movement started, to give the hearing aid algorithms time to adapt. Peak velocities of 50 deg/s, 100 deg/s and 150 deg/s were implemented, as ±50 deg/s has been reported as the average rotational head velocity for small movements (±30°) and ±150 deg/s is the average rotational head velocity for larger movements (±105°; Brimijoin et al., 2010). In total, 37 traces were created of which one trace was the control trace without movement (i.e., starting facing the target; see Figure 2).
Figure 2.
Simulated Cosine Head Movement Traces That Were Implemented. The parameters starting angle [−180°:30°:180°] and peak velocity [50 deg/s, 100 deg/s, 150 deg/s] were varied to create a total of 37 traces, including a control trace without movement, as indicated by different colors. A 200-ms analysis window, in which the algorithm performance related to noise suppression is analyzed, is located immediately after the end of the movement.
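The trace synthesis can be sketched as below: the ramp duration is chosen so that the peak angular velocity of the cosine equals the desired value, and a 5-s hold at the starting angle precedes the ramp (a sketch of the trace generation only; the sampling rate of the trajectory is an assumption):

```python
import numpy as np

def cosine_head_trace(start_deg, peak_vel_deg_s, fs=100, hold_s=5.0):
    """Head azimuth over time: hold the starting angle for hold_s seconds
    (letting the algorithms adapt), then follow a cosine ramp down to 0 deg
    (facing the target). For a ramp of duration T, the peak angular velocity
    is |start| * pi / (2 T), so T is chosen as |start| * pi / (2 * v_peak)."""
    hold = np.full(int(hold_s * fs), float(start_deg))
    if start_deg == 0:
        return hold                              # static control trace
    dur = abs(start_deg) * np.pi / (2.0 * peak_vel_deg_s)
    t = np.arange(int(dur * fs)) / fs
    ramp = 0.5 * start_deg * (1.0 + np.cos(np.pi * t / dur))
    return np.concatenate([hold, ramp, [0.0]])
```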
Stimuli
The target signals were 20 fragments from the DAPS database (Mysore, 2015), using 10 male and 10 female speakers. The fragments were chosen so that the signals in the analysis windows were between +1 dB and +5 dB above the root-mean-squared value for the whole signal without pauses. This was to make sure that the signal in the analysis windows contained speech, not just speech pauses. Noise sources were either unmodulated speech-shaped noise or distractor speakers. Speech-shaped noise signals were created using the long-term average speech spectrum of the target signals. Multiple signals were created so that uncorrelated speech-shaped noise sources could be presented simultaneously. The distractor speakers were different fragments from the DAPS database, always of the opposite sex to the target speaker. The noise signal was started 200 ms before the target signal, allowing for better algorithm adaptation. All scenarios were measured at −5 dB and +5 dB long-term input SNR because the SNR can also influence the algorithm benefit.
Database of Hearing Aid Head-Related Impulse Responses
Hearing aid head-related impulse responses were used that matched the loudspeaker positions for a circular array of 90 loudspeakers (starting at 0° azimuth with 4° spacing and at 0° elevation). They were taken from the database of Thiemann et al. (2015).
Algorithm Settings
To study the effect of the adaptation time on maladaptation, different settings were tested for the two adaptive beamformers (ADMb and AMVDRb). For the ADMb, the parameter µ, which determines the adaptation step size, was set to 1e−4 (standard), 1e−5, and 1e−6. A smaller value of µ results in slower adaptation; the approximate adaptation times were 0.23, 2.3, and 23 s, respectively. The values were chosen so that the approximate adaptation times fell into three categories: faster than the movement duration, of the same order of magnitude as the movement duration, and slower than the movement duration.
For the AMVDRb, two parameters affect the adaptation time: µ has a direct effect on the adaptation time, where larger values result in a faster adaptation, and α influences the variation of the adaptation across time. In the experiment, only µ was changed (0.4, 0.04, and 0.0004) and α was kept constant at the default value of 0.5. For this more complex algorithm, calculation of the approximate adaptation time for the different settings is not straightforward, but it can be estimated from the effect duration (Figure 9).
Figure 9.
As Figure 8, but for AMVDRb in the Selected Situations and for the Different Settings: Fast Setting (1), Intermediate Setting (2), and Slow Setting (3). AMVDRb = adaptive minimum variance distortionless response beamformer; SNR = signal-to-noise ratio.
Analysis
First, the SNR improvement was calculated for the static control condition. From this, it could be seen how well the algorithms worked in each scenario. The movement effect was quantified by calculating the difference in output SNR (termed “ΔSNR”) between the static control condition and the movement conditions. A negative ΔSNR indicated that any improvement in SNR was reduced by the head movement. Situations with a large negative ΔSNR were selected for further investigation by plotting ΔSNR for different peak velocities and start angles. Finally, to analyze how long it took for the algorithms to recover from maladaptation, new recordings were made in which further time was added between the end of the movement and the start of the analysis window while keeping the same signal in the analysis window.
Results
Algorithm Performance: Static Improvement and Movement Effect
First of all, we looked at the static SNR improvement, because if there is no SNR improvement in the first place, there cannot be a reduction due to movement. The values of the static SNR improvement and ΔSNR are plotted in Figures 3 to 5, grouped by algorithm. It can be seen that not all algorithms provided a benefit in all situations because the assumptions of the algorithms are not always met. Furthermore, ΔSNR is zero or close to zero for the D&Sb, SBb, BNR, and SCNR algorithms (Figures 3 and 5) so there is no effect of head movement for these algorithms. However, for the ADMb and AMVDRb algorithms, there is an effect of head movement (Figure 4). The median ΔSNR is small compared with the static SNR improvement, but the range of ΔSNR is large for the AMVDRb algorithm. So, for this algorithm, there are some situations where the movement has a large impact. Whether this was significant, and in which situations, was examined by looking at the confidence intervals.
Figure 3.
Performance of D&Sb and SBb Algorithms. The SNR improvement in the static condition is plotted in black for all scenarios. The ΔSNR is plotted in red for all scenarios. Each scenario was tested with a long-term input SNR of −5 dB and +5 dB, as indicated by the background gray shading. The boxplots display the range (narrow line), 25th and 75th percentiles (thick line), and the median (x). The ΔSNR is zero, which indicates that there was no effect of head movement for these nonadaptive algorithms. D&Sb = delay-and-subtract beamformer; SBb = static binaural beamformer; SNR = signal-to-noise ratio.
Figure 4.
As per Figure 3, but for the ADMb and AMVDRb Algorithms for the Different Settings: Fast Setting (1), Intermediate Setting (2), and Slow Setting (3). ADMb = adaptive differential microphone; AMVDRb = adaptive minimum variance distortionless response beamformer; SNR = signal-to-noise ratio.
Figure 5.
As per Figure 3, but for the BNR and SCNR Algorithms. BNR = binaural noise reduction; SCNR = single-channel noise reduction; SNR = signal-to-noise ratio.
The combinations of algorithm, scenario, and long-term SNR whose ΔSNR confidence intervals did not include zero are listed in Table 2; thus, in these situations, there was a significant ΔSNR. This could be either positive or negative, but only the ADMb and AMVDRb algorithms showed a significant negative effect of head movement, and only in the anechoic scenario with one noise source was the mean ΔSNR larger than 1 dB.
Table 2.
Combinations of Algorithm, Scenario, and Long-Term Input SNR That Have a Significant Movement Effect (95% Confidence Interval for the Mean ΔSNR Does Not Include 0).
| Algorithm | Scenario | Long-term SNR (dB) | Static benefit, mean ± standard deviation (dB) | ΔSNR, mean ± standard deviation (dB) | 95% confidence interval for ΔSNR |
|---|---|---|---|---|---|
| ADMb intermediate setting | 1 | −5 | 15.5 ± 2.0 | −1.1 ± 0.3 | [−1.6, −0.5] |
| | | +5 | 12.0 ± 1.7 | 0.4 ± 0.1 | [0.3, 0.6] |
| ADMb slow setting | 1 | −5 | 17.9 ± 1.5 | −2.9 ± 0.1 | [−3.0, −2.7] |
| | | +5 | 12.8 ± 1.7 | 2.1 ± 0.2 | [1.8, 2.4] |
| | 2 | −5 | 7.9 ± 1.6 | 0.2 ± 0.0 | [0.2, 0.2] |
| | | +5 | 7.8 ± 1.7 | 0.3 ± 0.0 | [0.3, 0.3] |
| | 3 | −5 | 6.8 ± 1.8 | 0.6 ± 0.0 | [0.5, 0.7] |
| | | +5 | 6.3 ± 1.8 | 1.0 ± 0.0 | [1.0, 1.1] |
| AMVDRb fast setting | 1 | −5 | 29.8 ± 5.2 | −8.4 ± 0.8 | [−10.0, −6.8] |
| | | +5 | 15.3 ± 3.1 | −5.4 ± 0.5 | [−6.5, −4.3] |
| | 2 | −5 | 7.0 ± 1.2 | 0.5 ± 0.1 | [0.4, 0.7] |
| | | +5 | 0.7 ± 2.2 | 4.4 ± 0.4 | [3.6, 5.2] |
| | 3 | −5 | 6.8 ± 3.7 | 2.3 ± 0.4 | [1.5, 3.0] |
| | | +5 | −4.0 ± 3.9 | 8.6 ± 0.6 | [7.2, 9.9] |
| AMVDRb intermediate setting | 1 | −5 | 35.1 ± 3.5 | −26.0 ± 0.7 | [−27.4, −24.7] |
| | | +5 | 23.4 ± 2.6 | −14.0 ± 0.6 | [−15.3, −12.8] |
| | 2 | +5 | 6.7 ± 1.0 | 0.9 ± 0.2 | [0.5, 1.2] |
| | 3 | −5 | 9.5 ± 1.5 | −0.4 ± 0.1 | [−0.7, −0.1] |
| | | +5 | 4.7 ± 1.8 | 3.2 ± 0.4 | [2.5, 4.0] |
| AMVDRb slow setting | 1 | −5 | 21.4 ± 0.8 | −10.4 ± 0.1 | [−10.6, −10.2] |
| | | +5 | 20.2 ± 0.9 | −8.2 ± 0.2 | [−8.6, −7.9] |
| | 2 | −5 | 8.1 ± 0.8 | −0.1 ± 0.0 | [−0.2, −0.1] |
| | 3 | −5 | 9.6 ± 0.8 | −0.3 ± 0.0 | [−0.3, −0.3] |
| SCNR | 2 | −5 | 8.5 ± 3.3 | 0.2 ± 0.1 | [0.0, 0.3] |
Note. The table displays the means, standard errors, and confidence intervals for the ΔSNR. Values are displayed in boldface if the static algorithm benefit is negative and the movement effect thus results in a decreased benefit. ADMb = adaptive differential microphone; AMVDRb = adaptive minimum variance distortionless response beamformer; SNR = signal-to-noise ratio; SCNR = single-channel noise reduction.
Effect of Head Movement Parameters
For the ADMb and AMVDRb algorithms, a significant negative effect of head movement was found, and this effect was largest in the anechoic scenario with one noise source at −5 dB long-term SNR. In this scenario, the influence of the movement parameters (start angle and peak velocity) on ΔSNR was further investigated. The influence of the movement parameters is shown in Figure 6 for ADMb and in Figure 7 for AMVDRb. For ADMb with the intermediate setting, both the start angle and the peak velocity had an influence on ΔSNR, whereas for the slowest setting only the start angle had an influence. For the AMVDRb algorithm, the peak velocity had a larger influence for the intermediate setting, and the start angle had a larger influence for the slowest setting, similar to the ADMb algorithm. The fastest setting also gave a significant ΔSNR, which was influenced mainly by the peak velocity.
Figure 6.
Influence of Movement Start Angle and Peak Velocity on the ΔSNR for the ADMb in Selected Situations. The ΔSNR is plotted for different start angles of the movement (x-axis) and peak velocities (line color). The movement traces indicated with the black circle are selected for the analysis of the effect duration in the next section. ADMb = adaptive differential microphone; SNR = signal-to-noise ratio.
Figure 7.
As per Figure 6, but for AMVDRb. AMVDRb = adaptive minimum variance distortionless response beamformer; SNR = signal-to-noise ratio.
Effect Duration
To study how long the algorithms took to recover from movements, further recordings were made in which extra time was added between the end of the movement and the start of the analysis window. This was done for the individual movement traces that were shown in Figures 6 and 7: movement starting at −60° and +60° with 150 deg/s for the ADMb and movement starting at +120° and +60° with 150 deg/s for the AMVDRb. The ΔSNR in the shifted analysis windows was plotted against the time shift on a logarithmic axis (Figures 8 and 9). We found that the intermediate and slowest settings, having longer adaptation times, took longer to recover; for the slowest setting, it could take 2 or more seconds to recover. The different settings for the ADMb and AMVDRb algorithms show similar recovery times. It is noticeable that the AMVDRb algorithm did not fully recover in the anechoic scenario with one noise source for the movement trace starting at +120° (facing the noise source) with a peak velocity of 150 deg/s. This means that the AMVDRb algorithm potentially has problems in real life when the attention of the hearing aid user is shifted to a different source, but this needs to be checked in a more realistic scenario.
Figure 8.
Movement Effect Duration for the ADMb Algorithm in the Selected Situations and for the Different Settings: Fast Setting (1), Intermediate Setting (2), and Slow Setting (3). Plotted are the median ΔSNR (circles), the 25th and 75th percentiles (thick lines), and the range (thin vertical lines) over all target signals for increasing times between the end of the movement and the start of the analysis window (x-axis, logarithmic) and different algorithm settings (line color). ADMb = adaptive differential microphone; SNR = signal-to-noise ratio.
Discussion
It was important to check the static SNR improvement of the algorithms first so that the impact of ΔSNR could be assessed. The results in Figures 3 to 5 show that not all algorithms provided a static SNR improvement in all situations because their assumptions are not always met. The BNR algorithm provided a small or negative SNR improvement in the tested scenarios because none of the scenarios is really diffuse. Where there is no improvement to begin with, there is no room for a reduction in SNR improvement due to movement. In future tests, it would be good to include a more diffuse scenario so that all algorithms have at least one scenario in which they provide a large SNR improvement.
D&Sb and SBb are nonadaptive algorithms and therefore could not be affected by maladaptation. However, they were included in this experiment to check the method. No effect of head movement was found for D&Sb and SBb, which indicates that the recordings, delay compensation, and analysis were done correctly. The BNR and SCNR algorithms were also not affected by head movement, even though they are adaptive. It is likely that the adaptation of these algorithms was fast enough to not cause problems. A significant reduction in SNR improvement larger than 1 dB because of head movement was found for the ADMb and AMVDRb algorithms but only in the anechoic scenario with one noise source. In the other two scenarios with more noise sources, the head movement had a very small (<1 dB), but significant, negative effect on the SNR improvement of ADMb and AMVDRb, or even a significant positive effect. A positive effect of head movement indicates that the algorithms were not performing optimally in the first place. A possible explanation is that the different initial condition causes the algorithms to adapt to a different local minimum for the noise output power.
It was hypothesized that the movement parameters would affect the algorithms differently depending on the adaptation time. To check this, the settings of the ADMb and AMVDRb algorithms were chosen so that the approximate adaptation times were faster, in the range of, or slower than the movement duration. We found that for both algorithms, the peak velocity, which affects the movement duration, had a larger influence than the start angle for the intermediate setting, whereas the start angle, which determines the difference between the initial and final situation, had a larger influence for the slowest setting. This is in accordance with the expectations. Unexpectedly, however, the movement effect for the fastest setting of the AMVDRb algorithm was also large. This could have something to do with the high levels of nonlinear distortion that were found for this algorithm setting and the intermediate setting, which might make the SNR calculated from the output of Hagerman and Olofsson’s method inaccurate. The output for the slowest setting, however, was reliable, so we can be sure that there really was an effect of head movement for the AMVDRb algorithm.
Overall, there is thus a significant effect of head movement on the SNR improvement in some situations. The question remains what the impact of this effect is. An SNR improvement significantly reduced by more than 1 dB was found only in the most unrealistic anechoic scenario with one noise source. Both ADMb and AMVDRb algorithms provide a very large static SNR improvement in this scenario, and even with the decreased performance due to the head movement, the SNR improvement was still large. Although the analysis of the effect duration showed that it could take the algorithms several seconds to recover, the large SNR improvement makes it unlikely that the head movement will decrease the speech intelligibility. Furthermore, we know that the head movement effect in Experiment 1 is only caused by maladaptation because the final orientation of our synthetic head movements is always the same. As no negative effect of head movement could be found in the more realistic scenarios with multiple noise sources, it is unlikely that maladaptation plays a role in daily life. However, in Experiment 1, maladaptation was analyzed only after the end of the movement. It might be that it results in a larger effect during movement, when the peak velocity is reached. Experiment 2 therefore investigated the SNR improvement during recorded natural head movements in virtual acoustic environments resembling everyday life. In this case, the head orientation is not always the same, and misalignment is an additional factor that can potentially reduce the SNR improvement of directional algorithms. Experiment 2 thus checks the hypothesis that in more realistic environments, there is no effect of maladaptation on the SNR improvement of adaptive algorithms, but that there is a reduction in SNR improvement of directional algorithms because of misalignment.
Experiment 2
Method
Recordings were made of a simulated hearing aid user performing the head movements measured in 21 younger normal-hearing and 19 older normal-hearing listeners in a set of VEs defined in a previous study (Hendrikse et al., 2019b). Each algorithm’s performance at noise suppression was again quantified by calculating the improvement in SNR (difference between input and output SNR), but here it was calculated in 200-ms time windows during the entire time span of the VEs, without overlap. For this, the segmental SNR after Quackenbush et al. (1988) was used. The effect of head movement on the algorithm benefit was analyzed in terms of maladaptation and misalignment by comparing the different head movement traces. The effect of head movement can be analyzed by looking at the variance in algorithm benefit over the movement traces, here quantified by the range in SNR improvement. The settings used for the algorithms were the standard (fastest) setting for the ADMb and the slowest setting for the AMVDRb. The latter was not the standard setting because the standard (fastest) setting showed high levels of nonlinear distortion (see Supplementary Materials A). The following paragraphs describe in more detail the VEs, the database of hearing aid head-related impulse responses and the analyses that were carried out.
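The windowed SNR measure can be sketched as follows (a minimal illustration; the clamping and frame-exclusion rules of the original segmental SNR definition are omitted, and the separated target and noise signals are again assumed to come from the phase-inversion method described for Experiment 1):

```python
import numpy as np

def segmental_snr(target_out, noise_out, fs=44100, win_s=0.2):
    """SNR in dB in consecutive non-overlapping 200-ms windows. The per-window
    SNR improvement is this value computed on the algorithm output minus the
    corresponding value computed on the front-microphone input."""
    win = int(win_s * fs)
    n_win = min(len(target_out), len(noise_out)) // win
    snr = np.empty(n_win)
    for k in range(n_win):
        t = target_out[k * win:(k + 1) * win]
        n = noise_out[k * win:(k + 1) * win]
        snr[k] = 10.0 * np.log10(np.sum(t ** 2) / np.sum(n ** 2))
    return snr
```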
Virtual Environments
For this experiment, we used head movement data and VEs from Hendrikse et al. (2019b). The six VEs used in this experiment were as follows: a cafeteria (including both a single-task condition and a dual-task condition with an additional hand-eye-coordination task), a lecture hall, a living room, a street, and a train station. In all VEs, the participants had to listen to a specified target while their head movements were measured. Brief descriptions of the environments are given in the following paragraphs; more details and a technical analysis can be found in Hendrikse et al. (2019b). The range of SNR in the VEs is plotted in Figure 10 (left panel).
Figure 10.
Input SNR Mean Across Participants (Left) and Range Across Participants (Right). The left panel shows the dynamics of the input SNR over time. The right panel shows the difference in input SNR due to head movement. SNR = signal-to-noise ratio.
Cafeteria: the listener was sitting at the edge of a table at which four persons (azimuths of −28°, −4°, 8°, and 34°; negative azimuths are to the right) were having a conversation. The noise consisted of several point noise sources, such as competing conversations at neighboring tables, music, and laughter. Diffuse noise, consisting of babble noise and noise of plates and cutlery, was also present. In the cafeteria-listening-only condition, the only task was to listen to the four-person conversation at the table. In the cafeteria-dual-task condition, the task was to listen to the conversation and at the same time put pins in the holes on a Purdue Pegboard (Tiffin & Asher, 1948), to simulate eating and listening at the same time. In this situation, performing the dual task significantly influenced the head movement behavior (Hendrikse et al., 2019b).
Lecture hall: the listener was sitting in the audience while a lecture was given. The lecturer’s speech (target) was presented directly (−15°) and through loudspeakers (51° and −38°). Presentation slides were shown on a screen (25°). The noise level was low and consisted of occasional noises from the audience, such as coughing, sneezing, sighing and noise of pencil writing, and turning pages.
Living room: the listener was sitting on a sofa while the news was playing on a TV (target, at an azimuth of −4°). A person sitting on a chair to the left (45°) was occasionally commenting on the news. A person sitting on the sofa next to the listener was eating crisps (−90°). On the left side of the listener was an active fireplace. Through the open door to the kitchen, noises of the dishwasher, water cooker, and fridge could be faintly heard.
Street: the listener was standing at a bus stop, where four people (−17°, 4°, 23°, and 42°) were having a conversation (target). On the sidewalk, a cyclist rode by and a mother with a pram walked by while singing lullabies to her baby. On the road, cars were driving past, as well as a truck, a bus, and an emergency vehicle. Diffuse noise consisted of distant traffic noise and birds singing. There was also a train driving past in the distance.
Train station: the listener was standing on the platform. Occasionally, announcements were made over the loudspeaker system about trains arriving/departing (target). The noise consisted of a conversation on a neighboring platform (98°), people walking past with trolleys, beeps of ticket validation machines, train engines, and brakes. Diffuse noise was a recording from a real train station.
Database of Hearing Aid Head-Related Impulse Responses
For this experiment, we wanted to make sure that our simulated signals represented the signals that were played to the participants while recording their head movement. Therefore, hearing aid head-related impulse responses were recorded in the setup described in Hendrikse et al. (2019b). This setup consisted of 28 loudspeakers that were divided between a 16-loudspeaker horizontal ring array at ear level (first loudspeaker at 11.25° from the frontal direction, with 22.5° spacing) and two 6-loudspeaker ring arrays at +45° and −45° elevation (first loudspeaker at 0° and 30° azimuth from the frontal direction, respectively, with 60° spacing). Four subwoofers were positioned on the floor at 45°, 135°, −135°, and −45°. The room in which this setup was placed was sound-treated and had a reverberation time (T60) of 0.13 s and an early decay time of 0.04 s. The impulse responses were recorded from the position of the target source in the virtual environments at a head-and-torso simulator with multichannel hearing aid dummies behind the ears, as described earlier. The head-and-torso simulator was standing at the position of the participant in the laboratory. In the environments in which the participants were seated, they were sitting on a chair on a platform. Two sets of impulse responses were recorded (with and without the platform) to match the environments in which the participants were seated as well as those in which they were standing. A logarithmic frequency sweep (Farina et al., 2001) was used so that nonlinear distortions of the loudspeakers could be excluded from the measured impulse responses.
Analysis
First, we checked whether the observed variation across the movement traces of the participants led to differences in input SNR because of the head shadow effect. To this end, the differences in input SNR caused by the different movement traces were analyzed.
Subsequently, the algorithm benefit was analyzed to see whether it varied over the different movement traces. However, the movement potentially also caused differences in the input SNR. It could be that the algorithms simply compensated for these differences in input SNR, which would result in the same output SNR for all movement traces; this would mean that head movement does not affect the performance of the algorithms. To check this, the variance in output SNR over the movement traces was also analyzed. To examine the hypothesis of differences in directional algorithm performance between older and younger people, a repeated-measures ANOVA was carried out for the mean algorithm benefit over time, with age group as a between-participant factor and environment and algorithm type as within-participant factors.
Finally, in order to find out how much of the variance in algorithm benefit could be attributed to misalignment and maladaptation, the algorithm benefit was plotted per rotational head speed and per head direction. A dependency of the algorithm benefit on the rotational head speed would indicate maladaptation. A dependency of the algorithm benefit on the head direction would be caused by the head shadow effect, which affects the input SNR, and by the misalignment effect. The mean horizontal angular rotational head speed and the mean head direction were calculated from the head movement traces of the participants in the same time windows as used for the SNR calculation. The absolute head direction was taken, except for the VEs with multitalker conversation, for which the head direction relative to the target was taken to correct for the changing target position. This analysis also shows in which situations a potential effect of head movement is problematic.
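The per-window movement descriptors can be computed as sketched below (assuming a uniformly sampled azimuth trace; for the multitalker environments the azimuth would first be expressed relative to the active target speaker):

```python
import numpy as np

def head_motion_per_window(azimuth_deg, fs_trace, win_s=0.2):
    """For each non-overlapping 200-ms window, return the mean absolute
    horizontal rotational head speed (deg/s) and the mean head direction (deg)
    of a recorded azimuth trace sampled at fs_trace Hz."""
    speed = np.abs(np.diff(azimuth_deg)) * fs_trace   # instantaneous angular speed
    win = int(round(win_s * fs_trace))
    n_win = len(speed) // win
    mean_speed = np.array([speed[k * win:(k + 1) * win].mean() for k in range(n_win)])
    mean_dir = np.array([azimuth_deg[k * win:(k + 1) * win].mean() for k in range(n_win)])
    return mean_speed, mean_dir
```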
Results
Effect of Movement on Input SNR
The range in input SNR across participants was calculated in all time windows and plotted as a histogram to show the variance due to head movement (Figure 10, right panel). It shows that the mean range in input SNR was between 2 and 6 dB for all environments. SNR differences this large can be assumed to influence speech perception and also the hearing aid algorithm performance. This shows that there was enough variation in the input SNR across movement traces to check the influence on algorithm benefit. In the left panel of Figure 10, the mean input SNR across participants in all time windows is plotted as a histogram, to show the dynamics of the input SNR over time in the different environments.
Algorithm Benefit
The mean algorithm benefit over time is plotted in Figure 11 for all algorithms, movement traces, and environments. It can be seen that not all algorithms provided a benefit in all environments. Moreover, it can be seen that there is a variance in the algorithm benefit over the different movement traces. This variance indicates an effect of head movement and was also visible in the output SNR (see Supplementary Materials D). Therefore, we can be sure that the algorithms did not just compensate for the differences in input SNR due to head movement, which would lead to the same output SNR, but that head movement really did affect the performance of the algorithms. The following sections investigate whether the variance in algorithm benefit is related to misalignment and maladaptation. Figure 11 shows that the range in algorithm benefit was up to 6 dB for some algorithms in some environments. The differences in single time windows could be larger. To find this out, histograms were plotted of the range in benefit across the participants in each time window (Figure 12). It can be seen that differences in head movement patterns could cause differences in algorithm benefit up to 15 dB in single time windows. For the AMVDRb, a large range in algorithm benefit occurred more often than for the other algorithms, whereas the SCNR and BNR algorithms were less affected by head movement. An effect of head movement could be seen in all VEs, but in the street-active and train station VEs, the range in algorithm benefit was larger.
Figure 11.
Segmental SNR Benefit Over Time of All Algorithms Over All Movement Traces in All Environments. Box plots show the median (different symbol and color for each algorithm), 25th and 75th percentiles (thick line), and the range (thin line). The range in algorithm benefit shows that different movement traces result in a different algorithm benefit. The reverberation time (T60) of each environment is listed on top, as measured in Hendrikse et al. (2019b), together with the mean input SNR in each environment. ADMb = adaptive differential microphone; AMVDRb = adaptive minimum variance distortionless response beamformer; D&Sb = delay-and-subtract beamformer; SBb = static binaural beamformer; BNR = binaural noise reduction; SCNR = single-channel noise reduction; SNR = signal-to-noise ratio.
Figure 12.
The Algorithm Benefit Range Across Participants (and Across Head Movement Traces) Calculated in Each Time Window and Plotted as a Histogram for Each Environment Separately. Different colors indicate different algorithms. This shows how the head movement can affect the algorithm benefit in each time window. ADMb = adaptive differential microphone; AMVDRb = adaptive minimum variance distortionless response beamformer; D&Sb = delay-and-subtract beamformer; SBb = static binaural beamformer; BNR = binaural noise reduction; SCNR = single-channel noise reduction.
The repeated-measures ANOVA, to examine differences in algorithm performance between the age groups based on their different movement behaviors, showed that only the AMVDRb algorithm with the slowest setting gave a significant overall effect of age-group on the algorithm benefit, F(1, 38) = 4.5; p < .05. However, there were very small but significant differences for some algorithms in some environments. These are listed in Table 3.
Table 3.
Significant Differences in Algorithm Benefit Between Younger and Older Participants.
| Environment | Algorithm | Difference in algorithm benefit younger–older (dB) | Statistics F (1, 38) |
|---|---|---|---|
| Cafeteria-dual-task | ADMb | 0.3 | 12.4*** |
| | SBb | −0.1 | 4.4* |
| Lecture hall | D&Sb | 0.2 | 8.5** |
| | ADMb | 0.2 | 5.0* |
| Living room | SBb | 0.4 | 14.8*** |
| Train station | D&Sb | −0.4 | 9.9** |
| | ADMb | −0.5 | 16.3*** |
| | SBb | −0.4 | 9.4** |
| | AMVDRb | −0.6 | 17.9*** |
Note. ADMb = adaptive differential microphone; AMVDRb = adaptive minimum variance distortionless response beamformer; D&Sb = delay-and-subtract beamformer; SBb = static binaural beamformer.
*p < .05. **p < .01. ***p < .001.
Algorithm Benefit per Rotational Head Speed
To examine whether maladaptation contributed to the variance in algorithm benefit, the correlation between the algorithm benefit and the horizontal angular rotational head speed was investigated using linear regression. No dependency was found, and the correlations were below .15 for all algorithms in all environments. Thus, maladaptation did not contribute to the movement effect. Scatter plots of the algorithm benefit versus the horizontal angular rotational head speed, including regression lines, can be found in Supplementary Materials E.
Algorithm Benefit per Head Direction
To examine how the input SNR and misalignment contributed to the variance in algorithm benefit, the algorithm benefit per time window was plotted as a function of the horizontal head direction of the participants in the corresponding time window (Figure 13). Figure 13 shows that there was a strong dependency of the algorithm benefit on the head direction for the directional algorithms, especially for the AMVDRb algorithm. The omnidirectional BNR and SCNR algorithms did not depend much on the head direction. Only in the train station VE was there a strong increase in algorithm benefit toward azimuths of −90°, for the SCNR algorithm. The dependency of the directional algorithms on the head direction was strongest in the street-active and cafeteria VEs. The direction of optimal benefit for the AMVDRb algorithm was clearly toward the target in these VEs; for the other directional algorithms, it was somewhat off-axis.
Figure 13.
Algorithm Benefit as a Function of Head Direction for All Algorithms (Different Colors) in All Environments. The algorithm benefit is plotted as a function of the absolute head direction for the living room, lecture hall and train station environments, and as a function of the head direction relative to the target for the cafeteria and street environments, to compensate for changing target speaker positions. The plots show a dependency of the algorithm benefit on the head direction. ADMb = adaptive differential microphone; AMVDRb = adaptive minimum variance distortionless response beamformer; D&Sb = delay-and-subtract beamformer; SBb = a static binaural beamformer; BNR = binaural noise reduction; SCNR = single-channel noise reduction.
Discussion
Experiment 2 examined the influence of head movement on hearing aid algorithm benefit quantified by SNR improvement during natural head movement that was measured in the laboratory in VEs resembling everyday life situations. The algorithm benefit and output SNR both varied over the different movement traces, indicating that head movement influenced the performance of the hearing aid algorithms. The range in algorithm benefit over the different movement traces was up to 6 dB for some algorithms in some environments, and in single time windows even up to 15 dB, indicating that head movement has a large influence on algorithm benefit.
The analysis of the dependency of the algorithm benefit on the rotational head speed and head direction revealed how much of the variance in algorithm benefit could be attributed to misalignment and maladaptation. Because no dependency on the rotational head speed was found, maladaptation was shown not to be a relevant effect, as predicted from the results of Experiment 1. There was, however, a strong dependency of the algorithm benefit on the head direction for the directional algorithms, indicating a combined effect of the head shadow and misalignment. Because the range in algorithm benefit was sometimes larger than the range in input SNR, and because the omnidirectional algorithms were not affected, we can conclude that the combined effect is not caused solely by the influence of the head shadow effect on the input SNR; misalignment also plays a role. Most likely there is some interaction between the two, with differences in input SNR amplified by the beam pattern. Studying the mechanisms behind this interaction in more detail would be an interesting topic for future research.
The analysis of the dependency on the head direction also shows in which situations head movement can cause problems. In the VEs with multitalker conversations, problems occurred when the head was not pointing in the direction of the active speaker. For the D&Sb, ADMb, and SBb algorithms, the optimal direction was 15° to 30° off-axis from the target direction, as was also found by Ricketts (2000). From the analysis of the movement behavior in the previous study (Hendrikse et al., 2019b), we know that the participants were following the active speaker in the cafeterialisteningonly and streetactive VEs. The head was not always pointing exactly toward the off-center target speakers, however, because part of the movement was done with the eyes. In the cafeteriadualtask VE, the participants were no longer following the active speaker because they were also looking at the Pegboard. This could have led to a larger range in algorithm benefit in the cafeteriadualtask VE, but Figure 11 shows no difference between the two cafeteria VEs. There was a larger range in algorithm benefit in the streetactive VE. Figure 13 (bottom right panel) shows that a head directed toward the traffic on the right side resulted in a lower algorithm benefit, and we know from the movement behavior that participants were occasionally looking at the traffic, which could explain this. In the living room VE, there was also a large range in algorithm benefit. This was caused by occasionally looking at the person making comments, which resulted in a higher benefit for some algorithms and a lower benefit for others (Figure 13, top left panel). In the train station VE, a higher algorithm benefit was obtained for the SCNR algorithm when the head was pointing at −90° and thus toward the neighboring platform; most likely, the announcements at that platform could be heard better this way. Finally, in the lecture hall VE, the range in algorithm benefit was small, and the benefit did not depend greatly on the head direction.
When analyzing the algorithm benefit, it could be seen that not all algorithms provide a benefit in all situations. The benefit in the more realistic environments is also much lower than the benefit in the simple scenarios of Experiment 1. This difference in noise reduction performance between simple laboratory conditions and more realistic environments was previously shown by Grimm, Kollmeier, and Hohmann (2016). Potential causes of these differences in performance include the reverberation and the fluctuating noise in the more realistic environments, but finding specific explanations is beyond the scope of this article. The benefit in these more realistic environments is probably a better predictor of the benefit of these algorithms in daily life.
Finally, it was hypothesized that beamformers would work better for older people than for younger people because of differences in the relationship between head and eye movement. Significant differences on the order of 0.5 dB were found between the age groups, indicating that an age effect exists but is unlikely to be important in most situations.
Conclusion
The goal of this work was to study how head movement might influence the noise suppression performance of hearing aid algorithms and to identify the underlying mechanisms. Two potential consequences of head movement were investigated: maladaptation and misalignment. Experiment 1 examined the effect of maladaptation by analyzing algorithm benefit, quantified by SNR improvement, after simulated rotational head movement in simple anechoic noise scenarios. Experiment 2 examined the combined effect of maladaptation and misalignment on algorithm benefit during natural head movement that was measured in realistic VEs in the laboratory.
In Experiment 1, a significant effect of maladaptation was found, but only in the most unrealistic scenario did it have an influence larger than 1 dB. Algorithms with slower adaptation times were more likely to be affected by maladaptation because of head movement, and it could take several seconds for the algorithms to recover. For adaptation times in the range of the movement duration, the peak velocity of the movement had a large influence on maladaptation. For slower adaptation times, only the start angle of the movement played a role. Based on these results, it was expected that maladaptation would not play a role in daily life, and this was checked in Experiment 2.
Experiment 2 showed that natural head movement can severely affect algorithm performance in simulated situations resembling everyday life and confirmed the expectation from Experiment 1 that maladaptation does not play a role. A dependency of the algorithm benefit on the head direction was found, indicating a combined effect of the head shadow and misalignment. A large head movement effect was found only for the directional algorithms, resulting in differences of up to 6 dB in SNR improvement, averaged over time; the omnidirectional algorithms were only minimally affected. This confirms that misalignment plays a role. A smaller algorithm benefit was found when the head was not pointing toward the target speaker in the virtual environments with multitalker conversations. Finally, only small significant differences in directional algorithm benefit were found between the younger and older participants; the behavioral differences between the age groups appear too small to seriously affect the performance of directional algorithms.
This work thus demonstrates the need to take head movement into account in the evaluation of directional hearing aid algorithms. The head movement data used to predict the influence on hearing aid algorithm performance were measured in normal-hearing participants; hearing-impaired listeners and hearing aid users may move differently, and this needs to be examined in a future study. Because environments resembling real-life situations and natural head movement data were used, the findings of this study may be relevant to real-life use of hearing aids, both qualitatively and quantitatively.
The databases of head movement trajectories and VEs are available online (Hendrikse et al., 2019a, 2019c). A further database, published to accompany this article, describes how to generate the hearing aid microphone recordings used in Experiment 2 (Hendrikse et al., 2020). Together with the methods proposed here, these databases may be useful for studying the performance of directional algorithms of commercial hearing aids as well as for investigating the potential of (attention-)steered directional filters to counteract the misalignment effect. To demonstrate the methods, only the noise reduction property of the algorithms was investigated. Other properties of the algorithm output, such as sound quality and speech intelligibility, could also be affected by head movement; this could be investigated in future studies using the methods proposed here.
Acknowledgments
The authors would like to thank the associate editor, Michael Akeroyd, and the two anonymous reviewers for their helpful comments.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the Deutsche Forschungsgemeinschaft (German Research Foundation)—project 352015383–SFB 1330 B1.
ORCID iD
Maartje M. E. Hendrikse https://orcid.org/0000-0002-7704-6555
Supplemental Material
Supplemental material for this article is available online.
References
- Abdipour R., Akbari A., Rahmani M., Nasersharif B. (2015). Binaural source separation based on spatial cues and maximum likelihood model adaptation. Digital Signal Processing, 36, 174–183. 10.1016/j.dsp.2014.09.001
- Archer-Boyd A. W., Whitmer W. M., Brimijoin W. O., Soraghan J. J. (2015). Biomimetic direction of arrival estimation for resolving front-back confusions in hearing aids. The Journal of the Acoustical Society of America, 137(5), EL360–EL366. 10.1121/1.4918297
- Baumgärtel R. M., Hu H., Krawczyk-Becker M., Marquardt D., Herzke T., Coleman G., Adiloğlu K., Bomke K., Plotz K., Gerkmann T., Doclo S., Kollmeier B., Hohmann V., Dietz M. (2015). Comparing binaural pre-processing strategies II: Speech intelligibility of bilateral cochlear implant users. Trends in Hearing, 19, 1–18. 10.1177/2331216515618903
- Baumgärtel R. M., Krawczyk-Becker M., Marquardt D., Völker C., Hu H., Herzke T., Coleman G., Adiloğlu K., Ernst S. M., Gerkmann T., Doclo S., Kollmeier B., Hohmann V., Dietz M. (2015). Comparing binaural pre-processing strategies I: Instrumental evaluation. Trends in Hearing, 19, 1–16. 10.1177/2331216515617916
- Best V., Roverud E., Streeter T., Mason C. R., Kidd G. (2017). The benefit of a visually guided beamformer in a dynamic speech task. Trends in Hearing, 21, 2331216517722304. 10.1177/2331216517722304
- Boyd A. W., Whitmer W. M., Brimijoin W. O., Akeroyd M. A. (2013). Improved estimation of direction of arrival of sound sources for hearing aids using gyroscopic information. Proceedings of Meetings on Acoustics, 19, 030046. 10.1121/1.4799684
- Breithaupt C., Gerkmann T., Martin R. (2008). A novel a priori SNR estimation approach based on selective cepstro-temporal smoothing. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 4897–4900). IEEE. 10.1109/ICASSP.2008.4518755
- Brimijoin W. O., McShefferty D., Akeroyd M. A. (2010). Auditory and visual orienting responses in listeners with and without hearing-impairment. The Journal of the Acoustical Society of America, 127(6), 3678–3688. 10.1121/1.3409488
- Brimijoin W. O., McShefferty D., Akeroyd M. A. (2012). Undirected head movements of listeners with asymmetrical hearing impairment during a speech-in-noise task. Hearing Research, 283(1–2), 162–168. 10.1016/j.heares.2011.10.009
- Brimijoin W. O., Whitmer W. M., McShefferty D., Akeroyd M. A. (2014). The effect of hearing aid microphone mode on performance in an auditory orienting task. Ear and Hearing, 35(5), 204–212. 10.1097/AUD.0000000000000053
- Elko G. W., Pong A.-T. N. (1995). A simple adaptive first-order differential microphone. In Proceedings of 1995 Workshop on Applications of Signal Processing to Audio and Acoustics (pp. 169–172). IEEE. 10.1109/ASPAA.1995.482983
- Farina A., Bellini A., Armelloni E. (2001). Non-linear convolution: A new approach for the auralization of distorting systems [Convention]. 110th Convention of the Audio Engineering Society, Amsterdam, The Netherlands.
- Favre-Félix A., Graversen C., Dau T., Lunner T. (2017). Real-time estimation of eye gaze by in-ear electrodes. In Engineering in Medicine and Biology Society (EMBC), 2017 39th Annual International Conference of the IEEE (pp. 4086–4089). IEEE. 10.1109/EMBC.2017.8037754
- Grange J. A., Culling J. F. (2016). The benefit of head orientation to speech intelligibility in noise. The Journal of the Acoustical Society of America, 139(2), 703–712. 10.1121/1.4941655
- Grimm G., Herzke T., Berg D., Hohmann V. (2006). The master hearing aid: A PC-based platform for algorithm development and evaluation. Acta Acustica United With Acustica, 92(4), 618–628.
- Grimm G., Hohmann V., Kollmeier B. (2009). Increase and subjective evaluation of feedback stability in hearing aids by a binaural coherence-based noise reduction scheme. IEEE Transactions on Audio, Speech, and Language Processing, 17(7), 1408–1419. 10.1109/TASL.2009.2020531
- Grimm G., Kayser H., Hendrikse M., Hohmann V. (2018). A gaze-based attention model for spatially-aware hearing aids. In Speech Communication; 13th ITG Symposium (pp. 231–235). VDE Verlag GmbH.
- Grimm G., Kollmeier B., Hohmann V. (2016). Spatial acoustic scenarios in multichannel loudspeaker systems for hearing aid evaluation. Journal of the American Academy of Audiology, 27(7), 557–566. 10.3766/jaaa.15095
- Grimm G., Luberadzka J., Hohmann V. (2019). A toolbox for rendering virtual acoustic environments in the context of audiology. Acta Acustica United With Acustica, 105(3), 566–578. 10.3813/AAA.919337
- Grimm G., Luberadzka J., Müller J., Hohmann V. (2016). A simple algorithm for real-time decomposition of first order Ambisonics signals into sound objects controlled by eye gestures. In Proceedings of the Interactive Audio Systems Symposium. University of York.
- Hagerman B., Olofsson Å. (2004). A method to measure the effect of noise reduction algorithms using simultaneous speech and noise. Acta Acustica United With Acustica, 90(2), 356–361.
- Hamacher V., Chalupper J., Eggers J., Fischer E., Kornagel U., Puder H., Rass U. (2005). Signal processing in high-end hearing aids: State of the art, challenges, and future trends. EURASIP Journal on Advances in Signal Processing, 2005(18), 2915–2929. 10.1155/ASP.2005.2915
- Hart J., Onceanu D., Sohn C., Wightman D., Vertegaal R. (2009). The attentive hearing aid: Eye selection of auditory sources for hearing impaired users. In Gross T. (Ed.), IFIP Conference on Human-Computer Interaction (pp. 19–35). Springer.
- Hendrikse M. M. E., Llorach G., Grimm G., Hohmann V. (2018). Influence of visual cues on head and eye movements during listening tasks in multi-talker audiovisual environments with animated characters. Speech Communication, 101, 70–84. 10.1016/j.specom.2018.05.008
- Hendrikse M. M. E., Llorach G., Hohmann V., Grimm G. (2019a). Database of movement behavior and EEG in virtual audiovisual everyday-life environments for hearing aid research. 10.5281/zenodo.1434090
- Hendrikse M. M. E., Llorach G., Hohmann V., Grimm G. (2019b). Movement and gaze behavior in virtual audiovisual listening environments resembling everyday life. Trends in Hearing, 23, 1–29. 10.1177/2331216519872362
- Hendrikse M. M. E., Llorach G., Hohmann V., Grimm G. (2019c). Virtual audiovisual everyday-life environments for hearing aid research. 10.5281/zenodo.1434115
- Hendrikse M. M. E., Schwarte K., Grimm G., Hohmann V. (2020). Generating hearing aid microphone recordings including head movement in virtual acoustic environments resembling everyday life. 10.5281/zenodo.3621283
- Herzke T., Kayser H., Loshaj F., Grimm G., Hohmann V. (2017). Open signal processing software platform for hearing aid research (openMHA). In Proceedings of the Linux Audio Conference (pp. 35–42). 10.1121/1.4987813
- Kayser H., Ewert S. D., Anemüller J., Rohdenburg T., Hohmann V., Kollmeier B. (2009). Database of multichannel in-ear and behind-the-ear head-related and binaural room impulse responses. EURASIP Journal on Advances in Signal Processing, 2009(1), 298605. 10.1155/2009/298605
- Lu Y. C., Cooke M. (2011). Motion strategies for binaural localisation of speech sources in azimuth and distance by artificial listeners. Speech Communication, 53(5), 622–642. 10.1016/j.specom.2010.06.001
- Luts H., Eneman K., Wouters J., Schulte M., Vormann M., Büchler M., Dillier N., Houben R., Dreschler W. A., Froehlich M., Puder H., Grimm G., Hohmann V., Leijon A., Lombard A., Mauler D., Spriet A. (2010). Multicenter evaluation of signal enhancement algorithms for hearing aids. The Journal of the Acoustical Society of America, 127(3), 1491–1505. 10.1121/1.3299168
- Mysore G. J. (2015). Can we automatically transform speech recorded on common consumer devices in real-world environments into professional production quality speech? – A dataset, insights, and challenges. IEEE Signal Processing Letters, 22(8), 1006–1010. 10.1109/LSP.2014.2379648
- Olofsson Å., Hansen M. (2006). Objectively measured and subjectively perceived distortion in nonlinear systems. The Journal of the Acoustical Society of America, 120(6), 3759–3769. 10.1121/1.2372591
- Quackenbush S. R., Barnwell T. P., Clements M. A. (1988). Objective measures of speech quality. Prentice Hall.
- Ricketts T. (2000). The impact of head angle on monaural and binaural performance with directional and omnidirectional hearing aids. Ear and Hearing, 21, 318–328. 10.1097/00003446-200008000-00007
- Rohdenburg T., Hohmann V., Kollmeier B. (2007). Robustness analysis of binaural hearing aid beamformer algorithms by means of objective perceptual quality measures. In 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (pp. 315–318). IEEE. 10.1109/ASPAA.2007.4393016
- Tessendorf B., Bulling A., Roggen D., Stiefmeier T., Feilner M., Derleth P., Tröster G. (2011). Recognition of hearing needs from body and eye movements to improve hearing instruments. Pervasive Computing, 6696 LNCS, 314–331. 10.1007/978-3-642-21726-5_20
- Thiemann J., Escher A., van de Par S. (2015). Multiple model high-spatial resolution HRTF measurements. In Proceedings of the German Annual Conference on Acoustics (DAGA) (pp. 797–798).
- Tiffin J., Asher E. J. (1948). The Purdue Pegboard: Norms and studies of reliability and validity. Journal of Applied Psychology, 32, 234–247. 10.1037/h0061266
- Van Wanrooij M. M., Van Opstal A. J. (2004). Contribution of head shadow and pinna cues to chronic monaural sound localization. The Journal of Neuroscience, 24(17), 4163–4171. 10.1523/JNEUROSCI.0048-04.2004
- Wallach H. (1940). The role of head movements and vestibular and visual cues in sound localization. Journal of Experimental Psychology, 27(4), 339–368.