Trends in Hearing. 2025 Aug 25;29:23312165251372462. doi: 10.1177/23312165251372462

Age-Related Differences in EEG-Based Speech Reception Threshold Estimation Using Scalp and Ear-EEG

Heidi B Borges 1,2, Emina Alickovic 1,3, Christian B Christensen 2, Preben Kidmose 2, Johannes Zaar 1,4
PMCID: PMC12378310  PMID: 40853325

Abstract

Previous studies have demonstrated the feasibility of estimating the speech reception threshold (SRT) based on electroencephalography (EEG), termed SRTneuro, in younger normal-hearing (YNH) participants. This method may support speech perception in hearing-aid users through continuous adaptation of noise-reduction algorithms. The prevalence of hearing impairment and thereby hearing-aid use increases with age. The SRTneuro estimation is based on envelope reconstruction accuracy, which has also been shown to increase with age, possibly due to excitatory/inhibitory imbalance or recruitment of additional cortical regions. This could affect the estimated SRTneuro. This study investigated the age-related changes in the temporal response function (TRF) and the feasibility of SRTneuro estimation across age. Twenty YNH and 22 older normal-hearing (ONH) participants listened to audiobook excerpts at various signal-to-noise ratios (SNRs) while EEG was recorded using 66 scalp electrodes and 12 in-ear-EEG electrodes. A linear decoder reconstructed the speech envelope, and the Pearson's correlation was calculated between the reconstructed and speech-stimulus envelopes. A sigmoid function was fitted to the reconstruction-accuracy-versus-SNR data points, and the midpoint was used as the estimated SRTneuro. The results show that the SRTneuro can be estimated with similar precision in both age groups, whether using all scalp electrodes or only those in and around the ear. This consistency across age groups was observed despite physiological differences, with the ONH participants showing higher reconstruction accuracies and greater TRF amplitudes. Overall, these findings demonstrate the robustness of the SRTneuro method in older individuals and highlight its potential for applications in age-related hearing loss and hearing-aid technology.

Keywords: ear-EEG, EEG, speech reception threshold, neural decoding, speech intelligibility in noise

Introduction

Understanding speech in noise is a critical and challenging task for hearing-aid users (Kochkin, 2002). A common method to evaluate this ability is by measuring the speech reception threshold (SRT), a behavioral measure typically defined as the signal-to-noise ratio (SNR) at which a participant can accurately repeat 50% of the presented speech items (SRTbeh). This approach, however, relies on active participation of the listener, which might not always be feasible.

To address this limitation, recent research has explored the use of electroencephalography (EEG) measured while the subjects are listening to speech-in-noise stimuli to estimate the SRT (SRTneuro) (Borges et al., 2025; Lesenfants et al., 2019). This approach offers a promising alternative for assessing speech intelligibility without relying on active behavioral responses.

Researchers have estimated the SRTneuro with scalp EEG and linear models in younger normal-hearing (YNH) listeners. Both encoding (forward) and decoding (backward) models have been used to estimate SRTneuro (Borges et al., 2025; Lesenfants et al., 2019; Vanthornhout et al., 2018), with decoders offering robust speech envelope reconstruction and encoders providing spatial and temporal insights through the temporal response function (TRF) (Alickovic et al., 2019). The TRF is similar to the event-related potential (ERP). However, unlike ERPs, which average responses over multiple repetitions of discrete stimuli (Luck, 2014), the TRF can be calculated for continuous and non-repeated stimuli and reflects only the selected stimulus feature, not the entire stimulus. A decoder model that reconstructs the speech envelope from EEG was used to estimate the SRTneuro as the midpoint of a sigmoid function fitted to SNR-versus-reconstruction-accuracy datapoints (Borges et al., 2025; Vanthornhout et al., 2018). This approach is inspired by a standardized method to determine the SRTbeh. In a previous study using whole scalp EEG and this method, an SRTneuro within 3 dB of the SRTbeh was achieved for 100% of participants and within 2 dB for 75% (Borges et al., 2025).

Existing studies have focused on younger adults. Since age-related hearing loss is the most common type of hearing impairment (Gratton & Vázquez, 2003), it is relevant to investigate the SRTneuro estimation in older adults. Age-related changes include enhanced neural tracking of speech, as evidenced by increased envelope reconstruction accuracy from EEG (Karunathilake et al., 2023; Presacco et al., 2016). These age-related neural changes could potentially bias the SRTneuro estimation or challenge the method in other ways.

Importantly, SRTneuro offers advantages beyond traditional behavioral testing. It enables a more automated, passive estimation procedure, which can be useful for non-responsive individuals such as younger children or individuals with cognitive decline. Furthermore, continuous measurement of speech intelligibility could support adaptive hearing-aid technologies or enable real-time monitoring of listening conditions. However, the use of full-scalp electrodes is impractical for integration into hearing aids. To address this limitation, a recent study investigated the feasibility of measuring SRTneuro using a more discreet and unobtrusive approach: electrodes placed in and around the ear in YNH individuals (Borges et al., 2024). It was found that SRTneuro estimates derived from a configuration of in-ear EEG electrodes plus four electrodes around the ear closely matched the SRTbeh, with results similar to SRTneuro estimates obtained with a full-scalp array of 66 electrodes. This finding is an important step toward the potential integration of SRTneuro measurements into hearing aids and other ear-worn devices.

Despite these advances, the feasibility of using scalp and ear-EEG for SRTneuro estimation in older normal-hearing (ONH) individuals remains unexplored. The present study aims to fill this gap by evaluating SRTneuro estimation in an ONH group using both scalp and ear-EEG configurations and comparing the results with those obtained for a YNH group (Borges et al., 2025). A secondary objective was to investigate noise-induced changes in the TRF in ONH participants in order to better understand the neurological changes that could impact the SRTneuro estimation.

Materials and Methods

Participants

In this study, a new data set was collected from 22 ONH participants (16 females, 6 males), aged 57 to 76 years (mean age: 65 years) and combined with an existing data set from 20 YNH participants, originally collected for another study (Borges et al., 2025). The inclusion criteria for the ONH individuals were identical to those used for the YNH study: right-handed, no significant dyslexia affecting daily life, and no history of neurological disorders (Borges et al., 2025), with the exception of the age and hearing threshold requirements. The YNH population was required to be between 18 and 30 years old and have normal hearing, defined as pure-tone thresholds of at most 20 dB HL (hearing level) at 0.125, 0.25, 0.5, 0.75, 1, 1.5, 2, 3, 4, 6, and 8 kHz. The ONH participants were required to be above 50 years old, and the ear with the lower threshold (i.e., the “better” ear) was required to meet the following criteria: a maximum threshold of 25 dB HL in the frequency range of 0.125 to 3 kHz, 30 dB HL at 4 kHz, 40 dB HL at 6 kHz, and 45 dB HL at 8 kHz. The other ear was allowed to have a 10-dB higher maximum threshold. The study was approved by Aarhus University's Institutional Review Board with the approval number 2023-014.

Experimental Setup

The experimental paradigm and setup followed the protocol outlined in the previous study (Borges et al., 2025). The study consisted of three visits. During the first visit, the SRTbeh was estimated, the Edinburgh Handedness Inventory test (Oldfield, 1971) was conducted, a reading span test (Daneman & Carpenter, 1980) was performed, and ear impressions were taken. The second visit involved data collection for a separate study and is therefore not further described here. The third visit involved EEG recordings for the current study.

Behaviorally Estimated Speech Reception Threshold—SRTbeh

The stimuli were presented by means of Etymotic ER-1 insert earphones (Etymotic Research, Inc., IL, USA) with disposable foam tips via a soundcard (RME Hammerfall DSP Multiface II, Audio AG, Germany).

The SRTbeh was estimated using the Danish Hearing In Noise Test (HINT) sentences (Nielsen & Dau, 2011) and an adaptive procedure. The 50% word reception score was found using steady-state speech-shaped noise, with the first sentence starting at an SNR of −10 dB. The speech level was set at 65 dB sound pressure level (SPL), while the level of the steady-state speech-shaped noise was varied to obtain the desired SNR. Initially, two concatenated HINT training sentence lists, comprising 40 sentences, were used for training. Subsequently, two concatenated HINT sentence lists were used to determine the SRTbeh (see details in Borges et al., 2025).

Experimental Paradigm and Setup for Speech Reception Threshold Estimated From EEG—SRTneuro

The stimuli were presented in the same setup as for the SRTbeh, with the exception that the sound tube of the earphones was connected to the sound bore in the ear-EEG earpieces for stimulus presentation during the EEG recordings. Scalp and in-ear EEG were recorded concurrently with two separate amplifiers at a sampling rate of 4096 Hz. Scalp EEG was recorded using the Biosemi Active EEG (Amsterdam, Netherlands) system with a 64-channel cap, two external mastoid electrodes, and the reference (CMS) between PO3 and POz. Ear-EEG was recorded using a SAGA32+/64+ system (TMSi, Oldenzaal, Netherlands) with 12 in-ear EEG electrodes and an additional Fpz electrode. The SAGA amplifier features a hardware-implemented average reference, eliminating the need for a dedicated reference electrode during data acquisition. The Fpz electrode location was recorded by both amplifiers to allow for merging the data from the two amplifiers. The 12 silver/silver chloride (Ag/AgCl) ear-EEG electrodes had a diameter of 4 mm and were placed in the individually molded earpieces in positions ExA, ExB, ExC, ExT, ExI, and ExK, where x is replaced with R if placed on the right-ear earpiece and with L if placed on the left-ear earpiece. The naming convention for the in-ear electrodes is outlined in Kidmose et al. (2013), the design of the soft earpieces is described in Kappel et al. (2019), and the electrodes are described and characterized in Kappel and Kidmose (2022). For an illustration of the electrodes and their placement, see Borges et al. (2024). Before inserting the in-ear-EEG earpieces, the participants’ ears were cleaned with a wet cotton swab. The quality of the signals recorded from the in-ear electrodes was assessed through visual inspection in a live viewer, specifically by checking for artefacts induced by eye movements and jaw clenches.

The SRTneuro was estimated by presenting audiobook excerpts, each lasting approximately 60 seconds. This length was chosen to balance data quality and participant fatigue. The audiobook excerpts were played to the participants at five different SNRs relative to the behaviorally measured SRT (SRTbeh −4 dB, SRTbeh −2 dB, SRTbeh, SRTbeh +2 dB, SRTbeh +4 dB) and without noise (clean speech). The SNR range was chosen to obtain an informative variation in speech reception. If the range is too narrow, the variation in speech reception will be small and mainly driven by inter-trial variability. Conversely, if the range is too wide, the speech reception may become nearly binary, resulting in the subject either understanding almost nothing or nearly everything. The presentation level of the speech was fixed at 65 dB SPL while the steady-state speech-shaped noise was adjusted according to the desired SNR. A total of 16 excerpts were presented at each SNR in a randomized order. The audiobook material was filtered using a first-order lowpass filter with a cutoff frequency of 2 kHz to approximate the third-octave band power spectral density of the HINT material and thereby enhance comparability between the audiobook material and the HINT material. To keep the participants engaged in the listening task, a two-alternative forced-choice content-related question followed each excerpt. After every 16 trials, participants were given a recreational task, such as describing their morning routine in as much detail as possible. See further details on the SRTneuro paradigm and experimental setup in Borges et al. (2025) and Borges et al. (2024).

Analysis

All analyses were conducted using MATLAB (The MathWorks Inc, Massachusetts, USA) and the mTRF toolbox (Crosse et al., 2016).

Stimulus Preprocessing

The envelopes of the audiobook excerpts were extracted as described in Borges et al. (2024). The broadband envelope was obtained as the magnitude of the analytic signal computed via the Hilbert transform. Since the human auditory system behaves compressively, amplifying louder sounds less than softer ones, a power law x(t)^0.6, where x(t) represents the envelope at time t, was applied. This is a common way to model and compensate for cochlear compression (Biesmans et al., 2017). The resulting compressed envelope was bandpass filtered between 1 and 8 Hz using a zero-phase filter implemented in MATLAB with the “filtfilt” function, employing a sixth-order Butterworth polynomial. This filter was chosen for its flat passband and appropriate steepness, preserving the signal of interest whilst attenuating unwanted frequencies. The filtered envelope was then downsampled to 64 Hz for further analysis.
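As a minimal sketch of the envelope-extraction steps described above (in Python with NumPy/SciPy rather than the study's MATLAB implementation; the function name and the second-order-sections filter form are our own choices):

```python
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt, resample_poly

def extract_envelope(audio, fs, fs_out=64):
    """Hilbert envelope -> 0.6 power-law compression -> 1-8 Hz zero-phase
    bandpass -> downsampling to fs_out. A sketch of the described pipeline,
    not the study's MATLAB code."""
    env = np.abs(hilbert(audio))     # broadband envelope of the analytic signal
    env = env ** 0.6                 # cochlear-compression power law
    # A third-order bandpass design yields a sixth-order Butterworth polynomial;
    # sosfiltfilt applies it forward and backward (zero phase), analogous to
    # MATLAB's "filtfilt" (SOS form is used here for numerical stability).
    sos = butter(3, [1.0, 8.0], btype="bandpass", fs=fs, output="sos")
    env = sosfiltfilt(sos, env)
    return resample_poly(env, fs_out, fs)   # downsample to 64 Hz
```

For example, a 2-s excerpt sampled at 4096 Hz yields a 128-sample envelope at 64 Hz.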

EEG Preprocessing

The EEG data were processed as described in Borges et al. (2024). The in-ear EEG and scalp EEG were processed separately. Channels exhibiting a constant signal, such as those that were saturated, were first identified. The in-ear EEG electrodes were then referenced to the average of all in-ear electrodes, while the scalp electrodes were referenced to Fpz. Flat electrodes, and electrodes with a standard deviation (SD) above three times the mean SD of the channels recorded with the same amplifier after highpass filtering with a 1-Hz cut-off frequency, were rejected. For the scalp EEG, the rejected channels were replaced using spherical interpolation, whereas for the in-ear EEG rejected electrodes were either omitted (for the AEar-referenced configurations, see below) or replaced with the mean of the electrodes in the local area (Ear). All channels were then referenced to Fpz, downsampled to 256 Hz, and bandpass filtered between 1 and 8 Hz using a zero-phase filter based on a sixth-order Butterworth polynomial. The filtered channels were further downsampled to 64 Hz and then epoched. To remove extreme artefacts, all values greater than 100 μV or smaller than −100 μV were removed, along with 32 data points (0.5 s) before and after each artefact. To preserve the correlation structure of the signal, the removed points were reconstructed using autoregressive modeling, implemented with the “fillgaps” function in MATLAB. To evaluate the performance of the SRTneuro estimation across various electrode configurations, 13 configurations were selected, including in-ear electrodes and electrodes located around the ear (T7/8 and M1/2). In electrode configurations containing only two electrodes, one electrode was re-referenced to the other. In configurations with more than two electrodes, an average reference across all included electrodes was used. For reference, performance was also evaluated using a full-scalp electrode configuration, referenced to the mastoid average.
An overview of the in- and around-ear electrode configurations is provided in Table 1. For graphical illustration, see Supplementary Material S1.
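The ±100 μV artefact-removal step described above can be sketched as follows (Python; linear interpolation is substituted for MATLAB's autoregressive "fillgaps", so this illustrates the masking logic only, not the study's gap-filling method):

```python
import numpy as np

def remove_artifacts(x, fs=64, thresh=100.0, pad_s=0.5):
    """Mask samples exceeding +/-thresh (microvolts), extend the mask by
    pad_s seconds (32 samples at 64 Hz) on each side, and fill the gaps.
    The study reconstructed gaps with autoregressive modeling; linear
    interpolation is used here as a simplified stand-in."""
    pad = int(round(pad_s * fs))
    bad = np.abs(x) > thresh
    for i in np.flatnonzero(bad):            # dilate mask around each artefact
        bad[max(0, i - pad):i + pad + 1] = True
    good = np.flatnonzero(~bad)
    filled = x.copy()
    filled[bad] = np.interp(np.flatnonzero(bad), good, x[good])
    return filled, bad
```

Applied to a 64-Hz epoch, a single 250-μV spike removes the spike itself plus 0.5 s of data on each side (65 samples in total).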

Table 1.

An Overview of the Investigated Electrode Configurations With Electrodes In and Around the Ear.

Configuration abbreviation Active electrode(s) Reference electrode(s)
Ear 12 ear-EEG electrodes Average of active electrodes
AEar Average left ear-EEG Average right ear-EEG
M M1 M2
T T7 T8
EarT 12 ear-EEG electrodes, T7, T8 Average of active electrodes
AEarT Average right ear-EEG, average left ear-EEG, T7, T8 Average of active electrodes
MT M1, M2, T7, T8 Average of active electrodes
EarM 12 ear-EEG electrodes, M1, M2 Average of active electrodes
AEarM Average right ear-EEG, average left ear-EEG, M1, M2 Average of active electrodes
EarMT 12 ear-EEG electrodes, M1, M2, T7, T8 Average of active electrodes
LAEarMT Average left ear-EEG, M1, T7 Average of active electrodes
RAEarMT Average right ear-EEG, M2, T8 Average of active electrodes
AEarMT Average right ear-EEG, average left ear-EEG, M1, M2, T7, T8 Average of active electrodes

The abbreviation for the configuration is shown along with the active electrode(s) and the reference electrode(s).

Temporal Response Function

An encoding model was trained to predict the EEG response from the envelope of the presented stimuli. Leave-one-out cross-validation was applied within each condition, resulting in 16 encoder models trained per condition and participant. TRF estimation involves inversion of the stimulus property covariance matrix. To prevent overfitting, ridge regularization with a regularization parameter (λ) of 100 was applied across all TRF estimations. This λ value was used across all conditions and all participants, as it resulted in the highest overall mean reconstruction accuracy across participants and conditions in a previous study including YNH participants, when testing λ values in the range [10^−6, 10^5] (Borges et al., 2025), and it ensures a fair comparison between models. The noise floor was determined by calculating the Pearson's correlation between the EEG data predicted from 68 audio excerpts, which were not presented to the participants, and the recorded EEG. These audio excerpts, which were extracted from the same audiobook material as the experimental speech stimuli, went through the same preprocessing as the experimental stimuli.
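A bare-bones version of the ridge-regularized encoder fit can be sketched as follows (the study used the mTRF toolbox in MATLAB; this Python sketch with our own function names illustrates the regularized least-squares step, not the toolbox's exact implementation):

```python
import numpy as np

def lagged_design(stim, lags):
    """Time-lagged design matrix: column j holds stim delayed by lags[j] samples."""
    n = len(stim)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = stim[:n - lag]
        else:
            X[:lag, j] = stim[-lag:]
    return X

def train_trf(stim, eeg, fs=64, tmin=-0.1, tmax=0.4, lam=100.0):
    """Ridge-regularized encoder, w = (X'X + lam*I)^-1 X'y, with lam = 100
    as in the study. Returns lag times (s) and TRF weights per lag."""
    lags = np.arange(int(round(tmin * fs)), int(round(tmax * fs)) + 1)
    X = lagged_design(stim, lags)
    XtX = X.T @ X
    w = np.linalg.solve(XtX + lam * np.eye(XtX.shape[0]), X.T @ eeg)
    return lags / fs, w
```

With a simulated response that is simply the stimulus delayed by 5 samples and doubled, the recovered TRF peaks at a lag of 5/64 s with a weight near 2 (slightly shrunk by the ridge penalty).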

A subset of 17 channels was selected for subsequent TRF analysis (FC5, FC3, FC1, FCz, FC2, FC4, FC6, C3, C1, Cz, C2, C4, F3, F1, Fz, F2, and F4), as these channels demonstrated high prediction accuracy in previous studies (Fuglsang et al., 2017). A permutation test with α = 0.05 and a precision of 0.01 was used to test whether the mean prediction accuracy of the TRFs from the 17 channels significantly exceeded the mean noise floor within a condition. Only conditions for which a significant difference was observed at the participant level were included in the further TRF analysis. To find the amplitudes and latencies of the prominent peaks, identified manually as the distinctive peaks in the grand average TRF (P1, N1, and P2), a Gaussian function was fitted to each peak on the participant-level TRF for each noise condition. The Gaussian function was parameterized as a(t) = p·exp(−((t − l)/w)^2), where a(t) is the amplitude as a function of time t, p is the peak amplitude, l is the latency, and w is the width of the deflection. This approach aligns with that of Borges et al. (2025), and the fit was performed using the same boundary parameters. Fits with R2 values at or below 0.5 were rejected and excluded from further analysis.
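The per-peak Gaussian fit can be sketched as follows (Python/SciPy; `curve_fit` stands in for the study's MATLAB fitting routine, and the R² rejection criterion follows the text):

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(t, p, l, w):
    """a(t) = p * exp(-((t - l) / w)^2): peak amplitude p, latency l, width w."""
    return p * np.exp(-(((t - l) / w) ** 2))

def fit_peak(t, trf, p0):
    """Fit a Gaussian to one TRF deflection; reject fits with R^2 <= 0.5."""
    popt, _ = curve_fit(gauss, t, trf, p0=p0)
    residuals = trf - gauss(t, *popt)
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((trf - trf.mean()) ** 2)
    return (popt, r2) if r2 > 0.5 else (None, r2)
```

On a noiseless synthetic deflection, the fit recovers the generating amplitude and latency and the fit is retained (R² near 1).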

Relationships between TRF amplitudes, latencies, and SNR were examined using linear mixed models (six models in total) in RStudio (R Core Team, Vienna, version 4.3.2) with the nlme package (Lindstrom & Bates, 1990; Pinheiro & Bates, 1996). A mixed model was chosen to account for inter-subject variability in repeated measures. The relationship was investigated using the model F ~ SNR + Group + (1|P), where F is the latency or amplitude to be predicted, SNR and Group are fixed effects, and P is the participant-specific offset, modeled as a random effect. The participant-specific slope was not included as a random effect, since the purpose was to find a general trend across participants. The residuals were analyzed for normality by visual inspection of the histogram and QQ plots. The significance of the fixed effects was assessed using a univariate Wald test with α = 0.05, followed by Holm–Bonferroni correction. Only the conditions containing noise were included in the statistical analysis, whereas the clean speech condition was not considered.

To compare the mean prediction accuracies between the two age groups within each condition, a cluster permutation test was used to control for multiple comparisons across electrodes. This analysis was performed using the “ft_freqstatistics” function in FieldTrip (Oostenveld et al., 2011). The following settings were used: Monte-Carlo estimates of the significance probabilities, independent samples t-test statistics, a randomization of 1000 repetitions and a cluster α of 0.05. The “ft_prepare_neighbors” function was used to define the neighborhood structure using the triangulation method.

Speech Reception Threshold Estimation From EEG

The SRTneuro was estimated using a linear decoder that was trained on the clean speech data to reconstruct the envelopes of the speech signals contained in the presented speech-and-noise stimuli (Borges et al., 2024). The reconstruction accuracy was calculated as the Pearson's correlation between the actual and the reconstructed envelopes. The noise floor was found using the same approach as used for the encoder. The reconstruction accuracy for the clean speech condition was evaluated using leave-one-out cross-validation, where one clean-speech excerpt was used for testing and 15 clean-speech excerpts were used for training. This procedure was repeated 16 times, with a different test trial in each iteration.

To investigate whether the mean reconstruction accuracy in the ONH individuals was significantly higher than that of the YNH individuals, as observed in Presacco et al. (2016), a permutation test was conducted. This test had a precision of 0.01 and a significance level of 5%. The null hypothesis was that the mean reconstruction accuracy of the ONH individuals was not higher than that of the YNH individuals. The alternative hypothesis was that the mean reconstruction accuracy in the ONH individuals was significantly higher than that of the YNH individuals.
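A one-sided permutation test of this kind can be sketched as follows (Python; a generic label-shuffling implementation, not the study's exact routine or its precision scheme):

```python
import numpy as np

def permutation_test_greater(a, b, n_perm=10000, seed=0):
    """One-sided permutation test of H1: mean(a) > mean(b).
    Group labels are shuffled; the p-value is the fraction of permuted
    mean differences at least as large as the observed difference
    (with an add-one correction so p is never exactly zero)."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a, float), np.asarray(b, float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        count += perm[:a.size].mean() - perm[a.size:].mean() >= observed
    return (count + 1) / (n_perm + 1)
```

For two clearly separated simulated groups the test yields a small p-value, while swapping the groups (so the observed difference is negative) yields a p-value near 1.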

A sigmoid function was then fitted to the reconstruction-accuracy-versus-SNR data points for each individual participant. The function used for the fit, as described by Farris-Trimble and McMurray (2013), is given by the following equation:

S(SNR) = (p − b) / (1 + exp((4s / (p − b)) · (m − SNR))) + b (1)

Here, p is the maximum value of the sigmoid, b is the minimum value, s is the slope, and m is the midpoint. The maximum value of the sigmoid was set to the mean value of the reconstruction accuracy obtained for the clean speech condition, and the minimum value was set to the mean reconstruction accuracy obtained for the noise floor. The slope and the midpoint of the sigmoid were estimated using a non-linear least-square fitting procedure with the “lsqcurvefit” function in MATLAB. For this approach to be valid, an increase in reconstruction accuracy as a function of increasing SNR is required. Therefore, a permutation test was performed to test whether the reconstruction accuracy values obtained for the conditions SRTbeh + 2 dB, SRTbeh + 4 dB, and clean speech were significantly higher than the noise floor and the reconstruction accuracy values obtained for the conditions SRTbeh − 4 dB and SRTbeh − 2 dB, with α = 0.05. A fit was conducted only if a significant increase in reconstruction accuracy was found. Furthermore, a fit was not performed if the mean of the reconstruction accuracy in more than two conditions was above the maximum value p (i.e., above the mean reconstruction accuracy obtained for clean speech) or if the mean of the reconstruction accuracy in more than two conditions was below the minimum value b (i.e., below the noise floor). The slope was restricted to positive values during the fitting procedure. The fitting procedure was repeated 100 times, and the mean values of the parameters obtained in the 10 fits with the highest R2 values were used as the function parameters. The resulting SRTneuro values outside the boundaries of [−40, +40] dB were discarded, as they were deemed to be unrealistic estimations, which was the case for three participants in four different conditions (1.3% of the estimations across all electrode configurations and participants).
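The sigmoid fit of Equation (1), with p and b fixed and only the slope s and midpoint m free, can be sketched as follows (Python/SciPy; `curve_fit` substitutes for MATLAB's `lsqcurvefit`, and the repeated-fit/averaging and validity-check steps are omitted):

```python
import numpy as np
from scipy.optimize import curve_fit

def fit_srt_neuro(snr, acc, p, b):
    """Fit S(SNR) = (p - b) / (1 + exp(4*s/(p - b) * (m - SNR))) + b with the
    maximum p (clean-speech accuracy) and minimum b (noise floor) fixed.
    The slope is restricted to positive values and the midpoint to
    [-40, +40] dB, as in the text. Returns the midpoint m, i.e., SRTneuro."""
    def sigmoid(x, s, m):
        return (p - b) / (1.0 + np.exp(4.0 * s / (p - b) * (m - x))) + b
    (s, m), _ = curve_fit(sigmoid, snr, acc,
                          p0=[0.01, float(np.median(snr))],
                          bounds=([1e-6, -40.0], [np.inf, 40.0]))
    return m
```

On noiseless data generated from Equation (1), the fitted midpoint matches the generating value.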

The SRTneuro estimation was evaluated using three performance measures: i) the percentage of valid SRTneuro estimations, where higher percentages indicate the method's applicability to more participants; ii) the number of participants with a difference between SRTbeh and SRTneuro within ±3 dB, where higher numbers indicate better precision; and iii) the SD of the difference between SRTbeh and SRTneuro, where lower values indicate better precision.
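The three performance measures above can be computed as follows (Python sketch; function and variable names are ours, and invalid estimates are represented as NaN for illustration):

```python
import numpy as np

def srt_performance(srt_beh, srt_neuro):
    """srt_neuro may contain NaN for participants without a valid estimate.
    Returns (i) % of valid SRTneuro estimations, (ii) % of valid estimates
    within +/-3 dB of SRTbeh, and (iii) SD of the SRTbeh - SRTneuro
    difference over the valid estimates."""
    srt_beh = np.asarray(srt_beh, float)
    srt_neuro = np.asarray(srt_neuro, float)
    valid = ~np.isnan(srt_neuro)
    diff = srt_beh[valid] - srt_neuro[valid]
    pct_valid = 100.0 * valid.mean()
    pct_within = 100.0 * np.mean(np.abs(diff) <= 3.0)
    return pct_valid, pct_within, diff.std(ddof=1)
```

For example, with four participants of whom one has no valid estimate and one valid estimate deviates by 4 dB, the measures are 75% valid, two-thirds within ±3 dB, and the SD of the three valid differences.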

The performance of the SRTneuro estimation was evaluated by calculating the difference between SRTbeh and SRTneuro. To test whether the mean of this difference differed significantly between the age groups, a permutation test was conducted with a precision of 0.01 and a significance level of 5%. The null hypothesis was that there is no difference in the means, and the alternative hypothesis was that there is a difference. To address whether there was a significant difference in the variance of these differences between the two age groups, a two-sample F-test for equal variances was performed. The null hypothesis for the F-test was that the differences in the two groups come from normal distributions with the same variance, and the alternative hypothesis was that they come from normal distributions with different variances.

Results

The measured pure-tone thresholds for the ONH group are shown in Figure 1a. The SRTbeh in the ONH group ranged from −5.4 dB to −4.0 dB, with a mean of −4.8 dB (see Figure 1b). To investigate age-related differences, this dataset was compared with data from 20 YNH participants collected in a previous study; in this group, the SRTbeh ranged from −6.0 to −3.3 dB with a mean of −5.4 dB (see Figure 1b; for details see Borges et al., 2025).

Figure 1.

(A) Pure-tone thresholds obtained in the left and right ear for each ONH participant (thin lines), along with the mean and standard deviation across the population (bold lines and error bars). The inclusion criteria for the “better” ear are shown with a red dotted line, and the maximum threshold allowed for the other ear (10 dB higher) is shown as a solid red line. (B) Boxplot of the measured SRTbeh values for both ONH and YNH participants. The red line indicates the median, the edges of the blue box represent the 25th and 75th percentiles, the black whiskers mark the most extreme non-outlier values, and red crosses mark outliers, defined as values more than 1.5 times the interquartile range away from the bottom or top of the box. SRTbeh values for individual participants are shown as black circles (horizontally jittered for improved readability).

In Figure 2a, the resulting grand average TRFs for the two age groups are shown. Higher overall amplitudes are observed for the ONH individuals (solid lines) compared to the YNH individuals (dotted lines), while no clear differences in latency are evident between the two groups. TRFs for the individual participants can be found in Supplementary Material S2. The number of conditions for which a given component was not included on the participant level, due to an R2 for the Gaussian function being at or below 0.5 or the prediction accuracy not being significantly above the noise floor, is reported in Supplementary Material S3.

Figure 2.

(A) The grand average TRF for the ONH individuals (solid lines) and the YNH individuals (dotted lines) in the −100 to 400 ms time window for all six SNRs. (B) T-values from the cluster permutation test comparing the mean prediction accuracies between the two age groups (ONH − YNH) within each condition at the participant level. Electrodes included in the significant cluster are marked with a star symbol.

Figure 2b shows topography plots of T-values from the cluster permutation test, comparing the mean prediction accuracies between the two age groups (ONH − YNH) for each SNR condition. A centrally located positive cluster is consistently present across all SNR conditions, indicating higher prediction accuracies for the ONH group than for the YNH group. Results from the Wald test, summarized in Table 2, indicate that age group was a significant predictor of the amplitude of all components (P1, N1, P2), but not of latency. Moreover, SNR was found to be a significant predictor of the latency of all components and of the amplitude of N1 and P2.

Table 2.

The Results From the Mixed Linear Model Analyzing TRF Amplitudes and Latencies Relative to SNR and Age Group, Along with the Corresponding Coefficients and Results From the Wald Test.

Component Parameter Fixed effect Coefficient t-value Degrees of freedom Std. error p-Value
P1 Amplitude SNR 0.004 0.695 101 0.006 0.4887
Age Group 0.353 4.299 38 0.082 0.0001
Latency SNR −1.476 −2.975 101 0.496 0.0037
Age Group 7.068 2.074 38 3.408 0.0449
N1 Amplitude SNR −0.043 −6.276 101 0.007 < 10^−4
Age Group −0.406 −3.239 38 0.126 0.0025
Latency SNR −2.485 −10.345 101 0.240 < 10^−4
Age Group 1.494 0.426 38 3.503 0.6722
P2 Amplitude SNR 0.039 6.404 101 0.006 < 10^−4
Age Group 0.273 2.781 38 0.098 0.0084
Latency SNR −3.75288 −5.634 101 0.666 < 10^−4
Age Group 8.65695 1.761 38 4.916 0.0863

Statistically significant coefficients are shown in bold font.

An increased overall reconstruction accuracy for the ONH individuals can be observed in Figure 3. The permutation test, with the null hypothesis that the mean reconstruction accuracy of the ONH individuals is not larger than that of the YNH individuals, yielded a p-value of 0.0002. This result indicates that the ONH group exhibited significantly larger mean reconstruction accuracy than the YNH group.

Figure 3.

Reconstruction accuracy for the YNH and ONH groups, along with the noise floor for each group. The mean for each group is shown in bold, and the shaded areas represent ±1 standard deviation around the group mean.

Table 3 and Figure 4 show that all ONH individuals obtained an SRTneuro within 3 dB of their SRTbeh when using the Scalp electrode configuration. The SD of the difference between the SRTbeh and the SRTneuro was 1.2 dB, with a median of 0.1 dB. A similar trend was observed for the YNH individuals. For the Scalp configuration, the permutation test with the null hypothesis of no between-group difference in the mean of the difference between SRTbeh and SRTneuro was non-significant (p = 0.7411), indicating no evidence of a difference in mean values and thus in SRT estimation precision. Additionally, a two-sample F-test for equal variances of the difference, with a null hypothesis of no difference in variance, was non-significant (p = 0.5818), suggesting no evidence of a difference in variance.

Table 3.

Comparison of the Difference Between SRTbeh and SRTneuro for the ONH and YNH Individuals.

Younger normal-hearing individuals Older normal-hearing individuals
Valid estimations [%] Within 3 dB [%] SD [dB] Valid estimations [%] Within 3 dB [%] SD [dB]
Ear 45 30 3.05 41 23 2.67
AEar 40 30 2.58 41 32 1.87
M 35 25 2.83 50 36 3.20
T 70 55 2.58 77 36 3.64
EarM 70 30 6.80 45 36 2.13
AEarM 55 35 4.21 55 32 2.63
EarT 95 55 6.75 77 64 3.75
AEarT 85 70 2.04 82 68 2.44
MT 90 85 1.65 91 86 1.86
EarMT 100 70 2.76 82 68 2.80
AEarMT 100 95 1.80 86 77 2.10
LAEarMT 65 50 4.58 86 73 2.10
RAEarMT 95 95 1.56 77 59 4.31
Scalp 100 95 1.37 100 100 1.21

The first column shows the percentage of valid SRTneuro estimates out of all included participants in the age group. The following columns show the percentage of participants with a difference between SRTbeh and SRTneuro within 3 dB out of all valid SRTneuro estimates in the age group, and the SD of the difference between SRTbeh and SRTneuro for the valid SRTneuro estimates in the age group.

Figure 4.

The Upper Panel Shows a Bar Plot of the Percentage of Participants for Whom a Valid SRTneuro was Obtained, with Data from the YNH Individuals Shown in Blue and Data from the ONH Individuals Shown in Red. The Lower Panel Depicts a Boxplot of the Difference Between SRTbeh and SRTneuro; the Median is Shown as a Red Line, the 25th and 75th Percentiles are Indicated by the Boxes, and the Whiskers Indicate Extreme Values for Non-Outliers. The Participant-Specific Data Points are Depicted as Black Circles. Outliers are Defined as Datapoints more than 1.5 Times the Interquartile Range Away from the Top or Bottom of the Box and are Marked with a Red Cross. 3-dB Limits are Shown as Green Dotted Lines.

For the ONH individuals, SRTneuro estimates based on in-ear EEG electrodes (Ear and AEar) were obtained for 41% of participants, with SDs of 2.7 dB (Ear) and 1.9 dB (AEar). The proportion of participants with SRTneuro values within ±3 dB of SRTbeh was 23% for Ear and 32% for AEar. These results closely resembled those obtained for the YNH individuals. Estimations obtained with the mastoid electrodes only (M) were roughly similar. The percentage of SRTneuro estimates increased substantially for the ONH group when using the temporal electrodes (T). However, in the T configuration, the proportion of ONH participants with an SRTneuro within ±3 dB of the SRTbeh was similar to that in the in-ear configurations (Ear and AEar) and equal to that in the mastoid (M) configuration, suggesting that the higher percentage of estimates obtained with the T configuration came at the expense of reduced precision.

When combining the electrodes from the in-ear EEG configurations (Ear and AEar) with the mastoids (EarM and AEarM), the results for the ONH individuals were very similar to those obtained with the mastoid or in-ear electrodes separately. In contrast, combining the in-ear and mastoid electrodes (EarM) for the YNH individuals increased the number of reliable SRTneuro estimations compared to the M and Ear configurations. However, the percentage of participants within 3 dB for the EarM configuration was similar to that obtained for the Ear and M configurations, and the SD increased relative to both. When comparing AEarM to the separate configurations (AEar and M) in the YNH individuals, a similar trend was observed: an increase in reliable SRTneuro estimations, but a higher SD for AEarM than for AEar and M.

When combining electrodes from the Ear configurations (AEar and Ear) with the temporal electrodes, the number of estimates roughly doubled, and the number of ONH participants with an SRTneuro within 3 dB of the SRTbeh roughly doubled as well. Furthermore, the SD decreased compared to the T configuration but increased compared to the Ear and AEar configurations. A similar trend was found for the YNH individuals. When combining the mastoid and temporal electrodes (MT), very similar results were obtained for the ONH and YNH individuals, with better SRTneuro estimation than for the configurations using the temporal and mastoid electrodes separately (M and T).

The electrode configurations combining the in-ear electrodes with the mastoid and temporal electrodes (EarMT and AEarMT) did not improve the SRTneuro estimation for the ONH participants compared to the MT configuration. For the YNH participants, on the other hand, the AEarMT configuration yielded more SRTneuro estimates within 3 dB of the SRTbeh and more reliable SRTneuro estimates than MT, along with a minor increase in SD. For EarMT, an increase in reliable SRTneuro estimates was found, but also an increase in SD and a decrease in the number of SRTneuro estimates within ±3 dB of the SRTbeh.

Using only electrodes from one side of the head for the ONH individuals revealed similar results for the left side (LAEarMT) compared to both sides (AEarMT), while a slight decrease in SRTneuro estimation quality was observed for the right side (RAEarMT) compared to both sides (AEarMT).

For the YNH individuals, the right side (RAEarMT) showed only minor differences in estimation quality compared to both sides (AEarMT), whereas the left side (LAEarMT) yielded lower estimation quality than when using electrodes from both sides (AEarMT). In the YNH individuals, RAEarMT performed almost identically to the Scalp configuration. This was not the case for the ONH individuals, where the best-performing side (LAEarMT) yielded a lower overall number of SRTneuro estimates (86% vs. 100%), fewer SRTneuro estimates within 3 dB of the measured SRTbeh (73% vs. 100%), and a higher SD of the difference between SRTneuro and SRTbeh (2.1 dB vs. 1.2 dB) compared to the Scalp configuration.

Discussion

Summary of the Main Results

A statistically significant increase in stimulus reconstruction accuracy was observed for the ONH compared to the YNH individuals (see Figure 3), along with enhanced EEG prediction accuracy at fronto-central electrodes for the ONH individuals (see Figure 2b). The ONH group showed enhanced amplitudes of the P1, N1, and P2 components relative to the YNH group (i.e., a positive fixed effect for P1 and P2 and a negative effect for N1, Table 2). The SNR of the presented stimuli was a significant predictor of the latency of all components and of the amplitude of N1 and P2.

Regarding SRT estimation, no statistically significant difference between the two age groups was found for the mean and variance of the differences between SRTbeh and SRTneuro when using the Scalp configuration, indicating that there was no difference in SRT estimation quality across the two groups. In the in- and around-ear configurations, the difference between SRTbeh and SRTneuro was generally small between the two age groups, with some exceptions: (i) the SRT estimations for YNH individuals improved more when temporal electrodes were used as compared to in-ear electrodes than for ONH individuals, (ii) an increase in the number of SRTneuro estimates was observed when using the EarM configuration compared to M and Ear in the YNH individuals but not in the ONH individuals, (iii) when using electrodes from only one side of the head, the best SRTneuro estimate for the ONH individuals was obtained using the left side (LAEarMT configuration), whereas the best estimate for the YNH individuals was obtained using the right side (RAEarMT configuration), (iv) the estimation quality for the AEarMT/RAEarMT configuration was similar to the Scalp configuration (i.e., very high) for YNH individuals whereas it was slightly reduced as compared to the Scalp configuration for the ONH individuals.

Age-Related Differences in Reconstruction Accuracy and TRFs

The TRFs in Figure 2a show an increase in amplitude for the ONH individuals compared to the YNH individuals, which was confirmed by the statistical test showing that age group was a significant predictor of amplitude for all components (P1, N1, and P2). This increase in amplitude, and thereby in the SNR of the EEG signal, likely contributed to the enhanced reconstruction accuracy observed in the ONH individuals. The SNR of the stimuli was a significant predictor of the latency of all components and of the amplitude of N1 and P2. Caution is advised when interpreting changes in the latency and amplitude of peaks and troughs in the TRF waveform, as these features do not directly reflect the amplitude and latency of the underlying neural components. For instance, a change in the amplitude of a single neural component can affect the latencies and amplitudes of multiple features in the TRF waveform (Luck & Kappenman, 2012). Furthermore, this study cannot specify the underlying causes of the observed differences, as the analysis conducted here only confirms their existence; identifying the true causes would require additional experiments and is beyond the scope of the present study. However, the increase in latency with decreasing SNR may reflect the additional neural processing required in low-SNR conditions, which could delay the neural response. The decrease in N1 and P2 amplitude magnitude with lower SNR likely reflects the increasing difficulty of tracking the speech envelope: as the envelope becomes masked by additive noise, neural tracking deteriorates, resulting in a lower TRF amplitude magnitude. In this study, no evidence was found to support SNR as a reliable predictor of the P1 amplitude, suggesting that the P1 response to the speech-stimulus envelope is less affected by SNR.
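Extracting the amplitude and latency of TRF deflections like those discussed above is typically done by searching fixed time windows. The sketch below illustrates the idea on a synthetic waveform; the window bounds are hypothetical and vary between studies, so they should not be read as the values used here:

```python
import numpy as np

# Hypothetical search windows (ms) for the three TRF deflections;
# actual windows differ between studies and analysis pipelines.
WINDOWS = {"P1": (30, 80), "N1": (90, 160), "P2": (170, 250)}

def trf_peaks(trf, times_ms):
    """Extract amplitude and latency of P1/N1/P2-like TRF deflections.

    P1 and P2 are taken as the maximum within their window,
    N1 as the minimum within its window.
    """
    out = {}
    for name, (lo, hi) in WINDOWS.items():
        mask = (times_ms >= lo) & (times_ms <= hi)
        seg, seg_t = trf[mask], times_ms[mask]
        i = np.argmin(seg) if name == "N1" else np.argmax(seg)
        out[name] = {"amplitude": float(seg[i]), "latency_ms": float(seg_t[i])}
    return out
```

As noted in the text, such window-based features should be interpreted with care, since overlapping components can shift each other's apparent peaks.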

Enhanced envelope reconstruction accuracy in ONH compared to YNH individuals has been observed in previous studies (Decruy et al., 2019; Karunathilake et al., 2023; McClaskey, 2024; Presacco et al., 2016), as has an increase in TRF peak amplitudes (Karunathilake et al., 2023; Panela et al., 2024). In contrast to the present results, however, prediction accuracy has been found to decrease with age (Gillis et al., 2023); this discrepancy could be due to methodological differences, namely the use of spectrogram and acoustic-onset predictors instead of the envelope. The enhancement of the auditory response in ONH individuals has been speculated to result from an excitatory/inhibitory imbalance (Alain et al., 2014) leading to cortical hyperactivity in the auditory cortex, including increased spontaneous neural firing, increased synchronization among neurons, and enhanced sound-evoked responses (Herrmann & Butler, 2021).

Yet, at least for neural fundamental-frequency tracking, there is evidence that this enhancement could be a cortical effect of hearing loss rather than of increasing age (Van Canneyt et al., 2021). Animal models have furthermore shown frequency-specific increases in spontaneous neuronal firing rate following noise exposure, linking the hyperactivity to hearing loss (Eggermont, 2015; Eggermont & Tass, 2015; Seki & Eggermont, 2003). Similar hyperactivity has also been observed in aging animal models in the absence of noise exposure. In these cases, age-related inner-ear dysfunctions such as degeneration of hair cells, the stria vascularis, and spiral ganglion cells are believed to underlie the observed hyperactivity (Bao & Ohlemiller, 2010; Dubno et al., 2013; Gratton & Vázquez, 2003; Keithley, 2020; Moore, 1987; Plack, 2014; Schmiedt, 2010).

Another potential explanation is recruitment of additional cortical regions to process the same stimuli in ONH, compensating for a reduction in the specialized processing regions (Brodbeck et al., 2018; Peelle et al., 2010). Karunathilake et al. (2023) also investigated the latency changes of M50trf, M100trf, and M200trf, that is, the MEG counterparts of the P1, N1, and P2 deflections discussed in the current study. They found significant noise-related delays in latencies of these components, which support the findings of the current study. It is important to note that the study by Karunathilake et al. (2023) utilized babble noise, whereas the present study employed steady-state speech-shaped noise. It is promising for the application of the SRTneuro method that some of the same neurological changes were also found when using more naturalistic noise such as babble noise in the stimuli. Other studies have also reported that the latencies of TRF deflections decrease and their amplitudes increase in magnitude with higher SNR levels in YNH individuals using MEG (Ding & Simon, 2013) and in preschool children using EEG (Van Hirtum et al., 2023).

Age-Related Differences in SRTneuro Estimation

There was no statistical evidence for an age-group difference in the mean or variance of the differences between SRTbeh and SRTneuro when using the Scalp configuration, despite the overall increase in reconstruction accuracy for the ONH individuals shown in Figure 3. This suggests that the midpoint of the fitted sigmoid function (the SRTneuro) remains unaffected by changes in overall reconstruction accuracy, which is a highly desirable characteristic of the SRTneuro estimation method. Furthermore, the observation that the variance of the estimate is also independent of the reconstruction accuracy level demonstrates the robustness of the estimation method to variability in reconstruction accuracies across individuals. A possible explanation for this robustness lies in the nature of the sigmoid fitting procedure, which takes the individual reconstruction accuracy level into account and can thus be considered a form of normalization of the individual participant's neural tracking strength. However, it remains somewhat unclear why an improvement in reconstruction accuracy does not translate into a higher percentage of reliably estimated SRT values.
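The sigmoid fitting procedure can be illustrated as follows. This is a sketch only: the four-parameter logistic, the starting values, and the example parameter values are illustrative assumptions, not the study's exact implementation:

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(snr, floor, amp, midpoint, slope):
    """Four-parameter logistic: reconstruction accuracy as a function of SNR."""
    return floor + amp / (1.0 + np.exp(-slope * (snr - midpoint)))

def estimate_srt_neuro(snrs, accuracies):
    """Fit the sigmoid to accuracy-vs-SNR points; the midpoint is the SRT estimate."""
    snrs, accuracies = np.asarray(snrs, float), np.asarray(accuracies, float)
    p0 = [accuracies.min(), np.ptp(accuracies), np.median(snrs), 1.0]
    params, _ = curve_fit(sigmoid, snrs, accuracies, p0=p0, maxfev=10000)
    return params[2]  # midpoint = SRTneuro
```

Because the floor and amplitude are free parameters, a participant with uniformly higher reconstruction accuracies can yield the same midpoint, which is one way to understand the normalization property discussed above.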

In the current study, the SRTneuro obtained from electrodes in and around the ears showed comparable results for the two age groups, with a few exceptions (see Figure 4 and Table 3). For the YNH individuals, the number of reliable SRTneuro estimates increased when using the EarM configuration compared to M or Ear alone. However, there was no increase in the number of YNH participants with an SRTneuro within ±3 dB of the SRTbeh, and the SD of the difference between SRTneuro and SRTbeh increased. This suggests that the increase in reliable SRTneuro estimates came at the expense of overall precision. This trend was not observed in the ONH individuals, indicating that the difference between YNH and ONH may be due to random fluctuations. The YNH individuals benefited more substantially from using only temporal electrodes rather than in-ear electrodes than the ONH individuals did. This could be due to an enhanced response in the ONH individuals in areas outside the core auditory cortex (Brodbeck et al., 2018): when the responding area is broader, neighboring channels may capture more synchronized activity, so the T electrodes provide less additional information because the recorded activity is similar between T and in-ear electrodes.

When using electrodes from only one side of the head, the best SRTneuro estimates in the ONH individuals were found on the left side (LAEarMT configuration), whereas for the YNH individuals they were found for the right side (RAEarMT). A study by Brodbeck et al. (2018) compared the prediction accuracy of the envelope between ONH and YNH individuals and found increased prediction accuracy in ONH, particularly pronounced in the left temporal lobe. This increased left-sided activity with age could explain the benefit of estimating SRTneuro from left-side electrodes for the ONH individuals but not for the YNH individuals, as observed in the current study. This is further supported by the fact that the cluster permutation test in the current study revealed enhanced prediction accuracy for the ONH in a fronto-central area across all SNR conditions, with more electrodes in the cluster around the left ear than around the right (see Figure 2b). In the current study, using only electrodes from the "better" side of the head did not yield SRTneuro estimation performance comparable to that obtained with all scalp electrodes for the ONH individuals, whereas it did for the YNH individuals. This could be due to enhanced recruitment of neurons in areas close to the temporal lobe in ONH, resulting in a smaller difference between the potentials measured by neighboring channels for the ONH compared to the YNH individuals (Brodbeck et al., 2018).

In the current study, the SNRs were chosen in 2-dB steps around the SRTbeh. Previous work (Borges et al., 2025) investigated whether this SNR selection strategy biased the SRTneuro estimation. To assess this, the same SRTneuro estimation methods were applied to data sets simulated from the same underlying function but sampled over different SNR ranges; specifically, the SNR range was shifted from −4 dB to +4 dB in 1-dB steps (nine distinct SNR sets). This analysis showed no evidence that the SNR selection biased the SRTneuro estimation.
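The spirit of that bias check can be sketched as a small simulation: sample the same underlying sigmoid on shifted SNR grids, refit, and compare the recovered midpoints. All parameter values below are illustrative assumptions, and the noiseless sampling is a simplification of the cited analysis:

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(snr, floor, amp, mid, slope):
    return floor + amp / (1.0 + np.exp(-slope * (snr - mid)))

TRUE = (0.02, 0.08, -4.0, 0.8)  # illustrative floor, amplitude, SRT, slope

def midpoint_for_shift(shift_db):
    """Sample the same underlying sigmoid on a shifted 2-dB SNR grid and refit."""
    snrs = np.arange(-8.0, 9.0, 2.0) + TRUE[2] + shift_db
    acc = sigmoid(snrs, *TRUE)
    p0 = [acc.min(), np.ptp(acc), np.median(snrs), 1.0]
    params, _ = curve_fit(sigmoid, snrs, acc, p0=p0, maxfev=10000)
    return params[2]

# Nine grids shifted from -4 to +4 dB; with noiseless data the recovered
# midpoint should not depend on where the grid is placed.
midpoints = [midpoint_for_shift(s) for s in range(-4, 5)]
```

In this idealized setting the midpoint is recovered regardless of the grid shift; with measurement noise, the cited analysis addressed the same question empirically.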

Application

SRTneuro provides a continuous measure of the SRT, offering new opportunities for real-world assessment and adaptation in hearing care. Logging of SRTneuro during daily-life situations could inform optimization of hearing-aid performance and support personalized rehabilitation strategies. Furthermore, continuous assessment of SRTneuro based on uncontrolled natural speech could enable hearing aids to dynamically adjust their performance in real time to optimize the user's speech intelligibility.

The present study shows that the SRTneuro can be estimated from in-ear EEG alone in ONH individuals with similar precision as in YNH individuals, especially when electrodes around the ear are also included. The SRTneuro estimation was independent of age group even though higher reconstruction accuracies were observed in the ONH individuals. This is an advantage for automatic SRTneuro estimation, as it suggests that the estimation method is robust with respect to effects of age on reconstruction accuracy and therefore does not need to be specifically tailored to different age groups. The SRTneuro can be estimated using electrodes in and around the ear from one side of the head, with slightly lower precision in the ONH individuals compared to using all scalp electrodes, but with the same precision as the full scalp configuration in the YNH individuals. However, the better side for precise SRTneuro estimation when using only electrodes in and around one ear differed between the ONH and YNH groups. If an SRTneuro measurement platform were to be used across age groups, there are thus three options: (i) electrodes from both ears could be used for the SRTneuro estimation, (ii) the better ear could be identified for each participant and used for the SRTneuro estimation, or (iii) the ear used for the SRTneuro estimation could be determined by age. Since hearing impairment is associated with higher age (Gratton & Vázquez, 2003), and since the ONH group in the current study obtained more precise SRTneuro estimates when using electrodes from the left side, a left-sided electrode placement would likely yield good results for most hearing-aid users. Using electrodes from only one side of the head makes it possible to obtain the SRTneuro without connecting the two hearing aids and would therefore be more practical.
Another solution that does not require a connection between the two hearing aids could be the use of two different reference systems (one for each side) as input for the SRTneuro estimation; this method has not been explored in the current study.

Applying the proposed method in an actual clinical context, where the behavioral SRT (SRTbeh) is unknown, the method would likely require sampling across a broader range of SNRs to adequately capture the informative portion of the underlying sigmoid function. This would come at the cost of increased measurement duration.

In the ONH population, only the Scalp electrode configuration yielded an SRTneuro estimate for all participants. This is due to the quality requirements implemented for the sigmoid fit and reconstruction accuracy datapoints. If ear-EEG electrode configurations were to be implemented in practice, these quality requirements should be revisited to ensure that SRTneuro fits are only conducted when the reconstruction accuracy is reliably tracked. Furthermore, if the increase in reconstruction accuracy is not reliably tracked, additional data could be recorded and included until this is the case.

The method would likely be improved by training the model on more data, since the standard error of the mean generally decreases with 1/√n (Kirkwood & Sterne, 2009; Mesik & Wojtczak, 2022; Wilroth et al., 2023); thus, conducting more trials would most likely increase the number of SRT values that can be accurately estimated. Moreover, adding predictors such as phoneme onsets and the spectrogram to the model would most likely also increase the number of accurately estimated SRT values (Lesenfants et al., 2019). If the model is further improved, this could enhance the usability of estimating SRTneuro with electrodes in and around the ear. If the SRTneuro is estimated in a hearing aid, the hearing-aid signal processing could adapt dynamically to improve speech understanding for the individual user when necessary. The SRT value could also be logged in the user's natural environment and thus support rehabilitation. Estimating the SRTneuro in a real-life setting would most likely require more data than testing in a laboratory setting. This is not necessarily a drawback, since much more data can be collected outside the laboratory, and ear-EEG allows long-term and discreet monitoring (Kidmose et al., 2012). Furthermore, electrodes positioned in and around the ear could offer additional benefits for hearing-aid feedback, including decoding the attended speaker (Alickovic et al., 2019; Fiedler et al., 2017; Mirkovic et al., 2016; Nguyen et al., 2025; Rotaru et al., 2024; Tanveer et al., 2024) and estimating hearing thresholds (Bech Christensen et al., 2018; Christensen et al., 2018; Sergeeva et al., 2024) and other audiometric features.
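The 1/√n scaling of the standard error can be illustrated with a toy simulation; the population, sample sizes, and repetition count below are arbitrary choices for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_sem(n, reps=2000, sigma=1.0):
    """Empirical standard error of the mean for samples of size n,
    estimated as the standard deviation of many sample means."""
    means = rng.normal(0.0, sigma, size=(reps, n)).mean(axis=1)
    return means.std()

# Quadrupling the sample size should roughly halve the standard error,
# consistent with the 1/sqrt(n) scaling.
ratio = empirical_sem(25) / empirical_sem(100)
```

The ratio comes out close to 2, i.e., four times the data buys half the uncertainty, which is why longer recordings should sharpen the SRTneuro estimate.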

Limitations

The SRTneuro estimation relies on the envelope-following response in the EEG rather than on speech-intelligibility scores. It is important to note that the envelope-following response reflects an encoding of acoustic features, not necessarily comprehension. While the two measures are highly correlated (Ding & Simon, 2013; Iotzov & Parra, 2019; Shannon et al., 1995; Vanthornhout et al., 2018), an envelope-following response is likely a prerequisite for, rather than a guarantee of, speech intelligibility. For example, the SRTneuro could potentially still be estimated when participants listen to speech in an unfamiliar language, where speech understanding is absent.

The study is limited by the amount of data collected. Given that in-ear EEG is well-suited for long-term monitoring, the collection of additional data should enhance the decoding model and improve the precision of the SRTneuro estimation.

If the SRTneuro were implemented in a hearing aid, many unknown factors could influence the estimation, such as different types of noise, variations in room acoustics, and the acoustic features of the target speech. Furthermore, the method has not been explored in hearing-impaired individuals, in whom a larger variation in SRTs is expected; to strengthen the generalizability of the findings, a logical next step would therefore be to conduct a study with hearing-impaired individuals. This would allow validation across a wider SRT range and provide insights into individual differences. In the current study, the YNH and ONH groups had relatively balanced hearing threshold levels, and the behavioral SRTs measured in the two groups were very similar, with little interindividual variation. However, it should be noted that differences in speech intelligibility and other psychoacoustic measures between younger and older listeners with normal hearing are expected (see Goossens et al., 2017; Regev, Oxenham, et al., 2025; Regev, Zaar, et al., 2025; Working Group on Speech Understanding & Aging, 1988), although these differences may not necessarily emerge in the conditions used in the current study (speech mixed with speech-shaped noise).

It is unclear how left/right imbalances in hearing loss may affect the quality of the SRTneuro estimation from each side. A study by Presacco et al. (2019) compared the reconstruction accuracy of the envelope between ONH individuals and older hearing-impaired individuals and found no significant differences between the two populations. However, one study found that elevated hearing thresholds and impaired speech intelligibility were associated with an increased correlation between the EEG and the amplitude envelope of the presented stimuli (Schmitt et al., 2022), and an increase in cortical responses to sound in hearing-impaired compared to normal-hearing older adults has also been observed (Alain et al., 2014; Millman et al., 2017; Tremblay et al., 2003). In the current study, enhanced reconstruction accuracy did not affect the SRTneuro estimation quality. However, changes in neural activation due to hearing loss may interact differently with the SRT estimation method used here than changes in neural activation due to age.

The current study demonstrates the feasibility of estimating SRTneuro using electrodes in and around the ear. While integrating advanced technology such as ear-EEG into hearing aids could pave the way for neuro-steered hearing aids, with SRTneuro estimation serving as a concrete example of the potential opportunities, incorporating such technologies also introduces challenges that must be carefully balanced against user benefits and economic costs. Several of these challenges, though beyond the scope of the current study, are worth highlighting: (i) increased power consumption in an already power-constrained device; (ii) limited physical space for integrating electrodes and supporting electronics; (iii) vulnerability to electromagnetic interference from both internal and external sources; (iv) reduced control over recording conditions in real-world settings, where environmental noise may affect the SNR; (v) privacy and ethical concerns related to continuous EEG recording. Such considerations are crucial for the successful implementation and acceptance of neuro-steered hearing aids.

Conclusion

The ONH individuals showed similar SRTs to their YNH peers while also exhibiting an overall increase in envelope reconstruction accuracy. However, the precision of the SRTneuro estimate did not significantly differ between the two age groups. For scalp EEG, the SRTneuro was estimated with good precision in all participants in both the YNH and ONH groups. When restricting the estimation to in-the-ear electrodes, the proportion of individuals with an estimated SRTneuro decreased to 45% and 41% for the YNH and ONH groups, respectively. When combining in-the-ear and around-the-ear electrodes, the maximum percentage of individuals with an estimated SRTneuro was 100% for the YNH and 86% for the ONH group. An analysis of the spatiotemporal responses through the TRF revealed that the ONH group exhibited increased amplitudes of the P1 (∼50 ms), N1 (∼120 ms), and P2 (∼200 ms) deflections compared to the YNH group. TRF latencies decreased with increasing SNR, while the amplitudes of the N1 and P2 deflections increased with increasing SNR. Overall, these findings demonstrate the robustness of the SRTneuro estimation method with regard to age-related changes in neural speech-envelope tracking.

Supplemental Material

sj-pdf-1-tia-10.1177_23312165251372462 - Supplemental material for Age-Related Differences in EEG-Based Speech Reception Threshold Estimation Using Scalp and Ear-EEG

Supplemental material, sj-pdf-1-tia-10.1177_23312165251372462, for Age-Related Differences in EEG-Based Speech Reception Threshold Estimation Using Scalp and Ear-EEG by Heidi B Borges, Emina Alickovic, Christian B Christensen, Preben Kidmose and Johannes Zaar in Trends in Hearing

Acknowledgments

The authors would like to thank all the participants who dedicated their time to be part of the study. They would also like to thank Alberte Hygum Valsted, Sven-Gustav Thiesen and Josefine Hjort for their support during data collection, Jesper Trolle and Ingelise Nielsen for their invaluable assistance in producing the ear-EEG molds, and Lorenz Fiedler for his assistance with the statistical analysis. Lastly, they thank the William Demant Foundation for making this study possible.

Footnotes

Ethical Considerations: The experimental protocols were approved by the Institutional Review Board (IRB) of Aarhus University (Approval number 2023-014) on September 29, 2023. Informed consent was obtained prior to participation. The participants were given the option to refuse to participate by opting out at any point of the study. The participants were given the option to withdraw their data prior to anonymization.

Consent to Participate: Informed consent from participants was obtained in writing.

Consent for Publication: The informational material provided to participants prior to obtaining their written consent stated that the data would be published in scientific journal articles.

Authors’ contributions: Heidi B Borges did conceptualization, methodology, software, formal analysis, investigation, data curation, writing—original draft, visualization, funding acquisition. Emina Alickovic did conceptualization, methodology, software, writing—review and editing, supervision, funding acquisition. Christian B. Christensen did conceptualization, methodology, resources, writing—review and editing, supervision. Preben Kidmose did conceptualization, methodology, writing—review and editing, supervision, project administration, funding acquisition. Johannes Zaar did conceptualization, methodology, software, writing—review and editing, supervision, project administration, funding acquisition.

Funding: The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the William Demant Foundation [Grant number 21-2912].

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement: Data can be provided upon reasonable request.

Supplemental Material: Supplemental material for this paper is available online.

References

  1. Alain C., Roye A., Salloum C. (2014). Effects of age-related hearing loss and background noise on neuromagnetic activity from auditory cortex. Frontiers in Systems Neuroscience, 8, 8. 10.3389/fnsys.2014.00008
  2. Alickovic E., Lunner T., Gustafsson F., Ljung L. (2019). A tutorial on auditory attention identification methods. Frontiers in Neuroscience, 13, 153. 10.3389/fnins.2019.00153
  3. Bao J., Ohlemiller K. K. (2010). Age-related loss of spiral ganglion neurons. Hearing Research, 264(1–2), 93–97. 10.1016/j.heares.2009.10.009
  4. Bech Christensen C., Hietkamp R. K., Harte J. M., Lunner T., Kidmose P. (2018). Toward EEG-assisted hearing aids: Objective threshold estimation based on EarEEG in subjects with sensorineural hearing loss. Trends in Hearing, 22, 1–13. 10.1177/2331216518816203
  5. Biesmans W., Das N., Francart T., Bertrand A. (2017). Auditory-inspired speech envelope extraction methods for improved EEG-based auditory attention detection in a cocktail party scenario. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 25(5), 402–412. 10.1109/TNSRE.2016.2571900
  6. Borges H. B., Zaar J., Alickovic E., Christensen C. B., Kidmose P. (2024). The speech reception threshold can be estimated using EEG electrodes in and around the ear. bioRxiv, 2024.12.02.625819. 10.1101/2024.12.02.625819
  7. Borges H. B., Zaar J., Alickovic E., Christensen C. B., Kidmose P. (2025). Speech reception threshold estimation via EEG-based continuous speech envelope reconstruction. European Journal of Neuroscience, 61(6), e70083. 10.1111/ejn.70083
  8. Brodbeck C., Presacco A., Anderson S., Simon J. Z. (2018). Over-representation of speech in older adults originates from early response in higher order auditory cortex. Acta Acustica United With Acustica, 104(5), 774–777. 10.3813/AAA.919221
  9. Christensen C. B., Harte J. M., Lunner T., Kidmose P. (2018). Ear-EEG-based objective hearing threshold estimation evaluated on normal hearing subjects. IEEE Transactions on Bio-Medical Engineering, 65(5), 1026–1034. 10.1109/TBME.2017.2737700
  10. Crosse M. J., Di Liberto G. M., Bednar A., Lalor E. C. (2016). The multivariate temporal response function (mTRF) toolbox: A MATLAB toolbox for relating neural signals to continuous stimuli. Frontiers in Human Neuroscience, 10. 10.3389/fnhum.2016.00604
  11. Daneman M., Carpenter P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19(4), 450–466. 10.1016/S0022-5371(80)90312-6
  12. Decruy L., Vanthornhout J., Francart T. (2019). Evidence for enhanced neural tracking of the speech envelope underlying age-related speech-in-noise difficulties. Journal of Neurophysiology, 122(2), 601–615. 10.1152/jn.00687.2018
  13. Ding N., Simon J. Z. (2013). Adaptive temporal encoding leads to a background-insensitive cortical representation of speech. The Journal of Neuroscience, 33(13), 5728–5735. 10.1523/JNEUROSCI.5297-12.2013
  14. Dubno J. R., Eckert M. A., Lee F.-S., Matthews L. J., Schmiedt R. A. (2013). Classifying human audiometric phenotypes of age-related hearing loss from animal models. Journal of the Association for Research in Otolaryngology, 14(5), 687–701. 10.1007/s10162-013-0396-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Eggermont J. J. (2015). Animal models of spontaneous activity in the healthy and impaired auditory system. Frontiers in Neural Circuits, 9, 10.3389/fncir.2015.00019 [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Eggermont J. J., Tass P. A. (2015). Maladaptive neural synchrony in tinnitus: Origin and restoration. Frontiers in Neurology, 6, 10.3389/fneur.2015.00029 [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Farris-Trimble A., McMurray B. (2013). Test–retest reliability of eye tracking in the visual world paradigm for the study of real-time spoken word recognition. Journal of Speech, Language, and Hearing Research, 56(4), 1328–1345. 10.1044/1092-4388(2012/12-0145) [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Fiedler L., Wöstmann M., Graversen C., Brandmeyer A., Lunner T., Obleser J. (2017). Single-channel in-ear-EEG detects the focus of auditory attention to concurrent tone streams and mixed speech. Journal of Neural Engineering, 14(3), 036020. 10.1088/1741-2552/aa66dd [DOI] [PubMed] [Google Scholar]
  19. Fuglsang S. A., Dau T., Hjortkjær J. (2017). Noise-robust cortical tracking of attended speech in real-world acoustic scenes. NeuroImage, 156, 435–444. 10.1016/j.neuroimage.2017.04.026 [DOI] [PubMed] [Google Scholar]
  20. Gillis M., Kries J., Vandermosten M., Francart T. (2023). Neural tracking of linguistic and acoustic speech representations decreases with advancing age. NeuroImage, 267, 119841. 10.1016/j.neuroimage.2022.119841 [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Goossens T., Vercammen C., Wouters J., van Wieringen A. (2017). Masked speech perception across the adult lifespan: Impact of age and hearing impairment. Hearing Research, 344, 109–124. 10.1016/j.heares.2016.11.004 [DOI] [PubMed] [Google Scholar]
  22. Gratton M. A., Vázquez A. E. (2003). Age-related hearing loss: Current research. Current Opinion in Otolaryngology & Head and Neck Surgery, 11(5), 367–371. 10.1097/00020840-200310000-00010 [DOI] [PubMed] [Google Scholar]
  23. Herrmann B., Butler B. E. (2021). Hearing loss and brain plasticity: The hyperactivity phenomenon. Brain Structure and Function, 226(7), 2019–2039. 10.1007/s00429-021-02313-9 [DOI] [PubMed] [Google Scholar]
  24. Iotzov I., Parra L. C. (2019). EEG can predict speech intelligibility. Journal of Neural Engineering, 16(3), 036008. 10.1088/1741-2552/ab07fe [DOI] [PubMed] [Google Scholar]
  25. Kappel S. L., Kidmose P. (2022). Characterization of dry-contact EEG electrodes and an empirical comparison of Ag/AgCl and IrO2 electrodes. Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference, 2022, 3127–3130. 10.1109/EMBC48229.2022.9871923 [DOI] [PubMed] [Google Scholar]
  26. Kappel S. L., Rank M. L., Toft H. O., Andersen M., Kidmose P. (2019). Dry-contact electrode ear-EEG. IEEE Transactions on Bio-Medical Engineering, 66(1), 150–158. 10.1109/TBME.2018.2835778 [DOI] [PubMed] [Google Scholar]
  27. Karunathilake I. M. D., Dunlap J. L., Perera J., Presacco A., Decruy L., Anderson S., Kuchinsky S. E., Simon J. Z. (2023). Effects of aging on cortical representations of continuous speech. Journal of Neurophysiology, 129(6), 1359–1377. 10.1152/jn.00356.2022 [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Keithley E. M. (2020). Pathology and mechanisms of cochlear aging. Journal of Neuroscience Research, 98(9), 1674–1684. 10.1002/jnr.24439 [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Kidmose P., Looney D., Mandic D. P. (2012). Auditory evoked responses from ear-EEG recordings. Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference, 2012, 586–589. 10.1109/EMBC.2012.6345999 [DOI] [PubMed] [Google Scholar]
  30. Kidmose P., Looney D., Ungstrup M., Rank M. L., Mandic D. P. (2013). A study of evoked potentials from ear-EEG. IEEE Transactions on Bio-Medical Engineering, 60(10), 2824–2830. 10.1109/TBME.2013.2264956 [DOI] [PubMed] [Google Scholar]
  31. Kirkwood B. R., Sterne J. A. C. (2009). Essential medical statistics (2nd ed., [Nachdruck]). Blackwell Science. [Google Scholar]
  32. Kochkin S. (2002). Consumers rate improvements sought in hearing instruments. Hearing Review, 9(11), 18–22. https://hearingreview.com/hearing-products/marketrak-vi-consumers-rate-improvements-soughtnbsp-in-hearing-instruments [Google Scholar]
  33. Lesenfants D., Vanthornhout J., Verschueren E., Decruy L., Francart T. (2019). Predicting individual speech intelligibility from the cortical tracking of acoustic- and phonetic-level speech representations. Hearing Research, 380, 1–9. 10.1016/j.heares.2019.05.006 [DOI] [PubMed] [Google Scholar]
  34. Lindstrom M. L., Bates D. M. (1990). Nonlinear mixed effects models for repeated measures data. Biometrics, 46(3), 673–687. 10.2307/2532087 [DOI] [PubMed] [Google Scholar]
  35. Luck S. J. (2014). An introduction to the event-related potential technique (2nd ed.). MIT Press. [Google Scholar]
  36. Luck S. J., Kappenman E. S. (Eds.). (2012). The Oxford handbook of event-related potential components. Oxford University Press. [Google Scholar]
  37. McClaskey C. M. (2024). Neural hyperactivity and altered envelope encoding in the central auditory system: Changes with advanced age and hearing loss. Hearing Research, 442, 108945. 10.1016/j.heares.2023.108945 [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Mesik J., Wojtczak M. (2022). The effects of data quantity on performance of temporal response function analyses of natural speech processing. Frontiers in Neuroscience, 16, 963629. 10.3389/fnins.2022.963629 [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Millman R. E., Mattys S. L., Gouws A. D., Prendergast G. (2017). Magnified neural envelope coding predicts deficits in speech perception in noise. The Journal of Neuroscience, 37(32), 7727–7736. 10.1523/JNEUROSCI.2722-16.2017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Mirkovic B., Bleichner M. G., De Vos M., Debener S. (2016). Target speaker detection with concealed EEG around the ear. Frontiers in Neuroscience, 10, 349. 10.3389/fnins.2016.00349 [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Moore D. R. (1987). Physiology of higher auditory system. British Medical Bulletin, 43(4), 856–870. 10.1093/oxfordjournals.bmb.a072222 [DOI] [PubMed] [Google Scholar]
  42. Nguyen N. D. T., Mikkelsen K., Kidmose P. (2025). Cognitive component of auditory attention to natural speech events. Frontiers in Human Neuroscience, 18, 10.3389/fnhum.2024.1460139 [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Nielsen J. B., Dau T. (2011). The Danish hearing in noise test. International Journal of Audiology, 50(3), 202–208. 10.3109/14992027.2010.524254 [DOI] [PubMed] [Google Scholar]
  44. Oldfield R. C. (1971). The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia, 9(1), 97–113. 10.1016/0028-3932(71)90067-4 [DOI] [PubMed] [Google Scholar]
  45. Oostenveld R., Fries P., Maris E., Schoffelen J.-M. (2011). Fieldtrip: Open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Computational Intelligence and Neuroscience, 2011, 1–9. 10.1155/2011/156869 [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Panela R. A., Copelli F., Herrmann B. (2024). Reliability and generalizability of neural speech tracking in younger and older adults. Neurobiology of Aging, 134, 165–180. 10.1016/j.neurobiolaging.2023.11.007 [DOI] [PubMed] [Google Scholar]
  47. Peelle J. E., Troiani V., Wingfield A., Grossman M. (2010). Neural processing during older adults’ comprehension of spoken sentences: Age differences in resource allocation and connectivity. Cerebral Cortex, 20(4), 773–782. 10.1093/cercor/bhp142 [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Pinheiro J. C., Bates D. M. (1996). Unconstrained parametrizations for variance–covariance matrices. Statistics and Computing, 6, 289–296. 10.1007/BF00140873 [DOI] [Google Scholar]
  49. Plack C. J. (2014). The sense of hearing. Psychology Press. [Google Scholar]
  50. Presacco A., Simon J. Z., Anderson S. (2016). Evidence of degraded representation of speech in noise, in the aging midbrain and cortex. Journal of Neurophysiology, 116(5), 2346–2355. 10.1152/jn.00372.2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Presacco A., Simon J. Z., Anderson S. (2019). Speech-in-noise representation in the aging midbrain and cortex: Effects of hearing loss. PLoS One, 14(3), e0213899. 10.1371/journal.pone.0213899 [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Regev J., Oxenham A. J., Relaño-Iborra H., Zaar J., Dau T. (2025a). Evaluating the role of age on speech-in-noise perception based primarily on temporal envelope information. Hearing Research, 460, 109236. 10.1016/j.heares.2025.109236 [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Regev J., Relaño-Iborra H., Zaar J., Dau T. (2024). Disentangling the effects of hearing loss and age on amplitude modulation frequency selectivity. The Journal of the Acoustical Society of America, 155(4), 2589–2602. 10.1121/10.0025541 [DOI] [PubMed] [Google Scholar]
  54. Regev J., Zaar J., Relaño-Iborra H., Dau T. (2023). Age-related reduction of amplitude modulation frequency selectivity. The Journal of the Acoustical Society of America, 153(4), 2298. 10.1121/10.0017835 [DOI] [PubMed] [Google Scholar]
  55. Regev J., Zaar J., Relaño-Iborra H., Dau T. (2025). Investigating the effects of age and hearing loss on speech intelligibility and amplitude modulation frequency selectivity. The Journal of the Acoustical Society of America, 157(3), 2077–2090. 10.1121/10.0036220 [DOI] [PubMed] [Google Scholar]
  56. Rotaru I., Geirnaert S., Heintz N., Van de Ryck I., Bertrand A., Francart T. (2024). What are we really decoding? Unveiling biases in EEG-based decoding of the spatial focus of auditory attention. Journal of Neural Engineering, 21(1), 016017. 10.1088/1741-2552/ad2214 [DOI] [PubMed] [Google Scholar]
  57. Schmiedt R. A. (2010). The physiology of cochlear presbycusis. In Gordon-Salant S., Frisina R. D., Popper A. N., Fay R. R. (Eds.), The aging auditory system (Vol. 34, pp. 9–38). Springer New York. [Google Scholar]
  58. Schmitt R., Meyer M., Giroud N. (2022). Better speech-in-noise comprehension is associated with enhanced neural speech tracking in older adults with hearing impairment. Cortex, 151, 133–146. 10.1016/j.cortex.2022.02.017 [DOI] [PubMed] [Google Scholar]
  59. Seki S., Eggermont J. J. (2003). Changes in spontaneous firing rate and neural synchrony in cat primary auditory cortex after localized tone-induced hearing loss. Hearing Research, 180(1–2), 28–38. 10.1016/S0378-5955(03)00074-1 [DOI] [PubMed] [Google Scholar]
  60. Sergeeva A., Christensen C. B., Kidmose P. (2024). Towards ASSR-based hearing assessment using natural sounds. Journal of Neural Engineering, 21(2), 026045. 10.1088/1741-2552/ad3b6b [DOI] [PubMed] [Google Scholar]
  61. Shannon R. V., Zeng F. G., Kamath V., Wygonski J., Ekelid M. (1995). Speech recognition with primarily temporal cues. Science, 270(5234), 303–304. 10.1126/science.270.5234.303 [DOI] [PubMed] [Google Scholar]
  62. Tanveer M. A., Skoglund M. A., Bernhardsson B., Alickovic E. (2024). Deep learning-based auditory attention decoding in listeners with hearing impairment. Journal of Neural Engineering, 21(3), 036022. 10.1088/1741-2552/ad49d7 [DOI] [PubMed] [Google Scholar]
  63. Tremblay K. L., Piskosz M., Souza P. (2003). Effects of age and age-related hearing loss on the neural representation of speech cues. Clinical Neurophysiology, 114(7), 1332–1343. 10.1016/S1388-2457(03)00114-7 [DOI] [PubMed] [Google Scholar]
  64. Van Canneyt J., Wouters J., Francart T. (2021). Cortical compensation for hearing loss, but not age, in neural tracking of the fundamental frequency of the voice. Journal of Neurophysiology, 126(3), 791–802. 10.1152/jn.00156.2021 [DOI] [PubMed] [Google Scholar]
  65. Van Hirtum T., Somers B., Verschueren E., Dieudonné B., Francart T. (2023). Delta-band neural envelope tracking predicts speech intelligibility in noise in preschoolers. Hearing Research, 434, 108785. 10.1016/j.heares.2023.108785 [DOI] [PubMed] [Google Scholar]
  66. Vanthornhout J., Decruy L., Wouters J., Simon J. Z., Francart T. (2018). Speech intelligibility predicted from neural entrainment of the speech envelope. Journal of the Association for Research in Otolaryngology, 19(2), 181–191. 10.1007/s10162-018-0654-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Wilroth J., Bernhardsson B., Heskebeck F., Skoglund M. A., Bergeling C., Alickovic E. (2023). Improving EEG-based decoding of the locus of auditory attention through domain adaptation. Journal of Neural Engineering, 20(6), 066022 . 10.1088/1741-2552/ad0e7b [DOI] [PubMed] [Google Scholar]
  68. Working Group on Speech Understanding & Aging . (1988). Speech understanding and aging. The Journal of the Acoustical Society of America, 83(3), 859–895. 10.1121/1.395965 [DOI] [PubMed] [Google Scholar]

Articles from Trends in Hearing are provided here courtesy of SAGE Publications