Journal of Speech, Language, and Hearing Research (JSLHR)
. 2023 Sep 6;66(10):4066–4082. doi: 10.1044/2023_JSLHR-22-00548

Auditory, Lexical, and Multitasking Demands Interactively Impact Listening Effort

Stefanie E. Kuchinsky,a,b,c Niki Razeghi,c Nick B. Pandžab,d,e
PMCID: PMC10713022  PMID: 37672797

Abstract

Purpose:

This study examined the extent to which acoustic, linguistic, and cognitive task demands interactively impact listening effort.

Method:

Using a dual-task paradigm, on each trial, participants were instructed to perform either a single task or two tasks. In the primary word recognition task, participants repeated Northwestern University Auditory Test No. 6 words presented in speech-shaped noise at either an easier or a harder signal-to-noise ratio (SNR). The words varied in how commonly they occur in the English language (lexical frequency). In the secondary visual task, participants were instructed to press a specific key as soon as a number appeared on screen (simpler task) or one of two keys to indicate whether the visualized number was even or odd (more complex task).

Results:

Manipulation checks revealed that key assumptions of the dual-task design were met. A significant three-way interaction was observed, such that the expected effect of SNR on effort was only observable for words with lower lexical frequency and only when multitasking demands were relatively simpler.

Conclusions:

This work reveals that variability across speech stimuli can influence the sensitivity of the dual-task paradigm for detecting changes in listening effort. In line with previous work, the results of this study also suggest that higher cognitive demands may limit the ability to detect expected effects of SNR on measures of effort. With implications for real-world listening, these findings highlight that even relatively minor changes in lexical and multitasking demands can alter the effort devoted to listening in noise.


Millions of Americans report listening difficulties, such as problems understanding speech in noise, in the absence of a clinical hearing loss (Edwards, 2020). There is increasing acknowledgment that pure-tone hearing thresholds (Phatak et al., 2019) and speech intelligibility scores (Winn & Teece, 2021) alone are insufficient for fully characterizing the challenges that listeners experience in their daily lives. Part of the gap in our understanding of functional hearing complaints is that real-world communication commonly involves listening in complex environments, to complex stimuli, and with complex task demands that are not often probed in clinical or controlled laboratory settings (Keidser et al., 2020).

Indeed, successful attended listening comprises an interaction of perceptual, linguistic, and cognitive processes that support the translation of an acoustic signal into an understanding of a speaker's intended message. While the coordination of these functions often occurs in an automatic, effortless fashion under ideal situations, adverse listening conditions can impact the ease with which speech understanding is accomplished (Mattys et al., 2012). Challenges can arise at every level of processing. For example, when listening to a friend talk about their dissertation research while walking down a busy street, background noise may mask the auditory signal, some words may be unfamiliar to the listener, and the listener must simultaneously pay attention to rapid visual changes to avoid running into people. While previous research has largely focused on how acoustic challenges impact the effort exerted to understand speech, few studies have concurrently examined the role of linguistic and cognitive demands (see Zekveld et al., 2018, for a review). In their consensus article on enhancing the ecological validity of research in the hearing sciences, Keidser et al. (2020) noted numerous study design features that are important to consider but are as yet understudied, such as the relevance of the stimulus materials and task complexity (e.g., single vs. multiple tasks) to real-world communication. The goal of this study was to examine the extent to which effortful listening in background noise is impacted by the lexical frequency of word stimuli (indicative of how likely individuals are to encounter a given word in their daily lives) and by multitasking demands.

From Listening Demands to Listening Effort

Even among younger, normal-hearing adults, listening in noise has been associated with having to work harder to successfully recognize speech (Rabbitt, 1966). As put forth in the Ease of Language Understanding (ELU) model, explicit working-memory processes must be invoked to resolve mismatches between a degraded acoustic signal and stored phonological or lexical representations (Rönnberg et al., 2008, 2013). The increased allocation of such mental resources to speech understanding suggests a critical role for cognition in effortful listening. The Framework for Understanding Effortful Listening (FUEL) highlights the multidimensional factors that impact the deployment of effort in complex listening situations (Pichora-Fuller et al., 2016), including variation in the quality of the acoustic signal, the auditory and cognitive capacities of the listener, the linguistic content of the message, the context in which listening is performed, and variation in a listener's level of arousal or motivation. FUEL and other models (e.g., Strauss & Francis, 2017) describe an important dissociation between processing requirements imposed by a task (i.e., demands or load) and the mental resources that an individual is willing and able to engage to meet those demands (i.e., listening effort). In general, the relationship between demands and effort has been observed to be nonlinear, such that maximum effort tends to be exerted at moderate levels of task difficulty (Ohlenforst et al., 2017).

Even for simple word stimuli, mental demands can arise at multiple levels of speech processing (e.g., Blumstein, 2009; Luce & Pisoni, 1998; Marslen-Wilson, 1987). While many studies have manipulated acoustic, linguistic, or cognitive demands as participants respond to auditory stimuli, fewer have examined interactions among these demands in terms of their impact on effort using a within-participants study design.

Investigating Interactions Among Listening Demands on Effort

Studies that have investigated the relations among acoustic, linguistic, or cognitive demands on listening effort have predominantly used pupillometry with manipulations of at most two of these dimensions at a time (for a review, see Zekveld et al., 2018). These have included studies of acoustic–lexical, acoustic–syntactic, acoustic–semantic, and acoustic–cognitive interactions. Maximal effort is generally observed when both auditory and either linguistic or cognitive demands are highest (Kuchinsky et al., 2013; Wendt et al., 2016; Winn, 2016), until a tipping point when people begin to give up (Ayasse & Wingfield, 2018). When multiple challenging task demands are imposed, there is greater potential to overload individuals. For example, Zekveld et al. (2019) found that the expected effect of signal-to-noise ratio (SNR) on the pupil response was greatly diminished under high memory demands that were likely to have exceeded listeners' memory capacities. However, it remains to be seen whether such interactions would hold for other measures of listening effort; previous evidence suggests that different measures are sensitive to distinct aspects of listening effort (Alhanbali et al., 2019; Colby & McMurray, 2021; McGarrigle et al., 2014; Strand et al., 2018; Visentin et al., 2022).

The dual-task paradigm is a commonly used behavioral measure of effort that may be particularly well suited for probing questions about the impact of cognitive demands associated with multitasking on listening effort. It may also be more practical to collect in patient populations than pupillometry, which imposes restrictions on visual stimuli and is subject to age- or medication-related changes in pupillary dynamics (Winn et al., 2018). Thus, this study aimed to examine interactions among acoustic, linguistic, and multitasking cognitive demands using a well-studied behavioral index of listening effort: the dual-task paradigm (Sarampalis et al., 2009).

Investigating Interactive Listening Demands Using the Dual-Task Paradigm

Multitasking research has frequently employed dual-task experimental designs to probe the nature of the mental architecture that supports task performance (Koch et al., 2018). In the domain of hearing sciences, the dual-task paradigm has primarily been used to obtain a measure of listening effort, rather than to study multitasking processes per se (for a review, see Gagné et al., 2017). Typically, participants complete a primary task (e.g., speech recognition), which they have been instructed should be of principal focus. On some trials, they are also instructed to perform a secondary task (e.g., visual probe detection) without sacrificing performance on the primary one. Because a limited amount of mental resources can be deployed at any given time (Kahneman, 1973), as more resources are used for the primary speech recognition task (e.g., poorer SNR), fewer are presumably available for the secondary task. The resulting decrement in secondary task performance (e.g., a slower reaction time [RT]) is a “dual-task cost” that is indicative of the listening effort exerted on the primary task.

Auditory and linguistic interactions. Dual-task studies that have manipulated both acoustic and linguistic demands have primarily focused on sentence-level properties of speech, such as sentence context (Desjardins & Doherty, 2013, 2014; Sarampalis et al., 2009). Results have been mixed across such studies, suggesting that such interactions may depend on specifics of the task design, which can vary, for example, in terms of the demands imposed by the secondary task, task prioritization instructions, and the relative timing of primary and secondary tasks (reviewed in Gagné et al., 2017).

Pupillometry studies of lexical processing suggest that greater listening effort is associated with naming words that occur less versus more commonly in the English language (i.e., lower frequency words like “hack” vs. higher frequency words like “back”; Papesh & Goldinger, 2012). However, there is a paucity of studies examining interactions among acoustic and lexical demands on listening effort. Kuchinsky et al. (2013) observed interactive effects of acoustic signal degradation and lexical competition, such that effort was greatest when both the SNR was poor and competition was present (i.e., lexical processing demands were high). However, as a counterexample, effects of word frequency and stimulus degradation for written word processing have been commonly observed to have an additive, but not interactive, effect on lexical decision RTs (e.g., Becker & Killion, 1977; Yap & Balota, 2007).

Auditory and cognitive interactions. There is extensive evidence that degraded speech processing is associated with greater recruitment of working memory resources (see reviews by Rönnberg et al., 2013; Wingfield et al., 2015). However, there is relatively limited understanding of how cognitive demands external to the perceptual stimulus, such as listening while under high or low multitasking demands, contribute to listening challenges (Mattys & Wiget, 2011). A review by Koch et al. (2018) describes multitasking in terms of the temporal overlap in mental processes that are involved in performing more than one task. This includes maintaining representations of the mental and motor processes that are required of each task (i.e., “task sets”).

The dual-task paradigm may be particularly well suited to investigate the extent to which varying multitasking demands impact effortful listening. A few studies have investigated the nature of top-down cognitive demands elicited in dual-task studies of effort by varying both primary and secondary task difficulty. For example, Picou and Ricketts (2014) had participants recognize monosyllabic words in quiet or noise and in the presence or absence of audiovisual cues. The difficulty of the secondary task was manipulated by including a simple, a complex, and a semantic condition. In the simple task, participants pressed a button anytime a rectangle turned red in the center of the screen. In the complex condition, participants pressed a left or right arrow depending on the spatial location and parity of a number on the screen. In the semantic condition, participants button-pressed to indicate if the auditorily presented word in the primary task was a noun or not. The authors found varying effects of secondary task complexity on dual-task measures of effort: The semantic task revealed greater effort (i.e., slower secondary task performance) when listening in noise, but no effect of audiovisual presentation. This pattern may be due in part to the semantic task engaging the deepest level of processing and thus eliciting greater resource conflict between primary and secondary tasks. However, as the authors note, multiple aspects of the secondary task differed across conditions, including whether the stimuli were visual or auditory, the timing of the task onset, and how frequently a judgment was required.

Extending these results, Wu et al. (2016) kept task stimuli consistent across trials while varying secondary task complexity via instructions for responding. Participants' primary task was to recognize sentences from the Hearing in Noise Test at 11 different SNRs. The secondary task comprised Stroop stimuli (i.e., color words written in different colors of ink). In the easier secondary task condition, listeners pressed the spacebar any time they saw a word on the screen (ignoring its ink color and meaning). In the harder condition, listeners pressed a specific button to indicate the color of the word, which could conflict with what the word said (e.g., GREEN written in red ink). They observed a nonlinear relationship between RTs on the secondary task and the SNR of the primary task, with the greatest effort occurring at moderate levels of listening difficulty. Unlike the results of Picou and Ricketts (2014), this pattern was consistent across levels of secondary task complexity. The discrepancy across studies may be attributable to differences in the secondary task procedures, such as the specific cognitive mechanisms engaged (e.g., semantic processing, cognitive control; Gagné et al., 2017). Another differentiating factor was that Wu et al. (2016) presented sentence stimuli, which engage cognitive and linguistic processes that unfold over a longer time compared to the single-word stimuli used by Picou and Ricketts (2014). Thus, observing differences in listening effort may depend on the ability to closely align primary and secondary tasks in time, which is more challenging to do with sentence materials.

The Current Study

The goal of this study was to examine the extent to which auditory, linguistic, and cognitive load impact listening effort through an experimental design that manipulated SNR, word frequency, and multitasking demands, respectively. Word frequency was the lexical property of focus due to its previously observed influence on speech processing and effort (Dirks et al., 2001; Papesh & Goldinger, 2012) and its relevance to understanding how effort changes as a function of stimuli that are more commonly or less commonly encountered in daily living. Younger adults were tested to examine how these factors interact in a normative population.

The dual-task paradigm in this study involved a primary word recognition task in speech-shaped noise at two fixed SNRs designed to achieve approximately 50% and 70% recognition accuracy. These SNRs were chosen based on established intelligibility norms (Wilson et al., 2007) and pilot testing to ensure the task was moderately challenging since studies have demonstrated that maximal effort occurs at moderate levels of speech intelligibility (near 50%; Ohlenforst et al., 2017; Wu et al., 2016). The secondary visual processing task was either a simple number detection task or a relatively complex number categorization task. Using a visual task avoids overlap between stimulus presentation modalities, thus focusing competition between primary and secondary tasks on generating a response rather than on perception.

The same stimulus presentation procedure occurred on all trials, with only instructions varying by condition (e.g., Wu et al., 2016). Words were used in the primary speech recognition task (e.g., Picou & Ricketts, 2014) to allow for tighter control over the relative timing of resource competition between tasks (vs. sentence stimuli). The secondary task occurred following each to-be-recognized word, again with the aim of interfering with the generation of the verbal response, rather than its perception (Downs, 1982; Fraser et al., 2010; Karatekin et al., 2004).

Study Hypotheses and Predictions

Prior to testing the primary theoretical question of interest, a series of manipulation and design assumption checks were performed for the dual-task paradigm. These included assessments of whether the task demands were indeed relatively harder versus easier and whether assumptions of the dual-task paradigm (Fisk et al., 1986) were met. Specifically, these checks tested whether (a) the SNR manipulation affected speech recognition scores, (b) participants maintained performance on the primary word recognition task in the presence of the secondary task (a critical assumption of the dual-task paradigm), and (c) the multitasking manipulation affected secondary task RTs. The corresponding predictions were tested across two models (accuracy and RTs): for accuracy, (a) a main effect of SNR on word recognition (easier SNR > harder SNR accuracy) and (b) no main effect of multitasking on recognition (multitask = single-task accuracy) and no interaction with SNR; for RTs, (c) a main effect of multitasking (multitask > single-task RT), a main effect of secondary task complexity (complex > simple RT), and an interaction (a larger multitasking slowdown for the more complex task).

The main theoretical hypothesis of this study was that auditory, lexical, and multitasking demands interactively impact listening effort, such that maximal effort would be observed when all three conditions were most challenging. The associated prediction was that a poorer SNR, lower lexical frequency, and a more complex secondary task would yield the slowest secondary task RTs (relative to the other conditions), indicative of greatest effort. Alternatively, maximal effort might occur under moderate task demands, for example, if high cognitive demands minimize commonly observed SNR effects on effort (Zekveld et al., 2019). In this case, the slowest secondary task RTs would be observed only under a subset of task demands (e.g., two, but not all three, demands were most challenging).

A final alternative hypothesis was that acoustic, lexical, and multitasking demands might be merely additive (e.g., Becker & Killion, 1977; Yap & Balota, 2007) in terms of their effect on listening effort. This alternative would predict main effects of SNR, lexical frequency, and secondary task complexity, but no interaction on RTs.

Method

Participants

Participants were 26 adults (nine men, 17 women) recruited from the University of Maryland and surrounding community, ranging in age from 18 to 24 years (M = 20.70, SD = 1.50). Participants had normal hearing, defined as pure-tone thresholds of 20 dB HL or better at 1.0, 2.0, and 4.0 kHz bilaterally, screened using a portable GSI-17 audiometer and TDH-39 supra-aural headphones in a sound-treated room. Because screening was not completed in a soundproof booth, the high noise floor at lower frequencies precluded screening below 1.0 kHz. Furthermore, screening at higher frequencies was not conducted because the stimuli used were bandpass filtered from 0.2 to 4.0 kHz.

All participants self-reported that (a) American English was the first language they learned and that they were not fluent in any other languages; (b) they had at least a high school diploma; (c) they had normal or corrected-to-normal vision; and (d) they had no known speech, language, motor, cognitive, or neurological disorders. Eligible individuals provided informed consent to participate in this University of Maryland institutional review board–approved protocol.

Equipment

Participants were seated approximately 20 in. in front of a 24-in. monitor of a PC running Windows 7. Auditory stimuli were presented through Sennheiser HD 201 circumaural headphones and were routed from the computer through a Monoprice 615220 4-Channel Headphone Amplifier. Calibration of the headphones was completed to ensure that the background noise was presented at a fixed level of 70 dB SPL. A Snowball iCE USB microphone was used to record participants' verbal responses.

Materials

The auditory stimuli for the primary task consisted of 160 Northwestern University Auditory Test No. 6 (NU-6) monosyllabic consonant–vowel–consonant (CVC) words recorded by female speaker No. 1 from the Department of Veterans Affairs. Praat (v.6.0) software (Boersma & Weenink, 2016) was used to process auditory stimuli. Words were extracted from the CD tracks and bandpass filtered from 0.2 to 4.0 kHz. Filtering was performed to allow for the comparison of younger and older adults (who often have hearing loss at higher frequencies) in future studies.

Background noise consisted of Gaussian noise that was modified to have the same long-term average spectrum as the word stimuli. This speech-spectrum masker was preferred over babble because it avoids the amplitude modulation characteristics of babble and the opportunity for participants to listen in the dips (Wilson et al., 2007). Words and noise were then mixed at two SNRs (+5 and +1 dB SNR, noise fixed at 70 dB SPL). Pilot testing showed that these SNRs resulted in approximately 70% and 50% recognition accuracy in younger adults.
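
The mixing itself follows standard SNR arithmetic. As a minimal illustration (not the authors' Praat procedure), the sketch below scales a word waveform relative to a fixed-level noise so that the mixture reaches a target SNR in dB; the variable names are hypothetical.

```r
# Sketch of the SNR mixing arithmetic (illustrative only; the stimuli were processed in Praat).
# Assumes `word` and `noise` are numeric sample vectors at the same sampling rate,
# with `noise` at least as long as `word`.
mix_at_snr <- function(word, noise, snr_db) {
  rms <- function(x) sqrt(mean(x^2))
  # Scale the word so that 20 * log10(rms(word_scaled) / rms(noise)) equals snr_db
  word_scaled <- word * (rms(noise) / rms(word)) * 10^(snr_db / 20)
  word_scaled + noise[seq_along(word_scaled)]
}

# Hypothetical usage for the two SNR conditions (+5 and +1 dB, noise level fixed):
# mixed_easier <- mix_at_snr(word_wave, noise_wave, snr_db = 5)
# mixed_harder <- mix_at_snr(word_wave, noise_wave, snr_db = 1)
```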

Of the 200 available NU-6 words, word stimuli were considered for inclusion in the main experiment if they yielded at least 90% recognition accuracy in quiet in a study with younger, normal-hearing adults (Wilson & McArdle, 2015, Table SM12) and were not homophones of other selected words (e.g., "witch" was removed and "which" retained). Of those items, 160 critical words were pseudo-randomly selected to create 16 lexically balanced 10-word lists, which would later be randomly assigned to the eight possible task conditions for each participant (described below). The remaining 40 NU-6 words were used as practice items.

Counterbalancing was performed to align word lists on word frequency (how commonly a word occurs in English), neighborhood density (number of phonologically similar words), and neighborhood frequency (how commonly neighbors appear in English). The following measures were obtained from the English Lexicon Project database (Balota et al., 2007): log-transformed hyperspace-analogue-to-language word frequency (Log_Freq_HAL: M = 9.86, SD = 1.86; Lund & Burgess, 1996), phonological neighborhood density (Phonological Levenshtein Distance [PLD]: M = 1.19, SD = 0.27), and phonological neighborhood frequency (PLD Frequency: M = 8.71, SD = 0.68). There were no significant differences (p > .05) between any of the sixteen 10-word lists on these factors except for one likely spurious difference (p = .03) comparing two of the 16 lists on PLD (see Footnote 1). There was no significant correlation between word frequency and neighborhood density across items, r(158) = −.05, p = .53.

Visual stimuli for the secondary task appeared in 30-point Courier New font and consisted of a pound sign interrupted by the brief presentation of a single digit number. The number appeared after a variable interstimulus interval (ISI) of 0, 250, 500, 750, or 1,000 ms (32 occurrences of each across the experiment). Each number (2 through 9) was presented four times within each ISI across the experiment.

All trials contained both auditory and visual stimuli with the instructions for responding changing across five experimental blocks that comprised a total of eight experimental conditions. Three of the five blocks were single-task conditions that served as baseline measures of performance—Conditions 1 and 2: a primary task only (word recognition at two SNRs, alternating every five words); Condition 3: a simple secondary task only (number detection); and Condition 4: a relatively complex secondary task only (odd–even number categorization). In the latter two blocks with the secondary tasks (Conditions 3 and 4), unattended auditory stimuli were presented in only the easier SNR (+5 dB). The remaining two blocks were dual-task conditions—Conditions 5 and 6: a simple multitasking condition (primary task at two SNRs with the simple secondary task); and Conditions 7 and 8: a relatively complex multitasking condition (primary task at two SNRs with the relatively complex secondary task). Within every block, 30 words appeared in each of the relevant SNR conditions. The 16 counterbalanced 10-word lists were randomly assigned to each of the possible eight conditions for each participant. For a given participant, no words were repeated within or across blocks.

Procedure

Testing occurred in an acoustically treated room at the Maryland Neuroimaging Center at the University of Maryland, College Park, in a single, 1-hr session. The experiment was presented via E-Prime v.2.0.10.356 (Psychology Software Tools, Sharpsburg, PA), with auditory stimuli presented diotically over circumaural headphones. Participants completed practice trials in which the experimenter answered questions and ensured compliance with the various block instructions prior to starting the experiment.

Task parameters were selected with the goal of having the secondary task interfere with the selection of a verbal response to the word stimulus (rather than its perception or encoding; Downs, 1982). Figure 1 depicts the sequence of events, which was the same for every 5-s trial. It began with a centrally presented pound symbol and the onset of the speech-spectrum noise, which continued throughout the trial. One second later, a word was presented at the designated SNR. To reduce predictability, a variable ISI (0–1,000 ms) occurred between word offset (word duration, M = 578 ms, SD = 70) and the onset of the number stimulus. This ISI was in line with previous studies that have investigated the impact of performing a dual task on response-related processes (e.g., Allen et al., 1998; Hartley, 2001). The number appeared for 125 ms before changing back to a pound sign. The pound sign stayed on the screen for the remainder of the trial, during which button-press and/or verbal responses were recorded. Even for trials with the longest possible word duration (898 ms) and ISI (1,000 ms), the minimum response window of each 5-s trial was 1,977 ms.

Figure 1.

Schematic of the 5-s trial events: noise only (1,000 ms); word in noise (403–898 ms); ISI (0–1,000 ms); number (125 ms); button and/or verbal response (remainder of trial).

The timing of events for each 5-s trial. Depending on block instructions, participants repeated the word they heard in noise and/or made a simple (detection) or relatively complex (categorization) judgment about the visually presented number. In dual-task blocks, participants were instructed to respond to the number prior to repeating the word aloud. ISI = interstimulus interval.

Instructions differed across the five blocks, which were presented in a random order. In the simple, secondary task–only block, participants were instructed to press the “1” key on the keyboard using their dominant hand when they saw any number appear on the screen (i.e., number detection). In the relatively complex, secondary task–only block, participants pressed the “1” key when they saw an odd number and the “2” key when they saw an even number (i.e., number categorization). In the primary task–only block, participants were told to ignore the visual information and repeat aloud the word that they heard as quickly and accurately as possible. In two dual-task blocks (simple and relatively complex multitasking), participants performed both the primary and the secondary tasks by button-pressing for the number prior to repeating the word aloud so as not to interfere with RTs (e.g., Wu et al., 2016). For greater ease of interpreting dual-task changes in RTs, task instructions stressed that accuracy on the word recognition task was of primary importance (Gagné et al., 2017).

Data Analysis

Task conditions included the number of tasks (single vs. dual), SNR (easier vs. harder), and secondary task condition (simple vs. complex). Log-transformed word frequency (Lund & Burgess, 1996) of the stimuli ranged from 5.20 to 14.92.

Trial-level primary task word accuracy and secondary task RTs were dependent variables. For RT analyses, only trials for which participants accurately performed the secondary task were examined (96.60% of trials). Compared to correct trials, errors can result from a variety of sources unrelated to task processing, such as lapses in attention, and are associated with different RT distributions (Ratcliff & Murdock, 1976). Trials were also removed from RT analyses if they elicited an RT more than 3 SDs away from that participant's mean (1.06% of remaining trials removed).
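
These exclusion steps can be expressed compactly in R. The sketch below assumes a hypothetical trial-level data frame rt_data with columns participant, correct_secondary (0/1), and rt (in ms); it is illustrative rather than the authors' exact script.

```r
# Illustrative exclusion pipeline (hypothetical column names; not the authors' exact script)
library(dplyr)

rt_clean <- rt_data %>%
  filter(correct_secondary == 1) %>%            # keep only trials with a correct secondary response
  group_by(participant) %>%
  filter(abs(rt - mean(rt)) <= 3 * sd(rt)) %>%  # drop RTs more than 3 SDs from each participant's mean
  ungroup() %>%
  mutate(logRT = log(rt))                       # log-transform RTs for the linear MEM
```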

Data analysis using mixed-effects models (MEMs) was implemented in R v.3.6.3 with the lme4 package v.1.1-21 (Bates et al., 2015; R Core Team, 2020). MEMs allow for the analysis of subject- and item-level effects in a single model, permit specification of the appropriate data distribution, and are generally more powerful than traditional analyses of variance (Gordon, 2019; Gueorguieva & Krystal, 2004). Fixed effects (i.e., experimental manipulations) and random effects (i.e., cross-classified random intercepts by participants and items) were modeled to predict trial-level word recognition accuracy (0/1) and secondary task log-transformed RT (logRT) across individuals and words. SNR (reference level: easier), number of tasks (reference level: single), and secondary task difficulty (reference level: simple) were categorical predictors, and log-transformed word frequency was a continuous predictor that was mean-centered at the item level. For accuracy data, the data distribution was specified as binomial using a logistic MEM (glmer function), and for logRT data, the distribution was specified as Gaussian using a linear MEM (lmer function).
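
In lme4 syntax, the two model families take the following general form. This is a simplified sketch with hypothetical variable names and only crossed random intercepts; the actual fixed- and random-effects structures were determined by the buildmer procedure described below.

```r
library(lme4)

# Logistic MEM for trial-level word recognition accuracy (0/1),
# with crossed random intercepts for participants and word items
acc_model <- glmer(
  accuracy ~ snr * num_tasks + (1 | participant) + (1 | word),
  data = acc_data, family = binomial
)

# Linear (Gaussian) MEM for log-transformed secondary-task RTs
rt_model <- lmer(
  logRT ~ snr * second_task * log_word_freq_c + (1 | participant) + (1 | word),
  data = rt_clean
)
```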

Model Assumptions

For all models, assumptions that the residuals are normally distributed, homoscedastic, and linearly related to predicted values were confirmed via the resid_panel() function in the ggResidpanel package (v.0.3.0; Goode & Rey, 2019). Lack of multicollinearity (all variance inflation factors < 5.0) was confirmed via the check_collinearity() function in the performance package (v.0.3.0; Lüdecke et al., 2019). RT data in decision tasks are commonly positively skewed, which can lead to violations of MEM assumptions, such as normality and homoscedasticity of the model's residuals (Lo & Andrews, 2015). Though MEMs offer the opportunity to deal with such violations by modeling different RT distributions (e.g., Gaussian, gamma, inverse Gaussian) and link functions (e.g., identity, inverse), Lo and Andrews (2015) emphasized the importance of selecting these on theoretical grounds.
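
For reference, these diagnostics amount to two function calls on a fitted model object (here the hypothetical rt_model from the sketch above):

```r
library(ggResidpanel)
library(performance)

resid_panel(rt_model)         # residual panel: normality, homoscedasticity, linearity
check_collinearity(rt_model)  # variance inflation factors (all reported VIFs were < 5.0)
```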

Though there is not an established distribution for RTs from dual-task studies of listening effort, there is theoretical and empirical evidence to suggest that a log transformation of RTs modeled with the default Gaussian distribution and identity link function is the most appropriate analytical approach. Theoretically, studies have indicated that a proportional change in RTs from a reference condition is a sensitive measure of listening effort (e.g., Desjardins & Doherty, 2013, 2014). Lo and Andrews (2015) noted that in such cases, a log transformation may be more appropriate (i.e., log[x] − log[y] = log[x/y]) than other distributions/link functions. Indeed, the study on which the current experiment is most closely based log-transformed RTs (logRTs) prior to a linear MEM analysis (e.g., Wu et al., 2016). Empirically, for this study, when attempting to model a basic RT effect (Number of Tasks × Task Type) for logRT data (Gaussian distribution, identity link) as well as for the distributions and link functions examined by Lo and Andrews (2015), the model residuals for the log RT model were the ones that best met all of the MEM assumptions of the residuals being normally distributed, homoscedastic, and linearly related to the response (see code and markdown available on the Open Science Framework [OSF] for detailed comparisons).
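
The candidate specifications compared in that empirical check can be sketched as follows, again assuming the hypothetical rt_clean data frame from above. The adopted approach fits logRT with lmer (Gaussian distribution, identity link); the alternatives fit raw RTs with glmer using distributions and link functions discussed by Lo and Andrews (2015).

```r
library(lme4)

# Log-transformed RTs, Gaussian distribution, identity link (the approach adopted here)
m_lognormal <- lmer(logRT ~ num_tasks * second_task + (1 | participant) + (1 | word),
                    data = rt_clean)

# Raw RTs with alternative distributions/link functions examined by Lo and Andrews (2015)
m_gamma    <- glmer(rt ~ num_tasks * second_task + (1 | participant) + (1 | word),
                    data = rt_clean, family = Gamma(link = "identity"))
m_invgauss <- glmer(rt ~ num_tasks * second_task + (1 | participant) + (1 | word),
                    data = rt_clean, family = inverse.gaussian(link = "identity"))

# Residual diagnostics (e.g., ggResidpanel::resid_panel) can then be compared across models
```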

Full Model Specifications

For each analysis (i.e., manipulation checks and the primary theoretical question), a full MEM was initially specified that included all fixed effects relevant to the research question specified in each section of the Results: Main effects up through the highest order interaction possible among predictors of interest (e.g., SNR, number of tasks, secondary task type, log word frequency). Full models also included a maximal random effects structure: random intercepts by participants and items as well as random slopes corresponding to the fixed effect terms (main effects and interactions) modeled in a given analysis by participants (i.e., SNR, number of tasks, secondary task type, log word frequency) and by items (i.e., SNR, number of tasks, secondary task type). For the two manipulation checks pertaining to word recognition accuracy, an initial single MEM was specified that included SNR and number of tasks. For the third manipulation check pertaining to RTs, an initial MEM was specified that included number of tasks and secondary task type. The specified MEM for the primary theoretical question of interest was performed on dual-task trials only and included terms related to SNR, secondary task type, and word frequency. Two control analyses examined whether this effect was modulated by ISI or by word recognition accuracy. To be conservative, model fitting (described in the next section) was performed to permit a potential fit of these four-way interactions along with random slopes that included ISI or accuracy.

Model Fitting Procedure

To ensure the replicability of analyses, all model fitting procedures were automated using the buildmer package v1.1 (Voeten, 2019) in R v.3.6.3 (R Core Team, 2020). This approach aimed to find the model specification that best supported the data, that is, the maximally specified and informative solution (in terms of fixed and random effects contributing to model fit) that achieved model convergence. Using the buildmer function eliminated bias or errors that could be introduced from carrying out fitting procedures manually. The study data and R script used to generate all analyses and figures as well as verbose model fitting output have been uploaded to the OSF at https://osf.io/pd84w/.

Buildmer is a wrapper around MEM packages (e.g., lme4) that, with its default arguments, automates model fitting in two major steps: finding the maximal model that can still converge (Barr et al., 2013) and performing backward stepwise elimination (Matuschek et al., 2017) based on likelihood ratio tests (LRTs). Fitting was first performed for the random effects (with maximal fixed effects) and then for the fixed effects (using the resulting optimal random effects). More specifically, the buildmer procedure first automatically determines the order of the effects in the model, ensuring that the most information-rich effects (based on LRTs) are included and that the maximally feasible model converges. Only terms that prevent a model from converging (e.g., singular fit) are removed at this stage. Second, backward stepwise elimination is performed by the algorithm using LRTs, which compare the deviance statistics of two nested models; the difference in deviances follows a χ2 distribution, with α level = .05. The Wald test was used for binomial MEM p values and the Satterthwaite approximation (Luke, 2017) for linear MEM p values. The optimizer was set to "bobyqa" and the number of iterations was set to 200,000 to allow models to converge (models that did not converge were eliminated). The resulting model structures are reported in a table of the best-fitting model for each analysis.
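
A minimal sketch of this fitting procedure for the primary analysis is shown below. The formula encodes the maximal fixed and random effects described above; variable names are hypothetical, and the optimizer settings ("bobyqa", 200,000 iterations) are omitted here because the argument through which they are passed to lme4 depends on the buildmer version.

```r
library(buildmer)

# Maximal model for the primary analysis (dual-task trials only): fixed effects up through
# the three-way interaction, plus maximal by-participant and by-item random slopes.
max_formula <- logRT ~ snr * second_task * log_word_freq_c +
  (1 + snr * second_task * log_word_freq_c | participant) +
  (1 + snr * second_task | word)

# buildmer orders terms by their contribution (LRTs), keeps the maximal converging model,
# and then performs backward stepwise elimination of terms that do not improve fit.
fitted <- buildmer(max_formula, data = dual_task_trials)

summary(fitted)  # the resulting best-fitting model
```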

For interactions that could not be directly interpreted from the model output, namely, the three-way interaction that addresses the primary theoretical research question, the interaction was inspected by plotting (with a fitted line from the best-fit model via the lsmeans package v.2.30-0; Lenth, 2016) and through re-referencing (aka releveling) the baseline of the best-fit model. MEM output reports the estimates for a given condition relative to all reference levels. To examine how other comparisons contributed to the best-fit model, the model can be re-referenced (e.g., mean vs. minimum vs. maximum log word frequency value) allowing for the examination of a two-way interaction along levels of a third variable. The re-referenced model is not a separate statistical test (e.g., buildmer is not rerun), and the fit is mathematically identical to the original model (including the three-way interaction term; for examples of using buildmer and re-referencing in the literature, see Pandža et al., 2020; Phillips et al., 2021).
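
As a concrete sketch (again with hypothetical variable names), re-referencing the continuous word-frequency predictor simply recenters it before refitting the identical best-fit formula, so that the lower-order SNR and secondary-task terms are interpreted at the chosen frequency value:

```r
# Recenter log word frequency at its minimum so that the SNR and SecondTask coefficients
# reflect simple effects for the lowest-frequency words; the overall fit (including the
# three-way interaction) is mathematically identical to the original model.
library(lme4)

dual_task_trials$log_word_freq_min <- dual_task_trials$log_word_freq -
  min(dual_task_trials$log_word_freq)

rt_model_minfreq <- lmer(
  logRT ~ snr * second_task * log_word_freq_min +
    (1 + second_task | participant) + (1 | word),   # random effects as in the best-fit model
  data = dual_task_trials
)
```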

Model Reporting

Following the reporting recommendations of Meteyard and Davies (2020), details of the resulting best-fitting model for each analysis are reported in tables that contain the coefficient estimate (est.) and its standard error (SE) as a measure of effect size for each fixed effect, along with an associated test of statistical significance. Terms that were neither significant nor nested within a significant interaction were dropped by the buildmer algorithm (LRT: p > .05) and thus are not included in the final reported model as a result of the model fitting process. The variance, standard deviation, and correlation structure of the random effects are reported, as are pseudo-R2 measures of marginal (variance of fixed effects) and conditional (variance of fixed and random effects) estimates of MEM fit, obtained from the r2_nakagawa() function (Nakagawa et al., 2017) in the performance package (v.0.3.0; Lüdecke et al., 2019).
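
The reported quantities come directly from the fitted model object, for example (using the hypothetical rt_model from above):

```r
library(performance)

summary(rt_model)      # coefficient estimates (est.), SEs, and significance tests
r2_nakagawa(rt_model)  # marginal (fixed effects) and conditional (fixed + random) pseudo-R2
```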

Results

Accuracy on the Primary Word Recognition Task: SNR Manipulation and Primary Task Prioritization Checks

Analyses of the word recognition data aimed to test critical assumptions of experimental dual-task design. The first assumption was that the selected SNRs elicited word recognition (primary task) accuracy levels that were moderately difficult, specifically targeting 50% and 70% accuracy. As shown in Figure 2, average word accuracy was 72.50% (SD = 11.77) for the easier SNR and 53.65% (SD = 9.96) for the harder SNR in the single-task, word recognition condition (red bars).

Figure 2.

Box-and-whisker plot of word accuracy (% words correct) by SNR (easier, harder) and number of tasks (single, dual).

Easier and harder SNRs yielded targeted accuracy levels across both single- and dual-task conditions. Horizontal lines indicate the median value, filled boxes contain the middle 50% range of participant means, and vertical lines indicate the upper and lower quartiles. SNRs = signal-to-noise ratios.

Statistical tests were performed to evaluate whether (a) the effect of SNR on recognition performance was significant and whether (b) participants followed instructions to maintain performance on the primary task in the presence of the secondary task. Table 1 details the resulting best-fitting model predicting word recognition accuracy. As expected, there was a significant main effect of SNR (est. = −1.40, SE = 0.14, p < .001). Furthermore, when participants were asked to perform both a primary and secondary task together (dual-task, blue bars), there was no significant change in word recognition accuracy on the primary task, with 69.90% (SD = 11.30) for the easier SNR and 51.73% (SD = 12.51) for the harder SNR. Neither the SNR × Number of Tasks interaction term nor a number of tasks main effect significantly contributed to the model's fit of the data, ps > .05. The complexity of the secondary task also did not impact word recognition accuracy as an interaction with SNR or as a main effect, ps > 0.05.

Table 1.

Best-fitting model for SNR on word recognition accuracy.

Fixed effects (word accuracy)    Estimate    SE      Z        p
Intercept                        1.60        0.25    6.29     < .001
SNR (Harder > Easier)            −1.40       0.14    −10.22   < .001

Random effects                   Variance    SD      Correlations
By-Word Intercepts               6.11        2.47
By-Word SNR Slopes               0.75        0.87    −.43
By-Participant Intercepts        0.42        0.65

Model fit                        Marginal    Conditional
Pseudo-R2                        0.05        0.66

Note. SE = standard error; SNRs = signal-to-noise ratios.

RTs on the Secondary Visual Task: Multitasking Manipulation Check

A third manipulation check examined the extent to which RTs varied across the secondary tasks depending on number of tasks performed (single vs. dual) and secondary task complexity (simple number detection vs. complex number categorization). Figure 3 shows that performing the primary and secondary tasks together (vs. secondary task alone) resulted in slower RTs. This difference in logRTs was larger when the secondary task was simple number detection (single task: M = 5.09 log ms, SD = 0.26; dual task: M = 5.38 log ms, SD = 0.30) than when it was complex number categorization (single task: M = 6.07 log ms, SD = 0.20; dual task: M = 6.23 log ms, SD = 0.16), as revealed by a significant interaction between Number of Tasks and Secondary Task Complexity (est. = −0.13, SE = 0.05, p = .02). Table 2 shows the results of the best-fitting model, which includes main effects indicating that slower RTs (i.e., positive coefficient estimates) occurred with a greater number of tasks (i.e., slower for dual than single) and with secondary task complexity (i.e., slower for complex than simple), as well as the significant interaction term (see Footnote 2).

Figure 3.

Box-and-whisker plot of secondary-task RTs (ms) by secondary task complexity (simple, complex) and number of tasks (single, dual).

Performing a word recognition and visual task together (dual task) resulted in slower logRTs compared to performing a visual task alone (single task), especially when performing a simpler (vs. more complex) secondary task. Horizontal lines indicate the median participant average RT, filled boxes contain the middle 50% range of values, and vertical lines indicate the upper and lower quartiles. logRTs = log-transformed reaction times; RT = reaction time.

Table 2.

Best-fitting model for single- versus dual-task logRTs.

Fixed effects (logRTs)                       Estimate    SE      t        p
Intercept                                    5.09        0.05    97.98    < .001
NumTasks (Dual > Single)                     0.29        0.05    5.95     < .001
SecondTask (Complex > Simple)                0.97        0.05    19.04    < .001
NumTasks:SecondTask                          −0.13       0.05    −2.46    .02

Random effects                               Variance    SD      Correlations
By-Word Intercepts                           < 0.01      0.05
By-Participant Intercepts                    0.06        0.25
By-Participant NumTasks Slopes               0.06        0.23    −.28
By-Participant SecondTask Slopes             0.06        0.24    −.71  .31
By-Participant NumTasks:SecondTask Slopes    0.05        0.23    .11  −.48  −.77
Residual                                     0.10        0.32

Model fit                                    Marginal    Conditional
Pseudo-R2                                    0.54        0.74

Note. logRTs = log-transformed reaction times; SE = standard error.

Auditory, Lexical, and Multitasking Demands on Dual-Task RTs

The final set of analyses addressed the main theoretical questions of interest. A linear MEM was constructed to examine the interaction among acoustic, lexical, and multitasking demands on secondary task RTs, as an indicator of effort, for trials on which participants performed both a primary and secondary task. Independent variables were primary task SNR (harder vs. easier), log-transformed word frequency, and secondary task complexity (complex number categorization vs. simple number detection). The best-fitting MEM included a significant three-way interaction among these variables (est. = 0.04, SE = 0.02, p = .04), with the associated fixed effects of the best-fitting model given in Table 3. This three-way interaction was not further modulated by ISI: refitting the model with ISI included, up through the four-way interaction, revealed no significant additional effects (p > .05).

Table 3.

Best-fitting model for acoustic, lexical, and multitasking demands on logRTs.

Fixed effects (logRTs)              Estimate    SE      t        p
Intercept                           5.33        0.06    89.83    < .001
SecondTask (Complex > Simple)       0.85        0.05    15.67    < .001
SNR (Harder > Easier)               0.04        0.03    1.40     .16
LogWdFreq (mean centered)           0.01        0.01    1.38     .17
SecondTask:SNR                      −0.02       0.04    −0.48    .63
SecondTask:LogWdFreq                −0.03       0.01    −2.09    .04
SNR:LogWdFreq                       −0.04       0.01    −2.57    .01
SecondTask:SNR:LogWdFreq            0.04        0.02    2.10     .04

Random effects                      Variance    SD      Correlation
By-Word Intercepts                  < 0.01      0.05
By-Participant Intercepts           0.08        0.29
By-Participant SecondTask Slopes    0.06        0.25    −.87
Residual                            0.10        0.32

Model fit                           Marginal    Conditional
Pseudo-R2                           0.53        0.69

Note. logRTs = log-transformed reaction times; SE = standard error; SNR = signal-to-noise ratio.

Figure 4 plots the fitted data, which indicates that the largest effect of SNR on the logRT measure of effort occurred primarily for the low-frequency words (e.g., “wag”) in the simpler secondary task condition. Closer examination of the interaction term supports this conclusion. As shown in Table 3, no simple main effect of SNR was observed at average word frequency values (i.e., model referenced to mean-centered log word frequency) for either the simple (est. = 0.04, SE = 0.03, p = .16) or complex (est. = −0.02, SE = 0.04, p = .63) secondary task conditions. Re-referencing the model, the lack of an SNR effect was also observed for high-frequency words, est. = −0.10, SE = 0.06, t = −1.70, p = .09 (i.e., model re-referenced to maximum value of log word frequency). However, a significant effect of SNR emerged for low-frequency words, est. = 0.22, SE = 0.08, t = 2.89, p < .01 (i.e., model re-referenced to minimum value of log word frequency).

Figure 4.

Fitted RTs (in ms, log scale) plotted as a function of log word frequency, by SNR (easier, harder) and secondary task (simple, complex).

logRT varied with SNR, lexical, and secondary-task demands. The largest effect of SNR was observed for lower frequency words, specifically when performing the simple, number detection task (vs. complex categorization). Fitted model lines are plotted, and shading represents SE around the fitted lines. logRT = log-transformed reaction time; SNR = signal-to-noise ratio; RT = reaction time.

Finally, to highlight the extent to which these RT measures provide an index of effortful listening over and above word recognition performance, the best-fitting model (see Table 3), which included correct trials only, was rerun on trials for which the auditorily presented word was either correctly or incorrectly identified. Restricting the main analysis to correct trials also helped ensure that participants had directed their attention to the primary task as instructed (Hicks & Tharpe, 2002). The significant three-way interaction of secondary task complexity, SNR, and log word frequency persisted with an identical effect size when analyzing both correct and incorrect trials, with word identification accuracy added as a model covariate: three-way interaction term est. = 0.04, SE = 0.02, t = 2.46, p = .01.

Discussion

Real-world listening commonly occurs in complex environments, including those that require individuals to listen and simultaneously be aware of salient visual information. To enhance the ecological validity of hearing sciences research, Keidser et al. (2020) noted that experimenters should consider the relevance of their study design to real-world communication, such as the nature of the stimulus materials and task complexity. This study examined how minor changes in two such factors altered an expected SNR effect on a dual-task measure of listening effort: the lexical frequency of word stimuli (indicative of how likely individuals are to encounter a given word in their daily lives) and multitasking demands (single task, simple dual task, or complex dual task).

Many previous dual-task studies have focused on the impact of acoustic challenge on listening effort (Gagné et al., 2017). Studies that have furthermore investigated associations between acoustic and linguistic or cognitive demands on effort have predominantly used pupillometry (for a review, see Zekveld et al., 2018) or have used dual-task designs but manipulated only two of these dimensions at a time (for a review, see Gagné et al., 2017). This study examined these relationships by manipulating all three dimensions within a dual-task paradigm, which may offer a relatively simple yet appropriate method for examining the impact of these types of multitasking cognitive demands on listening effort.

Dual-Task Design Manipulation Checks

SNR Manipulation and Primary Task Prioritization

The interpretability of dual-task effects depends on a number of assumptions having been met (Fisk et al., 1986). Manipulation checks revealed that the selected SNRs (easier vs. harder) during a primary word recognition task targeted intelligibility levels that were moderately difficult as intended: approximately 70% and 50% correct. Similar levels of acoustic challenge have previously been shown to elicit differences in effort in pupillometry (Ohlenforst et al., 2017; Zekveld et al., 2010) and dual-task studies (Wu et al., 2016). Importantly, this pattern did not differ depending on whether the word recognition task was performed alone or in the presence of a secondary visual task, irrespective of the complexity of that task. Thus, participants appeared to follow instructions to always make the word recognition task their primary focus. Maintaining equivalent performance on the primary task under single- and dual-task conditions is critical for interpreting changes in secondary task performance as an index of effort rather than strategic shifts in attention allocation (Fisk et al., 1986).

Multitasking Manipulation

Manipulation checks also revealed that, as expected, participants responded more slowly on the secondary visual task when performing the primary and secondary tasks together (i.e., multitasking) versus the secondary task alone. This increase in RT indicated that the primary and secondary tasks competed for resources, another assumption of the dual-task paradigm. This multitasking-related RT slowdown was further modulated by whether the visual task was simpler (number detection) or relatively more complex (number categorization), but not necessarily in the direction expected. Contrary to the original hypothesis, a larger multitasking slowdown (dual- vs. single-task secondary responses, irrespective of the primary task) was observed for the simpler detection task, in which logRTs were overall faster, than for the categorization secondary task, in which logRTs were overall much slower.

The effect of number of tasks may have appeared larger in the simpler number detection condition because RTs were overall faster in that condition. The nonlinear nature of the log transformation that was applied to the positively skewed raw RTs prior to analysis (e.g., Wu et al., 2016) means that differences among fast RTs were weighted more heavily than differences among slower ones. As Lo and Andrews (2015) suggested, in cases where a log transformation is justified, parallel analyses on raw RT data can "provide useful complementary insight" (p. 12). Indeed, compared to the logRT analysis, the difference between the raw RTs under single- versus dual-task instructions was much more similar across conditions and, in fact, numerically but not statistically (see Footnote 2) smaller in the simpler (M = 69.53 ms) than in the more complex (M = 84.07 ms) condition. This finding highlights the importance of considering the expected mean (will responses generally be very quick or slow?) and distribution (how skewed might it be?) of RTs for a secondary task in a dual-task design.

Another explanation for the unexpected larger multitasking slowdown with a simpler versus more complex secondary task may be that the more complex task resulted in RTs near ceiling (~500 ms). Especially when dual tasking, it may not have been advantageous for participants to utilize a large portion of the response window allotted for responses to both tasks (a minimum of 1,977 ms). This would require retaining primary task words for longer and leave less time to articulate responses for the primary task, which could be problematic since auditory memory traces decay over just a couple of seconds in the absence of rehearsal (Baddeley, 2012). Had this occurred, participants might have had to store to-be-reported words for up to around 1,500 ms (up to 1,000-ms ISI + ~500 ms to respond to visual stimuli), with word rehearsal potentially hindered by the presence of the visual task. Countering this interpretation, however, the retention length, or ISI, was not found to interact with the effects of acoustic, lexical, and multitasking demands on RTs (although a study designed and powered to look at such an effect may have yielded a different result). Regardless, as researchers consider primary and secondary task selection in dual-task studies, the desire to maximize primary and secondary task interference (to yield the largest effort effects) may need to be balanced with the need to avoid potential ceiling effects on secondary tasks.

Primary Study Hypothesis: Acoustic, Lexical, and Multitasking Demands Interactively Impact Listening Effort

Previous studies that have used the dual-task paradigm have observed that poorer SNRs are associated with worse secondary task performance, indicative of greater listening effort (e.g., Seeman & Sims, 2015), particularly at moderate levels of intelligibility (Wu et al., 2016). This study revealed that the strength of such an SNR effect can be modulated or even erased depending on the lexical properties of the word stimuli and the type of multitasking demands that the secondary task imposes. There is substantial evidence that increasing acoustic demands of a listening task (e.g., by adding more noise or degrading the signal) draws upon more cognitive resources, in line with the ELU model (see reviews by Rönnberg et al., 2013; Wingfield et al., 2015). However, few studies have examined how different levels of cognitive demands imposed by the task (rather than by the acoustic stimuli) impact listening difficulties (Mattys & Wiget, 2011). Dual-task studies that have manipulated multitasking demands via secondary task difficulty have observed mixed results: either no interactions with acoustic demands on effort (Wu et al., 2016) or a larger effect for a complex semantic task (Picou & Ricketts, 2014). This study further highlights that not only may the specific demands and timing of the selected tasks have a major impact on whether dual-task differences in effort are observable (Gagné et al., 2017), but the properties of the selected speech and noise stimuli may also play a role in any noted differences.

Pupillometry and dual-task investigations of linguistic demands on listening effort have commonly examined sentence-level properties such as syntactic complexity (Piquado et al., 2010), syntactic–prosodic mismatches (Engelhardt et al., 2010), and sentence context/predictability (Desjardins & Doherty, 2013, 2014; Sarampalis et al., 2009). One recent pupillometry study also demonstrated subtle effects of dense versus sparse lexical neighborhood density on effort among younger adults (McLaughlin et al., 2022). This study expanded on such work by examining the extent to which the lexical properties of words, modeled as a continuous item-level predictor, altered the expected dual-task SNR effects on listening effort.

Greater effects of lexical demands with higher cognitive load (Mattys & Wiget, 2011) or with poorer SNRs (Kuchinsky et al., 2013) have also been reported. As a result, maximal effort (slowest RTs) was predicted when acoustic, linguistic, and cognitive demands were highest (harder SNR, lowest word frequency, complex number categorization secondary task). Participants were indeed slowest overall to respond when the words were low in lexical frequency and when the secondary task was more complex. However, the expected SNR-driven slowdown in RTs (for a review, see Gagné et al., 2017) was only observed for the more challenging (lower frequency) lexical items and, somewhat counterintuitively, only when the secondary task was simpler. This pattern may have resulted in part from the reduced sensitivity of the more complex task to dual- versus single-task demands as noted in the previous discussion section. Another explanation is that participants had reached a tipping point, with the additional effort required to perform the more cognitively challenging visual task overwhelming the expected SNR effect. This interpretation is aligned with the findings of Zekveld et al. (2019), who observed that at higher levels of memory load, the expected SNR effects on the pupil response were weakened.

The results of this study also bear on methodological considerations for dual-task analyses. To examine the extent to which stimulus properties modulate expected dual-task measures of effort, this study analyzed trial-level lexical effects on dual-task RT measures. Dual-task studies have historically computed average RTs across conditions of interest and calculated dual-task costs by subtracting single- from dual-task responses. Increasingly, however, dual-task studies have employed more powerful mixed-effects modeling of trial-level responses (e.g., Brown et al., 2013, 2020; Colby & McMurray, 2021; Ohlenforst et al., 2017). This study highlights yet another motivation for analyzing trial-level dual-task data within a mixed-effects framework: the ability to model, in a cross-classified structure, how item-level variability across speech stimuli and participant-level variability simultaneously impact dual-task effects. Accounting for item variability may be particularly important in studies of the impact of hearing loss on these effects, which would not be expected to be uniform across words with different spectral densities.
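As a concrete, simplified illustration of this approach, the sketch below shows how a cross-classified model could be specified in R with lme4 (Bates et al., 2015), with random intercepts and slopes for both participants and word stimuli. The data set and variable names are hypothetical, and the random-effects structure is a simplification rather than the exact model reported in the tables.

```r
library(lme4)

# Hypothetical trial-level data: one row per dual-task trial with the
# log-transformed secondary-task RT, SNR condition, continuous (centered)
# lexical frequency, secondary-task type, and participant/word identifiers.
m <- lmer(
  logRT ~ snr * lex_freq * task_type +
    (1 + snr | participant) +  # participant-level intercepts and SNR slopes
    (1 + snr | word),          # item-level (word) intercepts and SNR slopes
  data = dual_task_trials
)
summary(m)
```

Because participants and words are listed as separate grouping factors, their variance components are estimated simultaneously (crossed rather than nested), which is what allows item-level and participant-level variability to be weighed against one another as described above.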

In this study, one property of the speech stimuli was modeled as a factor of interest (i.e., a fixed effect of lexical frequency) based on the common observation that word frequency can influence intelligibility (Dirks et al., 2001) and the ease of lexical processing (Papesh & Goldinger, 2012). However, there are potentially many word properties that contribute to the variability of responses to speech stimuli that were not explicitly examined and were beyond the scope of this study. Stimulus properties may not be of theoretical importance in every study, and thus not every analysis needs to include linguistic predictors as fixed-effect terms. However, whether effects of interest persist despite the inherent variability across stimuli, and thus are likely to generalize to new speech materials, can and should be examined, at a minimum, by including item-level random effects (intercepts and slopes) whenever possible (Baayen et al., 2008; Clark, 1973; Quené & van den Bergh, 2008). Analogously, although researchers might not include fixed effects for participants' personality traits in every model, these and other individual differences have the potential to influence factors of interest; including by-participant random intercepts and slopes helps account for some of that unexplained variance. In some cases, capturing variability across items can be even more consequential than capturing variability across individuals. In this study, RT variability was better accounted for by differences across participants than across word stimuli (random intercept variance in Tables 2 and 3), but the reverse was true for models of accuracy, wherein unexplained variability in accuracy outcomes was better accounted for by differences across words (see Table 1).

Limitations and Future Directions

Though the chosen SNRs elicited speech recognition performance that was overall at the targeted levels, there was a fair amount of variability in accuracy across participants and words. There is often no straightforward answer as to whether a researcher should equate SNRs and allow variance in word recognition performance or equate performance and allow variance in SNRs. Future work may benefit from gaining tighter control over performance by using an adaptive-track SNR procedure for each participant. Fortunately, past research that has collected dual-task measures of effort using both approaches (equated SNRs vs. adaptively equated performance across participants) when examining age effects has yielded the same pattern of results (Anderson Gosselin & Gagné, 2011a, 2011b). Given that this study examined younger adults screened for normal hearing thresholds, one might expect similar outcomes had an adaptive procedure been used, though future work could test this explicitly.
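For readers less familiar with adaptive tracking, the sketch below illustrates one generic option, a one-up, one-down staircase that converges near 50% correct. It is not the procedure used in this study, and the starting SNR, step size, and simulated accuracy are arbitrary assumptions for illustration only.

```r
# Generic one-up, one-down adaptive SNR track (converges near 50% correct).
# Starting SNR, step size, and simulated accuracy are arbitrary assumptions.
adapt_snr <- function(correct, start_snr = 0, step_db = 2) {
  snr <- numeric(length(correct))
  snr[1] <- start_snr
  for (t in seq_along(correct)[-1]) {
    # Harder (lower SNR) after a correct response; easier (higher SNR) after an error
    snr[t] <- snr[t - 1] + if (correct[t - 1]) -step_db else step_db
  }
  snr
}

set.seed(1)
adapt_snr(runif(20) < .7)  # simulate a listener who is correct on ~70% of trials
```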

A limitation of any dual-task measure is that several assumptions must be met to interpret the results as an indicator of effort (Fisk et al., 1986) and that the nature and timing of the selected tasks can impact the size of the obtained effects (Gagné et al., 2017). In this study, dual-task assumptions were examined, for example, by showing that individuals appeared to follow instructions, as evidenced by unchanging primary task performance across conditions. However, it can be challenging, if not impossible, to independently verify that participants are not subtly shifting their attention and are fully utilizing their available resource capacities for the two tasks (e.g., McGarrigle et al., 2014).

It is important to acknowledge that the results of this study may depend on the details of the experimental paradigm as well as on the speech (i.e., CVC words) and noise (i.e., speech-spectrum noise) materials used. Because it can be difficult to know a priori whether a set of primary and secondary tasks will engage the same set of perceptual and/or cognitive processes, the current design was based largely on the established literature. Indeed, researchers have struggled to develop taxonomies of the myriad variations of the dual-task paradigm, which may limit the ability to generalize findings across studies with categorically different designs (Esmaeili Bijarsari, 2021). Selecting different primary and secondary tasks or changing their timing relative to one another could further alter the pattern of obtained results. More specifically, following Wollesen et al. (2019), this study employed both a “reaction time” number detection and a “controlled process” parity judgment secondary task, which may be expected to yield different patterns of results than a “discrimination” (e.g., go/no-go or Stroop) or “working memory” (e.g., n-back) design.

The current experiment does not resolve these considerations inherent in the dual-task paradigm. Rather, it highlights that the linguistic content of the speech stimuli is itself an important dual-task design consideration. Future studies should also explore the extent to which similar results are obtained using sentence stimuli and other types of energetic and informational noise maskers, which are known to differentially impact listening effort (Koelewijn et al., 2014; Stenbäck et al., 2021). For example, in a study of informational masking, Stenbäck et al. (2022) observed that effects of cognition on listening effort were eliminated once hearing thresholds were accounted for.

Lastly, this study was limited in that it only tested younger adults with normal hearing sensitivity. Larger effects of auditory, linguistic, and cognitive manipulations might be predicted with older adults with and without hearing loss. Increased reliance on top-down mechanisms is commonly thought to underlie observed effort and age-related changes in speech processing (Alain et al., 2004; Wingfield & Grossman, 2006). While there is evidence that lexical processing proceeds more automatically with aging due to greater language experience (Lien et al., 2006), older adults appear to depend more on lexical information to fill in missing information under greater acoustic and task demands (Revill & Spieler, 2012).

Conclusions

This study investigated the extent to which auditory, lexical, and multitasking demands interactively impact listening effort as measured with the dual-task paradigm in younger, normal-hearing adults. It aimed to extend the literature by examining interactions among all three types of demands in within-subjects analyses, modeling the properties of words (rather than sentences) as a continuous item-level predictor. A significant three-way interaction was observed, such that the expected effect of SNR on effort was only observable for words that appear less commonly in English (lower lexical frequency) and only when multitasking demands were relatively simpler (visual detection vs. categorization secondary task). In line with previous work, this finding suggests that under higher cognitive demands, there may be a reduced ability to detect expected effects of SNR on effort. This study also identified an additional source of variance, namely, the properties of the speech stimuli, that can influence the sensitivity of the dual-task paradigm for detecting changes in listening effort. Relevant to real-world listening, in which acoustic, linguistic, and cognitive demands are often quite high, these findings highlight that even relatively minor changes in lexical and multitasking demands can alter the effort devoted to listening in noise. Future work should continue to explore the impact of a wider range of interactive acoustic, linguistic, and cognitive demands on effort, particularly for older adults who may experience declines in such functions.

Data Availability Statement

The data and analysis R code and markdown file that support the findings of this study are available via the Open Science Framework at https://osf.io/pd84w/.

Acknowledgments

This work was supported in part by National Institute on Deafness and Other Communication Disorders Grant R03 DC015059 (awarded to Stefanie E. Kuchinsky). The authors would like to thank the participants for the time they devoted to this study. They also thank Geetika Reichman for assistance with data collection. The identification of specific products or scientific instrumentation is considered an integral part of the scientific endeavor and does not constitute endorsement or implied endorsement on the part of the authors, Department of Defense (DoD), or any component agency. The views expressed in this article are those of the authors and do not reflect the official policy of the Department of Army/Navy/Air Force, the DoD, or the U.S. Government.

Funding Statement

This work was supported in part by National Institute on Deafness and Other Communication Disorders Grant R03 DC015059 (awarded to Stefanie E. Kuchinsky).

Footnotes

1. Using an alternative measure of neighborhood density available in the ELP database (Phono_N_H), there were no statistically significant differences between any lists (M = 21.71, SD = 2.41).

2. To better understand this interaction on log-transformed data (see Lo & Andrews, 2015, p. 12), which was in the opposite direction from what had been predicted, a parallel post hoc linear MEM was conducted on raw RTs. The interaction was nonsignificant (p > .05). However, due in part to the model residuals being highly skewed, model fit was much poorer (marginal R2 = .46, conditional R2 = .65) than that of the model conducted on log RTs in Table 2.

References

  1. Alain, C., McDonald, K. L., Ostroff, J. M., & Schneider, B. (2004). Aging: A switch from automatic to controlled processing of sounds? Psychology and Aging, 19(1), 125–133. 10.1037/0882-7974.19.1.125 [DOI] [PubMed] [Google Scholar]
  2. Alhanbali, S., Dawes, P., Millman, R. E., & Munro, K. J. (2019). Measures of listening effort are multidimensional. Ear and Hearing, 40(5), 1084–1097. 10.1097/AUD.0000000000000697 [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Allen, P. A., Smith, A. F., Vires-Collins, H., & Sperry, S. (1998). The psychological refractory period: Evidence for age differences in attentional time-sharing. Psychology and Aging, 13(2), 218–229. 10.1037/0882-7974.13.2.218 [DOI] [PubMed] [Google Scholar]
  4. Anderson Gosselin, P., & Gagné, J.-P. (2011a). Older adults expend more listening effort than young adults recognizing speech in noise. Journal of Speech, Language, and Hearing Research, 54(3), 944–958. 10.1044/1092-4388(2010/10-0069) [DOI] [PubMed] [Google Scholar]
  5. Anderson Gosselin, P., & Gagné, J.-P. (2011b). Older adults expend more listening effort than young adults recognizing audiovisual speech in noise. International Journal of Audiology, 50(11), 786–792. 10.3109/14992027.2011.599870 [DOI] [PubMed] [Google Scholar]
  6. Ayasse, N. D., & Wingfield, A. (2018). A tipping point in listening effort: Effects of linguistic complexity and age-related hearing loss on sentence comprehension. Trends in Hearing, 22. 10.1177/2331216518790907 [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412. 10.1016/j.jml.2007.12.005 [DOI] [Google Scholar]
  8. Baddeley, A. (2012). Working memory: Theories, models, and controversies. Annual Review of Psychology, 63(1), 1–29. 10.1146/annurev-psych-120710-100422 [DOI] [PubMed] [Google Scholar]
  9. Balota, D. A., Yap, M. J., Hutchison, K. A., Cortese, M. J., Kessler, B., Loftis, B., Neely, J. H., Nelson, D. L., Simpson, G. B., & Treiman, R. (2007). The English lexicon project. Behavior Research Methods, 39(3), 445–459. 10.3758/BF03193014 [DOI] [PubMed] [Google Scholar]
  10. Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. 10.1016/j.jml.2012.11.001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. 10.18637/jss.v067.i01 [DOI] [Google Scholar]
  12. Becker, C. A., & Killion, T. H. (1977). Interaction of visual and cognitive effects in word recognition. Journal of Experimental Psychology: Human Perception and Performance, 3(3), 389–401. 10.1037/0096-1523.3.3.389 [DOI] [Google Scholar]
  13. Blumstein, S. E. (2009). Auditory word recognition: Evidence from aphasia and functional neuroimaging. Language and Linguistics Compass, 3(4), 824–838. 10.1111/j.1749-818X.2009.00136.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Boersma, P., & Weenink, D. (2016). Praat: Doing phonetics by computer [Computer program]. Version 6.0.14. http://www.praat.org
  15. Brown, V. A., McLaughlin, D. J., Strand, J. F., & Van Engen, K. J. (2020). Rapid adaptation to fully intelligible nonnative-accented speech reduces listening effort. Quarterly Journal of Experimental Psychology, 73(9), 1431–1443. 10.1177/1747021820916726 [DOI] [PubMed] [Google Scholar]
  16. Brown, V. A., Strand, J. F., & Van Engen, K. J. (2013). Summary for policymakers. Journal of Chemical Information and Modeling, 53(9), 1689–1699. 10.1017/CBO9781107415324.00423800267 [DOI] [Google Scholar]
  17. Clark, H. H. (1973). The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. Journal of Verbal Learning and Verbal Behavior, 12(4), 335–359. 10.1016/S0022-5371(73)80014-3 [DOI] [Google Scholar]
  18. Colby, S., & McMurray, B. (2021). Cognitive and physiological measures of listening effort during degraded speech perception: Relating dual-task and pupillometry paradigms. Journal of Speech, Language, and Hearing Research, 64(9), 3627–3652. 10.1044/2021_JSLHR-20-00583 [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Desjardins, J. L., & Doherty, K. A. (2013). Age-related changes in listening effort for various types of masker noises. Ear and Hearing, 34(3), 261–272. 10.1097/AUD.0b013e31826d0ba4 [DOI] [PubMed] [Google Scholar]
  20. Desjardins, J. L., & Doherty, K. A. (2014). The effect of hearing aid noise reduction on listening effort in hearing-impaired adults. Ear and Hearing, 35(6), 600–610. 10.1097/AUD.0000000000000028 [DOI] [PubMed] [Google Scholar]
  21. Dirks, D. D., Takayana, S., & Moshfegh, A. (2001). Effects of lexical factors on word recognition among normal-hearing and hearing-impaired listeners. Journal of the American Academy of Audiology, 12(5), 233–244. 10.1055/s-0042-1745602 [DOI] [PubMed] [Google Scholar]
  22. Downs, D. W. (1982). Effects of hearing aid use on speech discrimination and listening effort. The Journal of Speech and Hearing Disorders, 47(2), 189–193. 10.1044/jshd.4702.189 [DOI] [PubMed] [Google Scholar]
  23. Edwards, B. (2020). Emerging technologies, market segments, and MarkeTrak 10 insights in hearing health technology. Seminars in Hearing, 41(01), 037–054. 10.1055/s-0040-1701244 [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Engelhardt, P. E., Ferreira, F., & Patsenko, E. G. (2010). Pupillometry reveals processing load during spoken language comprehension. Quarterly Journal of Experimental Psychology, 63(4), 639–645. 10.1080/17470210903469864 [DOI] [PubMed] [Google Scholar]
  25. Esmaeili Bijarsari, S. (2021). A current view on dual-task paradigms and their limitations to capture cognitive load. Frontiers in Psychology, 12, Article 648586. 10.3389/fpsyg.2021.648586 [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Fisk, A. D., Derrick, W. L., & Schneider, W. (1986). A methodological assessment and evaluation of dual-task paradigms. Current Psychology, 5(4), 315–327. 10.1007/BF02686599 [DOI] [Google Scholar]
  27. Fraser, S., Gagné, J.-P., Alepins, M., & Dubois, P. (2010). Evaluating the effort expended to understand speech in noise using a dual-task paradigm: The effects of providing visual speech cues. Journal of Speech, Language, and Hearing Research, 53(1), 18–33. 10.1044/1092-4388(2009/08-0140) [DOI] [PubMed] [Google Scholar]
  28. Gagné, J.-P., Besser, J., & Lemke, U. (2017). Behavioral assessment of listening effort using a dual-task paradigm. Trends in Hearing, 21. 10.1177/2331216516687287 [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Goode, K., & Rey, K. (2019). ggResidpanel: Panels and interactive versions of diagnostic plots using “ggplot2” (R package version 0.3.0). https://cran.r-project.org/package=ggResidpanel
  30. Gordon, K. R. (2019). How mixed-effects modeling can advance our understanding of learning and memory and improve clinical and educational practice. Journal of Speech, Language, and Hearing Research, 62(3), 507–524. 10.1044/2018_JSLHR-L-ASTM-18-0240 [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Gueorguieva, R., & Krystal, J. H. (2004). Move over ANOVA: Progress in analyzing repeated-measures data and its reflection in papers published in the Archives of General Psychiatry. Archives of General Psychiatry, 61(3), 310–317. 10.1001/archpsyc.61.3.310 [DOI] [PubMed] [Google Scholar]
  32. Hartley, A. A. (2001). Age differences in dual-task interference are localized to response-generation processes. Psychology and Aging, 16(1), 47–54. 10.1037/0882-7974.16.1.47 [DOI] [PubMed] [Google Scholar]
  33. Hicks, C. B., & Tharpe, A. M. (2002). Listening effort and fatigue in school-age children with and without hearing loss. Journal of Speech, Language, and Hearing Research, 45(3), 573–584. 10.1044/1092-4388(2002/046) [DOI] [PubMed] [Google Scholar]
  34. Kahneman, D. (1973). Attention and effort. The American Journal of Psychology, 88(2), 339–340. 10.2307/1421603 [DOI] [Google Scholar]
  35. Karatekin, C., Couperus, J. W., & Marcus, D. J. (2004). Attention allocation in the dual-task paradigm as measured through behavioral and psychophysiological responses. Psychophysiology, 41(2), 175–185. 10.1111/j.1469-8986.2004.00147.x [DOI] [PubMed] [Google Scholar]
  36. Keidser, G., Naylor, G., Brungart, D. S., Caduff, A., Campos, J., Carlile, S., Carpenter, M. G., Grimm, G., Hohmann, V., Holube, I., Launer, S., Lunner, T., Mehra, R., Rapport, F., Slaney, M., & Smeds, K. (2020). The quest for ecological validity in hearing science: What it is, why it matters, and how to advance it. Ear and Hearing, 41(Suppl. 1), 5S–19S. 10.1097/AUD.0000000000000944 [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Koch, I., Poljac, E., Müller, H., & Kiesel, A. (2018). Cognitive structure, flexibility, and plasticity in human multitasking–An integrative review of dual-task and task-switching research. Psychological Bulletin, 144(6), 557–583. 10.1037/bul0000144 [DOI] [PubMed] [Google Scholar]
  38. Koelewijn, T., Zekveld, A. A., Festen, J. M., & Kramer, S. E. (2014). The influence of informational masking on speech perception and pupil response in adults with hearing impairment. The Journal of the Acoustical Society of America, 135(3), 1596–1606. 10.1121/1.4863198 [DOI] [PubMed] [Google Scholar]
  39. Kuchinsky, S. E., Ahlstrom, J. B., Vaden, K. I., Cute, S. L., Humes, L. E., Dubno, J. R., & Eckert, M. A. (2013). Pupil size varies with word listening and response selection difficulty in older adults with hearing loss. Psychophysiology, 50(1), 23–34. 10.1111/j.1469-8986.2012.01477.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Lenth, R. V. (2016). Least-squares means: The R Package lsmeans. Journal of Statistical Software, 69(1), 1–33. 10.18637/jss.v069.i01 [DOI] [Google Scholar]
  41. Lien, M. C., Allen, P. A., Ruthruff, E., Grabbe, J., McCann, R. S., & Remington, R. W. (2006). Visual word recognition without central attention: Evidence for greater automaticity with advancing age. Psychology and Aging, 21(3), 431–447. 10.1037/0882-7974.21.3.431 [DOI] [PubMed] [Google Scholar]
  42. Lo, S., & Andrews, S. (2015). To transform or not to transform: Using generalized linear mixed models to analyse reaction time data. Frontiers in Psychology, 6, Article 1171. 10.3389/fpsyg.2015.01171 [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19(1), 1–36. 10.1097/00003446-199802000-00001 [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Luke, S. G. (2017). Evaluating significance in linear mixed-effects models in R. Behavior Research Methods, 49(4), 1494–1502. 10.3758/s13428-016-0809-y [DOI] [PubMed] [Google Scholar]
  45. Lund, K., & Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, 28(2), 203–208. 10.3758/BF03204766 [DOI] [Google Scholar]
  46. Lüdecke, D., Makowski, D., & Waggoner, P. (2019). performance: Assessment of regression models performance (R package version 0.3.0). https://cran.r-project.org/web/packages/performance/
  47. Marslen-Wilson, W. D. (1987). Functional parallelism in spoken word-recognition. Cognition, 25(1–2), 71–102. 10.1016/0010-0277(87)90005-9 [DOI] [PubMed] [Google Scholar]
  48. Mattys, S. L., Davis, M. H., Bradlow, A. R., & Scott, S. K. (2012). Speech recognition in adverse conditions: A review. Language and Cognitive Processes, 27(7–8), 953–978. 10.1080/01690965.2012.705006 [DOI] [Google Scholar]
  49. Mattys, S. L., & Wiget, L. (2011). Effects of cognitive load on speech recognition. Journal of Memory and Language, 65(2), 145–160. 10.1016/j.jml.2011.04.004 [DOI] [Google Scholar]
  50. Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315. 10.1016/j.jml.2017.01.001 [DOI] [Google Scholar]
  51. McGarrigle, R., Munro, K. J., Dawes, P., Stewart, A. J., Moore, D. R., Barry, J. G., & Amitay, S. (2014). Listening effort and fatigue: What exactly are we measuring? A British society of audiology cognition in hearing special interest group ‘white paper.’ International Journal of Audiology, 53(7), 433–445. 10.3109/14992027.2014.890296 [DOI] [PubMed] [Google Scholar]
  52. McLaughlin, D. J., Zink, M. E., Gaunt, L., Spehar, B., van Engen, K. J., Sommers, M. S., & Peelle, J. E. (2022). Pupillometry reveals cognitive demands of lexical competition during spoken word recognition in young and older adults. Psychonomic Bulletin and Review, 29(1), 268–280. 10.3758/s13423-021-01991-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Meteyard, L., & Davies, R. A. I. (2020). Best practice guidance for linear mixed-effects models in psychological science. Journal of Memory and Language, 112, Article 104092. 10.1016/j.jml.2020.104092 [DOI] [Google Scholar]
  54. Nakagawa, S., Johnson, P. C. D., & Schielzeth, H. (2017). The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded. Journal of the Royal Society Interface, 14(134), Article 20170213. 10.1098/rsif.2017.0213 [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Ohlenforst, B., Zekveld, A. A., Lunner, T., Wendt, D., Naylor, G., Wang, Y., Versfeld, N. J., & Kramer, S. E. (2017). Impact of stimulus-related factors and hearing impairment on listening effort as indicated by pupil dilation. Hearing Research, 351, 68–79. 10.1016/j.heares.2017.05.012 [DOI] [PubMed] [Google Scholar]
  56. Pandža, N. B., Phillips, I., Karuzis, V. P., O'Rourke, P., & Kuchinsky, S. E. (2020). Neurostimulation and pupillometry: New directions for learning and research in applied linguistics. Annual Review of Applied Linguistics, 40, 56–77. 10.1017/S0267190520000069 [DOI] [Google Scholar]
  57. Papesh, M. H., & Goldinger, S. D. (2012). Pupil-blah-metry: Cognitive effort in speech planning reflected by pupil dilation. Attention, Perception & Psychophysics, 74(4), 754–765. 10.3758/s13414-011-0263-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Phatak, S. A., Brungart, D. S., Zion, D. J., & Grant, K. W. (2019). Clinical assessment of functional hearing deficits: Speech-in-noise performance. Ear and Hearing, 40(2), 426–436. 10.1097/AUD.0000000000000635 [DOI] [PubMed] [Google Scholar]
  59. Phillips, I., Calloway, R. C., Karuzis, V. P., Pandža, N. B., O'Rourke, P., & Kuchinsky, S. E. (2021). Transcutaneous auricular vagus nerve stimulation strengthens semantic representations of foreign language tone words during initial stages of learning. Journal of Cognitive Neuroscience, 34(1), 127–152. 10.1162/jocn_a_01783 [DOI] [PubMed] [Google Scholar]
  60. Pichora-Fuller, M. K., Kramer, S. E., Eckert, M. A., Edwards, B., Hornsby, B. W. Y., Humes, L. E., Lemke, U., Lunner, T., Matthen, M., Mackersie, C. L., Naylor, G., Phillips, N. A., Richter, M., Rudner, M., Sommers, M. S., Tremblay, K. L., & Wingfield, A. (2016). Hearing impairment and cognitive energy: The framework for understanding effortful listening (FUEL). Ear and Hearing, 37(1), 5S–27S. 10.1097/AUD.0000000000000312 [DOI] [PubMed] [Google Scholar]
  61. Picou, E. M., & Ricketts, T. A. (2014). The effect of changing the secondary task in dual-task paradigms for measuring listening effort. Ear and Hearing, 35(6), 611–622. 10.1097/AUD.0000000000000055 [DOI] [PubMed] [Google Scholar]
  62. Piquado, T., Isaacowitz, D., & Wingfield, A. (2010). Pupillometry as a measure of cognitive effort in younger and older adults. Psychophysiology, 47(3), 560–569. 10.1111/j.1469-8986.2009.00947.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Quené, H., & van den Bergh, H. (2008). Examples of mixed-effects modeling with crossed random effects and with binomial data. Journal of Memory and Language, 59(4), 413–425. 10.1016/j.jml.2008.02.002 [DOI] [Google Scholar]
  64. Rabbitt, P. (1966). Recognition: Memory for words correctly heard in noise. Psychonomic Science, 6(8), 383–384. 10.3758/BF03330948 [DOI] [Google Scholar]
  65. Ratcliff, R., & Murdock, B. B. (1976). Retrieval processes in recognition memory. Psychological Review, 83(3), 190–214. 10.1037/0033-295X.83.3.190 [DOI] [Google Scholar]
  66. R Core Team. (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing. http://www.r-project.org [Google Scholar]
  67. Revill, K. P., & Spieler, D. H. (2012). The effect of lexical frequency on spoken word recognition in young and older listeners. Psychology and Aging, 27(1), 80–87. 10.1037/a0024113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Rönnberg, J., Lunner, T., Zekveld, A., Sörqvist, P., Danielsson, H., Lyxell, B., Dahlström, Ö., Signoret, C., Stenfelt, S., Pichora-Fuller, M. K., & Rudner, M. (2013). The ease of language understanding (ELU) model: Theoretical, empirical, and clinical advances. Frontiers in Systems Neuroscience, 7, Article 31. 10.3389/fnsys.2013.00031 [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Rönnberg, J., Rudner, M., Foo, C., & Lunner, T. (2008). Cognition counts: A working memory system for ease of language understanding (ELU). International Journal of Audiology, 47(Suppl. 2), S99–S105. 10.1080/14992020802301167 [DOI] [PubMed] [Google Scholar]
  70. Sarampalis, A., Kalluri, S., Edwards, B., & Hafter, E. (2009). Objective measures of listening effort: Effects of background noise and noise reduction. Journal of Speech, Language, and Hearing Research, 52(5), 1230–1240. 10.1044/1092-4388(2009/08-0111) [DOI] [PubMed] [Google Scholar]
  71. Seeman, S., & Sims, R. (2015). Comparison of psychophysiological and dual-task measures of listening effort. Journal of Speech, Language, and Hearing Research, 58(6), 1781–1792. 10.1044/2015_JSLHR-H-14-0180 [DOI] [PubMed] [Google Scholar]
  72. Stenbäck, V., Marsja, E., Hällgren, M., Lyxell, B., & Larsby, B. (2021). The contribution of age, working memory capacity, and inhibitory control on speech recognition in noise in young and older adult listeners. Journal of Speech, Language, and Hearing Research, 64(11), 4513–4523. 10.1044/2021_JSLHR-20-00251 [DOI] [PubMed] [Google Scholar]
  73. Stenbäck, V., Marsja, E., Hällgren, M., Lyxell, B., & Larsby, B. (2022). Informational masking and listening effort in speech recognition in noise: The role of working memory capacity and inhibitory control in older adults with and without hearing impairment. Journal of Speech, Language, and Hearing Research, 65(11), 4417–4428. 10.1044/2022_JSLHR-21-00674 [DOI] [PubMed] [Google Scholar]
  74. Strand, J. F., Brown, V. A., Merchant, M. B., Brown, H. E., & Smith, J. (2018). Measuring listening effort: Convergent validity, sensitivity, and links with cognitive and personality measures. Journal of Speech, Language, and Hearing Research, 61(6), 1463–1486. 10.1044/2018_JSLHR-H-17-0257 [DOI] [PubMed] [Google Scholar]
  75. Strauss, D. J., & Francis, A. L. (2017). Toward a taxonomic model of attention in effortful listening. Cognitive, Affective, & Behavioral Neuroscience, 17(4), 809–825. 10.3758/s13415-017-0513-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Visentin, C., Valzolgher, C., Pellegatti, M., Potente, P., Pavani, F., & Prodi, N. (2022). A comparison of simultaneously-obtained measures of listening effort: Pupil dilation, verbal response time and self-rating. International Journal of Audiology, 61(7), 561–573. 10.1080/14992027.2021.1921290 [DOI] [PubMed] [Google Scholar]
  77. Voeten, C. C. (2019). buildmer: Stepwise elimination and term reordering for mixed-effects regression (R package version 1.1). https://cran.r-project.org/package=buildmer
  78. Wendt, D., Dau, T., & Hjortkjær, J. (2016). Impact of background noise and sentence complexity on processing demands during sentence comprehension. Frontiers in Psychology, 7, Article 345. 10.3389/fpsyg.2016.00345 [DOI] [PMC free article] [PubMed] [Google Scholar]
  79. Wilson, R. H., Carnell, C. S., & Cleghorn, A. L. (2007). The words-in-noise (WIN) test with multitalker babble and speech-spectrum noise maskers. Journal of the American Academy of Audiology, 18(06), 522–529. 10.3766/jaaa.18.6.7 [DOI] [PubMed] [Google Scholar]
  80. Wilson, R. H., & McArdle, R. (2015). The homogeneity with respect to intelligibility of recorded word-recognition materials. Journal of the American Academy of Audiology, 26(4), 331–345. 10.3766/jaaa.26.4.2 [DOI] [PubMed] [Google Scholar]
  81. Wingfield, A., Amichetti, N. M., & Lash, A. (2015). Cognitive aging and hearing acuity: Modeling spoken language comprehension. Frontiers in Psychology, 6, Article 684. 10.3389/fpsyg.2015.00684 [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Wingfield, A., & Grossman, M. (2006). Language and the aging brain: Patterns of neural compensation revealed by functional brain imaging. Journal of Neurophysiology, 96(6), 2830–2839. 10.1152/jn.00628.2006 [DOI] [PubMed] [Google Scholar]
  83. Winn, M. B. (2016). Rapid release from listening effort resulting from semantic context, and effects of spectral degradation and cochlear implants. Trends in Hearing, 20. 10.1177/2331216516669723 [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Winn, M. B., & Teece, K. H. (2021). Listening effort is not the same as speech intelligibility score. Trends in Hearing, 25. 10.1177/23312165211027688 [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Winn, M. B., Wendt, D., Koelewijn, T., & Kuchinsky, S. E. (2018). Best practices and advice for using pupillometry to measure listening effort: An introduction for those who want to get started. Trends in Hearing, 22. 10.1177/2331216518800869 [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Wollesen, B., Wanstrath, M., Van Schooten, K. S., & Delbaere, K. (2019). A taxonomy of cognitive tasks to evaluate cognitive-motor interference on spatiotemporal gait parameters in older people: A systematic review and meta-analysis. European Review of Aging and Physical Activity, 16(1), 12–27. 10.1186/s11556-019-0218-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Wu, Y.-H., Stangl, E., Zhang, X., Perkins, J., & Eilers, E. (2016). Psychometric functions of dual-task paradigms for measuring listening effort. Ear and Hearing, 37(6), 660–670. 10.1097/AUD.0000000000000335 [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Yap, M. J., & Balota, D. A. (2007). Additive and interactive effects on response time distributions in visual word recognition. Journal of Experimental Psychology: Learning Memory and Cognition, 33(2), 274–296. 10.1037/0278-7393.33.2.274 [DOI] [PubMed] [Google Scholar]
  89. Zekveld, A. A., Koelewijn, T., & Kramer, S. E. (2018). The pupil dilation response to auditory stimuli: Current state of knowledge. Trends in Hearing, 22. 10.1177/2331216518777174 [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Zekveld, A. A., Kramer, S. E., & Festen, J. M. (2010). Pupil response as an indication of effortful listening: The influence of sentence intelligibility. Ear and Hearing, 31(4), 480–490. 10.1097/AUD.0b013e3181d4f251 [DOI] [PubMed] [Google Scholar]
  91. Zekveld, A. A., Kramer, S. E., Rönnberg, J., & Rudner, M. (2019). In a concurrent memory and auditory perception task, the pupil dilation response is more sensitive to memory load than to auditory stimulus characteristics. Ear and Hearing, 40(2), 272–286. 10.1097/AUD.0000000000000612 [DOI] [PMC free article] [PubMed] [Google Scholar]
