Abstract
Estimating cognitive workload levels is an emerging research topic in cognitive neuroscience, as participants’ performance is strongly influenced by cognitive overload or underload. Different physiological measures such as electroencephalography (EEG), functional magnetic resonance imaging (fMRI), functional near-infrared spectroscopy (fNIRS), respiratory activity, and eye activity are used to estimate workload levels with the help of machine learning or deep learning techniques. Existing reviews focus only on EEG-based workload estimation using machine learning classifiers or on multimodal fusion of different physiological measures. However, a detailed analysis of all physiological measures for estimating cognitive workload levels is still lacking. Thus, this survey presents an in-depth analysis of the physiological measures used for assessing cognitive workload. It covers the basics of cognitive workload, open-access datasets, the experimental paradigms of cognitive tasks, and different measures for estimating workload levels. Lastly, we summarize the significant findings from this review, identify the open challenges, and outline future directions for researchers to overcome those challenges.
Keywords: Electroencephalography, Cognitive workload, Convolutional neural network, Long short-term memory
Introduction
“Cognitive workload” is a theoretical construct representing the load imposed on an individual’s cognitive system while performing a particular task (Wickens et al. 2004). The workload includes perception, attention, memory, and decision-making processes. The workload level can influence the performance of an individual. Excessively high and low cognitive workload levels potentially lead to decreased performance and increased potential for error (Yeh and Wickens 1988). Therefore, cognitive workload assessment is essential in evaluating operator performance while accomplishing the cognitive task. According to Gevins (2006), cognitive workload combines the two components: task difficulty and mental effort associated with task difficulty. The quantity of engaged cognitive resources represents the task difficulty (Roy et al. 2016). The engagement of cognitive resources during the task depends on several factors, such as short-term/working memory load for a specific task, the number of tasks/items processed simultaneously, and the task execution speed (i.e., temporal pressure) (Roy et al. 2016). Subjective, performance, and physiological measures can estimate the workload level of a participant. This section provides a general overview of workload-related topics such as (a) cognitive workload and (b) different measures for estimating cognitive workload. A detailed description of each subsection is mentioned below.
Cognitive workload
The cognitive workload can be defined as the physical and/or mental requirements associated with a task or combination of tasks (Chakladar et al. 2021). In cognitive psychology, cognitive workload denotes the amount of working memory resources needed to accomplish the task (Sweller 1988). Working memory, which has a limited capacity, holds information temporarily (Ericsson et al. 1999). Working memory is often called short-term memory, and it is related to controlling human decision-making and behavior [8]. The cognitive workload (or cognitive load) is divided into three categories: (a) intrinsic cognitive load, (b) extraneous cognitive load, and (c) germane cognitive load (Cook et al. 2017). Each category is described below.
Intrinsic cognitive load This type of cognitive load is induced by the inherent nature of the task, and it is determined by the number of items to be remembered (Cook et al. 2017).
Extraneous cognitive load The extraneous cognitive load refers to external factors like time pressure, noise, and situation. This type of cognitive load varies with the time gap between stimuli presentation (Cook et al. 2017).
Germane cognitive load The germane cognitive load denotes the load placed on working memory during schema formation and automation, and it depends on the experimental design (Cook et al. 2017). This load increases as workload complexity in long-term memory expands.
The schematic diagram of workload assessment with the help of different cognitive loads is shown in Fig. 1. The workload measurement depends on individual perception (i.e., intrinsic cognitive load), working environment, and task demands (i.e., extraneous cognitive load). Workload assessment is performed jointly using behavioral analysis and workload classification results.
Fig. 1.

Schematic diagram of workload measurement. Task performance is estimated based on the subject’s behavior towards the task (based on the subjective measure) and workload levels are classified by machine/deep learning classifier
The cognitive workload is evaluated using three metrics, namely: (i) subjective measures, (ii) performance measures, and (iii) physiological measures. A detailed description of each measure is written in the following sections.
Subjective measures
In subjective measures, the workload level is defined using quantitative rankings of the user’s feelings (Rubio et al. 2004). These measures are designed as interlinked question-answer-type responses (Rubio et al. 2004). Subjective rating scales are divided into two groups (Longo 2015): (1) uni-dimensional scales, which have only one dimension (e.g., Modified Cooper-Harper (Cooper and Harper 1969) or the Subjective Workload Dominance technique (Vidulich et al. 1991)), and (2) multidimensional scales (e.g., the National Aeronautics and Space Administration (NASA) Task Load Index: NASA-TLX (Hart and Staveland 1988) or the Subjective Workload Assessment Technique: SWAT (Reid and Nygren 1988)). Among all the rating-based subjective measures, NASA-TLX and SWAT are widely used tools for workload assessment (Rubio et al. 2004). NASA-TLX and SWAT are described below.
NASA-TLX It is the most popular subjective measurement tool used for estimating the cognitive workload of a person (Hart and Staveland 1988). In the NASA-TLX subjective measure, the total workload is divided into six subscales: mental demands, physical demands, temporal demands, performance, effort, and frustration (Group 1988). Each subscale is divided into 20 equal intervals ranging from a very low to a very high workload index (Group 1988). Each interval in the subscale is assigned 5 points; therefore, 100 points (20 × 5) are assigned to each subscale (Group 1988). The rating of each subscale is multiplied by the weight of that subscale for the task. Finally, the sum of the weighted ratings is divided by 15 (the sum of the weights) to get the final weighted score of a subject for the specific task (Group 1988).
SWAT In this workload assessment technique, subjects are asked to rate the workload of a task based on the three dimensions: time load, mental effort load, and psychological stress load (Rubio et al. 2004; Reid and Nygren 1988). Time load highlights the amount of time available in planning, executing, and monitoring a task, whereas mental effort assesses how much conscious mental effort and planning are required to perform a task. Psychological stress load measures the amounts of risk, confusion, frustration, and anxiety associated with task performance (Reid and Nygren 1988).
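The NASA-TLX weighted score described above can be illustrated with a minimal sketch. The function name `tlx_score`, the dictionary layout, and the example ratings and weights are hypothetical and chosen only for illustration; the only facts taken from the text are the six subscales, the 0 to 100 rating range, and the weights summing to 15.

```python
# Six NASA-TLX subscales as listed in the text.
SUBSCALES = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

def tlx_score(ratings, weights):
    """Weighted NASA-TLX score: sum(rating * weight) / 15.

    ratings: subscale -> rating on a 0..100 scale (illustrative layout).
    weights: subscale -> task-specific weight; the weights must sum to 15.
    """
    if sum(weights[name] for name in SUBSCALES) != 15:
        raise ValueError("subscale weights must sum to 15")
    for name in SUBSCALES:
        if not 0 <= ratings[name] <= 100:
            raise ValueError(f"rating for {name} must lie in 0..100")
    weighted_sum = sum(ratings[name] * weights[name] for name in SUBSCALES)
    return weighted_sum / 15

# Hypothetical ratings and weights for one participant and one task.
example_ratings = {"mental": 70, "physical": 10, "temporal": 55,
                   "performance": 40, "effort": 60, "frustration": 35}
example_weights = {"mental": 5, "physical": 0, "temporal": 3,
                   "performance": 2, "effort": 4, "frustration": 1}
print(tlx_score(example_ratings, example_weights))  # 870 / 15 = 58.0
```

A subscale with weight 0 (here, physical demand) does not contribute to the final score, which is how the weighting adapts the index to the task.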
Performance measures
In the performance measures, the workload levels are evaluated from a person’s performance during the task as task complexity increases (Veltman and Jansen 2005). The performance measures are categorized into two subparts: primary and secondary measures (Wickens 1979). In the primary measure, the workload of the main task is evaluated. In the secondary measure, the workload level of a concurrent task performed in parallel with the primary task is evaluated. Primary task measurements do not change in low workload situations (Miller 2001). The main drawback of the primary task measure is its limited diagnostic capability (Boff et al. 1986). On the other hand, intrusiveness is the main limitation of the secondary-task performance technique (Boff et al. 1986).
Physiological measures
The cognitive workload can be classified using different kinds of physiological measures such as cardiovascular activity: Heart Rate (HR), Heart Rate Variability (HRV), Electrocardiography (ECG); respiratory measure: respiration rate, ventilation; eye activity: blink rate, pupil diameter; and brain activity: Electroencephalography (EEG), Near-infrared spectroscopy (NIRS), functional magnetic resonance imaging (fMRI), functional near-infrared spectroscopy (fNIRS), etc. (Kramer 2020). Among all the physiological measures, brain activity-based measures are widely used for cognitive state classification (Ayres et al. 2021). Therefore, this section divides the physiological measures into two subsections: brain activity-based measures and non-brain activity-based measures. Each subsection is discussed below.
Brain activity-based measures
In the human nervous system, physiological measures related to brain activity are associated with the central nervous system (CNS) (Kramer 2020). The CNS comprises the brain and spinal cord (Bear et al. 2020). The human brain comprises three main components: cerebrum, cerebellum, and brainstem (Bear et al. 2020). The cerebrum, the foremost part of the brain, is subdivided into right and left hemispheres (Bear et al. 2020). Each hemisphere is divided into four lobes: frontal, temporal, parietal, and occipital lobes (Bear et al. 2020). The brain hemispheres with their lobes are shown in Fig. 2.
Fig. 2.

a Hemispheres of the brain (Bear et al. 2020), b Lobes of the brain (Bear et al. 2020)
The frontal lobe is associated with executive functions, including self-control, planning, reasoning, and abstract thought, while the occipital lobe is related to vision (Bear et al. 2020; Aghajani et al. 2017). The temporal lobe processes and interprets sounds such as musical notes and speech (Bear et al. 2020; Aghajani et al. 2017). This lobe is also associated with emotions and recognizing faces. The parietal lobes control sensations (touch, pressure, pain, and temperature), spatial orientation (understanding of size, shape, and direction), and visual perception, and they interpret language and words (Bear et al. 2020; Aghajani et al. 2017).
To decode human behavior towards a task, understanding brain rhythm is essential (Bear et al. 2020). Brain rhythm refers to patterns generated by massed neuronal activity associated with specific human behaviors, arousal levels, and sleep states (Bear et al. 2020). EEG is the classical method to record brain rhythms (Aghajani et al. 2017). The EEG brain rhythm is divided into multiple frequency bands such as delta (1–4 Hz), theta (5–8 Hz), alpha (8–15 Hz), beta (12–31 Hz), and gamma (32 Hz and above) (Aghajani et al. 2017; Nidal and Malik 2014). Delta/theta waves are mostly related to the various phases of sleep or an idle state (Bear et al. 2020). The alpha band is activated during relaxed or drowsy states, whereas the beta band is associated with active, high-level thinking (Bear et al. 2020; Aghajani et al. 2017). The gamma band is activated during cross-modal sensory processing (Bear et al. 2020; Aghajani et al. 2017).
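To make the band definitions above concrete, the following minimal sketch sums spectral power within each band for a single-channel signal. It uses a plain discrete Fourier transform from the Python standard library rather than any particular EEG toolbox. The band edges follow the text; the 45 Hz upper edge for gamma, the function names, and the synthetic sine-wave input are assumptions made for illustration only.

```python
import cmath
import math

# Band limits in Hz as listed in the text (alpha and beta overlap there);
# the 45 Hz upper edge for gamma is an assumption of this sketch.
BANDS = {"delta": (1, 4), "theta": (5, 8), "alpha": (8, 15),
         "beta": (12, 31), "gamma": (32, 45)}

def power_spectrum(signal, fs):
    """Return (frequencies, power) of the one-sided DFT of a real signal."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power

def band_powers(signal, fs):
    """Sum the spectral power falling inside each frequency band."""
    freqs, power = power_spectrum(signal, fs)
    return {name: sum(p for f, p in zip(freqs, power) if lo <= f <= hi)
            for name, (lo, hi) in BANDS.items()}

def dominant_band(signal, fs):
    """Name of the band carrying the most power."""
    powers = band_powers(signal, fs)
    return max(powers, key=powers.get)

# Synthetic one-second "recording": a pure 10 Hz oscillation sampled at 128 Hz.
fs = 128
alpha_like = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
print(dominant_band(alpha_like, fs))  # alpha
```

Real EEG pipelines typically use windowed estimators (for example, Welch's method) and artifact removal before computing band power; the direct DFT here is only meant to show how a frequency band maps to a feature value.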
Apart from EEG, other non-invasive techniques can measure brain activities for workload measurement. The details of all such techniques are discussed below.
Functional Magnetic Resonance Imaging (fMRI) It detects changes in cerebral blood flow/metabolism and oxygenation levels during neural activation by using electromagnetic fields (Bear et al. 2020). The neural system controls the blood flow; however, it is also influenced by other physiological factors.
Magnetoencephalography (MEG) MEG measures the magnetic field generated by the electrical activity of neurons (Singh 2014). MEG is fused with magnetic resonance imaging to obtain magnetic source imaging.
Functional Near-Infrared Spectroscopy (fNIRS) It is an optical spectroscopy tool that uses infrared light to characterize acquired variations in cerebral metabolism during neuronal activity (Bunce et al. 2006). It tracks changes in oxyhemoglobin and deoxyhemoglobin concentrations in the superficial layers of the cortex.
Positron Emission Tomography (PET) It is a functional imaging technique used to visualize and quantify changes in metabolic processes and other physiological activities like blood flow, chemical composition, etc. (Ollinger and Fessler 1997).
A comparison of the different brain activity-based physiological measures is given in Table 1. Owing to its high temporal resolution, EEG is widely used for cognitive workload estimation (Panda et al. 2020; Zarjam et al. 2015). EEG devices are also easy to use and economical; thus, they are widely used in Brain-Computer Interface (BCI) systems (Abiri et al. 2019). The advantages of the brain activity-based physiological measures are as follows:
High specificity These physiological measures directly assess brain activity, providing specific insights into cognitive and emotional processes (Panda et al. 2020; Chakladar and Chakraborty 2018).
Localization Physiological measures such as fMRI and fNIRS can identify specific brain cortex regions associated with certain functions or stimuli. This localization process improves the performance of region-wise Brain connectivity networks (Faro and Mohamed 2006).
Temporal and Spatial resolution EEG with high temporal resolution identifies the spontaneous changes in brain activity (Chakladar et al. 2020). In contrast, with a high spatial resolution, fMRI finds localized brain sources involved in various cognitive tasks (Yacoub et al. 2015).
The disadvantages of brain activity-based physiological measures are as follows:
Expensive and complex equipment The overall cost of fMRI experiments (including the fMRI machine and its maintenance) is high, so it is difficult for small and mid-level research groups to accommodate fMRI machines (Richards et al. 2016).
Ethical concerns fNIRS, fMRI, and EEG experiments must address the privacy of the participants and related ethical concerns. Obtaining ethical permission is an essential and sometimes complex process. Researchers must navigate several ethical considerations to ensure the well-being and rights of study participants (Alderson and Morrow 2020).
Table 1.
Comparison of different brain signal acquisition methods
| Method | Measured activity | Temporal (TP) resolution | Spatial (SP) resolution | Cost |
|---|---|---|---|---|
| EEG | Electrical | High | Low | Economical |
| MEG | Magnetic | High | High | Expensive |
| fMRI | Metabolic | Low | High | Expensive |
| PET | Metabolic | Low | High | Expensive |
| fNIRS | Metabolic | Low | High | Economical |
Temporal (TP) resolution, Spatial (SP) resolution
Non-brain activity-based measures
Another part of the human nervous system is the peripheral nervous system (PNS), which consists of many nerves that branch out from the CNS all over the body (Bear et al. 2020). Peripheral physiological measures such as cardiovascular, respiratory, and eye activity-related measures are associated with the PNS (Hogervorst et al. 2014). In cardiovascular measures, ECG measures the electrical activity of the heart using a number of sensors (Finsen et al. 2001; Houssein et al. 2017). The beat-to-beat variations of the heart are termed Heart Rate Variability (HRV) (Finsen et al. 2001). The number of beats in a time period (mostly one minute) is reported as Heart Rate (HR) (Finsen et al. 2001). HR increases with increasing task demands (De Rivecourt et al. 2008) or when additional memory load is introduced (Finsen et al. 2001). The rest and task periods in a simulated flight task can be classified using the HR measure (De Rivecourt et al. 2008). Autoregressive model-based spectral analysis has been developed to measure workload levels using HRV (Zhang et al. 2006). Zhang et al. (2006) found that spectral analysis improved the workload assessment by coupling the information obtained from the recorded HRV signal with the outputs of the autoregressive model.
The cognitive load can also be estimated through respiratory measures. The respiratory measure depends on different factors such as time (e.g., number of breaths per minute), volume (e.g., amount of air inhaled during one respiratory cycle), gas exchange (e.g., respiratory exchange ratio), and the variability of those parameters (Mauderly 1990). The cognitive load is positively related to respiration rate (Grassmann et al. 2016). Eye activity can be measured through blink rate, blink duration, blink latency, and pupil size (Tummeltshammer et al. 2019). The mental state of an operator can be classified by eye movement and eye distance (Schuller et al. 2009). Pupil diameter has also been used to distinguish age groups (Piquado et al. 2010) and visual from auditory presentations (Klingner et al. 2011).
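The HR and HRV definitions above can be sketched from a list of RR intervals (the times between successive heartbeats). The text describes autoregressive spectral analysis for HRV; this sketch instead uses two common time-domain summaries, SDNN and RMSSD, which are named here as an assumption of the example and are not taken from the cited studies. The RR values are synthetic.

```python
import math
import statistics

def heart_rate_bpm(rr_ms):
    """Mean heart rate in beats per minute from RR intervals in milliseconds."""
    return 60000.0 / statistics.mean(rr_ms)

def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals (ms): overall variability."""
    return statistics.stdev(rr_ms)

def rmssd(rr_ms):
    """Root mean square of successive RR differences (ms): beat-to-beat variability."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(statistics.mean([d * d for d in diffs]))

# Synthetic RR intervals in milliseconds (illustrative values only).
rr = [800, 810, 790, 800]
print(heart_rate_bpm(rr))  # 75.0
print(round(rmssd(rr), 3))  # 14.142
```

A shorter mean RR interval gives a higher HR, which is the direction of change the text associates with increasing task demands.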
The advantages of the non-brain activity-based physiological measures are defined below:
Simplicity and cost-effectiveness Non-brain activity-based measures are often valued for their simplicity and cost-effectiveness in research and various applications. These measures offer valuable insights into physiological and behavioral responses without the need for complex and expensive equipment (De Rivecourt et al. 2008).
Physiological correlation Non-brain activity measures, such as respiratory or heart rate, can provide information about physiological responses related to stress or emotional states (Panaite et al. 2016).
Ethical considerations These measures may be perceived as less invasive, making them more acceptable to certain participants. Therefore, no ethical permission is required during the experiment.
The disadvantages of the non-brain activity-based physiological measures are as follows:
Limited temporal resolution Some non-brain activity-based measures have poor temporal resolution, and they cannot capture the rapid changes in the cognitive states of participants (Singh et al. 2019).
Potential for noise External factors like environmental conditions and individual differences can introduce noise into these measures (Signorini et al. 2001).
In a similar workload estimation review (Zhou et al. 2021), the authors analyzed cognitive workload levels using EEG and machine learning models. However, they did not cover deep learning or statistical methods for workload estimation. Deep learning methods often perform better than traditional machine learning classifiers in workload classification. Apart from EEG, it is also important to discuss other physiological measures (i.e., fMRI, fNIRS, eye tracking, etc.) for workload estimation. In contrast, our study highlights workload estimation using different physiological measures, along with the multimodal fusion of different measures. Moreover, we cover statistical approaches and deep learning classifiers for workload estimation in addition to machine learning classifiers. We hope that this review helps beginner researchers understand the basics of workload estimation using physiological measures and helps experienced researchers comprehend workload estimation through different machine learning, deep learning, statistical, and multimodal approaches. To our knowledge, no existing review provides an in-depth analysis of machine learning, deep learning, and statistical approaches for identifying the workload level of a participant using different physiological measures. The contributions of this review are as follows:
This review segregates physiological measures into brain and non-brain activity-based measures and mentions a detailed discussion of each measure.
An in-depth analysis of machine learning, deep learning, and statistical approaches for workload estimation using different physiological signals is provided. Moreover, we also discuss multimodal-fusion techniques that combine two or more physiological measures for workload estimation.
We also highlight widely used cognitive task-wise analysis for different physiological measures (based on existing studies of the last five years) and highlight the most used physiological measure for all the tasks.
The remainder of the review paper is organized as follows. In Sect. 2, we briefly discuss cognitive workload estimation using various techniques. The experimental designs of different tasks for workload estimation are described in Sect. 3. A detailed discussion of workload estimation using different physiological measures (EEG, fMRI, fNIRS, cardiovascular, etc.) is given in Sect. 4. Cognitive task-wise analysis is discussed in Sect. 5. In Sect. 6, we discuss the major findings, open problems, and future trends that need further investigation. Finally, we conclude the paper in Sect. 7.
Literature review
This review focuses on the recognition of cognitive workload using different physiological measures, with EEG as the most widely used one. This section discusses common paradigms of cognitive workload estimation and workload estimation techniques. Each topic is discussed below.
Common paradigms
The estimation of cognitive workload level can be performed in controlled laboratory conditions (e.g., conducting cognitive tasks) or operating machines in real or simulated environments. Therefore, the widely used paradigms to understand the cognitive workload are divided into two parts: cognitive and operator-oriented task paradigms. A brief illustration of both paradigms is discussed below.
Cognitive-oriented task paradigms
In this paradigm, the subject performs a single cognitive task such as mental arithmetic (MA) (Chakladar et al. 2021), the n-back task (Wang et al. 2015), or the Sternberg working memory task (Roy et al. 2016). Such cognitive tasks, which consume human working memory, require temporary storage for processing and retrieving information. In the n-back task, the workload complexity increases with the value of n: the subject needs to remember the digit/letter presented n trials back and match it with the digit/letter of the current trial. In the Sternberg working memory task, subjects memorize different stimulus sets and decide whether the present stimulus matches the memorized group of stimuli (Roy et al. 2016). In the MA task, subjects need to perform arithmetic (addition/subtraction) between two numbers and memorize the result. Then, they repeatedly perform another mental operation on that memorized result (Shin et al. 2016). All these cognitive tasks require temporary storage for intermediate results (Zarjam et al. 2015).
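The n-back matching rule described above can be sketched in a few lines. The function names, the letter sequence, and the response list are hypothetical and serve only to show how a trial is labeled as a match and how response accuracy could be scored.

```python
def nback_targets(stimuli, n):
    """Mark each trial True if its stimulus equals the one shown n trials earlier.

    The first n trials have nothing to compare against and are never targets.
    """
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]

def nback_accuracy(stimuli, responses, n):
    """Fraction of trials where the subject's match/no-match response is correct."""
    targets = nback_targets(stimuli, n)
    correct = sum(t == r for t, r in zip(targets, responses))
    return correct / len(stimuli)

# Hypothetical letter sequence for a 2-back block.
letters = list("ABAACAC")
print(nback_targets(letters, 2))
# [False, False, True, False, False, True, True]
```

With the same sequence, a larger n requires holding more items in working memory before a comparison is possible, which is why the text treats n as the workload-complexity parameter.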
Operator-oriented task paradigms
In this paradigm, participants are required to operate the machine in a simulated or real way. This paradigm focuses on operators’ actions while performing the task. This paradigm includes tasks such as air traffic management (Arico et al. 2015), driving (Almogbe et al. 2019), automation-enhanced cabin air management system (aCAMS) (Yin et al. 2016) and flight simulating task (Dehais et al. 2018). In air traffic management tasks, the air traffic controller performed multi-task activities by engaging in visual activities of airplane control and auditory communication with pilots (Arico et al. 2015). The workload levels were adjusted with traffic manipulation, complexity, or volume. In aCAMS task, the operator performed a safety-critical task with multiple subsystems that control the air quality of a spacious cabin (Wang et al. 2012).
Workload estimation techniques
The cognitive workload can be estimated using machine learning, deep learning, and statistical approaches. In addition, we also illustrate the complexity of each approach while estimating workload levels. A detailed discussion of each section is mentioned below.
Machine learning
A general definition of machine learning is “the technique that improves the performance measure of the task through some training experience” (Jordan and Mitchell 2015). Machine learning techniques can be broadly classified into three major categories: supervised, unsupervised, and reinforcement learning. Among them, supervised learning is the most commonly used technique (Jordan and Mitchell 2015). The machine learning classifiers (SVM, kNN, decision tree, etc.) follow a series of steps for workload classification: data processing, feature extraction, feature selection, and classification. In supervised learning, a model is built that predicts outputs after being trained on labeled input data. In contrast, in unsupervised learning, unlabeled input data are grouped into similar categories (Jordan and Mitchell 2015). Reinforcement learning is an area of machine learning concerned with how intelligent agents should take actions in an environment to maximize the cumulative reward (Sutton and Barto 2018). A summary of cognitive workload estimation using machine learning methods is given in Table 3.
Table 3.
Summary of cognitive workload estimation using machine learning methods
| Studies | Task | Measures | Methods | Results |
|---|---|---|---|---|
| Walter et al. (2013) | Working memory tasks | EEG | ERD, ERS + SVM | Acc: 97.00% |
| Bashivan et al. (2015) | Sternberg Task | EEG | Band power, wavelet entropy + SVM | Acc: 92.00% |
| Baldwin and Penaranda (2012) | Working memory tasks | EEG | Band power + ANN | Acc: 85.30% |
| Zarjam et al. (2013) | MA | EEG | Entropy + ANN | Acc: 98.00% |
| Kakkos et al. (2019) | Simulated flight task | EEG | multiband functional connectivity networks + SVM | Acc: 82.00% |
| Hou et al. (2015) | Stress monitoring | EEG | Statistical features, SVM | Acc: 67.06% |
| Wang and Sourina (2013) | MA | EEG | Power spectral density (PSD), Autoregressive (AR) + ANN | Acc: 87.87% |
| Brouwer et al. (2012) | n-back tests | EEG | Spectral power, ERP features + SVM | Acc: 88.00% |
| Qu et al. (2021) | Simulated flight task | ECG | Heart Rate Variability, SVM | Acc: 90.80% |
| Abibullaev and An (2012) | Baseline, object rotation, multiplication and letter padding task | fNIRS | Wavelet + BPNN, LDA and SVM | Max Acc: 94.25% (Baseline task) |
| Bauernfeind et al. (2014) | MA | fNIRS | HbO, HbR + LDA/QDA/SVM/SLDA | Acc: 86.60% |
| Chen et al. (2016) | MA | Eye activity | Eye blink, pupillary response + Adaboost | Max. Acc: 79.30% |
Oxygenated hemoglobin: HbO, deoxygenated hemoglobin: HbR, Accuracy: Acc, Backpropagation neural network: BPNN, Artificial Neural Network: ANN, Mental arithmetic: MA, Quadratic discriminant analysis: QDA, Shrinkage LDA: SLDA
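The supervised pipeline described in this subsection (feature extraction followed by classification) can be sketched with a k-nearest-neighbor (kNN) classifier, one of the classifiers named in the text. This is a toy illustration written with the Python standard library only: the two hand-crafted features (window mean and standard deviation), the function names, and the synthetic "low" and "high" workload windows are all assumptions, and none of the studies in Table 3 is being reproduced.

```python
import math
import statistics
from collections import Counter

def extract_features(window):
    """Hand-crafted features for one signal window: mean and standard deviation."""
    return [statistics.mean(window), statistics.pstdev(window)]

def knn_predict(train_x, train_y, query, k=3):
    """Label a query feature vector by majority vote among its k nearest neighbors."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Synthetic labeled windows: low-amplitude = "low", high-amplitude = "high".
windows = [
    ([0.1, -0.1, 0.2, -0.2], "low"),
    ([0.0, 0.1, -0.1, 0.0], "low"),
    ([0.2, -0.2, 0.1, -0.1], "low"),
    ([1.0, -1.0, 1.5, -1.5], "high"),
    ([2.0, -2.0, 1.0, -1.0], "high"),
    ([1.2, -1.3, 1.4, -1.1], "high"),
]
train_x = [extract_features(w) for w, _ in windows]
train_y = [label for _, label in windows]

new_window = [1.1, -1.2, 1.3, -1.0]
print(knn_predict(train_x, train_y, extract_features(new_window)))  # high
```

The sketch omits the preprocessing and feature selection steps of the pipeline; in practice, features are usually standardized before distance-based classification so that no single feature dominates the distance.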
Deep learning
The traditional machine learning classifiers use hand-crafted features that need prior information for classification (Supratak et al. 2017). Therefore, most of the time, the machine learning classifiers do not give satisfactory results (Supratak et al. 2017). In contrast, the deep learning classifiers automatically extract task-specific relevant features without prior knowledge, improving the classification performance (LeCun et al. 2015). Zhang et al. (2018) have implemented a recurrent 3D convolutional neural network (CNN) to identify the workload levels from a cross-task (n-back and MA) experiment using topographical images. Their deep learning model achieved 88.9% average classification accuracy. An ensemble stacked denoising autoencoder (SDAE)-based deep model has been implemented for estimating workload levels (low and high) from the aCAMS task (Yang et al. 2019). The SDAE preserved the localized information from the EEG. For eight subjects, they achieved a subject-wise mean classification accuracy of 92% with their model. Zeng et al. (2018) have developed a 5-layered CNN model to evaluate the workload level of a driver during a driving simulation task. They compared their model with LSTM and SVM models and found that the CNN-based model achieved the maximum average classification accuracy of 92.68%.
A variational autoencoder (VAE)-spatial attention-based deep learning model (CNN-BLSTM) has been implemented for estimating workload levels from EEG-based MA tasks (Chakladar et al. 2022). They extracted localized EEG features from EEG band-wise topographical videos and achieved 83.13% classification accuracy for four workload levels (baseline, low, medium, and hard). Chakladar et al. (2022) have also developed a multimodal deep clustering framework to estimate workload levels. The multimodal model extracted the temporal and spectral latent features from LSTM and CNN-based VAEs, and the combined latent features were passed to sparse subspace clustering to identify workload levels. They achieved a clustering accuracy of 0.952 with their model. A Filter Bank Common Spatial Pattern (FBCSP) and LSTM-based deep ensemble model has been used for classifying the cognitive state of a user (Chakladar et al. 2021). The deep ensemble model consists of similarly structured LSTM networks that work in parallel. For the MA task, with two workload classes (task and rest), their model achieved 87% classification accuracy. A dynamic hierarchical attention-based transformer model has been used to extract temporal-spatial-spectral features from EEG data (Zhang et al. 2022). The dynamic features obtained from the different dimensions (temporal/spatial/spectral) are fused and fed to a gated recurrent unit (GRU) network for workload classification. Their model achieved 78.61% classification accuracy.
The spectrogram is highly effective in representing the salient features of the EEG. A hybrid framework consisting of spectrogram features, a CNN, and a Bidirectional Neural Turing Machine (BNTM) has been used to measure cognitive workload levels (Qiao and Bi 2020). The spectrogram feature improved the spatial resolution of the signal. Their model achieved 96.3% classification accuracy. A spectrogram and deep sparse autoencoder-based deep learning model has been used to estimate the drowsiness of a driver (Belsare et al. 2021). They compared their model with the SVM and LSTM models and found that their model achieved the maximum classification accuracy of 97.37%. EEG spectral data have also been widely used in virtual reality-based n-back tasks. In each n-back trial, the spectrogram feature extracted the spectral information over the trial duration. The extracted features were passed to the CNN/LSTM model to evaluate the workload levels (Ved and Yildirim 2021). Their model achieved 97.37% classification accuracy. A summary of cognitive workload estimation using deep learning methods is given in Table 4.
Table 4.
Summary of cognitive workload estimation using different Deep learning methods
| Studies | Task | Measures | Tools | Results |
|---|---|---|---|---|
| Yang et al. (2019) | ACAMS | EEG | Spectral + temporal features, ensemble SDAE | Acc: 92.00% |
| Kwak et al. (2020) | Sternberg Task | EEG | EEG Spectral power, 3D CNN, LSTM | Acc: 90.80% |
| Hefron et al. (2017) | Multi-Attribute Task Battery | EEG | Temporal features, LSTM | Acc: 93.00% |
| Chakladar et al. (2020) | SIMKAP task | EEG | Time/freq/linear/non-linear features, GWO optimizer + BLSTM-LSTM | Acc: 86.33% |
| Chakladar et al. (2021) | MA | EEG | Filter Bank Common Spatial Pattern + Ensemble LSTM | Acc: 87.00% |
| Gupta et al. (2021) | Shape/color identification of objects | EEG | Fuzzy C-means clustering, time segment smoothing + BLSTM-LSTM model | Acc: 92.80% |
| Zhang et al. (2018) | MA + n-back task | EEG | EEG topographic maps, Morlet wavelet transformation + R3DCNN | Acc: 88.90% |
| Saadati et al. (2019) | n-back | fNIRS | HbO, HbR + spatial CNN | Acc: 97.00% |
| Mughal et al. (2021) | n-back | fNIRS | Recurrence plots + LSTM- CNN | Acc: 85.90% |
| Asgher et al. (2020) | Mental activity | fNIRS | HbO, HbR + LSTM/CNN | Max Acc: 89.31% (LSTM) |
| Amin et al. (2022) | Driver’s stress detection | ECG | Morse wavelet + Deep transfer learning and Fuzzy logic | Max Acc: 98.11% |
| Bhardwaj et al. (2018) | Driver fatigue detection | ECG | Heart Rate Variability, SAE | Acc: 90.00% |
Automatic enhanced cabin air management system: ACAMS, stacked denoising autoencoder: SDAE, stacked autoencoder: SAE, oxygenated hemoglobin: HbO, deoxygenated hemoglobin: HbR, Accuracy: Acc, Unmanned aerial vehicle: UAV, Mental arithmetic: MA, Simultaneous capacity: SIMKAP, Grey Wolf optimizer: GWO, Recurrent 3D convolutional neural networks: R3DCNN
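The spectrogram features used by several of the studies above can be illustrated with a short-time Fourier transform. The following is a minimal sketch, assuming a synthetic single-channel signal, a Hann window, and a naive DFT written with the Python standard library; the sampling rate, window length, and hop size are illustrative choices and are not taken from any of the cited studies.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def spectrogram(signal, win_len, hop):
    """Return a list of frames; each frame holds power at bins 0..win_len//2."""
    window = hann(win_len)
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = [signal[start + i] * window[i] for i in range(win_len)]
        power = []
        for k in range(win_len // 2 + 1):
            # Naive DFT of the windowed segment at frequency bin k.
            coeff = sum(
                seg[i] * cmath.exp(-2j * math.pi * k * i / win_len)
                for i in range(win_len)
            )
            power.append(abs(coeff) ** 2)
        frames.append(power)
    return frames


# Hypothetical signal: 2 s of a 10 Hz (alpha-band) sine sampled at 128 Hz.
FS = 128
eeg = [math.sin(2 * math.pi * 10 * t / FS) for t in range(2 * FS)]
spec = spectrogram(eeg, win_len=64, hop=32)
# Bin k corresponds to k * FS / win_len Hz.
peak_hz = max(range(len(spec[0])), key=lambda k: spec[0][k]) * FS / 64
```

Each frame is one column of the time-frequency image; in a deep learning pipeline, such images (one per channel or trial) would be the input to the CNN or LSTM stage.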
Statistical approach
The workload levels of a participant are often measured by subjective metrics such as NASA-TLX and SWAT. In those cases, the response time and response accuracy of the subject on tasks of varying complexity reveal the changes in performance. Statistical methods are widely used to test the significance of differences between workload levels. Statistical tests are also used to analyze the behavior of participants during the cognitive task (Chakladar et al. 2021). Allison and Polich (2008) evaluated different workload levels (easy, medium, and hard) in a first-person shooter game. The difficulty level of the game was adjusted by varying the number of enemies. They found that the amplitude of the ERP components decreased with increased cognitive workload. They applied repeated-measures analysis of variance (ANOVA) and found that game performance decreased with increased task complexity. Cognitive workload estimation studies using statistical approaches are summarized in Table 5.
Table 5.
Summary of cognitive workload estimation using statistical measures
| Studies | Task | Measures | Tools | Results |
|---|---|---|---|---|
| Midha et al. (2021) | Reading-writing | fNIRS | HbO, HbR + ANOVA | Large differences observed in the left side of the PFC |
| Causse et al. (2017) | Flight simulator | fNIRS | HbO, HbR, ANOVA | HbO concentration in the prefrontal cortex increases with difficulty |
| Ayaz et al. (2012) | n-back, air traffic control, UAV | fNIRS | HbO, HbR, One-way repeated measures ANOVA | Hemodynamic changes of fNIRS measures are associated with the relative changes of cognitive workload. |
| Koshino et al. (2005) | n-back | fMRI | Functional connectivity, ANOVA | Similar functional connectivity patterns exist for the two groups (autism and control) with different hemispheric correlations. |
| Causse et al. (2022) | n-back | fMRI | ANOVA, functional connectivity | Correct response: 94.71% |
| Benedetto et al. (2011) | Driving task | Eye activity | Blink duration and blink rate, repeated-measures ANOVA | Blink duration is a more significant factor than blink rate |
Oxygenated hemoglobin: HbO, deoxygenated hemoglobin: HbR, Unmanned aerial vehicle: UAV, Pre-frontal cortex: PFC
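Several of the studies above rely on repeated-measures ANOVA, in which every participant is measured at every workload level. The following is a minimal sketch of the one-way repeated-measures F statistic, assuming a complete subjects-by-conditions table; the function name and the scores are made up for illustration and do not come from any cited study.

```python
def rm_anova_f(data):
    """One-way repeated-measures ANOVA.

    data[s][c] is the score of subject s under condition c
    (for example, a workload level). Returns (F, df_condition, df_error).
    """
    n = len(data)       # number of subjects
    k = len(data[0])    # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[c] for row in data) / n for c in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # Between-subject variability is removed from the error term.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error


# Made-up response times (s) for 3 subjects at easy, medium, and hard levels.
scores = [[1, 2, 4], [2, 3, 4], [3, 5, 5]]
f_stat, df1, df2 = rm_anova_f(scores)
```

The resulting F value is compared against the F distribution with (df1, df2) degrees of freedom to decide whether the workload levels differ significantly.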
Complexity analysis of workload estimation methods
The selection of a workload classification method depends on the specific requirements of the cognitive task. Statistical methods like regression and ANOVA are relatively simple and suitable for simple cognitive tasks (e.g., small datasets of driving or reading-writing tasks). Machine learning methods, including decision trees and SVM, offer greater flexibility but increase complexity with more features and data. Deep learning methods, like LSTM, CNN, and other deep neural networks, can handle complex patterns in unstructured data but are computationally intensive, requiring powerful hardware. Hence, machine learning and deep learning classifiers can be used to classify workload levels from large datasets of MA, n-back, driving simulation tasks, etc. Therefore, the choice of workload estimation method depends on the task’s complexity, data availability, and computational resources.
Experimental conditions
In a cognitive workload task, the experimental process can be broadly characterized by three aspects: experimental task design, the recording process, and the selection of stimuli. The experimental task is either designed by the researchers or, in some cases, taken from publicly available open-access datasets. The data recording process differs from one physiological measure to another and depends on the devices and their functionality. External noise and other disturbances should be avoided while recording the data. In an experiment, participant responses are captured during the presentation of task-specific stimuli. Following these observations, this section is divided into three subsections: open-access datasets, recording devices, and stimuli of cognitive tasks. Each subsection is discussed in detail below.
Open-access datasets
In this section, we discuss unimodal and multimodal public datasets of workload estimation. A detailed discussion of unimodal datasets is given below.
EEGMat: A public dataset of an EEG-based MA task was provided by the National Technical University of Ukraine (Zyma et al. 2019), where 36 participants (aged 18 to 26 years) performed the MA task (serial subtractions). The EEG data were collected through 23 EEG channels (placed according to the 10–20 electrode placement system). The arithmetic task was divided into two mental states: resting state and mental counting. The resting state lasts 3 min and is followed by 4 min of the mental counting state. The sampling frequency of the dataset is 500 Hz.
WM_EEG: This EEG dataset was recorded from nine epilepsy patients (aged 28 to 56) during a verbal Sternberg working memory task (Boran et al. 2020). The dataset investigates the working memory of patients by capturing scalp EEG and intracranial EEG (iEEG) recordings. In the experiment, eight letters (i.e., the stimulus) were displayed on the screen for 2 s. The middle four, six, or eight letters were the memory items, corresponding to the set size for the trial.
STEW: This public EEG dataset was recorded during a single-session simultaneous capacity (SIMKAP) experiment for estimating multitasking mental workload (Lim et al. 2018). The experiment was performed with 48 subjects. The SIMKAP test evaluates the multitasking capability and stress tolerance of a participant. The dataset includes two conditions: resting state/no-task and the SIMKAP-based task. In the resting state, subjects were asked to keep their eyes open without performing any work for three minutes while their EEG signals were recorded. Next, subjects performed the SIMKAP test while their EEG signals were recorded, and the final three minutes of the recording were used as the workload condition.
A detailed discussion of multimodal workload estimation datasets is given below:
HBCI: This open-access dataset (Shin et al. 2016) was developed to implement hybrid brain-computer interfaces (BCIs) using EEG and NIRS signals. The dataset consists of three different experimental tasks: (A) motor imagery (left vs. right hand), (B) mental arithmetic versus baseline task, and (C) motion artifacts of 30 subjects. The sampling frequency of the EEG data was set at 200 Hz. EEG data were collected using thirty electrodes placed according to the international 10–5 electrode placement system. Thirty channels (fourteen sources and sixteen detectors) were used to capture the NIRS data.
EEG_NIRS: A hybrid open-access multimodal cognitive dataset was developed using EEG and NIRS recordings (Shin et al. 2018). Twenty-six healthy participants performed three cognitive tasks: (1) n-back (0-, 2- and 3-back), (2) discrimination/selection response (DSR) task, and (3) word generation (WG) task. EEG data were recorded using 30 EEG electrodes at a sampling rate of 1000 Hz. NIRS data were collected using the NIRScout device at a sampling rate of 10.4 Hz. Sixteen sources and sixteen detectors were placed in the frontal, motor, parietal, and occipital areas. The n-back dataset includes three sessions, where each session is divided into 20 trials. Each trial contains three series of 0-, 2-, and 3-back tasks, where each task includes 30% target and 70% non-target letters. A single session consists of instruction (2 s), task (40 s), and rest (20 s) periods.
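The n-back task that appears in several of these datasets can be made concrete by labeling which stimuli are targets. The following is a minimal sketch, assuming the usual rule that a stimulus is a target when it matches the one presented n steps earlier (and, for 0-back, when it matches a fixed letter); the function name and the letter series are hypothetical and are not taken from any of the datasets.

```python
def nback_targets(letters, n, zero_back_target=None):
    """Label each stimulus as target (True) or non-target (False).

    For n >= 1 a stimulus is a target when it matches the one shown n steps
    earlier. For n == 0 it is a target when it equals a fixed letter.
    """
    labels = []
    for i, letter in enumerate(letters):
        if n == 0:
            labels.append(letter == zero_back_target)
        else:
            labels.append(i >= n and letter == letters[i - n])
    return labels


# Hypothetical 10-letter series.
series = list("ABABCACDEF")
two_back = nback_targets(series, 2)
zero_back = nback_targets(series, 0, zero_back_target="A")
target_ratio = sum(two_back) / len(two_back)
```

For this series, three of the ten stimuli are 2-back targets, which matches a 30% target and 70% non-target split.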
Recording devices
This review discusses different physiological measures (EEG, ECG, fNIRS, fMRI, eye activity) to evaluate cognitive workload. The recording techniques of the devices also differ. Here, we highlight some of the widely used devices for each measure. These devices are listed in Table 2.
Table 2.
List of physiological devices used for cognitive workload measurement
| EEG | |||||
|---|---|---|---|---|---|
| Device | Channels | Comm. | Sfreq | Op.Time(Hrs) | Weight |
| Emotiv Epoc | 5–32 | Proprietary wireless | 128 | 12 | 125 g |
| Open BCI | 8–16 | Bluetooth | 250 | 24 | 260 g |
| BioSemi | 16 | Wired | 512 | 5–6 | 1.1 kg |
| BioSemi | 32–256 | Wired | 2–16 kHz | Unlimited | 1.1 kg |
| Brain products ActiCHamp | 32–160 | Wired | 100 | Unlimited | 1.1 kg |
| fNIRS | |||||
|---|---|---|---|---|---|
| Device | Channels | Sfreq(Hz) | Source types | Detector types | Multimodal Integration |
| Brite | Up to 27 | 150 | LED | Photodiodes | EEG, ECG/EMG |
| OxyMon | Up to 108 | 250 | Laser | Photodiodes | EEG, EMG, fMRI |
| OctaMon(PFC) | 8 | 50 | LED | Photodiodes | EEG, ECG/EMG |
| fNIR300 | 16 | 150 | LED | Photodiodes | ET and res |
| NirSmart | Over 60 | 23 | LED | Avalanche photo diode | – |
| ECG | |||||
|---|---|---|---|---|---|
| Device | Channels | Storage (reports) | Rec mode | Battery | Filters |
| BPL GenX3 | 3 | 250 | A/M | 3000 mAh | AC/drift/ EMG/Low-pass |
| Philips TC10 | 12 | 200 | A/M/R | 4800 mAh | AC/High /Low-pass |
| BPL 9108 | 12 | 800 | A/M/R | 2500 mAh | AC/drift/ EMG/Low-pass |
| RMS vesta 301i | 3 | 250 | A/M | 1.3 Ah | High/low-pass/ Notch |
| fMRI | |||||
|---|---|---|---|---|---|
| Device | Machine type | Magnet type | Bore size (cm) | Max. amplitude (mT/m) | Max. slew rate (T/m/s) |
| Philips achieva 3T | Closed | Supercon | 60 | 45 | 200 |
| Philips achieva 1.5T | Closed | Supercon | 60 | 33 | 150 |
| Optima MR750w | Closed | Supercon | 70 | 45 | 200 |
| Eye tracking | |||||
|---|---|---|---|---|---|
| Device | Sfreq(Hz) | Camera | Weight (g) | Connectors | Software |
| Tobii pro fusion | 250 | 2 Tobii EyeSensors | 168 | USB Type-C | Any application built on the Tobii Pro SDK |
| Eye tracker-pupil labs | 200 | 192 × 192 px illumination | 46.9 | USB Type-C | Pupil Invisible Companion App |
The columns differ across physiological measures according to the characteristics of each measure. Note: sampling frequency (Sfreq) in Hz; Communication (Comm.); Auto (A), Manual (M), Rhythm (R); eye tracking (ET); respiration (res); operating time (Op. Time)
Stimuli of cognitive tasks
In cognitive experiments, the participant’s response is elicited through different types of stimulus presentation. Some tasks use only visual or only auditory stimuli, whereas others use both. Task complexity increases with the number of stimuli. A detailed classification of widely used cognitive tasks according to their stimuli is shown in Fig. 3. From Fig. 3, it can be seen that most of the studies are based on visual stimuli.
Fig. 3.

Stimuli-wise cognitive tasks classification. Studies associated with cognitive tasks: n-back (Brouwer et al. 2012), Mental arithmetic (Chakladar et al. 2021), Air-traffic controller (Arico et al. 2015), Real world driving (Zhang et al. 2015), Sternberg task (Roy et al. 2016), Discrimination/selection response and Word generation task (Shin et al. 2018), auditory n-back (Son et al. 2011), Oddball task (Ullsperger et al. 2001), MAT-B (Fournier et al. 1999), Simultaneous Capacity (Lim et al. 2018), Dual n-back (Cheema et al. 2018), Visual-auditory monitoring (Bagheri and Power 2020)
Physiological measures for workload estimation
This section discusses different physiological measures (such as EEG, fMRI, fNIRS, etc.) for workload estimation. Each measure is described in detail below.
EEG-based approaches
This section comprehensively discusses the existing studies of EEG-based workload classification using deep learning/machine learning classifiers and statistical methods. Takeda et al. (2016) analyzed the effects of ERP components during a driving simulator application. They observed that the ERP amplitude is inversely proportional to task difficulty, whereas the driver’s pleasure level increases as the amplitude decreases. Bagheri and Power (2020) illustrated the association between EEG-based workload levels (low vs. high) and stress levels (stressed vs. relaxed). They performed ANOVA to check the difference between workload levels based on the Root Mean Square Error (RMSE) score.
As EEG signals vary widely across individual subjects’ responses, traditional statistical tests fail to identify the workload levels across all the subjects (Heard et al. 2018). Traditional machine learning classifiers such as SVM and kNN can estimate the workload levels efficiently for all the subjects (Sciaraffa et al. 2019). Wang et al. (2015) used a low-cost wireless device to estimate workload levels from the n-back task. Their model achieved a maximum classification accuracy of 100% between the lowest (0-back) and highest (1, 2-back) workload levels with the proximal support vector machine (PSVM) classifier. Wang et al. (2015) implemented an SVM classifier that identifies the operator’s attention during a distracted driving task and achieved 84.6% accuracy. Fan et al. (2017) estimated the operator’s workload levels using the kNN classifier during EEG-based virtual driving tasks. They used several feature types for classification: statistical features, fractal dimension features, higher-order crossings (HOC) features, and power features. Their model achieved 95% classification accuracy with power features.
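To illustrate how a classifier such as kNN operates on hand-crafted EEG features, the following is a minimal sketch using Euclidean distance and majority voting. The two-dimensional feature vectors (theta and alpha band power) and their labels are made up for illustration; they are assumptions and do not reproduce the features or data of any cited study.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest neighbours."""
    dists = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Made-up (theta power, alpha power) features per trial; values are illustrative.
train_x = [(2.0, 8.0), (2.5, 7.5), (3.0, 7.0), (6.0, 3.0), (6.5, 2.5), (7.0, 2.0)]
train_y = ["low", "low", "low", "high", "high", "high"]

pred_low = knn_predict(train_x, train_y, (2.2, 7.8))
pred_high = knn_predict(train_x, train_y, (6.8, 2.4))
```

In practice, the features would be standardized and the value of k chosen by cross-validation, since kNN is sensitive to feature scale.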
Traditional machine learning classifiers use hand-crafted features that require prior information for classification (Supratak et al. 2017). As a result, they often do not give satisfactory results (Supratak et al. 2017). In contrast, deep learning classifiers automatically extract task-relevant features without prior knowledge, which improves the classification performance (LeCun et al. 2015). Zhang et al. (2018) implemented a recurrent 3D convolutional neural network (CNN) to identify the workload levels from a cross-task (n-back and MA) experiment using topographical images. Their deep learning model achieved 88.9% average classification accuracy. EEG-based cognitive state assessment can also be performed during long attention-demanding tasks, such as driving a car (Hajinoroozi et al. 2016).
Chakladar et al. (2020) implemented a hybrid framework that consists of a deep neural network (a bidirectional long short-term memory (BLSTM) network combined with a long short-term memory (LSTM) network, BLSTM-LSTM) and a Grey Wolf Optimizer-based evolutionary method. They used their model to estimate cognitive workload levels (low, moderate, and high) during the simultaneous multitasking activity. Their model achieved a maximum of 86.33% classification accuracy for 48 subjects. A variational autoencoder (VAE) and spatial attention-based deep learning model (CNN-BLSTM) has been implemented for estimating workload levels from EEG-based MA tasks (Chakladar et al. 2022). The authors extracted localized EEG features from EEG band-wise topographical videos and achieved 83.13% classification accuracy for four workload levels (baseline, low, medium, and hard).
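For readers unfamiliar with the recurrent units named above, the standard LSTM cell update can be written as follows. This is the textbook formulation, given here as background; the symbols ($x_t$ for the input feature vector at time step $t$, $h_t$ for the hidden state, $c_t$ for the cell state, $W$, $U$, $b$ for learned parameters) are notation assumed for this sketch, and the cited models may differ in their details.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

A bidirectional LSTM runs one such recurrence forward in time and another backward, and concatenates the two hidden states at each step, $h_t^{\mathrm{BLSTM}} = [\overrightarrow{h}_t ; \overleftarrow{h}_t]$, so that each output depends on both past and future samples of the EEG sequence.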
fMRI-based approaches
fMRI is a non-invasive neuroimaging method that uses electromagnetic fields to detect changes in local cerebral blood volume, cerebral blood flow, and oxygenation levels during neural activation (Sitaram et al. 2007). Owing to its high spatial resolution, it can effectively distinguish the brain regions that are active in different cognitive states (Sitaram et al. 2007). An fMRI-based functional connectivity approach has been proposed for the n-back task (Causse et al. 2022). The experiment was performed on 20 participants. The authors found that the frontoparietal executive control network (ECN) was related to increased cognitive load. They applied repeated-measures ANOVA and identified that the 2-back task is more complex than the 0-back task. Lim et al. (2010) used arterial spin labeling perfusion fMRI to investigate brain fatigue during a sustained attention task, the psychomotor vigilance test. They experimented on 15 subjects. Performance across task complexity levels was assessed by monitoring time-on-task (TOT) effects. After a high mental workload session, they identified cognitive fatigue in the frontoparietal network, which forms a high-attention network for TOT effects. Yaple and Arsalidou (2018) examined neural activity in children performing the n-back task (levels: 0, 1, 2-back). Using activation likelihood estimation, they analyzed data from 260 children under 15 years old. Their results suggested that working memory activation is highly correlated with frontoparietal regions. Koshino et al. (2005) compared a high-functioning autism group and an age-matched control group on an fMRI-based n-back task (levels: 0, 1, 2-back). Their experiment included 14 participants in each group (control and autism). For the control group, they found that the left parietal region was more activated than the right parietal region.
fNIRS-based approaches
fNIRS is a non-invasive brain activity recording technique that uses near-infrared light to characterize variations in cerebral metabolism during neuronal activity (Bunce et al. 2006). It tracks changes in oxyhemoglobin (HbO) and deoxyhemoglobin (HbR) concentrations in the superficial layers of the cortex (Bunce et al. 2006). Bunce et al. (2011) utilized fNIRS to examine the hemodynamic response in the dorsolateral prefrontal cortex (DLPFC) at different workload levels. They experimented on a quasi-realistic computerized warship commander task with eight subjects (four experts and four novices). Their results showed that a lower oxygenation level was required while the subject performed low to moderate workload-level tasks, whereas a higher oxygenation level was required for high workload levels. Midha et al. (2021) estimated workload levels from an fNIRS-based reading-writing task with 20 healthy subjects. Their results highlighted that an increased level of HbO and a decreased level of HbR in the prefrontal cortex (PFC) effectively distinguish the workload levels (easy, medium, and hard) of reading tasks. Causse et al. (2017) estimated the workload levels of participants during a flight simulation task. They found that the HbO concentration in the PFC increased and the HbR concentration decreased as the workload level became more complex. However, task performance was not related to prefrontal activation (HbO concentration) at the cortex level. A repeated-measures ANOVA was applied to find the difference between two workload levels (difficult and easy landing) of the flight simulation task. Tsunashima and Yanagisawa (2009) extracted HbO and HbR features from the fNIRS signal during a car driving task. They performed multiresolution analysis using the discrete wavelet transform and statistical group analysis.
Finally, they evaluated brain function during driving with the adaptive cruise control (ACC) system and found less activation in the frontal lobe. Herff et al. (2014) examined workload activity in the PFC during the n-back task (n = 1, 2, and 3). For ten subjects, they found that hemodynamic responses in the PFC can efficiently classify the workload levels. They achieved 78% classification accuracy with a Linear Discriminant Analysis (LDA) classifier.
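The HbO and HbR signals discussed above are usually reduced to a few numbers per window before classification or statistical testing. The following is a minimal sketch, assuming two simple window features, the mean and the least-squares slope; this feature choice is an illustrative assumption and is not claimed to be the feature set of any cited study, and the concentration values are made up.

```python
def mean_and_slope(samples, fs):
    """Return the mean and least-squares slope (units per second) of a window."""
    n = len(samples)
    times = [i / fs for i in range(n)]
    t_mean = sum(times) / n
    y_mean = sum(samples) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, samples))
    var = sum((t - t_mean) ** 2 for t in times)
    return y_mean, cov / var


# Made-up concentration changes (arbitrary units) sampled at 10 Hz for 1 s.
FS = 10
hbo = [0.1 * i for i in range(10)]          # rising HbO
hbr = [0.5 - 0.05 * i for i in range(10)]   # falling HbR

hbo_mean, hbo_slope = mean_and_slope(hbo, FS)
hbr_mean, hbr_slope = mean_and_slope(hbr, FS)
features = [hbo_mean, hbo_slope, hbr_mean, hbr_slope]
```

A positive HbO slope together with a negative HbR slope in a prefrontal channel corresponds to the activation pattern that the studies above associate with higher workload.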
Cardiovascular measures-based approaches
Cardiovascular measures can be described through cardiac physiological measures (e.g., heart rate), vascular physiological measures (e.g., peripheral vasoconstriction), or a combination of both (e.g., blood pressure) (Tao et al. 2019). Cardiovascular measures are effectively used in cognitive workload estimation (Tao et al. 2019). Tjolleng et al. (2017) used an ECG signal to estimate the workload levels of a participant who was driving (primary task) while performing an n-back task (secondary task). They used time-domain ECG measures (mean inter-beat interval (IBI), standard deviation of IBIs, and root mean squared difference of adjacent IBIs) and frequency-domain ECG measures (power in low frequency, power in high frequency, and the ratio of low- to high-frequency power). For 15 subjects, they achieved 82% classification accuracy with an artificial neural network (ANN) model. Qu et al. (2021) estimated workload levels from an ECG-based simulated flight task. Along with the heart rate variability (HRV) feature, they extracted the power spectrum feature and the sample entropy feature from the ECG. Their model achieved 90.8% classification accuracy with the SVM classifier. Bhardwaj et al. (2018) examined driver fatigue detection from ECG signals. They used a time-domain feature (heart rate), frequency-domain features such as low/high-frequency spectral power, and a stacked autoencoder to classify the workload levels (active, drowsy, and fatigue). For ten subjects, they achieved 96.6% accuracy with their model.
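The three time-domain measures named above (mean IBI, standard deviation of IBIs, and root mean squared difference of adjacent IBIs) can be computed directly from a list of inter-beat intervals. The following is a minimal sketch; the function name and the IBI values are made up for illustration and are not taken from any cited dataset.

```python
import math


def hrv_time_domain(ibis_ms):
    """Time-domain features from a list of inter-beat intervals in milliseconds."""
    n = len(ibis_ms)
    mean_ibi = sum(ibis_ms) / n
    # Standard deviation of IBIs (sample standard deviation).
    sd_ibi = math.sqrt(sum((x - mean_ibi) ** 2 for x in ibis_ms) / (n - 1))
    # Root mean squared difference of adjacent IBIs.
    diffs = [b - a for a, b in zip(ibis_ms, ibis_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"mean_ibi": mean_ibi, "sd_ibi": sd_ibi, "rmssd": rmssd}


# Made-up IBIs (ms), not taken from any cited dataset.
features = hrv_time_domain([800, 810, 790, 805, 795])
mean_hr_bpm = 60000 / features["mean_ibi"]
```

The frequency-domain measures (low- and high-frequency power and their ratio) would additionally require resampling the IBI series to a uniform rate and estimating its power spectrum.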
Eye activity measures-based approaches
Eye activity can be characterized through blink rate, blink duration, blink latency, pupillary response, pupil size, etc. (Chen et al. 2016). Guo et al. (2021) used eye-tracking to monitor the workload of an operator during telerobotic space training. Ten subjects wore the eye-tracker device while teleoperating a Canadarm2 robot to complete an on-orbit assembly task. The authors identified that mental workload and task performance mostly depend on pupil diameter and the index of pupillary activity features. Benedetto et al. (2011) examined the effects of eye blinks in a simulated lane change test during a driving task. They found that blink duration was a more significant factor than blink rate for driver visual workload. More short blinks occurred during lane changes, and long blinks occurred as driving time increased. The pupillary response has been measured to estimate the workload levels of younger and older adult groups (Piquado et al. 2010). The pupillary response was measured for both groups of subjects while they listened to a set of digits and recalled those digits. The authors found that the normalized measure of pupil size of younger adults indicated a high mental workload.
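Blink rate and blink duration, the two measures compared by Benedetto et al. (2011), can be derived from a list of detected eyelid closures. The following is a minimal sketch, assuming that blink onsets and offsets have already been detected by the eye tracker; the function name and the event times are made up for illustration.

```python
def blink_features(blinks, recording_s):
    """Blink rate (per minute) and mean blink duration (s).

    `blinks` is a list of (start_s, end_s) pairs for eyelid closures.
    """
    durations = [end - start for start, end in blinks]
    rate_per_min = len(blinks) / recording_s * 60
    mean_duration = sum(durations) / len(durations) if durations else 0.0
    return rate_per_min, mean_duration


# Made-up blink events within a 30 s recording.
events = [(1.0, 1.2), (8.5, 8.6), (15.0, 15.4), (27.0, 27.1)]
rate, mean_dur = blink_features(events, recording_s=30)
```

Computing these features over successive windows of a driving session would show how blink behavior changes with time on task.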
A pictorial illustration of workload estimation studies based on individual and multimodal physiological measures is shown in Fig. 4. We searched online for workload estimation studies that used different physiological measures in the last five years and summarized the results in Fig. 4. From Fig. 4a, it can be concluded that EEG is the most widely used physiological measure (42%) in the existing workload estimation studies. In contrast, fMRI is the least used physiological measure (7%).
Fig. 4.
Summaries of workload estimation studies: (a) studies based on physiological measures, (b) studies based on multimodal measures. In (b), the label “multiple” refers to the studies that fuse more than two modalities (e.g., EEG+ECG+fNIRS, EEG+eye activity+fNIRS)
Multimodal measures-based approaches
In multimodal approaches, data from two or more sensors are merged to build a more robust framework for real-time measurement of cognitive workload (Debie et al. 2019). An EEG- and fNIRS-based multimodal system has been implemented to classify mental states (motor imagery and rest), achieving 66% classification accuracy with the LDA classifier (Leamy et al. 2011). ECG, galvanic skin response, and respiration measures were combined to classify different driving states in real-time driving environments (Ll et al. 2017). Using time, spectral, and wavelet features with kernel-based classifiers (Extreme Learning Machine (ELM), SVM), the authors achieved an average classification accuracy of 99% for identifying different stress levels (low, medium, and high). Zhang et al. (2017) merged ECG and EEG signals for the n-back experiment. They combined the features of the two heterogeneous modalities using interactive mutual information modeling. The fused features were passed to different classifiers (kNN, SVM), and the maximum classification accuracy of 90.6% was achieved with the SVM classifier. Lin et al. (2020) combined EEG and fNIRS signals to identify workload levels in a lane-deviation driving task. Sixteen subjects participated in the experiment. They used the spectral power of EEG and the HbO and HbR features of fNIRS signals for workload estimation. They found that an increased concentration of HbO and variation of EEG spectral power in the theta, alpha, and beta bands were associated with poor driving performance. The cognitive workload level of a participant has also been classified using EEG and eye-tracking data (Lobo et al. 2016). Two tasks, visual search (primary task) and syntactic transformation (secondary task), were used for finding workload levels. The authors used the alpha and theta band power of EEG and the pupil diameter of both eyes as features. Their model achieved a maximum F1 score of 75.00% with the kNN classifier.
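The simplest way to merge modalities is feature-level fusion: standardize the features of each modality and concatenate them trial by trial. The following is a minimal sketch of that baseline. It is an assumption chosen for illustration and is not the interactive mutual information modeling of Zhang et al. (2017) or the method of any other cited study; the feature values are made up.

```python
import statistics


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [
        [(v - m) / s for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


def fuse(eeg_rows, other_rows):
    """Feature-level fusion: normalize each modality, then concatenate per trial."""
    eeg_z = zscore_columns(eeg_rows)
    other_z = zscore_columns(other_rows)
    return [e + o for e, o in zip(eeg_z, other_z)]


# Made-up per-trial features: EEG (theta, alpha power) and pupil diameter (mm).
eeg = [[4.0, 9.0], [5.0, 7.0], [6.0, 5.0]]
pupil = [[3.0], [3.5], [4.0]]
fused = fuse(eeg, pupil)
```

Standardizing before concatenation keeps a modality with large numeric values from dominating distance-based classifiers such as kNN or SVM with an RBF kernel.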
A summary of cognitive workload estimation using multimodal physiological measures is given in Table 6.
Table 6.
Summary of cognitive workload estimation using multimodal physiological measures
| Studies | Task | Measures | Features(Tools) | Results |
|---|---|---|---|---|
| Aghajani et al. (2017) | n-back test | EEG + fNIRS | PSD (EEG), HbO and HbR (fNIRS) + SVM | Acc: 96.10% |
| Saadati et al. (2019) | n-back, DSR, WG, Left/right MI test | EEG + fNIRS | ERD/ERS (EEG), HbO and HbR (fNIRS) + DNN/SVM | Average Acc: 90.10% (MI task) |
| Saadati et al. (2019) | n-back | EEG + fNIRS | ERD/ERS (EEG), HbO and HbR (fNIRS) + CNN | Acc: 89.00% |
| Lin et al. (2020) | Driving task | EEG + fNIRS | PSD (EEG), HbO, HbR (fNIRS) | Increased HbO concentration and band-wise EEG spectral power are associated with poor driving performance. |
| Zhang et al. (2017) | n-back | EEG + ECG | PSD, ERP (EEG) and HR, HRV (ECG) + SVM | Average acc: 90.6% |
| Chandra et al. (2017) | MATB, breathing activity | EEG + ECG | PSD (EEG) and R-R mean, SBI (ECG) + ANN | Engagement index was the most prominent feature based on the ANOVA analysis |
| Lobo et al. (2016) | visual search | EEG + Eye-tracking | PSD (EEG), pupil diameter and eye closure (Eye-tracking) + kNN | Max F1 score: 75.00% |
| Borys et al. (2017) | Mental arithmetic | EEG + Eye-tracking | PSD (EEG), pupillometry (Eye-tracking) + SVM | Acc: 90.00% |
| Brouwer et al. (2017) | Monitoring, mental math tasks | EEG + Eye-tracking | ERP (EEG), pupil size (Eye-tracking) + SVM | Acc: 65.00% |
| Ahn et al. (2016) | Driver mental fatigue | EEG + ECG + fNIRS | PSD (EEG), HR (ECG) and HbO, HbR (fNIRS) + SVM | Acc: 75.90% |
| Liu et al. (2017) | n-back | EEG + ECG + fNIRS | PSD (EEG), HRV (ECG) and HbO, HbR (fNIRS) + ANOVA, Wilcoxon Signed Rank test | The combined approach of EEG and fNIRS significantly improved workload classification compared with EEG or fNIRS alone. |
Heart Rate: HR, Heart Rate Variability: HRV, Shrinkage LDA: SLDA, Event-Related Potential: ERP, Common Spatial Pattern: CSP, Sympathovagal Balance Index: SBI, Multi-Attribute Task Battery: MATB, Discrimination/selection response: DSR, Word generation: WG, Motor Imagery: MI, Event-related synchronization/desynchronization: ERS/ERD, Deep Neural Network: DNN
Adaptive fuzzy models have been used to assess the workload state of participants using EEG and cardiovascular measures (Zhang et al. 2007; Mahfouf et al. 2007). The authors used two adaptive fuzzy models, an adaptive network-based fuzzy inference system and a genetic algorithm-based Mamdani fuzzy model, to estimate workload levels from the aCAMS task. They reported promising results with both models.
For multimodal fusion of different physiological measures (Fig. 4b), most of the studies (47%) used more than two modalities for workload assessment. As EEG was the most commonly used physiological measure, it was combined with all other modalities. Among two-modality fusions, the largest share of studies (28%) used EEG and fNIRS for workload estimation.
Cognitive task-wise analysis
This section discusses the six most popular cognitive tasks and performs a comparative analysis based on different physiological measures. The analysis is based on workload studies published in peer-reviewed journals between 2017 and 2022. The studies are grouped into two broad categories of cognitive tasks: (a) cognitive-oriented tasks and (b) operator-oriented tasks. Mental arithmetic, n-back, and Sternberg tasks belong to the cognitive-oriented category. Operator-oriented tasks include driving, MATB, and flight simulation tasks. Summaries of workload studies of different physiological measures for these six widely used cognitive tasks are shown in Fig. 5. For all the tasks, most of the studies used EEG for workload estimation. All the physiological measures (fMRI, fNIRS, ECG, EEG, eye activity, and multimodal) are used in mental arithmetic and n-back studies. Five physiological measures (all except fMRI) are used in driving and flight simulation studies, whereas four physiological measures (all except fMRI and ECG/eye activity) are used in MATB and Sternberg task-related studies. From all of the tasks (Fig. 5a–f), it can be concluded that EEG is the most used physiological measure, whereas fMRI is the least used physiological measure for workload classification.
Fig. 5.
Summaries of workload estimation studies based on cognitive tasks. a–c represent Cognitive-Oriented Tasks and d–f represent Operator-Oriented Tasks
Discussion
In this section, we discuss the major findings of the review, the open challenges of cognitive workload measurement studies, and the strategies to overcome those challenges to enhance recognition performance.
Summary of major findings
Physiological measures are more widely used for estimating cognitive workload than other measures (i.e., subjective and performance) (Ayres et al. 2021). Therefore, we divide the major findings of workload estimation using physiological measures into four parts: (a) analysis between brain and non-brain activity-based physiological measures, (b) highlighting the best physiological measure in multimodal fusion, (c) identification of the best physiological measure according to the cognitive task paradigm, and (d) significance of different methods for workload estimation. A detailed discussion of each area is mentioned below.
In brain activity-based measures, EEG-based workload studies mostly used SVM (Bashivan et al. 2015; Walter et al. 2013) and ANN (Baldwin and Penaranda 2012; Zarjam et al. 2013) as machine learning classifiers and CNN (Saadati et al. 2019; Zhang et al. 2018), LSTM (Kwak et al. 2020; Hefron et al. 2017), and BLSTM-LSTM (Chakladar et al. 2020; Gupta et al. 2021) as deep learning classifiers. On the other hand, fMRI (Koshino et al. 2005; Lim et al. 2010) and fNIRS (Midha et al. 2021; Causse et al. 2017) studies mostly highlight the brain activation regions and connectivity networks for workload levels. In contrast, non-brain activity-based measures, such as cardiovascular measures, used HRV features, spectral power, and machine/deep learning classifiers for workload classification (Qu et al. 2021; Bhardwaj et al. 2018). The eye activity-based measures used different features such as eye-tracking (Guo et al. 2021), eye blink rates (Benedetto et al. 2011), and pupillary responses (Piquado et al. 2010) for workload classification. Brain activity-based measures monitor neural activity patterns, allowing researchers to determine when cognitive resources are allocated, the degree of mental effort exerted, and the cognitive states of individuals engaged in a task. Therefore, we find that brain activity-based measures are better than non-brain activity-based measures in their ability to provide real-time information, making them valuable tools for monitoring workload in real-world settings.
Combining brain and non-brain activity-based measures offers a multifaceted approach to cognitive workload estimation. In multimodal approaches, features from different modalities are combined for workload classification. EEG, which has excellent temporal resolution and offers a wide variety of features (time-domain, frequency-domain, statistical, connectivity, etc.), is mostly used in multimodal fusion. Multimodal fusion of EEG/ECG (Zhang et al. 2017), EEG/fNIRS (Lin et al. 2020), and EEG/fNIRS/ECG (Ahn et al. 2016) can be effectively used for workload estimation. Therefore, we can identify that EEG is the most commonly used physiological measure for multimodal fusion.
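As an illustration of what combining features from different modalities can look like, the following is a minimal sketch of feature-level fusion: each modality is standardized column by column and the per-trial feature vectors are then concatenated. The function names (`zscore`, `standardize`, `fuse`) and the toy EEG/ECG values are hypothetical and are not taken from any of the reviewed studies, which may use other fusion schemes.

```python
from statistics import mean, pstdev


def zscore(column):
    """Standardize one feature column to zero mean and unit variance."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def standardize(features):
    """Z-score each column of a trials-by-features matrix (list of rows)."""
    columns = [zscore(list(col)) for col in zip(*features)]
    return [list(row) for row in zip(*columns)]


def fuse(*modalities):
    """Feature-level fusion: standardize each modality, then concatenate per trial."""
    n_trials = len(modalities[0])
    if any(len(m) != n_trials for m in modalities):
        raise ValueError("all modalities must have the same number of trials")
    scaled = [standardize(m) for m in modalities]
    return [sum((m[i] for m in scaled), []) for i in range(n_trials)]


# Hypothetical toy data: 3 trials, 2 EEG features and 1 ECG feature per trial.
eeg = [[4.0, 10.0], [6.0, 12.0], [8.0, 14.0]]
ecg = [[60.0], [70.0], [80.0]]
fused = fuse(eeg, ecg)  # 3 trials, each with 3 standardized features
```

Standardizing before concatenation is one simple way to keep a modality with large numeric ranges (here the ECG feature) from dominating the fused vector; it is shown only as an assumption-laden example.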
According to the cognitive task paradigm, EEG is mostly used in different cognitive-oriented task paradigms such as MA (Chakladar et al. 2020) and n-back tasks (Zhang et al. 2018), and in operator-oriented task paradigms such as driving (Zeng et al. 2018) and aCAMS (Yang et al. 2019). Therefore, we conclude that EEG is a more suitable physiological measure than others for the following reasons: (a) it supports different kinds of cognitive/operator-oriented experiments with low experimental cost and device portability, and (b) it offers excellent temporal resolution with a short trial duration for each cognitive state. Figure 5 also indicates that EEG is mostly used for estimating cognitive workload.
Workload measurement can be broadly categorized into workload level classification and the identification of active cortical regions for different workload levels. In the reviewed studies, machine learning (Table 3) and deep learning (Table 4) classifiers are used to classify workload levels, whereas statistical tests (Table 5) are used to identify active regions for workload levels. From Tables 3 and 4, we found that EEG studies are mostly used for workload classification. Apart from EEG, ECG and fNIRS studies also performed the classification task. From Table 3, it can be identified that SVM is the most used machine learning classifier across different cognitive tasks. In deep learning classification, we noticed that LSTM models with different configurations (stacked LSTM/LSTM, BLSTM-LSTM, ensemble LSTM, LSTM-CNN) are mostly used for workload classification across cognitive tasks. The ANOVA statistical test is used in studies based on different physiological measures to find the activation difference across workload levels (Table 5). We noticed that fNIRS studies identified the activated brain regions based on increased/decreased concentrations of HbO and HbR. On the other hand, fMRI studies focused on building functional connectivity networks by connecting activated regions for each cognitive workload state.
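To make the ANOVA step concrete, the following minimal sketch computes the one-way ANOVA F statistic for a single measure recorded under several workload levels. The function name `one_way_anova_f` and the toy values (which could stand for, say, HbO concentration changes in one region) are hypothetical; the sketch stops at the F statistic and degrees of freedom and does not compute a p-value, which would require the F distribution.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the samples around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy measurements for three workload levels.
low = [1.0, 2.0, 3.0]
medium = [2.0, 3.0, 4.0]
high = [5.0, 6.0, 7.0]
f_stat, df_b, df_w = one_way_anova_f([low, medium, high])
```

A large F value relative to the F distribution with `(df_b, df_w)` degrees of freedom would suggest that the measure differs across workload levels.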
Open challenges and future trends
Though significant advancements have been achieved, there remain substantial challenges and forthcoming developments that merit thorough investigation for enhancing recognition performance.
Open access datasets
The first and foremost problem of workload measurement is the availability of open-access datasets. Although some publicly available datasets are listed in Sect. 3.1, few of the studies share their code, so many studies cannot easily be reproduced. We argue that publishing both the data and the code enables fair comparison between methods. Thus, we encourage researchers in the related domain to make their EEG datasets and related code available under free licenses or upon request.
The generalizability of the model across tasks
In recent research, the generalizability of machine learning/deep learning models is in high demand, as a single robust model can be used in multiple cognitive tasks. If a model can achieve more than 85% accuracy over multiple tasks, that model can be referred to as a generic model.
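The 85% criterion can be written as a small check. This is a minimal sketch under two assumptions that the text does not spell out: "multiple tasks" is read as at least two tasks, and the threshold must be exceeded on every task rather than on average. The function name `is_generic` and the task accuracies are hypothetical.

```python
GENERIC_THRESHOLD = 0.85  # the 85% accuracy criterion stated in the text


def is_generic(task_accuracies, threshold=GENERIC_THRESHOLD):
    """Return True if accuracy exceeds the threshold on each of at least two tasks."""
    if len(task_accuracies) < 2:
        return False
    return all(acc > threshold for acc in task_accuracies.values())


# Hypothetical per-task accuracies for two models.
model_a = {"n-back": 0.91, "mental arithmetic": 0.88}
model_b = {"n-back": 0.93, "mental arithmetic": 0.79}
```

Under these assumptions, `is_generic(model_a)` holds while `is_generic(model_b)` does not, because the second model falls below the threshold on one task.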
The generic machine learning/deep learning model should perform well across tasks and unseen data from populations (i.e., not part of the training set). Here, “populations” typically refers to different groups of subjects who participated in the experiment. Several factors are related to the generalizability of a model across cognitive tasks and populations, which are as follows:
Task transferability: Some machine learning models are designed to be task-agnostic and can generalize well across a range of cognitive tasks (Wang et al. 2015). For example, transfer learning techniques allow a model pre-trained on a large dataset to adapt and perform well on specific tasks with smaller datasets.
Data representations: The choice of data representations and features can significantly impact generalizability. Models capable of acquiring and representing information in a meaningful manner tend to exhibit better generalization (Mazher et al. 2017).
Cognitive task complexity: The generalizability of a model can vary based on the complexity of cognitive tasks (Mazher et al. 2017). Some tasks may require high-level reasoning and understanding, while others may be more data-driven. Models that can capture both high-level and low-level features are more likely to generalize across a broader spectrum of cognitive tasks.
Some models (Mazher et al. 2017; Wang et al. 2015) proved to be “generic” models, whereas others (Zhang et al. 2014; Bagheri and Power 2020; McKendrick et al. 2019) generalize poorly. Therefore, researchers should design their models so that they generalize across tasks and populations.
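One common way to test performance on subjects who are not part of the training set is a leave-one-subject-out split. The following is a minimal sketch of such a split; the function name `leave_one_subject_out` and the subject labels are hypothetical, and the reviewed studies may use different validation schemes.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each unique subject."""
    for held_out in sorted(set(subject_ids)):
        # Trials of the held-out subject form the test set; all others are for training.
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical subject label for each recorded trial.
subjects = ["s1", "s1", "s2", "s2", "s3"]
splits = list(leave_one_subject_out(subjects))
```

Because no trial of the held-out subject appears in the training indices, the accuracy on each test fold reflects how the model behaves on an unseen participant.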
The interpretability of models
Apart from the generalizability of the model, it is equally important to present the results in an interpretable way so that the reader can understand the relationship between the actual and predicted results. Some machine learning studies used black-box models (Du et al. 2019), which need more interpretability.
To improve interpretability, some studies applied the Wilcoxon signed-rank test (Chakladar et al. 2021; Shin et al. 2016) to check the statistical significance of different features or classifiers. Visualizing the layer-wise feature maps of CNN models (Zhang et al. 2019) can help to better understand the model’s training process.
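For readers unfamiliar with the Wilcoxon signed-rank test, the following minimal sketch computes its rank sums for paired results, for example the per-subject accuracies of two classifiers. The function name `wilcoxon_signed_rank` and the accuracy values are hypothetical; the sketch returns only the rank sums and omits the p-value, for which a library routine such as SciPy's `scipy.stats.wilcoxon` would normally be used.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, n) for paired samples, dropping zero differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


# Hypothetical per-subject accuracies (in percent) of two classifiers.
clf_a = [85, 90, 78, 92, 88, 81]
clf_b = [80, 91, 75, 85, 88, 79]
w_plus, w_minus, n = wilcoxon_signed_rank(clf_a, clf_b)
```

The smaller of the two rank sums is the usual test statistic; a value that is small relative to `n * (n + 1) / 2` indicates a consistent difference between the paired results.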
Fusion of multimodal measures
As the feature dimensions and characteristics of multiple physiological measures (EEG, fNIRS, ECG) are different, combining them into a single feature set for workload classification is complex. Moreover, the relationship between different modalities can change over time. Sometimes, different modalities can provide complementary measures to each other, which causes problems during fusion (Tomita et al. 2014). However, fusing features from multiple modalities can enhance the performance of cognitive workload estimation (Debie et al. 2019).
Conclusion
This review provides a detailed step-by-step analysis of cognitive workload and of different physiological measures (EEG, ECG, fMRI, fNIRS, etc.) for workload estimation. Moreover, this survey has identified different techniques (such as machine learning, deep learning, and statistical methods) to measure workload levels. Table 2 lists the devices used for different physiological measures in workload estimation. These devices have different characteristics (number of channels, storage, weight, multimodal integration facility, etc.), so the reader can get a clear picture of them from this table. The comparison studies of different physiological measures based on all cognitive tasks and individual cognitive tasks are shown in Fig. 4 and Fig. 5, respectively. From both figures (i.e., Figs. 4 and 5), it can be concluded that EEG outperforms other physiological measures for all types of cognitive tasks. Apart from unimodal physiological measure-based workload classification, we have identified the multimodal approach for workload classification (refer to Table 6). Different types of cognitive task paradigms and stimulus-specific tasks (Fig. 3) will guide researchers to enhance their work in the future. Further, we also discuss the major findings of this work. Finally, we conclude by highlighting the list of open challenges and future trends for overcoming those challenges.
Declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
References
- Abibullaev B, An J (2012) Classification of frontal cortex haemodynamic responses during cognitive tasks using wavelet transforms and machine learning algorithms. Med Eng Phys 34(10):1394–1410
- Abiri R, Borhani S, Sellers EW, Jiang Y, Zhao X (2019) A comprehensive review of EEG-based brain-computer interface paradigms. J Neural Eng 16(1):011001
- Aghajani H, Garbey M, Omurtag A (2017) Measuring mental workload with EEG+fNIRS. Front Hum Neurosci 11:359
- Ahn S, Nguyen T, Jang H, Kim JG, Jun SC (2016) Exploring neuro-physiological correlates of drivers’ mental fatigue caused by sleep deprivation using simultaneous EEG, ECG, and fNIRS data. Front Hum Neurosci 10:219
- Alderson P, Morrow V (2020) The ethics of research with children and young people: a practical handbook. Sage, London
- Allison BZ, Polich J (2008) Workload assessment of computer gaming using a single-stimulus event-related potential paradigm. Biol Psychol 77(3):277–283
- Almogbel MA, Dang AH, Kameyama W (2019) Cognitive workload detection from raw EEG-signals of vehicle driver using deep learning. In: 21st international conference on advanced communication technology. IEEE, pp 1–6
- Amin M, Ullah K, Asif M, Waheed A, Haq SU, Zareei M et al (2022) ECG-based driver’s stress detection using deep transfer learning and fuzzy logic approaches. IEEE Access 10:29788–29809
- Arico P, Borghini G, Di Flumeri G, Colosimo A, Graziani I, Imbert JP et al (2015) Reliability over time of EEG-based mental workload evaluation during air traffic management (ATM) tasks. In: 37th annual international conference of the IEEE engineering in medicine and biology society. IEEE, pp 7242–7245
- Asgher U, Khalil K, Khan MJ, Ahmad R, Butt SI, Ayaz Y et al (2020) Enhanced accuracy for multiclass mental workload detection using long short-term memory for brain-computer interface. Front Neurosci 14:584
- Ayaz H, Shewokis PA, Bunce S, Izzetoglu K, Willems B, Onaral B (2012) Optical brain monitoring for operator training and mental workload assessment. Neuroimage 59(1):36–47
- Ayres P, Lee JY, Paas F, van Merriënboer JJ (2021) The validity of physiological measures to identify differences in intrinsic cognitive load. Front Psychol 12:702538
- Bagheri M, Power SD (2020) EEG-based detection of mental workload level and stress: the effect of variation in each state on classification of the other. J Neural Eng 17(5):056015
- Baldwin CL, Penaranda B (2012) Adaptive training using an artificial neural network and EEG metrics for within-and cross-task workload classification. Neuroimage 59(1):48–56
- Bashivan P, Yeasin M, Bidelman GM (2015) Single trial prediction of normal and excessive cognitive load through EEG feature fusion. In: IEEE signal processing in medicine and biology symposium. IEEE, pp 1–5
- Bauernfeind G, Steyrl D, Brunner C, Müller-Putz GR (2014) Single trial classification of fNIRS-based brain-computer interface mental arithmetic data: a comparison between different classifiers. In: 36th annual international conference of the IEEE engineering in medicine and biology society. pp 2004–2007
- Bear M, Connors B, Paradiso MA (2020) Neuroscience: exploring the brain, Enhanced. Jones & Bartlett Learning, Burlington
- Belsare S, Kale M, Ghayal P, Gogate A, Itkar S (2021) Performance comparison of different EEG analysis techniques based on deep learning approaches. In: 2021 international conference on emerging smart computing and informatics (ESCI). IEEE, pp 490–493
- Benedetto S, Pedrotti M, Minin L, Baccino T, Re A, Montanari R (2011) Driver workload and eye blink duration. Transp Res F Traffic Psychol Behav 14(3):199–208
- Bhardwaj R, Natrajan P, Balasubramanian V (2018) Study to determine the effectiveness of deep learning classifiers for ECG based driver fatigue classification. In: IEEE 13th international conference on industrial and information systems. IEEE, pp 98–102
- Boff KR, Kaufman L, Thomas JP (1986) Handbook of perception and human performance, vol 1. Wiley, New York
- Boran E, Fedele T, Steiner A, Hilfiker P, Stieglitz L, Grunwald T et al (2020) Dataset of human medial temporal lobe neurons, scalp and intracranial EEG during a verbal working memory task. Sci Data 7(1):1–7
- Borys M, Plechawska-Wójcik M, Wawrzyk M, Wesołowska K (2017) Classifying cognitive workload using eye activity and EEG features in arithmetic tasks. International conference on information and software technologies. Springer, New York, pp 90–105
- Brouwer AM, Hogervorst MA, Van Erp JB, Heffelaar T, Zimmerman PH, Oostenveld R (2012) Estimating workload using EEG spectral power and ERPs in the n-back task. J Neural Eng 9(4):045008
- Brouwer AM, Hogervorst MA, Oudejans B, Ries AJ, Touryan J (2017) EEG and eye tracking signatures of target encoding during structured visual search. Front Hum Neurosci 11:264
- Bunce SC, Izzetoglu M, Izzetoglu K, Onaral B, Pourrezaei K (2006) Functional near-infrared spectroscopy. IEEE Eng Med Biol Mag 25(4):54–62
- Bunce SC, Izzetoglu K, Ayaz H, Shewokis P, Izzetoglu M, Pourrezaei K et al (2011) Implementation of fNIRS for monitoring levels of expertise and mental workload. International conference on foundations of augmented cognition. Springer, New York, pp 13–22
- Causse M, Chua Z, Peysakhovich V, Del Campo N, Matton N (2017) Mental workload and neural efficiency quantified in the prefrontal cortex using fNIRS. Sci Rep 7(1):1–15
- Causse M, Lepron E, Mandrick K, Peysakhovich V, Berry I, Callan D et al (2022) Facing successfully high mental workload and stressors: an fMRI study. Hum Brain Mapp 43(3):1011–1031
- Chakladar DD, Datta S, Roy PP, Vinod A (2022) Cognitive workload estimation using variational auto encoder & attention-based deep model. IEEE Trans Cogn Dev Syst
- Chakladar DD, Dey S, Roy PP, Iwamura M (2021) EEG-based cognitive state assessment using deep ensemble model and filter bank common spatial pattern. In: 25th international conference on pattern recognition. IEEE, pp 4107–4114
- Chakladar DD, Samanta D, Roy PP (2022) Multimodal deep sparse subspace clustering for multiple stimuli-based cognitive task. In: 2022 26th international conference on pattern recognition (ICPR). IEEE, pp 1098–1104
- Chakladar DD, Chakraborty S (2018) EEG based emotion classification using Correlation Based Subset Selection. Biol Inspir Cogn Architect 24:98–106
- Chakladar DD, Dey S, Roy PP, Dogra DP (2020) EEG-based mental workload estimation using deep BLSTM-LSTM network and evolutionary algorithm. Biomed Signal Process Control 60:101989
- Chakladar DD, Roy PP, Iwamura M (2021) EEG-based cognitive state classification and analysis of brain dynamics using deep ensemble model and graphical brain network. IEEE Trans Cogn Dev Syst 14(4):1507–1519
- Chandra S, Sharma G, Sharma M, Jha D, Mittal AP (2017) Workload regulation by Sudarshan Kriya: an EEG and ECG perspective. Brain Inf 4(1):13–25
- Cheema BS, Samima S, Sarma M, Samanta D (2018) Mental workload estimation from EEG signals using machine learning algorithms. In: International conference on engineering psychology and cognitive ergonomics. pp 265–284
- Chen F, Zhou J, Wang Y, Yu K, Arshad SZ, Khawaji A et al (2016) Eye-based measures. Robust multimodal cognitive load measurement. Springer, New York, pp 75–85
- Cook DA, Castillo RM, Gas B, Artino AR Jr (2017) Measuring achievement goal motivation, mindsets and cognitive load: validation of three instruments’ scores. Med Educ 51(10):1061–1074
- Cooper GE, Harper RP (1969) The use of pilot rating in the evaluation of aircraft handling qualities. National Aeronautics and Space Administration, Washington, D.C
- De Rivecourt M, Kuperus M, Post W, Mulder L (2008) Cardiovascular and eye activity measures as indices for momentary changes in mental effort during simulated flight. Ergonomics 51(9):1295–1319
- Debie E, Rojas RF, Fidock J, Barlow M, Kasmarik K, Anavatti S et al (2019) Multimodal fusion for objective assessment of cognitive workload: a review. IEEE Trans Cybern 51(3):1542–1555
- Dehais F, Dupres A, Di Flumeri G, Verdiere K, Borghini G, Babiloni F et al (2018) Monitoring pilot’s cognitive fatigue with engagement features in simulated and actual flight conditions using an hybrid fNIRS-EEG passive BCI. In: IEEE international conference on systems, man, and cybernetics (SMC). IEEE, pp 544–549
- Du M, Liu N, Hu X (2019) Techniques for interpretable machine learning. Commun ACM 63(1):68–77
- Ericsson K, Delaney P, Miyake A, Shah P (1999) Models of working memory: mechanisms of active maintenance and executive control. Long-Term Memory Altern Capacity Models Work Memory Everyday Skilled Life 257–295
- Fan J, Wade JW, Key AP, Warren ZE, Sarkar N (2017) EEG-based affect and workload recognition in a virtual driving environment for ASD intervention. IEEE Trans Biomed Eng 65(1):43–51
- Faro SH, Mohamed FB (2006) Functional MRI: basic principles and clinical applications. Springer, New York
- Finsen L, Søgaard K, Jensen C, Borg V, Christensen H (2001) Muscle activity and cardiovascular response during computer-mouse work with and without memory demands. Ergonomics 44(14):1312–1329
- Fournier LR, Wilson GF, Swain CR (1999) Electrophysiological, behavioral, and subjective indexes of workload when performing multiple tasks: manipulations of task difficulty and training. Int J Psychophysiol 31(2):129–145
- Gevins A, Smith ME (2006) Electroencephalography (EEG) in neuroergonomics
- Grassmann M, Vlemincx E, Von Leupoldt A, Mittelstädt JM, Van den Bergh O (2016) Respiratory changes in response to cognitive load: a systematic review. Neural Plast
- Group HPRGNAR (1988) NASA-TLX paper and pencil version instruction manual. Moffett Field, California
- Guo Y, Freer D, Deligianni F, Yang GZ (2021) Eye-tracking for performance evaluation and workload estimation in space telerobotic training. IEEE Trans Hum Mach Syst 52(1):1–11
- Gupta SS, Taori TJ, Ladekar MY, Manthalkar RR, Gajre SS, Joshi YV (2021) Classification of cross task cognitive workload using deep recurrent network with modelling of temporal dynamics. Biomed Signal Process Control 70:103070
- Hajinoroozi M, Mao Z, Jung TP, Lin CT, Huang Y (2016) EEG-based prediction of driver’s cognitive performance by deep convolutional neural network. Signal Proc Image Commun 47:549–555
- Hart SG, Staveland LE (1988) Development of NASA-TLX (task load index): results of empirical and theoretical research. Advances in psychology, vol 52. Elsevier, Amsterdam, pp 139–183
- Heard J, Harriott CE, Adams JA (2018) A survey of workload assessment algorithms. IEEE Trans Hum Mach Syst 48(5):434–451
- Hefron RG, Borghetti BJ, Christensen JC, Kabban CMS (2017) Deep long short-term memory structures model temporal dependencies improving cognitive workload estimation. Pattern Recogn Lett 94:96–104
- Herff C, Heger D, Fortmann O, Hennrich J, Putze F, Schultz T (2014) Mental workload during n-back task-quantified in the prefrontal cortex using fNIRS. Front Hum Neurosci 7:935
- Hogervorst MA, Brouwer AM, Van Erp JB (2014) Combining and comparing EEG, peripheral physiology and eye-related measures for the assessment of mental workload. Front Neurosci 8:322
- Hou X, Liu Y, Sourina O, Tan YRE, Wang L, Mueller-Wittig W (2015) EEG based stress monitoring. In: 2015 IEEE international conference on systems, man, and cybernetics. IEEE, pp 3110–3115
- Houssein EH, Kilany M, Hassanien AE (2017) ECG signals classification: a review. Int J Intell Eng Inf 5(4):376–396
- Jordan MI, Mitchell TM (2015) Machine learning: trends, perspectives, and prospects. Science 349(6245):255–260
- Kakkos I, Dimitrakopoulos GN, Gao L, Zhang Y, Qi P, Matsopoulos GK et al (2019) Mental workload drives different reorganizations of functional cortical connectivity between 2D and 3D simulated flight experiments. IEEE Trans Neural Syst Rehabil Eng 27(9):1704–1713
- Klingner J, Tversky B, Hanrahan P (2011) Effects of visual and verbal presentation on cognitive load in vigilance, memory, and arithmetic tasks. Psychophysiology 48(3):323–332
- Koshino H, Carpenter PA, Minshew NJ, Cherkassky VL, Keller TA, Just MA (2005) Functional connectivity in an fMRI working memory task in high-functioning autism. Neuroimage 24(3):810–821
- Kramer AF (2020) Physiological metrics of mental workload: a review of recent progress. Multiple-task Perform 279–328
- Kwak Y, Kong K, Song WJ, Min BK, Kim SE (2020) Multilevel feature fusion with 3d convolutional neural network for EEG-based workload estimation. IEEE Access 8:16009–16021
- Leamy DJ, Collins R, Ward TE (2011) Combining fNIRS and EEG to improve motor cortex activity classification during an imagined movement-based task. International conference on foundations of augmented cognition. Springer, New York, pp 177–185
- LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
- Li C, Zhao Y, Pf Y, Zhang J, Jz Z (2017) Detecting driving stress in physiological signals based on multimodal feature analysis and kernel classifiers. Expert Syst Appl 85:279–291
- Lim J, Wc Wu, Wang J, Detre JA, Dinges DF, Rao H (2010) Imaging brain fatigue from sustained mental workload: an ASL perfusion study of the time-on-task effect. Neuroimage 49(4):3426–3435
- Lim W, Sourina O, Wang L (2018) STEW: simultaneous task EEG workload data set. IEEE Trans Neural Syst Rehabil Eng 26(11):2106–2114
- Lin CT, King JT, Chuang CH, Ding W, Chuang WY, Liao LD et al (2020) Exploring the brain responses to driving fatigue through simultaneous EEG and fNIRS measurements. Int J Neural Syst 30(01):1950018
- Liu Y, Ayaz H, Shewokis P (2017) Multisubject learning for mental workload classification using concurrent EEG, fNIRS, and physiological measures. Front Hum Neurosci 11:389
- Lobo JL, Ser JD, De Simone F, Presta R, Collina S, Moravek Z (2016) Cognitive workload classification using eye-tracking and EEG data. In: Proceedings of the international conference on human–computer interaction in aerospace. pp 1–8
- Longo L (2015) A defeasible reasoning framework for human mental workload representation and assessment. Behav Inf Technol 34(8):758–786
- Mahfouf M, Zhang J, Linkens DA, Nassef A, Nickel P, Hockey GRJ et al (2007) Adaptive fuzzy approaches to modelling operator functional states in a human-machine process control system. In: IEEE international fuzzy systems conference. IEEE, pp 1–6
- Mauderly JL (1990) Measurement of respiration and respiratory responses during inhalation exposures. J Am Coll Toxicol 9(4):397–405
- Mazher M, Abd Aziz A, Malik AS, Amin HU (2017) An EEG-based cognitive load assessment in multimedia learning using feature extraction and partial directed coherence. IEEE Access 5:14819–14829
- McKendrick R, Feest B, Harwood A, Falcone B (2019) Theories and methods for labeling cognitive workload: classification and transfer learning. Front Hum Neurosci 13:295
- Midha S, Maior HA, Wilson ML, Sharples S (2021) Measuring mental workload variations in office work tasks using fNIRS. Int J Hum Comput Stud 147:102580
- Miller S (2001) Workload measures. National advanced driving simulator Iowa City, United States
- Mughal NE, Khalil K, Khan MJ (2021) fNIRS based multi-class mental workload classification using recurrence plots and CNN-LSTM. In: 2021 international conference on artificial intelligence and mechatronics systems (AIMS). IEEE, pp 1–6
- Nidal K, Malik AS (2014) EEG/ERP analysis: methods and applications. CRC Press, Florida
- Ollinger JM, Fessler JA (1997) Positron-emission tomography. IEEE Signal Process Mag 14(1):43–55
- Paas F, Van Gog T (2006) Optimising worked example instruction: different ways to increase germane cognitive load. Learn Instr 16(2):87–91
- Panaite V, Hindash AC, Bylsma LM, Small BJ, Salomon K, Rottenberg J (2016) Respiratory sinus arrhythmia reactivity to a sad film predicts depression symptom improvement and symptomatic trajectory. Int J Psychophysiol 99:108–113
- Panda D, Chakladar DD, Dasgupta T (2020) Multimodal system for emotion recognition using EEG and customer review. In: Proceedings of the global AI congress 2019. Springer, pp 399–410
- Piquado T, Isaacowitz D, Wingfield A (2010) Pupillometry as a measure of cognitive effort in younger and older adults. Psychophysiology 47(3):560–569
- Qiao W, Bi X (2020) Ternary-task convolutional bidirectional neural turing machine for assessment of EEG-based cognitive workload. Biomed Signal Process Control 57:101745
- Qu H, Gao X, Pang L (2021) Classification of mental workload based on multiple features of ECG signals. Inf Med Unlocked 24:100575
- Reid GB, Nygren TE (1988) The subjective workload assessment technique: a scaling procedure for measuring mental workload. Advances in psychology, vol 52. Elsevier, Amsterdam, pp 185–218
- Richards DA, Ekers D, McMillan D, Taylor RS, Byford S, Warren FC et al (2016) Cost and outcome of behavioural activation versus cognitive behavioural therapy for depression (COBRA): a randomised, controlled, non-inferiority trial. Lancet 388(10047):871–880
- Roy RN, Charbonnier S, Campagne A, Bonnet S (2016) Efficient mental workload estimation using task-independent EEG features. J Neural Eng 13(2):026019
- Rubio S, Díaz E, Martín J, Puente JM (2004) Evaluation of subjective mental workload: a comparison of SWAT, NASA-TLX, and workload profile methods. Appl Psychol 53(1):61–86
- Saadati M, Nelson J, Ayaz H (2019) Multimodal fNIRS-EEG classification using deep learning algorithms for brain-computer interfaces purposes. International conference on applied human factors and ergonomics. Springer, New York, pp 209–220
- Saadati M, Nelson J, Ayaz H (2019) Convolutional neural network for hybrid fNIRS-EEG mental workload classification. In: International conference on applied human factors and ergonomics. pp 221–232
- Saadati M, Nelson J, Ayaz H (2019) Mental workload classification from spatial representation of fNIRS recordings using convolutional neural networks. In: IEEE 29th international workshop on machine learning for signal processing (MLSP). IEEE, pp 1–6
- Schuller B, Müller R, Eyben F, Gast J, Hörnler B, Wöllmer M et al (2009) Being bored? Recognising natural interest by extensive audiovisual integration for real-life application. Image Vis Comput 27(12):1760–1774
- Sciaraffa N, Aricò P, Borghini G, Flumeri GD, Florio AD, Babiloni F (2019) On the use of machine learning for EEG-based workload assessment: algorithms comparison in a realistic task. In: International symposium on human mental workload: models and applications. Springer, pp 170–185
- Shin J, von Lühmann A, Blankertz B, Kim DW, Jeong J, Hwang HJ et al (2016) Open access dataset for EEG + NIRS single-trial classification. IEEE Trans Neural Syst Rehabil Eng 25(10):1735–1745
- Shin J, Von Lühmann A, Kim DW, Mehnert J, Hwang HJ, Müller KR (2018) Simultaneous acquisition of EEG and NIRS during cognitive tasks for an open access dataset. Sci Data 5:180003
- Signorini MG, Marchetti F, Cerutti S (2001) Applying nonlinear noise reduction in the analysis of heart rate variability. IEEE Eng Med Biol Mag 20(2):59–68
- Singh SP (2014) Magnetoencephalography: basic principles. Ann Indian Acad Neurol 17(Suppl 1):S107
- Singh N, Aggarwal Y, Sinha RK (2019) Heart rate variability analysis under varied task difficulties in mental arithmetic performance. Heal Technol 9(3):343–353
- Sitaram R, Caria A, Veit R, Gaber T, Rota G, Kuebler A et al (2007) FMRI brain-computer interface: a tool for neuroscientific research and treatment. Comput Intell Neurosci
- Son J, Park S et al (2011) Cognitive workload estimation through lateral driving performance. In: Proceedings of the 16th Asia pacific automotive engineering conference. pp 06–08
- Supratak A, Dong H, Wu C, Guo Y (2017) DeepSleepNet: a model for automatic sleep stage scoring based on raw single-channel EEG. IEEE Trans Neural Syst Rehabil Eng 25(11):1998–2008
- Sutton RS, Barto AG (2018) Reinforcement learning: an introduction. MIT Press, Cambridge
- Sweller J (1988) Cognitive load during problem solving: effects on learning. Cogn Sci 12(2):257–285
- Takeda Y, Inoue K, Kimura M, Sato T, Nagai C (2016) Electrophysiological assessment of driving pleasure and difficulty using a task-irrelevant probe technique. Biol Psychol 120:137–141
- Tao D, Tan H, Wang H, Zhang X, Qu X, Zhang T (2019) A systematic review of physiological measures of mental workload. Int J Environ Res Public Health 16(15):2716
- Tjolleng A, Jung K, Hong W, Lee W, Lee B, You H et al (2017) Classification of a driver’s cognitive workload levels using artificial neural network on ECG signals. Appl Ergon 59:326–332
- Tomita Y, Vialatte FB, Dreyfus G, Mitsukura Y, Bakardjian H, Cichocki A (2014) Bimodal BCI using simultaneously NIRS and EEG. IEEE Trans Biomed Eng 61(4):1274–1284
- Tsunashima H, Yanagisawa K (2009) Measurement of brain function of car driver using functional near-infrared spectroscopy (fNIRS). Comput Intell Neurosci (2009) [DOI] [PMC free article] [PubMed]
- Tummeltshammer K, Feldman EC, Amso D (2019) Using pupil dilation, eye-blink rate, and the value of mother to investigate reward learning mechanisms in infancy. Dev Cogn Neurosci 36:100608 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ullsperger P, Freude G, Erdmann U (2001) Auditory probe sensitivity to mental workload changes-an event-related potential study. Int J Psychophysiol 40(3):201–209 [DOI] [PubMed] [Google Scholar]
- Ved H, Yildirim C (2021) Detecting mental workload in virtual reality using EEG spectral data: a deep learning approach. In: 2021 IEEE international conference on artificial intelligence and virtual reality (AIVR). pp 173–178
- Veltman J, Jansen C (2005) The role of operator state assessment in adaptive automation. TNO Defence Security and Safety Soesterberg (Netherlands)
- Vidulich MA, Ward GF, Schueren J (1991) Using the subjective workload dominance (SWORD) technique for projective workload assessment. Hum Factors 33(6):677–691
- Walter C, Schmidt S, Rosenstiel W, Gerjets P, Bogdan M (2013) Using cross-task classification for classifying workload levels in complex learning tasks. In: 2013 Humaine association conference on affective computing and intelligent interaction. IEEE, pp 876–881
- Wang Q, Sourina O (2013) Real-time mental arithmetic task recognition from EEG signals. IEEE Trans Neural Syst Rehabil Eng 21(2):225–232
- Wang R, Zhang J, Zhang Y, Wang X (2012) Assessment of human operator functional state using a novel differential evolution optimization based adaptive fuzzy model. Biomed Signal Process Control 7(5):490–498
- Wang S, Gwizdka J, Chaovalitwongse WA (2015) Using wireless EEG signals to assess memory workload in the n-back task. IEEE Trans Hum Mach Syst 46(3):424–435
- Wang YK, Jung TP, Lin CT (2015) EEG-based attention tracking during distracted driving. IEEE Trans Neural Syst Rehabil Eng 23(6):1085–1094
- Wickens CD (1979) Measures of workload, stress and secondary tasks. Mental workload. Springer, New York, pp 79–99
- Wickens CD, Gordon SE, Liu Y, Lee J (2004) An introduction to human factors engineering, vol 2. Pearson Prentice Hall, Upper Saddle River, NJ
- Yacoub E, Harel N, Shmuel A (2015) High-resolution fMRI. fMRI: from nuclear spins to brain functions. pp 769–791
- Yang S, Yin Z, Wang Y, Zhang W, Wang Y, Zhang J (2019) Assessing cognitive mental workload via EEG signals and an ensemble deep learning classifier based on denoising autoencoders. Comput Biol Med 109:159–170
- Yaple Z, Arsalidou M (2018) N-back working memory task: meta-analysis of normative fMRI studies with children. Child Dev 89(6):2010–2022
- Yeh YY, Wickens CD (1988) Dissociation of performance and subjective measures of workload. Hum Factors 30(1):111–120
- Yin Z, Zhang J (2016) Recognition of cognitive task load levels using single channel EEG and stacked denoising autoencoder. In: 35th Chinese control conference (CCC). IEEE, pp 3907–3912
- Zarjam P, Epps J, Chen F, Lovell NH (2013) Estimating cognitive workload using wavelet entropy-based features during an arithmetic task. Comput Biol Med 43(12):2186–2195
- Zarjam P, Epps J, Lovell NH (2015) Beyond subjective self-rating: EEG signal classification of cognitive workload. IEEE Trans Auton Ment Dev 7(4):301–310
- Zeng H, Yang C, Dai G, Qin F, Zhang J, Kong W (2018) EEG classification of driver mental states by deep learning. Cogn Neurodyn 12(6):597–606
- Zhang J, Nassef A, Mahfouf M, Linkens D, El-Samahy E, Hockey G et al (2006) Modelling and analysis of HRV under physical and mental workloads. IFAC Proc Vol 39(18):189–194
- Zhang J, Yin Z, Wang R (2014) Recognition of mental workload levels under complex human-machine collaboration by using physiological features and adaptive support vector machines. IEEE Trans Hum Mach Syst 45(2):200–214
- Zhang H, Chavarriaga R, Khaliliardali Z, Gheorghe L, Iturrate I, Millán JdR (2015) EEG-based decoding of error-related brain activity in a real-world driving task. J Neural Eng 12(6):066028
- Zhang P, Wang X, Chen J, You W (2017) Feature weight driven interactive mutual information modeling for heterogeneous bio-signal fusion to estimate mental workload. Sensors 17(10):2315
- Zhang P, Wang X, Zhang W, Chen J (2018) Learning spatial-spectral-temporal EEG features with recurrent 3D convolutional neural networks for cross-task mental workload assessment. IEEE Trans Neural Syst Rehabil Eng 27(1):31–42
- Zhang P, Wang X, Chen J, You W, Zhang W (2019) Spectral and temporal feature learning with two-stream neural networks for mental workload assessment. IEEE Trans Neural Syst Rehabil Eng 27(6):1149–1159
- Zhang J, Hua Y, Gu J, Chen Y, Yin Z (2022) Dynamic hierarchical learning of temporal-spatial-spectral EEG features with transformers for cognitive workload estimation. In: 41st Chinese control conference (CCC). IEEE, pp 7112–7117
- Zhang J, Mahfouf M, Linkens D, Nickel P, Hockey G (2007) Adaptive fuzzy model of operator functional state in human–machine system: a preliminary study. In: Proceedings of the IASTED international conference. pp 555–017
- Zhou Y, Huang S, Xu Z, Wang P, Wu X, Zhang D (2021) Cognitive workload recognition using EEG signals and machine learning: a review. IEEE Trans Cogn Dev Syst 14:799–818
- Zyma I, Tukaev S, Seleznov I, Kiyono K, Popov A, Chernykh M et al (2019) Electroencephalograms during mental arithmetic task performance. Data 4(1):14


