Author manuscript; available in PMC: 2025 Aug 1.
Published in final edited form as: Brain Res Bull. 2024 May 31;214:110992. doi: 10.1016/j.brainresbull.2024.110992

Mental Workload evaluation using weighted phase lag index and coherence features extracted from EEG data

Somayeh B Shafiei 1,*, Saeed Shadpour 2, Ambreen Shafqat 3
PMCID: PMC11734752  NIHMSID: NIHMS2037686  PMID: 38825253

Abstract

Electroencephalogram (EEG) represents an effective, non-invasive technology to study mental workload. However, volume conduction, a common EEG artifact, influences functional connectivity analysis of EEG data. EEG coherence has traditionally been used to investigate functional connectivity between brain areas associated with mental workload, while the weighted Phase Lag Index (wPLI) improves on coherence by reducing susceptibility to volume conduction. The goal of this study was to compare two methods of functional connectivity analysis, wPLI and coherence, in the context of mental workload evaluation. The study involved developing models for mental workload domains and comparing their performance using coherence-based features, wPLI-based features, and a combination of both. A generalized linear mixed-effects model (GLMM) with the least absolute shrinkage and selection operator (LASSO) feature selection method was used for model development. Results indicated that the model developed using a combination of both feature types demonstrated improved predictive performance across all mental workload domains, compared to models that used each feature type individually. The R² values were 0.82 for perceived task complexity, 0.71 for distraction, 0.91 for mental demand, 0.85 for physical demand, 0.74 for situational stress, and 0.74 for temporal demand. Furthermore, task complexity and functional connectivity patterns in different brain areas were identified as significant contributors to perceived mental workload (p < 0.05). The findings show the potential of using EEG data for mental workload evaluation and suggest that a combination of coherence and wPLI can improve the accuracy of predicting mental workload domains. Future research should aim to validate these results on larger, diverse datasets to confirm their generalizability and refine the predictive models.

Introduction

The application of Electroencephalography (EEG) in evaluating mental workload, particularly in surgical settings, has gained significant attention due to EEG’s ability to provide real-time information about cognitive states [1-5]. A notable aspect of using EEG in these environments is its ability to capture the dynamic changes in mental workload through the measurement of specific brain activities, such as theta and alpha waves, which are closely linked to cognitive effort and attention. However, a significant limitation of EEG-based mental workload evaluation is the challenge posed by volume conduction. This phenomenon occurs when local ionic currents pass through an active nervous membrane and generate electrical fields in the surrounding tissue [6]. These currents further spread through the conductive mediums of the brain, skull, and scalp, which leads to multiple electrodes picking up similar electrical potentials from the same source and makes the interpretation of EEG data complicated.

Volume conduction is a significant challenge in EEG signal analysis because it can create artificial correlations between signals recorded at different electrodes. For instance, signals recorded from adjacent electrodes may reflect overlapping sources rather than distinct neural interactions. This can obscure the true synchrony between brain regions and reduce the accuracy and reliability of coherence measures, leading to misleading interpretations of functional connectivity. Susceptibility to volume conduction is therefore an essential property that determines the validity of a functional connectivity calculation [7]. For mitigating the effect of volume conduction, the weighted Phase Lag Index (wPLI) is noteworthy for its ability to minimize zero-lag interactions that are typically attributed to volume conduction rather than true neural synchrony.

Coherence:

Coherence is a frequency-domain measure that assesses the degree of linear correspondence between the signals at two EEG channels as a function of frequency. In its magnitude-squared form, coherence is computed as the squared magnitude of the cross-spectral density of the two signals divided by the product of their auto-spectral densities [8, 9]. Volume conduction can artificially inflate coherence values because it causes signals at different electrodes to have similar spectral components solely due to the spreading of electrical activity, not actual neuronal connectivity.

Weighted Phase Lag Index (wPLI):

wPLI is a method developed to overcome the limitations posed by volume conduction in coherence analysis. It measures the asymmetry of the phase differences between two signals, which helps to mitigate the influence of zero-lag (or near-zero-lag) components likely caused by volume conduction. The wPLI improves upon traditional coherence by reducing the bias introduced by volume conduction. Since it focuses on the asymmetry of the phase distribution, wPLI discounts the contributions from zero phase lag, which are most susceptible to artifacts from volume conduction. This makes wPLI a more reliable measure when assessing true functional connectivity between EEG channels.

The aim of this study was to compare the performance of mental workload evaluation models that are developed using coherence-based and wPLI-based functional connectivity features. These models were validated through a subjective mental workload assessment using the Surgical Task Load Index (Surg-TLX).

Surg-TLX mental workload assessment tool:

Surgical Task Load Index (Surg-TLX) is a specific adaptation of the NASA Task Load Index (NASA-TLX) tool to the surgical environment. Mental workload in SURG-TLX tool refers to the mental, physical, and temporal demands, as well as the task complexity, situational stress, and distraction that the surgeon experiences during a surgical procedure [10]. Perceived complexity can relate to the mental workload as more complex tasks typically require higher levels of cognitive processing. Mental demand (MD) measures the mental and perceptual activity required to perform a task, including thinking, decision making, calculating, and remembering. Temporal demand (TD) measures the time pressure felt by a subject. It has a significant cognitive component as it often involves the need to process information quickly. Situational stress measures the stress perceived by the subject, which can be influenced by mental workload as high levels of cognitive demand can increase stress. While distraction and physical demand are not directly part of the cognitive load theory factors (mental demand, temporal demand, situational stress), they can be influenced by the mental workload associated with a task. High cognitive load can make participants more susceptible to distractions and can contribute to physical fatigue or strain, which can increase the perceived levels of distraction and physical demand, respectively. The range for each SURG-TLX metric is 1 to 20, where 1 is the lowest and 20 is the highest value.

Surg-TLX is a well-established method in the field for assessing mental workload in environments that require high precision and decision-making under stress, such as surgical settings. The use of Surg-TLX is well-documented in prior research that examines cognitive functions in high-stakes environments, providing a robust framework for comparison and benchmarking in our study. A study by Wilson et al. detailed the development and validation of Surg-TLX as a multi-dimensional measure specifically designed for the surgical context, highlighting its relevance and utility in assessing the perceived demands and stressors faced by surgical operators [10]. An additional systematic review explored the use of self-report instruments, including Surg-TLX, to evaluate the overall cognitive load across various surgical procedures and to assess learning curves within competency-based surgical education. This review also explains how cognitive demands evolve with increasing surgeon experience [11].

Purpose of current study:

The primary aim of this study was to explore how different methods for extracting functional connectivity features from EEG data—specifically coherence, which is susceptible to volume conduction; wPLI, which is less susceptible to volume conduction; and their combination—affect the evaluation of mental workload. This comparative analysis is novel because it assesses the individual contributions of each method and explores their combined effect in a unified model. It provides a comprehensive understanding of EEG-based mental workload assessment through the use of functional connectivity.

Methods

This study received approval from the Roswell Park Comprehensive Cancer Center’s Institutional Review Board (IRB: I-241913). The IRB granted a waiver of documentation of written consent, and instead, participants were presented with a research study information sheet and provided verbal consent. Twenty-one participants used the da Vinci® Skills Simulator to perform three simulator tasks (Figure 1). This simulator comprises two mechanical arms with instruments and a camera arm, all controlled via a computer console. Its tasks and software resemble those in popular virtual reality games, offering an engaging and immersive experience. A detailed table including participants’ age, gender, and performance on the surgical simulator tasks, along with SURG-TLX scores, is available in Supplement 1.

Figure 1. Experimental set up.

A) EEG data were recorded while the participant performed surgical simulator tasks using the da Vinci robot; B) EEG data were cleaned of artifacts and used to extract functional connectivity features through coherence and wPLI analyses; C) functional connectivity at different Brodmann areas of the brain, together with actual SURG-TLX scores, was used to develop machine learning models for evaluation of SURG-TLX metrics.

Cognitive Tasks:

Matchboard 1 (least complex), matchboard 2 (moderately complex), and matchboard 3 (most complex) were the tasks considered. The objective of these game-like exercises is to improve both psychomotor abilities and cognitive capabilities. These tasks involve diverse cognitive aspects, such as attention, memory, executive functions, and visuospatial skills. In all considered tasks, objects with shapes like numbers and letters appear around a matchboard with corresponding character-shaped recesses [12, 13]. In matchboard 1, participants should pick up each object and place it in the corresponding location; objects turn green when they are correctly placed. In matchboard 2, three panel doors are blocked by the matchboard below. Participants should retract the panel doors with one instrument and place the characters in their respective bins with the other instrument. In matchboard 3, three swinging panel doors and three sliding doors are blocked by the matchboard below. Participants should switch between three instrument arms to manipulate multiple doors. Participants use the third instrument to retract one of the sliding doors, then use the other instrument arms to open the swinging door, which reveals the character bin below. Participants should then place the appropriate character in the bin [12, 13].

The performance criterion was set at 70 out of 100, and participants repeated each task level until they reached this passing score. The performance score for each attempt was generated by the simulator program from tool-based metrics including 1) time to complete the task: the total amount of time the user spent on the task (unit: seconds); 2) economy of motion: total distance traveled by all instruments (unit: centimeters); 3) instrument collisions: total number of instrument-on-instrument collisions; 4) excessive instrument force: total time an excessive instrument force was applied above a prescribed threshold force (unit: seconds); 5) instruments out of view: total distance traveled by instruments outside of the user’s field of view (unit: centimeters); 6) master workspace range: radius of user’s working volume on master grips (unit: centimeters); and 7) drops: number of times an object was dropped in an inappropriate region of the surgical scene.

Actual mental workload scores:

At the end of each task attempt, participants completed the SURG-TLX questionnaire to rate their mental workload (range for each SURG-TLX metric: 1 to 20).

EEG recording and subjects:

The volume conduction phenomenon causes signal leakage from one channel to another and is a fundamental concern in EEG studies [14, 15]. To mitigate the effects of volume conduction on the data, a high-density EEG headset (for greater spatial resolution) and spatial filtering algorithms are required. Additionally, functional connectivity extraction methods that are less influenced by volume conduction should be used. In this study, EEG data were recorded from 21 participants (age: 36.5±12.7 years; 13 males and 8 females) at a sampling frequency of 500 Hz using a high-density EEG headset (AntNeuro®) with 124 channels. The recordings were made while the participants performed a cognitive task (matchboard) with three levels of complexity.

EEG preprocessing:

Signals recorded from F8, Poz, AF4, AF8, F6, FC3, M1, and M2 were excluded from the study due to poor signal quality. The advanced source analysis (ASA) framework developed by ANT Neuro Inspiring Technology Inc., Netherlands, was used to preprocess EEG data. The EEG data were re-referenced to the common average reference, calculated from all scalp channels. This method helps reduce noise and minimize the influence of common artifacts across channels [16]. Artifact correction employed blind source separation and a topographical Principal Component Analysis (PCA)-based approach. Line noise artifacts were removed using a 60 Hz notch filter. A band-pass filter (0.2–250 Hz) with a steepness of 24 dB/octave was applied. Visual inspection identified and removed facial, muscular, and other artifacts [16]. A spatial filtering method, spatial Laplacian (SP) method, was applied to the signals to alleviate the effect of spatial artifacts such as volume conduction [17].
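As an illustration of the re-referencing step only, the following minimal Python sketch subtracts the common average from every channel. It assumes the recording is held in a NumPy array of shape (channels, samples); the function name `common_average_reference` and the synthetic data are hypothetical and are not part of the ASA software used in this study.

```python
import numpy as np

def common_average_reference(eeg):
    """Re-reference EEG to the common average.

    eeg: array of shape (n_channels, n_samples).
    Returns an array of the same shape in which the mean across
    channels has been subtracted at every time sample.
    """
    eeg = np.asarray(eeg, dtype=float)
    return eeg - eeg.mean(axis=0, keepdims=True)

# Synthetic example (not study data): 4 channels sharing a constant offset.
rng = np.random.default_rng(0)
raw = rng.standard_normal((4, 1000)) + 5.0
car = common_average_reference(raw)
```

After this step the channel average is zero at every sample, so any component that is identical on all channels, such as the offset in the synthetic data, is removed.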

Extracting functional brain network components:

The brain’s function can be conceptualized as a network, referred to as the ‘functional brain network’ [18]. In this network, the brain areas (EEG channels) serve as nodes, while the functional connectivity index between pairs of brain areas defines the network’s connections [19]. In this study, the functional connectivity index was calculated using two approaches: 1) coherence analysis and 2) a pairwise phase synchronization technique (wPLI).

Functional connectivity index calculation using coherence analysis:

Coherence is a statistical measure that describes the correlation between two signals in the frequency domain, indicating the degree to which they vary together over a range of frequencies [20]. The coherence value ranges from 0 to 1. A coherence of 1 at a certain frequency indicates a perfect linear relationship between the two signals at that frequency, while a coherence of 0 signifies no linear relationship between the signals at that frequency. Coherence is sensitive to common sources, which can lead to a false indication of direct interaction between two regions. This occurs when both recording sites pick up activity from the same underlying source, which is known as the ‘volume conduction’ or ‘field spread’ issue.
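To make the coherence definition concrete, the sketch below estimates magnitude-squared coherence by averaging windowed spectra over overlapping segments (a Welch-style estimate). The segment length, the Hann window, the 50% overlap, and the synthetic signals are assumptions made for illustration; the source does not report the estimator settings used in this study.

```python
import numpy as np

def magnitude_squared_coherence(x, y, fs, nperseg=500):
    """Welch-style coherence: |Sxy|^2 / (Sxx * Syy) per frequency bin."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    window = np.hanning(nperseg)
    step = nperseg // 2  # 50% overlap between segments (assumed)
    n_bins = nperseg // 2 + 1
    sxx = np.zeros(n_bins)
    syy = np.zeros(n_bins)
    sxy = np.zeros(n_bins, dtype=complex)
    for start in range(0, len(x) - nperseg + 1, step):
        fx = np.fft.rfft(window * x[start:start + nperseg])
        fy = np.fft.rfft(window * y[start:start + nperseg])
        sxx += np.abs(fx) ** 2      # auto-spectrum of x
        syy += np.abs(fy) ** 2      # auto-spectrum of y
        sxy += fx * np.conj(fy)     # cross-spectrum
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, np.abs(sxy) ** 2 / (sxx * syy)

# Synthetic example: two noisy channels sharing a 10 Hz rhythm.
fs = 500
t = np.arange(5000) / fs
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 10 * t) + rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 10 * t + np.pi / 4) + rng.standard_normal(t.size)
freqs, coh = magnitude_squared_coherence(x, y, fs)
coh_10hz = coh[np.argmin(np.abs(freqs - 10))]
coh_high = coh[(freqs >= 100) & (freqs <= 200)].mean()
```

In this example the coherence is high at the shared 10 Hz rhythm and low in the band where the two channels contain only independent noise. Note that the estimate does not depend on the phase offset between the channels, which is why zero-lag (volume-conducted) activity also produces high coherence.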

Functional connectivity index calculation using pairwise phase synchronization techniques:

The wPLI is a measure of phase synchronization between two signals commonly used in neuroscience to assess functional connectivity between different brain regions [21]. Unlike coherence, which assesses the correlation between signals, the wPLI considers the consistency of phase differences, making it less sensitive to volume conduction [21, 22]. It quantifies the asymmetry in the distribution of instantaneous phase differences between signals, indicating the extent to which one signal leads or lags in phase relative to another, on average. The wPLI value ranges from 0 to 1, where 0 indicates no phase coupling or an equal distribution of phase lead and lag, and 1 indicates consistent phase leading or lagging. In this study, the Hilbert-Huang Transform algorithm (HHT) was applied to EEG signals to calculate the wPLI for pairs of channels, and the resulting functional connectivity values were used to construct functional connectivity networks. The HHT is specifically designed to analyze non-stationary and nonlinear data, providing accurate time-frequency representation. By combining the wPLI with HHT, this study aimed to offer a robust analysis of functional connectivity in complex signals like brainwaves, as the wPLI is less influenced by common sources or volume conduction.
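The behaviour of the wPLI can be illustrated with a short sketch. As a simplification, it obtains instantaneous phase from a plain FFT-based Hilbert transform of signals that are assumed to be narrow-band already; it does not implement the Hilbert-Huang Transform used in this study. The function names and synthetic signals are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT (Hilbert transform), for a 1-D real input."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def wpli(x, y):
    """wPLI = |mean(Im(cross))| / mean(|Im(cross)|) over time samples."""
    cross = analytic_signal(x) * np.conj(analytic_signal(y))
    imag = cross.imag
    denom = np.mean(np.abs(imag))
    return 0.0 if denom == 0 else float(np.abs(np.mean(imag)) / denom)

# Synthetic example: one 10 Hz source seen with zero lag and with a phase lag.
fs = 500
t = np.arange(5000) / fs
rng = np.random.default_rng(1)
source = np.sin(2 * np.pi * 10 * t)
x = source + 0.2 * rng.standard_normal(t.size)
y_zero_lag = 0.8 * source + 0.2 * rng.standard_normal(t.size)
y_lagged = np.sin(2 * np.pi * 10 * t - np.pi / 3) + 0.2 * rng.standard_normal(t.size)

wpli_zero_lag = wpli(x, y_zero_lag)   # copy of the source, as under volume conduction
wpli_lagged = wpli(x, y_lagged)       # consistent non-zero phase lag
```

The zero-lag pair yields a wPLI close to 0 even though both channels contain the same rhythm, whereas the consistently lagged pair yields a wPLI close to 1. This is the property that makes the measure less sensitive to volume conduction.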

Retrieved EEG features over brain’s Brodmann’s Areas (BA):

Each EEG channel was assigned to a specific Brodmann Area (BA) based on its approximate position above the respective BA [23]. To determine the corresponding BA for each EEG channel, resources such as the Brodmann’s Interactive Atlas and the Brain master software were utilized [3]. A total of 116 EEG channels were allocated to the 21 Brodmann areas (Figure 2). Subsequently, average coherence and average wPLI values were calculated across EEG channels within each distinct BA. This process resulted in 21 coherence-related features and 21 wPLI-related features, offering a comprehensive overview of brain activity.
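One possible way to form such per-area features is sketched below. The source does not specify how the averaging was carried out, so the sketch assumes that the connectivity values of all channel pairs lying within the same BA are averaged; the channel-to-BA labels and the connectivity matrix are hypothetical toy values.

```python
import numpy as np

def ba_average_connectivity(conn, channel_to_ba):
    """Average pairwise connectivity among channels assigned to each BA.

    conn: symmetric (n_channels, n_channels) connectivity matrix.
    channel_to_ba: sequence with one BA label per channel.
    Returns {BA label: mean over within-BA channel pairs}; NaN when a BA
    contains fewer than two channels (no pair exists).
    """
    conn = np.asarray(conn, dtype=float)
    labels = np.asarray(channel_to_ba)
    features = {}
    for ba in np.unique(labels):
        idx = np.flatnonzero(labels == ba)
        key = ba.item()
        if len(idx) < 2:
            features[key] = float("nan")
            continue
        block = conn[np.ix_(idx, idx)]
        pairs = block[np.triu_indices(len(idx), k=1)]  # exclude the diagonal
        features[key] = float(pairs.mean())
    return features

# Toy example: channels 0-2 assigned to BA 6, channel 3 assigned to BA 10.
conn = np.array([
    [1.0, 0.2, 0.4, 0.9],
    [0.2, 1.0, 0.6, 0.1],
    [0.4, 0.6, 1.0, 0.3],
    [0.9, 0.1, 0.3, 1.0],
])
features = ba_average_connectivity(conn, [6, 6, 6, 10])
```

Applying the same function to a coherence matrix and to a wPLI matrix would give one coherence feature and one wPLI feature per area.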

Figure 2. Brodmann’s Areas (BA) considered in this study.

A) EEG data were recorded using a 124-channel EEG headset. Signals from 8 EEG channels were excluded due to poor quality. B) Remaining signals from 116 EEG channels were assigned to the 21 Brodmann areas. C) List of channels roughly above each Brodmann’s Area.

Mental workload evaluation model development:

The Least Absolute Shrinkage and Selection Operator (LASSO) is a statistical technique that performs variable selection and regularization to improve the prediction accuracy and interpretability of the resulting model. It achieves this by constraining the sum of the absolute values of the model parameters, thereby forcing some coefficient estimates to become exactly zero. Combining LASSO with a Generalized Linear Mixed-Effects Model (GLMM), an approach known as GLMM-LASSO, allows variable selection and shrinkage to be performed simultaneously, making it effective for models with both fixed and random effects [24].

The GLMM-LASSO algorithm was employed to develop models for evaluating mental workload domains, specifically SURG-TLX metrics. This involved using coherence-based features, wPLI-based features, and a combination of both types of features. The tuning parameter in the GLMM-LASSO algorithm, lambda, was selected using the Bayesian Information Criterion (BIC) and five-fold cross-validation techniques. Subsequently, all variables selected by GLMM-LASSO were used to fit the linear mixed model.
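The selection idea can be illustrated with a deliberately simplified sketch: an ordinary (fixed-effects only) LASSO fitted by coordinate descent, with lambda chosen by BIC. It omits the random effects and the cross-validation step of GLMM-LASSO, so it is not the procedure used in this study. The data, the lambda grid, and all function names are hypothetical.

```python
import numpy as np

def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    return np.sign(z) * max(abs(z) - gamma, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    residual = y.copy()  # equals y - X @ beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(p):
            residual += X[:, j] * beta[j]        # drop feature j's contribution
            rho = X[:, j] @ residual / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            residual -= X[:, j] * beta[j]        # add the updated contribution
    return beta

def bic(X, y, beta):
    """Gaussian BIC with the number of non-zero coefficients as model size."""
    n = len(y)
    rss = float(((y - X @ beta) ** 2).sum())
    return n * np.log(rss / n) + np.count_nonzero(beta) * np.log(n)

# Synthetic data: only the first two of ten standardized features matter.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
X = (X - X.mean(axis=0)) / X.std(axis=0)
true_beta = np.zeros(p)
true_beta[:2] = [1.5, -1.0]
y = X @ true_beta + 0.5 * rng.standard_normal(n)
y = y - y.mean()

lambdas = [0.01, 0.05, 0.1, 0.2]
fits = {lam: lasso_cd(X, y, lam) for lam in lambdas}
best_lam = min(lambdas, key=lambda lam: bic(X, y, fits[lam]))
best_beta = fits[best_lam]
selected = np.flatnonzero(best_beta)  # indices of retained features
```

Larger values of lambda set more coefficients exactly to zero. In a two-stage procedure such as the one described above, the retained features would then be passed to a separate model fit.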

Model performance metrics included Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the coefficient of determination (R²), which are common metrics for evaluating the performance of regression models. MAE is the average absolute difference between the predicted and actual values, with lower values indicating better model performance. RMSE is similar to MAE, but squares the differences before averaging and then takes the square root, which gives more weight to larger errors. R² is the proportion of variance in the actual values that is explained by the model, with higher values indicating better model performance.
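For reference, the three metrics can be computed as in the sketch below, which uses their standard definitions. The source does not state which R² variant was reported for the mixed models, so the formula here (one minus the ratio of residual to total sum of squares) is an assumption, and the scores are made-up values on the 1 to 20 SURG-TLX scale.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R2) for predicted versus actual values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return mae, rmse, 1.0 - ss_res / ss_tot

# Made-up example scores, not study data.
actual = [10, 12, 8, 15]
predicted = [11, 12, 6, 14]
mae, rmse, r2 = regression_metrics(actual, predicted)
```

In this example the single error of 2 points raises the RMSE (about 1.22) above the MAE (1.0), which illustrates the heavier weighting of large errors.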

Results

Results derived from the application of GLMM-LASSO analyses:

Table 1 shows the performance of models for mental workload domains using coherence-based and wPLI-based functional connectivity features and GLMM-LASSO analysis.

Table 1.

Performance of the developed models for evaluation of mental workload domains.

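| Features used | Metric | Complexity | Distraction | MD | PD | Stress | TD |
|---|---|---|---|---|---|---|---|
| Coherence-based only | MAE | 1.37 | 0.78 | 1.33 | 0.93 | 1.40 | 1.33 |
| Coherence-based only | RMSE | 1.84 | 1.18 | 1.87 | 1.35 | 2.03 | 1.87 |
| Coherence-based only | R² | 0.78 | 0.67 | 0.72 | 0.84 | 0.71 | 0.72 |
| wPLI-based only | MAE | 1.37 | 0.77 | 1.02 | 0.90 | 1.34 | 1.33 |
| wPLI-based only | RMSE | 1.87 | 1.21 | 1.39 | 1.32 | 2.01 | 1.86 |
| wPLI-based only | R² | 0.77 | 0.66 | 0.87 | 0.84 | 0.72 | 0.72 |
| Coherence-based and wPLI-based | MAE | 1.27 | 0.76 | 0.85 | 0.87 | 1.33 | 1.31 |
| Coherence-based and wPLI-based | RMSE | 1.67 | 1.14 | 1.16 | 1.28 | 1.92 | 1.80 |
| Coherence-based and wPLI-based | R² | 0.82 | 0.71 | 0.91 | 0.85 | 0.74 | 0.74 |

MD: mental demand; PD: physical demand; TD: temporal demand.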

The results demonstrate that integrating both coherence-based and wPLI-based features into the model enhances predictive performance across all mental workload domains. The combined-feature model had the lowest MAE and RMSE and the highest R² in every domain, compared to models using only coherence or only wPLI features. The lowest MAE indicates the best average predictive accuracy, the lowest RMSE suggests fewer large errors, and the highest R² indicates the greatest explanatory power. Tables 2-7 present the models developed using coherence-based and wPLI-based features for evaluation of mental demand (Table 2), distraction (Table 3), temporal demand (Table 4), task complexity (Table 5), physical demand (Table 6), and situational stress (Table 7).
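The comparison across feature sets can be checked directly from the values reported in Table 1. The short sketch below transcribes those values and identifies, for each metric and domain, which feature set performs best (lowest MAE and RMSE, highest R²). The dictionary layout and function name are illustrative choices.

```python
# Values transcribed from Table 1; order of domains in each list:
domains = ["Complexity", "Distraction", "MD", "PD", "Stress", "TD"]
table1 = {
    "coherence": {
        "MAE": [1.37, 0.78, 1.33, 0.93, 1.40, 1.33],
        "RMSE": [1.84, 1.18, 1.87, 1.35, 2.03, 1.87],
        "R2": [0.78, 0.67, 0.72, 0.84, 0.71, 0.72],
    },
    "wPLI": {
        "MAE": [1.37, 0.77, 1.02, 0.90, 1.34, 1.33],
        "RMSE": [1.87, 1.21, 1.39, 1.32, 2.01, 1.86],
        "R2": [0.77, 0.66, 0.87, 0.84, 0.72, 0.72],
    },
    "combined": {
        "MAE": [1.27, 0.76, 0.85, 0.87, 1.33, 1.31],
        "RMSE": [1.67, 1.14, 1.16, 1.28, 1.92, 1.80],
        "R2": [0.82, 0.71, 0.91, 0.85, 0.74, 0.74],
    },
}

def best_feature_set(metric, domain_index):
    """Name of the feature set with the best value of a metric in one domain."""
    values = {name: scores[metric][domain_index] for name, scores in table1.items()}
    pick = max if metric == "R2" else min  # higher R2 is better; lower error is better
    return pick(values, key=values.get)

winners = {
    metric: [best_feature_set(metric, i) for i in range(len(domains))]
    for metric in ("MAE", "RMSE", "R2")
}
for metric, names in winners.items():
    print(metric, dict(zip(domains, names)))
```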

Table 2.

Results of a GLMM-LASSO model for mental demand evaluation.

| Mental demand predictor | Coefficient | Standard error | p-value |
|---|---|---|---|
| Task complexity level 2 | −0.05 | 0.23 | 0.83 |
| Task complexity level 3 | 1.02 | 0.26 | <0.001 |
| Average coherence at Brodmann’s Area 8 | 0.44 | 0.26 | 0.09 |
| Average coherence at Brodmann’s Area 37 | −0.64 | 0.13 | <0.001 |
| Average coherence at Brodmann’s Area 39 | −0.67 | 0.32 | 0.03 |
| Average coherence at Brodmann’s Area 44 | 0.73 | 0.18 | <0.001 |
| Average coherence at Brodmann’s Area 46 | −0.32 | 0.26 | 0.22 |
| Average coherence at Brodmann’s Area 47 | 0.45 | 0.15 | 0.003 |
| Average wPLI at Brodmann’s Area 46 | 0.35 | 0.14 | 0.01 |

R² = 0.91; RMSE = 1.16; MAE = 0.85; number of samples: 216

Table 7.

Results of a GLMM-LASSO model for situational stress evaluation.

| Situational stress predictor | Coefficient | Standard error | p-value |
|---|---|---|---|
| Task complexity level 2 | −0.03 | 0.23 | 0.91 |
| Task complexity level 3 | 1.22 | 0.24 | <0.001 |
| Average coherence at Brodmann’s Area 5 | −0.58 | 0.23 | 0.01 |
| Average coherence at Brodmann’s Area 7 | 0.49 | 0.27 | 0.07 |
| Average coherence at Brodmann’s Area 8 | 0.69 | 0.22 | 0.002 |
| Average coherence at Brodmann’s Area 9 | 0.33 | 0.09 | <0.001 |
| Average coherence at Brodmann’s Area 19 | −0.29 | 0.23 | 0.20 |
| Average coherence at Brodmann’s Area 20 | −0.22 | 0.13 | 0.08 |
| Average coherence at Brodmann’s Area 37 | −0.34 | 0.13 | 0.01 |
| Average coherence at Brodmann’s Area 44 | 0.39 | 0.17 | 0.02 |
| Average coherence at Brodmann’s Area 46 | −0.52 | 0.24 | 0.03 |
| Average coherence at Brodmann’s Area 47 | −0.14 | 0.15 | 0.33 |
| Average wPLI at Brodmann’s Area 6 | −0.22 | 0.17 | 0.18 |
| Average wPLI at Brodmann’s Area 8 | −0.24 | 0.17 | 0.16 |
| Average wPLI at Brodmann’s Area 10 | 0.31 | 0.13 | 0.01 |
| Average wPLI at Brodmann’s Area 18 | 0.23 | 0.16 | 0.17 |
| Average wPLI at Brodmann’s Area 37 | −0.48 | 0.11 | <0.001 |
| Average wPLI at Brodmann’s Area 44 | −0.23 | 0.09 | 0.01 |

R² = 0.74; RMSE = 1.92; MAE = 1.33; number of samples: 216

Table 3.

Results of a GLMM-LASSO model for distraction evaluation.

| Distraction predictor | Coefficient | Standard error | p-value |
|---|---|---|---|
| Task complexity level 2 | −0.12 | 0.23 | 0.59 |
| Task complexity level 3 | 0.51 | 0.24 | 0.03 |
| Average coherence at Brodmann’s Area 6 | 0.39 | 0.29 | 0.18 |
| Average coherence at Brodmann’s Area 37 | 0.50 | 0.13 | <0.001 |
| Average coherence at Brodmann’s Area 39 | −0.30 | 0.29 | 0.30 |
| Average coherence at Brodmann’s Area 40 | −0.32 | 0.24 | 0.18 |
| Average coherence at Brodmann’s Area 42 | −0.25 | 0.12 | 0.03 |
| Average wPLI at Brodmann’s Area 6 | −0.17 | 0.17 | 0.30 |
| Average wPLI at Brodmann’s Area 10 | −0.22 | 0.12 | 0.08 |

R² = 0.71; RMSE = 1.14; MAE = 0.76; number of samples: 216

Table 4.

Results of a GLMM-LASSO model for temporal demand evaluation.

| Temporal demand (TD) predictor | Coefficient | Standard error | p-value |
|---|---|---|---|
| Task complexity level 2 | 0.28 | 0.24 | 0.24 |
| Task complexity level 3 | 1.34 | 0.26 | <0.001 |
| Average coherence at Brodmann’s Area 9 | 0.23 | 0.08 | 0.006 |
| Average coherence at Brodmann’s Area 18 | 0.34 | 0.12 | 0.007 |
| Average coherence at Brodmann’s Area 19 | −0.23 | 0.22 | 0.30 |
| Average coherence at Brodmann’s Area 21 | 0.74 | 0.18 | <0.001 |
| Average coherence at Brodmann’s Area 40 | −0.48 | 0.23 | 0.04 |
| Average coherence at Brodmann’s Area 46 | −0.49 | 0.20 | 0.01 |
| Average wPLI at Brodmann’s Area 1 | −0.28 | 0.09 | 0.002 |
| Average wPLI at Brodmann’s Area 2 | 0.20 | 0.10 | 0.05 |
| Average wPLI at Brodmann’s Area 6 | −0.49 | 0.16 | 0.004 |
| Average wPLI at Brodmann’s Area 19 | 0.43 | 0.14 | 0.002 |
| Average wPLI at Brodmann’s Area 20 | −0.30 | 0.14 | 0.03 |
| Average wPLI at Brodmann’s Area 39 | −0.32 | 0.11 | 0.007 |

R² = 0.74; RMSE = 1.80; MAE = 1.31; number of samples: 216

Table 5.

Results of a GLMM-LASSO model for perceived complexity evaluation.

| Perceived complexity predictor | Coefficient | Standard error | p-value |
|---|---|---|---|
| Task complexity level 2 | 0.15 | 0.23 | 0.52 |
| Task complexity level 3 | 1.57 | 0.26 | <0.001 |
| Average coherence at Brodmann’s Area 2 | −0.52 | 0.15 | <0.001 |
| Average coherence at Brodmann’s Area 5 | −0.23 | 0.21 | 0.26 |
| Average coherence at Brodmann’s Area 10 | 0.20 | 0.09 | 0.03 |
| Average coherence at Brodmann’s Area 18 | 0.34 | 0.13 | 0.01 |
| Average coherence at Brodmann’s Area 19 | −0.47 | 0.23 | 0.04 |
| Average coherence at Brodmann’s Area 20 | −0.38 | 0.13 | 0.003 |
| Average coherence at Brodmann’s Area 21 | 0.28 | 0.18 | 0.12 |
| Average coherence at Brodmann’s Area 37 | −0.35 | 0.13 | 0.008 |
| Average coherence at Brodmann’s Area 39 | 0.66 | 0.29 | 0.02 |
| Average coherence at Brodmann’s Area 40 | −0.54 | 0.24 | 0.02 |
| Average coherence at Brodmann’s Area 41 | −0.44 | 0.16 | 0.006 |
| Average coherence at Brodmann’s Area 44 | 0.78 | 0.17 | <0.001 |
| Average coherence at Brodmann’s Area 46 | −0.47 | 0.22 | 0.03 |
| Average wPLI at Brodmann’s Area 5 | −0.46 | 0.17 | 0.006 |
| Average wPLI at Brodmann’s Area 6 | −0.40 | 0.17 | 0.02 |
| Average wPLI at Brodmann’s Area 37 | −0.46 | 0.11 | <0.001 |
| Average wPLI at Brodmann’s Area 39 | 0.25 | 0.12 | 0.03 |
| Average wPLI at Brodmann’s Area 40 | −0.28 | 0.13 | 0.03 |
| Average wPLI at Brodmann’s Area 44 | −0.13 | 0.09 | 0.15 |
| Average wPLI at Brodmann’s Area 46 | 0.45 | 0.14 | 0.002 |

R² = 0.82; RMSE = 1.67; MAE = 1.27; number of samples: 216

Table 6.

Results of a GLMM-LASSO model for physical demand evaluation.

| Physical demand predictor | Coefficient | Standard error | p-value |
|---|---|---|---|
| Task complexity level 2 | −0.05 | 0.22 | 0.81 |
| Task complexity level 3 | 0.89 | 0.23 | <0.001 |
| Average coherence at Brodmann’s Area 37 | 0.44 | 0.12 | <0.001 |
| Average coherence at Brodmann’s Area 39 | 0.22 | 0.25 | 0.38 |
| Average coherence at Brodmann’s Area 40 | −0.32 | 0.22 | 0.14 |
| Average wPLI at Brodmann’s Area 6 | −0.46 | 0.16 | 0.005 |
| Average wPLI at Brodmann’s Area 10 | −0.17 | 0.12 | 0.15 |
| Average wPLI at Brodmann’s Area 20 | −0.17 | 0.14 | 0.20 |
| Average wPLI at Brodmann’s Area 21 | −0.17 | 0.14 | 0.22 |

R² = 0.85; RMSE = 1.28; MAE = 0.87; number of samples: 216

Discussion

The performance of models developed using coherence-based features alone demonstrates moderate to good predictive accuracy across most mental workload domains. However, it is important to note that coherence is known to be sensitive to volume conduction, where signals from distant brain regions may appear to be synchronized due to shared electrical activity rather than direct functional connectivity. As a result, coherence-based models may overestimate functional connectivity in some cases, which may contribute to their higher MAE and RMSE compared to wPLI-based models for some metrics such as MD and stress.

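Results indicate that models developed using wPLI-based features alone matched or outperformed coherence-based models in most mental workload domains. The clearest gain was for mental demand (R² of 0.87 versus 0.72), whereas the two feature types performed similarly for physical demand, stress, and temporal demand, and coherence-based models had marginally higher R² for perceived complexity (0.78 versus 0.77) and distraction (0.67 versus 0.66). Where wPLI-based models perform better, this can be attributed to their reduced sensitivity to volume conduction effects. While coherence-based models may overestimate functional connectivity due to volume conduction, wPLI-based models provide estimates that are less affected by it because they focus on the asymmetry of phase differences between signals rather than on their correlation.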

Interestingly, combining coherence-based and wPLI-based features in the models results in improved predictive performance across all mental workload domains, as evidenced by lower MAE and RMSE values compared to models using either feature type alone. This suggests that while coherence-based features may capture some spurious connectivity due to volume conduction, they still contribute valuable information to the overall predictive model when used in conjunction with wPLI-based features. This highlights the importance of considering multiple features and their respective strengths and limitations when evaluating functional connectivity in EEG data, particularly in the presence of volume conduction.

In the context of mental workload evaluation, both coherence and wPLI are important because they capture different aspects of brain dynamics. Mental workload is a complex cognitive function that likely involves multiple interacting processes in the brain. Coherence can provide information about consistent synchronization between brain regions, potentially reflecting shared information processing or co-activation of these regions. On the other hand, the wPLI can provide complementary information about more specific phase-lagged interactions between regions that are less influenced by common sources. Thus, by using both coherence and wPLI, a more comprehensive picture of the brain dynamics associated with different levels of mental workload could be captured.

The predictors included the level of task complexity and EEG-based functional connectivity features (coherence-based or wPLI-based) at designated Brodmann areas—regions of the brain cortex defined by their histological structure and function. The positive association observed between task complexity level 3 and mental workload domains (p < 0.05) suggests that as task complexity increases, perceived mental workload in each domain also tends to increase.

The other significant predictors are various coherence-based and wPLI-based features at different Brodmann areas, which have both positive and negative effects. The positive or negative influence of these brain measures on mental workload suggests specific brain areas and types of neural synchronization may play a role in how these aspects of mental workload are experienced.

The negative association between average wPLI at BA6 and temporal demand may be related to BA6’s role in motor planning [25, 26], motor learning [27], and working memory [28], and suggests that high temporal demand is associated with disruption of the efficient functioning of these processes.

Negative associations were found between mental demand and average coherence at BAs 37 and 39. These areas play a role in various types of memory processing [29, 30], motor functions [27, 31, 32], and visual processing [33, 34]. The finding that higher mental demand is associated with lower synchronization and co-activation within these areas suggests that these regions must ‘work harder’ under greater mental loads, potentially leading to desynchronization of neural activity.

The negative association between average wPLI at BAs 37 and 44 and situational stress may suggest that these areas might ‘desynchronize’ or become less coordinated under high-stress conditions. In contrast, the positive association between average coherence at BA37 and physical demand could indicate increased activity in this region when the body is under physical stress, potentially supporting its role in processing sensory inputs related to the body’s state and external environment.

The mixed associations found with perceived complexity - with both positive (BA44) and negative associations (BAs 2, 5, 37) - may reflect the fact that complexity can engage many different cognitive processes, including memory, attention, and sensory processing, which are associated with these areas [35-37]. A potential reason is that some of these processes become more synchronized, while others become less synchronized, under conditions of high complexity.

Overall, the results of this study underscore the potential of using features derived from recordings of brain activity, specifically EEG, to predict different aspects of perceived mental workload. They suggest that task complexity, as well as specific patterns of neural synchronization in different brain areas, play crucial roles in determining the perceived mental workload. However, further research is needed to fully understand these relationships and to refine the evaluation models. Additionally, these findings need to be validated on different and larger datasets to confirm their generalizability.

Limitations and future directions of EEG-based workload assessment:

EEG-based mental workload evaluation is effective for studying cognitive processes and brain function, yet it comes with notable limitations. Firstly, the phenomenon of volume conduction significantly impacts EEG data, which can challenge the accuracy of interpreting results. The weighted Phase Lag Index (wPLI) is less affected by these artifacts compared to simpler measures like coherence, providing a more reliable measure of true functional connectivity. Nevertheless, residual effects from volume conduction may still influence its outcomes, necessitating future exploration of advanced signal processing techniques, such as enhanced spatial filtering or novel computational models that directly address this issue.
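To illustrate the contrast between the two measures discussed above, the following is a minimal simulation sketch, not the analysis pipeline used in this study. It assumes magnitude-squared coherence and the wPLI estimator of Vinck et al. [21], both evaluated at a single frequency across epochs. The function names, the 10 Hz source, the sampling rate, and the noise level are illustrative assumptions.

```python
import cmath
import math
import random


def fourier_coeff(epoch, freq, fs):
    """Single-frequency discrete Fourier coefficient of one epoch."""
    return sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
               for k, x in enumerate(epoch))


def coherence_and_wpli(epochs_x, epochs_y, freq, fs):
    """Magnitude-squared coherence and wPLI at one frequency, across epochs."""
    sxy, sxx, syy = [], [], []
    for ex, ey in zip(epochs_x, epochs_y):
        fx = fourier_coeff(ex, freq, fs)
        fy = fourier_coeff(ey, freq, fs)
        sxy.append(fx * fy.conjugate())   # per-epoch cross-spectrum
        sxx.append(abs(fx) ** 2)          # per-epoch auto-spectra
        syy.append(abs(fy) ** 2)
    n = len(sxy)
    coh = abs(sum(sxy) / n) ** 2 / ((sum(sxx) / n) * (sum(syy) / n))
    # wPLI uses only the imaginary part of the cross-spectrum, which a
    # zero-lag (instantaneously mixed) source cannot produce.
    imag = [s.imag for s in sxy]
    wpli = abs(sum(imag)) / sum(abs(v) for v in imag)
    return coh, wpli


def simulate(lag_rad, n_epochs=200, n_samples=128, fs=128.0, freq=10.0, seed=0):
    """Two noisy channels sharing one sinusoidal source (hypothetical setup).

    lag_rad = 0 mimics volume conduction (same source, no delay);
    lag_rad > 0 mimics a genuinely lagged interaction.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_epochs):
        phi = rng.uniform(0.0, 2.0 * math.pi)  # random source phase per epoch
        w = 2.0 * math.pi * freq / fs
        xs.append([math.sin(w * k + phi) + rng.gauss(0.0, 1.0)
                   for k in range(n_samples)])
        ys.append([math.sin(w * k + phi - lag_rad) + rng.gauss(0.0, 1.0)
                   for k in range(n_samples)])
    return coherence_and_wpli(xs, ys, freq, fs)


if __name__ == "__main__":
    print("zero lag (coherence, wPLI):", simulate(0.0))
    print("quarter-cycle lag (coherence, wPLI):", simulate(math.pi / 2))
```

In this toy setting, a zero-lag common source should yield high coherence but a wPLI close to zero, whereas a quarter-cycle lag should yield high values for both. Real EEG involves many sources and a finite number of epochs, so the separation is less clean in practice, consistent with the residual volume conduction effects noted above.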

Secondly, EEG primarily captures electrical activity on the scalp and has lower spatial resolution than methods like fMRI, which complicates the precise identification of brain locations, especially in regions like the prefrontal cortex. Although EEG’s excellent temporal resolution permits detailed tracking of rapid brain activity changes, this comes at the cost of spatial accuracy, limiting the detailed localization of these processes.

Lastly, EEG setups, particularly those with high-density arrays, can be cumbersome and time-consuming, which may affect the naturalness of tasks and influence participant behavior. Integrating EEG with other imaging methods such as fMRI or near-infrared spectroscopy (NIRS) could combine the high temporal resolution of EEG with the superior spatial resolution of these modalities. Future studies might focus on developing integrated protocols that effectively synchronize these technologies. Moreover, integrating EEG data with other physiological measures, such as heart rate and galvanic skin response, can offer a more comprehensive view of cognitive states and reduce data interpretation ambiguity.

Limitations of this study and potential directions for future research:

A limitation of this study is the small sample size of 21 participants, which may restrict the generalizability of the findings. Recruiting participants for surgical task studies is inherently challenging due to the extensive preparation and specific training required, a common issue in this field as evidenced by similar studies [38, 39]. Despite these constraints, this preliminary study provides valuable initial findings. Future research should aim to include a larger and more diverse group of participants and tasks to enhance the generalizability of the findings.

Future studies could also benefit from exploring a broader range of validation methods to assess mental workload metrics. While the current study employed the Surg-TLX, known for its established reliability in high-stakes environments like surgery, other validation methods could further strengthen the generalizability and robustness of the models.

EEG’s role extends into research on cognitive decline, including Alzheimer’s disease, where it is primarily used to study brain activity patterns and disruptions [6, 15, 40-43]. The focus of the current study on overcoming challenges such as volume conduction in EEG analysis, and on improving cognitive assessments using coherence and wPLI features, could benefit cognitive load evaluations in clinical settings, including Alzheimer’s disease diagnosis. To bridge the gap between our findings and their potential clinical utility, further studies are necessary. These should explore viable pathways for implementing our EEG-based cognitive assessment methods in clinical trials or diagnostic settings. Collaborations with clinical researchers could be particularly beneficial, allowing these methods to be tested in targeted studies on Alzheimer’s patients, assessing both their practicality and effectiveness in clinical contexts.

Supplementary Material

Supplementary material

Acknowledgments

The current study was supported by the National Institute on Aging (NIA) under grant number 3R01EB029398-03S1, and the National Institute of Biomedical Imaging and Bioengineering of the National Institutes of Health under grant number R01EB029398. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This work was supported by a National Cancer Institute (NCI) grant (P30CA016056), which involved the use of shared resources at Roswell Park Comprehensive Cancer Center, including the Comparative Oncology and the Applied Technology Laboratory for Advanced Surgery (ATLAS) illustration studios. The authors thank all participants in the study.

Footnotes

Conflicts of interest

The authors declare that they have no conflict of interest.

Data collection from human subjects: This study received approval from the Roswell Park Comprehensive Cancer Center’s Institutional Review Board (IRB: I-241913). The IRB granted a waiver of documentation of consent, and instead, participants were presented with a research study information sheet and provided verbal consent.

Contributor Information

Somayeh B. Shafiei, the Intelligent Cancer Care Laboratory, the Department of Urology, Roswell Park Comprehensive Cancer Center in Buffalo, NY 14263, USA.

Saeed Shadpour, the Department of Animal Biosciences, University of Guelph, Guelph, Ontario N1G 2W1, Canada.

Ambreen Shafqat, the Intelligent Cancer Care Laboratory, the Department of Urology, Roswell Park Comprehensive Cancer Center in Buffalo, NY 14263, USA.

Data availability statement

The EEG data analyzed in the current study are available at: Shafiei, S. B., Shadpour, S., Mohler, J., Seilanian Toussi, M., Doherty, P., & Jing, Z. (2023). Electroencephalogram and eye-gaze datasets for robot-assisted surgery performance evaluation (version 1.0.0). PhysioNet [12].

Code availability statement

No custom code or mathematical algorithm was developed for this study. Details regarding the specific codes used can be found in the references cited. Any access restrictions or licensing information associated with the codes used can be obtained from the respective sources as indicated in the references.

References

  • 1. Guru KA, et al., Understanding cognitive performance during robot-assisted surgery. Urology, 2015. 86(4): p. 751–757.
  • 2. Berka C., et al., Real-time analysis of EEG indexes of alertness, cognition, and memory acquired with a wireless EEG headset. International Journal of Human-Computer Interaction, 2004. 17(2): p. 151–170.
  • 3. Shadpour S., et al., Developing cognitive workload and performance evaluation models using functional brain network analysis. npj Aging, 2023. 9(1): p. 22.
  • 4. Shafiei SB, et al., Evaluating the mental workload during robot-assisted surgery utilizing network flexibility of human brain. IEEE Access, 2020. 8: p. 204012–204019.
  • 5. Shafiei* S, et al., MP34-19 BRAIN NETWORK REGIONAL FLEXIBILITY HAS RELATIONSHIP WITH MENTAL WORKLOAD DURING ROBOT-ASSISTED SURGERY PERFORMANCE. The Journal of Urology, 2020. 203(Supplement 4): p. e510–e511.
  • 6. Ruiz-Gómez SJ, et al. Volume conduction effects on connectivity metrics: application of network parameters to characterize Alzheimer’s disease continuum. in 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). 2020. IEEE.
  • 7. Nunez PL, et al., EEG coherency: I: statistics, reference electrode, volume conduction, Laplacians, cortical imaging, and interpretation at multiple scales. Electroencephalography and clinical neurophysiology, 1997. 103(5): p. 499–515.
  • 8. Bendat JS and Piersol AG, Random data: analysis and measurement procedures. 2011: John Wiley & Sons.
  • 9. Carter GC, Coherence and time delay estimation. Proceedings of the IEEE, 1987. 75(2): p. 236–255.
  • 10. Wilson MR, et al., Development and validation of a surgical workload measure: the surgery task load index (SURG-TLX). World journal of surgery, 2011. 35: p. 1961–1969.
  • 11. Dias RD, et al., Systematic review of measurement tools to assess surgeons' intraoperative cognitive workload. Journal of British Surgery, 2018. 105(5): p. 491–501.
  • 12. Shafiei SB, Shadpour S, Mohler J, Seilanian Toussi M, Doherty P, Jing Z, 2023. Electroencephalogram and eye-gaze datasets for robot-assisted surgery performance evaluation. PhysioNet. 10.13026/qj5m-n649 (version 1.0.0).
  • 13. Goldberger AL, et al., PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. circulation, 2000. 101(23): p. e215–e220.
  • 14. Herreras O., Local field potentials: myths and misunderstandings. Frontiers in neural circuits, 2016. 10: p. 101.
  • 15. Briels CT, et al., Reproducibility of EEG functional connectivity in Alzheimer’s disease. Alzheimer's research & therapy, 2020. 12: p. 1–14.
  • 16. Luck SJ, An introduction to the event-related potential technique. 2014: MIT press.
  • 17. Srinivasan R., et al., EEG and MEG coherence: measures of functional connectivity at distinct spatial scales of neocortical dynamics. Journal of neuroscience methods, 2007. 166(1): p. 41–52.
  • 18. Lynn CW and Bassett DS, The physics of brain network structure, function and control. Nature Reviews Physics, 2019: p. 1.
  • 19. Bullmore E and Sporns O, Complex brain networks: graph theoretical analysis of structural and functional systems. Nature reviews neuroscience, 2009. 10(3): p. 186.
  • 20. Nolte G., et al., Identifying true brain interaction from EEG data using the imaginary part of coherency. Clinical neurophysiology, 2004. 115(10): p. 2292–2307.
  • 21. Vinck M., et al., An improved index of phase-synchronization for electrophysiological data in the presence of volume-conduction, noise and sample-size bias. Neuroimage, 2011. 55(4): p. 1548–1565.
  • 22. Stam CJ, Nolte G, and Daffertshofer A, Phase lag index: assessment of functional connectivity from multi channel EEG and MEG with diminished bias from common sources. Human brain mapping, 2007. 28(11): p. 1178–1193.
  • 23. Strotzer M., One century of brain mapping using Brodmann areas. Clinical Neuroradiology, 2009. 19(3): p. 179–186.
  • 24. Schelldorfer J, Meier L, and Bühlmann P, Glmmlasso: an algorithm for high-dimensional generalized linear mixed models using ℓ1-penalization. Journal of Computational and Graphical Statistics, 2014. 23(2): p. 460–477.
  • 25. Catalan MJ, et al., The functional neuroanatomy of simple and complex sequential finger movements: a PET study. Brain: a journal of neurology, 1998. 121(2): p. 253–264.
  • 26. Bischoff-Grethe A., et al., Neural substrates of response-based sequence learning using fMRI. Journal of cognitive neuroscience, 2004. 16(1): p. 127–138.
  • 27. Brunia C., et al., Visual feedback about time estimation is related to a right hemisphere activation measured by PET. Experimental Brain Research, 2000. 130: p. 328–337.
  • 28. Rämä P., et al., Working memory of identification of emotional vocal expressions: an fMRI study. Neuroimage, 2001. 13(6): p. 1090–1101.
  • 29. Babiloni C., et al., Human cortical responses during one-bit delayed-response tasks: an fMRI study. Brain research bulletin, 2005. 65(5): p. 383–390.
  • 30. Sun X., et al., Age-dependent brain activation during forward and backward digit recall revealed by fMRI. Neuroimage, 2005. 26(1): p. 36–47.
  • 31. Bernal B and Altman N, Neural networks of motor and cognitive inhibition are dissociated between brain hemispheres: an fMRI study. International Journal of Neuroscience, 2009. 119(10): p. 1848–1880.
  • 32. Slotnick SD and Schacter DL, The nature of memory related activity in early visual areas. Neuropsychologia, 2006. 44(14): p. 2874–2886.
  • 33. Köhler S and Kapur S, Dissociation of pathways for object and spatial vision: a PET study in. Neuroreport, 1995. 6: p. 1865–1868.
  • 34. Cheng K., et al., Human cortical regions activated by wide-field visual motion: an H2 (15) O PET study. Journal of Neurophysiology, 1995. 74(1): p. 413–427.
  • 35. Bernard R., et al., Cortical activation during rhythmic hand movements performed under three types of control: an fMRI study. Cognitive, Affective, & Behavioral Neuroscience, 2002. 2: p. 271–281.
  • 36. Caplan JB, et al., Parallel networks operating across attentional deployment and motion processing: a multi-seed partial least squares fMRI study. Neuroimage, 2006. 29(4): p. 1192–1202.
  • 37. Yoo S-S, Paralkar G, and Panych LP, Neural substrates associated with the concurrent performance of dual working memory tasks. International Journal of Neuroscience, 2004. 114(6): p. 613–631.
  • 38. Wang Z and Majewicz Fey A, Deep learning with convolutional neural network for objective skill evaluation in robot-assisted surgery. International journal of computer assisted radiology and surgery, 2018. 13: p. 1959–1970.
  • 39. Kamat A., et al. Open access fNIRS dataset of surgical skill execution in the fundamental of laparoscopic program. in Clinical and Translational Neurophotonics 2023. 2023. SPIE.
  • 40. Hata M., et al., Functional connectivity assessed by resting state EEG correlates with cognitive decline of Alzheimer’s disease–An eLORETA study. Clinical Neurophysiology, 2016. 127(2): p. 1269–1278.
  • 41. Smailovic U., et al., Regional disconnection in Alzheimer dementia and amyloid-positive mild cognitive impairment: association between EEG functional connectivity and brain glucose metabolism. Brain connectivity, 2020. 10(10): p. 555–565.
  • 42. Engels MM, et al., Declining functional connectivity and changing hub locations in Alzheimer’s disease: an EEG study. BMC neurology, 2015. 15: p. 1–8.
  • 43. Vecchio F., et al., Human brain networks in cognitive decline: a graph theoretical analysis of cortical connectivity from EEG data. Journal of Alzheimer's Disease, 2014. 41(1): p. 113–127.
