Dynamic velocity scaling for industrial collaborative robots: a gaze-driven approach

Matteo Manzardo; Federico Fraboni; Sofia Morandini; Luca Gualtieri; Renato Vidoni; Luca Pietrantoni

doi:10.1038/s41598-025-34142-9

. 2026 Mar 9;16:8471. doi: 10.1038/s41598-025-34142-9

PMCID: PMC12972166 PMID: 41803171

Abstract

Implementing effective human-robot seamless interaction (HRSI) is a critical challenge for the industrial robotics community: it requires capturing all human mental processes and turning them into suitable robot actions. This work presents a method to adjust the robot’s behavior in real time considering both the human’s low-frequency cognitive processes (i.e., cognitive workload) and high-frequency cognitive processes (i.e., visual attention), the latter being generally overlooked in the literature, to optimize safety, productivity and ergonomics. Regarding high-frequency processes, the system monitors the operator’s gaze: the closer operators look to the robot, the more attentive they are, while attention drops when the robot leaves their field of view, posing potential risks. The speed of the manipulator is then dynamically modulated based on the operator’s visual attention. Regarding low-frequency processes, the robot’s trajectory is adjusted to optimize the operator’s cognitive workload. An experimental validation involving 26 participants led to three key findings: the developed algorithm improved productivity (18% improvement), cognitive workload (5%), fluency (10%), usability (5%), reliability (5%), and acceptance of the system (9%); the exploitation of high-frequency cognitive processes leads to a significant improvement in all mentioned metrics, suggesting its relevance for future research in the field of HRSI; and increasing the level of adaptability of the robotic system positively affects the effectiveness of the collaboration.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-34142-9.

Keywords: Human-robot seamless interaction, Collaborative robots, Robot autonomy, Robot adaptation

Subject terms: Mechanical engineering, Electrical and electronic engineering, Risk factors

Introduction

Industrial Human-Robot Seamless Interaction (HRSI) is considered one of the most significant research challenges for the robotics community^1,2 and a core objective of Industry 5.0^3,4. This is motivated both by the economic and social impact associated with its effective implementation⁵. Indeed, a successful human-machine collaboration has demonstrated substantial potential for improving productivity⁶, worker’s safety⁷ and both cognitive⁸ and physical ergonomics⁹.

Similarly to human teams cooperating with each other, an effective HRSI can be achieved if (i) the robot can understand the human co-worker’s psycho-physical status and actions and (ii) adapt its behavior accordingly, i.e. autonomously modifying its trajectory or tasks, (iii) without interfering, degrading, or limiting the human operations^4,10.

Thus, it is essential to comprehensively capture the operator’s cognitive processes to implement human-centric adaptability. Cognitive processes can occur at high or low frequency, depending on how the human’s responses vary over time. Specifically, high-frequency processes involve near real-time processing (e.g., the visual attention towards an object or situation). In contrast, low-frequency processes are characterized by slower response mechanisms (e.g., those impacting cognitive workload). Considering the interplay between high-frequency and low-frequency cognitive processes, it is possible to draw a parallel with the well-established Dual Process Theory in cognitive psychology¹¹. Dual Process Theory posits two systems of cognition: System 1, which is fast, automatic, and reliant on immediate sensory input, and System 2, which is slower, deliberate, and effort-intensive. Similarly, high-frequency processes, such as gaze-driven visual attention, operate in real time, aligning closely with the rapid, instinctual nature of System 1. On the other hand, low-frequency processes closely resemble System 2’s thoughtful and reflective engagement. Notably, both views emphasize the dynamic interaction between these layers of cognition, where rapid, high-frequency inputs feed into and inform slower, low-frequency adjustments, mirroring the interplay between Systems 1 and 2. By integrating these concepts, the present work extends the dual-process framework into the domain of human-robot interaction, offering a unified perspective on optimizing collaborative tasks. Overlooking either of these aspects would result in an incomplete understanding of the human cognitive state, preventing the effective implementation of HRSI. By considering both high- and low-frequency cognitive processes, the robotic behavior can be adapted to enhance overall operational effectiveness and productivity while also ensuring the safety and well-being of the operator.
The effectiveness of an adaptive interaction, however, might vanish if the system introduces new potential risk factors in terms of safety or ergonomics, or proves impractical to deploy¹².

In the quest to solve this challenge, several different solutions have been studied and implemented to enable minimally invasive human-to-machine as well as machine-to-human communication, ranging from haptic devices^13–15 to physiological signals^16,17 and behavioral measurements^18–20. While an effective human-machine communication is a critical aspect in HRSI, if it is not integrated in a wider framework considering also robot’s behavior adaptation, it could not fully address the challenge.

However, among the studies considering the robot’s real-time adaptation, to the best of the authors’ knowledge, there is no work that simultaneously improves safety, ergonomics and productivity while considering both high- and low-frequency cognitive processes. Because evaluating the human’s awareness online is difficult, a widely adopted strategy to improve safety is to enhance human awareness, rather than to adjust the robot’s behavior according to the human’s state, through haptic devices^21,22, extended reality²³, verbal communication^24,25 and visual signals²⁶. However, those approaches target the reduction of the mechanical risk without acting on the robot’s behavior. Considerable effort has also been devoted to developing solutions to evaluate online^27–29 and, subsequently, improve both the physical and the cognitive ergonomics of a collaborative operation. For instance, informing the user of possible unhealthy positions is a strategy to reduce physical fatigue^30,31. Another approach is to adapt the robot’s behavior so as to encourage the operator to work under proper conditions^32–34. To reduce mental fatigue online, instead, it has been proposed to adapt in real time the robot’s behavior³⁵, the degree of autonomy³⁶, or the level of assistance provided to human operators³⁷. Moreover, to improve the trade-off between cognitive workload and productivity, the robot’s trajectory has been adapted online, exploiting ideas based on Game Theory³⁸ and the Pareto front³⁹. However, all those approaches consider only low-frequency cognitive processes. In⁴⁰, human real-time cognitive processes have been considered with the aim of enhancing safety: a multimodal approach has been developed to determine whether a collision is intentional or not, in order to activate safety measures in possibly unsafe conditions, thus optimizing the behavior of the robot with respect to mechanical risk.
While this work considered high-frequency cognitive processes, it mainly focused on safety aspects.

Aims and justification

The aim of this work is to develop and assess a methodology for robot’s behavior adaptation taking into account both high- and low-frequency cognitive processes to implement HRSI considering productivity, safety, and well-being requirements at the same time. Such a method integrates both the adaptation of the robot’s behavior based on the operator’s visual attention (high-frequency cognitive process) and cognitive workload (influenced by low-frequency cognitive process). In that regard, three specific aspects of the human visual systems are considered: human’s field of view (FOV), gaze behavior (where and how humans are looking at objects) and pupil dilation. The former two are exploited to estimate the operator’s visual attention, while the latter is taken into account to evaluate the cognitive workload.

Adapting robot behavior according to visual attention.

The human FOV plays a significant role in determining how individuals perceive and interact with their environment⁴¹. It defines the part of space that can be sensed through vision. This, combined with the information provided by gaze behavior (i.e., the way by which eyes are moving and looking around), can be exploited to evaluate the operator’s visual attention towards the robot. Visual attention can be understood as a collection of cognitive and physiological mechanisms that regulate visual information. These processes are essential for integrating visual data into coherent objects, which can then be identified, recognized, and remembered. Humans rely on visual attention to facilitate object recognition. Since object recognition mechanisms can’t simultaneously process every object in the field, attention is crucial for selecting and delivering manageable subsets of input for recognition⁴². Visual attention refers to the cognitive processes that enable humans to efficiently manage information overload by selecting what is relevant and filtering out what is not⁴³. Inadequate selective attention, mental errors, and susceptibility to distractions are recognized as key human factors that contribute to accidents when hazards go unnoticed^44,45. In industrial HRI, preventing (mechanical) hazards from unexpected contacts is crucial for ensuring the operator’s safety. The likelihood and severity of such contacts are closely related to the robot’s speed within the collaborative workspace. Consequently, many designers opt to reduce the robot’s speed, which, while enhancing safety, often compromises robotic performance and overall productivity. In this work, we aim to enhance both safety, productivity and operator’s wellbeing by exploiting worker’s visual attention toward the robot. 
Visual attention is considered a high-frequency cognitive process since it involves rapid, moment-to-moment adjustments in focus and processing, often requiring quick shifts between different stimuli in the environment. Indeed, this rapid processing aligns with the concept of a high-frequency cognitive process, as attention needs to be continuously regulated to manage visual information effectively.

Starting from the key idea that the operator’s visual attention is directly related to what he/she is looking at (i.e., whether the robot is or is not in his/her FOV), as well as how the focused object is moving (i.e. visual attention decreases as the robot moves away from the gaze point), the following main concept has been implemented in the proposed robot’s adaptive behavior algorithm: the speed of the manipulator is dynamically scaled according to the operator’s visual attention towards the robot, thus influencing safety, productivity and operator’s well-being (Fig. 1, right and Video SV1). Considering the principles of visual attention, it is expected that when the operator is observing the robot, he/she has a higher capability of recognizing potential contact and thus preventing mechanical hazards. In this case, we suppose that the manipulator can operate at a higher speed while maintaining proper safety conditions given that the probability of a dangerous contact is supposed to be lower. At the same time, productivity is higher since it is correlated with robot speed in a collaborative cycle. Visual attention, then, decreases as the robot moves away from the gaze point (where the human is looking) until it reaches its minimum value as the manipulator exits the operator’s FOV. In that case, the robot’s speed is reduced accordingly to a safe value.
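The scaling concept described above can be illustrated with a minimal sketch. It is not the authors’ algorithm: the linear attention model, the FOV half-angle of 60 degrees, the speed bounds `v_safe` and `v_max`, and the function names are all assumptions chosen only to show how an attention score derived from the gaze direction could modulate the manipulator speed.

```python
import math

def visual_attention(gaze_dir, robot_dir, fov_half_angle_deg=60.0):
    """Attention score in [0, 1] (hypothetical linear model).

    gaze_dir:  3D vector of the gaze direction.
    robot_dir: 3D vector from the eyes to the robot.
    Returns 1.0 when the operator looks straight at the robot and 0.0
    when the robot lies outside the assumed field of view.
    """
    dot = sum(g * r for g, r in zip(gaze_dir, robot_dir))
    norm = math.sqrt(sum(g * g for g in gaze_dir)) * math.sqrt(sum(r * r for r in robot_dir))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # guard against rounding errors
    angle_deg = math.degrees(math.acos(cos_angle))
    if angle_deg >= fov_half_angle_deg:
        return 0.0  # robot outside the FOV: minimum attention
    return 1.0 - angle_deg / fov_half_angle_deg

def scaled_speed(attention, v_safe=0.25, v_max=1.0):
    """Interpolate between an assumed safe speed and an assumed maximum speed (m/s)."""
    attention = max(0.0, min(1.0, attention))
    return v_safe + attention * (v_max - v_safe)

if __name__ == "__main__":
    looking_at_robot = visual_attention((0, 0, 1), (0, 0, 1))
    robot_out_of_view = visual_attention((1, 0, 0), (0, 0, 1))
    print(scaled_speed(looking_at_robot), scaled_speed(robot_out_of_view))
```

In a real cell the speed bounds would have to come from the risk assessment and the applicable safety limits rather than from the illustrative values used here.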

Fig. 1 — representation of the experimental setup (center) and of the adaptation with respect to low- and high-frequency cognitive processes, respectively, on the left and on the right of the image. On the right, the gaze point is represented by the red, yellow and green circles.

Reducing the frequency of both hazardous and non-hazardous contacts also impacts interaction ergonomics. From a physical perspective, and in line with the requirements of ISO TS 15066⁴⁶, ergonomic limits may differ from biomechanical limits. For frequent contacts, even those compliant with safety standards, applicable threshold values can be further lowered to achieve an ergonomically acceptable level. From a cognitive perspective, factors such as trust in the robotic system and the fluency of collaboration can improve when interactions minimize nonfunctional and unintended contacts.

Adapting robot behavior according to cognitive workload.

Cognitive workload refers to the effort that an individual shows during a task to achieve a particular level of performance: the more effort a task requires, the higher the cognitive workload⁴⁷. It affects safety, performance and well-being of the operator⁴⁸. Pupil dilation is a widely used method to assess cognitive workload. This is evidenced by various studies that have observed correlations between task difficulty and pupil dilation⁴⁹. In particular, increased cognitive workload is associated with greater pupil dilation⁵⁰. Eye tracking studies have also demonstrated that higher task difficulty evokes higher peak pupil dilation and longer peak duration, suggesting that pupil diameter can be used as a physiological indicator of task workload, especially in visual-motor tasks⁵¹. In this work, it is assumed that a robotic system can be used to modulate the operator’s cognitive workload in such a way as to always keep it at the optimal level by changing the speed and, therefore, the pace of collaborative tasks. When a machine and a human operator cooperate, they exhibit two different peculiarities: while the machine is generally characterized by a fixed and constant cycle time, the operator’s cycle time, driven by both cognitive and physiological aspects, changes continuously. In this regard, saturation (how much of the available time is occupied by work tasks) strongly affects workers cognitive workload in manufacturing⁵². Moreover, it has been shown that the robot’s speed can also influence the operator’s cognitive workload, even considering velocities compliant with ISO TS 15066 requirements⁵³. Cognitive workload is considered a low-frequency process since it reflects the overall mental effort over a longer period or across tasks. The buildup and management of cognitive workload is affected not only by moment-to-moment adjustments, but it mostly occurs over the course of a task.

According to the measured operator’s pupil dilation, the proposed algorithm adjusts the task execution speed to optimize the worker’s cognitive workload. It modifies the robot’s speed to maintain an appropriate pace while avoiding the introduction of new cognitive-related risk factors (Fig. 1, left and Video SV1). Such adaptability is dynamic and human-centric, i.e., capable of accounting for variations of the optimal cognitive workload level across operators’ individual features and working conditions, potentially varying from task to task.
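As a rough illustration of this low-frequency adaptation, the sketch below averages pupil diameter over a window and nudges a pace factor towards a target workload level. Everything specific in it is an assumption and not taken from the paper: the ratio-to-baseline workload index, the target value, the proportional gain, the pace limits, and the premise that a faster pace raises workload.

```python
from statistics import fmean

def workload_index(pupil_samples_mm, baseline_mm):
    """Hypothetical workload index: mean pupil diameter over a window,
    divided by the operator's baseline diameter (1.0 means baseline)."""
    return fmean(pupil_samples_mm) / baseline_mm

def update_pace(pace, index, target_index=1.10, gain=0.5,
                pace_min=0.5, pace_max=1.0):
    """One proportional update of the pace factor (assumed control law).

    A workload index above the target slows the pace down, an index
    below the target speeds it up; the result is clamped to the
    assumed admissible range [pace_min, pace_max].
    """
    error = target_index - index
    return max(pace_min, min(pace_max, pace + gain * error))

if __name__ == "__main__":
    idx = workload_index([4.6, 4.8, 5.0], baseline_mm=4.0)  # index of about 1.2
    print(update_pace(0.8, idx))  # the pace is reduced
```

Averaging over a window before acting reflects the slow dynamics attributed to cognitive workload, in contrast with the moment-to-moment gaze-driven scaling.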

Combining the key adaptation actions, both high- and low-frequency cognitive processes are taken into account to adapt online the robot’s behavior to improve safety, productivity and ergonomics.

Experimental scenario

To implement and validate the proposed solution in collaborative assembly tasks, robot motion scaling methods are driven in real-time by the human’s visual attention (measured through gaze behavior) and cognitive workload (retrieved as an indirect measure from pupil dilation).

The experimental activity, which will be detailed in Section “Materials and Methods”, consisted of a within-subject experiment with 26 participants who collaborated with an industrial robot to assemble a wooden box. In particular, once they underwent a training procedure to learn how to assemble the product, they were asked to mount the box in three different scenarios provided in a randomized order. As it will be explained in the Methods section, scenario randomization has been introduced to prevent influences on results such as the learning effect, increased familiarity with the task, and fatigue. In the first scenario (S1), the robot moves along its standard reference trajectory without adapting its behavior according to human mental processes. In the second (S2), the behavior of the robot (i.e., its trajectory) is adjusted considering only human’s visual attention (high-frequency processes), while, in the third scenario (S3), the adaptive behavior is implemented considering both visual attention as well as cognitive workload (both low- and high-frequency cognitive processes). The outcomes of the three scenarios are evaluated and compared employing qualitative and quantitative results in terms of production performance, safety, and ergonomics. In particular, the following indicators have been selected: productivity- and quality- related human performance (cycle time and errors), cognitive workload, robot’s reliability, HRI fluency, system usability and acceptance. Such an experiment also had the goal of evaluating the effect of the optimization considering the joint action of high- and low-frequency cognitive processes.

Results summary

Results show that, as the level of adaptivity grows (from S1 to S3), productivity, fluency of the operation, usability, reliability, and acceptance of the robotic system improve, thus proving the effectiveness of the developed algorithm. In particular, a statistically significant increase in productivity has been found shifting from S1 (no adaptability) to S3 (higher adaptability), without compromising the quality of the assembly operation: the number of errors, even though it decreased from S1 and S2 to S3, showed no statistically significant dependence on the scenario and, consequently, on the level of adaptability. In other words, the developed algorithm significantly improves productivity without compromising assembly quality and, as the level of adaptability grows, productivity increases without any worsening in the number of production errors. Cognitive workload showed a statistically significant decreasing trend as the level of adaptivity increased from S1 to S3. This result also confirmed the effectiveness of the control algorithm deployed in S3, which proved to be capable of driving the cognitive workload towards the desired level. Fluency and usability showed a trend similar to that of productivity: they significantly improved from S1 to S3, meaning that the collaboration quality perceived by the user is positively impacted both by the developed method and by the level of adaptability of the robot. Also, the reliability of the robotic system perceived by the user and the acceptance of the robot showed a similar trend, significant and marginally significant, respectively: they improved from S1 to S3, i.e. as adaptivity grows.
This confirmed that the developed control algorithms can improve the operator’s working conditions from a cognitive perspective and that this can be achieved through a growing adaptability of the system.

Such results can be summarized in three key findings: (first key finding) the developed algorithms led to improved productivity, operator’s cognitive workload, fluency of the collaborative operation, usability, reliability, and acceptance of the system without affecting the quality of the production process; (second key finding) the exploitation of high-frequency cognitive processes, which are almost overlooked in the literature, leads to a significant and consistent improvement in all the above-mentioned metrics, thus suggesting its relevance for future research in the field of HRSI; (third key finding) an increasing level of adaptability of the robotic system positively affects all the metrics considered, without compromising the quality of the manufacturing process.

Results

This section reports the metrics and the analysis of the effects of the developed control algorithms and, consequently, of the level of robot’s adaptability, on productivity (cycle time and errors), cognitive workload, robot’s reliability, HRI fluency, system’s usability, and acceptance. These were evaluated by considering all the scenarios (S1, S2, and S3). Results have been summarized in Table 1 and Fig. 2.

Table 1. Summary of the results of the experimental campaign (M: mean; SD: standard deviation).

Measured Quantity	Indicator	Optimal Results	S1 M	S1 SD	S2 M	S2 SD	S3 M	S3 SD	p-value
Productivity	Cycle Time [s]	Lower is better	142.66	11.46	120.86	16.77	117.08	13.23	
Productivity	Assembly Errors	Lower is better	0.846	1.262	0.885	1.154	0.500	0.635	0.248
Cognitive Workload	Stochastic Gaze Entropy (SGE)	Lower is better	1.065	0.056	1.017	0.063	1.008	0.048	
Fluency	Paliga and Pollak⁵⁹	Higher is better	5.63	1.27	6.19	0.69	6.22	0.67	0.02
Usability	Lewis and Sauro⁶⁰	Higher is better	4.27	0.52	4.43	0.44	4.50	0.40	0.01
Reliability	Stochastic Gaze Entropy (SGE)	Lower is better	1.065	0.056	1.017	0.063	1.008	0.048	
Acceptance	Vanderlaan⁶²	Higher is better	3.85	0.99	4.18	0.64	4.20	0.64	0.06

Fig. 2 — Plots summarizing the results of the experiments. In particular, the error bar represents the average value and one standard deviation.

Productivity

To measure productivity-related human performance, two indicators have been evaluated: the average cycle time and the number of assembly errors. These have been widely demonstrated to be relevant indicators to quantify productivity performance.

The lower the cycle time, the higher the number of products assembled. Descriptive statistics show that the mean cycle time decreases from S1 (mean M=142.66 s, standard deviation SD=11.46 s) to S2 (M=120.86 s, SD=16.77 s, −15.28% w.r.t. S1) and S3 (M=117.08 s, SD=13.23 s, −17.93% w.r.t. S1). To evaluate the statistical significance of the scenario and, consequently, of the control strategy, an Analysis of Variance (ANOVA) has been performed, leading to the following results: Sum of Squares SS=8507.90, F-value F=38.77, with a statistically significant p-value. Considering that productivity is inversely proportional to cycle time, this confirms a systematic increase in productivity across the progressive scenario changes.

The number of human errors, i.e. the total number of wrongly executed assembly steps, is measured to evaluate the influence of the robot’s adaptivity and, consequently, of the proposed control logic, on the effectiveness of the collaborative operation. The number of errors committed by an operator is heavily influenced by the operator’s cognitive workload⁵⁰, which, in turn, varies depending on the method used to control the robot. Furthermore, the smaller the number of errors is, the better the performance of the human-robot dyad⁵⁴.

The descriptive statistics show that the mean number of errors made by the participants in each scenario decreased from S1 (M = 0.846, SD = 1.262) to S3 (M = 0.500, SD = 0.635, −40.9% w.r.t. S1), with S2 (M = 0.885, SD = 1.154, +4.61% w.r.t. S1) having a slightly higher mean than S1. An ANOVA was performed to evaluate the statistical significance of these data, and the results were SS=1.558, F=1.356, p=0.248. These results indicate that there is no statistically significant trend.
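The percentage changes quoted for cycle time and assembly errors follow directly from the scenario means in Table 1. The short check below recomputes them; the helper name `relative_change_pct` is only illustrative.

```python
def relative_change_pct(reference, value):
    """Percentage change of `value` with respect to `reference`."""
    return (value - reference) / reference * 100.0

# Scenario means taken from Table 1.
cycle_time_s = {"S1": 142.66, "S2": 120.86, "S3": 117.08}
assembly_errors = {"S1": 0.846, "S2": 0.885, "S3": 0.500}

if __name__ == "__main__":
    for label, means in (("cycle time", cycle_time_s), ("errors", assembly_errors)):
        for scenario in ("S2", "S3"):
            change = relative_change_pct(means["S1"], means[scenario])
            print(f"{label}, {scenario} vs S1: {change:+.2f}%")
```

A negative value means a reduction with respect to S1, which is the desirable direction for both indicators.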

Cognitive workload

Cognitive workload has been evaluated through Stochastic Gaze Entropy (SGE), a metric that quantitatively describes gaze behavior and, in particular, gaze dispersion^55,56. Multiple studies have found a correlation between SGE and cognitive demand: as cognitive workload increases, SGE also increases^57,58. SGE has been computed using the Shannon equation⁵⁵:

$$\mathrm{SGE} = -\sum_{i=1}^{n} p_i \log_2 p_i$$

where $n$ is the number of areas of interest (AOI) within the FOV of the operator and $p_i$ is the probability that the gaze is observing the $i$-th AOI. The numeric values lie in the range $[0, \log_2 n]$, where 0 is associated with no gaze dispersion, i.e. the operator always observes the same AOI, and $\log_2 n$ indicates the maximum possible gaze dispersion, indicating randomness and unpredictability in the gaze behavior. In this study, SGE was normalized considering its baseline value (further details will be given in the “Materials and Methods” Section). In this experiment, 100 different possible AOIs have been defined to compute such a metric.
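For illustration, Shannon entropy over AOI visit probabilities can be computed as in the sketch below. Several choices are assumptions rather than details from the paper: probabilities are estimated as visit frequencies from a sequence of AOI labels, the logarithm is taken in base 2, and normalization is done as a simple ratio to a baseline entropy.

```python
import math
from collections import Counter

def gaze_entropy(aoi_sequence):
    """Shannon entropy (in bits) of the AOI visit distribution.

    aoi_sequence: iterable of AOI labels, one per gaze sample.
    Probabilities are estimated as relative visit frequencies.
    """
    labels = list(aoi_sequence)
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def normalized_entropy(aoi_sequence, baseline_entropy):
    """Entropy relative to a baseline recording (assumed ratio normalization)."""
    return gaze_entropy(aoi_sequence) / baseline_entropy

if __name__ == "__main__":
    focused = [7] * 50            # always the same AOI: no dispersion
    scattered = list(range(100))  # every one of 100 AOIs visited once
    print(gaze_entropy(focused), gaze_entropy(scattered))
```

With 100 AOIs the largest value this function can return is the base-2 logarithm of 100, reached when all AOIs are visited equally often.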

Regarding the experimental results, the descriptive statistics show a decreasing value of SGE, thus indicating a decreasing level of cognitive workload shifting from S1 (M=1.065, SD=0.056) to S2 (M=1.017, SD=0.063, −4.51% w.r.t. S1) and S3 (M=1.008, SD=0.048, −5.35% w.r.t. S1). The ANOVA test (SS=0.042, F=12.697) confirmed the statistical significance of such a trend.

Fluency in human-robot interaction

Fluency in HRI has been evaluated through the self-reported “Fluency in Human-Robot Interaction Scale,” developed by Paliga and Pollak⁵⁹, which evaluates the smoothness and efficiency of interactions between humans and robots. Fluency in this context refers to how seamlessly human operators and robots work together as a team. The scale assesses this fluency considering three aspects: the human operator’s perspective, the robot’s performance, and the overall teamwork. It includes items that measure trust in the robot’s decision-making, the robot’s commitment to the team’s success, the effectiveness of the robot as a team member, and the overall harmony of the human-robot team. This comprehensive approach provides valuable insights into the dynamics of human-robot collaboration.

The respondents are instructed: “The following statements concern your work with the robot. Please read them carefully and choose how much you agree with each statement. Use the scale provided below.” The response format has a seven-point Likert scale (1 = I strongly disagree, 2 = I disagree, 3 = I somewhat disagree, 4 = I neither agree nor disagree, 5 = I somewhat agree, 6 = I agree, 7 = I strongly agree). Furthermore, the scale includes items such as “I trusted the robot to do the right thing at the right time” for assessing the human’s trust in the robot, “The robot performed well as part of the team” for evaluating the robot’s contribution, and “The human-robot team did well on the task” for gauging the overall effectiveness of the team. Each item is designed to reflect different aspects of fluency in the collaborative process.

To evaluate such data, a repeated measures analysis using a General Linear Model (GLM) has been performed to examine the differences in fluency scores across three scenarios. The mean scores increased from S1 (M = 5.63, SD = 1.27) to S2 (M = 6.19, SD = 0.69, +9.95% w.r.t. S1) and S3 (M = 6.22, SD = 0.77, +10.48% w.r.t. S1). The multivariate tests showed a marginally significant effect of the scenarios on fluency scores (Pillai’s Trace = 0.22, F(2, 24) = 3.37, p = 0.05). However, Mauchly’s test of sphericity was significant (p < 0.001), indicating a violation of the sphericity assumption. After applying the Greenhouse-Geisser and Huynh-Feldt corrections, the tests of within-subjects effects revealed a significant effect of scenarios on fluency scores (Greenhouse-Geisser: F(1.29, 32.27) = 5.99, p = 0.01; Huynh-Feldt: F(1.33, 33.28) = 5.99, p = 0.01).

The tests of within-subjects contrasts showed significant linear (F(1, 25) = 6.31, p = 0.02) and quadratic (F(1, 25) = 5.08, p = 0.03) trends across the scenarios. The estimated marginal means increased from S1 (M = 5.64, 95% CI [5.12, 6.15]) to S2 (M = 6.19, 95% CI [5.92, 6.47]) and S3 (M = 6.22, 95% CI [5.91, 6.53]), indicating a significant effect of scenarios on fluency scores, with scores increasing from S1 to S2 and S3, following both linear and quadratic trends.

Usability of the robotic system

In the evaluation of the robot system’s usability, the research utilized the System Usability Scale (SUS), a widely recognized standardized questionnaire for assessing perceived usability. Developed by Brooke⁶⁰, the SUS is prevalent in industrial usability studies. The SUS consists of 10 items, each rated on a five-point Likert scale ranging from 1 (Strongly Disagree) to 5 (Strongly Agree), with statements alternating between positive and negative implications. Sample items from the SUS include assessments of system usage frequency, complexity, and ease of use (’I think that I would like to use this system frequently’, ’I found the system unnecessarily complex’, ’I thought the system was easy to use’). Participants completed a questionnaire at the end of each scenario for SUS evaluation.

In S1, participants generally agreed or strongly agreed that they would like to use the robot frequently, found it easy to use, and felt confident using the system. They also disagreed that the robot was unnecessarily complex, inappropriate for the task, or required learning many things before use. In S2, SUS scores improved, with higher percentages strongly agreeing that they would like to use the robot frequently, found it easy to use, and felt confident using the system. They also strongly disagreed that the robot was unnecessarily complex, needed technical support, was inappropriate for the task, or required learning many things before use. S3 showed similar trends, with even higher percentages strongly agreeing across most dimensions, indicating that participants perceived the robot as user-friendly, easy to learn, and well-suited for the task, with a high level of confidence in using the system.

Regarding the quantitative analysis of the experimental results, a repeated measures analysis using a General Linear Model (GLM) has been performed to investigate the differences in SUS scores across three scenarios. The mean SUS scores increased from S1 (M = 4.27, SD = 0.52) to S2 (M = 4.43, SD = 0.44, +3.75% w.r.t. S1) and S3 (M = 4.50, SD = 0.40, +5.39% w.r.t. S1). The multivariate tests indicated a significant effect of the scenario on SUS scores (Pillai’s Trace = 0.24, F(2, 24) = 3.75, p = 0.04). Mauchly’s test of sphericity was not significant (p = 0.15), suggesting that the assumption of sphericity was not violated. The tests of within-subjects effects, assuming sphericity, revealed a significant effect of scenarios on SUS scores (F(2, 50) = 5.17, p = 0.01). The tests of within-subjects contrasts showed a significant linear trend (F(1, 25) = 7.80, p = 0.01) but no significant quadratic trend (F(1, 25) = 0.61, p = 0.44) across the scenarios. The estimated marginal means increased from S1 (M = 4.27, 95% CI [4.06, 4.48]) to S2 (M = 4.43, 95% CI [4.25, 4.61]) and S3 (M = 4.50, 95% CI [4.34, 4.66]), suggesting a significant effect of scenarios on SUS scores, with scores increasing linearly from S1 to S2 and S3.
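The reported 95% confidence intervals can be cross-checked against the means and standard deviations. The sketch below assumes that each interval is a standard t-based interval computed per scenario with n = 26 participants; this is an assumption about the procedure, not something stated in the text. The critical value of Student’s t distribution with 25 degrees of freedom is hard-coded to keep the example dependency-free.

```python
import math

# Two-sided 95% critical value of Student's t with 25 degrees of freedom.
T_CRIT_DF25 = 2.0595

def ci95(mean, sd, n=26, t_crit=T_CRIT_DF25):
    """t-based 95% confidence interval for a mean (assumed procedure)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# SUS means and standard deviations per scenario, as reported above.
sus = {"S1": (4.27, 0.52), "S2": (4.43, 0.44), "S3": (4.50, 0.40)}

if __name__ == "__main__":
    for scenario, (m, sd) in sus.items():
        low, high = ci95(m, sd)
        print(f"{scenario}: [{low:.2f}, {high:.2f}]")
```

Under these assumptions the computed intervals agree, to two decimals, with the ones listed for the estimated marginal means of the SUS scores.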

Reliability of the robotic system

Another relevant peculiarity of SGE is its correlation with the reliability of the robot perceived by the human: as the robot’s actions are more reliable, SGE decreases⁶¹.

Analogously to what has been described for the analysis of the cognitive workload, the descriptive statistics show a decreasing value of SGE between the scenarios. This indicates that the perceived reliability of the robot increases as the level of adaptability increases, changing from S1 (M=1.065, SD=0.056) to S2 (M=1.017, SD=0.063, −4.51% w.r.t. S1) and S3 (M=1.008, SD=0.048, −5.35% w.r.t. S1). The ANOVA test (SS=0.042, F=12.697) revealed that this trend is statistically significant.

Acceptance of the robotic system

The acceptance scores⁶² were measured using a 5-point semantic differential scale, where participants rated their perceptions of the scenarios on various dimensions. Each dimension was presented as a pair of opposite adjectives, such as “Useful:Useless,” “Pleasant:Unpleasant,” “Bad:Good,” “Pleasant:Annoying,” “Effective:Superfluous,” “Irritating:Likable,” “Helpful:Worthless,” “Undesirable:Desirable,” and “Increases alertness:Causes drowsiness.” For each pair of adjectives, participants selected a value from 1 to 5, with 1 representing the most negative perception (e.g., useless, unpleasant, bad) and 5 representing the most positive perception (e.g., useful, pleasant, good). In some cases, the scales were reversed to ensure consistency in the direction of the scores, with 1 representing the most positive perception and 5 representing the most negative perception. Using this semantic differential scale, it is possible to quantify participants’ perceptions and attitudes toward the scenarios across multiple dimensions, providing a comprehensive measure of acceptance.

A repeated measures analysis using a General Linear Model (GLM) has been performed to examine the differences in acceptance scores across three scenarios. The mean acceptance scores increased from S1 (M = 3.85, SD = 0.99) to S2 (M = 4.18, SD = 0.64, +8.57% w.r.t. S1 ) and S3 (M = 4.20, SD = 0.64, +9.09 % w.r.t. S1 ). The multivariate tests did not indicate a significant effect of the scenario on acceptance scores (Pillai’s Trace = 0.14, F(2, 24) = 1.93, p = 0.17). However, Mauchly’s test of sphericity was significant (p < 0.001), suggesting that the assumption of sphericity was violated. After applying the Greenhouse-Geisser and Huynh-Feldt corrections, the tests of within-subjects effects showed a marginally significant effect of scenario on acceptance scores (Greenhouse-Geisser: F(1.11, 27.79) = 3.77, p = 0.06; Huynh-Feldt: F(1.13, 28.15) = 3.77, p = 0.06).

The tests of within-subjects contrasts revealed a marginally significant linear trend (F(1, 25) = 4.01, p = 0.06) but no significant quadratic trend (F(1, 25) = 3.08, p = 0.09) across the scenarios. The estimated marginal means increased from S1 (M = 3.85, 95% CI [3.45, 4.25]) to S2 (M = 4.18, 95% CI [3.93, 4.44]) and S3 (M = 4.20, 95 % CI [3.94, 4.46]), suggesting a marginally significant effect of scenario on acceptance scores, with scores increasing linearly from S1 to S2 and S3.

Discussion

This work investigated the evolution of HRSI across the three scenarios, focusing on the following indicators: productivity-related human performance (cycle time and errors), cognitive workload, HRI fluency, system’s usability, reliability and acceptance. The results provide valuable insights into the dynamics of human-robot collaboration and the potential for improvement over time. In the following, all the above-mentioned indicators are discussed considering the analyzed results. Finally, the future perspective of deploying this technological approach in industrial settings is discussed.