Abstract
We present a preliminary quantitative study aimed at developing an optimal standard protocol for automatic classification of specific affective states in human-computer interaction. This goal is pursued mainly by comparing standard psychological test reports with quantitative measures derived from the simultaneous non-invasive acquisition of psychophysiological signals of interest, namely respiration, galvanic skin response, blood volume pulse, electrocardiogram and electroencephalogram. Forty-three healthy students were exposed to computer-mediated stimuli while wearable non-invasive sensors collected the physiological data. The stimuli were designed to elicit three distinct affective states: relaxation, engagement and stress. In this work we report how our quantitative analysis has helped in redefining important aspects of the protocol, and we show preliminary findings on the specific psychophysiological patterns correlating with the three target affective states. Results further suggest that some of the quantitative measures might be useful in characterizing specific affective states.
I. Introduction
In the scientific literature, several studies have investigated emotion classification during human-computer interaction by means of biological signals [1] [2]. For instance, the affective computing group [3] [4] carried out important research on the use of psychophysiological measures to infer emotional states during different kinds of activity on a PC. Integrating emotional reactions into a computer-mediated communication (CMC) system would be a strategic innovation toward more successful and ergonomically designed machines. Many studies have reported results supporting the feasibility of detecting affective states through the acquisition and analysis of psychophysiological data, with the aim of correlating biological signals with emotional reactions and translating the findings into information usable in innovative human-computer interactions [5] [6]. Our initial research hypothesis is that a protocol for affective state detection can rely on the acquisition and processing of specific psychophysiological signals. To this end, a long-term collaborative effort between teams of psychologists and engineers provides a suitable environment for such an endeavor. In this paper we report the advances achieved over the last six years in defining a standard protocol for optimal analysis of the correlation between biological patterns and affective states. Overall, we have selected three prototypical laboratory situations aimed at eliciting three main affective states of interest; we have opted for a single psychological self-report and a finite collection of biological signals; and we have established simple multimodal quantitative analyses of the collected data in order to characterize the affective states of interest and compare the derived indexes to validated psychological measures.
In the future, as our findings call for additional investigation, we envisage that further refined physiological indexes will be the core of a technological tool providing quantitative feedback on precise affective states during human-computer interactions.
II. Hardware Description and Recording Settings
Data acquisition was performed at the Clinical Research Center at MIT using a FlexComp Infiniti, a 10-channel USB PC peripheral by Thought Technology. Every channel was synchronously acquired at 2048 Hz and exported at a 256 Hz sampling rate. The Lab was equipped with 2 portable PCs, one for delivering the stimuli and the other for data acquisition. Respiration (RESP), Skin Conductance (SC), Blood Volume Pulse (BVP), Electrocardiogram (EKG) and Electroencephalogram (EEG) were continuously recorded through sensors appropriately placed on the student's body. In addition, the students' facial expressions and interactions were recorded by webcam for future analysis.
III. Experimental Methods and Design
The design methodology was developed and improved over 6 years in 3 separate phases. In the first phase, a prototype experimental protocol was defined, including the human-computer interaction framework, the physiological recordings generally recognized at the time within this framework, and basic methods and techniques to analyze and process the psychophysiological data. In the second phase, the prototype protocol was tested on a group of 20 subjects in order to evaluate and, if necessary, recalibrate the efficacy of our experimental stimuli, as well as to verify the usefulness of the recorded variables. This phase yielded critical insights for possible improvements, such as the importance of the sequential order of the stimuli and the advantages of considering additional biological series. As a result, a final refined protocol was defined in the third phase, as described below.
A series of digital affective stimuli is at the core of the human-computer interaction protocol. Stimuli were created ad hoc to trigger specific emotions in 43 students from MIT (age range 20 to 25 years, mean 22, standard deviation 2.2). We drew on the previous scientific literature to shape our stimuli [7] [8]. Furthermore, the studies in [9] [10] were key in helping us develop stimuli with particular components to trigger specific psychological reactions in the subjects.
The protocol is composed of 4 epochs: Baseline (the subject is asked to stare at a white screen for 3 minutes), Relax (the subject is exposed to a sequence of panoramas, intended to induce a relaxation state, for 10 minutes), Stress (the subject is asked to perform a Stroop task [11] [12] for 4 minutes followed by a mathematical task [13] lasting another 4 minutes), and Engagement (the subject is asked to read a detective tale [10], lasting 10–15 minutes depending on his/her reading speed). Every subject was briefed in the Lab in order to become familiar with the Lab environment and equipment. Once the subject felt comfortable, he/she was asked to start the experiment. Since subjects have to speak in the stress epoch, they were also instructed to read aloud statements written on some relaxation and engagement slides, so that stress could be compared with the other epochs under similar conditions.
As the prototype phase revealed important effects of the sequential order of the emotion-inducing conditions on both psychological and physiological measures, the sample was split into two groups in the final protocol. The first group (the RES group) was exposed, in temporal order, to Baseline, Relax1, Engagement, Stress, Relax2; the second group (the RSE group) to Baseline, Relax1, Stress, Engagement, Relax2. This setting allows evaluation of the effects of Engagement on Stress and vice versa, as well as the effects of Engagement and Stress on relaxation. Also motivated by findings in the prototype phase, the entire session was repeated for each student after one month. Comparing results from the 1st and 2nd sessions allows evaluation of novelty effects on the quantitative assessment.
IV. Psychological Self-Reports
During the prototype phase, three commonly used questionnaires were evaluated for consideration: EMAS (Endler Multidimensional Anxiety Scales), STAI (State-Trait Anxiety Inventory), and PANAS (Positive and Negative Affect Schedule). The STAI scale was selected as the most responsive, based on the correlation between physiological and psychological data. In the final protocol, each subject was asked to fill in the STAI questionnaire at the end of each stimulus.
V. Signal Processing
Skin Conductance (SC)
Skin conductance and skin resistance are two measures of electrodermal activity, expressed in units of conductance (microsiemens) or resistance (ohms), respectively. All signals were monitored by the same device throughout. SC reflects a fairly slow physiological process and can be sampled at 32 Hz without distortion. The SC signal is expressed in microsiemens. We considered the mean and standard deviation of the sampled signal.
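As an illustration of the SC summary described above, the following minimal Python sketch reduces a trace exported at 256 Hz to 32 Hz and computes its mean and standard deviation. Block averaging is an assumption chosen here for simplicity (the text does not state how the 32 Hz signal was obtained), and the function name `sc_summary` and the synthetic trace are hypothetical.

```python
import numpy as np

def sc_summary(sc, fs_in=256, fs_out=32):
    """Reduce an SC trace from fs_in to fs_out by block averaging, then summarize it.

    Block averaging (a boxcar filter followed by subsampling) is an assumed,
    simple choice that is adequate for a slowly varying signal such as SC.
    """
    factor = fs_in // fs_out                      # 8 input samples per output sample
    n_blocks = len(sc) // factor                  # drop an incomplete trailing block
    trimmed = np.asarray(sc[:n_blocks * factor], dtype=float)
    sc_ds = trimmed.reshape(n_blocks, factor).mean(axis=1)
    return sc_ds, float(sc_ds.mean()), float(sc_ds.std(ddof=1))

# Synthetic trace (hypothetical data, arbitrary conductance units): slow drift around 5
t = np.arange(0, 60, 1 / 256)
sc = 5.0 + 0.5 * np.sin(2 * np.pi * 0.05 * t)
sc_ds, sc_mean, sc_std = sc_summary(sc)
```

In practice the same two summary values would be computed separately for each epoch of each subject.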
Heart Rate Variability
The ECG signal was analyzed with custom software developed in Matlab (The Mathworks, Inc.; Natick, MA) to detect the R peaks and compute the RR variability series as the time intervals between consecutive R peaks. Spectral analysis was performed by means of autoregressive spectral methods. The Levinson–Durbin recursion was used to identify the coefficients of the autoregressive model, and the model order was chosen (between 4 and 12) according to the Akaike figure of merit. The autoregressive spectral decomposition procedure was applied to calculate the power of the oscillations embedded in the series. The rhythms were classified as very low frequency (VLF, <0.04 Hz), low frequency (LF, from 0.04 to 0.15 Hz) and high frequency (HF, from 0.15 to 0.5 Hz) oscillations. Power was expressed in absolute units ($\mathrm{LF}_{RR}$ and $\mathrm{HF}_{RR}$) and in normalized units, $\mathrm{LF}_{nu} = 100 \cdot \mathrm{LF}_{RR} / (\sigma^2_{RR} - \mathrm{VLF}_{RR})$ and $\mathrm{HF}_{nu} = 100 \cdot \mathrm{HF}_{RR} / (\sigma^2_{RR} - \mathrm{VLF}_{RR})$, where $\sigma^2_{RR}$ is the RR variance and $\mathrm{VLF}_{RR}$ is the VLF power expressed in absolute units [14].
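The HRV pipeline above can be sketched in Python as follows. This is not the authors' Matlab software, and several choices are assumptions made for illustration: the RR series is treated as evenly sampled at the mean heart rate (beat-domain analysis), the autocorrelation is the biased estimate, the order is chosen by minimizing the Akaike criterion, and band powers are obtained by integrating the AR spectrum over each band rather than by the spectral decomposition procedure mentioned in the text. The names `levinson_durbin` and `ar_hrv_indexes` and the synthetic RR series are hypothetical.

```python
import numpy as np

def levinson_durbin(r, order):
    """Levinson-Durbin recursion: AR coefficients a (a[0] = 1) and noise variance."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for k in range(1, order + 1):
        acc = r[k] + np.dot(a[1:k], r[k - 1:0:-1])
        refl = -acc / err                          # reflection coefficient
        a_prev = a.copy()
        a[1:k] = a_prev[1:k] + refl * a_prev[k - 1:0:-1]
        a[k] = refl
        err *= 1.0 - refl ** 2
    return a, err

def ar_hrv_indexes(rr, min_order=4, max_order=12, n_freq=4096):
    """LF/HF powers (absolute and normalized) from an RR series given in seconds."""
    rr = np.asarray(rr, dtype=float)
    x = rr - rr.mean()
    n = len(x)
    fs = 1.0 / rr.mean()                           # assumed: one sample per beat
    # Biased autocorrelation estimate for lags 0..max_order
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_order + 1)])
    best = None
    for p in range(min_order, max_order + 1):      # order selection by Akaike criterion
        a, err = levinson_durbin(r, p)
        aic = n * np.log(err) + 2 * p
        if best is None or aic < best[0]:
            best = (aic, p, a, err)
    _, order, a, err = best
    # One-sided AR power spectral density on a uniform grid from 0 to fs/2
    freqs = np.linspace(0.0, fs / 2, n_freq)
    df = freqs[1] - freqs[0]
    z = np.exp(-2j * np.pi * np.outer(freqs / fs, np.arange(order + 1)))
    psd = 2.0 * err / fs / np.abs(z @ a) ** 2

    def band_power(lo, hi):
        mask = (freqs >= lo) & (freqs < hi)
        return psd[mask].sum() * df

    vlf, lf, hf = band_power(0.0, 0.04), band_power(0.04, 0.15), band_power(0.15, 0.5)
    var = r[0]                                     # RR variance
    return {"order": order, "VLF": vlf, "LF": lf, "HF": hf,
            "LFnu": 100 * lf / (var - vlf), "HFnu": 100 * hf / (var - vlf),
            "LF/HF": lf / hf}

# Synthetic RR series (hypothetical data): LF rhythm at 0.10 Hz, HF rhythm at 0.25 Hz
rng = np.random.default_rng(0)
t, rr_list = 0.0, []
while t < 300.0:
    rr_k = (0.8 + 0.05 * np.sin(2 * np.pi * 0.10 * t)
            + 0.02 * np.sin(2 * np.pi * 0.25 * t) + 0.01 * rng.standard_normal())
    rr_list.append(rr_k)
    t += rr_k
res = ar_hrv_indexes(rr_list)
```

With real recordings, `rr` would be obtained as `np.diff(r_peak_times)` after R-peak detection and artifact correction, neither of which is shown here.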
Respiration
The respiration signal was recorded by means of a stretch belt sensor positioned around the chest (thoracic respiration). The signal was filtered (software Biograph Infiniti from Thought Technology) to produce a smooth sinusoidal signal. The Respiration Period is the peak-to-peak time (max-to-max distance of the sinusoid).
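A peak-to-peak respiration period of this kind could be computed as in the sketch below. It assumes an already smoothed signal and uses `scipy.signal.find_peaks`; the minimum peak spacing `min_period_s` is a hypothetical guard against spurious maxima, not a parameter taken from the study, and the test signal is synthetic.

```python
import numpy as np
from scipy.signal import find_peaks

def respiration_periods(resp, fs, min_period_s=1.5):
    """Return the max-to-max intervals (in seconds) of a smoothed respiration signal."""
    # Require successive maxima to be at least min_period_s apart (assumed value)
    peaks, _ = find_peaks(resp, distance=int(min_period_s * fs))
    return np.diff(peaks) / fs

# Synthetic breathing at 0.25 Hz (15 breaths per minute, hypothetical data)
fs = 256
t = np.arange(0, 60, 1 / fs)
resp = np.sin(2 * np.pi * 0.25 * t)
periods = respiration_periods(resp, fs)
mean_period = float(periods.mean())
```

The epoch-level index would then be the mean (and possibly the standard deviation) of `periods` within each epoch.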
EEG
The EEG sensors (2 channels) were placed over the frontal lobes of the subjects. For the spectral analysis of the EEG Beta band, a low cutoff frequency (LCF) of 13 Hz and a high cutoff frequency (HCF) of 21 Hz were used. For the total power, the low cutoff frequency was 1 Hz and the high cutoff frequency was 512 Hz.
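One way to turn these cutoffs into a Beta percentage-of-power index is sketched below. Welch's method is an assumption (the spectral estimator used in the study is not specified), as is the 2-second segment length. Because a 512 Hz upper cutoff exceeds the Nyquist frequency of data exported at 256 Hz, the sketch clips the total band at half the sampling rate. The function name and the test signals are hypothetical.

```python
import numpy as np
from scipy.signal import welch

def beta_percent_power(eeg, fs, beta=(13.0, 21.0), total=(1.0, 512.0)):
    """Beta-band power as a percentage of the power in the total band."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))   # 2 s segments (assumed)
    hi_total = min(total[1], fs / 2)                      # cannot exceed Nyquist
    beta_mask = (freqs >= beta[0]) & (freqs <= beta[1])
    total_mask = (freqs >= total[0]) & (freqs <= hi_total)
    # Uniform frequency grid, so the bin width cancels in the ratio
    return 100.0 * psd[beta_mask].sum() / psd[total_mask].sum()

# Synthetic signals (hypothetical data): one dominated by 17 Hz, one by 40 Hz
fs = 256
t = np.arange(0, 30, 1 / fs)
rng = np.random.default_rng(0)
beta_like = np.sin(2 * np.pi * 17 * t) + 0.1 * rng.standard_normal(t.size)
gamma_like = np.sin(2 * np.pi * 40 * t) + 0.1 * rng.standard_normal(t.size)
pct_beta = beta_percent_power(beta_like, fs)
pct_gamma = beta_percent_power(gamma_like, fs)
```

A signal whose energy sits inside 13–21 Hz yields a value close to 100, while one dominated by activity outside that band yields a value close to 0.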
VI. Results
We analyzed the students' affective reactions to our stimuli as dependent variables, by means of both psychological self-report scores from the STAI and physiological activation derived from signal processing and statistical analysis of the SC, RESP, EKG and EEG signals.
In Figure 1 we show the results averaged over all subjects for the STAI scores and the SC index during the baseline epoch in the 1st and 2nd sessions. Both STAI scores and SC values at baseline are lower in the second session than in the first. This finding suggests that subjects in the first session reported a higher level of stress during the baseline because they did not know anything about the protocol. In the second session, data are not affected by the novelty effect of the experiment. For this reason, from now on we report results from the second session only.
Figure 2 shows averaged results from the second session for the STAI score (subscales in blue and red, Total scores in yellow, and percentile Total score in light blue) together with the SC, RP (respiration period) and EEG indexes. The first, light blue bar is baseline; the dark yellow bar is relaxation1; the green bar is engagement; the red bar is stress; and the light yellow bar is relaxation2. Results are averaged over all subjects (22 for RES and 21 for RSE) within epochs (3 min for baseline; 10 min for relax1; 10–15 min for engagement; 8 min for stress; 10 min for relax2). Bars show standard deviation values. The highest STAI scores belong to stress, while the lowest are from relaxation1. Of note, STAI scores in relaxation2 did not decrease to relaxation1 levels. For SC, stress epochs are always the highest and relaxation epochs always the lowest, with engagement always in between. The second relaxation epochs are also affected by the preceding epochs. For RP, the stress epochs show the highest values, whereas the lowest values occur during engagement. For EEG Beta waves the highest values are for engagement, with strong
Figure 3 shows the HRV indexes. From the top, RR values show a strong reduction during stress, with no significant differences between relaxation and engagement. For total HRV, stress epochs show the highest values, with a smaller difference between relaxation and engagement. From baseline to stress, HF absolute values decrease significantly, whereas differences between engagement and relaxation are not significant. The LF/HF graphs are the most interesting, as the highest values are for engagement, while the stress epoch values lie between engagement and relaxation (the lowest).
VII. Statistical and Correlation Analysis
Mean and standard deviation of the STAI Total scores and of all physiological indexes considered were computed for each subject and each epoch (Baseline, Relaxation1, Stress, Engagement and Relaxation2). Epoch averages across the subjects of each group (22 for RES and 21 for RSE) were also considered. Statistical analysis was performed to compare the three emotional states (Relax, Stress and Engagement) using Student's t-test with Bonferroni correction for multiple comparisons. Table I shows the significance outcomes of the statistical analysis for the second session of the RES (Relax, Engagement, Stress) and RSE (Relax, Stress, Engagement) groups.
Table I.
Measure | Relax-Stress | Relax-Engagement | Stress-Engagement
---|---|---|---
STAI Total | 0.000224 | 1.35E-08 | 1.85E-06
SC | 4.38E-07 | 0.014875 | 8E-07
Resp Period | 0.009278 | 0.022267 | 1.48E-06
β % Power | 0.001163183 | 0.479352139 | 7.48514E-05
RR Mean | 0.001073 | 0.423548 | 0.001066
LFnu | 0.113496 | 0.021362 | 0.944537
HFnu | 0.305767 | 0.090382 | 0.6905
LF/HF | 0.443711 | 0.072408 | 0.589713

Measure | Relax-Stress | Relax-Engagement | Stress-Engagement
---|---|---|---
STAI Total | 5.22E-06 | 0.001186 | 0.010027
SC | 4.21E-06 | 2.72E-05 | 0.009354
Resp Period | 3.42E-05 | 0.079058 | 1.68E-07
β % Power | 0.175138 | 0.005794 | 7.07E-05
RR Mean | 0.012765 | 0.75637 | 0.012091
LFnu | 0.185313 | 0.012144 | 0.222024
HFnu | 0.127555 | 0.016524 | 0.303989
LF/HF | 0.209047 | 0.146864 | 0.505276
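The pairwise epoch comparisons summarized in Table I could be reproduced along the lines of the following sketch. It assumes a paired t-test (each subject contributes one value per epoch), which the text does not state explicitly, and it applies the Bonferroni correction by dividing the significance level by the number of comparisons; whether the tabulated p-values are raw or corrected is likewise not stated. The function name and the per-subject values are hypothetical.

```python
from itertools import combinations

import numpy as np
from scipy.stats import ttest_rel

def pairwise_bonferroni(epoch_values, alpha=0.05):
    """Paired t-tests between all epoch pairs with a Bonferroni-adjusted threshold."""
    pairs = list(combinations(epoch_values, 2))
    threshold = alpha / len(pairs)                 # 0.05 / 3 for three epochs
    results = {}
    for a, b in pairs:
        p = float(ttest_rel(epoch_values[a], epoch_values[b]).pvalue)
        results[f"{a}-{b}"] = (p, p < threshold)
    return results

# Hypothetical per-subject index values for one group of 22 subjects
rng = np.random.default_rng(0)
relax = rng.normal(2.0, 0.5, 22)
values = {
    "Relax": relax,
    "Stress": relax + 1.5 + rng.normal(0.0, 0.3, 22),
    "Engagement": relax + rng.normal(0.0, 0.3, 22),
}
results = pairwise_bonferroni(values)
```

Each entry of `results` holds the raw p-value and a flag telling whether it passes the Bonferroni-adjusted threshold.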
We further performed a correlation analysis between STAI scores and biological signals, in order to explore the level of correlation between psychological and physiological measures. The most interesting results concern STAI scores, skin conductance, and mean RR interval, shown in Figure 4. The two physiological indexes are highly correlated with the STAI scores, particularly in the second session, where the novelty effect is absent. Significant findings also emerged when correlating respiration period and STAI score, especially for the stress epochs. EEG Beta waves also showed interesting correlation trends with STAI scores, although not always significant. Finally, a surprising result is a relevant correlation between LF/HF and the engagement epochs.
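A correlation analysis of this kind might be sketched as follows. The Pearson coefficient is an assumption, since the type of correlation is not specified in the text, and the function name and all data are hypothetical, generated only so that the example runs.

```python
import numpy as np
from scipy.stats import pearsonr

def correlate_with_stai(stai, indexes):
    """Pearson correlation (r, p) between STAI scores and each physiological index."""
    out = {}
    for name, values in indexes.items():
        r, p = pearsonr(stai, values)
        out[name] = (float(r), float(p))
    return out

# Hypothetical data for 43 subjects in one epoch
rng = np.random.default_rng(0)
stai = rng.normal(40.0, 8.0, 43)
indexes = {
    "SC": 0.1 * stai + rng.normal(0.0, 0.3, 43),               # rises with STAI
    "RR Mean": 1.0 - 0.005 * stai + rng.normal(0.0, 0.02, 43),  # falls with STAI
}
corr = correlate_with_stai(stai, indexes)
```

A rank-based coefficient such as Spearman's could be substituted if the relationship is monotonic but not linear.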
VIII. Discussion
We have reported a preliminary quantitative analysis related to specific psychophysiological patterns correlating with three target affective states. Important conclusions and observations can be drawn from the study.
From a psychological standpoint, the STAI scores show that the 3 stimuli were effective in eliciting the 3 targeted affective states of relaxation, engagement and stress. First, a notable sequence effect was observed: engagement and relaxation right after stress show higher scores than engagement and relaxation before stress. Because of this sequence effect, RES gradually improves the arousal reactions from the first to the last epoch, while RSE shows how engagement after stress is affected by the lingering stress arousal. This has important implications for setting up sequential stimuli and/or using a randomized experimental design. Another important consideration concerns the baseline: on the one hand, we observed that baseline and relaxation correlate very well; on the other hand, we found that a better baseline epoch can be acquired by repeating all the tests one month after the first session, since subjects appear prone to the novelty effect in the first session.
Table I shows significant differences between epochs for many psychophysiological measures, reflecting the STAI behavior assessed through self-rated tests. In this sense, the results offer reasonable prospects for using these measures to characterize the affective states. An important conclusion is that relax-engagement is harder to differentiate than relax-stress and stress-engagement: reading a detective tale as a stimulus intended to induce engagement may not elicit a psychophysiological reaction that differs from the one evoked by relaxation. Further research should investigate alternative experimental situations of engagement, studying which factors can elicit stronger patterns of engagement. The best classification results come from the SC and Respiration signals, as they show significantly different patterns for nearly all epoch comparisons. These two indexes are probably the most strongly associated with the physiological reactions to the affective states of interest. The measures derived from RR are promising; however, their poor performance in discerning between engagement and relaxation reflects, on the one hand, the known high inter-subject variability and, on the other, the possibility that the analysis is affected by physiological and behavioral factors that bias the sympatho-vagal estimate. This certainly deserves further research. Nevertheless, the correlation between HRV measures and engagement is quite interesting. From EEG, the β % Power index might reflect a good correlation with engagement due to the reading process that subjects perform during the engagement epoch. Further research might also address this factor.
From a methodological standpoint, it is possible to define at least two realms. On the one side there is the realm of "raw data", which of course requires good sensors, proper sensor placement, preparation of body contact, etc. On the other side, the choice of the optimal indexes [15] is a very important issue in inferring additional information from raw data. As shown by the table of p-values, as well as by the correlation results, some measures such as SC and Respiration Period are quite efficient in distinguishing the 3 affective states. However, it might be advantageous to try further alternative parameters to verify whether they improve the performance in discerning between states.
In conclusion, our work so far provides a solid basis for further investigation aimed at identifying more refined physiological measures more closely associated with targeted psychological states. Affective computing is a promising field that could lead, in the long run, to the identification of specific patterns of affective states able to provide critical information in a wide set of applications for human-computer interactions.
Acknowledgments
We thank the Brown Lab at MIT, 77 Mass Ave, Cambridge MA, for funding the experimental recordings, and the Clinical Research Center at MIT for all the support and help in carrying out the experiments. We also thank Luca Citi for helping with signal preprocessing and analysis. This work was supported by NIH Grants R01-HL084502 and DP1-OD003646.
References
- 1. Wastell DG, Newman M. Stress, control and computer system design: a psychophysiological field study. Behaviour and Information Technology. 1996;15(3):183–192.
- 2. Wilson G, Sasse MA. Do users always know what’s good for them? Utilising physiological responses to assess media quality. In: McDonald S, Waern Y, Cockton G, editors. People and Computers XIV-Usability or Else! Proc of HCI. Springer; Berlin: 2000. pp. 327–339.
- 3. Picard R. Affective Computing. 1997.
- 4. Scheirer J, Fernandez R, Klein J, Picard R. Frustrating the user on purpose: a step toward building an affective computer. Interacting with Computers. 2002;14:93–118.
- 5. Angeslevä J, Reynolds C, O’modhrain S. DVD ACM Transactions on Graphics. 2004. EmoteMail.
- 6. Picard R, Vyzas E, Healey J. Toward Machine Emotional Intelligence: Analysis of Affective Physiological State. IEEE Trans on pattern analysis and machine intelligence. 2001;23(10):1175–1191.
- 7. Pagani M, Lombardi F, Guzzetti S, Sandrone G, Rimoldi O, Malfatto G, Cerutti S, Malliani A. Power spectral density of heart rate variability as an index of sympatho-vagal interaction in normal and hypertensive subjects. J Hypertens Suppl. 1984;2(3):383–385.
- 8. Shapiro S, Lespérance Y. Modeling Multiagent Systems with CASL - A Feature Interaction Resolution Application. Proceedings of the 7th International Workshop on Intelligent Agents VII. Agent Theories Architectures and Languages; July 07–09, 2000. pp. 244–259.
- 9. Lang PJ. The emotion probe: Studies of motivation and attention. American Psychologist. 1995;50(5):372–385. doi: 10.1037//0003-066x.50.5.372.
- 10. Scotti S, Mauri M, Barbieri R, Jawad B, Cerutti S, Mainardi L, Brown EN, Villamira MA. Automatic Quantitative Evaluation of Emotions in E-Learning Applications. Conf Proc IEEE Eng Med Biol Soc. 2006. pp. 1359–62.
- 11. Stroop JR. Studies of Interference in Serial Verbal Reactions. Journal of Experimental Psychology. 1935;18:643–662.
- 12. MacLeod CM. Half a century of research on the Stroop effect: An integrative review. Psychological Bulletin. 1991;109(2):163–203. doi: 10.1037/0033-2909.109.2.163.
- 13. Lang PJ. The emotion probe: Studies of motivation and attention. American Psychologist. 1995;50:372–385. doi: 10.1037//0003-066x.50.5.372.
- 14. Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology. Heart rate variability: standards of measurement, physiological interpretation and clinical use. Circulation. 1996;93:1043–1065.
- 15. Rainville P, Bechara A, Naqvi N, Damasio AR. Basic emotions are associated with distinct patterns of cardiorespiratory activity. International Journal of Psychophysiology. 2006;61:5–18. doi: 10.1016/j.ijpsycho.2005.10.024.