Biomedical Journal. 2018 Jan 3;40(6):355–368. doi: 10.1016/j.bj.2017.11.001

An accurate emotion recognition system using ECG and GSR signals and matching pursuit method

Atefeh Goshvarpour 1, Ataollah Abbasi 1,, Ateke Goshvarpour 1
PMCID: PMC6138614  PMID: 29433839

Abstract

Background

The purpose of the current study was to examine the effectiveness of Matching Pursuit (MP) algorithm in emotion recognition.

Methods

Electrocardiogram (ECG) and galvanic skin responses (GSR) of 11 healthy students were collected while the subjects were listening to emotional music clips. Applying three dictionaries, including two wavelet packet dictionaries (Coiflet and Daubechies) and the discrete cosine transform, MP coefficients were extracted from the ECG and GSR signals. Next, some statistical indices were calculated from the MP coefficients. Then, three dimensionality reduction methods, including Principal Component Analysis (PCA), Linear Discriminant Analysis, and Kernel PCA, were applied. The dimensionality-reduced features were fed into a Probabilistic Neural Network in subject-dependent and subject-independent modes. Emotion classes were described by a two-dimensional emotion space, including the four quadrants of the valence-arousal plane, valence-based, and arousal-based emotional states.

Results

Using PCA, the highest recognition rate of 100% was achieved for sigma = 0.01 in all classification schemes. In addition, the classification performance of ECG features was evidently better than that of GSR features. Similar results were obtained for subject-dependent emotion classification mode.

Conclusions

An accurate emotion recognition system was proposed using MP algorithm and wavelet dictionaries.

Keywords: Electrocardiogram, Emotion recognition, Galvanic skin responses, Matching pursuit, Probabilistic neural network


At a glance commentary

Scientific background on the subject

The main goal of this study was to examine the performance of an automatic emotion recognition system based on Matching Pursuit (MP) algorithm using galvanic skin response (GSR) and electrocardiogram (ECG) time series.

What this study adds to the field

An accurate emotion recognition system was proposed using MP algorithm with the maximum recognition rate of 100%. In addition, the classification performance of ECG features was better than that of GSR measures.

Monitoring and evaluation of the Autonomic Nervous System (ANS) is generally performed using physiological measures, including the electrocardiogram (ECG), Galvanic Skin Response (GSR), Blood Pressure (BP), and respiration rate. Among these, the ECG and GSR have received special attention for evaluating different pathological and psychophysiological conditions. The ECG is one of the most informative signals for evaluating the electrical activity of the heart, and the GSR provides enlightening evidence on sweat gland function as an ANS indicator [1]. These measures offer simple, effective, low-cost, noninvasive, and continuous recordings. However, to identify the patterns associated with distinctive mental and physiological states, automatic interpretation is crucial.

Several studies have been conducted to find relationships between emotion and ANS function. To address this topic, Kreibig reviewed 134 articles studying emotional ANS responding in healthy individuals, as well as the choice of physiological measures for evaluating ANS reactivity [2]. Significant ANS response specificity in emotion was concluded, especially for some distinct affective states. Earlier, Levenson also examined the ANS as an indicator for detecting the occurrence of emotion [3]. Some studies have considered the sympathetic and parasympathetic divisions of ANS function separately. For example, sympathetic activation and vagal deactivation have been demonstrated for anxiety [4]. However, a greatly diverging autonomic signature has been reported during the occurrence of emotions [2].

Automated biosignal classification has been an interesting issue in many fields of medical research, including health and disease monitoring, and emotion recognition is one of its important areas. Several emotion recognition applications have been demonstrated for human life, including computer games and entertainment, human–computer interfaces, humanoid robotics, assistance for intellectual disabilities, and patient/doctor interaction in diseases such as schizophrenia and autism [5], [6], [7], [8]. The importance of emotion recognition has resulted in the advent of "affective computing". Many attempts have been made to develop automatic devices that can manage the problem of recognizing and interpreting human emotion, and several methodologies have been suggested to improve recognition rates. These methodologies comprise the application of time-domain measures, spectral components, wavelet transforms, and nonlinear analysis.

The dynamic structure of physiological signals and the quick fluctuations in their patterns suggest decomposing them over large classes of waveforms. For this purpose, the Fourier Transform (FT) and wavelet analysis have been introduced, but they are not always satisfactory. The FT gives a poor representation of functions that are well localized in time. In addition, in both techniques, signal transients cannot be easily detected and identified from the expansion coefficients, because the information may be diluted across the entire analysis. Wacker and Witte claimed that, among time-frequency methods, the Matching Pursuit (MP) algorithm is a desirable one, as it provides a promising time-frequency resolution for all frequencies while reducing cross terms [9]. In addition, MP is the first processing procedure that adapts the window length to the local features of the examined time series [10], [11]. Applying this method, periodic and transient structures of the signal are defined parametrically by time span, time of occurrence, frequency, amplitude, and phase.

The time-frequency resolution of MP is high, and this technique has been successfully used in many areas of biomedical research. Durka and Blinowska studied transients in sleep Electroencephalogram (EEG) by means of the MP algorithm [11]. It was shown that sleep spindles can be localized with high precision, and their time span and intensities were recognized. The authors claimed that, applying the MP technique, different structures in the data can be identified and their spatiotemporal characteristics monitored. Bardonova et al. used MP to detect frequency changes in heart signals [12]. The heart cycles were decomposed, and a relative frequency histogram of the data was calculated. They showed that frequency changes in the QRS complex of ECG signals can be analyzed during the experiment using the MP technique. Sommermeyer et al. proposed an MP-based algorithm to examine photoplethysmographic signals of patients with sleep disorders [13]. The system offered information about central or obstructive sleep apnea, with promising specificity (>90%) and sensitivity (>95%). To classify ECG features, Pantelopoulos and Bourbakis examined the effectiveness of projecting ECG samples on wavelet packet dictionaries extracted with the MP algorithm [14]. Hong-xin et al. employed MP with a Gabor dictionary to compress EEG and ECG signals [15]. A genetic algorithm was also implemented to reduce computational complexity. The proposed algorithm achieved a higher compression ratio and lower reconstruction errors compared with traditional methods. This confirmed that the MP technique is a suitable tool for studying nonstationary physiological signals.

In the current study, the performance of MP on GSR and ECG time series was assessed. Special emphasis was put on the emotional responses elicited by music.

Although music is a well-established method for inducing certain emotions, few investigations of emotion recognition with musical stimuli have used intelligent algorithms and classifiers. Kim and Andre proposed an emotion recognition system based on four physiological signals: electromyogram (EMG), ECG, respiration changes (RSP), and skin conductivity (SC) [16]. Entropy, time-frequency indices, spectral measures, and geometric analysis were calculated to find the most relevant features. They performed subject-dependent and subject-independent classification, achieving maximum classification rates of 95% and 70%, respectively. Duan et al. assessed the performance of k-nearest neighbor (KNN), support vector machine (SVM), and least squares classifiers for emotion recognition using EEG signals [17]. After smoothing EEG power spectrum features, their dimension was reduced by means of minimal redundancy maximal relevance (MRMR) and principal component analysis (PCA). The proposed framework resulted in a mean accuracy rate of 81.03%. Lin et al. attempted to discover the association between musical stimuli and EEG signals [18]. A maximum accuracy of 92.73% was acquired using 60 features extracted from all EEG frequency bands (delta, theta, alpha, beta, and gamma) with a temporal resolution of 1 s. Their previous work also revealed that alpha power asymmetry could discriminate the categories of emotion with an average classification rate of 69.7% in five volunteers [19]. In another investigation, to improve the classification rate of emotional EEG signals, Lin et al. applied machine learning algorithms [20]. They showed that some power spectrum features can be considered sensitive measures for characterizing emotional brain dynamics; a maximum classification rate of 82.29% ± 3.06% was reported. Naji et al. proposed a novel approach based on forehead biosignals (FBS) and ECG to recognize music-induced emotions [21].
A mean accuracy of 89.24%, corresponding to recognition rates of 94.86% and 94.06% for the valence and arousal dimensions, was attained. More recently, to characterize four emotions elicited by musical stimuli (engaging, soothing, boring, and annoying), 3-channel FBS were examined in 25 healthy individuals [22]. The best total accuracy rate was 87.05%, corresponding to a best arousal accuracy of 93.66% and a best valence accuracy of 93.29%. Using ECG features in combination with FBS data [23], correct classification rates increased to 94.91%, 93.63%, and 88.78% for arousal, valence, and total classification, respectively. Agrafioti et al. proposed a subject-dependent emotion recognizer which recruited empirical mode decomposition to detect emotion patterns in ECG [24]. They examined two classification problems: (1) low, medium, and high arousal, and (2) low and high arousal. Using a linear discriminant, a maximum accuracy of 89% was reported. Alzoubi et al. applied ECG, EMG, and GSR properties to discriminate eight affective states [25]. Several statistical features were extracted, and two feature selection methods and nine classifiers were examined. Among them, KNN and linear Bayes normal classifiers yielded the best emotion recognition rates, with a mean kappa score of 0.25 in user-dependent mode. In another study, an emotion recognition system was proposed that considers five physiological signals: ECG, GSR, blood volume pulse (BVP), respiration, and pulse signals [26]. A support vector regression (SVR) was trained on morphological indices to discriminate three emotions, giving a highest recognition rate of 89.2%. Jerrita et al. evaluated ECG data of 60 participants in 6 emotional states [27]. After combining Hurst features with higher order statistics (HOS), four classifiers were examined. A maximum accuracy of 92.87% was reported for random validation;
however, in a subject-independent validation mode, the highest recognition rate was 76.45%. An automatic multiclass arousal/valence classifier was proposed using standard and nonlinear features of ECG, GSR, and RSP [28]. A recognition accuracy of more than 90% was achieved using the quadratic discriminant classifier (QDC). Later, the same team proposed a personalized framework to characterize emotional states [29]. Features from the instantaneous spectrum, bispectrum, and the dominant Lyapunov exponent (LE) were fed to an SVM, achieving an overall accuracy of 79.29% in recognizing four emotional states.

Due to the importance of emotion recognition and its applications, it is crucial to further scrutinize and improve the accuracy of emotion classification systems. Therefore, the aim of the current study was to propose a more accurate system for affect recognition. To this end, the MP algorithm with three different dictionaries, in combination with several feature selection methods, was applied to ECG and GSR signals.

The current study is organized as follows: Section 2 describes the data collection protocol and introduces the proposed methodology, which consists of the matching pursuit algorithm, feature selection, and classification. The results are presented in section 3, and the study is concluded in section 4.

Methods

The emotion recognition framework is shown in Fig. 1.

Fig. 1. Proposed methodology.

Briefly, ECG and GSR signals were recorded simultaneously from 11 subjects while they listened to music with different emotional content (Data Acquisition). Then, applying three dictionaries, including the Coiflet wavelet (Coif5) at level 14, the Daubechies wavelet (db4) at level 8, and the discrete cosine transform (DCT), matching pursuit coefficients were calculated from the normalized GSR and ECG signals (Feature Extraction). After extracting some statistical features (Indices), different dimensionality reduction methods were evaluated to decrease computational cost, reduce the risk of the curse of dimensionality, and remove redundant features. Finally, the feature vector was fed to a probabilistic neural network (PNN).

Data acquisition

GSR and ECG signals of 11 college students (females; mean age: 22.73 ± 1.68 years) were collected. All participants were asked to read and sign a consent form if they agreed to take part in the experiment. The privacy rights of human subjects were always observed and the experiment was conducted in accordance with the ethical principles of the Helsinki Declaration [30].

To describe emotions, one of the most commonly used approaches is a dimensional model of emotion, in which a few independent dimensions are considered on discrete or continuous scales. In this approach, two dimensions are usually chosen, namely arousal and valence. In the current protocol, emotional states in all four quadrants of the valence and arousal dimensions were selected. As a result, peacefulness (low arousal, positive valence), happiness (high arousal, positive valence), sadness (low arousal, negative valence), and scary (high arousal, negative valence) were chosen as emotion classes. Fifty-six short musical excerpts (fourteen stimuli per emotional category) were selected, which were validated by Vieillard et al. [31]; they were recorded in a piano timbre.

The music blocks were presented in random order. The subjects were instructed to put on the headphones, lie down in a supine position, and remain still during data acquisition. An initial baseline measurement was carried out for 2 min with eyes closed, followed by about 15 min of emotional music. All tests were done under controlled temperature and light conditions, and at specific times of day (9 AM–1 PM). The average room temperature was about 23 °C. Musical pieces were presented at a comfortable volume using KMPlayer software. Fig. 2 depicts the protocol. After the session, the subjects were asked to fill in a questionnaire evaluating the emotional content of the music.

Fig. 2. Protocol description.

All signals were collected in the Computational Neuroscience Laboratory using a 16-channel PowerLab (ADInstruments) at a sampling rate of 400 Hz. To remove AC power-line noise, a digital notch filter was applied at 50 Hz. An example of the recorded signals is shown in Fig. 3. More details about the signal acquisition procedure can be found in the study by Goshvarpour et al. [32].

Fig. 3. Example of GSR and ECG signals from one subject.
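The power-line filtering step described above can be sketched as follows. This is an illustrative reconstruction rather than the authors' code; the quality factor Q is an assumed value, and the input here is a synthetic trace.

```python
import numpy as np
from scipy.signal import iirnotch, filtfilt

FS = 400.0      # sampling rate (Hz), as reported in the text
F_NOTCH = 50.0  # AC power-line frequency to remove (Hz)
Q = 30.0        # notch quality factor (assumed; controls notch width)

def remove_powerline(signal):
    """Zero-phase 50 Hz notch filtering of a 1-D signal."""
    b, a = iirnotch(F_NOTCH, Q, fs=FS)
    return filtfilt(b, a, signal)

# Synthetic ECG-like trace: a slow component plus 50 Hz interference.
t = np.arange(0, 2, 1 / FS)
ecg_like = np.sin(2 * np.pi * 1.2 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)
clean = remove_powerline(ecg_like)
```

Zero-phase filtering (filtfilt) avoids distorting the timing of ECG waves, which matters for morphological features.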

Feature extraction

Matching pursuit

The MP algorithm was first proposed by Mallat and Zhang in 1993 [33]. This technique is a greedy, iterative procedure described as follows:

  • 1.

    Find the function (atom) of the chosen dictionary that has the largest inner product with the input signal and thus provides the best fit to it.

  • 2.

    Subtract the contribution provided by the selected function from the signal.

  • 3.

    Repeat steps 1 and 2 on the remaining residual until a satisfactory decomposition of the given signal is obtained in terms of the selected functions.

Each dictionary consists of functions that exemplify the structures of the signal. In addition, all dictionary functions are normalized; therefore, they have an equal chance of being selected as the optimal matching atom.

Let x be the given signal and D the selected dictionary with functions (atoms) ϕ_P, so that D = {ϕ_P}, P ∈ Γ. To approximate and decompose x, the MP technique works iteratively as follows:

x = Σ_{n=0}^{m−1} a_n ϕ_{P_n} + R^m x (1)

where ϕ_{P_n} represents the matched function in the nth iteration, and P characterizes the precise time and frequency centers, phase, amplitude, and width. a_n is the projection (inner product) of R^n x onto ϕ_{P_n}:

a_n = ⟨R^n x, ϕ_{P_n}⟩ (2)
R^{m+1} x = R^m x − ⟨R^m x, ϕ_{P_m}⟩ ϕ_{P_m} (3)

where R^0 x = x. As mentioned previously, at every step the atom that maximizes the inner product of equation (2) is chosen, so that the residual of equation (3) is minimized. The stopping criterion of the MP algorithm is reached when either the maximum number of functions or a satisfactory approximation of the signal has been achieved.

GSR and ECG signals have different morphological structures. Applying MP with suitable dictionaries, one can distinguish and approximate these structures. Wavelet packets, Gabor atoms, and cosine packets are some of the available waveform dictionaries. For wavelet dictionaries, it is crucial to select suitable wavelet functions and the number of decomposition levels. Therefore, one should first determine which wavelet type is suitable for ECG and GSR signals, and then select the appropriate dictionary for MP. In other words, the better the characteristics of the selected dictionary match the data, the better the results that can be attained.

In the current study, three different dictionaries were examined, including coif5 at level 14, db4 at level 8, and DCT dictionary.
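As a concrete illustration of steps 1-3 and equations (1)-(3), the following sketch implements matching pursuit over a DCT dictionary (the wavelet-packet dictionaries are omitted here). It is a minimal didactic version, not the study's implementation.

```python
import numpy as np

def dct_dictionary(n):
    """Orthonormal DCT-II basis whose columns serve as dictionary atoms."""
    k = np.arange(n)
    D = np.cos(np.pi * (2 * k[:, None] + 1) * k[None, :] / (2 * n))
    D /= np.linalg.norm(D, axis=0)  # normalize atoms: equal selection chance
    return D

def matching_pursuit(x, D, n_atoms=10):
    """Greedy MP: pick the atom with the largest inner product with the
    residual (Eq. 2), subtract its contribution (Eq. 3), and repeat."""
    residual = x.astype(float).copy()   # R^0 x = x
    coeffs, atoms = [], []
    for _ in range(n_atoms):
        products = D.T @ residual
        best = int(np.argmax(np.abs(products)))  # step 1: best-matching atom
        a = products[best]                       # a_n = <R^n x, phi_Pn>
        residual = residual - a * D[:, best]     # step 2: update residual
        coeffs.append(a)
        atoms.append(best)
    return np.array(coeffs), np.array(atoms), residual

n = 128
x = np.sin(2 * np.pi * 5 * np.arange(n) / n)
D = dct_dictionary(n)
coeffs, atoms, res = matching_pursuit(x, D, n_atoms=5)
```

By construction, the selected atoms plus the final residual reconstruct the signal exactly (Eq. 1), and the residual energy shrinks with each selected atom.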

Feature selection

A high-dimensional feature space results in high computational costs, the risk of the curse of dimensionality, and poor classification accuracies. In this study, to extract relevant features, two traditional linear techniques (LDA and PCA) and one global nonlinear approach (Kernel PCA) were implemented.

PCA

PCA constructs a linear transformation T from the principal components (i.e., the principal eigenvectors) of the data, such that Tᵀ cov(X − X̄) T is maximized [34]. Here, cov(X − X̄) represents the covariance matrix of the zero-mean data X. Therefore, PCA is formulated as (4):

cov(X − X̄) υ = λ υ (4)

where λ denotes the principal eigenvalues. The low-dimensional data representations y_i are calculated by mapping onto the linear basis T: Y = (X − X̄) T. One of the main drawbacks of the PCA technique is that the size of the covariance matrix is proportional to the dimensionality of the data points [35].
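Equation (4) amounts to an eigendecomposition of the covariance matrix of the zero-mean data; a minimal sketch (not the study's code) is:

```python
import numpy as np

def pca(X, n_components=2):
    """Rows of X are samples. Returns Y = (X - Xbar) T and sorted eigenvalues."""
    Xc = X - X.mean(axis=0)               # zero-mean data
    C = np.cov(Xc, rowvar=False)          # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)  # cov(X - Xbar) v = lambda v, Eq. (4)
    order = np.argsort(eigvals)[::-1]     # sort by decreasing eigenvalue
    T = eigvecs[:, order[:n_components]]  # principal eigenvectors
    return Xc @ T, eigvals[order]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))  # correlated toy data
Y, eigvals = pca(X, n_components=2)
```

The variance of each projected coordinate equals the corresponding eigenvalue, which is why the leading components retain the most variance.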

Linear Discriminant Analysis (LDA)

LDA is a supervised technique which attempts to find a linear mapping T that maximizes linear class separability in a low-dimensional space [34]. Consider the within-class scatter S_w and the between-class scatter S_b, defined as (5) and (6):

S_w = Σ_c p_c cov(X_c − X̄_c) (5)
S_b = Σ_c cov(X̄_c) = cov(X − X̄) − S_w (6)

where cov(X_c − X̄_c) is the covariance matrix of the zero-mean data points x_i assigned to class c ∈ C, and p_c is the class prior. LDA optimizes the ratio of the between-class scatter S_b to the within-class scatter S_w:

Tᵀ S_b T / Tᵀ S_w T (7)

This maximization can be performed by solving the generalized eigenproblem:

S_b υ = λ S_w υ (8)
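A minimal sketch of equations (5)-(8), solving the generalized eigenproblem with scipy; illustrative only, and the class-prior weighting of Eq. (5) is absorbed into per-class scatter sums here:

```python
import numpy as np
from scipy.linalg import eigh

def lda(X, y, n_components=1):
    """Supervised projection maximizing between- vs. within-class scatter."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * diff @ diff.T
    # generalized eigenproblem S_b v = lambda S_w v (Eq. 8);
    # a tiny ridge keeps S_w positive definite
    eigvals, eigvecs = eigh(Sb, Sw + 1e-8 * np.eye(d))
    T = eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]
    return X @ T

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(3, 1, (50, 3))])
y = np.array([0] * 50 + [1] * 50)
Y = lda(X, y, n_components=1)
```

For C classes, at most C − 1 discriminant directions carry information, which is why LDA yields very low-dimensional projections.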

Kernel PCA

Kernel PCA (K-PCA) is a reformulation of traditional linear PCA in a high-dimensional space constructed using a kernel function [36]. As a result, K-PCA achieves a nonlinear mapping. The kernel matrix K of the data points x_i is calculated by (9):

k_ij = κ(x_i, x_j) (9)

where κ denotes a kernel function [37]. With some adjustment, the kernel matrix K is centered:

k̃_ij = k_ij − (1/n) Σ_l k_il − (1/n) Σ_l k_lj + (1/n²) Σ_{l,m} k_lm (10)

The eigenvectors α_i of the covariance matrix (shaped by κ in the high-dimensional space) follow from the eigenvectors υ_i of the kernel matrix:

α_i = (1/√λ_i) υ_i (11)

The projections onto the eigenvectors of the covariance matrix (i.e., the low-dimensional representation of the data Y) can then be computed by:

Y = {Σ_j α₁⁽ʲ⁾ κ(x_j, x), Σ_j α₂⁽ʲ⁾ κ(x_j, x), …, Σ_j α_d⁽ʲ⁾ κ(x_j, x)} (12)

The performance of K-PCA highly depends on the selection of the kernel function κ (e.g., Gaussian, polynomial, or linear) [37]. Like the PCA technique, the main drawback of K-PCA is the size of the kernel matrix [35], which grows quadratically with the number of data points.
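The kernel construction (9), centering (10), and projection (12) can be sketched as follows, using a Gaussian kernel. This is not the study's code, and the square-root eigenvalue scaling is one common normalization convention:

```python
import numpy as np

def kernel_pca(X, n_components=2, gamma=1.0):
    n = X.shape[0]
    # Eq. (9): Gaussian kernel matrix k_ij = exp(-gamma ||x_i - x_j||^2)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq)
    # Eq. (10): center the kernel matrix
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Project training points onto the leading components (Eq. 12),
    # scaled by sqrt(lambda) (a common normalization choice).
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0))

rng = np.random.default_rng(2)
theta = rng.uniform(0, 2 * np.pi, 80)
X = np.c_[np.cos(theta), np.sin(theta)] + rng.normal(0, 0.05, (80, 2))
Y = kernel_pca(X, n_components=2, gamma=2.0)
```

The gamma parameter here plays the role of the kernel choice discussed above: too small and K-PCA degenerates toward linear PCA, too large and every point becomes dissimilar to every other.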

Classification

Probabilistic neural network

The architecture for this system is demonstrated in Fig. 4 [38].

Fig. 4. Architecture of the PNN.

As shown in Fig. 4, the network consists of three main parts: (1) an input layer, (2) a radial basis layer, and (3) a competitive layer. Consider that Q input/target pairs are involved in the network. Each target vector has K elements, where K is the number of neurons in the second layer; one element of the target vector is 1 and the others are 0. The weights of the first layer (IW1,1) are set to the transpose of the matrix constructed from the Q training pairs. A ||Dist|| function then calculates the closeness between the input vector and each training vector. Its output is multiplied by the bias and sent to the radial basis transfer function, yielding an output vector (O1) whose elements are close to 1 when the input is close to a training vector. The second-layer weights, LW1,2, are set to the target matrix T, in which the row corresponding to the class of each input is 1 and otherwise 0. The multiplication TO1 sums the elements of O1 belonging to each of the K target classes. Finally, the competitive layer, the second-layer transfer function, assigns a 1 to the largest component of this class-score vector and 0's elsewhere.

Briefly, the PNN operates as follows:

  • A feature vector is fed into the input layer.

  • The distance between the input and each weight vector is determined.

  • For each class, these contributions are summed to yield the class probability.

  • A competitive layer selects the maximum of these probabilities, assigning 1 to that class and 0's elsewhere.

One of the crucial steps in the training process of a PNN is the determination of the sigma value (the smoothing parameter), which can be identified by trial and error.
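The decision rule described above (radial basis layer over the training vectors, class-wise summation, competitive argmax) can be sketched as follows, with sigma as the smoothing parameter; this is an illustrative sketch on toy data, not the study's implementation:

```python
import numpy as np

def pnn_predict(X_train, y_train, X_test, sigma=0.1):
    """Assign each test vector to the class with the largest summed
    Gaussian kernel response over that class's training vectors."""
    classes = np.unique(y_train)
    # squared distances between every test and training vector (||Dist||)
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    act = np.exp(-d2 / (2.0 * sigma ** 2))   # radial basis layer
    # class-wise summation of the radial-basis activations
    scores = np.stack([act[:, y_train == c].sum(axis=1) for c in classes],
                      axis=1)
    return classes[np.argmax(scores, axis=1)]  # competitive layer

rng = np.random.default_rng(3)
X_train = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(2, 0.3, (30, 2))])
y_train = np.array([0] * 30 + [1] * 30)
X_test = np.array([[0.0, 0.1], [2.1, 1.9]])
pred = pnn_predict(X_train, y_train, X_test, sigma=0.1)
```

Sweeping sigma (as in the tables below, from 0.1 down to 0.01) trades off smoothing: small sigma makes the classifier behave like nearest neighbor, large sigma blurs class boundaries.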

Results

By studying GSR and ECG signals, the emotional responses of 11 healthy young college students were examined. MP coefficients were extracted from the normalized GSR and ECG signals using three different dictionaries, i.e., Coif5 at level 14, db4 at level 8, and DCT. In the next stage, some statistical measures were calculated from the extracted coefficients: minimum (min), mean, maximum (max), variance (var), and standard deviation (std). To determine whether there are significant differences between the MP coefficients of the emotional states and the rest condition, the nonparametric Wilcoxon rank-sum test was performed. The results of this test are presented in Table 1.

Table 1.

Wilcoxon rank sum test p-values for comparison between emotional states and rest condition of autonomic measures, including ECG and GSR.

Dic Indices ECG, Rest with: GSR, Rest with:
Happiness Sadness Peacefulness Scary Happiness Sadness Peacefulness Scary
Coif5 min 2 × 10−4 1 × 10−4 8.57 × 10−87 1.48 × 10−5 1.57 × 10−290 0 0 0
mean 0.16 0.51 0.11 0.098 1.96 × 10−305 0 0 0
max 0.46 0.039 0.15 0.13 4.54 × 10−282 8.93 × 10−264 0 0
var 3.7 × 10−3 5.20 × 10−8 2.4 × 10−9 7.008 × 10−9 9.37 × 10−117 1.39 × 10−241 2.21 × 10−318 5.31 × 10−280
std 3.7 × 10−3 5.20 × 10−8 2.4 × 10−9 7.008 × 10−9 9.37 × 10−117 1.39 × 10−241 2.21 × 10−318 5.31 × 10−280
db4 min 6.5 × 10−3 2.16 × 10−4 2.62 × 10−4 1.42 × 10−4 5.97 × 10−282 0 0 0
mean 0.12 0.39 0.16 0.26 5.35 × 10−303 0 0 0
max 0.51 0.032 0.10 0.22 3.76 × 10−261 0 6.40 × 10−262 0
var 4.2 × 10−3 5.55 × 10−8 3.15 × 10−9 8.25 × 10−9 8.92 × 10−117 2.26 × 10−318 1.38 × 10−241 5.06 × 10−280
std 4.2 × 10−3 5.55 × 10−8 3.15 × 10−9 8.25 × 10−9 8.92 × 10−117 2.26 × 10−318 1.38 × 10−241 5.06 × 10−280
DCT min 0.008 1.3 × 10−3 1.3 × 10−3 1.16 × 10−4 2.68 × 10−288 0 0 0
mean 0.017 0.16 6.2 × 10−3 9.1 × 10−3 2.25 × 10−305 0 0 0
max 0.50 0.29 0.60 0.98 1.16 × 10−269 1.48 × 10−267 0 0
var 2.9 × 10−3 2.49 × 10−7 1.15 × 10−8 2.26 × 10−8 1.41 × 10−115 1.51 × 10−241 1.34 × 10−317 5.53 × 10−279
std 2.9 × 10−3 2.49 × 10−7 1.15 × 10−8 2.26 × 10−8 1.41 × 10−115 1.51 × 10−241 1.34 × 10−317 5.53 × 10−279

Abbreviation: Dic: Dictionary.

As shown in Table 1, in most cases, the ECG and GSR indices differed significantly between emotional states and rest condition.
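The comparison underlying Table 1 has the following shape; the data here are simulated, not the study's measurements:

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(4)
# e.g., an MP-coefficient feature at rest vs. during an emotional clip
rest_feature = rng.normal(0.0, 1.0, 200)
emotion_feature = rng.normal(0.8, 1.0, 200)

# Nonparametric Wilcoxon rank-sum test between the two conditions
stat, p_value = ranksums(rest_feature, emotion_feature)
significant = p_value < 0.05  # the significance threshold used in the tables
```

The rank-sum test is appropriate here because the MP-coefficient statistics need not be normally distributed.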

Friedman's test was also performed to determine which biosignal features are associated with the three emotional categories (Table 2). In addition, Tukey's honestly significant difference (hsd) criterion was used to take multiple comparisons into account.

Table 2.

Evaluation of the autonomic measures of ECG and GSR by means of the Friedman test (with Tukey post hoc) on three emotional categories.

Dic Feature High arousal (HA)/Neutral (N)/Low arousal (LA) (3A)
Positive valence (PV)/Neutral (N)/Negative valence (NV) (3V)
ECG GSR ECG GSR
Coif5 min 1.31 × 10-18* 0* 2.09 × 10-18* 0*
mean 0.6797 0* 0.6963 0*
max 0.0015* 6.28 × 10-181* 2.62 × 10-4* 2.18 × 10-170*
var 1.37 × 10-14* 8.24 × 10-166* 2.16 × 10-15* 3.24 × 10-182*
std 1.37 × 10-14* 8.24 × 10-166* 2.16 × 10-15* 3.24 × 10-182*
Tukey hsd post hoc test HA vs. N p < 0.01 p < 0.0001 PV vs. N p < 0.001 p < 0.0001
HA vs. LA p < 0.0001 p < 0.0001 PV vs. NV p < 0.0001 p < 0.0001
LA vs. N p > 0.05 p > 0.05 NV vs. N p > 0.05 p > 0.05
db4 min 2.69 × 10-10 0* 1.73 × 10-9* 0*
mean 0.2676 0* 0.0944 0*
max 0.004* 1.12 × 10-163* 0.0039* 4.6 × 10-144*
var 4.86 × 10-15* 8.24 × 10-166* 5.2 × 10-15* 3.24 × 10-182*
std 4.86 × 10-15* 8.24 × 10-166* 5.2 × 10-15* 3.24 × 10-182*
Tukey hsd post hoc test HA vs. N p < 0.01 p < 0.0001 PV vs. N p < 0.01 p < 0.0001
HA vs. LA p < 0.0001 p < 0.0001 PV vs. NV p < 0.0001 p < 0.0001
LA vs. N p > 0.05 p > 0.05 NV vs. N p > 0.05 p > 0.05
DCT min 0.003* 0* 0.0019* 0*
mean 0.0193* 0* 0.0404* 0*
max 0.0438* 4.11 × 10-256* 0.163 3.07 × 10-258*
var 4.32 × 10-11* 1.08 × 10-164* 9.86 × 10-12* 5.43 × 10-181*
std 4.32 × 10-11* 1.08 × 10-164* 9.86 × 10-12* 5.43 × 10-181*
Tukey hsd post hoc test HA vs. N p < 0.01 p < 0.0001 PV vs. N p < 0.01 p < 0.0001
HA vs. LA p < 0.001 p < 0.0001 PV vs. NV p < 0.0001 p < 0.0001
LA vs. N P > 0.05 p > 0.05 NV vs. N p > 0.05 p > 0.05

* indicates the significant differences (p < 0.05).

As shown in Table 2, among all statistical measures extracted from the MP coefficients of ECG, only the mean does not show significant differences between groups. The smallest p-values of the Friedman test indicate that the most significant differences among the three emotional groups were revealed by the GSR measures (Table 2). Post hoc comparison revealed significant differences between high arousal and neutral, as well as between positive valence and neutral, for GSR indices (p < 0.0001) and for ECG features (p < 0.001). Furthermore, significant differences were observed between high and low arousal, and between positive and negative valence, for both signals (p < 0.0001).
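The group comparison of Table 2 can be sketched as follows on simulated repeated measures (three matched arousal conditions per subject); the Tukey hsd post hoc step is omitted, and these are not the study's data:

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(5)
n_subjects = 40
baseline = rng.normal(0, 1, n_subjects)  # per-subject baseline level

# One measurement per subject in each matched condition
high_arousal = baseline + 1.0 + rng.normal(0, 0.3, n_subjects)
neutral = baseline + rng.normal(0, 0.3, n_subjects)
low_arousal = baseline - 0.5 + rng.normal(0, 0.3, n_subjects)

# Friedman test across the three repeated-measures conditions
stat, p_value = friedmanchisquare(high_arousal, neutral, low_arousal)
```

The Friedman test ranks conditions within each subject, so it is robust to the large between-subject baseline differences typical of GSR and ECG features.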

To reduce the dimensionality of the feature space and to discover an optimal feature subset, the LDA, PCA, and K-PCA methods were applied. The resulting features were fed to the PNN. According to the dimensional structure of emotion, classification was performed in: (1) the arousal-valence dimension (5C: happiness, sadness, scary, peacefulness, and the rest condition as a separate class); (2) the arousal dimension (3A: positive arousal, negative arousal, and the rest condition); and (3) the valence dimension (3V: pleasant, unpleasant, and the rest condition). Fig. 5 illustrates the emotion classes schematically. Both subject-dependent and subject-independent emotion classification modes were considered.

Fig. 5. Three emotional categories considered in the study: (A) Five classes of emotion (5C), including positive valence and positive arousal (happiness), positive valence and negative arousal (peacefulness), negative valence and positive arousal (scary), negative valence and negative arousal (sadness), and rest condition (neutral); (B) Three classes of valence (3V), including positive valence (peacefulness and happiness), negative valence (scary and sadness), and rest condition (neutral); (C) Three classes of arousal (3A), including high arousal (happiness and scary), low arousal (peacefulness and sadness), and rest condition (neutral).

To evaluate system performance, the classification accuracies and errors for each category were calculated with different sigma values (0.1, 0.09, 0.08, …, 0.01) in the subject-independent mode. The recognition rates for the three emotional groups are presented in Table 3, Table 4, and Table 5 for Coif5, db4, and DCT, respectively.

Table 3.

Overall classification accuracies, output error, and elapsed time for PNN and Coif5 dictionary (subject-independent).

Sigma 0.1 0.09 0.08 0.07 0.06 0.05 0.04 0.03 0.02 0.01
Signal F. S. Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error
5C ECG LDA 58.14 0.37 62.92 0.31 68.34 0.25 74.80 0.18 82.27 0.13 89.72 0.07 95.58 0.02 99.02 0.002 99.91 0.0002 99.99 0.0001
PCA 86.87 0.09 90.84 0.05 94.21 0.03 96.89 0.01 98.46 0.01 99.42 0.002 99.80 0.0004 99.92 0.0005 99.99 0.0001 100 0
K-PCA 27.32 0.69 27.95 0.65 29.07 0.59 30.76 0.52 33.17 0.46 37.14 0.41 42.71 0.35 51.93 0.28 66.55 0.18 82.94 0.07
GSR LDA 41.61 0.09 41.97 0.06 42.44 0.04 42.93 0.02 43.64 0.003 44.60 0.002 46.34 0.004 48.89 0.01 53.75 0.01 64.21 0.04
PCA 49.77 0.05 50.14 0.06 50.93 0.05 51.89 0.04 53.10 0.05 54.59 0.06 57.07 0.05 60.86 0.04 67.45 0.03 79.53 0.004
K-PCA 32.90 0.82 34.06 0.82 34.52 0.8 34.98 0.77 36.18 0.6 37.16 0.45 38.35 0.31 38.95 0.11 40.04 0.09 41.85 0.06
3V ECG LDA 51.09 0.21 51.87 0.21 53.16 0.20 54.30 0.2 55.77 0.19 57.97 0.19 61.14 0.19 65.94 0.16 73.81 0.12 90.53 0.04
PCA 87.23 0.07 90.39 0.05 93.75 0.03 96.41 0.02 98.27 0.008 99.30 0.002 99.72 0.0009 99.91 0.0003 99.99 0.0001 100 0
K-PCA 47.64 0.15 48.46 0.15 49.47 0.16 50.42 0.16 52.04 0.17 54.06 0.17 57.36 0.17 63.18 0.15 73.58 0.11 86.25 0.05
GSR LDA 57.07 0.11 57.32 0.10 57.78 0.10 58.15 0.09 58.92 0.08 59.97 0.08 62.02 0.07 64.59 0.06 68.65 0.05 77.02 0.04
PCA 62.69 0.04 63.07 0.04 63.81 0.05 64.43 0.05 65.25 0.05 66.62 0.04 68.56 0.04 71.32 0.04 76.27 0.03 85.23 0.01
K-PCA 49.92 0.06 49.91 0.06 50.26 0.07 50.62 0.06 51.28 0.04 51.75 0.04 52.06 0.04 52.35 0.05 53.11 0.14 55.42 0.001
3A ECG LDA 50.77 0.46 51.44 0.43 51.98 0.41 52.89 0.39 54.23 0.36 55.99 0.34 58.98 0.30 63.48 0.24 70.86 0.18 87.20 0.07
PCA 87.56 0.08 90.93 0.06 94.05 0.037 96.48 0.02 98.23 0.009 99.36 0.001 99.77 0.0004 99.94 0.0004 100 0 100 0
K-PCA 48.02 0.48 48.47 0.46 48.86 0.43 50.19 0.39 51.59 0.35 54.03 0.32 57.54 0.38 63.46 0.24 73.34 0.16 86.81 0.08
GSR LDA 58.36 0.15 58.70 0.15 59.02 0.14 59.78 0.13 60.53 0.11 61.58 0.11 62.77 0.09 64.87 0.07 68.68 0.06 78.05 0.04
PCA 64.11 0.09 64.44 0.09 64.98 0.09 65.53 0.08 66.38 0.08 67.51 0.07 69.37 0.08 71.89 0.07 76.90 0.05 85.53 0.02
K-PCA 52.58 0.43 52.84 0.43 52.95 0.42 53.04 0.42 53.00 0.43 53.34 0.42 53.59 0.42 54.19 0.35 56.97 0.15 58.32 0.05

Abbreviations: 5C: 5 Classes of Emotions; 3V: 3 Classes of Valence; 3A: 3 Classes of Arousal; K-PCA: Kernel PCA; F. S.: Feature selection; Acc: Accuracy.

Table 4.

Overall classification accuracies, output error, and elapsed time for PNN and db4 dictionary (subject-independent).

Sigma 0.1 0.09 0.08 0.07 0.06 0.05 0.04 0.03 0.02 0.01
Signal F. S. Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error
5C ECG LDA 62.06 0.30 67.30 0.25 73.38 0.19 80.14 0.14 87.41 0.07 93.63 0.03 97.88 0.008 99.59 0.0002 99.96 0.0004 100 0
PCA 88.54 0.07 92.30 0.04 95.43 0.03 97.57 0.02 98.88 0.006 99.61 0.002 99.89 0.001 99.99 0.0001 100 0 100 0
K-PCA 27.38 0.78 28.30 0.72 29.34 0.67 30.72 0.62 32.91 0.57 36.32 0.51 41.34 0.44 50.11 0.32 64.32 0.22 82.61 0.07
GSR LDA 40.46 0.09 40.93 0.09 41.40 0.1 41.99 0.1 42.94 0.08 44.62 0.02 46.07 0.003 48.57 0.005 52.93 0.02 63.19 0.03
PCA 49.54 0.07 50.06 0.06 50.68 0.05 51.64 0.04 52.74 0.05 54.26 0.06 57.34 0.05 60.96 0.04 67.16 0.02 78.46 0.001
K-PCA 32.94 0.82 34.00 0.82 34.50 0.80 35.06 0.77 36.21 0.60 37.18 0.45 38.34 0.31 38.96 0.11 40.03 0.08 41.80 0.06
3V ECG LDA 51.47 0.22 52.32 0.21 53.17 0.21 54.13 0.20 55.47 0.20 57.56 0.19 61.24 0.18 66.55 0.15 75.56 0.12 91.85 0.03
PCA 88.38 0.06 91.63 0.04 94.60 0.03 96.88 0.01 98.45 0.003 99.54 0.002 99.89 0.0006 99.99 0.0001 100 0 100 0
K-PCA 47.57 0.23 48.07 0.23 49.02 0.23 49.80 0.22 51.15 0.22 53.26 0.21 56.29 0.20 62.01 0.18 71.81 0.14 86.10 0.06
GSR LDA 58.15 0.11 58.87 0.11 59.35 0.10 60.31 0.10 61.24 0.09 62.85 0.08 64.90 0.08 67.93 0.08 72.54 0.06 80.93 0.02
PCA 62.51 0.05 62.82 0.04 63.32 0.04 64.19 0.04 65.06 0.04 66.34 0.04 68.58 0.04 71.51 0.04 76.10 0.03 84.36 0.01
K-PCA 49.93 0.06 49.93 0.06 50.29 0.07 50.66 0.06 51.29 0.04 51.75 0.04 52.06 0.04 52.34 0.04 53.07 0.13 55.39 0.003
3A ECG LDA 50.96 0.42 51.55 0.41 52.39 0.39 53.40 0.37 54.84 0.34 57.04 0.31 59.63 0.27 63.97 0.23 72.10 0.17 89.11 0.06
PCA 88.60 0.08 91.75 0.05 94.58 0.04 96.97 0.02 98.62 0.01 99.53 0.002 99.85 0.001 99.99 0.0001 100 0 100 0
K-PCA 47.97 0.53 48.41 0.48 49.27 0.44 50.33 0.40 51.71 0.37 53.49 0.33 56.68 0.30 61.92 0.25 72.07 0.18 86.08 0.08
GSR LDA 58.62 0.07 59.30 0.07 59.93 0.06 60.55 0.07 61.08 0.08 61.82 0.08 62.93 0.08 64.78 0.07 68.08 0.06 76.61 0.03
PCA 63.78 0.09 64.27 0.09 64.77 0.09 65.43 0.08 66.15 0.07 67.35 0.07 69.26 0.08 72.09 0.07 76.66 0.05 84.89 0.02
K-PCA 52.58 0.43 52.78 0.43 52.92 0.43 53.00 0.42 53.01 0.43 53.34 0.42 53.59 0.42 54.17 0.35 56.06 0.13 58.23 0.05

Abbreviations: 5C: 5 Classes of Emotions; 3V: 3 Classes of Valence; 3A: 3 Classes of Arousal; K-PCA: Kernel PCA; F. S.: Feature selection; Acc: Accuracy.

Table 5.

Overall classification accuracies, output error, and elapsed time for PNN and DCT dictionary (subject-independent).

Sigma 0.1 0.09 0.08 0.07 0.06 0.05 0.04 0.03 0.02 0.01
Signal F. S. Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error Acc (%) Error
5C ECG LDA 48.73 0.47 52.51 0.41 56.97 0.35 62.68 0.28 69.57 0.21 77.89 0.15 87.65 0.07 95.78 0.02 99.46 0.004 100 0
PCA 84.69 0.09 88.85 0.06 92.89 0.04 95.95 0.02 98.12 0.006 99.35 0.002 99.83 0.0009 99.96 0.0001 100 0 100 0
K-PCA 28.11 0.93 28.96 0.85 30.14 0.76 31.15 0.69 33.67 0.62 37.08 0.54 43.03 0.44 52.74 0.31 68.10 0.17 86.17 0.03
GSR LDA 42.85 0.08 43.56 0.08 43.90 0.06 44.69 0.05 44.84 0.03 45.45 0.02 46.08 0.01 47.40 0.04 48.68 0.06 51.91 0.07
PCA 47.60 0.006 48.44 0.02 48.94 0.02 49.31 0.03 50.02 0.03 51.22 0.05 52.80 0.04 55.39 0.05 59.79 0.05 70.43 0.02
K-PCA 32.88 0.82 34.07 0.81 34.46 0.80 35.00 0.76 36.12 0.61 37.19 0.45 38.19 0.32 39.16 0.12 40.16 0.09 41.97 0.06
3V ECG LDA 51.94 0.23 52.58 0.23 53.61 0.22 54.54 0.21 56.08 0.20 58.33 0.18 61.81 0.17 66.53 0.15 74.78 0.11 91.09 0.04
PCA 85.99 0.07 89.33 0.05 92.49 0.04 95.39 0.02 97.62 0.01 99.12 0.003 99.76 0.0008 99.94 0.0002 100 0 100 0
K-PCA 48.31 0.30 48.96 0.27 50.04 0.24 50.88 0.22 52.12 0.20 54.20 0.20 57.75 0.19 63.90 0.17 74.59 0.12 88.74 0.05
GSR LDA 56.90 0.03 57.03 0.03 57.50 0.03 57.84 0.03 58.31 0.02 58.78 0.04 59.49 0.06 60.00 0.08 60.94 0.08 64.16 0.06
PCA 60.86 0.08 61.15 0.07 61.68 0.06 62.42 0.05 62.94 0.04 63.80 0.04 65.67 0.04 67.38 0.05 70.47 0.04 78.33 0.03
K-PCA 49.87 0.06 49.88 0.06 50.29 0.07 50.60 0.07 51.23 0.04 51.79 0.04 52.06 0.04 52.50 0.05 53.32 0.15 55.52 0.003
3A ECG LDA 51.25 0.045 51.80 0.43 52.80 0.4 53.74 0.37 55.46 0.34 57.40 0.30 60.31 0.27 64.83 0.22 72.97 0.16 89.76 0.05
PCA 86.16 0.1 89.33 0.07 92.62 0.05 95.63 0.03 97.80 0.01 99.10 0.003 99.74 0.0007 99.95 0.0002 99.99 0.0001 100 0
K-PCA 48.50 0.47 48.81 0.45 49.30 0.42 50.01 0.4 51.38 0.38 53.60 0.34 57.59 0.29 63.93 0.24 74.99 0.15 88.93 0.05
GSR LDA 57.36 0.14 57.72 0.14 58.86 0.13 59.57 0.12 60.41 0.10 61.09 0.1 61.52 0.1 62.15 0.10 62.60 0.09 64.49 0.07
PCA 63.02 0.1 63.31 0.1 63.78 0.09 64.16 0.09 64.70 0.09 65.18 0.09 66.30 0.08 67.96 0.07 71.40 0.08 78.89 0.05
K-PCA 52.56 0.44 52.75 0.43 52.92 0.43 53.00 0.43 53.01 0.43 53.29 0.42 53.51 0.42 54.34 0.35 57.44 0.16 58.49 0.05

Abbreviations: 5C: 5 Classes of Emotions; 3V: 3 Classes of Valence; 3A: 3 Classes of Arousal; K-PCA: Kernel PCA; F. S.: Feature selection; Acc: Accuracy.

According to the results (Table 3, Table 4, Table 5), applying PCA yields better classifier performance than LDA and K-PCA. In addition, the best recognition rates were achieved at sigma = 0.01, especially for ECG signals; in that setting, the maximum classification rate of 100% was reached for all emotional categories (5C, 3V, and 3A).
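The PNN behind these tables is essentially a Parzen-window classifier whose single smoothing parameter sigma sets how sharply each training exemplar votes; sweeping sigma from 0.1 down to 0.01 narrows the Gaussian kernels. A minimal sketch of that idea (a generic PNN, not the authors' implementation; the helper name `pnn_predict` is ours):

```python
import numpy as np

def pnn_predict(X_train, y_train, X_test, sigma):
    """Probabilistic neural network: assign each test sample to the class
    whose training exemplars give the largest mean Gaussian (Parzen)
    kernel activation."""
    classes = np.unique(y_train)
    preds = []
    for x in X_test:
        d2 = np.sum((X_train - x) ** 2, axis=1)    # squared distances (pattern layer)
        act = np.exp(-d2 / (2.0 * sigma ** 2))     # kernel activations
        scores = [act[y_train == c].mean() for c in classes]  # summation layer
        preds.append(classes[int(np.argmax(scores))])         # decision layer
    return np.array(preds)

# Toy check: two well-separated classes
X_train = np.array([[0.0, 0.0], [0.0, 0.1], [1.0, 1.0], [1.0, 0.9]])
y_train = np.array([0, 0, 1, 1])
preds = pnn_predict(X_train, y_train, np.array([[0.05, 0.0], [0.95, 1.0]]), sigma=0.1)
```

Smaller sigma makes the decision surface hug the training exemplars more tightly, which is consistent with the accuracies in the tables rising monotonically as sigma shrinks toward 0.01.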

Among the MP dictionaries, better recognition rates were achieved with the wavelet packet dictionaries (Coif5 and db4).

To compare the three dictionaries more easily, a graphical representation was provided for the best PNN structure (sigma = 0.01). Fig. 6 summarizes the classification accuracies for the DCT, db4, and Coif5 dictionaries.

Fig. 6.


Total emotion recognition rates for DCT, db4, and Coif5 dictionaries for sigma = 0.01. The outliers are plotted individually using the '+' symbol (an outlier is an observation that is numerically distant from the rest of the data).

As shown in Fig. 6, higher emotion recognition rates with lower variation were achieved for wavelet packet dictionaries.

The performance of the feature selection techniques is shown schematically in Fig. 7.

Fig. 7.


Emotion recognition rates applying the K-PCA, LDA, and PCA methods to ECG and GSR using different MP dictionaries (sigma = 0.01); the y-axis shows recognition rates.

Applying PCA, the maximum accuracy of 100% was reached for ECG signals in all emotion classes and all dictionaries. In addition, K-PCA showed lower emotion recognition rates than either PCA or LDA.

Because emotional experience includes a subjective, conscious component, a subject-dependent classification was also examined. In this case, PNN classification (with sigma = 0.01, which gave the best rates in the subject-independent mode) was performed on the features extracted from each subject separately. Since the best subject-independent results were achieved with PCA, that method was also used for the subject-dependent evaluation. Table 6, Table 7, Table 8 summarize the subject-dependent classification performances for the Coif5, db4, and DCT dictionaries, respectively.
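The subject-dependent protocol amounts to fitting and scoring a model on each subject's own trials, instead of pooling all subjects. The sketch below uses a leave-one-trial-out split within each subject; that split strategy, the helper names, and the nearest-neighbour stand-in classifier are illustrative assumptions, not the paper's stated procedure:

```python
import numpy as np

def subject_dependent_accuracy(X, y, subject_ids, classify):
    """Per-subject evaluation: for each subject, hold out one of their
    trials at a time, train on that subject's remaining trials, and
    report the per-subject accuracy.
    `classify(X_tr, y_tr, X_te)` returns predicted labels."""
    accs = {}
    for s in np.unique(subject_ids):
        idx = np.flatnonzero(subject_ids == s)
        correct = 0
        for i in idx:
            train = idx[idx != i]                     # leave trial i out
            pred = classify(X[train], y[train], X[i:i + 1])[0]
            correct += int(pred == y[i])
        accs[s] = correct / len(idx)
    return accs

# Tiny demo with a nearest-neighbour stand-in for the PNN
def nn_classify(X_tr, y_tr, X_te):
    d = ((X_tr[None, :, :] - X_te[:, None, :]) ** 2).sum(-1)
    return y_tr[d.argmin(axis=1)]

X = np.array([[0.0], [0.1], [1.0], [1.1], [0.0], [0.05], [1.0], [0.95]])
y = np.array([0, 0, 1, 1, 0, 0, 1, 1])
subjects = np.array([1, 1, 1, 1, 2, 2, 2, 2])
accs = subject_dependent_accuracy(X, y, subjects, nn_classify)
```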

Table 6.

Overall classification accuracy, sensitivity, and specificity using PCA and PNN for Coif5 dictionary (subject-dependent).

3A 3V 5C
Signal Subject Acc (%) Sen (%) Spec (%) Acc (%) Sen (%) Spec (%) Acc (%) Sen (%) Spec (%)
ECG 1 100 100 100 100 100 100 100 100 100
2 100 100 100 100 100 100 100 100 100
3 100 100 100 100 100 100 100 100 100
4 100 100 100 100 100 100 100 100 100
5 100 100 100 100 100 100 100 100 100
6 100 100 100 100 100 100 100 100 100
7 100 100 100 100 100 100 100 100 100
8 100 100 100 100 100 100 100 100 100
9 100 100 100 100 100 100 100 100 100
10 100 100 100 100 100 100 100 100 100
11 100 100 100 100 100 100 100 100 100
GSR 1 94.84 99.46 99.84 95.24 99.46 99.84 93.34 99.46 99.84
2 95.92 92.43 99.84 96.13 91.89 99.84 94.5 92.43 99.84
3 93.48 99.46 100 94.23 99.46 100 92.46 99.46 100
4 93.14 97.3 99.22 93.34 97.3 99.22 91.24 97.3 99.22
5 99.46 100 100 99.52 100 100 99.39 100 100
6 97.49 100 100 96.94 100 100 95.86 100 100
7 92.93 100 100 95.65 100 100 91.92 100 100
8 89.74 100 100 90.96 100 100 86.89 100 100
9 90.08 100 99.92 85.46 100 99.92 83.9 100 99.92
10 92.53 100 100 93.07 100 100 89.4 100 100
11 99.66 99.46 99.92 99.52 98.92 99.92 99.59 99.46 99.92

Abbreviations: 5C: 5 Classes of Emotions; 3V: 3 Classes of Valence; 3A: 3 Classes of Arousal; Acc: Accuracy; Sen: Sensitivity; Spec: Specificity.

Table 7.

Overall classification accuracy, sensitivity, and specificity using PCA and PNN for db4 dictionary (subject-dependent).

3A 3V 5C
Signal Subject Acc (%) Sen (%) Spec (%) Acc (%) Sen (%) Spec (%) Acc (%) Sen (%) Spec (%)
ECG 1 100 100 100 100 100 100 100 100 100
2 100 100 100 100 100 100 100 100 100
3 100 100 100 100 100 100 100 100 100
4 100 100 100 100 100 100 100 100 100
5 100 100 100 100 100 100 100 100 100
6 100 100 100 100 100 100 100 100 100
7 100 100 100 100 100 100 100 100 100
8 100 100 100 100 100 100 100 100 100
9 100 100 100 100 100 100 100 100 100
10 100 100 100 100 100 100 100 100 100
11 100 100 100 100 100 100 100 100 100
GSR 1 96.06 100 99.84 95.65 100 99.84 93.95 100 99.84
2 96.06 92.97 99.84 96.33 91.35 99.84 94.84 92.97 99.84
3 93.89 100 100 94.9 100 100 92.32 100 100
4 93.21 98.38 98.91 93.75 98.38 98.91 91.37 98.38 98.91
5 99.05 98.92 100 98.85 98.38 100 98.64 98.92 100
6 97.76 100 100 96.54 100 100 95.86 100 100
7 92.6 100 100 95.38 100 100 91.71 100 100
8 90.9 100 100 91.3 100 100 86.89 100 100
9 90.22 100 100 85.53 100 100 83.77 100 100
10 92.93 100 100 93.27 100 100 89.54 100 100
11 99.52 99.46 99.92 99.39 98.92 99.92 99.46 99.46 99.92

Abbreviations: 5C: 5 Classes of Emotions; 3V: 3 Classes of Valence; 3A: 3 Classes of Arousal; Acc: Accuracy; Sen: Sensitivity; Spec: Specificity.

Table 8.

Overall classification accuracy, sensitivity, and specificity using PCA and PNN for DCT dictionary (subject-dependent).

3A 3V 5C
Signal Subject Acc (%) Sen (%) Spec (%) Acc (%) Sen (%) Spec (%) Acc (%) Sen (%) Spec (%)
ECG 1 100 100 100 100 100 100 100 100 100
2 100 100 100 100 100 100 100 100 100
3 100 100 100 100 100 100 100 100 100
4 100 100 100 100 100 100 100 100 100
5 100 100 100 100 100 100 100 100 100
6 100 100 100 100 100 100 100 100 100
7 100 100 100 100 100 100 100 100 100
8 100 100 100 100 100 100 100 100 100
9 100 100 100 100 100 100 100 100 100
10 100 100 100 100 100 100 100 100 100
11 100 100 100 100 100 100 100 100 100
GSR 1 91.44 98.38 99.77 92.32 98.38 99.77 88.65 98.38 99.77
2 93 84.32 99.61 94.16 83.24 99.61 91.3 84.86 99.61
3 92.32 99.46 100 93.48 99.46 100 90.56 99.46 100
4 90.29 94.59 98.91 91.17 94.59 98.91 88.04 94.59 98.91
5 97.15 94.05 100 96.88 94.05 100 96.47 94.05 100
6 95.52 98.92 99.77 95.52 98.92 99.77 93.48 98.92 99.77
7 92.19 100 99.92 95.24 100 99.92 90.96 100 99.92
8 85.8 100 100 86.21 100 100 78.94 100 100
9 89.27 99.46 99.77 84.78 99.46 99.77 82.61 99.46 99.77
10 91.64 99.46 99.84 91.92 99.46 99.84 87.84 99.46 99.84
11 98.78 97.84 99.61 98.91 97.84 99.61 98.98 98.92 99.53

Abbreviations: 5C: 5 Classes of Emotions; 3V: 3 Classes of Valence; 3A: 3 Classes of Arousal; Acc: Accuracy; Sen: Sensitivity; Spec: Specificity.

Using ECG parameters, the average classification rate was 100% for all MP dictionaries and all emotional classes (3A, 3V, and 5C). For GSR parameters, mean accuracy rates of 94.48%, 94.55%, and 92.52% were achieved for 3A, 3V, and 5C with Coif5; 94.75%, 94.63%, and 92.58% with db4; and 92.49%, 92.78%, and 89.8% with DCT, respectively. Again, the mean emotion recognition rates were higher for the wavelet packet dictionaries than for DCT.
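As a quick arithmetic check, the per-subject GSR accuracies transcribed from Table 6 (Coif5, subject-dependent) do average to the 3A and 3V figures quoted above:

```python
# Per-subject GSR accuracies transcribed from Table 6 (Coif5, subject-dependent)
acc_3a = [94.84, 95.92, 93.48, 93.14, 99.46, 97.49, 92.93, 89.74, 90.08, 92.53, 99.66]
acc_3v = [95.24, 96.13, 94.23, 93.34, 99.52, 96.94, 95.65, 90.96, 85.46, 93.07, 99.52]
mean_3a = round(sum(acc_3a) / len(acc_3a), 2)  # 94.48, as reported
mean_3v = round(sum(acc_3v) / len(acc_3v), 2)  # 94.55, as reported
```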

Discussion and conclusions

The aim of the current study was to offer a new methodology for emotion recognition based on ECG and GSR signals. After collecting GSR and ECG signals from 11 healthy subjects while they listened to emotional music, an efficient emotion recognition framework was proposed based on the MP algorithm with wavelet (coif5 at level 14, db4 at level 8) and DCT dictionaries. In addition, feature selection techniques based on traditional linear methodology (LDA and PCA) and a global nonlinear approach (K-PCA) were examined. In subject-independent classification, the proposed ECG features outperformed the GSR features. Furthermore, the wavelet packet dictionaries gave higher classification rates than the DCT dictionary. With PCA, a classification accuracy of 100% was achieved for the ECG signal (sigma = 0.01) in all emotional categories (5C, 3V, and 3A), whereas the lowest emotion recognition rates were observed with K-PCA (Fig. 7). The subject-dependent emotion recognition scheme likewise reached an accuracy of 100% for the ECG characteristics. Compared with the results of previous studies, an accuracy of 100% for discriminating 5 classes in 11 subjects clearly demonstrates the potential of the proposed technique.
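The core MP step is greedy: at each iteration, the dictionary atom most correlated with the current residual is selected, its coefficient recorded, and its contribution subtracted. A minimal sketch with a hand-built orthonormal DCT dictionary (the paper's coif5/db4 wavelet-packet dictionaries would simply be different atom matrices):

```python
import numpy as np

def dct_dictionary(n):
    """Orthonormal DCT-II basis, one atom per column of an n x n matrix."""
    k = np.arange(n)
    D = np.cos(np.pi * (k[:, None] + 0.5) * k[None, :] / n) * np.sqrt(2.0 / n)
    D[:, 0] = np.sqrt(1.0 / n)  # DC atom gets the smaller normalization
    return D

def matching_pursuit(signal, D, n_atoms):
    """Greedy MP: pick the unit-norm atom with the largest inner product
    with the residual, accumulate its coefficient, and subtract."""
    residual = signal.astype(float).copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        corr = D.T @ residual                 # correlation with every atom
        j = int(np.argmax(np.abs(corr)))      # best-matching atom
        coeffs[j] += corr[j]
        residual -= corr[j] * D[:, j]         # remove its contribution
    return coeffs, residual

# A signal made of a single atom is recovered in one iteration
D = dct_dictionary(8)
signal = 2.5 * D[:, 3]
coeffs, residual = matching_pursuit(signal, D, n_atoms=1)
```

In the paper's pipeline, statistical indices are then computed over the MP coefficients before dimensionality reduction and classification.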

Designing an accurate emotion recognition system is a crucial but challenging task, and researchers have addressed it with a variety of physiological signals and methods. Table 9 compares the proposed algorithm with conventional emotion recognition approaches in terms of number of subjects, number of emotional classes, type of stimuli, physiological signals, methodology, and emotion recognition rate.

Table 9.

Comparison between previous achievements on emotion recognition using physiological signals and the result of this study.

Publication Subjects Number of classes Stimuli Signal Method Maximum Accuracy rate
[13] 3 4 Music ECG, SC, EMG, and RSP time/frequency, entropy, geometric analysis, sub-band spectra, multi-scale entropy, and extended linear discriminant analysis as a classifier 70%
[14] 5 2 Music EEG frequency based features and their combination, SVM, and Linear dynamic system 81.03%
[17] 26 4 Music EEG frequency domain features 82.29%
[19] 25 4 Music FBS EEG spectrum and time-domain characteristics of FBS signals 87.05%
[20] 25 4 Music FBS, ECG feature-level fusion and naive-Bayes decision level fusion 89.24%
[21] 44 2 & 3 Picture & video game ECG Hilbert-Huang transform and linear discriminants 89%
[22] 27 8 AutoTutor ECG, EMG, and GSR Statistical features, k-nearest neighbor and linear Bayes normal classifiers Not applicable
[23] 11 3 Movie ECG, GSR, BVP, respiration, pulse R–R interval of ECG, GSR, peak of BVP, and peak of pulse and SVR 89.2%
[24] 60 6 Audio-visual ECG Hurst, HOS, KNN, Fuzzy KNN 92.87%
[25] 35 5 Picture ECG, GSR, RSP Standard and nonlinear features fed to QDC >90%
[26] 30 4 Picture Heart rate instantaneous spectrum, bispectrum, LE, and SVM 79.29%
This study 11 5 Music ECG, GSR MP, PCA, PNN 100%

In the current study, three emotional categorizations were adopted: 5C, 3V, and 3A. The results showed that the best recognition rates were achieved at sigma = 0.01 for all categories. Using PCA, a mean classification rate of 100% was reached. Which dimensionality reduction method yields the highest discrimination rates depends on the characteristics of the feature vector and the data; in general, no feature selection method is guaranteed to give the best classification performance. In this study, the best classification results were achieved using ECG parameters with PCA. One benefit of PCA is that it is computationally less expensive than methods such as kernel PCA.
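The cost advantage comes from the fact that linear PCA only needs an SVD of the mean-centered n x d feature matrix, whereas kernel PCA must build and eigendecompose an n x n kernel matrix over all samples. A minimal PCA projection, as a sketch (`pca_reduce` is our illustrative name):

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components via SVD
    of the mean-centered data matrix (singular values come out sorted,
    so the first output column carries the most variance)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Demo: anisotropic data, reduced from 6 features to 2 components
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 6)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1, 0.01])
Z = pca_reduce(X, 2)
```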

The results of this study confirmed the effectiveness of the matching pursuit algorithm with wavelet dictionaries in differentiating music-induced emotions using physiological signals. In addition, the classification performance of ECG features was higher than that of GSR features. These results confirm that better outcomes are obtained when the dictionary matches the signal characteristics. It has previously been shown that different components of the ANS are activated in emotions. The heart is innervated by both the sympathetic and parasympathetic nervous systems, which speed and slow the heart rhythm, whereas the skin response is innervated by the sympathetic system alone. In addition, it is easier to obtain a robust ECG with electrodes on the hand than GSR from the fingers. In this study, several mother wavelets, such as Coif5, were evaluated; the major advantage of Coif5 is its close convergence to the morphology of the ECG, while GSR carries useful information only in a particular frequency range. It may therefore be expected that indices derived from GSR are not as accurate as those derived from ECG for emotion recognition based on a 2D emotional space. These results are in line with previously reported articles [39], [40]: Seoane et al. showed that ECG is a better indicator of emotion than GSR [39], and Palanisamy et al. likewise reported lower emotion classification accuracy for GSR than for ECG [40].

Our results also indicated that emotion recognition using GSR features varied considerably across subjects. Moreover, depending on the MP dictionary, the feasibility of determining emotional classes differed along the valence and arousal dimensions. On average, the results indicated that valence was more easily detectable than arousal. This finding is in line with previous reports on subjective recognition [24], although the opposite result has been reported in some studies [16].

In terms of their physiological characteristics, ECG and GSR reflect different complex components involved in the expression of emotional or psychological events: GSR is influenced by sympathetic function alone, whereas ECG is influenced by both sympathetic and parasympathetic function [41]. On the other hand, sympathetic-linked reactivity has been associated with emotional arousal, and GSR is one of the best indices of arousal [42]. Since the proposed system detected valence more easily, it appears that the emotional ANS dynamics conveyed through both sympathetic and parasympathetic pathways are captured by the ECG parameters.

The high efficiency of the proposed emotion recognition system suggests that it could be applied in clinical settings. Since the physiological responses of schizophrenic or autistic patients may provide information about illness risk and recovery, future work should examine the application of the current system to patients with emotional impairments or disturbances, such as schizophrenia or autism.

Conflicts of interest

The authors have no conflicts of interest relevant to this article.

Acknowledgements

We gratefully acknowledge Computational Neuroscience Laboratory, where the data were collected and all the subjects volunteered for the study.

Footnotes

Peer review under responsibility of Chang Gung University.

References

  • 1. Shahani B., Halperin J., Boulu P., Cohen J. Sympathetic skin response - a method of assessing unmyelinated axon dysfunction in peripheral neuropathies. J Neurol Neurosurg Psychiatr. 1984;47:536–542. doi: 10.1136/jnnp.47.5.536.
  • 2. Kreibig S.D. Autonomic nervous system activity in emotion: a review. Biol Psychol. 2010;84:394–421. doi: 10.1016/j.biopsycho.2010.03.010.
  • 3. Levenson R.W. The autonomic nervous system and emotion. Emot Rev. 2014;6:100–112.
  • 4. Ritz T., Thöns M., Fahrenkrug S., Dahme B. Airways, respiration, and respiratory sinus arrhythmia during picture viewing. Psychophysiology. 2005;42:568–578. doi: 10.1111/j.1469-8986.2005.00312.x.
  • 5. Park S., Kim K. Physiological reactivity and facial expression to emotion-inducing films in patients with schizophrenia. Arch Psychiatr Nurs. 2011;25:e37–e47. doi: 10.1016/j.apnu.2011.08.001.
  • 6. Drusch K., Stroth S., Kamp D., Frommann N., Wolwer W. Effects of training of affect recognition on the recognition and visual exploration of emotional faces in schizophrenia. Schizophr Res. 2014;159:485–490. doi: 10.1016/j.schres.2014.09.003.
  • 7. Hazlett R. Measuring emotional valence during interactive experiences: boys at video game play. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: 22–27 April 2006, Montréal, Québec, Canada. 2006. pp. 1023–1026.
  • 8. Yannakakis G., Hallam J. Entertainment modeling through physiology in physical play. Int J Hum Comput Stud. 2008;66:741–755.
  • 9. Wacker M., Witte H. Time-frequency techniques in biomedical signal analysis: a tutorial review of similarities and differences. Methods Inf Med. 2013;52:279–296. doi: 10.3414/ME12-01-0083.
  • 10. Baumgartner C., Blinowska K., Cichocki A., Dickhaus H., Durka P., McClintock P. Discussion of "Time-frequency techniques in biomedical signal analysis: a tutorial review of similarities and differences". Methods Inf Med. 2013;52:297–307.
  • 11. Durka P., Blinowska K. Analysis of EEG transients by means of matching pursuit. Ann Biomed Eng. 1995;23:608–611. doi: 10.1007/BF02584459.
  • 12. Bardonova J., Provaznik I., Novakova M. Matching pursuit decomposition for detection of frequency changes in experimental data - application to heart signal recording analysis. Scr Medica BRNO. 2006;79:279–288.
  • 13. Sommermeyer D., Schwaibold M., Scholler B., Grote L., Hedner J., Bolz A. Detection of sleep disorders by a modified Matching Pursuit algorithm. In: World Congress on Medical Physics and Biomedical Engineering: 7–12 September 2009, Munich, Germany. 2009. pp. 1271–1274.
  • 14. Pantelopoulos A., Bourbakis N. Efficient single-lead ECG beat classification using matching pursuit based features and an artificial neural network. In: 10th IEEE International Conference on Information Technology and Applications in Biomedicine (ITAB): 3–5 Nov. 2010, Corfu. 2010. pp. 1–4.
  • 15. Hong-xin Z., Can-feng C., Yan-ling W., Pei-hua L. Decomposition and compression for ECG and EEG signals with sequence index coding method based on matching pursuit. J China Univ Posts Telecommun. 2012;19:92–95.
  • 16. Kim J., Andre E. Emotion recognition based on physiological changes in music listening. IEEE Trans Pattern Anal Mach Intell. 2008;30:2067–2083. doi: 10.1109/TPAMI.2008.26.
  • 17. Duan R.-N., Wang X.-W., Lu B.-L. EEG-based emotion recognition in listening music by using support vector machine and linear dynamic system. In: 19th International Conference on Neural Information Processing, ICONIP: 12–15 November 2012, Doha, Qatar. 2012. pp. 468–475.
  • 18. Lin Y.-P., Wang C.-H., Wu T.-L., Jeng S.-K., Chen J.-H. Support vector machine for EEG signal classification during listening to emotional music. In: IEEE 10th Workshop on Multimedia Signal Processing: 8–10 Oct. 2008, Cairns, Qld. 2008. pp. 127–130.
  • 19. Lin Y.-P., Wang C.-H., Wu T.-L., Jeng S.-K., Chen J.-H. Multilayer perceptron for EEG signal classification during listening to emotional music. In: IEEE International Region 10 Conference: 30 Oct.–2 Nov. 2007, Taipei, Taiwan. 2007. pp. 1–3.
  • 20. Lin Y.-P., Wang C.-H., Jung T.-P., Wu T.-L., Jeng S.-K., Duann J.-R. EEG-based emotion recognition in music listening. IEEE Trans Biomed Eng. 2010;57:1798–1806. doi: 10.1109/TBME.2010.2048568.
  • 21. Naji M., Firoozabadi M., Azadfallah P. Classification of music-induced emotions based on information fusion of forehead biosignals and electrocardiogram. Cogn Comput. 2014;6:241–252.
  • 22. Naji M., Firoozabadi M., Azadfallah P. Emotion classification during music listening from forehead biosignals. SIViP. 2015;9:1365–1375.
  • 23. Naji M., Firoozabadi M., Azadfallah P. A new information fusion approach for recognition of music-induced emotions. In: IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI): 1–4 June 2014, Valencia. 2014. pp. 205–208.
  • 24. Agrafioti F., Hatzinakos D., Anderson A.K. ECG pattern analysis for emotion detection. IEEE Trans Affect Comput. 2012;3:102–115.
  • 25. AlZoubi O., D'Mello S.K., Calvo R.A. Detecting naturalistic expressions of nonbasic affect using physiological signals. IEEE Trans Affect Comput. 2012;3:298–310.
  • 26. Chang C.Y., Chang C.W., Zheng J.Y., Chung P.C. Physiological emotion analysis using support vector regression. Neurocomputing. 2013;122:79–87.
  • 27. Jerritta S., Murugappan M., Wan K., Yaacob S. Classification of emotional states from electrocardiogram signals: a non-linear approach based on hurst. Biomed Eng Online. 2013;12:44. doi: 10.1186/1475-925X-12-44.
  • 28. Valenza G., Lanata A., Scilingo E. The role of nonlinear dynamics in affective valence and arousal recognition. IEEE Trans Affect Comput. 2012;3:237–249.
  • 29. Valenza G., Citi L., Lanata A., Scilingo E., Barbieri R. Revealing real-time emotional responses: a personalized assessment based on heartbeat dynamics. Sci Rep. 2014;4:4998. doi: 10.1038/srep04998.
  • 30. World Medical Association. World Medical Association Declaration of Helsinki: ethical principles for medical research involving human subjects. JAMA. 2013;310:2191–2194. doi: 10.1001/jama.2013.281053.
  • 31. Vieillard S., Peretz I., Gosselin N., Khalfa S., Gagnon L., Bouchard B. Happy, sad, scary and peaceful musical excerpts for research on emotions. Cogn Emot. 2008;22:720–752.
  • 32. Goshvarpour A., Abbasi A., Goshvarpour A. Evaluating autonomic parameters: the role of sleep duration in emotional responses to music. Iran J Psychiatry. 2016;11:59–63.
  • 33. Mallat S.G., Zhang Z. Matching pursuits with time-frequency dictionaries. IEEE Trans Sig Proc. 1993;41:3397–3415.
  • 34. Duda R.O., Hart P.E., Stork D.G. Pattern classification. 2nd ed. New York: Wiley; 2001.
  • 35. van der Maaten L. An introduction to dimensionality reduction using MATLAB. Maastricht University, The Netherlands; 2007. Report MICC 07–07.
  • 36. Scholkopf B., Smola A., Muller K. Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 1998;10:1299–1319.
  • 37. Shawe-Taylor J., Christianini N. Kernel methods for pattern analysis. Cambridge, UK: Cambridge University Press; 2004.
  • 38. Zhang X. Neural network-based classification of single phase distribution transformer fault data. Honors Research Fellows and Undergraduate Research Scholars Theses. Texas A&M University; 2006.
  • 39. Seoane F., Mohino-Herranz I., Ferreira J., Alvarez L., Buendia R., Ayllón D. Wearable biomedical measurement systems for assessment of mental stress of combatants in real time. Sensors. 2014;14:7120–7141. doi: 10.3390/s140407120.
  • 40. Palanisamy K., Murugappan M., Yaacob S. Multiple physiological signal-based human stress identification using non-linear classifiers. Electron Electr Eng. 2013;19:80–85.
  • 41. Zeraoulia E. Models and applications of chaos theory in modern sciences. Enfield, New Hampshire: CRC Press; 2011.
  • 42. Bach D.R. Sympathetic nerve activity can be estimated from skin conductance responses. NeuroImage. 2014;84:122–123. doi: 10.1016/j.neuroimage.2013.08.030.

Articles from Biomedical Journal are provided here courtesy of Chang Gung University
