Table 3.
Feature generation method | SVM performance metric | Tokens in utterance ≥ 1 (patient N = 1731; nurse N = 1763) | Tokens in utterance ≥ 5 (patient N = 1068; nurse N = 1290) | Tokens in utterance ≥ 30 (patient N = 274; nurse N = 358) | Tokens in utterance ≥ 50 (patient N = 135; nurse N = 195) |
---|---|---|---|---|---|
Performance of individual feature generation methods (no feature selection method was used) | | | | | |
TF-IDF | AUC-ROC | 71.26 ± 1.51 | 79.45 ± 0.63 | 89.7 ± 1.93 | 91.85 ± 1.95 |
TF-IDF | F1-score | 67.64 ± 1.59 | 70.88 ± 2.09 | 80.65 ± 2.04 | 81.78 ± 4.28 |
LIWC | AUC-ROC | 70.57 ± 1.31 | 76.01 ± 76.01 | 88.53 ± 3.45 | 87.67 ± 2.77 |
LIWC | F1-score | 65.27 ± 1.16 | 68.92 ± 1.55 | 80.37 ± 5.41 | 78.1 ± 2.53 |
Word2Vec | AUC-ROC | 72.71 ± 1.41 | 79.1 ± 1.51 | 88.1 ± 3.21 | 87.58 ± 2.6 |
Word2Vec | F1-score | 67.16 ± 0.88 | 70.82 ± 1.9 | 80.44 ± 3.05 | 75.56 ± 3.52 |
Unigram | AUC-ROC | 65.99 ± 2.07 | 70.95 ± 1.83 | 78.69 ± 3.79 | 81.88 ± 3.48 |
Unigram | F1-score | 67.52 ± 1.49 | 63.86 ± 1.29 | 69.61 ± 4.22 | 65.82 ± 4.54 |
UMLS | AUC-ROC | 58.71 ± 2.22 | 60.4 ± 2.09 | 68.29 ± 3.66 | 68.15 ± 4.96 |
UMLS | F1-score | 66.68 ± 0.67 | 61.84 ± 1.57 | 60.9 ± 2.81 | 58.05 ± 9.00 |
POS-tagging | AUC-ROC | 51.68 ± 7.48 | 46.01 ± 2.81 | 46.5 ± 4.25 | 48.43 ± 4.59 |
POS-tagging | F1-score | 65.81 ± 0.84 | 62.17 ± 0.38 | 60.16 ± 0.74 | 58.52 ± 0.93 |
Performance of combinations of feature generation methods after selecting the most informative features using the JMIM method | | | | | |
TF-IDF (JMIM) + LIWC (JMIM) | AUC-ROC | 68.98 ± 1.59 | 76.29 ± 2.18 | 87.97 ± 3.09 | 90.66 ± 1.22 |
TF-IDF (JMIM) + LIWC (JMIM) | F1-score | 68.24 ± 2.19 | 70.71 ± 2.03 | 82.67 ± 3.45 | 83.4 ± 4.07 |
TF-IDF (JMIM) + LIWC (JMIM) + Word2Vec (JMIM) | AUC-ROC | 71.72 ± 1.9 | 79.03 ± 2.29 | 90.06 ± 3.48 | 91.85 ± 1.59 |
TF-IDF (JMIM) + LIWC (JMIM) + Word2Vec (JMIM) | F1-score | 67.16 ± 1.53 | 70.19 ± 2.34 | 82.12 ± 3.47 | 80.3 ± 3.5 |
TF-IDF (JMIM) + LIWC (JMIM) + Word2Vec (JMIM) + Unigram (JMIM) | AUC-ROC | 71.75 ± 1.87 | 78.93 ± 2.26 | 89.89 ± 3.43 | 91.68 ± 1.82 |
TF-IDF (JMIM) + LIWC (JMIM) + Word2Vec (JMIM) + Unigram (JMIM) | F1-score | 67.09 ± 1.47 | 70.22 ± 2.32 | 81.88 ± 3.94 | 80.36 ± 3.05 |
TF-IDF (JMIM) + LIWC (JMIM) + Word2Vec (JMIM) + Unigram (JMIM) + UMLS (JMIM) | AUC-ROC | 72.72 ± 1.95 | 79.67 ± 2.07 | 89.94 ± 2.88 | 92.61 ± 2.02 |
TF-IDF (JMIM) + LIWC (JMIM) + Word2Vec (JMIM) + Unigram (JMIM) + UMLS (JMIM) | F1-score | 68.24 ± 1.47 | 70.71 ± 2.33 | 82.67 ± 3.94 | 83.4 ± 3.05 |
Note: The feature generation method that demonstrated the highest performance, as indicated by the AUC-ROC and F1-score values, was highlighted in bold.
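
To make the layout of Table 3 concrete, the following minimal Python sketch shows how the token-count subsets and the two reported metrics could be computed. It is an illustration under stated assumptions, not the authors' pipeline: the table does not say how utterances were tokenized (whitespace splitting is assumed here) or what the ± values represent (a standard deviation across evaluation folds is assumed). All function names (`subset_by_min_tokens`, `auc_roc`, `f1_score`, `summarize`) are hypothetical.

```python
from statistics import mean, stdev

# Minimum token counts used as column thresholds in Table 3.
THRESHOLDS = (1, 5, 30, 50)


def subset_by_min_tokens(utterances, min_tokens):
    """Keep utterances with at least `min_tokens` tokens.

    Assumption: tokens are whitespace-separated; the source does not
    specify the tokenizer.
    """
    return [u for u in utterances if len(u.split()) >= min_tokens]


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a positive example is scored
    above a negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(labels, predictions):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels in {0, 1}."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def summarize(fold_values):
    """Format per-fold fractions as 'mean ± SD' in percent.

    Assumption: the ± in Table 3 is a standard deviation across folds.
    """
    pct = [100 * v for v in fold_values]
    return f"{mean(pct):.2f} ± {stdev(pct):.2f}"
```

Under these assumptions, each cell of the table would correspond to calling `summarize` on the per-fold `auc_roc` or `f1_score` values obtained for one feature set on the subset returned by `subset_by_min_tokens` for one threshold.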