J Am Med Inform Assoc. 2023 Jul 21;30(10):1673–1683. doi: 10.1093/jamia/ocad139

Table 3. SVM classifier performance at utterance level for various utterance lengths

Performance of individual feature generation methods (no feature selection applied):

| Feature generation method | Metric | Tokens ≥ 1 (patient utterances N = 1731; nurse utterances N = 1763) | Tokens ≥ 5 (patient N = 1068; nurse N = 1290) | Tokens ≥ 30 (patient N = 274; nurse N = 358) | Tokens ≥ 50 (patient N = 135; nurse N = 195) |
|---|---|---|---|---|---|
| TF-IDF | AUC-ROC | 71.26 ± 1.51 | 79.45 ± 0.63 | 89.7 ± 1.93 | 91.85 ± 1.95 |
| | F1-score | 67.64 ± 1.59 | 70.88 ± 2.09 | 80.65 ± 2.04 | 81.78 ± 4.28 |
| LIWC | AUC-ROC | 70.57 ± 1.31 | 76.01 ± 76.01 | 88.53 ± 3.45 | 87.67 ± 2.77 |
| | F1-score | 65.27 ± 1.16 | 68.92 ± 1.55 | 80.37 ± 5.41 | 78.1 ± 2.53 |
| Word2Vec | AUC-ROC | 72.71 ± 1.41 | 79.1 ± 1.51 | 88.1 ± 3.21 | 87.58 ± 2.6 |
| | F1-score | 67.16 ± 0.88 | 70.82 ± 1.9 | 80.44 ± 3.05 | 75.56 ± 3.52 |
| Unigram | AUC-ROC | 65.99 ± 2.07 | 70.95 ± 1.83 | 78.69 ± 3.79 | 81.88 ± 3.48 |
| | F1-score | 67.52 ± 1.49 | 63.86 ± 1.29 | 69.61 ± 4.22 | 65.82 ± 4.54 |
| UMLS | AUC-ROC | 58.71 ± 2.22 | 60.4 ± 2.09 | 68.29 ± 3.66 | 68.15 ± 4.96 |
| | F1-score | 66.68 ± 0.67 | 61.84 ± 1.57 | 60.9 ± 2.81 | 58.05 ± 9.00 |
| POS-tagging | AUC-ROC | 51.68 ± 7.48 | 46.01 ± 2.81 | 46.5 ± 4.25 | 48.43 ± 4.59 |
| | F1-score | 65.81 ± 0.84 | 62.17 ± 0.38 | 60.16 ± 0.74 | 58.52 ± 0.93 |

Performance of combinations of feature generation methods after selecting the most informative features with the JMIM method:

| Feature generation method | Metric | Tokens ≥ 1 | Tokens ≥ 5 | Tokens ≥ 30 | Tokens ≥ 50 |
|---|---|---|---|---|---|
| TF-IDF (JMIM) + LIWC (JMIM) | AUC-ROC | 68.98 ± 1.59 | 76.29 ± 2.18 | 87.97 ± 3.09 | 90.66 ± 1.22 |
| | F1-score | 68.24 ± 2.19 | 70.71 ± 2.03 | 82.67 ± 3.45 | 83.4 ± 4.07 |
| TF-IDF (JMIM) + LIWC (JMIM) + Word2Vec (JMIM) | AUC-ROC | 71.72 ± 1.9 | 79.03 ± 2.29 | 90.06 ± 3.48 | 91.85 ± 1.59 |
| | F1-score | 67.16 ± 1.53 | 70.19 ± 2.34 | 82.12 ± 3.47 | 80.3 ± 3.5 |
| TF-IDF (JMIM) + LIWC (JMIM) + Word2Vec (JMIM) + Unigram (JMIM) | AUC-ROC | 71.75 ± 1.87 | 78.93 ± 2.26 | 89.89 ± 3.43 | 91.68 ± 1.82 |
| | F1-score | 67.09 ± 1.47 | 70.22 ± 2.32 | 81.88 ± 3.94 | 80.36 ± 3.05 |
| TF-IDF (JMIM) + LIWC (JMIM) + Word2Vec (JMIM) + Unigram (JMIM) + UMLS (JMIM) | AUC-ROC | 72.72 ± 1.95 | 79.67 ± 2.07 | 89.94 ± 2.88 | 92.61 ± 2.02 |
| | F1-score | 68.24 ± 1.47 | 70.71 ± 2.33 | 82.67 ± 3.94 | 83.4 ± 3.05 |

Note: The feature generation method that achieved the highest performance, as indicated by the AUC-ROC and F1-score values, is highlighted in bold in the original table.
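
For orientation only, the sketch below shows one way metrics of this kind could be computed with scikit-learn: a linear SVM over TF-IDF features, evaluated by stratified 5-fold cross-validation, with AUC-ROC and F1-score reported as mean ± SD across folds. The utterances, labels, vocabulary, and `min_tokens` threshold are hypothetical placeholders rather than the study's data, and the remaining feature sets (LIWC, Word2Vec, UMLS, POS tags) and the JMIM feature selection step are deliberately omitted; this is not the authors' implementation.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical stand-in data: in the study these would be patient/nurse
# utterances drawn from the transcripts, with their class labels.
rng = np.random.default_rng(0)
vocab = ["pain", "medication", "sleep", "nurse", "doctor",
         "feel", "better", "worse", "today", "please"]
utterances = np.array([" ".join(rng.choice(vocab, size=rng.integers(5, 40)))
                       for _ in range(200)])
labels = np.arange(200) % 2  # balanced binary labels, illustration only

# Keep only utterances at or above a token-count threshold, mirroring the
# >=1 / >=5 / >=30 / >=50 token columns of the table.
min_tokens = 5
mask = np.array([len(u.split()) >= min_tokens for u in utterances])

# TF-IDF feature generation feeding a linear SVM classifier.
model = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_validate(model, utterances[mask], labels[mask], cv=cv,
                        scoring={"auc_roc": "roc_auc", "f1": "f1"})

# Report each metric as mean ± SD (in percent) across the folds.
for name, values in (("AUC-ROC", scores["test_auc_roc"]),
                     ("F1-score", scores["test_f1"])):
    values = values * 100
    print(f"{name}: {values.mean():.2f} ± {values.std():.2f}")
```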