2021 Mar 2;28(1):e100262. doi: 10.1136/bmjhci-2020-100262

Table 4.

Performance metrics in the studies that used supervised learning (sentiment analysis and text classification). SVM and NB were the preferred classifiers, as they produced better results as measured by the F1 score. Only five studies reported multiple-fold validation.

| Author | k-fold cross-validation | Classifier and performance |
|---|---|---|
| Alemi et al34*† | Five repetitions of twofold cross-validation | Sentiment analysis: SVM (Positive 0.89, Negative 0.64); NB (Positive 0.94, Negative 0.68). Text classification: SVM (Staff related 0.85, Doctor listens 0.34); NB (Staff related 0.80, Doctor listens 0.37) |
| Doing-Harris et al24* | NR | Sentiment analysis: NB 0.84. Text classification: NB (Explanation 0.74, Friendliness 0.40) |
| Greaves et al27 | Single-fold cross-validation | Sentiment analysis: NB 0.89; SVM 0.84. Text classification: NB (Dignity and respect 0.85, Cleanliness 0.84); SVM (Dignity and respect 0.8, Cleanliness 0.84) |
| Hawkins et al52 | 10-fold cross-validation | SVM 0.89‡ |
| Jimenez-Zafra et al54 | 10-fold cross-validation | SVM (COPOD 0.86, COPOS 0.71) |
| Huppertz et al6 | NR | SVM 0.87‡ |
| Wagland et al48 | Single-fold cross-validation; 10-fold cross-validation | SVM 0.80; SVM 0.83 |
| Bahja et al26 | Single-fold cross-validation; 4-fold cross-validation | Sentiment analysis: SVM 0.84; NB 0.78. Text classification: SVM 0.81; NB 0.78 |

*Best and worst performing category, respectively.

†Classified as praise (positive), complaint (negative).

‡Reported as overall accuracy.

COPOD, corpus of patient opinions in Dutch; COPOS, corpus of patient opinions in Spanish; NB, Naïve Bayes; NR, not reported; SVM, support vector machine.
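
The table mixes F1 scores, overall accuracy (‡) and several cross-validation schemes, so a small illustration of how these quantities are computed may help when comparing rows. The following is a minimal Python sketch, not code from any of the reviewed studies: the labels in `y_true` and `y_pred` are invented toy data, and the function names are hypothetical. It shows that a classifier can score well on accuracy and on the F1 score of the majority class while scoring poorly on the minority class, and how k-fold cross-validation partitions the samples so that each one is held out exactly once.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Share of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


# Toy data (assumption): 8 praise and 2 complaint comments,
# with a classifier that labels every comment as praise.
y_true = ["praise"] * 8 + ["complaint"] * 2
y_pred = ["praise"] * 10

acc = accuracy(y_true, y_pred)                      # 0.8
f1_praise = f1_score(y_true, y_pred, "praise")      # 8/9, about 0.889
f1_complaint = f1_score(y_true, y_pred, "complaint")  # 0.0

folds = list(k_fold_indices(len(y_true), 4))        # fold sizes 3, 3, 2, 2
```

In practice the folds are usually shuffled or stratified by class before splitting, and the per-fold scores are averaged; the contiguous split here is only meant to show the partitioning idea.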