2021 Mar 2;28(1):e100262. doi: 10.1136/bmjhci-2020-100262

Table 4.

Performance metrics in the studies that used supervised learning (sentiment analysis and text classification). SVM and NB were the preferred classifiers because they produced better results, as shown by the F1 score. Only five studies reported multiple-fold validation.

Author	k-fold cross-validation	Sentiment analysis: classifier	Sentiment analysis: performance	Text classification: classifier	Text classification: performance
Alemi et al³⁴*†	Five repetitions of twofold cross-validation	SVM	Positive 0.89 Negative 0.64	SVM	Staff related 0.85 Doctor listens 0.34
Alemi et al³⁴*†	Five repetitions of twofold cross-validation	NB	Positive 0.94 Negative 0.68	NB	Staff related 0.80 Doctor listens 0.37
Doing-Harris et al²⁴*	NR	NB	0.84	NB	Explanation 0.74 Friendliness 0.40
Greaves et al²⁷	Single-fold cross-validation	NB	0.89	NB	Dignity and respect 0.85 Cleanliness 0.84
Greaves et al²⁷	Single-fold cross-validation	SVM	0.84	SVM	Dignity and respect 0.8 Cleanliness 0.84
Hawkins et al⁵²	10-fold cross-validation	–	–	SVM	0.89‡
Jimenez-Zafra et al⁵⁴	10-fold cross-validation	SVM	COPOD 0.86 COPOS 0.71	–	–
Huppertz et al⁶	NR	SVM	0.87‡	–	–
Wagland et al⁴⁸	Single-fold cross-validation	SVM	0.80	–	–
Wagland et al⁴⁸	10-fold cross-validation	SVM	0.83	–	–
Bahja et al²⁶	Single-fold cross-validation	SVM	0.84	–	–
Bahja et al²⁶	Single-fold cross-validation	NB	0.78	–	–
Bahja et al²⁶	4-fold cross-validation	SVM	0.81	–	–
Bahja et al²⁶	4-fold cross-validation	NB	0.78	–	–

*Best and worst performing category, respectively.

†Classified as praise (positive), complaint (negative).

‡Reported as overall accuracy.

COPOD, corpus of patient opinions in Dutch; COPOS, corpus of patient opinions in Spanish; NB, Naïve Bayes; NR, not reported; SVM, support vector machine.
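
For readers unfamiliar with the two quantities reported in the table, the following is a minimal Python sketch of how an F1 score and a k-fold split can be computed. It is illustrative only: the function names `f1_score` and `k_fold_indices` and the example labels are assumptions made for this sketch and are not taken from any of the reviewed studies, whose exact evaluation procedures are not described here.

```python
from typing import List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Return the F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of (train, test) indices."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds get one additional sample each.
        size = base + (1 if i < extra else 0)
        stop = start + size
        test = list(range(start, stop))
        train = [j for j in range(n_samples) if j < start or j >= stop]
        folds.append((train, test))
        start = stop
    return folds


# Hypothetical labels: 1 = positive comment, 0 = negative comment.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
print(round(f1_score(y_true, y_pred), 2))  # prints 0.8

# Each of the 5 folds holds out 2 of the 10 samples for testing.
for train_idx, test_idx in k_fold_indices(10, 5):
    print(test_idx)
```

In a k-fold evaluation, a classifier would be trained on each `train` index set, scored on the matching `test` set, and the k scores averaged. Practical implementations usually shuffle or stratify the samples before splitting; that step is omitted here for brevity.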