. 2019 Aug 5;2(3):330–338. doi: 10.1093/jamiaopen/ooz030

Table 2.

Performance of the various machine learning approaches employed for identifying unsafe food products

| Classifier description | Precision | Recall | F1 score |
| --- | --- | --- | --- |
| Linear SVM (feature selection using Chi2, k = 500) | 0.61 | 0.64 | 0.62 |
| Multinomial Naive Bayes (feature selection using Chi2, k = 500) | 0.66 | 0.66 | 0.66 |
| Weighted logistic regression (feature selection using Chi2, k = 500) | 0.58 | 0.74 | 0.65 |
| Weighted logistic regression (feature selection using Chi2, k = 1000) | 0.64 | 0.71 | 0.67 |
| Weighted logistic regression (feature selection using mutual information, k = 1000) | 0.60 | 0.68 | 0.64 |
| Weighted logistic regression with SMOTE (ratio = 1:5, tested on real data points only) | 0.62 | 0.68 | 0.65 |
| Weighted logistic regression with SMOTE (ratio = 1:3, tested on real data points only) | 0.62 | 0.71 | 0.66 |
| Weighted logistic regression with SMOTE (ratio = 1:2, tested on real data points only) | 0.62 | 0.70 | 0.66 |
| Weighted logistic regression with SMOTE (ratio = 1:1, tested on real data points only) | 0.63 | 0.66 | 0.64 |
| BERT (epochs = 10, max sequence length = 128) | 0.76 | 0.67 | 0.71 |
| BERT (epochs = 10, max sequence length = 128) with focal loss for imbalanced data (α = 0.915, γ = 5) | 0.75 | 0.74 | 0.73 |
| BERT (epochs = 20, max sequence length = 256) | 0.79 | 0.67 | 0.72 |
| BERT (epochs = 30, max sequence length = 256) | 0.78 | 0.71 | 0.74 |
| BERT (epochs = 30, max sequence length = 256) with focal loss for imbalanced data (α = 0.915, γ = 5) | 0.77 | 0.71 | 0.74 |
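The focal-loss rows above use α = 0.915 and γ = 5 to weight the rare positive (unsafe-product) class. A minimal sketch of a per-example binary focal loss with those settings is shown below; the BERT wiring itself is omitted, and the function name and example probabilities are illustrative, not from the paper.

```python
import math

def focal_loss(p, y, alpha=0.915, gamma=5.0):
    """Binary focal loss for one example.
    p: predicted probability of the positive class; y: true label (1 = unsafe).
    alpha up-weights the rare positive class; gamma down-weights easy examples."""
    if y == 1:
        return -alpha * (1 - p) ** gamma * math.log(p)
    return -(1 - alpha) * p ** gamma * math.log(1 - p)

# A confident correct prediction (p = 0.9) incurs a far smaller loss than an
# uncertain one (p = 0.6), so training focuses on the hard, minority-class cases.
print(focal_loss(0.9, 1) < focal_loss(0.6, 1))  # True
```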

BERT is the best-performing classifier. Chi2 refers to chi-square. The metrics are defined as follows: accuracy ([true positives + true negatives]/total reviews), precision (also known as positive predictive value: true positives/all predicted positives), recall (also known as sensitivity: true positives/[true positives + false negatives]), and F1 score (the harmonic mean of precision and recall).
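The metric definitions above can be sketched directly from confusion-matrix counts. The counts below are hypothetical, chosen only to illustrate the formulas, not taken from the paper's data.

```python
# Metrics from a confusion matrix, following the definitions in the table note.
def precision(tp, fp):
    # positive predictive value: true positives / all predicted positives
    return tp / (tp + fp)

def recall(tp, fn):
    # sensitivity: true positives / (true positives + false negatives)
    return tp / (tp + fn)

def f1(p, r):
    # harmonic mean of precision and recall
    return 2 * p * r / (p + r)

def accuracy(tp, tn, total):
    # (true positives + true negatives) / total reviews
    return (tp + tn) / total

# Hypothetical counts: 74 true positives, 26 false positives,
# 26 false negatives, 874 true negatives (1000 reviews in total).
tp, fp, fn, tn = 74, 26, 26, 874
p, r = precision(tp, fp), recall(tp, fn)
print(round(p, 2), round(r, 2), round(f1(p, r), 2))  # 0.74 0.74 0.74
```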