2023 May 9;195(8):713–719. doi: 10.1055/a-2061-6562

Table 2. Accuracy of the feature extraction methods used to train different ML algorithms, evaluated with 10-fold cross-validation on the training dataset. BOW: bag of words; LDA: latent Dirichlet allocation; LR: logistic regression; NMF: non-negative matrix factorization; NN: neural network; PCA: principal component analysis; SVM: support vector machine; TF-IDF: term frequency-inverse document frequency.

Feature extraction   NN     SVM    LR     Average accuracy
Dummy                -      -      -      0.5
BOW                  0.96   0.97   0.97   0.967
TF-IDF               0.95   0.96   0.97   0.96
NMF                  0.94   0.91   0.9    0.917
PCA                  0.91   0.9    0.9    0.903
LDA                  0.88   0.89   0.88   0.883
Doc2Vec              0.87   0.9    0.86   0.877
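
The average column can be checked directly from the per-classifier values. The sketch below assumes that "Average Accuracy" is the unweighted mean of the NN, SVM, and LR accuracies rounded to three decimals; the table does not state this, and the names `accuracies`, `reported_average`, and `mean_accuracy` are illustrative only. The Dummy row is left out because it reports a single value.

```python
# Per-classifier accuracies transcribed from Table 2, in the order (NN, SVM, LR).
accuracies = {
    "BOW": (0.96, 0.97, 0.97),
    "TF-IDF": (0.95, 0.96, 0.97),
    "NMF": (0.94, 0.91, 0.90),
    "PCA": (0.91, 0.90, 0.90),
    "LDA": (0.88, 0.89, 0.88),
    "Doc2Vec": (0.87, 0.90, 0.86),
}

# Average accuracy as printed in the last column of Table 2.
reported_average = {
    "BOW": 0.967,
    "TF-IDF": 0.96,
    "NMF": 0.917,
    "PCA": 0.903,
    "LDA": 0.883,
    "Doc2Vec": 0.877,
}


def mean_accuracy(values):
    """Unweighted mean of the classifier accuracies, rounded to 3 decimals.

    Assumption: this is how the table's average column was derived.
    """
    return round(sum(values) / len(values), 3)


recomputed = {name: mean_accuracy(vals) for name, vals in accuracies.items()}

# Rank the feature extraction methods from highest to lowest mean accuracy.
ranking = sorted(recomputed, key=recomputed.get, reverse=True)

for name in ranking:
    matches = abs(recomputed[name] - reported_average[name]) < 1e-9
    print(f"{name:8s} recomputed={recomputed[name]:.3f} matches_table={matches}")
```

Under that assumption, every recomputed mean agrees with the reported column, and the ranking follows the row order of the table, with BOW highest and Doc2Vec lowest.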