
Table 7.

Best accuracy achieved by different machine learning algorithms for each text representation method

| Input data | Used text | Machine learning algorithm | Accuracy | Macro average (F1 score) | Weighted average (F1 score) |
|---|---|---|---|---|---|
| ScispaCy | Abstract | Random Forest Classifier | 0.74 | 0.74 | 0.74 |
| TF-IDF | Abstract | Random Forest Classifier | 0.92 | 0.92 | 0.92 |
| BOW | Abstract | Random Forest Classifier | 0.90 | 0.90 | 0.90 |
| ScispaCy | Title | Multinomial NB | 0.68 | 0.68 | 0.68 |
| TF-IDF | Title | Random Forest Classifier | 0.80 | 0.80 | 0.80 |
| BOW | Title | Random Forest Classifier | 0.70 | 0.70 | 0.70 |
| ScispaCy | Title and Abstract | Random Forest Classifier | 0.79 | 0.79 | 0.79 |
| TF-IDF | Title and Abstract | Random Forest Classifier | 0.92 | 0.92 | 0.92 |
| BOW | Title and Abstract | Random Forest Classifier | 0.91 | 0.91 | 0.91 |
| BOW | Abstract with bibliometric features | Random Forest Classifier | 0.73 | 0.73 | 0.73 |
| TF-IDF | Abstract with bibliometric features | Random Forest Classifier | 0.92 | 0.92 | 0.92 |
| Bidirectional Encoder Representations (BERT) | Title and Abstract | Neural network | 0.87 | 0.87 | 0.87 |
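The table reports, for each text representation, the best-performing classifier together with its accuracy and macro- and weighted-averaged F1 scores. As a rough illustration of how one such row (TF-IDF features on abstracts fed to a Random Forest Classifier) and its three metrics could be computed, the minimal scikit-learn sketch below uses a toy DataFrame; the column names, toy data, and hyperparameters are assumptions for illustration, not the authors' actual setup.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Tiny toy corpus standing in for the paper's dataset (hypothetical text and labels).
df = pd.DataFrame({
    "abstract": ["deep learning for protein folding", "randomized trial of a new drug",
                 "graph neural networks for molecules", "cohort study of patient outcomes"] * 5,
    "label":    ["computational", "clinical", "computational", "clinical"] * 5,
})

X_train, X_test, y_train, y_test = train_test_split(
    df["abstract"], df["label"], test_size=0.2, random_state=42, stratify=df["label"]
)

# TF-IDF text representation followed by a Random Forest Classifier,
# mirroring one row of the comparison table.
clf = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
])
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
print("Accuracy:   ", accuracy_score(y_test, y_pred))
print("Macro F1:   ", f1_score(y_test, y_pred, average="macro"))     # macro average (F1 score)
print("Weighted F1:", f1_score(y_test, y_pred, average="weighted"))  # weighted average (F1 score)
```

Swapping `TfidfVectorizer` for `CountVectorizer` would give the BOW rows of the table; the ScispaCy and BERT rows instead feed dense embedding vectors to the classifier.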