Table 7.
Best accuracy achieved by different machine learning algorithms for each text representation method
Text representation | Input text | Machine learning algorithm | Accuracy | Macro-averaged F1 score | Weighted-average F1 score |
---|---|---|---|---|---|
ScispaCy | Abstract | Random Forest Classifier | 0.74 | 0.74 | 0.74 |
TF-IDF | Abstract | Random Forest Classifier | 0.92 | 0.92 | 0.92 |
BOW | Abstract | Random Forest Classifier | 0.90 | 0.90 | 0.90 |
ScispaCy | Title | Multinomial NB | 0.68 | 0.68 | 0.68 |
TF-IDF | Title | Random Forest Classifier | 0.80 | 0.80 | 0.80 |
BOW | Title | Random Forest Classifier | 0.70 | 0.70 | 0.70 |
ScispaCy | Title and Abstract | Random Forest Classifier | 0.79 | 0.79 | 0.79 |
TF-IDF | Title and Abstract | Random Forest Classifier | 0.92 | 0.92 | 0.92 |
BOW | Title and Abstract | Random Forest Classifier | 0.91 | 0.91 | 0.91 |
BOW | Abstract with Bibliometric Features | Random Forest Classifier | 0.73 | 0.73 | 0.73 |
TF-IDF | Abstract with Bibliometric Features | Random Forest Classifier | 0.92 | 0.92 | 0.92 |
Bidirectional Encoder Representations (BERT) | Title and Abstract | Neural Network (BERT) | 0.87 | 0.87 | 0.87 |
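The best-performing combination in Table 7 (TF-IDF features with a Random Forest classifier, 0.92 accuracy) can be sketched as a standard scikit-learn pipeline. This is a minimal illustration, not the authors' exact setup: the toy corpus, labels, and hyperparameters below are hypothetical stand-ins for the actual titles and abstracts.

```python
# Sketch of the TF-IDF + Random Forest pipeline from Table 7.
# The documents and labels here are illustrative placeholders only.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

docs = [
    "deep learning model for image classification",
    "convolutional networks improve vision accuracy",
    "randomized clinical trial of a new drug",
    "patient outcomes after surgical treatment",
]
labels = ["cs", "cs", "med", "med"]

pipeline = Pipeline([
    # Convert raw text (abstract, title, or both) to TF-IDF vectors
    ("tfidf", TfidfVectorizer(stop_words="english")),
    # Classify the vectors with a Random Forest
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
])
pipeline.fit(docs, labels)
print(pipeline.score(docs, labels))  # training-set accuracy on the toy corpus
```

Swapping `TfidfVectorizer` for `CountVectorizer` gives the bag-of-words (BOW) variant in the table with no other changes to the pipeline.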