. 2022 Nov 29;60(2):571–591. doi: 10.1007/s10844-022-00768-8

Table 3.

Overview of input corpus in the machine learning methods

| Corpus (Input data table) | Used text | Columns | Original columns |
|---|---|---|---|
| ScispaCy | Abstract | 1500 | 195434 |
| TF-IDF | Abstract | 20939 | 20939 |
| BOW | Abstract | 20939 | 20939 |
| ScispaCy | Title | 1500 | 3430 |
| TF-IDF | Title | 326 | 326 |
| BOW | Title | 326 | 326 |
| ScispaCy | Title_Abstract | 1500 | 195760 |
| TF-IDF | Title_Abstract | 21264 | 21264 |
| BOW | Title_Abstract | 21264 | 21264 |
| BOW | Abstract_BibliometricFeatures | 20945 | 20945 |
| TF-IDF | Abstract_BibliometricFeatures | 20945 | 20945 |
| BERT | Title | 3072 | 30522 |
| BERT | Title_and_Abstract | 3072 | 30522 |
| BERT | Title_and_Abstract_and_BibliometricFeatures | 3072 | 30522 |
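
The identical column counts for BOW and TF-IDF on the same text are what one would expect when both representations are built over a single shared vocabulary: each distinct token becomes one column, and TF-IDF only reweights the counts. The sketch below illustrates this on a toy corpus in plain Python. The documents, the helper names (`tokenize`, `bow_matrix`, `tfidf_matrix`), the particular IDF formula, and the two extra numeric features are all assumptions made for illustration; they are not taken from the paper and do not reproduce its pipeline.

```python
import math
from collections import Counter

# Toy documents standing in for abstracts (hypothetical, not from the paper).
docs = [
    "machine learning for literature screening",
    "screening literature with text mining",
    "text mining and machine learning",
]

def tokenize(text):
    # Naive whitespace tokenizer; a real pipeline would do more.
    return text.lower().split()

# Shared vocabulary: one column per distinct token in the corpus.
vocab = sorted({tok for doc in docs for tok in tokenize(doc)})

def bow_matrix(docs, vocab):
    # Bag-of-words: raw token counts per document.
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([counts[tok] for tok in vocab])
    return rows

def tfidf_matrix(docs, vocab):
    # TF-IDF: counts reweighted by log(N / df) + 1 (one common variant,
    # chosen here as an assumption).
    n_docs = len(docs)
    bow = bow_matrix(docs, vocab)
    df = [sum(1 for row in bow if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) + 1.0 for d in df]
    return [[row[j] * idf[j] for j in range(len(vocab))] for row in bow]

bow = bow_matrix(docs, vocab)
tfidf = tfidf_matrix(docs, vocab)

# Appending extra numeric features (hypothetical values) adds one column each.
extra_features = [[2020, 3], [2021, 1], [2019, 7]]
bow_plus = [row + feats for row, feats in zip(bow, extra_features)]

print(len(vocab), len(bow[0]), len(tfidf[0]), len(bow_plus[0]))
```

In this sketch the BOW and TF-IDF matrices have the same width as the vocabulary, and appending k extra features widens the matrix by exactly k columns, which matches the pattern in the table where the Abstract_BibliometricFeatures rows have six more columns than the Abstract rows.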