Table 3.
Overview of the input corpora used in the machine learning methods
Corpus (Input data table) | Used text | Columns | Original columns |
---|---|---|---|
ScispaCy | Abstract | 1500 | 195434 |
TF-IDF | Abstract | 20939 | 20939 |
BOW | Abstract | 20939 | 20939 |
ScispaCy | Title | 1500 | 3430 |
TF-IDF | Title | 326 | 326 |
BOW | Title | 326 | 326 |
ScispaCy | Title_Abstract | 1500 | 195760 |
TF-IDF | Title_Abstract | 21264 | 21264 |
BOW | Title_Abstract | 21264 | 21264 |
BOW | Abstract_BibliometricFeatures | 20945 | 20945 |
TF-IDF | Abstract_BibliometricFeatures | 20945 | 20945 |
BERT | Title | 3072 | 30522 |
BERT | Title_and_Abstract | 3072 | 30522 |
BERT | Title_and_Abstract_and_BibliometricFeatures | 3072 | 30522 |