Table 2.
Computational resources used for each classifier by feature type for data sets II and III.
| Feature type and data set | Random forest: memory, MB | Random forest: run time, hours:minutes:seconds | Logistic regression: memory, MB | Logistic regression: run time, hours:minutes:seconds |
|---|---|---|---|---|
| BOW<sup>a</sup> | | | | |
| II | 310 | 00:04:10 | 340 | 00:21:35 |
| III | 3500 | 07:22:02 | 3400 | 23:17:20<sup>b</sup> |
| TF-IDF<sup>c</sup> | | | | |
| II | 310 | 00:04:15 | 270 | 00:03:04 |
| III | 3400 | 06:37:04 | 2300 | 02:47:30 |
| CBOW<sup>d</sup> | | | | |
| II | 193 | 00:03:02 | 180 | 00:01:17 |
| III | 1700 | 01:21:11 | 1700 | 00:16:36 |
| PV-DBOW<sup>e</sup> | | | | |
| II | 170 | 00:03:35 | 89 | 00:00:34 |
| III | 1100 | 01:41:18 | 1600 | 00:02:13 |

<sup>a</sup>BOW: bag-of-words.
<sup>b</sup>No convergence after 100,000 iterations.
<sup>c</sup>TF-IDF: term frequency–inverse document frequency.
<sup>d</sup>CBOW: continuous BOW.
<sup>e</sup>PV-DBOW: paragraph vector–distributed BOW.