Table 3.
Cross tabulation showing the number of studies employing certain research designs by the aspects of text mining that were compared (n = 44)
What aspect of text mining was compared | Retrospective simulation | Prospective—case study | Prospective—controlled trial | Prospective—other | Total—what was compared |
---|---|---|---|---|---|
Classifiers/algorithms | 13 | 0 | 0 | 3 | 16 |
Number of features | 2 | 0 | 0 | 0 | 2 |
Feature extraction/sets (e.g., BoW) | 8 | 0 | 0 | 2 | 10 |
Views (e.g., T&A, MeSH) | 5 | 0 | 0 | 1 | 6 |
Training set size | 2 | 0 | 0 | 0 | 2 |
Kernels | 2 | 0 | 0 | 0 | 2 |
Topic specific versus general training data | 3 | 0 | 0 | 1 | 4 |
Other optimisations | 9 | 0 | 0 | 4 | 13 |
No comparison | 5 | 5 | 4 | 1 | |
Total—study design (duplicates removed) | (27) | (5) | (4) | (8) | |
Note. Many studies compared more than one aspect of text mining; the ‘Total—what was compared’ column therefore sums to more than 44. The ‘Total—study design (duplicates removed)’ row gives the number of distinct studies of each design type rather than the column sums, because the column sums would count a study once for every aspect of text mining it compared.
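The arithmetic behind the note can be checked with a short sketch. The counts below are transcribed from Table 3; the variable names are illustrative assumptions, not part of the original study.

```python
# Counts transcribed from Table 3. Each list follows the column order:
# retrospective simulation, prospective case study,
# prospective controlled trial, prospective other.
compared = {
    "Classifiers/algorithms": [13, 0, 0, 3],
    "Number of features": [2, 0, 0, 0],
    "Feature extraction/sets": [8, 0, 0, 2],
    "Views": [5, 0, 0, 1],
    "Training set size": [2, 0, 0, 0],
    "Kernels": [2, 0, 0, 0],
    "Topic specific versus general training data": [3, 0, 0, 1],
    "Other optimisations": [9, 0, 0, 4],
}
no_comparison = [5, 5, 4, 1]

# Study-design totals with duplicates removed, as reported in the table.
design_totals = [27, 5, 4, 8]

# Row totals reproduce the "what was compared" column of the table.
row_totals = {aspect: sum(counts) for aspect, counts in compared.items()}

# Summing those row totals counts a study once per aspect it compared.
sum_of_row_totals = sum(row_totals.values())

# Naive column sums over every row, which also double-count studies.
all_rows = list(compared.values()) + [no_comparison]
naive_column_sums = [sum(column) for column in zip(*all_rows)]

print("Sum of row totals:", sum_of_row_totals)
print("Naive column sums:", naive_column_sums)
print("De-duplicated design totals:", design_totals, "sum =", sum(design_totals))
```

The de-duplicated design totals add up to the 44 included studies, whereas the naive column sums exceed them wherever studies compared several aspects.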