Table 2. Feature comparison over all results

| Feature | SVMLight | SVM-perf | AdaBoostM1 | Ada Over |
|---|---|---|---|---|
| Unigram | 0.418 | 0.492 † | 0.420 | 0.471 † |
| Bigram | 0.406 | 0.513* † | 0.420 | 0.477* † |
| Argumentative | 0.403 | 0.479 † | 0.415 | 0.464 † |
| Noun phrases | 0.222 | 0.329 † | 0.222 | 0.271 † |
| Concepts | 0.409 | 0.497* † | 0.427 | 0.480* † |
| CUIs | 0.398 | 0.496 | 0.422 | 0.475 † |
| MTI predictions | 0.513* | 0.531* † | 0.478* | 0.501* † |
| MTI MMI | 0.398 | 0.454 † | 0.367 | 0.382 † |
| MTI PRC | 0.481* | 0.502 † | 0.430 | 0.453 † |
| First level taxonomy | 0.300 | 0.456 † | 0.351 | 0.429 † |
| Second level taxonomy | 0.222 | 0.424 † | 0.329 | 0.393 † |
| Third level taxonomy | 0.173 | 0.383 | 0.285 | 0.341 † |
| Journal | 0.115 | 0.193 † | 0.126 | 0.208 † |
| Affiliation | 0.046 | 0.064 | 0.045 | 0.044 † |
| Author | 0.062 | 0.137 † | 0.081 | 0.084 † |

Results are reported as F-measure, using a binary representation of features. The learning algorithms are SVMLight, SVM-perf, AdaBoostM1, and AdaBoostM1 with oversampling of positive instances (Ada Over). Within each column, results significantly better than unigram (p < 0.05) are marked with *. Within each pair of methods (SVMLight/SVM-perf and AdaBoostM1/Ada Over), statistically significant differences are marked with †.
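
As an illustration of the terms used in the note above, the following minimal Python sketch shows one way a binary feature representation, oversampling of positive instances, and the F-measure could be computed. It is not the code used to produce Table 2. The function names, the oversampling `factor`, and the choice of the balanced F-measure (`beta = 1`) are assumptions made for the example; the table note does not specify them.

```python
def binary_features(tokens, vocabulary):
    """Return a 0/1 vector: 1 if the vocabulary term occurs in the document."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


def oversample_positives(examples, labels, factor):
    """Repeat each positive example `factor` times; negatives are unchanged.

    `factor` is a hypothetical parameter chosen for this illustration.
    """
    out_x, out_y = [], []
    for x, y in zip(examples, labels):
        copies = factor if y == 1 else 1
        out_x.extend([x] * copies)
        out_y.extend([y] * copies)
    return out_x, out_y


def f_measure(gold, predicted, beta=1.0):
    """F-measure over binary labels; beta=1 (balanced F1) is an assumption."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    vocab = ["protein", "gene", "patient"]
    print(binary_features(["gene", "gene", "patient"], vocab))
    print(oversample_positives([[1, 0], [0, 1]], [1, 0], factor=3))
    print(f_measure([1, 1, 0, 0], [1, 0, 1, 0]))
```

Duplicating positive instances leaves the feature vectors unchanged and only shifts the class balance seen by the learner, which is one simple way to read "oversampling of positive instances".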