Table 3.
Features | GTB P | GTB R | GTB F1 | GTB optimized P | GTB optimized R | GTB optimized F1 | SVM P | SVM R | SVM F1 | SVM optimized P | SVM optimized R | SVM optimized F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
TF 1000 | 83.4 | 90.6 | 86.7 | 79.6 | 92.2 | 85.2 | 76.3 | 79.8 | 78.0 | 80.2 | 88.1 | 83.7 |
Lemma | 79.3 | 87.6 | 83.0 | 76.5 | 92.2 | 83.1 | 60.1 | 100.0 | 75.1 | 78.9 | 88.2 | 83.1 |
Stem | 82.4 | 88.3 | 85.0 | 79.7 | 93.7 | 85.7 | 60.1 | 100.0 | 75.1 | 80.7 | 89.8 | 84.8 |
Stop | 79.0 | 83.6 | 80.6 | 79.0 | 93.0 | 85.0 | 76.5 | 78.3 | 77.4 | 83.1 | 89.8 | 84.8 |
IST | 79.0 | 86.0 | 81.7 | 76.7 | 89.1 | 81.9 | 73.0 | 65.1 | 68.9 | 72.9 | 84.5 | 78.0 |
TF-IDF 1000 | 81.7 | 91.2 | 86.0 | 79.5 | 92.1 | 84.9 | 60.1 | 100.0 | 75.1 | 78.1 | 89.7 | 82.8 |
LS-TFIDF 1000 | 80.2 | 84.4 | 81.9 | 78.9 | 91.3 | 84.2 | 60.1 | 100.0 | 75.1 | 72.7 | 88.9 | 79.3 |
SS-TFIDF 1000 | 78.6 | 85.8 | 81.6 | 78.8 | 93.0 | 85.0 | 60.1 | 100.0 | 75.1 | 75.3 | 86.6 | 79.8 |
GTB: gradient tree boosting; SVM: support vector machine; P: precision; R: recall; F1: F1-score; TF: term frequency; IST: infection-specific term; TF-IDF: term frequency–inverse document frequency. All values are percentages.
In total, the material comprised 213 HRs, of which 128 contained HAI. Labelling every HR as containing HAI therefore gives a baseline precision of 60 percent (128/213), recall of 100 percent, and F-score of 75 percent.
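The baseline figures can be checked with a few lines of arithmetic. The sketch below is illustrative only and assumes the baseline is a classifier that labels every record as positive; the function name `baseline_scores` is hypothetical and not taken from the study.

```python
def baseline_scores(n_total: int, n_positive: int) -> tuple[float, float, float]:
    """Precision, recall and F1 when every record is labelled positive (assumed baseline)."""
    tp = n_positive             # every true positive is caught
    fp = n_total - n_positive   # every negative becomes a false positive
    fn = 0                      # nothing is labelled negative, so no misses
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the text: 213 HRs in total, 128 containing HAI.
precision, recall, f1 = baseline_scores(213, 128)
print(f"P={precision:.1%} R={recall:.1%} F1={f1:.1%}")
# prints: P=60.1% R=100.0% F1=75.1%
```

These values match the 60.1 / 100.0 / 75.1 rows that appear for several unoptimized SVM configurations in Table 3, which is consistent with those configurations predicting the positive class for every record.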