TABLE I:
Performance of five supervised learning classifiers over three clinical representation features, a CNN over word embedding vectors, and a knowledge-guided CNN over CUI + word embedding vectors, with random under-sampling (RUS) applied to the majority class.
Feature | Algorithm | Undersampling | AUC | Precision | Recall | F-measure |
---|---|---|---|---|---|---|
Bag of words | NB | N/A | 0.7454 | 0.6018 | 0.0793 | 0.1401 |
Bag of words | L2-SVM | N/A | 0.7790 | 0.4072 | 0.6469 | 0.4998 |
Bag of words | L1-SVM | N/A | 0.7730 | 0.3765 | 0.6564 | 0.4769 |
Bag of words | L2-LR | N/A | 0.7790 | 0.4068 | 0.6539 | 0.5016 |
Bag of words | L1-LR | N/A | 0.7715 | 0.3835 | 0.6504 | 0.4825 |
Bag of words | RF | N/A | 0.7659 | 0.5227 | 0.0268 | 0.0509 |
Bag of words | RF | 1:1 RUS | 0.7613 | 0.3892 | 0.6520 | 0.4612 |
Bag of words | GBDT | N/A | 0.7673 | 0.6146 | 0.0688 | 0.1237 |
Bag of words | GBDT | 1:1 RUS | 0.7710 | 0.3685 | 0.6436 | 0.4728 |
Bag of CUIs | NB | N/A | 0.7448 | 0.5263 | 0.0699 | 0.1235 |
Bag of CUIs | L2-SVM | N/A | 0.7684 | 0.3964 | 0.6329 | 0.4874 |
Bag of CUIs | L1-SVM | N/A | 0.7688 | 0.3770 | 0.6504 | 0.4773 |
Bag of CUIs | L2-LR | N/A | 0.7690 | 0.3916 | 0.6376 | 0.4851 |
Bag of CUIs | L1-LR | N/A | 0.7649 | 0.3686 | 0.6457 | 0.4693 |
Bag of CUIs | RF | N/A | 0.7633 | 0.4385 | 0.5198 | 0.4757 |
Bag of CUIs | RF | 1:1 RUS | 0.7627 | 0.3491 | 0.6234 | 0.4697 |
Bag of CUIs | GBDT | N/A | 0.7643 | 0.5663 | 0.0548 | 0.0999 |
Bag of CUIs | GBDT | 1:1 RUS | 0.7703 | 0.3531 | 0.6668 | 0.4552 |
Bag of words+CUIs | NB | N/A | 0.7448 | 0.5263 | 0.0699 | 0.1235 |
Bag of words+CUIs | L2-SVM | N/A | 0.7785 | 0.4089 | 0.6410 | 0.4993 |
Bag of words+CUIs | L1-SVM | N/A | 0.7749 | 0.3813 | 0.6457 | 0.4795 |
Bag of words+CUIs | L2-LR | N/A | 0.7791 | 0.4070 | 0.6504 | 0.5007 |
Bag of words+CUIs | L1-LR | N/A | 0.7730 | 0.3863 | 0.6492 | 0.4844 |
Bag of words+CUIs | RF | N/A | 0.7676 | 0.4839 | 0.0175 | 0.0338 |
Bag of words+CUIs | RF | 1:1 RUS | 0.7652 | 0.3468 | 0.6532 | 0.4713 |
Bag of words+CUIs | GBDT | N/A | 0.7671 | 0.5892 | 0.0886 | 0.1540 |
Bag of words+CUIs | GBDT | 1:1 RUS | 0.7701 | 0.3414 | 0.6641 | 0.4637 |
Word embeddings | CNN | N/A | 0.7269 | 0.5016 | 0.1818 | 0.2669 |
Word embeddings | CNN | 1:1 RUS | 0.7224 | 0.2848 | 0.7040 | 0.4055 |
Word embeddings | CNN | 1:2 RUS | 0.7285 | 0.4364 | 0.4476 | 0.4419 |
Word embeddings | CNN | 1:3 RUS | 0.7381 | 0.5010 | 0.2949 | 0.3712 |
Word embeddings + semantically selected CUIs | Knowledge-guided CNN | N/A | 0.7231 | 0.4437 | 0.2436 | 0.3145 |
Word embeddings + semantically selected CUIs | Knowledge-guided CNN | 1:1 RUS | 0.7466 | 0.3590 | 0.6294 | 0.4572 |
Word embeddings + semantically selected CUIs | Knowledge-guided CNN | 1:2 RUS | 0.7529 | 0.4349 | 0.5175 | 0.4726 |
Word embeddings + semantically selected CUIs | Knowledge-guided CNN | 1:3 RUS | 0.7341 | 0.4483 | 0.4347 | 0.4414 |
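
The 1:1, 1:2, and 1:3 RUS settings in the table can be illustrated with a minimal sketch. It assumes the ratio means minority:majority examples and that all minority examples are kept; the function name `random_undersample`, its parameters, and the toy data are hypothetical and are not taken from the original experiments.

```python
import random
from collections import Counter


def random_undersample(samples, labels, majority_per_minority=1,
                       minority_label=1, seed=0):
    """Keep every minority example and a random subset of the majority class.

    Assumption: majority_per_minority=1 corresponds to "1:1 RUS",
    2 to "1:2 RUS", and 3 to "1:3 RUS".
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority_label]
    majority_idx = [i for i, y in enumerate(labels) if y != minority_label]
    # Never request more majority examples than are available.
    n_keep = min(len(majority_idx), majority_per_minority * len(minority_idx))
    kept = minority_idx + rng.sample(majority_idx, n_keep)
    rng.shuffle(kept)
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Hypothetical toy data: 10 minority (label 1) and 90 majority (label 0) notes.
toy_labels = [1] * 10 + [0] * 90
toy_samples = [f"note_{i}" for i in range(100)]

for ratio in (1, 2, 3):
    _, y = random_undersample(toy_samples, toy_labels,
                              majority_per_minority=ratio)
    print(f"1:{ratio} RUS class counts:", dict(Counter(y)))
```

In a setup like this, under-sampling would normally be applied to the training split only, so that the evaluation data keeps its original class distribution.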