Author manuscript; available in PMC: 2017 Aug 1.
Published in final edited form as: J Biomed Inform. 2016 May 13;62:21–31. doi: 10.1016/j.jbi.2016.05.004

Table 5.

Performance of classification models using only lexical features according to different evaluation metrics for the task of annotating caregiver interview session transcripts. Highest value for each metric and codebook size across all models is highlighted in boldface.

Classes  Model     Accuracy  Precision  Recall  F1     Kappa
16       NB        0.571     0.608      0.571   0.575  0.518
         NB-M      0.633     0.629      0.633   0.604  0.573
         J48       0.578     0.563      0.578   0.567  0.514
         AdaBoost  0.602     0.582      0.602   0.588  0.539
         RF        0.640     0.631      0.640   0.596  0.574
         DiscLDA   0.482     0.442      0.482   0.421  0.362
         CNN       0.657     0.641      0.657   0.648  0.512
         SVM       0.664     0.653      0.664   0.639  0.606

19       NB        0.477     0.504      0.477   0.467  0.434
         NB-M      0.536     0.539      0.536   0.512  0.487
         J48       0.436     0.431      0.436   0.432  0.382
         AdaBoost  0.467     0.457      0.467   0.460  0.415
         RF        0.507     0.508      0.507   0.467  0.450
         DiscLDA   0.374     0.370      0.374   0.333  0.287
         CNN       0.510     0.498      0.510   0.504  0.401
         SVM       0.545     0.547      0.545   0.535  0.497

58       NB        0.379     0.392      0.379   0.370  0.350
         NB-M      0.442     0.404      0.442   0.386  0.401
         J48       0.340     0.321      0.340   0.328  0.302
         AdaBoost  0.381     0.359      0.381   0.366  0.344
         RF        0.402     0.358      0.402   0.352  0.358
         DiscLDA   0.288     0.258      0.288   0.234  0.229
         CNN       0.118     0.102      0.118   0.109  0.032
         SVM       0.451     0.420      0.451   0.418  0.414
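
The accuracy and recall columns are identical in every row of the table. That is what support-weighted averaging of per-class recall produces in single-label classification, although the table does not state which averaging scheme was used, so treat that as an assumption. The sketch below illustrates how the five reported metrics could be computed under that assumption. The function name `weighted_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy, support-weighted precision/recall/F1, and Cohen's kappa.

    Assumption: per-class scores are averaged with weights proportional
    to each class's share of the gold labels (its support).
    """
    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    true_counts = Counter(y_true)   # gold-label frequency per class
    pred_counts = Counter(y_pred)   # predicted-label frequency per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(hits.values()) / n
    precision = recall = f1 = 0.0
    for label in labels:
        support = true_counts[label]
        p = hits[label] / pred_counts[label] if pred_counts[label] else 0.0
        r = hits[label] / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement comes from the two marginal distributions.
    expected = sum(true_counts[l] * pred_counts[l] for l in labels) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical toy annotations with three codes (not data from the study).
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "c", "c"]
metrics = weighted_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With support weights, each class's recall is multiplied by its share of the gold labels, so the weighted sum reduces to the fraction of correctly labelled items, which is the accuracy. Kappa falls below accuracy because it discounts agreement expected by chance.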