Table 3.
Global means (and means of the lowest 5) of the training and testing AUCs[a] and F-measures in the real-world test.
Pipeline | Training set AUC[b] | Training set F-measure | Testing set AUC[b] | Testing set F-measure
Traditional | | | |
NLP[c] + SVM[d] (linear) | 0.9921 (0.9768) | 0.9365 (0.7983) | 0.9477 (0.8549) | 0.8458 (0.5984)
NLP + SVM (polynomial) | 0.9103 (0.7975) | 0.6316 (0.4045) | 0.8716 (0.7400) | 0.5761 (0.2802)
NLP + SVM (radial basis) | 0.9577 (0.9208) | 0.7954 (0.6484) | 0.9349 (0.8476) | 0.7588 (0.5258)
NLP + SVM (sigmoid) | 0.9522 (0.9058) | 0.7840 (0.6261) | 0.9259 (0.8196) | 0.7515 (0.5209)
NLP + RF[e] | 0.9996 (0.9985)[f] | 0.9869 (0.9664)[f] | 0.9483 (0.8484) | 0.8582 (0.5901)
NLP + GBM[g] | 0.9995 (0.9985) | 0.9821 (0.9562) | 0.9462 (0.8416) | 0.8568 (0.5948)
Proposed | | | |
GloVe[h] + CNN[i] | 0.9956 (0.9868) | 0.9803 (0.9523) | 0.9645 (0.8952)[f] | 0.9003 (0.7204)[f]
[a] AUC: area under the curve, calculated using the receiver operating characteristic curve.
[b] The results are presented as the mean AUC or F-measure (mean of the lowest 5 AUCs or F-measures). Detailed AUCs and F-measures for each chapter-level International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) diagnosis code are shown in Multimedia Appendix 3.
[c] NLP: natural language processing for feature extraction (terms, n-gram phrases, and SNOMED CT categories).
[d] SVM: support vector machine.
[e] RF: random forest.
[f] The best method for a specific index.
[g] GBM: gradient boosting machine.
[h] GloVe: a 50-dimensional word embedding model, pretrained using English Wikipedia and Gigaword.
[i] CNN: convolutional neural network.
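Each cell in the table pairs a global mean with the mean of the 5 lowest per-code values, as footnote b describes. The following is a minimal sketch of that summary, not the authors' code. The function name `global_and_lowest_k_mean` and the list `per_chapter_auc` are hypothetical, and the numbers are illustrative placeholders rather than values from the study.

```python
from statistics import mean


def global_and_lowest_k_mean(scores, k=5):
    """Return (mean of all scores, mean of the k lowest scores)."""
    if len(scores) < k:
        raise ValueError(f"need at least {k} scores, got {len(scores)}")
    lowest = sorted(scores)[:k]  # ascending sort, so the first k are the lowest
    return mean(scores), mean(lowest)


# Hypothetical per-chapter AUCs for one pipeline (NOT data from the study).
per_chapter_auc = [0.99, 0.97, 0.95, 0.93, 0.91, 0.89, 0.87, 0.85]

overall, lowest5 = global_and_lowest_k_mean(per_chapter_auc)

# Format in the same "mean (mean of lowest 5)" style used by the table cells.
print(f"{overall:.4f} ({lowest5:.4f})")
```

The same function would apply to per-code F-measures. Reporting the lowest-5 mean alongside the global mean shows how a pipeline behaves on its hardest diagnosis chapters, which a global mean alone can hide.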