Table 2. Comparison of classifiers for opioid misuse
Classifier | ROC AUC (95% CI) | F1 | Precision/PPV (95% CI) | Recall/Sensitivity (95% CI) | Specificity (95% CI) | NPV (95% CI) | P value for model fit* |
---|---|---|---|---|---|---|---|
Rule-based | NAa | 0.76 | 0.68 (0.57, 0.78) | 0.87 (0.76, 0.94) | 0.79 (0.71, 0.86) | 0.92 (0.85, 0.96) | < 0.01 |
Logistic Regression CUI | 0.91 (0.86, 0.95) | 0.79 | 0.89 (0.77, 0.96) | 0.71 (0.58, 0.81) | 0.95 (0.90, 0.98) | 0.86 (0.80, 0.91) | 0.06 |
Logistic Regression Word | 0.91 (0.86, 0.95) | 0.72 | 0.86 (0.73, 0.94) | 0.62 (0.49, 0.73) | 0.95 (0.89, 0.98) | 0.83 (0.76, 0.88) | < 0.01 |
Convolutional Neural Network CUI | 0.93 (0.90, 0.97) | 0.81 | 0.82 (0.70, 0.90) | 0.79 (0.68, 0.88) | 0.91 (0.85, 0.95) | 0.89 (0.83, 0.94) | 0.51 |
Convolutional Neural Network Word | 0.94 (0.91, 0.98) | 0.84 | 0.94 (0.85, 0.99) | 0.75 (0.63, 0.85) | 0.98 (0.93, 1.00) | 0.88 (0.82, 0.93) | 0.42 |
Convolutional Neural Network Character | 0.93 (0.90, 0.97) | 0.79 | 0.88 (0.76, 0.95) | 0.72 (0.60, 0.82) | 0.95 (0.89, 0.98) | 0.87 (0.80, 0.92) | < 0.01 |
Deep Averaging Network CUI | 0.83 (0.78, 0.88) | 0.74 | 0.68 (0.57, 0.78) | 0.87 (0.76, 0.94) | 0.79 (0.71, 0.86) | 0.92 (0.85, 0.96) | < 0.01 |
Deep Averaging Network Word | 0.80 (0.74, 0.86) | 0.49 | 0.74 (0.56, 0.87) | 0.37 (0.25, 0.49) | 0.93 (0.87, 0.97) | 0.74 (0.67, 0.80) | < 0.01 |
Max Pooling Network CUI | 0.93 (0.89, 0.96) | 0.79 | 0.85 (0.73, 0.93) | 0.74 (0.61, 0.83) | 0.93 (0.87, 0.97) | 0.87 (0.80, 0.92) | 0.60 |
Max Pooling Network Word | 0.91 (0.86, 0.96) | 0.78 | 0.87 (0.76, 0.95) | 0.71 (0.58, 0.81) | 0.95 (0.89, 0.98) | 0.86 (0.79, 0.91) | 0.36 |
Deep Averaging + Max Pooling Network CUI | 0.94 (0.91, 0.97) | 0.81 | 0.92 (0.82, 0.98) | 0.72 (0.60, 0.82) | 0.97 (0.92, 0.99) | 0.87 (0.80, 0.92) | < 0.01 |
Deep Averaging + Max Pooling Network Word | 0.94 (0.91, 0.97) | 0.78 | 0.86 (0.74, 0.94) | 0.72 (0.60, 0.82) | 0.94 (0.88, 0.97) | 0.87 (0.80, 0.92) | 0.09 |
Logistic regression used a combination of unigrams and bigrams. PPV: positive predictive value; NPV: negative predictive value; ROC AUC: area under the receiver operating characteristic curve; CUI: concept unique identifier; CI: confidence interval.
*Model fit was assessed with the Hosmer-Lemeshow goodness-of-fit test; p > 0.05 indicates that the model fits the data well.
a NA: not applicable, because the rule-based classifier yields only binary (0/1) predictions, with no predicted probabilities from which to plot a ROC curve.
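
The model-fit column relies on the Hosmer-Lemeshow test, so a minimal sketch of how such a p value can be computed may help in reading it. This is an illustration under stated assumptions, not the authors' implementation: the function names `hosmer_lemeshow` and `chi2_sf_even`, the choice of 10 equal-size risk groups, and the toy data are all hypothetical. The statistic sums, over groups sorted by predicted probability, the squared gap between observed and expected events divided by the binomial variance, and compares it with a chi-square distribution on (groups - 2) degrees of freedom.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Return (statistic, p_value) using equal-size groups ordered by predicted risk.

    Assumption: `groups` is even, so that df = groups - 2 is even as well.
    """
    # Sort on the probability only, so ties are not reordered by outcome.
    pairs = sorted(zip(y_prob, y_true), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        variance = expected * (1.0 - expected / len(chunk))
        if variance > 0:  # skip degenerate groups (all probabilities 0 or 1)
            stat += (observed - expected) ** 2 / variance
    return stat, chi2_sf_even(stat, groups - 2)


# Toy data (hypothetical): predicted risks that rise with the event rate.
toy_prob = [0.1] * 10 + [0.3] * 10 + [0.7] * 10 + [0.9] * 10
toy_true = [0] * 9 + [1] + [0] * 7 + [1] * 3 + [0] * 3 + [1] * 7 + [0] + [1] * 9
statistic, p_value = hosmer_lemeshow(toy_true, toy_prob, groups=4)
print(f"H = {statistic:.3f}, p = {p_value:.3f}")
```

A small p value means observed event rates depart from the predicted probabilities in at least some risk groups. In practice a library routine such as `scipy.stats.chi2.sf` would replace the hand-written tail function and lift the even-df restriction.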