Table 2. Performance metrics for natural language processing and classification on the derivation cohort.
a)
Stroke
Average AUC (95% CI) | Logistic Regression | k-NN | CART | OCT | OCT-H | RF | RNN |
BOW | 0.951 (0.943:0.959) | 0.808 (0.767:0.848) | 0.889 (0.868:0.91) | 0.805 (0.774:0.836) | 0.915 (0.899:0.92) | 0.922 (0.902:0.942) | 0.838 (0.811:0.866) |
tf-idf | 0.939 (0.933:0.945) | 0.857 (0.825:0.889) | 0.883 (0.859:0.907) | 0.813 (0.801:0.825) | 0.894 (0.853:0.906) | 0.929 (0.909:0.948) | 0.843 (0.816:0.869) |
GloVe | 0.904 (0.889:0.918) | 0.867 (0.836:0.898) | 0.734 (0.703:0.765) | 0.722 (0.69:0.753) | 0.767 (0.775:0.834) | 0.892 (0.868:0.916) | 0.961 (0.955:0.967) |
Location
Average AUC (95% CI) | Logistic Regression | k-NN | CART | OCT | OCT-H | RF | RNN |
BOW | 0.959 (0.944:0.974) | 0.841 (0.816:0.867) | 0.949 (0.93:0.969) | 0.867 (0.838:0.896) | 0.937 (0.919:0.955) | 0.96 (0.943:0.978) | 0.896 (0.873:0.926) |
tf-idf | 0.962 (0.943:0.981) | 0.903 (0.873:0.933) | 0.944 (0.918:0.97) | 0.862 (0.828:0.896) | 0.934 (0.917:0.951) | 0.965 (0.947:0.983) | 0.956 (0.936:0.977) |
GloVe | 0.906 (0.884:0.927) | 0.843 (0.819:0.868) | 0.734 (0.677:0.791) | 0.699 (0.662:0.722) | 0.809 (0.787:0.83) | 0.873 (0.854:0.892) | 0.976 (0.968:0.983) |
Acuity
Average AUC (95% CI) | Logistic Regression | k-NN | CART | OCT | OCT-H | RF | RNN |
BOW | 0.898 (0.874:0.922) | 0.815 (0.775:0.854) | 0.797 (0.748:0.846) | 0.735 (0.705:0.764) | 0.797 (0.742:0.852) | 0.901 (0.883:0.919) | 0.754 (0.733:0.779) |
tf-idf | 0.893 (0.865:0.921) | 0.857 (0.826:0.888) | 0.801 (0.762:0.839) | 0.733 (0.703:0.764) | 0.807 (0.764:0.843) | 0.902 (0.876:0.923) | 0.899 (0.875:0.922) |
GloVe | 0.881 (0.842:0.92) | 0.842 (0.805:0.879) | 0.73 (0.684:0.776) | 0.719 (0.66:0.778) | 0.82 (0.766:0.873) | 0.866 (0.824:0.908) | 0.925 (0.894:0.955) |
b)
Outcome | Sensitivity | Specificity | Accuracy | Precision | Threshold |
Stroke | 0.902 | 0.872 | 0.892 | 0.935 | 0.69 |
MCA Location | 0.902 | 0.911 | 0.908 | 0.766 | 0.42 |
Acuity | 0.911 | 0.689 | 0.772 | 0.935 | 0.33 |
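Area under the receiver operating characteristic curve (AUC); confidence interval (CI); bag of words (BOW); term frequency-inverse document frequency (tf-idf); Global Vectors for Word Representation (GloVe); k-Nearest Neighbors (k-NN); Classification and Regression Trees (CART); Optimal Classification Trees (OCT); Optimal Classification Trees with hyperplanes (OCT-H); Random Forests (RF); Recurrent Neural Networks (RNN); middle cerebral artery (MCA).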
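
Panel b) reports sensitivity, specificity, accuracy, and precision at a fixed probability threshold for each outcome. As a reading aid, the sketch below shows how these four quantities follow from a confusion matrix once predicted probabilities are dichotomized. It is an illustration only, not the analysis code behind the table: the function name `threshold_metrics`, the toy labels and probabilities, and the convention that a probability greater than or equal to the threshold counts as positive are all assumptions.

```python
def threshold_metrics(y_true, y_prob, threshold):
    """Confusion-matrix metrics for binary labels at a probability cutoff.

    Assumption: a case is predicted positive when prob >= threshold.
    """
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a class is absent.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),    # positive predictive value
    }


# Hypothetical toy data, not drawn from the study cohort.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_prob = [0.95, 0.80, 0.40, 0.30, 0.75, 0.70, 0.10, 0.20]

metrics = threshold_metrics(y_true, y_prob, threshold=0.69)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Lowering the threshold in this sketch trades specificity for sensitivity, which is one plausible reason each outcome in panel b) carries its own threshold.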