TABLE 7.
Author | Task | NLP model | Embeddings and Corpus | AUC |
Xia et al. (77) | Identify patients with complex neurological disorder | cTAKES-based | Train corpus: ∼600 clinical notes; Test corpus: ∼500 clinical notes | 0.96 |
Wissel et al. (104) | Identify candidates for epilepsy surgery | Multiple linear regression | Train corpus: ∼1,100 clinical notes; Test corpus: ∼8,340 clinical notes | 0.9 |
Heo et al. (85) | Prediction of stroke outcomes | CNN + LSTM + Multilayer perceptron | Train corpus: ∼1,300 clinical notes; Test corpus: ∼500 clinical notes | 0.81 |
Lineback et al. (78) | Prediction of 30-day readmission after stroke | Logistic regression + naïve Bayes + SVM + RF + Gradient boosting + XGBoost | Train corpus: ∼2,300 clinical notes; Test corpus: ∼550 clinical notes | 0.64 |
Lin et al. (81) | Identify UAU in hospitalized patients | Logistic regression | Train corpus: ∼58 k clinical notes | 0.91 |
Bacchi et al. (87) | Prediction of cause of TIA-like presentations | RNN + CNN | Corpus: 2,201 clinical notes (∼150 words each) | 0.88 |
NLP, natural language processing; AUC, area under curve; cTAKES, clinical text analysis and knowledge extraction system; CNN, convolutional neural network; LSTM, long short-term memory; SVM, support vector machine; RF, random forest; UAU, unhealthy alcohol use; ISP-D, internet-based self-assessment program for depression.