2020 Jul 24;8(7):e18599. doi: 10.2196/18599

Table 2.

Performance of artificial intelligence.

Performance measures (Accuracy through Other) are those of the best model.

| Reference | Best model recommended | Comparison/other models | Accuracy | AUROC^a | Recall | Specificity | Precision | F measure | Other |
|---|---|---|---|---|---|---|---|---|---|
| Huang et al [107] | SVM^b | Logistic regression | 0.675 | 0.771 | 0.632 | 0.789 | N/A^c | N/A | N/A |
| Lian et al [106] | Ensemble of three models | Bayesian network model; likelihood ratio model; BCPNN^d | N/A | N/A | N/A | N/A | N/A | N/A | Chi-square improved by 28.83% |
| Chapman et al [105] | Integrated NLP^e with RF^f model for relation extraction and CRF^g model | CRF; RF model for relation extraction | N/A | N/A | N/A | N/A | N/A | 0.612 | N/A |
| Yang et al [104] | MADEx (long short-term memory CRF+SVM) | RNN^h; CRF; SVM; RF | N/A | N/A | 0.6542 | N/A | 0.5758 | 0.6125 | N/A |
| Dey et al [103] | Neural fingerprint (deep learning) | 10 other chemical fingerprints | 0.91 | 0.82 | 0.50 | 0.93 | N/A | 0.400 | N/A |
| Dandala et al [102] | BiLSTM^i+CRF (joint and external resources) | BiLSTM+CRF (sequential); BiLSTM+CRF (joint) | N/A | N/A | 0.822 concept extraction; 0.855 relation classification | N/A | 0.846 concept extraction; 0.888 relation classification | 0.83 concept extraction; 0.87 relation classification | N/A |
| Cai et al [101] | CARD^j | Association rule mining | N/A | N/A | N/A | N/A | N/A | N/A | Identifying drug interaction 20% |
| Onay et al [100] | LSVM^k | Boosted and bagged trees (ensemble) | 0.89 | 0.88 | 0.83 | 1.00 | N/A | 0.91 | N/A |
| Tinoco et al [99] | Computerized surveillance system | Manual chart review | N/A | N/A | N/A | N/A | N/A | N/A | Number of events detected 92% (HAI^l), 82% (SSI^m), 91% (LRTI^n), 99% (UTI^o), 100% (BSI^p), 52% (ADE^q) |
| Carrel et al [98] | NLP-assisted manual review | Manual chart review | N/A | N/A | N/A | N/A | N/A | N/A | Identified 3.1% additional patients with opioid problems |
| Li et al [97] | NLP-based hybrid model | Rule-based method; CRF | N/A | N/A | 0.907 | N/A | 0.924 | 0.915 | N/A |
| Schiff et al [96] | MedAware, a probabilistic machine-learning CDS^r system | Traditional CDS | 0.75 | N/A | N/A | N/A | N/A | N/A | 75% of the identified alerts were clinically meaningful |
| Reddy et al [95] | ABC4D^s smartphone app (based on CBR^t, an AI^u technique) | N/A | N/A | N/A | N/A | N/A | N/A | N/A | ABC4D was superior to nonadaptive bolus calculator and also more user friendly |
| Long et al [93] | AI smartphone app | N/A | N/A | N/A | N/A | N/A | N/A | N/A | 100% adherence in the intervention group |
| Hasan et al [92] | Co-occurrence KNN^v and popular algorithm | Logistic regression; KNN; random algorithm; co-occurrence; drug popularity | N/A | N/A | N/A | N/A | N/A | N/A | Simple algorithms such as popular algorithm, co-occurrence, and KNN performed better than more complex logistic regression |
| Hu et al [91] | Bagged SVR^w and bagged voting | MLP^x; model tree; KNN | N/A | N/A | N/A | N/A | N/A | N/A | Mean absolute error for both 0.210 |
| Tang et al [90] | NLP | N/A | N/A | N/A | 0.59 | N/A | 0.75 | N/A | N/A |
| Hu et al [89] | RF | C4.5; KNN; CART^y; MLP; logistic regression | 0.839 | 0.912 | 0.782 | 0.888 | N/A | N/A | N/A |
| Bean et al [88] | Own model | Logistic regression; SVM; decision tree; NLP | N/A | 0.92 | N/A | N/A | N/A | N/A | N/A |
| Hamma et al [87] | CART | CART and CHAID^z | 0.902 | N/A | N/A | N/A | N/A | N/A | CHAID outperformed CART only in central nervous system classification |
| Song et al [86] | Similarity-based SVM | Analogous machine-learning algorithms (not mentioned) | N/A | N/A | 0.24 | 0.97 | 0.68 | N/A | N/A |
| Simon et al [85] | PANDIT^aa | Nurses | 0.635 | N/A | N/A | N/A | N/A | N/A | 36.5% of PANDIT recommendations did not match the nurses'; 1.4% of the recommendations were unsafe |
| Fong et al [84] | Unigram logistic regression | Unigram, bigram, and combined logistic regression and SVM | N/A | 0.914 | 0.830 | N/A | 0.838 | 0.765 | Unigram SVM and logistic regression were comparable |
| Ye et al [83] | RF | Linear and nonlinear machine-learning algorithms | N/A | N/A | N/A | N/A | N/A | N/A | C-statistic of 0.884 |
| Marella et al [82] | Naïve Bayes kernel | Naïve Bayes; KNN and rule induction | 0.855 | 0.927 | N/A | N/A | N/A | 0.877 | N/A |
| McKnight [81] | NLP; SELF^bb | N/A | Labeled 0.52; unlabeled 0.80 | N/A | N/A | N/A | N/A | N/A | N/A |
| Rosenbaum and Baron [80] | SVM | Logistic regression | N/A | 0.97 | 0.80 | 0.96 | N/A | N/A | Positive predictive value 0.52 |
| Wang et al [79] | Binary SVM with radial basis function kernel | Regularized logistic regression; linear SVM | N/A | N/A | 0.783 | N/A | 0.783 | 0.783 | N/A |
| Gupta and Patrick [66] | Naïve Bayes multinomial | J48; naïve Bayes; SVM | N/A | 0.96 | 0.78 | 0.98 | 0.79 | 0.78 | Kappa 0.76; mean absolute error 0.03 |
| Wang et al [67] | Ensemble classifier chain of SVM with radial basis function kernel | Binary relevance of SVM; classifier chain of SVM | 0.654 | N/A | 0.791 | N/A | 0.689 | 0.736 | Hamming loss 0.80 |
| Zhou et al [49] | SVM and RF | Naïve Bayes and MLP | N/A | N/A | 0.769 SVM for event type; 0.927 RF for event cause | N/A | 0.788 SVM for event type; 0.927 RF for event cause | 0.758 SVM for event type; 0.925 RF for event cause | N/A |
| Fong et al [68] | NLP with SVM | NLP with decision tree | 0.990 | 0.960 | 0.920 | 1.00 | 1.000 | 0.960 | N/A |
| El Messiry et al [69] | NLP | Scaled linear discriminant analysis; SVM; LASSO^cc and elastic-net regularized generalized linear models; max entropy; RF; neural network | 0.730 | N/A | 0.770 | 0.696 | N/A | N/A | N/A |
| Chondrogiannis et al [70] | NLP | N/A | N/A | N/A | N/A | N/A | N/A | N/A | Model developed in this study identified that each clinical report contains about 6.8 abbreviations |
| Liang and Gong [71] | Naïve Bayes with binary relevance | SVM; decision rule; decision tree; KNN | N/A | N/A | N/A | N/A | N/A | N/A | Micro F measure 0.212 |
| Ong et al [72] | Text classifier with SVM | Text classifier with naïve Bayes | N/A | 0.920 multitype dataset; 0.980 patient misidentification dataset | 0.830 multitype dataset; 0.940 patient misidentification dataset | N/A | 0.880 multitype dataset; 0.990 patient misidentification dataset | 0.860 multitype dataset; 0.960 patient misidentification dataset | N/A |
| Taggart et al [73] | Rule-based NLP | SVM; extra trees; convolutional neural network | N/A | N/A | N/A | 0.846 | N/A | N/A | Positive predictive value 0.627; negative predictive value 0.971 |
| Denecke et al [74] | AIML^dd | N/A | N/A | N/A | N/A | N/A | N/A | N/A | Minimize information loss during clinical visits |
| Evans et al [75] | SVM | J48; naïve Bayes | 0.728 | 0.891 incident type; 0.708 severity of harm | N/A | N/A | N/A | N/A | N/A |
| Wang et al [76] | Convolutional neural network | SVM | N/A | N/A | N/A | N/A | N/A | 0.850 | N/A |
| Klock et al [47] | SVM and RNN^ee | RF | 0.899 SVM; 0.900 RNN | N/A | N/A | N/A | N/A | 0.648899 SVM; 0.889 RNN | N/A |
| Li et al [77] | Ensemble machine learning (bagging, boosting, and random feature method) | N/A | N/A | N/A | 0.572 from 0.10 risk score; 0.855 from 0.04 risk score | N/A | N/A | N/A | C-statistic 0.880 |
| Muff et al [78] | NLP | Patient safety indicators | N/A | N/A | 0.770 | 0.938 | N/A | N/A | N/A |
| Kwon et al [65] | Deep learning-based early warning system | Modified early warning system; RF; logistic regression | N/A | 0.850 | 0.757 | 0.765 | N/A | 1.000 | AUPRC^ff |
| Hu et al [64] | Neural network model | ViEWS^gg | N/A | 0.880 | N/A | N/A | N/A | 0.81 | Positive predictive value 0.726 |
| Segal et al [63] | MedAware (a CDSS^hh) + EHR^ii | Legacy CDS | N/A | N/A | N/A | N/A | N/A | N/A | Clinically relevant 85%, alert burden 0.04% |
| Menard et al [62] | Machine learning (name not disclosed) | N/A | N/A | 0.970 | N/A | N/A | N/A | N/A | N/A |
| Eerikainen et al [61] | RF | Binary classification tree; regularized discriminant analysis classifier; SVM; RF | N/A | N/A | 0.950 | 0.780 | N/A | 0.782 | N/A |
| Antink et al [60] | Combined (selecting the best machine-learning algorithm for each alarm type) | Binary classification tree; regularized discriminant analysis classifier; SVM; RF | N/A | N/A | 0.950 | 0.780 | N/A | 0.782 | N/A |
| Zhang et al [59] | Cost-sensitive SVM | N/A | N/A | N/A | 0.950 | 0.850 | N/A | 0.809 | N/A |
| Ansari et al [58] | Multimodal machine learning using decision tree | N/A | N/A | N/A | 0.890 | 0.850 | N/A | 0.762 | N/A |
| Chen et al [57] | RF | N/A | N/A | 0.870 | N/A | N/A | N/A | N/A | N/A |

^a AUROC: area under the receiver operating characteristic curve.

^b SVM: support vector machine.

^c N/A: not applicable (not reported).

^d BCPNN: Bayesian confidence propagation neural network.

^e NLP: natural language processing.

^f RF: random forest.

^g CRF: conditional random field.

^h RNN: recurrent neural network.

^i BiLSTM: bidirectional long short-term memory neural network.

^j CARD: causal association rule discovery.

^k LSVM: linear support vector machine.

^l HAI: hospital-associated infection.

^m SSI: surgical site infection.

^n LRTI: lower respiratory tract infection.

^o UTI: urinary tract infection.

^p BSI: bloodstream infection.

^q ADE: adverse drug event.

^r CDS: clinical decision support.

^s ABC4D: Advanced Bolus Calculator For Diabetes.

^t CBR: case-based reasoning.

^u AI: artificial intelligence.

^v KNN: K-nearest neighbor.

^w SVR: support vector regression.

^x MLP: multilayer perceptron.

^y CART: classification and regression tree.

^z CHAID: chi-square automatic interaction detector.

^aa PANDIT: Patient Assisting Net-Based Diabetes Insulin Titration.

^bb SELF: semisupervised local Fisher discriminant analysis.

^cc LASSO: least absolute shrinkage and selection operator.

^dd AIML: artificial intelligence markup language.

^ee RNN: recurrent neural network.

^ff AUPRC: area under the precision-recall curve.

^gg ViEWS: VitalPac Early Warning Score.

^hh CDSS: clinical decision support system.

^ii EHR: electronic health record.
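
The accuracy, recall, specificity, precision, and F measure columns in Table 2 are conventionally derived from a binary confusion matrix. The following is a minimal sketch of those definitions, assuming that the F measure reported by the reviewed studies is the balanced F1 score (the table itself does not state this). The `ConfusionCounts` class, the function names, and the example counts are hypothetical and are introduced here only for illustration; the three reference rows are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for binary-classification counts."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def accuracy(c: ConfusionCounts) -> float:
    # Share of all cases that were classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def recall(c: ConfusionCounts) -> float:
    # Also called sensitivity: share of actual positives that were found.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Share of actual negatives that were correctly rejected.
    return c.tn / (c.tn + c.fp)


def precision(c: ConfusionCounts) -> float:
    # Also called positive predictive value.
    return c.tp / (c.tp + c.fp)


def f_measure(precision_value: float, recall_value: float) -> float:
    # Assumption: balanced F1, the harmonic mean of precision and recall.
    return 2 * precision_value * recall_value / (precision_value + recall_value)


# Rows of Table 2 that report recall, precision, and F measure together:
# reference -> (recall, precision, reported F measure)
reported = {
    "Yang et al [104]": (0.6542, 0.5758, 0.6125),
    "Li et al [97]": (0.907, 0.924, 0.915),
    "Wang et al [79]": (0.783, 0.783, 0.783),
}

if __name__ == "__main__":
    # Made-up counts, used only to exercise the definitions above.
    example = ConfusionCounts(tp=8, fp=2, tn=85, fn=5)
    print(f"accuracy={accuracy(example):.3f} recall={recall(example):.3f} "
          f"specificity={specificity(example):.3f} precision={precision(example):.3f}")

    for ref, (rec, prec, f_reported) in reported.items():
        f_recomputed = f_measure(prec, rec)
        print(f"{ref}: recomputed F = {f_recomputed:.4f}, reported F = {f_reported}")
```

For these three rows the recomputed values agree with the reported F measures after rounding, which is consistent with the F1 assumption for those studies; it does not establish how the other studies in the table computed their F measures.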