Table 2.
Deep learning implementation in electronic health records and medical report management.
Reference | Task | Method | Remark |
Wickramasinghe et al, 2017 [89] | Extract features from medical records | CNNa | It achieves superior accuracy compared with traditional techniques to detect meaningful clinical motifs and uncovers the underlying structure of the disease |
Lin et al, 2017 [90] | Disease code classification | CNN | The method had a higher testing accuracy (mean AUCb=0.9696; mean F-score=0.9086) than traditional NLPc-based approaches (mean AUC range 0.8183-0.9571; mean F-score range 0.5050-0.8739) |
Cheng et al, 2016 [91] | Risk prediction of chronic congestive heart failure | CNN | The model increased prediction accuracy by 1.5% when 60% of the training data were used and by 5.2% when 90% were used |
Zeng et al, 2017 [92] | MobileDeepPill: Recognition of unconstrained pill image | CNN | DLd-based pill image recognition algorithm won the first prize of the NIHe NLMf Pill Image Recognition Challenge |
Li et al, 2018 [93] | Extraction of adverse drug events | RNNg | The DL model achieved a result of F-score=65.9%, which is higher than F-score=61.7% from the best system in the MADEh1.0 challenge |
Zhang et al, 2018 [94] | Identify clinical named entity | RNN | CRFi and bidirectional LSTMj-CRF achieved a precision of 0.9203 and 0.9112, recall of 0.8709 and 0.8974, and F-score of 0.8949 and 0.9043, respectively |
Jagannatha et al, 2016 [95] | Prediction based on sequence labeling | RNN | Prediction model improved detection of the exact phrase for various medical entities |
Jagannatha et al, 2016 [96] | Extraction of medical events | RNN | Cross-validated microaverages of precision, recall, and F-score for all medical tags for gated recurrent unit–documents were 0.812, 0.7938, and 0.8031, respectively, which are higher than those of other methods |
Rajkomar et al, 2018 [97] | Representation of patients’ records | RNN | Achieved high accuracy for tasks such as predicting in-hospital mortality, prolonged length of stay, and all of a patient’s final discharge diagnoses |
Hou et al, 2018 [98] | Extraction of drug-drug interaction | RNN | DL can efficiently aid in information extraction (drug-drug interaction) from text. The F-score ranged from 49% to 81% |
Choi et al, 2015 [99] | Predicting clinical events | RNN | On the basis of separate blind test set evaluation, the model can perform differential diagnosis with up to 79% recall, which is significantly higher than several baselines |
Choi et al, 2016 [100] | Detection of heart failure onset | RNN | When using an 18-month observation window, the AUC for the RNN model increased to 0.883 and was significantly higher than the 0.834 AUC for the best of the baseline methods |
Volkova et al, 2017 [101] | Forecasting influenza-like illness | RNN | LSTM model outperformed previously used models in all metrics, for example, Pearson correlation (0.79), RMSEk (0.01), RMSPEl (29.52), and MAPEm (69.54) |
Yadav et al, 2016 [102] | Patient data deidentification | RNN | The proposed approach achieved the best performance, with recall, precision, and F-score of 89.63, 90.73, and 90.18, respectively |
Hassanien et al, 2013 [103] | Classification of diagnoses | RNN | Models outperformed several strong baselines, including a multilayer perceptron trained on hand-engineered features |
Li et al, 2014 [104] | Identifying informative risk factors and predicting bone disease | DBNn | Proposed framework predicted the progression of osteoporosis from risk factors and provided information to improve the understanding of the disease |
Che et al, 2015 [105] | Detection of characteristic patterns of physiology | DBN | The empirical efficacy of the technique was demonstrated on 2 real-world hospital datasets and the model was able to learn interpretable and clinically relevant features |
Tran et al, 2015 [106] | Harness electronic health record with minimal human supervision | DBMo | The model achieved F-scores of 0.21 for moderate-risk and 0.36 for high-risk, which are significantly higher than those obtained by clinicians and competitive with the results obtained by support vector machine |
Miotto et al, 2016 [107] | Predict the future of patients | AEp | Results significantly outperformed those achieved using representations based on raw electronic health record data and alternative feature learning strategies |
Lv et al, 2016 [108] | Clinical relation extraction | AE | The proposed model was validated on the i2b2 2010 dataset. The DL method for feature optimization showed great potential |
Lasko et al, 2013 [109] | Inferring phenotypic patterns | AE | The model distinguished the uric acid signatures of gout and acute leukemia despite not being optimized for the task |
aCNN: convolutional neural network.
bAUC: area under the curve.
cNLP: natural language processing.
dDL: deep learning.
eNIH: National Institutes of Health.
fNLM: National Library of Medicine.
gRNN: recurrent neural network.
hMADE: medication and adverse drug events.
iCRF: conditional random fields.
jLSTM: long short-term memory.
kRMSE: root mean square error.
lRMSPE: root mean square percentage error.
mMAPE: mean absolute percentage error.
nDBN: deep belief network.
oDBM: deep Boltzmann machine.
pAE: autoencoder.