2020 Jul 27;26(1):23–34. doi: 10.1007/s10741-020-10007-3

Table 1.

Comparison of machine learning algorithms with traditional methods in the management of heart failure

Each entry lists author, journal, year, outcome, machine learning models vs conventional methods, and conclusion.
Classification of HF patients
  Austin PC, Journal of Clinical Epidemiology, 2013. Outcome: discrimination of HFpEF vs HFrEF.
    ML models (AUC): regression tree 0.683; bagged regression tree 0.733; random forest 0.751; boosted regression tree, depth 1/2/3/4: 0.752/0.768/0.772/0.774.
    Conventional (AUC): LR 0.780.
    Conclusion: conventional LR performed at least as well as modern methods.
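Nearly every comparison in this table is scored by AUC, which equals the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case (the Mann–Whitney formulation). A minimal stdlib sketch with illustrative scores (not data from any study above):

```python
def auc(pos_scores, neg_scores):
    """AUC = P(random positive scores higher than random negative),
    with ties counted as 1/2 (Mann-Whitney U formulation)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Toy scores (illustrative only): 8 of 9 positive/negative pairs
# are ranked correctly, so AUC = 8/9.
print(auc([0.9, 0.8, 0.6], [0.7, 0.4, 0.3]))
```

An AUC of 0.5 corresponds to random ranking, which is why models near that value (several appear later in this table) are considered uninformative.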
CRT response
  Kalscheur MM, Circ Arrhythm Electrophysiol, 2018. Outcome: all-cause mortality or HF hospitalization in CRT recipients.
    ML models (AUC): RF 0.74 (95% CI 0.72–0.76); SVM trained by sequential minimal optimization 0.67 (95% CI 0.65–0.68).
    Conventional (AUC): multivariate LR 0.67 (95% CI 0.65–0.69).
    Conclusion: the improvement in AUC for the RF model was statistically significant compared with the other models (p < 0.001).
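Confidence intervals around an AUC, like the 95% CIs quoted in this entry, are commonly obtained by percentile bootstrap: resample patients with replacement, recompute the AUC, and take the outer quantiles. A self-contained stdlib sketch on a toy cohort (not the study's data):

```python
import random

def auc(labels, scores):
    # P(random positive outranks random negative); ties count as 1/2
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample with replacement, recompute
    the AUC, and report the alpha/2 and 1-alpha/2 quantiles."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # resample must contain both classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

# Toy cohort (illustrative only): positives score higher on average
labels = [1] * 15 + [0] * 15
scores = [0.9 - 0.02 * i for i in range(15)] + [0.7 - 0.02 * i for i in range(15)]
lo, hi = bootstrap_auc_ci(labels, scores, n_boot=500, seed=1)
print(round(auc(labels, scores), 3), (round(lo, 3), round(hi, 3)))
```

The original study may have used a different interval method (e.g., DeLong); the bootstrap here is only one standard option.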
Data extraction
  Zhang R, BMC Med Inform Decis Mak, 2018. Outcome: HF information (NYHA class) extraction from clinical notes.
    ML models: RF with n-gram features, F-measure 93.78%, recall 92.23%, precision 95.40%; SVM, F-measure 93.52%, recall 93.21%, precision 93.84%.
    Conventional: LR, F-measure 90.42%, recall 90.82%, precision 90.03%.
    Conclusion: ML-based methods outperformed a rule-based method; the best ML method was an RF.
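The F-measure reported for these information-extraction models is the harmonic mean of precision and recall, which can be checked directly against the figures quoted above:

```python
def f_measure(precision, recall):
    # F1 score: harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

# Recomputed from the SVM and LR figures in the row above (in %):
print(round(f_measure(93.84, 93.21), 2))  # 93.52, matching the reported SVM F-measure
print(round(f_measure(90.03, 90.82), 2))  # 90.42, matching the reported LR F-measure
```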
HF diagnosis
  Nirschl JJ, PLoS One, 2018. Outcome: HF diagnosis using biopsy images.
    ML models (AUC): RF 0.952; deep learning 0.974.
    Conventional (AUC): pathologists 0.75.
    Conclusion: ML models outperformed conventional methods.
  Rasmy L, J Biomed Inform, 2018. Outcome: HF diagnosis.
    ML models (AUC): recurrent NN 0.822.
    Conventional (AUC): LR 0.766.
    Conclusion: ML outperformed conventional methods.
  Son CS, J Biomed Inform, 2012. Outcome: HF diagnosis.
    ML models: rough-set-based decision-making model, accuracy 97.5%, SENS 97.2%, SPE 97.7%, PPV 97.2%, NPV 97.7%, AUC 97.5%.
    Conventional: LR-based decision-making model, accuracy 88.7%, SENS 90.1%, SPE 87.5%, PPV 85.3%, NPV 91.7%, AUC 88.8%.
    Conclusion: ML models outperformed conventional methods.
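Entries like this one report the full set of 2x2-table diagnostic metrics. All of them derive from the four confusion-matrix counts; a stdlib sketch with hypothetical counts (not from any study above):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """The 2x2-table metrics used throughout this table:
    accuracy, SENS (sensitivity/recall), SPE (specificity),
    PPV (positive predictive value), NPV (negative predictive value)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sens": tp / (tp + fn),
        "spe": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical counts (illustrative only):
m = diagnostic_metrics(tp=45, fp=5, fn=15, tn=35)
print(m)  # accuracy 0.8, sens 0.75, spe 0.875, ppv 0.9, npv 0.7
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on disease prevalence in the study cohort, which is one reason these metrics are not directly comparable across studies.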
  Wu J, Med Care, 2010. Outcome: HF diagnosis.
    ML models: boosting with a less strict cut-off performed better than SVM.
    Conventional: the highest median AUC (0.77) was observed for LR with the Bayesian information criterion.
    Conclusion: LR and boosting were both superior to SVM.
Identification of HF patients
  Blecker S, JAMA Cardiology, 2016. Outcome: identification of HF patients.
    ML models: ML using notes and imaging reports; development set AUC 99%, SENS 92%, PPV 80%; validation set AUC 97%, SENS 84%, PPV 80%.
    Conventional: LR using structured data; development set AUC 96%, SENS 78%, PPV 80%; validation set AUC 95%, SENS 76%, PPV 80%.
    Conclusion: ML models improved identification of HF patients.
  Blecker S, J Card Fail, 2018. Outcome: identification of HF hospitalization.
    ML models: ML using both structured and unstructured data; development set AUC 99%, SENS 98%, PPV 43%; validation set AUC 99%, SENS 98%, PPV 34%.
    Conventional: LR using structured data, notes, and imaging reports; development set AUC 96%, SENS 98%, PPV 14%; validation set AUC 96%, SENS 98%, PPV 15%.
    Conclusion: ML models performed better in identifying decompensated HF.
  Choi E, Journal of AMIA, 2017. Outcome: predicting HF diagnosis from EHR.
    ML models (AUC, 12-month observation window): NN model 0.777; MLP with 1 hidden layer 0.765; SVM 0.743; K-NN 0.730.
    Conventional (AUC, 12-month observation window): LR 0.747.
    Conclusion: ML models performed better in detecting incident HF with a short observation window of 12–18 months.
Prediction of outcomes
  Austin PC, Biom J, 2012. Outcome: 30-day mortality.
    ML models (AUC): regression tree 0.674; bagged trees 0.713; random forests 0.752; boosted trees, depth 1/2/3/4: 0.769/0.788/0.801/0.811.
    Conventional (AUC): LR 0.773.
    Conclusion: ensemble methods from the data mining and ML literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional LR models.
  Austin PC, J Clin Epidemiol, 2010. Outcome: in-hospital mortality.
    ML models (AUC): regression trees 0.620–0.651.
    Conventional (AUC): LR models 0.747–0.775.
    Conclusion: LR predicted in-hospital mortality in patients hospitalized with HF more accurately than did the regression trees.
  Awan SE, ESC Heart Failure, 2019. Outcome: 30-day readmissions.
    ML models (AUC): MLP 0.62; weighted random forest 0.55; weighted decision trees 0.53; weighted SVM 0.54.
    Conventional (AUC): LR 0.58.
    Conclusion: the proposed MLP-based approach was superior to the other ML and regression techniques.
  Fonarow GC, JAMA, 2005. Outcome: in-hospital mortality.
    ML models (AUC): CART model, derivation cohort 68.7%, validation cohort 66.8%.
    Conventional (AUC): LR model, derivation cohort 75.9%, validation cohort 75.7%.
    Conclusion: based on AUC, the accuracy of the CART model was modestly less than that of the more complicated LR model.
  Frizzell JD, JAMA Cardiol, 2016. Outcome: 30-day readmissions.
    ML models (c-statistic): tree-augmented naive Bayesian network 0.618; RF 0.607; gradient-boosted model 0.614; least absolute shrinkage and selection operator (LASSO) model 0.618.
    Conventional (c-statistic): LR 0.624.
    Conclusion: ML methods showed limited predictive ability.
  Golas SB, BMC Med Inform Decis Mak, 2018. Outcome: 30-day readmissions.
    ML models (AUC): gradient boosting 0.650 ± 0.011; maxout networks 0.695 ± 0.016; deep unified networks 0.705 ± 0.015.
    Conventional (AUC): LR 0.664 ± 0.015.
    Conclusion: deep learning techniques performed better than other traditional techniques.
  Hearn J, Circ Heart Fail, 2018. Outcome: clinical deterioration (i.e., the need for mechanical circulatory support, listing for heart transplantation, or mortality from any cause).
    ML models (AUC, 95% CI): ppVo2 0.800 (0.753–0.838); staged LASSO 0.827 (0.785–0.867); staged NN 0.835 (0.795–0.880); breath-by-breath (BxB) LASSO 0.816 (0.767–0.866); BxB NN 0.842 (0.794–0.882).
    Conventional (AUC, 95% CI): CPET risk score 0.759 (0.709–0.799).
    Conclusion: the NN incorporating breath-by-breath data achieved the best performance.
  Kwon JM, Echocardiography, 2019. Outcome: in-hospital mortality.
    ML models (AUC): deep learning 0.913; RF 0.835.
    Conventional (AUC): LR 0.835; MAGGIC score 0.806; GWTG score 0.783.
    Conclusion: the echocardiography-based deep learning model predicted in-hospital mortality among HF patients more accurately than existing prediction models.
  Phillips KT, AMIA Annu Symp Proc, 2005. Outcome: mortality.
    ML models (AUC): nearest neighbor 0.823; NN 0.802; decision tree 0.4975.
    Conventional (AUC): stepwise LR 0.734.
    Conclusion: data mining methods outperformed multiple logistic regression and traditional epidemiological methods.
  Mortazavi BJ, Circ Cardiovasc Qual Outcomes, 2016. Outcome: HF readmissions.
    ML models (c-statistic): boosting 0.678.
    Conventional (c-statistic): LR 0.543.
    Conclusion: boosting improved the c-statistic by 24.9% over LR.
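The quoted 24.9% is a relative gain, i.e., the c-statistic difference expressed as a fraction of the LR baseline rather than an absolute difference:

```python
# Relative improvement of boosting (0.678) over LR (0.543), in percent,
# recomputed from the figures in the row above
relative_gain_pct = (0.678 - 0.543) / 0.543 * 100
print(round(relative_gain_pct, 1))  # 24.9
```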
  Myers J, Int J Cardiol, 2014. Outcome: cardiovascular death.
    ML models (AUC): artificial NN 0.72; Cox PH models 0.69.
    Conventional (AUC): LR 0.70.
    Conclusion: an artificial NN model slightly improves upon conventional methods.
  Panahiazar M, Stud Health Technol Inform, 2015. Outcome: 5-year mortality.
    ML models (AUC, baseline/extended feature set): RF 62%/72%; decision tree 50%/50%; SVM 55%/38%; AdaBoost 61%/68%.
    Conventional (AUC, baseline/extended feature set): LR 61%/73%.
    Conclusion: LR and RF returned the most accurate models.
  Subramanian D, Circ Heart Fail, 2011. Outcome: 1-year mortality.
    ML models (c-statistic): ensemble model using gentle boosting with 10-fold cross-validation, 84%.
    Conventional (c-statistic): multivariate LR model using time-series cytokine measurements, 81%.
    Conclusion: the ensemble model showed significantly better performance.
  Taslimitehrani V, J Biomed Inform, 2016. Outcome: 5-year survival.
    ML models: SVM, precision 0.2, recall 0.5, accuracy 0.66; CPXR (log), precision 0.721, recall 0.615, accuracy 0.809.
    Conventional: LR, precision 0.513, recall 0.506, accuracy 0.717.
    Conclusion: CPXR outperformed logistic regression, SVM, random forest, and AdaBoost.
  Turgeman L, Artif Intell Med, 2016. Outcome: hospital readmissions.
    ML models (AUC, train/test): NN 0.589/0.639; naïve Bayes 0.699/0.676; SVM 0.768/0.643; CART decision tree 0.529/0.556; C5 ensemble 0.714/0.693; CHAID decision tree 0.671/0.691.
    Conventional (AUC, train/test): LR 0.642/0.699.
    Conclusion: a dynamic mixed-ensemble model combined a boosted C5.0 model as the base ensemble classifier with an SVM as a secondary classifier to control classification error for the minority class.
  Wong W, Scientific World Journal, 2003. Outcome: mortality (365-day models).
    ML models (AUC): MLP 69%; radial basis function network 67%.
    Conventional (AUC): LR 60%.
    Conclusion: NNs outperformed LR in out-of-sample prediction.
  Yu S, Artif Intell Med, 2015. Outcome: 30-day HF readmissions.
    ML models (AUC): linear SVM 0.65; polynomial SVM 0.61; Cox PH 0.63.
    Conventional (AUC): industry-standard method (LACE) 0.56.
    Conclusion: the ML models performed better than the standard method.
  Zhang J, Int J Cardiol, 2013. Outcome: death or hospitalization.
    ML models (AUC): decision trees 79.7%.
    Conventional (AUC): LR 73.8%.
    Conclusion: decision trees tended to perform better than LR models.
  Zhu K, Methods Inf Med, 2015. Outcome: 30-day readmissions.
    ML models (AUC): RF 0.577; SVM 0.560; conditional LR 1, 0.576; conditional LR 2, 0.608; conditional LR 3, 0.615.
    Conventional (AUC): standard LR 0.547; stepwise LR 0.539.
    Conclusion: LR combined with ML outperformed standard classification models.
  Zolfaghar K, 2013 IEEE International Conference on Big Data, 2013. Outcome: HF readmissions.
    ML models (AUC): Multicare health systems model, RF 62.25%.
    Conventional (AUC): Multicare health systems model, LR 63.78%; Yale model, LR 59.72%.
    Conclusion: the ML random forest model did not outperform the traditional LR model.

AUC area under the receiver operating characteristic curve, CART classification and regression tree, CPET cardiopulmonary exercise test, CRT cardiac resynchronization therapy, EHR electronic health record, HF heart failure, HFpEF heart failure with preserved ejection fraction, HFrEF heart failure with reduced ejection fraction, K-NN k-nearest neighbors, LASSO least absolute shrinkage and selection operator, LR logistic regression, ML machine learning, MLP multilayer perceptron, NN neural network, NPV negative predictive value, NYHA New York Heart Association, PH proportional hazards, PPV positive predictive value, ppVo2 predicted peak oxygen uptake, RF random forest, SENS sensitivity, SPE specificity, SVM support vector machine