Abstract
Heart failure is a serious condition with high prevalence (about 2% of the adult population in developed countries, and more than 8% in patients older than 75 years). About 3–5% of hospital admissions are linked with heart failure incidents. Heart failure is the leading cause of hospital admission encountered by healthcare professionals in their clinical practice. The costs are very high, reaching up to 2% of the total health costs in developed countries. Building an effective disease management strategy requires analysis of large amounts of data, early detection of the disease, assessment of its severity and early prediction of adverse events. This will inhibit the progression of the disease, improve the quality of life of the patients and reduce the associated medical costs. Machine learning techniques have been employed toward this direction. The aim of this paper is to present the state of the art of the machine learning methodologies applied for the assessment of heart failure. More specifically, models predicting the presence of heart failure, estimating its subtype, assessing its severity and predicting the presence of adverse events, such as destabilizations, re-hospitalizations, and mortality, are presented. To the authors' knowledge, this is the first time that such a comprehensive review, focusing on all aspects of the management of heart failure, is presented.
Keywords: Heart failure, Diagnosis, Prediction, Severity estimation, Classification, Data mining
1. Introduction
Heart failure (HF) is a complex clinical syndrome rather than a single disease. It prevents the heart from fulfilling the circulatory demands of the body, since it impairs the ability of the ventricle to fill with or eject blood. It is characterized by symptoms, such as breathlessness, ankle swelling and fatigue, that may be accompanied by signs, for example elevated jugular venous pressure, pulmonary crackles and peripheral edema, caused by structural and/or functional cardiac or non-cardiac abnormalities. HF is a serious condition associated with high morbidity and mortality rates. According to the European Society of Cardiology (ESC), 26 million adults globally are diagnosed with HF, while 3.6 million are newly diagnosed every year. 17–45% of the patients suffering from HF die within the first year and most of the remaining patients die within 5 years. The costs related to HF management are approximately 1–2% of all healthcare expenditure, most of them linked with recurrent hospital admissions [1], [2], [3].
The increased prevalence, the escalated healthcare costs, the repeated hospitalizations, the reduced quality of life (QoL) and the early mortality have transformed HF into an epidemic in Europe and worldwide and highlight the need for early diagnosis (detection of the presence of HF and estimation of its severity) and effective treatment. In clinical practice, medical diagnosis, including a careful history and physical examination, is supported by ancillary tests, such as blood tests, chest radiography, electrocardiography and echocardiography [4]. The combination of data produced by this diagnostic procedure has resulted in the formulation of several criteria (e.g. the Framingham, Boston, Gothenburg and ESC criteria) determining the presence of HF [5]. Once the diagnosis of HF is established, the experts classify the severity of HF using either the New York Heart Association (NYHA) or the American College of Cardiology/American Heart Association (ACC/AHA) Guidelines classification system, since this classification allows them to determine the most appropriate treatment to be followed (medication, guidelines regarding nutrition, physical activity and exercise) [6].
Although there has been significant progress in understanding the complex pathophysiology of HF, the quantity and complexity of the data and information to be analyzed and managed make the accurate and efficient diagnosis of HF and the assessment of therapeutic regimens quite challenging and complicated tasks. Those factors, in combination with the positive effects of early diagnosis of HF (which allows experts to design an effective and possibly successful treatment plan, prevents the condition from worsening, affects the patient's health positively, improves the patient's QoL and contributes to a decrease in medical costs), are the reasons behind the enormous increase in the application of machine learning techniques to analyze, predict and classify medical data. Classification methods are among the data mining techniques that have gained the interest of research groups. Accurate classification of disease stage, etiology or subtype allows treatments and interventions to be delivered in an efficient and targeted way and permits assessment of the patient's progress.
Focusing on HF, different data mining techniques have been employed to differentiate patients with HF from controls, to recognize the different HF subtypes (e.g. HF with reduced ejection fraction, HF with preserved ejection fraction) and to estimate the severity of HF (NYHA class) (Fig. 1). Additionally, data mining techniques can be advantageous even if HF is diagnosed at a late stage, where the therapeutic benefits of interventions and the prospect of survival are limited, since they allow the timely prediction of mortality, morbidity and risk of readmission. Data recorded in the subjects' health records, such as demographic information, clinical history, presenting symptoms, physical examination results, laboratory data and electrocardiogram (ECG) analysis results, are employed. An extended review of the studies reported in the literature addressing the above mentioned issues (HF detection, severity estimation, prediction of adverse events) through the utilization of machine learning techniques is presented in this paper.
The systematic literature review was based on the following sources: i) PubMed, ii) Scopus, iii) ScienceDirect, iv) Google Scholar and v) Web of Science (WoS), using as keywords the phrases “detection of HF”, “severity estimation of HF”, “HF subtypes classification”, “prediction of HF destabilizations”, “prediction of HF relapses”, “prediction of HF mortality” and “prediction of HF re-hospitalizations”.
The studies reported in the literature were selected based on the following criteria: i) they focus on heart failure and not on any other heart disease, ii) they are written in English, iii) they were published from 2000 (inclusive) until present, iv) they cover different geographical locations, v) they employ machine learning techniques, vi) they employ Electronic Health Records, published databases, observational or trial data, etc., for development and validation, vii) they provide information regarding the evaluation measures and the validation method that was followed and viii) the response feature is either the differentiation of subjects into normal and HF, the differentiation of subjects into different HF subtypes, the estimation of the severity of HF, or the estimation of destabilization, re-admission or mortality. There is no restriction regarding the time frame of the prediction. Furthermore, studies addressing more than one aspect of HF management (e.g. detection and severity estimation of HF) were also included in this review. Studies not fulfilling more than one of the above mentioned criteria were excluded.
2. Detection of HF
According to the ESC guidelines [1], the algorithm to diagnose HF in a non-acute setting is the following. First, the probability of HF is estimated based on the prior clinical history of the patient, the presenting symptoms, the physical examination and the resting ECG. If all elements are normal, HF is highly unlikely. If at least one element is abnormal, plasma natriuretic peptides should be measured. This measurement allows the experts to identify those patients who need echocardiography. By applying machine learning techniques to the available data, the process of diagnosing HF can be (i) less time consuming, (ii) supported and (iii) performed with the same accuracy. More specifically, the detection of HF is expressed as a two-class classification problem where the output of the classifiers is the presence or not of HF.
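Since the workflow above is essentially a small decision procedure, it can be summarized in code. The following is only a minimal sketch of that flow; the function name, the boolean input and the natriuretic peptide cut-off are illustrative assumptions, and the exact thresholds should be taken from the ESC guideline itself:

```python
def esc_nonacute_workup(any_element_abnormal, natriuretic_peptide=None, np_cutoff=125.0):
    """Sketch of the non-acute ESC diagnostic flow described above.

    any_element_abnormal: True if clinical history, symptoms, examination or
    resting ECG show at least one abnormality.
    np_cutoff: assumed plasma natriuretic peptide cut-off (pg/mL), illustrative only.
    """
    if not any_element_abnormal:
        return "HF unlikely"                       # all elements normal
    if natriuretic_peptide is None:
        return "measure plasma natriuretic peptides"
    if natriuretic_peptide >= np_cutoff:
        return "refer for echocardiography"        # identifies patients needing echo
    return "HF unlikely"

print(esc_nonacute_workup(True, natriuretic_peptide=310.0))
```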
Most of the studies reported in the literature focus on the utilization of heart rate variability (HRV) measures to classify a subject as normal or as a patient with HF. Those methods are presented in Table 1. The main difference between those methods lies in the HRV features which are employed to detect HF.
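As a concrete illustration of the kind of HRV features listed in Table 1, the sketch below computes a few standard time-domain measures (SDNN, RMSSD, pNN50) from a series of RR intervals; the function and the synthetic RR values are purely illustrative and are not taken from any of the cited studies:

```python
import numpy as np

def time_domain_hrv(rr_ms):
    """Compute a few standard time-domain HRV measures from RR intervals (ms)."""
    rr = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr)                      # successive RR-interval differences
    return {
        "SDNN":  rr.std(ddof=1),            # overall variability
        "RMSSD": np.sqrt(np.mean(diff**2)), # short-term (beat-to-beat) variability
        "pNN50": np.mean(np.abs(diff) > 50) * 100,  # % of successive differences > 50 ms
        "meanHR": 60000.0 / rr.mean(),      # mean heart rate (bpm)
    }

# Example with a synthetic RR series (values in milliseconds)
print(time_domain_hrv([812, 790, 805, 830, 795, 810, 800, 825]))
```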
Table 1.
Authors | Method | Data | Features | Evaluation measures |
---|---|---|---|---|
Asyali et al. 2003 [7] | Linear discriminant analysis, Bayesian classifier | No. of data: 54 normal subjects, 29 patients with CHF. Source: RR interval databases at PhysioBank, including beat annotation files for long-term (~24 h) ECG recordings. Response feature: Normal vs. CHF. Validation: n/a | Long-term HRV measures | Observed agreement rate: 93.24%, Sensitivity (true positive): 81.82%, Specificity (true negative): 98.08%, kappa statistic: 0.832 (95% confidence interval: 0.689–0.974) |
Isler et al. 2007 [8] | Feature selection (genetic algorithm), min-max normalization, k-NN | No. of data: 54 normal subjects, 29 CHF subjects. Source: RR interval records at the MIT/BIH database, including beat annotation files for long-term (~24 h) ECG recordings. Response feature: Normal vs. CHF. Validation: leave-one-out cross-validation | Short-term HRV measures + wavelet entropy measures | k = 5: Sensitivity 96.43%, Specificity 96.36%, Accuracy 96.39%; k = 7: Sensitivity 100%, Specificity 94.74%, Accuracy 96.39% |
Thuraisingham 2009 [9] | Features from second order difference plot (SODP), k-NN | No. of data: 36 normal subjects, 36 CHF subjects. Source: MIT-BIH Normal Sinus Rhythm database, BIDMC Congestive Heart Failure database and Congestive Heart Failure RR Interval database. Response feature: Normal vs. CHF. Validation: n/a | Central tendency measure, standard deviation of the RR intervals, D (distance) | Success rate: 100% |
Elfadil et al. 2011 [10] | Supervised: multi-layer perceptron | No. of data: training, 53 normal sinus rhythm (NSR) and 17 CHF recordings; testing, 12 CHF and 12 normal subjects. Source: data randomly simulated from the Massachusetts Institute of Technology (MIT) database. Response feature: Normal vs. CHF. Validation: testing on 12 CHF and 12 normal subjects | Power spectral density: R1 (band 1), R2 (bands 2–3), R3 (bands 4–10), R4 (bands 11–16), R5 (bands 17–24), R6 (bands 25–32) | Sensitivity: 85.30%, Specificity: 82.00%, Accuracy: 83.65% |
 | Unsupervised: normalization, self-organizing map | No. of data: training, 1000 CHF and 1000 normal records simulated randomly from 17 CHF and 53 normal subjects; testing, 1000 CHF and 1000 normal records simulated randomly from 12 CHF and 12 normal subjects. Source: MIT database. Response feature: Normal vs. CHF. Validation: testing on 1000 CHF and 1000 normal simulated records | Power spectral density: R1 (band 1), R2 (bands 2–3), R3 (bands 4–10), R4 (bands 11–16), R5 (bands 17–24), R6 (bands 25–32) | Sensitivity: 89.10%, Specificity: 96.70%, Accuracy: 92.90% |
Pecchia et al. 2011 [11] | CART with feature selection | No. of data: 54 normal subjects, 29 CHF subjects. Source: normal subjects from the Normal Sinus Rhythm RR Interval Database, CHF group from the Congestive Heart Failure RR Interval Database. Response feature: Normal vs. CHF. Validation: leave-one-out cross-validation | Short-term HRV measures | Sensitivity: 89.70%, Specificity: 100.00% |
Mellilo et al. 2011 [12] | CART with feature selection | No. of data: 72 normal subjects, 44 CHF subjects. Source: normal subjects from the Normal Sinus Rhythm RR Interval Database and the MIT-BIH Normal Sinus Rhythm Database, CHF group from the Congestive Heart Failure RR Interval Database and the BIDMC Congestive Heart Failure Database. Response feature: Normal vs. CHF. Validation: 10-fold cross-validation | Long-term HRV measures | Sensitivity: 89.74%, Specificity: 100.00% |
Jovic et al. 2011 [13] | SVM, MLP, C4.5, Bayesian classifiers | No. of data: 25 normal subjects, 25 CHF subjects. Source: BIDMC Congestive Heart Failure Database, MIT-BIH Normal Sinus Rhythm Database, Normal Sinus Rhythm RR Interval Database. Response feature: Normal vs. CHF. Validation: 10 × 10-fold cross-validation | Correlation dimension, spatial filling index, central tendency measure, approximate entropy (four features), standard deviation of the NN (R-R) intervals (SDNN), root mean square of successive R-R interval differences (RMSSD), proportion of successive R-R interval differences greater than 20 ms (pNN20), HRV triangular index | SVM: Sensitivity 77.2%, Specificity 87.4%; MLP: Sensitivity 96.6%, Specificity 97.8%; C4.5: Sensitivity 99.2%, Specificity 98.4%; Bayesian: Sensitivity 98.4%, Specificity 99.2% |
Yu et al. 2012 [14] | Feature selection (UCMIFS, MIFS, CMIFS, mRMR), SVM | No. of data: 54 normal subjects, 29 CHF subjects. Source: congestive heart failure (CHF) and normal sinus rhythm (NSR) databases, both available on PhysioNet. Response feature: Normal vs. CHF. Validation: leave-one-out cross-validation | Long-term HRV measures + age and gender | All features: Sensitivity 93.10%, Specificity 98.14%, Accuracy 96.38%; UCMIFS: Sensitivity 96.55%, Specificity 98.14%, Accuracy 97.59%; MIFS: Sensitivity 93.10%, Specificity 98.14%, Accuracy 96.38%; CMIFS: Sensitivity 93.10%, Specificity 100.00%, Accuracy 97.59%; mRMR: Sensitivity 93.10%, Specificity 98.14%, Accuracy 96.38% |
Yu et al. 2012 [15] | Feature selection by genetic algorithm (GA), SVM | No. of data: 54 normal subjects, 29 CHF subjects. Source: congestive heart failure database (chf2db) and normal sinus rhythm database (nsr2db), both available on PhysioNet. Response feature: Normal vs. CHF. Validation: leave-one-out cross-validation | Bispectral analysis based features | RBF kernel: Sensitivity 95.55%, Specificity 100%; Linear kernel: Sensitivity 93.10%, Specificity 98.14% |
Liu et al. 2014 [16] | Feature selection, feature normalization, feature combination, SVM & k-NN | No. of data: 30 normal subjects, 17 CHF subjects. Source: normal subjects from the Normal Sinus Rhythm RR Interval Database, CHF group from the Congestive Heart Failure RR Interval Database. Response feature: Normal vs. CHF. Validation: cross-validation | Short-term HRV measures | SVM: Accuracy 100.00%, Precision 100.00%, Sensitivity 100.00%; k-NN: Accuracy 91.49%, Precision 94.12%, Sensitivity 84.21% |
Narin et al. 2014 [17] | Filter-based backward elimination feature selection; SVM, k-NN, LDA, MLP, RBF classifiers | No. of data: 54 normal subjects, 29 CHF subjects. Source: normal sinus rhythm and congestive heart failure RR interval databases from the MIT/BIH database on PhysioNet. Response feature: Normal vs. CHF. Validation: leave-one-out cross-validation | Short-term HRV measures + wavelet transform measures | SVM: Sensitivity 82.75%, Specificity 96.29%, Accuracy 91.56%; k-NN (k = 5): Sensitivity 65.51%, Specificity 96.29%, Accuracy 85.54%; polynomial LDA: Sensitivity 75.86%, Specificity 90.74%, Accuracy 85.54%; MLP: Sensitivity 82.75%, Specificity 92.59%, Accuracy 89.15%; RBF: Sensitivity 58.62%, Specificity 96.29%, Accuracy 93.13% |
Heinze et al. 2014 [18] | Feature extraction by power spectral density (PSD), conventional spectral analysis and ordinal patterns; Learning Vector Quantization (LVQ) classifier | No. of data: 54 normal subjects, 29 CHF subjects. Source: normal sinus rhythm and congestive heart failure RR interval databases from the MIT/BIH database on PhysioNet. Response feature: Normal vs. CHF. Validation: multiple hold-out validation (80% training data, 20% testing data) with 50 repetitions | HRV measures | PSD features: 13.6% error at 50 min; conventional spectral analysis features: 17.5% error at 60 min; ordinal patterns: 18% error at 45 min |
CHF: Congestive Heart Failure, CART: Classification and Regression Tree, UCMIFS: Uniform Conditional Mutual Information Feature Selection, CMIFS: Conditional Mutual Information Feature Selection, MIFS: Mutual Information Feature Selection, mRMR: min-redundancy max-relevance, SVM: Support Vector Machines, k-NN: k Nearest Neighbors, RBF: Radial Basis Function, MLP: Multi-Layer Perceptron, LDA: Linear Discriminant Analysis, HRV: Heart Rate Variability.
Yang et al. 2010 [19] proposed a scoring model which allows the detection of HF and the assessment of its severity. More specifically, two Support Vector Machine (SVM) models were built. The first model detects the presence or absence of HF (non-HF group vs. HF group). If the subject belongs to the non-HF group, the second model classifies the patient into a Healthy group or an HF-prone group. The output of the SVM models was mapped to a score value (described in Section 4, since the study focuses on the severity estimation of HF). If the score value produced by mapping the output of the first model (Score 1) is lower than 4 (score interval: 0–4), then the subject belongs to the non-HF group. If Score 1 is greater than 4 (score interval: 4–5.9), then the subject has HF (HF group). If Score 1 is lower than 4 and Score 2 (the score produced by mapping the output of the second SVM model) is lower than 2 (score interval: 0–2), then the patient belongs to the Healthy group. If Score 1 is lower than 4 and Score 2 is greater than 2 (score interval: 2–4), then the subject belongs to the HF-prone group (Fig. 2).
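The two-stage decision logic described above can be written compactly as follows; this only sketches the published score thresholds, and the example score values are invented:

```python
def classify_subject(score1, score2):
    """Decision cascade described for the two-SVM scoring model of Yang et al. [19].

    score1 comes from the first model (interval 0-5.9), score2 from the second
    model (interval 0-4); the scores themselves are produced elsewhere by
    mapping the SVM outputs (see Section 4).
    """
    if score1 > 4:          # score interval 4-5.9
        return "HF group"
    # Score 1 <= 4: non-HF group, refined by the second model
    if score2 > 2:          # score interval 2-4
        return "HF-prone group"
    return "Healthy group"  # score interval 0-2

print(classify_subject(4.7, 1.0))  # -> "HF group"
print(classify_subject(3.1, 2.8))  # -> "HF-prone group"
```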
Gharehchopogh et al. 2011 [20] utilized neural networks (NN) and a set of 40 subjects in order to detect HF. For each subject, gender, age, blood pressure, smoking habit and the annotation as normal or patient were available. 38 out of 40 subjects were correctly classified, resulting in a true positive rate of 95.00%, a false positive rate of 9.00%, precision of 95.00%, recall of 95.00%, F-measure of 94.00% and Area Under the Curve (AUC) of 95%.
Son et al. 2012 [4] studied the discriminative power of 72 variables, as well as of the risk factor pro-brain natriuretic peptide (Pro-BNP), in differentiating congestive heart failure (CHF) patients from patients with dyspnea. Rough sets and logistic regression were employed for the reduction of the feature space. A decision tree based classification was then applied to the feature set produced in the previous step. The experimental results showed that the rough sets based decision-making model had accuracy 97.5%, sensitivity 97.2%, specificity 97.7%, positive predictive value 97.2%, negative predictive value 97.7% and area under the receiver operating characteristic (ROC) curve 97.5%, while the corresponding values for the logistic regression decision-making model were accuracy 88.7%, sensitivity 90.1%, specificity 87.5%, positive predictive value 85.3%, negative predictive value 91.7% and area under the ROC curve 88.8%.
Masetic et al. 2016 [21] applied the Random Forests algorithm to long-term ECG time series in order to detect CHF. ECG signals were acquired from the Beth Israel Deaconess Medical Center (BIDMC) Congestive Heart Failure and the PTB Diagnostic ECG databases, both freely available on PhysioNet [22], while normal heartbeats were taken from 13 subjects of the MIT-BIH Arrhythmia database. Features were extracted from the ECG using the autoregressive Burg method. Besides Random Forests, the authors evaluated, on the same dataset, C4.5, SVM, Artificial Neural Network (ANN) and k-Nearest Neighbors (k-NN) classifiers, and the performance of the classifiers in terms of sensitivity, specificity, accuracy, F-measure and ROC curve was recorded and compared. The authors chose Random Forests due to its very good accuracy in classifying a subject as normal or CHF.
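The overall pipeline of [21], autoregressive coefficients extracted from the ECG followed by a Random Forests classifier, can be sketched as below. Note that the authors used the Burg method; here an ordinary autoregressive fit from statsmodels is used as a stand-in, and the segments and labels are synthetic placeholders:

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg
from sklearn.ensemble import RandomForestClassifier

def ar_features(segment, order=4):
    """Fit an autoregressive model to an ECG segment and return its coefficients.
    (The authors used the Burg method; a plain AR fit is used here as a stand-in.)"""
    fit = AutoReg(np.asarray(segment, dtype=float), lags=order).fit()
    return fit.params[1:]  # drop the intercept, keep the AR coefficients

# Hypothetical training data: one ECG segment per subject, labels 0 = normal, 1 = CHF
rng = np.random.default_rng(0)
segments = [rng.standard_normal(500) for _ in range(28)]
labels = [0] * 13 + [1] * 15

X = np.array([ar_features(s) for s in segments])
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
print(clf.predict(X[:3]))
```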
Wu et al. 2010 [23] and Aljaaf et al. 2015 [2] moved one step further and attempted to predict the presence of HF. Wu et al. 2010 [23] modeled the detection of HF more than 6 months before the actual date of clinical diagnosis. To achieve this, data from electronic health records of the Geisinger Clinic were employed. The electronic health records included data representing demographics, health behavior, use of care, clinical diagnoses, clinical measures, laboratory data and prescription orders for anti-hypertensives. The information was expressed by 179 independent variables. The authors compared SVM, Boosting and logistic regression models for their ability to predict HF early. Before the application of the classifiers, feature selection was performed, with a different selection procedure depending on the classifier. For logistic regression, variable selection was based on minimizing the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), while the L1-norm variable selection technique was used in the case of SVM. AUC was measured and the results indicated that the AUCs were similar for logistic regression and Boosting. The highest median AUC (77.00%) was observed for logistic regression with BIC and for Boosting with a less strict cut-off.
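A hedged sketch of the kind of comparison performed by Wu et al., cross-validated AUC for logistic regression, an L1-penalized linear SVM (a rough analogue of the L1-norm variable selection they describe) and a boosting model, is given below on synthetic data with 179 variables; none of the modelling choices shown are taken from the original paper:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the 179-variable EHR dataset described above
X, y = make_classification(n_samples=1000, n_features=179, n_informative=20, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=2000),
    "L1 linear SVM": LinearSVC(penalty="l1", dual=False, max_iter=5000),
    "boosting": GradientBoostingClassifier(),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: median AUC = {np.median(auc):.3f}")
```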
Aljaaf et al. 2015 [2] proposed a multi-level risk assessment of developing HF. The proposed model could predict five risk levels of HF (1: no risk, 2: low risk, 3: moderate risk, 4: high risk, 5: extremely high risk) using the C4.5 decision tree classifier. The Cleveland Clinic Foundation heart disease dataset was used. The authors enhanced the dataset with three new attributes (risk factors), namely obesity, physical activity and smoking. The dataset included 160 instances of risk level 1, 35 instances of risk level 2, 54 instances of risk level 3, 35 instances of risk level 4 and 13 instances of risk level 5. For the evaluation of the C4.5 classifier, a 10-fold cross-validation procedure was followed. The overall precision of the proposed approach is 86.30%, while the precision for predicting each one of the above mentioned risk levels is 89.00%, 86.50%, 72.00%, 90.90% and 100.00%, respectively.
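The evaluation protocol described for [2], a decision tree assessed with 10-fold cross-validation and per-class precision over five risk levels, can be sketched as follows; scikit-learn's CART-style tree is used here in place of C4.5, and the data are synthetic:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_score

# Synthetic stand-in for the five HF risk levels (1: no risk ... 5: extremely high risk)
X, y = make_classification(n_samples=297, n_features=16, n_informative=8,
                           n_classes=5, n_clusters_per_class=1, random_state=0)

# scikit-learn's CART-style tree stands in for the C4.5 classifier used by the authors
tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
pred = cross_val_predict(tree, X, y, cv=10)          # 10-fold cross-validated predictions
print(precision_score(y, pred, average=None))        # per-class precision
print(precision_score(y, pred, average="weighted"))  # overall precision
```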
Zheng et al. 2015 [24] proposed a computer assisted system for the diagnosis of CHF. The system employs a Least Squares SVM (LS-SVM) and is trained and tested utilizing heart sound and cardiac reserve features. The results of the LS-SVM classifier were compared with those produced by an ANN and by Hidden Markov Models, indicating the superiority of the LS-SVM approach.
A short presentation of the above mentioned studies is provided in Table 2.
Table 2.
Authors | Method | Data | Features | Evaluation measures |
---|---|---|---|---|
Yang et al. 2010 [19] | Scoring model using SVM | No. of data: 153 subjects (65 HF, 30 HF-prone, 58 healthy). Source: data collected at Zhejiang Hospital. Response feature: Non-HF group (Healthy group or HF-prone group) vs. HF group. Validation: 90 subjects used as test cases | Parameters selected from clinical tests, i.e. blood test, heart rate variability test, echocardiography test, electrocardiography test, chest radiography test, 6-min walk distance test and physical test | SVM model 1: Sensitivity 75%, Specificity 94%, Youden's index 69%; SVM model 2: Sensitivity 100%, Specificity 80%, Youden's index 80% |
Gharehchopogh et al. 2011 [20] | Neural networks | No. of data: 40 subjects (26 normal, 14 HF). Source: data collected at a referral health center in a region of Tabriz. Response feature: HF yes vs. HF no. Validation: testing set | Gender, age, blood pressure, smoking habit | Training set: True positive rate 95.00%, False positive rate 9.00%, Precision 95.00%, Recall 95.00%, F-measure 94.00%, AUC 95%; Testing set: prediction percentage 85% |
Son et al. 2012 [4] | Rough sets based decision model; logistic regression based decision model | No. of data: 159 subjects (71 CHF, 88 normal). Source: data collected at the emergency medical center of Keimyung University Dongsan Hospital. Response feature: Normal vs. CHF. Validation: 10-fold cross-validation | Laboratory findings | Rough sets based model: Accuracy 97.5%, Sensitivity 97.2%, Specificity 97.7%, Positive predictive value 97.2%, Negative predictive value 97.7%, Area under ROC curve 97.5%; Logistic regression based model: Accuracy 88.7%, Sensitivity 90.1%, Specificity 87.5%, Positive predictive value 85.3%, Negative predictive value 91.7%, Area under ROC curve 88.8% |
Masetic et al. 2016 [21] | Random Forests, SVM, C4.5, ANN, k-NN | No. of data: 15 CHF subjects, 13 normal subjects. Source: Beth Israel Deaconess Medical Center (BIDMC) Congestive Heart Failure and PTB Diagnostic ECG databases; normal heartbeats taken from the MIT-BIH Arrhythmia database. Response feature: Normal vs. CHF. Validation: 10-fold cross-validation | Features extracted from raw ECG using the Burg autoregressive method | BIDMC Congestive Heart Failure + MIT-BIH Arrhythmia databases: ROC area 100%, F-measure 100%, Accuracy 100%; PTB Diagnostic ECG + MIT-BIH Arrhythmia databases: ROC area 100%, F-measure 100%, Accuracy 100% |
Zheng et al. 2015 [24] | Wavelet transform for heart sound signals; Least Square Support Vector Machine (LS-SVM), neural network, hidden Markov model | No. of data: 64 CHF subjects, 88 healthy volunteers. Source: Chongqing University and the First and Second Affiliated Hospitals of Chongqing University of Medical Sciences. Response feature: Normal vs. CHF. Validation: double-fold cross-validation | Heart sound and cardiac reserve features: ratio of diastolic to systolic duration; ratio of the amplitude of the first heart sound to that of the second heart sound; width of the multifractal spectrum; frequency corresponding to the maximum peak of the normalized PSD curve; adaptive sub-band energy fraction | LS-SVM: Accuracy 95.39%, Sensitivity 96.59%, Specificity 93.75% |
SVM: Support Vector Machines, HF: Heart Failure, CHF: Congestive Heart Failure, ANN: Artificial Neural Networks, ROC: Receiver Operating Characteristic, AUC: Area Under Curve, LS-SVM: Least Square Support Vector Machine, k-NN: k-Nearest Neighbors.
3. HF Subtypes Classification
Once HF is detected, the etiology or the subtype of HF can be estimated. According to the HF guidelines, the etiology of HF is diverse within and among world regions, and there is no agreed single classification system for the causes of HF, with much overlap between potential categories. HF manifests at least two major subtypes, which are commonly distinguished based on the measurement of the left ventricular ejection fraction (LVEF) [25]. Patients with LVEF greater than or equal to 50% are characterized as patients with HF with preserved ejection fraction (HFpEF), while patients with LVEF lower than 40% are characterized as patients with HF with reduced ejection fraction (HFrEF). When the LVEF lies between 40 and 49%, the patient belongs to the so-called “gray zone”, which is defined as HF with mid-range ejection fraction (HFmrEF).
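The subtype definitions above reduce to simple LVEF thresholds, as in the following sketch:

```python
def hf_subtype(lvef_percent):
    """Assign the HF subtype from the left ventricular ejection fraction (LVEF),
    following the thresholds described above."""
    if lvef_percent >= 50:
        return "HFpEF"   # preserved ejection fraction
    if lvef_percent >= 40:
        return "HFmrEF"  # mid-range ejection fraction ("gray zone")
    return "HFrEF"       # reduced ejection fraction

print(hf_subtype(55), hf_subtype(45), hf_subtype(30))  # HFpEF HFmrEF HFrEF
```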
Machine learning techniques have been applied to classify HF subtypes, although this approach to the classification of HF subtypes has emerged only in the last three years. Austin et al. 2013 [26] classified HF patients according to two disease subtypes (HFpEF vs. HFrEF) using different classification methods. More specifically, classification trees, bagged classification trees, Random Forests, boosted classification trees and SVM were employed. The training of the classifiers was performed using the EFFECT-1 sample of the Enhanced Feedback for Effective Cardiac Treatment (EFFECT) study, while the EFFECT-2 sample was used for validation. The two samples consist of 9,943 and 8,339 patients hospitalized with a diagnosis of HF, respectively. After removing subjects with missing values and subjects whose ejection fraction could not be determined, 3,697 patients for training and 4,515 patients for testing were finally employed. For each patient, 34 variables were recorded, expressing information regarding demographic characteristics, vital signs, presenting signs and symptoms, laboratory data and previous medical history. The results indicate that patients can be classified into one of the two mutually exclusive subtypes with a 69.6% positive predictive value using the Random Forests classifier.
Betanzos et al. 2015 [25] applied machine learning techniques to classify HF subtypes using the concept of the Volume Regulation Graph (VRG) domain rather than the single use of the ejection fraction (EF). More specifically, they used both the metric EF and the basic variables that define the EF, namely the end systolic volume (ESV) and the end diastolic volume (EDV). This approach allowed them to overcome the limitations inherent to the use of EF, which neglects the importance of the left ventricular cavity volume. From those data, the end systolic volume index (ESVI) was computed, and through the application of machine learning techniques the validity of ESVI as an index for discriminating between HFpEF and HFrEF patients was examined. Both supervised and unsupervised techniques were applied. K-means using the Euclidean distance, Expectation-Maximization (EM) and the sequential Information Bottleneck algorithm (sIB) were used to perform discrimination in an unsupervised manner. Supervised classifiers, such as SVM, SVMPEGASOS, Nearest Neighbors (IB1), NNGE (a nearest neighbor-like algorithm using non-nested generalized exemplars), the rule based algorithm OneR, C4.5, PART and the Naive Bayes classifier, were tested and compared. The authors employed two datasets for the evaluation of the above mentioned machine learning techniques. The first dataset included data from 48 real patients (35 belonging to the HFpEF class and 13 to the HFrEF class), while the second dataset included simulated data, generated using a Monte Carlo simulation approach, corresponding to 63 instances (34 from the HFpEF class and 29 from the HFrEF class). The results of the unsupervised methods revealed interesting dividing patterns of the two subtypes, while the SVMPEGASOS algorithm was selected for the classification of the patients, since it produced the best results in terms of training and test error. Based on those results, Betanzos et al. 2015 [25] concentrated on the SVMPEGASOS algorithm toward examining how the results are differentiated when patients belonging to the “gray zone” are included. They set different cutoff points (EF at 40, 45, 50 and 55%). The SVMPEGASOS model was trained using the first dataset described previously and was tested on a new dataset of simulated data corresponding to 403 instances, among which 150 refer to the HFpEF class, 137 to the HFrEF class and 116 to the HFmrEF class. The utilization of the different cutoff points differentiates the number of samples belonging to the two classes. The results indicated that ESV can act as a discriminator even when patients with HFmrEF are included.
Isler 2016 [27] performed a heart rate variability analysis in order to distinguish patients with systolic CHF from patients with diastolic CHF. More specifically, short-term HRV measures were given as input to nearest neighbors and multi-layer perceptron classifiers. Eight different configurations were applied, combining each classifier with the presence or absence of heart rate normalization and of min-max normalization. 18 patients with systolic and 12 patients with diastolic CHF were enrolled in the study. The leave-one-out cross-validation method was followed and the best accuracy was achieved using the multi-layer perceptron.
Shah et al. 2015 [28] focused on the distinction of HFpEF subtypes. They enrolled 397 HFpEF patients and performed detailed clinical, laboratory and electrocardiographic phenotyping of the participating patients. The 67 extracted continuous variables were given as input to statistical learning algorithms (e.g. unbiased hierarchical cluster analysis) and to penalized model-based clustering. The analysis revealed 3 distinct pheno-groups differing in clinical characteristics, cardiac structure and function, hemodynamics and outcomes.
A short presentation of the methods for HF subtype classification is presented in Table 3.
Table 3.
Authors | Method | Data | Features | Evaluation measures |
---|---|---|---|---|
Austin et al. 2013 [26] | Random Forests | No. of data: 3,697 patients for training, 4,515 patients for testing. Source: data collected during the Enhanced Feedback for Effective Cardiac Treatment (EFFECT) study. Response feature: HFpEF vs. HFrEF. Validation: testing set (EFFECT-2 sample of 8,339 subjects) | Demographic characteristics, vital signs, presenting signs and symptoms, results of laboratory investigations, and previous medical history | Sensitivity: 37.8%, Specificity: 89.7%, PPV: 69.6%, NPV: 69.7% |
Betanzos et al. 2015 [25] | SVMPEGASOS | No. of data: 48 real patients (35 HFpEF, 13 HFrEF) for training; 63 Monte Carlo simulated instances (34 HFpEF, 29 HFrEF) for testing. Source: clinical study conducted at the Cardiovascular Center, OLV Clinic, Aalst, Belgium. Response feature: HFpEF vs. HFrEF. Validation: testing set of 63 instances; 10-fold cross-validation | End systolic volume index | Training error: 2.08%; Test error: 4.76% |
Betanzos et al. 2015 [25] | SVMPEGASOS | No. of data: 48 real patients (35 HFpEF, 13 HFrEF) for training; 403 Monte Carlo simulated instances (150 HFpEF, 137 HFrEF, 116 HFmrEF) for testing. Source: clinical study conducted at the Cardiovascular Center, OLV Clinic, Aalst, Belgium. Response feature: HFpEF vs. HFrEF, including patients belonging to the “gray zone”. Validation: testing set of 403 instances | End systolic volume index | True positive rate at EF cutoffs of 40/45/50/55%: HFpEF 100%/91%/98%/99%; HFrEF 87%/96%/97%/98% |
Isler 2016 [27] | Min-max normalization; k-NN, MLP | No. of data: 18 patients with systolic CHF, 12 patients with diastolic CHF. Source: Holter ECG data obtained from the Faculty of Medicine, Dokuz Eylul University. Response feature: systolic CHF vs. diastolic CHF. Validation: leave-one-out cross-validation | Short-term HRV measures | MLP: Sensitivity 93.75%, Specificity 100%, Accuracy 96.43%; k-NN: Sensitivity 87.50%, Specificity 91.67%, Accuracy 89.29% |
PPV: Positive Predictive Value, NPV: Negative Predictive Value, MLP: Multi-Layer Perceptron, k-NN: k Nearest Neighbors, SVM: Support Vector Machines, LS-SVM: Least Square SVM, HF: Heart Failure, CHF: Congestive Heart Failure, HRV: Heart Rate Variability, HFpEF: Heart Failure with preserved Ejection Fraction, HFrEF: Heart Failure with reduced Ejection Fraction, AUC: Area Under Curve.
4. Severity Estimation of HF
Due to the fact that HF is asymptomatic in its first stages, early assessment of the severity of HF becomes a crucial task. The most commonly employed classifications for HF severity are the NYHA classes and the ACC/AHA stages of HF. NYHA is based on symptoms and physical activity, while ACC/AHA describes HF stages based on structural changes and symptoms [6]. The two assessment methods provide useful and complementary information about the presence and severity of HF. More specifically, the ACC/AHA stages of HF emphasize the development and progression of the disease, whereas NYHA focuses on the exercise capacity of the patient and the symptomatic status of the disease [1].
The NYHA classification has been criticized due to the fact that it is based on subjective evaluation and thus intra-observer variability can be introduced [29]. According to the HF guidelines, an objective evaluation of the severity of HF can be provided by the combination of 2-D echocardiography with Doppler flow [1]. For the estimation of the severity of HF in the acute setting after myocardial infarction, the Killip classification can be utilized [1].
Studies reported in the literature address HF severity estimation through the utilization of machine learning techniques. Specifically, HF severity estimation is expressed either as a 2- or 3-class classification problem, depending on how the NYHA classes have been merged. Akinyokun et al. 2009 [30] proposed a neuro-fuzzy expert system for the severity estimation of HF. A multilayered feed-forward neural network was trained taking as input data from patients from three hospitals in Nigeria. For each patient, seventeen variables were recorded. A measure of the significance of each input variable to the output was computed in order to remove redundant information. Through this procedure six variables, expressing signs and symptoms of HF, were retained and the neural network was retrained using the selected variables. Fuzzy rules were then extracted from the trained datasets. The fuzzy-logic system employs the root mean square error method for drawing inference. The output of the neuro-fuzzy engine is given as input to the decision support engine, which aims to optimize the final decision value. The decision support engine applies a cognitive and an emotional filter, corresponding to the objective and subjective feelings, respectively, of the practitioner, supporting him/her in making judgments and taking decisions regarding the final diagnosis. The cognitive filter average value is added to the neuro-fuzzy values and the decision support intermediate value (DSIV) is computed. The DSIV is then added to the emotional filter average value and the decision support final value (DSFV) is extracted. If DSFV is lower than 0.2, then no HF is present. If DSFV is greater than 0.2 and lower than or equal to 0.4, then the patient is characterized as having mild HF. If DSFV is greater than 0.4 and lower than or equal to 0.7, then the degree of severity is considered to be moderate. In order for the patient to be classified into the severe HF class, the DSFV must be between 0.7 and 1. Finally, in case DSFV is greater than 1, the patient is in a very severe condition.
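The final thresholding step of the neuro-fuzzy system can be summarized as below; this only reproduces the DSFV intervals quoted above, and the behaviour at the exact boundary values is not specified in the text:

```python
def severity_from_dsfv(dsfv):
    """Map the decision support final value (DSFV) to a severity label,
    following the thresholds reported for the neuro-fuzzy system of [30]."""
    if dsfv < 0.2:
        return "no HF"
    if dsfv <= 0.4:
        return "mild HF"
    if dsfv <= 0.7:
        return "moderate HF"
    if dsfv <= 1.0:
        return "severe HF"
    return "very severe HF"

print(severity_from_dsfv(0.55))  # -> "moderate HF"
```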
Guidi et al. 2012 [31] developed a computer aided telecare system aiming to assist the clinical decisions of non-specialist personnel involved in the management of HF patients. Among the functionalities of the telecare system is the characterization of patients as mild, moderate or severe. To achieve this, NN, SVM, decision tree and fuzzy expert system classifiers were employed. The classifiers were trained and tested using anamnestic data (age, gender, NYHA class) and instrumental data (weight, systolic blood pressure, diastolic blood pressure, EF, BNP, heart rate, ECG parameters (atrial fibrillation, left bundle branch block, ventricular tachycardia)) corresponding to 100 (training set) and 36 (testing set) patients, respectively. The distribution of patients in the three severity classes was 35 mild, 31 moderate and 34 severe in the training phase and 15 mild, 8 moderate and 13 severe in the test phase. A 10-fold cross-validation procedure was applied. According to the presented results, the NN classified patients with 86.1% accuracy.
Two years later, the same research team [32] enhanced the “pool” of classifiers that were evaluated with classification and regression trees (CART) and Random Forests. Data from 136 patients, treated by the Cardiology Department of the St. Maria Nuova Hospital (Florence, Italy), were distributed to the three prediction classes as follows: 51 mild, 37 moderate and 48 severe. For the evaluation of the classifiers, the authors followed a subject based cross-validation approach to address the fact that the dataset included cluster-correlated data (baseline and follow-up data of the same patient). More specifically, follow-up data of the same patient were grouped within the same fold. In this way, their assumption that follow-up data spread over a large time period can be considered as separate instances of the dataset does not affect the independence of the folds. Random Forests outperformed the other methods for automatic severity assessment. However, the standard deviation was very high, because in some folds the accuracy was greater than 90%, while in others the accuracy was lower than 50%. These folds probably include patients with moderate HF, revealing the difficulty of the proposed system in classifying those patients. Although the classification results produced by the CART classifier are about 1% lower than those produced by Random Forests, the CART algorithm gains the preference of researchers since it can easily be transformed into a set of rules that can be analyzed by medical experts.
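The subject-based cross-validation just described, in which all follow-up records of a patient are kept in the same fold, corresponds to grouped cross-validation; a minimal sketch with scikit-learn's GroupKFold and synthetic data (the feature count and patient identifiers are arbitrary) is shown below:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

# Hypothetical data: several follow-up records per patient; "groups" holds the patient id
rng = np.random.default_rng(0)
X = rng.standard_normal((136, 12))
y = rng.integers(0, 3, size=136)        # 0 = mild, 1 = moderate, 2 = severe
groups = rng.integers(0, 60, size=136)  # patient identifiers

# GroupKFold keeps all records of a patient in the same fold (subject-based CV)
cv = GroupKFold(n_splits=10)
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y,
                         groups=groups, cv=cv)
print(scores.mean(), scores.std())
```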
Recently the authors of [33] proposed a multi-layer monitoring system for the clinical management of CHF. The three layers include the following monitoring activities: a) scheduled visits to a hospital following up the presence of an HF event, b) home monitoring visits by nurses, and c) patient's self-monitoring at home through the utilization of specialized equipment. For the activities of the first two layers, a decision support system was developed providing prediction of decompensations and assessment of HF severity. The Random Forests algorithm was employed based on its performance in the studies reported previously. It was evaluated in terms of accuracy, sensitivity and specificity for each class versus all the other classes in a 10-fold cross-validation. The obtained accuracy was 81.3%, while the sensitivity and specificity were 87% and 95%, respectively, for class 3 (severe HF vs. other). Class 1 (mild HF vs. other) was identified with 75% sensitivity and 84% specificity and class 2 (moderate HF vs. other) was identified with 67% sensitivity and 80% specificity.
Taking into consideration the fact that the ECG provides an objective evaluation of the severity of HF, researchers have studied the relationship of long- and short-term HRV measures with the NYHA class [34], [35], [36], [37], [38] and their discriminative power for HF detection [11], [12]. Pecchia et al. 2011 [39] presented a remote health monitoring system for HF, which provides estimation of HF severity through the utilization of a CART method. HRV measures, extracted from ECG signals, were utilized in order to classify a subject detected with HF as mild (NYHA I or II) or severe (NYHA III). Different trees were trained using different combinations of the short-term HRV measures. The achieved accuracy, sensitivity, specificity and precision were 79.31%, 82.35%, 75.00% and 82.35%, respectively. The dataset included 83 subjects, 54 controls and 29 patients. The 29 patients were distributed to the two classes as follows: 12 were mild and 17 severe.
Two years later, Mellilo et al. 2013 [40] relied on long-term HRV measures and the CART algorithm in order to estimate the severity of HF. The classifier separated low risk patients (NYHA I or II) from high risk patients (NYHA III or IV). The HRV measures were extracted from two Holter monitor databases (the Congestive Heart Failure RR Interval Database and the BIDMC Congestive Heart Failure Database) [17] and corresponded to 12 low risk and 34 high risk patients; however, only 11 low risk and 30 high risk patients were enrolled in the study. The CART algorithm was modified in order to incorporate a feature selection algorithm addressing the issues of a small and unbalanced dataset. The results of their method were compared with the results of other classifiers, such as simple CART, C4.5 and Random Forests. All the algorithms were evaluated with and without the application of the SMOTE algorithm. The accuracy, precision, sensitivity and specificity of the proposed CART algorithm were 85.40%, 87.50%, 93.30% and 63.60%, respectively. As mentioned previously, the tree created by the CART algorithm can easily be transformed into rules, in this specific case rules for severity estimation. According to the authors, the extracted rules were consistent with previous findings. Shahbazi et al. 2015 [41] exploited long-term HRV measures to estimate the severity of HF and, more specifically, to classify patients into low risk and high risk. Generalized Discriminant Analysis was applied to reduce the number of features, as well as to overcome the overlapping of the samples of the two classes in the feature space. The selected features were given as input to a k-NN classifier, providing a classification accuracy of 97.43% when both linear and nonlinear features were utilized and of 100% when only nonlinear features were utilized.
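The class-imbalance handling evaluated by Mellilo et al., SMOTE combined with tree classifiers, can be sketched with imbalanced-learn as follows; the dataset below is a synthetic stand-in for the small low-risk/high-risk HRV dataset, and resampling is applied inside the pipeline so that only training folds are oversampled:

```python
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

# Synthetic stand-in for the small, imbalanced low-risk/high-risk HRV dataset
X, y = make_classification(n_samples=41, n_features=10, weights=[0.27, 0.73],
                           random_state=0)

# SMOTE is placed inside the pipeline so oversampling never touches the test folds
model = make_pipeline(SMOTE(k_neighbors=3, random_state=0),
                      DecisionTreeClassifier(random_state=0))
print(cross_val_score(model, X, y, cv=10, scoring="accuracy").mean())
```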
Yang et al. 2010 [19] proposed a scoring model allowing the classification of a subject into three groups: the healthy group (without cardiac dysfunction), the HF-prone group (asymptomatic stages of cardiac dysfunction) and the HF group (symptomatic stages of cardiac dysfunction). SVM was employed and the total accuracy was 74.40%. The accuracy for each one of the three groups was 78.79% for the healthy group, 87.50% for the HF-prone group and 65.85% for the HF group. In total, 289 subjects participated in the study, among which 70 were healthy, 59 belonged to the HF-prone group (NYHA I, ACC/AHA B-C) and 160 belonged to the HF group (NYHA II-III, ACC/AHA C-D). Missing values were imputed using Bayesian principal component analysis [42]. The decision value v of the SVM [43] is mapped to a specific range so that a definite score is produced. For this purpose a tan-sigmoid function is applied, given by:
y = 2/(1 + e^(-2v)) - 1 (1)
where y is the mapped value. The determination of the cutoff points is achieved using Youden's index [44].
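A sketch of the two ingredients just described, the tan-sigmoid mapping of an SVM decision value and a Youden's-index-based cut-off search, is given below; any additional rescaling onto the score intervals used by Yang et al. is not reproduced here, and the example scores and labels are invented:

```python
import numpy as np

def tansig(v):
    """Tan-sigmoid mapping of an SVM decision value v to the interval (-1, 1)."""
    return 2.0 / (1.0 + np.exp(-2.0 * v)) - 1.0

def youden_cutoff(scores, labels):
    """Pick the score cut-off that maximizes Youden's index J = sensitivity + specificity - 1."""
    scores, labels = np.asarray(scores, float), np.asarray(labels, int)
    best_j, best_c = -1.0, None
    for c in np.unique(scores):
        pred = scores >= c
        sens = np.mean(pred[labels == 1])    # true positive rate at this cut-off
        spec = np.mean(~pred[labels == 0])   # true negative rate at this cut-off
        j = sens + spec - 1
        if j > best_j:
            best_j, best_c = j, c
    return best_c, best_j

scores = tansig(np.array([-1.2, -0.4, 0.1, 0.8, 1.5, 2.0]))
labels = np.array([0, 0, 0, 1, 1, 1])
print(youden_cutoff(scores, labels))
```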
Sideris et al. 2015 [45] proposed a data driven methodology for the estimation of the severity of HF that relies on a clustering-based feature extraction approach. The authors exploited disease diagnostic information to extract features. In order to reduce the dimensionality of the diagnostic codes, they identified the disease groups with a high frequency of co-occurrence. The extracted clusters were utilized as features for the estimation of the severity of the condition of HF patients by employing an SVM classifier. The results were compared with those produced when the SVM classifier was given the cluster-based feature set enhanced with the Charlson comorbidity score, and an accuracy improvement of up to 14% in the predictability of the severity of the condition was achieved. The procedure was applied for each one of the six extracted daily threshold-based outcome variables (I1–I6) labeling the severity of the condition, especially in the context of remote health monitoring.
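The general idea of turning co-occurring diagnosis codes into cluster-level features, which is the core of the approach in [45], can be sketched as follows; the binary code matrix, the distance metric and the number of clusters are illustrative choices rather than the authors' actual settings:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Hypothetical binary matrix: rows = hospitalizations, columns = diagnosis codes (True = present)
rng = np.random.default_rng(0)
D = rng.random((200, 30)) < 0.15

# Cluster diagnosis codes by co-occurrence: codes that appear together get small distances
dist = pdist(D.T, metric="jaccard")                 # pairwise dissimilarity between codes
clusters = fcluster(linkage(dist, method="average"), t=8, criterion="maxclust")

# Cluster-level features: how many codes of each cluster a hospitalization carries
features = np.stack([D[:, clusters == c].sum(axis=1) for c in range(1, 9)], axis=1)
print(features.shape)  # (200, 8)
```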
A short review of the methods addressing HF severity estimation is presented in Table 4.
Table 4.
Authors | Method | Data | Features | Evaluation measures |
---|---|---|---|---|
Akinyokun et al. 2009 [30] | Neuro-fuzzy expert system | No. of data: 30 subjects. Source: data collected at three hospitals in Nigeria. Response feature: Mild HF, Moderate HF, Severe HF. Validation: 70% of the dataset used for training, 20% for testing and 10% for cross-validation | Signs and symptoms of heart failure: chest pain, dyspnea (shortness of breath), orthopnea, palpitation, cough, fatigue, tachycardia, cyanosis, edema, nocturia, high blood pressure, low blood pressure, heart rate, rales (crackles in lungs), elevated neck veins, hepatomegaly, wheeze, heart sound, alteration in thought process, changes in level of consciousness, absence of emotion, heart murmur, pleural effusion, pulmonary edema, cardiothoracic ratio, upper zone flow distribution and echocardiogram | Training set: Mean square error 0.021, high standard deviation 0.036, average minimum normalized mean square error 0.026, correlation coefficient 0.988, overall percentage error 1.24%, Akaike Information Criterion (AIC) 171.288, minimum description length 129.107 |
Guidi et al. 2012 [31] | Computer aided telecare system; NN / SVM / Fuzzy-Genetic / Decision Tree | No. of data: 136 subjects (51 mild, 37 moderate, 48 severe). Source: data collected at the Cardiology Department of the St. Maria Nuova Hospital (Florence, Italy). Response feature: Mild HF, Moderate HF, Severe HF. Validation: 100 subjects for training, 36 subjects for testing | Anamnestic data (age, gender, NYHA class); instrumental data (weight, systolic blood pressure, diastolic blood pressure, EF, BNP, heart rate, ECG parameters (atrial fibrillation, left bundle branch block, ventricular tachycardia)) | Accuracy: NN 86.10%, SVM 69.40%, FG 72.20%, DT 77.80% |
Guidi et al. 2014 [32] | NN / SVM / Fuzzy-Genetic / CART / Random Forests | No. of data: 136 subjects (51 mild, 37 moderate, 48 severe). Source: data collected at the Cardiology Department of the St. Maria Nuova Hospital (Florence, Italy). Response feature: Mild HF, Moderate HF, Severe HF. Validation: person-independent 10-fold cross-validation | Anamnestic data (age, gender, NYHA class); instrumental data (weight, systolic blood pressure, diastolic blood pressure, ejection fraction (EF), BNP, heart rate, ECG parameters (atrial fibrillation, left bundle branch block, ventricular tachycardia)) | Accuracy (Std; critical errors): NN 77.80% (7.4; 0), SVM 80.30% (9.4; 3), FG 69.90% (9.9; 1), CART 81.80% (8.9; 2), RF 83.30% (7.5; 1) |
Guidi et al. 2015 [33] | Multi-layer monitoring system for clinical management of CHF; Random Forests | No. of data: 250 patients (93 mild, 92 moderate, 65 severe). Source: clinical study data collected through home visits and follow-up. Response feature: Mild HF, Moderate HF, Severe HF. Validation: 10-fold cross-validation | Height and weight (body mass index), systolic and diastolic blood pressure, heart rate, oxygen saturation, ejection fraction (EF), BNP or NT-proBNP, bioelectrical impedance vector (BIVA) parameters, NYHA class, 12-lead EKG report (e.g. presence of bundle branch block, tachycardia, atrial fibrillation, etc.), etiology, comorbidity, current therapy, pharmaceutical and surgical (pacemaker or ICD/CRT) | Accuracy: 81.30%; “Mild” vs. all: Sensitivity 75.00%, Specificity 84.00%; “Moderate” vs. all: Sensitivity 67.00%, Specificity 80.00%; “Severe” vs. all: Sensitivity 87.00%, Specificity 95.00% |
Pecchia et al. 2011 [39] | Remote health monitoring system for HF; CART (mild vs. severe) | No. of data: 54 controls, 29 patients (12 mild, 17 severe). Source: normal subjects from the Normal Sinus Rhythm RR Interval Database, CHF group from the Congestive Heart Failure RR Interval Database. Response feature: Mild (NYHA class I or II) vs. Severe (NYHA class III). Validation: cross-validation | HRV measures | Accuracy: 79.31%, Sensitivity: 82.35%, Specificity: 75.00%, Precision: 82.35% |
Mellilo et al. 2013 [40] | 1. Proposed CART, 2. CART, 3. CART with SMOTE, 4. C4.5, 5. C4.5 with SMOTE, 6. RF, 7. RF with SMOTE; Low risk (NYHA I or II) vs. High risk (NYHA III or IV) | No. of data: 11 low risk, 30 high risk subjects. Source: Congestive Heart Failure RR Interval Database, BIDMC Congestive Heart Failure Database. Response feature: Low risk (NYHA class I and II) vs. High risk (NYHA class III and IV). Validation: 10-fold cross-validation | Long-term HRV measures | Accuracy / Sensitivity / Specificity / Precision: 1: 85.40% / 93.30% / 63.60% / 87.50%; 2: 73.20% / 100.00% / 0.0% / 73.20%; 3: 75.00% / 73.30% / 77.30% / 81.50%; 4: 65.90% / 73.30% / 45.50% / 78.60%; 5: 84.60% / 93.30% / 86.40% / 89.30%; 6: 73.20% / 86.70% / 36.40% / 78.80%; 7: 82.70% / 83.30% / 81.80% / 86.20% |
Yang et al. 2010 [19] | Scoring model, SVM; Healthy group, HF-prone group, HF group | No. of data: 153 subjects (65 HF, 30 HF-prone, 58 healthy). Source: data collected at Zhejiang Hospital. Response feature: Healthy group, HF-prone group, HF group. Validation: 90 subjects used as test cases | Parameters selected from clinical tests, i.e. blood test, heart rate variability test, echocardiography test, electrocardiography test, chest radiography test, six-minute walk distance test and physical test | Total accuracy: 74.40%; accuracy for the healthy group: 78.79%; accuracy for the HF-prone group: 87.50%; accuracy for the HF group: 65.85% |
Shahbazi et al. 2015 [41] | Feature extraction with Generalized Discriminant Analysis (GDA); k-NN | No. of data: 12 low risk HF subjects, 32 high risk HF subjects. Source: Congestive Heart Failure RR Interval Database with patients suffering from CHF (NYHA classes I–III); BIDMC Congestive Heart Failure Database with patients suffering from severe CHF (NYHA classes III and IV). Response feature: Low risk HF vs. High risk HF. Validation: leave-one-out cross-validation | Long-term HRV measures | Linear + nonlinear features + GDA: Accuracy 97.43%, Precision 96.66%, Sensitivity 100%, Specificity 90%; Nonlinear features + GDA: Accuracy 100%, Precision 100%, Sensitivity 100%, Specificity 100% |
Sideris et al. 2015 [45] | Feature extraction with hierarchical clustering; SVM | No. of data: 7 million discharge records, 3041 patients. Source: training, 2012 National Inpatient Sample (NIS) of the Healthcare Cost and Utilization Project (HCUP), containing 7 million discharge records and ICD-9-CM codes; testing, Ronald Reagan UCLA Medical Center Electronic Health Records (EHR) from 3041 patients. Response feature: Low risk vs. High risk. Validation: 10-fold cross-validation | Demographics (gender, age, race), diagnostic information encoded in ICD-9-CM and hospitalization-specific information including blood test results and discharge diagnoses coded as ICD-9-CM codes | Per alert variable (two values reported for each measure): I1: Accuracy 70.72/66.18%, TPR 64.21/59.74%, TNR 77.24/72.63%; I2: Accuracy 58.57/51.63%, TPR 52.65/53.06%, TNR 64.49/50.20%; I3: Accuracy 73.15/70.73%, TPR 67.31/64.31%, TNR 79.00/77.15%; I4: Accuracy 65.48/63.97%, TPR 71.78/72.74%, TNR 59.18/55.21%; I5: Accuracy 69.39/69.15%, TPR 63.66/61.10%, TNR 75.12/77.20%; I6: Accuracy 67.87/63.16%, TPR 54.71/52.94%, TNR 81.03/73.38% |
NN: Neural Networks, SVM: Support Vector Machines, FG: Fuzzy-Genetic, DT: Decision Tree, RF: Random Forests, Std: Standard deviation, TPR: True Positive Rate, TNR: True Negative Rate, Sens: Sensitivity, Spec: Specificity, HF: Heart Failure, NYHA: New York Heart Association, CART: Classification and regression tree, GDA: Generalized Discriminant Analysis, k-NN: k Nearest Neighbors, SMOTE: Synthetic Minority Over-sampling Technique
It must be mentioned that, according to the authors' knowledge, HF severity estimation has not been addressed in the past as a four-class classification problem (NYHA I, NYHA II, NYHA III, NYHA IV).
5. Prediction of Adverse Events
As already mentioned in the Introduction section, HF is a major health problem associated with the presence of serious adverse events, such as mortality, morbidity, destabilizations, re-hospitalizations, affecting both the individuals (e.g. reduced quality of life) and the society (e.g. increased healthcare costs). The early prediction of those events will allow experts to achieve effective risk stratification of patients and to assist in clinical decision making. Prognostic information could guide the appropriate application of monitoring and treatment, resulting in improvements in the quality of care that is provided, as well as in the outcome of patients hospitalized with HF.
Toward this direction, the prediction ability of different factors related to HF morbidity, mortality, destabilizations and re-hospitalizations has been studied. Furthermore, models taking into account multiple factors simultaneously have been reported in the literature using statistical methods (e.g. multi-variable Cox regression models). This multi-variable statistical analysis has led to the formation of scores used in clinical practice, providing estimation of the risk of mortality (e.g. Heart Failure Survival Score [46], Get With The Guidelines score [47], Seattle Heart Failure Model [48], EFFECT [49]), re-hospitalization [50] and morbidity [51].
5.1. Destabilizations
Although HF is a chronic syndrome, it does not evolve gradually; periods of relative stability alternate with acute destabilizations. The goal of the experts is to predict and prevent destabilizations and death of the HF patient during a stable phase.
Candelieri et al. 2008 [52] adopted Knowledge Discovery (KD) approaches to predict whether a patient with CHF in a stable phase will subsequently decompensate. A group of 49 CHF patients, visited by cardiologists every two weeks, was used to evaluate the KD approaches. At each visit a set of clinical parameters, selected from guidelines and clinical evidence-based knowledge, was evaluated by the cardiologist, and general information and monitored parameters were recorded for each patient. Decision trees, Decision Lists, SVM and Radial Basis Function Networks were employed, and a leave-patient-out approach was followed to evaluate the performance of the generated models. Decision trees outperformed the other approaches, providing a prediction accuracy of 92.03%, a sensitivity of 63.64% and a False Positive Rate of 6.90%. In 2009, Candelieri et al. [53] examined how the decision trees and SVMs developed in their previous work perform on an independent testing set. The results indicated that SVMs are more reliable in predicting new decompensation events, reaching 97.37% accuracy, 100.00% sensitivity and a 2.78% False Positive Rate. Based on this observation they further extended their research by proposing the SVM hyper-solution framework [54]. The term "hyper-solution" describes an SVM selected through meta-heuristics (Tabu Search and Genetic Algorithm) that search for the most reliable hyper-classifier (an SVM with a basic kernel, an SVM with a combination of kernels, or an ensemble of SVMs) and for its optimal configuration. The Genetic Algorithm-based framework proved more accurate on the minority class than the Tabu Search-based one.
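To make the search idea concrete, the sketch below approximates a hyper-solution-style search with a plain exhaustive grid over SVM kernels and their parameters; the original framework in [54] uses Tabu Search and a Genetic Algorithm (and also considers kernel combinations and SVM ensembles) rather than a grid, and the clinical variables here are replaced by synthetic stand-ins.

```python
# Hedged sketch: search over SVM kernels and settings, as a stand-in for the
# meta-heuristic "hyper-solution" search of [54]. Data are synthetic placeholders
# for the monitored clinical variables (SBP, HR, RR, weight, BT, TBW).
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(301, 6))        # 6 monitored variables per visit (simulated)
y = rng.integers(0, 2, size=301)     # 1 = decompensation within 2 weeks (simulated)

param_grid = [
    {"svc__kernel": ["rbf", "sigmoid"], "svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.1, 1.0]},
    {"svc__kernel": ["poly"], "svc__C": [0.1, 1, 10], "svc__degree": [2, 3]},
]
pipe = make_pipeline(StandardScaler(), SVC(class_weight="balanced"))
search = GridSearchCV(pipe, param_grid, scoring="recall",          # favor the minority class
                      cv=StratifiedKFold(5, shuffle=True, random_state=0))
search.fit(X, y)
print(search.best_params_, search.best_score_)
```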
The prediction of the destabilization of HF patients was also addressed by Guidi et al. 2014 [32] and Guidi et al. 2015 [33]. They predicted the frequency (none, rare or frequent) of CHF decompensations during the year after the first visit using five machine learning techniques (NN, SVM, Fuzzy-Genetic Expert System, Random Forests and CART). In Guidi et al. 2014 [32], the CART algorithm produced the best classification results (87.6% accuracy); however, in terms of critical errors the best results were produced by the Random Forests algorithm. In Guidi et al. 2015 [33], the prediction was addressed as three different binary classification problems, none vs. all, rare vs. all and frequent vs. all, employing the Random Forests algorithm. The overall accuracy produced by the 10-fold cross-validation procedure is 71.90%, while the sensitivity and specificity are 57% and 79% for the first problem, 65% and 60% for the second and 59% and 96% for the third.
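A minimal sketch of how the three one-vs-all sub-problems could be set up, assuming scikit-learn Random Forests and 10-fold cross-validation; the feature matrix and labels below are synthetic placeholders, not the data of [33].

```python
# Illustrative sketch (not the authors' code): three one-vs-all Random Forest
# classifiers for the decompensation-frequency classes (none / rare / frequent),
# with sensitivity and specificity estimated by 10-fold cross-validation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(250, 12))                    # anamnestic + instrumental features (simulated)
y = rng.choice(["none", "rare", "frequent"], size=250, p=[0.64, 0.22, 0.14])

for positive in ["none", "rare", "frequent"]:
    y_bin = (y == positive).astype(int)
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    pred = cross_val_predict(clf, X, y_bin, cv=StratifiedKFold(10, shuffle=True, random_state=0))
    tp = np.sum((pred == 1) & (y_bin == 1)); fn = np.sum((pred == 0) & (y_bin == 1))
    tn = np.sum((pred == 0) & (y_bin == 0)); fp = np.sum((pred == 1) & (y_bin == 0))
    print(f"{positive} vs. all  sensitivity={tp/(tp+fn):.2f}  specificity={tn/(tn+fp):.2f}")
```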
A short review of the methods addressing the prediction of destabilizations is provided in Table 5.
Table 5.
Authors | Method | Data | Features | Evaluation measures |
---|---|---|---|---|
Candelieri et al. 2008 [52] | Decision trees | No. of data 49 patients with CHF |
Predictor features Systolic Blood Pressure (SBP), Heart Rate (HR), Respiratory Rate (RR), Body Weight, Body Temperature (BT), Total Body Water (TBW), Patient condition evaluated by the cardiologist during the visit, Gender, Age, NYHA class, Alcohol use, Smoking |
Accuracy: 92.03% Sensitivity: 63.64% False Positive Rate: 6.90% |
Source of data Data collected at the Cardiovascular Diseases Division, Department of Experimental and Clinical Medicine, Faculty of Medicine, University “Magna Graecia” of Catanzaro, Italy. |
Response feature No risk Risk For destabilizations within 2 weeks |
Validation | ||
Leave-patient-out validation | ||||
Candelieri et al. 2009 [53] | SVM | No. of data 49 patients with CHF |
Predictor features Systolic Blood Pressure (SBP), Heart Rate (HR), Respiratory Rate (RR), Body Weight, Body Temperature (BT), Total Body Water (TBW), Patient condition evaluated by the cardiologist during the visit, Gender, Age, NYHA class, Alcohol use, Smoking |
Leave-patient-out Accuracy: 82.06% Sensitivity: 63.64% False Positive Rate: 16.90% Testing set Accuracy: 97.37% Sensitivity: 100.00% False Positive Rate: 2.78% |
Source of data Data collected at the Cardiovascular Diseases Division, Department of Experimental and Clinical Medicine, Faculty of Medicine, University “Magna Graecia” of Catanzaro, Italy. |
Response feature No risk Risk For destabilizations within 2 weeks |
Validation | ||
Leave-patient-out validation Testing set | ||||
Candelieri et al. 2010 [54] | SVM hyper solution framework (Genetic Algorithm) |
No. of data 301 instances |
Predictor features Systolic Blood Pressure, Heart Rate, Respiratory Rate, Body Weight, Body Temperature, Total Body Water, Patient health condition with respect to stable or decompensated status |
Accuracy: 87.35% Sensitivity: 90.91% False Positive Rate: 16.21% |
Source of data Clinical study data collected through frequent follow-ups |
Response feature No risk Risk For destabilizations within 2 weeks |
Validation | ||
Stratified 10-fold cross-validation | ||||
Guidi et al. 2014 [32] | CART Random Forests |
No. of data 136 subjects 110 stable 14 rare 12 frequent |
Predictor features Anamnestic data (age, gender, NYHA class) Instrumental data (weight, systolic blood pressure, diastolic blood pressure, EF, BNP, heart rate, ECG parameters (atrial fibrillation, left bundle branch block, ventricular tachycardia)) |
CART Accuracy: 87.60% Critical errors: 9 |
Random Forests Accuracy: 85.60% Critical errors: 5 | ||||
Source of data Data collected from the Cardiology Department of the St. Maria Nuova Hospital (Florence, Italy) |
Response feature Stable Rare Frequent within one year after the first visit |
Validation | ||
Person-independent 10-fold cross-validation | ||||
Guidi et al. 2015 [33] | Random Forests | No. of data 250 subjects 160 none 55 rare 64 frequent |
Predictor features Height and weight (Body Mass Index) Systolic and diastolic blood pressure Heart rate Oxygen saturation Ejection fraction (EF) BNP or NT-proBNP Bioelectrical impedance vector (BIVA) parameters NYHA class 12-lead EKG report (e.g., presence of bundle branch block, tachycardia, atrial fibrillation, etc.) Etiology Comorbidity Current therapy, pharmaceutical and surgical (pacemaker, ICD or ICD/CRT) |
Overall accuracy: 71.90% None vs. all Sensitivity: 57.00% Specificity: 79.00% Rare vs. all Sensitivity: 65.00% Specificity: 60.00% Frequent vs. all Sensitivity: 59.00% Specificity: 96.00% |
Source of data Clinical study data collected through home visits and follow up |
Response features Stable Rare Frequent within one year after the first visit |
Validation | ||
10-fold cross-validation |
SVM: Support Vector Machines, CHF: Congestive Heart Failure, CART: Classification and regression tree.
5.2. Re-Hospitalizations
Re-hospitalizations have attracted the interest of researchers due to their negative impact on healthcare systems' budgets and patient loads; the development of predictive modeling solutions for readmission risk is therefore an important challenge. Prediction of re-hospitalizations was addressed by Zolfaghar et al. 2013 [55], Vedomske et al. 2013 [56], Shah et al. 2015 [28], Roy et al. 2015 [57], Koulaouzidis et al. 2016 [58], Tugerman et al. 2016 [59], and Kang et al. 2016 [60].
Zolfaghar et al. 2013 [55] studied big-data-driven solutions to predict the risk of readmission for CHF within a period of 30 days. Predictive factors were first extracted from the National Inpatient Dataset (NIS) and augmented with the Multicare Health System (MHS) patient dataset. Data mining models, such as logistic regression and Random Forests, were then applied. The dataset on which the prediction models were evaluated contained 15,696 records and the best prediction accuracy was 78.00%. In order to examine how a big data framework outperforms traditional systems when the size of the training set increases, the authors scaled up the original data linearly several times. Five data-size scenarios were created and the Random Forests algorithm was employed; among these scenarios, the best prediction accuracy was 87.12%.
Vedomske et al. 2013 [56] applied Random Forests to administrative claims data in order to predict readmissions of CHF patients within 30 days. The data were retrieved from the University of Virginia Clinical Database Repository (CDR), maintained by the Department of Public Health Sciences Clinical Informatics Division. Different variations of the Random Forests classifier were developed depending on the input; more specifically, datasets including procedure data, diagnosis data, a combination of both, and basic demographic data were extracted. The procedure was applied twice: once without prior weighting on the response variable and once with prior weighting, aiming to address the issue of imbalanced classes. The discriminative power of the models was measured with the AUC after randomly splitting the datasets into a 2/3 training set and a 1/3 testing set.
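The evaluation protocol described above can be sketched as follows, assuming scikit-learn's Random Forests; class weighting stands in for the prior weighting on the response variable, and the claims features are replaced by random placeholders.

```python
# Minimal sketch of the protocol described for [56]: Random Forests with and
# without class weighting, discrimination measured by AUC on a random 1/3 hold-out.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(2000, 30))              # procedure / diagnosis / demographic features (simulated)
y = rng.binomial(1, 0.2, size=2000)          # 30-day readmission label (imbalanced, simulated)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=1/3, stratify=y, random_state=0)
for weighting in (None, "balanced"):
    rf = RandomForestClassifier(n_estimators=300, class_weight=weighting, random_state=0)
    rf.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1])
    print(f"class_weight={weighting}: AUC={auc:.3f}")
```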
Shah et al. 2015 [28], as previously described (Section 3), detected three HFpEF pheno-groups. Furthermore, they studied the association of those groups with adverse outcomes (HF hospitalization, cardiovascular hospitalization, death, and the combined outcome of cardiovascular hospitalization or death). The results indicated that the created pheno-groups, with their differential risk profiles, provided better discrimination than clinical parameters (e.g., the MAGGIC risk score) and B-type natriuretic peptide. Additionally, they utilized SVM to predict clinical outcome. Each outcome was coded as binary and 46 phenotypic predictors were included. Radial and sigmoid basis functions were evaluated. The gamma and cost parameters were tuned on a derivation cohort of 420 patients, and the performance was evaluated on a validation cohort of 107 patients. The area under the receiver operating characteristic curve (AUROC), sensitivity, mean specificity and mean precision were the evaluation measures employed.
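A hedged sketch of this tuning-then-validation protocol, assuming scikit-learn's SVC; the 46 phenotypic predictors are simulated, and the parameter grid is illustrative rather than the one used in [28].

```python
# Sketch: SVM with radial or sigmoid kernel, gamma and cost tuned on a derivation
# cohort (420 patients) and AUROC checked on a separate validation cohort (107).
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X_deriv, y_deriv = rng.normal(size=(420, 46)), rng.integers(0, 2, 420)   # simulated predictors/outcomes
X_valid, y_valid = rng.normal(size=(107, 46)), rng.integers(0, 2, 107)

grid = {"svc__kernel": ["rbf", "sigmoid"],
        "svc__gamma": [1e-3, 1e-2, 1e-1],
        "svc__C": [0.1, 1, 10]}
model = GridSearchCV(make_pipeline(StandardScaler(), SVC(probability=True)), grid, cv=5)
model.fit(X_deriv, y_deriv)
print("AUROC on validation cohort:",
      roc_auc_score(y_valid, model.predict_proba(X_valid)[:, 1]))
```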
Roy et al. 2015 [57] addressed the estimation of readmission risk as a binary classification task. The objective was to identify patients with CHF who are likely to be readmitted within 30 days of discharge (readmission within 30 days = 1, no readmission within 30 days = 0). A dynamic hierarchical classification approach was followed: the prediction problem was divided into several stages or layers, creating a hierarchy of classification models. At each stage/layer the risk of readmission was predicted within a certain number of days (cutoff), so each stage/layer addressed a binary classification problem, and the outputs of the layers were combined to predict the overall 30-day risk of readmission. The method was evaluated on the Washington State Inpatient Dataset and the Heart Failure cohort data from Multi Care Health Systems (MHS). Logistic regression, Random Forests, Adaboost, Naïve Bayes and SVM classifiers were tested at each layer of the dynamic hierarchical classification framework, and the best classifier at each stage was determined through a 10-fold cross-validation procedure on the training set.
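A loose illustration of the layered idea follows, assuming three cutoffs (7, 14 and 30 days), per-layer model selection by 10-fold cross-validation, and a simple "any layer fires" rule for combining layer outputs; the actual combination logic of [57] is more elaborate, and the data below are synthetic.

```python
# Loose sketch of a layered (hierarchical) readmission classifier: one binary
# model per cutoff, best candidate per layer chosen by cross-validation, layer
# outputs combined into an overall 30-day prediction.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(4)
X = rng.normal(size=(3000, 25))
days_to_readmit = rng.exponential(60, size=3000)          # simulated time to readmission

layer_models, cutoffs = [], [7, 14, 30]
for cutoff in cutoffs:
    y_cut = (days_to_readmit <= cutoff).astype(int)
    candidates = [LogisticRegression(max_iter=1000), RandomForestClassifier(200), GaussianNB()]
    best = max(candidates, key=lambda m: cross_val_score(m, X, y_cut, cv=10).mean())
    layer_models.append(best.fit(X, y_cut))

# Overall 30-day risk: a record is flagged if any layer predicts readmission.
layer_preds = np.column_stack([m.predict(X) for m in layer_models])
overall_30day = layer_preds.max(axis=1)
print("flagged fraction:", overall_30day.mean())
```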
Koulaouzidis et al. 2016 [58] used physiological data, such as blood pressure, heart rate and weight, collected daily while the patients were at home, and predicted HF patients' re-hospitalization through a Naïve Bayes classifier. They assessed, by employing an analysis of vectors, the predictive value of each of the monitored signals and of their combinations. They observed that the best predictive results were obtained with the combined use of weight and diastolic blood pressure recorded over a period of 8 days (8-day telemonitoring data). The achieved AUROC was 0.82 ± 0.02, allowing the authors to conclude that telemonitoring has high potential in the detection of HF decompensation; however, the validity of the proposed approach in the clinical management of patients should be examined through a large-scale prospective study.
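The signal-combination assessment can be sketched along these lines, assuming a Gaussian Naïve Bayes classifier and simulated 8-day telemonitoring windows; the original vector analysis and clinical data of [58] are not reproduced here.

```python
# Sketch: evaluate a Naive Bayes classifier (AUROC, 10-fold CV) on each single
# signal and each pair of signals over an 8-day telemonitoring window.
import itertools
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(5)
signals = {"weight": rng.normal(size=(300, 8)),        # simulated 8-day windows per patient
           "dbp": rng.normal(size=(300, 8)),
           "sbp": rng.normal(size=(300, 8)),
           "hr": rng.normal(size=(300, 8))}
y = rng.integers(0, 2, size=300)                       # re-hospitalization label (simulated)

for r in (1, 2):
    for combo in itertools.combinations(signals, r):
        X = np.hstack([signals[name] for name in combo])
        prob = cross_val_predict(GaussianNB(), X, y, cv=StratifiedKFold(10),
                                 method="predict_proba")[:, 1]
        print(combo, f"AUROC={roc_auc_score(y, prob):.2f}")
```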
Kang et al. 2016 [60], like Koulaouzidis et al. 2016 [58], worked with data from telemonitored patients, aiming to predict the first re-hospitalization during the 60-day home healthcare episode. They utilized the OASIS-C dataset and employed bivariate analysis to select the variables that can act as predictors and lead to the development of the best decision tree model. The J48 algorithm, with a 10-fold cross-validation procedure, was used to create the decision tree; 67% of the dataset was used for the construction of the tree, while 33% was used for its validation. The True Positive Rate, the False Positive Rate and the AUROC were employed as evaluation measures.
Tugerman et al. 2016 [59], in order to predict hospital readmissions within 30 days of discharge, combined the C5.0 and SVM classifiers, thus controlling the trade-off between reasoning transparency and predictive accuracy. Once the two classifiers were optimized, the mixed model was optimized. To combine the two models (C5.0 and SVM), a tree confidence threshold was predefined: records predicted with a tree confidence below the threshold are further classified by the SVM. The performance of the mixed model was measured in terms of sensitivity, specificity, F1 score, positive predictive value (PPV) and negative predictive value (NPV). Different threshold values were employed for the training and testing sets.
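An illustrative cascade in the spirit of this mixed model is sketched below; C5.0 is not available in scikit-learn, so a DecisionTreeClassifier stands in for it, the confidence threshold of 0.75 is arbitrary, and the data are synthetic.

```python
# Sketch of a tree-then-SVM cascade: the tree classifies first, and records whose
# tree confidence falls below a preset threshold are handed to the SVM.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(6)
X = rng.normal(size=(5000, 20))
y = rng.binomial(1, 0.15, size=5000)                   # 30-day readmission label (simulated)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)
tree = DecisionTreeClassifier(max_depth=6, class_weight="balanced").fit(X_tr, y_tr)
svm = SVC(class_weight="balanced").fit(X_tr, y_tr)

threshold = 0.75                                        # predefined tree-confidence threshold
tree_proba = tree.predict_proba(X_te)
confidence = tree_proba.max(axis=1)
pred = tree_proba.argmax(axis=1)
low_conf = confidence < threshold
if low_conf.any():
    pred[low_conf] = svm.predict(X_te[low_conf])        # uncertain records go to the SVM
print("fraction routed to SVM:", low_conf.mean())
```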
Table 6 presents a short review of the literature regarding prediction of re-hospitalizations.
Table 6.
Authors | Method | Data | Features | Evaluation measures |
---|---|---|---|---|
Zolfaghar et al. 2013 [55] | Logistic regression Random Forests |
No. of data A: 15,696 records B: 1,665,866 records (linear scale-up) |
Predictor features Socio demographic, vital signs, laboratory tests, discharge disposition, medical comorbidity and other cost related factors, like length of stay |
Logistic regression + A Accuracy: 78.03% Precision: 33.00% Recall: 0.08% F-measure: 0.17% AUC: 59.72% Random Forests + B Accuracy: 87.12% Precision: 99.88% Recall: 40.60% F-measure: 57.37% |
Source of data National Inpatient Dataset (NIS) augmented with the patient dataset from Multicare Health System (MHS) |
Response feature 30-day risk of re-admission Readmission = yes (class 1) (hospitalization within 30 days of discharge or of an earlier index of hospitalization due to CHF) Readmission = no (class 0) |
Validation | ||
70% of the dataset train 30% of the dataset test | ||||
Vedomske et al. 2013 [56] | Random Forests | No. of data 1,000,000 patients in the Virginia Clinical Database Repository (CDR) Study cohort with 19,189 inpatient visits, 2,749 HF diagnoses, 1,814 procedures |
Predictor features Procedure data, diagnosis data, demographic data |
With prior weighting AUC: 80% Without prior weighting AUC: 84% |
Source of data University of Virginia Clinical Database Repository (CDR) maintained by the Department of Public Health Sciences Clinical Informatics Division |
Response feature Readmission within 30 days |
Validation | ||
2/3 of the dataset used for training 1/3 of the dataset used for testing | ||||
Shah et al. 2015 [28] | SVM | No. of data 527 patients |
Predictor features Phenotypic data |
Area under the receiver operating characteristic curve (AUROC): 70.40% Sensitivity: 63.10% mean Specificity: 57.20% mean Precision: 63.60% |
Source of data Data collected at the outpatient clinic of the Northwestern University HFpEF Program as part of a systematic observational study of HFpEF (ClinicalTrials.gov identifier #NCT01030991) |
Response feature HF hospitalization yes HF hospitalization no |
Validation | ||
Validation set of 107 patients | ||||
Roy et al. 2015 [57] | Dynamic Hierarchical Classification | No. of data Washington State Inpatient Dataset and the Heart Failure cohort data from Multi Care Health Systems (MHS) |
Predictor features Clinical data, Socio-demographic data, Important data pertinent to CHF (ejection fraction, blood pressure, primary and secondary diagnoses indicating comorbidities, and APR-DRG codes for severity of illness and risk of mortality), Information about discharges (discharge status, discharge destination, length of stay and follow-up plans), Cardiovascular and comorbidity attributes |
Accuracy: 69.20% Precision: 24.80% Recall: 53.60% AUC: 69.60% |
Source of data Washington State Inpatient Dataset and the Heart Failure cohort data from Multi Care Health Systems (MHS) |
Response feature Readmission < 30 days Readmission > 30 days |
Validation | ||
At each stage the best classifier was determined using a 10-fold-cross-validation procedure on training set | ||||
Koulaouzidis et al. 2016 [58] | Naïve Bayes classifier | No. of data n/a |
Predictor features Blood pressure, heart rate, weight |
AUC: 82% |
Source of data Kingston-upon-Hull, home telemonitoring for patients with chronic HF |
Response feature High risk of HF hospitalization Low risk of HF hospitalization |
Validation | ||
10-fold cross-validation | ||||
Kang et al. 2016 [60] | Feature selection with Bivariate analysis J48 Decision tree |
No. of data 552 telemonitored HF patients |
Predictor features Patient overall status, Patient living situation, Severe pain experiences, Frequency of activity-limiting pain, Presence of skin issues, Ability to dress lower body, Therapy needed |
AUC (c-statistic): 59% True positive rate: 65% False positive rate: 49% |
Source of data OASIS-C dataset |
Response feature Likely to be hospitalized Not likely to be hospitalized |
Validation | ||
10-fold cross-validation | ||||
Tugerman et al. 2016 [59] | Ensemble model with Boosted C5.0 tree and SVM | No. of data 20,231 inpatient admissions, 4,840 CHF patients |
Predictor features Comorbidities, lab values, vitals, demographics and historical data |
Sensitivity: 0.258 Specificity: 0.912 PPV: 0.260 NPV: 0.911 Accuracy: 0.842 F1 score: 0.259 |
Source of data Veterans Health Administration (VHA) Pittsburg Hospitals |
Response feature Readmission within 30 days following discharge No readmission within 30 days following discharge |
Validation | ||
The data set was separated into a training set of 15,481 admissions (75%), and test (holdout/validation) set of 4840 admissions (25%). |
SVM: Support Vector Machines, Sens: Sensitivity, Spec: Specificity, AUC: Area Under Curve.
5.3. Mortality
HF is one of the leading causes of death worldwide. Accurate HF survival prediction models can provide benefits both to patients and physicians, with the most important being the prevention of such an adverse event.
Besides Shah et al. 2015 [28], several studies have addressed mortality prediction. Fonarrow et al. 2005 [61] estimated mortality risk in patients hospitalized with acute decompensated heart failure (ADHF). Bohacik et al. 2013 [62] applied an alternating decision tree to predict the risk of mortality within six months for heart failure patients, and two years later [63] presented a model based on fuzzy logic. Panahiazar et al. 2015 [64] exploited data from the electronic health records of the Mayo Clinic and performed HF survival analysis using machine learning techniques. One year later, the same research team [65] applied Contrast Pattern Aided Logistic Regression (CPXR(Log)) with the probabilistic loss function to the same dataset, developing and validating prognostic risk models to predict 1, 2 and 5 year survival in HF. Austin et al. 2012 [66] and Subramanian et al. 2011 [67] predicted 30-day and 1-year mortality, respectively, by employing ensemble classifiers. Finally, Ramirez et al. 2015 [68] addressed mortality prediction as a classification problem with the classes sudden cardiac death (SCD), pump failure death (PFD), and non-cardiac death or survival. The following classification problems were studied: i) SCD vs. the rest, ii) PFD vs. the rest and iii) SCD victims vs. PFD victims vs. others (non-CD victims and survivors).
Fonarrow et al. 2005 [61] developed a risk stratification model for predicting in-hospital mortality, exploiting the Acute Decompensated Heart Failure National Registry (ADHERE) of patients hospitalized with a primary diagnosis of ADHF in 263 hospitals in the United States [69] and utilizing the CART classification algorithm. The data included in the ADHERE registry were divided into two cohorts: the first 33,046 hospitalizations (derivation cohort) were analyzed to develop the model, while data from 32,229 subsequent hospitalizations (validation cohort) were employed to test the validity of the model. From 39 variables, selected out of the 80 included in the ADHERE registry, blood urea nitrogen, systolic blood pressure and serum creatinine levels were identified as predictors of in-hospital mortality. The CART tree was able to stratify patients into high, intermediate and low risk groups.
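A hedged sketch of a CART-style stratification follows; the tree is learned from simulated admission data, so the resulting thresholds are illustrative only and not the published ADHERE cut-points.

```python
# Sketch: a shallow classification tree over admission variables whose leaves
# define risk strata; the per-leaf event rate labels each stratum.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(7)
X = np.column_stack([rng.normal(30, 15, 5000),     # blood urea nitrogen (mg/dL), simulated
                     rng.normal(125, 25, 5000),    # systolic blood pressure (mmHg), simulated
                     rng.normal(1.5, 0.8, 5000)])  # serum creatinine (mg/dL), simulated
y = rng.binomial(1, 0.04, size=5000)               # in-hospital mortality label, simulated

tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=200).fit(X, y)
print(export_text(tree, feature_names=["BUN", "SBP", "creatinine"]))

leaves = tree.apply(X)                             # leaf id per patient
rates = {leaf: round(y[leaves == leaf].mean(), 3) for leaf in np.unique(leaves)}
print("per-leaf mortality rates:", rates)          # read off low / intermediate / high strata
```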
Bohacik et al. 2013 [62] classified 2023 patients diagnosed with HF into two possible predictions, alive or dead. Nine features describe each patient instance, expressing information regarding pulse rate, NT-proBNP level, blood sodium level, blood uric acid level, blood creatinine level, weight, height, gender and age. Classification was achieved with an alternating decision tree, which maps each HF patient to a real-valued prediction: the prediction is the sum of the predictions of the base rules in its set, while the classification is the sign of the prediction. The achieved sensitivity is 37.31%, the specificity 91.53%, the positive predictive value 60.25%, the negative predictive value 80.94% and the accuracy 77.66%.
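The additive scoring behind an alternating decision tree can be illustrated with a few hand-written base rules; the rules, scores and thresholds below are invented for illustration and are not those learned in [62].

```python
# Minimal illustration (not the original ADTree implementation): each base rule
# contributes a real-valued score, the prediction is the sum of contributions,
# and the sign of the sum gives the class.
def adtree_score(patient, rules, bias=-0.3):
    """Sum the contributions of all base rules applied to the patient."""
    score = bias
    for condition, score_if_true, score_if_false in rules:
        score += score_if_true if condition(patient) else score_if_false
    return score

rules = [
    (lambda p: p["nt_probnp"] > 2000, +0.6, -0.2),   # hypothetical rule on NT-proBNP level
    (lambda p: p["sodium"] < 135,     +0.4, -0.1),   # hypothetical rule on blood sodium level
    (lambda p: p["age"] > 80,         +0.3, -0.1),   # hypothetical rule on age
]

patient = {"nt_probnp": 3500, "sodium": 132, "age": 76}
score = adtree_score(patient, rules)
print("score =", score, "->", "dead within 6 months" if score > 0 else "alive")
```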
Two years later, Bohacik et al. 2015 [63] presented a model for the estimation of mortality risk within 6 months employing ambiguity and notions of fuzzy logic. The model stores knowledge about the patients in the form of fuzzy rules and classifies a patient as dead or alive using those rules. The authors compared the results of the proposed classifier with those produced by a Bayesian network classifier, a nearest neighbor classifier, a multilayer neural network, a 1R classifier, a decision list and a logistic regression model. Furthermore, the authors evaluated the interpretability using measures expressing the complexity of the fuzzy rules (average rule length, average number of rules, and average, minimal and maximal number of assignments in the conditions of rules).
Panahiazar et al. 2015 [64] applied decision trees, Random Forests, Adaboost, SVM and logistic regression to a dataset produced from the electronic health records of the Mayo Clinic. The dataset initially included 119,749 patients admitted to the Mayo Clinic from 1993 to 2013; 842 patient records were excluded due to incomplete and missing data, and some others because they did not meet the criteria defined by the experts. Thus, a final cohort of 5044 HF patients was used. For each patient 43 predictor variables, expressing demographic data, vital measurements, lab results, medication and co-morbidities, were recorded. The class variable corresponded to mortality status; consequently, three versions of the dataset were created, each corresponding to a survival period (1-year, 2-year, 5-year). 1560 instances out of 5044 were used for training and the remaining 3484 instances for testing. The predictor variables were divided into two sets: one including the same variables as those used in the Seattle Heart Failure Model (baseline set) and one including the predictors of the first set plus race, ethnicity, body mass index, calcium channel blocker use and 26 different co-morbidities (extended set). The above mentioned classifiers were applied to the baseline and extended sets for the 1-year, 2-year and 5-year prediction models. The authors observed that logistic regression and Random Forests were more accurate than the other models, and that the incorporation of the 26 co-morbidities improves the results.
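The baseline-versus-extended comparison can be sketched as below, assuming scikit-learn implementations, a fixed 1560/3484 train/test split and random stand-in feature matrices; the column counts are illustrative.

```python
# Sketch of the comparison protocol described for [64]: logistic regression and
# Random Forests on a baseline feature set and an extended set, evaluated by AUC
# on a fixed held-out test split.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(8)
X_baseline = rng.normal(size=(5044, 14))              # Seattle-HF-like variables (simulated)
X_extra = rng.normal(size=(5044, 30))                 # BMI, race, comorbidities, ... (simulated)
X_extended = np.hstack([X_baseline, X_extra])
y = rng.integers(0, 2, size=5044)                     # 1-year survival label (simulated)

idx_train, idx_test = train_test_split(np.arange(5044), train_size=1560, random_state=0)
for name, X in [("baseline", X_baseline), ("extended", X_extended)]:
    for clf in (LogisticRegression(max_iter=1000), RandomForestClassifier(300, random_state=0)):
        clf.fit(X[idx_train], y[idx_train])
        auc = roc_auc_score(y[idx_test], clf.predict_proba(X[idx_test])[:, 1])
        print(f"{name:8s} {type(clf).__name__:24s} AUC={auc:.2f}")
```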
Taslimitehrani et al. 2016 [65] applied the CPXR(Log) classification algorithm with the probabilistic loss function to the cohort of 5044 patients described previously. The authors compared the results of the CPXR(Log) classification algorithm with those produced by decision trees, Random Forests, Adaboost, SVM and logistic regression. The CPXR(Log) classification algorithm outperformed the other classifiers, with prediction accuracies of 93.70% for 1-year mortality, 83.00% for 2-year mortality and 78.60% for 5-year mortality. The CPXR algorithm uses a pattern as the logical characterization of a subgroup of data, together with a local regression model characterizing the relationship between predictors and response for the data of that subgroup. If a patient's data match one of the patterns, the local model built for that specific group of patients is used instead of the baseline model built for the whole population. According to the authors, the analysis of those patterns revealed the heterogeneity of HF between patients; to take this heterogeneity into consideration for survival prediction, the utilization of local models and different patterns is recommended.
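A rough illustration of the pattern-aided idea (not the CPXR(Log) algorithm itself) is given below: a logical pattern defines a subgroup, a local logistic model is fitted to it, and prediction falls back to the baseline model when the pattern does not match. The pattern and the data are synthetic.

```python
# Sketch: subgroup-specific ("local") model selected when the pattern matches,
# baseline model used otherwise.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
X = rng.normal(size=(5000, 10))
y = rng.integers(0, 2, size=5000)

pattern = X[:, 0] > 1.0                               # e.g. "feature 0 elevated" subgroup (invented)
baseline = LogisticRegression(max_iter=1000).fit(X, y)
local = LogisticRegression(max_iter=1000).fit(X[pattern], y[pattern])

def predict(x_row):
    """Use the subgroup's local model when the pattern matches, else the baseline."""
    model = local if x_row[0] > 1.0 else baseline
    return model.predict(x_row.reshape(1, -1))[0]

print(predict(X[0]), predict(X[1]))
```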
Subramanian et al. 2011 [67] focused on predicting mortality within 1 year by building logistic regression models and ensemble models that incorporate time-series measurements of biomarkers such as cytokines. More specifically, three logistic regression models were built to predict survival beyond 52 weeks after entry into the trial. The models differ in the input they receive: the first model uses standard baseline measurements, allowing the experts to compare their results with those reported in the literature; the second model incorporates baseline measurements and baseline cytokine levels, thus evaluating the contribution of cytokines to the prediction of survival; and the third model adds cytokine measurements up to week 24 to the second set of predictor variables, thus assessing the utility of serial follow-up measurements for predicting survival. The ensemble model was built by combining the three models previously described; the final classification of a subject as a survivor or non-survivor is determined through a majority voting procedure.
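A minimal sketch of this majority-voting ensemble, assuming three scikit-learn logistic regressions over increasingly rich (randomly simulated) input sets.

```python
# Sketch: three logistic regression models (baseline; + baseline cytokines;
# + serial cytokine measurements up to week 24) combined by majority vote.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(10)
n = 963
baseline = rng.normal(size=(n, 10))                    # standard baseline measurements (simulated)
cyto_base = rng.normal(size=(n, 5))                    # baseline cytokine levels (simulated)
cyto_serial = rng.normal(size=(n, 15))                 # serial measurements up to week 24 (simulated)
y = rng.integers(0, 2, size=n)                         # survival beyond 52 weeks (simulated)

inputs = [baseline,
          np.hstack([baseline, cyto_base]),
          np.hstack([baseline, cyto_base, cyto_serial])]
models = [LogisticRegression(max_iter=1000).fit(X, y) for X in inputs]

votes = np.column_stack([m.predict(X) for m, X in zip(models, inputs)])
ensemble_pred = (votes.sum(axis=1) >= 2).astype(int)   # majority vote over the three models
print("agreement with the richest model alone:", (ensemble_pred == votes[:, 2]).mean())
```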
Austin et al. 2012 [66] reduced the prediction horizon of mortality to one month. Ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests and boosted regression trees, were employed. The methods were evaluated in two large cohorts of patients hospitalized with either acute myocardial infarction (16,230 subjects) or congestive heart failure (15,848 subjects), and the best results were produced by the logistic regression model.
Ramirez et al. [68] employed an SVM classifier and Holter ECG recordings from 597 CHF patients with sinus rhythm enrolled in the MUSIC study to classify them as sudden cardiac death victims, pump failure death victims or others (the latter including survivors and victims of non-cardiac causes). According to this study, the ECG risk markers quantifying the slope of the T-peak-to-end/RR regression, T-wave alternans and the heart rate turbulence slope can act as discriminators of the classes mentioned above.
Table 7 presents a short review of the literature regarding prediction of mortality.
Table 7.
Authors | Method | Data | Features | Evaluation measures |
---|---|---|---|---|
Shah et al. 2015 [28] | SVM | No. of data 527 patients |
Predictor features Phenotypic data |
Area under the receiver operating characteristic curve (AUROC): 71.80% Sensitivity: 64.00% mean Specificity: 57.70% mean Precision: 60.90% |
Source of data Data collected at the outpatient clinic of the Northwestern University HFpEF Program as part of a systematic observational study of HFpEF (ClinicalTrials.gov identifier #NCT01030991) |
Response features Death yes Death no |
Validation | ||
Validation set of 107 patients | ||||
Fonarrow et al. 2005 [61] | CART | No. of data 33,046 instances (derivation cohort) 32,229 instances (validation cohort) |
Predictor features Demographic information, medical history, baseline clinical characteristics, initial evaluation, treatment received, procedures performed, hospital course, patient disposition |
The odds ratio for mortality between patients identified as high and low risk was 12.9 |
Source of data Acute Decompensated Heart Failure National Registry (ADHERE) of patients |
Response features Low risk Intermediate risk 1 Intermediate risk 2 Intermediate risk 3 High risk |
Validation | ||
Validation set of 32,229 instances | ||||
Bohacik et al. 2013 [62] | Alternating decision tree | No. of data 2032 patients |
Predictor features Pulse rate, NT-proBNP level, blood sodium level, blood uric acid level, blood creatinine level, weight, height, gender, age. |
Sensitivity: 37.31%, Specificity: 91.53%, PPV: 60.25%, NPV: 80.94% Accuracy: 77.66% |
Source of data Hull LifeLab - a large, epidemiologically representative, information-rich clinical database | Response features Alive Dead within 6 months |
Validation | ||
10-fold cross-validation | ||||
Panahiazar et al. 2015 [64] | Logistic regression Random Forests |
No. of data 5044 HF patients |
Predictor features Demographic variables, Laboratory results, Medications, 26 major chronic conditions (ICD-9 code) as comorbidities as defined by the U.S. Department of Health and Human Services. |
1-year Logistic Regression AUC: 68.00% (baseline set) 81.00% (extended set) Random Forests AUC: 62.00% (baseline set) 80.00% (extended set) 2-years Logistic Regression AUC: 70.00% (baseline set) 74.00% (extended set) Random Forests AUC: 65.00% (baseline set) 72.00% (extended set) 5-years Logistic Regression AUC: 61.00% (baseline set) 73.00% (extended set) Random Forests AUC: 62.00% (baseline set) 72.00% (extended set) |
Source of data Electronic health records of the Mayo Clinic |
Response features 1 year 2 years 5 years survival |
Validation | ||
Testing set of 3484 patients | ||||
Taslimitehrani et al. 2016 [65] | CPXR(Log) | No. of data 5044 patients |
Predictor features Demographics, Vitals, Lab results, Medications, 24 major chronic conditions as co-morbidities. |
1-year Precision: 82.00% Recall: 78.20% Accuracy: 91.40% 2-years Precision: 78.00% Recall: 76.00% Accuracy: 83.00% 5-years Precision: 72.10% Recall: 61.50% Accuracy: 80.90% |
Source of data Electronic health records of the Mayo Clinic |
Response features 1 year 2 years 5 years survival |
Validation | ||
Testing set of 3484 patients | ||||
Austin et al. 2012 [66] | Logistic regression model (cubic smoothing splines) Boosted regression trees |
No. of data EFFECT baseline (9945 HF patients) utilized 8240 EFFECT follow up (8339 HF patients) utilized 7608 |
Predictor features Demographic characteristics, vital signs, presenting signs and symptoms, results of laboratory investigations, and previous medical history. In the CHF sample: age, systolic blood pressure, respiratory rate, sodium, urea, history of stroke or transient ischemic attack, dementia, chronic obstructive pulmonary disease, cirrhosis of the liver, and cancer. |
Logistic regression model -Splines AUC: 79% R2: 0.203 Brier's score: 0.119 Boosted regression trees (depth four) AUC: 78% R2: 0.18 Brier's score: 0.107 |
Source of data Enhanced Feedback for Effective Cardiac Treatment (EFFECT) Study |
Response feature 30-day mortality binary variable denoting whether the patient died within 30 days of hospital admission |
Validation | ||
EFFECT Follow-up sample was used as the validation. | ||||
Bohacik et al. 2015 [63] | Fuzzy model | No. of data n/a |
Predictor features Blood Creatinine Level, Height, Blood Uric Acid Level, Age, Blood Sodium Level, Sex, Weight, NT-proBNP Level, Pulse Rate |
Fuzzy model Sensitivity: 63.27% Specificity: 65.54% |
Source of data Hull LifeLab 2032 instances (HF patients) |
Response feature Class attribute (patient status) classifies the patients into alive (patients being alive six and more months after the data collection) and dead (patients passing away within six months after data collection). |
Validation | ||
10-fold cross-validation | ||||
Ramirez et al. 2015 [68] | Dichotomization thresholds Exhaustive feature selection C-SVM classifier |
No. of data 597 Chronic Heart Failure patients: 134 died (49 SCD victims, 62 PFD victims, 23 non-CD victims), 463 survivors |
Predictor features Δα, an index potentially related to dispersion in repolarization restitution IAA, an index reflecting the average TWA activity during a 24-h period TS, a parameter measuring the turbulence slope of HRT |
SCD vs. the rest Sensitivity: 55% Specificity: 68% Kappa: 0.10 PFD vs. the rest Sensitivity: 79% Specificity: 57% Kappa: 0.14 Three-class classification SCD Sensitivity: 18% Specificity: 79% Kappa: 0.11 PFD Sensitivity: 14% Specificity: 81% Kappa: 0.11 |
Source of data MUSIC (MUerte Súbita en Insuficiencia Cardiaca) study |
Response feature Sudden cardiac death (SCD) Pump failure Death (PFD) Non cardiac death Survivors |
Validation | ||
5-fold cross-validation | ||||
Subramanian et al. 2011 [67] | Ensemble Logistic regression with boosting | No. of data 963 patients |
Predictor features Standard clinical variables and time-series of cytokine and cytokine receptor levels |
AUC(c-statistic): 84% |
Source of data Vesnarinone Evaluation of Survival Trial (VEST) |
Response feature 1 year mortality |
Validation | ||
10-fold cross-validation |
SVM: Support Vector Machines, AUC: Area Under Curve, HF: Heart Failure, PPV: Positive Predictive Values, NPV: Negative Predictive Value, CART: Classification and regression tree, CHF: Congestive Heart Failure, SCD: Sudden cardiac death, PFD: Pump failure Death, CD: Cardiac Death, CA: Classification Ambiguity, CIE: Cumulative Information Estimation.
6. Summary and Outlook
HF is a chronic disease characterized by a variety of unpleasant outcomes, such as poor QoL, recurrent hospitalization, high mortality and a significant cost burden. A significant deterrent of these serious consequences is early diagnosis of HF (detection of HF, estimation of its etiology and severity), as well as early prediction of adverse events. Toward this direction the application of machine learning techniques has contributed significantly; researchers have applied data mining techniques in order to address the issues concerning the management of HF either separately or in combination.
More specifically, detection of HF is based mainly on the utilization of HRV measures in combination with classifiers such as SVM, CART and k-NN. The studies utilize either short-term or long-term HRV measures; none of them has attempted to compare or combine short- and long-term HRV measures. However, there are studies that incorporate, in the classification process, data expressing the results of clinical examination, presenting symptoms, lab tests etc. The utilization of different sources of data in each of these studies limits their comparison, unlike the methods that detect HF by utilizing HRV measures, which are applied to publicly available datasets commonly used across studies.
After the detection of HF, the estimation of the etiology or the characterization of the type of HF follows. Different classifiers were applied in order to classify a patient into one of the two major HF subtypes (HFpEF vs. HFrEF). All the studies addressed the issue as a two-class classification problem and did not take into consideration the patients belonging to the so-called "gray zone" (HFmrEF). Only Betanzos et al. 2015 [25] included this group of patients in their study; however, they did not consider patients with HFmrEF as a separate group (a three-class classification problem) but included them in one of the two major HF types by setting different cutoff points.
The next step in the management of HF concerns the estimation of its severity. According to the studies reported in the literature, the problem of HF severity estimation is transformed into a two- or three-class classification problem, where the patient's status is characterized as mild, moderate or severe. The definition of those characterizations differs between studies; for example, in some studies the characterization "severe" refers to patients belonging to NYHA class III or IV, while in others only patients belonging to NYHA class IV are included. Furthermore, according to the authors' knowledge, no one has tried to classify the patients into the four NYHA classes.
Finally, the prediction of adverse events has been attempted. Models predicting destabilizations, re-hospitalizations and mortality have been presented in the literature, with the time frame of prediction depending on the adverse event. The interest of the researchers has also turned to the prediction of HF onset, since the earlier HF is detected, the more likely it is that health outcomes can be improved. Wu et al. 2010 [23] and Aljaaf et al. 2015 [2] presented work on this specific issue, with the work of Aljaaf et al. 2015 [2] achieving the best prediction accuracy.
Recently, a research team from Sutter Health, a Northern California not-for-profit health system, and the Georgia Institute of Technology proposed a method that, according to the authors, has the potential to reduce HF rates and possibly save lives, since it can predict disease onset nine months before doctors can currently deliver the diagnosis. The method employs deep learning, a branch of machine learning based on learning representations of data, which has already been applied to problems such as computer vision and speech understanding. In the future, the application of deep learning to personalized prescriptions, therapy recommendation, clinical trial recruitment and tasks involving the prediction and detection of disease will be studied, opening a new window in the management of HF and other diseases. The current work provides a comprehensive review and comparison (Table 8), in terms of advantages and disadvantages, of the methods reported in the literature that address, either separately or in combination, all the aspects of HF management employing machine learning techniques.
Table 8.
Authors | Advantages | Disadvantages | |
---|---|---|---|
Detection of Heart Failure | Asyali et al. 2003 [7] | The discrimination power of 9 long-term HRV measures was examined and finally only one feature, SDNN, was selected for the detection of HF with high sensitivity and specificity. SDNN is a strong indicator of the presence of HF. |
The comparison with short-term measures is limited since information regarding physical activity and sleep is not included. High risk of overfitting: neither a cross-validation approach nor an independent test set is used. |
Isler et al. 2007 [8] | Standard HRV measures were combined with wavelet entropy measures leading to higher discrimination power. | k-NN utilized by the authors lacks the property of the interpretability of induced knowledge. | |
Thuraisingham 2009 [9] | A second-order difference plot of RR intervals is utilized as the basis of the CHF detection system. |
Information regarding the validation of the method is not provided. | |
Elfadil et al. 2011 [10] | Unsupervised approach. No labeling of the dataset is needed. |
Data randomly simulated are utilized for testing. | |
Pecchia et al. 2011 [11] | Provides a set of rules fully understandable by cardiologists, expressed as "if … then". | The performance depends on parameter values. A methodology addressing the imbalanced dataset is not applied. |
|
Melillo et al. 2011 [12] | Interpretability, no overfitting. |
The dataset is small and unbalanced. The method is designed to cooperate with a specific classifier in the feature selection process. |
|
Jovic et al. 2011 [13] | HRV statistical, geometric and nonlinear measures are employed | Carefully selected collection of periods T is needed. | |
Yu et al. 2012 [14] | Utilization of five category features in combination with the utilization of UCMIFS algorithm. | The value of parameter β is not determined automatically and affects the performance of the feature selector. | |
Yu et al. 2012 [15] | Novel features calculated from the bispectrum are utilized. | – | |
Liu et al. 2014 [16] | New nonstandard HRV measures are utilized | – | |
Narin et al. 2014 [17] | Inclusion of nonlinear HRV measures and wavelet-based measures. | Imbalanced dataset. Information on comorbid conditions and medication intake is not employed. |
|
Heinze et al. 2014 [18] | Ordinal patterns provide insight into distinctive RR interval dynamic differences. Automated relevance determination is applied in order to identify the deciding RR interval features for the discrimination between CHF and healthy subjects. |
– | |
Yang et al. 2010 [19] | Reliable estimation of missing values. | For the evaluation of the Bayesian principal component analysis used for the imputation of missing values, artificial missing data are introduced into complete samples. |
Gharehchopogh et al. 2011 [20] | – | Limited number of features. Demographics, Blood Pressure and Smoking are utilized. |
|
Son et al. 2012 [4] | Takes into account the feature dependencies and their collective contribution. |
No information regarding clinical histories, symptoms, or electrocardiogram results was exploited. The number of patients with CHF and with non-cardiogenic dyspnea was relatively small, a fact that produced variations when determining the risk factors and decision rules. |
|
Masetic et al. 2016 [21] | Combination of autoregressive Burg method with RF classifier. | - | |
Zheng et al. 2015 [24] | The predictor features consist of cardiac reserve indexes and heart sound characteristics. | The physiological significance corresponding to the changes of indexes should be explored in depth. | |
Heart Failure subtypes classification | Austin et al. 2013 [26] | Boosted trees, bagged trees, and random forests do not offer an advantage over conventional logistic regression. Conventional logistic regression should remain a standard tool. |
No optimization of the parameters. |
Betanzos et al. 2015 [25] | Patients belonging to “gray zone” (HFmrEF) are included in the study. | The cut-off criterion to distinguish HFpEF from HFrEF should take into consideration other information (medication, age, gender etc.) | |
Isler 2016 [27] | HR normalization also improves the statistical significances in time-domain and non-linear HRV measures. | More patient data is needed to enhance the validity of this study. | |
Authors | Advantages | Disadvantages | |
Severity estimation of Heart Failure | Akinyokun et al. 2009 [30] | The emotional and cognitive filters further refine the diagnosis results by taking care of the contextual elements of medical diagnosis. | Further information regarding the architecture of the neural networks is missing. |
Guidi et al. 2012 [31] | - | No justification of the selection of training (100 subjects) and testing set (36 subjects). | |
Guidi et al.2014 [32] | CART provides a humanly understandable decision-making process. | Generalization of the findings is not permitted due to the small sample size. | |
Guidi et al. 2015 [33] | Proposed a collaborative system for the comprehensive care of congestive heart failure. | Severity estimation of HF as mild, moderate, severe is not addressed as a three-class classification problem but as a two-class classification problem. |
Pecchia et al. 2011 [39] | Mild and severe are defined in terms of NYHA class. | No information regarding the cross-validation approach (leave-one-out, k-fold) is provided. |
Melillo et al. 2013 [40] | A modification of the CART algorithm is proposed in order to address the issue of the imbalanced dataset. | A larger dataset would confirm the generalization of the findings. The NN intervals were extracted with different procedures. It is not clarified whether the oversampling approach was applied only to the construction of the tree or also to the validation. |
|
Yang et al. 2010 [19] | Reliable estimation of missing values. | For the evaluation of the Bayesian principal component analysis used for the imputation of missing values, artificial missing data are introduced into complete samples. |
Shahbazi et al. 2015 [41] | Combination of linear and non-linear long-term HRV measures with generalized discriminant analysis. | The fact that the dataset is small and unbalanced was addressed through leave-one-out cross-validation performance estimates; the generalization of the results is therefore not possible. The sampling frequencies of the ECG recordings are not equal, and the procedures for extracting NN intervals are not the same. |
|
Sideris et al. 2015 [45] | A novel data-driven framework to extract predictive features from disease and symptom diagnostic codes is proposed. Number of cluster-based features is automatically determined through a greedy optimization methodology. |
Further information regarding the definition of the six daily threshold-based outcome variables is needed (why only heart rate and systolic blood pressure are included, and whether the ranges of these measures are differentiated depending on the patient). |
|
Prediction of adverse events Destabilization |
Candelieri et al. 2008 [52] | Presented a decision tree which was evaluated in terms of predictive performance (accuracy and sensitivity) through a suitable validation technique and it was checked by clinical experts in terms of plausibility. | Low sensitivity. |
Candelieri et al. 2009 [53] | – | Only 1 of the 4 patients belonging to the testing set but not to the training set had presented a decompensation. |
Candelieri et al. 2010 [54] | SVM hyper solution framework performing, at the same time, Model Selection, Multiple Kernel Learning and Ensemble Learning with the aim to identify the best hyper-classifier is proposed. | – | |
Guidi et al.2014 [32] | CART provides a humanly understandable decision-making process. | Generalization of the findings is not permitted due to the small sample size. | |
Guidi et al. 2015 [33] | Proposed a collaborative system for the comprehensive care of congestive heart failure. | – | |
Authors | Advantages | Disadvantages | |
Prediction of adverse events Re-hospitalizations |
Zolfaghar et al. 2013 [55] | A big data solution for predicting the 30-day risk of readmission for the CHF patients is proposed. | – |
Vedomske et al. 2013 [56] | Incorporation of billing information in the prediction of re-hospitalizations. |
Data from a single hospital are employed. Visits which contained no data for readmissions were excluded. |
|
Shah et al. 2015 [28] | Relationship between the pheno-groups and adverse outcomes. | Further demonstration of generalizability is needed. | |
Roy et al. 2015 [57] | Hierarchical classification technique for risk of readmission, dividing the prediction problem in several layers, is proposed. Algorithmic layering capability is trained and tested over two real world datasets and is currently integrated into the clinical decision support. |
– | |
Koulaouzidis et al. 2016 [58] | Telemonitoring data are employed. | Small number of predictor features. The utilization of other classifiers is not examined. No information regarding the sample size. |
|
Kang et al. 2016 [60] | It provides a preliminary understanding of the characteristics of telehomecare patients that were associated with re-hospitalization. It provides a visual depiction of the associations among risk factors, allowing a more complete exploration of the profile of patients at high risk for re-hospitalization among all patients who used telehomecare. |
Input variables do not include (bio)markers or characteristics of medication noncompliance that may affect re-hospitalization. |
Tugerman et al. 2016 [59] | A mixed-ensemble model for predicting hospital readmission is proposed. An optimization approach, which takes into account the degree of correlation between the models, the distance of the minority instances to the decision boundaries of the SVM, the penalty for misclassification errors for patients who were actually readmitted (positive readmission instances), and generalization power is proposed. |
The dataset is highly imbalanced. | |
Prediction of adverse events Mortality |
Shah et al. 2015 [28] | Relationship between the pheno-groups and adverse outcomes. | Further demonstration of generalizability is needed. |
Fonarrow et al. 2005 [61] | 5 levels of risk are estimated. | Each patient's actual risk may be influenced by many factors not measured or considered in this model. | |
Bohacik et al. 2013 [62] | Alternating decision trees allows the estimation of the contribution of each decision node in isolation. | Low sensitivity. | |
Panahiazar et al. 2015 [64] | Hazard Ratio (HR) is calculated based on real world EHR data. | - | |
Taslimitehrani et al. 2016 [65] | CPXR(Log) is used, allowing the effective building of highly accurate prediction models on datasets with diverse predictor–response relationships. |
The selection of the parameter values affecting CPXR(Log) is not justified. |
Austin et al. 2012 [66] | - | Other classifiers were not employed. Regression models did not include shrinkage or penalized estimation methods. |
|
Bochacik et al. 2015 [63] | Interpretability was evaluated using quantitative measures. An algorithmic model using computations of ambiguity and utilizing notions of fuzzy logic is proposed. |
- | |
Ramirez et al. 2015 [68] | Different etiologies of mortality are predicted. | The utilization of fully automated ECG measurements may induce imprecision. The number of SCD and PFD victims was relatively low in comparison with survivors. |
|
Subramanian et al. 2011 [67] | A multivariate logistic regression model using baseline and serial measurements of cytokine and cytokine receptors levels up to 24 weeks predicts 1-year mortality. | - |
Acknowledgment
This work is supported by the HEARTEN project that has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No 643694.
References
- 1.Ponikowski P., Voors A.A., Anker S.D., Bueno H., Cleland J.G.F., Coats A.J.S. ESC guidelines for the diagnosis and treatment of acute and chronic heart failure. Eur Heart J. 2016;2015(ehw128) doi: 10.1016/j.rec.2016.11.005. [DOI] [PubMed] [Google Scholar]
- 2.Aljaaf A.J., Al-Jumeily D., Hussain A.J., Dawson T., Fergus P., Al-Jumaily M. Third international conference on technological advances in electrical, Beirut, Lebanon. 2015. Predicting the likelihood of heart failure with a multi level risk assessment using decision tree. [Google Scholar]
- 3.Cowie M.R. The heart failure epidemic. Medicographia. 2012 [Google Scholar]
- 4.Son C.-S., Kim Y.-N., Kim H.-S., Park H.-S., Kim M.-S. Decision-making model for early diagnosis of congestive heart failure using rough set and decision tree approaches. J Biomed Inform. 2012;45:999–1008. doi: 10.1016/j.jbi.2012.04.013. [DOI] [PubMed] [Google Scholar]
- 5.Roger V.L. The heart failure epidemic. Int J Environ Res Public Health. 2010;7:1807–1830. doi: 10.3390/ijerph7041807. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Dickstein K., Cohen-Solal A., Filippatos G., McMurray J.J.V., Ponikowski P., Poole-Wilson P.A. ESC guidelines for the diagnosis and treatment of acute and chronic heart failure 2008 the task force for the diagnosis and treatment of acute and chronic heart failure 2008 of the European Society of Cardiology. Developed in collaboration with the heart failure association of the ESC (HFA) and endorsed by the European Society of Intensive Care Medicine (ESICM) Eur Heart J. 2008;29:2388–2442. doi: 10.1093/eurheartj/ehn309. [DOI] [PubMed] [Google Scholar]
- 7.Asyali M.H. 2003. Discrimination power of long-term heart rate variability measures. [Google Scholar]
- 8.Işler Y., Kuntalp M. Combining classical HRV indices with wavelet entropy measures improves to performance in diagnosing congestive heart failure. Comput Biol Med. 2007;37:1502–1510. doi: 10.1016/j.compbiomed.2007.01.012. [DOI] [PubMed] [Google Scholar]
- 9.Thuraisingham R.A. A classification system to detect congestive heart failure using second-order difference plot of RR intervals. Cardiol Res Pract. 2009 doi: 10.4061/2009/807379. (Article ID 807379) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Elfadil N., Ibrahim I. 2011. Self organizing neural network approach for identification of patients with congestive heart failure. [Google Scholar]
- 11.Pecchia L., Melillo P., Sansone M., Bracale M. Discrimination power of short-term heart rate variability measures for CHF assessment. IEEE Trans Inf Technol Biomed. 2011;15:40–46. doi: 10.1109/TITB.2010.2091647. [DOI] [PubMed] [Google Scholar]
- 12.Melillo P., Fusco R., Sansone M., Bracale M., Pecchia L. Discrimination power of long-term heart rate variability measures for chronic heart failure detection. Med Biol Eng Comput. 2011;49:67–74. doi: 10.1007/s11517-010-0728-5. [DOI] [PubMed] [Google Scholar]
- 13.Jovic A., Bogunovic N. Electrocardiogram analysis using a combination of statistical, geometric, and nonlinear heart rate variability features. Artif Intell Med. 2011;51:175–186. doi: 10.1016/j.artmed.2010.09.005. [DOI] [PubMed] [Google Scholar]
- 14.Yu S.-N., Lee M.-Y. Conditional mutual information-based feature selection for congestive heart failure recognition using heart rate variability. Comput Methods Programs Biomed. 2012;108:299–309. doi: 10.1016/j.cmpb.2011.12.015. [DOI] [PubMed] [Google Scholar]
- 15.Yu S.-N., Lee M.-Y. Bispectral analysis and genetic algorithm for congestive heart failure recognition based on heart rate variability. Comput Biol Med. 2012;42:816–825. doi: 10.1016/j.compbiomed.2012.06.005. [DOI] [PubMed] [Google Scholar]
- 16.Liu G., Wang L., Wang Q., Zhou G., Wang Y., Jiang Q. A new approach to detect congestive heart failure using short-term heart rate variability measures. PLoS One. 2014;9:e93399. doi: 10.1371/journal.pone.0093399. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Narin A., Isler Y., Ozer M. Investigating the performance improvement of HRV indices in CHF using feature selection methods based on backward elimination and statistical significance. Comput Biol Med. 2014;45:72–79. doi: 10.1016/j.compbiomed.2013.11.016. [DOI] [PubMed] [Google Scholar]
- 18.Heinze C., Trutschel D.S.U., Golz M. Proceedings of the 8th conference of the European study group on cardiovascular oscillations (ESGCO 2014) 2014. Discrimination and relevance determination of heart rate variability features for the identification of congestive heart failure. [Google Scholar]
- 19.Yang G., Ren Y., Pan Q., Ning G., Gong S., Cai G. vol. 3. 2010. A heart failure diagnosis model based on support vector machine; pp. 1105–1108. (2010 3rd international conference on biomedical engineering and informatics (BMEI)). [Google Scholar]
- 20.Gharehchopogh F.S., Khalifelu Z.A. 2011. Neural network application in diagnosis of patient: a case study, Abbottabad. [Google Scholar]
- 21.Masetic Z., Subasi A. Congestive heart failure detection using random forest classifier. Comput Methods Programs Biomed. 2016;130:54–64. doi: 10.1016/j.cmpb.2016.03.020. [DOI] [PubMed] [Google Scholar]
- 22.Goldberger A.L., Amaral L.A.N., Glass L., Hausdorff J.M., Ivanov P.C., Mark R.G. PhysioBank, PhysioToolkit, and PhysioNet components of a new research resource for complex physiologic signals. Circulation. 2000;101:e215–e220. doi: 10.1161/01.cir.101.23.e215. [DOI] [PubMed] [Google Scholar]
- 23.Wu J., Roy J., Stewart W.F. Prediction modeling using EHR data: challenges, strategies, and a comparison of machine learning approaches. Med Care. 2010;48:S106–S113. doi: 10.1097/MLR.0b013e3181de9e17. [DOI] [PubMed] [Google Scholar]
- 24.Zheng Y., Guo X., Qin J., Xiao S. Computer-assisted diagnosis for chronic heart failure by the analysis of their cardiac reserve and heart sound characteristics. Comput Methods Programs Biomed. 2015;I22:372–383. doi: 10.1016/j.cmpb.2015.09.001. [DOI] [PubMed] [Google Scholar]
- 25.Alonso-Betanzos A., Bolón-Canedo V., Heyndrickx G.R., Kerkhof P.L. Exploring guidelines for classification of major heart failure subtypes by using machine learning. Clin Med Insights Cardiol. 2015;9:57–71. doi: 10.4137/CMC.S18746. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Austin P.C., Tu J.V., Ho J.E., Levy D., Lee D.S. Using methods from the data-mining and machine-learning literature for disease classification and prediction: a case study examining classification of heart failure subtypes. J Clin Epidemiol. 2013;66:398–407. doi: 10.1016/j.jclinepi.2012.11.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27. Isler Y. Discrimination of systolic and diastolic dysfunctions using multi-layer perceptron in heart rate variability analysis. Comput Biol Med. 2016 [in press]. doi: 10.1016/j.compbiomed.2016.06.029.
- 28. Shah S.J., Katz D.H., Selvaraj S., Burke M.A., Yancy C.W., Gheorghiade M. Phenomapping for novel classification of heart failure with preserved ejection fraction. Circulation. 2015;131:269–279. doi: 10.1161/CIRCULATIONAHA.114.010637.
- 29. Fleg J.L., Piña I.L., Balady G.J., Chaitman B.R., Fletcher B., Lavie C. Assessment of functional capacity in clinical and research applications: an advisory from the Committee on Exercise, Rehabilitation, and Prevention, Council on Clinical Cardiology, American Heart Association. Circulation. 2000;102:1591–1597. doi: 10.1161/01.cir.102.13.1591.
- 30. Akinyokun C.O., Obot O.U., Uzoka F.-M.E. Application of neuro-fuzzy technology in medical diagnosis: case study of heart failure. In: Dössel O., Schlegel W.C., editors. World congress on medical physics and biomedical engineering, September 7–12, 2009. Springer Berlin Heidelberg; Munich, Germany: 2009. pp. 301–304.
- 31. Guidi G., Iadanza E., Pettenati M.C., Milli M., Pavone F., Gentili G.B. Heart failure artificial intelligence-based computer aided diagnosis telecare system. In: Donnelly M., Paggetti C., Nugent C., Mokhtari M., editors. Impact analysis of solutions for chronic disease prevention and management. Springer Berlin Heidelberg; 2012. pp. 278–281.
- 32. Guidi G., Pettenati M.C., Melillo P., Iadanza E. A machine learning system to improve heart failure patient assistance. IEEE J Biomed Health Inform. 2014;18:1750–1756. doi: 10.1109/JBHI.2014.2337752.
- 33. Guidi G., Pollonini L., Dacso C.C., Iadanza E. A multi-layer monitoring system for clinical management of congestive heart failure. BMC Med Inform Decis Mak. 2015;15(Suppl. 3):S5. doi: 10.1186/1472-6947-15-S3-S5.
- 34. Panina G., Khot U.N., Nunziata E., Cody R.J., Binkley P.F. Role of spectral measures of heart rate variability as markers of disease progression in patients with chronic congestive heart failure not treated with angiotensin-converting enzyme inhibitors. Am Heart J. 1996;131:153–157. doi: 10.1016/s0002-8703(96)90064-2.
- 35. Mietus J.E., Peng C.-K., Henry I., Goldsmith R.L., Goldberger A.L. The pNNx files: re-examining a widely used heart rate variability measure. Heart. 2002;88:378–380. doi: 10.1136/heart.88.4.378.
- 36. Musialik-Łydka A., Sredniawa B., Pasyk S. Heart rate variability in heart failure. Kardiol Pol. 2003;58:10–16.
- 37. Arbolishvili G.N., Mareev V.I., Orlova I.A., Belenkov I.N. Heart rate variability in chronic heart failure and its role in prognosis of the disease. Kardiologiia. 2006;46:4–11.
- 38. Casolo G.C., Stroder P., Sulla A., Chelucci A., Freni A., Zerauschek M. Heart rate variability and functional severity of congestive heart failure secondary to coronary artery disease. Eur Heart J. 1995;16:360–367. doi: 10.1093/oxfordjournals.eurheartj.a060919.
- 39. Pecchia L., Melillo P., Bracale M. Remote health monitoring of heart failure with data mining via CART method on HRV features. IEEE Trans Biomed Eng. 2011;58:800–804. doi: 10.1109/TBME.2010.2092776.
- 40. Melillo P., De Luca N., Bracale M., Pecchia L. Classification tree for risk assessment in patients suffering from congestive heart failure via long-term heart rate variability. IEEE J Biomed Health Inform. 2013;17:727–733. doi: 10.1109/jbhi.2013.2244902.
- 41. Shahbazi F., Asl B.M. Generalized discriminant analysis for congestive heart failure risk assessment based on long-term heart rate variability. Comput Methods Programs Biomed. 2015;122:191–198. doi: 10.1016/j.cmpb.2015.08.007.
- 42. Oba S., Sato M., Takemasa I., Monden M., Matsubara K., Ishii S. A Bayesian missing value estimation method for gene expression profile data. Bioinformatics. 2003;19:2088–2096. doi: 10.1093/bioinformatics/btg287.
- 43. Cortes C., Vapnik V. Support-vector networks. Mach Learn. 1995;20:273–297.
- 44. Fluss R., Faraggi D., Reiser B. Estimation of the Youden index and its associated cutoff point. Biom J. 2005;47:458–472. doi: 10.1002/bimj.200410135.
- 45. Sideris C., Alshurafa N., Pourhomayoun M., Shahmohammadi F., Samy L., Sarrafzadeh M. A data-driven feature extraction framework for predicting the severity of condition of congestive heart failure patients. Conf Proc IEEE Eng Med Biol Soc. 2015;2015:2534–2537. doi: 10.1109/EMBC.2015.7318908.
- 46. Ketchum E.S., Levy W.C. Multivariate risk scores and patient outcomes in advanced heart failure. Congest Heart Fail. 2011;17:205–212. doi: 10.1111/j.1751-7133.2011.00241.x.
- 47. Peterson P.N., Rumsfeld J.S., Liang L., Albert N.M., Hernandez A.F., Peterson E.D. A validated risk score for in-hospital mortality in patients with heart failure from the American Heart Association get with the guidelines program. Circ Cardiovasc Qual Outcomes. 2010;3:25–32. doi: 10.1161/CIRCOUTCOMES.109.854877.
- 48. Levy W.C., Mozaffarian D., Linker D.T., Sutradhar S.C., Anker S.D., Cropp A.B. The Seattle heart failure model: prediction of survival in heart failure. Circulation. 2006;113:1424–1433. doi: 10.1161/CIRCULATIONAHA.105.584102.
- 49. Lee D.S., Austin P.C., Rouleau J.L., Liu P.P., Naimark D., Tu J.V. Predicting mortality among patients hospitalized for heart failure: derivation and validation of a clinical model. JAMA. 2003;290:2581–2587. doi: 10.1001/jama.290.19.2581.
- 50. Philbin E.F., DiSalvo T.G. Prediction of hospital readmission for heart failure: development of a simple risk score based on administrative data. J Am Coll Cardiol. 1999;33:1560–1566. doi: 10.1016/s0735-1097(99)00059-5.
- 51. Pocock S.J., Wang D., Pfeffer M.A., Yusuf S., McMurray J.J.V., Swedberg K.B. Predictors of mortality and morbidity in patients with chronic heart failure. Eur Heart J. 2006;27:65–75. doi: 10.1093/eurheartj/ehi555.
- 52. Candelieri A., Conforti D., Perticone F., Sciacqua A., Kawecka-Jaszcz K., Styczkiewicz K. Early detection of decompensation conditions in heart failure patients by knowledge discovery: the HEARTFAID approaches. Comput Cardiol. 2008:893–896.
- 53. Candelieri A., Conforti D., Sciacqua A., Perticone F. Knowledge discovery approaches for early detection of decompensation conditions in heart failure patients. In: Ninth international conference on intelligent systems design and applications (ISDA); Pisa, Italy; November 30–December 2, 2009.
- 54. Candelieri A., Conforti D. A hyper-solution framework for SVM classification: application for predicting destabilizations in chronic heart failure patients. Open Med Inform J. 2010;4:136–140. doi: 10.2174/1874431101004010136.
- 55. Zolfaghar K., Meadem N., Teredesai A., Basu Roy S., Si-Chi C., Muckian B. Big data solutions for predicting risk-of-readmission for congestive heart failure patients. In: IEEE international conference on big data; 2013.
- 56. Vedomske M.A., Brown D.E., Harrison J.H. Random forests on ubiquitous data for heart failure 30-day readmissions prediction. In: Proceedings of the 12th international conference on machine learning and applications; 2013.
- 57. Roy S.B., Teredesai A., Zolfaghar K., Liu R., Hazel D. Dynamic hierarchical classification for patient risk-of-readmission. In: Proceedings of the 21st ACM SIGKDD international conference on knowledge discovery and data mining; 2015. pp. 1691–1700.
- 58. Koulaouzidis G., Iakovidis D.K., Clark A.L. Telemonitoring predicts in advance heart failure admissions. Int J Cardiol. 2016;216:78–84. doi: 10.1016/j.ijcard.2016.04.149.
- 59. Turgeman L., May J.H. A mixed-ensemble model for hospital readmission. Artif Intell Med. 2016;72:72–82. doi: 10.1016/j.artmed.2016.08.005.
- 60. Kang Y., McHugh M.D., Chittams J., Bowles K.H. Utilizing home healthcare electronic health records for telehomecare patients with heart failure. A decision tree approach to detect associations with rehospitalizations. Comput Inform Nurs. 2016;34(4):175–182. doi: 10.1097/CIN.0000000000000223.
- 61. Fonarow G.C., Adams K.F., Abraham W.T., Yancy C.W., Boscardin W.J.; ADHERE Scientific Advisory Committee. Risk stratification for in-hospital mortality in acutely decompensated heart failure: classification and regression tree analysis. JAMA. 2005;293:572–580. doi: 10.1001/jama.293.5.572.
- 62. Bohacik J., Kambhampati C., Davis D.N., Cleland J.G.F. Alternating decision tree applied to risk assessment of heart failure patients. J Inf Technol. 2013;6(2):25–33.
- 63. Bohacik J., Matiasko K., Benedikovic M., Nedeljakova I. Algorithmic model for risk assessment of heart failure patients. In: Proceedings of the 8th IEEE international conference on intelligent data acquisition and advanced computing systems: technology and applications; 2015.
- 64. Panahiazar M., Taslimitehrani V., Pereira N., Pathak J. Using EHRs and machine learning for heart failure survival analysis. Stud Health Technol Inform. 2015;216:40–44.
- 65. Taslimitehrani V., Dong G., Pereira N.L., Panahiazar M., Pathak J. Developing EHR-driven heart failure risk prediction models using CPXR(Log) with the probabilistic loss function. J Biomed Inform. 2016;60:260–269. doi: 10.1016/j.jbi.2016.01.009.
- 66. Austin P.C., Lee D.S., Steyerberg E.W., Tu J.V. Regression trees for predicting mortality in patients with cardiovascular disease: what improvement is achieved by using ensemble-based methods? Biom J. 2012;54(5):657–673. doi: 10.1002/bimj.201100251.
- 67. Subramanian D., Subramanian V., Deswal A., Mann D.L. New predictive models of heart failure mortality using time-series measurements and ensemble models. Circ Heart Fail. 2011;4:456–462. doi: 10.1161/CIRCHEARTFAILURE.110.958496.
- 68. Ramírez J., Monasterio V., Mincholé A., Llamedo M., Lenis G. Automatic SVM classification of sudden cardiac death and pump failure death from autonomic and repolarization ECG markers. J Electrocardiol. 2015;48:551–557. doi: 10.1016/j.jelectrocard.2015.04.002.
- 69. Adams K.F., Fonarow G.C., Emerman C.L., LeJemtel T.H., Costanzo M.R., Abraham W.T. Characteristics and outcomes of patients hospitalized for heart failure in the United States: rationale, design, and preliminary observations from the first 100,000 cases in the acute decompensated heart failure national registry (ADHERE). Am Heart J. 2005;149:209–216. doi: 10.1016/j.ahj.2004.08.005.