Abstract
Chronic diseases are among the most severe health concerns today, and heart disease is one of them. Coronary artery disease (CAD) restricts blood flow to the heart and is the most common type of heart disease that causes a heart attack. High blood pressure, high cholesterol, and smoking significantly increase the risk of heart disease. Estimating the risk of heart disease is a complex process because it depends on many input parameters. Linear and analytical models have failed because of their assumptions and limited datasets. Existing studies have used medical data for classification, which helps to identify the exact condition of the patient, but none has developed a correlation equation that can be used directly to identify patients. In this paper, mathematical models have been developed using a medical database of patients suffering from heart disease. Curve fitting and an artificial neural network (ANN) have been applied to model the condition of patients and determine whether a patient is suffering from heart disease. The curve fitting model identifies cardiac patients with a coefficient of determination (R2) of 0.6337, a mean absolute error (MAE) of 0.293, and a root mean square error (RMSE) of 0.3688, whereas the ANN-based model achieves an R2 of 0.8491, an MAE of 0.20, and an RMSE of 0.267. The ANN therefore provides better mathematical modeling than the curve fitting method in identifying heart disease patients. Medical professionals can use this model to identify heart patients without angiography or computed tomography angiography.
1. Introduction
Predicting cardiac sickness accurately may save someone's life, while an incorrect diagnosis can be deadly. Heart disease is made more likely by a host of risk factors, including high cholesterol, obesity, elevated triglyceride levels, and hypertension. Heart failure occurs when the heart's muscles fail to pump blood as efficiently as they should [1]. Shortness of breath may result from blood clots in the lungs. The heart weakens or stiffens over time due to certain cardiac conditions, such as restricted arteries in the heart or high blood pressure. People with heart disease may live longer if they get the proper treatment. A low-fat, low-sodium diet, 30 minutes of moderate exercise five days a week (or more), limiting alcohol, and quitting smoking may all help lower your risk of heart disease. You may also be prescribed medications by your doctor if lifestyle changes aren't enough to keep your cardiac disease under control. If you have a cardiac condition, the medication you get will be tailored to your specific needs. Your doctor may recommend certain therapies or surgery if medication fails to work for you. The kind of heart illness and the degree of cardiac damage will dictate the type of surgery. Classic risk factors, such as a family history of early coronary artery disease, dyslipidemia, and age, are well documented in the etiological route of ischemic heart disease in women [1].
The prevention, diagnosis, and rehabilitation of people with ischemic heart disease continue to be a major concern. A complex interplay of variables, including unique risk factors and disease pathogenesis for ischemic heart disease in women with nonobstructive coronary artery disease and coronary microvasculature and endothelium disorder, contributes to this conundrum. Chronic disease prediction is essential in healthcare informatics [2]. It is important to detect the sickness as soon as possible [3]. Heart disease and diabetes risk may be estimated using machine learning algorithms that analyze the data for these diseases. Medical, industrial, and educational domains can benefit from extracting valuable information from vast datasets via data mining. Machine learning (ML) is one of the fastest-growing fields of artificial intelligence [1, 4].
Cardiovascular disease is one of the leading causes of death. The current study focuses on detecting, altering, and addressing risk variables on an individual basis. Despite the fact that the incidence of various cardiovascular risk factors is increasing at different rates around the world, the magnitude of the increase has prompted researchers to look into the causes of the risk factors. This study's main purpose is to develop a model for detecting cardiac patients without using angiography or computed tomography angiography.
Heart disease may be detected in a variety of persons using machine-learning techniques. Neural networks (NN), decision trees (DT), CN2 rule inducers, stochastic gradient descent, and support vector machines (SVM) were evaluated, and the DT and SVM algorithms produced the best results, reaching 87.69% in the 20-fold and 10-fold cross-validation tests [5]. Na et al. [6] developed an algorithm based on heart rate variability (HRV) to distinguish panic disorder from other types of anxiety. Because panic disorder and other anxiety disorders have similar causes and symptoms, machine learning was used to create a classification model that distinguishes panic disorder from other anxiety illnesses [6]. By finding complex, nonlinear patterns of expression and linkages in data sets, ML techniques may extract underlying information; random forest models yielded AUC values of more than 0.80 for families and more than 0.98 for species [7]. Advanced tools that detect the illness early may reduce fatality rates. AI and data mining offer many methods that could help predict CVD before it happens and uncover behavioral patterns from large amounts of data. The results of these forecasts will help doctors make decisions and get patients checked early, which will help them live longer. Different models can be utilized in several classification approaches [8].
Researchers have developed a layered biometric identification system that resists presentation attacks (PAs) by fusing fingerprint and heart-signal data. Artifact attacks are avoided in the first layer using a convolutional neural network (CNN). An electrocardiography (ECG) image is used in the second layer, a lightweight CNN, to prevent corpse attacks. Next, fingerprint matches at a predetermined threshold are utilized to prevent attacks by conformists. A score-level fusion of the fingerprint and a cardiac signal is used in the last layer of biometric authentication to ensure security. Two freely available online databases of fingerprints and cardiac signals were used to evaluate the proposed system against different scenarios of authentication and assault. No false matches were found in the experiments (a false match rate, FMR, of zero), and the false nonmatch rates (FNMR) were satisfactory [9]. A data-driven strategy with a fuzzy rule-based classification system for cardiac disease detection outperforms other models while balancing interpretability and accuracy [10].
The prevalence of heart failure has been rising in lockstep with the pace of population growth [11]. A Python-based app has been made for healthcare because it is more reliable and helps with tracking and setting up different types of health-monitoring apps. A random forest classification system is being developed to diagnose cardiac problems; this method has an average accuracy of 83% on the training data [12]. Multilayer dynamic systems (MLDS) have been developed that increase their existing knowledge at each layer. For feature selection, the model employs the correlation attribute evaluator (CAE), extra trees classifier (ETC), information gain attribute evaluator (IGAE), gain ratio attribute evaluator (GRAE), and Lasso. The ensemble approach for categorization in the model was built using random forest (RF), gradient boosting (GB), and naive Bayes (NB) classifiers [13].
Several statistical approaches, including principal component analysis, were used to find the essential parameters for stroke prediction. The most critical criteria for diagnosing stroke in patients were found to be age, average glucose level, heart disease, and hypertension. Furthermore, compared with using all accessible input characteristics and with various benchmarking methodologies, a feed-forward neural network with these four properties has the greatest accuracy rate and the lowest miss rate [14]. Three alternative goals have been chosen for chronic heart failure (CHF) modeling: CHF identification as the primary diagnostic, prediction of blood pressure, and classification of CHF stages. Several machine-learning algorithms were applied to three sorts of features for each job: static, dynamic, and the entire feature set. The findings suggest that the models perform better when temporal and nontemporal information are both included [15]. Heart disease may be detected earlier because of newly developed and optimized algorithms. A system for predicting cardiovascular illness has been put into practice based on a variety of classifier algorithms, including NB, the salp swarm optimized neural network (SSA-NN), Bayesian optimized SVM (BO-SVM), and K-nearest neighbors (KNN) [16]. Heart failure models were used to analyze the severity of heart failure and predict the occurrence of adverse events such as destabilizations, re-hospitalizations, and death [17]. An ear-worn monitor for heart rate (HR) and long-term blood pressure (BP) has been proposed to improve wearability, with an SVM classifier that learns to recognize raw heartbeats in data influenced by motion artifacts [18]. With a single mechanocardiography measurement, atrial fibrillation (AFib) can be correctly identified, and acute decompensated heart failure can be diagnosed with a reasonable level of accuracy [19].
Machine learning offers new techniques for detecting significant features that enhance the accuracy of cardiovascular disease prediction. A hybrid random forest and linear model approach improves performance, reaching an accuracy of 88.7% in predicting heart disease [20]. An enhanced deep learning assisted convolutional neural network (EDCNN) algorithm was used to support and improve patient prognostics in heart disease. It has been added to the Internet of Medical Things (IoMT) for expert systems that help clinicians quickly and efficiently diagnose cardiac patients' information on cloud platforms worldwide. Compared with standard techniques, the test findings suggest that, with flexible tuning of the EDCNN hyperparameters, an accuracy of up to 99.1% can be obtained. A fast conditional mutual information feature selection approach has been developed to overcome the feature selection challenges [21].
Feature selection methods were used to improve classification accuracy and minimize the execution time of the classification system. The experimental findings suggest that the feature selection method (FCMIM) may be combined with a support vector machine classifier to create a high-level intelligent system to detect heart disease [22]. Heart rate variability is a powerful predictor for hypertensive individuals who are more likely to experience cardiovascular-related events. In contrast to the standard methodologies utilized for the same purpose, the supervised learning model is simple, efficient, and cost-effective, and it can be used for cardiac monitoring analysis [23]. Various researchers have used machine-learning algorithms to analyze ECG signals [24, 25]. As mentioned above, ML techniques can be applied in different applications and are used especially in medical identification [26–28].
The medical field has a massive amount of patient data. This data must be mined using different machine-learning methods. Healthcare experts analyze this data in order to make effective diagnostic decisions. Clinical help can be provided by analyzing medical data using classification algorithms. Existing studies have used medical data for classification, which helps to identify the exact condition of the patient, but none has developed a correlation equation that can be used directly to identify patients. In this study, basic information together with some important clinical data has been used to identify the cardiac patient at an early stage without going through angiography or CT angiography. The major contributions of this study are the following:
A correlation has been developed using curve fitting and artificial neural network (ANN) methods.
An ANN model has been developed that professionals can use to identify cardiac patients. The ANN-based model provides results with higher accuracy than the curve fitting model.
A detailed discussion of heart disease and a selective literature review have been carried out to identify the issues and parameters related to cardiac disease for testing and identification purposes.
The data has been collected from the Kaggle database, and the performance of the models has been compared. The results show that these correlations can help to identify cardiac patients easily and with higher precision.
The rest of the paper is organized in the following sections. Section 2 presents the details of data collection and data preparation which has been used for modeling. The correlation models using curve fitting and artificial neural network methods are presented in Section 3. The major findings and performance of the models of curve fitting and ANN models are summarized in Section 4.
2. Cardiac Patients Identification
Identification of cardiac patients in the early stages is important to reduce the risk of complications. To address this issue, it is proposed to develop correlations that can be utilized to identify cardiac patients. A methodology has been proposed, as shown in Figure 1. The data sets have been collected from the online database, and the data filtration and standardization operations have been performed to remove outliers and make the data dimensionless. The proposed curve fitting and ANN methods have been used to develop models, and the performance of the developed models has been tested on various performance parameters to select the best-fitted model.
Figure 1.

Methodology of the proposed work.
2.1. Data Collection and Data Preparation
The clinical parameters of heart disease patients were collected from an open-source repository (https://github.com/g-shreekant/Heart-Disease-Prediction-using-Machine-Learning) and used to develop the correlation [29]. These parameters are listed in Table 1. For modeling, all parameter values are standardized to the range 0 to 1 using equation (1). The statistical properties of the parameters used for the modeling are listed in Table 2 to describe the features of the data. Figure 2 shows the correlation plot between the input (X) and output (Y) variables.
| $Y = \dfrac{x - X_{\min}}{X_{\max} - X_{\min}}$ | (1) |
where Y is the output of the normalized value, x is the value to be normalized, Xmin is the minimum value in the selected dataset, and Xmax is the maximum value in the selected dataset.
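The normalization described by equation (1) and the symbol definitions above is ordinary min-max scaling. A minimal sketch follows, assuming NumPy is available; the function name `min_max_normalize`, the handling of constant columns, and the sample cholesterol readings are illustrative assumptions and are not taken from the study.

```python
import numpy as np

def min_max_normalize(x):
    """Scale a 1-D array to [0, 1] using Y = (x - Xmin) / (Xmax - Xmin)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        # Assumed convention: a constant column is mapped to zeros
        # to avoid division by zero.
        return np.zeros_like(x)
    return (x - x_min) / (x_max - x_min)

# Made-up cholesterol readings in mg/dl, for illustration only.
cholesterol = [200.0, 250.0, 300.0, 400.0]
normalized = min_max_normalize(cholesterol)
print(normalized)
```

Applying the function column by column makes every parameter dimensionless, so that inputs measured in different units (years, mm Hg, mg/dl) become directly comparable.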
Table 1.
Statistical data for every major parameter in the study.
| S. No. | Parameter | Symbol | Description |
|---|---|---|---|
| 1. | Age | Ac | Years |
| 2. | Sex | Sc | 1 = male, 0 = female |
| 3. | Chest pain type | Cp | Value 1: atypical angina, value 2: non-anginal pain, value 3: typical angina |
| 4. | Blood pressure | Bp | mm Hg on admission to the hospital |
| 5. | Cholesterol | Ch | mg/dl |
| 6. | Fasting blood sugar | Fb | >120 mg/dl, 1 = true; 0 = false |
| 7. | Resting electrocardiographic results | Re | Value 0: probable, value 1: normal, value 2: having ST-T wave abnormality |
| 8. | Maximum heart rate | Hr | BPM |
| 9. | Exercise-induced angina | Ex | 1 = yes; 0 = no |
| 10. | ST depression induced by exercise relative to rest | Op | - |
| 11. | The slope of the peak exercise ST segment | Sp | 0: downsloping; 1: flat; 2: upsloping |
| 12. | The number of significant vessels | Ca | (0–3) |
| 13. | Thalassemia value | Th | Value 1: fixed defect, value 2: normal blood flow, value 3: reversible defect |
Table 2.
Statistical properties of the medical data.
| Statistic | Ac | Sc | Cp | Bp | Ch | Fb | Re | Hr | Ex | Op | Sp | Ca | Th |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Maximum | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 |
| Minimum | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Mean | 0.32 | 0.36 | 0.28 | 0.15 | 0.26 | 0.60 | 0.33 | 0.17 | 0.70 | 0.18 | 0.77 | 0.55 | 0.55 |
| Median | 0.33 | 0.34 | 0.26 | 0.00 | 0.50 | 0.63 | 0.00 | 0.13 | 0.50 | 0.00 | 0.67 | 1.00 | 1.00 |
| Mode | 0.00 | 0.25 | 0.18 | 0.00 | 0.50 | 0.69 | 0.00 | 0.00 | 1.00 | 0.00 | 0.67 | 1.00 | 0.01 |
| Range | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 |
| Std. | 0.34 | 0.17 | 0.12 | 0.36 | 0.26 | 0.17 | 0.47 | 0.19 | 0.31 | 0.26 | 0.20 | 0.50 | 0.50 |
| Skewness | 0.48 | 0.71 | 1.18 | 1.98 | 0.16 | −0.54 | 0.75 | 1.27 | −0.52 | 1.31 | −0.48 | −0.19 | −0.19 |
| Kurtosis | −1.20 | 0.90 | 4.59 | 1.92 | −1.36 | −0.03 | −1.45 | 1.56 | −0.62 | 0.83 | 0.30 | −1.98 | −1.98 |
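The descriptive statistics reported in Table 2 can be computed for any normalized column with a few lines of NumPy. The sketch below is illustrative only: the function name `describe` and the sample values are assumptions, and the skewness and kurtosis use the plain moment-based (population) definitions with excess kurtosis, which may differ slightly from the estimator used to produce the table.

```python
import numpy as np

def describe(x):
    """Return basic descriptive statistics for a 1-D array."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    centered = x - mean
    m2 = np.mean(centered ** 2)  # second central moment
    m3 = np.mean(centered ** 3)  # third central moment
    m4 = np.mean(centered ** 4)  # fourth central moment
    return {
        "max": x.max(),
        "min": x.min(),
        "mean": mean,
        "median": np.median(x),
        "range": x.max() - x.min(),
        "std": x.std(ddof=1),           # sample standard deviation
        "skewness": m3 / m2 ** 1.5,     # moment-based skewness
        "kurtosis": m4 / m2 ** 2 - 3.0  # excess kurtosis (normal = 0)
    }

# Made-up normalized values, for illustration only.
sample = np.array([0.0, 0.0, 0.5, 1.0, 1.0])
stats = describe(sample)
print(stats)
```

Negative excess kurtosis, as seen for several columns of Table 2, indicates a flatter-than-normal distribution, which is typical of binary or few-level categorical inputs.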
Figure 2.

Correlation plot between input and output parameter.
Shapley additive explanations (SHAP) of the input parameters considered for modeling are shown in Figure 3. It is used to determine the contribution of each input parameter in the final predicted output. It shows that the thalassemia value (Th) is the most important parameter and fasting blood sugar (Fb) is the least important parameter in predicting a heart patient. The relevance of every variable can be determined based on the SHAP values. The red dot indicates that the feature value is high, which leads to a higher SHAP value.
Figure 3.

Feature importance: (a) SHAP feature importance measured as the mean absolute shapley values; (b) variable importance plot; (c) SHAP summary plot.
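The importance measure named in panel (a), the mean absolute Shapley value per feature, can be sketched as follows. The example assumes that a matrix of SHAP values (one row per patient, one column per feature) has already been obtained from an explainer; the function names and the small matrix of numbers are made up for illustration and are not results from this study.

```python
import numpy as np

def mean_abs_shap(shap_values):
    """Mean absolute SHAP value per feature (column)."""
    return np.abs(np.asarray(shap_values, dtype=float)).mean(axis=0)

def rank_features(shap_values, names):
    """Return (name, importance) pairs sorted from most to least important."""
    importance = mean_abs_shap(shap_values)
    order = np.argsort(importance)[::-1]
    return [(names[i], float(importance[i])) for i in order]

# Hypothetical SHAP matrix: 3 patients x 3 features (values are made up).
demo_names = ["Th", "Ca", "Fb"]
demo_shap = [
    [0.30, -0.10, 0.01],
    [-0.20, 0.15, -0.02],
    [0.25, -0.05, 0.00],
]
ranking = rank_features(demo_shap, demo_names)
print(ranking)
```

Taking the absolute value before averaging matters because positive and negative contributions would otherwise cancel, hiding features that push the prediction strongly in both directions.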
The detailed methodology of the proposed work for the identification of cardiac patients based on their medical conditions using machine-learning models is given in Figure 1.
3. Modeling of Parameters
The predicted value of the heart disease patient equation can be written as follows:
| $H_d = f(A_c, S_c, C_p, B_p, C_h, F_b, R_e, H_r, E_x, O_p, S_p, C_a, T_h)$ | (2) |
3.1. Curve Fitting Technique
The relationship between the predicted heart disease value (Hd) and patient age (Ac) is shown in Figure 4, and equation (3) shows the relationship between Hd and Ac:
| (3) |
Figure 4.

Relationship plot of Hd vs. Ac.
Let A0 = 1.2152; the value of A0 depends on the other input parameters, namely Sc, Cp, Bp, Ch, Fb, Re, Hr, Ex, Op, Sp, Ca, and Th.
Equation (3) can be written as follows:
| (4) |
Equation (4) can be rewritten as follows:
| (5) |
Figure 5 shows the plot between A0 and Sc, and the relationship between A0 and Sc is expressed in equation (6):
| (6) |
Figure 5.

Relationship plot of A0 vs. Sc.
Let B0 = 0.7647; the value of B0 depends on the other input parameters, namely Cp, Bp, Ch, Fb, Re, Hr, Ex, Op, Sp, Ca, and Th.
Equation (6) can be written as follows:
| (7) |
Equation (7) can be rewritten as follows:
| (8) |
Figure 6 shows the plot between B0 and Cp, and the relationship between B0 and Cp is expressed in equation (9):
| (9) |
Figure 6.

Relationship plot of B0 vs. Cp.
Let C0 = 0.5688; the value of C0 depends on the other input parameters, namely Bp, Ch, Fb, Re, Hr, Ex, Op, Sp, Ca, and Th.
Equation (9) can be written as follows:
| (10) |
Equation (10) can be rewritten as follows:
| (11) |
Figure 7 shows the plot between C0 and Bp, and the relationship between C0 and Bp is expressed in equation (12):
| (12) |
Figure 7.

Relationship plot of C0 vs. Bp.
Let D0 = 0.7619; the value of D0 depends on the other input parameters, namely Ch, Fb, Re, Hr, Ex, Op, Sp, Ca, and Th.
Equation (12) can be written as follows:
| (13) |
Equation (13) can be rewritten as follows:
| (14) |
Figure 8 shows the plot between D0 and Ch, and the relationship between D0 and Ch is expressed in equation (15):
| (15) |
Figure 8.

Relationship plot of D0 vs. Ch.
Let E0 = 0.87; the value of E0 depends on the other input parameters, namely Fb, Re, Hr, Ex, Op, Sp, Ca, and Th.
Equation (15) can be written as follows:
| (16) |
Equation (16) can be rewritten as follows:
| (17) |
Figure 9 shows the plot between E0 and Fb, and the relationship between E0 and Fb is expressed in equation (18):
| (18) |
Figure 9.

Relationship plot of E0 vs. Fb.
Let F0 = 0.875; the value of F0 depends on the other input parameters, namely Re, Hr, Ex, Op, Sp, Ca, and Th.
Equation (18) can be written as follows:
| (19) |
Equation (19) can be rewritten as follows:
| (20) |
Figure 10 shows the plot between F0 and Re, and the relationship between F0 and Re is expressed in equation (21):
| (21) |
Figure 10.

Relationship plot of F0 vs. Re.
Let G0 = 0.8439; the value of G0 depends on the other input parameters, namely Hr, Ex, Op, Sp, Ca, and Th.
Equation (21) can be written as follows:
| (22) |
Equation (22) can be rewritten as follows:
| (23) |
Figure 11 shows the plot between G0 and Hr, and the relationship between G0 and Hr is expressed in equation (24):
| (24) |
Figure 11.

Relationship plot of G0 vs. Hr.
Let H0 = 0.3717; the value of H0 depends on the other input parameters, namely Ex, Op, Sp, Ca, and Th.
Equation (24) can be written as follows:
| (25) |
Equation (25) can be rewritten as follows:
| (26) |
Figure 12 shows the plot between H0 and Ex, and the relationship between H0 and Ex is expressed in equation (27):
| (27) |
Figure 12.

Relationship plot of H0 vs. Ex.
Let I0 = 0.408; the value of I0 depends on the other input parameters, namely Op, Sp, Ca, and Th.
Equation (27) can be written as follows:
| (28) |
Equation (28) can be rewritten as follows:
| (29) |
Figure 13 shows the plot between I0 and Op, and the relationship between I0 and Op is expressed in equation (30):
| (30) |
Figure 13.

Relationship plot of I0 vs. Op.
Let J0 = 0.4871; the value of J0 depends on the other input parameters, namely Sp, Ca, and Th.
Equation (30) can be written as follows:
| (31) |
Equation (31) can be rewritten as follows:
| (32) |
Figure 14 shows the plot between J0 and Sp, and the relationship between J0 and Sp is expressed in equation (33):
| (33) |
Figure 14.

Relationship plot of J0 vs. Sp.
Let K0 = 0.469; the value of K0 depends on the other input parameters, namely Ca and Th.
Equation (33) can be written as follows:
| (34) |
Equation (34) can be rewritten as follows:
| (35) |
Figure 15 shows the plot between K0 and Ca, and the relationship between K0 and Ca is expressed in equation (36):
| (36) |
Figure 15.

Relationship plot of K0 vs. Ca.
Let L0 = 0.5168; the value of L0 depends on only one remaining input parameter, Th.
Equation (36) can be written as follows:
| (37) |
Equation (37) can be rewritten as follows:
| (38) |
Figure 16 shows the plot between L0 and Th, and the relationship between L0 and Th is expressed in equation (39):
| (39) |
Figure 16.

Relationship plot of L0 vs. Th.
Let M0 = 0.6861.
Equation (39) can be written as follows:
| (40) |
Equation (40) can be rewritten as follows:
| (41) |
Solving equations (5), (8), (11), (14), (17), (20), (23), (26), (29), (32), (35), (38), and (41) together gives Hd:
| (42) |
Equation (42) can be utilized for identifying the heart disease patient. Figure 17 shows a comparison between the predicted behavior and that reported through the medical test at an average value of M0, that is, 0.6881. The curve fitting method can predict heart disease patients with an R2-value of 0.6337, a mean absolute error of 0.293, and an RMSE of 0.3688. The curve fitting method gives poor performance; hence, it cannot be used for prediction purposes.
Figure 17.

A comparison between the predicted behavior and that reported through the medical test.
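The three performance indices used to judge the models (R2, MAE, and RMSE) can be computed from predicted and reported values as sketched below. This is an illustrative sketch: the function name `regression_metrics` and the sample values are assumptions, and R2 is computed here as one minus the ratio of residual to total sum of squares, which is a common definition but is not confirmed as the one used in this study.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R2, MAE, and RMSE for predicted vs. reported values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    mae = np.mean(np.abs(residual))             # mean absolute error
    rmse = np.sqrt(np.mean(residual ** 2))      # root mean square error
    ss_res = np.sum(residual ** 2)              # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                  # coefficient of determination
    return {"R2": r2, "MAE": mae, "RMSE": rmse}

# Made-up reported labels and model outputs, for illustration only.
reported = [1, 0, 1, 0]
predicted = [0.8, 0.2, 0.6, 0.4]
metrics = regression_metrics(reported, predicted)
print(metrics)
```

Because the reported output is binary (0 or 1) while the correlation returns a continuous value, these regression-style indices measure how closely the continuous prediction tracks the labels rather than classification accuracy.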
3.2. ANN Technique
In 1943, Warren McCulloch and Walter Pitts introduced a new type of network called the neural network. The artificial neural network (ANN) model, which is inspired by biological neurons, is one of the most extensively used machine learning approaches. An ANN is a commonly used statistical model for detecting the relationship between input and output via a set of interconnected data structures with multiple neurons capable of extensive calculations for information representation and data processing. An ANN model can be trained to forecast the required output from the supplied input. Like the human brain, on which it is loosely modeled, an ANN is made up of a sequence of linked neurons stacked in layers. The weights linking the neurons determine the capacity of the ANN structure to process the provided information. The ANN structure can be either feedforward or recurrent; the feedforward structure is the most commonly employed in engineering and is also used in this study. The feedforward network is made up of three layers: input, hidden, and output, as shown in Figure 18. Neurons in the same layer are not connected with each other, but they are connected to the neurons in the adjacent layers, and each connection has its own weight. Gradient descent and backpropagation are generally implemented to decrease errors. The data are commonly separated at random into three sections, 90% for training, 5% for validation, and 5% for testing; the same approach is employed in this investigation. TANSIG (43) and PURELIN (44) were chosen as the activation functions in the hidden and output layers, respectively.
| $\mathrm{tansig}(x) = \dfrac{2}{1 + e^{-2x}} - 1$ | (43) |
| $\mathrm{purelin}(x) = x$ | (44) |
Figure 18.

Structure of ANN.
ANN is one of the popular machine-learning techniques utilized to predict heart disease patients. A total of 303 samples were employed to model the parameters, with 90% of the data used for training, 5% for validation, and 5% for testing. A single hidden layer with 5 to 25 neurons was tried in order to obtain the best network. A trial-and-error approach was applied to the performance indices (R and MSE) to rank the networks on the training, testing, and validation datasets. The ranking results reveal that 10 neurons in the hidden layer give the best performance.
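The random 90/5/5 partition of the 303 samples can be sketched as follows. This is a minimal illustration using NumPy; the function name `split_indices`, the fixed seed, and the rounding rule are assumptions, so the exact subset sizes used in the study may differ by a sample or two.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.90, val_frac=0.05, seed=0):
    """Randomly partition sample indices into train/validation/test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)   # shuffled indices 0..n_samples-1
    n_train = int(round(train_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]       # the remainder goes to testing
    return train, val, test

train_idx, val_idx, test_idx = split_indices(303)
print(len(train_idx), len(val_idx), len(test_idx))
```

With only about 15 samples each in the validation and test subsets, performance indices measured on them can vary noticeably from one random split to another, so repeating the split with different seeds is a reasonable robustness check.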
Figure 19(a) presents the error plot and Figure 19(b) the performance plot of the ANN model; the minimum MSE of 0.08781 was obtained at 12 iterations. Figure 19(c) depicts the learning process of the best network in terms of gradient, momentum, and validation checks. The detailed histogram of the training, validation, and testing of input data is shown in Figure 20.
Figure 19.

(a) Histogram of training, testing, and validation data. (b) Performance plot. (c) Learning process.
Figure 20.

Results of ANN model: (a) Training data, (b) validation data, (c) testing data, and (d) all data.
The following is the mathematical expression between the standardized input parameters and the output:
| (45) |
The heart disease patient can be predicted using equation (43). A value of Hd equal to 1 indicates that the patient has heart disease, and a value of 0 indicates that the patient is not suffering from heart disease:
| (46) |
where the hidden neuron responses Ai (i = 1 to 10) are fed to the network output value and can be calculated with the equation (47).
| (47) |
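The structure described in this section (13 standardized inputs, one hidden layer of 10 neurons with the TANSIG activation, and a PURELIN output) corresponds to the forward pass sketched below. The sketch assumes the MATLAB-style definitions of the two activation functions; the weights and biases are random placeholders, not the fitted coefficients of the study, and the 0.5 decision threshold is an assumption added for illustration.

```python
import numpy as np

def tansig(n):
    """MATLAB-style tansig: 2 / (1 + exp(-2n)) - 1, equal to tanh(n)."""
    return 2.0 / (1.0 + np.exp(-2.0 * n)) - 1.0

def purelin(n):
    """Linear (identity) transfer function."""
    return n

def forward(x, W1, b1, W2, b2):
    """Single-hidden-layer feedforward pass returning the predicted Hd."""
    A = tansig(W1 @ x + b1)        # hidden neuron responses A_i
    return purelin(W2 @ A + b2)    # network output Hd

rng = np.random.default_rng(0)
n_inputs, n_hidden = 13, 10

# Placeholder parameters (random); a trained network would supply these.
W1 = rng.normal(size=(n_hidden, n_inputs))
b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=n_hidden)
b2 = 0.0

x = rng.uniform(0.0, 1.0, size=n_inputs)  # one standardized patient record
hd = forward(x, W1, b1, W2, b2)
label = int(hd >= 0.5)  # assumed threshold: 1 = heart disease, 0 = none
print(hd, label)
```

Because the output layer is linear, the raw prediction is not confined to the interval from 0 to 1, which is why some thresholding or rounding rule is needed before it can be read as a yes/no diagnosis.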
As shown in Figure 20, the final correlation can predict the patient with an R2-value of 0.8491, a mean absolute error of 0.20, and an RMSE of 0.267. It has therefore been determined that the correlation given by (44) is the best for predicting the heart disease patient. As shown in Figure 21, most of the output values lie at the two points 0 and 1. The proposed model is suitable for identifying heart disease patients.
Figure 21.

Results obtained from ANN.
4. Conclusion
Heart disease is one of the dangerous chronic diseases in which the patient lives at risk of heart attack or even death. The current study generated a correlation for identifying heart disease patients. Curve fitting and ANN were applied to the normalized medical data to develop the correlations. The key finding of this investigation is that the curve fitting-based correlation is not suitable for identifying the heart disease patient because its accuracy is low: it predicts with an R2-value of 0.6337, a mean absolute error of 0.293, and an RMSE of 0.3688. The ANN-based correlation can identify the heart disease patient with a coefficient of determination of 0.8491, an MAE of 0.20, and an RMSE of 0.267. The ANN-based correlation is therefore the more accurate of the two for identifying the heart disease patient. This model can be utilized to identify the heart disease patient without the need for angiography or computed tomography angiography.
Notation
- Ac: Age
- Sc: Sex
- Cp: Chest pain type
- Bp: Blood pressure
- Ch: Cholesterol
- Fb: Fasting blood sugar
- Re: Resting electrocardiographic results
- Hr: Maximum heart rate
- Ex: Exercise-induced angina
- Op: ST depression induced by exercise relative to rest
- Sp: The slope of the peak exercise ST segment
- Ca: The number of significant vessels
- Th: Thalassemia value
- R: Correlation coefficient
- R2: Coefficient of determination
- MSE: Mean square error
- SSA-NN: Salp swarm optimized neural network
- BO-SVM: Bayesian optimized SVM
- AFib: Atrial fibrillation
- SHAP: Shapley additive explanations
Acronyms
- CAD: Coronary artery disease
- ANN: Artificial neural network
- RMSE: Root mean square error
- ML: Machine learning
- NN: Neural networks
- DT: Decision trees
- SVM: Support vector machine
- HRV: Heart rate variability
- CNN: Convolutional neural network
- CAE: Correlation attribute evaluator
- ETC: Extra trees classifier
- IGAE: Information gain attribute evaluator
- GRAE: Gain ratio attribute evaluator
- RF: Random forest
- GB: Gradient boosting
- NB: Naive Bayes
- CHF: Chronic heart failure
- BP: Blood pressure
- MAE: Mean absolute error
Data Availability
The clinical parameters of heart disease patients used to develop the correlation were collected from an open-source repository (https://github.com/g-shreekant/Heart-Disease-Prediction-using-Machine-Learning).
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
References
- 1. Bharti R., Khamparia A., Shabaz M., Dhiman G., Pande S., Singh P. Prediction of heart disease using a combination of machine learning and deep learning. Computational Intelligence and Neuroscience. 2021;2021:11. doi: 10.1155/2021/8387680.
- 2. Mastoi Q. U. A., Wah T. Y., Mohammed M. A., et al. Novel DERMA fusion technique for ECG heartbeat classification. Life. 2022;12(6):p. 842. doi: 10.3390/life12060842.
- 3. Rahman A. U., Saeed M., Mohammed M. A., Krishnamoorthy S., Kadry S., Eid F. An integrated algorithmic MADM approach for heart diseases’ diagnosis based on neutrosophic hypersoft set with possibility degree-based setting. Life. 2022;12(5):p. 729. doi: 10.3390/life12050729.
- 4. Elhoseny M., Mohammed M. A., Mostafa S. A., et al. A new multi-agent feature wrapper machine learning approach for heart disease diagnosis. Computers, Materials & Continua. 2021;67(1):51–71. doi: 10.32604/cmc.2021.012632.
- 5. Pires I. M., Marques G., Garcia N. M., Ponciano V. Machine learning for the evaluation of the presence of heart disease. Procedia Computer Science. 2020;177:432–437. doi: 10.1016/j.procs.2020.10.058.
- 6. Na K. S., Cho S. E., Cho S. J. Machine learning-based discrimination of panic disorder from other anxiety disorders. Journal of Affective Disorders. 2021;278:1–4. doi: 10.1016/j.jad.2020.09.027.
- 7. Edreira D. F., Blanco J. L., Lozano C. F. Machine Learning analysis of the human infant gut microbiome identifies influential species in type 1 diabetes. Expert Systems with Applications. 2021;185:115648. doi: 10.1016/j.eswa.2021.115648.
- 8. Swathy M., Saruladha K. A comparative study of classification and prediction of Cardio-Vascular Diseases (CVD) using Machine Learning and Deep Learning techniques. ICT Express. 2021;8(1). doi: 10.1016/j.icte.2021.08.021.
- 9. Jomaa R. M., Islam M. S., Mathkour H., Ahmadi S. A. A multilayer system to boost the robustness of fingerprint authentication against presentation attacks by fusion with heart-signal. Journal of King Saud University - Computer and Information Sciences. 2022. doi: 10.1016/j.jksuci.2022.01.004.
- 10. Bahani K., Moujabbir M., Ramdani M. An accurate fuzzy rule-based classification systems for heart disease diagnosis. Scientific African. 2021;14:e01019. doi: 10.1016/j.sciaf.2021.e01019.
- 11. Sanni R. R., Guruprasad H. S. Analysis of performance metrics of heart failured patients using Python and machine learning algorithms. Global Transitions Proceedings. 2021;2(2):233–237. doi: 10.1016/j.gltp.2021.08.028.
- 12. Chang V., Bhavani V. R., Xu A. Q., Hossain M. A. An artificial intelligence model for heart disease detection using machine learning algorithms. Healthcare Analytics. 2022;2:100016. doi: 10.1016/j.health.2022.100016.
- 13. Uddin M. N., Halder R. K. An ensemble method based multilayer dynamic system to predict cardiovascular disease using machine learning approach. Informatics in Medicine Unlocked. 2021;24:100584. doi: 10.1016/j.imu.2021.100584.
- 14. Bahalkeh E., Hasan I., Yih Y. The relationship between intensive care unit length of stay information and its operational performance. Healthcare Analytics. 2022;2:100036. doi: 10.1016/j.health.2022.100036.
- 15. Balabaeva K., Kovalchuk S. Comparison of temporal and non-temporal features effect on machine learning models quality and interpretability for chronic heart failure patients. Procedia Computer Science. 2019;156:87–96. doi: 10.1016/j.procs.2019.08.183.
- 16. Patro S. P., Nayak G. S., Padhy N. Heart disease prediction by using novel optimization algorithm: a supervised learning prospective. Informatics in Medicine Unlocked. 2021;26:100696. doi: 10.1016/j.imu.2021.100696.
- 17. Tripoliti E. E., Papadopoulos T. G., Karanasiou G. S., Naka K. K., Fotiadis D. I. Heart failure: diagnosis, severity estimation and prediction of adverse events through machine learning techniques. Computational and Structural Biotechnology Journal. 2017;15:26–47. doi: 10.1016/j.csbj.2016.11.001.
- 18. Zhang Q., Zeng X., Hu W., Zhou D. A machine learning-empowered system for long-term motion-tolerant wearable monitoring of blood pressure and heart rate with ear-ECG/PPG. IEEE Access. 2017;5:10547. doi: 10.1109/ACCESS.2017.2707472.
- 19. Mehrang S., Lahdenoja O., Kaisti M., et al. Classification of atrial fibrillation and acute decompensated heart failure using smartphone Mechanocardiography: a multilabel learning approach. IEEE Sensors Journal. 2020;20(14):7957–7968. doi: 10.1109/JSEN.2020.2981334.
- 20. Mohan S., Thirumalai C., Srivastava G. Effective heart disease prediction using hybrid machine learning techniques. IEEE Access. 2019;7:81542. doi: 10.1109/ACCESS.2019.2923707.
- 21. Pan Y., Fu M., Cheng B., Tao X., Guo J. Enhanced deep learning assisted convolutional neural network for heart disease prediction on the internet of medical things platform. IEEE Access. 2020;8:189503. doi: 10.1109/ACCESS.2020.3026214.
- 22. Li J. P., Haq A. U., Din S. U., Khan J., Khan A., Saboor A. J. I. A. Heart disease identification method using machine learning classification in e-healthcare. IEEE Access. 2020;8:107562. doi: 10.1109/access.2020.3001149.
- 23. Alkhodari M., Islayem D. K., Alskafi F. A., Khandoker A. H. Predicting hypertensive patients with higher risk of developing vascular events using heart rate variability and machine learning. IEEE Access. 2020;8:192727. doi: 10.1109/access.2020.3033004.
- 24. Ketu S., Mishra P. K. Empirical analysis of machine learning algorithms on imbalance electrocardiogram based arrhythmia dataset for heart disease detection. Arabian Journal for Science and Engineering. 2022;47(2):1447–1469. doi: 10.1007/s13369-021-05972-2.
- 25. Hasnony I. M. E., Elzeki O. M., Alshehri A., Salem H. Multi-label active learning-based machine learning model for heart disease prediction. Sensors. 2022;22(3):p. 1184. doi: 10.3390/s22031184.
- 26. Subathra M. S. P., Mohammed M. A., Maashi M. S., Zapirain B. G., Sairamya N. J., George S. T. Detection of focal and non-focal electroencephalogram signals using fast Walsh-Hadamard transform and artificial neural network. Sensors. 2020;20(17):p. 4952. doi: 10.3390/s20174952.
- 27. Rahman A. U., Saeed M., Mohammed M. A., Jaber M. M., Zapirain B. G. A novel fuzzy parameterized fuzzy hypersoft set and riesz summability approach based decision support system for diagnosis of heart diseases. Diagnostics. 2022;12(7):p. 1546. doi: 10.3390/diagnostics12071546.
- 28. Abualkishik A. Z., Alwan A. A. Multi-objective chaotic butterfly optimization with deep neural network based sustainable healthcare management systems. American Journal of Business and Operations Research. 2021;4(2):39–48. doi: 10.54216/ajbor.040203.
- 29. Shreekant G. Heart disease Prediction using Machine Learning. 2021. https://github.com/g-shreekant/Heart-Disease-Prediction-using-Machine-Learning.
