Heliyon. 2024 Aug 8;10(16):e36051. doi: 10.1016/j.heliyon.2024.e36051

Artificial intelligence algorithms permits rapid acute kidney injury risk classification of patients with acute myocardial infarction

Jun Wei a,b,1, Dabei Cai c,d,1, Tingting Xiao c,1, Qianwen Chen c, Wenwu Zhu e, Qingqing Gu c, Yu Wang c, Qingjie Wang c,d, Xin Chen c,⁎⁎⁎, Shenglin Ge a,⁎⁎, Ling Sun c,d,
PMCID: PMC11367145  PMID: 39224361

Abstract

Objective

This study aimed to develop and validate several artificial intelligence (AI) models to identify acute myocardial infarction (AMI) patients at an increased risk of acute kidney injury (AKI) during hospitalization.

Methods

Included were patients diagnosed with AMI from the Medical Information Mart for Intensive Care (MIMIC) III and IV databases. Two cohorts of AMI patients from Changzhou Second People's Hospital and Xuzhou Central Hospital were used for external validation of the models. Patients' demographics, vital signs, clinical characteristics, laboratory results, and therapeutic measures were extracted. In total, 12 AI models were developed. The area under the receiver operating characteristic curve (AUC) was calculated for each model and compared.

Results

AKI occurred during hospitalization in 1098 (28.3 %) of the 3882 patients finally enrolled, who were randomly split into training (3105) and test (777) sets. Among the 12 models, the Random Forest (RF), C5.0 and Bagged CART models outperformed the others in both the training and test sets. Their AUCs in the test set were 0.754, 0.734 and 0.730, respectively. The incidence of AKI was 9.8 % among the 2202 AMI patients in the Changzhou cohort and 9.5 % among the 807 AMI patients in the Xuzhou cohort. The AUCs in the Changzhou cohort were RF, 0.761; C5.0, 0.733; and Bagged CART, 0.725, and those in the Xuzhou cohort were RF, 0.799; C5.0, 0.808; and Bagged CART, 0.784.

Conclusion

Several machine learning-based prediction models for AKI after AMI were developed and validated. The RF, C5.0 and Bagged CART models performed robustly in identifying high-risk patients earlier.

Clinical trial approval statement

This trial was registered in the Chinese clinical trials registry: ChiCTR1800014583. Registered January 22, 2018 (http://www.chictr.org.cn/searchproj.aspx).

Keywords: Acute kidney injury, Acute myocardial infarction, Artificial intelligence, Random forest

1. Introduction

Globally, acute myocardial infarction (AMI) continues to carry high morbidity and mortality [1]. Annually, about 550,000 people experience a first AMI episode and 200,000 experience a recurrent AMI [2]. AMI often has multiple complications, such as cardiogenic shock, heart failure after myocardial infarction (MI), acute kidney injury (AKI), malignant ventricular arrhythmia, and even death [3,4]. Because the prognoses of patients are heterogeneous [5], reliable prognostic tools are urgently needed.

During hospitalization, patients with AMI, especially those undergoing percutaneous coronary intervention (PCI), may suffer from renal deterioration, which is multifactorial and mostly characterized by AKI [6]. AKI has a complex pathogenesis involving hemodynamics (sustained hypotension and ischemia-reperfusion), metabolism, medication (especially renin-angiotensin axis blockers), and concurrent conditions such as sepsis, hemorrhage, atherosclerotic disease, and acute hyperglycemia [7,8]. AKI occurs in approximately 10–15 % of all hospitalized patients and in more than 50 % of those in the intensive care unit [9]. Observational studies have shown a non-linear association between small increases in serum creatinine concentration and the risk of short- and long-term adverse outcomes in patients with and without chronic kidney disease [10]. Evidence has also shown that even mild AKI (e.g., a 50 % increase in serum creatinine) is associated with an increased risk of hospital mortality [11]. AMI patients who develop AKI during hospitalization may face a significantly higher risk of major adverse cardiovascular events [12]. Since 2008, observational studies have demonstrated a strong, reproducible association between AKI, even when mild, and subsequent chronic kidney disease [10]. Therefore, more efficient methods are needed to detect AKI risk early and thereby improve the prognosis of AMI patients.

Artificial intelligence (AI) algorithms are sets of computational steps and rules designed to accomplish specific tasks. They can analyze and process large amounts of data to discover patterns and regularities, and use these patterns to perform tasks such as prediction, classification, and generation of new content. Common AI algorithms include machine learning algorithms and deep learning algorithms. In our previous study, we found that machine learning methods, particularly the random forest (RF) algorithm, improved the accuracy of risk stratification for AKI in AMI patients [13]. It has also been reported that a real-time system built with machine learning methods and echocardiography videos was more accurate than cardiologists in distinguishing takotsubo syndrome from AMI [14]. Compared with traditional scoring methods, AI algorithms can handle large amounts of data, which enables them to discover hidden relationships in the data and provide more comprehensive and accurate risk predictions. The purpose of this study was to develop several AI models to identify AMI patients at an increased risk of AKI during hospitalization using MIMIC database data, and to validate the performance of the models in real-world cohorts from two cardiac centers.

2. Methods

2.1. Ethics statement

In this study, the analysis of MIMIC database data was approved by the Beth Israel Deaconess Medical Center and the MIT Institutional Review Board. Because the patient information in the database is anonymous, informed consent from patients is not required. The online course was completed (id: 44703031) and the database was then accessed. All the enrolled patients from Changzhou Second People's Hospital and Xuzhou Central Hospital had signed informed consent. The study had been approved by the Ethics Committee of Changzhou Second People's Hospital (Ethics number: [2018]KY005-01) and Xuzhou Central Hospital (Ethics number: XZXY-LK-20211021-037). The project was performed in accordance with the Declaration of Helsinki.

2.2. Patients and grouping

Data of patients with AMI were extracted from the MIMIC III and IV databases for model training. Structured Query Language programming in Navicat Premium (version 15.0.12) was used for data extraction. ICD-9 (International Classification of Diseases, Ninth Revision) codes 41000–41092 were used to identify AMI patients [13]. Inclusion criteria were: (1) a diagnosis of AMI; (2) age between 18 and 90 years. Exclusion criteria were: (1) inadequate serum creatinine and troponin test results; (2) more than 5 % missing data; (3) hospitalization for recurrent AMI; (4) severe infection, immune system disease, or severe liver and kidney dysfunction. Data from the MIMIC III and IV databases were then merged, and 80 % of the patients in the merged dataset were randomly assigned to a training set and the remaining 20 % to a test set.
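The random 80 %/20 % assignment can be illustrated with a short sketch. This is not the authors' code (the study used R); the function name `split_train_test`, the seed, and the use of integer patient identifiers are assumptions made for illustration only.

```python
import random


def split_train_test(patient_ids, train_fraction=0.8, seed=42):
    """Randomly assign patient IDs to a training list and a test list."""
    ids = list(patient_ids)
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)  # truncate to a whole patient
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers 0..3881 standing in for the 3882 enrolled patients.
train_ids, test_ids = split_train_test(range(3882))
print(len(train_ids), len(test_ids))
```

With 3882 patients, truncating 80 % to a whole number gives 3105 training and 777 test patients, which matches the cohort sizes reported in this study.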

Two cohorts of AMI patients from Changzhou Second People's Hospital and Xuzhou Central Hospital, enrolled from December 2012 to January 2022, were used for external validation of the models. Finally, 3882 AMI patients from the MIMIC databases, 2202 AMI patients from the Changzhou cohort and 807 eligible AMI patients from the Xuzhou cohort were enrolled in the study. Patients' demographics, vital signs, clinical characteristics, laboratory results, and therapeutic measures were extracted.

The enrolled patients were divided into two groups according to the occurrence of AKI during hospitalization. The flow chart of the study is shown in Fig. 1.

Fig. 1.


Flow diagram of the selection process of patients. The left flowchart shows the whole process of data extraction; the right flowchart shows the whole process of AI algorithm development and evaluation. AMI: acute myocardial infarction; AI: artificial intelligence; AKI: acute kidney injury; SMOTE: Synthetic Minority Oversampling Technique.

2.3. Study endpoint

The study endpoint event was AKI during hospitalization. AKI was defined according to the most recent international clinical practice guidelines [15]. Estimated glomerular filtration rate (eGFR) was calculated according to the Modification of Diet in Renal Disease (MDRD) formula [16]. A diagnosis of AKI was established by three criteria: (a) a creatinine elevation of ≥0.3 mg/dl (≥26.5 μmol/l) within 48 h; (b) an increase in creatinine to more than 1.5 times the baseline, known or presumed to have occurred within the previous 7 days; and (c) a urinary output of <0.5 ml/kg/h over a period of 6 h.
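The three criteria can be written as a simple rule check. The sketch below assumes that meeting any one criterion is sufficient for a diagnosis, which the text above does not state explicitly; the function and parameter names are hypothetical, and a missing measurement is passed as `None`.

```python
def meets_aki_criteria(creatinine_rise_48h_mg_dl=None,
                       creatinine_ratio_7d=None,
                       urine_output_ml_kg_h_6h=None):
    """Return True if any available measurement satisfies an AKI criterion.

    Assumption: the criteria are combined with a logical OR.
    """
    checks = []
    if creatinine_rise_48h_mg_dl is not None:
        # (a) absolute creatinine rise of at least 0.3 mg/dl within 48 h
        checks.append(creatinine_rise_48h_mg_dl >= 0.3)
    if creatinine_ratio_7d is not None:
        # (b) creatinine more than 1.5 times baseline within 7 days
        checks.append(creatinine_ratio_7d > 1.5)
    if urine_output_ml_kg_h_6h is not None:
        # (c) urine output below 0.5 ml/kg/h over 6 h
        checks.append(urine_output_ml_kg_h_6h < 0.5)
    return any(checks)


print(meets_aki_criteria(creatinine_rise_48h_mg_dl=0.4))
print(meets_aki_criteria(creatinine_ratio_7d=1.2, urine_output_ml_kg_h_6h=0.8))
```

The first call satisfies criterion (a); the second satisfies none of the criteria.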

2.4. Synthetic Minority Oversampling Technique

Imbalanced classes are one of the common problems in classifier model training. Therefore, a synthetic dataset with 50 % positive results was created by Synthetic Minority Oversampling Technique (SMOTE), independently used for model training. SMOTE covers more minority groups and provides more relevant minority samples [17].
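The core idea of SMOTE, interpolating between a minority-class sample and one of its nearest minority-class neighbors, can be sketched in a few lines. This is a simplified illustration and not the implementation used in the study; the function name `smote`, the choice of `k`, and the toy data are assumptions.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating toward neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority class with two features per sample (hypothetical values).
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority_samples, n_synthetic=6, k=2)
print(len(new_samples))
```

Because each synthetic point lies on a segment between two real minority samples, the new samples stay inside the region already occupied by the minority class.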

2.5. Feature selection

Boruta, an algorithm based on an RF classifier, selects all features relevant to the dependent variable, rather than the smallest compact set of features best suited to a particular model [18]. In the present study, we employed Boruta to select relevant features.
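The shadow-feature idea behind Boruta can be illustrated as follows. Each feature is compared with permuted ("shadow") copies of the features, and it is kept only if it repeatedly scores higher than the best shadow. Note the substitutions: real Boruta takes importances from a random forest and applies a statistical test, whereas this sketch uses the absolute Pearson correlation as a stand-in importance score and a fixed hit-rate threshold. All names and data below are hypothetical.

```python
import random


def abs_corr(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5


def shadow_feature_selection(columns, y, n_iter=20, min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shuffled (shadow) feature often enough."""
    rng = random.Random(seed)
    hits = {name: 0 for name in columns}
    for _ in range(n_iter):
        shadow_scores = []
        for values in columns.values():
            shuffled = values[:]
            rng.shuffle(shuffled)  # permuting breaks any link with y
            shadow_scores.append(abs_corr(shuffled, y))
        best_shadow = max(shadow_scores)
        for name, values in columns.items():
            if abs_corr(values, y) > best_shadow:
                hits[name] += 1
    return [name for name, h in hits.items() if h / n_iter >= min_hit_rate]


data_rng = random.Random(1)
signal = [data_rng.gauss(0, 1) for _ in range(200)]
noise = [data_rng.gauss(0, 1) for _ in range(200)]
outcome = [1 if s > 0 else 0 for s in signal]
selected = shadow_feature_selection({"signal": signal, "noise": noise}, outcome)
print(selected)
```

In this toy example the `signal` column determines the outcome and is therefore expected to be retained, while a pure-noise column only occasionally beats its shadows.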

2.6. AI algorithms

Features were first selected from the training cohort to construct multiple AI models, including the Decision Tree (DT), Support Vector Machine (SVM), RF, Extreme Gradient Boosting (XGBoost), Gradient Boosting Machine (GBM), Bagged CART, C5.0, Multi-layer Perceptron (MLP), Latent Dirichlet Allocation (LDA), Generalized Linear Model (GLM), K-Nearest Neighbor (KNN), and Neural Network (NNET) models. Six-fold cross-validation was used during model training.
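Six-fold cross-validation partitions the training data into six folds, with each fold serving once as the validation part while the remaining five are used for fitting. A minimal sketch of the index bookkeeping is given below; the function name and the seed are assumptions, and the study itself relied on R tooling.

```python
import random


def k_fold_indices(n_samples, k=6, seed=0):
    """Yield (training_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# 3105 is the size of the training cohort in this study.
splits = list(k_fold_indices(3105, k=6))
print([len(validation) for _, validation in splits])
```

Every sample appears in exactly one validation fold, so each model configuration is evaluated once on every training-cohort patient.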

DT [19], a basic classification and regression method, recursively splits a data set so that the data in each subset belong to the same category as far as possible. SVM [20], a binary classification model, classifies by finding an optimal hyperplane in the feature space that maximizes the margin between the two categories while minimizing the classification error. RF [21] is an ensemble learning method that improves prediction performance by building multiple decision trees and combining their predictions. Each decision tree is trained independently on samples drawn from the original dataset with replacement. XGBoost [22] is an ensemble learning method similar to gradient boosting trees. It aims to improve the prediction accuracy of a decision tree by building multiple decision trees with gradient boosting. Bagged CART [23] uses ensemble learning to improve classification performance by training multiple weak learners (e.g., decision trees) and combining their predictions. C5.0 [24] relies on ensemble learning to enhance prediction performance by combining multiple weak learners (e.g., decision trees, naive Bayes, and support vector machines). C5.0 uses an adaptive learning rate tuning strategy to speed up convergence and improve model performance. GBM [25], another ensemble learning method, strengthens prediction performance by combining multiple weak learners. It is based on decision trees: a new weak learner is constructed at each iteration, and the predictions of all weak learners are eventually weighted and summed to obtain the final prediction. As a feed-forward neural network, MLP [26] contains one input layer, one or more hidden layers, and one output layer, each containing neurons interconnected by weights. LDA [27] is a topic model for analyzing the underlying topic structure in a document collection. It represents a document as a distribution of topics. GLM [28] is a statistical model used to establish a linear relationship between dependent and independent variables. It can handle linear and nonlinear relationships, as well as multivariate covariance problems. KNN [29] is an instance-based learning method that calculates the distance between the sample to be classified and all the samples in the training set, selects the K nearest neighbors, and then takes a vote among the categories of these neighbors to obtain the category of the sample to be classified. NNET [30] is a computational model that simulates the neuronal structure of the human brain and consists of multiple levels of nodes. Each node receives input data and computes its output through an activation function. The network is trained by a back-propagation algorithm to minimize the prediction error.
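Of the algorithms described above, KNN is the simplest to write out in full. The sketch below follows the description directly: compute distances to all training samples, take the K nearest, and vote. The function name, the Euclidean distance, and the toy data are illustrative assumptions.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Two well-separated toy clusters (hypothetical feature values).
points = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
labels = [0, 0, 0, 1, 1, 1]
print(knn_predict(points, labels, [0.5, 0.5]))
print(knn_predict(points, labels, [5.5, 5.5]))
```

The first query sits inside the class 0 cluster and the second inside the class 1 cluster, so the votes are unanimous in both cases.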

2.7. Validation of models’ performances

The performances of models were validated using data from the training and test cohorts. The receiver operating characteristic curve (ROC) and the area under the curve (AUC) were calculated to compare the discriminative abilities of models. Decision curve analysis (DCA) was performed, and the maximum net benefit at a specific threshold probability was determined as the benefit with the greatest clinical value. Sensitivity, specificity, precision, recall, and F1 metrics were also used for model evaluation.
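For readers unfamiliar with these metrics, the sketch below computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, the net benefit used in DCA at a given threshold probability, and the threshold-based metrics. It is an illustration only; the function names, the 0.5 threshold, and the toy labels and scores are assumptions, and the metrics function assumes both classes and at least one positive prediction are present.

```python
def auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at threshold probability pt."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def classification_metrics(labels, scores, threshold=0.5):
    """Sensitivity (recall), specificity, precision and F1 at a threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, pred) if y == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # identical to recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "f1": f1}


y_true = [1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1]
print(auc(y_true, y_score))
print(net_benefit(y_true, y_score, threshold=0.5))
print(classification_metrics(y_true, y_score))
```

In this toy example 8 of the 9 positive-negative pairs are ranked correctly, so the AUC is 8/9. At a threshold of 0.5 there are two true positives and one false positive among six patients, giving a net benefit of 1/6.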

2.8. Statistical analysis

Continuous variables were described as mean ± standard deviation (SD) or median and interquartile range (IQR). For continuous variables, Student's t-test or the Mann-Whitney U test was used to assess differences between groups. Categorical variables were described as numbers or percentages, and differences between groups were assessed using the chi-square test or Fisher's exact test. Missing values were replaced by multiple imputation. The SMOTE algorithm was used to process imbalanced datasets. The Boruta algorithm was used for feature selection. Several AI algorithms were used for model training, including DT, SVM, RF, XGBoost, GBM, Bagged CART, C5.0, MLP, LDA, GLM, KNN and NNET. Six-fold cross-validation was used during model training. The AUC and DCA curve were used to validate each model. Other model evaluation indicators were also calculated (recall, precision, sensitivity, specificity and F1 score). Statistical analyses were performed using R software (version 4.3.1). Graphs were plotted using GraphPad Prism (version 8.3.0) or Origin (version 9.1.0). P < 0.05 was considered statistically significant.

3. Results

3.1. Participants’ characteristics

The baseline characteristics are listed in Table 1 and Supplementary Table 1. A total of 3882 patients were enrolled and randomly split into training (3105, 80 %) and test (777, 20 %) sets. Among them, 1098 (28.3 %) patients experienced AKI during hospitalization. The incidence of AKI was 27.8 % (862/3105) in the training cohort and 30.4 % (236/777) in the test cohort (Table 1). The incidence of AKI did not differ significantly between the two cohorts (χ2 = 1.991, P = 0.158).

Table 1.

Baseline characteristics of patients.

Characteristic | Training cohort (n = 3105) | Test cohort (n = 777) | Changzhou cohort (n = 2202) | Xuzhou cohort (n = 807)
Age (years) | 66.50 (12.99) | 66.00 (12.98) | 64.87 (13.98) | 62.57 (13.89)
Sex (male) | 2021 (65.1) | 520 (66.9) | 1634 (74.2) | 630 (78.1)
HR (bpm) | 85.83 (18.58) | 85.20 (18.17) | 79.50 (15.96) | 81.16 (15.03)
SBP (mmHg) | 119.23 (21.46) | 119.81 (23.18) | 131.26 (24.15) | 131.90 (23.57)
DBP (mmHg) | 63.84 (14.57) | 63.62 (14.28) | 78.75 (15.82) | 79.93 (14.79)
Creatinine (mg/dL) | 1.53 (1.46) | 1.64 (1.71) | 0.96 (0.67) | 0.92 (0.50)
eGFR [mL/(min·1.73 m²)] | 69.02 (38.05) | 67.65 (34.99) | 93.16 (33.40) | 97.53 (34.46)
WBC (k/uL) | 12.01 (5.42) | 11.97 (5.47) | 9.63 (3.95) | 11.17 (34.46)
Hemoglobin (g/dL) | 11.99 (2.38) | 12.02 (2.30) | 13.72 (2.04) | 14.39 (4.01)
Glucose (mg/dL) | 159.81 (67.24) | 159.91 (69.61) | 135.50 (59.81) | 131.32 (54.94)
BUN (mg/dL) | 27.31 (19.25) | 27.08 (19.65) | 53.57 (29.33) | 53.48 (28.81)
CK-MB (ng/mL) | 74.60 (112.45) | 69.23 (113.05) | 393.05 (229.55) | 399.10 (234.63)
TNT (ng/mL) | 2.90 (5.54) | 2.73 (5.74) | 23.95 (13.58) | 23.98 (14.10)
Hypertension | 820 (26.4) | 226 (29.1) | 1432 (65.0) | 525 (65.1)
Diabetes | 403 (13.0) | 119 (15.3) | 665 (30.2) | 313 (38.8)
AKI | 862 (27.8) | 236 (30.4) | 215 (9.8) | 77 (9.5)

Continuous variables are expressed as mean (standard deviation). Categorical data are presented as number (percentage).

HR, heart rate; SBP, systolic arterial pressure; DBP, diastolic arterial pressure; RBC, red blood cell; WBC, white blood cell; BUN, blood urea nitrogen; APTT, activated partial thromboplastin time; CK-MB, creatine kinase isoenzyme MB; TNT, troponin T; eGFR, estimated glomerular filtration rate.

3.2. Importance of variable

A total of 38 variables were extracted from the database. The top 20 significant predictors identified by screening are shown in Fig. 2. The top 5 predictors were estimated glomerular filtration rate (eGFR), creatinine, blood urea nitrogen, cardiogenic shock and creatine kinase-myocardial band. The top 20 significant predictors were then used for subsequent model development and validation.

Fig. 2.


Summary of the importance of the selected features according to the Boruta algorithm. This figure shows the importance of the top 20 ranked variables. The columns represent the median importance of each feature and the error bars represent the maximum and minimum importance (scaled to a maximum value of 100).

3.3. Performances of AI models in the training group

The AI models were then trained with an increasing number of variables according to the ranked variable importance in Fig. 2. The AUCs of each AI model are listed in Supplementary Tables 2–6. The calibration curve based on the RF algorithm in the training group showed that the RF model was well fitted (Supplementary Fig. 1) [31]. Overall, better predictive performances were achieved by the RF, Bagged CART, XGBoost, KNN, GBM and C5.0 models.

In Fig. 3A and B, we included the top 15 variables and evaluated the AUC of each AI model in the training group. Since the number of included variables has a great influence on predictive performance, we then evaluated the AUC of each model when the top 10, 12, 15, 18 and 20 variables were included, in both the training group and the validation group (Fig. 4). Overall, we found that the RF, Bagged CART, C5.0, XGBoost, and GBM models with the top 15 variables showed higher AUC values than the other models. Moreover, the DCA curve showed that the RF model exhibited the highest accuracy in predicting clinical outcomes, followed by the Bagged CART, XGBoost, C5.0, and GBM models (Fig. 5A and B). Taken together, these results indicate that five models (RF, Bagged CART, C5.0, XGBoost and GBM) with the top 15 variables had good predictive performance for AKI risk classification.

Fig. 3.


Receiver operating characteristic (ROC) curves of the models with 15 variables included in the training cohort and test cohort. (A) and (B) Model performance in the training cohort; (C) and (D) model performance in the test cohort.

Fig. 4.


Dynamics of area under the ROC curve for all AI models. The X-axis represents the number of included variables; the Y-axis represents the different AI models; and the Z-axis represents the area under the receiver operating characteristic curve (A, C). The X-axis represents the number of included variables; the Y-axis represents the different AI models; and the colored ones represent the size of the area under the receiver operating characteristic curve (red is the largest and blue is the smallest) (B, D). (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Fig. 5.


Decision curve analysis with 15 variables in the training and test cohorts. (A) and (B) Model performance in the training cohort; (C) and (D) model performance in the test cohort.

3.4. Performances of AI models in the test group

Supplementary Tables 7–11 show the AUC values of each model in the test group. With the top 15 features by importance, the AI models achieved the following AUCs in the test cohort: GLM, 0.712 (95 % CI: 0.674–0.750); SVM, 0.715 (95 % CI: 0.677–0.754); DT, 0.660 (95 % CI: 0.620–0.699); RF, 0.754 (95 % CI: 0.718–0.789); LDA, 0.706 (95 % CI: 0.667–0.744); KNN, 0.593 (95 % CI: 0.551–0.634); GBM, 0.724 (95 % CI: 0.686–0.762); XGBoost, 0.725 (95 % CI: 0.688–0.762); C5.0, 0.734 (95 % CI: 0.698–0.770); Bagged CART, 0.730 (95 % CI: 0.693–0.767); NNET, 0.660 (95 % CI: 0.618–0.701); MLP, 0.698 (95 % CI: 0.657–0.738). Among them, the RF, Bagged CART, and C5.0 models had the best performances. ROC curves of the AI models based on 15 variables in the test group are shown in Fig. 3C and D. The RF, Bagged CART, and C5.0 models with the top 15 variables also exhibited relatively higher AUC values in the dynamic plots for the test group. The DCA curve in the test cohort showed that the RF model achieved the highest accuracy in predicting AKI risk, followed by the C5.0, XGBoost, Bagged CART and GBM models (Fig. 5C and D). For the three best-performing AI models, we further used other indicators to evaluate model performance. The sensitivity, specificity, precision, recall and F1 scores of the three AI models in the test group are shown in Supplementary Table 12. Collectively, we found that RF, C5.0, and Bagged CART had strong predictive abilities in the early assessment of AKI risk.

3.5. Performances of AI models in the validation group

Two external validation groups were then used to further evaluate the performance of the AI models. A total of 2202 AMI patients from the Changzhou cohort were enrolled, of whom 215 (9.8 %) experienced AKI during hospitalization. In addition, 807 AMI patients from the Xuzhou cohort were enrolled, with an AKI incidence of 9.5 % (Table 1). In the model training process, we found that RF, C5.0, and Bagged CART had strong predictive abilities for AKI risk. We then validated these three models in the two cohorts using the top 5 most important variables listed in Fig. 2. The ROC curves of the three models in the two validation groups are shown in Fig. 6A and B. The AUC values in external validation group I (Changzhou cohort) were RF, 0.761 (95 % CI: 0.729–0.792); C5.0, 0.733 (95 % CI: 0.697–0.769); and Bagged CART, 0.725 (95 % CI: 0.690–0.760) (Supplementary Table 13). The AUC values in external validation group II (Xuzhou cohort) were RF, 0.799 (95 % CI: 0.741–0.856); C5.0, 0.808 (95 % CI: 0.753–0.862); and Bagged CART, 0.784 (95 % CI: 0.731–0.838) (Supplementary Table 14). Taken together, the three AI models (RF, C5.0, and Bagged CART) performed robustly in identifying AMI patients at high risk of AKI.

Fig. 6.


Receiver operating characteristic curve (ROC) of the model with 5 variables included in the validation cohort Ⅰ (Changzhou cohort) and validation cohort Ⅱ (Xuzhou cohort). (A) ROC curves of three models in the validation cohort Ⅰ; (B) ROC curves of the three models in the validation cohort Ⅱ.

To make the models easier to use in clinical practice, we finally developed predictive software based on the RF, C5.0, and Bagged CART models with the top 15 variables. Fig. 7 shows an example of patient prediction using the optimal models. The clinician enters the variables into the software, which immediately outputs the probabilities of AKI risk based on the three models.

Fig. 7.


The predictive software based on artificial intelligence algorithms. Predictive software based on the RF, C5.0, and Bagged CART models with the top 15 variables was developed. The figure shows an example of patient prediction using the optimal models. The clinician enters the variables into the software, which immediately outputs the probabilities of AKI risk based on the three models.

4. Discussion

The major findings in the current study are as follows: (1) the incidence of AKI in the AMI population was 28.3 % in the MIMIC database, 9.8 % in the Changzhou cohort and 9.5 % in the Xuzhou cohort; (2) a total of 12 AI algorithms were constructed to assess the risk of AKI in the AMI population, and we found that RF, C5.0, and Bagged CART had strong predictive abilities in the early assessment of AKI risk; (3) the top 5 main features contributing to the predictive model were eGFR, creatinine, blood urea nitrogen, cardiogenic shock and creatine kinase-myocardial band.

The incidence of AKI in the AMI population was 28.3 % in the MIMIC database, 9.8 % in the Changzhou cohort and 9.5 % in the Xuzhou cohort. The incidence of AKI after AMI reported in previous studies ranged from 7.1 to 29.3 % [[32], [33], [34]]. These differences in AKI incidence between studies may be related to the following reasons. Firstly, the countries of origin of the patients enrolled in the different cohorts varied. Patients in the MIMIC database were treated at the Beth Israel Deaconess Medical Center in the United States, whereas the patients in the other two cohorts in the current study were from two cardiac centers in China. Secondly, the baseline age of the selected patients was different. The AMI patients in the MIMIC database were older, so there would be differences in baseline renal function, which may lead to differences in the incidence of AKI. Thirdly, the severity of coronary artery disease, the timing of interventional reperfusion therapy, the type and dose of contrast agents, and whether preoperative hydration was given may also affect the incidence of AKI.

Previous studies have reported that the risk of AKI is associated with age, sex, heart rate, hypertension, chronic kidney disease, glomerular filtration rate, extensive anterior myocardial infarction, diuretics, contrast agents, Killip cardiac functional class/heart failure/ejection fraction, cardiogenic shock/shock index, left ventricular end-diastolic size, creatine kinase and other factors [[35], [36], [37], [38]]. These factors are broadly the same as the features we selected for the AI models with the Boruta algorithm. These characteristics can be readily obtained at hospital admission and can be easily used by clinicians to perform early risk assessment. In our previous study [13], we used several machine learning algorithms to conduct an initial assessment of AKI risk in patients with AMI. In the current study, 10 machine learning algorithms and 2 deep learning algorithms were used for predicting AKI risk. In addition, we set up the training set and the test set in a different way from that used previously. By incorporating more AI algorithms, we improved the predictive accuracy of our models. In future studies, we hope that the predictive model can be validated in an external cohort, and that different AI models can be combined to achieve better performance.

The mechanisms of AKI in AMI patients during hospitalization mainly include the following. Firstly, as age increases, the kidney's tolerance to stressors and injuries may decrease [39]. Secondly, activation of the sympathetic nervous system and renin-angiotensin system, due to reduced cardiac output and venous return, results in renal vasoconstriction and renal injury, and leads to decreased glomerular filtration rates [40]. Thirdly, the kidney injury might be further aggravated by inflammation and ischemia-reperfusion injury following myocardial injury [41]. Inflammation and oxidative stress could further damage renal tubule cells, thereby further impairing renal perfusion and aggravating renal inflammation [42]. Finally, platelet activation could promote glomerular microthrombus formation and ultimately lead to deterioration of renal function [43]. In addition, metabolic factors, such as hyperglycemia and acidosis, can also initiate adverse reactions [44]. Here, we found that AMI patients with AKI in the MIMIC database had lower blood pressure and platelet counts, and higher heart rate, age, white blood cell count, blood glucose, APTT, PT, and INR. The pathophysiological differences behind these clinical differences are generally consistent with previous results. The short-term and long-term prognosis of AKI in the AMI population is poor. Overall outcomes can be improved if AKI is identified early after admission [45]. The AI models in the present study may help physicians identify high-risk cases and optimize clinical treatments.

In the current study, the RF, Bagged CART, and C5.0 models performed best in both the training and test sets. The RF model excels in handling high-dimensional data and complex interactions between variables. It reduces the risk of overfitting seen in a single tree model by integrating multiple decision trees, and is suitable for situations where there are many variables and the importance of the variables needs to be assessed [46]. Bagged CART modeling improves the stability of decision trees through the bootstrap resampling technique. It is superior in handling noisy data and reducing model variance, and is suitable for contexts where the data sample size is small or the data are noisy [47]. The C5.0 model is a decision tree algorithm based on information gain and is suitable for datasets with well-defined classification boundaries. It strikes a good balance between computational speed and model complexity, and is suitable for contexts where the data size is large and the model needs to be constructed quickly [48]. These three models share the following characteristics: first, they can handle high-dimensional data and do not require dimensionality reduction or feature selection; second, they can randomly select feature subsets and sample subsets, dynamically adjust sample weights, or perform adaptive enhancement to reduce the risk of overfitting; third, they all have good interpretability and robustness. At the same time, they have the following disadvantages: first, they require a large amount of resources and time to train the model; second, they are very sensitive to noise and outliers, and require preprocessing; third, in classification problems, they can only provide a probability value rather than an exact classification.
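The bootstrap aggregation step behind Bagged CART can be illustrated with a deliberately small example. The sketch below is not the study's model: it replaces full CART trees with one-feature decision stumps (a single threshold), and all function names and data are hypothetical. What it does show is the two ingredients named above, resampling with replacement and combining the fitted learners by majority vote.

```python
import random


def fit_stump(xs, ys):
    """Pick the threshold t minimising errors for the rule 'predict 1 if x > t'."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def bagged_stumps(xs, ys, n_models=25, seed=0):
    """Fit one stump per bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds


def bagged_predict(thresholds, x):
    """Majority vote over all stumps."""
    votes = sum(x > t for t in thresholds)
    return int(votes * 2 > len(thresholds))


feature = [1, 2, 3, 4, 6, 7, 8, 9]
outcome_labels = [0, 0, 0, 0, 1, 1, 1, 1]
ensemble = bagged_stumps(feature, outcome_labels)
print(bagged_predict(ensemble, 0), bagged_predict(ensemble, 10))
```

Each stump sees a slightly different resample and may choose a different threshold; averaging their votes is what reduces the variance of the final prediction.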

It should be noted that both LDA and NNET are neural network models, while MLP is a feedforward neural deep learning network. Regrettably, these three models performed poorly in this study. Possible reasons are as follows. First, the features of the dataset are not rich enough. For example, some studies have pointed out that the levels of cytokines in the venous blood of MI patients increase significantly, especially biomarkers such as IL-6, TNF-α and MCP-1 [[49], [50], [51]]. These features should be incorporated into AI algorithms to generate a more accurate model. Second, the poor performance may be related to the small sample size of the dataset. When the sample size is small, an AI algorithm is prone to overfitting; that is, the model learns the prior information about the class proportions in the training set, so that its predictions are biased toward the majority class.

There are also several limitations in the current study. The sample size is relatively small, which prevents some models such as MLP from achieving optimal performance. The sample size should be expanded in the future and the models validated in a prospective cohort. Although the performance of the present models has surpassed that of traditional models, a single model is not enough to support clinical prediction. If the better-performing models can be fused through voting or stacking methods, a more reliable prediction may be expected. MIMIC data represent only one medical center, and the prediction model needs to be developed with data from multiple centers, externally validated and continuously optimized. Imaging data such as coronary angiography, coronary computed tomography or surface electrocardiograms, or more predictive circulating biomarkers, should be incorporated into the model to improve its performance.
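The voting approach mentioned above could, for example, take the form of soft voting, in which the predicted probabilities of several models are averaged. The sketch below is a hypothetical illustration of that idea and was not part of this study; the function name, the optional weights, and the probability values are all assumed.

```python
def soft_vote(probabilities_by_model, weights=None):
    """Weighted average of the AKI probabilities predicted by several models."""
    names = list(probabilities_by_model)
    if weights is None:
        weights = {name: 1.0 for name in names}  # equal weights by default
    total = sum(weights[name] for name in names)
    return sum(weights[name] * probabilities_by_model[name] for name in names) / total


# Hypothetical predicted probabilities for one patient.
patient_probs = {"RF": 0.30, "C5.0": 0.20, "Bagged CART": 0.40}
fused = soft_vote(patient_probs)
print(round(fused, 3))
```

Stacking differs in that a second-level model learns how to combine the base predictions instead of using fixed weights.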

5. Conclusion

In summary, several machine learning-based prediction models for AKI after AMI were developed and validated. The RF, C5.0 and Bagged CART models performed robustly in identifying high-risk patients. These models may be used to assist in clinical decision-making for AKI in AMI patients.

Ethics approval and consent to participate

Use of the MIMIC database was approved by the Beth Israel Deaconess Medical Center and the MIT Institutional Review Board. All patient information in the database is anonymized, so informed consent was not required. We completed the required online courses and exams and gained access to the database (record ID: 44703031). All enrolled patients from Changzhou Second People's Hospital and Xuzhou Central Hospital provided signed informed consent. The study was approved by the Ethics Committees of Changzhou Second People's Hospital (Ethics number: [2018]KY005-01) and Xuzhou Central Hospital (Ethics number: XZXY-LK-20211021-037). The project was performed in accordance with the Declaration of Helsinki.

Consent for publication

Not applicable.

Availability of data and materials

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://physionet.org/content/mimiciii/1.4/ and https://physionet.org/content/mimiciv/2.2/.

Funding

This study was supported by grants from the National Natural Science Foundation of China (Grant No. 82270328), Changzhou High-Level Medical Talents Training Project (2022CZBJ054), Natural Science Foundation of Jiangsu Province (BK20221229), Technology Development Fund of Nanjing Medical University (NMUB2020069), Major Research Plan of Changzhou Health Commission of Jiangsu Province of China (ZD202215), China Postdoctoral Science Funding Program (2022M720544), Key Projects of Changzhou Medical Center affiliated to Nanjing Medical University (CMCM202306) and High Level Hospital Construction Project of Jiangsu Province (GSPJS202404).

CRediT authorship contribution statement

Jun Wei: Writing – original draft, Visualization, Validation, Formal analysis, Data curation, Conceptualization. Dabei Cai: Writing – original draft, Formal analysis, Data curation, Conceptualization. Tingting Xiao: Writing – original draft, Formal analysis, Data curation, Conceptualization. Qianwen Chen: Software, Formal analysis, Data curation, Conceptualization. Wenwu Zhu: Formal analysis, Data curation, Conceptualization. Qingqing Gu: Formal analysis, Data curation, Conceptualization. Yu Wang: Formal analysis, Data curation, Conceptualization. Qingjie Wang: Writing – review & editing, Visualization, Validation, Resources, Funding acquisition, Formal analysis, Data curation, Conceptualization. Xin Chen: Writing – review & editing, Supervision, Software, Formal analysis, Data curation. Shenglin Ge: Writing – review & editing, Visualization, Formal analysis, Data curation, Conceptualization. Ling Sun: Writing – review & editing, Visualization, Validation, Supervision, Software, Funding acquisition, Formal analysis, Data curation, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

We thank all authors for their contributions to this study.

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.heliyon.2024.e36051.

Contributor Information

Xin Chen, Email: 13506126702@126.com.

Shenglin Ge, Email: aydgsl@sina.com.

Ling Sun, Email: sunling85125@hotmail.com.

Appendix A. Supplementary data

The following is the Supplementary data to this article:

Multimedia component 1
mmc1.docx (97.7KB, docx)

References

  • 1. Chan M.Y., Shah B.R., Gao F., et al. Recalibration of the global registry of acute coronary events risk score in a multiethnic Asian population. Am. Heart J. 2011;162:291–299. doi: 10.1016/j.ahj.2011.05.016.
  • 2. Tsao C.W., Aday A.W., Almarzooq Z.I., et al. Heart disease and stroke statistics-2023 update: a report from the American Heart Association. Circulation. 2023;147:e93–e621. doi: 10.1161/cir.0000000000001123.
  • 3. Henry T.D., Tomey M.I., Tamis-Holland J.E., et al. Invasive management of acute myocardial infarction complicated by cardiogenic shock: a scientific statement from the American Heart Association. Circulation. 2021;143:e815–e829. doi: 10.1161/cir.0000000000000959.
  • 4. Garcia R., Marijon E., Karam N., et al. Ventricular fibrillation in acute myocardial infarction: 20-year trends in the FAST-MI study. Eur. Heart J. 2022;43:4887–4896. doi: 10.1093/eurheartj/ehac579.
  • 5. Fox K.A., Fitzgerald G., Puymirat E., et al. Should patients with acute coronary disease be stratified for management according to their risk? Derivation, external validation and outcomes using the updated GRACE risk score. BMJ Open. 2014;4. doi: 10.1136/bmjopen-2013-004425.
  • 6. Marenzi G., Cosentino N., Bartorelli A.L. Acute kidney injury in patients with acute coronary syndromes. Heart. 2015;101:1778–1785. doi: 10.1136/heartjnl-2015-307773.
  • 7. Shacham Y., Steinvil A., Arbel Y. Acute kidney injury among ST elevation myocardial infarction patients treated by primary percutaneous coronary intervention: a multifactorial entity. J. Nephrol. 2016;29:169–174. doi: 10.1007/s40620-015-0255-4.
  • 8. Sun L.Y., Wijeysundera D.N., Tait G.A., Beattie W.S. Association of intraoperative hypotension with acute kidney injury after elective noncardiac surgery. Anesthesiology. 2015;123:515–523. doi: 10.1097/aln.0000000000000765.
  • 9. Ronco C., Bellomo R., Kellum J.A. Acute kidney injury. Lancet. 2019;394:1949–1964. doi: 10.1016/s0140-6736(19)32563-2.
  • 10. Chawla L.S., Kimmel P.L. Acute kidney injury and chronic kidney disease: an integrated clinical syndrome. Kidney Int. 2012;82:516–524. doi: 10.1038/ki.2012.208.
  • 11. Murugan R., Karajala-Subramanyam V., Lee M., et al. Acute kidney injury in non-severe pneumonia is associated with an increased immune response and lower survival. Kidney Int. 2010;77:527–535. doi: 10.1038/ki.2009.502.
  • 12. Chalikias G., Serif L., Kikas P., et al. Long-term impact of acute kidney injury on prognosis in patients with acute myocardial infarction. Int. J. Cardiol. 2019;283:48–54. doi: 10.1016/j.ijcard.2019.01.070.
  • 13. Cai D., Xiao T., Zou A., et al. Predicting acute kidney injury risk in acute myocardial infarction patients: an artificial intelligence model using medical information mart for intensive care databases. Front Cardiovasc Med. 2022;9. doi: 10.3389/fcvm.2022.964894.
  • 14. Laumer F., Di Vece D., Cammann V.L., et al. Assessment of artificial intelligence in echocardiography diagnostics in differentiating takotsubo syndrome from myocardial infarction. JAMA Cardiol. 2022;7:494–503. doi: 10.1001/jamacardio.2022.0183.
  • 15. Kellum J.A., Lameire N. Diagnosis, evaluation, and management of acute kidney injury: a KDIGO summary (Part 1). Crit. Care. 2013;17:204. doi: 10.1186/cc11454.
  • 16. Delanaye P., Mariat C. The applicability of eGFR equations to different populations. Nat. Rev. Nephrol. 2013;9:513–522. doi: 10.1038/nrneph.2013.143.
  • 17. Lertampaiporn S., Thammarongtham C., Nukoolkit C., Kaewkamnerdpong B., Ruengjitchatchawalya M. Heterogeneous ensemble approach with discriminative features and modified-SMOTEbagging for pre-miRNA classification. Nucleic Acids Res. 2013;41. doi: 10.1093/nar/gks878.
  • 18. Speiser J.L., Miller M.E., Tooze J., Ip E. A comparison of random forest variable selection methods for classification prediction modeling. Expert Syst. Appl. 2019;134:93–101. doi: 10.1016/j.eswa.2019.05.028.
  • 19. Chern C.C., Chen Y.J., Hsiao B. Decision tree-based classifier in providing telehealth service. BMC Med. Inf. Decis. Making. 2019;19:104. doi: 10.1186/s12911-019-0825-9.
  • 20. Bhosale H., Ramakrishnan V., Jayaraman V.K. Support vector machine-based prediction of pore-forming toxins (PFT) using distributed representation of reduced alphabets. J. Bioinf. Comput. Biol. 2021;19. doi: 10.1142/s0219720021500281.
  • 21. Kamińska J.A. A random forest partition model for predicting NO(2) concentrations from traffic flow and meteorological conditions. Sci. Total Environ. 2019;651:475–483. doi: 10.1016/j.scitotenv.2018.09.196.
  • 22. Zhang Z., Ho K.M., Hong Y. Machine learning for the prediction of volume responsiveness in patients with oliguric acute kidney injury in critical care. Crit. Care. 2019;23:112. doi: 10.1186/s13054-019-2411-z.
  • 23. Islam M.M., Rahman M.J., Chandra Roy D., Maniruzzaman M. Automated detection and classification of diabetes disease based on Bangladesh demographic and health survey data, 2011 using machine learning approach. Diabetes Metabol. Syndr. 2020;14:217–219. doi: 10.1016/j.dsx.2020.03.004.
  • 24. Turgeman L., May J.H. A mixed-ensemble model for hospital readmission. Artif. Intell. Med. 2016;72:72–82. doi: 10.1016/j.artmed.2016.08.005.
  • 25. Lynch C.M., Abdollahi B., Fuqua J.D., et al. Prediction of lung cancer patient survival via supervised machine learning classification techniques. Int. J. Med. Inf. 2017;108:1–8. doi: 10.1016/j.ijmedinf.2017.09.013.
  • 26. Lorencin I., Anđelić N., Španjol J., Car Z. Using multi-layer perceptron with Laplacian edge detector for bladder cancer diagnosis. Artif. Intell. Med. 2020;102. doi: 10.1016/j.artmed.2019.101746.
  • 27. Pérez J., Pérez A., Casillas A., Gojenola K. Cardiology record multi-label classification using latent Dirichlet allocation. Comput. Methods Progr. Biomed. 2018;164:111–119. doi: 10.1016/j.cmpb.2018.07.002.
  • 28. Singh D., Hurst J.R., Martinez F.J., et al. Predictive modeling of COPD exacerbation rates using baseline risk factors. Ther. Adv. Respir. Dis. 2022;16. doi: 10.1177/17534666221107314.
  • 29. Ye H., Wu P., Zhu T., et al. Diagnosing coronavirus disease 2019 (COVID-19): efficient harris hawks-inspired fuzzy K-nearest neighbor prediction methods. IEEE Access. 2021;9:17787–17802. doi: 10.1109/access.2021.3052835.
  • 30. Kirby S.D., Eng P., Danter W., et al. Neural network prediction of obstructive sleep apnea from clinical criteria. Chest. 1999;116:409–415. doi: 10.1378/chest.116.2.409.
  • 31. Wussler D., Kozhuharov N., Sabti Z., et al. External validation of the meessi acute heart failure risk score: a cohort study. Ann. Intern. Med. 2019;170:248–256. doi: 10.7326/m18-1967.
  • 32. Shacham Y., Leshem-Rubinow E., Steinvil A., et al. Renal impairment according to acute kidney injury network criteria among ST elevation myocardial infarction patients undergoing primary percutaneous intervention: a retrospective observational study. Clin. Res. Cardiol. 2014;103:525–532. doi: 10.1007/s00392-014-0680-8.
  • 33. Tsai T.T., Patel U.D., Chang T.I., et al. Contemporary incidence, predictors, and outcomes of acute kidney injury in patients undergoing percutaneous coronary interventions: insights from the NCDR Cath-PCI registry. JACC Cardiovasc. Interv. 2014;7:1–9. doi: 10.1016/j.jcin.2013.06.016.
  • 34. Hwang S.H., Jeong M.H., Ahmed K., et al. Different clinical outcomes of acute kidney injury according to acute kidney injury network criteria in patients between ST elevation and non-ST elevation myocardial infarction. Int. J. Cardiol. 2011;150:99–101. doi: 10.1016/j.ijcard.2011.03.039.
  • 35. Khan I., Dar M.H., Khan A., Iltaf K., Khan S., Falah S.F. Frequency of acute kidney injury and its short-term effects after acute myocardial infarction. J. Pakistan Med. Assoc. 2017;67:1693–1697.
  • 36. Schmucker J., Fach A., Becker M., et al. Predictors of acute kidney injury in patients admitted with ST-elevation myocardial infarction - results from the Bremen STEMI-Registry. Eur Heart J Acute Cardiovasc Care. 2018;7:710–722. doi: 10.1177/2048872617708975.
  • 37. Wang C., Pei Y.Y., Ma Y.H., et al. Risk factors for acute kidney injury in patients with acute myocardial infarction. Chin Med J (Engl). 2019;132:1660–1665. doi: 10.1097/cm9.0000000000000293.
  • 38. Queiroz R.E., de Oliveira L.S., de Albuquerque C.A., et al. Acute kidney injury risk in patients with ST-segment elevation myocardial infarction at presentation to the ED. Am. J. Emerg. Med. 2012;30:1921–1927. doi: 10.1016/j.ajem.2012.04.011.
  • 39. Palomba H., de Castro I., Neto A.L., Lage S., Yu L. Acute kidney injury prediction following elective cardiac surgery: AKICS Score. Kidney Int. 2007;72:624–631. doi: 10.1038/sj.ki.5002419.
  • 40. Grisk O. The sympathetic nervous system in acute kidney injury. Acta Physiol. 2020;228. doi: 10.1111/apha.13404.
  • 41. He W., Chen P., Chen Q., Cai Z., Zhang P. Cytokine storm: behind the scenes of the collateral circulation after acute myocardial infarction. Inflamm. Res. 2022;71:1143–1158. doi: 10.1007/s00011-022-01611-0.
  • 42. Zhang J., Ding W., Liu J., Wan J., Wang M. Scavenger receptors in myocardial infarction and ischemia/reperfusion injury: the potential for disease evaluation and therapy. J. Am. Heart Assoc. 2023;12. doi: 10.1161/jaha.122.027862.
  • 43. Raso Vasquez A.O., Kertai M.D., Fontes M.L. Postoperative thrombocytopenia: why you should consider antiplatelet therapy? Curr. Opin. Anaesthesiol. 2018;31:61–66. doi: 10.1097/aco.0000000000000551.
  • 44. Gohbara M., Hayakawa A., Akazawa Y., et al. Association between acidosis soon after reperfusion and contrast-induced nephropathy in patients with a first-time ST-segment elevation myocardial infarction. J. Am. Heart Assoc. 2017;6. doi: 10.1161/jaha.117.006380.
  • 45. Zhao Y., Zhou H., Tan W., et al. Prolonged dexmedetomidine infusion in critically ill adult patients: a retrospective analysis of a large clinical database Multiparameter Intelligent Monitoring in Intensive Care III. Ann. Transl. Med. 2018;6:304. doi: 10.21037/atm.2018.07.08.
  • 46. Tao L., Tang W., Xia Z., et al. Machine learning predicts the serum PFOA and PFOS levels in pregnant women: enhancement of fatty acid status on model performance. Environ. Int. 2024;190. doi: 10.1016/j.envint.2024.108837.
  • 47. Nilashi M., Abumalloh R.A., Ahmadi H., et al. Electroencephalography (EEG) eye state classification using learning vector quantization and bagged trees. Heliyon. 2023;9. doi: 10.1016/j.heliyon.2023.e15258.
  • 48. Hu Z., Hu Y., Zhang S., et al. Machine-learning-based models assist the prediction of pulmonary embolism in autoimmune diseases: a retrospective, multicenter study. Chin Med J (Engl). 2024. doi: 10.1097/cm9.0000000000003025.
  • 49. Rakhit R.D., Seiler C., Wustmann K., et al. Tumour necrosis factor-alpha and interleukin-6 release during primary percutaneous coronary intervention for acute myocardial infarction is related to coronary collateral flow. Coron. Artery Dis. 2005;16:147–152. doi: 10.1097/00019501-200505000-00003.
  • 50. Park H.J., Chang K., Park C.S., et al. Coronary collaterals: the role of MCP-1 during the early phase of acute myocardial infarction. Int. J. Cardiol. 2008;130:409–413. doi: 10.1016/j.ijcard.2007.08.128.
  • 51. Meier P., Gloekler S., de Marchi S.F., et al. Myocardial salvage through coronary collateral growth by granulocyte colony-stimulating factor in chronic coronary artery disease: a controlled randomized trial. Circulation. 2009;120:1355–1363. doi: 10.1161/circulationaha.109.866269.


Articles from Heliyon are provided here courtesy of Elsevier
