Scientific Reports. 2022 May 27;12:8956. doi: 10.1038/s41598-022-13152-x

Outcome prediction for acute kidney injury among hospitalized children via eXtreme Gradient Boosting algorithm

Ying-Hao Deng 1, Xiao-Qin Luo 1, Ping Yan 1, Ning-Ya Zhang 2, Yu Liu 1, Shao-Bin Duan 1
PMCID: PMC9142505  PMID: 35624143

Abstract

Acute kidney injury (AKI) is common among hospitalized children and is associated with a poor prognosis. The study sought to develop machine learning-based models for predicting adverse outcomes among hospitalized AKI children. We performed a retrospective study of hospitalized AKI patients aged 1 month to 18 years in the Second Xiangya Hospital of Central South University in China from 2015 to 2020. The primary outcomes included major adverse kidney events within 30 days (MAKE30) (death, new renal replacement therapy, and persistent renal dysfunction) and 90-day adverse outcomes (chronic dialysis and death). The state-of-the-art machine learning algorithm, eXtreme Gradient Boosting (XGBoost), and the traditional logistic regression were used to establish prediction models for MAKE30 and 90-day adverse outcomes. Model performance was evaluated on a held-out test set. A total of 1394 pediatric AKI patients were included in the study. The incidence of MAKE30 and 90-day adverse outcomes was 24.1% and 8.1%, respectively. In the test set, the area under the receiver operating characteristic curve (AUC) of the XGBoost model was 0.810 (95% CI 0.763–0.857) for MAKE30 and 0.851 (95% CI 0.785–0.916) for 90-day adverse outcomes. The AUC of the logistic regression model was 0.786 (95% CI 0.731–0.841) for MAKE30 and 0.759 (95% CI 0.654–0.864) for 90-day adverse outcomes. A web-based risk calculator can facilitate the application of the XGBoost models in daily clinical practice. In conclusion, XGBoost showed good performance in predicting MAKE30 and 90-day adverse outcomes, which provided clinicians with useful tools for prognostic assessment in hospitalized AKI children.

Subject terms: Acute kidney injury, Paediatric kidney disease, Outcomes research, Paediatric research, Machine learning, Prognosis

Introduction

Acute kidney injury (AKI) is a common complication among hospitalized children, characterized by an abrupt increase in serum creatinine (SCr) or decline in urine output1,2. Recent studies have suggested that AKI occurs in 26.9% of critically ill children and at least 5% of pediatric patients outside the intensive care unit (ICU)3,4. In China, the overall incidence of AKI is 20% in hospitalized children5. AKI has been found to be associated with poor prognosis in pediatric patients, including death, prolonged lengths of stay, longer ventilator support, and chronic kidney disease (CKD)1,2,6.

Recently, there has been increased concern about short- and long-term clinical outcomes in hospitalized AKI children. Major adverse kidney events within 30 days (MAKE30), a composite of death, new renal replacement therapy (RRT), or persistent renal dysfunction, is recommended as a patient-centered outcome for clinical trials in AKI7–10. Previous studies have examined MAKE30 in specified pediatric patients, but there is limited information on this outcome among hospitalized AKI children11,12. Additionally, it is essential to understand the long-term prognosis of AKI patients13. Even after hospital discharge, AKI patients are still at high risk of long-term mortality or chronic renal insufficiency.

Early prediction of adverse outcomes can allow clinicians to stratify pediatric AKI patients for individualized management and may improve patient outcomes. However, there is little clinical information on how to identify patients at high risk of short- and long-term adverse outcomes at an early stage. Recently, machine learning approaches have been developed and applied in diverse medical fields, including predicting the development and outcomes of AKI14–20. The eXtreme Gradient Boosting (XGBoost) algorithm, one of the state-of-the-art machine learning approaches, is an efficient implementation of the gradient boosting framework21. The algorithm has many advantages, such as high predictive accuracy, automatic modeling of non-linearities and high-order interactions, and robustness to multicollinearity. XGBoost has been shown to outperform traditional statistical methods, such as logistic regression, in diverse fields15,22,23 and has the potential to improve outcome prediction in hospitalized AKI children. Therefore, the study aimed to use the XGBoost algorithm to develop outcome prediction models in hospitalized AKI children.

Methods

Study design

We performed a retrospective study of admissions from January 1, 2015 to December 31, 2020 in the Second Xiangya Hospital of Central South University in China. Pediatric AKI patients were identified from hospitalized children aged between 1 month and 18 years, with at least two serum creatinine (SCr) measurements in any 7-day window during the first 30 days of hospitalization. AKI was determined according to the SCr criteria of the 2012 Kidney Disease: Improving Global Outcomes (KDIGO) Clinical Practice Guideline24. The time of AKI diagnosis was identified as the earliest time when the change in SCr met the KDIGO criteria. Baseline SCr was defined as the lowest SCr in the 7 days before AKI diagnosis, or the minimum inpatient SCr value for patients who met the criteria of community-acquired AKI5. We analyzed only the first hospitalization when a patient had multiple admissions during the study period. We excluded patients with end-stage renal disease (CKD stage 5, maintenance dialysis, and renal transplantation, identified by diagnosis codes) and hospital stay < 48 h. The study was approved by the Medical Ethics Committee of the Second Xiangya Hospital of Central South University (No. 2013-S061) and registered in the Chinese Clinical Trial Registry (ChiCTR-1800019857). Informed consent was waived due to the retrospective nature. The study was performed in accordance with the Declaration of Helsinki.

Data collection

We extracted data on patients’ demographics, diagnoses, clinical laboratory tests and treatments from the electronic medical record system and laboratory information system. AKI stage was defined based on the KDIGO criteria and determined using the highest SCr value during the first 7 days after AKI diagnosis. We excluded initiation of renal replacement therapy (RRT) when determining AKI stage 3 but recorded it as a separate variable. Patients were classified as having community-acquired AKI when the increase in SCr on the first day met the KDIGO criteria, or the SCr value on admission was ≥ 1.5 times the standardized SCr reference value and ≥ 1.5 times the lowest SCr value during hospitalization5. Patients who did not meet the criteria for community-acquired AKI were categorized as having hospital-acquired AKI. Comorbidities were identified by the diagnosis codes (International Classification of Diseases, 10th Edition) on admission and at discharge. We analyzed surgical procedures and exposure to nephrotoxic drugs in the 7 days before the time of AKI diagnosis. Surgical procedures were determined from the procedure codes and surgery dates, both of which were recorded at hospital discharge. The use of nephrotoxic drugs was determined in accordance with the list of nephrotoxic drugs presented in a recent study5. We also collected laboratory data and clinical interventions within 7 days after AKI diagnosis. Laboratory data included hemoglobin, white blood cells, platelets, proteinuria (urinary protein dipstick values ≥ 1+), total bilirubin, albumin, serum potassium, and serum sodium. If multiple measurements of a laboratory parameter were available during the period, we used the one taken closest to the time of AKI diagnosis. Clinical interventions included the use of loop diuretics, mechanical ventilation, and RRT.
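The SCr-based staging described above can be sketched in a few lines. This is a simplified illustration and not the authors' code: it uses only the relative-increase thresholds of the KDIGO SCr criteria (1.5, 2.0 and 3.0 times baseline) and deliberately omits the absolute-value criteria, the pediatric eGFR criterion and RRT initiation (which the study recorded separately). The function name `kdigo_scr_stage` is hypothetical.

```python
def kdigo_scr_stage(baseline_scr, peak_scr):
    """Return a simplified KDIGO AKI stage (0 = no AKI, 1-3) from SCr values.

    Assumption: only the ratio of peak to baseline SCr is considered.
    Both values must be in the same unit (e.g. umol/L).
    """
    if baseline_scr <= 0:
        raise ValueError("baseline_scr must be positive")
    ratio = peak_scr / baseline_scr
    if ratio >= 3.0:
        return 3
    if ratio >= 2.0:
        return 2
    if ratio >= 1.5:
        return 1
    return 0


# Hypothetical patient: baseline 40 umol/L, highest value within 7 days 85 umol/L
stage = kdigo_scr_stage(40, 85)
```

In the study, the peak value would be the highest SCr during the first 7 days after AKI diagnosis, and the baseline the lowest SCr in the 7 days before diagnosis.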

Outcomes

The primary outcomes included MAKE30 and 90-day adverse outcomes. MAKE30 was defined as a composite of death, new RRT, or persistent renal dysfunction at hospital discharge or 30 days after AKI diagnosis, whichever occurred first7,8. Persistent renal dysfunction was defined as a final inpatient SCr value ≥ 200% of the baseline value. 90-day adverse outcomes included death and chronic dialysis 90 days after the time of AKI diagnosis. Survival status after hospital discharge was obtained from the Chinese Center for Disease Control and Prevention cause-of-death reporting system. Chronic dialysis was determined by reviewing patients’ inpatient and outpatient medical records, making phone calls to the patients or their families, and referring to the Chinese National Renal Data System. The secondary outcome was the length of hospital stay, defined as the number of days between AKI diagnosis and hospital discharge.
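As a minimal sketch of the composite definition above, MAKE30 can be expressed as a logical OR of its three components, with persistent renal dysfunction taken as a final inpatient SCr of at least 200% of baseline. The function and argument names are hypothetical, and the inputs are assumed to have already been restricted to the window ending at discharge or 30 days after AKI diagnosis.

```python
def make30(died, new_rrt, baseline_scr, final_scr):
    """Return True if any MAKE30 component is present.

    Assumption: `died` and `new_rrt` are already censored at hospital
    discharge or 30 days after AKI diagnosis, whichever came first.
    """
    persistent_renal_dysfunction = final_scr >= 2.0 * baseline_scr
    return bool(died or new_rrt or persistent_renal_dysfunction)


# Hypothetical survivor without RRT whose final SCr is exactly twice baseline
event = make30(died=False, new_rrt=False, baseline_scr=40, final_scr=80)
```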

Statistical analysis

Continuous variables are presented as medians and interquartile ranges and were compared by Mann–Whitney U test. Categorical variables are presented as counts and percentages and were compared by chi-square tests. Survival analysis was performed by Kaplan–Meier method. Multivariable logistic regression analysis was used to determine risk factors of MAKE30 and 90-day adverse outcomes. Baseline variables considered clinically relevant or statistically significant on univariable analysis were selected into the stepwise regression model. The percentages of missing values in all baseline variables were less than 20%, and the median (for continuous variables) or mode (for categorical variables) was used for missing value imputation.
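The imputation rule in the last sentence can be illustrated as follows. This is a sketch under stated assumptions, not the authors' implementation: missing values are represented as `None`, and the helper name `impute` is hypothetical. The source does not state whether fill values were computed on the whole cohort or on the training set only.

```python
from statistics import median, mode


def impute(values, kind):
    """Replace None with the median (continuous) or mode (categorical)."""
    observed = [v for v in values if v is not None]
    fill = median(observed) if kind == "continuous" else mode(observed)
    return [fill if v is None else v for v in values]


# Hypothetical serum albumin values (g/L) and proteinuria flags
albumin = impute([36.8, None, 30.0, 41.0], "continuous")
proteinuria = impute([1, 0, None, 0], "categorical")
```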

To establish and validate prediction models for MAKE30 and 90-day adverse outcomes, we randomly allocated pediatric AKI patients to the training and test sets at a ratio of 7 to 3. In the training set, both XGBoost and logistic regression were used for model construction. The list of all predictor variables included in the prediction models is shown in Supplementary Table S1. XGBoost is an optimized distributed gradient boosting method with high efficiency, flexibility and portability21. It implements machine learning algorithms under the gradient boosting framework: decision trees are added sequentially, each fitted to reduce the loss function along its gradient, and the final output is the weighted sum of all trees. XGBoost provides a variety of hyper-parameters for different settings. This study used grid search and five-fold cross-validation to identify optimal hyper-parameters. The training set was randomly split into 5 equal-sized subsets; 4 of them were used for model training, while the remaining one served as the validation set. This process was repeated 5 times, so that each subset served as the validation set once. Seven hyper-parameters (eta, max_depth, min_child_weight, subsample, colsample_bytree, gamma and lambda) were tuned in this way, and the optimal values were used in the final model. Feature importance of the XGBoost model was calculated using the gain as the measure, representing each feature's fractional contribution to the model based on the total gain of this feature's splits. Finally, the performance of the prediction models was evaluated in the test set. Evaluation metrics included the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), and the Brier score, with AUC as the primary metric. The optimal cutoff points were determined based on the maximum Youden index in the training set. In addition, we repeated the analysis with 5 random splits to test the robustness of the findings.
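The resampling scheme in this paragraph (a 7:3 split, then five-fold cross-validation over a hyper-parameter grid) can be sketched with the Python standard library alone. The study itself used R, so everything here is illustrative: the function names, the seed, and the candidate values for `eta` and `max_depth` are assumptions, and no model is actually fitted.

```python
import random
from itertools import product


def train_test_split_ids(ids, test_fraction=0.3, seed=0):
    """Shuffle patient ids and return (training ids, test ids)."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_folds(ids, k=5, seed=0):
    """Yield (training ids, validation ids) for each of k near-equal folds."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, folds[i]


def grid(param_values):
    """Yield every combination of candidate hyper-parameter values."""
    names = list(param_values)
    for combo in product(*(param_values[n] for n in names)):
        yield dict(zip(names, combo))


train_ids, test_ids = train_test_split_ids(range(1394))
folds = list(k_folds(train_ids))
# Hypothetical candidate values; the paper does not report the grid it searched
candidates = list(grid({"eta": [0.01, 0.1], "max_depth": [3, 5, 7]}))
```

Each candidate in `candidates` would be scored on all five folds, and the combination with the best mean validation score kept for the final model. Rounding here gives a 976/418 split rather than the 975/419 reported in the paper.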

In sensitivity analysis, we examined the performance of the XGBoost models and the logistic regression models in predicting MAKE30 and 90-day adverse outcomes in pediatric AKI patients in different age groups. We also examined the models’ performance in AKI children in the ICU and those in other units.

Statistical analyses were performed using R 4.1.2 (https://cran.r-project.org). We used the xgboost package, version 1.4.1.1, for XGBoost modeling. p-value < 0.05 was considered statistically significant.

Results

Patient characteristics

During the study period, 18,194 of 93,040 hospitalized children had at least two SCr measurements in a 7-day window during the first 30 days of hospitalization. Of them, 1394 pediatric AKI patients who met all eligibility criteria were included in our study (Fig. 1). The overall occurrence of AKI was 7.7% (1394/18,194) among hospitalized children. The incidence of AKI was 16.6%, 7.1% and 4.8% in patients aged 1 month to 1 year (infancy), aged 2 to 10 years (childhood), and aged 11 to 18 years (adolescence), respectively.

Figure 1.

Study flow diagram. AKI, acute kidney injury. MAKE30, Major Adverse Kidney Events within 30 days. The figure was created using Microsoft PowerPoint 2019 (https://www.microsoft.com/).

Baseline characteristics of the study cohort are shown in Table 1. The study cohort consisted of 504 (36.2%) AKI patients in infancy, 502 (36.0%) in childhood, and 388 (27.8%) in adolescence. Hospital-acquired AKI accounted for 75.4% of pediatric AKI patients. Most patients (60.5%) were diagnosed with AKI stage 1, while 22.9% were diagnosed with AKI stage 2 and 16.6% with AKI stage 3. For hospitalized AKI children, the top three most common clinical settings were nephrotoxic drugs (48.3%), congenital heart disease or cardiac surgery (32.4%) and sepsis (13.0%).

Table 1.

Baseline characteristics of the study cohort.

Characteristics Cohort (n = 1394)
Age, yr 4 (0–11)
Age categories, n (%)
Infancy, 1 mo−1 yr 504 (36.2)
Childhood, 2–10 yr 502 (36.0)
Adolescent, 11–18 yr 388 (27.8)
Sex, male, n (%) 817 (58.6)
AKI type, n (%)
Community-acquired AKI 343 (24.6)
Hospital-acquired AKI 1051 (75.4)
AKI stage, n (%)
Stage 1 844 (60.5)
Stage 2 319 (22.9)
Stage 3 231 (16.6)
Clinical settings, n (%)
Sepsis 181 (13.0)
Glomerulonephritis 57 (4.1)
Nephrotic syndrome 156 (11.2)
CKD^a 15 (1.1)
Urinary tract obstruction/malformation 25 (1.8)
Non-cardiac surgery 72 (5.2)
Congenital heart disease/cardiac surgery 451 (32.4)
Heart failure 114 (8.2)
Inherited metabolic disease 24 (1.7)
Cardiac arrest 12 (0.9)
Trauma/burn 27 (1.9)
Shock 61 (4.4)
Respiratory failure 120 (8.6)
Diarrhea/dehydration 56 (4.0)
Nephrotoxic drugs 673 (48.3)
Laboratory data
Hemoglobin, g/L 111 (92–126)
 < 90 313 (22.5)
White blood cells, × 10^9/L 9.4 (5.7–14.6)
 < 4 227 (16.3)
 > 10 649 (46.6)
Platelets, × 10^9/L 229 (138–345)
 < 100 239 (17.1)
Proteinuria, n (%) 261 (22.7)
Serum albumin, g/L 36.8 (30.3–41.0)
 < 30 332 (23.9)
Serum total bilirubin, μmol/L 9.5 (5.2–18.2)
 > 34.2 156 (11.2)
Serum potassium, mmol/L 4.4 (3.9–4.9)
 < 3.5 144 (10.5)
 > 5.5 89 (6.5)
Serum sodium, mmol/L 138 (136–140)
 < 135 271 (19.9)
 > 145 56 (4.1)
Loop diuretics, n (%) 736 (52.8)
Mechanical ventilation, n (%) 274 (19.7)
RRT, n (%) 105 (7.5)

AKI acute kidney injury, CKD chronic kidney disease, RRT renal replacement therapy.

Continuous variables are presented as median (interquartile range) and categorical variables are presented as n (%).

Missing data: proteinuria (n = 245, 17.6%), serum albumin (n = 4, 0.3%), serum total bilirubin (n = 7, 0.5%), serum potassium (n = 29, 2.1%) and serum sodium (n = 29, 2.1%).

^a Admission or discharge diagnoses included CKD stage 3–4, identified by ICD-10 codes (N18.803 and N18.804).

Outcomes

Outcomes of the study cohort are shown in Table 2. MAKE30 occurred in 24.1% of all pediatric AKI patients. The Kaplan–Meier curves for mortality within 30 days and 90 days are shown in Supplementary Figs. S1 and S2. During the follow-up, the incidence of 90-day adverse outcomes was 8.1%. Baseline characteristics of pediatric AKI patients stratified by MAKE30 and 90-day adverse outcomes are presented in Supplementary Tables S2 and S3, respectively. Overall, compared with those who showed a good prognosis, patients with MAKE30 or 90-day adverse outcomes had a higher prevalence of clinical comorbidities, a larger proportion of abnormal laboratory data, and more severe renal dysfunction at baseline.

Table 2.

Outcomes of the study cohort.

Outcomes Cohort (n = 1394)
Hospital length of stay (d) 13 (6–26)
MAKE30, n (%)
Death 66 (4.7)
Receipt of new RRT 124 (8.9)
PRD 233 (16.7)
Total 336 (24.1)
90-day adverse outcomes, n (%)
Death 99 (7.1)
Chronic dialysis 14 (1.0)
Total 113 (8.1)

MAKE30, Major Adverse Kidney Events within 30 days.

Continuous variables are presented as median (interquartile range) and categorical variables are presented as n (%).

RRT renal replacement therapy, PRD persistent renal dysfunction.

Risk factors for MAKE30 and 90-day adverse outcomes

Multivariable logistic regression analysis showed that the risk factors for MAKE30 were hospital-acquired AKI, AKI stage, glomerulonephritis, respiratory failure, hypoalbuminemia (serum albumin < 30 g/L), hyperbilirubinemia (serum total bilirubin > 34.2 μmol/L), and hyperkalemia (serum potassium > 5.5 mmol/L) (Table 3). Of these, AKI stage was the major risk factor for MAKE30, with an odds ratio (OR) of 9.42 (95% confidence interval [CI], 6.58–13.49) for stage 2 and 16.86 (95% CI, 11.31–25.12) for stage 3. The risk factors for 90-day adverse outcomes included age, AKI stage, CKD, shock, respiratory failure, thrombocytopenia (platelets < 100 × 10^9/L), hypoalbuminemia, hyperkalemia and mechanical ventilation (Table 4). The ORs for the top 3 major risk factors were 14.86 (95% CI, 4.71–46.90) for CKD, 3.96 (95% CI, 1.78–8.80) for shock and 3.19 (95% CI, 1.71–5.95) for respiratory failure.
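For readers less familiar with logistic regression output, the sketch below shows how an odds ratio and its 95% CI relate to a fitted coefficient and its standard error. It assumes Wald-type intervals (the exponential of the coefficient plus or minus 1.96 standard errors), which the source does not explicitly state; the function names are hypothetical. As an illustration, the coefficient and standard error are back-calculated from the reported stage 2 OR for MAKE30.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_or(odds_ratio, lower, upper, z=1.96):
    """Recover the coefficient and standard error from an OR and its CI.

    Assumption: the interval is a Wald interval, symmetric on the log scale.
    """
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Reported OR for AKI stage 2 versus stage 1: 9.42 (95% CI 6.58-13.49)
beta, se = beta_se_from_or(9.42, 6.58, 13.49)
or_hat, lo, hi = odds_ratio_ci(beta, se)
```

The round trip reproduces the reported interval to within rounding, which is consistent with (but does not prove) a Wald-type interval.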

Table 3.

Multivariable logistic regression analysis of risk factors associated with MAKE30.

Characteristics OR 95% CI p value
Hospital-acquired AKI 1.49 1.02–2.17 0.039
AKI stage
Stage 1 1.00
Stage 2 9.42 6.58–13.49 < 0.001
Stage 3 16.86 11.31–25.12 < 0.001
Glomerulonephritis 1.97 1.02–3.81 0.044
Shock 1.98 0.99–3.96 0.05
Respiratory failure 2.67 1.61–4.43 < 0.001
Nephrotoxic drugs 0.76 0.54–1.06 0.10
Platelets < 100 × 10^9/L 1.42 0.98–2.06 0.07
Serum albumin < 30 g/L 1.54 1.10–2.17 0.012
Serum total bilirubin > 34.2 μmol/L 1.95 1.26–3.00 0.003
Serum potassium > 5.5 mmol/L 2.02 1.14–3.58 0.015

OR odds ratio, CI confidence interval, AKI acute kidney injury.

Table 4.

Multivariable logistic regression analysis of risk factors associated with 90-day adverse outcomes.

Characteristics OR 95% CI p value
Age 1.08 1.04–1.12 < 0.001
AKI stage
Stage 1 1.00
Stage 2 1.75 1.01–3.04 0.046
Stage 3 2.38 1.36–4.16 0.002
Sepsis 0.59 0.28–1.22 0.15
CKD 14.86 4.71–46.90 < 0.001
Shock 3.96 1.78–8.80 < 0.001
Respiratory failure 3.19 1.71–5.95 < 0.001
Platelets < 100 × 10^9/L 2.73 1.67–4.48 < 0.001
Serum albumin < 30 g/L 1.71 1.06–2.78 0.029
Serum total bilirubin > 34.2 μmol/L 1.80 0.98–3.32 0.06
Serum potassium > 5.5 mmol/L 2.69 1.23–5.86 0.013
Mechanical ventilation 2.72 1.61–4.61 < 0.001

OR odds ratio, CI confidence interval, AKI acute kidney injury, CKD chronic kidney disease.

Prediction models for MAKE30 and 90-day adverse outcomes

Of 1394 pediatric AKI patients, 975 were randomly assigned to the training set and 419 to the test set. There was no significant difference in baseline characteristics and outcomes between the training and the test sets (Supplementary Tables S4 and S5). In the test set, the AUC of the XGBoost model was 0.810 (95% CI 0.763–0.857) for MAKE30 and 0.851 (95% CI 0.785–0.916) for 90-day adverse outcomes. The AUC of the logistic regression model was 0.786 (95% CI 0.731–0.841) for MAKE30 and 0.759 (95% CI 0.654–0.864) for 90-day adverse outcomes (Fig. 2 and Supplementary Fig. S3). Table 5 describes the performance of the prediction models for MAKE30 and 90-day adverse outcomes. At the optimal cutoff points, XGBoost achieved a sensitivity of 72.0% and a specificity of 77.4% for MAKE30 and a sensitivity of 73.0% and a specificity of 84.0% for 90-day adverse outcomes in the test set. The precision-recall curves of the models are provided in Fig. 3 and Supplementary Fig. S4. In the test set, the AUPRC of the XGBoost model was 0.521 for MAKE30 and 0.409 for 90-day adverse outcomes. The Brier score and calibration plots of the models are provided in Fig. 4 and Supplementary Fig. S5. The Brier scores of the two models were lower than that of the null model. The results of the 5 random splits are shown in Supplementary Table S6.
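The two quantities central to this paragraph, the AUC and the Youden-index cutoff, can be computed from first principles. The sketch below is an illustration on made-up scores, not the authors' code: the AUC is taken as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one (ties counted as one half), and the cutoff is the score threshold that maximizes sensitivity + specificity − 1.

```python
def roc_auc(y_true, scores):
    """AUC via pairwise comparison of positive and negative cases."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(y_true, scores):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index.

    A case is called positive when its score is >= cutoff; on ties in the
    index, the lowest cutoff is kept.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= c)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1:]


# Made-up outcomes and predicted risks for six patients
y = [0, 0, 1, 0, 1, 1]
risk = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8]
auc = roc_auc(y, risk)
cutoff, sensitivity, specificity = youden_cutoff(y, risk)
```

In the study, the cutoff was chosen this way in the training set and then applied unchanged to the test set, which is why Table 5 lists the same cutoff for both.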

Figure 2.

Receiver operating characteristic curves of the logistic regression and the XGBoost models for MAKE30 (A) and 90-day adverse outcomes (B) in the test set. AUC, area under the receiver operating characteristic curve. The figure was created using R 4.1.2 (https://cran.r-project.org).

Table 5.

Performance of the XGBoost models for MAKE30 and 90-day adverse outcomes in the training and test sets.

Metric | MAKE30, training set | MAKE30, test set | 90-day adverse outcomes, training set | 90-day adverse outcomes, test set
AUC (95% CI) | 0.907 (0.887–0.927) | 0.810 (0.763–0.857) | 0.964 (0.946–0.983) | 0.851 (0.785–0.916)
Cutoff point | 0.2958 | 0.2958 | 0.0948 | 0.0948
Sensitivity (%) | 85.2 | 72.0 | 96.1 | 73.0
Specificity (%) | 81.2 | 77.4 | 86.7 | 84.0
PPV (%) | 59.1 | 50.0 | 37.8 | 30.7
NPV (%) | 94.5 | 89.8 | 99.6 | 97.0

MAKE30, Major Adverse Kidney Events within 30 days.

AUC area under the receiver operating characteristic curve, CI confidence interval, PPV positive predictive value, NPV negative predictive value.
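The predictive values in Table 5 depend on outcome prevalence as well as on sensitivity and specificity. The sketch below applies Bayes' rule to the MAKE30 test-set sensitivity and specificity. The prevalence of 24.1% is an assumption borrowed from the whole cohort, since the test-set prevalence is not reported in the text, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    return tp / (tp + fp), tn / (tn + fn)


# MAKE30, test set: sensitivity 72.0%, specificity 77.4%
# Assumed prevalence: 24.1% (cohort-wide incidence, not the test-set value)
ppv, npv = predictive_values(0.720, 0.774, 0.241)
```

Under this assumption the result lands close to the reported PPV of 50.0% and NPV of 89.8%. The low PPV for 90-day adverse outcomes (30.7%) despite higher specificity follows from the same formula, because that outcome is much rarer.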

Figure 3.

Precision-recall curves of the logistic regression and XGBoost models for MAKE30 (A) and 90-day adverse outcomes (B) in the test set. The figure was created using Python 3.6 (https://www.python.org/).

Figure 4.

Calibration curves of the logistic regression and XGBoost models for MAKE30 (A) and 90-day adverse outcomes (B) in the test set. The Brier scores of the null model, logistic regression model, and XGBoost model for MAKE30 were 0.239, 0.144, and 0.141, respectively. The Brier scores of the null model, logistic regression model, and XGBoost model for 90-day adverse outcomes were 0.088, 0.074, and 0.065, respectively.
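The Brier score quoted in the caption is the mean squared difference between the predicted probability and the observed outcome (0 or 1), so lower values indicate better calibrated and more accurate predictions. A minimal sketch on made-up values follows; how the null model in the caption was defined is not stated in the text, so it is not reproduced here.

```python
def brier_score(y_true, probabilities):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probabilities)) / len(y_true)


# Made-up outcomes and predicted risks for four patients
score = brier_score([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.1])
```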

Figures 5 and 6 show the top 15 most important features derived from the XGBoost models. Feature importance reflects the contribution of each variable to the results during the learning process. AKI stage 3 was the most important variable for the prediction of MAKE30, followed by AKI stage 2, serum albumin, platelet count, and serum potassium. For 90-day adverse outcomes, the top 5 most important predictors were serum albumin, platelet count, shock, age, and serum potassium. The partial dependence plots and individual conditional expectation plots of the XGBoost models are provided in Supplementary Figs. S6 and S7.
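Gain-based importance, as defined in the Methods, is each feature's share of the total gain accumulated over all splits that use it. The sketch below normalizes and ranks a dictionary of per-feature total gains. The feature names echo the paper, but the gain values are invented for illustration and the function name is hypothetical.

```python
def fractional_gain(total_gain_by_feature):
    """Return (feature, share of total gain) pairs, largest share first."""
    total = sum(total_gain_by_feature.values())
    ranked = sorted(total_gain_by_feature.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, gain / total) for name, gain in ranked]


# Invented total gains summed over all splits of each feature
importance = fractional_gain({
    "AKI stage 3": 50.0,
    "AKI stage 2": 30.0,
    "Serum albumin": 20.0,
})
```

Because the shares sum to 1, a feature's importance is relative to the other predictors in the same model and is not comparable across the MAKE30 and 90-day models.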

Figure 5.

The top 15 important features derived from the XGBoost model for MAKE30. AKI, acute kidney injury; WBC, white blood cell. The figure was created using R 4.1.2 (https://cran.r-project.org).

Figure 6.

The top 15 important features derived from the XGBoost model for 90-day adverse outcomes. RRT, renal replacement therapy; AKI, acute kidney injury; WBC, white blood cell; CKD, chronic kidney disease. The figure was created using R 4.1.2 (https://cran.r-project.org).

In sensitivity analysis, we examined the models’ performance in pediatric AKI patients stratified by age groups. We also evaluated the models’ performance in AKI children in the ICU and those in other units. The results are shown in Supplementary Figs. S8–S11. The XGBoost models were superior to the logistic regression models for predicting MAKE30 and 90-day adverse outcomes in the subgroups of pediatric AKI patients.

We further developed a web-based risk calculator (http://xydsbAKIteam.xyeyy.com) to promote the application of the XGBoost models, which can automatically calculate the risk of MAKE30 and 90-day adverse outcomes in hospitalized AKI children.

Discussion

The present study found that the incidence of MAKE30 and 90-day adverse outcomes was 24.1% and 8.1% among hospitalized AKI children, respectively. AKI stage was the major risk factor for MAKE30. CKD was the major risk factor for 90-day adverse outcomes. Additionally, we established and validated machine learning-based models using the XGBoost algorithm for predicting MAKE30 and 90-day adverse outcomes. A web-based calculator was established to apply the XGBoost models in daily clinical practice.

Several recent studies have examined the incidence and outcomes of AKI among hospitalized children3–5,13,25,26. The incidence of AKI varies with clinical settings and age. A large multicenter study reported a 20% overall incidence of AKI among 101,836 pediatric inpatients in China5. In that study, AKI occurred in 28% of infants, higher than the 17% in childhood patients and 12% in adolescents. Additionally, one study showed that the occurrence of AKI increased in parallel with age and was greatest in patients aged 15 to 18 years old25. Our study found that AKI occurred in 7.7% of all hospitalized children. The incidence of AKI in infants (16.6%) was approximately twice that in childhood patients (7.1%) and more than three times that in adolescents (4.8%). Differences may depend on the diverse causes of AKI and the distribution of comorbidities. Although the incidence of AKI differs between patient populations, it is consistently related to poorer prognosis in hospitalized children3. Previous studies have reported that the incidence of MAKE30 was 9.6% in children with sepsis11 and 5.2% in critically ill children12. Our study showed that MAKE30 and 90-day adverse outcomes occurred in 24.1% and 8.1% of hospitalized AKI children, respectively. The results suggested that continuous monitoring during hospitalization and frequent follow-up after discharge are essential for pediatric AKI patients.

The study identified risk factors associated with MAKE30 and 90-day adverse outcomes. As expected, AKI stage and CKD were the major risk factors of MAKE30 and 90-day adverse outcomes, respectively. A higher AKI stage reflects more severe renal dysfunction, while a history of CKD suggests decreased glomerular reserve at baseline27. Another important finding is that the risk of 90-day adverse outcomes increased with age. The reasons may be distinct developmental status and repairability in hospitalized AKI children of different ages. In addition, hospital-acquired AKI was independently associated with MAKE30. Although differences in outcomes between hospital- and community-acquired AKI have been investigated in adults28, studies focusing on pediatric patients are still limited. Finally, baseline variables associated with systemic diseases and multiorgan dysfunction were also risk factors for poor prognosis in hospitalized AKI children3,5.

Our study used machine learning methods to predict adverse outcomes in hospitalized AKI children. The results showed that the XGBoost models achieved good performance in predicting MAKE30 and 90-day adverse outcomes. Previous studies have also shown the applicability of the XGBoost algorithm in predicting complications after pediatric cardiac surgery29, multiple organ dysfunction in pediatric ICU30, and volume responsiveness in oliguric AKI patients15. Compared with traditional logistic regression, the XGBoost algorithm has several strengths. Firstly, XGBoost has a strong non-linear fitting capability. The logistic regression model assumes a linear relationship between each continuous independent variable and the logit of the dependent variable. By contrast, XGBoost makes flexible assumptions and can learn complex relationships among the input variables. Secondly, XGBoost is robust to outliers and multicollinearity among the predictors, whereas logistic regression assumes that the independent variables are not strongly collinear. Thirdly, XGBoost can achieve better predictive performance by applying ensemble learning, which integrates the results of multiple weak learners to obtain a strong learner. Fourthly, the XGBoost algorithm can identify important predictors of the outcome by calculating the contribution of each feature to each tree in the learning process20, including predictors that clinicians may overlook in clinical practice.

Early prediction of adverse outcomes is critical for risk stratification and clinical decision-making in hospitalized AKI children. To promote the clinical application of the XGBoost models, we further established a web-based risk calculator for prognostic assessment of pediatric AKI patients. The risk calculator can help clinicians identify high-risk patients at an early stage for individualized management, such as discussions of goals of care, decisions about resource allocation, evaluations of the quality of care, and suggestions of follow-up frequency, and may improve the prognosis of hospitalized AKI children.

Our study has several limitations. Firstly, because it was a single-center retrospective study conducted in an academic hospital, the results may not be generalizable to patients in other medical centers. Secondly, the sample size was relatively small, resulting in a limited number of patients positive for some baseline variables. An important example was CKD, which was determined based on admission or discharge diagnosis codes. Because of the lack of body height data, we were unable to identify it according to the estimated glomerular filtration rate. Thirdly, urine output criteria were not used for AKI diagnosis because hourly urine output rate was not routinely measured in hospitalized AKI patients outside the ICU. Future multi-center prospective studies are required to externally validate the robustness and clinical effectiveness of the prediction models in a larger cohort of hospitalized AKI children.

Conclusions

In conclusion, we determined the incidence and outcomes of AKI among hospitalized children and developed machine learning-based prediction models for MAKE30 and 90-day adverse outcomes using the XGBoost algorithm. The XGBoost models showed good predictive performance in all hospitalized AKI children and in different subgroups. We further established a web-based risk calculator to promote the clinical application of the XGBoost models, which provided clinicians with useful tools for prognostic assessment in hospitalized AKI children. Future multi-center prospective studies are required to demonstrate the robustness and clinical effectiveness of the prediction models.

Author contributions

S.B.D. designed and supervised the study and revised the manuscript. Y.H.D. performed the data extraction, analyzed and interpreted the data and drafted the manuscript. X.Q.L. and P.Y. analyzed and interpreted the data and critically revised the manuscript. N.Y.Z. analyzed the data and revised the manuscript critically for important intellectual content. All authors read and approved the final manuscript.

Data availability

The datasets used during the current study are available from the corresponding author on request.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-022-13152-x.

Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
