Clinical Kidney Journal. 2022 Aug 2;15(12):2266–2280. doi: 10.1093/ckj/sfac181

Machine learning models for predicting acute kidney injury: a systematic review and critical appraisal

Iacopo Vagliano, Nicholas C Chesnaye, Jan Hendrik Leopold, Kitty J Jager, Ameen Abu-Hanna, Martijn C Schut
PMCID: PMC9664575  PMID: 36381375

ABSTRACT

Background

The number of studies applying machine learning (ML) to predict acute kidney injury (AKI) has grown steadily over the past decade. We assess and critically appraise the state of the art in ML models for AKI prediction, considering performance, methodological soundness, and applicability.

Methods

We searched PubMed and ArXiv, extracted data, and critically appraised studies based on the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD), Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS), and Prediction Model Risk of Bias Assessment Tool (PROBAST) guidelines.

Results

Forty-six studies from 3166 titles were included. Thirty-eight studies developed a model, five developed and externally validated one, and three studies externally validated one. Flexible ML methods were used more often than deep learning, although the latter was common with temporal variables and text as predictors. Predictive performance showed areas under the receiver operating characteristic curve ranging from 0.49 to 0.99. Our critical appraisal identified a high risk of bias in 39 studies. Some studies lacked internal validation, whereas external validation and interpretability of results were rarely considered. Fifteen studies focused on AKI prediction in the intensive care setting, and the US-derived Medical Information Mart for Intensive Care (MIMIC) data set was commonly used. Reproducibility was limited as data and code were usually unavailable.

Conclusions

Flexible ML methods are popular for the prediction of AKI, although more complex models based on deep learning are emerging. Our critical appraisal identified a high risk of bias in most models: Studies should use calibration measures and external validation more often, improve model interpretability, and share data and code to improve reproducibility.

Keywords: acute kidney injury, clinical prediction models, critical appraisal, machine learning, systematic review

Graphical Abstract

INTRODUCTION

Acute kidney injury (AKI) has a substantial impact on the global burden of kidney disease, with an estimated 13.3 million cases in 2017 [1, 2] and 1.7 million deaths each year [3, 4]. Early recognition, risk assessment, and care of AKI are suboptimal and contribute to disease progression, high health care costs, and poor patient outcomes [5, 6]. To assist physicians with risk assessment of AKI, prediction models have been developed across various patient populations with varying degrees of predictive accuracy [7, 8]. Models built using machine learning (ML), that is, mathematical models that make decisions and predictions based on data sets, have become popular [9]. ML differs from standard regression modelling (parametric or semiparametric models with a relatively low number of parameters, e.g. logistic regression and Cox models) in the high volume of data that can be used as input and in the computational effort required for analysis [9, 10].

Recently, we have seen rapid growth in the number of ML models for AKI prediction [12–30]. The sudden rise of such a novel and immediately popular modelling paradigm raises questions about how well these models perform, how sound their methodology is, and whether the models are applicable to clinical settings (e.g. in terms of populations and availability of predictors).

Systematic reviews on AKI prediction are plentiful [12–29]. We are aware of a single review of AKI prediction using ML models [30], which assessed whether ML models outperform logistic regression for predicting AKI. This review did not perform any critical appraisal. In contrast, we review and critically appraise ML models for the prediction of AKI in terms of performance, methodological soundness, and clinical applicability.

MATERIALS AND METHODS

The protocol for this study was registered in the online PROSPERO database (CRD42022304868). We followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines [31].

Study identification

We used PubMed (pubmed.ncbi.nlm.nih.gov) and ArXiv (arxiv.org) for our search. We searched title or abstract with the string (Clinical OR medical) AND (predict*) AND (AKI OR AKF OR AKD OR ARI OR ARF OR ARD OR ‘acute kidney injury’ OR ‘acute kidney failure’ OR ‘acute renal failure’ OR ‘acute renal insufficiency’). The search was conducted on March 1, 2021.

Study inclusion

We included studies that (i) developed or validated prediction models for AKI and (ii) used ML models. We excluded studies that focused on identifying or analyzing individual predictors instead of model development or validation. We excluded studies that used only standard regression models, gray literature, and informal publications (commentaries, letters to the editor, editorials, and meeting abstracts).

Study selection

Pilot selection and extraction were conducted by I.V., N.C.C. and J.H.L. to validate and refine the research question, the inclusion criteria, and the data-extraction form. Subsequently, we selected full-text papers based on abstract screening and divided them equally among I.V., N.C.C., and J.H.L. At least two researchers reviewed a quarter of the included studies to ensure an adequate level of inter-reviewer agreement. Discrepancies between reviewers were resolved by discussion.

Data extraction

We created a data-extraction form (Supplementary Table S1) based on the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) and the Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS) checklists [32, 33]. We included items regarding specific aspects of the models (prediction time window and duration of follow-up), the type of data, the methods used for model interpretability, and the availability of data and code. I.V., N.C.C., and J.H.L. performed the data extraction.

Critical appraisal

We assessed potential biases in the included studies by using the Prediction Model Risk of Bias Assessment Tool (PROBAST) [34]. PROBAST distinguishes among different aspects that may generate bias: (i) the use of unsuitable data, (ii) participant selection, (iii) definition or assessment of predictors, (iv) outcome definition and its relation to the predictors, and (v) incorrect data analysis. The latter pertains to the handling of missing data, validation, and use of proper performance measures. To define common criteria for rating bias and applicability, I.V., N.C.C., J.H.L., and A.A.H. first reviewed and discussed one study. I.V., N.C.C., and J.H.L. then completed the critical appraisal. At least two researchers reviewed a quarter of the included studies to ensure inter-reviewer agreement. Disagreement between reviewers was resolved by discussion.

RESULTS

Literature search

We retrieved 3166 titles through our search (Fig. 1). Fifty-four were selected for full-text screening, and 46 studies were finally included. Most of these studies were published over the past 2 years (Fig. 2). Thirty-eight studies (82%) developed a model, five (11%) developed and externally validated one, and three (7%) externally validated one.

FIGURE 1: Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) flowchart of study inclusions and exclusions.

FIGURE 2: Models used in the included studies over time, grouped by their type. Orange bars show flexible machine learning (ML) models; blue bars show deep-learning models.

General study characteristics

Outcome

Thirty-two studies (70%) defined AKI as the outcome (distinguishing only between patients with and without AKI), and six studies (13%) focused on postoperative AKI. Other outcomes included the severity of AKI {10 studies [22%]}, the progression of AKI {1 study [2%]}, late AKI {AKI occurring after resuscitation or first 48 hours, 1 study [2%]}, preexisting AKI on arrival {1 study [2%]}, hospital-acquired AKI {1 study [2%]}, community-acquired AKI {1 study [2%]}, drug-induced AKI {1 study [2%]}, perioperative AKI {1 study [2%]}, and cardiac surgery–associated AKI {1 study [2%]}.

Definition and prevalence of AKI

The Kidney Disease Improving Global Outcomes (KDIGO) criteria [1] were used to define AKI in 36 studies (78%), whereas 4 studies (8%) used the Acute Kidney Injury Network criteria [35]; 2 studies (4%) used the Risk, Injury, Failure, Loss of kidney function, and End-stage kidney disease (RIFLE) criteria [36]; 1 study (2%) used codes from the International Classification of Diseases, Ninth Revision [37]; and 1 study (2%) used the National Health Service England algorithm [38] together with KDIGO. The prevalence of AKI ranged from 0.5% (general hospital population) [38] to 72.7% (patients who underwent aortic arch surgery) [39] and was not reported in three studies (7%).
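To make the creatinine-based part of these criteria concrete, the sketch below (Python) shows one possible way to assign a KDIGO stage from serum creatinine values. It is a minimal illustration only: the urine-output and renal-replacement-therapy criteria are omitted, and the function and variable names are ours rather than taken from any included study.

# Minimal sketch of the creatinine-based KDIGO criteria (urine output and
# renal replacement therapy omitted); names are illustrative only.
def kdigo_stage(scr_current, scr_baseline, scr_rise_48h):
    """Return a KDIGO AKI stage (0 = no AKI) from serum creatinine in mg/dL.

    scr_current  -- most recent serum creatinine
    scr_baseline -- baseline creatinine (e.g. lowest value in the prior 7 days)
    scr_rise_48h -- absolute creatinine rise within the last 48 hours
    """
    ratio = scr_current / scr_baseline
    if ratio >= 3.0 or scr_current >= 4.0:
        return 3
    if ratio >= 2.0:
        return 2
    if ratio >= 1.5 or scr_rise_48h >= 0.3:
        return 1
    return 0

# Example: baseline 0.9 mg/dL, current 1.5 mg/dL, rise of 0.6 mg/dL within 48 h
print(kdigo_stage(1.5, 0.9, 0.6))  # -> stage 1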

Type of prediction model

Figure 2 shows the number of studies over time, grouped by the type of model; deep-learning models emerged around 2017. Figure 3A shows the wide variety of models used in the selected studies. We distinguish between (i) flexible ML models, which tend to be nonparametric or ‘parameter-rich’, such as decision trees and random forests, and (ii) deep-learning models, which are based on neural networks with multiple levels of representation and rely on simple, nonlinear modules to transform the representation at one level into a more abstract representation at the next. The most common models were random forest {17 studies [37%]} and gradient-boosted trees {9 studies [20%]}. Among deep-learning models, recurrent neural networks were the most frequent {6 studies [13%]}. Figure 3B illustrates the type of model used by data type.
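As a rough illustration of the ‘flexible ML’ category, the following Python sketch fits a random forest and gradient-boosted trees on synthetic baseline predictors and reports their AUROC; it uses scikit-learn and placeholder data, not the data or settings of any reviewed study.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for baseline clinical predictors and a binary AKI label
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

for model in (RandomForestClassifier(n_estimators=300, random_state=0),
              GradientBoostingClassifier(random_state=0)):
    model.fit(X_train, y_train)
    auroc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(type(model).__name__, round(auroc, 3))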

FIGURE 3: Models used in the included studies (A) and models with different types of data (B). Orange bars denote flexible machine learning (ML) models, blue bars denote deep-learning models. AdaBoost: adaptive boosting; F-GAM: factored-generalized additive model; MLP: multilayer perceptron (feed-forward neural network).

Type and origin of data

The vast majority of studies used clinical variables with a single measurement {28 studies [61%]}, whereas 13 studies (28%) used clinical variables with repeated measurements, 3 studies (7%) used clinical variables with repeated measurements together with clinical notes, and 2 studies (4%) combined their data with external data. Twenty-seven studies used data from their own center. The Medical Information Mart for Intensive Care (MIMIC) data set, an openly available, intensive care–specific data set from the United States, was widely used {10 studies [28%]} [40]. Supplementary Figure S1 includes more details.

Model predictive performance

The predictive performance of clinical prediction models is assessed in terms of discrimination and calibration. Discrimination is the ability of a model to separate patients into classes (e.g. correctly distinguishing between patients with and without AKI), whereas calibration measures the agreement between predicted and observed outcomes [41].
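As a minimal sketch of how both aspects can be reported, the Python snippet below computes the AUROC (discrimination), the Brier score, and the data behind a calibration plot with scikit-learn; the predicted risks and outcomes here are placeholders standing in for those of a fitted model.

import numpy as np
from sklearn.metrics import roc_auc_score, brier_score_loss
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(0)
y_prob = rng.uniform(size=1000)        # placeholder predicted AKI risks
y_true = rng.binomial(1, y_prob)       # placeholder observed outcomes

print("AUROC:", round(roc_auc_score(y_true, y_prob), 3))            # discrimination
print("Brier score:", round(brier_score_loss(y_true, y_prob), 3))   # overall accuracy of predicted risks
observed, predicted = calibration_curve(y_true, y_prob, n_bins=10)  # calibration plot data
for p, o in zip(predicted, observed):
    print(f"mean predicted {p:.2f} vs observed {o:.2f}")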

Figure 4 summarizes the performance measures used for evaluating the models. The area under the receiver operating characteristic curve (AUROC) was the most frequently used discrimination measure {41 studies [89%]}, whereas calibration was rarely assessed {3 studies [7%]}. Table 1 summarizes the reported performance measures for each study. AUROC varied from 0.49 to 0.99. Random forest was often the best performing model within a study {12 studies [26%]}. Other best performing models included recurrent neural networks (RNNs) {6 studies [13%]} and gradient-boosted trees {5 studies [11%]}.

FIGURE 4: Performance measures reported in the included studies. Orange bars show discrimination measures; blue bars show calibration measures. AUPRC: area under the precision-recall curve; IDI: integrated discrimination improvement; NPV: negative predictive value; NRI: net reclassification index; PPV: positive predictive value; RMSE: root mean squared error.

Table 1.

Overview of the results reported by the studies and their settings

Study Settings AUROC Other measures Best model Comparison with Validation
S01 [54] Any AKI 48 h ahead 0.863–0.921 PR AUPRC: 0.173–0.297 RNN Gradient-boosted trees, logistic regression Internal
AKI 2–3 48 h ahead 0.870–0.957 PR AUPRC: 0.167–0.387
AKI 3 48 h ahead 0.930–0.980 PR AUC: 0.245–0.487
S02 [55] Unstructured and structured features 0.673–0.835 F-measure: 0.091–0.542 SVM Random forest, logistic regression, naïve Bayes, CNN Internal
Structured features 0.657–0.812 F-measure: 0.233–0.501 Random forest SVM, logistic regression, naïve Bayes
Unstructured features 0.750–0.774 F-measure: 0.066–0.495 Logistic regression SVM, random forest, naïve Bayes
S03 [56] MIMIC 0.743–0.893 RNN CNN Internal
eICU 0.812–0.871
S05 [57] AKI 0.817–0.834 F-measure: 0.283–0.430 Gradient-boosted trees Logistic regression, deep learning (unspecified) Internal
Accuracy: 0.939–0.948
S07 [58] AKI 0.499–0.867 PR AUPRC: 0.063–0.332 Gradient-boosted trees RNN, logistic regression Internal
S08 [59] AKI stage sCr F-measure: ∼0.560–0.609 Logistic regression (LASSO) Linear regression, ridge regression, LARS, SGD, random forest, MARS Internal
AKI stage 1/sCr F-measure: ∼0.650–0.671 Random forest Linear regression, ridge regression, LARS, SGD, LASSO, MARS
AKI occurrence sCr F-measure: ∼0.650–0.686 Logistic regression (LASSO) Linear regression, ridge regression, LARS, SGD, random forest, MARS
AKI occurrence 1/sCr F-measure: ∼0.750–0.758 Random forest Linear regression, ridge regression, LARS, SGD, LASSO, MARS
S09 [38] Onset 0.762–0.841 Accuracy: 0.570–0.810 Gradient-boosted trees SOFA Internal
12 h ahead 0.734–0.749 Accuracy: 0.550–0.760
24 h ahead 0.716–0.758 Accuracy: 0.760–0.820
48 h ahead 0.675–0.707 Accuracy: 0.810–0.820
72 h ahead 0.653–0.674 Accuracy: 0.790–0.800
S10 [60] AKI stage ≥1 0.730 Gradient-boosted trees Internal
AKI stage ≥2 0.870
AKI stage ≥3 0.930
S11 [61] AKI 0.800 Generalized additive model Internal
S12 [62] AKI 7-days 0.840–0.870 Accuracy: 0.760–0.800 Random forest Generalized additive model Internal
S15 [63] AKI 0.690–0.760 Logistic regression Random forest, naïve Bayes, deep learning (unspecified) Temporal
S18 [64] AKI stage ≥1 0.746–0.758 Logistic regression LASSO, random forest Internal
AKI stage ≥2 0.714–0.721 Random forest LASSO, logistic regression
S20 [65] At 24 h from admission 0.621–0.664 Ensemble (of all techniques) Logistic regression, naïve Bayes, SVM, decision trees Internal
S22 [66] All features 0.797–0.827 Accuracy: 0.744–0.767 Generalized additive model Logistic regression, naïve Bayes, SVM Internal
Feature selection with LASSO 0.797–0.824 Accuracy: 0.744–0.767
Feature extraction with 5 principal components 0.819–0.858 Accuracy: 0.741–0.777
S23 [67] AKI data from admission 0.751–0.765 Random forest AdaBoost, logistic regression Internal
AKI data 24 h before admission 0.732–0.747
AKI data 7 days before admission 0.733–0.747
AKI data 15 days before admission 0.733–0.742
AKI data 30 days before admission 0.732–0.747
S26 [68] AKI within first 48 h 0.716–0.769 PR AUPRC: 0.430–0.479 RNN (LSTM) RNN (GRU) Internal
S32 [69] Late AKI within first 24 h Accuracy: 0.733 CART Geographical
S38 [70] Postoperative AKI 0.740–0.800 Random forest Bayesian model averaging Internal
S39 [71] AKI 0.730–0.890 Random forest SVMs, logistic regression Internal
S40 [72] On admission 0.750–0.800 AKIpredictor Physicians External
First morning 0.890–0.940
First 24 h 0.890–0.950
S41 [73] AKI 0.560–0.920 Accuracy: 0.800–1.00 KNN Only KNN but using different predictors Internal
S42 [74] Any AKI 0.882 Random forest Internal
AKI stage ≥2 0.878
S43 [75] AKI before onset 0.687–0.744 F-measure: 0.261–0.330 Ensemble (logistic regression and random forest) Logistic regression, random forest, naïve Bayes, Bayesian network Internal
AKI within the stay 0.676–0.734 F-measure: 0.253–0.318
AKI within first 30 days 0.720–0.764 F-measure: 0.184–0.316
AKI within first 5 days 0.600–0.764 F-measure: 0.047–0.184
S44 [76] AKI 0.772–0.796 Accuracy: 0.724–0.744 MLP Logistic regression, random forest Internal
S46 [77] AKI 0.550–0.780 Gradient-boosted trees Decision trees, random forest, gradient-boosted trees, SVM, MLP, deep-belief networks Internal
S48 [78] AKI 0.573–0.809 Accuracy: 0.575–0.813 Random forest Preselected random forest comparing it with gradient-boosted trees, bayesian networks, SVM, logistic regression, naïve Bayes, KNN, deep learning (unspecified) Internal
F-measure: 0.628–0.833
0.589–0.809 Accuracy: 0.581–0.813 Random forest + local and global pattern detection Only random forest (using 3 different pattern-detection variants) and last recorded value
F-measure: 0.634–0.833
S49 [79] AKI 0 days ahead F1: 0.745–0.875 KNN AdaBoost, logistic regression, random forest Internal
AKI 1 day ahead F1: 0.686–0.759
AKI 2 days ahead F1: 0.605–0.695
AKI 3 days ahead F1: 0.588–0.654
AKI 4 days ahead F1: 0.590–0.659
AKI 5 days ahead F1: 0.572–0.646
S50 [80] Hospital-acquired AKI 24–96 h ahead 0.552–0.791 Accuracy: 0.648–0.736 Recurrent additive network Logistic regression, SVM Internal
F1: 0.403–0.644
S52 [81] AKI 0.720–0.960 Accuracy: 0.730–0.900 RNN KDIGO External
F1: 0.660–0.900
S53 [82] AKI 0.580–0.824 PR AUPRC: 0.137–0.264 F-GAM Decision trees, logistic regression, random forest, gradient-boosted stumps, SVM, deep learning (unspecified) Internal
S54 [83] Unstructured and structured features 0.660–0.700 RNN Logistic regression, random forest, gradient-boosted trees Internal
Structured features 0.700–0.709
Unstructured features 0.720–0.775
S56 [84] AKI 0.650–0.790 MySurgeryRisk Physicians External
S58 [85] AKI data before admission 0.750 AKIpredictor Internal
AKI data before and on admission 0.770
AKI data before admission and first 24 h 0.800
AKI data before admission and first 24 h and radio-contrast 1 week before 0.820
S59 [86] AKI 0.738–0.988 CNN Decision trees, logistic regression, random forest, RNN Internal
S60 [87] AKI 0.745–0.901 AUPRC: 0.747–0.907 RNN Physicians Internal
Accuracy: 0.711–0.846
F1: 0.673–0.848
S61 [88] AKI 0.690–0.700 SVM Logistic regression, random forest, SVM, KNN, AdaBoost Internal
S62 [89] AKI 24 h ahead 0.530–0.810 ETSM KNN, naïve Bayes Geographical
AKI 48 h ahead 0.520–0.780
S63 [90] AKI Accuracy: 0.845–0.855 Random forest Logistic regression, random forest, SVM, naïve Bayes, decision trees Internal
S64 [91] AKI stage ≥1 0.670–0.720 Gradient-boosted trees Only 1 model Temporal
AKI stage ≥2 0.850–0.860
AKI stage ≥3 0.910–0.920
S65 [92] AKI stage ≥1 0.761 Logistic regression (LASSO) Gradient-boosted trees Internal
AKI stage ≥2 0.818
S66 [93] AKI 0.781–0.843 Ensemble (random forest and gradient-boosted trees) Logistic regression, random forest, SVM, gradient-boosted trees Internal
S67 [94] AKI stage ≥1 within 24 h 0.800 Random forest Internal
AKI stage ≥2 within 24 h 0.760
AKI stage ≥1 within 48 h 0.740
AKI stage ≥2 within 48 h 0.810
AKI stage ≥1 within 72 h 0.770
AKI stage ≥2 within 72 h 0.750
S68 [38] AKI 0.640–0.800 Light gradient machine Internal
AKI stages 0.560–0.710
S70 [95] AKI 0.728–0.755 Bayesian networks Internal
S71 [96] AKI 0.812–0.835 Bayesian networks Internal
S72 [97] AKI 0.682–0.782 Deep rule forest None

AdaBoost: adaptive boosting; AUC: area under the curve; AUPRC: area under the precision-recall curve; CART: classification and regression trees; eICU: eICU Collaborative Research Database; ETSM: ensemble time-series model; F-GAM: factored-generalized additive model; GRU: gated recurrent unit; LSTM: long short-term memory; MLP: multilayer perceptron (feed-forward neural network); PR: AUPRC; sCr: serum creatinine; SGD: stochastic gradient descent; SOFA: sequential organ failure assessment; SVM: support vector machine.

Twenty-three studies (50%) compared the performance of ML models with standard regression models. Logistic regression was used as a comparator in all 23; least-angle regression (LARS) {one study [2%]}, linear regression {one study [2%]}, and multivariate adaptive regression splines (MARS) {one study [2%]} were also used. Logistic regression was the best performer (outperforming support vector machines, random forest, naïve Bayes, an unspecified deep-learning method, linear regression, LARS, stochastic gradient descent, and MARS) in 6 studies (13%) but was outperformed by RNNs, gradient-boosted trees, support vector machines, random forest, an ensemble model, k-nearest neighbors (KNN), a generalized additive model, a factored generalized additive model, feed-forward networks, convolutional neural networks (CNNs), and recurrent additive networks in 20 studies (43%; in 3 studies there were multiple logistic regression models, of which one was outperformed and another was the best performer).

Model validation

The most common methods for internal validation were cross-validation {20 studies [43%]} and the separation of data into a training and a test set {19 studies [41%]}. External validation of the model in a different population was performed in eight studies (17%). Supplementary Figure S2 provides further information.
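For illustration, the sketch below contrasts a single train/test split with stratified k-fold cross-validation using scikit-learn on synthetic data; it is a generic example of the two internal-validation approaches, not the procedure of any particular included study.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

X, y = make_classification(n_samples=1000, weights=[0.85, 0.15], random_state=0)
model = LogisticRegression(max_iter=1000)

# Single split: one AUROC estimate, sensitive to how the split happens to fall
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
model.fit(X_tr, y_tr)
print("split AUROC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))

# 5-fold cross-validation: several estimates, giving a mean and a spread
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print("cross-validated AUROC:", scores.mean(), "+/-", scores.std())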

Critical appraisal

Assessment of bias

Table 2 shows the results of the critical appraisal with PROBAST. The vast majority of studies were judged to be at high risk of bias {39 studies [85%]} because of how the analysis was performed: calibration was not assessed {35 studies [76%]}, and missing data were not optimally handled {21 studies [46%]}. One study (2%) had a risk of bias because of the selection of participants, one (2%) because of the predictors, and three (7%) because of the outcome definition. One study (2%) had an unclear risk of bias for the predictors and one (2%) for the outcome. Concerns about applicability in clinical practice were raised for four studies (8%) because the predictors used by the model were unavailable at the time of prediction. Two studies (4%) showed unclear applicability, one because of the predictors and the other because of the outcome. Two studies (4%) that only externally validated a model were included in the critical appraisal but raised high concerns for applicability because their main goal was to compare model performance with clinicians.

Table 2.

Results of the critical appraisal with PROBAST

Study ROB Applicability Overall
Participants Predictors Outcome Analysis Participants Predictors Outcome ROB Applicability
S01 [54] + + + + + + + + +
S02 [55] + + + + + + + + +
S03 [56] + + + + + + +
S05 [57] + ? ? + ?
S07 [58] + + + + ? + +
S08 [59] + + + + + + +
S09 [38] + + + + + + +
S10 [60] + + + + + + +
S11 [61] + + + + + + +
S12 [62] + + + + + + +
S15 [63] + + + + + + + + +
S18 [64] + + + + + + +
S20 [65] + + + + + + +
S22 [66] + + + + + + +
S23 [67] + + + + + + +
S26 [68] + + ? + ?
S32 [69] + + + + + + +
S38 [70] + + ? + ?
S39 [71] + + + + + + +
S40 [72] + + + + + +
S41 [73] + + + + + + +
S42 [74] + + + + + + +
S43 [75] + + + + + + +
S44 [76] + + + + + + +
S46 [77] + + + + + + + + +
S48 [78] + + + + + + +
S49 [79] + + + + + + +
S50 [80] + + +
S52 [81] + + + + + + +
S53 [82] + + + + + + +
S54 [83] + + + + + + +
S56 [84] + + + + + +
S58 [85] + + + + + +
S59 [86] + + + + + + +
S60 [87] + + + + + + +
S61 [88] + + + + + + +
S62 [89] + + + + + + +
S63 [90] + + + ? + ?
S64 [91] + + + + + + +
S65 [92] + + + + + + +
S66 [93] + + + + + + +
S67 [94] + + + + + + +
S68 [38] + + + + + + +
S70 [95] + + + + + + +
S71 [96] + + + + + + +
S72 [97] + ? + ? ?

The plus symbol (+) indicates low risk of bias (ROB) or low concern for applicability; the minus symbol (−) means high ROB or high concern for applicability; the question mark (?) implies unclear ROB or unclear concern for applicability.

Data pre-processing

Twelve studies (26%) did not specify whether missing data were present or how they were treated, and five (11%) did not use any imputation method. In the studies that did handle missing data, mean and carry-forward imputation were the most common methods {six studies [13%]}. Four studies (8%) applied the multivariate imputation by chained equations (MICE) [42] method (Supplementary Figure S3). Another relevant aspect of model development concerns variable selection. Twelve studies (26%) used all the available variables, eight (16%) used expert opinion to pre-select variables, six (13%) used the least absolute shrinkage and selection operator (LASSO) [43], and five (11%) selected variables based on existing literature. Five studies (11%) did not specify whether variable selection was used. Supplementary Figure S4 contains more details.
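The sketch below illustrates, on synthetic data, two of the steps discussed above: chained-equations imputation (via scikit-learn's IterativeImputer, a MICE-like implementation) and LASSO-based variable selection. It is a generic example, not the workflow of any included study.

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                       # synthetic predictors
y = X[:, 0] - 2 * X[:, 3] + rng.normal(size=500)     # synthetic outcome
X[rng.uniform(size=X.shape) < 0.1] = np.nan          # introduce 10% missing values

X_imputed = IterativeImputer(random_state=0).fit_transform(X)  # chained-equations imputation
lasso = LassoCV(cv=5).fit(X_imputed, y)                        # L1 shrinkage for variable selection
print("variables retained by LASSO:", np.flatnonzero(lasso.coef_ != 0))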

Interpretability

Interpretability reflects the degree to which a human can consistently predict the model's output [44]. Interpretability was rarely addressed {13 studies [28%]}. The most popular method to improve interpretability was providing the variable importance {seven studies [15%]}. Additional methods, used by a single study each (2%), were Shapley Additive Explanations [45], regression coefficients, contributions of variables to the predicted probability, statistical testing and manual evaluation to identify discriminant predictors, predicting future trajectories for clinically relevant biomarkers, and using a more interpretable logistic regression (fewer predictors) alongside the best model.
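As an example of the most common approach, variable importance, the following sketch computes permutation importance (the drop in AUROC when each predictor is shuffled) for a random forest on synthetic data; it is illustrative only and not taken from any reviewed study.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Permutation importance: decrease in AUROC when a predictor is shuffled
result = permutation_importance(rf, X_te, y_te, scoring="roc_auc",
                                n_repeats=20, random_state=0)
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f}")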

Applicability and reproducibility

Thirty studies (65%) were performed in tertiary care hospitals (Supplementary Figure S5). Twenty-eight studies (61%) included data from a single center. The number of study sites ranged from 1 to 1239 and was unspecified in 6 studies (13%). The intensive care unit (ICU) patient population was most frequently studied {15 studies [33%]}, followed by the general hospital, surgery, and cardiac surgery populations {12, 8, and 5 studies, respectively [26%, 17%, 11%]}. The study population size ranged from 50 to 1 841 951 {median, 23 246 [interquartile range: 4485–52 686]}. The duration of follow-up ranged from 24 to 1000 hours and was omitted in 11 studies (24%). The prediction window ranged from the time of admission to 7 days, but eight studies (17%) failed to specify it. Regarding reproducibility, few studies shared code {five studies [11%]} or data {nine studies [19%]}.

DISCUSSION

Findings

We reviewed and critically appraised ML models for the prediction of AKI in terms of performance, methodological soundness, and clinical applicability. Models were mostly developed for the ICU population, followed by the general hospital and (cardiac) surgery populations. Although deep-learning models have emerged since 2014, more traditional, flexible ML methods (random forest and gradient-boosted trees) are still widely used to predict AKI. Prediction models typically include clinical predictors at baseline and, to a lesser extent, repeated measures. Although all studies provided model discrimination, equally important measures of calibration were rare. Most models were not externally validated. Our critical appraisal demonstrated a high risk of bias in the majority of studies, with some concern regarding their applicability in clinical practice.

Performance

Random forest was often the best performing method compared with other models within the same study. RNN demonstrated promising results. The popularity and performance of the simpler, flexible ML models, such as random forest, may indicate that flexible ML methods are sufficiently effective or perhaps better than deep-learning techniques for the type of data and tasks relevant for AKI prediction. Most studies relied on baseline clinical predictors and less so on clinical notes or repeated measures. Choosing the optimal model highly depends on the type of data available. Deep learning is typically beneficial for complex data, as demonstrated by several studies incorporating predictors derived from text or repeated measures. Although the use of deep learning may improve predictive performance in these settings, it comes at the cost of being less interpretable, which may discourage its uptake in clinical practice. Prediction models are inherently uninterpretable from a causal perspective. Interpretability in the context of prediction refers to the explicability of the predictions (i.e. how the model made the prediction) and which predictors contributed the most to the prediction (i.e. variable importance). Although some models are easier to interpret than others, making predictions understandable does not provide any information about the underlying causal mechanisms between predictors and outcome. Inferring causality from prediction models is referred to as the ‘Table 2 fallacy’ [46].

Methodologic soundness

We found a high risk of bias in the majority of studies, mostly because of flaws in the analysis. A common flaw was the lack of model calibration. Although model discrimination was typically assessed, calibration was often overlooked; both, however, should be reported to evaluate model performance [34]. Different tasks call for different performance measures. For example, benchmarking and decision-making based on individual predictions require good calibration, while identifying the most vulnerable patients mainly requires discrimination. The reviewed studies did not explain why they did or did not use specific measures. Another common flaw was the reliance on simple internal validation methods, such as splitting data into training and test sets, without correcting for optimism and overfitting. More reliable methods, such as cross-validation, should be preferred. Similarly, suboptimal methods for dealing with missing data were often used, whereas MICE provides the least biased results [47]. The two main strategies used for variable selection were the inclusion of all available variables and backward-elimination methods. There is no consensus on the best method for variable selection [48], although including all variables can avoid overfitting and selection bias [49], even though this is often impractical [48]. Finally, only two studies relied on prospective data. Although we acknowledge the difficulties associated with collecting data prospectively, retrospective data may not be representative of the patient population and are prone to selection bias, recall bias, and misclassification bias [50].
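One way to correct an apparent performance estimate for optimism is bootstrapping in the spirit of Harrell's approach; the Python sketch below illustrates the idea for AUROC on synthetic data and is not the method of any reviewed study.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=800, weights=[0.85, 0.15], random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)
apparent = roc_auc_score(y, model.predict_proba(X)[:, 1])   # optimistic in-sample AUROC

rng = np.random.default_rng(0)
optimism = []
for _ in range(200):
    idx = rng.integers(0, len(y), len(y))                               # bootstrap resample
    boot = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
    auc_boot = roc_auc_score(y[idx], boot.predict_proba(X[idx])[:, 1])  # AUROC on the bootstrap sample
    auc_orig = roc_auc_score(y, boot.predict_proba(X)[:, 1])            # AUROC on the original sample
    optimism.append(auc_boot - auc_orig)

print("optimism-corrected AUROC:", apparent - np.mean(optimism))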

Clinical applicability

The majority of the studies used data from a single center, implying that the model would be less generalizable to the broader patient population. Although many studies have been performed in the ICU, the MIMIC data set was often used, possibly because MIMIC is publicly available and includes complex data (repeated measures and clinical notes). Although using the same data may foster the comparison of models among studies, prediction results risk being biased toward its specific population and may be less generalizable to the broader ICU population. External validation of models was rare, further limiting the generalizability to other populations.

Reproducible research has become a pressing issue across many scientific disciplines, and sharing data and code is key [47, 51, 52]. The ability to reproduce studies is limited as data and code were usually unavailable. Even when there are commercial concerns about intellectual property, strong arguments exist for ensuring that algorithms are nonproprietary and available for scrutiny [53]. Proprietary algorithms hamper transparency and prevent external validation in different settings by independent researchers.

Challenges and opportunities

The main opportunity that ML offers for the prediction of AKI is that these models allow for a more flexible relationship between the predictors and the outcome than standard regression methods. Flexible ML models can express highly nonlinear relationships between predictors and AKI. Beyond the baseline predictors typically used in most models, deep-learning models are capable of including time-updated measurements of predictors as well as text from clinical notes, with the potential to improve model performance. Deep learning, with its latent representations (e.g. a hidden layer in a neural network), can uncover complex relationships between predictors and outcome, hence improving the prediction. This advantage only materializes if complex relationships exist and if there are sufficient data to reliably estimate the model parameters. Learning such models requires managing their complexity, as they are prone to overfitting.
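As an illustration of how repeated measurements can enter a deep-learning model, the sketch below defines a minimal recurrent network in PyTorch that maps a sequence of, say, hourly measurements to a predicted AKI probability. It is a toy example with made-up dimensions, not the architecture of any included study.

import torch
import torch.nn as nn

class AkiRnn(nn.Module):
    def __init__(self, n_features, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                          # x: (batch, time steps, features)
        _, (h_n, _) = self.lstm(x)                 # last hidden state summarizes the sequence
        return torch.sigmoid(self.head(h_n[-1]))   # predicted AKI probability per patient

model = AkiRnn(n_features=12)
x = torch.randn(4, 48, 12)        # 4 patients, 48 hourly time steps, 12 variables
print(model(x).shape)             # -> torch.Size([4, 1])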

Limitations

Our study has three main limitations. First, although comprehensive, our search strategy may have missed some relevant studies. We selected two sources (PubMed and ArXiv) that should have identified the most significant studies from the medical and ML domains (see Supplementary Section B), but we excluded studies with only standard regression models. Second, assessing the risk of bias entails some subjective judgment, and reviewers with different experience of ML models could have varying perceptions. To limit this effect, 12 studies were reviewed by at least two assessors. Third, PROBAST was designed for regression models, and there are no clear guidelines on how to score some questions (e.g. regarding predictors and sample size) for machine learning and deep-learning models. The upcoming TRIPOD-AI and PROBAST-AI might overcome this limitation [53].

CONCLUSIONS

Relatively simple models, such as random forest and gradient-boosted trees, are still common, although more complex models based on deep learning are emerging, providing opportunities for the inclusion of temporal data and text as predictors. Although deep-learning models have the potential to improve predictions, they are also less interpretable, which may impede uptake in clinical practice—challenges that should be addressed in the future. In accordance with reporting guidelines, we encourage reporting both model discrimination and model calibration. The generalizability of prediction models should be improved through the use of multicenter data during development or external validation. Sharing data and code is encouraged to improve study reproducibility.

Supplementary Material

sfac181_Supplemental_Files

Contributor Information

Iacopo Vagliano, Department of Medical Informatics, Amsterdam UMC, University of Amsterdam, Amsterdam, Amsterdam Public Health Research Institute, Amsterdam, The Netherlands.

Nicholas C Chesnaye, ERA Registry, Department of Medical Informatics, Amsterdam UMC, University of Amsterdam, Amsterdam Public Health Research Institute, Amsterdam, The Netherlands.

Jan Hendrik Leopold, Department of Medical Informatics, Amsterdam UMC, University of Amsterdam, Amsterdam, Amsterdam Public Health Research Institute, Amsterdam, The Netherlands.

Kitty J Jager, ERA Registry, Department of Medical Informatics, Amsterdam UMC, University of Amsterdam, Amsterdam Public Health Research Institute, Amsterdam, The Netherlands.

Ameen Abu-Hanna, Department of Medical Informatics, Amsterdam UMC, University of Amsterdam, Amsterdam, Amsterdam Public Health Research Institute, Amsterdam, The Netherlands.

Martijn C Schut, Department of Medical Informatics, Amsterdam UMC, University of Amsterdam, Amsterdam, Amsterdam Public Health Research Institute, Amsterdam, The Netherlands.

AUTHORS’ CONTRIBUTIONS

I.V. contributed to research idea, study design, methodology, extraction, analysis, and interpretation of data, writing—original draft; N.C.C. contributed to methodology, extraction, analysis, and interpretation of data, writing—original draft; J.H.L. contributed to methodology, extraction and analysis of data, writing—review & editing; A.A.H. contributed to methodology, interpretation of data, writing—review & editing; K.J.J. contributed to interpretation of data, writing—review & editing; M.C.S. contributed to research idea, study design, methodology, interpretation of data, writing—review & editing. Each author read and approved the final manuscript, and accepts accountability for the work by ensuring that questions pertaining to the accuracy or integrity of any portion of the work are appropriately investigated and resolved.

DATA AVAILABILITY STATEMENT

The data underlying this article are available in the article and in its online supplementary material.

CONFLICT OF INTEREST STATEMENT

All the authors declared no competing interests. The results presented in this paper have not been published previously in whole or part, except in abstract format.

REFERENCES

  • 1. Khwaja A. KDIGO clinical practice guidelines for acute kidney injury. Nephron Clin Pract 2013;120:c179–84. 10.1159/000339789 [DOI] [PubMed] [Google Scholar]
  • 2. Jager KJ, Kovesdy C, Langham Ret al. A single number for advocacy and communication-worldwide more than 850 million individuals have kidney diseases. Kidney Int 2019;96:1048–50. 10.1016/j.kint.2019.07.012 [DOI] [PubMed] [Google Scholar]
  • 3. Susantitaphong P, Cruz DN, Cerda Jet al. Acute Kidney Injury Advisory Group of the American Society of Nephrology . World incidence of AKI: a meta-analysis. Clin J Am Soc Nephrol 2013;8:1482–93. 10.2215/CJN.00710113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Mehta RL, Cerdá J, Burdmann EAet al. International Society of Nephrology's 0by25 initiative for acute kidney injury (zero preventable deaths by 2025): a human rights case for nephrology. Lancet 2015;385:2616–43. 10.1016/S0140-6736(15)60126-X [DOI] [PubMed] [Google Scholar]
  • 5. Hoste EAJ, Kellum JA, Selby NMet al. Global epidemiology and outcomes of acute kidney injury. Nat Rev Nephrol 2018;14:607–25. 10.1038/s41581-018-0052-0 [DOI] [PubMed] [Google Scholar]
  • 6. National Confidential Enquiry into Patient Outcome and Death . Adding insult to injury: a review of the care of patients who died in hospital with a primary diagnosis of acute kidney injury (acute renal failure). National Confidential Enquiry into Patient Outcome and Death, 2009 [Google Scholar]
  • 7. Matheny ME, Miller RA, Ikizler TAet al. Development of inpatient risk stratification models of acute kidney injury for use in electronic health records. Med Decis Making 2010;30:639–50. 10.1177/0272989X10364246 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Mehran R, Aymong ED, Nikolsky Eet al. A simple risk score for prediction of contrast-induced nephropathy after percutaneous coronary intervention: development and initial validation. J Am Coll Cardiol 2004;447:1393–9. 10.1016/j.jacc.2004.06.068 [DOI] [PubMed] [Google Scholar]
  • 9. Coorey CP, Sharma A, Mueller Set al. Prediction modelling—part 2: using machine learning strategies to improve transplantation outcomes. Kidney Int 2021;99:817–23. 10.1016/j.kint.2020.08.026 [DOI] [PubMed] [Google Scholar]
  • 10. Au EH, Francis A, Bernier-Jean Aet al. Prediction modelling—part 1: regression modeling. Kidney Int 2020;97:877–84. 10.1016/j.kint.2020.02.007 [DOI] [PubMed] [Google Scholar]
  • 11. Gameiro J, Branco T, Lopes JA.. Artificial intelligence in acute kidney injury risk prediction. J Clin Med 2020;9:678. 10.3390/jcm9030678 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Park S, Lee H.. Acute kidney injury prediction models: current concepts and future strategies. Curr Opin Nephrol Hypertens 2019;28:552–9. 10.1097/MNH.0000000000000536 [DOI] [PubMed] [Google Scholar]
  • 13. Hodgson LE, Selby N, Huang T-Met al. The role of risk prediction models in prevention and management of AKI. Semin Nephrol 2019;39:421–30. 10.1016/j.semnephrol.2019.06.002 [DOI] [PubMed] [Google Scholar]
  • 14. Hodgson LE, Sarnowski A, Roderick PJet al. Systematic review of prognostic prediction models for acute kidney injury (AKI) in general hospital populations. BMJ Open 2017;7:e016591. 10.1136/bmjopen-2017-016591 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15. Pozzoli S, Simonini M, Manunta P.. Predicting acute kidney injury: current status and future challenges. J Nephrol 2018;31:209–23. 10.1007/s40620-017-0416-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Wilson T, Quan S, Cheema Ket al. Risk prediction models for acute kidney injury following major noncardiac surgery: systematic review. Nephrol Dial Transplant 2016;31:231–40. 10.1093/ndt/gfv414 [DOI] [PubMed] [Google Scholar]
  • 17. Allen DW, Ma B, Leung KCet al. Risk prediction models for contrast-induced acute kidney injury accompanying cardiac catheterization: systematic review and meta-analysis. Can J Cardiol 2017;33:724–36. 10.1016/j.cjca.2017.01.018 [DOI] [PubMed] [Google Scholar]
  • 18. Caragata R, Wyssusek KH, Kruger P.. Acute kidney injury following liver transplantation: a systematic review of published predictive models. Anaesth Intensive Care 2016;44:251–61. 10.1177/0310057X1604400212 [DOI] [PubMed] [Google Scholar]
  • 19. Szerlip HM, Chawla LS.. Predicting acute kidney injury prognosis. Curr Opin Nephrol Hypertens 2016;25:226–31. 10.1097/MNH.0000000000000223 [DOI] [PubMed] [Google Scholar]
  • 20. Safari S, Yousefifard M, Hashemi Bet al. The role of scoring systems and urine dipstick in prediction of rhabdomyolysis-induced acute kidney injury: a systematic review. Iran J Kidney Dis 2016;10:101–6. [PubMed] [Google Scholar]
  • 21. Lin X, Yuan J, Zhao Yet al. Urine interleukin-18 in prediction of acute kidney injury: a systemic review and meta-analysis. J Nephrol 2015;28:7–16. 10.1007/s40620-014-0113-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. de Geus HR, Betjes MG, Bakker J.. Biomarkers for the prediction of acute kidney injury: a narrative review on current status and future challenges. Clin Kidney J 2012;5:102–8. 10.1093/ckj.sfs008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Liu X, Guan Y, Xu Set al. Early predictors of acute kidney injury: a narrative review. Kidney Blood Press Res 2016;41:680–700. 10.1159/000447937 [DOI] [PubMed] [Google Scholar]
  • 24. Meisner A, Kerr KF, Thiessen-Philbrook Het al. Methodological issues in current practice may lead to bias in the development of biomarker combinations for predicting acute kidney injury. Kidney Int 2016;89:429–38. 10.1038/ki.2015.283 ISSN 0085-2538 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25. Ho J, Tangri N, Komenda Pet al. Urinary, plasma, and serum biomarkers’ utility for predicting acute kidney injury associated with cardiac surgery in adults: a meta-analysis. Am J Kidney Dis 2015;66:993–1005. 10.1053/j.ajkd.2015.06.018 [DOI] [PubMed] [Google Scholar]
  • 26. Mosa O, Skitek M, Jerin A.. Validity of Klotho, CYR61 and YKL-40 as ideal predictive biomarkers for acute kidney injury: review study. Sao Paulo Med J 2017;135:57–65. 10.1590/1516-3180.2016.0099220516 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Darmon M, Truche AS, Abdel-Nabey Met al. Early recognition of persistent acute kidney injury. Semin Nephrol 2019;39:431–41. 10.1016/j.semnephrol.2019.06.003 ISSN 0270-9295 [DOI] [PubMed] [Google Scholar]
  • 28. Sutherland SM, Goldstein SL, Bagshaw SM.. Acute kidney injury and big data. Contrib Nephrol 2018;193:55–67. 10.1159/000484963 [DOI] [PubMed] [Google Scholar]
  • 29. Song X, Liu X, Liu Fet al. Comparison of machine learning and logistic regression models in predicting acute kidney injury: a systematic review and meta-analysis. Int J Med Informatics 2021;151:104484. 10.1016/j.ijmedinf.2021.104484 [DOI] [PubMed] [Google Scholar]
  • 30. Moher D, Liberati A, Tetzlaff Jet al. PRISMA Group . Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ 2009;339:b2535. 10.1136/bmj.b2535 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Collins GS, Reitsma JB, Altman DGet al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ 2015;350:g7594. 10.1136/bmj.g7594 [DOI] [PubMed] [Google Scholar]
  • 32. Moons KGM, de Groot JAH, Bouwmeester Wet al. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Med 2014;11:e1001744. 10.1371/journal.pmed.1001744 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Moons KGM, Wolff RF, Riley RDet al. PROBAST: a tool to assess risk of bias and applicability of prediction model studies: explanation and elaboration. Ann Intern Med 2019;170:W1–33. 10.7326/M18-1377 [DOI] [PubMed] [Google Scholar]
  • 34. Bellomo R, Ronco C, Kellum JAet al. Acute Dialysis Quality Initiative workgroup . Acute renal failure—definition, outcome measures, animal models, fluid therapy and information technology needs: the Second International Consensus Conference of the Acute Dialysis Quality Initiative (ADQI) Group. Crit Care 2004;8:R204–12. 10.1186/cc2872 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Mehta RL, Kellum JA, Shah SVet al. Acute kidney injury network: report of an initiative to improve outcomes in acute kidney injury, Crit Care 2007;11:R31. 10.1186/cc5713 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Centers for Disease Control and Prevention . International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM). Atlanta: Centers for Disease Control and Prevention, 2002. [Google Scholar]
  • 37. Mohamadlou H, Lynn-Palevsky A, Barton Cet al. Prediction of acute kidney injury with a machine learning algorithm using electronic health record data. Can J Kidney Health Dis 2018;5:205435811877632. 10.1177/2054358118776326 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Lei G, Wang G, Zhang Cet al. Using machine learning to predict acute kidney injury after aortic arch surgery. J Cardiothorac Vasc Anesth 2020;34:3321–8. 10.1053/j.jvca.2020.06.007 [DOI] [PubMed] [Google Scholar]
  • 39. Johnson AEW, Pollard TJ, Shen Let al. MIMIC-III, a freely accessible critical care database. Sci Data 2016;3:160035. 10.1038/sdata.2016.35 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Steyerberg EW. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. New York: Springer, 2009. [Google Scholar]
  • 41. van Buuren S, Groothuis-Oudshoorn K.. MICE: multivariate imputation by chained equations in R. J Stat Softw 2011;45:1–67. 10.18637/jss.v045.i03 [DOI] [Google Scholar]
  • 42. Tibshirani R. Regression shrinkage and selection via the Lasso. J R Stat Soc 1996;58:267–88. [Google Scholar]
  • 43. Kim B, Khanna R, Koyejo O.. Examples are not enough, learn to criticize! Criticism for interpretability. Proceedings of the 30th International Conference on Neural Information Processing Systems, 2016;2288–96. [Google Scholar]
  • 44. Lundberg SM, Lee SI.. A unified approach to interpreting model predictions. Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017;4768288–77. [Google Scholar]
  • 45. Westreich D, Greenland S.. The table 2 fallacy: presenting and interpreting confounder and modifier coefficients. Am J Epidemiol 2013;177:292–8. 10.1093/aje/kws412 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Camerer CF, Dreber A, Holzmeister Fet al. Evaluating the replicability of social science experiments in nature and science between 2010 and 2015. Nat Hum Behav 2018;2:637–44. 10.1038/s41562-018-0399-z [DOI] [PubMed] [Google Scholar]
  • 47. Royston P, Moons KGM, Altman DGet al. Prognosis and prognostic research: developing a prognostic model. BMJ 2009;338:b604. 10.1136/bmj.b604 [DOI] [PubMed] [Google Scholar]
  • 48. Harrell FE Jr. Regression Modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis. New York: Springer, 2001. [Google Scholar]
  • 49. Sauerland S, Lefering R, Neugebauer EAM. Retrospective clinical studies in surgery: potentials and pitfalls. J Hand Surg Br 2002;27:117–21. 10.1054/jhsb.2001.0703 [DOI] [PubMed] [Google Scholar]
  • 50. Ebrahim S, Sohani ZN, Montoya Let al. Reanalyses of randomized clinical trial data. JAMA 2014;312:1024–32. 10.1001/jama.2014.9646 [DOI] [PubMed] [Google Scholar]
  • 51. Wallach JD, Boyack KW, Ioannidis JPA. Reproducible research practices, transparency, and open access data in the biomedical literature, 2015-2017. PLoS Biol 2018;16:e2006930. 10.1371/journal.pbio.2006930 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Van Calster B, Steyerberg EW, Collins GS.. Artificial intelligence algorithms for medical prediction should be nonproprietary and readily available. JAMA Intern Med 2019;179:731. 10.1001/jamainternmed.2019.0597 [DOI] [PubMed] [Google Scholar]
  • 53. Collins GS, Dhiman P, Andaur Navarro CLet al. Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open 2021;11:e048008. 10.1136/bmjopen-2020-048008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54. Tomašev N, Glorot X, Rae JWet al. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature 2019;572:116–9. 10.1038/s41586-019-1390-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55. Sun M, Baron J, Dighe Aet al. Early prediction of acute kidney injury in critical care setting using clinical notes and structured multivariate physiological measurements. Stud Health Technol Inform 2019;264:368–72. 10.3233/SHTI190245 [DOI] [PubMed] [Google Scholar]
  • 56. Pan Z, Du H, Ngiam KYet al. A self-correcting deep learning approach to predict acute conditions in critical care. arXiv:190104364. 10.48550/arXiv.1901.04364 [DOI] [Google Scholar]
  • 57. Parreco J, Chatoor M.. Comparing machine learning algorithms for predicting acute kidney injury. Am Surg 2019;85:725–9. [PubMed] [Google Scholar]
  • 58. Weisenthal SJ, Quill C, Farooq Set al. Predicting acute kidney injury at hospital re-entry using high-dimensional electronic health record data. PLoS One 2018;13:e0204920. 10.1371/journal.pone.0204920 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. Park N, Kang E, Park Met al. Predicting acute kidney injury in cancer patients using heterogeneous and irregular data. PLoS One 2018;13:e0199839. 10.1371/journal.pone.0199839 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Koyner J, Carey K, Edelson Det al. The development of a machine learning inpatient acute kidney injury prediction model. Crit Care Med 2018;46:1070–7. 10.1097/CCM.0000000000003123 [DOI] [PubMed] [Google Scholar]
  • 61. Bihorac A, Ozrazgat-Baslanti T, Ebadi Aet al. MySurgeryRisk: development and validation of a Machine-learning risk algorithm for major complications and death after surgery. Ann Surg 2018;269:652. 10.1097/SLA.0000000000002706 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62. Adhikari L, Ozrazgat-Baslanti T, Ruppert Met al. Improved predictive models for acute kidney injury with IDEA: intraoperative data embedded analytics. PLoS One 2019;14:e0214904. 10.1371/journal.pone.0214904 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Davis SE, Lasko TA, Chen Get al. Calibration drift in regression and machine learning models for acute kidney injury. J Am Med Inform Assoc 2017;24:1052–61. 10.1093/jamia/ocx030 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64. Cronin RM, VanHouten JP, Siew EDet al. National Veterans Health Administration inpatient risk stratification models for hospital-acquired acute kidney injury. J Am Med Inform Assoc 2015;22:1054–71. 10.1093/jamia/ocv051 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Kate RJ, Perez RM, Mazumdar Det al. Prediction and detection models for acute kidney injury in hospitalized older adults. BMC Med Inform Decis Mak 2016;16:39. 10.1186/s12911-016-0277-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66. Thottakkara P, Ozrazgat-Baslanti T, Hupf BBet al. Application of machine learning techniques to high-dimensional clinical data to forecast postoperative complications. PLoS One 2016;11:e0155705. 10.1371/journal.pone.0155705 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67. Cheng P, Waitman LR, Hu Yet al. Predicting inpatient acute kidney injury over different time horizons: how early and accurate? AMIA Annu Symp Proc 2018;2017:565–74. [PMC free article] [PubMed] [Google Scholar]
  • 68. Zhang K, Xue Y, Flores Get al. Modelling EHR timeseries by restricting feature interaction. arXiv 2019. 10.48550/arXiv.1911.06410 [DOI] [Google Scholar]
  • 69. Schneider DF, Dobrowolsky A, Shakir IAet al. Predicting acute kidney injury among burn patients in the 21st century: a classification and regression tree analysis. J Burn Care Res 2012;33:242–51. 10.1097/BCR.0b013e318239cc24 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70. Kerr KF, Morenz ER, Roth Jet al. Developing biomarker panels to predict progression of acute kidney injury after cardiac surgery. Kidney Int Rep 2019;4:1677–88. 10.1016/j.ekir.2019.08.017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71. Zhou C, Wang R, Jiang Wet al. Machine learning for the prediction of acute kidney injury and paraplegia after thoracoabdominal aortic aneurysm repair. J Card Surg 2020;35:89–99. 10.1111/jocs.14317 [DOI] [PubMed] [Google Scholar]
  • 72. Flechet M, Falini S, Bonetti Cet al. Machine learning versus physicians’ prediction of acute kidney injury in critically ill adults: a prospective evaluation of the AKIpredictor. Crit Care 2019;23:282. 10.1186/s13054-019-2563-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73. Tran NK, Sen S, Palmieri TLet al. Artificial intelligence and machine learning for predicting acute kidney injury in severely burned patients: a proof of concept. Burns 2019;45:1350–8. 10.1016/j.burns.2019.03.021 [DOI] [PubMed] [Google Scholar]
  • 74. Chiofolo C, Chbat N, Ghosh Eet al. Automated continuous acute kidney injury prediction and surveillance: a random forest model. Mayo Clin Proc 2019;94:783–92. 10.1016/j.mayocp.2019.02.009 [DOI] [PubMed] [Google Scholar]
  • 75. He J, Hu Y, Zhang Xet al. Multi-perspective predictive modeling for acute kidney injury in general hospital populations using electronic medical records. JAMIA Open 2019;2:115–22. 10.1093/jamiaopen/ooy043 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76. Zimmerman LP, Reyfman PA, Smith ADRet al. Early prediction of acute kidney injury following ICU admission using a multivariate panel of physiological measurements. BMC Med Inf Decis Mak 2019;19:16. 10.1186/s12911-019-0733-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77. Lee HC, Yoon HK, Nam Ket al. Derivation and validation of machine learning approaches to predict acute kidney injury after cardiac surgery. J Clin Med 2018;7:322. 10.3390/jcm7100322 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78. Morid MA, Sheng ORL, Fiol GDet al. Temporal pattern detection to predict adverse events in critical care: case study with acute kidney injury. JMIR Med Inform 2020;8:e14272. 10.2196/14272 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79. Chen YS, Chou CY, Chen ALP.. Early prediction of acquiring acute kidney injury for older inpatients using most effective laboratory test results. BMC Med Inf Decis Mak 2020;20:36. 10.1186/s12911-020-1050-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80. Goodwin TR, Demner-Fushman D.. A customizable deep learning model for nosocomial risk prediction from critical care notes with indirect supervision. J Am Med Inform Assoc 2020;27:567–76. 10.1093/jamia/ocaa004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81. Meyer A, Zverinski D, Pfahringer Bet al. Machine learning for real-time prediction of complications in critical care: a retrospective study. Lancet Respir Med 2018;6:905–14. 10.1016/S2213-2600(18)30300-X [DOI] [PubMed] [Google Scholar]
  • 82. Cui Z, Fritz BA, King CRet al. A factored generalized additive model for clinical decision support in the operating room. AMIA Annu Symp Proc 2020;2019:343–52 [PMC free article] [PubMed] [Google Scholar]
  • 83. Xu Z, Chou J, Zhang XSet al. Identifying sub-phenotypes of acute kidney injury using structured and unstructured electronic health record data with memory networks. J Biomed Inform 2020;102:103361. 10.1016/j.jbi.2019.103361 [DOI] [PubMed] [Google Scholar]
  • 84. Brennan M, Puri S, Ozrazgat-Baslanti Tet al. Comparing clinical judgment with the MySurgeryRisk algorithm for preoperative risk assessment: a pilot study. Surgery 2019;165:1035–45. 10.1016/j.surg.2019.01.002 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85. Flechet M, Güiza F, Schetz Met al. AKIpredictor, an online prognostic calculator for acute kidney injury in adult critically ill patients: development, validation and comparison to serum neutrophil gelatinase-associated lipocalin. Intensive Care Med 2017;43:764–73. 10.1007/s00134-017-4678-3 [DOI] [PubMed] [Google Scholar]
  • 86. Wang Y, Bao J, Du Jet al. Precisely predicting acute kidney injury with convolutional neural network based on electronic health record data. arXiv:200513171. 10.48550/arXiv.2005.13171 [DOI] [Google Scholar]
  • 87. Rank N, Pfahringer B, Kempfert Jet al. Deep-learning-based real-time prediction of acute kidney injury outperforms human predictive performance. NPI J Digit Med 2020;3:139. 10.1038/s41746-020-00346-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88. Al-Jefri M, Lee J, James M.. Predicting acute kidney injury after surgery. In: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine Biology Society (EMBC), 2020, 5606–9. 10.1109/EMBC44109.2020.9175448 [DOI] [PubMed] [Google Scholar]
  • 89. Wang Y, Wei Y, Yang Het al. Utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model. BMC Med Inf Decis Mak 2020;20:238. 10.1186/s12911-020-01245-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90. Li Y, Chen X, Shen Zet al. Prediction models for acute kidney injury in patients with gastrointestinal cancers: a real-world study based on bayesian networks. Ren Fail 2020;42:869–76. 10.1080/0886022X.2020.1810068 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91. Churpek MM, Carey KA, Edelson DPet al. Internal and external validation of a machine learning risk score for acute kidney injury. JAMA Netw Open 2020;3:e2012892. 10.1001/jamanetworkopen.2020.12892 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92. Hsu CN, Liu CL, Tain YLet al. Machine learning model for risk prediction of community-acquired acute kidney injury hospitalization from electronic health records: development and validation study. J Med Internet Res 2020;22:e16903. 10.2196/16903 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93. Tseng PY, Chen YT, Wang CHet al. Prediction of the development of acute kidney injury following cardiac surgery by machine learning. Crit Care 2020;24:478. 10.1186/s13054-020-03179-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94. Martinez DA, Levin SR, Klein EYet al. Early prediction of acute kidney injury in the emergency department with machine-learning methods applied to electronic health record data. Ann Emerg Med 2020;76:501–14. 10.1016/j.annemergmed.2020.05.026 [DOI] [PubMed] [Google Scholar]
  • 95. Li Y, Xu J, Wang Yet al. A novel machine learning algorithm, bayesian networks model, to predict the high-risk patients with cardiac surgery-associated acute kidney injury. Clin Cardiol 2020;43:752–61. 10.1002/clc.23377 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96. Li Y, Chen X, Wang Yet al. Application of group LASSO regression based bayesian networks in risk factors exploration and disease prediction for acute kidney injury in hospitalized patients with hematologic malignancies. BMC Nephrol 2020;21:162. 10.1186/s12882-020-01786-w [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97. Kuo B, Kang Y, Wu Pet al. Discovering drug-drug and drug-disease interactions inducing acute kidney injury using deep rule forests. arXiv:200702103. 10.1109/IRI49571.2020.00062 [DOI] [Google Scholar]
