Journal of the American Medical Informatics Association : JAMIA. 2015 Jun 23;22(5):1054–1071. doi: 10.1093/jamia/ocv051

National Veterans Health Administration inpatient risk stratification models for hospital-acquired acute kidney injury

Robert M Cronin 1,2,3, Jacob P VanHouten 2,4, Edward D Siew 5, Svetlana K Eden 4, Stephan D Fihn 6,7, Christopher D Nielson 6,8, Josh F Peterson 2, Clifton R Baker 6, T Alp Ikizler 5, Theodore Speroff 1,3,4, Michael E Matheny 1,2,3,4
PMCID: PMC5009929  PMID: 26104740

Abstract

Objective: Hospital-acquired acute kidney injury (HA-AKI) is a potentially preventable cause of morbidity and mortality. Identifying high-risk patients prior to the onset of kidney injury is a key step towards AKI prevention.

Materials and Methods: A national retrospective cohort of 1,620,898 patient hospitalizations from 116 Veterans Affairs hospitals was assembled from electronic health record (EHR) data collected from 2003 to 2012. HA-AKI was defined at stage 1+, stage 2+, and dialysis. EHR-based predictors were identified through logistic regression, least absolute shrinkage and selection operator (lasso) regression, and random forests, and pair-wise comparisons between each were made. Calibration and discrimination metrics were calculated using 50 bootstrap iterations. In the final models, we report odds ratios, 95% confidence intervals, and importance rankings for predictor variables to evaluate their significance.

Results: The area under the receiver operating characteristic curve (AUC) for the different model outcomes ranged from 0.746 to 0.758 in stage 1+, 0.714 to 0.720 in stage 2+, and 0.823 to 0.825 in dialysis. Logistic regression had the best AUC in stage 1+ and dialysis. Random forests had the best AUC in stage 2+ but the least favorable calibration plots. Multiple risk factors were significant in our models, including some nonsteroidal anti-inflammatory drugs, blood pressure medications, antibiotics, and intravenous fluids given during the first 48 h of admission.

Conclusions: This study demonstrated that, although all the models tested had good discrimination, performance characteristics varied between methods, and the random forests models did not calibrate as well as the lasso or logistic regression models. In addition, novel modifiable risk factors were explored and found to be significant.

Keywords: risk models, random forest, logistic regression, acute kidney injury

BACKGROUND AND SIGNIFICANCE

Acute kidney injury (AKI) occurs in 1–5% of hospitalized patients and 5–20% of intensive care unit patients.1–3 AKI episodes are typically divided into community-acquired and hospital-acquired categories.4,5 Both of these categories have similar incidences but differ in etiology and prognosis. Inpatient mortality rates for AKI range from 15%, in general ward patients, to >50%, in intensive care unit patients who require dialysis.3,4,6,7 Hospital-acquired AKI (HA-AKI) is associated with significant morbidities, including myocardial infarction, chronic kidney disease, and end-stage renal disease.8

Many risk factors for HA-AKI can be modified and prevented or reduced, if identified in a timely fashion. Examples of strategies that could prevent or reduce HA-AKI risk factors include more timely resuscitation, avoidance of nephrotoxic medications, intravenous (IV) contrast, or better assessment of the risks/benefits of potentially high-risk therapies or procedures.1,9–21 The time right before hospitalization provides a window of opportunity to conduct surveillance and prompt intervention.

Statistical models can improve the patient’s quality of care by predicting adverse outcomes.22–25 Initially, risk prediction models for AKI focused on adverse outcomes following AKI.17 Subsequent models that use AKI as the outcome have been developed for select populations and outcomes, such as rhabdomyolysis, surgery, percutaneous coronary intervention, burns, and lower respiratory tract disease.26–33 Most risk models rely on logistic regression using known clinical predictors; only one of these models uses a machine learning algorithm (eg, random forests).30 Random forests have the ability to bring interactions and relationships among large numbers of variables into the model using an ensemble method. Previously published works showed random forests to be superior to logistic regression.34–36 One single-center study described a logistic regression model run on all inpatient hospitalizations to predict AKI using the Risk, Injury, Failure, Loss, and End-stage Kidney1 classification criteria.37 There are no known models that have been developed to predict HA-AKI within a large national cohort, and there is a lack of evidence contrasting the performance of multiple risk modeling methods, such as regression, and machine learning algorithms, such as random forests, in this clinical domain.

We sought to compare traditional and novel risk modeling methods (logistic regression, lasso regression, and random forests) within a large national Veterans Affairs (VA) electronic health record (EHR)-derived cohort, in order to develop a predictive model for HA-AKI using modifiable risk factors, which could alert clinicians about patients who are more likely to develop AKI and could provide guidance on which clinical therapies and interventions to pursue or avoid for such patients.

MATERIALS AND METHODS

Study Setting and Design

A national retrospective cohort of 6,390,410 patient hospitalizations was collected, including all adult admissions in 116 VA hospitals from January 1, 2003 to December 31, 2012. The VA utilizes an EHR, Computerized Patient Record System (CPRS) (which has been in place since the 1990s38,39), that was able to provide reliable national data for the domains required for this study from 2002 onward.40 This study was approved by the Institutional Review Board and the Research and Development committee of the Tennessee Valley Healthcare System VA.

Data Collection

During the study period, all data were collected from the national Corporate Data Warehouse, which aggregates national data from each VA facility’s Veterans Health Information Systems and Technology Architecture and CPRS instances.38,39,41 Detailed references for data domains and data field availability can be obtained from http://vaww.vinci.med.va.gov/vincicentral/default.aspx. All VA laboratory data were obtained for each patient and linked to their hospitalization record. Diagnoses were obtained from the International Classification of Diseases version 9 (ICD-9) Procedure and Current Procedural Terminology codes. Medication information was obtained from pre-admission medication lists and medication administration structured data. Radiologic studies, such as computerized tomography (CT) scans, were recorded from orders placed in CPRS. We collected data from 365 days prior to the admission date-time stamp (−365 days) up to 9 days after the admission date-time stamp (+9 days). The admission date-time stamp is defined as time zero (Figure 1). Mortality data were collected using the VA Vital Status files, which include data from the National VA benefits program, individual VA facilities, direct family reports, and National Death Index sources.

Figure 1:

Breakdown of time periods and time windows. The pre-index time periods include the pre-admission window, which starts at 365 days prior to the admission date-time stamp up until 24 h prior to the admission date-time stamp, and the admission window, which starts 24 h prior to the admission date-time stamp up until 48 h after the admission date-time stamp. The post-index time period runs from 48 h after the admission date-time stamp until 9 days after the admission date-time stamp. Predictive and outcome variables were each assigned to one of five time periods, designated A to E.

Cohort Exclusion Criteria

We excluded patient hospitalizations that had a length of stay <48 h, because outcomes were ascertained after this window, and those that had a length of stay over 30 days, because these patients were systematically different from the standard length of stay population, and the intent was for these models to be used to help tailor care during the admission window. We also excluded patient admissions that did not have a pre-admission baseline creatinine value, that did not have a creatinine value determined in the first 48 h after the admission date-time stamp, or that did not have at least one creatinine value determined after the first 48 h of hospitalization. We also excluded patients who had undergone dialysis, had had a renal transplant prior to admission, or who experienced community-acquired AKI during the admission window. We excluded hospice patients, defined as patients who were receiving hospice services from −30 days to within +48 h of admission. Finally, we excluded VA centers with low admission volumes, ie, fewer than 100 hospital admissions per year. A summary of patient hospitalization exclusions is shown in Figure 2. The final analysis cohort consisted of 1,620,898 patient admissions among 611,230 patients.

Figure 2:

Summary of the patient cohort and exclusion criteria.

Study Definition of Hospital-Acquired Acute Kidney Injury

All outcomes were determined using creatinine laboratory value data and dialysis procedure codes collected during the post-index time period, defined as a 7-day period in the post-admission time window (+48 h to +9 days) using the different stages of the Kidney Disease: Improving Global Outcomes (KDIGO) classification criteria described in Supplementary Appendix 1. AKI stage 1+ was defined as being in stages 1, 2, or 3 of the KDIGO classification, AKI stage 2+ was defined as being in stage 2 or 3 of the KDIGO classification, and dialysis was defined as acute dialysis, without a prior occurrence of dialysis, during the pre-admission window (–365 d to –24 h) or admission window (–24 h to +48 h of admission).

We defined our baseline creatinine value as the mean outpatient creatinine value from –365 days up to –7 days.42 Community-acquired AKI was calculated using the baseline creatinine value and the maximum creatinine value from between –24 h up until +48 h. HA-AKI was calculated for all patients without community-acquired AKI, using the baseline creatinine value and the maximum creatinine value from +48 h up until +9 days. Mortality was determined by all-cause mortality in the 7 days following the admission window (+48 h to +9 days).
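The two-step outcome logic above (screen for community-acquired AKI in the admission window, then stage HA-AKI in the post-index window) can be sketched as follows. This is only an illustration: the study's exact criteria are in Supplementary Appendix 1, which is not reproduced here, so the thresholds below are the standard KDIGO serum creatinine criteria in mg/dL, applied without the KDIGO timing rules. The function names and return labels are hypothetical.

```python
def kdigo_stage(baseline: float, peak: float, dialysis: bool = False) -> int:
    """Return 0 (no AKI) or a KDIGO creatinine-based stage 1-3.

    Assumption: creatinine is in mg/dL and the standard KDIGO thresholds
    apply; the 48 h / 7 day timing rules of KDIGO are ignored here.
    """
    if baseline <= 0:
        raise ValueError("baseline creatinine must be positive")
    if dialysis:
        return 3
    ratio = peak / baseline
    # Round the absolute rise to avoid floating-point noise around 0.3 mg/dL.
    rise = round(peak - baseline, 2)
    if ratio < 1.5 and rise < 0.3:
        return 0
    if ratio >= 3.0 or peak >= 4.0:
        return 3
    if ratio >= 2.0:
        return 2
    return 1


def classify_hospitalization(baseline: float,
                             admission_max: float,
                             post_index_max: float) -> str:
    """Apply the two-step logic: community-acquired first, then HA-AKI.

    admission_max: maximum creatinine from -24 h to +48 h.
    post_index_max: maximum creatinine from +48 h to +9 days.
    """
    if kdigo_stage(baseline, admission_max) >= 1:
        return "community-acquired AKI (excluded)"
    stage = kdigo_stage(baseline, post_index_max)
    return f"HA-AKI stage {stage}" if stage else "no AKI"
```

In this sketch, a hospitalization counts toward the stage 1+ outcome when the returned stage is 1, 2, or 3, and toward stage 2+ when it is 2 or 3.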

Candidate Risk Factors

All risk factor, inclusion, and exclusion criteria were collected in the pre-index time period, which is any time prior to +48 h of admission (Figure 1). The pre-index time period includes an admission window from –24 h up until +48 h. The window of time prior to the admission date-time stamp (the admission window) was used to include emergency department and outpatient care resulting in direct hospital admission for the inpatient care stay as part of the admission window. Variables recorded in the admission window indicate the most recent state of the patient prior to our outcomes. AKI risk factors were based on KDIGO guidelines and previous literature. If one of the variables was not recorded prior to admission, we imputed the variable with simple imputation of the median value for that variable.
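As a rough illustration of the windowing described above, the helper below maps an observation's offset from the admission date-time stamp (in hours) to the pre-admission, admission, or post-index window. The function name and the choice of which boundary is inclusive are assumptions made for this sketch; the five lettered periods (A to E) of Figure 1 are not modeled.

```python
def time_window(hours_from_admission: float) -> str:
    """Assign an observation time (hours relative to admission, time 0) to a window.

    Assumption: lower bounds are inclusive and upper bounds exclusive,
    except the final +9 day bound, which is treated as inclusive.
    """
    h = hours_from_admission
    pre_admission_start = -365 * 24   # -365 days
    admission_start = -24             # -24 h
    index_time = 48                   # +48 h, end of the pre-index period
    study_end = 9 * 24                # +9 days

    if pre_admission_start <= h < admission_start:
        return "pre-admission"
    if admission_start <= h < index_time:
        return "admission"
    if index_time <= h <= study_end:
        return "post-index"
    return "outside study window"
```

Under these assumptions, predictors would be drawn only from observations labeled "pre-admission" or "admission", and outcomes only from those labeled "post-index".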

Medications, vital signs (including blood pressure), temperature, and body mass index (BMI) were recorded in both the pre-admission and admission windows. All other predictor variables were recorded once in either the pre-admission or admission window. Pre-admission chronic diagnoses were included in the risk prediction models and were defined using administrative condition and procedure codes (see Supplementary Appendix 2) documented from –365 days to –24 h. To account for severity of illness, we included variables used in the Charlson comorbidity index, including disease diagnoses from diagnosis codes as well as stratification by age.43 Mean BMI was calculated from height and weight measurements in the time period from –365 days up until –24 h, for the pre-admission BMI, and during the admission window, for the admission BMI. We looked at an entire year prior to our outcome variables for diagnoses and mean BMI in order to make sure we captured as many patients as possible, assuming patients will see their physician at least once a year. In most cases, BMI does not change rapidly, so we allowed for the inclusion of weight data over a year’s time. The most recent pre-index laboratory test values were extracted within a time window of −120 h (−5 days) until +48 h after admission. We used the most recent pre-index laboratory test to capture the most recent snapshot of the patient prior to our outcomes. We included the patient’s glomerular filtration rate (GFR), a measure of a patient’s kidney function, during both the pre-admission window (pre-admission GFR: –5 days until –24 h) and the admission window (admission GFR: –24 h until +48 h) in the models. We also calculated the change in GFR and the change in hemoglobin over the admission time window. Pre-admission medication exposures were defined as the patient having taken the medication at any time from –90 days to –24 h prior to their hospital admission. 
All data were obtained from outpatient pharmacy fill records, using fill dates and pill counts, and allowing fill gaps of 90 days (because, in the VA, chronic prescriptions will be written for a 90-day supply), which approximates 80% adherence.44 Admission medications were recorded from the bar-coded medication administration records during the admission window. CT scan information was obtained during the admission window. For contrasted studies, we were able to ascertain whether contrast was ordered, but we were unable to confirm delivery of contrast with certainty in all cases. Mean temperatures were calculated from temperature recordings from –90 days up until –24 h, for the pre-admission temperature, and during the admission window, for the admission temperature. Minimum and maximum blood pressures were determined during the admission window. We calculated a blood pressure variable defined as hypotension if the minimum systolic blood pressure was <90 mm Hg and hypertension if the maximum systolic blood pressure was >180 mm Hg.

Risk Prediction Models

Three modeling methods were used to compare HA-AKI predictive performance: logistic regression, least absolute shrinkage and selection operator (lasso) regression, and random forests.45–47 We included the same candidate risk factors as predictor variables in all three methods (see Table 1 for a full list of risk factors). For logistic regression, we used the glm function in R48 to calculate odds ratios (ORs) and 95% confidence intervals (95% CIs) for the predictor variables. Lasso regression can be interpreted as a penalized logistic regression model that enables a sharp penalty on the regression coefficients and allows for variable selection.46 For this reason, we report ORs for each predictor variable. To train and test the lasso regression, we used the glmnet package in R.49 Random forests are ensemble learning methods that create a “forest” of decision trees at training time and output the mode of the classification outputs by the individual decision trees.47 To train the random forests, we used the SAS package HPFOREST50 with 150 trees. We recorded the importance of the variables for a random forest, measured by the decrease in impurity at the nodes that used those variables. We did not adjust for clustering by hospital as a random effect, because we wanted our modeling methods to be comparable, and clustering could not be done for random forests.
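For reference, the relationship between the lasso penalty, variable selection, and the reported ORs can be written out explicitly. The notation below (n hospitalizations, predictors x_i, binary outcome y_i, intercept β_0, coefficients β, penalty weight λ) is introduced here for illustration and is not the authors' own; the objective shown is the one glmnet documents for a binomial model with a pure lasso penalty.

```latex
\begin{aligned}
(\hat\beta_0, \hat\beta) &= \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}\Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
  - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
  + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert, \\[4pt]
\mathrm{OR}_j &= e^{\hat\beta_j}.
\end{aligned}
```

Setting λ = 0 recovers ordinary logistic regression. As λ grows, more coefficients are shrunk exactly to zero (OR_j = 1), which is how the lasso drops predictors from a model.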

Table 1:

Table of all variables used in the models

Discrete variables
Risk factor N (%) Risk factor Pre-admission N (%) Admission N (%)
Demographics Medications
Gender (Male) 1,557,832 (96.11) NSAIDs 321,074 (19.81) 144,007 (8.88)
Race ACEi 642,820 (39.66) 539,033 (33.26)
Am. In. – Alaskan 17,189 (1.06) Acyclovir NA 18,635 (1.15)
Asian-Pac. Island 21,176 (1.31) Aminoglycosides 21,147 (1.30) 28,558 (1.76)
Black 300,500 (18.54) Anhydrase Diuretic 3,513 (0.22) 3,256 (0.20)
Unknown 47,467 (2.93) Antiemetics 79,017 (4.87) 134,324 (8.29)
White 1,234,566 (76.17) AntiFungals 49,602 (3.06) 39,619 (2.44)
AntiTB 7,285 (0.45) 6,902 (0.43)
Diagnoses ARB 108,014 (6.66) 83,808 (5.17)
Alcoholism 331,939 (20.48) Benzodiazepines 231,698 (14.29) 337,596 (20.83)
ALD 66,426 (4.10) Beta Blockers 753,325 (46.48) 795,089 (49.05)
Anemia 445,133 (27.46) CCB 402,007 (24.80) 332,528 (20.52)
Cancer 418,399 (25.81) Cephalosporins 92,969 (5.74) 307,947 (19.00)
CDVD 501,793 (30.96) Cimetidine NA 2,579 (0.16)
CHF 331,600 (20.46) Cyclosporine NA 3,151 (0.19)
COPD 555,156 (34.25) Fluoroquinolones 166,135 (10.25) 119,340 (7.36)
CVA 283,175 (17.47) Glucocorticoids 209,578 (12.93) 229,212 (14.14)
DM 651,663 (40.20) Insulin 229,855 (14.18) 467,939 (28.87)
Dyslipidemia 935,340 (57.71) K-Sparing Diuretics 131,195 (8.09) 98,925 (6.10)
Hepatitis 164,641 (10.16) Lincomycin 33,769 (2.08) 35,151 (2.17)
HIV 21,542 (1.33) Lithium NA 13,433 (0.83)
HTN 1,221,391 (75.35) Loop Diuretics 408,328 (25.19) 424,737 (26.20)
MVR 48,259 (2.98) Macrolides 108,658 (6.70) 107,093 (6.61)
PVD 316,570 (19.53) MAOI 254 (0.02) 140 (0.01)
Dementia 95,795 (5.91) Nacetylcysteine NA 51,113 (3.15)
RA 50,017 (3.09) Nitrofurantoin 15,603 (0.96) 3,883 (0.24)
PUD 95,404 (5.89) Opioids 869,296 (53.63) 1,010,508 (62.34)
Hemiplegia 68,459 (4.22) Penicillins 155,254 (9.58) 249,273 (15.38)
Statins 770,802 (47.55) 691,968 (42.69)
Other Sulfa Antibiotics 85,280 (5.26) 28,324 (1.75)
CT Scan +Contrast 117,750 (7.26) TCA 87,981 (5.43) 57,823 (3.57)
CT Scan –Contrast 245,623 (15.15) Tetracyclines 61,671 (3.80) 27,588 (1.70)
Hypertension 165,469 (10.21) Thiazides 259,583 (16.01) 152,820 (9.43)
Hypotension 133,712 (8.25) Trimethoprim NA 21,848 (1.35)
Outcomes Vancomycin NA 195,346 (12.05)
AKI: Stage 1+ 128,457 (7.93)
AKI: Stage 2+ 15,684 (0.97)
Dialysis 1,940 (0.12)
Continuous variables
Risk factor Median (IQR) Missing (%) Risk factor Median (IQR) Missing (%)
Demographics Labs
Admission Age 65 (58–77) 0.00 Direct Bilirubin 0.2 (0.1–0.3) 75.65
Other GGT 47 (25–128) 94.44
Pre-Admit Mean BMI 27.6 (23.9–32.1) 4.02 Glucose 116 (97–151) 0.81
Admit Mean BMI 27.1 (23.2–31.7) 27.15 Hematocrit 35.7 (31.3–39.9) 0.53
Pre-Admit Max. Temp. 98.6 (98–99.3) 14.00 Hemoglobin 12 (10.4–13.4) 1.18
Admit Max. Temp. 98.8 (98.3–99.7) 2.44 Delta Hemoglobin 0 (0–11.6) 1.20
NS IVF 0 (0–0.73) 0.00 Lipase 34 (21–88) 83.05
1/2 NS IVF 0 (0–0.24) 0.00 MCH 30.5 (29–32) 1.17
LR IVF 0 (0–0.06) 0.00 MCHC 33.7 (32.9–34.3) 0.71
Water IVF 0 (0–0.19) 0.00 MCV 90.6 (86.7–94.5) 0.70
Mean Pre-Admit GFR 69.8 (54.7–71.4) 0.00
Labs Pre-Admit GFR Count 4 (2–7) 0.00
Albumin 3.4 (2.9–3.9) 25.32 SD Pre-Admit GFR 8.3 (5–12.9) 29.92
Alkaline Phosphatase 82 (64–109) 25.56 Mean Admit GFR 78.2 (61.1–97.8) 1.20
ALT 23 (16–37) 26.09 SD Admit GFR 6.8 (3.3–11.6) 15.46
Ammonia 37.8 (24–61) 96.72 Admit GFR Count 3 (2–3) 1.20
AST 25 (19–38) 26.97 Delta Admit GFR 0 (0–11.6) 1.20
Bicarbonate 26 (24–29) 0.27 Platelets 205 (155–267) 0.92
BNP 284 (87–896) 79.35 Sodium 138 (135–140) 0.13
BUN 15 (11–21) 5.72 Total Bilirubin 0.7 (0.4–1) 25.66
Calcium 8.7 (8.3–9.1) 6.50 Troponin-I 0 (0–0.1) 63.12
Chloride 103 (100–106) 0.30 Troponin-T 0 (0–0) 94.74
CK 85 (48–168) 66.91 WBC 8.1 (6.1–10.7) 0.70
CK-MB 2.6 (1.5–4.6) 77.79

Discrete variables including demographics, chronic diagnoses, medication rates, radiology tests, and outcomes of the analysis cohort. The columns represent the number of hospitalizations where each variable was present and the percentage of hospitalizations with each variable present. Continuous variables include demographics, laboratory tests, vital signs, body mass index (BMI), temperatures, and intravenous fluids (IVF). The columns represent the median, inter-quartile range (IQR), and the percentage of missing values.

NA, not available; CCB, calcium channel blocker; ARB, angiotensin II receptor blocker; ACEi, angiotensin converting enzyme inhibitor; TB, tuberculosis; MAOI, monoamine oxidase inhibitor; TCA, tricyclic antidepressants; CHF, congestive heart failure; DM, diabetes mellitus; HTN, hypertension; PVD, peripheral vascular disease; ALD, advanced liver disease; CVA, cerebrovascular accident; CDVD, cardiovascular disease; COPD, chronic obstructive pulmonary disease; MVR, mitral valve regurgitation; RA, rheumatoid arthritis; PUD, peptic ulcer disease; BUN, blood urea nitrogen; CK, creatine kinase; CK-MB, creatine kinase-MB isoenzyme; BNP, B-type natriuretic peptide; WBC, white blood cell count; MCV, mean corpuscular volume; MCH, mean corpuscular hemoglobin; MCHC, mean corpuscular hemoglobin concentration; AST, aspartate aminotransferase; ALT, alanine aminotransferase; GGT, gamma-glutamyl transpeptidase; NS, normal saline; LR, lactated Ringer's; GFR, glomerular filtration rate; SD, standard deviation; NSAIDs, non-steroidal anti-inflammatory drugs; CT, computerized tomography; HIV, human immunodeficiency virus; AKI, acute kidney injury.

To assess whether the models could accurately predict AKI with a smaller number of variables, we fit a heavily penalized lasso regression (ie, with a restrictive lambda) to create a parsimonious model that retained only six predictive variables.

Statistical Analysis

These models were internally validated using bootstrapping, with the process of training and testing models repeated 50 times. For each iteration, the training set was created by sampling with replacement from the entire dataset.51,52 The size of the training set was the same size as the entire dataset, but some hospitalizations were represented multiple times and some were not represented at all. The test set consisted of the remaining hospitalizations that were not chosen in the bootstrapping with replacement training set. In each bootstrap iteration, model discrimination was evaluated using the area under the receiver operating characteristic curve (AUC),53 integrated discrimination improvement (IDI),54 and continuous net reclassification index (NRI).54 The Brier score55 was calculated for calibration assessment. For the purpose of reporting the effects of the risk factors included in the model, we computed a final model for each method. The final models were created using the entire dataset for point estimates of ORs, variable importance, and 95% CIs. CIs and P-values cannot be obtained directly from lasso regression models, although some work has been done to approximate these CIs, most often by means of the bootstrap. Because our final models were built with the complete training set, we presented only the point estimates for the lasso-penalized coefficients and not the bootstrap CIs.56 We created observed to expected (O/E) ratio plots to assess calibration for the final models with each outcome with the val.prob.ci R code.57 Logistic regression is the only model that allows for P-values and CIs; therefore, we used a Bonferroni corrected significance threshold for 124 predictor variables for each of our three outcomes, which yielded a P-value of 1.34 × 10⁻⁴ (P = 0.05/372). While this is a conservative adjustment strategy, risk factors that remain significant at this level are undisputedly associated with the outcome. 
To consider severity of illness and AKI’s relationship with mortality, we performed a sensitivity analysis calculating mortality rates among ranges of Charlson comorbidity index scores and AKI stages (see Supplementary Appendix 4).
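The bootstrap validation scheme described above (train on a with-replacement sample of the full dataset, test on the hospitalizations that were never drawn) can be sketched in plain Python. This is a minimal illustration, not the study's code: `fit` is a hypothetical callable that trains any model and returns a function mapping one row to a predicted probability, the pairwise AUC is quadratic in sample size and suitable only for small examples, and IDI and NRI are omitted.

```python
import random
from typing import Callable, List, Sequence, Tuple


def bootstrap_split(n: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Draw n training indices with replacement; test set = indices never drawn."""
    train = [rng.randrange(n) for _ in range(n)]
    drawn = set(train)
    test = [i for i in range(n) if i not in drawn]
    return train, test


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def bootstrap_validate(X: Sequence, y: Sequence[int],
                       fit: Callable, n_iter: int = 50,
                       seed: int = 0) -> List[Tuple[float, float]]:
    """Repeat train-on-bootstrap / test-on-remainder; return (AUC, Brier) per iteration."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        train, test = bootstrap_split(len(y), rng)
        predict = fit([X[i] for i in train], [y[i] for i in train])
        truth = [y[i] for i in test]
        probs = [predict(X[i]) for i in test]
        results.append((auc(truth, probs), brier_score(truth, probs)))
    return results
```

On average about 63% of distinct hospitalizations land in each training sample, so roughly 37% remain for testing in each iteration.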

RESULTS

A summary of patient demographic factors, outpatient and inpatient medication rates, laboratory test ordering rates, radiology tests, intravenous fluids (IVF) administration, and outcomes is presented in Table 1. Approximately 9% (9.02%) of patients experienced HA-AKI. Of the hospitalization instances in the analysis, 7.93% were classified as stage 1+, 0.97% were classified as stage 2+, and 0.12% were classified as dialysis. Males represented 96.11% of the population, with a median age of 65. White patients accounted for the majority of hospital admissions (76.17%).

Logistic regression and lasso regression final models for stage 1+, 2+, and acute dialysis are presented in Table 2. Because lasso regression is a penalized regression that utilizes variable selection, certain predictor variables were removed from the model and therefore were not represented in the final model. Lasso regression removed 12 predictor variables from the stage 1+ outcome, 17 predictor variables from the stage 2+ outcome, and 20 predictor variables from the dialysis outcome. Logistic regression predictor variables that were significant with Bonferroni corrected P-values for all three outcomes included the following admission medications: benzodiazepines and vancomycin; the following labs: elevated sodium, high blood urea nitrogen (BUN), and total bilirubin; as well as low chloride, calcium, bicarbonate, and mean admission GFR. These regression variables also had ORs > 1.00 in the lasso regression, but CIs could not be calculated with this method. Half-normal saline (1/2 NS) and lactated Ringer's (LR) were associated with lower AKI rates for stage 1+ (OR: 0.92–0.98), but not for stage 2+ (OR: 0.93–1.05) or dialysis (OR: 0.81–1.08).
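To make the Bonferroni screening concrete, the sketch below back-calculates an approximate Wald P-value from a published OR and its 95% CI and compares it with the corrected threshold of 0.05/372. This is an approximation for illustration only, since it works from the rounded values printed in Table 2 rather than from the fitted models; the function names are hypothetical.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def wald_p_from_or(or_point: float, ci_low: float, ci_high: float,
                   z_crit: float = 1.959964) -> float:
    """Approximate two-sided P-value from an OR and its 95% CI.

    Assumes the CI is symmetric on the log-odds scale (a Wald interval),
    so SE = (ln(upper) - ln(lower)) / (2 * z_crit).
    """
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided normal tail probability.
    return math.erfc(abs(z) / math.sqrt(2))


# 124 predictors x 3 outcomes = 372 tests
threshold = bonferroni_threshold(0.05, 124 * 3)

# Values copied from the stage 2+ logistic regression column of Table 2.
p_vancomycin = wald_p_from_or(1.84, 1.76, 1.92)       # admission vancomycin
p_aminoglycoside = wald_p_from_or(1.15, 1.00, 1.32)   # pre-admission aminoglycosides

print(f"threshold = {threshold:.3g}")
print(f"vancomycin passes: {p_vancomycin < threshold}")
print(f"aminoglycosides pass: {p_aminoglycoside < threshold}")
```

The second example shows why the correction matters: a CI whose lower bound only touches 1.00 is borderline at the conventional 0.05 level but falls well short of the corrected threshold.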

Table 2:

Final models of logistic regression and lasso regression.

Risk factor Logistic regression Lasso regression
Stage 1+ Stage 2+ Dialysis Stage 1+ Stage 2+ Dialysis
OR (95% CI) OR (95% CI) OR (95% CI) OR OR OR
(Intercept) 23.03 (7.88–67.35) 0.32 (0.02–4.19) 11.12 (0.01–9540) 74.58 0.31 689
Demographics
 Admit Age 1.00 (1.00–1.00) 1.01 (1.00–1.01) 0.97 (0.96–0.97) 1.00 1.01 0.97
 Gender (Male) 1.27 (1.23–1.32) 0.94 (0.86–1.03) 2.00 (1.42–2.81) 1.25 0.96 1.87
 Race (White) 0.97 (0.93–1.00) 0.89 (0.81–0.97) 0.82 (0.63–1.08) 0.96 0.90 0.83
 Race (Black) 1.78 (1.71–1.85) 1.35 (1.23–1.49) 1.10 (0.83–1.45) 1.74 1.34 1.08
 Race (Asian-Pac. Islander) 1.02 (0.96–1.09) 0.99 (0.84–1.16) 1.06 (0.69–1.62) 1.01 1.03
 Race (Am. In. - Alaskan) 1.00 (0.93–1.07) 0.97 (0.81–1.15) 0.88 (0.52–1.47) 0.93
Medications
Pre-admission
  NSAIDs 0.98 (0.96–1.00) 0.99 (0.94–1.03) 0.95 (0.81–1.11) 0.99 1.00 0.96
  Aminoglycosides 1.00 (0.94–1.06) 1.15 (1.00–1.32) 1.03 (0.66–1.59) 1.13
  Cephalosporins 0.98 (0.95–1.00) 0.97 (0.9–1.03) 0.95 (0.78–1.14) 0.98 0.98 0.96
  CCB 1.06 (1.04–1.08) 1.04 (0.99–1.1) 1.13 (0.99–1.28) 1.06 1.04 1.13
  Penicillins 0.98 (0.96–1.00) 0.94 (0.89–0.99) 0.90 (0.77–1.05) 0.99 0.94 0.91
  β-Blockers 0.97 (0.95–0.98) 1.01 (0.96–1.05) 0.99 (0.87–1.13) 0.97 1.00
  ARB 1.08 (1.04–1.11) 1.20 (1.09–1.33) 1.01 (0.83–1.24) 1.07 1.19
  ACEi 1.06 (1.04–1.07) 1.15 (1.1–1.2) 0.92 (0.82–1.03) 1.05 1.14 0.93
  AntiTB 0.91 (0.82–1.01) 0.78 (0.59–1.03) 0.80 (0.37–1.74) 0.95 0.82 0.92
  AntiFungals 0.99 (0.95–1.03) 1.04 (0.95–1.14) 0.88 (0.65–1.19) 1.02 0.93
  Glucocorticoids 1.01 (0.99–1.03) 0.99 (0.94–1.05) 0.85 (0.73–1.00) 0.87
  Lincomycin 1.01 (0.96–1.05) 0.95 (0.86–1.06) 1.23 (0.95–1.6) 0.97 1.19
  Macrolides 1.01 (0.99–1.04) 1.02 (0.95–1.09) 1.01 (0.83–1.23) 1.00
  MAOI 0.34 (0.13–0.87) 1.28 (0.22–7.4) 0.00 (0–8.10E+105) 0.69
  Nitrofurantoin 0.95 (0.89–1.02) 0.82 (0.69–0.98) 0.47 (0.23–0.95) 0.96 0.86 0.52
  Sulfa Antibiotics 0.87 (0.84–0.89) 0.91 (0.84–0.98) 0.82 (0.66–1.02) 0.87 0.92 0.84
  Tetracyclines 0.98 (0.95–1.01) 1.02 (0.93–1.1) 0.90 (0.7–1.16) 0.99 0.92
  Thiazides 0.91 (0.89–0.93) 0.93 (0.88–0.98) 0.86 (0.75–0.98) 0.92 0.95 0.88
  Loop Diuretics 0.96 (0.94–0.98) 1.02 (0.97–1.07) 1.07 (0.94–1.22) 0.97 1.01 1.05
  Anhydrase Diuretic 0.85 (0.74–0.97) 0.85 (0.57–1.26) 0.42 (0.1–1.8) 0.90 0.96 0.50
  K-Sparing Diuretics 0.96 (0.93–0.98) 1.12 (1.04–1.2) 0.88 (0.73–1.05) 0.97 1.11 0.90
  Benzodiazepines 0.91 (0.89–0.93) 0.92 (0.87–0.96) 0.81 (0.69–0.94) 0.92 0.93 0.83
  TCA 0.97 (0.94–1.01) 1.05 (0.95–1.15) 0.93 (0.69–1.24) 1.00 1.04 0.94
  Statins 1.00 (0.99–1.02) 1.06 (1.00–1.11) 1.11 (0.97–1.27) 1.03 1.05
  Insulin 1.11 (1.08–1.13) 1.00 (0.95–1.06) 1.05 (0.92–1.2) 1.10 1.00 1.05
  Fluoroquinolones 0.99 (0.97–1.01) 1.00 (0.95–1.06) 1.03 (0.89–1.19) 0.99 1.01
  Antiemetics 1.00 (0.97–1.04) 1.04 (0.97–1.12) 0.94 (0.74–1.19) 1.03 0.99
  Opioids 0.93 (0.92–0.94) 0.98 (0.94–1.01) 0.96 (0.86–1.06) 0.93 0.98 0.96
Admission
  NSAIDs 1.08 (1.06–1.11) 1.10 (1.04–1.17) 1.00 (0.8–1.26) 1.07 1.08
  Aminoglycosides 1.42 (1.36–1.48) 1.30 (1.17–1.43) 1.51 (1.11–2.06) 1.40 1.28 1.48
  Cephalosporins 0.90 (0.89–0.92) 0.97 (0.93–1.01) 0.95 (0.83–1.08) 0.90 0.98 0.97
  CCB 1.09 (1.07–1.11) 1.12 (1.06–1.18) 1.18 (1.04–1.35) 1.09 1.11 1.17
  Penicillins 1.10 (1.08–1.12) 1.23 (1.18–1.29) 1.09 (0.95–1.24) 1.09 1.24 1.10
  β–Blockers 1.13 (1.11–1.15) 1.05 (1.00–1.1) 1.09 (0.96–1.23) 1.12 1.04 1.06
  ARB 1.14 (1.09–1.18) 1.05 (0.94–1.17) 0.83 (0.65–1.06) 1.13 1.04 0.86
  ACEi 1.24 (1.22–1.26) 1.30 (1.24–1.36) 0.73 (0.64–0.83) 1.24 1.29 0.73
  AntiTB 1.11 (1.00–1.23) 1.00 (0.78–1.3) 1.09 (0.52–2.27) 1.06
  AntiFungals 1.20 (1.15–1.25) 1.07 (0.97–1.18) 1.27 (0.96–1.68) 1.18 1.07 1.22
  Glucocorticoids 0.76 (0.74–0.77) 0.68 (0.64–0.72) 0.87 (0.74–1.02) 0.77 0.69 0.89
  Lincomycin 1.03 (0.99–1.08) 0.97 (0.88–1.08) 0.87 (0.62–1.22) 1.02 1.00 0.92
  Macrolides 0.90 (0.87–0.92) 0.96 (0.9–1.03) 1.07 (0.88–1.31) 0.90 0.98 1.03
  MAOI 2.58 (1.00–6.65) 0.75 (0.06–8.93) 0.00 (0–1.93E+132) 1.24
  Nitrofurantoin 0.88 (0.77–1.01) 1.08 (0.77–1.5) 0.39 (0.05–2.84) 0.91 0.59
  Sulfa Antibiotics 2.24 (2.08–2.4) 1.58 (1.31–1.89) 0.57 (0.26–1.25) 2.15 1.42 0.84
  Tetracyclines 0.79 (0.75–0.83) 0.76 (0.66–0.88) 0.98 (0.67–1.42) 0.80 0.79
  Thiazides 1.56 (1.53–1.6) 1.39 (1.31–1.48) 1.12 (0.94–1.32) 1.55 1.36 1.07
  Loop Diuretics 1.65 (1.62–1.68) 1.31 (1.25–1.37) 0.98 (0.86–1.11) 1.65 1.31
  Anhydrase Diuretic 1.46 (1.29–1.65) 1.17 (0.81–1.67) 0.85 (0.25–2.81) 1.39 1.04 0.96
  K-Sparing Diuretics 1.25 (1.22–1.29) 1.00 (0.92–1.08) 0.87 (0.7–1.08) 1.24 1.00 0.89
  Benzodiazepines 1.17 (1.15–1.19) 1.23 (1.18–1.28) 1.31 (1.15–1.49) 1.16 1.21 1.27
  TCA 1.12 (1.07–1.17) 1.04 (0.93–1.17) 0.89 (0.62–1.28) 1.08 1.03 0.91
  Statins 1.01 (0.99–1.02) 0.91 (0.87–0.96) 0.92 (0.81–1.05) 1.01 0.93 0.95
  Insulin 1.05 (1.03–1.07) 1.01 (0.96–1.06) 1.05 (0.91–1.21) 1.05 1.00 1.04
  Fluoroquinolones 1.09 (1.06–1.11) 1.07 (1.02–1.14) 0.79 (0.66–0.94) 1.08 1.07 0.82
  Antiemetics 1.15 (1.12–1.18) 1.23 (1.16–1.31) 1.11 (0.93–1.33) 1.14 1.22 1.07
  Opioids 1.16 (1.15–1.18) 1.29 (1.24–1.34) 0.99 (0.89–1.1) 1.15 1.27 1.00
  Cyclosporine 1.26 (1.13–1.41) 0.84 (0.59–1.21) 0.93 (0.5–1.71) 1.23 0.91 0.98
  Trimethoprim 0.96 (0.89–1.04) 0.89 (0.72–1.09) 1.52 (0.66–3.49)
  Cimetidine 1.37 (1.2–1.56) 1.10 (0.74–1.64) 0.73 (0.18–2.96) 1.33 1.01 0.90
  Nacetylcysteine 1.21 (1.18–1.25) 1.17 (1.07–1.27) 1.06 (0.87–1.3) 1.21 1.15 1.04
  Acyclovir 1.04 (0.98–1.1) 1.59 (1.41–1.79) 1.07 (0.72–1.6) 1.03 1.55
  Vancomycin 1.37 (1.34–1.39) 1.84 (1.76–1.92) 1.46 (1.28–1.68) 1.36 1.83 1.46
  Lithium 1.11 (1.03–1.19) 1.18 (0.96–1.46) 0.84 (0.35–2.03) 1.07 1.11 1.00
Diagnoses
  CHF 1.00 (0.98–1.01) 0.92 (0.88–0.97) 0.90 (0.8–1.02) 0.94 0.92
  DM 1.04 (1.02–1.05) 1.05 (1.01–1.1) 1.13 (0.99–1.29) 1.04 1.05 1.12
  HTN 1.06 (1.04–1.07) 1.06 (1.01–1.11) 1.04 (0.88–1.22) 1.05 1.05 1.00
  PVD 1.05 (1.03–1.06) 1.01 (0.97–1.05) 1.19 (1.07–1.32) 1.05 1.17
  ALD 1.07 (1.03–1.1) 1.16 (1.06–1.25) 1.09 (0.86–1.39) 1.05 1.15 1.10
  Cancer 1.06 (1.04–1.07) 1.21 (1.17–1.26) 1.00 (0.9–1.12) 1.05 1.21
  CVA 1.03 (1.01–1.05) 1.05 (1.00–1.1) 1.01 (0.89–1.15) 1.02 1.03
  Alcoholism 0.95 (0.94–0.97) 0.94 (0.9–0.99) 0.91 (0.78–1.05) 0.96 0.95 0.94
  HIV 1.26 (1.19–1.33) 1.27 (1.11–1.44) 0.87 (0.57–1.33) 1.23 1.23 0.94
  Hepatitis 0.98 (0.96–1.01) 0.96 (0.9–1.01) 1.22 (1.04–1.43) 0.99 0.98 1.22
  Anemia 0.94 (0.93–0.96) 0.93 (0.89–0.96) 1.29 (1.16–1.43) 0.95 0.93 1.28
  CDVD 0.99 (0.97–1.00) 0.96 (0.92–1.00) 0.95 (0.85–1.07) 0.99 0.97 0.97
  COPD 0.97 (0.96–0.99) 0.98 (0.95–1.02) 0.96 (0.87–1.07) 0.98 0.99 0.97
  Dyslipidemia 0.93 (0.91–0.94) 0.92 (0.89–0.96) 0.91 (0.81–1.02) 0.93 0.93 0.93
  MVR 0.95 (0.92–0.98) 0.96 (0.87–1.05) 1.13 (0.93–1.38) 0.95 0.97 1.10
  Dementia 0.96 (0.93–0.98) 1.01 (0.94–1.09) 0.65 (0.51–0.82) 0.97 1.00 0.67
  RA 1.07 (1.04–1.11) 1.14 (1.04–1.24) 1.18 (0.92–1.5) 1.06 1.11 1.13
  PUD 0.98 (0.95–1.00) 0.94 (0.88–1.00) 0.80 (0.65–0.97) 0.98 0.95 0.82
  Hemiplegia 0.98 (0.95–1.02) 0.90 (0.83–0.97) 1.05 (0.83–1.33) 1.00 0.92 1.02
Labs
 Sodium 1.02 (1.02–1.02) 1.03 (1.03–1.04) 1.03 (1.02–1.05) 1.02 1.03 1.02
 Chloride 0.97 (0.96–0.97) 0.94 (0.93–0.94) 0.92 (0.91–0.93) 0.97 0.94 0.93
 Bicarbonate 0.96 (0.96–0.96) 0.94 (0.93–0.94) 0.94 (0.93–0.96) 0.96 0.94 0.95
 Calcium 0.96 (0.95–0.96) 0.95 (0.92–0.97) 0.83 (0.78–0.87) 0.96 0.94 0.82
 BUN 1.01 (1.01–1.01) 1.02 (1.02–1.02) 1.02 (1.02–1.02) 1.01 1.02 1.02
 Glucose 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 1.00
 Troponin-I 1.00 (1.00–1.00) 1.00 (1.00–1.01) 1.00 (1.00–1.01) 1.00 1.00 1.00
 Troponin-T 1.05 (1.03–1.08) 1.04 (0.99–1.09) 0.73 (0.43–1.23) 1.05 1.03 0.83
 CK-MB 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 1.00 1.00
 CK 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00)
 BNP 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00)
 Hemoglobin 0.97 (0.97–0.98) 0.97 (0.95–0.99) 1.01 (0.97–1.05) 0.97 0.98
 Delta Hemoglobin 1.03 (1.03–1.04) 1.04 (1.03–1.05) 1.02 (0.99–1.06) 1.03 1.03 1.03
 Hematocrit 1.00 (0.99–1.00) 1.00 (1.00–1.01) 0.96 (0.95–0.98) 1.00 0.97
 WBC 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00)
 Platelets 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00
 MCV 1.02 (1.01–1.03) 1.01 (0.99–1.03) 1.07 (1.03–1.12) 1.00 1.00 1.01
 MCHC 1.01 (0.99–1.04) 0.96 (0.91–1.01) 1.03 (0.91–1.17) 0.97 0.96 0.89
 MCH 0.95 (0.93–0.97) 1.00 (0.94–1.05) 0.85 (0.74–0.96)
 Albumin 1.00 (1.00–1.00) 0.77 (0.75–0.79) 0.76 (0.7–0.82) 0.82 0.84
 AST 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 1.00
 ALT 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 1.00
 Direct Bilirubin 0.99 (0.98–1.00) 0.99 (0.97–1.00) 0.97 (0.92–1.03) 1.00 0.99 0.99
 Total Bilirubin 1.08 (1.07–1.09) 1.11 (1.09–1.12) 1.08 (1.04–1.12) 1.08 1.11 1.08
 Alkaline Phosphatase 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00)
 GGT 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00)
 Ammonia 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 1.00 1.00
 Lipase 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00)
 Mean Pre-Admit GFR 1.04 (1.04–1.04) 1.02 (1.02–1.02) 0.98 (0.97–0.98) 1.04 1.02 0.97
 Pre-Admit GFR Count 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (0.99–1.00) 1.00 1.00 1.00
 SD Pre-Admit GFR 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.01 (1.00–1.01) 1.00 1.01
 Mean Admit GFR 0.96 (0.95–0.96) 0.99 (0.99–0.99) 0.97 (0.97–0.98) 0.96 0.99 0.97
 SD Admit GFR 1.01 (1.01–1.01) 1.01 (1.01–1.01) 1.03 (1.02–1.03) 1.01 1.01 1.02
 Admit GFR Count 1.06 (1.05–1.06) 0.99 (0.98–1.01) 1.09 (1.04–1.13) 1.06 0.99 1.08
 Delta Admit GFR 0.98 (0.98–0.98) 0.99 (0.99–0.99) 0.99 (0.99–1.00) 0.98 0.99 0.99
Other
 CT Scan − Contrast 0.97 (0.94–0.99) 1.05 (0.99–1.12) 1.00 (0.85–1.17) 0.97 1.05
 CT Scan + Contrast 0.97 (0.94–1.00) 1.04 (0.96–1.12) 1.68 (1.31–2.14) 0.97 1.03 1.62
 NS IVF 0.92 (0.92–0.93) 0.99 (0.97–1.00) 0.99 (0.95–1.03) 0.93 0.99 1.00
 1/2 NS IVF 0.98 (0.97–0.98) 1.03 (1.00–1.05) 0.97 (0.9–1.05) 0.98 1.02 0.99
 LR IVF 0.96 (0.94–0.97) 0.97 (0.93–1.01) 0.93 (0.81–1.08) 0.96 0.98 0.96
 Water IVF 1.12 (1.11–1.14) 1.18 (1.14–1.21) 1.04 (0.95–1.14) 1.12 1.17 1.04
 Hypertension 1.35 (1.33–1.37) 1.36 (1.29–1.43) 1.23 (1.09–1.4) 1.35 1.35 1.22
 Hypotension 1.03 (1.01–1.05) 1.16 (1.1–1.22) 1.10 (0.94–1.29) 1.02 1.15 1.09
 Pre-Admit Max. Temp. 0.99 (0.99–1.00) 0.98 (0.97–0.99) 1.04 (1.00–1.08) 0.99 0.99 1.04
 Pre-Admit Mean BMI 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00)
 Admit Max. Temp. 0.97 (0.96–0.97) 1.03 (1.01–1.04) 0.99 (0.95–1.03) 0.97 1.02 1.00
 Admit Mean BMI 1.00 (1.00–1.00) 1.00 (1.00–1.00) 1.00 (1.00–1.00)

The final logistic regression and lasso regression models are reported as odds ratios (ORs) with 95% confidence intervals (95% CIs) for logistic regression and as ORs alone for lasso regression. In the lasso regression, if a variable was dropped from the regression, the content of the cell is “–.” ORs for intravenous fluids (IVF) express the change in odds per liter of fluid given. Bolded ORs for the logistic regression are significant at the Bonferroni-corrected threshold of 1.34 × 10⁻⁴.

–, variable dropped from lasso; CCB, calcium channel blocker; ARB, angiotensin II receptor blocker; ACEi, angiotensin converting enzyme inhibitor; TB, tuberculosis; MAOI, monoamine oxidase inhibitor; TCA, tricyclic antidepressants; CHF, congestive heart failure; DM, diabetes mellitus; HTN, hypertension; PVD, peripheral vascular disease; ALD, advanced liver disease; CVA, cerebrovascular accident; CDVD, cardiovascular disease; COPD, chronic obstructive pulmonary disease; MVR, mitral valve regurgitation; RA, rheumatoid arthritis; PUD, peptic ulcer disease; BUN, blood urea nitrogen; CK, creatine kinase; CK-MB, creatine kinase-MB isoenzyme; BNP, B-type natriuretic peptide; WBC, white blood cell count; MCV, mean corpuscular volume; MCH, mean corpuscular hemoglobin; MCHC, mean corpuscular hemoglobin concentration; AST, aspartate aminotransferase; ALT, alanine aminotransferase; GGT, gamma-glutamyl transpeptidase; NS, normal saline; LR, lactated Ringer's; GFR, glomerular filtration rate; SD, standard deviation; NSAIDs, nonsteroidal anti-inflammatory drugs; HIV, human immunodeficiency virus; CT, computerized tomography; BMI, body mass index.
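
As a rough check on the Bonferroni threshold quoted in the table note, the following sketch back-calculates how many comparisons a per-test threshold of 1.34 × 10⁻⁴ would imply. It assumes a family-wise alpha of 0.05, which is not stated in this section, and the function names are illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_number_of_tests(alpha: float, threshold: float) -> float:
    """Number of comparisons implied by a given per-test threshold."""
    return alpha / threshold


ALPHA = 0.05          # assumption: conventional family-wise alpha
REPORTED = 1.34e-4    # threshold quoted in the table note

implied = implied_number_of_tests(ALPHA, REPORTED)
print(f"implied comparisons: {implied:.1f}")
print(f"threshold for that many tests: {bonferroni_threshold(ALPHA, round(implied)):.3e}")
```

Under that assumption the threshold corresponds to several hundred comparisons; the exact count used by the authors is not given here.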

The random forest variable importance for stage 1+, stage 2+, and dialysis is presented as ranks, from highest to lowest, to indicate the most important variables in the forest (Table 3). The following variables ranked in the top 10 for all three outcomes: mean pre-admission GFR, delta admission GFR, and BUN.

Table 3:

Results of the variable importance of the final random forest model.

Random forest
Risk factor Stage 1+ Stage 2+ Dialysis Risk factor Stage 1+ Stage 2+ Dialysis
Rank Rank Rank Rank Rank Rank
Demographics Diagnoses
 Admit Age 20 36 10 CHF 7 79 48
 Gender (Male) 109 93 117 DM 15 62 63
 Race 39 46 24 HTN 43 65 69
Medications PVD 73 84 39
Pre-admission ALD 79 50 78
  NSAIDs 91 94 114 Cancer 76 55 100
  Aminoglycosides 119 106 102 CVA 99 108 85
  Cephalosporins 106 116 96 Alcoholism 82 87 71
  CCB 50 82 58 HIV 104 97 123
  Penicillins 100 110 103 Hepatitis 90 78 57
  β–Blockers 44 89 91 Anemia 87 81 33
  ARB 75 92 74 CDVD 77 101 72
  ACEi 35 56 77 COPD 72 77 83
  AntiTB 126 126 120 Dyslipidemia 84 80 55
  AntiFungals 117 114 112 MVR 108 120 62
  Glucocorticoids 80 103 108 Dementia 105 117 115
  Lincomycin 113 107 86 RA 115 99 89
  Macrolides 102 112 105 PUD 116 119 97
  MAOI 131 130 131 Hemiplegia 112 118 94
  Nitrofurantoin 123 128 126
  Sulfa Antibiotics 98 111 90 Labs
  Tetracyclines 118 109 101 Sodium 48 12 13
  Thiazides 60 75 65 Chloride 22 7 4
  Loop Diuretics 8 51 37 Bicarbonate 27 20 6
  Anhydrase Diuretic 128 124 129 Calcium 49 18 14
  K-Sparing Diuretics 57 61 79 BUN 4 6 3
  Benzodiazepines 95 98 76 Glucose 30 33 47
  TCA 101 96 98 Troponin-I 25 21 21
  Statins 86 100 84 Troponin-T 41 26 46
  Insulin 23 83 44 CK-MB 26 19 27
  Antiemetics 114 86 109 CK 53 17 17
  Opioids 71 90 61 BNP 11 16 11
  Fluoroquinolones 97 85 60 Hemoglobin 42 34 15
Delta Hemoglobin 46 38 28
Admission Hematocrit 47 44 26
  NSAIDs 93 91 118 WBC 33 9 32
  Aminoglycosides 78 70 82 Platelets 56 43 40
  Cephalosporins 66 66 87 MCV 59 41 29
  CCB 38 67 53 MCHC 52 42 25
  Penicillins 45 13 59 MCH 58 40 43
  β–Blockers 13 71 68 Albumin 32 8 20
  ARB 62 95 88 AST 34 5 12
  ACEi 10 39 66 ALT 64 23 18
  AntiTB 121 121 116 Total Bilirubin 12 1 8
  AntiFungals 94 76 95 Alkaline Phosphatase 28 10 9
  Glucocorticoids 36 63 99 GGT 65 27 19
  Lincomycin 110 113 121 Ammonia 55 14 36
  Macrolides 81 104 93 Lipase 74 30 45
  MAOI 130 131 130 Mean Pre-Admit GFR 5 2 1
  Nitrofurantoin 129 125 127 Pre-Admit GFR Count 29 24 34
  Sulfa Antibiotics 18 60 119 SD Pre-Admit GFR 40 15 16
  Tetracyclines 120 123 92 Mean Admit GFR 1 22 2
  Thiazides 9 49 51 SD Admit GFR 6 32 7
  Loop Diuretics 3 25 23 Admit GFR Count 24 31 22
  Anhydrase Diuretic 122 122 125 Delta Admit GFR 2 3 5
  K-Sparing Diuretics 21 53 113
  Benzodiazepines 83 57 81 Other
  TCA 103 105 111 CT Scan − Contrast 92 73 64
  Statins 70 102 73 CT Scan + Contrast 107 69 106
  Insulin 16 68 75 NS IVF 14 37 41
  Fluoroquinolones 85 74 104 1/2 NS IVF 68 45 54
  Antiemetics 88 59 67 LR IVF 89 52 56
  Opioids 69 58 70 Water IVF 31 11 35
  Cyclosporine 127 129 110 Hypertension 17 48 52
  Trimethoprim 37 72 122 Hypotension 96 64 50
  Cimetidine 125 127 128 Pre-Admit Max. Temp. 63 47 30
  Nacetylcysteine 67 88 80 Pre-Admit Mean BMI 54 29 49
  Acyclovir 111 54 107 Admit Max. Temp. 61 28 31
  Vancomycin 19 4 38 Admit Mean BMI 51 35 42
  Lithium 124 115 124

Risk factors are ranked based on their variable importance for each stage separately, with 1 representing the variable with the highest variable importance and 131 representing the variable with the lowest importance.

CCB, calcium channel blocker; ARB, angiotensin II receptor blocker; ACEi, angiotensin converting enzyme inhibitor; TB, tuberculosis; MAOI, monoamine oxidase inhibitor; TCA, tricyclic antidepressants; CHF, congestive heart failure; DM, diabetes mellitus; HTN, hypertension; PVD, peripheral vascular disease; ALD, advanced liver disease; CVA, cerebrovascular accident; CDVD, cardiovascular disease; COPD, chronic obstructive pulmonary disease; MVR, mitral valve regurgitation; RA, rheumatoid arthritis; PUD, peptic ulcer disease; BUN, blood urea nitrogen; CK, creatine kinase; CK-MB, creatine kinase-MB isoenzyme; BNP, B-type natriuretic peptide; WBC, white blood cell count; MCV, mean corpuscular volume; MCH, mean corpuscular hemoglobin; MCHC, mean corpuscular hemoglobin concentration; AST, aspartate aminotransferase; ALT, alanine aminotransferase; GGT, gamma-glutamyl transpeptidase; NS, normal saline; LR, lactated Ringer's; GFR, glomerular filtration rate; SD, standard deviation; NSAIDs, nonsteroidal anti-inflammatory drugs; HIV, human immunodeficiency virus; CT, computerized tomography; BMI, body mass index.
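
The ranks in Table 3 can be screened for variables that sit near the top for every outcome. The sketch below copies a handful of rows from the table (stage 1+, stage 2+, dialysis) and filters for those ranked 10 or better in all three; the dictionary is only a subset of the table, and the helper name is illustrative.

```python
# Ranks copied from Table 3 as (stage 1+, stage 2+, dialysis); subset only.
RANKS = {
    "Mean Pre-Admit GFR": (5, 2, 1),
    "Delta Admit GFR": (2, 3, 5),
    "BUN": (4, 6, 3),
    "Mean Admit GFR": (1, 22, 2),
    "SD Admit GFR": (6, 32, 7),
    "Total Bilirubin": (12, 1, 8),
    "Chloride": (22, 7, 4),
    "Vancomycin (admission)": (19, 4, 38),
    "Loop Diuretics (admission)": (3, 25, 23),
}


def top_k_in_all_outcomes(ranks: dict, k: int) -> list:
    """Return variables whose rank is <= k for every outcome, sorted by name."""
    return sorted(name for name, r in ranks.items() if max(r) <= k)


print(top_k_in_all_outcomes(RANKS, 10))
```

Among the rows included here, only the three variables named in the text pass the filter; several others rank in the top 10 for one or two outcomes but not all three.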

Discrimination performance of the AKI stage 1+, AKI stage 2+, and dialysis models was evaluated by the AUC (Table 4). The highest AUCs were logistic regression for stage 1+, with a median AUC of 0.758 (95% CI: 0.758–0.758); random forest for stage 2+, with a median AUC of 0.720 (95% CI: 0.719–0.721); and logistic regression for dialysis, with a median AUC of 0.825 (95% CI: 0.823–0.827). Lasso regression and the random forests performed very similarly. For the other discrimination measures (NRI and IDI), the lasso and logistic regression methods outperformed the random forest method in most stages (Table 4). Random forests were not as well calibrated for any of the outcomes compared with logistic and lasso regression, as demonstrated in the O/E ratio plots (Figure 3). When we performed a sensitivity analysis of a heavily penalized lasso parsimonious model (see Supplementary Appendix 3), the AUC decreased by 0.055 for stage 1+, 0.082 for stage 2+, and 0.025 for dialysis.

Table 4:

Results of discrimination and calibration metrics for the 50 bootstrap samples.

Discrimination and calibration metrics
Model Stage 1+ Stage 2+ Dialysis
Median (95% CI) Median (95% CI) Median (95% CI)
AUC
Logistic regression 0.758 (0.758–0.758) 0.715 (0.714–0.716) 0.825 (0.823–0.827)
Lasso regression 0.758 (0.757–0.758) 0.714 (0.713–0.715) 0.824 (0.822–0.826)
Random forest 0.746 (0.744–0.748) 0.721 (0.720–0.721) 0.823 (0.818–0.828)
NRI
Lasso vs LR 0.461 (0.460–0.463) 0.348 (0.344–0.351) 0.549 (0.538–0.559)
RF vs Lasso 0.378 (0.377–0.379) 0.271 (0.267–0.275) 0.306
RF vs LR 0.419 (0.417–0.420) 0.332 (0.329–0.336) 0.409 (0.399–0.420)
IDI
Lasso vs LR 0.004 (0.004–0.004) 0.001 (0.001–0.001) 0.007 (0.006–0.007)
RF vs Lasso 0.022 (0.022–0.022) 0.004 (0.004–0.004) −0.021 (−0.022 to −0.021)
RF vs LR 0.026 (0.026–0.026) 0.005 (0.005–0.005) −0.015 (−0.016 to −0.014)
Brier
LR 0.067 (0.067–0.067) 0.010 (0.009–0.010) 0.001 (0.001–0.001)
RF 0.068 (0.068–0.068) 0.010 (0.009–0.010) 0.001 (0.001–0.001)
Lasso 0.068 (0.068–0.068) 0.010 (0.009–0.010) 0.001 (0.001–0.001)

Area under the receiver operating characteristic curve (AUC) values are represented for each model by median and 95% confidence intervals (95% CIs) for the stage 1+, 2+, and dialysis outcomes. The continuous net reclassification index (NRI) and integrated discrimination improvement (IDI) values are reported as the improvement of the second model vs the first model (model A vs model B being positive is interpreted as model B having a superior classification). The Brier score is represented for each model with medians and 95% CIs.

LR = logistic regression, RF = random forest.
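
For readers less familiar with the metrics in Table 4, the sketch below computes each one from predicted probabilities using standard textbook definitions: AUC as the probability that an event outranks a non-event, the Brier score as mean squared error, IDI as the difference in discrimination slopes, and the continuous (category-free) NRI. The function names and toy data are hypothetical, and the study's own implementation and sign convention may differ; in this sketch, positive IDI and NRI values favor `p_new` over `p_ref`.

```python
from statistics import mean


def auc(y, p):
    """Probability that a random event scores higher than a random non-event (ties count 0.5)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def brier(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return mean((pi - yi) ** 2 for yi, pi in zip(y, p))


def idi(y, p_ref, p_new):
    """Mean change in predicted risk among events minus the same among non-events."""
    d_event = mean(n - r for yi, r, n in zip(y, p_ref, p_new) if yi == 1)
    d_nonevent = mean(n - r for yi, r, n in zip(y, p_ref, p_new) if yi == 0)
    return d_event - d_nonevent


def continuous_nri(y, p_ref, p_new):
    """Net proportion moved up among events minus net proportion moved up among non-events."""
    def net_up(label):
        diffs = [n - r for yi, r, n in zip(y, p_ref, p_new) if yi == label]
        up = sum(d > 0 for d in diffs)
        down = sum(d < 0 for d in diffs)
        return (up - down) / len(diffs)
    return net_up(1) - net_up(0)


# Toy data (hypothetical): two events, two non-events.
y = [1, 1, 0, 0]
p_ref = [0.6, 0.4, 0.5, 0.2]
p_new = [0.8, 0.5, 0.4, 0.1]

print(auc(y, p_ref), auc(y, p_new))
print(brier(y, p_ref), brier(y, p_new))
print(idi(y, p_ref, p_new), continuous_nri(y, p_ref, p_new))
```

In the toy data `p_new` raises risk for both events and lowers it for both non-events, so all four metrics move in its favor. With outcomes as rare as dialysis, Brier scores are small for every model, which is consistent with the near-identical Brier rows in Table 4.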

Figure 3:


Each model’s observed to expected ratio plots are presented for lasso regression (Figures 3a–c), logistic regression (Figures 3d–f), and random forest (Figures 3g–i).
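
An observed-to-expected (O/E) ratio compares the event rate actually seen in a group of patients with the mean risk the model predicted for that group. The sketch below bins patients by predicted risk and reports the ratio per bin; the equal-size binning scheme, function name, and toy data are assumptions for illustration and may not match how the figure panels were constructed.

```python
def observed_expected_by_bin(y, p, n_bins=10):
    """Sort by predicted risk, split into equal-size bins, and return
    (mean predicted risk, observed event rate, O/E ratio) for each bin."""
    pairs = sorted(zip(p, y))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        expected = sum(pi for pi, _ in chunk) / len(chunk)
        observed = sum(yi for _, yi in chunk) / len(chunk)
        ratio = observed / expected if expected > 0 else float("nan")
        rows.append((expected, observed, ratio))
    return rows


# Toy data (hypothetical): four low-risk and four high-risk predictions.
p = [0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 0.5]
y = [0, 0, 0, 1, 1, 0, 1, 1]

for expected, observed, ratio in observed_expected_by_bin(y, p, n_bins=2):
    print(f"expected={expected:.2f} observed={observed:.2f} O/E={ratio:.2f}")
```

A ratio near 1 indicates good calibration in that bin; a ratio below 1 means the model predicted more events than occurred (over-prediction), and a ratio above 1 means it predicted fewer.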

DISCUSSION

In this study, which developed HA-AKI models in the largest cohort to date, random forests were unexpectedly inferior to lasso and logistic regression for most outcomes and performed very similarly for stage 2+ on AUC and O/E ratio plot measurements. Compared with logistic regression, lasso produced a more parsimonious model, with marginal decreases in AUC and retention of O/E ratio performance for all outcomes.

Both logistic regression and lasso had slightly superior or very similar AUC performance compared with random forests. This is contrary to previously published work that showed random forests to be superior to logistic regression.34–36 However, studies have shown that random forests have diminished performance in detecting both marginal and interacting effects in high-dimensional data.58 Weighting methods have been used to address class imbalance, which is likely to occur when the outcome being measured is rare, as is the case in our dataset for all stages of AKI.59–61 However, weighted random forests still offer only a modest improvement in predictive ability when effect sizes are small, which is true in our dataset, with most ORs ∼1.00.62 Logistic regression and lasso outperformed random forests nearly consistently when compared using NRI and IDI. The IDI and the improvement in AUC are both weighted measures of improvement in sensitivity, with AUC giving more weight to larger sensitivities and IDI giving the same weight to all values of sensitivity.53 Because of the differences between these two discrimination measures, they may rank models differently when the difference in those models’ performances is not very large.53 We see in the O/E ratio graphs that logistic regression and lasso regression are better calibrated for each of the AKI outcomes than random forests, with random forests over-predicting risk as the predicted probability increases. Lasso regression performed nearly as well as logistic regression, with similar AUCs, O/E plots, and ORs for the predictor variables. This demonstrates the effectiveness of lasso regression in simplifying the model by removing less important predictor variables while performing nearly as well as logistic regression.
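
The way lasso removes less important predictors can be illustrated with the soft-thresholding operator, which is the closed-form lasso solution for a single coefficient in a least-squares problem with orthonormal predictors. This is a minimal sketch under that simplifying assumption, not the penalized logistic fitting procedure used in the study; the coefficient values and penalty are made up for illustration.

```python
def soft_threshold(beta: float, penalty: float) -> float:
    """Shrink beta toward zero by `penalty`; return 0.0 when |beta| <= penalty."""
    if beta > penalty:
        return beta - penalty
    if beta < -penalty:
        return beta + penalty
    return 0.0


# Hypothetical unpenalized coefficients (not taken from the paper).
unpenalized = {"strong_predictor": 0.50, "weak_predictor": 0.02, "protective": -0.30}
PENALTY = 0.05  # hypothetical penalty strength

shrunk = {name: soft_threshold(b, PENALTY) for name, b in unpenalized.items()}
kept = [name for name, b in shrunk.items() if b != 0.0]
print(shrunk)
print(kept)
```

Coefficients smaller in magnitude than the penalty are set exactly to zero, which corresponds to a dropped variable, while larger ones are retained but shrunk toward zero.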

The most important modifiable variables during the admission window for AKI and dialysis included IV hydration and admission medication exposures. Many of these risk factors were significant in the AKI stage 1+ or 2+ models and represent actionable therapies that can be embedded in clinical decision support to provide estimates of risk reduction if they were administered or held, as clinically appropriate.

This was the first prediction model for HA-AKI to include IVF administration calculated through bar-coded medication administration records, which is important because IVF is a preventable risk factor. IVF administration is either a protective factor or a risk factor for AKI, depending on what type of fluid is given. The protective association with the more isotonic fluids associated with volume resuscitation is supported by the literature, which reports that volume expansion and volume expansion protocols reduce the incidence of AKI.63–68 Our models showed that NS, 1/2 NS, and LR are associated with a lower risk of developing stage 1+ AKI, and that water alone is associated with an elevated risk of developing stage 1+ AKI. The risk associated with free water fluid administration could be related to disease severity, because free water volume is most commonly administered as the solution for IV medications and for patients who get flushes in their IV catheters after medication administration. However, risk associated with free water administration is biologically plausible, because this fluid is not effective for volume resuscitation and could also be a significant risk factor that has not previously been described in the development of AKI (and therefore needs further exploration). Isotonic IVF was a protective factor, and delta hemoglobin was a risk factor for stage 1+ AKI, supporting the theory that stage 1+ AKI is associated with intravascular volume depletion. Causality cannot be determined, because IVF may also mask the development of AKI by diluting serum creatinine. However, the fact that NS and LR were protective and free water IVF was a risk factor for stage 1+ AKI provides support that each effect is not due to the dilution of serum creatinine concentrations. Overall, whether isotonic IVF are protective against or mask the development of stage 1+ AKI is unclear, and further studies are required.
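
Because the IVF odds ratios are expressed per liter, their implication for larger volumes can be sketched by compounding them on the odds scale. The example below uses the stage 1+ logistic regression ORs reported above for NS (0.92) and water (1.12); it assumes the usual log-linear form of a logistic model term, uses a hypothetical baseline risk, and describes an association only, since causality cannot be determined from these models.

```python
def scaled_odds_ratio(or_per_liter: float, liters: float) -> float:
    """Odds ratio for `liters` of fluid, assuming a log-linear effect per liter."""
    return or_per_liter ** liters


def risk_after_odds_ratio(baseline_risk: float, odds_ratio: float) -> float:
    """Convert a baseline probability to the probability implied by an odds ratio."""
    odds = baseline_risk / (1.0 - baseline_risk)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


NS_OR_PER_LITER = 0.92      # stage 1+ logistic regression OR for NS IVF
WATER_OR_PER_LITER = 1.12   # stage 1+ logistic regression OR for water IVF
BASELINE_RISK = 0.10        # hypothetical baseline risk, for illustration only

for label, or_per_liter in [("NS", NS_OR_PER_LITER), ("Water", WATER_OR_PER_LITER)]:
    or_2l = scaled_odds_ratio(or_per_liter, 2.0)
    risk = risk_after_odds_ratio(BASELINE_RISK, or_2l)
    print(f"{label}: OR for 2 L = {or_2l:.4f}, implied risk = {risk:.4f}")
```

Working on the odds scale rather than multiplying probabilities directly keeps the implied risk between 0 and 1 for any volume.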

The mean pre-admission and admission GFR (or the level of kidney dysfunction, which was significant in stage 1+, 2+, and dialysis in our model) is one of the most important risk factors for the development of AKI.69,70 Most of the medications that were significant in our model are known risk factors for AKI.1,9–15 The fact that some of the antibiotics that were significantly associated with AKI are not direct nephrotoxins could have been due to their proxy association with acute infection leading to sepsis. The choice of antibiotic associated with a protective effect or risk likely represents the severity of disease. Cephalosporins, tetracyclines, and macrolides are typically used for less invasive infections and were protective, but penicillins and fluoroquinolones, which can be used for more serious infections, were risk factors. Bactrim, vancomycin, and aminoglycosides, which are known to be nephrotoxic, had a much stronger association with higher ORs than other antibiotics.

The lab values are proxies for disease states. Aspartate aminotransferase, alanine aminotransferase, total bilirubin, and alkaline phosphatase are elevated in acute liver disease, which is a risk factor for AKI. Serum glucose is elevated in diabetes, but can also contribute to dehydration through the osmotic load in very elevated states. Creatine kinase-MB isoenzyme is elevated in acute myocardial infarction; elevated sodium, elevated BUN, and decreased chloride are seen in hypovolemic states; and low bicarbonate is seen in sepsis, all of which are risk factors for AKI. Admission CT scans with and without IV contrast were slightly protective in stage 1+ AKI; however, they were risk factors in both stage 2+ AKI and dialysis, with CT scans with IV contrast being significant for dialysis. These findings are limited by the inability to determine which patients actually received IV contrast during the CT scan, as described in our methods.

The differences in risk factors for AKI stage 1+, AKI 2+, and dialysis are likely explained by the fact that mild AKI is associated with a less severe phenotype. In contrast, severe AKI represents more intrinsic renal injury with sustained loss of function.71 The relative sensitivity and specificity of severity grades of the standard definitions of AKI is an active area of research and discussion. For this reason, we reported the models across the spectrum of severity. Indeed, the differences in risk factors and strengths of associations between outcome severities support the need to use different risk models for these different outcomes.

We expanded our prior single-center work37 by developing the model in a large national cohort, exploring which modeling methods appear to be more robust regarding prediction performance, and evaluating additional novel risk factors available within the Veterans Health Administration EHR. We extended prior models of AKI in the literature. Previously, studies have looked at adverse outcomes after the development of AKI11 or in select populations.26–33 The present study captured data on all hospitalizations and predicted AKI outcomes before a patient developed AKI. Random forests have been used to predict AKI development in the contrast-induced nephropathy population;30 however, our study compares the ability of random forests, lasso regression, and logistic regression to predict outcomes for all populations. Finally, this study was performed on a nationwide cohort of over 100 hospitals, which is larger than previous studies on HA-AKI prediction.

This study has some limitations, so its results should be interpreted cautiously. Our cohort is composed largely of male patients and may not generalize to a population with a greater proportion of female patients. In addition, the models we used were internally validated, and the generalizability of those models will need to be assessed through external validation in other populations. However, there is growing literature to suggest that local refitting or remodeling of a developed risk model is warranted on a regular basis, regardless of external validation, and all risk prediction models should be used with caution in other clinical settings if refitting/remodeling is not performed.72–74 Another limitation of our study is the secular trend introduced by creatinine assay changes during the study period. Extensive validation of ICD-9 codes for accuracy has been performed previously at the VA and other institutions for chronic conditions such as congestive heart failure, coronary artery disease, and hypertension;75,76 however, some ICD-9 codes have not been extensively studied.

CONCLUSIONS

This study explored multiple modeling methods, including logistic regression, lasso regression, and random forests, for modeling HA-AKI in a large nationwide cohort. Traditional regression methods outperformed machine learning methods in this domain. Our final recommendation is to use lasso regression within this clinical setting, given its intuitive representation of the risk factors and ORs, its ability to simplify the model based on the selection of the most important clinical predictors, and its performance, which was equivalent to that of logistic regression and similar or superior to that of random forests. This study also explored novel risk factors within the EHR data and demonstrated the ability of multiple risk modeling techniques to effectively predict HA-AKI and identify potential risk factors that can trigger interventions to prevent HA-AKI. We were able to determine multiple clinical risk factors that could be intervened upon and were able to show the potential risks and benefits of IVF in a predictive model, which has not been done previously. These models can be used for population health in dashboards and within institutional and provider quality profiling activities. They can also underpin clinical decision support for individual patients that is both more appropriate to the patient context and provides explicit recommendations for risk mitigation through preventable risk factors in the model.

CONTRIBUTORS

M.E.M. and R.M.C. were involved with the study’s conception, design, and data collection. M.E.M., R.M.C., J.P.V., E.D.S., and S.K.E. were involved with the analysis of the study. All authors were involved with writing and editing the manuscript.

FUNDING

M.E.M. is supported by the following grants: Veterans Health Administration HSR&D CDA-08-020 and HSR&D IIR-11-292. E.S. is supported by K23 DK088964-01A1 and K24 DK62849 grants from the National Institute of Diabetes and Digestive and Kidney Diseases. This work was also partially supported by the Assessment and Serial Evaluation of the Subsequent Sequelae of Acute Kidney Injury Study (5U01DK082192-02, 5U01DK082185-02, 5U01DK082223-02). J.P. was supported by an R01 LM009965-03 grant from the National Library of Medicine. R.C. and J.V.H. were supported by the 5T15LM007450-12 training grant from the National Library of Medicine. This project was executed in collaboration with the Predictive Analytics group of the Veterans Affairs Office of Analytics and Business Intelligence. The views presented in this work are solely those of the authors and do not necessarily represent the position or the policy of the US Department of Veterans Affairs or the National Institutes of Health.

COMPETING INTERESTS

None.

SUPPLEMENTARY MATERIAL

Supplementary material is available online at http://jamia.oxfordjournals.org/.

REFERENCES

  • 1.Bellomo R, Ronco C, Kellum JA, et al. Acute renal failure - definition, outcome measures, animal models, fluid therapy and information technology needs: the Second International Consensus Conference of the Acute Dialysis Quality Initiative (ADQI) Group. Crit Care Lond Engl. 2004;8:R204–R212. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Uchino S, Kellum JA, Bellomo R, et al. Acute renal failure in critically ill patients: a multinational, multicenter study. JAMA J Am Med Assoc. 2005;294:813–818. [DOI] [PubMed] [Google Scholar]
  • 3.Hou SH, Bushinsky DA, Wish JB, et al. Hospital-acquired renal insufficiency: a prospective study. Am J Med. 1983;74:243–248. [DOI] [PubMed] [Google Scholar]
  • 4.Liaño F, Junco E, Pascual J, et al. The spectrum of acute renal failure in the intensive care unit compared with that seen in other settings. The Madrid Acute Renal Failure Study Group. Kidney Int Suppl. 1998;66:S16–S24. [PubMed] [Google Scholar]
  • 5.Wang Y, Cui Z, Fan M. Hospital-acquired and community-acquired acute renal failure in hospitalized Chinese: a ten-year review. Ren Fail. 2007;29:163–168. [DOI] [PubMed] [Google Scholar]
  • 6.Samaan KH, Dahlke M, Stover J. Addressing safety concerns about U-500 insulin in a hospital setting. Am J Health-Syst Pharm AJHP Off J Am Soc Health-Syst Pharm. 2011;68:63–68. [DOI] [PubMed] [Google Scholar]
  • 7.Brivet FG, Kleinknecht DJ, Loirat P, et al. Acute renal failure in intensive care units--causes, outcome, and prognostic factors of hospital mortality; a prospective, multicenter study. French Study Group on Acute Renal Failure. Crit Care Med. 1996;24:192–198. [DOI] [PubMed] [Google Scholar]
  • 8.Coca SG, Yusuf B, Shlipak MG, et al. Long-term risk of mortality and other adverse outcomes after acute kidney injury: a systematic review and meta-analysis. Am J Kidney Dis Off J Natl Kidney Found. 2009;53:961–973. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Mehta RL, Pascual MT, Soroko S, et al. Spectrum of acute renal failure in the intensive care unit: the PICARD experience. Kidney Int. 2004;66:1613–1621. [DOI] [PubMed] [Google Scholar]
  • 10.Bernieh B, Al Hakim M, Boobes Y, et al. Outcome and predictive factors of acute renal failure in the intensive care unit. Transplant Proc. 2004;36:1784–1787. [DOI] [PubMed] [Google Scholar]
  • 11.Prins JM, Büller HR, Kuijper EJ, et al. Once versus thrice daily gentamicin in patients with serious infections. Lancet. 1993;341:335–339. [DOI] [PubMed] [Google Scholar]
  • 12.Hatala R, Dinh TT, Cook DJ. Single daily dosing of aminoglycosides in immunocompromised adults: a systematic review. Clin Infect Dis Off Publ Infect Dis Soc Am. 1997;24:810–815. [DOI] [PubMed] [Google Scholar]
  • 13.Hock R, Anderson RJ. Prevention of drug-induced nephrotoxicity in the intensive care unit. J Crit Care. 1995;10:33–43. [DOI] [PubMed] [Google Scholar]
  • 14.Tran DD, Oe PL, de Fijter CW, et al. Acute renal failure in patients with acute pancreatitis: prevalence, risk factors, and outcome. Nephrol Dial Transplant Off Publ Eur Dial Transpl Assoc - Eur Ren Assoc. 1993;8:1079–1084. [PubMed] [Google Scholar]
  • 15.Taber SS, Mueller BA. Drug-associated renal dysfunction. Crit Care Clin. 2006;22:357–374, viii. [DOI] [PubMed] [Google Scholar]
  • 16.Walsh TJ, Hiemenz JW, Seibel NL, et al. Amphotericin B lipid complex for invasive fungal infections: analysis of safety and efficacy in 556 cases. Clin Infect Dis Off Publ Infect Dis Soc Am. 1998;26:1383–1396. [DOI] [PubMed] [Google Scholar]
  • 17.Bamgboye EL, Mabayoje MO, Odutola TA, et al. Acute renal failure at the Lagos University Teaching Hospital: a 10-year review. Ren Fail. 1993;15:77–80. [PubMed] [Google Scholar]
  • 18.Nolan CR, Anderson RJ. Hospital-acquired acute renal failure. J Am Soc Nephrol JASN. 1998;9:710–718. [DOI] [PubMed] [Google Scholar]
  • 19.Thadhani R, Pascual M, Bonventre JV. Acute renal failure. N Engl J Med. 1996;334:1448–1460. [DOI] [PubMed] [Google Scholar]
  • 20.Kleinknecht D. Epidemiology in acute renal failure in France today. In: Biari D, Neild G, eds. Acute Renal Failure in Intensive Therapy Unit. Berlin: Springer-Verlag; 1990:13–21. [Google Scholar]
  • 21.Cantarovich F, Bodin L. Functional acute renal failure. In: Cantarovich F, Rangoonwala B, Verho M, eds. Progress in Acute Renal Failure. Paris: Hoechst Marion Roussel; 1998:55–65. [Google Scholar]
  • 22.Hunt JP, Meyer AA. Predicting survival in the intensive care unit. Curr Probl Surg. 1997;34:527–599. [PubMed] [Google Scholar]
  • 23.Randolph AG, Guyatt GH, Carlet J. Understanding articles comparing outcomes among intensive care units to rate quality of care. Evidence based medicine in critical care group. Crit Care Med. 1998;26:773–781. [DOI] [PubMed] [Google Scholar]
  • 24.Matheny ME, Ohno-Machado L, Resnic FS. Risk-adjusted sequential probability ratio test control chart methods for monitoring operator and institutional mortality rates in interventional cardiology. Am Heart J. 2008;155:114–120. [DOI] [PubMed] [Google Scholar]
  • 25.Topol EJ, Block PC, Holmes DR, et al. Readiness for the scorecard era in cardiovascular medicine. Am J Cardiol. 1995;75:1170–1173. [DOI] [PubMed] [Google Scholar]
  • 26.Rodríguez E, Soler MJ, Rap O, et al. Risk factors for acute kidney injury in severe rhabdomyolysis. PloS One. 2013;8:e82992. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.McMahon GM, Zeng X, Waikar SS. A risk prediction score for kidney failure or mortality in rhabdomyolysis. JAMA Intern Med. 2013;173:1821–1828. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Kim WH, Lee SM, Choi JW, et al. Simplified clinical risk score to predict acute kidney injury after aortic surgery. J Cardiothorac Vasc Anesth. 2013;27:1158–1166. [DOI] [PubMed] [Google Scholar]
  • 29.Slankamenac K, Beck-Schimmer B, Breitenstein S, et al. Novel prediction score including pre- and intraoperative parameters best predicts acute kidney injury after liver surgery. World J Surg. 2013;37:2618–2628. [DOI] [PubMed] [Google Scholar]
  • 30.Gurm HS, Seth M, Kooiman J, et al. A novel tool for reliable and accurate prediction of renal complications in patients undergoing percutaneous coronary intervention. J Am Coll Cardiol. 2013;61:2242–2248. [DOI] [PubMed] [Google Scholar]
  • 31.Ng SY, Sanagou M, Wolfe R, et al. Prediction of acute kidney injury within 30 days of cardiac surgery. J Thorac Cardiovasc Surg. Published Online First: 28 August 2013. doi:10.1016/j.jtcvs.2013.06.049 [DOI] [PubMed] [Google Scholar]
  • 32.Schneider DF, Dobrowolsky A, Shakir IA, et al. Predicting acute kidney injury among burn patients in the 21st century: a classification and regression tree analysis. J Burn Care Res Off Publ Am Burn Assoc. 2012;33:242–251. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Breidthardt T, Christ-Crain M, Stolz D, et al. A combined cardiorenal assessment for the prediction of acute kidney injury in lower respiratory tract infections. Am J Med. 2012;125:168–175. [DOI] [PubMed] [Google Scholar]
  • 34.Casanova R, Saldana S, Chew EY, et al. Application of random forests methods to diabetic retinopathy classification analyses. PloS One. 2014;9:e98587. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Liu Y, Traskin M, Lorch SA, et al. Ensemble of trees approaches to risk adjustment for evaluating a hospital’s performance. Health Care Manag Sci. Published Online First: April 29, 2014. doi:10.1007/s10729-014-9272-4 [DOI] [PubMed] [Google Scholar]
  • 36.Sowa J-P, Heider D, Bechmann LP, et al. Novel algorithm for non-invasive assessment of fibrosis in NAFLD. PloS One. 2013;8:e62439. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Matheny ME, Miller RA, Ikizler TA, et al. Development of inpatient risk stratification models of acute kidney injury for use in electronic health records. Med Decis Mak Int J Soc Med Decis Mak. 2010;30:639–650. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Kolodner R. Computerizing Large Integrated Health Networks. The VA Success. New York: Springer-Verlag; 1997. [Google Scholar]
  • 39.Payne TH, Savarino J. Development of a clinical event monitor for use with the Veterans Affairs Computerized Patient Record System and other data sources. Proc AMIA Annu Symp AMIA Symp. 1998;145–149. [PMC free article] [PubMed] [Google Scholar]
  • 40.Perlin JB, Kolodner RM, Roswell RH. The Veterans Health Administration: quality, value, accountability, and information as transforming strategies for patient-centered care. Am J Manag Care. 2004;10:828–836. [PubMed] [Google Scholar]
  • 41.Brown SH, Lincoln MJ, Groen PJ, et al. VistA--U.S. Department of Veterans Affairs national-scale HIS. Int J Med Inf. 2003;69:135–156. [DOI] [PubMed] [Google Scholar]
  • 42.Siew ED, Ikizler TA, Matheny ME, et al. Estimating baseline kidney function in hospitalized patients with impaired kidney function. Clin J Am Soc Nephrol. 2012;7:712–719. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Charlson ME, Pompei P, Ales KL, et al. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40:373–383. [DOI] [PubMed] [Google Scholar]
  • 44.Greevy RA, Huizinga MM, Roumie CL, et al. Comparisons of persistence and durability among three oral antidiabetic therapies using electronic prescription-fill data: the impact of adherence requirements and stockpiling. Clin Pharmacol Ther. 2011;90:813–819. [DOI] [PubMed] [Google Scholar]
  • 45.Bishop CM. Pattern Recognition and Machine Learning. New York: Springer; 2006. [Google Scholar]
  • 46.Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B Methodol. 1996;58:267–288. [Google Scholar]
  • 47.Breiman L. Random forests. Mach Learn. 2001;45:5–32. [Google Scholar]
  • 48.R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2005. [Google Scholar]
  • 49.Friedman J, Hastie T, Tibshirani R. glmnet: Lasso and elastic-net regularized generalized linear models. R Package Version. 2009;1: 1–4. [Google Scholar]
  • 50.De Ville B, Neville P. Decision Trees for Analytics Using SAS Enterprise Miner. SAS Institute; 2013. Cary, North Carolina, USA. [Google Scholar]
  • 51.Harrell FE, Jr, Lee KL, Califf RM, et al. Regression modelling strategies for improved prognostic prediction. Stat Med. 1984;3:143–152. [DOI] [PubMed] [Google Scholar]
  • 52.Efron B. The Jackknife, the Bootstrap, and Other Resampling Plans. Philadelphia, PA: Society for Industrial and Applied Mathematics; 1982. [Google Scholar]
  • 53.Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29–36. [DOI] [PubMed] [Google Scholar]
  • 54.Pencina MJ, D’Agostino RB, D’Agostino RB, et al. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008;27:157–172; discussion 207–212. [DOI] [PubMed] [Google Scholar]
  • 55.Brier G. Verification of forecasts expressed in terms of probabilities. Mon Weather Rev. 1950;78:1–3. [Google Scholar]
  • 56.Lockhart R, Taylor J, Tibshirani RJ, et al. A significance test for the lasso. Ann Stat. 2014;42:413–468. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiol Camb Mass. 2010;21:128–138. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Winham SJ, Colby CL, Freimuth RR, et al. SNP interaction detection with Random Forests in high-dimensional genetic data. BMC Bioinformatics. 2012;13:164. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Chen C, Liaw A, Breiman L. Using random forest to learn imbalanced data. Univ Calif Berkeley. Technical Report 666 2004. [Google Scholar]
  • 60.Maudes J, Rodríguez JJ, García-Osorio C, et al. Random feature weights for decision tree ensemble construction. Inf Fusion. 2012;13:20–30. [Google Scholar]
  • 61.Amaratunga D, Cabrera J, Lee Y-S. Enriched random forests. Bioinforma Oxf Engl. 2008;24:2010–2014. [DOI] [PubMed] [Google Scholar]
  • 62.Winham SJ, Freimuth RR, Biernacka JM. A weighted random forests approach to improve predictive performance. Stat Anal Data Min. 2013;6:496–505. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Better OS, Abassi ZA. Early fluid resuscitation in patients with rhabdomyolysis. Nat Rev Nephrol. 2011;7:416–422. [DOI] [PubMed] [Google Scholar]
  • 64.Trivedi HS, Moore H, Nasr S, et al. A randomized prospective trial to assess the role of saline hydration on the development of contrast nephrotoxicity. Nephron Clin Pract. 2003;93:C29–C34. [DOI] [PubMed] [Google Scholar]
  • 65.Wu LN, Genge BR, Wuthier RE. Evidence for specific interaction between matrix vesicle proteins and the connective tissue matrix. Bone Miner. 1992;17:247–252. [DOI] [PubMed] [Google Scholar]
  • 66.Kellum JA, Unruh ML, Murugan R. Acute kidney injury. Clin Evid. 2011;2011:2001. [PMC free article] [PubMed] [Google Scholar]
  • 67.Mueller C, Buerkle G, Buettner HJ, et al. Prevention of contrast media-associated nephropathy: randomized comparison of 2 hydration regimens in 1620 patients undergoing coronary angioplasty. Arch Intern Med. 2002;162:329–336. [DOI] [PubMed] [Google Scholar]
  • 68.Marathias KP, Vassili M, Robola A, et al. Preoperative intravenous hydration confers renoprotection in patients with chronic kidney disease undergoing cardiac surgery. Artif Organs. 2006;30:615–621. [DOI] [PubMed] [Google Scholar]
  • 69.Grams ME, Astor BC, Bash LD, et al. Albuminuria and estimated glomerular filtration rate independently associate with acute kidney injury. J Am Soc Nephrol. 2010;21:1757–1764. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Hsu C-Y, McCulloch CE, Fan D, et al. Community-based incidence of acute renal failure. Kidney Int. 2007;72:208–212. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Bellomo R, Kellum JA, Ronco C. Acute kidney injury. Lancet. 2012;380:756–766. [DOI] [PubMed] [Google Scholar]
  • 72.Minne L, Eslami S, de Keizer N, et al. Effect of changes over time in the performance of a customized SAPS-II model on the quality of care assessment. Intensive Care Med. 2012;38:40–46. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Amarasingham R, Patzer RE, Huesch M, et al. Implementing electronic health care predictive analytics: considerations and challenges. Health Aff Proj Hope. 2014;33:1148–1154. [DOI] [PubMed] [Google Scholar]
  • 74.Hickey GL, Grant SW, Murphy GJ, et al. Dynamic trends in cardiac surgery: why the logistic EuroSCORE is no longer suitable for contemporary cardiac surgery and implications for future risk models. Eur J Cardio-Thorac Surg Off J Eur Assoc Cardio-Thorac Surg. 2013;43:1146–1152. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Siew ED, Peterson JF, Eden SK, et al. Outpatient nephrology referral rates after acute kidney injury. J Am Soc Nephrol. 2012;23:305–312. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Gerds TA, Cai T, Schumacher M. The Performance of Risk Prediction Models. Biom J. 2008;50:457–479. [DOI] [PubMed] [Google Scholar]

Articles from Journal of the American Medical Informatics Association : JAMIA are provided here courtesy of Oxford University Press
