AMIA Annual Symposium Proceedings. 2025 May 22;2024:733–737.

Identifying acute kidney injury subtypes based on serum electrolyte data in ICU via K-medoids clustering

Wentie Liu 1, Tongyue Shi 1, Haowei Xu 1,2,3, Huiying Zhao 4, Jianguo Hao 1, Guilan Kong 1,2,*
PMCID: PMC12099402  PMID: 40417583

Abstract

This study proposes using the K-medoids clustering method to identify subtypes of Intensive Care Unit (ICU)-acquired acute kidney injury (AKI) based on serum electrolyte data. Clustering analysis identified three distinct AKI subtypes with different serum electrolyte characteristics. Descriptive analysis was then used to characterize in-hospital mortality and the use of renal replacement therapy, diuretics, and vasopressors in the three subtypes, and Chi-square tests were conducted to assess differences in prognosis and treatment among them. This subclassification of AKI patients in the ICU may help ICU physicians make timely clinical decisions about AKI and may ultimately contribute to improved patient outcomes.

Introduction

Acute kidney injury (AKI) is a serious clinical condition marked by the abrupt deterioration of renal function. The high in-hospital mortality associated with AKI places substantial economic and societal burdens on affected families [1, 2]. As a clinically heterogeneous syndrome, AKI has diverse etiologies, clinical features, therapeutic responses, and prognostic outcomes. This heterogeneity has attracted attention from researchers in medicine and informatics, and several studies of AKI subtypes have been conducted.

Currently, there are two main approaches to AKI sub-classification. In the first, researchers with a medical background draw on domain knowledge and clinical experience to identify AKI subtypes, considering factors such as the etiology and comorbidities of AKI. For instance, based on anatomical distinctions, AKI can be classified as prerenal, intrinsic, or postrenal [3]. AKI can also be subtyped by disease severity, defined using serum creatinine and urine output, which yields stages 1, 2, and 3 [4]. Moreover, ICU-acquired AKI can be classified according to etiology and clinical characteristics into sepsis-related, surgery-related, ischemic, and nephrotoxic AKI [5]. Given this diversity of definitions, the era of digital health calls for more timely and pragmatic AKI subtype identification; a feasible and reasonable method could enhance our understanding of this complex clinical condition and support personalized treatment of ICU-acquired AKI.

The second approach involves researchers with an informatics or medical informatics background and bases sub-classification on large volumes of patient data. In recent years, propelled by advances in artificial intelligence and big data technologies, there has been a notable surge in data-driven studies aiming to enhance the personalized treatment of AKI. These studies apply machine learning methods to a wide range of clinical metrics, biomarkers, and genomic data to identify distinct subtypes among AKI patients, and they excel at handling large-scale, diverse datasets. Given the condition's heterogeneity and intricate progression, this data-driven approach holds promise for the timely identification of AKI subtypes [6].

Electrolyte disturbance is closely correlated with the onset and progression of AKI, and the relationship runs in both directions. Electrolyte disturbance may lead to AKI: for example, hyperkalemia affects renal tubular cell function, potentially leading to tubular necrosis in some cases [7]. Conversely, the acute impairment of renal function disrupts the excretion and reabsorption of water, sodium, and metabolic byproducts, eventually leading to their accumulation, so AKI may induce varying degrees of electrolyte disturbance [8]. Some studies have confirmed that AKI may be accompanied by various electrolyte imbalances during its course [9]. However, current studies of AKI subtypes lack dedicated investigations grounded in serum electrolyte data. From a data-driven perspective, serum electrolytes are more easily measured than biomarkers, and repeated measurements during an ICU stay can be taken at minimal cost. This study aims to identify AKI subtypes using electrolyte data and then to examine treatment and prognostic differences among the identified subtypes.

Methods

The patient samples used in this study were extracted from the Medical Information Mart for Intensive Care (MIMIC) database, a collaborative effort between the MIT Laboratory for Computational Physiology, Beth Israel Deaconess Medical Center (BIDMC), and Philips Healthcare. MIMIC was established in 2003 with funding from the National Institutes of Health; MIMIC-IV, the latest version, covers clinical data from nearly 300,000 patients between 2008 and 2019 [10]. AKI was diagnosed according to the Kidney Disease: Improving Global Outcomes (KDIGO) guideline released in 2012, which is based primarily on serum creatinine and urine output [4]. Electrolyte data were taken from the latest measurement before the diagnosis of AKI. Initially, we intended to cluster patients based on seven serum electrolytes, namely serum sodium, serum potassium, serum chloride, serum calcium, serum phosphorus, serum magnesium, and serum bicarbonate. However, after data extraction, serum calcium was found to have a missing rate of approximately 65%. Consequently, serum calcium was excluded and the final clustering variables were reduced to six.
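The extraction rule described above (take the latest measurement before the AKI diagnosis, then check how much of a variable is missing) can be sketched as follows. This is only an illustration: the function names, timestamps, and values are hypothetical and are not taken from MIMIC-IV, and the choice of a strict "before" comparison is an assumption.

```python
from datetime import datetime


def latest_before(measurements, diagnosis_time):
    """Return the value of the most recent measurement taken strictly
    before diagnosis_time, or None if there is no such measurement."""
    prior = [(t, v) for t, v in measurements if t < diagnosis_time]
    if not prior:
        return None
    return max(prior, key=lambda tv: tv[0])[1]


def missing_rate(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


# Hypothetical potassium series for one patient (mmol/l).
diagnosis = datetime(2150, 1, 2, 12, 0)
potassium = [
    (datetime(2150, 1, 1, 8, 0), 4.1),
    (datetime(2150, 1, 2, 6, 0), 5.3),
    (datetime(2150, 1, 2, 18, 0), 5.9),  # taken after diagnosis, so ignored
]
latest_potassium = latest_before(potassium, diagnosis)

# Hypothetical calcium column across four patients; a high missing rate
# is the kind of signal that led to dropping serum calcium in the study.
calcium_column = [None, 2.2, None, None]
calcium_missing = missing_rate(calcium_column)
```

A variable whose missing rate is high, as with serum calcium here, can then be dropped before clustering rather than imputed.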

We standardized the measurement units of the electrolyte variables, converting records measured in mg/dl to mmol/l. Patients for whom the recorded value of any electrolyte variable (except serum calcium) was 0 or null were excluded. For clustering, we then normalized each electrolyte variable by proportionally mapping its original values onto the range 0 to 1.
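A minimal sketch of these three preprocessing steps is given below. It assumes, for illustration only, that magnesium is the variable recorded in mg/dl; the patient records are invented, and the helper names are hypothetical. The conversion uses the standard relation mmol/l = mg/dl × 10 / molar mass.

```python
# Molar masses in g/mol; mmol/l = mg/dl * 10 / molar mass.
MOLAR_MASS = {"magnesium": 24.305, "phosphorus": 30.974}


def mg_dl_to_mmol_l(value_mg_dl, molar_mass):
    """Convert a concentration from mg/dl to mmol/l."""
    return value_mg_dl * 10.0 / molar_mass


def is_usable(value):
    """A value is usable if it is neither null nor zero."""
    return value is not None and value != 0


def drop_incomplete(patients, variables):
    """Keep only patients with a usable value for every variable."""
    return [p for p in patients if all(is_usable(p.get(v)) for v in variables)]


def min_max_scale(patients, variables):
    """Map each variable proportionally onto the range 0 to 1."""
    scaled = [dict(p) for p in patients]
    for v in variables:
        lo = min(p[v] for p in patients)
        hi = max(p[v] for p in patients)
        span = hi - lo
        for row in scaled:
            row[v] = (row[v] - lo) / span if span else 0.0
    return scaled


VARIABLES = ("sodium", "magnesium")

# Invented records: sodium in mmol/l, magnesium in mg/dl (assumption).
raw = [
    {"sodium": 138.0, "magnesium": 2.0},
    {"sodium": 150.0, "magnesium": 3.0},
    {"sodium": 0, "magnesium": 2.4},       # zero value, excluded
    {"sodium": 144.0, "magnesium": None},  # null value, excluded
    {"sodium": 132.0, "magnesium": 2.5},
]

complete = drop_incomplete(raw, VARIABLES)
converted = [
    {**p, "magnesium": mg_dl_to_mmol_l(p["magnesium"], MOLAR_MASS["magnesium"])}
    for p in complete
]
scaled = min_max_scale(converted, VARIABLES)
```

Min-max scaling keeps the relative spacing of values within each variable, so electrolytes with large numeric ranges (such as sodium) do not dominate the distance computation used in clustering.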

K-means is a commonly used clustering algorithm that partitions a dataset into a predefined number (K) of clusters, assigning each data point to the cluster with the nearest centroid. The algorithm iteratively minimizes the within-cluster sum of squared distances to determine the optimal cluster center positions. In applying K-means, the number of clusters, K, must be specified in advance. K-medoids clustering is similar to K-means but has its own advantages. Whereas K-means defines the center of a cluster as the mean of all points in the cluster, K-medoids is instance-based and defines the center of each cluster as an actual observation within the cluster. Consequently, K-medoids is more resilient to outliers and noisy data.
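The difference between a mean-based center and a medoid can be illustrated with a small sketch. The code below is a simple alternating K-medoids variant (assign points, then re-pick each medoid) written for illustration; it is not the optimized algorithm used in this study, and the data points, the function names, and the explicit `init_medoids` argument are assumptions. In practice the initial medoids are usually chosen at random or by a seeding heuristic.

```python
import math


def medoid_index(points, members):
    """Index of the member with the smallest total distance to the others."""
    return min(
        members,
        key=lambda i: sum(math.dist(points[i], points[j]) for j in members),
    )


def assign(points, medoids):
    """Label each point with the position of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: math.dist(p, points[medoids[c]]))
        for p in points
    ]


def k_medoids(points, init_medoids, max_iter=100):
    """Alternate between assignment and medoid update until stable."""
    medoids = list(init_medoids)
    for _ in range(max_iter):
        labels = assign(points, medoids)
        updated = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            updated.append(medoid_index(points, members) if members else old)
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(points, medoids)


# Two invented, well-separated groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
medoids, labels = k_medoids(points, init_medoids=[1, 4])

# Effect of one outlier on a mean-based center versus a medoid.
cluster = points[:3] + [(30.0, 30.0)]
outlier_mean = tuple(sum(col) / len(cluster) for col in zip(*cluster))
outlier_medoid = cluster[medoid_index(cluster, range(len(cluster)))]
```

In the last two lines the single outlier drags the mean far away from the three nearby points, while the medoid remains one of those points, which is the robustness property described above.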

In this study, we selected the optimized K-medoids clustering algorithm to identify subtypes of AKI. To determine the optimal value of K, we calculated three clustering model fit indexes (Silhouette Coefficient, Davies-Bouldin (DB) index, and Calinski-Harabasz index) for different values of K and also employed the elbow method. For a clustering model, a higher Silhouette Coefficient indicates better cohesion, a smaller DB index suggests better-defined clusters, and a larger Calinski-Harabasz index implies greater separation. Because the indices may not agree, choosing the number of clusters requires weighing all of them together. The elbow method identifies a turning point in the relationship between the number of clusters and the partition error. Plotting the within-cluster sum of squares (WCSS) against the number of clusters reveals the point at which the rate of decline in WCSS slows most markedly. This inflection point, where the slope changes from steep to gradual, is referred to as the elbow and indicates a potentially optimal value for K.
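For readers who want to see how these quantities are computed, the following sketch implements the standard textbook definitions of the three indices and of WCSS using only the Python standard library. It is not the code used in this study, the data and labelings are invented, and the function names are hypothetical; libraries such as scikit-learn provide equivalent metrics (`silhouette_score`, `davies_bouldin_score`, `calinski_harabasz_score`). The sketch assumes at least two clusters and no duplicate points.

```python
import math


def centroid(pts):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(col) / len(pts) for col in zip(*pts))


def group(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points (higher is better)."""
    clusters = group(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def davies_bouldin(points, labels):
    """Mean worst-case ratio of scatter to separation (lower is better)."""
    clusters = group(points, labels)
    cents = {k: centroid(v) for k, v in clusters.items()}
    scatter = {
        k: sum(math.dist(p, cents[k]) for p in v) / len(v)
        for k, v in clusters.items()
    }
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst) / len(worst)


def wcss(points, labels):
    """Within-cluster sum of squared distances to each cluster centroid."""
    clusters = group(points, labels)
    return sum(math.dist(p, centroid(v)) ** 2 for v in clusters.values() for p in v)


def calinski_harabasz(points, labels):
    """Between-cluster over within-cluster dispersion (higher is better)."""
    clusters = group(points, labels)
    overall = centroid(points)
    n, k = len(points), len(clusters)
    between = sum(
        len(v) * math.dist(centroid(v), overall) ** 2 for v in clusters.values()
    )
    return (between / (k - 1)) / (wcss(points, labels) / (n - k))


# Invented data with a natural split and a deliberately poor one.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
good_labels = [0, 0, 0, 1, 1, 1]
poor_labels = [0, 1, 0, 1, 0, 1]
```

Evaluating `wcss` on the labelings produced for K = 2 through 6 gives the curve inspected in the elbow method, while the other three functions give one value per K that can be compared side by side.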

Results

According to the definition of AKI in the 2012 KDIGO guideline, we identified 38,141 patients in MIMIC-IV who were diagnosed with AKI after being admitted to the ICU. After data cleaning and preprocessing, 23,052 patients were included in the final clustering analysis.

Figure 1 shows the line chart for the elbow method. The horizontal axis represents the number of clusters, ranging from 2 to 6, and the vertical axis represents the within-cluster sum of squares (inertia). Table 1 lists the values of the three indices (Silhouette Coefficient, DB index, and Calinski-Harabasz index) for K ranging from 2 to 6. Apart from a larger DB index for the clustering model with K=3, the other two indices favor this choice.

Figure 1.

Elbow Method for Optimal K.

Table 1.

Value of three indexes in different K-medoids clustering.

Considering the results obtained from both the elbow method and the three indices, we chose K=3 as the clustering parameter, categorizing patients into three subtypes.

Among the 23,052 ICU-acquired AKI patients studied, 8,029 (34.8%) were classified as Subtype 1, 11,497 (49.9%) as Subtype 2, and 3,526 (15.3%) as Subtype 3. The electrolyte data characteristics of these subtypes are presented in Table 2. The subtypes identified in this study showed distinctive electrolyte profiles and different prognoses. In-hospital mortality rates for Subtype 1, Subtype 2, and Subtype 3 were 8.30%, 9.52%, and 17.68%, respectively. All the electrolyte variables in Subtype 1 were within normal ranges, and patients in Subtype 1 had the lowest in-hospital mortality. Patients in Subtype 2 had elevated levels of sodium and chloride and had intermediate in-hospital mortality. Subtype 3 presented increased levels of potassium, phosphorus, and magnesium, while bicarbonate levels were notably decreased, falling below the normal threshold, and patients in this subtype had the highest in-hospital mortality. These findings underscore the heterogeneity among ICU-acquired AKI patients and support the feasibility of using serum electrolyte data to cluster AKI patients in the ICU.
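The subtype proportions reported above can be checked directly, and the form of a Chi-square test on mortality can be illustrated with the same numbers. In the sketch below, the subtype sizes and mortality rates are those reported in the text, but the death counts are an assumption: they are back-calculated from the reported rates and rounded, so the resulting statistic is only an approximation and not the value obtained in this study.

```python
import math

subtype_sizes = {"Subtype 1": 8029, "Subtype 2": 11497, "Subtype 3": 3526}
mortality_rates = {"Subtype 1": 0.0830, "Subtype 2": 0.0952, "Subtype 3": 0.1768}

total = sum(subtype_sizes.values())
shares = {name: round(100 * n / total, 1) for name, n in subtype_sizes.items()}

# ASSUMPTION: death counts reconstructed from the reported rates (rounded).
deaths = {name: round(mortality_rates[name] * n) for name, n in subtype_sizes.items()}


def chi_square_2xk(events, sizes):
    """Pearson Chi-square statistic for a 2 x k table (event vs. no event)."""
    total_n = sum(sizes.values())
    overall = sum(events.values()) / total_n
    stat = 0.0
    for name, n in sizes.items():
        cells = (
            (events[name], n * overall),
            (n - events[name], n * (1 - overall)),
        )
        for observed, expected in cells:
            stat += (observed - expected) ** 2 / expected
    return stat


chi2 = chi_square_2xk(deaths, subtype_sizes)

# With 3 subtypes and 2 outcomes there are 2 degrees of freedom, for which
# the Chi-square survival function is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)
```

With three groups and a binary outcome the test has two degrees of freedom, which is why the p-value can be written in closed form here without a statistics library.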

Table 2.

Serum electrolyte data characteristics of the three subtypes.

As for treatments, we extracted data on the treatments received by AKI patients from the time of AKI diagnosis to ICU discharge (or death in the ICU). These treatments were diuretics, renal replacement therapy, and two vasopressors (Dobutamine and Dopamine). Each treatment was represented as a binary variable, with "1" indicating that the patient received the treatment after the AKI diagnosis. Subtype 3, which had the highest in-hospital mortality, also had the highest utilization rate for all four treatments. Detailed data on the utilization of the four treatments are presented in Table 3.
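As an illustration of how binary treatment indicators translate into utilization rates per subtype, consider the sketch below. The records, field names, and function name are hypothetical and do not reproduce the study data.

```python
from collections import defaultdict

TREATMENTS = ("diuretic", "rrt", "dobutamine", "dopamine")


def utilization_by_subtype(records):
    """Share of patients in each subtype who received each treatment."""
    counts = defaultdict(lambda: {"n": 0, **{t: 0 for t in TREATMENTS}})
    for rec in records:
        row = counts[rec["subtype"]]
        row["n"] += 1
        for t in TREATMENTS:
            row[t] += rec[t]  # indicators are 0 or 1, so the sum is a count
    return {
        subtype: {t: row[t] / row["n"] for t in TREATMENTS}
        for subtype, row in counts.items()
    }


# Invented records: 1 means the treatment was received after AKI diagnosis.
records = [
    {"subtype": 1, "diuretic": 1, "rrt": 0, "dobutamine": 0, "dopamine": 0},
    {"subtype": 1, "diuretic": 0, "rrt": 0, "dobutamine": 0, "dopamine": 0},
    {"subtype": 3, "diuretic": 1, "rrt": 1, "dobutamine": 0, "dopamine": 1},
    {"subtype": 3, "diuretic": 1, "rrt": 0, "dobutamine": 1, "dopamine": 0},
]
rates = utilization_by_subtype(records)
```

Because each indicator is 0 or 1, the per-subtype mean of an indicator is the proportion of patients in that subtype who received the treatment.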

Table 3.

Utilization of the study treatments in the three subtypes.

Conclusions

Based on serum electrolyte data, this study employed K-medoids clustering to identify subtypes of ICU-acquired AKI. Three distinct AKI subtypes with significantly different serum electrolyte profiles were identified. Most previous studies on subclass analysis of critical illness used biomarkers as clustering variables; because biomarkers are rarely included in databases and are not easy to measure clinically, the applicability of those studies in the intensive care setting was poor. Serum electrolytes are easy to obtain, easy to measure, and closely related to AKI itself, which makes them good clustering variables. This study also explored differences in in-hospital mortality and treatments among the three subtypes and found that the subtypes varied significantly in both in-hospital mortality and treatment usage. This finding has the potential to aid clinicians in making appropriate intervention decisions for patients belonging to different subtypes. We found that for patients in Subtype 3, the use of diuretics and Dobutamine was associated with lower mortality rates. Because this study relies on unsupervised clustering, which, unlike supervised learning, does not use labeled outcomes, the conclusions drawn are exploratory. Nevertheless, this study offers a promising direction for personalized treatment of AKI, and further research is needed before these findings can be applied in clinical practice.

Acknowledgment

The authors thank MIT Laboratory for Computational Physiology, Beth Israel Deaconess Medical Center (BIDMC), and Philips Healthcare for providing the data. This study was supported by the National Natural Science Foundation of China [823720951], Zhejiang Provincial Natural Science Foundation of China [LZ22F020014], the Social Science Project of the Chinese Ministry of Education [22YJA630036], and the Beijing Natural Science Foundation [7212201].

References

  • [1] Susantitaphong P., Cruz D.N., Cerda J., et al. World incidence of AKI: a meta-analysis. Clin J Am Soc Nephrol. 2013;8(9):p. 1482–93. doi: 10.2215/CJN.00710113.
  • [2] Silver S.A., Chertow G.M. The Economic Consequences of Acute Kidney Injury. Nephron. 2017;137(4):p. 297–301. doi: 10.1159/000475607.
  • [3] Jacob J., Dannenhoffer J., Rutter A. Acute Kidney Injury. Prim Care. 2020;47(4):p. 571–584. doi: 10.1016/j.pop.2020.08.008.
  • [4] Khwaja A. KDIGO clinical practice guidelines for acute kidney injury. Nephron Clin Pract. 2012;120(4):p. c179–84. doi: 10.1159/000339789.
  • [5] Kellum J.A., Prowle J.R. Paradigms of acute kidney injury in the intensive care setting. Nat Rev Nephrol. 2018;14(4):p. 217–230. doi: 10.1038/nrneph.2017.184.
  • [6] Atreya M.R., Sanchez-Pinto L.N., Kamaleswaran R. Commentary: ‘Critical illness subclasses: all roads lead to Rome’. Crit Care. 2022;26(1):p. 387. doi: 10.1186/s13054-022-04265-w.
  • [7] Lombardi G., Gambaro G., Ferraro P.M. Serum Potassium Disorders Predict Subsequent Kidney Injury: A Retrospective Observational Cohort Study of Hospitalized Patients. Kidney Blood Press Res. 2022;47(4):p. 270–276. doi: 10.1159/000521833.
  • [8] Goyal A., Daneshpajouhnejad P., Hashmi M.F., et al. StatPearls. StatPearls Publishing; 2023. Acute Kidney Injury.
  • [9] Erfurt S., Lehmann R., Matyukhin I., et al. Stratification of Acute Kidney Injury Risk, Disease Severity, and Outcomes by Electrolyte Disturbances. J Clin Med Res. 2023;15(2):p. 59–67. doi: 10.14740/jocmr4832.
  • [10] Johnson A.E., Bulgarelli L., Shen L., et al. MIMIC-IV, a freely accessible electronic health record dataset. Scientific data. 2023;10(1):p. 1. doi: 10.1038/s41597-022-01899-x.

Articles from AMIA Annual Symposium Proceedings are provided here courtesy of American Medical Informatics Association
