Abstract
Chronic Kidney Disease (CKD) is a significant global public health issue, affecting over 10% of the population. Timely diagnosis is crucial for effective management. Leveraging machine learning within healthcare offers promising advancements in predictive diagnostics. We developed a Web-Based Clinical Decision Support System (CDSS) for CKD, incorporating advanced Explainable AI (XAI) methods, specifically SHAP (Shapley Additive Explanations) and LIME (Local Interpretable Model-agnostic Explanations). We trained and evaluated multiple classifiers, KNN, Random Forest, AdaBoost, XGBoost, CatBoost, and Extra Trees, to predict CKD. The effectiveness of the models is assessed by accuracy, confusion-matrix statistics, and the area under the ROC curve (AUC). AdaBoost achieved a 100% accuracy rate. Except for KNN, all classifiers consistently reached perfect precision and specificity. Additionally, we present a real-time web-based application that operationalizes the model, enhancing trust and accessibility for healthcare practitioners and stakeholders.
Keywords: Chronic Kidney Disease, Explainable AI, Web-Based Clinical Decision Support System
Introduction
Chronic kidney disease (CKD) is a progressive and critical health issue that impacts more than 800 million people worldwide1. CKD imposes a significant economic burden on healthcare systems due to its high diagnosis costs. Early detection and proper management of CKD are therefore needed to slow disease progression and to improve patient outcomes and the healthcare system overall.
Nowadays, Machine Learning (ML) has shown remarkable success in analyzing large datasets to uncover patterns and make predictions with clear explanations, supporting healthcare providers in their decision-making processes. Many studies have implemented ML models for CKD prediction and classification. Liu et al. applied a random forest (RF) model to classify CKD and non-CKD among 40,686 people, achieving around 93.15% accuracy3. In another study, Qin et al. tested six ML models on a CKD dataset with 400 samples and reported that RF performed better than the other models, with an accuracy of 99.75%2. Elhoseny et al. introduced a method known as Density-based Feature Selection combined with Ant Colony Optimization (D-ACO), which improved classification accuracy to 95% using selected features4. Aljaaf et al. used a novel multiple-imputation approach to handle missing values and implemented a Multilayer Perceptron (MLP) neural network that achieved 98.1% prediction accuracy6. Gunarathne et al. tested various classification models and reported that the Multiclass Decision Forest algorithm provides 99.1% accuracy5. ML plays an important role in healthcare, but its implementation has faced challenges due to its “black box” nature. These models may provide accurate predictions but often lack transparency, making it difficult for healthcare providers to understand how a prediction was made. This lack of explainability poses a major challenge in clinical environments, where trust and accountability are crucial. Explainable AI (XAI) plays a key role in addressing these issues, making ML models more transparent and trustworthy. Two techniques, SHAP (Shapley Additive Explanations)7,8 and LIME (Local Interpretable Model-agnostic Explanations)9, allow clinicians to visualize and identify the important features in patients’ data that influence a model’s prediction, building trust between clinicians and ML systems.
Our research focused on building a web-based Clinical Decision Support System (CDSS) to assist clinicians in diagnosis by explaining which features have the greatest impact on the disease prediction. We created AI-based predictive tools to assess the risk of CKD and implemented an explainable dashboard to enhance the transparency of the ML-generated risk assessments.
Methods
We created an end-to-end web-based ML system that can explain its predictions for CKD. Figure 1 illustrates the proposed architecture of our work, which consists of several stages. The pipeline begins with data collection from hospital patient reports and concludes with the prediction output.
Figure 1.
Our proposed ML workflow for medical application. The phases include data collection, preprocessing and analysis, model selection and training, hyperparameter tuning, model evaluation, and web application design. The final model is deployed as a Flask application, enabling end users to submit prediction requests and receive scores.
Data Source and Characteristics
We utilized a publicly available dataset from the UCI Machine Learning Repository10 to develop and validate our prediction model. The dataset is multivariate, comprising 400 samples with 24 features (11 numerical and 13 categorical variables), along with a binary target variable indicating CKD status (“1” for CKD and “0” for non-CKD). The dataset was collected over a two-month period from hospital records.
Numerical variables such as Age, Serum Creatinine, and Hemoglobin are reported as mean ± standard deviation (SD). Categorical variables are presented as counts and percentages. A detailed statistical summary is provided in Table 1.
Table 1:
Summary Statistics and Comparative Analysis of Clinical and Laboratory Variables in CKD Dataset
| Variable | Level / Range | All Data (N = 400) | No (N = 150) | Yes (N = 250) | p-value |
|---|---|---|---|---|---|
| Albumin | 0 | 229 (57.2%) | 149 (99.3%) | 80 (32.0%) | <0.001 |
| | 1 | 49 (12.2%) | 0 (0.0%) | 49 (19.6%) | |
| | 2 | 48 (12.0%) | 0 (0.0%) | 48 (19.2%) | |
| | 3 | 47 (11.8%) | 0 (0.0%) | 46 (18.4%) | |
| | 4 | 26 (6.5%) | 1 (0.7%) | 26 (10.4%) | |
| | 5 | 1 (0.2%) | 0 (0.0%) | 1 (0.4%) | |
| Sugar | 0 | 334 (83.5%) | 150 (100.0%) | 184 (73.6%) | 0.187 |
| | 1 | 14 (3.5%) | 0 (0.0%) | 14 (5.6%) | |
| | 2 | 20 (5.0%) | 0 (0.0%) | 20 (8.0%) | |
| | 3 | 15 (3.8%) | 0 (0.0%) | 15 (6.0%) | |
| | 4 | 14 (3.5%) | 0 (0.0%) | 14 (5.6%) | |
| | 5 | 3 (0.8%) | 0 (0.0%) | 3 (1.2%) | |
| Red Blood Cells | 0 | 68 (17.0%) | 2 (1.3%) | 66 (26.4%) | <0.001 |
| | 1 | 332 (83.0%) | 148 (98.7%) | 184 (73.6%) | |
| Pus Cell | 0 | 88 (22.0%) | 1 (0.7%) | 87 (34.8%) | <0.001 |
| | 1 | 312 (78.0%) | 149 (99.3%) | 163 (65.2%) | |
| Pus Cell Clumps | 0 | 358 (89.5%) | 150 (100.0%) | 208 (83.2%) | <0.001 |
| | 1 | 42 (10.5%) | 0 (0.0%) | 42 (16.8%) | |
| Bacteria | 0 | 378 (94.5%) | 150 (100.0%) | 228 (91.2%) | <0.001 |
| | 1 | 22 (5.5%) | 0 (0.0%) | 22 (8.8%) | |
| Hypertension | 0 | 253 (63.2%) | 150 (100.0%) | 103 (41.2%) | 0.826 |
| | 1 | 147 (36.8%) | 0 (0.0%) | 147 (58.8%) | |
| Diabetes Mellitus | 0 | 263 (65.8%) | 150 (100.0%) | 113 (45.2%) | 0.339 |
| | 1 | 137 (34.2%) | 0 (0.0%) | 137 (54.8%) | |
| Coronary Artery Disease | 0 | 366 (91.5%) | 150 (100.0%) | 216 (86.4%) | <0.001 |
| | 1 | 34 (8.5%) | 0 (0.0%) | 34 (13.6%) | |
| Appetite | 0 | 318 (79.5%) | 150 (100.0%) | 168 (67.2%) | <0.001 |
| | 1 | 82 (20.5%) | 0 (0.0%) | 82 (32.8%) | |
| Pedal Edema | 0 | 324 (81.0%) | 150 (100.0%) | 174 (69.6%) | <0.001 |
| | 1 | 76 (19.0%) | 0 (0.0%) | 76 (30.4%) | |
| Anaemia | 0 | 340 (85.0%) | 150 (100.0%) | 190 (76.0%) | |
| | 1 | 60 (15.0%) | 0 (0.0%) | 60 (24.0%) | |
| Age (years) | 2.0–90.0 | 51.48 (17.17) | 46.60 (15.61) | 54.37 (17.35) | <0.001 |
| Blood Pressure (mmHg) | 50.0–180.0 | 76.47 (13.68) | 71.40 (8.52) | 79.52 (15.20) | <0.001 |
| Specific Gravity | 1.005–1.025 | 1.01790 (0.00553) | 1.02213 (0.00303) | 1.01466 (0.00507) | <0.001 |
| Blood Glucose Random (mg/dL) | 22.0–490.0 | 145.33 (77.51) | 109.68 (24.20) | 170.98 (91.46) | <0.001 |
| Blood Urea (mg/dL) | 1.5–391.0 | 52.09 (48.97) | 33.87 (14.70) | 70.31 (57.83) | <0.001 |
| Serum Creatinine (mg/dL) | 0.4–76.0 | 3.11 (5.75) | 0.90 (0.33) | 4.33 (6.82) | <0.001 |
| Sodium (mEq/L) | 4.5–163.0 | 137.93 (10.01) | 141.71 (4.81) | 134.93 (11.22) | <0.001 |
| Potassium (mEq/L) | 2.5–47.0 | 4.62 (2.52) | 4.33 (0.60) | 4.90 (4.46) | <0.001 |
| Haemoglobin (g/dL) | 3.1–17.8 | 12.58 (2.56) | 15.07 (1.43) | 11.03 (2.45) | <0.001 |
| Packed Cell Volume | 9.0–54.0 | 39.04 (8.75) | 46.12 (4.42) | 34.51 (7.97) | <0.001 |
| White Blood Cell Count (cells/cmm) | 2200.0–264000 | 8392.13 (3017.13) | 7776.67 (1852.78) | 8831.60 (3333.04) | <0.001 |
| Red Blood Cell Count (millions/cmm) | 2.1–8.0 | 4.70 (0.99) | 5.35 (0.65) | 4.34 (1.03) | <0.001 |
Missing data, present in several columns, were addressed using a two-pronged imputation strategy based on the proportion of missing values:
- Numerical Columns: Missing values in all numerical columns were imputed using Random Sampling Imputation to preserve variability and distribution within the dataset.
- Categorical Columns:
- High missingness columns (red_blood_cells and pus_cell) were imputed using Random Sampling.
- Remaining categorical columns were imputed using Mode Imputation to handle low proportions of missing values efficiently.
This approach ensured that no information unavailable at the time of prediction (e.g., target variable CKD status) was used during imputation, thereby minimizing potential data leakage and preserving the integrity of the model.
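The two-pronged strategy above can be sketched in pandas as follows. The toy frame and column names are illustrative, not the real dataset; the key point is that random sampling draws only from observed values in the same column, so the fill preserves each column's empirical distribution without touching the target.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

def random_sample_impute(s: pd.Series) -> pd.Series:
    """Fill NaNs by sampling from the column's observed values,
    preserving its variability and distribution."""
    s = s.copy()
    missing = s.isna()
    s.loc[missing] = rng.choice(s.dropna().to_numpy(), size=missing.sum())
    return s

def mode_impute(s: pd.Series) -> pd.Series:
    """Fill NaNs with the most frequent observed value."""
    return s.fillna(s.mode().iloc[0])

# Toy frame with missing values (hypothetical rows for illustration).
df = pd.DataFrame({
    "haemoglobin": [12.5, np.nan, 15.1, 11.0, np.nan],
    "red_blood_cells": ["normal", None, "abnormal", "normal", None],
    "hypertension": ["yes", "no", None, "no", "no"],
})

df["haemoglobin"] = random_sample_impute(df["haemoglobin"])          # numerical column
df["red_blood_cells"] = random_sample_impute(df["red_blood_cells"])  # high missingness
df["hypertension"] = mode_impute(df["hypertension"])                 # low missingness
print(df.isna().sum().sum())  # no missing values remain
```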
Table 1 shows that the cohort consisted of 229 patients (57.2%) with an albumin level of 0, showing a significant difference between CKD and non-CKD groups (p < 0.001). The presence of sugar in urine was also a key variable, with 334 patients (83.5%) having a sugar level of 0. Notably, 100.0% of non-CKD patients had a sugar level of 0 compared to 73.6% of CKD patients, although the p-value was 0.187, indicating a weaker association. Red Blood Cells (‘rbc’) were examined, revealing that 332 patients (83.0%) had an abnormal ‘rbc’ count (level 1). However, only 1.3% (2 patients) of non-CKD patients had a low ‘rbc’ count (level 0), compared to 26.4% of CKD patients, with a highly significant p-value of <0.001. Similar trends were observed with Pus Cell (‘pc’) and Pus Cell Clumps (‘pcc’), where significant differences were observed between the groups (p < 0.001). For other categorical variables, such as Bacteria, Hypertension (‘htn’), Diabetes Mellitus (‘dm’), Coronary Artery Disease (‘cad’), Appetite (‘appet’), Pedal Edema (‘pe’), and Anaemia (‘ane’), differences were observed between the CKD and non-CKD groups, particularly in the presence of Bacteria (p < 0.001) and Appetite (p < 0.001). For instance, all 150 non-CKD patients (100.0%) were without ‘pe’, compared to 69.6% of CKD patients.
Albumin and sugar levels, measured on a scale from 0 to 5, are critical indicators in chronic kidney disease (CKD). Elevated albumin signifies protein leakage due to kidney damage, while high sugar levels indicate glucose spillage, often linked to diabetes—a major risk factor for CKD. For the other variables, 0 means no and 1 means yes.
In terms of numerical variables, the mean age of the cohort was 51.48 years. Non-CKD patients were younger on average (mean 46.60 years) than CKD patients (mean 54.37 years), a statistically significant difference (p < 0.001). Blood pressure also varied significantly between the groups: non-CKD patients had a lower average blood pressure of 71.40 mmHg, while CKD patients averaged 79.52 mmHg, also with a p-value < 0.001.
Moreover, several laboratory measures demonstrated notable differences between the groups. For instance, Blood Glucose Random levels were significantly lower in non-CKD patients (mean 109.68) compared to CKD patients (mean 170.98), with a p-value of less than 0.001. Similarly, Serum Creatinine, Sodium, Potassium, Hemoglobin, Packed Cell Volume, White Blood Cell Count, and Red Blood Cell Count all showed significant differences between CKD and non-CKD patients, with p-values of less than 0.001 for each variable.
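The group comparisons above can be reproduced with pandas and SciPy. The mini data frame below is hypothetical, standing in for the full dataset; Welch's t-test for numerical variables and a chi-square test for categorical variables are one common choice for such p-values and are used here as an assumption, not as the study's stated method.

```python
import pandas as pd
from scipy import stats

# Hypothetical mini-frame; 'ckd' is the binary target (1 = CKD, 0 = non-CKD).
df = pd.DataFrame({
    "age": [46, 60, 48, 70, 51, 65, 44, 58],
    "hypertension": [0, 1, 0, 1, 0, 1, 0, 1],
    "ckd": [0, 1, 0, 1, 0, 1, 0, 1],
})

# Numerical variable: mean and SD per group, Welch's t-test for the p-value.
summary = df.groupby("ckd")["age"].agg(["mean", "std"])
t_stat, p_num = stats.ttest_ind(
    df.loc[df.ckd == 1, "age"], df.loc[df.ckd == 0, "age"], equal_var=False
)

# Categorical variable: counts per group, chi-square test of independence.
counts = pd.crosstab(df["hypertension"], df["ckd"])
chi2, p_cat, dof, _ = stats.chi2_contingency(counts)

print(summary.round(2))
print(f"age p-value: {p_num:.3f}; hypertension p-value: {p_cat:.3f}")
```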
Exploratory Data Analysis (EDA)
Figure 2 illustrates the distribution of various health metrics, revealing skewness or bimodality in many cases, indicating uneven data distribution. For instance, age and blood urea are right-skewed, suggesting a predominance of younger individuals and a minority with markedly elevated urea levels. Albumin exhibits a bimodal pattern, possibly indicating two distinct groups within the data. Long tails in distributions such as sugar suggest the presence of outliers that could affect the analysis. Specifically, the age distribution is right-skewed, with most individuals younger than 50. Blood pressure is centered around 80–90 mmHg, and specific gravity peaks at 1.020, both approximately normally distributed. Albumin has bimodal peaks around 2.5 and 3.5. Sugar displays a long right tail, indicating a few high readings. Blood Glucose Random centers at 100 mg/dL, while Blood Urea is right-skewed, peaking around 10 mg/dL. Serum Creatinine is also right-skewed, peaking around 1 mg/dL. Sodium, Potassium, Hemoglobin, and Packed Cell Volume all exhibit approximately normal distributions, with peaks at 140 mEq/L, 4 mEq/L, 12 g/dL, and 40%, respectively. White Blood Cell Count is right-skewed, peaking around 8000 cells/cmm, while Red Blood Cell Count is normally distributed with a peak around 5 millions/cmm.
Figure 2.
Analyzing the distribution of health metrics reveals uneven data distribution and potential outliers, particularly in age, albumin, and sugar.
Figure 3 illustrates the skewness of several categorical features. Notably, a higher proportion of patients are categorized as ‘normal’ across most of the variables, indicating an imbalance between normal and abnormal values. This imbalance means the dataset contains a larger share of normal readings for these features, which may affect the distribution of predictions and model performance.
Figure 3.
Distribution of Ten Clinical Features Stratified by CKD Status. Each subplot shows the frequency count of CKD (salmon color) and non-CKD (light blue color) patients for each categorical variable. The analysis includes Red Blood Cells, Pus Cell, Pus Cell Clumps, Bacteria, Hypertension, Diabetes Mellitus, Coronary Artery Disease, Appetite, Pedal Edema, and Anemia.
Data Pre-processing
To prepare the dataset, several preprocessing steps were undertaken. First, the ‘id’ column, which held no analytical significance, was removed to streamline the dataset. To make the columns more understandable and user friendly, we renamed them. Ambiguous or inconsistent values within the categorical variables, such as ‘\tno’ and ‘ yes’, were standardized to ‘no’ and ‘yes’. Columns initially identified as object types, including ‘packed_cell_volume’, ‘white_blood_cell_count’, and ‘red_blood_cell_count’, were converted to appropriate numerical data types to ensure accurate processing. In addition, different techniques were utilized to handle missing data based on the nature of each variable: missing values in numerical columns were imputed by randomly sampling from the existing data within the same column, while categorical variables had their missing values filled with the mode, or most frequent value, of the column. Finally, all categorical variables, each containing only two unique values, were encoded using Label Encoding. This transformation was essential to ensure compatibility with machine learning algorithms that require numerical input.
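The cleaning steps above can be sketched as follows. The three-row frame and its dirty tokens are illustrative assumptions mimicking the raw-file quirks described, not the actual dataset contents.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Illustrative rows with the kinds of quirks described above (hypothetical values).
df = pd.DataFrame({
    "id": [0, 1, 2],
    "hypertension": ["\tno", " yes", "no"],
    "packed_cell_volume": ["44", "38", "\t?"],  # numbers stored as strings
})

df = df.drop(columns=["id"])  # the 'id' column holds no analytical value
# Standardize ambiguous categorical values such as '\tno' and ' yes'.
df["hypertension"] = df["hypertension"].str.strip()
# Convert object-typed lab values to numeric; unparseable tokens become NaN
# and are handled later by imputation.
df["packed_cell_volume"] = pd.to_numeric(
    df["packed_cell_volume"].str.strip(), errors="coerce"
)
# Label-encode the binary categorical for ML compatibility ('no' -> 0, 'yes' -> 1).
df["hypertension"] = LabelEncoder().fit_transform(df["hypertension"])
print(df)
```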
ML Model Building
In this study, we employed several ML models to classify patients, based on their health metrics, as either ‘CKD’ or ‘healthy’. The models used include KNN, Random Forest, AdaBoost, XGBoost, CatBoost, and Extra Trees. These were chosen for their consistently low error rates across different datasets in prior CKD studies.
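A comparative training loop of this kind can be sketched with scikit-learn. Synthetic data stands in for the CKD dataset here, and XGBoost and CatBoost are omitted because they live in separate third-party packages, though their classifiers plug into the same dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the 400-sample, 24-feature CKD dataset.
X, y = make_classification(n_samples=400, n_features=24, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "KNN": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Extra Trees": ExtraTreesClassifier(random_state=0),
}
# Fit each model and record held-out accuracy for comparison.
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.2%}")
```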
Explainability
After model training, Shapley values from the SHAP Python package7,8 and LIME9 were used to interpret the predictions. For a given feature, say ‘blood_pressure’, the SHAP value for an instance is calculated as:

$$\phi_{\text{blood\_pressure}} = \sum_{S \subseteq F \setminus \{\text{blood\_pressure}\}} \frac{|S|!\,(|F| - |S| - 1)!}{|F|!} \left[ f\!\left(S \cup \{\text{blood\_pressure}\}\right) - f(S) \right]$$

Where:
$\phi_{\text{blood\_pressure}}$ is the SHAP value for the ‘blood_pressure’ feature.
F is the set of all features, such as ‘age’, ‘specific_gravity’, ‘albumin’, ‘sugar’, etc.
S is a subset of features excluding ‘blood_pressure’.
f(S) is the model’s prediction using only the features in subset S.
f(S ∪ {blood_pressure}) is the model’s prediction with ‘blood_pressure’ added to subset S.
|S| is the number of features in subset S.
|F| is the total number of features.
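To make the factorial weighting concrete, the Shapley value can be computed exactly for a toy model with a handful of features. The sketch below is illustrative only: the `contrib` dictionary, the feature names, and the additive definition of f(S) are assumptions, not the study's model. It enumerates every subset S and applies the same weight as in the formula.

```python
from itertools import combinations
from math import factorial

# Toy additive model over three features so the exact Shapley sum is tractable.
# f(S) is a hypothetical "prediction with only the features in S known";
# here it simply sums the known feature contributions.
contrib = {"age": 0.2, "blood_pressure": 0.5, "albumin": 0.3}
F = list(contrib)

def f(S):
    return sum(contrib[feat] for feat in S)

def shap_value(target):
    """Exact Shapley value: weighted marginal contribution over all subsets."""
    others = [feat for feat in F if feat != target]
    phi = 0.0
    for r in range(len(others) + 1):
        for S in combinations(others, r):
            weight = (factorial(len(S)) * factorial(len(F) - len(S) - 1)
                      / factorial(len(F)))
            phi += weight * (f(S + (target,)) - f(S))
    return phi

print(shap_value("blood_pressure"))  # equals its own contribution for an additive model
```

A useful sanity check is the efficiency property: the Shapley values of all features sum to the full-model prediction f(F).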
To approximate the model’s prediction locally around a specific instance x (such as a patient’s data), LIME fits a simpler, interpretable model, such as a linear model, to perturbed data:

$$\xi(x) = \underset{g \in G}{\arg\min} \sum_{z, z' \in Z} \pi_x(z) \left( f(z) - g(z') \right)^2 + \Omega(g)$$

Where:
g: the simple interpretable model (e.g., a linear model) fitted to the perturbed data.
G: the class of potential simple models.
Z: the set of all perturbed instances generated around the original instance x.
πx(z): proximity measure between the original instance x and a perturbed instance z.
f(z): prediction of the complex model for a perturbed instance with features such as age, blood_pressure, albumin, haemoglobin, etc.
g(z'): prediction of the simple model for the interpretable representation z' of the perturbed instance.
Ω(g): complexity penalty to ensure the simplicity of g.
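This optimization can be illustrated with a minimal from-scratch sketch (not the lime library itself): perturb the instance, weight the samples by proximity, and fit a weighted linear surrogate. The black-box f, the kernel width, and the feature meanings are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Hypothetical black-box model over two features (say, haemoglobin and albumin):
# increasing in the first feature, decreasing in the second.
def f(X):
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.5 * X[:, 1])))

x = np.array([0.4, 0.7])                      # instance to explain
Z = x + rng.normal(scale=0.3, size=(500, 2))  # perturbed neighbourhood around x
pi = np.exp(-np.sum((Z - x) ** 2, axis=1) / 0.25)  # proximity kernel pi_x(z)

# Fit the simple interpretable model g, weighted by proximity; the Ridge
# penalty plays the role of the complexity term Omega(g).
g = Ridge(alpha=1.0).fit(Z, f(Z), sample_weight=pi)
print("local feature weights:", g.coef_)
```

The signs of the surrogate's coefficients recover the local direction of each feature's influence, which is what a LIME explanation reports.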
Application Development
In this section, we detail the architecture and development process of the application, emphasizing both the frontend and backend components, as depicted in the system diagram in Figure 4.
Figure 4.
Overview of proposed ML Web Application Architecture: The frontend, built with TypeScript and Angular, interacts with a backend via RESTful HTTP APIs. The backend, developed with Python and Flask, handles model training and prediction using scikit-learn, saving the model as a .pkl file for deployment.
Frontend: The frontend of the application is designed to provide an intuitive and responsive user interface (UI) for interacting with the machine learning model. It is built using HTML, CSS and Bootstrap. The key components of the frontend include:
UI Component: The user interface is a single page consisting of various components for interacting with the user. These components are responsible for collecting user input, displaying prediction results, and ensuring a smooth user experience.
HTTP API Service: The HTTP API Service is a bridge between frontend and backend. This service handles all HTTP requests and responses, allowing the frontend to interact with the backend RESTful APIs. The service sends user inputs for model training or prediction and receives responses which are then rendered on the UI.
Backend: The backend of the application handles core functionalities such as model training and prediction. It is built using Python and utilizes the Flask web framework to create RESTful APIs. The backend is organized as follows:
/api/train: This API endpoint is dedicated to model training. When requested, it starts the training process using the provided dataset and parameters. Once the model is trained, it is serialized and saved as a .pkl file, allowing for future use without the need for retraining.
/api/predict: This API endpoint is used for making predictions. It loads the pre-trained model from the .pkl file and processes incoming data to generate predictions. The results are then sent back to the frontend through the HTTP API service.
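A minimal sketch of the /api/predict endpoint is shown below. It is an assumption-laden illustration, not the deployed CDSS: the stand-in logistic model, the single "features" field, and the model.pkl path are all hypothetical, and in the real system the .pkl file would be produced by /api/train.

```python
import pickle

from flask import Flask, jsonify, request
from sklearn.linear_model import LogisticRegression

# Train and serialize a stand-in model so the sketch is self-contained;
# in the real system /api/train would produce this .pkl file.
demo = LogisticRegression().fit([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])
with open("model.pkl", "wb") as fh:
    pickle.dump(demo, fh)

app = Flask(__name__)

@app.route("/api/predict", methods=["POST"])
def predict():
    with open("model.pkl", "rb") as fh:  # load the pre-trained model
        model = pickle.load(fh)
    features = request.get_json()["features"]  # e.g. [albumin, blood_pressure, ...]
    return jsonify({"ckd": int(model.predict([features])[0])})

# Exercise the endpoint with Flask's built-in test client.
client = app.test_client()
resp = client.post("/api/predict", json={"features": [2.5]})
print(resp.get_json())
```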
Results and Discussion
We use accuracy, precision, recall, sensitivity, specificity, and AUC to compare the performance of the ML models, as shown in Table 2. The AdaBoost Classifier outperforms all other models with a perfect accuracy of 100%, 100% precision, and a recall of 99.83%. Other ensemble models such as CatBoost and XGBoost also perform well, with accuracies around 96–98% and AUC values close to 100%, indicating strong classification capabilities. In contrast, the KNN model underperforms with an accuracy of 65.83% and much lower precision and recall, suggesting it is less suitable for this dataset than the ensemble-based models.
Table 2.
Performance metrics for CKD prediction models.
| Model | Accuracy | Precision | Recall | Sensitivity | Specificity | AUC |
|---|---|---|---|---|---|---|
| Ada Boost Classifier | 100% | 100% | 99.83% | 96.83% | 100% | 100% |
| Cat Boost | 98.33% | 100% | 95.83% | 95.83% | 100% | 99.80% |
| XGBoost | 96.67% | 100% | 91.67% | 91.67% | 100% | 99.88% |
| Extra Trees Classifier | 96.67% | 100% | 91.67% | 91.67% | 100% | 99.74% |
| Random Forest Classifier | 95.83% | 100% | 89.58% | 89.58% | 100% | 100% |
| KNN | 65.83% | 56.14% | 66.67% | 66.67% | 65.28% | 73.96% |
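Metrics of this kind can be reproduced from a confusion matrix and predicted probabilities with scikit-learn; the labels and probabilities below are hypothetical, not the study's test set.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, roc_auc_score)

# Hypothetical held-out labels and predicted probabilities.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.8, 0.7, 0.1, 0.95, 0.3, 0.05, 0.85, 0.4])
y_pred = (y_prob >= 0.5).astype(int)  # threshold probabilities at 0.5

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall/sensitivity": recall_score(y_true, y_pred),  # TP / (TP + FN)
    "specificity": tn / (tn + fp),                       # TN / (TN + FP)
    "auc": roc_auc_score(y_true, y_prob),                # uses probabilities, not labels
}
print(metrics)
```

Note that recall and sensitivity are the same quantity computed on the positive class, while specificity is its mirror on the negative class.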
The left side of Figure 5 shows the confusion matrix, and the right side shows the ROC curve. The confusion matrix depicts the model’s performance in classifying CKD and healthy individuals, with every case falling into the true-positive or true-negative cells and no misclassifications. The ROC curve further confirms the model’s excellent performance with an AUC of 1.00, meaning the model perfectly separates CKD patients from healthy individuals. This level of accuracy highlights the robustness and reliability of the model in clinical applications for CKD prediction.
Figure 5.
Confusion Matrix and ROC Curve. The model perfectly classifies CKD with an AUC of 1.00 and no misclassifications.
Figure 6 is a screenshot of the web application developed for CKD prediction, featuring multiple sections. The top right section is the input interface, allowing users to enter various clinical parameters (e.g., Albumin, Blood Pressure) to make predictions. At the bottom, the system outputs a CKD prediction along with visual explanations using SHAP7,8 and LIME9. The SHAP force plot and bar chart highlight the top features that most influenced the prediction, while the LIME explanation illustrates how individual features contributed to the classification. The top five features contributing to the prediction are Hemoglobin, Specific Gravity, Serum Creatinine, Albumin, and Packed Cell Volume, reflecting that these urine and blood biomarkers are important for CKD screening and prediction, as also reported in other studies19-22. This interface provides healthcare professionals with an accessible tool to interpret AI-driven decisions with transparency and trust.
Figure 6.
CKD Prediction Web App. The app allows clinical inputs, predicts CKD, and provides SHAP/LIME explanations for transparency.
Compared to the previous studies listed in Table 3, our methods not only perform well with all features but also emphasize explainability, trust-building, and the provision of a web-based decision support system tailored to the healthcare environment. This ensures that clinicians can rely on the AI model not just for accurate predictions but also for transparent and understandable insights. This makes our method more suitable for practical, real-world healthcare applications, where trust in AI is often a deciding factor for its use.
Table 3.
Limitations of Related CKD Studies.
| Ref | Data Size (Train:Test) | Features | Accuracy | Limitations | LIME | SHAP | Web Application |
|---|---|---|---|---|---|---|---|
| Qin et al. [2] | 400 | 25 | 99.83% | No time complexity analysis, hindering resource planning, scalability, and real-time suitability. | No | No | No |
| Elhoseny et al. [4] | 400 (80: 20) | 25 | 95% | Lack of cross-validation settings may limit findings’ generalizability. | No | No | No |
| Gunarathne et al. [5] | 400 (70: 30) | 14 | 99.1% | The strength of the data is low due to the small size of the dataset and the presence of missing values. | No | No | No |
| Aljaaf et al. [6] | 400 (60: 40) | 25 | 98.1% | Limited exploration of alternative feature selection methods beyond PCA, impacting result robustness. Lack of cross-validation affects reliability. | No | No | No |
| Akter et al. [11] | 400 (80: 20) | 25 | 99% | Limited exploration of alternative feature selection methods beyond PCA, impacting result robustness. | No | No | No |
| Chittora et al. [12] | 400 (50:50) | 25 | 99.6% | Lack of cross-validation raises reliability and generalizability concerns. | No | No | No |
| Halder et al. [13] | 400 (70: 30) | 25 | 100% | Significant risk of overfitting with complex models like RF and XGBoost. | No | No | Yes |
| Islam et al. [14] | 400 (70: 30) | 25 | 98.3% | Limited classifier variety restricts performance exploration. | No | No | No |
| Zheng et al. [15] | 491 (80:20) | 21 | AUC: 87% | The effectiveness of this approach varies across different datasets. | No | No | No |
| Ghosh et al. [16] | 491 (70:30) | 20 | 93.29% | No exploration of ensemble models, limiting model diversity. | Yes | Yes | No |
| Xiao et al. [17] | 591 (80:20) | 19 | 83% | Overfitting risk due to lack of regularization techniques. | No | No | Yes |
| Poonia et al. [18] | 400 (80: 20) | 25 | 97.5% | Absence of comprehensive feature selection strategy. | No | No | No |
Conclusion
This study develops a web-based CDSS that effectively predicts CKD using ML models and incorporates XAI techniques. We developed and utilized a range of classifiers, including KNN, Random Forest, AdaBoost, XGBoost, CatBoost, and Extra Trees, applied to process CKD datasets to determine the most effective model. The AdaBoost model achieved the highest accuracy, highlighting the strength of ensemble methods in medical diagnosis. By integrating SHAP and LIME for interpretability, the system addresses the “black box” issue in AI, allowing clinicians to understand and trust the model’s predictions. The web-based interface provides an accessible, user-friendly platform for real-time CKD prediction and explanation, enhancing clinical decision-making and promoting the integration of AI into healthcare practices.
Funding
This study is funded by the National Heart Lung and Blood Institute under award number 1R01HL175410.
Data Availability
The data sets used or analyzed in this study are available from GitHub (https://github.com/krishnamridhacase/Kidney_AMIA)
Authors’ Contributions
MW and LZ obtained the funding and supervised KM. KM, MW and LZ conceived the idea and designed the experiments. KM analyzed the data and implemented the analysis. KM, MW and LZ contributed to the writing of the manuscript. All authors participated in the discussion, revision, and approval of the final manuscript.
References
- 1. Kovesdy CP. Epidemiology of chronic kidney disease: an update 2022. Kidney Int Suppl (2011). 2022;12(1):7–11. doi: 10.1016/j.kisu.2021.11.003. PMID: 35529086.
- 2. Qin J, Chen L, Liu Y, Liu C, Feng C, Chen B. A machine learning methodology for diagnosing chronic kidney disease. IEEE Access. 2019;8:20991–21002.
- 3. Liu P, Liu Y, Liu H, Xiong L, Mei C, Yuan L. A random forest algorithm for assessing risk factors associated with chronic kidney disease: observational study. Asian Pac Isl Nurs J. 2024;8:e48378. doi: 10.2196/48378. PMID: 38830204.
- 4. Elhoseny M, Shankar K, Uthayakumar J. Intelligent diagnostic prediction and classification system for chronic kidney disease. Scientific Reports. 2019;9(1):9583. doi: 10.1038/s41598-019-46074-2.
- 5. Gunarathne WH, Perera KD, Kahandawaarachchi KA. Performance evaluation on machine learning classification techniques for disease classification and forecasting through data analytics for chronic kidney disease (CKD). In: 2017 IEEE 17th International Conference on Bioinformatics and Bioengineering (BIBE); 2017. p. 291–296.
- 6. Aljaaf AJ, Al-Jumeily D, Haglan HM, Alloghani M, Baker T, Hussain AJ, Mustafina J. Early prediction of chronic kidney disease using machine learning supported by predictive analytics. In: 2018 IEEE Congress on Evolutionary Computation (CEC); 2018. p. 1–9.
- 7. Lundberg S, Lee SI. A unified approach to interpreting model predictions. arXiv preprint arXiv:1705.07874. 2017.
- 8. Lundberg S. slundberg/shap. Available online: https://github.com/slundberg/shap (accessed Aug. 2024).
- 9. Ribeiro MT, Singh S, Guestrin C. “Why should I trust you?”: Explaining the predictions of any classifier. arXiv preprint arXiv:1602.04938. 2016. Available from: https://arxiv.org/abs/1602.04938.
- 10. Rubini L, Soundarapandian P, Eswaran P. Chronic Kidney Disease [dataset]. UCI Machine Learning Repository. 2015. Available from: https://doi.org/10.24432/C5G020.
- 11. Akter S, Habib A, Islam MA, Hossen MS, Fahim WA, Sarkar PR, Ahmed M. Comprehensive performance assessment of deep learning models in early prediction and risk identification of chronic kidney disease. IEEE Access. 2021;9:165184–165206.
- 12. Chittora P, Chaurasia S, Chakrabarti P, Kumawat G, Chakrabarti T, Leonowicz Z, Jasiński M, Jasiński Ł, Gono R, Jasińska E, Bolshev V. Prediction of chronic kidney disease - a machine learning perspective. IEEE Access. 2021;9:17312–17334.
- 13. Halder RK, Uddin MN, Uddin MA, Aryal S, Saha S, Hossen R, Ahmed S, Rony MA, Akter MF. ML-CKDP: Machine learning-based chronic kidney disease prediction with smart web application. Journal of Pathology Informatics. 2024;15:100371. doi: 10.1016/j.jpi.2024.100371.
- 14. Islam MA, Majumder MZ, Hussein MA. Chronic kidney disease prediction based on machine learning algorithms. Journal of Pathology Informatics. 2023;14:100189. doi: 10.1016/j.jpi.2023.100189.
- 15. Zheng JX, Li X, Zhu J, Guan SY, Zhang SX, Wang WM. Interpretable machine learning for predicting chronic kidney disease progression risk. Digital Health. 2024;10:20552076231224225. doi: 10.1177/20552076231224225.
- 16. Ghosh SK, Khandoker AH. Investigation on explainable machine learning models to predict chronic kidney diseases. Scientific Reports. 2024;14(1):3687. doi: 10.1038/s41598-024-54375-4.
- 17. Xiao J, Ding R, Xu X, Guan H, Feng X, Sun T, Zhu S, Ye Z. Comparison and development of machine learning tools in the prediction of chronic kidney disease progression. Journal of Translational Medicine. 2019;17:1–3. doi: 10.1186/s12967-019-1860-0.
- 18. Poonia RC, Gupta MK, Abunadi I, Albraikan AA, Al-Wesabi FN, Hamza MA. Intelligent diagnostic prediction and classification models for detection of kidney disease. Healthcare. 2022;10(2):371. doi: 10.3390/healthcare10020371.
- 19. Pan W, Han Y, Hu H, He Y. Association between hemoglobin and chronic kidney disease progression: a secondary analysis of a prospective cohort study in Japanese patients. BMC Nephrol. 2022;23:295. doi: 10.1186/s12882-022-02920-6.
- 20. Hoshino J, Muenz D, Zee J, Sukul N, Speyer E, Guedes M, Lopes AA, Asahi K, Haalen H, James G, Dhalwani N, Pecoits-Filho R, Bieber B, Robinson BM, Pisoni RL, et al. Associations of hemoglobin levels with health-related quality of life, physical activity, and clinical outcomes in persons with stage 3-5 nondialysis CKD. Journal of Renal Nutrition. 2020;30(5):404–414. doi: 10.1053/j.jrn.2019.11.003.
- 21. McAdams MC, Gregg LP, Xu P, Zhang S, Li M, Carroll E, Kannan V, Willett DL, Hedayati SS. Specific gravity improves identification of clinically significant quantitative proteinuria from the dipstick urinalysis. Kidney360. 2024;5(6):851–859. doi: 10.34067/KID.0000000000000452.
- 22. Gaitonde DY, Cook DL, Rivera IM. Chronic kidney disease: detection and evaluation. Am Fam Physician. 2017;96(12):776–783.