PLOS One. 2024 Mar 19;19(3):e0298673. doi: 10.1371/journal.pone.0298673

Interpretable machine learning-based individual analysis of acute kidney injury in immune checkpoint inhibitor therapy

Minoru Sakuragi 1,2, Eiichiro Uchino 1,2, Noriaki Sato 1,2, Takeshi Matsubara 2, Akihiko Ueda 1,3, Yohei Mineharu 1,4,5, Ryosuke Kojima 1, Motoko Yanagita 2,6,*, Yasushi Okuno 1,*
Editor: Giuseppe Remuzzi 7
PMCID: PMC10950216  PMID: 38502665

Abstract

Background

Acute kidney injury (AKI) is a critical complication of immune checkpoint inhibitor therapy. Since the etiology of AKI in patients undergoing cancer therapy varies, clarifying underlying causes in individual cases is critical for optimal cancer treatment. Although it is essential to individually analyze immune checkpoint inhibitor-treated patients for underlying pathologies for each AKI episode, these analyses have not been realized. Herein, we aimed to individually clarify the underlying causes of AKI in immune checkpoint inhibitor-treated patients using a new clustering approach with Shapley Additive exPlanations (SHAP).

Methods

We developed a gradient-boosting decision tree-based machine learning model that continuously predicts AKI within 7 days, using the medical records of 616 immune checkpoint inhibitor-treated patients. Because the temporal changes in individual predictive reasoning in AKI prediction models represent the key features contributing to each AKI prediction, we clustered AKI patients based on the features with high predictive contribution, quantified in time series by SHAP. We then searched for common clinical backgrounds of AKI patients in each cluster and compared them with annotations by three nephrologists.

Results

One hundred and twelve patients (18.2%) had at least one AKI episode. They were clustered by their key features and SHAP value patterns, and the nephrologists assessed the clinical relevance of the clusters. Receiver operating characteristic analysis revealed an area under the curve of 0.880. Patients with AKI were categorized into four clusters with significant prognostic differences (p = 0.010). The leading causes of AKI in each cluster, such as hypovolemia, drug-related causes, and cancer cachexia, were all clinically interpretable, a result that conventional approaches cannot obtain.

Conclusion

Our results suggest that the clustering method of individual predictive reasoning in machine learning models can be applied to infer clinically critical factors for developing each episode of AKI among patients with multiple AKI risk factors, such as immune checkpoint inhibitor-treated patients.

Introduction

Acute kidney injury (AKI) is a critical complication with significant prognostic implications often observed in cancer patients [1–3]. Immune checkpoint inhibitors (ICIs) are key therapeutic agents for advanced cancer that can cause renal-related adverse events during their administration, including AKI [4–8]. With the increasing use of ICIs, the incidence of AKI during ICI therapy has been reported to be as high as 14–18% [9–11]. The development of AKI during systemic therapy, such as ICI therapy, not only increases the risk of death and the adverse effects on multiple organs but also represents a major cause of interruption of cancer treatment [1, 2]. Several risk factors for AKI, including baseline renal function, proton pump inhibitors (PPI), and immune-related adverse events (IrAEs), have been reported in ICI-treated patients [12–17]. However, these studies analyzed the population as a whole and did not perform individual risk analyses for each AKI episode in each patient. Since the etiology of AKI in patients undergoing cancer therapy varies, even among those diagnosed with the same type of AKI, clarifying the causes of AKI is critical for achieving optimal cancer treatment. Therefore, it is essential to individually analyze ICI-treated patients for existing underlying pathologies causing the onset of each episode of AKI. However, these individual analyses have not yet been realized with conventional clinical research methods, and no such study has been reported.

Herein, we investigated the underlying background of AKI in ICI-treated patients by applying a new approach to classify and analyze time-series individual predictive reasoning of machine learning (ML)-based AKI prediction models. First, we focused on the fact that the temporal changes in individual predictive reasoning in continuous AKI prediction models represent the key features contributing to each AKI prediction. We then estimated that in AKI prediction models, patients with similar predictive reasoning shared similar underlying factors for AKI development, and clustered AKI patients based on the pattern of features with high predictive contribution quantified in time-series by SHapley Additive exPlanations (SHAP) [18]. Thus, we compared each cluster with nephrologist chart review findings, which revealed crucial underlying factors involved in AKI development in individual ICI-treated patients that were not previously observed. Furthermore, the predictive reasoning consisted of combinations of features reasonably interpretable by clinicians.

Our results clarified the background of AKI development in ICI-treated patients with underlying risks for AKI and suggest that ML prediction models can be applied to medical care as interpretable artificial intelligence (AI), an application for which explainability has been a challenge.

Materials and methods

Model development and definitions

We created a dataset from the electronic medical records (EMRs) of 616 patients who received ICI therapy for cancer at the Kyoto University Hospital from July 2014 to September 2019. Using this dataset, we constructed an ML-based model to continuously predict the development of AKI within 7 days of the reference date (S1 and S2 Figs in S1 File). Subsequently, we visualized the predictive reasoning among patients with AKI using SHAP and evaluated the clinical validity of patient clustering using predictive reasoning for AKI development (Fig 1). AKI was defined based on serum creatinine (SCr) changes (an increase of ≥ 0.3 mg/dL or to ≥ 1.5 times the baseline) according to Kidney Disease: Improving Global Outcomes diagnostic criteria [19] (S1 Method in S1 File). The period for the prediction model was defined as the period from the ICI initiation in each patient to the end of December 2019; patients with multiple AKI events within 14 days from the date of the first episode of AKI were excluded from the evaluation.
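
The SCr-based criterion quoted above can be written as a small check. The following is a minimal sketch, not the authors' implementation: the function name `is_aki` is hypothetical, and the way the baseline is chosen and the time windows over which the change is assessed (described in S1 Method) are not reproduced here.

```python
def is_aki(baseline_scr: float, current_scr: float) -> bool:
    """Return True if the SCr change meets the criterion quoted in the text.

    Assumption: both values are in mg/dL and the baseline has already been
    selected by whatever rule the study uses (not shown here).
    """
    if baseline_scr <= 0:
        raise ValueError("baseline_scr must be positive")
    # Rounding to 2 decimals avoids floating-point artifacts such as
    # 1.2 - 0.9 evaluating to slightly less than 0.3.
    absolute_rise = round(current_scr - baseline_scr, 2) >= 0.3
    relative_rise = round(current_scr / baseline_scr, 2) >= 1.5
    return absolute_rise or relative_rise
```

For example, a rise from 0.8 to 1.1 mg/dL satisfies the absolute criterion, whereas a rise from 0.4 to 0.65 mg/dL satisfies only the relative one.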

Fig 1. Analysis overview.

Among the 657 ICI-treated patients, those who had end-stage renal disease (ESRD) before ICI initiation and those with missing data or inadequate periods for model construction were excluded from this study. The entire dataset was split into test (20%) and training datasets on a per-patient basis, and hyperparameter tuning of the model was performed on the training dataset (5-fold cross-validation). The contribution of key features to AKI development at each prediction time point was quantified based on SHAP values and visualized using the heatmap. The SHAP value of each feature takes a positive or negative value as a vector of contributions, with the magnitude of the absolute value representing the degree of influence on the prediction outcome. The trend of SHAP values at the time point when the model predicts AKI (bold black box) indicates the combination of key features and their contributions (individual predictive reasoning) crucial for predicting AKI in that patient. AKI, acute kidney injury; ICI, immune checkpoint inhibitors; ESRD, end-stage renal disease; ML, machine learning; GBDT, Gradient Boosting Decision Tree; SHAP, SHapley Additive exPlanations; SCr, serum creatinine.
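
The per-patient split mentioned in the caption keeps all prediction time points of one patient on the same side of the split, which prevents leakage between training and test data. A minimal standard-library sketch is shown here; the function name `split_by_patient`, the seed, and the rounding of the test-set size are illustrative assumptions, not details from the study.

```python
import random

def split_by_patient(patient_ids, test_fraction=0.2, seed=0):
    """Split row indices so that no patient appears in both sets.

    patient_ids: one entry per row (prediction time point).
    Returns (train_rows, test_rows) as lists of row indices.
    """
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)          # fixed seed for reproducibility
    rng.shuffle(unique_ids)
    n_test = round(len(unique_ids) * test_fraction)
    test_ids = set(unique_ids[:n_test])
    train_rows = [i for i, pid in enumerate(patient_ids) if pid not in test_ids]
    test_rows = [i for i, pid in enumerate(patient_ids) if pid in test_ids]
    return train_rows, test_rows
```

Cross-validation folds for hyperparameter tuning would need the same grouping by patient inside the training set.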

We used LightGBM [20], a gradient-boosting decision tree, as a prediction algorithm to build a classification model that would continuously predict AKI within 7 days from each time point (Fig 1). The main reasons for selecting LightGBM were its flexibility in handling medical records that potentially contain a certain number of missing values and its ability to perform high-speed calculations (S1 Table in S1 File). We used 287 clinical variables obtained from EMRs as input features for each patient (S2 Method in S1 File). For the features linked to time series, data from the 4 weeks before the reference date were divided into four windows, one for each week, and each window was labeled “(-1 wk),” “(-2 wk),” “(-3 wk),” and “(-4 wk),” and suffixes were assigned to each feature (S1 Fig in S1 File). In addition, the objective variable was labeled “AKI-positive” if the patient developed AKI within 7 days of the predicted time point (S2 Fig in S1 File). All analyses were conducted using Python 3.7.7 (https://www.python.org/doc/), with scikit-learn [21] 0.22.1 (https://scikit-learn.org/stable/index.html#) and LightGBM 2.3.0 (https://lightgbm.readthedocs.io/en/stable/#) libraries for model development, and statsmodels 0.13.2, rpy2 3.5.2, and lifelines 0.25.9 libraries for statistical analysis.
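
The weekly windowing and the 7-day outcome label described above can be illustrated as follows. This is a sketch under stated assumptions: the helper names `week_suffix` and `aki_label` are hypothetical, and the exact window boundaries (for example, whether the reference date itself belongs to a window) are defined in S1 and S2 Figs and may differ from the choice made here.

```python
from datetime import date

def week_suffix(reference_date: date, observation_date: date):
    """Map an observation to '(-1 wk)' ... '(-4 wk)' relative to the reference date.

    Assumption: days 1-7 before the reference date form week 1, days 8-14
    week 2, and so on; anything outside the 28-day look-back returns None.
    """
    days_before = (reference_date - observation_date).days
    if not 1 <= days_before <= 28:
        return None
    week = (days_before - 1) // 7 + 1
    return f"(-{week} wk)"

def aki_label(reference_date: date, aki_dates, horizon_days: int = 7) -> bool:
    """Return True ('AKI-positive') if any AKI date falls within the horizon."""
    return any(0 < (d - reference_date).days <= horizon_days for d in aki_dates)
```

A feature name would then be formed by appending the suffix, for example `"CRP" + week_suffix(...)`; the feature name used here is only an illustration.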

Visualizing individual AKI predictive reasoning and clustering

SHAP is a game theory-based model interpretation framework that quantitatively evaluates the contribution of each input feature as a SHAP value [18]. Unlike previous studies, we performed a unique visualization in which SHAP values at all prediction time points were arranged in a time series (Fig 1). The SHAP method was implemented using the Python SHAP package (https://shap.readthedocs.io/en/latest/).
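
To make the game-theoretic idea concrete, the sketch below computes exact Shapley values by brute force for a three-feature toy model. It illustrates the definition only: the SHAP package uses far more efficient algorithms for tree models, and the toy model, instance, and all-zero baseline here are assumptions chosen for readability.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

def model(x):
    # Toy model with one additive term and one interaction term.
    return 2 * x[0] + x[1] * x[2]

instance = (1.0, 2.0, 3.0)
baseline = (0.0, 0.0, 0.0)   # features outside the coalition take these values

def value_fn(subset):
    mixed = [instance[i] if i in subset else baseline[i] for i in range(3)]
    return model(mixed)

phi = shapley_values(value_fn, 3)
```

The contributions sum to `model(instance) - model(baseline)`, which is the additivity property that allows SHAP values to be read as a signed decomposition of each prediction.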

We performed hierarchical clustering for patients with AKI based on the patterns of SHAP values and searched for common clinical backgrounds in each cluster. Subsequently, we compared the clinical backgrounds in each cluster with AKI causes, as annotated by three nephrologists (S3 Fig in S1 File). All chart reviews and the free-text annotations of the nephrologists for AKI causes were conducted independent of ML model analysis and without being influenced by each other (S3 Method in S1 File). Furthermore, we evaluated the clinical validity of the clustering by observing the 90-day survival after the first episode of AKI with the Kaplan–Meier analysis. In addition, categorical variables and means among clusters were compared using Fisher’s exact probability and Kruskal–Wallis tests, respectively. Finally, the distribution of annotation labels within each cluster was evaluated using Chi-square goodness-of-fit test. Statistical significance was defined as p < 0.05.
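
Hierarchical clustering of per-patient SHAP vectors can be sketched with a small agglomerative routine. The linkage method and distance metric used in the study are not stated in this section, so average linkage with Euclidean distance is an assumption here, as are the function name `agglomerative` and the toy vectors.

```python
from math import dist

def agglomerative(vectors, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain.

    vectors: list of equal-length numeric tuples (e.g. SHAP values per patient).
    Returns a list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(vectors))]

    def average_link(a, b):
        total = sum(dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        _, p, q = min(
            (average_link(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]

# Hypothetical SHAP vectors for five patients and two features.
shap_rows = [(0.9, 0.0), (1.0, 0.1), (0.0, 0.8), (0.1, 0.9), (-0.5, -0.5)]
groups = agglomerative(shap_rows, n_clusters=3)
```

In this sketch the number of clusters is passed in directly, whereas the study chose it by inspecting the dendrogram.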

Ethical statement and informed consent

The dataset was generated and reviewed based on the clinical information obtained from the EMR of our institution. This study was conducted using data obtained only during medical practice, according to the principles of the Declaration of Helsinki. Per Japanese laws and regulations, informed consent was obtained on an opt-out basis. All explanations of the study and expressions of consent were assured to be conducted in a written format, guaranteeing that participants received comprehensive information and their consent or dissent was appropriately recorded. This method aligns with the approval granted by the Ethical Review Board of Kyoto University, acknowledging it as a valid form of consent for this type of research. We ensured ethical compliance by publicly providing detailed information about the study, including its purpose, the nature of the data used, and the rights of participants to withdraw, on the Kyoto University Hospital website (https://www.kuhp.kyoto-u.ac.jp/outline/research-disclosure.html). The option for participants was made clear and accessible, thus preserving their autonomy. The Ethical Review Board of Kyoto University approved the study (Approval Number R1498), recognizing its adequacy for the nature of this retrospective analysis. The period of data access and analysis for this study was from March 2022 to August 2022. In collecting the data, the authors did not access any data that could identify individual participants.

Results

Model performance and visualizing individual predictive reasoning

Among the 616 patients, 112 (18.2%) had at least one AKI episode after initiation of ICI therapy. The clinical characteristics of the patients are presented in Table 1. The generalization performance of the model estimated based on the test data had an area under the receiver operating characteristic curve of 0.880, similar to that of the pre-existing models [22–29] (Fig 2a). Performance comparisons with other ML models are summarized in S1 Table in S1 File. The SHAP values of the key features that contributed to the prediction of AKI are presented in Fig 2b and 2c. Two examples of predictive reasoning in patients with AKI are presented in Fig 2d. Considering that the contributing factors of AKI vary across patients (Fig 2e), individual differences in predictive reasoning may reflect individual differences in clinical backgrounds related to the development of AKI.
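
The area under the ROC curve reported above equals the probability that a randomly chosen AKI-positive time point receives a higher predicted score than a randomly chosen negative one. A minimal sketch of that pairwise formulation follows; the function name `roc_auc` and the example scores are illustrative and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of rows, so a rank-based implementation from a statistics library would be preferable on a dataset of realistic size.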

Table 1. Baseline characteristics of patients undergoing immune checkpoint inhibitor therapy.

| Characteristic | All patients (n = 616) | With AKI (n = 112) | Without AKI (n = 504) | p-value |
| --- | --- | --- | --- | --- |
| Age [n (%)] | | | | |
| 20–39 years | 17 (3) | 3 (3) | 14 (3) | 0.954 |
| 40–59 years | 107 (17) | 21 (19) | 86 (17) | 0.773 |
| 60–79 years | 415 (67) | 80 (71) | 335 (66) | 0.367 |
| > 80 years | 77 (13) | 8 (7) | 69 (14) | 0.082 |
| Male [n] / Female [n] | 415 / 201 | 82 / 30 | 333 / 171 | 0.178 |
| Malignancy types [n (%)] | | | | |
| Gastrointestinal | 74 (12) | 10 (9) | 64 (13) | 0.342 |
| Lung | 333 (54) | 45 (40) | 288 (57) | 0.001 * |
| Urologic | 72 (12) | 23 (21) | 49 (10) | 0.002 * |
| Skin | 78 (13) | 28 (25) | 50 (10) | < 0.001 * |
| Other | 59 (9) | 6 (5) | 53 (11) | 0.133 |
| ICI types [n (%)] | | | | |
| PD-1 antibody | 559 (91) | 103 (92) | 453 (90) | 0.620 |
| PD-L1 antibody | 75 (12) | 11 (10) | 64 (13) | 0.495 |
| CTLA-4 antibody | 43 (7) | 12 (11) | 31 (6) | 0.131 |
| Combination therapy | 22 (4) | 4 (4) | 18 (4) | 1.000 |
| Baseline SCr [mg/dL, median (IQR)] | 0.79 (0.66–0.95) | 0.90 (0.67–1.11) | 0.82 (0.66–0.92) | < 0.001 * |
| PPI administration [n (%)] | 152 (25) | 33 (29) | 119 (24) | 0.239 |
| NSAID administration [n (%)] | 66 (11) | 12 (11) | 54 (11) | 1.000 |

All data are presented as medians (interquartile range, IQR) or means (standard deviation, SD), as appropriate for nonparametric or parametric variables, respectively. Patients with ESRD at the initiation of ICI (n = 5), patients without data on renal function after ICI (n = 18), and patients whose follow-up was censored < 3 months after initiation of ICI (n = 18) were excluded from the analysis. ICIs included anti-PD-1, anti-PD-L1, and anti-CTLA-4 antibodies, while some patients received combination therapy with anti-PD-1 and anti-CTLA-4 antibodies. Comparisons of categorical variables were made using the Chi-square test or Fisher’s exact probability test. ICI, immune checkpoint inhibitors; AKI, acute kidney injury; PD-1, Programmed cell death 1; PD-L1, Programmed death-ligand 1; CTLA-4, Cytotoxic T-lymphocyte-associated antigen; SCr, serum creatinine; PPI, proton pump inhibitors; NSAIDs, nonsteroidal anti-inflammatory drugs; ESRD, end-stage renal disease; IQR, interquartile range; SD, standard deviation.

Fig 2. Model performance and visualizing individual predictive reasoning.

(a) Performance of the model. The general performance is evaluated based on the area under the ROC curve. (b, c) Features indicating higher overall SHAP. Features with higher average contributions for all patients are shown. Positive and negative contributions to predicted AKI development are characterized by positive and negative SHAP values, respectively, with red and blue representing the magnitude of respective feature values. (d) Examples of individual predictive reasoning. The graph of SCr value-predicted probabilities of AKI development within 7 days, and the heatmap of SHAP values for the key features are represented on the same timeline. The red dotted line indicates the threshold value for 0.25 in the precision probabilities, and the predicted probabilities above the line are regarded as positive predictions (S5 Fig in S1 File). The bold black-boxed area, at time points with elevated predictive probability, represents the key features and their contribution to the prediction of AKI for that individual. These two examples demonstrate different heatmap patterns of SHAP, suggesting the difference in predictive reasoning. (e) Examples of nephrologists’ chart reviews. The annotations of nephrologists for contributing factors to AKI development for the two cases with predictive reasoning are shown above. ROC, receiver operating characteristic curve; SHAP, SHapley Additive exPlanations; AKI, acute kidney injury; SCr, serum creatinine; IrAEs, immune-related adverse events.

Clustering patients with AKI using predictive reasoning

A total of 112 patients with AKI were categorized into four clusters based on predictive reasoning immediately before the first episode of AKI using unsupervised clustering (Fig 3a, Table 2), compared with annotation independently reviewed by the three nephrologists [24]. The number of clusters was determined as the number of visually valid clusters indicated on the dendrogram produced by the hierarchical clustering. Based on their descriptions, the strongest contributive risk factors for AKI development were assigned six labels for each patient: “Hypovolemia,” “Cancer Cachexia,” “Infection,” “Drug-related,” “Obstruction,” and “Others.” (S3 Method in S1 File). The number of these labels was counted in each cluster to determine the most dominant contributive risk factor. Although the proportions of each label did not differ significantly among the clusters, each cluster had distinct patterns of contributing risk factors for the development of AKI (Fig 3b). While there was a clear trend in the label distribution within each cluster, only clusters 3 and 4 showed statistically significant differences. The most dominant labels in each cluster were as follows: cluster 1, “Hypovolemia”; cluster 2, “Drug-related”; cluster 3, “Drug-related”; and cluster 4, “Cancer Cachexia.” In addition, clusters 2 and 3 were annotated as “Drug-related,” including IrAE, while each cluster indicated different patient backgrounds (S2 Table in S1 File). These results suggested that patients categorized by predictive reasoning likely have different clinical backgrounds regarding AKI development between the clusters.

Fig 3.

(a) Patient clustering by SHAP values. Overall, 112 patients with AKI were categorized into four clusters with ML-based unsupervised clustering by SHAP values. The number of clusters was determined as the number of visually valid clusters indicated on the dendrogram. (b) Distribution of annotations for causes of AKI in the four clusters. The distribution of annotation labels was calculated cluster-wise. (c) Dependence plot of key features. Each point in the scatterplot reveals the correlation between the values of key features and SHAP in the last week prior to each AKI development among 112 patients. The plots in the features of CRP and LDH are color-coded by cluster. (d) Survival analysis after AKI. Kaplan–Meier curves of 90-day survival after the first AKI in each cluster. Analysis of variance reveals significant differences in survival rates among the four clusters, with Cluster 4 having the poorest prognosis. (e) Interpretation of the AI model and clinician assessment. The interpretation of the AI-based model is indicated by patient clustering based on predictive reasoning and prognostic variance. Clinician assessment is indicated by the reviews of nephrologists. (f) Patient clustering by raw feature values. Clustering by raw values of the key features, excluding SHAP weighting, categorizes the same patients with AKI into three clusters. (g) Distribution of annotations for causes of AKI in the three clusters. As in (b), six labels were aggregated for each cluster with the contributing factors for AKI. Clustering by raw values of the key features does not provide meaningful patient clustering reflecting the clinical background of AKI. SHAP, SHapley Additive exPlanations; AKI, acute kidney injury; CRP, C-reactive protein; LDH, Lactate Dehydrogenase; med_diuretic(-1wk), medication of diuretics within the last week.

Table 2. Clinical characteristics of patients with acute kidney injury in each cluster.

| Characteristic | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | p-value |
| --- | --- | --- | --- | --- | --- |
| Number of patients [n] | 30 | 13 | 30 | 39 | |
| Age [n (%)] | | | | | |
| 20–39 years | 0 (0) | 1 (8) | 0 (0) | 2 (5) | 0.203 |
| 40–59 years | 6 (20) | 2 (15) | 8 (27) | 5 (13) | 0.590 |
| 60–79 years | 21 (70) | 9 (69) | 20 (67) | 30 (77) | 0.720 |
| > 80 years | 3 (10) | 1 (8) | 2 (7) | 2 (5) | 0.953 |
| Male [n] / Female [n] | 22 / 8 | 7 / 6 | 19 / 11 | 34 / 5 | < 0.05 * |
| Malignancy types [n (%)] | | | | | |
| Gastrointestinal | 4 (13) | 1 (8) | 4 (13) | 1 (2) | 0.262 |
| Lung | 10 (33) | 5 (38) | 14 (47) | 16 (41) | 0.780 |
| Urologic | 8 (27) | 5 (38) | 3 (10) | 7 (18) | 0.108 |
| Skin | 7 (23) | 1 (8) | 8 (27) | 12 (31) | 0.492 |
| Other | 1 (3) | 1 (8) | 1 (3) | 3 (8) | 0.730 |
| Baseline SCr [mg/dL, median (IQR)] | 0.91 (0.77–1.12) | 0.75 (0.62–1.32) | 0.85 (0.62–0.99) | 0.92 (0.73–1.18) | 0.543 |
| AKI stage on first episode of AKI [n (%)] | | | | | |
| Stage 1 | 19 (63) | 8 (62) | 27 (90) | 26 (67) | < 0.05 * |
| Stage 2 | 6 (20) | 3 (23) | 2 (7) | 8 (20) | 0.322 |
| Stage 3 or require RRT | 5 (17) | 2 (15) | 1 (3) | 5 (13) | 0.343 |
| Ratio of inpatient AKI [n (%)] | 3 (10) | 6 (46) | 7 (23) | 36 (92) | < 0.05 * |
| Primary cause of AKI [n (%)] | | | | | |
| Hypovolemia | 9 (30) | 2 (15) | 7 (23) | 7 (18) | 0.634 |
| Cancer Cachexia | 6 (20) | 1 (8) | 5 (17) | 14 (36) | 0.125 |
| Infection | 3 (10) | 1 (8) | 5 (17) | 7 (18) | 0.765 |
| Drug-related | 7 (23) | 6 (46) | 10 (33) | 7 (18) | 0.174 |
| Obstruction | 3 (10) | 2 (15) | 0 (0) | 2 (5) | 0.142 |
| Others | 2 (7) | 1 (8) | 3 (10) | 2 (5) | 0.953 |

All data are presented as medians (interquartile range, IQR) or means (standard deviation, SD), as appropriate for nonparametric or parametric variables, respectively. Comparisons of categorical variables and means among clusters are made using Fisher’s exact probability test and the Kruskal–Wallis test, respectively. AKI, acute kidney injury; ICI, immune checkpoint inhibitors; SCr, serum creatinine; RRT, renal replacement therapy; IQR, interquartile range; SD, standard deviation.
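
The within-cluster goodness-of-fit evaluation can be illustrated with the primary-cause counts from Table 2. The sketch below assumes a uniform expected distribution over the six labels; the text does not state which expected distribution was used, so this null hypothesis is an assumption. The constant 11.0705 is the chi-square critical value for 5 degrees of freedom at a significance level of 0.05.

```python
CHI2_CRITICAL_DF5_ALPHA05 = 11.0705

# Counts per cluster in the label order of Table 2: Hypovolemia,
# Cancer Cachexia, Infection, Drug-related, Obstruction, Others.
table2_counts = {
    "Cluster 1": [9, 6, 3, 7, 3, 2],
    "Cluster 2": [2, 1, 1, 6, 2, 1],
    "Cluster 3": [7, 5, 5, 10, 0, 3],
    "Cluster 4": [7, 14, 7, 7, 2, 2],
}

def chi_square_uniform(observed):
    """Chi-square statistic against equal expected counts (assumed null)."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

chi2_by_cluster = {name: chi_square_uniform(c) for name, c in table2_counts.items()}
exceeds_critical = [
    name for name, stat in chi2_by_cluster.items()
    if stat > CHI2_CRITICAL_DF5_ALPHA05
]
```

Under this assumed null, only clusters 3 and 4 exceed the critical value, which is consistent with the statement in the text. With expected counts as small as those in cluster 2, the chi-square approximation is rough, so an exact test may be more appropriate there.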

To further elucidate patient clustering by SHAP, we constructed a two-dimensional plot (dependence plot), which represents the correlation between the feature values and their SHAP values in the week before AKI development among 112 patients with AKI (Fig 3c). For example, cluster 4, which was characterized by high SHAP values for C-reactive protein (CRP) and lactate dehydrogenase (LDH), dietary intake, and diuretic use, demonstrated high CRP and LDH levels and poor dietary intake, including diuretic use in one out of three cases, which strongly contributed to AKI prediction (Fig 3a). Generally, high CRP and LDH levels and poor dietary intake are associated with organ damage and dehydration, which can be causes of AKI in advanced cancers [4]. According to the chart review by nephrologists, certain patients with AKI in cluster 4 had cancer cachexia in the terminal phase, while some developed diuretic-induced AKI. Based on these findings, the AKI predictive reasoning in cluster 4 can be interpreted as “patients with terminal cancer and cachexia who developed AKI due to worsening conditions or diuretic use, high CRP and LDH levels, and poor dietary intake.” In addition, when the dependence plots of CRP and LDH were color-coded by cluster, higher values of the features and SHAP were frequently observed in clusters 3 and 4 (Fig 3c). Furthermore, since the clustering with AKI predictive reasoning captured the distinct clinical characteristics of cancer patients, we speculated that patient clustering by SHAP may capture prognostic differences in advanced cancers. Therefore, the 90-day survival rate of 112 patients with the first occurrence of AKI was analyzed, and it was discovered that significant prognostic differences existed between the four clusters (Fig 3d). Notably, cluster 4 had the poorest prognosis. These findings suggest that the predictive reasoning for AKI can recognize prognostic variances after AKI, supporting the clinical validity of patient clustering by SHAP (Fig 3e).
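
The survival curves compared between clusters are Kaplan–Meier estimates. The study used the lifelines library for statistical analysis; the sketch below is a from-scratch illustration of the product-limit estimator, with a hypothetical function name and made-up follow-up data.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: follow-up time per patient (e.g. days after the first AKI).
    events: 1 if death was observed at that time, 0 if censored.
    Returns [(time, survival_probability)] at each distinct event time.
    """
    survival = 1.0
    curve = []
    event_times = sorted({d for d, e in zip(durations, events) if e})
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical cluster: five patients, two censored (at day 30 and day 90).
curve = kaplan_meier([10, 20, 30, 40, 90], [1, 1, 0, 1, 0])
```

Running the estimator once per cluster and comparing the resulting curves with a suitable test gives the kind of comparison shown in Fig 3d.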

To confirm the necessity of SHAP in clinical interpretation, the same patients were clustered by the raw values for the same key features and divided into three clusters (Fig 3f). The results revealed that, in contrast to SHAP clustering, there were no distinguishing characteristics in the causes of AKI between the clusters, and each cluster did not reflect the contributing risk factors for AKI development (Fig 3g).

Among the patients in clusters 2 and 3, only a few cases of suspected ICI or IrAE involvement were confirmed on renal biopsy. A detailed chart review revealed that many cases were not biopsied for AKI diagnosis after discussions among the attending physician, patient, and their family; consideration of the general condition of the procedure; the prognosis of the patient; and the risk of fatal complications.

Discussion

Herein, we have shown that the clustering approach using SHAP values in ML-based AKI prediction models offers a novel perspective in assessing the etiology of each episode of AKI in patients undergoing ICI therapy. Patient clustering based on time-series SHAP values for AKI prediction enables clinicians to interpret predictive reasoning that reflects the underlying causes of AKI individually. This indicates that we can infer factors critical for AKI development on a case-specific basis by focusing on the temporal changes and patterns in each SHAP value in the ML model, which continuously predicts AKI. Therefore, our approach seems appropriate for estimating the most critical causes of AKI in cancer patients receiving systemic therapy, including ICI therapy, with diverse and complicated AKI risks. The features predicted as particularly essential variables in our model were consistent with the findings of previous studies using multivariate analyses [11–15]. PPIs, which have been associated with the development of AKI in several observational studies [12, 13, 30], were also identified as a key feature in our prediction model. In addition, although not at the top of the list, diuretics, NSAIDs, and baseline renal function features were also identified as key risk factors by the model, as shown in the dependence plot [17] (Fig 3c, S4 Fig in S1 File). Although the dependence plot did not indicate a causal relationship, the prediction model regarded these key features as crucial for predicting AKI.

However, our method identified individual differences in the underlying backgrounds of AKI that could not have been deduced by conventional methods. As indicated by the varying distribution of clinician annotations (Fig 3b), the patient clusters classified based on the predicted key features had different AKI development backgrounds; for example, cluster 4 was interpreted to have cancer cachexia as the primary contributing factor to AKI development, whereas clusters 2 and 3 suggested the contribution of drugs, including ICI or IrAE. Patients in cluster 4 were characterized by high CRP and LDH levels and the use of diuretics and had the poorest prognosis after the development of AKI (Fig 3c and 3d). These predictive findings reflect the development of AKI due to cancer cachexia. Cluster 3 had more cases of higher CRP levels, persistent inflammation due to IrAE, infections, and end-stage cancer, with many patients receiving outpatient follow-ups. The high SHAP trend of CRP in cluster 3 was considered reflective of these conditions. In contrast, cluster 2 had relatively more cases of poor dietary intake that required hospitalization and fever. The high SHAP trend of dietary intake in cluster 2 was considered to reflect these conditions. Most patients with drug-related AKI in clusters 2 and 3 developed extra-renal IrAEs before AKI [15] (S2 Table in S1 File). In addition, significant prognostic differences were noted between the clusters according to the predictive reasoning, although no variable for survival was provided for model training. This indicates that the predictive reasoning of the AI model is not solely derived from a combination of laboratory values and medications. Although some studies have discussed the prognostic relevance of AKI in ICI-treated patients [11, 13, 14], our study suggested that prognostic differences after AKI were relevant regarding the differences in predicted factors of AKI development.

In several AI-based prediction models, SHAP has been widely used to predict risk factors for various outcomes, including AKI [31–34]. However, although it is possible to infer predicted characteristics that demonstrate measurable correlations with SHAP values, it has not been feasible to determine their clinical significance in individual patients. This is partly because the correlation between an individual input feature and its contribution does not fully explain the pathophysiology of complicated outcomes. Furthermore, although many features with nonlinear relationships with SHAP values contribute to the prediction of AKI (S4 Fig in S1 File), comprehending the clinical importance of each feature with a nonlinear contribution is challenging. To the best of our knowledge, no study has attempted to clinically interpret the meaning of contributing factors as individual risk factors in each patient. We found that the combination of contributing factors, including nonlinear contributions, constitutes predictive reasoning in AI models representing the time-varying AKI risks. This method allowed us to clinically interpret the underlying background behind individualized prediction of AKI observed in different time series for the first time.

We believe that our study is significant because it reveals underlying causes in individual patients with AKI in ICI therapy, which cannot be obtained by conventional approaches, and provides predictive reasoning with clinically valid interpretability. However, the implications of our study go beyond simply allowing individualized assessment of AKI during ICI therapy. Cancer patients typically develop AKI owing to complex risk factors arising from various medications or complications. Therefore, predicting AKI development by monitoring a single laboratory result or medication considered as critical factors is often difficult. Similar to investigating the significant contributive features by SHAP analysis, determining the most critical factor for AKI among the multilayered AKI risk factors is a process that clinicians implement to select patients at high risk of AKI and assess their risks. Clinicians usually follow thought processes such as “the probability of AKI onset increases when additional risk factors such as infections and diuretics (triggers) are added to the background of cancer cachexia (underlying risks).” When interpreting the combination of underlying clinical backgrounds and additional stratified risks that lead to AKI development, analyzing the individual AI models’ predictive reasoning can be a valuable approach to explore the most critical AKI risks, which are challenging to understand using routine medical data [35]. In the future, this approach will help effectively determine the appropriate assessment and intervention for patients with complicated AKI risks (S5 Fig in S1 File) [36]. Further analyses applying a similar approach to patients receiving other chemotherapy may capture other characteristic predicting reasoning models specific to the causative agent and disease state. Furthermore, this model can be applied to predict AKI and other outcomes in various other fields that need such individualized prediction.

This study had several limitations. First, this model was developed at a single center; hence, multicenter studies are needed for external validation. Second, due to the nature of ICI therapy, the difference in data availability may have affected the prediction accuracy and the contribution of the features (S6 Fig in S1 File). Therefore, designing equal time-series features, devising missing interpolations, and selecting the population may resolve this problem. Third, information on image findings and surgery, which may be necessary for specific AKI prediction (e.g., obstructive AKI), was not included as features in the present model. Therefore, adding such information in future studies can further improve the performance and interpretability of the model. Furthermore, the validity of the clinical interpretation was assessed by reviews conducted by nephrologists; however, information may have been missed in the retrospective chart reviews. Finally, although this was a retrospective analysis by design, future prospective studies are expected to clarify the benefits of patient clustering by predictive reasoning, which can aid clinicians’ decisions and patient outcomes by prospectively predicting new patients with AKI.

In conclusion, this study is the first to demonstrate a novel approach for interpreting ML models by clustering patients according to their individual predictive reasoning patterns, and it has the potential to accelerate future medical applications of AI. We expect our approach to be widely applied to explainable AI in various medical fields, including renal diseases.

Supporting information

S1 File. Contains all the supporting files.

(PDF)


Acknowledgments

We thank Tomohiro Kuroda and the Division of Medical Informatics and Administration Planning, Kyoto University Hospital, for the EMR data extraction and management. We would also like to thank Editage (www.editage.com) for English language editing.

Data Availability

Data cannot be shared publicly because of patient privacy in electronic medical records. Data are available from Kyoto University Graduate School and Faculty of Medicine, Ethics Committee via email (ethcom@kuhp.kyoto-u.ac.jp) or telephone (+81-75-753-4680) for researchers who meet the criteria for access to confidential data.

Funding Statement

The authors received no specific funding for this work.

References

  • 1. Lam AQ, Humphreys BD. Onco-nephrology: AKI in the cancer patient. Clin J Am Soc Nephrol. 2012;7: 1692–1700. doi: 10.2215/CJN.03140312
  • 2. Salahudeen AK, Doshi SM, Pawar T, Nowshad G, Lahoti A, Shah P. Incidence rate, clinical correlates, and outcomes of AKI in patients admitted to a comprehensive cancer center. Clin J Am Soc Nephrol. 2013;8: 347–354. doi: 10.2215/CJN.03530412
  • 3. Cohen EP, Krzesinski JM, Launay-Vacher V, Sprangers B. Onco-nephrology: core curriculum 2015. Am J Kidney Dis. 2015;66: 869–883. doi: 10.1053/j.ajkd.2015.04.042
  • 4. Postow MA, Sidlow R, Hellmann MD. Immune-related adverse events associated with immune checkpoint blockade. N Engl J Med. 2018;378: 158–168. doi: 10.1056/NEJMra1703481
  • 5. Shingarev R, Glezerman IG. Kidney complications of immune checkpoint inhibitors: a review. Am J Kidney Dis. 2019;74: 529–537. doi: 10.1053/j.ajkd.2019.03.433
  • 6. Cortazar FB, Marrone KA, Troxell ML, Ralto KM, Hoenig MP, Brahmer JR, et al. Clinicopathological features of acute kidney injury associated with immune checkpoint inhibitors. Kidney Int. 2016;90: 638–647. doi: 10.1016/j.kint.2016.04.008
  • 7. Shirali AC, Perazella MA, Gettinger S. Association of acute interstitial nephritis with programmed cell death 1 inhibitor therapy in lung cancer patients. Am J Kidney Dis. 2016;68: 287–291. doi: 10.1053/j.ajkd.2016.02.057
  • 8. Mamlouk O, Selamet U, Machado S, Abdelrahim M, Glass WF, Tchakarov A, et al. Nephrotoxicity of immune checkpoint inhibitors beyond tubulointerstitial nephritis: single-center experience. J Immunother Cancer. 2019;7: 2. doi: 10.1186/s40425-018-0478-8
  • 9. Koks MS, Ocak G, Suelmann BBM, Hulsbergen-Veelken CAR, Haitjema S, Vianen ME, et al. Immune checkpoint inhibitor-associated acute kidney injury and mortality: an observational study. PLOS ONE. 2021;16: e0252978. doi: 10.1371/journal.pone.0252978
  • 10. Shimamura Y, Watanabe S, Maeda T, Abe K, Ogawa Y, Takizawa H. Incidence and risk factors of acute kidney injury, and its effect on mortality among Japanese patients receiving immune check point inhibitors: a single-center observational study. Clin Exp Nephrol. 2021;25: 479–487. doi: 10.1007/s10157-020-02008-1
  • 11. García-Carro C, Bolufer M, Bury R, Castañeda Z, Muñoz E, Felip E, et al. Acute kidney injury as a risk factor for mortality in oncological patients receiving checkpoint inhibitors. Nephrol Dial Transplant. 2022;37: 887–894. doi: 10.1093/ndt/gfab034
  • 12. Seethapathy H, Zhao S, Chute DF, Zubiri L, Oppong Y, Strohbehn I, et al. The incidence, causes, and risk factors of acute kidney injury in patients receiving immune checkpoint inhibitors. Clin J Am Soc Nephrol. 2019;14: 1692–1700. doi: 10.2215/CJN.00990119
  • 13. Cortazar FB, Kibbelaar ZA, Glezerman IG, Abudayyeh A, Mamlouk O, Motwani SS, et al. Clinical features and outcomes of immune checkpoint inhibitor-associated AKI: a multicenter study. J Am Soc Nephrol. 2020;31: 435–446. doi: 10.1681/ASN.2019070676
  • 14. Meraz-Muñoz A, Amir E, Ng P, Avila-Casado C, Ragobar C, Chan C, et al. Acute kidney injury associated with immune checkpoint inhibitor therapy: incidence, risk factors and outcomes. J Immunother Cancer. 2020;8: e000467. doi: 10.1136/jitc-2019-000467
  • 15. Gupta S, Short SAP, Sise ME, Prosek JM, Madhavan SM, Soler MJ, et al. Acute kidney injury in patients treated with immune checkpoint inhibitors. J Immunother Cancer. 2021;9: e003467. doi: 10.1136/jitc-2021-003467
  • 16. Gérard AO, Barbosa S, Parassol N, Andreani M, Merino D, Cremoni M, et al. Risk factors associated with immune checkpoint inhibitor-induced acute kidney injury compared with other immune-related adverse events: a case-control study. Clin Kidney J. 2022;15(10): 1881–1887. doi: 10.1093/ckj/sfac109
  • 17. Ji MS, Wu R, Feng Z, Wang YD, Wang Y, Zhang L, et al. Incidence, risk factors and prognosis of acute kidney injury in patients treated with immune checkpoint inhibitors: a retrospective study. Sci Rep. 2022;12: 18752. doi: 10.1038/s41598-022-21912-y
  • 18. Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, et al. From local explanations to global understanding with explainable AI for trees. Nat Mach Intell. 2020;2: 56–67. doi: 10.1038/s42256-019-0138-9
  • 19. Kellum JA, Lameire N, Aspelin P, Barsoum RS, Burdmann EA, Goldstein SL, et al. Kidney disease: improving global outcomes (KDIGO) acute kidney injury work group. KDIGO clinical practice guideline for acute kidney injury. Kidney Int Suppl. 2012;2: 1–138.
  • 20. Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, et al. LightGBM: A highly efficient gradient boosting decision tree, NIPS 2017. 30; 2017: pp. 3149–3157. Available from: https://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradientboosting-decision-tree.pdf
  • 21. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12: 2825–2830.
  • 22. Wu L, Hu Y, Liu X, Zhang X, Chen W, Yu ASL, et al. Feature ranking in predictive models for hospital-acquired acute kidney injury. Sci Rep. 2018;8: 17298. doi: 10.1038/s41598-018-35487-0
  • 23. Koyner JL, Carey KA, Edelson DP, Churpek MM. The development of a machine learning inpatient acute kidney injury prediction model. Crit Care Med. 2018;46: 1070–1077. doi: 10.1097/CCM.0000000000003123
  • 24. He J, Hu Y, Zhang X, Wu L, Waitman LR, Liu M. Multi-perspective predictive modeling for acute kidney injury in general hospital populations using electronic medical records. JAMIA Open. 2019;2: 115–122. doi: 10.1093/jamiaopen/ooy043
  • 25. Tomašev N, Glorot X, Rae JW, Zielinski M, Askham H, Saraiva A, et al. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature. 2019;572: 116–119. doi: 10.1038/s41586-019-1390-1
  • 26. Wu L, Hu Y, Zhang X, Chen W, Yu ASL, Kellum JA, et al. Changing relative risk of clinical factors for hospital-acquired acute kidney injury across age groups: a retrospective cohort study. BMC Nephrol. 2020;21: 321. doi: 10.1186/s12882-020-01980-w
  • 27. Rank N, Pfahringer B, Kempfert J, Stamm C, Kühne T, Schoenrath F, et al. Deep-learning-based real-time prediction of acute kidney injury outperforms human predictive performance. npj Digit Med. 2020;3: 139. doi: 10.1038/s41746-020-00346-8
  • 28. Park N, Kang E, Park M, Lee H, Kang HG, Yoon HJ, et al. Predicting acute kidney injury in cancer patients using heterogeneous and irregular data. PLOS ONE. 2018;13: e0199839. doi: 10.1371/journal.pone.0199839
  • 29. Sandokji I, Yamamoto Y, Biswas A, Arora T, Ugwuowo U, Simonov M, et al. A time-updated, parsimonious model to predict AKI in hospitalized children. J Am Soc Nephrol. 2020;31: 1348–1357. doi: 10.1681/ASN.2019070745
  • 30. Abdelrahim M, Mamlouk O, Lin H, Lin J, Page V, Abdel-Wahab N, et al. Incidence, predictors, and survival impact of acute kidney injury in patients with melanoma treated with immune checkpoint inhibitors: a 10-year single-institution analysis. Oncoimmunology. 2021;10: 1927313. doi: 10.1080/2162402X.2021.1927313
  • 31. Lauritsen SM, Kristensen M, Olsen MV, Larsen MS, Lauritsen KM, Jørgensen MJ, et al. Explainable artificial intelligence model to predict acute critical illness from electronic health records. Nat Commun. 2020;11: 3852. doi: 10.1038/s41467-020-17431-x
  • 32. Zhang Y, Yang D, Liu Z, Chen C, Ge M, Li X, et al. An explainable supervised machine learning predictor of acute kidney injury after adult deceased donor liver transplantation. J Transl Med. 2021;19: 321. doi: 10.1186/s12967-021-02990-4
  • 33. Shakeri E, Mohammed EA, Shakeri HAZ, Far B. Exploring features contributing to the early prediction of sepsis using machine learning. Annu Int Conf IEEE Eng Med Biol Soc. 2021. pp. 2472–2475.
  • 34. Monsarrat P, Bernard D, Marty M, Cecchin-Albertoni C, Doumard E, Gez L, et al. Systemic periodontal risk score using an innovative machine learning strategy: an observational study. J Pers Med. 2022;12: 217. doi: 10.3390/jpm12020217
  • 35. Beaulieu-Jones BK, Yuan W, Brat GA, Beam AL, Weber G, Ruffin M, et al. Machine learning for patient risk stratification: standing on, or looking over, the shoulders of clinicians? npj Digit Med. 2021;4: 62. doi: 10.1038/s41746-021-00426-3
  • 36. Yang CC. Explainable artificial intelligence for predictive modeling in healthcare. J Healthc Inform Res. 2022;6: 228–239. doi: 10.1007/s41666-022-00114-1

Decision Letter 0

Giuseppe Remuzzi

2 Nov 2023

PONE-D-23-28945

Interpretable machine learning-based individual analysis of acute kidney injury in immune checkpoint inhibitor therapy

PLOS ONE

Dear Dr. Okuno,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

The manuscript focuses on a topic of potential interest. However, the study has major shortcomings that preclude sound conclusions. To mention some of them: i) in the Methods section, more details should be provided regarding the source of the data, the type of data collected, and the specific methods used for data analysis; ii) a more detailed description of the data analysis methods would enable a better understanding of the approach used to analyze the data; iii) provide a proper justification for the choice of LightGBM algorithm between all the available classification algorithms.

Please submit your revised manuscript by Dec 17 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Giuseppe Remuzzi

Academic Editor

PLOS ONE

Journal Requirements:

1. When submitting your revision, we need you to address these additional requirements.

Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

Once you have amended this/these statement(s) in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”).

For additional information about PLOS ONE ethical requirements for human subjects research, please refer to http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Sakuragi et al. submit for consideration in PLOS a manuscript entitled “Interpretable machine learning-based individual analysis of acute kidney injury in immune checkpoint inhibitor therapy” dealing with the prediction of acute kidney injury during immune checkpoint inhibitor therapy. They wished to clarify the causes of AKI during cancer treatment by means of a new clustering approach with Shapley Additive exPlanations (SHAP). They developed a decision tree-based machine learning model predicting AKI within 7 days, using the medical records of 616 treated patients. The temporal changes in AKI individual predictive reasoning models represented the features to cluster AKI patients, and the results were compared with annotation by three nephrologists. The ROC analysis showed good performance, and patients with AKI were clustered with significant prognostic differences (p = 0.010). The leading causes of AKI for each cluster were easily interpretable by clinicians. The authors then suggest that such a method is useful for inferring the clinical factors behind each AKI episode among patients with multiple AKI risk factors.

In this very interesting study, Sakuragi et al developed a machine-learning algorithm based on gradient-boosting decision-tree (LightGBM) to predict the onset of AKI, cleverly deriving the SHAP value to consider only the variables available over time. The algorithm developed exhibits good performances. The methodology is interesting. However, being not as clear as it deserves for this important topic, it needs to be clarified. For example, the authors mention that the total dataset was split into 20% for testing and 80% for training or validation. This terminology is a bit odd as it may confuse the reader by conveying the idea of a validation. It is customary to mention “training” and “test” sets, only.

The authors missed an important and recent reference about the physiopathology of ICI-related AKI (Gérard AO et al., DOI: 10.1093/ckj/sfac109), which should be added to the list of references.

The authors mention the use of the LightGBM algorithm; they should provide a proper justification. Have they tested other types of classification algorithms, either linear-plane (e.g. SVM) or decision-tree (e.g. XGBoost or CatBoost)? The use of a given algorithm may omit its testing, but its choice needs a justification (e.g. avoiding overlearning with XGBoost? Rapid calculation with LightGBM? Fewer branches as compared with XGBoost?). Please explain.

How was the number of clusters determined? Did they use K-means clustering? Have the authors corrected for multiple comparisons when analyzing and providing their results?

The authors do not properly cite the sources of their libraries nor the code language (Python) used.

The authors cleverly used the SHAP value to evaluate the probability of occurrence of AKI, but what about the missing predictors? Did they consider the impact of each variable to be independent over time?

In the results part, the authors seem able to predict 7 days before, the occurrence of AKI in patients treated with ICIs but, above all to explain (thanks to their SHAP) the weight of each factor in the final prediction, therefore allowing a better interpretability. However, the authors excluded patients with multiple AKI events within two weeks from the date of the first AKI. Yet these patients appear (at least to me) the first to deserve such predictions. Please discuss.

Minor points:

The Figures require a better definition, as they are hard to read. Likewise, the lines on heatmap (2d and 2e) are almost impossible to decipher.

Reviewer #2: In the introduction section, the sentence "In recent years, AI has become an integral part of various industry verticals, and it is estimated that the market size of AI will reach 7.38 billion by 2025." contains a grammar error. The word "of" before "markets" should be replaced with "for". Therefore, the corrected sentence would be "In recent years, AI has become an integral part of various industry verticals, and it is estimated that the market size for AI will reach 7.38 billion by 2025."

In the discussion section, the sentence "The use of AI in healthcare has revolutionized the way we approach treatment and has led to more personalized care plans for patients." contains an error in the usage of the word "revolutionized". The correct word would be "revolutionized", which correctly conveys the intended meaning. Therefore, the corrected sentence would be "The use of AI in healthcare has revolutionized the way we approach treatment and has led to more personalized care plans for patients."

3. In the Methods section, the paper describes the data collection process but does not provide sufficient details regarding the source of the data, the type of data collected, and the specific methods used for data analysis. It would be helpful to provide more information about the data source and the type of data collected to enhance the credibility of the study. Additionally, a more detailed description of the data analysis methods would enable a better understanding of the approach used to analyze the data.

4. Throughout the paper, there are several grammar errors and typos that need to be corrected. These errors include misspelled words, punctuation errors, and inconsistent usage of language. It is essential to proofread the paper carefully and address these language errors to enhance readability and professionalism.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Milou-Daniel DRICI

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2024 Mar 19;19(3):e0298673. doi: 10.1371/journal.pone.0298673.r002

Author response to Decision Letter 0


25 Nov 2023

Dear Dr. Remuzzi,

Thank you for your comprehensive review of our manuscript and for highlighting areas that require further clarification and improvement. We have thoroughly addressed each of the points raised by you in your feedback, and hope that our revised manuscript better meets the publication criteria of PLOS ONE.

1. Details in the Methods Section (Point i):

In response to your first point, we have expanded the Methods section to provide more details about the data source, the types of data collected, and the specific methods used for data analysis. We believe these additions will offer a clearer understanding of our research methodology and the robustness of our approach.

2. Detailed Description of Data Analysis Methods (Point ii):

In response to this comment, we have elaborated on the data analysis methods in the revised manuscript. This includes a more detailed explanation of how the data was processed, the statistical methods applied, and the rationale behind our analytical choices. We hope this enhanced description will provide a better comprehension of the analytical framework and the steps taken to ensure the integrity of our findings.

3. Justification for the Use of LightGBM Algorithm (Point iii):

Regarding the use of the LightGBM algorithm, we have provided a detailed justification in the revised manuscript. We compared LightGBM with other classification algorithms, such as Logistic Regression, Support Vector Machine, XGBoost, and CatBoost, in terms of performance and computation time. Our findings, detailed in the supporting data, demonstrate that LightGBM offered superior accuracy and efficiency, making it the most suitable choice for our study's requirements.

Regarding the sentences that Reviewer #2 identified as containing grammatical errors, we have carefully re-examined our manuscript and believe there may have been a misunderstanding, as the specific sentences quoted do not appear in our submission. We respectfully suggest this could be a mix-up with another manuscript. Nevertheless, we are grateful for the review and have made every effort to ensure our manuscript's clarity and accuracy.

Additionally, in response to the ethical guidelines of PLOS ONE, we have revised the 'Ethical Statement and Informed Consent' section of our manuscript to provide a clearer explanation of the informed consent process as per Japanese regulations and the ethical approval of Kyoto University.

We have prepared separate, detailed response letters for each reviewer, addressing their specific comments and suggestions. These response letters have been uploaded as independent files alongside our revised manuscript to facilitate an organized and clear review process. The revisions included in the revised manuscript encompass both the changes made by us, the authors, and those made during English language editing by Editage (www.editage.com). The revisions can be seen as track changes in the revised manuscript. In line with your instructions, we will include a rebuttal letter, a marked-up copy of the manuscript, and an unmarked version in our resubmission.

We appreciate the opportunity to enhance our manuscript based on your valuable feedback and look forward to the possibility of our work contributing to the scientific community through PLOS ONE.

Sincerely,

Yasushi Okuno, Ph.D.

Department of Biomedical Data Intelligence, Graduate School of Medicine, Kyoto University

53 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan

okuno.yasushi.4c@kyoto-u.ac.jp

Motoko Yanagita, M.D., Ph.D.

Department of Nephrology, Graduate School of Medicine, Kyoto University

54 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan

motoy@kuhp.kyoto-u.ac.jp

-------------------------------------------------------------------- 

Dear Reviewer #1,

Thank you very much for providing such detailed and constructive feedback on our manuscript. We have carefully considered each point you raised and have made corresponding revisions to enhance the quality and clarity of our study. Please find below our responses to your comments, which we have carefully considered and addressed in our revised manuscript.

1. Clarification of Dataset Splitting Terminology:

We apologize for any confusion caused by our initial description of the dataset splitting. In the revised manuscript, we have clarified that our dataset was divided into an 80% training set and a 20% test set. We have removed the term "validation" to avoid any ambiguity and ensure a clear understanding.
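As an illustration only, an 80%/20% split of this kind can be sketched as follows. The sketch assumes the split is made at the patient level (so that all records of one patient fall on the same side), which the text above does not state; the patient identifiers, the seed, and the function name `split_patients` are hypothetical.

```python
import random

def split_patients(patient_ids, train_fraction=0.8, seed=0):
    """Shuffle patient IDs reproducibly and cut them into train and test sets."""
    ids = sorted(patient_ids)          # fixed starting order for reproducibility
    rng = random.Random(seed)          # local generator, does not touch global state
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# 616 hypothetical patient identifiers (the cohort size reported in the study)
patient_ids = [f"P{i:03d}" for i in range(1, 617)]
train_ids, test_ids = split_patients(patient_ids)
print(len(train_ids), len(test_ids))
```

Splitting by patient rather than by record is one common way to keep time-series rows from the same person out of both sets; whether the study did so is an assumption of this sketch.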

2. Reference Addition:

We appreciate your suggestion to include the recent reference on the physiopathology of ICI-related AKI (Gérard AO et al., Clin Kidney J. 2022;15(10):1881-1887. DOI: 10.1093/ckj/sfac109). This reference has been added to our reference list in the revised manuscript (as reference [16]).

3. Justification for the Use of LightGBM Algorithm:

Thank you for your valuable comment regarding our choice of algorithm. We would like to elaborate on our rationale for selecting LightGBM (LGB) as the primary algorithm. The decision was primarily driven by two factors: the high accuracy of Gradient Boosting Decision Tree (GBDT) algorithms and their capability to directly handle missing values, which are notably present in EMR time-series data. This approach maintains accuracy while avoiding the pre-processing cost and the loss of accuracy that interpolating time-series variables can introduce. In response to your suggestion, we have additionally presented a comparative analysis of the performance and computation time of various algorithms, including Logistic Regression (LR), Support Vector Machine (SVM), XGBoost (XGB), CatBoost, and LGB. The results of this analysis have been added to our supporting data (see S1 Table), which also includes details about the server specifications used for the calculations. The comparison, based on the average performance and computation time across 10 learning and inference cycles, indicated that LGB outperformed the other algorithms in terms of both accuracy and speed. Consequently, we chose LGB for our study. As an additional note, for algorithms such as SVM and LR that cannot directly handle missing values, missing values were handled by interpolation or deletion. We hope this explanation clarifies our choice of LightGBM for the study and addresses your concern.
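The point about handling missing values directly can be illustrated with a single tree split. The sketch below, written in the spirit of how GBDT libraries treat missing values and not taken from LightGBM's implementation, fixes a threshold and then tries sending the rows with a missing feature to the left or to the right branch, keeping whichever direction gives the lower squared error. The data, the threshold, and the function names are hypothetical.

```python
def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_missing_direction(xs, ys, threshold):
    """Choose the branch that rows with a missing feature (None) should follow."""
    left = [y for x, y in zip(xs, ys) if x is not None and x <= threshold]
    right = [y for x, y in zip(xs, ys) if x is not None and x > threshold]
    missing = [y for x, y in zip(xs, ys) if x is None]
    loss_if_left = sse(left + missing) + sse(right)
    loss_if_right = sse(left) + sse(right + missing)
    if loss_if_left <= loss_if_right:
        return "left", loss_if_left
    return "right", loss_if_right

# Hypothetical laboratory values with two unmeasured entries, and AKI labels
lab_values = [0.8, 0.9, 1.0, 2.1, 2.4, None, None]
aki_labels = [0, 0, 0, 1, 1, 1, 1]

direction, loss = best_missing_direction(lab_values, aki_labels, threshold=1.5)
print(direction, loss)
```

No value is imputed here: the missing entries are simply routed to the branch whose outcomes they resemble, which is why no interpolation step is needed before training.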

4. Determination of the Number of Clusters:

Thank you for your question regarding the determination of the number of clusters in our analysis. We adopted hierarchical clustering to perform the SHAP-based patient clustering. The number of clusters was determined using two approaches: first, through visual inspection of the dendrogram produced by the hierarchical clustering, and second, by applying the elbow method, a common technique in non-hierarchical clustering methods such as K-means. While the elbow method did not yield a definitive number of clusters, the dendrogram clearly indicated the presence of four distinct clusters. Consequently, we chose to define four clusters for our analysis. This dendrogram is presented at the left side of the cluster map in Figure 3a. We realize that the original manuscript lacked specific details about how the number of clusters was decided, which may have caused some confusion. To address this, we have now included a more comprehensive explanation of the cluster determination process in the Results section of the revised manuscript (please refer to line 204 in the unmarked version). We appreciate your insightful feedback, which has helped us improve the clarity and completeness of our paper.
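The dendrogram-based choice described above can be sketched in a few lines. This is a toy illustration, not the analysis code: it assumes average linkage with Euclidean distance (the linkage and metric are not stated here), uses invented two-dimensional "SHAP profiles", and replaces visual inspection with a simple rule that looks for the largest jump between successive merge heights.

```python
from itertools import combinations
from math import dist

def average_linkage(points, a, b):
    """Mean pairwise Euclidean distance between two clusters of point indices."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

def hierarchical(points):
    """Agglomerative clustering; returns partitions by cluster count and merge heights."""
    clusters = [[i] for i in range(len(points))]
    partitions = {len(clusters): [list(c) for c in clusters]}
    heights = []
    while len(clusters) > 1:
        # Find the closest pair of clusters and merge it
        height, a, b = min(
            (average_linkage(points, clusters[i], clusters[j]), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        heights.append(height)
        partitions[len(clusters)] = [sorted(c) for c in clusters]
    return partitions, heights

# Hypothetical SHAP profiles for eight patients (two features each)
shap_profiles = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.0),
                 (0.0, 5.0), (0.1, 5.0), (5.0, 5.0), (5.1, 5.0)]

partitions, heights = hierarchical(shap_profiles)

# The largest jump in merge height marks where the dendrogram should be cut
jumps = [later - earlier for earlier, later in zip(heights, heights[1:])]
n_clusters = len(shap_profiles) - 1 - jumps.index(max(jumps))
print(n_clusters, sorted(partitions[n_clusters]))
```

On real data the jump is rarely this clean, which is consistent with the authors relying on inspection of the dendrogram when the elbow method was inconclusive.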

5. Corrected for Multiple Comparisons:

Thank you for raising this important point. In our analysis (Figure 3b), where we compared major AKI factors across clusters, we focused on testing whether any of the labels significantly dominated within each cluster, rather than performing specific label-to-label comparisons. Hence, we did not adjust for multiple comparisons. We observed that clusters 3 and 4 demonstrated significant differences in label distributions, while clusters 1 and 2 showed distinct trends, though these were not statistically significant. We recognize the potential for misinterpretation of our initial statement regarding significant differences. Therefore, we have revised the applicable text to more accurately reflect our findings (please refer to line 211 in the unmarked version): "While there was a clear trend in the label distribution within each cluster, only clusters 3 and 4 showed statistically significant differences. The most dominant labels in each cluster were as follows: cluster 1, “Hypovolemia”; cluster 2, “Drug-related”; cluster 3, “Drug-related”; and cluster 4, “Cancer Cachexia.”" This approach was similarly applied in the survival analysis (Figure 3d) and the comparison of clinical characteristics (Table 2). The absence of multiple comparison adjustments in our analysis was due to our focus on demonstrating that SHAP-based clustering could reveal clinically relevant trends, rather than pinpointing specific differences between clusters. However, we acknowledge the importance of clearly communicating to our readers the lack of multiple comparison adjustments and appreciate your feedback regarding this matter.

Additionally, we would like to correct an error in our manuscript regarding the statistical test used for analyzing label distribution within each cluster. We incorrectly mentioned 'Cochran's Q test' in the Methods section; however, the actual test used was the 'Chi-square goodness-of-fit test'. We apologize for this oversight and have made the necessary correction in our manuscript.
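To make the goodness-of-fit test concrete, a minimal sketch follows. It checks whether the label counts within a single cluster depart from an even split across labels. The counts, the "Other" label, the uniform null distribution, and the function name are all hypothetical and not taken from our data. The threshold 7.815 is the standard chi-square critical value at a significance level of 0.05 with 3 degrees of freedom.

```python
def chi_square_gof(observed, expected=None):
    """Chi-square goodness-of-fit statistic and its degrees of freedom.

    If no expected counts are given, a uniform distribution over the
    categories is assumed (an illustrative choice of null hypothesis).
    """
    total = sum(observed)
    k = len(observed)
    if expected is None:
        expected = [total / k] * k
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, k - 1

# Hypothetical label counts for one cluster (not real study data).
counts = {
    "Hypovolemia": 4,
    "Drug-related": 5,
    "Cancer cachexia": 18,
    "Other": 5,
}

stat, dof = chi_square_gof(list(counts.values()))

# Critical value of the chi-square distribution for alpha = 0.05, dof = 3.
CRITICAL_0_05_DF3 = 7.815

dominant_label = max(counts, key=counts.get)
is_significant = stat > CRITICAL_0_05_DF3
```

With these made-up counts the statistic is 16.75 on 3 degrees of freedom, which exceeds the critical value. Because one such test is run per cluster, a reader who wants family-wise error control would need to apply a correction on top of this.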

6. Citation of Libraries and Code Language:

We appreciate your observation regarding the citation of the libraries and the programming language (Python) used in our study. Upon review, we realized that while we had included some citations, this information was insufficiently detailed in places. To address this, we have carefully revised the manuscript so that all the libraries used are now properly cited and Python is more fully acknowledged as the programming language. We believe these revisions more accurately reflect the resources utilized in our research and enhance the manuscript's clarity and transparency.

7. Addressing Missing Predictors and Independent Variable Impact:

Thank you for your insightful comment. In our research, we employed a combination of 11 unique features, 4 vital signs, and 68 types of time-series data, which included 52 laboratory values and 16 medication details, all extracted from Electronic Medical Records (EMRs). Detailed information on this can be found in our Supporting Methods (S2 Method) and Supporting Figure (S1 Fig.). We aggregated these features into lagged variables based on data from the preceding four weeks relative to each prediction point. Our focus was mainly on structured data with less than 50% missingness, but we acknowledge that unstructured data, such as notes from medical records and laboratory results with substantial missingness, could represent potential missing predictors. Furthermore, we ensured that each time-series data point was independently aggregated for each specific prediction time point. Consequently, each feature was used exclusively for the prediction relevant to its respective time point, maintaining the temporal independence of the variables. We appreciate your attention to this critical aspect of our methodology.
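The weekly aggregation described above can be sketched roughly as follows. This is a simplified illustration rather than our pipeline: the function name, the half-open week boundaries, the exclusion of the prediction day itself, and the example values are assumptions. Weeks without any measurement are left as None instead of being interpolated, on the assumption that the downstream model handles missing values natively.

```python
from datetime import date, timedelta

def weekly_lag_means(measurements, prediction_day, n_weeks=4):
    """Aggregate one time series into weekly means before a prediction day.

    measurements: list of (date, value) pairs for a single variable.
    Returns [lag_1, ..., lag_n], where lag_1 is the most recent week.
    A week with no measurement yields None (treated as missing).
    """
    lags = []
    for week in range(n_weeks):
        end = prediction_day - timedelta(days=7 * week)  # exclusive bound
        start = end - timedelta(days=7)                  # inclusive bound
        values = [v for d, v in measurements if start <= d < end]
        lags.append(sum(values) / len(values) if values else None)
    return lags

# Hypothetical serum creatinine values (mg/dL) for one patient.
scr = [
    (date(2020, 3, 2), 0.5),
    (date(2020, 3, 10), 0.75),
    (date(2020, 3, 12), 1.25),
    (date(2020, 3, 27), 1.0),
]

lags = weekly_lag_means(scr, prediction_day=date(2020, 3, 29))
# lags == [1.0, None, 1.0, 0.5]
```

Because the window is recomputed from the raw measurements for every prediction day, each prediction point gets its own set of lagged values.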

8. Inclusion of Patients with Multiple AKI Events:

Thank you for your insightful comment. Indeed, the period immediately following an AKI episode is generally recognized as carrying a significant risk of further AKI, and ordinarily, this period should not be excluded from study. However, in our study's model design, including the period right after the first AKI episode as part of the prediction target posed several challenges. Our model uses lagged variables, aggregating the average values for each week leading up to each prediction point. For example, the 'average Serum Creatinine (SCr) value of the last week' could vary significantly between the time just before and just after the initial AKI episode. This variation occurs because the rise in SCr value resulting from an AKI episode is reflected in the lagged variables. In this case, the model would learn to label both the time point just before AKI and a time point after AKI as positive. Consequently, the model might learn the post-AKI increase in SCr value as an important predictive factor for AKI. This is particularly problematic if the AKI persists for several days, as the SCr value increase resulting from the first AKI episode would then significantly influence the lagged variables used for learning AKI positive labels in subsequent days. As a result, the model could erroneously learn to prioritize the rise in SCr value as the main predictive factor for AKI. The aim of the AKI prediction model was to forecast future AKI before an increase in SCr values. Therefore, using the post-AKI rise in SCr for learning would divert the model from its intended function. To maintain the accuracy of the model and prevent a decline in interpretability, we chose to exclude the period immediately following an AKI episode from the prediction target. Given that AKI can last several days, and considering that the aggregation is based on the average of the past week, we excluded the first two weeks after the initial AKI episode from our analysis.

Nevertheless, as you have rightly pointed out, the period immediately following the first AKI episode contains critical information about ongoing or worsening AKI. In our future research, we intend to analyze such conditions in more detail and develop models to predict persistent or worsening AKI, thereby addressing the areas not covered in this study. We are grateful for your valuable feedback.
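As a rough illustration of the two-week exclusion, the following sketch filters candidate prediction days for one patient. It is a simplified assumption-laden example, not our implementation: the function name and dates are hypothetical, and the choice to keep the AKI onset day itself while dropping the 14 days that follow it is one possible boundary convention.

```python
from datetime import date, timedelta

def eligible_prediction_days(days, first_aki_day, washout_days=14):
    """Drop prediction days in the washout window after the first AKI episode.

    Days strictly after first_aki_day and up to washout_days later are
    removed; patients without AKI (first_aki_day is None) keep all days.
    """
    if first_aki_day is None:
        return list(days)
    window_end = first_aki_day + timedelta(days=washout_days)
    return [d for d in days if not (first_aki_day < d <= window_end)]

# Hypothetical patient: daily prediction points in January 2021,
# first AKI episode on 10 January.
all_days = [date(2021, 1, 1) + timedelta(days=i) for i in range(31)]
kept = eligible_prediction_days(all_days, first_aki_day=date(2021, 1, 10))
```

In this example 11 to 24 January are dropped, so the weekly SCr averages that still carry the post-AKI rise never reach the training labels, and prediction resumes on 25 January.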

9. Improvement of Figures and Heatmaps:

Finally, we acknowledge that the clarity of the figures and heatmaps was suboptimal. In response, we have enhanced the resolution and contrast of all figures, including the lines on heatmaps (2d and 2e), for better readability and interpretation.

Once again, thank you for your thorough review and insightful comments. We have endeavored to address each comment to the best of our ability and believe that as a result, our manuscript is now much improved. We are eager to hear any additional feedback that you may have.

Sincerely,

Yasushi Okuno, Ph.D.

Department of Biomedical Data Intelligence, Graduate School of Medicine, Kyoto University

53 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan

okuno.yasushi.4c@kyoto-u.ac.jp

Motoko Yanagita, M.D., Ph.D.

Department of Nephrology, Graduate School of Medicine, Kyoto University

54 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan

motoy@kuhp.kyoto-u.ac.jp

-------------------------------------------------------------------- 

Dear Reviewer #2,

Thank you very much for providing such detailed and constructive feedback on our manuscript. We have carefully considered each of your comments and have made corresponding revisions to enhance the quality and clarity of our study. We have taken your feedback seriously and have addressed each comment as follows:

1. Regarding the sentence in the introduction section, we have carefully reviewed our manuscript and were unable to find the specific sentence mentioned ("In recent years, AI has become an integral part of various industry verticals, and it is estimated that the market size of AI will reach 7.38 billion by 2025"). We would be grateful if you could kindly recheck this.

2. Similarly, in relation to the sentence about AI in healthcare in the discussion section, we also could not find this in our submission. We kindly request your assistance in verifying this.

3. We appreciate your constructive comments about the Methods section. We have expanded this section to provide more details on our data collection process, including the sources of data, types of data collected, and a comprehensive description of our data analysis methods. These enhancements, we believe, will significantly improve the understanding and credibility of our study.

4. Regarding the grammatical errors and typos, we are grateful for your attention to detail. We have thoroughly revised and proofread the manuscript to correct these issues, ensuring greater clarity and professionalism in our writing.

We are thankful for the opportunity to refine our manuscript based on your valuable feedback and hope that our revisions meet your expectations. Your guidance is crucial for the improvement of our work.

Sincerely,

Yasushi Okuno, Ph.D.

Department of Biomedical Data Intelligence, Graduate School of Medicine, Kyoto University

53 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan

okuno.yasushi.4c@kyoto-u.ac.jp

Motoko Yanagita, M.D., Ph.D.

Department of Nephrology, Graduate School of Medicine, Kyoto University

54 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan

motoy@kuhp.kyoto-u.ac.jp

Attachment

Submitted filename: Response_to_Reviewer_No.2_PONE-D-23-28945.docx

pone.0298673.s002.docx (15.9KB, docx)

Decision Letter 1

Giuseppe Remuzzi

30 Jan 2024

Interpretable machine learning-based individual analysis of acute kidney injury in immune checkpoint inhibitor therapy

PONE-D-23-28945R1

Dear Dr. Okuno,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Giuseppe Remuzzi

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #3: All the comments made by the first reviewers were addressed. I do not have any further comments. Well done.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Milou-Daniel Drici

Reviewer #3: Yes: Marcelo Rodrigues Bacci

**********

Acceptance letter

Giuseppe Remuzzi

8 Mar 2024

PONE-D-23-28945R1

PLOS ONE

Dear Dr. Okuno,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Prof. Giuseppe Remuzzi

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 File. Contains all the supporting files.

    (PDF)

    pone.0298673.s001.pdf (1.6MB, pdf)
    Attachment

    Submitted filename: Response_to_Reviewer_No.2_PONE-D-23-28945.docx

    pone.0298673.s002.docx (15.9KB, docx)

    Data Availability Statement

    Data cannot be shared publicly because of patient privacy in electronic medical records. Data are available from Kyoto University Graduate School and Faculty of Medicine, Ethics Committee via email (ethcom@kuhp.kyoto-u.ac.jp) or telephone (+81-75-753-4680) for researchers who meet the criteria for access to confidential data.


    Articles from PLOS ONE are provided here courtesy of PLOS
