PLOS One. 2020 Sep 17;15(9):e0239262. doi: 10.1371/journal.pone.0239262

Increasing tendency of urine protein is a risk factor for rapid eGFR decline in patients with CKD: A machine learning-based prediction model by using a big database

Daijo Inaguma 1,*,#, Akimitsu Kitagawa 1,#, Ryosuke Yanagiya 2,#, Akira Koseki 3,#, Toshiya Iwamori 3,#, Michiharu Kudo 3, Yukio Yuzawa 4
Editor: Tatsuo Shimosawa 5
PMCID: PMC7497987  PMID: 32941535

Abstract

Artificial intelligence is increasingly being adopted in medical fields to predict various outcomes. In particular, chronic kidney disease (CKD) is problematic because it often progresses to end-stage kidney disease. However, the trajectories of kidney function depend on individual patients. In this study, we propose a machine learning-based model to predict the rapid decline in kidney function among CKD patients by using a big hospital database constructed from the information of 118,584 patients derived from the electronic medical records system. The database included the estimated glomerular filtration rate (eGFR) of each patient, recorded at least twice over a period of 90 days. The data of 19,894 patients (16.8%) satisfied the CKD criteria. We defined the rapid decline of kidney function as a decline of 30% or more in the eGFR within a period of two years and classified the patients into two groups: those exhibiting rapid eGFR decline and those exhibiting non-rapid eGFR decline. Following this, we constructed predictive models based on two machine learning algorithms. Longitudinal laboratory data including urine protein, blood pressure, and hemoglobin were used as covariates. We used longitudinal statistics computed over 90-, 180-, and 360-day windows prior to the baseline point. The longitudinal statistics included the exponentially smoothed average (ESA), in which each measurement was weighted by 0.9^(t/b), where t denotes the number of days prior to the baseline point and b denotes the decay parameter. In this study, b was taken to be 7 (7-day ESA). We used logistic regression (LR) and random forest (RF) algorithms implemented in Python with the scikit-learn library (https://scikit-learn.org/) for model creation. The areas under the curve for LR and RF were 0.71 and 0.73, respectively. The 7-day ESA of urine protein ranked within the first two places in terms of importance according to both models. 
Further, other features related to urine protein were likely to rank higher than the rest. The LR and RF models revealed that the degree of urine protein, especially if it exhibited an increasing tendency, served as a prominent risk factor associated with rapid eGFR decline.

Introduction

Chronic kidney disease (CKD) is a commonly occurring lifestyle-related disease. It induces problematic symptoms in patients [1], and it can progress to end-stage kidney disease (ESKD) or cause cardiovascular (CV) disease. Its diagnosis is often delayed because most patients remain asymptomatic with respect to kidney dysfunction during stages 1, 2, and 3a of CKD. Therefore, medical check-ups and laboratory tests are essential not only for patients with diabetes or hypertension, but also for the general population. Several reports have demonstrated that treatment by a nephrologist could arrest the decline of estimated glomerular filtration rate (eGFR) in patients with CKD [2–4]. However, the ratio of nephrologists to patients with CKD is low across the world. For these reasons, it might be helpful to identify patients with rapid decline of eGFR among the many CKD patients.

Previous large-scale cohort studies have identified several conditions, including proteinuria, hypertension, and comorbidity of diabetes, as risk factors associated with the rapid decline of eGFR [5–8]. Further, several placebo-controlled clinical trials in CKD patients have established that reno-protective drugs such as renin angiotensin system blockers and sodium-glucose cotransporter-2 inhibitors can decelerate the rate of eGFR decline [9–12]. In other words, if we identify CKD patients exhibiting rapid decline in eGFR, we might be able to intervene in its course at an early stage. Recent studies have focused on evaluating kidney function trajectories in patients to predict the incidence of CV disease and all-cause mortality [13, 14]. Regardless of the primary kidney disease, the decline of eGFR is a common feature in patients with CKD. However, kidney function trajectories are often heavily patient-dependent. Previous reports have established that rapid eGFR decline is related to blood pressure-related problems, comorbidity, and proteinuria, not only in patients with CKD, but also in the general populace [15, 16].

The field of artificial intelligence (AI) has developed rapidly since the 1980s. In recent times, machine learning-based methods have found applications in various fields, including medicine [17–20]. In particular, artificial neural networks have been applied in nephrology for various prediction purposes [21–24]. The ability to automatically identify irregularities in data makes machine learning especially useful for big data comprising a large number of variables, where manual alternatives are not viable. Therefore, machine learning can potentially be applied to big medical data and the prediction of associated phenomena. Our hospital has maintained a big database of more than 900,000 patients treated for different diseases since 2004. To the best of our knowledge, no AI-based methods have been proposed yet to identify the aforementioned risk factors associated with rapid eGFR decline. In this study, we assumed that kidney function trajectories of patients would be informative and aid the diagnosis and subsequent treatment of CKD. Therefore, we developed a machine learning-based model to predict rapid eGFR decline in CKD patients by using a big hospital database.

Materials and methods

Dataset and population criteria

We constructed a database based on the information of 914,280 patients recorded by the electronic medical records system of the Fujita Health University Hospital during the period of June 2004 to July 2019. This study only used the data of the 118,584 patients whose eGFR had been recorded at least twice over a period of 90 days. Of these, 19,894 patients (16.8%) satisfied the following CKD criteria. In this study, CKD was defined as eGFR < 60 ml/min/1.73 m2 and/or urine protein > 1+, as determined by the dipstick method, over a period of more than 90 days. Further, each patient was required to be at least 20 years old at the time of eGFR and urine protein measurement, and each previous measurement was required to have been recorded within two years of the current one. We excluded patients who had undergone dialysis or kidney transplantation before the reference points. Information about comorbidity of diabetes, history of acute kidney injury (AKI), and use of renin angiotensin system inhibitor (RASI) was obtained from the ICD-10 codes in the electronic medical records.
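The CKD criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the tuple layout, and the persistence rule (first and last abnormal measurements more than 90 days apart) are hypothetical, and the dipstick cutoff is left as a parameter because the text does not spell out how "> 1+" maps onto the numeric grades.

```python
from datetime import date

def is_abnormal(egfr, protein_grade, egfr_cutoff=60.0, protein_cutoff=2):
    """True when eGFR is below the cutoff or the dipstick grade exceeds the cutoff.

    Assumption: dipstick grades are coded 0 (-), 1 (+/-), 2 (+), 3 (++), ...
    so protein_cutoff=2 reads "> 1+" as strictly above grade "+".
    """
    return egfr < egfr_cutoff or protein_grade > protein_cutoff

def meets_ckd_criteria(measurements, min_days=90):
    """measurements: list of (date, egfr, protein_grade) tuples.

    Assumption: persistence "over a period of more than 90 days" is
    approximated by the first and last abnormal measurements being more
    than min_days apart.
    """
    abnormal_dates = sorted(
        d for d, egfr, grade in measurements if is_abnormal(egfr, grade)
    )
    if len(abnormal_dates) < 2:
        return False
    return (abnormal_dates[-1] - abnormal_dates[0]).days > min_days

# Hypothetical patient with two low eGFR values about five months apart.
patient = [(date(2015, 1, 1), 55.0, 0), (date(2015, 6, 1), 52.0, 1)]
print(meets_ckd_criteria(patient))
```

A stricter rule, for example requiring every measurement in the interval to be abnormal, would fit the same function signature.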

Classification of patients based on trajectory of eGFR

The rapid decline (RD) of eGFR in CKD patients was defined to be a decline of 30% or more in eGFR within a period of 2 years [25–27]. To avoid temporary spikes in the data, we used the average of the eGFR measurements over a period of 90 days as the eGFR value for each patient. Following this procedure, we identified 5,609 unique CKD patients exhibiting rapid eGFR decline and collected an aggregate of 9,866 samples from them. To form our cohort, in addition to the 9,866 RD samples from 5,609 unique patients, we created control (non-RD) samples by extracting eGFR trajectories from patients with similar profiles exhibiting (1) non-RD eGFR, (2) rapid eGFR decline beginning less than 2 years before the positive sample, (3) same gender, and (4) least mean average difference between ages and eGFR values at the beginning of the trajectories. Following this procedure, we identified 4,302 unique control patients with CKD and extracted 9,866 samples not exhibiting rapid eGFR decline. Fig 1 shows the patient flow. Finally, we combined the two groups and identified an aggregate of 9,911 unique patients for the present study. Fig 2 shows representative examples of reference points in each group. In some cases, reference points were set several times for the same patient. Three kinds of points were available: the reference points of prediction for patients in either group, the RD detection points for patients in the RD group, and the measurement points where eGFR was used for matching for patients in the non-RD group.
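The RD definition above can be illustrated with a short sketch. It assumes a trailing 90-day window for the averaging and a simple pairwise scan; the function names and the day-offset representation are hypothetical and the authors' actual labelling procedure may differ.

```python
def windowed_mean(series, end_day, window=90):
    """Mean of eGFR values measured in the `window` days ending at `end_day`."""
    vals = [v for d, v in series if end_day - window < d <= end_day]
    return sum(vals) / len(vals)

def find_rapid_decline(series, drop=0.30, horizon=730, window=90):
    """Return (start_day, detection_day) of the first rapid decline, or None.

    series: list of (day, egfr) tuples sorted by day, where `day` is an
    integer offset from an arbitrary origin.
    Assumption: a decline counts as rapid when the 90-day averaged eGFR
    falls by `drop` (30%) or more within `horizon` (730) days.
    """
    smoothed = [(d, windowed_mean(series, d, window)) for d, _ in series]
    for i, (d0, e0) in enumerate(smoothed):
        for d1, e1 in smoothed[i + 1:]:
            if d1 - d0 > horizon:
                break  # later points are outside the two-year horizon
            if e1 <= (1 - drop) * e0:
                return d0, d1
    return None

# Hypothetical trajectory: 50 -> 45 -> 34 ml/min/1.73 m2 over 400 days.
print(find_rapid_decline([(0, 50.0), (200, 45.0), (400, 34.0)]))  # (0, 400)
```

Averaging before comparing means that a single low reading surrounded by normal values within 90 days does not trigger a positive label.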

Fig 1. Patient flow.


Fig 2. Representative examples of reference points in each group.


Predictive model

By assigning positive labels to samples in the RD group and negative labels to samples in the non-RD group, we constructed predictive models based on two machine learning algorithms. Longitudinal laboratory test results, including urine protein, blood urea nitrogen, systolic blood pressure, diastolic blood pressure, total cholesterol, hemoglobin, uric acid, and triglyceride, were used as covariates. First, we recorded the reference point values of these tests, i.e., the latest values observed at the reference point of prediction. Only the baseline values of blood urea nitrogen were included. Next, we computed longitudinal statistics over the 90-, 180-, and 360-day windows preceding the reference point. The longitudinal statistics considered in this study were the mean, the standard deviation, and the exponentially smoothed average (ESA), in which each measurement was weighted by 0.9^(t/b), where t is the number of days before the reference point and b is the decay parameter. In this study, b was taken to be 7 (weekly decay), so a measurement taken one week before the reference point carries a weight of 0.9. The missing values corresponding to each laboratory test were imputed via the last observation carried forward method. If no data were available for a test, the mean value of the corresponding training data was used instead. Additionally, all the values were standardized. In the following step, using these covariates and training labels, we applied the logistic regression (LR) and random forest (RF) algorithms, implemented in Python with the scikit-learn library (https://scikit-learn.org/), to create two classification models for RD. We optimized the models by tuning the hyperparameters of the algorithms, including the regularization parameters of LR and the number of trees, etc. for RF. After identifying the optimal parameters via inner four-fold cross validation, we evaluated these models using outer five-fold cross validation. 
The contribution of each feature to RD was evaluated via the analysis of coefficient weights in the LR model and the mean decrease in Gini impurity in the RF model. We defined three patterns by grouping together the features as follows. All three patterns were built on the same 11 features: comorbidity of diabetes, history of AKI, systolic blood pressure, diastolic blood pressure, use of RASIs, urine protein, hemoglobin, serum uric acid, blood urea nitrogen, serum total cholesterol, and serum triglyceride. Pattern 1 comprised the values of these features at the reference point. Pattern 2 comprised the values at the reference point, the mean and standard deviation over the 180 days prior to the reference point, and the 7-day ESA. Pattern 3 comprised the values at the reference point, the mean and standard deviation over the 90, 180, and 360 days prior to the reference point, and the 7-day ESA.
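As a rough illustration of the longitudinal statistics used in these patterns, the sketch below computes window means, standard deviations, and a 7-day ESA for a single laboratory test. It assumes that the ESA is a weighted average normalized by the sum of the weights and that the standard deviation is the population form; neither detail is stated in the text, and the function and key names are hypothetical.

```python
from statistics import mean, pstdev

def esa(observations, decay_days=7, base=0.9):
    """Exponentially smoothed average of (days_before_reference, value) pairs.

    Each value is weighted by base ** (t / decay_days), so recent
    measurements (small t) dominate. Assumption: weights are normalized
    by their sum.
    """
    weights = [base ** (t / decay_days) for t, _ in observations]
    weighted_sum = sum(w * v for w, (_, v) in zip(weights, observations))
    return weighted_sum / sum(weights)

def window_features(observations, windows=(90, 180, 360)):
    """Mean and SD per look-back window, plus the 7-day ESA."""
    feats = {}
    for w in windows:
        vals = [v for t, v in observations if 0 <= t <= w]
        if vals:  # windows without data would be imputed in a later step
            feats[f"mean_{w}"] = mean(vals)
            feats[f"sd_{w}"] = pstdev(vals)
    feats["esa_7"] = esa(observations)
    return feats

# Hypothetical dipstick grades: 3 at the reference point, 1 a week earlier.
print(esa([(0, 3.0), (7, 1.0)]))  # (3.0 + 0.9 * 1.0) / 1.9, above the plain mean of 2.0
```

Because the weights decay with t, a rise in urine protein shortly before the reference point pulls the ESA above the plain window mean, which is how the feature captures an increasing tendency.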

Ethics approval and consent to participate

The present study was conducted following the Ethical guidelines for Clinical Research by the Japanese Ministry of Health, Labor, and Welfare (created July 30, 2003; full revision December 28, 2004; full revision July 31, 2008) and the Helsinki Declaration (revised 2013). It was approved by the clinical research ethics committees at Fujita Health University School of Medicine (approval number: HM19-157). All data were fully anonymized before analysis. The contents of the research were displayed in an information disclosure document on the web, and informed consent was obtained in the form of opt-out on the website. Those who declined were excluded. The trial registration number of the study is UMIN 000037476, and it was registered on August 1, 2019.

Results

Comparison of patient characteristics and laboratory data at the reference point

Table 1 compares the patient characteristics and laboratory data of the two groups at the reference point. No significant differences in age, sex, eGFR, or history of acute kidney injury (AKI) were observed between the two groups. However, the rate of comorbid diabetes, blood pressure, serum total cholesterol, serum uric acid, serum triglyceride, and amount of urine protein were higher in the RD group. Meanwhile, use of renin angiotensin system inhibitors was lower in the RD group. Further, blood pressure, serum total cholesterol, serum uric acid, serum triglyceride, and amount of urine protein were higher in the RD group over periods of 90, 180, and 360 days prior to the reference point (S1 Table) and in terms of the 7-day ESA (S2 Table).

Table 1. Patient characteristics and laboratory data at the reference point.

Variable                       All (n = 19,732)   RD group (n = 9,866)   Non-RD group (n = 9,866)   p value
Age (years)                    68.5, 13.7         68.5, 13.7             68.5, 13.6                 1.000
Female gender (%)              41.7               41.7                   41.7                       1.000
Comorbidity of diabetes (%)    31.3               35.6                   27.1                       < 0.001
History of AKI (%)             4.6                4.6                    4.5                        0.707
SBP (mmHg)                     131, 26            136, 26                128, 26                    < 0.001
DBP (mmHg)                     73, 15             74, 15                 72, 15                     < 0.001
Use of RASIs (%)               61.8               56.8                   66.8                       < 0.001
eGFR (ml/min/1.73 m2)          39.9, 26.0         39.9, 26.0             39.9, 26.1                 0.760
Serum creatinine (mg/dL)       2.23, 2.04         2.25, 2.05             2.21, 2.04                 0.061
BUN (mg/dL)                    29.5, 19.1         29.8, 17.9             29.3, 20.1                 < 0.001
Hemoglobin (g/dL)              11.5, 2.2          11.4, 2.1              11.5, 2.3                  0.001
Hematocrit (%)                 34.8, 6.4          34.7, 6.1              34.9, 6.8                  < 0.001
Serum T-C (mg/dL)              181, 49            186, 50                175, 47                    < 0.001
Serum TG (mg/dL)               142, 91            151, 100               133, 79                    < 0.001
Serum uric acid (mg/dL)        6.2, 2.0           6.3, 1.9               6.0, 2.0                   < 0.001
Urine protein *                1.9, 1.8           2.3, 1.9               1.4, 1.6                   < 0.001
Urine protein **               2 [0, 3]           2 [0, 5]               1 [0, 3]

Values are given as mean, standard deviation, or as %.

* Continuous value of the dipstick urine protein test.

** Semi-quantitative value of the dipstick urine protein test, given as 50th percentile [25th percentile, 75th percentile].

Dipstick coding: 0, -; 1, ±; 2, +; 3, ++; 4, +++; 5, ++++.

RD, rapid decline; AKI, acute kidney injury; SBP, systolic blood pressure; DBP, diastolic blood pressure; RASI, renin angiotensin system inhibitor; eGFR, estimated glomerular filtration rate; BUN, blood urea nitrogen; T-C, total cholesterol; TG, triglyceride.

Comparison of areas under the curve (AUCs) of the two models

Fig 3 shows the receiver operating characteristic curves, and Table 2 compares the AUCs of the LR-based and RF-based models in the prediction of RD. The AUCs of the LR-based model using Patterns 1, 2, and 3 were 0.67, 0.69, and 0.71, respectively. The AUCs of the RF-based model using Patterns 1, 2, and 3 were 0.68, 0.71, and 0.73, respectively. For both models, the AUC increased with the number of features.
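For readers less familiar with the metric, the AUC equals the probability that a randomly chosen RD sample receives a higher predicted score than a randomly chosen non-RD sample, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the toy labels and scores are hypothetical and are not taken from the study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive (RD) samples, 0 for negative (non-RD) samples.
    scores: predicted scores, higher meaning more likely positive.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Count positive-negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Three of the four positive-negative pairs are ordered correctly.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

The pairwise form is quadratic in the number of samples, so it is meant for illustration; library implementations use a sort-based computation.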

Fig 3. Receiver operating characteristic curve for prediction of the RD.


A. The Pattern 1 (the LR model). B. The Pattern 2 (the LR model). C. The Pattern 3 (the LR model). D. The Pattern 1 (the RF model). E. The Pattern 2 (the RF model). F. The Pattern 3 (the RF model).

Table 2. Comparison of AUC by models.

Model                        Pattern   AUC
Logistic regression model    1         0.67
Logistic regression model    2         0.69
Logistic regression model    3         0.71
Random forest model          1         0.68
Random forest model          2         0.71
Random forest model          3         0.73

The features in each pattern are: comorbidity of diabetes, history of AKI, SBP, DBP, use of RASIs, urine protein, hemoglobin, serum uric acid, BUN, serum total cholesterol, serum triglyceride.

Pattern 1: at baseline (at start point of rapid eGFR decline).

Pattern 2: at baseline, average and standard deviation of features during 180 days prior to the baseline, and 7-day exponentially smoothed average of features.

Pattern 3: at baseline, average and standard deviation of features during 90, 180, and 360 days prior to the baseline, and 7-day exponentially smoothed average of features.

AUC, area under curve; AKI, acute kidney injury; SBP, systolic blood pressure; DBP, diastolic blood pressure; RASI, renin angiotensin system inhibitor; BUN, blood urea nitrogen; eGFR, estimated glomerular filtration rate.

Ranking of features according to the LR-based and RF-based models

Pattern 1 comprised 11 features. In order of importance as measured by the LR-based model, these were urine protein, systolic blood pressure, serum uric acid, blood urea nitrogen, serum total cholesterol, use of RASIs, hemoglobin, serum triglyceride, comorbidity of diabetes, diastolic blood pressure, and history of AKI. In contrast, the RF-based model provided the following list in order of importance: urine protein, systolic blood pressure, serum total cholesterol, blood urea nitrogen, serum uric acid, hemoglobin, serum triglyceride, diastolic blood pressure, use of RASIs, comorbidity of diabetes, and history of AKI. Table 3 lists the top-10 features of Patterns 2 and 3 according to both models. Notably, the 7-day ESA of urine protein ranked within the first two places in both models. Further, features related to urine protein tended to rank higher than the rest. The 7-day ESA of hemoglobin was also consistently ranked highly by the LR-based model.
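The workflow behind these rankings (inner four-fold cross validation for tuning, outer five-fold cross validation for evaluation, then ranking by LR coefficients and RF Gini importance) can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the data set, the hyperparameter grids, and the random seeds are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature matrix (assumption: 12 features).
X, y = make_classification(n_samples=300, n_features=12, n_informative=4,
                           random_state=0)

inner = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)  # tuning
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)  # evaluation

# LR with standardized inputs; the grid over C is hypothetical.
lr = GridSearchCV(
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    param_grid={"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    scoring="roc_auc", cv=inner)

# RF; the grid over tree count and depth is hypothetical.
rf = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [25, 50], "max_depth": [3, None]},
    scoring="roc_auc", cv=inner)

# Nested cross validation: each outer fold re-runs the inner grid search.
auc_lr = cross_val_score(lr, X, y, scoring="roc_auc", cv=outer)
auc_rf = cross_val_score(rf, X, y, scoring="roc_auc", cv=outer)
print("LR AUC:", auc_lr.mean(), "RF AUC:", auc_rf.mean())

# Refit on all data and rank features by importance.
lr.fit(X, y)
rf.fit(X, y)
coef = lr.best_estimator_.named_steps["logisticregression"].coef_[0]
importances = rf.best_estimator_.feature_importances_
lr_rank = np.argsort(-np.abs(coef))   # largest absolute coefficient first
rf_rank = np.argsort(-importances)    # largest mean decrease in impurity first
print("LR top 3:", lr_rank[:3], "RF top 3:", rf_rank[:3])
```

Standardizing the inputs, as the methods describe, is what makes the magnitudes of the LR coefficients comparable across features.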

Table 3. Ranking of the top 10 features in the logistic regression and random forest models.

Rank   Logistic regression, Pattern 2    Logistic regression, Pattern 3    Random forest, Pattern 2       Random forest, Pattern 3
1      hemoglobin (7-day ESA)            urine protein (7-day ESA)         urine protein (7-day ESA)      hemoglobin (90 SD)
2      urine protein (7-day ESA)         hemoglobin (7-day ESA)            hemoglobin (180 SD)            urine protein (7-day ESA)
3      hemoglobin (180 mean)             SBP (7-day ESA)                   urine protein (180 mean)       urine protein (180 mean)
4      total cholesterol (baseline)      hemoglobin (90 SD)                urine protein (baseline)       urine protein (360 mean)
5      hemoglobin (180 SD)               total cholesterol (baseline)      uric acid (180 SD)             urine protein (90 mean)
6      SBP (7-day ESA)                   total cholesterol (7-day ESA)     uric acid (7-day ESA)          hemoglobin (180 SD)
7      total cholesterol (7-day ESA)     hemoglobin (360 mean)             uric acid (180 mean)           urine protein (baseline)
8      SBP (180 mean)                    hemoglobin (180 mean)             total cholesterol (baseline)   hemoglobin (360 SD)
9      urine protein (180 mean)          hemoglobin (90 mean)              BUN (baseline)                 total cholesterol (90 SD)
10     hemoglobin (baseline)             uric acid (90 SD)                 SBP (baseline)                 uric acid (90 SD)

Pattern 2: at baseline, average and standard deviation of features during 180 days prior to the baseline, and 7-day ESA of features.

Pattern 3: at baseline, average and standard deviation of features during 90, 180, and 360 days prior to the baseline, and 7-day ESA of features.

ESA, exponentially smoothed average; SBP, systolic blood pressure; SD, standard deviation; BUN, blood urea nitrogen.

Discussion

We demonstrated that proteinuria, especially when it exhibited a recent spike, was important in the prediction of rapid eGFR decline in CKD patients being treated in a hospital. The present study had three primary characteristics. First, we analyzed big data via machine learning algorithms. We also adopted the ESA of variables as the primary metric when extracting risk factors, because the trends of each variable over time are meaningful in the prediction of eGFR trajectory. Second, we adopted the ESA as one of the features, in particular the ESA of urine protein. This enabled us to give greater weight to measurements closer to the reference point. Finally, the subjects in the present study included out-patients suffering from various diseases involving CKD, while a certain proportion of the data contained kidney function reports from different sections of the population, including the general population, the elderly population, and patients already diagnosed with CKD. In many cases the primary cause of CKD cannot be narrowed to a single kidney disease, because the symptoms often arise from complications of two or more diseases. Diseases other than kidney-related ones can also lead to CKD directly or indirectly during the follow-up period. As the hospital considered in this study is the biggest in Japan, we had access to data pertaining to a large number of patients suffering from various diseases. Because of the diversity of the available data, the results of the present study are informative for managing patients who need to be followed for different diseases in large hospitals. For models that predict future outcomes from past data, an AUC of around 0.7 is generally regarded as good, and we considered the improvement from 0.71 (LR-based model) to 0.73 (RF-based model) to be noteworthy.

Besides proteinuria itself, the present study also established the ESA of proteinuria to be one of the most prominent risk factors associated with rapid eGFR decline. This corroborates the conclusions of several other cohort studies, which have indicated that proteinuria is significantly associated with certain kidney function metrics, including the doubling of serum creatinine level, eGFR halving, and progression to ESKD [5, 6]. The Chronic Renal Insufficiency Cohort (CRIC) study, conducted in the United States of America, calculated the hazard ratios for ESKD and eGFR halving corresponding to the highest and the lowest proteinuria categories to be 11.83 and 11.19, respectively [5]. Meanwhile, the Chronic Kidney Disease Japan Cohort (CKD-JAC) also reported that increased albumin-to-creatinine ratio at the baseline was significantly associated with eGFR halving and progression to ESKD at the primary end-point [6]. It is therefore crucial to carefully observe the temporal trends of urinary protein excretion in order to predict eGFR decline as early as possible. We consider these conclusions of the study to be novel and informative. Meanwhile, it was reported that eGFR decline in patients with CKD stage 3 was relatively slow [28] and that an episode of AKI generally affects the trajectory of eGFR [29, 30]. Hence, we used episode of AKI as a variable in the analysis. However, the episode of AKI did not rank in the top 10 in terms of feature importance. We consider that this might be due to the low incidence of AKI in the present study.

AI-based prediction has been attempted in various medical fields, especially in nephrology [22, 31–33]. Nationwide studies and cohorts have been conducted all over the world to analyze big data more effectively. We accumulated more than 132,000 pieces of medical data, ranging from 2004 to the present, from a single hospital. Despite the retrospective study design, the prediction accuracy of the proposed methods can be further improved by appending additional parameters, such as average and standard deviation values prior to the reference time, and by complementing textual data with digital data. Moreover, the analysis of ESA, which the present study found to be the most prominent feature, was only possible due to the application of machine learning. We consider the conclusions of the study to be of use in real-world clinical scenarios despite the preliminary nature of the study.

The study had the following limitations. First, patient information obtained through ICD-10 codes, including medical histories, comorbidities, and medications, was not completely available. This was because some patients received concurrent treatment for other diseases at other hospitals. Second, the intervals between successive examinations and the frequency of examinations, including blood tests, differed between patients. Hence, we used the average values over periods of 90, 180, and 360 days prior to the baseline. Finally, even though the urinary protein-to-creatinine ratio is currently the best metric for the evaluation of kidney disease, the measurement of proteinuria was only available via the semi-quantitative dipstick method in the present study. This is because the measurement of the urinary protein-to-creatinine ratio has only become popular among physicians other than nephrologists over the last decade. Given these limitations, more advanced systems that acquire more detailed information, including medications prescribed at other facilities, are necessary to enhance the accuracy of the proposed rapid eGFR decline prediction methods.

Conclusion

The proposed RF-based and LR-based machine learning models were effective in identifying patients with rapid eGFR decline in real-world clinical scenarios. Further, urine protein, especially if it exhibited a recent spike, was established to be a prominent risk factor associated with rapid eGFR decline.

Supporting information

S1 Table. Average blood pressure and laboratory data for 90, 180, and 360 days prior to the baseline.

(DOCX)

S2 Table. Exponentially smoothed average of blood pressure and laboratory data for 7 and 30 days.

(DOCX)

Data Availability

10.6084/m9.figshare.12780311.

Funding Statement

IBM Research provided support for this study in the form of salaries for AK, TI and MK. The specific roles of these authors are articulated in the ‘author contributions’ section.

References

  1. Xie Y, Bowe B, Mokdad AH, Xian H, Yan Y, Li T, et al. Analysis of the Global Burden of Disease study highlights the global, regional, and national trends of chronic kidney disease epidemiology from 1990 to 2016. Kidney international. 2018;94(3):567–81. Epub 2018/08/07. doi: 10.1016/j.kint.2018.04.011.
  2. Chen TK, Knicely DH, Grams ME. Chronic Kidney Disease Diagnosis and Management: A Review. JAMA. 2019;322(13):1294–304. Epub 2019/10/02. doi: 10.1001/jama.2019.14745.
  3. Campbell GA, Bolton WK. Referral and comanagement of the patient with CKD. Advances in chronic kidney disease. 2011;18(6):420–7. Epub 2011/11/22. doi: 10.1053/j.ackd.2011.10.006.
  4. Smart NA, Dieberg G, Ladhani M, Titus T. Early referral to specialist nephrology services for preventing the progression to end-stage kidney disease. The Cochrane database of systematic reviews. 2014;(6):CD007333. Epub 2014/06/19. doi: 10.1002/14651858.CD007333.pub2.
  5. Yang W, Xie D, Anderson AH, Joffe MM, Greene T, Teal V, et al. Association of kidney disease outcomes with risk factors for CKD: findings from the Chronic Renal Insufficiency Cohort (CRIC) study. American journal of kidney diseases: the official journal of the National Kidney Foundation. 2014;63(2):236–43. Epub 2013/11/05. doi: 10.1053/j.ajkd.2013.08.028.
  6. Inaguma D, Imai E, Takeuchi A, Ohashi Y, Watanabe T, Nitta K, et al. Risk factors for CKD progression in Japanese patients: findings from the Chronic Kidney Disease Japan Cohort (CKD-JAC) study. Clinical and experimental nephrology. 2017;21(3):446–56. Epub 2016/07/15. doi: 10.1007/s10157-016-1309-1.
  7. De Nicola L, Provenzano M, Chiodini P, Borrelli S, Garofalo C, Pacilio M, et al. Independent Role of Underlying Kidney Disease on Renal Prognosis of Patients with Chronic Kidney Disease under Nephrology Care. PLoS One. 2015;10(5):e0127071. Epub 2015/05/21. doi: 10.1371/journal.pone.0127071.
  8. Toto RD, Greene T, Hebert LA, Hiremath L, Lea JP, Lewis JB, et al. Relationship between body mass index and proteinuria in hypertensive nephrosclerosis: results from the African American Study of Kidney Disease and Hypertension (AASK) cohort. American journal of kidney diseases: the official journal of the National Kidney Foundation. 2010;56(5):896–906. Epub 2010/08/31. doi: 10.1053/j.ajkd.2010.05.016.
  9. Brenner BM, Cooper ME, de Zeeuw D, Keane WF, Mitch WE, Parving HH, et al. Effects of losartan on renal and cardiovascular outcomes in patients with type 2 diabetes and nephropathy. The New England journal of medicine. 2001;345(12):861–9. Epub 2001/09/22. doi: 10.1056/NEJMoa011161.
  10. Lewis EJ, Hunsicker LG, Clarke WR, Berl T, Pohl MA, Lewis JB, et al. Renoprotective effect of the angiotensin-receptor antagonist irbesartan in patients with nephropathy due to type 2 diabetes. The New England journal of medicine. 2001;345(12):851–60. Epub 2001/09/22. doi: 10.1056/NEJMoa011303.
  11. Wanner C, Heerspink HJL, Zinman B, Inzucchi SE, Koitka-Weber A, Mattheus M, et al. Empagliflozin and Kidney Function Decline in Patients with Type 2 Diabetes: A Slope Analysis from the EMPA-REG OUTCOME Trial. Journal of the American Society of Nephrology: JASN. 2018;29(11):2755–69. Epub 2018/10/14. doi: 10.1681/ASN.2018010103.
  12. Perkovic V, Jardine MJ, Neal B, Bompoint S, Heerspink HJL, Charytan DM, et al. Canagliflozin and Renal Outcomes in Type 2 Diabetes and Nephropathy. The New England journal of medicine. 2019;380(24):2295–306. Epub 2019/04/17. doi: 10.1056/NEJMoa1811744.
  13. Xie Y, Bowe B, Xian H, Balasubramanian S, Al-Aly Z. Estimated GFR Trajectories of People Entering CKD Stage 4 and Subsequent Kidney Disease Outcomes and Mortality. American journal of kidney diseases: the official journal of the National Kidney Foundation. 2016;68(2):219–28. Epub 2016/03/08. doi: 10.1053/j.ajkd.2016.02.039.
  14. Rosansky SJ. Renal function trajectory is more important than chronic kidney disease stage for managing patients with chronic kidney disease. American journal of nephrology. 2012;36(1):1–10. Epub 2012/06/16. doi: 10.1159/000339327.
  15. Yu Z, Rebholz CM, Wong E, Chen Y, Matsushita K, Coresh J, et al. Association Between Hypertension and Kidney Function Decline: The Atherosclerosis Risk in Communities (ARIC) Study. American journal of kidney diseases: the official journal of the National Kidney Foundation. 2019. Epub 2019/04/30. doi: 10.1053/j.ajkd.2019.02.015.
  16. Iseki K, Konta T, Asahi K, Yamagata K, Fujimoto S, Tsuruya K, et al. Dipstick proteinuria and all-cause mortality among the general population. Clinical and experimental nephrology. 2018;22(6):1331–40. Epub 2018/06/06. doi: 10.1007/s10157-018-1587-x.
  17. Niel O, Bastard P. Artificial Intelligence in Nephrology: Core Concepts, Clinical Applications, and Perspectives. American journal of kidney diseases: the official journal of the National Kidney Foundation. 2019;74(6):803–10. Epub 2019/08/28. doi: 10.1053/j.ajkd.2019.05.020.
  18. Hamet P, Tremblay J. Artificial intelligence in medicine. Metabolism: clinical and experimental. 2017;69s:S36–s40. Epub 2017/01/28. doi: 10.1016/j.metabol.2017.01.011.
  19. Johnson KW, Torres Soto J, Glicksberg BS, Shameer K, Miotto R, Ali M, et al. Artificial Intelligence in Cardiology. Journal of the American College of Cardiology. 2018;71(23):2668–79. Epub 2018/06/09. doi: 10.1016/j.jacc.2018.03.521.
  20. Yu KH, Beam AL, Kohane IS. Artificial intelligence in healthcare. Nature biomedical engineering. 2018;2(10):719–31. Epub 2019/04/25. doi: 10.1038/s41551-018-0305-z.
  21. Liu Y, Zhang Y, Liu D, Tan X, Tang X, Zhang F, et al. Prediction of ESRD in IgA Nephropathy Patients from an Asian Cohort: A Random Forest Model. Kidney & blood pressure research. 2018;43(6):1852–64. Epub 2018/12/12. doi: 10.1159/000495818.
  22. Xiao J, Ding R, Xu X, Guan H, Feng X, Sun T, et al. Comparison and development of machine learning tools in the prediction of chronic kidney disease progression. Journal of translational medicine. 2019;17(1):119. Epub 2019/04/12. doi: 10.1186/s12967-019-1860-0.
  23. Barbieri C, Molina M, Ponce P, Tothova M, Cattinelli I, Ion Titapiccolo J, et al. An international observational study suggests that artificial intelligence for clinical decision support optimizes anemia management in hemodialysis patients. Kidney international. 2016;90(2):422–9. Epub 2016/06/06. doi: 10.1016/j.kint.2016.03.036.
  24. Zhao J, Gu S, McDermaid A. Predicting outcomes of chronic kidney disease from EMR data based on Random Forest Regression. Mathematical biosciences. 2019;310:24–30. Epub 2019/02/16. doi: 10.1016/j.mbs.2019.02.001.
  25. Coresh J, Turin TC, Matsushita K, Sang Y, Ballew SH, Appel LJ, et al. Decline in estimated glomerular filtration rate and subsequent risk of end-stage renal disease and mortality. JAMA. 2014;311(24):2518–31. Epub 2014/06/04. doi: 10.1001/jama.2014.6634.
  26. Levey AS, Inker LA, Matsushita K, Greene T, Willis K, Lewis E, et al. GFR decline as an end point for clinical trials in CKD: a scientific workshop sponsored by the National Kidney Foundation and the US Food and Drug Administration. American journal of kidney diseases: the official journal of the National Kidney Foundation. 2014;64(6):821–35. Epub 2014/12/03. doi: 10.1053/j.ajkd.2014.07.030.
  27. Matsushita K, Chen J, Sang Y, Ballew SH, Shimazaki R, Fukagawa M, et al. Risk of end-stage renal disease in Japanese patients with chronic kidney disease increases proportionately to decline in estimated glomerular filtration rate. Kidney international. 2016;90(5):1109–14. Epub 2016/09/27. doi: 10.1016/j.kint.2016.08.003.
  • 28.Eriksen BO, Ingebretsen OC. The progression of chronic kidney disease: a 10-year population-based study of the effects of gender and age. Kidney international. 2006;69(2):375–82. Epub 2006/01/13. 10.1038/sj.ki.5000058 . [DOI] [PubMed] [Google Scholar]
  • 29.Fiorentino M, Grandaliano G, Gesualdo L, Castellano G. Acute Kidney Injury to Chronic Kidney Disease Transition. Contributions to nephrology. 2018;193:45–54. Epub 2018/02/03. 10.1159/000484962 . [DOI] [PubMed] [Google Scholar]
  • 30.Chawla LS, Kimmel PL. Acute kidney injury and chronic kidney disease: an integrated clinical syndrome. Kidney international. 2012;82(5):516–24. Epub 2012/06/08. 10.1038/ki.2012.208 . [DOI] [PubMed] [Google Scholar]
  • 31.Koyner JL, Carey KA, Edelson DP, Churpek MM. The Development of a Machine Learning Inpatient Acute Kidney Injury Prediction Model. Critical care medicine. 2018;46(7):1070–7. Epub 2018/03/30. 10.1097/CCM.0000000000003123 . [DOI] [PubMed] [Google Scholar]
  • 32.Kanda E, Kanno Y, Katsukawa F. Identifying progressive CKD from healthy population using Bayesian network and artificial intelligence: A worksite-based cohort study. Scientific reports. 2019;9(1):5082 Epub 2019/03/27. 10.1038/s41598-019-41663-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Elhoseny M, Shankar K, Uthayakumar J. Intelligent Diagnostic Prediction and Classification System for Chronic Kidney Disease. Scientific reports. 2019;9(1):9583 Epub 2019/07/05. 10.1038/s41598-019-46074-2 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Tatsuo Shimosawa

1 Jul 2020

PONE-D-20-15213

A machine learning-based prediction model for rapid glomerular filtration rate decline in patients with chronic kidney disease by using a big database

PLOS ONE

Dear Dr. Inaguma,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

The authors should incorporate important confounding factors in analysis as suggested by the expert.

Please submit your revised manuscript by Aug 15 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Tatsuo Shimosawa, M.D., Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. In the ethics statement in the manuscript and in the online submission form, please provide additional information about the patient records used in your retrospective study. Specifically, please ensure that you have discussed whether all data were fully anonymized before you accessed them and/or whether the IRB or ethics committee waived the requirement for informed consent. If patients provided informed written consent to have data from their medical records used in research, please include this information.

3. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

4. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

5. Thank you for stating the following in the Competing Interests section:

[DI received lecture fees from Ono Pharmaceutical Co., Ltd. and Kyowa Hakko Kirin Co. YY received research support grants from Otsuka Pharmaceutical Co., Ltd., Kyowa Hakko Kirin Co., Ltd., and Chugai Pharmaceutical Co., Ltd. This does not alter our adherence to PLOS ONE policies on sharing data and materials.].   

We note that one or more of the authors are employed by a commercial company: IBM Research

  1. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

Please also include the following statement within your amended Funding Statement.

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.

2. Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc. 

Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: “This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: No

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this study, the authors investigated the risk factors for the rapid decline in estimated glomerular filtration rate (eGFR) using logistic (LR) and random forest (RF) models. They found that urinary protein level and its tendency to increase were associated with rapid eGFR decline. Although potentially useful and interesting, I have several statistical concerns.

1. Title: I think that the most important message to readers is that the presence of urine protein is a risk factor for the rapid decline in eGFR, as indicated by RF. The title “A machine learning-based prediction model for rapid…..” does not reflect the contents of this paper. How about changing the title considering above?

2. Endpoint, P. 8: The endpoint was defined as “a decline of 30% or more in eGFR within a period of 2 years”. How were the patients, who developed end-stage kidney disease or died within one year, treated? If they were not included in the analysis, there was a bias in the analysis.

3. Variables, P. 12: Causes of CKD such as diabetes mellitus, history of diseases such as cardiovascular disease, and use of medications such as angiotensin II receptor blockers are necessary variables when investigating the risk factors for rapid eGFR decline. Additional analysis including these variables should be conducted.

4. AUCs, P. 14: Table 2 shows the AUCs of the models. The AUCs of the RF models, about 0.7, were very low for machine learning models. Because the values were almost the same as those of the LR models, there was no merit in the machine learning analysis. It would be better to find new risk factors for the rapid decline in eGFR by properly using usual statistical methods.

Reviewer #2: This paper addresses an important issue. It reinforces the findings of others that proteinuria is the main determinant of the speed with which kidney function declines. It is not a random sample of patients; they are hospitalized patients, but it is unclear whether the data are inpatient, outpatient, or both. That must be clarified. Inpatient data can be confounded by acute illnesses and their effect on kidney function.

The CKD prevalence of 16.8% is considerably higher than the average prevalence in the general population, 10%. The 28% that had a greater than 30% decline in two years seems awfully high, and again this is not a representative population, since these patients were either hospitalized or come from a hospital database.

Regarding dipstick protein as the source of proteinuria: interpretation of this needs to include urine specific gravity, since the concentration of the urine can greatly affect the urine protein test.

Lines 251 to 253 say the results apply to real-world clinical settings. I think that needs to be adjusted to say that these findings would apply to populations similar to the one studied.

Lines 268 to 274 are very confusing and need to be rewritten: “The strong correlation between the ESA of proteinuria and GFR decline established in the present study was proved under the assumption that the increase in urinary protein excretion reflected the exacerbation of glomerular hypertension and sclerosis. Other than proteinuria, the RF-based model revealed that the ESA of serum creatinine level was also ranked in the top-10 in terms of feature importance. Therefore, it was crucial to carefully observe the temporal trends of urinary protein excretion and kidney function to predict GFR decline as early as possible. We consider the aforementioned conclusions of the study to be novel and informative.”

I do not understand this!!

Other factors that need to be discussed when evaluating the conclusions are the fact that episodes of AKI can affect the rate of change of kidney function over time and the fact that stable renal function in a population of this age is very common. Reference: Erikson, KI 2006, a population of similar age in which 27% had no decline in kidney function over a 10-year interval. Another issue that needs to be addressed in these types of studies is the competing risk of death versus decline of kidney function and how that was adjusted for in this analysis.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Steven Rosansky

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Sep 17;15(9):e0239262. doi: 10.1371/journal.pone.0239262.r002

Author response to Decision Letter 0


20 Aug 2020

Reviewer #1:

1. Title: I think that the most important message to readers is that the presence of urine protein is a risk factor for the rapid decline in eGFR, as indicated by RF. The title “A machine learning-based prediction model for rapid…..” does not reflect the contents of this paper. How about changing the title considering above?

Reply: Thank you for your suggestion. We changed the title of our article so that it reflects the contents.

2. Endpoint, P. 8: The endpoint was defined as “a decline of 30% or more in eGFR within a period of 2 years”. How were the patients, who developed end-stage kidney disease or died within one year, treated? If they were not included in the analysis, there was a bias in the analysis.

Reply: Thank you for your valuable comments. We enrolled all patients who had not received maintenance dialysis before the reference points. In other words, patients who developed end-stage kidney disease or died within one year after the reference points were included in the present study. We revised the description (page 7, lines 103–104).
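The endpoint under discussion (a decline of 30% or more in eGFR within two years of the reference point) can be sketched as a labeling function. This is only an illustration: the function name, the argument layout, the 730-day window, and the rule that a single qualifying measurement is sufficient are assumptions, not details given in the manuscript or in this reply. How patients who reached end-stage kidney disease or died were labeled is likewise not specified here and is left out.

```python
from datetime import date


def is_rapid_decline(baseline_egfr, baseline_date, followups,
                     threshold=0.30, window_days=730):
    """Return True if any follow-up eGFR inside the window is at least
    `threshold` (a fraction) below the baseline eGFR.

    followups: iterable of (measurement_date, egfr) pairs.
    Assumption: one qualifying measurement is enough; the source does
    not say how the decline was confirmed.
    """
    cutoff = baseline_egfr * (1.0 - threshold)
    for measured_on, egfr in followups:
        days_after = (measured_on - baseline_date).days
        # Only measurements after the reference point and within the window count.
        if 0 < days_after <= window_days and egfr <= cutoff:
            return True
    return False


# Hypothetical patient: baseline eGFR of 60, so the cutoff is about 42.
baseline = date(2015, 1, 1)
visits = [(date(2015, 7, 1), 50.0), (date(2016, 6, 1), 41.0)]
print(is_rapid_decline(60.0, baseline, visits))
```

Under these assumptions, a measurement taken more than two years after the reference point does not count toward the endpoint, even if it is below the cutoff.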

3. Variables, P. 12: Causes of CKD such as diabetes mellitus, history of diseases such as cardiovascular disease, and use of medications such as angiotensin II receptor blockers are necessary variables when investigating the risk factors for rapid eGFR decline. Additional analysis including these variables should be conducted.

Reply: Thank you for your reasonable comments. As you pointed out, variables including comorbid diabetes, use of renin angiotensin system inhibitors, and history of acute kidney injury have been reported to be risk factors for rapid GFR decline. Therefore, we added these factors and re-analyzed the data. We show the results in Tables 1 and 2 and added the description (page 7, line 104 – page 8, line 106; page 10, lines 152–159; page 20, line 290 – page 21, line 294). However, these variables did not rank among the top 10 features.

4. AUCs, P. 14: Table 2 shows the AUCs of the models. The AUCs of the RF models, about 0.7, were very low for machine learning models. Because the values were almost the same as those of the LR models, there was no merit in the machine learning analysis. It would be better to find new risk factors for the rapid decline in eGFR by properly using usual statistical methods.

Reply:

We built both a logistic regression model and a random forest model. Unlike descriptive models, which can score very high, this paper discusses predictive models. For predictive models, which use past data to anticipate an unknown future, an AUC of around 0.7 is generally regarded as good, and we consider an improvement from 0.71 (logistic regression) to 0.73 (random forest) to be rather remarkable. Hence, we believe that showing the random forest results, including prediction performance and important features, in comparison with logistic regression is of interest to researchers in relevant disciplines. We added the description in the Discussion (page 19, lines 269–272).
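For readers weighing the AUC values discussed in this exchange, the quantity itself can be written out in a few lines. The sketch below is not the study's code (the study reports using scikit-learn); it is a standalone illustration, with a hypothetical function name and toy data, of the rank-based definition of AUC: the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it, counting ties as one half.

```python
def auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = outcome, 0 = no outcome).

    Computed as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up scores): 3 of the 4 positive/negative pairs are ordered correctly.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auc(labels, scores))  # 0.75
```

Read this way, an AUC of 0.71 versus 0.73 means that the two models order roughly 71% and 73% of such patient pairs correctly; a value of 0.5 corresponds to random ordering.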

Reviewer #2: This paper addresses an important issue. It reinforces the findings of others that proteinuria is the main determinant of the speed with which kidney function declines. It is not a random sample of patients; they are hospitalized patients, but it is unclear whether the data are inpatient, outpatient, or both. That must be clarified. Inpatient data can be confounded by acute illnesses and their effect on kidney function.

Reply:

Thank you for your valuable comments. Proteinuria is one of the strongest risk factors for declining GFR in CKD patients. In addition, it often increases during acute illness, as you pointed out. In the present study, we used data from both outpatients and hospitalized patients. Therefore, we decided to use the mean values of data such as eGFR, hemoglobin, and urine protein over the 90, 180, and 360 days before the reference points in order to exclude transient changes in GFR caused by acute illness. In addition, we added history of acute kidney injury as a feature and re-analyzed the data (page 7, line 104 – page 8, line 105; page 10, lines 152–159; page 20, line 290 – page 21, line 294).
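The windowed statistics mentioned in this reply can be illustrated with a short sketch. It assumes each measurement is stored as a pair (days before the reference point, value). The exponentially smoothed average follows the abstract's description, read here as a weight of 0.9 raised to the power t/b with b = 7; that reading of the notation, the normalization by the sum of weights, and all function names are assumptions rather than the authors' implementation.

```python
def window_mean(measurements, window_days):
    """Plain mean of values measured within `window_days` before the reference point.

    measurements: list of (days_before_reference, value) pairs.
    Returns None when the window contains no measurement.
    """
    values = [v for t, v in measurements if 0 <= t <= window_days]
    return sum(values) / len(values) if values else None


def exp_smoothed_average(measurements, window_days, decay_days=7.0, base=0.9):
    """Weighted mean in which a value measured t days before the reference
    point gets weight base ** (t / decay_days), so recent values count more.

    Assumption: weights are normalized by their sum.
    """
    num = 0.0
    den = 0.0
    for t, v in measurements:
        if 0 <= t <= window_days:
            w = base ** (t / decay_days)
            num += w * v
            den += w
    return num / den if den else None


# Hypothetical urine protein scores: (days before reference point, value).
urine_protein = [(0, 2.0), (7, 1.0), (100, 0.0)]
for window in (90, 180, 360):
    print(window,
          window_mean(urine_protein, window),
          exp_smoothed_average(urine_protein, window))
```

In this toy series the most recent value is the highest, so the exponentially smoothed average exceeds the plain mean for every window, which is the sense in which such a statistic can reflect an increasing tendency.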

The CKD prevalence of 16.8% is considerably higher than the average prevalence in the general population, 10%. The 28% that had a greater than 30% decline in two years seems awfully high, and again this is not a representative population, since these patients were either hospitalized or come from a hospital database.

Reply:

As you pointed out, the patients in our study were not representative of the general population because the data came from a hospital-wide database. However, we thought that there were few reports based on such a database and that this was therefore one of the valuable features of the study. We revised the description in the Discussion (page 19, lines 268–269).

Regarding dipstick protein as the source of proteinuria: interpretation of this needs to include urine specific gravity, since the concentration of the urine can greatly affect the urine protein test.

Reply:

We totally agree with you. Unfortunately, urine specific gravity data were not available. Hence, we described this in the limitations (page 21, line 310 – page 22, line 313).

Lines 251 to 253 say the results apply to real-world clinical settings. I think that needs to be adjusted to say that these findings would apply to populations similar to the one studied.

Reply:

Thank you for your suggestion. We rewrote the description (page 19, lines 268–269).

Lines 268 to 274 are very confusing and need to be rewritten: “The strong correlation between the ESA of proteinuria and GFR decline established in the present study was proved under the assumption that the increase in urinary protein excretion reflected the exacerbation of glomerular hypertension and sclerosis. Other than proteinuria, the RF-based model revealed that the ESA of serum creatinine level was also ranked in the top-10 in terms of feature importance. Therefore, it was crucial to carefully observe the temporal trends of urinary protein excretion and kidney function to predict GFR decline as early as possible. We consider the aforementioned conclusions of the study to be novel and informative.”

I do not understand this!!

Other factors that need to be discussed when evaluating the conclusions are the fact that episodes of AKI can affect the rate of change of kidney function over time and the fact that stable renal function in a population of this age is very common. Reference: Erikson, KI 2006, a population of similar age in which 27% had no decline in kidney function over a 10-year interval. Another issue that needs to be addressed in these types of studies is the competing risk of death versus decline of kidney function and how that was adjusted for in this analysis.

Reply:

Thank you for your valuable comments. As you pointed out, AKI is relevant to the GFR trajectory, and we therefore added history of AKI as a variable. We identified AKI according to ICD-10, including acute renal failure, acute tubular injury, acute tubular necrosis, and so on. In both the RD and non-RD groups, only around 4.5% of CKD patients had a history of AKI. The results of the re-analysis after adding history of AKI did not change much (Tables 2 and 3). We revised the description in the Discussion (page 20, line 290 – page 21, line 294) and added references (#28 and #29). Unfortunately, we could not find the article you recommended (we searched using the keywords “Erikson” and “KI” but could not find it).

Decision Letter 1

Tatsuo Shimosawa

28 Aug 2020

PONE-D-20-15213R1

Increasing tendency of urine protein is a risk factor for rapid GFR decline in patients with CKD: A machine learning-based prediction model by using a big database.

PLOS ONE

Dear Dr. Inaguma,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Oct 12 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

We look forward to receiving your revised manuscript.

Kind regards,

Tatsuo Shimosawa, M.D., Ph.D.

Academic Editor

PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: No

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: I Don't Know

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: No

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: No

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: Kidney International (2006) 69, 375–382. doi:10.1038/sj.ki.5000058 ericson reference

It needs further revisions.

Change the title to "Urine protein is a risk factor for rapid GFR decline in patients with CKD: A machine learning-based prediction model by using a big database."

Remove from line 57: "with respect to kidney dysfunction during stages 1, 2, and 3a of CKD."

All mentions of GFR should be eGFR, which is what you are using, not actual GFR.

"Meanwhile, use of renin angiotensin system inhibitors was low in the RD group": comment on this in the Discussion.

Leave out lines 283-288: "Further, glomerular hyperfiltration is known to lead to proteinuria and glomerular sclerosis, and subsequently result in the decline of GFR. Therefore, we attempted to reduce intra-glomerular blood pressure by prescribing medications, including renin angiotensin blockers and sodium glucose transporter-1. The strong correlation between the ESA of proteinuria and GFR decline established in the present study was proved under the assumption that the increase in urinary protein excretion reflected the exacerbation of glomerular hypertension and sclerosis." This is not correct; it is not what you did.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Sep 17;15(9):e0239262. doi: 10.1371/journal.pone.0239262.r004

Author response to Decision Letter 1


30 Aug 2020

Reply:

Thanks for your comments.

1. As you pointed out, we removed the description from line 57 ("with respect to kidney dysfunction during stages 1, 2, and 3a of CKD") and the passage from lines 283 to 288 ("Further, glomerular hyperfiltration is known to lead to proteinuria and glomerular sclerosis, and subsequently result in the decline of GFR. Therefore, we attempted to reduce intra-glomerular blood pressure by prescribing medications, including renin angiotensin blockers and sodium glucose transporter-1. The strong correlation between the ESA of proteinuria and GFR decline established in the present study was proved under the assumption that the increase in urinary protein excretion reflected the exacerbation of glomerular hypertension and sclerosis.").

2. We changed all “GFR” to “eGFR”.

3. We added the description and Reference #28.

Attachment

Submitted filename: Reply to reviewers comment.docx

Decision Letter 2

Tatsuo Shimosawa

3 Sep 2020

Increasing tendency of urine protein is a risk factor for rapid eGFR decline in patients with CKD: A machine learning-based prediction model by using a big database.

PONE-D-20-15213R2

Dear Dr. Inaguma,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Tatsuo Shimosawa, M.D., Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Tatsuo Shimosawa

8 Sep 2020

PONE-D-20-15213R2

Increasing tendency of urine protein is a risk factor for rapid eGFR decline in patients with CKD: A machine learning-based prediction model by using a big database.

Dear Dr. Inaguma:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Prof. Tatsuo Shimosawa

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Table. Average blood pressure and laboratory data for 90, 180, and 360 days prior to the baseline.

    (DOCX)

    S2 Table. Exponentially smoothed average of blood pressure and laboratory data for 7 and 30 days.

    (DOCX)

    Attachment

    Submitted filename: Reply to reviewers comment.docx

    Data Availability Statement

    10.6084/m9.figshare.12780311.


    Articles from PLoS ONE are provided here courtesy of PLOS
