Abstract
Objective
Screening for chronic kidney disease (CKD) requires an estimated glomerular filtration rate (eGFR, mL/min/1.73 m2) from a blood sample and a proteinuria level from a urinalysis. We developed machine-learning models to detect CKD without blood collection, predicting an eGFR less than 60 (eGFR60 model) or 45 (eGFR45 model) using a urine dipstick test.
Materials and Methods
The electronic health record data (n = 220 018) obtained from university hospitals were used for XGBoost-derived model construction. The model variables were age, sex, and 10 measurements from the urine dipstick test. The models were validated using health checkup center data (n = 74 380) and nationwide public data (KNHANES data, n = 62 945) for the general population in Korea.
Results
The models comprised 7 features, including age, sex, and 5 urine dipstick measurements (protein, blood, glucose, pH, and specific gravity). The internal and external areas under the curve (AUCs) of the eGFR60 model were 0.90 or higher, and a higher AUC for the eGFR45 model was obtained. For the eGFR60 model on KNHANES data, the sensitivity was 0.93 or 0.80, and the specificity was 0.86 or 0.85 in ages less than 65 with proteinuria (nondiabetes or diabetes, respectively). Nonproteinuric CKD could be detected in nondiabetic patients under the age of 65 with a sensitivity of 0.88 and specificity of 0.71.
Discussion and Conclusions
The model performance differed across subgroups by age, proteinuria, and diabetes. The CKD progression risk can be assessed with the eGFR models using the levels of eGFR decrease and proteinuria. The machine-learning-enhanced urine-dipstick test can become a point-of-care test to promote public health by screening for CKD and ranking its risk of progression.
Keywords: urinalysis, machine-learning model, chronic kidney disease, estimated glomerular filtration rate, XGBoost
INTRODUCTION
Chronic kidney disease (CKD) is characterized by the progressive and irreversible loss of kidney function.1 CKD is diagnosed when a marker of kidney damage, such as albuminuria or decreased glomerular filtration rate (GFR, mL/min/1.73 m2), is present for more than 3 months.2 About 1 in 10 people have CKD in middle and high-income countries, which is most commonly caused by diabetes and hypertension.3 CKD is more common in people aged ≥ 65 years in the United States.4 In general, CKD is clinically silent and asymptomatic until its advanced stages, and as many as 9 in 10 adults with CKD do not know they have CKD in the United States.1,4 Failure to recognize CKD results in reduced lifespan and worse renal replacement therapy outcomes, which makes CKD the most expensive of chronic diseases.1 Therefore, early identification of CKD patients and referral to specialist kidney care services will bring economic and clinical benefits.1
GFR and albuminuria are essential components of CKD screening and risk stratification.2 Because serum creatinine levels and urine dipstick tests are required to assess GFR and albuminuria, respectively, current CKD guidelines mandate obtaining both blood and urine samples. GFR is the best overall index of kidney function, and decreased GFR is generally defined as less than 60. GFR can be estimated from serum creatinine levels and demographic variables (age, sex, and race) using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation (eGFR, mL/min/1.73 m2).5 Risks of death, cardiovascular diseases, progression to end-stage kidney disease, and hospitalization begin to rise when GFR < 60 and rise sharply when GFR < 45.6,7 However, serum creatinine measurement requires invasive blood collection and adequate laboratory equipment.2
Albuminuria is a hallmark of kidney damage and is associated with significantly increased risks of cardiovascular diseases, rapid decline in kidney function, and all-cause mortality.8–10 The urine dipstick test has been used for over 50 years to detect albuminuria.3 The dipstick evaluates the chemical characteristics of urine using multiple pads impregnated with chemical reagents, each detecting a specific urine feature.11 The urine dipstick test for protein is most sensitive to albumin and can be used for semiquantitative assessment of albuminuria.2 The urine dipstick has been used to support CKD screening programs because it is simple, cost-saving, and straightforward.2,12 However, the urine dipstick test alone is insufficient to screen for CKD because a proportion of CKD is nonproteinuric, that is, low eGFR without proteinuria.12,13 In addition, CKD severity and progression risk cannot be assessed using the urine dipstick test without calculating the GFR.14
Therefore, GFR estimation using the urine dipstick test can be helpful because it would reveal both kidney function and kidney damage at once while avoiding invasive blood tests. A previous study evaluated the validity of 1+ or higher proteinuria on the urine dipstick test for identifying reduced GFR in male Japanese workers aged 40 years and older. The areas under the curve (AUCs) were 0.52 and 0.70 for identifying GFR < 60 and GFR < 50, respectively, suggesting that the urine dipstick test alone is not sufficient to reliably identify reduced GFR.13 Another study reported that hematuria or 1+ proteinuria was sensitive but nonspecific in detecting elevated serum creatinine.15 We speculated that eGFR estimation might be possible if we exploit all the urine chemical information from urine dipstick test results using a machine-learning technique, that is, XGBoost, which is an advanced gradient-boosting decision-tree algorithm with high speed and performance.16
We attempted to predict an eGFR level below 60 using urine dipstick measurements (eGFR60 model). We also attempted to predict an eGFR level below 45 (eGFR45 model) because of its different clinical outcomes and risk profiles.2 These cutoffs correspond to the GFR categories G3a (mildly to moderately decreased, 45–59) and G3b (moderately to severely decreased, 30–44).2 We designed this study to apply a simple urine dipstick test in general populations, hoping for the early detection of CKD and reduction of CKD progression and sequelae, including end-stage kidney disease (ESKD or G5, GFR < 15) and cardiovascular complications. For this purpose, we performed external validation of the models using the Korea National Health and Nutrition Examination Survey (KNHANES) data representing the Korean noninstitutionalized civilian population.17
METHODS
Development and validation data
The overview of the study design is illustrated in Figure 1. For model development, we screened 286 649 cases spanning 21 years (2000–2020) from a university hospital (CHA data from Bundang CHA Hospital, Seongnam, Republic of Korea) and 9156 cases spanning 11 years (2010–2020) from a diabetes center (SHDC data from Severance Hospital Diabetes Center, Seoul, Republic of Korea). For external validation, we used the Severance Checkup Health Promotion Center (SCHPC, Seoul, Republic of Korea) data spanning 8 years (2013–2020) and KNHANES data spanning 12 years (2007–2018) (Figure 1). KNHANES is a nationwide survey conducted annually by the Korea Disease Control and Prevention Agency for all citizens in Korea.17 KNHANES follows a multistage clustered probability design and provides post-stratification sample weights to accurately represent the Korean population.18
Figure 1.
Study flowchart for predicting a decrease in estimated glomerular filtration rate (eGFR) from urine dipstick test. Bold numbers in the boxes are case numbers of data. XGBoost: eXtreme Gradient Boosting; TPOT: Tree-based Pipeline Optimization Tool; ML: machine learning; GPU: Graphics processing unit; eGFR60 model: model to predict eGFR < 60 mL/min/1.73 m2; eGFR45 model: model to predict eGFR < 45 mL/min/1.73 m2; AUC: area under the curve; SHAP: SHapley Additive exPlanations.
We obtained complete information on age, sex, serum creatinine level, and 10 elements of the urine dipstick test from each subject. The urine dipstick and blood tests were performed on the same day. When a subject had multiple records, we kept a single record per subject by choosing the one with the lowest eGFR value. For CHA, SHDC, and SCHPC data, subjects under 18 or over 90 years of age were excluded as outliers according to the age distribution of the development data. For KNHANES data, we excluded subjects under 18 or over 79 years old because the exact age was unavailable for those over 79 years.
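The record-selection and age-exclusion rules above can be illustrated with a minimal sketch. The record layout (`subject_id`, `egfr`, `features`) and both function names are hypothetical and are not taken from the study code.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (subject_id, egfr, features).
Record = Tuple[str, float, dict]


def select_lowest_egfr(records: Iterable[Record]) -> Dict[str, Record]:
    """Keep, for every subject, the single record with the lowest eGFR."""
    best: Dict[str, Record] = {}
    for rec in records:
        subject_id, egfr, _ = rec
        if subject_id not in best or egfr < best[subject_id][1]:
            best[subject_id] = rec
    return best


def within_age_range(age: int, low: int = 18, high: int = 90) -> bool:
    """Inclusive age filter: 18-90 for hospital data, 18-79 for KNHANES."""
    return low <= age <= high
```

For KNHANES, the same filter would be called with `high=79`.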
The Institutional Review Board of CHA University Bundang CHA Medical Center (No. 2019-01-032) and Severance Hospital (No. 4-2020-0231) approved this study. KNHANES IV (2007–2009), KNHANES V (2010–2012), KNHANES VI (2013–2015), and KNHANES VII (2016–2018) were approved by the KCDC research ethics committee (2007-02CON-04-P, 2008-04EXP-01-C, 2009-01CON-03-2C, 2010-02CON-21-C, 2011-02CON-06-C, 2012-01EXP-01-2C, 2013-07CON-03-4C, 2013-12EXP-03-5C, and 2018-01-03-P-A). Written informed consent was obtained from all the individuals included in the KNHANES cohort. Because CHA, SHDC, and SCHPC data were obtained as deidentified forms, informed consent was not required.
Model variables and outcomes
We attempted to build 2 models that could predict eGFR less than 60 and 45. eGFR was calculated by the CKD-EPI equation (2009) [$\mathrm{eGFR} = 141 \times \min(\mathrm{sCr}/\kappa, 1)^{\alpha} \times \max(\mathrm{sCr}/\kappa, 1)^{-1.209} \times 0.993^{\mathrm{age}} \times 1.018$ (if female) $\times\, 1.159$ (if black), where $\kappa$ is 0.7 for females and 0.9 for males, $\alpha$ is −0.329 for females and −0.411 for males, and sCr is serum creatinine (mg/dL)].5 We selected 12 model predictors: age, sex, and 10 urine dipstick measurements (protein, specific gravity, pH, glucose, blood, leukocyte, bilirubin, urobilinogen, ketone, nitrite). Urine pH was measured in 0.5 units from 5 to 9, and urine specific gravity was measured in 0.005 units from 1.005 to 1.030. Urine blood, protein, glucose, ketone, and urobilinogen were measured at 6 levels (0 [negative], 1 [trace], 2 [1+], 3 [2+], 4 [3+], and 5 [4+]), while urine leukocyte and bilirubin were measured at 5 levels (0 [negative], 1 [trace], 2 [1+], 3 [2+], and 4 [3+]). We combined the negative and trace of the urine bilirubin and urobilinogen and assigned them to level 0. Urine nitrite was measured as 0 (negative) or 1 (positive). Multiple devices were used for urine dipstick tests at Bundang CHA Hospital (URiSCAN® S-300 or URiSCAN® PRO, YD Diagnostics, Yongin, Republic of Korea; Clinitek 500, Siemens Healthineers, Erlangen, Germany; AX-4030, ARKRAY Inc., Kyoto, Japan; UC-3500 or UC-1000, Sysmex, Kobe, Japan), Severance Hospital (Atellica® 1500, CLINITEK Advantus®, or CLINITEK Novus®, Siemens Healthineers, Erlangen, Germany), and Severance Checkup Health Promotion Center (URiSCAN® Pro II or URiSCAN® PRO, YD Diagnostics, Yongin, Republic of Korea), while KNHANES conducted urine dipstick tests using a single device (Urisys 2400, Roche Diagnostics GmbH, Mannheim, Germany).
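As an illustration of how the outcome labels can be derived, the 2009 CKD-EPI equation quoted above translates directly into a short function. This is a sketch, not the study code. The names `ckd_epi_2009` and `egfr_labels` are hypothetical; the coefficients are those stated in the text.

```python
def ckd_epi_2009(scr_mg_dl: float, age: float, female: bool,
                 black: bool = False) -> float:
    """eGFR (mL/min/1.73 m2) from the 2009 CKD-EPI creatinine equation."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


def egfr_labels(egfr: float) -> dict:
    """Binary targets for the two models (eGFR < 60 and eGFR < 45)."""
    return {"egfr60": egfr < 60, "egfr45": egfr < 45}
```

For example, a 70-year-old man with a serum creatinine of 1.5 mg/dL has an eGFR of about 46.5, which is positive for the eGFR60 target and negative for the eGFR45 target.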
Hyperparameter optimization and model training
The development data were split 9:1 into a training and an internal validation set (Figure 1). The split was stratified jointly by 3 age clusters and by the model outcomes. We standardized the continuous variables of the training set by removing the mean and scaling them to unit variance. Then, we trained the extreme gradient boosting algorithm (XGBoost), which offers strong prediction performance with short training times, is regularized against overfitting, and handles missing values.19
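A minimal standard-library sketch of the stratified 9:1 split and the standardization step follows. The age cut points in `age_cluster` are an assumption made only for illustration, because the text does not state the boundaries of the 3 age clusters. All function names are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def age_cluster(age: float) -> int:
    # Assumed cut points; the source only states that 3 age clusters were used.
    if age < 40:
        return 0
    if age < 65:
        return 1
    return 2


def stratified_split(rows, label_key, valid_fraction=0.1, seed=0):
    """Split rows 9:1 while preserving the (age cluster, outcome) mix."""
    strata = defaultdict(list)
    for row in rows:
        strata[(age_cluster(row["age"]), row[label_key])].append(row)
    rng = random.Random(seed)
    train, valid = [], []
    for members in strata.values():
        rng.shuffle(members)
        n_valid = round(len(members) * valid_fraction)
        valid.extend(members[:n_valid])
        train.extend(members[n_valid:])
    return train, valid


def fit_standardizer(train, key):
    """Mean and population SD estimated on the training set only."""
    values = [row[key] for row in train]
    mu, sigma = mean(values), pstdev(values)
    return mu, (sigma if sigma > 0 else 1.0)


def standardize(value, mu, sigma):
    """Remove the mean and scale to unit variance."""
    return (value - mu) / sigma
```

The mean and standard deviation are fitted on the training set and then reused for the validation data, so that no information leaks from the validation set into preprocessing.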
However, a disadvantage of XGBoost is that many hyperparameters must be tuned before training. We used the tree-based pipeline optimization tool (TPOT, version 0.11.6), an automated machine-learning tool, to optimize 9 hyperparameters of XGBoost (Supplementary Methods). In addition, we employed GPU-accelerated TPOT to get faster and more accurate results.20,21
Feature selection for a sparse model
We retrieved the importance score of each feature, which indicates how useful the feature is in constructing the boosted decision trees within the trained model.16 Then, we sorted the model features by importance and formed a feature subset by eliminating the least important feature. If the AUC of the model built on the subset was not compromised, the next least important feature was removed. This procedure was repeated until the AUC of a feature subset fell below the AUC obtained with all features.
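The elimination loop described above can be sketched as follows. The callback `evaluate_auc` is a hypothetical stand-in for retraining and scoring a model on a feature subset. The `tolerance` argument is an assumption that replaces the statistical comparison of AUCs with a simple numeric margin.

```python
from typing import Callable, Dict, List, Sequence


def backward_eliminate(
    importance: Dict[str, float],
    evaluate_auc: Callable[[Sequence[str]], float],
    tolerance: float = 0.0,
) -> List[str]:
    """Drop the least important feature while the AUC stays at the full-model level."""
    # Most important first, so the removal candidate is always the last item.
    features = sorted(importance, key=importance.get, reverse=True)
    full_auc = evaluate_auc(features)
    while len(features) > 1:
        candidate = features[:-1]
        if evaluate_auc(candidate) < full_auc - tolerance:
            break  # removing this feature compromises the AUC, so stop here
        features = candidate
    return features
```

In this sketch the importance ranking is computed once, before the loop, which matches the single sorting step described in the text.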
Model performance test and explanation
We evaluated model performance using sensitivity, specificity, and AUC with 95% bootstrap confidence intervals (10 000 resamples) (Supplementary Methods). For external validation with KNHANES data, we computed weighted AUC, sensitivity, and specificity using the sample weights. The optimal threshold for sensitivity and specificity calculations was the one that maximized Youden's index (sensitivity + specificity − 1) or, alternatively, the one that minimized the Index of Union together with the absolute difference between sensitivity and specificity.22
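The weighted metrics and the two threshold rules can be sketched in plain Python. The sketch assumes the usual definition of the Index of Union, IU(c) = |Se(c) − AUC| + |Sp(c) − AUC|, and treats the absolute difference between sensitivity and specificity as a tie-breaker, which is one possible reading of the rule above. Function names are hypothetical, and the quadratic-time AUC is for illustration only.

```python
def weighted_sens_spec(y, p, w, thr):
    """Weighted sensitivity and specificity when p >= thr is called positive."""
    tp = sum(wi for yi, pi, wi in zip(y, p, w) if yi == 1 and pi >= thr)
    fn = sum(wi for yi, pi, wi in zip(y, p, w) if yi == 1 and pi < thr)
    tn = sum(wi for yi, pi, wi in zip(y, p, w) if yi == 0 and pi < thr)
    fp = sum(wi for yi, pi, wi in zip(y, p, w) if yi == 0 and pi >= thr)
    return tp / (tp + fn), tn / (tn + fp)


def weighted_auc(y, p, w):
    """Weighted probability that a positive case outscores a negative case."""
    pos = [(pi, wi) for yi, pi, wi in zip(y, p, w) if yi == 1]
    neg = [(pi, wi) for yi, pi, wi in zip(y, p, w) if yi == 0]
    num = sum(wp * wn * (1.0 if sp > sn else 0.5 if sp == sn else 0.0)
              for sp, wp in pos for sn, wn in neg)
    den = sum(wp for _, wp in pos) * sum(wn for _, wn in neg)
    return num / den


def youden_threshold(y, p, w):
    """Threshold maximizing sensitivity + specificity - 1."""
    return max(sorted(set(p)),
               key=lambda t: sum(weighted_sens_spec(y, p, w, t)) - 1.0)


def iu_threshold(y, p, w):
    """Threshold minimizing the Index of Union, then |Se - Sp| on ties."""
    auc = weighted_auc(y, p, w)

    def cost(t):
        se, sp = weighted_sens_spec(y, p, w, t)
        return (abs(se - auc) + abs(sp - auc), abs(se - sp))

    return min(sorted(set(p)), key=cost)
```

Passing unit weights reproduces the unweighted metrics used for the hospital and checkup-center data. Confidence intervals for complex survey data need additional design-aware steps that this sketch does not cover.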
We showed the impact of each feature on the model prediction in KNHANES data using the SHapley Additive exPlanations (SHAP) probability value23 (Supplementary Methods). We also evaluated the model performance in subjects with CKD risk factors, such as age ≥ 65 years, diabetes, and hypertension, using the KNHANES dataset. Diabetes was defined as any of the following conditions: a fasting blood sugar ≥ 126 mg/dL, diagnosis by a physician, current hypoglycemic medication, or insulin injection. Hypertension was defined as a systolic blood pressure ≥ 140 mmHg, a diastolic blood pressure ≥ 90 mmHg, or current antihypertensive medication.
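The subgroup definitions above amount to simple predicates. The sketch below encodes them with hypothetical argument names; the actual KNHANES variable names are not given in the text.

```python
from typing import Optional


def has_diabetes(fasting_glucose_mg_dl: Optional[float],
                 physician_diagnosis: bool,
                 on_hypoglycemic_medication: bool,
                 on_insulin: bool) -> bool:
    """Any one of the four criteria is sufficient."""
    high_glucose = (fasting_glucose_mg_dl is not None
                    and fasting_glucose_mg_dl >= 126)
    return (high_glucose or physician_diagnosis
            or on_hypoglycemic_medication or on_insulin)


def has_hypertension(systolic_mmhg: float, diastolic_mmhg: float,
                     on_antihypertensive: bool) -> bool:
    """Systolic >= 140 mmHg, diastolic >= 90 mmHg, or current medication."""
    return systolic_mmhg >= 140 or diastolic_mmhg >= 90 or on_antihypertensive


def is_older_adult(age: float) -> bool:
    """Age subgroup used in the analysis (65 years or older)."""
    return age >= 65
```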
Statistics, calculations, and software
R (version 3.6.3, R Foundation for Statistical Computing, Vienna, Austria) and Python (version 3.8.5, Python Software Foundation, Wilmington, DE, USA) were used as programming languages. The Supplementary Methods explain the details of computations performed using the scikit-learn tools of Python and statistics for KNHANES survey data.
RESULTS
Characteristics of development and external-validation datasets
After data preprocessing, the development dataset for model training included 220 018 unique patients, and the first and second external validation datasets had 74 380 and 62 945 subjects, respectively (Table 1). While the prevalence of eGFR < 60 was 2.1% in the KNHANES dataset, the development data from university hospitals had a higher prevalence of decreased eGFR (6.8%), and the SCHPC validation data exhibited a lower prevalence (0.9%) (Table 1).
Table 1.
Demographic characteristics of study data
| Characteristics | CHA and SHDC development data (n = 220 018) | SCHPC validation data (n = 74 380) | KNHANES validation data^a (n = 62 945) |
|---|---|---|---|
| Age (year), mean (SD) [range] | 47 (16) [18–90] | 47 (13) [18–90] | 45 (16) [18–79] |
| Age ≥ 65, n (%) | 35 525 (16.1) | 13 449 (21.4) | 4 499 557 (12.7) |
| Sex, n (%) | | | |
| Female | 125 646 (57) | 35 752 (48) | 16 915 995 (48) |
| Male | 94 372 (43) | 38 628 (52) | 18 553 804 (52) |
| eGFR (mL/min/1.73 m2) | | | |
| Mean (SD) | 93.7 (22.1) | 101.3 (15.1) | 97.4 (17.0) |
| <60, n (%) | 14 871 (6.8) | 681 (0.9) | 726 038 (2.1) |
| <45, n (%) | 6414 (2.9) | 180 (0.2) | 170 449 (0.5) |
| Urine dipstick tests | | | |
| Specific gravity, mean (SD) | 1.019 (0.008) | 1.024 (0.007) | 1.020 (0.006) |
| pH, mean (SD) | 6.2 (0.8) | 5.4 (0.7) | 5.7 (0.8) |
| Blood level^b | 0 (1) [0–5] | 0 (0) [0–4] | 0 (1) [0–5] |
| Protein level^b | 0 (0) [0–5] | 0 (0) [0–5] | 0 (0) [0–5] |
| Glucose level^b | 0 (0) [0–5] | 0 (0) [0–5] | 0 (0) [0–5] |
| Ketone level^b | 0 (0) [0–5] | 0 (0) [0–4] | 0 (0) [0–4] |
| Urobilinogen level^b | 0 (0) [0–5] | 0 (0) [0–5] | 0 (0) [0–4] |
| Bilirubin level^b | 0 (0) [0–4] | 0 (0) [0–4] | 0 (0) [0–4] |
| Leukocyte level^b | 0 (1) [0–4] | 0 (0) [0–4] | NA |
| Nitrite, n (%) | 4462 (2.0) | 331 (0.4) | 1274 (2.0) |
CHA: Bundang CHA hospital; SHDC: Severance Hospital Diabetes Center; SCHPC: Severance Checkup Health Promotion Center; KNHANES: Korean National Health and Nutrition Examination Survey; SD: standard deviation; eGFR: estimated glomerular filtration rate; NA: not available.
^a Estimated values by complex-survey-design weights.
^b Median (interquartile range) [range].
The development data showed 17.1% isolated proteinuria (proteinuria with eGFR ≥60), 3.7% isolated eGFR decline (eGFR < 60 without proteinuria), and 3.1% both proteinuria and eGFR decline, which differed from the KNHANES data of 9.5% isolated proteinuria, 1.6% isolated eGFR decline, and 0.5% both proteinuria and eGFR decline (P < 0.001, chi-square test).
Performance of XGBoost models for eGFR-decline detection
XGBoost models were trained using 12 features of the development data after hyperparameter optimization by TPOT. Then, 6 features (urine protein, blood, glucose, pH, specific gravity, and age) were selected by feature importance for the eGFR60 and eGFR45 models, and sex (male) was added to avoid sex-dependent differences in thresholds.
On internal validation, the eGFR60 and eGFR45 models achieved AUCs of 0.91 and 0.94, respectively, and the AUC of the eGFR45 model was significantly higher than that of the eGFR60 model (Figure 2). The AUCs of the models with 7 selected features were not significantly different from those of the models with 12 features (P > 0.05, bootstrap t-test). Comparable AUCs were obtained from external validation 1 (SCHPC) and 2 (KNHANES) (Figure 2).
Figure 2.
Receiver operating characteristic (ROC) curves for the detection of estimated glomerular filtration rate (eGFR) < 60 mL/min/1.73 m2 (A, eGFR60 model) and eGFR < 45 mL/min/1.73 m2 (B, eGFR45 model). The data used for internal validation, external validation 1, and external validation 2 were obtained from university hospitals (CHA and SHDC), a health checkup center (SCHPC), and the general population (KNHANES), respectively. For external validation 2, weighted ROC was drawn to represent the entire population. The threshold, sensitivity, and specificity are calculated for KNHANES data using the Youden index.
Model explanation and effects of model predictors
Age was the most influential factor in the eGFR60 and eGFR45 models (Figure 3). The presence of proteinuria was also strongly predictive of low eGFR. Urine blood and glucose were positively related to low eGFR, while pH and specific gravity were negatively associated with eGFR decline (Figure 3).
Figure 3.
Summary plots of SHAP value (probability of a decline in estimated glomerular filtration rate, eGFR) for eGFR < 60 mL/min/1.73 m2 (A, eGFR60 model) and eGFR < 45 mL/min/1.73 m2 (B, eGFR45 model) prediction. KNHANES data were used as reference and foreground datasets. The distribution of each feature's impacts on the model output is plotted using SHAP values. Model features are sorted along the y-axis of the summary plot of SHAP value by the sum of SHAP value magnitudes over all samples.
Model performance in individuals with risk factors of kidney disease
We tested model performance in predefined subgroups of KNHANES data according to risk factors for CKD: older age (65 years or older), diabetes, and hypertension. In both the eGFR60 and eGFR45 models, the AUCs were significantly lower at ages of 65 or higher and failed to reach 0.8 (Figure 4A and B). For ages under 65 years, the AUC was not significantly changed by hypertension (Figure 4C and D) but dropped significantly with diabetes (Figure 4E and F). For individuals aged 65 or older, the AUCs of the eGFR60 and eGFR45 models were 0.64 and 0.71 in diabetes and 0.68 and 0.77 in hypertension, respectively.
Figure 4.
Changes in weighted receiver operating characteristic curves for detecting a decline in estimated glomerular filtration rate (eGFR, mL/min/1.73 m2) according to models, ages (years), hypertension, and diabetes. The area under the curve (AUC) is presented with the 95% confidence interval in parentheses. eGFR60 model: eGFR < 60 detection model; eGFR45 model: eGFR < 45 detection model.
Then, the model performance was calculated separately for proteinuria and nonproteinuria in subjects under 65 years of age. In proteinuria, CKD is already diagnosed by urine protein, and the degree of eGFR decline indicated by the model ranks the CKD progression risk. In proteinuria, the eGFR60 and eGFR45 models showed excellent accuracy, and the sensitivity and specificity statistically reached 90% except for subjects with diabetes (Table 2 and Figure 5). With diabetes, the sensitivity of the eGFR60 model was 80% (Table 2).
Table 2.
Model performance for ages under 65 years according to urine protein and diabetes
| Models | Nondiabetes (95% CI) | Diabetes (95% CI) | All (95% CI) |
|---|---|---|---|
| Proteinuria | | | |
| eGFR60 model | | | |
| Sensitivity | 0.93 (0.88–0.97) | 0.80 (0.70–0.89) | 0.93 (0.88–0.96) |
| Specificity | 0.86 (0.81–0.90) | 0.85 (0.80–0.90) | 0.84 (0.80–0.87) |
| Threshold | 0.47 (0.39–0.53) | 0.73 (0.70–0.80) | 0.48 (0.43–0.54) |
| eGFR45 model | | | |
| Sensitivity | 0.96 (0.88–1.00) | 0.92 (0.83–0.97) | 0.97 (0.92–1.00) |
| Specificity | 0.93 (0.86–0.96) | 0.86 (0.81–0.95) | 0.91 (0.89–0.95) |
| Threshold | 0.61 (0.39–0.72) | 0.80 (0.76–0.91) | 0.62 (0.58–0.75) |
| Nonproteinuria | | | |
| eGFR60 model | | | |
| Sensitivity | 0.88 (0.81–0.97) | 0.69 (0.59–0.77) | 0.88 (0.82–0.94) |
| Specificity | 0.71 (0.62–0.76) | 0.70 (0.65–0.75) | 0.71 (0.63–0.75) |
| Threshold | 0.23 (0.18–0.27)^a | 0.42 (0.41–0.45) | 0.24 (0.19–0.28)^a |
| eGFR45 model | | | |
| Sensitivity | 0.95 (0.79–1.00) | 0.60 (0.39–0.77) | 0.97 (0.86–1.00) |
| Specificity | 0.78 (0.72–0.91) | 0.64 (0.53–0.80) | 0.71 (0.68–0.82) |
| Threshold | 0.18 (0.15–0.31)^a | 0.25 (0.20–0.37) | 0.14 (0.13–0.21)^a |
eGFR: estimated glomerular filtration rate (mL/min/1.73 m2); eGFR60 model: eGFR < 60 detection model; eGFR45 model: eGFR < 45 detection model; CI: confidence interval.
^a The threshold is defined by the Youden index. Otherwise, the Index of Union was applied.
Figure 5.
Changes in weighted receiver operating characteristic curves for detecting a decline in estimated glomerular filtration rate (eGFR, mL/min/1.73 m2) in ages under 65 years according to the model, urine protein, and diabetes. The area under the curve (AUC) is presented with the 95% confidence interval in parentheses. eGFR60 model: eGFR < 60 detection model; eGFR45 model: eGFR < 45 detection model.
In nonproteinuria, the models detected CKD with decreased eGFR that a urine test alone could not previously catch. The sensitivity statistically reached 90% for the eGFR60 and eGFR45 models in subjects without diabetes (Table 2 and Figure 5). However, the accuracy of the models was low in diabetes, where the sensitivity and specificity were statistically less than 80% (Table 2 and Figure 5).
For comparison, Supplementary Table S1 is provided for those aged 65 years or older, which corresponds to Table 2.
DISCUSSION
A mild or moderate decrease in eGFR could be detected with an AUC of 0.9 or higher by XGBoost models using age, sex, and 5 components of urinalysis. Age had the highest average impact among model features, followed by urine protein. The performance of the models was significantly lower in people aged 65 years or older and in those with diabetes, according to a subgroup study. Our models performed well in people under the age of 65 years, both for nonproteinuria CKD without diabetes and for proteinuria CKD. Therefore, the urine dipstick test with machine-learning enhancements could be used on individuals under 65 years of age to screen for CKD without diabetes or to track kidney function changes in those with proteinuria.
Clinical importance for detecting both kidney dysfunction and proteinuria
KNHANES data showed that 86% (10.0%/11.6%) and 18% (2.1%/11.6%) of kidney disease could be screened by proteinuria and eGFR, respectively. Nonproteinuria (isolated eGFR decline) CKD and isolated-proteinuria CKD are both prominent; thus, it is necessary to screen for CKD via assessment of both proteinuria and eGFR. Moreover, isolated proteinuria corresponds to moderate to high risk of CKD, but proteinuria with decreased eGFR is designated as high to very high risk.2 The level of CKD risk can be crucial because it influences life expectancy, quality of life, prognosis, and progression of CKD.2,14 Of note, neither the eGFR nor the proteinuria level alone can fully capture the CKD prognosis.2 Therefore, both eGFR and a urine test have been indispensable to correctly assess CKD risks without screening failure. However, our models can provide simultaneous eGFR and proteinuria levels using a simple urine dipstick test, eliminating the need for invasive blood sampling. This technology has the potential to enhance the efficiency, convenience, and cost-effectiveness of CKD screening and assessment compared with current practices.
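The 86% and 18% figures follow from the KNHANES proportions reported in the Results (9.5% isolated proteinuria, 1.6% isolated eGFR decline, and 0.5% both). The short calculation below reproduces them; the variable names are illustrative only.

```python
# KNHANES proportions (%) as reported in the Results
isolated_proteinuria = 9.5
isolated_egfr_decline = 1.6
both = 0.5

any_kidney_marker = isolated_proteinuria + isolated_egfr_decline + both  # 11.6
proteinuria_total = isolated_proteinuria + both                          # 10.0
egfr_decline_total = isolated_egfr_decline + both                        # 2.1

share_by_proteinuria = proteinuria_total / any_kidney_marker
share_by_egfr = egfr_decline_total / any_kidney_marker
# Cases that a protein-only dipstick reading would miss
share_missed_by_protein_alone = isolated_egfr_decline / any_kidney_marker

print(f"{share_by_proteinuria:.0%} detectable by proteinuria")
print(f"{share_by_egfr:.0%} detectable by eGFR")
print(f"{share_missed_by_protein_alone:.0%} missed by proteinuria alone")
```

The last ratio, about 14%, is the share of cases with isolated eGFR decline, which is the group that a protein-only dipstick reading cannot identify.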
The advantage of using the eGFR60 and eGFR45 models for CKD screening
If we use both the eGFR60 and eGFR45 models, the eGFR60 model can detect an early stage of CKD while the eGFR45 model can flag more advanced CKD. The performance was higher in the eGFR45 model than in the eGFR60 model, which helps avoid missing patients with clinically more significant disease.2 In addition, CKD progresses more rapidly to ESKD when the initial GFR is lower, and an eGFR of 45 is the turning point at which the primary cause of ESKD, i.e., IgA nephropathy, has a poorer prognosis.14,24
Strategies for high-performance models and advantages by feature selection
Because the quantity of data is one of the critical issues in machine learning, we gathered a large dataset of nearly 300 000 instances spanning 21 years from university hospitals and trained the XGBoost algorithm.25 We used GPU-accelerated TPOT to optimize multiple hyperparameters of XGBoost; the GPU gives the genetic algorithm of TPOT a higher chance of reaching more evolved and better hyperparameter combinations within a limited calculation time.20 We addressed the class-imbalance problem by setting the “scale_pos_weight” parameter of XGBoost to (number of majority-class samples/number of minority-class samples) for class-weighted training, in which a model is configured to be sensitive to misclassification errors of the minority class.19
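To give a sense of scale for the class weighting described above, the ratio can be computed from the counts in Table 1. This is an illustration under stated assumptions: it uses the full development data rather than the actual 90% training split, the helper name is hypothetical, and the `binary:logistic` objective is an assumption rather than a setting reported in the text.

```python
def scale_pos_weight(n_total: int, n_positive: int) -> float:
    """Ratio of majority-class (negative) to minority-class (positive) samples."""
    n_negative = n_total - n_positive
    return n_negative / n_positive


# Counts from Table 1 (full development data, n = 220 018).
w60 = scale_pos_weight(220_018, 14_871)  # eGFR < 60
w45 = scale_pos_weight(220_018, 6_414)   # eGFR < 45

# Illustrative XGBoost parameter dictionaries; only scale_pos_weight is
# taken from the text, the objective is an assumed setting.
params_egfr60 = {"objective": "binary:logistic", "scale_pos_weight": w60}
params_egfr45 = {"objective": "binary:logistic", "scale_pos_weight": w45}

print(f"eGFR60: {w60:.1f}, eGFR45: {w45:.1f}")
```

The rarer eGFR < 45 outcome receives a weight of roughly 33, compared with roughly 14 for eGFR < 60, so errors on positive cases are penalized more heavily in the eGFR45 model.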
In addition, we pursued sparse models using 5 out of 10 urine dipstick results without performance loss. A urine dipstick with 5 pads reduces the manufacturing cost and shortens the test time by omitting the 5-min leukocyte pad for color change.26 Consequently, our model would take only 1 min for a urine dipstick test.
Model explainability on eGFR-decline detection
The model showed relevant associations between low eGFR and old age, proteinuria, hematuria, and glucosuria (Figure 3). The age-associated GFR decline is consistently observed in epidemiologic studies, and many studies show the relationship between aging and pathologic abnormalities of kidneys.2 Proteinuria is a hallmark of numerous kidney diseases, and its presence is strongly associated with kidney function decline.27,28 Blood in the urine (hematuria) is commonly associated with proteinuria, especially in glomerular and renal hematuria.26 However, several studies revealed that isolated microscopic hematuria also significantly increased the risks for CKD development.29 Urine glucose (glucosuria) can be related to kidney disease because of diabetes mellitus and Fanconi's syndrome.26
In addition, the model used urine pH and specific gravity as eGFR-preserving factors (Figure 3). Urine pH is generally slightly acidic (5.5–6.5), but urine pH 5.0–5.5 is suggested as an independent predictor of CKD.26,30 Urine specific gravity (USG) reflects the kidney's concentrating ability, and low USG is frequently associated with kidney dysfunction.26
Limitations in model applications
Model performance on the whole dataset does not guarantee model performance in a subgroup. The model performance was insufficient to find kidney dysfunction in those aged 65 or older. The calculated eGFR and the actual GFR may differ in old age, but the most accurate GFR was estimated using the CKD-EPI formula developed to compensate for the shortcomings of the MDRD formula.31 There is debate on the discrimination between age-related GFR loss without proteinuria and CKD in the elderly.32 Because the age-associated GFR decline is observed in longitudinal studies, GFR below 60 may not mean CKD in older people.2 Accordingly, the AUC of the eGFR45 model rose but remained at a moderate level of performance due to unknown reasons (Figure 4). Still, individuals older than 65 should be cautious about incident GFR decline because CKD prevalence increases in the elderly, and those aged 65 or older have the highest incidence of ESKD.33
In addition, the presence of diabetes significantly reduced the predictive power of our models (Figure 4). Over 2 decades in the United States, the clinical pattern of diabetic kidney disease has shifted toward decreased proteinuria among patients younger than 65 years and increased GFR impairment, which can be due to changes in diabetes medication or treatment.34 GFR loss without proteinuria might make it difficult for the models to predict impaired eGFR from urine, and the models could not be applied to subjects with nonproteinuria in diabetes (Figure 5 and Table 2). Therefore, we suggest that patients with diabetes undergo regular blood tests and urinalysis for adequate diagnosis of CKD.
While the inclusion of inpatients in the development data may have introduced inconsistencies in urine indices and serum creatinine levels due to acute hospitalizations,35,36 the developed models were validated using 2 outpatient-only external datasets: SCHPC from a health checkup center and KNHANES representing a noninstitutionalized population (Figure 2). These results suggest that the developed models could serve as a potential screening tool for detecting CKD in the general population.
Finally, since the data used for model training and evaluation were for Koreans, we cannot guarantee the model's performance for races other than Asians.
Informatics implications of the model
Our models can be implemented as software in digital urinalysis apps for smartphones or portable urine dipstick analyzers. These devices can be used for point-of-care testing (POCT), that is, medical testing at or near the point of patient care. POCT is increasingly being utilized for disease screening because of its convenience, accessibility, and rapid data availability.37 Our model suits point-of-care CKD screening because urine dipstick test results are generally provided within a minute and can be easily interpreted with minimal interobserver variability. In addition, the devices with our algorithms allow the simultaneous detection of albuminuria and low GFR without a blood test. Therefore, our work may be beneficial in low- and middle-income countries or rural areas where CKD prevalence is high and serum creatinine measurements are not readily available.38–40 Early CKD identification and intervention are critical medical issues in these areas because resources for managing advanced CKD are often limited.40 We expect that our machine-learning-enhanced urine-dipstick-analyzing devices will allow CKD screening to be deployed effectively and inexpensively in resource-limited areas through local health service centers.
Furthermore, recent advances in digital urinalysis may enable home-based digital health technologies.41 Automated smartphone-based colorimetric analysis of urine dipstick tests is currently accurate and reproducible.42,43 Combining our CKD prediction models with digital urinalysis will further enhance the usability of the urine test by providing additional information on the probability of CKD. Furthermore, these technologies can be used for remote patient monitoring of acute kidney injury (AKI), which is characterized by an abrupt decline in eGFR. AKI is particularly prevalent in patients with CKD but is often undiagnosed largely due to its asymptomatic nature. Additionally, undiagnosed AKI is frequently associated with poor prognosis.44,45 Acute worsening of eGFR or proteinuria can be detected at an early stage by home-based digital urinalysis and promptly and appropriately managed (Figure 6).
Figure 6.
Proposed use of machine-learning-enhanced urinalysis for kidney disease detection. CKD1, CKD2, and CKD3 are moderate, high, and very high risks of chronic kidney disease (CKD), respectively, according to Kidney Disease: Improving Global Outcomes guidelines.2 eGFR60 model: estimated glomerular filtration rate (eGFR) < 60 mL/min/1.73 m2 detection model; eGFR45 model: eGFR < 45 mL/min/1.73 m2 detection model.
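As an illustration of how the two model outputs and the dipstick protein result might be combined in device software, the following is a minimal sketch and not the scheme shown in Figure 6. The thresholds, the binary proteinuria flag, the class and function names, and the additive mapping to risk levels are all assumptions made for this example; the mapping only loosely follows the Kidney Disease: Improving Global Outcomes risk grid and treats a predicted eGFR < 45 as a single category.

```python
from dataclasses import dataclass

# Hypothetical decision thresholds; the cutoffs used in the study are not reproduced here.
EGFR60_THRESHOLD = 0.5
EGFR45_THRESHOLD = 0.5


@dataclass
class DipstickScreen:
    p_egfr60: float    # model probability that eGFR < 60 mL/min/1.73 m2
    p_egfr45: float    # model probability that eGFR < 45 mL/min/1.73 m2
    proteinuria: bool  # dipstick protein read as positive (cutoff is an assumption)


def risk_category(screen: DipstickScreen) -> str:
    """Map two model outputs and proteinuria to an illustrative risk level."""
    below45 = screen.p_egfr45 >= EGFR45_THRESHOLD
    # A predicted eGFR < 45 necessarily implies eGFR < 60.
    below60 = below45 or screen.p_egfr60 >= EGFR60_THRESHOLD
    # Assumed additive rule: each step of eGFR decrease and proteinuria adds one level.
    score = int(below60) + int(below45) + int(screen.proteinuria)
    return ("no CKD flagged", "moderate", "high", "very high")[score]


if __name__ == "__main__":
    example = DipstickScreen(p_egfr60=0.7, p_egfr45=0.2, proteinuria=True)
    print(risk_category(example))
```

In a deployed device, the thresholds would need to be chosen from validation data for the target population, since the reported sensitivity and specificity differed across subgroups by age, proteinuria, and diabetes.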
CONCLUSION
We created high-performance machine-learning models to predict mildly and moderately decreased kidney function using age, sex, and 5 elements of a subject's urine dipstick test. Our models may aid early detection of CKD and timely referral to a nephrologist by providing 2 essential pieces of information: proteinuria and kidney dysfunction. Our screening method is noninvasive, fast, and economical and could contribute to preventing ESKD, one of the most burdensome diseases worldwide.
Contributor Information
Eun Chan Jang, Department of Biomedical Informatics, Graduate School of Medicine, CHA University, Seongnam, Republic of Korea.
Young Min Park, Department of Biomedical Informatics, Graduate School of Medicine, CHA University, Seongnam, Republic of Korea.
Hyun Wook Han, Department of Biomedical Informatics, Graduate School of Medicine, CHA University, Seongnam, Republic of Korea; Institute for Biomedical Informatics, Graduate School of Medicine, CHA University, Seongnam, Republic of Korea.
Christopher Seungkyu Lee, Department of Ophthalmology, Institute of Vision Research, Severance Hospital, Yonsei University College of Medicine, Seoul, Republic of Korea.
Eun Seok Kang, Department of Internal Medicine, Severance Hospital Diabetes Center, Institute of Endocrine Research, Yonsei University College of Medicine, Seoul, Republic of Korea.
Yu Ho Lee, Division of Nephrology, Department of Internal Medicine, CHA Bundang Medical Center, CHA University, Seongnam, Republic of Korea.
Sang Min Nam, Department of Biomedical Informatics, Graduate School of Medicine, CHA University, Seongnam, Republic of Korea; Institute for Biomedical Informatics, Graduate School of Medicine, CHA University, Seongnam, Republic of Korea; Department of Ophthalmology, CHA Bundang Medical Center, CHA University, Seongnam, Republic of Korea.
FUNDING
This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (grant number 2019R1C1C1007663) and by the Basic Science Research Program through the NRF funded by the Ministry of Education (grant numbers 2021R1G1A1014115 and 2020R1F1A1068423).
AUTHOR CONTRIBUTIONS
S.M.N. and Y.H.L. designed and directed the study. C.S.L. and E.S.K. contributed to data collection and preprocessing. E.C.J., Y.M.P., and H.W.H. developed the machine-learning models and evaluated and analyzed their performance. S.M.N., Y.H.L., and E.C.J. drafted the manuscript. All coauthors reviewed and edited the manuscript.
SUPPLEMENTARY MATERIAL
Supplementary material is available at Journal of the American Medical Informatics Association online.
CONFLICT OF INTEREST STATEMENT
None declared.
DATA AVAILABILITY
The data of CHA, SHDC, and SCHPC may be shared on request to the corresponding author with the permission of the respective organization. The KNHANES dataset is available at https://knhanes.kdca.go.kr/knhanes/sub03/sub03_02_05.do
REFERENCES
- 1. Kidney Disease: Improving Global Outcomes (KDIGO) CKD Work Group. Introduction: the case for updating and context. Kidney Int Suppl (2011) 2013; 3 (1): 15–8.
- 2. Kidney Disease: Improving Global Outcomes (KDIGO) CKD Work Group. Chapter 1: Definition and classification of CKD. Kidney Int Suppl (2011) 2013; 3 (1): 19–62.
- 3. Webster AC, Nagler EV, Morton RL, et al. Chronic kidney disease. Lancet 2017; 389 (10075): 1238–52.
- 4. Centers for Disease Control and Prevention. Chronic Kidney Disease in the United States. Atlanta, GA: US Department of Health and Human Services, Centers for Disease Control and Prevention; 2021.
- 5. Levey AS, Stevens LA, Schmid CH, et al. A new equation to estimate glomerular filtration rate. Ann Intern Med 2009; 150 (9): 604–12.
- 6. Go AS, Chertow GM, Fan D, et al. Chronic kidney disease and the risks of death, cardiovascular events, and hospitalization. N Engl J Med 2004; 351 (13): 1296–305.
- 7. Kidney Disease: Improving Global Outcomes (KDIGO) CKD Work Group. Chapter 5: Referral to specialists and models of care. Kidney Int Suppl (2011) 2013; 3 (1): 112–9.
- 8. Matsushita K, van der Velde M, Astor BC, et al.; Chronic Kidney Disease Prognosis Consortium. Association of estimated glomerular filtration rate and albuminuria with all-cause and cardiovascular mortality in general population cohorts: a collaborative meta-analysis. Lancet 2010; 375 (9731): 2073–81.
- 9. van der Velde M, Matsushita K, Coresh J, et al. Lower estimated glomerular filtration rate and higher albuminuria are associated with all-cause and cardiovascular mortality. A collaborative meta-analysis of high-risk population cohorts. Kidney Int 2011; 79 (12): 1341–52.
- 10. Gansevoort RT, Matsushita K, van der Velde M, et al. Lower estimated GFR and higher albuminuria are associated with adverse kidney outcomes. A collaborative meta-analysis of general and high-risk population cohorts. Kidney Int 2011; 80 (1): 93–104.
- 11. Fogazzi GB, Garigali G. Urinalysis. In: Feehally J, Floege J, Tonelli M, Johnson RJ, eds. Comprehensive Clinical Nephrology. 6th ed. China: Elsevier Inc.; 2019: 39–52.
- 12. Uchida D, Kawarazaki H, Shibagaki Y, et al. Underestimating chronic kidney disease by urine dipstick without serum creatinine as a screening tool in the general Japanese population. Clin Exp Nephrol 2015; 19 (3): 474–80.
- 13. Kawashima M, Wada K, Ohta H, et al. Evaluation of validity of the urine dipstick test for identification of reduced glomerular filtration rate in Japanese male workers aged 40 years and over. J Occup Health 2012; 54 (3): 176–80.
- 14. Kidney Disease: Improving Global Outcomes (KDIGO) CKD Work Group. Chapter 2: Definition, identification, and prediction of CKD progression. Kidney Int Suppl (2011) 2013; 3 (1): 63–72.
- 15. Karras DJ, Heilpern KL, Riley LJ, et al. Urine dipstick as a screening test for serum creatinine elevation in emergency department patients with severe hypertension. Acad Emerg Med 2002; 9 (1): 27–34.
- 16. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY: Association for Computing Machinery; 2016: 785–94.
- 17. Kweon S, Kim Y, Jang MJ, et al. Data resource profile: the Korea National Health and Nutrition Examination Survey (KNHANES). Int J Epidemiol 2014; 43 (1): 69–77.
- 18. Korea Centers for Disease Control and Prevention. Korea National Health and Nutrition Examination Survey. 2021. https://knhanes.kdca.go.kr/knhanes/eng/index.do. Accessed February 19, 2023.
- 19. Brownlee J. XGBoost with python: gradient boosted trees with XGBoost and scikit-learn: machine learning mastery. 2018.
- 20. Becker N. Faster AutoML with TPOT and RAPIDS. 2020. https://medium.com/rapids-ai/faster-automl-with-tpot-and-rapids-758455cd89e5. Accessed February 19, 2023.
- 21. Mitchell R, Frank E. Accelerating the XGBoost algorithm using GPU computing. PeerJ Comput Sci 2017; 3: e127.
- 22. Unal I. Defining an optimal cut-point value in ROC analysis: an alternative approach. Comput Math Methods Med 2017; 2017: 3762651.
- 23. Lundberg SM, Erion G, Chen H, et al. From local explanations to global understanding with explainable AI for trees. Nat Mach Intell 2020; 2 (1): 56–67.
- 24. Zhang JJ, Yu GZ, Zheng ZH, et al. Dividing CKD stage 3 into G3a and G3b could better predict the prognosis of IgA nephropathy. PLoS One 2017; 12 (4): e0175828.
- 25. Obermeyer Z, Emanuel EJ. Predicting the future—big data, machine learning, and clinical medicine. N Engl J Med 2016; 375 (13): 1216.
- 26. Simerville JA, Maxted WC, Pahira JJ. Urinalysis: a comprehensive review. Am Fam Physician 2005; 71 (6): 1153–62.
- 27. Locatelli F, Marcelli D, Comelli M, et al. Proteinuria and blood pressure as causal components of progression to end-stage renal failure. Northern Italian Cooperative Study Group. Nephrol Dial Transplant 1996; 11 (3): 461–7.
- 28. Wilmer WA, Rovin BH, Hebert CJ, et al. Management of glomerular proteinuria: a commentary. J Am Soc Nephrol 2003; 14 (12): 3217–32.
- 29. You-Hsien Lin H, Yen CY, Lim LM, et al. Microscopic haematuria and clinical outcomes in patients with stage 3-5 nondiabetic chronic kidney disease. Sci Rep 2015; 5: 15242.
- 30. Nakanishi N, Fukui M, Tanaka M, et al. Low urine pH is a predictor of chronic kidney disease. Kidney Blood Press Res 2012; 35 (2): 77–81.
- 31. Levey AS, Inker LA, Coresh J. GFR estimation: from physiology to public health. Am J Kidney Dis 2014; 63 (5): 820–34.
- 32. Weinstein JR, Anderson S. The aging kidney: physiological changes. Adv Chronic Kidney Dis 2010; 17 (4): 302–7.
- 33. Fried LF, Unruh ML. Aging in kidney disease: key issues and gaps in knowledge. Adv Chronic Kidney Dis 2010; 17 (4): 291–2.
- 34. de Boer IH, Rue TC, Hall YN, et al. Temporal trends in the prevalence of diabetic kidney disease in the United States. JAMA 2011; 305 (24): 2532–9.
- 35. Molitoris BA, Reilly ES. Quantifying glomerular filtration rates in acute kidney injury: a requirement for translational success. Semin Nephrol 2016; 36 (1): 31–41.
- 36. Pelletier K, Lafrance J-P, Roy L, et al. Estimating glomerular filtration rate in patients with acute kidney injury: a prospective multicenter study of diagnostic accuracy. Nephrol Dial Transplant 2020; 35 (11): 1886–93.
- 37. Price CP. Point of care testing. BMJ 2001; 322 (7297): 1285–8.
- 38. Perkovic V, Cass A, Patel AA, et al. High prevalence of chronic kidney disease in Thailand. Kidney Int 2008; 73 (4): 473–9.
- 39. GBD Chronic Kidney Disease Collaboration. Global, regional, and national burden of chronic kidney disease, 1990-2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet 2020; 395 (10225): 709–33.
- 40. George C, Mogueo A, Okpechi I, et al. Chronic kidney disease in low-income to middle-income countries: the case for increased screening. BMJ Glob Health 2017; 2 (2): e000256.
- 41. Stauss M, Keevil B, Woywodt A. Point-of-care testing: home is where the lab is. Kidney360 2022; 3 (7): 1285–8.
- 42. Flaucher M, Nissen M, Jaeger KM, et al. Smartphone-based colorimetric analysis of urine test strips for at-home prenatal care. IEEE J Transl Eng Health Med 2022; 10: 2800109.
- 43. Ra M, Muhammad MS, Lim C, et al. Smartphone-based point-of-care urinalysis under variable illumination. IEEE J Transl Eng Health Med 2018; 6: 2800111.
- 44. Chawla LS, Kimmel PL. Acute kidney injury and chronic kidney disease: an integrated clinical syndrome. Kidney Int 2012; 82 (5): 516–24.
- 45. Meran S, Wonnacott A, Amphlett B, et al. How good are we at managing acute kidney injury in hospital? Clin Kidney J 2014; 7 (2): 144–50.