. 2026 Apr 4;20(1):413. doi: 10.1007/s11701-026-03386-6

Development and validation of an interpretable machine learning model for predicting urinary incontinence at 6 months after robot-assisted radical prostatectomy

Weidong Yu 1,2,3,4,#, Yu He 1,2,3,#, You Ma 1,2,3,4,#, Jiawei Zhang 5, Junchao Wu 1,2,3, Chaozhao Liang 1,2,3, Meng Zhang 1,2,3, Cheng Yang 1,2,3,4
PMCID: PMC13048921  PMID: 41933262

Abstract

Urinary incontinence is a common complication after robot-assisted radical prostatectomy (RARP) that adversely affects quality of life. This study sought to build and explain a machine learning (ML) model for predicting the risk of incontinence at 6 months after RARP. This single-center retrospective analysis enrolled 852 patients with localized prostate cancer (PCa) who underwent RARP from January 2021 to December 2024. Models were developed using seven ML algorithms. Their performance was primarily evaluated through Receiver Operating Characteristic (ROC) curves and Decision Curve Analysis (DCA). The optimal model was then interpreted using SHAP (SHapley Additive exPlanations) analysis. The final model incorporated seven key predictors: prostate volume, membranous urethral length (MUL), age, body mass index (BMI), transurethral resection of the prostate (TURP), T stage, and basal surgical margin status. In the testing set, the Light Gradient Boosting Machine (LightGBM) model demonstrated the best performance, achieving an AUC of 0.784 and an accuracy of 0.793. Through SHAP analysis, intuitive visualizations were generated to illustrate how each feature contributed to individual risk predictions. This ML-based prediction model accurately estimates the risk of urinary incontinence at 6 months after RARP and demonstrates favorable clinical utility. The tool provides a dependable means for timely detection and risk classification of patients at elevated risk, potentially enabling the adoption of tailored preventive measures.

Supplementary Information

The online version contains supplementary material available at 10.1007/s11701-026-03386-6.

Keywords: Prostate cancer, Machine learning, Robot-assisted radical prostatectomy, SHAP, Urinary incontinence

Introduction

Prostate cancer (PCa) is a leading malignancy of the male genitourinary system globally [1]. For localized PCa, radical prostatectomy constitutes the cornerstone of treatment, with its techniques undergoing ongoing refinements. Robot-assisted radical prostatectomy (RARP) has been widely adopted because of its three-dimensional visualization, high-definition magnification, and enhanced instrument dexterity, offering favorable perioperative and oncological outcomes compared with open and conventional laparoscopic approaches [2, 3].

Despite these advantages, RARP is associated with several postoperative complications, including urinary incontinence, erectile dysfunction, anastomotic stricture, urinary fistula, penile shortening, and hernia formation [4, 5]. Among them, urinary incontinence remains the most common complication affecting postoperative quality of life. Although continence gradually recovers in most patients, a considerable proportion continues to suffer from persistent incontinence, which severely interferes with daily activities and imposes a substantial clinical burden [6]. Therefore, reliable prediction of postoperative continence recovery is of great clinical importance for early risk stratification and targeted intervention.

Previous studies primarily developed prediction models based on logistic regression and nomograms [7–10]. Although these approaches are easy to implement and statistically well established, they are inherently limited by linear assumptions and a restricted ability to model complex nonlinear relationships and high-dimensional feature interactions. With the rapid development of artificial intelligence (AI), machine learning (ML) algorithms such as random forests and gradient boosting have shown superior performance in clinical prediction tasks. Moreover, explainable AI techniques, including SHAP and LIME, have addressed the long-standing “black box” concern of ML by providing transparent feature-level interpretations.

This study developed an ML model, enhanced with SHAP interpretability, to predict urinary continence recovery after RARP. By integrating multidimensional perioperative data, the model facilitates precise risk stratification and can support clinicians in optimizing personalized preventive strategies.

Methods

Participant enrollment

Patients who underwent RARP for localized PCa at our institution from January 1, 2021, to December 30, 2024, were consecutively enrolled in this retrospective cohort study to minimize selection bias. The study protocol received approval from our Institutional Review Board (Approval No. PJ2025-03-38) and was conducted per local regulations. Given the retrospective design, informed consent was waived.

Patients were included based on the following criteria: (1) pathologically confirmed PCa prior to surgery and treatment with RARP at our institution; (2) clinically localized disease (T1–T3) without evidence of distant metastasis; and (3) absence of preoperative urinary incontinence. Exclusion criteria included: (1) preoperative voiding dysfunction, such as urinary incontinence, neurogenic bladder, bladder outlet obstruction, or chronic urinary retention; (2) receipt of radiotherapy before or after surgery; (3) incomplete or missing clinical data; and (4) loss to follow-up or follow-up duration of less than 6 months.

Surgical technique

All patients underwent RARP via a transperitoneal approach using a standardized five-trocar technique. After dissection of periprostatic and prevesical adipose tissue, the bilateral endopelvic fascia was incised, and the dorsal vein complex was ligated. The bladder neck was dissected and transected to expose the posterior urethral wall. The bilateral vasa deferentia and seminal vesicles were isolated and divided. Denonvilliers’ fascia was incised with cold scissors, and dissection was continued toward the prostatic apex. The prostate was rotated to fully expose the apical urethra. After confirmation of negative surgical margins and preservation of functional urethral length, the membranous urethra was transected. Pelvic lymph node dissection was performed at the surgeon’s discretion in patients at high risk of nodal metastasis. Vesicourethral anastomosis was completed using a continuous barbed suture (10–12 stitches). A 20-Fr urethral catheter was placed, and a pelvic drain was inserted in the retropubic space after ensuring hemostasis. The specimen was then extracted, and all incisions were closed.

All robotic surgeries were conducted by a team of five certified urologists, each with extensive experience in robot-assisted procedures. Prior to the study, every surgeon had served as primary operator in a minimum of 200 RARP cases. All enrolled patients underwent a uniform transperitoneal RARP without unilateral or bilateral nerve preservation.

Data collection

Perioperative data were primarily sourced from electronic medical records, examination reports, pathology reports, and other relevant documents. Demographic variables included age and body mass index (BMI). Medical history variables comprised previous abdominal surgery, history of transurethral resection of the prostate (TURP), hypertension, diabetes, smoking status, alcohol consumption, and respiratory disease. Radiological parameters were obtained from preoperative prostate MRI performed by experienced radiologists who were blinded to postoperative continence outcomes. Prostatic dimensions (anteroposterior [AP], transverse [TV], and craniocaudal [CC]) and membranous urethral length (MUL) were measured. Prostate volume was derived from the ellipsoid formula: volume = 0.52 × AP × TV × CC [11, 12]. On sagittal MRI, the MUL was determined as the distance between the prostatic apex and the urethral bulb. MUL was measured independently by two experienced radiologists. Interobserver variability was assessed using the intraclass correlation coefficient (ICC). The mean of the two observers’ measurements was used as the final MUL value for each patient in the subsequent analysis.
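The ellipsoid formula above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `prostate_volume_ml` is hypothetical, and the source does not state the measurement units, so the sketch assumes all three diameters are given in centimeters (which yields cubic centimeters, i.e. milliliters).

```python
def prostate_volume_ml(ap_cm: float, tv_cm: float, cc_cm: float) -> float:
    """Ellipsoid approximation: volume = 0.52 * AP * TV * CC.

    Assumption: diameters are in centimeters, so the result is in mL.
    """
    for name, value in (("ap_cm", ap_cm), ("tv_cm", tv_cm), ("cc_cm", cc_cm)):
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
    return 0.52 * ap_cm * tv_cm * cc_cm


# Hypothetical measurements, not taken from the study cohort.
example_volume = prostate_volume_ml(ap_cm=3.5, tv_cm=4.5, cc_cm=4.0)
print(f"{example_volume:.2f} mL")
```

The coefficient 0.52 approximates π/6, the factor relating the product of three diameters to the volume of an ellipsoid.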

Postoperative pathological variables included Gleason score, ISUP grade group, TNM stage according to the AJCC 8th edition, surgical margin status (apical and basal), perineural invasion, and lymphovascular invasion. The intraoperative parameters assessed were operative time, estimated blood loss, and whether lymph node dissection was performed. Laboratory variables included preoperative PSA, albumin, and fasting glucose (after ≥ 8 h of fasting), as well as postoperative day 1 levels. Postoperative hospital length of stay was recorded as a recovery indicator.

Continence outcome definitions

The urethral catheter was typically removed about two weeks post-surgery. Patient outcomes were then assessed at scheduled follow-ups one, three, six, and twelve months after its removal. The six-month mark serves as a critical and clinically decisive time point for differentiating between transient dysfunction and potential persistent incontinence. Given this clinical rationale, coupled with the considerable loss to follow-up observed at 12 months post-RARP in our cohort, we selected 6 months as the study endpoint.

Urinary continence status was assessed based on patient-reported daily pad usage using the standardized question: “How many pads or adult diapers did you typically use per day to manage leakage during the past four weeks?” Continence recovery was defined as the use of ≤ 1 safety pad per day for two consecutive weeks. Urinary incontinence was defined as involuntary leakage requiring > 1 pad per day during daily activities for at least two consecutive weeks. This definition aligns with widely used clinical endpoints in post-prostatectomy incontinence research. A recent review incorporating data from over 190,000 patients indicated that “No pad” (53.3%) and “Safety pad” (19.3%) are the two most commonly employed objective evaluation criteria in such studies [6].

Predictive variable screening

To prevent data leakage, all feature selection was confined to the training set following a 7:3 random split of the dataset. For the initial variable selection, we employed the least absolute shrinkage and selection operator (LASSO) with 5-fold cross-validation. The optimal regularization parameter (λ) was determined using the 1-standard-error (λ.1se) criterion to enhance model generalizability and mitigate overfitting. The Boruta algorithm, a random forest-based feature selection method, was subsequently applied as a complementary approach. The algorithm identifies genuinely predictive features by iteratively contrasting their importance against that of randomly permuted “shadow” features. Final predictive variables were determined based on the combined results of LASSO and Boruta selection, together with clinical relevance. Additionally, we calculated the Variance Inflation Factor (VIF) to assess multicollinearity among the predictor variables.
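The one-standard-error criterion described above can be illustrated with a short sketch. It assumes the cross-validation results (mean error and standard error per candidate λ) are already available; the function name `select_lambda_1se` and all numbers below are hypothetical and are not the study's actual cross-validation output.

```python
def select_lambda_1se(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validation summaries.

    lambda_min: the lambda with the lowest mean CV error.
    lambda_1se: the largest lambda whose mean CV error lies within one
                standard error of that minimum (a sparser, more
                regularized model).
    """
    if not lambdas or not (len(lambdas) == len(cv_mean) == len(cv_se)):
        raise ValueError("inputs must be non-empty and of equal length")
    best_error = min(cv_mean)
    # If several lambdas tie for the minimum, prefer the largest one.
    lambda_min = max(l for l, m in zip(lambdas, cv_mean) if m == best_error)
    best_index = lambdas.index(lambda_min)
    threshold = cv_mean[best_index] + cv_se[best_index]
    lambda_1se = max(l for l, m in zip(lambdas, cv_mean) if m <= threshold)
    return lambda_min, lambda_1se


# Hypothetical CV deviance values for five candidate penalties.
lambdas = [0.100, 0.046, 0.030, 0.014, 0.005]
cv_mean = [1.20, 1.11, 1.10, 1.09, 1.10]
cv_se = [0.03, 0.03, 0.03, 0.03, 0.03]

lam_min, lam_1se = select_lambda_1se(lambdas, cv_mean, cv_se)
print(lam_min, lam_1se)
```

Choosing the larger penalty trades a small, statistically indistinguishable loss in cross-validated fit for fewer retained predictors, which is the rationale the text gives for preferring λ.1se.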

Model development, evaluation, and SHAP interpretability analysis

After variable selection, we employed seven ML models to predict 6-month post-RARP urinary incontinence. Based on their inherent characteristics, the seven models are categorized as follows: Logistic Regression represents a linear model; Decision Tree is a non-linear model; Random Forest, eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) are all tree-based ensemble models; Support Vector Machine (SVM) operates as a kernel-based model; and Artificial Neural Network (ANN) constitutes a connectionist model.

The models were trained and tuned on the training set using 5-fold cross-validation, with the area under the receiver operating characteristic curve (AUC) as the primary optimization metric. Hyperparameter optimization was performed via grid search to identify the best configuration for each model. The finalized models were evaluated on an independent test set using a comprehensive set of metrics: AUC, accuracy, precision (positive predictive value), sensitivity, specificity, and F1-score. Calibration curves and decision curve analysis (DCA) were used to evaluate model performance and clinical net benefit, respectively.
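Decision curve analysis compares the net benefit of acting on a model's predictions with that of treating every patient. The sketch below uses the standard net benefit definition, TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The function names and the toy data are hypothetical and do not come from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of classifying as positive when y_prob >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data: 1 = incontinent at 6 months, 0 = continent (hypothetical).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.2, 0.5, 0.3, 0.1, 0.1, 0.2, 0.05, 0.3]

nb_model = net_benefit(y_true, y_prob, threshold=0.35)
nb_all = net_benefit_treat_all(y_true, threshold=0.35)
print(round(nb_model, 4), round(nb_all, 4))
```

A full decision curve repeats this calculation over a grid of thresholds and plots both strategies alongside the "treat-none" line at zero.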

To enhance the interpretability of the optimal ML model, we employed the SHapley Additive exPlanations (SHAP) framework. This approach quantifies the contribution of each feature to individual predictions based on cooperative game theory.
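To make the game-theoretic idea concrete, the sketch below computes exact Shapley values for a single prediction by enumerating all feature coalitions against one baseline. It is an illustration of the underlying definition only: the toy risk function, its coefficients, and the instance values are hypothetical, and the SHAP library uses faster, model-specific algorithms for tree ensembles rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `instance` relative to a single `baseline`.

    Features outside a coalition are set to their baseline value.
    """
    features = list(instance)
    n = len(features)

    def value(coalition):
        mixed = {f: (instance[f] if f in coalition else baseline[f])
                 for f in features}
        return predict(mixed)

    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = set(subset)
                total += weight * (value(coalition | {f}) - value(coalition))
        phi[f] = total
    return phi


def toy_risk(x):
    """Hypothetical risk score with one interaction term (not the study model)."""
    score = 0.02 * x["prostate_volume"] - 0.05 * x["mul"] + 0.01 * x["age"]
    if x["prostate_volume"] > 50 and x["age"] > 70:
        score += 0.1
    return score


# Baseline uses the cohort means reported in Table 1; the instance is invented.
baseline = {"prostate_volume": 38.6, "mul": 11.8, "age": 69.5}
instance = {"prostate_volume": 60.0, "mul": 10.0, "age": 75.0}

phi = shapley_values(toy_risk, instance, baseline)
print({name: round(val, 3) for name, val in phi.items()})
```

The contributions sum to the difference between the instance's score and the baseline score, which is the additivity property that waterfall plots visualize.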

Statistical methods

All statistical analyses were performed using SPSS (version 27), R (version 4.4.3) and Python (version 3.10.4). Continuous data were reported as mean ± standard deviation or median (interquartile range), based on normality. Categorical data were expressed as frequency (percentage). Group comparisons utilized the Student’s t-test, Mann–Whitney U test, or chi-squared test, as appropriate. Statistical significance was defined as a two-sided p-value < 0.05.

Results

Participant enrollment

Between January 1, 2021, and December 30, 2024, a total of 1,364 patients underwent RARP for PCa at our center. Patients were excluded for the following reasons: loss to follow-up or follow-up duration < 6 months (n = 405), preoperative voiding dysfunction (n = 32), receipt of radiotherapy before or after surgery (n = 54), and incomplete or missing clinical data (n = 21). Ultimately, 852 patients were included in the final study. Among them, 614 patients (72.1%) achieved urinary continence at 6 months after RARP, whereas 238 patients (27.9%) remained incontinent. The patient selection flowchart is presented in Fig. 1.

Fig. 1. Flowchart of participant enrollment and study design

Basic characteristics

As shown in Table 1, a total of 852 patients with localized PCa were included in this study. The cohort was randomly split into a training set (n = 596) and a test set (n = 256) for model development and validation. The primary outcome, postoperative incontinence, was present in 238 patients (27.9%), with a comparable distribution between the training (27.9%) and test (28.1%) sets (P = 1.000), indicating a balanced split with respect to the outcome variable.
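A 7:3 random split of 852 patients reproduces the reported set sizes, as the sketch below shows. It is only an illustration: the source does not say whether the split was stratified by outcome or which random seed was used, so the simple shuffle, the function name `random_split`, and the seed value are all assumptions.

```python
import random


def random_split(n_patients, train_fraction=0.7, seed=42):
    """Shuffle patient indices and cut them into training and test sets.

    Assumptions: simple (non-stratified) shuffle; arbitrary seed.
    """
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    n_train = int(n_patients * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = random_split(852)
print(len(train_idx), len(test_idx))
```

Because feature selection and tuning are confined to the training indices, the test indices play no part in model development.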

Table 1. Basic characteristics

| Variable | Level | Overall (n = 852) | Training set (n = 596) | Test set (n = 256) | P-value |
|---|---|---|---|---|---|
| Result, n (%) | Continence | 614 (72.1) | 430 (72.1) | 184 (71.9) | 1 |
| | Incontinence | 238 (27.9) | 166 (27.9) | 72 (28.1) | |
| Age, mean (SD) | | 69.5 (7.1) | 69.6 (7.0) | 69.5 (7.1) | 0.874 |
| BMI, mean (SD) | | 23.8 (3.1) | 23.7 (3.1) | 23.9 (3.1) | 0.478 |
| Respiratory disease, n (%) | No | 828 (97.2) | 578 (97.0) | 250 (97.7) | 0.748 |
| | Yes | 24 (2.8) | 18 (3.0) | 6 (2.3) | |
| Diabetes, n (%) | No | 738 (86.6) | 518 (86.9) | 220 (85.9) | 0.784 |
| | Yes | 114 (13.4) | 78 (13.1) | 36 (14.1) | |
| Previous abdominal surgery, n (%) | No | 701 (82.3) | 489 (82.0) | 212 (82.8) | 0.865 |
| | Yes | 151 (17.7) | 107 (18.0) | 44 (17.2) | |
| Hypertension, n (%) | No | 461 (54.1) | 329 (55.2) | 132 (51.6) | 0.367 |
| | Yes | 391 (45.9) | 267 (44.8) | 124 (48.4) | |
| Alcohol consumption, n (%) | No | 760 (89.2) | 526 (88.3) | 234 (91.4) | 0.216 |
| | Yes | 92 (10.8) | 70 (11.7) | 22 (8.6) | |
| Previous TURP, n (%) | No | 766 (89.9) | 532 (89.3) | 234 (91.4) | 0.407 |
| | Yes | 86 (10.1) | 64 (10.7) | 22 (8.6) | |
| PSA, n (%) | ≤ 20 ng/ml | 527 (61.9) | 365 (61.2) | 162 (63.3) | 0.628 |
| | > 20 ng/ml | 325 (38.1) | 231 (38.8) | 94 (36.7) | |
| Preoperative albumin, mean (SD) | | 43.3 (3.8) | 43.3 (3.7) | 43.2 (4.0) | 0.589 |
| Preoperative FBG, mean (SD) | | 5.9 (1.5) | 5.9 (1.4) | 6.0 (1.5) | 0.749 |
| Postoperative albumin, mean (SD) | | 33.9 (3.0) | 33.9 (2.9) | 33.9 (3.1) | 0.938 |
| Postoperative FBG, mean (SD) | | 6.8 (2.8) | 6.8 (2.7) | 6.9 (3.0) | 0.499 |
| Prostate volume, mean (SD) | | 38.6 (19.4) | 38.1 (19.0) | 39.8 (20.4) | 0.278 |
| MUL, mean (SD) | | 11.8 (1.3) | 11.7 (1.3) | 11.8 (1.3) | 0.209 |
| Operation duration, mean (SD) | | 161.1 (52.1) | 159.6 (52.5) | 164.4 (51.3) | 0.212 |
| Hb change, mean (SD) | | 19.0 (9.4) | 19.1 (9.6) | 18.8 (8.8) | 0.67 |
| PLND, n (%) | No | 494 (58.0) | 341 (57.2) | 153 (59.8) | 0.538 |
| | Yes | 358 (42.0) | 255 (42.8) | 103 (40.2) | |
| T stage, n (%) | T1-2 | 648 (76.1) | 447 (75.0) | 201 (78.5) | 0.31 |
| | T3 | 204 (23.9) | 149 (25.0) | 55 (21.5) | |
| Gleason score, n (%) | 6 | 126 (14.8) | 92 (15.4) | 34 (13.3) | 0.369 |
| | 7 | 428 (50.2) | 290 (48.7) | 138 (53.9) | |
| | 8 | 133 (15.6) | 91 (15.3) | 42 (16.4) | |
| | 9 | 156 (18.3) | 115 (19.3) | 41 (16.0) | |
| | 10 | 9 (1.1) | 8 (1.3) | 1 (0.4) | |
| ISUP, n (%) | 1 | 126 (14.8) | 92 (15.4) | 34 (13.3) | 0.501 |
| | 2 | 233 (27.3) | 158 (26.5) | 75 (29.3) | |
| | 3 | 195 (22.9) | 132 (22.1) | 63 (24.6) | |
| | 4 | 133 (15.6) | 91 (15.3) | 42 (16.4) | |
| | 5 | 165 (19.4) | 123 (20.6) | 42 (16.4) | |
| PNI, n (%) | Negative | 204 (23.9) | 146 (24.5) | 58 (22.7) | 0.624 |
| | Positive | 648 (76.1) | 450 (75.5) | 198 (77.3) | |
| LVI, n (%) | Negative | 733 (86.0) | 517 (86.7) | 216 (84.4) | 0.42 |
| | Positive | 119 (14.0) | 79 (13.3) | 40 (15.6) | |
| Basal margin, n (%) | Negative | 775 (91.0) | 538 (90.3) | 237 (92.6) | 0.343 |
| | Positive | 77 (9.0) | 58 (9.7) | 19 (7.4) | |
| Apical margin, n (%) | Negative | 745 (87.4) | 527 (88.4) | 218 (85.2) | 0.228 |
| | Positive | 107 (12.6) | 69 (11.6) | 38 (14.8) | |
| Postoperative hospital stay, mean (SD) | | 6.0 (1.9) | 6.0 (2.0) | 6.0 (1.7) | 0.86 |

SD: standard deviation;

BMI: Body mass index; TURP: Transurethral resection of the prostate; PSA: Prostate-specific antigen; FBG: Fasting blood glucose; MUL: Membranous urethral length;

PLND: Pelvic lymph node dissection; ISUP: International Society of Urological Pathology; PNI: Perineural Invasion; LVI: Lymphovascular Invasion;

The baseline demographic, clinical, and pathological characteristics were well-balanced between the two sets. Key variables, including comorbidities (e.g., diabetes, hypertension), preoperative PSA levels, prostate volume, operative duration, pathological stage (T-stage), Gleason Score, and surgical margin status, all showed no statistically significant differences (all P > 0.05). This confirms that the training and test sets are comparable across all measured covariates, supporting the validity of the subsequent ML analysis.

Predictive variable screening

LASSO regression identified two candidate regularization parameters: λ_min = 0.013630 and λ_1se = 0.045682. To improve model generalizability and reduce overfitting, λ_1se was selected, resulting in six predictors: age, BMI, history of TURP, prostate volume, basal surgical margin status, and MUL. The coefficient path and cross-validation curves of the LASSO model are shown in Fig. 2A–B.

Fig. 2. Clinical feature selection using LASSO regression and the Boruta algorithm. (A) Regularization path plot of clinical features in LASSO regression. (B) Cross-validated binomial deviance versus log(λ) for the LASSO regression. The vertical dashed lines correspond to the optimal λ values chosen by the minimum criterion (left) and the one-standard-error rule (right). We selected the right λ value, which retained six features, as the final parameter to enhance generalizability and mitigate overfitting. (C) Feature importance ranking from the Boruta algorithm, where green, yellow, and red boxes represent confirmed, tentative, and rejected features, respectively

Subsequently, the Boruta algorithm was applied as a complementary feature selection method. The feature importance ranking is displayed in Fig. 2C, where green, yellow, and red boxes indicate confirmed, tentative, and rejected features, respectively. Boruta identified eight variables as important predictors. By integrating the results of LASSO and Boruta, together with clinical relevance, seven variables were ultimately selected for model construction: age, BMI, history of TURP, basal surgical margin status, prostate volume, T stage, and MUL. Multicollinearity among the final selected predictors was assessed using the VIF, with all VIF values < 2, indicating the absence of significant multicollinearity (Supplementary Table 2).

Prediction model development and performance evaluation

The performance of the seven models was evaluated and compared on the training and test sets (Fig. 3). The ROC curves (Fig. 3A–B) show that LightGBM achieved the best discrimination on the test set, with an AUC of 0.784. Calibration curves indicated good prediction reliability (Fig. 3C). Furthermore, DCA curves revealed that most models provided better net clinical benefit than the “treat-all” strategy across most thresholds, with LightGBM offering the most favorable utility (Fig. 3D).

Fig. 3. Performance comparison of seven machine learning models. (A) Receiver operating characteristic (ROC) curves in the training cohort. (B) ROC curves in the independent test cohort. (C) Calibration curves in the test cohort. (D) Decision curve analysis (DCA) evaluating the clinical net benefit in the test cohort

Based on the ROC and DCA results, a classification threshold of 0.35 was selected to optimize model performance and clinical applicability. Detailed performance metrics of each model in both the training and test sets are presented in Table 2. In the test set, LightGBM achieved the best overall performance, with an AUC of 0.784, followed by XGBoost (AUC: 0.763) and Random Forest (AUC: 0.753). Notably, LightGBM exhibited a favorable balance between sensitivity (0.611) and specificity (0.864), resulting in the highest F1-score (0.624), indicating robust predictive performance.
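The threshold-dependent metrics reported for LightGBM on the test set can be cross-checked from a confusion matrix. The counts used below (44 true positives, 28 false negatives, 159 true negatives, 25 false positives) are not reported in the source. They are back-calculated here, as an assumption, from the reported sensitivity (0.611) and specificity (0.864) together with the test-set composition of 72 incontinent and 184 continent patients. The function name is hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),      # positive predictive value
        "sensitivity": tp / (tp + fn),    # recall for incontinent patients
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Back-calculated counts (assumption, see the paragraph above).
metrics = classification_metrics(tp=44, fp=25, tn=159, fn=28)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts, the derived accuracy, precision, and F1-score round to the values listed for LightGBM in Table 2, which suggests the reported metrics are internally consistent.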

Table 2. Model performance

| Data set | Model | AUC | AUC 95% CI lower | AUC 95% CI upper | Accuracy | Precision | Sensitivity | Specificity | F1 score |
|---|---|---|---|---|---|---|---|---|---|
| Train set | Logistic | 0.759 | 0.725 | 0.794 | 0.716 | 0.491 | 0.494 | 0.802 | 0.492 |
| | Decision Tree | 0.860 | 0.832 | 0.888 | 0.802 | 0.641 | 0.657 | 0.858 | 0.649 |
| | Random Forest | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| | XGBoost | 0.987 | 0.978 | 0.996 | 0.951 | 0.905 | 0.922 | 0.963 | 0.913 |
| | LightGBM | 0.970 | 0.956 | 0.984 | 0.926 | 0.855 | 0.886 | 0.942 | 0.870 |
| | SVM | 0.753 | 0.718 | 0.787 | 0.757 | 0.600 | 0.380 | 0.902 | 0.465 |
| | ANN | 0.762 | 0.728 | 0.796 | 0.708 | 0.480 | 0.572 | 0.760 | 0.522 |
| Test set | Logistic | 0.721 | 0.666 | 0.776 | 0.742 | 0.548 | 0.472 | 0.848 | 0.507 |
| | Decision Tree | 0.639 | 0.580 | 0.698 | 0.668 | 0.408 | 0.403 | 0.772 | 0.406 |
| | Random Forest | 0.753 | 0.700 | 0.806 | 0.730 | 0.518 | 0.611 | 0.777 | 0.561 |
| | XGBoost | 0.763 | 0.711 | 0.815 | 0.762 | 0.580 | 0.556 | 0.842 | 0.567 |
| | LightGBM | 0.784 | 0.734 | 0.835 | 0.793 | 0.638 | 0.611 | 0.864 | 0.624 |
| | SVM | 0.706 | 0.651 | 0.762 | 0.734 | 0.559 | 0.264 | 0.918 | 0.358 |
| | ANN | 0.716 | 0.661 | 0.771 | 0.703 | 0.475 | 0.528 | 0.772 | 0.500 |

AUC: area under the curve;

XGBoost: eXtreme Gradient Boosting; LightGBM: Light Gradient Boosting Machine; SVM: Support Vector Machine; ANN: Artificial Neural Network;

SHAP analysis and model interpretability

Following comprehensive performance evaluation, the LightGBM algorithm was identified as the best-performing model and was therefore carried forward for interpretability analysis using SHAP. The SHAP summary bar plot (Fig. 4A) ranks features according to their mean absolute SHAP values, reflecting their global importance. In descending order of importance, the predictors were prostate volume, MUL, age, BMI, history of TURP, T stage, and basal surgical margin status. The SHAP beeswarm plot (Fig. 4B) visualizes the direction and magnitude of each feature's contribution. Higher age, BMI, and prostate volume, together with a history of TURP, higher T stage, and basal margin positivity, were associated with positive SHAP values (red dots, representing higher feature values, cluster on the right side), indicating increased risk of postoperative urinary incontinence. In contrast, higher MUL values were associated with negative SHAP values (red dots cluster on the left side), indicating decreased risk. To further illustrate individual-level predictions, we employed SHAP waterfall plots for two representative patients (Fig. 4C–D). One patient was correctly classified as high risk (predicted probability = 0.915) and experienced persistent urinary incontinence at 6 months postoperatively (Fig. 4C). The other patient was classified as low risk (predicted probability = 0.075) and achieved urinary continence within 6 months after RARP (Fig. 4D).

Fig. 4. Interpretation of the LightGBM model using SHapley Additive exPlanations (SHAP). (A) Feature importance ranked by the mean absolute SHAP value. (B) SHAP beeswarm plot illustrating the distribution of feature impacts across the test cohort (each dot represents a specific patient; red: high feature value; blue: low feature value). (C, D) SHAP waterfall plots for two representative cases: (C) a true-positive patient and (D) a true-negative patient. Red and blue bars indicate features that increase or decrease the prediction score, respectively

Discussion

This study developed an ML prediction model that uses perioperative parameters to predict urinary incontinence following RARP. ML frameworks integrating SHAP explainers can surpass traditional logistic regression and static nomograms [7–10] by capturing nonlinear relationships and offering interpretable feature insights. Research applying ML to predict urinary incontinence after RARP has been limited, with previous studies constrained by small sample sizes and a narrow range of algorithms [13]. To address these limitations, our study expands the cohort to over 800 patients with six-month follow-up, applies rigorous feature selection, and evaluates seven ML algorithms, resulting in a model with superior predictive performance.

By applying ML algorithms, the model can more accurately identify patients at high risk for postoperative incontinence, enabling personalized interventions and targeted protective strategies for these individuals. Five of the indicators can be measured directly before surgery, while the other two (T stage and basal margin status) can be estimated preoperatively using methods such as PSA, MRI, and digital rectal examination. In preoperative decision-making, for older patients (e.g., over 70 years) with a limited life expectancy (e.g., < 5 or < 10 years) who are highly concerned about quality of life impairment from urinary incontinence, forgoing surgery in favor of radical radiotherapy may be considered [14–16]. Radical radiotherapy has been reported to achieve clinical outcomes comparable to those of radical prostatectomy in PCa patients. In the surgical management of high-risk patients, optimized techniques such as Retzius-sparing RARP are preferable, as studies indicate they offer better recovery of urinary continence and lower rates of complications like penile shortening and inguinal hernia compared to conventional approaches [17–19]. Postoperatively, beyond commonly advised at-home pelvic floor muscle training, more intensive methods such as physical therapist-guided training [20] or biofeedback-assisted therapy [21, 22] could be recommended to enhance muscle training efficacy. The model allows for prioritized allocation of rehabilitation resources to high-risk patients, thereby improving resource utilization efficiency and reducing unnecessary waste.

SHAP analysis revealed the following factors, ranked by importance, as key predictors of urinary incontinence 6 months after RARP: larger prostate volume, shorter MUL, advanced age, elevated BMI (indicating obesity), history of TURP, higher pathologic T stage, and positive basal margin. A larger prostate volume is consistently associated with a higher incidence of post-RARP incontinence. This correlation may be attributed to the technical challenges posed by a larger gland during surgery, potentially affecting the preservation of structures crucial for urinary control [23, 24]. MUL is a well-established anatomical predictor; both preoperative and postoperative MUL measurements are significantly associated with continence recovery. A longer MUL generally predicts a higher likelihood of regaining continence, underscoring the importance of its preservation during surgery [25, 26]. Advanced age is a significant risk factor, potentially due to age-related degenerative changes in the urethral sphincter’s integrity and neural pathways [27, 28]. Similarly, obesity (higher BMI) is linked to worse continence outcomes, possibly because increased intra-abdominal pressure exerts additional stress on the bladder and urethra [29, 30]. A history of TURP prior to RARP is also a recognized risk factor, likely because the initial procedure alters the prostate’s anatomy, increasing the complexity of subsequent cancer surgery and hindering functional recovery [31, 32]. Furthermore, a higher pathological T stage (T3) often indicates a greater tumor burden and extraprostatic extension, which can compromise the surgeon’s ability to spare the neurovascular bundles and other continence-related structures [33–35]. Finally, the presence of positive basal surgical margins may necessitate a more extensive resection near the bladder neck, potentially damaging the sphincteric mechanism and surrounding nerves essential for urinary continence [36, 37].
In summary, these factors highlight a combination of patient-specific anatomy (prostate volume, MUL), patient characteristics (age, BMI), medical history (TURP), and disease severity (T stage, margin status) that collectively influence the risk of urinary incontinence following RARP. Preoperative identification of these factors can aid in risk stratification and patient counseling.

It is noteworthy that the LightGBM model constructed in this study exhibits near-perfect performance on the training set (AUC = 0.970), while its AUC on the independent test set is 0.784, indicating overfitting. This phenomenon can be attributed to two main factors. First, this study employs cross-validation for hyperparameter optimization rather than a fixed validation set. This method partitions the training data multiple times to identify robust hyperparameter combinations, thereby reducing evaluation bias compared to a single train-validation split. However, this approach cannot eliminate the inherent risk of overfitting that arises from an overly complex model structure. Second, compared to models such as logistic regression, SVM, and ANN, LightGBM demonstrates a greater propensity for overfitting. This is primarily due to its inherent algorithmic characteristics, specifically the Leaf-wise (leaf-by-leaf growth) strategy. This strategy facilitates the construction of deep and asymmetric trees, enabling the model to efficiently capture complex patterns, including noise, within the training data. Consequently, tree-based models like LightGBM are more prone to achieving exceptionally high fitting performance on the training data compared to structurally simpler models. Performance on the training set has limited reference value, whereas performance on an independent test set holds both representativeness and clinical significance.

Despite its contributions, this study has several limitations. Firstly, its single-center, retrospective design may limit the generalizability of the findings. Multi-center studies, such as a recent machine learning study predicting severe erectile dysfunction after penile fracture repair that leveraged data from 23 centers, offer clear advantages in enhancing sample diversity and assessing model transportability [38]. An effective pathway forward may be to develop the model initially within a standardized single-center cohort and then validate it broadly across multiple institutions, so that the two stages complement each other. Secondly, although SHAP analysis was employed to enhance model interpretability, it explains feature importance relative to the model’s predictions under specific assumptions and does not establish real-world clinical causality. This reflects a fundamental epistemological limitation of post-hoc interpretability tools, which elucidate the model’s internal logic but are not equivalent to clinician reasoning. A recent blinded expert evaluation of large language models in urology critically assessed this gap, demonstrating that even high-performing models scored lower on “interpretive depth” than on other metrics, which highlights the superficial nature of such explanations compared with nuanced clinical reasoning [39]. Therefore, the insights from our SHAP analysis should be viewed as hypothesis-generating for further clinical investigation, not as definitive evidence of causal pathways. Thirdly, for clinical practicality and consistency with previous literature, this study assessed postoperative urinary continence recovery by daily pad usage. However, current research guidelines [40] increasingly recommend patient-reported outcome measures (PROMs) as the standard for assessment, which may limit the generalizability of our findings.
Future research should include multi-center, prospective studies with longer follow-up that integrate surgical factors and use rigorous PROMs as the criterion for urinary continence.
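To make the point about SHAP concrete, the sketch below computes exact Shapley values for a toy three-feature model in plain Python. Everything in it is an assumption for illustration: the function `toy_model`, its coefficients, and the single reference point `baseline` are invented and are unrelated to the fitted LightGBM model. SHAP implementations typically average over a background dataset, whereas this sketch replaces absent features with one baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Features outside a coalition are set to their baseline value
    (a simplifying assumption made for this sketch).
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(features):
    """Hypothetical score with one interaction term (a * c)."""
    a, b, c = features
    return 2 * a + 3 * b + a * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print([round(v, 3) for v in phi])  # [3.5, 6.0, 1.5]

# The attributions sum to the difference in model output, nothing more.
print(round(sum(phi), 3), toy_model(x) - toy_model(baseline))  # 11.0 11.0
```

The attributions decompose the model's output exactly, and the interaction term is split evenly between the two features involved. Nothing in the computation refers to how the outcome arises clinically, which is why such values describe the model and not causal mechanisms.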

Conclusion

We developed and internally validated an interpretable ML model based on LightGBM to predict the risk of urinary incontinence at 6 months after RARP. SHAP analysis delineated the key predictors: larger prostate volume, shorter MUL, advanced age, higher BMI, history of TURP, advanced pathological T stage, and positive basal surgical margin. This model can be a useful tool for the early identification of high-risk populations, potentially guiding personalized perioperative care.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

Supplementary Material 1 (10.3KB, xlsx)
Supplementary Material 2 (11.1KB, xlsx)

Author contributions

YWD and YC designed the study. YWD, HY, MY, WJC and ZJW collected and analyzed the data. YWD, ZJW, and ZM prepared the manuscript and provided the table. LCZ, ZM and YC reviewed the manuscript and provided constructive suggestions. All authors read and approved the final manuscript.

Funding

The authors gratefully acknowledge financial support from the National Natural Science Foundation of China (Grants Nos. 82270818, 81700662), the Outstanding Scientific Research and Innovation Team for Male Genitourinary Diseases in Anhui Provincial Universities (Grant No. 2022AH010071), the Research Funds of the Center for Big Data and Population Health of IHM (Grant No. JKS2022001), and the Anhui Provincial Higher Education Research Projects (Grant No. 2025AHGXZK30614).

Data availability

No datasets were generated or analysed during the current study.

Declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose. The authors declare no competing interests.

Ethics approval

The study protocol received full ethical approval from the Institutional Review Board of Anhui Medical University (Approval No. PJ2025-03-38). All procedures were conducted in accordance with local legislation and institutional requirements.

Consent to participate

The ethics committee/institutional review board waived the requirement for written informed consent from participants because of the retrospective design of the study.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Weidong Yu, Yu He and You Ma have contributed equally to this work.

Contributor Information

Chaozhao Liang, Email: liang_chaozhao@ahmu.edu.cn.

Meng Zhang, Email: zhangmeng@ahmu.edu.cn.

Cheng Yang, Email: yang_cheng@ahmu.edu.cn.

References

1. Siegel RL, Giaquinto AN, Jemal A (2024) Cancer statistics, 2024. CA Cancer J Clin 74(1):12–49. 10.3322/caac.21820
2. Mohler JL, Antonarakis ES, Armstrong AJ et al (2019) Prostate cancer, version 2.2019, NCCN clinical practice guidelines in oncology. J Natl Compr Canc Netw 17(5):479–505. 10.6004/jnccn.2019.0023
3. Basiri A, de la Rosette JJ, Tabatabaei S et al (2018) Comparison of retropubic, laparoscopic and robotic radical prostatectomy: who is the winner? World J Urol 36(4):609–621. 10.1007/s00345-018-2174-1
4. Kadono Y, Nohara T, Kawaguchi S et al (2022) Impact of pelvic anatomical changes caused by radical prostatectomy. Cancers (Basel) 14(13):3050. 10.3390/cancers14133050
5. Novara G, Ficarra V, Rosen RC et al (2012) Systematic review and meta-analysis of perioperative outcomes and complications after robot-assisted radical prostatectomy. Eur Urol 62(3):431–452. 10.1016/j.eururo.2012.05.044
6. Moretti TBC, Magna LA, Reis LO (2023) Continence criteria of 193 618 patients after open, laparoscopic, and robot-assisted radical prostatectomy. BJU Int 134(1):13–21. 10.1111/bju.16180
7. Huang J, Dai X, Sun J, Fan Y, Guo C (2024) Prediction models for urinary incontinence after robotic-assisted laparoscopic radical prostatectomy: a systematic review. J Robot Surg 18(1):249. 10.1007/s11701-024-02009-2
8. Tutolo M, Bruyneel L, Van der Aa F et al (2021) A novel tool to predict functional outcomes after robot-assisted radical prostatectomy and the value of additional surgery for incontinence. BJU Int 127(5):575–584. 10.1111/bju.15242
9. Pinkhasov RM, Lee T, Huang R et al (2022) Prediction of incontinence after robot-assisted radical prostatectomy: development and validation of a 24-month incontinence nomogram. Cancers (Basel) 14(7):1644. 10.3390/cancers14071644
10. Collette ERP, Klaver SO, Lissenberg-Witte BI et al (2021) Patient reported outcome measures concerning urinary incontinence after robot assisted radical prostatectomy: development and validation of an online prediction model using clinical parameters, lower urinary tract symptoms and surgical experience. J Robot Surg 15(4):593–602. 10.1007/s11701-020-01145-9
11. Guo S, Zhang J, Jiao J et al (2023) Comparison of prostate volume measured by transabdominal ultrasound and MRI with the radical prostatectomy specimen volume: a retrospective observational study. BMC Urol 23(1). 10.1186/s12894-023-01234-5
12. Paterson NR, Lavallée LT, Nguyen LN et al (2016) Prostate volume estimations using magnetic resonance imaging and transrectal ultrasound compared to radical prostatectomy specimens. Can Urol Assoc J 10(7–8):264. 10.5489/cuaj.3236
13. Amparore D, De Cillis S, Alladio E et al (2024) Development of machine learning algorithm to predict the risk of incontinence after robot-assisted radical prostatectomy. J Endourol 38(8):871–878. 10.1089/end.2024.0057
14. Garin O, Suárez JF, Guedea F et al (2021) Comparative effectiveness research in localized prostate cancer: a 10-year follow-up cohort study. Int J Radiat Oncol Biol Phys 110(3):718–726. 10.1016/j.ijrobp.2020.12.032
15. Hamdy FC, Donovan JL, Lane JA et al (2016) 10-year outcomes after monitoring, surgery, or radiotherapy for localized prostate cancer. N Engl J Med 375(15):1415–1424. 10.1056/nejmoa1606220
16. Hamdy FC, Donovan JL, Lane JA et al (2023) Fifteen-year outcomes after monitoring, surgery, or radiotherapy for prostate cancer. N Engl J Med 388(17):1547–1558. 10.1056/nejmoa2214122
17. Kowalczyk KJ, Davis M, Neill JO et al (2020) Impact of Retzius-sparing versus standard robotic-assisted radical prostatectomy on penile shortening, Peyronie’s disease, and inguinal hernia sequelae. Eur Urol Open Sci 22:17–22. 10.1016/j.euros.2020.09.004
18. Barakat B, Othman H, Gauger U et al (2022) Retzius sparing radical prostatectomy versus robot-assisted radical prostatectomy: which technique is more beneficial for prostate cancer patients (master study)? A systematic review and meta-analysis. Eur Urol Focus 8(4):1060–1071. 10.1016/j.euf.2021.08.003
19. Rosenberg JE, Jung JH, Edgerton Z et al (2021) Retzius-sparing versus standard robot-assisted laparoscopic prostatectomy for the treatment of clinically localized prostate cancer. BJU Int 128(1):12–20. 10.1111/bju.15385
20. Overgård M, Angelsen A, Lydersen S, Mørkved S (2008) Does physiotherapist-guided pelvic floor muscle training reduce urinary incontinence after radical prostatectomy? Eur Urol 54(2):438–448. 10.1016/j.eururo.2008.04.021
21. Hsu L, Liao Y, Lai F, Tsai P (2016) Beneficial effects of biofeedback-assisted pelvic floor muscle training in patients with urinary incontinence after radical prostatectomy: a systematic review and metaanalysis. Int J Nurs Stud 60:99–111. 10.1016/j.ijnurstu.2016.03.013
22. Ribeiro LHS, Prota C, Gomes CM et al (2010) Long-term effect of early postoperative pelvic floor biofeedback on continence in men undergoing radical prostatectomy: a prospective, randomized, controlled trial. J Urol 184(3):1034–1039. 10.1016/j.juro.2010.05.040
23. Modig KK, Godtman RA, Carlsson S et al (2025) Patient- and procedure-specific risk factors for urinary incontinence after robot-assisted radical prostatectomy: a nationwide, population-based study. Eur Urol Oncol. 10.1016/j.euo.2025.03.015
24. Lardas M, Grivas N, Debray TPA et al (2022) Patient- and tumour-related prognostic factors for urinary incontinence after radical prostatectomy for nonmetastatic prostate cancer: a systematic review and meta-analysis. Eur Urol Focus 8(3):674–689. 10.1016/j.euf.2021.04.020
25. Yu Y, Zhang S, Xiong X et al (2025) Preoperative membranous urethra length and urinary continence following radical prostatectomy: a systematic review and meta-analysis. Int J Surg 111(8):5502–5517. 10.1097/JS9.0000000000002600
26. Negrean C, Alam A, Hickling D et al (2025) Preoperative magnetic resonance imaging membranous urethral length as a predictor of urinary continence after radical prostatectomy: a systematic review and meta-analysis. Eur Urol Focus. 10.1016/j.euf.2025.02.002
27. Nilsson AE, Schumacher MC, Johansson E et al (2011) Age at surgery, educational level and long-term urinary incontinence after radical prostatectomy. BJU Int 108(10):1572–1577. 10.1111/j.1464-410X.2011.10231.x
28. Mandel PMD, Graefen MMD, Michl UMD, Huland HMD, Tilki DMD (2015) The effect of age on functional outcomes after radical prostatectomy. Urol Oncol 33(5):203–211. 10.1016/j.urolonc.2015.01.015
29. Kilic S, Sambel M (2025) Impact of obesity on perioperative and clinical outcomes after robotic assisted radical prostatectomy. Sci Rep 15(1):225. 10.1038/s41598-024-82003-8
30. Wei Y, Wu Y, Lin M et al (2018) Impact of obesity on long-term urinary incontinence after radical prostatectomy: a meta-analysis. Biomed Res Int 2018:1–9. 10.1155/2018/8279523
31. Gupta NP, Singh P, Nayyar R (2011) Outcomes of robot-assisted radical prostatectomy in men with previous transurethral resection of prostate. BJU Int 108(9):1501–1505. 10.1111/j.1464-410X.2011.10113.x
32. Leyh-Bannurah S, Liakos N, Oelke M et al (2021) Perioperative and postoperative outcomes of robot-assisted radical prostatectomy in prostate cancer patients with prior transurethral subvesical deobstruction: results of a high-volume center. J Urol 206(2):308–318. 10.1097/JU.0000000000001776
33. Hagman A, Lantz A, Carlsson S et al (2021) Urinary continence recovery and oncological outcomes after surgery for prostate cancer analysed by risk category: results from the laparoscopic prostatectomy robot and open trial. World J Urol 39(9):3239–3249. 10.1007/s00345-021-03662-0
34. Chen W, Lee YK, Kuo H, Wang J, Jiang Y (2023) Oncological and functional outcomes of high-risk and very high-risk prostate cancer patients after robot-assisted radical prostatectomy. PLoS ONE 18(3):e282494. 10.1371/journal.pone.0282494
35. Wu X, Wong CH, Gandaglia G, Chiu PK (2023) Urinary continence in high-risk prostate cancer after robot-assisted radical prostatectomy. Curr Opin Urol 33(6):482–487. 10.1097/MOU.0000000000001127
36. Freire MP, Weinberg AC, Lei Y et al (2009) Anatomic bladder neck preservation during robotic-assisted laparoscopic radical prostatectomy: description of technique and outcomes. Eur Urol 56(6):972–980. 10.1016/j.eururo.2009.09.017
37. Heo JE, Lee JS, Goh HJ, Jang WS, Choi YD (2020) Urethral realignment with maximal urethral length and bladder neck preservation in robot-assisted radical prostatectomy: urinary continence recovery. PLoS ONE 15(1):e227744. 10.1371/journal.pone.0227744
38. Geyik S, Onder Yilmaz I, Zubaroglu M et al (2025) Prediction of severe erectile dysfunction after penile fracture repair: machine learning analysis results from the reconstruction and trauma working group of the society of urological surgery (RAT-SUS). Sex Med 13(6):qfaf101. 10.1093/sexmed/qfaf101
39. Taşkıran AT, Balık AY, Başaran E, Baba D, Kayıkçı MA (2026) Clinical reasoning with machines: evaluating the interpretive depth of AI in urological case assessments. BMC Urol. 10.1186/s12894-026-02048-x
40. Abrams P, Andersson KE, Apostolidis A et al (2018) 6th international consultation on incontinence. Recommendations of the international scientific committee: evaluation and treatment of urinary incontinence, pelvic organ prolapse and faecal incontinence. Neurourol Urodyn 37(7):2271–2272. 10.1002/nau.23551

Articles from Journal of Robotic Surgery are provided here courtesy of Springer
