American Journal of Translational Research. 2025 Apr 15;17(4):2614–2628. doi: 10.62347/CZYA6232

Regression analysis and validation of risk factors for upper limb dysfunction following modified radical mastectomy for breast cancer patients

Yonggang Li 1, Shuan Hui 1
PMCID: PMC12082522  PMID: 40385071

Abstract

Objective: To develop and validate a predictive tool using machine learning models for identifying risk factors for upper limb dysfunction following modified radical mastectomy (MRM) in breast cancer patients. Methods: A total of 768 breast cancer patients who underwent MRM between January 2022 and December 2023 were included in this study. The dataset was divided into a training set (506 cases) and a validation set (262 cases). The collected data encompassed demographic characteristics, clinicopathological features, medical history, and postoperative rehabilitation plans. Predictive analyses were conducted using five machine learning models: support vector machine (SVM), extreme gradient boosting (XGBOOST), Gaussian naïve Bayes (GNB), adaptive boosting (ADABOOST), and random forest. Models were evaluated using ten-fold cross-validation, with performance metrics including receiver operating characteristic (ROC) curves, area under the curve (AUC) values, specificity, sensitivity, accuracy, and F1-score. DeLong’s test was used to compare AUC values and identify the optimal predictive model. Results: Baseline characteristics showed no significant differences between the training and validation sets (P>0.05). In the training set, patients with and without upper limb dysfunction differed significantly in age, BMI, cancer type, axillary lymph node dissection, ipsilateral radiotherapy, postoperative rehabilitation plans, and monthly per capita household income (P<0.05). Correlations among these variables were low (R values close to 0), indicating minimal multicollinearity. The XGBOOST and random forest models demonstrated high AUC values (95% CI bounds 0.817-0.884) in both the training and validation sets. These models also exhibited higher specificity and sensitivity than the other models, indicating strong and robust performance in identifying patients at risk of postoperative upper limb dysfunction. Conclusion: The XGBOOST and random forest models exhibited excellent predictive accuracy, offering valuable tools for the early identification and personalized management of high-risk patients. These models provide data support for postoperative rehabilitation planning and may contribute to improving the quality of life of breast cancer patients.

Keywords: Modified radical mastectomy, upper limb dysfunction, machine learning models, risk prediction, XGBOOST

Introduction

According to the World Health Organization, approximately 2.3 million women worldwide were diagnosed with breast cancer in 2022, with around 670,000 deaths attributed to the disease [1]. Breast cancer is the most prevalent malignancy among women globally and remains a leading cause of cancer-related mortality in this population [2]. Recent advancements in breast cancer screening and treatment, particularly early detection and personalized therapies, have significantly improved survival rates, enabling many patients to achieve long-term survival [3]. Despite the decline in mortality, the persistently high incidence of breast cancer continues to impose a substantial health burden, with increasing attention being directed toward the quality of life after treatment [4]. For breast cancer survivors, overcoming the disease is only the first step; managing the long-term impact on postoperative quality of life represents an ongoing challenge.

Modified radical mastectomy (MRM) is a widely employed treatment for breast cancer, offering high cure rates but often resulting in postoperative complications, particularly upper limb dysfunction [5]. Procedures such as axillary lymph node dissection can damage the nervous and lymphatic systems, leading to pain, swelling, and restricted movement in the affected limb [6]. This dysfunction not only interferes with daily activities and self-care but can also result in chronic lymphedema, increased infection risk, prolonged recovery periods, and elevated healthcare costs [7]. Additionally, long-term physical dysfunction and the associated need for ongoing care can negatively affect patients’ psychological well-being, increasing the risk of anxiety and depression [8]. Therefore, identifying and addressing risk factors for upper limb dysfunction is critical for improving postoperative rehabilitation outcomes and the overall quality of life for breast cancer patients.

In recent years, machine learning models have gained increasing attention for their ability to predict disease complications and recurrence, particularly in identifying high-risk patients and optimizing personalized treatment plans [9]. These models have been successfully applied to predict outcomes in cardiovascular disease, cancer metastasis, and common postoperative complications, demonstrating promising results [10,11]. In the context of breast cancer postoperative management, machine learning has been used to assess recurrence risks and address various health concerns, providing valuable data for individualized follow-up plans [12]. However, no systematic studies have specifically focused on predicting upper limb dysfunction following MRM for breast cancer. Existing research has primarily addressed general postoperative complications and has neglected the unique challenges posed by upper limb dysfunction, a complex issue influenced by preoperative, intraoperative, and postoperative factors [13].

This study aims to bridge this gap by applying and comparing several machine learning algorithms to predict upper limb dysfunction after MRM, thereby supporting early intervention and risk management. Early identification of high-risk patients will facilitate personalized management strategies to mitigate these complications. The novelty of this study lies in the first systematic application of machine learning models to predict upper limb dysfunction, offering risk assessments that can improve postoperative rehabilitation and enhance quality of life for breast cancer survivors.

Methods and materials

Participants

This study included patients who underwent MRM for breast cancer at The First People’s Hospital of Xianyang between January 2022 and December 2023. All patients were preoperatively diagnosed with breast cancer and had at least six months of postoperative follow-up.

Inclusion criteria: female breast cancer patients aged 18 years or older, diagnosed with breast cancer through pathological examination [14], and having confirmed indications for MRM. Patients were required to have a minimum of six months of follow-up, the ability to complete upper limb dysfunction assessments, and comprehensive clinical and follow-up data, including baseline information and treatment details.

Exclusion criteria: patients with severe cardiovascular or cerebrovascular disease, hepatic or renal insufficiency, or other comorbidities affecting quality of life; those diagnosed with other malignancies or with uncontrolled major diseases; patients with neurological or musculoskeletal diseases (e.g., stroke, Parkinson’s disease, rheumatoid arthritis) that could impair upper limb function; pregnant or breastfeeding women, due to potential physiological impact on functional recovery; and patients with a history of breast surgery or axillary lymph node dissection.

A total of 768 patients were included in this study, divided into a training set (506 cases) and a validation set (262 cases). Details of the data distribution are provided in Table 1. This study was conducted with the approval of the First People’s Hospital of Xianyang Medical Ethics Committee.

Table 1.

Patient baseline characteristics

Variable Count Proportion
Age (years)
    18-40 192 0.25
    41-65 259 0.3372
    >65 317 0.4128
BMI (kg/m2)
    18-22.9 354 0.4609
    23-25 221 0.2878
    >25 193 0.2513
Disease Type
    Initial Diagnosis 701 0.9128
    Recurrence 67 0.0872
Cancer Type
    Ductal Carcinoma in Situ 154 0.2005
    Invasive Ductal Carcinoma 504 0.6563
    Other 110 0.1432
Axillary Lymph Node Dissection
    Yes 694 0.9036
    No 74 0.0964
Ipsilateral Radiotherapy
    Yes 298 0.388
    No 470 0.612
Neoadjuvant Chemotherapy
    Yes 434 0.5651
    No 334 0.4349
Diabetes History
    Yes 83 0.1081
    No 685 0.8919
Hypertension History
    Yes 130 0.1693
    No 638 0.8307
Smoking History
    Yes 202 0.263
    No 566 0.737
Alcohol Use History
    Yes 48 0.0625
    No 720 0.9375
Postoperative Rehabilitation Plan
    Yes 643 0.8372
    No 125 0.1628
Marital Status
    Married 655 0.8529
    Unmarried 71 0.0924
    Other 42 0.0547
Education Level
    ≤ Junior High School 263 0.3424
    High School 346 0.4505
    ≥ College 159 0.207
Monthly Household Income (CNY)
    <3000 370 0.4818
    3000-4500 220 0.2865
    >4500 178 0.2318
ER Status
    Positive 513 0.668
    Negative 255 0.332
PR Status
    Positive 419 0.5456
    Negative 349 0.4544
HER2 Status
    Positive 147 0.1914
    Negative 621 0.8086

Note: ER, Estrogen Receptor; PR, Progesterone Receptor; BMI, Body Mass Index; HER2, Human Epidermal Growth Factor Receptor 2.

Criteria for upper limb dysfunction assessment

Upper limb function was assessed using the Rowe Shoulder Score [15], a widely used tool for evaluating different types and stages of upper limb function. The score ranges from 0 to 100 and is divided into four grading levels: ≤50 as poor, 51-74 as fair, 75-89 as good, and 90-100 as excellent. Higher scores indicate better shoulder function. In this study, a score below 75 was defined as upper limb dysfunction.
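
The grading bands and the study's cut-off can be written down as a small sketch. This is illustrative Python, not code from the study; the function names `rowe_grade` and `has_dysfunction` are hypothetical, and integer scores are assumed so that the 50/51, 74/75, and 89/90 boundaries are unambiguous.

```python
def rowe_grade(score: float) -> str:
    """Map a Rowe Shoulder Score (0-100) to the four grades described above."""
    if not 0 <= score <= 100:
        raise ValueError("Rowe score must lie between 0 and 100")
    if score <= 50:
        return "poor"
    if score < 75:
        return "fair"
    if score < 90:
        return "good"
    return "excellent"


def has_dysfunction(score: float, cutoff: float = 75) -> bool:
    """Apply the study's definition: a score below 75 counts as dysfunction."""
    return score < cutoff
```

Under this definition, the "poor" and "fair" grades together make up the dysfunction group, while "good" and "excellent" do not.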

Data collection

Data collected included baseline and treatment-related information for each patient. Demographic characteristics included age (18-40, 41-65, >65 years), BMI (18-22.9, 23-25, >25 kg/m2), marital status (married, unmarried, others such as divorced or widowed), education level (≤ junior high school, high school, ≥ college), and monthly per capita household income (<3000, 3000-4500, >4500 CNY). Clinical and pathological characteristics included cancer type (ductal carcinoma in situ, invasive ductal carcinoma, others), axillary lymph node dissection (yes/no), ipsilateral radiotherapy (yes/no), and neoadjuvant chemotherapy (yes/no). Medical history included diabetes (yes/no), hypertension (yes/no), smoking (yes/no), and alcohol use (yes/no). Rehabilitation information included whether a postoperative rehabilitation plan was implemented (yes/no). Upper limb function was assessed six months postoperatively, with patients scoring below 75 classified as having dysfunction. This systematic data collection provided a comprehensive basis for constructing machine learning models and analyzing key risk factors.

Data preprocessing

Categorical variables were converted into dummy variables to accommodate the requirements of machine learning models. Baseline characteristics of the training and validation sets were statistically tested to ensure balanced characteristics between the two sets.
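
As a rough illustration of dummy coding, the sketch below encodes the age categories as 0/1 indicator columns. It is written in plain Python for readability; the study itself used R, and the paper does not say whether a reference level was dropped, so `drop_first=True` and the helper name `dummy_encode` are assumptions made here.

```python
def dummy_encode(values, levels, drop_first=True):
    """Return one row of 0/1 indicators per value.

    `levels` fixes the column order; with drop_first=True the first level
    becomes the reference category and gets no column of its own.
    """
    kept = levels[1:] if drop_first else levels
    rows = []
    for v in values:
        if v not in levels:
            raise ValueError(f"unexpected level: {v!r}")
        rows.append([1 if v == level else 0 for level in kept])
    return kept, rows


age_levels = ["18-40", "41-65", ">65"]
columns, rows = dummy_encode(["41-65", ">65", "18-40"], age_levels)
# columns -> ["41-65", ">65"]; a patient aged 18-40 is encoded as [0, 0]
```

Dropping one level avoids perfectly collinear columns, which matters for some models (for example regression-type models) more than for tree-based ones.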

Model construction

Five machine learning models were used to predict the risk of upper limb dysfunction: support vector machine (SVM), extreme gradient boosting (XGBOOST), Gaussian naïve Bayes (GNB), adaptive boosting (ADABOOST), and random forest. All models were trained with ten-fold cross-validation to enhance robustness and generalizability. For SVM, a radial basis function (RBF) kernel was used, and the regularization parameter (C) and kernel parameter (γ) were tuned through cross-validation to balance model complexity and classification accuracy. For XGBOOST, the learning rate (eta) was set to 0.1 to limit overfitting, the maximum tree depth (max_depth) to 6 to control tree complexity, the proportion of samples drawn per iteration (subsample) to 0.7, and the proportion of features sampled per tree (colsample_bytree) to 0.8; the number of boosting iterations (100) was determined by ten-fold cross-validation. For GNB, which is suitable for binary prediction tasks, the Laplace smoothing parameter was set to 0 to maintain the Gaussian distribution assumption. For ADABOOST, 50 iterations (n_estimators), determined through cross-validation to balance training time and accuracy, were used with a learning rate of 1. For random forest, the number of trees (ntree) was chosen via ten-fold cross-validation to minimize the out-of-bag (OOB) error rate, and the number of variables tried at each split (mtry) was set to the square root of the total number of features to control overfitting.
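
The ten-fold cross-validation scheme mentioned above can be sketched as follows. This is a minimal, library-free illustration in Python rather than the authors' R/caret workflow; the function names, the fixed seed, and the unstratified shuffling are all assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding gives fold sizes that differ by at most one.
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each fold is held out exactly once."""
    folds = k_fold_indices(n_samples, k, seed)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, test


splits = list(train_test_splits(506))  # 506 = training-set size in this study
```

With 506 samples, each held-out fold contains 50 or 51 patients, and every patient is used for testing exactly once. In practice a stratified split, which keeps the dysfunction rate similar across folds, is often preferred for binary outcomes.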

Statistical analysis

Data analysis was conducted using SPSS version 26.0. The normality of continuous variables was assessed using the Kolmogorov-Smirnov (K-S) test. Data were expressed as mean ± standard deviation (SD) for normally distributed variables, and as median with interquartile range (IQR) for non-normally distributed variables. Categorical data were presented as frequencies, and group comparisons were conducted using the chi-square test. A P-value <0.05 was considered statistically significant.
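
For a 2x2 comparison, the chi-square test can be reproduced by hand. The sketch below is illustrative Python, not the SPSS procedure used in the study; it computes the Pearson statistic without continuity correction and uses the closed-form upper-tail probability that holds only for one degree of freedom. The counts are the upper limb dysfunction (yes/no) counts for the validation and training sets.

```python
import math


def pearson_chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail p-value has a closed form.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: dysfunction yes / no; columns: validation set / training set.
statistic, p_value = pearson_chi_square_2x2([[90, 201], [172, 305]])
```

Here the statistic comes out at about 2.117 with P of about 0.146, which agrees with the values reported for this comparison and suggests that the uncorrected Pearson test was used. Variables with three categories have two degrees of freedom and would need the general chi-square distribution instead.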

Model performance evaluation was carried out using R software (version 4.3.3, released February 2024). Receiver operating characteristic (ROC) curve plotting and area under the curve (AUC) calculation were performed using the pROC package, while visualization was conducted using ggplot2. Data preprocessing and model building were conducted using caret and data.table packages. Performance metrics included ROC curve, AUC, specificity, sensitivity, accuracy, and F1-Score. DeLong’s test was used to compare AUC values across models to identify the optimal predictive model.
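
To make the AUC concrete: it equals the probability that a randomly chosen patient with dysfunction receives a higher predicted risk than a randomly chosen patient without it. The sketch below computes it by direct pair counting in Python; it is an illustration of the definition, not the pROC implementation, and the labels and scores are hypothetical.

```python
def auc_from_scores(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair,
    which matches the trapezoidal area under the empirical ROC curve.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = dysfunction) and predicted risks, for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = auc_from_scores(labels, scores)  # 8 of 9 pairs are ordered correctly
```

Pair counting is quadratic in the sample size, which is fine for a few hundred patients; rank-based formulas give the same value more efficiently on larger datasets.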

Results

Comparison of baseline characteristics between patient groups

Baseline characteristics did not differ significantly between the training and validation sets for any variable. Specifically, upper limb dysfunction (P=0.146), age (P=0.383), BMI (P=0.679), disease type (P=0.758), cancer type (P=0.428), axillary lymph node dissection (P=0.274), ipsilateral radiotherapy (P=0.834), neoadjuvant chemotherapy (P=0.993), history of diabetes (P=0.938), hypertension (P=0.377), smoking history (P=0.379), alcohol consumption history (P=0.665), postoperative rehabilitation plan (P=0.338), marital status (P=0.974), education level (P=0.914), household income per capita (P=0.971), ER status (P=0.747), PR status (P=0.354), and HER2 status (P=0.319) showed no statistical differences between the two groups (see Table 2).

Table 2.

Comparison of baseline characteristics between validation and training sets

Variable Validation set (n=262) Training set (n=506) Statistic P-value
Upper Limb Dysfunction
    Yes 90 201 2.117 0.146
    No 172 305
Age (years)
    18-40 72 120 1.919 0.383
    41-65 81 178
    >65 109 208
BMI (kg/m2)
    18-22.9 120 234 0.775 0.679
    23-25 80 141
    >25 62 131
Disease Type
    Initial Diagnosis 238 463 0.095 0.758
    Recurrence 24 43
Cancer Type
    Ductal Carcinoma in Situ 58 96 1.699 0.428
    Invasive Ductal Carcinoma 171 333
    Other 33 77
Axillary Lymph Node Dissection
    Yes 241 453 1.199 0.274
    No 21 53
Ipsilateral Radiotherapy
    Yes 103 195 0.044 0.834
    No 159 311
Neoadjuvant Chemotherapy
    Yes 148 286 <0.001 0.993
    No 114 220
Diabetes History
    Yes 28 55 0.006 0.938
    No 234 451
Hypertension History
    Yes 40 90 0.779 0.377
    No 222 416
Smoking History
    Yes 74 128 0.774 0.379
    No 188 378
Alcohol Use History
    Yes 15 33 0.187 0.665
    No 247 473
Postoperative Rehabilitation Plan
    Yes 224 419 0.917 0.338
    No 38 87
Marital Status
    Married 223 432 0.052 0.974
    Unmarried 24 47
    Other 15 27
Education Level
    ≤ Junior High School 91 172 0.180 0.914
    High School 119 227
    ≥ College 52 107
Monthly Household Income (CNY)
    <3000 125 245 0.059 0.971
    3000-4500 75 145
    >4500 62 116
ER Status
    Positive 177 336 0.104 0.747
    Negative 85 170
PR Status
    Positive 149 270 0.858 0.354
    Negative 113 236
HER2 Status
    Positive 45 102 0.992 0.319
    Negative 217 404

Note: ER, Estrogen Receptor; PR, Progesterone Receptor; BMI, Body Mass Index; HER2, Human Epidermal Growth Factor Receptor 2.

Comparison of baseline characteristics between patients with and without upper limb dysfunction in the training set

In the training set, significant differences were observed between patients with and without upper limb dysfunction for several variables. Specifically, age (P<0.001), BMI (P<0.001), cancer type (P=0.001), axillary lymph node dissection (P=0.036), ipsilateral radiotherapy (P=0.011), postoperative rehabilitation plan (P<0.001), and household income per capita (P=0.008) were significantly different. Other variables, including disease type (P=0.532), neoadjuvant chemotherapy (P=0.163), history of diabetes (P=0.805), hypertension history (P=0.677), smoking history (P=0.311), alcohol consumption history (P=0.253), marital status (P=0.103), education level (P=0.365), ER (P=0.928), PR (P=0.387), and HER2 (P=0.569), showed no significant differences (see Table 3).

Table 3.

Comparison of baseline characteristics between patients with and without upper limb dysfunction in the training set

Variable Non-Dysfunction Group (n=305) Dysfunction Group (n=201) Statistic P-value
Age (years)
    18-40 80 40 28.274 <0.001
    41-65 128 50
    >65 97 111
BMI (kg/m2)
    18-22.9 152 82 13.883 <0.001
    23-25 92 49
    >25 61 70
Disease Type
    Initial Diagnosis 281 182 0.391 0.532
    Recurrence 24 19
Cancer Type
    Ductal Carcinoma in Situ 67 29 13.011 0.001
    Invasive Ductal Carcinoma 182 151
    Other 56 21
Axillary Lymph Node Dissection
    Yes 266 187 4.379 0.036
    No 39 14
Ipsilateral Radiotherapy
    Yes 104 91 6.388 0.011
    No 201 110
Neoadjuvant Chemotherapy
    Yes 180 106 1.944 0.163
    No 125 95
Diabetes History
    Yes 34 21 0.061 0.805
    No 271 180
Hypertension History
    Yes 56 34 0.173 0.677
    No 249 167
Smoking History
    Yes 82 46 1.026 0.311
    No 223 155
Alcohol Use History
    Yes 23 10 1.308 0.253
    No 282 191
Postoperative Rehabilitation Plan
    Yes 267 152 12.089 <0.001
    No 38 49
Marital Status
    Married 265 167 4.549 0.103
    Unmarried 29 18
    Other 11 16
Education Level
    ≤ Junior High School 111 61 2.015 0.365
    High School 131 96
    ≥ College 63 44
Monthly Household Income (CNY)
    <3000 153 92 9.590 0.008
    3000-4500 96 49
    >4500 56 60
ER Status
    Positive 203 133 0.008 0.928
    Negative 102 68
PR Status
    Positive 158 112 0.747 0.387
    Negative 147 89
HER2 Status
    Positive 64 38 0.325 0.569
    Negative 241 163

Note: ER, Estrogen Receptor; PR, Progesterone Receptor; BMI, Body Mass Index; HER2, Human Epidermal Growth Factor Receptor 2.

Correlation analysis of significant variables in the dysfunction group in the training set

Correlation analysis of variables with significant differences in the training set revealed low correlations among them, with correlation coefficients (R values) close to 0. The correlation between the postoperative rehabilitation plan and age was the highest (R=0.07), indicating a slight positive correlation, while other variables exhibited even lower correlations. These results support the independence of these variables and provide a robust basis for model construction (see Figure 1).

Figure 1.

Correlation analysis of significant variables between the dysfunction and non-dysfunction groups. Note: BMI, Body Mass Index.

ROC curve and performance evaluation of machine learning models

In the training set, five machine learning models (SVM, XGBOOST, GNB, ADABOOST, and Random Forest) were evaluated for predictive performance. Based on the ROC curves and AUC values, the Random Forest and XGBOOST models demonstrated superior predictive performance, with AUC 95% confidence intervals of 0.817-0.884 and 0.817-0.883, respectively. Specifically, the Random Forest model achieved a specificity of 76.39%, a sensitivity of 77.61%, and a Youden index of 54.01%, making it the top performer among the five models. The XGBOOST model achieved a specificity of 75.41% and a sensitivity of 76.62%, and also performed well in terms of accuracy and F1-score. DeLong’s test revealed no statistically significant difference in AUC between the Random Forest and XGBOOST models (P=0.9684), both of which significantly outperformed the other models (see Tables 4, 5; Figure 2A).

Table 4.

ROC curve parameters of the 5 machine learning models in the training set

Model AUC (95% CI) Specificity Sensitivity Youden index Accuracy Precision F1-score
SVM 0.725-0.810 63.28% 80.10% 43.38% 69.96% 80.10% 67.93%
XGBOOST 0.817-0.883 75.41% 76.62% 52.03% 75.89% 76.62% 71.63%
GNB 0.685-0.773 59.67% 75.62% 35.29% 66.01% 75.62% 63.87%
ADABOOST 0.751-0.830 75.08% 69.65% 44.73% 72.92% 69.65% 67.15%
Random forest 0.817-0.884 76.39% 77.61% 54.01% 76.88% 77.61% 72.73%

Note: SVM, Support Vector Machine; XGBOOST, Extreme Gradient Boosting; GNB, Gaussian Naive Bayes; ADABOOST, Adaptive Boosting.
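
The threshold-based metrics in this table all derive from the four cells of a confusion matrix. The sketch below shows the formulas in Python. The paper does not report confusion-matrix counts, so the counts used here are assumptions: they were chosen so that sensitivity and specificity match the Random Forest row, given the 201 dysfunction and 305 non-dysfunction cases in the training set.

```python
def classification_metrics(tp, fn, tn, fp):
    """Threshold-based metrics from the four cells of a confusion matrix."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden_index": sensitivity + specificity - 1,
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "precision": precision,
        "f1_score": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Assumed counts (not reported in the paper): 156 of 201 dysfunction cases and
# 233 of 305 non-dysfunction cases classified correctly.
metrics = classification_metrics(tp=156, fn=45, tn=233, fp=72)
```

With these assumed counts, sensitivity (77.61%), specificity (76.39%), the Youden index (54.01%), accuracy (76.88%), and F1-score (72.73%) all agree with the Random Forest row. Precision computed as TP/(TP+FP) comes out at about 68.42%, whereas the Precision column repeats the sensitivity value in every row, so that column may be worth checking against the original publication.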

Table 5.

Comparison of AUCs of the 5 machine learning models in the training set

Variable 1 Variable 2 Statistic P-value Test Method Direction
SVM XGBOOST -5.931 <0.001 DeLong’s test Consistent
SVM GNB 2.361 0.018 DeLong’s test Consistent
SVM ADABOOST -1.952 0.051 DeLong’s test Consistent
SVM Random Forest -5.308 <0.001 DeLong’s test Consistent
XGBOOST GNB 7.381 <0.001 DeLong’s test Consistent
XGBOOST ADABOOST 6.205 <0.001 DeLong’s test Consistent
XGBOOST Random Forest -0.040 0.968 DeLong’s test Consistent
GNB ADABOOST -5.069 <0.001 DeLong’s test Consistent
GNB Random Forest -6.387 <0.001 DeLong’s test Consistent
ADABOOST Random Forest -4.645 <0.001 DeLong’s test Consistent

Note: SVM, Support Vector Machine; XGBOOST, Extreme Gradient Boosting; GNB, Gaussian Naive Bayes; ADABOOST, Adaptive Boosting.

Figure 2.

ROC curves of the 5 machine learning models in training and validation sets. A. Training set ROC curves. B. Validation set ROC curves. Note: SVM, Support Vector Machine; XGBOOST, Extreme Gradient Boosting; GNB, Gaussian Naive Bayes; ADABOOST, Adaptive Boosting.

Performance evaluation of models in the validation set

In the validation set, the relative performance of the five models remained consistent with the training set. The Random Forest model continued to exhibit strong predictive capability, with an AUC 95% confidence interval of 0.817-0.884, specificity of 76.39%, and sensitivity of 77.61%, indicating good generalization ability. Similarly, the XGBOOST model maintained a high AUC (95% CI: 0.817-0.883) in the validation set, with specificity and sensitivity of 75.41% and 76.62%, respectively. DeLong’s test showed no significant difference in AUC between the Random Forest and XGBOOST models in the validation set (P=0.919). Both models significantly outperformed GNB and ADABOOST; compared with SVM, the difference was significant for XGBOOST (P=0.033) but not for Random Forest (P=0.131). Taken together, the two models showed the strongest predictive performance across the training and validation sets and were considered the optimal models for predicting upper limb dysfunction in this study (see Tables 6, 7; Figure 2B).

Table 6.

ROC curve parameters of the 5 machine learning models in the validation set

Model AUC (95% CI) Specificity Sensitivity Youden index Accuracy Precision F1-score
SVM 0.725-0.810 63.28% 80.10% 43.38% 69.96% 80.10% 67.93%
XGBOOST 0.817-0.883 75.41% 76.62% 52.03% 75.89% 76.62% 71.63%
GNB 0.685-0.773 59.67% 75.62% 35.29% 66.01% 75.62% 63.87%
ADABOOST 0.751-0.830 75.08% 69.65% 44.73% 72.92% 69.65% 67.15%
Random forest 0.817-0.884 76.39% 77.61% 54.01% 76.88% 77.61% 72.73%

Note: SVM, Support Vector Machine; XGBOOST, Extreme Gradient Boosting; GNB, Gaussian Naive Bayes; ADABOOST, Adaptive Boosting.

Table 7.

Comparison of AUCs of the 5 machine learning models in the validation set

Variable 1 Variable 2 Statistic P-value Test Method
SVM XGBOOST 2.130 0.033 DeLong’s test
SVM GNB -1.956 0.050 DeLong’s test
SVM ADABOOST -0.367 0.713 DeLong’s test
SVM Random forest 1.508 0.131 DeLong’s test
XGBOOST GNB -3.534 <0.001 DeLong’s test
XGBOOST ADABOOST -3.467 <0.001 DeLong’s test
XGBOOST Random forest -0.101 0.919 DeLong’s test
GNB ADABOOST 2.171 0.029 DeLong’s test
GNB Random forest 3.681 <0.001 DeLong’s test
ADABOOST Random forest 2.099 0.035 DeLong’s test

Note: SVM, Support Vector Machine; XGBOOST, Extreme Gradient Boosting; GNB, Gaussian Naive Bayes; ADABOOST, Adaptive Boosting.

Calibration curves of machine learning models

The calibration curves for the XGBOOST and Random Forest models showed good fit in both the training and validation sets (see Figures 3, 4). In the training set (n=506), the XGBOOST model’s predicted probabilities closely matched the observed probabilities, with a mean absolute error (MAE) of 0.017. Stability in calibration performance was observed with 1,000 bootstrap repetitions. In the validation set (n=262), the calibration performance of the XGBOOST model slightly declined, but the predicted and observed probabilities remained close, with an MAE of 0.034, indicating good calibration on new data. The Random Forest model achieved an MAE of 0.022 in the training set, with high consistency between predicted and observed probabilities. In the validation set, the MAE was 0.02, further demonstrating the model’s low error rate and stable calibration across both the training and validation sets.
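
One simple way to summarize calibration as a mean absolute error is to compare mean predicted risk with the observed event rate within groups of patients. The Python sketch below does this with equal-sized bins. This is an assumption for illustration only: the paper does not state how its MAE was computed, and calibration routines based on smoothed curves with bootstrap resampling will give different numbers. The data are hypothetical.

```python
def calibration_mae(labels, probabilities, n_bins=5):
    """Mean absolute gap between mean predicted risk and observed event rate.

    Samples are sorted by predicted probability and cut into equal-sized bins;
    the gap is averaged over bins, weighted by bin size.
    """
    pairs = sorted(zip(probabilities, labels))
    n = len(pairs)
    total_gap = 0.0
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(y for _, y in chunk) / len(chunk)
        total_gap += len(chunk) * abs(mean_predicted - observed_rate)
    return total_gap / n


# Hypothetical data: five patients predicted at 0.2 (1 event) and five at 0.8 (3 events).
probabilities = [0.2] * 5 + [0.8] * 5
labels = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]
mae = calibration_mae(labels, probabilities, n_bins=2)
```

In this toy example the low-risk group is perfectly calibrated (predicted 0.2, observed 0.2) and the high-risk group is over-predicted (predicted 0.8, observed 0.6), giving an overall error of about 0.1.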

Figure 3.

Calibration curves for XGBOOST model in training and validation sets. A. Calibration curve for training set. B. Calibration curve for validation set. Note: XGBOOST, Extreme Gradient Boosting.

Figure 4.

Calibration curves for Random Forest model in training and validation sets. A. Calibration curve for training set. B. Calibration curve for validation set.

Discussion

This study utilized machine learning models to predict the risk of upper limb dysfunction following modified radical mastectomy (MRM) in breast cancer patients, identifying several variables significantly associated with functional impairment. These variables include age, BMI, cancer type, axillary lymph node dissection, ipsilateral radiotherapy, postoperative rehabilitation plans, and per capita household income. Among the five machine learning models evaluated, XGBOOST and Random Forest demonstrated superior performance in both the training and validation sets, offering a novel and effective approach for the early identification and management of postoperative functional impairments.

In univariate analysis, several factors were significantly associated with upper limb dysfunction. Age emerged as a critical determinant of postoperative recovery. Carr et al. [16] reported that breast cancer patients undergoing mastectomy, compared to breast-conserving treatment, faced a higher risk of upper limb dysfunction, with contributing factors including ipsilateral radiotherapy, surgical site, and specific cancer types. Our findings align with these results, highlighting that older patients are more likely to experience recovery challenges due to decreased muscle strength, reduced joint flexibility, and overall physical function decline. Similarly, BMI was identified as a significant risk factor, suggesting that a higher BMI can adversely affect healing and movement capabilities. Zheng et al. [17] found that aggressive axillary lymph node dissection is a major risk factor for breast cancer-related lymphedema and upper limb dysfunction, suggesting that alternative strategies, such as regional lymph node irradiation, can substantially reduce lymphedema risk.

Cancer type also played a crucial role in postoperative function, as different cancer types can affect the surgical scope and complexity, thereby influencing the risk of functional impairment. Our findings show that axillary lymph node dissection significantly impacts upper limb function, likely due to nerve and lymphatic system involvement, which can cause postoperative pain, swelling, and mobility issues. Cocco et al. [18] suggested that sentinel lymph node biopsy combined with radiotherapy, instead of axillary dissection, may reduce postoperative complications without compromising survival in patients with limited axillary involvement. Ipsilateral radiotherapy was also found to contribute to tissue fibrosis and restricted mobility, consistent with findings by Mohite et al. [19], who reported that routine exercises, including scapular strengthening, significantly improved shoulder pain and dysfunction after MRM. Aboelnour et al. [20] further demonstrated the efficacy of scapular stabilization and graded elastic band exercises in enhancing shoulder mobility, reducing pain, and improving quality of life in patients with adhesive capsulitis. Additionally, per capita household income, as an indicator of socioeconomic status, was found to indirectly influence postoperative recovery by affecting patient access to resources and support, underscoring the importance of considering social factors in clinical practice.

Studies on upper limb dysfunction in other diseases, such as stroke, have identified similar risk factors. Holmes et al. [21] reported that significant predictors of post-stroke upper limb pain include diabetes, prior shoulder pain, and limited upper limb function, which parallels the nerve and muscle damage observed in breast cancer patients after surgery or radiotherapy. Furthermore, Snickars et al. [22] highlighted early predictors of upper limb dysfunction in post-stroke patients, including grip strength and finger extension, which may share similar physiological mechanisms with postoperative upper limb dysfunction in breast cancer patients.

Among the five machine learning models tested, XGBOOST and Random Forest achieved superior predictive performance, as evidenced by higher AUC, specificity, and sensitivity. XGBOOST, which iteratively optimizes errors through gradient boosting, effectively captures complex, nonlinear relationships among features. Chen et al. [23] demonstrated the utility of XGBOOST in predicting bleeding risk among elderly aspirin users, achieving high AUC and calibration, highlighting its ability to manage complex clinical variables. Random Forest, which constructs multiple decision trees using randomly sampled features, minimizes overfitting while maintaining strong generalizability. Su et al. [24] showed that XGBOOST performed exceptionally well in predicting knee osteoarthritis severity, further confirming its value in high-risk screening and personalized intervention. Similarly, Jin et al. [25] found that Random Forest outperformed logistic regression in sensitivity and AUC when predicting poor responses to neoadjuvant chemotherapy in breast cancer patients, supporting its advantages in breast cancer prognosis.

In contrast, SVM, GNB, and ADABOOST models exhibited slightly lower performance in both the training and validation sets. However, SVM has shown promise in other disease predictions. For example, Alsaykhan et al. [26] achieved high accuracy in detecting acute lymphoblastic leukemia using a hybrid model combining SVM and particle swarm optimization, demonstrating SVM’s strength in handling high-dimensional feature spaces. Similarly, Gong et al. [27] used SVM with evolutionary computation algorithms to achieve high accuracy and specificity in predicting acute ST-segment elevation myocardial infarction, showcasing its potential in processing nonlinear data.

The calibration curves for XGBOOST and Random Forest, both in the training and validation sets, demonstrated a strong alignment between predicted and observed probabilities, indicating good model calibration. Storås et al. [28] utilized machine learning model interpretation methods to analyze proteins associated with meibomian gland dysfunction severity, illustrating the ability of machine learning to accurately identify clinically relevant features while maintaining robust calibration in biomarker screening. Zhou et al. [29] validated the stability of an ADABOOST-based depression prediction model during COVID-19 quarantine, emphasizing its applicability in high-stakes public health scenarios.

Our findings suggest that the XGBOOST and Random Forest models have considerable potential for clinical application. Liang et al. [30] developed a naïve Bayes-based predictive model that excelled in identifying vascular calcification risk in type 2 diabetes patients, underscoring the value of machine learning in personalized risk assessment. Li et al. [31] demonstrated the efficacy of a Random Forest-based androgen receptor-related survival model in prostate cancer risk assessment, supporting its role in clinical decision-making for personalized treatment. Additionally, Ji et al. [32] achieved high AUC and accuracy using a GNB model to predict post-stroke cognitive impairment, reinforcing the importance of machine learning in early intervention for high-risk patients. The models developed in this study not only provide technical support for identifying patients at high risk of postoperative upper limb dysfunction but also lay the foundation for optimizing individualized intervention strategies, potentially improving recovery outcomes and quality of life.

The strengths of this study include its relatively large sample size (768 patients) and the use of ten-fold cross-validation, which limits the influence of any single data split and yields more stable performance estimates. Additionally, rigorous data collection and variable control enhanced the accuracy of the analyses. However, as a retrospective study, the research is susceptible to inherent selection bias, and the generalizability of the prediction models may be limited. Future studies should include larger, more diverse populations to validate these findings. Moreover, prospective data collection and the exploration of advanced algorithms, such as deep learning, could further improve the predictive accuracy and applicability of these models.
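The mechanics of ten-fold cross-validation can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the helper `stratified_k_folds` is hypothetical, stratification by outcome is assumed (the text does not state whether the folds were stratified), and the label counts in the example are invented.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def stratified_k_folds(
    labels: Sequence[Hashable], k: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k folds that keep the class ratio roughly constant."""
    if k < 2 or k > len(labels):
        raise ValueError("k must be between 2 and the number of samples")
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    position = 0  # running counter keeps overall fold sizes within one case of each other
    for indices in by_class.values():
        rng.shuffle(indices)
        for i in indices:
            folds[position % k].append(i)
            position += 1
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


if __name__ == "__main__":
    # Hypothetical outcome labels: 30 patients with dysfunction, 70 without.
    y = [1] * 30 + [0] * 70
    for fold, (train_idx, test_idx) in enumerate(stratified_k_folds(y, k=10, seed=42), start=1):
        positives = sum(y[i] for i in test_idx)
        print(f"fold {fold}: train={len(train_idx)} test={len(test_idx)} positives={positives}")
```

Each case is used for testing exactly once and for training in the other nine folds, and the reported metric is typically the mean across folds. scikit-learn's `StratifiedKFold` provides an equivalent splitter.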

Conclusion

The machine learning models developed in this study performed well in predicting the risk of upper limb dysfunction, with the XGBOOST and Random Forest models emerging as the top performers (AUC 0.817-0.884 across the training and validation sets). These models may provide useful technical support for the early identification and management of high-risk patients following breast cancer surgery, highlighting their potential for clinical application.

Disclosure of conflict of interest

None.

References

  • 1. Bray F, Laversanne M, Sung H, Ferlay J, Siegel RL, Soerjomataram I, Jemal A. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2024;74:229–263. doi: 10.3322/caac.21834.
  • 2. Giaquinto AN, Sung H, Newman LA, Freedman RA, Smith RA, Star J, Jemal A, Siegel RL. Breast cancer statistics 2024. CA Cancer J Clin. 2024;74:477–495. doi: 10.3322/caac.21863.
  • 3. US Preventive Services Task Force. Screening for breast cancer. JAMA. 2024;331:1973–1974.
  • 4. Sandoval JL, Franzoi MA, di Meglio A, Ferreira AR, Viansone A, André F, Martin AL, Everhard S, Jouannaud C, Fournier M, Rouanet P, Vanlemmens L, Dhaini-Merimeche A, Sauterey B, Cottu P, Levy C, Stringhini S, Guessous I, Vaz-Luis I, Menvielle G. Magnitude and temporal variations of socioeconomic inequalities in the quality of life after early breast cancer: results from the multicentric French CANTO cohort. J Clin Oncol. 2024;42:2908–2917. doi: 10.1200/JCO.23.02099.
  • 5. Aitken GL, Correa G, Samuels S, Gannon CJ, Llaguna OH. Assessment of textbook oncologic outcomes following modified radical mastectomy for breast cancer. J Surg Res. 2022;277:17–26. doi: 10.1016/j.jss.2022.03.018.
  • 6. Siqueira TC, Frágoas SP, Pelegrini A, de Oliveira AR, da Luz CM. Factors associated with upper limb dysfunction in breast cancer survivors. Support Care Cancer. 2021;29:1933–1940. doi: 10.1007/s00520-020-05668-7.
  • 7. Mahfouz FM, Li T, Joda M, Harrison M, Kumar S, Horvath LG, Grimison P, King T, Goldstein D, Park SB. Upper-limb dysfunction in cancer survivors with chemotherapy-induced peripheral neurotoxicity. J Neurol Sci. 2024;457:122862. doi: 10.1016/j.jns.2023.122862.
  • 8. Roldán-Jiménez C, Martín-Martín J, Pajares B, Ribelles N, Alba E, Cuesta-Vargas AI. Factors associated with upper limb function in breast cancer survivors. PM R. 2023;15:151–156. doi: 10.1002/pmrj.12731.
  • 9. Cheng G, Xu J, Wang H, Chen J, Huang L, Qian ZR, Fan Y. mtPCDI: a machine learning-based prognostic model for prostate cancer recurrence. Front Genet. 2024;15:1430565. doi: 10.3389/fgene.2024.1430565.
  • 10. Margue G, Ferrer L, Etchepare G, Bigot P, Bensalah K, Mejean A, Roupret M, Doumerc N, Ingels A, Boissier R, Pignot G, Parier B, Paparel P, Waeckel T, Colin T, Bernhard JC. UroPredict: machine learning model on real-world data for prediction of kidney cancer recurrence (UroCCR-120). NPJ Precis Oncol. 2024;8:45. doi: 10.1038/s41698-024-00532-x.
  • 11. Lip GYH, Genaidy A, Tran G, Marroquin P, Estes C, Shnaiden T, Bayewitz A. Incident and recurrent myocardial infarction (MI) in relation to comorbidities: prediction of outcomes using machine-learning algorithms. Eur J Clin Invest. 2022;52:e13777. doi: 10.1111/eci.13777.
  • 12. Swanson K, Wu E, Zhang A, Alizadeh AA, Zou J. From patterns to patients: advances in clinical machine learning for cancer diagnosis, prognosis, and treatment. Cell. 2023;186:1772–1791. doi: 10.1016/j.cell.2023.01.035.
  • 13. Schaeffer T, Canizares MF, Wall LB, Bohn D, Steinman S, Samora J, Manske MC, Hutchinson DT, Shah AS, Bauer AS; CoULD Study Group. How risky are risk factors? An analysis of prenatal risk factors in patients participating in the congenital upper limb differences registry. J Hand Surg Glob Online. 2022;4:147–152. doi: 10.1016/j.jhsg.2022.03.001.
  • 14. Jagsi R, Mason G, Overmoyer BA, Woodward WA, Badve S, Schneider RJ, Lang JE, Alpaugh M, Williams KP, Vaught D, Smith A, Smith K, Miller KD; Susan G. Komen-IBCRF IBC Collaborative in partnership with the Milburn Foundation. Inflammatory breast cancer defined: proposed common diagnostic criteria to guide treatment and research. Breast Cancer Res Treat. 2022;192:235–243. doi: 10.1007/s10549-021-06434-x.
  • 15. Lazrek O, Karam KM, Bouché PA, Billaud A, Pourchot A, Godeneche A, Freaud O, Kany J, Métais P, Werthel JD, Bohu Y, Gerometta A, Hardy A. A new self-assessment tool following shoulder stabilization surgery, the auto-Walch and auto-Rowe questionnaires. Knee Surg Sports Traumatol Arthrosc. 2023;31:2593–2601. doi: 10.1007/s00167-022-07290-y.
  • 16. Carr HM, Patel RA, Beederman MR, Maassen NH, Hanson SE. Risk factors for upper extremity impairment after mastectomy: a single institution retrospective review. Plast Reconstr Surg Glob Open. 2024;12:e5684. doi: 10.1097/GOX.0000000000005684.
  • 17. Zheng SY, Chen CY, Qi WX, Cai G, Xu C, Cai R, Qian XF, Shen KW, Cao L, Chen JY. The influence of axillary surgery and radiotherapeutic strategy on the risk of lymphedema and upper extremity dysfunction in early breast cancer patients. Breast. 2023;68:142–148. doi: 10.1016/j.breast.2023.02.001.
  • 18. Cocco D, Shah C, Wei W, Wilkerson A, Grobmyer SR, Al-Hilli Z. Axillary lymph node dissection can be omitted in patients with limited clinically node-positive breast cancer: a National Cancer Database analysis. Br J Surg. 2022;109:1293–1299. doi: 10.1093/bjs/znac305.
  • 19. Mohite PP, Kanase SB. Effectiveness of scapular strengthening exercises on shoulder dysfunction for pain and functional disability after modified radical mastectomy: a controlled clinical trial. Asian Pac J Cancer Prev. 2023;24:2099–2104. doi: 10.31557/APJCP.2023.24.6.2099.
  • 20. Aboelnour NH, Kamel FH, Basha MA, Azab AR, Hewidy IM, Ezzat M, Kamel NM. Combined effect of graded Thera-Band and scapular stabilization exercises on shoulder adhesive capsulitis post-mastectomy. Support Care Cancer. 2023;31:215. doi: 10.1007/s00520-023-07641-6.
  • 21. Holmes RJ, McManus KJ, Koulouglioti C, Hale B. Risk factors for poststroke shoulder pain: a systematic review and meta-analysis. J Stroke Cerebrovasc Dis. 2020;29:104787. doi: 10.1016/j.jstrokecerebrovasdis.2020.104787.
  • 22. Snickars J, Persson HC, Sunnerhagen KS. Early clinical predictors of motor function in the upper extremity one month post-stroke. J Rehabil Med. 2017;49:216–222. doi: 10.2340/16501977-2205.
  • 23. Chen T, Lei W, Wang M. Predictive model of internal bleeding in elderly aspirin users using XGBoost machine learning. Risk Manag Healthc Policy. 2024;17:2255–2269. doi: 10.2147/RMHP.S478826.
  • 24. Su K, Yuan X, Huang Y, Yuan Q, Yang M, Sun J, Li S, Long X, Liu L, Li T, Yuan Z. Improved prediction of knee osteoarthritis by the machine learning model XGBoost. Indian J Orthop. 2023;57:1667–1677. doi: 10.1007/s43465-023-00936-0.
  • 25. Jin Y, Lan A, Dai Y, Jiang L, Liu S. Development and testing of a random forest-based machine learning model for predicting events among breast cancer patients with a poor response to neoadjuvant chemotherapy. Eur J Med Res. 2023;28:394. doi: 10.1186/s40001-023-01361-7.
  • 26. Alsaykhan LK, Maashi MS. A hybrid detection model for acute lymphocytic leukemia using support vector machine and particle swarm optimization (SVM-PSO). Sci Rep. 2024;14:23483. doi: 10.1038/s41598-024-74889-1.
  • 27. Gong M, Liang D, Xu D, Jin Y, Wang G, Shan P. Analyzing predictors of in-hospital mortality in patients with acute ST-segment elevation myocardial infarction using an evolved machine learning approach. Comput Biol Med. 2024;170:107950. doi: 10.1016/j.compbiomed.2024.107950.
  • 28. Storås AM, Fineide F, Magnø M, Thiede B, Chen X, Strümke I, Halvorsen P, Galtung H, Jensen JL, Utheim TP, Riegler MA. Using machine learning model explanations to identify proteins related to severity of meibomian gland dysfunction. Sci Rep. 2023;13:22946. doi: 10.1038/s41598-023-50342-7.
  • 29. Zhou Y, Zhang Z, Li Q, Mao G, Zhou Z. Construction and validation of machine learning algorithm for predicting depression among home-quarantined individuals during the large-scale COVID-19 outbreak: based on Adaboost model. BMC Psychol. 2024;12:230. doi: 10.1186/s40359-024-01696-8.
  • 30. Liang X, Li X, Li G, Wang B, Liu Y, Sun D, Liu L, Zhang R, Ji S, Yan W, Yu R, Gao Z, Liu X. A machine learning approach to predicting vascular calcification risk of type 2 diabetes: a retrospective study. Clin Cardiol. 2024;47:e24264. doi: 10.1002/clc.24264.
  • 31. Li Q, Wang Y, Chen J, Zeng K, Wang C, Guo X, Hu Z, Hu J, Liu B, Xiao J, Zhou P. Machine learning based androgen receptor regulatory gene-related random forest survival model for precise treatment decision in prostate cancer. Heliyon. 2024;10:e37256. doi: 10.1016/j.heliyon.2024.e37256.
  • 32. Ji W, Wang C, Chen H, Liang Y, Wang S. Predicting post-stroke cognitive impairment using machine learning: a prospective cohort study. J Stroke Cerebrovasc Dis. 2023;32:107354. doi: 10.1016/j.jstrokecerebrovasdis.2023.107354.

Articles from American Journal of Translational Research are provided here courtesy of e-Century Publishing Corporation
