Abstract
Study Design.
A retrospective real-world study.
Objective.
To use machine learning models to identify risk factors for residual pain after posterior lumbar interbody fusion (PLIF) in patients with degenerative lumbar spine disease.
Summary of Background Data.
Residual pain after PLIF is a frequent phenomenon, and the specific risk factors for residual pain are not known.
Materials and Methods.
Between June 2018 and March 2023, 936 patients with lumbar degenerative disease who underwent PLIF were included. In group A (n=501), the number of days with a visual analog scale (VAS) score ≥3 within 1 month after PLIF was <7, whereas in group B (n=435) it was ≥7. Imaging outcomes included the psoas muscle index (PMI), multifidus muscle index (MMI), multifidus muscle gray value (MMGV), lumbar lordosis (LL), and LL improvement rate. Functional outcomes were assessed by VAS. Univariate and multivariate logistic regression analyses were used to identify potential risk factors for short-term postoperative pain. Machine learning models were then used to rank risk factors and to predict whether residual pain would occur.
Results.
A total of 435 (46.5%) patients experienced residual postoperative pain. Independent risk factors included the number of surgical segments, PMI, MMI, and anxiety level. The random forest model had an accuracy of 95.7%, a precision of 96.4%, a recall (sensitivity) of 94.1%, and an F1 score of 95.2% for predicting residual pain, indicating strong predictive performance on the held-out test set.
Conclusions.
Our study reveals risk factors for the development of residual pain after PLIF. Compared with the pain group, the non-pain group had better paravertebral muscle condition, a better psychological state, fewer surgical segments, and a lower LL improvement rate. These factors may represent targets for preoperative and perioperative optimization as a means to minimize the potential for residual pain after PLIF.
Key Words: lumbar degenerative disease, posterior lumbar interbody fusion, short-term residual pain, psoas muscle index, multifidus muscle index
Low back pain remains the primary cause of global productivity loss and the number one cause of years of health loss in 126 countries.1,2 With increasing work intensity and working hours, coupled with population aging, degenerative lumbar spine disease (DLSD) has become one of the main causes of persistent low back pain in middle-aged and elderly individuals.3–5 Since its development in 1943, posterior lumbar interbody fusion (PLIF)6 has become a reliable surgical option for treating degenerative lumbar diseases. Although PLIF can directly remove protruding lumbar discs or hypertrophic ligaments that compress the spinal cord or nerve roots and can enlarge the spinal canal, some patients experience little benefit from surgery,7,8 and some have persistent residual pain after surgery9,10 that affects their daily life. Research has shown that short-term postoperative recovery affects patients’ perception of, and confidence in, rehabilitation.11
Therefore, analyzing the causes of postoperative pain, evaluating potential treatment risks, and estimating each patient’s probability of pain from individual characteristics before surgery can help clinical staff manage patients’ expectations and intervene in specific, modifiable aspects to prevent postoperative pain. Machine learning algorithms are powerful tools for analyzing large data sets.12–14 In this study, using real-world clinical data, we applied six machine learning models to predict the occurrence of short-term postoperative residual pain, selected the best-performing model, and identified risk factors for pain, thus helping doctors, nurses, and patients weigh the risks and benefits of surgical intervention.
METHODS
Selection Criteria
This study retrospectively reviewed 936 patients with degenerative lumbar conditions treated at our institute between June 2018 and March 2023. These patients were divided into groups A and B. In group A (n = 501), the number of days with visual analog scale (VAS) scores ≥3 was <7 within 1 month after PLIF. In group B (n = 435), the number of days with VAS scores ≥3 was ≥7 within 1 month after PLIF. All patients received 50 mg flurbiprofen axetil by intravenous injection twice a day for postoperative pain relief. Patients whose VAS scores were ≥7 received 0.2 g celecoxib orally once for temporary pain relief.
The inclusion criteria were as follows: (1) Imaging-based diagnosis of degenerative lumbar spine disorders. (2) All patients received PLIF. (3) The follow-up time was 1 month, and all the relevant information was available.
The exclusion criteria were as follows: (1) Lumbar isthmic spondylolysis or vertebral fracture. (2) Severe scoliosis or severe osteoporosis. Severe scoliosis was defined as a Cobb angle ≥40° on standing anteroposterior radiographs of the lumbar spine, and severe osteoporosis was defined as a T-score ≤−2.5 accompanied by one or more fragility fractures.15,16 (3) A history of spinal surgery or fractures. (4) Spinal tumors or space-occupying lesions.
Assessed Parameters
Pain Assessment
The VAS was used to determine the patient’s perception of lower back or leg pain before and after surgery (0–10 scale, with 0 being painless and 10 being the most painful).17
Psychological Assessment
We used the Self-Rating Anxiety Scale (SAS) to assess patients’ subjective feelings and, on the basis of standard scores, categorized anxiety into grades I to IV (no anxiety, mild, moderate, or severe anxiety). The severity of preoperative anxiety was recorded as a reference for perioperative psychological changes.18
Radiographic Evaluation
All patients underwent anteroposterior and lateral radiography and MR imaging before surgery, and radiographs were taken again 5 days after surgery. Radiologic evaluations were performed by three experienced spinal surgeons in a blinded manner. Their measurements differed by <5%, indicating accurate, stable, and reliable measurements. The mean value of each radiographic parameter was used in the analysis. Lumbar lordosis (LL) was measured on standing lateral spine radiographs by identifying the superior endplates of L1 and S1. Straight lines (tangents) were drawn along these endplates, and the angle formed at their intersection was taken as the Cobb angle, which represents LL (Fig. 1). The LL improvement rate was calculated as the absolute difference between preoperative LL and postoperative LL divided by the preoperative LL value. On axial T2-weighted MR images at L3, muscle perimeters were outlined with the magnetic lasso tool in the Display system, which automatically calculated the area in cm² (Fig. 2). The psoas muscle index (PMI) and multifidus muscle index (MMI) were derived by dividing the muscle area by the square of the patient’s height (cm²/m²). The multifidus muscle gray value (MMGV) was measured with ImageJ by drawing a region of interest (ROI) that avoided fat and other tissues; a value <30 suggested significant degeneration or fat infiltration, and a value <20 suggested severe degeneration.
Figure 1.
Measurement of preoperative and postoperative lumbar lordosis angles. Lumbar lordosis (LL) is measured on standing lateral spine radiographs by identifying the superior endplates of L1 and S1. Straight lines (tangents) are drawn along these endplates, and the angle formed at their intersection is calculated as the Cobb angle, which represents LL.
Figure 2.
Measurement of PMI, MMI, and MMGV. PMI and MMI were measured at the L3 level on axial T2-weighted images using the Display system, with the muscle area calculated in cm² and normalized by height (cm²/m²). MMGV was determined using ImageJ software by drawing an ROI within the multifidus muscle, avoiding surrounding fat or tissues, and calculating the average grayscale value (unitless). Measurements were performed bilaterally.
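The two derived quantities described above (the LL improvement rate and the height-normalized muscle index) can be written as a minimal sketch. The function names and the example patient (28.9 cm² psoas area, 1.70 m height) are hypothetical and only illustrate the arithmetic; the LL values are the group A means from Table 2.

```python
def ll_improvement_rate(pre_ll: float, post_ll: float) -> float:
    """Absolute change in lumbar lordosis divided by the preoperative value."""
    if pre_ll <= 0:
        raise ValueError("preoperative LL must be positive")
    return abs(post_ll - pre_ll) / pre_ll


def muscle_index(area_cm2: float, height_m: float) -> float:
    """Muscle cross-sectional area normalized by height squared (cm^2/m^2)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return area_cm2 / height_m ** 2


# Group A mean LL values from Table 2 (degrees): 48.20 before and 54.77 after surgery.
rate = ll_improvement_rate(48.20, 54.77)

# Hypothetical patient, not taken from the study data.
pmi = muscle_index(area_cm2=28.9, height_m=1.70)

print(f"LL improvement rate: {rate:.3f}")
print(f"PMI: {pmi:.2f} cm^2/m^2")
```

Note that applying the formula to group means gives only an approximation of the mean per-patient rate, because the mean of a ratio is generally not the ratio of the means.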
Statistical Methods
Statistical analyses were conducted through SPSS version 27.0. Univariate analysis was initially performed on all potential factors through χ2 tests for categorical variables and independent t tests for continuous variables to identify those significantly associated with short-term postoperative pain after PLIF (P<0.05). Statistically significant variables were then included in a multivariate logistic regression to determine independent protective and risk factors, with odds ratios (ORs) and 95% CIs calculated to quantify these associations. The reliability of the radiological data was evaluated by the intraclass correlation coefficient (ICC).
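As an illustration of the univariate step, the sketch below applies a Pearson χ2 test to the sex-by-group counts reported in Table 2. It is a plain-Python stand-in for the SPSS procedure, and it assumes no continuity correction, which the text does not specify; the function name is hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Returns the statistic and the p-value for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: male, female. Columns: group A, group B (counts from Table 2).
sex_by_group = [[277, 160], [224, 275]]
chi2, p_value = chi_square_2x2(sex_by_group)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

The resulting p-value falls below 0.01, in line with the value reported for sex in Table 2.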
Machine Learning Model
This study used six classifiers to predict whether a patient would experience short-term postoperative pain. The classifiers used include K-Nearest Neighbors (KNN), Logistic Regression (LR), Support Vector Machine (SVM), Naive Bayes (NB), Decision Tree Classifier (DTC), and Random Forest (RF). Each model’s parameters were optimized through GridSearchCV to achieve the best performance.
Data Processing
During the data preprocessing stage, we checked for missing and duplicate values and found none. As the number of factors in this study was relatively small, we did not exclude predictors solely on the basis of the univariate analysis. Multicollinearity was then assessed with the variance inflation factor (VIF), computed by least-squares regression, to select independent variables for the model. Variables with high VIF values, which indicate strong correlation with other predictors, were excluded to avoid model instability. Postoperative LL (Post-LL) and preoperative LL (Pre-LL) were removed, whereas the other variables remained within acceptable VIF ranges.
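The VIF screening described above can be sketched as follows: each predictor is regressed on the remaining predictors by ordinary least squares, and VIF = 1 / (1 − R²). This is a minimal plain-Python illustration, not the tool used in the study; the helper names and the toy values are hypothetical, and the toy Pre-LL and Post-LL columns are deliberately made nearly collinear.

```python
from statistics import fmean


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def r_squared(y, predictors):
    """R^2 of an OLS fit of y on the predictor columns, with an intercept."""
    n = len(y)
    cols = [[1.0] * n] + predictors
    k = len(cols)
    xtx = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[i] * cols[i][t] for i in range(k)) for t in range(n)]
    mean_y = fmean(y)
    ss_res = sum((y[t] - fitted[t]) ** 2 for t in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """Variance inflation factor for every named column."""
    result = {}
    for name, values in columns.items():
        others = [v for other, v in columns.items() if other != name]
        result[name] = 1.0 / (1.0 - r_squared(values, others))
    return result


# Toy data (hypothetical): Post-LL tracks Pre-LL closely, PMI does not.
toy = {
    "pre_ll": [40.0, 45.0, 50.0, 55.0, 48.0, 52.0, 43.0, 57.0],
    "post_ll": [47.1, 51.8, 57.2, 61.9, 55.3, 58.7, 50.2, 64.1],
    "pmi": [9.0, 4.0, 11.0, 5.0, 3.5, 10.0, 6.0, 8.0],
}
vifs = vif(toy)
for name, value in vifs.items():
    print(f"{name}: VIF = {value:.1f}")
```

A common rule of thumb treats VIF values above roughly 5 to 10 as problematic; the threshold applied in this study is not stated, so any cutoff used with this sketch is an assumption.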
The data set consists of 936 samples with 16 features, and the target variable is Boolean. The samples were split into training and testing sets at an 8:2 ratio. Owing to the multidimensional nature of the data, standardization was applied to ensure consistent feature scaling. In addition, categorical variables were one-hot encoded to match the model input requirements.
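A minimal sketch of the three preprocessing steps named above (8:2 split, standardization, one-hot encoding) is given below in plain Python. The function names, the random seed, and the example values are hypothetical. Rounding the test-set size up is an assumption that mirrors common library behavior, and fitting the scaler on the training data only is shown as common practice, not as a detail confirmed by the text.

```python
import math
import random
from statistics import fmean, pstdev


def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into training and test index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def fit_standardizer(train_values):
    """Return the mean and standard deviation estimated on training data only."""
    sd = pstdev(train_values)
    return fmean(train_values), (sd if sd > 0 else 1.0)


def standardize(values, mean, sd):
    """Rescale values to zero mean and unit variance using fitted statistics."""
    return [(v - mean) / sd for v in values]


def one_hot(values, categories):
    """Encode each categorical value as a 0/1 indicator vector."""
    return [[1 if v == c else 0 for c in categories] for v in values]


train_idx, test_idx = split_indices(936, test_fraction=0.2, seed=42)
print(len(train_idx), len(test_idx))

# Hypothetical PMI values, used only to show the scaling step.
pmi_train = [10.4, 4.2, 7.9, 3.1, 12.6]
mean, sd = fit_standardizer(pmi_train)
pmi_scaled = standardize(pmi_train, mean, sd)

# SAS grades 1 to 4 encoded as four indicator columns.
sas_encoded = one_hot([1, 3, 2], categories=[1, 2, 3, 4])
print(sas_encoded)
```

With 936 samples this split yields 748 training and 188 test samples; 188 is also the total of the random forest confusion-matrix counts reported in the Results.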
Model Training and Optimization
After completing data processing, we trained the models and optimized their hyperparameters using GridSearchCV within a cross-validation framework. GridSearchCV systematically searches through predefined parameter combinations, evaluates each combination by cross-validation on different training and validation subsets, and selects the hyperparameter combination with the best mean validation performance. The effectiveness of each model was assessed through accuracy metrics and confusion matrix analysis.
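The idea behind a cross-validated grid search can be shown with a small plain-Python sketch. It is not scikit-learn's GridSearchCV: it searches one hyperparameter (the number of neighbors of a simple KNN classifier) with 5-fold cross-validation. The number of folds, the parameter grid, and the synthetic two-feature data are all assumptions made for illustration.

```python
import random
from statistics import fmean


def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs for contiguous, near-equal folds."""
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    start = 0
    for size in sizes:
        stop = start + size
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, val
        start = stop


def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k nearest training points (binary labels 0/1)."""
    order = sorted(range(len(train_x)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))
    votes = sum(train_y[i] for i in order[:k])
    return 1 if 2 * votes > k else 0


def cv_accuracy(xs, ys, k, n_folds=5):
    """Mean validation accuracy of KNN with k neighbors across the folds."""
    scores = []
    for train, val in k_fold_indices(len(xs), n_folds):
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        correct = sum(knn_predict(tx, ty, xs[i], k) == ys[i] for i in val)
        scores.append(correct / len(val))
    return fmean(scores)


def grid_search(xs, ys, grid, n_folds=5):
    """Evaluate every candidate in the grid and return the best one with all scores."""
    results = {k: cv_accuracy(xs, ys, k, n_folds) for k in grid}
    best = max(results, key=results.get)
    return best, results


# Synthetic data (not the study data): two features, two classes.
rng = random.Random(0)
samples = []
for _ in range(60):
    samples.append(((rng.gauss(10.4, 2.0), rng.gauss(2.4, 0.5)), 0))
    samples.append(((rng.gauss(4.2, 2.0), rng.gauss(1.6, 0.5)), 1))
rng.shuffle(samples)
xs = [features for features, _ in samples]
ys = [label for _, label in samples]

best_k, scores = grid_search(xs, ys, grid=[1, 3, 5, 7, 9])
print("best k:", best_k)
print({k: round(v, 3) for k, v in scores.items()})
```

In practice the selected configuration is refit on the full training set and evaluated once on the held-out test set, so that the test data play no part in hyperparameter selection.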
RESULTS
A total of 936 patients with degenerative lumbar disease were included in this study, on the basis of the defined inclusion and exclusion criteria. All patients underwent PLIF surgery and completed at least 1 month of postoperative follow-up. Of these, 499 patients (53.32%) were female. Group A comprised 501 patients with a mean age of 67.07±13.1 years, whereas group B included 435 patients with a mean age of 66.94±13.43 years. There was no significant difference in the preoperative VAS score between the two groups of patients. The ICC values are shown in Table 1. The ICC values showed excellent agreement (>0.9) for all the measurements. None of the differences were statistically significant.
TABLE 1.
Interobserver Intraclass Correlation Coefficients for All Radiographic Variables
| | Pre-LL | Post-LL | PMI | MMI | MMGV |
|---|---|---|---|---|---|
| ICC (95% CI) | 0.98 (0.98, 0.99) | 0.95 (0.94, 0.95) | 0.96 (0.95, 0.98) | 0.95 (0.94, 0.97) | 0.98 (0.98, 0.99) |
LL indicates lumbar lordosis; MMGV, multifidus muscle gray value; MMI, multifidus muscle index; PMI, psoas muscle index; Post, postoperative; Pre, Preoperative.
Univariate Logistic Regression Analysis
Univariate analysis revealed several factors significantly associated with residual postoperative lumbar and leg pain in PLIF patients. These factors included sex, SAS, surgical segment, drainage volume, PMI, MMI, MMGV, preoperative LL, and LL improvement rate (Table 2).
TABLE 2.
Comparison of General Data Between the PLIF Non-Short-Term Pain Group (Group A) and the Short-Term Pain Group (Group B) Patients
| Variable | Group A (n=501) | Group B (n=435) | P |
|---|---|---|---|
| Sex, n (%) | | | <0.01* |
| Male | 277 (55.29) | 160 (36.78) | |
| Female | 224 (44.71) | 275 (63.22) | |
| Age [mean (SD)] | 67.07 (13.10) | 66.94 (13.43) | 0.887 |
| BMI [mean (SD)] | 23.19 (3.43) | 23.15 (3.17) | 0.853 |
| Education, n (%) | | | 0.500 |
| Junior high school education | 340 (67.86) | 299 (68.73) | |
| High school and undergraduate programs | 135 (26.95) | 107 (24.60) | |
| Graduate degree | 26 (5.19) | 29 (6.67) | |
| SAS, n (%) | | | <0.01* |
| 1 | 397 (79.25) | 307 (70.57) | |
| 2 | 89 (17.76) | 92 (21.15) | |
| 3 | 13 (2.59) | 23 (5.29) | |
| 4 | 2 (0.40) | 13 (2.99) | |
| Surgical segment [mean (SD)] | 1.32 (0.59) | 1.56 (0.71) | <0.01* |
| Surgical duration [mean (SD)] | 2.49 (0.57) | 2.53 (0.65) | 0.302 |
| Drainage time [mean (SD)] | 2.45 (0.80) | 2.47 (0.79) | 0.611 |
| Drainage volume, n (%) | | | <0.01* |
| <50 | 473 (94.41) | 392 (90.11) | |
| ≥50 | 28 (5.59) | 43 (9.89) | |
| Duration of illness [mean (SD)] | 22.24 (12.35) | 22.05 (13.41) | 0.823 |
| PMI [mean (SD)] | 10.44 (4.51) | 4.21 (2.55) | <0.01* |
| MMI [mean (SD)] | 2.40 (0.49) | 1.59 (0.73) | <0.01* |
| MMGV [mean (SD)] | 40.88 (7.40) | 39.23 (7.33) | <0.01* |
| Pre-LL [mean (SD)] | 48.20 (5.88) | 46.64 (8.35) | <0.01* |
| Post-LL [mean (SD)] | 54.77 (5.21) | 54.55 (5.66) | 0.534 |
| LL improvement rate [mean (SD)] | 0.14 (0.09) | 0.23 (0.49) | <0.01* |
*Statistically significant difference (P<0.05).
LL indicates lumbar lordosis; MMGV, multifidus muscle gray value; MMI, multifidus muscle index; PMI, psoas muscle index; Post, postoperative; Pre, preoperative; SAS, Self-Rating Anxiety Scale; SD, standard deviation.
Multivariate Logistic Regression Analysis
A binary logistic regression model was constructed with short-term postoperative pain as the dependent variable (No=0, Yes=1). The independent variables were those identified through univariate analysis: sex, SAS, surgical segment, drainage volume, PMI, MMI, MMGV, preoperative LL, and LL improvement rate. The model’s goodness-of-fit was confirmed using the Hosmer-Lemeshow test. Multivariate analysis revealed surgical segment (P<0.001), SAS grade 3 (P=0.016), SAS grade 4 (P=0.017), PMI (P<0.001), MMI (P<0.001), and the LL improvement rate (P=0.013) as independent predictors of short-term postoperative pain (Table 3).
TABLE 3.
Multivariate Logistic Regression Analysis of Risk Factors Related to Short-Term Pain After PLIF Surgery
| Influencing factor | B | OR (95% CI) | P |
|---|---|---|---|
| Sex | −0.30 | 0.971 (0.586–1.607) | 0.909 |
| Surgical segment | 0.820 | 2.271 (1.521–3.391) | <0.001* |
| Drainage volume | | | |
| <50 | Reference | | |
| ≥50 | 0.566 | 1.760 (0.845–3.669) | 0.131 |
| SAS | | | |
| 1 | Reference | | |
| 2 | 0.225 | 1.291 (0.699–2.386) | 0.415 |
| 3 | 1.649 | 5.203 (1.359–19.927) | 0.016* |
| 4 | 3.073 | 21.612 (1.737–268.834) | 0.017* |
| PMI | −0.851 | 0.427 (0.376–0.485) | <0.001* |
| MMI | −1.855 | 0.156 (0.104–0.236) | <0.001* |
| MMGV | −0.007 | 0.993 (0.959–1.027) | 0.677 |
| Pre-LL | 0.027 | 1.027 (0.971–1.087) | 0.354 |
| LL improvement rate | 4.634 | 102.922 (2.648–400.029) | 0.013* |
*Statistically significant difference (P<0.05).
LL indicates lumbar lordosis; MMGV, multifidus muscle gray value; MMI, multifidus muscle index; PMI, psoas muscle index; Pre, preoperative; SAS, Self-Rating Anxiety Scale.
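In logistic regression, the odds ratio for a predictor is the exponential of its coefficient, OR = exp(B). The short sketch below checks this relationship for three rows of Table 3; the dictionary layout and function name are illustrative only, and small differences from the reported values reflect rounding of B to three decimals.

```python
import math

# (coefficient B, reported OR) for three rows of Table 3.
reported = {
    "Surgical segment": (0.820, 2.271),
    "PMI": (-0.851, 0.427),
    "MMI": (-1.855, 0.156),
}


def odds_ratio(b: float) -> float:
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(b)


for name, (b, reported_or) in reported.items():
    print(f"{name}: exp({b}) = {odds_ratio(b):.3f} (reported {reported_or})")
```

Read this way, an OR of 0.427 for PMI means that each one-unit increase in PMI multiplies the odds of short-term residual pain by about 0.43, with the other covariates held fixed.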
Data Visualization
Mosaic Plot and Violin Plot
Figure 3 illustrates the relationships between various factors and groups, including sex, surgical segment, SAS, drainage volume, and drainage time (defined as the interval between the placement of the surgical drain and its removal). Compared with group B, group A had a greater proportion of males, fewer surgical segments, lower anxiety levels, and lower drainage volume. Figure 4 shows that compared with group B, group A had higher median and mean values for the PMI and MMI, with a greater frequency of high values. In contrast, the MMGV differences between the groups were minimal. The LL improvement rate was greater in group B than in group A.
Figure 3.
Mosaic plot. Mosaic plots illustrating the relationships between various factors and groups, including sex, surgical segment, SAS, drainage volume, and drainage time.
Figure 4.
Violin plots. Violin plots showing the distribution of age, BMI, PMI, MMI, duration of illness, MMGV, LL preoperative, LL postoperative, LL improvement, and Surgical duration across groups.
Bubble Chart of the PMI and MMI
In Figure 5, group A and group B show distinct distribution differences across the PMI and MMI dimensions. In particular, along the PMI dimension, group A’s data points are more widely spread, whereas group B’s PMI values are more concentrated in the lower range.
Figure 5.
Bubble chart of PMI and MMI.
Correlation Analysis
Figure 6 shows that the PMI and MMI are negatively correlated with group, with correlation coefficients of −0.64 and −0.55, respectively, indicating that higher PMI and MMI values are associated with group A. Furthermore, sex, SAS, surgical segment, drainage volume at extubation, and the LL improvement rate were found to be correlated with outcomes, whereas BMI, education level, and disease duration were not significantly associated with outcomes.
Figure 6.
Pearson correlation coefficient.
Binary Response Prediction
Figure 7 shows that RF performed best, with minimal misclassifications (TP=80, FN=5, TN=100, FP=3), accurately distinguishing positive and negative samples. DTC and SVM also performed well, whereas LR showed balanced performance with moderate numbers of false negatives and false positives. KNN and NB performed worse; NB in particular had a high false-negative rate.
Figure 7.
Confusion matrices for different models.
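The standard classification metrics follow directly from the four confusion-matrix counts. The sketch below applies the usual definitions to the random forest counts given above (TP=80, FN=5, TN=100, FP=3); the function name is illustrative.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


# Random forest counts reported for Figure 7.
rf_metrics = classification_metrics(tp=80, fn=5, tn=100, fp=3)
for name, value in rf_metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts, accuracy is 180/188, precision is 80/83, recall is 80/85, and specificity is 100/103; the accuracy, precision, recall, and F1 values agree with the random forest row of Table 4.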
Accuracy Comparison of Different Models
Table 4 and Figure 8 compare the models’ accuracies, which range from 87.2% to 95.7%. RF performed best, achieving 95.7% accuracy, 96.4% precision, 94.1% recall (sensitivity), and an F1 score of 95.2%, suggesting that ensemble learning captured strong feature interactions. By averaging many decision trees, RF reduces variance, handles high-dimensional features, and forms diverse decision boundaries, which enhances model robustness and accuracy.
TABLE 4.
Model Performance Evaluation Table
| Model | Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|---|
| K-nearest neighbors | 0.8777 | 0.8974 | 0.8235 | 0.8589 |
| Logistic regression | 0.9043 | 0.9136 | 0.8706 | 0.8916 |
| Support vector machine | 0.9149 | 0.9367 | 0.8706 | 0.9024 |
| Naive Bayes | 0.8723 | 0.9067 | 0.8000 | 0.8500 |
| Decision tree classifier | 0.9362 | 0.9740 | 0.8824 | 0.9259 |
| Random forest | 0.9574 | 0.9639 | 0.9412 | 0.9524 |
Figure 8.
Accuracy comparison of different models.
Random Forest Feature Importance
The feature-importance analysis of the RF model reveals the key variables contributing to its predictions. Figure 9 shows that the MMI (0.45) and PMI (0.4) are the most important features, which suggests strong linear or nonlinear relationships with residual pain. The surgical segment ranks third, and drainage volume and the LL improvement rate are more important than features such as age and BMI. Basic demographics, such as age and sex, have less influence.
Figure 9.
Random forest feature importances.
DISCUSSION
With the growing global aging population and the changing nature of modern work, the incidence of DLSD continues to rise. DLSD places a heavy burden on health care systems.19–23 In this study, consistent with previous reports, we confirmed that PLIF is effective for relieving nerve compression, correcting deformities, and restoring intervertebral stability. However, despite a generally high success rate, a substantial number of patients still experience short-term residual pain.24–28
Short-term residual pain after PLIF remains a major clinical issue. In our study, nearly half of the patients were affected. When comparing patients without pain to those with residual pain, our multivariate analysis revealed that anxiety level, the number of surgical segments, the paraspinal muscle indices (PMI, MMI), and the improvement in LL were significant predictors. Previous studies have focused mostly on factors such as patient age, osteoporosis, and the number of fused segments.29–34 However, few studies have explored how paraspinal muscle mass and psychological factors interact to influence pain. Our results help fill this gap. The psoas major and multifidus muscles are key to spinal stability and load distribution and directly affect surgical outcomes and recovery. Muscle atrophy or fatty infiltration not only slows postoperative recovery but also may heighten spinal mechanical stress and increase the risk of postoperative pain.35,36 Notably, in elderly patients or those with sarcopenia, muscle atrophy increases the risk of failed back surgery syndrome (FBSS) and prolongs recovery. Paraspinal muscle atrophy is often linked to higher pain scores and functional limitations, which aligns with our observation that patients with lower PMI/MMI are more prone to residual pain.37 Although MMGV may indicate the amount of fat, connective tissue, or fluid in muscle, its role may be overshadowed by stronger or more direct factors such as the PMI and MMI. MMGV might play a greater role over the long term, rather than being a key factor in short-term pain.
Multisegment surgery correlated with greater short-term residual pain. This may be because multisegment procedures cause more extensive trauma to soft tissues and nerves.38 Moreover, adding more fused levels leads to greater changes in lumbar mobility and load distribution, possibly pushing adjacent segments to bear more stress and triggering insufficient muscular compensation.39 Patients in the no-pain group had greater preoperative LL, which is in line with the findings of previous studies.40,41 A relatively greater lumbar lordosis before surgery often suggests a more stable lumbar alignment, requiring less correction during surgery and thus reducing stress on surrounding soft tissue and muscles, ultimately lowering the risk of short-term residual pain. A moderate LL improvement offers a buffer period for soft tissue adaptation.42–44 This matches our findings. In previous studies, opinions differed on whether to place a drain after PLIF and how long to leave it in place.45,46 Our data revealed that the duration of drain placement did not differ significantly between groups, but a lower drainage volume was linked to less pain, possibly reflecting reduced inflammation and bleeding, and decreased nerve compression by hematoma.
We evaluated six machine learning models to guide clinical management and found that the RF model performed best, with 95.7% accuracy, 96.4% precision, and 94.1% recall (sensitivity). Its strong performance likely stems from its ability to handle complex, nonlinear interactions among features, and it highlights how low PMI, low MMI, and high anxiety together are associated with residual pain.47 By aggregating multiple decision trees, the RF reduces overfitting and maintains generalizability, making it well-suited for clinical use. Although LR and SVM also showed high accuracy, LR struggles with high-dimensional data, and SVM lacks a built-in feature-importance analysis. KNN and NB fared poorly, especially in unbalanced data sets.48–50 Overall, the RF’s low misclassification rate and stability against variance make it a reliable tool for predicting residual pain, and its integration into clinical decision support could enhance both preoperative risk stratification and postoperative monitoring. During preoperative evaluation, the PMI and MMI help identify high-risk patients. If there is no absolute indication for surgery (e.g., cauda equina syndrome), conservative treatment, such as targeted physical therapy and nutrition, should be attempted first. For patients requiring surgery, evaluating anxiety and addressing psychological distress through a biopsychosocial approach can reduce residual pain. Minimizing fusion segments and preserving posterior structures protects spinal stability, whereas focused rehabilitation aids muscle recovery in those with poor paraspinal muscle quality.
This study is the first to integrate muscle mass, surgical approaches, and psychological factors into a short-term residual pain analysis after PLIF, identify key risk factors, and develop a robust predictive model. Despite its strengths, certain limitations exist. As a single-center retrospective study, generalizability may be limited by selection bias, missing global sagittal alignment data, and incomplete patient records, potentially reducing sample representativeness. Although the sample size was sufficient for preliminary analysis, a larger sample could capture broader variability and reduce the risk of Type II errors. In addition, excluding factors such as inflammatory biomarkers and lifestyle variables may narrow the model’s predictive scope. Future prospective, multicenter research with larger samples and more diverse variables could enhance both the model’s stability and its clinical applicability.
CONCLUSION
This research confirms the critical role of the quality of paraspinal muscles, especially the psoas major and multifidus muscles, in the development of short-term residual pain after PLIF. Compared with the group with residual pain, the group without pain had more robust paravertebral muscles, better psychological status, and a lower LL improvement rate. These factors may represent targets for preoperative and perioperative optimization as a means to minimize the potential for residual pain after PLIF. Integrating this machine learning model into clinical practice can provide patients with more personalized treatment plans, paving the way for data-driven spine surgical interventions.
Key Points
Machine learning prediction: the random forest model predicted residual pain after PLIF with high accuracy (95.7%), precision (96.4%), recall (94.1%), and F1 score (95.2%).
Identified risk factors: independent risk factors for residual postoperative pain included a greater number of surgical segments, a lower psoas muscle index (PMI), a lower multifidus muscle index (MMI), and higher anxiety levels.
Patient comparisons: patients without residual pain had better paravertebral muscle condition, better psychological status, fewer surgical segments, and lower lumbar lordosis (LL) improvement rates than those with residual pain.
Clinical implications: recognizing these risk factors allows clinicians to tailor surgical approaches and postoperative care to minimize the occurrence of residual pain after PLIF.
Model utility: the study underscores the potential of machine learning models in enhancing postoperative outcomes by accurately identifying patients at risk for residual pain.
ACKNOWLEDGMENTS
The authors are grateful for the support of the First Affiliated Hospital of Soochow University.
Footnotes
H.S., W.T., and X.Y. contributed equally.
Approval was obtained from the ethics committee of the First Affiliated Hospital of Soochow University. The procedures used in this study adhere to the tenets of the Declaration of Helsinki. Approved by the medical ethics committee (Ethical Research Approval No. 518, 2024; ClinicalTrials.gov ID: NCT06628583).
H.S.: conceptualization, methodology, investigation, software, model building, writing (original draft). W.T.: conceptualization, data curation, model building, writing (original draft), investigation. X.Y.: conceptualization, methodology, data processing, model building, writing (original draft). L.D.: methodology, data curation, investigation. L.C.: conceptualization, methodology, data curation. Z.Q.: conceptualization, methodology, data curation, writing (review and editing), funding acquisition. H.Y.: conceptualization, methodology, data curation, writing (review and editing), funding acquisition. J.Z.: conceptualization, methodology, data curation, validation, writing (review and editing), funding acquisition. Y.Q.: conceptualization, methodology, data curation, validation, writing (review and editing). H.L.: conceptualization, methodology, data curation, validation, writing (review and editing).
The authors report no conflicts of interest.
Contributor Information
Haifu Sun, Email: shfsdfyy@163.com.
Wenxiang Tang, Email: 20235233093@stu.suda.edu.cn.
Xingyu You, Email: seliiiiia.you@gmail.com.
Lei Deng, Email: denglei981122@163.com.
Liuyu Chen, Email: 1429745503@qq.com.
Zhonglai Qian, Email: szqzlspine@163.com.
Huilin Yang, Email: suzhouspine@163.com.
Jun Zou, Email: jzou@suda.edu.cn.
Yusen Qiao, Email: qiaoyusen8612@suda.edu.cn.
Hao Liu, Email: liuhaodoctor@163.com.
References
- 1.Onuora S. Low back pain is a growing concern. Nat Rev Rheumatol. 2023;19:462. [DOI] [PubMed] [Google Scholar]
- 2.Knezevic NN, Candido KD, Vlaeyen JWS, Van Zundert J, Cohen SP. Low back pain. Lancet. 2021;398:78–92. [DOI] [PubMed] [Google Scholar]
- 3.Kim HS, Wu PH, Jang IT. Lumbar Degenerative Disease Part 1: Anatomy and pathophysiology of intervertebral discogenic pain and radiofrequency ablation of basivertebral and sinuvertebral nerve treatment for chronic discogenic back pain: a prospective case series and review of literature. Int J Mol Sci. 2020;21:1483. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Vo CD, Jiang B, Azad TD, Crawford NR, Bydon A, Theodore N. Robotic spine surgery: current state in minimally invasive surgery. Global Spine J. 2020;10(2 Suppl):34s–40s. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Nguyen KML, Nguyen DTD. Minimally invasive treatment for degenerative lumbar spine. Tech Vasc Interv Radiol. 2020;23:100700. [DOI] [PubMed] [Google Scholar]
- 6.Cloward RB. Posterior lumbar interbody fusion updated. Clin Orthop Relat Res. 1985;193:16–19. [PubMed] [Google Scholar]
- 7.Gibson JN, Waddell G. Surgical interventions for lumbar disc prolapse: updated Cochrane Review. Spine (Phila Pa 1976). 2007;32:1735–1747. [DOI] [PubMed] [Google Scholar]
- 8.Bailey CS, Rasoulinejad P, Taylor D, et al. Surgery versus conservative care for persistent sciatica lasting 4 to 12 months. N Engl J Med. 2020;382:1093–1102. [DOI] [PubMed] [Google Scholar]
- 9.Berg B, Gorosito MA, Fjeld O, et al. Machine learning models for predicting disability and pain following lumbar disc herniation surgery. JAMA Netw Open. 2024;7:e2355024. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.McGirt MJ, Bydon M, Archer KR, et al. An analysis from the Quality Outcomes Database, Part 1. Disability, quality of life, and pain outcomes following lumbar spine surgery: predicting likely individual patient outcomes for shared decision-making. J Neurosurg Spine. 2017;27:357–369. [DOI] [PubMed] [Google Scholar]
- 11.Traeger AC, Qaseem A, McAuley JH. Low back pain. Jama. 2021;326:286. [DOI] [PubMed] [Google Scholar]
- 12.Guo Z, Wang P, Ye S, et al. Interpretable machine learning models based on shapley additive explanations for predicting the risk of cerebrospinal fluid leakage in lumbar fusion surgery. Spine. 2024;49:1281–1293. [DOI] [PubMed] [Google Scholar]
- 13.Li R, Wang L, Wang X, et al. Development of machine learning model for predicting prolonged operation time in lumbar stenosis undergoing posterior lumbar interbody fusion: a multi-center study. Spine J. 2024. [DOI] [PubMed] [Google Scholar]
- 14.Ogink PT, Groot OQ, Bindels BJJ, Tobert DG. The use of machine learning prediction models in spinal surgical outcome: an overview of current development and external validation studies. Semin Spine Surg. 2021;33:100872. [Google Scholar]
- 15.Peck W.A., Burckhardt P, Christiansen C, et al. Consensus development conference: diagnosis, prophylaxis, and treatment of osteoporosis. Am J Med. 1993;94:646–650. [DOI] [PubMed] [Google Scholar]
- 16.Surgical treatment of severe osteoporosis including new concept of advanced severe osteoporosis. Osteoporos Sarcopenia. 2017;3:164–169.
- 17.Zub LW, Szymczyk M, Pokryszko-Dragan A, Bilińska M. Evaluation of pain in patients with lumbar disc surgery using VAS scale and quantitative sensory testing. Adv Clin Exp Med. 2013;22:411–419.
- 18.Zung WW. A rating instrument for anxiety disorders. Psychosomatics. 1971;12:371–379.
- 19.Grotle M, Småstuen MC, Fjeld O, et al. Lumbar spine surgery across 15 years: trends, complications and reoperations in a longitudinal observational study from Norway. BMJ Open. 2019;9:e028743.
- 20.Solumsmoen S, Poulsen G, Kjellberg J, Melbye M, Munch TN. The impact of specialised treatment of low back pain on health care costs and productivity in a nationwide cohort. EClinicalMedicine. 2022;43:101247.
- 21.Ravindra VM, Senglaub SS, Rattani A, et al. Degenerative lumbar spine disease: estimating global incidence and worldwide volume. Global Spine J. 2018;8:784–794.
- 22.Wang YXJ, Káplár Z, Deng M, Leung JCS. Lumbar degenerative spondylolisthesis epidemiology: a systematic review with a focus on gender-specific and age-specific prevalence. J Orthop Translat. 2017;11:39–52.
- 23.Yoshihara H, Yoneoka D. National trends in the surgical treatment for lumbar degenerative disc disease: United States, 2000 to 2009. Spine J. 2015;15:265–271.
- 24.Cho JH, Joo YS, Lim C, Hwang CJ, Lee DH, Lee CS. Effect of one- or two-level posterior lumbar interbody fusion on global sagittal balance. Spine J. 2017;17:1794–1802.
- 25.Bydon M, Macki M, Abt NB, et al. The cost-effectiveness of interbody fusions versus posterolateral fusions in 137 patients with lumbar spondylolisthesis. Spine J. 2015;15:492–498.
- 26.Cheng X, Zhang K, Sun X, et al. Clinical and radiographic outcomes of bilateral decompression via a unilateral approach with transforaminal lumbar interbody fusion for degenerative lumbar spondylolisthesis with stenosis. Spine J. 2017;17:1127–1133.
- 27.Farrokhi MR, Rahmanian A, Masoudi MS. Posterolateral versus posterior interbody fusion in isthmic spondylolisthesis. J Neurotrauma. 2012;29:1567–1573.
- 28.Liu G, Liu W, Jin D, Yan P, Yang Z, Liu R. Clinical outcomes of unilateral biportal endoscopic lumbar interbody fusion (ULIF) compared with conventional posterior lumbar interbody fusion (PLIF). Spine J. 2023;23:271–280.
- 29.Yang P, Liang X, Xu X, et al. Incidence and predictive factors of new onset postoperative sacroiliac joint pain after posterior lumbar fusion surgery for degenerative lumbar disease. J Pain Res. 2023;16:4291–4299.
- 30.Abbott AD, Tyni-Lenné R, Hedlund R. Leg pain and psychological variables predict outcome 2-3 years after lumbar fusion surgery. Eur Spine J. 2011;20:1626–1634.
- 31.McGirt MJ, Sivaganesan A, Asher AL, Devin CJ. Prediction model for outcome after low-back surgery: individualized likelihood of complication, hospital readmission, return to work, and 12-month improvement in functional disability. Neurosurg Focus. 2015;39:E13.
- 32.Schönnagel L, Caffard T, Vu-Han TL, et al. Predicting postoperative outcomes in lumbar spinal fusion: development of a machine learning model. Spine J. 2024;24:239–249.
- 33.Kim JS, Merrill RK, Arvind V, et al. Examining the ability of artificial neural networks machine learning models to accurately predict complications following posterior lumbar spine fusion. Spine (Phila Pa 1976). 2018;43:853–860.
- 34.Hébert JJ, Bigney EE, Nowell S, et al. Outcome prediction following lumbar disc surgery: a longitudinal study of outcome trajectories, prognostic factors, and risk models. J Neurosurg Spine. 2024;42:33–42; 1–10.
- 35.Guven AE, Schönnagel L, Chiapparelli E, et al. Relationship between lumbar foraminal stenosis and multifidus muscle atrophy—a retrospective cross-sectional study. Spine (Phila Pa 1976). 2024.
- 36.Liu Y, Liu Y, Hai Y, et al. Multifidus muscle fatty infiltration as an index of dysfunction in patients with single-segment degenerative lumbar spinal stenosis: a case-control study based on propensity score matching. J Clin Neurosci. 2020;75:139–148.
- 37.Chua M, Hochberg U, Regev G, et al. Gender differences in multifidus fatty infiltration, sarcopenia and association with preoperative pain and functional disability in patients with lumbar spinal stenosis. Spine J. 2022;22:58–63.
- 38.Sun W, Xue C, Tang X, et al. Selective versus multi-segmental decompression and fusion for multi-segment lumbar spinal stenosis with single-segment degenerative spondylolisthesis. J Orthop Surg Res. 2019;14:46.
- 39.Guan J, Zhao D, Liu T, et al. Correlation between surgical segment mobility and paravertebral muscle fatty infiltration of upper adjacent segment in single-segment LDD patients: retrospective study at a minimum 2 years’ follow-up. BMC Musculoskelet Disord. 2023;24:28.
- 40.Nguyen AQ, Harvey JP, Federico VP, et al. The effect of changes in segmental lordosis on global lumbar and adjacent segment lordosis after L5-S1 anterior lumbar interbody fusion. Global Spine J. 2025;15:112–120.
- 41.Uribe JS, Myhre SL, Youssef JA. Preservation or restoration of segmental and regional spinal lordosis using minimally invasive interbody fusion techniques in degenerative lumbar conditions: a literature review. Spine (Phila Pa 1976). 2016;41(suppl 8):S50–S58.
- 42.Wang D, Chen X, Han D, Wang W, Kong C, Lu S. Radiographic and surgery-related predictive factors for increased segmental lumbar lordosis following lumbar fusion surgery in patients with degenerative lumbar spondylolisthesis. Eur Spine J. 2024;33:2813–2823.
- 43.Ham DW, Kim HJ, Park SM, Park SJ, Park J, Yeom JS. The importance of thoracolumbar junctional orientation, change in thoracolumbar angle, and overcorrection of lumbar lordosis in development of proximal junctional kyphosis in adult spinal deformity surgery. J Neurosurg Spine. 2022;37:874–882.
- 44.Daniels AH, McDonald CL, Diebo BG. Segmental lordosis restoration during lumbar degenerative spinal fusion: surgical techniques and outcomes. J Am Acad Orthop Surg. 2024.
- 45.Jang HD, Park SS, Kim K, et al. Is routine use of drain really necessary for posterior lumbar interbody fusion surgery? A retrospective case series with a historical control group. Global Spine J. 2023;13:621–629.
- 46.Karamian B, Kothari P, Toci G, et al. Effect of drain duration and output on perioperative outcomes and readmissions after lumbar spine surgery. Asian Spine J. 2023;17:262–271.
- 47.Shi G, Liu G, Gao Q, et al. A random forest algorithm-based prediction model for moderate to severe acute postoperative pain after orthopedic surgery under general anesthesia. BMC Anesthesiol. 2023;23:361.
- 48.Christodoulou E, Ma J, Collins GS, Steyerberg EW, Verbakel JY, Van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12–22.
- 49.Chicco D, Jurman G. Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone. BMC Med Inform Decis Mak. 2020;20:16.
- 50.Chen C, Breiman L. Using Random Forest to Learn Imbalanced Data. University of California; 2004.




