Abstract
Background:
Patients with severe lumbar disc herniation (LDH), particularly those complicated by spinal stenosis or vertebral instability, frequently require posterior lumbar interbody fusion to alleviate nerve compression and reconstruct spinal biomechanical stability. Optimizing individualized surgical planning requires accurate predictive models derived from multidimensional clinical data.
Methods:
In this retrospective, multicenter study, data were sourced from the Degenerative Spine Diseases in China project (DSDC2024, NCT05867732). The model was trained on 3055 cases and externally validated across four geographically distinct cohorts (n = 3186). Using a two-stage ensemble framework, we first applied Lasso regression to select predictive variables from 38 clinically accessible features (demographics, comorbidities, surgical parameters, and laboratory indices), then integrated XGBoost, random forest, and logistic regression through stacked generalization. Bayesian optimization with 10-fold cross-validation refined hyperparameters, while decision curve analysis quantified clinical utility against traditional risk assessment methods. Shapley Additive exPlanations analysis quantified feature contributions and interaction effects.
Results:
Among the 70 algorithmic combinations evaluated, the combination of Lasso with Stack was the most predictive, achieving an average area under the receiver operating characteristic curve of 0.884. The top five predictors were the number of fusion levels, clinical course duration, preoperative hospitalization, preoperative hemoglobin, and preoperative albumin.
Conclusion:
The IBLED-LDH model provides a valuable tool for preoperative stratification of intraoperative blood loss risk, balancing predictive accuracy with interpretability through ensemble learning.
Keywords: ensemble learning, intraoperative blood loss, multicenter study, posterior lumbar interbody fusion, severe lumbar disc herniation
Introduction
According to statistics released by the Chinese Center for Disease Control and Prevention in 2017, low-back pain ranked fourth among non-fatal diseases leading to loss of life expectancy[1–3]. Lumbar disc herniation (LDH) is among the most common causes of low-back pain and affects all age groups[4]. The detection rate of LDH is highest among young people aged 25 to 39 years, reaching 13.93%[5]. LDH refers to a group of syndromes mainly caused by rupture of the lumbar disc annulus, with disc material protruding into the spinal canal and compressing the dural sac or spinal nerve roots, causing low back pain, femoral nerve pain, or sciatica[6,7]. Patients with LDH typically begin with nonsurgical treatments and proceed to spinal surgical intervention if those are not effective[8,9].
Compared with lumbar discectomy, posterior lumbar interbody fusion (PLIF), which was developed subsequently, has become an intervention of choice for long-term stability. This procedure involves the insertion of titanium screws through the pedicle and connection with titanium rods, facilitating the thorough removal of protruding intervertebral disc tissue, thereby relieving compression on neural tissue and restoring intervertebral height. Simultaneously, permanent stability is achieved through bone fusion, resulting in stable reconstruction[10]. However, this traditional open surgery is highly invasive and involves intraoperative detachment of muscle and nerve, which carries a high risk of intraoperative bleeding[11]. Excessive blood transfusion imposes a substantial economic burden on the healthcare system[12]. Therefore, identifying intraoperative blood loss (IBL) risk predictors in PLIF could enhance preoperative risk stratification and optimize surgical planning.
Artificial intelligence (AI) has reshaped medicine in recent years, transforming healthcare delivery through data-driven innovations[13,14]. Machine learning (ML), a pivotal component of AI, emulates human intelligence to analyze, interpret, and extract information from big data. Furthermore, integrating multiple algorithms through ensemble learning can improve on the performance of a single algorithm[15,16], enabling deeper exploration of extensive datasets and providing a scientific reference for clinical decision-making. We have been advocating multicenter, collaborative research on shared clinical problems. Such collaboration leverages the academic strengths of clinicians and advances clinical decision-making systems, particularly by integrating dispersed clinical centers into a logically unified clinical big data framework and thereby establishing a multicenter clinical big data application platform. Accordingly, we developed an ensemble prediction model using multicenter Chinese datasets to enable personalized preoperative risk stratification of IBL in PLIF candidates with severe LDH.
Methods
Study design
We conducted a multicenter observational cohort study structured into five methodological phases: (1) data collection and randomization, (2) model development, (3) comparative algorithm analysis, (4) subgroup bias assessment, and (5) interpretative validation. The total cohort was divided into a training group and 4 validation groups, and the predictive efficacy was evaluated by the combination of 7 feature-selection methods and 10 ML algorithms. Subsequently, we conducted a thorough assessment of subgroup bias, which was stratified based on age, sex, chronic conditions, and spinal comorbidities. To enhance the interpretability of our findings, we employed Shapley Additive exPlanations (SHAP) to offer a comprehensive interpretation and illustrate the application of our model through the case study of individual patients. Our study was approved by the institutional review board of all participating institutions (Ethical number: 2022-IRB-04). Furthermore, this study adhered to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline[17] and Strengthening The Reporting Of Cohort Studies in Surgery checklist[18].
HIGHLIGHTS
Established the largest predictive model for posterior lumbar interbody fusion (PLIF) blood loss in 6241 lumbar disc herniation patients sourced from 22 medical institutions.
The optimal Lasso-Stack ensemble algorithm, selected from 70 algorithm combinations, achieved an area under the receiver operating characteristic curve of 0.884 for PLIF blood loss prediction.
The model’s excellent discrimination, calibration, and clinical utility were validated through cumulative curves, calibration curves, and decision curve analysis. Shapley Additive exPlanations plots were used to enhance model interpretability and address bias risks and the black-box issue, facilitating clinical adoption.
Patient resources and criteria
This multicenter retrospective study analyzed data from the Degenerative Spine Diseases in China project (DSDC2024; https://clinicaltrials.gov/study/NCT05867732?term=NCT05867732), a collaborative effort encompassing 22 Chinese medical institutions from January 2015 to January 2022. This database was designed to facilitate the identification of preventable complications and to support the development of an assisted decision-making system for patients with degenerative spine diseases. Participants were allocated to a training group and four validation groups according to geographical location, with the aim of capturing the diverse characteristics of patients from various regions within China.
All included patients had a confirmed diagnosis of LDH, based on clinical and imaging diagnosis. The researchers had access to the patients’ complete hospitalization records (blood routine, coagulation function, liver and kidney function, and other indices) during the perioperative period. Surgical information recorded by our database was also available for analysis.
The exclusion criteria were as follows: cardiopulmonary insufficiency, hepatic or renal dysfunction, or a malignant tumor. In addition, patients with other intraoperative complications that may result in excessive blood loss (anatomic variants) were excluded. Cases with incomplete records were also excluded from the study.
Outcome and variables inspection
The outcome of interest was the risk of IBL, which was defined as exceeding the 75th percentile of blood loss in our multicenter cohort (i.e. > 400 mL)[19–21], a hemoglobin drop of ≥10 g/dL – typically indicative of approximately 400 mL of blood loss[22] – or the occurrence of intraoperative blood transfusion, which served as an objective marker of excessive bleeding.
IBL was documented by anesthesiologists and circulating nurses using a standardized dual-method protocol (negative pressure suction plus gauze weighing) under unified institutional guidelines: suction blood volume was calculated by subtracting lavage fluid from the total suction container contents, while gauze blood loss was quantified as the difference between the used and dry gauze weights (1 g = 1 mL), ensuring consistent quantification across all cases.
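The measurement protocol and the composite outcome definition above can be sketched in a few lines. This is a minimal illustration, not the study's code (the authors worked in R): the record layout and all field and function names are hypothetical, and the default cutoffs simply repeat the values stated in the text, with the hemoglobin drop expressed in the same unit as its cutoff.

```python
from dataclasses import dataclass


@dataclass
class BloodLossRecord:
    """Hypothetical record layout; field names are illustrative, not from the study database."""
    suction_total_ml: float  # total contents of the suction container
    lavage_ml: float         # lavage fluid collected in the same container
    gauze_dry_g: float       # gauze weight before use
    gauze_used_g: float      # gauze weight after use
    hb_drop: float           # hemoglobin drop, in the same unit as hb_drop_cutoff
    transfused: bool         # whether an intraoperative transfusion occurred


def measured_ibl_ml(rec: BloodLossRecord) -> float:
    """Suction blood (container contents minus lavage) plus gauze blood (1 g = 1 mL)."""
    suction_blood_ml = rec.suction_total_ml - rec.lavage_ml
    gauze_blood_ml = rec.gauze_used_g - rec.gauze_dry_g
    return suction_blood_ml + gauze_blood_ml


def is_excessive_ibl(rec: BloodLossRecord,
                     volume_cutoff_ml: float = 400.0,
                     hb_drop_cutoff: float = 10.0) -> bool:
    """Composite outcome: any one of the three criteria flags excessive IBL."""
    return (
        measured_ibl_ml(rec) > volume_cutoff_ml
        or rec.hb_drop >= hb_drop_cutoff
        or rec.transfused
    )


# Two made-up cases for illustration only.
high = BloodLossRecord(900.0, 600.0, 50.0, 180.0, hb_drop=4.0, transfused=False)
low = BloodLossRecord(700.0, 500.0, 40.0, 90.0, hb_drop=4.0, transfused=False)
print(measured_ibl_ml(high), is_excessive_ibl(high))  # 430.0 True
print(measured_ibl_ml(low), is_excessive_ibl(low))    # 250.0 False
```

Keeping the cutoffs as parameters makes it easy to recompute the label if a cohort-specific 75th percentile differs from 400 mL.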
Given that this study marked the inaugural application of ensemble learning in evaluating IBL among patients with LDH undergoing PLIF, we embarked on a thorough review of pertinent prior studies[23,24]. Then, referring to the results of previous studies, the baseline information and the potential related variables from the electronic medical record system were collected in their entirety. To further ensure the comprehensiveness of our variable list, we consulted with experts, ultimately identifying the candidate variables pivotal to the subsequent phase of model development. Rigorous data quality control measures were implemented. All data were thoroughly examined for completeness and manually cleansed. Instances of missing values (less than 5%) were meticulously addressed using the random forest imputation technique, ensuring the integrity and reliability of our dataset[25,26].
Overview of model development
Feature selection
The training data contained several redundant features; removing them did not lead to a loss of information and simplified the model. Beyond Univariate Screening and including all features (ALL) in the model, we used four variable screening methods: Boruta[27], least absolute shrinkage and selection operator (Lasso)[28], Genetic Algorithm[29], and Simulated Annealing[30]. The variables selected by these six methods were subsequently analyzed using RVenn to identify their intersection (detailed in Supplemental Digital Content).
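The intersection step can be illustrated with plain set operations. The sketch below is in Python rather than the RVenn package used in the study, and the feature names and per-method selections are invented placeholders, not the study's results.

```python
from functools import reduce

# Hypothetical selections; the real per-method results are in the supplement.
selected = {
    "Boruta": {"fusion_levels", "course_duration", "hemoglobin", "albumin"},
    "Lasso": {"fusion_levels", "hemoglobin", "albumin", "bone_graft"},
    "GeneticAlgorithm": {"fusion_levels", "hemoglobin", "age"},
    "SimulatedAnnealing": {"fusion_levels", "hemoglobin", "course_duration"},
}


def common_features(selections):
    """Return the features chosen by every selection method."""
    return reduce(set.intersection, (set(s) for s in selections.values()))


print(sorted(common_features(selected)))  # ['fusion_levels', 'hemoglobin']
```

Features that survive every method are the least sensitive to the choice of selector, which is the motivation for inspecting the central region of the Venn diagram.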
ML algorithms
Regarding ML algorithms, we used nine diverse approaches to develop the optimal prediction model: Logistic Regression, Decision Tree[31], Elastic Net[32], K-nearest neighbor[33], LightGBM[34], Support Vector Machine[35], Multilayer Perceptron[36], Random Forest[37], and XGBoost[38]. Furthermore, a stacked algorithm was constructed based on Lasso regression, which integrated the nine base models listed above. In total, 10 models were established, with the nine base models developed using the tidymodels R package. Internal validation was conducted by 10-fold cross-validation, and the Bayesian optimization algorithm was used to determine the best hyperparameter setting[39,40].
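To make the resampling step concrete, the following sketch shows how 10-fold cross-validation partitions a training set and how out-of-fold predictions are collected. It is a simplified Python illustration, not the tidymodels workflow used in the study; `fit_prevalence` is a deliberately trivial, hypothetical base learner, and the assumption that a stacking meta-learner is fed out-of-fold predictions reflects common practice rather than a detail reported by the authors.

```python
import random


def kfold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and deal them into n_folds nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def out_of_fold_predictions(X, y, fit, n_folds=10, seed=0):
    """One prediction per sample, each from a model that never saw that sample."""
    oof = [None] * len(X)
    for held_out in kfold_indices(len(X), n_folds, seed):
        held = set(held_out)
        train_X = [x for i, x in enumerate(X) if i not in held]
        train_y = [t for i, t in enumerate(y) if i not in held]
        predict = fit(train_X, train_y)
        for i in held_out:
            oof[i] = predict(X[i])
    return oof


def fit_prevalence(train_X, train_y):
    """Toy base learner (hypothetical): always predicts the training event rate."""
    rate = sum(train_y) / len(train_y)
    return lambda x: rate


# Made-up data: 50 samples, every fifth one is an event.
X = [[i] for i in range(50)]
y = [1 if i % 5 == 0 else 0 for i in range(50)]
oof = out_of_fold_predictions(X, y, fit_prevalence)
print(len(oof), min(oof) >= 0.0, max(oof) <= 1.0)
```

In a stacked model, one such out-of-fold column per base learner would form the input matrix of the meta-learner, so the meta-learner is never trained on predictions a base model made for its own training cases.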
Model performance and evaluation
To achieve a predictive model with both high accuracy and generalizability, we crossed 7 feature selection methods with 10 ML algorithms, resulting in 70 unique combinations. The ensemble model’s performance was assessed in terms of area under the receiver operating characteristic curve (AUC) across the four validation groups. This evaluation was complemented by the area under the Precision-Recall Curve (AUC-PR) and precision values, providing a fuller picture of the model’s predictive capabilities[41,42]. Furthermore, Kolmogorov–Smirnov (K–S) curves were used to assess the model’s discriminatory power. The consistency index (CI) was calculated, and decision curve analysis (DCA) was conducted to evaluate the clinical utility and applicability of the model’s predictions. Additionally, to assess the fairness and robustness of the model, predefined subgroup analyses were performed to evaluate model bias across different variable subgroups.
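The evaluation grid and the averaged AUC can be sketched as follows. This is an illustrative Python example, not the authors' R code: the selector and learner names are placeholders, the labels and scores are made up, and the pairwise AUC computation is written for clarity rather than speed.

```python
from itertools import product


def auc_roc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def mean_auc(cohorts):
    """Unweighted mean AUC over (labels, scores) pairs, one pair per cohort."""
    return sum(auc_roc(y, s) for y, s in cohorts) / len(cohorts)


# Placeholder names: 7 feature-selection methods crossed with 10 algorithms.
selectors = [f"selector_{i}" for i in range(1, 8)]
learners = [f"learner_{j}" for j in range(1, 11)]
grid = list(product(selectors, learners))
print(len(grid))  # 70

toy = ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc_roc(*toy))  # 0.75
```

Ranking the 70 grid entries by `mean_auc` over the four validation cohorts mirrors the comparison described above; an unweighted mean treats each cohort equally regardless of its size.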
Model interpretation
To enhance the interpretability of our predictive model, we leveraged the SHAP plots, which offered consistent and locally accurate attribution values (SHAP values) for each feature within the training group[43]. Moreover, we independently presented SHAP force plots for patients with positive and negative clinical outcomes, providing further insights into the model’s decision-making process and enhancing its interpretability.
Statistical procedures
Summary statistics were presented as event numbers and percentages (%) for categorical data, and as medians and interquartile ranges for quantitative data. Continuous variables were compared using Student’s t-test or the Wilcoxon signed-rank test, and categorical variables were compared using the χ2 test or Fisher’s exact test. A two-sided P < 0.05 was considered significant. All statistical analyses were performed using R (version 4.3.2).
Results
The baseline characteristics of patients
A total of 6241 LDH patients who underwent PLIF at 22 hospitals between 2015 and 2022 and met the inclusion criteria were included in this study. The flow diagram of the study is shown in Figure 1. To check the external validity and general applicability of the model, we divided the 6241 patients into five datasets based on geographical regions in China: patients from eight institutions in Northwest China as the training set (Training, N = 3055), patients from four institutions located in Southwest China as the first validation cohort (Testing 1, N1 = 1031), patients from two hospitals in Southeast China as the second validation cohort (Testing 2, N2 = 814), patients from three hospitals in East China as the third validation cohort (Testing 3, N3 = 623), and patients from five hospitals in Northeast China as the fourth validation cohort (Testing 4, N4 = 718). The summary statistics of variable baseline characteristics in these five cohorts are presented in Supplemental Digital Content Table 1, available at: http://links.lww.com/JS9/E440.
Figure 1.
The flow diagram of the study.
The analysis revealed a notable balance in the baseline assessments among the cohorts, indicating that the subsequent steps of feature selection and model development can be pursued with confidence. The correlation heatmap of baseline characteristics with excessive IBL is depicted in Supplemental Digital Content Figure 1, available at: http://links.lww.com/JS9/E440.
A greater number of spinal levels fused, longer clinical course duration, and bone graft were associated with significantly increased IBL. Detailed correlations between study variables and excessive IBL are provided in Supplemental Digital Content Figure 2 and Supplemental Digital Content Table 2, available at: http://links.lww.com/JS9/E440.
Figure 2.
(A) AUC-ROC performance of 70 possible combination algorithms between feature selection methods and machine learning algorithms. We exhibited the respective values in four validation groups and the average results. (B) Above is the Venn diagram of feature intersections among six feature selection models and original features. Region 5 reflects the intersection of all methods. Below is the specific process of features selected by each model. There are 37 features in total. (C) The detailed process of 10-fold cross-validation of nine model building algorithms. (D) The process of integration of the algorithmic optimization.
Model development
The outcome of the feature selection process and the methodology adopted for graph construction are thoroughly detailed in Figure 2B and Supplemental Digital Content Figure 3, available at: http://links.lww.com/JS9/E440.
Figure 3.
(A) Model assessment in validation group 1: (a) Kolmogorov–Smirnov (K–S) curve; (b) decision curve. (B) Model fairness assessment through different subgroups.
To demonstrate the robustness of our model, a rigorous 10-fold cross-validation was conducted within the training set (Fig. 2C). It was observed that the Stack model achieved optimal calibration and precision when the penalty value converged at 0.0651 (Fig. 2D). Subsequently, a decline in both accuracy and area under the receiver operating characteristic curve (AUROC) values was evident, suggestive of model overfitting. By cross-modeling these two components, a total of 70 combinations were formed. The AUROC values of the four validation cohorts, along with their average AUROC values, are displayed in Figure 2A. Among the 70 combinations, the model integrating Lasso with the Stack algorithm exhibited the highest average AUROC value of 0.884 and AUC-PR value of 0.706. Model-related AUC-PR and precision values are shown in Supplemental Digital Content Figure 4, available at: http://links.lww.com/JS9/E440.
Model validation
The predictive performance of the first validation cohort is described in Figure 3A(a). The K–S statistic reached 0.6403 (range 0.6–0.7), demonstrating excellent risk stratification capacity of our model. This was further supported by DCA showing consistent net clinical benefit across a broad spectrum of threshold probabilities [Fig. 3A(b)], outperforming both “treat-all” and “treat-none” strategies. Notably, these findings were replicated in another three validation cohorts, with detailed K-S and DCA results provided in Supplemental Digital Content Figure 5, available at: http://links.lww.com/JS9/E440.
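For readers unfamiliar with the two validation metrics, the sketch below computes a K–S statistic and the net benefit used in decision curve analysis. It is a minimal Python illustration on made-up labels and scores, not the study's code; the net benefit formula is the standard one (true positives per patient minus false positives per patient weighted by the odds of the threshold probability).

```python
def ks_statistic(y_true, scores):
    """Largest gap between the score distributions of negatives and positives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    best = 0.0
    for thr in sorted(set(scores)):
        cdf_pos = sum(s <= thr for s in pos) / len(pos)
        cdf_neg = sum(s <= thr for s in neg) / len(neg)
        best = max(best, abs(cdf_neg - cdf_pos))
    return best


def net_benefit(y_true, scores, threshold):
    """Net benefit of acting on every patient whose score is >= threshold."""
    n = len(y_true)
    tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
    fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the 'treat-all' strategy; 'treat-none' is always 0."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Made-up labels and predicted risks for six patients.
y = [0, 0, 1, 0, 1, 1]
p = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]
print(round(ks_statistic(y, p), 4))        # 0.6667
print(round(net_benefit(y, p, 0.5), 4))    # 0.1667
print(round(net_benefit_treat_all(y, 0.5), 4))
```

A decision curve repeats the `net_benefit` calculation over a range of threshold probabilities and plots it against the treat-all and treat-none lines; the model is useful wherever its curve lies above both.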
Subgroup bias analysis
We analyzed subgroups based on age, gender, presence of diabetes, concomitant lumbar spinal stenosis, and concomitant lumbar spondylolisthesis. The ensemble prediction model showed stable and reliable predictive performance across all predefined subgroups (see Fig. 3B). This stability across diverse subpopulations supported the robustness of our model and its potential for widespread clinical application.
Model interpretation and visualization
Based on the game-theoretic premise that a prediction results from a combination of factors with unequal contributions, Shapley values were applied to gain insight into the decision-making process of the stacking model[44,45]. Figure 4A presents the ranking plot of the top 23 important predictors of IBL, including a greater number of spinal levels fused, longer clinical course duration, and lower preoperative hemoglobin. Figure 4B and C display the average SHAP values for continuous and categorical variables, respectively, providing a quantitative understanding of their relative impacts. Furthermore, we present SHAP force plots for two representative patients, one with excessive IBL and the other with normal blood loss volume, to visually demonstrate how individual features contribute to the overall prediction (detailed explanations in Supplemental Digital Content Figure 6, available at: http://links.lww.com/JS9/E440).
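The idea behind Shapley attribution can be shown exactly on a toy example. The sketch below enumerates all feature coalitions for a three-feature risk function; the function, its coefficients, and the feature names are invented for illustration and are unrelated to the fitted IBLED-LDH model, which would in practice be explained with an approximation such as the SHAP library rather than full enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of coalitions of size k in the Shapley average.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_risk(present):
    """Hypothetical risk score with one interaction term (all numbers invented)."""
    risk = 0.20
    if "fusion_levels" in present:
        risk += 0.30
    if "low_hemoglobin" in present:
        risk += 0.10
    if "long_course" in present:
        risk += 0.05
    if {"fusion_levels", "low_hemoglobin"} <= present:
        risk += 0.06  # interaction shared equally between the two features
    return risk


features = ["fusion_levels", "low_hemoglobin", "long_course"]
phi = shapley_values(toy_risk, features)
for name in features:
    print(name, round(phi[name], 4))
```

The attributions sum to the difference between the full prediction and the baseline (0.51 here), which is the local accuracy property that makes force plots additive.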
Figure 4.
SHAP method for model interpretation. (A) Global feature importance in descending order. (B) SHAP approach for continuous variables. (C) SHAP approach for category features.
By comparing the 70 model combinations, we optimized the number of included variables while preserving model performance. To further enhance the transparency of our model’s predictions and prioritize the key descriptors that influence the prediction outcomes, we applied the SHAP explanation technique. Ultimately, we developed a user-friendly web-based calculator (https://nomogram2021.shinyapps.io/IBLED_LDH/) that facilitates the clinical application of this prediction model, thereby aiding in the preoperative assessment of intraoperative bleeding risk.
Discussion
It is of great significance to assess the risk of excessive IBL in severe LDH patients before PLIF. The high invasiveness and tissue detachment associated with PLIF contribute significantly to an increased risk of IBL and secondary injury, the latter being a notable source of postoperative hidden blood loss[46]. IBL is a crucial parameter to evaluate the quality of surgery[47].
In this multicenter study, we developed, calibrated, and evaluated the IBLED-LDH model to assist clinicians in preoperative individual risk assessment. Risk evaluation models remain scarce in spine surgery, and our IBLED-LDH model demonstrated excellent predictive performance (AUROCs ranged from 0.873 to 0.904 across the four validation groups). Our ensemble model retained the 23 most important predictors out of 38 features and outperformed models that included all features. Using fewer variables to achieve accurate predictions enhances the model’s practicality in clinical decision-making. Furthermore, Shapley force plots provided a visual representation of the contribution of individual features, thereby improving model interpretability. This transparency not only facilitates clinicians’ understanding of the model’s predictions but also enables them to make more informed decisions regarding patient care.
In our study, multi-segmental fusion was strongly associated with an increased risk of IBL. In the forest plot, the odds of excessive IBL with two-segment fusion were 4.02 times those with one-segment fusion, that is, an increase of 3.02 times (adjusted odds ratio [OR] = 4.02, 95% CI = 3.23–5.01), whereas the odds with three or more segments were 12.71 times as high, an increase of 11.71 times (adjusted OR = 12.71, 95% CI = 9.33–17.4). Multi-segment spinal surgery entails a large surgical exposure and extensive muscle dissection, accompanied by more intraoperative bleeding and longer surgical time. It was reported that multi-segment fusion increased the risk of epidural venous plexus injury, leading to the formation of epidural hematoma[48]. Notably, in our Shapley feature importance ranking, the number of fused segments emerged as the top-ranking feature, accounting for a large share of the model’s predictions. Longer clinical course duration and preoperative hospitalization, which suggest a more complex condition involving more affected lumbar discs and more severe local adhesions, also increased the risk of IBL. Patients with lower preoperative hemoglobin and albumin are widely recognized as high-risk populations for excessive IBL[49].
The rising incidence of LDH and spinal corrective surgery in China has heightened the need to assess patients’ risk of IBL. Adequate preoperative preparation, including meticulous surgical approach planning, correction of comorbidities, and monitoring of blood parameters, directly contributes to reducing IBL risk and enhancing postoperative recovery. However, research cannot be deemed complete until the relevance of predictor variables is comprehensively clarified and quantitatively integrated into a predictive model capable of generating patient-specific outcomes.
Relying on a single feature-selection method or ML algorithm can compromise the performance of a predictive model. To address this, we adopted an ensemble modeling approach that integrated various feature-selection methods with multiple ML algorithms. Our analysis revealed the superiority of Lasso regression combined with the Stack model across four validation cohorts. Notably, by including fewer variables, we achieved superior predictive performance compared with a model that encompassed all variables. Furthermore, cumulative curves, calibration curves, and DCA demonstrated the excellent discrimination, calibration, and clinical utility of our model in generating individualized predictions. These findings underscore the importance of a comprehensive and multifaceted approach to developing predictive models for IBL, enabling more accurate and tailored risk assessments for patients undergoing spinal surgeries.
To address potential overfitting from testing 70 model combinations, we applied Bayesian optimization with 10-fold cross-validation for robust hyperparameter tuning. This approach allowed us to evaluate candidate hyperparameters across multiple validation folds, thereby reducing the risk of overfitting to a single data partition and ensuring a balance between bias and variance (with a maximum of 50 optimization iterations per model). Meanwhile, the model’s generalizability was confirmed through four independent, geographically stratified external validation cohorts (n = 3186), where the selected Lasso-Stack model demonstrated highly consistent performance (average ΔAUC = −0.0005). These methodological strategies not only enhanced model stability and reliability but also serve as a potential reference framework for future predictive modeling studies in surgical settings.
Currently, the evidence from traditional association studies of IBL and from low-dimensional ML models remains limited. To our knowledge, this study represents the largest cohort for assessing IBL risk in LDH patients undergoing PLIF, and cross-comparison of multiple algorithms provides further evidence for the robustness and accuracy of the model. The risk of bias and the black-box nature of ML are barriers to clinician adoption of prediction models[50,51]. To address these concerns, we employed SHAP plots, which offer a comprehensive visualization of both global features and specific samples, enhancing the clarity and interpretability of our ensemble model. Additionally, the consistent performance across clinical subgroups further strengthened the credibility of our model’s stability.
Nonetheless, this study has some limitations. First, as a retrospective study relying on data collected from electronic medical record systems, the real-world performance of our model requires further validation in prospective cohorts. Second, while excluding anatomical variants enhanced internal validity, it limits generalizability to complex cases. However, we recognize the critical role of anatomical heterogeneity in PLIF bleeding risk prediction and have initiated a registered clinical study to develop anatomy-stratified prediction algorithms. Lastly, validation in international cohorts and time-series validation are still needed.
Conclusion
In this study, we developed the IBLED-LDH model, which exhibited excellent predictive performance and a high degree of interpretability. The model combines the Lasso selection method with the Stack learning algorithm and was validated in multicenter external cohorts; it aims to assist clinicians in assessing IBL risk for LDH patients undergoing PLIF. Factors such as levels of fusion, clinical course duration, and hemoglobin levels ranked among the most influential determinants in the model. Interactive SHAP plots were generated to visualize both global feature importance and patient-specific risk trajectories. Overall, the IBLED-LDH model represents a novel decision-support tool for optimizing perioperative management in spinal fusion surgery, advancing personalized spine surgical care.
Acknowledgements
The authors would like to express their sincere gratitude to the clinical fellows of each sub-center for their contributions. Some of them performed the surgeries and provided complete data, while others advised on data processing and statistical methods. The names of these contributors are listed in separate supplementary materials. This work was supported by the Extreme Smart Analysis platform (https://www.xsmartanalysis.com/).
Footnotes
N.S., Y.Z., and Z.D. contributed equally and share first authorship.
Sponsorships or competing interests that may be relevant to content are disclosed at the end of this article.
Supplemental Digital Content is available for this article. Direct URL citations are provided in the HTML and PDF versions of this article on the journal's website, www.lww.com/international-journal-of-surgery
Published online 19 June 2025
Contributor Information
Ning Shen, Email: paulshenyh@163.com.
Yusi Zhang, Email: nicola0908@163.com.
Zhendong Ding, Email: dingzhd66@csu.edu.cn.
Xubin Quan, Email: quanxubin@163.com.
Xiaozhu Liu, Email: xiaozhuliu2021@163.com.
Yang Zhang, Email: 1565983618@qq.com.
Tianyu Xiang, Email: 421973525@qq.com.
Yingang Zhang, Email: zyingang@mail.xjtu.edu.cn.
Chengliang Yin, Email: chengliangyin@163.com.
Ethical approval
Our study was approved by the institutional review board of all participating institutions (Ethical number: 2022-IRB-04).
Consent
Not applicable.
Source of funding
This study was supported partly by the Shaanxi Provincial Health and Health Research Fund Project (2022E006), and the “co-PI” project from the Third Xiangya hospital of Central South University (No. 202405).
Author contributions
Data curation, formal analysis, investigation, writing – original draft: N.S.; data curation, formal analysis, methodology, writing – original draft: Y.Z.; data curation, investigation, writing – original draft: Z.D.; investigation, methodology, software: R.L.; investigation, methodology, software: X.Q.; investigation, methodology, software: X.L.; investigation, methodology, software: Y.Z.; investigation, methodology, software: T.X.; investigation, methodology, software: Y.Z.; conceptualization, formal analysis, supervision, methodology, review and editing: C.Y.; conceptualization, formal analysis, funding acquisition, methodology, review and editing: W.L.
Conflicts of interest disclosure
No conflicts of interest.
Research registration unique identifying number (UIN)
The data utilized in this study were sourced from the Degenerative Spine Diseases in China (DSDC2024, NCT05867732).
Guarantor
Wenle Li.
Provenance and peer review
Not commissioned, externally peer-reviewed.
Data availability statement
Data are available from the corresponding author upon reasonable request.
Additional information
The machine learning models have been incorporated in a web-based calculator (https://nomogram2021.shinyapps.io/IBLED_LDH/). The complexity of the ensemble model can cause web calculators to run slowly.