Abstract
Introduction
Traumatic brain injury (TBI) remains a major public health concern in India, with high mortality and long-term disability. Existing prognostic models, mostly developed in high-income countries using traditional methods, lack generalisability to the Indian context and do not use the potential of machine learning or multicentric data. This study primarily aims to develop, compare and validate machine learning methods, including the traditional approach, to predict 30-day mortality and 6-month functional outcomes in patients with moderate or severe TBI. A secondary objective is to describe and compare admission characteristics and outcomes (at discharge, 3 months, 6 months and 1 year) in TBI patients in tertiary care settings using descriptive analyses.
Methods and analysis
Data from the neurotrauma registry at Jai Prakash Narayan Apex Trauma Centre, department of neurosurgery, All India Institute of Medical Sciences (AIIMS), New Delhi, including patients admitted between 23 March 2022 and 22 September 2024, will be used for model development and internal validation. For external validation, retrospectively collected data from the same centre (May 2010 to August 2013) and prospectively collected data from AIIMS Patna (1 June 2022 to 30 November 2024) and Rajiv Gandhi Government General Hospital, Madras Medical College (MMC), Chennai (1 May 2022 to 31 October 2024) will be included. Prediction models for 30-day mortality and 6-month functional outcomes will be developed using both machine learning and traditional statistical techniques. Model performance will be evaluated based on discrimination, calibration and clinical utility, with the latter assessed through decision curve analysis (DCA). An online risk calculator will be developed based on the best-performing model to estimate outcome probabilities along with 95% CIs.
Ethics and dissemination
The institutional Ethics Review Board of respective data collection centres, that is, AIIMS, New Delhi, AIIMS, Patna, and MMC, Chennai, approved the study. Findings will be published in peer-reviewed journals and disseminated at national and international conferences.
Discussion
This study will develop and validate prognostic models using traditional and machine learning methods tailored to the Indian TBI context. Multicentric, prospectively collected data will enhance generalisability, while clinical utility will be evaluated through DCA. Adherence to Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis + Artificial Intelligence (TRIPOD+AI) guidelines ensures methodological transparency. With external validation, these models may improve clinical decision-making, resource planning and patient-family communication in diverse Indian healthcare settings.
Keywords: Brain Injuries, INTENSIVE & CRITICAL CARE, Prognosis, Clinical Protocols, ACCIDENT & EMERGENCY MEDICINE
STRENGTHS AND LIMITATIONS OF THIS STUDY
The study is one of the first multicentric Indian studies to develop and externally validate prognostic models for traumatic brain injury (TBI) across diverse geographical regions.
For the first time in Indian TBI cohorts, the clinical utility of prognostic models in TBI will be assessed using decision curve analysis to evaluate their net benefit across different threshold probabilities.
An online risk calculator/tool will be deployed for clinical usefulness.
This study will be reported according to the most recent Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis + Artificial Intelligence guideline.
A high proportion of missing 6-month outcome data may affect model precision, and the use of data from a specialised level one trauma centre could limit generalisability to broader healthcare settings.
Introduction
In recent years, the field of machine learning (ML) has developed new and sophisticated statistical and computational techniques to process high-dimensional data, which have diverse applications in science and engineering.1 In medicine, there is great potential for ML approaches to transform care, offer approaches for diagnostic evaluation and automatically personalise therapeutic decisions on par with or better than clinicians.2 Such approaches (so-called big data solutions) might also prove valuable for analysing time-dependent neuromonitoring data, predicting real-time events and characterising physiological states that respond to specific therapies.1 ML may generate substantial improvements in neurosurgery. It has great potential to predict outcomes accurately and facilitate clinical decision-making about critically ill patients in neurosurgery and other medical domains.3 To help clinicians provide reliable information to patients and relatives and to facilitate the comparative audit of care between centres and countries, prognostic models are essential. There is an urgent need for further development, validation and implementation of prognostic models in the case of patients with traumatic brain injury (TBI).1 TBI is a heterogeneous and complex condition in which many mechanisms and pathways can lead to mortality and poor long-term outcomes.4,7 In such cases, intelligent application of these modern statistical approaches can improve understanding of the pathology and treatment of TBI. It can also benefit prognostication, hypothesis generation and stratification of patients in research studies.8 9
Rationale
One of the most widely used risk calculators for TBI is the Corticosteroid Randomisation After Significant Head Injury (CRASH) risk calculator, which provides predictions of mortality and unfavourable outcomes based on patient characteristics.10 Even though the CRASH model has international relevance, its development was not specifically tailored to the unique demographic and clinical variations across the Indian subcontinent. Given the diverse demographic and clinical profiles seen in Indian hospitals, using data from different sectors of the Indian subcontinent for developing and validating prognostic models is crucial but not widely established. There is a need to create region-specific models to provide more accurate predictions, as existing models like CRASH may not fully capture the distinct factors affecting TBI outcomes in India, such as variations in injury mechanisms, healthcare access and treatment protocols. Additionally, no previous studies in India have evaluated the clinical utility of TBI prognostic models using decision curve analysis (DCA), highlighting a critical gap in applying outcome prediction tools in Indian settings. Developing models tailored to local populations can enhance predictive accuracy and improve clinical decision-making in TBI care across India.
More research is needed to identify optimal machine-learning algorithms for different prediction problems. To avoid research waste, model development and validation methodologies should be properly designed and reported. More attention is urgently and regularly needed to the calibration performance of traditional regression approaches and ML models.11 The purpose of a predictive model is to provide valid outcome predictions for new patients.12 Essentially, the data set for developing a model is not of interest other than learning for the future. Therefore, validation is an important part of the predictive modelling process. In choosing a modelling technique for prediction for particular settings, the performance of the resulting model at external validation is a critical factor.11 The Jai Prakash Narayan Apex Trauma Centre (JPNATC), All India Institute of Medical Sciences (AIIMS), New Delhi (India), is India’s largest tertiary trauma care centre. It is currently one of the best-integrated level 1 trauma centres in our setting. This centre has a large data set with many characteristics.13 Other hospitals in different parts of India (Trauma Centre, AIIMS, Patna and Rajiv Gandhi Government General Hospital, Madras Medical College (MMC), Chennai) have a huge burden of managing and treating trauma patients. Now is the appropriate time to develop, validate and compare ML models and find the best suitable ML algorithm with optimal performance on external validation in our setting to evaluate outcomes after TBI.
This study primarily aims to develop, compare and validate ML methods, including the traditional approach, to predict 30-day mortality and 6-month functional outcomes in patients with moderate or severe TBI. A secondary objective is to describe and compare admission characteristics and outcomes (at discharge, 3 months, 6 months and 1 year) in TBI patients in tertiary care settings using descriptive analyses.
Methods
Study setting
For the development of models and internal validation, this study will use prospectively collected data from JPNATC under the Department of Neurosurgery, AIIMS, New Delhi (India) for patients admitted from 23 March 2022 to 22 September 2024 (JPN-TBI-P-Data). For external validation, we will use (1) retrospectively collected data from the same centre (JPNATC, AIIMS, New Delhi) for patients admitted between May 2010 and August 2013, using a cross-sectional design (JPN-TBI-R-Data) and (2) prospectively collected data from two additional centres: Trauma Centre, AIIMS, Patna (1 June 2022–30 November 2024) (TC-TBI-P-Data) and Rajiv Gandhi Government General Hospital, MMC, Chennai (1 May 2022–31 October 2024) (MMC-TBI-P-Data). Prospective data collection at each centre spans a period of two and a half years. The overall data collection period is from 02 February 2022 to 02 August 2025.
Inclusion and exclusion criteria
This study includes all patients with moderate or severe TBI, that is, patients with a Glasgow Coma Scale (GCS) score ≤12 at the time of admission to the emergency department (ED) who are subsequently admitted to the intensive care unit (ICU) under the Department of Neurosurgery. Only adult patients aged 18 years and above will be included in the study. We will exclude patients who are dead on arrival or who do not arrive at the ED within 72 hours of injury. Additionally, to ensure comparability in functional outcomes, only patients with a documented ability to perform basic activities of daily living independently prior to injury will be included.
Data management
As mentioned above, we are using four datasets for model development and validation—three sets of prospectively collected data (JPN-TBI-P-Data, TC-TBI-P-Data and MMC-TBI-P-Data) and one set of retrospectively collected data (JPN-TBI-R-Data). The prospective data are being collected from three study centres: AIIMS, New Delhi; AIIMS, Patna; MMC, Chennai. For AIIMS Patna and MMC Chennai, data are captured using standardised electronic case report forms (eCRFs) developed and hosted on the REDCap platform. Access to the REDCap system is restricted to authorised study personnel to ensure confidentiality and data security. Data entry is performed in real-time during hospitalisation and follow-up by trained research staff at each centre, ensuring contemporaneous and structured data capture aligned with clinical workflows. The complete structure and variables included in the eCRF are detailed in online supplemental file S1.
For AIIMS, New Delhi, data will be extracted from the existing neurotrauma registry, a structured, hospital-based clinical database that is prospectively maintained by the department of neurosurgery at JPNATC. This registry includes all variables aligned with the REDCap-based eCRFs used at the other participating centres, along with additional clinical variables. Functional outcome data at 6 months and 1 year are assessed using the Glasgow Outcome Scale–Extended (GOSE), and are collected by trained clinical staff through direct or telephonic follow-up.
Across all four datasets, the definitions of variables and the time points at which they are collected are harmonised to ensure consistency and comparability. All participating centres follow nationally and internationally recognised guidelines for the management of TBI, ensuring standardised and high-quality care across sites.14 15
Predictors and outcome(s)
Details of the candidate predictor variables considered for model development are provided in online supplemental file S2. We prepared the list of candidate predictors based on review of literature and clinical inputs. The study considers information based on demographics (age, gender), clinical severity (the motor GCS at admission, pupillary reactivity, limb movement and major extracranial injuries), secondary insult (hypoxia, hypotension), various CT findings of traumatic brain (midline shift, fracture, mass effect, contusion, subdural haematoma, epidural haematoma, dot haemorrhages, basal cistern effaced, presence of traumatic subarachnoid haemorrhage/intraventricular haematoma, non-evacuated haematoma, evacuated haematoma, decompressive craniectomy) and various blood results (levels of haemoglobin, glucose, sodium, creatinine) at first admission to hospital.
With each of the modelling techniques, we will develop prediction models for mortality at 30 days and for unfavourable functional outcome at 6 months. In the present study, the outcome assessment will be based on the GOS, at discharge, 3 months, 6 months and 1 year. The GOS is a 5-point score (dead ‘1’, vegetative state ‘2’, severely disabled ‘3’, moderately disabled ‘4’, good recovery ‘5’) assigned to patients with TBI at a given point in their recovery. It is a broad assessment of the overall functioning of a person who has sustained a TBI. The GOS score will be further dichotomised into favourable outcome (moderate disability or good recovery, GOS=4 and 5) and unfavourable outcome (death, persistent vegetative state or severe disability, GOS=1, 2, 3). Trained clinical staff will obtain the follow-up GOS at discharge, 3 months, 6 months and 1 year through direct interviews of patients in the clinic or by telephonic interviews of patients or their caregivers. The outcome assessment for an individual patient will be done by two staff members, and any discrepancies will be resolved through mutual discussion.
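The dichotomisation described above can be written down in a few lines. The following is a minimal sketch in Python, used here for illustration only (the protocol itself specifies R); the function name `dichotomise_gos` and the label strings are hypothetical.

```python
# GOS categories as listed in the protocol text (1 = dead ... 5 = good recovery).
GOS_LABELS = {
    1: "dead",
    2: "vegetative state",
    3: "severely disabled",
    4: "moderately disabled",
    5: "good recovery",
}


def dichotomise_gos(score):
    """Map a GOS score to the binary outcome used for modelling.

    GOS 1-3 -> 'unfavourable', GOS 4-5 -> 'favourable'.
    """
    if score not in GOS_LABELS:
        raise ValueError(f"GOS must be an integer from 1 to 5, got {score!r}")
    return "unfavourable" if score <= 3 else "favourable"
```

Rejecting out-of-range values explicitly, rather than silently mapping them, keeps data-entry errors visible before they reach the models.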
Sample size calculation
There is no established guideline for calculating sample size for ML models; however, our sample size calculation is based on the work of Riley et al (2019) on sample size for developing prediction models for binary outcomes. We used the pmsampsize package in R (V.4.3.2) for calculation.16 Assuming an in-hospital mortality rate of 34.4% and unfavourable outcomes at 66.3% among TBI patients, with 24 candidate predictor variables, which correspond to 30 model parameters (including categorical variable levels) (online supplemental file S2), an anticipated R-squared value of 0.28 is selected to ensure moderate predictive accuracy.17 A shrinkage factor of 0.9 will be applied to mitigate overfitting, and the estimated minimum sample size required for model development is 805 patients for both in-hospital mortality (mortality at 30 days) and unfavourable outcomes at 6 months.
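As a rough cross-check of the figure above, the shrinkage-based criterion from Riley et al can be evaluated by hand: n = P / ((S − 1) ln(1 − R²/S)), where P is the number of model parameters, S the target shrinkage factor and R² the anticipated Cox-Snell R-squared. The sketch below, in Python for illustration, assumes that the stated R-squared of 0.28 is on the Cox-Snell scale and that this criterion is the binding one; pmsampsize also evaluates further criteria that are not reproduced here. The function name is hypothetical.

```python
import math


def min_n_for_shrinkage(n_params, r2_cs, shrinkage=0.9):
    """Smallest n for which the expected uniform shrinkage is >= `shrinkage`.

    Implements n = P / ((S - 1) * ln(1 - R2_CS / S)), rounded up.
    Assumes 0 < r2_cs < shrinkage < 1.
    """
    if not 0 < r2_cs < shrinkage < 1:
        raise ValueError("require 0 < r2_cs < shrinkage < 1")
    n = n_params / ((shrinkage - 1.0) * math.log(1.0 - r2_cs / shrinkage))
    return math.ceil(n)
```

With the protocol's inputs, `min_n_for_shrinkage(30, 0.28, 0.9)` gives 805, which agrees with the reported minimum development sample size.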
For external validation, we calculated a sample size of 1663 patients for unfavourable functional outcomes and 1571 patients for in-hospital mortality using 30 candidate predictors, a c-statistic of 0.87 and mean (SD) of linear predictors −0.510 (2.191) and 1.365 (2.419) for in-hospital mortality and unfavourable outcomes, respectively.17 A width of one is set for observed and expected event proportions, resulting in the selection of 1663 patients for external validation.18
Missing data
This study will handle missing data using multiple imputation,19 a robust approach for handling incomplete data in covariates. Each variable with missing values will be imputed based on its distribution, using appropriate imputation methods such as predictive mean matching for continuous variables and polytomous regression for unordered categorical variables. The proportion of missingness will be quantified using the fraction of missing information to assess the extent of uncertainty due to missing data.20
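The protocol does not spell out how estimates from the imputed datasets will be combined; Rubin's rules are the usual choice and are assumed in the sketch below (Python, for illustration). It pools one parameter across m imputations and returns the proportion of total variance attributable to missingness, which is the large-sample approximation to the fraction of missing information. The function name and the dictionary keys are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    # Share of total variance due to missing data (large-sample FMI approximation).
    lam = (1 + 1 / m) * between / total
    return {"estimate": q_bar, "total_variance": total, "lambda": lam}
```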
Predictive models and modelling techniques
We will compare six modelling techniques for predicting mortality at 30 days and 6-month functional outcome. The modelling techniques include the traditional approach of logistic regression and ML approaches such as classification and regression tree, random forest, artificial neural network, support vector machine and gradient boosting.21,23 In modelling, continuous variables will be included as continuous variables without categorisation. Prior to model training, continuous variables will be normalised and categorical variables will be one-hot encoded, as these preprocessing steps are standard for algorithms employing gradient descent optimisation, to ensure efficient learning and model convergence. We will compare ML algorithms with traditional regression methods, including standard logistic regression and penalised regression techniques such as lasso and ridge regression. These penalised methods help improve model performance by reducing the size of the regression coefficients, which can prevent overfitting and overly extreme predictions. No interaction terms will be included in the traditional approach. The ML algorithms require tuning of certain settings, called hyperparameters, to perform well. We will use the caret package in R to find the best combination of these hyperparameters. The selection will be based on which setting gives the best model performance, measured using the average log-likelihood across 10 repeated 10-fold cross-validations.
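To make the tuning criterion concrete, the sketch below (Python, for illustration; the protocol itself uses the caret package in R) generates the train/test index splits for repeated k-fold cross-validation and computes the average log-likelihood of a set of predicted probabilities. Model fitting is deliberately left out; the function names, the seed and the clipping constant `eps` are assumptions.

```python
import math
import random


def repeated_k_fold(n, k=10, repeats=10, seed=0):
    """Yield (train_indices, test_indices) for `repeats` rounds of k-fold CV."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)                      # new random partition each repeat
        folds = [idx[i::k] for i in range(k)]  # k folds of near-equal size
        for i, test in enumerate(folds):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, test


def mean_log_likelihood(y_true, y_prob, eps=1e-15):
    """Average Bernoulli log-likelihood; values closer to 0 are better."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)         # guard against log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)
```

For each candidate hyperparameter setting, a model would be fitted on every training split, scored on the matching test split with `mean_log_likelihood`, and the scores averaged over all 100 splits.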
To ensure robustness and reliability of our findings, we will conduct sensitivity analyses by exploring the impact of different handling of missing data, excluding retrospective datasets and evaluating alternative definitions of key outcome.
Feature importance and predictor selection
To identify the most influential predictors for 30-day mortality and 6-month functional outcomes, we will use Shapley Additive Explanation (SHAP) values. SHAP is a game-theoretic approach that attributes the contribution of each feature to the prediction output of a model by comparing predictions made with and without each feature across different combinations.24 This method allows both global interpretability (overall feature importance) and local interpretability (case-level explanations). The overall importance of each variable will be quantified by averaging its absolute SHAP values across all patients in the development dataset. Based on the SHAP rankings and clinical relevance, we will define multiple predictor sets for model development.25
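The idea of comparing predictions with and without each feature across combinations can be illustrated with an exact Shapley computation on a small model. The sketch below (Python, for illustration) makes a simplifying assumption: an "absent" feature is replaced by a single baseline value, whereas practical SHAP implementations average over background data and use approximations. It is only feasible for a handful of features, and all names are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one case `x` relative to `baseline`.

    predict:  function taking a list of feature values, returning a number
    x:        feature values for the case being explained
    baseline: reference values used when a feature is treated as absent
    """
    n = len(x)

    def value(present):
        z = [x[i] if i in present else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of coalitions of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def mean_abs_importance(predict, rows, baseline):
    """Global importance: mean absolute Shapley value per feature over rows."""
    sums = [0.0] * len(baseline)
    for x in rows:
        for i, p in enumerate(shapley_values(predict, x, baseline)):
            sums[i] += abs(p)
    return [s / len(rows) for s in sums]
```

The values for one case sum to the difference between its prediction and the baseline prediction, which is what makes the attribution additive.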
To address potential multicollinearity among clinical variables, we will assess pairwise correlations and variance inflation factors. For models sensitive to correlated features (eg, logistic regression), we will apply regularisation techniques such as LASSO or exclude redundant variables, as appropriate. ML algorithms that are inherently more robust to multicollinearity will be used alongside more interpretable models to cross-check feature robustness and ensure stable predictive performance. This structured approach will help ensure that predictor selection is both data-driven and clinically meaningful, while minimising overfitting and preserving model interpretability.
Model validation
To evaluate model performance and generalisability, three complementary validation strategies will be employed:
Internal-external cross-validation: in this approach, models will be developed on data from all but one study centre and then validated on the held-out centre. This process will be repeated across all study sites, allowing assessment of model transportability and heterogeneity in performance across settings.
The performance estimates and their 95% CIs from each internal-external validation iteration will be summarised using forest plots to visually assess heterogeneity across centres. To derive overall pooled estimates for each model and outcome, a random-effects meta-analysis will be conducted. The DerSimonian and Laird estimator will be used to compute between-study variance (τ²), ensuring that variability across different sites is appropriately accounted for.
Internal validation: within the JPN-TBI-P-Data (prospectively collected data from AIIMS New Delhi), internal validation will be conducted using 10-fold cross-validation to assess the stability and predictive performance of the model. This widely used approach compares the performance of different regression models on the same dataset. The dataset is randomly divided into 10 mutually exclusive folds of roughly equal size. Each fold is then used as a test set for the model trained on the remaining nine folds using specific algorithms. The model’s overall performance is assessed by averaging the accuracies across the 10 folds.26
External validation: external validation is needed to strengthen the generalisability of predictive models.27 The present study will perform temporal and geographical external validation. A model will be fitted to data from some centres and evaluated in patients from another centre. Final models trained on the JPN-TBI-P-Data will be externally validated using three independent datasets:
JPN-TBI-R-Data.
TC-TBI-P-Data.
MMC-TBI-P-Data.
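For the pooling step of the internal-external cross-validation described above, the DerSimonian and Laird estimator can be sketched as follows (Python, for illustration). The inputs are per-centre performance estimates and their variances; whether a measure such as the AUC is pooled on its original or a transformed scale is not stated in the protocol and is left to the caller. The function name is hypothetical.

```python
import math


def dersimonian_laird(estimates, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator.

    Returns (pooled_estimate, standard_error, tau2).
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two centres with matching lengths")
    w = [1.0 / v for v in variances]                  # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]    # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2
```

A 95% CI for the pooled estimate follows as pooled ± 1.96 × se; with only a few centres, the estimate of τ² is imprecise and the interval should be read with that in mind.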
Model performance
The performance of models will be evaluated in terms of discrimination and calibration. Discrimination is a model’s ability to separate patients with different outcomes, quantified using the area under the receiver operating characteristic curve (AUC), which determines whether those with higher predicted risks are more likely to have a poor outcome among all possible pairs of patients with different outcomes.28 An AUC of 1.0 means perfect discrimination and an AUC of 0.5 means no discriminative power. Calibration will be assessed graphically by plotting the observed outcome against the predicted probability, and summarised using the calibration slope and calibration-in-the-large, to determine agreement between predicted and observed risks over the full range of predicted probabilities. For this assessment, patients will be grouped by deciles of predicted risk, so that each group consists of 10% of the total number of patients. We will report the Brier score to specify the overall performance of a model with predictors against a non-informative model without predictors on a scale from 0–100%.5 In addition to AUC and Brier score, we will report accuracy, sensitivity and specificity as standard performance metrics. Accuracy will be used to measure the overall proportion of correctly classified outcomes by each model. Sensitivity (true positive rate) will indicate the model’s ability to correctly identify patients with poor outcomes. Specificity (true negative rate) will reflect the model’s ability to correctly classify patients with favourable outcomes. The best-performing model will be selected based primarily on highest AUC, followed by Brier score and other secondary metrics. In cases of similar performance, models with greater clinical interpretability and implementation feasibility will be prioritised for external validation.
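The metrics named above can be sketched in plain Python for illustration (the protocol specifies the rms and pROC packages in R). The AUC is computed directly from its pairwise definition. The scaled Brier score shown here, 1 − Brier/Brier of a prevalence-only model, is one common reading of "against a non-informative model" and is an assumption, as is the 0.5 classification threshold.

```python
def auc(y_true, y_prob):
    """Proportion of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                score += 1.0
            elif pp == pn:
                score += 0.5
    return score / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def scaled_brier(y_true, y_prob):
    """1 - Brier / Brier_ref, where the reference model predicts the prevalence."""
    prev = sum(y_true) / len(y_true)
    return 1.0 - brier_score(y_true, y_prob) / (prev * (1.0 - prev))


def sensitivity_specificity(y_true, y_prob, threshold=0.5):
    """Classify as event when predicted risk >= threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)
```

The pairwise AUC is quadratic in the number of patients, which is acceptable for a sketch; rank-based implementations are preferable for large datasets.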
Decision curve analysis
The clinical utility of predictive models in the study will be evaluated using DCA across a defined range of threshold probabilities for both outcomes. In this context, the threshold probability refers to the minimum predicted risk level at which a clinical intervention is considered appropriate, typically the point where the expected benefits of the intervention outweigh its potential harms and justify its use in practice.29 Decision curves will be derived from model-predicted probabilities obtained from the external validation dataset and will display the median and 95% CIs of net benefit across a range of threshold probabilities.30
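The net benefit plotted on a decision curve is, at threshold probability pt, TP/n − (FP/n) × pt/(1 − pt). The sketch below (Python, for illustration; the protocol specifies the dcurves package in R) computes this point estimate for a model and for the "treat all" reference strategy. The uncertainty intervals mentioned above would need resampling and are not shown; function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are penalised by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prev = sum(y_true) / len(y_true)
    return prev - (1 - prev) * threshold / (1 - threshold)
```

A model is clinically useful at a given threshold when its net benefit exceeds both the "treat all" value and zero, the net benefit of treating no one.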
Implementation
An online risk calculator will be developed using the best-performing prognostic model to facilitate real-time clinical application. Implemented as a Shiny web application, the tool will enable clinicians to input individual patient characteristics and obtain personalised estimates of outcome probabilities—such as 30-day mortality or 6-month functional outcomes—along with 95% CIs. Designed for ease of use, the calculator will serve as a practical aid to support evidence-based decision-making and enhance communication with patients and families in tertiary care settings.
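The protocol does not state how the calculator will derive its probabilities and intervals. As one possible illustration, the sketch below (Python; the planned tool is a Shiny application in R) assumes a logistic model as the final model and forms a Wald interval on the linear predictor before transforming it to the probability scale. The coefficient names, their values and the standard error are placeholders supplied by the caller, not estimates from this study.

```python
import math


def expit(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_with_ci(coefs, x, lp_se, z=1.959964):
    """Predicted probability with an approximate 95% CI.

    coefs: dict with an 'intercept' key plus one coefficient per predictor
           (hypothetical values supplied by the caller)
    x:     dict of predictor values for one patient, keyed like `coefs`
    lp_se: standard error of the linear predictor for this patient
    """
    lp = coefs["intercept"] + sum(coefs[name] * value for name, value in x.items())
    # The interval is built on the logit scale so it stays within (0, 1).
    return expit(lp), expit(lp - z * lp_se), expit(lp + z * lp_se)
```

For ML models without a closed-form standard error, a resampling-based interval would be needed instead.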
Descriptive analyses
Additionally, we will perform descriptive statistical analyses to address the secondary objective. We will describe and compare the baseline admission characteristics and clinical outcomes (at discharge, 3 months, 6 months and 1 year) among patients with moderate and severe TBI treated at various tertiary care centres. Continuous variables will be summarised using mean and SD or median and IQR, depending on the test of normality. Categorical variables will be presented as frequencies and percentages. Comparative analyses will be conducted using appropriate tests based on the type and distribution of the data: χ2 test or Fisher’s exact test for categorical variables, and independent t-test or Mann-Whitney U test for continuous variables.
Software
All statistical analyses will be performed using the R software V.4.3.2. Key R packages will include glmnet, caret, randomForest and xgboost for modelling, missForest for imputation and iml for SHAP value computation. Calibration plots, Brier scores and ROC plots will be produced using the rms and pROC packages. DCA will be conducted using the dcurves package.
Ethics and dissemination
The ethics review committees: (1) institutional ethics committee, AIIMS-Patna: ref id: AIIMS/Pat/1EC/2020/834, (2) institutional ethics committee, AIIMS-New-Delhi: ref id: IEC-852/03.12.2021, RP-07/202 and (3) institutional ethics committee, MMC: ref id: ECR/270/Inst./TN /2013/RR-16 have reviewed and approved the study. It uses patient data collected through routine clinical care and hospital registries, with no direct patient interaction or intervention at the time of analysis. As the data were gathered as part of standard practice, no additional informed consent is required. To ensure participant confidentiality, identifying details will be removed during transcription. While requesting participants to complete questionnaires may be perceived as intrusive, we are committed to protecting their privacy. Study findings will be disseminated through peer-reviewed publications, conference presentations and open-access platforms, with efforts to engage healthcare stakeholders and policymakers to support evidence-based improvements in TBI care.
Guideline adherence
This study protocol adheres to the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis + Artificial Intelligence (TRIPOD+AI) guidelines, which provide updated and standardised recommendations for reporting prediction model studies that incorporate ML or artificial intelligence methods.31 The completed TRIPOD+AI checklist is provided as a research checklist to ensure comprehensive and transparent reporting.
Discussion
This study protocol presents one of the first multicentre efforts in India to develop, validate and assess the clinical utility of prognostic models for moderate and severe TBI using both traditional and ML-based methods. We will evaluate model performance comprehensively and, for the first time in Indian TBI cohorts, DCA will be employed to evaluate their clinical net benefit across a range of risk thresholds. The resulting online risk calculator may offer clinicians a user-friendly tool for outcome prediction and decision support, contributing to personalised patient management.
Accurate prognostication can help justify the transfer to neurosurgical specialist services and early management of the individual patient. We may get the best suitable ML algorithm based on routinely collected baseline variables, which has optimal performance on external validation for evaluating outcomes after TBI and would allow the model to be applied before in-hospital therapeutic interventions in our setting. This best suitable algorithm may be used in clinical decision-making, appropriate resource allocation, designing research studies that compare outcomes in different groups of patients and when designing randomised control trials and assessing the quality of healthcare. Our approach, which integrates real-world, prospective data from diverse geographical regions in India, ensures that the developed models are grounded in contextually relevant clinical characteristics. The use of SHAP values for model interpretation provides transparency and enhances model trustworthiness in clinical settings.
Prognostic models with modest discrimination or calibration may still be clinically useful, so clinical utility must be assessed in addition to discrimination and calibration. This approach allows for a comprehensive assessment of the clinical value of each predictive model beyond traditional performance metrics. However, clinical utility using DCA has notable limitations that must be acknowledged. One significant concern is miscalibration, where predicted risks do not accurately reflect actual outcomes, potentially leading to harmful clinical decisions.30 Despite these limitations, we chose to use DCA because it allows for the evaluation of the net clinical benefit of a prediction model across a range of threshold probabilities, providing insights into the model’s practical utility in guiding real-world clinical decisions. Future work should explore how DCA findings align with clinician decision-making preferences and evaluate model performance under varying thresholds to ensure robustness and usability in practice.
A practical and accurate rule for outcome prediction can significantly aid the victim’s family counselling and appropriate allocation of resources (to those predicted to survive or recover with a good outcome) while assisting treatment-limiting decisions in those predicted to have a poor outcome. This study will also aim to capture regional variations in patient characteristics and outcomes through comparative analyses across participating tertiary care centres. Understanding these differences will be critical for tailoring interventions, improving care delivery models and informing regional healthcare planning for TBI management in India.
India and other developing countries face significant challenges in prevention, pre-hospital care and rehabilitation in their rapidly changing environments to reduce the burden of TBI. There is a lack of data from the policy-making and implementation point of view. For the development of models, we are restricted to the data available in the database of a level one tertiary care trauma centre, which may limit generalisability to primary or secondary healthcare settings, particularly those with fewer resources. While we attempted to address this through geographical external validation, further studies in rural and community-based settings are warranted. Some ML algorithms require very large amounts of training data. We may expect more than 30% of patients to be lost to follow-up for long-term outcome assessment, which may influence the precision of model predictions. We will restrict our research to the modelling techniques mentioned in the primary objectives. Various other ML techniques might also be suited for the prediction of outcomes in the case of TBI.
The clinical applicability of predictive models must also be tempered by ethical considerations. Prognostic models should support, not replace, clinical judgement especially in high-stakes decision-making such as life-sustaining treatments. Shared decision-making, transparent communication and respect for patient autonomy must remain central. Future work should explore how predictions from such tools are perceived and used by clinicians, patients and caregivers in the Indian healthcare context.
This study will provide timely evidence with important implications for TBI care pathways, trauma system planning and outcome-based triage protocols. By identifying patients at higher risk of poor outcomes early in the treatment course, the model can support better allocation of ICU resources, inform surgical prioritisation and facilitate family counselling.
The risk calculator developed through this study can be integrated into electronic health systems and potentially scaled through mobile platforms, enhancing accessibility in resource-limited settings. Future research should focus on assessing the tool’s real-world usability, its effect on clinical decision-making and its long-term impact on patient outcomes. In addition, further exploration of ensemble and hybrid ML techniques may yield incremental improvements in predictive performance. Because data to support policy-making and implementation in TBI are lacking, the plan is to build a national database to promote outcome research in this field.
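As an illustration of how such a calculator might report an outcome probability with a 95% CI, the following is a minimal Python sketch. It assumes the final model is a logistic regression with an estimated coefficient vector and covariance matrix; the function names and all numeric values are hypothetical placeholders, not results or methods of this study. If an ML model is selected instead, the interval would need a different approach (for example, bootstrapping).

```python
import math
from typing import Sequence, Tuple


def expit(z: float) -> float:
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_with_ci(
    x: Sequence[float],
    beta: Sequence[float],
    cov: Sequence[Sequence[float]],
    z_crit: float = 1.959964,  # two-sided 95% normal quantile
) -> Tuple[float, float, float]:
    """Return (probability, lower, upper) for one patient.

    x    : covariate vector, with 1.0 in the first position for the intercept
    beta : logistic regression coefficients (hypothetical here)
    cov  : covariance matrix of the coefficient estimates (hypothetical here)

    The interval is a Wald interval built on the logit scale and then
    back-transformed, so it always stays within (0, 1).
    """
    n = len(x)
    # Linear predictor: x' * beta
    lp = sum(b * xi for b, xi in zip(beta, x))
    # Variance of the linear predictor: x' * cov * x
    var = sum(x[i] * cov[i][j] * x[j] for i in range(n) for j in range(n))
    se = math.sqrt(var)
    return expit(lp), expit(lp - z_crit * se), expit(lp + z_crit * se)


if __name__ == "__main__":
    # Hypothetical two-parameter model: intercept plus one binary predictor.
    beta = [-1.0, 0.8]
    cov = [[0.04, -0.01], [-0.01, 0.09]]
    p, lo, hi = predict_with_ci([1.0, 1.0], beta, cov)
    print(f"risk = {p:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

Building the interval on the logit scale rather than directly on the probability scale is a common design choice because it prevents limits below 0 or above 1 for patients with extreme predicted risks.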
Supplementary material
Footnotes
Funding: This research is funded by the Indian Council of Medical Research (ICMR) (grant number 2020-5528).
Prepublication history and additional supplemental material for this paper are available online. To view these files, please visit the journal online (https://doi.org/10.1136/bmjopen-2024-096275).
Provenance and peer review: Not commissioned; externally peer reviewed.
Patient consent for publication: Not applicable.
Patient and public involvement: Patients and/or the public were not involved in the design, conduct, reporting or dissemination plans of this research.
References
- 1. Maas AIR, Menon DK, Adelson PD, et al. Traumatic brain injury: integrated approaches to improve prevention, clinical care, and research. Lancet Neurol. 2017;16:987–1048. doi: 10.1016/S1474-4422(17)30371-X.
- 2. Weng SF, Vaz L, Qureshi N, et al. Prediction of premature all-cause mortality: A prospective general population cohort study comparing machine-learning and standard epidemiological approaches. PLoS One. 2019;14:e0214365. doi: 10.1371/journal.pone.0214365.
- 3. Buchlak QD, Esmaili N, Leveque J-C, et al. Machine learning applications to clinical decision support in neurosurgery: an artificial intelligence augmented systematic review. Neurosurg Rev. 2020;43:1235–53. doi: 10.1007/s10143-019-01163-8.
- 4. Lingsma HF, Roozenbeek B, Steyerberg EW, et al. Early prognosis in traumatic brain injury: from prophecies to predictions. Lancet Neurol. 2010;9:543–54. doi: 10.1016/S1474-4422(10)70065-X.
- 5. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology. 2010;21:128–38. doi: 10.1097/EDE.0b013e3181c30fb2.
- 6. Austin PC, Steyerberg EW. Predictive accuracy of risk factors and markers: a simulation study of the effect of novel markers on different performance measures for logistic regression models. Stat Med. 2013;32:661–72. doi: 10.1002/sim.5598.
- 7. Vergouwe Y, Moons KGM, Steyerberg EW. External validity of risk models: Use of benchmark values to disentangle a case-mix effect from incorrect coefficients. Am J Epidemiol. 2010;172:971–80. doi: 10.1093/aje/kwq223.
- 8. Kamal VK, Agrawal D, Pandey RM. Why Statistical Modelling in Outcome Prediction in Patients with Traumatic Brain Injury is Essential?: In Indian context. J Trauma Treat. 2017;06. doi: 10.4172/2167-1222.1000372.
- 9. Helmy A, Timofeev I, Hutchinson PJ. What is the purpose of statistical modelling in traumatic brain injury? Acta Neurochir. 2010;152:2007–8. doi: 10.1007/s00701-010-0746-y.
- 10. Honeybul S, Ho KM. Use of the CRASH study prognosis calculator in patients with severe traumatic brain injury. J Clin Neurosci. 2013;20:1808–10. doi: 10.1016/j.jocn.2013.08.011.
- 11. Christodoulou E, Ma J, Collins GS, et al. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12–22. doi: 10.1016/j.jclinepi.2019.02.004.
- 12. Kamal VK, Agrawal D, Pandey RM. Prognostic models for prediction of outcomes after traumatic brain injury based on patients admission characteristics. Brain Inj. 2016;30:393–406. doi: 10.3109/02699052.2015.1113568.
- 13. Kamal VK, Agrawal D, Pandey RM. Epidemiology, clinical characteristics and outcomes of traumatic brain injury: Evidences from integrated level 1 trauma center in India. J Neurosci Rural Pract. 2016;7:515–25. doi: 10.4103/0976-3147.188637.
- 14. Carney N, Totten AM, O’Reilly C, et al. Guidelines for the Management of Severe Traumatic Brain Injury, Fourth Edition. Neurosurgery. 2017;80:6–15. doi: 10.1227/NEU.0000000000001432.
- 15. ACS. ACS Releases Revised Best Practice Guidelines in Management of Traumatic Brain Injury. 2025. Available: https://www.facs.org/for-medical-professionals/news-publications/news-and-articles/acs-brief/october-29-2024-issue/acs-releases-revised-best-practice-guidelines-in-management-of-traumatic-brain-injury/
- 16. Riley RD, Snell KIE, Ensor J, et al. Minimum sample size for developing a multivariable prediction model: Part I - Continuous outcomes. Stat Med. 2019;38:1262–75. doi: 10.1002/sim.7993.
- 17. Kamal VK, Pandey RM, Agrawal D. Development and temporal external validation of a simple risk score tool for prediction of outcomes after severe head injury based on admission characteristics from level-1 trauma centre of India using retrospectively collected data. BMJ Open. 2021;11:e040778. doi: 10.1136/bmjopen-2020-040778.
- 18. Riley RD, Snell KIE, Archer L, et al. Evaluation of clinical prediction models (part 3): calculating the sample size required for an external validation study. BMJ. 2024;384:e074821. doi: 10.1136/bmj-2023-074821.
- 19. Marshall A, Altman DG, Holder RL, et al. Combining estimates of interest in prognostic modelling studies after multiple imputation: current practice and guidelines. BMC Med Res Methodol. 2009;9:57. doi: 10.1186/1471-2288-9-57.
- 20. Madley-Dowd P, Hughes R, Tilling K, et al. The proportion of missing data should not be used to guide decisions on multiple imputation. J Clin Epidemiol. 2019;110:63–73. doi: 10.1016/j.jclinepi.2019.02.016.
- 21. Cortes C, Vapnik V. Support-Vector Networks. Mach Learn. 1995;20:273–97. doi: 10.1023/A:1022627411411.
- 22. van der Ploeg T, Nieboer D, Steyerberg EW. Modern modeling techniques had limited external validity in predicting mortality from traumatic brain injury. J Clin Epidemiol. 2016;78:83–9. doi: 10.1016/j.jclinepi.2016.03.002.
- 23. Harrell FE. Regression modeling strategies: with applications to linear models, logistic regression, and survival analysis. Springer; 2010.
- 24. Lamane H, Mouhir L, Moussadek R, et al. Interpreting machine learning models based on SHAP values in predicting suspended sediment concentration. Int J Sedim Res. 2025;40:91–107. doi: 10.1016/j.ijsrc.2024.10.002.
- 25. Baptista ML, Goebel K, Henriques EMP. Relation between prognostics predictor evaluation metrics and local interpretability SHAP values. Artif Intell. 2022;306:103667. doi: 10.1016/j.artint.2022.103667.
- 26. Wong TT. Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recognit. 2015;48:2839–46. doi: 10.1016/j.patcog.2015.03.009.
- 27. Ramspek CL, Jager KJ, Dekker FW, et al. External validation of prognostic models: what, why, how, when and where? Clin Kidney J. 2021;14:49–58. doi: 10.1093/ckj/sfaa188.
- 28. Verbakel JY, Steyerberg EW, Uno H, et al. ROC curves for clinical prediction models part 1. ROC plots showed no added value above the AUC when evaluating the performance of clinical prediction models. J Clin Epidemiol. 2020;126:207–16. doi: 10.1016/j.jclinepi.2020.01.028.
- 29. Sachs MC, Sjölander A, Gabriel EE. Aim for Clinical Utility, Not Just Predictive Accuracy. Epidemiology. 2020;31:359–64. doi: 10.1097/EDE.0000000000001173.
- 30. Huber M, Schober P, Petersen S, et al. Decision curve analysis confirms higher clinical utility of multi-domain versus single-domain prediction models in patients with open abdomen treatment for peritonitis. BMC Med Inform Decis Mak. 2023;23:63. doi: 10.1186/s12911-023-02156-w.
- 31. The BMJ. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. Available: https://www.bmj.com/content/385/bmj-2023-078378
