Abstract
Background
Both nutrition and inflammation have been implicated in the pathogenesis of colorectal cancer (CRC), but most previous studies have examined these factors separately. This study aimed to explore the combined association of inflammation and nutritional status with CRC.
Methods
This study drew on 101,316 participants from the National Health and Nutrition Examination Survey (NHANES) conducted from 1999 to 2018. First, weighted logistic regression was used to measure the association between the advanced lung cancer inflammation index (ALI) and CRC. Then, restricted cubic splines (RCS) were used to characterize the dose-response curve, and the discriminative ability of the model was assessed with the receiver operating characteristic (ROC) curve. Robustness was verified through subgroup and interaction analyses. Furthermore, random forest analysis combined with the Boruta algorithm was employed to identify CRC-related factors. Finally, a machine learning (ML) prediction framework was constructed, and the optimal model was interpreted using SHAP values.
Results
In the fully adjusted model, each unit increase in log-transformed ALI was associated with a 20.9% reduction in CRC risk (OR = 0.791; 95% CI: 0.628–0.997; p = 0.047). Participants in the highest log-ALI quartile had a 46.2% lower risk compared to those in the lowest quartile (OR = 0.538; 95% CI: 0.344–0.842; P = 0.007). The fully adjusted model demonstrated strong discriminative ability (AUC = 0.848). RCS analysis confirmed a linear dose-response relationship (P for nonlinearity = 0.731). The robustness of these findings was supported by subgroup and sensitivity analyses. Random forest analysis coupled with the Boruta algorithm identified log-ALI as a strong predictor. Among seven machine learning models evaluated, the LightGBM algorithm achieved the highest and most stable predictive performance (AUC = 0.870). SHAP analysis confirmed log-ALI as the most important protective feature.
Conclusion
This study demonstrates that higher ALI levels, indicative of better nutritional and inflammatory status, are significantly associated with a lower risk of CRC. The optimized ML model based on ALI shows promise as a cost-effective tool for CRC risk stratification.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12967-025-07494-z.
Keywords: Advanced lung cancer inflammation index, Colorectal cancer, National health and nutrition examination survey, Boruta algorithm, Machine learning algorithm
Introduction
Colorectal cancer (CRC) poses a significant global public health challenge. It is the third most commonly diagnosed malignancy and the second leading cause of cancer-related mortality worldwide [1]. Recent data from 2024 indicate approximately 1.8 million new cases and 881,000 deaths globally [2]. If current trends continue, the number of cases is projected to increase by a further 63% by 2040. Furthermore, the increasing incidence of early-onset CRC, combined with rising healthcare costs, imposes a substantial economic burden on healthcare systems worldwide.
A concerning trend of increasing early-onset CRC, coupled with rising incidence projections, places a substantial economic and clinical burden on healthcare systems [3]. In recent years, the global rise in obesity has emerged as a critical risk factor for CRC [4]. Excess adiposity contributes to CRC initiation and progression through mechanisms involving chronic low-grade inflammation, insulin signaling dysregulation, and redox imbalance [5]. Obesity also disrupts hormonal homeostasis, further amplifying adipose tissue–mediated inflammatory pathways [6]. Moreover, inadequate nutrition and persistent inflammation play pivotal roles in the pathogenesis of gastrointestinal malignancies. Inflammatory processes are recognized as key determinants of tumor initiation, progression, and prognosis [7]. Chronic inflammation may impair immune surveillance and promote a tumor-supportive microenvironment [8]. Emerging evidence has identified several inflammation-related markers as reliable prognostic indicators for CRC, such as the neutrophil percentage-to-albumin ratio (NPAR) [9], which shows a strong association with clinical outcomes. In addition, poor nutritional status exacerbates persistent intestinal inflammation, thereby promoting malignant transformation [10]. Concurrently, reduced serum albumin levels are closely associated with cancer-related cachexia, characterized by progressive weight loss and muscle wasting [11]. Therefore, a comprehensive assessment of the interplay between inflammation and nutrition may provide valuable insights for clinical strategies aimed at reducing CRC risk.
Clinically, the management of CRC continues to pose significant challenges. Current therapeutic strategies mainly adopt a multimodal approach centered around surgical resection [12]. Although chemotherapy is widely utilized for its practicality [13], its efficacy is often limited by inadequate drug accumulation and multidrug resistance [14, 15]. The development of anti-angiogenic agents such as thalidomide marked a therapeutic advance [16, 17], and more recent combinations with immune checkpoint inhibitors have shown synergistic efficacy in advanced CRC [18]. Concurrently, nanomedicines have emerged as promising platforms for their tumor-targeting and controlled-release capabilities, demonstrating selective anti-CRC activity and potential to overcome resistance when combined with conventional drugs [19–21]. Against this backdrop, the identification of biomarkers capable of accurately predicting treatment response and prognosis is essential to guide personalized therapeutic strategies [22].
The systemic inflammatory response acts as a critical mediator linking various metabolic abnormalities and intricately interacts with pathological conditions such as obesity and insulin resistance [23]. To better capture this complex interplay, the Advanced Lung Cancer Inflammation Index (ALI) was developed as an innovative composite scoring system. Originally proposed by Jafri et al. for prognostic assessment in non-small cell lung cancer [24], ALI integrates inflammatory markers with nutritional status indicators, thereby overcoming the limitations of conventional single-parameter biomarkers [17]. Subsequently, the predictive utility of ALI has been validated across a spectrum of malignancies, including breast, hepatocellular, and gastric cancers [25–27]. More recently, its application has expanded to non-oncological chronic diseases such as hypertension, heart failure, coronary artery disease, and Crohn’s disease [28–31]. The unique clinical value of ALI derives from its integration of key parameters: it combines nutritional metrics such as body mass index and serum albumin with the neutrophil-to-lymphocyte ratio (NLR) [32], and has established prognostic value in metabolic diseases like diabetes [33]. This synthesis of immunological and nutritional dimensions provides a more comprehensive assessment of disease pathophysiology [31]. By integrating nutritional and inflammatory pathways through readily available parameters such as albumin, NLR, and BMI, ALI offers a clinically accessible approach aligned with the Bench-to-Bedside (B2B) paradigm in translational medicine.
However, evidence on the predictive value of ALI specifically in CRC remains limited. To address this knowledge gap, we utilized the nationally representative NHANES cohort to comprehensively investigate the association between ALI and CRC risk. Our approach was to first lay a solid foundation with traditional statistics and then leverage machine learning (ML) to build visual models, representing a novel approach with potential to enhance clinical diagnosis and treatment strategies. These findings position ALI as a promising biomarker for CRC risk assessment. However, further validation in larger, prospective cohorts is warranted to confirm its clinical utility.
Materials and methods
Study participants and research design
The NHANES employs a sophisticated stratified sampling design, administered by the Centers for Disease Control and Prevention (CDC), to assess population health in the United States. This nationally representative study has collected comprehensive data on physiological parameters and dietary patterns from participants of all ages through biennial surveys since 1999. Each cycle systematically records sociodemographic information, dietary intake, clinical examination results, biochemical measurements, and questionnaire responses. Ethical approval for all study procedures was granted by the Research Ethics Review Board of the National Center for Health Statistics (NCHS), and written informed consent was obtained from all participants prior to enrollment.
Our study utilized NHANES datasets from 1999 to 2018, applying rigorous participant selection based on predefined exclusion criteria. Individuals were excluded if they: (1) were under 20 years of age; (2) lacked confirmed malignancy status; (3) had incomplete biomarker data required for ALI calculation (neutrophil count, lymphocyte count, serum albumin, or BMI); or (4) had missing values for any covariates, including demographic, lifestyle, socioeconomic, or comorbidity variables. Following these exclusions, the final analytical cohort included 37,952 participants who met all study criteria (Fig. 1).
Fig. 1.
Flowchart of eligible study participants
Diagnostic criteria for CRC
Cancer history was obtained from the Medical Conditions (MCQ) section of the NHANES questionnaire. Participants were first screened with item MCQ220, which asks whether a doctor had ever told them that they had cancer or a malignancy of any kind. Those who answered yes were then asked, in items MCQ230A–D, what kind of cancer it was. Together, these two steps determined malignancy status. Participants were classified as CRC cases only if they reported a malignancy and specified colon or rectal cancer. The control group included individuals with other cancer types, no history of cancer, or a combination of CRC and other malignancies. Data were collected via in-person interviews or computer-assisted systems, with trained personnel applying strict quality control procedures to ensure accuracy and minimize errors.
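The two-step case definition above can be sketched as a small classification rule. This is only an illustration: the numeric response codes (1 for "yes", 16 for colon, 31 for rectum) and the function name are assumptions that should be checked against the NHANES codebook for each cycle, and the authors' actual implementation is not described in the text.

```python
# Hypothetical sketch of the two-step CRC case definition.
# All codes below are ASSUMED and must be verified against the NHANES codebook.
CANCER_YES = 1                 # assumed MCQ220 code for "yes"
COLORECTAL_CODES = {16, 31}    # assumed MCQ230 codes for colon and rectum


def is_crc_case(mcq220, cancer_types):
    """Return True only if a malignancy was reported and every named
    cancer site is colon or rectum.

    cancer_types holds the MCQ230A-D responses; None marks an unused slot.
    Participants reporting CRC together with another malignancy are not
    counted as cases, matching the control-group definition in the text.
    """
    if mcq220 != CANCER_YES:
        return False
    reported = {code for code in cancer_types if code is not None}
    if not reported:
        return False
    return reported <= COLORECTAL_CODES
```

For example, under these assumed codes `is_crc_case(1, [16, None, None, None])` is a case, whereas a participant reporting both colon cancer and a second, non-colorectal site falls into the control group.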
Assessment of ALI
The ALI was calculated as serum albumin (g/dL) × BMI (kg/m²) / NLR [34]. The NLR was derived from the complete blood count by dividing the absolute neutrophil count by the absolute lymphocyte count. BMI was computed as weight in kilograms divided by the square of height in meters. All laboratory procedures adhered to the standardized protocols outlined in the NHANES Laboratory/Medical Technologists Procedures Manual [35]. Fasting blood samples were collected in K3EDTA vacuum tubes (BD Medical Systems) and analyzed in duplicate within 20 min of collection using Beckman Coulter MAXM/HMX hematology analyzers. Serum albumin was quantified at Collaborative Laboratory Services via the bromcresol purple method on a Beckman Coulter DxC800 analyzer, measuring absorbance at 600 nm. Detailed laboratory methodologies are further documented at: https://wwwn.cdc.gov/nchs/nhanes/continuousnhanes/labmethods.aspx?Cycle=2017-2018.
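As a worked illustration of the formula above, the following minimal sketch computes NLR, ALI, and the natural-log-transformed ALI used in the later models. The function names and the example values are hypothetical and are not taken from the study data.

```python
import math


def nlr(neutrophils, lymphocytes):
    """Neutrophil-to-lymphocyte ratio from absolute counts (same units)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes


def ali(albumin_g_dl, bmi_kg_m2, neutrophils, lymphocytes):
    """ALI = serum albumin (g/dL) x BMI (kg/m^2) / NLR."""
    return albumin_g_dl * bmi_kg_m2 / nlr(neutrophils, lymphocytes)


def log_ali(albumin_g_dl, bmi_kg_m2, neutrophils, lymphocytes):
    """Natural logarithm of ALI."""
    return math.log(ali(albumin_g_dl, bmi_kg_m2, neutrophils, lymphocytes))


# Hypothetical participant: albumin 4.0 g/dL, BMI 25.0, neutrophils 4.0, lymphocytes 2.0
example_ali = ali(4.0, 25.0, 4.0, 2.0)   # NLR = 2.0, so ALI = 4.0 * 25.0 / 2.0 = 50.0
```

Because NLR sits in the denominator, a higher neutrophil count or a lower lymphocyte count lowers ALI, while higher albumin or BMI raises it.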
Covariates
Our analysis controlled for potential confounders across three categories: (1) population characteristics, including age, gender, ethnicity, marital status, education, and socioeconomic status; (2) behavioral factors, including tobacco and alcohol use; and (3) chronic medical conditions, including cardiovascular disease (CVD), hypertension, and diabetes. These variables were systematically collected using standardized methods, including structured interviews, clinical assessments, and biochemical analyses.
Demographic variables were categorized as follows. Race/ethnicity was classified as non-Hispanic White, non-Hispanic Black, Mexican American, Other Hispanic, or other. Marital status was classified as married/cohabiting, never married, or widowed/divorced/separated. Education was classified as below high school, high school, or above high school. BMI was grouped by standard clinical cut-offs: normal (< 25 kg/m²), overweight (25–29.9 kg/m²), and obese (≥ 30 kg/m²). Socioeconomic status was assessed using the family poverty income ratio (PIR): a PIR of ≤ 1.3 indicated low income, 1.3–3.5 moderate income, and ≥ 3.5 high income.
Lifestyle factors were also classified in detail. Participants who had smoked fewer than 100 cigarettes in their lifetime were considered never smokers; those who had smoked at least 100 cigarettes but had quit were considered former smokers; and those who had smoked at least 100 cigarettes and still smoked were considered current smokers. Participants who had consumed fewer than 12 drinks in their lifetime were considered never drinkers, and those who had consumed at least 12 drinks but none in the past year were considered former drinkers. Participants who had drunk in the past year were classified by sex as mild (women ≤ 1 drink and men ≤ 2 drinks per week), moderate (women 2–3 drinks and men 3–4 drinks), or heavy (women ≥ 3 drinks and men ≥ 4 drinks) drinkers.
Chronic diseases were defined from examination results and self-report. Hypertension was defined as any of the following: a physician diagnosis of high blood pressure, a mean systolic/diastolic blood pressure ≥ 140/90 mmHg, or use of antihypertensive medication. Diabetes was defined as any of the following: a physician diagnosis, use of glucose-lowering medication, random blood glucose ≥ 11.1 mmol/L, 2-hour OGTT glucose ≥ 11.1 mmol/L, fasting glucose ≥ 7.0 mmol/L, or HbA1c ≥ 6.5%. CVD was defined as a physician-reported history of angina pectoris, heart failure, coronary heart disease, myocardial infarction, or stroke. All conditions were ascertained following the standardized NHANES procedures, in which questionnaire, physical examination, and laboratory data are reconciled before final classification.
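The covariate definitions above amount to a set of threshold rules, sketched below. The function names are hypothetical, the handling of the shared PIR boundaries (1.3 and 3.5) is an assumption because the text lists them in adjacent categories, and treating an elevated systolic or diastolic reading alone as sufficient for hypertension is likewise an assumed reading of "≥ 140/90 mmHg".

```python
def bmi_category(bmi):
    """Normal (< 25), overweight (25-29.9), obese (>= 30), in kg/m^2."""
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obese"


def income_category(pir):
    """Income group from the poverty income ratio.
    ASSUMPTION: 1.3 itself is low income and 3.5 itself is high income."""
    if pir <= 1.3:
        return "low"
    if pir < 3.5:
        return "moderate"
    return "high"


def has_hypertension(diagnosed, mean_sbp, mean_dbp, on_medication):
    """Any one criterion suffices.
    ASSUMPTION: either elevated systolic or elevated diastolic pressure counts."""
    return bool(diagnosed or on_medication or mean_sbp >= 140 or mean_dbp >= 90)


def has_diabetes(diagnosed, on_medication, random_glucose=None,
                 ogtt_2h=None, fasting_glucose=None, hba1c=None):
    """Any one criterion suffices; glucose in mmol/L, HbA1c in percent.
    Missing measurements (None) are simply skipped."""
    if diagnosed or on_medication:
        return True
    checks = [(random_glucose, 11.1), (ogtt_2h, 11.1),
              (fasting_glucose, 7.0), (hba1c, 6.5)]
    return any(value is not None and value >= cutoff for value, cutoff in checks)
```

Writing each rule as a separate function keeps the thresholds visible in one place, which makes it easier to check them against the definitions in the text.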
Statistical analysis
All analyses accounted for the complex sampling design of NHANES and applied the appropriate sampling weights. Categorical data are presented as the number of cases and percentages. Continuous data are presented as the mean ± standard error (SE). Differences in continuous variables were tested using the weighted Student’s t-test, and differences in categorical variables across groups were tested using the weighted χ² test.
ALI values were log-transformed to approximate a normal distribution and were then analyzed both as a continuous variable and as quartiles. Both the original and log-transformed distributions are presented in Supplementary Table S1. We constructed three sequential logistic regression models with progressive adjustment for covariates: Model 1 was unadjusted; Model 2 adjusted for demographic factors (age, race, gender, income, education, and marital status); Model 3 additionally adjusted for clinical and lifestyle factors (BMI, smoking status, alcohol consumption, hypertension, diabetes, and CVD). To test for a linear trend, the log-ALI quartiles were entered into the model as a continuous variable. The dose-response curve was visualized using restricted cubic splines (RCS). To assess stability, subgroup analyses were performed across all covariates, and interaction terms were tested. Predictive power was evaluated with the receiver operating characteristic (ROC) curve, using the area under the curve (AUC) as the summary measure. The DeLong test was used to determine whether the fully adjusted model improved the AUC relative to the unadjusted model.
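The quartile coding used for the categorical analysis and the trend test can be illustrated with a short sketch. It is unweighted and uses Python's inclusive quantile method, both of which are simplifying assumptions; the study applied NHANES sampling weights in R, and the helper names here are hypothetical.

```python
import statistics


def quartile_cutpoints(values):
    """Three cut points (25th, 50th, 75th percentiles) of the sample.
    ASSUMPTION: unweighted, 'inclusive' interpolation."""
    return statistics.quantiles(values, n=4, method="inclusive")


def assign_quartile(value, cutpoints):
    """Return 1..4; a value equal to a cut point stays in the lower quartile."""
    quartile = 1
    for cutpoint in cutpoints:
        if value > cutpoint:
            quartile += 1
    return quartile


# Toy log-ALI values (hypothetical, not study data)
toy_log_ali = [3.2, 3.6, 3.9, 4.0, 4.1, 4.3, 4.5, 4.9]
cuts = quartile_cutpoints(toy_log_ali)
quartiles = [assign_quartile(v, cuts) for v in toy_log_ali]

# For the categorical model, Q1 serves as the reference level.
# For the trend test, the same 1..4 code is entered as a single numeric term.
trend_term = [float(q) for q in quartiles]
```

Entering the quartile code as one numeric term tests whether the log-odds change steadily across quartiles, whereas the categorical coding estimates a separate odds ratio for Q2, Q3, and Q4 against Q1.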
Sensitivity analysis
To assess whether missing data biased the conclusions, we conducted several sensitivity analyses. First, the main analysis was restricted to participants with complete data on all variables (complete-case analysis). Second, we retained all participants and grouped missing values into a separate category. Finally, multiple imputation was used to make use of all available information. All results are provided in the supplementary materials.
Model development and validation
The 37,952 final records were randomly split into a training set (75%; n = 28,464) and a test set (25%; n = 9,488). The complete workflow was designed to ensure a rigorous separation between model development and evaluation phases. To prevent data leakage, all feature preprocessing parameters were learned exclusively from the training set and then applied to the test set. Furthermore, the feature selection process and class imbalance handling were also conducted solely within the training set. To address the prevalent class imbalance, we systematically compared two techniques: ROSE and SMOTE. The ROSE method, which generates new synthetic examples based on a smoothed bootstrap approach from the minority class, was empirically found to provide superior and more robust generalization performance across our model ensemble compared to SMOTE’s k-nearest neighbor interpolation. Consequently, the ROSE-processed data were selected for all subsequent model development. Subsequently, the Boruta algorithm, embedded within a random forest framework, was employed on the training set to identify the key risk factors for CRC. This algorithm performs a robust all-relevant feature selection by comparing the importance of original attributes with that of randomly permuted shadow attributes. We trained and validated a total of seven machine learning models: extreme gradient boosting (XGBoost), decision tree (DT), multilayer perceptron (MLP), neural networks (NNET), K-nearest neighbors (KNN), light gradient boosting machine (LightGBM), and support vector machine (SVM), with traditional logistic regression as a baseline comparator. A nested validation approach was implemented. In the inner loop, hyperparameter tuning for each model was performed via 5-fold cross-validation combined with a grid search strategy on the training set, with the objective of maximizing the mean cross-validation AUC. 
Key hyperparameters optimized included, but were not limited to, the number of estimators and maximum depth for tree-based models, the regularization parameter (C) for SVM and Logistic Regression, and the learning rate for gradient boosting models. In the outer loop, the final tuned models, with their optimal hyperparameters, were evaluated only once on the held-out test set to provide an unbiased estimate of generalization performance. The models were evaluated using a comprehensive set of metrics: AUC, accuracy, precision, specificity, sensitivity, negative predictive value (NPV), and F1-score. The AUC was prioritized as the primary metric for model selection and comparison due to its robustness to class imbalance. Finally, to ensure explainability, SHAP values were computed on the test set to interpret the output of the optimal model, quantifying the contribution of each feature to individual predictions.
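The evaluation metrics listed above follow directly from a confusion matrix, and the AUC can be computed as the probability that a randomly chosen case receives a higher score than a randomly chosen non-case. The sketch below illustrates these standard definitions; the function names and toy numbers are hypothetical, and it assumes every denominator is non-zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.
    ASSUMPTION: all denominators are non-zero."""
    sensitivity = tp / (tp + fn)            # recall, true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)              # positive predictive value
    npv = tn / (tn + fn)                    # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "precision": precision,
            "specificity": specificity, "sensitivity": sensitivity,
            "npv": npv, "f1": f1}


def auc(scores, labels):
    """AUC as the fraction of (case, non-case) pairs ranked correctly,
    counting ties as half. Quadratic in sample size, fine for small examples."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


toy_metrics = classification_metrics(tp=8, fp=2, tn=85, fn=5)
toy_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])   # 3 of 4 pairs ranked correctly
```

Unlike accuracy or precision, the AUC depends only on how cases are ranked relative to non-cases and not on a classification threshold, which is one reason it is commonly preferred when the outcome is rare.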
All analyses were performed in R 4.4.1. NHANES sampling weights for the 1999–2018 cycles were combined according to the recommended guidelines using the survey package. The ML analyses were implemented in the tidymodels framework, with the themis package used for class-imbalance resampling. A two-sided p value < 0.05 was considered statistically significant.
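The leakage-avoiding workflow described in this section (split first, learn preprocessing parameters from the training set only, and resample only the training set) can be outlined as follows. This is a language-agnostic illustration written in Python rather than the authors' R code; the function names are hypothetical, and the one-dimensional Gaussian smoothed bootstrap with a fixed bandwidth is a simplified stand-in for ROSE, not the ROSE implementation itself.

```python
import random
import statistics


def train_test_split(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of the rows and cut it into training and test parts."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_scaler(train_values):
    """Learn centering and scaling parameters from the training set ONLY."""
    mean = statistics.fmean(train_values)
    sd = statistics.pstdev(train_values) or 1.0   # guard against zero spread
    return mean, sd


def apply_scaler(values, mean, sd):
    """Apply previously learned parameters (used for both train and test)."""
    return [(v - mean) / sd for v in values]


def smoothed_bootstrap(minority_values, n_new, bandwidth, seed=0):
    """Simplified ROSE-like oversampling: resample a minority observation
    and add Gaussian noise. ASSUMPTION: fixed, user-chosen bandwidth."""
    rng = random.Random(seed)
    return [rng.choice(minority_values) + rng.gauss(0.0, bandwidth)
            for _ in range(n_new)]


record_ids = list(range(37952))
train_ids, test_ids = train_test_split(record_ids)   # 28,464 and 9,488 records
```

The key design point is ordering: because the scaler and the synthetic examples are derived from training rows alone, no information from the held-out test set can influence model fitting.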
Results
Baseline characteristics of the study participants
The characteristics of the study population are presented in Table 1. A total of 37,952 participants were included, with a mean age of 47.09 years, of whom 259 had been diagnosed with CRC. Notably, CRC patients were significantly older, with a mean age of 68.58 years. Compared with non-CRC participants, CRC prevalence was significantly higher (all p < 0.05) among those who were Non-Hispanic White, had a history of former smoking, mild alcohol consumption, or diagnoses of hypertension, diabetes, or CVDs. Marital status also showed a significant association, with higher prevalence in widowed/divorced/separated individuals. Notably, both ALI and log-ALI levels differed significantly between CRC and non-CRC groups (p < 0.001), indicating their potential discriminative value.
Table 1.
Baseline of included participants by the presence of colorectal cancer
| Characteristic | Total (N = 37,952) | No colorectal cancer (N = 37,693) | Colorectal cancer (N = 259) | P-value |
|---|---|---|---|---|
| Neutrophil count | 4.295(0.017) | 4.295(0.017) | 4.203(0.123) | 0.455 |
| Lymphocyte count | 2.146(0.010) | 2.148(0.010) | 1.888(0.067) | < 0.001 |
| Albumin, (g/dl) | 4.284(0.004) | 4.284(0.004) | 4.164(0.028) | < 0.001 |
| BMI, (kg/m2) | 28.870(0.069) | 28.871(0.069) | 28.579(0.433) | 0.503 |
| ALI | 68.434(0.540) | 68.481(0.542) | 59.121(2.537) | < 0.001 |
| Log-ALI | 4.103(0.004) | 4.104(0.004) | 3.947(0.035) | < 0.001 |
| Age, years | 47.090(0.210) | 46.982(0.210) | 68.581(1.127) | < 0.001 |
| Gender, n (%) | | | | 0.081 |
| Female | 18,835(50.591) | 18,706(50.559) | 129(57.110) | |
| Male | 19,117(49.409) | 18,987(49.441) | 130(42.890) | |
| Race, n (%) | | | | < 0.001 |
| Mexican American | 6472(7.829) | 6455(7.860) | 17(1.760) | |
| Non-Hispanic Black | 7507(10.094) | 7463(10.106) | 44(7.817) | |
| Non-Hispanic White | 17,812(70.661) | 17,635(70.585) | 177(85.706) | |
| Other Hispanic | 2977(5.035) | 2964(5.048) | 13(2.444) | |
| Other Race | 3184(6.380) | 3176(6.401) | 8(2.273) | |
| Education level, n (%) | | | | 0.152 |
| Below high school | 9557(15.480) | 9473(15.452) | 84(21.109) | |
| High school | 8779(23.702) | 8722(23.713) | 57(21.454) | |
| Above high school | 19,616(60.818) | 19,498(60.835) | 118(57.437) | |
| Poverty level, n (%) | | | | 0.167 |
| Low income | 11,257(20.237) | 11,183(20.234) | 74(20.827) | |
| Moderate income | 14,444(35.674) | 14,330(35.642) | 114(42.107) | |
| High income | 12,251(44.089) | 12,180(44.124) | 71(37.066) | |
| Marital status, n (%) | | | | < 0.001 |
| Married/Living with Partner | 23,093(64.458) | 22,958(64.506) | 135(54.962) | |
| Widowed/Divorced/Separated | 8364(18.290) | 8251(18.174) | 113(41.332) | |
| Never married | 6495(17.253) | 6484(17.321) | 11(3.705) | |
| Smoking status, n (%) | | | | < 0.001 |
| Never | 20,291(53.827) | 20,186(53.886) | 105(42.243) | |
| Former | 9600(25.034) | 9479(24.940) | 121(43.591) | |
| Now | 8061(21.139) | 8028(21.174) | 33(14.166) | |
| Alcohol consumption, n (%) | | | | < 0.001 |
| Never | 5242(10.760) | 5210(10.755) | 32(11.833) | |
| Former | 6598(14.037) | 6520(13.966) | 78(27.985) | |
| Mild | 12,743(36.359) | 12,631(36.328) | 112(42.542) | |
| Moderate | 5742(17.433) | 5722(17.475) | 20(9.233) | |
| Heavy | 7627(21.410) | 7610(21.476) | 17(8.406) | |
| Diabetes, n (%) | | | | < 0.001 |
| No | 32,198(88.762) | 32,012(88.836) | 186(74.121) | |
| Yes | 5754(11.238) | 5681(11.164) | 73(25.879) | |
| Hypertension, n (%) | | | | < 0.001 |
| No | 21,895(62.793) | 21,835(62.962) | 60(29.178) | |
| Yes | 16,057(37.207) | 15,858(37.038) | 199(70.822) | |
| CVD, n (%) | | | | < 0.001 |
| No | 33,880(91.652) | 33,705(91.763) | 175(69.765) | |
| Yes | 4072(8.348) | 3988(8.237) | 84(30.235) | |
Note: Continuous variables are presented as mean (SE); P values were calculated with the weighted Student’s t-test
Categorical variables are presented as n (weighted %); P values were calculated with the weighted chi-square test
Abbreviation: ALI, advanced lung cancer inflammation index; BMI, body mass index; CVD, cardiovascular disease
Correlation between ALI and CRC risk
ALI was log-transformed and entered into the models both as a continuous variable and as quartiles. The association between log-ALI and CRC risk was assessed using logistic regression, and the results are listed in Table 2. In the unadjusted model (Model 1), each unit increase in log-ALI was associated with a 47.6% lower risk of CRC (OR = 0.524; 95% CI: 0.400–0.686; P < 0.001). After adjustment for age, race, gender, poverty level, education, and marital status in Model 2, the inverse association persisted, with an OR of 0.771 (95% CI: 0.606–0.980; P = 0.034). Model 3 further adjusted for BMI, smoking, drinking, hypertension, diabetes, and CVD. The association remained: each unit increase in log-ALI was associated with a more than 20% lower risk of CRC, with an OR of 0.791 (95% CI: 0.628–0.997; P = 0.047). In accordance with contemporary statistical guidelines, we interpret this finding as indicative of a statistically significant trend. In the quartile analysis, compared with the lowest quartile (Q1), the highest quartile (Q4) had a 64.9% lower risk in Model 1 (OR = 0.351; 95% CI: 0.226–0.545; P < 0.001). After adjustment for all covariates, the reduction remained at 46.2% (OR = 0.538; 95% CI: 0.344–0.842; P = 0.007). The fully adjusted model achieved an AUC of 0.848 (95% CI: 0.828–0.867), outperforming the unadjusted model (DeLong test, p < 0.001) (Fig. 2). RCS analysis confirmed a linear dose-response relationship between log-ALI and CRC risk (P for nonlinearity = 0.731) (Fig. 3). The results remained consistent in sensitivity analyses that excluded extreme ALI values and in analyses performed both before and after multiple imputation (Supplementary Tables S1–S3).
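The percentage reductions quoted above follow from the odds ratios as (1 − OR) × 100, and an approximate p value can be recovered from a reported confidence interval. The sketch below illustrates both steps; the function names are hypothetical, the percentage is strictly a reduction in odds, and the p value uses a normal (Wald) approximation, whereas survey-weighted analyses rely on design-based degrees of freedom, so small differences from the reported values are expected.

```python
import math


def percent_reduction(odds_ratio):
    """Percentage reduction in odds implied by an odds ratio below 1."""
    return (1.0 - odds_ratio) * 100.0


def wald_p_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate two-sided p value from an OR and its 95% CI.
    ASSUMPTION: the CI is symmetric on the log scale (normal approximation)."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# Odds ratios reported in Table 2
continuous_model3 = percent_reduction(0.791)   # per ln-unit increase, Model III
q4_model3 = percent_reduction(0.538)           # Q4 vs Q1, Model III
approx_p = wald_p_from_ci(0.791, 0.628, 0.997)
```

Because the upper confidence limit for the continuous estimate in Model III (0.997) lies just below 1, the recovered p value sits close to the 0.05 threshold, in line with the cautious interpretation given in the text.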
Table 2.
Association between advanced lung cancer inflammation index levels and colorectal cancer among US participants, NHANES, 1999 to 2018
| ALI | Model I [OR (95% CI)] | p-value | Model II [OR (95% CI)] | p-value | Model III [OR (95% CI)] | p-value |
|---|---|---|---|---|---|---|
| Per ln-unit increase | 0.524(0.400,0.686) | < 0.001 | 0.771(0.606,0.980) | 0.034 | 0.791(0.628,0.997) | 0.047 |
| Q1 | Reference | | Reference | | Reference | |
| Q2 | 0.703(0.435,1.135) | 0.148 | 0.949(0.584,1.543) | 0.832 | 0.969(0.601,1.561) | 0.896 |
| Q3 | 0.704(0.472,1.050) | 0.085 | 1.041(0.700,1.549) | 0.840 | 1.077(0.731,1.585) | 0.706 |
| Q4 | 0.351(0.226,0.545) | < 0.001 | 0.534(0.340,0.838) | 0.007 | 0.538(0.344,0.842) | 0.007 |
| P for trend | | < 0.001 | | 0.041 | | 0.044 |
Note: Abbreviations: ALI, advanced lung cancer inflammation index; NHANES, National Health and Nutrition Examination Survey; BMI, body mass index; CVD, cardiovascular disease; OR, odds ratio; CI, confidence interval
Model I was unadjusted
Model II was adjusted for age, race, gender, poverty level, education and marital status
Model III was adjusted for age, race, gender, poverty level, education, marital status, BMI, smoking status, alcohol consumption, hypertension, diabetes, and CVD
Fig. 2.
Receiver operating characteristic (ROC) curves for log-ALI and colorectal cancer in Model 1 and Model 3
Fig. 3.
The dose-response relationship between log-ALI level and colorectal cancer risk. The model has been adjusted for age, race, gender, poverty level, education, marital status, BMI, smoking status, alcohol drinking, hypertension, diabetes, angina, stroke, and CVD
Subgroup analyses
Subgroup analyses were performed to assess the association between log-ALI and CRC across various population strata (Fig. 4). The inverse association between log-ALI and CRC was stronger in the elderly (> 60 years old), non-Hispanic whites, obese individuals with BMI ≥ 30, unmarried individuals, those without diabetes, and those with CVD. The most pronounced interaction was observed for CVD (P for interaction = 0.018).
Fig. 4.
Subgroup analyses of the association between log-ALI level and colorectal cancer. Notes: ALI, advanced lung cancer inflammation index. Analyses were adjusted for age, race, gender, poverty level, education, marital status, BMI, smoking status, alcohol drinking, hypertension, diabetes, angina, stroke, and CVD
Random forest analysis
To identify key determinants of CRC risk, we applied random forest modeling incorporating all covariates from the fully adjusted model. As shown in Fig. 5, feature importance visualization indicates that higher values of both mean decrease in Gini index and mean decrease in accuracy correspond to greater predictive significance. Notably, based on the mean decrease in the Gini index, log-ALI was identified as the most important predictor. It also ranked as the second most influential variable when assessed by the mean decrease in accuracy, confirming its robust and independent predictive value for CRC.
Fig. 5.
Variable importance according to the random forest analysis, evaluated by mean decrease in the Gini index (a) and mean decrease in accuracy (b). Notes: ALI, advanced lung cancer inflammation index; BMI, body mass index; CVD, cardiovascular disease
Boruta algorithm
Figure 6 presents the results of the Boruta algorithm, which identified key features associated with CRC. Variables located in the green zone were classified as important predictors. After 500 iterations, 12 significant variables were retained, including log-ALI, neutrophil, lymphocyte, age, marital status, alcohol consumption, BMI, albumin, poverty level, smoking status, sex and education level. Log-ALI had the highest importance of all variables. For final model development, diabetes, hypertension, CVD, and race were purposefully included due to their clinical relevance.
Fig. 6.
Predictor importance for colorectal cancer according to the Boruta algorithm. A predictor (green) was deemed important if its mean importance Z-score was significantly higher than the maximum value of the shadow variables (blue). Conversely, a predictor (red) was excluded if its mean importance Z-score was significantly lower than the maximum value of the shadow variables. Notes: ALI, advanced lung cancer inflammation index; BMI, body mass index; CVD, cardiovascular disease
ML predictive models and representative interpretations
To address class imbalance, both the ROSE and SMOTE algorithms were evaluated. Comparative analysis revealed that the ROSE method provided superior and more robust generalization performance across the ensemble of models. Consequently, the ROSE-processed dataset was selected for all subsequent analyses, while results based on SMOTE are provided in Supplementary Figure S2 and Supplementary Table S4. The predictive performance of eight algorithms was evaluated and compared (Fig. 7). LightGBM performed best and was stable across the training and test sets. Its test-set AUC reached 0.832, the highest among all models, while its training-set AUC was 0.870, indicating well-balanced learning with minimal overfitting. The model also exhibited excellent precision (0.999), specificity (0.860), and F1-score (0.817), underscoring its clinical reliability. Furthermore, LightGBM outperformed the logistic regression model, which attained AUC values of 0.848 on the training set and 0.824 on the test set. The model’s robustness was further validated through 5-fold cross-validation, demonstrating consistent performance improvement throughout the tuning process (Supplemental Figure S3). A comprehensive comparison of all evaluated models is provided in Supplementary Table S5.
Fig. 7.
Receiver operating characteristic (ROC) curves of the eight machine-learning models. To evaluate the performance of the different ML methods, we compared eight ML algorithms: XGBoost, DT, MLP, NNET, KNN, LightGBM, SVM and the logistic model. (A) ROC curves of the training set. (B) ROC curves of the testing set
To contextualize these results, we compared our model against several established CRC risk assessment models derived from the NHANES database, including inflammation-based prognostic markers. As shown in Supplemental Figure S4 and Supplementary Table S6, our model achieved a higher AUC value (0.940), indicating superior predictive performance relative to these existing biomarkers. We further assessed the performance of the LightGBM model across key clinical subgroups. As summarized in Supplemental Figure S5 and Supplementary Table S7, the model demonstrated strong and consistent discriminative ability (AUC > 0.84) in several major subgroups, including older adults (> 60 years) and non-diabetic participants.
To evaluate the contribution of individual predictors to CRC risk estimation within the LightGBM model, we performed SHAP analysis. This method quantifies the importance of each feature and its directional impact on model predictions (Fig. 8A and B). Among the 15 predictors analyzed, log-ALI exhibited the highest mean absolute SHAP value, identifying it as the most influential protective factor against CRC in the model. This finding aligns with prior clinical expectations. To further enhance model interpretability, we generated a SHAP waterfall plot and a force plot illustrating the direction and magnitude of feature contributions to the prediction for an individual participant, exemplified here by the second participant in the study cohort (Fig. 8C and D).
Fig. 8.
SHAP diagram of the LightGBM model. (A) SHAP value ranking of the variables. (B) SHAP honeycomb diagram of the LightGBM model. (C) SHAP waterfall plot. (D) SHAP force plot.
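To make the quantities in the SHAP analysis above concrete, the sketch below computes exact Shapley values by enumerating feature coalitions and then ranks features by mean absolute value. It illustrates the underlying definition only: SHAP implementations for tree models use a much faster dedicated algorithm, and the `toy_risk` function, its coefficients, and the baseline row are hypothetical stand-ins rather than the fitted LightGBM model.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one row `x` relative to a `baseline` row.

    Features outside a coalition are set to their baseline value."""
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(f):
    """Hypothetical risk score over [log-ALI, age, diabetes]; not the fitted model."""
    log_ali, age, diabetes = f
    return -0.8 * log_ali + 0.03 * age + 0.5 * diabetes + 0.01 * age * diabetes


baseline = [4.0, 50.0, 0.0]  # hypothetical reference participant
rows = [[3.5, 70.0, 1.0], [4.6, 45.0, 0.0]]
per_row = [shapley_values(toy_risk, row, baseline) for row in rows]
# Global importance: mean absolute Shapley value per feature across rows.
importance = [sum(abs(p[i]) for p in per_row) / len(per_row) for i in range(3)]
print(per_row[0], importance)
```

For each row, the contributions sum to the difference between that row's prediction and the baseline prediction, which is the property that waterfall and force plots visualize. A negative contribution from a higher log-ALI value corresponds to the protective direction described in the text.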
Discussion
In this first nationally representative study examining the association between ALI and CRC, multivariable regression and RCS analyses confirmed a linear inverse relationship. The robustness of this finding was supported by subgroup and sensitivity analyses. ROC analysis demonstrated the strong discriminative ability of ALI for CRC prevalence. Feature selection using random forest and the Boruta algorithm further emphasized the diagnostic relevance of ALI. To enhance predictive accuracy, seven machine learning models were employed, with SHAP values providing interpretable insights into feature contributions. Collectively, the results highlight the pivotal clinical value of ALI in CRC risk assessment. This study establishes ALI as a protective factor against CRC and suggests that its risk prediction model may offer practical utility for CRC prevention and early intervention.
Compared with existing CRC risk assessment tools, the ALI-based model developed in this study offers several distinct advantages. While the Asia-Pacific Colorectal Screening (APCS) score demonstrates reasonable clinical utility, it relies exclusively on basic demographic features such as age, sex, and family history [36]. In contrast, the ALI model incorporates objective biomarkers, enabling a more biologically informed risk assessment. Similarly, although the Harvard Cancer Risk Index covers a wider range of lifestyle factors [37], it depends on self-reported data that are susceptible to recall bias. Genetic-based prediction models perform well in specific populations [38], yet their broader applicability is constrained by high testing costs. In terms of predictive performance, a large-scale external validation in the UK Biobank indicated that the best existing models achieved an AUC of 0.67–0.70 in males, with even lower values in females [39]. By comparison, the ALI model not only exhibits superior discriminative ability but may also be less affected by population heterogeneity, as it utilizes routinely available and objective clinical measures. From a clinical implementation perspective, the ALI model has the potential to enhance current CRC screening pathways. Although colonoscopy remains the gold standard [40], its invasiveness limits widespread repeated use. Fecal immunochemical testing, the most widely used non-invasive screening method for average-risk individuals [41], could be strategically combined with the ALI model to improve follow-up adherence in screening programs [42]. Emerging multitarget stool DNA tests provide high sensitivity but are restricted by cost and accessibility [42]. Furthermore, by offering risk stratification rather than a binary outcome, the ALI model may help resolve clinical ambiguity in cases where multitarget stool DNA tests are negative yet colonoscopy results remain inconclusive [43]. 
Compared with traditional serum biomarkers (such as CEA and CA19-9), which are limited in early detection, the ALI model utilizes routine clinical parameters without necessitating additional assays. Advanced liquid biopsy techniques, including circulating tumor cells [44] and circulating tumor DNA [45], represent promising alternatives; however, their technical complexity and high cost currently hinder broad adoption in population-wide screening.
CRC development and progression result from the interplay of genetic susceptibility and environmental factors, with chronic inflammation, oxidative stress, and gut microbiota dysbiosis representing core pathogenic mechanisms [46, 47]. Persistent reactive oxygen species (ROS) accumulation may induce critical mutations in tumor suppressor genes and proto-oncogenes, thereby driving tumorigenesis [48]. Chronic inflammation compromises intestinal mucosal barrier integrity, impairs host defense, and promotes pathogen invasion, creating a self-perpetuating cycle [49]. Substantial clinical evidence indicates that chronic inflammation frequently precedes or accompanies tumor progression [50]. In this context, ALI represents a quantifiable biomarker warranting further investigation in tumor-associated inflammatory processes. Chronic inflammation promotes carcinogenesis through multiple mechanisms. Sustained inflammatory responses generate ROS, exacerbating oxidative damage [51]. In CRC, intensified inflammation reshapes the composition of epithelium-adherent microbiota, favoring species that harbor genotoxic gene products. For example, certain Escherichia coli strains produce colibactin, which induces host cell mutations [52]. Upon recognition by myeloid cells, translocated microbial metabolites amplify tumor-associated inflammation through IL-23-dependent pathways and related mechanisms [53]. Additionally, macrophage-derived factors, such as TGF-β, contribute to the formation of an immunosuppressive tumor microenvironment [54], ultimately enhancing cancer stemness and promoting immune evasion [55]. Nutritional status plays a critical role in both the initiation and progression of CRC. Epidemiological evidence indicates that obesity is a significant risk factor, with individuals exhibiting a BMI >30 kg/m² showing a 1.5- to 1.8-fold increased CRC risk [56]. Obesity contributes to CRC through multiple mechanisms. 
Visceral adipose tissue secretes adipokines, including leptin and resistin, which activate key oncogenic pathways [57]. Obesity is also frequently associated with insulin resistance and hyperinsulinemia, whereby insulin and related growth factors promote tumor growth by enhancing cellular proliferation and inhibiting apoptosis [58, 59]. Additionally, obesity induces chronic low-grade inflammation, characterized by elevated pro-inflammatory cytokine production and persistent systemic inflammatory responses [60]. Conversely, malnutrition is associated with increased CRC susceptibility. Deficiencies, including hypoalbuminemia and sarcopenia, progressively impair immune function, notably reducing immune surveillance against abnormal cells [61], thereby creating a permissive environment for tumor initiation and progression.
The ALI integrates three key parameters: BMI, circulating albumin, and NLR. BMI and albumin serve as nutritional indicators, whereas NLR reflects systemic inflammation, rendering ALI a holistic biomarker that simultaneously captures metabolic and inflammatory status. Neutrophils, which constitute 50–70% of circulating leukocytes, play multifaceted and often contradictory roles in CRC progression [62]. They contribute to tumor development through the release of IL-1β [63], yet also secrete CCL17 to recruit regulatory T cells, facilitating immune evasion [64], and support angiogenesis [65]. Tumor-associated neutrophils (TANs) exhibit considerable plasticity, shifting from anti-tumor to pro-tumor phenotypes [66]. Pro-tumor TANs display altered metabolic activity and secrete factors that promote angiogenesis, tumor growth, and metastasis—such as AGR2, which enhances the metastatic potential of CRC cells [67, 68]. Additionally, neutrophils form neutrophil extracellular traps (NETs) via NETosis [69], which have been linked to circulating tumor cell survival, metastasis, therapy resistance, and poor responses to immunotherapy [70, 71]. Lymphocytes play a central role in antitumor immunity, suppressing tumor proliferation and metastasis [72]. Clinically, peripheral lymphocyte counts serve as reliable prognostic biomarkers, with higher levels generally reflecting stronger antitumor responses mediated by cytotoxic T cells and natural killer cells [73]. Beyond direct cytotoxicity, lymphocytes modulate the tumor microenvironment through cytokine signaling, promoting antigen presentation and helping reverse immunosuppression [74]. The correlation between peripheral and tumor-infiltrating lymphocytes further supports their clinical relevance. Conversely, lymphopenia impairs immune surveillance and promotes tumor immune evasion. Obesity is recognized as a probable risk factor for CRC by leading international agencies [75]. 
Adipose tissue, particularly visceral fat, secretes pro-inflammatory cytokines, which contribute to tumorigenesis [76]. Adipose tissue inflammation also promotes insulin resistance and hyperinsulinemia, with insulin and insulin-like growth factors exerting mitogenic and anti-apoptotic effects [77]. Additionally, adipocytes release adipokines including leptin, which enhances cellular proliferation and invasiveness [78]. Obesity may further drive CRC through epigenetic modifications and alterations in sex hormone levels [79, 80]. Serum albumin serves as an objective biomarker of both nutritional status and systemic inflammation [81]. Low albumin levels are linked to impaired energy metabolism, reduced tolerance to antitumor therapies, and poor clinical outcomes [82]. Albumin exerts protective effects through its antioxidant and immunomodulatory properties, including regulation of antigen presentation and cytokine secretion [83]. Dysregulated albumin levels may promote colorectal carcinogenesis by disrupting redox balance and immune function, while tumor progression itself can further deplete systemic albumin [84]. The ALI index integrates BMI, serum albumin, and the NLR to capture key pathophysiological processes in CRC. Hypoalbuminemia reflects impaired antioxidant and barrier defenses, elevated NLR indicates pro-tumor inflammation, and increased BMI contributes to chronic inflammation and insulin resistance. Together, these components synergistically promote carcinogenesis through sustained oxidative stress, compromised immune surveillance, and enhanced proliferative signaling. As a composite biomarker, the ALI quantitatively encapsulates this multidimensional pathophysiology and shows meaningful correlation with CRC progression and treatment outcomes.
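As a concrete illustration of how the three components combine, the sketch below computes ALI using its conventional definition, BMI multiplied by serum albumin and divided by NLR. The function names, the albumin unit (g/dL), the use of the natural logarithm for log-ALI, and the example values are assumptions made for illustration and may differ from the exact specification applied in this study.

```python
import math


def ali(bmi_kg_m2, albumin_g_dl, neutrophils, lymphocytes):
    """ALI = BMI x serum albumin / NLR, with NLR = neutrophils / lymphocytes.

    Neutrophil and lymphocyte counts must share the same unit."""
    if neutrophils <= 0 or lymphocytes <= 0:
        raise ValueError("cell counts must be positive")
    nlr = neutrophils / lymphocytes
    return bmi_kg_m2 * albumin_g_dl / nlr


def log_ali(bmi_kg_m2, albumin_g_dl, neutrophils, lymphocytes):
    """Natural-log transform of ALI (the log base is an assumption here)."""
    return math.log(ali(bmi_kg_m2, albumin_g_dl, neutrophils, lymphocytes))


# Hypothetical participant: BMI 25 kg/m^2, albumin 4.0 g/dL,
# neutrophils 4.0 and lymphocytes 2.0 (same count unit), so NLR = 2.0.
print(ali(25.0, 4.0, 4.0, 2.0), round(log_ali(25.0, 4.0, 4.0, 2.0), 3))
```

Because NLR sits in the denominator, a rise in neutrophils relative to lymphocytes lowers ALI, whereas higher albumin or BMI raises it. A single ALI value can therefore reflect quite different underlying component profiles.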
In our subgroup analyses, the inverse association between ALI and CRC exhibited significant interactions in specific populations, suggesting heterogeneous effects of lifestyle and metabolic factors across populations. The enhanced protective effect of ALI in the elderly may be understood within the inflammaging framework, an age-related state of chronic low-grade inflammation and immunosenescence. The attenuated protection in diabetic patients may reflect altered ALI component biology under hyperglycemic conditions. Chronic hyperglycemia promotes tumor proliferation while impairing immune surveillance [77], and induces insulin resistance and glycation of albumin, which may reduce albumin's antioxidant capacity. The obesity-diabetes comorbidity [85] creates a complex metabolic environment where ALI’s nutritional-inflammatory balance indicators may be compromised. With respect to CVD, chronic inflammation serves as a critical link between CVD and cancer development. Patients with CVD often exhibit persistent systemic inflammation, which may facilitate cancer cell proliferation and metastasis. Furthermore, CVD and cancer share several common risk factors, including obesity and hyperglycemia, which collectively contribute to the overlapping pathophysiological processes of both conditions. Concerning marital status, evidence indicates that being unmarried may influence cancer risk through psychosocial mechanisms [86]. Insufficient social support and chronic psychological stress can induce neuroendocrine dysregulation and impair immune surveillance [87].
Our study, utilizing a nationally representative sample, establishes a significant inverse association between ALI and CRC risk. A key methodological strength lies in the integration of complementary analytical approaches, including weighted multivariable logistic regression, RCS, and multiple ML algorithms. As a composite biomarker derived from routinely available parameters, ALI is a cost-effective and accessible metric that holds potential for seamless integration into routine clinical practice. Several methodological limitations merit consideration. First, although our analysis incorporated ten cross-sectional surveys and adjusted for multiple covariates, the cross-sectional design of this study is a key limitation, as it prevents any determination of causality. This inherent limitation is further compounded by potential residual confounding from unmeasured or imperfectly measured factors, such as detailed dietary patterns, physical activity levels, and other lifestyle variables not comprehensively captured in the NHANES database. We strongly emphasize that future prospective cohort studies with serial ALI measurements are essential to validate these findings and elucidate temporal relationships. Mendelian randomization analyses could also provide valuable complementary evidence to assess potential causal mechanisms while accounting for unmeasured confounding. Second, CRC case identification relied entirely on self-report, as NHANES lacks imaging or histopathological confirmation. This may affect the accuracy of observed associations. In addition, the absence of clinical validation limited access to tumor characteristics, constraining detailed subgroup analyses. It should be noted, however, that previous validation studies have demonstrated the reliability of self-reported cancer data, showing high sensitivity for colon cancer and an overall positive predictive value of 0.75 for all cancers combined [88].
Moreover, potential non-differential misclassification of the novel ALI index would likely attenuate the observed effect sizes, indicating that the true association may be stronger than reported. Third, the relatively limited number of CRC cases may constrain the stability of ML models and the detection of subtle predictive patterns. However, our comprehensive approach, including the use of algorithms specifically designed for imbalanced data, rigorous cross-validation, and independent test set evaluation, provides reasonable assurance of model reliability. The consistent performance observed between training and test sets across multiple algorithms further supports the robustness of our findings. Future validation in larger, independent cohorts will be essential to confirm the generalizability of the ALI-based risk stratification model. Fourth, inherent limitations in laboratory measurements and biological variability should be considered. Although NHANES laboratories maintain CLIA certification and implement rigorous quality control procedures such as standardized protocols, regular instrument calibration, and blinded repeat testing, minor systematic errors between batches may still occur. Additionally, the biological interpretation of ALI components requires caution: serum albumin levels may be influenced by acute conditions and hydration status, and BMI serves only as an approximate indicator of body composition, subject to variation based on measurement timing and specific conditions. We emphasize that the limitations acknowledged in this study do not undermine the reliability of the current results but instead highlight key areas for methodological refinement in future investigations.
This study demonstrates that the ALI-based CRC risk stratification model holds substantial potential for clinical implementation. By utilizing routinely available clinical parameters, the model requires no specialized equipment or additional testing, offering significant advantages in cost-effectiveness and accessibility. The ALI-based risk stratification tool presents a practical approach for enhancing current CRC screening protocols. Its implementation can be achieved through two complementary pathways: integration into hospital laboratory information systems for automated risk calculation using routine parameters (BMI, albumin, and NLR), and development of web-based calculators for primary care settings. This dual approach would enable identification of high-risk individuals for prioritized colonoscopy referral while allowing dynamic monitoring through serial ALI measurements in moderate-risk cases. Several challenges require addressing before widespread implementation. The model necessitates validation across diverse populations and healthcare settings. Successful integration will require appropriate IT infrastructure, staff training, and workflow adjustments, with ALI scores always interpreted within a comprehensive clinical assessment. Looking forward, the predictive performance of the ALI index could potentially be augmented by integrating it with other emerging biomarkers. Particularly promising is the incorporation of gut microbiota signatures, given their established link to both intestinal inflammation and CRC carcinogenesis [89–91]. Future studies that combine inexpensive systemic indices like ALI with multi-omics approaches, including microbiome profiling, are warranted to build more comprehensive and powerful risk stratification tools.
This study thus establishes a foundation for developing more efficient, cost-effective, and precise CRC risk assessment tools that can be readily integrated into diverse healthcare settings, ultimately contributing to improved early detection and prevention of CRC.
Conclusion
Our study demonstrates a significant inverse association between ALI and CRC susceptibility in American adults. The CRC risk prediction framework incorporating ALI, developed using multiple ML approaches, exhibited robust discriminative performance. These findings underscore the potential protective effects of maintaining optimal nutritional status and modulating systemic inflammation in CRC prevention. We recommend future prospective cohort studies and mechanistic investigations to validate the prognostic utility of ALI for CRC risk stratification.
Supplementary Information
Below is the link to the electronic supplementary material.
Acknowledgements
NHANES protocols were approved by the NCHS Ethics Review Board. All participants provided written informed consent. We extend our sincere appreciation to everyone who supported this research. We are also grateful for the valuable input and constructive feedback from peers in the discipline.
Abbreviations
- NHANES: National Health and Nutrition Examination Survey
- CRC: Colorectal cancer
- ALI: Advanced lung cancer inflammation index
- PIR: Poverty income ratio
- RCS: Restricted cubic spline
- ROC: Receiver operating characteristic
- AUC: Area under the curve
- ML: Machine learning
- SHAP: Shapley additive explanations
- WHO: World Health Organization
- NLR: Neutrophil-to-lymphocyte ratio
- BMI: Body mass index
- B2B: Bench-to-bedside
- CDC: Centers for Disease Control and Prevention
- NCHS: National Center for Health Statistics
- MCQ: Medical conditions questionnaire
- CVD: Cardiovascular disease
- SE: Standard error
- XGBoost: Extreme gradient boosting
- DT: Decision tree
- MLP: Multilayer perceptron
- NNET: Neural networks
- KNN: K-nearest neighbors
- SVM: Support vector machine
- LightGBM: Light gradient boosting machine
- NPV: Negative predictive value
- IARC: International Agency for Research on Cancer
- AICR: American Institute for Cancer Research
- ctDNA: Circulating tumor DNA
- scRNA-seq: Single-cell RNA sequencing
- CEA: Carcinoembryonic antigen
- ROS: Reactive oxygen species
- TANs: Tumor-associated neutrophils
Author contributions
MG and YL contributed to the Original draft, Methodology, and Formal analysis. HMW and JMZ contributed to Validation and Formal analysis. GXZ contributed to Resources and Data curation. NZ was involved in Writing–review & editing, Supervision, Project administration, and Investigation.
Funding
This work was supported by the Jilin Provincial Department of Education (grant number JJKH20241326KJ).
Data availability
The datasets used during the current study are available from the NHANES database (https://www.cdc.gov/nchs/nhanes/about/index.html). The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Declarations
Ethical approval
This study used publicly available, anonymized data from the NHANES. Ethical approval was not required for this analysis. All study protocols of NHANES were approved by the Research Ethics Review Board of the NCHS. Our research strictly adhered to the ethical principles outlined in the Declaration of Helsinki. All analytical methods were conducted in full compliance with relevant NHANES data use policies and prevailing ethical regulations.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Ming Gao and Ying Li contributed equally to this work and share first authorship.
References
- 1.Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. Cancer J Clin. 2021;71(3):209–49. [DOI] [PubMed] [Google Scholar]
- 2.Bray F, Laversanne M, Sung H, Ferlay J, Siegel RL, Soerjomataram I, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. Cancer J Clin. 2024;74(3):229–63. [DOI] [PubMed] [Google Scholar]
- 3.Keum N, Giovannucci E. Global burden of colorectal cancer: emerging trends, risk factors and prevention strategies. Nat Reviews Gastroenterol Hepatol. 2019;16(12):713–32. [DOI] [PubMed] [Google Scholar]
- 4.de Wit DF, Hanssen NMJ, Wortelboer K, Herrema H, Rampanelli E, Nieuwdorp M. Evidence for the contribution of the gut Microbiome to obesity and its reversal. Sci Transl Med. 2023;15(723):eadg2773. [DOI] [PubMed] [Google Scholar]
- 5.Chaplin A, Rodriguez RM, Segura-Sampedro JJ, Ochogavía-Seguí A, Romaguera D, Barceló-Coblijn G. Insights behind the relationship between colorectal cancer and obesity: is visceral adipose tissue the missing link? Int J Mol Sci. 2022;23(21). [DOI] [PMC free article] [PubMed]
- 6.Bahreiny SS, Ahangarpour A, Rajaei E, Sharifani MS, Aghaei M. Meta-Analytical and Meta-Regression evaluation of subclinical hyperthyroidism’s effect on male reproductive health: hormonal and seminal perspectives. Reproductive sciences (Thousand Oaks, Calif). 2024;31(10):2957-71. [DOI] [PubMed]
- 7.Jiang X, Shapiro DJ. The immune system and inflammation in breast cancer. Mol Cell Endocrinol. 2014;382(1):673–82. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Furman D, Campisi J, Verdin E, Carrera-Bastos P, Targ S, Franceschi C, et al. Chronic inflammation in the etiology of disease across the life span. Nat Med. 2019;25(12):1822–32. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Qu C, Yang ST, Shen TL, Peng QT, Sun XJ, Lin YY. Exploring the influence of anemia and inflammation indices on colorectal cancer: analysis of the National health and nutrition examination survey from 2011 to 2018. Front Oncol. 2024;14:1457886. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Orchard TS, Andridge RR, Yee LD, Lustberg MB. Diet quality, inflammation, and quality of life in breast cancer survivors: A cross-sectional analysis of pilot study data. J Acad Nutr Dietetics. 2018;118(4):578–e881. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Don BR, Kaysen G. Serum albumin: relationship to inflammation and nutrition. Semin Dial. 2004;17(6):432–7. [DOI] [PubMed] [Google Scholar]
- 12.Abedizadeh R, Majidi F, Khorasani HR, Abedi H, Sabour D. Colorectal cancer: a comprehensive review of carcinogenesis, diagnosis, and novel strategies for classified treatments. Cancer Metastasis Rev. 2024;43(2):729–53. [DOI] [PubMed] [Google Scholar]
- 13.Chen Q, Ke HT, Dai ZF, Liu Z. Nanoscale theranostics for physical stimulus-responsive cancer therapies. Biomaterials. 2015;73:214–30. [DOI] [PubMed] [Google Scholar]
- 14.Qin SY, Cheng YJ, Lei Q, Zhang AQ, Zhang XZ. Combinational strategy for high-performance cancer chemotherapy. Biomaterials. 2018;171:178–97. [DOI] [PubMed] [Google Scholar]
- 15.Wu Q, Yang ZP, Nie YZ, Shi YQ, Fan DM. Multi-drug resistance in cancer chemotherapeutics: mechanisms and lab approaches. Cancer Lett. 2014;347(2):159–66. [DOI] [PubMed] [Google Scholar]
- 16.Folkman J. Tumor angiogenesis: a possible control point in tumor growth. Ann Intern Med. 1975;82(1):96–100. [DOI] [PubMed] [Google Scholar]
- 17.Hua X, Chen J, Wu Y, Sha J, Han SH, Zhu XL. Prognostic role of the advanced lung cancer inflammation index in cancer patients: a meta-analysis. World J Surg Oncol. 2019;17(1):177. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Ciciola P, Cascetta P, Bianco C, Formisano L, Bianco R. Combining immune checkpoint inhibitors with Anti-angiogenic agents. J Clin Med. 2020;9(3):675. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Ali I, Lone MN, Suhail M, Mukhtar SD, Asnin L. Advances in nanocarriers for anticancer drugs delivery. Curr Med Chem. 2016;23(20):2159–87. [DOI] [PubMed] [Google Scholar]
- 20.Ali I, Alsehli M, Scotti L, Tullius Scotti M, Tsai ST, Yu RS, et al. Progress in polymeric Nano-medicines for theranostic cancer treatment. Polymers. 2020;12(3):598. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Ali I, Wani WA, Khan A, Haque A, Ahmad A, Saleem K, et al. Synthesis and synergistic antifungal activities of a pyrazoline based ligand and its copper(II) and nickel(II) complexes with conventional antifungals. Microb Pathog. 2012;53(2):66–73. [DOI] [PubMed] [Google Scholar]
- 22.He X, Lan HR, Jin KT, Liu FL. Can immunotherapy reinforce chemotherapy efficacy? A new perspective on colorectal cancer treatment. Front Immunol. 2023;14:1237764. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Bahreiny SS, Ahangarpour A, Aghaei M, Mohammadpour Fard R, Jalali Far MA, Sakhavarz T. A closer look at Galectin-3: its association with gestational diabetes mellitus revealed by systematic review and meta-analysis. J Diabetes Metab Disord. 2024;23(2):1621–33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Jafri SH, Shi R, Mills G. Advance lung cancer inflammation index (ALI) at diagnosis is a prognostic marker in patients with metastatic non-small cell lung cancer (NSCLC): a retrospective review. BMC Cancer. 2013;13:158. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Gao XY, Qi JC, Du B, Weng XJ, Lai JH, Wu RP. Combined influence of nutritional and inflammatory status and breast cancer: findings from the NHANES. BMC Public Health. 2024;24(1):2245. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Qiu X, Shen S, Lu DH, Jiang NZ, Feng YF, Li JD, et al. Predictive efficacy of the advanced lung cancer inflammation index in hepatocellular carcinoma after hepatectomy. J Inflamm Res. 2024;17:5197–210. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Yin C, Toiyama Y, Okugawa Y, Omura Y, Kusunoki Y, Kusunoki K, et al. Clinical significance of advanced lung cancer inflammation index, a nutritional and inflammation index, in gastric cancer patients after surgical resection: A propensity score matching analysis. Clin Nutr. 2021;40(3):1130–6. [DOI] [PubMed] [Google Scholar]
- 28.Zhang YB, Pan YX, Tu JB, Liao LH, Lin SQ, Chen KH, et al. The advanced lung cancer inflammation index predicts long-term outcomes in patients with hypertension: National health and nutrition examination study, 1999–2014. Front Nutr. 2022;9:989914. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Tu JB, Wu B, Xiu JM, Deng JY, Lin SQ, Lu J, et al. Advanced lung cancer inflammation index is associated with long-term cardiovascular death in hypertensive patients: National health and nutrition examination study, 1999–2018. Frontiers in physiology. 2023;14:1074672. [DOI] [PMC free article] [PubMed]
- 30.Fan WJ, Zhang Y, Liu YX, Ding ZJ, Si YQ, Shi F, et al. Nomograms based on the advanced lung cancer inflammation index for the prediction of coronary artery disease and calcification. Clin Appl thrombosis/hemostasis: Official J Int Acad Clin Appl Thrombosis/Hemostasis. 2021;27:10760296211060455. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Kusunoki K, Toiyama Y, Okugawa Y, Yamamoto A, Omura Y, Kusunoki Y, et al. The advanced lung cancer inflammation index predicts outcomes in patients with crohn’s disease after surgical resection. Colorectal Disease: Official J Association Coloproctology Great Br Irel. 2021;23(1):84–93. [DOI] [PubMed] [Google Scholar]
- 32.Torfi E, Bahreiny SS, Saki N, Khademi R, Sarbazjoda E, Nezhad IA, et al. Evaluation of Pro-BNP biomarker in heart failure patients and its relationship with complete blood count parameters: A case-control study. Health Sci Rep. 2024;7(9):e70083. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Aghaei M, Bahreiny SS, Zayeri ZD, Davari N, Abolhasani MM, Saki N. Evaluation of complete blood count parameters in patients with diabetes mellitus: A systematic review. Health Sci Rep. 2025;8(2):e70488. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Mandaliya H, Jones M, Oldmeadow C, Nordman II. Prognostic biomarkers in stage IV non-small cell lung cancer (NSCLC): neutrophil to lymphocyte ratio (NLR), lymphocyte to monocyte ratio (LMR), platelet to lymphocyte ratio (PLR) and advanced lung cancer inflammation index (ALI). Translational Lung Cancer Res. 2019;8(6):886–94. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Mast AE, Steele WR, Johnson B, Wright DJ, Cable RG, Carey P, et al. Population-based screening for anemia using first-time blood donors. Am J Hematol. 2012;87(5):496–502. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Yeoh KG, Ho KY, Chiu HM, Zhu F, Ching JY, Wu DC, et al. The Asia-Pacific colorectal screening score: a validated tool that stratifies risk for colorectal advanced neoplasia in asymptomatic Asian subjects. Gut. 2011;60(9):1236–41. [DOI] [PubMed] [Google Scholar]
- 37.Colditz GA, Atwood KA, Emmons K, Monson RR, Willett WC, Trichopoulos D, et al. Harvard report on cancer prevention 4: Harvard cancer risk index. Risk index working group, Harvard center for cancer prevention. Cancer Causes Control: CCC. 2000;11(6):477–88. [DOI] [PubMed] [Google Scholar]
- 38.Wang HM, Chang TH, Lin FM, Chao TH, Huang WC, Liang C, et al. A new method for post Genome-Wide association study (GWAS) analysis of colorectal cancer in Taiwan. Gene. 2013;518(1):107–13. [DOI] [PubMed] [Google Scholar]
- 39.Usher-Smith JA, Harshfield A, Saunders CL, Sharp SJ, Emery J, Walter FM, et al. External validation of risk prediction models for incident colorectal cancer using UK biobank. Br J Cancer. 2018;118(5):750–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.The Lancet Gastroenterology H. Controversy over colonoscopy for colorectal cancer screening. Lancet Gastroenterol Hepatol. 2022;7(12):1061. [DOI] [PubMed] [Google Scholar]
- 41.Joseph DA, King JB, Dowling NF, Thomas CC, Richardson LC. Vital signs: colorectal cancer screening test use - United States, 2018. MMWR Morb Mortal Wkly Rep. 2020;69(10):253–9.
- 42.Green BB, Baldwin LM, West II, Schwartz M, Coronado GD. Low rates of colonoscopy follow-up after a positive fecal immunochemical test in a Medicaid health plan delivered mailed colorectal cancer screening program. J Prim Care Community Health. 2020;11:2150132720958525.
- 43.Ladabaum U, Mannalithara A, Weng Y, Schoen RE, Dominitz JA, Desai M, et al. Comparative effectiveness and cost-effectiveness of colorectal cancer screening with blood-based biomarkers (liquid biopsy) vs fecal tests or colonoscopy. Gastroenterology. 2024;167(2):378–91.
- 44.Tao XY, Li QQ, Zeng Y. Clinical application of liquid biopsy in colorectal cancer: detection, prediction, and treatment monitoring. Mol Cancer. 2024;23(1):145.
- 45.Bessa X, Vidal J, Balboa JC, Márquez C, Duenwald S, He Y, et al. High accuracy of a blood ctDNA-based multimodal test to detect colorectal cancer. Ann Oncol. 2023;34(12):1187–93.
- 46.Lichtenstern CR, Ngu RK, Shalapour S, Karin M. Immunotherapy, inflammation and colorectal cancer. Cells. 2020;9(3):618.
- 47.Mena S, Ortega A, Estrela JM. Oxidative stress in environmental-induced carcinogenesis. Mutat Res. 2009;674(1–2):36–44.
- 48.Poetsch AR. The genomics of oxidative DNA damage, repair, and resulting mutagenesis. Comput Struct Biotechnol J. 2020;18:207–19.
- 49.Sies H. Oxidative stress: a concept in redox biology and medicine. Redox Biol. 2015;4:180–3.
- 50.Greten FR, Grivennikov SI. Inflammation and cancer: triggers, mechanisms, and consequences. Immunity. 2019;51(1):27–41.
- 51.Schmitt M, Greten FR. The inflammatory pathogenesis of colorectal cancer. Nat Rev Immunol. 2021;21(10):653–67.
- 52.Wilson MR, Jiang Y, Villalta PW, Stornetta A, Boudreau PD, Carrá A, et al. The human gut bacterial genotoxin colibactin alkylates DNA. Science. 2019;363(6428):eaar7785.
- 53.Grivennikov SI, Wang K, Mucida D, Stewart CA, Schnabl B, Jauch D, et al. Adenoma-linked barrier defects and microbial products drive IL-23/IL-17-mediated tumour growth. Nature. 2012;491(7423):254–8.
- 54.Canli Ö, Nicolas AM, Gupta J, Finkelmeier F, Goncharova O, Pesic M, et al. Myeloid cell-derived reactive oxygen species induce epithelial mutagenesis. Cancer Cell. 2017;32(6):869–e835.
- 55.Gretschel J, El Hage R, Wang R, Chen Y, Pietzner A, Loew A, et al. Harnessing oxylipins and inflammation modulation for prevention and treatment of colorectal cancer. Int J Mol Sci. 2024;25(10):5408.
- 56.Lauby-Secretan B, Scoccianti C, Loomis D, Grosse Y, Bianchini F, Straif K. Body fatness and cancer–viewpoint of the IARC working group. N Engl J Med. 2016;375(8):794–8.
- 57.O’Sullivan DE, Sutherland RL, Town S, Chow K, Fan J, Forbes N, et al. Risk factors for early-onset colorectal cancer: a systematic review and meta-analysis. Clin Gastroenterol Hepatol. 2022;20(6):1229–e405.
- 58.Bardou M, Barkun AN, Martel M. Obesity and colorectal cancer. Gut. 2013;62(6):933–47.
- 59.Quail DF, Olson OC, Bhardwaj P, Walsh LA, Akkari L, Quick ML, et al. Obesity alters the lung myeloid cell landscape to enhance breast cancer metastasis through IL5 and GM-CSF. Nat Cell Biol. 2017;19(8):974–87.
- 60.Quail DF, Dannenberg AJ. The obese adipose tissue microenvironment in cancer development and progression. Nat Rev Endocrinol. 2019;15(3):139–54.
- 61.Thanikachalam K, Khan G. Colorectal cancer and nutrition. Nutrients. 2019;11(1):164.
- 62.Eruslanov EB, Singhal S, Albelda SM. Mouse versus human neutrophils in cancer: a major knowledge gap. Trends Cancer. 2017;3(2):149–60.
- 63.Wang Y, Wang K, Han GC, Wang RX, Xiao H, Hou CM, et al. Neutrophil infiltration favors colitis-associated tumorigenesis by activating the interleukin-1 (IL-1)/IL-6 axis. Mucosal Immunol. 2014;7(5):1106–15.
- 64.Mishalian I, Bayuh R, Eruslanov E, Michaeli J, Levy L, Zolotarov L, et al. Neutrophils recruit regulatory T-cells into tumors via secretion of CCL17–a new mechanism of impaired antitumor immunity. Int J Cancer. 2014;135(5):1178–86.
- 65.Gordon-Weeks AN, Lim SY, Yuzhalin AE, Jones K, Markelc B, Kim KJ, et al. Neutrophils promote hepatic metastasis growth through fibroblast growth factor 2-dependent angiogenesis in mice. Hepatology. 2017;65(6):1920–35.
- 66.Silvestre-Roig C, Kalafati L, Chavakis T. Neutrophils are shaped by the tumor microenvironment: novel possibilities for targeting neutrophils in cancer. Signal Transduct Target Ther. 2024;9(1):77.
- 67.Bui TM, Yalom LK, Ning E, Urbanczyk JM, Ren X, Herrnreiter CJ, et al. Tissue-specific reprogramming leads to angiogenic neutrophil specialization and tumor vascularization in colorectal cancer. J Clin Invest. 2024;134(7):e174545.
- 68.Xiu BQ, Chi YY, Liu L, Chi WR, Zhang Q, Chen JJ, et al. LINC02273 drives breast cancer metastasis by epigenetically increasing AGR2 transcription. Mol Cancer. 2019;18(1):187.
- 69.Tian SB, Chu YN, Hu J, Ding XL, Liu ZB, Fu DA, et al. Tumour-associated neutrophils secrete AGR2 to promote colorectal cancer metastasis via its receptor CD98hc-xCT. Gut. 2022;71(12):2489–501.
- 70.Yazdani HO, Roy E, Comerci AJ, van der Windt DJ, Zhang H, Huang H, et al. Neutrophil extracellular traps drive mitochondrial homeostasis in tumors to augment growth. Cancer Res. 2019;79(21):5626–39.
- 71.Sun NJ, Jiang JJ, Chen BY, Chen YR, Wu HM, Wang HY, et al. Neutrophil extracellular trap genes predict immunotherapy response in gastric cancer. Heliyon. 2024;10(17):e37357.
- 72.Araki K, Ito Y, Fukada I, Kobayashi K, Miyagawa Y, Imamura M, et al. Predictive impact of absolute lymphocyte counts for progression-free survival in human epidermal growth factor receptor 2-positive advanced breast cancer treated with pertuzumab and trastuzumab plus eribulin or nab-paclitaxel. BMC Cancer. 2018;18(1):982.
- 73.Bai ZY, Zhou Y, Ye ZF, Xiong JL, Lan HY, Wang F. Tumor-infiltrating lymphocytes in colorectal cancer: the fundamental indication and application on immunotherapy. Front Immunol. 2021;12:808964.
- 74.Wankhede D, Yuan T, Kloor M, Halama N, Brenner H, Hoffmeister M. Clinical significance of combined tumour-infiltrating lymphocytes and microsatellite instability status in colorectal cancer: a systematic review and network meta-analysis. Lancet Gastroenterol Hepatol. 2024;9(7):609–19.
- 75.Clinton SK, Giovannucci EL, Hursting SD. The World Cancer Research Fund/American Institute for Cancer Research third expert report on diet, nutrition, physical activity, and cancer: impact and future directions. J Nutr. 2020;150(4):663–71.
- 76.Wei HJ, Zeng R, Lu JH, Lai WF, Chen WH, Liu HY, et al. Adipose-derived stem cells promote tumor initiation and accelerate tumor growth by interleukin-6 production. Oncotarget. 2015;6(10):7713–26.
- 77.Hotamisligil GS. Inflammation and metabolic disorders. Nature. 2006;444(7121):860–7.
- 78.Hoda MR, Keely SJ, Bertelsen LS, Junger WG, Dharmasena D, Barrett KE. Leptin acts as a mitogenic and antiapoptotic factor for colonic cancer cells. Br J Surg. 2007;94(3):346–54.
- 79.Li R, Grimm SA, Chrysovergis K, Kosak J, Wang X, Du Y, et al. Obesity, rather than diet, drives epigenomic alterations in colonic epithelium resembling cancer progression. Cell Metab. 2014;19(4):702–11.
- 80.Wu JL, Bai YN, Lu YW, Yu ZX, Zhang SM, Yu B, et al. Role of sex steroids in colorectal cancer: pathomechanisms and medical applications. Am J Cancer Res. 2024;14(7):3200–21.
- 81.Lv L, Sun X, Liu B, Song J, Wu DJH, Gao Y, et al. Genetically predicted serum albumin and risk of colorectal cancer: a bidirectional Mendelian randomization study. Clin Epidemiol. 2022;14:771–8.
- 82.Baracos VE, Martin L, Korc M, Guttridge DC, Fearon KCH. Cancer-associated cachexia. Nat Rev Dis Primers. 2018;4:17105.
- 83.Halliwell B. Albumin–an important extracellular antioxidant? Biochem Pharmacol. 1988;37(4):569–71.
- 84.Tan DW, Fu Y, Tong WD, Li F. Prognostic significance of lymphocyte to monocyte ratio in colorectal cancer: a meta-analysis. Int J Surg. 2018;55:128–38.
- 85.Zhang XQ, Ma N, Lin QS, Chen KN, Zheng FJY, Wu J, et al. Body roundness index and all-cause mortality among US adults. JAMA Netw Open. 2024;7(6):e2415051.
- 86.Mozafar Saadati H, Khodamoradi F, Salehiniya H. Associated factors of survival rate and screening for colorectal cancer in Iran: a systematic review. J Gastrointest Cancer. 2020;51(2):401–11.
- 87.Pike JL, Irwin MR. Dissociation of inflammatory markers and natural killer cell activity in major depressive disorder. Brain Behav Immun. 2006;20(2):169–74.
- 88.Bergmann MM, Calle EE, Mervis CA, Miracle-McMahill HL, Thun MJ, Heath CW. Validity of self-reported cancers in a prospective cohort study in comparison with data from state cancer registries. Am J Epidemiol. 1998;147(6):556–62.
- 89.Wong CC, Yu J. Gut microbiota in colorectal cancer development and therapy. Nat Rev Clin Oncol. 2023;20(7):429–52.
- 90.Cheng YW, Ling ZX, Li LJ. The intestinal microbiota and colorectal cancer. Front Immunol. 2020;11:615056.
- 91.Arthur JC, Perez-Chanona E, Mühlbauer M, Tomkovich S, Uronis JM, Fan TJ, et al. Intestinal inflammation targets cancer-inducing activity of the microbiota. Science. 2012;338(6103):120–3.
Data Availability Statement
The datasets used during the current study are available from the NHANES database (https://www.cdc.gov/nchs/nhanes/about/index.html). The datasets used and/or analysed during the current study are also available from the corresponding author on reasonable request.








