iScience. 2025 Dec 24;29(2):114522. doi: 10.1016/j.isci.2025.114522

A multicenter multimodel habitat radiomics model for predicting immunotherapy response in advanced NSCLC

Xiaoxiao Huang 1,2,6, Yurun Xie 3,6, Haiyang Nong 1,6, Xiaoxin Huang 5, Donglian Gu 2, Kui Wang 2, Deyou Huang 1,4,, Guanqiao Jin 2,7,∗∗
PMCID: PMC12907010  PMID: 41704757

Summary

A robust predictive biomarker is critical for identifying patients with NSCLC who may benefit from immunotherapy. This study developed a CT-based habitat model using 590 advanced NSCLC cases. The model was constructed on contrast-enhanced CT images and validated on an independent cohort with non-contrast CT. Tumor volumes were segmented into three subregions via K-means clustering. Radiomic features were extracted from each habitat and used to build predictive models with six machine learning classifiers. The ExtraTrees-based habitat model demonstrated superior predictive performance in the test cohort (AUC = 0.814). Compared to traditional radiomics, 3D deep learning, clinical, and PD-L1 expression models, the habitat model maintained strong predictive advantages, enabling efficient prediction of immunotherapy benefit and aiding in the identification of suitable patients for personalized treatment.

Subject areas: Health sciences, Medicine, Medical specialty, Internal medicine, Oncology


Highlights

  • A multi-algorithm Habitat model predicts NSCLC immunotherapy response

  • Habitat model outperforms radiomics, deep learning, and clinical models

  • Imaging-based models surpass PD-L1 in predicting immunotherapy benefit

  • Habitat analysis serves as a non-invasive biomarker for treatment decisions



Introduction

Non-small cell lung cancer (NSCLC) is the most common histological type of lung cancer,1 with a 5-year survival rate of around 18% for advanced-stage disease.2

Recently, the combination of programmed death-1 (PD-1) receptor and its ligand (PD-L1) checkpoint blockade antibodies with chemotherapy has significantly improved patients' overall survival (OS) rates and progression-free survival (PFS).3 However, only 20% of patients with lung cancer respond to PD-1/PD-L1 immunotherapy and achieve meaningful clinical benefits.4 Therefore, identifying robust biomarkers capable of effectively predicting responses to PD-1/PD-L1 checkpoint blockade antibodies is crucial to avoid unnecessary immune-related toxicities associated with immunotherapy.

PD-L1 expression in tumor tissue is a primary biomarker, approved by the U.S. Food and Drug Administration (FDA), for predicting patient response to immunotherapy. Patients with positive PD-L1 expression usually have better clinical outcomes.5 However, recent studies have shown that patients with lung cancer and a PD-L1 tumor proportion score (TPS) <1% achieve superior objective response rates (ORRs) when treated with immunotherapy combined with chemotherapy compared to chemotherapy alone.6 Moreover, among patients with PD-L1 TPS <1% who respond to anti-PD-1/PD-L1 inhibitors, treatment not only prolongs overall survival (OS) but also yields a duration of response (DOR) comparable to that of patients with PD-L1 TPS ≥50%.7,8 This indicates that even PD-L1-negative patients can achieve favorable objective response rates with PD-1/PD-L1 immunotherapy, and that PD-L1 expression alone cannot accurately predict clinical benefit from immunotherapy. For patients with advanced, inoperable lung cancer, PD-L1 expression assessment primarily relies on tumor tissue biopsy. However, the widespread intratumoral heterogeneity of PD-L1 may lead to sampling bias, preventing accurate evaluation of PD-L1 expression levels in tumors.9,10 Therefore, researchers continue to explore stable molecular biomarkers to predict immunotherapy response.

Recently, radiomics has emerged as a crucial non-invasive tool for evaluating therapeutic efficacy and disease diagnosis in patients with cancer,11,12,13 owing to its ability to extract high-throughput quantitative information from medical images such as computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET).14 Because PD-L1 testing relies on diagnostic biopsy, the specimens collected are typically small portions of tumor tissue, which often leads to heterogeneous results. In contrast, medical imaging can provide a view of the whole tumor, and radiomics extracts high-dimensional features from the entire tumor image, thereby avoiding the loss of information from any one region. Additionally, because radiological examinations are non-invasive and repeatable, radiomics can be used continuously to monitor therapeutic response. Numerous previous studies have demonstrated that radiomics can effectively predict PD-L1 expression levels in tumor tissues and further evaluate the durable clinical benefits and survival outcomes of anti-PD-1/PD-L1 immunotherapy in patients with NSCLC.15,16,17

Numerous studies have consistently demonstrated the considerable potential of radiomics in predicting the prognosis of lung cancer treatment. Research by Zhu, Liu, Wu et al.16,18,19 has shown that radiomic models constructed using machine learning classifiers, such as logistic regression (LR) and support vector machines (SVMs), can effectively predict whether patients will benefit from immunotherapy and enable stratification of survival risk. Traditional radiomics approaches offer good model interpretability; however, their feature extraction process relies on manual design and selection. In contrast, deep learning leverages an end-to-end feature learning paradigm to automatically extract complex, high-level patterns from medical images, exhibiting greater potential in terms of prediction accuracy and automation. Studies by Mu, Saad et al.,20,21 focusing specifically on the application of deep learning in predicting immunotherapy response in patients with non-small cell lung cancer, have also demonstrated superior predictive performance, highlighting the broad prospects of such methods in clinical prognostic assessment.

Previous studies on radiomics in lung cancer immunotherapy primarily treated lung cancer tissue as a homogeneous entity for feature extraction. However, numerous studies have demonstrated that tumors are heterogeneous and consist of complex internal components.1 Tumor heterogeneity often leads to uneven growth of intratumoral cells, which can contribute to treatment resistance.22 Since intratumoral heterogeneity frequently results in varying therapeutic outcomes among individuals, treating the tumor as a uniform entity may fail to accurately predict treatment responses. Moreover, many studies delineate the tumor region of interest (ROI) based on only a few slices,15,23 which may lead to the loss of critical imaging features. Recently, a new radiomics method named “Habitat” has been used to divide tumors into different subregions by grouping voxels with similar gray-level characteristics in images of tumor tissue.24 This method can effectively evaluate tumor heterogeneity and has been validated in the treatment and metastasis evaluation of breast cancer, rectal cancer, and liver cancer.25,26,27 Nevertheless, there is relatively limited literature concerning its application in lung cancer.

In this study, we employed Habitat imaging technology to partition intratumoral subregions and extract radiomic features, thereby enabling a more comprehensive characterization of spatial heterogeneity within tumors. Subsequently, we constructed predictive models based on the Habitat model, the traditional radiomic model, the 3D deep learning model, and the clinical model, systematically evaluating their respective values in predicting the efficacy of anti-PD-1/PD-L1 immunotherapy in patients with NSCLC. To ensure consistency and comparability across modeling approaches, identical machine learning classifiers were applied in the development of the clinical, Habitat, and radiomics models. The study utilized a multi-center retrospective cohort for model training and validation, allowing for comprehensive performance evaluation and comparison among the four types of predictive models. Our objective was to elucidate the relative advantage of the Habitat-based model in predicting immunotherapy benefits, thereby providing reliable evidence for its potential integration into clinical decision-support systems.
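To make the idea of partitioning a tumor into habitats concrete, the following is a minimal sketch of K-means clustering applied to voxel values, assuming each voxel is described by a single attenuation value. The function name `kmeans_1d`, the quantile-based initialization, and the toy voxel values are illustrative assumptions and are not taken from the study, whose clustering inputs and software are not described in this section.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd's algorithm on scalar voxel values; returns (labels, centers)."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: deterministic start, one center at the middle of each k-th slice.
    centers = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each voxel joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: move each center to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels, centers


# Hypothetical attenuation values (HU) for nine voxels inside one tumor volume.
voxels = [-50, -45, -40, 20, 25, 30, 90, 95, 100]
labels, centers = kmeans_1d(voxels, k=3)

# Fraction of the tumor volume assigned to each of the three habitats.
fractions = [labels.count(c) / len(labels) for c in range(3)]
print(labels, centers, fractions)
```

In practice, habitat pipelines usually cluster on several per-voxel descriptors rather than on raw intensity alone; the one-dimensional case is used here only to keep the two steps of the algorithm visible.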

Results

Patients' baseline clinical characteristics

The patient inclusion process is shown in Figure 1. In our study, 75% of patients responded to immune checkpoint inhibitors after 6 months of immunotherapy. The mean age of the patients was 60.32 ± 9.77 years, with a predominance of males, and most patients were classified as clinical stage IV and N2-3. The demographic and clinical characteristics are detailed in Table 1. The experimental workflow is shown in Figure 2.

Figure 1.

Patient enrollment flowchart

Table 1.

Patient characteristics in the Train, Val, and Test cohorts

ALL cohort Train cohort Val cohort Test cohort p-value
Age 60.32 ± 9.77 59.92 ± 10.02 60.52 ± 9.52 62.48 ± 8.50 0.189
Gender 0.039
 Female 121(20.51) 75(20.00) 28(17.39) 18(33.33)
 Male 469(79.49) 300(80.00) 133(82.61) 36(66.67)
Smoking_Status 0.122
 No 249(42.20) 152(40.53) 65(40.37) 32(59.26)
 0-30 years 264(44.75) 172(45.87) 74(45.96) 18(33.33)
 ≥30 years 77(13.05) 51(13.60) 22(13.66) 4(7.41)
Histopathology 0.393
 adenocarcinoma 333(56.44) 207(55.20) 92(57.14) 34(62.96)
 SCC1 236(40.00) 151(40.27) 65(40.37) 20(37.04)
 others 21(3.56) 17(4.53) 4(2.48) 0(0.00)
Clinical_stage 0.084
 Ⅲ 150(25.42) 87(23.20) 43(26.71) 20(37.04)
 Ⅳ 440(74.58) 288(76.80) 118(73.29) 34(62.96)
T 0.912
 1–2 118(20.00) 74(19.73) 32(19.87) 12(22.23)
 3–4 472(80.00) 301(80.26) 129(80.13) 42(77.78)
N 0.011
 0–1 92(15.59) 52(13.86) 24(14.91) 15(27.77)
 2–3 498(84.41) 323(86.13) 137(85.1) 39(71.37)
M 0.108
 No 152(25.76) 87(23.20) 46(28.57) 19(35.19)
 Yes 438(74.24) 288(76.80) 115(71.43) 35(64.81)
CEA 0.071
 No 328(55.59) 204(54.40) 86(53.42) 38(70.37)
 Yes 262(44.41) 171(45.60) 75(46.58) 16(29.63)
CA125 0.164
 No 356(60.34) 217(57.87) 101(62.73) 38(70.37)
 Yes 234(39.66) 158(42.13) 60(37.27) 16(29.63)
CA153 0.922
 No 478(81.02) 302(80.53) 132(81.99) 44(81.48)
 Yes 112(18.98) 73(19.47) 29(18.01) 10(18.52)
CYFRA21_1 0.001
 No 302(51.19) 187(49.87) 68(42.24) 51(94.44)
 Yes 288(48.81) 188(50.13) 93(57.76) 3(5.56)
SCC2 0.049
 No 457(77.46) 285(76.00) 123(76.40) 49(90.74)
 Yes 133(22.54) 90(24.00) 38(23.60) 5(9.26)
CA199 0.420
 No 495(83.90) 320(85.33) 132(81.99) 43(79.63)
 Yes 95(16.10) 55(14.67) 29(18.01) 11(20.37)
DCB <0.001
 No 145(24.58) 109(29.07) 21(13.04) 15(27.78)
 Yes 445(75.42) 266(70.93) 140(86.96) 39(72.22)

Continuous variables are expressed as mean ± standard deviation (SD). Categorical variables are described as frequencies and percentages. SCC1: Squamous Cell Carcinoma; CEA: Carcinoembryonic Antigen; CA125: Cancer Antigen 125; CA153: Cancer Antigen 15-3; CYFRA21_1: Cytokeratin 19 Fragment; SCC2: Squamous Cell Carcinoma Antigen; CA199: Carbohydrate Antigen 19-9; DCB: Durable Clinical Benefit.

Figure 2.

The experimental workflow implemented in this study

The process included image preprocessing and segmentation, feature extraction, model construction, and the development of distinct predictive models based on Habitat, radiomics, deep learning, and clinical data. Finally, the models were evaluated.

Habitat model feature extraction and selection

In our study, we manually extracted 1,834 radiomics features from each of Habitat1, Habitat2, and Habitat3. The features of the combined Habitat model were integrated from the three subregions (Habitat1, Habitat2, and Habitat3), resulting in a total of 5,502 features (3 × 1,834). Dimensionality reduction and feature selection were performed using the Pearson correlation coefficient, the mRMR algorithm, and LASSO regression. Ultimately, 17 features were selected for Habitat1, 9 for Habitat2, 8 for Habitat3, and 24 for Habitat. Figure 3 presents the histogram of Rad-scores for the final selected features in each Habitat model. The extracted radiomics features included shape features, first-order features, and texture features. Figure 4 illustrates the quantity and proportion of manually extracted features, as well as the LASSO regression coefficients and mean squared error (MSE) of the features incorporated into the Habitat models.
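The first of the three selection steps, the Pearson correlation filter, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the 0.9 cutoff, the greedy keep-first rule, the feature names, and the toy values are all hypothetical, and the subsequent mRMR and LASSO steps are not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def drop_correlated(features, threshold=0.9):
    """Greedy filter: keep a feature only if |r| <= threshold with every kept one."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table: one list of values per feature, one entry per patient.
features = {
    "h1_firstorder_Mean": [1.0, 2.0, 3.0, 4.0, 5.0],
    "h1_firstorder_Median": [1.1, 2.1, 2.9, 4.2, 5.0],  # nearly duplicates the mean
    "h2_glcm_Contrast": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = drop_correlated(features, threshold=0.9)
print(kept)
```

On this toy table the median is dropped because it is almost perfectly correlated with the mean, while the contrast feature (r = -0.3 with the mean) is retained.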

Figure 3.

Histogram of Rad-score of radiomics features in different habitat regions

Habitat1 (A), Habitat2 (B), Habitat3 (C), and Habitat (D).

Figure 4.

Habitat feature selection and model performance

Pie charts of radiomics features in Habitat (A); bar charts of radiomics features in Habitat (B); the coefficients for the cross-validation of LASSO regression (C); and the MSE of LASSO analysis (D).

Construction of habitat models and evaluation

We used six machine learning classifiers (LR, SVM, RandomForest, ExtraTrees, LightGBM, and multilayer perceptron [MLP]) to develop habitat models based on the final selected features from Habitat1, Habitat2, Habitat3, and Habitat. Model performance was evaluated by comparing AUC, sensitivity, specificity, positive predictive value, negative predictive value, recall, and F1-score in both the train and validation cohorts to determine the most effective habitat imaging model.

Tables 2 and 3 present the predictive performance of the Habitat1, Habitat2, Habitat3, and Habitat models based on different machine learning classifiers in the train and validation cohorts (for model performance comparisons on the test cohort, please refer to Table S1). As shown in Figures 5 and 6, the Habitat model consistently outperformed the three subregional models in terms of AUC across the classifiers. This suggests that the integrated Habitat model, incorporating features from all subregions, offers the highest predictive capability. Through comparative analysis, the Habitat-ExtraTrees model achieved the highest AUC in the validation cohort (0.845), with an AUC of 0.865 in the train cohort and 0.814 in the test cohort (Figure S1). Its closely matched AUC values across all three cohorts indicate good robustness and generalization capability.
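For readers who want to see how the reported metrics relate to one another, the sketch below derives them from a confusion matrix. The four counts are not reported in the paper; they are back-calculated here, as an assumption, from the validation cohort composition in Table 1 (140 patients with DCB, 21 without) and the Habitat-ExtraTrees row of Table 3.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # identical to recall by definition
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)  # positive predictive value (precision)
    npv = tn / (tn + fn)  # negative predictive value
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "recall": sensitivity,
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


# Assumed counts (inferred, not published): 104 of 140 responders and
# 18 of 21 non-responders classified correctly.
m = classification_metrics(tp=104, fp=3, tn=18, fn=36)
print({name: round(value, 3) for name, value in m.items()})
```

With these assumed counts the rounded values agree with the Habitat-ExtraTrees validation row in Table 3 (accuracy 0.758, sensitivity 0.743, specificity 0.857, PPV 0.972, NPV 0.333, F1 0.842). The low NPV alongside a high PPV reflects the small number of non-responders in that cohort.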

Table 2.

Comparative performance of the Habitat, Habitat1, Habitat2, and Habitat3 models in the train cohort

model_name Accuracy AUC 95%CI Sensitivity Specificity PPV NPV Recall F1
LR

Habitat 0.749 0.744 0.688–0.799 0.820 0.578 0.826 0.568 0.82 0.823
Habitat1 0.689 0.708 0.648–0.768 0.727 0.598 0.812 0.478 0.727 0.767
Habitat2 0.668 0.685 0.626–0.744 0.709 0.569 0.800 0.446 0.709 0.752
Habitat3 0.669 0.654 0.590–0.719 0.727 0.521 0.793 0.431 0.727 0.759

SVM

Habitat 0.688 0.772 0.719–0.826 0.662 0.752 0.867 0.477 0.662 0.751
Habitat1 0.777 0.736 0.680–0.792 0.922 0.430 0.795 0.697 0.922 0.854
Habitat2 0.709 0.295 0.236–0.355 1.000 0.000 0.709 0.000 1.000 0.829
Habitat3 0.601 0.662 0.597–0.727 0.562 0.698 0.824 0.387 0.562 0.668

RandomForest

Habitat 0.781 0.823 0.776–0.869 0.816 0.697 0.868 0.608 0.816 0.841
Habitat1 0.785 0.778 0.721–0.835 0.812 0.720 0.874 0.616 0.812 0.842
Habitat2 0.741 0.771 0.719–0.822 0.785 0.633 0.839 0.548 0.785 0.811
Habitat3 0.678 0.736 0.680–0.792 0.674 0.687 0.845 0.455 0.674 0.749

ExtraTrees

Habitat 0.816 0.865 0.825–0.905 0.872 0.679 0.869 0.685 0.872 0.871
Habitat1 0.747 0.798 0.748–0.848 0.777 0.673 0.850 0.558 0.777 0.812
Habitat2 0.730 0.827 0.781–0.873 0.683 0.844 0.914 0.523 0.683 0.782
Habitat3 0.698 0.763 0.709–0.816 0.702 0.687 0.850 0.478 0.702 0.769

LGBM

Habitat 0.891 0.942 0.915–0.968 0.891 0.890 0.952 0.770 0.891 0.920
Habitat1 0.837 0.923 0.895–0.951 0.828 0.860 0.934 0.676 0.828 0.878
Habitat2 0.837 0.881 0.842–0.920 0.849 0.807 0.915 0.687 0.849 0.881
Habitat3 0.820 0.875 0.835–0.916 0.814 0.833 0.925 0.640 0.814 0.866

MLP

Habitat 0.720 0.774 0.722–0.826 0.729 0.697 0.855 0.514 0.729 0.787
Habitat1 0.758 0.756 0.702–0.811 0.820 0.607 0.833 0.586 0.820 0.827
Habitat2 0.644 0.717 0.659–0.775 0.592 0.771 0.863 0.437 0.592 0.702
Habitat3 0.689 0.659 0.594–0.725 0.777 0.469 0.787 0.455 0.777 0.782

AUC: area under the receiver operating characteristic curve; CI: confidence interval; PPV: positive predictive value; NPV: negative predictive value; F1: F1 score; LR: logistic regression; SVM: support vector machine; LGBM: light gradient boosting machine; MLP: multilayer perceptron.

Bold values indicate the AUC of the optimal habitat model in the training cohort.

Table 3.

Comparative performance of the Habitat, Habitat1, Habitat2, and Habitat3 models in the validation cohort

model_name Accuracy AUC 95%CI Sensitivity Specificity PPV NPV Recall F1
LR

Habitat 0.602 0.767 0.674–0.859 0.557 0.905 0.975 0.235 0.557 0.709
Habitat1 0.647 0.749 0.645–0.853 0.630 0.762 0.944 0.242 0.630 0.756
Habitat2 0.388 0.678 0.565–0.791 0.300 1.000 1.000 0.169 0.300 0.462
Habitat3 0.651 0.645 0.512–0.778 0.664 0.571 0.904 0.218 0.664 0.766

SVM

Habitat 0.677 0.710 0.571–0.850 0.671 0.714 0.940 0.246 0.671 0.783
Habitat1 0.699 0.697 0.561–0.833 0.711 0.619 0.923 0.250 0.711 0.803
Habitat2 0.163 0.239 0.134–0.344 0.043 1.000 1.000 0.130 0.043 0.082
Habitat3 0.826 0.565 0.410–0.719 0.906 0.333 0.892 0.368 0.906 0.899

RandomForest

Habitat 0.795 0.790 0.695–0.886 0.814 0.667 0.942 0.350 0.814 0.874
Habitat1 0.756 0.745 0.629–0.860 0.785 0.571 0.922 0.293 0.785 0.848
Habitat2 0.594 0.618 0.493–0.742 0.586 0.650 0.921 0.183 0.586 0.716
Habitat3 0.490 0.656 0.547–0.766 0.422 0.905 0.964 0.204 0.422 0.587

ExtraTrees

Habitat 0.758 0.845 0.777–0.913 0.743 0.857 0.972 0.333 0.743 0.842
Habitat1 0.654 0.773 0.666–0.879 0.622 0.857 0.966 0.261 0.622 0.757
Habitat2 0.806 0.622 0.477–0.768 0.857 0.450 0.916 0.310 0.857 0.886
Habitat3 0.839 0.691 0.566–0.816 0.906 0.429 0.906 0.429 0.906 0.906

LGBM

Habitat 0.720 0.789 0.695–0.883 0.714 0.762 0.952 0.286 0.714 0.816
Habitat1 0.564 0.740 0.643–0.837 0.504 0.952 0.986 0.230 0.504 0.667
Habitat2 0.537 0.629 0.501–0.757 0.507 0.750 0.934 0.179 0.507 0.657
Habitat3 0.423 0.579 0.468–0.691 0.344 0.905 0.957 0.184 0.344 0.506

MLP

Habitat 0.534 0.776 0.677–0.874 0.471 0.952 0.985 0.213 0.471 0.638
Habitat1 0.756 0.690 0.553–0.828 0.785 0.571 0.922 0.293 0.785 0.848
Habitat2 0.406 0.605 0.478–0.733 0.336 0.900 0.959 0.162 0.336 0.497
Habitat3 0.604 0.647 0.508–0.785 0.586 0.714 0.926 0.221 0.586 0.718

AUC: area under the receiver operating characteristic curve; CI: confidence interval; PPV: positive predictive value; NPV: negative predictive value; F1: F1 score; LR: logistic regression; SVM: support vector machine; LGBM: light gradient boosting machine; MLP: multilayer perceptron.

Bold values indicate the AUC of the optimal habitat model in the validation cohort.

Figure 5.

The ROC curves of habitat models constructed based on different machine learning classifiers in the train cohort

Habitat1 (A), Habitat2 (B), Habitat3 (C), and Habitat (D).

Figure 6.

The ROC curves of habitat models constructed based on different machine learning classifiers in the validation cohort

Habitat1 (A), Habitat2 (B), Habitat3 (C), and Habitat (D).

Radiomics model construction and evaluation

We extracted 1,834 features from the tumor VOI. These features were then screened using the same dimensionality reduction method as the habitat analysis, retaining the 2 most representative features (Figure 7). Based on these 2 features, we constructed a radiomics model employing six machine learning classifiers, consistent with the habitat analysis approach.

Figure 7.

Radiomics feature selection

The coefficients for the cross-validation of the LASSO regression of the radiomics model (A); the MSE of LASSO analysis of the radiomics model (B); and final feature weights for the radiomics model (C).

As demonstrated in Figure 8 and Table 4, the performance of the radiomics models constructed using the six machine learning classifiers was generally suboptimal. The AUC values were unstable across the three cohorts, with noticeable decreases in both the validation and test cohorts. Among the six models, the LightGBM-based model performed best, but its AUC fell from 0.766 in the train cohort to 0.506 and 0.520 in the validation and test cohorts, respectively, which is close to the chance level of 0.5.
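To make the AUC values easier to interpret, the sketch below computes the AUC as the probability that a randomly chosen responder receives a higher score than a randomly chosen non-responder (the Mann-Whitney formulation), with ties counted as one half. The scores and labels are hypothetical, and this is an illustrative implementation rather than the software used in the study.

```python
def auc_mann_whitney(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # A correctly ordered pair counts 1, a tie counts 0.5, a reversed pair 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for three responders and three non-responders.
scores = [0.90, 0.80, 0.40, 0.35, 0.70, 0.20]
labels = [1, 1, 1, 0, 0, 0]
auc = auc_mann_whitney(scores, labels)

# A model that assigns every patient the same score cannot rank anyone.
auc_constant = auc_mann_whitney([0.5] * 6, labels)
print(auc, auc_constant)
```

Eight of the nine responder/non-responder pairs are ordered correctly in the first example, giving an AUC of 8/9, while the constant scorer yields exactly 0.5. This is why an AUC near 0.5 indicates ranking no better than chance.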

Figure 8.

The ROC curves of radiomics models constructed based on different machine learning classifiers

Train cohort (A), validation cohort (B), and test cohort (C).

Table 4.

Performance comparison of radiomics models constructed by different machine learning classifiers in different cohorts

model_name Accuracy AUC 95%CI Sensitivity Specificity PPV NPV Recall F1
LR

Train 0.675 0.612 0.545–0.678 0.748 0.495 0.783 0.446 0.783 0.748
Val 0.696 0.511 0.361–0.660 0.736 0.429 0.896 0.196 0.896 0.736
Test 0.333 0.458 0.291–0.624 0.077 1.000 1.000 0.294 1.000 0.077

SVM

Train 0.699 0.610 0.543–0.675 0.816 0.413 0.772 0.479 0.772 0.816
Val 0.534 0.504 0.364–0.642 0.514 0.667 0.911 0.171 0.911 0.514
Test 0.333 0.400 0.235–0.564 0.077 1.000 1.000 0.294 1.000 0.077

RandomForest

Train 0.685 0.716 0.658–0.772 0.737 0.560 0.803 0.466 0.803 0.737
Val 0.714 0.494 0.348–0.640 0.764 0.381 0.892 0.195 0.892 0.764
Test 0.444 0.528 0.371–0.684 0.256 0.933 0.909 0.326 0.909 0.256

ExtraTrees

Train 0.720 0.622 0.558–0.686 0.868 0.358 0.767 0.527 0.767 0.868
Val 0.571 0.490 0.370–0.608 0.579 0.524 0.890 0.157 0.890 0.579
Test 0.648 0.479 0.301–0.656 0.795 0.267 0.738 0.333 0.738 0.795

LGBM

Train 0.701 0.766 0.714–0.817 0.703 0.697 0.850 0.490 0.850 0.703
Val 0.745 0.506 0.368–0.644 0.814 0.286 0.884 0.187 0.884 0.814
Test 0.500 0.520 0.354–0.684 0.385 0.800 0.833 0.333 0.833 0.385

MLP

Train 0.693 0.619 0.554–0.684 0.812 0.404 0.769 0.468 0.769 0.812
Val 0.236 0.514 0.378–0.648 0.121 1.000 1.000 0.146 1.000 0.121
Test 0.444 0.508 0.337–0.678 0.256 0.933 0.909 0.326 0.909 0.256

AUC: area under the receiver operating characteristic curve; CI: confidence interval; PPV: positive predictive value; NPV: negative predictive value; F1: F1 score; LR: logistic regression; SVM: support vector machine; LGBM: light gradient boosting machine; MLP: multilayer perceptron.

Bold values indicate the AUC of the best-performing model in the training, validation, and test cohorts.

3D deep learning model construction and evaluation

We evaluated three deep learning architectures (DenseNet121, ResNet18, and DenseNet169) to construct end-to-end 3D deep learning models. The 3D deep learning model based on DenseNet121 exhibited the most stable performance across cohorts, attaining AUC values of 0.733, 0.676, and 0.624 in the train, validation, and test cohorts, respectively (refer to Figure 9; Table 5). ResNet18 scored higher in the train and validation cohorts (0.770 and 0.710) but dropped to 0.491 in the test cohort.

Figure 9.

The ROC curves of 3D deep learning models constructed with different architectures

Train cohort (A), validation cohort (B), and test cohort (C).

Table 5.

Performance comparison of 3D deep learning models constructed in different cohorts

model_name Accuracy AUC 95%CI Sensitivity Specificity PPV NPV Recall F1
DenseNet121

Train 0.618 0.733 0.681–0.785 0.529 0.835 0.885 0.423 0.885 0.529
Val 0.703 0.676 0.537–0.815 0.709 0.667 0.931 0.264 0.931 0.709
Test 0.648 0.624 0.461–0.786 0.641 0.667 0.833 0.417 0.833 0.641

ResNet18

Train 0.715 0.770 0.721–0.819 0.715 0.716 0.858 0.510 0.858 0.715
Val 0.561 0.710 0.589–0.832 0.515 0.857 0.958 0.217 0.958 0.515
Test 0.463 0.491 0.316–0.666 0.333 0.800 0.812 0.316 0.812 0.333

DenseNet169

Train 0.578 0.549 0.485–0.613 0.601 0.523 0.752 0.352 0.752 0.601
Val 0.348 0.525 0.391–0.659 0.261 0.905 0.946 0.161 0.946 0.261
Test 0.704 0.395 0.218–0.572 0.897 0.200 0.745 0.429 0.745 0.897

AUC: area under the receiver operating characteristic curve; CI: confidence intervals; PPV: positive predictive value; NPV: negative predictive value; F1: F1 score

Bold values indicate the AUC of the best-performing model in the training, validation, and test cohorts.

Clinical model construction and evaluation

We conducted a comprehensive univariate analysis of all clinical features and calculated the odds ratios and corresponding p-values for each feature. As shown in Table 6, Age, Gender, Smoking_Status, CEA, CA125, CA153, CA199, CYFRA21_1, SCC, Histopathology, Clinical_stage, and TNM staging were all associated with immunotherapy benefit (all p-values <0.05). However, when these factors were included in the multivariate analysis, only age remained significantly associated with immunotherapy benefit. Subsequently, we incorporated age into the clinical model and constructed it using different classifiers.
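As an illustration of what a univariate odds ratio and its confidence interval represent, the sketch below computes the cross-product odds ratio and a Wald 95% interval from a 2×2 table. For a single binary predictor this equals the odds ratio from a univariate logistic regression. The counts are hypothetical and the function name `odds_ratio_ci` is an assumption; the study's own statistical software is not described in this section.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a: factor present, benefit    b: factor present, no benefit
    c: factor absent, benefit     d: factor absent, no benefit
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only for illustration.
odds_ratio, lower, upper = odds_ratio_ci(a=30, b=10, c=20, d=20)
print(round(odds_ratio, 3), round(lower, 3), round(upper, 3))
```

An interval that excludes 1 corresponds to p < 0.05 in the matching Wald test, and by construction the point estimate always lies inside its own interval.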

Table 6.

Univariate and multivariate analyses of clinical factors in NSCLC patients

Variable OR_UNI(95%CI) p_value_UNI OR_MULTI(95%CI) p_value_MULTI
Age 1.015(1.012–1.018) 0.000 1.03(1.015–1.046) 0.001
Gender 2.333(1.896–2.872) 0.000 0.833(0.474–1.465) 0.595
Smoking_Status 1.788 (1.481–2.158) 0.000 0.941 (0.687–1.289) 0.752
CEA 2.226 (1.696–2.921) 0.000 1.021(0.666–1.564) 0.936
CA125 2.908(1.586–2.776) 0.000 0.961(0.615–1.501) 0.882
CA153 2.174(1.436–3.290) 0.002 1.047(0.619–1.770) 0.886
CA199 1.750(1.104–2.776) 0.046 0.835(0.482–1.445) 0.589
CYFRA21_1 2.133(1.649–2.759) 0.000 0.859(0.550–1.343) 0.576
SCC 2.462(1.679–3.607) 0.000 0.966(0.572–1.632) 0.913
Histopathology 1.779(1.560–2.028) 0.000 1.045(0.730–1.496) 0.841
Clinical_stage 2.064(1.679–2.537) 0.000 0.586(0.120–2.852) 0.578
T 1.260(1.195–1.328) 0.000 0.964(0.783–1.186) 0.773
N 1.364(1.265–1.470) 0.000 0.953(0.776–1.192) 0.722
M 2.064(1.679–2.537) 0.000 0.965(0.197–4.721) 0.970

OR_UNI: the OR of univariate analysis; OR_MULTI: the OR of multivariate analysis.

Bold p-values indicate statistical significance (p < 0.05) in the multivariate analysis.

Figure 10 displays the ROC curves of the clinical models built with different machine learning classifiers. Comparing AUC values across the train, validation, and test cohorts, the LR-clinical model emerged as the optimal clinical model, with AUC values of 0.625, 0.608, and 0.526 in the train, validation, and external test cohorts, respectively. Table 7 provides a detailed comparison of the performance metrics of models constructed with different classifiers.

Figure 10.

Clinical feature selection and model construction

Forest plot of clinical univariate analysis (A), forest plot of multivariable analysis (B), heatmap of clinical variables' correlations (C), the ROC curves of clinical models across machine learning classifiers on train cohort (D), validation cohort (E), and testing cohort (F).

Table 7.

Performance comparison of clinical models constructed by different machine learning classifiers in different cohorts

model_name Accuracy AUC 95%CI Sensitivity Specificity PPV NPV Recall F1
LR

Train 0.600 0.625 0.564–0.685 0.571 0.670 0.809 0.39 0.571 0.670
Val 0.528 0.608 0.484–0.733 0.500 0.714 0.921 0.176 0.500 0.648
Test 0.593 0.526 0.355–0.698 0.564 0.667 0.815 0.37 0.564 0.667

SVM

Train 0.795 0.835 0.793–0.878 0.808 0.761 0.892 0.619 0.808 0.848
Val 0.640 0.639 0.508–0.771 0.636 0.667 0.927 0.215 0.636 0.754
Test 0.352 0.441 0.271–0.611 0.128 0.933 0.833 0.292 0.128 0.222

RandomForest

Train 0.749 0.754 0.702–0.807 0.827 0.560 0.821 0.570 0.827 0.824
Val 0.379 0.580 0.461–0.698 0.293 0.952 0.976 0.168 0.293 0.451
Test 0.407 0.516 0.340–0.692 0.179 1.000 1.000 0.319 0.179 0.304

ExtraTrees

Train 0.568 0.698 0.641–0.754 0.440 0.881 0.900 0.392 0.440 0.591
Val 0.410 0.556 0.454–0.659 0.321 1.000 1.000 0.181 0.321 0.486
Test 0.704 0.574 0.392–0.757 0.821 0.400 0.780 0.462 0.821 0.800

LGBM

Train 0.456 0.622 0.566–0.678 0.271 0.908 0.878 0.338 0.271 0.414
Val 0.410 0.614 0.499–0.729 0.350 0.810 0.925 0.157 0.350 0.508
Test 0.593 0.513 0.351–0.674 0.692 0.333 0.730 0.294 0.692 0.711

MLP

Train 0.643 0.716 0.659–0.773 0.609 0.725 0.844 0.432 0.609 0.707
Val 0.509 0.646 0.523–0.769 0.464 0.810 0.942 0.185 0.464 0.622
Test 0.556 0.537 0.356–0.718 0.487 0.733 0.826 0.355 0.487 0.613

AUC: area under the receiver operating characteristic curve; CI: confidence intervals; PPV: positive predictive value; NPV: negative predictive value; F1: F1 score; LR: logistic regression; SVM: Support Vector Machine; LGBM: light gradient boosting machine; MLP: multilayer perceptron.

Bold values indicate the AUC of the best-performing model in the training, validation, and test cohorts.

Comparison of the habitat, radiomics, 3D deep learning, and clinical models

As shown in Figure 11 and Table 8, among all the models compared, the Habitat-ExtraTrees model demonstrated the best performance in predicting the benefit of immunotherapy for lung cancer, with AUC values of 0.865, 0.845, and 0.814 in the train, validation, and test cohorts, respectively. According to the DeLong test, the AUC of the Habitat-ExtraTrees model differed significantly from those of the Radiomics, 3D DeepLearning, and Clinical models in every cohort, with one exception: the difference from DenseNet121 in the validation cohort was not statistically significant (p = 0.062).
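The DeLong test compares two AUCs estimated on the same patients while accounting for the correlation between the two sets of scores. The sketch below is a compact, unoptimized implementation of the standard formulation, offered as an illustration only; the helper names and toy scores are assumptions, and it is not the code used to produce the p-values reported here.

```python
from math import erfc, sqrt


def _psi(p, n):
    """Pair kernel: 1 if the positive outranks the negative, 0.5 on a tie."""
    return 1.0 if p > n else 0.5 if p == n else 0.0


def _components(scores, labels):
    """AUC plus DeLong structural components for positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    v10 = [sum(_psi(p, n) for n in neg) / len(neg) for p in pos]
    v01 = [sum(_psi(p, n) for p in pos) / len(pos) for n in neg]
    return sum(v10) / len(v10), v10, v01


def _cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(scores_a, scores_b, labels):
    """Return (auc_a, auc_b, z, two-sided p) for two models on the same cases."""
    auc_a, v10_a, v01_a = _components(scores_a, labels)
    auc_b, v10_b, v01_b = _components(scores_b, labels)
    m, n = len(v10_a), len(v01_a)
    var_a = _cov(v10_a, v10_a) / m + _cov(v01_a, v01_a) / n
    var_b = _cov(v10_b, v10_b) / m + _cov(v01_b, v01_b) / n
    cov_ab = _cov(v10_a, v10_b) / m + _cov(v01_a, v01_b) / n
    var_diff = var_a + var_b - 2 * cov_ab
    if var_diff <= 0:
        raise ValueError("variance of the AUC difference is not positive")
    z = (auc_a - auc_b) / sqrt(var_diff)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return auc_a, auc_b, z, p_value


# Hypothetical scores from two models on the same eight patients.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.90, 0.80, 0.70, 0.40, 0.50, 0.30, 0.20, 0.10]
scores_b = [0.60, 0.40, 0.70, 0.30, 0.50, 0.45, 0.20, 0.65]
auc_a, auc_b, z, p_value = delong_test(scores_a, scores_b, labels)
print(auc_a, auc_b, round(z, 3), round(p_value, 3))
```

With so few cases, even a large AUC gap (0.9375 versus 0.5625 in this toy example) does not reach significance, which illustrates why a p-value such as 0.062 can accompany a visibly large difference in a small validation cohort.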

Figure 11.

Comparison of the best-performing models

Comparison of ROC curves among different models in the train, validation, and test cohorts (A), comparison of DeLong tests among different models AUC in the train (B), validation (C), and test cohorts (D).

Table 8.

Comparative performance of the Habitat, Radiomics, 3D DeepLearning, and clinical models

model_name Accuracy AUC 95%CI Sensitivity Specificity PPV NPV Recall F1
Train

Clinical-LR 0.600 0.625 0.564–0.685 0.571 0.670 0.809 0.390 0.571 0.670
Rad-LGBM 0.701 0.766 0.714–0.817 0.703 0.697 0.850 0.490 0.850 0.703
DenseNet121 0.618 0.733 0.681–0.785 0.529 0.835 0.885 0.423 0.885 0.529
Habitat-ExtraTrees 0.816 0.865 0.825–0.905 0.872 0.679 0.869 0.685 0.872 0.871

Val

Clinical-LR 0.528 0.608 0.484–0.733 0.500 0.714 0.921 0.176 0.500 0.648
Rad-LGBM 0.745 0.506 0.368–0.644 0.814 0.286 0.884 0.187 0.884 0.814
DenseNet121 0.703 0.676 0.537–0.815 0.709 0.667 0.931 0.264 0.931 0.709
Habitat-ExtraTrees 0.758 0.845 0.777–0.913 0.743 0.857 0.972 0.333 0.743 0.842

Test

Clinical-LR 0.593 0.526 0.355–0.698 0.564 0.667 0.815 0.37 0.564 0.667
Rad-LGBM 0.500 0.520 0.354–0.684 0.385 0.800 0.833 0.333 0.833 0.385
DenseNet121 0.648 0.624 0.461–0.786 0.641 0.667 0.833 0.417 0.833 0.641
Habitat-ExtraTrees 0.648 0.814 0.697–0.930 0.513 1.000 1.000 0.441 0.513 0.678

AUC: area under the receiver operating characteristic curve; CI: confidence interval; PPV: positive predictive value; NPV: negative predictive value; F1: F1 score; Clinical-LR: a clinical model built with logistic regression; Rad-LGBM: a radiomics model built with LightGBM; Habitat-ExtraTrees: a habitat analysis model built with ExtraTrees.

Bold values denote the top performance for each metric per cohort. Notably, the Habitat-ExtraTrees model demonstrates the highest AUC in all three cohorts.

Comparison of the habitat, radiomics, 3D deep learning, clinical, and PD-L1 models

To further validate the clinical value of the Habitat model in predicting the efficacy of immunotherapy in NSCLC, we enrolled 146 patients with NSCLC from Center A who had undergone PD-L1 immunohistochemical testing. We systematically compared the predictive performance of the best Habitat model, radiomics (Rad) model, 3D deep learning (3D-DL) model, and clinical model with that of the PD-L1 expression models (including positive/negative and high/low expression groupings). The Habitat model was developed using the ExtraTrees classifier, the Rad model using the LGBM classifier, the 3D-DL model using the DenseNet121 architecture, and the clinical model using the LR classifier. As shown in Figure 12 and Table 9, the Habitat model demonstrated superior performance across multiple key metrics, achieving an accuracy of 0.803 and an AUC of 0.889 (95% CI: 0.835–0.942), while maintaining high sensitivity (0.800) and specificity (0.809). The 3D-DL model ranked second in AUC (0.714) and achieved high specificity (0.894), though its overall discriminative performance remained lower than that of the Habitat model. In contrast, both PD-L1 expression models (PDL1-PN and PDL1-HL) showed low AUC values (0.570 and 0.535), indicating limited predictive capability.
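A dichotomized marker such as PD-L1 positive/negative yields a ROC curve with a single operating point, so its trapezoidal AUC reduces to the mean of sensitivity and specificity. The short sketch below applies that identity to the sensitivity and specificity values in Table 9. It is a worked check offered under that assumption about how the PD-L1 models were scored, not a description of the authors' computation.

```python
def auc_single_threshold(sensitivity, specificity):
    """Trapezoidal AUC of a ROC curve through (0,0), (1-spec, sens), (1,1)."""
    fpr = 1 - specificity
    # Area of the left triangle plus the right trapezoid.
    area = fpr * sensitivity / 2 + (1 - fpr) * (sensitivity + 1) / 2
    return area


# Sensitivity and specificity as listed in Table 9.
auc_pn = auc_single_threshold(0.632, 0.511)  # PDL1-PN
auc_hl = auc_single_threshold(0.245, 0.830)  # PDL1-HL
print(round(auc_pn, 4), round(auc_hl, 4))
```

The results, about 0.57 and 0.54, are close to the tabulated AUCs of 0.570 and 0.535; the small differences may come from rounding of the published inputs. The identity also shows that a binary marker cannot exceed its own balanced accuracy, whereas a continuous model score can be thresholded anywhere along the curve.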

Figure 12.


Comparative Performance of the Habitat, Radiomics, 3D Deep Learning, Clinical, and PD-L1 Models

Rad: radiomics model; 3D-DL: 3D deep learning model constructed using the DenseNet121 architecture; PDL1-PN: PD-L1 positive/negative expression; PDL1-HL: PD-L1 high/low expression.

Table 9.

Comparative performance of the Habitat, radiomics, 3D deep learning, clinical, and PD-L1 models

Model Accuracy AUC 95% CI Sensitivity Specificity PPV NPV Recall F1
Clinical 0.476 0.571 0.474–0.666 0.271 0.915 0.871 0.371 0.27 0.412
Rad 0.707 0.582 0.471–0.691 0.813 0.489 0.771 0.548 0.81 0.790
Habitat 0.803 0.889 0.835–0.942 0.800 0.809 0.899 0.655 0.80 0.847
3D-DL 0.626 0.714 0.629–0.798 0.501 0.894 0.909 0.457 0.50 0.645
PDL1-PN 0.592 0.570 0.483–0.656 0.632 0.511 0.733 0.393 0.63 0.677
PDL1-HL 0.429 0.535 0.466–0.603 0.245 0.830 0.750 0.339 0.24 0.364

AUC: area under the receiver operating characteristic curve; CI: confidence interval; PPV: positive predictive value; NPV: negative predictive value; F1: F1 score; Rad: radiomics model; 3D-DL: 3D deep learning model constructed using the DenseNet121 architecture; PDL1-PN: PD-L1 positive/negative expression; PDL1-HL: PD-L1 high/low expression.
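
As a consistency check on Table 9, the F1 score is the harmonic mean of PPV (precision) and sensitivity (recall). The minimal sketch below recomputes F1 from the rounded PPV and sensitivity values in the table. The helper name `f1_from_ppv_sens` is introduced here for illustration only, and small differences in the third decimal are expected because the inputs are already rounded.

```python
def f1_from_ppv_sens(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of precision (PPV) and recall (sensitivity)."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# (PPV, sensitivity, reported F1) copied from Table 9
table9 = {
    "Clinical": (0.871, 0.271, 0.412),
    "Rad": (0.771, 0.813, 0.790),
    "Habitat": (0.899, 0.800, 0.847),
    "3D-DL": (0.909, 0.501, 0.645),
    "PDL1-PN": (0.733, 0.632, 0.677),
    "PDL1-HL": (0.750, 0.245, 0.364),
}

# Absolute difference between the recomputed and the reported F1 per model
differences = {
    name: abs(f1_from_ppv_sens(ppv, sens) - reported)
    for name, (ppv, sens, reported) in table9.items()
}
```

With these inputs, every recomputed F1 agrees with the reported value to within 0.01.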

Model explainability

To further enhance the interpretability of the Habitat model, we employed the SHAP algorithm to quantify the contribution of each radiomics feature to the final prediction. The Habitat model was constructed based on the ExtraTrees classifier. As illustrated in Figure 13, the SHAP summary plot provides a global interpretation of the ExtraTrees_Habitat model. This analysis identifies wavelet_HHL_firstorder_Maximum_h1_CT and square_firstorder_Median_h2_CT as critical predictive radiomic features for determining response to anti-PD-1/PD-L1 immunotherapy in patients with NSCLC. Figure 14 demonstrates the application of this ExtraTrees classifier-based Habitat model to an individual patient case: a 62-year-old male who achieved clinical benefit following 6 months of immunotherapy. Notably, the top-ranked predictors for this patient’s favorable outcome were also wavelet_HHL_firstorder_Maximum_h1_CT and square_firstorder_Median_h2_CT.
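
To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a toy two-feature model by enumerating all feature coalitions. It is not the SHAP library and not the study's ExtraTrees model: the function `shapley_values`, the `toy_model`, and the background data are hypothetical, and features outside a coalition are simply replaced by background values (one common choice of value function).

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values of predict at x (exponential cost in feature count).

    Features outside a coalition take their values from each background row,
    and the prediction is averaged over the background rows.
    """
    n = len(x)

    def value(coalition):
        total = 0.0
        for row in background:
            mixed = [x[i] if i in coalition else row[i] for i in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model with two habitat-style inputs (names are illustrative)
def toy_model(features):
    h1_max, h2_median = features
    return 0.3 * h1_max + 0.5 * h2_median + 0.2 * h1_max * h2_median


background = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
patient = [1.0, 1.0]
phi = shapley_values(toy_model, patient, background)
baseline = sum(toy_model(row) for row in background) / len(background)
```

The contributions in `phi` add up to the gap between the patient's prediction and the average background prediction, which is the additive property that waterfall and force plots visualize.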

Figure 13.


Global interpretation of the Habitat-ExtraTrees Model

Bee swarm plot of all features in the ExtraTrees-Habitat model in the training cohort (A) and bar plot of radiomics feature importance for the ExtraTrees-Habitat model in the training cohort (B). The features wavelet_HHL_firstorder_Maximum_h1_CT and square_firstorder_Median_h2_CT were identified as the most significant features, with larger negative values exerting stronger inhibitory effects on the predicted outcomes.

Figure 14.


Case-specific interpretation of the Habitat-ExtraTrees Model

Visualization of the original lung tumor image, segmentation map, and habitat regions (A), waterfall plot of the model for a single sample (B), and force plot of the model for a single sample (C).

Discussion

Although anti-PD-1/PD-L1 immunotherapy can significantly improve the OS and PFS of patients with advanced NSCLC, only a subset of patients currently responds to this treatment strategy. This study therefore aimed to construct a habitat model from the CT images of patients with lung cancer, with the goal of establishing a non-invasive, robust, and safe biomarker to predict whether patients with NSCLC can benefit from immunotherapy. To rigorously evaluate the model’s generalization ability, it was trained exclusively on contrast-enhanced CT images and subsequently validated on an independent external test set consisting of non-contrast CT images. We used the K-means algorithm to divide the tumor VOI into three subregions and extracted radiomics features from each. We then constructed habitat models with six machine learning classifiers, namely LR, SVM, RF, ExtraTrees, LGBM, and MLP. A comprehensive comparison showed that the ExtraTrees-based Habitat model performed best, with AUC values of 0.865, 0.845, and 0.814 in the training, validation, and test sets, respectively.

To further validate the predictive efficacy of the Habitat model, we also constructed a radiomics model and a 3D deep learning model based on the tumor VOI. The comparison indicated that the habitat model performed best on multiple evaluation metrics. To assess more comprehensively whether this model could replace tumor tissue PD-L1 expression for predicting the benefit of immunotherapy, we enrolled 146 patients who underwent PD-L1 testing at Center A and systematically compared the Clinical-LR, Habitat-ExtraTrees, and radiomics-LGBM models, the 3D deep learning model based on the DenseNet121 architecture, and the PD-L1 expression models (positive/negative and high/low expression groups). The Habitat model performed well on several key metrics, with an accuracy of 0.803 and an AUC of 0.889, while maintaining high sensitivity (0.800) and specificity (0.809).

We noticed that the performance of the habitat model declined on the external test set, specifically in sensitivity and accuracy. This degradation may be caused by the following factors. First, the unbalanced distribution of the outcome DCB across the datasets is a potential major factor. The DCB ratios differed among the training set, validation set, and external test set (71%, 87%, and 72%, respectively). The higher DCB ratio in the internal validation set may have caused the model to over-learn to identify DCB-positive cases during optimization, resulting in an inflated sensitivity under this distribution. However, when the model was applied to an external test set where the DCB ratio dropped back to a level close to that of the training set, it showed a decline in the recognition ability for non-beneficial cases, and the sensitivity also decreased accordingly. Second, the model was developed using contrast-enhanced CT scans from Center A and tested using non-contrast CT scans from Center B. Because the model depended heavily on contrast uptake patterns during training, the absence of contrast enhancement in the test set may reduce the discriminative power of certain imaging features, thereby compromising the model’s classification stability and decreasing sensitivity. Third, heterogeneity in the choice of immunotherapy drugs among centers may introduce uncontrollable confounding factors, further affecting the consistency of patients' immune response patterns and thereby weakening the predictive stability of the model in external cohorts.

Both conventional radiomics and deep learning have shown promising progress in predicting the efficacy of immunotherapy for lung cancer. Several groups have developed radiomics models from CT features extracted before lung cancer treatment.16,19 Their findings indicate that radiomics models built on CT imaging features predict the efficacy of anti-PD-L1 immunotherapy better than conventional assessment of tumor PD-L1 expression. The radiomics models developed by Zhu et al.16 and Wu et al.19 achieved AUC values of 0.800 and 0.752 in the test set, respectively. In this study, we also constructed radiomics models with six machine learning classifiers (LR, SVM, RF, ExtraTrees, LGBM, and MLP), but their prediction performance was generally poor. In the test set, the AUC of most models was only in the range of 0.400–0.528, and the sensitivity and specificity were also unsatisfactory. Wu et al. expanded the tumor VOI to include a 20 mm peritumoral area to capture more peritumoral information, whereas our VOI delineation incorporated some peritumoral context but defined a notably smaller peritumoral space. This difference in spatial coverage of the tumor microenvironment may have limited the extraction of informative radiomic features related to the tumor’s interaction with its surroundings, thereby contributing to the comparatively lower model performance. In addition, for patients with multiple lesions, Zhu et al. retained up to five lesions ranked by longest diameter and used multiple instance learning (MIL) to adaptively weight the prediction for each lesion, thus constructing a comprehensive radiomics model. In contrast, this study built models on single lesions only and therefore did not fully capture the spatial heterogeneity and multifocal characteristics of tumors.
A radiomics strategy combined with multiple instance learning might therefore evaluate the potential benefit of lung cancer immunotherapy more comprehensively and accurately.

Compared with traditional radiomics methods, deep learning can automatically extract higher-level and more abstract imaging features by learning from large-scale data, and has stronger representational ability.28 Leveraging the advantage of deep learning, our study constructed 3D deep learning models using three neural network architectures: DenseNet121, ResNet18, and DenseNet169. Among them, the model based on DenseNet121 performed relatively well, with AUC values of 0.733, 0.676, and 0.624 in the training set, validation set, and test set, respectively. Compared with the previously constructed traditional radiomics models, the deep learning models demonstrated certain improvements in prediction performance. Similarly, the SimTA deep learning model developed by Yang et al.17 demonstrated excellent performance in predicting DCB of immunotherapy for patients with NSCLC with an AUC of 0.80 (95% CI: 0.74–0.86). This model also showed effective predictive ability for patients' OS and PFS. Additionally, Wang et al.29 constructed a model integrating deep learning and radiomics features to predict the survival outcomes of patients with NSCLC after immunotherapy. Their research results also indicated that the predictive performance of the deep learning model was superior to that of traditional radiomics models.

We found that although deep learning models have more advantages than traditional radiomics in predicting the immunotherapeutic benefits of lung cancer, both typically regard tumors as homogeneous entities and fail to fully consider their spatial heterogeneity. This heterogeneity is usually manifested as irregular tumor growth patterns and changes in cell distribution and density, which profoundly affect the treatment outcomes.30 To address this challenge, researchers have proposed a novel radiomics approach—Habitat analysis. Habitat analysis identifies voxels with similar greyscale intensities in medical images and subsequently divides the tumor into several subregions for radiomic feature extraction.24 This method effectively simulates the spatial heterogeneity of tumors and enables accurate prediction of treatment response and prognostic evaluation.31,32

Cai et al.33 developed deep learning (DL) and Habitat models based on enhanced lung CT images to evaluate the DCB of immunotherapy for NSCLC. In the test set, the AUC of the DL model was 0.631 (95% CI:0.517–0.735), and the AUC of Habitat4 was 0.781 (95% CI:0.676–0.865). The Habitat model was slightly superior to deep learning. By fusing the DL and Habitat models, the model performance was improved, with an AUC of 0.865 (95%CI:0.772–0.931) in the test set. Similarly, Ye et al.34 developed a Habitat model utilizing arterial-phase contrast-enhanced CT images to forecast the pathological complete response of patients with NSCLC undergoing neoadjuvant immunochemotherapy. In comparison to the conventional radiomics model, the Habitat model exhibited superior performance, achieving an AUC of 0.781 (95% CI: 0.673–0.889) in the external validation set, while the AUC of the traditional radiomics model was 0.723 (95% CI: 0.591–0.855). These findings suggest that habitat analysis holds promise for assessing the efficacy of lung cancer treatments. Although CT examination is a routine method for patients with lung cancer, contrast-enhanced chest CT is not mandatory for elderly patients with advanced-stage disease. Additionally, patients with renal insufficiency are generally not recommended to undergo CT scans with iodine contrast agents. Based on these clinical considerations, our study aimed to develop a Habitat model using contrast-enhanced CT for training, and rigorously evaluate its generalizability on an independent external test set consisting of non-contrast CT images. This design directly addresses the common clinical scenario where contrast-enhanced CT may be contraindicated or unavailable. To comprehensively evaluate the performance of the machine learning-based habitat analysis model, we employed six different classifiers for comparative analysis, avoiding the limitations of a single classifier. 
Our study also shows that the habitat model outperforms the deep learning model and the traditional radiomics model in predicting the benefits of NSCLC immunotherapy. On the test set, the AUCs of ExtraTrees-Habitat, DenseNet121, and radiomics-LGBM are 0.814 (95% CI:0.697–0.930), 0.624 (95% CI:0.461–0.786), and 0.520 (95% CI:0.354–0.684), respectively. Although previous studies by Cai and Ye established habitat models based on contrast-enhanced CT, our work further demonstrates that a model trained on contrast-enhanced CT can maintain robust predictive performance when applied to non-contrast CT images. This supports the potential clinical translation of habitat analysis across varying imaging protocols.

However, in this study, both the traditional radiomics model and the deep learning model constructed from the same non-contrast CT scans performed poorly, with the radiomics model particularly showing limited discriminative ability. We speculate that the reasons may include the following: non-contrast images lack the hemodynamic information provided by contrast agents, and their relatively lower image contrast and signal-to-noise ratio restrict the ability of handcrafted features to characterize biological behaviors; meanwhile, the deep learning model may have been constrained by the training sample size and model architecture, preventing it from effectively extracting predictive deep features. In contrast, the habitat model, by characterizing the heterogeneity distribution within different subregions of the tumor, reduces reliance on absolute greyscale values to some extent. This approach allows it to maintain relatively robust performance even on non-contrast CT images.

To improve the clinical interpretability of the key features of the habitat model in predicting the benefit of immunotherapy for lung cancer, we used the game theory-based SHAP analysis framework. This methodology uniformly assesses the direction and magnitude of each feature’s contribution to individual predictions, elucidating the specific impact of each feature on the model’s decision-making process.35 Through SHAP analysis, we identified two key Habitat features that were significantly associated with the benefit of immunotherapy in NSCLC: wavelet_HHL_firstorder_Maximum_h1_CT and square_firstorder_Median_h2_CT. Formally, both are first-order features. Traditionally, first-order features only describe the distribution of pixel grey values within an image region without considering their spatial relationships.36 However, although both belong to the first-order category, neither feature is derived directly from the greyscale information of the original image. They are derived from different tumor subregions after two image filtering techniques, the wavelet transform and the square filter, respectively. More importantly, SHAP analysis reveals marked differences in their contributions to the final prediction. This observation is consistent with our earlier hypothesis that the effectiveness of the Habitat model does not depend on the original, absolute CT grey values but rather stems from its ability to extract and parse deeper features of the image.
Specifically, the wavelet transform is able to decompose the image into components of different frequencies and orientations to capture texture information that is difficult to recognize by the human eye and conventional features,14 an approach that enhances the details of the heterogeneous structures inside the tumor (Habitat1), while the squared filtering enhances the subtle density contrasts of the peri-tumor region (Habitat2) through a nonlinear transformation.37 This suggests that our habitat model reduces the direct reliance on absolute grey values and instead keenly captures the changes in grey distribution and implicit spatial structural information within different regions enhanced by filtering. Of particular importance, this mechanism provides a rational explanation for the fact that our model still exhibits good performance on unenhanced CT images. In the absence of hemodynamic information provided by contrast agents, conventional CT feature discrimination is limited. Our method indirectly “mines” and “amplifies” the grey-scale changes or texture features in different regions of the image through the image transformation technique described above. This mechanism may be the basis for the good performance of the model on unenhanced CT images.

From a biological perspective, the constructed subregions in this study include Habitat1 and Habitat3, which originate from the tumor core region, while Habitat2 corresponds to the tumor invasive margin. The results of the swarm plot in Figure 13A demonstrate that lower values of the feature wavelet_HHL_firstorder_Maximum_h1_CT are more likely to predict poor prognosis. The underlying biological mechanism may lie in the fact that a lower greyscale maximum in this region reflects the absence of high-density cellular structures, which could be attributed to two potential reasons: first, the lack of high-density lymphocyte aggregation; second, a reduced number of tumor cells accompanied by extensive necrotic areas. Insufficient lymphocyte infiltration in the tumor core suggests that immune cells fail to effectively penetrate and eliminate tumor cells, manifesting as an “immune-excluded” or “immune-desert” phenotype,38 which is typically associated with resistance to immunotherapy.39,40 Additionally, tumor necrosis is often accompanied by hypoxia, enhanced glycolysis, and the formation of an immunosuppressive microenvironment, further impairing the response to immunotherapy.

On the other hand, the feature square_firstorder_Median_h2_CT originates from the tumor invasive margin region (Habitat2). Its median value reflects the central tendency of pixel grey values in this region, and a higher feature value tends to predict a positive treatment outcome. Previous studies have demonstrated that the texture features of the peritumoral region in lung cancer possess predictive value for immunotherapy efficacy. For instance, the model established by Khorrami et al.41 extracted radiomic features from both intratumoral and peritumoral regions and found that peritumoral textures could effectively distinguish immunotherapy responders from non-responders. Their preliminary research in breast cancer42 also suggested that peritumoral imaging features could be used to assess treatment response. This study further reveals that first-order texture features based on the invasive margin region can similarly predict immunotherapy efficacy, although no prior studies have reported characterization using first-order category features. We hypothesize that the increase in Median values may be associated with enhanced lymphocyte infiltration within the invasive margin, where elevated cellular density leads to an overall rise in grey values on CT imaging. The tumor invasive margin is often accompanied by abundant tumor-infiltrating lymphocytes and tertiary lymphoid structure formation,43,44 indicating a more active immune microenvironment, which may contribute to improved responses to immunotherapy. As this study employed a retrospective design and most enrolled cases were derived from biopsy specimens, we were unable to obtain whole-tumor pathological sections for comparative analysis. This limitation hindered our ability to directly validate the correspondence between the constructed imaging habitat subregions and the spatial cellular distribution structure of the tumor at the histological level. 
Although we could speculate, based on existing literature and pathophysiological mechanisms, that the radiomic features reflecting immune response changes might be associated with biological behaviors such as lymphocyte infiltration, necrosis, and vascular architecture, we were unable to provide further precise biological interpretations at the spatial cellular distribution level. Future prospective studies incorporating multi-region biopsies or whole-tumor resection specimens with spatial transcriptomics or multiplex immunohistochemistry are warranted to elucidate the true spatial relationship between imaging habitats and the tumor microenvironment.

Our study demonstrates that key imaging features derived from distinct subregions of the habitat collectively influence the prediction of immunotherapy efficacy in significantly different or even opposite directions. This finding strongly suggests that tumors and their microenvironments exhibit high spatial heterogeneity, with distinct anatomical subregions potentially representing vastly different biological behaviors, which can be separately extracted and quantified through imaging features. Notably, these regional features, particularly first-order features, exhibit complementary rather than uniform predictive value for treatment response, further highlighting the unique advantages and potential of habitat-based analysis in deciphering the spatial complexity of tumors and precisely assessing the immune microenvironment.

In conclusion, this study demonstrates that the “Habitat” analysis technique based on lung CT images can serve as a non-invasive and clinically feasible imaging biomarker to effectively predict the clinical benefits of immunotherapy in patients with NSCLC. The research provides a promising new strategy for identifying potential immunotherapy beneficiaries before treatment, which aids in optimizing clinical decision-making, improving treatment selection, and ultimately enhancing patient prognosis.

Limitations of the study

This study has several limitations. First, as a retrospective study, there was some heterogeneity in the immunotherapy regimens of the included patients, which may affect the generalizability of the findings. Second, most cases in this study were derived from biopsy specimens, and whole-tumor pathological sections were unavailable for comparative analysis. This limitation restricted our ability to directly validate the spatial correspondence between imaging-based habitat subregions and tumor cellular distribution patterns at the histological level. Finally, as a dual-center study, the size and diversity of the external validation cohort remained relatively limited. Future prospective, multicenter studies incorporating multi-region biopsies or whole-tumor resection specimens with spatial transcriptomics or multiplex immunofluorescence analyses are warranted to further elucidate the intrinsic relationship between imaging habitats and the tumor microenvironment at the spatial molecular level and to validate the model’s generalizability.

Resource availability

Lead contact

Further information and requests for data should be directed to and will be fulfilled by the lead contact, Guanqiao Jin (jinguanqiao77@gxmu.edu.cn).

Materials availability

This study did not generate new unique reagents.

Data and code availability

  • The radiological images and clinical data reported in this article can be obtained from the lead contact upon reasonable request.

  • All original code has been deposited on GitHub and is publicly available as of the date of publication. The link is listed in the key resources table.

  • Any additional information required to reanalyze the data reported in this article is available from the lead contact upon request.

Acknowledgments

We thank the Onekey AI platform for its assistance with this work. This research was funded by the Beijing Medical Award Foundation (Grant No. YXJL-2022-0665-0210), the Guangxi Key Research and Development Program (Grant No. GuikeAB23026087), the Natural Science Foundation of Guangxi Zhuang Autonomous Region (Grant No. 2023GXNSFAA026225), and the Baise Science Research and Technology Development Program (Grant No. Baike 20250338).

Author contributions

Conceptualization: X.X.H.1 and G.Q.J.; methodology: X.X.H.1, Y.R.X., and H.Y.N.; investigation: H.Y.N., X.X.H.2, D.L.G., and K.W.; writing – original draft: X.X.H.1 and Y.R.X.; writing – review and editing: X.X.H.1, D.Y.H., and G.Q.J.; funding acquisition: X.X.H.1, Y.R.X., and G.Q.J.; resources: X.X.H.1, Y.R.X., and H.Y.N.; supervision: D.Y.H. and G.Q.J. X.X.H.1: Xiaoxiao Huang; X.X.H.2: Xiaoxin Huang.

Declaration of interests

The authors declare that they have no competing interests.

STAR★Methods

Key resources table

REAGENT or RESOURCE SOURCE IDENTIFIER
Deposited data

Code for models This paper https://github.com/Huang-xx-521/Habiitat-DL-Radiomics
Clinical data of patients This paper N/A
chest CT images This paper N/A

Software and algorithms

ITK-SNAP (version 4.4.0) ITK-SNAP software https://www.itksnap.org/
Python (version 3.14.0) Python Software Foundation https://www.python.org/
Onekey (version 4.9.1) OnekeyAl-Platform https://github.com/OnekeyAl-Platform/onekey

Experimental model and study participant details

Ethics statement

This retrospective study was conducted in accordance with the principles of the Declaration of Helsinki and was approved by the Ethics Committee of Guangxi Medical University Cancer Hospital (approval number: KY-2022-301) and the Ethics Committee of the Affiliated Hospital of Youjiang Medical University for Nationalities (approval number: YYFY-LL-2025-228). The requirement for informed consent was waived by the ethics committees due to the retrospective nature of this study.

Patients and data information

This retrospective study collected data from NSCLC patients at two centers, Guangxi Medical University Cancer Hospital (Center A) and the Affiliated Hospital of Youjiang Medical University for Nationalities (Center B), between January 2021 and December 2024. Inclusion criteria were as follows: (1) receipt of at least two cycles of anti-PD-1/PD-L1 immunotherapy; (2) pathologically confirmed NSCLC. The exclusion criteria were as follows: (1) NSCLC defined as stage I-II according to the 9th edition of AJCC/UICC; (2) any treatment (surgery, radiotherapy, etc.) before the first CT examination; (3) incomplete clinical history data; (4) CT image slice thickness > 1.25 mm (all patients from Center A underwent contrast-enhanced CT, whereas those from Center B underwent non-contrast CT); (5) follow-up time less than 6 months or loss to follow-up. A total of 590 NSCLC patients met the inclusion and exclusion criteria for this study, with 536 from Center A and 54 from Center B. Patients from Center A were randomly split into a training cohort and a validation cohort at a 7:3 ratio, while those from Center B served as the test cohort (Figure 1). To assess the clinical utility of the imaging model, 146 NSCLC patients from Center A who underwent PD-L1 immunohistochemical testing were included. Center B was excluded from this analysis due to the limited number of patients with PD-L1 testing. Two pathologists jointly assessed the PD-L1 tumor proportion score (TPS). The expression levels were defined as follows: TPS < 1% was considered negative and TPS ≥ 1% was considered positive. Additionally, TPS < 50% was classified as low expression and TPS ≥ 50% as high expression. Subsequently, we collected patient clinical data such as age, gender, smoking history, histopathological type, clinical stage, TNM stage, and tumor markers from clinical electronic databases.
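
The two PD-L1 groupings defined above can be written as a small helper. This is a minimal sketch: the function name `pdl1_groups` is hypothetical, the dictionary keys reuse the PDL1-PN and PDL1-HL labels from Table 9, and TPS is assumed to be given as a percentage.

```python
def pdl1_groups(tps_percent: float) -> dict:
    """Map a PD-L1 tumor proportion score (in percent) to both groupings."""
    if not 0 <= tps_percent <= 100:
        raise ValueError("TPS must be a percentage between 0 and 100")
    return {
        # positive/negative grouping: cut-off at 1%
        "PDL1-PN": "positive" if tps_percent >= 1 else "negative",
        # high/low grouping: cut-off at 50%
        "PDL1-HL": "high" if tps_percent >= 50 else "low",
    }
```

Under these thresholds a TPS of 30% is positive in the positive/negative grouping but low in the high/low grouping, which is why the two PD-L1 models can classify the same patient differently.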

Method details

CT scanning protocol and image acquisition

All patients underwent CT scanning. The CT scanning equipment included: 128-slice Siemens Sensation, 64-slice GE Revolution, 64-slice GE Discovery CT750 HD, and 256-slice GE Revolution Ace. For patients from center A, contrast-enhanced CT was performed via intravenous injection of a non-ionic iodinated contrast agent at a dose of 1.2–1.5 mL/kg body weight, with an injection rate of 2.5–3.0 mL/s. The arterial phase scanning was initiated using a bolus-tracking technique, with a region of interest placed in the descending aorta and a trigger threshold set at 100–120 Hounsfield units. The scanning parameters were typically set as follows: tube voltage, 120 kV; tube current, automated dose modulation; and reconstruction using a bone algorithm. Patients were positioned supine, head-first, and the scanning range extended from the thoracic inlet to 2–3 cm below the costophrenic angle during a single inspiratory breath-hold. All lung CT images were retrieved from the Picture Archiving and Communication System (PACS) in Digital Imaging and Communications in Medicine (DICOM) format for subsequent segmentation of the tumor region of interest (ROI).

Tumor segmentation

In this study, a junior doctor with five years of experience initially used ITK-SNAP (version 3.8.0, available at www.itksnap.org) for slice-by-slice delineation of the tumor ROI. A senior doctor with a decade of experience then reviewed the delineations. Tumor boundaries were manually traced using the brush tool in ITK-SNAP to create a complete tumor VOI. An 8-unit cubic brush was subsequently employed to expand the previously delineated tumor ROI, facilitating a comprehensive analysis of both intratumoral and peritumoral regions. This approach is based on evidence that peritumoral features can effectively predict treatment response.45,46 For inconsistent segmentations, a third senior radiologist provided expert consultation.
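
The expansion step was performed manually in ITK-SNAP. As a rough programmatic analogue, a binary tumor mask can be dilated with a cubic structuring element. The sketch below assumes that "8-unit" means 8 voxels per side, which the text does not specify; the function name `expand_mask` and the synthetic mask are illustrative only.

```python
import numpy as np
from scipy import ndimage


def expand_mask(mask, size=8):
    """Dilate a binary mask with a size x size x size cubic element (assumed units: voxels)."""
    structure = np.ones((size, size, size), dtype=bool)
    return ndimage.binary_dilation(mask, structure=structure)


# Synthetic 10 x 10 x 10 voxel "tumor" inside a 40 x 40 x 40 volume
tumor = np.zeros((40, 40, 40), dtype=bool)
tumor[15:25, 15:25, 15:25] = True

voi = expand_mask(tumor)      # tumor plus a surrounding shell
peritumoral = voi & ~tumor    # the shell alone
```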

Generation of habitat subregions

Within the VOI of each tumor, we extracted a total of 19 local features per voxel. These features included a range of shape descriptors, textural features, and first-order statistical attributes. The specific features extracted were: firstorder_Entropy, firstorder_MeanAbsoluteDeviation, firstorder_Median, glcm_DifferenceAverage, glcm_DifferenceEntropy, glcm_DifferenceVariance, glcm_Imc1, glcm_Imc2, glcm_InverseVariance, glcm_JointEnergy, glcm_JointEntropy, glcm_SumEntropy, glrlm_LongRunEmphasis, glrlm_RunEntropy, glrlm_RunVariance, glszm_SizeZoneNonUniformityNormalized, glszm_SmallAreaHighGrayLevelEmphasis, ngtdm_Contrast, and ngtdm_Strength.

We employed the K-means algorithm to cluster all voxels by their associated features, exploring 3 to 10 cluster centers to delineate different intratumoral habitat regions. The clustering performance was evaluated using the Calinski-Harabasz (CH) index, Davies-Bouldin (DB) index, and Silhouette score to optimize segmentation efficacy. Following cluster analysis, subregions sharing identical clustering features were merged, with each region representing a unique intratumoral microenvironmental signature. We systematically assessed the impact of varying the number of cluster centers from 3 to 10 on analytical validity. After a comprehensive evaluation, the optimal number of cluster centers (K) was determined to be 3 (detailed information is provided in Figure S1 and Table S2). Thus, we divided each tumor into three habitat subregions, designated Habitat1, Habitat2, and Habitat3.
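
The selection of K described above can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the study's pipeline: the function `score_cluster_counts` is hypothetical, the 19-column matrix stands in for the per-voxel features, and how the three indices were combined into a final choice is not specified in the text (the silhouette score is used here as one possible criterion).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)


def score_cluster_counts(voxel_features, k_values=range(3, 11), seed=0):
    """Fit K-means for each K and report three internal validity indices."""
    results = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = model.fit_predict(voxel_features)
        results[k] = {
            # higher is better
            "calinski_harabasz": calinski_harabasz_score(voxel_features, labels),
            # lower is better
            "davies_bouldin": davies_bouldin_score(voxel_features, labels),
            # higher is better
            "silhouette": silhouette_score(voxel_features, labels),
        }
    return results


# Synthetic stand-in for per-voxel features: 3 well-separated groups, 19 features
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(3, 19))
voxels = np.vstack([c + rng.normal(size=(200, 19)) for c in centers])

scores = score_cluster_counts(voxels)
best_k = max(scores, key=lambda k: scores[k]["silhouette"])
```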

Feature extraction and selection of radiomics and habitat

In both the radiomics model and the habitat model, we followed the Image Biomarker Standardization Initiative (IBSI) guidelines and used the Python package PyRadiomics for feature extraction. The same extraction method was applied to the radiomics model and to each habitat subregion, and the extracted features were divided into geometric, intensity, and texture categories, which describe tumor shape, voxel brightness, and spatial patterns, respectively. Texture features were computed from the gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), gray-level size-zone matrix (GLSZM), and neighboring gray-tone difference matrix (NGTDM).

The extracted features were normalized to place all features on a consistent scale and to facilitate subsequent feature selection. Feature selection proceeded in three steps. First, we retained features with an intraclass correlation coefficient (ICC) greater than 0.75 to ensure good reproducibility. Second, the remaining features were subjected to Pearson correlation analysis, and only one feature was retained from each group of highly correlated features (correlation coefficient > 0.9). Third, we applied the Minimum Redundancy Maximum Relevance (mRMR) algorithm and LASSO regression with 10-fold cross-validation to identify the final radiomics features, retaining only the most predictive ones to construct a robust and streamlined model.
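
Two of the selection steps above, the Pearson correlation filter and LASSO with 10-fold cross-validation, can be sketched as follows. This is a simplified illustration under stated assumptions: the ICC and mRMR steps are omitted, the binary outcome is coded 0/1 and fitted with scikit-learn's `LassoCV`, and the helper names and feature columns (`feat_a`, `feat_b`, and so on) are hypothetical rather than taken from the study.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler


def drop_correlated(features: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop each feature whose |Pearson r| with an earlier column exceeds threshold."""
    corr = features.corr(method="pearson").abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return features.drop(columns=to_drop)


def lasso_select(features: pd.DataFrame, y, seed: int = 0) -> list:
    """Return names of features with non-zero LASSO coefficients (10-fold CV)."""
    X = StandardScaler().fit_transform(features)
    model = LassoCV(cv=10, random_state=seed).fit(X, y)
    return [name for name, coef in zip(features.columns, model.coef_) if coef != 0]


# Synthetic example (assumed data): one near-duplicate feature and one noise feature.
rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
b = rng.normal(size=n)
features = pd.DataFrame({
    "feat_a": a,
    "feat_a_copy": a + 0.01 * rng.normal(size=n),  # near-duplicate of feat_a
    "feat_b": b,
    "feat_noise": rng.normal(size=n),
})
y = (a + 0.5 * b > 0).astype(int)  # binary outcome coded 0/1

filtered = drop_correlated(features)
selected = lasso_select(filtered, y)
```

The correlation filter here keeps the first column of each correlated pair; other tie-breaking rules (for example, keeping the feature more strongly associated with the outcome) are equally plausible, and the source does not say which was used.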

Construction of radiomics, habitat, and clinical models

We employed six machine learning classifiers (Logistic Regression, Support Vector Machine, Random Forest, ExtraTrees, LightGBM, and Multilayer Perceptron) to construct predictive models based on radiomic features and habitat-derived features for predicting durable clinical benefit (DCB) from anti-PD-1/PD-L1 therapy in NSCLC patients. For the habitat-based approach, we developed four distinct models: Habitat, Habitat1, Habitat2, and Habitat3. The Habitat model integrated imaging features from all three subregions (Habitat1, Habitat2, and Habitat3). Additionally, clinical features were screened through univariable and multivariable analyses, and the selected clinical variables were used to build a clinical prediction model using the same set of machine learning classifiers. DCB was defined as PFS exceeding 6 months, as evaluated by the Response Evaluation Criteria in Solid Tumors (RECIST version 1.1). Accordingly, patients were classified into either the DCB (PFS > 6 months) or the no-DCB cohort (PFS ≤ 6 months).
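
The labeling rule and the multi-classifier comparison can be sketched as follows. This is an illustrative sketch on synthetic data, not the study's pipeline: it covers five of the six classifier families using scikit-learn only (LightGBM is left out to avoid an extra dependency), uses default or arbitrary hyperparameters, and scores models by 5-fold cross-validated AUC, which is an assumption. The names `label_dcb`, `classifiers`, and `cv_auc` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC


def label_dcb(pfs_months):
    """Return 1 for DCB (PFS > 6 months) and 0 for no-DCB (PFS <= 6 months)."""
    return (np.asarray(pfs_months, dtype=float) > 6).astype(int)


# Hyperparameters below are placeholders, not values reported in the study.
classifiers = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(random_state=0),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
    "ExtraTrees": ExtraTreesClassifier(n_estimators=100, random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
}

# Synthetic features and PFS values (assumed data, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
pfs_months = 6 + 1.5 * X[:, 0] + rng.normal(scale=0.5, size=120)
y = label_dcb(pfs_months)

cv_auc = {
    name: cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    for name, clf in classifiers.items()
}
```

Note that a patient with PFS of exactly 6 months falls into the no-DCB group under this definition.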

Construction of 3D deep learning model

In this study, three deep learning architectures (DenseNet121, DenseNet169, and ResNet18) were employed to construct 3D deep learning models. The two-dimensional convolution, pooling, and normalization operations of the original networks were extended to their three-dimensional forms, and the fully connected layer was replaced with 3D global average pooling, allowing the models to process three-dimensional medical imaging data. The input images were uniformly resampled to 64×64×64 voxels and subjected to Z-score normalization. During the training phase, data augmentation strategies such as random rotation, translation, and scaling were adopted to enhance the generalization ability of the models. The Adam optimizer was used for model training, with an initial learning rate of 0.01. A cosine annealing schedule was applied to dynamically adjust the learning rate, and the maximum number of training epochs was set to 100.
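
The Z-score normalization and the cosine annealing schedule mentioned above can be illustrated numerically. The sketch below assumes a single annealing cycle spanning all 100 epochs and a minimum learning rate of 0; the source states only the initial rate (0.01), the scheduling strategy, and the epoch limit. Function names and the synthetic volume are hypothetical.

```python
import math

import numpy as np


def zscore_normalize(volume, eps=1e-8):
    """Rescale a volume to zero mean and (approximately) unit standard deviation."""
    volume = np.asarray(volume, dtype=np.float64)
    return (volume - volume.mean()) / (volume.std() + eps)


def cosine_annealed_lr(epoch, base_lr=0.01, max_epochs=100, min_lr=0.0):
    """Learning rate at a given epoch for one cosine annealing cycle.

    min_lr=0.0 and a single cycle over max_epochs are assumptions.
    """
    cosine = 1 + math.cos(math.pi * epoch / max_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * cosine


# Synthetic 64x64x64 volume with CT-like intensities (assumed data).
rng = np.random.default_rng(0)
volume = rng.normal(loc=-300.0, scale=150.0, size=(64, 64, 64))
normalized = zscore_normalize(volume)

# Learning rate for epochs 0..100: starts at 0.01 and decays smoothly to min_lr.
schedule = [cosine_annealed_lr(epoch) for epoch in range(101)]
```

Under these assumptions the rate is halved at epoch 50 and reaches the minimum at epoch 100.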

Model evaluation

The performance of each predictive model was evaluated using receiver operating characteristic (ROC) curve analysis and the following metrics: area under the ROC curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), recall, and F1-score. The optimal imaging predictive model was determined based on the ROC curves. A detailed flowchart of the process from CT scanning to model development is illustrated in Figure 2.
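
The metrics listed above all follow from the confusion matrix and the predicted scores. The sketch below is a minimal illustration using scikit-learn; the 0.5 decision threshold and the toy labels and scores are assumptions, since the source does not state how the cutoff was chosen. Recall is numerically identical to sensitivity, so it is reported once.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score


def binary_metrics(y_true, y_score, threshold=0.5):
    """Compute AUC and threshold-based metrics for a binary classifier."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    y_pred = (y_score >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)          # also called precision
    npv = tn / (tn + fn)
    return {
        "auc": roc_auc_score(y_true, y_score),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


# Toy labels (1 = DCB, 0 = no-DCB) and predicted probabilities (assumed data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
metrics = binary_metrics(y_true, y_score)
```

This sketch does not guard against empty denominators, which occur when a model predicts only one class.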

Model interpretability

We employed SHAP (SHapley Additive exPlanations) visualization to enhance the interpretability of the model, thereby improving its credibility for clinical applications. SHAP quantifies the contribution of each feature to the model's prediction as a “Shapley value,” elucidating the impact of individual features on the model's output. The SHAP analysis was performed using the KernelExplainer class of the SHAP Python package.
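
To make the notion of a Shapley value concrete, the sketch below computes exact Shapley values for a toy three-feature function by enumerating every feature subset. It is not the study's procedure: KernelExplainer approximates these values by sampling subsets and fitting a weighted linear regression, because exact enumeration is only feasible for a handful of features. The toy model and the zero baseline for "absent" features are assumptions.

```python
from itertools import combinations
from math import factorial


def exact_shapley(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset_of_indices)."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (
                factorial(size) * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Toy model f(z) = 2*z0 + z1 + z0*z2, explained at the point x.
x = [1.0, 3.0, 4.0]


def value_fn(present):
    """Model output with absent features replaced by a baseline of 0 (assumption)."""
    z = [x[j] if j in present else 0.0 for j in range(3)]
    return 2 * z[0] + z[1] + z[0] * z[2]


phi = exact_shapley(value_fn, 3)
```

The values sum to the difference between the model output at x and at the baseline, and the interaction term z0*z2 is shared equally between features 0 and 2.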

Quantification and statistical analysis

We assessed the normality of clinical characteristics using the Shapiro-Wilk test. Continuous variables were compared using t-tests or Mann-Whitney U tests, as appropriate, while categorical variables were compared using chi-square (χ2) tests. All analyses and model construction in this study were performed on the OnekeyAI platform (v4.9.1, Python 3.14.0). We used PyRadiomics (v3.1.0) for radiomics feature extraction, scikit-learn (v1.0.2) for machine learning training, and PyTorch (v1.11.0) for deep learning training.
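
The test-selection logic described above can be sketched with SciPy. This is an illustrative sketch rather than the authors' code: the 0.05 cutoff for the normality decision, the default equal-variance t-test, and all data are assumptions, and the helper names are hypothetical.

```python
import numpy as np
from scipy import stats


def compare_continuous(group_a, group_b, alpha=0.05):
    """Use a t-test if both groups pass Shapiro-Wilk, otherwise Mann-Whitney U."""
    normal = all(stats.shapiro(g).pvalue > alpha for g in (group_a, group_b))
    if normal:
        return "t-test", stats.ttest_ind(group_a, group_b).pvalue
    result = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
    return "mann-whitney", result.pvalue


def compare_categorical(contingency_table):
    """Chi-square test of independence on a table of counts; returns the p value."""
    chi2, p_value, dof, expected = stats.chi2_contingency(contingency_table)
    return p_value


# Synthetic, strongly skewed continuous variable for two groups (assumed data).
rng = np.random.default_rng(0)
group_dcb = rng.exponential(scale=1.0, size=200)
group_no_dcb = rng.exponential(scale=1.5, size=200)
test_name, p_continuous = compare_continuous(group_dcb, group_no_dcb)

# Synthetic 2x2 table: rows are outcome groups, columns are a binary characteristic.
p_categorical = compare_categorical([[30, 10], [10, 30]])
```

For 2x2 tables, `chi2_contingency` applies Yates' continuity correction by default; whether the study used the correction is not stated.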

Published: December 24, 2025

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.isci.2025.114522.

Contributor Information

Deyou Huang, Email: fzxyh2012@126.com.

Guanqiao Jin, Email: jinguanqiao77@gxmu.edu.cn.

Supplemental information

Document S1. Figures S1 and S2 and Tables S1 and S2
mmc1.pdf (560.3KB, pdf)

References

  • 1. Jamal-Hanjani M., Wilson G.A., McGranahan N., Birkbak N.J., Watkins T.B.K., Veeriah S., Shafi S., Johnson D.H., Mitter R., Rosenthal R., et al. Tracking the Evolution of Non-Small-Cell Lung Cancer. N. Engl. J. Med. 2017;376:2109–2121. doi: 10.1056/NEJMoa1616288.
  • 2. Doroshow D.B., Herbst R.S. Treatment of Advanced Non-Small Cell Lung Cancer in 2018. JAMA Oncol. 2018;4:569–570. doi: 10.1001/jamaoncol.2017.5190.
  • 3. Spurr L.F., Martinez C.A., Kang W., Chen M., Zha Y., Hseu R., Gutiontov S.I., Turchan W.T., Lynch C.M., Pointer K.B., et al. Highly aneuploid non-small cell lung cancer shows enhanced responsiveness to concurrent radiation and immune checkpoint blockade. Nat. Cancer. 2022;3:1498–1512. doi: 10.1038/s43018-022-00467-x.
  • 4. John T., Sakai H., Ikeda S., Cheng Y., Kasahara K., Sato Y., Nakahara Y., Takeda M., Kaneda H., Zhang H., et al. First-line nivolumab plus ipilimumab combined with two cycles of chemotherapy in advanced non-small cell lung cancer: a subanalysis of Asian patients in CheckMate 9LA. Int. J. Clin. Oncol. 2022;27:695–706. doi: 10.1007/s10147-022-02120-0.
  • 5. Rittmeyer A., Barlesi F., Waterkamp D., Park K., Ciardiello F., von Pawel J., Gadgeel S.M., Hida T., Kowalski D.M., Dols M.C., et al. Atezolizumab versus docetaxel in patients with previously treated non-small-cell lung cancer (OAK): a phase 3, open-label, multicentre randomised controlled trial. Lancet. 2017;389:255–265. doi: 10.1016/s0140-6736(16)32517-x.
  • 6. Hellmann M.D., Paz-Ares L., Bernabe Caro R., Zurawski B., Kim S.W., Carcereny Costa E., Park K., Alexandru A., Lupinacci L., de la Mora Jimenez E., et al. Nivolumab plus Ipilimumab in Advanced Non-Small-Cell Lung Cancer. N. Engl. J. Med. 2019;381:2020–2031. doi: 10.1056/NEJMoa1910231.
  • 7. Gettinger S., Horn L., Jackman D., Spigel D., Antonia S., Hellmann M., Powderly J., Heist R., Sequist L.V., Smith D.C., et al. Five-Year Follow-Up of Nivolumab in Previously Treated Advanced Non-Small-Cell Lung Cancer: Results From the CA209-003 Study. J. Clin. Oncol. 2018;36:1675–1684. doi: 10.1200/jco.2017.77.0412.
  • 8. Borghaei H., Paz-Ares L., Horn L., Spigel D.R., Steins M., Ready N.E., Chow L.Q., Vokes E.E., Felip E., Holgado E., et al. Nivolumab versus Docetaxel in Advanced Nonsquamous Non-Small-Cell Lung Cancer. N. Engl. J. Med. 2015;373:1627–1639. doi: 10.1056/NEJMoa1507643.
  • 9. Nishino M., Ramaiya N.H., Hatabu H., Hodi F.S. Monitoring immune-checkpoint blockade: response evaluation and biomarker development. Nat. Rev. Clin. Oncol. 2017;14:655–668. doi: 10.1038/nrclinonc.2017.88.
  • 10. Topalian S.L., Taube J.M., Anders R.A., Pardoll D.M. Mechanism-driven biomarkers to guide immune checkpoint blockade in cancer therapy. Nat. Rev. Cancer. 2016;16:275–287. doi: 10.1038/nrc.2016.36.
  • 11. Huang E.P., O'Connor J.P.B., McShane L.M., Giger M.L., Lambin P., Kinahan P.E., Siegel E.L., Shankar L.K. Criteria for the translation of radiomics into clinically useful tests. Nat. Rev. Clin. Oncol. 2023;20:69–82. doi: 10.1038/s41571-022-00707-0.
  • 12. Gillies R.J., Kinahan P.E., Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology. 2016;278:563–577. doi: 10.1148/radiol.2015151169.
  • 13. Cen Y.Y., Nong H.Y., Huang X.X., Lu X.X., Pu C.H., Huang L.H., Zheng X.J., Pan Z.L., Huang Y., Ding K., Huang D.Y. Computed tomography-based deep learning and multi-instance learning for predicting microvascular invasion and prognosis in hepatocellular carcinoma. World J. Gastroenterol. 2025;31 doi: 10.3748/wjg.v31.i30.109186.
  • 14. Lambin P., Rios-Velazquez E., Leijenaar R., Carvalho S., van Stiphout R.G.P.M., Granton P., Zegers C.M.L., Gillies R., Boellard R., Dekker A., Aerts H.J.W.L. Radiomics: extracting more information from medical images using advanced feature analysis. Eur. J. Cancer. 2012;48:441–446. doi: 10.1016/j.ejca.2011.11.036.
  • 15. Mu W., Tunali I., Gray J.E., Qi J., Schabath M.B., Gillies R.J. Radiomics of (18)F-FDG PET/CT images predicts clinical benefit of advanced NSCLC patients to checkpoint blockade immunotherapy. Eur. J. Nucl. Med. Mol. Imaging. 2020;47:1168–1182. doi: 10.1007/s00259-019-04625-9.
  • 16. Zhu Z., Chen M., Hu G., Pan Z., Han W., Tan W., Zhou Z., Wang M., Mao L., Li X., et al. A pre-treatment CT-based weighted radiomic approach combined with clinical characteristics to predict durable clinical benefits of immunotherapy in advanced lung cancer. Eur. Radiol. 2023;33:3918–3930. doi: 10.1007/s00330-022-09337-7.
  • 17. Yang Y., Yang J., Shen L., Chen J., Xia L., Ni B., Ge L., Wang Y., Lu S. A multi-omics-based serial deep learning approach to predict clinical outcomes of single-agent anti-PD-1/PD-L1 immunotherapy in advanced stage non-small-cell lung cancer. Am. J. Transl. Res. 2021;13:743–756.
  • 18. Liu C., Gong J., Yu H., Liu Q., Wang S., Wang J. A CT-Based Radiomics Approach to Predict Nivolumab Response in Advanced Non-Small-Cell Lung Cancer. Front. Oncol. 2021;11 doi: 10.3389/fonc.2021.544339.
  • 19. Wu S., Zhan W., Liu L., Xie D., Yao L., Yao H., Liao G., Huang L., Zhou Y., You P., et al. Pretreatment radiomic biomarker for immunotherapy responder prediction in stage IB-IV NSCLC (LCDigital-IO Study): a multicenter retrospective study. J. Immunother. Cancer. 2023;11 doi: 10.1136/jitc-2023-007369.
  • 20. Mu W., Jiang L., Shi Y., Tunali I., Gray J.E., Katsoulakis E., Tian J., Gillies R.J., Schabath M.B. Non-invasive measurement of PD-L1 status and prediction of immunotherapy response using deep learning of PET/CT images. J. Immunother. Cancer. 2021;9 doi: 10.1136/jitc-2020-002118.
  • 21. Saad M.B., Hong L., Aminu M., Vokes N.I., Chen P., Salehjahromi M., Qin K., Sujit S.J., Lu X., Young E., et al. Predicting benefit from immune checkpoint inhibitors in patients with non-small-cell lung cancer by CT-based ensemble deep learning: a retrospective study. Lancet Digit. Health. 2023;5:e404–e420. doi: 10.1016/s2589-7500(23)00082-1.
  • 22. Kashyap A., Rapsomaniki M.A., Barros V., Fomitcheva-Khartchenko A., Martinelli A.L., Rodriguez A.F., Gabrani M., Rosen-Zvi M., Kaigala G. Quantification of tumor heterogeneity: from data acquisition to metric generation. Trends Biotechnol. 2022;40:647–676. doi: 10.1016/j.tibtech.2021.11.006.
  • 23. Tian P., He B., Mu W., Liu K., Liu L., Zeng H., Liu Y., Jiang L., Zhou P., Huang Z., et al. Assessing PD-L1 expression in non-small cell lung cancer and predicting responses to immune checkpoint inhibitors using deep learning on computed tomography images. Theranostics. 2021;11:2098–2107. doi: 10.7150/thno.48027.
  • 24. Chaudhury B., Zhou M., Goldgof D.B., Hall L.O., Gatenby R.A., Gillies R.J., Patel B.K., Weinfurtner R.J., Drukteinis J.S. Heterogeneity in intratumoral regions with rapid gadolinium washout correlates with estrogen receptor status and nodal metastasis. J. Magn. Reson. Imaging. 2015;42:1421–1430. doi: 10.1002/jmri.24921.
  • 25. Shi Z., Huang X., Cheng Z., Xu Z., Lin H., Liu C., Chen X., Liu C., Liang C., Lu C., et al. MRI-based Quantification of Intratumoral Heterogeneity for Predicting Treatment Response to Neoadjuvant Chemotherapy in Breast Cancer. Radiology. 2023;308 doi: 10.1148/radiol.222830.
  • 26. Huang H., Chen H., Zheng D., Chen C., Wang Y., Xu L., Wang Y., He X., Yang Y., Li W. Habitat-based radiomics analysis for evaluating immediate response in colorectal cancer lung metastases treated by radiofrequency ablation. Cancer Imaging. 2024;24:44. doi: 10.1186/s40644-024-00692-w.
  • 27. Liu H.F., Wang M., Lu Y.J., Wang Q., Lu Y., Xing F., Xing W. CEMRI-Based Quantification of Intratumoral Heterogeneity for Predicting Aggressive Characteristics of Hepatocellular Carcinoma Using Habitat Analysis: Comparison and Combination of Deep Learning. Acad. Radiol. 2024;31:2346–2355. doi: 10.1016/j.acra.2023.11.024.
  • 28. Chen M., Copley S.J., Viola P., Lu H., Aboagye E.O. Radiomics and artificial intelligence for precision medicine in lung cancer treatment. Semin. Cancer Biol. 2023;93:97–113. doi: 10.1016/j.semcancer.2023.05.004.
  • 29. Wang C., Ma J., Shao J., Zhang S., Li J., Yan J., Zhao Z., Bai C., Yu Y., Li W. Non-Invasive Measurement Using Deep Learning Algorithm Based on Multi-Source Features Fusion to Predict PD-L1 Expression and Survival in NSCLC. Front. Immunol. 2022;13 doi: 10.3389/fimmu.2022.828560.
  • 30. Gatenby R.A., Grove O., Gillies R.J. Quantitative Imaging in Cancer Evolution and Ecology. Radiology. 2013;269:8–15. doi: 10.1148/radiol.13122697.
  • 31. Kim J., Ryu S.Y., Lee S.H., Lee H.Y., Park H. Clustering approach to identify intratumour heterogeneity combining FDG PET and diffusion-weighted MRI in lung adenocarcinoma. Eur. Radiol. 2019;29:468–475. doi: 10.1007/s00330-018-5590-0.
  • 32. Park J.E., Kim H.S., Kim N., Park S.Y., Kim Y.H., Kim J.H. Spatiotemporal Heterogeneity in Multiparametric Physiologic MRI Is Associated with Patient Outcomes in IDH-Wildtype Glioblastoma. Clin. Cancer Res. 2021;27:237–245. doi: 10.1158/1078-0432.Ccr-20-2156.
  • 33. Caii W., Wu X., Guo K., Chen Y., Shi Y., Chen J. Integration of deep learning and habitat radiomics for predicting the response to immunotherapy in NSCLC patients. Cancer Immunol. Immunother. 2024;73:153. doi: 10.1007/s00262-024-03724-3.
  • 34. Ye G., Wu G., Zhang C., Wang M., Liu H., Song E., Zhuang Y., Li K., Qi Y., Liao Y. CT-based quantification of intratumoral heterogeneity for predicting pathologic complete response to neoadjuvant immunochemotherapy in non-small cell lung cancer. Front. Immunol. 2024;15 doi: 10.3389/fimmu.2024.1414954.
  • 35. Qi X., Wang S., Fang C., Jia J., Lin L., Yuan T. Machine learning and SHAP value interpretation for predicting comorbidity of cardiovascular disease and cancer with dietary antioxidants. Redox Biol. 2025;79 doi: 10.1016/j.redox.2024.103470.
  • 36. Schöneck M., Lennartz S., Zopfs D., Sonnabend K., Wawer Matos Reimer R., Rinneburger M., Graffe J., Persigehl T., Hentschke C., Baeßler B., et al. Robustness of radiomic features in healthy abdominal parenchyma of patients with repeated examinations on dual-layer dual-energy CT. Eur. J. Radiol. 2024;175 doi: 10.1016/j.ejrad.2024.111447.
  • 37. van Griethuysen J.J.M., Fedorov A., Parmar C., Hosny A., Aucoin N., Narayan V., Beets-Tan R.G.H., Fillion-Robin J.C., Pieper S., Aerts H.J.W.L. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res. 2017;77:e104–e107. doi: 10.1158/0008-5472.Can-17-0339.
  • 38. Galon J., Bruni D. Approaches to treat immune hot, altered and cold tumours with combination immunotherapies. Nat. Rev. Drug Discov. 2019;18:197–218. doi: 10.1038/s41573-018-0007-y.
  • 39. Herbst R.S., Soria J.C., Kowanetz M., Fine G.D., Hamid O., Gordon M.S., Sosman J.A., McDermott D.F., Powderly J.D., Gettinger S.N., et al. Predictive correlates of response to the anti-PD-L1 antibody MPDL3280A in cancer patients. Nature. 2014;515:563–567. doi: 10.1038/nature14011.
  • 40. Tumeh P.C., Harview C.L., Yearley J.H., Shintaku I.P., Taylor E.J.M., Robert L., Chmielowski B., Spasic M., Henry G., Ciobanu V., et al. PD-1 blockade induces responses by inhibiting adaptive immune resistance. Nature. 2014;515:568–571. doi: 10.1038/nature13954.
  • 41. Khorrami M., Prasanna P., Gupta A., Patil P., Velu P.D., Thawani R., Corredor G., Alilou M., Bera K., Fu P., et al. Changes in CT Radiomic Features Associated with Lymphocyte Distribution Predict Overall Survival and Response to Immunotherapy in Non-Small Cell Lung Cancer. Cancer Immunol. Res. 2020;8:108–119. doi: 10.1158/2326-6066.Cir-19-0476.
  • 42. Braman N., Prasanna P., Whitney J., Singh S., Beig N., Etesami M., Bates D.D.B., Gallagher K., Bloch B.N., Vulchi M., et al. Association of Peritumoral Radiomics With Tumor Biology and Pathologic Response to Preoperative Targeted Therapy for HER2 (ERBB2)-Positive Breast Cancer. JAMA Netw. Open. 2019;2 doi: 10.1001/jamanetworkopen.2019.2561.
  • 43. Rodriguez A.B., Peske J.D., Woods A.N., Leick K.M., Mauldin I.S., Meneveau M.O., Young S.J., Lindsay R.S., Melssen M.M., Cyranowski S., et al. Immune mechanisms orchestrate tertiary lymphoid structures in tumors via cancer-associated fibroblasts. Cell Rep. 2021;36 doi: 10.1016/j.celrep.2021.109422.
  • 44. Lopez de Rodas M., Villalba-Esparza M., Sanmamed M.F., Chen L., Rimm D.L., Schalper K.A. Biological and clinical significance of tumour-infiltrating lymphocytes in the era of immunotherapy: a multidimensional approach. Nat. Rev. Clin. Oncol. 2025;22:163–181. doi: 10.1038/s41571-024-00984-x.
  • 45. Li X., Guo Y., Huang S., Wang F., Dai C., Zhou J., Lin D. A CT-based intratumoral and peritumoral radiomics nomogram for postoperative recurrence risk stratification in localized clear cell renal cell carcinoma. BMC Med. Imaging. 2025;25:167. doi: 10.1186/s12880-025-01715-z.
  • 46. Chen Z., Zhu H., Shu H., Zhang J., Gu K., Yao W. Preoperative prediction of WHO/ISUP grade of ccRCC using intratumoral and peritumoral habitat imaging: multicenter study. Cancer Imaging. 2025;25:59. doi: 10.1186/s40644-025-00875-z.

Data Availability Statement

  • The radiological images and clinical data reported in this article can be obtained from the lead contact upon reasonable request.

  • All original code has been deposited on GitHub and is publicly available as of the date of publication. The DOI is listed in the key resources table.

  • Any additional information required to reanalyze the data reported in this article is available from the lead contact upon request.


Articles from iScience are provided here courtesy of Elsevier