2025 Aug 20;10(9):105557. doi: 10.1016/j.esmoop.2025.105557

Table 3.

Landscape of artificial intelligence-based clinical prediction models in early and metastatic non-small-cell lung cancer

Author (year) Study design Sample size Stage and type of lung cancer Type of model Outcome(s) investigated Type of prognostic features included in the model(s) Predictive features included in the final model(s) Variables significantly associated with outcome Model discrimination Guideline approval
Liu et al. 2023 [21] Retrospective 198 Stage IA or IB invasive lung adenocarcinoma Cox regression model 5-year overall survival (OS), disease-free survival (DFS) Patient demographics, clinical and pathological parameters Demographics: age and sex
Pathological parameters: smoking history, type of surgery, adjuvant chemotherapy, T stage, lymphovascular invasion, necrosis, neuron invasion, architectural score
T stage, necrosis, and architectural score were key prognostic predictors AUC:
Prediction model = 0.717
T stage = 0.624
Architectural score = 0.611
Necrosis = 0.617
N/A—further studies and external validation are required
She et al. 2020 [36] Retrospective 17 322 Stages I-IV NSCLC DeepSurv—a deep learning (DL) survival neural network
Cox proportional hazards regression model
Prognostic model:
Survival predictions and treatment recommendations
Treatment recommendation:
SEER database—population that followed treatment recommendations; CHINA database = population that did not follow recommendations
Patient characteristics, tumour stage, treatment strategies 127 features, including
Cases (sex, age, and marriage status), tumour characteristics (location, size, histological grade, histological type, TNM stage), SEER code (CS extension, CS mets at dx, regional nodes examined, regional nodes positive, lung—pleural/elastic layer invasion by H&E or elastic stain, lung—separate tumour nodules—ipsilateral lung, lung—surgery to primary site), and treatment details (surgical type)
N/A Prediction model C-index:
DeepSurv model = 0.739
Cox regression model = 0.716
Treatment recommendation hazard ratio:
SEER = 2.99
CHINA = 2.14
N/A—feasibility trial status; further studies and external validation needed
Jin et al. 2023 [37] Retrospective 4617 Surgically resected stage aI3 NSCLC Model 1:
DL neural network
Model 2:
Random forest
Model 3:
Cox proportional hazards model
5-year survival Patient demographics, NSCLC-related characteristics, and treatment details Demographic information (age and sex) and NSCLC-cancer-related characteristics (TNM stage, histology type, primary site, tumour size, regional node number examined, regional node positive number, and laterality of the tumour), and treatment details (surgery of primary site, radiation, and chemotherapy) Regional positive nodes (−0.6634); regional examined nodes (−0.7648); tumour size (−0.5633); age (−0.4633) C-index:
DL neural network = 0.843
Random forest = 0.678
Cox proportional hazards = 0.678
N/A—improvements to model explainability needed
Palmer et al. 2023 [38] Retrospective 392 Patients undergoing radical surgery for early-stage NSCLC Three regularised Cox models:
Ridge, LASSO, Elastic Net
Recurrence-free survival (RFS) 66 clinicopathological features across 3 models Elastic Net model:
Systemic inflammatory response index (SIRI), eosinophil count, preoperative nodal stage, weight loss, performance status, and maximum standardised uptake value (SUV-max)
Ridge model:
Performance status, weight loss, SIRI, eosinophil count, lymphovascular invasion, visceral pleural invasion, and pathological stage
N/A C-index:
Elastic Net = 0.70
Ridge = 0.75
AUC ROC:
Elastic Net = 0.72
Ridge = 0.79
N/A—prospective and external validation required
Hindocha et al. 2022 [39] Retrospective, multicentre 657 patients, across 5 hospitals Patients receiving curative-intent radiotherapy for NSCLC Logistic regression models based on TNM staging (for benchmarking)
Machine learning (ML) models
Recurrence, RFS, OS 2 years after treatment Patient demographics and clinical parameters Age, PTV, primary size, SUV primary, SUV nodal, BMI, FEV1, TLCO, total dose, BED, sex, smoking status, T staging, N staging, nodal avidity, morphology (adenocarcinoma/no pathological diagnosis), morphology, treatment (chemo, conventional, SBRT) RFS:
PTV, size of the primary tumour, total number of fractions, whether treatment was with SBRT or BED
Recurrence:
PTV, size of primary tumour, number of fractions, SBRT treatment, BED
OS:
PTV, size of primary tumour, BED, T1 stage, treatment with SBRT, treatment with chemoradiotherapy, whether nodal avidity was present, smoking status
AUC ROC:
RFS:
ML model = 0.682
TNM model = 0.650
Recurrence:
ML model = 0.687
TNM model = 0.670
OS:
ML model = 0.659
TNM model = 0.649
N/A—more advanced, radiotherapy-specific, and image-based models are required
Huang et al. 2016 [40] Retrospective 282 Early-stage (IA-IIB) NSCLC Cox regression models: radiomics signature versus clinicopathological risk factors DFS Radiomics, TNM staging, clinicopathological factors Radiomics signature as an independent biomarker; TNM staging, clinicopathological factors N/A C-index:
Radiomics signature = 0.72
Clinical-pathological nomogram = 0.691
N/A—building on previous model, but further validation required
Hosny et al. 2018 [41] Retrospective, multi-cohort 1194 Stage I-IIIb NSCLC patients treated with radiotherapy versus patients treated with surgery 3D convolutional neural network (CNN)
Random forest (for benchmarking)
2-year OS CT data, clinical parameters, preliminary genomic associations CNN models: labelled CT scans, contrast-enhanced CT, MRI of the brain, bronchoscopy with TBNA, mediastinoscopy, radiotherapy treatment, chemotherapy treatment, PET–CT scans
Random forest: clinical information (age, sex, staging, clinicopathological factors)
Large, uninterrupted areas of relatively higher CT density AUC ROC:
DL CNN:
Radiotherapy = 0.70
Surgery = 0.71
Random forest:
Radiotherapy = 0.55
Surgery = 0.58
N/A—rigorous evaluation in future prognostic studies necessary; improvements required include interoperability, model blind spots, and sensitivity to medical images
Zhang et al. 2023 [42] Retrospective 6371 Surgically resected, stage I-III NSCLC Intelligent prognosis evaluation system (IPES)
Cox regression (for benchmarking)
OS Pre-therapy CT images, EGFR genotype IPES model:
Cisplatin-based chemotherapy, EGFR-TKI-targeted adjuvant therapy, EGFR mutation, sex, age, smoking status, complete resection, cancer/tumour family history, histology (adenocarcinoma/other), staging, death status
Cox regression:
Clinical metadata, TNM staging
N/A C-index:
IPES = 0.817;
Cox models = 0.733
N/A—a more comprehensive integration of the clinicopathological factors is required
Martell et al. 2024 [43] Prospective 48 Surgically resected NSCLC patients who have not received chemo before surgery Deep representation learning
Pearson rank order correlation (for benchmarking)
Histology subtype, OS CT scans, clinicopathological data, tissue metabolomics Sex, TNM staging, histology, CT scan (contrast/non-contrast), treatment received, dosage, end result, tissue metabolome (amino acid, lipid, xenobiotics, nucleotide, peptide, energy, etc.) Sedoheptulose-7-phosphate and uridine—metabolites with the highest correlation (0.72 and 0.64, respectively) C-index:
Radiomics = 0.62
CT = 0.57
TMR-CT = 0.74
Radiomics + clinical = 0.64
CT + clinical = 0.59
TMR-CT + clinical = 0.78
N/A—retrospective study
Jiang et al. 2019 [44] Retrospective 524 Lung adenocarcinoma (IA-IV) Artificial neural network (ANN)
Cox regression model
Progression-free survival (PFS)
OS
Clinical data; genetic mutation data Smoking history, tumour stage, surgical margin resection status, primary tumour site, TCGA database, pathological grade of the margin, types of gene mutation, patient’s vital status Smoking history, tumour stage, surgical margin level AUC ROC:
ANN model = 0.712
Cox regression model = 0.662
N/A—further improvements and more comprehensive analysis of TCGA needed for clinical use
Coudray et al. 2018 [45] Retrospective 1634 TCGA database Deep convolutional neural network (CNN)
Random sampling
Cancer prognostication based on genetic mutations Whole slide H&E images, genetics/genomics Morphology, immunohistochemical staining, necrotic regions, inflammation, blood clotting, fibrotic scars, genetic mutations Genetic mutations most commonly associated with adenocarcinoma—STK11, EGFR, FAT1, SETBP1, KRAS, and TP53 AUC:
CNN = 0.750
Random sampling = 0.687
N/A—classification needs to be extended to other subtypes and histological features
Li et al. 2019 [46] Retrospective 492

579
TCGA adenocarcinoma database
GEO adenocarcinoma database
SigFeature (an ML algorithm)
1-year, 3-year, and 5-year OS Genetic mutations 16-gene-based prognostic prediction model:
C1QTNF6, ENPP5, PLEK2, RGS20, RHOV, SFTA3, GNG7, HMMR, IGF2BP1, PEBP1, UPK1B, SRGAP1, ZNF14, IER5L, SATB2, STYX
Over-expression of IGF2BP1, UPK1B, SRGAP1, SATB2, C1QTNF6, RHOV, IER5L, STYX, HMMR, PLEK2, and RGS20; and under expression of PEBP1, SFTA3, GNG7, ENPP5, and ZNF14 C-index:
SigFeature:
1-year = 0.753
3-year = 0.726
5-year = 0.656
N/A—retrospective
Yang et al. 2022 [47] Retrospective 1128
2498
318
TCGA database
IMMPORT database
Cistrome Cancer database
Cox proportional hazards regression model
Decision tree model
5-year overall survival (OS) Transcriptomic, immune, and clinicopathological data 193 differential immune genes, clinical parameters (sex, age, stage, tumour histology, survival) Degree of neutrophil infiltration in the tumour microenvironment; immune genes with significant prognostic benefit: LTB4R, IL-33, and SHC3 AUC:
TCGA data:
Cox model = 0.673
Decision tree model = 0.601
GEO data:
Cox = 0.712
N/A - large-scale, multicentre evidence necessary for further validation and verification
Song et al. 2020 [48] Prospective 342 Stage IV EGFR-positive NSCLC patients, receiving EGFR-TKI therapy DL semantic signature
Radiomics signature model
PFS Clinical data, CT scans, EGFR subtype Clinical data (sex, age, smoking status, performance status score, histopathological subtype), administered therapeutic regimen, EGFR variant subtype, enhanced/non-contrast CT N/A AUC:
DL model = 0.73
N/A—screening of CT sections and an image-encoding process corresponding to DL semantic features is needed to better explain the model
Deng et al. 2022 [49] Retrospective 570


129
Stage IV NSCLC patients treated with EGFR-TKIs
Stage IV patients treated with ICIs
EfficientNetV2-based survival benefit prognosis (ESBP) model
Experienced oncologist
PFS Preoperative CT scans, mutation presence, demographic data Demographic information, including sex, age, smoking status, performance status score, histopathological subtype, EGFR mutation subtype (EGFR-TKI patients), tumour proportion score (ICI-treated patients), and the administered therapeutic regimen N/A C-index:
ESBP = 0.690
10-year experienced oncologists = 0.651
N/A—although ESBP has been tested in the clinical setting based on how well it can assist clinical staff, further validation and optimisation on larger and more diverse datasets are needed
Kinoshita et al. 2023 [50] Prospective 1049 Surgically resected stage I-IIIA NSCLC eXtreme Gradient Boosting (XGBoost) ML algorithm
p-Stage scores
DFS, OS, cancer-specific survival (CSS) at 5 years 17 clinicopathological factors, 30 preoperative, and 22 post-operative blood tests Tumour malignancy (p-Stage, pathological N status, histological type, pleural invasion, lymphatic invasion, SUV-max, CEA, and CYFRA), pulmonary function, and smoking history.
The preoperative blood test results, including total protein, albumin, total bilirubin, direct bilirubin, aspartate aminotransferase, alanine aminotransferase, alkaline phosphatase, γ-glutamyl transpeptidase, lactate dehydrogenase, urea nitrogen, creatinine, uric acid, sodium, potassium, chlorine, calcium, total cholesterol, triglyceride, glucose, CRP, white blood cell, neutrophil, lymphocyte, monocyte, haemoglobin, platelet, prothrombin time-international normalised ratio, and activated partial thromboplastin time, were averaged over the 1-month period before surgery
Preoperative prothrombin time-international normalised ratio and post-operative CRP were identified as variables of high importance for DFS, OS, and CSS; post-operative total protein, post-operative aspartate aminotransferase, post-operative creatinine, post-operative monocyte, preoperative lymphocyte, post-operative lymphocyte, and post-operative lactate dehydrogenase were identified as important variables in at least two among DFS, OS, and CSS AUC:
XGBoost:
DFS = 0.890
OS = 0.926
CSS = 0.960
p-Stage:
DFS = 0.756
OS = 0.758
CSS = 0.786
N/A—retrospective study on a single cohort; external validation is necessary
Torrente et al. 2022 [51] Retrospective 5275 NSCLC patients AI-powered CLARIFY DSP Clinical data, questionnaire data, data collected from wearable devices N/A N/A N/A N/A—however, CLARIFY project has been extensively validated and might soon be used as an independent decision-support tool
Janik et al. 2023 [15] Prospective, multi-site 1387 Early stage (I-II) NSCLC ML model
Previous work (Lee et al., 2020)
5-year OS Clinical data, molecular data, radiological data Tumour characteristics (stage, TNM, histology, grade, subtype); general (age, race, sex, previous cancer, family cancer history, Eastern Cooperative Oncology Group, synchronous tumours, and biomarkers: anaplastic lymphoma kinase immunohistochemistry, PD-L1, and epidermal growth factor receptor-negative); comorbidities; smoking; symptoms; radiotherapy (type, area, dose, fractioning, duration); chemotherapy (type, start time, and regimen); surgery (procedure, time, type, resection grade, and TN stages) Having four family members with other cancers is the highest contributing factor, which increases the prediction by 0.19; missing information on the regimen of the chemotherapy or not having it done at all decreases the prediction by 0.07 AUC:
ML model = 0.81
Previous work = 0.7677
N/A—further prospective and multi-site studies necessary, which would also incorporate radiomics and genomics
Rakaee et al. 2025 [52] Multicentre cohort study Retrospective 958 patients (development cohort—614, validation cohort—344) Supervised DL-based response stratification H&E with binary classification (responder versus non-responder) Objective response rate
PFS
OS
PD-L1 expression
TMB
TILs
Age
Sex
ECOG performance
Histology
Line of therapy
Deep-IO model probability score Deep-IO score
PD-L1 expression
ECOG performance status
Histological subtype
Sex
AUC:
Internal set: 0.75 (95% CI 0.64-0.85)
External validation: 0.66 (95% CI 0.60-0.72)
Deep-IO + PD-L1: 0.70 (95% CI 0.63-0.76)
Further validation needed

AUC, area under the curve; BED, biologically effective dose; BMI, body mass index; CI, confidence interval; CRP, C-reactive protein; CS, collaborative staging; CT, computed tomography; ECOG, Eastern Cooperative Oncology Group; EGFR, epidermal growth factor receptor; FEV1, forced expiratory volume in 1 second; GEO, Gene Expression Omnibus; H&E, haematoxylin–eosin; ICIs, immune checkpoint inhibitors; IL, interleukin; IO, immunotherapy; LASSO, least absolute shrinkage and selection operator; MRI, magnetic resonance imaging; N/A, not applicable; NSCLC, non-small-cell lung cancer; PD-L1, programmed death-ligand 1; PET, positron emission tomography; PTV, planning target volume; ROC, receiver operating characteristic; SBRT, stereotactic body radiation therapy; TBNA, transbronchial needle aspiration; TCGA, The Cancer Genome Atlas; TILs, tumour-infiltrating lymphocytes; TKI, tyrosine kinase inhibitor; TLCO, transfer factor for carbon monoxide; TMR, tissue-metabolic-radiomic; TNM, tumour–node–metastasis.
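Many rows of the table report model discrimination as a C-index (concordance index). As a reading aid, the sketch below shows one common way such a value can be computed for right-censored survival data (Harrell's C). It is a minimal illustration under stated assumptions, not the implementation used by any of the cited studies: the function name `concordance_index`, its arguments, and the toy data are hypothetical, and pairs with tied observed times are simply skipped.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index for right-censored data.

    times       -- observed follow-up time per patient
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk assigned to the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored at 12 months.
toy_times = [5, 8, 12, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the C-index values in the table are read.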