Table 3.
Landscape of artificial intelligence-based clinical prediction models in early and metastatic non-small-cell lung cancer
| Author (year) | Study design | Sample size | Stage and type of lung cancer | Type of model | Outcome(s) investigated | Type of prognostic features included in the model(s) | Predictive features included in the final model(s) | Variables significantly associated with outcome | Model discrimination | Guideline approval |
|---|---|---|---|---|---|---|---|---|---|---|
| Liu et al. 2023<sup>21</sup> | Retrospective | 198 | Stage IA or IB invasive lung adenocarcinoma | Cox regression model | 5-year overall survival (OS), disease-free survival (DFS) | Patient demographics, clinical and pathological parameters | Demographics: age and sex; Pathological parameters: smoking history, type of surgery, adjuvant chemotherapy, T stage, lymphovascular invasion, necrosis, neuron invasion, architectural score | T stage, necrosis, and architectural score were key prognostic predictors | AUC: Prediction model = 0.717; T stage = 0.624; Architectural score = 0.611; Necrosis = 0.617 | N/A: further studies and external validation are required |
| She et al. 2020<sup>36</sup> | Retrospective | 17 322 | Stages I-IV NSCLC | DeepSurv, a deep learning (DL) survival neural network; Cox proportional hazards regression model | Prognostic model: survival predictions and treatment recommendations; Treatment recommendation: SEER database = population that followed treatment recommendations, CHINA database = population that did not follow recommendations | Patient characteristics, tumour stage, treatment strategies | 127 features, including cases (sex, age, and marriage status), tumour characteristics (location, size, histological grade, histological type, TNM stage), SEER code (CS extension, CS mets at dx, regional nodes examined, regional nodes positive, lung - pleural/elastic layer invasion by H&E or elastic stain, lung - separate tumour nodules - ipsilateral lung, lung - surgery to primary site), and treatment details (surgical type) | N/A | Prediction model C-index: DeepSurv model = 0.739, Cox regression model = 0.716; Treatment recommendation hazard ratio: SEER = 2.99, CHINA = 2.14 | N/A: feasibility trial status; further studies and external validation needed |
| Jin et al. 2023<sup>37</sup> | Retrospective | 4617 | Surgically resected stage aI3 NSCLC | Model 1: DL neural network; Model 2: Random forest; Model 3: Cox proportional hazards model | 5-year survival | Patient demographics, NSCLC-related characteristics, and treatment details | Demographic information (age and sex) and NSCLC-cancer-related characteristics (TNM stage, histology type, primary site, tumour size, regional node number examined, regional node positive number, and laterality of the tumour), and treatment details (surgery of primary site, radiation, and chemotherapy) | Regional positive nodes (−0.6634); regional examined nodes (−0.7648); tumour size (−0.5633); age (−0.4633) | C-index: DL neural network = 0.843; Random forest = 0.678; Cox proportional hazards = 0.678 | N/A: improvements to model explainability needed |
| Palmer et al. 2023<sup>38</sup> | Retrospective | 392 | Patients undergoing radical surgery for early-stage NSCLC | Three regularised Cox models: Ridge, LASSO, Elastic Net | Recurrence-free survival (RFS) | 66 clinicopathological features across 3 models | Elastic Net model: systemic inflammatory response index (SIRI), eosinophil count, preoperative nodal stage, weight loss, performance status, and maximum standardised uptake value (SUV-max); Ridge model: performance status, weight loss, SIRI, eosinophil count, lymphovascular invasion, visceral pleural invasion, and pathological stage | N/A | C-index: Elastic Net = 0.70, Ridge = 0.75; AUC ROC: Elastic Net = 0.72, Ridge = 0.79 | N/A: prospective and external validation required |
| Hindocha et al. 2022<sup>39</sup> | Retrospective, multicentre | 657 patients, across 5 hospitals | Patients receiving curative-intent radiotherapy for NSCLC | Logistic regression models based on TNM staging (for benchmarking); Machine learning (ML) models | Recurrence, RFS, OS 2 years after treatment | Patient demographics and clinical parameters | Age, PTV, primary size, SUV primary, SUV nodal, BMI, FEV1, TLCO, total dose, BED, sex, smoking status, T staging, N staging, nodal avidity, morphology (adenocarcinoma/no pathological diagnosis), morphology, treatment (chemo, conventional, SBRT) | RFS: PTV, size of the primary tumour, total number of fractions, whether treatment was with SBRT or BED; Recurrence: PTV, size of primary tumour, number of fractions, SBRT treatment, BED; OS: PTV, size of primary tumour, BED, T1 stage, treatment with SBRT, treatment with chemoradiotherapy, whether nodal avidity was present, smoking status | AUC ROC: RFS: ML model = 0.682, TNM model = 0.650; Recurrence: ML model = 0.687, TNM model = 0.670; OS: ML model = 0.659, TNM model = 0.649 | N/A: more advanced, radiotherapy-specific, and image-based models are required |
| Huang et al. 2016<sup>40</sup> | Retrospective | 282 | Early-stage (IA-IIB) NSCLC | Cox regression models: radiomics signature versus clinicopathological risk factors | DFS | Radiomics, TNM staging, clinicopathological factors | Radiomics signature as an independent biomarker; TNM staging, clinicopathological factors | N/A | C-index: Radiomics signature = 0.72; Clinical-pathological nomogram = 0.691 | N/A: building on previous model, but further validation required |
| Hosny et al. 2018<sup>41</sup> | Retrospective, multi-cohort | 1194 | Stage I-IIIb NSCLC patients treated with radiotherapy versus patients treated with surgery | 3D convolutional neural network (CNN); Random forest (for benchmarking) | 2-year OS | CT data, clinical parameters, preliminary genomic associations | CNN models: labelled CT scans, contrast-enhanced CT, MRI of the brain, bronchoscopy with TBNA, mediastinoscopy, radiotherapy treatment, chemotherapy treatment, PET–CT scans; Random forest: clinical information (age, sex, staging, clinicopathological factors) | Large, uninterrupted areas of relatively higher CT density | AUC ROC: DL CNN: Radiotherapy = 0.70, Surgery = 0.71; Random forest: Radiotherapy = 0.55, Surgery = 0.58 | N/A: rigorous evaluation in future prognostic studies necessary; improvements required include interoperability, model blind spots, and sensitivity to medical images |
| Zhang et al. 2023<sup>42</sup> | Retrospective | 6371 | Surgically resected, stage I-III NSCLC | Intelligent prognosis evaluation system (IPES); Cox regression (for benchmarking) | OS | Pre-therapy CT images, EGFR genotype | IPES model: cisplatin-based chemotherapy, EGFR-TKI-targeted adjuvant therapy, EGFR mutation, sex, age, smoking status, complete resection, cancer/tumour family history, histology (adenocarcinoma/other), staging, death status; Cox regression: clinical metadata, TNM staging | N/A | C-index: IPES = 0.817; Cox models = 0.733 | N/A: a more comprehensive integration of the clinicopathological factors is required |
| Martell et al. 2024<sup>43</sup> | Prospective | 48 | Surgically resected NSCLC patients who have not received chemo before surgery | Deep representation learning; Pearson rank order correlation (for benchmarking) | Histology subtype, OS | CT scans, clinicopathological data, tissue metabolomics | Sex, TNM staging, histology, CT scan (contrast/non-contrast), treatment received, dosage, end result, tissue metabolome (amino acid, lipid, xenobiotics, nucleotide, peptide, energy, etc.) | Sedoheptulose-7-phosphate and uridine: metabolites with the highest correlation (0.72 and 0.64, respectively) | C-index: Radiomics = 0.62; CT = 0.57; TMR-CT = 0.74; Radiomics + clinical = 0.64; CT + clinical = 0.59; TMR-CT + clinical = 0.78 | N/A: retrospective study |
| Jiang et al. 2019<sup>44</sup> | Retrospective | 524 | Lung adenocarcinoma (IA-IV) | Artificial neural network (ANN); Cox regression model | Progression-free survival (PFS); OS | Clinical data; genetic mutation data | Smoking history, tumour stage, surgical margin resection status, primary tumour site, TCGA database, pathological grade of the margin, types of gene mutation, patient’s vital status | Smoking history, tumour stage, surgical margin level | AUC ROC: ANN model = 0.712; Cox regression model = 0.662 | N/A: further improvements and more comprehensive analysis of TCGA needed for clinical use |
| Coudray et al. 2018<sup>45</sup> | Retrospective | 1634 | TCGA database | Deep convolutional neural network (CNN); Random sampling | Cancer prognostication based on genetic mutations | Whole slide H&E images, genetics/genomics | Morphology, immunohistochemical staining, necrotic regions, inflammation, blood clotting, fibrotic scars, genetic mutations | Genetic mutations most commonly associated with adenocarcinoma: STK11, EGFR, FAT1, SETBP1, KRAS, and TP53 | AUC: CNN = 0.750; Random sampling = 0.687 | N/A: classification needs to be extended to other subtypes and histological features |
| Li et al. 2019<sup>46</sup> | Retrospective | 492; 579 | TCGA adenocarcinoma database; GEO adenocarcinoma database | SigFeature (an ML algorithm) | 1-year, 3-year, and 5-year OS | Genetic mutations | 16-gene-based prognostic prediction model: C1QTNF6, ENPP5, PLEK2, RGS20, RHOV, SFTA3, GNG7, HMMR, IGF2BP1, PEBP1, UPK1B, SRGAP1, ZNF14, IER5L, SATB2, STYX | Over-expression of IGF2BP1, UPK1B, SRGAP1, SATB2, C1QTNF6, RHOV, IER5L, STYX, HMMR, PLEK2, and RGS20; and under-expression of PEBP1, SFTA3, GNG7, ENPP5, and ZNF14 | C-index: SigFeature: 1-year = 0.753; 3-year = 0.726; 5-year = 0.656 | N/A: retrospective |
| Yang et al. 2022<sup>47</sup> | Retrospective | 1128; 2498; 318 | TCGA database; IMMPORT database; Cistrome Cancer database | Cox proportional hazards regression model; Decision tree model | 5-year overall survival (OS) | Transcriptomic, immune, and clinicopathological data | 193 differential immune genes, clinical parameters (sex, age, stage, tumour histology, survival) | Degree of neutrophil infiltration in the tumour microenvironment; immune genes with significant prognostic benefit: LTB4R, IL-33, and SHC3 | AUC: TCGA data: Cox model = 0.673, Decision tree model = 0.601; GEO data: Cox = 0.712 | N/A: large-scale, multicentre evidence necessary for further validation and verification |
| Song et al. 2020<sup>48</sup> | Prospective | 342 | Stage IV EGFR-positive NSCLC patients, receiving EGFR-TKI therapy | DL semantic signature; Radiomics signature model | PFS | Clinical data, CT scans, EGFR subtype | Clinical data (sex, age, smoking status, performance status score, histopathological subtype), administered therapeutic regimen, EGFR variant subtype, enhanced/non-contrast CT | N/A | AUC: DL model = 0.73 | N/A: screening of CT sections and an image-encoding process corresponding to DL semantic features is needed to better explain the model |
| Deng et al. 2022<sup>49</sup> | Retrospective | 570; 129 | Stage IV NSCLC patients treated with EGFR-TKIs; Stage IV patients treated with ICIs | EfficientNetV2-based survival benefit prognosis (ESBP) model; Experienced oncologist | PFS | Preoperative CT scans, mutation presence, demographic data | Demographic information, including sex, age, smoking status, performance status score, histopathological subtype, EGFR mutation subtype (EGFR-TKI patients), tumour proportion score (ICI-treated patients), and the administered therapeutic regimen | N/A | C-index: ESBP = 0.690; 10-year experienced oncologists = 0.651 | N/A: although ESBP has been tested in the clinical setting based on how well it can assist clinical staff, further validation and optimisation on larger and more diverse datasets are needed |
| Kinoshita et al. 2023<sup>50</sup> | Prospective | 1049 | Surgically resected stage I-IIIA NSCLC | eXtreme Gradient Boosting (XGBoost) ML algorithm; p-Stage scores | DFS, OS, cancer-specific survival (CSS) at 5 years | 17 clinicopathological factors, 30 preoperative, and 22 post-operative blood tests | Tumour malignancy (p-Stage, pathological N status, histological type, pleural invasion, lymphatic invasion, SUV-max, CEA, and CYFRA), pulmonary function, and smoking history. The preoperative blood test results, including total protein, albumin, total bilirubin, direct bilirubin, aspartate aminotransferase, alanine aminotransferase, alkaline phosphatase, γ-glutamyl transpeptidase, lactate dehydrogenase, urea nitrogen, creatinine, uric acid, sodium, potassium, chlorine, calcium, total cholesterol, triglyceride, glucose, CRP, white blood cell, neutrophil, lymphocyte, monocyte, haemoglobin, platelet, prothrombin time-international normalised ratio, and activated partial thromboplastin time, were averaged over the 1-month period before surgery | Preoperative prothrombin time-international normalised ratio and post-operative CRP were identified as variables of high importance for DFS, OS, and CSS; post-operative total protein, post-operative aspartate aminotransferase, post-operative creatinine, post-operative monocyte, preoperative lymphocyte, post-operative lymphocyte, and post-operative lactate dehydrogenase were identified as important variables in at least two among DFS, OS, and CSS | AUC: XGBoost: DFS = 0.890, OS = 0.926, CSS = 0.960; p-Stage: DFS = 0.756, OS = 0.758, CSS = 0.786 | N/A: retrospective study on a single cohort; external validation is necessary |
| Torrente et al. 2022<sup>51</sup> | Retrospective | 5275 | NSCLC patients | AI-powered CLARIFY DSP | | Clinical data, questionnaire data, data collected from wearable devices | N/A | N/A | N/A | N/A: however, the CLARIFY project has been extensively validated and might soon be used as an independent decision-support tool |
| Janik et al. 2023<sup>15</sup> | Prospective, multi-site | 1387 | Early stage (I-II) NSCLC | ML model; Previous work (Lee et al., 2020) | 5-year OS | Clinical data, molecular data, radiological data | Tumour characteristics (stage, TNM, histology, grade, subtype); general (age, race, sex, previous cancer, family cancer history, Eastern Cooperative Oncology Group, synchronous tumours, and biomarkers: anaplastic lymphoma kinase immunohistochemistry, PD-L1, and epidermal growth factor receptor-negative); comorbidities; smoking; symptoms; radiotherapy (type, area, dose, fractioning, duration); chemotherapy (type, start time, and regimen); surgery (procedure, time, type, resection grade, and TN stages) | Having four family members with other cancers is the highest contributing factor, which increases the prediction by 0.19; missing information on the regimen of the chemotherapy or not having it done at all decreases the prediction by 0.07 | AUC: ML model = 0.81; Previous work = 0.7677 | N/A: further prospective and multi-site studies necessary, which would also incorporate radiomics and genomics |
| Rakaee et al. 2025<sup>52</sup> | Retrospective, multicentre cohort study | 958 patients (development cohort: 614; validation cohort: 344) | | Supervised DL-based response stratification H&E with binary classification (responder versus non-responder) | Objective response rate; PFS; OS | PD-L1 expression; TMB; TILs; age; sex; ECOG performance status; histology; line of therapy | Deep-IO model probability score | Deep-IO score; PD-L1 expression; ECOG performance status; histological subtype; sex | AUC: Internal set = 0.75 (95% CI 0.64-0.85); External validation = 0.66 (95% CI 0.60-0.72); Deep-IO + PD-L1 = 0.70 (95% CI 0.63-0.76) | Further validation needed |
AUC, area under the curve; BED, biologically effective dose; BMI, body mass index; CI, confidence interval; CRP, C-reactive protein; CS, collaborative staging; CT, computed tomography; ECOG, Eastern Cooperative Oncology Group; EGFR, epidermal growth factor receptor; FEV1, forced expiratory volume in 1 second; GEO, Gene Expression Omnibus; H&E, haematoxylin–eosin; ICIs, immune checkpoint inhibitors; IL, interleukin; IO, immunotherapy; LASSO, least absolute shrinkage and selection operator; MRI, magnetic resonance imaging; N/A, not applicable; NSCLC, non-small-cell lung cancer; PD-L1, programmed death-ligand 1; PET, positron emission tomography; PTV, planning target volume; ROC, receiver operating characteristic; SBRT, stereotactic body radiation therapy; TBNA, transbronchial needle aspiration; TCGA, The Cancer Genome Atlas; TILs, tumour-infiltrating lymphocytes; TKI, tyrosine kinase inhibitor; TLCO, transfer factor for carbon monoxide; TMR, tissue-metabolic-radiomic; TNM, tumour–node–metastasis.