Table 3.
Publication | Feature extraction method | Prediction model | Sample size | Data type | Performance | Validation method | Feature selection/input | Highlight/advantage | Shortcoming |
---|---|---|---|---|---|---|---|---|---|
Jiang et al. [104] | MRRN-based model | MRRN-based model | 1210 | CT Images | DSC (0.68–0.75) | 5-fold cross-validation | 3D image features | The model can accurately track the tumor volume changes from CT images across multiple image resolutions | The model does not predict accurately enough when the tumor size is small |
Qureshi [106] | NA | RF; SVM; KNN; LDA; CART | 201 | Molecular structure and somatic mutations of EGFR | Accuracy (0.975) | 10-fold cross-validation | 4 clinical features + 4 protein drug interaction features + 5 geometrical features | The model integrates multiple features for data training, and achieves better performance than other benchmarked models | Among the possible 594 EGFR mutations available in the COSMIC database, the model only considers the most common 33 EGFR mutations for model training |
Kapil et al. [107] | AC-GAN | AC-GAN | 270 | Digital pathology images | Lcc (0.94); Pcc (0.95); MAE (8.03) | Hold-out | PD-L1-stained tumor section histological slides | The model achieves better performance than other benchmarked, fully supervised models | In the experiments, the use of PD-L1 staining for TPS evaluation may not be accurate enough |
Geeleher et al. [109] | NA | Ridge regression model | 62 | RNA-seq | Accuracy (0.89) | Leave-one-out cross-validation | Low-variance genes removed | The model accurately predicts drug response from RNA-seq profiles alone (a workflow sketch follows the table) | The training sample size is small |
Chen et al. [123] | Chi-square test + NN | NN | 440 | RNA-seq | Accuracy (0.83) | Hold-out | RNA-seq of 5 genes | The model uses multiple laboratory datasets for training to improve its robustness | The model does not consider demographic and clinical features, which may affect prediction performance |
LUADpp [125] | Top genes with the most significant mutation-frequency differences | SVM | 371 | Somatic mutations | Accuracy (0.81) | 5-fold cross-validation | Somatic mutation features in 85 genes | The model achieves high accuracy using only seven gene-mutation features | Mutation frequency may be affected by sampling bias across datasets; LD may also affect the feature selection |
Cho et al. [126] | Information gain; Chi-squared test; minimum redundancy maximum relevance; correlation algorithm | NB; KNN; SVM; DT | 471 | Somatic mutations | Accuracy (0.68–0.88) | 5-fold cross-validation | Somatic mutation features composed of 19 genes | To improve performance, the model uses four different methods for feature selection | The training cohort consists of only one dataset |
Yu et al. [128] | Information gain ratio; hierarchical clustering | RF | 538 | Multi-omics (histology, pathology reports, RNA, proteomics) | AUC (> 0.8) | Leave-one-out cross-validation | 15 gene set features | The study uses an integrative omics-pathology model to improve the accuracy of patient prognosis prediction | Cox models may overfit high-dimensional data |
Asada et al. [130] | Autoencoder + Cox-PH + K-means + ANOVA | SVM | 364 | Multi-omics (miRNA, mRNA) | Accuracy (0.81) | Hold-out | 20 miRNAs + 25 mRNAs | The study uses ML algorithms to systematically extract features from multi-omics datasets | The model does not account for clinical and demographic variation in data training |
Takahashi et al. [131] | Autoencoder + Cox-PH + K-means + XGBoost/LightGBM | LR | 483 | Multi-omics (mRNA, miRNA, somatic mutation, CNV, methylation, RPPA) | AUC (0.43–0.99, depending on the omics data type) | Hold-out | 12 mRNAs, 3 miRNAs, 3 methylations, 5 CNVs, 3 somatic mutations, and 3 RPPA features | The study uses ML algorithms to systematically extract features from multi-omics datasets | The different omics datasets collected in this study only partially share samples, which may bias model evaluation |
Wiesweg et al. [136] | Lasso regression | SVM | 122 | RNA-seq | Significant hazard-ratio differences | Hold-out | 7 genes from the feature selection model + 25 cell type-specific genes | The ML-based feature extraction model outperforms any single immune marker for immunotherapy response prediction | The metrics used in this study are not intuitive to interpret; accuracy or AUC may be better |
Trebeschi et al. [137] | LR; RF | LR; RF | 262 | CT imaging | AUC (0.76–0.83) | Hold-out | 10 radiographic features | The model extracts potentially predictive CT-derived radiomic biomarkers to improve immunotherapy response prediction | Predictive performance is not robust across different cancer types |
Saltz et al. [142] | CAE [143] | VGG16 [144] + DeconvNet [145] | 4612 (13 cancer types) | Histological images | AUC (0.9544) | Hold-out | Image features of H&E-stained tumor section histological slides | The model outperforms pathologists and other benchmarked models | Predictive performance is not robust across different cancer types |
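As an illustration of the kind of workflow summarized in the Geeleher et al. [109] row (ridge regression on expression profiles evaluated with leave-one-out cross-validation), the following minimal sketch uses scikit-learn with synthetic placeholder data. It is not the authors' code: the variance threshold, the regularization strength, and the simulated expression matrix are arbitrary assumptions made for illustration only.

```python
# Minimal sketch (not the authors' pipeline): ridge regression for drug-response
# prediction from expression profiles, evaluated with leave-one-out cross-validation.
# All data below are synthetic placeholders.
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(62, 500))                              # 62 samples x 500 genes (placeholder)
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=62)   # synthetic drug-response values

# Drop low-variance genes, standardize, then fit a ridge model (all inside the
# pipeline, so the filtering is re-learned within each cross-validation fold).
model = make_pipeline(
    VarianceThreshold(threshold=0.1),  # crude stand-in for "low-variance genes removed"
    StandardScaler(),
    Ridge(alpha=1.0),
)

# Leave-one-out: each sample is predicted by a model trained on all other samples.
pred = cross_val_predict(model, X, y, cv=LeaveOneOut())
print("LOOCV Pearson r:", np.corrcoef(pred, y)[0, 1])
```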
Note: MRRN, multiple resolution residually connected network; CART, classification and regression trees; AC-GAN, auxiliary classifier generative adversarial network; DSC, Dice similarity coefficient; Lcc, Lin’s concordance coefficient; Pcc, Pearson correlation coefficient; MAE, mean absolute error; TPS, tumor proportion score; LD, linkage disequilibrium; Cox-PH, Cox proportional hazards; ANOVA, analysis of variance; miRNA, microRNA; RPPA, reverse phase protein array; CAE, convolutional autoencoder; mRNA, messenger RNA; H&E, hematoxylin and eosin; PD-L1, programmed cell death 1 ligand 1; COSMIC, the Catalogue Of Somatic Mutations In Cancer; EGFR, epidermal growth factor receptor. Compared with hold-out validation, cross-validation is usually more robust because it accounts for the variance across possible splits into training, validation, and test data; however, it is also more time consuming than a single hold-out split.
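To make the hold-out versus cross-validation trade-off described above concrete, the short sketch below computes both estimates for the same model on synthetic data. The classifier, split ratio, and fold count are arbitrary choices for illustration, not settings taken from any of the studies in the table.

```python
# Minimal sketch: hold-out gives one accuracy estimate from a single split,
# while k-fold cross-validation averages over several splits at a higher
# computational cost (k model fits instead of one). Synthetic data only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 30))            # placeholder feature matrix
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # placeholder binary labels

clf = RandomForestClassifier(n_estimators=200, random_state=0)

# Hold-out: one train/test split, one accuracy estimate.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
holdout_acc = clf.fit(X_tr, y_tr).score(X_te, y_te)

# 5-fold cross-validation: five fits, accuracy averaged over folds.
cv_scores = cross_val_score(clf, X, y, cv=5)

print(f"hold-out accuracy: {holdout_acc:.3f}")
print(f"5-fold CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```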