2020 Oct 14;10:588221. doi: 10.3389/fonc.2020.588221

TABLE 1.

Features of recent methodologies and techniques used for AI-based systems biology approaches in multi-omics data analysis of cancer.

Methodology	Techniques	Characteristics	Specialty	Cancer types	Omics data	Outcome	Performance	References
Unsupervised and supervised	Stacked autoencoder and hierarchical integration deep flexible neural forest network (HI-DFNForest)	Autoencoders are used to integrate multi-omics data. HI-DFNForest is used for classification.	Considers intrinsic statistical properties and learns high-level representations of each omics data type. The HI-DFNForest model is suitable for small-scale data.	Breast, glioblastoma, ovarian cancer	mRNA expression, miRNA expression, methylation	Classifies cancer subtypes	Accuracy: 0.885 (glioblastoma multiforme)	(33)
Supervised and unsupervised	Deep learning, autoencoder	An autoencoder was used to reduce data dimensionality, and an SVM was then used to find subgroups.	Predicts survival subgroups and aggregates genes belonging to similar pathways.	Hepatocellular carcinoma	mRNA expression, miRNA expression, methylation	Predicts survival subgroups	Concordance index: 0.68	(34, 35)
Unsupervised	Combinations of autoencoders	Data integration based on four types of variational autoencoders (VAE)	All VAE architectures perform well. Learned representations coupled with SVMs provide the best prediction.	Breast cancer	mRNA expression, CNV data	Focused on data integration approaches	Accuracy: 0.858	(37)
Kernel framework	Multiple kernel learning	Combines several kernels into one meta-kernel in an unsupervised framework.	Identifies cancer subtypes and provides relationships between them	Breast cancer	mRNA expression, miRNA expression, methylation	Proposes a generic approach to data integration	Average cluster purity: 0.70	(39)
Unsupervised	Autoencoder	Multi-modal sparse denoising autoencoder framework coupled with sparse non-negative matrix factorization	Illustrates the impact of individual omics features on pathway scores.	Colorectal cancer, lung squamous cell carcinoma, glioblastoma multiforme and breast cancer	mRNA expression, miRNA expression, methylation, CNV	Clusters patients and provides feature pathways for patient clusters	Consensus silhouette index: 0.98 (colorectal cancer)	(42)
Supervised and unsupervised	Random forest, SVM	Combined use of random forest and SVM	Classifies normal and cancer samples across different tissue types and is hence useful for diagnosis	9 types of cancers	Pan-cancer mRNA expression	Classifies samples and identifies biomarkers	Accuracy: 97.89% (non-specific tissue type)	(43)
Unsupervised	Autoencoder	Three types of integration approaches were used. Feature combinations with the highest average predictive accuracy were used.	Autoencoder-based classification	Neuroblastoma	mRNA expression, CNV	Prognosis sub-group	p-value from Kaplan-Meier curves for overall survival: 2.8e-8	(44)
Supervised and unsupervised	Integrative network fusion and deep learning	A random forest was trained on two types of integrated omics data. The classifier was based on the intersection of the two training processes.	Two approaches were followed for data integration: juxtaposition and integration by similarity network fusion.	Neuroblastoma	mRNA expression, CNV	Prognosis sub-group	p-value for Kaplan-Meier plot: 5.7e-4	(45)
Supervised and unsupervised	SVM and random forest	Initial supervised analysis was followed by a systems biology approach and random forest-based analysis	Multi-omics data was integrated in multiple steps with removal of redundant features.	Colorectal cancer	mRNA expression, miRNA expression, CNV, metabolomics	Identifies markers and pathways associated with cancer relapse	p-value from Kaplan-Meier curves for overall survival: 5.7e-4	(46)
Multi-view learning	Min-Redundancy and Max-Relevance (MRMR)	Selects features with maximum relevance and minimum redundancy with already selected features	Two-stage feature selection framework	Ovarian cancer	mRNA expression, methylation, CNV	Identifies biomarkers for predicting survival.	Area under curve (AUC): 0.7 for random forest classifier	(47)
Neural network	Deep learning-based neural network	Instead of gene expression data, eigengene modules from gene co-expression analysis were used as features.	Associates feature genes with metadata such as age	Breast cancer	mRNA expression, miRNA expression, methylation, CNV and other metadata	Survival prediction	Mean concordance index: 0.6813	(48)
LASSO and neural network	Deep learning framework and LASSO	Uses group LASSO and a deep neural network for data integration, then a Cox model for survival prediction	Different features from the same gene are grouped together	Pan-cancer	mRNA expression, CNV, SNP	Survival prediction	Concordance index: 0.8	(49)
Kernel method	Kernel alignment assessment of omic similarity matrix	An omic similarity matrix was constructed for each omics data type and the similarity between them was measured.	Considers the involvement of a large number of biomarkers in disease prognosis	Pan-cancer	mRNA expression, miRNA expression, methylation, CNV, SNP	Variation in prognosis assessment across cancer types	Concordance index >0.68 (sample size = 900)	(50)
Kernel-based and feature-selection-based	Bayesian efficient multiple kernel learning (BEMKL) model	Kernelized regression that works on similarities between cell lines	Reduces the number of model parameters to match the number of samples, not the number of features. Extracts non-linear relations between features and drug response.	Breast cancer cell lines	mRNA expression, CNV, methylation, SNP, proteomic	Drug-response prediction	False discovery rate: 2.5e-5	(51, 52, 57)
Deep neural network, transfer learning	Multi-Omics Late Integration (MOLI)	Creates a feature space for each omics data type. Learned features are integrated by concatenation and used for prediction of drug response. Uses transfer learning by training on the responses to all drugs with the same target.	Considers a unique distribution for each omics data type.	Pan-cancer	mRNA expression, CNV, SNP	Predicts drug response	Accuracy: 0.8 for drug cetuximab	(53)
Supervised	SVM and leave-one-out cross-validation (LOOCV)	Finds features from each omics data type and then identifies marker candidates based on miRNA and mRNA interactions	Analyzes integrated mRNA and miRNA expression data considering their interactions	Pancreatic ductal carcinoma	mRNA expression, miRNA expression	Identifies mRNA and miRNA markers. Predicts miRNA expression level	AUC: 0.925 for miR-21 as multi-marker	(54)
Supervised	idTRAX	Finds target kinases from the compound data of all genes	Identifies kinases as effective targets of drugs	Breast cancer	Genomic and transcriptomic	Cell-model selective anti-cancer drug target	Spearman correlation ∼0.1	(55)
Supervised	Capsule network based modeling (CapsNetMMD)	Multi-omics data is integrated to form feature matrix and converted to capsule layers by convolution.	Supervised classification is done based on known breast cancer genes	Breast cancer	mRNA expression, methylation, CNV	Therapeutic target genes of breast cancer	p-value: 3.6e-141 (rank cut-off: 20%)	(56)
Supervised	Random forest and different classifiers	Features were extracted using shrunken centroid and random forest-based algorithms. Different classifiers were used.	Considers methylation patterns. Distinguishes early and late stages of cancer.	Papillary renal cell carcinoma	mRNA expression, methylation	Finds driver genes	Accuracy: 84.6% for random forest	(58)
Semi-supervised	PLATYPUS	After training on labeled data, it co-trains with unlabeled data while accounting for missing data.	Important features are linked to drug sensitivity	Pan-cancer cell lines	mRNA expression, CNV, SNP	Predicts drug response	AUC: 0.9	(59)
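
Several survival-prediction entries in Table 1 report a concordance index. As a reading aid, the following is a minimal sketch of how such an index can be computed, assuming right-censored survival data and a model that outputs one risk score per patient. The function name `concordance_index` and its arguments are illustrative and are not taken from any of the cited studies; pairs with tied survival times are simply skipped here, which is a simplification that individual studies may handle differently.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by risk score.

    times       -- observed follow-up time per patient
    events      -- 1/True if the event (e.g. death) was observed, 0/False if censored
    risk_scores -- model output; a higher score should mean shorter survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the shorter
        # time ends in an observed event (assumption: tied times are skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk, shorter survival: correct order
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
random_like = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

Under this definition a value of 0.5 corresponds to uninformative ranking and 1.0 to perfect ranking, which gives a scale for reading values such as 0.68 or 0.8 in the Performance column.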