1 |
[8] |
To design a method to classify and predict classes of cancer |
Neighborhood analysis, DNA microarrays, self-organizing maps |
27 ALL samples from Dana-Farber Cancer Institute, 11 adult AML samples from the Cancer and Leukemia Group B (CALGB) leukemia cell bank |
Feasible method. Proper experimental care is required |
|
2 |
[2] |
To classify cancer samples using gene expression data |
Computational analysis, Affymetrix oligonucleotide arrays, neighborhood analysis, GeneCluster software |
38 leukemia samples (11 AML, 27 ALL) for training; 34 samples (14 AML, 20 ALL) for testing |
Genes with no correlation provide a better result, and the median prediction strength is 0.86 |
|
3 |
[9] |
To classify cancers into specific categories using their gene expression |
ANNs, cDNA microarrays, DeArray software |
NCI, ATCC, MSKCC, CHTN, DZNSG, National Institutes of Health |
It also works with nonlinear features. It is robust and achieves high sensitivity and specificity. |
|
4 |
[10] |
To create a framework for predicting predefined classes of tumor |
Compound covariate prediction, BRB ArrayTools |
Hereditary breast cancer dataset of 22 patients [11] |
Good setting for comparing prediction methods. Requires some improvements. |
|
5 |
[12] |
To develop a classification system for DNA microarray gene expression data |
SOMs, Cluster and TreeView software, PCA, KNN |
Multiple datasets, including one with 99 samples and another with 42 samples |
Gene expression provides an excellent way of diagnosing patients with medulloblastomas |
|
6 |
[13] |
To propose a method that performs classification on the basis of interval-scaled attributes |
PCA, FA, fuzzy FA |
203 samples (a subset of the actual dataset used in [14]) |
Successfully used in supervised learning. FA provides more information compared to surgical-pathological staging |
|
7 |
[15] |
To propose a method for gene feature selection |
Multiple SVM-RFE |
Four gene expression datasets available on Kent Ridge Bio-Medical Data Set Repository |
MSVM-RFE achieves better classification accuracy than SVM-RFE. SVM performance is improved. |
|
8 |
[16] |
To propose a framework for integrating different data types |
Generalized singular value decomposition |
Fourteen breast cancer cell lines from American Type Culture Collection |
Gene expression and copy number data are analyzed. The framework could be extended to use other data types. |
|
9 |
[17] |
To propose a method for identifying tumor tissues from different gene expression data |
ssEAM, PSO |
NCI60, acute leukemia, ALL dataset |
ssEAM performs better than PNN, ANN, LVQ1, and KNN at a 0.05 significance level |
|
10 |
[18] |
To present a selection method for analyzing gene expression data |
RBF neural network, rough based feature selection method, naïve Bayes, linear SVM |
ALL, AML, lung cancer and prostate cancer dataset (http://sdmc.lit.org.sg/GEDatasets/Datasets.) |
Best classification accuracy of 99.8% |
|
11 |
[19] |
To present a framework for discovering cancer classes. |
Permutation technique, cluster ensemble, cluster validity index (DAI) |
3 synthetic and 4 real datasets (leukemia [2], Novartis multitissue [20], lung cancer [14], St. Jude [21]) |
DAI finds the number of classes correctly and outperforms other existing methods |
|
12 |
[22] |
To present a method based on gene expression for classifying NSCLC |
Hierarchical clustering, SpotFire decision site, proportional hazards model |
91 NSCLC, six normal lung tissues from GSE3526 (Duke University) |
Gene signatures provide the best way for histopathological classification |
|
13 |
[23] |
To propose a classifier predicting disease in CRC patients |
Agilent 44K oligonucleotide arrays, Kaplan–Meier method, unsupervised hierarchical clustering |
188 training samples (NCI, LUMC, SGH) and 206 testing samples (Institut Català d'Oncologia, Spain) |
Eighty-six percent of patients in the validation dataset are identified as low risk. First prognostic technique for CRC |
|
14 |
[24] |
To propose a framework that combines genome-wide copy number and expression data |
L1-L2 constrained regression, local and global search strategies |
89 breast cancer samples (UC San Francisco and California Pacific Medical Center [25]) |
Outperforms other existing methods in accuracy |
|
15 |
[26] |
To propose a framework that combines other models that describe gene interactions. |
Bayesian model, Gibbs distribution, ANOVA test, parallel programming with GPU/CPU |
GSE4290, DREAM dataset |
Specificity of 0.99 has been achieved. Better performance than Enet and VAR |
|
16 |
[27] |
To propose the extended framework for segmentation of breast tumor |
Multichannel MRFs, kinetic observation model, Gaussian mixture model |
DCE MRI images of breast cancer |
AUC of 0.9 has been achieved using multichannel MRF compared to AUC of 0.89 in single-channel MRF. Better segmentation results when applied to SVM |
|
17 |
[28] |
To propose a gene selection method |
LSLS, wrapper method, SVM |
Six datasets available at Kent Ridge Biomedical Data repository |
LSLS performs better than KW and SPFS |
|
18 |
[29] |
To present a novel method classifying tumor samples. |
RPCA, LDA, SVM |
Nine different publicly available datasets (acute leukemia data [2], colon cancer data, gliomas data, medulloblastoma data, prostate cancer data, 11_tumor data, and brain tumor data) |
Performance is measured using LOO-CV, accuracy, and AUC. A feasible and effective method. |
|
19 |
[30] |
To propose a method based on deep learning for inferring target genes expression |
D-GEX |
Microarray GEO dataset, RNA-Seq-based GTEx dataset |
Outperforms linear regression (15.33% relative improvement) and KNN. Lower error rate in most of the genes (81.31%). |
|
20 |
[31] |
To develop a fused network identifying KIRC stages |
Gene expression and DNA methylation data, SNF, SNFTool, sparse partial least square regression, LASSO label prediction method |
The Cancer Genome Atlas KIRC data (TCGA data portal) |
Higher prediction accuracy than KNN, MLW, and WDC. It is robust. |
|
21 |
[32] |
To classify widely and rarely expressed genes |
Incremental feature selection method, mRMR, RNN |
Gene expression dataset available at the Human Protein Atlas [33] |
GO terms and KEGG are used at the functional level. Youden's indexes are 0.739 and 0.639 for normal and cancer tissues, respectively. |
|
22 |
[34] |
To develop a light-weight CNN for classifying breast cancer |
CNN, array-array intensity correlation, R-Studio, batch normalization |
Breast cancer dataset from Pan-Cancer Atlas |
Achieves 98.76% accuracy |
|
23 |
[35] |
To propose a method for classifying different types of cancer. |
BPSO-DT, CNN, deep learning |
Cancer types: RNA sequencing values from tumor samples/tissues available at Mendeley datasets |
It achieves an accuracy of 96.90%. Evaluation metrics include recall, precision, and F1 score. |
|
24 |
[36] |
To propose a method based on NMF to classify tumors |
NMF, SNMF, SVM |
Colon cancer dataset [37], acute leukemia dataset, medulloblastoma dataset |
It is effective and efficient. The effect of sparseness is low. |
|
25 |
[38] |
To propose a model for biclustering data of gene expression. |
PCA, GLPCA, DHPCA |
SRBCT, medulloblastoma, colon cancer, 11_Tumors |
It is compared with PCA, GLPCA, GNMF, ONMTF, and NMTFCoS. It provides better accuracy than others. |
|
26 |
[39] |
To present a framework for predicting the expression of genes employing nonlinear features |
Unsupervised clustering algorithm, L-GEPM, LSTM neural network |
GEO data from LINCS cloud, GTEx, and 1000G RNA-Seq data |
Performs better than D-GM, LR-L1, and KNN-R. Target genes extracted are much closer to the actual gene expression. Flexible and superior for NL features. |
|
27 |
[40] |
To propose a multilayer framework to classify multitissues of cancer. |
CNN, RNA sequencing, supervised learning, stochastic gradient descent optimization, back-propagation |
11093 samples from the Cancer Genome Atlas |
98.93 percent overall accuracy and 0.99 AUC have been achieved |
|
28 |
[41] |
To propose a gene selection method that can classify tissues in multicategory datasets |
PLS, linear support vector classifier, MATLAB, OSU_SVM3.00 toolbox linear SVC, SVM |
MIT AML and ALL dataset, SRBCT datasets |
It is efficient and robust. It works well for both two-category and multicategory datasets. |
|
29 |
[42] |
To propose an ST model for finding the effects of CNAs |
LST and NA, dynamic modeling, transcriptional bursting, transcriptional oscillation, circular binary segmentation |
NCBI/GEO database |
It shows the use of mathematical theory to investigate the findings and to better understand cancer biology |
|
30 |
[43] |
To propose a multi-fusion-based method for profiling gene expression under nonthermal plasma treatment. |
Dempster–Shafer method, fuzzy C-Means clustering method, MATLAB R2016b |
NCBI Gene Expression Omnibus under GEO (GSE59997) |
Reduces uncertainty and increases reliability. Fuzzy C-means clustering identifies changes in genes under various nonthermal plasma treatments. |
|
31 |
[44] |
To present a survey of 1D CNN and its applications. |
NA |
NA |
1D CNN works well with small data and where fewer computations are required. It also works where low-cost implementation is needed. |
|
32 |
[45] |
To propose a classification method for ECG signal images based on 2D CNN. |
CNN, Intel i7-5930K CPU, and NVIDIA GTX1080 GPU |
MIT-BIH Arrhythmia database |
2D CNN outperforms 1D CNN. 2D CNN is more accurate and robust. 1D CNN works well with limited data. |