TABLE 1.
Summary of iterative progress on model precision scores.
| Model | Best method | Tuning grid | Best tuned hyperparameters | Validation | Test |
|---|---|---|---|---|---|
| miRNA | PCA and undersampling | Max depth: [2:8] | Max depth: 6 | Accuracy: 86% | Accuracy: 81% |
| | AdaBoost | Number of trees: [1:16] | Number of trees: 12 | F-score: 86% | F-score: 82% |
| | Learning coefficient: Breiman | | | | |
| miRNA and mRNA | PCA and SMOTE | Cost: 10⁻⁴ × (20:150) | Cost: 0.0025 | Accuracy: 81% | Accuracy: 93% |
| | SVM (linear kernel) | | | F-score: 82% | F-score: 92% |
| miRNA, mRNA, and methylation | PCA and SMOTE | Cost: 10⁻⁴ × (20:150) | Cost: 0.0027 | Accuracy: 82% | Accuracy: 93% |
| | SVM (linear kernel) | | | F-score: 83% | F-score: 92% |

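
The cost tuning grid in Table 1 is written in a compact range notation. The short Python sketch below expands it, assuming that "(20:150)" denotes an inclusive integer range with step 1 (an R-style reading that the table does not state explicitly). The name `cost_grid` is illustrative only.

```python
# Assumption: "(20:150)" is an inclusive integer range with step 1,
# so the grid is k * 10^-4 for k = 20, 21, ..., 150.
cost_grid = [k / 10_000 for k in range(20, 151)]

print(len(cost_grid))                # number of candidate cost values
print(cost_grid[0], cost_grid[-1])   # smallest and largest candidates

# Both tuned values reported in Table 1 fall on this grid.
print(0.0025 in cost_grid, 0.0027 in cost_grid)
```

Under this reading, the tuned costs of 0.0025 and 0.0027 sit near the lower end of the searched range (0.002 to 0.015).
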
For the miRNA model, feature selection by importance (the hybrid model) combined with undersampling to address class imbalance is the method to be applied for prediction. For both the “miRNA–mRNA” model and the “miRNA–mRNA–methylation” triple model, principal component analysis for dimensionality reduction and SMOTE for class imbalance were the best methods for increasing the predictive power and stability of the model.
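
To illustrate the idea behind SMOTE, the following is a minimal pure-Python sketch of its core step: creating synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. It is not the implementation used in this study. The function name `smote`, its parameters, and the toy data are hypothetical, and the sketch draws base samples at random rather than looping over every minority sample, which is a simplification.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples
    and their k nearest minority-class neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 4 minority samples versus 10 majority samples.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_size = 10

new_points = smote(minority, majority_size - len(minority), k=3)
balanced_minority = minority + new_points
print(len(balanced_minority))  # minority class now matches the majority size
```

Because each synthetic point lies on a line segment between two existing minority samples, the new samples stay inside the region already occupied by the minority class. In practice, oversampling of this kind is usually applied to the training data only, so that validation and test scores are measured on unmodified samples.
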