Table 1.
Comparative analysis of existing work on hepatitis C virus prediction.
| Reference | Techniques used | Performance metrics | Dataset and number of instances | Outcomes | Challenges |
|---|---|---|---|---|---|
| 45 | RF, SVM, GB | Precision, accuracy, miss rate | Online UCI dataset, 668 instances | RF achieves 89% precision and a 17.2% miss rate | Cannot predict beyond the range of the training data; prone to overfitting |
| 12 | RF, KNN | Precision, accuracy, recall, F-measure, confusion matrix | A laboratory examination dataset with 200 instances | RF achieves 6% better results than KNN and other methods | It performs better for limited data |
| 13 | SVM, DT, GB, LR, NB, KNN, XGB, RF | Accuracy | Online Kaggle dataset with 4462 instances | The DB methods achieve 91.2% accuracy, outperforming other methods | Few data samples were used |
| 14 | RF | Precision | UCI data set with 670 instances | RF achieves 89.6% precision over other ML methods | Data inconsistency issues |
| 15 | RF, SVM | Precision, miss rate | NCA Hospital dataset with 425 instances | RF achieves 87.6% precision | It performs better for limited data |
| 16 | SVM, ANN, KNN | Precision, accuracy, recall | Online UCI dataset with 295 instances | ANN achieves 90.1% precision in training and testing | Limited parameters were considered in the experimental analysis |
| 17 | SVM, RF, DT, BN, NN, NB | Precision, recall, F-measure, detection rate | Online Kaggle dataset with 559 instances | NN performs best, outperforming the other methods by more than 11.6% | It performs better for limited data |
| 18 | Extreme learning machine | Precision, miss rate | Online Kaggle dataset with 550 instances | Better precision than the SVM method | Limited parameters |
| 19 | ANN | Accuracy, miss rate | Lahore Hospital dataset with 289 instances | ANN achieves better accuracy and miss rate | Data inconsistency issues |
| 20 | PSO, GA, REP, DT (C4.5 and CART), ADT, MLR, RT | Precision, recall, accuracy, miss rate | Egypt HCV dataset with 669 instances | GA methods show better classification outcomes | It performs better for limited data |
| 21 | SVM, NB, NN, DT | Accuracy, miss rate | Online UCI dataset with 335 instances | NN achieves better accuracy and miss rate | Limited parameters were considered in the experimental analysis |
| 22 | SVM, simulated annealing (SA) | Sensitivity, specificity, precision, and accuracy | Online Kaggle dataset with 295 instances | SVM achieves better results than existing ML methods | Data inconsistency issues |
| 23 | Binary LR | TPR and accuracy | Online UCI dataset with 269 instances | Binary LR achieves better TPR and accuracy % | Performs better for limited data |
| 24 | ANN, NN, and SVM | Precision and accuracy | Online UCI dataset with 295 instances | ANN achieves better precision | Limited parameters were used |
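
The metrics named in the table can all be derived from a binary confusion matrix. The following is a minimal sketch, not taken from any of the cited studies: the function name `classification_metrics` and the example counts are hypothetical, and "miss rate" is assumed here to mean the false-negative rate (1 − recall). Some works may instead use the term for the overall misclassification rate (1 − accuracy), so the definition used by each reference should be checked.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives, false negatives,
    true negatives (hypothetical counts, not data from the cited studies).
    """
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)  # also called sensitivity or TPR
    return {
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(tn, tn + fp),
        "f_measure": _ratio(2 * precision * recall, precision + recall),
        # Assumption: miss rate = false-negative rate = 1 - recall.
        "miss_rate": _ratio(fn, fn + tp),
    }


# Illustrative counts only.
metrics = classification_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting several of these metrics together matters for the small, often imbalanced datasets listed above, because accuracy alone can hide a high miss rate on the positive class.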