Table 2.
Comparison of performance among various ML methods used for construction of the binary MOLT model
| Cohort | Model | AUROC (95% CI) | ACC | AUPR | F1 score | Sensitivity | Specificity |
|---|---|---|---|---|---|---|---|
| LG cohort (training) | Random forest | 1.000 (1.000–1.000) | 0.996 | 1.000 | 0.997 | 0.987 | 1.000 |
| | LightGBM | 0.999 (0.999–1.000) | 0.983 | 0.999 | 0.988 | 0.956 | 0.996 |
| | XGBoost | 0.891 (0.878–0.904) | 0.836 | 0.842 | 0.888 | 0.573 | 0.961 |
| | Decision tree | 0.836 (0.820–0.853) | 0.825 | 0.744 | 0.874 | 0.681 | 0.894 |
| | SVM | 0.920 (0.907–0.932) | 0.872 | 0.892 | 0.911 | 0.680 | 0.963 |
| | Logistic regression | 0.811 (0.793–0.829) | 0.787 | 0.707 | 0.855 | 0.501 | 0.923 |
| | KNN | 0.992 (0.990–0.995) | 0.907 | 0.982 | 0.935 | 0.731 | 0.990 |
| LG cohort (internal validation) | Random forest | 0.875 (0.855–0.896) | 0.824 | 0.816 | 0.878 | 0.566 | 0.955 |
| | LightGBM | 0.907 (0.891–0.923) | 0.842 | 0.851 | 0.887 | 0.650 | 0.938 |
| | XGBoost | 0.852 (0.832–0.873) | 0.798 | 0.774 | 0.862 | 0.504 | 0.947 |
| | Decision tree | 0.826 (0.803–0.849) | 0.791 | 0.748 | 0.848 | 0.613 | 0.881 |
| | SVM | 0.817 (0.793–0.842) | 0.772 | 0.719 | 0.841 | 0.500 | 0.910 |
| | Logistic regression | 0.781 (0.754–0.809) | 0.757 | 0.672 | 0.835 | 0.431 | 0.923 |
| | KNN | 0.713 (0.684–0.743) | 0.705 | 0.582 | 0.799 | 0.354 | 0.883 |
| BS cohort (external validation) | Random forest | 0.862 (0.819–0.904) | 0.828 | 0.865 | 0.864 | 0.676 | 0.936 |
| | LightGBM | 0.822 (0.774–0.870) | 0.802 | 0.827 | 0.841 | 0.669 | 0.897 |
| | XGBoost | 0.725 (0.667–0.782) | 0.699 | 0.709 | 0.768 | 0.483 | 0.853 |
| | Decision tree | 0.678 (0.615–0.741) | 0.731 | 0.689 | 0.783 | 0.586 | 0.833 |
| | SVM | 0.856 (0.812–0.900) | 0.808 | 0.844 | 0.828 | 0.834 | 0.789 |
| | Logistic regression | 0.800 (0.754–0.846) | 0.722 | 0.734 | 0.771 | 0.614 | 0.799 |
| | KNN | 0.816 (0.770–0.862) | 0.751 | 0.781 | 0.806 | 0.559 | 0.887 |
Cells set in bold/italic represent the first/second best performance in each cohort.
ACC, accuracy; AUPR, area under the precision-recall curve; AUROC, area under the receiver operating characteristic curve; CI, confidence interval; KNN, k-nearest neighbors; ML, machine learning; MOLT, machine learning of obstructive jaundice based on common laboratory tests; SVM, support vector machine.
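
For readers who want to reproduce indices of this kind on their own data, the following is a minimal sketch of the conventional definitions of accuracy, F1 score, sensitivity, specificity, and AUROC for a binary classifier. It is not the authors' evaluation code: the function names `threshold_metrics` and `auroc`, the 0.5 decision threshold, the choice of class 1 as the positive class, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the study.

```python
from typing import Dict, Sequence


def threshold_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, F1, sensitivity, and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {
        "accuracy": accuracy,
        "f1": f1,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.2, 0.2, 0.1, 0.1, 0.05]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 decision threshold

metrics = threshold_metrics(y_true, y_pred)
auc = auroc(y_true, scores)
print(metrics, auc)
```

AUROC is computed from the continuous scores and does not depend on the threshold, whereas the other four indices change with it. This is one reason a model can combine a high AUROC with low sensitivity at a fixed cut-off.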