Table 2.
Performance of MMML models and clinical indexes by dataset
| Datasets | Models | Accuracy | Sensitivity | Specificity | Recall | Precision | F1-score | AUC |
|---|---|---|---|---|---|---|---|---|
| Training | | | | | | | | |
| | GBM | 0.978 | 0.929 | 0.995 | 0.929 | 0.985 | 0.956 | 0.998 |
| | GLM | 0.858 | 0.714 | 0.907 | 0.714 | 0.725 | 0.719 | 0.876 |
| | XGBoost | 0.916 | 0.929 | 0.912 | 0.929 | 0.783 | 0.850 | 0.970 |
| | RF | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| | DL | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| | Stacking | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| | Child–Pugh | 0.495 | 0.900 | 0.356 | 0.900 | 0.323 | 0.475 | 0.662 |
| | MELD | 0.618 | 0.771 | 0.566 | 0.771 | 0.378 | 0.507 | 0.673 |
| | APRI | 0.720 | 0.600 | 0.761 | 0.600 | 0.462 | 0.522 | 0.725 |
| | FIB-4 | 0.644 | 0.586 | 0.663 | 0.586 | 0.373 | 0.456 | 0.645 |
| Validation | | | | | | | | |
| | GBM | 0.952 | 0.667 | 1.000 | 0.667 | 1.000 | 0.800 | 0.990 |
| | GLM | 0.879 | 0.667 | 0.912 | 0.667 | 0.545 | 0.600 | 0.903 |
| | XGBoost | 0.970 | 0.889 | 0.982 | 0.889 | 0.889 | 0.889 | 0.994 |
| | RF | 0.970 | 0.889 | 0.982 | 0.889 | 0.889 | 0.889 | 0.983 |
| | DL | 0.894 | 0.333 | 0.982 | 0.333 | 0.750 | 0.462 | 0.984 |
| | Stacking | 0.970 | 0.889 | 0.982 | 0.889 | 0.889 | 0.889 | 0.998 |
| | Child–Pugh | 0.697 | 0.889 | 0.667 | 0.889 | 0.296 | 0.444 | 0.815 |
| | MELD | 0.909 | 0.667 | 0.947 | 0.667 | 0.667 | 0.667 | 0.856 |
| | APRI | 0.773 | 0.778 | 0.772 | 0.778 | 0.350 | 0.483 | 0.791 |
| | FIB-4 | 0.848 | 0.556 | 0.895 | 0.556 | 0.455 | 0.500 | 0.704 |
| Test | | | | | | | | |
| | GBM | 0.888 | 0.690 | 0.958 | 0.690 | 0.853 | 0.763 | 0.968 |
| | GLM | 0.832 | 0.595 | 0.916 | 0.595 | 0.714 | 0.649 | 0.849 |
| | XGBoost | 0.876 | 0.881 | 0.874 | 0.881 | 0.712 | 0.787 | 0.935 |
| | RF | 0.919 | 0.857 | 0.941 | 0.857 | 0.837 | 0.847 | 0.976 |
| | DL | 0.888 | 0.690 | 0.958 | 0.690 | 0.853 | 0.763 | 0.937 |
| | Stacking | 0.932 | 0.952 | 0.924 | 0.952 | 0.816 | 0.879 | 0.975 |
| | Child–Pugh | 0.596 | 0.762 | 0.538 | 0.762 | 0.368 | 0.496 | 0.686 |
| | MELD | 0.534 | 0.929 | 0.395 | 0.929 | 0.351 | 0.510 | 0.680 |
| | APRI | 0.665 | 0.786 | 0.622 | 0.786 | 0.423 | 0.550 | 0.739 |
| | FIB-4 | 0.770 | 0.452 | 0.882 | 0.452 | 0.576 | 0.507 | 0.703 |
APRI AST to platelet ratio index, DL deep learning, FIB-4 fibrosis-4 score, GBM gradient boosting machine, GLM generalized linear model, MELD model for end-stage liver disease, MMML multimodal machine learning, RF random forest, XGBoost eXtreme gradient boosting
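
The F1-score column can be cross-checked against the Precision and Recall columns (Recall carries the same values as Sensitivity throughout this table). The following is a minimal sketch, assuming the standard definition of F1 as the harmonic mean of precision and recall. The function name `f1_score` and the choice of rows are illustrative and not taken from the study; the numeric values are copied from Table 2.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 2; row selection is illustrative.
rows = {
    "GBM / Training": (0.985, 0.929, 0.956),
    "MELD / Validation": (0.667, 0.667, 0.667),
    "Stacking / Test": (0.816, 0.952, 0.879),
}

for name, (precision, recall, reported) in rows.items():
    computed = f1_score(precision, recall)
    print(f"{name}: computed {computed:.3f}, reported {reported:.3f}")
```

Because the table reports values rounded to three decimals, a recomputed F1 may differ from the reported one in the last digit for some rows; the three rows chosen here agree at three decimals.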