2022 Oct 24;36(1):326–338. doi: 10.1007/s10278-022-00724-6

Table 2.

Performance of MMML models and clinical indexes by dataset

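| Dataset | Model | Accuracy | Sensitivity | Specificity | Recall | Precision | F1-score | AUC |
|---|---|---|---|---|---|---|---|---|
| Training | GBM | 0.978 | 0.929 | 0.995 | 0.929 | 0.985 | 0.956 | 0.998 |
| Training | GLM | 0.858 | 0.714 | 0.907 | 0.714 | 0.725 | 0.719 | 0.876 |
| Training | XGBoost | 0.916 | 0.929 | 0.912 | 0.929 | 0.783 | 0.850 | 0.970 |
| Training | RF | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Training | DL | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Training | Stacking | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Training | Child–Pugh | 0.495 | 0.900 | 0.356 | 0.900 | 0.323 | 0.475 | 0.662 |
| Training | MELD | 0.618 | 0.771 | 0.566 | 0.771 | 0.378 | 0.507 | 0.673 |
| Training | APRI | 0.720 | 0.600 | 0.761 | 0.600 | 0.462 | 0.522 | 0.725 |
| Training | FIB-4 | 0.644 | 0.586 | 0.663 | 0.586 | 0.373 | 0.456 | 0.645 |
| Validation | GBM | 0.952 | 0.667 | 1.000 | 0.667 | 1.000 | 0.800 | 0.990 |
| Validation | GLM | 0.879 | 0.667 | 0.912 | 0.667 | 0.545 | 0.600 | 0.903 |
| Validation | XGBoost | 0.970 | 0.889 | 0.982 | 0.889 | 0.889 | 0.889 | 0.994 |
| Validation | RF | 0.970 | 0.889 | 0.982 | 0.889 | 0.889 | 0.889 | 0.983 |
| Validation | DL | 0.894 | 0.333 | 0.982 | 0.333 | 0.750 | 0.462 | 0.984 |
| Validation | Stacking | 0.970 | 0.889 | 0.982 | 0.889 | 0.889 | 0.889 | 0.998 |
| Validation | Child–Pugh | 0.697 | 0.889 | 0.667 | 0.889 | 0.296 | 0.444 | 0.815 |
| Validation | MELD | 0.909 | 0.667 | 0.947 | 0.667 | 0.667 | 0.667 | 0.856 |
| Validation | APRI | 0.773 | 0.778 | 0.772 | 0.778 | 0.350 | 0.483 | 0.791 |
| Validation | FIB-4 | 0.848 | 0.556 | 0.895 | 0.556 | 0.455 | 0.500 | 0.704 |
| Test | GBM | 0.888 | 0.690 | 0.958 | 0.690 | 0.853 | 0.763 | 0.968 |
| Test | GLM | 0.832 | 0.595 | 0.916 | 0.595 | 0.714 | 0.649 | 0.849 |
| Test | XGBoost | 0.876 | 0.881 | 0.874 | 0.881 | 0.712 | 0.787 | 0.935 |
| Test | RF | 0.919 | 0.857 | 0.941 | 0.857 | 0.837 | 0.847 | 0.976 |
| Test | DL | 0.888 | 0.690 | 0.958 | 0.690 | 0.853 | 0.763 | 0.937 |
| Test | Stacking | 0.932 | 0.952 | 0.924 | 0.952 | 0.816 | 0.879 | 0.975 |
| Test | Child–Pugh | 0.596 | 0.762 | 0.538 | 0.762 | 0.368 | 0.496 | 0.686 |
| Test | MELD | 0.534 | 0.929 | 0.395 | 0.929 | 0.351 | 0.510 | 0.680 |
| Test | APRI | 0.665 | 0.786 | 0.622 | 0.786 | 0.423 | 0.550 | 0.739 |
| Test | FIB-4 | 0.770 | 0.452 | 0.882 | 0.452 | 0.576 | 0.507 | 0.703 |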

APRI, AST to Platelet Ratio Index; DL, deep learning; FIB-4, fibrosis 4 score; GBM, gradient boost machine; GLM, general linear model; MELD, model for end-stage liver disease; MMML, multimodal machine learning; RF, random forest; XGBoost, eXtreme gradient boosting
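
Sensitivity and recall are the same quantity, which is why those two columns match in every row. The F1-score column can likewise be cross-checked against the Precision and Recall columns using the standard definition F1 = 2PR / (P + R). The sketch below is only an illustrative consistency check, not part of the study's analysis; the names `f1_score`, `test_rows`, and `max_f1_gap` are hypothetical, and only three Test rows are copied in as examples.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from three Test rows of Table 2.
test_rows = {
    "GBM": (0.853, 0.690, 0.763),
    "RF": (0.837, 0.857, 0.847),
    "Stacking": (0.816, 0.952, 0.879),
}


def max_f1_gap(rows: dict) -> float:
    """Largest absolute difference between recomputed and reported F1."""
    return max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())


if __name__ == "__main__":
    for model, (p, r, reported) in test_rows.items():
        print(f"{model}: recomputed F1 = {f1_score(p, r):.3f}, reported = {reported:.3f}")
```

Because the table reports values to three decimals, small differences between recomputed and reported F1 are expected from rounding alone.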