2022 Oct 24;36(1):326–338. doi: 10.1007/s10278-022-00724-6

Table 2.

Performance of MMML models and clinical indexes by dataset

Datasets	Models	Accuracy	Sensitivity	Specificity	Recall	Precision	F1-score	AUC
Training
	GBM	0.978	0.929	0.995	0.929	0.985	0.956	0.998
	GLM	0.858	0.714	0.907	0.714	0.725	0.719	0.876
	XGBoost	0.916	0.929	0.912	0.929	0.783	0.850	0.970
	RF	1.000	1.000	1.000	1.000	1.000	1.000	1.000
	DL	1.000	1.000	1.000	1.000	1.000	1.000	1.000
	Stacking	1.000	1.000	1.000	1.000	1.000	1.000	1.000
	Child–Pugh	0.495	0.900	0.356	0.900	0.323	0.475	0.662
	MELD	0.618	0.771	0.566	0.771	0.378	0.507	0.673
	APRI	0.720	0.600	0.761	0.600	0.462	0.522	0.725
	FIB-4	0.644	0.586	0.663	0.586	0.373	0.456	0.645
Validation
	GBM	0.952	0.667	1.000	0.667	1.000	0.800	0.990
	GLM	0.879	0.667	0.912	0.667	0.545	0.600	0.903
	XGBoost	0.970	0.889	0.982	0.889	0.889	0.889	0.994
	RF	0.970	0.889	0.982	0.889	0.889	0.889	0.983
	DL	0.894	0.333	0.982	0.333	0.750	0.462	0.984
	Stacking	0.970	0.889	0.982	0.889	0.889	0.889	0.998
	Child–Pugh	0.697	0.889	0.667	0.889	0.296	0.444	0.815
	MELD	0.909	0.667	0.947	0.667	0.667	0.667	0.856
	APRI	0.773	0.778	0.772	0.778	0.350	0.483	0.791
	FIB-4	0.848	0.556	0.895	0.556	0.455	0.500	0.704
Test
	GBM	0.888	0.690	0.958	0.690	0.853	0.763	0.968
	GLM	0.832	0.595	0.916	0.595	0.714	0.649	0.849
	XGBoost	0.876	0.881	0.874	0.881	0.712	0.787	0.935
	RF	0.919	0.857	0.941	0.857	0.837	0.847	0.976
	DL	0.888	0.690	0.958	0.690	0.853	0.763	0.937
	Stacking	0.932	0.952	0.924	0.952	0.816	0.879	0.975
	Child–Pugh	0.596	0.762	0.538	0.762	0.368	0.496	0.686
	MELD	0.534	0.929	0.395	0.929	0.351	0.510	0.680
	APRI	0.665	0.786	0.622	0.786	0.423	0.550	0.739
	FIB-4	0.770	0.452	0.882	0.452	0.576	0.507	0.703

APRI, AST to Platelet Ratio Index; DL, deep learning; FIB-4, fibrosis 4 score; GBM, gradient boost machine; GLM, general linear model; MELD, model for end-stage liver disease; MMML, multimodal machine learning; RF, random forest; XGBoost, eXtreme gradient boosting
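
The F1-score column can be cross-checked against the Precision and Recall columns, since F1 is the harmonic mean of the two: F1 = 2PR / (P + R). The Sensitivity and Recall columns hold identical values because they are the same quantity. Below is a minimal Python sketch of that check. The helper name `f1_score` and the choice of three rows are illustrative assumptions, not part of the original analysis. Because the table values are already rounded to three decimals, the recomputed figure may differ from the reported one in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from three rows of Table 2.
rows = {
    "Training GBM": (0.985, 0.929, 0.956),
    "Validation DL": (0.750, 0.333, 0.462),
    "Test Stacking": (0.816, 0.952, 0.879),
}

for name, (precision, recall, reported) in rows.items():
    computed = f1_score(precision, recall)
    # Inputs are rounded, so allow a small tolerance when comparing.
    agrees = abs(computed - reported) < 0.002
    print(f"{name}: computed {computed:.3f}, reported {reported:.3f}, agrees={agrees}")
```

The same function can be applied to any other row of the table by passing that row's Precision and Recall values.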