2022 Jun 23:1–20. Online ahead of print. doi: 10.1007/s00146-022-01490-3

Table 3.

Performance metrics of the trained machine learning models.

Source: Author’s own work

Data set	Metric	Decision tree	Random forest	AdaBoost	SVM	Logistic regression	Neural network (one hidden layer)	Neural network (five hidden layers)
Train set	Accuracy	1.0000	0.9999	1.0000	0.8012	0.7929	0.9997	0.9955
	Precision	1.0000	0.9972	1.0000	0.1250	0.1194	0.9902	0.8879
	Recall	1.0000	1.0000	1.0000	0.7720	0.7635	1.0000	0.9986
	F1 score	1.0000	0.9986	1.0000	0.2151	0.2066	0.9958	0.9698
	AUC	1.0000	0.9999	1.0000	0.7871	0.8643	1.0000	0.9998
Validation set	Accuracy	0.9416	0.9622	0.9410	0.7982	0.7869	0.9481	0.9490
	Precision	0.1650	0.4375	0.1553	0.1288	0.1205	0.2391	0.2078
	Recall	0.1405	0.0579	0.1322	0.7686	0.7521	0.1818	0.1322
	F1 score	0.1518	0.1022	0.1429	0.2206	0.2078	0.2066	0.1616
	AUC	0.5565	0.5275	0.5522	0.7840	0.8620	0.7502	0.6464
Test set	Accuracy	0.9358	0.9638	0.9380	0.8001	0.7940	0.9484	0.9515
	Precision	0.1293	0.6923	0.1441	0.1323	0.1267	0.2527	0.2464
	Recall	0.1220	0.0732	0.1301	0.7724	0.7561	0.1870	0.1382
	F1 score	0.1255	0.1324	0.1368	0.2259	0.2170	0.2150	0.1771
	AUC	0.5449	0.5359	0.5499	0.7868	0.8577	0.7252	0.6430

(i) Precision denotes the share of true positives among all predicted positives; recall is the share of true positives among all actual positives; the F1 score is the harmonic mean of precision and recall; the area under the curve (AUC) measures the ability of a classifier to distinguish between classes (on a scale from 0 to 1, with larger values signalling better performance).
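
The definitions in note (i) can be illustrated with a short Python sketch. It is not the author's code: the function names and the confusion-matrix counts are hypothetical, and the AUC helper assumes the curve in question is the ROC curve, computed here through its rank interpretation (the share of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half). The final lines take the rounded SVM test-set precision and recall from Table 3 and recompute the F1 score as their harmonic mean.

```python
def precision(tp: int, fp: int) -> float:
    """Share of true positives among all predicted positives."""
    predicted_positives = tp + fp
    return tp / predicted_positives if predicted_positives else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of true positives among all actual positives."""
    actual_positives = tp + fn
    return tp / actual_positives if actual_positives else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def auc_from_scores(pos_scores, neg_scores) -> float:
    """ROC AUC via pairwise comparison (assumed interpretation of 'AUC').

    Counts the share of (positive, negative) pairs in which the positive
    case has the higher score; ties contribute one half.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts, chosen only for illustration (not from the paper).
tp, fp, fn = 30, 70, 20
p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)

# Hypothetical classifier scores for three positive and three negative cases.
toy_auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])

# Consistency check against Table 3: SVM, test set (rounded inputs).
svm_test_f1 = f1_score(0.1323, 0.7724)

print(f"precision={p:.4f} recall={r:.4f} F1={f1:.4f} AUC={toy_auc:.4f}")
print(f"SVM test F1 recomputed from Table 3: {svm_test_f1:.4f}")
```

Because the harmonic mean is pulled towards the smaller of its two inputs, a model with high recall but low precision (or the reverse) still receives a low F1 score, a pattern visible in the validation and test rows of Table 3.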