[Preprint]. 2025 Jun 4:rs.3.rs-6589736. [Version 1] doi: 10.21203/rs.3.rs-6589736/v1

Table 3.

Classification performance of various ML algorithms on cold injuries among the elderly (>60 years old) population.

Method	Accuracy (Mean)	Accuracy (Standard Error)	Precision (Mean)	Precision (Standard Error)	Recall (Mean)	Recall (Standard Error)	F1-score (Mean)	F1-score (Standard Error)
Decision Tree	69.282	0.013	30.177	0.028	26.551	0.030	27.960	0.026
Random Forest	76.555	0.010	44.802	0.123	9.681	0.024	15.683	0.037
XGBoost	77.895	0.005	52.886	0.036	18.564	0.022	27.351	0.028
AdaBoost	76.746	0.005	47.656	0.023	26.596	0.025	33.876	0.025
Neural Network	69.665	0.018	29.444	0.037	20.683	0.022	23.617	0.022

Accuracy: Accuracy measures a classification model’s performance by calculating the ratio of correctly predicted instances to total cases. It provides an overall assessment of the model’s ability to classify data correctly.

Precision: Precision is a metric that evaluates how well a classification model identifies positive instances. It represents the ratio of correctly predicted positive cases to the total number of cases predicted as positive. Precision is essential when minimizing false positives is critical.

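Recall: Recall measures how effectively a model captures all the relevant positive cases. It represents the ratio of correctly predicted positive cases to the total number of actual positive cases. Recall is critical when the goal is to minimize the number of missed positive instances (false negatives).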

F1-score: The F1-score assesses a classification model’s performance by combining precision and recall into a single measure, their harmonic mean. It is especially informative for imbalanced datasets, where accuracy alone can look high even when few positive cases are detected.
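
The four definitions above can be illustrated with a minimal Python sketch that computes each metric from confusion-matrix counts, and a mean and standard error over several scores. The function names, the confusion-matrix counts, and the per-run recall values are hypothetical and chosen only for illustration; they are not taken from the study. The source also does not state how the standard errors in the table were obtained, so the sample standard deviation divided by the square root of the number of runs is an assumption here.

```python
import math
import statistics


def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 (as fractions) from
    confusion-matrix counts. Undefined ratios are reported as 0.0."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def mean_and_standard_error(values):
    """Mean and standard error of a list of scores.
    Assumption: SE = sample standard deviation / sqrt(n)."""
    n = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n) if n > 1 else 0.0
    return mean, se


# Hypothetical counts for an imbalanced test set
# (250 positive and 750 negative cases); not from the study.
example = classification_metrics(tp=50, fp=50, fn=200, tn=700)
for name, value in example.items():
    print(f"{name}: {100 * value:.3f}%")

# Hypothetical recall scores from five repeated runs; not from the study.
recall_runs = [0.18, 0.20, 0.22, 0.19, 0.21]
mean_recall, se_recall = mean_and_standard_error(recall_runs)
print(f"recall mean = {mean_recall:.3f}, standard error = {se_recall:.4f}")
```

With these hypothetical counts, accuracy is 75% while recall is only 20%. The same kind of gap appears in the table, where Random Forest reaches 76.555% accuracy with 9.681% recall, which is why precision, recall, and F1-score are reported alongside accuracy.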