2021 Jul 28;12:4575. doi: 10.1038/s41467-021-24823-0

Table 3.

Predictive performance of the models related to under-sampling and bagging techniques for the number of all heatstrokes^a.

	GAM^b	XGBoost^b	Under-sampling XGBoost model	Hybrid model consisting of GAM and under-sampling XGBoost model
	The number of all heatstrokes
Overall predictive accuracies per city per 12 h
RMSE in training	1.37	1.09	1.48	1.28
RMSE in testing	2.47	3.28	3.48	2.97
Predictive accuracies on days when the number of heatstrokes spiked^c
MAPE per 1-day (%) in training	18.0	11.9	23.81	16.3
MAPE per 1-day (%) in testing	19.7	28.5	13.38	14.8
Total absolute percentage error (%) in training	8.3	5.9	20.79	1.2
Total absolute percentage error (%) in testing	21.9	31.9	6.94	14.2

GAM generalized additive model, XGBoost extreme gradient boosting decision tree, RMSE root-mean-square error, MAPE mean absolute percentage error.

^a Smaller values of RMSE, MAPE, and total absolute percentage error indicate better predictive performance.

^b These GAM and XGBoost models were the same as those in Table 2.

^c MAPE and total absolute percentage error were calculated after observed and predicted values were summed per day (for MAPE) or over the entire period (for total absolute percentage error), using only days in each year when the number of all heatstrokes was at or above the 80th percentile (corresponding to 53.6 in 2015, 57.8 in 2016, 60.6 in 2017, and 89.8 in 2018). MAPE is the mean of the absolute errors divided by the observed values.
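
The metrics in this table can be sketched in code as follows. This is a minimal illustration of the definitions in footnote c, not the authors' implementation. The function names, the linear-interpolation percentile, and the example numbers are all assumptions made for illustration. In particular, the total absolute percentage error is read here as the absolute difference between summed observations and summed predictions, divided by the summed observations. The sketch handles a single year of daily totals, whereas the footnote applies the threshold separately in each year.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error over paired observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def percentile(values, q):
    """Percentile by linear interpolation between order statistics.

    Assumption: the source does not state which percentile method was used.
    """
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def spike_day_errors(daily_observed, daily_predicted, q=80.0):
    """Return (MAPE %, total absolute percentage error %) on spike days.

    Inputs are daily totals for one year. Spike days are days whose observed
    total is at or above the q-th percentile, so observed values are assumed
    to be positive there (no division by zero).
    """
    threshold = percentile(daily_observed, q)
    pairs = [(o, p) for o, p in zip(daily_observed, daily_predicted)
             if o >= threshold]
    # MAPE: mean of per-day absolute errors divided by observed values.
    mape = 100.0 * sum(abs(o - p) / o for o, p in pairs) / len(pairs)
    # Total error: compare the sums over the whole set of spike days.
    total_obs = sum(o for o, _ in pairs)
    total_pred = sum(p for _, p in pairs)
    total_ape = 100.0 * abs(total_obs - total_pred) / total_obs
    return mape, total_ape


# Hypothetical daily totals, not data from the study.
example_observed = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
example_predicted = [12, 18, 33, 35, 55, 58, 75, 70, 81, 110]
example_mape, example_total = spike_day_errors(example_observed, example_predicted)
print(f"MAPE: {example_mape:.2f}%, total APE: {example_total:.2f}%")
```

Because the total error compares sums, over- and under-predictions on different days can cancel. Under this reading, that is one way the total absolute percentage error can fall well below the per-day MAPE.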