2024 Feb 26;10:e1896. doi: 10.7717/peerj-cs.1896

Table 7. Regression results for the prediction of the duration of diabetes type 1 on the Takashi2019 dataset.

Performance of the models learned with each method, evaluated with each metric and expressed in the format “average value ± standard deviation” over 1,000 executions. At the beginning of each execution we randomly shuffled the dataset instances; 70% of the randomly chosen instances then formed the training set and the remaining 30% formed the test set. We reported in blue and with an asterisk * the top result for each metric. RMSE: root mean square error. MAE: mean absolute error. MSE: mean square error. SMAPE: symmetric mean absolute percentage error. R²: coefficient of determination. RMSE, MAE, MSE: best value 0 and worst value +∞. R²: best value +1 and worst value −∞. SMAPE: best value 0 and worst value 2. We listed the complete formulas of R², RMSE, MSE, MAE, and SMAPE in the Supplemental Information. We ranked the methods by their R² results (in bold).

Method	R²	RMSE	MAE	MSE	SMAPE
Random forests	*0.41 ± 0.05	*5.98 ± 0.27	5.19 ± 0.26	*35.87 ± 3.31	0.22 ± 0.01
XGBoost	0.39 ± 0.14	6.04 ± 0.70	*5.00 ± 0.49	37.08 ± 8.97	*0.21 ± 0.02
Linear regression	0.14 ± 0.47	7.00 ± 1.83	5.52 ± 1.31	52.49 ± 29.27	0.27 ± 0.06
Decision trees	0.05 ± 0.26	7.53 ± 1.07	6.23 ± 0.88	57.98 ± 16.46	0.26 ± 0.03
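
The evaluation protocol described in the caption (repeated random 70%/30% splits, with each metric summarized as mean ± standard deviation) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the single-feature least-squares stand-in model, the synthetic data, and the use of the sample standard deviation are all assumptions made for illustration. The SMAPE variant is likewise assumed; it was chosen only because it is bounded in [0, 2], which matches the range stated in the caption, while the authors' exact formulas are in their Supplemental Information.

```python
import math
import random
import statistics


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, MSE and SMAPE for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R2 = 1 - SS_res / SS_tot; undefined (NaN) when the targets are constant.
    r2 = 1.0 - (mse * n) / ss_tot if ss_tot > 0 else float("nan")
    # Assumed SMAPE variant bounded in [0, 2]; 0/0 terms are counted as 0.
    smape = sum(
        2.0 * abs(t - p) / (abs(t) + abs(p)) if (abs(t) + abs(p)) > 0 else 0.0
        for t, p in zip(y_true, y_pred)
    ) / n
    return {"R2": r2, "RMSE": math.sqrt(mse), "MAE": mae, "MSE": mse, "SMAPE": smape}


def fit_simple_linear(xs, ys):
    """Least-squares line y = a + b*x (hypothetical stand-in for any regressor)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def repeated_holdout(xs, ys, n_runs=1000, train_fraction=0.7, seed=0):
    """Shuffle, split 70/30, fit, score; return {metric: (mean, std)} over runs."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_train = int(round(train_fraction * len(indices)))
    runs = []
    for _ in range(n_runs):
        rng.shuffle(indices)  # fresh random shuffle at the start of each execution
        train, test = indices[:n_train], indices[n_train:]
        model = fit_simple_linear([xs[i] for i in train], [ys[i] for i in train])
        y_true = [ys[i] for i in test]
        y_pred = [model(xs[i]) for i in test]
        runs.append(regression_metrics(y_true, y_pred))
    summary = {}
    for name in runs[0]:
        values = [r[name] for r in runs]
        # Sample standard deviation (assumption; requires n_runs >= 2).
        summary[name] = (statistics.mean(values), statistics.stdev(values))
    return summary


if __name__ == "__main__":
    # Synthetic data only; it has no relation to the Takashi2019 dataset.
    data_rng = random.Random(42)
    xs = [data_rng.uniform(0, 10) for _ in range(100)]
    ys = [2.0 * x + 5.0 + data_rng.gauss(0, 1) for x in xs]
    for name, (mean, sd) in repeated_holdout(xs, ys, n_runs=100).items():
        print(f"{name}: {mean:.2f} ± {sd:.2f}")
```

Because the split changes at every execution, the standard deviation reflects how sensitive each method is to the particular training sample. This is consistent with the much larger spread reported in the table for linear regression (R² 0.14 ± 0.47) than for random forests (R² 0.41 ± 0.05).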