Table 7. Regression results for the prediction of the duration of type 1 diabetes on the Takashi2019 dataset.
Performance of the models learned with each method, evaluated with each metric and expressed in the format “average value ± standard deviation” over 1,000 executions. At the beginning of each execution we randomly shuffled the dataset instances, then used a randomly chosen 70% of them as the training set and the remaining 30% as the test set. We reported in blue and with an asterisk (*) the top result for each metric. RMSE: root mean square error. MAE: mean absolute error. MSE: mean square error. SMAPE: symmetric mean absolute percentage error. R²: coefficient of determination. RMSE, MAE, MSE: best value 0 and worst value +∞. R²: best value +1 and worst value −∞. SMAPE: best value 0 and worst value 2. We listed the complete formulas of R², RMSE, MSE, MAE, and SMAPE in the Supplemental Information. We ranked the methods by their R² results (in bold).
| Method | R² | RMSE | MAE | MSE | SMAPE |
|---|---|---|---|---|---|
| Random forests | *0.41 ± 0.05 | *5.98 ± 0.27 | 5.19 ± 0.26 | *35.87 ± 3.31 | 0.22 ± 0.01 |
| XGBoost | 0.39 ± 0.14 | 6.04 ± 0.70 | *5.00 ± 0.49 | 37.08 ± 8.97 | *0.21 ± 0.02 |
| Linear regression | 0.14 ± 0.47 | 7.00 ± 1.83 | 5.52 ± 1.31 | 52.49 ± 29.27 | 0.27 ± 0.06 |
| Decision trees | 0.05 ± 0.26 | 7.53 ± 1.07 | 6.23 ± 0.88 | 57.98 ± 16.46 | 0.26 ± 0.03 |
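
The evaluation protocol summarized in the caption of Table 7 (repeated random 70%/30% splits, five metrics, mean ± standard deviation) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the authors' exact formulas are in the Supplemental Information, so the R² definition and the SMAPE variant used here (the one bounded between 0 and 2) are chosen to match the ranges given in the caption. The function names `regression_metrics`, `repeated_holdout`, and `fit_line` are hypothetical, the one-feature least-squares model merely stands in for the four methods of the table, and the data are synthetic rather than taken from Takashi2019.

```python
import math
import random
import statistics


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, MSE, and SMAPE for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R2 = 1 - SS_res / SS_tot; undefined (NaN) when the targets are constant.
    r2 = 1.0 - (mse * n) / ss_tot if ss_tot > 0 else float("nan")
    # Assumed SMAPE variant, bounded in [0, 2]; a 0/0 term is counted as 0.
    smape = sum(
        abs(e) / ((abs(t) + abs(p)) / 2) if (abs(t) + abs(p)) > 0 else 0.0
        for e, t, p in zip(errors, y_true, y_pred)
    ) / n
    return {"R2": r2, "RMSE": math.sqrt(mse), "MAE": mae, "MSE": mse, "SMAPE": smape}


def fit_line(x_train, y_train):
    """Hypothetical stand-in model: one-feature ordinary least squares."""
    n = len(x_train)
    mean_x, mean_y = sum(x_train) / n, sum(y_train) / n
    sxx = sum((x - mean_x) ** 2 for x in x_train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_train, y_train))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def repeated_holdout(xs, ys, fit, n_runs=1000, train_fraction=0.7, seed=0):
    """Shuffle, split, fit, and score n_runs times; return {metric: (mean, std)}."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_train = int(round(train_fraction * len(indices)))
    runs = []
    for _ in range(n_runs):
        rng.shuffle(indices)  # fresh random shuffle at the start of each execution
        train, test = indices[:n_train], indices[n_train:]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        y_pred = [predict(xs[i]) for i in test]
        runs.append(regression_metrics([ys[i] for i in test], y_pred))
    return {
        name: (statistics.mean([r[name] for r in runs]),
               statistics.stdev([r[name] for r in runs]))
        for name in runs[0]
    }


if __name__ == "__main__":
    # Synthetic data for illustration only (not the Takashi2019 dataset).
    data_rng = random.Random(42)
    xs = [float(i) for i in range(1, 41)]
    ys = [0.5 * x + 10.0 + data_rng.gauss(0.0, 3.0) for x in xs]
    summary = repeated_holdout(xs, ys, fit_line, n_runs=100)
    for name, (mean, std) in summary.items():
        print(f"{name}: {mean:.2f} ± {std:.2f}")
```

Because each metric is averaged separately over the executions, the mean MSE need not equal the square of the mean RMSE, which is consistent with the small differences between those two columns in the table.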