2020 Dec 15;11:593336. doi: 10.3389/fpsyt.2020.593336

Table 2.

Mean absolute error (standard error) for each model and each fold (first challenge).

Individual algorithms: BLUP-mean through Inception V1. Ensemble learning: LM, RF, Mean, Median.

| Fold | BLUP-mean | BLUP-quantiles | SVM | 6-layer CNN | Age spe. 6-layer CNN | ResNet | Inception V1 | LM | RF | Mean | Median |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Fold 1 | 5.32 (0.19) | 4.90 (0.19) | 5.31 (0.18) | 4.18 (0.16) | 4.01 (0.15) | 4.02 (0.15) | 3.82 (0.14) | 3.46 (0.13)* | 3.62 (0.15) | 3.74 (0.13) | 3.67 (0.14) |
| Fold 2 | 5.05 (0.18) | 4.79 (0.19) | 5.34 (0.18) | 4.47 (0.15) | 4.12 (0.13) | 4.01 (0.14) | 3.97 (0.15) | 3.53 (0.13)* | 3.60 (0.15)* | 3.69 (0.13) | 3.74 (0.13) |
| Fold 3 | 4.90 (0.18) | 4.37 (0.16) | 4.84 (0.17) | 4.41 (0.16) | 4.27 (0.15) | 3.88 (0.14) | 4.00 (0.16) | 3.33 (0.13)* | 3.46 (0.15)* | 3.46 (0.12)* | 3.45 (0.13)* |
| Fold 4 | 5.07 (0.18) | 4.71 (0.18) | 5.06 (0.18) | 4.55 (0.17) | 4.27 (0.16) | 4.11 (0.15) | 3.85 (0.15) | 3.57 (0.13)* | 3.72 (0.14) | 3.68 (0.14) | 3.74 (0.15) |
| Fold 5 | 5.22 (0.19) | 4.69 (0.18) | 5.20 (0.18) | 4.02 (0.16) | 3.89 (0.15) | 3.99 (0.16) | 3.75 (0.15) | 3.34 (0.13)* | 3.51 (0.14) | 3.56 (0.13) | 3.47 (0.13) |
| 5-fold combined MAE | 5.11 | 4.69 | 5.15 | 4.33 | 4.11 | 4.00 | 3.88 | 3.44 | 3.58 | 3.62 | 3.61 |

Fold 1 corresponds to the train-test split used in the Predictive Analytics Competition (PAC) challenge and presented in Table 1. LM (linear model), RF (random forest), mean, and median age scores are the four methods considered for ensemble learning. The standard error [SE = SD/sqrt(N)] reflects the uncertainty around the MAE estimate. A 95% confidence interval may be calculated as MAE ± 1.96 * SE, though it (falsely) assumes normality of the absolute error distribution. For the 5-fold combined MAE, we did not report the SE, as it is notoriously biased downward (54) due to the overlap of the different training/test samples.
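The MAE, its standard error, and the approximate 95% confidence interval described in this note can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `mae_with_se` and the example ages are hypothetical, and the use of the sample standard deviation (N - 1 denominator) is an assumption, since the note does not say which SD estimator was used.

```python
import math
import statistics


def mae_with_se(true_ages, predicted_ages, z=1.96):
    """Return (MAE, SE, (ci_low, ci_high)) for paired lists of ages."""
    abs_errors = [abs(t - p) for t, p in zip(true_ages, predicted_ages)]
    n = len(abs_errors)
    mae = statistics.fmean(abs_errors)
    # SE = SD / sqrt(N); the sample SD (N - 1 denominator) is an assumption here
    se = statistics.stdev(abs_errors) / math.sqrt(n)
    # Normal-approximation interval: MAE +/- z * SE
    return mae, se, (mae - z * se, mae + z * se)


# Made-up ages for illustration only (not data from the study)
true_ages = [20.0, 30.0, 40.0, 50.0]
predicted_ages = [22.0, 29.0, 44.0, 47.0]

mae, se, (ci_low, ci_high) = mae_with_se(true_ages, predicted_ages)
print(f"MAE = {mae:.2f}, SE = {se:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

Applied directly to a reported table entry, the same formula gives, for Inception V1 on Fold 1, 3.82 ± 1.96 × 0.14, or roughly 3.55 to 4.09. As the note cautions, this interval relies on a normality assumption that absolute errors do not satisfy.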

* Indicates a significant reduction of MAE via ensemble learning compared with Inception alone (p < 0.01, assuming five independent tests).
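The mean and median ensemble scores named in the table note can be sketched as a per-subject average or median of the individual algorithms' predicted ages. This is a hedged illustration only: the function `combine_predictions`, the three hypothetical models, and the numbers are assumptions, and the LM and RF ensembles (which require fitting a second-level model) are not shown.

```python
import statistics


def combine_predictions(per_model_predictions, how="mean"):
    """Combine per-model age predictions into one score per subject.

    per_model_predictions: one list per model, each with one predicted
    age per subject, in the same subject order.
    """
    combiner = {"mean": statistics.fmean, "median": statistics.median}[how]
    # zip(*...) regroups the model-wise lists into subject-wise tuples
    return [combiner(scores) for scores in zip(*per_model_predictions)]


# Made-up predictions from three hypothetical models for four subjects
predictions = [
    [22.0, 29.0, 44.0, 47.0],
    [21.0, 33.0, 41.0, 52.0],
    [26.0, 31.0, 38.0, 49.0],
]

mean_scores = combine_predictions(predictions, how="mean")
median_scores = combine_predictions(predictions, how="median")
print(mean_scores)
print(median_scores)
```

The median is less sensitive than the mean to a single model producing an outlying prediction for a given subject, which is one reason both combinations are commonly compared.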