. 2020 Jun 18;10:194. doi: 10.1038/s41398-020-00867-6

Table 2.

Results of K = 100 cross-validation experiments with the HDRS total score (left columns) and the YMRS total score (right columns) models based on all, mandatory and mood self-assessment items, respectively.

	HDRS total score		YMRS total score
Model	R² (SD) ↑^a	RMSE (SD) ↓^b	R² (SD) ↑^a	RMSE (SD) ↓^b
All self-assessment items
Pooled naïve mean	−0.02 (0.03)	5.99 (0.37)	−0.04 (0.05)	4.18 (0.70)
Pooled Ridge	0.37 (0.10)	4.68 (0.48)	0.02 (0.15)	4.03 (0.60)
Pooled XGBoost	0.44 (0.10)	4.40 (0.41)	−0.04 (0.21)	4.11 (0.53)
Pooled Bayesian	0.36 (0.12)	4.72 (0.51)	0.00 (0.21)	4.04 (0.56)
Separate naïve mean	0.47 (0.11)	4.29 (0.47)	−0.00 (0.33)	4.00 (0.53)
Separate Ridge	0.47 (0.12)	4.30 (0.49)	0.04 (0.30)	3.92 (0.54)
Separate XGBoost	0.27 (0.15)	5.03 (0.49)	−0.38 (0.50)	4.64 (0.45)
Hierarchical Bayesian	0.57 (0.10)	3.85 (0.47)	0.12 (0.31)	3.74 (0.46)
Mandatory self-assessment items
Pooled naïve mean	−0.02 (0.03)	5.94 (0.37)	−0.04 (0.06)	4.25 (0.71)
Pooled Ridge	0.21 (0.07)	5.24 (0.34)	0.01 (0.09)	4.12 (0.65)
Pooled XGBoost	0.37 (0.12)	4.63 (0.39)	−0.06 (0.18)	4.23 (0.57)
Pooled Bayesian	0.21 (0.10)	5.22 (0.37)	0.03 (0.13)	4.08 (0.61)
Separate naïve mean	0.46 (0.16)	4.28 (0.57)	−0.01 (0.30)	4.08 (0.54)
Separate Ridge	0.46 (0.16)	4.29 (0.57)	0.00 (0.29)	4.06 (0.54)
Separate XGBoost	0.25 (0.18)	5.06 (0.54)	−0.34 (0.39)	4.68 (0.42)
Hierarchical Bayesian	0.54 (0.13)	3.94 (0.53)	0.10 (0.27)	3.85 (0.49)
Mood self-assessment item
Pooled naïve mean	−0.02 (0.02)	5.91 (0.41)	−0.05 (0.05)	4.20 (0.77)
Pooled Ridge	0.21 (0.06)	5.19 (0.35)	0.02 (0.07)	4.05 (0.70)
Pooled XGBoost	0.34 (0.11)	4.75 (0.35)	0.01 (0.18)	4.03 (0.54)
Pooled Bayesian	0.20 (0.12)	5.23 (0.45)	0.04 (0.12)	4.00 (0.63)
Separate naïve mean	0.44 (0.15)	4.31 (0.47)	0.02 (0.27)	3.98 (0.59)
Separate Ridge	0.45 (0.15)	4.29 (0.48)	0.03 (0.27)	3.96 (0.59)
Separate XGBoost	0.42 (0.15)	4.42 (0.42)	−0.04 (0.34)	4.05 (0.51)
Hierarchical Bayesian	0.51 (0.14)	4.05 (0.45)	0.16 (0.25)	3.68 (0.54)

The hierarchical Bayesian model achieved the best performance (highest R², lowest RMSE) within every set of self-assessment items and for both rating scales. It predicted the clinical severity ratings with an RMSE of about 4 points or less on the original rating scales (3.85 to 4.05 for the HDRS total score, 3.68 to 3.85 for the YMRS total score). The best HDRS total result was achieved using all self-assessment items, while the best YMRS total result was achieved using only the mood self-assessment item.

Bold values indicate the best results within each set of self-assessment items.

^aCoefficient of determination. Higher is better.

^bRoot Mean Square Error. Lower is better.
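
To make the two metrics concrete, the following is a minimal sketch of how R² and RMSE are conventionally computed on held-out ratings, and why a naïve mean baseline can score a slightly negative R² out of sample, as in the table. The function names and the numbers are hypothetical and chosen for illustration only; they are not the study's code or data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the rating scale."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up total scores for illustration only (not data from the study).
train_scores = [4, 6, 8, 10]
test_scores = [5, 7, 9, 11]

# Naive mean baseline: predict the training mean for every test rating.
train_mean = sum(train_scores) / len(train_scores)
naive_pred = [train_mean] * len(test_scores)

naive_rmse = rmse(test_scores, naive_pred)
naive_r2 = r_squared(test_scores, naive_pred)

print(f"naive mean: R2 = {naive_r2:.2f}, RMSE = {naive_rmse:.2f}")
```

Because the training mean differs from the test-set mean, the baseline does slightly worse than predicting the test mean itself, so R² falls below zero. A perfect predictor would give R² = 1 and RMSE = 0.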