2022 Jan 31;54(6):2802–2828. doi: 10.3758/s13428-021-01771-7

Table 2.

Comparison of validation accuracies of the best models trained on combinations of 5 questions, averaged over 10 combinations, using default hyperparameters, with demographics.

Model	AUC Score	AUC Standard Deviation	AUC 95% Confidence Interval	F1 Score	F1 Standard Deviation	F1 95% Confidence Interval
Logistic Regression	87.79%	0.54%	87.58% - 87.97%	87.76%	0.55%	87.55% - 87.95%
Gaussian NB	86.16%	0.41%	86.02% - 86.30%	86.05%	0.42%	85.92% - 86.20%
SVM	87.66%	0.70%	87.40% - 87.93%	87.61%	0.71%	87.35% - 87.89%
MLP	87.93%	0.87%	87.44% - 88.01%	87.89%	0.87%	87.40% - 87.97%
Random Forest	89.75%	0.72%	89.52% - 89.96%	89.67%	0.71%	89.43% - 89.88%
XGBoost	87.82%	0.96%	87.46% - 88.18%	87.80%	0.93%	87.43% - 88.14%
Ensemble	89.82%	0.84%	89.60% - 90.07%	89.78%	0.82%	89.56% - 90.03%