2024 May 18;23:2220–2229. doi: 10.1016/j.csbj.2024.05.029

Table 2.

Mean squared error (MSE) of the best three-feature and four-feature combinations for the linear, support vector, k-nearest neighbors, and random forest regression models in predicting aggregation rates. All hyperparameters are left at their scikit-learn defaults.

| Regression model | Three-feature combination | MSE | Four-feature combination | MSE |
|---|---|---|---|---|
| Linear | SAP_pos_CDRH3, SCM_pos_CDRL3, SCM_neg_CDRH3 | 0.433 | SAP_pos_CDRH3, SCM_pos_CDRL1, SCM_pos_CDR, SCM_pos_Hv | 0.457 |
| Nearest neighbors (neighbor numbers = 5) | SAP_pos_CDRL3, SCM_pos_CDRH3, SCM_neg_Fv | 0.366 | SCM_pos_CDRH3, SCM_neg_CDRH2, SCM_neg_Hv, SCM_neg_Fv | 0.319 |
| Random forest (max_depth = None) | SAP_pos_Hv, SCM_pos_CDRH3, SCM_neg_Fv | 0.367 | SAP_pos_Hv, SCM_pos_CDRH3, SCM_pos_Lv, SCM_neg_Fv | 0.364 |
| Support vector (C = 1.0, ε = 0.1) | SCM_pos_CDRH3, SCM_neg_CDRH2, SCM_neg_Fv | 0.307 | SCM_pos_CDRH3, SCM_neg_CDRH2, SCM_neg_CDRL2, SCM_neg_Fv | 0.301 |
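
The kind of search summarized in Table 2 can be sketched as an exhaustive scan over feature subsets, scoring each subset by MSE with default scikit-learn regressors. The sketch below is illustrative only. The data are synthetic, the helper `best_feature_subset` is a hypothetical name, and the use of 5-fold cross-validation is an assumption, since the table does not state how the MSE was estimated. The feature names are borrowed from the table purely as column labels, and `random_state=0` is set only to make the example reproducible.

```python
from itertools import combinations

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR


def best_feature_subset(model, X, y, names, k, cv=5):
    """Return (feature_names, mse) for the k-feature subset with the lowest
    cross-validated MSE. Hypothetical helper; cv=5 is an assumption."""
    best_names, best_mse = None, np.inf
    for cols in combinations(range(X.shape[1]), k):
        scores = cross_val_score(
            model, X[:, list(cols)], y, cv=cv, scoring="neg_mean_squared_error"
        )
        mse = -scores.mean()  # scikit-learn reports the negated MSE
        if mse < best_mse:
            best_names, best_mse = tuple(names[c] for c in cols), mse
    return best_names, best_mse


# Synthetic stand-in data: NOT the descriptors or aggregation rates from the paper.
rng = np.random.default_rng(0)
names = ["SAP_pos_Hv", "SCM_pos_CDRH3", "SCM_neg_CDRH2", "SCM_pos_Lv", "SCM_neg_Fv"]
X = rng.normal(size=(60, len(names)))
y = X[:, 1] - 0.5 * X[:, 4] + 0.1 * rng.normal(size=60)

# Default hyperparameters, matching the settings listed in the table.
models = {
    "Linear": LinearRegression(),
    "Nearest neighbors": KNeighborsRegressor(),              # n_neighbors=5
    "Random forest": RandomForestRegressor(random_state=0),  # max_depth=None
    "Support vector": SVR(),                                 # C=1.0, epsilon=0.1
}

results = {}
for label, model in models.items():
    for k in (3, 4):
        results[(label, k)] = best_feature_subset(model, X, y, names, k)
        subset, mse = results[(label, k)]
        print(f"{label:18s} k={k}  MSE={mse:.3f}  {', '.join(subset)}")
```

An exhaustive scan is feasible here because the number of subsets is small; with many candidate descriptors the count of combinations grows quickly, and a greedy or regularized selection may be more practical.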