2009 Oct 21;4(10):e7546. doi: 10.1371/journal.pone.0007546

Table 1. Prediction error of GLM (glm.*) and GAM (gam.*) models on training and test data sets.

Model     MSE (Train)  MRD (Train)  MSE (Test)  MRD (Test)
glm.full  0.032        0.277        0.082       0.511
glm.step  0.030        0.278        0.081       0.499
glm.bits  0.032        0.283        0.082       0.497
gam.full  0.030        0.264        0.085       0.534
gam.step  0.030        0.274        0.081       0.498
gam.bits  0.032        0.281        0.082       0.497

Results are shown for models with three sets of predictor variables: 1) full models, which contain all BLAST statistics (full); 2) stepwise models, which contain the BLAST statistics selected by a stepwise AIC variable selection process (step); and 3) bits models, which use only the BLAST bit score (log) as a predictor variable (bits). Models were assessed using both MSE and MRD (lower values are better). On the training set, models with all predictor variables (glm.full and gam.full) fit the data best (MSE = 0.032 and 0.030, and MRD = 0.277 and 0.264, respectively). However, models with more predictor variables do not perform significantly better on the test data than models that have bit score as a single predictor.
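
The comparison described in the legend can be checked directly against the tabulated values. The following is a minimal sketch, not part of the original analysis: the dictionary simply transcribes Table 1, and the helper names (`best_models`, `spread`) and column keys are hypothetical, chosen here for illustration. It compares point estimates only and says nothing about statistical significance.

```python
# Values transcribed from Table 1, in the order:
# (MSE train, MRD train, MSE test, MRD test)
TABLE_1 = {
    "glm.full": (0.032, 0.277, 0.082, 0.511),
    "glm.step": (0.030, 0.278, 0.081, 0.499),
    "glm.bits": (0.032, 0.283, 0.082, 0.497),
    "gam.full": (0.030, 0.264, 0.085, 0.534),
    "gam.step": (0.030, 0.274, 0.081, 0.498),
    "gam.bits": (0.032, 0.281, 0.082, 0.497),
}

# Hypothetical column keys used only by this sketch.
COLUMNS = ("mse_train", "mrd_train", "mse_test", "mrd_test")


def best_models(column):
    """Return the names of all models with the lowest value in a column."""
    idx = COLUMNS.index(column)
    lowest = min(row[idx] for row in TABLE_1.values())
    return sorted(name for name, row in TABLE_1.items() if row[idx] == lowest)


def spread(column):
    """Return the gap between the worst and best value in a column."""
    idx = COLUMNS.index(column)
    values = [row[idx] for row in TABLE_1.values()]
    return round(max(values) - min(values), 3)


if __name__ == "__main__":
    for column in COLUMNS:
        print(column, best_models(column), spread(column))
```

On the test columns, the bits models share the lowest MRD, and the test MSE of all six models lies within 0.004 of each other, which is consistent with the legend's statement that extra predictors bring no clear gain on the test data.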