Figure 3. Error rates (%) for full data classification.
Error rates (y-axes), where 1.0 corresponds to 100% incorrect classification (i.e., both false positives and false negatives are counted), are shown for the combined data set comprising the initial, geographic and summer 2011 sample sets. Error rates are shown for the two best classification options, multinomial linear regression (Multi) and logitboost (Logit), on the x-axes. Full and test error rates are described in the analysis section of Materials and Methods. Logit. tie right and Logit. tie wrong show the error rates when tied results are counted as right or as wrong, respectively. Logit. iteration shows the error rate when more iterations of the classification scheme are performed than would normally be used (usually equal to the number of data points in the set). The additional iterations increase the error rate because the data are over-fitted: any sample variation outside the limits of the sample set used to derive the classification rules results in an error.
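The tie right/tie wrong convention can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name `error_rate`, the use of `None` to mark a tied result, and the example labels are all hypothetical and chosen only to show how the two conventions bound the error rate.

```python
def error_rate(true_labels, predictions, ties_count_as_right):
    """Return the fraction of misclassified samples (1.0 = 100% incorrect).

    A prediction of None marks a tied result (hypothetical convention).
    Ties are scored as correct or incorrect depending on ties_count_as_right.
    """
    errors = 0
    for truth, predicted in zip(true_labels, predictions):
        if predicted is None:
            # Tied result: counts as an error only under the "tie wrong" convention.
            if not ties_count_as_right:
                errors += 1
        elif predicted != truth:
            errors += 1
    return errors / len(true_labels)


# Made-up labels for illustration; the second sample is a tie.
truth = ["A", "B", "A", "C"]
preds = ["A", None, "B", "C"]

tie_right = error_rate(truth, preds, ties_count_as_right=True)
tie_wrong = error_rate(truth, preds, ties_count_as_right=False)
print(tie_right, tie_wrong)
```

Under these assumptions the two values bracket the error rate: the tie right figure is the most optimistic reading of the tied results and the tie wrong figure the most pessimistic.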
