Figure 3. Evaluating model accuracy and robustness.
Left. In repeated cross-validation, the data are repeatedly split into different training and test sets, and models learned from the training data are applied to the held-out test data. In the example shown, rules about ADCP activity are learned from samples with extreme activity and then applied to samples with intermediate activity. Center. Model accuracy is quantified by the degree of agreement between predicted and observed values, for both classification and regression models. Right. Permutation testing, in which the performance of models learned from randomized data is compared with that of models learned from the actual data, offers a means to gauge model robustness. In this case, models learned from the actual data predict antibody activity with fidelity similar to that with which the experimental measurements themselves can be replicated, and significantly better than models learned from permuted data or from control data, such as data relating to a different pathogen.
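
The logic of the left and right panels can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a single-feature least-squares model, synthetic data, and Pearson correlation between out-of-fold predictions and observations as the accuracy measure. The names `repeated_cv_score` and `permutation_test` are hypothetical and introduced only for illustration.

```python
import random
import statistics


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    if saa == 0 or sbb == 0:
        return 0.0  # undefined correlation is treated as "no agreement"
    return sab / (saa * sbb) ** 0.5


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((u - mx) ** 2 for u in x)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    slope = 0.0 if sxx == 0 else sxy / sxx
    return slope, my - slope * mx


def repeated_cv_score(x, y, k=5, repeats=10, rng=None):
    """Mean correlation between out-of-fold predictions and observations."""
    rng = rng or random.Random(0)
    n = len(x)
    scores = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)  # a different train/test split on every repeat
        preds = [0.0] * n
        for fold in range(k):
            test = idx[fold::k]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            slope, intercept = fit_line([x[i] for i in train],
                                        [y[i] for i in train])
            for i in test:  # predict only samples the model never saw
                preds[i] = slope * x[i] + intercept
        scores.append(pearson(preds, y))
    return statistics.fmean(scores)


def permutation_test(x, y, n_perm=100, seed=0):
    """Compare the score on actual labels with scores on shuffled labels."""
    rng = random.Random(seed)
    actual = repeated_cv_score(x, y, rng=rng)
    null = []
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # break the feature-outcome relationship
        null.append(repeated_cv_score(x, y_perm, rng=rng))
    # Empirical p-value with the usual +1 correction.
    p_value = (1 + sum(s >= actual for s in null)) / (1 + n_perm)
    return actual, null, p_value


# Synthetic stand-in data (an assumption, not measurements from the study).
data_rng = random.Random(42)
x = [data_rng.random() for _ in range(40)]
y = [2.0 * xi + data_rng.gauss(0, 0.2) for xi in x]

actual, null, p_value = permutation_test(x, y)
print(f"actual score: {actual:.2f}")
print(f"mean permuted score: {statistics.fmean(null):.2f}")
print(f"empirical p-value: {p_value:.3f}")
```

A model is considered robust in this sense when its score on the actual data clearly exceeds the distribution of scores obtained from permuted labels. Comparing against control data, such as measurements for a different pathogen, would follow the same pattern, with the control outcomes taking the place of the shuffled labels.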