Table 3.
Classification performance on protein domain sequences for the CSSS model (1-NN classifier) with a k-mer size of 1 (see Section 3), expressed as the integral of the AUC curve shown in Supplementary Figure S2 (Supplementary Data).
Similarity/distance measure | Classification method: SVM | Classification method: 1-NN |
---|---|---|
SW P value^a | 48.66 | 50.22 |
LZW-BLAST^a | 49.0 | 37.18 |
CSSS | NA | 50.64 |
Since Dataset III contains 54 protein families, the maximum value for the integral of the AUC curve is 54, which corresponds to all 54 protein families being classified without error.
^a Similarity/distance measures presented in Kocsor et al. (2006).
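The upper bound of 54 stated in the table note can be illustrated with a minimal sketch. It assumes that the "AUC curve" plots, for each threshold t in [0, 1], the number of protein families whose AUC is at least t. This definition is an assumption and is not confirmed by the table itself. Under it, the integral equals the sum of the per-family AUC values, so 54 perfectly classified families give 54. The function name `auc_curve_integral` and the input values are hypothetical and are not taken from the paper.

```python
import numpy as np


def auc_curve_integral(per_family_auc, n_grid=10001):
    """Integrate t -> (number of families with AUC >= t) over [0, 1].

    Assumed definition of the "AUC curve"; hypothetical helper, not the
    authors' code.
    """
    aucs = np.asarray(per_family_auc, dtype=float)
    thresholds = np.linspace(0.0, 1.0, n_grid)
    # counts[i] = number of families whose AUC reaches thresholds[i]
    counts = (aucs[None, :] >= thresholds[:, None]).sum(axis=1)
    # Trapezoidal rule, written out explicitly
    widths = np.diff(thresholds)
    return float(np.sum((counts[1:] + counts[:-1]) / 2.0 * widths))


# Illustrative input only: 54 families, each with a perfect AUC of 1.0
perfect = auc_curve_integral(np.ones(54))

# Illustrative input only: three families with made-up AUC values
partial = auc_curve_integral([0.5, 1.0, 0.75])
```

Under this assumed definition, `perfect` reaches the maximum of 54, and `partial` approximates the sum of the three AUC values up to the grid resolution.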