Table 1.
Control data set. Average RF distance for each reference set of the synthetic control data set (sequence length of 1000 amino acids, no ASRV). For word-based methods, we show the best-performing word length k for each alphabet A (AA: original amino acids; CE: chemical equivalence classes). The only exception is B-bin with CE: k = 5 is slightly better on this data set, but k = 4 performs better on the other two data sets. Methods are ordered according to their rank sums ∑R. The Friedman test statistic is FR = 4758.1 (P < 10⁻¹⁰). Significant differences are found at or beyond the α = 0.05 level between the following pairs (numbers refer to column “No.”): method 1 versus methods 22–2; method 2 versus methods 22–4; method 3 versus methods 22–5; methods 4 and 5 versus methods 22–6; method 6 versus methods 22–18; method 7 versus methods 22–19; methods 8–19 versus methods 22–20; and methods 20 and 21 versus method 22.

Columns 1–7: reference set of control data.

No. | ∑R | Method | A | k | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 3228.0 | d ML | AA | — | 0.024 | 0.044 | 0.068 | 0.092 | 0.140 | 0.160 | 0.192 |
2 | 4285.0 | d PB-ML | CE | — | 0.044 | 0.068 | 0.090 | 0.148 | 0.266 | 0.356 | 0.518 |
3 | 4483.5 | d PB-SIM | CE | — | 0.040 | 0.084 | 0.096 | 0.154 | 0.276 | 0.388 | 0.556 |
4 | 5374.0 | d PB-ML | AA | — | 0.044 | 0.070 | 0.104 | 0.176 | 0.362 | 0.570 | 0.736 |
5 | 5650.5 | d PB-SIM | AA | — | 0.050 | 0.076 | 0.120 | 0.176 | 0.380 | 0.612 | 0.744 |
6 | 8127.5 | d ACS | CE | — | 0.068 | 0.156 | 0.222 | 0.392 | 0.590 | 0.744 | 0.872 |
7 | 8285.5 | d ACS | AA | — | 0.076 | 0.108 | 0.234 | 0.398 | 0.660 | 0.756 | 0.872 |
8 | 8316.5 | d S | CE | 5 | 0.082 | 0.160 | 0.276 | 0.398 | 0.624 | 0.712 | 0.844 |
9 | 8336.5 | d P | CE | 5 | 0.058 | 0.124 | 0.228 | 0.402 | 0.660 | 0.778 | 0.882 |
10 | 8362.5 | d P | AA | 4 | 0.062 | 0.112 | 0.224 | 0.420 | 0.666 | 0.798 | 0.870 |
11 | 8452.0 | d F | CE | 5 | 0.052 | 0.130 | 0.240 | 0.418 | 0.662 | 0.790 | 0.882 |
12 | 8529.5 | d E | AA | 4 | 0.054 | 0.110 | 0.240 | 0.432 | 0.696 | 0.806 | 0.872 |
13 | 8555.0 | d E | CE | 5 | 0.060 | 0.128 | 0.244 | 0.430 | 0.676 | 0.784 | 0.880 |
14 | 8572.0 | d F | AA | 4 | 0.062 | 0.108 | 0.240 | 0.436 | 0.688 | 0.804 | 0.880 |
15 | 8706.0 | d S | AA | 4 | 0.076 | 0.156 | 0.274 | 0.440 | 0.684 | 0.746 | 0.862 |
16 | 8846.5 | d LZ | CE | — | 0.066 | 0.146 | 0.268 | 0.472 | 0.672 | 0.792 | 0.868 |
17 | 9015.0 | B-bin | AA | 3 | 0.064 | 0.138 | 0.290 | 0.480 | 0.710 | 0.800 | 0.876 |
18 | 9046.0 | d LZ | AA | — | 0.072 | 0.116 | 0.270 | 0.488 | 0.712 | 0.826 | 0.890 |
19 | 9192.5 | B-bin | CE | 4 | 0.080 | 0.138 | 0.300 | 0.506 | 0.686 | 0.792 | 0.900 |
20 | 10,286.0 | d C | AA | 3 | 0.110 | 0.188 | 0.394 | 0.588 | 0.798 | 0.862 | 0.888 |
21 | 10,851.0 | d C | CE | 4 | 0.116 | 0.240 | 0.420 | 0.648 | 0.792 | 0.884 | 0.904 |
22 | 12,599.0 | d W | AA | (1) | 0.494 | 0.564 | 0.688 | 0.700 | 0.836 | 0.868 | 0.892 |
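
The rank sums ∑R and the Friedman statistic reported in the caption can be illustrated with a minimal sketch. It assumes that each test case forms one block, that methods are ranked within a block by RF distance (ties share the average rank), and that the uncorrected Friedman formula is used; the source does not state whether a tie correction was applied. The function names and the toy data are hypothetical and are not taken from the study.

```python
def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman(blocks):
    """blocks[b][m] is the RF distance of method m on test case b.

    Returns (rank_sums, statistic), using the uncorrected Friedman formula
    12 / (n k (k + 1)) * sum(R_m^2) - 3 n (k + 1).
    """
    n = len(blocks)       # number of blocks (test cases)
    k = len(blocks[0])    # number of methods
    rank_sums = [0.0] * k
    for row in blocks:
        for m, r in enumerate(average_ranks(row)):
            rank_sums[m] += r
    stat = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return rank_sums, stat


# Hypothetical toy data: 4 test cases (rows) x 3 methods (columns).
toy = [
    [0.02, 0.04, 0.50],
    [0.04, 0.04, 0.56],
    [0.06, 0.10, 0.70],
    [0.10, 0.14, 0.68],
]
rank_sums, stat = friedman(toy)
print(rank_sums, stat)
```

A smaller rank sum means a method tends to give lower RF distances across blocks, which is the ordering used in the table's “No.” column.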