Table 2.
Classifier | C4.5 | | NB | |
---|---|---|---|---|
Dataset | EBD (SEM) | FI (SEM) | EBD (SEM) | FI (SEM) |
1 | 100.00% (0.00) | 100.00% (0.00) | 93.33% (0.93) | 93.33% (0.85) |
2 | 86.43% (0.79) | 84.62% (0.77) | 93.03% (1.06) | 92.12% (0.94) |
3 | 78.61% (1.30) | 64.23% (1.72) | 81.53% (1.11) | 81.53% (1.02) |
4 | 88.62% (0.66) | 84.38% (0.67) | 75.43% (0.83) | 72.76% (0.79) |
5 | 59.04% (1.72) | 56.33% (1.93) | 71.19% (0.92) | 69.78% (1.13) |
6 | 96.67% (1.10) | 95.67% (0.82) | 82.32% (1.17) | 80.28% (1.51) |
7 | 94.46% (1.03) | 94.46% (1.03) | 97.32% (0.84) | 97.32% (0.84) |
8 | 60.00% (2.08) | 50.00% (2.03) | 72.33% (1.42) | 70.82% (1.49) |
9 | 83.61% (1.28) | 81.29% (0.97) | 91.94% (0.91) | 93.67% (0.72) |
10 | 68.00% (1.98) | 66.54% (1.21) | 76.00% (1.65) | 71.76% (1.32) |
11 | 77.67% (1.30) | 72.44% (0.91) | 75.53% (1.33) | 73.81% (1.11) |
12 | 55.83% (2.14) | 59.58% (2.12) | 63.33% (1.81) | 61.67% (1.84) |
13 | 58.92% (0.86) | 57.14% (0.96) | 50.36% (0.84) | 49.32% (0.88) |
14 | 58.75% (0.91) | 62.33% (1.01) | 58.33% (1.04) | 57.65% (1.09) |
15 | 54.94% (0.72) | 54.20% (0.74) | 55.34% (1.70) | 53.86% (1.07) |
16 | 72.43% (1.32) | 71.25% (1.45) | 86.22% (1.41) | 85.45% (1.22) |
17 | 70.06% (0.94) | 68.96% (1.17) | 82.81% (0.79) | 81.78% (1.42) |
18 | 81.21% (0.58) | 83.78% (0.68) | 83.76% (0.91) | 89.76% (0.75) |
19 | 74.12% (1.32) | 72.22% (1.21) | 85.12% (1.09) | 84.19% (1.31) |
20 | 59.45% (2.08) | 59.45% (2.08) | 100.00% (0.00) | 100.00% (0.00) |
21 | 62.32% (1.54) | 65.24% (1.43) | 78.23% (0.59) | 76.23% (0.54) |
22 | 73.22% (0.78) | 69.78% (1.21) | 78.23% (0.77) | 77.23% (0.78) |
23 | 73.32% (0.92) | 68.49% (0.98) | 46.22% (0.98) | 48.55% (0.87) |
24 | 76.12% (1.32) | 73.04% (1.72) | 83.32% (1.65) | 80.12% (1.23) |
Average | 73.49% (2.07) | 71.48% (2.12) | 77.55% (2.65) | 76.79% (2.32) |
Accuracies for the EBD and FI discretization methods are obtained by applying the C4.5 and NB classifiers to the discretized variables. For each dataset, the mean accuracy and its standard error of the mean (SEM) are obtained by 10 × 10 cross-validation. For each dataset, the higher accuracy is shown in bold font and equal accuracies are underlined.
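As a rough illustration of how an entry such as "86.43% (0.79)" could be produced, the sketch below computes a mean and its SEM from a list of per-fold accuracies. The caption does not say whether the SEM is taken over all 100 fold results or over the 10 repetition means, so the use of the individual fold accuracies here is an assumption; `mean_and_sem` and `fold_accuracies` are hypothetical names and the fold values are made up for the example.

```python
import math
import statistics


def mean_and_sem(accuracies):
    """Return (mean, SEM) for a sequence of accuracy values.

    SEM is the sample standard deviation divided by sqrt(n).
    Assumption: each value is the accuracy on one cross-validation fold.
    """
    n = len(accuracies)
    if n < 2:
        raise ValueError("need at least two values to estimate the SEM")
    mean = statistics.fmean(accuracies)
    sem = statistics.stdev(accuracies) / math.sqrt(n)
    return mean, sem


# Hypothetical fold accuracies (fractions), not taken from the paper.
fold_accuracies = [0.80, 0.90, 0.85, 0.95]
mean, sem = mean_and_sem(fold_accuracies)
print(f"{100 * mean:.2f}% ({100 * sem:.2f})")

# The "Average" row is the plain mean of a column; these are the
# C4.5 / EBD accuracies for datasets 1-24 as listed in Table 2.
c45_ebd = [
    100.00, 86.43, 78.61, 88.62, 59.04, 96.67, 94.46, 60.00,
    83.61, 68.00, 77.67, 55.83, 58.92, 58.75, 54.94, 72.43,
    70.06, 81.21, 74.12, 59.45, 62.32, 73.22, 73.32, 76.12,
]
column_mean = statistics.fmean(c45_ebd)
print(f"{column_mean:.2f}%")
```

The column mean agrees with the 73.49% reported in the Average row for C4.5 with EBD. The SEM in that row is not recomputed here, since the caption does not describe how it was derived.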