2011 Jul 28;12:309. doi: 10.1186/1471-2105-12-309

Table 7.

Statistical comparison of EBD and FI discretization methods

Evaluation Measure                  Method  Mean (SEM)      Difference of Means  Z statistic (p-value)

C4.5 Accuracy                       EBD     73.49% (2.07)   2.01                 2.219
[0%, 100%]                          FI      71.48% (2.12)                        (0.026)

C4.5 AUC                            EBD     73.22% (1.89)   1.07                 2.732
[50%, 100%]                         FI      72.15% (1.77)                        (0.007)

C4.5 Robustness                     EBD     72.55% (2.81)   -0.26                -0.261
[0%, ∞]                             FI      72.81% (2.76)                        (0.794)

NB Accuracy                         EBD     77.55% (2.65)   0.76                 2.080
[0%, 100%]                          FI      76.79% (2.32)                        (0.038)

NB AUC                              EBD     74.83% (1.43)   1.11                 2.711
[50%, 100%]                         FI      73.71% (1.24)                        (0.007)

NB Robustness                       EBD     81.72% (2.92)   -0.68                -0.016
[0%, ∞]                             FI      82.40% (2.59)                        (0.987)

Stability                           EBD     0.74 (0.025)    0.02                 1.972
[0, 1]                              FI      0.72 (0.029)                         (0.049)

Mean # of intervals per predictor   EBD     1.27 (0.074)    0.11                 1.686
[1, n]                              FI      1.16 (0.038)                         (0.092)

In the first column, the range of each measure is given in square brackets, where n is the number of instances in the dataset. In the last column, the top number is the Z statistic and the bottom number, in parentheses, is the corresponding p-value. On all performance measures except the mean number of intervals per predictor, the Z statistic is positive when EBD performs better than FI. Two-tailed p-values of 0.05 or smaller (bold in the published table) indicate that EBD performed statistically significantly better than FI at that level.
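The relationship between the Z statistics and the two-tailed p-values in the last column can be checked with a short sketch. It assumes each Z statistic is referred to a standard normal distribution; the table does not state which test produced the statistics, so this is an illustration rather than the authors' procedure. The function name `two_tailed_p` and the dictionary `z_values` are hypothetical. Under that assumption the computed p-values agree with the tabulated ones to within about 0.001.

```python
import math


def two_tailed_p(z: float) -> float:
    """Two-tailed p-value for a Z statistic, assuming a standard normal reference."""
    # For a standard normal variable, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Z statistics copied from Table 7 (measure -> Z statistic).
z_values = {
    "C4.5 Accuracy": 2.219,
    "C4.5 AUC": 2.732,
    "C4.5 Robustness": -0.261,
    "NB Accuracy": 2.080,
    "NB AUC": 2.711,
    "NB Robustness": -0.016,
    "Stability": 1.972,
    "Mean # of intervals per predictor": 1.686,
}

for measure, z in z_values.items():
    p = two_tailed_p(z)
    # Flag rows that meet the 0.05 threshold named in the table footnote.
    flag = "p <= 0.05" if p <= 0.05 else "p > 0.05"
    print(f"{measure:<35} Z = {z:+.3f}  p = {p:.3f}  ({flag})")
```

Under this assumption, the five rows flagged at the 0.05 level are the two accuracy measures, the two AUC measures, and stability; both robustness measures and the mean number of intervals per predictor are not.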