Table 1. Evaluation results for the four criteria on the benchmark dataset MTBLS79 (the measure selected under each criterion is shown in brackets).
Method | Criterion (a) (PMAD) | Criterion (b) (distribution of P-value) | Criterion (c) (consistency) | Criterion (d) (AUC) |
---|---|---|---|---|
Auto scaling | 0.8360 | Good | 14.6500 | 0.8344 |
Contrast | 0.7797 | Fair | 9.7500 | 0.6250 |
Cubic splines | 0.1393 | Excellent | 13.7500 | 0.8322 |
Cyclic loess | 0.3188 | Good | 15.6500 | 0.8356 |
EigenMS | 0.1799 | Good | 16.4000 | 0.8010 |
Level scaling | 0.2890 | Good | 15.1000 | 0.8345 |
Linear baseline | 0.6035 | Fair | 6.3000 | 0.7072 |
Log-transform | 0.1349 | Good | 14.7500 | 0.8168 |
Mean | 0.3100 | Good | 14.7500 | 0.8213 |
Median | 0.3100 | Good | 14.5500 | 0.8177 |
MSTUS | 0.0064 | Good | 14.3500 | 0.8405 |
Pareto scaling | 0.5320 | Good | 14.9500 | 0.8344 |
Power scaling | 0.1660 | Good | 14.9500 | 0.8314 |
PQN | 0.3260 | Good | 13.7000 | 0.8309 |
Quantile | 0.2989 | Excellent | 13.8000 | 0.8119 |
Range scaling | 0.1573 | Good | 15.3500 | 0.8344 |
Total sum | 2.4336 | Fair | 14.7000 | 0.7538 |
Vast scaling | 2.7200 | Good | 15.0000 | 0.8344 |
VSN | 0.5626 | Excellent | 13.7500 | 0.8373 |
The calculation of these measures is described in the ‘Materials and Methods’ and ‘Supplementary Methods’ sections. Besides the quantitative measures, qualitative ones such as the distribution of P-values were also evaluated, and each was assigned one of three performance levels (Excellent, Good or Fair). Qualitative measures were evaluated by visual inspection; examples illustrating how the three performance levels were assigned are shown in Supplementary Figure S1.