Table 5. KNN-based sensible data analysis protocol performance on one- and two-color neuroblastoma data sets compared with independent two-color results.
| Neuroblastoma data sets | End point | Modeling parameters | | | | Cross-validation | | | | External validation | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Feature ranking | N | k | Threshold | AUC | MCC | Sensitivity | Specificity | AUC | MCC | Sensitivity | Specificity |
| Two-color (original MAQC-II) | Overall survival | FC & (P<0.05) | 75 | 29 | 0.23 | 0.879 | 0.549 | 0.759 | 0.908 | 0.848 | 0.516 | 0.564 | 0.920 |
| | Event-free survival | FC & (P<0.05) | 135<sup>a</sup> | 30 | 0.11 | 0.891 | 0.594 | 0.910 | 0.788 | 0.839 | 0.577 | 0.819 | 0.764 |
| | Positive control | P & (FC>1.5) | 10 | 30 | 0.17 | 0.981 | 0.943 | 0.966 | 0.980 | 0.991 | 0.973 | 0.993 | 0.980 |
| | Negative control | FC & (P<0.05) | 95<sup>a</sup> | 2 | 0.52 | 0.523 | 0.021 | 0.739 | 0.277 | 0.456 | −0.115 | 0.678 | 0.218 |
| One-color (newly generated) | Overall survival | SAM | 125 | 28 | 0.27 | 0.886 | 0.507 | 0.718 | 0.901 | 0.844 | 0.435 | 0.622 | 0.836 |
| | Event-free survival | SAM | 200 | 26 | 0.17 | 0.893 | 0.617 | 0.898 | 0.813 | 0.830 | 0.591 | 0.825 | 0.768 |
| | Positive control | SAM | 5 | 29 | 0.20 | 0.987 | 0.943 | 0.965 | 0.980 | 1.000 | 1.000 | 1.000 | 1.000 |
| | Negative control | FC & (P<0.05) | 30<sup>a</sup> | 7 | 0.30 | 0.485 | −0.003 | 0.999 | 0.000 | 0.486 | 0.000 | 1.000 | 0.000 |
Abbreviations: AUC, area under the receiver operating characteristic curve; FC, fold change; MAQC-II, MicroArray Quality Control consortium phase II; MCC, Matthews correlation coefficient; SAM, significance analysis of microarrays.
<sup>a</sup>Using the negative control of all filtered features performs slightly better on cross-validation.
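For readers who want to relate the MCC, sensitivity, and specificity columns to one another, the following is a minimal sketch that assumes the standard confusion-matrix definitions of these metrics. The function name `binary_metrics` and all counts are hypothetical illustrations, not data from the study, and returning an MCC of 0 when the denominator vanishes is an assumed convention: the source does not state how that case was handled. The first call mimics a classifier that assigns every sample to the positive class, a pattern consistent with the one-color negative control under external validation (sensitivity 1.000, specificity 0.000, MCC 0.000).

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts.

    Hypothetical helper for illustration; not the authors' code.
    """
    # Sensitivity: fraction of true positives among all actual positives.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives among all actual negatives.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient (standard definition).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts: every sample predicted positive.
print(binary_metrics(tp=40, fp=60, tn=0, fn=0))

# Hypothetical counts: a classifier with some discriminating power.
print(binary_metrics(tp=30, fp=10, tn=50, fn=10))
```

Because MCC uses all four cells of the confusion matrix, it stays near zero for the negative controls even when sensitivity alone looks high.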