Fig. 4. Benchmarking feature selection methods.
a Mean scaled silhouette index (SI) of feature sets ranging from 50 to 4000 features. b Rank distribution of feature selection methods. For each dataset, the methods are ranked from 1 to 9 by their best SI across all feature set sizes. c AUROC of DE gene detection. For all boxes, the middle line represents the median, the lower and upper box limits correspond to the first and third quartiles, and the upper and lower whiskers extend up to 1.5 × IQR from the top and bottom of the box, respectively (where IQR is the interquartile range). Points beyond the whiskers are outliers and are plotted individually. All p-values were calculated using two-sided Student's t-tests (n = 7). Refer to Supp. Table S6 for details regarding the datasets.
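The box-plot convention described in the caption can be illustrated with a minimal sketch. The function name `box_stats` and the sample values are hypothetical, and the quartiles here use linear interpolation (Python's "inclusive" method), which is an assumption: the plotting software used for the figure may compute quartiles slightly differently.

```python
import statistics

def box_stats(values):
    """Summarise values with a median, quartiles, 1.5 * IQR whiskers and outliers."""
    x = sorted(float(v) for v in values)
    # Assumed quartile convention: linear interpolation over the observed range.
    q1, median, q3 = statistics.quantiles(x, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points that lie within the fences.
    inside = [v for v in x if low_fence <= v <= high_fence]
    # Points beyond the fences are reported individually as outliers.
    outliers = [v for v in x if v < low_fence or v > high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }

# Hypothetical values, not data from the figure.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

In this sketch the whiskers stop at the last observed point within 1.5 × IQR of the box rather than at the fence itself, which is the usual reading of "extend up to".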