a) Selected features and model performance, measured as ROC-AUC, in the DMD and Leukemia datasets. As with the UHR dataset, model performance is stable across the GFS parameter settings, while the number of selected genes increases as the thresholds are relaxed. b) Venn diagram of the genes selected under various GFS parameters in the DMD dataset and c) in the Leukemia dataset. d) Distribution of selected genes in DMD samples and healthy controls when = 5% and = 15% as well as = 0% and = 100%, respectively. The green density plot shows the distribution of all genes, while the pink histogram shows the distribution of the selected genes after GFS normalisation. The median expression of the selected genes is higher than that of all genes, indicating that the gene signatures correlated with DMD tend to have relatively high expression levels compared with other genes. e) Distribution of selected genes in ALL and AML subjects when = 5% and = 15% as well as = 0% and = 100%, respectively.