. 2024 Jan 2;42(10):1581–1593. doi: 10.1038/s41587-023-02033-x

Extended Data Fig. 8. StablL’s performance with MX knockoffs or random permutations on synthetic data with normal and non-normal distributions compared to Lasso.

Synthetic datasets with different distributions were generated using the Normal to Anything (NORTA) framework, as described in Methods. Sparsity (∣Ŝ∣, upper panels), reliability (false discovery rate, FDR, and Jaccard index, JI; middle panels), and predictive performance (root mean square error, RMSE; lower panels) of StablL (with MX knockoffs, red box plots, or with random permutations, black box plots) and Lasso (grey box plots) are shown as a function of the number of samples (n, x-axis) for synthetic data with a normal distribution (a), zero-inflated normal distribution (b), negative binomial distribution (c), or zero-inflated negative binomial distribution (d). The results are shown for datasets with 25 informative features, with uncorrelated (left panels) or correlated (right panels, intermediate correlation, R ~ 0.5) data, for regression tasks (continuous outcomes). Results obtained for other scenarios, including other SRMs (EN, SGL, and AL), other correlation structures (low, R ~ 0.2; high, R ~ 0.7), and classification tasks, are listed in Table S2. Boxes in box plots indicate the median and interquartile range (IQR), with whiskers indicating 1.5 × IQR.
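
As a rough illustration of the quantities named in this caption, the following minimal sketch draws correlated zero-inflated negative binomial features with a NORTA-style Gaussian copula and computes a false discovery proportion and Jaccard index for a selected feature set. It is not the authors' implementation: all function names and parameter values (`norta_sample`, `negbin_ppf`, `zero_inflated`, `fdp_and_jaccard`, the dispersion, the zero-inflation rate) are hypothetical, and the sketch omits the step of the full NORTA procedure that calibrates the Gaussian correlation so that the transformed features reach a target correlation.

```python
import math
import numpy as np

def normal_cdf(z):
    # Standard normal CDF via the error function (avoids a SciPy dependency).
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def negbin_ppf(r, prob, max_count=1000):
    # Quantile function of a negative binomial, built from a tabulated CDF.
    # pmf(k) = Gamma(k + r) / (k! Gamma(r)) * prob^r * (1 - prob)^k
    k = np.arange(max_count + 1)
    lgamma = np.vectorize(math.lgamma)
    log_pmf = (lgamma(k + r) - lgamma(k + 1) - math.lgamma(r)
               + r * math.log(prob) + k * math.log1p(-prob))
    cdf = np.cumsum(np.exp(log_pmf))
    def ppf(u):
        # Smallest k with CDF(k) >= u, capped at the table size.
        idx = np.searchsorted(cdf, u, side="left")
        return np.minimum(idx, max_count).astype(float)
    return ppf

def zero_inflated(base_ppf, pi_zero):
    # Quantile function of a mixture: point mass at 0 with probability
    # pi_zero, otherwise the (non-negative) base distribution.
    def ppf(u):
        rescaled = np.clip((u - pi_zero) / (1.0 - pi_zero), 0.0, 1.0)
        return np.where(u <= pi_zero, 0.0, base_ppf(rescaled))
    return ppf

def norta_sample(n, p, rho, marginal_ppf, seed=0):
    # 1) draw equicorrelated Gaussians, 2) map to uniforms with the normal
    # CDF, 3) map to the target marginal with its quantile function.
    # Assumption: rho is the Gaussian-copula correlation, not calibrated
    # to the correlation of the transformed features.
    rng = np.random.default_rng(seed)
    cov = np.full((p, p), rho)
    np.fill_diagonal(cov, 1.0)
    z = rng.multivariate_normal(np.zeros(p), cov, size=n)
    u = np.clip(normal_cdf(z), 0.0, 1.0 - 1e-12)
    return marginal_ppf(u)

def fdp_and_jaccard(selected, informative):
    # False discovery proportion (its average over runs estimates the FDR)
    # and Jaccard index between selected and truly informative features.
    selected, informative = set(selected), set(informative)
    fdp = len(selected - informative) / max(len(selected), 1)
    union = selected | informative
    ji = len(selected & informative) / len(union) if union else 1.0
    return fdp, ji

# Illustrative parameter values only (not taken from the paper).
zinb_ppf = zero_inflated(negbin_ppf(r=5.0, prob=0.5), pi_zero=0.3)
X = norta_sample(n=500, p=10, rho=0.5, marginal_ppf=zinb_ppf, seed=1)
fdp, ji = fdp_and_jaccard(selected=[0, 1, 2, 30], informative=[0, 1, 2, 3])
```

In this toy call, one of the four selected features is not informative and three of the five features in the union are shared, which is what the two returned values summarize.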