Extended Data Fig. 2 | Feature discovery benchmark.

For each synthetic or semi-synthetic dataset (a), we trained a variety of models (b), including neural networks, gradient-boosted machines (GBMs), support vector machines and elastic net regression, as well as a univariate statistical baseline (Pearson correlation). For the machine-learning models, we then used SAGE to generate global Shapley value feature attributions (c), ranked the features by the magnitude of their attributions (d), and compared the ranked list produced by each method with the binary ground-truth importance vector (e). To measure each method's feature discovery quality, we plotted the cumulative number of "true" features recovered at each position in the ranked feature list (f), then summarized this curve by its area under the feature discovery curve (AUFDC). This score was rescaled so that 0 represents random performance and 1 represents perfect performance.
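As an illustration of steps (b)–(d), the following is a minimal sketch of computing global SAGE values and ranking features with the open-source `sage` package (distributed on PyPI as `sage-importance`); the toy synthetic data, the GBM, and the `'mse'` loss are stand-ins for exposition, not the paper's exact datasets or configuration.

```python
import numpy as np
import sage  # pip install sage-importance
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Toy synthetic dataset: 10 features, of which only the first 3 drive the label.
X = rng.normal(size=(1000, 10))
y = X[:, 0] + 2 * X[:, 1] - X[:, 2] + 0.1 * rng.normal(size=1000)

# One of the model families from panel (b): a gradient-boosted machine.
model = GradientBoostingRegressor().fit(X[:800], y[:800])

# Marginal imputer: evaluates the model with held-out features averaged
# over a background sample, as required for Shapley-style attributions.
imputer = sage.MarginalImputer(model.predict, X[:512])

# Permutation-based estimator of global SAGE values under MSE loss (c).
estimator = sage.PermutationEstimator(imputer, 'mse')
sage_values = estimator(X[800:], y[800:])

# Rank features by attribution magnitude, most important first (d).
attributions = np.abs(sage_values.values)
ranking = np.argsort(-attributions)
```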
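For steps (e)–(f), here is a self-contained sketch of one plausible AUFDC convention: a step-shaped discovery curve whose area is rescaled between its expectation under a uniformly random ranking (0) and a perfect ranking (1), matching the rescaling described in the caption. The helper name `rescaled_aufdc` and the exact discretization of the curve are assumptions, since the caption does not spell them out.

```python
import numpy as np

def rescaled_aufdc(attributions, true_mask):
    """Rescaled area under the feature discovery curve (hypothetical helper).

    0 = expected score of a random feature ranking, 1 = perfect ranking.
    """
    attributions = np.abs(np.asarray(attributions, dtype=float))
    true_mask = np.asarray(true_mask, dtype=bool)
    d = attributions.size        # total number of features
    k = int(true_mask.sum())     # number of ground-truth important features

    # (d)-(e): rank features by attribution magnitude, most important first.
    order = np.argsort(-attributions)

    # (f): cumulative count of true features found at each rank.
    discovered = np.cumsum(true_mask[order])

    # Area under the step curve, plus the two reference areas:
    # a random ranking recovers i * k / d true features by rank i in
    # expectation; a perfect ranking puts all k true features on top.
    area = discovered.sum()
    random_area = k * (d + 1) / 2
    perfect_area = np.minimum(np.arange(1, d + 1), k).sum()

    return (area - random_area) / (perfect_area - random_area)

# Hypothetical example: 10 features, 3 truly informative (indices 0, 1, 2).
attr = np.array([0.9, 0.8, 0.6, 0.1, 0.05, 0.04, 0.03, 0.02, 0.01, 0.0])
truth = np.zeros(10, dtype=bool)
truth[:3] = True
print(rescaled_aufdc(attr, truth))  # 1.0: the true features rank highest
```

One advantage of this rescaling is that scores remain comparable across datasets with different numbers of total and ground-truth features.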