RENOVO algorithm: Comparison between full and minimal models
(A) ROC analysis: ROC curves evaluating the performance of RENOVO-F (blue line) and RENOVO-M (red line). The curves are shown together with their AUROC values to compare the two models. The chi-square test p value for the AUROC difference is also displayed.
(B) Precision-recall curves for RENOVO-F (blue line) and RENOVO-M (red line) to evaluate the precision of the models with respect to the P/LP class. AUCs for the two curves are reported. The chi-square test p value for the AUC difference is also displayed.
(C) Negative precision-recall curves to evaluate precision on the B/LB class: results are depicted in blue for RENOVO-F and in red for RENOVO-M. AUCs are reported for both models.
(D) Distributions of computed PLS for training and test variants for RENOVO-F (left) and RENOVO-M (right). The density plots show a clearly bimodal distribution with a wide separation between the two peaks, suggesting a high degree of confidence in the prediction calls. Vertical lines denote the thresholds used to define RENOVO classes: blue lines delimit the HP benign and IP benign classes, and red lines the HP pathogenic and IP pathogenic classes.
(E) RENOVO results on ClinVar datasets: prioritization results on the training (benign and pathogenic) and test (benign and pathogenic) set for RENOVO-F. Colors follow the classification provided by RENOVO: blue shades for HP and IP benign classes, red shades for HP and IP pathogenic, and gray for LP. Bubble sizes are proportional to the fractions of variants represented.
(F) RENOVO results on ClinVar datasets: prioritization results on the training (benign and pathogenic) and test (benign and pathogenic) set for RENOVO-M. Bubble colors and sizes follow the code described in (E).
(G) Feature importance, as mean SHAP values, for RENOVO-F. To reduce noise, only the top 20 features are shown. The vertical gray line at 0.01 marks the threshold used to retain features in the selection step; gray dots are features below this cutoff.
(H) Feature importance, as mean SHAP values, for RENOVO-M. To reduce noise, only the top 20 features are shown.
(I) ROC curves obtained by RENOVO-M classification (black continuous line) and by other predictive and functional scores.
(L) Precision-recall curves obtained by RENOVO-M classification (black continuous line) and by other predictive and functional scores.
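
As an illustration of how the precision-recall curves in (B) and (C) relate to each other, the following minimal sketch computes precision-recall points from the same scores twice: once with the pathogenic class as the positive class, and once with the benign class as the positive class (the "negative" curve). It is not the authors' code. The function name `pr_points`, the 0/1 label encoding, and the toy scores are assumptions made for this example only, and ties between scores are not handled.

```python
def pr_points(scores, labels, positive):
    """Return (recall, precision) pairs for the class `positive`.

    scores   : predicted pathogenicity scores in [0, 1] (toy values here)
    labels   : 1 = pathogenic, 0 = benign (hypothetical encoding)
    positive : which class to treat as the positive class
    """
    # Rank variants so that the most confident calls for `positive` come
    # first: high scores for pathogenic (1), low scores for benign (0).
    order = sorted(range(len(scores)),
                   key=lambda i: scores[i],
                   reverse=(positive == 1))
    n_pos = sum(1 for y in labels if y == positive)

    tp = fp = 0
    points = []
    for i in order:
        if labels[i] == positive:
            tp += 1
        else:
            fp += 1
        # One point per variant; assumes no tied scores.
        points.append((tp / n_pos, tp / (tp + fp)))
    return points


# Toy data, invented for illustration only (not taken from ClinVar).
scores = [0.95, 0.80, 0.60, 0.40, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 0]

pathogenic_curve = pr_points(scores, labels, positive=1)  # as in panel (B)
benign_curve = pr_points(scores, labels, positive=0)      # as in panel (C)
```

Plotting recall against precision for each list gives the two kinds of curve. The only difference between them is which class counts as a hit and in which direction the scores are ranked.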