The grey distribution shown in the background of all plots represents the distribution of ΔΔG for all single amino acid changes in the 1366 proteins that we analysed. Each plot is also labelled with the median ΔΔG of the subset analysed as well as a range of ΔΔG values that covers 95% of the data in that subset (box plot shows median, quartiles and outliers). (A) Distribution of RaSP ΔΔG values for benign (blue) and pathogenic (tan) variants extracted from the ClinVar database (Landrum et al., 2018). We observe that the median RaSP ΔΔG value is higher for pathogenic variants than for benign variants. (B) Distribution of RaSP ΔΔG values for variants with different allele frequencies (AF) extracted from the gnomAD database (Karczewski et al., 2020) in the ranges (i) AF > 10⁻² (green), (ii) 10⁻² > AF > 10⁻⁴ (orange), and (iii) AF < 10⁻⁴ (purple). We observe a gradual shift in the median RaSP ΔΔG going from common variants (AF > 10⁻²) towards rarer ones (AF < 10⁻⁴).
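
The per-subset labels described in this caption (a median and a range covering 95% of the data) and the allele-frequency binning in panel B can be illustrated with a minimal sketch. This is not the authors' analysis code. The function names are hypothetical, the 95% range is assumed to be the central interval between the 2.5th and 97.5th percentiles, and values falling exactly on a bin boundary are assumed to belong to the intermediate bin, since the caption only states strict inequalities.

```python
import numpy as np


def summarize(values):
    """Return (median, lower, upper) for a set of predicted values.

    Assumption: the "range covering 95% of the data" is taken to be the
    central interval between the 2.5th and 97.5th percentiles.
    """
    values = np.asarray(values, dtype=float)
    lower, median, upper = np.percentile(values, [2.5, 50.0, 97.5])
    return float(median), float(lower), float(upper)


def af_bin(af):
    """Assign an allele frequency to one of the three ranges in panel B.

    Assumption: frequencies exactly equal to 1e-2 or 1e-4 are placed in
    the intermediate bin.
    """
    if af > 1e-2:
        return "common"
    if af < 1e-4:
        return "rare"
    return "intermediate"


def summarize_by_af(values, allele_frequencies):
    """Group values by allele-frequency bin and summarize each group."""
    groups = {}
    for value, af in zip(values, allele_frequencies):
        groups.setdefault(af_bin(af), []).append(value)
    return {name: summarize(group) for name, group in groups.items()}
```

Calling `summarize_by_af(values, allele_frequencies)` with two equal-length sequences returns one `(median, lower, upper)` tuple per non-empty bin, which corresponds to the numbers a plot of this kind would be labelled with.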