2020 Oct 13;21:452. doi: 10.1186/s12859-020-03759-0

© The Author(s) 2020

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Fig. 1. Macro-molecular binding SAVs. All results were based on the ExAC data from 60 k individuals [5]; SNAP2 [15, 16] predicted effects on molecular protein function, and ProNA2020 [4] predicted residues at ProNA-binding interfaces (binding either other proteins, DNA, or RNA). a Shows the degree to which SAVs (Single Amino acid Variants) are predicted by the method ProNA2020 [4] to lie in ProNA-binding interfaces more or less often than expected by chance (Methods). In particular, common SAVs (observed in > 5% of the population) and rare SAVs (observed in < 1% of the population) were significantly under-represented in ProNA-binding interfaces. The lines below and above the bars for the odds ratios mark the 95% confidence intervals taken from Fisher's exact test computed on the number of SAVs predicted as binding/non-binding in each class (common or rare; note that the error bar for the rare SAVs is so small that it appears as a single horizontal line). b Zooms into the subset of all SAVs predicted as ProNA-binding. The y-axis gives the cumulative percentage of SAVs predicted to be in ProNA-binding interfaces that score above a certain SNAP2-score (x-axis) [15, 16]. This score reflects the strength of predicting SAVs to affect molecular protein function (+ 100: strongest prediction of effect) or to be neutral (− 100: strongest prediction of neutrality). Random (gray line) was based on the average over all 19 possible non-native mutations computed in silico (Methods). Computing Kolmogorov–Smirnov p values between all pairs of lines revealed that the differences between common and all others were extremely significant (common vs. rare: p value < 2.2 × 10^–16 and common vs. random: p value < 2.7 × 10^–15).
The p value between random and rare was not quite significant (p value < 2 × 10^–2, Additional File 1: Table S1). c, d Distinguish distributions between SAVs at residue positions predicted in ProNA-binding interfaces (dubbed binding) and at non-binding positions (dubbed other) for different SNAP2-score thresholds. While c shows the raw distribution, d highlights the cumulative distribution (as in b). The differences between all pairwise curves were statistically significant (Additional File 1: Table S1). For instance, for very reliable effect predictions with SNAP2-scores ≥ 50 (dashed vertical lines), about 40% of all common SAVs were predicted to affect molecular function and to be in a residue predicted or observed to be in an interface binding a large molecule (protein, DNA, or RNA). ProNA2020 [4] uses whatever is available, either a homology-based inference from experimental information or a machine learning prediction.
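
The statistics named in the caption (an odds ratio with Fisher's exact test on binding/non-binding counts, the cumulative percentage of SAVs at or above a SNAP2-score threshold, and the Kolmogorov–Smirnov comparison of two score distributions) can be illustrated with a minimal Python sketch. It is not the authors' code. All function names and the toy numbers are hypothetical, and the confidence interval uses the Woolf (log) approximation as a simple stand-in for the exact interval that the caption attributes to Fisher's exact test.

```python
from bisect import bisect_right
from math import comb, exp, log, sqrt


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def odds_ratio_ci_woolf(a, b, c, d, z=1.96):
    """Approximate 95% CI (Woolf/log method); an assumption, not the exact
    conditional interval reported alongside Fisher's exact test."""
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    centre = log(odds_ratio(a, b, c, d))
    return exp(centre - z * se), exp(centre + z * se)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value: sum the hypergeometric probabilities
    of all tables that are no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def pmf(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = pmf(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    tail = sum(pmf(x) for x in range(lowest, highest + 1)
               if pmf(x) <= p_obs * (1 + 1e-7))
    return min(1.0, tail)


def percent_at_or_above(scores, threshold):
    """Percentage of scores that are >= threshold (one point of a cumulative curve)."""
    return 100.0 * sum(s >= threshold for s in scores) / len(scores)


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    two empirical cumulative distribution functions."""
    xs, ys = sorted(xs), sorted(ys)
    return max(abs(bisect_right(xs, v) / len(xs) - bisect_right(ys, v) / len(ys))
               for v in xs + ys)


if __name__ == "__main__":
    # Hypothetical counts: rows = common / other SAVs, columns = binding / non-binding.
    table = (12, 88, 30, 70)
    print(odds_ratio(*table), odds_ratio_ci_woolf(*table), fisher_exact_two_sided(*table))

    # Hypothetical SNAP2-scores in the range -100..+100.
    common = [-80, -55, -20, 10, 35, 60]
    rare = [-30, 5, 40, 55, 70, 90]
    print(percent_at_or_above(common, 50), ks_statistic(common, rare))
```

An odds ratio below 1 in this layout would correspond to under-representation of the first class in binding interfaces. The sketch computes only the KS statistic, not its p value. For real analyses an established statistics library would be the safer choice.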