Table 1. Specificities of variant interpretation tools.
| | All variants (n = 63,197)a | | | | Variants predicted by all tools (n = 7,268)b | | |
---|---|---|---|---|---|---|---|
Tools | VUSc | Benign | Harmful | Specificity | Benign | Harmful | Specificity |
PON-P2d | 21,373 | 34,529 | 1,626 | 0.955 | 6,655 | 613 | 0.916 |
VESTd,e | 1,168 | 22,614 | 4,480 | 0.835 | 5,984 | 1,284 | 0.823 |
FATHMMd | 5,531 | 43,005 | 6,766 | 0.864 | 6,287 | 981 | 0.865 |
PROVEAN | 3,908 | 45,868 | 13,421 | 0.774 | 5,712 | 1,556 | 0.786 |
PPH2d,f | 6,386 | 37,124 | 13,602 | 0.732 | 5,404 | 1,864 | 0.744 |
LRT | 19,333 | 31,736 | 12,128 | 0.724 | 5,465 | 1,803 | 0.752 |
MA | 8,044 | 39,493 | 15,660 | 0.716 | 5,306 | 1,962 | 0.730 |
CADDg | 0 | 40,659 | 22,538 | 0.643 | 4,539 | 2,729 | 0.625 |
SIFT | 5,099 | 36,808 | 21,290 | 0.634 | 4,868 | 2,400 | 0.670 |
MT2h | 15,313 | 30,632 | 17,252 | 0.640 | 4,764 | 2,504 | 0.655 |
aAll variants with AF ≥ 1% and < 25% in at least one population that were not present in the training dataset of the method. After excluding cases in the training datasets, the total number of variants was 57,528 for PON-P2, 28,262 for VEST, 55,302 for FATHMM, and 57,112 for PPH2.
bVariants classified as benign or harmful. Variants present in the training dataset of any tool were excluded. All variants that were automatically annotated without a prediction being made were also excluded.
cVariants for which the predictions were not available, were ambiguous, or were predicted to have unknown significance.
dVariants present in the training datasets were excluded.
eVariants were not classified into benign and harmful by the program. A cutoff of 0.5 was used: variants with a score greater than or equal to 0.5 were classified as harmful and all others as benign.
fThe HumVar version of PolyPhen-2 was used because it performed better than the HumDiv version.
gVariants were not classified into benign and harmful by the program. A cutoff of 20 was used: variants with a score greater than or equal to 20 were classified as harmful and all others as benign. The authors of the tool recommend a cutoff between 10 and 20; the highest value was used to obtain the highest possible specificity.
hVariants that were automatically detected to be harmful or benign were not counted as classified cases because they are not actual predictions by the tool but annotations based on known data.
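
The specificity values in Table 1 can be reproduced from the benign and harmful counts. The sketch below is illustrative only and is not the authors' code. It assumes that every variant in the dataset is treated as truly benign, so a benign prediction is a true negative and a harmful prediction is a false positive, and that VUS are left out of the denominator; the tabulated values are consistent with this reading. The function names `specificity` and `classify` are hypothetical, and `classify` mirrors the score cutoffs described in footnotes e and g.

```python
def specificity(benign: int, harmful: int) -> float:
    """Share of classified variants predicted benign: TN / (TN + FP).

    Assumption: all variants are truly benign, and VUS are excluded.
    """
    classified = benign + harmful
    if classified == 0:
        raise ValueError("no classified variants")
    return benign / classified


def classify(score: float, cutoff: float) -> str:
    """Binarize a raw score: at or above the cutoff is harmful, else benign."""
    return "harmful" if score >= cutoff else "benign"


# (benign, harmful) counts copied from the "All variants" columns of Table 1.
counts = {
    "PON-P2": (34_529, 1_626),
    "CADD": (40_659, 22_538),
    "SIFT": (36_808, 21_290),
}

spec = {tool: round(specificity(b, h), 3) for tool, (b, h) in counts.items()}

for tool, value in spec.items():
    print(f"{tool}: {value:.3f}")
```

Because the cutoff is inclusive, a CADD score of exactly 20 or a VEST score of exactly 0.5 is counted as harmful under this scheme.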