2018 May 3;13(5):e0196849. doi: 10.1371/journal.pone.0196849

Table 2. Performance on the VEST-indel test set for frameshift variations.

Method                  MCC^b          Sensitivity  Specificity  F-score^c      False positive rate  False discovery rate

Evaluation on the consensus subset^a
ENTPRISE-X              0.626 (0.749)  0.943        0.916        0.620 (0.767)  8.4%                 54%
VEST-indel              0.440 (0.585)  0.914        0.814        0.421 (0.615)  18.6%                73%
DDIG-in                 0.321 (0.441)  0.943        0.663        0.297 (0.439)  33.7%                82%

Evaluation on the full test set
ENTPRISE-X              0.586          0.878        0.912        0.590          8.8%                 55%
Baseline^d              0.323          0.988        0.621        0.294          37.9%                83%
Baseline^e              0.224          0.598        0.775        0.271          22.5%                83%
ENTPRISE-X_1^f          0.570          0.878        0.905        0.574          9.5%                 57%
ENTPRISE-X_2^f          0.555          0.854        0.905        0.562          9.5%                 58%
ENTPRISE-X_10alt^f      0.587±0.006    0.887±0.006  0.910±0.003  0.590±0.006    9.0%±0.3%            55.8%±0.7%
ENTPRISE-X-nolocal      0.481          0.707        0.914        0.509          8.6%                 60%
ENTPRISE-X-nonew        0.099          0.793        0.390        0.168          61.0%                90%
ENTPRISE-X-noratio      0.513          0.890        0.871        0.509          12.9%                64%
ENTPRISE-X-noessential  0.574          0.890        0.903        0.575          9.7%                 58%
ENTPRISE-X-nopathogen   0.543          0.866        0.896        0.546          10.4%                60%
ENTPRISE-X-nodisease    0.368          0.683        0.859        0.396          14.1%                72%
ENTPRISE-X-nointeract   0.586          0.890        0.909        0.588          9.1%                 56%

a For a fair comparison with the other methods, only the consensus mutations of the three methods are evaluated.

b Matthews correlation coefficient. The numbers in parentheses are the maximal possible values.

c 2(precision×recall)/(precision+recall), where precision = (true positive)/(true positive + false positive), recall = (true positive)/(true positive + false negative). Numbers in parentheses are the maximal possible values.
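
The metric definitions in footnotes b and c can be sketched in code. The following is a minimal illustration, not the authors' evaluation script: the function name `confusion_metrics`, the helper `_ratio`, and the convention of returning 0.0 for an undefined ratio are assumptions made here, and the MCC expression is the standard confusion-matrix form of the Matthews correlation coefficient. The counts in the usage lines are hypothetical and are not taken from the test set.

```python
import math


def _ratio(numerator, denominator):
    # Assumed convention for this sketch: an undefined ratio is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def confusion_metrics(tp, fp, tn, fn):
    """Compute the metrics reported in the table from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)   # recall
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    # F-score as defined in footnote c.
    f_score = _ratio(2 * precision * sensitivity, precision + sensitivity)
    # Standard confusion-matrix form of the Matthews correlation coefficient.
    mcc = _ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "mcc": mcc,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
        "false_positive_rate": _ratio(fp, fp + tn),   # equals 1 - specificity
        "false_discovery_rate": _ratio(fp, tp + fp),  # equals 1 - precision
    }


# Hypothetical counts, for illustration only.
metrics = confusion_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions the false positive rate is 1 − specificity and the false discovery rate is 1 − precision, which matches the relationship between the specificity and false positive rate columns of the table (for example, 0.916 and 8.4%).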

d Baseline using only the feature that indicates whether the gene is disease-associated.

e Baseline using only the feature that indicates whether the gene is essential.

f ENTPRISE-X_1 and ENTPRISE-X_2 each use one of the two models trained on one half of the pathogenic data; ENTPRISE-X_10alt is ENTPRISE-X trained on 10 different random partitions of the pathogenic part of the training set.