editorial
2016 Oct 5;25(1):2–7. doi: 10.1038/ejhg.2016.129

Table 1. Evaluation of feature subsets (ROC-AUC, 10-fold cross-validation with a linear support vector machine classifier) for discriminating CPDs from SNPs (CPD versus SNPs) and disease-causing mutations from common SNPs (DM versus SNPs).

| Feature subset | CPD-AUC (%) | DM-AUC (%) | Performance reduction for CPD (CPD-AUC versus DM-AUC) |
| --- | --- | --- | --- |
| Genomic MSA | 74.0 | 94.6 | −20.6 |
| Protein MSA (homologous) | 60.4 | 80.7 | −20.2 |
| Local protein structure | 56.5 | 68.5 | −12.0 |
| Regional protein composition | 55.3 | 63.5 | −8.2 |
| Exonic features | 64.7 | 71.0 | −6.4 |
| Annotated functional sites | 50.3 | 55.3 | −5.0 |
| Amino-acid features | 64.5 | 69.1 | −4.6 |
| Random value (control) | 50.0 | 48.7 | 1.3 |

Feature subsets are ranked by performance reduction between the CPD set and the DM set, largest reduction first.
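
The last column and the ranking in Table 1 can be reproduced from the two AUC columns. The following is a minimal sketch, not the authors' code: the names `TABLE_1`, `performance_reduction` and `rank_by_reduction` are hypothetical, and the AUC values are the rounded figures copied from the table. Because the inputs are already rounded to one decimal, the recomputed difference may differ from the published column by 0.1 (here for Protein MSA, −20.3 versus −20.2, and for Exonic features, −6.3 versus −6.4), presumably because the published differences were taken before rounding.

```python
# AUC values (%) copied from Table 1: (feature subset, CPD-AUC, DM-AUC).
# Variable and function names are illustrative, not from the original article.
TABLE_1 = [
    ("Genomic MSA", 74.0, 94.6),
    ("Protein MSA (homologous)", 60.4, 80.7),
    ("Local protein structure", 56.5, 68.5),
    ("Regional protein composition", 55.3, 63.5),
    ("Exonic features", 64.7, 71.0),
    ("Annotated functional sites", 50.3, 55.3),
    ("Amino-acid features", 64.5, 69.1),
    ("Random value (control)", 50.0, 48.7),
]


def performance_reduction(cpd_auc, dm_auc):
    """Return CPD-AUC minus DM-AUC in percentage points, rounded to one decimal."""
    return round(cpd_auc - dm_auc, 1)


def rank_by_reduction(rows):
    """Sort rows so that the largest drop (most negative difference) comes first."""
    scored = [
        (name, cpd, dm, performance_reduction(cpd, dm)) for name, cpd, dm in rows
    ]
    return sorted(scored, key=lambda row: row[3])


if __name__ == "__main__":
    for name, cpd, dm, diff in rank_by_reduction(TABLE_1):
        print(f"{name:30s} {cpd:5.1f} {dm:5.1f} {diff:+6.1f}")
```

Sorting on the recomputed difference yields the same row order as the published table, with the random control last.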