. 2020 Oct 26;117(45):28201–28211. doi: 10.1073/pnas.2002660117

Table 1.

Comparison of the ensemble models built from P3DFi (this work), SIFT (11), PolyPhen2 (HVAR) (9), and CADD (48) scores with the ensemble model built without P3DFi values and with the individual SIFT, PolyPhen2, and CADD scores.

Method	Recall / sensitivity / true positive rate	Selectivity / specificity / true negative rate	Balanced accuracy	MCC	F1 score	Precision	Fallout / false positive rate	Miss rate / false negative rate
Random forest* (P3DFi_{Protein class}, SIFT, PolyPhen2, CADD)	0.74	0.91	0.82	0.54	0.84	0.97	0.09	0.26
Random forest* (P3DFi_DAGS1330, SIFT, PolyPhen2, CADD)	0.72	0.88	0.80	0.50	0.82	0.96	0.12	0.28
Random forest* (SIFT, PolyPhen2, CADD)	0.71	0.89	0.80	0.49	0.82	0.96	0.11	0.29
SIFT (11)	0.84	0.68	0.76	0.48	0.87	0.91	0.32	0.16
PolyPhen2 (9)	0.82	0.75	0.79	0.51	0.87	0.93	0.25	0.18
CADD (48)	0.90	0.58	0.74	0.48	0.89	0.89	0.42	0.10

The best score values are boldfaced. The performances are evaluated on 22,362 variants (17,707 pathogenic and 4,655 benign) from the validation set for which all of the scores were available. The training and test datasets are reported in Datasets S4 and S5, respectively, together with the scores used to develop all models and their outputs.
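The column definitions can be checked against one another with a short calculation. The sketch below is an illustration rather than the authors' evaluation code: it assumes that pathogenic variants are the positive class, and it back-calculates approximate confusion-matrix counts for the first row from the rounded recall (0.74) and specificity (0.91) together with the stated class sizes. The function name `confusion_metrics` and the reconstructed counts are assumptions introduced here; the exact counts are not given in the table.

```python
from math import sqrt


def confusion_metrics(tp, fn, tn, fp):
    """Compute the Table 1 metrics from raw confusion-matrix counts."""
    recall = tp / (tp + fn)          # sensitivity, true positive rate
    specificity = tn / (tn + fp)     # selectivity, true negative rate
    precision = tp / (tp + fp)
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": (recall + specificity) / 2,
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
        "f1": 2 * precision * recall / (precision + recall),
        "precision": precision,
        "fallout": fp / (tn + fp),       # false positive rate = 1 - specificity
        "miss_rate": fn / (tp + fn),     # false negative rate = 1 - recall
    }


# Class sizes stated in the table note.
n_pathogenic, n_benign = 17_707, 4_655

# ASSUMPTION: approximate counts reconstructed from the rounded rates in the
# first row; they are not reported in the source and may be off by rounding.
tp = round(0.74 * n_pathogenic)
tn = round(0.91 * n_benign)
fn = n_pathogenic - tp
fp = n_benign - tn

metrics = confusion_metrics(tp, fn, tn, fp)
for name, value in metrics.items():
    print(f"{name:18s} {value:.2f}")
```

Under these assumptions the derived columns (balanced accuracy, MCC, F1 score, precision) round to the values reported in the first row, which also shows why precision stays high for every method: pathogenic variants outnumber benign ones by almost four to one in this set.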

*The random forest ensemble models were developed using 2,000 decision tree classifiers (see Materials and Methods for details).
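For readers who want to see what an ensemble of 2,000 decision trees over four input scores looks like in practice, here is a minimal sketch. It assumes scikit-learn, which the table does not name as the authors' software, and it uses purely synthetic features standing in for the P3DFi, SIFT, PolyPhen2, and CADD scores. All hyperparameters other than the number of trees are library defaults, not values taken from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# SYNTHETIC data for illustration only: 400 variants, label 1 = pathogenic.
n_variants = 400
labels = rng.integers(0, 2, size=n_variants)

# Four hypothetical score columns (stand-ins for P3DFi, SIFT, PolyPhen2, CADD),
# shifted upward for the positive class so that there is signal to learn.
features = rng.normal(size=(n_variants, 4)) + 0.8 * labels[:, None]

# Hold out the last 100 variants for evaluation.
train_x, test_x = features[:300], features[300:]
train_y, test_y = labels[:300], labels[300:]

# 2,000 trees, matching the ensemble size stated in the footnote.
model = RandomForestClassifier(n_estimators=2000, random_state=0)
model.fit(train_x, train_y)

predicted = model.predict(test_x)
accuracy = float((predicted == test_y).mean())
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The held-out accuracy printed here describes only the synthetic data and says nothing about the performance values reported in the table.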