2015 Oct 26;37(1):28–35. doi: 10.1002/humu.22911

Table 2.

Training and Validation Sets Used by Current Prediction Methods

| Method | Training set (as published): Pathogenic | Training set (as published): Benign | Test set (as published): Pathogenic | Test set (as published): Benign | Non‐overlapping multi‐method benchmark set: Pathogenic | Non‐overlapping multi‐method benchmark set: Benign |
|---|---|---|---|---|---|---|
| In‐frame | | | | | | |
| PROVEAN | Uniprot | Uniprot | HGMD 2011 | 1000G P1 | HGMD 2014.4 | 1000G P3 AA |
| DDIG‐in | HGMD 2012 | 1000G P1 | Uniprot | Uniprot | HGMD 2014.4 | 1000G P3 AA |
| SIFT‐indel | HGMD 2010 | Interspecies | Uniprot | Uniprot | HGMD 2014.4 | 1000G P3 AA |
| CADD | Simulated | Fixed Polymorphisms | ClinVar | ESP6500 | HGMD 2014.4 | 1000G P3 AA |
| VEST‐indel | HGMD 2014.3 | ESP6500 AA | ClinVar | Interspecies | HGMD 2014.4 | 1000G P3 AA |
| Frameshift | | | | | | |
| PROVEAN | N/A | N/A | N/A | N/A | N/A | N/A |
| DDIG‐in | HGMD 2012 | 1000G P1 | HGMD 2012 | Interspecies | HGMD 2014.4 | 1000G P3 AA |
| SIFT‐indel | HGMD 2010 | Interspecies | N/A | N/A | HGMD 2014.4 | 1000G P3 AA |
| CADD | Simulated | Fixed Polymorphisms | ClinVar | ESP6500 | HGMD 2014.4 | 1000G P3 AA |
| VEST‐indel | HGMD 2014.3 | ESP6500 AA | ClinVar | Interspecies | HGMD 2014.4 | 1000G P3 AA |

1000G P1 and 1000G P3 are variants from 1000 Genomes Phase 1 and Phase 3, respectively. Interspecies benign variants were derived from pairwise genome alignments of human with cow, dog, horse, chimp, rhesus macaque, and rat. Uniprot variants were obtained from the UniProtKB/Swiss‐Prot “Human Polymorphisms and Disease Mutations” dataset (Release 2011_09) and annotated as deleterious, neutral, or unknown based on keywords in the provided Uniprot descriptions. AA, African or African American ancestry; N/A, not applicable.