PLoS Comput Biol. 2013 Dec 12;9(12):e1003382. doi: 10.1371/journal.pcbi.1003382

Table 1. The different datasets constructed and used in this study and their composition.

| Data set | Protein chains | nsSNPs | Description |
|----------|----------------|--------|-------------|
| 1 kG | 19,058 | 106,311 | A data set containing all the 1 kG variants filtered by population. |
| OMIM | 19,058 | 10,151 | A protein sequence based set containing OMIM variants for all reviewed UniProt human proteins. |
| Humsavar | 19,058 | 23,846 | A set based on human disease polymorphisms from UniProt. |
| 3D | 2,139 | 10,628 | A protein 3D structure based set consisting of 1 kG variants for proteins that have a complete structure in the PDB. |
| Monomer | 325 | 1,461 | A subset of the 3D set containing only proteins identified as being monomeric. |
| Model | 2,630 | 13,037 | A set based on human ModBase homology models where sequence coverage and identity are between 90–100%. |