2012 Sep 28;4(9):75. doi: 10.1186/gm376

Table 1.

Performance of gene characteristics at predicting association with disease

| Scoring method | gene2pubmed: Validation (02/2007-01/2009) | gene2pubmed: Validation (02/2007-04/2010) | gene2pubmed: CTD validation (11/2008) | GeneRIF: Validation (02/2007-01/2009) | GeneRIF: Validation (02/2007-04/2010) | GeneRIF: CTD validation (11/2008) |
|---|---|---|---|---|---|---|
| Percentage GC content | 0.50 | 0.50 | 0.51 | 0.50 | 0.50 | 0.51 |
| Number of transcripts | 0.53 | 0.53 | 0.55 | 0.51 | 0.51 | 0.53 |
| Transcript length | 0.51 | 0.52 | 0.50 | 0.52 | 0.52 | 0.53 |
| Genomic length | 0.52 | 0.52 | 0.50 | 0.51 | 0.51 | 0.52 |
| Gene ID | 0.73 | 0.71 | 0.78 | 0.64 | 0.63 | 0.69 |

Characteristics were compared against the 02/2007-11/2008 validation sets using gene2pubmed and GeneRIF gene references, as well as the 11/2008 Comparative Toxicogenomics Database (CTD) validation set. Gene characteristics were extracted from EnsEMBL. We compare the performance of these characteristics at predicting new gene-disease relationships in our validation sets (for the genes with mapped characteristics).
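
As an illustration of how a single gene characteristic can be scored as a predictor, the sketch below ranks genes by one characteristic and measures how well that ranking separates genes that later gained a disease association from those that did not. The table does not name its performance metric; the sketch assumes an area under the ROC curve (AUC), under which 0.5 corresponds to a random ranking. The function name, gene names, and values are hypothetical and are not taken from the paper or from EnsEMBL.

```python
from typing import Dict, Set


def auc_for_characteristic(values: Dict[str, float], positives: Set[str]) -> float:
    """AUC of ranking genes by one characteristic (assumed metric, see lead-in).

    `values` maps gene name -> characteristic value (higher = ranked first).
    `positives` holds the genes with a new disease association in the
    validation set. The AUC is the fraction of (positive, negative) gene
    pairs in which the positive gene has the higher value; ties count half.
    """
    pos = [v for gene, v in values.items() if gene in positives]
    neg = [v for gene, v in values.items() if gene not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up data: GC fraction per gene and the genes newly linked to disease.
gc_content = {"geneA": 0.41, "geneB": 0.55, "geneC": 0.48, "geneD": 0.60}
new_disease_genes = {"geneB", "geneC"}

print(auc_for_characteristic(gc_content, new_disease_genes))
```

The pairwise loop is quadratic in the number of genes, which keeps the definition explicit; for genome-scale gene lists a rank-based computation would be the usual choice.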