Table 1.
Scoring method | gene2pubmed: Validation (02/2007-01/2009) | gene2pubmed: Validation (02/2007-04/2010) | gene2pubmed: CTD validation (11/2008) | GeneRIF: Validation (02/2007-01/2009) | GeneRIF: Validation (02/2007-04/2010) | GeneRIF: CTD validation (11/2008) |
---|---|---|---|---|---|---|
Percentage GC content | 0.50 | 0.50 | 0.51 | 0.50 | 0.50 | 0.51 |
Number of transcripts | 0.53 | 0.53 | 0.55 | 0.51 | 0.51 | 0.53 |
Transcript length | 0.51 | 0.52 | 0.50 | 0.52 | 0.52 | 0.53 |
Genomic length | 0.52 | 0.52 | 0.50 | 0.51 | 0.51 | 0.52 |
Gene ID | 0.73 | 0.71 | 0.78 | 0.64 | 0.63 | 0.69 |
Gene characteristics were extracted from EnsEMBL. Each characteristic was used as a scoring method and compared against the 02/2007-11/2008 validation sets using gene2pubmed and GeneRIF gene references, as well as the 11/2008 Comparative Toxicogenomics Database (CTD) validation set. The table reports how well each characteristic predicts new gene-disease relationships in these validation sets, restricted to the genes with mapped characteristics.