Table 6.
Summary of MeSHOP performance
Scoring method | Mean AUC | AUC standard error | Mean test rank (n = 200) | Overall rank |
---|---|---|---|---|
Cosine distance of term frequency-inverse document frequency | 0.93 | 0.03 | 15.03 | 2 |
Cosine distance of P-values | 0.57 | 0.05 | 87.25 | 16 |
Cosine distance of term fractions | 0.90 | 0.04 | 20.21 | 4 |
Sum of the log of combined P-values | 0.91 | 0.03 | 18.88 | 3 |
Sum of the differences of log P-values | 0.87 | 0.06 | 26.97 | 7 |
L2 of log-p of overlapping terms only | 0.94 | 0.03 | 12.06 | 1 |
L2 of term fractions of overlapping terms only | 0.57 | 0.04 | 86.70 | 15 |
L2 of log of P-values | 0.86 | 0.07 | 28.05 | 10 |
L2 of P-values | 0.86 | 0.07 | 29.62 | 12 |
L2 of term fractions | 0.90 | 0.03 | 20.39 | 5 |
L2 of term frequency | 0.86 | 0.06 | 28.31 | 11 |
Term coverage | 0.87 | 0.06 | 27.14 | 8 |
Term overlap | 0.87 | 0.03 | 26.17 | 6 |
Number of gene MeSH terms | 0.81 | 0.05 | 38.69 | 13 |
Number of disease MeSH terms | 0.86 | 0.06 | 27.87 | 9 |
Gene ID | 0.71 | 0.06 | 58.78 | 14 |
The mean AUC, its standard deviation, and the rank are reported for each MeSHOP score and for the gene and disease baselines, computed over all validation sets and both the GeneRIF and gene2pubmed reference sets.
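
The values in Table 6 suggest that the overall rank is the ordering of the methods by mean test rank, with lower values being better. The following minimal sketch checks that reading against the tabulated numbers. It is an observation drawn from the table, not a procedure stated by the authors, and the names `mean_test_rank` and `overall_rank` are illustrative.

```python
# Mean test rank (n = 200) for each scoring method, copied from Table 6.
mean_test_rank = {
    "Cosine distance of term frequency-inverse document frequency": 15.03,
    "Cosine distance of P-values": 87.25,
    "Cosine distance of term fractions": 20.21,
    "Sum of the log of combined P-values": 18.88,
    "Sum of the differences of log P-values": 26.97,
    "L2 of log-p of overlapping terms only": 12.06,
    "L2 of term fractions of overlapping terms only": 86.70,
    "L2 of log of P-values": 28.05,
    "L2 of P-values": 29.62,
    "L2 of term fractions": 20.39,
    "L2 of term frequency": 28.31,
    "Term coverage": 27.14,
    "Term overlap": 26.17,
    "Number of gene MeSH terms": 38.69,
    "Number of disease MeSH terms": 27.87,
    "Gene ID": 58.78,
}

# Assumption: a lower mean test rank is better, so sort in ascending order
# and number the methods from 1.
ordered = sorted(mean_test_rank, key=mean_test_rank.get)
overall_rank = {method: position for position, method in enumerate(ordered, start=1)}

for method in ordered:
    print(f"{overall_rank[method]:>2}  {mean_test_rank[method]:>6.2f}  {method}")
```

Under this assumption the computed ordering agrees with the "Overall rank" column for all 16 rows. Note that mean AUC alone would not reproduce it, because several methods tie at 0.86 or 0.87.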