Table 3.
Correlation between the precision of g-annotation predictions for human proteins in an alignment A and the p-value of g in A, computed according to Sect. 2.4
| Thresh | N | Pearson | Pearson p | σ’s | Spearman | Spearman p | σ’s |
|---|---|---|---|---|---|---|---|
|  | 56,470 | 0.371 |  | 94.8 | 0.579 |  | 168.6 |
|  | 49,270 | 0.378 |  | 90.7 | 0.618 |  | 174.3 |
|  | 36,570 | 0.431 |  | 92.2 | 0.681 |  | 177.6 |
|  | 25,871 | 0.449 |  | 80.9 | 0.688 |  | 152.5 |
|  | 16,272 | 0.473 |  | 68.5 | 0.721 |  | 132.8 |
|  | 7,688 | 0.621 |  | 69.4 | 0.827 |  | 128.9 |
|  | 4,500 | 0.711 |  | 67.8 | 0.737 |  | 73.2 |
The “Thresh” column specifies the upper bound on the p-value of g in a particular alignment A below which A’s g-annotation predictions are included in that row; N is the number of (A, g) pairs that result, across all alignments and GO terms with human proteins as targets. The “Pearson” column is the correlation between (a) the fraction of g-annotation predictions that are validated in alignment A and (b) the p-value of g in the alignment A that produced the predictions; the “Pearson p” column is the p-value of that Pearson correlation; and the σ’s column is the number of standard deviations corresponding to the Pearson p. The last three columns repeat the previous three for the Spearman correlation. The correlations are negative because prediction precision increases as the p-value decreases, as expected. (Note: the Pearson and Spearman p’s technically decrease in significance as N decreases, though they remain highly significant throughout.)
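
The text does not say how the "number of standard deviations" columns were obtained. As a rough sanity check only, the sketch below assumes they are close to the usual t-statistic of a correlation coefficient, t = |r| · sqrt((N − 2) / (1 − r²)), which is nearly a standard normal deviate when N is large. This is an assumption, not a method confirmed by the source. The helper names `t_statistic` and `compare` are hypothetical, and the inputs are the rounded magnitudes printed in Table 3.

```python
import math


def t_statistic(r: float, n: int) -> float:
    """t-statistic of a correlation r observed on n pairs (sign ignored)."""
    return abs(r) * math.sqrt((n - 2) / (1.0 - r * r))


# (N, Pearson, tabulated sigmas, Spearman, tabulated sigmas), copied from Table 3.
ROWS = [
    (56470, 0.371, 94.8, 0.579, 168.6),
    (49270, 0.378, 90.7, 0.618, 174.3),
    (36570, 0.431, 92.2, 0.681, 177.6),
    (25871, 0.449, 80.9, 0.688, 152.5),
    (16272, 0.473, 68.5, 0.721, 132.8),
    (7688, 0.621, 69.4, 0.827, 128.9),
    (4500, 0.711, 67.8, 0.737, 73.2),
]


def compare(rows=ROWS):
    """Pair each tabulated sigma value with the t-statistic estimate (an assumption)."""
    result = []
    for n, pearson, pearson_sigma, spearman, spearman_sigma in rows:
        result.append(
            (n, t_statistic(pearson, n), pearson_sigma,
             t_statistic(spearman, n), spearman_sigma)
        )
    return result


if __name__ == "__main__":
    for n, tp, sp, ts, ss in compare():
        print(f"N={n:6d}  Pearson t={tp:6.1f} (table {sp})  "
              f"Spearman t={ts:6.1f} (table {ss})")
```

Under this assumption the estimates agree with the tabulated values to within about one unit on every row. The formula also shows why significance falls as N decreases: for a fixed correlation, the statistic scales with sqrt(N − 2).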