Table 2.
Likelihood ratio tests for spatial clustering of CDK consensus matches
Protein set | Number | H0 (fs, fw) | H1 (f1s, f1w; f2s, f2w) | Λ | P value |
'Known' | 12 | 6.72, 10.8 | 25.1, 34.3; 2.66, 5.66 | 44.4 | 1.2 × 10⁻⁹ |
'Unbiased positives' | 18 | 2.81, 8.51 | 19.8, 31.6; 1.53, 6.77 | 20.2 | 1.6 × 10⁻⁴ |
'Unbiased negatives' | 173 | 0.67, 6.68 | 2.93, 47.7; 0.65, 6.34 | 5.58 | 0.13 |
'Known,' scrambled | 12 | 0.96, 6.04 | 4.60, 10.2; 0.00, 4.48 | 5.21 | 0.15 |
Comparison of a one-component versus two-component mixture of multivariate geometric distributions in different protein sets. Maximum likelihood parameter estimates (in matches per 1,000 residues) under the two hypotheses are indicated by f. See text for descriptions of parameters. Λ indicates the likelihood ratio test statistic, which is expected to be χ² distributed with three degrees of freedom. P values are computed under that assumption. Seven low-confidence open reading frames were removed from the 'unbiased negatives', although similar results are obtained if they are included. CDK, cyclin-dependent kinase.
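As a rough illustration of how the P values relate to Λ, the sketch below evaluates the upper tail of a χ² distribution with three degrees of freedom, using the closed form erfc(√(x/2)) + √(2x/π)·exp(−x/2). The function name `chi2_sf_3df` and the dictionary of Λ values are illustrative assumptions rather than anything from the original analysis; only the Λ values themselves are taken from the table.

```python
import math


def chi2_sf_3df(stat: float) -> float:
    """Upper-tail probability P(X >= stat) for X ~ chi-square with 3 d.f.

    Hypothetical helper for illustration. Uses the closed form that holds
    for three degrees of freedom:
        erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
    """
    if stat <= 0:
        return 1.0
    half = stat / 2.0
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * stat / math.pi) * math.exp(-half)


# Likelihood ratio statistics copied from Table 2 (rounded as printed there).
lambdas = {
    "Known": 44.4,
    "Unbiased positives": 20.2,
    "Unbiased negatives": 5.58,
    "Known, scrambled": 5.21,
}

for name, lam in lambdas.items():
    print(f"{name}: Lambda = {lam}, P = {chi2_sf_3df(lam):.2g}")
```

Because the Λ values in the table are rounded, P values recomputed this way may differ from the tabulated ones in the last digit.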