Table 3. Relationships between parameters of clustering and biological enrichment.
Metric | Parameter | Top Quartile | Bottom Quartile
F | K | 6 |
F | transform | FFT, center |
F | distance | chebychev | cityblock
F | algorithm | Kmeans |
C | K | |
C | transform | FFT, center |
C | distance | euclidean | cityblock
C | algorithm | SOM | Kmeans
P | K | 6, 8 |
P | transform | FFT, center | log10, nMaxLog10, rangeScale
P | distance | |
P | algorithm | AP | Kmeans
PFAM | K | 12, 14 |
PFAM | transform | pow | zscore, rangeScale
PFAM | distance | cityblock |
PFAM | algorithm | Kmeans, SOM |
Motifs | K | 6 | 10
Motifs | transform | log10, normMax | zscore
Motifs | distance | |
Motifs | algorithm | Ncut |
Scansite Bind | K | 6 | 14
Scansite Bind | transform | nMaxLog10 |
Scansite Bind | distance | |
Scansite Bind | algorithm | Kmeans, Ncut |
Scansite Kinase | K | 12 | 6
Scansite Kinase | transform | FFT |
Scansite Kinase | distance | |
Scansite Kinase | algorithm | |
For each biological metric, clustering solutions were ranked by the number of labels enriched in that category, and a parameter value is listed if it is enriched in either the top or the bottom quartile of that ranked list. PELM kinase annotations and Pfam_site did not perform better than random controls and are not included. Scansite Kinase parameter enrichment also did not perform better than random controls, but it is listed here because subsets based on the FFT did perform better than their random control counterparts. See Table 2 for a full description of parameters.
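
The ranking-and-quartile procedure described in this caption can be illustrated with a minimal sketch. The statistical test is an assumption: the caption does not name one, so a one-sided hypergeometric test is used here as a common choice for this kind of enrichment. The function names, the `n_enriched` field, and the toy data are all hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N items without replacement from M items,
    of which n carry the value of interest (hypergeometric upper tail)."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


def quartile_enrichment(solutions, param, top=True):
    """Rank solutions by 'n_enriched' and test each value of `param`
    for over-representation in the top (or bottom) quartile.

    `solutions` is a list of dicts; each dict holds the clustering
    parameters plus an 'n_enriched' count (hypothetical field name).
    Returns {parameter value: p-value}. Ties are broken by input order.
    """
    ranked = sorted(solutions, key=lambda s: s["n_enriched"], reverse=top)
    q = len(ranked) // 4              # quartile size (assumption: floor)
    quartile = ranked[:q]
    pvals = {}
    for value in {s[param] for s in solutions}:
        n_total = sum(s[param] == value for s in solutions)
        n_quart = sum(s[param] == value for s in quartile)
        pvals[value] = hypergeom_sf(n_quart, len(solutions), n_total, q)
    return pvals


# Hypothetical toy data: eight clustering solutions, not results from the study.
toy = [
    {"algorithm": "Kmeans", "n_enriched": 10},
    {"algorithm": "Kmeans", "n_enriched": 9},
    {"algorithm": "Kmeans", "n_enriched": 8},
    {"algorithm": "Kmeans", "n_enriched": 7},
    {"algorithm": "SOM", "n_enriched": 4},
    {"algorithm": "SOM", "n_enriched": 3},
    {"algorithm": "SOM", "n_enriched": 2},
    {"algorithm": "SOM", "n_enriched": 1},
]

top_pvals = quartile_enrichment(toy, "algorithm", top=True)
bottom_pvals = quartile_enrichment(toy, "algorithm", top=False)
```

In this sketch a parameter value would be reported in the "Top Quartile" column when its p-value in `top_pvals` falls below a chosen threshold, and in the "Bottom Quartile" column when the same holds in `bottom_pvals`. The threshold and any multiple-testing correction are left open because the caption does not specify them.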