Table II.
Benchmarking of cluster qualities
The size and composition of expert-curated Arabidopsis families collected from the literature (Reference column) are compared with the results obtained with HCL and BCL. The last example represents a family of unknown function for which no Pfam domain or references are available. Each number gives a family size; where a field contains several numbers, they give the sizes of the subfamilies obtained, in decreasing order. Only one gene/protein model was counted per gene. The number “1” stands for a singlet (e.g. “5 × 1” stands for five singlets). Data sources are provided as footnotes. *, These counts do not include truncated genes or those that are absent from the latest genome annotation. n.a., Not available.
Family Name | Reference | HCL | BCL |
---|---|---|---|
Cytochrome P450 family<sup>a</sup> | 244 | 239/5×1 | 151/31/28/19/4/3/8×1 |
Acyl-group desaturases<sup>b</sup> | 15 | 15 | 9/4/1/1 |
Stearoyl-ACP desaturases<sup>b</sup> | 7 | 7 | 7 |
Xyloglucan xylosyltransferases<sup>c</sup> | 7* | 7 | 7 |
Xyloglucan fucosyltransferases<sup>d</sup> | 9* | 9 | 9 |
Phototropins<sup>e</sup> | 2 | 2 | 2 |
Auxin response factors<sup>f</sup> | 23 | 17/6 | 20/3×1 |
Fatty acid multifunctional proteins<sup>g</sup> | 2 | 2 | 2 |
Phospholipase D family<sup>h</sup> | 12* | 9/3×1 | 10/2 |
Δ8 sphingolipid desaturases<sup>i</sup> | 2 | 2 | 2 |
Nitrate reductases<sup>j</sup> | 2 | 2 | 2 |
Expressed protein (BCL ID: 321) | n.a. | 10×1 | 10 |
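
The cell notation defined in the legend (subfamily sizes separated by “/”, with “n × 1” for n singlets) can be expanded mechanically, which makes it easy to check that the subfamilies in a field add up to the reference family size. The following is a minimal Python sketch for illustration only; the function names `parse_cluster_cell` and `matches_reference` are hypothetical and not part of the original analysis.

```python
import re


def parse_cluster_cell(cell):
    """Expand a table cell such as '239/5×1' into a list of subfamily sizes.

    'n×s' is read as n clusters of size s, so '5×1' becomes five singlets.
    """
    sizes = []
    for part in cell.replace(" ", "").split("/"):
        repeated = re.fullmatch(r"(\d+)[×x](\d+)", part)
        if repeated:
            count, size = int(repeated.group(1)), int(repeated.group(2))
            sizes.extend([size] * count)
        else:
            sizes.append(int(part))
    return sizes


def matches_reference(reference, cell):
    """Return True if the subfamily sizes in `cell` sum to the reference size.

    A trailing '*' on the reference count (see legend) is ignored.
    """
    return int(reference.rstrip("*")) == sum(parse_cluster_cell(cell))


# Values copied from the table rows above.
print(parse_cluster_cell("239/5×1"))
print(matches_reference("244", "151/31/28/19/4/3/8×1"))
print(matches_reference("12*", "9/3×1"))
```

A check of this kind only confirms that the totals agree; it says nothing about whether the members of each subfamily are the same genes as in the curated family.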