1999 Nov;9(11):1093–1105. doi: 10.1101/gr.9.11.1093

Table 3.

Cluster Splitting of Gene Clusters Within 2,029 cDNA Clones

| Gene ID | Copies | Core clusters, copies (cluster size) | Total | Percent of copies | Diversity index |
|---|---|---|---|---|---|
| Ef1_α | 669 | 636 (646), 3 (4) | 639 | 95.52 | 0.049 |
| Cytochrom_cox_I | 274 | 229 (232), 24 (24), 12 (14), 2 (2) | 267 | 97.45 | 0.121 |
| clone_190B1 | 254 | 237 (241), 2 (2) | 239 | 94.09 | 0.068 |
| tubulin_β | 207 | 203 (205) | 203 | 98.07 | 0.023 |
| 40SRibo_protS6 | 183 | 176 (176) | 176 | 96.17 | 0.045 |
| 40SRibo_protS4 | 100 | 99 (99) | 99 | 99.00 | 0.012 |
| 60SRibo_protL4 | 85 | 84 (86) | 84 | 98.82 | 0.014 |
| GAPDH | 82 | 77 (77) | 77 | 93.90 | 0.074 |
| Ef1_β | 67 | 66 (76) | 66 | 98.51 | 0.018 |
| human_calmodulin | 32 | 22 (26), 4 (4) | 26 | 81.25 | 0.324 |
| heat_shock_cogKD71 | 28 | 22 (23) | 22 | 78.57 | 0.271 |
| heat_shock_cogKD90 | 26 | 20 (20) | 20 | 76.92 | 0.276 |
| human_TNF_receptor | 12 | 8 (8) | 8 | 66.67 | 0.442 |
| clone_244D14 | 8 | 3 (3) | 3 | 37.50 | 0.318 |
| clone_241F17 | 2 | 2 (2) | 2 | 100.00 | 0.000 |
| Total | 2029 | | 1932 | 95.22 | 0.137 |

The diversity index for most gene clusters is low (< 0.2). For example, GAPDH (human glyceraldehyde-3-phosphate dehydrogenase) is present in 82 copies in the library. Clustering places 77 of these copies in a single calculated cluster of size 77 (the numbers in parentheses denote the sizes of the calculated clusters). These 77 copies correspond to 93.90% of the copies of that gene. We consider only calculated clusters that are pure, because only those clusters contribute to gene identification.
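
The "Percent of copies" column can be reproduced from the other columns as a quick sanity check. The following is a minimal sketch, not code from the paper: the function name `percent_in_core_clusters` and the `rows` layout are hypothetical, and the values are copied from three rows of the table. The diversity index is not recomputed here because its formula is not given in this section.

```python
def percent_in_core_clusters(copies, core_hits):
    """Return (copies recovered in core clusters, percent of all copies).

    copies    -- number of copies of the gene in the library
    core_hits -- copies of the gene found in each core cluster
    """
    total = sum(core_hits)
    return total, round(100.0 * total / copies, 2)


# (gene ID, copies in library, copies found per core cluster), from Table 3
rows = [
    ("GAPDH", 82, [77]),
    ("Cytochrom_cox_I", 274, [229, 24, 12, 2]),
    ("human_calmodulin", 32, [22, 4]),
]

for gene, copies, core_hits in rows:
    total, pct = percent_in_core_clusters(copies, core_hits)
    print(f"{gene}: {total}/{copies} = {pct:.2f}%")
```

For GAPDH this gives 77/82 = 93.90%, matching the table entry and the example in the note above.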
