2007 Jan 5;8:4. doi: 10.1186/1471-2105-8-4

Table 2.

Similarities between sequences in the three training sets.

| Model type | Number of clusters | Minimum cluster size | Maximum cluster size | Number of singletons | Average cluster size | Cluster distribution^a |
|---|---|---|---|---|---|---|
| Bacterial | 84 | 1 | 4 | 74 | 1.19 | 6,2,2 |
| Viral | 87 | 1 | 5 | 78 | 1.15 | 7,1,0,1 |
| Tumour | 76 | 1 | 7 | 66 | 1.32 | 4,2,2,1,0,1 |

For a given cut-off, a perfectly diverse set of sequences will have as many clusters as it has sequences, so its minimum, maximum, and average cluster sizes will all equal one.

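^a Counts of non-singleton clusters (two or more members), listed in order of ascending cluster size, starting at size two. For example, "6,2,2" means six clusters of two members, two clusters of three members, and two clusters of four members.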
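The table columns are tied together by simple arithmetic, which the following minimal sketch works through. It assumes that the entries of the cluster distribution are counts of clusters of size 2, 3, 4, and so on, as the footnote suggests. The function name `summarize` and the `table` dictionary are illustrative only and are not taken from the paper. Under that reading, each training set works out to 100 sequences; this total is inferred from the table values, not stated in it.

```python
def summarize(singletons, distribution):
    """Recompute the summary columns from the singleton count and the
    cluster distribution (assumed: distribution[i] = clusters of size i + 2)."""
    # Map cluster size -> number of clusters of that size.
    counts = {1: singletons}
    for offset, count in enumerate(distribution):
        counts[offset + 2] = count

    n_clusters = sum(counts.values())
    n_sequences = sum(size * count for size, count in counts.items())
    occupied = [size for size, count in counts.items() if count > 0]

    return {
        "clusters": n_clusters,
        "sequences": n_sequences,
        "min_size": min(occupied),
        "max_size": max(occupied),
        "average_size": round(n_sequences / n_clusters, 2),
    }


# Values copied from the table: (number of singletons, cluster distribution).
table = {
    "Bacterial": (74, [6, 2, 2]),
    "Viral": (78, [7, 1, 0, 1]),
    "Tumour": (66, [4, 2, 2, 1, 0, 1]),
}

for name, (singletons, distribution) in table.items():
    print(name, summarize(singletons, distribution))
```

The recomputed cluster counts (84, 87, 76), maximum sizes (4, 5, 7), and average sizes (1.19, 1.15, 1.32) match the table rows, which supports this reading of the distribution column.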