2006 Nov 16;35(Database issue):D232–D236. doi: 10.1093/nar/gkl812

Table 1.

Examples of records (benchmark tests) included in the collection

| Benchmark tests^a | Data | Classification tasks | Comparison methods^b |
| --- | --- | --- | --- |
| Classification of protein domains in SCOP [PCB0001, PCB00003, PDB0005] | 11 944 protein sequences and/or protein structures from SCOP95 (6) | Superfamilies subdivided into families: 246 | BLAST, Smith–Waterman, Needleman–Wunsch, LA–kernel, PRIDE2 |
| | | Folds subdivided into superfamilies: 191 | |
| | | Classes subdivided into folds: 377 | |
| Classification of protein domains in CATH [PCB00007, PCB00009, PCB00011, PCB00013] | 11 373 protein sequences and/or protein structures from CATH (7) | H groups subdivided into S groups: 165 | BLAST, Smith–Waterman, Needleman–Wunsch, LA–kernel, PRIDE2 |
| | | T groups subdivided into H groups: 199 | |
| | | A groups subdivided into T groups: 297 | |
| | | Classes subdivided into A groups: 33 | |
| Classification of phyla based on 3-phosphoglycerate kinase (3PGK) sequences [PCB00031, PCB00032] | 131 3PGK protein and DNA sequences (11,29) | Groups of kingdoms (Archaea, Bacteria, Eucarya) subdivided into phyla: 10 | BLAST, Smith–Waterman, Needleman–Wunsch, LA–kernel, LZW, PPMZ |
| Functional annotation of unicellular eukaryotic sequences based on prokaryotic orthologs [PCB00031] | 17 973 sequences of prokaryotes and unicellular eukaryotes from the COG databases (5) | Orthologous groups subdivided into prokaryotes and eukaryotes: 119 | BLAST, Smith–Waterman, Needleman–Wunsch, LA–kernel, LZW, PPMZ |

^a The collection contains a total of 6405 benchmark tests, including 3297 protein sequence classification tests, 3095 3D classification tests and 10 DNA (coding region) classification tests. The accession numbers of the records are given in square brackets.

^b See text for the references.
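
As an illustration of how a task such as "superfamilies subdivided into families" might be turned into individual benchmark tests, the sketch below builds one binary test per held-out subgroup. The table does not specify how the tests are split, so the held-out-subgroup scheme, the function name `held_out_subgroup_tasks`, and the domain labels are all assumptions made for this example, not the collection's own procedure or data.

```python
from collections import defaultdict


def held_out_subgroup_tasks(labels, min_subgroups=2):
    """Build one binary test per (group, held-out subgroup) pair.

    `labels` maps an item id to a (group, subgroup) pair, for example
    (superfamily, family). Hypothetical helper: the splitting scheme is an
    assumption, not taken from the benchmark collection itself.
    """
    by_group = defaultdict(lambda: defaultdict(list))
    for item, (group, subgroup) in labels.items():
        by_group[group][subgroup].append(item)

    tasks = []
    for group in sorted(by_group):
        subgroups = by_group[group]
        if len(subgroups) < min_subgroups:
            continue  # a group with a single subgroup has nothing to hold out
        # Every item outside the group serves as a negative example.
        negatives = sorted(i for i, (g, _) in labels.items() if g != group)
        for held_out in sorted(subgroups):
            tasks.append({
                "group": group,
                "held_out": held_out,
                # Positives to be recognised: members of the held-out subgroup.
                "pos_test": sorted(subgroups[held_out]),
                # Positives available for training: the remaining subgroups.
                "pos_train": sorted(
                    item
                    for name, members in subgroups.items()
                    if name != held_out
                    for item in members
                ),
                "negatives": negatives,
            })
    return tasks


# Made-up labels for illustration only: item id -> (superfamily, family).
example_labels = {
    "d1": ("sfA", "fam1"),
    "d2": ("sfA", "fam1"),
    "d3": ("sfA", "fam2"),
    "d4": ("sfB", "fam3"),
}
example_tasks = held_out_subgroup_tasks(example_labels)
```

Under this reading, the count listed next to each classification task would correspond to the number of such tests, although the table itself does not state this.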