2002 Jun;1(3):341–352. doi: 10.1128/EC.1.3.341-352.2002

TABLE 1.

Similarity searches^a

| Target database or sequence | Date posted | No. of sequences | No. of letters | Method | Query (GSS data set) | Result of query |
|---|---|---|---|---|---|---|
| GenBank Alveolata | 8 August 2000 | 36,628 | 29,939,175 | BLASTN 2.0.14 | hq | 3 rRNA genes; 6 protein coding genes |
| Swall | | | | BLASTX 2.0.12 | nr | 722 ORFs |
| Swall: Swissprot | 7 July 2000 | 86,593 | 31,411,157 | | | |
| Swall: Swissprotnew | 8 July 2000 | 2,822 | 1,280,954 | | | |
| Swall: Sptrembl | 28 June 2000 | 297,973 | 93,374,136 | | | |
| Swall: Sptrnew | 8 July 2000 | 71,201 | 22,367,698 | | | |
| Pfam | 21 September 2000 | 111,934 | 38,089,901 | BLASTP 2.0.14 | ORF | 524 PfamA domains |
| Proteome | | | | BLASTP 2.0.14 | ORF | |
| Proteome: E. coli | 4 January 2001 | 4,581 | 1,435,304 | | | 101 hits^b |
| Proteome: S. cerevisiae | 15 November 2000 | 6,358 | 2,991,939 | | | 505 hits^b |
| Proteome: C. elegans | 15 November 2000 | 19,704 | 8,596,400 | | | 586 hits^b |
| Proteome: D. melanogaster | 15 November 2000 | 14,080 | 6,850,524 | | | 599 hits^b |
| Proteome: A. thaliana | 6 January 2001 | 25,458 | 11,049,032 | | | 588 hits^b |
| Proteome: H. sapiens | 15 November 2000 | 31,919 | 13,433,624 | | | 616 hits^b |
| hq | 2 November 2000 | 3,046 | 1,403,056 | TBLASTX 2.0.14 | hq | 108 families |
^a The genome survey sequence (GSS) data set used for each query is as follows: nr, nonredundant data (3,139 reads; 3,481,394 nt); hq, high-quality nonredundant data (3,046 reads; 1,403,056 nt); and ORF, set of 722 ORF fragments identified by similarity (107,303 aa).

^b Threshold, E < 10⁻⁴.
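
The E-value threshold in footnote b can be illustrated with a short sketch. This is not the authors' pipeline. It assumes hits are available in the 12-column tab-separated layout that current BLAST+ writes with `-outfmt 6` (E-value in the 11th column), and it assumes that a "hit" means a query ORF with at least one match below the threshold. The function name `queries_with_hits` and the example rows are hypothetical.

```python
import csv
import io

# Footnote b: a match counts only if E < 10^-4 (strict inequality).
E_VALUE_THRESHOLD = 1e-4


def queries_with_hits(tabular_text, threshold=E_VALUE_THRESHOLD):
    """Return the set of query IDs with at least one match where E < threshold.

    Assumes 12-column tab-separated rows (query ID in column 1,
    E-value in column 11), which is an assumption about the input format.
    """
    hits = set()
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        query_id = row[0]
        e_value = float(row[10])
        if e_value < threshold:
            hits.add(query_id)
    return hits


# Hypothetical rows for illustration only (made-up IDs and scores,
# not data from the study).
EXAMPLE = "\n".join([
    "orf001\tsubjA\t45.0\t120\t66\t0\t1\t120\t5\t124\t2e-30\t118",
    "orf001\tsubjB\t30.0\t80\t56\t0\t1\t80\t10\t89\t0.5\t30.1",
    "orf002\tsubjC\t28.0\t60\t43\t0\t3\t62\t7\t66\t1e-4\t40.2",
    "orf003\tsubjD\t52.0\t150\t72\t0\t1\t150\t1\t150\t3e-12\t75.0",
])

if __name__ == "__main__":
    matched = queries_with_hits(EXAMPLE)
    print(sorted(matched))  # ['orf001', 'orf003']
```

In the example, `orf002` is excluded because its only match has E equal to the threshold rather than below it. Counting distinct query IDs rather than individual alignments keeps the count bounded by the number of query ORFs.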