Table 2.
Analysis of maize genome survey sequences: a comparison with maize proteins and ESTs
| Approach | Number of entries | Unique sequences | wORF | Comparison with maize proteins | | | | | Reference |
| | | | | %NS | %TE | %HP | %KP | NS %EST | |
| Mutator insertions | 4412 | 970 | 375 | 93 | 3 | 2 | 2 | 26 | [26] |
| Random inserts | 3480 | 2529 | 1015 | 61 | 38 | 1 | 1 | 44 | [27] |
| Methylation filter | 1692 | 1083 | 258 | 84 | 10 | 2 | 3 | 37 | [28] |
| BAC ends | 945 | 881 | 454 | 48 | 51 | 0 | 0 | 28 | [29] |
| MPP | 669 | 338 | 150 | 80 | 1 | 7 | 11 | 47 | [30] |
| ORFs | 399 | 86 | 79 | 76 | 0 | 14 | 10 | 22 | [31] |
| Other | 28 | 11 | 3 | 33 | 67 | 0 | 0 | 0 | |
All sequences were retrieved from the GenBank GSS database; the second column gives the number of database entries. Sequences shorter than 360 bp and redundant sequences were removed with the vmatch program [21] (V.B. and S.K., unpublished), yielding the reduced set sizes given in the 'Unique sequences' column. Of these, only sequences with non-redundant open reading frames of at least 120 codons (counted in the 'wORF' column) were compared with a maize protein set using BLASTP [23]. Entries were classified on the basis of BLASTP results and GenBank keywords as novel (NS), transposable element (TE), hypothetical protein (HP), or known protein (KP); the corresponding columns give the percentage of sequences in each class. The 'NS %EST' column gives the percentage of sequences with novel ORFs that match maize ESTs. MPP, Missouri Mapping Project.
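As a rough illustration of the two numeric thresholds in the legend (360 bp minimum length, 120-codon minimum ORF), the following Python sketch may be helpful. It is not the pipeline used to build the table: redundancy removal with vmatch and the BLASTP comparison are not reproduced, and the function names are hypothetical. It also assumes that an "open reading frame" means a stop-free run of codons in any of the six reading frames, with no start codon required; the legend does not state which definition was used.

```python
# Thresholds taken from the table legend.
MIN_LENGTH_BP = 360
MIN_ORF_CODONS = 120

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_open_frame(seq):
    """Length in codons of the longest stop-free run over all six frames.

    Assumption: an ORF is any stretch between stop codons; a start codon
    is not required (survey sequences are often gene fragments).
    """
    seq = seq.upper()
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for offset in range(3):
            run = 0
            for i in range(offset, len(strand) - 2, 3):
                if strand[i:i + 3] in STOP_CODONS:
                    run = 0  # a stop codon ends the current open frame
                else:
                    run += 1
                    best = max(best, run)
    return best


def passes_filters(seq):
    """True if seq meets both the length and the ORF-length threshold."""
    return (len(seq) >= MIN_LENGTH_BP
            and longest_open_frame(seq) >= MIN_ORF_CODONS)
```

Under these assumptions, a sequence would count toward the 'wORF' column only if `passes_filters` returns `True` and it also survived redundancy removal, which this sketch does not perform.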