Table 1. Eight increments of diversity used in exon/intron identification.
ID notation | ID type | Source of information | First source defining ID | Second source defining ID |
---|---|---|---|---|
ID1 | ID{$m \times 4$} | 7 or 8 bases around splice site | Potential splice site region | All true splice site region |
ID2 | ID{$C_m^2 \times 4^2$} | 7 or 8 bases around splice site | Potential splice site region | All true splice site region |
ID3 | ID{$C_m^3 \times 4^3$} | 7 or 8 bases around splice site | Potential splice site region | All true splice site region |
ID4 | ID{$4^3$} | 48 bases before potential and true boundary | Potential splice site region (L1 sequence) | All true splice site region (L1 sequences) |
ID5 | ID{$4^3$} | 48 bases before potential and after true boundary | Potential splice site region (L1 sequence) | All true splice site region (L2 sequences) |
ID6 | ID{$4^3$} | 48 bases after potential and before true boundary | Potential splice site region (L2 sequence) | All true splice site region (L1 sequences) |
ID7 | ID{$4^3$} | 48 bases after potential and true boundary | Potential splice site region (L2 sequence) | All true splice site region (L2 sequences) |
ID8 | ID{$4^3$} | 48 bases before and after potential boundary | Potential splice site region (L1 sequence) | Potential splice site region (L2 sequence) |
The table defines the eight IDs. The second column gives the ID type, and the third column gives the location of the sequence information needed to define the ID. Each ID is defined by two diversity sources (Equation 2 of the text), which are listed in the last two columns: ‘potential splice site region’ refers to the sequence to be identified, and ‘all true splice site region’ refers to all sequences (exons or introns) in the standard set.
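
As an illustration of how an ID of the trinucleotide type might be computed from two sources, the following Python sketch may help. Equation 2 is not reproduced in this section, so the sketch assumes the commonly used definition of the increment of diversity: D(X) = N log N − Σ n_i log n_i for a source with state counts n_i summing to N, and ID(X, Y) = D(X + Y) − D(X) − D(Y). The function names, the use of overlapping trinucleotides, and the pooling of the standard set into a single count vector are assumptions made for illustration and are not taken from the text.

```python
import math
from collections import Counter
from itertools import product

# The 4^3 = 64 trinucleotide states (assumed state space for the 48-base windows).
TRIMERS = ["".join(p) for p in product("ACGT", repeat=3)]


def trimer_counts(seqs):
    """Count overlapping trinucleotides pooled over a list of sequences.

    Assumption: overlapping windows are used; trimers containing
    characters other than A, C, G, T are ignored.
    """
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - 2):
            counts[s[i:i + 3]] += 1
    return [counts[t] for t in TRIMERS]


def diversity(counts):
    """D(X) = N log N - sum_i n_i log n_i, with 0 log 0 taken as 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y) for two count vectors of equal length."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


# Hypothetical example: one candidate region against a small "standard set".
candidate = trimer_counts(["ACGTACGTACGT"])
standard = trimer_counts(["ACGTACGTTTGA", "CCGTACGAACGT"])
id_value = increment_of_diversity(candidate, standard)
```

Under this definition the value is zero when the two sources have identical state proportions and grows as their compositions diverge, so a small ID between a candidate region and the standard set indicates similarity.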