Table 2. Sequence validation rates of predictions, FGENESH vs. GENSCAN

| Gene prediction set | Gene prediction program | Total no. of predictions | No. of predictions tested* | No. sequence-validated (%)† |
|---|---|---|---|---|
| sjc set | GENSCAN | 196 | 161 | 60 (37.3%) |
| | FGENESH | 11 | 10 | 4 (40.0%) |
| Heidelberg set‡ | FGENESH | 1,266 | 160 | 18 (11.3%) |
| Homol-2 set | GENSCAN | 333 | 204 | 27 (13.2%) |
| | FGENESH | 6 | 5 | 2 (40.0%) |
| Homol-0 set§ | GENSCAN | 9,463 | 129 | 7 (5.4%) |
| | FGENESH | 127 | 75 | 5 (6.7%) |
| Total predictions | | 11,402 | 744 | 123 (16.5%) |
| Control set | | 159 | 159 | 154 (96.9%) |

*Gene predictions from each prediction set (sjc set, GENSCAN or FGENESH predictions with an intron conserved in D. pseudoobscura; Heidelberg set, FGENESH predictions reported as verified by transcription profiling (1); homol-2 set, predictions with homology to D. pseudoobscura in two exons; homol-0 set, predictions with homology in zero to one exon; and release 3.1 controls) were tested by sequencing of RT-PCR products.

†Gene predictions were considered validated if the aligned sequence of the PCR product was consistent with a spliced gene model in the region of the prediction.

‡Only 1,266 multiexon predictions from the 2,636 predictions described in the Hild et al. (1) study were considered for analysis, and, of these, we tested only the 160 with the highest priority scores that did not overlap any GENSCAN or FGENESH predictions tested in the other sets.

§The homol-0 set was selected to be representative of the full range of priority scores.
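As a cross-check on the arithmetic in Table 2, the short sketch below recomputes each validation rate as the number of sequence-validated predictions divided by the number tested, and sums the prediction sets to reproduce the "Total predictions" row. The counts are transcribed from the table; the names `ROWS`, `validation_rate`, and `totals` are illustrative, and half-up rounding to one decimal place is an assumption chosen because it matches the printed percentages.

```python
from decimal import Decimal, ROUND_HALF_UP

# Counts transcribed from Table 2:
# (prediction set, program, total predictions, tested, sequence-validated)
ROWS = [
    ("sjc", "GENSCAN", 196, 161, 60),
    ("sjc", "FGENESH", 11, 10, 4),
    ("Heidelberg", "FGENESH", 1266, 160, 18),
    ("homol-2", "GENSCAN", 333, 204, 27),
    ("homol-2", "FGENESH", 6, 5, 2),
    ("homol-0", "GENSCAN", 9463, 129, 7),
    ("homol-0", "FGENESH", 127, 75, 5),
]


def validation_rate(validated, tested):
    """Percentage of tested predictions that were sequence-validated.

    Uses Decimal with half-up rounding (an assumption) so that exact
    halves such as 11.25 round to 11.3, as printed in the table.
    """
    rate = Decimal(100 * validated) / Decimal(tested)
    return rate.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def totals(rows):
    """Sum the total, tested, and validated columns over all rows."""
    total = sum(r[2] for r in rows)
    tested = sum(r[3] for r in rows)
    validated = sum(r[4] for r in rows)
    return total, tested, validated


if __name__ == "__main__":
    for name, program, _, tested, validated in ROWS:
        print(f"{name:<11}{program:<9}{validation_rate(validated, tested)}%")
    total, tested, validated = totals(ROWS)
    print(f"All sets: {total} predictions, {tested} tested, "
          f"{validated} validated ({validation_rate(validated, tested)}%)")
```

The control set is left out of `ROWS` because the table reports it separately from the totals; its rate can be checked the same way with `validation_rate(154, 159)`.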