PLoS One. 2012 Nov 30;7(11):e50609. doi: 10.1371/journal.pone.0050609

Table 3. Evaluation of gene finders with various training genes.

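| Gene finder | Training genes## | Gene SN | Gene SP | Exon SN | Exon SP | Nucleotide SN | Nucleotide SP | Predicted | Matched$$ | Duplicate++ |
|---|---|---|---|---|---|---|---|---|---|---|
| GlimmerHMM | All validated genes except test genes | 0.16 | 0.15 | 0.27 | 0.30 | 0.54 | 0.50 | 684 | 269 (47) | 52 |
| GlimmerHMM | All validated genes including test genes | 0.20 | 0.20 | 0.33 | 0.35 | 0.61 | 0.55 | 710 | 273 (64) | 47 |
| GlimmerHMM | Using a trained model from program creator | Not available for Toxoplasma gondii | | | | | | | | |
| GlimmerHMM | Using a model trained on human genes | 0.02 | 0.01 | 0.04 | 0.05 | 0.23 | 0.14 | 1129 | 247 (5) | 131 |
| SNAP | All validated genes except test genes | 0.18 | 0.12 | 0.44 | 0.33 | 0.46 | 0.35 | 889 | 277 (53) | 172 |
| SNAP | All validated genes including test genes | 0.18 | 0.12 | 0.46 | 0.35 | | | 895 | 279 (54) | 170 |
| SNAP | Using a trained model from program creator | Not available for Toxoplasma gondii | | | | | | | | |
| SNAP | Using a model trained on human genes | 0.09 | 0.04 | 0.06 | 0.09 | 0.16 | 0.11 | 1759 | 267 (25) | 315 |
| AUGUSTUS | All validated genes except test genes | 0.33 | 0.38 | 0.54 | 0.57 | 0.81 | 0.78 | 510 | 261 (99) | 2 |
| AUGUSTUS | All validated genes including test genes | 0.37 | 0.42 | 0.57 | 0.59 | 0.82 | 0.79 | 514 | 265 (111) | 2 |
| AUGUSTUS | Using a trained model from program creator | 0.36 | 0.42 | 0.57 | 0.56 | 0.78 | 0.84 | 470 | 256 (108) | 0 |
| AUGUSTUS | Using a model trained on human genes | 0.12 | 0.09 | 0.19 | 0.19 | 0.34 | 0.25 | 114 | 282 (37) | 150 |
| GeneMark_hmm | Using a trained model from program creator | 0.06 | 0.07 | 0.15 | 0.13 | 0.43 | 0.37 | 580 | 240 (19) | 49 |
| GeneMark_hmm ES | Using a self-training procedure, i.e. no training genes required | 0.08 | 0.09 | 0.23 | 0.19 | 0.56 | 0.44 | 630 | 248 (25) | 45 |

For SNAP trained on all validated genes including test genes, the source gives only four accuracy values (0.18, 0.12, 0.46, 0.35). They fill the first four accuracy columns here, and it is unclear which two columns are actually missing.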
## The types of training genes used in the training model. The number of validated genes = 3,432 (includes test genes) and the number of test genes = 299.

$$ Number of predicted genes that align entirely or partly with the test genes and meet the criteria of E-value = 0 and 100% coverage. A value in brackets is the number of predicted genes that are exactly the same as a test gene, i.e. the start and end genomic coordinates of every exon are the same as those of the corresponding test gene exon.
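The exact-match criterion behind the bracketed counts can be illustrated with a minimal sketch. It assumes each gene is represented as a list of (start, end) exon coordinates and that both genes lie on the same sequence and strand; the function name `is_exact_match` and this representation are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

# Hypothetical representation: one exon = (start, end) genomic coordinates.
Exon = Tuple[int, int]


def is_exact_match(predicted: List[Exon], test: List[Exon]) -> bool:
    """Return True when both genes have the same number of exons and every
    exon has identical start and end coordinates.

    Assumes both genes are on the same sequence and strand.
    """
    # Sorting makes the comparison independent of the order exons are listed in.
    return sorted(predicted) == sorted(test)


test_gene = [(100, 200), (300, 400)]
print(is_exact_match([(300, 400), (100, 200)], test_gene))  # True
print(is_exact_match([(100, 200), (300, 410)], test_gene))  # False
```

A prediction that overlaps a test gene but differs in any exon boundary, or has a different number of exons, would count toward the matched total but not toward the bracketed value under this reading.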

++ Number of predicted genes that align to the same test gene, i.e. each predicted gene covers only a part of the entire test gene, and there can be one or more predictions per test gene.
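As a rough illustration of how duplicates might be identified, the sketch below groups predictions by the test gene they align to and keeps only test genes hit more than once. The footnote does not say exactly how the tally is made, so the sketch stops at the grouping. The input format, a list of (predicted gene ID, test gene ID) pairs, and the function name `group_duplicates` are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_duplicates(hits: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each test gene to the predictions aligned to it, keeping only
    test genes that are hit by more than one prediction.

    `hits` is a hypothetical list of (predicted_gene_id, test_gene_id) pairs.
    """
    by_test_gene: Dict[str, List[str]] = defaultdict(list)
    for predicted_id, test_gene_id in hits:
        by_test_gene[test_gene_id].append(predicted_id)
    return {gene: preds for gene, preds in by_test_gene.items() if len(preds) > 1}


hits = [("p1", "t1"), ("p2", "t1"), ("p3", "t2")]
print(group_duplicates(hits))  # {'t1': ['p1', 'p2']}
```

A high duplicate count relative to the matched count suggests that a gene finder tends to split single genes into several shorter predictions.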