. 2016 Oct 31;24(1):25–35. doi: 10.1093/dnares/dsw045

Table 2.

Assessment of DNA structural features for their ability to differentiate between promoter and non-promoter sequences in six different model organisms

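| Organism | Number of sequences | First cycle Prec. | First cycle BA | First cycle F-score | Second cycle Prec. | Second cycle BA | Second cycle F-score | Property |
|---|---|---|---|---|---|---|---|---|
| H. pylori | 706 | 77 | 75 | 75 | 80 | 75 | 72 | STB |
| Anabaena | 2513 | 72 | 71 | 70 | 76 | 71 | 68 | STB |
| Synechocystis | 1639 | 69 | 69 | 69 | 74 | 69 | 65 | STB |
| E. coli | 1331 | 74 | 75 | 76 | 80 | 77 | 76 | STB |
| Salmonella | 981 | 75 | 76 | 77 | 79 | 77 | 76 | STB |
| Klebsiella | 1324 | 80 | 80 | 80 | 84 | 81 | 80 | STB |
| H. pylori | 706 | 54 | 56 | 63 | 56 | 58 | 63 | BDC |
| Anabaena | 2513 | 52 | 54 | 62 | 53 | 53 | 58 | BDC |
| Synechocystis | 1639 | 52 | 53 | 61 | 53 | 53 | 57 | BDC |
| E. coli | 1331 | 54 | 56 | 63 | 55 | 57 | 61 | BDC |
| Salmonella | 981 | 55 | 57 | 65 | 56 | 58 | 62 | BDC |
| Klebsiella | 1324 | 56 | 59 | 66 | 59 | 61 | 65 | BDC |
| H. pylori | 706 | 54 | 54 | 55 | 58 | 56 | 51 | BNC |
| Anabaena | 2513 | 55 | 57 | 64 | 57 | 58 | 61 | BNC |
| Synechocystis | 1639 | 52 | 53 | 62 | 53 | 54 | 59 | BNC |
| E. coli | 1331 | 52 | 54 | 63 | 52 | 53 | 59 | BNC |
| Salmonella | 981 | 52 | 53 | 62 | 52 | 53 | 59 | BNC |
| Klebsiella | 1324 | 51 | 52 | 63 | 51 | 51 | 59 | BNC |
| H. pylori | 706 | 53 | 55 | 62 | 54 | 55 | 59 | CBC |
| Anabaena | 2513 | 53 | 54 | 60 | 55 | 55 | 58 | CBC |
| Synechocystis | 1639 | 53 | 54 | 61 | 53 | 54 | 58 | CBC |
| E. coli | 1331 | 57 | 58 | 63 | 58 | 59 | 60 | CBC |
| Salmonella | 981 | 56 | 58 | 64 | 59 | 60 | 63 | CBC |
| Klebsiella | 1324 | 58 | 60 | 66 | 59 | 60 | 64 | CBC |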

From the total test dataset (promoters associated with the primary category of TSS), only 1001-nucleotide-long sequences with 30–65% GC content were considered. Sequences with promoter predictions falling in the 300-nucleotide region spanning −200 to +100 relative to the TSS (position 0) were counted as true positives (TP), whereas sequences with predictions falling in the coding region +200 to +500 relative to the TSS were labelled as false positives (FP). The evaluation parameters precision (Prec.), BA and F-score were calculated using the formulas given in the ‘Material and methods’ section.
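
The labelling rule in this footnote can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the window boundaries are treated as inclusive, and precision is taken in its standard form TP / (TP + FP). BA and F-score are left out because their formulas are given in the ‘Material and methods’ section rather than here.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (case-insensitive)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def passes_gc_filter(seq: str, low: float = 0.30, high: float = 0.65) -> bool:
    """Keep sequences whose GC content lies in the 30-65% range (assumed inclusive)."""
    return low <= gc_fraction(seq) <= high


def label_prediction(position: int):
    """Label a predicted position given relative to the TSS at 0.

    -200..+100 -> "TP" (promoter region)
    +200..+500 -> "FP" (coding region)
    anything else -> None (not counted in either class)
    """
    if -200 <= position <= 100:
        return "TP"
    if 200 <= position <= 500:
        return "FP"
    return None


def precision(positions) -> float:
    """Standard precision TP / (TP + FP) over labelled predictions."""
    labels = [label_prediction(p) for p in positions]
    tp = labels.count("TP")
    fp = labels.count("FP")
    return tp / (tp + fp) if (tp + fp) else 0.0
```

For example, with hypothetical predictions at −150, 0, +50, +300 and +150, the first three count as TP, +300 counts as FP, and +150 falls in neither window, so `precision` returns 0.75.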