. 2008 Sep 18;9:381. doi: 10.1186/1471-2105-9-381

Table 5.

Effect of parameter set indexing strategy on PFCWLLKR performance using non-TIS-containing data

k    Indexing strategy   Cluster fit   TN      FP       Sn
3    modulating          edit          5,074   11,047   0.3147
3    modulating          PWM           5,134   10,987   0.3185
3    modulating          WAM           5,069   11,052   0.3144
3    static              edit          6,170    9,951   0.3827
3    static              PWM           6,279    9,842   0.3895
3    static              WAM           6,119   10,002   0.3796

5    modulating          edit          4,537   11,584   0.2814
5    modulating          PWM           4,484   11,637   0.2781
5    modulating          WAM           4,262   11,859   0.2644
5    static              edit          5,993   10,128   0.3718
5    static              PWM           6,065   10,056   0.3762
5    static              WAM           5,679   10,442   0.3523

10   modulating          edit          4,190   11,931   0.2599
10   modulating          PWM           3,708   12,413   0.2300
10   modulating          WAM           3,533   12,588   0.2192
10   static              edit          6,345    9,776   0.3936
10   static              PWM           5,537   10,584   0.3435
10   static              WAM           5,199   10,922   0.3225

16,121 non-TIS-containing instances were used in five-fold cross-validation experiments, in which parameter sets were selected for putative TIS evaluation according to best cluster fit established by either the Hamming distance relative to cached medoids (edit), position weight matrix scores (PWM), or weight array matrix scores (WAM). Parameter indexing was tested under both modulating (cluster assignment for each site separately) and static (cluster assignment based on the leftmost ATG) approaches. k denotes the number of clusters considered. TN represents the number of instances for which the method (correctly) refused to predict a TIS, and FP the number for which some prediction was made, though always incorrect (see Figure 2). $Sn = \frac{TN}{TN + FP}$.
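
The Sn column can be recomputed from the TN and FP counts with the ratio given in the footnote. The following is a minimal sketch of that check, not code from the study: the function name `refusal_rate`, the constant `TOTAL_INSTANCES`, and the selection of four rows are illustrative assumptions, while the counts themselves are copied from the table.

```python
# Illustrative check of Sn = TN / (TN + FP) against a few rows of Table 5.
# Names here (refusal_rate, TOTAL_INSTANCES, ROWS) are hypothetical helpers,
# not identifiers from the original study.

TOTAL_INSTANCES = 16_121  # non-TIS-containing instances, per the footnote


def refusal_rate(tn: int, fp: int) -> float:
    """Return Sn = TN / (TN + FP), as defined in the table footnote."""
    total = tn + fp
    if total <= 0:
        raise ValueError("TN + FP must be positive")
    return tn / total


# (k, indexing strategy, cluster fit) -> (TN, FP, Sn as reported in the table)
ROWS = {
    (3, "modulating", "edit"): (5074, 11047, 0.3147),
    (3, "static", "PWM"): (6279, 9842, 0.3895),
    (10, "modulating", "WAM"): (3533, 12588, 0.2192),
    (10, "static", "edit"): (6345, 9776, 0.3936),
}

for key, (tn, fp, reported) in ROWS.items():
    # Every row should account for all instances exactly once.
    assert tn + fp == TOTAL_INSTANCES
    sn = refusal_rate(tn, fp)
    # Reported values are rounded to four decimals, so allow half a unit
    # in the last place.
    status = "matches" if abs(sn - reported) < 5e-5 else "differs"
    print(key, f"{sn:.4f}", status)
```

Because every instance in this data set lacks a TIS, TN + FP equals 16,121 in each row, so Sn is simply the fraction of instances on which the method declined to make a prediction.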