Table 4.
Promoter Range | | | | | 1 kb upstream | | 1 kb upstream | | 5 kb upstream | | 5 kb upstream | |
PWM | | | | | All Profiles | | Limited Profiles | | All Profiles | | Limited Profiles | |
Classifier | Expression | Lower | Upper | Feature Selection | Accuracy | SD | Accuracy | SD | Accuracy | SD | Accuracy | SD |
IB1 | Threshold | 0.2 | 0.8 | InfoGain | 91.56% | 2.35% | 81.65% | 4.22% | 93.06% | 1.80% | 93.27% | 2.35% |
IB1 | Threshold | 0.33 | 0.66 | InfoGain | 91.89% | 2.95% | 90.72% | 2.90% | 95.57% | 2.04% | 93.62% | 1.78% |
IB1 | Threshold | 0.2 | 0.8 | ChiSquared | 89.96% | 2.74% | 81.00% | 4.04% | 93.92% | 1.75% | 92.63% | 2.22% |
IB1 | Threshold | 0.33 | 0.66 | ChiSquared | 91.10% | 2.90% | 90.67% | 2.79% | 94.07% | 2.43% | 93.43% | 2.31% |
IB1 | Tanh | 0.25 | 0.75 | InfoGain | 92.71% | 2.43% | 92.74% | 2.30% | 92.13% | 2.47% | 92.00% | 3.01% |
Naive Bayes | Threshold | 0.2 | 0.8 | InfoGain | 90.47% | 2.85% | 82.35% | 3.78% | 96.04% | 1.34% | 94.98% | 1.41% |
Naive Bayes | Threshold | 0.2 | 0.8 | InfoGain | 91.67% | 2.53% | 83.18% | 3.11% | 94.39% | 1.73% | 93.78% | 2.00% |
Table 4 shows the effects of varying the parameters used for connectivity network construction. The genomic region searched for transcription factor binding sites was either 1000 bp or 5000 bp upstream of known genes. Two collections of position weight matrices (PWMs) were applied: 1) all matrices provided by TRANSFAC that are relevant to mammalian genes (All Profiles), or 2) the subset of PWMs identified by TRANSFAC as 'high quality' (Limited Profiles).
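To make the two promoter ranges concrete, the following is a minimal Python sketch of how an upstream search window might be defined and scanned with a PWM. It is an illustration only, not the procedure used to produce Table 4: the function names `upstream_window` and `best_pwm_score`, the 0-based coordinate convention, and the three-position toy matrix are all assumptions introduced here, and the matrix is not a TRANSFAC profile.

```python
import math


def upstream_window(tss, strand, length):
    """Return 0-based, half-open (start, end) coordinates of the region
    `length` bp upstream of a transcription start site (hypothetical helper).

    On the '+' strand upstream means lower coordinates; on the '-' strand
    it means higher coordinates.
    """
    if strand == "+":
        return max(0, tss - length), tss
    return tss + 1, tss + 1 + length


def best_pwm_score(seq, pwm):
    """Slide a PWM along `seq` and return the highest summed score.

    `pwm` is a list of dicts, one per motif position, mapping each base
    to a score (assumed here to be log-odds values).
    """
    width = len(pwm)
    best = -math.inf
    for i in range(len(seq) - width + 1):
        score = sum(pwm[j][seq[i + j]] for j in range(width))
        best = max(best, score)
    return best


# Toy matrix favouring the motif "ACG" (made up for illustration).
toy_pwm = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]

# The two promoter ranges compared in Table 4, for a gene on the '+' strand.
window_1kb = upstream_window(5000, "+", 1000)
window_5kb = upstream_window(5000, "+", 5000)
score = best_pwm_score("TTACGTT", toy_pwm)
```

In this sketch, switching between the 1 kb and 5 kb settings only changes the `length` argument, while switching between All Profiles and Limited Profiles would correspond to changing which matrices are passed to the scan.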