2012 Feb 1;7(2):263–268. doi: 10.4161/psb.18720

Table 2. Comparative analysis of statistical methods for the identification of outliers representing erroneous ESTs of PTS1 proteins. Three statistical methods (see text) were evaluated for their ability to identify outliers in OG-specific PWM score histograms. A method was rated too insensitive for an OG if it failed to identify additional apparent outliers and experimentally validated cytosolic sequences, and too unspecific if it flagged too many sequences as outliers. For very good performance in outlier detection, it is recommended to apply all three methods and to eliminate the outliers identified by at least two of them. ACX1, acyl-CoA oxidase; AGT, alanine (serine)-glyoxylate aminotransferase; ATF1/2, acetyltransferase; BSMDR, quinone oxidoreductase; GSTT1, glutathione S-transferase isoform theta 1; HPR, hydroxypyruvate reductase; MLS, malate synthase; SDRb/DECR, short-chain dehydrogenase-reductase B/2,4-dienoyl-CoA reductase; SCP2, sterol carrier protein isoform 2.

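Method 1 (M1): "Standard deviation from mean value"; Method 2 (M2): "Positive deviation from median score"; Method 3 (M3): "Interquartile range". For each method the columns give the number of sequences excluded, the percentage of sequences excluded, the number of erroneous (cytosolic) sequences excluded, and the conclusion.

| OG acronym | Total seq. number | M1: seq. excluded | M1: seq. excluded (%) | M1: erroneous (cyt.) seq. excluded | M1: conclusion | M2: seq. excluded | M2: seq. excluded (%) | M2: erroneous (cyt.) seq. excluded | M2: conclusion | M3: seq. excluded | M3: seq. excluded (%) | M3: erroneous (cyt.) seq. excluded | M3: conclusion |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACX1 | 88 | 3 | 3.4 | 0/0 | good | 3 | 3.4 | 0/0 | good | 3 | 3.4 | 0/0 | good |
| AGT | 94 | 15 | 16.0 | 1/1 | good | 15 | 16.0 | 1/1 | good | 14 | 14.9 | 1/1 | good |
| ATF1/2 | 61 | 4 | 6.6 | 0/0 | good | 5 | 8.2 | 0/0 | good | 6 | 9.8 | 0/0 | good |
| BSMDR | 52 | 3 | 5.8 | 0/0 | good | 2 | 3.8 | 0/0 | good | 2 | 3.8 | 0/0 | good |
| GSTT1 | 54 | 4 | 7.4 | 1/1 | good | 5 | 7.4 | 1/1 | good | 4 | 7.4 | 1/1 | good |
| HPR | 76 | 4 | 5.3 | 0/0 | too insensitive | 10 | 13.2 | 0/0 | good | 7 | 9.2 | 0/0 | good |
| MLS | 47 | 13 | 27.7 | 1/1 | good | 13 | 27.7 | 1/1 | good | 0 | 0 | 0/1 | too insensitive |
| SDRb/DECR | 72 | 9 | 12.5 | 1/1 | too unspecific | 5 | 12.5 | 1/1 | too unspecific | 2 | 2.8 | 1/1 | good |
| SCP2 | 91 | 6 | 6.6 | 2/2 | good | 10 | 11.0 | 2/2 | too unspecific | 6 | 6.6 | 2/2 | good |
| Total | 635 | 61 | 9.6 | 6/6 | good | 68 | 10.7 | 6/6 | good | 44 | 6.9 | 5/6 | good |

Combined method application: 60 sequences excluded (9.4%); erroneous (cyt.) sequences excluded: 6/6; conclusion: very good.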
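The combined rule recommended in the caption (apply all three methods and exclude sequences flagged by at least two) can be sketched as below. This is only an illustrative sketch, not the authors' implementation: the table does not state the cutoffs, so the function name `flag_low_outliers`, the factor of 2 standard deviations, the reading of "positive deviation from median" as a lower bound mirrored from the maximum score, and the 1.5 × IQR factor are all assumptions. It also assumes that erroneous sequences appear as unusually low PWM scores.

```python
import statistics


def flag_low_outliers(scores, k_sd=2.0, iqr_factor=1.5):
    """Return indices of scores flagged as low outliers by >= 2 of 3 rules.

    All cutoffs are illustrative assumptions, not values from the paper.
    """
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    median = statistics.median(scores)
    q1, _, q3 = statistics.quantiles(scores, n=4)

    # Rule 1 (assumed): more than k_sd standard deviations below the mean.
    cut_sd = mean - k_sd * sd
    # Rule 2 (assumed reading): mirror the largest positive deviation
    # from the median to obtain a lower bound.
    cut_median = median - (max(scores) - median)
    # Rule 3 (assumed): below Q1 - iqr_factor * IQR.
    cut_iqr = q1 - iqr_factor * (q3 - q1)

    flagged = []
    for i, s in enumerate(scores):
        votes = (s < cut_sd) + (s < cut_median) + (s < cut_iqr)
        if votes >= 2:  # majority vote across the three rules
            flagged.append(i)
    return flagged


# Hypothetical scores for one OG; the last value is an obvious low outlier.
example_scores = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 1.0, 0.98, 1.02, -2.0]
print(flag_low_outliers(example_scores))  # -> [9]
```

Requiring agreement between two rules is what keeps a single overly sensitive rule (rated "too unspecific" in the table) from removing sequences on its own, while a single overly insensitive rule can still be outvoted by the other two.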