Table 1. Performance of the programs in searching metabolic protein family profiles against the simulated marine data set with uneven coverage.
Pathway | #Families | HMM-GRASPx Rec. | HMM-GRASPx Prec. | HMM-GRASPx F. | HMMER3 Rec. | HMMER3 Prec. | HMMER3 F. | RPS-BLAST Rec. | RPS-BLAST Prec. | RPS-BLAST F. | UProC Rec. | UProC Prec. | UProC F. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
KO00010 | 47 | 70.2 | 94.3 | 80.5 | 25.0 | 95.7 | 39.7 | 43.8 | 97.6 | 60.4 | 26.8 | 68.0 | 38.5 |
KO00020 | 69 | 61.6 | 86.5 | 72.0 | 18.0 | 92.2 | 30.1 | 31.4 | 94.5 | 47.2 | 28.7 | 70.1 | 40.7 |
KO00030 | 21 | 78.0 | 96.2 | 86.1 | 28.6 | 98.5 | 44.4 | 49.2 | 97.7 | 65.5 | 33.1 | 56.4 | 41.7 |
KO00051 | 111 | 58.7 | 87.4 | 70.2 | 15.6 | 94.1 | 26.7 | 24.4 | 90.9 | 38.4 | 17.8 | 57.1 | 27.1 |
KO00620 | 80 | 60.2 | 89.4 | 72.0 | 15.3 | 92.9 | 26.3 | 27.1 | 95.0 | 42.2 | 23.6 | 75.6 | 35.9 |
KO00680 | 124 | 59.3 | 89.7 | 71.4 | 17.2 | 95.1 | 29.1 | 27.6 | 92.8 | 42.5 | 21.0 | 73.3 | 32.7 |
KO00910 | 49 | 61.9 | 87.3 | 72.4 | 16.3 | 92.2 | 27.7 | 29.5 | 88.8 | 44.2 | 26.7 | 76.9 | 39.6 |
KO00920 | 7 | 62.7 | 93.7 | 75.1 | 11.1 | 93.4 | 19.8 | 24.3 | 97.0 | 38.9 | 8.3 | 25.1 | 12.4 |
Average | - | 64.1 | 90.6 | 75.0 | 18.4 | 94.3 | 30.5 | 32.2 | 94.3 | 47.4 | 23.2 | 62.8 | 33.6 |
Pathway names: KO00010 (Glycolysis/Gluconeogenesis), KO00020 (TCA cycle), KO00030 (Pentose phosphate pathway), KO00051 (Fructose and mannose metabolism), KO00620 (Pyruvate metabolism), KO00680 (Methane metabolism), KO00910 (Nitrogen metabolism), KO00920 (Sulfur metabolism). The column “#Families” gives the number of protein (domain) families involved in the corresponding pathway. The columns “Rec.”, “Prec.”, and “F.” give Recall, Precision, and F-measure, respectively. All values are percentages. The highest value for each measure among all programs is bolded.
Running time for all programs on the simulated marine data set is available in S4 Table.
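
The F-measure column of Table 1 can be checked against the reported recall and precision. The sketch below assumes that “F-measure” means the balanced F-measure (the harmonic mean of recall and precision). The table does not state the formula, but the reported values are consistent with it. The function name `f_measure` is illustrative only and does not come from any of the evaluated programs.

```python
def f_measure(recall: float, precision: float) -> float:
    """Balanced F-measure: harmonic mean of recall and precision.

    Assumption: this is the definition behind the "F." columns.
    Inputs and output share the same units (here, percentages).
    """
    if recall + precision == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * recall * precision / (recall + precision)


# Recall and precision (percent) for HMM-GRASPx on KO00010, from Table 1.
print(round(f_measure(70.2, 94.3), 1))  # prints 80.5, matching the table
```

Because the harmonic mean is dominated by the smaller of its two inputs, a program with high precision but low recall (for example HMMER3 on this data set) still receives a low F-measure.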