Table 1. Major characteristics of DGE libraries and tag mapping to the UniGene transcript database.
 | BCG-infected library | | | | H37Rv-infected library | | | |
 | Distinct Tag | % | Total Tag | % | Distinct Tag | % | Total Tag | % |
Raw data | 306688 | 100 | 5759300 | 100 | 303581 | 100 | 5773517 | 100 |
Low Quality Tag | 0 | 0 | 0 | 0 | ||||
Tags containing N | 774 | 0.25 | 940 | 0.02 | 893 | 0.29 | 1053 | 0.02 |
Adaptor tag | 82 | 0.03 | 94 | 0 | 84 | 0.03 | 89 | 0 |
Tag copynum <2 | 202170 | 65.92 | 202170 | 3.51 | 166118 | 54.72 | 166118 | 2.88 |
Clean tag | 103662 | 33.80 | 5556096 | 96.47 | 136486 | 44.96 | 5606257 | 97.10 |
Copynum ≥2 | 103662 | 100 | 5556096 | 100 | 136486 | 100 | 5606257 | 100 |
Copynum >5 | 13385 | 12.91 | 101008 | 1.82 | 15986 | 11.71 | 120238 | 2.14 |
Copynum >10 | 8809 | 8.50 | 129049 | 2.32 | 10198 | 7.47 | 148787 | 2.65 |
Copynum >20 | 7384 | 7.12 | 236865 | 4.26 | 8561 | 6.27 | 274419 | 4.89 |
Copynum >50 | 3872 | 3.74 | 273595 | 4.92 | 4154 | 3.04 | 295194 | 5.27 |
Copynum >100 | 6106 | 5.89 | 4638534 | 83.49 | 6432 | 4.71 | 4520966 | 80.64 |
Tag mapping | ||||||||
All mapping | 67106 | 64.74 | 4723249 | 85.01 | 77287 | 56.63 | 4731204 | 84.39 |
Unambiguous mapping | 60472 | 58.34 | 3197086 | 57.54 | 70300 | 51.51 | 3376934 | 60.24 |
Unknown tag | 14728 | 14.21 | 477629 | 8.60 | 23008 | 16.86 | 523621 | 9.34 |
All mapping represents the number of all tags mapped to the UniGene virtual tag database; Unambiguous mapping represents the number of unambiguous tags mapped to the UniGene virtual tag database, where unambiguous tags are those matched to only one gene. Examination confirmed that the data quality was up to specification, suggesting that the DGE data are reliable.
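As a consistency check, the percentages in Table 1 can be recomputed from the tag counts. The Python sketch below is illustrative only. It assumes that clean-tag percentages are taken relative to the raw data and that mapping percentages are taken relative to the clean tags; this is inferred from the numbers in the table, not stated in the source. The names `RAW`, `CLEAN`, `ALL_MAPPING`, `pct` and `summarize` are hypothetical.

```python
# (distinct tags, total tags) per library, copied from Table 1.
RAW = {"BCG": (306688, 5759300), "H37Rv": (303581, 5773517)}
CLEAN = {"BCG": (103662, 5556096), "H37Rv": (136486, 5606257)}
ALL_MAPPING = {"BCG": (67106, 4723249), "H37Rv": (77287, 4731204)}


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def summarize(library):
    """Recompute the clean-tag and all-mapping percentages for one library.

    Assumption: clean tags are expressed relative to raw data,
    and mapped tags relative to clean tags.
    """
    raw_distinct, raw_total = RAW[library]
    clean_distinct, clean_total = CLEAN[library]
    mapped_distinct, mapped_total = ALL_MAPPING[library]
    return {
        "clean_distinct_pct": pct(clean_distinct, raw_distinct),
        "clean_total_pct": pct(clean_total, raw_total),
        "mapped_distinct_pct": pct(mapped_distinct, clean_distinct),
        "mapped_total_pct": pct(mapped_total, clean_total),
    }


if __name__ == "__main__":
    for library in ("BCG", "H37Rv"):
        print(library, summarize(library))
```

Under these assumptions the recomputed values agree with the table to two decimal places, for example 33.80% distinct and 96.47% total clean tags in the BCG-infected library.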