Table 1. Summary of the number of reads generated by 454 pyrosequencing, before and after processing and assembly.
Sample | Raw reads | Reads after removal of exact duplicates, short reads, low-complexity reads, and human and bacterial contamination | Reads after assembly (Newbler) | Viral reads tBLASTX (unassembled), E value <10⁻³, cut-off 0.8 | Viral reads tBLASTX (assembled), E value <10⁻³, cut-off 0.8 | Viral reads MetaVir (unassembled), E value <10⁻³ | Viral reads MetaVir (assembled), E value <10⁻³ |
---|---|---|---|---|---|---|---|
C1 | 3,196 | 901 | 605 | 272 | 129 | 195 | 98 |
C2 | 53,420 | 40,387 | 3,088 | 3,606 | 480 | 2,641 | 374 |
C3 | 43,543 | 33,153 | 2,060 | 3,459 | 111 | 3,411 | 78 |
C4 | 67,822 | 50,393 | 2,758 | 4,392 | 454 | 4,087 | 355 |
C5 | 52,747 | 44,501 | 3,241 | 5,746 | 316 | 6,444 | 277 |
C6 | 54,838 | 49,263 | 1,813 | 5,198 | 147 | 2,769 | 101 |
C7 | 39,990 | 23,228 | 890 | 3,343 | 111 | 1,717 | 55 |
C8 | 62,063 | 53,874 | 2,467 | 6,214 | 323 | 6,896 | 286 |
C9 | 26,145 | 23,325 | 870 | 1,576 | 113 | 1,899 | 116 |
C10 | 29,035 | 19,414 | 1,540 | 2,173 | 96 | 1,266 | 46 |
IC1 | 12,820 | 10,170 | 715 | 1,129 | 31 | 1,143 | 25 |
V1 | 12,480 | 10,875 | 2,759 | 3,555 | 666 | 3,518 | 555 |
V2 | 28,292 | 13,435 | 1,801 | 4,360 | 533 | 4,048 | 426 |
V3 | 16,877 | 14,478 | 2,138 | 3,092 | 603 | 3,372 | 518 |
V4 | 32,269 | 30,928 | 2,200 | 797 | 153 | 1,311 | 164 |
V5 | 18,556 | 17,619 | 1,400 | 1,136 | 63 | 1,402 | 67 |
V6 | 18,620 | 14,179 | 6,474 | 2,225 | 995 | 2,066 | 722 |
V7 | 12,707 | 11,693 | 1,199 | 4,146 | 193 | 4,315 | 157 |
V8 | 16,595 | 15,278 | 2,332 | 850 | 270 | 773 | 230 |
Total | 602,015 | 477,094 | 40,350 | 57,269 | 5,787 | 53,224 | 4,648 |
The number of reads with viral hits is compared between our tBLASTX approach and MetaVir, for both unassembled and assembled reads.
C1–C10, CD fecal samples; IC1, CD intestinal sample; V1–V8, control samples.
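
As a reading aid, the proportions implied by Table 1 can be recomputed with a minimal sketch such as the one below. It uses a subset of rows copied from the table. The names `TABLE1` and `summarize` are hypothetical, and it assumes that the "unassembled" viral read counts are drawn from the filtered reads in the third column, which the table does not state explicitly.

```python
# Subset of rows copied from Table 1:
# (raw reads, reads after filtering,
#  tBLASTX viral reads on unassembled data,
#  MetaVir viral reads on unassembled data)
TABLE1 = {
    "C1": (3196, 901, 272, 195),
    "C5": (52747, 44501, 5746, 6444),
    "V1": (12480, 10875, 3555, 3518),
    "Total": (602015, 477094, 57269, 53224),
}


def summarize(raw, filtered, tblastx, metavir):
    """Return percentages derived from one table row.

    Assumption: viral reads from unassembled data are a subset of the
    filtered reads, so the filtered count is used as the denominator.
    """
    return {
        "retained_pct": round(100 * filtered / raw, 1),
        "tblastx_viral_pct": round(100 * tblastx / filtered, 1),
        "metavir_viral_pct": round(100 * metavir / filtered, 1),
    }


summary = {sample: summarize(*counts) for sample, counts in TABLE1.items()}

for sample, row in summary.items():
    print(sample, row)
```

Under that assumption, the same function could be applied to every row to compare how the two annotation approaches differ from sample to sample.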