Table 3.
Sample name | Raw reads | Raw bases | Clean reads | GC% | Q20 (%) | Q30 (%) | Clean bases | Clean bases% |
---|---|---|---|---|---|---|---|---|
cont1 | 67572136 | 10135820400 | 65650056 | 48.57 | 99.16 | 97.32 | 9689017234 | 95.59 |
cont2 | 66474707 | 9971206050 | 65006370 | 47.54 | 99.22 | 97.52 | 9514596643 | 95.42 |
infect1 | 69033293 | 9854993950 | 65789887 | 49.67 | 98.95 | 96.74 | 9383925239 | 95.22 |
infect2 | 70292999 | 10543949850 | 68826092 | 49.91 | 98.94 | 96.69 | 10102208096 | 95.81 |
Raw reads: the total number of reads in each raw data file, counted on the basis that every four consecutive lines describe one read; raw bases: the total number of bases in the raw data; clean reads: the reads remaining after filtering the raw data to remove linker (adapter) sequences, contaminated reads, and reads containing a high proportion of low-quality bases; GC%: the percentage of G and C among all bases; Q20, Q30: the percentage of bases with a Phred score greater than 20 and 30, respectively, corresponding to base-call accuracies of 99% and 99.9%; clean bases: the total number of bases in the clean reads, calculated from the number and length of the clean reads; clean bases%: clean bases as a percentage of raw bases.
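
As an illustration only, the quantities defined above can be sketched in a few lines of Python. This is not the pipeline used to produce Table 3. The function names `fastq_stats` and `clean_bases_pct` are hypothetical, and the sketch assumes four-line FASTQ records with Phred+33 quality encoding, neither of which is stated in the table notes.

```python
import io


def fastq_stats(handle, phred_offset=33):
    """Summarise a four-line-per-read FASTQ stream.

    Assumption: qualities are Phred+33 encoded (phred_offset=33).
    Q20/Q30 use a strict "greater than" threshold, following the table
    notes; some tools count scores equal to the threshold as well.
    """
    reads = bases = gc = q20 = q30 = 0
    for i, line in enumerate(handle):
        line = line.rstrip("\n")
        if i % 4 == 1:  # second line of each record: the base sequence
            reads += 1
            bases += len(line)
            seq = line.upper()
            gc += seq.count("G") + seq.count("C")
        elif i % 4 == 3:  # fourth line of each record: the quality string
            for ch in line:
                q = ord(ch) - phred_offset
                q20 += q > 20
                q30 += q > 30

    def pct(n):
        return 100.0 * n / bases if bases else 0.0

    return {
        "reads": reads,
        "bases": bases,
        "gc_pct": pct(gc),
        "q20_pct": pct(q20),
        "q30_pct": pct(q30),
    }


def clean_bases_pct(clean_bases, raw_bases):
    """Clean bases as a percentage of raw bases."""
    return 100.0 * clean_bases / raw_bases


if __name__ == "__main__":
    # Two toy reads, not real data: 'I' encodes Q40, '5' encodes Q20, '!' encodes Q0.
    toy = io.StringIO("@r1\nGATC\n+\nII5!\n@r2\nGGCC\n+\nIIII\n")
    print(fastq_stats(toy))
    # cont1 row of Table 3: clean bases / raw bases
    print(round(clean_bases_pct(9689017234, 10135820400), 2))  # 95.59
```

Applied to the cont1 row, `clean_bases_pct` reproduces the reported 95.59%. The Q20 and Q30 thresholds correspond to error probabilities of 10^(-20/10) = 1% and 10^(-30/10) = 0.1% per base call.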