Table 3. Summary of the computational processing of 454 pyrosequence data.
Step | Process | Program | Program source (URL or e-mail) | Computer | # of query seqs. | Output | Time |
1 | Barcode sorting | split_libraries.py | QIIME (http://qiime.org/scripts/split_libraries.html) | Windows 64 bit PC* | 1,583,218** | 106 to 8618 reads per subject | 10 min |
2 | OTU clustering | pick_otus_through_otu_table.py | QIIME (http://qiime.org/scripts/pick_otus_through_otu_table.html) | Windows 64 bit PC | 131,768 | 15,578 redundant seqs. | 17 min |
3 | de novo chimera check | de novo Uchime | Mothur 1.25.1 (http://www.mothur.org/wiki/Download_mothur) | Windows 64 bit PC | 15,578 | 7,200 chimeric seqs. | 6 min |
4 | DB chimera check | DB Uchime | Mothur 1.25.1 (http://www.mothur.org/wiki/Download_mothur) | Windows 64 bit PC | 15,578 | 7,852 chimeric seqs. (6,612 chimeric seqs.)*** | 7 hrs |
5 | Sequence search up to genus level | RDP Classifier | RDP II (http://rdp.cme.msu.edu/) | RDP host computer | 8,966 | 9 phyla–109 genera | 10 min |
6 | Sequence search at species level | RDP Seqmatch | RDP II (http://rdp.cme.msu.edu/) | RDP host computer | 8,966 | 3,992 seqs. identified with known species | 3 hrs |
7 | Data processing of RDP Seqmatch | SeqmatchQ400 | Kyushu Univ. (nakayama@agr.kyushu-u.ac.jp) | Windows 64 bit PC | 8,945 | 276 species | 30 min |
*Intel Core i7-3930K CPU (3.20 GHz). **Batch sequence data included all sequences from 2 × half PicoTiterPlate regions, which contained 256 samples including non-Singaporean samples. ***The number of chimeric sequences determined by taking into account both de novo and DB chimera checks.
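
The counts in steps 3 to 5 can be cross-checked with a short sketch. It assumes that "taking into account both" checks means keeping only the sequences flagged by both the de novo and the DB Uchime runs (a set intersection). The table does not state this explicitly, but 6,612 is smaller than either individual count, and 15,578 minus 6,612 equals the 8,966 query sequences passed to step 5. The function names and the toy sequence IDs below are hypothetical and are not part of Mothur or QIIME.

```python
def consensus_chimeras(de_novo_flagged, db_flagged):
    """Return the IDs flagged as chimeric by both checks (set intersection).

    Assumption: the combined count in Table 3 is the overlap of the two checks.
    """
    return set(de_novo_flagged) & set(db_flagged)


def remaining_after_removal(total_seqs, n_chimeras):
    """Number of sequences left once the chimeras are discarded."""
    return total_seqs - n_chimeras


# Toy example with hypothetical sequence IDs.
de_novo = {"otu1", "otu2", "otu3"}
db = {"otu2", "otu3", "otu4"}
print(sorted(consensus_chimeras(de_novo, db)))  # ['otu2', 'otu3']

# Counts taken from Table 3: 15,578 clustered seqs., 6,612 consensus chimeras.
print(remaining_after_removal(15578, 6612))  # 8966, the query count in step 5
```

Under this reading, a sequence flagged by only one of the two checks is retained for the RDP searches.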