Table 1.
| Category | Tool | Data^a | Wall time (h:m:s) | Total CPU time (h:m:s) | Daily throughput^b |
|---|---|---|---|---|---|
| Clustering | CD-HIT-EST | 1 | 00:08:53 | 00:34:08 | 3,113 |
| | CD-HIT | 2 | 00:00:58 | 00:02:52 | 23,040 |
| | H-CD-HIT | 2 | 00:20:06 | 01:10:26 | 1,600 |
| | CD-HIT-454 | 1 | 00:05:40 | 00:21:54 | 4,800 |
| rRNA | BLASTN-rRNA | 1 | 00:12:43 | 13:44:53 | 139 |
| | hmm-rRNA | 1 | 00:01:56 | 00:20:35 | 5,008 |
| tRNA | tRNA-scan | 1 | 00:02:29 | 02:01:50 | 936 |
| ORF calling | ORF-finder | 1 | 00:02:06 | 00:02:06 | 23,040 |
| | Metagene | 1 | 00:16:21 | 00:15:21 | 6,400 |
| | FragGeneScan | 1 | 01:27:50 | 01:27:50 | 1,294 |
| Function | COG | 2 | 00:14:55 | 15:12:50 | 126 |
| | KOG | 2 | 00:15:16 | 16:25:31 | 116 |
| | PRK | 2 | 00:28:38 | 32:03:16 | 59 |
| | PFAM | 2 | 01:33:44 | 115:30:23 | 16 |
| | TIGRFAM | 2 | 00:53:23 | 62:31:51 | 30 |
| Pathway | KEGG | 2 | 20:24:33 | 553:32:48 | 3 |
| Statistics | FNA-stat | 1 | 00:00:38 | 00:00:38 | 43,746 |
| | FAA-stat | 2 | 00:00:12 | 00:00:12 | 52,363 |
| Quality control | QC-filter-FASTQ | 1 | 00:03:13 | 00:03:13 | 19,200 |
| | QC-filter-FASTA-qual | 1 | 00:02:47 | 00:02:47 | 23,040 |
| | Trim | 1 | 00:04:00 | 00:04:00 | 16,457 |
| Filtering | Filter-human | 1 | 00:40:28 | 02:29:57 | 762 |
| Binning | RDP-binning | 1 | 01:16:30 | 01:20:00 | 1,404 |
| | FR-HIT-binning | 1 | 00:36:59 | 02:13:53 | 853 |
| OTU clustering | CD-HIT-OTU | 3 | 00:05:10 | 00:10:23 | 8,861 |
| File conversion | FASTQ2FASTA | 1 | 00:02:24 | 00:02:24 | 23,040 |

a See text for descriptions of the 3 datasets tested.
b Daily throughput is calculated as the daily CPU time of the WebMGA cluster (80 cores) divided by the total CPU time of a job, assuming 2 minutes of administrative CPU cost (job queuing, file copying, etc.) for each job.
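
The throughput calculation in footnote b can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and it assumes that the 2-minute administrative cost is added to the job's CPU time in the denominator and that the result is truncated to a whole number of jobs. Under that reading the sketch reproduces the FNA-stat and FAA-stat rows exactly; some other rows in the table differ by a few percent, so the original calculation may have rounded times differently.

```python
# Assumptions taken from footnote b: 80 cores, 2 minutes of overhead per job.
CORES = 80
OVERHEAD_S = 2 * 60
SECONDS_PER_DAY = 24 * 60 * 60


def hms_to_seconds(hms: str) -> int:
    """Convert an 'h:m:s' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def daily_throughput(cpu_hms: str, cores: int = CORES,
                     overhead_s: int = OVERHEAD_S) -> int:
    """Jobs per day: cluster CPU seconds per day / (job CPU time + overhead).

    Adding the overhead to the denominator and truncating the quotient
    are assumptions of this sketch, not details stated in the footnote.
    """
    daily_cpu_s = cores * SECONDS_PER_DAY
    return daily_cpu_s // (hms_to_seconds(cpu_hms) + overhead_s)


if __name__ == "__main__":
    print(daily_throughput("00:00:38"))  # FNA-stat row -> 43746
    print(daily_throughput("00:00:12"))  # FAA-stat row -> 52363
```

Because the overhead is fixed per job, it dominates for short jobs such as FAA-stat (12 s of CPU time against 120 s of overhead) and is negligible for long ones such as KEGG.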