2014 Jun 2;4(13):2642–2653. doi: 10.1002/ece3.1107

Table 1.

Steps of the Illumina metabarcoding data denoising pipeline, illustrated with a fungal ITS rDNA example file. For each step we report the decreasing read or cluster numbers, the approximate computing time, and the computing infrastructure used. Computing times are based on the example data file, run either on a standard desktop computer with two processors and 4 GB RAM or on a large-memory machine with 48 processors and 512 GB RAM.

| Pipeline step | Program | Files | Read numbers | Cluster numbers | Time | Computer | Processors |
|---|---|---|---|---|---|---|---|
| Raw sequence data | | Pool 1 forward | 14,940,845 | | | | |
| | | Pool 1 reverse | 14,940,845 | | | | |
| | | Pool 2 forward | 11,209,268 | | | | |
| | | Pool 2 reverse | 11,209,268 | | | | |
| | | Pool 3 forward | 13,946,058 | | | | |
| | | Pool 3 reverse | 13,946,058 | | | | |
| 1. Quality filtering | Script (Supplements): Reads_Quality_Length_distribution.pl | Pool 1 forward | 13,433,309 | | Up to 1 h | Desktop computer | 2 |
| | | Pool 1 reverse | 13,433,309 | | | | |
| | | Pool 2 forward | 9,998,878 | | | | |
| | | Pool 2 reverse | 9,998,878 | | | | |
| | | Pool 3 forward | 13,044,704 | | | | |
| | | Pool 3 reverse | 13,044,704 | | | | |
| 2. Paired-end assembly | PANDAseq | Pool 1 | 12,341,403 | | Up to 1 h | Desktop computer | 2 |
| | | Pool 2 | 9,314,737 | | | | |
| | | Pool 3 | 12,134,242 | | | | |
| 3. Remove primer artifacts | Script (Supplements): remove_multiprimer.py | Pool 1 | 11,255,037 | | Minutes | Desktop computer | 2 |
| | | Pool 2 | 8,520,491 | | | | |
| | | Pool 3 | 11,153,448 | | | | |
| 4. Reorient reads to 5′-3′ | fqgrep | Pool 1 | 9,061,462 | | Up to 1 h | Desktop computer | 2 |
| | | Pool 2 | 8,155,112 | | | | |
| | | Pool 3 | 10,539,993 | | | | |
| 5. Demultiplex | Script (Supplements): demultiplex.sh | | | | | | |
| (A) forward labels | | Pool 1 | 8,851,827 | | Up to 1 h | Desktop computer | 2 |
| | | Pool 2 | 8,053,268 | | | | |
| | | Pool 3 | 9,903,834 | | | | |
| (B) reverse labels | | Pool 1 | 3,297,016 | | Up to 1 h | Desktop computer | 2 |
| | | Pool 2 | 2,957,182 | | | | |
| | | Pool 3 | 3,949,934 | | | | |
| 6. Pool files, remove primers and labels | rename.pl | Pool 1, 2 and 3 combined | 10,203,752 | | Minutes | Desktop computer | 2 |
| 7. Extract fungal ITS | FungalITSextractor | Pool 1, 2 and 3 combined | 10,093,751 | | Several days | Computer cluster | 50 |
| 8. Similarity clustering | | | | | | | |
| (A) Remove replicate sequences | UPARSE | Pool 1, 2 and 3 combined | 4,869,466 | | Minutes | Desktop computer | 2 |
| (B) Sort sequences by abundance | UPARSE | Pool 1, 2 and 3 combined | | 560,678 | Minutes | Desktop computer | 2 |
| (C) Cluster OTUs | UPARSE | Pool 1, 2 and 3 combined | | 14,781 | Up to 1 h | Desktop computer | 2 |
| 9. Reference-based chimera filtering | UPARSE | Pool 1, 2 and 3 combined | | 14,636 | Up to 1 h | Desktop computer | 2 |
| 10. Identify fungal OTUs | | | | | | | |
| (A) BLAST | BLAST | Pool 1, 2 and 3 combined | | 14,636 | Several hours | Computer cluster | 25 |
| (B) Assign fungal reads | MEGAN | Pool 1, 2 and 3 combined | | 3,208 | Minutes | Desktop computer | |
| 11. Fungal OTU abundance table | UPARSE | Pool 1, 2 and 3 combined | 5,964,069 | 3,208 | Several hours | Desktop computer | 2 |
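
The decrease in read numbers reported in Table 1 can be summarised as the fraction of raw read pairs retained after each step. The following is a minimal sketch of that arithmetic, not part of the published pipeline. The names `READ_COUNTS`, `retention`, and `RETENTION` are illustrative assumptions. The counts are copied from the table; for paired files only the forward count is used, and step 5 is left out because the table reports forward and reverse label counts separately.

```python
# Read counts per pool, copied from Table 1.
# Assumption: forward and reverse files hold the same number of reads,
# so the forward count stands for the number of read pairs.
READ_COUNTS = [
    ("Raw sequence data", [14_940_845, 11_209_268, 13_946_058]),
    ("1. Quality filtering", [13_433_309, 9_998_878, 13_044_704]),
    ("2. Paired-end assembly", [12_341_403, 9_314_737, 12_134_242]),
    ("3. Remove primer artifacts", [11_255_037, 8_520_491, 11_153_448]),
    ("4. Reorient reads", [9_061_462, 8_155_112, 10_539_993]),
    ("6. Pool files", [10_203_752]),
    ("7. Extract fungal ITS", [10_093_751]),
    ("11. Fungal OTU abundance table", [5_964_069]),
]


def retention(counts):
    """Return (step, total reads, fraction of raw reads retained) per step."""
    raw_total = sum(counts[0][1])  # first entry is the raw data
    rows = []
    for step, per_pool in counts:
        total = sum(per_pool)
        rows.append((step, total, total / raw_total))
    return rows


RETENTION = retention(READ_COUNTS)

if __name__ == "__main__":
    for step, total, fraction in RETENTION:
        print(f"{step:<32} {total:>12,} {fraction:6.1%}")
```

Running the script prints one line per step with the summed read count and the share of the raw read pairs still present, which makes it easy to see which steps discard the most data.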