Table 1.
Results of read alignment algorithms on simulated reads

Cell entries: (percent misses) time in seconds

Variant | GSNAP | BWA | Bowtie | SOAP2 | SOAP | MAQ |
---|---|---|---|---|---|---|
36 nt reads | ||||||
Exact | 51 | 17 | 9 | 70 | 869 | 2248 |
1 mm | 55 | 60 | 33 | 11 | 1157 | 2106 |
2 mm | (1.0) 304 | 64 | 46 | 39 | (2.9) 1470 | 6008 |
3 mm | (11.9) 405 | 551 | 544 | − | (15.6) 1369 | 19 523 |
Ins, 1–3 | 640 | (4.9) 767 | − | − | (5.1) 5534 | − |
Del, 1–3 | 653 | (3.3) 1016 | − | − | (3.7) 4308 | − |
Ins, 4–9 | (0.1) 507 | − | − | − | 31 420 | − |
Del, 4–30 | (0.1) 887 | − | − | − | − | − |
70 nt reads | ||||||
Exact | 15 | 23 | 9 | 13 | 1205 | 2180 |
1 mm | 23 | 25 | 15 | 12 | (0.1) 1564 | 2120 |
2 mm | 45 | 33 | 48 | 67 | (0.9) 2363 | 6175 |
3 mm | 95 | 83 | 542 | − | (3.3) 2272 | 20 316 |
4 mm | 325 | 373 | − | − | (7.8) 2098 | (2.4) 20 002 |
Ins, 1–3 | 245 | (2.0) 323 | − | − | (4.3) 15 516 | − |
Del, 1–3 | 263 | (1.3) 425 | − | − | (4.7) 14 645 | − |
Ins, 4–9 | (0.1) 288 | − | − | − | − | − |
Del, 4–30 | (0.1) 292 | − | − | − | − | − |
100 nt reads | ||||||
Exact | 15 | 29 | 11 | 10 | − | 2211 |
1 mm | 21 | 30 | 16 | 13 | − | 2168 |
2 mm | 33 | 35 | 56 | 73 | − | 6330 |
3 mm | 50 | 52 | 620 | − | − | 20 697 |
4 mm | 82 | 137 | − | − | − | (0.5) 20 503 |
5 mm | 155 | 543 | − | − | − | (2.1) 20 283 |
Ins, 1–3 | 269 | (1.3) 218 | − | − | − | − |
Del, 1–3 | 273 | (0.8) 360 | − | − | − | − |
Ins, 4–9 | (0.1) 335 | − | − | − | − | − |
Del, 4–30 | (0.1) 312 | − | − | − | − | − |

Times (in seconds) are for each set of 100 000 reads. For BWA, times include conversion to genomic coordinates (∼8 s per dataset). For SOAP2, times exclude loading of indices (∼35 s per dataset).

Sensitivity was computed over reads that were unique (mapping to one location in the genome) and non-upgradeable (not mapping to another genomic location with a better variant type than the expected alteration). Misses, if any, are shown as percentages in parentheses before the corresponding running time. Dashes indicate variant types that could not be detected by the corresponding program. Variants: mm, mismatch(es); ins, insertion; del, deletion.

Parameter flags used, where n is the number of mm in the dataset: GSNAP (mm): -t 1 -m n. GSNAP (indel): -t 1 -m 0 -i 0. BWA (mm): aln -o 0 -n n. BWA (indel): aln -n 3 -o 1 -O 1 -E 1. Bowtie: -f -k 10 --quiet -p 1 -v n. SOAP2: -r 2 -v n. SOAP (mm): -s 12 -r 2 -w 10 -v n. SOAP (indel): -s 12 -r 2 -w 10 -v 0 -g 3. MAQ: map -C 10 -e 200 -n n.

For the 3-mismatch dataset, Bowtie was also run in its MAQ mode, by removing the -v flag that limits the number of mismatches and adding ‘-e 200’ to permit more mismatches. In that mode, times for the 36, 70 and 100 nt datasets were 46, 142 and 750 s, but miss rates were 57.2, 13.4 and 6.4%, respectively.
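
Because each cell packs an optional miss percentage and a running time into one string, a small parser can make the table easier to work with. The following is a minimal sketch, not part of the original analysis: the helper names `parse_cell` and `reads_per_second` are hypothetical, and it assumes that a cell without a parenthesized value means no misses were reported and that each dataset holds 100 000 reads, as the footnote states.

```python
import re
from typing import Optional, Tuple

READS_PER_DATASET = 100_000  # per the table footnote

# Optional "(miss%)" prefix, then a time whose digits may be grouped
# by spaces, e.g. "(2.4) 20 002".
_CELL = re.compile(r"^(?:\((?P<miss>\d+(?:\.\d+)?)\)\s*)?(?P<time>\d[\d\s]*)$")


def parse_cell(cell: str) -> Optional[Tuple[float, int]]:
    """Return (percent_misses, seconds), or None for a dash cell
    (variant type not detected by that program)."""
    text = cell.strip()
    if text in {"−", "-", "–"}:
        return None
    match = _CELL.match(text)
    if match is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    # Assumption: no parenthesized value means no misses were reported.
    miss = float(match.group("miss")) if match.group("miss") else 0.0
    # Drop the digit-grouping whitespace before converting to int.
    seconds = int(re.sub(r"\s", "", match.group("time")))
    return miss, seconds


def reads_per_second(seconds: int, n_reads: int = READS_PER_DATASET) -> float:
    """Throughput implied by a running time for one dataset."""
    return n_reads / seconds


print(parse_cell("(11.9) 405"))   # (11.9, 405)
print(parse_cell("19 523"))       # (0.0, 19523)
print(parse_cell("−"))            # None
```

For example, the 51 s reported for GSNAP on exact 36 nt reads corresponds to roughly 1961 reads per second under these assumptions.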