Table 1.
Haplotype | Polymorphism | GraphMap Count | GraphMap Freq | BLASR Count | BLASR Freq | LAST, Manual (q = 1) Count | LAST, Manual (q = 1) Freq | LAST, Manual (q = 2) Count | LAST, Manual (q = 2) Freq | LAST + LAST-TRAIN, Training Count | LAST + LAST-TRAIN, Training Freq | LAST + LAST-TRAIN, Training+LAMA Count | LAST + LAST-TRAIN, Training+LAMA Freq |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
TT:CT | CYP2D6*4 | 207 | 11.9% | 227 | 18.4% | 340 | 20.1% | 182 | 21.2% | 327 | 27.3% | 343 | 27.4% |
TT:CC | (reference bias) | 225 | 13.0% | 329 | 26.6% | 326 | 19.3% | 164 | 19.1% | 160 | 13.4% | 134 | 10.7% |
T−:CT | | 70 | 4.0% | 31 | 2.5% | 65 | 3.8% | 36 | 4.2% | 75 | 6.3% | 78 | 6.2% |
T−:CC | CYP2D6*3 | 226 | 13.0% | 281 | 22.8% | 232 | 13.7% | 199 | 23.1% | 199 | 16.6% | 217 | 17.3% |
Other | | 1006 | 58.1% | 367 | 29.7% | 726 | 43.1% | 279 | 32.4% | 436 | 36.4% | 480 | 38.3% |
Total | | 1734 | 100.0% | 1235 | 100.0% | 1689 | 100.0% | 860 | 100.0% | 1197 | 100.0% | 1252 | 100.0% |
In the first column, TX:CY denotes the phased haplotype in which the first position (rs35742686) is ‘X’ (‘T’ in the reference genome) and the second position (rs3892097) is ‘Y’ (‘C’ in the reference genome). See also Supplementary Table S14. The high frequency of TT:CC (the haplotype identical to the reference genome) is known as reference bias (Laver et al., 2016). The values for ‘BLASR’ were computed from the mapping results in Ammar et al. (2015), where BLASR was used to map Nanopore reads to the reference genome. The ‘Training+LAMA’ column shows the results of probabilistic alignment (Hamada et al., 2011) using forward scores with the parameters trained by LAST-TRAIN. See Supplementary Materials S7 for the detailed command-line options for every tool.
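
As a rough illustration of how the Freq values relate to the Count values, the sketch below assumes that each frequency is the haplotype count divided by the total for that method, which is consistent with the table. The counts are copied from Table 1; the dictionary keys and the function name `haplotype_frequencies` are illustrative labels rather than names used in the paper, and ‘T-’ stands in for ‘T−’. Recomputed values can differ from the printed ones by 0.1 in the last digit because of rounding.

```python
# Counts per haplotype for each method, copied from Table 1.
# Method labels and haplotype keys are shortened, illustrative names.
COUNTS = {
    "GraphMap":          {"TT:CT": 207, "TT:CC": 225, "T-:CT": 70, "T-:CC": 226, "Other": 1006},
    "BLASR":             {"TT:CT": 227, "TT:CC": 329, "T-:CT": 31, "T-:CC": 281, "Other": 367},
    "LAST manual q=1":   {"TT:CT": 340, "TT:CC": 326, "T-:CT": 65, "T-:CC": 232, "Other": 726},
    "LAST manual q=2":   {"TT:CT": 182, "TT:CC": 164, "T-:CT": 36, "T-:CC": 199, "Other": 279},
    "LAST-TRAIN":        {"TT:CT": 327, "TT:CC": 160, "T-:CT": 75, "T-:CC": 199, "Other": 436},
    "LAST-TRAIN + LAMA": {"TT:CT": 343, "TT:CC": 134, "T-:CT": 78, "T-:CC": 217, "Other": 480},
}


def haplotype_frequencies(counts):
    """Return each haplotype's share of the reads as a percentage.

    Assumption: frequency = count / sum of all counts for that method.
    """
    total = sum(counts.values())
    return {hap: 100.0 * n / total for hap, n in counts.items()}


# Frequencies for every method, keyed the same way as COUNTS.
FREQS = {method: haplotype_frequencies(c) for method, c in COUNTS.items()}

if __name__ == "__main__":
    # Compare the CYP2D6*4 haplotype (TT:CT) with the reference haplotype (TT:CC).
    for method, freqs in FREQS.items():
        print(f"{method:18s} TT:CT {freqs['TT:CT']:5.1f}%   TT:CC {freqs['TT:CC']:5.1f}%")
```

Read this way, the TT:CC share falls from 19.3% with manual LAST parameters (q = 1) to 13.4% with trained parameters and 10.7% with Training+LAMA, while the TT:CT share rises from 20.1% to 27.3% and 27.4%.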