Table 1:
Dataset | Basecaller | Deletion rate (%) | Insertion rate (%) | Mismatch rate (%) | Identity rate (%) | Error rate (%) |
---|---|---|---|---|---|---|
Lambda | Metrichor | 8.93 | 2.38 | 4.57 | 86.50 | 15.88 |
Lambda | Albacore v1.1 | 6.35 | 3.82 | 4.69 | 88.96 | 14.86 |
Lambda | Albacore v2 | 6.19 | 3.38 | 3.98 | 89.82 | 13.55 |
Lambda | BasecRAWller | 7.89 | 10.01 | 10.56 | 81.54 | 28.46 |
Lambda | Chiron | 8.20 | 2.13 | 4.03 | 87.76 | 14.36 |
Lambda | Chiron-BS | 6.20 | 2.13 | 4.20 | 89.60 | 12.53 |
E. coli | Metrichor | 7.52 | 1.93 | 3.84 | 88.64 | 13.29 |
E. coli | Albacore v1.1 | 5.76 | 3.27 | 4.14 | 90.10 | 13.17 |
E. coli | Albacore v2 | 5.21 | 2.99 | 3.57 | 91.22 | 11.77 |
E. coli | BasecRAWller | 7.16 | 10.40 | 10.30 | 82.54 | 27.86 |
E. coli | Chiron | 6.36 | 1.81 | 3.07 | 90.57 | 11.24 |
E. coli | Chiron-BS | 4.94 | 2.36 | 3.16 | 91.90 | 10.46 |
M. tuberculosis | Metrichor | 7.63 | 2.40 | 4.35 | 88.02 | 14.38 |
M. tuberculosis | Albacore v1.1 | 6.12 | 3.57 | 4.68 | 89.19 | 14.37 |
M. tuberculosis | Albacore v2 | 5.05 | 3.58 | 4.05 | 90.90 | 12.68 |
M. tuberculosis | BasecRAWller | 7.17 | 10.85 | 10.42 | 82.41 | 28.44 |
M. tuberculosis | Chiron | 7.16 | 2.50 | 4.33 | 88.51 | 13.99 |
M. tuberculosis | Chiron-BS | 5.84 | 3.05 | 4.50 | 89.66 | 13.39 |
Human | Metrichor | 12.95 | 4.15 | 7.65 | 79.40 | 24.75 |
Human | Albacore v1.1 | 8.62 | 6.51 | 7.52 | 83.86 | 22.65 |
Human | Albacore v2 | 8.71 | 6.03 | 6.05 | 85.24 | 20.79 |
Human | BasecRAWller | 8.41 | 10.28 | 10.10 | 81.49 | 28.79 |
Human | Chiron | 9.13 | 5.14 | 9.33 | 81.54 | 23.60 |
Human | Chiron-BS | 9.30 | 5.62 | 7.87 | 82.83 | 22.79 |
Deletion, insertion, and mismatch rates (%) are defined as the number of deleted, inserted, and mismatched bases, respectively, divided by the number of bases in the reference genome for that sample (the lower the better). Identity rate (%) is defined as the number of matched bases divided by the number of bases in the reference genome for that sample (the higher the better; identity rate = 100% - deletion rate - mismatch rate). Error rate (%) is defined as the sum of the deletion, insertion, and mismatch rates (the lower the better; error rate = deletion rate + insertion rate + mismatch rate). The error rate summarizes the overall basecalling accuracy of the associated model in a single figure. The best result in each category is indicated in bold.
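The two derived columns follow directly from the definitions in the table note. As a minimal sketch, the Python snippet below recomputes them for the Lambda/Metrichor row; the function names `identity_rate` and `error_rate` are hypothetical helpers introduced here for illustration and are not part of any basecaller's tooling.

```python
# Hypothetical helpers that mirror the definitions in the note to Table 1.
# All inputs and outputs are percentages of the reference genome length.

def identity_rate(deletion: float, mismatch: float) -> float:
    """Identity rate (%) = 100 - deletion rate - mismatch rate."""
    return 100.0 - deletion - mismatch


def error_rate(deletion: float, insertion: float, mismatch: float) -> float:
    """Error rate (%) = deletion rate + insertion rate + mismatch rate."""
    return deletion + insertion + mismatch


# Lambda dataset, Metrichor basecaller (values taken from Table 1).
deletion, insertion, mismatch = 8.93, 2.38, 4.57

print(f"identity rate: {identity_rate(deletion, mismatch):.2f}%")
print(f"error rate:    {error_rate(deletion, insertion, mismatch):.2f}%")
```

Insertions count toward the error rate but not against the identity rate, so the two figures need not sum to 100%: a basecaller with many insertions, such as BasecRAWller here, can show an error rate well above 100% minus its identity rate.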