Table 3. Recall, precision and F-measure for M. tuberculosis F11 variants called against M. tuberculosis H37Rv by Pilon with long insert library data (Pilon) and without it (Pilon-frags), GATK UnifiedGenotyper and SAMtools.
Variant type | Pilon R | Pilon P | Pilon F | GATK R | GATK P | GATK F | SAMtools R | SAMtools P | SAMtools F | Pilon-frags R | Pilon-frags P | Pilon-frags F |
Single substitution | 0.96 | 0.98 | 0.97 | 0.85 | 0.98 | 0.91 | 0.88 | 0.93 | 0.90 | 0.94 | 0.98 | 0.96 |
Single insertion | 0.83 | 1 | 0.91 | 0.75 | 1 | 0.86 | 0.79 | 1 | 0.88 | 0.79 | 1 | 0.88 |
Single deletion | 0.91 | 0.95 | 0.93 | 0.87 | 0.9 | 0.86 | 0.87 | 1 | 0.93 | 0.87 | 0.95 | 0.91 |
Multi substitution | 1 | 0.95 | 0.97 | 0.67 | N/A | N/A | 1 | 0.98 | 0.99 | 1 | 0.95 | 0.97 |
Multi insertion | 0.63 | 0.73 | 0.68 | 0.17 | 0.79 | 0.28 | 0.21 | 0.5 | 0.30 | 0.63 | 0.76 | 0.69 |
Multi deletion | 0.73 | 0.9 | 0.81 | 0.27 | 0.75 | 0.4 | 0.39 | N/A | N/A | 0.71 | 0.87 | 0.78 |
The three rows labeled 'Single' are single-nucleotide variants. The three rows labeled 'Multi' are variants involving two or more nucleotides, including very large events that span several kb. Recall (R) is the fraction of curated events that the program called. Precision (P) is the fraction of the program's calls that were also described in the curation. The F-measure (F) is the harmonic mean of recall and precision, F = 2RP/(R + P), and provides a single measure of the trade-off between them. “N/A” indicates that all events of this type were captured in another variant category.
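The definitions above can be illustrated with a minimal Python sketch. The function names and the event counts are hypothetical, chosen only for illustration; they are not taken from the study. The final lines recompute the F-measure from the rounded R and P values reported for Pilon in the 'Single substitution' row.

```python
def recall_from_counts(called_curated: int, total_curated: int) -> float:
    """Fraction of curated events that the program called."""
    return called_curated / total_curated


def precision_from_counts(called_curated: int, total_calls: int) -> float:
    """Fraction of the program's calls that were also in the curation."""
    return called_curated / total_calls


def f_measure(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision."""
    if recall + precision == 0:
        return 0.0  # avoid division by zero when nothing was called correctly
    return 2 * recall * precision / (recall + precision)


# Hypothetical counts (assumption, not from the paper): 100 curated events,
# 98 calls made by the program, 96 of which match a curated event.
r = recall_from_counts(96, 100)
p = precision_from_counts(96, 98)
print(round(r, 2), round(p, 2), round(f_measure(r, p), 2))

# Values reported in the table for Pilon, 'Single substitution' row.
print(round(f_measure(0.96, 0.98), 2))  # 0.97
```

Because the table reports R and P rounded to two decimals, an F-measure recomputed from them may differ from the published value in the last digit.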