Table 5.
Pipeline | | | Min support | Precision | Recall | Time (s) | No. of correct lineages |
---|---|---|---|---|---|---|---|
A | 5 | 20 | 2 | .953 | .947 | 38 | 40 |
A | 5 | 20 | 5 | .951 | .945 | 19 | 40 |
B | 5 | 20 | 2 | .992 | .967 | 48 | 40 |
B | 5 | 20 | 5 | .993 | .960 | 21 | 40 |
A | 50 | 50 | 2 | .933 | .918 | 897 | 40 |
A | 50 | 50 | 5 | .942 | .948 | 355 | 40 |
B | 50 | 50 | 2 | .960 | .962 | 2465 | 40 |
B | 50 | 50 | 5 | .972 | .960 | 677 | 40 |
For each catalog, we report the precision and recall achieved by MALVIRUS in genotyping its variations, the average running time in seconds, and the number of input samples (out of 41) assigned to the correct lineage. We considered 8 catalogs, built using pipeline A or B on the set of assemblies retrieved from GISAID, which were prefiltered and then subsampled using different combinations of two parameters (second and third columns). In addition, we filtered out from each catalog all variations present in fewer than either 2 or 5 assemblies (Min support column).
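As an illustration of the minimum-support filter and of the two accuracy measures in the table, here is a minimal Python sketch. It is not the MALVIRUS implementation: the function names, the toy variation labels, and the treatment of genotyping results as plain sets of variations are all assumptions made for the example.

```python
from collections import Counter


def filter_by_min_support(assembly_variants, min_support):
    """Keep variations observed in at least `min_support` assemblies.

    `assembly_variants` maps an assembly name to the variations it carries
    (a hypothetical input layout chosen for this sketch).
    """
    counts = Counter()
    for variants in assembly_variants.values():
        counts.update(set(variants))  # count each variation once per assembly
    return {v for v, n in counts.items() if n >= min_support}


def precision_recall(called, truth):
    """Precision and recall of a set of called variations against a truth set."""
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy data: three assemblies and made-up variation labels.
assemblies = {
    "asm1": ["var_a", "var_b"],
    "asm2": ["var_a", "var_b"],
    "asm3": ["var_a", "var_c"],
}

# With a minimum support of 2, var_c (seen in one assembly) is dropped.
catalog = filter_by_min_support(assemblies, min_support=2)
precision, recall = precision_recall(catalog, {"var_a", "var_c"})
```

Raising the minimum support shrinks the catalog, which is consistent with the shorter running times reported in the rows with Min support 5.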