2021 Apr 14;22:265. doi: 10.1186/s12864-021-07583-5

Table 1.

General and BUSCO benchmark statistics for homology grouping performed under settings D1 to D8

| Clustering setting | Minimum sequence similarity | Homology groups | Single copy groups | Correct groups^a | True Positives | False Positives | False Negatives | Recall | Precision | F-score |
|---|---|---|---|---|---|---|---|---|---|---|
| D1 | 95% | 49,290 | 812 | 395 | 128,085 | 14 | 3905 | 0.9704 | 0.9999 | 0.9849 |
| D2 | 85% | 28,896 | 1615 | 629 | 131,795 | 24 | 195 | 0.9985 | 0.9998 | 0.9992 |
| D3 | 75% | 24,650 | 1690 | 638 | 131,952 | 35 | 38 | 0.9997 | 0.9997 | 0.9997 |
| D4 | 65% | 22,347 | 1699 | 640 | 131,975 | 38 | 15 | 0.9999 | 0.9997 | 0.9998 |
| D5 | 55% | 20,636 | 1683 | 639 | 131,975 | 44 | 15 | 0.9999 | 0.9997 | 0.9998 |
| D6 | 45% | 19,234 | 1653 | 633 | 131,985 | 245 | 5 | 1.0000 | 0.9981 | 0.9991 |
| D7 | 35% | 17,908 | 1612 | 623 | 131,985 | 508 | 5 | 1.0000 | 0.9962 | 0.9981 |
| D8 | 25% | 16,486 | 1486 | 607 | 131,986 | 7002 | 4 | 1.0000 | 0.9496 | 0.9741 |

a Correct groups is the number of groups that correctly organize one of the 670 ‘complete’ and ‘non-duplicated’ Enterobacteriaceae BUSCO genes. The calculation of recall, precision, and F-score is explained in Methods.
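
The recall, precision, and F-score columns can be recomputed from the true positive, false positive, and false negative counts. The sketch below assumes the standard definitions (recall = TP / (TP + FN), precision = TP / (TP + FP), F-score as their harmonic mean). The Methods section is not reproduced here, so treating these as the authors' exact formulas is an assumption, and the function name `benchmark_scores` is purely illustrative.

```python
def benchmark_scores(tp, fp, fn):
    """Return (recall, precision, f_score) from raw counts.

    Assumes the standard definitions; the paper's Methods section
    is not shown here, so this is a sketch rather than the authors' code.
    """
    recall = tp / (tp + fn)        # share of expected pairs that were recovered
    precision = tp / (tp + fp)     # share of reported pairs that are correct
    f_score = 2 * precision * recall / (precision + recall)  # harmonic mean
    return recall, precision, f_score


# Counts taken from row D1 of the table (95% minimum sequence similarity).
recall, precision, f_score = benchmark_scores(tp=128_085, fp=14, fn=3_905)
print(f"{recall:.4f} {precision:.4f} {f_score:.4f}")  # prints "0.9704 0.9999 0.9849"
```

Rounded to four decimals, the result matches the D1 row. The same call can be used to cross-check any other row against its counts.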