Table 7.
Chimera classification accuracies for Perseus applied to the three denoised V2 'Uneven' data sets.
Dataset | Uneven1 | | Uneven2 | | Uneven3 | |
---|---|---|---|---|---|---|
Classification | Good | Chimeric | Good | Chimeric | Good | Chimeric |
Good | 78 (83.0%) | 16 (17.0%) | 70 (90.9%) | 7 (9.1%) | 62 (82.7%) | 13 (17.3%) |
Bimera | 7 (0.9%) | 809 (99.1%) | 9 (1.3%) | 660 (98.7%) | 10 (1.2%) | 833 (98.8%) |
Trimera | 1 (1.2%) | 80 (98.8%) | 1 (1.4%) | 70 (98.6%) | 1 (1.2%) | 81 (98.8%) |
Quadramera | 0 (0.0%) | 1 (100.0%) | 0 (0.0%) | 2 (100.0%) | -- | -- |
Unclassified | 26 (19.7%) | 106 (80.3%) | 14 (35.0%) | 26 (65.0%) | 22 (41.5%) | 31 (58.5%) |
Each row gives a separate category of denoised sequence according to its true classification: 'Good', 'Bimera', 'Trimera', 'Quadramera' or 'Unclassified'. The columns are split across the three data sets and give the number of sequences in that category flagged as good or chimeric by Perseus, using a logistic regression classifier with a 50% probability cut-off. Percentages are calculated within each row for each data set.
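As a reading aid, the short Python sketch below recomputes the row-wise percentages for the Uneven1 columns from the counts in the table. The names `UNEVEN1`, `row_percentages` and `flag` are hypothetical and are not part of Perseus. The `flag` helper only illustrates a 50% cut-off; treating a probability of exactly 0.5 as chimeric is an assumption, since the table does not state how ties are handled.

```python
# Counts for Uneven1 copied from Table 7:
# true class -> (number flagged good, number flagged chimeric).
UNEVEN1 = {
    "Good": (78, 16),
    "Bimera": (7, 809),
    "Trimera": (1, 80),
    "Quadramera": (0, 1),
    "Unclassified": (26, 106),
}


def row_percentages(good, chimeric):
    """Return (% flagged good, % flagged chimeric) within one row."""
    total = good + chimeric
    return (round(100.0 * good / total, 1), round(100.0 * chimeric / total, 1))


def flag(p_chimeric, cutoff=0.5):
    """Hypothetical helper: turn a chimera probability into a label.

    Assumption: a probability equal to the cut-off is flagged as chimeric.
    """
    return "Chimeric" if p_chimeric >= cutoff else "Good"


for true_class, (good, chimeric) in UNEVEN1.items():
    print(true_class, row_percentages(good, chimeric))
```

Each percentage pair sums to 100% within a row, which is why the 'Good' row reports the fraction of true good sequences retained while the chimera rows report the fraction of true chimeras detected.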