PeerJ. 2023 Jan 9;11:e14490. doi: 10.7717/peerj.14490

Table 2. Clasnip classification performance of CLso rRNA gene regions.

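| Group | 16S: # Sample | 16S: Identity (Q5) (%) | 16S: Accuracy (%) | 16-23S IGS: # Sample | 16-23S IGS: Identity (Q5) (%) | 16-23S IGS: Accuracy (%) | 50S rplJ/rplL: # Sample | 50S rplJ/rplL: Identity (Q5) (%) | 50S rplJ/rplL: Accuracy (%) |
|---|---|---|---|---|---|---|---|---|---|
| A | 10 | 93.4 | 100.0 | 6 | 90.8 | 100.0 | 6 | 97.6 | 100.0 |
| B | 9 | 92.4 | 88.9 | 1 | 97.5 | 100.0 | 3 | 97.3 | 100.0 |
| C | 11 | 93.7 | 90.9 | 21 | 90.0 | 100.0 | 31 | 100.0 | 100.0 |
| Cras1a | 13 | 100.0 | 100.0 | 15 | 95.8 | 93.3 | 18 | 99.8 | 100.0 |
| Cras1b | 3 | 100.0 | 100.0 | 3 | 100.0 | 100.0 | 3 | 99.8 | 100.0 |
| Cras2 | 3 | 100.0 | 100.0 | 4 | 98.6 | 100.0 | 4 | 99.1 | 100.0 |
| D | 10 | 98.2 | 100.0 | 24 | 87.4 | 100.0 | 17 | 96.4 | 100.0 |
| E | 5 | 100.0 | 100.0 | 7 | 92.0 | 100.0 | 8 | 98.6 | 100.0 |
| F | 1 | 100.0 | 100.0 | | | | 1 | 100.0 | 100.0 |
| G | 3 | 98.3 | 100.0 | 3 | 98.9 | 100.0 | 4 | 100.0 | 100.0 |
| H | 1 | 100.0 | 100.0 | 1 | 100.0 | 100.0 | 1 | 100.0 | 100.0 |
| H-Con | 2 | 100.0 | 100.0 | | | | | | |
| U | 1 | 100.0 | 100.0 | 1 | 100.0 | 100.0 | 5 | 99.8 | 100.0 |
| Total | 72 | | 97.2 | 86 | | 98.8 | 101 | | 100.0 |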

Note:

# Sample is the number of samples with more than 5 SNPs covered in the reference region. Identity (Q5) is the 5% quantile of the estimated identity distribution. If a new sample’s identity is greater than the identity (Q5) of a group, the new sample is classified into that group. Accuracy is the ratio of correctly classified samples to all samples with more than 5 SNPs covered in the reference region. A sample is “correctly classified” when the identity to its own group is the highest among all groups.
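
The definitions in the note can be illustrated with a minimal Python sketch. It is not Clasnip's implementation: the function names, the sample record layout (`group`, `snps_covered`, `identities`), and the linearly interpolated quantile are all assumptions made for illustration, since the note does not say how the 5% quantile is estimated.

```python
from typing import Dict, List, Sequence


def quantile(values: Sequence[float], q: float) -> float:
    """Quantile by linear interpolation between order statistics.

    Assumed convention; the note does not name the estimator used.
    """
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def passes_q5(identity: float, q5: float) -> bool:
    """Threshold rule from the note: identity must be greater than the group's Q5."""
    return identity > q5


def best_group(identities: Dict[str, float]) -> str:
    """Group with the highest identity for one sample."""
    return max(identities, key=identities.get)


def accuracy(samples: List[dict], min_snps: int = 5) -> float:
    """Percent of eligible samples whose own group has the highest identity.

    Eligible means more than `min_snps` SNPs covered in the reference region.
    """
    eligible = [s for s in samples if s["snps_covered"] > min_snps]
    if not eligible:
        raise ValueError("no sample has enough covered SNPs")
    correct = sum(1 for s in eligible if best_group(s["identities"]) == s["group"])
    return 100.0 * correct / len(eligible)


# Hypothetical data: nine eligible samples labelled "B", one of which scores
# higher against "A", plus one sample excluded for covering only 3 SNPs.
toy = [{"group": "B", "snps_covered": 12, "identities": {"A": 91.0, "B": 99.0}}
       for _ in range(8)]
toy.append({"group": "B", "snps_covered": 9, "identities": {"A": 97.0, "B": 95.0}})
toy.append({"group": "B", "snps_covered": 3, "identities": {"A": 99.0, "B": 90.0}})

print(round(accuracy(toy), 1))  # 88.9
```

With these made-up records, eight of nine eligible samples are correct, so the sketch prints 88.9, the same arithmetic as a nine-sample group with a single miss. The excluded low-coverage sample affects neither the numerator nor the denominator.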