Fig. 5. The WIS reference set is more extensively annotated than the UNITN reference set.
Count per genome of each type of annotated segment (x axis); each genome is a dot, the bar shows the mean over all genomes, and the y axis is on a log10 scale. Segment types are protein-coding sequences (CDS), transfer RNA (tRNA), miscellaneous RNA (miscRNA), ribosomal RNA (rRNA), transfer-messenger RNA (tmRNA), and repeat regions. Counts are stratified by reference set: WIS (n = 3594 genomes) in blue, UNITN (n = 4930) in red, and UHGG (n = 4744) in brown. Bonferroni-corrected P value annotations (Mann–Whitney U test): not significant (ns), q > 0.05; *q < 0.05; **q < 0.01; ***q < 0.001; ****q < 0.0001.
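The significance annotations can be illustrated with a minimal sketch of how a Bonferroni correction and the star thresholds listed in the caption might be applied. The function names, the example P values, and the assumption that all listed comparisons form one family of tests are hypothetical; the caption does not state how many comparisons were corrected for, and this is not necessarily the code used to produce the figure.

```python
def bonferroni(p_values):
    """Multiply each raw P value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def stars(q):
    """Map a corrected P value (q) to the annotation listed in the caption.

    The caption leaves q == 0.05 undefined; this sketch labels it "ns".
    """
    if q < 0.0001:
        return "****"
    if q < 0.001:
        return "***"
    if q < 0.01:
        return "**"
    if q < 0.05:
        return "*"
    return "ns"


# Hypothetical raw P values, one per pairwise comparison (not from the paper).
raw_p = [0.00001, 0.004, 0.02, 0.3]
corrected = bonferroni(raw_p)
labels = [stars(q) for q in corrected]
```

With these illustrative inputs, a raw P value of 0.02 is no longer significant after correcting for four tests, which shows why the annotations are based on q rather than on the uncorrected P value.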