2020 Jul 29;21:519. doi: 10.1186/s12864-020-06910-6

Fig. 3.


Evaluating the effect of assembly quality on gene annotation. a Hybrid assembly improves gene prediction and annotation accuracy. The bar plot on the left shows the total number of predicted coding sequences (CDSs) in the various assemblies of each genome. The bar plot on the right shows the ratio of incomplete to complete CDSs in each assembly. CDSs were predicted with Prodigal and aligned to the UniProtKB/TrEMBL protein database using DIAMOND blastp. The ratio of query sequence length to subject sequence length was then used as a proxy for the completeness of each predicted CDS (threshold of ≥0.95). b Pangenome analysis of different assemblies of related strains, including Escherichia fergusonii (GC162 vs. GC505), Christensenella massiliensis (GC249 vs. GC441) and Blautia obeum (GC481 vs. GC508). Phylogenetic trees were generated by alignment of core and accessory genes identified by PIRATE. The colour ramp indicates the Markov clustering (MCL) threshold at which each gene family was classified (the higher the threshold, the less divergent the gene family is across assemblies).
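The completeness proxy described for panel a can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes the alignment hits are available as tab-separated lines with the columns `qseqid`, `sseqid`, `qlen` and `slen` (a column layout chosen here for illustration), that the first hit listed for each query is the one to use, and that a CDS counts as complete when its query-to-subject length ratio is at least 0.95. How CDSs without any hit were handled is not stated in the caption, so they are simply ignored here. The function names are hypothetical.

```python
import csv
import io

# Threshold taken from the caption; everything else below is an assumption.
COMPLETENESS_THRESHOLD = 0.95


def read_first_hits(handle):
    """Yield (qseqid, qlen, slen) for the first hit listed per query.

    Assumes tab-separated columns: qseqid, sseqid, qlen, slen.
    """
    seen = set()
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        qseqid, _sseqid, qlen, slen = fields[:4]
        if qseqid in seen:
            continue  # keep only the first hit of each predicted CDS
        seen.add(qseqid)
        yield qseqid, int(qlen), int(slen)


def classify_cds(hits, threshold=COMPLETENESS_THRESHOLD):
    """Split CDS identifiers into (complete, incomplete) lists.

    A CDS is treated as complete when qlen / slen >= threshold.
    """
    complete, incomplete = [], []
    for qseqid, qlen, slen in hits:
        ratio = qlen / slen
        if ratio >= threshold:
            complete.append(qseqid)
        else:
            incomplete.append(qseqid)
    return complete, incomplete


def incomplete_to_complete_ratio(complete, incomplete):
    """Return the ratio plotted in the right-hand bar plot of panel a."""
    if not complete:
        return float("inf")
    return len(incomplete) / len(complete)


# Made-up example hits, for illustration only.
EXAMPLE_HITS = (
    "cds_1\tP1\t300\t300\n"
    "cds_2\tP2\t150\t300\n"
    "cds_2\tP9\t150\t160\n"  # second hit of cds_2, ignored
    "cds_3\tP3\t290\t300\n"
)

if __name__ == "__main__":
    hits = read_first_hits(io.StringIO(EXAMPLE_HITS))
    complete, incomplete = classify_cds(hits)
    print("complete:", complete)
    print("incomplete:", incomplete)
    print("ratio:", incomplete_to_complete_ratio(complete, incomplete))
```

Under these assumptions, a fragmented assembly that splits genes across contig ends would produce more truncated CDSs and therefore a higher incomplete-to-complete ratio, which is the quantity the right-hand bar plot compares across assemblies.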