Figure 3.
Overview of the ClassiCOL pipeline. (A) Schematic of the ClassiCOL algorithm workflow, starting from a search engine output file. The numbers shown in the table were chosen at random to highlight the possible difference between two taxonomic branches. (B) Schematic overview of peptide reuse from independent mutations after the speciation event, which are important for classification across different taxonomic levels. The peptides shown in the scheme are real peptides, from different spectra, that were found in the goat reference data set from Rüther et al.8 (C) Visual representation of the collagen-species bicluster. The cluster is zoomed in at the Pecora family level on a sample from Rangifer tarandus (RMC42); the lighter the color in the heatmap, the greater the coverage for that collagen sequence. The y-axis represents the species taxonomic tree, and the x-axis represents the homology between collagen sequences, colored by relatedness. (D) Output sunburst plot of the Rangifer tarandus sample, where a higher score indicates a higher likelihood that the named taxon approximates the sample content. All ambiguity is retained in the final output, whether it originates from the isoBLAST approach or directly from the search engine results. The interactive version is available as Figure S2.
