Author manuscript; available in PMC: 2025 Jul 4.
Published in final edited form as: J Infect. 2024 Sep 7;89(5):106265. doi: 10.1016/j.jinf.2024.106265

Fig. 3. Overview of C. coli attribution study.

(A) Random Forest analyses were performed on training data from chicken, cattle, turkey, and pig sources to score patterns of unitigs according to their ability to predict isolate source. Dots within each box show how common each pattern is in the two intersecting hosts. Patterns within the small, dotted boxes are common in the host on the horizontal axis, but rare in the host labelled on the vertical axis. Patterns of unitigs with highly discriminatory mutual information (MI) scores were used to select markers for different hosts. (C) Genes from which unitig markers were selected and assigned allele numbers for attribution. (D) Markers were tested on a subset of the data for accuracy (overall accuracy > 90%). (E) Chicken and turkey markers were combined to predict the source of 696 C. coli infection cases from poultry, cattle, and pigs.
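The mutual information (MI) scoring mentioned in the caption can be illustrated with a minimal sketch. It assumes each unitig pattern is a presence/absence vector across isolates and that MI is computed against the host label. The function name `mutual_information` and the toy data are hypothetical; the caption does not specify the authors' implementation, and the study's pipeline may differ.

```python
from collections import Counter
from math import log2


def mutual_information(pattern, hosts):
    """Mutual information (in bits) between a unitig presence/absence
    pattern and the host labels of the same isolates.

    Hypothetical helper for illustration only, not the study's code.
    """
    n = len(pattern)
    if n == 0 or n != len(hosts):
        raise ValueError("pattern and hosts must be non-empty and equal in length")

    joint = Counter(zip(pattern, hosts))  # counts of (presence, host) pairs
    pattern_counts = Counter(pattern)     # marginal counts of presence/absence
    host_counts = Counter(hosts)          # marginal counts of each host

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (pattern_counts[x] * host_counts[y]))
    return mi


# Toy data (assumed): four isolates, two hosts.
hosts = ["chicken", "chicken", "pig", "pig"]
discriminatory = [1, 1, 0, 0]   # present only in chicken isolates
uninformative = [1, 0, 1, 0]    # equally common in both hosts

score_discriminatory = mutual_information(discriminatory, hosts)
score_uninformative = mutual_information(uninformative, hosts)
```

In this toy case the pattern found only in chicken isolates receives a higher score than the pattern that is equally common in both hosts, which is the property that makes a pattern useful as a host marker.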