Annotation refinement and classification of MGE protein sequences. Each of the 2,790 gene names passed to refinement typically occurred in multiple protein clusters or families. For each cluster, one named representative was selected, and its putative function was compared with the literature-derived descriptions recovered from the abstract analysis. If the UniProt/NCBI entry did not support a link between the gene name and the function, the protein was annotated through literature review by one of two researchers, and uncertainties and disagreements were settled by discussion. Protein families that were ultimately confirmed to perform a target function were assigned a major and a minor mobileOG category.
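
The selection and triage steps described above can be illustrated with a minimal sketch. It is not the pipeline used in this work: the `Protein` record, the function names, and the toy records are all hypothetical. The case-insensitive substring match is a simplified stand-in for checking whether a database entry supports the literature-derived function. The sketch only shows how one named representative per cluster might be picked and how clusters lacking database support could be queued for manual literature review.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Protein:
    """Hypothetical record for one clustered protein sequence."""
    accession: str
    cluster_id: str
    gene_name: Optional[str]  # None when the entry carries no gene name
    db_function: str          # function text from the UniProt/NCBI entry


def pick_representatives(proteins: Iterable[Protein]) -> Dict[str, Protein]:
    """Return the first named protein encountered in each cluster."""
    reps: Dict[str, Protein] = {}
    for p in proteins:
        if p.gene_name and p.cluster_id not in reps:
            reps[p.cluster_id] = p
    return reps


def needs_manual_review(rep: Protein, literature: Dict[str, str]) -> bool:
    """True when the database text does not mention the literature-derived function.

    Assumption: a substring match stands in for the expert comparison.
    """
    expected = literature.get(rep.gene_name or "")
    if expected is None:
        return True
    return expected.lower() not in rep.db_function.lower()


# Toy records for illustration only; they are not data from the study.
proteins = [
    Protein("P1", "c1", None, "hypothetical protein"),
    Protein("P2", "c1", "tnpA", "Transposase for insertion sequence"),
    Protein("P3", "c2", "tnpA", "uncharacterized protein"),
]
literature = {"tnpA": "transposase"}  # gene name -> literature-derived function

reps = pick_representatives(proteins)
review_queue = sorted(
    cid for cid, rep in reps.items() if needs_manual_review(rep, literature)
)
print(review_queue)
```

In this toy case the same gene name appears in two clusters, mirroring the situation in which a name belongs to multiple clusters or families. Only the cluster whose representative lacks supporting database text is routed to manual review.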