Figure 3.
Gene name disambiguation algorithm. A bipartite graph models the equivalence relations between locus tags (upper nodes) and gene names (lower nodes). Vertical links connect canonical gene names to their locus tags, whereas diagonal links connect synonyms to locus tags. Mapping a gene name to a locus tag may follow any vertical link, but diagonal links are forbidden (red node) unless the node's degree is exactly one (e.g. yvbD). In this example, opuCB, dnaE, dnaG, dnaN, dnaA and dnaX are resolved via their canonical gene names; yvbD is found to be a synonym of opuCB, so both names refer to the same gene; dnaH cannot be resolved because it is degenerate (node degree greater than one and no vertical link marking it as a canonical gene name).
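The resolution rule described in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The locus tags (`TAG_1` to `TAG_6`) are hypothetical placeholders, since the caption does not give the real identifiers, and the two tags linked to dnaH are chosen arbitrarily to reproduce a degree greater than one. The sketch also assumes that each name is canonical for at most one locus tag.

```python
from typing import Dict, Optional, Set

# Vertical links: canonical gene name -> locus tag.
# All locus tags below are hypothetical placeholders.
CANONICAL: Dict[str, str] = {
    "opuCB": "TAG_1",
    "dnaE": "TAG_2",
    "dnaG": "TAG_3",
    "dnaN": "TAG_4",
    "dnaA": "TAG_5",
    "dnaX": "TAG_6",
}

# Diagonal links: synonym -> set of locus tags it is attached to.
# The pair of tags for dnaH is an arbitrary assumption for illustration.
SYNONYMS: Dict[str, Set[str]] = {
    "yvbD": {"TAG_1"},
    "dnaH": {"TAG_4", "TAG_6"},
}


def resolve(name: str) -> Optional[str]:
    """Map a gene name to a locus tag, or return None if it is ambiguous."""
    # A vertical link always wins: the name is canonical.
    if name in CANONICAL:
        return CANONICAL[name]
    # Without a vertical link, a diagonal link may be followed
    # only when it is the node's single link (degree exactly one).
    tags = SYNONYMS.get(name, set())
    if len(tags) == 1:
        return next(iter(tags))
    # Unknown name, or degenerate synonym (degree greater than one).
    return None
```

With these assumed links, `resolve("yvbD")` and `resolve("opuCB")` return the same tag, while `resolve("dnaH")` returns `None` because the name is degenerate.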