Percentage of the category 2 comparisons in which the proto-oncoprotein set had a significantly greater frequency of rare k-mers (p), the control set (i.e. the GO proteins) had a significantly greater frequency of rare k-mers (C), or the difference in rare k-mer frequency between the two sets of proteins was not statistically significant (n.s.). Differences with a p-value below 0.05 were considered significant. Data are shown for k = 5, 6 and 7, and for three definitions of a rare k-mer: found zero times, two or fewer times, or five or fewer times in proteome C. Percentages may not sum to 100 due to rounding.