a SSR incidence and motif length. Incidence decreased as repeat motif length increased, the expected inverse relationship. However, two observations should be noted. First, Gammapolyomavirus is the only genus in which di-nucleotide repeat motifs have the highest incidence; in all other genera, mono-nucleotide motifs are the most represented, as expected. Second, the fall in incidence from mono- to di-nucleotide motif SSRs is smallest in Deltapolyomavirus.
b Mono-nucleotide motif composition. In spite of varying GC percentage (Fig. 1), the mono-nucleotide motif composition is strongly biased towards A/T across all genera. “Total” represents the overall data.
c Di-nucleotide motif composition. Though AT/TA is the most represented di-nucleotide repeat motif overall, this does not hold in every genus: Alphapolyomavirus is the exception, where CT/TC has the highest incidence, closely followed by AT/TA.
d Distribution of SSRs (%) across different proteins. Overall, LTAg accounted for over 47% of all SSRs in the coding region, with VP1 a distant second at around 16%. Only the six proteins that accounted for the most SSRs are shown individually; the rest are grouped as “Others”.
e SSR contribution (%) by proteins across different genera. Subtle variations are visible here. Although the LTAg gene accounts for the most SSRs in the coding genome in every genus, its contribution varies from 35% in Gammapolyomavirus to almost 50% in Betapolyomavirus.