Table 2.
Database name | Number of proteins | Number of proteins with STRs | % of proteins with STRs | Median (a) | Average (a) | Standard deviation (a) | Number of clusters (b) |
---|---|---|---|---|---|---|---|
UniProtKB/Swiss-Prot (total) | 554 241 | 28 003 | 5.05% | 14.75 | 15.14 | 3.69 | 6 237 |
Archaea | 19 525 | 351 | 1.80% | 10.71 | 10.63 | 1.27 | 45 |
Bacteria | 333 691 | 6 794 | 2.04% | 17.38 | 17.45 | 2.66 | 1 048 |
Euk: Fungi | 33 613 | 3 996 | 11.89% | 13.46 | 13.79 | 3.65 | 893 |
Euk: Invertebrata | 27 607 | 3 372 | 12.21% | 17.34 | 18.62 | 7.95 | 812 |
Euk: Vertebrata | 18 292 | 1 461 | 7.99% | 13.66 | 13.90 | 2.42 | 1 801 |
Euk: Plants | 42 101 | 3 601 | 8.55% | 12.51 | 12.82 | 2.98 | 795 |
Viruses | 16 852 | 889 | 5.28% | 14.07 | 14.15 | 2.57 | 203 |
(a) Repetitive region length, measured in amino acid residues.
(b) Clustering was used to define repeat classes. If a protein contains three different, co-localized STRs, the clustering method produces six clusters: three with regular STRs and three with fused repeats. See the supplementary material for more information.
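The percentage column can be recomputed from the two count columns as a quick consistency check. The following is a minimal sketch, not part of the original analysis: the counts are copied from the table, while the names `counts` and `str_percentage` are illustrative assumptions.

```python
# Counts copied from Table 2: (number of proteins, number of proteins with STRs).
# The dictionary and function names are hypothetical, chosen for this sketch only.
counts = {
    "UniProtKB/Swiss-Prot (total)": (554241, 28003),
    "Archaea": (19525, 351),
    "Bacteria": (333691, 6794),
    "Euk: Fungi": (33613, 3996),
    "Euk: Invertebrata": (27607, 3372),
    "Euk: Vertebrata": (18292, 1461),
    "Euk: Plants": (42101, 3601),
    "Viruses": (16852, 889),
}


def str_percentage(total, with_strs):
    """Return the share of proteins with STRs as a percentage (two decimals)."""
    return round(100.0 * with_strs / total, 2)


# Recompute the "% of proteins with STRs" column for every row.
percentages = {name: str_percentage(*pair) for name, pair in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")
```

Rounded to two decimals, the recomputed values should agree with the percentages listed in the table.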