Table 3.
Statistical correlation between predicted disorder content and organism characteristics
| Sample set | Characteristic | Statistic | Bacteria: Complete | Bacteria: Seg1 | Bacteria: Seg2 | Bacteria: Seg3 | Bacteria: Seg4 | Archaea: Complete | Archaea: Seg1 | Archaea: Seg2 | Archaea: Seg3 | Archaea: Seg4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chromosomes | Avg. protein len. | Correlation coef. | 0.1042 | −0.1278 | −0.0714 | 0.1220 | 0.2643 | 0.1480 | −0.3819 | 0.3125 | 0.1829 | – |
| | | Sample size | 2554 | 40 | 921 | 1504 | 89 | 139 | 6 | 124 | 9 | 0 |
| | | Significance of CC | < 0.0001 | 0.4319 | 0.0303 | < 0.0001 | 0.0123 | 0.0821 | 0.4550 | 0.0004 | 0.6376 | – |
| | G + C content | Correlation coef. | 0.6060 | 0.3054 | 0.2793 | 0.2741 | 0.3052 | 0.2667 | −1.000 | 0.0653 | 0.1818 | 0.7369 |
| | | Sample size | 2554 | 151 | 1043 | 756 | 604 | 139 | 2 | 77 | 54 | 6 |
| | | Significance of CC | < 0.0001 | 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 | 0.0015 | – | 0.5726 | 0.1883 | 0.0947 |
| | Proteome size | Correlation coef. | 0.2950 | 0.1345 | 0.0689 | 0.3442 | 0.2377 | 0.2978 | 0.0817 | 0.5330 | – | – |
| | | Sample size | 2554 | 1128 | 1118 | 276 | 32 | 139 | 115 | 24 | 0 | 0 |
| | | Significance of CC | < 0.0001 | < 0.0001 | 0.0212 | < 0.0001 | 0.1902 | 0.0004 | 0.3854 | 0.0073 | – | – |
| | Genome size | Correlation coef. | 0.3019 | 0.1592 | 0.1562 | 0.1159 | 0.8357 | 0.3585 | 0.3341 | −0.8534 | – | – |
| | | Sample size | 2554 | 1469 | 995 | 87 | 3 | 139 | 136 | 3 | 0 | 0 |
| | | Significance of CC | < 0.0001 | < 0.0001 | < 0.0001 | 0.2851 | – | < 0.0001 | < 0.0001 | – | – | – |
| Plasmids | Avg. protein len. | Correlation coef. | −0.0570 | 0.7456 | 0.0207 | −0.1596 | 0.2914 | 0.0408 | / | −0.0671 | −1.0000 | / |
| | | Sample size | 877 | 4 | 371 | 491 | 11 | 20 | 1 | 17 | 2 | 0 |
| | | Significance of CC | 0.0916 | – | 0.6911 | 0.0004 | 0.3846 | 0.8644 | – | 0.7980 | – | – |
| | G + C content | Correlation coef. | 0.3324 | 0.4513 | 0.0693 | 0.0844 | 0.3494 | 0.5399 | 0.5155 | 0.0494 | −0.6586 | / |
| | | Sample size | 877 | 123 | 319 | 230 | 205 | 20 | 6 | 8 | 5 | 1 |
| | | Significance of CC | < 0.0001 | < 0.0001 | 0.2171 | 0.2022 | < 0.0001 | 0.0140 | 0.2952 | 0.9075 | – | – |
| | Proteome size | Correlation coef. | 0.1976 | 0.4958 | 0.0008 | −0.1792 | 0.4609 | 0.0863 | 0.0866 | 0.1977 | / | / |
| | | Sample size | 877 | 215 | 392 | 238 | 32 | 20 | 13 | 7 | 0 | 0 |
| | | Significance of CC | < 0.0001 | < 0.0001 | 0.9874 | 0.0056 | 0.0079 | 0.7175 | 0.7785 | 0.6709 | – | – |
| | Genome size | Correlation coef. | 0.2048 | 0.4079 | 0.0518 | 0.1335 | 0.5414 | 0.0645 | −0.1670 | −0.9999 | / | / |
| | | Sample size | 877 | 259 | 460 | 137 | 21 | 20 | 17 | 3 | 0 | 0 |
| | | Significance of CC | < 0.0001 | < 0.0001 | 0.2676 | 0.1199 | 0.0113 | 0.7870 | 0.5218 | – | – | – |
The table shows the statistical correlation between predicted disorder content and various organism characteristics. Disorder content is predicted with the IsUnstruct predictor and measured as the percentage of amino acids in long disordered regions (≥ 30 AA).
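The disorder measure described here can be illustrated with a minimal Python sketch. It assumes the predictor output has already been reduced to one boolean flag per residue (`True` = predicted disordered); that input format, the function name `long_disorder_percent`, and the pooling of residues across all proteins of a replicon are illustrative assumptions, not details of IsUnstruct or of the original analysis.

```python
from itertools import groupby


def long_disorder_percent(proteins, min_len=30):
    """Percentage of residues lying in disordered runs of at least min_len residues.

    proteins: iterable of per-residue flag lists (True = predicted disordered).
    This input format is a hypothetical simplification of a predictor's output.
    """
    total = 0
    in_long = 0
    for flags in proteins:
        total += len(flags)
        # groupby collapses consecutive identical flags into runs
        for is_disordered, run in groupby(flags):
            run_len = sum(1 for _ in run)
            if is_disordered and run_len >= min_len:
                in_long += run_len
    return 100.0 * in_long / total if total else 0.0


# Toy data: only the first protein contains a run of 30 or more disordered residues.
toy_proteins = [
    [False] * 10 + [True] * 30 + [False] * 10,
    [True] * 29 + [False] * 21,
]
toy_percent = long_disorder_percent(toy_proteins)
```

In this toy input, 30 of the 100 residues fall in a long disordered region; the 29-residue run in the second protein is below the threshold and does not count.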
For each sample set (archaeal/bacterial chromosomes and plasmids) and each observed characteristic, the samples are additionally divided into four segments (quarters) by the range of that characteristic. Correlations are computed for the whole sample and for each segment, to determine whether the correlation is stronger in some segment (quarter) of the characteristic's range. Significant correlations are emphasized in boldface.
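The segment-wise procedure can be sketched as follows. This is a rough illustration under two assumptions that the legend does not state outright: that "quarters by range" means four equal-width bins between the minimum and maximum of the characteristic (the uneven sample sizes in the table suggest this rather than quantiles), and that the coefficient is Pearson's. The names `pearson` and `segment_correlations` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient, or None when it is undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # no variance in one of the variables
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def segment_correlations(characteristic, disorder, n_segments=4):
    """Return {label: (sample_size, coefficient)} for the whole sample and each segment.

    Segments are equal-width bins over the range of the characteristic
    (an assumption about how the quarters are defined).
    """
    lo, hi = min(characteristic), max(characteristic)
    width = (hi - lo) / n_segments
    groups = [([], []) for _ in range(n_segments)]
    for x, y in zip(characteristic, disorder):
        # The maximum value is placed in the last segment.
        idx = min(int((x - lo) / width), n_segments - 1) if width > 0 else 0
        groups[idx][0].append(x)
        groups[idx][1].append(y)
    results = {"Complete": (len(characteristic), pearson(characteristic, disorder))}
    for i, (xs, ys) in enumerate(groups, start=1):
        results[f"Seg{i}"] = (len(xs), pearson(xs, ys))
    return results


# Toy data: disorder grows linearly with the characteristic.
toy_x = list(range(9))
toy_y = [2 * x for x in toy_x]
toy_result = segment_correlations(toy_x, toy_y)
```

Empty or single-sample segments yield `None`, corresponding to cells in the table where no coefficient is reported. The sketch omits the significance values; if SciPy is available, `scipy.stats.pearsonr` returns both the coefficient and a p-value.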