2018 Apr 25;19:158. doi: 10.1186/s12859-018-2158-6

Table 3.

Statistical correlation between predicted disorder content and organism characteristics

| Replicon | Characteristic | Statistic | Bacteria: Complete | Bacteria: Seg1 | Bacteria: Seg2 | Bacteria: Seg3 | Bacteria: Seg4 | Archaea: Complete | Archaea: Seg1 | Archaea: Seg2 | Archaea: Seg3 | Archaea: Seg4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chromosomes | Avg. protein len. | Correlation coef. | 0.1042 | −0.1278 | −0.0714 | 0.1220 | 0.2643 | 0.1480 | −0.3819 | 0.3125 | 0.1829 | |
| | | Sample size | 2554 | 40 | 921 | 1504 | 89 | 139 | 6 | 124 | 9 | 0 |
| | | Significance of CC | < 0.0001 | 0.4319 | 0.0303 | < 0.0001 | 0.0123 | 0.0821 | 0.4550 | 0.0004 | 0.6376 | |
| | G + C content | Correlation coef. | 0.6060 | 0.3054 | 0.2793 | 0.2741 | 0.3052 | 0.2667 | −1.000 | 0.0653 | 0.1818 | 0.7369 |
| | | Sample size | 2554 | 151 | 1043 | 756 | 604 | 139 | 2 | 77 | 54 | 6 |
| | | Significance of CC | < 0.0001 | 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 | 0.0015 | | 0.5726 | 0.1883 | 0.0947 |
| | Proteome size | Correlation coef. | 0.2950 | 0.1345 | 0.0689 | 0.3442 | 0.2377 | 0.2978 | 0.0817 | 0.5330 | | |
| | | Sample size | 2554 | 1128 | 1118 | 276 | 32 | 139 | 115 | 24 | 0 | 0 |
| | | Significance of CC | < 0.0001 | < 0.0001 | 0.0212 | < 0.0001 | 0.1902 | 0.0004 | 0.3854 | 0.0073 | | |
| | Genome size | Correlation coef. | 0.3019 | 0.1592 | 0.1562 | 0.1159 | 0.8357 | 0.3585 | 0.3341 | −0.8534 | | |
| | | Sample size | 2554 | 1469 | 995 | 87 | 3 | 139 | 136 | 3 | 0 | 0 |
| | | Significance of CC | < 0.0001 | < 0.0001 | < 0.0001 | 0.2851 | | < 0.0001 | < 0.0001 | | | |
| Plasmids | Avg. protein len. | Correlation coef. | −0.0570 | 0.7456 | 0.0207 | −0.1596 | 0.2914 | 0.0408 | / | −0.0671 | −1.0000 | / |
| | | Sample size | 877 | 4 | 371 | 491 | 11 | 20 | 1 | 17 | 2 | 0 |
| | | Significance of CC | 0.0916 | | 0.6911 | 0.0004 | 0.3846 | 0.8644 | | 0.7980 | | |
| | G + C content | Correlation coef. | 0.3324 | 0.4513 | 0.0693 | 0.0844 | 0.3494 | 0.5399 | 0.5155 | 0.0494 | −0.6586 | / |
| | | Sample size | 877 | 123 | 319 | 230 | 205 | 20 | 6 | 8 | 5 | 1 |
| | | Significance of CC | < 0.0001 | < 0.0001 | 0.2171 | 0.2022 | < 0.0001 | 0.0140 | 0.2952 | 0.9075 | | |
| | Proteome size | Correlation coef. | 0.1976 | 0.4958 | 0.0008 | −0.1792 | 0.4609 | 0.0863 | 0.0866 | 0.1977 | / | / |
| | | Sample size | 877 | 215 | 392 | 238 | 32 | 20 | 13 | 7 | 0 | 0 |
| | | Significance of CC | < 0.0001 | < 0.0001 | 0.9874 | 0.0056 | 0.0079 | 0.7175 | 0.7785 | 0.6709 | | |
| | Genome size | Correlation coef. | 0.2048 | 0.4079 | 0.0518 | 0.1335 | 0.5414 | 0.0645 | −0.1670 | −0.9999 | / | / |
| | | Sample size | 877 | 259 | 460 | 137 | 21 | 20 | 17 | 3 | 0 | 0 |
| | | Significance of CC | < 0.0001 | < 0.0001 | 0.2676 | 0.1199 | 0.0113 | 0.7870 | 0.5218 | | | |

The table reports the statistical correlation between predicted disorder content and several organism characteristics. Disorder content is predicted with the IsUnstruct predictor and measured as the percentage of amino acids in long disordered regions (≥ 30 AA).

For each sample set (archaeal/bacterial chromosomes and plasmids) and each observed characteristic, the samples are additionally classified into 4 segments (quarters) by the range of that characteristic. Correlations are computed for the whole sample and for each segment, to determine whether the correlation is stronger in some segment (quarter) of the characteristic's range. Significant correlations are emphasized in boldface.
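
The segment-wise procedure described in the caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes the correlation coefficient is Pearson's r and that the four segments are equal-width quarters of the characteristic's range (an interpretation suggested by the unequal sample sizes per segment). The function names `pearson_r` and `segment_correlations` are hypothetical, and the significance test is omitted.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient, or None when it is undefined
    (fewer than two points, or zero variance in either variable)."""
    n = len(xs)
    if n < 2:
        return None
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def segment_correlations(characteristic, disorder, n_segments=4):
    """Return {label: (sample_size, r)} for the complete sample and for
    each equal-width segment of the characteristic's range.

    Assumption: segments split [min, max] of the characteristic into
    n_segments intervals of equal width; the maximum falls in the last one.
    """
    lo, hi = min(characteristic), max(characteristic)
    width = (hi - lo) / n_segments
    buckets = [([], []) for _ in range(n_segments)]
    for x, y in zip(characteristic, disorder):
        # Clamp so that x == hi lands in the last segment.
        idx = min(int((x - lo) / width), n_segments - 1) if width > 0 else 0
        buckets[idx][0].append(x)
        buckets[idx][1].append(y)

    result = {"Complete": (len(characteristic),
                           pearson_r(characteristic, disorder))}
    for i, (xs, ys) in enumerate(buckets, start=1):
        result[f"Seg{i}"] = (len(xs), pearson_r(xs, ys))
    return result
```

Calling `segment_correlations(gc_content, disorder_percent)` with two equal-length lists (hypothetical variable names) returns one `(sample size, r)` pair per table column. Note that any segment holding exactly two distinct points yields r = ±1 by construction, which is consistent with the −1.000 entries reported for segments with a sample size of 2.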