Table 1.
Species | Number of proteins | Number of LCD proteins | % LARKS | % LCD | % Glob |
---|---|---|---|---|---|
E. coli | 4293 | 199 | 2.4 | 0.6 | 86.9 |
M. tuberculosis | 3983 | 717 | 3.9 | 5.4 | 76.4 |
P. falciparum | 5152 | 2806 | 0.8 | 11.3 | 78.3 |
S. cerevisiae | 6720 | 1185 | 1.7 | 2.5 | 78.2 |
D. melanogaster | 13,473 | 3940 | 2.3 | 5.7 | 72.5 |
H. sapiens | 20,135 | 5832 | 2.1 | 3.9 | 71.2 |
Species | % LARKS ∩ LCD (actual) | % LARKS ∩ LCD (expected) | % LARKS ∩ Glob (actual) | % LARKS ∩ Glob (expected) | % LCD ∩ Glob (actual) | % LCD ∩ Glob (expected) |
---|---|---|---|---|---|---|
E. coli | 0.000 ≤ 0.022 ≤ 0.050 | 0.015 | 1.843 ≤ 1.853 ≤ 1.866 | 2.072 | 0.431 ≤ 0.437 ≤ 0.445 | 0.537 |
M. tuberculosis | 1.236 ≤ 1.267 ≤ 1.296 | 0.214 | 1.910 ≤ 1.935 ≤ 1.955 | 3.009 | 1.693 ≤ 1.730 ≤ 1.763 | 4.149 |
P. falciparum | 0.080 ≤ 0.088 ≤ 0.096 | 0.092 | 0.486 ≤ 0.490 ≤ 0.495 | 0.64 | 4.641 ≤ 4.661 ≤ 4.683 | 8.833 |
S. cerevisiae | 0.048 ≤ 0.062 ≤ 0.073 | 0.041 | 1.054 ≤ 1.063 ≤ 1.072 | 1.298 | 0.969 ≤ 0.981 ≤ 0.992 | 1.925 |
D. melanogaster | 0.346 ≤ 0.356 ≤ 0.365 | 0.128 | 1.077 ≤ 1.082 ≤ 1.092 | 1.639 | 2.016 ≤ 2.03 ≤ 2.043 | 4.123 |
H. sapiens | 0.180 ≤ 0.186 ≤ 0.194 | 0.081 | 1.124 ≤ 1.129 ≤ 1.133 | 1.491 | 1.223 ≤ 1.232 ≤ 1.240 | 2.748 |
The percentage of residues in each proteome found in LARKS, LCDs, or globular regions is shown in columns 4 to 6 of the top table. The bottom table compares the actual overlap between residue types with the expected overlap. The actual columns give the percentage of residues meeting both criteria in the form of a 95% confidence interval, written as 2.5th percentile ≤ actual overlap ≤ 97.5th percentile. The confidence intervals of the actual value reflect the variance of the data from 100 rounds of bootstrapping (see Experimental procedures for details). The expected intersection value is the fraction of LARKS residues multiplied by the fraction of LCD residues, that is, the overlap expected if the two residue types were distributed independently of each other. If the expected value is below the 2.5th percentile bound of the actual value, then LARKS are enriched in LCDs. If the expected value is above the 97.5th percentile bound, then LARKS are depleted in LCDs compared with what would be expected for that organism. The same methodology is repeated for LARKS ∩ Glob and LCD ∩ Glob. Comparing the actual LARKS ∩ LCD with the expected value indicates that LARKS are enriched in LCDs in all organisms studied except P. falciparum.
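
The comparison described in the caption can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names `expected_overlap` and `classify` are hypothetical, and the label "within interval" for an expected value that falls inside the confidence interval is an assumption, since the caption names only the enriched and depleted cases. The inputs are the rounded percentages from the top table, so the computed expected values differ slightly from the tabulated ones, presumably because the published percentages are rounded.

```python
def expected_overlap(pct_a: float, pct_b: float) -> float:
    """Expected % of residues in both classes, assuming independence.

    Both inputs are percentages of all residues in the proteome,
    so the product is divided by 100 to stay in percent units.
    """
    return pct_a * pct_b / 100.0


def classify(expected: float, ci_low: float, ci_high: float) -> str:
    """Compare the expected overlap with the bootstrap 95% interval.

    ci_low and ci_high are the 2.5th and 97.5th percentile bounds
    of the actual overlap, as listed in the bottom table.
    """
    if expected < ci_low:
        return "enriched"
    if expected > ci_high:
        return "depleted"
    return "within interval"  # assumed label; not named in the caption


# (species, % LARKS, % LCD, CI low, CI high) for LARKS ∩ LCD, from Table 1
rows = [
    ("H. sapiens", 2.1, 3.9, 0.180, 0.194),
    ("P. falciparum", 0.8, 11.3, 0.080, 0.096),
]

for species, pct_larks, pct_lcd, low, high in rows:
    exp = expected_overlap(pct_larks, pct_lcd)
    print(f"{species}: expected {exp:.3f}%, {classify(exp, low, high)}")
```

With these inputs, H. sapiens is classified as enriched and P. falciparum falls within its interval, matching the caption's conclusion for those two species. Passing % Glob in place of % LCD, together with the LARKS ∩ Glob bounds, applies the same test to the other column pairs.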