2021 Sep 16;297(4):101194. doi: 10.1016/j.jbc.2021.101194

Table 1.

Residue content of proteomes

| Species | Number of proteins | Number of LCD proteins | % LARKS | % LCD | % Glob |
|---|---|---|---|---|---|
| E. coli | 4293 | 199 | 2.4 | 0.6 | 86.9 |
| M. tuberculosis | 3983 | 717 | 3.9 | 5.4 | 76.4 |
| P. falciparum | 5152 | 2806 | 0.8 | 11.3 | 78.3 |
| S. cerevisiae | 6720 | 1185 | 1.7 | 2.5 | 78.2 |
| D. melanogaster | 13,473 | 3940 | 2.3 | 5.7 | 72.5 |
| H. sapiens | 20,135 | 5832 | 2.1 | 3.9 | 71.2 |

| Species | % LARKS ∩ LCD (actual) | % LARKS ∩ LCD (expected) | % LARKS ∩ Glob (actual) | % LARKS ∩ Glob (expected) | % LCD ∩ Glob (actual) | % LCD ∩ Glob (expected) |
|---|---|---|---|---|---|---|
| E. coli | 0.000 ≤ 0.022 ≤ 0.050 | 0.015 | 1.843 ≤ 1.853 ≤ 1.866 | 2.072 | 0.431 ≤ 0.437 ≤ 0.445 | 0.537 |
| M. tuberculosis | 1.236 ≤ 1.267 ≤ 1.296 | 0.214 | 1.910 ≤ 1.935 ≤ 1.955 | 3.009 | 1.693 ≤ 1.730 ≤ 1.763 | 4.149 |
| P. falciparum | 0.080 ≤ 0.088 ≤ 0.096 | 0.092 | 0.486 ≤ 0.490 ≤ 0.495 | 0.64 | 4.641 ≤ 4.661 ≤ 4.683 | 8.833 |
| S. cerevisiae | 0.048 ≤ 0.062 ≤ 0.073 | 0.041 | 1.054 ≤ 1.063 ≤ 1.072 | 1.298 | 0.969 ≤ 0.981 ≤ 0.992 | 1.925 |
| D. melanogaster | 0.346 ≤ 0.356 ≤ 0.365 | 0.128 | 1.077 ≤ 1.082 ≤ 1.092 | 1.639 | 2.016 ≤ 2.03 ≤ 2.043 | 4.123 |
| H. sapiens | 0.180 ≤ 0.186 ≤ 0.194 | 0.081 | 1.124 ≤ 1.129 ≤ 1.133 | 1.491 | 1.223 ≤ 1.232 ≤ 1.240 | 2.748 |

Columns 4 to 6 of the top table give the percentage of residues in each proteome found in LARKS, LCDs, or globular regions. The bottom table compares the actual overlap between residue classes with the expected overlap. Each actual entry is the percentage of residues that fall in both classes, reported as a 95% confidence interval in the form 2.5th percentile ≤ actual overlap ≤ 97.5th percentile. The confidence intervals reflect the variance of the data over 100 rounds of bootstrapping (see Experimental procedures for details). The expected intersection is the fraction of LARKS residues multiplied by the fraction of LCD residues. If the expected value lies below the 2.5th-percentile bound of the actual value, LARKS are enriched in LCDs; if it lies above the 97.5th-percentile bound, LARKS are depleted in LCDs relative to what would be expected for that organism. The same method is applied to LARKS ∩ Glob and LCD ∩ Glob. Comparing the actual LARKS ∩ LCD overlap with the expected value indicates that LARKS are enriched in LCDs in all organisms studied except P. falciparum.
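The comparison described in the legend can be illustrated with a short Python sketch. It is not the authors' code: the function names and the label "within interval" are hypothetical, chosen here only for illustration. It assumes both inputs are percentages, so their product is divided by 100 to stay in percent. Because the top table reports rounded percentages, the product reproduces the tabulated expected values only approximately (for M. tuberculosis, 3.9 × 5.4 / 100 ≈ 0.211 against a tabulated 0.214).

```python
from typing import Tuple


def expected_overlap(pct_a: float, pct_b: float) -> float:
    """Expected overlap (in percent) as the product of two class fractions.

    Both inputs are percentages of residues, so the product of the two
    fractions, expressed again as a percentage, is pct_a * pct_b / 100.
    """
    return pct_a * pct_b / 100.0


def classify(expected: float, interval: Tuple[float, float]) -> str:
    """Apply the legend's rule to a 95% interval of the actual overlap.

    interval = (2.5th percentile, 97.5th percentile) of the actual overlap.
    The label "within interval" is an illustrative name for the case the
    legend does not label explicitly.
    """
    low, high = interval
    if expected < low:
        return "enriched"   # expected lies below the whole interval
    if expected > high:
        return "depleted"   # expected lies above the whole interval
    return "within interval"


if __name__ == "__main__":
    # Values copied from the tables above.
    approx = expected_overlap(3.9, 5.4)  # M. tuberculosis, rounded inputs
    print(f"M. tuberculosis expected LARKS ∩ LCD ≈ {approx:.3f}%")
    print("M. tuberculosis LARKS ∩ LCD:", classify(0.214, (1.236, 1.296)))
    print("P. falciparum LARKS ∩ LCD:", classify(0.092, (0.080, 0.096)))
    print("H. sapiens LCD ∩ Glob:", classify(2.748, (1.223, 1.240)))
```

Passing the tabulated expected value to `classify`, rather than the recomputed product, avoids the rounding discrepancy noted above.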