Table 5.
Human sub-proteome<sup>a</sup> | 1 | 2 | 3<sup>b</sup> | 4<sup>b</sup> | 5 | 6 |
---|---|---|---|---|---|---|
1 | 1,566 | 1,567 | 1,496 | 27,014 | 14,381 | 95.5 |
2 | 1,598 | 1,600 | 1,576 | 26,352 | 13,850 | 98.6 |
3 | 1,937 | 1,956 | 1,885 | 44,087 | 17,222 | 97.3 |
4 | 2,100 | 2,106 | 2,064 | 30,025 | 14,400 | 98.2 |
5 | 2,058 | 2,162 | 1,986 | 40,374 | 17,653 | 96.5 |
6 | 2,131 | 2,262 | 2,098 | 62,564 | 16,698 | 98.4 |
7 | 2,289 | 2,417 | 2,227 | 37,796 | 17,303 | 97.2 |
8 | 2,556 | 2,557 | 2,475 | 33,387 | 16,439 | 96.8 |
9 | 2,858 | 2,931 | 2,721 | 81,650 | 20,105 | 95.2 |
10 | 3,067 | 3,120 | 3,042 | 48,918 | 19,069 | 99.1 |
11 | 3,306 | 3,351 | 3,198 | 59,666 | 21,568 | 96.7 |
12 | 3,355 | 3,383 | 3,271 | 68,549 | 22,697 | 97.4 |
13 | 3,478 | 3,510 | 3,377 | 56,469 | 21,221 | 97.0 |
14 | 3,529 | 3,554 | 3,417 | 60,138 | 21,490 | 96.8 |
15 | 3,663 | 3,673 | 3,596 | 88,229 | 22,454 | 98.1 |
16 | 3,626 | 3,715 | 3,545 | 96,827 | 22,061 | 97.7 |
17 | 4,084 | 4,095 | 3,958 | 72,451 | 23,645 | 96.9 |
18 | 4,374 | 4,385 | 4,112 | 68,579 | 24,055 | 94.0 |
19 | 4,423 | 4,476 | 4,295 | 79,283 | 24,472 | 97.1 |
20 | 4,751 | 4,795 | 4,735 | 70,296 | 22,686 | 99.6 |
21 | 4,646 | 4,768 | 4,555 | 115,916 | 26,284 | 98.0 |
22 | 4,886 | 4,901 | 4,850 | 95,959 | 25,348 | 99.2 |
23 | 5,048 | 5,168 | 4,740 | 72,400 | 23,453 | 93.8 |
24 | 5,408 | 5,428 | 5,144 | 101,686 | 26,713 | 95.1 |
25 | 5,941 | 5,977 | 5,841 | 128,904 | 27,751 | 98.3 |
26 | 13,143 | 14,063 | 12,700 | 290,022 | 32,681 | 96.6 |
27 | 33,281 | 34,606 | 32,460 | 664,539 | 35,181 | 97.5 |
28 | 42,683 | 44,263 | 41,535 | 787,778 | 35,375 | 97.3 |
29 | 51,865 | 53,804 | 50,075 | 895,672 | 35,489 | 96.5 |
30 | 62,219 | 65,156 | 60,506 | 1,103,962 | 35,629 | 97.2 |
All<sup>c</sup> | 254,808 | 299,749 | 246,322 | 3,713,010 | | |
The 30 artificial human sub-proteomes together comprise 686 proteins and are numbered from 1 to 30. The comparison human proteome contained 36,103 proteins and 15,771,565 occurrences of 2,388,563 unique 5-mers. Column numbers refer to: (1) unique 5-mers in the artificial sub-proteome; (2) total number of 5-mers in the artificial sub-proteome (including multiple occurrences); (3) unique 5-mers from the artificial sub-proteome occurring in the human proteome; (4) occurrences in the human proteome of 5-mers from the artificial sub-proteome (including multiple occurrences); (5) number of proteins in the human proteome involved in the overlap; (6) percentage of unique 5-mers from the artificial sub-proteome that occur in the human proteome (i.e. 100 × column 3/column 1).
<sup>a</sup> Analogous in size to the viral proteomes (see Table 1) and composed of sets of human proteins as detailed in Table 3.
<sup>b</sup> The results of linear regression analysis between columns 1 and 3, and between columns 1 and 4, are: column 3 = 0.97083 × column 1 + 0.76628 (r = 0.99998); column 4 = 17.921 × column 1 + 6278.6 (r = 0.99719).
<sup>c</sup> Obtained by combining all 30 human sub-proteomes into one sub-proteome and then computing the overlap with the entire original human proteome minus the proteins in the combined sub-proteome.
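The column definitions and the regression reported in the notes above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`kmers`, `overlap_columns`, `linear_fit`) and the dictionary input format (protein ID mapped to sequence) are hypothetical, and the sketch assumes that the proteins of a sub-proteome are removed from the comparison proteome before counting, as the notes describe for the combined "All" row.

```python
from collections import Counter
from math import sqrt


def kmers(seq, k=5):
    """Yield every overlapping k-mer of a protein sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def overlap_columns(sub_proteome, proteome, k=5):
    """Return columns 1-6 of the table for one sub-proteome.

    Both arguments map a (hypothetical) protein ID to its sequence.
    Assumption: sub-proteome proteins are excluded from the comparison set.
    """
    # k-mer multiset of the sub-proteome (columns 1 and 2)
    sub_counts = Counter(m for s in sub_proteome.values() for m in kmers(s, k))
    reference = {pid: s for pid, s in proteome.items() if pid not in sub_proteome}

    ref_counts = Counter()   # occurrences of shared k-mers in the reference
    proteins_hit = set()     # reference proteins containing a shared k-mer
    for pid, s in reference.items():
        for m in kmers(s, k):
            if m in sub_counts:
                ref_counts[m] += 1
                proteins_hit.add(pid)

    col1 = len(sub_counts)            # unique k-mers in the sub-proteome
    col2 = sum(sub_counts.values())   # total k-mers in the sub-proteome
    col3 = len(ref_counts)            # unique shared k-mers
    col4 = sum(ref_counts.values())   # occurrences of shared k-mers
    col5 = len(proteins_hit)          # reference proteins involved
    col6 = 100.0 * col3 / col1 if col1 else 0.0
    return col1, col2, col3, col4, col5, col6


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, with Pearson r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r
```

As a check on the column 6 definition, row 1 of the table gives 100 × 1,496 / 1,566 ≈ 95.5. Passing the 30 values of column 1 as `xs` and those of column 3 (or column 4) as `ys` to `linear_fit` would be one way to recompute the regression lines quoted in the notes.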