Abstract
Integral membrane proteins from over 20 ubiquitous families of channels, secondary carriers, and primary active transporters were analyzed for average size differences between homologues from the three domains of life: Bacteria, Archaea, and Eucarya. The results showed that while eucaryotic homologues are consistently larger than their bacterial counterparts, archaeal homologues are significantly smaller. These size differences proved to be due primarily to variations in the sizes of hydrophilic domains localized to the N termini, the C termini, or specific loops between transmembrane α-helical spanners, depending on the family. Within the Eucarya domain, plant homologues proved to be substantially smaller than their animal and fungal counterparts. By contrast, extracytoplasmic receptors of ABC-type uptake systems in Archaea proved to be larger on average than those of their bacterial homologues, while cytoplasmic enzymes from different organisms exhibited little or no significant size differences. These observations presumably reflect evolutionary pressure and molecular mechanisms that must have been operative since these groups of organisms diverged from each other.
The three largest classes of transporters found in nature are channels, secondary carriers, and primary active transporters (8, 10). Channel proteins facilitate passive diffusion of their substrates across membranes through aqueous pores, while secondary carriers generally utilize electrochemical gradients of H+, Na+, and solutes to drive the active accumulation or efflux of their primary substrates, and primary active transporters couple transport to the expenditure of a primary source of energy such as ATP hydrolysis or electron flow (10, 16). While channel proteins frequently span the membrane only a few times and form oligomeric complexes, secondary carriers and primary active transporters span the membrane multiple times and usually function as monomers or dimers in the absence of accessory proteins (4). Higher complexes of primary and secondary active transporters can provide regulatory (7, 11) or targeting and stability functions (15).
Recently we have classified transport proteins according to a functional and phylogenetic system called the transporter classification (TC) system (8–10). While many of the identified families of transport proteins are found in only one of the three domains of living organisms (Bacteria, Archaea, or Eucarya), others are ubiquitous, being found in all three domains. Our studies have led to the conclusion that these ubiquitous families are ancient families that existed prior to the divergence of Eucarya and Archaea from Bacteria and that little horizontal transfer of genetic material encoding transport proteins between these three domains of life has occurred at least during the past 2 to 3 billion years (8, 9).
In this study we compared the sizes of homologues of the ubiquitous families in the three domains of living organisms. We showed that while the eucaryotic homologues are consistently larger than their bacterial counterparts, the archaeal homologues are almost always smaller. Moreover, within the Eucarya domain, plant homologues are consistently smaller than the fungal and animal homologues, which are of similar sizes. These observations apparently do not apply to extracellular receptors and cytoplasmic enzymes, which exhibit the reverse size tendencies or no significant differences. The size differences observed for secondary carrier homologues from the three domains of life proved to be due primarily to variations in the sizes of specific hydrophilic domains within these proteins, and the locations of these size-variable domains appear to be characteristic of specific families.
MATERIALS AND METHODS
The PSI-BLAST database search method (http://www.ncbi.nlm.nih.gov/blast/psiblast.cgi) was used to identify homologous proteins. Multiple alignments were generated using the CLUSTAL X program (13), and hydropathy and putative transmembrane spanner (TMS) analyses were conducted using the TMpred program (2). Positions of size variation among homologues were identified by combining the multiple alignments (CLUSTAL X) with the topological predictions (TMpred). To test for statistically significant differences in protein length, the data were analyzed using two-tailed Sign tests (17).
RESULTS
Size variation in integral membrane transport protein homologues in Bacteria, Archaea, and Eucarya.
Table 1 presents the average sizes, in numbers of amino acyl residues, of the integral membrane protein homologues of 15 families of secondary carriers, 3 families of channel proteins, and 4 families of primary active transporters present in the archaeal, bacterial, and eucaryotic domains. The number of homologues examined is presented in parentheses. The average sizes of the archaeal and eucaryotic homologues relative to the average sizes of the bacterial proteins are also provided. All of the archaeal homologues available in the SwissProt, GenBank, and PIR databases at the time these studies were conducted were included in the analysis. When limited numbers of bacterial or eucaryotic homologues comparable to the number of archaeal proteins were identified, all of these were also included. However, when the numbers of bacterial and/or eucaryotic homologues considerably exceeded the number of archaeal family members, several proteins from the former two groups were generally selected at random from various organisms. In some cases, many eucaryotic proteins were included so that proteins within specific Eucarya kingdoms (animals, plants, and fungi) could be compared (see below).
TABLE 1.
Comparison of membrane transport homologue sizes between Archaea, Bacteria, and Eucarya
| Family | TC no. (a) | Archaea, avg no. of AAs (b) | Archaea, % relative size (c) | Bacteria, avg no. of AAs (b) | Eucarya, avg no. of AAs (b) | Eucarya, % relative size (c) |
|---|---|---|---|---|---|---|
| Carriers | ||||||
| Sugar porter (major facilitator superfamily) | 2.A.1.1 | 399 (4) | 95 | 422 (6) | 527 (18) | 124 |
| Amino acid-polyamine-organocation | 2.A.3 | 508 (6) | 109 | 463 (6) | 602 (5) | 131 |
| Cation diffusion facilitator | 2.A.4 | 293 (4) | 98 | 298 (4) | 491 (8) | 165 |
| Resistance-nodulation-division | 2.A.6 | 758 (3) | 76 | 998 (56) | 1,296 (4) | 130 |
| SecDF | 2.A.6.4 | 713 (3) | 76 | 935 (13) | —d | |
| Ca2+:cation antiporter | 2.A.19 | 320 (4) | 87 | 370 (7) | 649 (24) | 174 |
| Inorganic phosphate transporter | 2.A.20 | 332 (6) | 70 | 477 (5) | 581 (10) | 122 |
| Monovalent cation:proton antiporter-1 | 2.A.36 | 440 (3) | 85 | 516 (5) | 702 (15) | 136 |
| Monovalent cation:proton antiporter-2 | 2.A.37 | 410 (5) | 84 | 490 (13) | 793 (5) | 162 |
| K+ transporter | 2.A.38 | 477 (3) | 99 | 480 (5) | 758 (8) | 157 |
| Nucleobase:cation symporter-2 | 2.A.40 | 439 (3) | 97 | 452 (9) | 566 (13) | 125 |
| Formate-nitrite transporter | 2.A.44 | 277 (2) | 102 | 273 (7) | 547 (2) | 200 |
| Divalent anion:Na+ symporter | 2.A.47 | 410 (4) | 79 | 516 (8) | 681 (12) | 132 |
| Ammonium transporter | 2.A.49 | 411 (7) | 88 | 464 (7) | 503 (12) | 108 |
| Multi-antimicrobial extrusion | 2.A.66 | 454 (5) | 99 | 458 (5) | 636 (6) | 138 |
| Channels | ||||||
| Major intrinsic protein | 1.A.8 | 246 (2) | 98 | 251 (11) | 278 (33) | 111 |
| Chloride channel | 1.A.11 | 410 (5) | 89 | 458 (5) | 827 (17) | 180 |
| Metal ion transporter | 9.A.17 | 330 (3) | 99 | 332 (19) | 692 (9) | 210 |
| Primary active transporters | ||||||
| P-type ATPase | 3.A.3 | 724 (9) | 99 | 732 (85) | 1,096 (62) | 150 |
| Arsenite-antimonite efflux | 3.A.4 | 388 (7) | 94 | 411 (22) | 693 (11) | 169 |
| Type II secretory pathway (SecY) | 3.A.5 | 461 (12) | 106 | 436 (44) | 455 (26) | 104 |
| Na+-transporting carboxylic acid decarboxylase (β) | 3.B.1 | 385 (3) | 96 | 399 (9) | —d | |
(a) For further information on TC numbers, see reference 10 or http://www-biology.ucsd.edu/~msaier/transport/.
(b) The average number of amino acyl residues (AAs) per protein homologue is reported, with the number of proteins examined appearing in parentheses.
(c) Average percent size of the archaeal or eucaryotic homologue relative to the bacterial homologue is presented for each of the families examined. When the values for all families were averaged, the archaeal proteins proved to be 8% smaller than their bacterial homologues while the eucaryotic homologues proved to be 40% larger.
(d) No protein homologues were identified in this domain.
Examination of the results presented in Table 1 reveals that in all 20 families with eucaryotic representatives, the average size of the eucaryotic homologues exceeds that of the bacterial homologues, and in all but one (the SecY family) it also exceeds that of the archaeal homologues. Moreover, with only three exceptions (the amino acid-polyamine-organocation [APC] and formate-nitrite transporter [FNT] families of secondary carriers and the SecY proteins of the type II protein secretion pathway family of primary active protein secretory systems), the average sizes of the archaeal homologues are less than those of the bacterial homologues.
All of the size difference values, obtained when the archaeal or eucaryotic homologues for the various families were compared with the bacterial homologues (Table 1), were averaged. The average archaeal protein size for all 22 families examined was 92% of that of the bacterial homologues, while the average eucaryotic protein size for all 20 families examined was 140% of that of the bacterial homologues. Thus, while the archaeal proteins are 8% smaller than the bacterial proteins, on average, the eucaryotic proteins are 40% larger.
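The averaging described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function and variable names are illustrative, and it assumes an unweighted mean of the per-family percentages exactly as printed in the archaeal column of Table 1.

```python
from statistics import mean

def relative_size(avg_len, bacterial_avg_len):
    """Average homologue size as a percentage of the bacterial average."""
    return 100.0 * avg_len / bacterial_avg_len

# Archaeal "% relative size" values for the 22 families, copied from Table 1
# (carriers, then channels, then primary active transporters).
ARCHAEAL_PCT = [95, 109, 98, 76, 76, 87, 70, 85, 84, 99, 97, 102, 79, 88, 99,
                98, 89, 99, 99, 94, 106, 96]

def mean_relative_size(percentages):
    """Unweighted mean across families (an assumption about the averaging)."""
    return mean(percentages)

if __name__ == "__main__":
    # Sugar porter family: 399 residues (Archaea) vs 422 residues (Bacteria)
    print(round(relative_size(399, 422)))           # 95
    print(round(mean_relative_size(ARCHAEAL_PCT)))  # 92
```

Under these assumptions the sketch returns 92% for the archaeal homologues, in line with the value quoted in the text.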
Size variation in integral membrane transport protein homologues in fungi, plants, and animals.
Within the Eucarya domain, animal, plant, and fungal (including yeast) homologues were analyzed separately (Table 2). In all but three of the families of transport proteins analyzed, the plant proteins exhibited average sizes that were substantially smaller than those of the animal or fungal homologues. The exceptions were the sugar porter family of the major facilitator superfamily, the ammonium transporter family, and the SecY family within type II protein secretion pathway systems. In the sugar porter family of the major facilitator superfamily, animal homologues proved to be slightly smaller on average than the plant homologues.
TABLE 2.
Comparison of membrane transport homologue sizes between animals, plants, and fungi
| Family | TC no. | Animals, avg no. of AAs (a) | Animals, % relative size (b) | Plants, avg no. of AAs (a) | Plants, % relative size (b) | Fungi, avg no. of AAs (a) |
|---|---|---|---|---|---|---|
| Carriers | ||||||
| Sugar porter (major facilitator superfamily) | 2.A.1.1 | 491 (6) | 86 | 518 (6) | 91 | 571 (6) |
| Amino acid/auxin porter | 2.A.18 | 517 (13) | 93 | 475 (16) | 85 | 557 (9) |
| Ca2+:cation antiporter | 2.A.19 | 897 (14) | 147 | 441 (4) | 72 | 610 (6) |
| Cation-chloride cotransporter | 2.A.30 | 1,060 (14) | 98 | 973 (1) | 90 | 1,085 (2) |
| Monovalent cation:proton antiporter-1 | 2.A.36 | 755 (10) | 115 | 488 (2) | 74 | 657 (3) |
| Nucleobase:cation symporter-2 | 2.A.40 | 591 (6) | 101 | 523 (5) | 89 | 588 (2) |
| K+ transporter | 2.A.43 | —c | | 506 (2) | 50 | 1,010 (6) |
| Divalent anion:Na+ symporter | 2.A.47 | 575 (8) | 65 | —c | | 891 (4) |
| Ammonium transporter | 2.A.49 | 576 (4) | 127 | 493 (5) | 109 | 452 (3) |
| Channels | ||||||
| Major intrinsic protein | 1.A.8 | 315 (15) | 102 | 268 (22) | 87 | 308 (2) |
| Chloride channel | 1.A.11 | 858 (11) | 108 | 778 (6) | 98 | 796 (2) |
| Metal ion transporter | 9.A.17 | 725 (3) | 102 | 657 (4) | 93 | 710 (2) |
| Primary active transporters | ||||||
| P-type ATPase | 3.A.3 | 1,348 (21) | 135 | 954 (29) | 95 | 999 (12) |
| Arsenite-antimonite efflux | 3.A.4 | 667 (9) | 82 | —c | | 809 (2) |
| Type II secretory pathway (SecY) | 3.A.5 | 440 (6) | 113 | 471 (16) | 120 | 391 (3) |
(a) The average number of amino acyl residues (AAs) for the protein homologues of each family is reported, with the number of proteins examined in parentheses. Fungal proteins include those from yeast.
(b) Average percent size of the animal or plant homologue relative to the fungal homologue is presented for each of the families examined. When the values for all families were averaged, the plant proteins proved to be 17% smaller than their fungal homologues while the animal homologues proved to be 5% larger.
(c) No protein homologues were identified in this kingdom.
All of the size difference values, obtained when the animal or plant homologues for the various families were compared with the fungal homologues (Table 2), were averaged. The average animal protein size for all 14 families examined was 105% of that of the fungal homologues, while the average plant protein size for the 13 families examined was 83% of that of the fungal homologues. Thus, while the animal proteins are 5% larger than the fungal proteins, on average, the plant proteins are 17% smaller.
Size variation in homologous constituents of the ABC-type transport system in Bacteria and Archaea.
The ABC superfamily of uptake permeases is restricted to procaryotes, but it is found in both Bacteria and Archaea. These systems include three constituents: extracytoplasmic receptors, integral membrane proteins, and cytoplasmic ATP-hydrolyzing constituents. Over 20 families of these systems have been identified (10). These three types of homologues (receptors, integral membrane constituents, and cytoplasmic ATP-hydrolyzing energizers) were analyzed for size variation (see Tables 3, 4, and 5, respectively). As shown in Table 3, the average archaeal receptor size exceeded the average bacterial receptor size for 11 of the 13 families that have homologues in both domains. Overall, the archaeal receptors are 7% larger, on average, than their bacterial homologues. By contrast, the integral membrane archaeal homologues of ABC systems are usually smaller than the bacterial homologues (Table 4): of the 20 families examined, 15 proved to have smaller archaeal homologues, on average, than bacterial homologues, and the archaeal membrane proteins were 3.5% smaller overall. Finally, the cytoplasmic ATP-hydrolyzing energizers tend to be somewhat smaller in Archaea than in Bacteria (Table 5): of the 16 families analyzed, 13 were smaller and 3 were larger, on average. Overall, the archaeal cytoplasmic proteins were 3.5% smaller than their bacterial homologues. Thus, the trend displayed by the ABC membrane proteins (Table 4) agreed with that for other integral membrane transport proteins (Table 1). The size differences for the archaeal extracytoplasmic receptors were opposite to those observed for the integral membrane constituents, with the archaeal receptors being substantially larger than their bacterial homologues. The ATP-hydrolyzing energizers showed minimal size differences.
TABLE 3.
Comparison of ABC receptor homologue sizes between Archaea and Bacteria
| Family | TC no. | Archaea, avg no. of AAs (a) | Archaea, % relative size (b) | Bacteria, avg no. of AAs (a) | Bacteria, % relative size (b) |
|---|---|---|---|---|---|
| Carbohydrate uptake transporter-1 family | 3.A.1.1 | 477 (6) | 111 | 421 (13) | 100 |
| Carbohydrate uptake transporter-2 family | 3.A.1.2 | —c | | 343 (12) | 100 |
| Polar amino acid uptake transporter family | 3.A.1.3 | 278 (2) | 106 | 262 (29) | 100 |
| Hydrophobic amino acid uptake transporter family | 3.A.1.4 | 443 (4) | 118 | 376 (14) | 100 |
| Peptide-opine-nickel uptake transporter family | 3.A.1.5 | 644 (11) | 121 | 532 (25) | 100 |
| Sulfate uptake transporter family | 3.A.1.6 | —c | | 342 (6) | 100 |
| Phosphate uptake transporter family | 3.A.1.7 | 337 (6) | 101 | 334 (18) | 100 |
| Molybdate uptake transporter family | 3.A.1.8 | 264 (2) | 105 | 251 (15) | 100 |
| Phosphonate uptake transporter family | 3.A.1.9 | —c | | 300 (5) | 100 |
| Ferric iron uptake transporter family | 3.A.1.10 | —c | | 328 (15) | 100 |
| Polyamine-opine-phosphonate uptake transporter family | 3.A.1.11 | 417 (2) | 118 | 352 (20) | 100 |
| Quaternary amine uptake transporter family | 3.A.1.12 | —c | | 421 (7) | 100 |
| Vitamin B12 uptake transporter family | 3.A.1.13 | 361 (4) | 129 | 279 (14) | 100 |
| Iron chelate uptake transporter family | 3.A.1.14 | 347 (2) | 106 | 327 (20) | 100 |
| Manganese-zinc-iron chelate uptake transporter family | 3.A.1.15 | 316 (4) | 103 | 306 (26) | 100 |
| Nitrate-nitrite-cyanate uptake transporter family | 3.A.1.16 | 353 (4) | 78 | 451 (10) | 100 |
| Taurine uptake transporter family | 3.A.1.17 | —c | | 333 (11) | 100 |
| Putative cobalt uptake transporter family | 3.A.1.18 | 88 (4) | 84 | 105 (3) | 100 |
| Thiamine uptake transporter family | 3.A.1.19 | 350 (2) | 101 | 346 (12) | 100 |
| Brachyspira iron transporter family | 3.A.1.20 | —c | | 346 (10) | 100 |
(a) The average number of amino acyl residues (AAs) per protein homologue is reported, with the number of proteins examined appearing in parentheses.
(b) Average percent size of the archaeal homologues relative to the bacterial homologues is presented for each of the families examined.
(c) No protein homologue or just one such protein was identified in this domain.
TABLE 4.
Comparison of ABC membrane protein homologue sizes between Archaea and Bacteria
| Family | TC no. | Archaea, avg no. of AAs (a) | Archaea, % relative size (b) | Bacteria, avg no. of AAs (a) | Bacteria, % relative size (b) |
|---|---|---|---|---|---|
| Carbohydrate uptake transporter-1 family (MalF) | 3.A.1.1 | 301 (4) | 83 | 363 (24) | 100 |
| Carbohydrate uptake transporter-1 family (MalG) | 3.A.1.1 | 309 (4) | 96 | 321 (24) | 100 |
| Carbohydrate uptake transporter-2 family (RbsC) | 3.A.1.2 | 332 (1) | 97 | 342 (22) | 100 |
| Carbohydrate uptake transporter-2 family (RbsD) | 3.A.1.2 | —c | | 137 (7) | 100 |
| Polar amino acid uptake transporter family (HisM) | 3.A.1.3 | 222 (2) | 96 | 232 (50) | 100 |
| Polar amino acid uptake transporter family (HisQ) | 3.A.1.3 | 222 (2) | 95 | 233 (39) | 100 |
| Hydrophobic amino acid uptake transporter family (LivH) | 3.A.1.4 | 289 (5) | 91 | 318 (17) | 100 |
| Hydrophobic amino acid uptake transporter family (LivM) | 3.A.1.4 | 351 (8) | 87 | 402 (20) | 100 |
| Peptide-opine-nickel uptake transporter family (OppB) | 3.A.1.5 | 333 (12) | 103 | 324 (48) | 100 |
| Peptide-opine-nickel uptake transporter family (OppC) | 3.A.1.5 | 366 (9) | 116 | 315 (51) | 100 |
| Sulfate uptake transporter family | 3.A.1.6 | —c | | 277 (10) | 100 |
| Phosphate uptake transporter family | 3.A.1.7 | 285 (7) | 90 | 315 (23) | 100 |
| Molybdate uptake transporter family | 3.A.1.8 | 248 (4) | 107 | 232 (13) | 100 |
| Phosphonate uptake transporter family | 3.A.1.9 | —c | | 246 (5) | 100 |
| Ferric iron uptake transporter family | 3.A.1.10 | —c | | 520 (8) | 100 |
| Polyamine-opine-phosphonate uptake transporter family (PotB) | 3.A.1.11 | 263 (3) | 90 | 291 (25) | 100 |
| Polyamine-opine-phosphonate uptake transporter family (PotC) | 3.A.1.11 | 262 (3) | 97 | 271 (18) | 100 |
| Quaternary amine uptake transporter family | 3.A.1.12 | —c | | 435 (7) | 100 |
| Vitamin B12 uptake transporter family | 3.A.1.13 | 345 (6) | 101 | 341 (11) | 100 |
| Iron chelate uptake transporter family (FecC) | 3.A.1.14 | 338 (6) | 99 | 340 (18) | 100 |
| Iron chelate uptake transporter family (FecD) | 3.A.1.14 | 344 (4) | 99 | 346 (22) | 100 |
| Manganese-zinc-iron chelate uptake transporter family | 3.A.1.15 | 271 (4) | 93 | 291 (38) | 100 |
| Nitrate-nitrite-cyanate uptake transporter family | 3.A.1.16 | 254 (5) | 83 | 305 (16) | 100 |
| Taurine uptake transporter family | 3.A.1.17 | —c | | 328 (7) | 100 |
| Putative cobalt uptake transporter family | 3.A.1.18 | 255 (30) | 89 | 287 (11) | 100 |
| Thiamine uptake transporter family | 3.A.1.19 | 406 (4) | 117 | 348 (31) | 100 |
| Brachyspira iron transporter family | 3.A.1.20 | —c | | 424 (7) | 100 |
(a) The average number of amino acyl residues (AAs) per protein homologue is reported, with the number of proteins examined appearing in parentheses.
(b) Average percent size of the archaeal homologues relative to the bacterial homologues is presented for each of the families examined.
(c) No protein homologues were identified in this domain.
TABLE 5.
Comparison of cytoplasmic ABC protein homologue sizes between Archaea and Bacteria
| Family | TC no. | Archaea, avg no. of AAs (a) | Archaea, % relative size (b) | Bacteria, avg no. of AAs (a) | Bacteria, % relative size (b) |
|---|---|---|---|---|---|
| Carbohydrate uptake transporter-1 family | 3.A.1.1 | 363 (14) | 99 | 366 (48) | 100 |
| Carbohydrate uptake transporter-2 family | 3.A.1.2 | 496 (5) | 98 | 504 (34) | 100 |
| Polar amino acid uptake transporter family | 3.A.1.3 | 240 (2) | 95 | 252 (31) | 100 |
| Hydrophobic amino acid uptake transporter family | 3.A.1.4 | 258 (6) | 97 | 267 (15) | 100 |
| Peptide-opine-nickel uptake transporter family | 3.A.1.5 | 324 (9) | 96 | 336 (25) | 100 |
| Sulfate uptake transporter family | 3.A.1.6 | 329 (1) | 92 | 357 (8) | 100 |
| Phosphate uptake transporter family | 3.A.1.7 | 286 (10) | 92 | 312 (21) | 100 |
| Molybdate uptake transporter family | 3.A.1.8 | 344 (1) | 99 | 349 (8) | 100 |
| Phosphonate uptake transporter family | 3.A.1.9 | —c | | 287 (4) | 100 |
| Ferric iron uptake transporter family | 3.A.1.10 | 329 (1) | 93 | 353 (9) | 100 |
| Polyamine-opine-phosphonate uptake transporter family | 3.A.1.11 | 343 (5) | 91 | 376 (13) | 100 |
| Quaternary amine uptake transporter family | 3.A.1.12 | 369 (3) | 93 | 396 (12) | 100 |
| Vitamin B12 uptake transporter family | 3.A.1.13 | 252 (3) | 96 | 263 (10) | 100 |
| Iron chelate uptake transporter family | 3.A.1.14 | 272 (2) | 101 | 269 (24) | 100 |
| Manganese-zinc-iron chelate uptake transporter family | 3.A.1.15 | 271 (3) | 103 | 262 (21) | 100 |
| Nitrate-nitrite-cyanate uptake transporter family | 3.A.1.16 | 250 (3) | 91 | 274 (18) | 100 |
| Taurine uptake transporter family | 3.A.1.17 | —c | | 332 (11) | 100 |
| Putative cobalt uptake transporter family | 3.A.1.18 | 284 (9) | 103 | 276 (11) | 100 |
| Thiamine uptake transporter family | 3.A.1.19 | —c | | 225 (3) | 100 |
| Brachyspira iron transporter family | 3.A.1.20 | —c | | 371 (3) | 100 |
(a) The average number of amino acyl residues (AAs) per protein homologue is reported, with the number of proteins examined appearing in parentheses.
(b) Average percent size of the archaeal homologue relative to the bacterial homologue is presented for each of the families examined.
(c) No protein homologues were identified in this domain.
Size variation in homologous cytoplasmic enzymes.
Similar analyses were conducted with a variety of catabolic and anabolic cytoplasmic enzymes (Table 6). These proteins showed similar homologue sizes, regardless of the domain or kingdom analyzed. Averaging all of the statistically significant results in Table 6 revealed that, on average, eucaryotic enzymes are only 3% larger than the homologous bacterial enzymes and archaeal enzymes are only 3% smaller than the homologous bacterial enzymes. These average size differences are much less than for the integral membrane transport proteins analyzed (Tables 1 and 4). Moreover, among the Eucarya kingdoms, animal and fungal homologues are essentially the same size while plant homologues are only about 1% smaller on average. This last mentioned average size difference is not statistically significant. Thus, cytoplasmic enzymes do not appear to exhibit the appreciable size differences that were observed for integral membrane proteins.
TABLE 6.
Comparisons of cytoplasmic enzyme sizes for the domains of organisms and the kingdoms of Eucarya
| Enzyme | EC no. | Archaea | Bacteria | Eucarya, all kingdoms | Animals | Plants | Fungi |
|---|---|---|---|---|---|---|---|
| Enolase | 4.2.1.11 | 425 (6) | 430 (23) | 438 (30) | 434 (13) | 444 (8) | 438 (9) |
| Phosphoglycerate kinase | 2.7.2.3 | 409 (10) | 398 (21) | 414 (27) | 416 (11) | 401 (4) | 417 (12) |
| Glyceraldehyde 3-phosphate dehydrogenase | 1.2.1.12 | 338 (12) | 335 (24) | 336 (57) | 334 (23) | 339 (13) | 335 (21) |
| Triose-phosphate isomerase | 5.3.1.1 | 224 (10) | 253 (25) | 251 (24) | 249 (11) | 254 (8) | 249 (5) |
| Phosphoglucose isomerase | 5.3.1.9 | 401 (1) | 517 (22) | 560 (18) | 557 (7) | 566 (8) | 554 (3) |
| Pyruvate kinase | 2.7.1.40 | 465 (4) | 500 (22) | 522 (22) | 529 (9) | 509 (6) | 524 (7) |
| Inosine-5′-monophosphate dehydrogenase | 1.1.1.205 | 481 (7) | 500 (16) | 517 (11) | 520 (4) | 502 (3) | 525 (4) |
| Inosine-5′-monophosphate-aspartate ligase | 6.3.4.4 | 340 (6) | 426 (19) | 447 (6) | 454 (4) | —a | 434 (2) |
| Glutamine synthetase | 6.3.1.2 | 451 (11) | 466 (26) | 369 (33) | 376 (13) | 365 (15) | 360 (5) |
| Aspartate aminotransferase | 2.6.1.1 | 391 (6) | 394 (17) | 412 (15) | 412 (8) | 409 (5) | 421 (2) |
| Elongation factor-2/elongation factor-G | No EC no. | 731 (16) | 698 (25) | 849 (12) | 855 (6) | 845 (3) | 842 (3) |
(a) No sequenced homologues of this enzyme were found in plants.
Statistical significance of the observed homologue size differences.
To test for statistically significant differences in transport protein lengths between phylogenetic groups, the data were analyzed using two-tailed Sign tests (17). In these analyses, comparisons were made between paired domains of life, or between paired kingdoms within the Eucarya domain, in terms of average lengths of amino acyl sequences within the protein families (Tables 1 to 5). The Sign test is a qualitative, nonparametric paired-sample test that utilizes only the direction of difference (< or >) between paired data. As such, it requires no assumptions regarding the distribution of data either within or between sample groups. We felt that such assumptions might be unwarranted given the size variation observed within the taxonomic groups with respect to average length of the proteins within the families represented. Thus, there proved to be more variation between protein families within each domain than there was between domains in any particular protein family. Each pair of domains or kingdoms was therefore compared independently. A P value of ≤0.05 was considered significant. Tabulated data and associated Sign test P values are presented in Table 7.
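Because the Sign test uses only the direction of each paired difference, its input reduces to two counts. The following minimal Python sketch tallies those counts for the ABC receptor averages of Table 3. The function name and data layout are illustrative assumptions, and only the 13 families with values in both domains are included.

```python
def sign_counts(pairs):
    """Count pairs with A > B and with A < B; ties are ignored."""
    greater = sum(1 for a, b in pairs if a > b)
    less = sum(1 for a, b in pairs if a < b)
    return greater, less

# (archaeal average, bacterial average) in amino acyl residues, copied from
# Table 3 for the 13 receptor families with homologues in both domains.
ABC_RECEPTORS = [
    (477, 421), (278, 262), (443, 376), (644, 532), (337, 334),
    (264, 251), (417, 352), (361, 279), (347, 327), (316, 306),
    (353, 451), (88, 105), (350, 346),
]

if __name__ == "__main__":
    print(sign_counts(ABC_RECEPTORS))  # (11, 2)
```

The magnitudes of the differences play no role here, which is why no distributional assumptions are needed.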
TABLE 7.
Results of statistical analyses of observed homologue size differences
| Homologue | Groups compared | No. with A > B:A < B | Sign test P value | Correlation |
|---|---|---|---|---|
| Integral membrane transporter | Archaea (A) vs Bacteria (B) | 3:19 | 0.002 | A < B |
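| Integral membrane transporter | Bacteria (A) vs Eucarya (B) | 0:20 | <0.001 | A < B |
| Integral membrane transporter | Eucarya (A) vs Archaea (B) | 19:1 | <0.001 | A > B |
| Integral membrane transporter | Plants (A) vs animals (B) | 2:10 | 0.039 | A < B |
| Integral membrane transporter | Animals (A) vs fungi (B) | 9:5 | 0.42 | None |
| Integral membrane transporter | Fungi (A) vs plants (B) | 11:2 | 0.022 | A > B |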
| ABC receptor | Archaea (A) vs Bacteria (B) | 11:2 | 0.022 | A > B |
| ABC membrane protein | Archaea (A) vs Bacteria (B) | 5:15 | 0.041 | A < B |
| ABC cytoplasmic energizer | Archaea (A) vs Bacteria (B) | 3:13 | 0.013 | A < B |
The results of the statistical analyses strongly support the conclusion that transport protein length differs significantly between domains in all pairwise comparisons. Archaea have significantly shorter transport proteins than Bacteria, and both Archaea and Bacteria have shorter transport proteins than Eucarya. Tests were generally less significant in pairwise comparisons of Eucarya kingdoms. Plants have shorter transport proteins than either animals or fungi, but the difference in protein length between animals and fungi is not significant. Corresponding analyses of the ABC receptors, membrane proteins, and cytoplasmic energizers argued for statistical significance, although the actual size differences between the archaeal and bacterial energizers proved to be minimal.
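For readers who wish to see how a two-tailed Sign test P value follows from a pair of counts, here is a minimal Python sketch using the exact binomial distribution with success probability 0.5. This is one standard formulation and an assumption on our part: the exact variant described in reference 17 is not reproduced here, so individual entries in Table 7 need not match it to the last digit. The function name is illustrative.

```python
from math import comb

def sign_test_two_tailed(n_greater, n_less):
    """Exact two-tailed sign test P value for counts of A > B and A < B."""
    n = n_greater + n_less
    k = min(n_greater, n_less)
    # Probability of k or fewer outcomes in the rarer direction when both
    # directions are equally likely, doubled for a two-tailed test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

if __name__ == "__main__":
    print(round(sign_test_two_tailed(2, 10), 3))  # plants vs animals: 0.039
    print(round(sign_test_two_tailed(9, 5), 2))   # animals vs fungi: 0.42
    print(round(sign_test_two_tailed(11, 2), 3))  # fungi vs plants: 0.022
```

With a threshold of P ≤ 0.05, the first and third comparisons count as significant and the second does not, consistent with the conclusions drawn above.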
Localization of regions in homologues of secondary transporters responsible for size differences between Bacteria, Archaea, and Eucarya.
Five families of secondary carriers were analyzed in detail to determine what portions of these proteins exhibit the greatest size variation. For this purpose, five sequence-divergent members of each family from each of the three domains of living organisms were selected for analysis. These sequences were multiply aligned using the CLUSTAL X program (13), and hydropathy analyses were conducted using the TMpred program (2). The results of these analyses are summarized in Table 8.
TABLE 8.
Localization of regions of size difference between homologues of five families of secondary carriers
(Table 8 was supplied as an image and is not reproduced here; its footnotes follow.)
Size in number of amino acyl residues (AAs).
Database and accession number.
Putative number of α-helical TMSs predicted using the TMpred program.
N and C are the N-terminal and C-terminal hydrophilic domains (regions) preceding the first predicted TMS and following the last predicted TMS, respectively, based on TMpred analyses. Region number refers to the number of the putative TMS, based on TMpred predictions.
For each family, the bacterial homologues are presented first, the archaeal homologues are presented second, and the eucaryotic proteins are presented last. Table 8 presents (i) the organismal domains, (ii) the protein abbreviations, (iii) the size of each individual protein, (iv) the database and accession number, allowing easy access to the sequence of that protein, (v) the number of putative TMSs predicted using the TMpred program, (vi) the size of the N-terminal hydrophilic domain (N) in number of amino acyl residues, (vii) the residues predicted to comprise the individual TMSs (1 to 14), and (viii) the size of the C-terminal hydrophilic domain (C).
The first family shown is the Ca2+:cation antiporter (CaCA) family (Table 8). The size differences between the proteins are apparent when examining the data summarized in column 3. In column 5, it can be seen that there is substantial variation in the predicted number of TMSs. For this family and other families examined, some of this variation may represent prediction error due to limitations of the TMpred program. For the CaCA family, there is little variation in the sizes of the N-terminal and C-terminal hydrophilic domains (between 0 and 40 residues each). However, three of the eucaryotic proteins (all from animals) (CaSA2 Bta, Orfl Cel, and CaSA Dme) show large inter-TMS loops between TMSs 6 and 7. These loops are between 455 and 566 amino acyl residues long, accounting for most of the size differences observed for these proteins compared with other homologues examined. The corresponding loops are predicted to be 29 to 38 residues long in the bacterial proteins, 11 to 22 residues long in the archaeal proteins, and 14 and 10 residues long in the two remaining eucaryotic proteins (the plant and yeast proteins, respectively). Additionally, it can be seen that other eucaryotic inter-TMS loops are of somewhat increased size relative to their procaryotic counterparts. For example, loop 1–2 (between TMSs 1 and 2) contains 1 to 22 residues in the procaryotic proteins but 34 to 99 residues in the eucaryotic homologues; loop 2–3 is 16 to 19 residues long in the procaryotic proteins but 21 to 23 residues long in the eucaryotic proteins; and loop 3–4 is 7 to 18 residues long in the procaryotic proteins but 15 to 23 residues long in the eucaryotic proteins. Finally, the program predicts 11 TMSs for the bacterial homologues, 9 or 10 TMSs for the archaeal homologues, and either 10 or 12 TMSs for the eucaryotic proteins. All of these differences, when taken together, account for the observed size variations of the individual proteins of the CaCA family.
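The N-terminal, loop, and C-terminal sizes discussed for Table 8 follow directly from the predicted TMS boundaries. The minimal Python sketch below shows the arithmetic under stated assumptions: TMS positions are given as 1-based inclusive residue ranges in sequence order, the function name is illustrative, and the example protein is hypothetical rather than taken from Table 8.

```python
def hydrophilic_regions(length, tms):
    """Return (N-terminal size, list of inter-TMS loop sizes, C-terminal size).

    length: total number of amino acyl residues in the protein.
    tms: list of (start, end) residue ranges, 1-based and inclusive,
         ordered from the N terminus to the C terminus.
    """
    n_term = tms[0][0] - 1                      # residues before the first TMS
    loops = [next_start - prev_end - 1          # residues between adjacent TMSs
             for (_, prev_end), (next_start, _) in zip(tms, tms[1:])]
    c_term = length - tms[-1][1]                # residues after the last TMS
    return n_term, loops, c_term

if __name__ == "__main__":
    # Hypothetical 120-residue protein with three predicted TMSs.
    print(hydrophilic_regions(120, [(11, 30), (41, 60), (91, 110)]))
    # (10, [10, 30], 10)
```

Comparing the resulting loop lists position by position across aligned homologues would single out a region such as loop 6–7 as the main source of a size difference.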
The second family listed in Table 8 is the inorganic phosphate transporter (Pit) family. All but one of the Pit family members have a small N-terminal hydrophilic region, the one exception being the plant Pit Ath homologue, which has an N-terminal hydrophilic domain of 126 residues. The archaeal proteins generally have shorter hydrophilic N termini than the bacterial proteins. Further, the hydrophilic C termini of all homologues are short (1 to 26 residues). The major size variations observed between the procaryotic and eucaryotic proteins of this family are in loops 7–8 and 8–9. For example, Orf Cel is predicted to have a somewhat large loop 8–9 (46 residues), Glvr Hsa and Nps Sce have a large loop 7–8 (62 and 206 residues, respectively), and Pho4 Ncr has large loops 7–8 and 8–9 (122 and 89 residues, respectively). Differences in the number of putative TMSs predicted are also observed, with Orf Cel and Pho4 Ncr predicted to have more TMSs than the other homologues.
The monovalent cation:proton antiporter families (CPA1 and CPA2) show similarly sized N-terminal hydrophilic domains, but their C-terminal hydrophilic domains differ substantially in size (Table 8). Thus, in the CPA1 family, four of the bacterial homologues have hydrophilic extensions of 21 to 32 residues, but one protein, Orf Bsu, has a hydrophilic extension of 131 residues. Similarly, four of the archaeal proteins have C-terminal hydrophilic extensions of 7 to 25 residues, but one protein (Nhe2 Afu) has an extension of 124 residues. Finally, all of the eucaryotic proteins have long C-terminal hydrophilic domains of 102 to 394 residues. In the CPA2 family, the major size differences are also in the C-terminal regions. In this family, the bacterial C-terminal extensions are large (226 to 282 residues), while all but one of the archaeal extensions are short (6 to 21 residues except for that in Orf Mth, which is 174 residues in length). All of the eucaryotic proteins have large hydrophilic C termini (185 to 277 residues). The two yeast proteins additionally exhibit large loops between their final two C-terminal TMSs (308 and 444 residues, respectively).
Finally, several proteins of the divalent anion:Na+ symporter family exhibit major size differences in their N-terminal hydrophilic domains (Table 8), although differences in loop sizes and numbers of putative TMSs contribute significantly to the overall protein size differences. Most of these proteins show small C-terminal hydrophilic extensions.
In summary, we have found that the positional basis for the size variations observed between secondary carrier homologues from the three domains of life depends primarily on the family and secondarily on the individual proteins within that family. Some families show differences primarily in the N-terminal hydrophilic domains, others show differences in the C-terminal hydrophilic domains, and still others show differences in specific inter-TMS loop regions. Most families exhibit size differences between homologues that represent a combination of these effects, with one of these effects predominating.
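The positional breakdown described above (N-terminal domain, C-terminal domain, and inter-TMS loops) can be illustrated with a minimal Python sketch. It assumes that TMS boundaries are already available as 1-based, inclusive residue positions, for example from a prediction program; the function names and the example coordinates are hypothetical and are not taken from Table 8.

```python
from typing import Dict, List, Tuple


def hydrophilic_segments(length: int, tms: List[Tuple[int, int]]) -> Dict:
    """Return the sizes of the hydrophilic regions of one protein.

    length: total number of residues in the protein.
    tms:    predicted TMSs as (start, end), 1-based inclusive, sorted.
    """
    n_term = tms[0][0] - 1            # residues before the first TMS
    c_term = length - tms[-1][1]      # residues after the last TMS
    # Loop "i-(i+1)" lies between the end of TMS i and the start of TMS i+1.
    loops = {
        f"{i}-{i + 1}": tms[i][0] - tms[i - 1][1] - 1
        for i in range(1, len(tms))
    }
    return {"n_term": n_term, "c_term": c_term, "loops": loops}


def largest_region(seg: Dict) -> Tuple[str, int]:
    """Name and size of the largest hydrophilic region."""
    regions = {"N terminus": seg["n_term"], "C terminus": seg["c_term"]}
    regions.update({f"loop {k}": v for k, v in seg["loops"].items()})
    name = max(regions, key=regions.get)
    return name, regions[name]


# Hypothetical 300-residue protein with three predicted TMSs.
example = hydrophilic_segments(300, [(11, 30), (41, 60), (200, 220)])
print(example)
print(largest_region(example))
```

Applying such a breakdown to each homologue in a family and comparing the per-region sizes across domains would indicate which region predominates in the size difference. Because it depends entirely on the predicted TMS boundaries, it inherits any error in those predictions.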
DISCUSSION
The average size differences for the various types of protein homologues analyzed are summarized in Table 9. When all family size differences are averaged and the bacterial proteins are taken as the reference (100%), the archaeal integral membrane transport proteins are 8% smaller and the eucaryotic homologues are 40% larger than their bacterial counterparts. When the three constituents of procaryotic-specific ABC-type uptake permeases were examined, the archaeal extracytoplasmic receptors proved to be 7% larger than their bacterial homologues, on average, while the membrane and cytoplasmic constituents were 3 to 4% smaller. Homologous cytoplasmic enzymes showed little or no significant difference between the three domains of life (Table 9). Within the Eucarya kingdoms, integral membrane transporters of plants proved to be significantly smaller than those of animals (21%) and fungi (17%), although no corresponding size differences were noted for homologous cytoplasmic enzymes. These observations clearly show that during evolution, integral membrane transport proteins have been subject to different pressures giving rise to size differences that are not paralleled in cytoplasmic proteins or extracytoplasmic receptors. In fact, the extracytoplasmic receptors exhibit significant size differences between Bacteria and Archaea that are opposite to those observed for the integral membrane proteins. These observations must be explainable at the molecular level.
TABLE 9.
Average size differences in various domains for the protein types analyzed
Values are relative sizes (%).
| Protein type | Bacteria | Archaea | Eucarya (all kingdoms) | Eucarya (plants) | Eucarya (animals) | Eucarya (fungi) |
|---|---|---|---|---|---|---|
| Integral membrane transporters | 100 | 92 | 140 | 83 | 105 | 100 |
| ABC permeases | | | | | | |
| Receptors | 100 | 107 | | | | |
| Membrane proteins | 100 | 96 | | | | |
| Energizers | 100 | 96 | | | | |
| Cytoplasmic enzymes | 100 | 97 | 103 | 99 | 100 | 100 |
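
The percentages quoted in the text can be recovered from the relative sizes in Table 9 with a short calculation. The following Python sketch uses only the integral membrane transporter values from the table; the function name `percent_difference` is an illustrative choice. It also shows that the size of a percentage difference depends on which group is taken as the reference.

```python
# Relative sizes (%) of integral membrane transporters, from Table 9.
domains = {"Bacteria": 100, "Archaea": 92, "Eucarya": 140}
kingdoms = {"Plants": 83, "Animals": 105, "Fungi": 100}


def percent_difference(a: float, b: float) -> float:
    """Percentage by which a differs from the reference value b."""
    return 100.0 * (a - b) / b


# Archaea and Eucarya relative to Bacteria.
print(percent_difference(domains["Archaea"], domains["Bacteria"]))   # -8.0
print(percent_difference(domains["Eucarya"], domains["Bacteria"]))   # 40.0

# Plants relative to animals and to fungi (rounded to whole percent).
print(round(percent_difference(kingdoms["Plants"], kingdoms["Animals"])))  # -21
print(round(percent_difference(kingdoms["Plants"], kingdoms["Fungi"])))    # -17

# Changing the reference changes the percentage: Bacteria relative to Eucarya.
print(round(percent_difference(domains["Bacteria"], domains["Eucarya"])))  # -29
```

Negative values indicate that the first group is smaller than the reference group.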
The molecular explanation(s) for the protein size differences documented in this report is currently elusive. Several investigators have noted that when plasmidic DNA sequences exhibiting short repetitive elements are transferred to yeast, the repeats tend to increase in number, although the reverse is true in Escherichia coli (1, 5, 6, 12). The molecular basis for this observation is not known, but if operative on chromosomal DNA over an extended period of evolutionary time, it could account for the observed average membrane protein homologue size differences. However, because the cytoplasmic proteins analyzed do not show this trend and because extracytoplasmic receptors show the opposite trend, we disfavor such an explanation.
Other explanations may exist. Our domain analyses summarized in Table 8 show that the major size differences in secondary carrier proteins occur primarily in the N- and C-terminal hydrophilic extensions and specific inter-TMS loops of these integral membrane proteins, and that the locations where the major size differences occur are family specific. Sometimes the numbers of putative α-helical TMSs differ, but these differences may be in part artifactual and do not generally account for the size variations observed. Hydrophilic domains in transporters are known to play regulatory roles in various well-studied procaryotic and eucaryotic transport proteins (3, 14). It is possible that Eucarya have been under greater pressure to evolve regulatory domains controlling transport than have Bacteria and that Bacteria have in turn been under greater pressure to evolve such regions than have the Archaea. If this possibility does account for the observed size differences, then plants must have been under less stringent pressure to evolve protein regulatory sequences than were animals and fungi. Moreover, cytoplasmic enzymes have not been subject to similar constraints. These observations may have predictive value for purposes of annotation. However, one can expect that multiple explanations will account for the size variations observed.
It is clear that the studies reported here pose more questions than they have answered. What are the membrane structural features or mechanistic features that promote the observed size differences? Are repeated DNA sequences present in the structural genes for these proteins, and if so, do numbers of repeats contribute to or even account for the size differences observed for their protein products? What are the physiological benefits to organisms in the three domains of life to promote homologue size variation? What accounts for the size differences observed between plant transporters and those from other Eucarya? Further computational experimentation, currently in progress, will be required to provide answers to these interesting questions.
ACKNOWLEDGMENTS
We thank Donna Yun, Monica Mistry, Milda Simonaitis, and Yolanda Anglin for their assistance in the preparation of this manuscript.
Work in the authors' laboratory was supported by NIH grants 2R01 AI14176 from the National Institute of Allergy and Infectious Diseases and 9R01 GM55434 from the National Institute of General Medical Sciences, as well as by the M. H. Saier, Sr. Memorial Research Fund.
REFERENCES
- 1. Henderson S T, Petes T D. Instability of simple sequence DNA in Saccharomyces cerevisiae. Mol Cell Biol. 1992;12:2749–2757. doi: 10.1128/mcb.12.6.2749.
- 2. Hofmann K, Stoffel W. TMPred—a database of membrane spanning protein segments. Biol Chem Hoppe-Seyler. 1993;347:166.
- 3. Hoischen C, Levin J, Pitaknarongphorn S, Reizer J, Saier M H, Jr. Involvement of the central loop of the lactose permease of Escherichia coli in its allosteric regulation by the glucose-specific enzyme IIA of the phosphoenolpyruvate-dependent phosphotransferase system. J Bacteriol. 1996;178:6082–6086. doi: 10.1128/jb.178.20.6082-6086.1996.
- 4. Kaback H R, Wu J. From membrane to molecule to the third amino acid from the left with a membrane transport protein. Q Rev Biophys. 1997;30:333–364. doi: 10.1017/s0033583597003387.
- 5. Levinson G, Gutman G A. Slipped-strand mispairing: a major mechanism for DNA sequence evolution. Mol Biol Evol. 1987;4:203–221. doi: 10.1093/oxfordjournals.molbev.a040442.
- 6. Morel P, Reverdy C, Michel B, Ehrlich S D, Cassuto E. The role of SOS and flap processing in microsatellite instability in Escherichia coli. Proc Natl Acad Sci USA. 1998;95:10003–10008. doi: 10.1073/pnas.95.17.10003.
- 7. Persson B L, Petersson J, Fristedt U, Weinander R, Berhe A, Pattison J. Phosphate permeases of Saccharomyces cerevisiae: structure, function and regulation. Biochim Biophys Acta. 1999;1422:255–272. doi: 10.1016/s0304-4157(99)00010-6.
- 8. Saier M H, Jr. Molecular phylogeny as a basis for the classification of transport proteins from bacteria, archaea and eukarya. In: Poole R K, editor. Advances in microbial physiology. San Diego, Calif: Academic Press; 1998. pp. 81–136.
- 9. Saier M H, Jr. Genome archeology leading to the characterization and classification of transport proteins. Curr Opin Microbiol. 1999;2:555–561. doi: 10.1016/s1369-5274(99)00016-8.
- 10. Saier M H, Jr. A functional-phylogenetic classification system for transmembrane solute transporters. Microbiol Mol Biol Rev. 2000;64:354–411. doi: 10.1128/mmbr.64.2.354-411.2000.
- 11. Stevens B R, Fernandez A, Hirayama B, Wright E M, Kempner E S. Intestinal brush border membrane Na+/glucose cotransporter functions in situ as a homotetramer. Proc Natl Acad Sci USA. 1990;87:1456–1460. doi: 10.1073/pnas.87.4.1456.
- 12. Strand M, Prolla T A, Liskay R M, Petes T D. Destabilization of tracts of simple repetitive DNA in yeast by mutations affecting DNA mismatch repair. Nature. 1994;365:274–276. doi: 10.1038/365274a0.
- 13. Thompson J D, Gibson T J, Plewniak F, Jeanmougin F, Higgins D G. The CLUSTAL X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997;25:4876–4882. doi: 10.1093/nar/25.24.4876.
- 14. Vankeerberghen A, Lin W, Jaspers M, Cuppens H, Nilius B, Cassiman J-J. Functional characterization of the CFTR R domain using CFTR/MDR1 hybrid and deletion constructs. Biochemistry. 1999;38:14988–14998. doi: 10.1021/bi991520d.
- 15. Verrey F, Jack D L, Paulsen I T, Saier M H, Jr, Pfeiffer R. New glycoprotein-associated amino acid transporters. J Membr Biol. 1999;172:181–192. doi: 10.1007/s002329900595.
- 16. West I C. Ligand conduction and the gated-pore mechanism of transmembrane transport. Biochim Biophys Acta. 1997;1331:213–234. doi: 10.1016/s0304-4157(97)00007-5.
- 17. Zar J H. The Sign test (22.6). In: Kurtz B, editor. Biostatistical analysis. Englewood Cliffs, N.J: Prentice-Hall, Inc.; 1984. pp. 386–387.


