TABLE 1.
Cluster ID<sup>a</sup> | Annotation | Organism with top BLASTp hit | BLASTp score | Presequence length (aa) | TP length (aa)<sup>b</sup> | TMH1 position<sup>c</sup> | TMH2 position<sup>c</sup> | I<sup>d</sup> |
---|---|---|---|---|---|---|---|---|
Class IA proteins | ||||||||
0726 | Ferredoxin | Arabidopsis thaliana | 3e−06 | 148 | 60 | 7-29 | 89-111 | 2 |
0899 | 50S ribosomal protein L3 | Cyanophora paradoxa | 3e−12 | 168 | 52 | 13-35 | 87-109 | 2 |
1043 | Putative ferredoxin | Arabidopsis thaliana | 4e−09 | 144 | 65 | 7-29 | 94-113 | 1 |
1116 | 30S ribosomal protein S20 | Synechococcus elongatus | 4e−08 | 170 | 64 | 13-35 | 99-121 | 2 |
1127 | Zeta-carotene desaturase | Oryza sativa | 1e−22 | 182 | 55 | 21-43 | 98-121 | 2 |
1204 | Uroporphyrinogen decarboxylase | Anopheles gambiae | 2e−31 | 149 | 60 | 21-38 | 98-120 | 2 |
1312 | Putative ferredoxin | Arabidopsis thaliana | 3e−09 | 143 | 58 | 11-33 | 91-110 | 1 |
1428 | RuBisCO small subunit | Euglena gracilis | e−118 | 120 | 56 | 4-26 | 82-104 | 2 |
1495 | Glutaredoxin 2 | Actinobacillus actinomycetemcomitans | 6e−10 | 130 | 55 | 7-24 | 79-101 | 2 |
1503 | Membrane-associated 30-kDa protein | Pisum sativum | 2e−10 | 168 | 75 | 7-29 | 104-126 | 1 |
1573 | Putative ferredoxin | Arabidopsis thaliana | 6e−09 | 147 | 61 | 13-35 | 96-118 | 1 |
1674 | Sugar nucleotide phosphorylase | Arabidopsis thaliana | 7e−12 | 168 | 66 | 21-43 | 109-131 | 1 |
1706 | 50S ribosomal protein L34 | Arabidopsis thaliana | 7e−06 | 190 | 63 | 13-35 | 98-117 | 1 |
2042 | Peptidyl-prolyl cis-trans isomerase | Oryza sativa | 6e−07 | 178 | 67 | 7-26 | 93-117 | 1 |
2448 | Ferredoxin-like protein | Rhizobium loti | 8e−05 | 120 | 58 | 15-37 | 95-117 | 1 |
2566 | Ribose-5-phosphate isomerase | Spinacia oleracea | 2e−24 | 144 | 61 | 20-42 | 103-125 | 1 |
2596 | Ycf53 (tetrapyrrole-binding protein) | Synechococcus elongatus | 2e−12 | 182 | 58 | 12-34 | 92-114 | 1 |
2669 | 50S ribosomal protein L11 | Odontella sinensis | 3e−12 | 186 | 55 | 19-41 | 96-118 | 2 |
2795 | d-Ribulose-5-phosphate 3-epimerase | Arabidopsis thaliana | 1e−24 | 150 | 58 | 21-40 | 98-120 | 2 |
2990 | Albino 3 | Bigelowiella natans | 5e−37 | 211 | 77 | 29-51 | 128-150 | 1 |
3121 | Chaperonin PSII quinone-binding protein | Arabidopsis thaliana | 8e−30 | 189 | 94 | 3-25 | 119-136 | 1 |
3164 | Rhodanese domain-containing protein | Oryza sativa | 6e−08 | 133 | 58 | 5-24 | 82-104 | 1 |
3171 | Photosystem II 22-kDa protein | Arabidopsis thaliana | 3e−17 | 152 | 67 | 21-40 | 107-126 | 1 |
3330 | Coproporphyrinogen III oxidase | Chlamydomonas reinhardtii | e−108 | 156 | 53 | 22-44 | 97-119 | 2 |
3362 | ATP synthase delta chain | Nicotiana tabacum | 7e−22 | 147 | 57 | 15-37 | 94-116 | 2 |
3372 | 50S ribosomal protein L15 | Bigelowiella natans | 1e−14 | 191 | 54 | 24-46 | 100-119 | 1 |
3375 | Light-regulated Chlp-localized protein | Solanum tuberosum | 4e−20 | 120 | 60 | 12-31 | 91-110 | 1 |
3383 | ATP synthase gamma chain | Odontella sinensis | 7e−78 | 137 | 60 | 13-35 | 95-113 | 3 |
3449 | Cytochrome f | Euglena gracilis | 4e−91 | 147 | 60 | 7-26 | 86-108 | 2 |
3469 | Porphobilinogen deaminase | Euglena gracilis | 0 | 151 | 56 | 17-39 | 95-112 | 2 |
3474 | Probable membrane-associated 30-kDa protein | Synechocystis sp. | 7e−49 | 151 | 63 | 7-26 | 89-111 | 1 |
3482 | Fructose-1,6-bisphosphatase | Bigelowiella natans | 2e−71 | 188 | 56 | 20-37 | 93-115 | 1 |
3500 | Glu 1-semialdehyde 2,1-aminomutase | Chlorarachnion sp. | e−148 | 138 | 53 | 7-29 | 82-99 | 2 |
3504 | Carbonic anhydrase | Deinococcus radiodurans | 7e−28 | 102 | 52 | 5-24 | 76-98 | 3 |
3558 | Carbonic anhydrase | Deinococcus radiodurans | 1e−10 | 140 | 62 | 13-35 | 97-119 | 1 |
3594 | 50S ribosomal protein L28 | Toxoplasma gondii | 4e−18 | 160 | 55 | 13-35 | 90-112 | 2 |
3603 | Peroxiredoxin precursor | Chlamydomonas reinhardtii | 8e−85 | 134 | 57 | 5-27 | 84-106 | 2 |
3619 | 50S ribosomal protein L21 | Thermoanaerobacter tengcongensis | 3e−11 | 168 | 52 | 13-35 | 87-109 | 1 |
3635 | Coproporphyrinogen III oxidase | Chlamydomonas reinhardtii | 1e−78 | 169 | 60 | 29-51 | 111-133 | 3 |
3653 | Delta 12 fatty acid desaturase | Phaeodactylum tricornutum | 2e−98 | 162 | 77 | 13-30 | 107-129 | 2 |
3673 | Carbonic anhydrase | Deinococcus radiodurans | 8e−30 | 179 | 68 | 13-32 | 100-122 | 4 |
3676 | 30S ribosomal protein S1 | Chlamydomonas reinhardtii | 7e−32 | 233 | 66 | 4-26 | 92-114 | 1 |
3817 | Acyl carrier protein | Synechocystis sp. | 6e−13 | 122 | 49 | 15-34 | 83-105 | 1 |
3830 | Ferredoxin | Euglena viridis | 1e−41 | 138 | 56 | 17-39 | 95-117 | 3 |
3881 | ATP/ADP transporter | Galdieria sulphuraria | 0 | 148 | 52 | 12-34 | 86-105 | 1 |
3900 | PsbM | Zea mays | 0.017 | 154 | 63 | 13-35 | 98-120 | 3 |
3911 | LHCI | Euglena gracilis | 5e−86 | 179 | 50 | 13-35 | 85-107 | 1 |
3934 | Ferredoxin-NADP+ reductase | Chlamydomonas reinhardtii | e−144 | 114 | 51 | 5-27 | 78-100 | 1 |
3943 | NADPH protochlorophyllide reductase | Chlorarachnion sp. | 5e−69 | 155 | 53 | 12-34 | 87-109 | 1 |
3946 | RuBisCO activase | Chlorococcum littorale | e−145 | 95 | 59 | 13-35 | 72-101 | 2 |
3996 | LHCI | Euglena gracilis | e−116 | 158 | 55 | 13-35 | 90-109 | 2 |
4008 | LHCI | Euglena gracilis | 0 | 141 | 51 | 13-35 | 86-108 | 1 |
4056 | CP29 | Oryza sativa | 7e−57 | 136 | 50 | 12-34 | 84-106 | 1 |
7084 | Chl. synthase 33-kDa subunit | Anabaena sp. | 9e−10 | 141 | 45 | 15-34 | 79-101 | 1 |
7147 | SOUL-heme-binding protein | Arabidopsis thaliana | 7e−20 | 136 | 68 | 5-22 | 90-112 | 1 |
7392 | ATP-dependent Clp protease | Vibrio cholerae | 1e−44 | 143 | 58 | 15-37 | 95-117 | 1 |
7739 | Ycf3 (PSI assembly) | Physcomitrella patens | 3e−27 | 162 | 64 | 20-42 | 106-128 | 1 |
7766 | RuBisCO 60-kDa chaperonin | Arabidopsis thaliana | 5e−37 | 122 | 67 | 7-29 | 96-118 | 1 |
8108 | YebC-related protein | Arabidopsis thaliana | 7e−17 | 108 | 57 | 7-29 | 86-108 | 1 |
8254 | Chlorophyll b synthase | Dunaliella salina | 8e−18 | 114 | 70 | 6-22 | 92-111 | 1 |
8643 | Uroporphyrinogen decarboxylase | Ashbya gossypii | 1e−17 | 164 | 63 | 17-39 | 102-124 | 1 |
8888 | 3-Isopropylmalate dehydrogenase | Bifidobacterium longum | 7e−13 | 150 | 71 | 13-35 | 106-128 | 1 |
9366 | Photosystem II family protein | Arabidopsis thaliana | 4e−11 | 137 | 61 | 13-35 | 96-118 | 1 |
Class IB proteins | ||||||||
3955 | Oxygen evolving enhancer (OEE1) | Euglena gracilis | e−116 | 142 | 53 | 5-27 | 80-99 | 3 |
4026 | Oxygen evolving enhancer (OEE2) | Lycopersicon esculentum | 6e−17 | 153 | 49 | 20-42 | 91-113 | 2 |
3381 | HCF136 (PSII stability factor) | Arabidopsis thaliana | 5e−07 | 142 | 52 | 7-29 | 81-103 | 1 |
3249 | Putative ascorbate peroxidase | Lycopersicon esculentum | 2e−04 | 184 | 70 | 13-35 | 105-127 | 1 |
3902 | Cytochrome c6 | Euglena gracilis | 4e−69 | 123 | 60 | 29-51 | 111-133 | 2 |
3752 | PSI subunit III (PsaF) | Chlamydomonas reinhardtii | 6e−53 | 144 | 60 | 13-35 | 95-114 | 2 |
2674 | Thylakoid luminal 17.4-kDa protein | Arabidopsis thaliana | 5e−22 | 171 | 71 | 15-37 | 108-127 | 1 |
Class II proteins | ||||||||
3630 | Photosystem II (PsbW) | Chlorarachnion sp. | 4e−15 | 82 | 52 | 20-37 | | 3 |
3294 | ABC transporter (cytochrome c biogenesis) | Nostoc punctiforme | 5e−33 | 175 | 135 | 34-53 | | 1 |
0923 | PEP/phosphate translocator | Phaeodactylum tricornutum | 4e−10 | 166 | 132 | 13-35 | | 1 |
4012 | Oxygen evolving enhancer (OEE3) | Chlamydomonas reinhardtii | 3e−22 | 61 | 36 | 13-35 | | 2 |
2060 | Mg-protoporphyrin IX methyltransferase | Synechococcus elongatus | 4e−17 | 66 | 40 | 5-27 | | 1 |
2416 | Peptide chain release factor (RF) 2 | Synechocystis sp. | 3e−42 | 99 | 70 | 13-35 | | 1 |
3797 | PSI subunit IV (PsaE) | Chlamydomonas reinhardtii | 6e−17 | 95 | 61 | 15-37 | | 3 |
4932 | 50S ribosomal protein L9 | Bigelowiella natans | 6e−05 | 62 | 39 | 15-33 | | 1 |
8550 | Short-chain (SC) dehydrogenase | Prochlorococcus marinus | 8e−07 | 120 | 82 | 29-51 | | 2 |
3784 | Phosphoribulokinase | Vaucheria litorea | 1e−76 | 100 | 75 | 20-42 | | 1 |
9282 | MECP synthase | Arabidopsis thaliana | 2e−36 | 121 | 80 | 28-50 | | 1 |
6808 | Squalene and phytoene synthases | Prochlorococcus marinus | 1e−27 | 98 | 47 | 35-52 | | 1 |
2660 | ClpB | Phaseolus lunatus | 8e−48 | 123 | 76 | 37-52 | | 1 |
<sup>a</sup> Original cluster IDs had “EEL0000” preceding the 4-digit numbers shown.
<sup>b</sup> For class I proteins, this is the region between the signal sequence and the stop-transfer region.
<sup>c</sup> TMH1 and TMH2 are the hydrophobic domains (range of amino acids is given from the start Met) of the signal sequence and stop-transfer sequence, respectively, as predicted by the TMHMM program. Underlined regions indicate that the TMHMM program did not predict a TMH (TMHMM value, 0.1 < P < 0.9) but that a hydrophobic patch is apparent from a Kyte-Doolittle analysis.
<sup>d</sup> Number of nearly identical isoforms detected.
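
The footnote on TP length describes it, for class I proteins, as the region between the signal sequence and the stop-transfer region. For most class I rows the listed value appears to equal the first residue of TMH2 minus the last residue of TMH1 (for cluster 0726, 89 − 29 = 60). That convention is inferred from the tabulated numbers and is not stated in the table itself, so the following Python sketch should be read as a consistency check under that assumption. The function names are hypothetical; the rows are copied from the table.

```python
# Rows copied from Table 1: (cluster ID, listed TP length, TMH1 range, TMH2 range).
ROWS = [
    ("0726", 60, "7-29", "89-111"),
    ("0899", 52, "13-35", "87-109"),
    ("1043", 65, "7-29", "94-113"),
    ("3121", 94, "3-25", "119-136"),
    ("3946", 59, "13-35", "72-101"),
]


def parse_range(text):
    """Split a 'start-end' residue range into two integers."""
    start, end = text.split("-")
    return int(start), int(end)


def tp_length_from_tmh(tmh1, tmh2):
    """Assumed convention (inferred, not stated in the table):
    TP length = first residue of TMH2 minus last residue of TMH1."""
    _, tmh1_end = parse_range(tmh1)
    tmh2_start, _ = parse_range(tmh2)
    return tmh2_start - tmh1_end


def check_rows(rows):
    """Return (cluster ID, listed TP length, derived TP length) per row."""
    return [(cid, tp, tp_length_from_tmh(t1, t2)) for cid, tp, t1, t2 in rows]


if __name__ == "__main__":
    for cid, listed, derived in check_rows(ROWS):
        status = "agrees" if listed == derived else "differs"
        print(cid, listed, derived, status)
```

Under this assumption, four of the five sample rows agree with the listed TP length, while cluster 3946 does not (listed 59, derived 37). The table gives no explanation for that row, so the subtraction should be treated as a rule of thumb for reading the columns, not as the authors' definition.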