Table 2.
Sequence set | Number of proteins | Number containing CBM | Overall percentage (%) |
---|---|---|---|
Mouse | |||
Full-length proteins^b | 33451 | 10076 | 30 |
Cytoplasmic sequences^c | |||
Soluble | 22265 | 5936 | 27 |
Type I transmembrane | 1548 | 201 | 13 |
Type II transmembrane | 2869 | 335 | 12 |
Multi-pass transmembrane | 3821 | 739 | 19 |
Total | 30503 | 7211 | 24 |
Non-cytoplasmic sequences^c | |||
Soluble | 2948 | 996 | 34 |
Type I transmembrane | 1548 | 488 | 32 |
Type II transmembrane | 2869 | 608 | 21 |
Multi-pass transmembrane | 3821 | 773 | 20 |
Total | 11186 | 2865 | 26 |
Yeast | |||
All proteins | 6736 | 2883 | 43 |
^a Sequences derived from the CYGD database (http://mips.helmholtz-muenchen.de/genre/proj/yeast/) were scanned for the presence of either of the two putative CBM sequences, ΩxΩxxxxΩ and ΩxxxxΩxxΩ, or the combined consensus sequence ΩxΩxxxxΩxxΩ (Couet et al., 1997), where Ω is Phe, Trp or Tyr and x is any amino acid.
^b The full set of 51135 coding sequences was reviewed, and those with annotated truncations at the N-terminus were discarded because their topology with respect to the membrane cannot be determined accurately.
^c Topology with respect to the membrane was calculated from the presence of signal peptides and integral membrane domains in each sequence, using a previously published annotation pipeline (Davis et al., 2006).
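
The motif scan described in the footnotes can be illustrated with a short regular-expression sketch. This is not the code used to produce the table. It assumes sequences are plain one-letter amino acid strings, and the names `CBM_PATTERNS`, `find_cbm_motifs` and `contains_cbm` are hypothetical. Ω is written as the character class `[FWY]` (Phe, Trp, Tyr) and x as any single residue.

```python
import re

# Omega = aromatic residue: Phe (F), Trp (W) or Tyr (Y); x = any residue.
AROMATIC = "[FWY]"

# Lookahead patterns, so that overlapping occurrences are all reported.
# The dictionary keys are illustrative labels, not names from the source.
CBM_PATTERNS = {
    # ΩxΩxxxxΩ
    "motif_1": re.compile(f"(?={AROMATIC}.{AROMATIC}.{{4}}{AROMATIC})"),
    # ΩxxxxΩxxΩ
    "motif_2": re.compile(f"(?={AROMATIC}.{{4}}{AROMATIC}.{{2}}{AROMATIC})"),
    # ΩxΩxxxxΩxxΩ (combined consensus)
    "combined": re.compile(
        f"(?={AROMATIC}.{AROMATIC}.{{4}}{AROMATIC}.{{2}}{AROMATIC})"
    ),
}


def find_cbm_motifs(sequence):
    """Return {label: [0-based start positions]} for each motif present."""
    seq = sequence.upper()
    hits = {}
    for label, pattern in CBM_PATTERNS.items():
        starts = [match.start() for match in pattern.finditer(seq)]
        if starts:
            hits[label] = starts
    return hits


def contains_cbm(sequence):
    """True if the sequence contains at least one putative CBM."""
    return bool(find_cbm_motifs(sequence))
```

Counting the sequences for which `contains_cbm` returns `True` and dividing by the size of the set would give a percentage of the kind reported in the last column. Any match to the combined consensus is also a match to both shorter motifs, so testing the two shorter patterns is sufficient for a yes/no call.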