(1) Proteins exhibiting an N-terminal SP are extracted from the TMD database (Y). (2) The different types of SP (Tat-SP, ABC-SP and FPE-SP) are extracted (Y). (3) Absence of a signal anchor (N) and export (Y) define Sec-dependent SP (Sec-SP). (1) Proteins with TMD but no predicted SP (N) (4) are checked for uncleaved SP (Unc-SP), i.e. a TMD of at least 7 amino acids within the first 100 N-terminal residues and with Nin-Cout topology (Type II signal) (Y). (3) Unc-SP also comprises proteins with a signal anchor (Y) and SP categorised as non-exported (N). (5) From there, the types of SP are clearly defined. Together with (4) proteins with TMD but no SP (N) and (6) protein substrates of holins, Wss and FEA, (7) the presence of the respective protein secretion systems is checked (Y). (8) When the respective protein secretion system is absent (N) or (9) proteins are not predicted as secreted (N), proteins are considered as cytoproteins and located in the CP. (10) Cytoproteins predicted as exported by NC (Y) are further considered as located extracellularly. (7) Secreted proteins and their respective secretion systems are defined from there. (11) Translocated proteins with Unc-SP (Y) are IMPs. (11) Translocated proteins without Unc-SP (N), and (12) with TMD (Y) but no SP (N), are IMPs. (13) Remaining translocated proteins with a cleavable SP (Y) and a single predicted TMD (TMD = 1) (Y) cannot be IMPs and are checked for (16) the presence of a lipobox. (13) Remaining translocated proteins with more than one TMD (N) are checked for (14) the absence of overlap (N) with the SP region and the LPXTG domain respectively (TMD = 2 AND TMD = LPXTG) to be IMPs; otherwise (Y), they are checked for (16) the presence of a lipobox. (15) From TMD topology prediction (Figure 2), IMPs are further subcategorised and considered as integral to the CM. (16) Presence of a lipobox (Y) defines lipoproteins anchored to the CM. (17) The presence of a glycine residue at position C+2 (C+2 = G) (Y) indicates potential release into the EM [26], [48]. (18) Presence of cell-wall retention domains defines (19) parietal proteins (CW-proteins) that are further subcategorised (Figure 2) and considered as located at the CW. (20) Proteins with fewer than 2 GW modules are not defined as CW-proteins located at the CW [57], [58]. (21) Proteins that are part of the S-layer, pilus and cellulosome, as well as (22) of the pseudopilus and flagellum, are defined. (23) Secreted proteins with none of the cell-envelope retention features are considered as exoproteins located in the EM. N: No, Y: Yes.
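Three of the numeric rules in this workflow, steps (4), (17) and (20), can be sketched in Python as below. This is only an illustrative sketch, not the authors' implementation. The `TMD` record, the function names and the input layout are hypothetical. The sketch also assumes that "within the first 100 N-terminal residues" means that at least 7 residues of the TMD fall inside that window, and that C+2 is counted from the cysteine of the lipobox.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TMD:
    """Hypothetical record for one predicted transmembrane domain."""
    start: int   # 1-based position of the first TMD residue
    end: int     # 1-based position of the last TMD residue
    n_in: bool   # True when the predicted topology is Nin-Cout


def is_uncleaved_sp(tmds: List[TMD], window: int = 100, min_len: int = 7) -> bool:
    """Step (4): a Nin-Cout TMD with at least `min_len` residues
    inside the first `window` N-terminal residues (assumed reading)."""
    for tmd in tmds:
        # Number of TMD residues that lie inside the N-terminal window.
        overlap = min(tmd.end, window) - tmd.start + 1
        if tmd.n_in and overlap >= min_len:
            return True
    return False


def has_glycine_at_c_plus_2(sequence: str, cys_pos: int) -> bool:
    """Step (17): glycine two residues after the lipobox cysteine.
    `cys_pos` is the 1-based position of that cysteine (assumed input)."""
    idx = (cys_pos - 1) + 2  # 0-based index of position C+2
    return idx < len(sequence) and sequence[idx] == "G"


def is_gw_cw_protein(n_gw_modules: int) -> bool:
    """Step (20): fewer than 2 GW modules does not define a CW-protein."""
    return n_gw_modules >= 2
```

For example, under these assumptions a Nin-Cout TMD spanning residues 10 to 30 would pass the Unc-SP check, whereas one spanning residues 97 to 117 would not, because only 4 of its residues lie within the first 100.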