Abstract
Tuberculosis (TB) is an infectious bacterial disease that causes morbidity and mortality, especially in developing countries. Although its efficacy against TB has displayed a high degree of variability (0%–80%) in different trials, Mycobacterium bovis bacillus Calmette-Guérin (BCG) has been recognized as an important weapon for preventing TB worldwide for over 80 years. Because secreted proteins often play vital roles in the interaction between bacteria and host cells, the secretome of mycobacteria is considered to be an attractive reservoir of potential candidate antigens for the development of novel vaccines and diagnostic reagents. In this study, we performed a proteomic analysis of BCG culture filtrate proteins using SDS-PAGE and high-resolution Fourier transform mass spectrometry. In total, 239 proteins (1555 unique peptides) were identified, including 185 secreted proteins or lipoproteins. Furthermore, 17 novel protein products not annotated in the BCG database were detected and validated by means of RT-PCR at the transcriptional level. Additionally, the translational start sites of 52 proteins were confirmed, and 22 proteins were validated through extension of the translational start sites based on N-terminus-derived peptides. There are 103 secreted proteins that have not been reported in previous studies on the mycobacterial secretome and are unique to our study. The physicochemical characteristics of the secreted proteins were determined. Major components from the culture supernatant, including low-molecular-weight antigens, lipoproteins, Pro-Glu and Pro-Pro-Glu family proteins, and Mce family proteins, are discussed; some components represent potential predominant antigens in the humoral and cellular immune responses.
Tuberculosis (TB) is one of the greatest killers worldwide, especially in developing countries. About one-third of the world's population has been infected by TB bacteria, and those infected have a 10% lifetime risk of falling ill with TB (1). In 2011, ∼8.7 million people fell ill and 1.4 million died from TB. It is notable that TB and human immunodeficiency virus (HIV) can form a lethal combination, each speeding the other's progress. Additionally, drug-resistant TB is growing and is present in virtually all of the countries surveyed (2). Although TB is a treatable and curable disease, co-infection with HIV and the emergence of drug-resistant strains have made treatment a heavy economic burden, and there are severe adverse drug reactions in patients. Currently, an important weapon in the fight against TB is Mycobacterium bovis bacillus Calmette-Guérin (BCG) (3). The BCG vaccine has existed for over 80 years and has a documented protective effect against meningitis and disseminated TB in children (4). However, it does not prevent primary infection and, more importantly, does not prevent the reactivation of a latent pulmonary infection. Furthermore, the efficacy of BCG against TB has displayed a high degree of variability (0%–80%) in different trials (5). Therefore, the impact of BCG vaccination on the transmission of TB is limited, and new vaccination strategies as alternatives or complements to BCG are urgently needed, particularly against primary infection and latent pulmonary infection.
Bacterial secreted proteins, which are specifically released into the surrounding extracellular milieu, constitute a large and biologically important subset of proteins that are involved in cellular communication, adhesion, and migration (6). In Gram-positive bacteria, secreted proteins can be anchored to the cytoplasmic membrane, associated with the cell wall, released into the extracellular milieu, or injected into a host cell (7). A significant number of mycobacterial proteins have been shown to be secreted or exported during growth; these proteins are central to pathogenesis, and some of them have been shown to be key T-cell antigens mediating protective immunity against TB (8). The secretome of mycobacteria is considered to be an attractive reservoir of potential candidate antigens for the development of new vaccines and diagnostic reagents. However, secretome analysis is quite challenging, and bacterial secretomes have often been under-studied. This scenario could be attributed to technical limitations such as the presence of low-abundance proteins or contamination by cytoplasmic or other normally nonsecreted proteins released following cell lysis and death (9). Several attempts have been made to define the secretome of M. tuberculosis using two-dimensional gel electrophoresis or liquid chromatography (LC) coupled with different types of MS analysis. For example, Mattow et al. utilized two-dimensional gel electrophoresis coupled with MALDI-MS and capillary LC–electrospray ionization–MS/MS to identify 137 proteins from culture supernatant, only 42 of which had previously been described as secreted proteins (8). Okkels et al. applied a narrow-range pI gradient two-dimensional gel electrophoresis separation combined with MALDI-MS and electrospray ionization MS/MS to characterize eight ESAT-6 spots, among which four species of full-length ESAT-6 were identified (10). Malen et al. used two-dimensional gel electrophoresis coupled with a MALDI-TOF MS/MS approach to identify 257 mycobacterial proteins, including 159 secreted proteins (11).
While some previous secretome studies on M. tuberculosis have been performed, a comprehensive analysis of the BCG secretome has not been undertaken. Therefore, our knowledge of important secreted immune constituents of BCG and their functions against TB is still ambiguous. In 2003, Florio et al. identified 12 proteins in the culture filtrate (CF) of BCG in the pI range of 6–11 using two-dimensional gel electrophoresis and MALDI-TOF, and only three of them had not been described previously (12). Using a similar method, in 2010, Rodriguez-Alvarez et al. compared the secretomes of wild-type M. bovis and a PstS1-recombinant BCG vaccine substrain (rBCG38) and identified six conserved hypothetical proteins that are differentially expressed (13). Recently, Berredo-Pinho et al. reported the proteomic profile of culture filtrate proteins (CFPs) from M. bovis BCG Moreau and identified 101 different proteins, of which 53 were thought to be secreted proteins (14). Although proteome-wide studies performed on CFPs of BCG are still limited because of their frequently low concentrations, recent developments in Fourier transform MS with high resolution and accuracy at both the MS and MS/MS levels might substantially promote comprehensive investigations of secreted protein profiling (15). Moreover, to reduce the complexity of the extracted CFPs, a one-dimensional gel electrophoresis separation outperformed previous strategies with respect to the number of identified proteins, reproducibility, and throughput (9). In the present study, BCG CFPs were separated by means of one-dimensional gel electrophoresis, and 12 gel slices were cut. The resulting peptides were separated via reversed-phase LC and analyzed using a high-resolution LTQ Orbitrap Velos to improve the identification coverage and reliability. In total, 239 CFPs were identified with high confidence, including 185 potential secreted proteins or lipoproteins. 
We also obtained 17 novel protein products that were not annotated in the BCG database, and we performed RT-PCR at the transcriptional level to support the existence of these proteins. We found 103 secreted proteins that have not been reported in previous studies on the mycobacterial secretome and that are unique to our study. Additionally, 52 existing annotated proteins were confirmed with correctly assigned translational start sites (TSSs), and 22 proteins were validated by extension of their TSSs based on N-terminal peptides. This study provides a secretomic repertoire of BCG, and some of the potential prominent antigens implicated in protective immune responses will most likely contribute to the design of future vaccination and diagnostic strategies against TB.
EXPERIMENTAL PROCEDURES
Strains and Sample Preparation
M. bovis var BCG NCTC 5692 was grown in 5 l of liquid Sauton medium, and cells were collected in the mid-exponential phase (A600 of 0.4–0.5) after 14 days of incubation with gentle agitation. The culture supernatant and cells were separated via filtration through first a 0.45-μm-pore-size membrane and then a 0.22-μm-pore-size membrane (Millipore, Bedford, MA). The CFP samples were prepared as described elsewhere, with some modifications (8). Briefly, after the addition of a protease inhibitor mixture (Roche, Germany), the resulting CF was treated with 0.015% (w/v) sodium deoxycholate under shaking, incubated for 10 min at room temperature, and subsequently subjected to a trichloroacetic acid (10%, v/v) precipitation procedure. The resulting solution was incubated overnight at 4 °C and then centrifuged at 4000 × g for 15 min to collect the precipitates. The precipitates were washed twice with ice-cold acetone and allowed to air dry, and their protein content was quantitated via the bicinchoninic acid protein assay. The protein pellets were then suspended in SDS-PAGE loading buffer and dissolved for 2 h. The samples were boiled for 10 min and subsequently centrifuged at maximum speed for 30 min, and the resulting supernatant was subjected to SDS-PAGE.
In-gel Digestion
Approximately 10 μg of protein from the CF was loaded onto a 12% SDS-PAGE gel (1.0 mm thick, width/length of 8.6/6.8 cm) and stained using colloidal Coomassie Blue stain. After the excess stain had been removed, each lane was cut into 12 bands and subjected to an in-gel tryptic digestion protocol as described elsewhere (16). Briefly, sliced bands were washed twice with 50% acetonitrile in 50 mM ammonium bicarbonate (NH4HCO3) for 15 min at room temperature and then dehydrated with 100% acetonitrile. The in-gel reduction was performed using 10 mM dithiothreitol at 37 °C for 45 min followed by alkylation using 55 mM iodoacetamide at room temperature for 30 min in the dark, and an in-gel digestion was conducted using modified trypsin (trypsin/protein ratio of 1/10 (w/w), Promega, Madison, WI) at 37 °C for 16 to 20 h. All of the tryptic peptides extracted from the gel slices were desalted using ZipTipC18 (Millipore, Bedford, MA) and were solubilized in 0.1% formic acid for subsequent LC-MS/MS analysis.
In-house Database Construction
For proteomic discovery, we constructed two in-house databases. The first was the six reading frame translation database of the BCG genome (downloaded from NCBI). This construction was performed by translating the entire genome in all six reading frame options, three forward and three on the reversed DNA strand (17). Briefly, the codons TAA, TAG, and TGA were selected as the stop codons in a certain frame, and putative open reading frames (ORFs) were generated by translating sequences from the first nucleotide to a stop codon. The next putative ORF was started at the next nucleotide following the previous stop codon. This procedure was performed on both DNA strands of the chromosome in all three reading frames. Entries containing fewer than 15 aa or redundant sequences from repetitive genomic information were deleted for simplification. In total, we obtained a set of 111,825 possible entries. All of the entries here were named BCGRF000001 to BCGRF111825 with the frame tag. Moreover, the annotations with the same frames were replaced with original names in the BCG genome. For example, entry BCGRF000001 was renamed BCG0001, which was annotated in the BCG genome data set with the same frame as BCGRF000001. Additionally, sequences for common contaminants (338 unique entries) from two collections (248 from the Max Planck Institute of Biochemistry, 112 from the Global Proteome Machine Organization Common Repository of Adventitious Protein) were appended to the end of the target database FASTA file (supplemental Text S1). In total, the final database had 112,163 entries.
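The six-frame construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, redundancy is handled here simply by dropping exact duplicate sequences, and entry naming, frame tags, and the appended contaminant sequences are omitted.

```python
from itertools import product

# Standard codon table (amino acid assignments are shared with the bacterial
# table), built in TCAG order; '*' marks the stop codons TAA, TAG, and TGA.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_orfs(genome, min_len=15):
    """Collect stop-to-stop putative ORFs from all six reading frames.

    Entries shorter than min_len residues are dropped, as in the text.
    Assumption: 'redundant' entries are treated as exact duplicates.
    """
    genome = genome.upper()
    seen = set()
    entries = []
    for strand, dna in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            protein = translate(dna[frame:])
            # Splitting at '*' starts the next ORF right after each stop codon.
            for segment in protein.split("*"):
                if len(segment) >= min_len and segment not in seen:
                    seen.add(segment)
                    entries.append((strand, frame + 1, segment))
    return entries
```

Each returned tuple holds the strand, the frame (1 to 3), and the translated sequence, which would be enough to assign frame-tagged identifiers of the kind used in the text.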
The second database was a specialized N-terminal extension database that was constructed as described elsewhere (18). All of the customized entries were merged into the extension database, except for those entries for which the start codons were the same as in the previous annotation. In total, from the annotated sequences listed in the BCG genome, 1805 alternative start site entries were collected in the extension database.
MS Analysis and Database Search for Protein Identification
Digested peptide mixtures were separated using a nanoAcquity ultra-performance LC system (Waters, Milford, MA) equipped with a C18 reversed-phase microcapillary trapping column (nanoAcquity Symmetry C18, 5 μm, 180 μm × 20 mm) and an analytical column (nanoAcquity BEH 300 C18, 1.7 μm, 100 μm × 100 mm). The outlet of the analytical column was coupled directly to a high-resolution LTQ Orbitrap Velos mass spectrometer (Thermo Fisher Scientific, Germany) using a nano-electrospray ion source. Peptides were eluted through the analytical column with a constant flow at 0.4 μl/min using a 160-min gradient with aqueous solvents A (0.1% HCOOH) and B (0.1% HCOOH, 80% CH3CN). During the elution step, the percentage of solvent B increased in a linear fashion from 5% to 35% at 5–95 min, followed by an increase to 85% at 95–130 min, a column wash at 85% at 130–145 min, and re-equilibration at 1% B at 146–160 min. The eluted peptides were introduced into the mass spectrometer using a PicoTip Emitter (SilicaTip 360 μm outer diameter × 20 μm inner diameter, 10 ± 1 μm) (New Objective, Woburn, MA) and were electrosprayed with a distally applied spray voltage of 2.0 kV. Full scan MS spectra with an m/z range of 380 to 2000 were acquired in profile mode with a resolution of 60,000 in the Orbitrap. The most intense precursor ions (up to 20, multiply charged (2+ or 3+)) from the full scan were selected for fragmentation by collision-induced dissociation and were detected in the Orbitrap with a resolution of 7500. The dynamic exclusion list for MS/MS was restricted to 5000 entries, with a maximum retention period of 60 s and a relative mass window of 10 ppm. A normalized collision energy of 35% was used for the MS/MS, and the data were acquired in centroid mode. Additionally, an activation Q-value of 0.25 and an activation time of 10 ms were applied for the MS/MS. Lock mass calibration using a background ion from the air (m/z 445.12003) was applied.
In total, we performed 36 reversed-phase LC-MS/MS runs (each lane was cut into 12 bands) with three repeats.
The raw data were processed using Proteome Discoverer software (version 1.3; Thermo Scientific, Germany) with the search algorithm MASCOT (Matrix Science, London, UK), and the MS/MS spectra were searched against three customized databases: the BCG protein database, the N-terminal extension database, and the six reading frame translation database. Enzyme specificity was set to trypsin/P, and a maximum of two missed cleavages was allowed. Cysteine carbamidomethylation was used as a static modification, and methionine oxidation and N-terminal acetylation were used as dynamic modifications. The initial maximal allowed mass tolerance was set at 5 ppm for precursor masses and 0.8 Da for fragment ion masses. The reverse database search option was enabled, and a maximum target-decoy-based false discovery rate of 1.0% for peptide and protein identification was allowed. At least two unique peptides were required for protein identification. All of the raw mass spectra files have been deposited into the publicly accessible database PeptideAtlas and are now available with dataset identifier PASS00133. The complete set of peak list files (mgf file format) converted from the raw files can also be accessed freely with dataset identifier PASS00213.
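Two of the search settings above, the 5 ppm precursor tolerance and the 1% target-decoy false discovery rate, lend themselves to a short worked sketch. The functions below are hypothetical illustrations, not the Proteome Discoverer or MASCOT implementation. In particular, the false discovery rate is estimated here as decoy hits divided by target hits, which is one common convention and is an assumption about how the rate is computed.

```python
def ppm_window(mz, ppm=5.0):
    """Return the (low, high) m/z bounds for a relative tolerance in ppm."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def fdr_score_cutoff(psms, max_fdr=0.01):
    """Find the lowest score cutoff that keeps the estimated FDR within max_fdr.

    psms is a list of (score, is_decoy) pairs, where is_decoy marks a hit to
    the reversed (decoy) database. The FDR at each rank is estimated as
    decoys / targets among all matches scoring at least as high.
    Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# A 5 ppm window around m/z 1000 spans roughly 999.995 to 1000.005.
low, high = ppm_window(1000.0, ppm=5.0)
```

At m/z 1000, a 5 ppm tolerance corresponds to ±0.005, which is far narrower than the 0.8 Da window allowed for fragment ions.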
Bioinformatics Tools for the Prediction of Secreted Proteins
SignalP 4.0 and TatP 1.0 software were used for the prediction of classical amino-terminal secretion signal peptides and Tat-dependent signal peptides, respectively. Non-classically secreted proteins were predicted using SecretomeP 2.0. Protein transmembrane helices were predicted using TMHMM 2.0. All of this software is publicly available from the Centre for Biological Sequence Analysis at the Technical University of Denmark. The theoretical molecular mass and pI value were obtained from the Proteome Discoverer software calculation. Lipoproteins were predicted using a Hidden Markov Model method PRED-LIPO for Gram-positive bacteria. The subcellular localization of the identified proteins was predicted using the PSORTb v4.0 program. Gene prediction programs used for prokaryotes were FgeneSB and GeneMark. Functional classifications were determined according to the Pasteur Institute functional classification tree. Homologous proteins were searched using the Blastp program.
RT-PCR Validation
RT-PCR was performed to provide transcriptional level evidence for genes corresponding to novel proteins identified in this study using a previously described protocol (18). Briefly, the total RNAs extracted from BCG cells using the SV Total RNA Isolation System kit (Promega, Madison, WI) were treated with RQ1 RNase-free DNase to remove any contaminating genomic DNA, and this was followed by heat inactivation of the endonuclease. cDNA synthesis was performed from 1 μg of the total RNA using the SuperScript III Reverse Transcriptase (Invitrogen) according to the manufacturer's protocol. PCR was performed using 1 μl of the resulting cDNA as a starting material according to standard procedures. PCR reactions that were conducted with isolated RNAs as the templates were used as negative controls to indicate the elimination of genomic DNA contamination, and a reaction with human β-actin cDNA as a template was used as a positive control with an amplified product of 353 bp. The sizes of the amplified products were determined by an E-Gel Electrophoresis System using a 2% E-Gel pre-cast agarose gel and a 1 kb Plus DNA Ladder (Invitrogen). The gene-specific primers used in this study were designed using Primer Premier 5.0 software and are listed in supplemental Table S1.
RESULTS AND DISCUSSION
In Silico Characterization of Classical Secreted Proteins and Lipoproteins in the BCG Genome
Proteins that are secreted through the general secretory (Sec) pathway normally have classical amino-terminal secretion signal peptide sequences (7). In order to generate a conceptual list of classically secreted proteins, we screened the BCG database for proteins that possess classical signal peptides using the software SignalP 4.0. The current SignalP 4.0 version provides two types of networks: the SignalP-TM and SignalP-noTM methods. To obtain a more accurate prediction, we used SignalP-TM to predict those proteins that might include transmembrane (TM) regions and SignalP-noTM to predict those without TM regions. Accordingly, we first predicted proteins with TM regions using the program TMHMM 2.0 and obtained 634 proteins with one or multiple TM regions. These proteins were analyzed with the SignalP-TM predictor, and 39 of them were predicted to contain signal peptides. Those without TM regions were analyzed with SignalP-noTM, and 204 were predicted to possess signal peptides. In total, 243 proteins in the BCG protein database were predicted to contain amino-terminal signal peptide sequences (supplemental Table S2); these proteins are considered to be classical secreted proteins.
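The two-step routing described above (TMHMM first, then the matching SignalP network) can be expressed as a small dispatch function. This is only a sketch: the predictors are passed in as hypothetical callables that return True or False for a protein identifier, standing in for parsed TMHMM and SignalP output, and none of the names below belong to those tools.

```python
def classify_signal_peptides(proteins, tm_helix_count, signalp_tm, signalp_notm):
    """Route each protein to the SignalP network that matches its TM status.

    proteins        iterable of protein identifiers
    tm_helix_count  dict mapping identifier -> number of predicted TM helices
    signalp_tm      callable(identifier) -> bool, used when >= 1 TM helix
    signalp_notm    callable(identifier) -> bool, used when no TM helix
    Returns the identifiers predicted to carry a signal peptide, in input order.
    """
    with_signal_peptide = []
    for pid in proteins:
        has_tm = tm_helix_count.get(pid, 0) >= 1
        predictor = signalp_tm if has_tm else signalp_notm
        if predictor(pid):
            with_signal_peptide.append(pid)
    return with_signal_peptide


# Counts reported in the text: 39 (SignalP-TM) + 204 (SignalP-noTM) = 243.
total_classical = 39 + 204
```

Keeping the predictors as parameters means the same routing can be reused regardless of how the tool output is stored.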
Lipoproteins are a functionally diverse class of secreted bacterial proteins that account for 1% to 3% of the proteins encoded by bacterial genomes (19). The signal peptides of these proteins direct their export and post-translational lipid modification (20). For Gram-positive bacteria, lipoproteins are usually predicted using the program PRED-LIPO, which is based on a hidden Markov model and outperforms the well-known LipoP method (21). Although this prediction method has a high specificity and very few false positives, a highly specific method can still miss true lipoproteins, so we also manually screened the proteins not predicted by PRED-LIPO with a Blastp analysis using orthologous lipoproteins from other species. We obtained 66 lipoproteins via the PRED-LIPO prediction and an additional 40 potential lipoproteins via manual Blastp analysis. In total, 106 potential lipoproteins were predicted in the BCG database (supplemental Table S3A).
Analysis of the CFPs Identified Using SDS-PAGE and High-resolution Fourier Transform Mass Spectrometry
To achieve the best identification of the BCG extracellularly secreted proteins under an in vitro culture condition, the strain was cultivated in liquid Sauton medium to limit contamination from medium-derived protein. Cells were harvested at the mid-exponential phase, when bacterial lysis is minimal, although not absent. Electrophoresis analysis showed that the majority of CFPs had molecular weights ranging from 10 to 60 kDa. Several intensely Coomassie-stained bands were observed: one very intense band was present at ∼50 kDa, and three others were present at ∼60, 35, and 25 kDa, respectively. After the database search with Proteome Discoverer software, protein identifications were filtered with an IonScore of no less than 40 and a cumulative false discovery rate of less than 1% at the peptide level. Furthermore, we set the criterion that each protein detected was required to match at least two unique peptide sequences. In total, we obtained 1555 unique peptide sequences, representing 239 proteins (supplemental Table S4). Among these proteins, 128 (∼54%) were presumed to be secreted proteins with classical amino-terminal secretion signal peptides using the program SignalP, which indicated that they were targeted for secretion via the Sec pathway (Fig. 1). Additionally, 13 proteins were recognized by the TatP 1.0 algorithm as harboring Tat signal peptides (Fig. 1). The consensus sequence recognized by this algorithm is RR.[FGAVML][LITMVF]. It contains two invariant arginines in the first two positions and any amino acid in the third position (indicated by the dot), in addition to the variable amino acids indicated in the brackets. Both the Sec and Tat signal peptides are composed of three distinct regions: the N-, H-, and C-regions, which are cleaved by SPase I (12). Interestingly, BCG_2087c (BlaC) contained a complete Tat motif in its signal peptide sequence but was not recognized via the SignalP-noTM method.
Another secreted antigen, FbpA, also contained a Tat motif with a cleavage site most likely between position 43 and 44, but it was not recognized via either the SignalP-TM or the SignalP-noTM method. Surprisingly, subcellular localization prediction using PSORTb v4.0 showed that these two proteins were localized to the extracellular compartment. We deduced that the proteins were actually secreted antigens that were missed by the SignalP prediction. Moreover, according to the program PRED-LIPO and manual Blastp analysis, 73 lipoproteins were unambiguously identified in this study (Fig. 1, supplemental Table S3B). Interestingly, 55 of these lipoproteins were also considered to be secreted via the Sec pathway because of their classical amino-terminal signal peptides.
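The Tat consensus sequence quoted above, RR.[FGAVML][LITMVF], is already a valid regular expression, so scanning a sequence for it is straightforward. The sketch below covers only this pattern and is not a substitute for the TatP 1.0 program; the function name, the 60-residue N-terminal window, and the example sequence are hypothetical choices made for illustration.

```python
import re

# Two invariant arginines, any residue, then one residue from each bracket set.
TAT_MOTIF = re.compile(r"RR.[FGAVML][LITMVF]")


def find_tat_motifs(sequence, n_terminal_window=60):
    """Return (1-based position, matched residues) for each Tat motif hit.

    Only the first n_terminal_window residues are scanned, on the assumption
    that a signal peptide motif lies near the N terminus.
    """
    region = sequence[:n_terminal_window]
    return [(m.start() + 1, m.group()) for m in TAT_MOTIF.finditer(region)]


# Synthetic sequence, not a real BCG protein.
print(find_tat_motifs("MSKRRDFLKAAA"))  # [(4, 'RRDFL')]
```

A pattern hit alone does not establish Tat-dependent export, which is why the text combines the motif with signal peptide and subcellular localization predictions.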
Fig. 1.

Venn diagram of the number of identifications predicted to be secreted by different programs. The culture filtrate proteins were predicted by the algorithms SignalP 4.0, TatP 1.0, PRED-LIPO, and SecretomeP 2.0, respectively. The numbers of proteins predicted by each program and all possible combinations are indicated in the Venn diagram.
Secreted proteins without signal peptides are known as leaderless secreted proteins and constitute a significant fraction of the secretome. It was reported that these proteins appear to have cytoplasmic functional roles as well as extracellular roles (22). As an alternative strategy, the non-classical secreted proteins could be identified via the SecretomeP method, which identifies proteins based on their specific biological and chemical properties or characteristics regardless of whether the protein carries a cleaved N-terminal signal peptide (23). This method has been trained on secreted proteins that were experimentally identified but not predicted by other algorithms and might complement the highly popular method for scanning classical secreted proteins, SignalP (24). Using SecretomeP, 103 proteins were predicted to be leaderless secreted proteins. Interestingly, 58 of them were also regarded as secreted proteins with classical signal peptides using the program SignalP. Therefore, excluding classical secreted proteins, 45 proteins were indeed determined to be leaderless secreted proteins lacking classical secretion signal peptides with high confidence, and they were probably produced as a result of Sec-independent secretion mechanisms (Fig. 1).
In total, 185 proteins were predicted to be secreted by at least one of the four programs employed (Fig. 1). On average, more than six peptides were used to identify each CFP, and the amino acid sequence coverage was ∼35.3%. For secreted proteins, five or six peptides were used to identify each one, and the amino acid sequence coverage averaged 31.2%.
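The counts behind the Venn diagram follow from ordinary set operations on the per-program predictions. The sketch below uses hypothetical toy identifiers rather than the study's data and shows the three operations the text relies on: the union across all four programs, the SecretomeP hits that remain after excluding SignalP hits, and the lipoproteins that also carry a classical signal peptide.

```python
def summarize_predictions(signalp, tatp, pred_lipo, secretomep):
    """Combine per-program prediction sets the way the Venn diagram does.

    Each argument is a set of protein identifiers predicted by that program.
    """
    return {
        # Predicted as secreted by at least one of the four programs.
        "secreted_any": signalp | tatp | pred_lipo | secretomep,
        # SecretomeP hits that lack a classical signal peptide.
        "leaderless_only": secretomep - signalp,
        # Lipoproteins that also have a classical signal peptide.
        "lipoproteins_with_signal_peptide": pred_lipo & signalp,
    }


# Toy identifiers for illustration only.
summary = summarize_predictions(
    signalp={"A", "B", "C"},
    tatp={"C", "D"},
    pred_lipo={"B", "E"},
    secretomep={"A", "F"},
)
print(len(summary["secreted_any"]))  # 6

# With the counts reported in the text: 103 SecretomeP hits, 58 shared with
# SignalP, leaving 103 - 58 = 45 leaderless secreted proteins.
leaderless_reported = 103 - 58
```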
Isoelectric Point and Molecular Weight Distributions of the CFPs
In this study, protein identification covered wide pI values and molecular weight ranges. The pI values ranged from 3.91 (PE-PGRS family protein PE_PGRS43b, BCG_2509c) to 12.19 (50S ribosomal protein L32 RpmF, BCG_1034); a detailed pI distribution is displayed in Fig. 2A. The majority of the proteins clustered between pI 4 and 7, which is in agreement with previous studies performed on CFPs (12). Among the secreted proteins, most ranged between pI 5 and 7. Interestingly, all the CFPs identified between pI 8 and 9 were secreted proteins.
Fig. 2.
Distributions of total culture filtrate proteins and secreted proteins identified in this study. The distribution of identifications in different (A) pI ranges, (B) molecular weight (MW) ranges, (C) subcellular localizations, and (D) functional categories. The culture filtrate proteins identified are illustrated in the blue histogram, and the secreted proteins are in red.
The lowest molecular weight among the proteins was 9.41 kDa (50S ribosomal protein L32 RpmF, BCG_1034), and the PPE family protein PPE6 (BCG_0345c) with a molecular weight of 194.08 kDa represented the largest secreted protein. The distribution of the molecular weights of CFPs is depicted in Fig. 2B, and the majority were found in the range between 10 and 60 kDa, which represented ∼90% (215 out of 239) of all of the identifications. The molecular weight distribution of the secreted proteins was similar to that of the CFPs. Only one protein (bifunctional penicillin-binding protein 1A/1B PonA1, BCG_0081) ranged from 70 kDa to 80 kDa. Interestingly, the 50S ribosomal protein L32 RpmF had the lowest molecular weight, whereas its pI value was the highest. For the 185 secreted proteins, the average molecular weight was 34.5 kDa, and the average pI value was 6.5. For the non-secreted proteins, the average molecular weight and pI value were 32.7 kDa and 5.8, respectively. For the 73 lipoproteins, the average molecular weight was 32.9 kDa, and the average theoretical pI value was 6.2. There were no apparent differences in relation to the molecular weight and the pI value distribution between the secreted and non-secreted proteins.
Subcellular Localization of the CFPs
All of the CFPs identified in this study were subjected to the PSORTb v4.0 program in order to predict their subcellular localizations (Fig. 2C). The results showed that 48 proteins localized to the cytoplasmic membrane; 46 of them were secreted proteins, and 20 were lipoproteins. Because bacterial lipoproteins are a functionally diverse class of membrane-anchored or -associated proteins, it was not surprising that so many lipoproteins were identified in the cytoplasmic membrane compartment. Interestingly, several proteins were identified with multiple transmembrane helices; for example, BCG_3669c and BCG_0326 were predicted to have four and three transmembrane helices, respectively. Additionally, 72 and 26 CFPs were predicted to localize to cytoplasmic and extracellular compartments, respectively. Approximately half of the cytoplasmic proteins were predicted to be secreted proteins. All of the extracellular CFPs contained classical signal peptides except for two non-classical secretory proteins. Intriguingly, two of them (BlaC and FbpA) also contained complete Tat motifs in their signal peptide sequences. Six of the extracellular proteins were also lipoproteins. In addition, another two lipoproteins were found in the cell wall. It has been reported that some lipoproteins could be alternatively processed by signal peptidase I or II, and this mechanism could explain their localization in the extracellular environment or in the cell wall (12). No subcellular localization could be predicted for 76 proteins. The current version of the PSORTb v4.0 program has known limitations (for example, it cannot detect lipoprotein motifs for some lipoproteins, and it cannot handle proteins that are located at multiple sites), so its predictions should be interpreted with caution. In fact, the localization information for 41 of the lipoproteins is unknown. Further study to determine the localization information of these proteins should be pursued.
Comparisons with Other Studies on the Mycobacterial Secretome
To investigate the secreted proteins in CFs, a number of mycobacterial secretome studies have been undertaken (8, 11–14, 25–34). Table I summarizes the major studies on the mycobacterial secretome performed to date. These studies focused on CFs of different mycobacterial substrains, including BCG variants. Combining all of the data from the studies published to date, there are 397 proteins that have been reported in different mycobacterial CFs. Among them, 148 proteins were considered to be secreted proteins, including classical and leaderless secreted proteins or lipoproteins. We compared the proteins identified in our study with those identified in previous studies and found that 82 of the 185 secreted proteins were previously reported. Therefore, there are 103 secreted proteins that have not been reported previously and are unique to our study. It should be noted that Malen et al. obtained a total of 257 proteins using a combination of two-dimensional gel electrophoresis MALDI-TOF-MS and LC-MS/MS in a single study, but only 144 were identified by at least two peptides (11). They reported that 159 of the 257 proteins had predicted N-terminal signal peptides. However, when requiring at least two unique peptides per protein, only dozens of secreted proteins were identified in their study. In our study, all identifications were filtered with high confidence and required at least two unique peptides per protein. We presume that the higher identification rate here is most likely a result of the use of accurate high-resolution Fourier transform MS settings at both the MS and MS/MS levels, whereas many of the earlier studies used MS at lower resolutions and accuracies. Furthermore, CFPs were pre-fractionated using a one-dimensional gel electrophoresis method followed by in-gel trypsin digestion, which decreases the complexity of the sample and is less biased against low-abundance secreted proteins.
Table I. Summary of the major mass-spectrometry-based studies performed on the mycobacterial secretome to date; table includes details about the numbers of total proteins and secreted proteins, the analysis method, and the type of mass spectrometer used in the listed studies.
| Year | Title | Number of identifications | Number of secreted proteins | Analysis method and instrument used | Reference |
|---|---|---|---|---|---|
| 1997 | Definition of Mycobacterium tuberculosis culture filtrate proteins via two-dimensional polyacrylamide gel electrophoresis, N-terminal amino acid sequencing, and electrospray mass spectrometry | 32 | n.d. | Two-dimensional gel electrophoresis and TSQ-700 quadrupole MS | (25) |
| 1999 | Comparative proteome analysis of Mycobacterium tuberculosis and Mycobacterium bovis BCG strains: toward functional genomics of microbial pathogens | 263 | 54 | Two-dimensional gel electrophoresis and MALDI-TOF (Voyager Elite MS) | (26) |
| 2000 | Toward the proteome of Mycobacterium tuberculosis | 288 | 49 | Two-dimensional gel electrophoresis and MALDI-MS | (27) |
| 2000 | Mapping and identification of Mycobacterium tuberculosis proteins via two-dimensional gel electrophoresis, microsequencing, and immunodetection | 61 | 14 | Two-dimensional gel electrophoresis, microsequencing, and immunodetection | (28) |
| 2001 | The application of proteomics in defining the T-cell antigens of Mycobacterium tuberculosis | 30 | n.d. | Two-dimensional gel electrophoresis and LCQ-ESI-MS | (29) |
| 2002 | Hypoxic response of Mycobacterium tuberculosis studied via metabolic labeling and proteome analysis of cellular and extracellular proteins | 7 | 2 | Two-dimensional gel electrophoresis and LCQ-MS | (30) |
| 2003 | Comparative proteome analysis of culture supernatant proteins from virulent Mycobacterium tuberculosis H37Rv and attenuated M. bovis BCG Copenhagen | 137 | 42 | MALDI-MS PMF, ESI-MS/MS, or capillary LC-ESI-MS/MS | (8) |
| 2003 | Comparative proteome analysis of culture supernatant proteins of Mycobacterium tuberculosis H37Rv and H37Ra | 5 | n.d. | Two-dimensional gel electrophoresis MALDI-TOF MS | (31) |
| 2003 | Identification of novel proteins in culture filtrates of Mycobacterium bovis bacillus Calmette-Guérin in the isoelectric point range of 6–11 | 12 | n.d. | Two-dimensional gel electrophoresis and MALDI-TOF (Voyager Spec MS) | (12) |
| 2004 | CFP10 discriminates between nonacetylated and acetylated ESAT-6 of Mycobacterium tuberculosis by differential interaction | 8 | 4 | One-dimensional gel electrophoresis, MALDI-MS, and ESI-MS/MS (Voyager Elite) | (10) |
| 2007 | Comprehensive analysis of exported proteins from Mycobacterium tuberculosis H37Rv | 257 | 159 | Two-dimensional gel electrophoresis and MALDI-TOF MS; LC coupled with MS/MS. | (11) |
| 2009 | Immunoproteomic identification of secretory and subcellular protein antigens and functional evaluation of the secretome fraction of Mycobacterium immunogenum, a newly recognized species of the Mycobacterium chelonae–Mycobacterium abscessus group | 33 | 4 | Two-dimensional gel electrophoresis and MALDI-TOF | (33) |
| 2009 | Mycobacterium tuberculosis glycoproteomics based on ConA-lectin affinity capture of mannosylated proteins | 41 | 31 | Two-dimensional gel electrophoresis, ligand blotting and immunoblotting, LC-3200 Q TRAP-MS | (32) |
| 2010 | The secretome of a recombinant BCG substrain reveals differences in hypothetical proteins | 9 | n.d. | Two-dimensional gel electrophoresis 3200 Q-TRAP-MS/MS | (13) |
| 2010 | Descriptive proteomic analysis shows protein variability between closely related clinical isolates of Mycobacterium tuberculosis | 101 | n.d. | Two-dimensional gel electrophoresis, iTRAQ-labeled and LTQ-MS | (34) |
| 2011 | Proteomic profile of culture filtrate from the Brazilian vaccine strain Mycobacterium bovis BCG Moreau compared to M. bovis BCG Pasteur | 101 | 53 | Two-dimensional gel electrophoresis and MALDI-TOF/TOF (4700 Proteomics Analyzer) | (14) |
| 2013 | Analysis of the secretome and identification of novel constituents from culture filtrate of bacillus Calmette-Guérin using high-resolution mass spectrometry | 239 | 185 | One-dimensional gel electrophoresis and ESI-MS/MS (LTQ-Orbitrap Velos) | This study |
ESI, electrospray ionization; PMF, peptide mass fingerprint; n.d., no data.
Translational Start Site Assignments
In genomic annotation, the majority of TSSs are assigned by bioinformatic methods or by homology-based comparative genomic approaches, yet an accurate TSS is critical for the analysis of both protein function and transcriptional regulation (35). Because most TSSs are conceptually translated from predicted transcripts and no straightforward experimental methodology can easily determine a TSS, it is difficult to correctly assign the TSS of a given gene. The TSSs predicted by different bioinformatic methods often differ significantly (36). For example, although the M. tuberculosis H37Rv genome sequence has been available for more than ten years, ∼50% of the gene annotations in the most used datasets from two independent institutions (the Sanger Institute and The Institute for Genomic Research, TIGR) have different TSSs (37). N-terminal sequencing of proteins has been helpful in verifying predicted TSSs to a certain extent, but this method is usually not applicable if the N-termini of proteins are blocked by modifications (38). Additionally, it is not a high-throughput method for large numbers of proteins, and it is time consuming and costly. Here, we utilized an MS-based proteomic strategy for assigning TSSs that is more universal and higher in throughput than N-terminal sequencing. In this method, protein N-terminal peptides are recognized by the non-tryptic nature of their N-termini. Such semi-tryptic peptides (i.e. N-terminal peptides with the initiator methionine either retained or cleaved) were detected by searching the protein database (18). We used these criteria to assign the TSSs of CFPs and confirmed the predicted TSSs of 52 existing annotations based on N-terminal peptides. Among them, 15 proteins were confirmed by N-terminal peptides that retained the initiator methionine, and 42 were confirmed by peptides with the initiator methionine cleaved. Interestingly, five of them were confirmed by both forms (supplemental Table S5).
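The N-terminal criterion can be illustrated with a minimal Python sketch. It assumes that the annotated protein sequence begins with the initiator methionine and that the peptide is given without flanking residues; the function name and the toy sequence are hypothetical, and the sketch is not the search-engine logic used in the study.

```python
def classify_n_terminal_peptide(protein, peptide):
    """Return 'met_retained', 'met_cleaved', or None for a peptide.

    protein -- annotated protein sequence, expected to start with 'M'
    peptide -- identified peptide sequence (no flanking residues)
    """
    if not protein.startswith("M") or not peptide:
        return None
    if protein.startswith(peptide):
        return "met_retained"  # peptide begins at residue 1
    if protein.startswith(peptide, 1):
        return "met_cleaved"   # peptide begins at residue 2, Met removed
    return None


# Toy sequence for illustration only.
toy_protein = "MSTAVLKQRGEWK"
for pep in ("MSTAVLK", "STAVLK", "QRGEWK"):
    print(pep, classify_n_terminal_peptide(toy_protein, pep))
```

Because one protein can be observed in both forms, the two groups overlap: 15 + 42 − 5 = 52 confirmed TSSs, which agrees with the counts given above.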
Wrongly assigned TSSs can also be corrected with this approach. Here, all of the MS-derived peptides were screened against the customized N-terminal extension database, and a minimal IonScore of 40 for an individual peptide was required. As a result, 98 N-terminal peptides were mapped to the database. After manual validation, 33 unique peptides that mapped upstream of the currently annotated TSSs of their corresponding proteins were obtained. These peptide hits indicated that the 5′ ends of the corresponding genes should be extended. In total, the TSS extensions of 22 proteins were validated, of which 7 were supported by at least two unique peptides (Table II). Fig. 3 depicts an example of a gene model that has an extension of the N-terminus. Four unique peptides mapped upstream of the original gene product BCG_1741c (Fig. 3A), a catechol-O-methyltransferase. Additionally, 13 unique peptides mapped to BCG_1741c when we searched against the BCG protein database. The extended gene sequence was analyzed using the gene prediction programs FgeneSB and GeneMark, and the results indicated an alternative gene model. Moreover, a Blastp search against the non-redundant protein database showed that the N-terminally extended protein, rather than the original protein, shared higher similarity with its homolog in M. tuberculosis H37Rv. Therefore, according to our proteomic results, the length of BCG_1741c should be extended from 196 aa to 249 aa (Fig. 3B).
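The screening step can be sketched in Python as follows. This is only an illustration under stated assumptions: the upstream region is assumed to be already translated in frame with the annotated protein, each peptide is located by its first exact match, and the function name and toy sequences are hypothetical. Only the IonScore threshold of 40 comes from the text.

```python
def upstream_peptides(extension, annotated, psms, min_ion_score=40.0):
    """Return (peptide, start) pairs for peptides that begin in the extension.

    extension -- translated sequence upstream of the annotated start
    annotated -- currently annotated protein sequence
    psms      -- iterable of (peptide, ion_score) pairs
    """
    full = extension + annotated
    hits = set()
    for peptide, score in psms:
        if score < min_ion_score:
            continue                      # below the IonScore threshold
        start = full.find(peptide)
        if 0 <= start < len(extension):   # begins upstream of the annotated TSS
            hits.add((peptide, start))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Toy sequences and scores for illustration only.
toy_hits = upstream_peptides(
    extension="MAAGLRNVTK",
    annotated="MTEQLSARGK",
    psms=[("AAGLR", 62.0), ("NVTK", 55.0), ("MTEQLSAR", 80.0), ("MAAGLR", 20.0)],
)
print(toy_hits)
```

A peptide that starts inside the annotated sequence, such as the third toy peptide, supports the protein but not the extension, which is why the two searches (extension database and BCG protein database) are reported separately above.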
Table II. The 22 proteins with candidate N-terminal extensions; for each protein, the unique extension peptides are listed with their IonScores and PEP values.
| Protein accessions | Gene name | Number of unique peptides | Extension peptide sequence | IonScore^a | PEP^b |
|---|---|---|---|---|---|
| BCG_0083 | - | 2 | GRPMQVAIALFPGNTALDAVGPYEVLQRVPSFDVVFVGHR | 86.10 | 2.18E-02 |
| | | | VPSFDVVFVGHR | 56.94 | 7.52E-03 |
| BCG_1159c | - | 2 | VTEAGAMAAGR | 93.45 | 4.83E-07 |
| | | | NLAMELVR | 45.18 | 2.48E-02 |
| BCG_1741c | - | 4 | MAAGIRNITTTGQIGDGR | 50.40 | 1.61E-01 |
| | | | EAAAVDYVLAHAGAGNIDDVLATIDK | 89.95 | 1.87E-09 |
| | | | NITTTGQIGDGR | 58.04 | 1.58E-01 |
| | | | EAAAVDYVLAHAGAGNIDDVLATIDKFAYEK | 133.60 | 3.74E-02 |
| BCG_2465c | ndkA | 3 | QLIGEIISR | 56.70 | 1.66E-04 |
| | | | TLVLIKPDGIER | 48.30 | 4.60E-02 |
| | | | MRHLVLGTVTERTLVLIKPDGIER | 119.30 | 1.97E-03 |
| BCG_2754 | ephG | 2 | GTSSAGSVAISR | 46.75 | 2.52E-02 |
| | | | VSSSAWLPSPATAQGGTVR | 106.38 | 3.03E-03 |
| BCG_3014c | - | 3 | EIAEHPFGTPTFTGR | 49.35 | 5.80E-04 |
| | | | IASPDGVAFASIDGELGEPSEMTAR | 92.87 | 9.73E-02 |
| | | | LIQMRIGRIASPDGVAFASIDGELGEPSEMTAR | 58.57 | 2.38E-02 |
| BCG_3091 | - | 2 | RRGHVEPGGCPLTAGSDR | 44.65 | 7.02E-03 |
| | | | QAIVEAAER | 51.50 | 3.20E-03 |
| BCG_0640 | mce2F | 1 | MSATRRTR | 44.10 | 3.81E-01 |
| BCG_0702c | - | 1 | KPSRYGFWDSAACAVVMPISYNIVISLILERYALGENMAR | 66.75 | 1.14E-03 |
| BCG_0822 | - | 1 | AALMTAHPETPRLGYIGLGNQGAPMAKR | 81.90 | 3.74E-02 |
| BCG_0853 | - | 1 | DGGTAGRGRGLGVER | 70.35 | 4.27E-02 |
| BCG_0989 | pstC1 | 1 | AALRSMLARAGEVGR | 56.70 | 2.10E-02 |
| BCG_1153 | glyA1 | 1 | TTAVMSAPLAEVDPDIAELLAK | 87.15 | 2.21E-02 |
| BCG_1317 | - | 1 | MPATSVANNSGSMVALATIEACPALPSR | 47.25 | 1.47E-01 |
| BCG_2099 | lppJ | 1 | RPALTQSGAALRDLSPTTMFSMPHSTADR | 40.95 | 9.01E-02 |
| BCG_2502c | - | 1 | GVVAMAESGESPRLSDELGPVDYLMHRGEANPR | 76.65 | 7.88E-02 |
| BCG_2670 | arsC | 1 | RLTVGLTPGCVSKPR | 43.05 | 9.73E-02 |
| BCG_2738c | hflX | 1 | SWLPSPMPPASCVGRQCWWPAEISLTIGGTR | 71.40 | 8.13E-02 |
| BCG_2963 | fadD28 | 1 | QTPAELAR | 48.85 | 3.68E-02 |
| BCG_3379c | add | 1 | SDRLMTAAPTLQTIR | 98.70 | 3.87E-01 |
| BCG_3535c | ilvB2 | 1 | GMAVTPVTVGDHLVAR | 45.70 | 1.44E-02 |
| BCG_3623c | - | 1 | MAAAISIHPRRHHLR | 59.40 | 1.27E-03 |
a The Mascot score of the peptide identification.
b The probability that the observed peptide spectrum match is incorrect.
Fig. 3.

N-terminal extension of two gene models using peptide mapping upstream of the annotated translational start sites. A, four unique peptides (red lines) mapped to the upstream region of the annotated gene BCG_1741c (blue box). The gene prediction programs FgeneSB and GeneMark predicted the presence of a longer gene model extending the N-terminus of the gene (yellow box). The orthologous protein from M. tuberculosis H37Rv supported this N-terminal extension (purple box). B, the protein sequence of BCG_1741c was extended by 53 aa at the N-terminus. The underlined sequence in red indicates 17 unique peptides, including 4 extended N-terminal peptides. C, a non-tryptic N-terminal peptide, A.MPATSVANNSGSMVALATIEACPALPSR.L, mapped upstream of the annotated translational start site of BCG_1317. An extension of this protein sequence by 63 aa was also supported by the FgeneSB and GeneMark programs and Blastp searching. The seven unique peptides are underlined. The peptide sequences mapped to the products of genes BCG_1741c and BCG_1317 are indicated in red.
Furthermore, based on N-terminal peptides that had an initiator methionine, the accurate location of the extended TSSs of four CFPs could be confirmed. One such example is illustrated in Fig. 3C, where a peptide with a non-tryptic N-terminus, A.MPATSVANNSGSMVALATIEACPALPSR.L, mapped upstream of the original TSS of the annotated protein BCG_1317. Another seven unique peptides also mapped to this protein when we searched against the BCG protein database. The extended gene sequence was also supported by FgeneSB and GeneMark predictions and Blastp searching. It should be noted that although the extensions of 15 proteins were each validated on the basis of only one extended peptide, most of these proteins were supported by several unique peptides each when searched against the BCG protein database (supplemental Table S4). Consequently, the MS-based proteomic method used here could confirm proteins with predicted TSSs, identify proteins with different TSSs, and validate their extended TSSs, which could be used as evidence for extending the original length of the gene models. This strategy represents an effective and promising means for the experimental identification of TSSs that could be applied to other fractions of the BCG proteome, such as cytoplasmic and membrane proteins. We presume that the different TSSs identified here most likely reflect imperfections in the current, bioinformatically derived annotation of the BCG genome. The results presented here suggest that some predicted ORF lengths in the genomic annotation probably require re-characterization.
Discovery of Novel Protein Coding Genes
It is most intriguing that some novel peptides or proteins could be identified in the CF. However, the incompleteness of the current protein databases acts as a limiting factor when seeking novelty with MS/MS data. Here, we constructed an in-house database of BCG that included all possible “gene encoding products” (17). This database would, therefore, contain all possible ORFs, both those previously predicted and those that were not predicted. Peptides identified via MS were considered to be existing gene products from the genome (39). For example, Jungblut et al. identified six proteins that were not predicted by the genome annotation of M. tuberculosis using a two-dimensional gel electrophoresis–MS approach (40). We used this proteomic strategy to provide an independent and complementary means of novel constituent identification that is an evidence-based detection, and not a theoretical prediction from genomic sequences.
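A database of all possible gene-encoding products is conventionally built by translating the genome in six reading frames. The following Python sketch shows that idea under explicit assumptions: the standard codon assignments are used, ORFs are taken as stop-to-stop segments, and a minimum length filter is applied. The function names and the default `min_len` are hypothetical, and the database described in reference (17) may have been constructed differently.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard codon table in NCBI ordering (third base varies fastest).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from the first base; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame_orfs(dna, min_len=30):
    """Yield (strand, frame, peptide) for every stop-to-stop segment."""
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            for segment in translate(seq[frame:]).split("*"):
                if len(segment) >= min_len:
                    yield strand, frame, segment


# Toy sequence for illustration only.
toy_orfs = list(six_frame_orfs("ATGGCCAAATAA", min_len=3))
print(toy_orfs)
```

Stop-to-stop segments are used here because they make no assumption about the start codon, so peptides can later be matched to regions that gene predictors did not call.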
In the present study, after we excluded peptides that map to currently annotated proteins (from the BCG database), the results from the customized six-reading-frame database search were used to provide a list of novel unique peptides. In total, 61 peptides mapped to regions of “proteins” in the six-frame database where no data were present in the annotated BCG database. To improve the confidence of novel “proteins,” we required at least two unique peptides with a minimal IonScore of 40 per ORF. After manual filtering and validation, we could predict the presence of 17 novel protein coding genes. Table III lists the 17 novel constituents along with 37 supporting unique peptides and genomic coordinates. By performing the Blastp algorithm against the non-redundant protein database, we checked the conservation of these ORFs across related organisms. Among these novel proteins, eight have orthologs in other mycobacteria, and four have orthologs in other organisms. Significantly, the other five were completely novel constituents that had no homology with proteins from any organisms. Interestingly, a bioinformatics analysis indicated that two of the novel proteins (BCGRF042474 and BCGRF059986) were potential classical secreted proteins, and five (BCGRF002933, BCGRF005243, BCGRF016070, BCGRF047639, and BCGRF051382) were leaderless secreted proteins. Fig. 4 depicts the identification of a novel ORF-encoding “protein,” BCGRF059986, in the BCG genome along with the corresponding unique peptides. In detail, two peptides, R.GDLASGTLLVTGVSPRPDAGGQQYVTIAGIITGPTVNEYAVYQR.M and R.MAVDVDQWPTVGQILPVVYSPK.N, mapped to the novel “protein.” The ORF sequence was supported by FgeneSB and GeneMark predictions. Furthermore, a Blastp search against the non-redundant protein database showed that the “protein” shared a high homology with the hypothetical protein MRA_3169 in M. tuberculosis H37Ra (Fig. 4A). The length of the novel “protein,” BCGRF059986, should be 106 aa (Fig. 4B). 
Furthermore, as an extra validation step, we designed primers for an RT-PCR experiment and verified that the novel gene is transcribed, which suggests that the novel ORF inferred by our method is reliable (Fig. 4C). We also confirmed the transcription of the remaining 16 novel genes using RT-PCR (Fig. 4D). PCR fragments of the expected sizes were observed, indicating that the novel genes were transcribed. Therefore, the proteomic and RT-PCR results together support these novel gene models, which were missed in the genome annotation.
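The peptide-level filtering described above (exclude peptides explained by annotated proteins, then require at least two unique peptides with a minimal IonScore of 40 per ORF) can be sketched in Python. The function name, the data layout, and the toy sequences are hypothetical, and the manual validation step used in the study is not modeled.

```python
from collections import defaultdict


def novel_orf_candidates(psms, annotated, six_frame, min_ion_score=40.0, min_peptides=2):
    """Return {orf_id: sorted peptides} for ORFs with enough novel peptide support.

    psms      -- iterable of (peptide, ion_score) pairs
    annotated -- iterable of annotated protein sequences
    six_frame -- dict mapping ORF id -> translated ORF sequence
    """
    annotated = list(annotated)
    support = defaultdict(set)
    for peptide, score in psms:
        if score < min_ion_score:
            continue
        if any(peptide in protein for protein in annotated):
            continue                      # explained by an existing annotation
        matches = [orf_id for orf_id, seq in six_frame.items() if peptide in seq]
        if len(matches) == 1:             # keep peptides unique to one ORF
            support[matches[0]].add(peptide)
    return {orf: sorted(peps) for orf, peps in support.items() if len(peps) >= min_peptides}


# Toy data for illustration only.
toy_candidates = novel_orf_candidates(
    psms=[("AAEWTK", 55.0), ("LLDR", 47.0), ("TAYIAK", 90.0),
          ("PLSTVK", 60.0), ("GSLVPR", 12.0)],
    annotated=["MKTAYIAK"],
    six_frame={"ORF_A": "GSLVPRAAEWTKLLDR", "ORF_B": "QQPLSTVK"},
)
print(toy_candidates)
```

In the toy data, the second ORF is dropped because only one peptide supports it, mirroring the two-peptide requirement stated above.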
Table III. Identification of 17 novel gene products using peptide evidence; the transcripts of these genes were confirmed via RT-PCR.
| Protein ID | Orthologs in other bacteria | Peptide sequence | IonScore^a | Genome coordinates |
|---|---|---|---|---|
| BCGRF000060 | Hypothetical protein MtubH3_16167 in M. tuberculosis H37Ra | FALPLAAIAVAAIVVR | 77.3 | 10987–11121 |
| | | GADVWHVAGDPPPDHITGDEEGP | 47.6 | |
| BCGRF000622 | No | RAASGLVMR | 46.7 | 134452–134856 |
| | | SAASAMRRAASGLVMR | 63.1 | |
| BCGRF002933 | No | HPSRAVAPLRPPPAPGSEWHHAPPPPTGR | 43.8 | 664480–664875 |
| | | RSAPPPPR | 68 | |
| BCGRF003258 | Hypothetical protein Mhar_1125 in Methanosaeta harundinacea 6Ac | RVQELDQVGAAGGVVGGLPR | 73.7 | 737815–738762 |
| | | VQELDQVGAAGGVVGGLPRR | 42.1 | |
| BCGRF004329 | No | TAGAGLGAVGATGAAGRLEPR | 46.7 | 1006411–1006923 |
| | | GVSGGAQGCGRPSQGR | 68.7 | |
| BCGRF005243 | No | PTTAPTPSR | 57.3 | 1210039–1210260 |
| | | PTTAPTPSRSSAPTPAGSPRPSPVSTNR | 42.4 | |
| BCGRF005292 | Hypothetical protein Saci_1135 in Sulfolobus acidocaldarius DSM | TAASLPQVAK | 42.1 | 1225672–1226310 |
| | | SSPSDKLDSKAR | 60.5 | |
| BCGRF009708 | Hypothetical protein MRA_2077 in M. tuberculosis H37Ra | AIENALTLILGLPTGPER | 49.3 | 2304709–2305119 |
| | | RGDLWLVSLGAAR | 69.3 | |
| BCGRF014560 | Hypothetical protein MtubH3_17448 in M. tuberculosis H37Ra | TIGIVYVHGDPVDYLDRDQMAK | 47.3 | 3422092–3422412 |
| | | TIGIVYVHGDPVDYLDR | 56.2 | |
| | | YAVVISPGSMPWSVVTVVPTSTSAQPAVFRPELEVMGTK | 54.1 | |
| BCGRF016070 | Hypothetical protein MLCL383.20c in M. leprae | LRDRGTSPSASNAGTPAWR | 54.5 | 3771895–3772644 |
| | | TAPYRPSWSVSARAHRPSRAASSTNSSGVLAPSR | 67.6 | |
| BCGRF042474 | Hypothetical protein TMAG_03731 in M. tuberculosis SUMu001 | LDESDVDGYQSR | 93.1 | 1274118–1274327 |
| | | WVGLAGVAGVVAGGALVAR | 88.3 | |
| | | RAYTPDEVR | 41.5 | |
| BCGRF047639 | N-ethylammeline chlorohydrolase in Agrobacterium sp. H13-3 | LHLKLSRPCSPSR | 56.5 | 2412561–2413187 |
| | | LSRPCSPSRVGITITPR | 78.3 | |
| BCGRF051382 | 50S ribosomal protein L28 in M. tuberculosis CDC1551 | WDPNIQTVHAVTRPGGNK | 51.2 | 3286014–3286208 |
| | | RWDPNIQTVHAVTRPGGNK | 45.2 | |
| | | RWDPNIQTVHAVTRPGGNKK | 41 | |
| BCGRF059986 | Hypothetical protein MT3222 in M. tuberculosis H37Ra | MAVDVDQWPTVGQILPVVYSPK | 96.7 | 3456324–3455992 |
| | | GDLASGTLLVTGVSPRPDAGGQQYVTIAGIITGPTVNEYAVYQR | 44.2 | |
| BCGRF081476 | Fatty acid synthase in M. bovis | RADPDAAVPVR | 45.7 | 2806871–2806056 |
| | | VGGGSAARRR | 62.1 | |
| BCGRF094956 | Succinyl-CoA:3-ketoacid-CoA ligase, subunit B in Acidilobus saccharovorans 345-15 | NTSAVIVAMTPR | 60.6 | 4073971–4073597 |
| | | GLDVCCPPARARSGGLLR | 54.5 | |
| BCGRF098438 | No | VVSAAGSGAGTAAIGLSVHSASKIFCSSGRGTAR | 43 | 3220987–3219797 |
| | | SSAIDLAWALPGASTVASASTSQLPRVVSAAGSGAGTAAIGLSVHSASK | 52.9 | |
a The Mascot score of the peptide identification.
Fig. 4.
Identification of novel gene models based on peptide mapping to the genomic region. A, two unique peptides (red lines) with minimal IonScores of 40 mapped to the genomic region corresponding to a novel protein, BCGRF059986. The presence of this novel gene model was also supported by the FgeneSB and GeneMark programs (yellow box). This novel protein was found to be similar to a hypothetical protein MT3222 in M. tuberculosis H37Ra (purple box). B, protein sequence of a novel gene product. The identified region is in red. C, validation of the novel gene model BCGRF059986 via an RT-PCR approach. The amplified RT-PCR product confirmed the expression of new mRNAs for the novel gene. The size of the product was determined by means of an E-Gel Electrophoresis System using a 2% E-Gel pre-cast agarose gel. DNA Ladder, 1 kb Plus DNA marker (Invitrogen). For BCGRF059986, the PCR was performed using the novel gene cDNA as a template. For the negative control, the PCR was performed with RNAs as the template. The absence of a product in this lane indicated that the RNAs were free of any contaminating genomic DNA. For β-actin cDNA, a positive control, the PCR was performed using human β-actin cDNA as a template, and the amplified 353-bp product was visualized. D, the transcription of the remaining 16 novel gene models was also confirmed via RT-PCR. PCR fragments of the expected sizes were observed, indicating that the novel genes were transcribed. PCRs performed with RNAs as the templates were used as negative controls.
The BCG genome sequence has been available for more than 5 years and has been re-characterized previously (41). It was surprising that many novel constituents were detected in CF, especially because most of them were already annotated in other mycobacteria but were missing from the primary genome sequence of BCG. Interestingly, the lengths of six novel proteins with confirmed TSSs were relatively short (an average length of 98 aa). It is likely that they were missed in the genome annotation of the reference strain because of their small size. Based on the results of our study, it is suggested that the approach of using MS-based proteomic data to identify novel proteins in CF might prove to be an essential complementary method in the future, along with computational methods for annotating genomes, especially for newly sequenced genomes.
Functional Distribution and Analysis of the CFPs
Functional Distribution of the CFPs
The annotated proteins in the BCG database have been classified into 12 distinct functional categories. The 239 proteins identified were distributed across nine of these categories (Fig. 2D). Most of them were involved in the cell wall and cell processes (functional category 3, 43.5%) and intermediary metabolism and respiration (functional category 7, 17.2%). Relatively few of them were involved in the PE/PPE family (functional category 6, 4.2%) and lipid metabolism (functional category 1, 3.3%). Only two transcriptional regulatory proteins (BCG_3091 and BCG_0702c) were detected in the CFPs. Interestingly, almost all identifications classified in the cell wall and cell processes functional category (102 out of 104) were secreted proteins (Fig. 2D).
In protein database annotation, proteins for which there are no proteomic data are annotated as “hypothetical” or “conserved hypothetical” (if there is supporting evidence of homology in other species) (42). Although these proteins are conserved across related organisms, they are uncharacterized because of dubious functionality based on homology searching (43). The detection of hypothetical proteins with proteomic data allows us to remove the “hypothetical” tag that is associated with many current annotations in databases; that is, we can confirm the existence of hypothetical proteins by using a proteomic approach. Here, we identified 44 proteins that were annotated as hypotheticals (functional categories 10 and 16). Interestingly, 29 of them were predicted as secreted proteins. Additionally, five were conserved hypotheticals with orthologs in Mycobacterium. The identification of hypothetical proteins through the use of proteomic data showed their existence in CF, and their functions are worth studying further.
Major Components from the Culture Supernatant
Proteins released from growing mycobacteria into the extracellular medium are usually believed to be responsible for the high efficacy of BCG, and recognition of these molecules could lead to early immunological detection of the infected macrophages and control of TB (44). The crude CFPs, therefore, have been extensively characterized and are considered to be an attractive source of candidate antigens for a new vaccine and diagnostic reagents. In this study, four low-molecular-weight antigens were detected: CFP2, CFP6, CFP10A, and CFP17. These secreted antigens are thought to play important roles in the development of protective immune responses. CFP2 corresponds to MTB12 in M. tuberculosis, which was reported to constitute a major component of CF and have potential value as a subunit vaccine component to protect against infection by M. tuberculosis (45). CFP6 elicited high proliferative responses in healthy contacts and patients recovering from TB and also induced the release of a significantly high amount of IFN-γ (46). Additionally, the 9.5-kDa antigen CFP10A had been the focus of a TB vaccine because of its capability to induce strong cellular immune responses in the host (47). CFP17 can induce both a high IFN-γ release and a strong delayed-type hypersensitivity response (44). These proteins might have a promising future for the prevention and diagnosis of TB.
The antigen 85 (Ag85) complex, which comprises three proteins, Ag85A, Ag85B, and Ag85C, represents a promising candidate as a novel drug target and pathogenesis factor in mycobacteria (48). In this study, we detected all three of the secreted antigens and one related protein, FbpD, in the CF. The Ag85 antigens, which are ∼35 kDa and have a pI score of 6.5 each, participate in cell wall biosynthesis and interact with the host macrophage as fibronectin-binding proteins (48). Furthermore, they are also involved in the response to isoniazid treatment. FbpD, also known as the secreted MPT51/MPB51 antigen, can induce a high level of antigen-specific CD8+ T-cell response (49). It is interesting that this immunogenic protein was previously reported in the CF and also within the cell (50).
MPB53, MPB70, and MPB83 are among the most studied mycobacterial antigens. DNA sequence analysis shows that the gene mpb53 is localized close to mpb70 and mpb83 (51). MPB53, an 18-kDa protein detected in the CF of tuberculosis mycobacteria (including clinical isolates) but not nontuberculous mycobacteria, can induce strong, tuberculosis-specific antibody responses and could be a major protective antigen (52). MPB70 and MPB83, encoded as precursor proteins for export through the Sec pathway, can elicit strong T-cell responses and have been extensively explored for the sensitive and specific diagnosis of TB (53). Additionally, MPT63, a 16-kDa immune-protective extracellular protein, could be designated as MPB63 in BCG, which would be similar to other major secretory proteins from BCG, such as MPB53, MPB70, and MPB83. Because MPT63 is a mycobacteria-specific antigen (a Blastp search showed that MPT63 has homologs only in mycobacteria-related species) and is implicated in the virulence of mycobacteria, it has been considered as an attractive drug target and diagnostic reagent against TB (54).
Furthermore, three conserved secreted proteins (TB18.6, TB22.2, and TB39.8) were also identified in this study. Although their exact functions remain to be elucidated, they appear to be major T-cell antigens during infection with pathogenic mycobacteria (55).
Identification of Lipoproteins
Lipoproteins are synthesized as precursors in the cytoplasm and are then translocated across the cytoplasmic membrane by either the Sec or the Tat translocation system (19). In this study, we identified 73 lipoproteins in the CF, of which 55 contained classical secretion signal peptides. The majority were involved in the cell wall and cell processes category (functional category 3). Some lipoproteins are potent agonists of Toll-like receptor 2, which can initiate responses by antigen-presenting cells that influence both innate and adaptive immunity (56). For example, the Toll-like receptor 2 agonists LpqH and LprG participate in the regulation of adaptive immunity by inducing cytokine secretion in innate immune cells or regulating the activation of memory T lymphocytes (57). LprA is a cell-wall-associated lipoprotein that also can induce cytokine responses and regulate APC function (58). Three phosphate-binding transporter lipoproteins (PstS1, PstS2, and PstS3) that are members of a family of periplasmic proteins that act as high-affinity receptors for active transport systems in mycobacteria were unambiguously identified in CF. They play roles in the regulation of mycobacterial growth or metabolism and could be valuable candidates for rapid and specific diagnoses (59). In addition, MPB83, an important antigen described above, is a glycosylated lipoprotein processed by signal peptidase II (60). Interestingly, DsbF, which is a disulfide-bond-forming protein, can ensure the correct folding and disulfide bond formation of secreted proteins (61). ProX from the ATP-binding cassette transport system can bind the compatible solutes glycine betaine and proline betaine with high affinity and specificity, thereby serving as protein stabilizers (62). It was reported that SubI, a sulfate-binding lipoprotein of the ATP-binding cassette transport system, is involved in sulfur metabolism for mycobacterial growth (63).
PE and PPE Family Proteins
The PE and PPE family proteins are characterized by the presence of Pro-Glu (PE) and Pro-Pro-Glu (PPE) motifs near the N-terminus of their conserved N-terminal regions (64). In this study, ten PE/PPE family proteins were identified in CF. Interestingly, PE_PGRS19 was predicted to have a classical secretion signal peptide, and another seven PE/PPE proteins were predicted to be non-classically secreted proteins. It has been shown that large numbers of PE and PPE proteins are secreted through the ESX-5 system, which plays crucial roles in mycobacterial virulence (65). For example, as an important ESX-5 substrate, PE_PGRS30 is involved in phagosomal maturation arrest and replication in macrophages (66). Although these proteins are currently the subject of very few biochemical and structure-function investigations, they have been implicated in mycobacterial antigenic variation, can induce strong immune responses in the host, and have roles in mycobacterial virulence and pathogenesis (67).
Mammalian Cell Entry Family Proteins
Mammalian cell entry (Mce) family proteins are crucial for the virulence of mycobacteria and represent components of transport systems that interact with host cells (68). Structural analysis has indicated that some Mce proteins are similar to colicins or β-barrel porins, which form channels through lipid bilayers (69). In this study, six Mce family proteins were detected in the CF; all contained N-terminal signal or anchor sequences. Although the exact functions of Mce proteins have not been fully determined, it has been demonstrated that some of them act by altering the plasma membrane of host cells and thereby allowing pathogens to invade (70). For example, Mce1C, mapped by eight unique peptides, was thought to be involved in host cell invasion during the initial phase of mycobacterial infection, functioning in a way similar to that of Mce1A (71). Additionally, Mce4F, which was predicted to be a steroid transporter, was proposed to have roles involving cholesterol and its metabolism in the pathogenesis of M. tuberculosis (72). Most interestingly, mce family genes are absent from the human genome. Therefore, Mce proteins might represent ideal candidate drug targets for better TB therapeutics (68).
Moreover, a number of other non-classical secreted proteins were also detected in the CF, including GroES, GlnA1, and the ribosomal proteins RpsL, RplT, and RpsU. Some of these have essential functions in mycobacteria. For example, GroES is necessary for the correct folding of a variety of proteins. Certain ribosomal proteins can serve as potent immunogens and have been used to elicit delayed-type hypersensitivity skin responses (73).
CONCLUSIONS
In the present study, we characterized the BCG CFP repertoire and identified a total of 239 proteins with high confidence through the use of one-dimensional gel electrophoresis and high-resolution tandem mass spectrometry analysis. Of these, 185 were considered to be secreted proteins or lipoproteins, which suggests that the CF was especially enriched in secreted proteins. The 103 secreted proteins that have not been reported previously provide further insight into the BCG secretome and might be involved in the immune response to mycobacteria. Furthermore, we identified 17 novel protein products that were not annotated in the BCG database and validated their existence at the transcriptional level via RT-PCR. Additionally, the TSSs of 22 proteins were extended on the basis of N-terminal peptides. These data represent the largest number of mycobacterial secreted proteins reported in a single study, and some of these proteins might be potential candidates for vaccination and therapeutics.
Supplementary Material
Acknowledgments
We thank Professor Zongde Zhang (Beijing Tuberculosis and Thoracic Tumor Research Institute, P.R. China) for providing the strain M. bovis var BCG NCTC 5692.
Footnotes
* This work was supported by the National Natural Science Fund (Grant No. 30800039) from the National Natural Science Foundation of China, the Twelve-Fifth Mega-Scientific Project on infectious diseases, China (Grant No. 2012ZX10003–002), and an intramural grant from the Institute of Pathogen Biology, Chinese Academy of Medical Sciences (Grant No. 2007IPB007).
This article contains supplemental material.
1 The abbreviations used are:
- Ag85, antigen 85
- BCG, Mycobacterium bovis bacillus Calmette-Guérin
- CF, culture filtrate
- CFP, culture filtrate protein
- HIV, human immunodeficiency virus
- Mce, mammalian cell entry
- ORF, open reading frame
- PE/PPE, Pro-Glu/Pro-Pro-Glu
- Sec, secretory
- TB, tuberculosis
- TM, transmembrane
- TSS, translational start site.
REFERENCES
- 1. Behr M. A., Wilson M. A., Gill W. P., Salamon H., Schoolnik G. K., Rane S., Small P. M. (1999) Comparative genomics of BCG vaccines by whole-genome DNA microarray. Science 284, 1520–1523
- 2. Dye C. (2006) Global epidemiology of tuberculosis. Lancet 367, 938–940
- 3. Mahairas G. G., Sabo P. J., Hickey M. J., Singh D. C., Stover C. K. (1996) Molecular analysis of genetic differences between Mycobacterium bovis BCG and virulent M. bovis. J. Bacteriol. 178, 1274–1282
- 4. Girard M. P., Fruth U., Kieny M. P. (2005) A review of vaccine research and development: tuberculosis. Vaccine 23, 5725–5731
- 5. Andersen P., Doherty T. M. (2005) The success and failure of BCG—implications for a novel tuberculosis vaccine. Nat. Rev. Microbiol. 3, 656–662
- 6. Eichelbaum K., Winter M., Diaz M. B., Herzig S., Krijgsveld J. (2012) Selective enrichment of newly synthesized proteins for quantitative secretome analysis. Nat. Biotechnol. 30, 984–990
- 7. Desvaux M., Dumas E., Chafsey I., Chambon C., Hebraud M. (2010) Comprehensive appraisal of the extracellular proteins from a monoderm bacterium: theoretical and empirical exoproteomes of Listeria monocytogenes EGD-e by secretomics. J. Proteome Res. 9, 5076–5092
- 8. Mattow J., Schaible U. E., Schmidt F., Hagens K., Siejak F., Brestrich G., Haeselbarth G., Muller E. C., Jungblut P. R., Kaufmann S. H. (2003) Comparative proteome analysis of culture supernatant proteins from virulent Mycobacterium tuberculosis H37Rv and attenuated M. bovis BCG Copenhagen. Electrophoresis 24, 3405–3420
- 9. Makridakis M., Vlahou A. (2010) Secretome proteomics for discovery of cancer biomarkers. J. Proteomics 73, 2291–2305
- 10. Okkels L. M., Muller E. C., Schmid M., Rosenkrands I., Kaufmann S. H., Andersen P., Jungblut P. R. (2004) CFP10 discriminates between nonacetylated and acetylated ESAT-6 of Mycobacterium tuberculosis by differential interaction. Proteomics 4, 2954–2960
- 11. Malen H., Berven F. S., Fladmark K. E., Wiker H. G. (2007) Comprehensive analysis of exported proteins from Mycobacterium tuberculosis H37Rv. Proteomics 7, 1702–1718
- 12. Florio W., Batoni G., Esin S., Bottai D., Maisetta G., Pardini M., Campa M. (2003) Identification of novel proteins in culture filtrates of Mycobacterium bovis bacillus Calmette-Guerin in the isoelectric point range 6–11. Proteomics 3, 798–802
- 13. Rodriguez-Alvarez M., Palomec-Nava I. D., Mendoza-Hernandez G., Lopez-Vidal Y. (2010) The secretome of a recombinant BCG substrain reveals differences in hypothetical proteins. Vaccine 28, 3997–4001
- 14. Berredo-Pinho M., Kalume D. E., Correa P. R., Gomes L. H., Pereira M. P., da Silva R. F., Castello-Branco L. R., Degrave W. M., Mendonca-Lima L. (2011) Proteomic profile of culture filtrate from the Brazilian vaccine strain Mycobacterium bovis BCG Moreau compared to M. bovis BCG Pasteur. BMC Microbiol. 11, 80.
- 15. Malard V., Chardan L., Roussi S., Darolles C., Sage N., Gaillard J. C., Armengaud J. (2012) Analytical constraints for the analysis of human cell line secretomes by shotgun proteomics. J. Proteomics 75, 1043–1054
- 16. Zheng J., Wei C., Zhao L., Liu L., Leng W., Li W., Jin Q. (2011) Combining blue native polyacrylamide gel electrophoresis with liquid chromatography tandem mass spectrometry as an effective strategy for analyzing potential membrane protein complexes of Mycobacterium bovis bacillus Calmette-Guerin. BMC Genomics 12, 40.
- 17. de Souza G. A., Softeland T., Koehler C. J., Thiede B., Wiker H. G. (2009) Validating divergent ORF annotation of the Mycobacterium leprae genome through a full translation data set and peptide identification by tandem mass spectrometry. Proteomics 9, 3233–3243
- 18. Zheng J., Liu L., Wei C., Leng W., Yang J., Li W., Wang J., Jin Q. (2012) A comprehensive proteomic analysis of Mycobacterium bovis bacillus Calmette-Guerin using high resolution Fourier transform mass spectrometry. J. Proteomics 77, 357–371
- 19. Rezwan M., Grau T., Tschumi A., Sander P. (2007) Lipoprotein synthesis in mycobacteria. Microbiology 153, 652–658
- 20. Sutcliffe I. C., Harrington D. J. (2004) Lipoproteins of Mycobacterium tuberculosis: an abundant and functionally diverse class of cell envelope components. FEMS Microbiol. Rev. 28, 645–659
- 21. Bagos P. G., Tsirigos K. D., Liakopoulos T. D., Hamodrakas S. J. (2008) Prediction of lipoprotein signal peptides in Gram-positive bacteria with a Hidden Markov Model. J. Proteome Res. 7, 5082–5093
- 22. Tjalsma H., Antelmann H., Jongbloed J. D., Braun P. G., Darmon E., Dorenbos R., Dubois J. Y., Westers H., Zanen G., Quax W. J., Kuipers O. P., Bron S., Hecker M., van Dijl J. M. (2004) Proteomics of protein secretion by Bacillus subtilis: separating the “secrets” of the secretome. Microbiol. Mol. Biol. Rev. 68, 207–233
- 23. Bendtsen J. D., Kiemer L., Fausboll A., Brunak S. (2005) Non-classical protein secretion in bacteria. BMC Microbiol. 5, 58.
- 24. Bendtsen J. D., Jensen L. J., Blom N., Von Heijne G., Brunak S. (2004) Feature-based prediction of non-classical and leaderless protein secretion. Protein Eng. Des. Sel. 17, 349–356
- 25. Sonnenberg M. G., Belisle J. T. (1997) Definition of Mycobacterium tuberculosis culture filtrate proteins by two-dimensional polyacrylamide gel electrophoresis, N-terminal amino acid sequencing, and electrospray mass spectrometry. Infect. Immun. 65, 4515–4524
- 26. Jungblut P. R., Schaible U. E., Mollenkopf H. J., Zimny-Arndt U., Raupach B., Mattow J., Halada P., Lamer S., Hagens K., Kaufmann S. H. (1999) Comparative proteome analysis of Mycobacterium tuberculosis and Mycobacterium bovis BCG strains: towards functional genomics of microbial pathogens. Mol. Microbiol. 33, 1103–1117
- 27. Rosenkrands I., King A., Weldingh K., Moniatte M., Moertz E., Andersen P. (2000) Towards the proteome of Mycobacterium tuberculosis. Electrophoresis 21, 3740–3756
- 28. Rosenkrands I., Weldingh K., Jacobsen S., Hansen C. V., Florio W., Gianetri I., Andersen P. (2000) Mapping and identification of Mycobacterium tuberculosis proteins by two-dimensional gel electrophoresis, microsequencing and immunodetection. Electrophoresis 21, 935–948
- 29. Covert B. A., Spencer J. S., Orme I. M., Belisle J. T. (2001) The application of proteomics in defining the T cell antigens of Mycobacterium tuberculosis. Proteomics 1, 574–586
- 30. Rosenkrands I., Slayden R. A., Crawford J., Aagaard C., Barry C. E., 3rd, Andersen P. (2002) Hypoxic response of Mycobacterium tuberculosis studied by metabolic labeling and proteome analysis of cellular and extracellular proteins. J. Bacteriol. 184, 3485–3491
- 31. He X. Y., Zhuang Y. H., Zhang X. G., Li G. L. (2003) Comparative proteome analysis of culture supernatant proteins of Mycobacterium tuberculosis H37Rv and H37Ra. Microbes Infect. 5, 851–856
- 32. Gonzalez-Zamorano M., Mendoza-Hernandez G., Xolalpa W., Parada C., Vallecillo A. J., Bigi F., Espitia C. (2009) Mycobacterium tuberculosis glycoproteomics based on ConA-lectin affinity capture of mannosylated proteins. J. Proteome Res. 8, 721–733
- 33. Gupta M. K., Subramanian V., Yadav J. S. (2009) Immunoproteomic identification of secretory and subcellular protein antigens and functional evaluation of the secretome fraction of Mycobacterium immunogenum, a newly recognized species of the Mycobacterium chelonae-Mycobacterium abscessus group. J. Proteome Res. 8, 2319–2330
- 34. Mehaffy C., Hess A., Prenni J. E., Mathema B., Kreiswirth B., Dobos K. M. (2010) Descriptive proteomic analysis shows protein variability between closely related clinical isolates of Mycobacterium tuberculosis. Proteomics 10, 1966–1984
- 35. Rison S. C., Mattow J., Jungblut P. R., Stoker N. G. (2007) Experimental determination of translational starts using peptide mass mapping and tandem mass spectrometry within the proteome of Mycobacterium tuberculosis. Microbiology 153, 521–528
- 36. Zhu H. Q., Hu G. Q., Ouyang Z. Q., Wang J., She Z. S. (2004) Accuracy improvement for identifying translation initiation sites in microbial genomes. Bioinformatics 20, 3308–3317
- 37. de Souza G. A., Malen H., Softeland T., Saelensminde G., Prasad S., Jonassen I., Wiker H. G. (2008) High accuracy mass spectrometry analysis as a tool to verify and improve gene annotation using Mycobacterium tuberculosis as an example. BMC Genomics 9, 316.
- 38. Link A. J., Robison K., Church G. M. (1997) Comparing the predicted and observed properties of proteins encoded in the genome of Escherichia coli K-12. Electrophoresis 18, 1259–1313
- 39. Kelkar D. S., Kumar D., Kumar P., Balakrishnan L., Muthusamy B., Yadav A. K., Shrivastava P., Marimuthu A., Anand S., Sundaram H., Kingsbury R., Harsha H. C., Nair B., Prasad T. S., Chauhan D. S., Katoch K., Katoch V. M., Chaerkady R., Ramachandran S., Dash D., Pandey A. (2011) Proteogenomic analysis of Mycobacterium tuberculosis by high resolution mass spectrometry. Mol. Cell. Proteomics 10, M111.011627.
- 40. Jungblut P. R., Muller E. C., Mattow J., Kaufmann S. H. (2001) Proteomics reveals open reading frames in Mycobacterium tuberculosis H37Rv not predicted by genomics. Infect. Immun. 69, 5905–5907
- 41. Brosch R., Gordon S. V., Garnier T., Eiglmeier K., Frigui W., Valenti P., Dos Santos S., Duthoy S., Lacroix C., Garcia-Pelayo C., Inwald J. K., Golby P., Garcia J. N., Hewinson R. G., Behr M. A., Quail M. A., Churcher C., Barrell B. G., Parkhill J., Cole S. T. (2007) Genome plasticity of BCG and impact on vaccine efficacy. Proc. Natl. Acad. Sci. U.S.A. 104, 5596–5601
- 42. Jaffe J. D., Stange-Thomann N., Smith C., DeCaprio D., Fisher S., Butler J., Calvo S., Elkins T., FitzGerald M. G., Hafez N., Kodira C. D., Major J., Wang S., Wilkinson J., Nicol R., Nusbaum C., Birren B., Berg H. C., Church G. M. (2004) The complete genome and proteome of Mycoplasma mobile. Genome Res. 14, 1447–1461
- 43. Lew J. M., Kapopoulou A., Jones L. M., Cole S. T. (2011) TubercuList—10 years after. Tuberculosis (Edinb.) 91, 1–7
- 44. Weldingh K., Rosenkrands I., Jacobsen S., Rasmussen P. B., Elhay M. J., Andersen P. (1998) Two-dimensional electrophoresis for analysis of Mycobacterium tuberculosis culture filtrate and purification and characterization of six novel proteins. Infect. Immun. 66, 3492–3500
- 45. Webb J. R., Vedvick T. S., Alderson M. R., Guderian J. A., Jen S. S., Ovendale P. J., Johnson S. M., Reed S. G., Skeiky Y. A. (1998) Molecular cloning, expression, and immunogenicity of MTB12, a novel low-molecular-weight antigen secreted by Mycobacterium tuberculosis. Infect. Immun. 66, 4208–4214
- 46. Bhaskar S., Khanna S. P., Mukherjee R. (2000) Isolation, purification and immunological characterization of novel low molecular weight protein antigen CFP 6 from culture filtrate of M. tuberculosis. Vaccine 18, 2856–2866
- 47. Arlehamn C. S., Sidney J., Henderson R., Greenbaum J. A., James E. A., Moutaftsi M., Coler R., McKinney D. M., Park D., Taplitz R., Kwok W. W., Grey H., Peters B., Sette A. (2012) Dissecting mechanisms of immunodominance to the common tuberculosis antigens ESAT-6, CFP10, Rv2031c (hspX), Rv2654c (TB7.7), and Rv1038c (EsxJ). J. Immunol. 188, 5020–5031
- 48. Romero I. C., Mehaffy C., Burchmore R. J., Dobos-Elder K., Brennan P., Walker J. (2010) Identification of promoter-binding proteins of the fbp A and C genes in Mycobacterium tuberculosis. Tuberculosis (Edinb.) 90, 25–30
- 49. Hashimoto D., Nagata T., Uchijima M., Seto S., Suda T., Chida K., Miyoshi H., Nakamura H., Koide Y. (2008) Intratracheal administration of third-generation lentivirus vector encoding MPT51 from Mycobacterium tuberculosis induces specific CD8+ T-cell responses in the lung. Vaccine 26, 5095–5100
- 50. Al-Sayyed B., Piperdi S., Yuan X., Li A., Besra G. S., Jacobs W. R., Jr., Casadevall A., Glatman-Freedman A. (2007) Monoclonal antibodies to Mycobacterium tuberculosis CDC 1551 reveal subcellular localization of MPT51. Tuberculosis (Edinb.) 87, 489–497
- 51. Wiker H. G., Michell S. L., Hewinson R. G., Spierings E., Nagai S., Harboe M. (1999) Cloning, expression and significance of MPT53 for identification of secreted proteins of Mycobacterium tuberculosis. Microb. Pathog. 26, 207–219
- 52. Miki K., Nagata T., Tanaka T., Kim Y. H., Uchijima M., Ohara N., Nakamura S., Okada M., Koide Y. (2004) Induction of protective cellular immunity against Mycobacterium tuberculosis by recombinant attenuated self-destructing Listeria monocytogenes strains harboring eukaryotic expression plasmids for antigen 85 complex and MPB/MPT51. Infect. Immun. 72, 2014–2021
- 53. Wiker H. G. (2009) MPB70 and MPB83—major antigens of Mycobacterium bovis. Scand. J. Immunol. 69, 492–499
- 54. Manca C., Lyashchenko K., Wiker H. G., Usai D., Colangeli R., Gennaro M. L. (1997) Molecular cloning, purification, and serological characterization of MPT63, a novel antigen secreted by Mycobacterium tuberculosis. Infect. Immun. 65, 16–23
- 55. Eweda G., Suzuki D., Nagata T., Tsujimura K., Koide Y. (2010) Identification of murine T-cell epitopes on low-molecular-mass secretory proteins (CFP11, CFP17, and TB18.5) of Mycobacterium tuberculosis. Vaccine 28, 4616–4625
- 56. Drage M. G., Pecora N. D., Hise A. G., Febbraio M., Silverstein R. L., Golenbock D. T., Boom W. H., Harding C. V. (2009) TLR2 and its co-receptors determine responses of macrophages and dendritic cells to lipoproteins of Mycobacterium tuberculosis. Cell. Immunol. 258, 29–37
- 57. Lancioni C. L., Li Q., Thomas J. J., Ding X., Thiel B., Drage M. G., Pecora N. D., Ziady A. G., Shank S., Harding C. V., Boom W. H., Rojas R. E. (2011) Mycobacterium tuberculosis lipoproteins directly regulate human memory CD4(+) T cell activation via Toll-like receptors 1 and 2. Infect. Immun. 79, 663–673
- 58. Pecora N. D., Gehring A. J., Canaday D. H., Boom W. H., Harding C. V. (2006) Mycobacterium tuberculosis LprA is a lipoprotein agonist of TLR2 that regulates innate immunity and APC function. J. Immunol. 177, 422–429
- 59. Lefevre P., Braibant M., de Wit L., Kalai M., Roeper D., Grotzinger J., Delville J. P., Peirs P., Ooms J., Huygen K., Content J. (1997) Three different putative phosphate transport receptors are encoded by the Mycobacterium tuberculosis genome and are present at the surface of Mycobacterium bovis BCG. J. Bacteriol. 179, 2900–2906
- 60. Michell S. L., Whelan A. O., Wheeler P. R., Panico M., Easton R. L., Etienne A. T., Haslam S. M., Dell A., Morris H. R., Reason A. J., Herrmann J. L., Young D. B., Hewinson R. G. (2003) The MPB83 antigen from Mycobacterium bovis contains O-linked mannose and (1–>3)-mannobiose moieties. J. Biol. Chem. 278, 16423–16432
- 61. Chim N., Riley R., The J., Im S., Segelke B., Lekin T., Yu M., Hung L. W., Terwilliger T., Whitelegge J. P., Goulding C. W. (2010) An extracellular disulfide bond forming protein (DsbF) from Mycobacterium tuberculosis: structural, biochemical, and gene expression analysis. J. Mol. Biol. 396, 1211–1226
- 62. Schiefner A., Breed J., Bosser L., Kneip S., Gade J., Holtmann G., Diederichs K., Welte W., Bremer E. (2004) Cation-pi interactions as determinants for binding of the compatible solutes glycine betaine and proline betaine by the periplasmic ligand-binding protein ProX from Escherichia coli. J. Biol. Chem. 279, 5588–5596
- 63. Wooff E., Michell S. L., Gordon S. V., Chambers M. A., Bardarov S., Jacobs W. R., Jr., Hewinson R. G., Wheeler P. R. (2002) Functional genomics reveals the sole sulphate transporter of the Mycobacterium tuberculosis complex and its relevance to the acquisition of sulphur in vivo. Mol. Microbiol. 43, 653–663
- 64. Mukhopadhyay S., Balaji K. N. (2011) The PE and PPE proteins of Mycobacterium tuberculosis. Tuberculosis (Edinb.) 91, 441–447
- 65. Stoop E. J., Bitter W., van der Sar A. M. (2012) Tubercle bacilli rely on a type VII army for pathogenicity. Trends Microbiol. 20, 477–484
- 66. Iantomasi R., Sali M., Cascioferro A., Palucci I., Zumbo A., Soldini S., Rocca S., Greco E., Maulucci G., De Spirito M., Fraziano M., Fadda G., Manganelli R., Delogu G. (2012) PE_PGRS30 is required for the full virulence of Mycobacterium tuberculosis. Cell. Microbiol. 14, 356–367
- 67. Sampson S. L. (2011) Mycobacterial PE/PPE proteins at the host-pathogen interface. Clin. Dev. Immunol. 2011, 497203.
- 68. Zhang F., Xie J. P. (2011) Mammalian cell entry gene family of Mycobacterium tuberculosis. Mol. Cell. Biochem. 352, 1–10
- 69. Pajon R., Yero D., Lage A., Llanes A., Borroto C. J. (2006) Computational identification of beta-barrel outer-membrane proteins in Mycobacterium tuberculosis predicted proteomes as putative vaccine candidates. Tuberculosis (Edinb.) 86, 290–302
- 70. Chitale S., Ehrt S., Kawamura I., Fujimura T., Shimono N., Anand N., Lu S., Cohen-Gould L., Riley L. W. (2001) Recombinant Mycobacterium tuberculosis protein associated with mammalian cell entry. Cell. Microbiol. 3, 247–254
- 71. Stavrum R., Valvatne H., Stavrum A. K., Riley L. W., Ulvestad E., Jonassen I., Doherty T. M., Grewal H. M. (2012) Mycobacterium tuberculosis Mce1 protein complex initiates rapid induction of transcription of genes involved in substrate trafficking. Genes Immun. 13, 496–502
- 72. Mohn W. W., van der Geize R., Stewart G. R., Okamoto S., Liu J., Dijkhuizen L., Eltis L. D. (2008) The actinobacterial mce4 locus encodes a steroid transporter. J. Biol. Chem. 283, 35368–35374
- 73. Zheng J., Wei C., Leng W., Dong J., Li R., Li W., Wang J., Zhang Z., Jin Q. (2007) Membrane subproteomic analysis of Mycobacterium bovis bacillus Calmette-Guerin. Proteomics 7, 3919–3931