Abstract
Streptococcus pneumoniae (pneumococcus [Pnc]) is a causative agent of many infectious diseases, including pneumonia, septicemia, otitis media, and conjunctivitis. There have been documented conjunctivitis outbreaks in which nontypeable (NT), nonencapsulated Pnc has been identified as the etiological agent. The use of mass spectrometry to comparatively and differentially analyze protein and peptide profiles of whole-cell microorganisms remains somewhat uncharted. In this report, we discuss a comparative proteomic analysis between NT S. pneumoniae conjunctivitis outbreak strains (cPnc) and other known typeable or NT pneumococcal and streptococcal isolates (including Pnc TIGR4 and R6, Streptococcus oralis, Streptococcus mitis, Streptococcus pseudopneumoniae, and Streptococcus pyogenes) and nonstreptococcal isolates (including Escherichia coli, Enterococcus faecalis, and Staphylococcus aureus) as controls. cPnc cells and controls were grown to mid-log phase, harvested, and subsequently treated with a 10% trifluoroacetic acid-sinapinic acid matrix mixture. Protein and peptide fragments of the whole-cell bacterial isolate-matrix combinations ranging in size from 2 to 14 kDa were evaluated by matrix-assisted laser desorption ionization-time of flight mass spectrometry. Additionally, Random Forest analytical tools and dendrogramic representations (Genesis) suggested similarities and clustered the isolates into distinct clonal groups, respectively. Also, a peak list of protein and peptide masses was obtained and compared to a known Pnc protein mass library, in which a peptide common and unique to cPnc isolates was tentatively identified. Information gained from this study will lead to the identification and validation of proteins that are commonly and exclusively expressed in cPnc strains, which could potentially be used as biomarkers in the rapid diagnosis of pneumococcal conjunctivitis.
Streptococcus pneumoniae (pneumococcus [Pnc]) is a facultative anaerobic bacterium that is an important human pathogen worldwide. The microorganism is a causative agent of many infections, including community-acquired pneumonia, meningitis, septicemia, bacteremia, otitis media, and conjunctivitis (8, 10, 17). Pnc contains many virulence factors, including a polysaccharide capsule that is antiphagocytic, enabling the organism to avoid being engulfed and thus escape immune detection. Based on capsular polysaccharides, 91 serotypes of Pnc are known. However, there are strains that do not react with Pnc typing antisera and thus are nontypeable (NT) or nonencapsulated (3), although they meet the identification criteria for Pnc (optochin sensitivity, bile solubility, and a positive GenProbe result). Moreover, many NT strains are actually just variants of normally encapsulated strains.
Pneumococcal conjunctivitis, an infection of the conjunctiva, is of significant public health concern in highly populated environments such as college campuses, nursing homes, and day care centers. Through the years, there have been large outbreaks of conjunctivitis that have occurred in various regions of the United States, including New York, California, New Hampshire, New Jersey, and Maine. Martin et al. (15) and Carvalho et al. (3) reported microbiological, biochemical or genetic evidence that all of the Pnc strains from these outbreaks lacked a detectable polysaccharide capsule. Lack of a capsule, as well as the insensitivity of pneumococcal culture and diagnostic assays, presents a challenge to correctly diagnose pneumococcal conjunctivitis.
Molecular and immunological technologies (real-time PCR and enzyme-linked immunosorbent assays) detecting expression of Pnc genes or antibodies in bodily fluids have been used with a limited degree of sensitivity for detection and diagnosis of pneumococcal disease (4, 23). However, advances in the field of proteomics and bioinformatics have now made it possible to identify novel diagnostic targets or biomarkers aimed at improved detection. These expressed-gene or protein targets could prove useful in differentiating infectious strains that have been associated with previous conjunctivitis outbreaks and could reduce transmission of this infection.
Mass spectrometry (MS), a rapid, powerful, and sensitive analytical tool, has been used recently for the differentiation, identification, and characterization of microbial pathogens. In particular, MS techniques such as matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) MS have been used to analyze whole bacterial cells that have not been modified chemically or by mechanical disruption (6). In recent years, MALDI-TOF MS has been used to differentiate significant human pathogens such as Helicobacter pylori, Bacillus cereus, Escherichia coli, and Coxiella burnetii (1, 6, 9, 11-14, 16, 20, 21, 24, 25). Studies by Friedrichs and colleagues employed MALDI-TOF MS for rapid identification of 10 different species of viridans streptococci (7). Additionally, the MALDI technology has been used to identify Mycobacterium and, moreover, to distinguish between multiple strains within a species (18). By use of high-throughput measures such as MALDI-TOF, protein/peptide fingerprints can be generated based on a proteomic profile. These proteins or patterns could serve as uniquely expressed pathogen-specific peptide or protein biomarkers that may prove useful for diagnostic purposes.
In this report, we describe a differential proteomic analysis using MALDI-TOF MS of representative Pnc conjunctival (cPnc) U.S. outbreak isolates. The unique cPnc outbreak isolates were compared with other nonconjunctival pneumococcal and streptococcal isolates and a limited number of nonstreptococcal strains and species. Additionally, statistical algorithms as well as traditional cluster analysis were used to identify similarities among these isolates, in particular the cPnc isolates. A list of peptides/proteins found among the isolates was compiled, in which at least one peptide/protein was common to and exclusively expressed in the cPnc isolates. These cPnc proteomic signatures or biomarkers could ultimately be useful in the diagnosis of this infection.
MATERIALS AND METHODS
Materials and reagents.
All chemicals used during this study were purchased from Sigma-Aldrich (St. Louis, MO), except where indicated. Culture medium (Todd-Hewitt broth) was obtained from the Scientific Resources Program at the Centers for Disease Control and Prevention (CDC).
Bacterial strains.
All strains were from the CDC Streptococcus Reference Laboratory. Study strains consisted of 13 cPnc outbreak isolates as well as controls Streptococcus pneumoniae serotype 4, Pnc TIGR4, and Streptococcus pneumoniae unencapsulated strain R6; other streptococcal species, including Streptococcus oralis, Streptococcus mitis, Streptococcus pseudopneumoniae, and Streptococcus pyogenes (group A); and strains from heterologous genera Escherichia coli (group B), Staphylococcus aureus (group C), and Enterococcus faecalis (group D). In addition, pneumococcal serotypes contained within the 7-valent pneumococcal conjugate vaccine and NT pneumococcal sterile-site isolates were also used in the study for comparison (Table 1). The controls used in the study were not associated with the conjunctivitis outbreaks and were used to validate the methods' abilities to differentiate at the species and genus level. Groups A, B, C, and D were included as outgroups for statistical purposes. The 13 cPnc isolates described in this study are a limited sampling population and are considered representatives of all the clinical conjunctival isolates from the aforementioned U.S. outbreaks (New York in 1980, California in 1981, New Hampshire in 2002, New Jersey in 2002, and Maine in 2003).
TABLE 1.
Bacterial strains used in this study
| Strain<sup>a</sup> | Source |
|---|---|
| Sp 165 (1138-80) | 1980, New York conjunctivitis, NT |
| Sp 166 (1139-80) | 1980, New York conjunctivitis, NT |
| Sp 168 (61-81) | 1981, California conjunctivitis, NT |
| Sp 169 (62-81) | 1981, California conjunctivitis, NT |
| Sp 170 (63-81) | 1981, California conjunctivitis, NT |
| Sp 245 (1852-02) | 2002, New Hampshire conjunctivitis, NT |
| Sp 246 (1853-02) | 2002, New Hampshire conjunctivitis, NT |
| Sp 247 (2136-02) | 2002, New Jersey conjunctivitis, NT |
| Sp 248 (2136-02) | 2002, New Jersey conjunctivitis, NT |
| Sp 263 (71-03) | 2003, Maine conjunctivitis, NT |
| Sp 264 (72-03) | 2003, Maine conjunctivitis, NT |
| Sp 265 (73-03) | 2003, Maine conjunctivitis, NT |
| Sp 266 (74-03) | 2003, Maine conjunctivitis, NT |
| R6 (BAA-228) | Derivative of D39, nonencapsulated |
| TIGR4 (BAA-344) | Encapsulated, serotype 4 |
| (M) | S. mitis |
| (3) SS1246/NCTC 10712 | S. mitis |
| (35) 1165/Mitis 26 | S. mitis |
| (40) SS1059/JC67 | S. mitis |
| (67) SS1303/NCTC 12261 | S. mitis |
| (O) | S. oralis |
| (6) SS1236/ATCC 35037 | S. oralis |
| (7) SS900/ATCC 15914 | S. oralis |
| (21) SS911/ATCC 10557 | S. oralis |
| Sp 83 | Pnc serotype 4, encapsulated 7-valent vaccine |
| Sp 86 | Pnc serotype 6B, encapsulated 7-valent vaccine |
| Sp 95 | Pnc serotype 9V, encapsulated 7-valent vaccine |
| Sp 105 | Pnc serotype 14, encapsulated 7-valent vaccine |
| Sp 116 | Pnc serotype 18C, encapsulated 7-valent vaccine |
| Sp 117 | Pnc serotype 19F, encapsulated 7-valent vaccine |
| Sp 125 | Pnc serotype 23F, encapsulated 7-valent vaccine |
| (P) | S. pseudopneumoniae |
| ATCC BAA-960 (65) | CDC-RC, S. pseudopneumoniae |
| 290-03 (72) | CDC-RC, S. pseudopneumoniae |
| 288-03 (74) | CDC-RC, S. pseudopneumoniae |
| 276-03 (77) | CDC-RC, S. pseudopneumoniae |
| 253-03 (83) | CDC-RC, S. pseudopneumoniae |
| 844-00 | Sterile site (blood), NT |
| 5094-02 | Sterile site (blood or cerebrospinal fluid), NT |
| 6024-01 | Sterile site (blood or cerebrospinal fluid), NT |
| 7232-99 | Sterile site (blood or cerebrospinal fluid), NT |
| Streptococcus pyogenes | Gram-positive, capsulated, respiratory pathogen |
| Escherichia coli | Gram-negative intestinal pathogen |
| Enterococcus faecalis | Gram-positive intestinal pathogen |
| Staphylococcus aureus | Gram-positive human pathogen |
<sup>a</sup> The “SS” designations and the numbers and letters in parentheses are strain identity codes from the CDC catalog for Streptococcus.
Bacterial cell growth and harvest for MS analysis.
Bacterial isolates stored at −70°C were initially streaked on Trypticase soy agar (BBL, Becton Dickinson, Franklin Lakes, NJ) with 5% defibrinated sheep's blood plates and incubated overnight at 37°C with 5% CO2. After confluent growth, a full loop of bacteria was inoculated in 10 ml of Todd-Hewitt broth (with 5% yeast extract) and grown to mid-log phase (optical density at 420 nm [OD420] of ∼0.4) at 37°C with 5% CO2 for 4 to 5 h. The bacterial suspension was centrifuged at 4,600 × g for 10 min at 4°C. The supernatant was decanted, and the pellet was washed twice in sterile distilled water, followed by centrifugation at 10,000 × g at room temperature for 10 min. The pellet (∼10^12 cells) was resuspended in 50 μl of water, aliquoted (2 μl) in microcentrifuge tubes, and stored at −70°C until further use. To ensure purity among the isolates, the resuspended bacterial inoculum was streaked on a Trypticase soy agar blood plate and incubated overnight at 37°C with 5% CO2. All strains were cultured and grown three separate times over a 3-day period. The strains were grown to the same OD (mid-log phase at OD420 of ∼0.4) to ensure consistency in growth.
Preparing bacterial cell suspensions for MALDI-TOF analysis.
The MALDI matrix consisted of saturated solutions (20 mg/ml) of 3,5-dimethoxy-4-hydroxycinnamic acid (sinapinic acid [SA]) (Sigma-Aldrich). SA was mixed with 50% acetonitrile and Milli-Q-grade water containing 10% trifluoroacetic acid. A 192-well stainless steel MALDI target plate (Applied Biosystems [AB], Framingham, MA) was used in the study. The plates were washed with Milli-Q-grade water, treated with methanol, and allowed to dry at room temperature. When dry, 0.5 μl of premixed suspensions containing matrices and whole bacterial forms or mass standards for calibration (Sequazyme peptide mass standards kit; AB) were spotted in four separate wells to create quadruplicates of samples and controls. In addition, 0.5 μl of bovine cytochrome c (1 mM) was added to one well of each sample and used as an internal standard. After air drying, the plates were inserted into the instrument for MALDI-TOF MS analysis.
MALDI-TOF MS analysis.
Mass spectra were acquired using a MALDI-TOF/TOF mass spectrometer (AB 4700 Proteomics Analyzer) equipped with a nitrogen laser (Nd:YAG) at 337 nm and a 200-Hz repetition rate. Analyses were performed on at least 3 different days in linear delayed-extraction positive-ion mode at an accelerating voltage of 20 kV. The instrument was calibrated and checked before analysis with several calibration mixtures from either the peptide mass standards kit or the 4700 standard kit (AB), depending on the analysis mass range. Mass accuracy for each standard was within 0.05% of the corresponding average molecular weight. After initial manual laser intensity optimization and baseline data acquisition, spectra were acquired in automatic control mode, using uniform parameters to improve consistency and reproducibility. For optimum data quality of mass spectra in the m/z range of 2,000 to 14,000, SA was used as the matrix. The instrument was programmed to examine signals from at least 12 to a maximum of 100 randomly positioned nonoverlapping locations in each sample well, and the signals from the first 10 acquisitions for each spot that met the acceptance criteria were accumulated into one final-profile mass spectrum. A minimum of 11 individual spectra representing 10 accumulated subspectra were obtained from each well. The acceptance criteria, based on 1,000 laser shots per spot, were signal intensities between 2,000 and 55,000 counts and a signal/noise ratio of 10 or greater.
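The acceptance rule described above can be illustrated with a minimal Python sketch. It is not the instrument's control software: the `Acquisition` container and function names are hypothetical, and the sketch assumes that "signal intensity" refers to the most intense point of each acquisition.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Thresholds taken from the acquisition settings described in the text.
MIN_INTENSITY = 2_000
MAX_INTENSITY = 55_000
MIN_SIGNAL_TO_NOISE = 10.0
SUBSPECTRA_PER_PROFILE = 10   # accepted acquisitions accumulated per profile
MAX_LOCATIONS = 100           # upper limit on laser positions tried per well


@dataclass
class Acquisition:
    """One laser position (hypothetical container, not an instrument API)."""
    intensities: List[float]   # counts on a shared m/z grid
    signal_to_noise: float


def is_acceptable(acq: Acquisition) -> bool:
    """Apply the intensity window and the signal/noise threshold.

    Assumption: the intensity window is tested against the base peak.
    """
    peak = max(acq.intensities)
    return (MIN_INTENSITY <= peak <= MAX_INTENSITY
            and acq.signal_to_noise >= MIN_SIGNAL_TO_NOISE)


def accumulate_profile(acquisitions: Iterable[Acquisition]) -> Optional[List[float]]:
    """Sum the first 10 acceptable acquisitions; return None if too few pass."""
    accepted = []
    for tried, acq in enumerate(acquisitions, start=1):
        if tried > MAX_LOCATIONS:
            break
        if is_acceptable(acq):
            accepted.append(acq.intensities)
            if len(accepted) == SUBSPECTRA_PER_PROFILE:
                # Point-by-point sum across the accepted subspectra.
                return [sum(column) for column in zip(*accepted)]
    return None
```

Returning `None` when fewer than 10 acquisitions pass mirrors the situation in which automatic acquisition fails to produce a usable profile for a well.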
Data processing.
Mass spectra from three harvestings were processed in the following manner. Spectral data were exported as text format m/z-intensity lists with a unified m/z scale, using custom Microsoft Visual Basic for Applications (VBA) macros in Data Explorer, the AB viewing application. The text data were further processed and viewed by use of a suite of custom Microsoft Visual Basic .NET (VB.NET) programs. One custom program, MultiSpec Viewer, was designed to display hundreds of spectra at once in a number of formats, including a simulated gel view for visual analysis of the data set, which comprised several thousand individual spectra. Spectra failing to meet the quality requirements (usually containing no recognizable peaks due to failures of the automatic acquisition algorithms [approximately 10% of the total]) were discarded. The remaining spectra were subjected to background subtraction and then were summed by MALDI well or by organism (to give ∼12 spectra or 1 representative high-quality spectrum, respectively); normalized to the base peak; smoothed using a 21-point, 2-pass Gaussian algorithm; and finally standardized and denoised using a custom Fortran program (22). The output of the standardizing and denoising programs was a set of profile spectra containing relative intensities of only the statistically significant peaks (22), with zeros at all other m/z values. Thus, these data sets were in an ideal format for further analysis by a range of commercial statistical and data-mining applications. To decrease the time required for statistical analyses, the summed spectra were typically compressed by a factor of 20, reducing ∼18,000 points to ∼900 for a typical m/z 2,000 to 14,000 spectrum. We used PAST software v1.34 (http://folk.uio.no/ohammer/past/doc1.html) for hierarchical cluster analysis, with the single summed spectra (one summed spectrum representing each organism) for input.
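Three of the processing steps named above (base-peak normalization, 21-point 2-pass Gaussian smoothing, and 20-fold compression) can be sketched in plain Python as follows. This is an illustration only, not the custom VB.NET or Fortran code used in the study. The Gaussian width (`sigma`), the edge handling, and the choice to compress by summing blocks of points are assumptions, since the text does not specify them.

```python
import math
from typing import List


def normalize_to_base_peak(spectrum: List[float]) -> List[float]:
    """Scale intensities so the most intense point equals 100."""
    base = max(spectrum)
    if base <= 0:
        return [0.0] * len(spectrum)
    return [100.0 * y / base for y in spectrum]


def gaussian_kernel(points: int = 21, sigma: float = 3.5) -> List[float]:
    """Odd-length Gaussian window with unit sum; sigma is an assumed value."""
    half = points // 2
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(spectrum: List[float], points: int = 21, passes: int = 2) -> List[float]:
    """Apply the window `passes` times, clamping indices at the edges."""
    kernel = gaussian_kernel(points)
    half = points // 2
    last = len(spectrum) - 1
    data = list(spectrum)
    for _ in range(passes):
        data = [
            sum(w * data[min(max(i + k - half, 0), last)]
                for k, w in enumerate(kernel))
            for i in range(len(data))
        ]
    return data


def compress(spectrum: List[float], factor: int = 20) -> List[float]:
    """Reduce the point count by summing consecutive blocks of `factor` points.

    Assumption: block summation; the original compression scheme is not stated.
    """
    return [sum(spectrum[i:i + factor]) for i in range(0, len(spectrum), factor)]
```

With these defaults, a spectrum of 18,000 points is reduced to 900 points, matching the reduction quoted in the text.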
We used a Fortran program, Random Forest (RF) v 5.1 (2; http://www.stat.berkeley.edu/users/breiman/RandomForests/cc_home.htm) for classification and identification, in this case with ∼9 summed spectra from three harvestings of each organism as a training set and ∼3 separate summed spectra as unknowns. Recompiling the Fortran RF code for each experimental condition was automatically driven by VB.NET programs, and custom viewing applications were developed to aid in interpreting the RF results.
Tentative peak matching and database searching.
A tentative identification of prominent peaks was done using the Tag-ident proteomics tool or ExPASy sequence retrieval system (http://us.expasy.org). In addition, “MS DB Filter,” a custom VB.NET algorithm, was used to construct a CDC-modified database filtered from UniProt (http://www.ebi.ac.uk/uniprot/index.html). MS DB Filter excludes any Swiss-Prot and TrEMBL or UniProt entry described as a fragment, strips out signal and prepeptide sequences, and applies a rule to add or remove initial methionine as described by Pineda (19). The CDC-modified filtered database was used for data mining the deduced proteome from several bacterial species used in this study which have had the whole genome sequenced. As of April 2008, information for the TIGR4 and R6 isolates used in this study could be found in the Swiss-Prot and TrEMBL databases (UniProt). Custom algorithms within MultiSpec Viewer were also used to generate peak lists from the acquired mass spectra. In addition, a manual screen of an extensive Microsoft Excel spreadsheet consisting of the 45 isolates from 2 to 14 kDa was used to correlate generated peaks with the CDC-modified database in order to provide tentative protein identifications.
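The correlation of observed peaks with database masses can be sketched as a simple tolerance search. The snippet below is illustrative and is not MS DB Filter or Tag-ident. It assumes singly charged [M+H]+ ions, and it borrows the 0.05% calibration accuracy as the matching window, which is an assumption. The library entries are placeholders, not values from UniProt or from the CDC-modified database.

```python
from typing import Dict, List, Tuple


def observed_mass(peak_top_mz: float, proton: float = 1.0) -> float:
    """Convert a singly charged [M+H]+ peak top to an approximate neutral mass."""
    return peak_top_mz - proton


def match_peaks(
    peaks_mz: List[float],
    library: Dict[str, float],
    tolerance_fraction: float = 0.0005,
) -> List[Tuple[float, str, float]]:
    """Return (observed mass, protein name, mass error in Da) for every hit."""
    hits = []
    for mz in peaks_mz:
        mass = observed_mass(mz)
        for name, theoretical in library.items():
            error = mass - theoretical
            # Accept the candidate when the error is within the relative window.
            if abs(error) <= tolerance_fraction * theoretical:
                hits.append((mass, name, error))
    return hits


# Hypothetical library entries: placeholder names and masses for illustration.
example_library = {"protein A": 6877.0, "protein B": 7998.0}
example_hits = match_peaks([6876.0, 2945.0, 8010.0], example_library)
```

In this toy input only the first peak falls inside the window of a library entry. A peak with no library match, such as the second one, would remain unassigned, which is how the "?" entries in a tentative peak list arise.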
RESULTS
MALDI-TOF MS spectra of cPnc isolates.
MALDI-TOF MS fingerprinting revealed similarities among representative U.S. cPnc outbreak isolates. Summed, smoothed, and normalized MALDI-TOF MS spectra from bacterial samples grown on three separate occasions revealed that the outbreak isolates share commonalities within the 2- to 14-kDa mass range. In particular, 11 major ion signals were observed in the region between 4,000 and 10,000 Da, including a peak at m/z 4,425 (Fig. 1). In this mass range, it is reasonable to assume that almost all signals originate from small proteins and, as is typical for MALDI-TOF spectra, that in the absence of evidence to the contrary these are singly charged ([M+H]+) forms. Among the cPnc outbreak isolates themselves, there were also minor differences, in that several of the isolates, including NH Sp 246 (Fig. 2), lacked some protein peaks. Moreover, there are also visual differences among spectra found in conjunctival outbreak isolates that are not observed in the controls Pnc TIGR4 and R6. As would be expected, the E. coli, S. aureus, E. faecalis, and S. pyogenes isolates are very different (Fig. 1).
FIG. 1.
Differentiation of cPnc outbreak isolates and nonconjunctival bacterial controls by MALDI MS. The mass spectrum (A) and simulated-gel (B) views were prepared using a custom program, MultiSpec Viewer. The peak masses (2,000 to 14,000) in the spectrum and simulated-gel views are represented as m/z, and the relative intensity (0 to 100 [white to blue]) is expressed as a percentage. The three distinct colored lines along the right y axis are illustrated to easily distinguish the three main groups in the study (cPnc isolates, red; pneumococcal and streptococcal control isolates, green; control isolates for heterologous genera, blue). Lanes 1 to 13, cPnc outbreak isolates Sp 165, Sp 166, Sp 168, Sp 169, Sp 170, Sp 245, Sp 246, Sp 247, Sp 248, Sp 263, Sp 264, Sp 265, and Sp 266, respectively. Lanes 14 to 22, Pnc TIGR4, Pnc R6, S. mitis, S. oralis, S. pseudopneumoniae, E. coli, S. pyogenes, S. aureus, and E. faecalis, respectively. Each trace is the sum of all individual spectra (typically 10 to 20) for that organism, after background subtraction and smoothing.
FIG. 2.
Strain differentiation among cPnc isolates and identification of tentative ribosomal proteins present in cPnc isolates by MALDI MS. The spectrum view was prepared using a custom program, MultiSpec Viewer. The peak masses (2,000 to 14,000) in the spectrum are represented as m/z, and the relative intensity (0 to 100) is expressed as a percentage. Black arrows indicate the absence of ion peaks in isolate Sp 246. In addition, an overlay representing ribosomal proteins, obtained from a UniProt Pnc protein mass library, is illustrated (orange lines). Lanes 1 to 8, cPnc outbreak isolates Sp 165, Sp 169, Sp 170, Sp 246, Sp 247, Sp 248, Sp 263, and Sp 265, respectively. Each trace is the sum of all individual spectra (typically 10 to 20) for that organism, after smoothing.
A peak ion list of 487 separate protein and peptide masses, generated from a manual visual peak comparison, was obtained from all 45 isolates and compared to a UniProt Pnc protein mass library; 16 representative peak masses are shown in Table 2. Two percent (9/487) of the masses queried showed similarities to known ribosomal proteins. A ribosomal spectrum overlay, from the same Pnc database, using the MultiSpec Viewer, also suggested the tentative identification of ribosomal proteins among the conjunctival outbreak isolates as well as among Pnc TIGR4 and R6. The overlay comprised 11 ion peaks within the mass range of 4,000 to 8,000 Da (Fig. 2).
TABLE 2.
Tentative peak list (representatives) of conjunctival and nonconjunctival isolates<sup>a</sup>
| Strain(s) | Observed mass (Da [approximate])<sup>b</sup> | Putative protein or peptide |
|---|---|---|
| Sp 165, Sp 166, Sp 168, Sp 169, Sp 170 | 2,424 | ? |
| Sp 169, Sp 170 | 2,610 | ? |
| Sp 165, Sp 166, Sp 168, Sp 169, Sp 170, Sp 245, Sp 246, Sp 247, Sp 248, Sp 263, Sp 264, Sp 265, Sp 266 | 2,943-2,945 | ? |
| Sp 166, Sp 168 | 3,465 | ? |
| TIGR4 | 4,003 | ? |
| TIGR4 | 4,218 | ? |
| R6 | 4,741 | ? |
| TIGR4, R6 | 5,481-5,483 | 50S ribosomal protein L33 |
| Sp 245, Sp 247, Sp 263, Sp 264, Sp 266 | 5,495-5,499 | ? |
| TIGR4 | 6,276 | Ribosomal protein L30 |
| R6 | 6,648 | Ribosomal protein L32 |
| Sp 165, Sp 166, Sp 168, Sp 169, Sp 170, Sp 245, Sp 246, Sp 247, Sp 248, Sp 263, Sp 264, Sp 265, Sp 266 | 6,872-6,875 | 30S ribosomal protein S21 |
| TIGR4, R6 | 6,877 | 30S ribosomal protein S21 |
| R6 | 7,998 | 50S ribosomal protein L29 |
| TIGR4 | 10,414 | 30S ribosomal protein S15 |
| Sp 165, Sp 248, Sp 265 | 11,001 | 50S ribosomal protein L24 |
<sup>a</sup> Boldface indicates that the results are unique in all cPnc outbreak isolates.
<sup>b</sup> Observed masses are derived from peak tops of unresolved isotopic clusters − 1 Da, assuming all ions were [M+H]+.
Cluster analysis of cPnc outbreak isolates.
The hierarchical cluster analysis using the PAST program with a Jaccard similarity coefficient indicated that 12 of the 13 conjunctival isolates clustered together and shared 76 to 86% similarity (Fig. 3), while cPnc NH Sp 245 exhibited only 70% similarity with respect to the other conjunctival isolates. In addition, the cPnc isolates displayed 58, 58, 45, and 45% similarity to Pnc R6, Pnc TIGR4, NT Pnc sterile-site isolates, and Pnc vaccine serotype strains, respectively. The dendrogram suggests that the conjunctival isolates are distantly related to S. mitis, S. oralis, and S. pseudopneumoniae (45 to 48%), and there was little relationship to S. aureus, E. coli, E. faecalis, and S. pyogenes (10 to 12%) (Fig. 3).
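For readers unfamiliar with the Jaccard coefficient, the percentages above can be read as "shared peaks divided by all peaks seen in either spectrum." The following Python sketch illustrates that calculation on toy data. It is not the PAST implementation, and it assumes that only the presence or absence of a significant peak in each m/z bin enters the coefficient.

```python
from typing import Sequence, Set


def peak_bins(spectrum: Sequence[float]) -> Set[int]:
    """Indices of m/z bins that retain a nonzero (significant) intensity."""
    return {i for i, intensity in enumerate(spectrum) if intensity > 0}


def jaccard_percent(spectrum_a: Sequence[float], spectrum_b: Sequence[float]) -> float:
    """Shared peak bins divided by all peak bins in either spectrum, times 100."""
    a, b = peak_bins(spectrum_a), peak_bins(spectrum_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


# Toy denoised spectra on a common 8-bin grid (illustrative values only).
isolate_1 = [0, 12, 0, 40, 0, 0, 7, 0]
isolate_2 = [0, 10, 0, 35, 5, 0, 0, 0]
similarity = jaccard_percent(isolate_1, isolate_2)  # 2 shared of 4 total bins
```

Because the denoised spectra contain zeros everywhere except at statistically significant peaks, they convert naturally into the presence/absence sets used here.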
FIG. 3.
Hierarchical cluster analysis of cPnc outbreak isolates and nonconjunctival bacterial controls. The PAST program, using the Jaccard similarity coefficient (expressed as a percentage), was used to assess the relatedness of the cPnc outbreak isolates and controls. A dendrogram of cPnc outbreak isolates compared with pneumococcal, streptococcal, and nonstreptococcal species is presented. Input data had been summed (all spectra for each organism), background subtracted, smoothed, standardized, and denoised. Shown are results for cPnc outbreak isolates (group 1), S. mitis (group 2), S. oralis (group 3), S. pseudopneumoniae (S. pseudopn. [group 4]), Pnc R6 and TIGR4 (group 5), Pnc sterile-site isolated strains (group 6), and Pnc 7-valent vaccine serotypes (group 7) and heterologous genera, including E. coli, S. pyogenes (SMIC), S. aureus, and E. faecalis (group 8).
RF analysis of cPnc isolates.
RF, a statistical algorithm that computes proximities between data sets, locates outliers, and computes error rates by bootstrapping (2), was performed. Initially, a total of 900 spectra from the 45 isolates or classes were analyzed, with an overall error rate of 8.33%. Outlier and misclassified spectra were then identified by RF by running the analysis 200 times using subsets of randomly selected spectra (68% of each class); this number of repeats was chosen so as to give reliable statistics on each spectrum. Outliers (with an RF outlier distance of 5 or above) and consistently misclassified spectra (incorrect identification rate of 25% or above) were excluded, and the randomized RF analysis was repeated with the new data set a total of three times. A total of 125 spectra were excluded, and the overall error rate was reduced to 3.18% among the individual classes. In essence, the RF clusters the conjunctival isolates and controls into distinct clonal groups.
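The repeated-subsampling screen described above can be sketched in Python as shown below. This is a hedged illustration of the bookkeeping only. The classifier is a placeholder nearest-centroid rule, not Random Forest, and the sketch assumes that the spectra left out of each 68% subset are the ones scored in that run. The RF outlier-distance criterion is not reproduced. All function names are hypothetical.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

Spectrum = Tuple[str, Sequence[float]]          # (true class label, intensities)
Classifier = Callable[[List[Spectrum], Sequence[float]], str]


def nearest_centroid(training: List[Spectrum], query: Sequence[float]) -> str:
    """Placeholder classifier (NOT Random Forest): closest class mean wins."""
    grouped: Dict[str, List[Sequence[float]]] = defaultdict(list)
    for label, values in training:
        grouped[label].append(values)

    def distance(label: str) -> float:
        members = grouped[label]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        return sum((q - c) ** 2 for q, c in zip(query, centroid))

    return min(grouped, key=distance)


def misclassification_rates(
    spectra: List[Spectrum],
    classify: Classifier = nearest_centroid,
    runs: int = 200,
    train_fraction: float = 0.68,
    seed: int = 0,
) -> List[float]:
    """Fraction of held-out appearances in which each spectrum was mislabeled."""
    rng = random.Random(seed)
    by_class: Dict[str, List[int]] = defaultdict(list)
    for index, (label, _) in enumerate(spectra):
        by_class[label].append(index)

    tested = [0] * len(spectra)
    wrong = [0] * len(spectra)
    for _ in range(runs):
        # Draw 68% of each class at random as the training subset.
        train_ids = set()
        for members in by_class.values():
            size = max(1, round(train_fraction * len(members)))
            train_ids.update(rng.sample(members, size))
        training = [spectra[i] for i in sorted(train_ids)]
        # Score every spectrum that was left out of this run.
        for i, (label, values) in enumerate(spectra):
            if i in train_ids:
                continue
            tested[i] += 1
            wrong[i] += classify(training, values) != label
    return [w / t if t else 0.0 for w, t in zip(wrong, tested)]


def keep_consistent(spectra: List[Spectrum], rates: List[float],
                    max_rate: float = 0.25) -> List[Spectrum]:
    """Drop spectra whose misclassification rate is 25% or above."""
    return [s for s, rate in zip(spectra, rates) if rate < max_rate]
```

Running many randomized repeats gives each spectrum enough held-out appearances for its misclassification rate to be a meaningful statistic, which is the stated reason for choosing 200 repeats.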
DISCUSSION
Pneumococcal conjunctivitis, usually a self-limiting infection of the ocular mucosal surface, poses serious public health consequences if not diagnosed early. The ease with which the infection spreads among individuals warrants more rapid and improved detection methodologies. The simplicity and feasibility of generating mass spectra from whole-cell bacteria, the reproducibility of the sample preparation, and the ability to differentiate among genera, species, and strains make MALDI-TOF MS a powerful methodology to be applied to the field of clinical diagnostics. MALDI-TOF whole-organism MS fingerprinting coupled with high-performance statistical algorithms is a promising tool capable of distinguishing unique and sample-limited NT cPnc outbreak strains from other pneumococcal, streptococcal, and nonstreptococcal species.
Previous studies using molecular techniques, such as pulsed-field gel electrophoresis, multilocus sequence typing, and PCR, have revealed that the cPnc isolates are similar genotypically (3, 15). With MS, proteins are the most characteristic macromolecules that can be assessed without the extraction, separation, or amplification (6) required by the aforementioned technologies. In this proteomic study, which confirms previous genetics-based investigations (3, 15), MALDI-TOF MS analysis, as evidenced by visual spectrum analyses and hierarchical cluster analysis, also demonstrated that the cPnc outbreak isolates are very similar. The conjunctival isolate clustering reflects unique strain characteristics of cPnc within the subset of proteins examined in this study. Moreover, uniquely expressed genes that are identified will make ideal candidates for biomarker evaluation.
Additionally, RF was able to separate the strains in this study into groups at the genus, species, and, to a certain extent, strain level (Sp 246) with minimal error. The low error rate of 3.13% among the cPnc isolates indicates that the RF algorithm is able to correctly identify and assign mass spectra to the appropriate class (individual strains or isolates) or group (similar strains, i.e., specific cPnc outbreaks). The spectra that were consistently misclassified after successive screenings, and that account for the residual error rate, may be low-quality spectra that were not filtered appropriately. Interestingly, from a biological perspective, error rates may not necessarily be a negative. In our case, mismatched spectra which resulted in low error rates can simply imply that the cPnc isolates are biologically related and are too similar for the algorithm to distinguish.
MALDI-TOF MS is a tool with great promise for the medical, public health, and scientific communities. Mass spectral fingerprinting using MALDI-MS has been used to detect biomarkers from whole unfractionated microorganisms, including viruses, prokaryotes, and a few unicellular eukaryotes (1, 6, 9, 11-14, 16, 20, 21, 24, 25). These biomarkers have proven useful for rapidly identifying and differentiating microbial pathogens. For instance, small acid-soluble proteins have been used to characterize Bacillus species (5). Additionally, Shaw et al. reported the identification of biomarkers in unfractionated C. burnetii cells phase I purified from embryonic egg yolk sac preparations (24). Furthermore, spectral markers in the mass range of 2,000 to 8,000 Da were obtained from MALDI-TOF MS analysis of four human microsporidian isolates (16). Biomarkers for Mycobacterium species have also been detected by MALDI primarily in the 500- to 2,000-Da range, most likely representing lipid molecules or small polypeptides (18).
Protein biomarkers identified by MALDI-TOF MS are often basic, such as the highly conserved and abundant ribosomal protein families (19). In the present study, several ribosomal proteins, as illustrated in Fig. 2, were tentatively identified in the range of 2,000 to 14,000 Da by database searching and spectrum overlay. The tentative proteins appeared to be conserved, based on mass, among the cPnc isolates as well as in other pneumococcal strains. In addition, there was a peak at m/z 2,944 that was common to and uniquely expressed in the cPnc isolates relative to other strains tested. This biomarker candidate will require amino acid sequencing for validation as a clinical diagnostic marker.
In conclusion, MALDI-TOF MS, a rapid and sensitive methodology, was successfully utilized for differentiating cPnc U.S. outbreak isolates. Through statistical algorithms and hierarchical clustering, it was demonstrated that the cPnc outbreak isolates from California and the northeastern United States are very similar. Based on their MALDI-TOF MS fingerprints, putative peptide/protein biomarkers were tentatively identified, one of which was common to and exclusively expressed in cPnc isolates. These cPnc proteomic signatures or biomarker candidates could ultimately be fruitful in the diagnosis of this infection. These expressed biomarkers are advantageous compared to genetic markers, which would provide information based only on expressive potential. Conjunctival isolate protein biomarkers would be a true indication of the organisms' ability to cause disease. Moreover, MALDI-TOF MS, with its high sensitivity, may also prove useful in gaining insight into the pathogenic mechanisms of disease, in particular mechanisms by which these NT cPnc strains cause large sporadic outbreaks. For instance, cPnc surface proteins associated with adherence or attachment to host cells that would subsequently initiate infection could be used as biomarkers. Furthermore, understanding how and why these cPnc strains cause disease can aid in the development of better treatments and even prophylactic measures to minimize the spread of infection during future outbreaks.
Acknowledgments
This work was supported in part by an Emerging Infectious Diseases Research Fellowship sponsored by the Association of Public Health Laboratories and the National Center for Infectious Diseases at the Centers for Disease Control and Prevention.
We thank Rickard Facklam for insight.
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
Footnotes
Published ahead of print on 15 August 2008.
REFERENCES
- 1. Amiri-Eliasi, B., and C. Fenselau. 2001. Characterization of protein biomarkers desorbed by MALDI from whole fungal cells. Anal. Chem. 73:5228-5231.
- 2. Breiman, L. 2001. Random Forests. Machine Learning 45:5-32.
- 3. Carvalho, M. G. S., A. G. Steigerwalt, T. Thompson, D. Jackson, and R. R. Facklam. 2003. Confirmation of nontypeable Streptococcus pneumoniae-like organisms isolated from outbreaks of epidemic conjunctivitis as Streptococcus pneumoniae. J. Clin. Microbiol. 41:4415-4417.
- 4. Carvalho, M. G. S., M. L. Tondella, K. McCaustland, L. Weidlich, L. McGee, L. W. Mayer, A. Steigerwalt, M. Whaley, R. R. Facklam, B. Fields, G. Carlone, E. W. Ades, R. Dagan, and J. S. Sampson. 2007. Evaluation and improvement of real-time PCR assays targeting lytA, ply, and psaA genes for detection of pneumococcal DNA. J. Clin. Microbiol. 45:2460-2466.
- 5.Castahna, E., A. Fox, and K. F. Fox. 2006. Rapid discrimination of Bacillus anthracis from other members of the B. cereus group by mass and sequence of “intact” small acid soluble proteins (SASPs) using mass spectrometry. J. Microbiol. Methods 67:230-240. [DOI] [PubMed] [Google Scholar]
- 6.Fenselau, C., and P. A. Demirev. 2001. Characterization of intact microorganisms by MALDI mass spectrometry. Mass Spectrom. Rev. 20:157-171. [DOI] [PubMed] [Google Scholar]
- 7.Friedrichs, C., A. C. Rodloff, G. S. Chhatwal, W. Schellenberger, and K. Eschrich. 2007. Rapid identification of viridans streptococci by mass spectrometric discrimination. J. Clin. Microbiol. 45:2392-2397. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Giglotti, F., W. T. Williams, F. G. Hayden, J. O. Hendley, J. Benjamin, M. Dickens, M. Gleason, V. A. Perriello, and J. Wood. 1981. Etiology of acute conjunctivitis in children. J. Pediatr. 98:531-536. [DOI] [PubMed] [Google Scholar]
- 9.Holland, R. D., J. G. Wilkes, F. Rafii, J. B. Sutherland, C. C. Persons, K. J. Voorhees, and J. O. Lay, Jr. 1996. Rapid identification of intact whole bacteria based on spectral patterns using matrix-laser desorption/ionization with time-of-flight mass spectrometry. Rapid Commun. Mass Spectrom. 10:1227-1232. [DOI] [PubMed] [Google Scholar]
- 10.Hoskins, J., W. E. Aborn, Jr., J. Arnold, et al. 2001. Genome of the bacterium Streptococcus pneumoniae strain R6. J. Bacteriol. 183:5709-5717. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Jarman, K. H., S. T. Cebula, A. J. Saenz, C. E. Peterson, N. B. Valentine, M. T. Kingsley, and K. L. Wahl. 2000. An algorithm for automated bacterial identification using matrix-assisted laser desorption/ionization time-of-flight/mass spectrometry. Anal. Chem. 72:1217-1223. [DOI] [PubMed] [Google Scholar]
- 12.Jarman, K. H., D. S. Daly, C. E. Peterson, A. J. Saen, N. B. Valentine, and K. L. Wahl. 2000. Extracting and visualizing matrix-assisted laser desorption/ionization time-of-flight mass spectral fingerprints. Rapid Commun. Mass Spectrom. 13:1586-1594. [DOI] [PubMed] [Google Scholar]
- 13.Krader, P., and D. Emerson. 2004. Identification of archaea and some extremophilic bacteria using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. Extremophiles 8:259-268. [DOI] [PubMed] [Google Scholar]
- 14.Lay, J. O., Jr. 2001. MALDI-TOF mass spectrometry of bacteria. Mass Spectrom. Rev. 20:172-194. [DOI] [PubMed] [Google Scholar]
- 15.Martin, M., J. H. Turco, M. E. Zegans, et al. 2003. An outbreak of conjunctivitis due to atypical Streptococcus pneumoniae. N. Engl. J. Med. 348:1112-1121. [DOI] [PubMed] [Google Scholar]
- 16.Moura, H. M., M. Ospina, A. R. Woolfitt, J. R. Barr, and G. S. Visvesvara. 2003. Analysis of four human microsporidian isolates by MALDI-TOF. J. Eukaryot. Microbiol. 50:156-163. [DOI] [PubMed] [Google Scholar]
- 17.Perkins, R. E., R. B. Kundsin, M. V. Pratt, I. Abrahamsen, and H. M. Leibowitz. 1975. Bacteriology of normal and infected conjunctiva. J. Clin. Microbiol. 1:147-149. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Pignone, M., K. M. Greth, J. Cooper, D. Emerson, and J. Tang. 2006. Identification of mycobacteria by matrix-assisted laser desorption ionization-time-of-flight mass spectrometry. J. Clin. Microbiol. 44:1963-1970. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Pineda, F. J., M. D. Antoine, P. A. Demirev, A. B. Feldman, J. Jackman, M. Longenecker, and J. S. Lin. 2003. Microorganism identification by matrix-assisted laser/desorption ionization mass spectrometry and model-derived ribosomal protein biomarkers. Anal. Chem. 75:3817-3822. [DOI] [PubMed] [Google Scholar]
- 20.Pribil, P. A., and C. Fenselau. 2005. Characterization of enterobacteria using MALDI-TOF mass spectrometry. Anal. Chem. 77:6092-6095. [DOI] [PubMed] [Google Scholar]
- 21.Pribil, P. A., E. Patton, G. Black, V. Doroshenko, and C. Fenselau. 2005. Rapid characterization of Bacillus spores targeting species-unique peptides produced with an atmospheric pressure matrix-assisted laser desorption/ionization source. J. Mass Spectrom. 40:464-474. [DOI] [PubMed] [Google Scholar]
- 22.Satten, G. A., S. Datta, H. Moura, A. R. Woolfitt, M. G. Carvalho, G. M. Carlone, B. K. De, A. Pavlopoulus, and J. R. Barr. 2004. Standardization and denoising algorithms for mass spectra to classify whole-organism bacterial specimens. Bioinformatics 20:3128-3136. [DOI] [PubMed] [Google Scholar]
- 23.Scott, J. A. G., Z. Mlacha, J. Nyiro, S. Njenga, P. Lewa, J. Obiero, H. Otieno, J. S. Sampson, and G. M. Carlone. 2005. Diagnosis of invasive pneumococcal disease among children in Kenya with enzyme-linked immunosorbent assay for immunoglobulin G antibodies to pneumococcal surface adhesin A. Clin. Diagn. Lab. Immunol. 12:1195-1201. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Shaw, E. I., H. Moura, A. R. Woolfitt, M. Ospina, H. A. Thompson, and J. R. Barr. 2004. Identification of biomarkers of whole Coxiella burnetti phase I by MALDI-TOF mass spectrometry. Anal. Chem. 76:4017-4022. [DOI] [PubMed] [Google Scholar]
- 25.van Baar, B. L. 2000. Characterization of bacteria by matrix-assisted laser desorption/ionization and electrospray mass spectrometry. FEMS Microbiol. Rev. 24:193-219. [DOI] [PubMed] [Google Scholar]



