Abstract
Closely related bacterial isolates can display divergent phenotypes. This can limit the usefulness of phylogenetic studies for understanding bacterial ecology and evolution. Here, we compare phenotyping based on Raman spectrometric analysis of cellular composition to phylogenetic classification by ribosomal multilocus sequence typing (rMLST) in 108 isolates of the zoonotic pathogens Campylobacter jejuni and C. coli. Automatic relevance determination (ARD) was used to identify informative peaks in the Raman spectra that could be used to distinguish strains in taxonomic and host source groups (species, clade, clonal complex, and isolate source/host). Phenotypic characterization based on Raman spectra showed a degree of agreement with genotypic classification using rMLST, with segregation accuracy between species (83.95%), clade (in C. coli, 98.41%), and, to some extent, clonal complex (86.89% C. jejuni ST-21 and ST-45 complexes) being achieved. This confirmed the utility of Raman spectroscopy for lineage classification and the correlation between genotypic and phenotypic classification. In parallel analysis, relatively distantly related isolates (different clonal complexes) were assigned the correct host origin irrespective of the clonal origin (74.07 to 96.97% accuracy) based upon different Raman peaks. This suggests that the phenotypic characteristics, from which the phenotypic signal is derived, are not fixed by clonal descent but are influenced by the host environment and change as strains move between hosts.
INTRODUCTION
Campylobacteriosis is the most common form of bacterial gastroenteritis in industrialized countries (1). The major disease-causing species, Campylobacter jejuni and Campylobacter coli, are a common commensal component of the gut microbiota in numerous wild and domesticated animal species (2). The factors that promote this ubiquity in potential disease reservoirs are poorly understood, but the capacity to colonize multiple hosts is likely to be an important feature of the ecology and disease epidemiology in this and other zoonotic pathogens (3).
Identifying the source of sporadic human infection can be difficult when there are large numbers of potential disease reservoirs. To address this, multilocus sequence typing (MLST) of C. jejuni and C. coli isolates from various sources has been used to show that there is substantial genetic differentiation between sequence types (STs) associated with different hosts using seven MLST loci (4, 5). These data have been used as a basis for frequency-dependent population genetic attribution models, allowing the tracking of human disease to reference population data derived from various animal species and the environment (6, 7). Consistent with other epidemiological studies (8–10), quantitative attribution models have identified consumption of contaminated poultry as a major source of human infection.
Although these genetic attribution studies are useful, they are dependent upon there being a host-associated genetic signature within the studied loci or observable differences in the frequency of strains in different host pools. Evidence from currently available MLST data suggests that there are host-associated strains but that there are also multihost lineages (11). For example, ST-61 and ST-257 clonal complexes are more common among isolates from cattle and chicken, respectively, but the ST-21 and ST-45 clonal complexes are common in both hosts (11). Therefore, while the presence of ST-61 and ST-257 complex isolates in clinical samples is likely to indicate cattle or chicken as the source of infection, source tracking of ST-21 and ST-45 complex isolates is more difficult.
In addition to the practical limitations for source attribution, the ability of closely related isolates to occupy highly divergent host niches has implications for understanding Campylobacter evolution and the emergence of disease. Specifically, the colonization of multiple hosts, for example, by the ST-21 and ST-45 clonal complexes, suggests that these lineages have some enhanced capacity for phenotypic variation that facilitates adaptation to novel host environments. This may be important in disease, as isolates from these two clonal complexes can account for a large proportion of human C. jejuni infection (12).
Despite the proven utility of nucleic acid-based methods for identifying and classifying microorganisms based on genotypic variations, considerable advantages can be gained by measuring variations in the phenotypic physical characteristics of cells. An increasingly popular phenotyping method used in microbiology is Raman spectroscopy (13–15). This rapid, noninvasive technique measures the inelastic scattering of light following the illumination of a sample with a monochromatic laser beam and can determine the chemical composition of a sample based upon molecular vibrations. This gives rise to unique fingerprints for compounds based upon bonding configuration. For example, biologically associated molecules, such as nucleic acids, protein, lipids, and carbohydrates, all generate unique signatures within Raman spectra, and variation in these biomolecules can be used as a measure of the bacterial phenotype associated with species, strains, and metabolic histories (16). Measurement of the cellular phenotype, and use of this information for classification (chemotaxonomy), has the potential not only to generate a snapshot of the physiological state of the cells but also to classify them based on groups of functional and clinical significance.
Raman spectroscopy has previously been used for the rapid phenotypic classification of bacterial genera, including Enterococcus, Streptococcus, Staphylococcus, Mycobacterium, Bacillus (17–21), and, more recently, Campylobacter (22). The high sensitivity of this technique has been shown to achieve discrimination down to the subspecies level (15). Raman-based phenotypic classification has also shown agreement with sensitive genotyping techniques, including amplified fragment length polymorphism (AFLP) in Acinetobacter baumannii (23) and pulsed-field gel electrophoresis (PFGE) in Staphylococcus aureus (14).
Here, we have characterized variation in 108 Campylobacter strains by genotyping through ribosomal MLST (24), an approach which indexes variation of the 53 genes encoding the bacterial ribosome protein subunits (rps genes). This has been compared to phenotypic classification using fingerprints generated by Raman spectroscopy. Using these data, our aim was to investigate (i) the utility of Raman spectroscopy as a potential phenotyping and classification method for Campylobacter strains, (ii) the correlation between genotypic groupings (species, clade, and clonal complex) and phenotypic classification, and (iii) the extent to which the phenotype varies among closely related Campylobacter isolates and to what extent this is linked to the host of origin.
MATERIALS AND METHODS
Bacterial culturing.
Isolates were selected from whole-genome multilocus sequence-typed collections from the PubMLST database (http://pubmlst.org/campylobacter/) to represent known diversity among C. jejuni isolates and the three major C. coli clades, with particular emphasis on the two major multihost C. jejuni clonal complexes found in cattle and chickens (ST-21 and ST-45 clonal complexes) (see Table S1 in the supplemental material for strain details). Campylobacter isolates were subcultured on Columbia blood agar (CBA) plates with 5% defibrinated horse blood (Oxoid, Basingstoke, United Kingdom). These were grown overnight in a microaerophilic workstation under microaerobic conditions (5% CO2, 5% O2, 3% H2, and 87% N2) at 42°C. To ensure strain purity, single colonies were picked onto fresh CBA plates and a second overnight incubation was carried out.
Genotypic variation.
Cell suspensions of each culture were made in 125 μl phosphate-buffered saline or in water (Sigma-Aldrich, United Kingdom) in a 0.2-ml PCR tube. Genomic DNA extraction was carried out using the QIAamp DNA minikit (Qiagen GmbH, Hilden, Germany). The DNA was resuspended in 100 to 200 μl of the elution buffer supplied and stored at −20°C. Isolates were sequenced with an Illumina genome analyzer using a multiplex sequencing approach with 12 separately tagged libraries sequenced simultaneously in two lanes of an eight-channel GAII flow cell. Libraries were created using the standard Illumina indexing protocol. Briefly, (i) 2 μg genomic DNA was fragmented by acoustic shearing to enrich for 200-bp fragments using a Covaris E210, cleaned, and end repaired. (ii) A-tailing was carried out, and (iii) adapters were ligated. (iv) To introduce specific tag sequences between the sequencing and flow cell binding sites of the Illumina adapter, an overlap extension PCR was carried out using the Illumina 3 primer set. Each of these steps was followed by a DNA cleanup using a 1:1 ratio of AMPure paramagnetic beads (Beckman Coulter, Inc.) to remove DNA of <150 bp. Finally, DNA quantification was carried out by quantitative PCR (qPCR) followed by sequencing. The average overall output was 80 Mbp per isolate. High-coverage short reads (25 to 50 bp) were assembled de novo using Velvet software (25) to produce contiguous sequences of approximately 10 to 200 kb and archived in the Bacterial Isolate Genome Sequence Database (bigsdb) (26).
A ribosomal multilocus sequence typing (rMLST) approach was used to investigate the genetic relationship between isolates (24). Orthologs for the 53 genes encoding the bacterial ribosome protein subunits (rps genes) were defined in all isolates by comparison to the finished genome of isolate NCTC11168 (27, 28). Reciprocal best hits to NCTC11168 rps loci, with at least 70% nucleotide identity and 50% difference in alignment length, were identified using the BLAST algorithm. Gene orthologs were aligned on a gene-by-gene basis using Muscle (29) and then concatenated into contiguous sequence for each isolate, including gaps for missing nucleotides (or entire genes). A neighbor-joining tree of rps gene alignments was reconstructed using Mega (30) version 5 with the Kimura 2-parameter model.
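The concatenation and distance steps described above can be illustrated with a minimal Python sketch. It is not the Muscle or Mega implementation; the function names and the dictionary-based data layout are assumptions made for illustration. It pads missing genes with gaps and computes the standard Kimura 2-parameter distance, d = −0.5 ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of transitions and transversions among compared sites.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def concatenate_loci(alignments, isolates):
    """Join per-locus alignments into one sequence per isolate.

    `alignments` is a list with one dict per locus (isolate -> aligned
    sequence). An isolate missing a locus is padded with gaps, as in the text.
    This layout is a hypothetical choice for the sketch.
    """
    parts = {iso: [] for iso in isolates}
    for locus in alignments:
        length = len(next(iter(locus.values())))
        for iso in isolates:
            parts[iso].append(locus.get(iso, "-" * length))
    return {iso: "".join(chunks) for iso, chunks in parts.items()}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such pairwise distances is the input that a neighbor-joining algorithm would cluster into a tree.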
Phenotypic variation. (i) Bacterial culturing.
Single-colony picks were inoculated into 10 ml of brain heart infusion (BHI) broth (Oxoid, Basingstoke, United Kingdom) in 15-ml culture tubes and grown overnight in microaerobic conditions (as described above) with shaking at 150 rpm. Cultures were diluted with fresh sterile culture media to an optical density at 600 nm of ∼0.5. These standardized stocks were then used to inoculate fresh tubes containing 10 ml BHI broth with 20 μl of Campylobacter culture. Each strain was cultured in triplicate. Cultures were then grown simultaneously as before at 42°C, in microaerophilic conditions with shaking at 150 rpm, for 20 h. Cells were then pelleted at 5,000 × g for 10 min at 4°C and resuspended in ice-cold 0.85% sterile NaCl solution. This was repeated twice to remove all traces of culture media. Cell pellets were fixed in 2% buffered formaldehyde (pH 7.2) for 1 h at room temperature before being washed twice in 0.85% NaCl and for a final time in molecular-grade H2O (Sigma, United Kingdom). Pelleted samples were stored in 1.5-ml microcentrifuge tubes at −80°C until spectroscopic analysis.
Raman spectroscopy.
Cell pellets were resuspended in 20 μl ice-cold molecular-grade H2O. After briefly being vortexed, 5 μl of cell suspension was pipetted onto a high-purity CaF2 disk (Crystran, Poole, United Kingdom) and dried for 30 min in a desiccation chamber before sampling. Raman spectra were collected using a Horiba LabRAM HR800 Raman microspectrometer (Horiba Scientific, United Kingdom) equipped with an Olympus BX-41 microscope and an Andor electronically cooled charge-coupled-device (CCD) detector. The spectrometer has a 600-grooves/mm grating and a slit width of 100 μm. Dried cell pellets were visually focused using a 100×, 0.9-numeric-aperture (NA) air objective (Mplan; Olympus) and a CCD camera. Laser illumination was from a 532-nm Nd:YAG laser, and the incident laser power was typically adjusted to 5 to 8 mW. The sampling diameter of the laser was calculated using the equation laser diameter = 1.22(λ/NA), where λ is the laser wavelength and NA is the numerical aperture of the microscope objective being used. In this case, a 532-nm laser and a 100×, 0.9-NA objective resulted in a theoretical spot diameter of 721 nm. While it is possible to collect spectra from single cells, in our experience the signal-to-noise ratio is low using this approach. By sampling a dried cell pellet, multiple cells were sampled at each acquisition and signal quality was improved. The Raman signal was optimized by adjusting the laser focus with a real-time readout of the Raman signal before acquiring the spectrum between 211 and 1,894 cm−1, with 1,022 data points (∼1.5 cm−1 per point). Each accumulation consisted of two 30-s exposures that were averaged, and any cosmic spikes were also automatically removed using LabSpec v5 software (Horiba Scientific, United Kingdom). Raman spectra were collected from 3 to 4 points within each dried bacterial spot and averaged. 
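As a worked check of the spot-size equation given above, the following short Python sketch evaluates laser diameter = 1.22(λ/NA) for the stated 532-nm laser and 0.9-NA objective. The function name is an illustrative assumption.

```python
def laser_spot_diameter_nm(wavelength_nm, numerical_aperture):
    """Theoretical spot diameter from the equation 1.22 * (lambda / NA)."""
    if numerical_aperture <= 0:
        raise ValueError("numerical aperture must be positive")
    return 1.22 * wavelength_nm / numerical_aperture


spot = laser_spot_diameter_nm(532.0, 0.9)
print(f"{spot:.0f} nm")  # prints "721 nm"
```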
A reference spectrum of cytochrome c from equine heart (Sigma-Aldrich, United Kingdom) was collected directly from crystalline powder using the same parameters as those used for the bacterial samples.
Data analysis.
Raw spectra were truncated to between 400 and 1,800 cm−1 wavenumbers, a region that has been described as the biological fingerprint for bacterial cells (31). These data were normalized (area under spectra to 100) using LabSpec, and baseline correction was carried out on all spectra using an 8-degree polynomial fit. Baseline-corrected data were used because they had a higher discriminatory ability (data not shown). A moving average of five data points centered on the middle wavelength value was used, and peak regions were identified and extracted from the data set. Spectra for all three culture replicates were used for each isolate.
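The preprocessing sequence described above (restriction to 400 to 1,800 cm−1, area normalization to 100, 8-degree polynomial baseline correction, and a five-point centered moving average) might be sketched in Python with NumPy as follows. This is an illustration under stated assumptions, not the LabSpec procedure: in particular, the baseline here is a plain least-squares polynomial fitted to the whole spectrum, whereas the exact fitting algorithm used by the software is not given in the text. The function name and defaults are hypothetical.

```python
import numpy as np


def preprocess(wavenumbers, intensities, lo=400.0, hi=1800.0, degree=8, window=5):
    """Truncate, area-normalize, baseline-correct, and smooth one spectrum."""
    if window % 2 == 0:
        raise ValueError("window must be odd so the average is centered")
    wn = np.asarray(wavenumbers, dtype=float)
    y = np.asarray(intensities, dtype=float)

    # 1. Keep only the fingerprint region.
    keep = (wn >= lo) & (wn <= hi)
    wn, y = wn[keep], y[keep]

    # 2. Scale so the area under the spectrum (trapezoidal rule) is 100.
    area = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(wn))
    y = y * (100.0 / area)

    # 3. Subtract a degree-8 polynomial baseline (assumed least-squares fit;
    #    Polynomial.fit rescales the x domain for numerical stability).
    baseline = np.polynomial.Polynomial.fit(wn, y, degree)(wn)
    y = y - baseline

    # 4. Centered moving average; edge points without a full window are dropped.
    half = window // 2
    smoothed = np.convolve(y, np.ones(window) / window, mode="valid")
    return wn[half:len(wn) - half], smoothed
```

Applied to each of the triplicate spectra for an isolate, the function returns the trimmed wavenumber axis and the smoothed, baseline-corrected intensities from which peak regions could then be extracted.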
An unsupervised multivariate technique, hierarchical cluster analysis (HCA), was used to perform a preliminary analysis of the data to examine relationships between each spectrum and the species, strain, clade, and host of origin. Following this, informative peaks were identified using an alternative supervised statistical technique, the automatic relevance determination (ARD) method (32), which is based on the multilayer perceptron (MLP) neural network architecture (33, 34). This method is well suited to identifying informative peaks from Raman spectroscopy data, as the neural network allows the detection of nonlinear relationships between wavelengths, and the ARD method allows identification of the inputs providing the most information in separating the classes. For each class to be separated (species, clade, clonal complex, and isolate source/host), the MLP was initially trained on the complete data set and then the ARD parameters were estimated. The wavelength with the highest value for the ARD parameter, corresponding to the least informative wavelength, was then discarded. This process was repeated until only 10 input attributes remained. Each set of wavelengths then was used to construct a classifier, and the classification accuracy was determined by using leave-one-out cross validation (35). This was repeated 10 times, and the wavelength that was consistently the least informative was discarded as described above until the classification accuracy was reduced, at which point the analysis was stopped. The remaining wavelengths, along with the final wavelength removed, formed the set of informative peaks for each class.
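The backward-elimination logic described above can be sketched in Python as follows. This is explicitly not the ARD/MLP method of the study: a nearest-centroid classifier stands in for the neural network, and a simple between-class to within-class variance ratio stands in for the ARD parameter (so here the lowest score, rather than the highest ARD value, marks the least informative input). The initial reduction to 10 inputs and the 10 repeats are omitted. All function names are hypothetical, and each class is assumed to contain at least two spectra.

```python
import numpy as np


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier (MLP stand-in)."""
    classes = np.unique(y)
    n = len(y)
    correct = 0
    for i in range(n):
        train = np.arange(n) != i  # hold out sample i
        centroids = np.array([X[train & (y == c)].mean(axis=0) for c in classes])
        nearest = np.argmin(np.linalg.norm(centroids - X[i], axis=1))
        correct += int(classes[nearest] == y[i])
    return correct / n


def relevance(X, y):
    """Per-feature between/within class variance ratio (ARD stand-in)."""
    classes = np.unique(y)
    overall = X.mean(axis=0)
    between = sum((y == c).sum() * (X[y == c].mean(axis=0) - overall) ** 2
                  for c in classes)
    within = sum(((X[y == c] - X[y == c].mean(axis=0)) ** 2).sum(axis=0)
                 for c in classes)
    return between / (within + 1e-12)


def backward_eliminate(X, y, min_features=1):
    """Drop the least informative feature until accuracy falls."""
    active = list(range(X.shape[1]))
    best = loo_accuracy(X[:, active], y)
    while len(active) > min_features:
        drop = int(np.argmin(relevance(X[:, active], y)))
        trial = active[:drop] + active[drop + 1:]
        acc = loo_accuracy(X[:, trial], y)
        if acc < best:
            break  # accuracy reduced: keep the feature just tested and stop
        active, best = trial, acc
    return active, best
```

Stopping at the first drop in accuracy and retaining the feature whose removal caused it mirrors the rule that the remaining wavelengths, along with the final wavelength removed, form the informative set.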
RESULTS
Genotype variation according to rMLST.
Using the rMLST approach (24), BLAST searches identified orthologs at 51 ribosome protein subunit (rps) loci for 85 isolates. The sequences of these genes were concatenated to produce 22,267 bp of contiguous sequence, and a neighbor-joining tree was constructed to investigate the genetic distances among isolates (Fig. 1). Isolates formed discrete clusters, reflecting species designations and previously described clades (C. coli) (2, 36). The relatedness of isolates in previously described clonal complexes (37), defined as sequence types (STs) sharing alleles at 4 or more loci from standard MLST, was also recovered. However, with the benefit of additional loci it was possible to extend the clonal complex designation of the ST-21 complex to include closely related isolates belonging to the ST-8 and ST-206 complexes.
Fig 1.
Campylobacter taxonomy based on rMLST. Shown is a neighbor-joining tree reconstructed from concatenated ribosomal protein gene sequences of 85 C. jejuni and C. coli isolates from chicken (circle), cattle (square), wild bird/environment (triangle), pig (inverted triangle), and clinical (diamond) samples. Analysis involved sequences from genomes with at least 51 tagged ribosomal protein genes. Species, clade (C. coli), and clonal complex substructure is evident. The isolate source is indicated on the tree, which is drawn to scale, with branch lengths measured in the number of substitutions per site. Where the isolate belongs to an MLST clonal complex or clade, it is indicated on the tree. The scale bar represents a genetic distance of 0.005.
Raman spectra.
Spectra were collected from 108 strains in total, comprising C. jejuni (87 strains) and C. coli (21 strains). Each strain was cultured in triplicate, giving a total of 324 samples which generated an equal number of averaged Raman spectra over the range of 220 to 1,895 cm−1, which was changed to 400 to 1,800 cm−1 upon editing. An unedited spectrum, including the background from the CaF2 slide, is shown in Fig. S1 in the supplemental material.
Identification of informative peaks.
The Raman spectra of all of the Campylobacter strains in this study were dominated by peaks associated with the heme protein cytochrome c. The laser excitation wavelength used in this study (532 nm) has previously been shown to be resonant with the electronic absorption of cytochrome, resulting in a selective resonance Raman enhancement of the Raman modes of the cytochrome chromophore (38). Comparison of a typical Campylobacter spectrum to that of pure cytochrome c shows the significance of these molecules as a component of the cell (Fig. 2A).
Fig 2.
Raman spectra of Campylobacter jejuni and cytochrome c. Shown is a typical Raman spectrum of a Campylobacter jejuni sample smeared on a CaF2 plate and recorded with an excitation wavelength of 532 nm. Smoothing was carried out using a Savitzky-Golay filter. Peaks on the spectrum are shaded to highlight the comparison to equivalent peaks from a spectrum from pure cytochrome c (dotted line) (A) and other Raman peaks that are informative for phenotypic analysis of Campylobacter originating from biomolecules and their bonds (B). A.U., arbitrary unit. (See Table S2 in the supplemental material for full peak identifications and references.)
Of the peaks that were most informative for discriminating Campylobacter isolates, seven were assigned identities originating from variations in cytochrome c vibrations. The remaining six identifiable peaks were assigned to biomolecules and molecular vibrations found in the bacterial cells (Fig. 2B), including carbohydrates (480 cm−1); C-C stretching (856 cm−1); phenylalanine (1,000 cm−1); phenylalanine and C-N stretching (1,029 to 1,034 cm−1); nucleic acids (1,049 to 1,054 cm−1); and adenine, guanine, tyrosine, and tryptophan (1,335 to 1,339 cm−1) (see Table S2 in the supplemental material for full peak identifications and references).
Raman-based classification at various taxonomic levels.
Classification of the Raman spectra using an unsupervised multivariate technique, hierarchical cluster analysis, proved to be uninformative, and although some clustering occurred based on species, no clear patterns related to host of origin were apparent (see Fig. S2 in the supplemental material).
Using the ARD method, wavelengths that provided the most information for separating classes were determined in separate analyses for distinguishing species (C. coli and C. jejuni), clade (C. coli clades 1 to 3), clonal complex (ST-21 and ST-45 clonal complexes), and isolate host source (cattle, chicken, and human). To illustrate the phenotypic differences between the isolates, values at the two most informative peaks for each class, i.e., those with the lowest ARD weights, were plotted against each other for each group, and the class boundary separation based upon the spectra was compared to the actual boundary in network confusion matrices (Fig. 3 and 4). The classification boundaries on the two-dimensional plots (Fig. 3 and 4) divide the input space into areas which the trained neural network would assign to different classes and are calculated to be the values at which the class with the highest probability changes to a different class.
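The comparison of predicted and actual classes in a confusion matrix, and the classification accuracy derived from it, can be illustrated with the short Python sketch below. The labels in the usage example are invented toy values, not data from this study, and the function names are hypothetical.

```python
import numpy as np


def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are actual classes, columns are classes assigned by the model."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = np.zeros((len(classes), len(classes)), dtype=int)
    for actual, predicted in zip(true_labels, predicted_labels):
        matrix[index[actual], index[predicted]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e., correctly classified."""
    return np.trace(matrix) / matrix.sum()


# Toy example with made-up labels (not study data).
actual = ["cattle", "cattle", "chicken", "chicken", "chicken"]
assigned = ["cattle", "chicken", "chicken", "chicken", "chicken"]
m = confusion_matrix(actual, assigned, classes=["cattle", "chicken"])
print(m)
print(f"accuracy = {accuracy(m):.2%}")
```

Off-diagonal counts show which classes are confused with each other, which a single accuracy figure does not reveal.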
Fig 3.
Taxonomic discrimination based on Raman phenotype. Scatter plots of the magnitude of the two most informative peaks for discriminating C. jejuni (circles) and C. coli (crosses) isolates on the basis of species (A), clade (B), and clonal complex (C). Scatterplot axes represent the two most discriminatory wavenumbers from each subset. The solid line indicates the decision boundary calculated from the neural network. In panels A and B, C. coli clades are colored green (clade 1), purple (clade 2), and brown (clade 3). In panel C, C. jejuni isolates originated from the ST-21 clonal complex (circles) and ST-45 clonal complex (squares) based upon sharing alleles at 4 or more loci by MLST. Shown are isolates from human (red), cattle (blue), and chicken (orange). Network confusion matrices for each scatter plot indicate the separation achieved by the model compared to the actual class designation in the neural network trained on all of the informative peaks for each group.
Fig 4.
Host origin discrimination based on Raman phenotype. Shown are scatter plots of the magnitude of the two most informative peaks for discriminating ST-21 complex (circles) and ST-45 complex (squares) C. jejuni isolates from cattle (blue) and chicken (orange). Scatterplot axes represent the two most discriminatory wavenumbers from each subset. Plots show (A) ST-21 and ST-45 complex isolates from both hosts, (B) ST-21 complex isolates from both hosts, and (C) ST-45 complex isolates from both hosts. The solid line indicates the decision boundary calculated from the neural network. Network confusion matrices for each scatter plot indicate the separation achieved by the model compared to the actual class designation in the neural network trained on all of the informative peaks for each group.
Five wavenumbers, 945 (unassigned), 1,029 to 1,034 (phenylalanine), 1,227 (cytochrome c), 1,363 (cytochrome c), and 1,049 to 1,054 cm−1 (nucleic acids, C-O str, protein, and C-N str), conferred significant discriminatory power for separating C. jejuni from C. coli, with 83.95% correctly classified (Fig. 3A). This shows that there are phenotypic differences between the species but the classification is imperfect, with some overlap. Campylobacter coli clades showed good agreement with rMLST-informed taxonomy, requiring information at only three peaks, 1,049 to 1,054 (nucleic acids), 996 (unassigned), and 600 to 603 cm−1 (cytochrome c), for separation consistent with the three major clades observed in this species, with a 98.41% classification success. Good class separation was achieved visually with values at just two peaks (Fig. 3B). Class separation between isolates belonging to the ST-21 and ST-45 clonal complexes was not as clear, with an 86.89% classification accuracy based on five peaks: 600 to 603 cm−1 (cytochrome c), 988 cm−1 (unassigned), 1,049 to 1,054 cm−1 (nucleic acids), 1,227 cm−1 (cytochrome c), and 1,363 cm−1 (cytochrome c). There was a significant overlap between members of the two clonal complexes (Fig. 3C), but it was possible to identify a decision boundary which gave a reasonable accuracy.
Raman-based classification in relation to host of origin.
In order to explore the phenotypic differences that occurred between different host-associated strains independently of genetic grouping, classification success between the two major host groups (cattle and chicken) and genetic groupings (ST-21 and ST-45) in C. jejuni was examined. Where both clonal complexes were included in the analysis, host attribution was relatively low, with a 74.07% classification success based on seven wavenumbers: 1,335 to 1,339 (nucleobases and amino acids), 783 (cytochrome c), 1,392 (cytochrome c), 829 (unassigned), 1,227 (cytochrome c), 1,313 (cytochrome c), and 920 cm−1 (unassigned). However, division of the data set into the two major clonal groupings resulted in a much-improved classification success, with strains originating from chicken and cattle being correctly assigned with a success rate of 96.97% in ST-21 and 95.65% in ST-45. Informative peaks for the ST-21 model included 1,227 (cytochrome c), 600 to 603 (cytochrome c), 749 (cytochrome c), 856 (C-C stretching and COC 1,4-glycosidic link), and 1,392 cm−1 (cytochrome c). ST-45 host attribution involved six wavenumbers: 1,313 (cytochrome c), 1,079 (C-O stretching), 480 (carbohydrates), 1,335 to 1,339 (nucleobases and amino acids), 1,049 to 1,054 (nucleic acids), and 1,029 to 1,034 cm−1 (phenylalanine).
DISCUSSION
The genealogical reconstruction of Campylobacter genotypes from diverse sources showed evidence of clusters of related isolates that were consistent with species, clade (in C. coli), and clonal complex designations previously described using MLST data (2, 36). This information provided a basis for testing of a Raman-based phenotypic classification system, which could mirror the genetic structure and distinguish different bacterial strains, as in studies of other species (14, 23). Raman spectroscopy has previously been used successfully for the classification of bacteria at various levels of taxonomy and phenotype, ranging from broad levels, such as Gram type (39), down to species-level (40) and even strain-level (14) discrimination. Raman does have some potential advantages over DNA sequence-based methods of classification, mainly due to the rapid speed of data collection (30 to 60 s to collect a spectrum). However, the link between genotype and phenotype, especially at the strain level, is not always straightforward, and studies using related vibration spectroscopy techniques, such as Fourier transform infrared spectroscopy, have found no clear associations (41). In our study, the segregation based upon Raman spectra was largely consistent with the genotype clustering. This discriminatory power, together with the high reproducibility and rapidity of obtaining spectra, agrees with a recent report (22) that Raman spectroscopy may be a useful method for typing bacteria, including Campylobacter.
Notwithstanding the potential practical benefits of this technology, it is the combination of this fine-scale phenotypic information with genotype information that provides the most significant insight into the evolution of Campylobacter. The genetic structure among the most common pathogenic Campylobacter species is broadly reflective of phenotypic differences. For example, C. jejuni and C. coli occupy different but overlapping niches dominating in wild birds and pigs (C. coli clade 1), respectively (2, 4), with both species being common in chickens and ruminants, where C. coli usually represents 10 to 15% of isolates (42). Similarly, the subspecies clade structure in C. coli is associated with isolates that dominate in agricultural (clade 1) and environmental (clades 2 and 3) reservoirs (2, 43), and the less-deep-branching clonal complex structure in C. jejuni is also associated with host differences (4).
This niche segregation can be used to explain the genetic structure in the population. According to a simple evolutionary model, isolates become separated in different niches and their genomes progressively diversify over time, accompanied by the accumulation of amino acid differences and concomitant phenotypic diversification. These phenotypic differences are the basis for the phenotypic discrimination of isolates using Raman spectra. Therefore, based on this simple model, classifications of clonal complex, clade, and species should be recovered with increasing ease, as they represent increasingly distant separation.
Consistent with this assumption, we find good discrimination of isolates from different clades in C. coli (98.41%). However, when discriminating isolates from the ST-21 and ST-45 clonal complexes, the number of isolates assigned to the correct clonal complex decreases to 86.89% (Fig. 3C). This result is inconsistent with similar studies that have shown Raman spectra to provide extremely fine-scale discriminatory power, even separating isogenic strains with subtle differences in phenotype (14, 22). There are a number of possible reasons for this. First, the Raman-derived phenotypes may contain less phenotypic information than is required for discrimination at this level. One explanation for this is that peaks originating from cytochrome c could mask more informative signals provided by non-cytochrome peaks (Fig. 2A and B). Second, it may also be that the evolutionary model of linked genotypic and phenotypic diversification is too simplistic, particularly when applied to bacteria that occupy multiple niches, such as Campylobacter.
Although Raman-based phenotypic measures have been shown to correlate well with genotypic classification, few studies have attempted to link the Raman phenotype with defined ecological characteristics, such as host association. C. jejuni lineages that are strongly associated with a single host niche have been identified by MLST. For example, the ST-257 and ST-61 complexes are associated with chicken and cattle, respectively (5, 12, 44). However, other lineages, including the ST-21 and ST-45 clonal complexes, are ubiquitous, being found in cattle, chicken, and other host species. This implies that within these lineages, strains with the same clonal background but from different host environments can display levels of phenotypic variation in adaptation to the host environment. To test this hypothesis, we investigated if isolates from diverse clonal backgrounds (ST-21 and ST-45 clonal complexes) would cluster together phenotypically when isolated from the same host. A total of 74% of the isolates were assigned the correct host origin irrespective of the clonal origin, but when isolates from the same clonal complex were analyzed for host-associated Raman spectral clusters, the accuracy of the phenotypic classification increased to >95% (Fig. 4B and C).
This suggests that the cellular characteristics from which the phenotypic signal is derived are not fixed by clonal descent but are influenced by the host environment and may change as strains move between hosts. The factors that promote this phenotypic plasticity within the ST-21 and ST-45 clonal complexes are not fully understood. While reconstructions based upon 51-locus rMLST provide a genealogical basis for considering the clonal relatedness of isolates, it is highly likely that adaptive loci elsewhere in the genome display high levels of diversity within clonal complexes, potentially showing host-associated differences. Analysis of Raman spectra could provide clues about the metabolic pathways and associated genes that are involved in adaptation to different hosts. For example, many of the Raman peaks that differed between isolates from different hosts resulted from variation in cytochrome c. Cytochromes of b and c types are known to be important in Campylobacter as an adaptation to the low-oxygen environment of the intestinal tract (45–47). Therefore, phenotypic differences based upon cytochromes could reflect differences in host niche characteristics, such as oxygen availability in the chicken and cow gut.
Genetic variation at loci involved in adaptation to different hosts has the potential to lead to phenotypic variation in clonally related lineages. This variation can be generated relatively quickly through lateral gene transfer, in which DNA from relatively distantly related isolates is integrated into a recipient genome, potentially conferring novel phenotypes. This process is common in Campylobacter, even between the species C. coli and C. jejuni (36, 48), and contributes to rapid adaptation to novel environments (3, 49). In addition to variation in the DNA sequence, functionally relevant phenotypic modifications could result from epigenetic factors, including changes in gene expression or cellular phenotype. For example, in C. jejuni, epigenetic regulation by the DNA methyltransferase Cj1461 has been shown to affect several phenotypes related to the virulence of lineages in human hosts (50).
Raman spectroscopy has previously been shown to be highly sensitive to phenotypic changes in cellular content, although most previous studies have focused on phenotypic changes due to metabolic history, such as growth stage (51) or exposure to chemical stress from pharmaceuticals (52) and antibiotics (53). The link between phenotype and genotype may not always be clear, as there is likely to be “noise” caused by differences in gene expression between isolates. Broad phenotypic characteristics, such as growth rate, substrate utilization, and biofilm formation, may also influence the Raman-measured phenotype, weakening the link between genotypic and phenotypic classification. However, the methodology used in this study was designed to control for these factors by standardizing growth conditions as much as possible. It is also likely that any variation caused by these factors is relevant to each strain's ecological niche and as such is a useful marker for host attribution. Additionally, the use of a supervised classification technique through machine learning allowed for the selection of optimal statistical models containing Raman peaks that best explained the predefined genotypic and ecological groupings and excluded those that did not contribute meaningfully to the classification.
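The idea of retaining only those Raman peaks that help separate predefined groups can be illustrated with a minimal sketch. This is not the ARD procedure used in the study: it swaps in a much simpler per-peak Fisher score (squared difference of group means divided by the summed within-group variances), and the intensity values, host labels, and function names below are hypothetical, chosen only for illustration.

```python
from statistics import mean, pvariance


def fisher_scores(spectra, labels):
    """Score each peak (column) by how well it separates two groups.

    A simple stand-in for relevance weighting, NOT automatic relevance
    determination: score = (difference of group means)^2 / (sum of variances).
    """
    groups = sorted(set(labels))
    if len(groups) != 2:
        raise ValueError("this sketch handles exactly two groups")
    rows_a = [s for s, y in zip(spectra, labels) if y == groups[0]]
    rows_b = [s for s, y in zip(spectra, labels) if y == groups[1]]
    scores = []
    for j in range(len(spectra[0])):
        col_a = [row[j] for row in rows_a]
        col_b = [row[j] for row in rows_b]
        gap = (mean(col_a) - mean(col_b)) ** 2
        spread = pvariance(col_a) + pvariance(col_b)
        if spread > 0:
            scores.append(gap / spread)
        elif gap > 0:
            scores.append(float("inf"))  # perfectly separated, zero spread
        else:
            scores.append(0.0)  # constant peak, carries no information
    return scores


def select_peaks(scores, keep):
    """Return the indices of the `keep` highest-scoring peaks, best first."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:keep]


# Hypothetical toy data: rows are isolates, columns are three peak intensities.
# Peak 0 differs between the two host groups; peaks 1 and 2 do not.
spectra = [
    [1.0, 0.50, 0.30],
    [1.2, 0.40, 0.35],
    [0.9, 0.60, 0.25],
    [2.0, 0.50, 0.30],
    [2.1, 0.45, 0.28],
    [1.9, 0.55, 0.33],
]
labels = ["chicken", "chicken", "chicken", "cattle", "cattle", "cattle"]

scores = fisher_scores(spectra, labels)
print(select_peaks(scores, keep=1))
```

In this toy case the first peak is retained and the other two are discarded. A univariate score of this kind treats each peak independently, whereas ARD weights all peaks jointly within a single model, so the sketch conveys only the general notion of excluding uninformative peaks.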
Whatever the mechanism(s) of phenotypic plasticity, changes have occurred over a relatively short time scale. According to recent estimates (3, 54), the diversification of C. jejuni from C. coli occurred around 6,600 years ago. If this estimate is correct, then the clonal complex structure reflects relatively recent evolution, and isolates belonging to the ST-21 and ST-45 complexes have switched hosts several times in the last 6,600 years, leading to the host-specific phenotypic signal. The ability to colonize multiple hosts is a key feature in the emergence of zoonotic pathogens such as Campylobacter, and the phenotypic plasticity that is observed among lineages from the same clonal background (clonal complex) may be important in adaptation.
The phenotypic plasticity observed in Campylobacter has practical implications for understanding the epidemiology of human campylobacteriosis. Recent studies have estimated the contribution of various disease reservoirs to human infection by comparing 7-locus STs of isolates from host source populations to those from clinical samples (12, 55). Because of the association of certain STs with hosts, including chicken and cattle (11), it is possible to identify chicken-associated isolates as a major source of human disease (12, 55). However, some lineages, including the ST-21 and ST-45 clonal complexes, are found in multiple hosts and therefore are much more difficult to attribute to a source using MLST data. The identification of a host-specific phenotypic signal provides the potential to carry out source tracking based upon Raman spectra. However, to do this it would be necessary to determine how the phenotypic signal of a particular lineage changes when it enters the human gut. For example, an isolate from chicken may pick up a mammal-adapted signature on entering a human host, leading to incorrect attribution to cattle (and other mammals).
Whatever the evolutionary implications or the potential utility of phenotypic classification in source attribution, the phenotypic plasticity in Campylobacter presents a significant challenge for disease control. Pathogenic bacteria are commonly specific lineages that emerge from a background of harmless ancestors via the acquisition of specific virulence factors, for example, Escherichia coli O157:H7 (56). In such cases, the disease-causing strains are well defined. In Campylobacter, however, it may be that, rather than acquiring specific virulence factors associated with disease, lineages from diverse clonal backgrounds rely on rapid adaptation and phenotypic plasticity to colonize new niches and infect people.
ACKNOWLEDGMENTS
D.S.R. was supported by core Science Budget funding from the Natural Environment Research Council at the Centre for Ecology & Hydrology. D.J.W. is funded by BBSRC grant BB/F005814/1.
We thank Ian T. Nabney for his advice on the use of ARD.
Footnotes
Published ahead of print 30 November 2012
Supplemental material for this article may be found at http://dx.doi.org/10.1128/AEM.02521-12.
REFERENCES
- 1. Buzby JC, Roberts T. 1997. Economic costs and trade impacts of microbial foodborne illness. World Health Stat. Q. 50:57–66
- 2. Sheppard SK, Colles F, Richardson J, Cody AJ, Elson R, Lawson A, Brick G, Meldrum R, Little CL, Owen RJ, Maiden MC, McCarthy ND. 2010. Host association of Campylobacter genotypes transcends geographic variation. Appl. Environ. Microbiol. 76:5269–5277
- 3. Woolhouse ME, Taylor LH, Haydon DT. 2001. Population biology of multihost pathogens. Science 292:1109–1112
- 4. Colles FM, Jones TA, McCarthy ND, Sheppard SK, Cody AJ, Dingle KE, Dawkins MS, Maiden MCJ. 2008. Campylobacter infection of broiler chickens in a free-range environment. Environ. Microbiol. 10:2042–2050
- 5. McCarthy ND, Colles FM, Dingle KE, Bagnall MC, Manning G, Maiden MCJ, Falush D. 2007. Host-associated genetic import in Campylobacter jejuni. Emerg. Infect. Dis. 13:267–272
- 6. Mullner P, Collins-Emerson JM, Midwinter AC, Carter P, Spencer SE, van der Logt P, Hathaway S, French NP. 2010. Molecular epidemiology of Campylobacter jejuni in a geographically isolated country with a uniquely structured poultry industry. Appl. Environ. Microbiol. 76:2145–2154
- 7. Strachan NJ, Gormley FJ, Rotariu O, Ogden ID, Miller G, Dunn GM, Sheppard SK, Dallas JF, Reid TM, Howie H, Maiden MC, Forbes KJ. 2009. Attribution of Campylobacter infections in northeast Scotland to specific sources by use of multilocus sequence typing. J. Infect. Dis. 199:1205–1208
- 8. Corry JEL, Atabay HI. 2001. Poultry as a source of Campylobacter and related organisms. J. Appl. Microbiol. 90:96s–114s
- 9. Friedman CR, Hoekstra RM, Samuel M, Marcus R, Bender J, Shiferaw B, Reddy S, Ahuja SD, Helfrick DL, Hardnett F, Carter M, Anderson B, Tauxe RV, Emerging Infections Program FoodNet Working Group. 2004. Risk factors for sporadic Campylobacter infection in the United States: a case-control study in FoodNet sites. Clin. Infect. Dis. 38:S285–S296
- 10. Neimann J, Engberg J, Molbak K, Wegener HC. 2003. A case-control study of risk factors for sporadic Campylobacter infections in Denmark. Epidemiol. Infect. 130:353–366
- 11. Sheppard SK, Colles FM, McCarthy ND, Strachan NJC, Ogden ID, Forbes KJ, Dallas JF, Maiden MCJ. 2011. Niche segregation and genetic structure of Campylobacter jejuni populations from wild and agricultural host species. Mol. Ecol. 20:3484–3490
- 12. Sheppard SK, Dallas JF, Strachan NJC, MacRae M, McCarthy ND, Wilson DJ, Gormley FJ, Falush D, Ogden ID, Maiden MCJ, Forbes KJ. 2009. Campylobacter genotyping to determine the source of human infection. Clin. Infect. Dis. 48:1072–1078
- 13. Harz M, Kiehntopf M, Stockel S, Rosch P, Straube E, Deufel T, Popp J. 2009. Direct analysis of clinical relevant single bacterial cells from cerebrospinal fluid during bacterial meningitis by means of micro-Raman spectroscopy. J. Biophoton. 2:70–80
- 14. Willemse-Erix HFM, Jachtenberg J, Barutci H, Puppels GJ, van Belkum A, Vos MC, Maquelin K. 2010. Proof of principle for successful characterization of methicillin-resistant coagulase-negative Staphylococci isolated from skin by use of Raman spectroscopy and pulsed-field gel electrophoresis. J. Clin. Microbiol. 48:736–740
- 15. Wulf MWH, Willemse-Erix D, Verduin CM, Puppels G, van Belkum A, Maquelin K. 2012. The use of Raman spectroscopy in the epidemiology of methicillin-resistant Staphylococcus aureus of human- and animal-related clonal lineages. Clin. Microbiol. Infect. 18:147–152
- 16. Huang WE, Li MQ, Jarvis RM, Goodacre R, Banwart SA. 2010. Shining light on the microbial world: the application of Raman microspectroscopy, p 153–186. Advances in applied microbiology, vol 70. Elsevier Academic Press Inc., San Diego, CA
- 17. Berger AJ, Zhu QY. 2003. Identification of oral bacteria by Raman microspectroscopy. J. Mod. Optic. 50:2375–2380
- 18. Buijtels P, Willemse-Erix HFM, Petit PLC, Endtz HP, Puppels GJ, Verbrugh HA, van Belkum A, van Soolingen D, Maquelin K. 2008. Rapid identification of mycobacteria by Raman spectroscopy. J. Clin. Microbiol. 46:961–965
- 19. Harz M, Rosch P, Peschke KD, Ronneberger O, Burkhardt H, Popp J. 2005. Micro-Raman spectroscopic identification of bacterial cells of the genus Staphylococcus and dependence on their cultivation conditions. Analyst 130:1543–1550
- 20. Hutsebaut D, Maquelin K, De Vos P, Vandenabeele P, Moens L, Puppels GJ. 2004. Effect of culture conditions on the achievable taxonomic resolution of Raman spectroscopy disclosed by three Bacillus species. Anal. Chem. 76:6274–6281
- 21. Kirschner C, Maquelin K, Pina P, Thi NAN, Choo-Smith LP, Sockalingum GD, Sandt C, Ami D, Orsini F, Doglia SM, Allouch P, Mainfait M, Puppels GJ, Naumann D. 2001. Classification and identification of Enterococci: a comparative phenotypic, genotypic, and vibrational spectroscopic study. J. Clin. Microbiol. 39:1763–1770
- 22. Lu XN, Huang Q, Miller WG, Aston DE, Xu J, Xue F, Zhang HW, Rasco BA, Wang S, Konkel ME. 2012. Comprehensive detection and discrimination of Campylobacter species by use of confocal micro-Raman spectroscopy and multilocus sequence typing. J. Clin. Microbiol. 50:2932–2946
- 23. Maquelin K, Dijkshoorn L, van der Reijden TJK, Puppels GJ. 2006. Rapid epidemiological analysis of Acinetobacter strains by Raman spectroscopy. J. Microbiol. Methods 64:126–131
- 24. Jolley KA, Bliss CM, Bennett JS, Bratcher HB, Brehony C, Colles FM, Wimalarathna H, Harrison OB, Sheppard SK, Cody AJ, Maiden MCJ. 2012. Ribosomal multilocus sequence typing: universal characterization of bacteria from domain to strain. Microbiology 158:1005–1015
- 25. Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821–829
- 26. Jolley KA, Maiden MCJ. 2010. BIGSdb: scalable analysis of bacterial genome variation at the population level. BMC Bioinformatics 11:595 doi:10.1186/1471-2105-11-595
- 27. Gundogdu O, Bentley SD, Holden MT, Parkhill J, Dorrell N, Wren BW. 2007. Re-annotation and analysis of the Campylobacter jejuni NCTC11168 genome sequence. BMC Genomics 8:162 doi:10.1186/1471-2164-8-162
- 28. Parkhill J, Dougan G, James KD, Thomson NR, Pickard D, Wain J, Churcher C, Mungall KL, Bentley SD, Holden MTG, Sebaihia M, Baker S, Basham D, Brooks K, Chillingworth T, Connerton P, Cronin A, Davis P, Davies RM, Dowd L, White N, Farrar J, Feltwell T, Hamlin N, Haque A, Hien TT, Holroyd S, Jagels K, Krogh A, Larsen TS, Leather S, Moule S, O'Gaora P, Parry C, Quail M, Rutherford K, Simmonds M, Skelton J, Stevens K, Whitehead S, Barrell BG. 2001. Complete genome sequence of a multiple drug resistant Salmonella enterica serovar Typhi CT18. Nature 413:848–852
- 29. Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797
- 30. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28:2731–2739
- 31. Ghosh S, Zhang PF, Li YQ, Setlow P. 2009. Superdormant spores of Bacillus species have elevated wet-heat resistance and temperature requirements for heat activation. J. Bacteriol. 191:5584–5591
- 32. Mackay DJC. 1995. Probable networks and plausible predictions–a review of practical Bayesian methods for supervised neural networks. Network Comp. Neural 6:469–505
- 33. Bishop CM. 2007. Pattern recognition and machine learning. Springer, New York, NY
- 34. Nabney IT. 2004. NETLAB: algorithms for pattern recognition. Springer, New York, NY
- 35. Picard RR, Cook RD. 1984. Cross-validation of regression models. J. Am. Stat. Assoc. 79:575–583
- 36. Sheppard SK, McCarthy ND, Falush D, Maiden MCJ. 2008. Convergence of Campylobacter species: implications for bacterial evolution. Science 320:237–239
- 37. Dingle KE, Colles FM, Wareing DRA, Ure R, Fox AJ, Bolton FE, Bootsma HJ, Willems RJL, Urwin R, Maiden MCJ. 2001. Multilocus sequence typing system for Campylobacter jejuni. J. Clin. Microbiol. 39:14–23
- 38. Johannessen C, White PC, Abdali S. 2007. Resonance Raman optical activity and surface enhanced resonance Raman optical activity analysis of cytochrome c. J. Phys. Chem. A 111:7771–7776
- 39. Prucek R, Ranc V, Kvitek L, Panacek A, Zboril R, Kolar M. 2012. Reproducible discrimination between Gram-positive and Gram-negative bacteria using surface enhanced Raman spectroscopy with infrared excitation. Analyst 137:2866–2870
- 40. Webb-Robertson BJM, Bailey VL, Fansler SJ, Wilkins MJ, Hess NJ. 2012. Spectral signatures for the classification of microbial species using Raman spectra. Anal. Bioanal. Chem. 404:563–572
- 41. Oberreuter H, Charzinski J, Scherer S. 2002. Intraspecific diversity of Brevibacterium linens, Corynebacterium glutamicum and Rhodococcus erythropolis based on partial 16S rDNA sequence analysis and Fourier-transform infrared (FT-IR) spectroscopy. Microbiology 148:1523–1532
- 42. Sheppard SK, Dallas JF, MacRae M, McCarthy ND, Sproston EL, Gormley FJ, Strachan NJC, Ogden ID, Maiden MCJ, Forbes KJ. 2009. Campylobacter genotypes from food animals, environmental sources and clinical disease in Scotland 2005/6. Int. J. Food Microbiol. 134:96–103
- 43. Colles FM, Ali JS, Sheppard SK, McCarthy ND, Maiden MCJ. 2011. Campylobacter populations in wild and domesticated Mallard ducks (Anas platyrhynchos). Env. Microbiol. Rep. 3:574–580
- 44. Colles FM, Dingle KE, Cody AJ, Maiden MC. 2008. Comparison of Campylobacter populations in wild geese with those in starlings and free-range poultry on the same farm. Appl. Environ. Microbiol. 74:3583–3590
- 45. Elkurdi AB, Leaver JL, Pettigrew GW. 1982. The c-type cytochromes of Campylobacter sputorum ssp. mucosalis. FEMS Microbiol. Lett. 14:177–182
- 46. Hoffman PS, Goodman TG. 1982. Respiratory physiology and energy conservation efficiency of Campylobacter jejuni. J. Bacteriol. 150:319–326
- 47. Lascelles J, Calder KM. 1985. Participation of cytochromes in some oxidation reduction systems in Campylobacter fetus. J. Bacteriol. 164:401–409
- 48. Sheppard SK, McCarthy ND, Jolley KA, Maiden MCJ. 2011. Introgression in the genus Campylobacter: generation and spread of mosaic alleles. Microbiology 157:1066–1074
- 49. Falush D. 2009. Toward the use of genomics to study microevolutionary change in bacteria. PLoS Genet. 5:e1000627 doi:10.1371/journal.pgen.1000627
- 50. Kim JS, Li JQ, Barnes IHA, Baltzegar DA, Pajaniappan M, Cullen TW, Trent MS, Burns CM, Thompson SA. 2008. Role of the Campylobacter jejuni cj1461 DNA methyltransferase in regulating virulence characteristics. J. Bacteriol. 190:6524–6529
- 51. Escoriza MF, Vanbriesen JM, Stewart S, Maier J. 2006. Studying bacterial metabolic states using Raman spectroscopy. Appl. Spectrosc. 60:971–976
- 52. Wharfe ES, Winder CL, Jarvis RM, Goodacre R. 2010. Monitoring the effects of chiral pharmaceuticals on aquatic microorganisms by metabolic fingerprinting. Appl. Environ. Microbiol. 76:2075–2085
- 53. Moritz TJ, Taylor DS, Polage CR, Krol DM, Lane SM, Chan JW. 2010. Effect of cefazolin treatment on the nonresonant Raman signatures of the metabolic state of individual Escherichia coli cells. Anal. Chem. 82:2703–2710
- 54. Sheppard SK, Dallas JF, Wilson DJ, Strachan NJ, McCarthy ND, Jolley KA, Colles FM, Rotariu O, Ogden ID, Forbes KJ, Maiden MC. 2010. Evolution of an agriculture-associated disease causing Campylobacter coli clade: evidence from national surveillance data in Scotland. PLoS One 5:e15708 doi:10.1371/journal.pone.0015708
- 55. Wilson DJ, Gabriel E, Leatherbarrow AJ, Cheesbrough J, Gee S, Bolton E, Fox A, Hart CA, Diggle PJ, Fearnhead P. 2009. Rapid evolution and the importance of recombination to the gastroenteric pathogen Campylobacter jejuni. Mol. Biol. Evol. 26:385–397
- 56. Perna NT, Plunkett G, Burland V, Mau B, Glasner JD, Rose DJ, Mayhew GF, Evans PS, Gregor J, Kirkpatrick HA, Posfai G, Hackett J, Klink S, Boutin A, Shao Y, Miller L, Grotbeck EJ, Davis NW, Lim A, Dimalanta ET, Potamousis KD, Apodaca J, Anantharaman TS, Lin JY, Yen G, Schwartz DC, Welch RA, Blattner FR. 2001. Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409:529–533