Abstract
Malassezia yeasts are commensal microorganisms found on human and animal skin. Species of Malassezia have been linked to skin and opportunistic infections, in which certain microenvironmental conditions in the host are required for the pathogenic processes to occur. We present an analysis of the volatile space of Malassezia pachydermatis grown at three pHs (5.7, 9.7, and 12.4) by comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GC×GC-TOFMS). Since changes in pH also affect the growth media and the volatile organic compounds (VOCs) it produces, media blanks at the three pHs were also analyzed, with 5 replicates of each of the 6 sample classes. Following data collection, the GC×GC-TOFMS chromatograms were analyzed with Fisher ratio software, which found 566 analytes, of which 288 were tentatively identified with a mass spectrum match value (MV) ≥ 800 in a NIST library search. A signal pattern for each of the 566 analytes was obtained by averaging the replicates, and two metrics (R and RSD) were calculated for each signal pattern. The R metric was defined to focus on the differences between analyte signals of media blanks and M. pachydermatis by removing the influence of pH changes, while the RSD metric was defined to evaluate only the influence of pH. Based on the R metric magnitude, the analytes were split into 3 categories: media analytes consumed by M. pachydermatis, analytes at similar concentration at a given pH in the media and M. pachydermatis, and analytes produced by M. pachydermatis only. Many of the analytes produced by M. pachydermatis have previously been shown to be produced by other yeast species and to have biological significance when pH is varied. Further, there is evidence of some bioconversions between the consumed analytes and the produced analytes.
We also verified our classification results using a support vector machine (SVM) model; cross-validation gave a true positive rate (TPR) and true negative rate (TNR) of approximately 0.95 or higher for every category, with error below 0.03 (3%).
INTRODUCTION
Malassezia pachydermatis is a yeast species found on mammalian skin. Malassezia species are lipid dependent and inhabit sebaceous areas, such as the scalp, face, or behind the ears, where these yeasts can take up lipids from their host.1,2 Malassezia is a ubiquitous commensal microbe that can also act as an opportunistic pathogen: under certain microenvironmental conditions, dermatologic and systemic diseases can be triggered in the host.2–7 pH has been proposed as one microenvironmental condition that may trigger such changes in Malassezia. Deviations from the pH of healthy skin (pH 4.0–5.5)8 can elicit some of these dermatologic and systemic diseases.9–11
Evidence suggests that microbial volatile organic compounds (VOCs) are generated from central and secondary metabolism; therefore, decoding the chemical communication mediated by microbial VOCs may be pivotal to understanding what triggers the yeast to become pathogenic. VOCs are low-molecular-weight, carbon-based compounds with a high vapor pressure.6 Several factors influence the production of microbial VOCs, including substrate and temperature. Further, the composition of the VOC profile can be altered by factors such as pH.3,4 Studies have begun to characterize the metabolic pathways of Malassezia VOCs at the genus level;3,4,12 however, untangling the complex volatome and the pathogenic triggers of Malassezia, and of microbes in general, remains challenging.
For the chemical analysis of complex systems, comprehensive two-dimensional (2D) gas chromatography with time-of-flight mass spectrometry (GC×GC-TOFMS) has seen growing use in areas including screening for metabolite changes in specific diseases such as tuberculosis13,14 and SARS-CoV-2,15,16 as well as changes in breath based on diet.17 GC×GC-TOFMS allows nontargeted, high-throughput analysis of samples, which is particularly important for biological samples, where many analytes are present and interact with one another in various ways.18,19 Because GC×GC-TOFMS datasets are so complex, it is essentially impossible to interpret them fully by hand. Hence, advanced “chemometric” software tools are used to extract chemical information from such datasets with as little user interaction as possible.20
Two general categories of chemometric methods are often implemented: targeted and untargeted, with the latter focusing on all of the analytes that are responsible for differences or similarities between classes of samples. Further, these methods can be supervised or unsupervised, where supervised methods rely on knowledge of the sample classes before analysis. A powerful supervised untargeted method is tile-based Fisher ratio (F-ratio) analysis,21,22 used here to find class-distinguishing analytes for M. pachydermatis grown on agar growth media at three pHs: 5.7, 9.7, and 12.4. The F-ratio is calculated as the ratio of the between-class variance to the sum of the within-class variances within a small 2D portion of the chromatogram, called a tile, on a mass channel (m/z) basis.21,22 After F-ratio analysis, the user is provided with a hit list, with analyte hits ranked in descending order of their F-ratio. An analyte hit with a high F-ratio is more likely to differ in concentration between two or more sample classes in a statistically significant way, so the most informative analytes should be near the top of the hit list. The user can then readily extract the signal for every analyte for each sample class. Using these analyte signals, further visualization and evaluation of the relationships between classes can be performed with methods such as principal component analysis (PCA)23–26 or k-means.27,28 However, in some cases these methods do not provide clear relationships between the sample classes. To address this challenge, post-processing of the analyte signals may be needed, e.g., either on simulated data18 or by applying suitable quantitative metrics.29
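To make the F-ratio definition above concrete, the following is a minimal Python sketch of a one-way, ANOVA-style F-ratio for the summed signal of a single tile at a single m/z. It is an illustration under stated assumptions, not the tile-based software's implementation: the function name `fisher_ratio` and the degrees-of-freedom normalization are assumptions, and the tiling, m/z selection, and artifact handling performed by the actual software are omitted.

```python
import numpy as np

def fisher_ratio(signals_by_class):
    """ANOVA-style F-ratio for one tile at one m/z (illustrative sketch).

    signals_by_class: list of 1D sequences, one per sample class, each
    holding the tile signal of every replicate chromatogram in that class.
    """
    groups = [np.asarray(g, dtype=float) for g in signals_by_class]
    k = len(groups)                          # number of sample classes
    n_total = sum(g.size for g in groups)    # total number of chromatograms
    grand_mean = np.concatenate(groups).mean()

    # Between-class variance: spread of the class means around the grand mean.
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    var_between = ss_between / (k - 1)

    # Within-class variance: spread of replicates around their own class mean,
    # pooled over all classes.
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    var_within = ss_within / (n_total - k)

    if var_within == 0:
        return float("inf")                  # replicates identical within every class
    return var_between / var_within

# Hypothetical signals for two classes with three replicates each.
example_f = fisher_ratio([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

In this hypothetical example the class means (2 and 5) are well separated relative to the replicate scatter, so the F-ratio is large; a tile whose signal does not differ between classes would give an F-ratio near 1.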
Herein, we present a study of VOCs produced by M. pachydermatis grown at three pHs, in which the sample class relationships are initially confounded. Because the VOC profiles of the media blanks also change with pH, the media blanks were analyzed as well. As such, the dataset had 6 sample classes, with 5 culture replicates collected per class. Overall, 30 GC×GC-TOFMS chromatograms were analyzed using tile-based F-ratio analysis, after which the signal pattern across the 6 sample classes was extracted for each analyte deemed significant by passing a suitable F-ratio threshold based on previous work,22 and analyzed for further classification using two quantitative metrics (R and RSD). The R metric was defined to focus on the differences between analyte signals of media blanks and M. pachydermatis by removing the influence of pH changes. The RSD metric, on the other hand, was defined to evaluate only the influence of pH. Based on the R metric magnitude, the analytes were split into 3 categories: media analytes consumed by M. pachydermatis (in media only), analytes at similar concentration at a given pH in the media blank and M. pachydermatis, and analytes produced by M. pachydermatis only. Using these metric assessments, we also built a support vector machine model to confirm that the metrics we defined and implemented are a valid approach to categorizing analytes. The model is shown to provide high sensitivity and specificity with a very low error. To our knowledge, this is the first study to analyze the VOC profile of M. pachydermatis using GC×GC-TOFMS, and the first to use this method to examine how that profile changes with the pH of the culture conditions.
EXPERIMENTAL SECTION
M. pachydermatis Microbial Culture.
M. pachydermatis strain CBS 1879 was purchased from the American Type Culture Collection (ATCC, Manassas, Virginia) for this study. M. pachydermatis was maintained on modified Dixon agar (mDixon, per liter: 20 g noble agar, 36 g malt extract, 10 g mycological peptone (Oxoid L40), 10 g desiccated ox bile, 10 mL Tween 60, 4 mL glycerol 50%) and incubated for 48 hr at 31 °C. Liquid cultures were prepared by removing an individual fungal colony from the agar plate and mixing it into 10 mL of liquid mDixon media in an autoclaved Erlenmeyer flask. The flask was then incubated on an orbital shaker (180 rpm) at 31 °C for 48 hr.
Device Inoculation and Incubation.
Prior to inoculation, 20 mL glass GC-MS vials (Supelco, St. Louis, MO) were ultrasonicated in 70% ethanol for 30 min and autoclaved at 121 °C for 30 min on a gravity cycle. Metal caps with a rubber septum (Supelco, St. Louis, MO) for the vials were also sonicated in 70% ethanol for 30 min and dried with compressed air. Each vial contained 1 mL of mDixon agar, either unadjusted (pH 5.7) or adjusted to pH 9.7 or 12.4 with 1 N sodium hydroxide (NaOH), per the experimental design for inoculated samples and blanks. After the 48 hr incubation period, the liquid fungal cultures were adjusted to 1 × 10⁶ colony forming units per milliliter (CFU/mL) by optical density (OD600) analysis before inoculation onto the agar in the GC-MS vials. Each vial was inoculated with 10 μL of the liquid culture. The vials were capped and sealed with Parafilm. Vials were then placed in a pipette tip box in a 31 °C incubator for 72 hr to allow fungal colony growth before sampling by headspace (HS) solid phase microextraction (SPME) followed by GC×GC-TOFMS data collection.
HS-SPME-GC×GC-TOFMS instrument conditions.
Data was collected using a Pegasus BT 4D GC×GC-TOFMS with a cryogenic modulator (LECO, St. Joseph, MI). The headspace of each media blank and M. pachydermatis sample was sampled using a divinylbenzene/carboxen/polydimethylsiloxane SPME fiber (DVB/CAR/PDMS, fiber thickness: 50/30 μm). This fiber was chosen due to its ability to extract analytes of a wide boiling point and compound polarity range that were anticipated in these samples.
For initial conditioning, a new DVB/CAR/PDMS fiber was held in a 250 °C GC inlet for 1 hr. Before each sample extraction, M. pachydermatis samples and media blanks at each pH (5.7, 9.7, and 12.4) were incubated at 31 °C for 5 min. The headspace of the samples was extracted for 30 min at 31 °C. This temperature was chosen so that the M. pachydermatis samples would not be altered, or the cells killed, by excessive heat. The SPME fiber with the extracted volatiles was desorbed in splitless mode in the GC inlet for 5 min at 250 °C. Between sample extractions and chromatographic runs, the fiber was re-conditioned at 250 °C for 5 min. The sample extraction method utilized the L-PAL3 autosampler (LECO, St. Joseph, MI, USA).
Separations of the 6 sample classes in true quintuplicate (each of the 5 culture replicates was prepared in a separate vial but within a single experiment) were collected using the LECO Pegasus BT 4D GC×GC-TOFMS equipped with an Agilent 7890 GC (Agilent Technologies, Palo Alto, CA, USA) and a stock quad-jet thermal modulator. Splitless sample injections were separated on a non-polar Rxi-5Sil MS 1D column (60 m × 0.25 mm × 0.25 μm; Restek) and a mid-polar Rtx-200 2D column (3 m × 0.18 mm × 0.2 μm; Restek). The 1D column was held at 40 °C for 5 min before ramping to 140 °C at 8 °C/min, and then ramping to 250 °C at 30 °C/min, where it was held for 10 min. The same temperature program was used for the 2D oven and modulator with an offset of +5 °C and +15 °C, respectively. A temperature program with a fast ramp near the end of the separation was used to reduce the overall run time while preserving the information content of the biological sample data set. The carrier gas, ultra-high purity helium (Grade 5, 99.999%, Praxair, Seattle, WA, USA), was operated at a constant flow rate of 2 mL/min. The 1D effluent was reinjected onto the 2D column at a modulation period of 3 s. The ion source and transfer line temperatures were set to 225 °C and 285 °C, respectively. The TOFMS collected m/z 45–334 at 100 Hz, for a total of 290 mass channels for analysis. Electron ionization at 70 eV was used, with a 10 s acquisition delay.
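As a quick arithmetic check on the instrument settings above, the short Python sketch below works out the length of the oven program, the approximate number of modulations per run, the number of spectra per modulation, and the number of mass channels. The helper name `ramp_minutes` is illustrative, and the 10 s acquisition delay and any oven equilibration time are ignored.

```python
def ramp_minutes(start_c, end_c, rate_c_per_min):
    """Duration of a linear temperature ramp, in minutes."""
    return (end_c - start_c) / rate_c_per_min

# Oven program: 5 min hold, 40 -> 140 C at 8 C/min, 140 -> 250 C at 30 C/min, 10 min hold.
run_min = 5 + ramp_minutes(40, 140, 8) + ramp_minutes(140, 250, 30) + 10

modulation_period_s = 3      # modulation period
spectra_rate_hz = 100        # TOFMS acquisition rate

modulations = run_min * 60 / modulation_period_s                 # modulations per run
spectra_per_modulation = modulation_period_s * spectra_rate_hz   # spectra in one 2D slice
mass_channels = 334 - 45 + 1                                     # m/z 45 through 334, inclusive

print(f"run time: {run_min:.1f} min, about {modulations:.0f} modulations")
print(f"{spectra_per_modulation} spectra per modulation, {mass_channels} mass channels")
```

The program therefore runs for roughly 31 min, and each 3 s modulation contains 300 spectra across 290 mass channels.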
Data analysis.
GC×GC-TOFMS chromatograms were imported into ChromaTOF Tile Software (v. 101, LECO, St. Joseph, MI, USA) to perform tile-based Fisher ratio analysis, using a 1D tile size of 5 modulations (15 s) and a 2D tile size of 15 spectra (150 ms); the hit list was ranked using the m/z signal that produced the highest F-ratio for each hit. The signal-to-noise ratio (S/N) threshold was set to 10 and the minimum number of m/z per tile was set to 3. All hits with an F-ratio ≥ 20, a threshold based on previous work,22 were investigated, and after artifact removal 566 analyte hits remained. Of these 566 analytes, 288 were tentatively identified with a mass spectrum match value (MV) ≥ 800 in a NIST library search. Applying this F-ratio threshold focused the analysis on the most relevant analyte hits while keeping the total number of hits at a manageable level. Signals for each of those hits at their highest F-ratio m/z for the 30 chromatograms (3 pHs × 2 conditions (media blank vs. M. pachydermatis) × 5 culture replicates) were exported from ChromaTOF Tile Software, and all further data analysis was performed in MATLAB 2023a with the Parallel Computing Toolbox (MathWorks, Inc., Natick, MA, USA). Support vector machine (SVM) regression modeling was performed using PLS Toolbox (Eigenvector Research, Inc., Manson, WA, USA). For the SVM regression modeling, the signals averaged over the 5 replicates were used for the 566 hits. The model was trained on the signals of the 566 hits, each normalized to the highest of its 6 class-averaged signals. A radial basis function kernel was used, no compression was performed, and standard parameters (Table S1, in Supporting Information) were used for gamma, cost, epsilon, and nu. Venetian blinds cross-validation with 10 data splits was used.
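The two data-preparation steps described above, normalizing each hit's signal pattern to its maximum and splitting the hits with venetian blinds, can be sketched in Python as follows. This is an illustrative sketch, not the PLS Toolbox implementation: the function names are hypothetical, and a blind thickness of one sample (every 10th hit goes to the same split) is an assumption, since the blind thickness is not stated in the text.

```python
import numpy as np

def normalize_to_max(patterns):
    """Scale each hit's signal pattern (one row per hit) so its largest value is 1."""
    patterns = np.asarray(patterns, dtype=float)
    return patterns / patterns.max(axis=1, keepdims=True)

def venetian_blind_splits(n_samples, n_splits=10):
    """Yield (train_idx, test_idx) pairs for venetian blinds cross-validation.

    Assumes one sample per blind: split s holds out samples
    s, s + n_splits, s + 2 * n_splits, ... and trains on the rest.
    """
    indices = np.arange(n_samples)
    for s in range(n_splits):
        test_mask = indices % n_splits == s
        yield indices[~test_mask], indices[test_mask]

# 566 hits, each described by 6 class-averaged signals (B1, B2, B3, M1, M2, M3).
splits = list(venetian_blind_splits(566, n_splits=10))
held_out_sizes = [len(test_idx) for _, test_idx in splits]
```

With 566 hits and 10 splits, every hit is held out exactly once, and each split tests 56 or 57 hits while training on the remainder.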
RESULTS AND DISCUSSION
GC×GC-TOFMS chromatograms of all the samples were collected (one replicate of each sample class is shown in Figure 1) and then processed using tile-based F-ratio software (described in the Experimental Section). Figure 2 provides zoomed-in chromatographic data for the media blank at pH 5.7, 9.7, and 12.4 (Figure 2A-C, respectively) and for M. pachydermatis at pH 5.7, 9.7, and 12.4 (Figure 2D-F, respectively) for three analyte hits (i-iii). Tile-based F-ratio analysis revealed that 566 analyte hits exhibited sufficient class-to-class signal differences (F-ratio ≥ 20) across the 6 sample classes. Following the F-ratio analysis, each analyte hit was reduced to its signal pattern, with examples provided in Figure 2G, where on the x-axis “B” represents media blank samples, “M” represents M. pachydermatis samples, and “1”, “2”, and “3” represent pH 5.7, 9.7, and 12.4, respectively. Examples of the three distinctive signal patterns observed in this dataset are shown in Figure 2Gi-iii. The analyte in Figure 2Gi, phenylethyl alcohol (Hit 1), has a very small signal in all the blanks and a much higher signal in at least one of the M. pachydermatis samples. This is an example of an analyte produced by M. pachydermatis, with a pH dependence. A different signal pattern is seen in Figure 2Gii for 2-methylfuran (Hit 2), whose signal changes with pH but differs only minimally between media blank and M. pachydermatis. This is an example of an analyte that depends only on pH. Finally, a third signal pattern is observed in Figure 2Giii for hexanal (Hit 29), which exhibits high signals in the media blanks and low signals in the M. pachydermatis samples. This indicates an analyte that is consumed by M. pachydermatis.
Figure 1.
TIC chromatograms of one example replicate of media blank at (A) pH 5.7, (B) pH 9.7, (C) pH 12.4, and of M. pachydermatis at (D) pH 5.7, (E) pH 9.7, (F) pH 12.4. These are representative of five culture replicates.
Figure 2.
Zoom-in of selective m/z ion chromatograms for (A) media blank at pH 5.7, (B) media blank at pH 9.7, (C) media blank at pH 12.4, (D) M. pachydermatis at pH 5.7, (E) M. pachydermatis at pH 9.7, and (F) M. pachydermatis at pH 12.4 of (i) Hit 1 at m/z 116, (ii) Hit 2 at m/z 107, and (iii) Hit 29 at m/z 60. (G) Average signal patterns for (i) Hit 1: phenylethyl alcohol as an example analyte produced by M. pachydermatis, (ii) Hit 2: 2-methylfuran as an example analyte in both the media blank and M. pachydermatis, and (iii) Hit 29: hexanal as an example analyte consumed by M. pachydermatis. B1 denotes media blank at pH 5.7, B2 denotes media blank at pH 9.7, and B3 denotes media blank at pH 12.4. M1 denotes M. pachydermatis at pH 5.7, M2 denotes M. pachydermatis at pH 9.7, and M3 denotes M. pachydermatis at pH 12.4. Error bars in G i-iii represent the standard error of the mean of five culture replicates.
As there were three visually recognizable, distinctive types of analyte signal patterns (Figure 3A-C), we initially attempted to apply PCA; however, a clear classification of the three analyte signal patterns was not observed (Figure S1). Hence, we created and applied a mathematical method, defining two quantitative metrics to separate the analyte signal patterns into three categories and thereby uncouple the influence of pH from the difference between media blank and M. pachydermatis. The first quantitative metric, R, was defined to remove the pH influence while focusing on the signal differences between media blanks and M. pachydermatis. This R metric (Figure 3D-F) is the sum of the M. pachydermatis signals across every pH minus the sum of the media blank signals across every pH, normalized by the sum of these two summed quantities, so that analytes produced by M. pachydermatis have a positive R and consumed analytes have a negative R:
$$R = \frac{\sum M - \sum B}{\sum M + \sum B} \qquad (1)$$
Figure 3.
Original signal patterns for (A) Hit 1, an analyte produced by M. pachydermatis, (B) Hit 2, an analyte in both the media blank and M. pachydermatis, and (C) Hit 29, an analyte consumed by M. pachydermatis. R values (D-F) and RSD values (G-I) for Hits 1, 2, and 29, respectively. B1 denotes media blank at pH 5.7, B2 denotes media blank at pH 9.7, and B3 denotes media blank at pH 12.4. M1 denotes M. pachydermatis at pH 5.7, M2 denotes M. pachydermatis at pH 9.7, and M3 denotes M. pachydermatis at pH 12.4. Error bars in A-C represent the standard error of the mean of five culture replicates. Scatter dots in A-C represent signals from individual culture replicates.
where M represents all M. pachydermatis signals and B represents all media blank signals. As defined, the R metric ranges from −1 to 1. Next, the quantitative RSD metric (Figure 3G-I) was defined to focus on the pH influence while minimizing the influence of the signal differences between media blanks and M. pachydermatis, by taking the relative standard deviation (RSD) of the summed media blank and M. pachydermatis signals at each pH (B1+M1, B2+M2, B3+M3). The general equation for this RSD metric is shown below, where STD stands for standard deviation:
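$$\mathrm{RSD} = \frac{\mathrm{STD}(B_1 + M_1,\; B_2 + M_2,\; B_3 + M_3)}{\mathrm{mean}(B_1 + M_1,\; B_2 + M_2,\; B_3 + M_3)} \times 100\% \qquad (2)$$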
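The two metrics can be sketched in a few lines of Python. Two points are assumptions rather than statements from the text: the sign of R is chosen so that analytes produced by M. pachydermatis give a positive R, which is the convention implied by the reported values (for example, R = 0.92 for phenylethyl alcohol), and the standard deviation is taken as the sample standard deviation (ddof=1). The function names and the example signals are hypothetical.

```python
import numpy as np

def r_metric(blank, malassezia):
    """R: (sum of M signals - sum of B signals) / (sum of M + sum of B)."""
    b = float(np.sum(blank))
    m = float(np.sum(malassezia))
    return (m - b) / (m + b)

def rsd_metric(blank, malassezia):
    """RSD (%) of the summed signals at each pH: B1+M1, B2+M2, B3+M3."""
    per_ph = np.asarray(blank, dtype=float) + np.asarray(malassezia, dtype=float)
    # ddof=1 (sample standard deviation) is an assumption.
    return 100.0 * per_ph.std(ddof=1) / per_ph.mean()

# Hypothetical averaged signals at pH 5.7, 9.7, and 12.4 (not data from the study).
blank_signals = [1.0, 1.0, 1.0]          # B1, B2, B3
malassezia_signals = [10.0, 30.0, 20.0]  # M1, M2, M3

r_value = r_metric(blank_signals, malassezia_signals)
rsd_value = rsd_metric(blank_signals, malassezia_signals)
```

For these hypothetical signals R is close to 1, since almost all of the signal comes from the M. pachydermatis samples, while the RSD of roughly 48% reflects how strongly the combined signal varies across the three pHs.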
The R and RSD for each of the 566 analyte hits are plotted in Figure 4 to visualize the categorization process, which relies principally on the R metric (x-axis), which minimizes the influence of pH, while the RSD metric (y-axis) captures the pH dependence. Note that the 566 analyte hits are spread over much of the plot, indicating that a wide range of analyte signal patterns was exhibited. A high positive R indicates an analyte produced by M. pachydermatis (Figure 3A,D for phenylethyl alcohol (Hit 1), R = 0.92, RSD = 67.6%). Conversely, a high negative R indicates an analyte consumed by M. pachydermatis (Figure 3C,F for hexanal (Hit 29), R = −0.77, RSD = 94.7%). Finally, the third category, with a small R (near 0), corresponds to an analyte with a signal pattern that is similar between the media blank and M. pachydermatis (Figure 3B,E for 2-methylfuran (Hit 2), R = −0.07, RSD = 94.7%), which may or may not have a pH dependence, as indicated by the RSD. Initially, we empirically applied an R metric threshold to separate these 3 signal pattern categories; this initial categorization was then examined more rigorously using a support vector machine (SVM) modeling approach, vide infra. For the initial categorization, analytes whose signal patterns do not change significantly between the media blank and M. pachydermatis have a small R (near 0). We empirically define this condition as a less than 2-fold difference between the sum of the media blank signals and the sum of the M. pachydermatis signals. Accordingly, this condition results in an R metric threshold of R = ±0.33. When this R threshold is applied (Figure 4), the 3 analyte signal pattern categories emerge: analytes produced by M. pachydermatis are in red, analytes consumed by M. pachydermatis are in pink, and analytes with little to no change between media blank and M. pachydermatis are in blue.
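The link between the 2-fold criterion and the R threshold of ±0.33 follows directly from the definition of R: if one summed signal is f times the other, the magnitude of R is (f − 1)/(f + 1), which equals 1/3 for f = 2. The Python sketch below shows this arithmetic together with a simple threshold-based categorization. The function names are hypothetical, and assigning values exactly at the threshold to the "similar" category is an assumption, as the text does not say how boundary values were handled.

```python
def r_from_fold_change(fold):
    """Magnitude of R when the larger summed signal is `fold` times the smaller one."""
    return (fold - 1.0) / (fold + 1.0)

R_THRESHOLD = r_from_fold_change(2.0)    # (2 - 1) / (2 + 1) = 1/3, i.e. about 0.33

def categorize(r, threshold=R_THRESHOLD):
    """Assign an analyte to a signal pattern category from its R value."""
    if r > threshold:
        return "produced"    # higher summed signal in M. pachydermatis
    if r < -threshold:
        return "consumed"    # higher summed signal in media blank
    return "similar"         # less than a 2-fold difference

# R values reported in the text for the three example hits.
examples = {
    "phenylethyl alcohol (Hit 1)": 0.92,
    "2-methylfuran (Hit 2)": -0.07,
    "hexanal (Hit 29)": -0.77,
}
categories = {name: categorize(r) for name, r in examples.items()}
```

Applied to the three example hits, this reproduces the assignments described in the text: Hit 1 is produced, Hit 2 is similar in media blank and M. pachydermatis, and Hit 29 is consumed.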
We attribute the success of the R and RSD metrics to their mathematical definitions, which were designed in tandem with the experimental design of 6 inter-related sample classes: R isolates the difference between media blank and M. pachydermatis, RSD isolates the pH dependence, and together they resolve the analyte signal patterns into three categories.
Figure 4.
Plot of RSD vs R for the 566 analytes, showing signal pattern examples (denoted in pink, blue, and red) for analytes at similar R values but different RSD values across culture conditions. B1 denotes media blank at pH 5.7, B2 denotes media blank at pH 9.7, and B3 denotes media blank at pH 12.4. M1 denotes M. pachydermatis at pH 5.7, M2 denotes M. pachydermatis at pH 9.7, and M3 denotes M. pachydermatis at pH 12.4. Error bars in signal patterns represent the standard error of the mean of five culture replicates.
While the R metric is the key metric for classifying the analytes according to their trend between blanks and M. pachydermatis samples, the RSD metric describes the pH dependence. In addition to its impact on growth conditions, pH can also affect the volatility of many of the analytes. For instance, some of the pH-dependent analytes with significant RSD values are Lewis bases, such as 2-methylfuran (Hit 2), whose volatility increases as pH increases, as shown in Figure 3B. Future studies would be needed to uncouple these two pH dependencies. Next, we look in more depth at detailed examples of relevant analytes (Figure 4). Analyte signal patterns with a high RSD at the top of the graph (1-methylthiopropane (Hit 80), 2,3-dimethyl-2-cyclopenten-1-one (Hit 218), and 2-methyl-1-butanol (Hit 450)) show a strong influence of pH on their signal patterns. The lower RSD examples at the bottom of the graph (octanal (Hit 405), 2,5-dimethylpyrazine (Hit 229), and 3-methyl-2-pentanone (Hit 501)) show a relatively lower impact of pH on their signal patterns. The two examples of analytes dependent on pH (Hits 218 and 229), as well as the two that are produced (Hits 450 and 501), have been reported in the literature in other species of Malassezia (M. globosa, M. restricta, M. sympodialis), which have been associated with skin diseases such as atopic dermatitis and psoriasis.3,30–33
Additionally, 2,5-dimethylpyrazine (Hit 229) is one of the analytes that changes slightly with pH and is present in both media blank and M. pachydermatis samples (Figure 4). This is consistent with the literature, in which it has been noted to exist in all eukaryotes, including plants and animals, and the media components are derived from animal sources.34 2,5-Dimethylpyrazine has also been found as a fermentation byproduct after inoculation with a different yeast, Saccharomyces cerevisiae.35 The other example of a pH-dependent analyte is 2,3-dimethyl-2-cyclopenten-1-one (Hit 218), which has been found as a natural product in Mangifera indica, commonly called mango.36 One example of an analyte produced by M. pachydermatis is 2-methyl-1-butanol (Hit 450), a compound produced by many different microbes including S. cerevisiae,37 Corynebacterium glutamicum,38 Escherichia coli,39 and others.40 It is also one of the compounds involved in signaling in microbe-to-microbe interactions.6 2-Methyl-1-butanol is a promising alternative to ethanol as a biofuel, so production by a microbe could be very efficient.37 The other example is 3-methyl-2-pentanone (Hit 501), which is produced by bacterial strains of Microbacterium foliorum41 and Pseudomonas aeruginosa.42 It has also been found in the volatile space of both unspoiled and spoiled Iberian dry-cured hams.43 Lastly, two examples of consumed analytes are octanal (Hit 405) and 1-methylthiopropane (Hit 80). Octanal has previously been found in different growth media and shown to be consumed by E. coli.44
Based upon their F-ratio magnitude, the top 20 identified analytes from each analyte signal pattern category are shown in Figure 5, while a full list of all 566 analytes is provided in Table S2. The media analytes that are consumed by M. pachydermatis are shown in Figure 5A on a heatmap, where red represents high signal intensity and blue represents low signal intensity. Note that most analytes are highest in the media at pH 5.7; however, a few analytes are highest in the media at pH 9.7. The analytes with similar signal patterns between the media blank and M. pachydermatis are shown in Figure 5B, while the analytes that are produced by M. pachydermatis are shown in Figure 5C. Of the analytes produced, many have been reported previously, including phenylethyl alcohol, 1-pentanol, 3-methyl-1-butanol, and 2-ethyl-1-hexanol.3 Another analyte, ethanol, has been found to be produced by the fungus Muscodor albus, which is known to release volatiles that inhibit other fungi, bacteria, and insects.45 There is also evidence that microbes can convert fatty acids (which are present in the media) to the corresponding alcohols,46 such as 1-pentanol (Hit 88, Figure 5C) or benzyl alcohol (Hit 160, Figure 5C).
Figure 5.
Signal profiles of the top 20 identified analytes of each class: (A) analytes consumed by M. pachydermatis, (B) analytes that are similar between the media blank and M. pachydermatis, and (C) analytes produced by M. pachydermatis.
Lastly, we validated our approach of assigning analytes to one of three signal pattern categories by building a support vector machine (SVM) model. All 566 analyte signal patterns were normalized to the maximum signal within each signal pattern before being input to the SVM software. A cross-validation (CV) set was created using venetian blinds, as described in the Experimental Section. Results of the CV are shown in Table 1. The four primary metrics used to evaluate the model are true positive rate (TPR), true negative rate (TNR), precision, and error. All metric equations are provided in the SI. TPR is the proportion of positive cases that were correctly identified; TNR is the proportion of negative cases that were correctly classified. In this study, all three analyte signal pattern categories perform very well, with TPR and TNR of approximately 0.95 or higher (0.949 to 0.996), where 1 is perfect. Precision measures the proportion of positive predictions that are actually correct. For the SVM model, all three analyte categories have a precision of 0.93 or higher, indicating that the model predictions are reliable. Error is the proportion of samples that were incorrectly classified. The highest error was 0.027 (2.7%), which is still low, for the category of analyte signal patterns that are similar between media blank and M. pachydermatis. The modeling error for the other two categories, analytes consumed by M. pachydermatis and analytes produced by M. pachydermatis, is lower still, at 1.1% and 1.6%, respectively. This result is encouraging, as it shows that our categorization of the analyte signal patterns based on the R metric is sound: a blind SVM classification model assigned the analytes to the same categories.
To further evaluate the SVM model performance, the macro-averaged F-score, accuracy, and extended Matthews correlation coefficient (EMCC) were also calculated, with 1 being perfect.47 Equations are provided in the SI. The F-score balances TPR and precision, providing a single score that reflects both the correctness and the completeness of positive predictions; in our SVM model, the macro-averaged F-score over all three classes is 0.97, indicating high classification quality across classes. Accuracy, which measures the overall predictive ability, is 0.96. Lastly, EMCC evaluates the entire model by considering true and false predictions across all classes; our model's EMCC is 0.93, indicating strong and balanced predictive performance.
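For readers who want to reproduce this kind of evaluation, the Python sketch below computes the per-class metrics in a one-vs-rest fashion from a multiclass confusion matrix, along with the macro-averaged F-score and accuracy. It uses the standard textbook definitions, which are assumed (not confirmed) to match the equations in the SI; EMCC is not included. The function names are hypothetical, and the confusion matrix is invented for illustration and is not the study's result.

```python
import numpy as np

def one_vs_rest_metrics(confusion, class_names):
    """Per-class TPR, TNR, precision, error, and F-score.

    confusion[i][j] = number of analytes of true class i predicted as class j.
    """
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    metrics = {}
    for k, name in enumerate(class_names):
        tp = cm[k, k]                    # class k correctly predicted as k
        fn = cm[k, :].sum() - tp         # class k predicted as another class
        fp = cm[:, k].sum() - tp         # other classes predicted as k
        tn = total - tp - fn - fp        # everything else
        tpr = tp / (tp + fn)
        precision = tp / (tp + fp)
        metrics[name] = {
            "TPR": tpr,
            "TNR": tn / (tn + fp),
            "precision": precision,
            "error": (fp + fn) / total,
            "F": 2 * precision * tpr / (precision + tpr),
        }
    return metrics

def macro_f_score(metrics):
    """Unweighted mean of the per-class F-scores."""
    return float(np.mean([m["F"] for m in metrics.values()]))

def accuracy(confusion):
    """Fraction of all analytes assigned to their true class."""
    cm = np.asarray(confusion, dtype=float)
    return float(np.trace(cm) / cm.sum())

# Hypothetical confusion matrix for illustration only (NOT the study's results).
names = ["consumed", "similar", "produced"]
example = [[8, 2, 0],
           [1, 9, 0],
           [0, 1, 9]]
result = one_vs_rest_metrics(example, names)
overall_f = macro_f_score(result)
overall_accuracy = accuracy(example)
```

Because each misclassified analyte counts as a false negative for its true class and a false positive for its predicted class, the per-class errors sum to twice the overall misclassification rate.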
Table 1.
Summary of cross-validation results for the support vector machine regression model.
| Class | TPR | TNR | Precision | Error |
|---|---|---|---|---|
| Consumed analytes | 0.965 | 0.996 | 0.982 | 0.011 |
| Analytes similar in media blank and M. pachydermatis | 0.981 | 0.959 | 0.979 | 0.027 |
| Produced analytes | 0.949 | 0.990 | 0.938 | 0.016 |
CONCLUSION
Here we present the nontargeted analysis of the volatile space of M. pachydermatis at three different pHs using GC×GC-TOFMS. Discovery-based F-ratio analysis provided a final hit list of 566 analyte hits, 288 of which were tentatively identified by a NIST mass spectrum library search. Two quantitative metrics, R and RSD, were defined and implemented, with R focusing on the changes between media blanks and M. pachydermatis by minimizing the influence of pH, and RSD focusing on the influence of pH. Based on the R metric, the analyte signal patterns were split into 3 categories: media analytes consumed by M. pachydermatis, analytes in the media blank and M. pachydermatis with similar patterns, and analytes produced by M. pachydermatis. Many of the produced analytes have previously been shown to be produced by other yeast species and to have biological significance. Further, there is evidence of some bioconversions between the consumed analytes and the produced analytes. We also validated our signal pattern categorization method using an SVM model, where cross-validation gave TPR and TNR of approximately 0.95 or higher for every category, with error below 0.03 (3%). To our knowledge, this is the first study to analyze the M. pachydermatis volatome using GC×GC-TOFMS, allowing a more in-depth study of the volatile signals consumed and produced by the microbe. Microenvironmental changes, including pH, are hypothesized to trigger commensal-to-pathogenic transitions in M. pachydermatis. It is therefore important to analyze how the VOC profile changes with pH in order to better characterize the metabolic processes of M. pachydermatis in different microenvironments.
Supporting Information
The Supporting Information (pdf) is available free of charge on the ACS Publications website.
• PCA scores of 566 analytes (Figure S1); SVM regression model standard parameters (Table S1); final hitlist of 566 analytes (Table S2); SVM metric calculations (S.1.).
ACKNOWLEDGMENT
L. Mikaliunaite acknowledges the Lithuanian Foundation Scholarship. J. C. Tokihiro acknowledges the National Center for Advancing Translational Sciences grant TL1TR002318. This publication was supported by the National Institutes of Health (NIH) through R35GM128648. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Ashleigh B. Theberge reports filing multiple patents through the University of Washington and receiving a gift to support research outside the submitted work from Ionis Pharmaceuticals. Erwin Berthier is an inventor on multiple patents filed by Tasso, Inc., the University of Washington, and the University of Wisconsin. Erwin Berthier has ownership in Salus Discovery, LLC, and Tasso, Inc. and is employed by Tasso, Inc. However, this research is not related to these companies. Erwin Berthier and Ashleigh B. Theberge have ownership in Seabright, LLC, which will advance new tools for diagnostics and clinical research, but is not directly related to the research in this manuscript. The terms of this arrangement have been reviewed and approved by the University of Washington in accordance with its policies governing outside work and financial conflicts of interest in research.
Footnotes
Conflicts of interest: The other authors declare no conflicts of interest.