PLOS One. 2021 Jun 4;16(6):e0252378. doi: 10.1371/journal.pone.0252378

Metabolomic profiling of microbial disease etiology in community-acquired pneumonia

Ilona den Hartog 1, Laura B Zwep 1,2,3, Stefan M T Vestjens 4, Amy C Harms 1, G Paul Voorn 4, Dylan W de Lange 5,6, Willem J W Bos 7,8, Thomas Hankemeier 1, Ewoudt M W van de Garde 9,10, J G Coen van Hasselt 1,*
Editor: Aran Singanayagam 11
PMCID: PMC8177549  PMID: 34086721

Abstract

Diagnosis of microbial disease etiology in community-acquired pneumonia (CAP) remains challenging. We undertook a large-scale metabolomics study of serum samples in hospitalized CAP patients to determine if host-response associated metabolites can enable diagnosis of microbial etiology, with a specific focus on discrimination between the major CAP pathogen groups S. pneumoniae, atypical bacteria, and respiratory viruses. Targeted metabolomic profiling of serum samples was performed for three groups of hospitalized CAP patients with confirmed microbial etiologies: S. pneumoniae (n = 48), atypical bacteria (n = 47), or viral infections (n = 30). A wide range of 347 metabolites was targeted, including amines, acylcarnitines, organic acids, and lipids. Single discriminating metabolites were selected using Student’s T-test and their predictive performance was analyzed using logistic regression. Elastic net regression models were employed to discover metabolite signatures with predictive value for discrimination between pathogen groups. Metabolites to discriminate S. pneumoniae or viral pathogens from the other groups showed poor predictive capability, whereas discrimination of atypical pathogens from the other groups was found to be possible. Classification of atypical pathogens using elastic net regression models was associated with a predictive performance of 61% sensitivity, 86% specificity, and an AUC of 0.81. Targeted profiling of the host metabolic response revealed metabolites that can support diagnosis of microbial etiology in CAP patients with atypical bacterial pathogens compared to patients with S. pneumoniae or viral infections.

Introduction

Community-acquired pneumonia (CAP) is a commonly occurring respiratory tract infection caused by bacterial or viral pathogens that can lead to severe disease, especially in elderly patients [1]. The predominant pathogens found in hospitalized CAP patients are Streptococcus pneumoniae and to a lesser extent, Haemophilus influenzae, Legionella pneumophila, and respiratory viruses [2, 3]. Patients hospitalized with severe CAP typically receive empirical antibiotic treatment with broad-spectrum antibiotics until the microbial etiology is determined [4, 5]. Current standard diagnostic methods for microbial identification are pathogen-targeted and include culturing, antigen testing, and molecular diagnostics such as PCR [5]. In over 60% of CAP patients, no causative pathogen can be identified with these pathogen-targeted diagnostic techniques [2, 6]. As a consequence, broad-spectrum antibiotics are over-used, which facilitates the emergence of antimicrobial resistance [7, 8]. To this end, a need exists to explore innovative methods to enhance the diagnostic performance for the detection of microbial pathogens in CAP.

Evaluation of differences in the host-response to CAP-associated pathogens may be an alternative approach to improve diagnosis [9]. There is growing evidence that the metabolic response of the host, i.e., the patient, to infections can be a relevant source of novel host immune response biomarkers [10, 11]. Several small studies have reported differences in metabolite profiles in blood and urine samples in patients with different types of infections (S1 Table) [12–18]. For instance, studies comparing metabolomic changes in CAP and tuberculosis (TB) patients show increased levels of plasma lipids and decreased levels of metabolites involved in cholesterol synthesis [12, 15]. A study comparing viral and bacterial respiratory tract infections showed that plasma metabolite profiles of patients with influenza A and bacterial pneumonia differed significantly [17]. In another study, urine samples of patients with a respiratory syncytial virus (RSV) or a bacterial respiratory tract infection showed differences in metabolite levels as well [18]. An important limitation of these studies is that the comparisons made cannot yet support the etiological diagnosis of CAP but merely focus on differences between diseases such as TB versus CAP. The studies that compared viral and bacterial causative pathogen groups of CAP used an untargeted metabolomics approach. While an untargeted approach is especially useful for the discovery of new features and hypothesis-free analysis, a targeted approach that can be fully quantified to clinical laboratory standards may be preferable for clinical implementation. Furthermore, these studies have the limitation that they focus on the comparison of pediatric patients while most hospitalized CAP patients are adults. No studies have evaluated differences in metabolite profiles of CAP patients comparing different microbial etiologies relevant for treatment of CAP, i.e., S. pneumoniae, atypical pathogens, and viral infections.

In the current study, we performed extensive targeted metabolomic profiling for three groups of hospitalized CAP patients with confirmed microbial etiologies of S. pneumoniae, atypical bacteria, or viral infections. We aimed to determine whether host-response associated metabolites can enable diagnosis of microbial etiology, focusing on discrimination between the pathogen groups S. pneumoniae, atypical bacteria, and respiratory viruses in patients hospitalized with CAP.

Materials and methods

Study population

Serum samples were taken from 505 patients that were diagnosed with CAP in two previously conducted clinical studies that were executed between October 2004 and September 2010 [2, 3]. The samples were taken from CAP patients within 24 hours after hospital admission. In 57% of these patient samples, the causative pathogen could be identified using conventional diagnostic methods such as culturing, PCR, and urinary antigen tests. The most commonly found causative pathogen in these patients was S. pneumoniae, followed by atypical bacterial and viral pathogens. A minority of patients was diagnosed with other bacteria.

From the selection of patients in which a causative pathogen was identified, we excluded patients with mixed infections. Furthermore, we constructed three distinct groups of patients with Streptococcus pneumoniae, atypical (Coxiella burnetii, Chlamydophila psittaci, Legionella pneumophila or Mycoplasma pneumoniae), or viral (influenza virus, herpes simplex virus (HSV), respiratory syncytial virus (RSV), parainfluenza virus, or another respiratory virus) infections. The number of available samples for the patient group with confirmed viral CAP infection was limited (n = 31). The patients included in the S. pneumoniae and atypical bacterial groups were randomly drawn from the remaining study population in an iterative fashion until the bacterial groups were composed in such a way that the three groups showed comparable means for sex and pneumonia severity index scores. This resulted in a group of 49 patients with S. pneumoniae and a group of 50 patients with atypical infections (Fig 1). No matching of individual samples was performed. An overview of patient characteristics is provided in Table 1 and S2 Table. Patient characteristics that might be considered as possible covariates were: age, sex, nursing home resident, renal disease, congestive heart failure, CNS disease, malignancy, COPD, diabetes, altered mental status, respiratory rate, systolic blood pressure, temperature, pulse, pH, BUN, sodium, glucose, hematocrit, partial pressure of oxygen, pleural effusion on x-ray, duration of symptoms before admission, and antibiotic treatment before admission. The analyses performed in this study were carried out in accordance with the informed consent given by the patients. The clinical data were anonymized before use.

Fig 1. Flow chart of the formation of the three studied patient groups.


Table 1. Patient characteristics per pathogen group.

S. pneumoniae (n = 48) Atypical (n = 47) Viral (n = 30) P-value
Age (years)
    Mean (SD) 62.2 (18.9) 54.7 (14.6) 70.1 (16.4) <0.01
    Median [Min, Max] 63.5 [18.0, 98.0] 52.0 [26.0, 81.0] 74.0 [29.0, 95.0]
Sex
    Male 22 (45.8%) 34 (72.3%) 21 (70.0%) 0.12
PSI score
    < 50 9 (18.8%) 9 (19.1%) 2 (6.7%) 0.33
    51–70 7 (14.6%) 13 (27.7%) 6 (20.0%)
    71–90 5 (10.4%) 10 (21.3%) 7 (23.3%)
    91–130 23 (47.9%) 12 (25.5%) 11 (36.7%)
    > 131 4 (8.3%) 3 (6.4%) 4 (13.3%)
Liver disease
    No 48 (100%) 47 (100%) 30 (100%) -
Kidney disease
    Yes 3 (6.2%) 1 (2.1%) 4 (13.3%) 0.30
Cardiovascular disease
    Yes 6 (12.5%) 5 (10.6%) 3 (10.0%) 0.93
CNS disease
    No 46 (95.8%) 44 (93.6%) 28 (93.3%) 0.66
    Yes 1 (2.1%) 3 (6.4%) 2 (6.7%)
    Missing 1 (2.1%) 0 (0%) 0 (0%)
Malignancy
    No 44 (91.7%) 46 (97.9%) 28 (93.3%) 0.66
    Yes 3 (6.2%) 1 (2.1%) 2 (6.7%)
    Missing 1 (2.1%) 0 (0%) 0 (0%)
COPD
    No 24 (50.0%) 44 (93.6%) 25 (83.3%) 0.16
    Yes 9 (18.8%) 3 (6.4%) 5 (16.7%)
    Missing 15 (31.2%) 0 (0%) 0 (0%)
Diabetes
    No 26 (54.2%) 45 (95.7%) 26 (86.7%) 0.17
    Yes 7 (14.6%) 2 (4.3%) 4 (13.3%)
    Missing 15 (31.2%) 0 (0%) 0 (0%)
Duration of symptoms before admission (days)
    Mean (SD) 4.06 (3.03) 5.83 (5.65) 4.70 (3.21) 0.33
    Median [Min, Max] 3.50 [1.00, 14.0] 5.00 [1.00, 42.0] 4.00 [0.00, 14.0]
    Missing 16 (33.3%) 0 (0%) 0 (0%)
Antibiotic treatment before admission
    No 27 (56.2%) 29 (61.7%) 23 (76.7%) 0.17
    Yes 5 (10.4%) 18 (38.3%) 7 (23.3%)
    Missing 16 (33.3%) 0 (0%) 0 (0%)
Corticosteroid use before admission
    No 29 (60.4%) 46 (97.9%) 29 (96.7%) 0.67
    Yes 2 (4.2%) 1 (2.1%) 1 (3.3%)
    Missing 17 (35.4%) 0 (0%) 0 (0%)

Data are presented as number (%) or mean (SD). Abbreviations: PSI: pneumonia severity index; CNS: central nervous system; COPD: chronic obstructive pulmonary disease.

Bioanalytical procedures

Serum samples were analyzed with five liquid chromatography methods and one gas chromatography, mass spectrometry-based, targeted, metabolomics method. The metabolomics profiling covered 596 metabolite targets from 25 metabolite classes, including amino acids, biogenic amines, acylcarnitines, organic acids, and multiple classes of lipids (S3 Table). Levels of 374 unique metabolites were detected in the samples. The metabolomic profiling was performed within the Biomedical Metabolomics Facility of Leiden University in Leiden, The Netherlands. Details of the metabolomic analysis methods used are provided in S1 Method.

Data analysis

The data resulting from the metabolomic profiling was cleaned by removing patient samples with more than 10 missing metabolite values, for example, if results from one measurement platform were missing because of too low sample volumes, and by removing metabolites with missing patient samples, for example, because of a sample preparation error. The clean dataset consisted of 347 metabolite levels (S4 Table) for 125 patients diagnosed with the microbial etiology S. pneumoniae (n = 48), atypical (n = 47), or viral (n = 30). The pathogens identified in each group are shown in Table 2. The resulting metabolite levels were preprocessed by applying log transformation and standardized to correct for heteroscedasticity. The preprocessed metabolomics dataset was visually inspected using a principal component analysis.
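The log transformation and standardization step can be sketched as follows. This is a minimal illustration, not the authors' script: the natural logarithm, the use of the sample standard deviation, and the function name `log_autoscale` are assumptions, since the text does not state the log base or the exact scaling.

```python
import math
import statistics

def log_autoscale(levels):
    """Log-transform one metabolite's levels, then scale to mean 0 and SD 1."""
    # Assumption: natural log; levels must be strictly positive.
    logged = [math.log(x) for x in levels]
    mean = statistics.mean(logged)
    sd = statistics.stdev(logged)  # sample standard deviation (n - 1)
    return [(v - mean) / sd for v in logged]

# Hypothetical levels of one metabolite in five patients (arbitrary units).
example = [1.0, 2.0, 4.0, 8.0, 16.0]
scaled = log_autoscale(example)
```

After this step every metabolite has the same variance, so metabolites with large absolute levels do not dominate a penalized regression or a principal component analysis.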

Table 2. Distribution of causative microbial agents per pathogen group for statistical data analysis.

Causative pathogen S. pneumoniae (n = 48) Atypical bacterial (n = 47) Viral (n = 30)
S. pneumoniae 48 (100%) 0 (0%) 0 (0%)
Legionella pneumophila 0 (0%) 18 (38.3%) 0 (0%)
Coxiella burnetii 0 (0%) 17 (36.2%) 0 (0%)
Chlamydophila psittaci 0 (0%) 7 (14.9%) 0 (0%)
Mycoplasma pneumoniae 0 (0%) 5 (10.6%) 0 (0%)
Influenza virus 0 (0%) 0 (0%) 11 (36.7%)
HSV 0 (0%) 0 (0%) 6 (20.0%)
RSV 0 (0%) 0 (0%) 4 (13.3%)
Parainfluenza virus 0 (0%) 0 (0%) 3 (10.0%)
Other viruses 0 (0%) 0 (0%) 6 (20.0%)

Data are presented as number (%). Abbreviations: S. pneumoniae: Streptococcus pneumoniae; HSV: herpes simplex virus; RSV: respiratory syncytial virus.

Data imputation was performed for patient characteristics that were to be evaluated as covariates in the statistical analysis and showed missingness in the data. Five times repeated imputation using predictive mean matching was performed with the ‘mice’ package for R to impute the patient data for the covariates with less than 25% missing data. Predictive mean matching is suitable for both numeric and binary covariates. Patient characteristics with >25% missing data were excluded from further analysis.

We performed logistic regression and elastic net regression modeling to determine if patients in one pathogen group could be discriminated from patients in the remaining two groups. Also, we aimed to determine which metabolites were important for prediction of the causative pathogen. In both methods, five-fold cross-validation was used to make the most efficient use of the available data for estimation of the predictive performance of the models and its associated metabolites [19]. Furthermore, the model generation was repeated 100 times to obtain robust estimates of the predictive performance of the models.
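To make the resampling scheme concrete, the sketch below generates the train and test indices for five-fold cross-validation repeated 100 times, which yields 500 fitted models per comparison. It is an illustration only: the stratification by class follows the caption of Fig 2 and is assumed to apply here as well, and the function names and seeding scheme are hypothetical.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, balancing classes across folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    offset = 0
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices round-robin, continuing where the
        # previous class stopped so that fold sizes stay even.
        for pos, i in enumerate(idx):
            folds[(offset + pos) % k].append(i)
        offset += len(idx)
    return folds

def repeated_cv_splits(labels, k=5, repeats=100):
    """Yield (train_indices, test_indices) for every fold of every repeat."""
    for rep in range(repeats):
        folds = stratified_folds(labels, k, seed=rep)
        for test in folds:
            held_out = set(test)
            train = [i for i in range(len(labels)) if i not in held_out]
            yield train, sorted(test)

# Group sizes from the cleaned dataset: 47 atypical (1) versus 48 + 30 others (0).
labels = [1] * 47 + [0] * 78
splits = list(repeated_cv_splits(labels))
```

With 125 patients each test fold holds 25 samples, of which 9 or 10 are atypical, so per-fold sensitivity estimates are necessarily coarse. This is one reason for averaging over many repeats.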

To identify single discriminative metabolites, Student’s t-tests with false discovery rate (FDR) multiple testing corrections were performed (p < 0.05). Then, each significant metabolite and the combination of all significant metabolites were modeled using logistic regression. Models containing the covariates age and sex, and models containing all covariates, were also generated. The predictive logistic regression models were analyzed by comparison of their area under the curve (AUC), sensitivity, specificity, balanced error rate (BER), and receiver operating characteristic (ROC) curve.
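The text does not name the FDR procedure. A common choice is the Benjamini-Hochberg step-up method, sketched below under that assumption. The raw p-values are made up for illustration and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for five metabolites (not from the study).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
```

In this toy example four raw p-values fall below 0.05, but only the first two remain significant after adjustment. The same mechanism is how a panel of 347 metabolites can reduce to a handful of discriminating candidates.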

Elastic net regression was performed to test if the predictive power of the metabolite data could be increased by including correlations between metabolites in addition to evaluating single metabolites. In elastic net regression, metabolites that have no explanatory power can be set to zero, as in a lasso regression, and metabolites that explain the same amount of variance can all be included with balanced coefficient sizes, as in a ridge regression [20].
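For reference, one widely used parameterization of the elastic net objective for a binary outcome is given below. It is stated here as an assumption, because the text names the hyperparameters α and λ but does not write out the objective; ℓ denotes the binomial log-likelihood and N the number of samples.

```latex
\hat{\beta} = \arg\min_{\beta_0,\, \beta}
  \left\{ -\frac{1}{N}\, \ell(\beta_0, \beta)
  + \lambda \left[ \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2
  + \alpha \lVert \beta \rVert_1 \right] \right\}
```

Under this form, α = 1 gives the lasso penalty, α = 0 gives the ridge penalty, and λ controls the overall strength of shrinkage.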

To obtain robust estimates of the predictive performance of the elastic net model, hyperparameters were optimized in a five-fold nested cross-validation, in which the hyperparameters were selected independently of the calculation of the predictive performance, as is schematically shown in Fig 2 [21]. In the inner cross-validation loop, the model optimization loop, optimal values for the model hyperparameters α and λ were determined. In the outer cross-validation loop, the model performance loop, the optimal model for the training fold was built using the selected hyperparameters α and λ (S1 Fig). Hyperparameter selection was performed using the balanced error rate (BER), which can be calculated from the numbers of true and false positives (TP, FP) and true and false negatives (TN, FN; Eq 1). The BER accounts for different group sizes per model and therefore gives an accurate picture of the performance of models in the model optimization and model performance loops.
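The separation between the two loops can be sketched generically. In the outline below, `fit` and `error` are caller-supplied placeholders (hypothetical names) standing in for elastic net fitting and BER calculation, and the toy usage replaces the real model with a fixed threshold on one made-up feature so that the example stays self-contained.

```python
def nested_cv(n_samples, outer_folds, make_inner_folds, grid, fit, error):
    """Nested cross-validation: tune on inner folds, score on the outer test fold.

    fit(train_idx, params) returns a model; error(model, test_idx) returns a
    loss such as the BER. Both are supplied by the caller.
    """
    outer_results = []
    for test_idx in outer_folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n_samples) if i not in held_out]
        # Inner loop: pick hyperparameters using only the outer training data.
        best_params, best_err = None, float("inf")
        for params in grid:
            errs = []
            for val_idx in make_inner_folds(train_idx):
                val = set(val_idx)
                inner_train = [i for i in train_idx if i not in val]
                errs.append(error(fit(inner_train, params), val_idx))
            mean_err = sum(errs) / len(errs)
            if mean_err < best_err:
                best_params, best_err = params, mean_err
        # Outer loop: refit with the chosen hyperparameters, test on unseen data.
        model = fit(train_idx, best_params)
        outer_results.append((best_params, error(model, test_idx)))
    return outer_results

# Toy usage: the "model" is a fixed threshold on one made-up feature.
x = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
y = [0, 0, 0, 0, 1, 1, 1, 1]

def fit(train_idx, threshold):
    return threshold  # nothing to learn in this toy example

def error(threshold, idx):
    return sum((x[i] > threshold) != bool(y[i]) for i in idx) / len(idx)

def make_inner_folds(train_idx):
    return [train_idx[0::2], train_idx[1::2]]

results = nested_cv(len(x), [[0, 4], [1, 5], [2, 6], [3, 7]],
                    make_inner_folds, [0.25, 0.5, 0.75], fit, error)
```

The key property is that the outer test fold is never seen while hyperparameters are chosen, so the outer error is not biased by the tuning step.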

Fig 2. Schematic representation of stratified nested cross-validation for elastic net regression model optimization and performance [21].


Abbreviations: CV: cross-validation.

$$\mathrm{BER} = 0.5 \left( \frac{FP}{TN + FP} + \frac{FN}{FN + TP} \right) \qquad (1)$$

The overall predictive diagnostic performance was evaluated using sensitivity and specificity performance measures, generated from the confusion matrix that represents the number of samples falling into each possible outcome (Eqs 2 and 3). The average sensitivity and specificity of all 500 generated models and their standard deviations were used to compare the assay performance to currently used methods.

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \qquad (2)$$
$$\mathrm{Specificity} = \frac{TN}{TN + FP} \qquad (3)$$
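The three performance measures in Eqs 1 to 3 can be computed directly from the confusion matrix counts. The sketch below uses a made-up test fold; the function names are illustrative only.

```python
def sensitivity(tp, fn):
    """Fraction of true cases that the model detects (Eq 2)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of non-cases that the model correctly rejects (Eq 3)."""
    return tn / (tn + fp)

def balanced_error_rate(tp, fp, tn, fn):
    """Mean of the false-positive rate and the false-negative rate (Eq 1)."""
    return 0.5 * (fp / (tn + fp) + fn / (fn + tp))

# Hypothetical test-fold confusion matrix: 10 atypical and 15 other patients.
tp, fn, tn, fp = 6, 4, 13, 2
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
ber = balanced_error_rate(tp, fp, tn, fn)
```

Because the BER equals 1 minus the mean of sensitivity and specificity, it weights both groups equally regardless of their sizes. As a rough consistency check, a mean sensitivity of 0.61 and a mean specificity of 0.86 correspond to a BER of about 0.265.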

The relative contribution of each metabolite to the prediction of the expected pathogen group was quantified using the variable importance in prediction (VIP) score, expressed as a percentage. The VIP score was calculated per metabolite per fold or repeat as follows:

$$\mathrm{VIP}(\%) = \frac{\beta_j}{\sum_{i=0}^{p} |\beta_i|} \cdot 100\% \qquad (4)$$

where βj is the regression coefficient for fold j over the sum of all regression coefficient values in the model. Metabolites were arranged based on their mean VIP score over all folds and repeats. Metabolites with an absolute VIP > 1% were considered to be most important. Furthermore, to determine the need to include age and sex, or all covariates in the models we compared the BER for models with and without age and sex, or all covariates included. Finally, mean AUC values and ROC curves were calculated and generated to compare the performance of the elastic net models to the logistic regression models.
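A minimal sketch of the VIP calculation in Eq 4 is shown below. The coefficients are hypothetical, and treating index 0 as the intercept is an assumption based on the sum in Eq 4 starting at i = 0.

```python
def vip_percent(coefficients):
    """Signed share of each coefficient in the summed absolute coefficients, in %."""
    total = sum(abs(b) for b in coefficients)
    if total == 0:
        # A fully shrunken model carries no importance information.
        return [0.0 for _ in coefficients]
    return [100.0 * b / total for b in coefficients]

# Hypothetical coefficients of one fitted model (index 0 assumed to be the intercept).
betas = [0.5, -1.0, 0.0, 2.0, 0.5]
vip = vip_percent(betas)
important = [j for j, v in enumerate(vip) if abs(v) > 1.0]
```

Keeping the sign in the numerator preserves the direction of the association, which is why the threshold is applied to the absolute VIP. Coefficients that the elastic net shrinks to zero contribute a VIP of zero and drop out of the ranking.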

The scripts used for the statistical analyses were deposited on GitHub at http://github.com/vanhasseltlab/MetabolomicsEtiologyCAP.

Results

Metabolomics profiling and exploratory analysis of metabolomics data

Metabolomics profiling was performed for 130 patients and 596 metabolite targets. Preprocessing of the metabolomics dataset resulted in a reduced dataset including 125 patients and 347 metabolites (Fig 1). The patient characteristics of these 125 patients are displayed in Table 1. The patients were diagnosed with the microbial etiology S. pneumoniae (n = 48), atypical bacteria (n = 47), or respiratory virus (n = 30) (Table 2). A list of all targeted and detected metabolites and their identifiers can be found in S4 Table. Unsupervised principal component analysis showed no clear separation between pathogen groups (S2 Fig).

Single discriminating metabolites for pathogen groups

Three significant metabolites were found for the discrimination of atypical pathogens from S. pneumoniae and viral pathogens using a Student’s T-test with FDR multiple testing correction (p < 0.05): glycylglycine, symmetric dimethylarginine (SDMA), and lysophosphatidylinositol (18:1) (LPI (18:1)). For the other comparisons, no significantly discriminating metabolites were found.

The significantly differentiating metabolites were included in logistic regression models to differentiate patients with atypical pathogens from patients suffering from CAP caused by S. pneumoniae or viral pathogens. The logistic regression models were evaluated based on their AUC, sensitivity, specificity, BER, and ROC curve after five-fold cross-validation with 100 repeats (Table 3, Fig 3). They show that logistic regression models of the individual metabolites glycylglycine, SDMA, and LPI (18:1) can differentiate atypical pathogens from S. pneumoniae and viral pathogens with AUCs between 0.70–0.72, sensitivities between 0.32–0.36, specificities between 0.83–0.85, and BERs of 0.39–0.41. A logistic regression model including all three significantly discriminating metabolites yields a more successful separation with an AUC of 0.78, sensitivity of 0.57, specificity of 0.83, and BER of 0.30. Addition of the covariates age and sex to the three-metabolite model slightly improved its predictive performance, resulting in a sensitivity of 0.63 and a specificity of 0.84. This model also showed the highest AUC (0.79) and lowest BER (0.26) of the tested logistic regression models. The addition of other covariates to the logistic regression model resulted in lower performance, probably due to overfitting of the model. The ROC curves emphasize the increased model performance upon the addition of more discriminating metabolites to the logistic regression model (Fig 3).

Table 3. Results from the logistic regression and elastic net regression models that were tested in a fivefold cross-validation with 100 repeats.

Model Variables AUC Sensitivity Specificity BER
Atypical–(S. pneumoniae + viral)
Logistic Regression Glycylglycine 0.72 (0.094) 0.36 (0.14) 0.83 (0.110) 0.40 (0.084)
Logistic Regression SDMA 0.72 (0.093) 0.36 (0.15) 0.86 (0.100) 0.39 (0.082)
Logistic Regression LPI.18.1. 0.70 (0.099) 0.32 (0.14) 0.85 (0.100) 0.41 (0.082)
Logistic Regression Age + sex 0.71 (0.097) 0.39 (0.15) 0.85 (0.090) 0.38 (0.071)
Logistic Regression All covariates 0.65 (0.098) 0.52 (0.15) 0.68 (0.120) 0.40 (0.087)
Logistic Regression Glycylglycine + SDMA + LPI.18.1. 0.78 (0.094) 0.57 (0.16) 0.83 (0.100) 0.30 (0.090)
Logistic Regression Glycylglycine + SDMA + LPI.18.1. + age + sex 0.79 (0.089) 0.63 (0.16) 0.84 (0.095) 0.26 (0.085)
Logistic Regression Glycylglycine + SDMA + LPI.18.1. + all covariates 0.75 (0.097) 0.60 (0.16) 0.78 (0.110) 0.31 (0.093)
Elastic net regression 100 (82) 0.81 (0.087) 0.61 (0.18) 0.86 (0.092) 0.27 (0.094)
Elastic net regression 110 (91) incl. age & sex 0.80 (0.094) 0.61 (0.17) 0.84 (0.096) 0.28 (0.090)
Elastic net regression 270 (140) incl. all covariates 0.69 (0.100) 0.58 (0.17) 0.70 (0.120) 0.36 (0.098)
S. pneumoniae–(atypical + viral)
Elastic net regression 210 (120) 0.74 (0.091) 0.83 (0.10) 0.50 (0.160) 0.33 (0.087)
Elastic net regression 240 (130) incl. age & sex 0.74 (0.095) 0.80 (0.10) 0.52 (0.160) 0.34 (0.084)
Elastic net regression 290 (120) incl. all covariates 0.63 (0.110) 0.69 (0.13) 0.51 (0.17) 0.40 (0.098)
Viral–(S. pneumoniae + atypical)
Elastic net regression 170 (140) 0.54 (0.120) 0.88 (0.11) 0.16 (0.170) 0.48 (0.075)
Elastic net regression 130 (130) incl. age & sex 0.63 (0.130) 0.89 (0.08) 0.23 (0.160) 0.44 (0.082)
Elastic net regression 180 (160) incl. all covariates 0.56 (0.130) 0.79 (0.11) 0.31 (0.190) 0.45 (0.099)

The table displays the performance of the models for the three comparisons: atypical versus S. pneumoniae and viral pathogens; S. pneumoniae pathogens versus atypical and viral pathogens; and viral versus S. pneumoniae and atypical pathogens. Logistic regression is only included for the comparison of atypical versus S. pneumoniae and viral pathogens because no significant single metabolites were found for the other comparisons. The performance is evaluated using the mean area under the curve (AUC), the mean sensitivity, the mean specificity, and the mean balanced error rate (BER) over all folds and repeats. All performances result from the test sets within the cross-validation. The best performing model per comparison and evaluation measure is displayed in bold and underlined.

Data are presented as mean (SD). Variables are presented as variable names or as the number of variables that are included in the model. Abbreviations: SDMA: symmetric dimethylarginine, LPI (18:1): lysophosphatidylinositol (18:1), AUC: area under the curve, BER: balanced error rate.

Fig 3. ROC curves of the results from logistic regression and elastic net regression models that were tested in five-fold cross-validation with 100 repeats for the comparisons: atypical versus S. pneumoniae and viral pathogens; S. pneumoniae pathogens versus atypical and viral pathogens; and viral versus S. pneumoniae and atypical pathogens.


Abbreviations: LR: logistic regression, EN: elastic net regression, SDMA: symmetric dimethylarginine, LPI (18:1): lysophosphatidylinositol (18:1).

Predictive metabolites for diagnosis of CAP-associated pathogens

Elastic net models including multiple metabolites were fit to discriminate S. pneumoniae, atypical bacterial, and viral pathogens from the remaining two groups (e.g., S. pneumoniae versus atypical bacterial and viral pathogens). Elastic net models separating patients with atypical bacterial pathogens from patients with S. pneumoniae and viral infections resulted in a mean AUC of 0.81, a sensitivity of 0.61, a specificity of 0.86, and a BER of 0.26. Prediction of S. pneumoniae or viral infection etiologies showed lower predictive capabilities, with AUCs of 0.74 and 0.63, high sensitivities of 0.83 and 0.89, but low specificities of 0.50 and 0.23, and BERs of 0.33 and 0.44, respectively (Table 3).

We included the covariates age and sex, and all covariates in the elastic net models to account for potential confounding effects. The addition of these covariates showed no improved performance of the elastic net models for differentiation of atypical pathogens or S. pneumoniae from the other groups. For the differentiation of viral pathogens from the other two pathogen groups, a slight performance improvement was seen upon the addition of the covariates age and sex resulting in an AUC of 0.63, a sensitivity of 0.89, a specificity of 0.23, and a BER of 0.44 (Table 3).

The ROC curves for the separation of atypical pathogens from S. pneumoniae and viral pathogens show that elastic net models perform better than the logistic regression models for single metabolites. However, the logistic regression model including the three significant metabolites and the covariates age and sex shows similar performance as the elastic net regression which included 100 metabolites on average (Fig 3).

Metabolite classes predictive for atypical bacterial pathogens

Focusing on the metabolites that have been shown to be predictive for atypical bacterial pathogens, i.e., the only comparison with clinically relevant predictive performance, we identified 26 metabolites with an absolute VIP > 1% using elastic net regression (Fig 4). The metabolites originated from multiple metabolite classes. However, the classes of biogenic amines and lysophospholipids were well represented (4–5 metabolites per class) compared to the other classes. The number of metabolites included in the models varied across folds without a clear correlation to the BER. Commonly, models including all metabolites were favored, followed by models including 20–100 metabolites (S3 Fig). We visualized the separation of the different pathogens in the atypical pathogen group using an unsupervised PCA including all metabolites. The PCA plot indicated that no clear sub-group is present within the atypical group that would prominently drive the separation from the S. pneumoniae and viral infections (S4 Fig).

Fig 4. Variable importance of metabolites for the prediction of an atypical bacterial infection versus S. pneumoniae and viral infections.


Only metabolites with an absolute mean percentage of influence > 1% are visualized.

Discussion

Targeted profiling of the host metabolic response revealed metabolites that can support the diagnosis of microbial etiology in CAP patients with atypical bacterial pathogens compared to patients with S. pneumoniae or viral infections. CAP patients suffering from S. pneumoniae and viral infection could not be as successfully discriminated from the other groups based on the metabolic host-response.

The currently used clinical assays still outperform the metabolomics host-response assays developed in this study. For atypical pathogens, the sensitivity of 63% and specificity of 86% reported in this study are lower than those of the current urinary antigen tests for detection of Legionella pneumophila, which show a sensitivity of approximately 70% and a specificity up to 96% [22]. For detection of S. pneumoniae, the 83% sensitivity reached with the metabolomics-based assay outperforms the current antigen tests that show 70% sensitivity. However, the specificity of the metabolomics-based assay is only 50% while antigen tests reach specificity up to 96% [23, 24]. PCR assays of nasopharyngeal swabs for viral pathogens show sensitivities of up to 96% for influenza viruses A and B [25]. Our viral metabolomics-based assay shows a good sensitivity of 89% as well. However, at 23%, the specificity of this assay is very low. The expected clinical utility of the studied metabolite classes as host-response biomarkers for etiological diagnosis of CAP may therefore be considered limited.

The combination of the metabolites glycylglycine, SDMA, and LPI (18:1) and the covariates age and sex showed predictive capacities similar to elastic net models including 100 metabolites in the comparison of atypical pathogens versus S. pneumoniae and viral pathogens. This result suggests that a simple model might perform as well as a more complex elastic net model, which is an important finding when considering the use of these biomarkers for clinical diagnostic applications, e.g., where a limited set of 3 metabolites is preferable.

Glycylglycine, a biogenic amine, contributed significantly to the differentiation of atypical pathogens from the other pathogens, but was not often included in elastic net models. In contrast, SDMA and LPI (18:1) were often included in the elastic net models, as was shown in the overview of the 26 most influential metabolites. Metabolites of the classes biogenic amines and lysophospholipids, to which SDMA and LPI (18:1) have been assigned, were most represented in the 26 most influential metabolites compared to other metabolite classes in the comparison of atypical versus S. pneumoniae and viral pathogens. A comparison of the most influential metabolites in this study to metabolites of interest reported in previous studies of metabolomics in CAP patients shows limited overlap. Major reasons for this could be that (i) not all studies measured the same set of metabolic classes; (ii) some other studies poorly controlled patient comparator groups; and (iii) differences in bioanalytical methodologies, e.g., the use of NMR or MS as analytical method with their respective (dis)advantages, might provide different results [26]. For example, most lipids found to be predictive in this study have not been reported previously, most likely because the applied bioanalytical methodologies did not allow their detection. However, some overlap was found between the most influential metabolites for the comparison of atypical versus S. pneumoniae and viral pathogens in this study, and the metabolites of interest from other metabolomics studies involving CAP patients. The amino acid alanine was found in multiple studies [14, 16, 17]. Ceramide (d18:1/16:0), two diacyl-phosphatidylcholines, and diacyl-phosphatidylethanolamine (38:2) were found in other studies as well, the latter in the form of choline and ethanolamine [15, 16, 18]. Lactic acid was identified in several other metabolomics studies of respiratory bacterial and viral infections [12, 14, 17].
Lactic acid levels are also known to rise in case of severe disease. However, because the three pathogen groups were balanced in terms of disease severity and, for example, did not show significant differences in pH levels, we hypothesize that the differences in lactate levels are, in this case, an effect of the pathogen-specific host-response to infection. The result showed that models including disease severity covariates do not perform better than models without these confounders, thus supporting this hypothesis. Finally, 3-hydroxyisovaleric acid and betaine have been reported in a previous study comparing viral and bacterial pneumonia [18]. The overlap in these findings may provide insights into common metabolic responses to pathogens involved in CAP.

Multiple biological processes besides infection can influence metabolic processes in patients. Inclusion of age and sex in the models did not improve the predictive performance of the elastic net models for atypical bacteria and S. pneumoniae but did improve the model for viral pathogens. The average age in the viral pathogen group was higher than in the other groups, which could explain this result. For the other comparisons, we see that a model including age and sex or more covariates does not outperform models without these possible confounders. This does not imply that age has no metabolomic effect in the bacterial pathogen groups, but rather that the separation between bacterial pathogen groups is more dependent on the metabolomic host-response to the infection than on the age-related metabolomic changes. In this study, we included patients with mild to severe CAP, reflecting the target patient population for which improvements in a diagnostic assay are required. However, the combination of samples from patients with different disease severities may negatively influence the predictive capabilities of the model because the effect from the causative pathogen on the host-metabolism may be less pronounced for less severe disease [27]. At the same time, separating the patients into groups with comparable disease severity scores would decrease the power for statistical analysis. Furthermore, no standardization of sampling times and conditions was applied, e.g., patients had not fasted before blood sampling, which may influence the metabolite patterns found. Since variations in sampling conditions were unknown, we were unable to consider these in our analyses. However, we expect that the impact of not standardizing and correcting for these factors is limited because the noise in metabolite levels introduced by these factors is expected to be random with regard to the pathogen groups compared in this study.
A standardized sampling approach could improve the sensitivity of the models to detect predictive metabolites because some noise is reduced. However, the specificity of the models with respect to the prediction of specific pathogens would be unchanged, since no correlation with pathogen groups is likely.
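The expectation that sampling noise unrelated to pathogen group weakens, but does not bias, discrimination can be illustrated with a small simulation. The sketch below is not part of the study's analysis: the effect size, the noise level, the group size, and the function names `auc` and `simulate` are all illustrative assumptions.

```python
import random


def auc(pos, neg):
    """Chance that a random value from pos exceeds one from neg (ties count half)."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def simulate(effect, noise_sd, n=500, seed=1):
    """AUC of one simulated metabolite for two groups of n patients.

    effect:   hypothetical shift in the mean level between the groups (in SD units)
    noise_sd: SD of extra noise (e.g. non-fasted sampling) that is drawn
              independently of group membership
    """
    rng = random.Random(seed)
    group_a = [rng.gauss(effect, 1.0) + rng.gauss(0.0, noise_sd) for _ in range(n)]
    group_b = [rng.gauss(0.0, 1.0) + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return auc(group_a, group_b)


# A truly discriminating metabolite, without and with sampling noise.
auc_clean = simulate(effect=1.0, noise_sd=0.0)
auc_noisy = simulate(effect=1.0, noise_sd=2.0)

# A non-discriminating metabolite with the same sampling noise.
auc_null_noisy = simulate(effect=0.0, noise_sd=2.0)

print(f"predictive, no noise:   AUC = {auc_clean:.2f}")
print(f"predictive, with noise: AUC = {auc_noisy:.2f}")
print(f"null, with noise:       AUC = {auc_null_noisy:.2f}")
```

Under these assumptions, the group-independent noise lowers the AUC of the predictive metabolite, while the non-predictive metabolite stays close to 0.5, i.e., the noise does not create a spurious separation between groups.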

The sample size of this study (n = 125) was relatively large compared to previous studies of metabolomic differences between causative pathogens of CAP, which included approximately 70 patients [17, 18]. The compared groups S. pneumoniae, atypical bacteria, and viruses were chosen because antibiotic treatment strategies differ between these three groups. Ideally, we would have further investigated differences within the studied groups, e.g., to identify metabolic responses to specific pathogens within the atypical pathogen and viral infection groups. For example, it would be of interest to study Legionella species in more depth because their intracellular growth might result in a differentiated host-response. However, this was not considered feasible in this study due to sample size restrictions. The heterogeneous pathogen population in the atypical bacterial and viral pathogen groups might have lowered the predictive performance of the metabolomic analysis. Studying the individual pathogens in bigger sample sizes might reveal more characteristic metabolite signatures. In this study, no control group was included because the goal of the study was to provide a faster and optimal diagnostic method and a guide for antibiotic treatment in hospitalized CAP patients. In further studies, it would be preferable to include patients with all causes of CAP, including the remaining microorganisms, which were excluded in the current study because of their low frequency, to enable a more comprehensive comparison with current clinical assays. In this study, CAP patients with unknown pathogens were excluded. In a follow-up study, the metabolite patterns of patients with unknown causative pathogens could be compared to those of the distinguished pathogen groups to learn how closely samples in which pathogens could not be identified using conventional diagnostic techniques resemble those in which they could.

Metabolomics analysis resulted in some missing data because of sample preparation errors or the limited volume of the samples. Because the measurement platforms covered multiple metabolites within one pathway, metabolites with missing data could be removed without influencing the final results. Some patient samples had to be removed because of multiple missing metabolite levels, for example, if the results from a whole metabolomics platform were missing. Data imputation was not performed for the metabolomics data because the wide range of patients included in the dataset did not, in our opinion, provide enough information for accurate data imputation.
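The handling of missing data described above (removal instead of imputation) can be sketched as a two-step complete-case filter. This is a minimal illustration, not the authors' code: the function name `complete_case_filter`, the `max_missing` threshold, the order of the two steps, and the toy data are assumptions made for the example.

```python
def complete_case_filter(data, platforms, max_missing=2):
    """Remove incomplete samples first, then metabolites that still have gaps.

    data:        dict sample_id -> dict metabolite -> value (None if missing)
    platforms:   dict platform name -> list of metabolite names
    max_missing: hypothetical cut-off for "multiple missing metabolite levels"
    """
    metabolites = [m for names in platforms.values() for m in names]

    def n_missing(values):
        return sum(values.get(m) is None for m in metabolites)

    def platform_missing(values):
        # True if every metabolite of at least one platform is missing.
        return any(
            all(values.get(m) is None for m in names)
            for names in platforms.values()
        )

    # Step 1: drop samples lacking a whole platform or exceeding the cut-off.
    kept_samples = {
        sid: values
        for sid, values in data.items()
        if not platform_missing(values) and n_missing(values) <= max_missing
    }

    # Step 2: keep only metabolites measured in all remaining samples.
    kept_metabolites = [
        m for m in metabolites
        if all(values.get(m) is not None for values in kept_samples.values())
    ]

    return {
        sid: {m: values[m] for m in kept_metabolites}
        for sid, values in kept_samples.items()
    }


# Toy data: P3 lacks the whole organic acid platform, P2 lacks one amine.
platforms = {
    "amines": ["alanine", "betaine"],
    "organic_acids": ["lactic_acid", "citric_acid"],
}
data = {
    "P1": {"alanine": 1.2, "betaine": 0.8, "lactic_acid": 2.1, "citric_acid": 0.5},
    "P2": {"alanine": 1.0, "betaine": None, "lactic_acid": 1.9, "citric_acid": 0.4},
    "P3": {"alanine": 0.9, "betaine": 0.7, "lactic_acid": None, "citric_acid": None},
}
filtered = complete_case_filter(data, platforms)
print(filtered)
```

In this toy example, sample P3 is removed because a whole platform is missing, and betaine is then removed because it is missing in one of the remaining samples, leaving a dataset without gaps.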

In summary, this comprehensive analysis of the host metabolic response across multiple metabolic classes, based on a well-balanced study cohort of CAP patients, has shown that atypical pathogens in CAP can be identified, whereas the utility of this approach for predicting S. pneumoniae and viral etiologies is limited.

Supporting information

S1 Method. Details on metabolomic sample analysis.

(DOCX)

S1 Fig. Optimization of α and λ in the inner Cross-Validation (CV) to reach a minimal Balanced Error Rate (BER) in the outer CV.

(A) All α and λ values tested in the inner CV plotted against the mean BER of the inner CV. (B) A plot of the optimal α and λ combinations chosen in the inner CV against their BER in the outer CV shows a variety of favorable α and λ combinations. (C) A plot of the number of variables selected in the elastic net model in the outer CV shows that, with increasing α, the number of variables decreases, as is expected in an elastic net model. The data shown are for the comparison atypical versus (S. pneumoniae + viral).

(DOCX)

S2 Fig. Unsupervised Principal Component Analysis (PCA) plot of all pathogen groups.

(DOCX)

S3 Fig

(A) Boxplot of BER per number of variables selected shows no clear relation between the number of variables selected and model performance. (B) Histogram of the number of variables selected shows that a model with all metabolites included is favored, followed by models including 34, 49, 82, 24, or 45 metabolites. Both panels contain the data of all folds and repeats (n = 500) for the comparison of atypical versus S. pneumoniae and viral infections.

(DOCX)

S4 Fig. Principal Component Analysis (PCA) of the atypical pathogen group (log-transformed and standardized data) shows that there is no clear subgroup within the atypical group that would prominently drive the separation from the S. pneumoniae and viral infections.

(DOCX)

S1 Table. Summary of previous studies focusing on bacterial and viral respiratory tract infections and related metabolites.

(DOCX)

S2 Table. Additional patient characteristics per pathogen group.

(DOCX)

S3 Table. Overview of the number of metabolites included in the metabolomics platforms, measured in the samples and included in the data analysis.

(DOCX)

S4 Table. Information on measurement platforms used, metabolite classes targeted per platform, targeted metabolites, their abbreviations and names in R (if detected) and identifiers (if available).

(XLSX)

S5 Table. Metabolomics data after quality control.

(CSV)

Data Availability

All relevant data are within the manuscript and its Supporting Information files.

Funding Statement

This work is part of the research program ‘Metabolomic fingerprint biomarkers to guide antibiotic therapy and reduce resistance’ with project number 541001007, which is financed by ZonMW, the Netherlands Organization for Health Research and Development associated with the Dutch Research Council (NWO). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Kothe H, Bauer T, Marre R, Suttorp N, Welte T, Dalhoff K, et al. Outcome of community-acquired pneumonia: Influence of age, residence status and antimicrobial treatment. Eur Respir J. 2008 Jul 1;32(1):139–46. doi: 10.1183/09031936.00092507
2. Meijvis SCA, Hardeman H, Remmelts HHF, Heijligenberg R, Rijkers GT, Van Velzen-Blad H, et al. Dexamethasone and length of hospital stay in patients with community-acquired pneumonia: A randomised, double-blind, placebo-controlled trial. Lancet. 2011;377(9782):2023–30. doi: 10.1016/S0140-6736(11)60607-7
3. Endeman H, Schelfhout V, Paul Voorn G, van Velzen-Blad H, Grutters JC, Biesma DH. Clinical features predicting failure of pathogen identification in patients with community acquired pneumonia. Scand J Infect Dis. 2008 Jan 8;40(9):715–20. doi: 10.1080/00365540802014864
4. Wunderink RG, Waterer GW. Community-Acquired Pneumonia. Solomon CG, editor. N Engl J Med. 2014 Feb 6;370(6):543–51. doi: 10.1056/NEJMcp1214869
5. Wiersinga WJ, Bonten MJ, Boersma WG, Jonkers RE, Aleva RM, Kullberg BJ, et al. Management of community-acquired pneumonia in adults: 2016 guideline update from the Dutch Working Party on Antibiotic Policy (SWAB) and Dutch Association of Chest Physicians (NVALT). Neth J Med. 2018;76(1).
6. Postma DF, van Werkhoven CH, van Elden LJR, Thijsen SFT, Hoepelman AIM, Kluytmans JAJW, et al. Antibiotic Treatment Strategies for Community-Acquired Pneumonia in Adults. N Engl J Med. 2015 Apr 2;372(14):1312–23. doi: 10.1056/NEJMoa1406330
7. Bjarnason A, Westin J, Lindh M, Andersson LM, Kristinsson KG, Löve A, et al. Incidence, etiology, and outcomes of community-acquired pneumonia: A population-based study. Open Forum Infect Dis. 2018 Feb 1;5(2).
8. WHO. Antimicrobial resistance: global report on surveillance. 2014.
9. Saleh MAA, Van Hasselt CJG, Van De Garde EMW. Host-response biomarkers for the diagnosis of bacterial respiratory tract infections. Clin Chem Lab Med. 2018.
10. Pearce EL, Pearce EJ. Metabolic pathways in immune cell activation and quiescence. Immunity. 2013;38:633–43. doi: 10.1016/j.immuni.2013.04.005
11. Khovidhunkit W, Kim MS, Memon RA, Shigenaga JK, Moser AH, Feingold KR, et al. Effects of infection and inflammation on lipid and lipoprotein metabolism: Mechanisms and consequences to the host. Journal of Lipid Research. 2004;45:1169–96. doi: 10.1194/jlr.R300019-JLR200
12. Zhou A, Ni J, Xu Z, Wang Y, Zhang H, Wu W, et al. Metabolomics specificity of tuberculosis plasma revealed by (1)H NMR spectroscopy. Tuberculosis (Edinb). 2015 May 1;95(3):294–302. doi: 10.1016/j.tube.2015.02.038
13. Laiakis EC, Morris GAJ, Fornace AJ, Howie SRC. Metabolomic Analysis in Severe Childhood Pneumonia in The Gambia, West Africa: Findings from a Pilot Study. Lau ATY, editor. PLoS One. 2010 Sep 9;5(9):e12655. doi: 10.1371/journal.pone.0012655
14. Slupsky CM, Rankin KN, Fu H, Chang D, Rowe BH, Charles PGP, et al. Pneumococcal Pneumonia: Potential for Diagnosis through a Urinary Metabolic Profile. J Proteome Res. 2009 Dec 4;8(12):5550–8. doi: 10.1021/pr9006427
15. Lau SKP, Lee K-C, Curreem SOT, Chow W-N, To KKW, Hung IFN, et al. Metabolomic Profiling of Plasma from Patients with Tuberculosis by Use of Untargeted Mass Spectrometry Reveals Novel Biomarkers for Diagnosis. Land GA, editor. J Clin Microbiol. 2015 Dec 1;53(12):3750–9. doi: 10.1128/JCM.01568-15
16. Antcliffe D, Jiménez B, Veselkov K, Holmes E, Gordon AC. Metabolic Profiling in Patients with Pneumonia on Intensive Care. EBioMedicine. 2017 Apr;18:244–53. doi: 10.1016/j.ebiom.2017.03.034
17. Banoei MM, Vogel HJ, Weljie AM, Kumar A, Yende S, Angus DC, et al. Plasma metabolomics for the diagnosis and prognosis of H1N1 influenza pneumonia. Crit Care. 2017 Apr 19;21(1):97. doi: 10.1186/s13054-017-1672-7
18. Adamko DJ, Saude E, Bear M, Regush S, Robinson JL. Urine metabolomic profiling of children with respiratory tract infections in the emergency department: a pilot study. BMC Infect Dis. 2016;16(1):439. doi: 10.1186/s12879-016-1709-6
19. Varma S, Simon R. Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics. 2006 Feb 23;7:91. doi: 10.1186/1471-2105-7-91
20. Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc Ser B (Statistical Methodol). 2005 Apr;67(2):301–20.
21. Statnikov A, Tsamardinos I, Dosbayev Y, Aliferis CF. GEMS: A system for automated cancer diagnosis and biomarker discovery from microarray gene expression data. Int J Med Inform. 2005 Aug 1;74(7–8):491–503. doi: 10.1016/j.ijmedinf.2005.05.002
22. Yzerman EPF, Den Boer JW, Lettinga KD, Schellekens J, Dankert J, Peeters M. Sensitivity of three urinary antigen tests associated with clinical severity in a large outbreak of Legionnaires’ disease in the Netherlands. J Clin Microbiol. 2002 Sep;40(9):3232–6. doi: 10.1128/JCM.40.9.3232-3236.2002
23. Gutierrez F, Masia M, Rodriguez JC, Ayelo A, Soldan B, Cebrian L, et al. Evaluation of the Immunochromatographic Binax NOW Assay for Detection of Streptococcus pneumoniae Urinary Antigen in a Prospective Study of Community‐Acquired Pneumonia in Spain. Clinical Infectious Diseases. 2003 Feb. p. 286–92. doi: 10.1086/345852
24. Sordé R, Falcó V, Lowak M, Domingo E, Ferrer A, Burgos J, et al. Current and potential usefulness of pneumococcal urinary antigen detection in hospitalized patients with community-acquired pneumonia to guide antimicrobial therapy. Arch Intern Med. 2011 Jan 24;171(2):166–72. doi: 10.1001/archinternmed.2010.347
25. Van Elden LJR, Nijhuis M, Schipper P, Schuurman R, Van Loon AM. Simultaneous detection of influenza viruses A and B using real-time quantitative PCR. J Clin Microbiol. 2001;39(1):196–200. doi: 10.1128/JCM.39.1.196-200.2001
26. Emwas AH, Roy R, McKay RT, Tenori L, Saccenti E, Nagana Gowda GA, et al. NMR spectroscopy for metabolomics research. Metabolites. 2019;9.
27. Ning P, Zheng Y, Luo Q, Liu X, Kang Y, Zhang Y, et al. Metabolic profiles in community-acquired pneumonia: developing assessment tools for disease severity. Crit Care. 2018 Dec 14;22(1):130. doi: 10.1186/s13054-018-2049-2

Decision Letter 0

Aran Singanayagam

1 Feb 2021

PONE-D-21-00435

Metabolomic profiling of microbial disease etiology in community-acquired pneumonia

PLOS ONE

Dear Dr. den Hartog,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Mar 18 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Aran Singanayagam

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match.

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

3. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors undertook a metabolomics study of serum samples in hospitalized CAP patients to determine if host-response associated metabolites can enable diagnosis of microbial etiology. The topic is relevant.

The authors conclude that 'targeted profiling of the host metabolic response revealed metabolites that can support diagnosis of microbial etiology in CAP patients with atypical bacterial pathogens compared to patients with S. pneumoniae or viral infections'. However, they also admit that 'the currently used clinical assays still outperform the metabolomics host-response assays developed in this study'. Despite the latter, the study is still sound I guess. It is challenging to get homogeneous groups and, depending on the sampling and method used, different results are expected. This makes generalizability a challenge for this study.

I probably missed this point. When were the serum samples taken? In the morning before food? Has this been standardized?

Table 1: A pity that BMI has not been known. What about use of antibiotics? Ethnicity? Authors could add p-values so one would see if some of the patient characteristics per pathogen group are different between the three groups. E.g. is age really significantly higher in the S. pneumoniae group?

Table 2: The 'atypical bacterial' and 'viral' groups are still heterogenous. Has this been considered in the models? Different atypical bacteria may result in a different profile? Is there a reason why the authors didn't include some individuals without CAP (control group)?

The authors did not use NMR. Could the authors elaborate on the pros and cons of using NMR as compared to their methods? They only mention 'reduced sensitivity', but there are also advantages using nmr. It seems to me that different methods lead to different conclusions.

No line numbers in the discussion.

Reviewer #2: I am reviewing the article titled “Metabolomic profiling of microbial disease etiology in community-acquired pneumonia” by den Hartog et al. These investigators performed a “large-scale” metabolomics study of human serum samples from patients with severe pneumonia (necessitating hospital admission). The researchers focused on three distinct groups of patients: those with S. pneumoniae, atypical bacteria, or viral infections. The authors utilized multiple methods to determine discriminating metabolites and found that there is a possible method to distinguish atypical infections from S. pneumoniae and viral pathogens.

Strengths

The authors have extensively profiled sick patients with pneumonia through an untargeted metabolomic profile. The authors use extensive statistical modeling using these tools to determine if there are differences between the patients with three different pneumonias. They found that there are ways to determine the differences between atypical pneumonias compared to Streptococcus/viral pneumonias. However, these methods are not sensitive nor specific enough compared to approved clinical tests. The work is thorough and well-documented; however I believe that it is missing some clinical relevance.

Weaknesses

As a proof of concept, this is an excellent manuscript, but I am less enthusiastic for several reasons: 1) As a clinician, several significant clinical outcomes of interest, including things like antibiotics, oxygen requirement, and if the patients were sick or not sick. If we are talking about host-response, these factors may play a bigger role and may confound their analysis, 2) Lumping severity into a score (e.g., PSI), 3) Other medications and intrinsic lung disease are not mentioned as possible contributors to their model, 4) clinical relevance, if clinicians and researchers are able to tell the difference between certain infections, then what can utilizing a metabolomic approach offer a researcher or clinician? Finally, 5) Was there another testing cohort to test their model?

Two interesting points that may be beyond the scope of the work by the authors: 1) Was there ever thought about comparing the metabolites to healthy subjects compared to pneumonia subjects? 2) Although there is little difference between the atypical pneumonia pathogens, there almost appears to be a distinct group between the legionella compared to mycoplasma samples. Was there thought about exploring possible differences between these two groups?

I will be using the page number and the left most line numbers. In the discussion section there doesn’t seem to be line numbers.

Major

Introduction:

Page 10, Line 68 “The studies that compared viral and bacterial …” I would just be careful and call this a limitation. Untargeted metabolomics may offer significant benefits in terms of identifying unknown metabolites. An untargeted approach is much more similar to a fishing expedition, I agree, but there may be some benefits compared to a targeted approach.

Materials and methods:

Page 11, Line 95 “The study …” One question I was wondering that the authors may have addressed at a different point was the length of time related to the patient’s illness? While it’s interesting that these patients all felt ill enough to come into the hospital, it’s not quite clear if the length of time they were sick would have confounded their analysis. For example, a person sick enough to come to the hospital on day 5 may be different than one that arrives 14 days after falling ill.

Page 11, Line 99 There is very limited clinical information on factors that would confound host-metabolite expression, for example 1) Use of supplemental oxygen? 2) Other comorbid disease states such as diabetes, 3) BMI (which the authors mentioned in the conclusion was not recorded), 4) medications the patient had been taking prior to “catching” pneumonia (e.g., steroids, inhalers, antibiotics), and 5) most interesting of all, no mention of pre-existing lung disease (e.g., COPD, asthma, ILD). For host-metabolite issues, it would be interesting to understand if these impact host-expression, especially lung and systemic metabolites.

Page 12, Line 133 “… models containing age and sex were generated …” Given the predilection of Streptococcus pneumonia impact older subjects, I am a little surprised that age did not factor into the analysis as in Table 1 it seems as though the age would be statistically different.

Results

Page 15, Line 189 “Single discriminating metabolites for pathogen groups”. Out of curiosity and this may be beyond the scope of the study, was there any distinct groups that were identified in an unsupervised fashion? From the metabolites, could the authors identify distinct groups? I am wondering if using the data to find distinct groups could also be performed (again beyond the scope of the study, but could be interesting to look at to see if there may be groups that are not clearly seen). For example, using Dirichlet Multinomial Mixtures to identify distinct groups. This could be added as a figure in the supplement. Part of me wonders if differences in serum metabolites may be associated with clinical outcomes.

Discussion

Page 20, Line … “Targeted …” I appreciate that the authors point out that it is difficult based on the host-metabolomic profile to tell the difference between the various pneumonias. What isn’t clear to me is why would atypical infections, in particular have such distinct host-metabolomic profile? The authors do a commendable effort into searching for metabolites which can discriminate between infections, but what is so particular that the infections create a unique host response (e.g., such as the intra-cellular nature of some of these infections Mycoplasma and Legionella).

Page 21, Line … “Lactic acid …” I think this is interesting because there are R and L enantiomers that are involved in microbial metabolism, but from a clinical point of view, lactemia in the serum is sign of severe disease. Perhaps, it may actually reflect severity of disease.

Page 22, Line … “In this study, we included patients …” It’s interesting that the authors utilized a pneumonia score, perhaps to understand some of the granularity of the data the authors should try to expand the PSI score and reassess their model based on the severity of disease? Moreover, have the authors tried to separate out the analysis based upon severity? The severity of disease could serve as a confounder in their analysis. I recommend the authors split the PSI score and attempt to construct their models utilizing

Minor

None

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

PLoS One. 2021 Jun 4;16(6):e0252378. doi: 10.1371/journal.pone.0252378.r002

Author response to Decision Letter 0


10 Mar 2021

Dear Dr. Singanayagam and reviewers,

We would like to thank you for thoroughly reading our manuscript and providing us with constructive feedback. We revised the manuscript based on your comments. We made adjustments to clarify certain sections and improve the quality of the manuscript. We hope to have addressed all concerns to your satisfaction. Below, we respond to each comment individually. The line numbers we added refer to the clean version of the revised manuscript.

Reviewer #1:

1. I probably missed this point. When were the serum samples taken? In the morning before food? Has this been standardized?

Since the samples were not collected specifically for metabolomics analysis, no standardization for sampling times and conditions was applied. In the Methods section we stated the only available information (lines 85-86):

“The samples were taken from CAP patients within 24 hours after hospital admission.”

2. Table 1: A pity that BMI has not been known. What about use of antibiotics? Ethnicity? Authors could add p-values so one would see if some of the patient characteristics per pathogen group are different between the three groups. E.g. is age really significantly higher in the S. pneumoniae group?

We agree with the reviewer that it would have been beneficial to have known the BMI, since it is known to influence metabolomic profiles. We added extra information to Table 1 (line 113): P-values to visualize significant differences between groups, COPD, diabetes, duration of symptoms before admission, antibiotic treatment before admission, corticosteroid use before admission. We have added S2 Table (line 612) with additional patient characteristics as well, containing: race, nursing home resident, altered mental status, respiratory rate, systolic blood pressure, temperature, pulse, pH, BUN, sodium, glucose, hematocrit, partial pressure of oxygen, oxygen saturation, supplemental oxygen required.

The mean age is indeed significantly different between the three groups as we expected based on literature. This is, for example, described by Raeven et al., BMC Infect Dis. 2016 June 17; 16(299).

3. Table 2: The 'atypical bacterial' and 'viral' groups are still heterogenous. Has this been considered in the models? Different atypical bacteria may result in a different profile?

This heterogeneous character of the atypical bacterial and viral groups has not been taken into account in the models. We have added a clarification to the Discussion section:

Lines 354-355: “The compared groups S. pneumoniae, atypical bacteria, and viruses were chosen because antibiotic treatment strategies differ between these three groups.”

Lines 360-362:“The heterogeneous pathogen population in the atypical bacterial and viral pathogen groups might have lowered the predictive performance of the metabolomic analysis. Studying the individual pathogens in bigger sample sizes might reveal more characteristic metabolite signatures.”

4. Is there a reason why the authors didn't include some individuals without CAP (control group)?

We have emphasized our approach more clearly in the Discussion (lines 362-364):

“In this study, no control group was included because the goal of the study was to provide a faster and optimal diagnostic method and a guide for antibiotic treatment in hospitalized CAP patients.”

For more general, biological analysis, we think that the inclusion of a control group could be of interest to provide more insight into the metabolomic differences between healthy and diseased individuals.

5. The authors did not use NMR. Could the authors elaborate on the pros and cons of using NMR as compared to their methods? They only mention 'reduced sensitivity', but there are also advantages to using NMR. It seems to me that different methods lead to different conclusions.

We rephrased the reference to the difference between NMR and MS methods in the Discussion (lines 314-317):

“Major reasons for this could be that (i) not all studies measured the same set of metabolic classes; (ii) some other studies poorly controlled patient comparator groups; and (iii) difference in bioanalytical methodologies, e.g. the use of NMR or MS as analytical method with their respective (dis)advantages might provide different results [26]”

6. No line numbers in the discussion.

Something went wrong indeed. We have added line numbers to the Discussion and throughout the entire manuscript.

Reviewer #2:

1. As a clinician, several significant clinical outcomes of interest, including things like antibiotics, oxygen requirement, and if the patients were sick or not sick. If we are talking about host-response, these factors may play a bigger role and may confound their analysis. Lumping severity into a score (e.g., PSI), Other medications and intrinsic lung disease are not mentioned as possible contributors to their model,

We would like to thank the reviewer for these suggestions. We have added the clinical parameters that were available for this study to the patient characteristics table (Table 1 & S2 Table). To Table 1 we have added: COPD, Diabetes, Duration of symptoms before admission, antibiotic treatment before admission, corticosteroid use before admission. In S2 Table we have included: race, nursing home resident, altered mental status, respiratory rate, systolic blood pressure, temperature, pulse, pH, BUN, sodium, glucose, hematocrit, partial pressure of oxygen, oxygen saturation, supplemental oxygen required.

In the materials and methods section we have defined which confounders were included in the model (lines 102-106):

“Patient characteristics that might be considered as possible covariates were: age, sex, nursing home resident, renal disease, congestive heart failure, CNS disease, malignancy, COPD, diabetes, altered mental status, respiratory rate, systolic blood pressure, temperature, pulse, pH, BUN, sodium, glucose, hematocrit, partial pressure of oxygen, pleural effusion on x-ray, duration of symptoms before admission, antibiotic treatment before admission.”

The remaining patient characteristics were excluded because they either had the same value in all samples (for example, there were no patients with liver disease) or had >25% missing data. We clarified the imputation procedure in the Methods section (lines 135-140):

“Data imputation was performed for patient characteristics that were to be evaluated as covariates in the statistical analysis and showed missingness in the data. Five times repeated imputation using predictive mean matching was performed with the ‘mice’ package for R to impute the patient data for the covariates with less than 25% missing data. Predictive mean matching is suitable for both numeric and binary covariates. Patient characteristics with >25% missing data were excluded from further analysis.”
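The screening step that precedes imputation can be sketched as follows. This is a minimal illustration in Python, not the authors' R code: the function name `screen_covariates`, the labels it returns, and the toy values are hypothetical, and the imputation itself (predictive mean matching with the 'mice' package) is not reproduced here.

```python
from typing import Dict, List, Optional


def screen_covariates(
    data: Dict[str, List[Optional[float]]],
    max_missing: float = 0.25,
) -> Dict[str, str]:
    """Decide, per covariate, whether to exclude, impute, or keep it as-is.

    Rules taken from the response text: covariates with more than 25%
    missing values are excluded, and covariates with the same value in
    all samples are excluded. The label names are hypothetical.
    """
    decisions: Dict[str, str] = {}
    for name, values in data.items():
        observed = [v for v in values if v is not None]
        missing_fraction = 1 - len(observed) / len(values)
        if missing_fraction > max_missing:
            decisions[name] = "exclude_missing"
        elif len(set(observed)) <= 1:
            decisions[name] = "exclude_constant"
        elif missing_fraction > 0:
            decisions[name] = "impute"
        else:
            decisions[name] = "keep"
    return decisions


# Toy data for five hypothetical patients (None marks a missing value).
example = {
    "age": [70, 65, 80, 55, 61],
    "liver_disease": [0, 0, 0, 0, 0],
    "pH": [7.40, None, None, 7.35, 7.42],
    "glucose": [6.1, None, 7.0, 5.5, 8.2],
}
decisions = screen_covariates(example)
```

Covariates labelled "impute" would then be passed to the imputation procedure, while the excluded ones are dropped before modeling.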

The results of adding extra confounders to the logistic regression and elastic net models are presented in Fig. 3 and Table 3 and in writing in the Results section:

Lines 225-226: “The addition of other covariates to the logistic regression model resulted in lower performance, probably due to overfitting of the model.”

Lines 237-242: “We included the covariates age and sex, and all covariates in the elastic net models to account for potential confounding effects. The addition of these covariates showed no improved performance of the elastic net models for differentiation of atypical pathogens or S. pneumoniae from the other groups. For the differentiation of viral pathogens from the other two pathogen groups, a slight performance improvement was seen upon the addition of the covariates age and sex resulting in an AUC of 0.63, a sensitivity of 0.89, a specificity of 0.23, and a BER of 0.44 (Table 3).”
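As a quick consistency check on the quoted figures, the balanced error rate can be recomputed from sensitivity and specificity. The sketch below assumes the common definition BER = 1 - (sensitivity + specificity) / 2; the response does not state the formula, and the function name is hypothetical.

```python
def balanced_error_rate(sensitivity: float, specificity: float) -> float:
    """Balanced error rate under the assumed standard definition."""
    return 1.0 - (sensitivity + specificity) / 2.0


# Values quoted above for viral pathogens versus the other two groups.
ber_viral = balanced_error_rate(sensitivity=0.89, specificity=0.23)
print(round(ber_viral, 2))  # 0.44
```

Under this definition the reported sensitivity of 0.89 and specificity of 0.23 reproduce the reported BER of 0.44.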

2. If clinicians and researchers are able to tell the difference between certain infections, then what can utilizing a metabolomic approach offer a researcher or clinician?

This is, however, currently not the case, as we stated in the Introduction (lines 51-52):

“In over 60% of CAP patients, no causative pathogen can be identified with these pathogen-targeted diagnostic techniques [2,6]”.

Identifying the microbial etiology for this patient group could improve patient care by guiding antibiotic therapy and could possibly reduce the risk of developing antimicrobial resistance.

3. Was there another testing cohort to test their model?

No, a separate testing cohort was not available. We chose a nested cross-validation approach to validate our model, as explained in the Methods section (lines 159-164) and Fig. 2 (line 196).
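The idea behind nested cross-validation can be sketched as below. This is an illustration only: the fold counts, the seed, and the function names are assumptions rather than the authors' settings, and model fitting is omitted. The inner folds are built exclusively from the outer training samples, so tuning never sees the outer test samples.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Split = Tuple[List[int], List[int]]


def k_fold_indices(indices: Iterable[int], k: int, seed: int) -> Iterator[Split]:
    """Yield (training, held_out) index lists for k shuffled folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, folds[i]


def nested_cv_splits(
    n_samples: int, outer_k: int = 5, inner_k: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int], List[Split]]]:
    """Yield outer train/test splits with inner splits of the outer training set.

    The inner splits would be used to tune hyperparameters (for elastic net,
    alpha and lambda); the outer test set is used only to estimate performance.
    """
    for outer_train, outer_test in k_fold_indices(range(n_samples), outer_k, seed):
        inner_splits = list(k_fold_indices(outer_train, inner_k, seed + 1))
        yield outer_train, outer_test, inner_splits


# 125 samples matches the cohort size (48 + 47 + 30); fold counts are assumed.
splits = list(nested_cv_splits(n_samples=125, outer_k=5, inner_k=5))
```

Repeating this procedure with different seeds would give repeated nested cross-validation, which reduces the dependence of the performance estimate on a single random partition.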

Two interesting points that may be beyond the scope of the work by the authors:

4. Was there ever thought about comparing the metabolites to healthy subjects compared to pneumonia subjects?

We have emphasized our approach more clearly in the Discussion (lines 362-364):

“In this study, no control group was included because the goal of the study was to provide a faster and optimal diagnostic method and a guide for antibiotic treatment in hospitalized CAP patients.”

For more general, biological analysis, we think that the inclusion of a control group could be of interest to provide more insight into the metabolomic differences between healthy and diseased individuals.

5. Although there is little difference between the atypical pneumonia pathogens, there almost appears to be a distinct group between the legionella compared to mycoplasma samples. Was there thought about exploring possible differences between these two groups?

We have not attempted to separate individual pathogens with predictive modeling and have added a clarification about this topic to the Discussion section:

Lines 354-355: “The compared groups S. pneumoniae, atypical bacteria, and viruses were chosen because antibiotic treatment strategies differ between these three groups.”

Lines 360-362: “The heterogeneous pathogen population in the atypical bacterial and viral pathogen groups might have lowered the predictive performance of the metabolomic analysis. Studying the individual pathogens in bigger sample sizes might reveal more characteristic metabolite signatures.”

Lines 355-360: “Ideally, we would have further investigated differences within studied groups, e.g. to identify metabolic responses to specific pathogens within the atypical pathogens and viral infection groups. For example, it would be of interest to study Legionella species more in-depth because their intracellular growth might result in a differentiated host-response. However, this was considered not feasible in this study due to sample size restrictions.”

6. Introduction: Page 10, Line 68 “The studies that compared viral and bacterial …” I would just be careful and call this a limitation. Untargeted metabolomics may offer significant benefits in terms of identifying unknown metabolites. An untargeted approach is much more similar to a fishing expedition, I agree, but there may be some benefits compared to a targeted approach.

We have rephrased this sentence in the introduction to underline the benefits of both untargeted and targeted metabolomics (lines 68-72):

“The studies that compared viral and bacterial causative pathogen groups of CAP used an untargeted metabolomics approach. While an untargeted approach is especially useful for the discovery of new metabolites and hypothesis-free analysis, a targeted approach that can be fully quantified to clinical laboratory standards may be preferable for clinical implementation.”

7. Materials and methods: Page 11, Line 95 “The study …” One question I was wondering that the authors may have addressed at a different point was the length of time related to the patient’s illness? While it’s interesting that these patients all felt ill enough to come into the hospital, it’s not quite clear if the length of time they were sick would have confounded their analysis. For example, a person sick enough to come to the hospital on day 5 may be different than one that arrives 14 days after falling ill.

We agree and have added the variable “Duration of symptoms before admission” to the patient characteristics and our models (see the response in point 1, reviewer 2). We did not see a significant difference between the three groups, but we cannot exclude that some noise might have been introduced.

8. Materials and methods: Page 11, Line 99 There is very limited clinical information that would confound host-metabolite expression, for example 1) Use of supplemental oxygen? 2) Other comorbid disease states such as diabetes, 3) BMI (which the authors mentioned in the conclusion was not recorded), 4) medications the patient had been taking prior to “catching” pneumonia (e.g., steroids, inhalers, antibiotics), and 5) most interesting of all, no mention of pre-existing lung disease (e.g., COPD, asthma, ILD). For host-metabolite issues, it would be interesting to understand if these impact host-expression, especially lung and systemic metabolites.

We have added all available patient characteristics related to these possible confounders (see point 1 in response to reviewer 2). There are no significant differences for patient characteristics between the three compared pathogen groups, except for age.

9. Materials and methods: Page 12, Line 133 “… models containing age and sex were generated …” Given the predilection of Streptococcus pneumoniae to affect older subjects, I am a little surprised that age did not factor into the analysis as in Table 1 it seems as though the age would be statistically different.

In the studied patient cohort, the mean age was significantly higher in patients with viral CAP compared to bacterial CAP. We hypothesized that this age difference could confound the results but our results show that this is not the case in our cohort. To clarify, we have added to the discussion section (lines 339-343):

“We see that a model including age and sex does not outperform models without these possible confounders. This doesn’t imply there is no metabolomic effect of age in the bacterial pathogen groups but implies that the separation between bacterial pathogen groups is more dependent on the metabolomic host-response to the infection than on the age-related metabolomic changes.”

10. Results: Page 15, Line 189 “Single discriminating metabolites for pathogen groups”. Out of curiosity and this may be beyond the scope of the study, were there any distinct groups that were identified in an unsupervised fashion? From the metabolites, could the authors identify distinct groups? I am wondering if using the data to find distinct groups could also be performed (again beyond the scope of the study, but could be interesting to look at to see if there may be groups that are not clearly seen). For example, using Dirichlet Multinomial Mixtures to identify distinct groups. This could be added as a figure in the supplement. Part of me wonders if differences in serum metabolites may be associated with clinical outcomes.

Yes, we have performed unsupervised analysis in the form of Principal Component Analysis. The results are shown in S2 Fig (line 595). However, unsupervised PCA did not show separation between the pathogen groups. We have looked into Dirichlet Multinomial Mixtures following the reviewer's suggestion, but feel that this method would not be appropriate for the analysis of the continuous metabolite levels in our dataset, as it is primarily of relevance for microbial metagenomics studies that involve discrete observations.

11. Discussion: Page 20, Line … “Targeted …” I appreciate that the authors point out that it is difficult based on the host-metabolomic profile to tell the difference between the various pneumonias. What isn’t clear to me is why would atypical infections, in particular have such distinct host-metabolomic profile? The authors do a commendable effort into searching for metabolites which can discriminate between infections, but what is so particular that the infections create a unique host response (e.g., such as the intra-cellular nature of some of these infections Mycoplasma and Legionella).

Although our approach was primarily focused on assessing whether metabolomics profiling could be helpful in guiding empirical antimicrobial treatment of CAP, we agree that the distinct group of atypical pathogens requires further elaboration. Therefore, we added to the manuscript (lines 355-360):

“Ideally, we would have further investigated differences within studied groups, e.g. to identify metabolic responses to specific pathogens within the atypical pathogens and viral infection groups. For example, it would be of interest to study Legionella species more in-depth because their intracellular growth might result in a differentiated host-response. However, this was considered not feasible in this study due to sample size restrictions.”

12. Discussion: Page 21, Line … “Lactic acid …” I think this is interesting because there are R and L enantiomers that are involved in microbial metabolism, but from a clinical point of view, lactemia in the serum is sign of severe disease. Perhaps, it may actually reflect severity of disease.

The reviewer raises an interesting point here. Unfortunately, with the metabolomics method used, we are not able to differentiate between R and L enantiomers. Aside from that, we have added a clarification on the interpretation of lactic acid as a metabolite of interest (lines 325-331):

“Lactic acid levels are also known to rise in case of severe disease. However, because the three pathogen groups were well balanced in terms of disease severity and, for example, did not show significant differences in pH levels, we hypothesize that the differences in lactate levels are, in this case, an effect of the pathogen-specific host-response to infection. The result showed that models including disease severity covariates do not perform better than models without these confounders, thus supporting this hypothesis.”

13. Discussion: Page 22, Line … “In this study, we included patients …” It’s interesting that the authors utilized a pneumonia score, perhaps to understand some of the granularity of the data the authors should try to expand the PSI score and reassess their model based on the severity of disease? Moreover, have the authors tried to separate out the analysis based upon severity? The severity of disease could serve as a confounder in their analysis. I recommend the authors split the PSI score and attempt to construct their models utilizing

We would like to refer to point 1 in response to reviewer 2, where we explain how we incorporated disease severity variables in the revised manuscript.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Aran Singanayagam

14 Apr 2021

PONE-D-21-00435R1

Metabolomic profiling of microbial disease etiology in community-acquired pneumonia

PLOS ONE

Dear Dr. den Hartog,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by May 29 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Aran Singanayagam

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I only have one suggestion:

As for 1.) no standardization for sampling times and conditions was applied.

Please add in the discussion if this may limit some (say which ones!) of the conclusions of the study and why.

Reviewer #2: The authors have addressed all comments. They do a commendable job and also addressed missing clinical variables in their response.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Jun 4;16(6):e0252378. doi: 10.1371/journal.pone.0252378.r004

Author response to Decision Letter 1


29 Apr 2021

Dear Dr. Singanayagam,

We hereby resubmit our revised manuscript entitled “Metabolomic profiling of microbial disease etiology in community-acquired pneumonia.”

We have addressed the remaining reviewer comments in this revised version, for which we provide a specific response below. In addition, we have made a number of minor textual changes to further improve readability of the Discussion section.

We hope that this manuscript is now acceptable for publication.

Sincerely,

on behalf of all co-authors,

Ilona den Hartog and Coen van Hasselt

Reviewer #1:

I probably missed this point. When were the serum samples taken? In the morning before food? Has this been standardized?

As for 1.) no standardization for sampling times and conditions was applied. Please add in the discussion if this may limit some (say which ones!) of the conclusions of the study and why.

Response:

Since the samples were not collected specifically for metabolomics analysis, no standardization for sampling times and conditions was applied.

In the Methods section we stated the following (lines 85-86):

“The samples were taken from CAP patients within 24 hours after hospital admission.”

Furthermore, we have added the following to the Discussion (lines 348-356):

“Furthermore, no standardization of sampling times and conditions was applied, e.g., patients had not fasted before blood sampling, which may influence the metabolite patterns found. Since variations in sampling conditions were unknown, we were unable to consider these in our analyses. However, we expect that the impact of not standardizing and correcting for these factors is limited because the noise in metabolite levels introduced by these factors is expected to be random with regard to the pathogen groups compared in this study. A standardized sampling approach could improve the sensitivity of the models to detect predictive metabolites because some noise is reduced. However, the specificity of the models with respect to the prediction of specific pathogens would be unchanged, since no correlation with pathogen groups is likely.”

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 2

Aran Singanayagam

17 May 2021

Metabolomic profiling of microbial disease etiology in community-acquired pneumonia

PONE-D-21-00435R2

Dear Dr. den Hartog,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Aran Singanayagam

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Aran Singanayagam

25 May 2021

PONE-D-21-00435R2

Metabolomic profiling of microbial disease etiology in community-acquired pneumonia

Dear Dr. van Hasselt:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Aran Singanayagam

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Method. Details on metabolomic sample analysis.

    (DOCX)

    S1 Fig. Optimization of α and λ in the inner Cross-Validation (CV) to reach a minimal Balanced Error Rate (BER) in the outer CV.

    (A) Shows all α and λ values tested in inner CV against mean BER of the inner CV. (B) A plot of the optimal α and λ combinations chosen in the inner CV against their BER in the outer CV shows a variety of favorable α and λ combinations. (C) A plot of the number of variables selected in the elastic net model in outer CV shows that with increasing α, the number of variables decreases, as is expected in an elastic net model. The data shown in the Fig are a result of the comparison Atypical–(S. pneumoniae + viral).

    (DOCX)

    S2 Fig. Unsupervised Principal Component Analysis (PCA) plot of all pathogen groups.

    (DOCX)

    S3 Fig

    (A) Boxplot of BER per number of variables selected shows no clear relation between the number of variables selected and model performance. (B) Histogram of the number of variables selected shows that a model with all metabolites included is favored, followed by models including 34, 49, 82, 24, or 45 metabolites. Both Figs contain the data of all folds and repeats (n = 500) for the comparison between atypical versus S. pneumoniae and viral infections.

    (DOCX)

    S4 Fig. Principal Component Analysis (PCA) of the atypical pathogen group (log-transformed and standardized data) shows that there is no clear subgroup within the atypical group that would prominently drive the separation from the S. pneumoniae and viral infections.

    (DOCX)

    S1 Table. Summary of previous studies focusing on bacterial and viral respiratory tract infections and related metabolites.

    (DOCX)

    S2 Table. Additional patient characteristics per pathogen group.

    (DOCX)

    S3 Table. Overview of the number of metabolites included in the metabolomics platforms, measured in the samples and included in the data analysis.

    (DOCX)

    S4 Table. Information on measurement platforms used, metabolite classes targeted per platform, targeted metabolites, their abbreviations and names in R (if detected) and identifiers (if available).

    (XLSX)

    S5 Table. Metabolomics data after quality control.

    (CSV)

    Attachment

    Submitted filename: Response to Reviewers.docx

    Data Availability Statement

    All relevant data are within the manuscript and its Supporting Information files.


    Articles from PLoS ONE are provided here courtesy of PLOS
