Abstract
Simple Summary
Despite the emerging success of immunotherapy in non-small-cell lung cancer (NSCLC), it remains clinically important to better identify patients who are likely to respond to treatment, especially considering the existence of immune-related adverse events (irAEs). In recent years, the gut microbiome has been correlated with treatment response, but no predictive models relating the two have been developed. In this study, we used random forest and neural networks to predict the progression-free survival of NSCLC patients treated with immunotherapy. Our results showed that a functional profile of the human gut microbiome outperformed the taxonomical profile across different studies, which can be utilized to establish a model with good predictive value in lung cancer immunotherapy.
Abstract
We performed various analyses on the taxonomic and functional features of the gut microbiome from NSCLC patients treated with immunotherapy to establish a model that may predict whether a patient will benefit from immunotherapy. We collected 65 published whole metagenome shotgun sequencing samples along with 14 samples from our previous study. We systematically studied the taxonomical characteristics of the dataset and used both the random forest (RF) and the multilayer perceptron (MLP) neural network models to predict patients with progression-free survival (PFS) above 6 months versus those below 3 months. Our results showed that the RF classifier achieved the highest F-score (85.2%) and the area under the receiver operating characteristic curve (AUC) (95%) using the protein families (Pfam) profile, and the MLP neural network classifier achieved a 99.9% F-score and 100% AUC using the same Pfam profile. When applying the model trained in the Pfam profile directly to predict the treatment response, we found that both trained RF and MLP classifiers significantly outperformed the stochastic predictor in F-score. Our results suggested that such a predictive model based on functional (e.g., Pfam) rather than taxonomic profile might be clinically useful to predict whether an NSCLC patient will benefit from immunotherapy, as both the F-score and AUC of functional profile outperform that of taxonomic profile. In addition, our model suggested that interactive biological processes such as methanogenesis, one-carbon, and amino acid metabolism might be important in regulating the immunotherapy response that warrants further investigation.
Keywords: immunotherapy, non-small cell lung cancer, gut microbiome, prediction model, machine learning
1. Introduction
Lung cancer remains the leading cause of cancer-related death in the US and worldwide [1,2]. With a better understanding of immune checkpoints in tumor control, immunotherapy using immune checkpoint inhibitors (ICIs) has revolutionized our treatment in various types of cancer, including lung cancer [3,4,5]. Pembrolizumab, for example, binds to and impairs the lymphocyte PD-1 receptor’s ability to interact with PD-L1 on cancer cells and thus allows the enhancement of antitumor immune response via suppression of the co-inhibitory PD-1/PD-L1 pathway [6], which has resulted in a significantly better response and survival in certain patients with advanced/metastatic NSCLC [7,8,9]. However, not all patients benefit from ICIs, and some even develop non-negligible irAEs [10,11], such as potentially lethal pneumonitis. Given this, it is desirable to develop predictive models that better identify patient populations in which treatment benefits outweigh risks.
Several studies, including ours, have associated the gut microbiome with the host’s immune system, immunotherapy response, and irAEs [12,13,14,15,16]. For example, Bacteroides thetaiotaomicron and Bacteroides fragilis were reported to be positively associated with the efficacy of CTLA-4 blockade [17]. Enrichment of Bifidobacterium was reported to be associated with a lower incidence of irAEs in lung cancer patients receiving ICIs [18]. Oral intake of Bifidobacterium combined with anti–PD-L1 antibody therapy showed significant improvement in melanoma control in mouse models [19], and Bifidobacterium was also reported to suppress metastasis of lung cancer in mouse models [20]. Akkermansia muciniphila was found enriched in NSCLC patients who responded to PD-1–based immunotherapy [21]. However, several key questions remain. How externally valid is a single study’s taxonomic analysis of the gut microbiome, especially considering the existence of various modulating factors [22]? Can functional analysis provide a better signal, considering the inherent functional redundancy of microbiota, and if so, can such a signal have predictive value? To date, there is no published prediction model using the gut microbiome to predict the efficacy of immunotherapy in NSCLC. The prediction model proposed for melanoma did not show satisfactory performance, with a 71% F1 score on the testing set [23].
The heterogeneity of human gut microbiome taxonomical composition challenges its predictive use [23]. For example, Limeta et al. showed that the weighted UniFrac distances between patients’ gut microbiomes were smaller when grouping by study than when grouping by response [23]. Similar findings were reported in healthy humans by the Human Microbiome Project [24]. However, despite the dissimilarity of taxonomical compositions, Huttenhower et al. did show that most metabolic pathways were commonly shared across different human subjects [24].
The possibility of predicting rather than merely associating the efficacy of immunotherapy in NSCLC patients using the taxonomic and/or functional gut microbiome as a biomarker collectively intrigued us. To answer this, in the current study, we performed analyses on the taxonomic and functional features of the gut microbiome between long PFS (above 6 months) and short PFS (below 3 months) and trained a random forest (RF) classifier and a multilayer perceptron (MLP) neural network classifier to predict if a patient will develop long vs. short PFS. We stratified PFS in this way to separate patients who clearly benefit from immunotherapy from those who do not benefit in an effort to enrich potential signals. Our study showed that the RF classifier achieved an 85.2% F-score and a 95% area under the receiver operating characteristic curve (AUC) using the Pfam profile, and the MLP classifier achieved a 99.9% F-score and 100% AUC using the Pfam profile. Figure 1 describes our study schema.
2. Materials and Methods
2.1. Data Set and Metadata Collection
We utilized two datasets (DS1 and DS2) in this study. Details of DS1 are given in Section 2.2 (Patients and Samples). For DS2, we began by searching the SRA (Sequence Read Archive) database at the NCBI (National Center for Biotechnology Information) using the search term “(NSCLC gut) AND bioproject_sra[filter] NOT bioproject_gap[filter]” with a cutoff date of 14 December 2021, to retrieve all studies related to NSCLC immunotherapy with published metagenomic data of the gut microbiome. This resulted in 5 records. We found that 2 out of the 5 records contained whole metagenome shotgun sequencing (WGS) data, and only one [21] had metadata associated with the WGS data; this dataset is referred to as DS2 in this paper. Broadening the search criteria to “lung cancer” resulted in 16 records, but no further usable WGS data containing metadata was found.
2.2. Patients and Samples
In total, 14 NSCLC patients who received immunotherapy and had known PFS were selected from our previous study [18] (referred to as DS1, which can be accessed from PRJNA866654). Their pretreatment baseline fecal samples were collected, and the extracted DNA was sequenced by COSMOSID® on an Illumina HiSeq (2 × 150 bp) using NexteraXT library preparation. DS2 contained 65 additional pretreatment samples. For the PFS study, 6 patients with long PFS and 3 patients with short PFS were included in DS1, and 7 samples with long PFS and 34 samples with short PFS were included in DS2. We omitted patients with PFS between 3 and 6 months to further contrast the gut microbiome of patients with long vs. short PFS. We later applied our trained models to predict treatment response. We grouped patients with a complete or partial response as responders (R), and those with stable disease, progression, or death as non-responders (NR). With this design, all 79 patients (14 from DS1 plus 65 from DS2) were included for analysis, with 8 R and 6 NR in DS1 and 12 R and 53 NR in DS2. RECIST 1.1 criteria [25] were used to assess the treatment response.
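The two labeling rules described above can be written down compactly. The following is a minimal sketch, not the authors' code: the function names, the response category strings, and the treatment of PFS values of exactly 3 or 6 months (here omitted) are assumptions made for illustration.

```python
from typing import Optional

LONG_PFS_MONTHS = 6.0   # "long PFS": above 6 months
SHORT_PFS_MONTHS = 3.0  # "short PFS": below 3 months

# Hypothetical spellings of the response categories named in the text.
RESPONDER_CATEGORIES = {"complete response", "partial response"}
NON_RESPONDER_CATEGORIES = {"stable disease", "progression", "death"}


def pfs_group(pfs_months: float) -> Optional[str]:
    """Return 'long', 'short', or None for the omitted 3 to 6 month band."""
    if pfs_months > LONG_PFS_MONTHS:
        return "long"
    if pfs_months < SHORT_PFS_MONTHS:
        return "short"
    return None  # assumption: boundary values are treated as omitted


def response_group(best_response: str) -> str:
    """Map a response category to 'R' (responder) or 'NR' (non-responder)."""
    key = best_response.strip().lower()
    if key in RESPONDER_CATEGORIES:
        return "R"
    if key in NON_RESPONDER_CATEGORIES:
        return "NR"
    raise ValueError(f"unknown response category: {best_response!r}")
```

Under this rule a patient can carry an R/NR label without a PFS label, which is why the response analysis covers all 79 samples while the PFS analysis covers fewer.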
2.3. Quality Control, Annotation, and Differential Study
All raw metagenomic data were quality trimmed with Trimmomatic [26] (v0.38) using a user-specific adapter list and default parameters (Supplemental Figure S1). Potential human genome sequences were removed using BWA [27] (0.7.16a-r1185-dirty) with default parameters (the percentage of reads removed is reported in Supplemental Table S7). MetaPhlAn [28] (v3.0.6) was used to annotate the taxonomic composition against its own database mpa_v30_CHOCOPhlAn_201901 (containing marker genes from ~99,500 bacterial and ~500 eukaryotic genomes). UProC [29] (v1.2.0) was used to estimate the abundances of the Pfam families [30] (28.0) and KEGG Orthology (the March 2014 release). MicrobiomeAnalyst [31] was used to compute alpha-diversity and beta-diversity, and metagenomeSeq [32] was used to identify differentially abundant taxa. Significantly differentially abundant Pfam and KEGG Orthology families were determined by DESeq2 [33] (v1.22.2). Hierarchical clustering was then performed using clustermap in seaborn [34] (v0.11.0), with the Z-scores obtained from normalized family-level RPKM (Reads Per Kilobase per Million mapped reads).
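The normalization feeding the clustering step can be illustrated with a small sketch. It assumes the conventional RPKM definition and a population standard deviation for the Z-score; the paper does not state which length or which variance estimator was used, so the function names and these choices are illustrative only.

```python
from statistics import mean, pstdev
from typing import List


def rpkm(read_count: float, family_length_bp: float, total_mapped_reads: float) -> float:
    """Reads per kilobase of feature length per million mapped reads."""
    return read_count * 1e9 / (family_length_bp * total_mapped_reads)


def z_scores(values: List[float]) -> List[float]:
    """Standardize one family's RPKM values across samples."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]  # a constant row carries no contrast
    return [(v - mu) / sigma for v in values]
```

seaborn's `clustermap` can also standardize rows or columns itself through its `z_score` argument, so precomputing the Z-scores as above is one of two equivalent routes.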
2.4. Prediction
The RF and MLP classifiers from scikit-learn [35] (v0.24.2) were used to classify the patients into long/short PFS groups and between R/NR groups. We selected RF for its interpretability and MLP for its regression power [36]. Training-testing datasets for PFS classification were constructed such that the training and testing datasets contained 35 and 15 samples, respectively, and the testing dataset contained 4 samples with long PFS. Gini importance [37] was adopted to reduce the input dimension and avoid model overfitting. The peak AUC performances of taxa-, KEGG Orthology-, Pfam-, and the combined information-based models were achieved at 7, 60, 38, and 18 selected features, respectively (Supplemental Figure S2). An early stop was enabled for MLP to prevent overfitting to the training set; all the other parameters were left as default.
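The split described above fixes both the size of the testing set and the number of long-PFS samples in it. The sketch below shows one way to draw such a split; the helper name and the seed are hypothetical, and the label vector is only reconstructed from the counts given in Section 2.2 (13 long-PFS and 37 short-PFS samples). Whether the split was redrawn across repeated experiments is not stated in the text.

```python
import random
from typing import List, Tuple


def split_fixed_positives(
    labels: List[int], test_size: int, test_positives: int, seed: int
) -> Tuple[List[int], List[int]]:
    """Return (train_idx, test_idx) with an exact number of positives in the test set.

    labels: 1 for long PFS, 0 for short PFS.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    test_negatives = test_size - test_positives
    if test_negatives < 0 or test_positives > len(pos) or test_negatives > len(neg):
        raise ValueError("requested split is not feasible")
    test_idx = rng.sample(pos, test_positives) + rng.sample(neg, test_negatives)
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    return train_idx, sorted(test_idx)


# Label vector reconstructed from the reported counts (order is arbitrary).
labels = [1] * 13 + [0] * 37
train_idx, test_idx = split_fixed_positives(labels, test_size=15, test_positives=4, seed=0)
```

With these counts the training set holds 9 long-PFS and 26 short-PFS samples, so both sets keep roughly the same class ratio.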
We define true positives (TP) as the number of patients who were predicted and indeed had PFS > 6 months, and the true negatives (TN) as those predicted and indeed had PFS < 3 months. Correspondingly, the false positives (FP) corresponded to the patients predicted with long but had short PFS, and vice versa for the false negatives (FN). The performances of our models were measured by sensitivity, precision, F-score, and accuracy. The receiver operating characteristic (ROC) curves were generated from data collected from 100 repeated experiments. When calculating the AUC score, they were extrapolated to the extreme points corresponding to the highest sensitivity and precision.
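The four metrics named above follow directly from the confusion-matrix counts. The sketch below assumes the standard definitions, since the formulas are not written out in the text; the function name and the example counts are hypothetical and are not results from this study.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, precision, F-score, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    # F-score: harmonic mean of precision and sensitivity.
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "f_score": f_score,
        "accuracy": accuracy,
    }


# Illustrative counts for a 15-sample test set with 4 long-PFS patients.
example = classification_metrics(tp=3, fp=0, tn=11, fn=1)
```

Because long-PFS patients are the minority class, accuracy alone can look high even when sensitivity is poor, which is presumably why the F-score is reported alongside it.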
Our Pfam-based RF and MLP models trained on 50 PFS-labeled patients were further used for R/NR classification. The test set contained 79 samples with R/NR group labels. A null model (random guess) was repeated 1000 times to simulate the background distribution. Detailed scripts and comments can be found in the Supplemental Materials.
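A random-guess baseline of this kind can be simulated in a few lines. The sketch below is an assumption-laden illustration: it guesses "responder" with probability 0.5 for each sample, uses the 20 R / 59 NR counts from Section 2.2, and computes an empirical p-value with the usual add-one correction. The guessing probability, the seed, and the function names are not taken from the paper.

```python
import random
from typing import List


def f_score(y_true: List[int], y_pred: List[int]) -> float:
    """F-score with responders (1) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def null_f_scores(y_true: List[int], n_repeats: int, p_positive: float, seed: int) -> List[float]:
    """F-scores of a predictor that guesses the positive class at random."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        y_pred = [1 if rng.random() < p_positive else 0 for _ in y_true]
        scores.append(f_score(y_true, y_pred))
    return scores


def empirical_p_value(observed: float, null_scores: List[float]) -> float:
    """Share of null runs at least as good as the observed score (add-one corrected)."""
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)


y_true = [1] * 20 + [0] * 59  # 20 responders, 59 non-responders
null = null_f_scores(y_true, n_repeats=1000, p_positive=0.5, seed=0)
```

A raw count over 1000 runs cannot resolve p-values much below 1 × 10⁻³, so a smaller p-value would require fitting a distribution to the null scores; that step is not shown here.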
3. Results
3.1. Taxonomic Differences between the Long and Short PFS Gut Microbiomes
All of the raw sequencing data were re-processed and re-analyzed using the pipeline described in Figure 1 and Methods. After combining the two datasets, examining the taxonomic distribution at the phylum level did not show a significant difference between long vs. short PFS groups. Comparisons of alpha-diversity and beta-diversity at the genus level also showed no significant difference (Figure 2A,B).
Given that no significant difference was observed at the global level using merged datasets (Figure 2A,B; Supplemental Figure S4), we individually compared the taxonomic profiles between PFS groups in DS1 and DS2. We used taxa with the largest differentials in abundance between the groups for hierarchical clustering (Figure 2C,D; also in Supplemental Figures S5A and S6). In total, 23 taxa of interest were found in DS1 (p-value ≤ 0.05, Supplemental Table S1), and 24 were found in DS2 (p-value ≤ 0.05, Supplemental Table S2). Two long PFS samples (JZLC_19 and JZLC_37) demonstrated significantly different taxonomic profiles than the other samples within the same group (Supplemental Figure S3) and thus were excluded from further analysis. The taxa with p-value ≤ 0.05 successfully classified PFS in DS1 but failed in DS2 (Figure 2C,D), suggesting taxonomic profile alone might not be enough to cluster/predict long vs. short PFS.
3.2. Functional Differences between Long and Short PFS Gut Microbiomes
We then investigated the use of the Pfam and KEGG Orthology protein families for long/short PFS classification. In total, 8516 protein families from Pfam were detected in DS1 and DS2, whereas 10,806 were detected from KEGG Orthology (Supplemental Tables S3–S6). Moreover, 171 and 163 protein families showed significantly different abundance (p-value ≤ 0.05, Supplemental Tables S3 and S4) in DS1 from the Pfam and KEGG Orthology, respectively; while 213 and 239 of them showed significantly different abundances (p-value ≤ 0.05, Supplemental Tables S5 and S6) in DS2 from Pfam and KEGG Orthology, respectively. The top 50 protein families of Pfam or KEGG Orthology clustered most of the long vs. short PFS patients (Figure 3; also in Supplemental Figure S5B,C and Supplemental Figures S7 and S8), with only two misclassifications (Figure 3D).
3.3. PFS Prediction Using Taxonomic and Functional Information
Having identified that the functional profile of the gut microbiome can better segregate long vs. short PFS, we next investigated whether it could have better predictive power than taxonomic information. Figure 4A–D shows the distribution of RF prediction scores (ranging from 0 to 1) of the 15 testing samples (recall the training-testing split above) in 100 experiments using the taxonomic, KEGG Orthology, Pfam, and combined profiles, respectively (the true positive cases are marked in orange). The functional profiles (KEGG Orthology and Pfam) clearly outperformed the taxonomic profile. By using different cutoffs of prediction scores, we obtained the receiver operating characteristic (ROC) curve of average performance shown in the left panel of Figure 4E, from which we can see that the Pfam profile achieved the best AUC (95%) and F1 score (85.2%; the F1 score is the harmonic mean of precision and sensitivity; see right panel of Figure 4E). This is reflected in the distribution pattern of positive cases in Figure 4A–D, showing a better separation of positive vs. negative predictive events for PFS. Compared to the KEGG Orthology profile (Figure 4B), the prediction scores of positive cases using the Pfam profile (orange dots in Figure 4C) were higher, and the majority of the negative cases (grey dots) were assigned lower prediction scores (accumulated on the left side). This feature of the Pfam profile might be clinically useful: a high score (more than 0.5) implies an NSCLC patient is very likely to have long PFS on immunotherapy, whereas a low score implies otherwise. However, the existence of such a lower-score cutoff needs to be verified using more samples from prospective studies. Figure 4E shows the performance metrics at the default prediction-score cutoff (0.5), with the highest values in bold. The Pfam profile achieved the best F1 score (85.2%) and best accuracy (92.8%).
To maximize the performance, we applied MLP to predict PFS. The ROC curve and performance metrics were computed using the same setup as for RF. The results are shown in Figure 4G (also Supplemental Figure S9), from which we observed that the Pfam, KEGG Orthology, and combination profiles achieved nearly perfect predictions in the range of 97% to 100% for the AUC and 98.4% to 99.9% for the F1 score. The predictive power of the taxonomic profile was again inferior.
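Sweeping the cutoff over the prediction scores, as described above, yields the ROC curve, and the AUC is the area under it. The sketch below is a generic illustration using the trapezoidal rule with the conventional (0, 0) and (1, 1) endpoints; it is not the authors' script, and the example scores are made up.

```python
from typing import List, Tuple


def roc_points(y_true: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """(false positive rate, sensitivity) pairs, sweeping the cutoff from high to low."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        # A sample is predicted positive when its score reaches the cutoff.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= cutoff and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points: List[Tuple[float, float]]) -> float:
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up scores: one negative sample outranks one positive sample.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
auc = auc_trapezoid(roc_points(y_true, scores))  # 8/9, about 0.889
```

The result equals the fraction of (positive, negative) pairs ranked in the right order, here 8 of 9, which is a convenient sanity check for any AUC implementation.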
3.4. Treatment Response Prediction Using Pfam-PFS Model
In the clinical setting, treatment response often (although not always) correlates with survival benefits (e.g., longer PFS). Thus, we further studied the predictive power of the Pfam-PFS feature set in treatment response. To do that, we directly applied the RF and MLP classifiers, trained on the 50 samples with PFS labels, to the 79 samples with R/NR labels. We benchmarked the trained classifiers against a null (random) predictor and simulated the performance of the stochastic predictor 1000 times to approximate the background distribution. The performances of the Pfam-based RF and MLP classifiers are shown as the red crosses in Figure 4F,H, respectively. The trained RF and MLP classifiers statistically significantly outperformed the stochastic predictor in F-score (with a p-value less than 1 × 10⁻⁶). It should be noted that none of the responder labels was known to the classifiers during the training process; thus, Figure 4F,H suggests the potential to develop a single model that predicts both PFS and treatment response.
3.5. Biological Processes with Potential Impacts on NSCLC Immunotherapy Response
Noticing that the functional profiles can better predict the immunotherapy response, we explored its relevant biological processes. Table 1 lists all statistically significant biological processes from the 38 Pfam protein families used for prediction (see Supplemental Table S8 for 38 Pfam protein families and full list of biological processes).
Table 1.
Pfam ID | Biological Process | p-Value * |
---|---|---|
PF02249, PF02240, PF02505 | Methanogenesis | 2.11 × 10⁻⁵ |
PF01450 | Branched-chain amino acid biosynthetic process | 0.008611 |
PF07991 | Cellular amino acid biosynthetic process | 0.01286 |
PF05732 | Plasmid maintenance | 0.01286 |
PF02741 | One-carbon metabolic process | 0.01708 |
PF00742 | Cellular amino acid metabolic process | 0.03761 |
* All protein families are enriched in the long PFS group.
3.5.1. Methanogenesis and One-Carbon Metabolic Process
Out of 14,107 protein families, we found that 3 out of 14 methanogenesis-related protein families [30] were enriched in patients with long PFS (p-value 2.11 × 10⁻⁵) (PF02249, PF02240, and PF02505, all related to methyl-coenzyme M reductase: MCR). Methanogenesis is the formation of methane by microbes known as methanogens, which primarily belong to the Archaea domain [38], and MCR is the key enzyme of this biological process [39]. Methanogenic Archaea inhabit mammals’ gastrointestinal (GI) tract and have syntrophic interactions with other microorganisms within the microbial community [40]. Some of them, for example, Methanobrevibacter smithii, can be recognized by the human innate immune system and activate dendritic cells, therefore contributing to the activation of the adaptive immune response [41]. In addition, methanogenic Archaea can be functional associates of the fermentative digestion of dietary fibers, favoring the production of beneficial short-chain fatty acids [42], which are associated with good immunotherapy response [43]. Consistent with this, several studies have positively correlated methanogenic microbiota with cancer immunotherapy [44,45]. In fact, our taxonomic analysis also showed that Methanobrevibacter smithii is one of the top enriched microbiota in patients with long PFS.
We also found that 1 out of 4 one-carbon metabolism families was significantly enriched (p-value of 0.01719) (PF02741, which annotates the proximal lobe of formylmethanofuran-tetrahydromethanopterin formyltransferase: FTR). Considering that methanogenesis is a process converting bacterial metabolic products (e.g., CO₂, formate, etc.) to methane, it is not surprising to see the importance of the one-carbon metabolic process, as it is instrumental in reducing CO₂ (the most oxidized one-carbon compound) to methane (the most reduced form of a one-carbon compound) using electrons derived from the oxidation of either H₂ or formate [46]. In fact, FTR participates in both methanogenesis and folate biosynthesis. Interestingly, one-carbon metabolism has recently been shown to play an essential role in T-cell function. For example, adding products of one-carbon metabolism (such as formate and glycine) was found capable of enhancing the activation of aged naïve T cells [47]. The deficiency of folate, which supports the one-carbon metabolism, has been shown to substantially reduce CD8+ T cell (cytotoxic T cell, CTL) proliferation [48]. Furthermore, methyl-B12 can promote both the number and activity of CD8+ T cells [49,50]. Of note, vitamin B12 is a cofactor for methionine synthase and contributes to the one-carbon metabolism [51].
3.5.2. Amino Acid Biosynthetic and Metabolic Processes
We also found that three processes were statistically significant in regard to amino acids. The branched-chain amino acid (BCAA) biosynthetic process (PF01450, annotates the catalytic domain of acetohydroxy acid isomeroreductase: AHIR) and cellular amino acid biosynthetic process (PF07991, annotates the NADPH-binding domain of AHIR) are two processes relevant to AHIR (also known as ketol-acid reductoisomerase, KARI). AHIR not only participates in the formation of BCAAs such as isoleucine, leucine, and valine, but it also catalyzes the reversible transformation of NADP+ and NADPH [52]. Of note, NADPH is reported to be an additional product of one-carbon metabolism [53]. The cellular amino acid metabolic process is related to Pfam PF00742, which annotates homoserine dehydrogenase that catalyzes the third step in the aspartate pathway [54,55]. The aspartate pathway produces essential amino acids threonine, methionine, lysine, and isoleucine; the cofactor S-adenosylmethionine; and the cell wall component diaminopimelate. The third step of the aspartate pathway is the NAD(P)-dependent reduction of aspartate beta-semialdehyde into homoserine. Homoserine is an intermediate in the biosynthesis of threonine, isoleucine, and methionine.
Amino acids are found important to support immunity by providing energy or biomass to support the proliferation of immune cells and via modulation of key metabolic pathways that instruct immune cell function [56]. For example, BCAAs such as leucine, isoleucine, and valine can provide acetyl-CoA and succinyl-CoA that enter the TCA cycle [57], and supplementation of BCAA could enhance CD8+ T cell activity [58]. Amino acids can also be used to make antioxidants such as glutathione to maintain redox balance and provide methyl and acetyl groups to epigenetically regulate gene expression patterns in immune cells [56].
Interestingly, methanogens can generate serine during methanogenesis and synthesize lysine [59], and one-carbon metabolism directly modulates the levels of three amino acids: methionine, serine, and glycine [53], and connects the TCA cycle via NADH [60]. All these suggest a close interaction among these biological processes. Since many metabolites/metabolic intermediates can reach host cells (including immune cells) from gut microbiota, and several metabolic pathways such as one-carbon metabolism span all kingdoms [61], these biological processes could also prime host immune cells to respond better to immunotherapy (Figure 5).
3.5.3. Plasmid Maintenance
This significantly enriched biological process in patients with long PFS is due to Pfam PF05732, which annotates firmicute plasmid replication protein (RepL). Firmicutes were reported to be positively associated with better immunotherapy response [18,22]. In addition, lower plasmid diversity is associated with gut dysbioses such as inflammatory bowel disease (IBD), and higher plasmid diversity is associated with higher alpha diversity, which also correlates with a healthier condition and, in general, better response to immunotherapy [62].
4. Discussion
Although previous studies, including ours [18,21,22,63], have demonstrated the correlation of the gut microbiome with immunotherapy response, this is arguably the first study showing that the gut microbiome can be used to predict treatment response in lung cancer immunotherapy. We have illustrated its potential to predict long vs. short PFS and treatment response in NSCLC patients receiving ICIs. This study also showed that the functional profile, particularly the Pfam profile, outperformed the taxonomic profile by 3.9% in F-score and 11% in AUC using RF and by 10.4% in F-score and 14% in AUC using MLP. A possible explanation is that the Pfam profile is more granular than the taxonomic profile: Pfam annotations use more of the information contained in the raw data.
We also noticed that using Pfam, several biological processes such as methanogenesis, amino acid biosynthesis/metabolic process, and one-carbon metabolism were significantly enriched in patients who benefited from immunotherapy. This finding is supported by previous studies which demonstrated the importance of amino acids, folate, and cobalamin in CTLs [47,48,49,50,51], a key player in immunotherapy using ICIs, as well as shared biological processes across kingdoms such as one-carbon metabolism [61]. Though exciting, such a finding will need to be validated in future larger datasets and mechanistically using mouse models and relevant in vitro studies.
The prediction of PFS and response using the Pfam profile (Figure 4) could have significant clinical value. Note that the precision scores of the long PFS and responder were nearly 100%, meaning that patients who were predicted to benefit from immunotherapy indeed did so. On the other hand, by lowering the prediction threshold, for example, 0.2 in Figure 4C, all patients who benefitted from the therapy were accurately predicted. These suggest the possibility of setting another lower bound to predict patients who are less likely to benefit from the therapy, allowing other therapeutic approaches to be considered in advance.
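The idea of pairing the default cutoff with a second, lower bound can be sketched as a simple three-way rule. This is only an illustration of the concept discussed above: the 0.5 and 0.2 values are the ones mentioned in the text, while the function name, the output labels, and the "indeterminate" middle band are assumptions, and no such rule has been clinically validated.

```python
def triage(score: float, upper: float = 0.5, lower: float = 0.2) -> str:
    """Map a prediction score in [0, 1] to one of three illustrative bands."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if lower > upper:
        raise ValueError("lower cutoff must not exceed upper cutoff")
    if score > upper:
        return "likely long PFS"
    if score < lower:
        return "less likely to benefit"
    return "indeterminate"  # assumption: scores between the cutoffs stay undecided
```

Using two cutoffs trades coverage for confidence: the upper bound favors precision for predicted long PFS, while the lower bound favors sensitivity, so that few patients who would benefit fall into the lowest band.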
We must admit that the sample size limits our findings. With a larger sample size, the prediction models (both RF and MLP) could learn better to generalize to a larger population. We are actively enrolling patients through clinical trials (e.g., NCT04636775) with the plan to further train and validate our predictive model through continuous integration of new data. Since gut microbiome can be affected by diet and various lifestyle factors, we also plan to incorporate published data from studies performed in other geographic locations. Furthermore, since UProC [29] is not the only approach to analyze protein sequences, to minimize research method bias, we also plan to integrate other approaches, such as HUMAnN (the HMP Unified Metabolic Analysis Network) [28], in our future studies.
5. Conclusions
Gut microbiome may predict therapeutic benefit from immunotherapy in NSCLC patients. Its derived functional profile (e.g., Pfam) seems to have more potent predictive power than taxonomic information. The revealed biological processes, especially one-carbon metabolism, might modulate cancer immunotherapy response, which deserves validation and mechanistic investigation in future studies.
In the future, we will continue to incorporate more data to improve and validate our predictive model. Equally important, we have planned a series of mechanistic studies to understand the value of methanogenesis, one-carbon and amino acids metabolism, and archaea in modulating host immune status and response to cancer immunotherapy.
Supplementary Materials
The following are available online at https://www.mdpi.com/article/10.3390/cancers14215401/s1, Materials: Supplemental Materials, Figure S1: Examples illustrating per-base content before and after quality control; Figure S2: The fine-tuning processes; Figure S3: The hierarchical clustering of DS1 of 9 samples; Figure S4: The hierarchical clustering of z-score of abundances of most differential taxa (29) between PFS Long (orange) and PFS Short (grey) on DS1 and DS2 together; Figure S5: The hierarchical clustering of 7 samples on DS1; Figure S6: The hierarchical clustering of z-score of abundances of most differential taxa (24) between PFS Long and PFS Short samples on DS2, i.e., Figure 2D in main manuscript; Figure S7: The hierarchical clustering of z-score of abundances of 50 most differential protein families using KEGG Orthology profile on DS2, i.e., Figure 3B in main manuscript; Figure S8: The hierarchical clustering of z-score of abundances of 50 most differential protein families using Pfam profile on DS2, i.e., Figure 3D in main manuscript; Figure S9: Prediction of MLP classifier; Table S1: DS1_metageno_de_output, Table S2: DS2_metageno_de_output, Table S3: DS1_pfam, Table S4: DS1_KEGG, Table S5: DS2_pfam, Table S6: DS2_KEGG, Table S7: HumanReadsMapping, Table S8: topPfam.
Author Contributions
Conceptualization, C.Z. and J.Z.; methodology, C.Z. and J.Z.; software, B.L. and C.Z.; validation, B.L., C.Z. and J.Z.; formal analysis, B.L., C.Z. and J.Z.; investigation, B.L., C.Z. and J.Z.; resources, C.Z. and J.Z.; data curation, J.C. and Q.D.; writing—original draft preparation, B.L.; writing—review and editing, B.L., J.C., C.Z. and J.Z.; visualization, B.L. and C.Z.; supervision, C.Z. and J.Z.; project administration, C.Z. and J.Z.; funding acquisition, C.Z. and J.Z. All authors have read and agreed to the published version of the manuscript.
Institutional Review Board Statement
The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Institutional Review Board of the University of Kansas Medical Center (STUDY00146102, approved on 07/20/2022).
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement
Raw data used for analysis has been uploaded to the (NCBI) BioSample database under BioProject ID PRJNA866654, currently pending review and publication.
Conflicts of Interest
J.Z. has served as a scientific advisor/consultant for AstraZeneca, Biodesix, Novocure, Bayer, Daiichi Sankyo, Mirati, Novartis, Cardinal Health, Bristol-Myers Squibb, Nexus Health and Sanofi. J.Z. is on the speakers’ bureau for AstraZeneca, Regeneron, Sanofi, and MJH Life Sciences. J.Z. has also received research funding/support from AstraZeneca, Biodesix, Novartis, Genentech/Roche, Mirati, AbbVie, Hengrui, BeiGene, InnoCare Pharma, and Nilogen. C.Z. is a scientific advisor for Xbiome Co., Ltd.
Funding Statement
This research was supported by various research funds, including the University of Kansas Start-Up (J.Z.); the “Play with a Pro” Lung Cancer Research Fund (J.Z.); the Pilot Grant for Cancer Research of the University of Kansas Cancer Center (J.Z.), and the National Science Foundation CAREER award DBI-1943291 (B.L. and C.Z.).
Footnotes
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.