PLOS Computational Biology. 2023 Apr 26;19(4):e1011076. doi: 10.1371/journal.pcbi.1011076

Network analysis of toxin production in Clostridioides difficile identifies key metabolic dependencies

Deborah A Powers 1, Matthew L Jenior 2, Glynis L Kolling 2, Jason A Papin 1,2,*
Editor: Kiran Raosaheb Patil 3
PMCID: PMC10166488  PMID: 37099624

Abstract

Clostridioides difficile pathogenesis is mediated through its two toxin proteins, TcdA and TcdB, which induce intestinal epithelial cell death and inflammation. It is possible to alter C. difficile toxin production by changing various metabolite concentrations within the extracellular environment. However, it is unknown which intracellular metabolic pathways are involved and how they regulate toxin production. To investigate the response of intracellular metabolic pathways to diverse nutritional environments and toxin production states, we use previously published genome-scale metabolic models of C. difficile strains CD630 and CDR20291 (iCdG709 and iCdR703). We integrated publicly available transcriptomic data with the models using the RIPTiDe algorithm to create 16 unique contextualized C. difficile models representing a range of nutritional environments and toxin states. We used Random Forest with flux sampling and shadow pricing analyses to identify metabolic patterns correlated with toxin states and environment. Specifically, we found that arginine and ornithine uptake is particularly active in low toxin states. Additionally, uptake of arginine and ornithine is highly dependent on intracellular fatty acid and large polymer metabolite pools. We also applied the metabolic transformation algorithm (MTA) to identify model perturbations that shift metabolism from a high toxin state to a low toxin state. This analysis expands our understanding of toxin production in C. difficile and identifies metabolic dependencies that could be leveraged to mitigate disease severity.

Author summary

Clostridioides difficile is the causative agent in approximately 73% of healthcare-acquired gastrointestinal infections, resulting in a significant healthcare burden. Its toxins are crucial to virulence and play a key role in establishing a nutritional niche for C. difficile. Highly virulent strains with high toxin production can lead to worse outcomes for patients with C. difficile infection (CDI), such as progression to pseudomembranous colitis, toxic megacolon, and in some cases, death. Improving our understanding of how these toxins are regulated through their environment and intracellular metabolism could allow us to attenuate C. difficile virulence in infected patients. Therefore, we have compiled gene expression data of C. difficile grown in 16 different conditions to investigate how toxin production changes in response to the environment. We have integrated these data with a genome-scale metabolic model of C. difficile, allowing us to simulate the intracellular metabolism in high and low toxin producing states. Our network analysis of metabolism and toxin production predicts metabolic patterns in high and low toxin-producing states and provides insights into metabolic regulation of toxins. Additionally, our analysis highlights new proteins that could serve as anti-toxin targets.

Introduction

Clostridioides difficile is the leading contributor to healthcare-acquired gastrointestinal (GI) infections, accounting for 73% of GI infections and costing an estimated $6.3 billion annually in the United States [1,2]. The primary risk factor for developing C. difficile infection (CDI) is broad-spectrum antibiotic usage which alters the structure and composition of the gut microbiota, allowing C. difficile to outgrow nonpathogenic competitors [3]. CDI is recurrent in 30% of cases and has a mortality rate of 9.3%, causing 29,000 deaths in the US annually [1,4].

CDI pathogenesis is primarily mediated by two large Clostridia toxins, TcdA and TcdB [5,6]. These toxins induce cytopathic and cytotoxic effects, including changes in epithelial cell morphology, cell cycle modulation, disruption of the colonic epithelial barrier, induction of apoptosis, and activation of an acute innate inflammatory response [6–10]. TcdA and TcdB induce a portion of this damage by glucosylating, and thereby inactivating, Rho GTPases in the host intestinal epithelial cells [7]. Rho GTPase inactivation disrupts the actin cytoskeleton and tight junctions of epithelial cells, resulting in a cytopathic phenotype of altered cell morphology and impaired epithelial barrier function [11]. A compromised colonic epithelial barrier can lead to increased intestinal permeability and fluid secretion which furthers intestinal inflammation and damage [5]. Together, these toxins are crucial to the establishment of CDI.

Regulation of C. difficile toxin synthesis is complex. TcdA and TcdB are encoded by genes on the pathogenicity locus (PaLoc) along with a transcriptional regulator, TcdR [12]. TcdR, in turn, is negatively controlled by metabolically sensitive global regulators such as CodY and CcpA, which inhibit TcdR in the presence of intracellular branched-chain amino acids (BCAAs) or fructose bisphosphate (FBP), respectively (Fig 1) [12,13]. In addition to these intracellular metabolites, the extracellular environment can also influence toxin production in C. difficile. Multiple defined media experiments with C. difficile demonstrate that certain carbohydrates, amino acids, and short chain fatty acids (SCFA) increase or decrease toxin production (Fig 1) [14–19]. Furthermore, C. difficile toxin production responses are dependent on the surrounding microbial community, as the microbial community shapes the nutritional environment of C. difficile [20]. Toxin response to environment may also be strain-dependent, further compounding the complexity of toxin production regulation [20]. While the exact relationship between the regulation of C. difficile toxin synthesis and extracellular environment is unclear, there is evidence that this process is linked to extra- and intracellular metabolism.

Fig 1. C. difficile toxin production is regulated by multiple metabolic signals.

The transcription of tcdA and tcdB to synthesize toxins TcdA and TcdB is positively regulated by TcdR, which is in turn negatively regulated by CcpA and CodY. Each of these components is regulated by multiple metabolic signals. Fructose bisphosphate (FBP); branched-chain amino acids (BCAA); guanosine triphosphate (GTP). Numbers correspond to references with evidence for the indicated regulatory mechanism.

To investigate the metabolic states contributing to shifts in toxin production related to changes in the environment, we use previously established and curated genome-scale metabolic network reconstructions (GENREs) of C. difficile [21]. Metabolic modeling provides a unique systems approach to studying metabolism. Briefly, a GENRE describes the gene-protein-reaction associations for all of the metabolic genes of an organism which can then be simulated to predict the flux through the metabolic network [22]. We can represent the metabolic state of C. difficile in a specific context by integrating transcriptomic data from multiple studies (S1 Table) with the metabolic model via the RIPTiDe algorithm [23]. Using this approach, we created context-specific metabolic models of two C. difficile strains (CD630 and CDR20291) for 16 different environmental conditions with a range of toxin production states. We analyzed these models using a combination of machine learning and flux balance analysis (FBA) methods and found that arginine and ornithine metabolism is more active in models with low toxin production. Moreover, we found that arginine and ornithine metabolism may be influenced by intracellular fatty acid and large polymer pools. Finally, we applied a modified metabolic transformation algorithm (mMTA) to identify pivotal reactions for transitioning from a metabolic state associated with high toxin production to one associated with low toxin production [24]. Integrating gene expression data with a metabolic network simultaneously gives the gene expression data a metabolic context and the metabolic network biological relevance, optimizing our predictive power. Metabolic network analysis of microbial pathogens can catalyze biological discoveries which can then translate to new therapeutic leads.

Results

Contextualized metabolic models of C. difficile

Multiple studies have shown that toxin production in C. difficile can be altered by changing its nutritional environment; therefore, we wanted to investigate the intracellular metabolism of C. difficile grown in a variety of media conditions, inducing different toxin production states [13–16,18,20,25,26]. To do this, we used published GENREs of C. difficile strains CD630 and CDR20291 (iCdG709 and iCdR703, respectively) [21]. We compiled a set of publicly available RNA-sequencing data for each of these strains that considers a range of environmental conditions (S1 Table). For each condition, we classified the toxin states as low or high based on the median tcdA transcript reads per million (RPM) across all conditions (S1 Fig). Using the RIPTiDe algorithm [23], we integrated the transcript data with the appropriate strain model to generate a total of 16 contextualized models of C. difficile (Fig 2A).

Fig 2. Metabolic differences between toxin states are strain-specific.

(A) Summary table of the RIPTiDe contextualized models including the strain, toxin production level, and number of genes, reactions, and metabolites. (B) Normalized, absolute flux values for reactions indicated by Random Forest classifier as important for distinguishing between toxin levels. C. difficile strains 630 and R20291 are shown by light and dark purple respectively. Toxin transcript levels are shown by light (low) and dark (high) teal. Starred reactions are contextualized in panel (C). (C) Map of reactions in the metabolic model. Reactions identified by Random Forest analysis in panel (B) are starred. Arg: Arginine, Orn: Ornithine, Pro: Proline, Suc: Sucrose, UDP-Glc: UDP-Glucose, Glc1P: Glucose-1-phosphate, ManNAc: N-acetyl-D-mannosamine, Guo: Guanosine, dGuo: Deoxyguanosine, G: Guanine.

On average, the RIPTiDe models for low toxin conditions include more genes than the high toxin RIPTiDe models but retain similar metabolite numbers (Fig 2A). The RIPTiDe models share a core set of 483 reactions, which accounts for approximately 80% of the reactions in each model. There are strain differences, with CD630 models accounting for more genes, reactions, and metabolites compared to CDR20291 models (Fig 2A). Principal component analysis (PCA) of flux samples of these models demonstrates broad clustering by strain as well as by context (S2B–S2F Fig); complete flux sampling details can be found in the Materials and Methods. Overall, these models reflect metabolic differences by strain and toxin level with more metabolic genes and reactions included in both CD630 models and in low toxin models.

Metabolic differences between strains and toxin states

To identify reactions important in distinguishing between metabolic states of low and high toxin conditions, we applied a Random Forest classifier to the flux sampling data (S3 Fig). Briefly, after randomly down-sampling the flux sampling data to 100 flux samples per RIPTiDe model, we used random stratified groups to split the flux sampling data, with a 75–25 train-test ratio (S3 Fig). The classifier had a mean accuracy of 97%; features were ranked by their Gini score to identify reactions important in distinguishing between toxin states (Materials and Methods). From the Random Forest analysis, we determined that arginine and ornithine reactions are more active in low toxin conditions (Fig 2B). Specifically, growth media supplemented with bile acid (deoxycholate and cholate) and calprotectin result in increased metabolic flux through arginine and ornithine transport reactions which move towards Stickland fermentation and NAD+ production via D-proline catabolism (Fig 2B and 2C). Other metabolic processes that are more active in low toxin conditions include reactions involved in carbohydrate metabolism and nucleotide metabolism (Fig 2B and 2C). However, for the majority of important reactions from Random Forest, the flux difference across all conditions was neither large nor significant. These results indicate that while metabolism can be predictive of toxin outcomes, it may be due to the cumulative effect of many tightly controlled reactions rather than large changes in flux through a few key reactions.

Because of the flux sampling results, we decided to investigate how sensitive growth and toxin production would be to disruptions in flux balance and intracellular metabolite concentrations. To approach this question, we conducted a shadow pricing analysis. Briefly, shadow pricing is the dual problem to flux balance analysis (FBA) in which shadow prices capture the sensitivity of an objective function (e.g., biomass) to changes in metabolite levels [27,28]. Thus, increases in levels of metabolites with negative shadow prices reduce flux through the objective function while increases in levels of metabolites with positive shadow prices increase flux through the objective function. For this analysis we iteratively set each of the top 20 reactions from the Random Forest analysis as the objective function and solved the dual problem. Reactions whose flux is predominantly increased by increasing metabolite concentrations may indicate tightly regulated reactions where a consistent flux is important or perhaps that the reactions are not metabolically regulated. Conversely, reactions whose flux is decreased by many metabolites may be more responsive to the environment, allowing C. difficile to optimize its cellular function. Overall, we find that arginine and ornithine transport and L-aspartate:fumarate oxidoreductase are particularly sensitive to disruptions in metabolite concentrations, with over 40 metabolites having a shadow price less than -5 (Fig 3). Both of these reactions are primarily sensitive to fatty acid and large polymer metabolite pools which have relatively high negative shadow prices (Fig 3B), indicating a possible source of intracellular metabolic regulation.
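As a brief sketch of the formulation behind this analysis, the FBA problem and its shadow prices can be written in standard textbook notation. The symbols below (S, v, c, b, and π) are assumptions introduced here for illustration and are not taken from this study's Methods; sign conventions for shadow prices also differ between solvers, so the sign shown follows the interpretation given in the text above (a positive value means more of the metabolite raises the objective).

```latex
\begin{aligned}
\text{FBA (primal):}\quad & \max_{v}\; Z = c^{\top} v
  \quad \text{subject to}\quad S v = b,\qquad v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}},
  \qquad b = 0 \ \text{at steady state} \\[4pt]
\text{Shadow price of metabolite } i:\quad & \pi_i = \frac{\partial Z^{*}}{\partial b_i}
\end{aligned}
```

Here S is the stoichiometric matrix, v the flux vector, c selects the reaction used as the objective, and Z* is the optimal objective value. Each π_i is the dual variable attached to the mass-balance row of metabolite i, so setting a different reaction as the objective (a different c) yields a different set of shadow prices.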

Fig 3. Flux through arginine and ornithine reactions is sensitive to intracellular metabolite concentrations.

(A) Summary of the shadow pricing analysis with the top 20 reactions from the Random Forest classifier set as the objective function. The number in the "models" column (blue) corresponds to the fraction of contextualized models that were able to carry flux with the indicated reaction set as the objective function (OF). The values in the orange columns indicate the following: Increase: the number of metabolites for which an increased level results in increased flux through the OF (median shadow price > 0, range < 2); Decrease: the number of metabolites for which an increased level results in decreased flux through the OF (median shadow price < -0.1, range < 2); and Variable: the number of metabolites whose shadow price varied across RIPTiDe models (range > 2). For example, in the first row of panel (A), the OF was able to carry flux in all of the models, 294 metabolites increased flux through the OF in all of the models, 1 metabolite limited flux through the OF in all of the models, and 5 metabolites had different effects on flux through the OF across all of the models. (B) Shadow prices for limiting metabolites in arginine/ornithine and aspartate metabolism reactions. The metabolites categorized as sensitive in panel (A) for these OFs and with a shadow price < -5 are shown. Increasing negative values indicates increasing reaction flux sensitivity to the metabolite.
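The three categories in panel (A) can be illustrated with a small sketch. The thresholds come from the caption above; the function name, the "unclassified" fallback, and the example values are hypothetical and only serve to show how the rule could be applied to one metabolite's shadow prices across models.

```python
from statistics import median

def classify_metabolite(shadow_prices, range_cutoff=2.0, decrease_cutoff=-0.1):
    """Classify one metabolite from its shadow prices across contextualized models.

    Thresholds follow the Fig 3A caption: range > 2 is 'variable';
    otherwise median > 0 is 'increase' and median < -0.1 is 'decrease'.
    """
    med = median(shadow_prices)
    spread = max(shadow_prices) - min(shadow_prices)
    if spread > range_cutoff:
        return "variable"
    if spread < range_cutoff and med > 0:
        return "increase"
    if spread < range_cutoff and med < decrease_cutoff:
        return "decrease"
    # Assumption: cases the caption does not cover (e.g. median between
    # -0.1 and 0, or a range of exactly 2) are left unclassified.
    return "unclassified"

# Hypothetical shadow prices for illustration only (not values from the study).
example = {
    "met_a": [0.5, 1.0, 0.8],
    "met_b": [-6.0, -5.5, -7.0],
    "met_c": [-8.0, 3.0, 0.0],
}
categories = {met: classify_metabolite(sp) for met, sp in example.items()}
```

In this toy input, met_b would additionally qualify for panel (B), which shows limiting metabolites with a shadow price below -5.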

Metabolic transformation between toxin states

From the flux sampling and shadow pricing analyses above, we described the metabolic state of C. difficile under 16 specific conditions and identified metabolic differences between low and high toxin producing conditions. We next sought to investigate whether there were pivotal reaction knockouts that could transition the model from a metabolic state associated with a high toxin transcript level to a metabolic state associated with a low toxin transcript level. For this analysis we used a modified metabolic transformation algorithm (mMTA) which identifies key reactions to switch a cell from one metabolic state to another. To do this, MTA classifies reactions as changed or unchanged based on whether there was differential flux between a reference and target state; it then solves a mixed integer quadratic programming (MIQP) problem that maximizes change in flux in the direction of the target flux for the changed reaction set while minimizing flux changes for the unchanged reaction set [24]. We modified the original MTA for compatibility with COBRApy and to use updated modeling methods throughout its implementation. These changes and considerations are detailed in Materials and Methods. For our reference and target states we selected conditions that induce high and low toxin states respectively and that have transcriptomic data generated from the same study to reduce inter-study experimental variation. Therefore, we chose the iCdG709 RIPTiDe models contextualized with data from GSE165116, setting BHIS (high toxin) as the reference state and BHIS + 240 μM deoxycholate (low toxin) as the target state (S1 Table, S1 Fig). Using mMTA, we classified 101 reactions as significantly different between the reference and target state. Out of this reaction set, 87 reactions were successfully changed in the desired flux direction in at least one reaction knockout simulation (S6C Fig). Of the 594 reaction knockouts, 396 resulted in a feasible solution; for each feasible reaction knockout, a transformation score (TS) was calculated (S6B Fig). The reaction knockouts with the top 50 TSs induced flux transformation in reactions related to energy metabolism, such as carbohydrate metabolism and Stickland fermentation (Fig 4). Of these transformed reactions, the Stickland fermentation reactions and most of the amino acid metabolism reactions are involved in isoleucine fermentation (Fig 4, S3 Dataset). Isoleucine is an important energy substrate that is metabolized via oxidative Stickland fermentation to form ATP [17]. This finding not only supports the importance of energy metabolism in toxin production, but also highlights the nutritional flexibility of C. difficile to acquire energy from available resources.
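To make the idea of a transformation score concrete, the sketch below computes one plausible form of it: flux change in the desired direction, minus flux change in the wrong direction, divided by the drift in reactions meant to stay unchanged. This form is an assumption modeled on the general description of MTA above; the exact score used by mMTA is not given in this section and may differ (for example, it applies a significance threshold that is omitted here). All names and flux values are hypothetical.

```python
def transformation_score(v_ref, v_res, direction, unchanged):
    """One plausible transformation score for a single knockout simulation.

    v_ref, v_res: dict reaction ID -> flux in the reference state and in
                  the state obtained after the knockout.
    direction:    dict reaction ID -> +1 or -1, the desired direction of
                  change for each reaction in the 'changed' set.
    unchanged:    iterable of reaction IDs expected to keep their flux.
    Returns None when the unchanged set shows no drift (score undefined).
    """
    success = 0.0
    unsuccess = 0.0
    for rxn, sign in direction.items():
        delta = v_res[rxn] - v_ref[rxn]
        if delta * sign > 0:
            success += abs(delta)    # moved toward the target state
        else:
            unsuccess += abs(delta)  # moved away from it, or did not move
    drift = sum(abs(v_res[r] - v_ref[r]) for r in unchanged)
    if drift == 0:
        return None
    return (success - unsuccess) / drift

# Hypothetical fluxes for illustration only.
v_ref = {"R1": 1.0, "R2": 4.0, "R3": 2.0}
v_res = {"R1": 3.0, "R2": 5.0, "R3": 2.5}
score = transformation_score(v_ref, v_res, {"R1": +1, "R2": -1}, ["R3"])
```

With these inputs, R1 moves in the desired direction by 2, R2 moves the wrong way by 1, and the unchanged reaction R3 drifts by 0.5, so the score is (2 - 1) / 0.5 = 2.0. Ranking knockouts by such a score favors those that shift the changed set while leaving the rest of the network stable.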

Fig 4. Modified MTA identifies key reaction knockouts and pathways for transformation from a high to low toxin state.

The mMTA algorithm runs a reaction KO simulation to optimize changes in reaction flux that transform the model from the reference metabolic state (high toxin) to the target metabolic state (low toxin). The reaction knockouts with the highest transformation scores are shown on the y-axis. The reactions whose flux changed under these KO conditions are shown on the x-axis. Successfully changed reactions are defined as those whose flux changed from the reference in the desired direction by a minimum threshold of significance (successful: dark blue; unsuccessful: light grey). The metabolic pathways for these reactions are shown beneath the clustering dendrogram at the top.

Discussion

Research on C. difficile toxins has been extensive and wide-ranging, investigating biochemistry and structure, mechanisms of action, host and microbiome interactions, nutritional environments, genetic and metabolic regulation, and more [6,10–12,20,29]. Of these, the link between metabolism and regulation is particularly salient because of its therapeutic potential for virulence attenuation. Many in vitro studies have shown that C. difficile toxin production can be manipulated through its environment [13–16,18,20,30,31]. Additionally, multiple mechanisms of intracellular toxin regulation have been shown, such as cyclic di-GMP, carbon catabolite repression via ccpA, and nutritional limitation via codY [13,32,33]. However, while all of these studies have elucidated various facets of C. difficile toxin regulation under specific conditions, our understanding of the relationship between metabolism and toxin regulation remains fragmented. To address this, we conducted a systems analysis in which we interrogated an array of environmental conditions and toxin transcript levels with the goal of defining the metabolic state of C. difficile in these conditions.

The role of arginine in C. difficile toxin production is contradictory. Early investigation of C. difficile grown in minimal defined media supplemented with arginine found that increasing arginine concentrations resulted in decreased toxin production and enhanced growth [30,31]. Conversely, a study using phenotype microarrays (PM) found that arginine, and arginine dipeptides in particular, induce toxin production [25]. When we simulated growth under a subset of PM conditions, we did not find any correlations between metabolic flux and the PM toxin data (S8 Fig). The only model constraints in this simulation were growth limits rather than intracellular limits as for the RIPTiDe models; these constraints, combined with the differences in which strain was profiled with the PM experiments and which strains were modeled with the analysis presented here, may have contributed to this lack of correlation. Cecal metabolomics of mice infected with wild-type (inflammatory) and toxin-deficient (non-inflammatory) C. difficile strains show that metabolic pathways for arginine and ornithine in the gut microbial community are more active in the non-inflammatory state [34]. This same study also linked ornithine metabolism in C. difficile with reduced inflammation during CDI. In our analysis, an arginine/ornithine transport reaction was identified as an important differentiator between toxin states by Random Forest and is particularly active in low toxin states across strains (Fig 2B). Shadow price analysis found that this reaction was highly sensitive to fatty acids or large polymers primarily involved in cell wall synthesis (Fig 3B). Fatty acids have been shown to regulate arginine and ornithine metabolism in other organisms [35–37]. This potential regulatory interaction between fatty acids and arginine in C. difficile could explain the contradictory results for the impact of arginine and ornithine metabolism on toxin production.

Toxin production in C. difficile is clearly linked to its nutritional status. We used Random Forest to identify underlying metabolic patterns between toxin states; to better understand the intracellular metabolic switches necessary for transitioning from a high to a low toxin production state, we used mMTA. The results from mMTA give us two types of information: first, the metabolic reactions whose activity is important for this transition, and second, the metabolic reactions that can be modulated to induce these metabolic changes. We found that within the metabolic network model corresponding to C. difficile grown in BHIS with high toxin production, flux through 86% of the reactions identified as important for transitioning to a low toxin state could be modulated by at least one reaction knockout. With the Random Forest analysis we performed, we found that the 20 reactions with the highest Gini scores are heavily involved in energy metabolism (e.g., arginine/ornithine, glycine, glutamate, aspartate, sucrose, glucose, and N-acetyl-D-mannosamine) as well as nucleotide metabolism (e.g., guanosine and uridine). Similar patterns of important metabolic pathways are replicated in the mMTA results; reactions whose flux can be changed to mimic a low toxin state fall into carbohydrate, amino acid, and nucleotide metabolism categories (Fig 4). The mMTA results also predict that these reactions can be metabolically modulated via knockouts of specific reactions (as indicated on the y-axis in Fig 4). The reaction knockouts with the highest transformation scores were frequently key reactions in alternative energy-generating pathways such as carbon metabolism and Stickland fermentation of leucine and valine (Fig 4). When these reactions were knocked out, Stickland fermentation of isoleucine increased (S3 Dataset).

Isoleucine is another metabolite whose role in toxin production in C. difficile is unclear. Isoleucine activates CodY which represses toxin production (Fig 1) [38]; it is reasonable to hypothesize that conditions supplemented with isoleucine would have low toxin production. However, defined media experiments show the opposite effect. C. difficile (VPI 10463) grown in a minimal media supplemented with isoleucine demonstrated increased TcdA production [31]. Another study growing C. difficile (ATCC 9689) in phenotype microarrays found that isoleucine induced middle-level toxin production [25]. It may be that preferential fermentation of isoleucine throughout the exponential growth phase depletes stores of bioavailable isoleucine for CodY activation in stationary phase, resulting in CodY deactivation and increased toxin production. While our mMTA results show that isoleucine fermentation was maximized in the target low toxin state, perhaps the driving differential feature between states is carbohydrate metabolism (S7 Fig). In the reference state, increased glucose metabolism is likely driving the high toxin transcript levels [14]; in the target state, suppression of glucose metabolism is accompanied by an increase in isoleucine fermentation to maintain energy generation. In the short-term, this increased isoleucine fermentation likely results in increased uptake and availability of isoleucine which is sufficient for CodY activation and suppression of toxin transcription.

We used C. difficile GENREs and publicly available transcriptomic data for our analyses, applying RIPTiDe for data integration. RIPTiDe uses genetic evidence from the transcriptomic data as weights indicating the likelihood that a reaction will carry flux and to what extent [23]. It additionally prunes reactions for which there is no evidence or that do not pass a minimum flux threshold. This approach creates a contextualized model for a specific environmental and metabolic state. However, it is possible that this method may have inadvertently limited the results of our mMTA analysis. The flux bounds of reactions in a RIPTiDe model are set based on genetic evidence and are often quite restrictive because the goal of RIPTiDe is to describe the metabolic state of an organism in a specific condition. Therefore, any significant departure from the reference state flux necessary for a reaction to achieve a target state may not even be feasible due to preset flux bounds. However, we were able to successfully transform 86% of the reactions targeted for change, and therefore do not consider any potential limitation from application of RIPTiDe as having a significant impact on our mMTA results.

We additionally used RIPTiDe to sample the flux distributions for each model which we then used for our Random Forest analysis. The flux samples for each condition are highly correlated in part due to the RIPTiDe restrictions discussed above; this characteristic could lead to overfitting in a Random Forest model because the model would be able to learn what condition a sample is from and then use this information to infer the toxin level. To prevent this, we used a random stratified group sampling approach for splitting the data into train-test sets for Random Forest (Methods). This approach ensures first that there is an equal (or near-equal) ratio of low and high toxin conditions in the train and test sets and second that all the flux samples from a single RIPTiDe model are used in either the train or test set. The classifier had a mean accuracy of 95%, which suggests model overfitting despite the feature selection and sampling approaches we took. A closer look at the features with the highest Gini scores shows that there is a greater difference in flux between strains than in flux between toxin states (Fig 2). While the classifier is not learning what condition a sample is from, it may be learning what strain it is and then predicting the toxin state based on strain-specific criteria. This result highlights the importance of accounting for strain differences when interrogating toxin production in C. difficile.

The reactions from the Random Forest classifier with the highest Gini scores were analyzed in a shadow price analysis. The results of this analysis showed that arginine and ornithine transport as well as L-aspartate:fumarate oxidoreductase were highly sensitive to fatty acids and large polymers. However, two metabolites that also commonly occurred as limiting (shadow price < -5) were “Protein biosynthesis” and “Cell Wall Polymer” (Figs 3 and S5). These metabolites are not true biological metabolites but rather in silico substitutions for high-level cellular processes just upstream of the biomass reaction within the model. While the biomass reaction is not set as the objective function in the shadow pricing analysis, there may be underlying biases that drive flux towards biomass production. Regulation via these two “metabolites” may indicate an intracellular sensing mechanism or be used as proxy for cell status but it is also possible that they are merely modeling artefacts.

In conclusion, we performed a systems analysis of C. difficile metabolism under different growth conditions, paired with the associated toxin transcript level to define the relationship between metabolism and toxin production. These toxins are essential in establishing a nutritional niche for C. difficile and can cause extensive damage in the host colon. CDI is most effectively resolved through fecal microbiota transplants (FMTs) [39]; however, FMTs are typically only prescribed for patients with severe or recurrent cases of CDI [39]. Using microbial engineering to design probiotic communities that can be offered as a non-invasive CDI therapeutic is a major advancement already underway within the field [40,41]. An important step in designing these therapeutic communities is identifying reactions or pathways associated with high and low toxin production and understanding how those reactions change as a function of the environment, resulting in specific toxin-associated phenotypes. Future research directions include accounting for the relationship between regulatory networks and metabolism in C. difficile toxin production, as we know that toxin responses are in part the effect of global regulators. Additionally, modeling the effect of toxin activity once it is released from C. difficile could also help guide selection of members of a microbial community. Research in these areas will provide foundational understanding of C. difficile biology that will enable intentional and specific therapeutic community design.

Materials and methods

Processing the RNA-sequencing data

We compiled transcriptome count matrices from seven publicly available RNA-sequencing studies of C. difficile covering two strains (630 and R20291) and 16 unique conditions (S1 Table). Raw count matrices were normalized using the reads per million (RPM) formula

$$\mathrm{RPM} = \frac{\text{Number of reads mapped to gene} \times 10^{6}}{\text{Total number of mapped reads}}.$$

RPM normalized toxin gene transcripts from all studies were grouped by condition, averaged, and binned into high and low categories by tcdA medians (low < median < high) (S1 Fig). We used medians of tcdA rather than tcdB because tcdB had low to no expression across all conditions with the exception of the tryptone yeast conditions (TY and TY + Cysteine).
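The normalization and binning steps described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the counts are made-up values, and the handling of a value exactly equal to the median is an assumption, since the text only specifies low < median < high.

```python
from statistics import median

def rpm(counts):
    """Reads per million: reads mapped to a gene * 1e6 / total mapped reads."""
    total = sum(counts.values())
    return {gene: n * 1e6 / total for gene, n in counts.items()}

def bin_by_median(tcda_rpm_by_condition):
    """Label each condition 'low' or 'high' relative to the median tcdA RPM."""
    cutoff = median(tcda_rpm_by_condition.values())
    # Assumption: a value exactly at the median is labelled 'high'.
    return {cond: ("low" if value < cutoff else "high")
            for cond, value in tcda_rpm_by_condition.items()}

# Hypothetical raw counts for one sample (illustration only).
sample_counts = {"tcdA": 50, "tcdB": 0, "other_genes": 999_950}
sample_rpm = rpm(sample_counts)

# Hypothetical per-condition tcdA RPM averages (illustration only).
labels = bin_by_median(
    {"cond_A": 10.0, "cond_B": 40.0, "cond_C": 90.0, "cond_D": 200.0}
)
```

Here the sample has one million mapped reads in total, so 50 tcdA reads correspond to 50 RPM; the four hypothetical conditions have a median of 65 RPM, which places cond_A and cond_B in the low group and cond_C and cond_D in the high group.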

Genome-scale metabolic models

Previously published C. difficile genome-scale metabolic network reconstructions (GENREs) of strains CD630 and CDR20291 (referred to as iCdG709 and iCdR703, respectively) were used for all of the modeling analyses [21]. We created a total of 16 contextualized metabolic models of C. difficile using RIPTiDe and the processed transcriptomic read matrices from publicly available RNA-Seq studies (S1 Table) [23].

Random Forest analysis of flux samples

For each RIPTiDe model, we optimized for biomass production and then sampled (n = 500) flux distributions of the entire feasible steady-state solution space using RIPTiDe. We then randomly down-sampled to 100 flux samples per condition and performed a principal component analysis (PCA) of the sampled flux distributions for all the models.

To identify the most important reactions in differentiating between toxin production (low, high) we ran a Random Forest classifier with 500 trees on the down-sampled flux sampling data (100 flux samples per RIPTiDe model). We reduced the feature space by selecting features with a near-zero variance (NZV) > 0.005 and an absolute Pearson’s correlation coefficient < 0.8. We used random stratified group K-fold (k = 5) cross validation to check the classifier (S3 Fig). Using a stratified group k-fold to split the data into train and test sets ensures first that there is an equal (or near-equal) ratio of low and high toxin conditions in the train and test sets and second that all the flux samples from a single RIPTiDe model are used in either the train or test set. We used this approach to prevent the classifier from learning which RIPTiDe model the flux sample came from and substituting that information to infer toxin level; while there are many flux samples per RIPTiDe model, these flux samples tend to be highly correlated. For model predictions, the data was split using a random stratified group split as described for the cross validation. The classifier was then trained on 75% of the data and tested on the remaining 25%.
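The key property of the stratified group split is that whole models, not individual flux samples, are assigned to the train or test side. The pure-Python sketch below illustrates that idea for a single 75/25 hold-out; it is not the study's implementation. The function name, the model identifiers, and the 8/8 class layout are hypothetical. For the k-fold case, scikit-learn provides `StratifiedGroupKFold` in `sklearn.model_selection`.

```python
import random
from collections import defaultdict


def stratified_group_split(group_labels, test_fraction=0.25, seed=0):
    """Split group IDs into train/test while keeping the class ratio similar.

    group_labels maps a group ID (here, one contextualized model) to its class.
    Whole groups go to one side, so correlated samples never straddle the split.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for group, label in sorted(group_labels.items()):
        by_class[label].append(group)

    train, test = set(), set()
    for label, groups in by_class.items():
        rng.shuffle(groups)
        # Hold out at least one group per class.
        n_test = max(1, round(len(groups) * test_fraction))
        test.update(groups[:n_test])
        train.update(groups[n_test:])
    return train, test


# Hypothetical layout: 16 models, 8 per toxin class, 100 flux samples each.
group_labels = {f"model_{i:02d}": ("low" if i < 8 else "high") for i in range(16)}
samples = [(group, k) for group in group_labels for k in range(100)]

train_groups, test_groups = stratified_group_split(group_labels)
train_samples = [s for s in samples if s[0] in train_groups]
test_samples = [s for s in samples if s[0] in test_groups]
print(len(train_samples), len(test_samples))
```

Because the split is made on model identifiers first, a classifier evaluated on the test samples has never seen any flux sample from those models during training.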

Following classifier testing, we ranked the features by their Gini score and selected the top 20 most important features for model predictions. For these 20 reactions, we calculated the median flux value for each condition, normalized, and visualized using a heatmap (Fig 2B). We tracked the flow of flux through these reactions to create a human readable metabolic map (Fig 2A) as well as an Escher metabolic map [42] with the GENRE IDs for the reactions and metabolites (S4 Fig). To investigate metabolism of C. difficile that literature indicates can impact toxin production, we compiled lists of identifiers for reactions within a specific metabolic pathway. We did this analysis for three metabolic processes: Stickland fermentation, ATP production, and redox reactions. We filtered and processed the flux sampling data in the same way as for the Random Forest results.

Shadow pricing analysis

The 20 reactions with the highest Gini scores from the Random Forest analysis were iteratively set as the objective function for each RIPTiDe model. This model was then optimized and the corresponding shadow price for the FBA solution was saved. For each metabolite and each objective function, we calculated the median shadow price and range across all RIPTiDe models. We summarized the shadow pricing results for each objective function (OF) across all RIPTiDe models using the following metrics: fraction of RIPTiDe models able to carry flux for that OF, total number of metabolites that increase flux through the OF (median shadow price < -0.1, range < 2), total number of metabolites that decrease flux through the OF (median shadow price > 0, range < 2), total number of metabolites whose shadow price varied across RIPTiDe models (range > 2). We then plotted the metabolites with a median shadow price < -5 and a range < 2 for the 9 objective functions that had metabolites in this category (Figs 3B and S5).
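The summary categories above reduce to a median and a range per metabolite. The sketch below applies the stated thresholds to hypothetical shadow prices; the metabolite names and values are invented, and the "uncategorized" label for cases the thresholds do not cover (for example, a range of exactly 2) is an assumption. In COBRApy, the solution returned by `model.optimize()` exposes per-metabolite values through its `shadow_prices` attribute, which could supply the inputs.

```python
from statistics import median


def categorize(prices):
    """Classify one metabolite for one objective from its shadow prices across models."""
    med = median(prices)
    spread = max(prices) - min(prices)
    if spread > 2:
        return "variable"
    if spread < 2 and med < -0.1:
        return "increases flux"
    if spread < 2 and med > 0:
        return "decreases flux"
    # Assumption: anything not covered by the stated thresholds is left unlabeled.
    return "uncategorized"


# Hypothetical shadow prices for one objective function, one value per model.
shadow_prices = {
    "met_a": [-6.0, -5.5, -6.2],
    "met_b": [0.5, 0.4, 1.0],
    "met_c": [-4.0, 0.0, 3.0],
    "met_d": [0.0, 0.0, -0.05],
}

summary = {met: categorize(values) for met, values in shadow_prices.items()}
print(summary)
```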

Metabolic transformation algorithm (MTA)

The goal of MTA is to identify perturbations that transform a metabolic network from a reference state (e.g., diseased) to a target state (e.g., healthy) [24]. In our case, we used the MTA to find reaction knockouts that transform high toxin states into low toxin states in C. difficile. The generic MTA comprises four distinct steps, briefly: (1) calculate a flux solution for the reference state, vref, (2) identify reactions that are changed in the forward (RF) or backward (RB) direction and unchanged (RS) between the reference and target state, (3) solve the MIQP optimization problem formulated to minimize change in RS and maximize change in RF and RB, and (4) calculate a transformation score (TS) to quantify the success of each reaction knockout in transforming the reference state to the target state. To successfully apply MTA to our problem, we made changes to the original formulation at each step, resulting in a modified MTA (mMTA, described below) compatible with COBRApy tools.

Step 1: We created two contextualized metabolic models (reference and target) using RIPTiDe with gene expression data and then generated 500 flux samples using RIPTiDe [23]. Because the mean (or median) of the flux samples is not a mass-balanced solution, setting it as vref can lead to infeasible MIQP solutions downstream. Therefore, we used a Bray-Curtis non-parametric multidimensional scaling (NMDS) to reduce the flux samples to a two-dimensional space, then calculated the centroid of the flux sampling distribution, and finally calculated the point closest to the centroid and set this flux sample as vref (S6A Fig).
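Choosing an actual flux sample, rather than the mean, keeps vref mass-balanced. The sketch below shows only the final selection step, under the assumption that the two-dimensional ordination coordinates have already been computed elsewhere; the Bray-Curtis NMDS itself is not reproduced, and the coordinates and function name are hypothetical.

```python
import math


def closest_to_centroid(coords):
    """Return (index, centroid) for the embedded point nearest the centroid."""
    n = len(coords)
    dims = len(coords[0])
    centroid = tuple(sum(p[d] for p in coords) / n for d in range(dims))
    index = min(range(n), key=lambda i: math.dist(coords[i], centroid))
    return index, centroid


# Hypothetical 2-D ordination coordinates, one row per flux sample.
embedding = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.2, 0.9)]
index, centroid = closest_to_centroid(embedding)
print(index, centroid)
```

The returned index would then be used to look up the corresponding full flux vector, which serves as the reference solution.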

Step 2: We determine significantly changed and unchanged reactions by using a Mann-Whitney U test with a Bonferroni multiple tests correction. We categorize all reactions into three sets: statistically insignificant reactions (RS), and statistically significant reactions which must change in the forward (RF) or backward (RB) direction in order to match the target state. The threshold for statistical significance is an adjusted p-value < 0.05.
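The partition into RS, RF, and RB can be sketched as follows, assuming per-reaction p-values are already available (for example from `scipy.stats.mannwhitneyu`, which is not called here). Using the median flux of each state to decide direction is an assumption of this sketch, as are the reaction names and values.

```python
def classify_reactions(p_values, ref_median, target_median, alpha=0.05):
    """Partition reactions into unchanged (RS), forward (RF), and backward (RB) sets."""
    m = len(p_values)
    sets = {"RS": [], "RF": [], "RB": []}
    for rxn, p in p_values.items():
        p_adj = min(1.0, p * m)  # Bonferroni correction
        if p_adj >= alpha or target_median[rxn] == ref_median[rxn]:
            sets["RS"].append(rxn)
        elif target_median[rxn] > ref_median[rxn]:
            sets["RF"].append(rxn)  # flux must increase to match the target
        else:
            sets["RB"].append(rxn)  # flux must decrease to match the target
    return sets


# Hypothetical raw p-values and median fluxes for four reactions.
p_values = {"rxn1": 0.001, "rxn2": 0.02, "rxn3": 0.0001, "rxn4": 0.5}
ref_median = {"rxn1": 1.0, "rxn2": 2.0, "rxn3": 5.0, "rxn4": 0.3}
target_median = {"rxn1": 3.0, "rxn2": 2.5, "rxn3": 2.0, "rxn4": 0.4}

reaction_sets = classify_reactions(p_values, ref_median, target_median)
print(reaction_sets)
```

Note that rxn2 has a raw p-value below 0.05 but falls into RS once the correction for four tests is applied.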

Step 3: The goal of the MIQP problem is to minimize changes in flux for reactions in RS and maximize changes in flux for reactions in RF and RB in the intended direction. We implemented the MIQP formulation as it was set out for the generic MTA:

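\begin{align}
\min_{v,\,y}\quad & (1-\alpha)\sum_{i \in R_S}\left(v_i^{ref} - v_i\right)^2 + \frac{\alpha}{2}\sum_{i \in R_F} y_i + \frac{\alpha}{2}\sum_{i \in R_B} y_i \tag{1}\\
\text{s.t.}\quad & S v = 0 \tag{2}\\
& v^{min} \le v \le v^{max} \tag{3}\\
& v_j = 0 \tag{4}\\
& v_i - y_i^F\left(v_i^{ref} + \varepsilon_i\right) - y_i v_i^{min} \ge 0, \quad i \in R_F \tag{5}\\
& y_i^F + y_i = 1, \quad i \in R_F \tag{6}\\
& v_i - y_i^B\left(v_i^{ref} - \varepsilon_i\right) - y_i v_i^{max} \le 0, \quad i \in R_B \tag{7}\\
& y_i^B + y_i = 1, \quad i \in R_B \tag{8}\\
& y_i,\, y_i^F,\, y_i^B \in \{0,1\} \tag{9}
\end{align}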

Mass balance constraints, thermodynamic constraints, and the reaction knockout perturbation are enforced in equations 2, 3, and 4, respectively. The demands for changed reactions are represented in equations 5–8, such that the Boolean variables $y_i$, $y_i^F$, and $y_i^B$ indicate whether a forward reaction (RF) either increases by more than ε with respect to vref or maintains a preset flux minimum and whether a backward reaction (RB) either decreases by more than ε with respect to vref or maintains a preset flux maximum. ε is the vector of thresholds used to determine if flux changes are statistically significant (p < 0.05) and was calculated using a one-sided t-test with a 95% confidence interval, such that $\varepsilon = t\,\frac{S}{\sqrt{n}}$, where t is the critical value, S is the standard deviation, and n is the number of flux samples. We set α = 0.66 as in the original formulation.
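As a worked illustration of the threshold, the sketch below assumes ε is the critical value multiplied by the standard error of the flux samples, that is, t times S divided by the square root of n. The flux values are hypothetical. When no critical value is supplied, the code substitutes the standard normal quantile, which is only a large-sample approximation of the t critical value; an exact value could be obtained from `scipy.stats.t.ppf`.

```python
from math import sqrt
from statistics import NormalDist, stdev


def epsilon(flux_samples, critical_value=None, confidence=0.95):
    """Significance threshold for one reaction: critical value times standard error."""
    n = len(flux_samples)
    if critical_value is None:
        # Assumption: normal quantile as a large-sample stand-in for the t critical value.
        critical_value = NormalDist().inv_cdf(confidence)
    return critical_value * stdev(flux_samples) / sqrt(n)


# Hypothetical flux samples for a single reaction.
samples = [1.0, 2.0, 3.0, 4.0, 5.0]

# One-sided 95% t critical value for 4 degrees of freedom is about 2.132.
eps_t = epsilon(samples, critical_value=2.132)
eps_normal = epsilon(samples)
print(eps_t, eps_normal)
```

With hundreds of flux samples per model, the normal and t critical values differ only in the third decimal place, so the approximation has little effect on the threshold.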

Step 4: We categorize forward and backward reactions as successful if $v_{R_F} > (v_{R_F}^{ref} + \varepsilon)$ or if $v_{R_B} < (v_{R_B}^{ref} - \varepsilon)$, respectively. Next, to quantify how well each reaction knockout transformed the reference state to the target state, we calculated the transformation score (TS) as formulated for the generic MTA:

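$$TS = \frac{\sum_{i \in R_{success}} \left|v_i^{ref} - v_i^{res}\right| - \sum_{i \in R_{unsuccess}} \left|v_i^{ref} - v_i^{res}\right|}{\sum_{i \in R_S} \left|v_i^{ref} - v_i^{res}\right|}$$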

Ranking reaction knockouts by the TS provides a helpful metric for evaluating the flux solution from each knockout, whereas ranking by the MIQP objective value would treat reaction knockouts with varying degrees of transformational success as equivalent (S6B Fig).
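The transformation score can be computed directly from the reference flux vector, the flux vector resulting from a knockout, and the three reaction sets. The sketch below is a minimal illustration with hypothetical reaction names and fluxes. It assumes that any forward or backward reaction failing the success criterion counts as unsuccessful, and it raises an error when the unchanged reactions show no drift at all, since the score is then undefined.

```python
def transformation_score(v_ref, v_res, eps, forward, backward, unchanged):
    """Net successful change in RF and RB, relative to drift in RS."""
    success = 0.0
    unsuccess = 0.0
    for rxn in forward:
        change = abs(v_ref[rxn] - v_res[rxn])
        if v_res[rxn] > v_ref[rxn] + eps[rxn]:
            success += change
        else:
            unsuccess += change
    for rxn in backward:
        change = abs(v_ref[rxn] - v_res[rxn])
        if v_res[rxn] < v_ref[rxn] - eps[rxn]:
            success += change
        else:
            unsuccess += change
    drift = sum(abs(v_ref[rxn] - v_res[rxn]) for rxn in unchanged)
    if drift == 0:
        raise ValueError("score undefined: unchanged reactions did not move")
    return (success - unsuccess) / drift


# Hypothetical fluxes for one knockout.
v_ref = {"f1": 1.0, "f2": 2.0, "b1": 5.0, "s1": 3.0, "s2": 4.0}
v_res = {"f1": 3.0, "f2": 2.25, "b1": 2.0, "s1": 3.5, "s2": 3.5}
eps = {rxn: 0.5 for rxn in v_ref}

ts = transformation_score(v_ref, v_res, eps, ["f1", "f2"], ["b1"], ["s1", "s2"])
print(ts)  # 4.75
```

In this example f1 and b1 move past their thresholds in the intended direction, f2 moves too little to count, and the unchanged reactions drift by a total of 1.0, giving (5.0 - 0.25) / 1.0.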

Phenotype Microarray (PM) data integration and analysis

We used public data from a PM study that measured toxin production of C. difficile type strain ATCC 9689 when grown in each condition [25]. The toxin concentrations from this study were calculated by comparing the amount of dye reduction in cell cytotoxicity assays to a standard curve of toxin concentrations [25]. The authors defined toxins as low (<42 ng/uL), mid (42–420 ng/uL), or high (>420 ng/uL) and we used the same categories in this analysis. The dataset from this PM study included 652 unique growth conditions. The GENREs iCdG709 and iCdR703 contain 171 unique extracellular metabolites, 65 of which overlapped with metabolites from the PM dataset (S8A Fig). We constrained the GENREs to minimal media conditions and iteratively added one of the 65 overlapping metabolites and simulated flux while optimizing for biomass. We normalized the flux sampling data using min-max normalization and then removed reactions with variance < 0.05. This step trimmed the flux data from 1323 reactions to 67 reactions. Next, we calculated the Pearson’s correlation between each reaction flux vector and the PM toxin data. None of the reactions were correlated with toxin production. Finally, we visualized the absolute flux data for each of the 65 simulated PM conditions (S8C Fig).
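The filtering and correlation steps can be sketched as below. Several choices here are assumptions, because the text does not specify them: min-max scaling is applied per reaction rather than over the whole matrix, the population variance is used for the filter, and the reaction names, fluxes, and toxin values are hypothetical.

```python
from math import sqrt
from statistics import pvariance


def min_max(values):
    """Scale values to [0, 1]; a constant vector maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical fluxes (one value per simulated condition) and measured toxin levels.
flux = {
    "rxn_a": [0.0, 5.0, 10.0, 5.0],
    "rxn_c": [2.0, 2.0, 2.0, 2.0],
}
toxin = [10.0, 50.0, 90.0, 500.0]

normalized = {rxn: min_max(values) for rxn, values in flux.items()}
# Keep reactions whose normalized flux varies enough across conditions.
kept = [rxn for rxn, values in normalized.items() if pvariance(values) >= 0.05]
correlations = {rxn: pearson(normalized[rxn], toxin) for rxn in kept}
print(kept, correlations)
```

Whether scaling is done per reaction or globally changes which reactions pass the variance filter, so that choice should be checked against the published code before reusing this pattern.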

Supporting information

S1 Table. Public RNA-sequencing datasets.

BHIS(G): Brain-Heart Infusion Supplemented (Glucose), Cd: C. difficile, CDMM: C. difficile Minimal Media, DCA: Deoxycholate, DMSO: Dimethyl Sulfoxide, GEO ID: Gene Expression Omnibus Identifier, TY: Tryptone Yeast. Alternate identifiers for RIPTiDe models with similar growth conditions are indicated in parentheses in the Growth Condition column when applicable; these identifiers are used for all analyses.

(DOCX)

S1 Fig. Toxin transcript counts across conditions.

Toxin transcript counts quantified as reads per million (RPM) are shown for all conditions included in the study (see S1 Table for more details). Conditions were binned based on median tcdA transcript levels across all conditions and labeled as low (< median) or high (> median).

(TIFF)

S2 Fig. PCA of flux sampling of RIPTiDe-contextualized iCdG709 and iCdR703 models.

(A) Summary table of the RIPTiDe-contextualized models including the strain, toxin production level, and number of genes, reactions, and metabolites. (B) The iCdG709 (CD630, light purple) and iCdR703 (CDR20291, dark purple) C. difficile models were contextualized with transcriptomic data (S1 Table) and flux distributions were sampled (n = 500) using RIPTiDe. The flux sampling for each model was randomly down-sampled to 100 flux samples and PCA was performed for all the models together (B) and by strain (C–F).

(TIFF)

S3 Fig. Random Forest validation metrics.

(A) Visualization of the random stratified group k-fold splits used for cross validation of the Random Forest classifier. (B-C) K-fold cross validation (k = 5) of the Random Forest classifier testing ROC (B) and accuracy (C), with an average accuracy of 95% in cross validation. (D) Confusion matrix for model predictions with train and test sets selected in a 75–25 ratio using random stratified group splits. The model trained on this set had a 97% accuracy. (E) The top 20 features for model predictions by Gini score.

(TIFF)

S4 Fig. Escher metabolic maps.

Metabolic context for reactions from the Random Forest analysis labeled with the reaction and model IDs from the GENREs iCdG709 and iCdR703.

(TIFF)

S5 Fig. Shadow prices of metabolites that decrease flux through reactions from Random Forest.

For each objective function (OF) listed in Fig 3A, the metabolites categorized as decreasing and with a shadow price < -5 are shown.

(TIFF)

S6 Fig. MTA calculations for centroids and transformation scores (TS).

(A) Bray-Curtis NMDS of flux sampling results for iCdG709 contextualized for BHIS + DCA 240 uM (target, low toxin, light teal) and BHIS (reference, high toxin, dark teal) was used to calculate the centroids (red) and the flux sample closest to the centroid (orange) for each model. (B) The MIQP objective value versus the TS demonstrates the utility of the TS in ranking flux solutions with similar objective values based on the success of the flux solution in transforming reactions to the target state. (C) Successfully changed reactions for each reaction knockout. Successful (dark blue), unsuccessful (light blue).

(TIFF)

S7 Fig. Comparison of metabolic flux through reactions in the Reference and Target state.

(TIFF)

S8 Fig. Phenotype microarray (PM) simulation and analysis.

(A) Venn diagram showing the overlap of unique metabolites from the PM dataset and the extracellular metabolites from the GENREs. (B) The toxin concentration distribution for the 65 overlapping growth conditions from panel (A). (C) Simulated reaction flux through each in silico PM condition (n = 65). The flux data was min-max normalized and reactions with flux variance across all conditions < 0.05 were removed and the absolute flux value of the remaining reactions was visualized. The PM growth conditions are sorted by their toxin category. Toxin categories were defined as low (<42 ng/uL), mid (42–420 ng/uL), and high (>420 ng/uL) as in Lei, XH and Bochner, BR (2013).

(TIFF)

S1 Dataset. RIPTiDe model flux sampling data.

Down-sampled flux data (n = 100 samples per RIPTiDe model), with the first three columns set as sample descriptors (condition (RIPTiDe model), strain, and toxin category).

(CSV)

S2 Dataset. Shadow pricing data.

Metabolite shadow prices with the first five columns set as simulation descriptors: condition (RIPTiDe model), strain (CD630 or CDR20291), toxin (low or high), OF (reaction ID for objective function), and OF_name (name of OF).

(CSV)

S3 Dataset. MTA knockout flux data.

Flux data for each reaction knockout (columns) with the first two columns showing the flux data for the Target and Control (Reference).

(CSV)

Data Availability

Publicly available gene expression data from the GEO database were downloaded and used for this study; the GEO IDs are listed in S1 Table. The flux sampling, shadow pricing, and mMTA data are shared in S1, S2, and S3 Datasets, respectively. The code used for this study is available on GitHub at https://github.com/dap5mb/cdToxinAnalysis.

Funding Statement

The authors received funding from the following NIH grants: R01-AT010253 (MLJ, GLK, JAP) and T32-AI007046 (DAP). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Bouwknegt M, van Dorp S, Kuijper E. Burden of Clostridium difficile infection in the United States. N Engl J Med. 2015 Jun 11;372(24):2368. doi: 10.1056/NEJMc1505190
2. Zhang S, Palazuelos-Munoz S, Balsells EM, Nair H, Chit A, Kyaw MH. Cost of hospital management of Clostridium difficile infection in United States—a meta-analysis and modelling study. BMC Infect Dis. 2016 Dec;16(1):447. doi: 10.1186/s12879-016-1786-6
3. Ross CL, Spinler JK, Savidge TC. Structural and functional changes within the gut microbiota and susceptibility to Clostridium difficile infection. Anaerobe. 2016 Oct;41:37–43. doi: 10.1016/j.anaerobe.2016.05.006
4. Olsen MA, Yan Y, Reske KA, Zilberberg MD, Dubberke ER. Recurrent Clostridium difficile infection is associated with increased mortality. Clin Microbiol Infect. 2015 Feb;21(2):164–70. doi: 10.1016/j.cmi.2014.08.017
5. Chandrasekaran R, Lacy DB. The role of toxins in Clostridium difficile infection. FEMS Microbiol Rev. 2017 Nov 1;41(6):723–50. doi: 10.1093/femsre/fux048
6. Voth DE, Ballard JD. Clostridium difficile toxins: mechanism of action and role in disease. Clin Microbiol Rev. 2005 Apr;18(2):247–63. doi: 10.1128/CMR.18.2.247-263.2005
7. Vedantam G, Clark A, Chu M, McQuade R, Mallozzi M, Viswanathan VK. Clostridium difficile infection: toxins and non-toxin virulence factors, and their contributions to disease establishment and host response. Gut Microbes. 2012 Apr;3(2):121–34. doi: 10.4161/gmic.19399
8. D’Auria KM, Donato GM, Gray MC, Kolling GL, Warren CA, Cave LM, et al. Systems analysis of the transcriptional response of human ileocecal epithelial cells to Clostridium difficile toxins and effects on cell cycle control. BMC Syst Biol. 2012 Jan 6;6:2. doi: 10.1186/1752-0509-6-2
9. D’Auria KM, Kolling GL, Donato GM, Warren CA, Gray MC, Hewlett EL, et al. In vivo physiological and transcriptional profiling reveals host responses to Clostridium difficile toxin A and toxin B. Infect Immun. 2013 Oct;81(10):3814–24. doi: 10.1128/IAI.00869-13
10. D’Auria KM, Bloom MJ, Reyes Y, Gray MC, van Opstal EJ, Papin JA, et al. High temporal resolution of glucosyltransferase dependent and independent effects of Clostridium difficile toxins across multiple cell types. BMC Microbiol. 2015 Feb 4;15:7. doi: 10.1186/s12866-015-0361-4
11. Aktories K, Schwan C, Jank T. Clostridium difficile Toxin Biology. Annu Rev Microbiol. 2017 Sep 8;71:281–307. doi: 10.1146/annurev-micro-090816-093458
12. Martin-Verstraete I, Peltier J, Dupuy B. The Regulatory Networks That Control Clostridium difficile Toxin Synthesis. Toxins (Basel). 2016 May 14;8(5):E153. doi: 10.3390/toxins8050153
13. Antunes A, Martin-Verstraete I, Dupuy B. CcpA-mediated repression of Clostridium difficile toxin gene expression: C. difficile toxin regulation by CcpA. Molecular Microbiology. 2011 Feb;79(4):882–99.
14. Hofmann JD, Biedendieck R, Michel AM, Schomburg D, Jahn D, Neumann-Schaal M. Influence of L-lactate and low glucose concentrations on the metabolism and the toxin formation of Clostridioides difficile. PLoS One. 2021;16(1):e0244988. doi: 10.1371/journal.pone.0244988
15. Karlsson S, Burman LG, Åkerlund T. Suppression of toxin production in Clostridium difficile VPI 10463 by amino acids. Microbiology (Reading). 1999 Jul;145 (Pt 7):1683–93. doi: 10.1099/13500872-145-7-1683
16. Karlsson S, Lindberg A, Norin E, Burman LG, Akerlund T. Toxins, butyric acid, and other short-chain fatty acids are coordinately expressed and down-regulated by cysteine in Clostridium difficile. Infect Immun. 2000 Oct;68(10):5881–8. doi: 10.1128/IAI.68.10.5881-5888.2000
17. Neumann-Schaal M, Jahn D, Schmidt-Hohagen K. Metabolism the Difficile Way: The Key to the Success of the Pathogen Clostridioides difficile. Front Microbiol. 2019 Feb 15;10:219. doi: 10.3389/fmicb.2019.00219
18. Yamakawa K, Kamiya S, Meng XQ, Karasawa T, Nakamura S. Toxin production by Clostridium difficile in a defined medium with limited amino acids. J Med Microbiol. 1994 Nov;41(5):319–23. doi: 10.1099/00222615-41-5-319
19. Bouillaut L, Self WT, Sonenshein AL. Proline-dependent regulation of Clostridium difficile Stickland metabolism. J Bacteriol. 2013 Feb;195(4):844–54. doi: 10.1128/JB.01492-12
20. Carlucci C, Jones CS, Oliphant K, Yen S, Daigneault M, Carriero C, et al. Effects of defined gut microbial ecosystem components on virulence determinants of Clostridioides difficile. Sci Rep. 2019 Jan 29;9(1):885. doi: 10.1038/s41598-018-37547-x
21. Jenior ML, Leslie JL, Powers DA, Garrett EM, Walker KA, Dickenson ME, et al. Novel Drivers of Virulence in Clostridioides difficile Identified via Context-Specific Metabolic Network Analysis. mSystems. 2021 Oct 26;6(5):e0091921. doi: 10.1128/mSystems.00919-21
22. Oberhardt MA, Chavali AK, Papin JA. Flux Balance Analysis: Interrogating Genome-Scale Metabolic Networks. In: Maly IV, editor. Systems Biology [Internet]. Totowa, NJ: Humana Press; 2009 [cited 2022 Oct 21]. p. 61–80. (Methods in Molecular Biology; vol. 500). Available from: http://link.springer.com/10.1007/978-1-59745-525-1_3
23. Jenior ML, Moutinho TJ, Dougherty BV, Papin JA. Transcriptome-guided parsimonious flux analysis improves predictions with metabolic networks in complex environments. Lewis NE, editor. PLoS Comput Biol. 2020 Apr 16;16(4):e1007099. doi: 10.1371/journal.pcbi.1007099
24. Yizhak K, Gabay O, Cohen H, Ruppin E. Model-based identification of drug targets that revert disrupted metabolism and its application to ageing. Nat Commun. 2013 Dec;4(1):2632. doi: 10.1038/ncomms3632
25. Lei XH, Bochner BR. Using Phenotype MicroArrays to Determine Culture Conditions That Induce or Repress Toxin Production by Clostridium difficile and Other Microorganisms. Popoff MR, editor. PLoS ONE. 2013 Feb 20;8(2):e56545. doi: 10.1371/journal.pone.0056545
26. Dubois T, Dancer-Thibonnier M, Monot M, Hamiot A, Bouillaut L, Soutourina O, et al. Control of Clostridium difficile Physiopathology in Response to Cysteine Availability. Infect Immun. 2016 Aug;84(8):2389–405. doi: 10.1128/IAI.00121-16
27. Reznik E, Mehta P, Segrè D. Flux Imbalance Analysis and the Sensitivity of Cellular Growth to Changes in Metabolite Pools. Maranas CD, editor. PLoS Comput Biol. 2013 Aug 29;9(8):e1003195. doi: 10.1371/journal.pcbi.1003195
28. Tervo CJ, Reed JL. Expanding Metabolic Engineering Algorithms Using Feasible Space and Shadow Price Constraint Modules. Metab Eng Commun. 2014 Dec 1;1:1–11. doi: 10.1016/j.meteno.2014.06.001
29. Fletcher JR, Pike CM, Parsons RJ, Rivera AJ, Foley MH, McLaren MR, et al. Clostridioides difficile exploits toxin-mediated inflammation to alter the host nutritional landscape and exclude competitors from the gut microbiota. Nat Commun. 2021 Dec;12(1):462. doi: 10.1038/s41467-020-20746-4
30. Karasawa T, Maegawa T, Nojiri T, Yamakawa K, Nakamura S. Effect of Arginine on Toxin Production by Clostridium difficile in Defined Medium. Microbiology and Immunology. 1997 Aug;41(8):581–5.
31. Ikeda D, Karasawa T, Yamakawa K, Tanaka R, Namiki M, Nakamura S. Effect of Isoleucine on Toxin Production by Clostridium difficile in a Defined Medium. Zentralblatt für Bakteriologie. 1998 May;287(4):375–86. doi: 10.1016/s0934-8840(98)80174-6
32. McKee RW, Mangalea MR, Purcell EB, Borchardt EK, Tamayo R. The second messenger cyclic Di-GMP regulates Clostridium difficile toxin production by controlling expression of sigD. J Bacteriol. 2013 Nov;195(22):5174–85. doi: 10.1128/JB.00501-13
33. Dineen SS, McBride SM, Sonenshein AL. Integration of metabolism and virulence by Clostridium difficile CodY. J Bacteriol. 2010 Oct;192(20):5350–62. doi: 10.1128/JB.00341-10
34. Pruss KM, Enam F, Battaglioli E, DeFeo M, Diaz OR, Higginbottom SK, et al. Oxidative ornithine metabolism supports non-inflammatory C. difficile colonization. Nat Metab. 2022 Jan;4(1):19–28.
35. Guelzim N, Mariotti F, Lasserre F, Mathé V, Azzout D, Pineau T, et al. Regulation of arginine metabolism by dietary fatty acids: involvement of PPARα. Proc Nutr Soc. 2008 May;67(OCE5):E203.
36. Li S, Zhang Y, Liu N, Chen J, Guo L, Dai Z, et al. Dietary L-arginine supplementation reduces lipid accretion by regulating fatty acid metabolism in Nile tilapia (Oreochromis niloticus). J Animal Sci Biotechnol. 2020 Dec;11(1):82. doi: 10.1186/s40104-020-00486-7
37. Jobgen WS, Fried SK, Fu WJ, Meininger CJ, Wu G. Regulatory role for the arginine-nitric oxide pathway in metabolism of energy substrates. J Nutr Biochem. 2006 Sep;17(9):571–88. doi: 10.1016/j.jnutbio.2005.12.001
38. Sonenshein AL. CodY, a global regulator of stationary phase and virulence in Gram-positive bacteria. Current Opinion in Microbiology. 2005 Apr;8(2):203–7. doi: 10.1016/j.mib.2005.01.001
39. Baunwall SMD, Andreasen SE, Hansen MM, Kelsen J, Høyer KL, Rågård N, et al. Faecal microbiota transplantation for first or second Clostridioides difficile infection (EarlyFMT): a randomised, double-blind, placebo-controlled trial. The Lancet Gastroenterology & Hepatology. 2022 Dec;7(12):1083–91.
40. Smith AB, Jenior ML, Keenan O, Hart JL, Specker J, Abbas A, et al. Enterococci enhance Clostridioides difficile pathogenesis. Nature. 2022 Nov 24;611(7937):780–6. doi: 10.1038/s41586-022-05438-x
41. Bohmann N, Wilmanski T, Levy L, Lampe JW, Gurry T, Rappaport N, et al. Microbial community-scale metabolic modeling predicts personalized short-chain-fatty-acid production profiles in the human gut [Internet]. Systems Biology; 2023 Mar [cited 2023 Mar 7]. Available from: http://biorxiv.org/lookup/doi/10.1101/2023.02.28.530516
42. King ZA, Dräger A, Ebrahim A, Sonnenschein N, Lewis NE, Palsson BO. Escher: A Web Application for Building, Sharing, and Embedding Data-Rich Visualizations of Biological Pathways. Gardner PP, editor. PLoS Comput Biol. 2015 Aug 27;11(8):e1004321. doi: 10.1371/journal.pcbi.1004321
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1011076.r001

Decision Letter 0

Kiran Raosaheb Patil

30 Jan 2023

Dear Professor Papin,

Thank you very much for submitting your manuscript "Network analysis of toxin production in Clostridioides difficile identifies key metabolic dependencies" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

We especially recommend you to comprehensively address the following issues: a) ML overfitting, b) empty github repository, and c) integration of phenotypic data.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Kiran Patil

Section Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Summary of Research:

The authors present a computational analysis of metabolic regulation of toxin (TcdA and TcdB) production in C. difficile. They integrate public transcriptomics data to develop 16 context specific metabolic models, using the RIPTiDe algorithm and C. difficile metabolic models. These base models and algorithm were previously developed by several of the same authors. They analyze the models using a combination of flux sampling, machine learning (random forest), and shadow prices to identify patterns associated with toxin production. They also implement the metabolic transformation algorithm to identify reaction knockouts that drive metabolic flux towards a low-toxin state. The discussion of the results focuses on arginine and ornithine transporters and isoleucine. Overall, the scope and motivation of the work is interesting. The conceptual model is set up well. The analysis and interpretation of the results should be explained more clearly. Specific major and minor comments are provided below.

Major Comments:

1. I am concerned with overfitting in the machine learning results. The total number of samples is quite small (16). I understand that there are additional samples created through the flux sampling process, but these will likely be highly correlated. The model can likely learn, from these correlated samples, which condition a flux sample comes from and use that information to infer the toxin level. Therefore, the degree to which the results correspond to microbial physiology in toxin production vs condition specific physiology is not clear. This should be addressed in the discussion.

2. To further address overfitting to condition in the results, I would also suggest that the authors implement a cross-validation scheme where flux samples from the same condition are not mixed across the train and validate sets. To do this you can randomly select a set of conditions (of the 16) to assign to the training set (repeating random selections for statistics). The random selection could be stratified to ensure there are always some high and some low toxin samples. Then use all the flux samples from the randomly selected conditions in the training set and the remaining samples in the validate sets. Cross-validating in this way will give insight into whether the ML model can generalize across conditions and should be presented alongside the current results.

3. Another suggestion for the machine learning analysis is to reduce feature correlations. ML algorithms, and in particular feature importance calculations, are typically more stable when the input features are not correlated. Before implementing the ML the authors can clean up the input flux features in two ways. First, reduce the number of features by getting rid of any flux that has variability across all samples below some specified threshold (these features contain little to no information). Second, cluster features with covariance across all samples above some threshold into one feature (one of these fluxes can be used as a representative feature for the machine learning). There are often linear pathways of reactions with highly correlated flux across samples that can be reduced in this way. Reducing the feature space could make the results more stable and easier to interpret. If any feature engineering of this sort has already been implemented, the authors should include the details in the methods section.

4. The authors could further discuss the relationship between the ML analysis and the MTA analysis. What are the differences in the types of results you expect to find with these two approaches? How do the results from the two approaches complement or support each other?
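The condition-level split suggested in major comment 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy condition IDs, the 8/8 label balance, and the 0.75 training fraction are all assumptions made for the example.

```python
import random
from collections import defaultdict


def split_conditions(condition_labels, train_fraction=0.75, seed=0):
    """Split condition IDs into train/validate sets, stratified by toxin label.

    condition_labels maps a condition ID to its toxin label ("high" or "low").
    Whole conditions are assigned to one side, never individual flux samples.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for condition, label in sorted(condition_labels.items()):
        by_label[label].append(condition)

    train, validate = set(), set()
    for label, conditions in by_label.items():
        if len(conditions) < 2:
            raise ValueError(f"need at least two conditions labeled {label!r}")
        rng.shuffle(conditions)
        # Keep at least one condition of each label on both sides of the split.
        n_train = round(train_fraction * len(conditions))
        n_train = min(max(n_train, 1), len(conditions) - 1)
        train.update(conditions[:n_train])
        validate.update(conditions[n_train:])
    return train, validate


def split_samples(samples, train_conditions):
    """Route each (condition, flux_vector) sample by its condition only."""
    train = [s for s in samples if s[0] in train_conditions]
    validate = [s for s in samples if s[0] not in train_conditions]
    return train, validate


# Hypothetical toy data: 16 conditions (8 low, 8 high), 3 flux samples each.
labels = {f"cond{i}": ("low" if i < 8 else "high") for i in range(16)}
samples = [(c, [0.0, 1.0]) for c in labels for _ in range(3)]

train_conditions, validate_conditions = split_conditions(labels, seed=1)
train, validate = split_samples(samples, train_conditions)
```

Repeating the call with different seeds gives the repeated random selections mentioned in the comment. Because the split is made on condition IDs, no flux sample from a validation condition can leak into training.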

Minor Comments:

1. Line 11. Consider changing “its toxins” to “two toxic proteins”. Defining TcdA,B more specifically as proteins (as opposed to metabolites) will help non-expert readers quickly understand the context of the modeling work.

2. Line 51-54. This is not a comment for this paper, but I was curious if the glucosylating activity of TcdA,B is a metabolic process that could be modeled. Is that included in the genome-scale metabolic models? Just a thought on another possibly interesting line of inquiry.

3. Line 81-83. It would be nice to include in the introduction a bit more description of the transcriptomic data that was used. Is this all from one study, multiple studies? Is there any motivation behind the different conditions that were included? Maybe just a reference to Supplemental Table 1 would suffice.

4. Line 98-99. Is there any reason for choosing tcdA over tcdB? Are tcdA and tcdB correlated across conditions?

5. Line 107-108. Please provide additional description of the flux sampling here or point to this description in the methods. Was this sampling of the entire feasible steady-state space, was biomass production optimized and used as a constraint?

6. Paragraph 111-121. The machine learning approach should be described in more detail. How many sampled flux distributions were used for each condition? What was the train/validate/test split procedure? What was the accuracy of the ML model on the test (or validation) set relative to a null distribution? How was feature importance extracted from the random forest model? This information is important for interpreting the results so it should be presented to some degree in the results section.

7. Line 122. Figure 2. It may be nice to highlight the arginine and ornithine transport related reactions. The figure in part B does a nice job of explaining the context of the reactions from the ML importance results in part A. Maybe highlight the relevant reactions in A with bold or a different colored font and link them to the matching reaction in part B with a superscript.

8. Paragraph 130-141. It would be nice to have more discussion of why certain reactions have many sensitive metabolites while others do not. How could this arise and what are the implications? In general, what are the implications of a metabolite having a strong shadow price for an important reaction? I was surprised to see that arginine and ornithine do not come up in the shadow price analysis, as I would naively think that they would limit the transport reactions. The authors could expand on this in the results section here or in the discussion.

9. Line 145. The blue column seems to be a fraction of models not the number of models.

10. Line 143. Figure 3b. It would be good to include the ID of the metabolites or some other more specific name.

11. Paragraph 155-177. The MTA section seems to be only weakly connected to the previous results from the ML section. Any efforts to link the two results in the discussion would be appreciated. (See major comment 4)

12. Line 183. Include the value of epsilon in the caption.

13. Paragraph 240-250. Good discussion of implications of the RIPTiDe algorithm. Additional discussion should be added regarding the limitations of GENRES and the other analyses utilized here.

14. Line 455. Sup Fig S2. An additional PCA plot with high and low toxin as the colors would be good to include here.

15. Lines 488-489. A numbering system is mentioned in the figure caption, but I do not see any use of that numbering system in the table. Maybe it would be clearer to include the RIPTiDe model reference in the table.

16. Line 348. The github repository that is linked seems to be empty at this time.

Reviewer #2: This is a review of “Network analysis of toxin production in Clostridioides difficile identifies key metabolic dependencies” by Powers and colleagues. This paper focuses on Clostridium difficile, a notorious opportunistic pathogen which has a diverse metabolism that enables it to establish a niche within the complex gut environment. Its pathogenesis is primarily mediated by toxins TcdA and TcdB. This paper successfully demonstrates how toxin production is regulated by the organism's metabolism, a long-standing question in the field. The study employs a systems biology-based workflow, utilizing genome-scale metabolic models, to reveal how different extracellular environments can affect the regulation of toxin production and how this relates to changes in intracellular metabolism.

This is a strong paper, especially because of its innovative use of publicly available transcriptomic data from various studies to provide an extracellular context for the genome-scale metabolic models. Defining the low and high toxin states from transcriptomic data gives these states context when performing metabolic modelling with genome-scale metabolic models. The contextualization of these models has shown how these states are influenced by both extracellular and intracellular environments.

The use of the RIPTiDe algorithm and machine learning methods highlights the key role that arginine and ornithine, which are available from the environment, play in regulating toxin production. The paper goes on to show that this is further regulated by intracellular pools of fatty acids and large polymer metabolites, as shown by the flux balance analysis and shadow pricing analysis. Additionally, the application of the mMTA algorithm identifies important reactions involved in transitioning from high to low toxin production states, providing ideas for potential therapeutic targets.

I am convinced that this paper will be a valuable contribution to the field. However, there are important issues that should be addressed before publication:

Major issues:

1. The paper states that the analysis code is shared but the associated link is empty. This needs to be fixed before publication.

2. The methods section could be more detailed in describing the usage and caveats of the RIPTiDe algorithm and shadow pricing analysis with respect to flux sampling to improve the reproducibility of the work.

Minor issues:

3. The paper is a valuable contribution to the field's understanding of toxin regulation by metabolism in C. difficile. However, it would have been beneficial to include a discussion of isoleucine fermentation as a source of energy metabolism in the results section.

4. The abstract and introduction effectively summarize the work and highlight its potential therapeutic applications, but the results and discussion sections could benefit from more specific recommendations on this subject.

5. Additionally, the figure legends in Figure 3A could be clearer, and the use of the term "context dependent" for metabolites with shadow price > 2 may be confusing.

6. The legend in Figure 4 should also be more descriptive to make it accessible to non-experts in computational methods, and the use of the term epsilon without further explanation may be confusing to readers.

Reviewer #3: This paper reports the integration of publicly available transcriptomic data using the RIPTiDe algorithm to contextualize how nutritional changes regulate toxin levels in C. difficile. This work builds on previously published genome-scale models of strains 630 and R20291. While it is interesting to see transcriptomic data integrated into the previously published models, it is incremental work. The following aspects could make this paper much stronger.

1. Integrate the actual toxin level data following growth in Biolog Phenotype arrays previously reported by Lei and Bochner 2013. That paper is cited as ref#26 in the discussion. However, that data is not integrated with the model. There is some agreement between the results in the current work and the phenotypic toxin data published in ref#26. For example, Arg-dipeptides showed higher toxin production in Biolog plates. Calibrating the FBA model with validated phenotypic toxin data will make this work much better. Since the authors have done that previously in E. coli (data-guided FBA model), doing that here should not be that difficult.

2. The other question is how genome variation in C. difficile affects toxin production. CD 630 and CD R20291 belong to different toxinotypes. There have been conflicting reports including and excluding the importance of toxinotypes in C. difficile virulence. Although there is high genome variation in C. difficile, genes coding for the central metabolic pathways are somewhat conserved, albeit with some sequence variation. Is data integration with genome-scale modeling sensitive enough to understand how the pathogenicity locus (where the cluster of toxin genes is located) interacts with master regulators of central metabolic pathways (codY, CcpA, and others)? Integrating Biolog data could be interesting because adenine and related compounds were the strongest toxin inducers in the previous study.

3. A public dataset is available in which both the transcriptome and metabolome were analyzed following C. difficile infection in a mouse model (PMID: 29600278). Unfortunately, toxin levels were not reported in that paper. However, toxin gene expression levels could be pulled from the transcriptome data; do the predictions here correlate with what is reported in that work? I agree that the dataset may not be amenable to modelling, but it could be useful in expanding the discussion in the context of toxinotype variation.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No: The github repository is currently empty. The authors will need to upload their models and code here before the paper is published.

Reviewer #2: No: The links for the code and the data are made available in the paper but those links are empty and nothing is uploaded on them so far.

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Vishwas Mishra

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1011076.r003

Decision Letter 1

Kiran Raosaheb Patil

4 Apr 2023

Dear Professor Papin,

We are pleased to inform you that your manuscript 'Network analysis of toxin production in Clostridioides difficile identifies key metabolic dependencies' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Kiran Patil

Section Editor

PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have done a good job of updating their manuscript following the first round of reviews. In particular, I am happy with the updates to the machine learning methods and discussion. The github is also now available with the code. I encourage the authors to include some additional documentation in a github readme file, but that should not hold up publication.

Reviewer #2: The authors have taken the reviewers' comments into account very well and have made the necessary changes to the previous manuscript submission. This paper successfully shows how toxin production is regulated by the organism's metabolism. This is a great paper to be added to the field because of its novel way of using publicly available transcriptomic data to provide an extracellular context for the genome-scale metabolic models. I am convinced that this paper will be a valuable contribution to the field.

Reviewer #3: The authors have incorporated most of the suggestions made earlier. I don't have any further comments.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1011076.r004

Acceptance letter

Kiran Raosaheb Patil

20 Apr 2023

PCOMPBIOL-D-22-01878R1

Network analysis of toxin production in Clostridioides difficile identifies key metabolic dependencies

Dear Dr Papin,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Zsofi Zombor

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Table. Public RNA-sequencing datasets.

    BHIS(G): Brain-Heart Infusion Supplemented (Glucose), Cd: C. difficile, CDMM: C. difficile Minimal Media, DCA: Deoxycholate, DMSO: Dimethyl Sulfoxide, GEO ID: Gene Expression Omnibus Identifier, TY: Tryptone Yeast. Alternate identifiers for RIPTiDe models with similar growth conditions are indicated in parentheses in the Growth Condition column when applicable; these identifiers are used for all analyses.

    (DOCX)

    S1 Fig. Toxin transcript counts across conditions.

    Toxin transcript counts quantified as reads per million (RPM) are shown for all conditions included in the study (see S1 Table for more details). Conditions were binned based on median tcdA transcript levels across all conditions and labeled as low (< median) or high (> median).

    (TIFF)
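The median-based binning described in the S1 Fig caption can be sketched as below. This is an illustration only: the function name, the condition names, and the RPM values are hypothetical, and the handling of a value exactly equal to the median (left unassigned here) is an assumption, since the caption defines only the < median and > median cases.

```python
from statistics import median


def bin_by_median(tcda_rpm):
    """Label each condition low/high relative to the median tcdA RPM."""
    cutoff = median(tcda_rpm.values())
    labels = {}
    for condition, rpm in tcda_rpm.items():
        if rpm < cutoff:
            labels[condition] = "low"
        elif rpm > cutoff:
            labels[condition] = "high"
        else:
            # Assumption: ties with the median are not covered by the caption.
            labels[condition] = "unassigned"
    return labels


# Hypothetical RPM values for four made-up conditions.
toy_rpm = {"cond_A": 10.0, "cond_B": 200.0, "cond_C": 35.0, "cond_D": 900.0}
toy_labels = bin_by_median(toy_rpm)
```

With an even number of conditions the median falls between the two middle values, so every condition receives a low or high label.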

    S2 Fig. PCA of flux sampling of RIPTiDe-contextualized iCdG709 and iCdR703 models.

    (A) Summary table of the RIPTiDe-contextualized models including the strain, toxin production level, and number of genes, reactions, and metabolites. (B) The iCdG709 (CD630, light purple) and iCdR703 (CDR20291, dark purple) C. difficile models were contextualized with transcriptomic data (S1 Table) and flux distributions were sampled (n = 500) using RIPTiDe. The flux sampling for each model was randomly down-sampled to 100 flux samples and PCA was performed for all the models together (B) and by strain (C–F).

    (TIFF)

    S3 Fig. Random Forest validation metrics.

    (A) Visualization of the random stratified group k-fold splits used for cross validation of the Random Forest classifier. (B-C) K-fold cross validation (k = 5) of the Random Forest classifier testing ROC (B) and accuracy (C), with an average accuracy of 95% in cross validation. (D) Confusion matrix for model predictions with train and test sets selected in a 75–25 ratio using random stratified group splits. The model trained on this set had a 97% accuracy. (E) The top 20 features for model predictions by Gini score.

    (TIFF)

    S4 Fig. Escher metabolic maps.

    Metabolic context for reactions from the Random Forest analysis labeled with the reaction and model IDs from the GENREs iCdG709 and iCdR703.

    (TIFF)

    S5 Fig. Shadow prices of metabolites that decrease flux through reactions from Random Forest.

    For each objective function (OF) listed in Fig 3A, the metabolites categorized as decreasing and with a shadow price < -5 are shown.

    (TIFF)

    S6 Fig. MTA calculations for centroids and transformation scores (TS).

    (A) Bray-Curtis NMDS of flux sampling results for iCdG709 contextualized for BHIS + DCA 240 uM (target, low toxin, light teal) and BHIS (reference, high toxin, dark teal) was used to calculate the centroids (red) and the flux sample closest to the centroid (orange) for each model. (B) The MIQP objective value versus the TS demonstrates the utility of the TS in ranking flux solutions with a similar objective value based on success of the flux solution in transforming reactions to the target state. (C) Successfully changed reactions for each reaction knockout. Successful (dark blue), unsuccessful (light blue).

    (TIFF)
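Selecting the flux sample closest to a centroid, as described for panel (A), might look like the sketch below. It assumes the centroid is the coordinate-wise mean of the points and that closeness is measured by Euclidean distance in the ordination space; the caption does not state either choice, and the function names and toy coordinates are hypothetical.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def closest_to_centroid(points):
    """Return the index of the point nearest the centroid, and the centroid."""
    c = centroid(points)
    idx = min(range(len(points)), key=lambda i: math.dist(points[i], c))
    return idx, c


# Hypothetical 2-D ordination coordinates for four flux samples.
points = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
index, center = closest_to_centroid(points)
```

Using an actual sampled flux distribution rather than the centroid itself keeps the representative state a feasible solution of the model, which an averaged point is not guaranteed to be once integer or non-convex constraints are involved.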

    S7 Fig. Comparison of metabolic flux through reactions in the Reference and Target state.

    (TIFF)

    S8 Fig. Phenotype microarray (PM) simulation and analysis.

    (A) Venn diagram showing the overlap of unique metabolites from the PM dataset and the extracellular metabolites from the GENREs. (B) The toxin concentration distribution for the 65 overlapping growth conditions from panel (A). (C) Simulated reaction flux through each in silico PM condition (n = 65). The flux data was min-max normalized and reactions with flux variance across all conditions < 0.05 were removed and the absolute flux value of the remaining reactions was visualized. The PM growth conditions are sorted by their toxin category. Toxin categories were defined as low (<42 ng/uL), mid (42–420 ng/uL), and high (>420 ng/uL) as in Lei, XH and Bochner, BR (2013).

    (TIFF)
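The normalization and variance filter described for panel (C) can be sketched as follows. The caption does not say whether normalization was applied per reaction or across the whole matrix, or which variance estimator was used; this example assumes per-reaction min-max scaling across conditions and population variance, and the reaction names and flux values are made up. The final absolute-value step mentioned in the caption is not shown.

```python
from statistics import pvariance


def min_max(values):
    """Scale values to [0, 1]; a constant vector maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def filter_low_variance(flux_by_reaction, threshold=0.05):
    """Keep reactions whose normalized flux variance is at least threshold."""
    kept = {}
    for reaction, fluxes in flux_by_reaction.items():
        scaled = min_max(fluxes)
        if pvariance(scaled) >= threshold:
            kept[reaction] = scaled
    return kept


# Hypothetical fluxes: one constant reaction and one variable reaction.
toy_flux = {
    "rxn_const": [1, 1, 1, 1],
    "rxn_vary": [0, 5, 10, 0],
}
kept = filter_low_variance(toy_flux)
```

Reactions with near-constant flux across conditions carry little information for distinguishing toxin categories, which is the reason for dropping them before visualization.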

    S1 Dataset. RIPTiDe model flux sampling data.

    Down-sampled flux data (n = 100 samples per RIPTiDe model), with the first three columns set as sample descriptors (condition (RIPTiDe model), strain, and toxin category).

    (CSV)

    S2 Dataset. Shadow pricing data.

    Metabolite shadow prices with the first five columns set as simulation descriptors: condition (RIPTiDe model), strain (CD630 or CDR20291), toxin (low or high), OF (reaction ID for objective function), and OF_name (name of OF).

    (CSV)

    S3 Dataset. MTA knockout flux data.

    Flux data for each reaction knockout (columns) with the first two columns showing the flux data for the Target and Control (Reference).

    (CSV)

    Attachment

    Submitted filename: PLOSCompReviews.docx

    Data Availability Statement

    Publicly available gene expression data from the GEO database was downloaded and used for this study; the GEO IDs are listed in S1 Table. The flux sampling, shadow pricing, and mMTA data are shared in S1, S2 and S3 Datasets respectively. The code used for this study is available on Github at https://github.com/dap5mb/cdToxinAnalysis.


    Articles from PLOS Computational Biology are provided here courtesy of PLOS
