PLOS One. 2026 Mar 9;21(3):e0344478. doi: 10.1371/journal.pone.0344478

Integrative network pharmacology and machine learning identify potential targets of indole-3-lactic acid in colorectal cancer

Jie Li 1, Jian Zhang 1, Jun Ke 1, Zhijian Ren 1,*, Cuncheng Feng 2,*
Editor: Kayode Raheem 3
PMCID: PMC12970938  PMID: 41801955

Abstract

The treatment of colorectal cancer (CRC) remains challenging due to chemotherapy resistance and genetic heterogeneity. Indole-3-lactic acid (ILA), a tryptophan metabolite derived from gut microbiota, exhibits promising anti-inflammatory and anticancer properties; however, its specific molecular targets and regulatory mechanisms in CRC remain poorly understood. In this study, we combined network pharmacology and machine learning with molecular docking to identify candidate targets and pathways for ILA in CRC. We identified 39 common ILA-CRC targets and narrowed these to four hub genes by intersecting the outputs of the machine learning models. Validation in independent GEO datasets confirmed significant differential expression of these genes in CRC tissues. Functional enrichment analyses linked these genes to the PPAR, PI3K-AKT, and IL-17 signaling pathways, and gene set enrichment analysis further implicated ascorbate and aldarate metabolism, DNA replication, and fatty acid metabolism. Immune infiltration analysis indicated associations between hub gene expression and immune cell populations, including mast cells, neutrophils, and macrophages, suggesting potential involvement in the tumor immune microenvironment. Molecular docking supported favorable binding of ILA to all four hub proteins, and 100-ns molecular dynamics simulations specifically validated the dynamic stability of the ILA-HMOX1 complex. In conclusion, these results highlight EPHA2, HMOX1, MMP3, and PARP1 as candidate targets and suggest that ILA may influence CRC-related signaling, metabolic programs, and immune contexture, providing a theoretical foundation for developing gut microbiota-derived metabolites as novel anticancer strategies.

1. Introduction

Colorectal cancer (CRC), ranking as the third most common malignancy globally and the second leading cause of cancer-related mortality, represents a significant global public health burden. Its pathogenesis is closely associated with chronic inflammation, gut microbiota dysbiosis, and metabolic alterations [1]. While diagnostic and therapeutic advances have improved early-stage outcomes, patients with advanced CRC continue to face dismal prognoses due to persistent challenges, including chemotherapy resistance and treatment-related toxicities [2]. Additionally, the complexity of CRC pathogenesis involves not only genetic heterogeneity but also profound metabolic reprogramming and an immunosuppressive tumor microenvironment (TME). A major bottleneck in current CRC treatment is the limited efficacy of immunotherapies in the majority of patients, largely due to TME-mediated immune evasion [3,4]. Therefore, identifying agents that can simultaneously modulate metabolic programs and reshape the immune contexture remains a critical focus in CRC research.

In the search for new therapeutic avenues, the gut microbiota and its metabolic products have emerged as a critical frontier. During the metabolism of dietary components, microbes generate diverse bioactive metabolites, including short-chain fatty acids and indole derivatives, which can signal through host receptors and shape intestinal physiology [5]. Microbiota dysbiosis may alter both the composition and function of these metabolites, thereby disrupting host immune homeostasis and being associated with tumor initiation and progression [6]. Emerging evidence indicates that microbial metabolites act as direct chemical messengers at the host-microbiota interface. These small molecules can traverse the intestinal barrier and influence key host processes, including immune homeostasis, epigenetic regulation, and metabolic reprogramming [7]. These observations have prompted increasing interest in gut microbiota-derived small molecules as candidates for CRC prevention and adjunctive treatment.

Indole-3-lactic acid (ILA) is a tryptophan-derived microbial metabolite that has been reported to exert anti-inflammatory and antioxidant activities [8]. Recent clinical evidence indicates that fecal ILA levels are significantly reduced in patients with CRC and are inversely associated with tumor stage, suggesting a potential tumor-suppressive role [9]. Mechanistic studies further show that, in colitis-associated colorectal cancer models, ILA can activate aryl hydrocarbon receptor (AhR) signaling in macrophages, influence macrophage differentiation, and attenuate intestinal inflammation, thereby restraining inflammation-driven tumorigenesis [10]. In addition, ILA has been suggested to reshape the tumor immune microenvironment by suppressing M2 macrophage polarization, enhancing CD8+ T-cell cytotoxicity, preserving epithelial barrier integrity, and alleviating inflammation-induced mucosal injury [11,12]. Despite these encouraging findings, the multi-target regulatory network through which ILA interfaces with the genetic complexity and immune contexture of CRC remains incompletely defined and warrants systematic investigation.

In recent years, the rapid development of computational biology approaches, including network pharmacology, machine learning, and molecular docking, has provided powerful tools for elucidating drug mechanisms and screening potential targets. Network pharmacology systematically reveals intricate relationships among drugs, targets, and diseases through multi-source data integration, enabling researchers to understand pharmacological mechanisms from a holistic perspective [12]. Machine learning algorithms can identify critical genes and potential targets from large-scale datasets, improving the efficiency and accuracy of drug screening [13]. Molecular docking estimates binding affinity and predicts binding poses by simulating ligand-protein interactions, thereby providing structural insights for rational drug design. Integrating these methodologies accelerates drug discovery and establishes a theoretical foundation for clinical applications. In this study, we integrated machine learning with network pharmacology to identify potential targets and elucidate the molecular mechanisms of ILA in CRC. Subsequent molecular docking and molecular dynamics simulations were used to support these computational predictions. Our findings provide a theoretical foundation and promising putative targets for applying gut microbiota-derived metabolites in CRC therapy. Fig 1 shows a schematic workflow for this study.

Fig 1. Workflow chart.

2. Materials and methods

2.1. Collection of ILA targets

The PubChem database (https://pubchem.ncbi.nlm.nih.gov/) (accessed on 8 January 2025) [14] was used to obtain molecular structural formulae and canonical SMILES information for ILA. Subsequently, five online public databases were employed to predict ILA-related targets: SwissTargetPrediction (http://www.swisstargetprediction.ch/) (accessed on 8 January 2025) [15], Similarity Ensemble Approach (https://sea.bkslab.org/) (accessed on 8 January 2025) [16], TargetNet (http://targetnet.scbdd.com/home/index/) (accessed on 8 January 2025) [17], SuperPred (http://prediction.charite.de/) (accessed on 8 January 2025) [18], and PharmMapper (http://www.lilab-ecust.cn/pharmmapper) (accessed on 8 January 2025) [19].

For SwissTargetPrediction, SEA, SuperPred, and PharmMapper, all predicted targets annotated as Homo sapiens were retained without applying additional score or probability cutoffs, as these platforms primarily provide ranked or similarity-based predictions rather than unified probability thresholds. This strategy was adopted to ensure comprehensive coverage of potential human targets at the initial screening stage. For TargetNet, which provides explicit prediction probabilities, only targets with a probability greater than 0 were included according to the database output criteria. Predicted targets obtained from the five databases were subsequently merged, and duplicate entries were removed. All target proteins were standardized to human species genes via the UniProt database (https://www.uniprot.org/) (accessed on 9 January 2025) [20]. Finally, the ILA target network was created using Cytoscape 3.9.1 (https://cytoscape.org/) (accessed on 9 January 2025).
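The merge-and-deduplicate step described above can be sketched as a simple set union. This is a minimal illustration, not the authors' pipeline: the function name `merge_targets` and the toy input are hypothetical, and upper-casing symbols is a simplifying assumption standing in for the UniProt-based standardization reported in the text.

```python
from typing import Dict, Iterable, List


def merge_targets(predictions: Dict[str, Iterable[str]]) -> List[str]:
    """Return the sorted union of gene symbols across all sources."""
    merged = set()
    for symbols in predictions.values():
        for symbol in symbols:
            # Assumption: trimming and upper-casing is enough to make
            # duplicates comparable; the study standardized via UniProt.
            cleaned = symbol.strip().upper()
            if cleaned:
                merged.add(cleaned)
    return sorted(merged)


# Hypothetical toy input; the real lists come from the five databases.
toy_predictions = {
    "SwissTargetPrediction": ["PARP1", "HMOX1", "MMP3"],
    "SEA": ["parp1", "EPHA2"],
    "TargetNet": ["MMP3 ", "PPARG"],
}
unique_targets = merge_targets(toy_predictions)
print(len(unique_targets), unique_targets)
```

Using a set makes duplicate removal independent of the order in which the databases are read.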

2.2. Collection of CRC targets

We conducted a comprehensive search using the keywords “colorectal cancer” and “colorectal carcinoma” to identify CRC-related targets from four databases: OMIM (http://www.omim.org) (accessed on 10 January 2025) [21], TTD (https://db.idrblab.org/ttd/) (accessed on 10 January 2025) [22], GeneCards (https://www.genecards.org/) (accessed on 10 January 2025) [23], and DrugBank (https://go.drugbank.com/) (accessed on 10 January 2025) [24]. For GeneCards, we retained targets with a relevance score at or above the median value. The targets from the four databases were combined, and duplicates were deleted. In parallel, we retrieved microarray datasets from the Gene Expression Omnibus database (GEO; http://www.ncbi.nih.gov/geo/) (accessed on 10 January 2025) [25] and filtered them according to the following inclusion criteria: (1) organism restricted to Homo sapiens; (2) study design comparing primary colorectal cancer tissues vs. normal colonic tissues; (3) gene expression profiling by array; and (4) sufficient sample size (>25 samples) to ensure statistical reliability. Based on these criteria, four publicly available CRC-related microarray datasets (GSE44076, GSE74602, GSE32323, and GSE113513) were selected for subsequent analyses. These datasets were chosen because they provide high-quality transcriptomic profiles with sufficient sample sizes and have been widely used or are representative datasets in CRC-related gene expression studies. Among them, GSE44076 was designated as the training dataset, while GSE74602, GSE32323, and GSE113513 were used as external validation datasets. Expression matrices from the validation datasets were extracted using Perl software (version 5.30.2) and subsequently integrated. Batch effects arising from differences among datasets were corrected using the limma (version 3.64.1) and sva (version 3.56.0) packages implemented in R software (version 4.4.2), resulting in normalized gene expression data suitable for downstream bioinformatic analyses. Detailed information for all included datasets is summarized in Table 1.

Table 1. Detailed information for the GEO datasets.

Dataset         GEO accession   Year   Platform   CRC samples (n)   Normal samples (n)
Training set    GSE44076        2014   GPL13667   98                50
Validation set  GSE74602        2016   GPL6104    30                30
Validation set  GSE32323        2012   GPL570     17                17
Validation set  GSE113513       2018   GPL15207   14                14
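The GeneCards filter described in Section 2.2 (relevance score at or above the median) can be sketched as follows. The function name `filter_by_median` and all scores are hypothetical and serve only to illustrate the cutoff rule.

```python
from statistics import median
from typing import Dict


def filter_by_median(scores: Dict[str, float]) -> Dict[str, float]:
    """Keep entries whose relevance score is at or above the median."""
    cutoff = median(scores.values())
    return {gene: s for gene, s in scores.items() if s >= cutoff}


# Hypothetical relevance scores, not values taken from GeneCards.
toy_scores = {
    "APC": 120.5,
    "KRAS": 98.2,
    "TP53": 110.0,
    "GENE_X": 3.1,
    "GENE_Y": 1.4,
}
kept = filter_by_median(toy_scores)
print(sorted(kept))
```

Because the comparison is inclusive, a gene whose score equals the median is retained, matching the "at or above" wording.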

2.3. Identification of differentially expressed genes (DEGs)

Differential expression analysis was performed between normal and CRC groups using the limma package in R, with significance thresholds set at |logFC| > 1 and adjusted p-value < 0.05. The resulting DEGs were then visualized through a volcano plot and heatmap, generated using the ggplot2 (version 3.5.2) and pheatmap (version 1.0.12) packages, respectively.
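The DEG thresholds above can be illustrated with a short sketch. It assumes Benjamini-Hochberg adjustment (the default in limma's topTable; the text does not name the method) and uses hypothetical gene names and statistics. It does not reproduce limma's moderated t-test, only the filtering applied to its output.

```python
from typing import List, Sequence, Tuple


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(
    genes: Sequence[str],
    logfc: Sequence[float],
    pvalues: Sequence[float],
    lfc_cut: float = 1.0,
    alpha: float = 0.05,
) -> Tuple[List[str], List[str]]:
    """Split genes passing |logFC| > lfc_cut and adjusted p < alpha."""
    adj = benjamini_hochberg(pvalues)
    up, down = [], []
    for gene, lfc, q in zip(genes, logfc, adj):
        if q < alpha and abs(lfc) > lfc_cut:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical statistics for four placeholder genes.
up_genes, down_genes = select_degs(
    genes=["G1", "G2", "G3", "G4"],
    logfc=[2.1, -1.5, 0.4, 1.2],
    pvalues=[0.001, 0.004, 0.0005, 0.6],
)
print(up_genes, down_genes)
```

In this toy input, G3 is significant but fails the fold-change cutoff, and G4 passes the fold-change cutoff but is not significant, so only G1 and G2 are retained.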

2.4. Weighted gene co-expression network analysis

Weighted gene co-expression network analysis (WGCNA) is a systems biology method for exploring high-dimensional gene expression datasets through network-based approaches; it identifies gene modules and key genes associated with biological processes and thereby reveals potential biological pathways or mechanisms [26]. First, we preprocessed the raw gene expression data and constructed a gene relationship matrix. We determined the optimal soft threshold power (β) using the pickSoftThreshold function in the WGCNA R package. The selection criterion was a scale-free topology fit index (R²) of at least 0.85 with a reasonable mean connectivity. On this basis, a power of β = 4 was selected from the scale independence and mean connectivity plots (Section 3.2). Using the selected β value, an adjacency matrix was constructed and transformed into a topological overlap matrix (TOM), which reflects the network connectivity between gene pairs. The dissimilarity measure (1-TOM) was then used as the distance metric for hierarchical clustering to identify gene modules. Modules were detected using the dynamic tree cut algorithm with a predefined minimum module size, and highly similar modules were merged based on module eigengene correlation to improve module robustness. To assess the association between gene modules and CRC-related traits, module-trait relationships were evaluated by calculating Pearson correlation coefficients between module eigengenes and phenotypic traits. Gene significance (GS) and module membership (MM) values were further computed to identify biologically relevant core modules and genes most strongly associated with CRC for downstream analyses.
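The adjacency and TOM construction described above can be sketched on a toy matrix. This illustration assumes an unsigned network, that is, adjacency |cor|^β with the standard unsigned TOM formula; the text does not state the network type, and the expression values are hypothetical. The WGCNA R package performs the same computation far more efficiently.

```python
from math import sqrt
from typing import List, Sequence

Matrix = List[List[float]]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def adjacency(expr: Sequence[Sequence[float]], beta: int) -> Matrix:
    """Unsigned soft-threshold adjacency |cor|^beta with a zero diagonal."""
    g = len(expr)
    adj = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            a = abs(pearson(expr[i], expr[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj


def topological_overlap(adj: Matrix) -> Matrix:
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    k = [sum(row) for row in adj]  # connectivity; the diagonal is zero
    tom = [[1.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(g):
            if i == j:
                continue
            shared = sum(adj[i][u] * adj[u][j] for u in range(g))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


# Hypothetical expression matrix: 3 genes (rows) x 5 samples (columns).
toy_expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
adj = adjacency(toy_expr, beta=4)
tom = topological_overlap(adj)
dissimilarity = [[1.0 - t for t in row] for row in tom]
print(round(adj[0][2], 4))
```

Raising the correlation to the power β suppresses weak correlations, which is why the choice of β controls how closely the network approximates a scale-free topology.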

2.5. Construction of protein-protein interaction (PPI) network

We identified potential key genes for CRC by intersecting the database-derived targets, DEGs, and core module genes. The overlapping targets between ILA and CRC were identified using Venn diagram analysis (http://www.bioinformatics.com.cn/). These common targets were subsequently analyzed in the STRING database (version 12.0; https://string-db.org) [27] with the following parameters: organism restricted to Homo sapiens and minimum interaction confidence score set to >0.4 (medium confidence). The PPI data were downloaded from the STRING database in TSV format and imported into Cytoscape software (version 3.9.1) for network construction and visualization. To analyze the topological characteristics, the ‘Analyze Network’ tool was utilized with the interaction type set to ‘Treat network as undirected’. Node centrality measures were then calculated using the CytoNCA plugin (version 2.1.6), with degree centrality (DC) used as the primary metric. Isolated nodes without interactions (degree = 0) were excluded from the network. Finally, to visualize the network structure, the ‘Style’ panel was used to map node attributes: node size and color were scaled according to DC values, with larger and darker nodes representing higher connectivity. We then applied three topological algorithms from the cytoHubba plugin (version 0.1), namely betweenness, closeness, and maximal clique centrality (MCC), to identify hub genes within the PPI network.
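The degree-based filtering described above can be sketched for an undirected edge list. The edges below are hypothetical and are not interactions taken from STRING; the helper names are likewise illustrative, and CytoNCA and cytoHubba compute these and other metrics inside Cytoscape.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple


def degree_centrality(
    nodes: Sequence[str], edges: Iterable[Tuple[str, str]]
) -> Dict[str, int]:
    """Count distinct neighbors per node, treating edges as undirected."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {node: len(neighbors[node]) for node in nodes}


def drop_isolated(degrees: Dict[str, int]) -> Dict[str, int]:
    """Remove nodes with degree 0, as done before visualization."""
    return {node: d for node, d in degrees.items() if d > 0}


# Hypothetical toy network (illustrative edges only).
toy_nodes = ["HMOX1", "MMP3", "ICAM1", "PARP1", "CBS"]
toy_edges = [
    ("HMOX1", "ICAM1"),
    ("MMP3", "ICAM1"),
    ("HMOX1", "MMP3"),
    ("PARP1", "HMOX1"),
]
degrees = degree_centrality(toy_nodes, toy_edges)
connected = drop_isolated(degrees)
ranked = sorted(connected, key=lambda n: (-connected[n], n))
print(ranked)
```

Storing neighbors in sets means a duplicated edge in the TSV export would not inflate a node's degree.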

2.6. Enrichment analysis

We employed the Metascape online tool (version 3.5; http://metascape.org/) [28] to perform Gene Ontology (GO) annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the selected genes. The GO enrichment analysis included the three domains: biological process (BP), cellular component (CC), and molecular function (MF). The resulting data were visualized in an online visualization platform (http://www.bioinformatics.com.cn) [29]. Additionally, we performed gene set enrichment analysis (GSEA) on hub genes using the clusterProfiler (version 4.16.0) R package. The enriched biological pathways relevant to the predicted mechanisms of ILA in CRC were subsequently visualized using the enrichplot (version 1.28.4) and pathview (version 1.30) packages.

2.7. Determination of hub genes with machine learning algorithms

In this study, three machine learning algorithms, least absolute shrinkage and selection operator (LASSO), random forest (RF), and support vector machine recursive feature elimination (SVM-RFE), were applied to jointly identify hub genes associated with the hypothesized regulatory roles of ILA in CRC. To ensure reproducible results, we set the random seed to 12345 [30]. To address the sample imbalance between CRC and normal groups, stratified cross-validation was employed across all models to maintain consistent class proportions. Specifically, LASSO employs L1 regularization to mitigate multicollinearity and induce sparsity within high-dimensional datasets; RF was selected to capture complex non-linear interactions; and SVM-RFE identifies the optimal feature subset to maximize classification performance. These three algorithms complement each other, and their integration minimizes the bias inherent in any single model, thereby enhancing the robustness of the screening results.
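Stratified cross-validation with a fixed seed, as described above, can be sketched as follows. The function `stratified_folds` is a hypothetical helper; the class sizes match the training set (98 CRC, 50 normal), but a Python seed of 12345 will not reproduce the partitions obtained with R's set.seed.

```python
import random
from collections import defaultdict
from typing import List, Sequence


def stratified_folds(labels: Sequence[str], k: int, seed: int = 12345) -> List[List[int]]:
    """Assign sample indices to k folds while preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal shuffled members round-robin so each fold gets its share.
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds


labels = ["CRC"] * 98 + ["normal"] * 50
folds = stratified_folds(labels, k=10)
for fold in folds[:2]:
    n_crc = sum(labels[i] == "CRC" for i in fold)
    print(len(fold), n_crc, len(fold) - n_crc)
```

Each fold then contains 5 normal samples and 9 or 10 CRC samples, so every validation fold reflects the roughly 2:1 class ratio of the full training set.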

The LASSO regression analysis was performed using the glmnet package (version 4.1.10) [31]. Specifically, parameters included family = binomial (binary classification) and alpha = 1 (LASSO penalty) to mitigate overfitting. To optimize hyperparameters, a 10-fold cross-validation procedure was performed using deviance as the optimization metric. The optimal lambda value (lambda.min) was selected based on the minimum mean cross-validated deviance. Features with non-zero coefficients at this optimal lambda value were retained as significant targets.

RF modeling was implemented using the randomForest package (version 4.7.1.2) [32]. The number of decision trees was evaluated across a range from 1 to 500, and the optimal tree number was determined based on the out-of-bag (OOB) error rate to balance model performance and complexity. An optimized RF model was then constructed using this tree number. Feature importance scores were calculated using the mean decrease in accuracy metric. Genes with importance scores greater than 0.3 were retained as candidate features for downstream analysis. This threshold was applied as a pragmatic filtering criterion to retain relatively high-importance features while reducing noise prior to integration with other algorithms. Sensitivity analyses using alternative thresholds confirmed that the final consensus hub genes were robust to reasonable variations in this cutoff.

SVM-RFE was performed with a radial basis function kernel (the svmRadial method in caret) using the e1071 (version 1.7.16) and caret (version 7.0.1) packages [33]. Before SVM training, all features were standardized using z-score normalization to ensure comparable feature scales. To avoid data leakage, normalization parameters were estimated exclusively on the training folds and subsequently applied to the corresponding validation folds. A five-fold stratified cross-validation scheme was employed during recursive feature elimination. The optimal gene subset was selected based on the lowest mean classification error and the highest cross-validated predictive performance.
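The leakage-free normalization described above amounts to estimating the mean and standard deviation on the training fold and reusing them on the validation fold. The sketch below uses hypothetical helper names and toy values, and assumes the sample standard deviation (n - 1 denominator).

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple

Rows = Sequence[Sequence[float]]


def fit_scaler(train_rows: Rows) -> Tuple[List[float], List[float]]:
    """Estimate per-feature mean and SD from the training fold only."""
    columns = list(zip(*train_rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant features to avoid division by zero.
    sds = [stdev(col) or 1.0 for col in columns]
    return means, sds


def apply_scaler(rows: Rows, means: Sequence[float], sds: Sequence[float]) -> List[List[float]]:
    """Standardize rows with parameters estimated elsewhere."""
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


# Toy data: 3 training samples and 1 validation sample, 2 features each.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
validation = [[4.0, 0.0]]

means, sds = fit_scaler(train)
train_z = apply_scaler(train, means, sds)
validation_z = apply_scaler(validation, means, sds)
print(validation_z)
```

The validation sample never contributes to `means` or `sds`, which is the property that prevents information from the held-out fold reaching the model.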

Finally, genes consistently identified by all three algorithms (LASSO, RF, and SVM-RFE) were defined as high-confidence hub genes associated with the predicted regulatory mechanisms of ILA in CRC.
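The consensus rule above is a set intersection. In the sketch below, the three input lists are illustrative only: they are not the per-algorithm outputs of this study, and only the final four-gene intersection corresponds to the reported hub genes.

```python
from typing import Iterable, List


def consensus(*gene_lists: Iterable[str]) -> List[str]:
    """Return genes present in every supplied list, sorted by name."""
    sets = [set(genes) for genes in gene_lists]
    return sorted(set.intersection(*sets))


# Illustrative selections, not the actual LASSO/RF/SVM-RFE outputs.
lasso_genes = ["EPHA2", "HMOX1", "MMP3", "PARP1", "ICAM1", "LCN2"]
rf_genes = ["EPHA2", "HMOX1", "MMP3", "PARP1", "PPARG"]
svm_rfe_genes = ["EPHA2", "HMOX1", "MMP3", "PARP1", "MMP7", "LCN2"]

hub_genes = consensus(lasso_genes, rf_genes, svm_rfe_genes)
print(hub_genes)
```

A gene selected by only two of the three methods, such as LCN2 in this toy input, is excluded, which is what makes the consensus set conservative.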

2.8. Validation of hub genes

In this study, GSE44076 was used as a training set containing 148 samples (50 normal and 98 CRC samples). Three independent datasets (GSE74602, GSE32323, and GSE113513) were integrated to form a validation set comprising 122 samples (61 normal and 61 CRC samples). Differential expression patterns of hub genes and their diagnostic performance were assessed using the limma (version 3.64.1), ggpubr (version 0.6.0), and pROC (version 1.18.5) packages in R, which generated comparative box plots and receiver operating characteristic (ROC) curves for both sample groups.
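For a single gene, the ROC AUC equals the probability that a randomly chosen CRC sample has a higher expression value than a randomly chosen normal sample, with ties counted as one half. The sketch below computes this directly on hypothetical expression values. Taking max(AUC, 1 - AUC) mirrors the automatic direction selection that pROC applies by default, which is assumed to be how a downregulated gene such as HMOX1 still yields an AUC above 0.5.

```python
from typing import Sequence


def roc_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical log2 expression values for one gene.
crc_expr = [7.1, 6.8, 6.5, 5.9]
normal_expr = [5.2, 6.0, 5.5]

auc = roc_auc(crc_expr, normal_expr)
direction_free_auc = max(auc, 1.0 - auc)
print(round(auc, 3), round(direction_free_auc, 3))
```

The pairwise form is quadratic in sample size, which is acceptable for cohorts of this scale; rank-based formulas give the same value more efficiently.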

2.9. Immune cell infiltration

We performed immune cell infiltration analyses using the CIBERSORT algorithm [34], which employs linear support vector regression to quantify the relative proportions of 22 distinct immune cell subtypes in both CRC patients and normal subjects. Following this analysis, we identified immune cell populations demonstrating statistically significant differences between the two groups and evaluated their correlations with hub gene expression levels using the Spearman method.
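The Spearman correlation used above is the Pearson correlation of rank-transformed values. A minimal sketch with average ranks for ties follows; the expression values and immune cell fractions are hypothetical, and the CIBERSORT deconvolution itself is not reproduced here.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)


# Hypothetical hub gene expression and immune cell fractions in 5 samples.
gene_expr = [1.2, 2.5, 3.1, 4.8, 5.0]
cell_fraction = [0.30, 0.22, 0.25, 0.10, 0.05]
rho = spearman(gene_expr, cell_fraction)
print(round(rho, 3))
```

Because only ranks enter the calculation, the coefficient is insensitive to the skewed distributions typical of estimated cell fractions.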

2.10. Molecular docking analysis

Protein and ligand structures were prepared using the Protein Preparation Wizard and LigPrep modules in the Schrödinger Suite 2021−4, with energy minimization performed under the OPLS4 force field [35]. The crystal structures of EPHA2, HMOX1, MMP3, and PARP1 were obtained from the Protein Data Bank (PDB; https://www.rcsb.org/structure) and processed by assigning bond orders, adding hydrogens, optimizing hydrogen-bond networks, and minimizing heavy atoms to relieve steric clashes. The ligand (ILA) was prepared with LigPrep by generating possible ionization/tautomeric states at physiological pH and by enumerating low-energy conformers. The receptor grids were generated using the Receptor Grid Generation panel in Glide. The center of the grid box was defined by the centroid of the co-crystallized ligand within the active site of each target protein. The bounding box was sized to sufficiently enclose the active site and accommodate the ligand. Docking was performed using Glide Standard Precision (SP) with flexible ligand sampling enabled, including ring conformation sampling, and post-docking minimization was applied. Van der Waals radii scaling was set to 0.8 for ligand atoms with a partial charge cutoff of 0.15 (receptor scaling: 1.0, charge cutoff 0.25) to allow for limited softness in the potential. The resulting poses were subjected to post-docking minimization, and the best-scored pose (lowest Glide score) was selected for further analysis.

2.11. Molecular dynamics (MD) simulations

The highest-ranked docking poses of the four ILA-protein complexes were used as initial structures for MD simulations, performed with Desmond in the Schrödinger Suite 2021−4 [35]. Each complex was solvated in an orthorhombic TIP3P water box and neutralized with counterions; NaCl was added to a final concentration of 0.15 M. Simulations were conducted using the OPLS4 force field. Systems were first subjected to energy minimization followed by equilibration using the standard Desmond relaxation protocol, after which production runs were carried out under the NPT ensemble at 300 K and 1 atm. Temperature and pressure were controlled using the default Desmond thermostat and barostat settings for Schrödinger Suite 2021−4. For comparative assessment, each of the four complexes (ILA bound to EPHA2, HMOX1, MMP3, and PARP1) was simulated for 10 ns under identical conditions. Based on the comparative stability, an additional 100 ns production simulation was performed for the ILA-HMOX1 complex. Trajectory analyses included root mean square deviation (RMSD) to evaluate global conformational stability, root mean square fluctuation (RMSF) of protein Cα atoms to assess residue-level flexibility, and protein-ligand interaction analyses to characterize the persistence and types of intermolecular contacts over time. Binding free energies and protein-ligand interaction contributions were calculated using the MM/GBSA method.
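The RMSD and RMSF metrics named above can be written out for a toy trajectory. This sketch assumes that frames have already been superimposed on the reference structure, an alignment step that trajectory analysis tools normally perform and that is omitted here; the coordinates are hypothetical.

```python
from math import sqrt
from typing import List, Sequence, Tuple

Atom = Tuple[float, float, float]
Frame = Sequence[Atom]


def rmsd(frame: Frame, reference: Frame) -> float:
    """Root mean square deviation between a frame and the reference."""
    squared = sum(
        (a - b) ** 2 for p, q in zip(frame, reference) for a, b in zip(p, q)
    )
    return sqrt(squared / len(reference))


def rmsf(frames: Sequence[Frame]) -> List[float]:
    """Per-atom fluctuation around the time-averaged position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for atom in range(n_atoms):
        mean_pos = [sum(f[atom][d] for f in frames) / n_frames for d in range(3)]
        msd = sum(
            sum((f[atom][d] - mean_pos[d]) ** 2 for d in range(3)) for f in frames
        ) / n_frames
        result.append(sqrt(msd))
    return result


# Hypothetical two-atom trajectory with two frames (coordinates in angstroms).
frame0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame1 = [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0)]

rmsd_value = rmsd(frame1, frame0)
rmsf_values = rmsf([frame0, frame1])
print(round(rmsd_value, 3), rmsf_values)
```

RMSD summarizes each frame with one number relative to the starting structure, whereas RMSF summarizes each atom over the whole trajectory, which is why the two are reported together.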

3. Results

3.1. Acquisition of the compound ILA targets

The chemical structure of ILA is presented in Fig 2A. We retrieved 100, 150, 140, 92, and 297 ILA-related targets from five network pharmacology databases, respectively (S1 File in S1 Data). After removing duplicate entries, 626 unique putative ILA targets were retained. The overlap of ILA-related targets identified across the five databases is summarized in the Venn diagram (Fig 2B). These targets were subsequently visualized as an ILA-target interaction network constructed in Cytoscape (Fig 2C).

Fig 2. Identification of ILA targets via network pharmacology.

(A) The chemical structure of ILA. (B) Venn diagram showing ILA-related targets among the five databases. (C) ILA-targets interaction network.

3.2. Acquisition of disease CRC targets

From four databases, we obtained 93, 95, 2005, and 25 CRC-related targets, respectively. After removing duplicates, 2122 unique CRC-related targets remained (Fig 3A; S2 File in S1 Data). Using the CRC-related GEO dataset GSE44076 as the training set, we identified 1891 DEGs, comprising 891 upregulated and 1000 downregulated genes. The DEG distribution is shown in the volcano plot (Fig 3B), and the expression patterns of the top 50 most significant DEGs are displayed in the hierarchical clustering heatmap (Fig 3C), indicating clear separation between normal and CRC samples. The validation set was obtained by merging the three data sets GSE74602, GSE32323, and GSE113513. To evaluate the effectiveness of data integration, we performed Principal Component Analysis (PCA). As shown in S1 Fig in S1 Data, the samples from different datasets were clearly separated before correction, indicating a significant batch effect. However, after batch effect correction, the samples from the three datasets were well-mixed and uniformly distributed, demonstrating that the non-biological variations were successfully removed for subsequent analyses. To identify CRC-related gene modules, we constructed a weighted gene co-expression network using WGCNA. Based on scale independence and average connectivity, the optimal soft threshold was set at 4, achieving a scale-free index of 0.9 with favorable mean connectivity (Fig 3D). The resulting gene clustering dendrogram (Fig 3E) identified distinct co-expression modules represented by different-colored branches. After merging similar modules, six distinct gene modules were obtained. The MEturquoise module showed the strongest association with normal/CRC phenotypes (R = 0.97, P < 0.0001; Fig 3F) and contained 3314 genes prioritized as potential CRC targets. Finally, intersecting CRC-related genes from databases (2122 genes), DEGs (1891 genes), and MEturquoise-module genes (3314 genes) yielded 252 CRC key genes (Fig 3G).

Fig 3. Integrated identification of CRC targets through multi-database mining and WGCNA.

(A) Venn diagram of CRC targets retrieved from four databases. (B) Volcano plot of DEGs in CRC vs. normal groups. Red points: 891 significantly upregulated genes; green points: 1000 downregulated genes. (C) Hierarchical clustering heatmap of the top 50 most significant DEGs. (D) Scale independence and average connectivity to determine soft thresholds in WGCNA. (E) Gene clustering dendrogram and highly correlated gene modules. (F) The heatmap of module–trait relationships for the 6 co-expression modules. MEturquoise module showed the strongest positive correlation with CRC (R = 0.97, p < 0.0001). (G) Intersection of CRC targets from databases (2122 genes), DEGs (1891 genes), and WGCNA (3314 genes), identifying 252 CRC key genes.

3.3. PPI network and enrichment analysis of common targets associated with ILA in CRC

Based on the intersection analysis above, 39 potential targets of ILA against CRC were identified (Fig 4A). A PPI network of these 39 targets was constructed using STRING to characterize target-target interactions (Fig 4B). We then applied the cytoHubba plugin to rank hub candidates; the top 15 targets based on betweenness, closeness, and MCC are shown in Fig 4C, and their intersection yielded 11 core genes (Fig 4D): ICAM1, MMP3, HMOX1, LCN2, PPARG, MMP7, IDO1, PARP1, HSP90AB1, EPHA2, and CBS. KEGG and GO enrichment analyses based on these 39 targets were performed to identify biological processes and signaling pathways most likely involved in the predicted regulatory roles of ILA in CRC. KEGG pathway analysis revealed 17 significantly enriched pathways (P < 0.05) (S2 Fig in S1 Data), with the top five being lipid and atherosclerosis, pathways in cancer, IL-17 signaling pathway, PI3K-Akt signaling pathway, and PPAR signaling pathway. The core targets associated with these pathways included HSP90AB1, ICAM1, MMP1, MMP3, and HMOX1; the associations between key targets and the top pathways are visualized in the Sankey bubble chart (Fig 4E). GO analysis identified 247 BPs, 15 CCs, and 66 MFs, with the top 10 enriched terms shown in S2 Fig in S1 Data. The chord diagram (Fig 4F) highlights the top five BPs: cellular responses to nitrogen compounds, cellular responses to hormonal stimuli, positive regulation of programmed cell death, response to amyloid-beta, and response to bacterium (S3 File in S1 Data).

Fig 4. PPI network and functional enrichment analysis of common targets associated with ILA in CRC.

(A) Venn diagram identifying 39 potential targets between ILA and CRC. (B) PPI network of 39 common targets. (C) Top 15 hub genes ranked by cytoHubba plugin using three topological algorithms. (D) Eleven intersection hub genes of the betweenness, closeness, and MCC algorithms. (E) Sankey bubble chart of the top 5 KEGG pathways. (F) Chord diagram of the top 5 biological processes.

3.4. Selection of target hub genes via machine learning

To further identify critical hub genes for ILA against CRC, we screened the 39 targets using three machine-learning algorithms. For the LASSO regression, the changing trajectory of independent variable coefficients and the partial likelihood deviance were plotted, identifying 17 core targets at the optimal lambda value (Fig 5A). In the RF algorithm, the error rate stabilized as the number of decision trees increased; we selected 14 genes with a relative importance score greater than 0.3 for downstream analysis (Fig 5B). The SVM-RFE algorithm selected 13 genes with the lowest error (Fig 5C). S4 File in S1 Data presents a ranked list of core genes identified by the three machine learning algorithms. Finally, intersecting the candidates from cytoHubba, LASSO, RF, and SVM-RFE identified four hub genes: EPHA2, HMOX1, MMP3, and PARP1 (Fig 5D).

Fig 5. Machine learning-based identification of candidate targets associated with ILA in CRC.

(A) Coefficient and regularization plots of the LASSO regression; the vertical dashed lines indicate the optimal lambda value. (B) Error rates in the random forest and the top 14 genes with relative importance greater than 0.3. (C) Accuracy and error rate curves of the SVM-RFE algorithm. (D) Four hub genes (EPHA2, HMOX1, MMP3, PARP1) identified by intersecting the cytoHubba and machine learning-selected targets.

3.5. Detection of the expression of hub genes in CRC

We next examined the expression patterns of the four hub genes between the normal and CRC groups. In the GSE44076 training set, EPHA2, MMP3, and PARP1 were significantly upregulated in CRC, whereas HMOX1 was significantly downregulated (Fig 6A). The area under the ROC curve (AUC) values were 0.939 for EPHA2, 0.980 for HMOX1, 0.955 for MMP3, and 0.997 for PARP1 (Fig 6B). We further validated these findings in an independent merged validation cohort (GSE74602 + GSE32323 + GSE113513). The expression trends remained consistent (Fig 6C), and ROC analysis confirmed the diagnostic value of these hub genes (Fig 6D), with AUC values of 0.935 for EPHA2, 0.883 for HMOX1, 0.935 for MMP3, and 0.804 for PARP1. These results support the robustness of the identified hub genes in distinguishing CRC from normal samples across datasets.

Fig 6. Expression analysis of hub genes in the normal and CRC groups.

(A) Differential expression of EPHA2, HMOX1, MMP3, and PARP1 between normal (n = 50) and CRC (n = 98) in the GSE44076 training set. (B) ROC curves demonstrating diagnostic performance of hub genes in GSE44076. AUC values: EPHA2 (0.939), HMOX1 (0.980), MMP3 (0.955), PARP1 (0.997). (C) Differential expression of hub genes between normal (n = 61) and CRC (n = 61) in the validation set. (D) ROC analysis of hub genes in the validation set. AUC values: EPHA2 (0.935), HMOX1 (0.883), MMP3 (0.935), PARP1 (0.804).

3.6. GSEA analysis

GSEA analysis identified signaling pathways associated with the four hub genes, with the top 10 upregulated and downregulated pathways shown in Fig 7A-D. EPHA2 was significantly associated with aminoacyl-tRNA biosynthesis, ascorbate and aldarate metabolism, DNA replication, and fatty acid metabolism (Fig 7A). HMOX1 expression was significantly correlated with ascorbate and aldarate metabolism, DNA replication, and fatty acid metabolism (Fig 7B). MMP3 was significantly correlated with ascorbate and aldarate metabolism, butanoate metabolism, and fatty acid metabolism (Fig 7C). PARP1 was significantly associated with ascorbate and aldarate metabolism, butanoate metabolism, and DNA replication (Fig 7D). Overall, these results suggest that the hub genes are closely linked to pathways involving redox-related metabolism, DNA replication, and lipid metabolism.

Fig 7. GSEA analysis of the hub genes.

(A) GSEA up- and down-regulation pathways for EPHA2. (B) GSEA up- and down-regulation pathways for HMOX1. (C) GSEA up- and down-regulation pathways for MMP3. (D) GSEA up- and down-regulation pathways for PARP1.
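For readers unfamiliar with how GSEA scores a pathway, the following sketch shows the running-sum enrichment score on a ranked gene list. It is a minimal illustration under stated assumptions: the gene names, scores, and gene sets are hypothetical, the weighting exponent `p = 1.0` follows the commonly used default, and this section does not specify how the ranking for each hub gene was produced, so no attempt is made to reproduce it. Significance testing by permutation is omitted.

```python
from typing import Dict, Set


def enrichment_score(gene_scores: Dict[str, float], gene_set: Set[str], p: float = 1.0) -> float:
    """Signed maximum deviation of the weighted running sum over a descending ranking."""
    ranked = sorted(gene_scores, key=gene_scores.get, reverse=True)
    hits = [gene for gene in ranked if gene in gene_set]
    if not hits or len(hits) == len(ranked):
        raise ValueError("gene set must overlap the ranked list without covering all of it")
    hit_norm = sum(abs(gene_scores[gene]) ** p for gene in hits)
    if hit_norm == 0.0:
        raise ValueError("all genes in the set have a zero ranking score")
    miss_step = 1.0 / (len(ranked) - len(hits))

    running = 0.0
    best = 0.0
    for gene in ranked:
        if gene in gene_set:
            running += abs(gene_scores[gene]) ** p / hit_norm   # step up on a set member
        else:
            running -= miss_step                                # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking metric (for example, a log fold change) for six genes.
ranking = {"G1": 2.0, "G2": 1.5, "G3": 0.5, "G4": -0.5, "G5": -1.0, "G6": -2.0}
es_top = enrichment_score(ranking, {"G1", "G2"})      # set concentrated at the top
es_bottom = enrichment_score(ranking, {"G5", "G6"})   # set concentrated at the bottom
print(es_top, es_bottom)
```

A positive score indicates that the set members cluster among the highest-ranked genes, and a negative score indicates clustering at the bottom of the ranking.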

3.7. Immune cell infiltration

Given the close relationship between CRC progression and the tumor immune microenvironment, we assessed the relative abundance of 22 immune cell subtypes using CIBERSORT. Differential immune cell infiltration between normal and CRC samples in the GSE44076 training set is shown in Fig 8A. Compared with the normal group, plasma cells, T cells follicular helper, macrophages M0/M1/M2, activated mast cells, and neutrophils were significantly increased in CRC, whereas resting CD4 memory T cells, Tregs, resting mast cells, and eosinophils were significantly decreased.

Fig 8. Immune cell infiltration analysis.

(A) Differential immune cell infiltration between normal (n = 50) and CRC (n = 98) in the GSE44076 training set. CIBERSORT analysis quantified 22 immune cell subtypes. (B) Spearman correlation heatmap between hub genes and immune cell subtypes. Asterisks denote statistical significance of correlation coefficients: *P < 0.05, **P < 0.01, ***P < 0.001.

We further evaluated the associations between the four hub genes and immune cell infiltration; the Spearman correlation heatmap is provided in Fig 8B. EPHA2 showed a significant positive correlation with activated mast cells and a significant negative correlation with plasma cells. HMOX1 was positively correlated with neutrophils and negatively correlated with resting CD4 memory T cells and Tregs. MMP3 was positively correlated with activated mast cells and negatively correlated with resting mast cells. PARP1 was negatively correlated with resting dendritic cells and macrophages M1. These results indicate that hub gene expression is linked to distinct immune infiltration patterns in CRC.
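The gene-to-cell-type associations above are Spearman correlations, which are Pearson correlations computed on ranks. The sketch below shows that calculation in plain Python as an illustration; the function names and the example values (a gene's expression and an inferred cell fraction across five samples) are hypothetical, and P-value computation is left out.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-sample values: hub gene expression and an inferred immune cell fraction.
gene_expression = [5.1, 6.3, 7.0, 8.2, 9.4]
cell_fraction = [0.01, 0.03, 0.02, 0.06, 0.09]
rho = spearman_rho(gene_expression, cell_fraction)
print(rho)
```

Because only ranks enter the calculation, the coefficient is insensitive to the scale of the inferred cell fractions, which is one reason rank correlation is a common choice for deconvolution output.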

3.8. Molecular docking

Molecular docking was conducted to evaluate the binding affinity and predicted binding poses of ILA with the proteins encoded by the four hub genes. The docking conformations of ILA with EPHA2, HMOX1, MMP3, and PARP1 are shown in Fig 9A-D, respectively. All four complexes exhibited favorable docking scores (EPHA2: −6.247 kcal/mol; HMOX1: −5.876 kcal/mol; MMP3: −7.208 kcal/mol; PARP1: −5.857 kcal/mol), suggesting stable binding potential. In the EPHA2-ILA complex (Fig 9A), ILA formed π-π stacking interactions with TYR694, π-alkyl interactions with LEU746, ALA644, and ILE619, and a hydrogen bond with THR692. The predicted binding poses for HMOX1, MMP3, and PARP1 (Fig 9B-D) further support that ILA can be accommodated within their binding pockets with favorable interaction patterns, consistent with the docking affinities reported in Table 2. Overall, these docking results suggest that ILA may be involved in CRC-associated regulatory processes through interactions with these proteins.

Fig 9. Molecular docking analysis of ILA and four hub genes.

(A) Docking diagrams for ILA and EPHA2, binding affinity −6.247 kcal/mol. (B) Docking diagrams for ILA and HMOX1, binding affinity −5.876 kcal/mol. (C) Docking diagrams for ILA and MMP3, binding affinity −7.208 kcal/mol. (D) Docking diagrams for ILA and PARP1, binding affinity −5.857 kcal/mol.

Table 2. Detailed parameters of molecular docking.

Number  Gene   PDB ID  Box center (x, y, z) / Å   Affinity / (kcal/mol)
1       EPHA2  6Q7D    −82.93, −19.24, 89.25      −6.247
2       HMOX1  1N45    18.41, −0.96, 1.4          −5.876
3       MMP3   1HY7    2.03, 48.73, 57.68         −7.208
4       PARP1  7KK2    3.54, −17.64, 31.88        −5.857
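To give the docking scores in Table 2 an intuitive scale, one can apply the thermodynamic relation ΔG = RT ln(Kd) and ask what dissociation constant a given score would imply if it were a true binding free energy. This is a rough sketch under strong assumptions: docking scores are scoring-function estimates rather than measured free energies, the temperature of 298.15 K and the 1 M standard state are assumed here, and the function name `score_to_kd` is hypothetical. The resulting values are order-of-magnitude illustrations, not predictions of measured affinity.

```python
from math import exp

GAS_CONSTANT_KCAL = 0.0019872  # kcal/(mol*K)


def score_to_kd(delta_g_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Dissociation constant (mol/L) implied by dG = RT ln(Kd) at a 1 M standard state."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return exp(delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


# Docking scores as listed in Table 2 (kcal/mol).
docking_scores = {"EPHA2": -6.247, "HMOX1": -5.876, "MMP3": -7.208, "PARP1": -5.857}

# Print from most to least favorable score.
for protein, score in sorted(docking_scores.items(), key=lambda item: item[1]):
    kd_micromolar = score_to_kd(score) * 1e6
    print(f"{protein}: {score:.3f} kcal/mol -> roughly {kd_micromolar:.1f} uM")
```

Under these assumptions all four scores correspond to micromolar-range values, which is typical for a small metabolite and consistent with treating the docking results as supportive rather than conclusive.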

3.9. Molecular dynamics simulations of ILA with hub genes

To assess the dynamic stability of the docked poses and validate the binding interactions, MD simulations were performed. Comparative RMSD profiles for the four protein-ILA complexes over a 10 ns simulation are shown in Fig 10A-D, indicating that the ILA-HMOX1 complex exhibited the most rapid equilibration and the lowest RMSD fluctuation among the four candidates. It should be acknowledged that the 10 ns MD trajectories represent a preliminary stability assessment and are insufficient to fully characterize long-timescale conformational changes. Therefore, conclusions derived from the short simulations should be interpreted cautiously, and they were complemented by extended sampling for the representative ILA-HMOX1 complex. The 100 ns simulation confirmed the dynamic stability of the ILA-HMOX1 complex (Fig 10E). The RMSD of the protein Cα atoms converged after approximately 20 ns, maintaining a stable value of approximately 1.8 Å. Critically, the ligand RMSD, when fitted to the protein, also remained stable within a narrow range (fluctuating around 2.2 Å), indicating that ILA maintained a consistent binding pose within the active site without dissociating. In addition, MM/GBSA calculations gave an estimated binding free energy of −25.58 kcal/mol for ILA bound to HMOX1.

Fig 10. Molecular dynamics simulations of ILA with hub genes.

(A-D) Comparative RMSD of Cα atoms for the four protein-ILA complexes over a 10 ns simulation. (E) RMSD plots for the Cα atoms of HMOX1 and the heavy atoms of the ligand ILA during a 100 ns simulation, confirming the dynamic stability of the complex. (F) RMSF analysis of the ILA-HMOX1 complex. The blue curve illustrates the fluctuation magnitude of individual amino acid residues; the green vertical bars highlight the specific residues involved in the interaction with ILA. (G) Protein-ligand interaction analysis of the ILA-HMOX1 complex. Interaction fraction of key residues with ILA, categorized by interaction type, including hydrogen bonds, hydrophobic contacts, ionic interactions, and water bridges. (H) 2D schematic diagram illustrating the key persistent intermolecular interactions, including π-cation interactions, salt bridges, and hydrogen bonds, that anchor ILA within the HMOX1 binding pocket throughout the 100 ns trajectory.

To further evaluate residue-level flexibility, RMSF analysis of protein Cα atoms was performed based on the 100 ns trajectory (Fig 10F). Most residues exhibited low RMSF values (generally below 1.2 Å), indicating overall structural stability of the protein during the simulation. In addition, protein-ligand interaction analysis was conducted to characterize the persistence and nature of intermolecular contacts throughout the simulation (Fig 10G). Several residues, including LYS18, TYR134, LYS179, and ARG183, exhibited high interaction fractions, predominantly mediated by hydrogen bonds, ionic interactions, and water bridges. Time-resolved interaction maps demonstrated that these contacts were maintained across most of the simulation period, supporting the formation of a stable and persistent binding interface. The 2D interaction diagram highlights critical, persistent intermolecular contacts, demonstrating that these interactions effectively anchor ILA within the HMOX1 binding pocket over the entire 100 ns simulation (Fig 10H).
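The two stability metrics used in this section have simple definitions: RMSD measures how far a whole frame deviates from a reference structure, whereas RMSF measures how much each atom fluctuates around its own time-averaged position. The sketch below illustrates both on a toy trajectory. It is not the analysis code used for the simulations: the coordinates are hypothetical, the function names are chosen for illustration, and the frames are assumed to be already superposed on the reference, so no fitting step is performed.

```python
from math import sqrt
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
Frame = Sequence[Coord]


def rmsd(frame: Frame, reference: Frame) -> float:
    """Root-mean-square deviation of one frame from a reference (frames pre-aligned)."""
    if len(frame) != len(reference) or not frame:
        raise ValueError("frames must be non-empty and contain the same atoms")
    squared = sum((a - b) ** 2 for p, q in zip(frame, reference) for a, b in zip(p, q))
    return sqrt(squared / len(frame))


def rmsf(trajectory: Sequence[Frame]) -> List[float]:
    """Per-atom root-mean-square fluctuation around each atom's mean position."""
    if not trajectory:
        raise ValueError("trajectory must contain at least one frame")
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for atom in range(n_atoms):
        mean = [sum(frame[atom][d] for frame in trajectory) / n_frames for d in range(3)]
        mean_sq_dev = sum(
            sum((frame[atom][d] - mean[d]) ** 2 for d in range(3)) for frame in trajectory
        ) / n_frames
        fluctuations.append(sqrt(mean_sq_dev))
    return fluctuations


# Toy trajectory (coordinates in angstroms): atom 0 is rigid, atom 1 oscillates along y.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, -1.0, 0.0)],
]
frame_rmsd = rmsd(toy_trajectory[1], toy_trajectory[0])
atom_rmsf = rmsf(toy_trajectory)
print(frame_rmsd, atom_rmsf)
```

In this toy case the rigid atom has zero RMSF and the oscillating atom a nonzero one, mirroring how flexible residues stand out in an RMSF profile while a plateau in RMSD over time indicates that the overall structure has equilibrated.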

4. Discussion

Despite improvements in survival among patients with CRC in recent years, a subset of patients still faces major clinical challenges, including chemotherapy resistance, limited benefit from immunotherapy, and postoperative recurrence [36]. Accordingly, identifying new molecular targets and developing effective intervention concepts remain important priorities in CRC research. Growing evidence supports the involvement of the gut microbiota and its metabolites in CRC initiation and progression. Microbial metabolites act as key mediators of microbiota-host interactions and have therefore attracted increasing interest as potential contributors to CRC-related biological processes. Tryptophan, for example, can be converted by gut microbes into multiple bioactive indole derivatives, including indole, ILA, indole-3-aldehyde (IAld), and indole-3-propionic acid (IPA), which have been reported to shape the TME [5]. ILA has been reported to strengthen intestinal barrier function and attenuate inflammatory responses through AhR signaling, with potential relevance to CRC-related phenotypes [8]. However, the precise molecular mechanisms linking ILA to CRC-associated regulatory processes remain incompletely understood.

This study represents the first systematic elucidation of the potential mechanism underlying ILA in CRC treatment, utilizing an integrative approach combining network pharmacology, machine learning, molecular docking, and molecular dynamics simulations. These methodologies are particularly well-suited for deciphering complex metabolite-disease interactions, as they enable the comprehensive identification of disease- and drug-related targets while uncovering key signaling pathway networks [11]. In this study, four critical targets (EPHA2, HMOX1, MMP3, and PARP1) were identified (Fig 11). We elucidated the potential CRC-associated regulatory mechanisms of ILA through target modulation and discussed its possible immunological relevance, offering new mechanistic perspectives on gut microbiota-derived metabolites in CRC.

Fig 11. Potential hub genes and pathways associated with ILA in CRC.

4.1. Potential CRC-associated mechanisms linked to ILA via multi-target regulation

4.1.1. Oxidative stress regulation.

Heme oxygenase 1 (HMOX1) is an antioxidant enzyme that catalyzes heme degradation into carbon monoxide, biliverdin, and ferrous iron. Increasing evidence links HMOX1 to ferroptosis, a regulated form of cell death driven by iron-dependent lipid peroxidation [37]. By shaping oxidative stress, inflammatory signaling, and apoptosis, the HMOX1 protein has been implicated in tumor cell survival, proliferation, and metastatic behavior. For example, Shenqi Sanjie Granules were reported to induce ferroptosis in colon cancer cells through HMOX1 upregulation, with associated suppression of tumor growth and metastasis [38]. Recent in vitro and in vivo studies suggest that ILA has antioxidant and anti-inflammatory properties. Mechanistic work indicates that ILA can activate the Nrf2 pathway and increase expression of antioxidant defense genes in intestinal epithelial cells [39]. Additional studies suggest that ILA activates Nrf2 via an AhR-dependent mechanism, increases HMOX1 expression in HT-29 colon cancer cells, and enhances tight-junction protein expression, thereby ameliorating LPS-induced intestinal barrier injury [40]. Together, these reports support the plausibility that ILA-associated signaling could be linked to HMOX1-related oxidative stress programs in CRC-associated contexts.

4.1.2. DNA damage repair regulation.

Poly (ADP-ribose) polymerase 1 (PARP1) participates in DNA repair, transcriptional regulation, and apoptosis. It catalyzes the transfer of ADP-ribose units from NAD+ to target proteins, forming poly(ADP-ribose) (PAR) chains [41]. PARP1 overexpression has been reported in multiple cancer types and has been associated with tumor progression, metastasis, and angiogenesis [42]. Clinically, PARP inhibitors can impair DNA repair and promote apoptosis, either as monotherapy or in combination regimens [43]. Lin et al. reported that a PARP inhibitor reduced metastatic nodules in several organs in CRC nude mouse models, supporting PARP1 as a therapeutically relevant target in CRC [44]. A recent study reported that the indole ring of tryptophan can form stable π-π stacking and hydrogen-bond interactions within the PARP1 active site, potentially mimicking aspects of NAD+ binding and reducing PARP1 activity, which may contribute to DNA repair defects [45]. In our dataset, PARP1 expression was elevated in CRC samples, and molecular docking suggested that ILA can adopt a plausible binding pose with PARP1 through hydrogen bonding. These results provide a structural rationale for a potential association between ILA-PARP1 interactions and CRC-relevant DNA damage response pathways.

4.1.3. Extracellular matrix remodeling.

Ephrin type-A receptor 2 (EPHA2) and matrix metalloproteinase 3 (MMP3) may be jointly related to extracellular matrix (ECM) remodeling and TME reprogramming during CRC progression. EPHA2 is frequently overexpressed in malignancies including breast, prostate, and colorectal cancer, and higher expression is often associated with invasiveness and poor prognosis [46]. EPHA2 has been implicated in metastasis through mechanisms involving EMT and ECM remodeling. In CRC, EPHA2 overexpression has been associated with liver metastasis and adverse clinical outcomes [47]. Lithocholic acid-tryptophan conjugates have been reported as EPHA2 antagonists [48]. However, direct evidence for ILA-EPHA2 interactions remains limited. In our analysis, EPHA2 was overexpressed in CRC samples, and docking suggested a plausible ILA-EPHA2 interaction. These computational observations raise the possibility that ILA could be linked to EPHA2-associated ECM programs.

MMP3 encodes a matrix metalloproteinase involved in ECM degradation and remodeling [49]. Elevated MMP3 expression has been reported in CRC tumor tissues and is associated with tumor progression and poor prognosis [50]. Although ILA-MMP3 interactions have not been well characterized, IAld and IPA, two other tryptophan-derived indole metabolites, were reported to mitigate IL-1β-induced chondrocyte inflammation and ECM degradation, partly via suppression of inflammatory mediators and MMP3 expression [51,52]. Given the structural similarity among indole metabolites, it is reasonable to hypothesize that ILA could be linked to MMP3-related ECM remodeling pathways in CRC, although direct validation is required.

Across the integrative machine learning and PPI analyses, EPHA2, HMOX1, MMP3, and PARP1 were consistently prioritized as hub genes connecting ILA-associated targets with CRC-related pathological processes. Although direct experimental evidence for ILA binding to these proteins in CRC is currently lacking, convergence across network topology, expression profiling, functional enrichment, immune infiltration analysis, and structural simulations suggests that these genes represent biologically meaningful candidates rather than isolated computational signals. Functionally, EPHA2 and MMP3 are associated with oncogenic signaling and ECM remodeling, PARP1 is central to DNA damage responses and replication stress, and HMOX1 is involved in oxidative stress regulation and inflammatory homeostasis. These findings suggest that ILA may be involved in CRC-associated regulatory processes by coordinately modulating oxidative stress responses, DNA damage repair, and extracellular matrix dynamics, thereby providing mechanistic insight into the potential role of gut microbiota-derived metabolites as adjuvant regulatory strategies in CRC.

4.2. Predicted modulation of CRC-associated signaling pathways by ILA

In this study, KEGG and GO enrichment analyses were performed on the 39 ILA-CRC common targets. The enriched terms were mainly related to the PPAR signaling pathway, PI3K–AKT signaling pathway, and IL-17 signaling pathway. In addition, the four hub genes showed associations with DNA replication and fatty acid metabolism signatures in the transcriptomic analyses.

ILA has been reported to exhibit antioxidant and anti-inflammatory properties in experimental systems, and these effects may involve multi-target pathway modulation that contributes to intestinal homeostasis [53]. The PPAR pathway plays a role in lipid metabolism and inflammatory signaling, and dysregulation of this pathway has been linked to CRC development. Lian et al. reported that ILA reduced lipid peroxidation products and alleviated cardiac toxicity in mice [54]. The PI3K-AKT pathway is a major signaling axis controlling proliferation, metabolism, and migration. Prior work suggests that ILA can activate PI3K-AKT signaling via AhR in some contexts, which may be related to inflammatory regulation and macrophage phenotypes [10]. IL-17 is a proinflammatory cytokine produced by Th17 cells and has been implicated in tumor-promoting inflammation and immune evasion [55]. Inflammatory bowel disease (IBD) is a recognized risk factor for colitis-associated CRC [56]. Mechanistic studies suggest that ILA can reduce LPS/TNF-α-induced IL-8 production in intestinal epithelial cells, which may be relevant to inflammation-driven carcinogenesis [39]. Other studies reported that ILA can target RORγt, inhibit Th17 differentiation, and reduce IL-17 signaling activation in CRC mouse models [57]. Wang and colleagues reported that ILA supports intestinal barrier integrity through coordinated activation of AhR, increased tight-junction protein expression, and inhibition of NF-κB activity, together with reduced proinflammatory cytokines and increased IL-10 in IBD-related settings [38,58]. These findings suggest that ILA may have potential mechanistic implications for the prevention and treatment of CRC through the aforementioned signaling pathways.

4.3. Immune-related findings and immune infiltration analysis

Accumulating evidence indicates that immune cell infiltration is associated with prognosis and treatment response in CRC [59]. Alterations in ECM composition and immune cell distribution in the TME can contribute to immunosuppressive states and tumor progression [60,61]. In our study, immune infiltration analysis identified statistically significant correlations between hub gene expression and inferred immune cell proportions. EPHA2 and MMP3 expression levels were positively correlated with activated mast cells. HMOX1 expression correlated positively with neutrophils and negatively with resting memory CD4⁺ T cells and regulatory T cells. PARP1 expression correlated negatively with resting dendritic cells and M1 macrophages.

Gut microbiota-derived metabolites have been reported to influence immune-related signaling through receptors such as G protein-coupled receptors and AhR, thereby shaping immune responses in various disease contexts [59]. Dendritic cells are antigen-presenting cells that support CD8⁺ T-cell activation, and prior experiments suggested that ILA can enhance CD8⁺ T-cell-associated responses via IL12A regulation in dendritic cells under specific conditions [62]. Macrophages are also key TME components with diverse phenotypes in tumor initiation and progression [63]. ILA has been identified as an AhR ligand, and AhR signaling has been implicated in macrophage phenotypic balance and inflammatory regulation. Li et al. reported reduced expression of the AhR-responsive gene CYP1B1 in CRC based on GEPIA analysis, while in vitro work suggested that ILA may be involved in macrophage differentiation and inflammatory responses in colitis-associated tumor models [64]. Taken together, while these previous findings provide biological context supporting a potential link between ILA-related pathways and immune regulation, the immune infiltration results in this study should be interpreted cautiously. Direct immunomodulatory effects and causal mechanisms require further validation through targeted in vitro and in vivo experiments.

4.4. Research limitations and future directions

Through an integrative framework combining network pharmacology, machine learning, molecular docking, and molecular dynamics analyses, this study systematically explored the potential targets and molecular mechanisms of ILA in CRC. Nevertheless, several limitations should be acknowledged. First, although the four hub genes identified were consistently supported by multiple computational approaches, they were primarily derived from bioinformatic analyses and therefore require further experimental validation. Future studies employing in vitro assays and in vivo models will be necessary to confirm the functional roles of these targets and to elucidate their causal involvement in CRC-associated regulatory processes linked to ILA. Second, transcriptomic data provide gene-level insights, but additional multi-omics layers, including proteomics and metabolomics, may offer a more comprehensive view of ILA-related regulatory networks. Integrating these layers may help clarify whether predicted target-pathway relationships are reflected at the protein and metabolite levels.

5. Conclusion

By integrating network pharmacology, machine learning, molecular docking, and dynamics simulations, this study systematically elucidates the predicted mechanistic role of the gut microbiota-derived metabolite ILA in CRC. Four hub genes were identified as key molecular nodes potentially associated with CRC-related regulatory processes involving ILA. These findings provide mechanistic insight into how ILA may participate in CRC-associated molecular regulation through coordinated, multi-target interactions. Importantly, this study not only advances the understanding of the molecular mechanisms underlying ILA activity in CRC, but also highlights the broader potential of gut microbiota-derived metabolites as promising candidates for adjuvant regulatory strategies in CRC.

Supporting information

S1 Data.

S1 File. The summary of targets of ILA.
S2 File. The summary of targets of CRC.
S3 File. KEGG and GO analysis.
S4 File. Feature importance rankings of hub genes.
S1 Fig. The PCA plots before and after batch correction.
S2 Fig. KEGG and GO analysis of 39 common targets.

(ZIP)

pone.0344478.s001.zip (3.1MB, zip)

Data Availability

Some relevant data are within the manuscript and its Supporting Information files. This research also involves data from the Gene Expression Omnibus database (GEO; http://www.ncbi.nih.gov/geo/), which belongs to the public domain. The accession numbers for the datasets used in this study are GSE44076, GSE74602, GSE32323, and GSE113513.

Funding Statement

The research was supported by the Shaanxi Province Natural Science Basic Research Program 2025 (2025JC-YBMS-1051) and the Xi’an International Medical Center Hospital Research Fund (2025QN10).

References

  • 1. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394–424. doi: 10.3322/caac.21492
  • 2. Xiang L, He B, Liu Q, Hu D, Liao W, Li R, et al. Antitumor effects of curcumin on the proliferation, migration and apoptosis of human colorectal carcinoma HCT-116 cells. Oncol Rep. 2020;44(5):1997–2008. doi: 10.3892/or.2020.7765
  • 3. Rodríguez-García C, Gutiérrez-Santiago F. Emerging Role of Plant-Based Dietary Components in Post-Translational Modifications Associated with Colorectal Cancer. Life (Basel). 2023;13(2):264. doi: 10.3390/life13020264
  • 4. Kim Y-H, Lee SB, Shim S, Kim A, Park J-H, Jang W-S, et al. Hyaluronic acid synthase 2 promotes malignant phenotypes of colorectal cancer cells through transforming growth factor beta signaling. Cancer Sci. 2019;110(7):2226–36. doi: 10.1111/cas.14070
  • 5. Krautkramer KA, Fan J, Bäckhed F. Gut microbial metabolites as multi-kingdom intermediates. Nat Rev Microbiol. 2021;19(2):77–94. doi: 10.1038/s41579-020-0438-4
  • 6. Cai J, Sun L, Gonzalez FJ. Gut microbiota-derived bile acids in intestinal immunity, inflammation, and tumorigenesis. Cell Host Microbe. 2022;30(3):289–300. doi: 10.1016/j.chom.2022.02.004
  • 7. Ocvirk S, O’Keefe SJD. Dietary fat, bile acid metabolism and colorectal cancer. Semin Cancer Biol. 2021;73:347–55. doi: 10.1016/j.semcancer.2020.10.003
  • 8. Roager HM, Licht TR. Microbial tryptophan catabolites in health and disease. Nat Commun. 2018;9(1):3294. doi: 10.1038/s41467-018-05470-4
  • 9. Zhou S, Wang K, Huang J, Xu Z, Yuan Q, Liu L, et al. Indole-3-lactic acid suppresses colorectal cancer via metabolic reprogramming. Gut Microbes. 2025;17(1):2508949. doi: 10.1080/19490976.2025.2508949
  • 10. Li Y, Li Q, Yuan R, Wang Y, Guo C, Wang L. Bifidobacterium breve-derived indole-3-lactic acid ameliorates colitis-associated tumorigenesis by directing the differentiation of immature colonic macrophages. Theranostics. 2024;14(7):2719–35. doi: 10.7150/thno.92350
  • 11. De Juan A, Segura E. Modulation of Immune Responses by Nutritional Ligands of Aryl Hydrocarbon Receptor. Front Immunol. 2021;12:645168. doi: 10.3389/fimmu.2021.645168
  • 12. Shang L, Wang Y, Li J, Zhou F, Xiao K, Liu Y, et al. Mechanism of Sijunzi Decoction in the treatment of colorectal cancer based on network pharmacology and experimental validation. J Ethnopharmacol. 2023;302(Pt A):115876. doi: 10.1016/j.jep.2022.115876
  • 13. Wei W, Li Y, Huang T. Using machine learning methods to study colorectal cancer tumor micro-environment and its biomarkers. Int J Mol Sci. 2023;24(13):11133. doi: 10.3390/ijms241311133
  • 14. Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, et al. PubChem substance and compound databases. Nucleic Acids Res. 2016;44(D1):D1202-13. doi: 10.1093/nar/gkv951
  • 15. Daina A, Michielin O, Zoete V. SwissTargetPrediction: updated data and new features for efficient prediction of protein targets of small molecules. Nucleic Acids Res. 2019;47(W1):W357–64. doi: 10.1093/nar/gkz382
  • 16. Wang Z, Liang L, Yin Z, Lin J. Improving chemical similarity ensemble approach in target prediction. J Cheminform. 2016;8:20. doi: 10.1186/s13321-016-0130-x
  • 17. Yao Z-J, Dong J, Che Y-J, Zhu M-F, Wen M, Wang N-N, et al. TargetNet: A web service for predicting potential drug-target interaction profiling via multi-target SAR models. J Comput Aided Mol Des. 2016;30(5):413–24. doi: 10.1007/s10822-016-9915-2
  • 18. Nickel J, Gohlke B-O, Erehman J, Banerjee P, Rong WW, Goede A, et al. SuperPred: update on drug classification and target prediction. Nucleic Acids Res. 2014;42(Web Server issue):W26-31. doi: 10.1093/nar/gku477
  • 19. Wang X, Shen Y, Wang S, Li S, Zhang W, Liu X, et al. PharmMapper 2017 update: A web server for potential drug target identification with a comprehensive target pharmacophore database. Nucleic Acids Res. 2017;45(W1):W356–60. doi: 10.1093/nar/gkx374
  • 20. UniProt Consortium T. UniProt: The universal protein knowledgebase. Nucleic Acids Res. 2018;46(5):2699. doi: 10.1093/nar/gky092
  • 21. Amberger JS, Bocchini CA, Scott AF, Hamosh A. OMIM.org: Leveraging knowledge across phenotype-gene relationships. Nucleic Acids Res. 2019;47(D1):D1038–43. doi: 10.1093/nar/gky1151
  • 22. Zhou Y, Zhang Y, Zhao D, Yu X, Shen X, Zhou Y, et al. TTD: Therapeutic Target Database describing target druggability information. Nucleic Acids Res. 2024;52(D1):D1465–77. doi: 10.1093/nar/gkad751
  • 23. Safran M, Dalah I, Alexander J, Rosen N, Iny Stein T, Shmoish M, et al. GeneCards Version 3: the human gene integrator. Database (Oxford). 2010;2010:baq020. doi: 10.1093/database/baq020
  • 24. Knox C, Wilson M, Klinger CM, Franklin M, Oler E, Wilson A, et al. DrugBank 6.0: The DrugBank Knowledgebase for 2024. Nucleic Acids Res. 2024;52(D1):D1265–75. doi: 10.1093/nar/gkad976
  • 25. Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, et al. NCBI GEO: Archive for functional genomics data sets--update. Nucleic Acids Res. 2013;41(Database issue):D991-5. doi: 10.1093/nar/gks1193
  • 26. Langfelder P, Horvath S. WGCNA: An R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559. doi: 10.1186/1471-2105-9-559
  • 27. Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43(Database issue):D447-52. doi: 10.1093/nar/gku1003
  • 28. Zhou Y, Zhou B, Pache L, Chang M, Khodabakhshi AH, Tanaseichuk O, et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun. 2019;10(1):1523. doi: 10.1038/s41467-019-09234-6
  • 29. Tang D, Chen M, Huang X, Zhang G, Zeng L, Zhang G, et al. SRplot: A free online platform for data visualization and graphing. PLoS One. 2023;18(11):e0294236. doi: 10.1371/journal.pone.0294236
  • 30. He J, Liu A, Shen H, Jiang Y, Gao M, Yu L, et al. Shared diagnostic genes and potential mechanisms between polycystic ovary syndrome and recurrent miscarriage revealed by integrated transcriptomics analysis and machine learning. Front Endocrinol (Lausanne). 2024;15:1335106. doi: 10.3389/fendo.2024.1335106
  • 31. Tibshirani R. The lasso method for variable selection in the cox model. Statist Med. 1997;16(4):385–95. doi: 10.1002/(sici)1097-0258(19970228)16:4<385::aid-sim380>3.0.co;2-3
  • 32. Izmirlian G. Application of the random forest classification algorithm to a SELDI-TOF proteomics study in the setting of a cancer prevention trial. Ann N Y Acad Sci. 2004;1020:154–74. doi: 10.1196/annals.1310.015
  • 33. Hao P-Y, Chiang J-H, Chen Y-D. Possibilistic classification by support vector networks. Neural Netw. 2022;149:40–56. doi: 10.1016/j.neunet.2022.02.007
  • 34. Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, et al. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods. 2015;12(5):453–7. doi: 10.1038/nmeth.3337
  • 35. Shaw DE. Schrödinger release 2021-4: Desmond molecular dynamics system. New York, NY, USA: D. E. Shaw Research. 2021.
  • 36. Xing X, Zou Z, He C, Hu Z, Liang K, Liang W, et al. Enhanced antitumor effect of cytotoxic T lymphocytes induced by dendritic cells pulsed with colorectal cancer cell lysate expressing α-Gal epitopes. Oncol Lett. 2019;18(1):864–71. doi: 10.3892/ol.2019.10376
  • 37. Ma J, Zhao N, Zhu D. Endothelial cellular responses to biodegradable metal Zinc. ACS Biomater Sci Eng. 2015;1(11):1174–82. doi: 10.1021/acsbiomaterials.5b00319
  • 38. Chen M, Ma S, Ji W, Hu W, Gao J, Yang J, et al. Shenqi Sanjie Granules induce Hmox1-mediated ferroptosis to inhibit colorectal cancer. Heliyon. 2024;10(18):e38021. doi: 10.1016/j.heliyon.2024.e38021
  • 39. Ehrlich AM, Pacheco AR, Henrick BM, Taft D, Xu G, Huda MN, et al. Indole-3-lactic acid associated with Bifidobacterium-dominated microbiota significantly decreases inflammation in intestinal epithelial cells. BMC Microbiol. 2020;20(1):357. doi: 10.1186/s12866-020-02023-y
  • 40. Wang A, Guan C, Wang T, Mu G, Tuo Y. Indole-3-Lactic Acid, a Tryptophan Metabolite of Lactiplantibacillus plantarum DPUL-S164, Improved Intestinal Barrier Damage by Activating AhR and Nrf2 Signaling Pathways. J Agric Food Chem. 2023;71(48):18792–801. doi: 10.1021/acs.jafc.3c06183
  • 41. Scobie KN, Damez-Werno D, Sun H, Shao N, Gancarz A, Panganiban CH, et al. Essential role of poly(ADP-ribosyl)ation in cocaine action. Proc Natl Acad Sci U S A. 2014;111(5):2005–10. doi: 10.1073/pnas.1319703111
  • 42. Zheng J, Chen Y, Peng X, Zheng W, Zhang Y, Hei F, et al. Research Progress of M6A Methylation modification in immunotherapy of colorectal cancer. Curr Cancer Drug Targets. 2025. doi: 10.2174/0115680096332984250221071109
  • 43. Cheng B, Pan W, Xing Y, Xiao Y, Chen J, Xu Z. Recent advances in DDR (DNA damage response) inhibitors for cancer therapy. Eur J Med Chem. 2022;230:114109. doi: 10.1016/j.ejmech.2022.114109
  • 44. Lin K, Zhao Y, Tang Y, Chen Y, Lin M, He L. Collagen I-induced VCAN/ERK signaling and PARP1/ZEB1-mediated metastasis facilitate OSBPL2 defect to promote colorectal cancer progression. Cell Death Dis. 2024;15(1):85. doi: 10.1038/s41419-024-06468-1
  • 45. Xu H, Li T, Wang Q, Lv Y, Sun C, Yan R, et al. Small molecular oligopeptides adorned with tryptophan residues as potent antitumor agents: Design, synthesis, bioactivity assay, computational prediction, and experimental validation. J Chem Inf Model. 2025;65(3):1514–36. doi: 10.1021/acs.jcim.4c01759
  • 46. Pasquale EB. Eph-ephrin bidirectional signaling in physiology and disease. Cell. 2008;133(1):38–52. doi: 10.1016/j.cell.2008.03.011
  • 47. Arabzadeh A, McGregor K, Breton V, Van Der Kraak L, Akavia UD, Greenwood CMT, et al. EphA2 signaling is impacted by carcinoembryonic antigen cell adhesion molecule 1-L expression in colorectal cancer liver metastasis in a cell context-dependent manner. Oncotarget. 2017;8(61):104330–46. doi: 10.18632/oncotarget.22236
  • 48. Jannu AK, Puppala ER, Gawali B, Syamprasad NP, Alexander A, Marepally S, et al. Lithocholic acid-tryptophan conjugate (UniPR126) based mixed micelle as a nano carrier for specific delivery of niclosamide to prostate cancer via EphA2 receptor. Int J Pharm. 2021;605:120819. doi: 10.1016/j.ijpharm.2021.120819
  • 49.He L, Kang Q, Chan KI, Zhang Y, Zhong Z, Tan W. The immunomodulatory role of matrix metalloproteinases in colitis-associated cancer. Front Immunol. 2023;13:1093990. doi: 10.3389/fimmu.2022.1093990 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Pan Z, Lin H, Fu Y, Zeng F, Gu F, Niu G, et al. Identification of gene signatures associated with ulcerative colitis and the association with immune infiltrates in colon cancer. Front Immunol. 2023;14:1086898. doi: 10.3389/fimmu.2023.1086898 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Zhuang H, Ren X, Jiang F, Zhou P. Indole-3-propionic acid alleviates chondrocytes inflammation and osteoarthritis via the AhR/NF-κB axis. Mol Med. 2023;29(1):17. doi: 10.1186/s10020-023-00614-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Zhuang H, Li B, Xie T, Xu C, Ren X, Jiang F, et al. Indole-3-aldehyde alleviates chondrocytes inflammation through the AhR-NF-κB signalling pathway. Int Immunopharmacol. 2022;113(Pt A):109314. doi: 10.1016/j.intimp.2022.109314 [DOI] [PubMed] [Google Scholar]
  • 53.Galligan JJ. Beneficial actions of microbiota-derived tryptophan metabolites. Neurogastroenterol Motil. 2018;30(2):10.1111/nmo.13283. doi: 10.1111/nmo.13283 [DOI] [PubMed] [Google Scholar]
  • 54.Lian J, Lin H, Zhong Z, Song Y, Shao X, Zhou J, et al. Indole-3-Lactic acid inhibits doxorubicin-induced ferroptosis through activating aryl hydrocarbon receptor/nrf2 signalling pathway. J Cell Mol Med. 2025;29(2):e70358. doi: 10.1111/jcmm.70358 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Park H, Li Z, Yang XO, Chang SH, Nurieva R, Wang Y-H, et al. A distinct lineage of CD4 T cells regulates tissue inflammation by producing interleukin 17. Nat Immunol. 2005;6(11):1133–41. doi: 10.1038/ni1261 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Shah SC, Itzkowitz SH. Colorectal cancer in inflammatory bowel disease: Mechanisms and management. Gastroenterology. 2022;162(3):715-730.e3. doi: 10.1053/j.gastro.2021.10.035 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Han J-X, Tao Z-H, Wang J-L, Zhang L, Yu C-Y, Kang Z-R, et al. Microbiota-derived tryptophan catabolites mediate the chemopreventive effects of statins on colorectal cancer. Nat Microbiol. 2023;8(5):919–33. doi: 10.1038/s41564-023-01363-5 [DOI] [PubMed] [Google Scholar]
  • 58.Wang A, Guan C, Wang T, Mu G, Tuo Y. Lactiplantibacillus plantarum-Derived Indole-3-lactic Acid Ameliorates Intestinal Barrier Integrity through the AhR/Nrf2/NF-κB Axis. J Agric Food Chem. 2024;:10.1021/acs.jafc.4c01622. doi: 10.1021/acs.jafc.4c01622 [DOI] [PubMed] [Google Scholar]
  • 59.Zhou Z, Wang B, Pan X, Lv J, Lou Z, Han Y, et al. Microbial metabolites indole derivatives sensitize mice to D-GalN/LPS induced-acute liver failure via the Tlr2/NF-κB pathway. Front Microbiol. 2023;13:1103998. doi: 10.3389/fmicb.2022.1103998 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Nisar KS, Kulachi MO, Ahmad A, Farman M, Saqib M, Saleem MU. Fractional order cancer model infection in human with CD8+ T cells and anti-PD-L1 therapy: Simulations and control strategy. Sci Rep. 2024;14(1):16257. doi: 10.1038/s41598-024-66593-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Nisar KS, Farman M, Hincal E. Mathematical analysis and chaotic behavior of cancer treatment with virotherapy by using fractional integral sustainable approach. J Appl Math Comput. 2025;71(3):4283–311. doi: 10.1007/s12190-025-02397-0 [DOI] [Google Scholar]
  • 62.Zhang Q, Zhao Q, Li T, et al. Lactobacillus plantarum-derived indole-3-lactic acid ameliorates colorectal tumorigenesis via epigenetic regulation of CD8+ T cell immunity. Cell Metab. 2023;35(6):943–60. doi: 10.1016/j.cmet.2023.04.015 [DOI] [PubMed] [Google Scholar]
  • 63.Cendrowicz E, Sas Z, Bremer E, Rygiel TP. The role of macrophages in cancer development and therapy. Cancers (Basel). 2021;13(8):1946. doi: 10.3390/cancers13081946 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Shinde R, Hezaveh K, Halaby MJ, Kloetgen A, Chakravarthy A, da Silva Medina T, et al. Apoptotic cell-induced AhR activity is required for immunological tolerance and suppression of systemic lupus erythematosus in mice and humans. Nat Immunol. 2018;19(6):571–82. doi: 10.1038/s41590-018-0107-1 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Kayode Raheem

24 Dec 2025

Potential mechanism prediction of indole-3-lactic acid against colorectal cancer based on network pharmacology, machine learning and molecular docking

PLOS One

Dear Dr. Feng,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

After reviewing the manuscript and the reviewers' comments, I find that this study needs major revision. Both reviewers find the overall study technically sound and the manuscript generally well organized; however, substantial revisions are required to ensure that conclusions are appropriately framed for an in silico study and that the Methods contain sufficient detail for full assessment and reproducibility. The manuscript also requires careful language editing to correct grammatical issues and improve clarity.

Please submit your revised manuscript by Feb 06 2026 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Kayode Raheem

Guest Editor

PLOS One

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS One has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, we expect all author-generated code to be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

3. Please note that funding information should not appear in any section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. Please remove any funding-related text from the manuscript.

4. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections does not match.

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

5. We note that Figure 1 in your submission contains copyrighted images. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission:

1. You may seek permission from the original copyright holder of Figure 1 to publish the content specifically under the CC BY 4.0 license.

We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:

“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”

Please upload the completed Content Permission Form or other proof of granted permissions as an 'Other' file with your submission.

In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

2. If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

6. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

7. If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.

Additional Editor Comments:

Dear Dr. Feng,

Thank you for submitting your manuscript, “Potential mechanism prediction of indole-3-lactic acid against colorectal cancer based on network pharmacology, machine learning and molecular docking” (PONE-D-25-56178), to PLOS ONE.

We invite you to revise the paper, carefully addressing the comments from the reviewers and the editor. After considering the reports, my decision is Major revision. The overall computational workflow is of interest and the manuscript is generally well structured, but substantial revisions are needed to ensure that conclusions are appropriately framed for an in silico study and that the Methods provide sufficient detail for full assessment and reproducibility. When this revision is ready, please submit the updated manuscript and a point-by-point response. This will allow the reviewers and editor to evaluate how each concern has been addressed and, where appropriate, to determine whether additional external review is required. The reviewers' comments are attached.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

**********

Reviewer #1: From introduction to conclusion, the manuscript is well-structured and clear. A gut microbiota-derived metabolite, Indole-3-lactic acid (ILA), may treat colorectal cancer (CRC). The authors present a complete history. The study's goals, methods, and results are organized. A more extensive description of several portions would improve clarity before acceptance.

Title

• The title is written in a passive voice, which can make it sound less engaging and less clear. Consider rephrasing the title.

• The title is a bit long and should be rephrased to be more concise.

• Consider rephrasing the title to focus more on the research question or the main findings.

Abstract:

• Rephrase the sentence "Our study systematically elucidated..." to be more concise and impactful.

• Consider adding a brief summary of the study's main findings.

Introduction:

• Add more specific details about the current state of CRC research and how ILA fits into this landscape.

• Reorganize the paragraph discussing ILA's biological activities and mechanisms to improve flow.

• Add transitional phrases or sentences to connect the ideas between paragraphs.

• Rephrase the sentence "In this study, we integrated..." to be more concise and focused on the main objective.

Materials and Methods:

• Provide more detail on the databases and tools used, including their versions and specific parameters.

• Clarify the criteria for selecting the soft threshold, β.

• Provide more information on the PPI network analysis, including visualization parameters.

• Justify the use of specific machine learning algorithms.

• Provide more detail on the molecular docking analysis, including specific parameters used, such as the grid size and ligand flexibility.

• The section mentions that MD simulations were performed using Desmond in the Schrödinger Suite 2021-4. However, it would be helpful to provide more information on the specific protocol used, such as the simulation time, temperature, and pressure. I invite authors to clarify the MD simulation protocol.

The authors should revise the Methods section to include all necessary details regarding parameter selection, statistical thresholds, and multiple testing corrections to allow for full assessment and reproducibility.

Results

• Some figures and tables are not clearly described or referenced in the text. For example, Figure S1 is mentioned in the text, but its content is not clearly described. Clarify the presentation of results, including figures and tables.

• The KEGG and GO enrichment analyses are performed on the 39 therapeutic targets. However, it would be helpful to provide more context on why these specific targets were chosen and how they relate to the overall study.

• The machine learning algorithms identify four hub genes (EPHA2, HMOX1, MMP3, and PARP1). However, it would be helpful to discuss the implications of these findings and how they relate to the overall study. I recommend that the authors discuss the implications of the machine learning results.

• The molecular docking and MD simulations are performed on the four hub genes. Nevertheless, it would be supportive to provide more detail on the specific parameters used and the results obtained.

• The study has several limitations, including the reliance on bioinformatic analyses and the need for further functional validation.

• Clarify the implications of the study's findings and how they relate to the overall research question.

• The text employs various terms to denote the same concept, including "ILA-associated targets" and "ILA-CRC shared targets." Consistent terminology throughout the text would be beneficial.

• The text indicates the utilization of four GEO datasets; however, it lacks clarity regarding the rationale for selecting these particular datasets and the processing methods employed.

• The text states that WGCNA was employed to identify gene modules associated with CRC; however, it lacks clarity regarding the analysis methodology and the parameters utilized.

• The study is significantly dependent on bioinformatic analyses, which may introduce biases and limitations. It is essential to address these potential biases and limitations more explicitly.

• Phrases like "Collectively", "Notably", “In conclusion”, and "Furthermore" are used frequently throughout the manuscript, which can make it sound like it was written by an AI model.

Reviewer #2: Manuscript Title

Potential mechanism prediction of indole-3-lactic acid against colorectal cancer based on network pharmacology, machine learning and molecular docking

Overall Assessment

This manuscript presents a comprehensive in silico investigation of the potential mechanisms by which the gut microbiota–derived metabolite indole-3-lactic acid (ILA) may influence colorectal cancer (CRC). The authors integrate network pharmacology, transcriptomic analysis, machine learning–based feature selection, molecular docking, and molecular dynamics (MD) simulations. Given the increasing interest in gut microbiota–derived metabolites and their roles in cancer biology, this study provides a computational framework to identify potential molecular targets and pathways associated with ILA. Thus, it offers a useful hypothesis-generating resource for future experimental studies. However, several major issues related to interpretation, methodological transparency, and overstatement of conclusions should be addressed before the manuscript can be considered suitable for publication.

Major Comments

1. Overinterpretation of Computational Findings

Throughout the Abstract, Results, Discussion, and Conclusion, the manuscript frequently implies confirmed “therapeutic effects” of ILA against CRC, e.g.:

• “Our study systematically elucidated the potential therapeutic effect of ILA on CRC…” (Abstract)

• “ILA may exert its anti-CRC effects by targeting these proteins” (Results 3.8)

Given that the study is entirely computational, these statements overstate the strength of the evidence.

Recommendation:

The authors should revise the language throughout the manuscript to clearly frame the findings as predictive, hypothesis-generating, or putative mechanisms. Replace “therapeutic effect” with “predicted mechanism”, “putative targets”, or “hypothesized regulatory roles”. Claims of therapeutic efficacy should be avoided unless supported by experimental validation.

2. Target Prediction Strategy Requires Additional Justification

In Section 2.1 (Collection of ILA targets), predicted targets from multiple databases are combined without reporting confidence scores, prediction probabilities, or overlap frequency across databases.

Recommendation:

The authors should provide additional information on target confidence, such as:

• The number of databases supporting each target

• Probability or score thresholds (where available)

• A rationale for including low-confidence predictions

3. Machine Learning Methodology Lacks Transparency

In Section 2.7, the use of LASSO, Random Forest, and SVM-RFE is appropriate; however, important methodological details are missing:

• Handling of class imbalance between normal and CRC samples is not described.

• Feature scaling or normalization prior to SVM analysis is not reported.

• The Random Forest importance threshold (>0.3) is not justified.

Recommendation:

The authors should clarify preprocessing steps, provide justification for parameter choices, and include additional information to ensure reproducibility. Reporting cross-validation strategies and performance metrics in greater detail would strengthen this section.

4. Interpretation of Immune Infiltration Analysis

The immune infiltration analysis (Section 3.7) identifies correlations between hub gene expression and immune cell proportions. However, the discussion occasionally implies mechanistic regulation of immune cell behavior by ILA or hub genes.

Recommendation:

The authors should explicitly state that these findings are correlational. Causal language should be avoided, and conclusions regarding immune modulation should be clearly framed as speculative, since these claims were not validated in an in vitro or in vivo model.

5. Molecular Docking and Molecular Dynamics Analysis

The docking and MD simulations support binding plausibility between ILA and the identified hub proteins. However, the rationale for extending the MD simulation to 100 ns for only the ILA–HMOX1 complex, while limiting others to 10 ns, is not sufficiently explained.

Recommendation:

The authors should justify the selection of HMOX1 for extended simulation and discuss the limitations of short MD trajectories for the other complexes. Additional stability metrics or binding free energy calculations would further strengthen this section. Report RMSF, hydrogen bond occupancy, and binding free energy (MM-PBSA). Also include control ligands or known inhibitors of CRC for benchmarking.

Minor Comments

1. Data Availability Statement

The statement “All data are available on request from the authors” does not align with PLOS ONE data-sharing recommendations. Public access to analysis scripts (e.g., GitHub, Zenodo) or processed data should be provided where possible.

2. Redundancy in the Discussion

Several mechanistic explanations (e.g., AhR/Nrf2 signaling) are repeated across multiple subsections. The Discussion could be streamlined to improve clarity.

3. Language and Formatting

Minor grammatical errors and overly long sentences are present, particularly in the Discussion. Gene and protein nomenclature should be consistently formatted.

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy

Reviewer #1: No

Reviewer #2: Yes: Modinat Aina Abayomi

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

To ensure your figures meet our technical requirements, please review our figure guidelines: https://journals.plos.org/plosone/s/figures

You may also use PLOS’s free figure tool, NAAS, to help you prepare publication quality figures: https://journals.plos.org/plosone/s/figures#loc-tools-for-figure-preparation.

NAAS will assess whether your figures meet our technical requirements by comparing each figure against our figure specifications.

Attachment

Submitted filename: Reviewers comment.docx

pone.0344478.s002.docx (19.7KB, docx)
PLoS One. 2026 Mar 9;21(3):e0344478. doi: 10.1371/journal.pone.0344478.r002

Author response to Decision Letter 1


22 Jan 2026

Dear Editors:

Thank you for giving us the opportunity to submit a revised draft of our manuscript “Potential mechanism prediction of indole-3-lactic acid against colorectal cancer based on network pharmacology, machine learning and molecular docking” (Submission ID PONE-D-25-56178) for publication in PLOS One. We appreciate the time and effort that you and the reviewers dedicated to providing feedback on our manuscript and are grateful for the insightful comments and valuable improvements to our paper. We have read your comments carefully and have revised the manuscript accordingly; we hope the changes meet your expectations. The revised portions are highlighted in red in the manuscript. The main modifications and our responses to the reviewers’ comments are as follows:

Reviewer 1, Point 1

1.Comment: (Title: The title is written in a passive voice, which can make it sound less engaging and less clear. Consider rephrasing the title. The title is a bit long and should be rephrased to be more concise. Consider rephrasing the title to focus more on the research question or the main findings.)

Response: We sincerely appreciate your thorough review and valuable suggestions regarding our manuscript. Following your advice, we have rephrased the title to be more concise, active, and focused on the core research findings. We have revised the original title to: “Integrative network pharmacology and machine learning identify potential targets of indole-3-lactic acid in colorectal cancer”. We have updated the title on the Title Page and throughout the manuscript submission system.

Reviewer 1, Point 2

2.Comment: (Abstract: Rephrase the sentence "Our study systematically elucidated..." to be more concise and impactful. Consider adding a brief summary of the study's main findings.)

Response: We sincerely thank you for this insightful suggestion regarding the Abstract. Following your suggestion, we have carefully revised the Abstract and rewritten the concluding sentence to be more concise and impactful. The revised sentence reads as follows: “In conclusion, these results highlight EPHA2, HMOX1, MMP3, and PARP1 as candidate targets and suggest that ILA may influence CRC-related signaling, metabolic programs, and immune contexture, providing a theoretical foundation for developing gut microbiota-derived metabolites as novel anticancer strategies”. We have updated the Abstract in the revised manuscript.

Reviewer 1, Point 3

3.Comment: (Introduction: Add more specific details about the current state of CRC research and how ILA fits into this landscape.)

Response: We sincerely thank you for this insightful suggestion regarding the Introduction. Following your suggestion, we have rewritten the Introduction, with the main revisions focusing on two aspects: (1) summarizing the current therapeutic and research challenges in CRC, including molecular heterogeneity and treatment resistance; and (2) better positioning gut microbiota–derived metabolites as an emerging research direction relevant to tumor metabolism and the immune microenvironment. We further clarified how ILA, as a tryptophan-derived microbial metabolite with reported anti-inflammatory, antioxidant and immunomodulatory activities, conceptually fits this framework and why a systematic target- and pathway-level investigation is needed. These revisions have been incorporated in the Introduction section. Please see the specific modifications in the revised manuscript.

Reviewer 1, Point 4

4.Comment: (Materials and Methods: Provide more detail on the databases and tools used, including their versions and specific parameters.)

Response: Thank you very much for pointing this out, and we think this is an excellent suggestion. In accordance with your suggestion, we have thoroughly updated the "Materials and Methods" section to ensure full transparency and reproducibility. The specific revisions are as follows: (1) Database access: We have added the specific URLs and last access dates for all online databases. (2) Software versions: Beyond the main R software version (v4.4.2), we have now specified the exact version numbers for all critical R packages used in the analysis. (3) Parameter specification: We have elaborated on the specific parameters for the computational models used in this study. Specifically, we provided the detailed workflow for the PPI network analysis (Section 2.5); clarified the rationale for the selected machine learning algorithms alongside their hyperparameter tuning settings (Section 2.7); and supplemented detailed information regarding the molecular docking and molecular dynamics simulations (Sections 2.10 and 2.11). These modifications have significantly improved the clarity and readability of our manuscript. Please see the specific modifications in the revised manuscript.

Reviewer 1, Point 5

5.Comment: (Materials and Methods: Clarify the criteria for selecting the soft threshold, β.)

Response: Thank you for pointing this out. We apologize for not making it clear. We have added a detailed description in Section 2.4 of the criteria used to select the soft-thresholding power (β) in the WGCNA analysis, as follows: “We determined the optimal soft threshold power (β) using the pickSoftThreshold function in the WGCNA R package. The criterion for selection was to achieve a scale-free topology fit index (R²) of at least 0.85 while maintaining a reasonable mean connectivity. Consequently, a power of β=X was selected based on the scale independence and mean connectivity plots.”
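The selection rule quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the fit index and mean connectivity have already been computed for each candidate power (for example, from the table that pickSoftThreshold reports), and it takes the smallest power that reaches the 0.85 cutoff. The names `PowerFit` and `pick_soft_power` and all numbers below are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class PowerFit(NamedTuple):
    power: int                # candidate soft-thresholding power (beta)
    r2: float                 # scale-free topology fit index
    mean_connectivity: float  # mean connectivity at this power


def pick_soft_power(fits: Sequence[PowerFit], r2_cutoff: float = 0.85) -> Optional[int]:
    """Return the smallest power whose fit index reaches the cutoff, else None."""
    for fit in sorted(fits, key=lambda f: f.power):
        if fit.r2 >= r2_cutoff:
            return fit.power
    return None


# Made-up values for illustration only; they are not the study's data.
example_fits = [
    PowerFit(1, 0.12, 950.0),
    PowerFit(2, 0.41, 410.0),
    PowerFit(3, 0.68, 190.0),
    PowerFit(4, 0.81, 95.0),
    PowerFit(5, 0.87, 52.0),
    PowerFit(6, 0.90, 30.0),
]

chosen = pick_soft_power(example_fits)
print(chosen)
```

Because mean connectivity generally falls as the power rises, taking the smallest qualifying power is one common way to keep connectivity "reasonable"; the mean connectivity column is carried along so that the choice can still be checked by eye.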

Reviewer 1, Point 6

6.Comment: (Materials and Methods: Provide more information on the PPI network analysis, including visualization parameters.)

Response: Thank you for pointing this out. We apologize for not making it clear. We have added a detailed description in Section 2.5 of the parameters and visualization workflow used for the PPI analysis, as follows: “The PPI data were downloaded from the STRING database in TSV format and imported into Cytoscape (version 3.9.1) for network construction and visualization. Network topological properties were analyzed using the ‘Analyze Network’ tool with the network treated as undirected (‘Treat network as undirected’). Node centrality measures were then calculated using the CytoNCA plugin (version 2.1.6), with degree centrality used as the primary metric. Isolated nodes without interactions (degree = 0) were excluded from the network. Finally, the ‘Style’ panel was used to map node attributes for visualization: node size and color were scaled according to degree centrality values, with larger and darker nodes indicating higher connectivity.”
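As a rough illustration of the degree-based step described above, the sketch below counts undirected neighbours from an edge table in TSV form and drops nodes with degree 0. It is not the Cytoscape or CytoNCA workflow itself. The column names `node1` and `node2`, the function name `degree_centrality`, and the toy protein labels are assumptions made for this example.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable


def degree_centrality(tsv_text: str, all_nodes: Iterable[str]) -> Dict[str, int]:
    """Count undirected neighbours per node and exclude isolated nodes."""
    neighbours = defaultdict(set)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        a, b = row["node1"], row["node2"]
        if a == b:
            continue  # ignore self-loops
        # Treat the network as undirected: record the edge in both directions.
        neighbours[a].add(b)
        neighbours[b].add(a)
    degrees = {node: len(neighbours[node]) for node in all_nodes}
    # Isolated nodes (degree = 0) are excluded from the network.
    return {node: deg for node, deg in degrees.items() if deg > 0}


# Toy edge list with placeholder labels; these are not real interactions.
toy_tsv = "node1\tnode2\nP1\tP2\nP1\tP3\nP2\tP3\nP1\tP4\n"
toy_nodes = ["P1", "P2", "P3", "P4", "P5"]

result = degree_centrality(toy_tsv, toy_nodes)
print(result)
```

In this toy network P5 has no edges and is dropped, while P1 has the highest degree and would be drawn as the largest, darkest node under the mapping described in the response.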

Reviewer 1, Point 7

7.Comment: (Materials and Methods: Justify the use of specific machine learning algorithms.)

Response: Thank you very much for your comments. We have added a detailed justification in Section 2.7 for using the three machine-learning algorithms (LASSO, RF, and SVM-RFE), as follows: “Specifically, LASSO employs L1 regularization to mitigate multicollinearity and induce sparsity in high-dimensional datasets; RF was selected to capture complex non-linear interactions; and SVM-RFE identifies an optimal feature subset to maximize classification performance. These three algorithms are complementary, and their combined use reduces the bias inherent to any single model, thereby improving the robustness of the screening results.”
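The consensus idea behind combining the three algorithms can be sketched as a set intersection: a gene is retained only if every selector picked it. This is a minimal illustration under that assumption; the function name `consensus_features` and the gene labels are placeholders and do not reflect the study's actual feature lists.

```python
from functools import reduce
from typing import Iterable, List


def consensus_features(*selections: Iterable[str]) -> List[str]:
    """Return the features chosen by every selector, sorted for stable output."""
    sets = [set(selection) for selection in selections]
    if not sets:
        return []
    return sorted(reduce(set.intersection, sets))


# Placeholder outputs of three feature selectors (illustrative labels only).
lasso_genes = ["G1", "G2", "G3", "G5"]
rf_genes = ["G2", "G3", "G4", "G5"]
svm_rfe_genes = ["G2", "G5", "G6"]

hub_candidates = consensus_features(lasso_genes, rf_genes, svm_rfe_genes)
print(hub_candidates)
```

Requiring agreement across all three selectors is deliberately conservative: a feature favoured by only one model's inductive bias is filtered out, at the cost of possibly discarding true signals that a single method detects.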

Reviewer 1, Point 8

8.Comment: (Materials and Methods: Provide more detail on the molecular docking analysis, including specific parameters used, such as the grid size and ligand flexibility.)

Response: We sincerely appreciate the valuable comments. In response to your suggestion, we have updated Section 2.10 with the following detailed parameters: (1) Grid generation: The receptor grids were generated using the Receptor Grid Generation panel in Glide. The center of the grid box was defined by the centroid of the co-crystallized ligand within the active site of each target protein. The bounding box was sized to sufficiently enclose the active site and accommodate the ligand. The specific grid dimensions for each target are provided in Table 2. (2) Ligand flexibility: The Glide Standard Precision mode was employed, which internally generates multiple conformations for the ligand. The sampling included the exploration of ring conformations and nitrogen inversions. To account for protein flexibility implicitly and allow for minor steric clashes, the van der Waals radii of the ligand atoms were scaled by a factor of 0.80 with a partial charge cutoff of 0.15, while the receptor atoms were scaled by 1.0. (3) The resulting poses were subjected to post-docking minimization, and the best-scored pose (lowest GlideScore) was selected for further analysis. These revisions have been incorporated into the manuscript Section 2.10.
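Two of the settings described above lend themselves to a small numerical sketch: placing the grid center at the centroid of the co-crystallized ligand, and softening ligand van der Waals radii by a factor of 0.80 with a partial charge cutoff of 0.15. The code below is not Glide and does not call any Schrödinger API. It assumes the cutoff selects atoms whose absolute partial charge is at or below 0.15, and the `Atom` class, function names, and coordinates are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Atom:
    x: float
    y: float
    z: float
    vdw_radius: float      # angstroms
    partial_charge: float  # elementary charge units


def centroid(atoms: List[Atom]) -> Tuple[float, float, float]:
    """Unweighted geometric center of the atoms, used here as the grid center."""
    n = len(atoms)
    return (
        sum(a.x for a in atoms) / n,
        sum(a.y for a in atoms) / n,
        sum(a.z for a in atoms) / n,
    )


def scale_nonpolar_radii(
    atoms: List[Atom], factor: float = 0.80, charge_cutoff: float = 0.15
) -> List[Atom]:
    """Scale radii only for atoms with |partial charge| <= cutoff (assumed rule)."""
    return [
        replace(a, vdw_radius=a.vdw_radius * factor)
        if abs(a.partial_charge) <= charge_cutoff
        else a
        for a in atoms
    ]


# Toy ligand with made-up coordinates, radii, and charges.
ligand = [
    Atom(0.0, 0.0, 0.0, 1.70, 0.05),
    Atom(2.0, 0.0, 0.0, 1.55, -0.40),
    Atom(1.0, 3.0, 0.0, 1.70, 0.10),
]

grid_center = centroid(ligand)
softened = scale_nonpolar_radii(ligand)
print(grid_center)
print([round(a.vdw_radius, 2) for a in softened])
```

Shrinking the radii of weakly charged atoms lets the ligand approach the rigid receptor slightly more closely than hard-sphere contacts would allow, which is the implicit allowance for protein flexibility mentioned in the response.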

Reviewer 1, Point 9

9.Comment: (Materials and Methods: The section mentions that MD simulations were performed using Desmond in the Schrödinger Suite 2021-4. However, it would be helpful to provide more information on the specific protocol used, such as the simulation time, temperature, and pressure. I invite authors to clarify the MD simulation protocol.)

Response: Thank you very much for pointing this out, and we agree that a clearer and more detailed description of the MD protocol is essential for reproducibility. We have therefore revised Section 2.11 to explicitly report the key simulation settings, including the simulation time for each system, the ensemble, temperature, and pressure, as well as additional protocol details (system solvation, equilibration procedure, time step and constraints, thermostat, and trajectory recording frequency). Briefly, all simulations were performed in Desmond (Schrödinger Suite 2021-4) using the OPLS4 force field, in a TIP3P water box with 0.15 M NaCl. After minimization and equilibration, four independent 10 ns comparative simulations were conducted for the four protein-ligand complexes, and a 100 ns production run was performed for the ILA-HMOX1 complex under the NPT ensemble at 300 K and 1 atm. These details have been added to the revised manuscript Section 2.11.

Reviewer 1, Point 10

10.Comment: (Materials and Methods: The authors should revise the Methods section to include all necessary details regarding parameter selection, statistical thresholds, and multiple testing corrections to allow for full assessment and reproducibility.)

Response: We sincerely appreciate the valuable comments. Accordingly, we have comprehensively revised the Materials and Methods section to explicitly describe the criteria and parameter settings used at each analytical step. We have incorporated these revisions into Section 2 of the revised manuscript.

Reviewer 1, Point 11

11.Comment: (Results: Some figures and tables are not clearly described or referenced in the text. For example, Figure S1 is mentioned in the text, but its content is not clearly described. Clarify the presentation of results, including figures and tables.)

Response: Thank you for pointing this out. We apologize for not making it clear. In accordance with your suggestion, we have carefully re-examined the entire Results section. (1) Regarding Figure S1: We have added a detailed description in the main text to explain the changes in PCA plots before and after batch effect correction, demonstrating the reliability of our data integration. (2) General revision: We have systematically checked the citations and descriptions for all figures (Figures 1-11; Figures S1-S2) and tables (Tables 1-2). We revised the manuscript to ensure that every reference to a figure or table is accompanied by a clear interpretation of the specific data trends and their biological significance. Please see the specific modifications in the revised manuscript.

Reviewer 1, Point 12

12.Comment: (The KEGG and GO enrichment analyses are performed on the 39 therapeutic targets. However, it would be helpful to provide more context on why these specific targets were chosen and how they relate to the overall study.)

Response: We sincerely appreciate the valuable comments.

(1) The 39 targets used for GO and KEGG enrichment analyses were not arbitrarily selected but were identified through a stepwise integrative strategy designed to link ILA with CRC in a biologically meaningful manner. Specifically, these 39 targets represent the intersection between 1) putative ILA-related targets predicted using multiple network pharmacology databases and 2) CRC key genes defined by integrating disease-related databases, differential expression analysis, and WGCNA. Therefore, these targets simultaneously reflect the potential molecular actions of ILA and the core pathological features of CRC.

(2) Performing functional enrichment analyses on this refined target set allowed us to focus on biological processes and signaling pathways that are most likely involved in the therapeutic effects of ILA against CRC, rather than generating nonspecific results from broader gene sets. Importantly, the enriched GO terms and KEGG pathways provide mechanistic context for the subsequent identification of hub genes, immune infiltration analysis, and molecular docking.

(3) To clarify this rationale, we have revised the Results section (Section 3.3) and added explanatory text describing the selection criteria and biological significance of the 39 therapeutic targets, as well as their role in the overall study framework.

Reviewer 1, Point 13

13.Comment: (The machine learning algorithms identify four hub genes (EPHA2, HMOX1, MMP3, and PARP1). However, it would be helpful to discuss the implications of these findings and how they relate to the overall study. I recommend authors to discuss the implications of the machine learning results.)

Response: Thank you very much for your comments. In this study, machine learning algorithms were applied not merely as a feature-selection tool but as an integrative strategy to identify the most robust and biologically relevant hub genes among the potential therapeutic targets of ILA against CRC. By combining LASSO regression, RF, and SVM-RFE, we aimed to reduce model-specific bias and improve the reliability of hub gene selection.

(1) The four hub genes identified were consistently supported by multiple independent analyses, including PPI network topology, differential expression and diagnostic performance, gene set enrichment analysis, immune infiltration correlations, and molecular docking and molecular dynamics simulations. This convergence of evidence indicates that these genes represent key molecular nodes linking ILA-associated targets to CRC-related pathological processes, rather than being isolated machine learning outputs.

(2) From a biological perspective, these hub genes are involved in critical processes relevant to CRC progression. Their identification therefore provides mechanistic insight into how ILA may exert anti-CRC effects through coordinated regulation of inflammation, oxidative stress, metabolism, and tumor-host interactions.

(3) We have added a comprehensive discussion at the end of Section 4.1 of the Discussion to improve the interpretability and biological relevance of the machine learning analyses.

Reviewer 1, Point 14

14.Comment: (The molecular docking and MD simulations are performed on the four hub genes. Nevertheless, it would be supportive to provide more detail on the specific parameters used and the results obtained.)

Response: We sincerely appreciate the valuable comments. In accordance with your recommendation, we have made substantial additions to the manuscript in two areas: (1) We have supplemented the text with specific operational parameters for molecular docking and molecular dynamics simulations in the Methods section (Sections 2.10 and 2.11). (2) We have provided a more granular and quantitative description of the simulation outcomes in the Results section (Sections 3.8 and 3.9). We believe these added details significantly enhance the transparency and robustness of our computational findings.

Reviewer 1, Point 15

15.Comment: (The study has several limitations, including the reliance on bioinformatic analyses and the need for further functional validation.)

Response: We sincerely appreciat

Attachment

Submitted filename: Response to Reviewers.doc

pone.0344478.s004.doc (133KB, doc)

Decision Letter 1

Kayode Raheem

23 Feb 2026

Integrative network pharmacology and machine learning identify potential targets of indole-3-lactic acid in colorectal cancer

PONE-D-25-56178R1

Dear Dr. Feng,

I am pleased to inform you that your revised manuscript, “Integrative network pharmacology and machine learning identify potential targets of indole-3-lactic acid in colorectal cancer” (Manuscript ID: PONE-D-25-56178R1), is accepted for publication in PLOS ONE, contingent upon completion of any outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. For questions related to billing, please contact billing support.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Kayode Raheem

Guest Editor

PLOS One

**********

Acceptance letter

Kayode Raheem

PONE-D-25-56178R1

PLOS One

Dear Dr. Feng,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS One. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Kayode Raheem

Guest Editor

PLOS One

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Data.
    S1 File. The summary of targets of ILA.
    S2 File. The summary of targets of CRC.
    S3 File. KEGG and GO analysis.
    S4 File. Feature importance rankings of hub genes.
    S1 Fig. The PCA plots before and after batch correction.
    S2 Fig. KEGG and GO analysis of 39 common targets.

    (ZIP)

    pone.0344478.s001.zip (3.1MB, zip)
    Attachment

    Submitted filename: Reviewers comment.docx

    pone.0344478.s002.docx (19.7KB, docx)

    Data Availability Statement

    Some relevant data are within the manuscript and its Supporting Information files. This research also involves data from the Gene Expression Omnibus database (GEO; http://www.ncbi.nih.gov/geo/), which belongs to the public domain. The accession numbers for the datasets used in this study are GSE44076, GSE74602, GSE32323, and GSE113513.


    Articles from PLOS One are provided here courtesy of PLOS
