Comparative study of the mechanism of natural compounds with similar structures using docking and transcriptome data for improving in silico herbal medicine experimentations

Musun Park; Su-Jin Baek; Sang-Min Park; Jin-Mu Yi; Seongwon Cha

doi:10.1093/bib/bbad344

Brief Bioinform. 2023 Oct 5;24(6):bbad344. doi: 10.1093/bib/bbad344


PMCID: PMC10555731 PMID: 37798251

Abstract

Natural products have successfully treated several diseases through multi-component, multi-target mechanisms. However, their precise mechanisms of action (MOA) have not been identified. Systems pharmacology methods have been used to address this challenge, but they are limited in that they cannot identify the shared mechanisms of structurally similar components. In this study, physicochemical descriptors, molecular docking analysis and RNA-seq analysis were used to compare the MOA of similar compounds and to confirm the changes observed when similar compounds were used in combination. These analyses confirmed that compounds with similar structures share similar MOA. Based on the results, we propose an advanced method for in silico experiments in herbal medicine research. Our study makes three novel contributions. First, an advanced network pharmacology research method was suggested by partially presenting a solution to the difficulty in identifying multi-component mechanisms. Second, a new natural product analysis method was proposed using large-scale molecular docking analysis. Finally, various biological data and analysis methods were combined, including in silico systems pharmacology, docking analysis and drug response RNA-seq. The results of this study are meaningful in that they suggest an analysis strategy that can improve existing systems pharmacology methods by showing that natural product–derived compounds with the same scaffold have the same mechanism.

Keywords: natural compounds, herbal medicine, similar structure analysis, molecular docking, transcriptome

INTRODUCTION

Natural products have long been used to treat patients in Traditional Asian, Ayurvedic and Kampo medicine. Therefore, it can be said that natural products are safe and effective [1]. However, the exact mechanism underlying the therapeutic efficacy of natural compounds has not yet been identified, which is an obstacle to the commercialization of natural product–based drugs [2]. If this issue is not overcome soon, it will not only result in the loss of trust in treatments using natural products but will also hinder the standardization and industrialization of natural products. Therefore, it is important to understand the mechanisms of action of these natural products.

Most mechanism-of-action studies have been conducted as single-target studies based on the magic bullet paradigm [3, 4]. However, because natural products contain multiple compounds, studies that consider only a single compound have limitations in revealing the mechanisms of action of natural products. To overcome this limitation, network-based systems pharmacology studies are being conducted to verify the mechanisms of action of natural products based on multi-component, multi-target interactions [5–7]. Although herbal medicine research using systems pharmacology has been successfully conducted on platforms such as TCMSP [8] and BATMAN-TCM [9], it still has the limitation of showing interaction results without considering the structural similarity among compounds.

Active compounds with the same scaffold may have similar mechanisms of action because they bind to the same site on the target. For example, caffeine has the same graph framework as adenosine and exerts competitive antagonistic effects on adenosine receptors [9, 10]. In addition, based on the finding that compounds with the same molecular scaffold have the same mechanism of action (MOA) [11, 12], research on the development of new drugs using fragment-based drug design, such as scaffold hopping, has also been conducted [13–15]. These studies show that comparative studies on similar compounds are necessary to verify the MOA of natural products containing many similar active compounds.

Natural products contain several compounds with similar structures. These include compounds such as peptides, polyketides and terpenes. Terpenes are phytochemicals with various physiological activities [16] and exist in various forms in natural products through structural diversification by cytochrome P450 enzymes (CYP) [17]. For example, oleanolic acid (OA), a pentacyclic triterpene present in Paeonia lactiflora, is biologically transformed by CYP into compounds such as hederagenin (HG), gypsogenic acid and medicagenic acid [17]. However, their core structures are maintained because the biological transformation processes change the functional groups but not the molecular scaffold. This suggests that, although natural products contain various compounds owing to the biotransformation process, the transformed components have the same scaffold and MOA [9–15].

One analytical method for verifying this hypothesis is molecular docking. Molecular docking is an in silico method that determines the interaction potential by calculating the binding affinity between a specific protein and a specific compound [18]. When compounds with similar structures interact at the same location on a protein, they are likely to share the same biological mechanisms. Therefore, determining whether similar compounds dock to the same protein and whether their docking binding sites match is necessary for drug mechanism studies. In particular, because natural products contain many compounds with similar structures that interact with multiple targets, large-scale molecular docking analysis is important for studying the mechanisms of action of natural products [12, 17, 19]. However, few studies have elucidated the mechanisms of action of these natural products using large-scale molecular docking analyses. One previous study performed large-scale molecular docking analysis using natural compounds [20]. In that study, docking analysis between natural product–derived compounds and ~150 proteins was conducted to find possible protein targets for these compounds. However, the study did not use docking analysis to confirm the MOA of the compounds. A docking analysis on a larger scale than that of the previous study, performed to determine the effect of a compound on the entire human body, could be helpful for further studies.

Although large-scale molecular docking analysis succeeds in predicting whether natural products with similar structures have similar mechanisms, it remains unclear whether they have the same mechanisms. In addition, it is necessary to predict the results of drug combinations to reveal the MOA of natural products in which multiple compounds are combined; however, it is difficult to predict the mechanism of drug combinations with similar structures using conventional network pharmacology or docking analysis. Therefore, we attempted to resolve these issues using drug response transcriptome analysis. Transcriptome analysis using RNA-seq reveals the RNA expression level in cells using next-generation sequencing analysis and can identify intracellular transcripts that change in response to specific stimuli [20]. Because the expression of transcripts induced by drug treatment reflects a wide range of target changes, it provides abundant information on drug action mechanisms [21]. Using the characteristics of these drug response transcripts, it is possible to identify the MOA of a combination of natural compounds with similar structures.

While researching the mechanisms of natural products, it is important to consider the similarity of compound structures within natural products; however, few studies have analyzed and compared natural product compounds with similar structures. Therefore, predicting the biological mechanisms of compounds with similar structures and identifying the drug mechanisms of similar compound combinations are novel. This study proposes an advanced method for in silico experiments for herbal medicine research. The method involves comparing the mechanisms of action of natural compounds with similar structures using biological data.

METHODS

Overview of the study

This study was conducted in the following manner: Physicochemical descriptors were calculated based on the chemical structures of OA, HG and gallic acid (GA), and similarity was confirmed by calculating the distance between the descriptors. Next, the mechanisms of action of OA, HG and GA were compared and confirmed through pharmacological analyses using in silico–based systems. In addition, the proteins interacting with the three compounds were verified through large-scale molecular docking analysis using the druggable proteome. Finally, it was confirmed through the production and analysis of drug response transcripts that the MOA of OA and HG was similar and consistent with the combination of OA and HG (Figure 1).

Figure 1. Framework for bioinformatics analysis of structurally similar compounds. The study compared the MOA of natural product compounds with similar structures using biological data and examined the MOA of a combination of natural product components with similar structures. OA and HG were selected as natural compounds with similar structures, and GA was selected as a control. To compare the mechanisms of action of OA, HG and GA, chemical property comparison, systems pharmacology analysis and molecular docking analysis were performed. Then, RNA-seq analysis was performed to confirm the MOA of natural compounds with similar structures and the MOA of combinations of natural compounds with similar structures. This study confirmed that natural product compounds with similar structures have the same mechanism and that they retain the same mechanism even when they are mixed.

Comparative analysis of chemical properties of OA, HG and GA

Molecular feature collection and molecular descriptor calculation method

The physical properties, CID and SMILES string information of OA, HG and GA used in this study were collected from the PubChem database, and the chemical ontology information was collected from the ChEBI database [22]. Molecular descriptors were calculated using the Python Mordred library [23]. Among the 1826 molecular descriptors provided by Mordred, 396 descriptors that could not be calculated and 314 descriptors with a value of zero for all three compounds were excluded from the analysis. The 1116 molecular descriptors used in the similarity analysis are presented in Supplementary Material 1.
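The descriptor filtering described above can be sketched in plain Python. This is a minimal illustration, not the authors' script: the input format (a dictionary of descriptor values, with `None` or NaN standing in for values that could not be calculated) and the rule that a descriptor is dropped if it fails for any one compound are assumptions, and the toy numbers are made up.

```python
import math

def filter_descriptors(table):
    """Keep descriptors that are numeric for every compound and non-zero for at least one.

    `table` maps a descriptor name to a list of values, one per compound.
    Values that could not be calculated are represented as None or NaN
    (an assumed input format, not Mordred's own error objects).
    """
    kept = {}
    for name, values in table.items():
        # Drop descriptors with any value that could not be calculated.
        if any(v is None or (isinstance(v, float) and math.isnan(v)) for v in values):
            continue
        # Drop descriptors that are zero for all compounds.
        if all(v == 0 for v in values):
            continue
        kept[name] = values
    return kept

# Toy input for three compounds (OA, HG, GA); descriptor names and values are hypothetical.
toy = {
    "desc_a": [456.7, 472.7, 170.12],
    "desc_b": [0, 0, 0],
    "desc_c": [1.2, None, 0.4],
    "desc_d": [0, 0, 3],
}
filtered = filter_descriptors(toy)
print(sorted(filtered))

# Consistency check on the counts reported in the text.
remaining = 1826 - 396 - 314
print(remaining)
```

The reported counts are internally consistent: removing 396 and 314 descriptors from 1826 leaves the 1116 descriptors used in the similarity analysis.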

Distance-based molecular similarity measurement method

The similarity measures for OA, HG and GA were calculated based on distance measures. The similarity distance measure used the 1116 molecular descriptors computed with the Mordred library. Compounds were paired for distance calculation, and the Euclidean (1), cosine (2) and Tanimoto (3) distances of the paired compounds were calculated using the Python numpy library [24].

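$$d_{\mathrm{Euclidean}}(A, B) = \sqrt{\sum_{i=1}^{n} (A_i - B_i)^2} \tag{1}$$

$$d_{\mathrm{cosine}}(A, B) = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}} \tag{2}$$

$$d_{\mathrm{Tanimoto}}(A, B) = \frac{\sum_{i=1}^{n} A_i B_i}{\sum_{i=1}^{n} A_i^2 + \sum_{i=1}^{n} B_i^2 - \sum_{i=1}^{n} A_i B_i} \tag{3}$$

Here, A and B are the molecular descriptor vectors of the paired compounds, and n = 1116 is the number of descriptors.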
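The three measures can be sketched in plain Python as follows. This is a hedged illustration rather than the authors' numpy code: the cosine and Tanimoto measures are written in the form that approaches 1 for near-identical vectors, which is an assumption based on the values near 1 reported for OA and HG in Table 2. The function names and toy vectors are hypothetical.

```python
import math

def _dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    """Straight-line distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_measure(a, b):
    """Cosine of the angle between the vectors (1 means same direction)."""
    return _dot(a, b) / (math.sqrt(_dot(a, a)) * math.sqrt(_dot(b, b)))

def tanimoto_measure(a, b):
    """Tanimoto coefficient for real-valued vectors (1 means identical)."""
    ab = _dot(a, b)
    return ab / (_dot(a, a) + _dot(b, b) - ab)

# Toy descriptor vectors; b is a scaled copy of a.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]
print(euclidean_distance(a, b), cosine_measure(a, b), tanimoto_measure(a, b))
```

Because the cosine measure ignores vector length while the Euclidean and Tanimoto measures do not, the scaled pair above is maximally similar by cosine but not by the other two, which is one reason for reporting all three.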

Comparative analysis of OA, HG and GA based on systems pharmacology platform

Selection of druggable target using systems pharmacology platform

In the systems pharmacology analysis, protein targets interacting with OA, HG and GA were selected using the BATMAN-TCM platform. On the BATMAN-TCM platform, the possibility of drug–target interaction (DTI) was calculated and expressed as a score; among them, targets with a DTI score of 10 points and above were selected as druggable targets.

BATMAN-TCM is a platform for computing drug–target interactions [26]. It calculates a drug similarity score and a target similarity score using known DTI information and uses the product of the two scores. Drug similarity is calculated by considering (i) fingerprint-based chemical structure, (ii) functional group–based chemical structure, (iii) side effects, (iv) Anatomical Therapeutic Chemical (ATC) classification, (v) drug-induced gene expression and (vi) text mining scores. Target similarity is calculated based on (i) protein sequence, (ii) proximity in protein interaction networks and (iii) Gene Ontology (GO) functional annotations. The predictions of BATMAN-TCM are suitable for use in studies because, in leave-one-interaction-out cross-validation against known interaction data, the ROC AUC was as high as 0.9663.

Network configuration using druggable targets

The druggable targets of each compound were assembled into a compound–target-pathway network. The network nodes consisted of compounds (OA, HG and GA), targets and effective pathways. The edges of the network represent (i) the connectivity of the compound that interacts with the targets and (ii) the connectivity of the target and the pathways containing the target. The target and pathway nodes used in the network were selected using over-representation analysis (ORA) based on the gene sets of the KEGG pathway database [25] provided by the BATMAN-TCM database. The effective pathways for each compound were determined based on an adjusted P-value <0.05, calculated using the Benjamini–Hochberg (BH) procedure [26]. Along with the effective pathways, druggable targets belonging to effective pathways were selected as network nodes. Network visualization was performed using Cytoscape (v3.9.1) [27].
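For readers unfamiliar with the BH procedure, the adjustment can be sketched as below. This is a generic textbook implementation, not code from BATMAN-TCM; the function name and the toy P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted P-values in the same order as the input."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy pathway P-values (made up for illustration).
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
effective = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, effective)
```

Pathways whose adjusted value falls below 0.05 would be treated as effective pathways under the threshold stated above.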

ORA based on systems pharmacology platform

The druggable targets of OA, HG and GA were used for ORA. ORA was performed on the EnrichR platform [28] using the gene sets of the KEGG pathway [25], GO biological process [29] and OMIM disease [30] databases. The results were ranked by the combined score provided by the EnrichR platform, and the top 10 pathways and diseases were selected as the pharmacological mechanisms of each compound.

EnrichR is a platform that can perform ORA and check the results against multiple bioinformatic databases simultaneously [30, 33]. ORA on the EnrichR platform proceeds almost identically to the conventional analysis method [34]. First, as in conventional ORA, a hypergeometric test is used to calculate the probability that the genes of interest appear in a gene set more often than expected by chance when compared with genes selected at random from the entire gene set. Next, a dummy set of randomly extracted genes, equal in size to the gene set of interest, is generated, and a z-score is calculated from the hypergeometric test using the dummy set. Finally, the product of log(p), where p is the P-value from the hypergeometric test, and the z-score is taken as the combined score. EnrichR was used in this study because it is reported to improve the performance of existing ORA by 1.6 times.
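The two ingredients of the combined score can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not EnrichR's code: the upper-tail hypergeometric P-value is the standard ORA statistic, the natural logarithm is assumed for log(p), the z-score is taken as a given input rather than derived from dummy gene sets, and all numbers are made up.

```python
import math

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N genes, K of which belong to the pathway."""
    total = math.comb(N, n)
    upper = min(n, K)
    hits = sum(math.comb(K, i) * math.comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def combined_score(p_value, z_score):
    """Product of log(p) and the z-score (natural log assumed)."""
    return math.log(p_value) * z_score

# Toy example: 10 genes in the universe, 4 in the pathway, 3 genes of interest, 2 overlap.
p = hypergeom_upper_tail(k=2, n=3, K=4, N=10)
score = combined_score(0.01, -2.0)
print(p, score)
```

Since log(p) is negative for p < 1, a negative z-score yields a positive combined score, so larger scores indicate stronger enrichment.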

Molecular docking analysis

Molecular docking analysis method

The compounds used for docking analysis were converted to the pdbqt format using OpenBabel software [31] after downloading the 3d_sdf information from the PubChem database [32] (PubChem CID: OA-10494, HG-73299 and GA-370). The protein structures used for analysis were collected from the AlphaFold2 (AF2) [32] and Human Protein Atlas (HPA) databases [33]. Among the human-derived proteomes provided by the AF2, 812 proteomes selected by the HPA as druggable proteomes were used for analysis. After the selected druggable proteomes were converted to the dockable pdbqt format using the OpenBabel Python API, molecular docking analysis was performed with the converted compounds. Molecular docking analysis was performed using Python and the AutoDock Vina software [33], and the docking analysis parameter exhaustiveness was set to a maximum value of 100. Binding sites on the proteins were searched using the maximum coverage provided by AutoDock Vina. The protein structure provided by AF2 is aligned with the coordinates of the structure center at (x,y,z) = (0,0,0). To identify the protein binding site, docking analysis was performed by setting the grid box at the center coordinates and setting the grid value to 126 for each of the x-axis, y-axis and z-axis, which is the maximum value provided by AutoDock Vina.
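The blind-docking setup described above can be written down as an AutoDock Vina configuration file. The helper below only builds the configuration text; it is a sketch under assumptions, not the authors' pipeline. The file names are hypothetical, and mapping the grid value of 126 directly onto Vina's `size_x`, `size_y` and `size_z` options is an assumption, since the text does not state the unit.

```python
def vina_config(receptor, ligand, out,
                center=(0.0, 0.0, 0.0), size=(126, 126, 126), exhaustiveness=100):
    """Build the text of an AutoDock Vina config file for a whole-protein search box."""
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"out = {out}",
        # Box centered on the structure center reported for AF2 models.
        f"center_x = {center[0]}",
        f"center_y = {center[1]}",
        f"center_z = {center[2]}",
        # Box edge lengths (assumed to correspond to the grid value in the text).
        f"size_x = {size[0]}",
        f"size_y = {size[1]}",
        f"size_z = {size[2]}",
        f"exhaustiveness = {exhaustiveness}",
    ]
    return "\n".join(lines) + "\n"

# Hypothetical file names for one protein-compound pair.
cfg = vina_config("ABL1_af2.pdbqt", "oleanolic_acid.pdbqt", "ABL1_OA_out.pdbqt")
print(cfg)
```

Writing one such file per protein and compound pair would make the 812-protein screen a simple loop over configuration files passed to Vina with its `--config` option.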

Pathway analysis and visualization using molecular docking analysis results

The druggable proteomes of OA, HG and GA selected through docking analysis were used for ORA-based pathway analysis using the EnrichR platform. For each compound, the 50 proteins with the lowest (most negative) binding affinity values, that is, the strongest predicted binding, were selected as docking-based effective proteins (DEPs), and ORA was performed. As in the systems pharmacology platform method, ORA was performed using gene sets from the KEGG pathway, GO biological process and OMIM disease databases. The top 10 pathways and diseases with the highest combined scores were selected as the DEP-based pharmacological mechanisms of each compound.
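Selecting DEPs amounts to ranking proteins by their docking score and keeping the first 50. A minimal sketch follows, assuming each protein has a single best score in kcal/mol where more negative means stronger predicted binding. The protein names are taken from the text, but the scores are made up.

```python
def top_binders(affinities, n=50):
    """Return the n proteins with the lowest (most negative) docking scores."""
    ranked = sorted(affinities.items(), key=lambda item: item[1])
    return [protein for protein, _ in ranked[:n]]

# Hypothetical scores for a handful of proteins.
toy_scores = {"ABL1": -9.8, "JAK1": -9.1, "PIK3CD": -8.7, "CPS1": -7.2}
deps = top_binders(toy_scores, n=2)
print(deps)
```

The resulting protein list is what would be submitted to ORA as the gene set of interest.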

Pathway visualization was performed by constructing a network using Cytoscape software. The five pathways with the highest combined scores were selected as the main active pathways from the KEGG pathway analysis results calculated using EnrichR for network construction. The nodes of the network were composed of each compound, DEPs and the main active pathways. The edges of the network comprised interactions between compounds and DEPs and interactions between DEPs and major active pathways.

The docking analysis was visualized by selecting five proteins (ABL1, JAK1, PIK3CD, CPS1 and CACNA1S) that effectively interacted with all three components in the docking analysis. RYR1 was excluded from the visualization because its sequence length was too large, and it was split into multiple files in the AF2 database to provide sequence information. AutoDock tools were used to visualize the molecular docking prediction results for the five proteins and each compound [34].

Molecular mechanics–Poisson Boltzmann surface area analysis

Molecular dynamics simulations were performed using GROningen MAchine for Chemical Simulations (GROMACS 2023.2) software [35]. ABL1 protein and OA, HG and GA topology files were generated using the UCSF Chimera software (v1.17.3) [36]. Their parameters were generated using the SwissParam platform [37]. For the GROMACS protein–ligand simulation, the Chemistry at Harvard Macromolecular Mechanics (CHARMM) 27 force field [38] and the three-site transferable intermolecular potential (TIP3P) water model [39] were used, and molecular trajectory files were generated using these parameters. Afterward, the gmx_MMPBSA library was used to calculate the molecular mechanics–Poisson Boltzmann surface area (MM/PBSA) [40]. Protein–ligand interaction and molecular trajectory information produced using GROMACS were used as input. Through this process, the binding energy values of ABL1 and the three compounds were confirmed.

RNA-seq analysis method

Chemicals and reagents

Roswell Park Memorial Institute (RPMI) 1640 medium, phosphate-buffered saline (PBS), TrypLE Express, penicillin–streptomycin and fetal bovine serum (FBS) were purchased from Gibco (Grand Island, NY, USA). Cell culture flasks and multiwell culture plates were purchased from Thermo Fisher Scientific (Waltham, MA, USA). Dimethyl sulfoxide (DMSO) and QIAzol lysis reagents were purchased from Sigma-Aldrich (St. Louis, MO, USA) and Qiagen (Germantown, MD, USA), respectively. The Ez-Cytox cell viability assay kit was purchased from Dogen Bio (Seoul, Korea). OA (CFN98800), HG (CFN98695) and GA (CFN99624) were purchased from ChemFace (Wuhan, Hubei, China), and all compounds had purities ≥98%.

Cell culture

The human non-small cell lung cancer (NSCLC) cell line A549 (CCL-185) was purchased from the American Type Culture Collection (ATCC, Manassas, VA, USA). The cells were maintained in RPMI 1640 medium supplemented with 10% (v/v) heat-inactivated FBS, 100 IU/ml penicillin and 100 μg/ml streptomycin at 37°C in a 5% CO₂ incubator. A549 cells were subcultured every 3 or 4 days, depending on the cell density.

Drug treatment and total RNA preparation

The compounds were dissolved at 20 mM in DMSO and stored at −20°C until use. Before drug treatment, 20 mM drug solutions were diluted to 100 and 200 μM with PBS and filtered through a 0.22 μm membrane syringe filter (Sartorius, Goettingen, Germany). PBS with 2% DMSO was used as the vehicle. A549 cells were plated at 3 × 10⁵ cells/well in a 6-well plate containing 3 ml growth medium 1 day before drug treatment. The cells were exposed to 5 or 10 μM by treatment with 150 μl of 100 or 200 μM diluted drug per well. No cytotoxicity was observed for any drug at the high dose (10 μM), as confirmed using the Ez-Cytox cell viability assay kit. After 24 h of drug treatment, the cells were washed thrice with ice-cold PBS. The total cell lysate was prepared with the QIAzol lysis reagent and stored in a −70°C deep freezer until RNA extraction. Total RNA was isolated according to the manufacturer’s protocol. The concentration of the isolated RNA was determined using an Agilent RNA 6000 Nano Kit (Agilent Technologies, Waldbronn, Germany), and RNA quality was evaluated by assessing the RNA integrity number (RIN > 7).
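The nominal exposure concentrations follow from simple dilution arithmetic, which can be checked with a short calculation. This sketch assumes that the 150 μl of diluted drug is added on top of the 3 ml of growth medium already in the well; the function name is hypothetical.

```python
def final_concentration_um(working_um, added_ul, medium_ul):
    """Final concentration (uM) after adding a working solution to the medium in a well."""
    return working_um * added_ul / (medium_ul + added_ul)

# 150 ul of 100 uM or 200 uM working solution added to 3 ml (3000 ul) of medium.
low = final_concentration_um(100, 150, 3000)
high = final_concentration_um(200, 150, 3000)
print(round(low, 2), round(high, 2))
```

Under this assumption the exact values are about 4.76 and 9.52 μM, which the text rounds to 5 and 10 μM.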

Library preparation for mRNA sequencing

One microgram of total RNA was processed to prepare an mRNA sequencing library using the MGIEasy RNA Directional Library Prep Kit (MGI) according to the manufacturer’s instructions. The first step involved purifying poly A-containing mRNA molecules using poly T oligo-attached magnetic beads. Following purification, mRNA was fragmented into small pieces using divalent cations at elevated temperatures. Cleaved RNA fragments were copied into first-strand cDNA using reverse transcriptase and random primers. Strand specificity was achieved in RT directional buffer, followed by second-strand cDNA synthesis. A single ‘A’ base was then added to these cDNA fragments, followed by adapter ligation. The products were purified and enriched using PCR to create the final cDNA library. The double-stranded library was quantified using the QuantiFluor ONE dsDNA System (Promega). The library was circularized at 37°C for 30 min and then digested at 37°C for 30 min, followed by cleanup of the circularization products. DNA nanoballs (DNBs) were prepared by incubating the library at 30°C for 25 min with DNB enzyme. Finally, the library was quantified using the QuantiFluor ssDNA System (Promega).

Sequencing and estimation of expression abundance

The prepared DNB was sequenced using the MGIseq system (MGI) with 100 bp paired-end reads. Reads were trimmed using Trim Galore [41] to remove adapter sequences and low-quality reads. High-quality sequence reads were mapped to the human genome (hg38), and mRNA expression levels were quantified using the DESeq2 library [42].

Gene set enrichment analysis and cluster analysis of pathways

Gene set enrichment analysis (GSEA) was performed using the RNA sequencing results obtained from transcriptome experiments [43]. Differentially expressed genes (DEGs) were extracted using the DESeq2 library (v1.38.2) [42] and edgeR (v3.40.1) [44] included in Bioconductor [45], and the extracted DEGs were visualized using volcano plots [DEG selection threshold: q-value ≤ 0.05, |log2 fold change| ≥ log2(1.5)]. GSEA was performed for curated gene sets (KEGG pathway, Hallmark and WikiPathways) in the Molecular Signature Database (MSigDB v7.5.1) [46] using the fgsea package (v1.24) [47] in R (v4.2.2) with parameters of minimum size 15, maximum size 500 and 100 000 permutations. The statistical significance of the GSEA results was evaluated by adjusting the P-value using the BH procedure. GSEA results were visualized as a heat map using pheatmap (v1.0.12), and hierarchical clustering analysis of pathways was performed using Euclidean distance and the complete linkage method [48].
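The DEG selection rule can be illustrated with a small filter. This is a sketch, not the authors' R code: it assumes that the log2(1.5) cutoff applies to the absolute log2 fold change and that the q-value cutoff is 0.05. The function name and the example values are hypothetical.

```python
import math

def is_deg(q_value, log2_fc, q_cut=0.05, fc_cut=1.5):
    """True if a gene passes both the q-value and the fold-change thresholds."""
    return q_value <= q_cut and abs(log2_fc) >= math.log2(fc_cut)

# Made-up (q-value, log2 fold change) pairs for four genes.
examples = [(0.01, 0.6), (0.01, 0.5), (0.2, 2.0), (0.01, -1.0)]
calls = [is_deg(q, lfc) for q, lfc in examples]
print(calls)
```

Using the absolute value keeps both up-regulated and down-regulated genes, which is what a volcano plot displays on its two arms.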

RESULTS

Physical and chemical properties of OA, HG and GA

The physicochemical and molecular characteristics of OA, HG and GA were compared, and molecular descriptors were calculated to quantitatively confirm the similarity of the compounds. According to Lipinski’s rule of five [49], OA and HG are druggable compounds except that they are hydrophobic, and GA is suitable for use as a drug under all conditions. Lipinski’s rule of five helps determine whether a given compound has the potential to be used as a drug. Although this rule works well, not all drugs fulfill it; in extreme cases, even FDA-approved drugs may not meet it. Abarelix, a well-known gonadotropin-releasing hormone antagonist, satisfies only one of Lipinski’s rules, but it is extensively used as an effective drug [50]. In comparison, OA, HG and GA satisfy most of Lipinski’s rules and can therefore be regarded as compounds with high potential for use as drugs. In addition, OA and HG have similar chemical properties and the same chemical classification as pentacyclic triterpenoids; therefore, they are suitable for analysis as similar compounds (Table 1).

Table 1.

Chemical properties of OA, HG and GA

	Chemical property	Property value	Druggability (Lipinski’s rule)
Oleanolic acid	Molecular weight	456.7	YES
	XLogP3	7.5	NO
	Hydrogen bond donor count	2	YES
	Hydrogen bond acceptor count	3	YES
	Chemical class	Pentacyclic triterpenoid
Hederagenin	Molecular weight	472.7	YES
	XLogP3	6.8	NO
	Hydrogen bond donor count	3	YES
	Hydrogen bond acceptor count	4	YES
	Chemical class	Pentacyclic triterpenoid
Gallic acid	Molecular weight	170.12	YES
	XLogP3	0.7	YES
	Hydrogen bond donor count	4	YES
	Hydrogen bond acceptor count	5	YES
	Chemical class	Benzoic acids


Note: Lipinski’s rule of five: molecular weight ≤ 500, XLogP3 ≤ 5, H-bond donors ≤ 5, H-bond acceptors ≤ 10.
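The druggability column of Table 1 can be reproduced with a short check of each property against the limits in the note. The property values are taken from Table 1; the dictionary layout and function name are hypothetical.

```python
# Upper limits from Lipinski's rule of five, as listed in the table note.
LIMITS = {"mw": 500, "xlogp3": 5, "hbd": 5, "hba": 10}

# Property values from Table 1.
COMPOUNDS = {
    "OA": {"mw": 456.7, "xlogp3": 7.5, "hbd": 2, "hba": 3},
    "HG": {"mw": 472.7, "xlogp3": 6.8, "hbd": 3, "hba": 4},
    "GA": {"mw": 170.12, "xlogp3": 0.7, "hbd": 4, "hba": 5},
}

def lipinski_violations(props):
    """Return the names of the properties that exceed their limit."""
    return [name for name, limit in LIMITS.items() if props[name] > limit]

violations = {name: lipinski_violations(props) for name, props in COMPOUNDS.items()}
print(violations)
```

Only the XLogP3 criterion is violated, and only by OA and HG, which matches the NO entries in the table.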

Distance similarity calculations using molecular descriptors confirmed that OA and HG are similar to each other and dissimilar to GA. The Euclidean distance analysis showed that the distance between OA and HG was ~40 times smaller than that between OA and GA and between HG and GA. In particular, the cosine and Tanimoto calculations showed that OA and HG were >99% similar. Conversely, in the Tanimoto calculation results, GA had only ~7% similarity with the other two compounds, which quantitatively confirmed that it was a heterogeneous compound compared with the other two compounds (Table 2).

Table 2.

Molecular descriptor similarity of the three components

	Euclidean distance	Cosine distance	Tanimoto distance
OA-HG	4793.82	0.99995	0.99949
OA-GA	189942.02	0.76676	0.07616
HG-GA	194267.50	0.76446	0.07231


Note: OA: oleanolic acid; HG: hederagenin; GA: gallic acid.

Results of systems pharmacology analysis of OA, HG and GA

Systems pharmacology analysis using the BATMAN-TCM database showed similar patterns for OA and HG, whereas GA showed a different pattern. OA and HG shared 44 druggable targets, corresponding to 86% of the OA targets (44/51) and 96% of the HG targets (44/46). In contrast, only 20% (5/25) of the GA targets were shared with the druggable targets of the other two compounds (Figure 2A, Supplementary Table 1).
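The sharing rates are simple set-overlap fractions and can be recomputed as follows. The target identifiers below are placeholders constructed only to reproduce the reported counts (44 shared targets, 51 for OA and 46 for HG); they are not the actual BATMAN-TCM targets.

```python
def shared_fraction(own, other):
    """Fraction of `own` targets that also appear in `other`."""
    return len(own & other) / len(own)

# Placeholder target sets that match the reported set sizes.
shared = {f"T{i}" for i in range(44)}
oa = shared | {f"OA_only{i}" for i in range(7)}
hg = shared | {f"HG_only{i}" for i in range(2)}

oa_rate = round(100 * shared_fraction(oa, hg))
hg_rate = round(100 * shared_fraction(hg, oa))
print(oa_rate, hg_rate)
```

Note that the fraction is asymmetric: the same 44 shared targets make up a larger share of the smaller HG target set than of the OA target set.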

Figure 2. Systems pharmacology analysis of OA, HG and GA. A systems pharmacology platform, BATMAN-TCM, was used to predict druggable target proteins of OA, HG and GA. ORA was performed on the EnrichR platform using the predicted druggable proteins and KEGG pathway gene sets, and a compound–protein-pathway (CPP) network was constructed using the ORA results. (A) Venn diagram of the druggable target proteins of OA, HG and GA predicted by BATMAN-TCM. (B) ORA results using KEGG pathway gene sets and druggable target proteins of OA. (C) ORA results using KEGG pathway gene sets and druggable target proteins of HG. (D) ORA results using KEGG pathway gene sets and druggable target proteins of GA. (E) CPP network constructed using OA, HG and GA target proteins and KEGG pathway analysis results. Orange, light-blue and yellow nodes represent OA, HG and GA, respectively. Green nodes represent proteins interacting with compounds, and pink nodes represent valid pathways derived from proteins. The orange box indicates the classification of proteins and pathways, which are divided into (i) the action point where OA and HG act together, (ii) the action point where GA acts alone and (iii) the action point where the three compounds act together.

In the ORA using the druggable targets of each compound, the analysis results for OA and HG were consistent. The ORA of OA and HG showed that the three major pathways with the highest combined scores were cardiac muscle contraction, oxidative phosphorylation and non-alcoholic fatty liver disease. In contrast, arachidonic acid metabolism, serotonergic synapses and long-term depression were identified as the major pathways in GA, which was not consistent with the results of the other two compounds (Figure 2B–D). Even when the analysis results were expanded to the top 10 KEGG pathways, OA and HG matched in nine pathways, whereas GA matched only the steroid hormone biosynthesis pathway with OA. In GO pathway analysis, both OA and HG were predicted to have the highest combined score for the mitochondrial electron transport mechanism. In addition, in the OMIM disease analysis, both OA and HG were predicted to act on cholesterol levels, migraine and myocardial infarction (Supplementary Figure 1).

Network analysis results based on systems pharmacology platform analysis also showed that OA and HG share the same mechanism. OA and HG share the most druggable targets and mechanisms of cardiac muscle contraction, oxidative phosphorylation, neuroactive ligand-receptor interactions, tyrosine metabolism and fatty acid degradation. However, GA had relatively many unique elements and, unlike the other two components, acted on serotonergic synapse, dopaminergic synapse, AMPK signaling pathway and TGF-beta signaling pathway (Figure 2E).

Results of molecular docking analysis of OA, HG and GA

Analysis of the top 50 druggable proteomes predicted by molecular docking analysis for OA, HG and GA showed that OA and HG had similar patterns (Supplementary Material 2). OA and HG shared 38 druggable proteins, while GA shared only 11 proteins with the other two (Figure 3A, Supplementary Table 2).

In the ORA docking results, both OA and HG were predicted to act on cholinergic synapses, type 2 diabetes mellitus and the gonadotropin-releasing hormone (GnRH) secretion pathway. In contrast, GA was predicted to act on pathways different from those of the two compounds, such as folate biosynthesis and the pancreatic cancer pathway (Figure 3B–D). Even when the analysis results were expanded to the top 10 KEGG pathways, OA and HG matched eight pathways, whereas GA did not match any pathway. In addition, in the GO pathway and OMIM disease analysis, both OA and HG showed consistent predictive results for membrane depolarization, calcium ion import, long QT syndrome and hypertension (Supplementary Figure 2).

In addition, network analysis based on molecular docking predicted that OA and HG share the same mechanism. Both OA and HG are predicted to act on GnRH secretion, type 2 diabetes mellitus, cholinergic synapses and calcium signaling pathways. However, because GA had many proteins that interacted only with GA, unlike the other two compounds, it acted on pathways such as folate biosynthesis, pancreatic cancer and chronic myeloid leukemia (Figure 3E).

The molecular docking analysis results of the proteins with which all three compounds interacted were visualized to confirm whether OA and HG bound to the same positions. Visualization confirmed that OA and HG interacted at similar positions in all proteins, whereas only GA interacted at different positions (Figure 3F and G, Supplementary Figure 3). These results suggest that OA and HG likely share similar mechanisms and that GA may have a different mechanism even though it interacts with the same proteins.

Finally, MM/PBSA analysis confirmed that the three compounds differed in binding energy in terms of molecular dynamics (Supplementary Table 3). The binding energy changes of the compounds with similar scaffolds were similar to each other and differed from those of the compound with a different scaffold. In particular, there was a large difference in the polar contribution to the solvation free energy, and this difference appears to have contributed the most to the difference in total energy.
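For orientation, the energy terms discussed above follow the usual MM/PBSA decomposition, written here in its generic textbook form. Which terms were actually reported (for example, whether the entropy term was estimated or omitted) is not stated in this section, so the expression should be read as a general sketch rather than the exact formula used in the study.

```latex
\begin{align*}
\Delta G_{\text{bind}} &= G_{\text{complex}} - \left(G_{\text{protein}} + G_{\text{ligand}}\right) \\
&\approx \underbrace{\Delta E_{\text{vdW}} + \Delta E_{\text{ele}}}_{\text{gas-phase MM energy}}
 + \underbrace{\Delta G_{\text{polar}} + \Delta G_{\text{nonpolar}}}_{\text{solvation free energy}}
 - T\Delta S
\end{align*}
```

Here $\Delta G_{\text{polar}}$ is the polar contribution to the solvation free energy, the term for which the largest difference between compounds was observed.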

Results of RNA-seq analysis of OA, HG and GA

Selection of the DEGs of OA, HG, GA and a combination of OA and HG (COH) using RNA-seq analysis confirmed that the expression patterns of OA, HG and COH were similar. In particular, 110 genes showed altered expression levels in all three of the OA, HG and COH groups. In contrast, among the 49 genes whose expression levels changed in GA, 43 showed a change in expression level only in GA (Figure 4A). Volcano plot analysis also confirmed that FAM129A, PCK2, MTHFD2, PSAT1, ASNS, PHGDH, FGB and EGLN3, whose expression levels were altered in COH, were also altered in at least one of the OA and HG groups. However, it was not possible to identify genes whose expression levels changed in the same pattern as COH in the GA group (Figure 4B, Supplementary Figure 4, Supplementary Table 4).
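The two counts reported above (genes altered in all of OA, HG and COH, and genes altered only in GA) correspond to simple set operations on the per-group DEG lists. The sketch below uses placeholder gene identifiers; the helper names and the toy memberships are hypothetical and do not reproduce the study's data.

```python
def shared_by_all(deg_sets, groups):
    """Genes that are differentially expressed in every listed group."""
    return set.intersection(*(deg_sets[g] for g in groups))


def exclusive_to(deg_sets, group):
    """Genes that are differentially expressed in `group` and in no other group."""
    others = set().union(
        *(genes for name, genes in deg_sets.items() if name != group)
    )
    return deg_sets[group] - others


# Placeholder gene identifiers, for illustration only.
deg = {
    "OA": {"g1", "g2", "g3"},
    "HG": {"g1", "g2", "g4"},
    "COH": {"g1", "g2", "g3", "g4"},
    "GA": {"g2", "g5", "g6"},
}

common = shared_by_all(deg, ["OA", "HG", "COH"])
ga_only = exclusive_to(deg, "GA")
print(sorted(common), sorted(ga_only))
```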

In the GSEA results, OA, HG and COH exhibited similar patterns. In KEGG pathway analysis, OA, HG and COH showed pathways related to amino acid and fat metabolism, such as (i) alanine, aspartate and glutamate metabolism; (ii) glycine, serine and threonine metabolism; (iii) biosynthesis of unsaturated fatty acids; and (iv) steroid biosynthesis. In contrast, GA was associated with RNA-related pathways, including (i) RNA degradation, (ii) pyrimidine metabolism and (iii) purine metabolism (Figure 4C, Supplementary Figure 5). In particular, OA, HG and COH acted in opposition to GA in the aldosterone-regulated sodium reabsorption pathway.
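To make the GSEA comparison more concrete, the following sketch computes the running-sum enrichment score for one gene set over a ranked gene list. It is a simplified illustration under stated assumptions: it omits the permutation-based significance testing and score normalization that full GSEA implementations perform, and the function name and arguments are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set, weights=None, p=1.0):
    """Return the running-sum enrichment score (largest deviation from zero).

    ranked_genes: genes ordered from most up- to most down-regulated.
    gene_set:     set of genes belonging to the pathway.
    weights:      optional ranking metric per gene; equal weights if omitted.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    if weights is None:
        weights = [1.0] * len(ranked_genes)
    hit_total = sum(abs(w) ** p for w, h in zip(weights, hits) if h)
    if hit_total == 0 or n_miss == 0:
        return 0.0  # score is undefined without both hits and misses

    running, best = 0.0, 0.0
    for w, h in zip(weights, hits):
        # Step up on a hit (weighted), step down on a miss (uniform).
        running += abs(w) ** p / hit_total if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


ranked = ["a", "b", "c", "d"]
print(enrichment_score(ranked, {"a", "b"}))  # set enriched at the top of the list
print(enrichment_score(ranked, {"c", "d"}))  # set enriched at the bottom
```

A pathway on which two treatments act in opposite directions, as described for aldosterone-regulated sodium reabsorption, would show enrichment scores of opposite sign in the two ranked lists.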

DISCUSSION AND CONCLUSION

In this study, we confirmed that natural compounds with similar structures exhibit similar mechanisms of action. In other words, through analysis of the natural product–derived compounds OA, HG and GA, it was found that OA and HG with similar structures had similar mechanisms of action, while GA and the other two compounds had different mechanisms of action.

The finding that compounds with similar structures exhibit similar mechanisms has three major novelties. First, an advanced network pharmacology research method was suggested by partially presenting a solution to overcome the difficulty of identifying multicomponent mechanisms, which are characteristic of natural products and an obstacle to new drug development. Second, a new natural product analysis method was proposed using large-scale molecular docking analysis. Finally, various biological data and analysis methods were used, such as in silico system pharmacology, docking analysis and drug response RNA-seq. Using this approach, we confirmed the pharmacodynamic process from the binding of natural compounds to druggable targets through to the expression of drug efficacy via signaling by the bound targets.

To date, most network pharmacology studies predicting the mechanisms of natural compounds have analyzed all components independently. However, as shown in the molecular descriptor similarity comparison, OA and HG were quantitatively similar compounds; therefore, they should not be considered independent components (Tables 1 and 2). Rather, the two compounds should be considered dependent components that share the same mechanisms. Systems pharmacology analyses indicated that similar compounds share similar mechanisms, and this was confirmed by molecular docking and transcriptome analyses. This result can reduce the complexity of mechanism prediction when analyzing multiple compounds. In other words, similar compounds with the same scaffold can be replaced with a single representative compound. This would help overcome the limitation that the large number of compounds in natural products poses to mechanistic studies. Natural products can effectively treat multifactorial diseases and comorbidities by acting on multiple targets. However, they are difficult to use in developing new drugs because their mechanisms are hard to predict. If the complexity of mechanism prediction can be reduced using the method proposed in this study, it will greatly help natural-product-based drug development strategies. However, network pharmacological analysis showed that OA has a wider range of interacting targets than HG. Because the MOA may vary depending on differences in functional groups, polarity and molecular weight, future research should be conducted to clarify these differences.
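One way to picture the idea of replacing similar compounds with a representative is a greedy grouping by structural similarity. The sketch below is only an illustration under explicit assumptions: compounds are described by hypothetical sets of structural feature identifiers (for example, fingerprint bits), similarity is measured with the Tanimoto coefficient, and the threshold is arbitrary. The study itself compared molecular descriptors and does not prescribe this procedure.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def group_by_similarity(features, threshold=0.7):
    """Greedy leader clustering.

    Each compound joins the first existing representative whose similarity
    reaches the threshold; otherwise it becomes a new representative.
    """
    groups = {}
    for name in features:
        for rep in groups:
            if tanimoto(features[name], features[rep]) >= threshold:
                groups[rep].append(name)
                break
        else:
            groups[name] = [name]
    return groups


# Hypothetical feature sets; the integers stand in for structural features.
features = {
    "OA": {1, 2, 3, 4, 5},
    "HG": {1, 2, 3, 4, 6},
    "GA": {7, 8, 9},
}

groups = group_by_similarity(features, threshold=0.6)
print(groups)
```

Each key of `groups` is a representative compound, and downstream analyses could then be run once per key instead of once per compound. The result depends on input order and on the threshold, which would need to be justified for real data.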

Large-scale molecular docking analysis was performed using human-derived proteins. Because docking analyses based on the magic bullet paradigm focus on a small number of key targets, few attempts have been made to use a large number of human-derived protein structures. Unlike conventional medicines, in which a single active compound is developed from a lead compound, natural products are used after simple processing of plants, animals and minerals. Therefore, among the many compounds present in natural products, it is important to confirm whether a given compound interacts with a druggable protein. Because selecting a specific compound as a key compound in natural product mechanism research is difficult without specific criteria, a large-scale molecular docking analysis strategy is required. In particular, to study how the multiple compounds and targets of natural products treat complex diseases, it is important to identify the target proteins to which natural compounds bind in the human body. For such mechanistic studies, the large-scale docking analysis method used in this study will be important for future natural product research (Figure 3).

Finally, this study is significant in that it traces the full sequence of pharmacodynamic mechanisms of compounds in herbal medicine. It is well known that when compounds bind to target proteins in the human body, second messengers are activated and affect the expression of new proteins. However, it is difficult to confirm these secondary drug activities of natural products using network pharmacology and molecular docking analyses alone. The secondary drug activity of the natural product, which these two analyses could not confirm, was confirmed using drug response transcriptome analysis. For example, the cholinergic synapse, GnRH secretion and calcium signaling pathways obtained from the molecular docking analysis of OA and HG (Figure 3B and C) affect the aldosterone-regulated sodium reabsorption obtained from the RNA-seq analysis (Figure 4C). This is also consistent with the cardiac muscle contraction pathway obtained from the system pharmacology analysis. These results confirm that OA and HG affect heart health (Figure 2) [51]. Together, the biological data analyses describe a series of processes in which natural compounds are absorbed by the human body, interact with druggable targets and trigger downstream responses through second messengers. As shown in our study, a research method that combines systems biology, molecular docking and transcriptome analysis will be a milestone in analyzing the pharmacological mechanisms of natural products.

In this study, GSEA based on drug-response transcripts confirmed that similar compounds have similar mechanisms. A recent study identified the Aurora B pathway as a novel anticancer therapeutic activity of Paeoniae Radix (PR) extract in lung cancer cell lines via systematic transcriptome analysis [52]. Among the eight compounds (HG, OA, GA, albiflorin, benzoic acid, catechin, paeoniflorin and paeonol) in PR, HG and OA demonstrated anticancer efficacy by reducing Aurora kinase activity. In contrast, other compounds (GA, benzoic acid, catechin, paeoniflorin and paeonol) in PR induced cell-cycle arrest and apoptosis via p53 and MAPK activation. Thus, whereas the structurally similar HG and OA significantly altered the Aurora B pathway, the other compounds, including GA, exerted anticancer effects through cell-cycle arrest and apoptosis. The results of our study are consistent with these previous findings, which indicate that the structure of a compound determines its MOA. These results demonstrate that the method used in this study can be effectively used to study the mechanisms of action of herbal medicines.

However, this study has several limitations. First, the amount (dose) of each compound was not considered in the system pharmacology analysis. The dose of a drug is very important in all experiments, from in vitro experiments to clinical trials, because the biological mechanism and cell viability may vary depending on the dose. Systems pharmacology studies have been limited because they do not consider the doses of compounds contained in herbal medicines. The discovery in this study that compounds with the same scaffold have the same mechanism will have a positive impact on dose-based in silico herbal medicine research. However, this study only suggested a method for handling the dose of compounds in herbal medicines and did not consider the exact dose. In future studies, it will be necessary to propose a systems pharmacology analysis strategy that incorporates methods, such as LC–MS, that can quantitatively confirm the amount of each compound.

Another limitation of our study is that the association between systems pharmacology, molecular docking and RNA-seq analysis was not clearly confirmed. All three methods are used to confirm the MOA of a drug and were analyzed using the same gene set. However, systems pharmacology and molecular docking are methods used to observe DTI mechanisms at the upstream level, whereas RNA-seq is a method used to observe DTI mechanisms at the downstream level after a second messenger system transmits a signal. Because they involve complex interaction mechanisms, these methods yield a consistent drug mechanism in some cases but inconsistent mechanisms in others. Therefore, further studies are required to connect and integrate different analytical methods based on the same compounds.

A further limitation of the present study is that the results may be difficult to generalize. Previous studies have reported that OA and HG, which belong to pentacyclic triterpenes, show similar results, thereby suggesting that compounds with similar scaffolds likely have similar mechanisms [52]. Moreover, studies have also reported that other components with identical scaffolds have identical mechanisms [53–56]. Therefore, pentacyclic triterpenes, such as OA, HG and ursolic acid, likely share the same mechanism, but it may be difficult to derive such generalized conclusions based on our findings. For example, according to the activity cliff theory, two compounds may have similar structures but different mechanisms [57]. Hence, future studies will be needed to generalize this claim.

Nevertheless, the results of this study are meaningful in that they suggest an analysis strategy that can improve existing systems pharmacology analysis methods by building on the finding that natural product-derived compounds with the same scaffold have the same mechanism. In addition, this study is novel because it proposes a new natural product mechanism analysis method using large-scale molecular docking and confirms the results of secondary drug response transcripts using RNA-seq analysis.

Key Points

The precise mechanisms by which natural products treat diseases have not yet been identified.
We compared physicochemical descriptors and conducted large-scale molecular docking and RNA-seq analyses to compare mechanisms of similar compounds when used individually and in combinations.
An advanced network pharmacology research method was proposed by partially presenting a solution to the difficulty in identifying multi-component mechanisms, and a new natural product analysis method was developed.
Our work proposes an analysis strategy that improves systems pharmacology research analysis methods by demonstrating that natural product-derived compounds with the same scaffold have the same mechanism.

FUNDING

This research was funded by the research program of the Korea Institute of Oriental Medicine [grant number KSN1731122], and this work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) [grant number RS-2023-00250546].

DATA AVAILABILITY

The raw sequence and processed data were deposited in the NCBI Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) with accession number GSE228524.

Supplementary Material

Supplementary_Figure_1_bbad344 (jpeg, 556.9KB)
Supplementary_Figure_2_bbad344 (jpeg, 609.5KB)
Supplementary_Figure_3_bbad344 (jpeg, 1.4MB)
Supplementary_Figure_4_bbad344 (jpeg, 488.9KB)
Supplementary_Figure_5_bbad344 (jpeg, 1.1MB)
supplementary_material_1_molecular_descriptor_bbad344 (xlsx, 66KB)
supplementary_matarial_2_druggable_proteome_docking_result_bbad344 (xlsx, 21.7KB)
supplementary_material_3_DEG_list_bbad344 (xlsx, 19.7KB)
Supplementary_tables_bbad344 (docx, 31.4KB)

Author Biographies

Musun Park is a senior researcher at the Korea Institute of Oriental Medicine (KIOM). His research interests are network pharmacology, systems modeling and machine learning in medicine. His email is bmusun@kiom.re.kr.

Su-Jin Baek is a senior researcher at KIOM. Her research interests are functional genomics and epigenetics. Her email is baeksj@kiom.re.kr.

Sang-Min Park is an assistant professor at Chungnam National University. His research interests are systems biology and systems pharmacology. His email is smpark@cnu.ac.kr.

Jin-Mu Yi is a senior researcher at KIOM. His research interests are cancer and bioinformatics. His email is jmyi@kiom.re.kr.

Seongwon Cha is a principal researcher at KIOM. His research interests are genomics in population variations and the systems research of Korean medicine. His email is scha@kiom.re.kr.

Contributor Information

Musun Park, Korean Medicine (KM) Data Division, Korea Institute of Oriental Medicine, Daejeon, Republic of Korea.

Su-Jin Baek, Korean Medicine (KM) Data Division, Korea Institute of Oriental Medicine, Daejeon, Republic of Korea.

Sang-Min Park, College of Pharmacy, Chungnam National University, Daejeon, Republic of Korea.

Jin-Mu Yi, KM Convergence Research Division, Korea Institute of Oriental Medicine, Daejeon, Republic of Korea.

Seongwon Cha, Korean Medicine (KM) Data Division, Korea Institute of Oriental Medicine, Daejeon, Republic of Korea.

AUTHOR CONTRIBUTIONS

M.S.P. contributed to the research design, analyzed data and drafted the manuscript; S.J.B. drafted the manuscript; S.M.P. analyzed data; J.M.Y. conducted data design and production and drafted the manuscript; S.W.C. drafted and validated the manuscript. All authors have read and agreed to the published version of the manuscript.

References

1. Saggar S, Mir PA, Kumar N, et al. Traditional and herbal medicines: opportunities and challenges. Pharm Res 2022;14:107–14.
2. Di Pierro F. Roles of chemical complexity and evolutionary theory in some hepatic and intestinal enzymatic systems in chemical reproducibility and clinical efficiency of herbal derivatives. ScientificWorldJournal 2014;2014:1–12.
3. Schwartz RS. Paul Ehrlich's magic bullets. N Engl J Med 2004;350:1079–80.
4. Strebhardt K, Ullrich A. Paul Ehrlich's magic bullet concept: 100 years of progress. Nat Rev Cancer 2008;8:473–80.
5. Wang Y, Fan X, Qu H, et al. Strategies and techniques for multi-component drug design from medicinal herbs and traditional Chinese medicine. Curr Top Med Chem 2012;12:1356–62.
6. Li S, Zhang B. Traditional Chinese medicine network pharmacology: theory, methodology and application. Chin J Nat Med 2013;11:110–20.
7. Park M, Park SY, Lee HJ, Kim CE. A systems-level analysis of mechanisms of Platycodon grandiflorum based on a network pharmacological approach. Molecules 2018;23:2841.
8. Ru J, Li P, Wang J, et al. TCMSP: a database of systems pharmacology for drug discovery from herbal medicines. J Cheminform 2014;6:13.
9. Ribeiro JA, Sebastiao AM. Caffeine and adenosine. J Alzheimers Dis 2010;20(Suppl 1):S3–15.
10. Puckeridge M, Fulcher BD, Phillips AJ, Robinson PA. Incorporation of caffeine into a quantitative model of fatigue and sleep. J Theor Biol 2011;273:44–54.
11. Bemis GW, Murcko MA. The properties of known drugs. 1. Molecular frameworks. J Med Chem 1996;39:2887–93.
12. Hu Y, Stumpfe D, Bajorath J. Computational exploration of molecular scaffolds in medicinal chemistry. J Med Chem 2016;59:4062–76.
13. Bohm HJ, Flohr A, Stahl M. Scaffold hopping. Drug Discov Today Technol 2004;1:217–24.
14. Krueger BA, Dietrich A, Baringhaus KH, et al. Scaffold-hopping potential of fragment-based de novo design: the chances and limits of variation. Comb Chem High Throughput Screen 2009;12:383–96.
15. Hu Y, Stumpfe D, Bajorath J. Recent advances in scaffold hopping. J Med Chem 2017;60:1238–46.
16. Gonzalez-Burgos E, Gomez-Serranillos MP. Terpene compounds in nature: a review of their potential antioxidant activity. Curr Med Chem 2012;19:5319–41.
17. Ghosh S. Triterpene structural diversification by plant cytochrome P450 enzymes. Front Plant Sci 2017;8:1886.
18. Shoichet BK, McGovern SL, Wei B, Irwin JJ. Lead discovery using molecular docking. Curr Opin Chem Biol 2002;6:439–46.
19. Bender BJ, Gahbauer S, Luttens A, et al. A practical guide to large-scale docking. Nat Protoc 2021;16:4799–832.
20. Chu YJ, Corey DR. RNA sequencing: platform selection, experimental design, and data interpretation. Nucleic Acid Ther 2012;22:271–4.
21. Kwon OS, Kim W, Cha HJ, Lee H. In silico drug repositioning: from large-scale transcriptome data to therapeutics. Arch Pharm Res 2019;42:879–89.
22. Degtyarenko K, de Matos P, Ennis M, et al. ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res 2008;36:D344–50.
23. Moriwaki H, Tian YS, Kawashita N, Takagi T. Mordred: a molecular descriptor calculator. J Cheminform 2018;10:4.
24. Miranda-Quintana RA, Bajusz D, Racz A, et al. Differential consistency analysis: which similarity measures can be applied in drug discovery? Mol Inform 2021;40:e2060017.
25. Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 2000;28:27–30.
26. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B Methodol 1995;57:289–300.
27. Shannon P, Markiel A, Ozier O, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003;13:2498–504.
28. Chen EY, Tan CM, Kou Y, et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics 2013;14:128.
29. Ashburner M, Ball CA, Blake JA, et al. Gene ontology: tool for the unification of biology. Nat Genet 2000;25:25–9.
30. Hamosh A, Scott AF, Amberger JS, et al. Online Mendelian inheritance in man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res 2005;33:D514–7.
31. O'Boyle NM, Banck M, James CA, et al. Open babel: an open chemical toolbox. J Cheminform 2011;3:33.
32. Kim S, Thiessen PA, Bolton EE, et al. PubChem substance and compound databases. Nucleic Acids Res 2016;44:D1202–13.
33. Trott O, Olson AJ. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem 2010;31:455–61.
34. Huey R, Morris GM. Using AutoDock 4 with AutoDocktools: a tutorial. The Scripps Research Institute, USA 2008;8:54–6.
35. Van der Spoel D, Lindahl E, Hess B, et al. GROMACS: fast, flexible, and free. J Comput Chem 2005;26:1701–18.
36. Pettersen EF, Goddard TD, Huang CC, et al. UCSF chimera - a visualization system for exploratory research and analysis. J Comput Chem 2004;25:1605–12.
37. Zoete V, Cuendet MA, Grosdidier A, Michielin O. SwissParam: a fast force field generation tool for small organic molecules. J Comput Chem 2011;32:2359–68.
38. Brooks BR, Brooks CL, Mackerell AD, et al. CHARMM: the biomolecular simulation program. J Comput Chem 2009;30:1545–614.
39. Mark P, Nilsson L. Structure and dynamics of the TIP3P, SPC, and SPC/E water models at 298 K. J Phys Chem A 2001;105:9954–60.
40. Valdes-Tresanco MS, Valdes-Tresanco ME, Valiente PA, et al. gmx_MMPBSA: a new tool to perform end-state free energy calculations with GROMACS. J Chem Theory Comput 2021;17:6281–91.
41. Krueger F. Trim galore: a wrapper tool around Cutadapt and FastQC to consistently apply quality and adapter trimming to FastQ files, with some extra functionality for MspI-digested RRBS-type (reduced representation Bisufite-Seq) libraries. 2012. http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/ (28 April 2016, date last accessed).
42. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014;15:550.
43. Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 2005;102:15545–50.
44. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010;26:139–40.
45. Gentleman RC, Carey VJ, Bates DM, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004;5:R80.
46. Liberzon A, Subramanian A, Pinchback R, et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 2011;27:1739–40.
47. Korotkevich G, Sukhov V, Budin N, et al. Fast gene set enrichment analysis. BioRxiv 2021;060012.
48. Murtagh F, Contreras P. Algorithms for hierarchical clustering: an overview. Wiley Interdiscip Rev Data Min Knowl Discov 2012;2:86–97.
49. Lipinski CA. Lead- and drug-like compounds: the rule-of-five revolution. Drug Discov Today Technol 2004;1:337–41.
50. Mongiat-Artus P, Teillac P. Abarelix: the first gonadotrophin-releasing hormone antagonist for the treatment of prostate cancer. Expert Opin Pharmacother 2004;5:2171–9.
51. Tsilosani A, Gao C, Zhang WZ. Aldosterone-regulated sodium transport and blood pressure. Front Physiol 2022;13:770375.
52. Baek SJ, Lee H, Park SM, et al. Identification of a novel anticancer mechanism of Paeoniae radix extracts based on systematic transcriptome analysis. Biomed Pharmacother 2022;148:112748.
53. Patocka J. Biologically active pentacyclic triterpenes and their current medicine signification. J Appl Biomed 2003;1:7–12.
54. Kashyap D, Sharma AS, Tuli H, et al. Ursolic acid and oleanolic acid: pentacyclic terpenoids with promising anti-inflammatory activities. Recent Pat Inflamm Allergy Drug Discov 2016;10:21–33.
55. Takada K, Nakane T, Masuda K, et al. Ursolic acid and oleanolic acid, members of pentacyclic triterpenoid acids, suppress TNF-alpha-induced E-selectin expression by cultured umbilical vein endothelial cells. Phytomedicine 2010;17:1114–9.
56. Park M, Lee S-Y, Lee H, et al. Effect of terpenes from Poria Cocos: verifying modes of action against Alzheimer's disease using molecular docking, drug-induced transcriptomes and diffusion network. BioRxiv 2023:2023–06.543358.
57. Stumpfe D, Hu HB, Bajorath J. Evolving concept of activity cliffs. ACS Omega 2019;4:14360–8.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary_Figure_1_bbad344

Click here for additional data file.^{(556.9KB, jpeg)}

Supplementary_Figure_2_bbad344

Click here for additional data file.^{(609.5KB, jpeg)}

Supplementary_Figure_3_bbad344

Click here for additional data file.^{(1.4MB, jpeg)}

Supplementary_Figure_4_bbad344

Click here for additional data file.^{(488.9KB, jpeg)}

Supplementary_Figure_5_bbad344

Click here for additional data file.^{(1.1MB, jpeg)}

supplementary_material_1_molecular_descriptor_bbad344

Click here for additional data file.^{(66KB, xlsx)}

supplementary_matarial_2_druggable_proteome_docking_result_bbad344

Click here for additional data file.^{(21.7KB, xlsx)}

supplementary_material_3_DEG_list_bbad344

Click here for additional data file.^{(19.7KB, xlsx)}

Supplementary_tables_bbad344

Click here for additional data file.^{(31.4KB, docx)}

Data Availability Statement

The raw sequence and processed data were deposited in the NCBI Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) with accession number GSE228524.

[ref1] 1. Saggar S, Mir PA, Kumar N, et al. Traditional and herbal medicines: opportunities and challenges. Pharm Res 2022;14:107–14. [Google Scholar]

[ref2] 2. Di Pierro F. Roles of chemical complexity and evolutionary theory in some hepatic and intestinal enzymatic systems in chemical reproducibility and clinical efficiency of herbal derivatives. ScientificWorldJournal 2014;2014:1–12. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref3] 3. Schwartz RS. Paul Ehrlich's magic bullets. N Engl J Med 2004;350:1079–80. [DOI] [PubMed] [Google Scholar]

[ref4] 4. Strebhardt K, Ullrich A. Paul Ehrlich's magic bullet concept: 100 years of progress. Nat Rev Cancer 2008;8:473–80. [DOI] [PubMed] [Google Scholar]

[ref5] 5. Wang Y, Fan X, Qu H, et al. Strategies and techniques for multi-component drug design from medicinal herbs and traditional Chinese medicine. Curr Top Med Chem 2012;12:1356–62. [DOI] [PubMed] [Google Scholar]

[ref6] 6. Li S, Zhang B. Traditional Chinese medicine network pharmacology: theory, methodology and application. Chin J Nat Med 2013;11:110–20. [DOI] [PubMed] [Google Scholar]

[ref7] 7. Park M, Park SY, Lee HJ, Kim CE. A systems-level analysis of mechanisms of Platycodon grandiflorum based on a network pharmacological approach. Molecules 2018;23:2841. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref8] 8. Ru J, Li P, Wang J, et al. TCMSP: a database of systems pharmacology for drug discovery from herbal medicines. J Chem 2014;6:13. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref9] 9. Ribeiro JA, Sebastiao AM. Caffeine and adenosine. J Alzheimers Dis 2010;20(Suppl 1):S3–15. [DOI] [PubMed] [Google Scholar]

[ref10] 10. Puckeridge M, Fulcher BD, Phillips AJ, Robinson PA. Incorporation of caffeine into a quantitative model of fatigue and sleep. J Theor Biol 2011;273:44–54. [DOI] [PubMed] [Google Scholar]

[ref11] 11. Bemis GW, Murcko MA. The properties of known drugs. 1. Molecular frameworks. J Med Chem 1996;39:2887–93. [DOI] [PubMed] [Google Scholar]

[ref12] 12. Hu Y, Stumpfe D, Bajorath J. Computational exploration of molecular scaffolds in medicinal chemistry. J Med Chem 2016;59:4062–76. [DOI] [PubMed] [Google Scholar]

[ref13] 13. Bohm HJ, Flohr A, Stahl M. Scaffold hopping. Drug Discov Today Technol 2004;1:217–24. [DOI] [PubMed] [Google Scholar]

[ref14] 14. Krueger BA, Dietrich A, Baringhaus KH, et al. Scaffold-hopping potential of fragment-based de novo design: the chances and limits of variation. Comb Chem High Throughput Screen 2009;12:383–96. [DOI] [PubMed] [Google Scholar]

[ref15] 15. Hu Y, Stumpfe D, Bajorath J. Recent advances in scaffold hopping. J Med Chem 2017;60:1238–46. [DOI] [PubMed] [Google Scholar]

[ref16] 16. Gonzalez-Burgos E, Gomez-Serranillos MP. Terpene compounds in nature: a review of their potential antioxidant activity. Curr Med Chem 2012;19:5319–41. [DOI] [PubMed] [Google Scholar]

[ref17] 17. Ghosh S. Triterpene structural diversification by plant cytochrome P450 enzymes. Front Plant Sci 2017;8:1886. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref18] 18. Shoichet BK, McGovern SL, Wei B, Irwin JJ. Lead discovery using molecular docking. Curr Opin Chem Biol 2002;6:439–46. [DOI] [PubMed] [Google Scholar]

[ref19] 19. Bender BJ, Gahbauer S, Luttens A, et al. A practical guide to large-scale docking. Nat Protoc 2021;16:4799–832. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref20] 20. Chu YJ, Corey DR. RNA sequencing: platform selection, experimental design, and data interpretation. Nucleic Acid Ther 2012;22:271–4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref21] 21. Kwon OS, Kim W, Cha HJ, Lee H. In silico drug repositioning: from large-scale transcriptome data to therapeutics. Arch Pharm Res 2019;42:879–89. [DOI] [PubMed] [Google Scholar]

[ref22] 22. Degtyarenko K, de Matos P, Ennis M, et al. ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res 2008;36:D344–50. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref23] 23. Moriwaki H, Tian YS, Kawashita N, Takagi T. Mordred: a molecular descriptor calculator. J Chem 2018;10:4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref24] 24. Miranda-Quintana RA, Bajusz D, Racz A, et al. Differential consistency analysis: which similarity measures can be applied in drug discovery? Mol Inform 2021;40:e2060017. [DOI] [PubMed] [Google Scholar]

[ref25] 25. Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 2000;28:27–30. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref26] 26. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B Methodol 1995;57:289–300. [Google Scholar]

[ref27] 27. Shannon P, Markiel A, Ozier O, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003;13:2498–504.

[ref28] 28. Chen EY, Tan CM, Kou Y, et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics 2013;14:128.

[ref29] 29. Ashburner M, Ball CA, Blake JA, et al. Gene ontology: tool for the unification of biology. Nat Genet 2000;25:25–9.

[ref30] 30. Hamosh A, Scott AF, Amberger JS, et al. Online Mendelian inheritance in man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res 2005;33:D514–7.

[ref31] 31. O'Boyle NM, Banck M, James CA, et al. Open Babel: an open chemical toolbox. J Cheminform 2011;3:33.

[ref32] 32. Kim S, Thiessen PA, Bolton EE, et al. PubChem substance and compound databases. Nucleic Acids Res 2016;44:D1202–13.

[ref33] 33. Trott O, Olson AJ. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem 2010;31:455–61.

[ref34] 34. Huey R, Morris GM. Using AutoDock 4 with AutoDockTools: a tutorial. The Scripps Research Institute, USA 2008;8:54–6.

[ref35] 35. Van der Spoel D, Lindahl E, Hess B, et al. GROMACS: fast, flexible, and free. J Comput Chem 2005;26:1701–18.

[ref36] 36. Pettersen EF, Goddard TD, Huang CC, et al. UCSF Chimera - a visualization system for exploratory research and analysis. J Comput Chem 2004;25:1605–12.

[ref37] 37. Zoete V, Cuendet MA, Grosdidier A, Michielin O. SwissParam: a fast force field generation tool for small organic molecules. J Comput Chem 2011;32:2359–68.

[ref38] 38. Brooks BR, Brooks CL, Mackerell AD, et al. CHARMM: the biomolecular simulation program. J Comput Chem 2009;30:1545–614.

[ref39] 39. Mark P, Nilsson L. Structure and dynamics of the TIP3P, SPC, and SPC/E water models at 298 K. J Phys Chem A 2001;105:9954–60.

[ref40] 40. Valdes-Tresanco MS, Valdes-Tresanco ME, Valiente PA, et al. gmx_MMPBSA: a new tool to perform end-state free energy calculations with GROMACS. J Chem Theory Comput 2021;17:6281–91.

[ref41] 41. Krueger F. Trim Galore: a wrapper tool around Cutadapt and FastQC to consistently apply quality and adapter trimming to FastQ files, with some extra functionality for MspI-digested RRBS-type (reduced representation Bisulfite-Seq) libraries. 2012. http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/ (28 April 2016, date last accessed).

[ref42] 42. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014;15:550.

[ref43] 43. Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 2005;102:15545–50.

[ref44] 44. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010;26:139–40.

[ref45] 45. Gentleman RC, Carey VJ, Bates DM, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004;5:R80.

[ref46] 46. Liberzon A, Subramanian A, Pinchback R, et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 2011;27:1739–40.

[ref47] 47. Korotkevich G, Sukhov V, Budin N, et al. Fast gene set enrichment analysis. bioRxiv 2021;060012.

[ref48] 48. Murtagh F, Contreras P. Algorithms for hierarchical clustering: an overview. Wiley Interdiscip Rev Data Min Knowl Discov 2012;2:86–97.

[ref49] 49. Lipinski CA. Lead- and drug-like compounds: the rule-of-five revolution. Drug Discov Today Technol 2004;1:337–41.

[ref50] 50. Mongiat-Artus P, Teillac P. Abarelix: the first gonadotrophin-releasing hormone antagonist for the treatment of prostate cancer. Expert Opin Pharmacother 2004;5:2171–9.

[ref51] 51. Tsilosani A, Gao C, Zhang WZ. Aldosterone-regulated sodium transport and blood pressure. Front Physiol 2022;13:770375.

[ref52] 52. Baek SJ, Lee H, Park SM, et al. Identification of a novel anticancer mechanism of Paeoniae radix extracts based on systematic transcriptome analysis. Biomed Pharmacother 2022;148:112748.

[ref53] 53. Patocka J. Biologically active pentacyclic triterpenes and their current medicine signification. J Appl Biomed 2003;1:7–12.

[ref54] 54. Kashyap D, Sharma AS, Tuli H, et al. Ursolic acid and oleanolic acid: pentacyclic terpenoids with promising anti-inflammatory activities. Recent Pat Inflamm Allergy Drug Discov 2016;10:21–33.

[ref55] 55. Takada K, Nakane T, Masuda K, et al. Ursolic acid and oleanolic acid, members of pentacyclic triterpenoid acids, suppress TNF-alpha-induced E-selectin expression by cultured umbilical vein endothelial cells. Phytomedicine 2010;17:1114–9.

[ref56] 56. Park M, Lee S-Y, Lee H, et al. Effect of terpenes from Poria cocos: verifying modes of action against Alzheimer's disease using molecular docking, drug-induced transcriptomes and diffusion network. bioRxiv 2023:2023–06.543358.

[ref57] 57. Stumpfe D, Hu HB, Bajorath J. Evolving concept of activity cliffs. ACS Omega 2019;4:14360–8.
