Abstract
This article describes the construction of a small-molecule web server, CBPDdb, using R-shiny. Three parent compounds, namely coumarin, benzothiazole and pyrazole, were chosen, and their derivatives were curated from the literature. The two-dimensional (2D) structures were drawn using ChemDraw, and the .sdf files were created using Discovery Studio Visualizer v2017. The compounds were read into the R-shiny app using ChemmineR, and a data frame comprising a total of 1146 compounds was generated and manipulated with the dplyr package. The web server includes the JSME 2D sketcher. The compound descriptors are computed with propOB and can be filtered. Users can download the filtered data in .csv and .sdf formats, and the entire dataset for each parent compound can be downloaded in .sdf format. The web server helps researchers screen plausible inhibitors for different diseases. Additionally, the method used to build the web server can be adapted to develop other small-molecule databases (web servers) in RStudio.
Database URL: https://srampogu.shinyapps.io/CBPDdb_Revised/
Introduction
Computer-aided drug design (CADD) has been instrumental in retrieving plausible inhibitors for a given target for the past three decades (1). It allows quick screening of compounds at a very low cost (2–4). CADD is accomplished either by structure-based drug design (SBDD) (5) or by ligand-based drug design (LBDD) (4).
In SBDD, the availability of the resolved three-dimensional (3D) structure of the target and its bound ligand (small molecule) plays an important role (5). The interactions between the target and the ligand are critical for understanding the probable binding mode (6, 7) and the important residues that might bring about the biological activity. LBDD, in contrast, is an approach that relates the structure of a compound to the physicochemical properties that determine its biological activity (8). Potential inhibitors are selected either by mapping the compounds to a pharmacophore model followed by molecular docking (9–12) or directly by molecular docking (13). This process is termed screening or virtual screening. The term virtual screening was first used in the literature in 1997 (14) and is ‘defined as a set of computational methods that analyses large databases or collections of compounds in order to identify potential hit candidates’ (15). Generally, the search for compounds is performed using chemical libraries (16–19). Usually, the compounds are additionally filtered based on their drug-like properties so that they are more likely to succeed during the development process.
A detailed account of different web servers embedded with small molecules is given in a study (14), while another web server provides information on different natural compounds with anticancer activity (20). However, a database with compound derivatives of coumarin, benzothiazole and pyrazole has not yet been built. Therefore, in the current study, we have built a web server of Coumarin–Benzothiazole–Pyrazole Derivatives Database (CBPDdb), with derivatives of coumarin, benzothiazole and pyrazole that have demonstrated biological activity towards various diseases.
Materials and methods
Collection of the compounds
In this study, three compounds, namely coumarin, benzothiazole and pyrazole, were selected to search for derivatives in the literature. These compounds were specifically chosen because there is an increasing number of experimental studies on the biological activities of their derivatives. These compounds have demonstrated varied biological activities and therapeutic applications. We aim to provide researchers in the field of CADD with most of the compounds with biological activities, which would help them discover novel compounds for different diseases.
Specifically, the compounds that have shown biological activity were selected. The derivatives were collected by using ‘compound names and their derivatives’, ‘compound name + synthesis’ and ‘compound name + biological activity’ as the keywords in PubMed, NCBI (https://pubmed.ncbi.nlm.nih.gov/), Google Scholar and Google.
Coumarins (2H-1-benzopyran-2-one) are a group of oxygenated, colourless, crystalline polyphenolic compounds. They were initially isolated from Dipteryx odorata Willd. (Fabaceae) in 1820 by Vogel. This plant is commonly called Coumarou (21, 22). Structurally, coumarin is made up of a benzene ring fused to an α-pyrone ring (23).
Benzothiazole is a heterocyclic structure that is usually bioactive (24). These compounds have a heterocyclic nucleus called a thiazole that confers various biological properties (25). The π-excess aromatic heterocyclic compound pyrazole is a five-membered structure and a widely studied member of the azole family (26). The pyrazole template has gained popularity due to its potential therapeutic applications (26). In this compound, the fourth position is preferred for electrophilic substitution, while the third and the fifth positions are preferred for nucleophilic reactions (26). Varied functional groups can be added to, substituted on, removed from or fused with the pyrazole ring to synthesize biologically potent compounds (27). These three compounds have various medicinal applications and hence were chosen to generate a web server with their derivatives (25, 28–33).
Building of the web server
The two-dimensional (2D) structures were initially sketched using ChemDraw and saved in .mol format. These structures were then imported into Discovery Studio Visualizer to obtain their 3D forms and saved in .sdf format. The therapeutic action of the compounds and the source of curation were recorded in a .csv file, which was used to develop the server along with the .sdf files of the compounds. The overview of the web server is given in Figure 1.
Figure 1. Overview of the web server.
The web server was built with ChemmineR (34), which enables compound similarity search, clustering, visualization and other compound-handling functions. The DT package (renderDataTable) was used to display the compound data in the form of a data table.
Results
Collection of the compounds and building of CBPDdb
To build a web server that could help computational chemists, computational biologists and CADD researchers, we selected coumarin, benzothiazole and pyrazole as a first attempt. A total of 1146 compounds (coumarin, 140; benzothiazole, 451 and pyrazole, 555) were curated from various literature sources. The compounds were imported into RStudio using the read.SDFset function available with ChemmineR. The properties/descriptors for these compounds were generated using propOB. This function becomes available after installing the ChemmineOB package and the OpenBabel software (35). The results obtained were transformed into a data table (DT1).
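The server itself reads the structure files in R with read.SDFset. As a language-neutral illustration of what such a reader has to do, the following minimal Python sketch splits SDF text into records at the `$$$$` delimiter and collects the `> <TAG>` data items of each record. The sample text, the record names and the tags (`cansmi`, `MW`) are hypothetical stand-ins, not data taken from CBPDdb, and the atom and bond blocks are left empty for brevity.

```python
# Hypothetical two-record SDF text; atom and bond blocks are omitted.
SAMPLE_SDF = """\
compound_A
  hypothetical sample; atom and bond blocks omitted

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <cansmi>
O=C1C=Cc2ccccc2O1

> <MW>
146.14

$$$$
compound_B
  hypothetical sample; atom and bond blocks omitted

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <cansmi>
c1cc[nH]n1

> <MW>
68.08

$$$$
"""


def parse_record(lines):
    """Turn the lines of one SDF record into a name plus a dict of data items."""
    record = {"name": lines[0].strip() if lines else "", "data": {}}
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(">") and "<" in line:
            # Header such as "> <MW>": the tag is the text between < and >.
            tag = line[line.index("<") + 1 : line.rindex(">")]
            i += 1
            values = []
            # The value runs until the next blank line.
            while i < len(lines) and lines[i].strip():
                values.append(lines[i].strip())
                i += 1
            record["data"][tag] = "\n".join(values)
        i += 1
    return record


def parse_sdf(text):
    """Split SDF text at each '$$$$' line and parse every record."""
    compounds, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            compounds.append(parse_record(current))
            current = []
        else:
            current.append(line)
    return compounds


compounds = parse_sdf(SAMPLE_SDF)
```

In this sketch every value is kept as a string; a real reader would also parse the atom and bond blocks, which are skipped here.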
Furthermore, a separate file in .csv format was prepared that included the therapeutic action and the source of data curation. This file was read into RStudio using read.csv, and a second data table (DT2) was created. The two data tables (DT1 and DT2) were merged using the merge function and dplyr to join the descriptors with the therapeutic action. This final data table was displayed on the web server. The same pattern was followed to generate the data tables for the derivatives of each parent compound, which are displayed under three tabs.
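The join itself is performed in R with merge and dplyr. The underlying logic can be sketched in Python as an inner join on a shared key, which is also what R's merge does by default when rows without a partner are not kept. The key name (`name`), the column names and all values below are hypothetical and serve only to illustrate the step.

```python
def inner_join(left_rows, right_rows, key):
    """Join two lists of dict rows on `key`, keeping only rows present in both."""
    right_by_key = {row[key]: row for row in right_rows}
    merged = []
    for row in left_rows:
        match = right_by_key.get(row[key])
        if match is not None:
            # Columns from both tables end up in one combined row.
            merged.append({**row, **match})
    return merged


# DT1: descriptors per compound (hypothetical values).
dt1 = [
    {"name": "compound_A", "MW": 146.14, "logP": 1.8},
    {"name": "compound_B", "MW": 68.08, "logP": 0.3},
    {"name": "compound_C", "MW": 135.19, "logP": 2.0},
]

# DT2: therapeutic action and source of curation (hypothetical values).
dt2 = [
    {"name": "compound_A", "activity": "anticancer", "source": "ref_1"},
    {"name": "compound_B", "activity": "antifungal", "source": "ref_2"},
]

final_table = inner_join(dt1, dt2, key="name")
```

Here compound_C has no annotation row and is therefore absent from the joined table; whether unmatched compounds should be kept is a design choice that the article does not specify.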
How to use the database
The web server is divided into three major sections: (1) full dataset with filters, (2) full dataset graphical frequency analysis of descriptors and (3) extracting cansmi (smiles) column: filtered data.
Full dataset with filters
This section shows the full dataset of the compounds. The derivatives of each of the three compounds are presented in a separate tab and can be downloaded in the .csv or .sdf formats. Each data table is provided with a top filter that allows users to select compounds by their choice of descriptors. The filtered data can be downloaded as a .csv file, and users can verify that the selected compounds were downloaded by cross-checking the Chemical Name in both files (Supplementary Figure 1). The DT is equipped with clickable links that connect to the corresponding compound articles. The DT is also provided with a search bar that allows users to search for a given input. For instance, if anticancer is given as the input, the DT will display only those compounds with the anticancer property.
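On the server, the column filters and the search bar are provided by the data table widget. Their combined effect can be sketched in Python as a range filter on numeric descriptor columns plus a case-insensitive substring search across all cells, followed by a .csv export. The function names, column names and values are assumptions made for this illustration.

```python
import csv
import io


def filter_rows(rows, ranges=None, search=""):
    """Keep rows whose numeric columns fall in `ranges` and that contain `search`."""
    ranges = ranges or {}
    needle = search.lower()
    kept = []
    for row in rows:
        in_range = all(low <= row[col] <= high for col, (low, high) in ranges.items())
        matches = not needle or any(needle in str(value).lower() for value in row.values())
        if in_range and matches:
            kept.append(row)
    return kept


def to_csv(rows):
    """Serialise dict rows to CSV text with a header line."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical table rows.
table = [
    {"name": "compound_A", "MW": 146.14, "activity": "Anticancer"},
    {"name": "compound_B", "MW": 68.08, "activity": "antifungal"},
    {"name": "compound_C", "MW": 420.50, "activity": "anticancer"},
]

hits = filter_rows(table, ranges={"MW": (100.0, 300.0)}, search="anticancer")
csv_text = to_csv(hits)
```

With these inputs only compound_A passes both the molecular-weight range and the search term, so the exported file contains a header and one data row.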
Full dataset graphical frequency analysis of descriptors
The sidebar panel of the web server is equipped with a histogram plot that displays the frequency of the compounds. The users can select the descriptor from the sidebar panel and view the result as a histogram with the selection option for bins (Supplementary Figure 2).
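The frequency plot amounts to sorting the values of one descriptor into a user-chosen number of equal-width bins. A minimal Python sketch of that counting step is given below; the descriptor values are hypothetical, and the convention used for values that fall exactly on a bin edge may differ from the one used by the server's plot.

```python
def histogram_counts(values, bins):
    """Return (edges, counts) for `bins` equal-width bins spanning min..max."""
    if bins < 1 or not values:
        raise ValueError("need at least one value and one bin")
    low, high = min(values), max(values)
    width = (high - low) / bins
    counts = [0] * bins
    for value in values:
        if width == 0:
            index = 0  # all values identical: everything lands in the first bin
        else:
            # The maximum value is placed in the last bin rather than past it.
            index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(bins + 1)]
    return edges, counts


# Hypothetical molecular weights for a handful of compounds.
molecular_weights = [100, 150, 200, 250, 300, 400, 500]
edges, counts = histogram_counts(molecular_weights, bins=4)
```

Changing `bins` plays the role of the bin selector in the sidebar: more bins give a finer view of the same descriptor.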
Extracting the cansmi (smiles) column: filtered data
Section 3 is linked to Section 1 and retrieves a single column upon selection. Once the data are filtered in Section 1, the cansmiName column is selected there, and the selected column with the filtered data is displayed in Section 3. Here, the display corresponds to the selected tab. The results (filtered data) can be downloaded in the .csv and .sdf formats. The .sdf files can be used to generate the 3D structures (Supplementary Figure 3).
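The step described here is a column projection applied to rows that have already been filtered. A short Python sketch, with hypothetical rows and the column name `cansmi` assumed for illustration, could look as follows.

```python
def select_column(rows, column):
    """Return the values of one column from already filtered rows."""
    missing = [index for index, row in enumerate(rows) if column not in row]
    if missing:
        raise KeyError(f"column {column!r} missing in rows {missing}")
    return [row[column] for row in rows]


def column_to_csv(values, header):
    """Write a single column as CSV text: header line, then one value per line."""
    return "\n".join([header, *values]) + "\n"


# Hypothetical rows that survived the filters of Section 1.
filtered_rows = [
    {"name": "compound_A", "cansmi": "O=C1C=Cc2ccccc2O1", "MW": 146.14},
    {"name": "compound_B", "cansmi": "c1cc[nH]n1", "MW": 68.08},
]

smiles = select_column(filtered_rows, "cansmi")
smiles_csv = column_to_csv(smiles, header="cansmi")
```

The resulting SMILES strings are the kind of input that can be pasted into a molecular editor to view the corresponding 2D structure.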
Visualizing the 2D structures
The sidebar panel of the server is embedded with the JSME Molecular Editor (Supplementary Figure 2) (36), which facilitates the visualization of the structures of the compounds. To view a 2D structure, the user clicks the downward arrow in the Molecular Editor, selects Paste Mol or SDF or SMILES, provides the SMILES (cansmi, i.e. the canonical SMILES) as the input and clicks Accept. The 2D structure then appears in the editor (Supplementary Figure 4). The editor also has other parameters through which the compound’s appearance can be changed. Additionally, users can copy and save the compound in several formats. JSME also supports modification of the molecules by clicking FG (36) (Supplementary Figure 5).
Discussion and conclusion
In order to discover new drugs with therapeutic ability, the CADD process plays a very effective role. In contrast, traditional drug discovery methods are time- and money-consuming processes (2). The term CADD includes saving the compounds, organizing and evaluating them and further modelling the compounds (2). The efficiency of CADD can be evidently seen during the recent pandemic times, when there was an urgency to identify the potential candidate compounds (37–39). Earlier, our group had computationally designed butein analogues that demonstrated anticancer activity (40). Furthermore, these compounds have shown in silico antibacterial activity (41). In another study, computational design of PARP inhibitors was performed against SARS-CoV-2 (42).
Virtual screening is an important step in retrieving the best molecule against a given target (43, 44). The screening process may proceed via SBDD and/or LBDD (43). In either method, the main purpose is to discover a highly potent putative compound against a target (44, 45). Molecular docking is also included in the virtual screening step. Molecular docking primarily provides knowledge of the binding mode of the ligand at the active site of the protein (46). The small molecules can be prepared using Gypsum-DL for structure-based virtual screening (47).
Accordingly, in the present study, we have built a web server called CBPDdb, consisting of derivatives of coumarin, benzothiazole and pyrazole curated from different literature sources. These compounds have displayed biological activities such as anticancer, antifungal and antiviral activities. We believe that these compounds will be useful to CADD researchers investigating them against several diseases. The web server is equipped with JSME, a 2D sketcher that enables users to visualize the 2D structures of the compounds. Furthermore, the compounds can be selected based on filter parameters to suit the user’s choice of compounds.
In future versions, the web server will be regularly updated to increase the number of coumarin, benzothiazole and pyrazole derivatives and to add derivatives of other compounds. Furthermore, the web server will incorporate different analysis methods and predictions relevant to medicinal chemistry and CADD.
In conclusion, we believe that this web server could help the computational chemist or computational biologist in their research progress. Furthermore, our attempt may also help the researchers design new small-molecule web servers.
Acknowledgments
The authors extend their appreciation to the Deputyship for Research and Innovation, “Ministry of Education” in Saudi Arabia for funding this research (IFKSUOR3–103–3).
Contributor Information
Shailima Rampogu, Cachet Big Data Lab, Hyderabad, Telangana 500045, India.
Mohammed Rafi Shaik, Department of Chemistry, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia.
Merajuddin Khan, Department of Chemistry, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia.
Mujeeb Khan, Department of Chemistry, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia.
Tae Hwan Oh, School of Chemical Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea.
Baji Shaik, School of Chemical Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea.
Supplementary Material
Supplementary Material is available at Database online.
Data availability
The data underlying this article are available in https://github.com/SRampogu/CBPDdb_revised.
Author Contribution
S.R., B.S. and T.H.O. conceived the idea of the project; S.R. built the web server, wrote the manuscript and curated the compounds from the literature; B.S., M.R.S., Me.K. and Muj.K. curated the compounds from the literature; M.R.S., Me.K. and Muj.K. acquired the funding; and B.S. and T.H.O. sketched the 2D structures.
Conflict of interest
The authors declare no conflict of interest.
References
- 1. Sliwoski G., Kothiwale S., Meiler J. et al. (2014) Computational methods in drug discovery. Pharmacol. Rev., 66, 334–395.
- 2. Ou-Yang S., Lu J., Kong X. et al. (2012) Computational drug discovery. Acta Pharmacol. Sin., 33, 1131–1140.
- 3. Sliwoski G., Kothiwale S., Meiler J. et al. (2013) Computational methods in drug discovery. Pharmacol. Rev., 66, 334–395.
- 4. Yu W. and Mackerell A.D. (2017) Computer-aided drug design methods. Methods Mol. Biol., 1520, 85–106.
- 5. Batool M., Ahmad B. and Choi S. (2019) A structure-based drug discovery paradigm. Int. J. Mol. Sci., 20, 2783.
- 6. Ferreira L.G., Dos Santos R.N., Oliva G. et al. (2015) Molecular docking and structure-based drug design strategies. Molecules, 20, 13384–13421.
- 7. Anderson A.C. (2012) Structure-based functional design of drugs: from target to lead compound. Methods Mol. Biol., 823, 359–366.
- 8. Shim J. and MacKerell A.D.J. Jr. (2011) Computational ligand-based rational design: role of conformational sampling and force fields in model development. Medchemcomm, 2, 356–370.
- 9. Joshi S.D., Dixit S.R., Basha J. et al. (2018) Pharmacophore mapping, molecular docking, chemical synthesis of some novel pyrrolyl benzamide derivatives and evaluation of their inhibitory activity against enoyl-ACP reductase (InhA) and Mycobacterium tuberculosis. Bioorg. Chem., 81, 440–453.
- 10. Simon L., Imane A., Srinivasan K.K. et al. (2017) In silico drug-designing studies on flavanoids as anticolon cancer agents: pharmacophore mapping, molecular docking, and monte carlo method-based QSAR modeling. Interdiscip. Sci. Comput. Life Sci., 9, 445–458.
- 11. Tian X., Zhao Q., Chen X. et al. (2022) Discovery of novel and highly potent inhibitors of SARS CoV-2 papain-like protease through structure-based pharmacophore modeling, virtual screening, molecular docking, molecular dynamics simulations, and biological evaluation. Front. Pharmacol., 13, 817715.
- 12. Rampogu S. and Lee K.W. (2021) Pharmacophore modelling-based drug repurposing approaches for SARS-CoV-2 Therapeutics. Front. Chem., 9, 636362.
- 13. Rampogu S., Parameswaran S., Lemuel M.R. et al. (2018) Exploring the therapeutic ability of fenugreek against type 2 diabetes and breast cancer employing molecular docking and molecular dynamics simulations. Evidence-based Complement. Altern. Med., 2018, 1943203.
- 14. Singh N., Chaput L. and Villoutreix B.O. (2021) Virtual screening web servers: designing chemical probes and drug candidates in the cyberspace. Brief. Bioinform., 22, 1790–1818.
- 15. Wermuth C.G., Villoutreix B., Grisoni S. et al. (2015) Chapter 4: strategies in the search for new lead compounds or original working hypotheses. In: Wermuth CG, Aldous D, Raboisson P. et al. (eds.) The Practice of Medicinal Chemistry. 4th edn. Academic Press, San Diego, pp. 73–99.
- 16. Perola E., Xu K., Kollmeyer T.M. et al. (2000) Successful virtual screening of a chemical database for farnesyltransferase inhibitor leads. J. Med. Chem., 43, 401–408.
- 17. Irwin J.J. and Shoichet B.K. (2005) ZINC - a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model., 45, 177–182.
- 18. Yao H. (2022) Virtual screening of natural chemical databases to search for potential ACE2 inhibitors. Molecules, 27, 1740.
- 19. Carracedo-Reboredo P., Liñares-Blanco J., Rodríguez-Fernández N. et al. (2021) A review on machine learning approaches and trends in drug discovery. Comput. Struct. Biotechnol. J., 19, 4538–4558.
- 20. Mangal M., Sagar P., Singh H. et al. (2013) NPACT: naturally occurring plant-based anti-cancer compound-activity-target database. Nucleic Acids Res., 41, D1124–9.
- 21. Küpeli Akkol E., Genç Y., Karpuz B. et al. (2020) Coumarins and Coumarin-Related Compounds in Pharmacotherapy of Cancer. Vol. 12. Cancers (Basel), Basel, Switzerland, p. 1959.
- 22. Jain P.K. and Joshi H. (2012) Coumarin: chemical and pharmacological profile. J. Appl. Pharm. Sci., 2, 236–240.
- 23. Venugopala K.N., Rashmi V. and Odhav B. (2013) Review on natural coumarin lead compounds for their pharmacological activity. Biomed Res. Int., 2013, 963248.
- 24. Ali R. and Siddiqui N. (2013) Biological aspects of emerging benzothiazoles: a short review. J. Chem., 2013, 345198.
- 25. Pathak N., Rathi E., Kumar N. et al. (2020) A review on anticancer potentials of benzothiazole derivatives. Mini Rev. Med. Chem., 20, 12–23.
- 26. Karrouchi K., Radi S., Ramli Y. et al. (2018) Synthesis and pharmacological activities of pyrazole derivatives: a review. Molecules, 23, 134.
- 27. Costa R.F., Turones L.C., Cavalcante K.V.N. et al. (2021) Heterocyclic compounds: pharmacology of pyrazole analogs from rational structural considerations. Front. Pharmacol., 12, 2021.
- 28. Burger A. and Sawhney S.N. (1968) Antimalarials. III. Benzothiazole amino alcohols. J. Med. Chem., 11, 270–273.
- 29. Ansari A., Ali A. and Asif M. (2017) Review: biologically active pyrazole derivatives. New J. Chem., 41, 16–41.
- 30. Naim M.J., Alam O., Nawaz F. et al. (2016) Current status of pyrazole and its biological activities. J. Pharm. Bioallied Sci., 8, 2–17.
- 31. Bairagi S.H., Salaskar P.P., Loke S.D. et al. (2012) Medicinal significance of coumarins: a review. Int. J. Pharm. Res., 4, 16–19.
- 32. Poumale H.M.P., Hamm R., Zang Y. et al. (2013) Coumarins and Related Compounds from the Medicinal Plants of Africa. Medicinal Plant Research in Africa. Elsevier, Amsterdam, Netherlands, pp. 261–300.
- 33. Gouda M.A., Hussein B.H.M., El-Demerdash A. et al. (2020) A review: synthesis and medicinal importance of coumarins and their analogues (Part II). Curr. Bioact. Compd., 16, 993–1008.
- 34. Cao Y., Charisi A., Cheng L.-C. et al. (2008) ChemmineR: a compound mining framework for R. Bioinformatics, 24, 1733–1734.
- 35. Horan K. and Girke T. (2023) ChemmineOB: R interface to a subset of OpenBabel functionalities. R package version 1.38.0, https://github.com/girke-lab/ChemmineOB.
- 36. Bienfait B. and Ertl P. (2013) JSME: a free molecule editor in JavaScript. J. Cheminform., 5, 24.
- 37. Muratov E.N., Amaro R., Andrade C.H. et al. (2021) A critical overview of computational approaches employed for COVID-19 drug discovery. Chem. Soc. Rev., 50, 9121–9151.
- 38. Gurung A.B., Ali M.A., Lee J. et al. (2021) An updated review of computer-aided drug design and its application to COVID-19. Biomed Res. Int., 2021, 8853056.
- 39. Onawole A.T., Sulaiman K.O., Kolapo T.U. et al. (2020) COVID-19: CADD to the rescue. Virus Res., 285, 198022.
- 40. Rampogu S., Kim S.M., Shaik B. et al. (2021) Novel butein derivatives repress DDX3 expression by inhibiting PI3K/AKT signaling pathway in MCF-7 and MDA-MB-231 cell lines. Front. Oncol., 11, 712824.
- 41. Rampogu S., Shaik B., Kim J.H. et al. (2023) Explicit molecular dynamics simulation studies to discover novel natural compound analogues as Mycobacterium tuberculosis inhibitors. Heliyon, 9, e13324.
- 42. Rampogu S., Jung T.S., Ha M.W. et al. (2023) Repurposing and computational design of PARP inhibitors as SARS-CoV-2 inhibitors. Sci. Rep., 13, 10583.
- 43. Li Q. (2020) Chapter 4 - virtual screening of small-molecule libraries. In: Trabocchi A, Lenci EBT-SMDD (eds.) Small Molecule Drug Discovery, Vol. 2020. Elsevier, Amsterdam, Netherlands, pp. 103–125.
- 44. Ekhteiari Salmas R., Unlu A., Bektaş M. et al. (2017) Virtual screening of small molecules databases for discovery of novel PARP-1 inhibitors: combination of in silico and in vitro studies. J. Biomol. Struct. Dyn., 35, 1899–1915.
- 45. Cuccioloni M., Bonfili L., Cecarini V. et al. (2020) Structure/activity virtual screening and in vitro testing of small molecule inhibitors of 8-hydroxy-5-deazaflavin:NADPH oxidoreductase from gut methanogenic bacteria. Sci. Rep., 10, 13150.
- 46. Morris G.M. and Lim-Wilby M. (2008) Molecular docking. Methods Mol. Biol., 443, 365–382.
- 47. Ropp P.J., Spiegel J.O., Walker J.L. et al. (2019) Gypsum-DL: an open-source program for preparing small-molecule libraries for structure-based virtual screening. J. Cheminform., 11, 34.
