Abstract
As many societies age, primary osteoporosis (PO) is increasingly a major health problem. Current drug treatments such as alendronate and risedronate have known side effects. We took an agnostic empirical approach to find PO therapeutic compounds. We examined 13,548,960 probe data-points from mesenchymal stromal cell (hMSC) lines and found PGF, DDIT4, and COMP to be up-regulated and CHI3L1 to be down-regulated. We then identified their druggable domains. For the up-regulated differentially-expressed genes, we used protein–protein interactions to find residue clusters as binding surfaces. We then employed pharmacophore models to screen 15,407,096 conformations of 22,723,923 compounds, which identified (6R,9R)-6-(2-furyl)-9-(1H-indol-3-yl)-2-(trifluoromethyl)-5,6,7,9-tetrahydro-4H[1,2,4]triazolo[5,1], (2S)-N1-[2-[2-(methylamino)-2-oxo-ethyl]phenyl]-N2-phenylpyrrolidine-1,2-dicarboxamide, and 2-furyl-(1H-indol-3-yl)-methyl-BLAHone as candidate compounds. For the down-regulated CHI3L1, we relied on genome-wide disease signatures to identify (11alpha)-9-fluoro-11,17,21-trihydroxypregn-4-ene-3,20-dione and Genistein as candidate compounds. Our approach differs from previous research in three ways. First, we did not confine our drug targets to hypothesized compounds in the existing literature. Instead, we allowed the full expression profile of PO cell lines to reveal the most desirable targets. Second, our differential gene analysis revealed both up- and down-regulated genes, in contrast to the literature, which has focused on inhibiting only up-regulated genes. Third, our virtual screening universe of 22,723,923 compounds was more than 100 times larger than those in the known literature.
Keywords: Bioinformatics, Drug design, Virtual screening, Pharmacophore, Docking, Osteoporosis
1. Introduction
As many societies age, primary osteoporosis (PO) is increasingly a major health problem. In the U.S., PO incidence steadily increased from 75 per 100,000 women in the 1950s to 150 per 100,000 women by the 1990s [3]. The most common drug treatments are generic bisphosphonates such as alendronate and risedronate, due to their low cost [15]. However, they have well-known upper gastrointestinal side effects and are associated with atypical fractures of the femur and aseptic necrosis of the mandible [15]. Recently, a number of researchers have sought to identify new therapeutic drugs for PO. For example, Yasuda et al. [20] focused on cathepsins S and K, which have selective expression in the extracellular matrix (ECM). Feder et al. [6] screened a 500-compound library for purple acid phosphatase inhibitors because elevated phosphatases are correlated with osteoporosis. More recently, Vuorinen et al. [19] screened 202,906 compounds for 17β-hydroxysteroid dehydrogenase 2 inhibitors, because this enzyme catalyzes the inactivation of estradiol into estrone.
We took a different, novel approach. First, we did not assume or confine our drug targets to hypothesized compounds in the existing literature. Instead, we allowed the full expression profile of PO cell lines to reveal the most desirable targets, subject of course to druggability conditions. Second, our differential gene analysis of the cell lines revealed both up- and down-regulated genes, in sharp contrast to the existing literature, which has focused on inhibiting only up-regulated genes. Third, we used the new ZINC database of compounds [7]. With 15,407,096 conformations of 22,723,923 compounds, this universe is more than 100 times larger than that in Vuorinen et al. [19].
In Section 2, we describe how we used human mesenchymal stromal cell (hMSC) lines for differential expression analysis and ran virtual screening on the ZINC database [7]. In Section 3, we report four differentially-expressed genes (DEGs): the up-regulated PGF, DDIT4, and COMP and the down-regulated CHI3L1. This consideration of a down-regulated gene is a departure from the literature, which has focused on only up-regulated genes. Finally, we report the identity of 5 potent candidate compounds under stringent druggability conditions. We conclude with thoughts on some remaining limitations and suggest further research directions.
2. Materials and methods
We obtained probe data of hMSC cell lines from dataset GSE35936 in the NCBI Gene Expression Omnibus [1]. The dataset consisted of 1,354,896 probes for 54,675 genes in each of 10 cell lines, of which 5 are age-matched controls. Data were produced using the Affymetrix U133 Plus 2.0 Array platform.
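The total of 13,548,960 probe data-points quoted in the abstract can be recovered from these counts. The following is a minimal arithmetic check, assuming that the per-cell-line probe count applies equally to all 10 cell lines; the variable names are illustrative only.

```python
# Counts taken from the dataset description above; names are illustrative.
PROBES_PER_CELL_LINE = 1_354_896
CELL_LINES = 10  # 5 PO lines and 5 age-matched controls

# Assumption: every cell line contributes the same number of probes.
total_data_points = PROBES_PER_CELL_LINE * CELL_LINES
print(f"{total_data_points:,}")  # prints 13,548,960
```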
We undertook differential gene analysis using Mendel [11], which we developed in the R language. The probe data were first pre-processed with Robust Multichip Average (RMA) analysis. The DEGs were then annotated with gene ontology, disease ontology, and KEGG pathways. This information allowed us to identify genes that are significantly up- or down-regulated, ranked by log fold change.
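The ranking by log fold change can be illustrated with a small sketch. This is not the Mendel implementation (which is written in R); it is a minimal Python illustration, assuming expression values are already on the log2 scale (as RMA output is), so that a difference of group means is a log2 fold change. The gene names and values are hypothetical toy data.

```python
from statistics import mean


def log_fold_changes(expr, case_cols, control_cols):
    """Return {gene: logFC}, with logFC = mean(case) - mean(control).

    Assumes values are already log2-transformed, so the difference
    of means is a log2 fold change.
    """
    return {
        gene: mean(values[i] for i in case_cols)
        - mean(values[i] for i in control_cols)
        for gene, values in expr.items()
    }


def rank_by_abs_logfc(logfc):
    """Sort genes by decreasing absolute log2 fold change."""
    return sorted(logfc.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical toy data: columns 0-1 are PO arrays, columns 2-3 are controls.
toy_expr = {
    "GENE_A": [9.0, 9.2, 7.0, 7.2],  # up-regulated in PO
    "GENE_B": [5.0, 5.1, 5.0, 5.1],  # unchanged
    "GENE_C": [4.0, 4.2, 6.5, 6.7],  # down-regulated in PO
}
logfc = log_fold_changes(toy_expr, case_cols=[0, 1], control_cols=[2, 3])
ranked = rank_by_abs_logfc(logfc)
for gene, value in ranked:
    print(f"{gene}\t{value:+.2f}")
```

A positive value marks an up-regulated gene and a negative value a down-regulated one; a real analysis would also attach a test statistic and a multiple-testing adjustment to each gene before calling it significant.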
The path from significant DEGs to therapeutic compounds required much filtering. Furthermore, the process for up-regulated DEGs had to be different from that for down-regulated ones. While any ligand docked into an up-regulated DEG domain could be treated as an inhibitor [9], the same could not be said for a down-regulated DEG. Instead, we had to search explicitly for agonists for the latter.
For up-regulated genes, we first used EBI’s structure-based engine to identify druggable protein domains, which we in turn used to identify clusters of anchor residues over protein–protein interaction (PPI) surfaces. This was done with PocketQuery [9], which ranked the clusters with a composite score of solvent-accessible surface area (SASA), free energy (GFC), and sequence conservation. The highest-ranked cluster was used to build pharmacophore models using Pharmer, a high-performance search engine based on geometric hashing, generalized Hough transforms, and Bloom fingerprints [8]. The pharmacophore was then used for virtual screening of over 22 million compounds using ZINCPharmer [10]. We used the stringent criteria of a maximum of 1 hit per conformation, a maximum of 1 hit per molecule, a maximum RMSD of 0.01, and a maximum of 4 rotatable bonds. Finally, each hit was characterized with a set of absorption, distribution, metabolism, excretion, and toxicity (ADMET) rules [17].
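The screening criteria can be made concrete with a short sketch. The code below is not the ZINCPharmer interface, where such limits are set as query options; it is a hypothetical post-hoc filter in Python, written under the assumption that each hit carries a molecule identifier, a conformer index, an RMSD, and a rotatable-bond count. The `Hit` record and the toy hits are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one pharmacophore match."""
    molecule_id: str
    conformer_id: int
    rmsd: float
    rotatable_bonds: int


MAX_RMSD = 0.01           # maximum RMSD, as stated in the text
MAX_ROTATABLE_BONDS = 4   # maximum rotatable bonds, as stated in the text


def filter_hits(hits):
    """Apply the thresholds, then keep the lowest-RMSD hit per molecule.

    Keeping one hit per molecule also guarantees at most one hit per
    conformation, since each conformation belongs to one molecule.
    """
    best = {}
    for hit in hits:
        if hit.rmsd > MAX_RMSD or hit.rotatable_bonds > MAX_ROTATABLE_BONDS:
            continue
        current = best.get(hit.molecule_id)
        if current is None or hit.rmsd < current.rmsd:
            best[hit.molecule_id] = hit
    return sorted(best.values(), key=lambda h: h.rmsd)


# Hypothetical toy hits.
toy_hits = [
    Hit("MOL_1", 0, 0.004, 3),  # kept: best conformer of MOL_1
    Hit("MOL_1", 1, 0.009, 3),  # dropped: second hit for MOL_1
    Hit("MOL_2", 0, 0.020, 2),  # dropped: RMSD too large
    Hit("MOL_3", 0, 0.008, 6),  # dropped: too many rotatable bonds
    Hit("MOL_4", 2, 0.010, 4),  # kept: exactly at both limits
]
kept = filter_hits(toy_hits)
for hit in kept:
    print(hit.molecule_id, hit.conformer_id, hit.rmsd)
```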
Our final step was to undertake molecular simulation to dock the ligands to the protein receptors. To prepare the receptors, we used Chimera [16] to remove the protein’s original ligand, delete its solvent, replace incomplete side chains with Dunbrack rotamers, add hydrogens (with protonation states assigned for histidine) and OXT atoms at missing C-termini, and assign partial charges from the AMBER ff14SB force field, with non-standard residues other than monatomic ions charged using the semi-empirical AM1 method with bond charge correction (AM1-BCC).
For down-regulated genes, we again confined our attention to those with druggable domains. However, we were not screening for inhibitors, but looking for agonists. We used the functional connections between genetic perturbation and drug action to identify bioactive small molecules [12]. We subjected each molecule to the same ADMET rules as for up-regulated DEGs, and docked it to the receptor in the same manner.
3. Results and discussion
Fig. 1 reports the up-regulated DEGs from Mendel. LogFC is the value of the contrast in log2 fold-change. AveExpr is the average log2 expression for that gene across all arrays and channels. t and P.Value are the usual statistics, and adj.P.Val adjusts for multiple testing with the Benjamini–Hochberg method to control the false discovery rate. B is the log-odds statistic, measuring the odds that the gene is differentially expressed.
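To clarify what the adjusted p-value column represents, the standard Benjamini–Hochberg step-up adjustment can be sketched as follows. This is a minimal Python illustration of the general procedure, not code taken from Mendel, and the p-values are hypothetical.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, in the input order.

    The p-value of rank k (1 = smallest) among m tests is scaled by
    m / k, then a running minimum is taken from the largest rank down
    so that the adjusted values are monotone in the raw p-values.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
adj = bh_adjust(raw)
print([round(p, 4) for p in adj])  # prints [0.02, 0.04, 0.04, 0.02]
```

Genes whose adjusted value falls below a chosen false discovery rate threshold are then reported as differentially expressed.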
Figure 1.
Up-regulated DEGs using Mendel.
The most significant up-regulated DEG was MAB21L2 (chr4:151,504,181–151,505,261), a repressor of bone morphogenetic protein (BMP)-induced transcription. However, as Fig. 2 shows, only three up-regulated DEGs have druggable domains: PGF (placental growth factor), DDIT4 (DNA-damage-inducible transcript 4), and COMP (cartilage oligomeric matrix protein).
Figure 2.
Druggable domains of up-regulated DEGs.
Fig. 3 shows how PGF mediates osteoclastogenesis [2]. PGF secreted by osteogenic BMSCs stimulates the expression of RANKL (receptor activator of nuclear factor kappa-B ligand) by these same BMSCs. Working with RANKL, PGF then mediates the differentiation of osteoclast progenitor cells. PGF also signals directly on these precursors in a positive feedback loop, so that its pleiotropic actions lead to bone resorption.
Figure 3.
Pathway for PGF effect on osteoclastogenesis.
DDIT4 is an inhibitor of mTOR signaling in osteoblasts; see Fig. 4. mTOR is a member of the phosphoinositol kinase family and a regulator of 1,25(OH)2D (vitamin D) action in bones, via pS6K1. DDIT4 is also an inhibitor of NFATc1, a master transcription factor for osteoclastogenesis, via eukaryotic translation initiation factor 2 alpha (eIF2alpha) signaling [18].
Figure 4.

Pathway for DDIT4 effect on osteoblast regulation.
COMP encodes a noncollagenous glycoprotein that binds to type I, II, and IX collagen fibers. It has a domain at the N-terminus, a globular domain at the C-terminus, and 4 epidermal growth factor-like (EGF-like) domains, with 8 thrombospondin type 3 repeats (Fig. 2). COMP stabilizes the collagen fiber network in articular cartilage, tendons, menisci, and synovial tissue. Up-regulation of COMP, however, decreases the viability of chondrocytes, which could be a pathway for PO [5].
Next, we report the results of using PPI to discover residue-cluster regions for binding to the three druggable up-regulated DEGs: PGF, DDIT4, and COMP. Fig. 5 illustrates the region for PGF. The residues are shown as space-filling elements and the ligand protein as ball-and-stick. The model in Fig. 6 uses the residue cluster at channel A, with 3 residues. GFC is Gibbs free energy and SASA is solvent-accessible surface area. Conserv is the conservation score calculated from the EBI Scorecons server. An alternative measure of conservation is Evol.Rate, the evolutionary rate computed from Tel Aviv University’s Rate4Site.
Figure 5.
Hot region of PGF (human placenta growth factor-1) defined by residue clusters, at PPI interfaces.
Figure 6.
Pharmacophore model of PGF (human placenta growth factor-1).
With the cluster regions, we successfully built pharmacophore models. Fig. 6 illustrates one such model for PGF.
Finally, Fig. 7 shows the candidate compounds based on virtual screening using the pharmacophore models. The screen was able to identify compounds only for PGF. We also report a number of ADMET screening results. Although not all compounds pass all ADMET screens, all compounds pass most of the screens.
Figure 7.
Candidate compounds for up-regulated DEGs.
Of the three compounds, (2S)-N1-[2-[2-(methylamino)-2-oxo-ethyl]phenyl]-N2-phenyl-pyrrolidine-1,2-dicarboxamide has the highest pIC50 (3.2), and therefore can be considered the most potent inhibitor of the up-regulated PGF.
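For readers who prefer concentrations, a pIC50 can be converted to an IC50. The sketch below assumes the usual convention that pIC50 is the negative base-10 logarithm of the IC50 expressed in mol/L; the text does not state the units, so this is an assumption, and the function name is illustrative.

```python
def pic50_to_ic50_micromolar(pic50):
    """Convert pIC50 to IC50 in micromolar.

    Assumes pIC50 = -log10(IC50 in mol/L), the usual convention.
    """
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e6


ic50_um = pic50_to_ic50_micromolar(3.2)
print(f"IC50 = {ic50_um:.0f} uM")  # about 631 uM under this convention
```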
We were able to dock all three compounds with PGF. Fig. 8 shows the docking using (2S)-N1-[2-[2-(methylamino)-2-oxo-ethyl]phenyl]-N2-phenyl-pyrrolidine-1,2-dicarboxamide. This has a full fitness of −1167.75 kcal/mol and an estimated ΔG of −7.47 kcal/mol.
Figure 8.

PGF docked with (2S)-N1-[2-[2-(methylamino)-2-oxo-ethyl]phenyl]-N2-phenyl-pyrrolidine-1,2-dicarboxamide.
We now turn from up-regulated to down-regulated DEGs, shown in Fig. 9. It turns out that only CHI3L1 (chitinase 3-like 1) is druggable. The role of CHI3L1 in osteoporosis is still uncertain. On the one hand, CHI3L1 promotes the proliferation of connective tissue. Its silencing with siRNA is known to lower bone resorption while its transfection decreases the osteoclast pro-differentiative marker MMP-9 [4]. In this way, PO might actually be associated with CHI3L1 up-regulation. On the other hand, CHI3L1 facilitates bacterial adhesion and invasion, especially in the epithelial cells. It does this by activating protein kinase B (AKT) phosphorylation [14]. It is the excessive production of CHI3L1 that is pathogenic in mucosal tissues. Therefore, it is also plausible that CHI3L1 down-regulation is associated with PO, as our probe data revealed. In short, the exact pathway by which CHI3L1 affects PO needs to be further investigated.
Figure 9.
Down-regulated DEGs using Mendel.
The disease-gene signature map identified two agonist compounds, reported in Fig. 10. Both are potent, although Genistein is the more potent, with a pIC50 of 4.44. Genistein is already known to activate peroxisome proliferator-activated receptors (PPARs), especially gamma forms that regulate bone mass [13], so it seems quite natural for it to be repurposed for PO.
Figure 10.
Candidate compounds for down-regulated CHI3L1.
Fig. 11 illustrates the docking of (11alpha)-9-fluoro-11,17,21-trihydroxypregn-4-ene-3,20-dione with CHI3L1. The full fitness is −1137.60 kcal/mol, with an estimated ΔG of −7.40 kcal/mol.
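An estimated ΔG can be translated into an approximate dissociation constant through the relation ΔG = RT ln Kd. The sketch below assumes that the reported ΔG is a binding free energy, that the temperature is 298.15 K, and that the standard state is 1 mol/L; none of these is stated in the text, so the result is indicative only.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_to_kd(delta_g_kcal_per_mol, temperature_k=298.15):
    """Return Kd in mol/L from delta_G = R * T * ln(Kd).

    Assumes a 1 mol/L standard state; the default temperature is an
    assumption, not a value taken from the docking run.
    """
    return math.exp(delta_g_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


kd_molar = delta_g_to_kd(-7.40)
# A few micromolar under the stated assumptions.
print(f"Kd = {kd_molar * 1e6:.2f} uM")
```

Because Kd depends exponentially on ΔG, a small error in the docking estimate shifts the implied affinity substantially, so the value should be read as an order-of-magnitude guide.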
Figure 11.

CHI3L1 docked with (11alpha)-9-fluoro-11,17,21-trihydroxypregn-4-ene-3,20-dione.
4. Conclusion
Our analysis uses an agnostic approach to identify target genes and their inhibitors and activators, so that we were able to obtain a comprehensive list of five candidate therapeutic compounds for PO. Interestingly, many of our targets are consistent with the literature, even if current research has not gone as far as virtual screening of compounds. Furthermore, we could relax the filtering criteria in many ways, from the threshold log fold-change for determining significant DEGs, to the criteria for determining targets, identifying druggable domains, constructing PPI cluster interfaces, modeling pharmacophores, and finally to screening compounds.
Going forward, the candidate compounds will need biological assays and clinical trials, but the lead compounds now identified could produce a new cocktail of possible therapeutic drugs for PO.
Acknowledgements
I thank Prof. P. Schlar of Mount Sinai Hospital for guiding me towards R, used to develop Mendel, and Peggy Benisch at the University of Wuerzburg and her colleagues for use of the primary osteoporosis dataset. I am also grateful to the developers [1], [18], [10], [11], [8], [14], [2], [13], [12], [16] of the R packages used in Mendel.
Footnotes
Peer review under responsibility of National Research Center, Egypt.
References
- 1. Benisch P., Tatjana S., Ludger K., Frey S., Seefried L., Raaijmakers N., Krug M., Regensburger R.M., Zeck S., Schinke T., Amiing M., Ebert R., Jakob F. PLoS One. 2012;7:e45142. doi: 10.1371/journal.pone.0045142.
- 2. Coenegrachts L., Maes C., Torrekens S., Van Looveren R., Mazzone M., Guise T.A., Bouillon R., Strassen J., Carmeliet P., Carmeliet G. Cancer Res. 2010;70:6537–6547. doi: 10.1158/0008-5472.CAN-09-4092.
- 3. Cooper C., Cole Z.A., Holroyd C.R., Earl S.C., Harvey N.C., Dennison E.M., Melton L.J., Cummings S.R., Kanis J.A. Osteoporos. Int. 2011;22:1277–1288. doi: 10.1007/s00198-011-1601-6.
- 4. Di Rosa M., Tibullo D., Vecchio M., Nunnari G., Saccone S., Di Raimondo F., Malaguarnera L. Bone. 2014;61:55–63. doi: 10.1016/j.bone.2014.01.005.
- 5. Dinser R., Zaucke F., Kreppel F., Hultenby K., Kochanek S., Paulsson M., Maurer P. J. Clin. Invest. 2002;110:505–514. doi: 10.1172/JCI14386.
- 6. Feder D., Hussein W.M., Clayton D.J., Kan M.W., Schenk G., McGeary R.P., Guddat L.W. Chem. Biol. Drug Des. 2012;80:665–674. doi: 10.1111/cbdd.12001.
- 7. Irwin J.J., Sterling T., Mysinger M.M., Bolstad E.S., Coleman R.G. J. Chem. Inf. Model. 2012;52:1757–1768. doi: 10.1021/ci3001277.
- 8. Koes D.R., Camacho C.J. J. Chem. Inf. Model. 2011;51:1307–1314. doi: 10.1021/ci200097m.
- 9. Koes D.R., Camacho C.J. Nucleic Acids Res. 2012;40:1–66. doi: 10.1093/nar/gks336.
- 10. Koes D.R., Camacho C.J. Nucleic Acids Res. 2012;40:W409–W414. doi: 10.1093/nar/gks378.
- 11. Lai C.J. J. Microbiol. Biotechnol. Res. 2015;5:18–30.
- 12. Lamb J., Crawford E.D., Peck D., Modell J.W., Blat I.C., Wrobel M.J., Golub T.R. Science. 2006;313:1929–1935. doi: 10.1126/science.1132939.
- 13. Lecka-Czernik B. IBMS BoneKEy. 2010;7:171–181.
- 14. Lee I.A., Kamba A., Low D., Mizoguchi E. World J. Gastroenterol. 2014;20:1127–1138. doi: 10.3748/wjg.v20.i5.1127.
- 15. Lems W.F., den Heijer M. Neth. J. Med. 2013;71:188–193.
- 16. Pettersen E.F., Goddard T.D., Huang C.C., Couch G.S., Greenblatt D.M., Meng E.C., Ferrin T.E. J. Comput. Chem. 2004;25:1605–1612. doi: 10.1002/jcc.20084.
- 17. Swain M. J. Chem. Inf. Model. 2004;52:613–615.
- 18. Tanjung N.G. In Vitro and In Silico Analysis of Osteoclastogenesis in Response to Inhibition of De-phosphorylation of EIF2alpha by Salubrinal and Guanabenz. Doctoral dissertation, Purdue University; 2013.
- 19. Vuorinen A., Engeli R., Meyer A., Bachmann F., Griesser U.J., Schuster D., Odermatt A. J. Med. Chem. 2014;57:5995–6007. doi: 10.1021/jm5004914.
- 20. Yasuda Y., Kaleta J., Brömme D. Adv. Drug Deliv. Rev. 2005;57:973–993. doi: 10.1016/j.addr.2004.12.013.









