ABSTRACT
Accumulating studies have shown that microRNAs (miRNAs) could be used as targets of small-molecule (SM) drugs to treat diseases. In recent years, researchers have proposed many computational models to reveal miRNA-SM associations because of the high cost of experimental methods. Considering the shortcomings of previous models, such as low prediction accuracy or inapplicability to new SMs (miRNAs), we developed a novel model named Symmetric Nonnegative Matrix Factorization for Small Molecule-MiRNA Association prediction (SNMFSMMA). Unlike models that directly apply the integrated similarities, SNMFSMMA first performed matrix decomposition on the integrated similarity matrices and then calculated the Kronecker product of the new integrated similarity matrices to obtain the SM-miRNA pair similarity. Further, we applied regularized least squares to obtain the function mapping SM-miRNA pairs to association probabilities by minimizing the objective function. On the basis of Dataset 1 and Dataset 2, extracted from the SM2miR v1.0 database, we implemented global leave-one-out cross validation (LOOCV), miRNA-fixed local LOOCV, SM-fixed local LOOCV and 5-fold cross-validation to evaluate the prediction performance. The AUC values obtained by SNMFSMMA in these validations reached 0.9711 (0.8895), 0.9698 (0.8884), 0.8329 (0.7651) and 0.9644 ± 0.0035 (0.8814 ± 0.0033) based on Dataset 1 (Dataset 2), respectively. In the first case study, 5 of the top 10 predicted associations were confirmed. In the second, 7 and 8 of the top 10 predicted miRNAs related to 5-FU and 5-Aza-2ʹ-deoxycytidine, respectively, were confirmed. These results demonstrated the reliable predictive power of SNMFSMMA.
KEYWORDS: Small molecule, microRNA, association prediction, symmetric non-negative matrix factorization, Kronecker regularized least squares
Introduction
MicroRNAs (miRNAs) are endogenous small non-coding RNA molecules (21–25 nucleotides) found in various organisms including viruses, green algae, plants and animals [1,2]. Although miRNAs do not themselves encode proteins, studies have shown that they can regulate gene expression [3]. There is also evidence that miRNAs play an irreplaceable role in many important biological processes [4], such as cell differentiation and metabolism [5,6]. Therefore, it is not surprising that more and more studies have shown that miRNAs are closely related to human diseases [7–9]. For instance, the expression level of miR-17 in kidney neoplasm tissues differs from that in normal tissues [10]. There is evidence that miR-126 can restrain the growth of colon neoplasm cells by targeting insulin receptor substrate-1 (IRS-1); accordingly, miR-126 is often lost in colon neoplasm cells [11]. Similarly, miR-145 inhibits the growth of colon cancer cells by targeting the 3ʹ-untranslated region (UTR) of IRS-1 to down-regulate the IRS-1 protein [12].
Small molecule (SM) drugs have been widely used in clinical treatment due to their delivery efficacy and high biological activity [13,14]. After the realization that RNAs can exert their functions in almost all cellular processes related to pathological conditions, scientists expanded their efforts towards RNA, including the development of miRNA-targeted SMs [15]. Some researchers have found that miRNAs could be used as potential targets for SMs in the treatment of diseases [16]. Recently, a number of important studies on SM–miRNA interactions have been reported [17]. In laboratories, the approaches to identify an SM targeting a miRNA fall into several main strategies. First, there are several approaches based on high-throughput screening to identify SMs that are involved in miRNA biogenesis [18]. For example, Gumireddy et al. [19] reported the luciferase reporter system that has been widely used for high-throughput screening of candidate miRNA-targeted compounds. They selected miR-21 as a target miRNA, screened more than 1000 SMs, and identified azobenzene-2 as an effective inhibitor of miR-21 expression [19]. Alternatively, researchers can identify an SM that is able to interact with a miRNA via structure analysis. Inforna is a platform applied to design SMs targeting miRNAs based on structure and motif data [20]. Using this platform, Vo et al. [21] analysed RNA-binding motifs of SMs and developed SM inhibitors of miR-372 and miR-373 expression in gastric cancer. The last strategy is a combination of bioinformatics and high-throughput screening. Recently, by expanding studies on both miRNA-targeted SM libraries and the SM-miRNA structure database, scientists revealed that a bis-benzimidazole ligand can abrogate miR-544 production to sensitize breast cancer cells to hypoxic stress [22].
Besides, toxicological research on miRNAs has shown that miRNAs can participate in chemotherapy and affect cellular sensitivity to SMs in complex diseases, especially in cancers. For instance, miR-197 can confer cisplatin resistance in non-small cell lung cancer, while miR-125a-5p is able to reverse the resistance in oesophageal squamous cell carcinoma [23,24]. These studies not only provide new insight into the molecular mechanism of SM-miRNA associations, but also serve as new evidence for novel miRNA-targeted therapeutic strategies and drug design for human diseases. The field of miRNA-targeting drug design is in its initial stage but has advanced significantly over the past years. Achievements in both principles and applications have greatly benefited the discovery of selective miRNA-targeted SMs with biological activity. However, all the current strategies used in laboratories still face many limitations. For example, due to the structural and dynamic properties of miRNAs and the complexity of SMs, it is very difficult to discover associations between SMs and miRNAs using biological experiments. Moreover, using experimental methods to reveal associations is time-consuming, costly and inefficient.
It is urgent to develop efficient approaches to continually identify and validate the associations between miRNAs and SMs, which would significantly aid our understanding of the mechanism of miRNA in therapy and elucidate how to select the best-suited miRNAs for targeting. For this purpose, the development of computational methods to predict the SM-miRNA associations is a prospective research field. So far, some researchers have proposed novel computational models to predict the associations between SMs and miRNAs. For example, Lv et al. [25] made use of network retrieval methods on an integrated molecular network including SM similarity network, miRNA similarity network, and known SM-miRNA association network to predict the SM-related miRNAs. They sorted candidate miRNAs for each SM based on their relation probabilities obtained by implementing random walk with restart algorithm on the integrated molecular network. Based on the functional similarity of differentially expressed genes under drug therapy and miRNAs perturbations, Wang et al. [26] calculated the functional similarity scores of SM-miRNA pairs by performing Gene Ontology (GO) enrichment analysis on differentially expressed genes to predict potential relations between SMs and miRNAs. What’s more, Wang et al. also used miRNA as a bridge to predict the associations between SMs and diseases by integrating the confirmed associations of SM-miRNA pairs and disease-related miRNAs, which was beneficial for solving the problem of drug repositioning. In order to predict miRNAs that were associated with SMs, Li et al. [27] proposed a model of predictive Small Molecule-miRNA Network-Based Inference (SMiR-NBI) on the basis of network-based inference algorithm. In SMiR-NBI, the initial resources of a given SM were located in the miRNAs associated with it. These miRNAs first divided the resources into their associated SMs equally, and then the SMs allocated the received resources to each adjacent miRNA. 
The resource value of each miRNA was the probability that it was related to the given SM. SMiR-NBI only used the association information of the SM-miRNA pairs to complete the prediction. Jiang et al. [28] proposed a high-throughput algorithm to predict the associations of SM-miRNA pairs on the basis of the differential expression of miRNA target genes and gene signatures for gene expression profiles after treatment with SMs. Jiang et al. identified 406 cancer-related miRNAs (CRMs) involved in 23 cancers. They applied Gene Ontology (GO) enrichment analysis to divide the differentially expressed target genes of one CRM into significant gene ontology modules (GOMs). The association of an SM-miRNA pair was then assessed by performing the Kolmogorov–Smirnov (KS) test on the transcriptional response to the compound and the differentially expressed target genes of the miRNA. For each CRM in a particular cancer, if there was a significant association between the SM and the CRM in more than 80% of significant GOMs, they assumed that there was an association between the SM and the CRM in this cancer. Moreover, in order to study the miRNAs associated with Alzheimer’s disease (AD), Meng et al. [29] developed a novel model named Small molecule and miRNA association Network in Alzheimer’s disease (SmiRN-AD) based on systematic computational methods. Under the influence of the abnormal expression of miRNAs associated with AD, the expression of their target genes changed, which was defined as the gene expression signature of AD-related miRNA (ADM) regulation. Meng et al. predicted the associations between SMs and miRNAs through comparative analysis of the transcriptional responses to drug therapy and miRNA regulation in AD. Besides, Meng et al. created a web server to help others query the potential associations of SM-miRNA pairs predicted by this model.
Guan et al. [30] proposed a model named Graphlet Interaction based inference for Small Molecule-MiRNA Association prediction (GISMMA) to reveal the associations between SMs and miRNAs. GISMMA constructed a weighted SM similarity network and a weighted miRNA similarity network by integrating multiple kinds of SM similarity information and miRNA similarity information, and then calculated the Graphlet interaction between SMs or miRNAs on the two networks. The predicted association scores of SM-miRNA pairs could be obtained by linear regression between the known SM-miRNA association scores and the number of different Graphlet interaction isomers. Compared with previous prediction methods, GISMMA used Graphlet interaction to describe the complex relationship between two nodes on the network, considering not only directly connected nodes but also the relationship between indirectly connected nodes.
In this paper, we presented a novel model named Symmetric Nonnegative Matrix Factorization for Small Molecule-MiRNA Association prediction (SNMFSMMA) to predict the potential associations of SM-miRNA pairs. Unlike models that directly used the similarity matrices, we first interpolated the integrated similarity matrices using symmetric non-negative matrix factorization (SymNMF), then calculated the Kronecker product of the new-integrated similarity matrices, and finally obtained the function for calculating the association probabilities of SM-miRNA pairs through regularized least squares (RLS). To verify the prediction performance of SNMFSMMA, we implemented Leave-One-Out Cross Validation (LOOCV) and 5-fold cross-validation based on two datasets. As shown in the results, the AUCs of SNMFSMMA reached 0.9711 (0.8895), 0.9698 (0.8884), 0.8329 (0.7651) and 0.9644 ± 0.0035 (0.8814 ± 0.0033) in global LOOCV, miRNA-fixed local LOOCV, SM-fixed local LOOCV and 5-fold cross-validation based on Dataset 1 (Dataset 2), respectively. All of the above results confirmed the reliable prediction ability of SNMFSMMA.
Results
Performance evaluation
To evaluate the prediction performance of SNMFSMMA, we implemented global and local LOOCV based on two datasets constructed from known SM-miRNA associations in SM2miR v1.0. Dataset 1 recorded 664 SM-miRNA associations between 831 SMs and 541 miRNAs, while Dataset 2 contained only the 39 SMs and 286 miRNAs involved in those known associations. Here, SM-miRNA pairs with known associations were considered positive samples, and those without known associations were named unknown samples. In global LOOCV, each of the 664 positive samples was left out in turn as the test sample and the other 663 positive samples were regarded as training samples, while the unknown samples were treated as candidate samples. We utilized SNMFSMMA to score the candidate samples as well as the test sample, and then obtained the ranking of the test sample among all of the candidate samples. Local LOOCV was similar to global LOOCV, except that the test sample was ranked against only a subset of the candidate samples. There were two types of local LOOCV, namely, SM-fixed local LOOCV and miRNA-fixed local LOOCV. The former ranked the test sample against all candidate samples involving the same SM as the test sample, while the latter ranked the test sample against all candidate samples involving the same miRNA as the test sample. Our model was considered to make a successful prediction if the ranking of the test sample was above a certain threshold. Here we defined the true positive rate (TPR) as the ratio of test samples whose rankings were higher than the threshold to the total number of test samples, and the false positive rate (FPR) as the percentage of negative samples whose rankings were also higher than the threshold. Then, we set the FPR as the horizontal axis and the TPR as the vertical axis. Different thresholds corresponded to different points in the coordinate system. The line formed by these points was the receiver operating characteristic (ROC) curve.
We further calculated the area under the ROC curve (AUC) to estimate the prediction performance of the model. AUC = 1 meant that the model ranked every test sample above all candidate samples, and AUC = 0.5 indicated that the model could only make random predictions. That is, for AUC values between 0.5 and 1, the larger the AUC, the better the prediction performance of the model.
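The ranking-based ROC construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the toy ranks are hypothetical, every fold is assumed to share the same number of candidate samples (as in global LOOCV), and the FPR follows the standard convention of counting candidate samples that fall within the threshold.

```python
def roc_points_from_ranks(ranks, n_candidates):
    """Build ROC points from LOOCV rankings.

    ranks: 1-based rank of each left-out positive sample among itself
           plus n_candidates candidate samples (1 = best score).
    """
    n_tests = len(ranks)
    points = [(0.0, 0.0)]
    # Threshold t means "the top-t scored samples are predicted positive".
    for t in range(1, n_candidates + 2):
        true_pos = sum(1 for r in ranks if r <= t)
        # In each fold the top-t list holds t samples; one of them is the
        # test sample whenever its rank is within the threshold.
        false_pos = sum(t - 1 if r <= t else t for r in ranks)
        points.append((false_pos / (n_tests * n_candidates), true_pos / n_tests))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (FPR, TPR) points sorted by FPR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy example: two left-out positives ranked 1st and 3rd among 4 candidates.
toy_auc = area_under_curve(roc_points_from_ranks([1, 3], n_candidates=4))
print(round(toy_auc, 4))
```

In this toy case the first test sample outranks all four candidates and the second outranks two of them, so the AUC equals the average fraction of candidates ranked below the test samples.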
As shown in Fig. 1, the AUC values of SNMFSMMA in global LOOCV based on Dataset 1 and Dataset 2 reached 0.9711 and 0.8895, respectively, which were higher than those of SMiR-NBI (0.8843, 0.7264) [27], GISMMA (0.9291, 0.8203) [30], and SLHGISMMA (0.9273, 0.7774) [31]. In miRNA-fixed local LOOCV, the AUC values of SNMFSMMA based on Dataset 1 and Dataset 2 were 0.9698 and 0.8884, respectively, which were higher than those of SMiR-NBI (0.8837, 0.7846), GISMMA (0.9505, 0.8640), and SLHGISMMA (0.9365, 0.7973) (see Fig. 2). As for SM-fixed local LOOCV, the AUC values of SNMFSMMA based on Dataset 1 and Dataset 2 were 0.8329 and 0.7651, respectively, which were also higher than those of SMiR-NBI (0.7497, 0.6100), GISMMA (0.7702, 0.6591), and SLHGISMMA (0.7703, 0.6556) (see Fig. 3).
Figure 1.

Comparison of prediction performance between SNMFSMMA and three other computational models (SMiR-NBI, GISMMA, and SLHGISMMA) based on the AUC values of global LOOCV. Based on Dataset 1 and Dataset 2, the AUC values of SNMFSMMA reached 0.9711 and 0.8895, respectively, which were higher than those of SMiR-NBI (0.8843, 0.7264), GISMMA (0.9291, 0.8203), and SLHGISMMA (0.9273, 0.7774).
Figure 2.

Comparison of prediction performance between SNMFSMMA and three other computational models (SMiR-NBI, GISMMA, and SLHGISMMA) based on the AUC values of miRNA-fixed local LOOCV. Based on Dataset 1 and Dataset 2, the AUC values of SNMFSMMA were 0.9698 and 0.8884, respectively, which were higher than those of SMiR-NBI (0.8837, 0.7846), GISMMA (0.9505, 0.8640), and SLHGISMMA (0.9365, 0.7973).
Figure 3.

Comparison of prediction performance between SNMFSMMA and three other computational models (SMiR-NBI, GISMMA, and SLHGISMMA) based on the AUC values of SM-fixed local LOOCV. Based on Dataset 1 and Dataset 2, the AUC values of SNMFSMMA were 0.8329 and 0.7651, respectively, which were higher than those of SMiR-NBI (0.7497, 0.6100), GISMMA (0.7702, 0.6591), and SLHGISMMA (0.7703, 0.6556).
In addition, we used 5-fold cross-validation to assess the prediction performance of SNMFSMMA. We first divided the 664 positive samples randomly into five parts (133 in each of the first four parts and 132 in the fifth). Each part of the positive samples was left out as test samples in turn, and the other four parts were used as training samples. As in LOOCV, all SM-miRNA pairs without confirmed associations were regarded as candidate samples. After scoring all the samples with SNMFSMMA, we obtained the ranking of each test sample among all candidate samples. To reduce the impact of the random grouping of positive samples on the prediction results, we performed 5-fold cross-validation 100 times to obtain the mean and standard deviation of the AUCs. As a result, based on Dataset 1 and Dataset 2, our model achieved AUCs of 0.9644 ± 0.0035 and 0.8814 ± 0.0033 in 5-fold cross-validation, respectively, which were higher than those of SMiR-NBI (0.8554 ± 0.0063, 0.7104 ± 0.0087), GISMMA (0.9263 ± 0.0026, 0.8088 ± 0.0044), and SLHGISMMA (0.9241 ± 0.0052, 0.7724 ± 0.0032). These results indicated that the prediction performance of SNMFSMMA was trustworthy.
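The fold construction and the repeated-run summary described above can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function names and the seed are hypothetical, and whether the reported standard deviation is the sample or the population version is not stated in the text, so the sample version is assumed here.

```python
import random
import statistics


def make_folds(n_positives=664, k=5, seed=0):
    """Randomly split positive-sample indices into k folds of near-equal size."""
    rng = random.Random(seed)
    indices = list(range(n_positives))
    rng.shuffle(indices)
    base, extra = divmod(n_positives, k)  # 664 = 5 * 132 + 4
    folds, start = [], 0
    for fold_id in range(k):
        size = base + (1 if fold_id < extra else 0)  # first `extra` folds get one more
        folds.append(indices[start:start + size])
        start += size
    return folds


def summarize(aucs):
    """Mean and sample standard deviation of AUCs from repeated runs."""
    return statistics.mean(aucs), statistics.stdev(aucs)


folds = make_folds()
print([len(fold) for fold in folds])  # fold sizes for 664 positives
```

Each repetition of the 5-fold procedure would call `make_folds` with a different seed, compute one AUC, and pass the collected AUCs to `summarize`.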
The above results indicated that the performance based on Dataset 1 was better than that based on Dataset 2, which was mainly due to the following reasons. In the process of implementing cross-validation, Dataset 1 and Dataset 2 contained the same positive samples, but the number of candidate samples in Dataset 1 was much larger than that in Dataset 2. Because there were many SMs and miRNAs without any known associations in Dataset 1, the association probabilities of the candidate samples involving these SMs or miRNAs were very low. Therefore, the test samples outranked a larger share of the candidate samples in Dataset 1 than in Dataset 2, and the AUC based on Dataset 1 was larger than that based on Dataset 2.
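The size of this effect can be made concrete with a short calculation. The sketch below uses only the dataset sizes reported in this paper; the helper names and the figure of 100 outranking candidates are hypothetical, and the sketch assumes, as argued above, that the additional candidates in Dataset 1 all score below the test sample.

```python
def candidate_count(n_sm, n_mirna, n_known):
    """Number of SM-miRNA pairs without a known association."""
    return n_sm * n_mirna - n_known


def fraction_outranked(n_outranking, n_candidates):
    """Share of candidates that the test sample ranks above."""
    return 1.0 - n_outranking / n_candidates


d1_candidates = candidate_count(831, 541, 664)  # Dataset 1
d2_candidates = candidate_count(39, 286, 664)   # Dataset 2
print(d1_candidates, d2_candidates)

# Hypothetical test sample that 100 candidates outrank in both datasets.
print(fraction_outranked(100, d1_candidates))
print(fraction_outranked(100, d2_candidates))
```

With the same number of outranking candidates, the much larger candidate pool of Dataset 1 pushes the per-sample ranking score closer to 1.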
Case studies
Based on the literature published in PubMed, we performed two kinds of case studies to evaluate the prediction performance of SNMFSMMA. In the first kind of case study, we sorted the calculated association probabilities of SM-miRNA pairs without known associations, and then counted how many of the top 10, top 20 and top 50 predicted SM-miRNA associations were verified by the literature. As shown in the verification results in Table 1, 5, 13 and 27 of the top 10, 20 and 50 predicted associations were confirmed by experimental literature. We give a brief description of the 5 confirmed SM-miRNA pairs among the top 10. In triple-negative breast cancer (TNBC), miR-126-3p (ranked 1st in the prediction list) was confirmed to correlate with the therapeutic effects of 5-Fluorouracil (5-FU) and a favourable TNBC outcome [32]. In hepatocellular carcinoma, it was confirmed that overexpression of miR-195 (ranked 2nd in the prediction list) could sensitize hepatocellular carcinoma cells to 5-FU by repressing the BCL-w protein level [33]. Similarly, in breast cancer, let-7b (ranked 5th in the prediction list) could sensitize MCF-7 cells to 5-FU by binding to the 3ʹ-untranslated region (UTR) of Bcl-xl [34]. In colon cancer, after exposure to 5-FU, the intracellular levels of miR-145 (ranked 6th in the prediction list) significantly increased in 5-FU-sensitive DLD-1 cells. The 5-FU resistance was due to the enhanced secretion of miR-145 via extracellular microvesicles (MVs) [35]. In another colon cancer cell line, HCT116, increased expression of miR-143 (ranked 9th in the prediction list) was associated with increased cell death and decreased viability after exposure to 5-FU through regulation of nuclear factor-kappaB pathways [36].
Table 1.
The validation of the top 50 predicted SM-miRNA associations. The first and second columns record the SMs and miRNAs of the top 25 predicted associations, and the fourth and fifth columns those of predicted associations 26–50. As a result, 5 and 27 of the top 10 and 50 predicted associations were confirmed by experimental literature; the Evidence columns give the PubMed identifier of the publication supporting each confirmed association.
| SM | MiRNA | Evidence | SM | MiRNA | Evidence |
|---|---|---|---|---|---|
| CID:3385 | hsa-mir-126 | 26062749 | CID:3385 | hsa-mir-214 | unconfirmed |
| CID:3385 | hsa-mir-195 | 21947,05 | CID:3385 | hsa-let-7i | unconfirmed |
| CID:3385 | hsa-mir-181b-1 | unconfirmed | CID:3385 | hsa-mir-140 | 19734943 |
| CID:451668 | hsa-mir-19b-1 | unconfirmed | CID:3385 | hsa-mir-26a-1 | unconfirmed |
| CID:3385 | hsa-let-7b | 25789066 | CID:3385 | hsa-mir-10b | 22322955 |
| CID:3385 | hsa-mir-145 | 24447928 | CID:451668 | hsa-let-7c | unconfirmed |
| CID:3385 | hsa-mir-1322 | unconfirmed | CID:3385 | hsa-mir-196a-1 | unconfirmed |
| CID:3385 | hsa-mir-19b-1 | unconfirmed | CID:3385 | hsa-mir-142 | 23619912 |
| CID:3385 | hsa-mir-143 | 19843160 | CID:451668 | hsa-let-7b | 26708866 |
| CID:3385 | hsa-mir-125b-1 | unconfirmed | CID:3385 | hsa-mir-155 | 28347920 |
| CID:60750 | hsa-mir-24-2 | 25841339 | CID:451668 | hsa-mir-19b-2 | unconfirmed |
| CID:3385 | hsa-mir-150 | 23084747 | CID:3385 | hsa-mir-181a-1 | unconfirmed |
| CID:451668 | hsa-mir-18a | unconfirmed | CID:60750 | hsa-mir-23a | unconfirmed |
| CID:3385 | hsa-mir-146a | 28466779 | CID:451668 | hsa-mir-221 | unconfirmed |
| CID:3385 | hsa-mir-221 | 27501171 | CID:3385 | hsa-mir-29c | unconfirmed |
| CID:451668 | hsa-mir-143 | 28391715 | CID:3385 | hsa-mir-34a | 25333573 |
| CID:3385 | hsa-mir-181b-2 | unconfirmed | CID:5311 | hsa-mir-495 | unconfirmed |
| CID:3385 | hsa-mir-205 | 24396484 | CID:3385 | hsa-mir-183 | 27249600 |
| CID:3385 | hsa-let-7c | 25951903 | CID:451668 | hsa-mir-106a | unconfirmed |
| CID:3385 | hsa-mir-200a | 24510588 | CID:3385 | hsa-mir-100 | 20585341 |
| CID:451668 | hsa-let-7e | 22053057 | CID:3385 | hsa-mir-103a-1 | unconfirmed |
| CID:3385 | hsa-mir-133b | 28881788 | CID:3385 | hsa-mir-29b-1 | unconfirmed |
| CID:451668 | hsa-mir-195 | 23333942 | CID:451668 | hsa-mir-181b-1 | unconfirmed |
| CID:451668 | hsa-mir-203a | 26577858 | CID:3385 | hsa-mir-125b-2 | unconfirmed |
| CID:3385 | hsa-mir-107 | 26636340 | CID:451668 | hsa-mir-342 | unconfirmed |
To test the ability of SNMFSMMA to predict miRNAs associated with new SMs without any known associated miRNAs, we implemented the second kind of case study on 5-FU and 5-Aza-2ʹ-deoxycytidine (5-Aza-CdR). Here, we first set all the 1 elements in the row of the adjacency matrix corresponding to 5-FU or 5-Aza-CdR to 0 so that all the known associations of the investigated SM were removed. Then, we calculated the association probabilities between all miRNAs and the investigated SM and sorted them. As a result, 7 and 24 of the top 10 and top 50 predicted miRNAs associated with 5-FU were validated, respectively, of which 4 and 14 were validated by the known associations we recorded, and the rest were verified by literature published on PubMed (see Table 2). We briefly introduce the three miRNAs in the top 10 whose associations with 5-FU were confirmed by the literature. In colorectal cancer (CRC), 4-acetylantroquinonol B (4-AAQB) could potentiate the anticancer effect of 5-FU by eliciting the re-expression of miR-324 (ranked 1st in the prediction list) [37]. Similarly, in CRC, the restoration of miR-874 (ranked 6th in the prediction list) could inhibit proliferation, enhance apoptosis, and decrease the 5-FU resistance of cells [38]. Meanwhile, in oesophageal carcinoma cell lines, miR-455-3p (ranked 9th in the prediction list) was deregulated after treatment with 5-FU [39].
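The masking and ranking steps of this second kind of case study can be sketched in a few lines of Python. The sketch is illustrative only: the function names, the toy adjacency matrix, the miRNA labels and the score values are all hypothetical, and the scores stand in for whatever a trained predictor would return for the masked SM.

```python
def mask_sm(adjacency, sm_index):
    """Return a copy of the adjacency matrix with one SM's row set to zero."""
    masked = [row[:] for row in adjacency]
    masked[sm_index] = [0] * len(masked[sm_index])
    return masked


def top_k_mirnas(scores_row, mirna_names, k=10):
    """Rank miRNAs for one SM by descending predicted score."""
    order = sorted(range(len(scores_row)), key=lambda j: scores_row[j], reverse=True)
    return [mirna_names[j] for j in order[:k]]


# Hypothetical toy data: 2 SMs (rows) x 3 miRNAs (columns).
adjacency = [[1, 0, 1],
             [0, 1, 0]]
mirna_names = ["miR_x", "miR_y", "miR_z"]

masked = mask_sm(adjacency, sm_index=0)   # the SM in row 0 is treated as new
hypothetical_scores = [0.2, 0.1, 0.7]     # placeholder predictor output for row 0
print(masked[0])
print(top_k_mirnas(hypothetical_scores, mirna_names, k=2))
```

The removed associations of the masked SM can then be compared against the top-ranked miRNAs, which corresponds to the validation by known associations reported in the tables.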
Table 2.
The validation of the top 50 predicted miRNAs associated with 5-Fluorouracil. Here, all known associations with 5-Fluorouracil had been removed. The top 1–25 miRNAs are shown in the first column while the top 26–50 in the third. As a result, 7 and 24 out of the top 10 and 50 predicted associations were verified by experimental literatures. The evidence is documented in PubMed, which proves the association between 5-Fluorouracil and corresponding miRNAs.
| MiRNA | Evidence | MiRNA | Evidence |
|---|---|---|---|
| hsa-mir-324 | 30103475 | hsa-mir-412 | unconfirmed |
| hsa-mir-24-1 | 26198104 | hsa-mir-330 | 28521444 |
| hsa-mir-501 | 26198104 | hsa-mir-1226 | 26198104 |
| hsa-mir-500a | unconfirmed | hsa-mir-197 | 26198104 |
| hsa-mir-650 | unconfirmed | hsa-mir-181a-2 | 24462870 |
| hsa-mir-874 | 27221209 | hsa-mir-128-2 | 26198104 |
| hsa-mir-24-2 | 26198104 | hsa-mir-181b-1 | unconfirmed |
| hsa-mir-23a | 26198104 | hsa-mir-132 | 26198104 |
| hsa-mir-455 | 21743970 | hsa-mir-339 | unconfirmed |
| hsa-mir-409 | unconfirmed | hsa-mir-320a | 26198104 |
| hsa-mir-345 | unconfirmed | hsa-mir-342 | 26198104 |
| hsa-mir-372 | unconfirmed | hsa-mir-128-1 | 26198104 |
| hsa-mir-329-1 | unconfirmed | hsa-mir-181a-1 | unconfirmed |
| hsa-mir-299 | unconfirmed | hsa-mir-181b-2 | unconfirmed |
| hsa-mir-212 | unconfirmed | hsa-mir-194-1 | unconfirmed |
| hsa-mir-129-2 | unconfirmed | hsa-mir-518a-1 | unconfirmed |
| hsa-mir-346 | unconfirmed | hsa-mir-149 | 26198104 |
| hsa-mir-337 | unconfirmed | hsa-mir-373 | unconfirmed |
| hsa-mir-328 | unconfirmed | hsa-mir-186 | unconfirmed |
| hsa-mir-329-2 | unconfirmed | hsa-mir-154 | unconfirmed |
| hsa-mir-211 | 28720546 | hsa-mir-379 | unconfirmed |
| hsa-mir-202 | unconfirmed | hsa-mir-204 | 27095441 |
| hsa-mir-217 | unconfirmed | hsa-mir-133a-1 | 26198104 |
| hsa-mir-326 | 24119443 | hsa-mir-27a | 26198104 |
| hsa-mir-187 | 30103475 | hsa-mir-155 | 28347920 |
Taking 5-Aza-CdR as another example in the second kind of case study, we also validated its predicted associated miRNAs. Based on the association probability ranking calculated by SNMFSMMA, 8 and 22 of the top 10 and 50 predicted miRNAs associated with 5-Aza-CdR were verified, of which 7 and 17 were confirmed by the known associations we collected, and the rest were verified by literature published on PubMed (see Table 3). We briefly introduce the 5 miRNAs in the top 50 whose associations with 5-Aza-CdR were confirmed by the literature. The treatment of oesophageal squamous cell carcinoma (ESCC) tumours with 5-Aza-CdR increased the expression level of miR-203a (ranked 7th in the prediction list) [40]. In prostate cancer cells, 5-Aza-CdR could restore miR-143 (ranked 22nd in the prediction list) by reducing methylation of CpG in the miR-143 promoter [41]. In addition, the inhibition of dopamine D3 receptor (DRD3) induced by let-7d (ranked 32nd in the prediction list) could be abolished by 5-Aza-CdR [42]. In hilar cholangiocarcinoma, 5-Aza-CdR could induce the reactivation of miR-373 (ranked 36th in the prediction list) and lead to decreased enrichment of MBD2, which plays an important role in the development of the cancer [43]. Finally, in gastric cancer cells, miR-330-3p (ranked 48th in the prediction list) was increased after treatment with 5-Aza-CdR, which inhibited cancer progression [44]. The above results of the case studies supported the prediction performance of SNMFSMMA.
Table 3.
The validation of the top 50 predicted miRNAs associated with 5-Aza-2ʹ-deoxycytidine. Here, all known associations with 5-Aza-2ʹ-deoxycytidine had been removed. The top 1–25 miRNAs are shown in the first column while the top 26–50 in the third. As a result, 8 and 22 out of the top 10 and 50 predicted associations were verified by experimental literatures. The evidence is documented in PubMed, which proves the association between 5-Aza-2ʹ-deoxycytidine and corresponding miRNAs.
| MiRNA | Evidence | MiRNA | Evidence |
|---|---|---|---|
| hsa-mir-125b-2 | 26198104 | hsa-mir-129-2 | unconfirmed |
| hsa-mir-125b-1 | 26198104 | hsa-mir-328 | unconfirmed |
| hsa-mir-19a | 26198104 | hsa-mir-372 | unconfirmed |
| hsa-mir-17 | 26198104 | hsa-mir-16-1 | 26198104 |
| hsa-mir-20a | 26198104 | hsa-mir-181b-2 | unconfirmed |
| hsa-mir-18a | unconfirmed | hsa-mir-212 | unconfirmed |
| hsa-mir-203a | 26577858 | hsa-let-7d | 26802971 |
| hsa-mir-19b-1 | unconfirmed | hsa-mir-194-1 | unconfirmed |
| hsa-mir-145 | 26198104 | hsa-mir-202 | unconfirmed |
| hsa-mir-21 | 26198104 | hsa-mir-346 | unconfirmed |
| hsa-mir-181a-2 | 26198104 | hsa-mir-373 | 21165562 |
| hsa-mir-24-1 | 26198104 | hsa-mir-211 | unconfirmed |
| hsa-mir-181a-1 | 26198104 | hsa-mir-92a-1 | unconfirmed |
| hsa-mir-19b-2 | unconfirmed | hsa-mir-342 | unconfirmed |
| hsa-mir-324 | unconfirmed | hsa-mir-217 | unconfirmed |
| hsa-mir-155 | 26198104 | hsa-let-7a-1 | unconfirmed |
| hsa-mir-27b | 26198104 | hsa-mir-345 | unconfirmed |
| hsa-mir-26a-1 | unconfirmed | hsa-mir-412 | unconfirmed |
| hsa-mir-181b-1 | unconfirmed | hsa-mir-132 | unconfirmed |
| hsa-mir-27a | 26198104 | hsa-mir-24-2 | 26198104 |
| hsa-mir-329-1 | unconfirmed | hsa-mir-409 | unconfirmed |
| hsa-mir-143 | 28391715 | hsa-mir-148a | unconfirmed |
| hsa-mir-125a | 26198104 | hsa-mir-330 | 27904681 |
| hsa-mir-329-2 | unconfirmed | hsa-mir-128-2 | unconfirmed |
| hsa-mir-205 | 26198104 | hsa-mir-23a | unconfirmed |
The validation results of the case studies fully demonstrated the outstanding prediction performance of SNMFSMMA. Based on Dataset 1, we applied SNMFSMMA to calculate the association probabilities of all unknown samples and ranked them (See Supplementary Table 1), which would be beneficial for researchers to further study the relationship between SMs and miRNAs.
Discussion
There is increasing evidence that miRNAs play an extremely important role in human physiological processes, and that dysregulation of miRNAs might lead to human diseases. Targeting miRNAs with SMs is an effective way to treat diseases. Therefore, revealing the relationship between SMs and miRNAs helps provide new strategies for the treatment of diseases. However, the cost of using traditional biological experiments to reveal the associations between SMs and miRNAs is high. Therefore, in recent years, researchers have proposed many computational models in order to predict potential SM-miRNA associations more quickly and efficiently. In this paper, we proposed a novel model, SNMFSMMA, to reveal potential associations between miRNAs and SMs. As the performance evaluation results showed, based on Dataset 1, the AUCs of SNMFSMMA reached 0.9711, 0.9698 and 0.8329 in global LOOCV, miRNA-fixed local LOOCV and SM-fixed local LOOCV, respectively, while SNMFSMMA achieved AUCs of 0.8895, 0.8884, and 0.7651 in the corresponding LOOCVs based on Dataset 2. Moreover, in 5-fold cross-validation, the AUCs of SNMFSMMA attained 0.9644 ± 0.0035 and 0.8814 ± 0.0033 based on Dataset 1 and Dataset 2, respectively. The results of the first kind of case study indicated that 5, 13 and 27 of the top 10, 20 and 50 predicted associations were verified by experimental literature. In the second kind of case study, 7 and 24 of the top 10 and top 50 predicted miRNAs associated with 5-FU were validated, respectively, while 8 and 22 of the top 10 and 50 miRNAs associated with 5-Aza-CdR were verified. All of the above evaluation results supported the prediction performance of SNMFSMMA.
SNMFSMMA differs significantly from previous studies. Here, we analyse the differences between it and our recent computational model SLHGISMMA. We used matrix decomposition in both SLHGISMMA and SNMFSMMA, but there were significant differences between the two models. Firstly, in SLHGISMMA, we applied sparse learning to decompose the adjacency matrix in order to eliminate, to some extent, the noise in the initial association information, while in SNMFSMMA, we interpolated the integrated similarity matrices based on SymNMF. Secondly, in SLHGISMMA, we calculated the association probability of an SM-miRNA pair by analysing the paths connecting the corresponding SM node and miRNA node in the heterogeneous graph, which is a network-based algorithm. In SNMFSMMA, we predicted the potential associations based on the Kronecker regularized least squares (KronRLS) method, which is a machine learning-based algorithm. We built SNMFSMMA mainly for the following two reasons. On the one hand, the prediction performance of previous models needed to be further improved, and the prediction accuracy of SNMFSMMA was higher than that of previous models in our evaluations. On the other hand, SNMFSMMA had several advantages. Firstly, the datasets we used were extracted from reliable databases. Secondly, unlike other models that directly used the similarities of SMs and miRNAs, we interpolated the similarity matrices using SymNMF before applying them, which could eliminate the noise in the similarity matrices to some extent. Thirdly, in our model, we introduced the Kronecker product to capture the relationship between SM-miRNA pairs, and we calculated the association probabilities between SMs and miRNAs based on the similarity between SM-miRNA pairs, the new-integrated SM similarity and the new-integrated miRNA similarity, which helped improve the prediction accuracy of the model.
Finally, we introduced the spectral decomposition of the new-integrated matrices, which could significantly speed up the calculation. Certainly, SNMFSMMA still has some shortcomings that need to be addressed. For example, the problem of how best to measure the similarity of SMs and of miRNAs remains unresolved. In addition, calculating the scoring matrix involves the Kronecker product of two matrices, which often leads to memory problems. Moreover, the number of SM-miRNA pairs with known associations is too small, which seriously affects the prediction accuracy of the model. Finally, in the process of SymNMF, more similarity information about SMs and miRNAs could be introduced to improve the interpolation effect [45,46].
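To give a sense of scale for the memory problem mentioned above, the following back-of-the-envelope sketch estimates the size of a pair kernel stored explicitly as a dense matrix. The numbers of SMs and miRNAs are those reported for Dataset 1 and Dataset 2; the 8-byte (float64) storage and the helper name `dense_kernel_bytes` are assumptions made for illustration.

```python
def dense_kernel_bytes(n_sm: int, n_mirna: int, bytes_per_entry: int = 8) -> int:
    """Bytes needed to store a dense (pairs x pairs) kernel matrix."""
    n_pairs = n_sm * n_mirna                  # one row and one column per SM-miRNA pair
    return n_pairs * n_pairs * bytes_per_entry

# Dataset sizes as reported in the paper; float64 storage is an assumption.
datasets = {"Dataset 1": (831, 541), "Dataset 2": (39, 286)}
for name, (n_sm, n_mirna) in datasets.items():
    size = dense_kernel_bytes(n_sm, n_mirna)
    print(f"{name}: {n_sm * n_mirna} pairs, about {size / 1e9:.1f} GB")
```

Under these assumptions, Dataset 2 needs roughly 1 GB and Dataset 1 roughly 1.6 TB, which is why forming the Kronecker product explicitly is impractical for the larger dataset.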
Materials and methods
Small molecule-miRNA associations
In our study, we constructed two datasets, Dataset 1 and Dataset 2, which contained the same 664 known SM-miRNA associations. The 831 SMs included in Dataset 1 were collected from the databases PubChem [47], SM2miR v1.0 [48] and DrugBank [49], and the 541 miRNAs of Dataset 1 were obtained from the databases SM2miR v1.0 [48], PhenomiR [50], HMDD [51] and miR2Disease [52]. Therefore, many SMs (miRNAs) in Dataset 1 had no known associations. To form Dataset 2, we extracted only the 39 SMs and 286 miRNAs involved in the 664 known associations. Implementing predictions on Dataset 1 and Dataset 2 separately thus allowed the prediction performance of SNMFSMMA to be evaluated on different datasets. Then, we defined the SM-miRNA adjacency matrix, whose two dimensions equalled the numbers of SMs and miRNAs in Dataset 1 (Dataset 2), respectively (the variables were defined consistently for Dataset 1 and 2). If there was a known association between an SM and a miRNA, the corresponding entry of the adjacency matrix was 1; otherwise it was 0.
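As an illustration of the adjacency matrix just described, the following minimal sketch builds a 0/1 matrix from a list of known pairs. The function name `build_adjacency`, the toy identifiers, and the choice of SMs as rows and miRNAs as columns are assumptions made for this example; they are not taken from the authors' implementation.

```python
import numpy as np

def build_adjacency(sm_ids, mirna_ids, known_pairs):
    """Return a 0/1 matrix with one row per SM and one column per miRNA."""
    sm_index = {s: i for i, s in enumerate(sm_ids)}
    mi_index = {m: j for j, m in enumerate(mirna_ids)}
    A = np.zeros((len(sm_ids), len(mirna_ids)), dtype=int)
    for sm, mirna in known_pairs:
        A[sm_index[sm], mi_index[mirna]] = 1   # known association
    return A

# Toy identifiers, made up for illustration only.
A = build_adjacency(
    ["sm_a", "sm_b"],
    ["mir_x", "mir_y", "mir_z"],
    [("sm_a", "mir_y"), ("sm_b", "mir_x")],
)
print(A)
```

For Dataset 1 the same construction would give an 831 × 541 matrix containing 664 ones, and for Dataset 2 a 39 × 286 matrix containing the same 664 ones.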
Integrated SM similarity
There was evidence that SM similarity obtained by integrating different biological datasets could help models predict the associations between SMs and miRNAs more effectively [53–57]. Inspired by a previous study [25], we calculated four types of SM similarity: indication phenotype-based similarity [53], chemical structure similarity [58], gene functional consistency-based similarity [59] and side effects similarity [53]. To calculate the indication phenotype-based similarity matrix, a set was defined for each SM whose elements were the diseases associated with that SM. The indication phenotype-based similarity between two SMs was then obtained as the Jaccard score of their two disease sets (i.e. the size of the intersection of the two sets divided by the size of their union). As for the chemical structure similarity, labelled graphs were first constructed from the collected chemical structures of the SMs. The similarity computation was then translated into a graph comparison problem, which could be solved by SIMCOMP by searching for the largest common subgraph isomorphism in the association graph (the graph product of the two graphs) [58]. The gene functional consistency-based similarities of SMs were defined on the assumption that the greater the functional consistency of the target genes of two SMs, the more similar the SMs were. We calculated these similarities by quantifying the functional associations between SM-related genes using the Gene Set Functional Similarity (GSFS) method [59]. As for the side effects similarity, similar to the calculation of the indication phenotype-based similarity, a set was constructed from the side effects of each SM, and the Jaccard score of the side-effect sets of two SMs gave the side effects similarity between them. To reduce bias when integrating the similarities, we employed a weighted combination of the four similarities to obtain the integrated similarity of SMs. The specific combination was as follows:
| (1) |
Here, we set every weight to 1, which meant that all four similarities took the same weight in the integrated SM similarity.
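The two ingredients described above, a Jaccard score between sets and a weighted combination of similarity matrices, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the toy disease sets, the identity matrix standing in for a second similarity, and the convention of scoring two empty sets as 0 are all hypothetical. The combination is written as a weighted average, which with all weights equal to 1 reduces to the plain mean.

```python
import numpy as np

def jaccard_similarity(sets):
    """Pairwise Jaccard score |X ∩ Y| / |X ∪ Y| for a list of Python sets."""
    n = len(sets)
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            union = sets[i] | sets[j]
            # Two empty sets are scored 0 here (an assumed convention).
            S[i, j] = len(sets[i] & sets[j]) / len(union) if union else 0.0
    return S

def integrate(similarities, weights=None):
    """Weighted average of equally sized similarity matrices (weights default to 1)."""
    if weights is None:
        weights = [1.0] * len(similarities)
    total = sum(w * S for w, S in zip(weights, similarities))
    return total / sum(weights)

# Toy data: disease sets for three SMs (made up for illustration).
disease_sets = [{"d1", "d2"}, {"d2", "d3"}, {"d4"}]
S_ind = jaccard_similarity(disease_sets)
S_other = np.eye(3)                      # placeholder for another similarity type
S_int = integrate([S_ind, S_other])
```

In this toy case the first two SMs share one of three distinct diseases, so their Jaccard score is 1/3, and averaging with the placeholder matrix gives an integrated similarity of 1/6.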
Integrated miRNA similarity
Inspired by the miRNA similarity calculation method of a previous study [25], we first calculated the functional consistency-based similarity for miRNAs, whose calculation process was similar to that of the gene functional consistency-based similarity of SMs. Secondly, the disease phenotype-based similarity for miRNAs was obtained in a similar way to the indication phenotype-based similarity of SMs. Finally, the integrated miRNA similarity matrix was defined by combining the functional consistency-based similarity and the disease phenotype-based similarity for miRNAs as follows:
| (2) |
Here, the values of the two weight parameters were again set to 1.
SNMFSMMA
Here, we presented a novel computational model, SNMFSMMA, to reveal potential SM-miRNA associations (motivated by previous studies [60,61]); the flow chart of the model is shown in Fig. 4. As shown in the figure, the calculation process of the model was mainly divided into two parts. Firstly, we applied SymNMF to perform matrix decomposition on the integrated SM similarity matrix and the integrated miRNA similarity matrix, and then recalculated the new-integrated SM and miRNA similarity matrices from the decomposed matrices. Secondly, based on the new-integrated similarity matrices, we implemented the Kronecker regularized least squares (KronRLS) method to obtain the scoring matrix, which had the same numbers of rows and columns as the adjacency matrix. Each element of the scoring matrix represented the association probability between the corresponding SM and miRNA. The detailed calculation process of the two steps was as follows:
Figure 4. Flowchart for predicting the potential associations of SM-miRNA pairs by SNMFSMMA, which is mainly divided into two steps: SymNMF and KronRLS.
SymNMF
As a semi-supervised learning method, non-negative matrix factorization (NMF) had gradually become one of the most popular multi-dimensional data processing tools in signal processing, biomedical engineering, pattern recognition, computer vision and image engineering [62–64]. In our algorithm, taking the integrated SM similarity as an example, we employed a special case of non-negative matrix factorization, SymNMF. The purpose of the matrix decomposition was to find a non-negative factor matrix that met the following condition:
| (3) |
The detailed steps were as follows. The first step was to construct an initial factor matrix whose elements were all positive and whose dimension was the same as that of the integrated similarity matrix. We indexed the factor matrix by iteration, and the error at the current iteration was calculated as follows:
| (4) |
The next step was to start the iteration. First, we needed to calculate the intermediate variable as follows:
| (5) |
where the division was the entrywise division of two matrices. Within the iteration, the updated factor matrix was held temporarily as a candidate, which was obtained by the following formula:
| (6) |
where the product was the entrywise product (also known as the Hadamard product) of two matrices. Here, the parameter in formula (6) should take a value close to but slightly less than 1, so we set it to 0.999, and the updated error at the current iteration was calculated as follows:
| (7) |
Then, we compared the value of and . If was smaller than , the iteration ended, otherwise and should be updated as follows:
| (8) |
| (9) |
At the end of this iteration, we assigned values to and as follows and returned to formula (4) to start the next iteration.
| (10) |
| (11) |
| (12) |
The new SM integrated similarity matrix could be calculated by the following formula:
| (13) |
where the index indicated the total number of iterations. The integrated miRNA similarity was decomposed in the same way as the integrated SM similarity, which yielded the new-integrated miRNA similarity matrix.
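The following is a minimal sketch of a SymNMF iteration of the kind described above, not the authors' exact procedure. It assumes a damped multiplicative update in which the factor matrix is multiplied entrywise by (1 − β + β·T), with T the entrywise ratio of the similarity matrix times the factor to the factor times its Gram matrix; this form is commonly used for symmetric NMF. The Frobenius-norm error, the step-halving rule used when a candidate does not reduce the error, the stopping tolerance, and all names (`symnmf`, `rank`, `tol`) are assumptions made for this example. Only the value 0.999 and the choice of a positive initial matrix with the same dimension as the similarity matrix come from the text.

```python
import numpy as np

def symnmf(S, rank=None, beta=0.999, max_iter=500, tol=1e-6, seed=0):
    """Approximate a symmetric non-negative matrix S by U @ U.T with U >= 0."""
    n = S.shape[0]
    rank = n if rank is None else rank            # default: same dimension as S
    rng = np.random.default_rng(seed)
    U = rng.uniform(0.1, 1.0, size=(n, rank))     # strictly positive start
    err = np.linalg.norm(S - U @ U.T)             # Frobenius-norm error
    for _ in range(max_iter):
        T = (S @ U) / (U @ U.T @ U + 1e-12)       # entrywise division
        step = beta
        while True:
            U_new = U * (1.0 - step + step * T)   # entrywise (Hadamard) product
            err_new = np.linalg.norm(S - U_new @ U_new.T)
            if err_new <= err or step < 1e-8:
                break
            step /= 2.0                           # assumed rule: damp the step and retry
        if err_new > err:                         # no improving step was found
            break
        converged = (err - err_new) < tol * max(err, 1e-12)
        U, err = U_new, err_new
        if converged:
            break
    return U, err

# Toy similarity matrix (made-up values) with an exact non-negative factorization.
W = np.array([[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]])
S_toy = W @ W.T
U, err = symnmf(S_toy)
S_new = U @ U.T                                   # recalculated similarity matrix
```

Because a candidate is accepted only when it does not increase the error, the error in this sketch never rises from one iteration to the next, and the recalculated matrix is symmetric and non-negative by construction.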
KronRLS
After obtaining the new-integrated similarity matrices by matrix decomposition, the second step was to calculate the association probabilities of SM-miRNA pairs with the KronRLS algorithm. First, we created one set for the SMs and one set for the miRNAs, whose sizes were the numbers of SMs and miRNAs involved in the dataset. Then, we built an n-dimensional column vector in which each element represented an SM-miRNA pair, where n was the total number of pairs (the number of SMs multiplied by the number of miRNAs). We further constructed a label vector of the same dimension: an element was 1 if the corresponding SM-miRNA pair had a known association, and 0 otherwise. Our ultimate goal was to find the mapping function from an SM-miRNA pair to the corresponding association probability by minimizing the following function:
| (14) |
where the regularization term was the norm of the mapping function in Hilbert space, and the regularization parameter weighed the regression accuracy against the complexity of the function in Hilbert space. Here we set the value of this parameter to 1 [65]. The representer theorem ensured that Equation (14) had a closed-form solution of the form below:
| (15) |
| (16) |
where the function value was the calculated association probability of the corresponding SM-miRNA pair, and the coefficient vector was a column vector with n elements, which could be obtained by solving the following linear system:
| (17) |
Here, the identity matrix had the same size as the kernel matrix, and the kernel was defined as a pairwise instance kernel representing the similarity between two data points in Hilbert space. Specifically, the similarity between SM-miRNA pairs was obtained from the integrated similarity of SMs and the integrated similarity of miRNAs. For example, for two SM-miRNA pairs, the similarity between them was calculated as follows:
| (18) |
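A common way to build such a pairwise kernel is to multiply the similarity of the two SMs by the similarity of the two miRNAs; collecting these values for all pairs then gives a Kronecker product. The sketch below illustrates that construction on toy data. The similarity values, the function name `pair_similarity` and the pair enumeration (pair of SM i and miRNA j at index i × number of miRNAs + j) are assumptions made for this example; with a different enumeration the two Kronecker factors swap places.

```python
import numpy as np

# Toy similarity matrices (made-up values): 2 SMs and 3 miRNAs.
S_sm = np.array([[1.0, 0.4],
                 [0.4, 1.0]])
S_mi = np.array([[1.0, 0.2, 0.5],
                 [0.2, 1.0, 0.3],
                 [0.5, 0.3, 1.0]])

def pair_similarity(i, j, p, q):
    """Similarity between pair (SM i, miRNA j) and pair (SM p, miRNA q)."""
    return S_sm[i, p] * S_mi[j, q]

# Matrix form: with pairs enumerated as i * n_mi + j, the pair kernel
# is the Kronecker product of the SM and miRNA similarity matrices.
n_mi = S_mi.shape[0]
K = np.kron(S_sm, S_mi)

i, j, p, q = 0, 1, 1, 2
print(K[i * n_mi + j, p * n_mi + q], pair_similarity(i, j, p, q))
```

The kernel has one row and one column per SM-miRNA pair, so even this toy example with 2 SMs and 3 miRNAs produces a 6 × 6 matrix.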
The matrix form of formula (18) could be expressed as follows:
| (19) |
where the operator denoted the Kronecker product of two matrices. Combining this with formulas (16) and (17) gave the following formula:
| (20) |
Here, the left-hand side was the column vector formed by stacking all the columns of the score matrix, so each of its elements represented the association probability of the corresponding SM-miRNA pair. Since the dimension of the pairwise kernel matrix was large, the calculation was cumbersome and time-consuming. To make the model more efficient, we introduced spectral decomposition of the matrices to reduce the computation time. The matrices were decomposed as follows:
| (21) |
| (22) |
| (23) |
| (24) |
where:
| (25) |
| (26) |
Here, each eigenvector matrix had the same dimension as the matrix being decomposed, and each of its columns was an eigenvector of that matrix. Each eigenvalue matrix was diagonal, and each of its diagonal elements was the eigenvalue corresponding to one eigenvector of the decomposed matrix, that is, to the matching column of the eigenvector matrix. Moreover, one of the proven properties of the Kronecker product was as follows:
| (27) |
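One standard Kronecker identity that serves this purpose states that multiplying a Kronecker product by a column-stacked matrix equals column-stacking a product of three small matrices: (A ⊗ B) vec(X) = vec(B X Aᵀ). Whether this is written in exactly the same form as formula (27) is an assumption; the sketch below only checks the identity numerically on random toy matrices, with `vec` a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 3))
B = rng.random((4, 4))
X = rng.random((4, 3))          # shape: (columns of B, columns of A)

def vec(M):
    """Stack the columns of M into a single vector (column-major order)."""
    return M.flatten(order="F")

lhs = np.kron(A, B) @ vec(X)    # needs the full 12 x 12 Kronecker product
rhs = vec(B @ X @ A.T)          # needs only the small matrices
print(np.allclose(lhs, rhs))
```

The right-hand side never forms the Kronecker product, which is what makes the identity useful when the number of SM-miRNA pairs is large.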
According to the above formula, we could calculate the score matrix as follows:
| (28) |
where:
| (29) |
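The overall computation can be sketched as follows, comparing a direct solve on the full pair kernel with a version that uses only the eigendecompositions of the two similarity matrices. This is a minimal illustration under stated assumptions rather than the authors' code: the function names, the toy data, the column-stacking convention (which pairs with the Kronecker product taken as miRNA similarity ⊗ SM similarity), and the use of symmetric positive semidefinite similarity matrices are all choices made for this example. The regularization parameter defaults to 1, as in the text.

```python
import numpy as np

def kronrls_naive(S_sm, S_mi, Y, lam=1.0):
    """Solve (K + lam*I) a = vec(Y) on the explicit pair kernel, return scores."""
    n_s, n_m = Y.shape
    K = np.kron(S_mi, S_sm)                       # (n_s*n_m) x (n_s*n_m) pair kernel
    y = Y.flatten(order="F")                      # stack the columns of Y
    a = np.linalg.solve(K + lam * np.eye(n_s * n_m), y)
    return (K @ a).reshape((n_s, n_m), order="F")

def kronrls_eig(S_sm, S_mi, Y, lam=1.0):
    """Same scores, using only the eigendecompositions of the small matrices."""
    l_s, V_s = np.linalg.eigh(S_sm)               # S_sm = V_s diag(l_s) V_s^T
    l_m, V_m = np.linalg.eigh(S_mi)               # S_mi = V_m diag(l_m) V_m^T
    L = np.outer(l_s, l_m)                        # eigenvalues of the pair kernel
    Z = (L / (L + lam)) * (V_s.T @ Y @ V_m)       # entrywise shrinkage
    return V_s @ Z @ V_m.T

# Toy data (made up): 4 SMs, 5 miRNAs, random 0/1 associations.
rng = np.random.default_rng(0)
U_s, U_m = rng.random((4, 4)), rng.random((5, 5))
S_sm, S_mi = U_s @ U_s.T, U_m @ U_m.T             # symmetric PSD similarities
Y = (rng.random((4, 5)) < 0.3).astype(float)
F_naive = kronrls_naive(S_sm, S_mi, Y)
F_fast = kronrls_eig(S_sm, S_mi, Y)
print(np.allclose(F_naive, F_fast))
```

The eigendecomposition route works with matrices no larger than the similarity matrices themselves, whereas the direct route must store and factorize a matrix whose side length is the number of SM-miRNA pairs.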
Funding Statement
YZ was supported by the Fundamental Research Funds for the Central Universities (2019BSCX13).
Disclosure statement
No potential conflict of interest was reported by the authors.
Supplementary material
Supplemental data for this article can be accessed here.
References
- [1] Lema C, Cunningham MJ. MicroRNAs and their implications in toxicological research. Toxicol Lett. 2010;198(2):100–105. Epub 2010/07/06. PubMed PMID: 20599482.
- [2] Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell. 2004;116(2):281–297.
- [3] Matsui M, Corey DR. Non-coding RNAs as drug targets. Nat Rev Drug Discov. 2017;16(3):167–179. Epub 2016/11/04. PubMed PMID: 27444227; PubMed Central PMCID: PMC5831170.
- [4] Chen X, Xie D, Zhao Q, et al. MicroRNAs and complex diseases: from experimental results to computational models. Brief Bioinform. 2019;20(2):515–539. Epub 2017/10/19. PubMed PMID: 29045685.
- [5] Lu J, Getz G, Miska EA, et al. MicroRNA expression profiles classify human cancers. Nature. 2005;435(7043):834–838.
- [6] Bartel DP. MicroRNAs: target recognition and regulatory functions. Cell. 2009;136(2):215–233. Epub 2009/01/27. PubMed PMID: 19167326; PubMed Central PMCID: PMC3794896.
- [7] Alvarez-Garcia I, Miska EA. MicroRNA functions in animal development and human disease. Development. 2005;132(21):4653–4662. Epub 2005/10/15. PubMed PMID: 16224045.
- [8] Chen X, Huang L. LRSSLMDA: laplacian regularized sparse subspace learning for MiRNA-disease association prediction. PLoS Comput Biol. 2017;13(12):e1005912. Epub 2017/12/19. PubMed PMID: 29253885; PubMed Central PMCID: PMC5749861.
- [9] Chen X, Xie D, Wang L, et al. BNPMDA: bipartite network projection for MiRNA-disease association prediction. Bioinformatics. 2018;34(18):3178–3186. Epub 2018/04/28. PubMed PMID: 29701758.
- [10] Chow TF, Youssef YM, Lianidou E, et al. Differential expression profiling of microRNAs and their potential involvement in renal cell carcinoma pathogenesis. Clin Biochem. 2010;43(1–2):150–158. Epub 2009/08/04. PubMed PMID: 19646430.
- [11] Guo C, Sah JF, Beard L, et al. The noncoding RNA, miR-126, suppresses the growth of neoplastic cells by targeting phosphatidylinositol 3-kinase signaling and is frequently lost in colon cancers. Genes Chromosomes Cancer. 2008;47(11):939–946. Epub 2008/07/30. PubMed PMID: 18663744; PubMed Central PMCID: PMC2739997.
- [12] Shi B, Sepp-Lorenzino L, Prisco M, et al. Micro RNA 145 targets the insulin receptor substrate-1 and inhibits the growth of colon cancer cells. J Biol Chem. 2007;282(45):32582–32590. Epub 2007/09/11. PubMed PMID: 17827156.
- [13] Mruk K, Chen JK. Thinking big with small molecules. J Cell Biol. 2015;209(1):7–9. Epub 2015/04/15. PubMed PMID: 25869661; PubMed Central PMCID: PMC4395478.
- [14] Chen X, Sun YZ, Zhang DH, et al. NRDTD: a database for clinically or experimentally supported non-coding RNAs and drug targets associations. Database. 2017;2017. Epub 2017/12/09. PubMed PMID: 29220444; PubMed Central PMCID: PMC5527270. DOI: 10.1093/database/bax057.
- [15] Donlic A, Hargrove AE. Targeting RNA in mammalian systems with small molecules. Wiley Interdiscip Rev RNA. 2018;9(4):e1477. Epub 2018/05/05. PubMed PMID: 29726113; PubMed Central PMCID: PMC6002909.
- [16] Matsui M, Corey DR. Non-coding RNAs as drug targets. Nat Rev Drug Discov. 2016;16(3):167.
- [17] Chen X, Guan NN, Sun YZ, et al. MicroRNA-small molecule association identification: from experimental results to computational models. Brief Bioinform. 2018. Epub 2018/10/17. PubMed PMID: 30325405. DOI: 10.1093/bib/bby098.
- [18] Di Giorgio A, Tran TP, Duca M. Small-molecule approaches toward the targeting of oncogenic miRNAs: roadmap for the discovery of RNA modulators. Future Med Chem. 2016;8(7):803–816. Epub 2016/05/06. PubMed PMID: 27149207.
- [19] Gumireddy K, Young DD, Xiong X, et al. Small-molecule inhibitors of microrna miR-21 function. Angew Chem (Int Ed in English). 2008;47(39):7482–7484. Epub 2008/08/21. PubMed PMID: 18712719; PubMed Central PMCID: PMC3428715.
- [20] Disney MD, Winkelsas AM, Velagapudi SP, et al. Inforna 2.0: a platform for the sequence-based design of small molecules targeting structured RNAs. ACS Chem Biol. 2016;11(6):1720–1728. Epub 2016/04/21. PubMed PMID: 27097021; PubMed Central PMCID: PMC4912454.
- [21] Vo DD, Staedel C, Zehnacker L, et al. Targeting the production of oncogenic microRNAs with multimodal synthetic small molecules. ACS Chem Biol. 2014;9(3):711–721. Epub 2013/12/24. PubMed PMID: 24359019.
- [22] Haga CL, Velagapudi SP, Childs-Disney JL, et al. Rapid generation of miRNA inhibitor leads by bioinformatics and efficient high-throughput screening methods. Methods Mol Biol. 2017;1517:179–198. Epub 2016/12/08. PubMed PMID: 27924483.
- [23] Fujita Y, Yagishita S, Hagiwara K, et al. The clinical relevance of the miR-197/CKS1B/STAT3-mediated PD-L1 network in chemoresistant non-small-cell lung cancer. Mol Ther. 2015;23(4):717–727. Epub 2015/01/20. PubMed PMID: 25597412; PubMed Central PMCID: PMC4395779.
- [24] Zhao Y, Ma K, Yang S, et al. MicroRNA-125a-5p enhances the sensitivity of esophageal squamous cell carcinoma cells to cisplatin by suppressing the activation of the STAT3 signaling pathway. Int J Oncol. 2018;53(2):644–658. Epub 2018/05/17. PubMed PMID: 29767234; PubMed Central PMCID: PMC6017156.
- [25] Lv Y, Wang S, Meng F, et al. Identifying novel associations between small molecules and miRNAs based on integrated molecular networks. Bioinformatics. 2015;31(22):3638–3644. Epub 2015/07/23. PubMed PMID: 26198104.
- [26] Wang J, Meng F, Dai E, et al. Identification of associations between small molecule drugs and miRNAs based on functional similarity. Oncotarget. 2016;7(25):38658–38669. Epub 2016/10/23. PubMed PMID: 27232942; PubMed Central PMCID: PMC5122418.
- [27] Li J, Lei K, Wu Z, et al. Network-based identification of microRNAs as potential pharmacogenomic biomarkers for anticancer drugs. Oncotarget. 2016;7(29):45584–45596. Epub 2016/06/23. PubMed PMID: 27329603; PubMed Central PMCID: PMC5216744.
- [28] Jiang W, Chen X, Liao M, et al. Identification of links between small molecules and miRNAs in human cancers based on transcriptional responses. Sci Rep. 2012;2:282. Epub 2012/02/23. PubMed PMID: 22355792; PubMed Central PMCID: PMC3282946.
- [29] Meng F, Dai E, Yu X, et al. Constructing and characterizing a bioactive small molecule and microRNA association network for Alzheimer’s disease. J R Soc Interface. 2014;11(92):20131057. Epub 2013/12/20. PubMed PMID: 24352679; PubMed Central PMCID: PMC3899875.
- [30] Guan NN, Sun YZ, Ming Z, et al. Prediction of potential small molecule-associated MicroRNAs using graphlet interaction. Front Pharmacol. 2018;9:1152. Epub 2018/10/31. PubMed PMID: 30374302; PubMed Central PMCID: PMC6196296.
- [31] Yin J, Chen X, Wang CC, et al. Prediction of small molecule-MicroRNA associations by sparse learning and heterogeneous graph inference. Mol Pharm. 2019;16(7):3157–3166. Epub 2019/05/29. PubMed PMID: 31136190.
- [32] Liu Y, Cai Q, Bao PP, et al. Tumor tissue microRNA expression in association with triple-negative breast cancer outcomes. Breast Cancer Res Treat. 2015;152(1):183–191. Epub 2015/06/13. PubMed PMID: 26062749; PubMed Central PMCID: PMC4484742.
- [33] Yang X, Yin J, Yu J, et al. miRNA-195 sensitizes human hepatocellular carcinoma cells to 5-FU by targeting BCL-w. Oncol Rep. 2012;27(1):250–257. Epub 2011/09/29. PubMed PMID: 21947305.
- [34] Wang T, Huang B, Guo R, et al. A let-7b binding site SNP in the 3ʹ-UTR of the Bcl-xL gene enhances resistance to 5-fluorouracil and doxorubicin in breast cancer cells. Oncol Lett. 2015;9(4):1907–1911. Epub 2015/03/20. PubMed PMID: 25789066; PubMed Central PMCID: PMC4356428.
- [35] Akao Y, Khoo F, Kumazaki M, et al. Extracellular disposal of tumor-suppressor miRs-145 and −34a via microvesicles and 5-FU resistance of human colon cancer cells. Int J Mol Sci. 2014;15(1):1392–1401. Epub 2014/01/23. PubMed PMID: 24447928; PubMed Central PMCID: PMC3907875.
- [36] Borralho PM, Kren BT, Castro RE, et al. MicroRNA-143 reduces viability and increases sensitivity to 5-fluorouracil in HCT116 human colorectal cancer cells. FEBS J. 2009;276(22):6689–6700. Epub 2009/10/22. PubMed PMID: 19843160.
- [37] Bamodu OA, Yang CK, Cheng WH, et al. 4-acetyl-antroquinonol B suppresses SOD2-enhanced cancer stem cell-like phenotypes and chemoresistance of colorectal cancer cells by inducing hsa-miR-324 re-expression. Cancers (Basel). 2018;10(8). Epub 2018/08/15. PubMed PMID: 30103475; PubMed Central PMCID: PMC6116152. DOI: 10.3390/cancers10080269.
- [38] Han J, Liu Z, Wang N, et al. MicroRNA-874 inhibits growth, induces apoptosis and reverses chemoresistance in colorectal cancer by targeting X-linked inhibitor of apoptosis protein. Oncol Rep. 2016;36(1):542–550. Epub 2016/05/26. PubMed PMID: 27221209.
- [39] Hummel R, Wang T, Watson DI, et al. Chemotherapy-induced modification of microRNA expression in esophageal cancer. Oncol Rep. 2011;26(4):1011–1017. Epub 2011/07/12. PubMed PMID: 21743970.
- [40] Liu Y, Dong Z, Liang J, et al. Methylation-mediated repression of potential tumor suppressor miR-203a and miR-203b contributes to esophageal squamous cell carcinoma development. Tumour Biol. 2016;37(4):5621–5632. Epub 2015/11/19. PubMed PMID: 26577858.
- [41] Liu J, Li M, Wang Y, et al. Curcumin sensitizes prostate cancer cells to radiation partly via epigenetic activation of miR-143 and miR-143 mediated autophagy inhibition. J Drug Target. 2017;25(7):645–652. Epub 2017/04/11. PubMed PMID: 28391715.
- [42] Zhang Y, Cheng C, He D, et al. Transcriptional gene silencing of dopamine D3 receptor caused by let-7d mimics in immortalized renal proximal tubule cells of rats. Gene. 2016;580(2):89–95.
- [43] Chen Y, Gao W, Luo J, et al. Methyl-CpG binding protein MBD2 is implicated in methylation-mediated suppression of miR-373 in hilar cholangiocarcinoma. Oncol Rep. 2011;25(2):443–451. Epub 2010/12/18. PubMed PMID: 21165562.
- [44] Guan A, Wang H, Li X, et al. MiR-330-3p inhibits gastric cancer progression through targeting MSI1. Am J Transl Res. 2016;8(11):4802–4811. Epub 2016/12/03. PubMed PMID: 27904681; PubMed Central PMCID: PMC5126323.
- [45] Chen X, Wang L, Qu J, et al. Predicting miRNA-disease association based on inductive matrix completion. Bioinformatics. 2018;34(24):4256–4265. Epub 2018/06/26. PubMed PMID: 29939227.
- [46] Chen X, Zhu CC, Yin J. Ensemble of decision tree reveals potential miRNA-disease associations. PLoS Comput Biol. 2019;15(7):e1007209. Epub 2019/07/23. PubMed PMID: 31329575; PubMed Central PMCID: PMC6675125.
- [47] Wang Y, Xiao J, Suzek TO, et al. PubChem: a public information system for analyzing bioactivities of small molecules. Nucleic Acids Res. 2009;37(Web Server issue):W623–33. Epub 2009/06/06. PubMed PMID: 19498078; PubMed Central PMCID: PMC2703903.
- [48] Liu X, Wang S, Meng F, et al. SM2miR: a database of the experimentally validated small molecules’ effects on microRNA expression. Bioinformatics. 2013;29(3):409–411. Epub 2012/12/12. PubMed PMID: 23220571.
- [49] Knox C, Law V, Jewison T, et al. DrugBank 3.0: a comprehensive resource for ‘omics’ research on drugs. Nucleic Acids Res. 2011;39(Database issue):D1035–41. Epub 2010/11/10. PubMed PMID: 21059682; PubMed Central PMCID: PMC3013709.
- [50] Ruepp A, Kowarsch A, Schmidl D, et al. PhenomiR: a knowledgebase for microRNA expression in diseases and biological processes. Genome Biol. 2010;11(1):R6. Epub 2010/01/22. PubMed PMID: 20089154; PubMed Central PMCID: PMC2847718.
- [51] Lu M, Zhang Q, Deng M, et al. An analysis of human microRNA and disease associations. PLoS One. 2008;3(10):e3420. Epub 2008/10/17. PubMed PMID: 18923704; PubMed Central PMCID: PMC2559869.
- [52] Jiang Q, Wang Y, Hao Y, et al. miR2Disease: a manually curated database for microRNA deregulation in human disease. Nucleic Acids Res. 2009;37(Database issue):D98–104. Epub 2008/10/18. PubMed PMID: 18927107; PubMed Central PMCID: PMC2686559.
- [53] Gottlieb A, Stein GY, Ruppin E, et al. PREDICT: a method for inferring novel drug indications with application to personalized medicine. Mol Syst Biol. 2011;7:496. Epub 2011/06/10. PubMed PMID: 21654673; PubMed Central PMCID: PMC3159979.
- [54] Chen X, Liu MX, Cui QH, et al. Prediction of disease-related interactions between microRNAs and environmental factors based on a semi-supervised classifier. PLoS One. 2012;7(8):e43425. Epub 2012/09/01. PubMed PMID: 22937049; PubMed Central PMCID: PMC3427386.
- [55] Takarabe M, Kotera M, Nishimura Y, et al. Drug target prediction using adverse event report systems: a pharmacogenomic approach. Bioinformatics. 2012;28(18):i611–i8. Epub 2012/09/11. PubMed PMID: 22962489; PubMed Central PMCID: PMC3436840.
- [56] Chen L, Lu J, Luo X, et al. Prediction of drug target groups based on chemical-chemical similarities and chemical-chemical/protein connections. Biochim Biophys Acta. 2014;1844(1 Pt B):207–213. Epub 2013/06/05. PubMed PMID: 23732562.
- [57] Chen X, Yin J, Qu J, et al. MDHGI: matrix decomposition and heterogeneous graph inference for miRNA-disease association prediction. PLoS Comput Biol. 2018;14(8):e1006418. Epub 2018/08/25. PubMed PMID: 30142158; PubMed Central PMCID: PMC6126877.
- [58] Hattori M, Okuno Y, Goto S, et al. Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways. J Am Chem Soc. 2003;125(39):11853–11865. Epub 2003/09/25. PubMed PMID: 14505407.
- [59] Lv S, Li Y, Wang Q, et al. A novel method to quantify gene set functional association based on gene ontology. J R Soc Interface. 2012;9(70):1063–1072. Epub 2011/10/15. PubMed PMID: 21998111; PubMed Central PMCID: PMC3306647.
- [60] Chen H, Jing L, Chen H, et al., editors. A flexible and robust multi-source learning algorithm for drug repositioning. Boston (MA): ACM International Conference on Bioinformatics; 2017.
- [61] Zhao Y, Chen X, Yin J. A novel computational method for the identification of potential miRNA-Disease association based on symmetric non-negative matrix factorization and Kronecker regularized least square. Front Genet. 2018;9:324. Epub 2018/09/07. PubMed PMID: 30186308; PubMed Central PMCID: PMC6111239.
- [62] He Z, Xie S, Zdunek R, et al. Symmetric nonnegative matrix factorization: algorithms and applications to probabilistic clustering. IEEE Trans Neural Networks. 2011;22(12):2117–2131. Epub 2011/11/02. PubMed PMID: 22042156.
- [63] Xiao Q, Luo J, Liang C, et al. A graph regularized non-negative matrix factorization method for identifying microRNA-disease associations. Bioinformatics. 2017. Epub 2017/10/03. PubMed PMID: 28968779. DOI: 10.1093/bioinformatics/btx545.
- [64] Wang S, Xia P, Zhang L, et al. Systematical identification of breast cancer-related circular RNA modules for deciphering circRNA functions based on the non-negative matrix factorization algorithm. Int J Mol Sci. 2019;20(4). Epub 2019/02/23. PubMed PMID: 30791568; PubMed Central PMCID: PMC6412941. DOI: 10.3390/ijms20040919.
- [65] Pahikkala T, Airola A, Pietila S, et al. Toward more realistic drug-target interaction predictions. Brief Bioinform. 2015;16(2):325–337. Epub 2014/04/12. PubMed PMID: 24723570; PubMed Central PMCID: PMC4364066.
