Abstract
microRNAs (miRNAs) are short noncoding RNAs that can repress the expression of protein-coding messenger RNAs (mRNAs) by binding to the 3’UTR of the target. Genetic mutations such as single nucleotide variants (SNVs) in the 3’UTR of an mRNA can disrupt miRNA regulation. In this study, we present dbMTS, a database of miRNA target site SNVs and their functional annotations. This database can help researchers identify putative SNVs that affect miRNA targeting and prioritize them by functional importance. dbMTS is freely available for academic use at http://database.liulab.science/dbMTS as a web service or as a downloadable database attached to dbNSFP.
Keywords: variant, microRNA target, 3’ untranslated region, functional prediction, database
Introduction
MicroRNAs (miRNAs) are short noncoding RNAs (~22 nucleotides) that can repress the expression of target messenger RNAs (mRNAs) by binding to their 3’ untranslated regions (UTRs). It is estimated that more than 60% of human protein-coding genes are regulated by miRNAs (Friedman, Farh, Burge, & Bartel, 2009; Lewis, Burge, & Bartel, 2005), and there is increasing evidence of their wide variety of functions in developmental and physiological processes (Bartel, 2018). miRNAs typically exert their repressive effect through imperfect binding with their target mRNAs; however, perfect base pairing at the 5’ end of the miRNA (nucleotide positions 2–7, also known as the seed region) is often required for a miRNA target site (MTS) to be functional (Agarwal, Bell, Nam, & Bartel, 2015). Thus, single nucleotide variants (SNVs) located within MTS, especially those in the part of the MTS that pairs with the miRNA seed region, can disrupt the efficacy of miRNA targeting. This type of regulatory variant can then lead to downstream transcriptomic and proteomic changes, which have been reported to be associated with various diseases (Li et al., 2018; Nicoloso et al., 2010). However, as with many other regulatory SNVs, the functional importance of most MTS SNVs is still poorly understood.
To better understand and interpret these regulatory SNVs in MTS, several databases have been established that link these SNVs with miRNA targetome alterations and diseases (Bhattacharya, Ziebarth, & Cui, 2014; Bruno et al., 2012; Gong et al., 2015; C. Liu et al., 2012). While some of these databases lack recent updates, two widely used and actively updated ones are PolymiRTS Database 3.0 (Bhattacharya et al., 2014) and miRNASNP v2.0 (Gong et al., 2015). Both used variants from dbSNP build 137 (Sherry et al., 2001) and tried to link these variants with MTS and/or possible downstream phenotype information. Although these databases are valuable for interpreting association results, they suffer from some limitations. First, only known variants from dbSNP were included. With the fast development of whole-genome sequencing (WGS), there is a growing need to predict the functional effect of novel SNVs, so focusing only on known variants from dbSNP is insufficient. Second, their information is not comprehensive and omits the vast majority of recently developed functional annotations that can help interpret these MTS SNVs, e.g., CADD (Kircher, 2014) and Eigen (Ionita-Laza, McCallum, Xu, & Buxbaum, 2016). Such annotations draw on a wide range of functional genomic information, such as conservation, experimentally identified functional elements, and their consequences, and they have been shown to be predictive of the functional consequences of a potential SNV. Without such information, the usefulness of currently available databases for prioritizing and filtering variants in association analyses, especially analyses with a large number of candidate SNVs, is limited. Lastly, none of these databases included tissue-specific information. It is well established that miRNAs have differential expression levels across tissues (Ludwig et al., 2016). Thus, the function of a miRNA and of its MTS SNVs depends strongly on the tissue context being considered.
To bridge these gaps, we have established a comprehensive database of all putative SNVs that might influence miRNA targeting. We first compiled a collection of all possible SNVs in the 3’UTRs of mRNAs that may disrupt an existing MTS or create a new one, based on predictions from three popular miRNA target prediction tools, namely TargetScan (Agarwal et al., 2015; http://www.targetscan.org/vert_70/, v7.0), miRanda (John et al., 2004; http://www.microrna.org/microrna/getDownloads.do, aug2010) and RNAhybrid (Rehmsmeier, Steffen, Höchsmann, Giegerich, & Ho, 2004; https://bibiserv.cebitec.uni-bielefeld.de/download/tools/rnahybrid.html, 2.1.1). At the same time, we calculated miRNA-specific scores for all identified SNVs using these three tools. We next collected the corresponding prediction scores from multiple popular SNV functional annotation tools, such as CADD, DANN (Quang, Chen, & Xie, 2015), FATHMM-MKL (Shihab et al., 2015) and Eigen. Lastly, The Cancer Genome Atlas (TCGA) data were processed to obtain miRNA-mRNA expression correlations in multiple tissues for both normal and tumor samples. We named our database dbMTS (database of microRNA Target Site SNVs). To our knowledge, it is the first database that aims to include all putative SNVs in human 3’UTRs that may impact miRNA targeting, along with their functional annotations. This database can help researchers quickly identify putative SNVs that may impact miRNA targeting and facilitate the genome-wide prioritization of functionally important SNVs in putative MTS. Users can access dbMTS either as a web service at http://database.liulab.science/dbMTS, which is suitable for small-scale queries, or as a downloadable database attached to dbNSFP (Liu et al., 2011; Liu et al., 2016) at https://sites.google.com/site/jpopgen/dbNSFP, which is suitable for large-scale queries and filtering.
Construction and content
TargetScan v7.0, RNAhybrid, and miRanda were used to predict putative miRNA targets and to evaluate the effect of different SNVs on miRNA targeting. Briefly, these algorithms identify favorable miRNA binding sites by providing a numeric estimation of the likelihood and the binding efficacy for a specific miRNA-target pairing site. miRanda focuses more on the complementarity between the miRNA and the binding site. RNAhybrid focuses more on the minimum free energy hybridization between the miRNA and its target 3’UTR sequence. TargetScan adopts more comprehensive information from various aspects of the binding site: conservation of the target, context information such as the position of the site, and seed region complementarity. We chose these three algorithms for two reasons. First, they adopted different target prediction and scoring schemes, which enabled us to capture different aspects of miRNA targeting. Second, their executables were freely available online so that we could make batch predictions locally.
The 3’-UTR coordinates and sequences were downloaded using the Table Browser utility of the UCSC Genome Browser. The GENCODE (Harrow et al., 2012) gene annotation V23 basic set for genome assembly hg38 was retrieved, which included 3’-UTRs for 73,196 transcripts. The miRNA sequence file for all species was downloaded from miRBase V21 at http://www.mirbase.org/ftp.shtml. Only mature human miRNAs were kept, which resulted in 2,588 mature miRNA sequences.
As the initial step in building dbMTS, our goal was to identify MTS SNVs and estimate their effect on miRNA targeting. To minimize the computational burden while capturing the most impactful SNVs, we focused on SNVs at positions that pair with the miRNA seed region, where a single mutation can disrupt miRNA regulation. Our first step was to run the three miRNA target prediction algorithms with the 2,588 mature human miRNAs and the 3’UTR transcripts to obtain the reference miRNA targeting information in humans (reference scores). Then, to estimate the effect of SNVs on the reference miRNA targetome, we mutated each nucleotide of every 3’UTR transcript one by one and ran the three algorithms again on these variant-induced 3’UTRs (variant-induced scores; see Supp. Figure S1 for details). Next, we categorized all SNVs into three groups based on their estimated effect (Figure 1): 1) a SNV was classified as substitution when regulating miRNAs had seed regions overlapping this locus with both the reference and the variant-induced 3’UTR sequence; 2) a SNV was classified as target loss when regulating miRNAs overlapped this locus with the reference 3’UTR sequence but not with the variant-induced 3’UTR sequence; 3) a SNV was classified as target gain when regulating miRNAs overlapped this locus with the variant-induced 3’UTR sequence but not with the reference 3’UTR sequence. For each SNV, the maximum difference between the reference score and the variant-induced score was calculated to estimate how the miRNA targeting efficacy changed after introducing the variant (Figure 1). Currently, there is no clear indication of which of these three types of MTS SNVs is functionally more critical. Thus, for each miRNA target prediction algorithm, we calculated rank scores within each SNV type, so that differences in the scale of the raw scores between the three types do not affect the ranking.
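The three-way categorization and the within-category rank scores can be illustrated with a minimal Python sketch. It is not the pipeline used to build dbMTS: the function names, the "none" label, and the rank score definition (the fraction of scores in the same category that are less than or equal to a given score) are assumptions made for illustration.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Set


def classify_snv(ref_mirnas: Set[str], alt_mirnas: Set[str]) -> str:
    """Categorize one SNV for one prediction tool.

    ref_mirnas / alt_mirnas: miRNAs whose predicted seed sites overlap the
    locus on the reference and on the variant-induced 3'UTR, respectively.
    """
    if ref_mirnas and alt_mirnas:
        return "substitution"
    if ref_mirnas:
        return "target_loss"
    if alt_mirnas:
        return "target_gain"
    return "none"  # hypothetical label: the SNV touches no predicted site


def rank_scores(raw: List[float]) -> List[float]:
    """Assumed rank score: fraction of scores <= each score (range (0, 1])."""
    ordered = sorted(raw)
    return [bisect.bisect_right(ordered, x) / len(ordered) for x in raw]


def rank_within_category(max_diff: Dict[str, float],
                         category: Dict[str, str]) -> Dict[str, float]:
    """Rank raw maximum difference scores separately within each SNV category."""
    by_category = defaultdict(list)
    for snv_id, cat in category.items():
        by_category[cat].append(snv_id)
    ranked = {}
    for snv_ids in by_category.values():
        scores = rank_scores([max_diff[s] for s in snv_ids])
        ranked.update(zip(snv_ids, scores))
    return ranked
```

Ranking inside each category keeps, for example, a target gain score from being compared directly with a substitution score that lives on a different scale.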
Figure 1. Illustration of the three types of SNVs and maximum difference score calculation.

In the first row, the C>A mutation resulted in the substitution of targeting miRNAs. A substitution occurs when regulating miRNAs have seed regions overlapping this locus and, after introducing a mutation, one or more different miRNAs target this locus. In the second row, the C>A mutation resulted in target gain. A target gain occurs when there was no regulating miRNA and, after introducing a mutation, the locus gained at least one regulating miRNA. In the third row, the C>G mutation resulted in target loss. A target loss occurs when there were regulating miRNAs but, after introducing a mutation, there was no targeting miRNA. In this example, the maximum difference scores are calculated as abs(-0.59 - (-0.02)) = 0.57 for the C>A substitution SNV; abs(-0.25) = 0.25 for the C>A target gain SNV; and abs(-0.12) = 0.12 for the C>G target loss SNV.
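The arithmetic in the Figure 1 example can be reproduced with a short Python sketch. It assumes that a missing site contributes a score of 0, which is what the target gain and target loss examples suggest, and that candidate score pairs are simply compared by absolute difference. The caption does not say which value in the substitution example is the reference score; the absolute difference is the same either way. The function name and the pairing of scores are illustrative, not taken from the dbMTS code.

```python
from typing import Iterable, Optional, Tuple

# (reference score, variant-induced score); None means no predicted site.
ScorePair = Tuple[Optional[float], Optional[float]]


def max_difference(pairs: Iterable[ScorePair]) -> float:
    """Largest absolute reference-vs-variant score difference over all pairs."""
    best = 0.0
    for ref, alt in pairs:
        ref_value = 0.0 if ref is None else ref  # assumed: absent site scores 0
        alt_value = 0.0 if alt is None else alt
        best = max(best, abs(ref_value - alt_value))
    return best


# The three rows of Figure 1.
substitution = round(max_difference([(-0.59, -0.02)]), 2)
target_gain = max_difference([(None, -0.25)])
target_loss = max_difference([(-0.12, None)])
print(substitution, target_gain, target_loss)
```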
After identifying the three categories of SNVs that affect miRNA targeting for each of the three miRNA target prediction tools, we combined these results to build the foundation of the database, comprising all potentially functional SNVs that could affect miRNA targeting. Additional annotations were then extracted from the Whole Genome Sequencing Annotator (WGSA) (X. Liu et al., 2015; https://sites.google.com/site/jpopgen/wgsa, v0.8) based on the positions of all SNVs in our database. The annotation categories include functional consequences of genomic variants from VEP (McLaren et al., 2016), dbSNP variant IDs, GWAS Catalog entries, allele frequencies from various populations, clinical consequences from ClinVar, expression quantitative trait loci (eQTLs) from GTEx, and mappability scores. In addition, quantitative annotations that combine machine learning techniques with experimental information or other annotation scores were included. Moreover, for each miRNA-target pair, we calculated the correlation of their expression within 15 different tissues from the TCGA program. The expression data were obtained from Bai et al. (2016), including 8 tissues from their published data and 7 tissues from their unpublished data. We provide these annotations to help users more easily rank and interpret a large number of candidate SNVs.
In our database, each SNV links to 221 unique fields. The first four columns form the primary identifier of the SNV: chromosome number, physical position on the chromosome in hg38 (1-based coordinate), reference allele (on the + strand), and alternate allele (on the + strand). For interested users, we also provide the identifier of the SNV in the hg19 human reference genome in columns 5 to 8. Following the identifier information are the annotations we retrieved from WGSA (columns 10 to 131). Columns 132 to 221 hold the information we obtained from the three miRNA target prediction tools and the miRNA-target co-expression information. For each of the miRNA target prediction tools, there are 30 fields: 13 fields of predictions using reference 3’UTRs, 13 fields of predictions using SNV-induced 3’UTRs, and 4 fields of site-level summary information (the maximum difference score, its rank score, the transcript ID corresponding to the maximum difference score, and the predicted category of the SNV). A more comprehensive description of these fields can be found in Supp. Table S1.
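As a rough illustration of this layout, the following Python sketch slices one record into the column groups described above. The column ranges come from the text; the tab delimiter, the group names, and the assumption that the three 30-field tool blocks are stored contiguously are illustrative guesses that should be checked against the header of the downloaded file and Supp. Table S1.

```python
from typing import Dict, List, Tuple

# 1-based, inclusive column ranges as described in the text.
COLUMN_GROUPS: Dict[str, Tuple[int, int]] = {
    "hg38_id": (1, 4),             # chr, pos, ref, alt
    "hg19_id": (5, 8),
    "wgsa_annotations": (10, 131),
    "mirna_tools": (132, 221),
}
FIELDS_PER_TOOL = 13 + 13 + 4      # reference + variant-induced + summary


def parse_row(line: str, sep: str = "\t") -> List[str]:
    """Split one record; the tab separator is an assumption."""
    return line.rstrip("\n").split(sep)


def slice_group(fields: List[str], group: str) -> List[str]:
    """Return the fields of one column group (converts to 0-based slicing)."""
    first, last = COLUMN_GROUPS[group]
    return fields[first - 1:last]


def split_tool_blocks(fields: List[str]) -> List[List[str]]:
    """Cut columns 132-221 into blocks of 30, assuming contiguous storage."""
    block = slice_group(fields, "mirna_tools")
    return [block[i:i + FIELDS_PER_TOOL]
            for i in range(0, len(block), FIELDS_PER_TOOL)]
```

The ranges are consistent with the field counts in the text: columns 132 to 221 span 90 fields, which equals 3 tools times 30 fields.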
All this information can be retrieved through a client-server architecture that handles the interaction between users and the database server. In more detail, dbMTS is stored on our database server, managed by Microsoft SQL Server 2017. On the dbMTS web page (http://database.liulab.science/dbMTS), users can submit one or multiple genome coordinates (chromosome, position, reference allele, and alternate allele) to retrieve the fields described above. The output is displayed on the web page and is also available as a downloadable text file. In addition, the entire database is available for download at https://sites.google.com/site/jpopgen/dbNSFP.
Among the large number of annotations included, in this study we focused only on popular quantitative measurements that have proven useful for prioritizing functional SNVs under different scenarios. However, please note that other annotations could be as useful or more useful, depending on specific research goals and interests. The 16 annotations we selected include eight conservation scores (PhyloP46way primate, PhyloP100way vertebrate, PhyloP20way mammalian, PhastCons46way primate, PhastCons20way mammalian, PhastCons100way vertebrate, GERP_RS, and SiPhy) and eight integrative annotations that use more than one feature or combine multiple individual annotations (integrated fitCons, FATHMM-MKL coding, FATHMM-MKL noncoding, Eigen, Eigen-PC, CADD, DANN, and GenoCanyon). Detailed information on these 16 annotation scores and the 3 miRNA-specific scores can be found in Table 1. Summary information for the miRNA-specific scores can be found in Supp. Table S2 and Supp. Figure S2. The relatively low coverage of miRanda and RNAhybrid predictions resulted partly from their high thresholds for reporting a ‘true’ MTS and partly from built-in limitations of the programs, e.g., RNAhybrid was not able to make predictions for 3’UTRs longer than 2,000 nucleotides. Their results can be considered a more constrained set of potential MTS SNVs.
Table 1.
Annotations in dbMTS with their score range and missingness
| Annotations | min score | max score | No. Missing | No. SNVs | Percent non-missing (%) |
|---|---|---|---|---|---|
| phyloP46way_primate | −8.04 | 0.66 | 1361441 | 149236550 | 99.10 |
| phyloP20way_mammalian | −12.51 | 1.2 | 352635 | 150245356 | 99.77 |
| phyloP100way_vertebrate | −20 | 10 | 495343 | 150102648 | 99.67 |
| phastCons46way_primate | 0 | 1 | 1361438 | 149236553 | 99.10 |
| phastCons20way_mammalian | 0 | 1 | 352635 | 150245356 | 99.77 |
| phastCons100way_vertebrate | 0 | 1 | 495343 | 150102648 | 99.67 |
| GERP_RS | −12.3 | 6.17 | 6422487 | 144175504 | 95.74 |
| SiPhy_29way | 0 | 34.19 | 27411447 | 123186544 | 81.80 |
| integrated_fitCons | 0 | 0.84 | 18967312 | 131630679 | 87.41 |
| GenoCanyon | 0 | 1 | 163222 | 150434769 | 99.89 |
| CADD | −6.41 | 35.5 | 996735 | 149601256 | 99.34 |
| DANN | 0.01 | 1 | 996735 | 149601256 | 99.34 |
| fathmm-MKL_non-coding | 0 | 1 | 1298282 | 149299709 | 99.14 |
| fathmm-MKL_coding | 0 | 1 | 1034328 | 149563663 | 99.31 |
| Eigen | −3.33 | 6.84 | 19417761 | 131180230 | 87.11 |
| Eigen-PC | −3.38 | 17.69 | 19417761 | 131180230 | 87.11 |
| miRanda_raw | 0 | 216 | 42145719 | 104728888 | 69.54 |
| miRanda_rankscore | 0 | 1 | 42145719 | 104728888 | 69.54 |
| TargetScan_raw | 0 | 11.07 | 1991498 | 148606493 | 98.68 |
| TargetScan_rankscore | 0 | 1 | 1991498 | 148606493 | 98.68 |
| RNAhybrid_raw | 0 | 60.8 | 91185241 | 59412750 | 39.45 |
| RNAhybrid_rankscore | 0 | 1 | 91185241 | 59412750 | 39.45 |
Correlation structures of the quantitative annotation scores listed in Table 1 are shown in Figure 2. There was a high correlation between conservation scores and most integrative scores, while there was little correlation between miRNA target prediction scores and all other scores. This observation indicates that conservation or comparative genomic information was heavily used in the integrative algorithms, and that the miRNA target prediction algorithms might provide additional information about the functional importance of these SNVs beyond conservation. Aside from the built-in limitations discussed in the previous paragraph, the low correlation among the three miRNA target prediction tools is consistent with currently available tools generating distinct, non-overlapping false-positive predictions.
Figure 2. Correlation structure between different annotation scores.

Utility and discussion
Functional 3’UTR SNVs are more likely to fall in dbMTS
First, we checked whether functionally important SNVs in 3’UTRs are enriched in dbMTS. We retrieved two related datasets, namely ClinVar (Landrum et al., 2016) and the MiRNA SNP Disease Database, or MSDD (Yue et al., 2018). ClinVar is a public database of reported associations between human variation and phenotypes. MSDD is a manually curated database of experimentally supported associations between miRNA-related SNVs and human diseases. For ClinVar, we identified 1,214 pathogenic SNVs in dbMTS and an additional 1,937 pathogenic SNVs in the rest of human 3’UTRs. Given that dbMTS includes 50,471,975 positions and human 3’UTRs have about 231,762,266 positions, pathogenic SNVs are over-represented in dbMTS (P < 0.00001). In addition, we extracted 118 unique SNVs from MSDD that were labeled as 3’UTR variants. We found that five of these SNVs resided in miRNA coding sequences, which are not considered in our database. Among the remaining 113 SNVs, we identified 111 in dbMTS (98.2%). For the same 113 SNVs, PolymiRTS Database 3.0 and miRNASNP v2.0 covered 102 (90.3%) and 86 (76.1%) SNVs, respectively. These results indicate that dbMTS captures SNVs that can affect miRNA targeting and are functionally important, which should improve users’ ability to screen for functional SNVs in human 3’UTRs.
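The over-representation of ClinVar pathogenic SNVs can be checked with a few lines of Python. The text does not state which test produced the reported P value, so the sketch below uses a two-proportion z-test as a stand-in. It also assumes that the "rest of human 3’UTRs" corresponds to the total number of 3’UTR positions minus the dbMTS positions.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Return (rate ratio, z statistic, two-sided p-value) for x1/n1 vs x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return p1 / p2, z, p_value


# Counts from the text.
dbmts_positions = 50_471_975
utr_positions = 231_762_266
rest_positions = utr_positions - dbmts_positions  # assumption, see lead-in

ratio, z, p = two_proportion_z(1_214, dbmts_positions, 1_937, rest_positions)
print(f"rate ratio = {ratio:.2f}, z = {z:.1f}, p = {p:.3g}")
```

Under these assumptions the per-position rate of pathogenic SNVs inside dbMTS is a little more than twice the rate in the remaining 3’UTR positions, which agrees in direction with the reported result.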
Second, we checked whether the miRNA-specific scores we calculated with the three miRNA prediction tools, TargetScan, miRanda, and RNAhybrid (i.e., TS_rankscore, M_rankscore, and R_rankscore), can further help users discriminate between functional and non-functional SNVs. Using MSDD, we found that SNVs with a low TargetScan rank score (TS_rankscore < 0.2) showed a statistically significant depletion of functional SNVs (P < 0.05). This under-representation of functional SNVs in the low score range suggests that a low TS_rankscore (<0.2) indicates not only low targeting efficacy but also a lower likelihood that the SNV lies in an authentic MTS, given the high false-positive rate of currently available miRNA target prediction tools.
Comparison of predictive power between different annotation scores
After showing that dbMTS can identify functional MTS SNVs, we next compared some of the functional annotation scores in our database to check which performs best in separating potentially functional MTS SNVs from non-functional ones. From ClinVar, we extracted 1,214 ‘pathogenic’ SNVs as part of our true positive (TP) testing set and 3,196 ‘benign’ SNVs as our true negative (TN) testing set. From MSDD, we extracted the 111 unique SNVs that were labeled as 3’UTR variants and were identified in dbMTS; all of them were labeled as TP in our testing dataset. These testing samples were then combined. To ensure that the SNVs being evaluated were completely noncoding and did not overlap any coding region, we removed SNVs annotated as nonsynonymous or splicing by any of three popular functional annotation tools, namely ANNOVAR (Wang, Li, & Hakonarson, 2010), VEP (McLaren et al., 2016) and SnpEff (Cingolani et al., 2012). Finally, we obtained a testing dataset with 166 TPs and 2,943 TNs. We randomly selected 166 TNs to correct the imbalance of the labels. Then, using receiver operating characteristic (ROC) curves, we evaluated the performance of each annotation score in our database (Figure 3). We found that the Eigen score had the best overall performance, with an area under the curve (AUC) of 0.7405, followed by fathmm-MKL and several conservation scores. Since the Eigen score has a higher missing rate in our test set than some of the other annotation scores (missing rate = 7.33%), we also performed an analysis that removed SNVs with a missing Eigen score from our test set, which returned similar results (Supp. Table S3). The FATHMM-MKL noncoding score showed the best performance for SNVs with a missing Eigen score. The TargetScan scores achieved significant AUCs, with 95% confidence intervals entirely above 0.5. For RNAhybrid and miRanda, the 95% confidence intervals of all AUCs included 0.5, meaning their predictive power on this testing dataset was no better than random guessing. This observation implies that the difference in binding stability and the difference in base-pairing score alone have little power to predict whether a MTS SNV is functional. Other context information around the binding site is also indispensable for predicting the efficacy of miRNA regulation and inferring a SNV’s impact on miRNA targeting. This observation also partly explains the low correlation between these miRNA target prediction scores shown in Figure 2.
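The evaluation procedure (downsample the TNs to the number of TPs, then compute an AUC per annotation score) can be sketched in plain Python. This is an illustration rather than the analysis code behind Figure 3: the AUC is computed through its rank-based (Mann-Whitney) interpretation, missing scores are simply dropped, the random seed is arbitrary, and confidence intervals are not computed.

```python
import random
from typing import List, Optional, Sequence


def auc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as 0.5. This equals the area under the ROC curve.
    """
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def drop_missing(scores: Sequence[Optional[float]]) -> List[float]:
    """Remove missing values (None) before evaluation."""
    return [s for s in scores if s is not None]


def balanced_auc(tp_scores: Sequence[Optional[float]],
                 tn_scores: Sequence[Optional[float]],
                 seed: int = 0) -> float:
    """AUC after downsampling TNs to the number of TPs."""
    tps = drop_missing(tp_scores)
    tns = drop_missing(tn_scores)
    rng = random.Random(seed)  # fixed seed for a reproducible sample
    sampled_tns = rng.sample(tns, k=min(len(tps), len(tns)))
    return auc(tps, sampled_tns)
```

With 166 TPs and 166 sampled TNs the quadratic pair loop is cheap; for much larger sets a sort-based AUC would be preferable.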
Figure 3. ROCs for functional annotation scores in our database using a self-curated test set.

Utilities and future studies
As mentioned previously, dbMTS includes a large number of SNVs with their possible effects on miRNA targeting in 3’UTRs, along with multiple functional annotation scores and predictions. Aside from simply using the overall best score, Eigen, another straightforward way to take advantage of the database is to use several annotation scores at once and look for consensus predictions among them. This approach can be applied in two ways: the first is to find consensus SNVs predicted by multiple miRNA target prediction algorithms, which identifies a stringent subset of SNVs that affect miRNA targeting; the second is to prioritize functional SNVs using the predicted functional importance from multiple annotations (a list of recommended cut-off points for some of the annotations can be found in Supp. Table S4). Using this method, studies interested in SNVs and MTS can filter out a large number of neutral SNVs and keep high-confidence SNVs that are more likely to affect miRNA targeting for further analyses. For example, we investigated variants reported in the GWAS Catalog and found 1,571 3’UTR SNVs that could potentially affect miRNA targeting (Supp. Table S5), which are good starting points for future functional validation.
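A consensus filter of this kind can be sketched in Python as follows. The field names follow the score names in Table 1, but the cut-off values shown in the usage note are placeholders chosen for illustration, not the recommended values from Supp. Table S4, and the record format (one dictionary per SNV) is an assumption.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]


def consensus_count(snv: Record, cutoffs: Dict[str, float]) -> int:
    """Number of scores that are present and reach their cut-off."""
    count = 0
    for name, cutoff in cutoffs.items():
        value = snv.get(name)
        if value is not None and value >= cutoff:
            count += 1
    return count


def filter_consensus(snvs: List[Record], cutoffs: Dict[str, float],
                     min_support: int) -> List[Record]:
    """Keep SNVs supported by at least min_support scores."""
    return [s for s in snvs if consensus_count(s, cutoffs) >= min_support]
```

For instance, passing `{"TargetScan_rankscore": 0.8, "miRanda_rankscore": 0.8, "RNAhybrid_rankscore": 0.8}` with `min_support=2` would keep SNVs ranked highly by at least two of the three target prediction tools; the same function works for functional annotation scores such as Eigen or CADD with suitable cut-offs.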
Additionally, using the miRNA-target co-expression information, we observed that 13,225,315 SNVs in our database are located in MTS of miRNA-target pairs whose miRNA and mRNA expression is negatively correlated (Pearson’s correlation coefficient < 0) in some tissues and not positively correlated in any of the 15 tissues considered (i.e., increased expression of a truly targeting miRNA results in decreased expression of the targeted mRNA, and vice versa). Thus, these variants and their effects predicted by computational tools are more likely to be authentic. This co-expression information adds another layer of evidence to help users screen for truly functional variants. Moreover, given the extensive involvement of miRNAs in oncogenesis (Esquela-Kerscher & Slack, 2006; Lin & Gregory, 2015), the same approach allows our database to be used to prioritize candidate driver mutations in cancer genomes. The richness of the available information can further boost the user’s power to interpret noncoding SNVs. For example, eQTLs can be used to associate SNVs and their targeting miRNAs with gene expression to gain a more complete picture of gene regulation pathways.
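The co-expression criterion described above (a negative correlation in at least one tissue and no positive correlation in any tissue) can be written as a small Python predicate. The input format, a mapping from tissue label to Pearson’s correlation coefficient with None for tissues without data, and the tissue labels used in the usage note are assumptions for illustration.

```python
from typing import Dict, Optional


def supports_repression(correlations: Dict[str, Optional[float]]) -> bool:
    """True if the miRNA-mRNA pair is negatively correlated in at least one
    tissue and positively correlated in none; missing values are ignored."""
    values = [r for r in correlations.values() if r is not None]
    has_negative = any(r < 0 for r in values)
    has_positive = any(r > 0 for r in values)
    return has_negative and not has_positive
```

For example, `supports_repression({"tissue_A": -0.31, "tissue_B": None})` returns True, while adding a tissue with a coefficient of 0.12 makes it return False.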
Our database can be further improved in various ways. First, it would benefit significantly from the future development of both miRNA target prediction tools and SNV functional annotation tools. Second, although we focused on SNVs, other types of genetic variation, such as insertions and deletions, can also disrupt miRNA targeting. Even though it is computationally expensive to evaluate their effects, including these types of mutations would further increase the comprehensiveness of our database. Third, since our database contains miRNA-specific raw scores from the three miRNA target prediction tools, these scores could be used to construct new measurements of functional importance other than the maximum difference score used in our study.
Conclusion
In this study, we took advantage of three miRNA target prediction tools (TargetScan, miRanda, and RNAhybrid) to identify all possible SNVs that could affect miRNA targeting in the 3’UTRs of human mRNAs. We estimated their functional importance using these three tools and collected multiple popular functional annotation scores for these SNVs. In addition, we compared the performance of these functional annotation scores using a combined testing dataset. We found that Eigen outperformed all other individual annotations, and TargetScan showed statistically significant (though weak) predictive power for SNV pathogenicity. We hope the presented database will facilitate research that uses MTS to prioritize functional SNVs or interpret WGS results.
Funding information:
This work is supported partially by NIH funding 1UM1HG008898 to Dr. Richard Gibbs (PI) and partially by the startup funding for XL from the University of South Florida.
Footnotes
Availability of data and materials
The web service of dbMTS is hosted at http://database.liulab.science/dbMTS.
The full downloadable version of dbMTS is attached to dbNSFP and hosted at https://sites.google.com/site/jpopgen/dbNSFP.
The data in the database were derived from the following resources available in the public domain:
Multiple functional annotations of SNVs in dbMTS were retrieved through WGSA at http://sites.google.com/site/jpopgen/wgsa.
Testing data, namely ClinVar and MSDD, were retrieved from ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/ and http://bio-bigdata.hrbmu.edu.cn/msdd/, respectively.
The two databases for miRNA-related SNVs compared with dbMTS in the study are PolymiRTS (http://compbio.uthsc.edu/miRSNP/download/PolymiRTS3.0/) and miRNASNP (http://bioinfo.life.hust.edu.cn/miRNASNP2/download.php).
GENCODE mRNA transcript data were retrieved using the UCSC Table Browser at http://genome.ucsc.edu/cgi-bin/hgTables.
microRNA sequences were retrieved through miRBase at http://www.mirbase.org/ftp.shtml.
Conflict of Interest Statement
The authors declare that they have no competing interests.
References
- Agarwal V, Bell GW, Nam JW, & Bartel DP (2015). Predicting effective microRNA target sites in mammalian mRNAs. eLife, 4, 1–38. doi: 10.7554/eLife.05005
- Bai Y, Ding L, Baker S, Bai JM, Rath E, Jiang F, … Stuart G (2016). Dissecting the biological relationship between TCGA miRNA and mRNA sequencing data using MMiRNA-Viewer. BMC Bioinformatics, 17(Suppl 13), 336. doi: 10.1186/s12859-016-1219-y
- Bartel DP (2018). Metazoan MicroRNAs. Cell, 173(1), 20–51. doi: 10.1016/j.cell.2018.03.006
- Bhattacharya A, Ziebarth JD, & Cui Y (2014). PolymiRTS Database 3.0: linking polymorphisms in microRNAs and their target sites with human diseases and biological pathways. Nucleic Acids Res, 42(Database issue), D86–91. doi: 10.1093/nar/gkt1028
- Bruno AE, Li L, Kalabus JL, Pan Y, Yu A, & Hu Z (2012). miRdSNP: a database of disease-associated SNPs and microRNA target sites on 3’UTRs of human genes. BMC Genomics, 13, 44. doi: 10.1186/1471-2164-13-44
- Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, … Ruden DM (2012). A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w 1118; iso-2; iso-3. Fly, 6, 80–92. doi: 10.4161/fly.19695
- Consortium E (2012). An integrated encyclopedia of DNA elements in the human genome. Nature, 489, 57–74. doi: 10.1038/nature11247
- Esquela-Kerscher A, & Slack FJ (2006). Oncomirs - microRNAs with a role in cancer. Nat Rev Cancer, 6(4), 259–269. doi: 10.1038/nrc1840
- Friedman RC, Farh KKH, Burge CB, & Bartel DP (2009). Most mammalian mRNAs are conserved targets of microRNAs. Genome Research, 19, 92–105. doi: 10.1101/gr.082701.108
- Gong J, Liu C, Liu W, Wu Y, Ma Z, Chen H, & Guo AY (2015). An update of miRNASNP database for better SNP selection by GWAS data, miRNA expression and online tools. Database, 2015, 1–8. doi: 10.1093/database/bav029
- Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M, Kokocinski F, … Hubbard TJ (2012). GENCODE: The reference human genome annotation for the ENCODE project. Genome Research, 22, 1760–1774. doi: 10.1101/gr.135350.111
- Ionita-Laza I, McCallum K, Xu B, & Buxbaum JD (2016). A spectral approach integrating functional genomic annotations for coding and noncoding variants. Nature Genetics, 48, 214–220. doi: 10.1038/ng.3477
- John B, Enright AJ, Aravin A, Tuschl T, Sander C, & Marks DS (2004). Human MicroRNA targets. PLoS Biology, 2, e363. doi: 10.1371/journal.pbio.0020363
- Kircher M (2014). A general framework for estimating the relative pathogenicity of human genetic variants. Nature Genetics, 46, 310–315. doi: 10.1038/ng.2892
- Landrum MJ, Lee JM, Benson M, Brown G, Chao C, Chitipiralla S, … Maglott DR (2016). ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Res, 44(D1), D862–868. doi: 10.1093/nar/gkv1222
- Lewis BP, Burge CB, & Bartel DP (2005). Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell, 120, 15–20. doi: 10.1016/j.cell.2004.12.035
- Li C, Grove ML, Yu B, Jones BC, Morrison A, Boerwinkle E, & Liu X (2018). Genetic variants in microRNA genes and targets associated with cardiovascular disease risk factors in the African-American population. Hum Genet, 137(1), 85–94. doi: 10.1007/s00439-017-1858-8
- Lin S, & Gregory RI (2015). MicroRNA biogenesis pathways in cancer. Nat Rev Cancer, 15(6), 321–333. doi: 10.1038/nrc3932
- Liu C, Zhang F, Li T, Lu M, Wang L, Yue W, & Zhang D (2012). MirSNP, a database of polymorphisms altering miRNA target sites, identifies miRNA-related SNPs in GWAS SNPs and eQTLs. BMC Genomics, 13, 661. doi: 10.1186/1471-2164-13-661
- Liu X, White S, Peng B, Johnson AD, Brody JA, Li AH, … Boerwinkle E (2015). WGSA: an annotation pipeline for human genome sequencing studies. Journal of Medical Genetics, 0, jmedgenet-2015–103423. doi: 10.1136/jmedgenet-2015-103423
- Ludwig N, Leidinger P, Becker K, Backes C, Fehlmann T, Pallasch C, … Keller A (2016). Distribution of miRNA expression across human tissues. Nucleic Acids Research, 44, 3865–3877. doi: 10.1093/nar/gkw116
- McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, … Cunningham F (2016). The Ensembl Variant Effect Predictor. Genome Biol, 17(1), 122. doi: 10.1186/s13059-016-0974-4
- Nicoloso MS, Sun H, Spizzo R, Kim H, Wickramasinghe P, Shimizu M, … Calin GA (2010). Single-nucleotide polymorphisms inside microRNA target sites influence tumor susceptibility. Cancer Res, 70(7), 2789–2798. doi: 10.1158/0008-5472.CAN-09-3541
- Quang D, Chen Y, & Xie X (2015). DANN: A deep learning approach for annotating the pathogenicity of genetic variants. Bioinformatics, 31, 761–763. doi: 10.1093/bioinformatics/btu703
- Rehmsmeier M, Steffen P, Höchsmann M, Giegerich R, & Ho M (2004). Fast and effective prediction of microRNA / target duplexes. RNA, 10, 1507–1517. doi: 10.1261/rna.5248604
- Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, & Sirotkin K (2001). dbSNP: the NCBI database of genetic variation. Nucleic Acids Res, 29(1), 308–311. Retrieved from https://www.ncbi.nlm.nih.gov/pubmed/11125122
- Shihab HA, Rogers MF, Gough J, Mort M, Cooper DN, Day INM, … Campbell C (2015). An integrative approach to predicting the functional effects of noncoding and coding sequence variation. Bioinformatics (Oxford, England), 31, 1536–1543. doi: 10.1093/bioinformatics/btv009
- Liu X, Jian X, & Boerwinkle E (2011). dbNSFP: a lightweight database of human nonsynonymous SNPs and their functional predictions. Human Mutation, 32(8), 894–899.
- Liu X, Wu C, Li C, & Boerwinkle E (2016). dbNSFP v3.0: A one-stop database of functional predictions and annotations for human nonsynonymous and splice-site SNVs. Human Mutation, 37(3), 235–241.
- Wang K, Li M, & Hakonarson H (2010). ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Research, 38, e164. doi: 10.1093/nar/gkq603
- Yue M, Zhou D, Zhi H, Wang P, Zhang Y, Gao Y, … Li X (2018). MSDD: a manually curated database of experimentally supported associations among miRNAs, SNPs and human diseases. Nucleic Acids Res, 46(D1), D181–D185. doi: 10.1093/nar/gkx1035
