Nucleic Acids Research. 2014 Jul 31;42(15):9543–9552. doi: 10.1093/nar/gku675

Effects of genetic variations on microRNA: target interactions

Chaochun Liu 1, William A Rennie 1, C Steven Carmack 1, Shaveta Kanoria 1, Jijun Cheng 2, Jun Lu 2, Ye Ding 1,*
PMCID: PMC4150780  PMID: 25081214

Abstract

Genetic variations within microRNA (miRNA) binding sites can affect miRNA-mediated gene regulation, which may lead to phenotypes and diseases. We perform a transcriptome-scale analysis of genetic variants and miRNA:target interactions identified by CLASH. This analysis reveals that rare variants tend to reside in CDSs, whereas common variants tend to reside in the 3′ UTRs. miRNA binding sites are more likely to reside within those targets in the transcriptome with lower variant densities, especially target regions in which nucleotides have low mutation frequencies. Furthermore, an overwhelming majority of genetic variants within or near miRNA binding sites can alter not only the potential of miRNA:target hybridization but also the structural accessibility of the binding sites and flanking regions. These suggest an interpretation for certain associations between genetic variants and diseases, i.e. modulation of miRNA-mediated gene regulation by common or rare variants within or near miRNA binding sites, likely through target structure alterations. Our data will be valuable for discovering new associations among miRNAs, genetic variations and human diseases.

INTRODUCTION

Genetic variations within gene regulatory elements may affect gene expression levels in an allele-specific manner and thereby contribute to the variation in complex human phenotypes and diseases. Many disease-associated regulatory polymorphisms, such as variants in cis-elements (1,2), operate at the stage of transcriptional regulation by altering the binding affinity of transcription factors. Recently, polymorphisms within post-transcriptional regulatory elements, in particular microRNA (miRNA) binding sites, have been studied (3,4). These polymorphisms also represent an important class of genetic variations.

miRNAs are an abundant class of small endogenous non-coding RNAs of ∼22 nucleotides (nts) in length. More than 1,000 human miRNAs have been discovered (5), while more than 30% of human protein-coding genes are predicted to be regulated by miRNAs (6). miRNAs are key post-transcriptional regulators involved in diverse developmental processes, molecular and cellular pathways and human diseases (7). A mature miRNA can guide the RNA-induced silencing complex for target recognition through hybridization between the miRNA and its cognate messenger RNAs (mRNAs). Successful target binding usually results in translational repression and/or mRNA degradation (8). It has been demonstrated that genetic variants within miRNA binding sites can modulate gene expression and protein output levels and affect phenotypes or cause disease (3,4,9–15). Several studies have systematically identified the genetic variants within human miRNA target sites (16–21) and performed part or all of the following analyses: (i) investigation of natural selection via statistical analysis of the frequency of single nucleotide polymorphisms (SNPs) within miRNA seed (2–7 nt) complementary regions, miRNA binding sites and the 3′ untranslated regions (3′ UTRs) of mRNAs or the entire mRNAs (17–21); (ii) measurement of the SNP-induced effect on miRNA binding by hybridization energy change (16,17); and (iii) association of the miRNA-related SNPs with human phenotypes or diseases (16,17,20). For some of these studies, a small fraction of miRNA binding sites were experimentally validated. The remaining miRNA binding sites in all of these studies were identified by computational predictions that can have high numbers of false positives or false negatives (22). Inaccuracy in predictions may bias such analyses. Moreover, these studies only considered common genetic variants with minor allele frequencies (MAFs) greater than or equal to 1% or 5%.
Although common genetic variants have been a focus of disease association studies, some rare variants may have significant impact on an individual's risk of certain phenotypes or diseases (23–28). To date, there has been a lack of systematic studies that include both common and rare genetic variants, as well as variants without frequency information. Genetic variations may alter the local secondary structure of mRNA sequences (29). A change in structural accessibility can affect target recognition by miRNAs (30–33). However, the hybridization energy used in the previous studies (16,17) does not measure the effect of local target structure change induced by genetic variants within the binding sites. In this work, we consider several target structure features for measuring the effects of variants on local target structure. Moreover, two SNPs near miRNA binding sites were reported to lead to either a change in local target secondary structure (34) or an alteration of miRNA regulation (35). We here systematically study such effects of SNPs in the flanking regions of miRNA binding sites.

A human miRNA interactome of ∼18,500 miRNA:mRNA interactions has been experimentally identified by CLASH (36). The CLASH technique performs high-throughput crosslinking, ligation and sequencing of miRNA-target RNA duplexes associated with human AGO1. It allows direct observation of miRNA:target interactions revealed by CLASH chimeras. Over 98% of the interactions were formed in vivo in human cells (36). The CLASH study has presented a high-quality data set of high-confidence miRNA binding sites, which enables accurate identification of genetic variants within or near true miRNA binding sites. Thus, this data set provides a solid foundation for a systematic investigation of the effects of miRNA-related variations on miRNA-mediated gene regulation and human phenotypes or diseases. To pursue this objective, we start with a comprehensive transcriptome-wide survey of natural selection for genetic variants within miRNA binding sites identified by CLASH. In addition to hybridization energy, we consider four features to measure the effects of genetic variants within or near miRNA binding sites on local target structure and the potential of miRNA:target hybridization. Furthermore, we identify miRNA-related genetic variants for cancer genes and also those associated with known human phenotypes or diseases to facilitate further studies on individual susceptibility to complex diseases.

MATERIALS AND METHODS

Data processing

We downloaded the human genetic variant data set ‘phase1_integrated_release_version3’ from 1000 genomes FTP (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/) which contains phased genotype calls on 1,092 human samples for SNPs, short indels and large deletions. These variants were mapped to Ensembl transcriptome of all protein-coding genes using the annotation file (Ensembl Release 60) from Ensembl genome browser (http://www.ensembl.org).

The CLASH data set includes ∼18,500 miRNA:target interactions as chimeric sequencing reads for 399 miRNAs and 7,390 transcripts. Each chimeric read contains one miRNA and a target-binding region of 42–119 nts in length. The miRNA binding sites within the CLASH chimeras are identified by the RNAhybrid program (37) for either seed sites (i.e. canonical sites) or seedless sites (i.e. noncanonical sites). Seed sites include 8mer, 7mer-A1, 7mer-m8, 6mer and offset 6mer sites (38). RNAhybrid also presents the conformation of the miRNA:target hybrid in addition to the start and end nucleotide positions of the binding site. For each binding site, we calculated conservation score as the average of individual nucleotide conservation scores from the UCSC genome browser. These scores were generated by the PhastCons program (39) through multiple-sequence alignments of nine primate genomes to the human genome (hg19).
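
The seed-site categories listed above can be illustrated with a short sketch. This is not RNAhybrid and not the procedure used in this work; it is a plain string-matching illustration that assumes the commonly used definitions of the site types (a match to miRNA nucleotides 2–8 or 2–7, optionally followed by an A opposite miRNA nucleotide 1, and a match to nucleotides 3–8 for the offset 6mer). The function names are hypothetical.

```python
# Hypothetical helpers: classify a canonical seed match by string matching.
# Assumption: site-type definitions follow the commonly used conventions;
# this is an illustration only, not the RNAhybrid-based pipeline.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(COMPLEMENT[nt] for nt in reversed(rna))


def classify_seed_site(mirna: str, target: str) -> str:
    """Return the first seed-site type found in target (both given 5'->3')."""
    mirna = mirna.upper().replace("T", "U")
    target = target.upper().replace("T", "U")
    m2_8 = reverse_complement(mirna[1:8])  # pairs with miRNA nt 2-8
    m2_7 = reverse_complement(mirna[1:7])  # pairs with miRNA nt 2-7
    m3_8 = reverse_complement(mirna[2:8])  # pairs with miRNA nt 3-8
    # Checked from most to least stringent; the trailing "A" is the
    # target nucleotide opposite miRNA nt 1.
    patterns = [
        ("8mer", m2_8 + "A"),
        ("7mer-m8", m2_8),
        ("7mer-A1", m2_7 + "A"),
        ("6mer", m2_7),
        ("offset 6mer", m3_8),
    ]
    for name, pattern in patterns:
        if pattern in target:
            return name
    return "seedless"
```

For example, with the let-7a sequence UGAGGUAGUAGGUUGUAUAGUU, a target region containing CUACCUCA is reported as an 8mer site, while a region with no match to any of the five patterns is reported as seedless.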

Variant frequency analysis

For the subset of common variants and the subset of rare variants, we first counted the numbers of variants residing within the entire mRNA, coding sequence (CDS), or 3′ UTR for the whole transcriptome, CLASH transcripts and miRNA binding sites, respectively. A transcript is referred to as a CLASH transcript if it is represented by at least one CLASH chimera. The 5′ UTR was not included in the region-specific analysis, due to sparse variant data. We next computed the length of each of these regions. The variant density for a region was computed as the number of variants divided by the length of the region. The P-value from Fisher's exact test (40) was used to evaluate the significance of the difference in variant densities.
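
The density computation and the significance test can be sketched as follows. The layout of the contingency table is not specified in the text, so the sketch assumes a 2×2 table whose rows are the two regions being compared and whose columns are variant and non-variant positions. The pure-Python test below is only practical for small illustrative counts; for transcriptome-scale counts a statistics library routine such as `scipy.stats.fisher_exact` would be the natural choice. All function names are hypothetical.

```python
from math import comb


def variant_density(n_variants: int, region_length: int) -> float:
    """Number of variants divided by the region length (in nucleotides)."""
    return n_variants / region_length


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p = sum(q for q in (prob(x) for x in range(lo, hi + 1))
            if q <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def compare_densities(v1: int, len1: int, v2: int, len2: int):
    """Densities of two regions and the P-value for their difference.

    Assumption: each nucleotide position is either a variant position or not.
    """
    p = fisher_exact_two_sided(v1, len1 - v1, v2, len2 - v2)
    return variant_density(v1, len1), variant_density(v2, len2), p
```

Calling `compare_densities(v_sites, len_sites, v_transcripts, len_transcripts)` with the counts for two regions returns both densities and the P-value for the comparison.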

Thermodynamic and target structure features for measure of variation effects

In addition to ΔGhybrid, a measure of hybrid stability computed by RNAhybrid (37), we consider four other thermodynamic and target structure features. ΔGtotal, a measure of total energy change, is the key characteristic of a two-step model for miRNA:target hybridization (33) and can be considered as a measure of potential for successful miRNA:target hybridization. We also computed three probabilistic measures of structural accessibility for the miRNA binding site, the 25-nt blocks upstream and downstream of the site as follows. For a block of nucleotides, the accessibility was computed by the average probability that each nucleotide in the block is single-stranded, based on the RNA secondary structure sampling algorithm implemented by Sfold (41,42).
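
As an illustration of the accessibility measure, the following sketch estimates per-nucleotide single-stranded probabilities from a sample of secondary structures and averages them over a block. It assumes the sampled structures are available in dot-bracket notation (an assumption about the input format, not a description of Sfold output) and that flanking blocks are truncated at the ends of the sequence. Function names are hypothetical.

```python
from typing import Dict, List


def unpaired_probabilities(samples: List[str]) -> List[float]:
    """Per-nucleotide probability of being single-stranded.

    Each sample is a dot-bracket string of equal length; '.' marks an
    unpaired nucleotide. The probability is the fraction of samples in
    which the nucleotide is unpaired.
    """
    n = len(samples[0])
    return [sum(s[i] == "." for s in samples) / len(samples) for i in range(n)]


def block_accessibility(probs: List[float], start: int, end: int) -> float:
    """Average unpaired probability over the 0-based half-open block [start, end)."""
    block = probs[max(0, start):min(len(probs), end)]
    return sum(block) / len(block) if block else float("nan")


def site_and_flank_accessibility(probs: List[float], site_start: int,
                                 site_end: int, flank: int = 25) -> Dict[str, float]:
    """Accessibility of the binding site and of its upstream/downstream blocks."""
    return {
        "site_access": block_accessibility(probs, site_start, site_end),
        "upstream_access": block_accessibility(probs, site_start - flank, site_start),
        "downstream_access": block_accessibility(probs, site_end, site_end + flank),
    }
```

Here `site_start` and `site_end` are positions on the target sequence, and the default `flank` of 25 matches the block length used in this work.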

For measuring the effects of genetic variants, ΔΔGhybrid was computed by [ΔGhybrid (mutant) − ΔGhybrid (wild type: WT)] to measure the variant effect on hybrid stability. ΔΔGtotal was computed by [ΔGtotal (mutant) − ΔGtotal (WT)] to measure the variant effect on the potential of miRNA:target hybridization. Δsite_access was computed by [site accessibility (mutant) − site accessibility (WT)] to measure the variant effect on structural accessibility of the miRNA binding site. Similarly, Δupstream_access and Δdownstream_access were computed to measure the variant effects on structural accessibility of the 25-nt blocks upstream and downstream of the binding site, respectively.
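
The five effect measures are simple mutant-minus-wild-type differences, which can be sketched as below. The container class and key names are hypothetical; the feature values themselves would come from the energy and accessibility calculations described above.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SiteFeatures:
    """Feature values for one miRNA:target interaction (one allele)."""
    dg_hybrid: float          # hybrid stability, kcal/mol
    dg_total: float           # total energy change, kcal/mol
    site_access: float        # accessibility of the binding site
    upstream_access: float    # accessibility of the 25-nt upstream block
    downstream_access: float  # accessibility of the 25-nt downstream block


def variant_effects(wt: SiteFeatures, mutant: SiteFeatures) -> Dict[str, float]:
    """Effect measures, each computed as (mutant value - wild-type value)."""
    return {
        "ddG_hybrid": mutant.dg_hybrid - wt.dg_hybrid,
        "ddG_total": mutant.dg_total - wt.dg_total,
        "d_site_access": mutant.site_access - wt.site_access,
        "d_upstream_access": mutant.upstream_access - wt.upstream_access,
        "d_downstream_access": mutant.downstream_access - wt.downstream_access,
    }
```

With this sign convention, a positive energy difference means the mutant allele yields a less favorable (less negative) energy than the wild type, and a positive accessibility difference means the mutant block is more accessible.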

Identification of miRNA-related variants for cancer genes or associated with diseases

We downloaded the list of cancer genes from the CancerGenes database (43). Using this list, we identified all miRNA-related variants in cancer genes that reside either in the miRNA binding sites or in the 25-nt flanking regions. Although genome-wide association studies (GWAS) have identified many genetic variants associated with diseases, very few miRNA-related variants in this study can be found in or corroborated by GWAS results. We thus performed a literature search using PMC databases to retrieve articles reporting the association between each of these miRNA-related variants and human phenotypes or diseases, and collected the miRNA-related variants that were reported to be associated with human phenotypes or diseases in one or more studies.

RESULTS

Genetic variation frequency in different regions

We identified 955,275 variants across 75,853 Ensembl transcripts. These include 302,797 (31.7%) variants with MAF ≥1% and 652,478 (68.3%) variants with MAF <1%. Among all of the miRNA binding sites from CLASH chimeras, ∼81.3% are seedless according to the definition of seed sites (38). Comparisons were made between whole transcriptome (defined by Ensembl transcripts) and CLASH transcripts, and between CLASH transcripts and miRNA binding sites within CLASH chimeras.

We focus on presenting results using the common MAF threshold of 1% to define common and rare variants. The conclusions are generalizable to a wide range of thresholds (Supplementary Figure S1). For both common and rare variants in the three target regions (mRNA, CDS and 3′ UTR), the densities for CLASH transcripts (blue bars in Figure 1A and B) are significantly lower than those for the whole transcriptome (red bars in Figure 1A and B), with all P-values under 0.04 for density comparisons. This indicates that miRNA binding sites are more likely to reside within those targets in the transcriptome with lower variant densities, consistent with a previous study (21). In all of the three target regions (mRNA, CDS and 3′ UTR), for common variants, the densities for miRNA binding sites (green bars in Figure 1A) are significantly lower than those for the CLASH transcripts (blue bars in Figure 1A), with all P-values under 0.03. For rare variants, the densities for miRNA binding sites (green bars in Figure 1B) are marginally lower than those for the CLASH transcripts (blue bars in Figure 1B). These results suggest that miRNA binding sites are more likely to reside within target regions in which nucleotides have low mutation frequencies. Moreover, for common variants, the densities for 3′ UTRs are substantially higher than those of CDSs (Figure 1A); for rare variants, the densities for CDSs are substantially higher than those of the 3′ UTRs (Figure 1B). These results indicate that rare variants tend to reside in CDSs, whereas common variants tend to reside in the 3′ UTRs. This may be due to the fact that under codon constraints, the CDS tends to be more conserved than the 3′ UTR.

Figure 1.

Variant densities in whole transcriptome, CLASH transcripts and miRNA binding sites for (A) common variants (MAF ≥ 1%); (B) rare variants (MAF <1%); (C) percentages of miRNA binding sites by evolutionary conservation levels; (D) density of variants (common or rare) with different MAF thresholds for miRNA binding sites grouped by conservation level.

We estimate that ∼43% of miRNA binding sites in the CLASH chimeras are highly conserved (conservation score >0.9), while ∼21% are poorly conserved (conservation score ≤0.1) (Figure 1C). This suggests that while many miRNA binding sites are conserved, a significant portion are species specific. For highly conserved sites, the variant density is significantly lower than that of other sites (P-value = 8.1e−28). Especially for common variants, the densities generally decrease with increasing conservation (Figure 1D). These findings are consistent with previous observations on predicted conserved miRNA binding sites (18).
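
The grouping of binding sites by conservation can be sketched as follows. The site score is the average of per-nucleotide scores, as described in the MATERIALS AND METHODS section, and the two cutoffs (>0.9 and ≤0.1) are those quoted above. The "intermediate" label for everything in between is an assumption made for this illustration, as are the function names.

```python
from statistics import mean
from typing import Sequence


def site_conservation(nucleotide_scores: Sequence[float]) -> float:
    """Conservation score of a site: average of its per-nucleotide scores."""
    return mean(nucleotide_scores)


def conservation_level(score: float) -> str:
    """Assign a site to a conservation group using the cutoffs in the text."""
    if score > 0.9:
        return "highly conserved"
    if score <= 0.1:
        return "poorly conserved"
    # Assumption: a single catch-all group between the two cutoffs.
    return "intermediate"
```

Variant densities can then be computed separately for the sites in each group.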

Effects of genetic variants within miRNA binding sites on miRNA:target interaction

Genetic variants within miRNA binding sites can have impact on miRNA:target hybridization through either altering local target structure and accessibility or disruption/creation of base pair(s). ΔΔGhybrid (see the MATERIALS AND METHODS section) was used in a previous study (17) to measure the effects of genetic variants on stability of the miRNA:target hybrid. A positive value of ΔΔGhybrid indicates a decrease in hybrid stability due to the variant, whereas a negative value indicates an increase in hybrid stability. However, ΔGhybrid does not measure target structural accessibility and the potential of miRNA:target hybridization. Here we compute four features ΔΔGtotal, Δsite_access, Δupstream_access and Δdownstream_access (see the MATERIALS AND METHODS section) for measuring the variant effects on target structure accessibility and the potential of miRNA:target hybridization. A positive ΔΔGtotal indicates a decrease in the potential of miRNA:target hybridization due to the variant, whereas a negative value indicates an increase in the potential. A positive value for Δsite_access, or Δupstream_access or Δdownstream_access indicates increased structural accessibility at the miRNA binding sites or the flanking region(s), whereas a negative value indicates decreased accessibility. A larger change in any of the energetic or accessibility measures above could have a greater impact on miRNA:target interactions. The values of ΔGhybrid, ΔGtotal, site_access, upstream_access and downstream_access for both wild type miRNA:target interactions and mutant miRNA:target interactions can be found in Supplementary Table S1, together with seed type, maximal length of continuous Watson–Crick base-pairing in the region from miRNA nucleotide 12 to the 3′ end, predicted conformation of miRNA:target hybrid for each miRNA binding site of CLASH chimeras and the GO categories for cancer genes. We identified a total of 4109 variants residing within miRNA binding sites. 
For the MAF threshold of 1%, there are 1047 common variants and 3062 rare variants. In general, the histograms representing distributions of effect measures for common variants are similar to those for rare variants.

The histograms of ΔΔGhybrid for common and rare variants in miRNA binding sites are shown in Figure 2A. For common variants, 44.3% decrease and 10.5% increase the hybrid stability by at least 1 kcal/mol. For rare variants, 49% decrease and 8.5% increase the hybrid stability by at least 1 kcal/mol. By varying the change in hybrid stability, we also present cumulative distributions of absolute value of ΔΔGhybrid for common and rare variants in miRNA binding sites (Supplementary Figure S2A). These indicate that a majority of variants in miRNA binding sites can alter hybrid stability. In particular, both common and rare variants tend to weaken the miRNA:target hybrid stability (P-values <2.3e−45).

Figure 2.

The histograms of effect measures for common (MAF ≥1%) and rare (MAF <1%) variants in miRNA binding sites (the horizontal axis intervals (a,b], [a,b), (a,b), [a,b] are defined by a<x≤b, a≤x<b, a<x<b, a≤x≤b, respectively, where x is the value of the feature). (A) ΔΔGhybrid; (B) ΔΔGtotal; (C) Δsite_access; (D) Δupstream_access; (E) Δdownstream_access.

The histograms of ΔΔGtotal for common and rare variants in miRNA binding sites are shown in Figure 2B. For common variants, 44.4% decrease and 35.5% increase the hybridization potential by at least 1 kcal/mol. For rare variants, 45.7% decrease and 34.2% increase the hybridization potential by at least 1 kcal/mol. By varying the change in the total hybridization energy, we also present cumulative distributions of the absolute value of ΔΔGtotal for common and rare variants in miRNA binding sites (Supplementary Figure S2B). Noticeably, even with a relatively high threshold of |ΔΔGtotal|, such as 5 kcal/mol, a substantial fraction (∼30%) of variants changed total hybridization energy. These results indicate that a majority of variants in miRNA binding sites can alter hybridization potential. In particular, both common and rare variants tend to reduce the potential of miRNA:target hybridization (P-values <2.8e−4).
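
The percentages and cumulative fractions reported for the energy-based measures can be tabulated with a sketch like the one below. It assumes that "at least 1 kcal/mol" includes the boundary value, and that a positive difference corresponds to a decrease in stability or hybridization potential. The function names are hypothetical.

```python
from typing import Dict, Sequence


def direction_fractions(deltas: Sequence[float], cutoff: float = 1.0) -> Dict[str, float]:
    """Fractions of variants that decrease or increase the measure by >= cutoff.

    A positive ddG means a decrease in stability or hybridization potential;
    a negative ddG means an increase.
    """
    n = len(deltas)
    return {
        "decrease": sum(d >= cutoff for d in deltas) / n,
        "increase": sum(d <= -cutoff for d in deltas) / n,
    }


def fraction_at_least(deltas: Sequence[float], threshold: float) -> float:
    """Fraction of variants with |ddG| >= threshold (one point of the cumulative curve)."""
    return sum(abs(d) >= threshold for d in deltas) / len(deltas)
```

Evaluating `fraction_at_least` over a range of thresholds gives a cumulative distribution of the absolute change for a set of variants.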

The histograms of Δsite_access, Δupstream_access and Δdownstream_access for common and rare variants in miRNA binding sites are shown in Figure 2C–E. For common variants, ∼82% can alter structural accessibility of the miRNA binding sites (with a cutoff of 0.01); 67% can alter the accessibility of the upstream region of 25 nts; and 66% can alter downstream accessibility. Comparable percentages were observed for the rare variants. By varying the change in accessibility, we also present cumulative distributions of absolute values of Δsite_access, Δupstream_access and Δdownstream_access, for common and rare variants in miRNA binding sites (Supplementary Figure S2C–E). These indicate that for variants in miRNA binding sites, the majority can alter structural accessibility of the miRNA binding sites and the flanking regions, thereby affecting the access to the target by miRNA–Argonaute complex. Moreover, we computed the mean MAF for variants (in miRNA binding sites) which affect site accessibility (|Δsite_access|>Θ) and those which do not (|Δsite_access|≤Θ), where Θ is a threshold varying from 0.01 to 0.2. We observed that variants which affect site accessibility have substantially lower mean MAF than those variants which do not (Supplementary Figure S4A). This indicates that variants (in miRNA binding sites) which affect site accessibility tend to have lower frequencies for minor alleles.
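
The comparison of mean MAF between variants that do and do not affect site accessibility can be sketched as follows. The input is assumed to be a list of (MAF, Δsite_access) pairs, one per variant; the function name and the handling of empty groups are choices made for this illustration.

```python
from typing import Iterable, Optional, Tuple


def mean_maf_by_accessibility(
    variants: Iterable[Tuple[float, float]], theta: float
) -> Tuple[Optional[float], Optional[float]]:
    """Mean MAF of variants with |d_site_access| > theta and of those with <= theta.

    Each variant is a (maf, d_site_access) pair. A group with no variants
    yields None rather than a mean.
    """
    affecting, others = [], []
    for maf, delta in variants:
        (affecting if abs(delta) > theta else others).append(maf)

    def mean_or_none(values):
        return sum(values) / len(values) if values else None

    return mean_or_none(affecting), mean_or_none(others)
```

Repeating the call for thresholds between 0.01 and 0.2 (for example 0.01, 0.05, 0.1 and 0.2; the intermediate values are illustrative) shows how the two means compare across the range of Θ.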

Effects of genetic variants near miRNA binding sites on miRNA:target interaction

Genetic variants within flanking regions of miRNA binding sites may also alter local target structure, affecting the potential of miRNA:target hybridization. Because these variants reside outside miRNA binding sites, ΔGhybrid is the same for all alleles (i.e. ΔΔGhybrid = 0) and thus is not useful for analysis of their effects. Therefore, we only consider the four structural features. Some variants (e.g. insertions or deletions of multiple nucleotides) can substantially change a flanking region. To facilitate analysis on flanking regions of a pre-specified length, we focus on SNP variants within a 25-nt block either upstream or downstream of the miRNA binding sites. For MAF threshold of 1%, we identified 1234 common SNPs and 3778 rare SNPs for upstream regions (Supplementary Table S2), and 1231 common SNPs and 3746 rare SNPs for downstream regions (Supplementary Table S3). Generally, the histograms of effect measures for common variants are very similar to those for rare variants. The values of the structural features for both the wild-type miRNA:target interactions and mutant miRNA:target interactions are also given in Supplementary Tables S2 and S3.
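
Assigning a SNP to the binding site or to one of its 25-nt flanking blocks can be sketched as below. The sketch assumes 1-based inclusive transcript coordinates and takes "upstream" to mean the 5′ side of the site on the target; both are assumptions for illustration, and the function name is hypothetical.

```python
from typing import Optional


def locate_snp(snp_pos: int, site_start: int, site_end: int,
               flank: int = 25) -> Optional[str]:
    """Classify a SNP position relative to a miRNA binding site.

    Coordinates are 1-based and inclusive on the transcript. Returns
    'site', 'upstream', 'downstream', or None if the SNP lies outside
    the site and both flanking blocks.
    """
    if site_start <= snp_pos <= site_end:
        return "site"
    if site_start - flank <= snp_pos < site_start:
        return "upstream"
    if site_end < snp_pos <= site_end + flank:
        return "downstream"
    return None
```

For a site spanning positions 100–121, a SNP at position 80 falls in the upstream block and a SNP at position 130 falls in the downstream block.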

The histograms of ΔΔGtotal for common and rare SNPs in upstream regions are shown in Figure 3A. Among common SNPs, 24.2% decrease and 21.5% increase the hybridization potential by at least 1 kcal/mol. A similar histogram is also shown for the rare SNPs. The results are similar for the downstream regions (Figure 3B). By varying the change in the total hybridization energy, we also present cumulative distributions of absolute value of ΔΔGtotal for common and rare variants in the flanking regions of miRNA binding sites (Supplementary Figure S3A and B). Overall, about half of the SNPs in the flanking regions of miRNA binding sites can alter the potential of the miRNA:target hybridization by at least 1 kcal/mol, and by over 10 kcal/mol in some cases. The nearly symmetric distributions indicate that these SNPs are nearly equally likely to decrease or increase miRNA:target hybridization potential. Furthermore, the histograms have heavier weights in the center than those in Figure 2B, indicating that the effects of variants outside miRNA binding sites are more moderate than those within the binding sites.

Figure 3.

The histograms of effect measures for common (MAF ≥1%) and rare (MAF <1%) SNPs in 25-nt blocks upstream or downstream of miRNA binding sites. (A) ΔΔGtotal for SNPs upstream of sites; (B) ΔΔGtotal for SNPs downstream of sites; (C) Δsite_access for SNPs upstream of sites; (D) Δsite_access for SNPs downstream of sites; (E) Δupstream_access for SNPs upstream of sites; (F) Δupstream_access for SNPs downstream of sites; (G) Δdownstream_access for SNPs upstream of sites; (H) Δdownstream_access for SNPs downstream of sites.

The histograms of Δsite_access, Δupstream_access and Δdownstream_access for common and rare SNPs in either upstream or downstream regions of miRNA binding sites are shown in Figure 3C–H. The histogram distribution for Δsite_access shows that ∼60% of common or rare SNPs near miRNA binding sites can alter the structural accessibility of the binding sites (with a cutoff of 0.01), even though they reside outside the sites. For either common or rare SNPs in the upstream regions, ∼80% can alter the structural accessibility of the upstream regions; 47% can alter the downstream accessibility. For common or rare SNPs in the downstream regions, ∼46% can alter the structural accessibility of the upstream regions; 80% can alter the downstream accessibility. These percentages of the accessibility-altering variants suggest that the effects decrease with increasing distance from a region to the variant (i.e. 80% > 60% > 47% for upstream SNPs; 46% < 60% < 80% for downstream SNPs). By varying the change in accessibility, we also present cumulative distributions of absolute values of Δsite_access, Δupstream_access and Δdownstream_access for common and rare variants in the flanking regions of miRNA binding sites (Supplementary Figure S3C–H). Overall, these results indicate that the majority of genetic variants near miRNA binding sites can alter the structural accessibility of both the binding sites and the flanking regions. Moreover, we computed the mean MAF for variants (in 25-nt flanking regions of miRNA binding sites) which affect site accessibility and those which do not, and observed the same result as for variants in miRNA binding sites (Supplementary Figure S4B). This indicates that among variants in 25-nt flanking regions of miRNA binding sites, those that affect site accessibility tend to have lower frequencies for minor alleles.

Association of variants within miRNA binding sites with human diseases or phenotypes

Among the 4109 variants within miRNA binding sites (Supplementary Table S1), we identified 28 common variants and one rare variant that are associated with human diseases or phenotypes (Supplementary Table S4). A majority of these variants can substantially alter the potential of the miRNA:target hybridization. For example, the variant rs1049255 (G>A) in the 3′ UTR of gene CYBA was reported to be associated with NADPH oxidase (NOX) activity, oxidative stress and acute kidney injury (44). It leads to lower mRNA and protein expression of CYBA and reduced NOX activity (45). Interestingly, the allele A increases the potential of hybridization between CYBA and miR-320a by 4.3 kcal/mol. The lower levels of gene expression and NOX activity may be interpreted by enhanced regulation of miR-320a expressed in the kidney. We note that ΔΔGhybrid is rather small in this case (−0.7 kcal/mol). Another example relates to the rare variant rs71653621 (A>G, MAF = 0.0014) in the CDS of gene PARK7, which was reported to cause early onset and familial Parkinson's disease (PD) (46). The report also showed that the mutation causes 1.3% decrease in PARK7 mRNA folding energy compared to the wild-type sequence in silico and suggested a possible small effect on PARK7 gene function (46). CLASH chimeras and our SNP analysis revealed an interaction of miR-92b:PARK7 with rs71653621 residing within the miRNA binding site. The mutation increases the potential of hybridization between PARK7 and miR-92b by 6.3 kcal/mol, suggesting possible effect on miRNA-mediated gene regulation. We note that the ΔΔGhybrid value is also rather small in this case (0.3 kcal/mol). This presents a striking example of a rare genetic variant being associated with human disease.

Association of SNPs near miRNA binding sites with human diseases or phenotypes

Among the 5012 SNPs in the 25-nt blocks upstream of miRNA binding sites (Supplementary Table S2), we identified 20 common SNPs and one rare SNP that are associated with human diseases or phenotypes (Supplementary Table S5). Among the 4977 SNPs in the 25-nt blocks downstream of miRNA binding sites (Supplementary Table S3), we identified 24 common SNPs and one rare SNP that are associated with human diseases or phenotypes (Supplementary Table S6). A majority of the upstream and downstream SNPs can substantially alter the potential of the miRNA:target hybridization. For example, the SNP rs2228075 (G>A) in CDS of gene IMPDH1 is upstream of an miR-615-3p binding site identified by CLASH. This SNP was suggested to be adequate for the identification of patients at high risk of mycophenolate mofetil gastrointestinal intolerance (47). miR-615-3p was found to be expressed in colorectal cells (48). We observed large ΔΔGtotal of about −5 kcal/mol for the allele mutation G>A, indicating a substantial enhancement of the potential of the hybridization between IMPDH1 and miR-615-3p and potential decrease of IMPDH1 expression level. It may provide an interpretation for the high risk of mycophenolate mofetil gastrointestinal intolerance, since IMPDH1 is a regulation receptor in response to mycophenolate concentration (49). Another example relates to the SNP rs3088440 (C>T) in 3′ UTR of gene CDKN2A. This SNP is downstream of miR-10b binding site identified by CLASH and is associated with melanoma risk and second primary malignancy risk after index squamous cell carcinoma of the head and neck (50,51). For ΔΔGtotal, we observed a large value of 4.1 kcal/mol for miR-10b:CDKN2A hybridization.

DISCUSSION

Previous studies were primarily based on predicted miRNA binding sites, particularly seed sites, and in a few cases involved small numbers of validated miRNA binding sites. However, the reliance on the seed sites is a major limitation, because an overwhelming majority of predicted seed sites were not supported by the CLIP technique for miRNA binding identification (52). Furthermore, only ∼18.7% of the miRNA binding sites from the CLASH chimeras are seed sites. To avoid potential biases, we based our analyses on the large set of miRNA binding sites experimentally identified by CLASH. In addition, previous work only examined common variants within miRNA binding sites. In this work, we also studied variants near miRNA binding sites as well as rare variants that may contribute to an individual's risk of certain phenotypes or diseases (23–28). Rare variants have been largely unexplored. Examination of the effects of miRNA-related rare variants complements existing techniques for the identification of candidate causal variants. The rare variants with large effects in this work could be promising candidates for causal variants in disease-association research.

It has been postulated that rare variants tend to have stronger biological effects while common variants tend to have weaker biological effects (53). This is consistent with our findings that rare variants tend to reside in CDSs, whereas common variants tend to reside in the 3′ UTRs. Rare variants could have greater biological effects by altering protein sequences, whereas common variants are often involved in post-transcriptional regulation through regulatory regions in the 3′ UTRs. On the other hand, miRNA binding sites are more likely to reside within those targets in the transcriptome with lower variant densities, especially target regions in which nucleotides have low mutation frequencies.

Previous studies were limited to the miRNA:target hybrid stability measured by ΔGhybrid. This feature ignores the effects of local target structure that have been shown to be important for target binding by miRNAs (30–33). In addition, it is not useful for studying the effects of variants residing outside miRNA binding sites. To address these limitations, we considered four structure-based features. These features provide new insights into the effects of genetic variants on the potential of miRNA:target hybridization as well as the structural accessibility of both the binding site and flanking regions. Moreover, they also facilitate the examination of effects of genetic variants near miRNA binding sites. For the cases with disease associations examined here, we observed a substantial ΔΔGtotal, but rather small ΔΔGhybrid. This observation and the findings from the previous studies (29,34,54) suggest that alteration in local target structure can be an important mechanism for genetic variants to have biological effects, some of which are associated with diseases or phenotypes.

We identified a list of variants that are associated with human phenotypes and diseases, and showed that such associations could be interpreted by the effects of variants on target binding by miRNAs. The reliable large set of miRNA binding sites from CLASH and broad gene regulation by miRNAs make our comprehensive list of miRNA-related variants with their effect measures valuable for the discovery of new associations between genetic variations and human diseases or phenotypes. Our findings also present a general mechanistic interpretation for certain associations between genetic variants and diseases, i.e. modulation of miRNA-mediated gene regulation by common or rare genetic variants within or near miRNA binding sites. In particular, among our list of miRNA-related genetic variants within cancer genes, some may be promising candidates for causal cancer variants.

We have shown that the genetic variants within or near miRNA binding sites can affect miRNA:target interactions. Such miRNA-related variants can be reliably identified by using miRNA:target interactions directly observed by CLASH. Therefore, available associations between these variants and human diseases could be used to infer associations between miRNAs and human diseases. This provides a means for the identification of miRNAs as potential biomarkers for human diseases. Our data will facilitate such investigations.

CLASH provides much more accurate miRNA binding site information than CLIP methods (55,56), as the two RNA molecules are in close proximity to each other. Further, the strong binding energies of chimeric reads indicate that these have resulted from genuine RNA–RNA interactions rather than from proximity-induced ligation of non-interacting RNAs in solution. Additionally, control experiments indicated that more than 98% of the miRNA–target RNA interactions identified by CLASH had formed in vivo in human cells, largely ruling out the possibility of false interactions formed in vitro (36). Despite less accurate binding sites from PAR-CLIP (56), all of the observations from density comparisons and conservation analyses (Figure 1) also hold for PAR-CLIP data (Supplementary Figure S5).

CLASH data are limited to abundant miRNAs and transcripts expressed in the cell line used, with chimeric read throughput dictated by ligation efficiency, and thus represent only a subset of all miRNA:target interactions in the human transcriptome. Our findings are based on the CLASH data; however, they may be generalizable, especially if the CLASH-identified miRNA:target interactions represent a statistical sample of all interactions in the human transcriptome. Genetic variants can create new predicted miRNA seed sites (19). However, it is unknown to what extent these sites are effective for miRNA binding. An analysis revealed that over 90% of seed sites were not bound according to CLIP data (52).

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGMENTS

The Computational Molecular Biology and Statistics Core at the Wadsworth Center is acknowledged for supporting computing resources for this work. The authors thank Grzegorz Kudla for clarification on the CLASH data.

Authors’ contributions: Y.D. conceived and supervised the study. C.L. performed the analyses. W.R. and C.C. provided computer system support for cluster computing. J.L. and S.K. assisted with the presentation of the manuscript and provided biological insights. J.C. provided biological insights. Y.D. and C.L. drafted the manuscript. All authors read and approved the final manuscript.

FUNDING

National Science Foundation [DBI-0650991 to Y.D.]; National Institutes of Health (NIH) [GM099811 to Y.D., J.L.; R01CA149109 to J.L.]. Funding for open access charge: NIH: GM099811.

Conflict of interest statement. None declared.

REFERENCES

  • 1. Knight J.C. Regulatory polymorphisms underlying complex disease traits. J. Mol. Med. 2005;83:97–109. doi: 10.1007/s00109-004-0603-7.
  • 2. Wang X., Tomso D.J., Liu X., Bell D.A. Single nucleotide polymorphism in transcriptional regulatory regions and expression of environmentally responsive genes. Toxicol. Appl. Pharmacol. 2005;207:84–90. doi: 10.1016/j.taap.2004.09.024.
  • 3. Sethupathy P., Collins F.S. MicroRNA target site polymorphisms and human disease. Trends Genet. 2008;24:489–497. doi: 10.1016/j.tig.2008.07.004.
  • 4. Ryan B.M., Robles A.I., Harris C.C. Genetic variation in microRNA networks: the implications for cancer research. Nat. Rev. Cancer. 2010;10:389–402. doi: 10.1038/nrc2867.
  • 5. Griffiths-Jones S., Saini H.K., van Dongen S., Enright A.J. miRBase: tools for microRNA genomics. Nucleic Acids Res. 2008;36:D154–D158. doi: 10.1093/nar/gkm952.
  • 6. Lewis B.P., Burge C.B., Bartel D.P. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 2005;120:15–20. doi: 10.1016/j.cell.2004.12.035.
  • 7. Erson A.E., Petty E.M. MicroRNAs in development and disease. Clin. Genet. 2008;74:296–306. doi: 10.1111/j.1399-0004.2008.01076.x.
  • 8. Fabian M.R., Sonenberg N. The mechanics of miRNA-mediated gene silencing: a look under the hood of miRISC. Nat. Struct. Mol. Biol. 2012;19:586–593. doi: 10.1038/nsmb.2296.
  • 9. Abelson J.F., Kwan K.Y., O'Roak B.J., Baek D.Y., Stillman A.A., Morgan T.M., Mathews C.A., Pauls D.L., Rasin M.R., Gunel M., et al. Sequence variants in SLITRK1 are associated with Tourette's syndrome. Science. 2005;310:317–320. doi: 10.1126/science.1116502.
  • 10. Chin L.J., Ratner E., Leng S., Zhai R., Nallur S., Babar I., Muller R.U., Straka E., Su L., Burki E.A., et al. A SNP in a let-7 microRNA complementary site in the KRAS 3′ untranslated region increases non-small cell lung cancer risk. Cancer Res. 2008;68:8535–8540. doi: 10.1158/0008-5472.CAN-08-2129.
  • 11. Mencia A., Modamio-Hoybjor S., Redshaw N., Morin M., Mayo-Merino F., Olavarrieta L., Aguirre L.A., del Castillo I., Steel K.P., Dalmay T., et al. Mutations in the seed region of human miR-96 are responsible for nonsyndromic progressive hearing loss. Nat. Genet. 2009;41:609–613. doi: 10.1038/ng.355.
  • 12. Adams B.D., Furneaux H., White B.A. The micro-ribonucleic acid (miRNA) miR-206 targets the human estrogen receptor-alpha (ERalpha) and represses ERalpha messenger RNA and protein expression in breast cancer cell lines. Mol. Endocrinol. 2007;21:1132–1147. doi: 10.1210/me.2007-0022.
  • 13. Clop A., Marcq F., Takeda H., Pirottin D., Tordoir X., Bibe B., Bouix J., Caiment F., Elsen J.M., Eychenne F., et al. A mutation creating a potential illegitimate microRNA target site in the myostatin gene affects muscularity in sheep. Nat. Genet. 2006;38:813–818. doi: 10.1038/ng1810.
  • 14. Godshalk S.E., Paranjape T., Nallur S., Speed W., Chan E., Molinaro A.M., Bacchiocchi A., Hoyt K., Tworkoski K., Stern D.F. A variant in a microRNA complementary site in the 3′ UTR of the KIT oncogene increases risk of acral melanoma. Oncogene. 2010;30:1542–1550. doi: 10.1038/onc.2010.536.
  • 15. Jensen K.P., Covault J., Conner T.S., Tennen H., Kranzler H.R., Furneaux H.M. A common polymorphism in serotonin receptor 1B mRNA moderates regulation by miR-96 and associates with aggressive human behaviors. Mol. Psychiatry. 2009;14:381–389. doi: 10.1038/mp.2008.15.
  • 16. Landi D., Gemignani F., Barale R., Landi S. A catalog of polymorphisms falling in microRNA-binding regions of cancer genes. DNA Cell Biol. 2008;27:35–43. doi: 10.1089/dna.2007.0650.
  • 17. Gong J., Tong Y., Zhang H.M., Wang K., Hu T., Shan G., Sun J., Guo A.Y. Genome-wide identification of SNPs in microRNA genes and the SNP effects on microRNA target binding and biogenesis. Hum. Mutat. 2011;33:254–263. doi: 10.1002/humu.21641.
  • 18. Chen K., Rajewsky N. Natural selection on human microRNA binding sites inferred from SNP data. Nat. Genet. 2006;38:1452–1456. doi: 10.1038/ng1910.
  • 19. Saunders M.A., Liang H., Li W.H. Human polymorphism at microRNAs and microRNA target sites. Proc. Natl Acad. Sci. U.S.A. 2007;104:3300–3305. doi: 10.1073/pnas.0611347104.
  • 20. Richardson K., Lai C.Q., Parnell L.D., Lee Y.C., Ordovas J.M. A genome-wide survey for SNPs altering microRNA seed sites identifies functional candidates in GWAS. BMC Genomics. 2011;12:504. doi: 10.1186/1471-2164-12-504.
  • 21. Hu Z., Bruno A.E. The influence of 3′UTRs on microRNA function inferred from human SNP data. Comp. Funct. Genomics. 2011:910769. doi: 10.1155/2011/910769.
  • 22. Ørom U.A., Lund A.H. Experimental identification of microRNA targets. Gene. 2010;451:1–5. doi: 10.1016/j.gene.2009.11.008.
  • 23. Altshuler D.M., Gibbs R.A., Peltonen L., Dermitzakis E., Schaffner S.F., Yu F., Bonnen P.E., de Bakker P.I., Deloukas P., Gabriel S.B., et al. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467:52–58. doi: 10.1038/nature09298.
  • 24. Johansen C.T., Wang J., Lanktree M.B., Cao H., McIntyre A.D., Ban M.R., Martins R.A., Kennedy B.A., Hassell R.G., Visser M.E., et al. Excess of rare variants in genes identified by genome-wide association study of hypertriglyceridemia. Nat. Genet. 2010;42:684–687. doi: 10.1038/ng.628.
  • 25. Cirulli E.T., Goldstein D.B. Uncovering the roles of rare variants in common disease through whole-genome sequencing. Nat. Rev. Genet. 2010;11:415–425. doi: 10.1038/nrg2779.
  • 26. Sebat J., Levy D.L., McCarthy S.E. Rare structural variants in schizophrenia: one disorder, multiple mutations; one mutation, multiple disorders. Trends Genet. 2009;25:528–535. doi: 10.1016/j.tig.2009.10.004.
  • 27. Elia J., Gai X., Xie H.M., Perin J.C., Geiger E., Glessner J.T., D'Arcy M., deBerardinis R., Frackelton E., Kim C., et al. Rare structural variants found in attention-deficit hyperactivity disorder are preferentially associated with neurodevelopmental genes. Mol. Psychiatry. 2009;15:637–646. doi: 10.1038/mp.2009.57.
  • 28. Bodmer W., Bonilla C. Common and rare variants in multifactorial susceptibility to common diseases. Nat. Genet. 2008;40:695–701. doi: 10.1038/ng.f.136.
  • 29. Halvorsen M., Martin J.S., Broadaway S., Laederach A. Disease-associated mutations that alter the RNA structural ensemble. PLoS Genet. 2010;6:e1001074. doi: 10.1371/journal.pgen.1001074.
  • 30. Zhao Y., Samal E., Srivastava D. Serum response factor regulates a muscle-specific microRNA that targets Hand2 during cardiogenesis. Nature. 2005;436:214–220. doi: 10.1038/nature03817.
  • 31. Robins H., Li Y., Padgett R.W. Incorporating structure to predict microRNA targets. Proc. Natl Acad. Sci. U.S.A. 2005;102:4006–4009. doi: 10.1073/pnas.0500775102.
  • 32. Kertesz M., Iovino N., Unnerstall U., Gaul U., Segal E. The role of site accessibility in microRNA target recognition. Nat. Genet. 2007;39:1278–1284. doi: 10.1038/ng2135.
  • 33. Long D., Lee R., Williams P., Chan C.Y., Ambros V., Ding Y. Potent effect of target structure on microRNA function. Nat. Struct. Mol. Biol. 2007;14:287–294. doi: 10.1038/nsmb1226.
  • 34. Haas U., Sczakiel G., Laufer S.D. MicroRNA-mediated regulation of gene expression is affected by disease-associated SNPs within the 3′-UTR via altered RNA structure. RNA Biol. 2012;9:924–937. doi: 10.4161/rna.20497.
  • 35. Mishra P.J., Humeniuk R., Longo-Sorbello G.S., Banerjee D., Bertino J.R. A miR-24 microRNA binding-site polymorphism in dihydrofolate reductase gene leads to methotrexate resistance. Proc. Natl Acad. Sci. U.S.A. 2007;104:13513–13518. doi: 10.1073/pnas.0706217104.
  • 36. Helwak A., Kudla G., Dudnakova T., Tollervey D. Mapping the human miRNA interactome by CLASH reveals frequent noncanonical binding. Cell. 2013;153:654–665. doi: 10.1016/j.cell.2013.03.043.
  • 37. Rehmsmeier M., Steffen P., Hochsmann M., Giegerich R. Fast and effective prediction of microRNA/target duplexes. RNA. 2004;10:1507–1517. doi: 10.1261/rna.5248604.
  • 38. Bartel D.P. MicroRNAs: target recognition and regulatory functions. Cell. 2009;136:215–233. doi: 10.1016/j.cell.2009.01.002.
  • 39. Siepel A., Bejerano G., Pedersen J.S., Hinrichs A.S., Hou M., Rosenbloom K., Clawson H., Spieth J., Hillier L.W., Richards S., et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005;15:1034–1050. doi: 10.1101/gr.3715005.
  • 40. Fisher R.A. Statistical Methods for Research Workers. Edinburgh, London: Oliver and Boyd; 1954.
  • 41. Ding Y., Lawrence C.E. Statistical prediction of single-stranded regions in RNA secondary structure and application to predicting effective antisense target sites and beyond. Nucleic Acids Res. 2001;29:1034–1046. doi: 10.1093/nar/29.5.1034.
  • 42. Ding Y., Lawrence C.E. A statistical sampling algorithm for RNA secondary structure prediction. Nucleic Acids Res. 2003;31:7280–7301. doi: 10.1093/nar/gkg938.
  • 43. Higgins M.E., Claremont M., Major J.E., Sander C., Lash A.E. CancerGenes: a gene selection resource for cancer genome projects. Nucleic Acids Res. 2007;35:D721–D726. doi: 10.1093/nar/gkl811.
  • 44. Perianayagam M.C., Tighiouart H., Nievergelt C.M., O'Connor D.T., Liangos O., Jaber B.L. CYBA gene polymorphisms and adverse outcomes in acute kidney injury: a prospective cohort study. Nephron Extra. 2011;1:112–123. doi: 10.1159/000333017.
  • 45. Schirmer M., Hoffmann M., Kaya E., Tzvetkov M., Brockmoller J. Genetic polymorphisms of NAD(P)H oxidase: variation in subunit expression and enzyme activity. Pharmacogenomics J. 2008;8:297–304. doi: 10.1038/sj.tpj.6500467.
  • 46. Anvret A., Blackinton J.G., Westerlund M., Ran C., Sydow O., Willows T., Hakansson A., Nissbrandt H., Belin A.C. DJ-1 mutations are rare in a Swedish Parkinson Cohort. Open Neurol. J. 2011;5:8–11. doi: 10.2174/1874205X01105010008.
  • 47. Ohmann E.L., Burckart G.J., Chen Y., Pravica V., Brooks M.M., Zeevi A., Webber S.A. Inosine 5′-monophosphate dehydrogenase 1 haplotypes and association with mycophenolate mofetil gastrointestinal intolerance in pediatric heart transplant patients. Pediatr. Transplant. 2010;14:891–895. doi: 10.1111/j.1399-3046.2010.01367.x.
  • 48. Cummins J.M., He Y., Leary R.J., Pagliarini R., Diaz L.A., Jr, Sjoblom T., Barad O., Bentwich Z., Szafranska A.E., Labourier E., et al. The colorectal microRNAome. Proc. Natl Acad. Sci. U.S.A. 2006;103:3687–3692. doi: 10.1073/pnas.0511155103.
  • 49. Bremer S., Vethe N.T., Rootwelt H., Bergan S. Expression of IMPDH1 is regulated in response to mycophenolate concentration. Int. Immunopharmacol. 2009;9:173–180. doi: 10.1016/j.intimp.2008.10.017.
  • 50. Maccioni L., Rachakonda P.S., Bermejo J.L., Planelles D., Requena C., Hemminki K., Nagore E., Kumar R. Variants at the 9p21 locus and melanoma risk. BMC Cancer. 2013;13:325. doi: 10.1186/1471-2407-13-325.
  • 51. Zhang Y., Sturgis E.M., Zafereo M.E., Wei Q., Li G. p14ARF genetic polymorphisms and susceptibility to second primary malignancy in patients with index squamous cell carcinoma of the head and neck. Cancer. 2011;117:1227–1235. doi: 10.1002/cncr.25605.
  • 52. Liu C., Mallick B., Long D., Rennie W.A., Wolenc A., Carmack C.S., Ding Y. CLIP-based prediction of mammalian microRNA binding sites. Nucleic Acids Res. 2013;41:e138. doi: 10.1093/nar/gkt435.
  • 53. Manolio T.A., Collins F.S., Cox N.J., Goldstein D.B., Hindorff L.A., Hunter D.J., McCarthy M.I., Ramos E.M., Cardon L.R., Chakravarti A., et al. Finding the missing heritability of complex diseases. Nature. 2009;461:747–753. doi: 10.1038/nature08494.
  • 54. Salari R., Kimchi-Sarfaty C., Gottesman M.M., Przytycka T.M. Sensitive measurement of single-nucleotide polymorphism-induced changes of RNA conformation: application to disease studies. Nucleic Acids Res. 2012;41:44–53. doi: 10.1093/nar/gks1009.
  • 55. Chi S.W., Zang J.B., Mele A., Darnell R.B. Argonaute HITS-CLIP decodes microRNA-mRNA interaction maps. Nature. 2009;460:479–486. doi: 10.1038/nature08170.
  • 56. Hafner M., Landthaler M., Burger L., Khorshid M., Hausser J., Berninger P., Rothballer A., Ascano M., Jr, Jungkamp A.C., Munschauer M., et al. Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell. 2010;141:129–141. doi: 10.1016/j.cell.2010.03.009.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
