Nucleic Acids Research. 2006 Nov 11;35(Database issue):D51–D54. doi: 10.1093/nar/gkl797

PolymiRTS Database: linking polymorphisms in microRNA target sites with complex traits

Lei Bao 1,2, Mi Zhou 1,2, Ligang Wu 5, Lu Lu 2,3,6, Dan Goldowitz 2,3, Robert W Williams 2,3,4, Yan Cui 1,2,*
PMCID: PMC1669716  PMID: 17099235

Abstract

The Polymorphism in microRNA Target Site (PolymiRTS) database is a collection of naturally occurring DNA variations in putative microRNA target sites. PolymiRTSs may affect gene expression and cause variations in complex phenotypes. The database integrates sequence polymorphism, phenotype and expression microarray data, and characterizes PolymiRTSs as potential candidates responsible for quantitative trait locus (QTL) effects. It is a resource for studying PolymiRTSs and their implications for phenotypic variation. The PolymiRTS database can be accessed at http://compbio.utmem.edu/miRSNP/.

INTRODUCTION

Identification of causal genetic variants underlying complex traits is a major goal in genetic studies. Linkage analysis has long been used to discover chromosomal intervals harboring sequence variants that cause variations in quantitative traits. A typical quantitative trait locus (QTL) interval contains from several dozen to several hundred genes, so it is critical to focus on the genetic variants in the interval that are most likely to have functional impacts. Among them, nonsynonymous single nucleotide polymorphisms (SNPs) that alter protein sequences and regulatory polymorphisms that affect gene expression are natural high-priority candidates. Although regulatory polymorphisms are much more challenging to identify and characterize, experimental and analytical tools are being actively developed for this purpose (1–4). Polymorphisms in miRNA target sites (PolymiRTS) represent a specific class of regulatory polymorphisms that may regulate posttranscriptional gene expression. A recent study reports that a PolymiRTS can underlie the effects of physiological QTLs (pQTLs) that control classic higher order traits (5). MicroRNAs (miRNAs) are a family of small RNAs that pair with the transcripts of protein-coding genes and cause translational repression or mRNA destabilization (6,7). Hundreds of miRNAs have been identified in humans and mice, and many of them have been shown to regulate target genes that control diverse biological processes such as differentiation, proliferation, apoptosis and morphogenesis (7). PolymiRTS may affect the base-pairing process and hence the miRNA-mediated gene repression, which in turn may cause phenotypic variations. It has been found that miRNA-mediated target mRNA destabilization is widespread in mammals (8–10). Thus, for miRNAs acting by this mechanism, PolymiRTS may lead to heritable variations in gene expression.
Variations in gene expression across a population can be assessed by the recently developed genetical genomics approach (11–13), which treats gene expression level as a quantitative trait. Linkage mapping is then used to discover the genetic loci regulating gene expression traits (eQTLs). A PolymiRTS may induce a cis-acting eQTL that coincides with the gene's physical location. We propose a simple conceptual model (Figure 1) that represents information flow from PolymiRTS to complex trait via a cis-acting eQTL. Based on this model, we created a database integrating SNP, phenotype and expression microarray data for human and mouse.

Figure 1. Conceptual QTL model. A PolymiRTS (triangle) may cause gene expression variation (diamond) in a segregating population, so that a cis-acting eQTL is observed. The variation in gene expression may in turn cause phenotypic variation (rectangle), so that a pQTL is observed.

DATA SOURCES AND PROCESSING

Identifying and annotating PolymiRTS

The method of identifying and annotating PolymiRTS is shown in Figure 2. SNPs located in the 3′-untranslated regions (3′-UTRs) of all known genes in the UCSC genome annotation (mouse: mm7 and human: hg18) (14) were extracted from dbSNP build 126 (15). Genomic locations of these SNPs were mapped onto mRNAs. For each SNP, we assessed whether its two alleles lead to different miRNA target sites. To be conservative, we considered only the 3′-UTR SNPs that affect the match to the seed region of the miRNA. Mature miRNA sequences were downloaded from miRBase (16). We used the criteria of TargetScanS (17) to predict miRNA sites: besides requiring a perfect Watson–Crick match to the seed (nt 2–7 of the miRNA), we further required either a perfect match to the 8th nt of the miRNA or an anchor adenosine immediately downstream of the seed match in the target. We assigned each PolymiRTS to one of four classes: ‘D’ (an allele disrupts a conserved miRNA site), ‘N’ (a derived allele disrupts a nonconserved miRNA site), ‘C’ (a derived allele creates a new miRNA site) and ‘O’ (other cases, in which the ancestral allele cannot be determined unambiguously). PolymiRTS of class ‘C’ may cause abnormal gene repression, and PolymiRTS of class ‘D’ may cause loss of normal repression control; these two classes are the most likely to have functional impacts. We used the pre-calculated 17-way Multiz alignments of vertebrate genomes to derive the annotations. For a miRNA site to be considered conserved, we required that it be present in at least two other vertebrate genomes in addition to the query genome. For mouse SNPs, ancestral alleles were determined from the mouse versus rat (rn3) genome alignment. For human SNPs, ancestral alleles were determined from the human versus chimpanzee (panTro1) genome alignment.
Additionally, we separately categorized PolymiRTS with A/G alleles, because they are expected to be less deleterious owing to their ability to form G:U wobble base pairs with miRNAs.
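
The seed-matching rule described above can be illustrated with a short Python sketch. This is not the TargetScanS implementation or the code used to build the database; the function names, the toy 3′-UTR and the SNP position are hypothetical, and the miRNA sequence is used only as an example. The sketch looks for the reverse complement of miRNA nt 2–7 in the 3′-UTR and keeps a hit only if it is flanked by a match to miRNA nt 8 (immediately upstream in the mRNA) or by an adenosine immediately downstream.

```python
# Minimal sketch of the seed rule described in the text (assumed helper names).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def revcomp(rna):
    """Reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_seed_sites(utr, mirna):
    """Return 0-based start positions of qualifying 6-mer seed matches in utr.

    A hit needs a perfect match to miRNA nt 2-7 plus either a match to
    miRNA nt 8 (the base just 5' of the 6-mer in the mRNA) or an anchor A
    (the base just 3' of the 6-mer in the mRNA).
    """
    utr = utr.upper().replace("T", "U")
    mirna = mirna.upper().replace("T", "U")
    seed_match = revcomp(mirna[1:7])   # pairs with miRNA nt 2-7
    m8_base = COMPLEMENT[mirna[7]]     # pairs with miRNA nt 8
    sites = []
    start = utr.find(seed_match)
    while start != -1:
        end = start + 6
        has_m8 = start > 0 and utr[start - 1] == m8_base
        has_anchor_a = end < len(utr) and utr[end] == "A"
        if has_m8 or has_anchor_a:
            sites.append(start)
        start = utr.find(seed_match, start + 1)
    return sites


def sites_by_allele(utr, snp_pos, alleles, mirna):
    """Predict sites separately for each allele placed at snp_pos (0-based)."""
    return {
        allele: find_seed_sites(utr[:snp_pos] + allele + utr[snp_pos + 1:], mirna)
        for allele in alleles
    }


# Illustrative inputs only: a toy UTR and an example miRNA sequence.
example_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAGGCUACCUCAGGUU"
print(sites_by_allele(toy_utr, 8, ["C", "U"], example_mirna))
```

In this toy case the site is found with one allele and lost with the other, which is the situation in which a 3′-UTR SNP would be recorded as affecting the seed match.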

Figure 2. Methods of identifying and annotating PolymiRTS.
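
The four-class annotation can be sketched as a small decision function. This is only one possible reading of the class definitions: the text does not state how the classes are prioritized, so the sketch assumes that a conserved site is labelled ‘D’ regardless of which allele is ancestral, and that ‘O’ is used when the ancestral allele is unknown. The function and argument names are hypothetical.

```python
def is_conserved(other_genomes_with_site, min_other_genomes=2):
    """A site counts as conserved if seen in at least two other vertebrate genomes."""
    return other_genomes_with_site >= min_other_genomes


def classify_polymirts(site_with_allele, ancestral_allele, other_genomes_with_site):
    """Assign 'D', 'N', 'C' or 'O'; return None if the SNP does not change the site.

    site_with_allele: dict mapping each of the two alleles to True/False
        (whether a miRNA site is predicted with that allele).
    ancestral_allele: one of the two alleles, or None if undetermined.
    other_genomes_with_site: number of other vertebrate genomes with the site.
    """
    present = [a for a, has_site in site_with_allele.items() if has_site]
    if len(present) != 1:
        return None  # both alleles or neither allele give a site
    if is_conserved(other_genomes_with_site):
        return "D"   # assumption: conservation takes precedence
    if ancestral_allele not in site_with_allele:
        return "O"   # ancestral allele could not be determined
    if site_with_allele[ancestral_allele]:
        return "N"   # derived allele disrupts a nonconserved site
    return "C"       # derived allele creates a new site


print(classify_polymirts({"C": True, "U": False}, "U", 0))
```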

PolymiRTSs within cis-acting expression QTL (eQTL) intervals

The genes with both a cis-acting eQTL and a PolymiRTS are featured in the database. First, gene expression levels in cerebellum, hippocampus, striatum, eye, whole brain and hematopoietic cells (18) were assessed in recombinant inbred mouse strains (BXD) derived from the two parental strains C57BL/6J and DBA/2J. Gene expression levels were treated as quantitative traits and mapped onto genomic regions (eQTLs) using standard marker regression. A gene is said to have a suggestive (significant) cis-acting eQTL if the LOD (logarithm of odds) peak is within 10 Mb of the gene's physical location and the LOD is >2.8 (>4.3) (19). Second, gene expression levels in lymphoblastoid cells of 194 human individuals from 14 CEPH families (20) were downloaded from the GEO database (21), and the raw data were processed using the RMA protocol (22). Genotypes for 1628 autosomal SNP markers were downloaded from The SNP Consortium database (23). We used Merlin (24) to remove genotype errors and perform family-based linkage analysis. A gene is said to have a cis-acting eQTL if the LOD peak is within 10 Mb of the gene's physical location and the P-value is <0.05.
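
For the mouse data, the cis-eQTL rule can be sketched as follows. This is an illustration rather than the pipeline used for the database: the LOD score is computed with the standard single-marker regression formula LOD = (n/2)·log10(RSS0/RSS1), where RSS0 is the residual sum of squares around the overall mean and RSS1 is the residual sum of squares around the genotype-class means, and the function names, strain values and positions are made up for the example.

```python
import math


def marker_regression_lod(expression, genotypes):
    """LOD score for one marker: (n/2) * log10(RSS0 / RSS1)."""
    n = len(expression)
    grand_mean = sum(expression) / n
    rss0 = sum((y - grand_mean) ** 2 for y in expression)
    rss1 = 0.0
    for genotype in set(genotypes):
        values = [y for y, g in zip(expression, genotypes) if g == genotype]
        class_mean = sum(values) / len(values)
        rss1 += sum((y - class_mean) ** 2 for y in values)
    if rss1 == 0:
        return math.inf  # genotype explains the trait perfectly
    return (n / 2) * math.log10(rss0 / rss1)


def call_cis_eqtl(peak_lod, peak_pos_bp, gene_pos_bp, same_chrom=True,
                  window_bp=10_000_000, suggestive=2.8, significant=4.3):
    """Apply the thresholds from the text; return a label or None."""
    if not same_chrom or abs(peak_pos_bp - gene_pos_bp) > window_bp:
        return None
    if peak_lod > significant:
        return "significant"
    if peak_lod > suggestive:
        return "suggestive"
    return None


# Made-up expression values for eight strains; B and D stand for the two
# parental genotypes at the marker.
expr = [1.0, 1.2, 0.8, 1.0, 3.0, 3.2, 2.8, 3.0]
geno = ["B", "B", "B", "B", "D", "D", "D", "D"]
lod = marker_regression_lod(expr, geno)
print(round(lod, 2), call_cis_eqtl(lod, 50_000_000, 55_000_000))
```

The human family data were analyzed with Merlin and a P-value cutoff, which this sketch does not attempt to reproduce.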

PolymiRTSs in physiological QTL (pQTL) intervals

We first mapped the QTLs (with a LOD >2.8) for more than 800 published BXD phenotypes (physiological/behavioral traits) (18). For each QTL, we linked it with genes that are physically located in the QTL interval and have at least one PolymiRTS. These genes are candidate causal genes underlying the pQTL.
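
The linking step amounts to an interval join between QTLs and genes. A minimal sketch is given below, assuming that "located in the QTL interval" means the gene lies entirely within the interval; the record layout, gene symbols and coordinates are hypothetical.

```python
def candidate_genes(qtl, genes, polymirts_counts, min_lod=2.8):
    """Return symbols of genes inside the QTL interval with >= 1 PolymiRTS.

    qtl: dict with 'chrom', 'start', 'end', 'lod'.
    genes: list of dicts with 'symbol', 'chrom', 'start', 'end'.
    polymirts_counts: dict mapping gene symbol to number of PolymiRTS.
    """
    if qtl["lod"] <= min_lod:
        return []  # QTL does not pass the LOD threshold used in the text
    hits = []
    for gene in genes:
        same_chrom = gene["chrom"] == qtl["chrom"]
        inside = qtl["start"] <= gene["start"] and gene["end"] <= qtl["end"]
        if same_chrom and inside and polymirts_counts.get(gene["symbol"], 0) >= 1:
            hits.append(gene["symbol"])
    return hits


# Hypothetical example data.
qtl = {"chrom": "chr1", "start": 10_000_000, "end": 30_000_000, "lod": 3.5}
genes = [
    {"symbol": "GeneA", "chrom": "chr1", "start": 12_000_000, "end": 12_050_000},
    {"symbol": "GeneB", "chrom": "chr1", "start": 20_000_000, "end": 20_010_000},
    {"symbol": "GeneC", "chrom": "chr2", "start": 15_000_000, "end": 15_020_000},
]
counts = {"GeneA": 2, "GeneC": 1}
print(candidate_genes(qtl, genes, counts))
```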

DATABASE CONTENT AND ACCESS

Table 1 shows the major data fields for a typical PolymiRTS record. Users can access the database in several ways. First, a web interface is provided for browsing the entries. Second, a text search interface supports queries by SNP ID, miRNA ID, GenBank accession, HUGO gene identifier and gene description. Third, a chromosome location search lets users specify a genomic interval of interest and retrieve all the PolymiRTSs within the interval. For mouse, we also provide an inbred strain comparison option. By combining strain comparison with the range search, users can retrieve all the PolymiRTSs that are candidate variants underlying a pQTL they have identified. Finally, we provide flat file downloads for all the data.

Table 1. Field description

| Field | Description |
| --- | --- |
| Location | SNP location in the mRNA transcript |
| SNPID | Link to dbSNP |
| Wobble base-pair | Whether the SNP can form a G:U wobble base pair with the miRNA. Y: Yes; N: No |
| Ancestral allele | If applicable, the ancestral allele is denoted |
| Allele | Two alleles of the SNP in the mRNA transcript |
| Strain | Genotypes of the two mouse inbred strains to be compared |
| miRID | Link to miRBase |
| Support | Occurrence of the miRNA site in other vertebrate genomes |
| Function Class | C: derived allele creates a new miRNA site; N: derived allele disrupts a nonconserved miRNA site; D: allele disrupts a conserved miRNA site; O: other cases when the ancestral allele cannot be determined unambiguously |
| miRSite | Sequence context of the miRNA site. Seed regions are in capital letters and SNPs are highlighted in red |
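
The combination of range search and strain comparison described above can be mimicked on downloaded records with a simple filter. The sketch below is not the database's query code, and the flat file layout is not specified in the text; the record keys, SNP identifiers and coordinates are all placeholders chosen for illustration.

```python
def range_search(records, chrom, start, end, strain_a=None, strain_b=None):
    """Return SNP IDs in [start, end] on chrom.

    If two strains are given, keep only SNPs at which both strains are
    genotyped and carry different alleles.
    """
    hits = []
    for rec in records:
        if rec["chrom"] != chrom or not (start <= rec["pos"] <= end):
            continue
        if strain_a and strain_b:
            allele_a = rec["genotypes"].get(strain_a)
            allele_b = rec["genotypes"].get(strain_b)
            if allele_a is None or allele_b is None or allele_a == allele_b:
                continue
        hits.append(rec["snp_id"])
    return hits


# Placeholder records; identifiers and positions are not real database entries.
records = [
    {"snp_id": "snp_example_1", "chrom": "chr1", "pos": 11_000_000,
     "genotypes": {"C57BL/6J": "A", "DBA/2J": "G"}},
    {"snp_id": "snp_example_2", "chrom": "chr1", "pos": 12_000_000,
     "genotypes": {"C57BL/6J": "C", "DBA/2J": "C"}},
    {"snp_id": "snp_example_3", "chrom": "chr1", "pos": 40_000_000,
     "genotypes": {"C57BL/6J": "T", "DBA/2J": "C"}},
]
print(range_search(records, "chr1", 10_000_000, 30_000_000, "C57BL/6J", "DBA/2J"))
```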

DISCUSSIONS AND FUTURE WORK

miRNAs regulate posttranscriptional gene expression through mRNA destabilization or translational repression. Destabilization of mRNA may cause variations in the transcript levels while translational repression does not (25–27). The effects of miRNAs that act through translational repression cannot be detected by expression microarrays. Therefore, in the PolymiRTS database, the sequence variations in the target sites of miRNAs that act through translational repression may be linked to pQTLs but not to eQTLs.

A recent study shows that there are two broad categories of miRNA target sites: (i) 5′-dominant sites with sufficient 5′-pairing, and (ii) 3′-compensatory sites with insufficient 5′-pairing, which require strong 3′-pairing (28). By using the TargetScanS algorithm alone, we are likely to miss many target sites from the second category. Because some mismatches in the 5′-pairing region of 3′-compensatory target sites are tolerated, it is difficult to estimate the influence of SNPs on the function of these sites. Hence, at this stage we have chosen to be conservative and focus on the major and well-studied category of 5′-dominant sites.

At present, we do not take miRNA gene expression patterns into consideration because large-scale miRNA expression data are lacking. However, in many cases such information would be very valuable. For example, it is known that a set of genes, called miRNA antitargets, evolutionarily avoid having miRNA target sites in their 3′-UTRs (8). Creation of a new miRNA site in an antitarget is therefore likely to have severe consequences. Since antitargets tend to be highly and specifically expressed in the tissue where the miRNA is expressed (8), this information can be used to pick PolymiRTS in putative antitargets and further prioritize strong functional candidates. With the anticipated accumulation of miRNA profiling data, we expect a much better annotation of PolymiRTS in the near future.

Acknowledgments

This work was supported by a PhRMA Foundation grant (Y.C.), and NIH grants HD052472 (D.G.), AA014425 (L.L.) and DA021131 (R.W.W.). We thank the two anonymous reviewers for their helpful suggestions. Funding to pay the Open Access publication charges for this article was provided by a PhRMA Foundation grant (Y.C.), and NIH grants HD052472 (D.G.), AA014425 (L.L.) and DA021131 (R.W.W.).

Conflict of interest statement. None declared.

REFERENCES

1. Cowles C.R., Hirschhorn J.N., Altshuler D., Lander E.S. Detection of regulatory variation in mouse genes. Nature Genet. 2002;32:432–437. doi: 10.1038/ng992.
2. Knight J.C. Regulatory polymorphisms underlying complex disease traits. J. Mol. Med. 2005;83:97–109. doi: 10.1007/s00109-004-0603-7.
3. Knight J.C., Keating B.J., Rockett K.A., Kwiatkowski D.P. In vivo characterization of regulatory polymorphisms by allele-specific quantification of RNA polymerase loading. Nature Genet. 2003;33:469. doi: 10.1038/ng1124.
4. Ronald J., Akey J.M., Whittle J., Smith E.N., Yvert G., Kruglyak L. Simultaneous genotyping, gene-expression measurement, and detection of allele-specific expression with oligonucleotide arrays. Genome Res. 2005;15:284–291. doi: 10.1101/gr.2850605.
5. Clop A., Marcq F., Takeda H., Pirottin D., Tordoir X., Bibe B., Bouix J., Caiment F., Elsen J.M., Eychenne F., et al. A mutation creating a potential illegitimate microRNA target site in the myostatin gene affects muscularity in sheep. Nature Genet. 2006;38:813–818. doi: 10.1038/ng1810.
6. Ambros V. The functions of animal microRNAs. Nature. 2004;431:350–355. doi: 10.1038/nature02871.
7. Bartel D.P. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell. 2004;116:281–297. doi: 10.1016/s0092-8674(04)00045-5.
8. Farh K.K.-H., Grimson A., Jan C., Lewis B.P., Johnston W.K., Lim L.P., Burge C.B., Bartel D.P. The widespread impact of mammalian MicroRNAs on mRNA repression and evolution. Science. 2005;310:1817–1821. doi: 10.1126/science.1121158.
9. Lim L.P., Lau N.C., Garrett-Engele P., Grimson A., Schelter J.M., Castle J., Bartel D.P., Linsley P.S., Johnson J.M. Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature. 2005;433:769–773. doi: 10.1038/nature03315.
10. Wu L., Fan J., Belasco J.G. MicroRNAs direct rapid deadenylation of mRNA. Proc. Natl Acad. Sci. USA. 2006;103:4034–4039. doi: 10.1073/pnas.0510928103.
11. Bao L., Wei L., Peirce J.L., Homayouni R., Li H., Zhou M., Chen H., Lu L., Williams R.W., Pfeffer L.M., et al. Combining gene expression QTL mapping and phenotypic spectrum analysis to uncover gene regulatory relationships. Mamm. Genome. 2006;17:575–583. doi: 10.1007/s00335-005-0172-2.
12. Jansen R.C., Nap J.P. Genetical genomics: the added value from segregation. Trends Genet. 2001;17:388–391. doi: 10.1016/s0168-9525(01)02310-1.
13. Schadt E.E. Exploiting naturally occurring DNA variation and molecular profiling data to dissect disease and drug response traits. Curr. Opin. Biotechnol. 2005;16:647–654. doi: 10.1016/j.copbio.2005.10.005.
14. Kent W.J., Sugnet C.W., Furey T.S., Roskin K.M., Pringle T.H., Zahler A.M., Haussler D. The human genome browser at UCSC. Genome Res. 2002;12:996–1006. doi: 10.1101/gr.229102.
15. Sherry S.T., Ward M.H., Kholodov M., Baker J., Phan L., Smigielski E.M., Sirotkin K. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001;29:308–311. doi: 10.1093/nar/29.1.308.
16. Griffiths-Jones S., Grocock R.J., van Dongen S., Bateman A., Enright A.J. miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 2006;34:D140–D144. doi: 10.1093/nar/gkj112.
17. Lewis B.P., Burge C.B., Bartel D.P. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 2005;120:15–20. doi: 10.1016/j.cell.2004.12.035.
18. Chesler E.J., Lu L., Wang J., Williams R.W., Manly K.F. WebQTL: rapid exploratory analysis of gene expression and genetic networks for brain and behavior. Nature Neurosci. 2004;7:485–486. doi: 10.1038/nn0504-485.
19. Lander E., Kruglyak L. Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nature Genet. 1995;11:241–247. doi: 10.1038/ng1195-241.
20. Morley M., Molony C.M., Weber T.M., Devlin J.L., Ewens K.G., Spielman R.S., Cheung V.G. Genetic analysis of genome-wide variation in human gene expression. Nature. 2004;430:743–747. doi: 10.1038/nature02797.
21. Barrett T., Suzek T.O., Troup D.B., Wilhite S.E., Ngau W.-C., Ledoux P., Rudnev D., Lash A.E., Fujibuchi W., Edgar R. NCBI GEO: mining millions of expression profiles—database and tools. Nucleic Acids Res. 2005;33:D562–D566. doi: 10.1093/nar/gki022.
22. Bolstad B.M., Irizarry R.A., Astrand M., Speed T.P. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003;19:185–193. doi: 10.1093/bioinformatics/19.2.185.
23. Thorisson G.A., Stein L.D. The SNP Consortium website: past, present and future. Nucleic Acids Res. 2003;31:124–127. doi: 10.1093/nar/gkg052.
24. Abecasis G.R., Cherny S.S., Cookson W.O., Cardon L.R. Merlin–rapid analysis of dense genetic maps using sparse gene flow trees. Nature Genet. 2002;30:97–101. doi: 10.1038/ng786.
25. Bhattacharyya S.N., Habermacher R., Martine U., Closs E.I., Filipowicz W. Relief of microRNA-mediated translational repression in human cells subjected to stress. Cell. 2006;125:1111–1124. doi: 10.1016/j.cell.2006.04.031.
26. Kiriakidou M., Nelson P.T., Kouranov A., Fitziev P., Bouyioukos C., Mourelatos Z., Hatzigeorgiou A. A combined computational-experimental approach predicts human microRNA targets. Genes Dev. 2004;18:1165–1178. doi: 10.1101/gad.1184704.
27. Schratt G.M., Tuebing F., Nigh E.A., Kane C.G., Sabatini M.E., Kiebler M., Greenberg M.E. A brain-specific microRNA regulates dendritic spine development. Nature. 2006;439:283–289. doi: 10.1038/nature04367.
28. Brennecke J., Stark A., Russell R.B., Cohen S.M. Principles of microRNA-target recognition. PLoS Biol. 2005;3:e85. doi: 10.1371/journal.pbio.0030085.
29. Krek A., Grun D., Poy M.N., Wolf R., Rosenberg L., Epstein E.J., MacMenamin P., da Piedade I., Gunsalus K.C., Stoffel M., et al. Combinatorial microRNA target predictions. Nature Genet. 2005;37:495–500. doi: 10.1038/ng1536.

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
