Human Genomics
Editorial. 2010 Jun 1;4(5):284–288. doi: 10.1186/1479-7364-4-5-284

Functional intronic polymorphisms: Buried treasure awaiting discovery within our genes

David N Cooper
PMCID: PMC3500160  PMID: 20650817

'In Nature's infinite book of secrecy, a little I can read.'

Antony and Cleopatra [Act I, Scene 2], William Shakespeare

Pathological mutations occurring within the extended consensus sequences of exon-intron splice junctions account for ~10 per cent of all inherited lesions logged in The Human Gene Mutation Database (HGMD®; http://www.hgmd.org) [1] and are frequently encountered in mutation screening studies [2]. Mutations residing in other intronic locations (including the canonical branch-point sequence, [3] 5'-YURAY-3'), however, may often go undetected unless patient RNA can be analysed and the mutations in question induce aberrant splicing (eg exon skipping or cryptic splice site utilisation) that is readily distinguishable, qualitatively or quantitatively, from normal (and/or normal alternative) splicing. Indeed, introns probably represent a substantially larger mutational target than has hitherto been appreciated, because they contain a multiplicity of functional elements, including intron splice enhancers and silencers that regulate alternative splicing, [4,5] trans-splicing elements [6] and other regulatory elements, some of which may be deeply embedded within very large introns [7].
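As an illustration of how the branch-point consensus 5'-YURAY-3' mentioned above can be used to flag candidate sites, the following is a minimal sketch in Python. It expands the IUPAC codes (Y = pyrimidine, R = purine) into a regular expression and reports where the motif occurs in an intron sequence. The function names and the example sequence are hypothetical, and a consensus match is only a crude indicator; it is not a substitute for the RNA analysis discussed in the text.

```python
import re

# IUPAC codes needed for the 5'-YURAY-3' consensus: Y = pyrimidine (C/U), R = purine (A/G).
IUPAC_RNA = {"Y": "[CU]", "R": "[AG]", "A": "A", "C": "C", "G": "G", "U": "U"}


def consensus_to_regex(consensus: str) -> str:
    """Translate an IUPAC consensus such as 'YURAY' into a regular expression."""
    return "".join(IUPAC_RNA[base] for base in consensus.upper())


def find_branchpoint_candidates(intron_seq: str, consensus: str = "YURAY"):
    """Return the 0-based start offsets of every (possibly overlapping) consensus match."""
    # A zero-width lookahead allows overlapping matches to be reported.
    pattern = re.compile("(?=" + consensus_to_regex(consensus) + ")")
    seq = intron_seq.upper().replace("T", "U")  # accept DNA or RNA input
    return [m.start() for m in pattern.finditer(seq)]


def variant_disrupts_motif(intron_seq: str, offset: int, alt_base: str,
                           consensus: str = "YURAY") -> bool:
    """True if substituting alt_base at the 0-based offset removes a consensus match."""
    before = set(find_branchpoint_candidates(intron_seq, consensus))
    mutated = intron_seq[:offset] + alt_base + intron_seq[offset + 1:]
    after = set(find_branchpoint_candidates(mutated, consensus))
    return bool(before - after)


# Hypothetical toy sequence (not taken from any gene discussed here).
toy_intron = "GGCUGACGG"
candidates = find_branchpoint_candidates(toy_intron)
```

With the toy sequence, the only match starts at offset 2 (CUGAC); replacing the A at offset 5 with G abolishes it, whereas a substitution outside the motif leaves it intact.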

In addition to pathological mutations sensu stricto, introns also harbour functional polymorphisms that can influence the expression of the genes that host them. Some of these intronic variants may also confer susceptibility to disease or otherwise modulate the genotype-phenotype relationship. For the reasons discussed above, it is very likely that such variants have been seriously under-ascertained to date. Although most of these variants are single nucleotide polymorphisms (SNPs), others may be of the insertion/deletion type [8]. With the advent of genome-wide association studies (GWAS), an increasing number of potentially functional intronic variants are being identified [9]. In the majority of cases, however, it is unclear whether such variants are of direct functional significance, as opposed to simply being in linkage disequilibrium with another (as yet unidentified) functional SNP in the vicinity [10]. Even when a GWAS deems a newly identified intronic polymorphism to be 'functional', it should be appreciated that the term is often ascribed solely on the basis of an observed association between a specific allele and a plasma protein level, an enzymatic activity or a clinical/laboratory phenotype, even though such associations cannot readily distinguish a bona fide functional SNP from a linkage disequilibrium effect.
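To make the linkage disequilibrium caveat concrete, the sketch below computes the two standard pairwise measures for a pair of biallelic loci: the coefficient D = p(AB) - p(A)p(B) and the squared correlation r² = D² / [p(A)(1 - p(A))p(B)(1 - p(B))]. The function name and the frequencies used are hypothetical and serve only as illustration. When r² approaches 1, the two SNPs are statistically interchangeable in an association study, which is why association alone cannot show which of them is functional.

```python
def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float):
    """Return (D, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")
    d = p_ab - p_a * p_b  # deviation from independent assortment of the two alleles
    r_squared = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, r_squared


# Hypothetical frequencies: alleles A and B always travel on the same haplotype.
d_linked, r2_linked = linkage_disequilibrium(p_ab=0.3, p_a=0.3, p_b=0.3)

# Hypothetical frequencies: the two alleles assort independently.
d_indep, r2_indep = linkage_disequilibrium(p_ab=0.09, p_a=0.3, p_b=0.3)
```

In the first case r² is 1 (complete disequilibrium); in the second it is 0 to within rounding error.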

As has been noted with pathological mutations, the vast majority of known functional intronic polymorphisms are located within the extended consensus sequences of exon-intron splice junctions [2]. Some intronic polymorphic variants do not occur within the splice junctions, however, but nevertheless change the splicing phenotype because they are located within an intron splice enhancer or branchpoint site, or because they activate a cryptic splice site [11,12]. This is, from a biological point of view, a more interesting category of intronic SNP to study, since the mechanisms by which these variants exert their effects on the splicing phenotype are often unclear and may be quite subtle. In the pages of this issue, Millar et al. [13] report that a SNP, buried deep within intron 4 of the human growth hormone (GH1) gene, is of direct functional significance by virtue of its influence on the expression of this gene. This polymorphism therefore joins the ranks of the hitherto relatively small number of human intronic SNPs located outwith exon-intron splice junctions that have been shown by various methods of in vitro characterisation to be of direct functional significance. Table 1 lists some of the best characterised examples of such functional SNPs, most of which are located at least ~30 base pairs (bp) from the nearest splice site. These SNPs have been shown to influence either the transcriptional activity or the splicing efficiency of their host genes, or instead to alter the expression of alternative transcripts.

Table 1.

Selected examples of in vitro characterised human functional intronic polymorphisms located more than ~30 bp from the nearest splice site

Gene | Disease/phenotype | Chromosomal location | Polymorphism, intronic location and dbSNP number | Consequences for gene expression or mRNA splicing | Reference
AGTR2 | Predisposition to congenital anomalies of the kidney and urinary tract | Xq22-q23 | IVS1, AS, A > G, -29 (rs1403543) | SNP occurs within branchpoint motif and alters splicing efficiency | Nishimura et al. (1999)a
BANK1 | Susceptibility to systemic lupus erythematosus | 4q23 | IVS1, AS, T > C, -43 (rs17266594) | SNP occurs within branchpoint motif and risk allele alters expression of alternative transcripts | Kozyrev et al. (2008)b
CD244 | Susceptibility to rheumatoid arthritis | 1q23.1 | IVS3, AS, T > C, -164 (rs6682654) | Risk allele associated with increased transcriptional activity | Suzuki et al. (2008)c
CD244 | Susceptibility to rheumatoid arthritis | 1q23.1 | IVS5, DS, G > A, +526 (rs3766379) | Risk allele associated with increased transcriptional activity | Suzuki et al. (2008)c
COL1A1 | Reduced bone density/osteoporosis | 17q21.33 | IVS1, AS, G > T, -440 (rs1800012) | SNP occurs within Sp1-binding site; risk allele alters Sp1 binding and transcriptional activity | Mann et al. (2001)d
CXCR3 | Variation in immune cell response to chemokine-cytokine signals | Xq13 | IVS1, DS, G > A, +234 (rs2280964) | Risk allele associated with reduced CXCR3 gene expression | Choi et al. (2008)e
CYP2D6 | Intermediate metaboliser (reduced expression of CYP2D6) | 22q13.1 | IVS6, DS, G > A, +39 (rs28371725) | Increased level (7.3-fold) of non-functional splice variant transcript lacking exon 6 and reduced level (2.9-fold) of functional transcript | Toscano et al. (2006)f
DRD2 | Reduced DRD2 expression | 11q22-q23 | IVS1, DS, A > G, +3850 (rs2734836) | Risk allele associated with increased binding of transcriptional repressor (Freud-1) leading to reduced DRD2 expression | Rogaeva et al. (2007)g
DRD2 | Reduced DRD2 expression | 11q23 | IVS6, AS, C > A, -83 (rs1076560) | Risk allele alters expression of alternative transcripts | Zhang et al. (2007)h
F2 | Elevated prothrombin level/thrombosis | 11p11-q12 | IVS13, AS, A > G, -59 | Risk allele influences splicing efficiency | von Ahsen & Oellerich (2004)i
FGFR2 | Susceptibility to breast cancer | 10q26 | IVS2, DS, T > C, +12912 (rs2981578) | Risk allele alters binding affinity for transcription factors Oct-1/Runx2, leading to increased FGFR2 expression | Meyer et al. (2008)j
FOXP3 | Susceptibility to psoriasis | Xp11.23 | IVS1, DS, A > C, +2882 (rs3761548) | Risk allele causes loss of binding of E47 and c-Myb, leading to reduced FOXP3 transcription | Shen et al. (2010)k
GFPT1 | Reduced GFPT1 expression | 2p13 | IVS1, DS, T > C, +36 (rs6720415) | SNP occurs within GC box and risk allele decreases transcriptional activity | Kunika et al. (2006)l
GSK3B | Risk of Parkinson's disease | 3q13.3 | IVS5, AS, T > C, -157 (rs6438552) | Risk allele associated with increased level of GSK3B transcripts lacking exons 9 and 11 | Kwok et al. (2005)m
IRF4 | Risk of childhood acute lymphoblastic leukaemia in males | 6p25-p23 | IVS4, DS, C > T, +386 (rs12203592) | Risk allele increases IRF4 promoter activity/expression | Do et al. (2010)n
LTA | Susceptibility to myocardial infarction | 6p21.3 | IVS1, AS, G > A, -198 (rs909253) | Risk allele associated with increased transcriptional activity | Ozaki et al. (2002)o
NLRP3 | Susceptibility to food-induced anaphylaxis | 1q44 | IVS7, AS, C > T, -202 (rs4612666) | Risk allele increases enhancer activity by 20% | Hitomi et al. (2009)p
SCG3 | Association with obesity | 15q21 | IVS1, DS, G > A, +190 (rs16964476) | Risk allele alters transcriptional activity | Tanabe et al. (2007)q
TH | Risk of essential hypertension | 11p15.5 | IVS12, DS, T > C, +127 (rs2070762) | Risk allele associated with increased transcriptional activity | Wang et al. (2008)r
USF1 | Association with familial combined hyperlipidaemia | 1q22-q23 | IVS7, AS, G > A, -100 (rs2073658) | SNP alleles exhibit differential binding to nuclear proteins. USF1-regulated genes are differentially regulated, depending on the identity of the rs2073658 allele | Naukkarinen et al. (2005)s; Naukkarinen et al. (2009)t

Abbreviations: AS, acceptor splice site; DRD2, dopamine D2 receptor; DS, donor splice site; IVS, intron (number). Nucleotide numbering is relative to the specified splice site.

rs numbers are provided courtesy of dbSNP http://www.ncbi.nlm.nih.gov/projects/SNP/. For the sake of simplicity, only SNPs have been included in Table 1 (thus, for example, functional intronic microsatellite polymorphisms would require a separate treatment).

References to table

a. Nishimura, H., Yerkes, E., Hohenfellner, K., Miyazaki, Y. et al. (1999), 'Role of the angiotensin type 2 receptor gene in congenital anomalies of the kidney and urinary tract, CAKUT, of mice and men', Mol. Cell Vol. 3, pp. 1-10.

b. Kozyrev, S.V., Abelson, A.K., Wojcik, J., Zaghlool, A. et al. (2008), 'Functional variants in the B-cell gene BANK1 are associated with systemic lupus erythematosus', Nat. Genet. Vol. 40, pp. 211-216.

c. Suzuki, A., Yamada, R., Kochi, Y., Sawada, T. et al. (2008), 'Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population', Nat. Genet. Vol. 40, pp. 1224-1229.

d. Mann, V., Hobson, E.E., Li, B., Stewart, T.L. et al. (2001), 'A COL1A1 Sp1 binding site polymorphism predisposes to osteoporotic fracture by affecting bone density and quality', J. Clin. Invest. Vol. 107, pp. 899-907.

e. Choi, J.W., Park, C.S., Hwang, M., Nam, H.Y. et al. (2008), 'A common intronic variant of CXCR3 is functionally associated with gene expression levels and the polymorphic immune cell responses to stimuli', J. Allergy Clin. Immunol. Vol. 122, pp. 1119-1126.

f. Toscano, C., Klein, K., Blievernicht, J., Schaeffeler, E. et al. (2006), 'Impaired expression of CYP2D6 in intermediate metabolizers carrying the *41 allele caused by the intronic SNP 2988G > A: Evidence for modulation of splicing events', Pharmacogenet. Genomics Vol. 16, pp. 755-766.

g. Rogaeva, A., Ou, X.M., Jafar-Nejad, H., Lemonde, S. et al. (2007), 'Differential repression by freud-1/CC2D1A at a polymorphic site in the dopamine-D2 receptor gene', J. Biol. Chem. Vol. 282, pp. 20897-20905.

h. Zhang, Y., Bertolino, A., Fazio, L., Blasi, G. et al. (2007), 'Polymorphisms in human dopamine D2 receptor gene affect gene expression, splicing, and neuronal activity during working memory', Proc. Natl. Acad. Sci. USA Vol. 104, pp. 20552-20557.

i. von Ahsen, N. and Oellerich, M. (2004), 'The intronic prothrombin 19911A > G polymorphism influences splicing efficiency and modulates effects of the 20210G > A polymorphism on mRNA amount and expression in a stable reporter gene assay system', Blood Vol. 103, pp. 586-593.

j. Meyer, K.B., Maia, A.T., O'Reilly, M., Teschendorff, A.E. et al. (2008), 'Allele-specific up-regulation of FGFR2 increases susceptibility to breast cancer', PLoS Biol. Vol. 6, p. e108.

k. Shen, Z., Chen, L., Hao, F., Wang, G. et al. (2010), 'Intron-1 rs3761548 is related to the defective transcription of Foxp3 in psoriasis through abrogating E47/c-Myb binding', J. Cell. Mol. Med. Vol. 14, pp. 226-241.

l. Kunika, K., Tanahashi, T., Kudo, E., Mizusawa, N. et al. (2006), 'Effect of +36T > C in intron 1 on the glutamine: fructose-6-phosphate amidotransferase 1 gene and its contribution to type 2 diabetes in different populations', J. Hum. Genet. Vol. 51, pp. 1100-1109.

m. Kwok, J.B., Hallupp, M., Loy, C.T., Chan, D.K. et al. (2005), 'GSK3B polymorphisms alter transcription and splicing in Parkinson's disease', Ann. Neurol. Vol. 58, pp. 829-839.

n. Do, T.N., Ucisik-Akkaya, E., Davis, C.F., Morrison, B.A. et al. (2010), 'An intronic polymorphism of IRF4 gene influences gene transcription in vitro and shows a risk association with childhood acute lymphoblastic leukemia in males', Biochim. Biophys. Acta Vol. 1802, pp. 292-300.

o. Ozaki, K., Ohnishi, Y., Iida, A., Sekine, A. et al. (2002), 'Functional SNPs in the lymphotoxin-α gene that are associated with susceptibility to myocardial infarction', Nat. Genet. Vol. 32, pp. 650-654.

p. Hitomi, Y., Ebisawa, M., Tomikawa, M., Imai, T. et al. (2009), 'Associations of functional NLRP3 polymorphisms with susceptibility to food-induced anaphylaxis and aspirin-induced asthma', J. Allergy Clin. Immunol. Vol. 124, pp. 779-785.

q. Tanabe, A., Yanagiya, T., Iida, A., Saito, S. et al. (2007), 'Functional single-nucleotide polymorphisms in the secretogranin III (SCG3) gene that form secretory granules with appetite-related neuropeptides are associated with obesity', J. Clin. Endocrinol. Metab. Vol. 92, pp. 1145-1154.

r. Wang, L., Li, B., Lu, X., Zhao, Q. et al. (2008), 'A functional intronic variant in the tyrosine hydroxylase (TH) gene confers risk of essential hypertension in the Northern Chinese Han population', Clin. Sci. Vol. 115, pp. 151-158.

s. Naukkarinen, J., Gentile, M., Soro-Paavonen, A., Saarela, J. et al. (2005), 'USF1 and dyslipidemias: Converging evidence for a functional intronic variant', Hum. Mol. Genet. Vol. 14, pp. 2595-2605.

t. Naukkarinen, J., Nilsson, E., Koistinen, H.A., Söderlund, S. et al. (2009), 'Functional variant disrupts insulin induction of USF1: Mechanism for USF1-associated dyslipidemias', Circ. Cardiovasc. Genet. Vol. 2, pp. 522-529.
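The splice-site-relative numbering used in Table 1 (for example 'IVS6, DS, +39' or 'IVS1, AS, -29') can be converted to a position within the intron with a few lines of arithmetic. The sketch below assumes the common convention that +1 is the first intronic nucleotide after the donor splice site and -1 is the last intronic nucleotide before the acceptor splice site; the table itself states only that numbering is relative to the specified splice site. The intron coordinates, class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intron:
    """An intron in transcript orientation, with 1-based inclusive coordinates."""
    first_base: int  # first intronic nucleotide (donor side)
    last_base: int   # last intronic nucleotide (acceptor side)


def intronic_position(intron: Intron, site: str, offset: int) -> int:
    """Convert a splice-site-relative offset into a coordinate.

    ('DS', +n): the n-th intronic nucleotide counted from the donor splice site.
    ('AS', -n): the n-th intronic nucleotide counted back from the acceptor splice site.
    """
    if site == "DS" and offset > 0:
        pos = intron.first_base + offset - 1
    elif site == "AS" and offset < 0:
        pos = intron.last_base + offset + 1
    else:
        raise ValueError("expected ('DS', +n) or ('AS', -n)")
    if not intron.first_base <= pos <= intron.last_base:
        raise ValueError("offset falls outside the intron")
    return pos


def distance_to_nearest_splice_site(intron: Intron, pos: int) -> int:
    """Distance in bp to the nearer intron end (the terminal intronic base counts as 1)."""
    return min(pos - intron.first_base + 1, intron.last_base - pos + 1)


# Hypothetical 1,000 bp intron spanning positions 1001-2000.
example = Intron(first_base=1001, last_base=2000)
ds_pos = intronic_position(example, "DS", +39)   # numbering as in the CYP2D6 entry
as_pos = intronic_position(example, "AS", -29)   # numbering as in the AGTR2 entry
```

For this hypothetical intron, DS +39 maps to position 1039 and AS -29 to position 1972, which lie 39 bp and 29 bp respectively from the nearest splice site.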

How should we go about increasing the number of identified functional intronic polymorphisms? One approach would be to employ exon-tiling microarrays to perform genome-wide scans to identify intronic SNPs responsible for inter-individual differences in the splicing phenotype [11,14,15]. Since currently available bioinformatics tools are inadequate to the task of predicting splicing consequences, [14] however, all SNPs identified in this way would have to be further validated using mini-gene constructs to determine the resulting splicing phenotype [14]. One feature that might prove helpful in identifying functional intronic SNPs is that such variants are often located within gene regions that are characterised by a reduced level of genetic variation [16].

Precisely because we invariably adopt a gene-centric approach to screening introns for functional polymorphisms, we should be wary of the existence of overlapping genes, a not infrequent occurrence in our complex genome. Thus, for example, the functional SNP rs4988235, located 13.9 kilobases upstream of the lactase (LCT) gene and associated with adult-type hypolactasia, actually resides deep within intron 13 of the minichromosome maintenance complex component 6 (MCM6) gene [17-19]. In addition, since disease-associated intronic SNPs that play a role in long-range gene regulation have also recently been identified, [20,21] we should be aware that some SNPs may influence the expression of remote genes at a distance, rather than the expression of the genes that actually host them. These caveats notwithstanding, new techniques such as chromosome conformation capture [22] and chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) [23] promise greatly to increase the number of functional intronic polymorphisms identified, thereby potentially pinpointing the locations of a whole new lexicon of intron-located regulatory elements, which will increase our understanding of intron structure and function.

References

  1. Stenson PD, Mort M, Ball EV, Howells K. et al. 'The Human Gene Mutation Database: 2008 update'. Genome Med. 2009;1:13. doi: 10.1186/gm13.
  2. Krawczak M, Thomas NS, Hundrieser B, Mort M. et al. 'Single base-pair substitutions in exon-intron junctions of human genes: Nature, distribution, and consequences for mRNA splicing'. Hum Mutat. 2007;28:150–158. doi: 10.1002/humu.20400.
  3. Královicová J, Lei H, Vorechovský I. 'Phenotypic consequences of branch point substitutions'. Hum Mutat. 2006;27:803–813. doi: 10.1002/humu.20362.
  4. Wang X, Wang K, Radovich M, Wang Y. et al. 'Genome-wide prediction of cis-acting RNA elements regulating tissue-specific pre-mRNA alternative splicing'. BMC Genomics. 2009;10(Suppl 1):S4. doi: 10.1186/1471-2164-10-S1-S4.
  5. Tress ML, Martelli PL, Frankish A, Reeves GA. et al. 'The implications of alternative splicing in the ENCODE protein complement'. Proc Natl Acad Sci USA. 2007;104:5495–5500. doi: 10.1073/pnas.0700800104.
  6. Gingeras TR. 'Implications of chimaeric non-co-linear transcripts'. Nature. 2009;461:206–211. doi: 10.1038/nature08452.
  7. Solis AS, Shariat N, Patton JG. 'Splicing fidelity, enhancers, and disease'. Front Biosci. 2008;13:1926–1942. doi: 10.2741/2812.
  8. Wilkins JM, Southam L, Mustafa Z, Chapman K. et al. 'Association of a functional microsatellite within intron 1 of the BMP5 gene with susceptibility to osteoarthritis'. BMC Med Genet. 2009;10:141. doi: 10.1186/1471-2350-10-141.
  9. Manolio TA, Collins FS, Cox NJ, Goldstein DB. et al. 'Finding the missing heritability of complex diseases'. Nature. 2009;461:747–753. doi: 10.1038/nature08494.
  10. McCauley JL, Kenealy SJ, Margulies EH, Schnetz-Boutaud N. et al. 'SNPs in multi-species conserved sequences (MCS) as useful markers in association studies: A practical approach'. BMC Genomics. 2007;8:266. doi: 10.1186/1471-2164-8-266.
  11. Kwan T, Benovoy D, Dias C, Gurd S. et al. 'Genome-wide analysis of transcript isoform variation in humans'. Nat Genet. 2008;40:225–231. doi: 10.1038/ng.2007.57.
  12. Coulombe-Huntington J, Lam KC, Dias C, Majewski J. 'Fine-scale variation and genetic determinants of alternative splicing across individuals'. PLoS Genet. 2009;5:e1000766. doi: 10.1371/journal.pgen.1000766.
  13. Millar DS, Horan M, Chuzhanova NA, Cooper DN. 'Characterisation of a functional intronic polymorphism in the human growth hormone (GH1) gene'. Hum Genomics. 2010;4:289–301. doi: 10.1186/1479-7364-4-5-289.
  14. Hull J, Campino S, Rowlands K, Chan M-S. et al. 'Identification of common genetic variation that modulates alternative splicing'. PLoS Genet. 2007;3:e99. doi: 10.1371/journal.pgen.0030099.
  15. Nembaware V, Lupindo B, Schouest K, Spillane C. et al. 'Genome-wide survey of allele-specific splicing in humans'. BMC Genomics. 2008;9:265. doi: 10.1186/1471-2164-9-265.
  16. Lomelin D, Jorgenson E, Risch N. 'Human genetic variation recognizes functional elements in noncoding sequence'. Genome Res. 2010;20:311–319. doi: 10.1101/gr.094151.109.
  17. Enattah NS, Sahi T, Savilahti E, Terwilliger JD. et al. 'Identification of a variant associated with adult-type hypolactasia'. Nat Genet. 2002;30:233–237. doi: 10.1038/ng826.
  18. Olds LC, Sibley E. 'Lactase persistence DNA variant enhances lactase promoter activity in vitro: Functional role as a cis regulatory element'. Hum Mol Genet. 2003;12:2333–2340. doi: 10.1093/hmg/ddg244.
  19. Lewinsky RH, Jensen TG, Møller J, Stensballe A. et al. 'T-13910 DNA variant associated with lactase persistence interacts with Oct-1 and stimulates lactase promoter activity in vitro'. Hum Mol Genet. 2005;14:3945–3953. doi: 10.1093/hmg/ddi418.
  20. Ragvin A, Moro E, Fredman D, Navratilova P. et al. 'Long-range gene regulation links genomic type 2 diabetes and obesity risk regions to HHEX, SOX4, and IRX3'. Proc Natl Acad Sci USA. 2010;107:775–780. doi: 10.1073/pnas.0911591107.
  21. Jowett JB, Curran JE, Johnson MP, Carless MA. et al. 'Genetic variation at the FTO locus influences RBL2 gene expression'. Diabetes. 2010;59:726–732. doi: 10.2337/db09-1277.
  22. Dostie J, Dekker J. 'Mapping networks of physical interactions between genomic elements using 5C technology'. Nat Protoc. 2007;2:988–1002. doi: 10.1038/nprot.2007.116.
  23. Visel A, Blow MJ, Li Z, Zhang T. et al. 'ChIP-seq accurately predicts tissue-specific activity of enhancers'. Nature. 2009;457:854–858. doi: 10.1038/nature07730.

Articles from Human Genomics are provided here courtesy of BMC
