Functional annotation of risk loci identified through genome-wide association studies for prostate cancer

Yizhen Lu; Zheng Zhang; Hongjie Yu; S Lily Zheng; William B Isaacs; Jianfeng Xu; Jielin Sun

doi:10.1002/pros.21311

. Author manuscript; available in PMC: 2012 Jun 15.

Published in final edited form as: Prostate. 2010 Dec 6;71(9):955–963. doi: 10.1002/pros.21311

Functional annotation of risk loci identified through genome-wide association studies for prostate cancer

Yizhen Lu ^1,^2,^3,⁴, Zheng Zhang ⁵, Hongjie Yu ^3,⁴, S Lily Zheng ⁵, William B Isaacs ⁷, Jianfeng Xu ^3,^4,^5,⁶, Jielin Sun ^5,^†

PMCID: PMC3070182 NIHMSID: NIHMS258501 PMID: 21541972

Abstract

Background

The majority of established prostate cancer risk-associated Single Nucleotide Polymorphisms (SNPs) identified from genome-wide association studies do not fall into protein coding regions. Therefore, the mechanisms by which these SNPs affect prostate cancer risk remain unclear. Here, we used a series of bioinformatic tools and databases to provide possible molecular insights into the actions of risk SNPs.

Methodology/Principal Findings

We performed a comprehensive assessment of the potential functional impact of 33 SNPs that were identified and confirmed as associated with PCa risk in previous studies. For these 33 SNPs and additional SNPs in Linkage Disequilibrium (LD) (r² ≥ 0.5), we first mapped them to genomic functional annotation databases, including the Encyclopedia of DNA Elements (ENCODE), eleven genomic regulatory elements databases defined by the University of California Santa Cruz (UCSC) table browser, and Androgen Receptor (AR) binding sites defined by a ChIP-chip technique. Enrichment analysis was then carried out to assess whether the risk SNP blocks were enriched in the various annotation sets. Risk SNP blocks were significantly enriched over that expected by chance in two annotation sets, including AR binding sites (p=0.003), and FoxA1 binding sites (p=0.05). About one third of the 33 risk SNP blocks are located within AR binding regions.

Conclusions/Significance

The significant enrichment of risk SNPs in AR binding sites may suggest a potential molecular mechanism for these SNPs in prostate cancer initiation, and provide guidance for future functional studies.

Keywords: functional annotation, prostate cancer, bioinformatics, genome-wide association study

Introduction

Genome-wide association studies have identified ~33 established prostate cancer (PCa) risk-associated Single Nucleotide Polymorphisms (SNPs) (1–9). SNPs are DNA sequence variations occurring when a single nucleotide in the genome differs between individuals. Although these SNP associations have been consistently replicated in multiple studies, the functional roles of these risk SNPs remain uncharacterized, largely due to their location in DNA regions that do not encode proteins. Additionally, almost half of the SNP loci are located in regions which do not harbor any known genes. For example, multiple SNPs at 8q24 reside within a 1.3Mb gene-desert region (10), with the closest gene, c-Myc, located ~300kb downstream to rs1447295. Because the knowledge base for non-coding DNA is generally limited, few studies have been performed to evaluate the functional impact of the risk SNPs on PCa etiology (11–12). Indeed, the ability to understand the functional impact of the risk SNPs will most likely require additional emphasis on potential transcriptional regulatory mechanisms on non-coding DNA sequence.

The Encyclopedia of DNA Elements (ENCODE) project aims to provide a comprehensive catalog of biological information for functionally important elements. These elements include non-protein coding DNA sequences which regulate gene transcription, gene expression, and DNA replication. The ENCODE pilot study rigorously analyzed a small proportion (1%) of the human genome using computational and experimental methods. Results of the pilot study highlighted the complexity of transcriptional regulation and demonstrated the knowledge gap in this area (13). Based on the initial success, National Human Genome Research Institute (NHGRI) expanded the human ENCODE project to the whole genome (www.genome/gov/ENCODE). At this time, a primary focus of ENCODE is on the characterization of binding of transcriptional factors (TF) and chromatin structure, which represent two of the major factors involved in transcriptional regulation, using the ChIP-seq technique. ChIP-seq is a state-of-the-art high-throughput approach that involves chromatin immunoprecipitation and high-throughput sequencing of immunoprecipitated DNA. Compared with other methods of characterizing regulatory elements, a key advantage of ChIP-seq is a systematic and nonbiased approach, which does not depend on previous knowledge of canonical promoter regions and allows for evaluation of binding complexes of transcription factors and regulatory elements in a more natural state (14). The availability of this comprehensive catalog may facilitate an improved understanding of the functional role of risk SNPs located in non-coding regions.

In addition to the ENCODE project, several recent studies have utilized the high-throughput ChIP-chip method to identify genome-wide binding sites for transcription factors, including Androgen receptor (AR), Forkhead box A1 (FoxA1), and Estrogen receptor (ER) (15,16). Similar to ChIP-seq, ChIP-chip is another high-throughput, global approach for mapping transcription factors. ChIP-chip analysis involves isolating target DNA through chromatin immunoprecipitation, followed by analysis on DNA microarrays that tile the human genome (14). AR is a well-known transcription factor that plays an important role in prostate cancer initiation and progression. Risk SNPs located in putative AR binding sites might change the affinity of androgen-AR complex to binding sequences, thus providing a mechanism leading to modification of PCa risk.

In our study, we performed a comprehensive assessment of the potential functional impact of SNPs that were associated with PCa risk by GWAS studies, utilizing ENCODE genomic annotation databases, as well as ~20 annotation databases from the University of California Santa Cruz table browser (UCSC table browser) (http://genome.ucsc.edu/) and TF binding sites defined by previous studies (15,16). Enrichment analysis was then performed to evaluate whether the risk SNPs were over-represented in any of the annotation sets. To our knowledge, our study is among the first attempts to comprehensively characterize the potential function of risk SNPs using existing bioinformatic databases. These results assist in the interpretation of the molecular mechanisms of the risk SNPs on PCa etiology and provide guidance for future functional studies.

Methods

Define SNPs that are in Linkage Disequilibrium (LD) with established PCa risk-associated SNPs discovered from GWAS

The causative SNP may be the risk SNP itself or the SNPs in LD with them. Therefore, we identified all SNPs in LD (r² ≥ 0.5) with the 33 risk SNPs discovered by GWAS (SNPs that reached a genome-wide significance level with a p value equal or less than 10⁻⁷ in previous studies (1–9)) based on the CEU genotype data from the HapMap release #27 (Phase II+PhaseIII) (http://hapmap.ncbi.nlm.nih.gov/). We consider each risk SNP and SNPs that are in LD (r² ≥ 0.5) with it as one risk SNP block.

Overlapping the risk SNP blocks with functionally annotated genomic regions

We mapped SNPs in each risk SNP block to the ENCODE genomic annotation databases (release #2), as well as eleven annotation databases from UCSC (http://genome.ucsc.edu/) and transcription factors defined by previous studies. We defined a risk SNP block as located within a given annotated region if the risk SNP itself, or at least one of the SNPs in LD with the risk SNP, mapped to the annotated region.

Assessment of enrichment of the risk SNP blocks in the annotated genomic regions

We counted the number of risk SNP blocks that mapped to each annotated genomic region. Each risk SNP block was counted only once, even if more than one SNP within the same block mapped to the annotated region. A simulation analysis was used to assess the statistical significance of any potential enrichment for risk SNP blocks within annotated genomic regions, under a null hypothesis that none of these blocks were truly associated with PCa risk. We began the simulation analysis by randomly generating 1,000 sets of 33 SNPs (1,000 replicates) from the ~2.5 million SNPs in the genome with minor allele frequency (MAF)>=0.05 (Hapmap Phase II). We then identified all SNPs in LD with the randomly selected 33 SNPs, and performed the same analysis as for the true risk SNPs, including overlapping the SNP blocks with functionally annotated genomic regions and then counting the number of the SNP blocks that mapped to each annotated genomic region. Next, the mean number of risk SNP blocks that mapped to each annotated region was calculated based on the average counts of the 1,000 replicates. Finally, empirical p-values were calculated based on the number of replicates in which the number of counts was equal or larger than the observed number, divided by the total number of replicates. To reduce the concern of multiple testing, we limited the enrichment analysis to annotation sets with 5 or more mapped risk SNP blocks.

Results

Identification of SNPs in LD with PCa risk SNPs

We identified a total of 972 SNPs in LD with the 33 risk SNPs. A list of these SNPs and pair-wise r² for each risk SNP is provided in Supplementary Table 1.

Defining the functional annotation databases

We further grouped the genomic annotation databases into six categories (Table 1), majorly based on the potential functionality and techniques used to define the annotation sets: 1) Yale ENCODE (Yale Transcription Factor Binding Sites (TFBS)) characterizes the binding sites for a series of transcription factors including c-Myc, GATA-2, SIRT6, TCF7L2, STAT1, NK-kB, c-Fox, c-Jun, E2F6, Max and SIRT6; 2) Broad ENCODE (Broad histone) defines genomic regions with chromatic accessibility and histone modifications, including regions that are enriched with histone markers (H3K4m1, H3K4m2, H3K4m3, H3K27ac, and H3K9ac); 3) regulatory elements defined by UCSC table browser (http://genome.ucsc.edu/), which includes 11 genomic regulatory annotation sets; 4) a conserved region annotation set was also retrieved from UCSC phastConsElements28way and phaseConsElements17way table with conservation scores >500 (a conservation score is a measurement of the degree of conservation of a genomic region) ; 5) coding regions and splice sites that include annotation sets for the protein coding regions, and non-protein-coding RNAs (including transfer RNAs, ribosomal RNAs, small nuclear RNAs, and micro (mi) RNAs); 6) annotation sets including AR, ER and FoxA1 binding sites as defined by the ChIP-on-chip technique were obtained from previous studies (14,15).

Table 1.

Annotation databases will be used for the bioinformatics analysis

Annotation type	Brief explanation	Sources and references
Yale TFBS
Transcription factors	Transcription factor binding sites as defined by Encode project, including c-Myc, GATA-2, SIRT6, TCF7L2, STAT1, NK-kB, c-Fox, c-Jun, E2F6, Max, SIRT6	Encode project performed by Yale University, UCSC Table browser
Broad Histone
Histone modifications	Genomic regions with chromatic accessibility and histone modifications, including regions that are enriched with histone marks (H3K4m1, H3K4m2, H3K4m3, H3K27ac, and H3K9ac)	ENCODE project performed by Broad Institute, UCSC Table browser
Regulatory elements defined by UCSC table browser
CpgIslandExt	Targeted methylation regions (CpG island) predicted by computation algorithms	UCSC Table browser, ref 26
encodeUViennaRnaz	Noncoding RNA region predicted by three computational algorithms (EvoFold, RNAz and AlifoldZ) in ENCODE regions	UCSC Table browser, ref 27
eponine	Transcription start sites (TSS) predicted by a probabilistic method (Eponine), with good specificity and excellent positional accuracy.	UCSC Table browser, ref 28
firstEF	Computationally predicted promoter regions and first exons in the human genome	UCSC Table browser, ref 29
lamB1	Genomics regions interacting with nuclear lamina may contribute to the spatial organization of chromosomes inside the nucleus	UCSC Table browser, ref 30
oregano	The Open REGulatory ANNOtation database (ORegAnno) is an open database for the known regulatory elements from scientific literature (with various biological experimentally supported regulatory regions)	UCSC Table browser
polyaDb	mRNA ploly (A) sites that are mapped by cDNA/EST sequences	UCSC Table browser, ref 31
switchDbTss	Computationally predicted transcription start sites (TSS) based on cDNA alignment	UCSC Table browser, ref 32
targetScanS	TargetScan predicts biological targets of micro (mi) RNAs by searching for the presence of conserved 8mer and 7mer sites that match the seed region of each miRNA	UCSC Table browser, ref 33–35
tfbsConsSites	Location and score of transcription factor binding sites conserved in the human/mouse/rat alignment; data are computed with the Transfac Matrix Database (v7.0) and are purely computational	UCSC Table browser
VistaEnhancers	Defines distant-acting transcriptional enhancers by combining computational approach and a moderate throughput mpise transgenesis enhancer assay	UCSC Table browser, ref 36
Conserved region
Conserved region	Genomic region that are conserved across different species	UCSC phastConsElements28way and phaseConsElements17way table with conservation score >500
Coding region and splicing sites
Coding	Genomic regions coding for genes	UCSC Table browser
Non-Synonymous change	Genomic regions where a nucleotide substitution leads to an amino acid change	UCSC Table browser
Splice sites	Functional element that affect the splicing process	UCSC snp129
Non-protein-coding RNAs	Transfer RNAs, ribosomal RNAs, small nuclear RNAs, and miRNAs	UCSC Table browser (rnaGene)
Transcription factor binding sites defined by ChIP-on-chip technique
AR binding	Androgen Receptor binding regions defined by ChIP-on-chip technology	ref 15
ER binding	Estrogen Receptor binding regions defined by ChIP-on-chip technology	ref 16
FoxA1 binding	FoxA1 binding regions defined by ChIP-on-chip technology	ref 15

Open in a new tab

Mapping of risk SNP blocks to the functional annotation sets

Detailed annotation for each of the 33 risk SNP blocks are shown in Table 2. Only annotation sets with more than one mapped risk SNP block are listed in Table 2. Detailed information about the mapped SNPs that are in LD with the risk SNPs are presented in Supplementary Table 2. Briefly, 10 risk SNP blocks fall into conserved regions. No risk SNP blocks map to coding regions, non-synonymous changes, splice sites, non-protein-coding RNAs, miRNAs, miRNA target regions or methylation sites (data not shown). Based on category #6 genomic annotation databases (defined in preceding paragraph), 11, 4 and 9 risk SNP blocks were found to map to AR, ER and FoxA1 binding sites, respectively.

Table 2.

Annotation information on 33 risk SNP blocks based on various genomic databases

CHR	SNPs	Note	BP^*	genes	Yale ENCODE	Broad ENCODE	regulatory elements defined by UCSC annotation						Conserved	trascription factor binding sites defined by previous papers (13,14)
							cpgIslandExt	encodeUViennaRnaz	firstEF	laminB1	oreganno	tfbsConsSites	Region	AR	ER	FoxA
2	rs1465618	New 2p21	43,407,453	THADA		H3K4me1				Yes
2	rs721048	2p15	62,985,235	EHBP1		H3K4me2,H3K27ac,H3K9ac,				Yes		Yes	Yes
2	rs12621278	New2q31.1	173,019,799	ITGA6	c-Fos,Max,TCF7L2,STAT1	H3K4me1,H3K4me2,H3K27 ac,H3K4me3				Yes			Yes	Yes		Yes
3	rs2660753	3p12	87,193,364		c-Fos,STAT1	H3K4me1				Yes			Yes
3	rs10934853	New3q21.3	129,521,063		STAT1,GATA-2,ZNF263	H3K4me1,H3K4me2				Yes		Yes	Yes		Yes	Yes
4	rs17021918	New4q22.3	95,781,900	PDLIM5		H3K4me1,H3K4me2						Yes	Yes	Yes	Yes	Yes
4	rs7679673	New 4q24	106,280,983	TET2	JunD,NFKB,STAT1	H3K4me2,H3K4me3				Yes			Yes
6	rs9364554	6q25	160,753,654		c-Fos	H3K4me1,H3K4me2,H3K4m e3,				Yes
7	rs10486567	7p15	27,943,088	JAZF1		H3K4me1,H3K4me2				Yes			Yes	Yes
7	rs6465657	7q21	97,654,263	LMTK2	NFKB,Pol2,c- Myc,SIRT6,ZNF263	H3K4me1,H3K4me2,H3K4m e3				Yes	Yes		Yes		Yes	Yes
8	rs2928679	8p21.2	23,494,920	cTBP2		H3K4me1,H3K4me2				Yes			Yes	Yes
8	rs1512268	8p21.2	23,582,408	NKX3.1	Pol2,c-Myc,HA- E2F1,TCF7L2,GATA-2,NF- YB,SIRT6,ZNF263	H3K4me1,H3K27ac,H3K4me 3,H3K9ac,	Yes		Yes	Yes			Yes	Yes		Yes
8	rs10086908	New 8q24 (5)	128,081,119			H3K4me1,H3K4me3				Yes			Yes
8	rs16901979	8q24(2)	128,194,098			H3K4me1,H3K4me2				Yes			Yes	Yes
8	rs16902094	New 8q24.21	128,389,528
8	rs620861	New 8q24 (4)	128,404,855		TCF7L2	H3K4me2				Yes		Yes	Yes	Yes		Yes
8	rs6983267	8q24(3)	128,482,487		TCF7L2,STAT1,Rad21								Yes			Yes
8	rs1447295	8q24(1)	128,554,220											Yes
9	rs1571801	9q33	123,467,194							Yes
10	rs10993994	10q11	51,219,502	MSMB	ZNF263	H3K4me1,H3K4me2,H3K27 ac,H3K9ac				Yes	Yes		Yes	Yes	Yes	Yes
10	rs4962416	10q26	126,686,862	CTBP2	Pol2,E2F6,ZNF263	H3K4me1,H3K4me3				Yes			Yes
11	rs7127900	New 11P15.5	2,190,150	IGF2, IGF2AS, INS, TH	Pol2	H3K4me1		Yes
11	rs12418451	11q13(2)	68,691,995	AL137479, BC043531	c-Fos,Max,NF-YA,NF-YB	H3K4me1
11	rs10896449	11q13(1)	68,751,243		Max,c-Myc	H3K4me1,H3K4me2					Yes
17	rs11649743	17q12 (2)	33,149,092		ZNF263	H3K4me1,H3K4me3				Yes		Yes	Yes			Yes
17	rs4430796	17q12(1)	33,172,153	TCF2	Pol2,STAT1	H3K4me1				Yes
17	rs1859962	17q24.3	66,620,348			H3K4me1,H3K4me2				Yes			Yes	Yes
19	rs8102476	New 19q13.2	43,427,453		c-Myc	H3K4me1,H3K9ac				Yes			Yes
19	rs887391	19q13	46,677,464	10 Mb to KLK3	NF-YB	H3K27ac,				Yes
19	rs2735839	19q13 (KLK3)	56,056,435	KLK3
22	rs9623117	New22q13	38,782,065		GATA-2	H3K4me1,H3K4me2				Yes				Yes
22	rs5759167	New 22q13.2	41,830,156	TTLL1, BIK, MCAT, PACSIN2	c-Fos,Max,c-Jun,c- Myc,TCF7L2,E2F6,GATA- 2,SIRT6	H3K4me1,H3K4me2
X	rs5945619	Xp11	51,258,412	NUDT10, NUDT11, LOC340602

Open in a new tab

bp (base pair) position is based on NCBI build 36

For Yale TFBS and Broad Histone annotation sets, transcription factor binding sites and regions that are enriched with specific histone markers were provided for each risk SNP block. For the remaining three annotation categories, a “yes” means the risk SNP block overlapped with the annotation category. Empty cells mean the risk SNP block does not overlap with the annotation category.

Enrichment analysis

Risk SNP blocks were significantly enriched in genomic regions containing AR binding sites, with 11 (34.4%) risk SNP blocks mapping to these sites, whereas only 4.0 (12.5%) blocks randomly generated from the genome are located at such sites (p=0.003). Similarly, risk SNP blocks were significantly enriched in regions containing FoxA1 binding sites (7 (21.88%) vs 3.47 (10.84%); p=0.05) (Table 2). Risk SNP blocks were not significantly enriched in any of the other annotation genomic sets that were tested.

Discussion

PCa risk-associated SNPs identified from GWA studies have been consistently replicated and confirmed in a large number of studies (1–9). The clinical utility of predicting an individual’s risk for PCa and identification of high risk men for PCa using these SNPs have been extensively discussed and explored (17–19). In contrast, the biological mechanisms by which the risk SNPs affect PCa initiation are poorly understood. In this study, we used bioinformatics tools to provide insight into the potential functional impact of these SNPs. Importantly, risk SNPs were found to be significantly enriched over that expected by chance in two functional annotation sets, consisting of AR binding sites (p=0.003) and FoxA1 binding sites (p=0.05).

AR, a member of the nuclear hormone receptor family, is a well-known transcription factor which plays an important role in prostate cancer initiation, although the precise mechanisms by which androgens promote prostate carcinogenesis remain ill-defined despite years of investigation. It is clear, however, that healthy men receiving preventive drugs that block the conversion of testosterone to dihydrotestosterone, the more potent androgen, experience a significant reduction (~25%) in PCa risk (20,21). Upon androgen binding, the AR-androgen dimer translocates from the cytosol into the cell nucleus. The AR-androgen dimer complex then binds to specific DNA sequences known as AR binding sites, recruiting coactivators and other factors which direct and regulate target gene expression. In our study, we demonstrated that almost one third of the 33 known risk SNP blocks are located in AR binding regions identified by the ChIP-chip method (15). A total of ~22,000 AR binding regions have been mapped across the prostate cell genome, using the Model-based Analysis of Tiling-arrays (MAT) algorithm (22) and based on a false discovery rate of 15% (p<10⁻⁴) (15). The average length of the AR binding regions is 911 base pairs (bp), with a range from 299 to 5,554 bp. The significant enrichment of known risk SNP blocks in regions that harbor AR binding regions suggests a molecular mechanism that may explain the associations between the risk SNPs and PCa risk. The statistical significance (P=0.003) remains even after a stringent Bonferroni correction (13 independent tests were performed in enrichment analysis, p=0.05/13=0.0037). Known risk SNPs that are located within the putative AR binding sites may change the binding affinity of the AR-androgen complex for the binding sequence, leading to altered expression of AR target genes that presumably play rate limiting roles in PCa formation.

Risk SNP blocks were also enriched in FoxA1 binding sites with a nominal p value of 0.05. FoxA1 acts as an AR collaborating cofactor and assists nuclear receptor binding in certain genomic regions (23,24). The coupling of FoxA1 to binding sites is required for AR binding to enhancers in multiple AR-targeted genes (15). SNPs that reside in FoxA1 binding sites may affect the binding affinity of FoxA1 protein and lead to increased risk for PCa. However, the importance of risk SNPs and FoxA1 binding need to be interpreted with caution since the enrichment in FoxA1 did not reach statistical significance after Bonferroni correction (P = 0.65 after Bonferroni correction).

One advantage of our study is the use of a comprehensive and unbiased approach to evaluate TF binding regions defined by ChIP-seq and ChIP-chip techniques, which allowed us to evaluate the regulatory elements on a genome-wide level. Regulation of gene expression is a complicated process and current knowledge in this area is very limited. The pilot study of the ENCODE project revealed that only about 25% of the regulatory elements are located near previously identified transcription starting sites (TSS). This suggests the ChIP-on-chip method is able to identify a large number of promoter regions and regulatory elements that were previously unknown and distant from the classical TSS (13). Although we did not observe any significant enrichment of risk SNP blocksin the ENCODE annotation sets, this might be due to tissue-specific patterns of TF and TF regulation. Currently, a variety of cell lines, mainly Hela and GM06990 have been used for the identification of regulatory elements in the ENCODE project. The LNCaP cell line is currently proposed for study by ENCODE consortium in Tier 3 (http://genome.ucsc.edu/ENCODE/cellTypes.html). Prostate cancer tissue-specific TF binding may provide valuable information for evaluating the molecular mechanisms of risk SNPs on PCa risk.

The fact that no risk SNP blocks are located in regions that code proteins may be due to two reasons. First, we only evaluated SNPs that are characterized by the Hapmap project, in which unknown SNPs that are located in protein coding regions are not evaluated. Secondly, the current resources that are commonly used for gene annotation, RefSeq and ENSEMBLE, likely represents an incomplete catalogue of human genes. SNPs may be located within exons that are not yet identified. In fact, about 60% of GENCODE exons (GENCODE is a sub-project of ENCODE, which aimed to provide a reference annotation of all protein-coding genes within the pilot study of the ENCODE project) are not annotated in RefSeq and ENSEMBL (25). This fact indicates that a high number of alternative slice forms with unique exons exist across the genome (25). With the completion of ENCODE in the near future, a richer and more complete annotation of human genes should provide more insight.

We also did not observe significant enrichment of risk SNP blocks in the regulatory annotation sets defined by UCSC table browsers. However, the null results for these annotation sets need to be interpreted with caution. The majority of the regulatory elements annotation sets are defined by computational algorithms, rather than by biological experiments. In addition, the regulatory elements predicted in those annotation sets are not tissue specific.

In summary, our study is among the first to comprehensively evaluate the potential functional impact of risk loci identified for PCa through GWAS studies. The fact that about one third of 33 SNP blocks fall within AR binding regions, and that the risk SNPs were statistically enriched in AR regions, may suggest a potential molecular mechanism by which risk SNPs contribute to PCa initiation. These results also provide a guidance for future functional studies. In addition, the databases used for bioinformatics annotation could also be used to annotate and prioritize variants identified through GWAS and whole-genome sequencing.

Supplementary Material

Supplementary Table 1

NIHMS258501-supplement-Supplementary_Table_1.doc^{(1.4MB, doc)}

Supplementary Table 2

NIHMS258501-supplement-Supplementary_Table_2.doc^{(242.5KB, doc)}

Table 3.

Enrichment analysis of the 33 risk SNP blocks

Annotation type^*	# of counts and frequency (%) in the PRAS blocks	# of counts and frequency in the randomly generated SNP blocks	P-value^#
Yale TFBS
c-Myc	5 (15.63)	3.43 (0.11)	0.26
TCF7L2	5 (15.63)	3.08 (0.1)	0.18
STAT1	6 (18.75)	3.61 (0.11)	0.13
c-Fos	5 (15.63)	3.58 (0.11)	0.28
Broad Histone
H3K4me1	23 (71.88)	19.88 (0.62)	0.18
H3K4me2	16 (50.00)	14.04 (0.44)	0.32
H3K4me3	8 (25.00)	5.72 (0.18)	0.19
H3K27ac	5 (15.63)	4.29 (0.13)	0.43
Regulatory elements defined by UCSC table browser
laminB1	24 (75.00)	24.61 (76.91)	0.53
tfbsConsSites	6 (18.75)	8.93 (0.28)	0.82
Conserved region
Conserved region	10 (31.25)	8.69 (27.16)	0.38
Transcription factor binding sites defined by ChIP-chip technology
AR binding	11 (34.38)	3.99 (12.47)	0.003
Fox A1 binding	7 (21.88)	3.47 (10.84)	0.05

Open in a new tab

Enrichment analysis was only performed for annotation sets with 5 or more mapped risk SNP blocks

p-value is based on 1,000 simulation replicates

Acknowledgments

Grants: National Cancer Institute (CA129684, CA140262, CA148463 to J.X.) and Department of Defense (W81XWH-09-1-0488 to J.S.)

References

1.Yeager M, Orr N, Hayes RB, et al. Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nat Genet. 2007;39:645–649. doi: 10.1038/ng2022. [DOI] [PubMed] [Google Scholar]
2.Gudmundsson J, Sulem P, Steinthorsdottir V, et al. Two variants on chromosome 17 confer prostate cancer risk, and the one in TCF2 protects against type 2 diabetes. Nat Genet. 2007;39:977–983. doi: 10.1038/ng2062. [DOI] [PubMed] [Google Scholar]
3.Duggan D, Zheng SL, Knowlton M, et al. Two genome-wide association studies of aggressive prostate cancer implicate putative prostate tumor suppressor gene DAB2IP. J Natl Cancer Inst. 2007;99:1836–1844. doi: 10.1093/jnci/djm250. [DOI] [PubMed] [Google Scholar]
4.Thomas G, Jacobs KB, Yeager M, et al. Multiple loci identified in a genome-wide association study of prostate cancer. Nat Genet. 2008;40:310–315. doi: 10.1038/ng.91. [DOI] [PubMed] [Google Scholar]
5.Gudmundsson J, Sulem P, Rafnar T, et al. Common sequence variants on 2p15 and Xp11.22 confer susceptibility to prostate cancer. Nat Genet. 2008;40:281–283. doi: 10.1038/ng.89. [DOI] [PMC free article] [PubMed] [Google Scholar]
6.Eeles RA, Kote-Jarai Z, Giles GG, et al. Multiple newly identified loci associated with prostate cancer susceptibility. Nat Genet. 2008;40:316–321. doi: 10.1038/ng.90. [DOI] [PubMed] [Google Scholar]
7.Yeager M, Chatterjee N, Ciampa J, et al. Identification of a new prostate cancer susceptibility locus on chromosome 8q24. Nat Genet. 2009;41:1055–1057. doi: 10.1038/ng.444. [DOI] [PMC free article] [PubMed] [Google Scholar]
8.Gudmundsson J, Sulem P, Gudbjartsson DF, et al. Genome-wide association and replication studies identify four variants associated with prostate cancer susceptibility. Nat Genet. 2009;41:1122–1126. doi: 10.1038/ng.448. [DOI] [PMC free article] [PubMed] [Google Scholar]
9.Eeles RA, Kote-Jarai Z, Al Olama AA, et al. Identification of seven new prostate cancer susceptibility loci through a genome-wide association study. Nat Genet. 2009;41:1116–1121. doi: 10.1038/ng.450. [DOI] [PMC free article] [PubMed] [Google Scholar]
10.Ghoussaini M, Song H, Koessler T, Al Olama AA, Kote-Jarai Z, Driver KE, Pooley KA, Ramus SJ, Kjaer SK, Hogdall E, DiCioccio RA, Whittemore AS, Gayther SA, Giles GG, Guy M, Edwards SM, Morrison J, Donovan JL, Hamdy FC, Dearnaley DP, Ardern-Jones AT, Hall AL, O'Brien LT, Gehr-Swain BN, Wilkinson RA, Brown PM, Hopper JL, Neal DE, Pharoah PD, Ponder BA, Eeles RA, Easton DF, Dunning AM UK Genetic Prostate Cancer Study Collaborators/British Association of Urological Surgeons' Section of Oncology; UK ProtecT Study Collaborators. Multiple loci with different cancer specificities within the 8q24 gene desert. J Natl Cancer Inst. 2008 Jul 2;100(13):962–966. doi: 10.1093/jnci/djn190. Epub 2008 Jun 24. [DOI] [PMC free article] [PubMed] [Google Scholar]
11.Lou H, Yeager M, Li H, Bosquet JG, Hayes RB, Orr N, Yu K, Hutchinson A, Jacobs KB, Kraft P, Wacholder S, Chatterjee N, Feigelson HS, Thun MJ, Diver WR, Albanes D, Virtamo J, Weinstein S, Ma J, Gaziano JM, Stampfer M, Schumacher FR, Giovannucci E, Cancel-Tassin G, Cussenot O, Valeri A, Andriole GL, Crawford ED, Anderson SK, Tucker M, Hoover RN, Fraumeni JF, Jr, Thomas G, Hunter DJ, Dean M, Chanock SJ. Fine mapping and functional analysis of a common variant in MSMB on chromosome 10q11.2 associated with prostate cancer susceptibility. Proc Natl Acad Sci U S A. 2009 May 12;106(19):7933–7938. doi: 10.1073/pnas.0902104106. Epub 2009 Apr 21. [DOI] [PMC free article] [PubMed] [Google Scholar]
12.Chang BL, Cramer SD, Wiklund F, Isaacs SD, Stevens VL, Sun J, Smith S, Pruett K, Romero LM, Wiley KE, Kim ST, Zhu Y, Zhang Z, Hsu FC, Turner AR, Adolfsson J, Liu W, Kim JW, Duggan D, Carpten J, Zheng SL, Rodriguez C, Isaacs WB, Grönberg H, Xu J. Fine mapping association study and functional analysis implicate a SNP in MSMB at 10q11 as a causal variant for prostate cancer risk. Hum Mol Genet. 2009 Apr 1;18(7):1368–1375. doi: 10.1093/hmg/ddp035. Epub 2009 Jan 19. [DOI] [PMC free article] [PubMed] [Google Scholar]
13.The ENCODE Project Consortium Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007 Jun 14;447(7146):799–816. doi: 10.1038/nature05874. [DOI] [PMC free article] [PubMed] [Google Scholar]
14.Euskirchen GM, Rozowsky JS, Wei CL, Lee WH, Zhang ZD, Hartman S, Emanuelsson O, Stolc V, Weissman S, Gerstein MB, Ruan Y, Snyder M. Mapping of transcription factor binding regions in mammalian cells by ChIP: comparison of array- and sequencing-based technologies. Genome Res. 2007 Jun 17;6:898–909. doi: 10.1101/gr.5583007. [DOI] [PMC free article] [PubMed] [Google Scholar]
15.Wang Q, Li W, Zhang Y, Yuan X, Xu K, Yu J, et al. Androgen receptor regulates a distinct transcription program in androgen-independent prostate cancer. Cell. 2009 Jul 23;138(2):245–256. doi: 10.1016/j.cell.2009.04.056. [DOI] [PMC free article] [PubMed] [Google Scholar]
16.Carroll JS, Meyer CA, Song J, Li W, Geistlinger TR, Eeckhoute J, Brodsky AS, Keeton EK, Fertuck KC, Hall GF, Wang Q, Bekiranov S, Sementchenko V, Fox EA, Silver PA, Gingeras TR, Liu XS*, Brown M* Genome-wide analysis of estrogen receptor binding sites. Nat. Genet. 2006;38:1289–1297. doi: 10.1038/ng1901. [DOI] [PubMed] [Google Scholar]
17.Xu J, Sun J, Kader AK, Lindström S, Wiklund F, Hsu FC, Johansson JE, Zheng SL, Thomas G, Hayes RB, Kraft P, Hunter DJ, Chanock SJ, Isaacs WB, Grönberg H. Estimation of absolute risk for prostate cancer using genetic markers and family history. Prostate. 2009 Oct 1;69(14):1565–1572. doi: 10.1002/pros.21002. [DOI] [PMC free article] [PubMed] [Google Scholar]
18.Hsu FC, Sun J, Zhu Y, Kim ST, Jin T, Zhang Z, Wiklund F, Kader AK, Zheng SL, Isaacs W, Grönberg H, Xu J. Comparison of two methods for estimating absolute risk of prostate cancer based on single nucleotide polymorphisms and family history. Cancer Epidemiol Biomarkers Prev. 2010 Apr;19(4):1083–1088. doi: 10.1158/1055-9965.EPI-09-1176. [DOI] [PMC free article] [PubMed] [Google Scholar]
19.Sun J, Kader AK, et al. Inherited genetic markers discovered to date are able to identify a significant number of men at considerably elevated risk for prostate cancer. Prostate. doi: 10.1002/pros.21256. (In press) [DOI] [PMC free article] [PubMed] [Google Scholar]
20.Thompson IM, Goodman PJ, Tangen CM, et al. The influence of finasteride on the development of prostate cancer. N Engl J Med. 2003;349:215–224. doi: 10.1056/NEJMoa030660. [DOI] [PubMed] [Google Scholar]
21.Andriole GA, Bostwick D, Brawley OW. The influence of dutasteride on the risk of biopsy-detectable prostate cancer: Outcomes of the REduction by DUtasteride of Prostate Cancer Events (REDUCE) study. N Engl J Med. 2010 Apr 1;362(13):1192–1202. doi: 10.1056/NEJMoa0908127. [DOI] [PubMed] [Google Scholar]
22.Johnson WE, Li W, Meyer CA, Gottardo R, Carroll JS, Brown M, Liu XS. Model-based analysis of tiling-arrays for ChIP-chip. Proc Natl Acad Sci U S A. 2006 Aug 15;103(33):12457–12462. doi: 10.1073/pnas.0601180103. [DOI] [PMC free article] [PubMed] [Google Scholar]
23.Carroll JS, Liu XS, Brodsky AS, Li W, Meyer CA, Szary AJ, Eeckhoute J, Shao W, Hestermann EV, Geistlinger TR, Fox EA, Silver PA, Brown M. Chromosome-wide mapping of estrogen receptor binding reveals long-range regulation requiring the forkhead protein FoxA1. Cell. 2005 Jul 15;122(1):33–43. doi: 10.1016/j.cell.2005.05.008. [DOI] [PubMed] [Google Scholar]
24.Wang Q, Li W, Liu XS, Carroll JS, Jänne OA, Keeton EK, Chinnaiyan AM, Pienta KJ, Brown M. A hierarchical network of transcription factors governs androgen receptor-dependent prostate cancer growth. Mol Cell. 2007 Aug 3;27(3):380–392. doi: 10.1016/j.molcel.2007.05.041. [DOI] [PMC free article] [PubMed] [Google Scholar]
25.Harrow J, Denoeud F, Frankish A, Reymond A, Chen CK, Chrast J, Lagarde J, Gilbert JG, Storey R, Swarbreck D, Rossier C, Ucla C, Hubbard T, Antonarakis SE, Guigo R. GENCODE: producing a reference annotation for ENCODE. Genome Biol. 2006;7 Suppl 1:S4.1–S4.9. doi: 10.1186/gb-2006-7-s1-s4. Epub 2006 Aug 7. [DOI] [PMC free article] [PubMed] [Google Scholar]
26.Gardiner-Garden M, Frommer M. CpG islands in vertebrate genomes. J.Mol.Biol. 1987 Jul 20;196(2):261–282. doi: 10.1016/0022-2836(87)90689-9. [DOI] [PubMed] [Google Scholar]
27.Washietl S, Pedersen JS, Korbel JO, Stocsits C, Gruber AR, Hackermüller J, Hertel J, Lindemeyer M, Reiche K, Tanzer A, Ucla C, Wyss C, Antonarakis SE, Denoeud F, Lagarde J, Drenkow J, Kapranov P, Gingeras TR, Guigó R, Snyder M, Gerstein MB, Reymond A, Hofacker IL, Stadler PF. Structured RNAs in the ENCODE selected regions of the human genome. Genome Res. 2007 Jun;17(6):852–864. doi: 10.1101/gr.5650707. [DOI] [PMC free article] [PubMed] [Google Scholar]
28.Down TA, Hubbard TJP. Computational detection and location of transcription start sites in mammalian genomic DNA. Genome Res. 2002 Mar;12(3):458–461. doi: 10.1101/gr.216102. [DOI] [PMC free article] [PubMed] [Google Scholar]
29.Davuluri RV, Grosse I, Zhang MQ. Computational identification of promoters and first exons in the human genome. Nat Genet. 2001 Dec;29(4):412–417. doi: 10.1038/ng780. Erratum in: Nat Genet 2002 Nov;32(3):459. [DOI] [PubMed] [Google Scholar]
30.Guelen L, Pagie L, Brasset E, Meuleman W, Faza MB, Talhout W, Eussen BH, de Klein A, Wessels L, de Laat W, van Steensel B. Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions. Nature. 2008 June 12;453:948–951. doi: 10.1038/nature06947. [DOI] [PubMed] [Google Scholar]
31.Pennacchio LA, Ahituv N, Moses AM, Prabhakar S, Nobrega MA, Shoukry M, Minovitsky S, Dubchak I, Holt A, Lewis KD, et al. In vivo enhancer analysis of human conserved non-coding sequences. Nature. 2006 Nov 23;444(7118):499–502. doi: 10.1038/nature05295. [DOI] [PubMed] [Google Scholar]
32.Cheng Y, Miura RM, Tian B. Prediction of mRNA polyadenylation sites by support vector machine. Bioinformatics. 2006;22:2320–2325. doi: 10.1093/bioinformatics/btl394. [DOI] [PubMed] [Google Scholar]
33.Benjamin P Lewis, Christopher B Burge, David P Bartel. Conserved Seed Pairing, Often Flanked by Adenosines, Indicates that Thousands of Human Genes are MicroRNA Targets. Cell. 2005;120:15–20. doi: 10.1016/j.cell.2004.12.035. [DOI] [PubMed] [Google Scholar]
34.Andrew Grimson, Kyle Kai-How Farh, Wendy K Johnston, Philip Garrett-Engele, Lee P Lim, David P Bartel. MicroRNA Targeting Specificity in Mammals: Determinants beyond Seed Pairing Molecular. Cell. 2007;27:91–105. doi: 10.1016/j.molcel.2007.06.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
35.Robin C Friedman, Kyle Kai-How Farh, Christopher B Burge, David P Bartel. Most Mammalian mRNAs Are Conserved Targets of MicroRNAs. Genome Research. 2009;19:92–105. doi: 10.1101/gr.082701.108. [DOI] [PMC free article] [PubMed] [Google Scholar]
36.Down TA, Hubbard TJ. Computational detection and location of transcription start sites in mammalian genomic DNA. Genome research. 2002;12(3):458–461. doi: 10.1101/gr.216102. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Table 1

NIHMS258501-supplement-Supplementary_Table_1.doc^{(1.4MB, doc)}

Supplementary Table 2

NIHMS258501-supplement-Supplementary_Table_2.doc^{(242.5KB, doc)}

[R1] 1.Yeager M, Orr N, Hayes RB, et al. Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nat Genet. 2007;39:645–649. doi: 10.1038/ng2022. [DOI] [PubMed] [Google Scholar]

[R2] 2.Gudmundsson J, Sulem P, Steinthorsdottir V, et al. Two variants on chromosome 17 confer prostate cancer risk, and the one in TCF2 protects against type 2 diabetes. Nat Genet. 2007;39:977–983. doi: 10.1038/ng2062. [DOI] [PubMed] [Google Scholar]

[R3] 3.Duggan D, Zheng SL, Knowlton M, et al. Two genome-wide association studies of aggressive prostate cancer implicate putative prostate tumor suppressor gene DAB2IP. J Natl Cancer Inst. 2007;99:1836–1844. doi: 10.1093/jnci/djm250. [DOI] [PubMed] [Google Scholar]

[R4] 4.Thomas G, Jacobs KB, Yeager M, et al. Multiple loci identified in a genome-wide association study of prostate cancer. Nat Genet. 2008;40:310–315. doi: 10.1038/ng.91. [DOI] [PubMed] [Google Scholar]

[R5] 5.Gudmundsson J, Sulem P, Rafnar T, et al. Common sequence variants on 2p15 and Xp11.22 confer susceptibility to prostate cancer. Nat Genet. 2008;40:281–283. doi: 10.1038/ng.89. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R6] 6.Eeles RA, Kote-Jarai Z, Giles GG, et al. Multiple newly identified loci associated with prostate cancer susceptibility. Nat Genet. 2008;40:316–321. doi: 10.1038/ng.90. [DOI] [PubMed] [Google Scholar]

[R7] 7.Yeager M, Chatterjee N, Ciampa J, et al. Identification of a new prostate cancer susceptibility locus on chromosome 8q24. Nat Genet. 2009;41:1055–1057. doi: 10.1038/ng.444. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R8] 8.Gudmundsson J, Sulem P, Gudbjartsson DF, et al. Genome-wide association and replication studies identify four variants associated with prostate cancer susceptibility. Nat Genet. 2009;41:1122–1126. doi: 10.1038/ng.448. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R9] 9.Eeles RA, Kote-Jarai Z, Al Olama AA, et al. Identification of seven new prostate cancer susceptibility loci through a genome-wide association study. Nat Genet. 2009;41:1116–1121. doi: 10.1038/ng.450. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R10] 10.Ghoussaini M, Song H, Koessler T, Al Olama AA, Kote-Jarai Z, Driver KE, Pooley KA, Ramus SJ, Kjaer SK, Hogdall E, DiCioccio RA, Whittemore AS, Gayther SA, Giles GG, Guy M, Edwards SM, Morrison J, Donovan JL, Hamdy FC, Dearnaley DP, Ardern-Jones AT, Hall AL, O'Brien LT, Gehr-Swain BN, Wilkinson RA, Brown PM, Hopper JL, Neal DE, Pharoah PD, Ponder BA, Eeles RA, Easton DF, Dunning AM UK Genetic Prostate Cancer Study Collaborators/British Association of Urological Surgeons' Section of Oncology; UK ProtecT Study Collaborators. Multiple loci with different cancer specificities within the 8q24 gene desert. J Natl Cancer Inst. 2008 Jul 2;100(13):962–966. doi: 10.1093/jnci/djn190. Epub 2008 Jun 24. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R11] 11.Lou H, Yeager M, Li H, Bosquet JG, Hayes RB, Orr N, Yu K, Hutchinson A, Jacobs KB, Kraft P, Wacholder S, Chatterjee N, Feigelson HS, Thun MJ, Diver WR, Albanes D, Virtamo J, Weinstein S, Ma J, Gaziano JM, Stampfer M, Schumacher FR, Giovannucci E, Cancel-Tassin G, Cussenot O, Valeri A, Andriole GL, Crawford ED, Anderson SK, Tucker M, Hoover RN, Fraumeni JF, Jr, Thomas G, Hunter DJ, Dean M, Chanock SJ. Fine mapping and functional analysis of a common variant in MSMB on chromosome 10q11.2 associated with prostate cancer susceptibility. Proc Natl Acad Sci U S A. 2009 May 12;106(19):7933–7938. doi: 10.1073/pnas.0902104106. Epub 2009 Apr 21. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R12] 12.Chang BL, Cramer SD, Wiklund F, Isaacs SD, Stevens VL, Sun J, Smith S, Pruett K, Romero LM, Wiley KE, Kim ST, Zhu Y, Zhang Z, Hsu FC, Turner AR, Adolfsson J, Liu W, Kim JW, Duggan D, Carpten J, Zheng SL, Rodriguez C, Isaacs WB, Grönberg H, Xu J. Fine mapping association study and functional analysis implicate a SNP in MSMB at 10q11 as a causal variant for prostate cancer risk. Hum Mol Genet. 2009 Apr 1;18(7):1368–1375. doi: 10.1093/hmg/ddp035. Epub 2009 Jan 19. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R13] 13.The ENCODE Project Consortium Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007 Jun 14;447(7146):799–816. doi: 10.1038/nature05874. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R14] 14.Euskirchen GM, Rozowsky JS, Wei CL, Lee WH, Zhang ZD, Hartman S, Emanuelsson O, Stolc V, Weissman S, Gerstein MB, Ruan Y, Snyder M. Mapping of transcription factor binding regions in mammalian cells by ChIP: comparison of array- and sequencing-based technologies. Genome Res. 2007 Jun 17;6:898–909. doi: 10.1101/gr.5583007. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R15] 15.Wang Q, Li W, Zhang Y, Yuan X, Xu K, Yu J, et al. Androgen receptor regulates a distinct transcription program in androgen-independent prostate cancer. Cell. 2009 Jul 23;138(2):245–256. doi: 10.1016/j.cell.2009.04.056. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R16] 16.Carroll JS, Meyer CA, Song J, Li W, Geistlinger TR, Eeckhoute J, Brodsky AS, Keeton EK, Fertuck KC, Hall GF, Wang Q, Bekiranov S, Sementchenko V, Fox EA, Silver PA, Gingeras TR, Liu XS*, Brown M* Genome-wide analysis of estrogen receptor binding sites. Nat. Genet. 2006;38:1289–1297. doi: 10.1038/ng1901. [DOI] [PubMed] [Google Scholar]

[R17] 17.Xu J, Sun J, Kader AK, Lindström S, Wiklund F, Hsu FC, Johansson JE, Zheng SL, Thomas G, Hayes RB, Kraft P, Hunter DJ, Chanock SJ, Isaacs WB, Grönberg H. Estimation of absolute risk for prostate cancer using genetic markers and family history. Prostate. 2009 Oct 1;69(14):1565–1572. doi: 10.1002/pros.21002. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R18] 18.Hsu FC, Sun J, Zhu Y, Kim ST, Jin T, Zhang Z, Wiklund F, Kader AK, Zheng SL, Isaacs W, Grönberg H, Xu J. Comparison of two methods for estimating absolute risk of prostate cancer based on single nucleotide polymorphisms and family history. Cancer Epidemiol Biomarkers Prev. 2010 Apr;19(4):1083–1088. doi: 10.1158/1055-9965.EPI-09-1176. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R19] 19.Sun J, Kader AK, et al. Inherited genetic markers discovered to date are able to identify a significant number of men at considerably elevated risk for prostate cancer. Prostate. doi: 10.1002/pros.21256. (In press) [DOI] [PMC free article] [PubMed] [Google Scholar]

[R20] 20.Thompson IM, Goodman PJ, Tangen CM, et al. The influence of finasteride on the development of prostate cancer. N Engl J Med. 2003;349:215–224. doi: 10.1056/NEJMoa030660. [DOI] [PubMed] [Google Scholar]

[R21] 21.Andriole GA, Bostwick D, Brawley OW. The influence of dutasteride on the risk of biopsy-detectable prostate cancer: Outcomes of the REduction by DUtasteride of Prostate Cancer Events (REDUCE) study. N Engl J Med. 2010 Apr 1;362(13):1192–1202. doi: 10.1056/NEJMoa0908127. [DOI] [PubMed] [Google Scholar]

[R22] 22.Johnson WE, Li W, Meyer CA, Gottardo R, Carroll JS, Brown M, Liu XS. Model-based analysis of tiling-arrays for ChIP-chip. Proc Natl Acad Sci U S A. 2006 Aug 15;103(33):12457–12462. doi: 10.1073/pnas.0601180103. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R23] 23.Carroll JS, Liu XS, Brodsky AS, Li W, Meyer CA, Szary AJ, Eeckhoute J, Shao W, Hestermann EV, Geistlinger TR, Fox EA, Silver PA, Brown M. Chromosome-wide mapping of estrogen receptor binding reveals long-range regulation requiring the forkhead protein FoxA1. Cell. 2005 Jul 15;122(1):33–43. doi: 10.1016/j.cell.2005.05.008. [DOI] [PubMed] [Google Scholar]

[R24] 24.Wang Q, Li W, Liu XS, Carroll JS, Jänne OA, Keeton EK, Chinnaiyan AM, Pienta KJ, Brown M. A hierarchical network of transcription factors governs androgen receptor-dependent prostate cancer growth. Mol Cell. 2007 Aug 3;27(3):380–392. doi: 10.1016/j.molcel.2007.05.041. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R25] 25.Harrow J, Denoeud F, Frankish A, Reymond A, Chen CK, Chrast J, Lagarde J, Gilbert JG, Storey R, Swarbreck D, Rossier C, Ucla C, Hubbard T, Antonarakis SE, Guigo R. GENCODE: producing a reference annotation for ENCODE. Genome Biol. 2006;7 Suppl 1:S4.1–S4.9. doi: 10.1186/gb-2006-7-s1-s4. Epub 2006 Aug 7. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R26] 26.Gardiner-Garden M, Frommer M. CpG islands in vertebrate genomes. J.Mol.Biol. 1987 Jul 20;196(2):261–282. doi: 10.1016/0022-2836(87)90689-9. [DOI] [PubMed] [Google Scholar]

[R27] 27.Washietl S, Pedersen JS, Korbel JO, Stocsits C, Gruber AR, Hackermüller J, Hertel J, Lindemeyer M, Reiche K, Tanzer A, Ucla C, Wyss C, Antonarakis SE, Denoeud F, Lagarde J, Drenkow J, Kapranov P, Gingeras TR, Guigó R, Snyder M, Gerstein MB, Reymond A, Hofacker IL, Stadler PF. Structured RNAs in the ENCODE selected regions of the human genome. Genome Res. 2007 Jun;17(6):852–864. doi: 10.1101/gr.5650707. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R28] 28.Down TA, Hubbard TJP. Computational detection and location of transcription start sites in mammalian genomic DNA. Genome Res. 2002 Mar;12(3):458–461. doi: 10.1101/gr.216102. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R29] 29.Davuluri RV, Grosse I, Zhang MQ. Computational identification of promoters and first exons in the human genome. Nat Genet. 2001 Dec;29(4):412–417. doi: 10.1038/ng780. Erratum in: Nat Genet 2002 Nov;32(3):459. [DOI] [PubMed] [Google Scholar]

[R30] 30.Guelen L, Pagie L, Brasset E, Meuleman W, Faza MB, Talhout W, Eussen BH, de Klein A, Wessels L, de Laat W, van Steensel B. Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions. Nature. 2008 June 12;453:948–951. doi: 10.1038/nature06947. [DOI] [PubMed] [Google Scholar]

[R31] 31.Pennacchio LA, Ahituv N, Moses AM, Prabhakar S, Nobrega MA, Shoukry M, Minovitsky S, Dubchak I, Holt A, Lewis KD, et al. In vivo enhancer analysis of human conserved non-coding sequences. Nature. 2006 Nov 23;444(7118):499–502. doi: 10.1038/nature05295. [DOI] [PubMed] [Google Scholar]

[R32] 32.Cheng Y, Miura RM, Tian B. Prediction of mRNA polyadenylation sites by support vector machine. Bioinformatics. 2006;22:2320–2325. doi: 10.1093/bioinformatics/btl394. [DOI] [PubMed] [Google Scholar]

[R33] 33.Benjamin P Lewis, Christopher B Burge, David P Bartel. Conserved Seed Pairing, Often Flanked by Adenosines, Indicates that Thousands of Human Genes are MicroRNA Targets. Cell. 2005;120:15–20. doi: 10.1016/j.cell.2004.12.035. [DOI] [PubMed] [Google Scholar]

[R34] 34.Andrew Grimson, Kyle Kai-How Farh, Wendy K Johnston, Philip Garrett-Engele, Lee P Lim, David P Bartel. MicroRNA Targeting Specificity in Mammals: Determinants beyond Seed Pairing Molecular. Cell. 2007;27:91–105. doi: 10.1016/j.molcel.2007.06.017. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R35] 35.Robin C Friedman, Kyle Kai-How Farh, Christopher B Burge, David P Bartel. Most Mammalian mRNAs Are Conserved Targets of MicroRNAs. Genome Research. 2009;19:92–105. doi: 10.1101/gr.082701.108. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R36] 36.Down TA, Hubbard TJ. Computational detection and location of transcription start sites in mammalian genomic DNA. Genome research. 2002;12(3):458–461. doi: 10.1101/gr.216102. [DOI] [PMC free article] [PubMed] [Google Scholar]

PERMALINK

Functional annotation of risk loci identified through genome-wide association studies for prostate cancer

Yizhen Lu

Zheng Zhang

Hongjie Yu

S Lily Zheng

William B Isaacs

Jianfeng Xu

Jielin Sun

Abstract

Background

Methodology/Principal Findings

Conclusions/Significance

Introduction

Methods

Define SNPs that are in Linkage Disequilibrium (LD) with established PCa risk-associated SNPs discovered from GWAS

Overlapping the risk SNP blocks with functionally annotated genomic regions

Assessment of enrichment of the risk SNP blocks in the annotated genomic regions

Results

Identification of SNPs in LD with PCa risk SNPs

Defining the functional annotation databases

Table 1.

Mapping of risk SNP blocks to the functional annotation sets

Table 2.

Enrichment analysis

Discussion

Supplementary Material

Table 3.

Acknowledgments

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Cite

Add to Collections

PERMALINK

Functional annotation of risk loci identified through genome-wide association studies for prostate cancer

Yizhen Lu

Zheng Zhang

Hongjie Yu

S Lily Zheng

William B Isaacs

Jianfeng Xu

Jielin Sun

Abstract

Background

Methodology/Principal Findings

Conclusions/Significance

Introduction

Methods

Define SNPs that are in Linkage Disequilibrium (LD) with established PCa risk-associated SNPs discovered from GWAS

Overlapping the risk SNP blocks with functionally annotated genomic regions

Assessment of enrichment of the risk SNP blocks in the annotated genomic regions

Results

Identification of SNPs in LD with PCa risk SNPs

Defining the functional annotation databases

Table 1.

Mapping of risk SNP blocks to the functional annotation sets

Table 2.

Enrichment analysis

Discussion

Supplementary Material

Table 3.

Acknowledgments

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases