BMC Cancer. 2025 Nov 4;25:1708. doi: 10.1186/s12885-025-14827-0

Cross-tissue transcriptome-wide association studies identify genetic susceptibility genes for prostate cancer

Jingqi Hua 1, Yichen Qian 1, Yuanchen Lu 1, Qijie Zhang 1, Hongliang Que 1, Tengyue Zeng 1, Quan Li 1, Junpeng Deng 1, Jianjun Xie 1
PMCID: PMC12587738  PMID: 41188790

Abstract

Background

Despite significant advances made by genome-wide association studies (GWAS) in the genetic exploration of tumors such as prostate cancer (PCa), the precise pathogenic genes and underlying biological mechanisms of PCa remain unclear.

Methods

To address this complex issue, we used a cross-tissue transcriptome-wide association study (TWAS) strategy within the Unified Test for Molecular Signatures (UTMOST) framework. This approach integrated GWAS summary statistics from 122,188 PCa patients and 604,640 controls with extensive gene expression data from the Genotype-Tissue Expression (GTEx) project. We validated key gene discoveries using three complementary methods: FUSION, FOCUS, and Multi-marker Analysis of GenoMic Annotation (MAGMA). Additionally, MAGMA was used to examine the tissue and functional level enrichment of single nucleotide polymorphisms (SNPs) associated with PCa. Conditional and joint analysis, as well as fine mapping techniques, were employed to deepen our understanding of PCa’s genetic architecture. To establish causal relationships, we conducted Mendelian randomization analysis, while colocalization analysis was used to identify potential shared SNPs between key genes and PCa risk.

Results

Through the comprehensive application of four TWAS methods, we identified 13 potential susceptibility genes closely associated with PCa risk. Mendelian randomization analysis supported causal links between PCa and the WDPCP, RIF1, POLI, HAAO, GGCX and CASP10 genes. Colocalization analysis further revealed that CASP10 (rs6735656) and GGCX (rs2028900) may share genetic signals between GWAS and expression quantitative trait loci (eQTL), indicating common pathways in PCa pathogenesis.

Conclusion

This study identified 13 new susceptibility genes significantly associated with PCa risk and provided new insights into the genetic basis of PCa, contributing to a more comprehensive understanding of its complex genetic structure.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12885-025-14827-0.

Keywords: PCa, TWAS, UTMOST, MAGMA, eQTL

Introduction

Prostate cancer (PCa) is one of the most prevalent cancers among men globally, with high incidence and mortality rates in many countries [1]. According to global cancer statistics, PCa constitutes about 13.5% of all new cancer cases in men, resulting in hundreds of thousands of deaths each year [2]. The risk of developing PCa increases significantly with age, particularly in men over 65 years old [3, 4]. Additionally, there are notable racial disparities in PCa incidence and mortality, with African American men experiencing higher rates compared to other ethnic groups [5].

The pathogenesis of PCa is highly complex. Although its precise causes are not fully understood, research indicates that genetic, environmental, and lifestyle factors all contribute to the development and progression of the disease [6–8]. A family history of PCa is a major risk factor, significantly increasing a man’s likelihood of developing the condition [9, 10]. Genome-wide association studies (GWAS) have become essential in exploring the genetic foundations of PCa. By scanning the genome for single nucleotide polymorphisms (SNPs), GWAS identify genetic variants linked to PCa risk. These studies have uncovered hundreds of risk loci. For instance, variations in genes like HOXB13, BRCA2, and ATM are strongly associated with increased PCa risk [11–13]. These discoveries not only elucidate the genetic landscape of PCa but also provide targets for further functional research.

Despite the advancements made through GWAS, the identified genetic variants often explain only a small fraction of the disease risk. This indicates that the genetic architecture of PCa is likely more intricate, involving numerous low-frequency or rare variants yet to be identified. Moreover, GWAS can pinpoint SNPs significantly associated with the disease but cannot elucidate the biological mechanisms underlying these associations. Therefore, future research must integrate additional genomic and functional approaches to fully understand the genetic basis and pathogenesis of PCa.

In recent years, the release of the GTEx project has made a wealth of gene expression data available, providing a valuable resource for TWAS, especially in a multi-tissue context. Similar to this study, Duo Liu et al. [14] identified genes associated with PCa risk through TWAS analysis of methylation data in PCa. Although that work provided valuable clues, it was limited to methylation-level analysis and did not address more complex mechanisms of gene expression regulation. Nicholas Mancuso et al. [15] conducted an extensive study combining PCa GWAS and GTEx eQTL data, identifying 109 genes associated with PCa risk. However, that study used a more traditional analysis strategy and did not apply newer tools for cross-tissue data integration and fine-mapping, such as UTMOST and FOCUS, which are effective in improving identification accuracy and reducing false-positive results. Therefore, although the 109 genes identified by Mancuso et al. provide a valuable reference, the present study aims to improve the accuracy and reliability of the analysis by using more advanced tools, such as UTMOST, FUSION, and FOCUS.

In addition, Ramya Sundaram et al. [16] used TWAS analysis with GTEx data to propose some novel risk genes and pathways, further supporting the multigene and multitissue nature of prostate cancer associations. This is in line with the core idea of the present study, which is to reveal a more comprehensive genetic background through cross-tissue association analysis, especially in terms of prostate tissue-specific susceptibility genes and gene expression regulation.

However, although several studies have used similar TWAS approaches, integrated analyses combining GTEx and GWAS data still face certain challenges. For example, existing studies often focus on a single tissue or fail to effectively utilize newer tools for cross-tissue integration, which restricts a comprehensive exploration of prostate cancer risk genes. This study aims to uncover novel susceptibility genes for PCa through the comprehensive application of cross-tissue transcriptome-wide association studies (TWAS) [17]. We combined the extensive multi-tissue gene expression data from the GTEx project with the summary statistics from PCa GWAS, utilizing advanced genetic analysis tools such as UTMOST [18], FUSION [19–22], MAGMA [23], and FOCUS [24, 25] to systematically explore the potential relationships between gene expression variations and PCa risk.

The use of UTMOST allows us to integrate data across various tissues, enhancing statistical power and capturing key genetic associations that might be missed in traditional single-tissue studies [18]. FUSION leverages open gene expression resources and GWAS data to identify genes that may influence PCa phenotypes through their expression levels, providing crucial insights into the disease’s molecular mechanisms and biological pathways [19–22]. FOCUS, an advanced TWAS method, improves the accuracy of identifying specific genes associated with disease risk through fine-mapping, reducing false positives and increasing the interpretability of genetic signals [24, 25].

Additionally, this study employs conditional analysis, colocalization analysis, and Mendelian randomization analysis to verify the reliability and potential causality of the identified associations. This comprehensive approach not only enhances our understanding of the genetic basis of PCa but also opens new avenues for future research and therapeutic strategies. Through this research, we aim to provide a robust genetic foundation for precision medicine and personalized treatment in PCa.

Materials and methods

PCa GWAS data source

The GWAS data on PCa were derived from a study conducted by Wang A et al. [26] that included 122,188 European ancestry cases and 604,640 European ancestry controls. In addition, we validated our findings using the PCa GWAS provided by Schumacher et al. [27], which included 79,148 PCa cases and 61,106 controls. To ensure the quality of the data, we performed a rigorous quality control process on the raw summary data before analysis. Specifically, the genomic coordinates of all SNPs were based on the hg38 version of the reference genome. For minor allele frequency (MAF), we retained SNPs with MAF > 1% and excluded lower-frequency variants.

The eQTL data in this study were obtained from the GTEx v8 dataset, which contains gene expression data from 838 individuals in 54 different tissues. In this dataset, the relationship between gene expression and genetic variants (SNPs) was systematically explored. The dataset covers the expression of approximately 20,000 genes and includes samples of multi-ethnicity (predominantly European, African, and Latino ancestry). GTEx v8 uses high-throughput RNA-seq technology for gene expression sequencing and Matrix eQTL software for eQTL analysis to capture the relationship between genotype and gene expression.

Cross-tissue and Single-tissue TWAS analysis

We used the UTMOST method of Hu et al. [18] to explore gene expression associations across different tissues. UTMOST integrates the joint effects of SNPs within linkage disequilibrium (LD) regions and leverages the extensive gene expression data from the GTEx project (version 8), constructing covariance matrices for individuals and across 49 tissues to define potential gene-trait associations. We combined the quality-controlled PCa GWAS data with eQTL data from GTEx v8 to estimate the genetic contributions of gene expression in each tissue. In the initial single-tissue TWAS analysis, UTMOST processed GTEx v8 data to obtain gene-PCa risk associations for each tissue. Subsequently, we performed cross-tissue TWAS analysis using UTMOST to capture genetic synergies, summarizing joint associations across all GTEx v8 tissues to identify gene variants influencing PCa risk across multiple tissues.

To verify the robustness and accuracy of the UTMOST results, we used three complementary methods: FUSION, FOCUS, and MAGMA. FUSION [19–22] builds predictive models based on the genetic component of gene expression or regulatory elements and tests their associations with disease phenotypes using GWAS summary statistics. FOCUS [24, 25] uses probabilistic fine-mapping to assign weights to each gene within risk regions, prioritizing those with strong causal relationships. MAGMA [23] reduces SNP matrix dimensionality via principal component analysis, then uses these components as predictors in linear regression models to explore genetic mechanisms and functional pathways behind polygenic traits. To control for multiple testing, we applied the Benjamini-Hochberg (BH) correction and set the false discovery rate (FDR) threshold at 0.05 to identify statistically significant gene variants. This comprehensive strategy enhances the reliability of results and provides new insights into the genetic architecture of PCa.
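The Benjamini-Hochberg step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline code; the gene names and p-values are hypothetical placeholders chosen only to show how adjusted values are compared with the 0.05 threshold.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical gene-level p-values, for illustration only
toy_p = {"GENE_A": 0.001, "GENE_B": 0.008, "GENE_C": 0.039,
         "GENE_D": 0.041, "GENE_E": 0.20}
q = benjamini_hochberg(list(toy_p.values()))
significant = [gene for gene, q_value in zip(toy_p, q) if q_value < 0.05]
print(significant)
```

In this toy input, two genes with raw p-values below 0.05 (GENE_C and GENE_D) no longer pass once the correction is applied, which is the behaviour the FDR threshold is meant to enforce.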

Conditional and joint analyses

We used FUSION software for conditional and joint analyses. The aim of the conditional analysis was to identify genetic variants that contribute independently to PCa risk in the GWAS data and to further reveal whether different SNPs act independently, or whether they work together to influence disease risk through synergistic effects. First, using data from the European ancestry (EUR) population in the 1000 Genomes Project, FUSION automatically constructed the LD matrix between SNPs. Then, we performed conditional analysis using FUSION’s assoc_test function [28] to identify genetic variants independently contributing to disease risk in GWAS, distinguishing between single SNP effects and multiple SNPs’ synergistic effects, and identifying other potential independent variants. Additionally, we used the joint_test function of FUSION [29] for joint analysis, which is primarily used to improve the detection of small effect variants, particularly in the assessment of combined effects in a multivariate setting. In this study, the FUSION analysis only included data from Prostate tissue.
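To illustrate why an LD matrix is needed, the following minimal Python sketch shows the general summary-statistic form of a TWAS test, in which a gene-level z-score is a weighted sum of GWAS z-scores scaled by the LD-dependent variance. This is a sketch of the general technique, not FUSION's own R code; the weights, z-scores, and LD values are hypothetical.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Gene-level z-score: w'z / sqrt(w' R w), with R the SNP LD matrix."""
    n = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(weights[i] * ld[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    return numerator / math.sqrt(variance)


# Hypothetical expression weights, GWAS z-scores, and LD for two SNPs
weights = [0.5, 0.3]
gwas_z = [4.0, 3.0]
ld = [[1.0, 0.6],
      [0.6, 1.0]]
print(round(twas_z(weights, gwas_z, ld), 3))
```

Ignoring the off-diagonal LD term would understate the variance for positively correlated SNPs and inflate the test statistic, which is why the reference panel matters.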

Pathway enrichment analysis and Tissue-specific analysis

We performed pathway enrichment and tissue-specific analysis using MAGMA [23]. The aim of pathway enrichment analysis is to identify biological pathways and functional modules associated with PCa in order to better understand how the relevant genes act synergistically in different biological processes to influence the onset and progression of the disease. Integrating multi-source gene expression data from the GTEx project, MAGMA evaluated gene expression across different tissues and identified potential links between these genes and specific phenotypes, such as PCa. MAGMA’s gene set enrichment analysis capabilities assessed the collective effect of genes within specific pathways, revealing biologically relevant pathways and functional modules associated with the disease.

Colocalization analysis

Colocalization analysis determines shared genetic variant signals among multiple genetic studies (e.g., GWAS summary data for different diseases or traits). Using the coloc tool, which employs a Bayesian framework to construct mutually exclusive hypotheses, we analyzed potential causal variants within genomic regions. The hypotheses include: H0 (no association), H1 or H2 (association of a single trait with specific variants), H3 (two traits associated with different SNPs), and H4 (two traits sharing the same SNP as a causal variant). If the posterior probability of H4 (PPH4) exceeds a set threshold (e.g., > 0.7) [30], it indicates colocalization, meaning the genetic signals observed in different studies originate from the same variant.
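How the five hypotheses are turned into posterior probabilities can be sketched in Python as follows. This is a simplified re-implementation of the step that combines per-SNP log approximate Bayes factors, not the coloc package itself. The prior values (p1 = 1e-4, p2 = 1e-4, p12 = 1e-5) are an assumption, since the priors used in this study are not stated, and the log Bayes factors are hypothetical.

```python
import math


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def logdiffexp(a, b):
    """log(exp(a) - exp(b)); returns -inf when a <= b."""
    if a <= b:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 from per-SNP log Bayes factors."""
    both = [x + y for x, y in zip(labf1, labf2)]
    s1, s2, s12 = logsumexp(labf1), logsumexp(labf2), logsumexp(both)
    log_h = [
        0.0,                                                     # H0
        math.log(p1) + s1,                                       # H1
        math.log(p2) + s2,                                       # H2
        math.log(p1) + math.log(p2) + logdiffexp(s1 + s2, s12),  # H3
        math.log(p12) + s12,                                     # H4
    ]
    total = logsumexp(log_h)
    return [math.exp(v - total) for v in log_h]


# Hypothetical region of three SNPs: the same SNP is strongest in both traits
pp_shared = coloc_posteriors([0.0, 12.0, 0.5], [0.2, 10.0, 0.0])
# Hypothetical region where each trait peaks at a different SNP
pp_distinct = coloc_posteriors([12.0, 0.0, 0.0], [0.0, 0.0, 10.0])
print("PPH4 shared:", round(pp_shared[4], 3))
print("PPH4 distinct:", round(pp_distinct[4], 3))
```

Only the first toy region exceeds the PPH4 > 0.7 threshold; in the second, most of the posterior mass falls on H3 rather than H4.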

Mendelian randomization

To explore potential causal relationships between key genes and PCa, Mendelian randomization analyses were performed using eQTL data for these genes and PCa GWAS data. The eQTL data were obtained from the GTEx v8 database. The analyses were mainly performed using inverse variance weighting (IVW) [31], and instrumental variables (IVs) were selected based on the following criteria: significant association (p < 5 × 10⁻⁸), independence (r² < 0.01), and exclusion of SNPs showing horizontal pleiotropy. The Wald ratio was used when only one IV was available [32]. All analyses used the “TwoSampleMR” R package. To check that the MR assumptions held, MR-Egger and MR-PRESSO were used to detect pleiotropic SNPs, and Cochran’s Q test was used to assess heterogeneity among the instrumental variables. For the significance of causal associations, this study used the Benjamini-Hochberg method to control the FDR (FDR < 0.05) and combined it with Bonferroni correction to further verify the robustness of the key findings.
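The two estimators named above can be written out in a few lines. The Python sketch below shows a Wald ratio with a first-order standard error and a fixed-effect IVW estimate. It is an illustration under those assumptions, not the TwoSampleMR implementation, and all effect sizes are hypothetical.

```python
import math


def wald_ratio(beta_exp, beta_out, se_out):
    """Single-instrument estimate with a first-order standard error."""
    return beta_out / beta_exp, se_out / abs(beta_exp)


def ivw(beta_exp, beta_out, se_out):
    """Fixed-effect inverse-variance-weighted estimate across instruments."""
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exp, se_out)]
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    return estimate, math.sqrt(1.0 / sum(weights))


# Hypothetical SNP effects on expression (exposure) and on PCa (outcome)
beta_exp = [0.5, 0.4]
beta_out = [0.10, 0.08]
se_out = [0.02, 0.02]

est, se = ivw(beta_exp, beta_out, se_out)
print("IVW log-OR:", round(est, 3), "SE:", round(se, 4))
print("Wald ratio for the first SNP:", wald_ratio(0.5, 0.10, 0.02))
```

With a single instrument the IVW estimate reduces to the Wald ratio, which is why the latter is the natural fallback when only one IV passes the selection criteria.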

Results

UTMOST and FUSION Transcriptome-wide association study results

The UTMOST cross-tissue study identified 343 genes with statistically significant signals after FDR correction (FDR < 0.05) (Supplementary Table S1). In the single-tissue validation using FUSION, 576 genes showed significant association signals in TWAS (FDR < 0.05) (Supplementary Table S2) (Fig. 1).

Fig. 1

Manhattan plot of cross-tissue TWAS results for PCa. The y-axis represents p-values on a −log10 scale. The significance threshold is based on FDR correction

Conditional and joint analyses

We performed conditional and joint analyses on 39 genes identified by both UTMOST and FUSION to confirm their independent association with PCa. Twenty-nine genes were found to be independent markers, each conveying different genetic information (FDR < 0.05) (Table 1). These genes remained significantly associated with PCa after conditional analysis, indicating that their associations are independent rather than driven by linkage disequilibrium. The genetically regulated expression of certain genes may drive some GWAS signals. For example, GEN1 accounted for 0.833 of the signal at the 2p17 locus; GGCX and IMMT accounted for 0.86 of the signal at 2q37.3; NCKAP5 accounted for 0.638 of the signal at 2q33.2; RIF1 accounted for 0.832 of the signal at 2q14.3; PDE11A accounted for 0.916 of the signal at 2q33.2; BMPR1B accounted for 0.941 of the signal at 4q22.3; PRDM5 accounted for 0.652 of the signal at 4q23; CEP192 accounted for 0.816 of the signal at 18q21.33; TBX1 accounted for 0.813 of the signal at 22q11.21; and AP000350.6 accounted for 0.813 of the signal at 22q13.31 (Supplementary Fig. 1).

Table 1.

Significant genes for PCa in cross-tissue and single-tissue TWAS analysis

ID TWAS.Z TWAS.P JOINT.Z JOINT.P FDR
WDPCP 11.1 1.30E-28 11.1 1.30E-28 3.61E-26
TTLL1 -7 2.30E-12 -7 2.30E-12 2.60E-10
TRMT61B -3.7 2.00E-04 -3.7 2.10E-04 3.82E-03
TBX1 -9 2.30E-19 -9 2.30E-19 4.16E-17
SEPT2 -8.8 1.40E-18 -8.8 1.40E-18 2.42E-16
RN7SKP80 -3.1 2.30E-03 -3.1 2.30E-03 2.61E-02
RIF1 5 6.40E-07 5 6.40E-07 2.58E-05
PRDM5 -3.7 1.90E-04 -3.7 1.90E-04 3.64E-03
PPA2 3.8 1.20E-04 3.8 1.20E-04 2.53E-03
POLI -5.3 9.40E-08 -5.3 9.40E-08 4.38E-06
PLEKHH2 -4.1 3.70E-05 -4.1 3.60E-05 9.45E-04
PLA2G6 -3.1 2.20E-03 -3.1 2.10E-03 2.50E-02
PDE11A -3.3 9.50E-04 -3.3 9.30E-04 1.31E-02
NCKAP5 -4.4 1.00E-05 -4.4 1.00E-05 3.20E-04
MTERF4 -3.3 8.40E-04 -3.3 8.40E-04 1.20E-02
MLPH -7 2.60E-12 -7 2.60E-12 2.79E-10
METTL4 3 3.00E-03 3 3.00E-03 3.20E-02
LDAH -8.6 1.10E-17 -8.6 1.10E-17 1.79E-15
KIF1A -4.7 2.30E-06 -4.7 2.40E-06 8.25E-05
IMMT 3.5 5.20E-04 3.5 5.20E-04 8.47E-03
HAAO -4.2 2.70E-05 -4.2 2.70E-05 7.33E-04
GPC1 3.2 1.20E-03 3.2 1.20E-03 1.62E-02
GGCX -13.1 2.20E-39 -13.1 3.30E-39 1.01E-36
GEN1 -3.9 1.10E-04 -3.9 1.00E-04 2.32E-03
CEP192 3.4 5.80E-04 3.4 5.80E-04 9.16E-03
CASP10 -6.7 2.80E-11 -4.3 2.10E-05 2.50E-09
BMPR1B -13.3 1.90E-40 -13.3 1.90E-40 9.84E-38
AP000350.6 3.5 4.60E-04 3.5 4.60E-04 7.86E-03
ALS2CR12 -6.6 5.40E-11 -4.1 4.20E-05 4.76E-09
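The relationship between the Z and P columns in Table 1 can be checked with a short Python sketch, assuming the P-values come from a two-sided normal test (the usual TWAS convention, not stated explicitly here). Because the tabulated Z-scores are rounded to one decimal place, the recomputed P-values match the table only approximately.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# TWAS.Z values taken from Table 1 (rounded as tabulated)
for gene, z in [("WDPCP", 11.1), ("TTLL1", -7.0), ("RIF1", 5.0)]:
    print(f"{gene}: z = {z}, two-sided p = {two_sided_p(z):.2e}")
```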

MAGMA analysis

Using MAGMA in the cross-tissue TWAS, 2823 genes showed significant signals (FDR < 0.05) (Supplementary Table S3) (Fig. 2A). These significant genes were enriched in pathways such as REACTOME_ABACAVIR_TRANSMEMBRANE_TRANSPORT, EPITHELIAL_CELL_DIFFERENTIATION_INVOLVED_IN_PROSTATE_GLAND_DEVELOPMENT, NOREPINEPHRINE_UPTAKE, CHROMOSOME, and PROSTATE_GLAND_MORPHOGENESIS (Fig. 2B; Supplementary Table S4). MAGMA’s tissue-specific analysis revealed that SNPs associated with PCa were primarily enriched in the prostate, vagina, minor salivary gland, and esophagus mucosa, among other tissues (Fig. 2C, Supplementary Table S5).

Fig. 2

MAGMA Analysis. A Using MAGMA in the cross-tissue TWAS, 1405 genes showed significant signals (FDR < 0.05); B Pathway enrichment analysis; C Tissue specificity

Statistical fine mapping

We used FOCUS software for fine mapping of TWAS associations, analyzing data from a single European population. Under the criteria of FDR < 0.05 and PIP > 0.9, 126 positive genes were identified in prostate tissue (Supplementary Table S6). Finally, the Venn diagram showed that 13 key genes related to PCa were identified using four methods: UTMOST, FUSION, FOCUS, and MAGMA (Fig. 3).
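The intersection step behind the Venn diagram amounts to a set intersection over the four per-method gene lists. A minimal Python sketch follows, using hypothetical placeholder gene names rather than the study's actual result tables.

```python
from functools import reduce


def shared_genes(*gene_lists):
    """Genes reported as significant by every method, sorted by name."""
    return sorted(reduce(set.intersection, map(set, gene_lists)))


# Hypothetical significant-gene lists, one per method
utmost = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
fusion = ["GENE_B", "GENE_C", "GENE_D", "GENE_E"]
focus = ["GENE_C", "GENE_D", "GENE_F"]
magma = ["GENE_A", "GENE_C", "GENE_D", "GENE_G"]

print(shared_genes(utmost, fusion, focus, magma))
```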

Fig. 3

The key genes identified by the intersection of four methods

Colocalization of eQTL and GWAS associations

We conducted colocalization analysis to evaluate the likelihood of shared signals between the PCa GWAS and the prostate eQTL data (eQTL_from_GTEx_Prostate_hg38). Among the 13 important susceptibility genes identified by the four TWAS methods, CASP10 (rs6735656; PP4 = 0.76) and GGCX (rs2028900; PP4 = 0.99), both on chromosome 2, likely share the same GWAS and eQTL signals (Fig. 4A-B).

Fig. 4

A Colocalization of CASP10 in eQTL and PCa GWAS associations. B Colocalization of GGCX in eQTL and PCa GWAS associations

Two-sample Mendelian randomization (TSMR)

TSMR analysis used eQTL_from_GTEx_Prostate_hg38 data for WDPCP, RIF1, POLI, MLPH, HAAO, GGCX and CASP10, along with PCa GWAS data. All SNPs used in the TSMR analysis were considered strong instruments (F > 10). We found a causal relationship with PCa for six of these genes, all except MLPH. Specifically, increased expression of WDPCP (OR = 1.18, 95% CI: 1.14–1.21), RIF1 (OR = 1.12, 95% CI: 1.08–1.17), HAAO (OR = 1.08, 95% CI: 1.06–1.10) and CASP10 (OR = 1.16, 95% CI: 1.11–1.22) was associated with an increased risk of PCa, whereas increased expression of POLI (OR = 0.93, 95% CI: 0.91–0.95) and GGCX (OR = 0.76, 95% CI: 0.73–0.79) was associated with a reduced risk of PCa (Fig. 5, Supplementary Table S7).
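The direction of each effect can be read from whether the odds ratio lies above or below 1. The Python sketch below converts the reported ORs and 95% CIs into log-odds estimates with approximate standard errors. It assumes symmetric Wald-type intervals on the log scale, which is an assumption about how the intervals were constructed.

```python
import math

# ORs and 95% CIs as reported in the TSMR results above
reported = {
    "WDPCP": (1.18, 1.14, 1.21),
    "RIF1": (1.12, 1.08, 1.17),
    "POLI": (0.93, 0.91, 0.95),
    "HAAO": (1.08, 1.06, 1.10),
    "GGCX": (0.76, 0.73, 0.79),
    "CASP10": (1.16, 1.11, 1.22),
}


def summarize_or(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Log-OR, approximate SE, and direction of effect from an OR and its CI."""
    beta = math.log(odds_ratio)
    # Width of the CI on the log scale spans 2 * z_crit standard errors
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    if odds_ratio > 1:
        direction = "higher risk"
    elif odds_ratio < 1:
        direction = "lower risk"
    else:
        direction = "no difference"
    return {"beta": beta, "se": se, "direction": direction}


summary = {gene: summarize_or(*values) for gene, values in reported.items()}
for gene, s in summary.items():
    print(f"{gene}: log-OR = {s['beta']:.3f}, SE ~ {s['se']:.3f}, {s['direction']}")
```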

Fig. 5

Forest plot of TSMR results

Discussion

Based on the PCa GWAS dataset, we systematically evaluated the genetic predictive associations between gene expression and PCa risk. Using UTMOST, FUSION, and conditional and joint analyses to identify PCa risk genes, we further enhanced the accuracy of our findings by annotating risk genes with MAGMA and fine-mapping them with FOCUS. By intersecting the results from four genetic analysis methods (MAGMA, UTMOST, FUSION, and FOCUS), we identified 13 risk genes. Finally, Bayesian colocalization analysis and TSMR analysis were performed on these genes to establish significant causal relationships with PCa. Our results can improve understanding of the genetics and etiology of PCa. Cross-tissue TWAS analysis methods effectively identify significant risk genes [33]. Research using these methods to identify PCa risk genes is still rare, and we incorporated comprehensive analysis methods to mutually verify the results.

Combining the four TWAS methods, we obtained 13 risk genes (CASP10, GGCX, HAAO, KIF1A, MLPH, NCKAP5, PLEKHH2, POLI, PPA2, RIF1, TBX1, TTLL1, WDPCP), which completely covered the 8 risk genes obtained from the validation dataset (TBX1, GGCX, POLI, MLPH, KIF1A, WDPCP, TTLL1, PLEKHH2). Colocalization analysis with COLOC identified shared SNPs between PCa and both CASP10 (rs6735656) and GGCX (rs2028900), which was similar to the results from the validation dataset (TBX1 (rs1978060), GGCX (rs2028900), POLI (rs11083046)).

TBX1 (T-Box transcription factor 1) is a gene encoding a developmental transcription factor associated with retinoic acid signaling and PCa risk [34]. A meta-analysis of 87,040 individuals (43,303 PCa cases and 43,737 controls) found significant associations between intronic SNPs in the TBX1 gene and PCa in European and Japanese populations [35]. A study using Enhanced Reduced Representation Bisulfite Sequencing identified a region in the first intron of the TBX1 gene with increasing methylation levels correlating with disease severity, along with increased gene expression [34]. Studies suggest that rs1978060 may be a functional variant regulating TBX1 expression [36].

GGCX (gamma-glutamyl carboxylase) on chromosome 2 encodes a γ-glutamyl carboxylase crucial for the carboxylation of vitamin K-dependent proteins. These proteins include clotting factors, osteocalcin, and matrix Gla protein, which are vital for blood coagulation, bone metabolism, and regulation of vascular calcification. While there is no direct evidence linking GGCX to PCa, it is highly expressed in prostate tissue [37]. The activity of GGCX depends on vitamin K, and abnormalities in vitamin K metabolism may be associated with cancer risk. Additionally, GGCX regulates angiogenesis, essential for tumor growth and metastasis. By influencing endothelial cell function, GGCX may affect tumor vascular formation.

POLI (Polymerase Iota) encodes a DNA polymerase involved in DNA replication and repair. Its expression varies throughout the cell cycle, allowing it to function during different cell cycle phases, particularly in response to DNA damage. Studies by Manuel Luedeke et al. [38] suggest that POLI is a candidate tumor susceptibility gene for lung cancer [39] and colorectal cancer [40] in different mouse models. Their research found a significant enrichment of the POLI germline variant (F532S) in probands with TMPRSS2-ERG fusion (OR, 4.6; P = 0.0011) [38].

The MLPH gene encodes the MLPH protein, involved in melanosome transport [41]. MLPH forms a complex with Rab27a and myosin Va (Rab27a-MLPH-MyoVa), enhancing melanosome transfer from melanocytes to keratinocytes, essential for pigmentation of hair, skin, and eyes [42]. MLPH expression increases with UV exposure, indicating its role in UV response [42]. The transfer of melanosomes protects the skin from UV damage [43]. Studies show a negative correlation between sunlight exposure and PCa incidence [44], with lower mortality rates in regions with higher UV levels [45]. Although MLPH is crucial in the skin, it is also highly expressed in prostate tissues, with lower levels linked to more aggressive PCa [15].

Additionally, MAGMA’s pathway enrichment results primarily involve prostate epithelial cell differentiation, which is related to the occurrence and development of PCa, as it originates from abnormal prostate epithelial cells. The pathways also include norepinephrine uptake. Although the effect of norepinephrine on PCa is not fully understood, it may influence the tumor microenvironment, neural regulation, or tumor progression. Chromosomal stability and structural changes are also involved, as PCa is closely associated with chromosomal instability and genomic alterations. MAGMA’s tissue enrichment analysis indicates that molecular mechanisms might be shared not only in prostate tissue but also in the vagina, minor salivary gland, esophagus mucosa, and stomach.

This study has several limitations. First, the criteria for selecting significant cis-regulated genes in TWAS analysis may have excluded some relevant genes, and SNPs related to PCa but not to cis-expression were not considered. Second, our GWAS data and the reference GTEx v8 eQTL data are from European populations, which may limit the applicability of our findings to other ethnic groups. Third, with the ongoing release of high-throughput data from more tissues and PCa GWAS datasets from diverse ancestral groups, cross-tissue correlation analysis combined with other GWAS strategies is expected to provide deeper insights into PCa genetics. Finally, the potential mechanisms of the identified risk genes have not been experimentally validated. In relation to these limitations, a number of studies have explored the impact of different populations and tissue types on TWAS results. For example, TWAS analyses of multiple ethnic groups by Wittich H et al. suggest that GWAS data from European populations may not fully reflect genetic background differences in other populations [46]. In addition, recent studies combining genome-wide data and eQTL information from different tissues for cross-tissue analyses have shown that this approach significantly improves genetic prediction for a wide range of complex diseases [47]. These studies provide valuable references for the present study and directions for future research, especially in cross-population and multi-tissue analyses.

Conclusion

In summary, this study identified 13 susceptibility genes associated with PCa risk using four TWAS methods. Colocalization analysis confirmed shared SNPs between 2 of these genes and PCa. Mendelian randomization indicates that POLI and GGCX may reduce PCa risk, whereas WDPCP, RIF1, HAAO, and CASP10 may increase PCa risk. These findings provide valuable new insights into the genetic basis of PCa and emphasize the importance of studying these genes in PCa development. This research not only deepens our understanding of the biological basis of PCa but also offers new targets for its prevention and treatment.

Acknowledgements

Not applicable.

Abbreviations

BH

Benjamini-Hochberg

eQTL

Expression quantitative trait locus

FDR

False discovery rate

FOCUS

Fine-mapping of causal gene sets

FUSION

Functional summary-based imputation

GWAS

Genome-wide association study

IV/IVW

Instrumental variable/inverse variance weighting

LD

Linkage disequilibrium

MAGMA

Multi-marker analysis of genomic annotation

MAF

Minor allele frequency

MR

Mendelian randomization

PCa

Prostate cancer

PPH4/PP4

Posterior probability for shared causal variant

SNP

Single nucleotide polymorphism

TSMR

Two-sample Mendelian randomization

TWAS

Transcriptome-wide association study

UTMOST

Unified test for molecular signatures

Authors’ contributions

Study conception: JQH, YCQ, and YCL; study design: QJZ, HLQ, and TYZ; data extraction and analyses: QL, JQH, and JJX; results presentation and interpretation: HS and JPD; manuscript drafting and revising: JQH, YCQ, and JPD. The work reported in the paper has been performed by the authors, unless clearly specified in the text. All authors read and approved the final manuscript.

Funding

This work was supported by the Suzhou Clinical Medical Center Program (grant Szlcyxzx202106).

Data availability

The GWAS data on PCa were derived from a study conducted by Wang A et al. [26] that included 122,188 European ancestry cases and 604,640 European ancestry controls. In addition, we validated our findings using the PCa GWAS provided by Schumacher et al. [27], which included 79,148 PCa cases and 61,106 controls. To ensure the quality of the data, we performed a rigorous quality control process on the raw summary data before analysis. Specifically, the genomic coordinates of all SNPs were based on the hg38 version of the reference genome. For minor allele frequency (MAF), we retained SNPs with MAF > 1% and excluded lower-frequency variants. The eQTL data in this study were obtained from the GTEx v8 dataset, which contains gene expression data from 838 individuals in 54 different tissues. In this dataset, the relationship between gene expression and genetic variants (SNPs) was systematically explored. The dataset covers the expression of approximately 20,000 genes and includes samples of multi-ethnicity (predominantly European, African, and Latino ancestry). GTEx v8 uses high-throughput RNA-seq technology for gene expression sequencing and Matrix eQTL software for eQTL analysis to capture the relationship between genotype and gene expression. All of the preprocessing, analysis, and plotting code used in this study can be found on GitHub: https://github.com/duanxian0301/TWAS.

Declarations

Ethics approval and consent to participate

This study did not require ethical approval.

Consent for publication

All authors have consented to the publication of this manuscript.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Junpeng Deng, Email: djunpeng@outlook.com.

Jianjun Xie, Email: xjjmobile@163.com.

References

  • 1. Center MM, et al. International variation in prostate cancer incidence and mortality rates. Eur Urol. 2012;61(6):1079–92.
  • 2. Sung H, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–49.
  • 3. Lalitha K, et al. Estimation of time trends of incidence of prostate cancer–an Indian scenario. Asian Pac J Cancer Prev. 2012;13(12):6245–50.
  • 4. Ha Chung B, Horie S, Chiong E. The incidence, mortality, and risk factors of prostate cancer in Asian men. Prostate Int. 2019;7(1):1–8.
  • 5. Hinata N, Fujisawa M. Racial differences in prostate cancer characteristics and cancer-specific mortality: an overview. World J Mens Health. 2022;40(2):217–27.
  • 6. Al-Ghazawi M, et al. An in-depth look into the epidemiological and etiological aspects of prostate cancer: a literature review. Cureus. 2023;15(11):e48252.
  • 7. Du J, et al. Association of angiotensin-converting enzyme insertion/deletion (ACE I/D) gene polymorphism with susceptibility to prostate cancer: an updated meta-analysis. World J Surg Oncol. 2022;20(1):354.
  • 8. Cacciatore A, et al. Epigenome-wide impact of MAT2A sustains the androgen-indifferent state and confers synthetic vulnerability in ERG fusion-positive prostate cancer. Nat Commun. 2024;15(1):6672.
  • 9. Pilenzi L, et al. The crucial role of hereditary cancer panel testing in unaffected individuals with a strong family history of cancer: a retrospective study of a cohort of 103 healthy subjects. Cancers (Basel). 2024. doi:10.3390/cancers16132327.
  • 10. Randazzo M, et al. A positive family history as a risk factor for prostate cancer in a population-based study with organised prostate-specific antigen screening: results of the Swiss European randomised study of screening for prostate cancer (ERSPC, Aarau). BJU Int. 2016;117(4):576–83.
  • 11. Wokołorczyk D, et al. Mutations in ATM, NBN and BRCA2 predispose to aggressive prostate cancer in Poland. Int J Cancer. 2020;147(10):2793–800.
  • 12. Kimura H, et al. Prognostic significance of pathogenic variants in BRCA1, BRCA2, ATM and PALB2 genes in men undergoing hormonal therapy for advanced prostate cancer. Br J Cancer. 2022;127(9):1680–90.
  • 13. Trendowski MR, et al. Germline variants in DNA damage repair genes and HOXB13 among black patients with early-onset prostate cancer. JCO Precis Oncol. 2022;6:e2200460.
  • 14. Liu D, et al. A transcriptome-wide association study identifies novel candidate susceptibility genes for prostate cancer risk. Int J Cancer. 2022;150(1):80–90.
  • 15. Mancuso N, et al. Large-scale transcriptome-wide association study identifies new prostate cancer risk regions. Nat Commun. 2018;9(1):4079.
  • 16. Wu L, et al. Identification of novel susceptibility loci and genes for prostate cancer risk: a transcriptome-wide association study in over 140,000 European descendants. Cancer Res. 2019;79(13):3192–204.
  • 17. Wainberg M, et al. Opportunities and challenges for transcriptome-wide association studies. Nat Genet. 2019;51(4):592–9.
  • 18. Hu Y, et al. A statistical framework for cross-tissue transcriptome-wide association analysis. Nat Genet. 2019;51(3):568–76.
  • 19. Mai J, et al. Transcriptome-wide association studies: recent advances in methods, applications and available databases. Commun Biol. 2023;6(1):899.
  • 20. Gusev A, et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat Genet. 2016;48(3):245–52.
  • 21. Zhu Z, et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat Genet. 2016;48(5):481–7.
  • 22. Gamazon ER, et al. A gene-based association method for mapping traits using reference transcriptome data. Nat Genet. 2015;47(9):1091–8.
  • 23. de Leeuw CA, et al. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput Biol. 2015;11(4):e1004219.
  • 24. Lu Z, et al. Multi-ancestry fine-mapping improves precision to identify causal genes in transcriptome-wide association studies. Am J Hum Genet. 2022;109(8):1388–404.
  • 25. Mancuso N, et al. Probabilistic fine-mapping of transcriptome-wide association studies. Nat Genet. 2019;51(4):675–82.
  • 26. Wang A, et al. Characterizing prostate cancer risk through multi-ancestry genome-wide discovery of 187 novel risk variants. Nat Genet. 2023;55(12):2065–74.
  • 27. Schumacher FR, et al. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nat Genet. 2018;50(7):928–36.
  • 28. Byrne EM, et al. Conditional GWAS analysis to identify disorder-specific SNPs for psychiatric disorders. Mol Psychiatry. 2021;26(6):2070–81.
  • 29. Deng Y, Pan W. Improved use of small reference panels for conditional and joint analysis with GWAS summary statistics. Genetics. 2018;209(2):401–8.
  • 30. Zeng R, et al. Dissecting shared genetic architecture between obesity and multiple sclerosis. EBioMedicine. 2023;93:104647.
  • 31. Burgess S, Small DS, Thompson SG. A review of instrumental variable estimators for Mendelian randomization. Stat Methods Med Res. 2017;26(5):2333–55.
  • 32. Sun Y, et al. Causal association between serum total bilirubin and cholelithiasis: a bidirectional two-sample Mendelian randomization study. Front Endocrinol (Lausanne). 2023;14:1178486.
  • 33. Zhu M, et al. A cross-tissue transcriptome-wide association study identifies novel susceptibility genes for lung cancer in Chinese populations. Hum Mol Genet. 2021;30(17):1666–76.
  • 34. Lin PC, et al. Epigenomic alterations in localized and advanced prostate cancer. Neoplasia. 2013;15(4):373–83.
  • 35. Al Olama AA, et al. A meta-analysis of 87,040 individuals identifies 23 new susceptibility loci for prostate cancer. Nat Genet. 2014;46(10):1103–9.
  • 36. Li Y, et al. Genetic variant of TBX1 gene is functionally associated with adolescent idiopathic scoliosis in the Chinese population. Spine (Phila Pa 1976). 2021;46(1):17–21.
  • 37. Caspers M, et al. Two enzymes catalyze vitamin K 2,3-epoxide reductase activity in mouse: VKORC1 is highly expressed in exocrine tissues while VKORC1L1 is highly expressed in brain. Thromb Res. 2015;135(5):977–83.
  • 38. Luedeke M, et al. Predisposition for TMPRSS2-ERG fusion in prostate cancer by variants in DNA repair genes. Cancer Epidemiol Biomarkers Prev. 2009;18(11):3030–5.
  • 39. Lee GH, Matsushita H. Genetic linkage between pol iota deficiency and increased susceptibility to lung tumors in mice. Cancer Sci. 2005;96(5):256–9.
  • 40. Starr TK, et al. A transposon-based genetic screen in mice identifies genes altered in colorectal cancer. Science. 2009;323(5922):1747–50.
  • 41. Oberhofer A, et al. Myosin Va’s adaptor protein melanophilin enforces track selection on the microtubule and actin networks in vitro. Proc Natl Acad Sci U S A. 2017;114(24):E4714–23.
  • 42. Sckolnick M, et al. More than just a cargo adapter, melanophilin prolongs and slows processive runs of myosin Va. J Biol Chem. 2013;288(41):29313–22.
  • 43. Natarajan VT, et al. Multifaceted pathways protect human skin from UV radiation. Nat Chem Biol. 2014;10(7):542–51.
  • 44. Taksler GB, et al. Ultraviolet index and racial differences in prostate cancer incidence and mortality. Cancer. 2013;119(17):3195–203.
  • 45. Hanchette CL, Schwartz GG. Geographic patterns of prostate cancer mortality. Evidence for a protective effect of ultraviolet radiation. Cancer. 1992;70(12):2861–9.
  • 46. Wittich H, et al. Transcriptome-wide association study of the plasma proteome reveals cis and trans regulatory mechanisms underlying complex traits. Am J Hum Genet. 2024;111(3):445–55.
  • 47. Roselli C, et al. Multi-ethnic genome-wide association study for atrial fibrillation. Nat Genet. 2018;50(9):1225–33.

Articles from BMC Cancer are provided here courtesy of BMC