Nature Communications. 2025 Oct 8;16:8950. doi: 10.1038/s41467-025-64005-w

Combined SNPs sequencing and allele specific proteomics capture reveal functional causality underpinning the 2p25 prostate cancer susceptibility locus

Dandan Dong 1,#, Zixian Wang 1,2,#, Mengqi Liu 1, Qin Zhang 2,3, Wenjie Xu 1, Yu Wei 4, Jing Zhu 5, Xiayun Yang 3, Qixiang Zhang 1,2, Yao Zhu 4, Liang Wang 6, Peng Zhang 1, Gong-Hong Wei 1,2
PMCID: PMC12508145  PMID: 41062477

Abstract

Genome-wide association studies (GWASs) have identified numerous risk loci associated with prostate cancer, yet their functional significance remains elusive. Leveraging our high-throughput SNPs-seq method, we pinpointed rs4519489 within the multi-ancestry GWAS-discovered 2p25 locus as a potential functional SNP due to its significant allelic differences in protein binding. Here, we conduct a comprehensive analysis of rs4519489 and its associated gene, NOL10, employing diverse cohort data and experimental models. Clinical findings reveal a synergistic effect between rs4519489 genotype and NOL10 expression on prostate cancer prognosis and severity. Through unbiased proteomics screening, we reveal that the risk allele A of rs4519489 exhibits enhanced binding to USF1, an oncogenic transcription factor (TF) implicated in prostate cancer progression and prognosis, resulting in elevated NOL10 expression. Furthermore, we show that NOL10 regulates cell cycle pathways, fostering prostate cancer progression. The concurrent expression of NOL10 and USF1 correlates with aggressive prostate cancer characteristics and poorer prognosis. Collectively, our study offers a robust strategy for functional SNP screening and TF identification through high-throughput SNPs-seq and unbiased proteomics, highlighting the rs4519489-USF1-NOL10 regulatory axis as a promising biomarker or therapeutic target for clinical diagnosis and treatment of prostate cancer.

Subject terms: Cancer epigenetics, Prostate cancer


Here the authors reveal that the prostate cancer risk variant rs4519489 enhances binding of the oncogenic transcription factor USF1, upregulating NOL10. Elevated NOL10 promotes tumor progression, highlighting the rs4519489–USF1–NOL10 axis as a potential biomarker and therapeutic target.

Introduction

Prostate cancer is the second most common cancer and the fifth leading cause of cancer-related mortality among men worldwide, with around 1.5 million new cases and 400,000 deaths annually1. The disease’s incidence and mortality rates vary significantly by region, with the highest incidence observed in Northern and Western Europe and the lowest in Asia1. Notably, mortality rates have decreased in regions like Northern America, Oceania, and Northern and Western Europe. However, recent years have seen an increase in both incidence and mortality rates in Asia, Central and Eastern Europe, and sub-Saharan Africa1,2. This rise is likely due to improved awareness and widespread use of prostate-specific antigen (PSA) testing, alongside rising incidence trends and challenges in accessing effective treatment options1–7.

Prostate cancer development is influenced by a complex interplay of factors, including age, familial history, genetic background, germline mutations, and lifestyle/environmental factors like smoking, obesity, and diet1,8. Oncogenic pathways in prostate cancer feature a range of genetic alterations, including somatic mutations in crucial genes (SPOP, FOXA1, TP53, AR, RB1), PTEN deletions, MYC amplifications, and gene fusions like TMPRSS2-ERG (refs. 1,8). Our previous research highlighted the intricate interplay between the somatic TMPRSS2-ERG fusion and the 17q12/HNF1B locus9, underlining their significant roles in prostate cancer risk and progression.

Large-scale twin studies and epidemiological evidence have revealed a significant genetic component to prostate cancer, estimating its heritability at 57%10,11. The introduction of genome-wide association studies (GWASs) has fundamentally enhanced our understanding of the genetics underlying prostate cancer10,12, identifying over 450 susceptibility variants since the first GWAS in 2005, as documented in the NHGRI-EBI GWAS Catalog10,12–15. Post-GWAS research is now focused on exploring the biological mechanisms underpinning these susceptibility loci, uncovering risk loci that affect crucial processes in prostate tumorigenesis, such as cell cycle regulation, DNA repair, inflammation, and metabolism16,17. A key challenge remains the transition from association studies to functional investigations, with a keen emphasis on their potential clinical implications and applications17–21.

Recent progress in high-throughput screening approaches has significantly enhanced the annotation of functional single nucleotide polymorphisms (fSNPs), connecting GWAS outcomes to disease mechanisms. Techniques such as the massively parallel reporter assay (MPRA) allow for the examination of thousands of sequences for potential transcriptional activation, enabling detailed analysis of transcriptional regulatory elements with genetic variations22,23. The self-transcribing active regulatory region sequencing (STARR-seq) method quantitatively evaluates enhancer activity across millions of sequences harboring regulatory SNPs24,25. We and others have employed CRISPR interference (CRISPRi) to identify regulatory elements and their target genes, clarifying the role of noncoding genetic variation in prostate cancer26,27. Pooled chromatin immunoprecipitation sequencing (pooled ChIP-seq) links genetic variants in transcription factor binding to disease risk28. Additionally, the combination of our teamʼs previously developed innovative high-throughput technique called single-nucleotide polymorphisms sequencing (SNPs-seq)29, and the type IIS enzymatic restriction approach developed by Li and colleagues30 enables the identification of fSNPs that influence allele-specific regulatory protein binding, thus bridging the gap between genetic variants and their functional impact on diseases.

Our SNPs-seq method capitalizes on the selective retention of protein-bound DNA oligonucleotides in a protein purification column, followed by massively parallel sequencing. Using it for a broad analysis of fSNPs at prostate cancer risk loci, we identified numerous candidate fSNPs29. Notably, rs4519489 at the 2p25 locus, located in an intron of the nucleolar protein 10 (NOL10) gene, showed significant allelic variation in protein binding. Further underlining its significance, several large-scale GWASs have identified the 2p25 locus as significant for prostate cancer susceptibility and severity, with two lead SNPs, rs9287719 and rs1990613 (refs. 14,15,31,32), showing strong linkage disequilibrium with rs4519489 and thus supporting a potential functional role for this variant in prostate cancer causality. Herein we conducted a thorough functional analysis of rs4519489 and its eQTL target gene, NOL10. Using an unbiased proteomics approach, we discovered that the transcription factor USF1 plays a crucial role in modulating NOL10 expression through rs4519489. Our research further investigates the impact of NOL10 and USF1 on prostate cancer predisposition and progression.

Results

Identification of functionally critical variants and eQTL genes underlying GWAS loci of prostate cancer

Identifying functionally critical variants and their linked expression quantitative trait loci (eQTL) genes within GWAS loci is crucial for unraveling the genetic complexity of prostate cancer. This requires integrating various techniques, including high-throughput screening, allele-specific assays, and analyses correlating genotypes with phenotypes. Our recent work leverages our innovative SNPs-seq method to study 374 prostate cancer risk loci, examining allelic differences in protein binding29. Results showed notable allele-dependent binding variations (Fig. 1a); specifically, the A allele of rs4519489 had stronger protein binding than the T allele, with significant biased allelic binding (BAB) scores in different samples (Fig. 1b, and Supplementary Fig. 1a). Critically, the A allele of rs4519489 strongly correlates with the two major GWAS risk SNPs, namely the C allele of rs9287719 (R² = 0.67, D’ = 0.82)31,32 and the T allele of rs1990613 (R² = 0.8, D’ = 0.98)14,15, underscoring its potential functional importance in prostate cancer genetics.
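For readers less familiar with the linkage disequilibrium measures quoted above, the following is a minimal sketch of how R² and D’ are conventionally computed from haplotype and allele frequencies. The function name and the input frequencies are hypothetical and chosen only for illustration; they are not the frequencies underlying the values reported for rs4519489.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D_prime, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    # Raw disequilibrium coefficient: observed minus expected haplotype frequency.
    d = p_ab - p_a * p_b
    # D' rescales D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the two allele indicators.
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


# Hypothetical frequencies, for illustration only.
d, d_prime, r_squared = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.5)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r^2 = {r_squared:.2f}")
```

A high D’ with a lower R², as reported for rs9287719, typically arises when the two variants rarely recombine but differ in allele frequency.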

Fig. 1. Analysis of allele-specific protein binding at rs4519489/2p25 locus and its association with NOL10 expression.

a Identification of significant SNPs with allele-dependent protein binding differences using SNPs-seq. TF: transcription factor. b Enhanced binding preference of the A allele over the T allele of rs4519489 with ETH or DHT treatment, as evidenced by SNPs-seq results. c Electrophoretic mobility shift assay (EMSA) showing differential binding between the A and T alleles of rs4519489 and nuclear extracts from 293T cells. Note that the A allele exhibits stronger binding compared to the T allele, suggesting allele-specific interaction with transcription factors. d EMSA of LNCaP cells demonstrating higher binding affinity of the A allele (lane 4) compared to the T allele (lane 3) of rs4519489. Binding is displaced by a 200× consensus competitor (lane 6) and diminished by a 200× mutant competitor (lane 8), but unaffected by a 200× random competitor (lane 10). e Luciferase reporter assay indicating increased enhancer activity (571 bp DNA segment, chr2: 10,598,681-10,599,251, Human GRCh38) with the A allele of rs4519489 compared to the T allele in LNCaP or DU145 cells under ETH or DHT treatment (E enhancer; P promoter; Luc luciferase). n = 6 samples; P values based on the order of appearance: 2.52E–04, 1.99E–05, 1.35E–02. f Association of the risk allele A at rs4519489 with elevated NOL10 expression in the CPGEA cohort. The interquartile range (IQR) is depicted by the box with the median represented by the center line. Whiskers maximally extend to 1.5 × IQR (with outliers shown). g ChIP-seq data revealing histone modification enrichment (H3K4me1, H3K4me3, and H3K27ac) at the rs4519489/2p25 locus in prostate cancer cell lines (LNCaP and VCaP) and prostate tissues (normal and cancerous). h Downregulation of NOL10 expression in 22Rv1 cells following CRISPR interference (CRISPRi) targeting of rs4519489, as assessed by RT-qPCR (GAPDH as the internal control). n = 3 samples; P values based on the order of appearance: 9.79E–05, 4.79E–04. i RT-qPCR analysis of NOL10 expression relative to ACTB in parental and mutated clones at the rs4519489 locus in PC3 cells, showing altered expression in the mutated clone. n = 2 samples; P value = 9.06E–03. In e, h, i, n = 3 technical replicates, error bars, mean ± SD. In c, d, a representative experiment of three independent EMSA experiments is shown; NE: nuclear extract. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001; statistical significance was determined using a two-tailed Student’s t test. Source data are provided as a Source Data file.

To confirm the distinct protein binding affinities of rs4519489 T and A alleles, we performed an electrophoretic mobility shift assay (EMSA). Nuclear proteins were extracted from 293T cells and subjected to EMSA, revealing a distinct binding pattern for the A allele compared to the T allele, with a more prominent band observed for the A allele (Fig. 1c, and Supplementary Fig. 1b). Notably, the A allele demonstrated stronger binding to nuclear proteins in LNCaP cells than the T allele. This enhanced binding was significantly diminished upon competition with a consensus oligonucleotide, whereas mutant or random oligonucleotide competitors had no discernible effect (Fig. 1d, and Supplementary Fig. 1c). Consistent results were also obtained in 22Rv1 cells (Supplementary Fig. 1d, e).

Additionally, we assessed transcriptional regulation differences using an allele-specific enhancer reporter assay. A 571-bp DNA segment containing either the T or A allele of rs4519489 (designated as E in the figure to indicate its role as an enhancer) was cloned into luciferase reporter constructs. The A allele conferred significantly greater luciferase activity compared to the T allele across multiple prostate cancer cell lines (LNCaP, DU145, 22Rv1, PC3, and VCaP), both in the presence and absence of dihydrotestosterone (DHT) treatment. These results indicate that the A allele possesses a markedly stronger transcriptional activation potential (Fig. 1e, and Supplementary Fig. 1f–h).
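As a quick consistency check, the 571-bp segment length can be recomputed from the GRCh38 coordinates given in the Fig. 1e legend. The sketch below assumes the coordinates are 1-based and inclusive, which the text does not state explicitly; the helper names are hypothetical.

```python
def parse_region(region):
    """Parse a string such as 'chr2: 10,598,681-10,599,251' into (chrom, start, end)."""
    chrom, span = region.replace(" ", "").split(":")
    start_text, end_text = span.replace(",", "").split("-")
    return chrom, int(start_text), int(end_text)


def region_length(region):
    """Length in bp, assuming 1-based inclusive coordinates (an assumption)."""
    _, start, end = parse_region(region)
    if end < start:
        raise ValueError("end coordinate precedes start coordinate")
    return end - start + 1


# Coordinates as printed in the Fig. 1e legend.
print(region_length("chr2: 10,598,681-10,599,251"))
```

Under the inclusive convention the interval spans 571 bp, matching the segment size quoted for the reporter construct.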

Subsequently, to assess whether the rs4519489 SNP genotype correlates with the expression of nearby genes, we conducted an eQTL analysis using the Chinese Prostate Cancer Genome and Epigenome Atlas (CPGEA) cohort33. The analysis linked the aggressive prostate cancer-associated A allele of rs4519489 with higher NOL10 mRNA expression (Fig. 1f, and Supplementary Data 1), implicating NOL10 in susceptibility to aggressive prostate cancer. Given that rs4519489 is an intronic variant, we further investigated its potential influence on NOL10 alternative splicing. Splicing quantitative trait locus (sQTL) analysis of the four annotated NOL10 isoforms was conducted in both the TCGA PRAD and CPGEA cohorts (Supplementary Fig. 2a). This analysis revealed no genotype-dependent differences in isoform usage, with the relative abundance of each transcript being comparable across AA, AT, and TT genotypes (Supplementary Fig. 2b, c).
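The exact eQTL model is not described in this section. A common choice is an additive model in which expression is regressed on the number of copies of the effect allele, and the sketch below illustrates that idea with ordinary least squares. The function names and all sample values are hypothetical; they are not data from the CPGEA cohort.

```python
def allele_dosage(genotype, effect_allele="A"):
    """Count copies of the effect allele in a genotype string such as 'AT' or 'A/T'."""
    return genotype.count(effect_allele)


def additive_eqtl(genotypes, expression, effect_allele="A"):
    """Fit expression = intercept + slope * dosage by ordinary least squares."""
    x = [allele_dosage(g, effect_allele) for g in genotypes]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(expression) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all samples carry the same genotype")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical genotypes and expression values, for illustration only.
genotypes = ["TT", "TT", "AT", "AT", "AA", "AA"]
expression = [4.0, 4.2, 4.9, 5.1, 6.0, 5.8]
slope, intercept = additive_eqtl(genotypes, expression)
print(f"expression change per A allele: {slope:.2f}")
```

A positive slope corresponds to higher expression with each additional copy of the A allele, which is the direction of effect reported for NOL10.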

Furthermore, to examine the potential enhancer functionality of the rs4519489 region, we performed chromatin immunoprecipitation sequencing (ChIP-seq) experiments for epigenetic markers in various cell lines and clinical specimens. The results, shown in the Integrative Genomics Viewer (IGV), indicated active epigenetic marker enrichment (H3K27ac, H3K4me1, H3K4me3) at the rs4519489 locus (Fig. 1g), hinting at the presence of regulatory elements. Expanding ChIP-seq to include histone modifications in both normal and tumor prostate tissues from the CPGEA cohort33 confirmed enhancer/promoter activity at rs4519489 (Fig. 1g), reinforcing its functional gene regulatory role. To examine allele-specific regulatory potential, we genotyped rs4519489 in these prostate tissues and performed allele-specific ChIP-qPCR (ChIP-AS-qPCR) analyses on two heterozygous (A/T) samples. Notably, we observed significantly greater enrichment of active histone marks, H3K4me1 and H3K27ac, at the A allele compared to the T allele (Supplementary Fig. 3a, b), indicating allele-specific enhancer activity.

To explore if the rs4519489 region acts as an enhancer affecting NOL10 expression, we utilized CRISPR interference (CRISPRi) in 22Rv1 and PC3 cells. We first established a cell line with stable dCas9 expression, then designed and integrated two sgRNAs targeting the rs4519489 enhancer region into a humanized pgRNA vector (including an sgRNA targeting HPRT as a positive control). After infecting these dCas9-expressing cells with a lentivirus carrying the sgRNA plasmids, RT-qPCR analysis showed a notable decrease in NOL10 mRNA levels upon targeting the rs4519489 enhancer (Fig. 1h, and Supplementary Fig. 3c–g), indicating its role in modulating NOL10 expression.
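This section does not specify how relative expression was derived from the RT-qPCR data. One widely used approach is the 2^(−ΔΔCt) method, sketched below under the assumption of roughly 100% amplification efficiency for both the target and the reference gene. The function name and the Ct values are hypothetical and do not come from the study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Each argument is a mean Ct value; the reference is an internal
    control gene (for example GAPDH). Assumes ~100% PCR efficiency.
    """
    # Normalize the target to the reference gene within each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated condition with the control condition.
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)


# Hypothetical Ct values: the target amplifies one cycle later after CRISPRi.
fold = relative_expression(26.0, 18.0, 25.0, 18.0)
print(f"relative expression vs control: {fold:.2f}")
```

A one-cycle delay in the target with an unchanged reference corresponds to about half the control expression level.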

To further investigate the impact of rs4519489 alleles on NOL10 expression, we employed CRISPR-Cas9-mediated base editing34 to convert the T allele of rs4519489 to the A allele in PC3 cells. The resultant A/T heterozygous clone exhibited increased NOL10 expression, reinforcing the hypothesis that rs4519489 is a functional regulator of NOL10 expression (Fig. 1i, and Supplementary Fig. 3h). In parallel, we applied the AYBE editor35 in DU145 cells to edit the rs4519489 genotype from A/A to T/T. This alteration was associated with a significant reduction in NOL10 expression (Supplementary Fig. 3i, j), thus providing additional evidence for the allele-dependent regulatory role of rs4519489 on NOL10.

In summary, our high-throughput SNPs-seq screening identified rs4519489 as a functional causal SNP closely linked with key GWAS lead SNPs at the 2p25 prostate cancer susceptibility locus. Genotype-expression analysis revealed NOL10 as the eQTL gene for rs4519489, indicating the rs4519489/2p25 region likely functions as an enhancer modulating NOL10 expression.

NOL10 upregulation and rs4519489 eQTL correlate with prostate cancer severity

To ascertain the functional significance of NOL10 in clinical settings, we initially analyzed the CPGEA data33, which revealed significant upregulation of NOL10 mRNA in prostate cancer tumors compared to normal tissues (Fig. 2a). This finding was supported by further analysis of data from TCGA PRAD36, Health Study Prostate Tumor Cohort37,38, and another Chinese prostate cancer dataset39, all of which consistently indicated higher NOL10 expression in tumor tissues (Fig. 2b, c, and Supplementary Fig. 4a). Independent validation using samples from the Fudan University Shanghai Cancer Center (FUSCC) cohort40, through RT-qPCR and Western Blot, confirmed significant overexpression of NOL10 in prostate tumor tissues (Fig. 2d). Additionally, analysis of the GSE10645 dataset41 showed a notable association between NOL10 expression and metastatic progression in prostate cancer patients (Fig. 2e), underscoring NOL10 upregulation and its rs4519489 eQTL correlation with the severity of prostate cancer.

Fig. 2. Correlation between NOL10 expression, rs4519489 genotype, and prostate cancer risk and severity.

a–c Elevated NOL10 expression in prostate adenocarcinoma compared to normal prostate glands across three independent cohorts: CPGEA, TCGA, and GSE62872. d RT-qPCR and western blot analysis of NOL10 expression in prostate cancer versus paracancerous tissues, with GAPDH as a loading control. N: normal; T: tumor. n  =  10 samples; P values based on the order of appearance: 5.91E–05, 1.12E–02, 8.09E–05, 6.70E–03, 1.52E–03. e–i Association of high NOL10 expression with various clinical features in prostate cancer: increased tumor metastasis (e, n = 31), higher tumor stage (f, n = 144), lymph node metastasis (g, n = 409), higher Gleason score (h, n = 497), and biochemical recurrence (BCR) (i, n = 122). j Kaplan-Meier survival analysis showing that higher NOL10 expression correlates with reduced overall survival in the GSE35988 cohort. k Correlation analysis between rs4519489 genotypes and biochemical recurrence in the TCGA PRAD cohort. l, m Prostate cancer patients with the rs4519489 A/A genotype in the TCGA cohort exhibit lower disease-free and progression-free probability. n Multivariate analysis including rs4519489 genotype for overall survival of prostate cancer patients in the TCGA cohort. P values were evaluated by Cox’s proportional hazards regression. The horizontal error bars represent the 95% CI with the measure of center as HR. o, p Worse overall survival and progression-free probability in TCGA cohort patients with the rs4519489 A/A genotype and higher NOL10 expression. In a-c and e-i, the interquartile range (IQR) is depicted by the box with the median represented by the center line. Whiskers maximally extend to 1.5 × IQR (with outliers shown). In d, n  =  3 technical replicates, error bars, mean ± SD. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001; statistical significance determined using two-tailed Student’s t-test, two-way ANOVA, or log-rank analysis, as appropriate. Source data are provided as a Source Data file.

Our in-depth analysis of clinical features in prostate cancer patients revealed that elevated NOL10 expression correlates significantly with more advanced tumor stages36,42 (Fig. 2f, and Supplementary Fig. 4b), lymph node metastasis36 (Fig. 2g), higher Gleason scores36 (Fig. 2h), and increased biochemical recurrence rates33 (Fig. 2i). Moreover, survival analysis utilizing the Grasso dataset43 indicated that higher NOL10 levels are linked to reduced overall survival times (Fig. 2j), underscoring the potential of NOL10 as a critical prognostic biomarker for prostate cancer.

We also investigated the relationship between NOL10 expression and genome instability in clinical samples. By examining three indicators of genome instability (altered genome fraction, aneuploidy score, and mutation count) in TCGA PRAD samples36, we observed a positive correlation between NOL10 expression and these markers (Supplementary Fig. 4c, e). Given that ectopic APOBEC-mediated mutagenesis causes excessive point mutations and genomic instability in prostate cancer44, we explored whether mutations in the NOL10 gene align with APOBEC mutagenesis signatures. Comparative analysis of APOBEC-associated mutations in NOL10 versus the global mutational landscape of prostate cancer genomes revealed a modest enrichment of APOBEC-induced mutations within the NOL10 gene in both CPGEA and TCGA cohorts (Supplementary Fig. 4f–i). These findings suggest that NOL10 may be subject to APOBEC-mediated mutagenesis, potentially contributing to its functional involvement in prostate cancer progression and underscoring its relevance in the complex molecular pathology of the disease.

Given the significant association between the rs4519489 risk allele A and elevated NOL10 expression, as well as the correlation of NOL10 upregulation with prostate cancer severity, we examined whether the rs4519489 genotype directly impacts patient survival outcomes. Our analysis revealed that patients carrying the risk A/A genotype at rs4519489 had higher rates of biochemical recurrence, shorter overall and disease-free survival, and an increased risk of disease progression (Fig. 2k–m, and Supplementary Fig. 4j). Multivariate analysis confirmed that rs4519489 is an independent prognostic factor for overall survival in the TCGA cohort (Fig. 2n). Moreover, patients with the A/A genotype and higher NOL10 levels exhibited markedly poorer overall and progression-free survival compared to those with the A/T or T/T genotypes and lower NOL10 levels (Fig. 2o, p).
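The survival comparisons above rest on Kaplan-Meier estimates. As a minimal sketch of the estimator itself, the function below computes the survival probability at each observed event time while accounting for censored patients. The function name and follow-up data are hypothetical; they are not TCGA values, and the log-rank test used to compare genotype groups is not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. recurrence) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Multiply by the conditional probability of surviving this time point.
            survival *= 1 - n_events / n_at_risk
            curve.append((current_time, survival))
        # Both events and censored patients leave the risk set afterwards.
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up times (months) and event indicators.
for time, prob in kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]):
    print(time, round(prob, 3))
```

Censored patients contribute to the risk set up to their last follow-up but do not lower the curve, which is why the step at month 3 in this toy example is computed over two rather than three patients.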

In summary, our comprehensive analysis of clinical data demonstrates that the rs4519489 risk allele A and NOL10 expression, individually or in combination, are associated with aggressive prostate cancer characteristics. These findings underscore the potential of NOL10 and rs4519489 as valuable biomarkers for assessing disease severity and predicting disease progression.

NOL10 as an oncogene potentiates proliferation and metastasis of prostate cancer

To evaluate the functional consequences of the rs4519489 single-base substitution in prostate cancer cells, we conducted comparative assays between parental PC3 cells harboring the T/T genotype and engineered PC3 cells carrying the A/T mutation at this locus. Cells with the A/T genotype demonstrated significantly enhanced proliferation, colony formation, and migratory capabilities compared to their T/T counterparts (Supplementary Fig. 5a–c). We next sought to investigate the effects of NOL10 on prostate cancer tumor biology and thus performed shRNA-mediated knockdown of NOL10 in DU145 cells (Supplementary Fig. 5d, e). The results showed that proliferation, colony formation, and migration of DU145 cells transfected with NOL10 shRNAs were significantly reduced compared with cells transfected with control shRNA (Fig. 3a–c). We also observed significant reductions in cell proliferation upon knockdown of NOL10 using siRNA in LNCaP cells (Fig. 3d).

Fig. 3. Impact of NOL10 modulation on prostate cancer phenotypes in vitro and in vivo.

a, b Proliferation of DU145 cells with NOL10 knockdown, assessed by CCK8 assay (a) and colony formation assay (b). n = 3 samples; P values based on the order of appearance: 6.60E–05, 1.32E–05 (a); 4.63E–03, 1.64E–02 (b). c Representative images and quantification of migration in DU145 cells stably expressing shRNAs targeting NOL10. n = 3 samples; P values based on the order of appearance: 6.66E–05, 8.35E–06. d Measurement of cell proliferation in LNCaP cells with NOL10 knockdown using siRNA, determined by CCK8 assay. n = 4 samples; P values based on the order of appearance: 3.82E–05, 1.02E–02, 1.35E–07. e–h Enhancement of prostate cancer phenotypes in 22Rv1 cells with NOL10 overexpression, including cell proliferation (e), colony formation (f), migration (g) and invasion (h). n = 2 samples; P values based on the order of appearance: 5.88E–06 (e), 4.56E–03 (f), 1.27E–04 (g), 3.56E–04 (h). i Representative images of tumor xenografts in nude mice following subcutaneous injection of DU145 cells infected with control or NOL10 shRNA. j, k Quantification of tumor weight (j) and volume (k) in xenograft tumors from DU145 cells with NOL10 knockdown. n = 18 samples; P values based on the order of appearance: 3.22E–03, 3.38E–02 (j); 1.64E–02, 4.68E–03 (k). Error bars, mean ± SD. l Hematoxylin and eosin (H&E) and immunohistochemistry (IHC) staining for Ki67, E-cadherin, and Vimentin in xenograft tumor sections. Images are representative of three independent experiments. m Representative images of PTEN- and LKB1-deficient organoids derived from control shRNA and NOL10 shRNA-treated cells. Scale bar, 250 µm. n, o Reduced number (n) and size (o) of organoids following NOL10 knockdown in Ptenpc−/−; Lkb1pc−/− organoids. n = 3 samples; P values based on the order of appearance: 3.35E–02, 7.90E–03 (n); 7.39E–06, 1.06E–02 (o). p–r Positive correlation between NOL10 expression and EMT score, as well as AR signaling score, in human prostate cancer tumors across multiple cohorts. In a–h, n, o, n = 3 technical replicates, error bars, mean ± SD. In c, g, h, l, scale bar, 100 µm. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001; statistical significance was assessed using two-way ANOVA. Source data are provided as a Source Data file.

To further substantiate the oncogenic properties of NOL10 in prostate cancer, we employed a doxycycline (Dox)-inducible lentiviral system for overexpressing NOL10 in DU145 and 22Rv1 cells (Supplementary Fig. 5f, g, k, l). The cellular function assays revealed that NOL10 overexpression significantly increased oncogenic behaviors, including cell proliferation, colony formation, migration, and invasion, compared to controls (Fig. 3e–h, and Supplementary Fig. 5h–j). Complementing this, shRNA-mediated knockdown of NOL10 in 22Rv1 cells replicated the inhibitory effects on oncogenic activities previously seen in DU145 cells, reinforcing NOL10’s critical contribution to oncogenic traits in prostate cancer (Supplementary Fig. 5m–r).

Furthermore, we conducted sgRNA-mediated knockout or shRNA-mediated knockdown of NOL10 in PC3 cells and validated the knockout efficiency using RT-qPCR and Western blot analysis (Supplementary Fig. 6a, b, g, h). Subsequent cellular biology assays, including CCK8, colony formation, and transwell migration and invasion (with or without Matrigel), produced results consistent with the NOL10 shRNA knockdown experiments in DU145 and 22Rv1 cells (Supplementary Fig. 6c–f, i–l), further underscoring the critical role of NOL10 in regulating tumor cell behavior.

Building on the significant association between NOL10 expression and prostate cancer proliferation, we expanded our study to assess the impact of NOL10 on tumor growth in vivo. We injected nude mice with DU145 cells in which NOL10 expression was diminished through shRNA-mediated knockdown. The results indicated that tumors from the NOL10 knockdown group were markedly smaller in both volume and weight compared to those from the control group (Fig. 3i–k). Histological examination with Hematoxylin and Eosin (H&E) staining demonstrated that the NOL10 knockdown tumors had cells with notably smaller nuclei and fewer atypical features than control tumors (Fig. 3l). Immunohistochemical (IHC) analysis reinforced these findings, revealing increased E-cadherin expression and decreased Vimentin expression in the NOL10 knockdown tumors, indicating a shift towards epithelial characteristics and reduced mesenchymal traits (Fig. 3l). Furthermore, Ki67 expression, which signals cell proliferation, was notably lower in the NOL10 knockdown tumors (Fig. 3l), supporting the role of NOL10 in promoting tumor growth and suggesting its potential as a target for therapeutic intervention. Similar results were observed in vivo using PC3 cells with shRNA-mediated NOL10 knockdown injected into nude mice (Supplementary Fig. 6m–p).

To further substantiate these findings, we utilized a prostate cancer organoid model derived from C57BL/6J mice harboring Ptenpc−/−;Lkb1pc−/− mutations45. RM1 cells (murine; lacking rs4519489) were transfected with NOL10-targeting shRNA, and the validated effective shRNAs were subsequently used to perform organoid experiments. Knockdown of NOL10 markedly inhibited cell proliferation, colony formation, and migration in vitro (Supplementary Fig. 7a–d). In organoid assays, NOL10 depletion led to a substantial decrease in both the number and size of organoids (Fig. 3m–o, and Supplementary Fig. 7e), further supporting the oncogenic role of NOL10 in prostate cancer.

Recognizing the critical importance of epithelial-mesenchymal transition (EMT) and androgen receptor (AR) signaling in the progression of prostate cancer, we explored the relationship between NOL10 expression and EMT or AR signaling activity in patients. Through a detailed analysis spanning multiple cohorts, including MSKCC42, NPC46, SMMU47, and SU2C48, we consistently found a positive correlation between elevated NOL10 expression and higher EMT or AR signaling scores (Fig. 3p–r, and Supplementary Fig. 7f–k), highlighting a potential role of NOL10 in modulating key pathways involved in prostate cancer advancement.

Collectively, our results underscore the significant contribution of NOL10 to promoting key oncogenic activities in prostate cancer, both in vitro and in vivo. NOL10 notably boosts cell proliferation, migration, and invasion, and markedly amplifies the EMT process in subcutaneous tumor models in nude mice, underscoring its importance in cancer progression.

NOL10 promotes cell cycle progression contributing to prostate cancer severity

We next sought to elucidate the potential mechanisms through which NOL10 contributes to prostate cancer progression. We began with a gene set enrichment analysis (GSEA) using the TCGA PRAD dataset, which showed that NOL10 expression was significantly associated with critical cell cycle pathways, notably E2F targets and G2M checkpoint pathways (Fig. 4a, and Supplementary Data 2). Subsequently, to assess the impact of NOL10 knockdown on downstream gene expression, we performed RNA sequencing analysis to identify differentially expressed genes (DEGs) in LNCaP cells treated with control siRNA or siRNA targeting NOL10 (Supplementary Fig. 8a, b). Our analysis revealed a substantial correlation between two technical replicates, identifying 114 genes as upregulated and 71 genes as downregulated upon NOL10 knockdown (Supplementary Fig. 8c). Furthermore, GSEA of these downregulated DEGs highlighted their significant enrichment in cell cycle pathways for the NOL10 knockdown group (Fig. 4b, c, and Supplementary Fig. 8d).
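To illustrate the idea behind the enrichment analysis, the sketch below computes a simplified, unweighted running-sum enrichment score over a ranked gene list. This is not the full GSEA procedure, which weights hits by the ranking metric and assesses significance by permutation; the function name and gene labels are hypothetical placeholders.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (Kolmogorov-Smirnov-like).

    ranked_genes: genes ordered from most to least associated with the phenotype
    gene_set:     members of the pathway being tested
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(hits) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up at pathway members, step down at every other gene.
        running += 1 / n_hit if is_hit else -1 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list: pathway genes concentrated at the top.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
print(enrichment_score(ranked, {"g1", "g2", "g4"}))
```

A positive score indicates that pathway members cluster toward the top of the ranking, and a negative score indicates clustering toward the bottom.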

Fig. 4. Association of NOL10 gene signature with prostate cancer progression in clinical settings.

a GSEA in the TCGA prostate cohort, ranking NOL10 expression against pathways from the HALLMARK collection. b, c GSEA plots showing significant enrichment of E2F (b) and G2M (c) target pathways following siRNA-mediated NOL10 knockdown in LNCaP cells. d Heatmap of differentially expressed genes in the cell cycle pathway after NOL10 knockdown in LNCaP cells. e RT-qPCR validation of differentially expressed genes in the cell cycle pathway following NOL10 knockdown in LNCaP cells (GAPDH as the internal control). n  =  3 samples; P values based on the order of appearance: 2.91E–06, 2.31E–06, 2.47E–06, 1.48E–04, 6.26E–05, 5.80E–04, 2.84E–06, 7.88E–05, 1.15E–04, 4.29E–02, 1.20E–04, 2.32E–04, 3.83E–07, 1.92E–05, 6.62E–07, 5.24E–06, 1.58E–01, 9.58E–04, 2.75E–06, 7.48E–05. f–n Correlation of the NOL10 cell cycle signature (CCS) score with various clinical parameters across prostate cancer cohorts: cell cycle progression (CCP) score (f), tumor metastasis (g), T stage (h), lymph-node metastasis (i), Gleason score (j), PSA level (k), seminal vesical (l), person neoplasm status (m), and biochemical recurrence indicator (BRI) (n). o–q Kaplan-Meier survival curves illustrating relationships between the NOL10 cell cycle signature and overall survival (OS), recurrence-free survival (RFS), and metastasis-free survival (MFS) in prostate cancer patients from the GSE21034 and TCGA cohorts; analyzed using the log-rank test. r Forest plots displaying the meta-analysis of the hazard ratio estimates of the NOL10 CCS in predicting biochemical recurrence-free survival in CPGEA (n = 120), MSKCC (n = 140), TCGA (n = 492), and DKFZ (n = 105) cohorts. Horizontal error bars represent 95% confidence intervals (CIs), with HR as the center measure. The HR and 95% CI were presented in the form of natural logarithm (ln). P values calculated using a two-way Fixed-Effects Model. 
s Multivariate analysis (MV) of biochemical recurrence (BCR) in prostate cancer patients from the CPGEA cohort, including the NOL10 cell cycle signature as a factor. P values were calculated using Cox proportional hazards regression. The horizontal error bars represent the 95% CI with the measure of center as HR. In e, n  =  3 technical replicates, error bars, mean ± SD. In g–n, the interquartile range (IQR) is depicted by the box with the median represented by the center line. Whiskers maximally extend to 1.5 × IQR (with outliers shown). *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001; statistical significance determined using two-way ANOVA. Source data are provided as a Source Data file.

To validate these RNA-seq findings, we performed RT-qPCR on selected DEGs from the cell cycle pathways, confirming the changes observed in the RNA-seq data (Fig. 4d, e). These results reinforce the involvement of NOL10 in modulating cell cycle-related gene expression in prostate cancer.
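As a minimal sketch of how relative expression is commonly derived from RT-qPCR data normalized to an internal control such as GAPDH, the snippet below applies the 2^-ΔΔCt calculation. The use of this particular formula is an assumption (the quantification method is not stated in this section), and the function name and Ct values are hypothetical, not data from this study.

```python
def relative_expression(ct_target_kd, ct_ref_kd, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene (knockdown vs control) by the 2^-ddCt method.

    Assumption: about 100% amplification efficiency for target and reference.
    """
    delta_kd = ct_target_kd - ct_ref_kd        # normalize to reference gene (e.g. GAPDH)
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    delta_delta = delta_kd - delta_ctrl        # knockdown relative to control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values, for illustration only.
fold = relative_expression(26.0, 18.0, 24.0, 18.0)
print(fold)  # 0.25
```

A value below 1 indicates reduced expression of the target gene in the knockdown sample relative to the control.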

To further confirm the consistency of these findings, we extended the RNA-seq analysis to additional cell lines and knockdown methods. In all instances, the results were consistent with those obtained in LNCaP cells. Specifically, NOL10 knockdown by shRNA in LNCaP and DU145 cells (Supplementary Fig. 8e–l; and Supplementary Fig. 9a–i) and by siRNA in 22Rv1 and PC3 cells (Supplementary Fig. 9j–r; Supplementary Fig. 10a–i) showed similar enrichment of the downregulated DEGs in cell cycle-related pathways, supporting the robustness of these findings.

Next, to evaluate the clinical relevance of NOL10 target genes in prostate cancer, we developed a cell cycle signature (CCS) based on these genes. Our analysis showed that the NOL10 CCS positively correlates with cell cycle progression (CCP) scores across diverse cohorts, including CPGEA33, TCGA PRAD36, SU2C48, FHCRC49, and GSE62872 (ref. 37) (Fig. 4f, Supplementary Fig. 11a–d). Further, we discovered that the NOL10 CCS was significantly higher in metastatic prostate cancer compared to normal prostate glands and primary tumors (Fig. 4g). Additionally, an elevated NOL10 CCS was linked to more aggressive prostate cancer features, such as advanced T stage, lymph node metastasis, higher Gleason scores, increased PSA levels, seminal vesicle invasion, person neoplasm status, and biochemical recurrence indicator (BRI) in various cohorts (Fig. 4h–n and Supplementary Fig. 11e–m). Importantly, a higher NOL10 CCS was also associated with poorer patient survival outcomes, including overall, recurrence-free, and metastasis-free survival (Fig. 4o–q, Supplementary Fig. 11n–p), underscoring the potential of the NOL10 CCS as a prognostic marker for prostate cancer aggressiveness and patient prognosis.

To validate the strength of the observed associations, we performed a meta-analysis assessing the correlation between the NOL10 CCS and survival outcomes in prostate cancer patients across various cohorts. Our findings demonstrated that a higher NOL10 CCS significantly correlates with shorter biochemical recurrence-free and overall survival (OS) (Fig. 4r, Supplementary Fig. 11q). Furthermore, multivariate analyses revealed that an elevated NOL10 CCS serves as an independent risk factor for both biochemical recurrence-free survival and OS across multiple cohorts (Fig. 4s, and Supplementary Fig. 12a–c), reinforcing the prognostic value of the NOL10 CCS in predicting outcomes for prostate cancer patients.
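To illustrate how per-cohort hazard ratios on the natural-log scale can be pooled, the sketch below implements a standard inverse-variance fixed-effect estimate. It assumes that each cohort contributes ln(HR) with a 95% CI on the ln scale; the function name and the example inputs are hypothetical, and the exact model settings used for Fig. 4r may differ.

```python
import math


def fixed_effect_pool(ln_hrs, ci_lows, ci_highs):
    """Pool ln(HR) estimates with inverse-variance weights (fixed-effect model).

    ci_lows / ci_highs are 95% CI bounds on the ln scale.
    Returns (pooled ln HR, pooled standard error, two-sided P value).
    """
    z95 = 1.959964  # standard normal quantile for a 95% CI
    weights = []
    for low, high in zip(ci_lows, ci_highs):
        se = (high - low) / (2 * z95)   # recover the standard error from the CI width
        weights.append(1.0 / se ** 2)
    pooled = sum(w * b for w, b in zip(weights, ln_hrs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z_stat = pooled / pooled_se
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))  # two-sided normal P value
    return pooled, pooled_se, p_value


# Hypothetical ln(HR) values and CIs for three cohorts, for illustration only.
pooled, se, p = fixed_effect_pool([0.6, 0.4, 0.8], [0.1, -0.1, 0.2], [1.1, 0.9, 1.4])
print(f"pooled ln(HR) = {pooled:.3f}, HR = {math.exp(pooled):.2f}, P = {p:.3g}")
```

Cohorts with narrower confidence intervals receive larger weights, so the pooled estimate is dominated by the most precisely estimated hazard ratios.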

In summary, our research indicates that NOL10 potentially regulates genes crucial to cell cycle pathways, with a significant correlation observed between NOL10 target genes and prostate cancer progression, highlighting its importance in promoting the advancement of prostate cancer.

Unbiased proteomics approach identified USF1 as an allele-specific mediator between rs4519489 and NOL10

Given the established role of regulatory SNPs in modulating disease susceptibility via alterations in transcription factor (TF)-DNA binding50, we further sought to identify TFs that might account for binding differences between the T and A alleles of rs4519489. We thus employed a proteome mass spectrometry approach inspired by the proteome-wide analysis of SNPs (PWAS) technique51 (Fig. 5a). By comparing mass spectrometry data for both alleles of rs4519489 (Supplementary Data 3), we discovered that several TFs, notably USF1, TBX3, and TFAP4, showed specific interactions with the A allele, suggesting their potential roles in mediating the allele-specific effects on gene expression and prostate cancer progression.

Fig. 5. Unbiased proteomics identification of USF1 as a transcription factor interacting with the rs4519489 enhancer region.

a Schematic outline of the proteomics screening approach used to identify transcription factors interacting with the rs4519489 enhancer region. b Enhancer Element Locator (EEL) analysis showing matching scores between the A or T alleles of rs4519489 and the motifs of potential interacting transcription factors. c ChIP-qPCR validation of USF1 binding to the rs4519489 region in LNCaP and 22Rv1 prostate cancer cell lines. n  =  4 samples; P values based on the order of appearance: 6.15E–05, 1.03E–03. d Allele-specific ChIP-qPCR (ChIP-AS-qPCR) demonstrating preferential binding of USF1 to the A allele at rs4519489 in 22Rv1 cells (A/T genotype). n  =  2 samples; P values based on the order of appearance: 8.12E–01, 3.41E–03. e, f Representative experiment of three independent gel super-shift assays showing allele-specific binding of USF1 to the A or T allele of rs4519489, with nuclear extracts from LNCaP (e) or 22Rv1 (f) cells, using Flag antibody or IgG control antibody. OE overexpression; NE nuclear extract. g ChIP-qPCR results confirming USF1 binding at the rs4519489 region in prostate cancer and adjacent paracancerous tissues. n  =  8 samples; P values based on the order of appearance: 3.58E–03, 3.96E–02, 1.16E–03, 1.03E–02. In c, d, g, n  =  3 technical replicates, error bars, mean ± SD. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, ns: non-significant; statistical significance assessed using two-tailed Student’s t test. Source data are provided as a Source Data file.

We further explored if rs4519489 directly influences the DNA binding affinity of any specific TFs identified in our proteomics study. Utilizing computational analysis with the enhancer element locator (EEL) algorithm52 and integrating it with DNA binding position weight matrix data for human TFs53, we found that rs4519489 resides within the binding motifs of USF1, TBX3, and TFAP4. Notably, USF1 was identified as the most significant among them (Fig. 5b). This suggests a pivotal interaction between rs4519489 and key TFs, especially USF1, potentially clarifying how this SNP contributes to the genetic risk of prostate cancer.

To validate the binding of TFs to the rs4519489 locus, we performed ChIP-qPCR assays using antibodies against USF1, TBX3, or TFAP4. The results demonstrated a significant enrichment of USF1 at the 2p25/rs4519489 locus compared to the IgG control under both ETH and DHT treatments (Supplementary Fig. 13a), aligning well with our EEL analysis predictions. In contrast, no significant enrichment was observed for TBX3 and TFAP4. Furthermore, a comparative analysis of ChIP-qPCR across multiple prostate cancer cell lines (LNCaP, 22Rv1, PC3, and VCaP) consistently detected USF1 enrichment at the rs4519489 locus. Notably, enrichment was markedly stronger in cells carrying the homozygous A/A genotype (LNCaP and VCaP) and the heterozygous A/T genotype (22Rv1) compared to the homozygous T/T genotype (PC3) (Fig. 5c, and Supplementary Fig. 13b, c). These results collectively support the hypothesis that USF1, among other TFs, plays a pivotal role in binding to the rs4519489 locus, indicating its involvement in the regulatory processes governing gene expression linked to prostate cancer pathogenesis at this genomic site.

To elucidate the allele-specific binding differences of rs4519489 with USF1, we assessed the genotypes of rs4519489 in five prostate cancer cell lines. Sanger sequencing unveiled that only the 22Rv1 cell line was heterozygous, harboring both A and T alleles (Supplementary Fig. 13d–h). Subsequently, ChIP-AS-qPCR targeting rs4519489 demonstrated a notably higher enrichment of USF1 at the A allele compared to the T allele (Fig. 5d), supporting the hypothesis of allele-specific transcription factor binding.

Further validation of USF1 binding was carried out using super-shift EMSA in LNCaP, 22Rv1, and 293T cells. Upon the addition of a Flag antibody (which recognizes Flag-tagged USF1), a distinct super-shift band was observed (Fig. 5e, f, Supplementary Fig. 13i), indicating that USF1 preferentially binds to the A allele of rs4519489.

To corroborate these findings in vivo, we conducted ChIP-qPCR assays in normal prostate or tumor tissues using USF1 antibody or IgG control. The qPCR results affirmed the enrichment of USF1 at the rs4519489 region in prostate specimens (Fig. 5g, Supplementary Fig. 13j). Notably, the ratio of A to T read counts from the Input ChIP-seq sample in one prostate specimen was 1:5, while the ratio in the USF1 ChIP-seq sample shifted to 11:6 (Supplementary Fig. 13k). This shift in allele-specific enrichment further supports the preferential binding of USF1 to the A allele in clinical samples.
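The size of this allelic shift can be made explicit with a short worked calculation. The sketch below treats the reported A:T ratios (1:5 in input, 11:6 in the USF1 ChIP sample) as if they were counts, which is an assumption, since only the ratios are given here; the function name is hypothetical.

```python
from fractions import Fraction


def allelic_enrichment(chip_a, chip_t, input_a, input_t):
    """Ratio of the A:T odds in the ChIP sample to the A:T odds in the input sample."""
    return Fraction(chip_a, chip_t) / Fraction(input_a, input_t)


ratio = allelic_enrichment(11, 6, 1, 5)
print(ratio)                    # 55/6
print(round(float(ratio), 2))   # 9.17
```

Under this reading, the A allele is roughly nine-fold more represented relative to the T allele after USF1 immunoprecipitation than in the input. No statistical test is implied by this calculation.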

Taken together, our unbiased allele-specific proteomics analysis identified USF1 as a TF interacting with the rs4519489 regulatory region, independent of androgen signaling pathways. USF1 exhibited a preference for the A allele of rs4519489.

USF1 positively correlates with NOL10 expression and functions as an oncogene in prostate cancer

To further investigate the regulatory effect of USF1 on NOL10, we initially established a stable 22Rv1 cell line with sgRNA-mediated USF1 knockout. Western blot analyses revealed a downregulation of NOL10 expression following USF1 knockout (Fig. 6a). Further validation was conducted by transiently transfecting 22Rv1 cells with a USF1 overexpression plasmid or an empty vector. Western blot analysis demonstrated elevated NOL10 expression levels in the USF1 overexpression samples compared to the empty vector controls (Fig. 6b). These experiments were replicated in LNCaP cells, yielding consistent results (Supplementary Fig. 14a). Additionally, we generated a USF1 shRNA knockdown stable cell line in PC3 cells. RT-qPCR and Western blot results validated a significant decrease in NOL10 expression following USF1 knockdown (Supplementary Fig. 14b–d), consistent with the knockout results in 22Rv1 cells.

Fig. 6. USF1 regulates NOL10 expression and functions as an oncogenic transcription factor in prostate cancer.

a Reduction in NOL10 protein levels following USF1 Knockout (sgRNA) in 22Rv1 cells, as assessed by Western blotting. b Increased NOL10 protein expression upon USF1 overexpression in 22Rv1 cells. c, d Correlation of mRNA expression between NOL10 and USF1 in prostate cancer patients across different cohorts, showing positive association. e–g Association of higher USF1 expression with advanced tumor stage (e), lymph node-positive status (f), and reduced progression-free survival (g) in prostate cancer. h RT-qPCR analysis of USF1 expression in paired prostate cancer and adjacent paracancerous tissue samples, with GAPDH as a loading control. n  =  16 samples; P values based on the order of appearance: 1.75E–03, 1.78E–02, 2.74E–03, 1.17E–02, 5.96E–03, 3.05E–05, 8.28E–03, 1.27E–02. i, j Proliferation assays in DU145 cells with shRNA-mediated USF1 knockdown examined by CCK8 assay (i) and colony formation assay (j), showing reduced cell growth. n  =  3 samples; P values based on the order of appearance: 6.54E–05, 5.57E–04 (i); 1.14E–02, 5.10E–03 (j). k Representative images and quantification of cell migration in DU145 cells with stable expression of USF1-targeting shRNAs. n  =  3 samples; P values based on the order of appearance: 1.66E–03, 2.08E–03. l CCK8 assay showing the effect of siRNA-mediated USF1 knockdown on the proliferation of LNCaP cells. n  =  3 samples; P values based on the order of appearance: 2.74E–03, 2.20E–02. m Representative images of tumor xenografts in nude mice after subcutaneous injection of DU145 cells with control or USF1 shRNA. n, o Quantification of tumor weight (n) and volume (o) at specified time points, showing reduced growth in tumors with USF1 knockdown. n  =  18 samples; P values based on the order of appearance: 9.91E–03, 5.15E–03 (n); 3.95E–03, 1.87E–03 (o). error bars, mean ± SD. 
p Hematoxylin and eosin (H&E) and immunohistochemistry (IHC) staining of xenograft tumors, with Ki67, E-cadherin, and Vimentin indicating altered cell proliferation and epithelial-mesenchymal transition (EMT). Images are representative of three independent experiments. q Representative images of organoids derived from prostate cancer in Ptenpc−/−; Lkb1pc−/− mice, with control or USF1 shRNA. Scale bar, 250 µm. r, s Quantification of organoid number (r) and size (s) from Ptenpc−/−; Lkb1pc−/− prostate cancer tissue, showing decreased organoid formation upon USF1 knockdown. n  =  3 samples; P values based on the order of appearance: 5.98E–04, 5.56E–04 (r); 6.21E–05, 4.13E–04 (s). In a, b, a representative experiment of three independent western blotting experiments is shown. In e, f, the interquartile range (IQR) is depicted by the box with the median represented by the center line. Whiskers maximally extend to 1.5 × IQR (with outliers shown). In h–l, r, s, n  =  3 technical replicates, error bars, mean ± SD. In k, p, scale bar, 100 µm. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001; statistical significance determined using two-tailed Student’s t-test, two-way ANOVA, or log-rank analysis, as appropriate. Source data are provided as a Source Data file.

To explore the association between USF1 and NOL10 expression, we conducted a comprehensive analysis across multiple datasets, revealing a consistent positive correlation between the mRNA expression levels of USF1 and NOL10. This correlation was observed in diverse cohorts, including CPGEA33, TCGA PRAD36, GTEx54, Stockholm camcap55, SMMU47, and NPC cohorts46 (Fig. 6c, d, Supplementary Fig. 14e–h), indicating a potential role for USF1 in upregulating NOL10 expression in clinical contexts.
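A co-expression analysis of this kind reduces to a correlation coefficient between two per-sample expression vectors. The sketch below computes a Pearson coefficient in plain Python; the choice of Pearson rather than a rank-based coefficient is an assumption, and the expression values are hypothetical placeholders rather than cohort data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log-scale expression values for five samples, for illustration only.
usf1 = [5.1, 5.8, 6.2, 6.9, 7.4]
nol10 = [4.0, 4.6, 4.5, 5.3, 5.9]
print(round(pearson_r(usf1, nol10), 3))
```

A positive coefficient indicates that samples with higher USF1 expression tend to have higher NOL10 expression; it does not by itself establish a regulatory direction.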

Further investigating the clinical relevance of USF1 expression, we analyzed two large-scale clinical datasets. In the CPGEA cohort33, high USF1 expression showed a significant association with advanced tumor stages in prostate cancer (Fig. 6e). Similarly, analysis of the TCGA PRAD cohort36 revealed that elevated USF1 expression was significantly correlated with malignant characteristics of prostate cancer, including tumor stage, lymph node metastasis, Gleason score, biochemical recurrence, person neoplasm status, and progression-free survival (Fig. 6f, g, and Supplementary Fig. 14i–l). These findings underscore the potential prognostic value of USF1 expression in prostate cancer.

To validate our findings from the clinical databases, we conducted RT-qPCR verification using eight pairs of prostate cancer tissues and their adjacent normal tissues from the CPGEA cohort. This verification reinforced our database analysis, showing higher expression of USF1 in prostate cancer tissues compared to adjacent normal tissues (Fig. 6h). These results collectively underscore the significant correlation between USF1 expression and prostate cancer progression, highlighting USF1 as a potential biomarker for disease severity and as a target for therapeutic intervention.

Recognizing the pivotal role of TFs in cancer development56, and considering the regulatory influence of USF1 on the oncogene NOL10, we hypothesized that USF1 might possess critical biological functions in prostate cancer. To test this hypothesis, we established a stable DU145 cell line with shRNA-mediated USF1 knockdown and transfected LNCaP cells with siRNA targeting USF1 (Supplementary Fig. 15a–c). Subsequently, we performed a series of tumor cell biology assays, including CCK8 cell proliferation, colony formation, and cell migration experiments. The results indicated that, compared to the control shRNA or siRNA group, the USF1 knockdown group exhibited significantly reduced cell proliferation, colony formation, and migration abilities (Fig. 6i–l).

In addition, we generated stable DU145 and 22Rv1 cell lines overexpressing USF1 and conducted similar cell function experiments (Supplementary Fig. 15d, e, i, j). The outcomes of these experiments revealed that cell proliferation, colony formation, migration, and invasion in the USF1 overexpressed 22Rv1 cells were significantly enhanced compared to the cells with the empty vector control (Supplementary Fig. 15f–h, k–n). These phenotypic changes were further validated in PC3 cells with shRNA-mediated knockdown of USF1 (Supplementary Fig. 16a–f). These findings collectively provide compelling evidence that USF1 plays a critical role in the modulation of prostate cancer cell behaviors, contributing to the progression and aggressiveness of the disease.

To validate our in vitro findings in an in vivo setting, we carried out subcutaneous tumor transplantation experiments using nude mice. These mice were injected subcutaneously with DU145 cells stably transduced with either control or USF1-targeting shRNAs. The results showed that both the volume and weight of the tumors in the USF1 knockdown groups were significantly reduced compared to the control group (Fig. 6m–o). Additionally, histopathological examination, including H&E staining and IHC analysis of the tumor tissues from the USF1 knockdown groups, revealed patterns similar to those observed in the NOL10 knockdown group, showing a diminished capacity for subcutaneous tumor formation and inhibition of the epithelial-mesenchymal transition (EMT) process in the tumors (Fig. 3l and Fig. 6p). These results were further validated in PC3 cells with shRNA-mediated knockdown of USF1 (Supplementary Fig. 16g–j).

Additionally, we confirmed the role of USF1 in the RM1 murine prostate cancer cell line. Knockdown of USF1 led to a drastic reduction in cell proliferation, colony formation, and migration (Supplementary Fig. 16k–n). Furthermore, organoid models derived from murine prostate cancer tissue (C57BL/6J background, Ptenpc−/−; Lkb1pc−/−) showed a significant decrease in both the number and size of organoids upon USF1 knockdown (Fig. 6q–s, and Supplementary Fig. 16o). These findings provide additional support for the role of USF1 as an oncogenic transcription factor.

In summary, USF1 positively regulates the expression of NOL10 at both the mRNA and protein levels. Our comprehensive analysis demonstrates a positive correlation between USF1 and NOL10 expression, with clinical data indicating a connection between USF1 and malignant prostate cancer characteristics. Moreover, our findings demonstrate that USF1 enhances the aggressiveness of prostate cancer cells in vitro and promotes tumor formation and the EMT process in vivo, as evidenced by both mouse and organoid models.

Combined effects of NOL10 and USF1 on prostate cancer progression

To further investigate the combined role of NOL10 and USF1 in prostate cancer progression, we first performed RNA sequencing following USF1 knockdown using shRNA in LNCaP and DU145 cells. The RNA-seq results revealed significant downregulation of genes involved in the E2F and G2M pathways upon USF1 knockdown, indicating that USF1, similar to NOL10, plays a positive regulatory role in the expression of cell cycle-related genes. These findings were confirmed by RT-qPCR, which demonstrated consistent changes in the expression of cell cycle regulators, supporting the role of USF1 as a key driver of cell cycle progression (Fig. 7a–d, Supplementary Fig. 17a–k).

Fig. 7. Joint influence of NOL10 and USF1 on prostate cancer progression.

a, b GSEA showing significant enrichment of the E2F (a) and G2M (b) pathways in LNCaP cells after USF1 knockdown. c Heatmap of differentially expressed genes in the cell cycle pathway following USF1 knockdown in LNCaP cells, as identified by RNA-seq. d RT-qPCR validation of differentially expressed cell cycle genes in LNCaP cells with NOL10 knockdown, showing altered expression levels (GAPDH as the internal control). n  =  3 samples; P values based on the order of appearance: 5.23E–05, 1.75E–04, 6.94E–06, 1.16E–02, 1.50E–03, 2.47E–02, 2.45E–05, 2.18E–02, 7.39E–05, 1.61E–03, 9.45E–06, 3.50E–05, 7.43E–05, 3.32E–02, 3.94E–05, 9.03E–04, 1.10E–05, 1.89E–04, 5.55E–03, 4.01E–02, 8.16E–05, 3.24E–04, 6.79E–05, 1.95E–02, 1.26E–05, 2.04E–02. e MTT assay results in 22Rv1 cells transfected with empty vector, USF1, and NOL10 expression constructs, demonstrating cell proliferation changes. n  =  4 samples; P values based on the order of appearance: 2.67E–03, 1.58E–05, 5.50E–07. f Colony formation assay showing changes in colony number and size in 22Rv1 cells expressing control vector, USF1, and NOL10. n  =  4 samples; P values based on the order of appearance: 3.64E–02, 1.88E–02, 4.91E–03. g Migration assay showing representative images and quantification of 22Rv1 cells with control vector or USF1 and NOL10 expression constructs, indicating altered migratory potential. Scale bar, 100 µm. n  =  4 samples; P values based on the order of appearance: 9.59E–06, 1.15E–06, 4.68E–06. h–j Comparison of clinical features between groups with low and high co-expression of NOL10 and USF1 in the CPGEA and TCGA cohorts: tumor stage (h), lymph node metastasis (i), and PSA levels (j). k Combined impact of NOL10 and USF1 expression on BCR in CPGEA patients, with hazard ratio (HR), confidence interval (CI), and p-value shown. Horizontal error bars represent 95% CI, with HR as the center measure. P values calculated using a two-way Fixed-Effects Model. 
l, m Receiver operating characteristic (ROC) curves illustrating the predictive power of combined NOL10 and USF1 expression for survival outcomes in prostate cancer patients from the CPGEA (n = 120) and TCGA (n = 493) cohorts. n, o Correlation between combined NOL10 and USF1 expression and clinical outcomes in TCGA cohort: biochemical-recurrence-free survival (BFS) (n), and metastasis-free survival (MFS) (o) in prostate cancer patients. In d–g, n  =  3 technical replicates, error bars, mean ± SD. In h–o, “Hi” indicates higher expression and “Lo” indicates lower expression. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001; statistical significance determined using two-tailed Student’s t-test, two-way ANOVA, or log-rank test as appropriate. Source data are provided as a Source Data file.

Given the oncogenic roles of both USF1 and NOL10 in prostate cancer, we next assessed their combined effects by performing overexpression studies in 22Rv1 cells. Co-overexpression of USF1 and NOL10 resulted in a significant enhancement of cell proliferation, colony formation, and migration compared to the individual overexpression of either gene alone (Fig. 7e–g, Supplementary Fig. 17l). These results suggest that NOL10 and USF1 cooperate to drive key oncogenic processes in prostate cancer, further reinforcing their synergistic contribution to disease progression.

To investigate the clinical relevance of the combined expression of NOL10 and USF1 on prostate cancer progression, we conducted an analysis of their synergistic expression and its correlation with clinical pathology characteristics. Using data from both the CPGEA and TCGA datasets, we found that patients with elevated co-expression levels of NOL10 and USF1 showed a significant association with increased tumor stage, lymph node metastasis, PSA levels, Gleason score, and biochemical recurrence (Fig. 7h–j, Supplementary Fig. 18a–c). This suggests that the joint expression of NOL10 and USF1 could serve as a potential biomarker for assessing disease severity and progression in prostate cancer. The observed correlation underscores the importance of these two molecular entities in the pathophysiology of the disease and highlights their potential as targets for therapeutic intervention.

To further understand the clinical implications of NOL10 and USF1 co-expression in prostate cancer, we calculated hazard ratios (HR) for biochemical recurrence, metastasis, and overall survival based on the levels of NOL10 and USF1 expression across several cohorts, including CPGEA33, TCGA36, and SU2C48. The results consistently showed that higher co-expression of NOL10 and USF1 was associated with increased hazard ratios in these cohorts (Fig. 7k, and Supplementary Fig. 18d–f), indicating that patients with elevated levels of both NOL10 and USF1 expression are at a greater risk of disease progression.

To further evaluate the predictive power of NOL10 and USF1 expression in prostate cancer prognosis, we constructed time-independent Receiver Operating Characteristic (ROC) curves. These analyses demonstrated that the combined effect of NOL10 and USF1 outperformed the predictive accuracy of either gene alone. Moreover, time-dependent ROC curves were generated to assess the predictive capability for 1-, 3-, 5-, and 10-year survival outcomes. These analyses indicated that the combination of NOL10 and USF1 offered superior prognostic prediction over either gene alone across various cohorts, including CPGEA33, TCGA36, SU2C48, and DKFZ57 (Fig. 7l, m, Supplementary Fig. 19a–l).
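For the time-independent case, comparing a combined marker with single genes amounts to comparing areas under the ROC curve (AUC). The sketch below computes the AUC as the probability that a patient with the event scores higher than a patient without it. The additive combination of the two genes, the function name, and all values are hypothetical; how the combined score was actually built is not described in this section.

```python
def roc_auc(scores, events):
    """AUC as the fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical standardized expression values and outcomes (1 = event), illustration only.
events = [1, 1, 1, 0, 0, 0]
nol10 = [1.2, 0.1, 0.9, 0.3, -0.5, -1.0]
usf1 = [0.2, 1.1, 0.8, -0.2, 0.4, -0.9]
combined = [a + b for a, b in zip(nol10, usf1)]  # assumed simple additive score

for name, marker in [("NOL10", nol10), ("USF1", usf1), ("combined", combined)]:
    print(name, round(roc_auc(marker, events), 3))
```

Time-dependent ROC analysis additionally accounts for censoring at each time horizon and is not covered by this sketch.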

Furthermore, we explored the combination effect of NOL10 and USF1 expression on the prognosis of prostate cancer patients. Our analysis revealed that the synergistic co-overexpression of NOL10 and USF1 was associated with poorer overall survival, biochemical recurrence-free survival, and metastasis-free survival in patients with prostate cancer, consistently across multiple cohorts including CPGEA33, TCGA36, SU2C48, and DKFZ57 (Fig. 7n, o, Supplementary Fig. 20a–e). These findings underscore the significant prognostic value of assessing both NOL10 and USF1 expression levels in prostate cancer patients. The synergistic effect of their co-overexpression serves as a robust indicator of disease progression and patient outcomes, highlighting their potential as critical biomarkers in the clinical management and treatment of prostate cancer.

Discussion

In this study, we have revealed the regulatory relationships among the prostate cancer risk locus rs4519489, USF1, and NOL10 (Fig. 8). By integrating high-throughput SNPs-seq and unbiased proteomics, we uncovered the prostate cancer risk SNP rs4519489 (2p25) within a functional enhancer, where USF1 exhibits a preference for binding the risk allele A, thereby upregulating NOL10. This highlights a direct regulatory pathway mediated by USF1 at this specific genomic locus. Moreover, NOL10 is implicated in the regulation of cell cycle pathways, thereby facilitating the progression of prostate cancer, as supported by cell, organoid, and mouse model experiments. Notably, both NOL10 and USF1 are linked to aggressive prostate cancer phenotypes, underscoring their clinical relevance as potential prognostic markers and therapeutic targets.

Fig. 8. Schematic model illustrating the regulatory mechanism of the 2p25 locus and USF1 in driving NOL10 expression and prostate cancer progression.

The model depicts the interaction between the 2p25 locus and the transcription factor USF1, which binds to the enhancer region of the locus, leading to upregulation of NOL10 expression. Increased NOL10 expression promotes prostate cancer cell proliferation and migration, contributing to the advancement of tumor severity. The pathway suggests a regulatory axis that enhances prostate cancer progression through the modulation of cell cycle-related genes and other oncogenic pathways, thereby influencing tumor growth, metastasis, and overall disease severity.

Identifying functional causal SNPs and understanding their biological roles within the hundreds of GWAS-reported risk loci remains a formidable challenge58,59. While numerous methods have been developed to address this, there is an ongoing need for more comprehensive studies that can bridge the gap between GWAS findings and the underlying disease mechanisms17,60. In the context of prostate cancer, our team introduced an advanced approach called SNPs-seq29, designed for high-throughput screening of SNPs for allele-specific protein binding differences. This tool allows for a more refined understanding of the functional significance of SNPs in disease susceptibility, enabling more targeted investigations of their roles in cancer biology.

A particularly notable finding from our study was the identification of rs4519489 in the 2p25 locus, which exhibited a significant allele-specific protein binding bias between its A and T alleles. This SNP is in high LD with two previously identified GWAS lead SNPs, rs9287719 (refs. 32, 61) and rs1990613 (refs. 14, 15), which have been strongly associated with prostate cancer susceptibility. The strong LD between these variants reinforces the hypothesis that the 2p25 region is a critical genomic hotspot for prostate cancer risk and suggests that rs4519489 may directly contribute to disease etiology. This finding highlights the power of SNPs-seq to reveal allele-specific regulatory differences and underscores the potential functional importance of rs4519489 in prostate cancer progression.

The 2p25 region has been implicated in prostate cancer by several large GWAS studies, including those involving over 140,000 men32 and a meta-analysis of 87,000 individuals31, which identified rs9287719 (downstream of NOL10) as a prominent prostate cancer susceptibility locus. The rs9287719-C and rs11902236-A variants, located in the 2p25.1 region associated with LOC105373426-NOL10 and GRHL1, have also been suggested as causal variants for prostate cancer susceptibility62–64. These findings underscore the importance of this region in modulating prostate cancer risk. In particular, rs9287719 has been associated with PSA levels in controls, and its disease risk has been linked to an epigenetically active complex variant (MNLP) that harbors transcription factor binding sites for AR, FOXA1, and MYC, the key players in prostate cancer pathogenesis31,65. Furthermore, rs9287719 has been associated with the expression of ODC1, a gene involved in neural development, through eQTL analysis66, providing further evidence for its potential role in prostate cancer biology. The association between ODC1 and rs9287719 suggests that this variant may influence both cancer susceptibility and neural development, adding complexity to its functional interpretation.

Several large-scale GWAS studies have further implicated the 2p25 locus in prostate cancer risk. For instance, a study involving 150,000 prostate cancer cases and 780,000 controls15, along with a meta-analysis of 100,000 cases and 120,000 controls14, identified rs1990613 in the intron of NOL10 as a significant prostate cancer susceptibility locus. Interestingly, rs1990613 has also been associated with human height67, reflecting its pleiotropic effects and suggesting that genetic variants linked to cancer susceptibility may also have broader physiological impacts. Additionally, rs373055126, another SNP in NOL10, has been associated with warfarin metabolism68, further highlighting the diverse biological roles of NOL10 in both cancer and other systemic processes.

Herein we conducted a comprehensive investigation to validate the allele-specific protein binding and regulatory function of rs4519489, along with its clinical implications. Our eQTL analysis, using data from the CPGEA cohort, revealed a significant association between the rs4519489 A/A risk genotype and increased expression of NOL10. In addition, we applied CRISPR base editing to demonstrate a direct regulatory effect of the A allele on NOL10 expression. While the roles of NOL10 in cancer have been underexplored, previous studies identified it as an essential nucleolar protein crucial for maintaining nucleolar structural integrity69,70. Our clinical data analysis showed that NOL10 expression was elevated in prostate cancer tissues and correlated with advanced disease progression and poor prognosis. To elucidate its potential oncogenic role, we performed functional analyses that support NOL10 as a novel oncogene in prostate cancer. Mechanistically, NOL10 likely influences the expression of genes associated with critical cell cycle pathways, including E2F targets and the G2M checkpoint. These findings collectively suggest that NOL10 actively contributes to prostate cancer progression rather than being a passive bystander. Its ability to modulate key cellular processes central to cancer development underscores its potential as a therapeutic target in prostate cancer intervention.

Our study refined an allele-specific proteomics screening method to investigate how SNPs can influence gene expression by modulating the binding affinity of key TFs. The analysis indicated that USF1 is the TF most likely to mediate the genetic effect of the rs4519489/2p25 locus. We confirmed USF1 chromatin occupancy at the rs4519489 site using ChIP-qPCR. Furthermore, we performed allele-specific chromatin enrichment assays and super-shift EMSA to confirm USF1’s preferential binding to the A allele of rs4519489. In addition, USF1 positively regulated NOL10 expression, linking USF1 with the genetic predisposition to prostate cancer. Moreover, USF1 was significantly associated with malignant characteristics of prostate cancer, as evidenced by clinical data showing correlations with higher tumor stages, lymph node metastasis, elevated Gleason scores, biochemical recurrence, and poorer progression-free survival. Additionally, analysis of clinical prostate cancer samples revealed higher USF1 expression in tumor tissues compared to normal prostate tissues. Functional assays in cell culture, mouse, and organoid models confirmed the oncogenic role of USF1 in prostate cancer. These findings collectively suggest that USF1 not only serves as a potential biomarker for prostate cancer severity but also actively promotes disease progression. Furthermore, the synergistic action of USF1 and NOL10 could enhance tumor severity. Their ability to drive tumorigenesis and influence key cancer cell behaviors underscores their potential as therapeutic targets in prostate cancer intervention.

In summary, our study unveils a pivotal regulatory mechanism underlying prostate cancer pathogenesis, centered around the genetic risk variant rs4519489 at the 2p25 locus. We demonstrate that this region acts as an enhancer, modulating the binding affinity of identified regulator USF1. This regulatory shift subsequently governs the expression of NOL10, a key contributor to prostate cancer progression. By delving into the functional aspects of the 2p25/NOL10 genetic risk locus, we significantly enhance our understanding of prostate cancer development. Our findings highlight the importance of rs4519489 and NOL10 in the molecular landscape of prostate cancer, offering potential as both a diagnostic biomarker and a therapeutic target. Targeting the rs4519489-USF1-NOL10 regulatory axis holds promise for innovative therapeutic strategies aimed at curtailing prostate cancer progression and severity.

Methods

Ethics Statement

The utilization of clinical human specimens in our study, along with the review of relevant patient records, received the endorsement of the Ethical Committee and Institutional Review Board of the School of Basic Medical Sciences at Fudan University (Approval number: 2021-005). All procedures involving human samples were conducted in strict adherence to the ethical guidelines set forth in the Declaration of Helsinki. Informed consent was duly obtained from each participating patient, ensuring the utmost respect for patient confidentiality throughout the study.

Furthermore, all animal experiments conducted as part of this study were approved by the Animal Care and Use Committee of the School of Basic Medical Sciences at Fudan University (Ethical approval number: 20200713-002) and complied with the rules for experimental mice in tumor research: the tumor weight should not exceed 10% of the body weight of the mouse, and the average tumor diameter should not exceed 20 mm. These experimental protocols were rigorously aligned with the Guide for the Care and Use of Laboratory Animals, underscoring our commitment to the ethical and humane treatment of all animals involved in our research.

Tissue samples

Tissue samples employed in our study were selected to provide robust insights into the molecular mechanisms underlying prostate cancer. For ChIP-seq of histone modifications, including H3K27ac, H3K4me1, and H3K4me3, we utilized both normal and tumor prostate tissues from the CPGEA cohort33. For USF1 ChIP assays, we collected chromatin from normal prostate as well as prostate tumor tissues obtained from the FUSCC (Fudan University Shanghai Cancer Center) cohort40. Furthermore, to validate the expression levels of NOL10 and USF1 in patient tissues, we extracted RNA from five tissue pairs comprising prostate tumor tissues and their adjacent normal counterparts. We also isolated protein samples from two of these tissue pairs, all of which were acquired from the FUSCC cohort.

Mice

Male nude mice aged 6 weeks were acquired from Gempharmatech Company, China, for conducting in vivo experiments. The mice were maintained under controlled environmental conditions to ensure their wellbeing and the validity of our experimental outcomes. The housing conditions included a 12-h light-dark cycle, with the mice accommodated in sterilized plastic cages. The ambient temperature of the housing facility was regulated between 21.7 and 22.8 °C, and the humidity was maintained within a range of 40–60%. The water provided to the mice was autoclaved, and their cages were replaced once every week. The health and wellbeing of the mice were continuously monitored through a dirty bedding sentinel program, which is a well-established method for detecting health issues in laboratory animals. All in vivo studies included cohorts of three or more mice per experimental group to ensure the reliability and reproducibility of our results. The experiments were repeated two to three times independently.

Organoid generation

Prostate tissue was extracted from Ptenpc−/−; LKB1pc−/− mice and dissociated to isolate prostate cancer cells. Organoids were cultured following the protocol established by the Gao Lab45. Cells were infected with either a control vector, NOL10 or USF1 shRNA, and stably transduced lines were selected using puromycin. Following enzymatic dissociation, the cells were resuspended at a density of 2 × 10^4 cells/mL or 4 × 10^4 cells/mL and embedded in Matrigel (BD Biosciences). Cultures were maintained in mouse organoid culture medium consisting of 50× diluted B27, 1.25 mM N-acetyl-l-cysteine, 50 ng/ml epidermal growth factor (EGF), 200 nM A83-01, 100 ng/ml Noggin, 500 ng/ml R-spondin 1, 10 μM Y-27632 dihydrochloride, and 1 nM dihydrotestosterone. For organoid formation assays, 2000 cells were seeded per well, and the number and size of organoids were assessed after 14 days.

Cell lines

The human prostate cancer cell lines PC3 (#CRL-1435), DU145 (#TCHu222), 22Rv1 (#TCHu100), LNCaP (#CRL-1740), and VCaP (#TCHu220), the murine prostate cancer cell line RM1 (TCM14), and the human embryonic kidney (HEK) 293T line (#CRL-11268) were obtained from the American Type Culture Collection (ATCC, USA) and the Cell Bank of the Chinese Academy of Sciences (China). The PC3, 22Rv1, and LNCaP cells were cultured in RPMI 1640 medium, whereas the DU145, VCaP, and HEK 293T cells were grown in DMEM medium. The culture media for all these lines were supplemented with 10% fetal bovine serum (FBS) (#FSP500, Genetimes Technology) and 1% penicillin/streptomycin (#MA0110, MeilunBio). The cell cultures were housed in a 37 °C incubator with a humidified atmosphere containing 5% CO2. All cell lines underwent regular testing for mycoplasma contamination, with consistently negative results. Additionally, these cell lines have been authenticated by short tandem repeat (STR) fingerprinting.

Molecular cloning

For construction of shRNA plasmid, primers were designed based on the mRNA sequences of NOL10 (NM_024894.4) and USF1 (NM_007122.5) obtained from the National Center for Biotechnology Information (NCBI). Post primer annealing, the shRNA sequences were cloned into the pLKO.1 puro vector (#8453, Addgene).

For construction of sgRNA plasmid, sgRNA oligos were designed using an online tool (http://crispor.tefor.net). For annealing, we mixed 5 μl sense sgRNA oligos (100 μM) and 5 μl anti-sense sgRNA oligos (100 μM) with 10 μl annealing buffer (5×), and 30 μl ddH2O. The oligos were annealed in a thermocycler at 95 °C for 5 min, followed by a gradual temperature decrease to 25 °C at a rate of 1 °C/min. The annealed oligos were then inserted into the Lenti CRISPR V2 Puro vector (#52961, Addgene).

For construction of overexpression plasmid, the coding regions of NOL10 or USF1 were amplified from mixed cDNA obtained from prostate cancer cells. The amplified products were cloned full-length into the pcDNA3.1 V5 vector (#V81020, Thermo Fisher Scientific) or Lenti-X Tet-One Inducible Puro V5 vector (modified from vector of #631847, Takara Bio). This was achieved using either restriction enzymes or homologous recombination techniques.

Details of the primer sequences used are provided in Supplementary Table 1.

Luciferase enhancer reporter assay

To investigate the regulatory potential of SNP rs4519489, we employed an allele-dependent luciferase reporter assay. This assay involved cloning allele-specific sequences (either the T or A allele, achieved through site-directed mutagenesis) from the genomic DNA of human prostate cancer cells into a firefly luciferase pGL4.23 minimal promoter (minP) vector (#E8411, Promega) or the pGL3 SV40 promoter vector (#E1761, Promega) to assess enhancer activity. The constructs were transiently transfected into 22Rv1 or LNCaP cells. For hormonal treatment, cells were exposed to either dihydrotestosterone (DHT) or ethanol (ETH). Transfection was performed using Lipofectamine 3000 DNA Transfection Reagent (#L3000015, Thermo Fisher Scientific). To normalize the results, we co-transfected cells with the renilla luciferase pGL4.75 plasmid (#E6931, Promega) as an internal control. The experiments were conducted in 96-well plates, with each well containing 100 μl of medium seeded with 3 × 10^5 22Rv1 or LNCaP cells/ml. Post-transfection, the cells were incubated at 37 °C in a 5% CO2 atmosphere for 48 h. The luciferase activity was measured using the Dual Luciferase Reporter Assay System (#E1960, Promega) on a bioluminometer. Each construct was tested in at least three replicate wells. The results were then statistically analyzed using a two-tailed Student’s t-test. Details of the primer sequences, cloning methods, and enzymes used are available in Supplementary Table 1.

Electrophoretic mobility shift assay (EMSA) and gel super shift assay

We employed an electrophoretic mobility shift assay (EMSA) to validate the allele-dependent protein binding differences. This assay was performed using the LightShift Chemiluminescent EMSA Kit (#20148, Thermo Fisher Scientific). The oligonucleotides required for this experiment were synthesized by Tsingke Biotech. The target oligonucleotide, 29 base pairs in length with the SNP positioned centrally, was labeled using the Biotin 3’ End DNA Labeling Kit (#89818, Thermo Fisher Scientific). Nuclear proteins were extracted from LNCaP, 22Rv1, and 293T cells for use in the binding reactions. The 20 μl reaction mixture included 1× binding buffer, 1 μg of Poly (dI-dC), 3 μl of nuclear extract, a 200-fold excess of unlabeled oligo for competitive assays, and 20 fmol of 3’ end labeled oligo. The reaction mixtures were subjected to electrophoresis on a 6% polyacrylamide gel using 0.5× TBE buffer. Following electrophoresis, the samples were transferred onto a nylon membrane (#77016, Thermo Fisher Scientific). After cross-linking, protein-DNA complexes were detected using the Chemiluminescent Nucleic Acid Detection Module. Visualization was achieved using the Tanon 5200 Imaging System (Tanon, China). For the gel super shift assay, 1 μl of mouse Flag antibody (#F1804, Sigma) or mouse IgG antibody (#SC-2025, Santa Cruz Biotechnology) was mixed with nuclear extract from cells overexpressing USF1. The sequences of the oligonucleotides used in these assays are provided in Supplementary Table 2.

CRISPRi

We generated stable 22Rv1 and PC3 cell lines expressing CRISPR dCas9 KRAB by transfecting cells with the pLX303-ZIM3-KRAB-dCas9 plasmid (#154472, Addgene). Post-transfection, cells underwent antibiotic selection with 6 μg/mL blasticidin for two weeks. Guide RNAs (gRNAs) were specifically designed to target the active epigenetically marked chromatin region encompassing rs4519489. To ensure comprehensive analysis, we included a negative control (scramble sgRNA) and a positive control (HPRT1 promoter targeting gRNA). These gRNA cassettes were synthesized by Tsingke Biotech and subsequently cloned into the pgRNA humanized vector (#44248, Addgene). The cells stably expressing KRAB-dCas9 were then infected with the gRNA vectors. Following infection, the cells underwent selection with 2 μg/mL puromycin for five days. The primers used for all gRNAs are detailed in Supplementary Table 3.

Base editing

We employed two methods to perform base editing at the rs4519489 locus. First, we applied the CRISPR/Cas9-mediated single nucleotide mutation method34 to mutate the rs4519489 locus in PC3 cells. Two pairs of oligonucleotides were designed using the well-established CRISPR design tool (http://crispr.mit.edu/). DNA fragments centered on the rs4519489 (A or T) variant were used as repair templates. For annealing, 5 μL of sgRNA oligos (10 μM) targeting the locus were annealed in 5× annealing buffer, and the annealed oligos were cloned into the pSpCas9n(BB)-2A-Puro (PX462) V2.0 vector.

When PC3 cells reached 80% confluence in a 24-well plate, 350 ng of the sgRNA-Cas9 plasmid (pSpCas9n) and 2 μl (10 μM) of the repair template were co-transfected into the cells using Lipofectamine 3000 reagent. Twenty-four hours post-transfection, the medium was changed, and 1 μg/ml puromycin was added to select for transfected cells. After 48 h, surviving cells were sorted by FACS into single-cell wells in a 96-well plate. Two weeks later, cells were harvested by trypsin digestion (50 μl), followed by DNA extraction using 10 μl QuickExtract DNA extraction solution. Genotyping was then performed to confirm successful editing.

Furthermore, we performed base editing at the rs4519489 site in DU145 cells using the AYBE base editor (Yang Lab)35. The sgRNA targeting the rs4519489 locus was cloned into the AYBE v3 vector, and 6 μg of the plasmid was transfected into DU145 cells using Lipofectamine 3000 reagent when cells reached 70% confluence in a 6 cm dish. Seventy-two hours post-transfection, single-cell sorting was performed using a flow cytometer (mCherry channel). After two weeks of culture, cells were genotyped to assess editing efficiency. The oligonucleotides used in this experiment are listed in Supplementary Table 3.

CRISPR/Cas9 mediated genome editing assay

The cells were seeded in a 6-well plate at an appropriate density for transduction, and the sgRNA lentivirus specific to NOL10 was prepared in advance. For each well, 1 ml of the lentivirus-containing medium was combined with an equal volume of cell culture medium, and 10 μg/ml polybrene was added to enhance the efficiency of viral transduction. The cells were incubated for 48 h to allow sufficient time for viral transduction. Post-transduction, the medium in each well was replaced with fresh medium containing 1 μg/ml puromycin. This step selected for cells that had incorporated the sgRNA, because puromycin resistance is conferred only on successfully transduced cells.

siRNA and shRNA knockdown assay

Prostate cancer cells were grown to 70–80% confluency for optimal transfection conditions. Cells were transfected with either control siRNA or siRNAs targeting NOL10 or USF1 using Lipofectamine RNAiMAX Transfection Reagent (#13778150, Thermo Fisher Scientific). The medium was replaced 12 h post-transfection, and cells were collected after 48 h for further analysis. The specific sequences of siRNAs used are detailed in Supplementary Table 3.

Lentiviral constructs with shRNA targeting NOL10 or USF1 were produced in 293T cells using a third-generation packaging system. Cells were seeded in a 6-cm dish at 70%–80% confluency a day before transfection. A mix of four plasmids (pCMV-VSV-G, #14888, Addgene; pRSV-Rev, #12253, Addgene; pMDLg/pRRE, #12251, Addgene; and the lentiviral target vector) was prepared in a 1:1:1:3 ratio, totaling 10 μg, and diluted in Opti-MEM with PEI reagent. After 24 h, the medium was replaced with 2 ml fresh medium, and the virus-containing medium was collected every 24 h for three days, filtered through a 0.45 μm filter, and stored at −80 °C. For virus transduction, the desired cells were seeded in a 6-well plate and incubated with the lentivirus-containing medium supplemented with 8 μg/ml polybrene (#TR-1003-G, Sigma). For constructs carrying a puromycin selection marker, the medium was replaced with pre-warmed medium after 24 h, and 48 h after transduction it was changed to fresh medium containing puromycin (#MA0318, MeilunBio) at a final concentration of 2 μg/ml for selection. Non-transduced cells served as controls for determining cell survival upon puromycin selection.
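The plasmid amounts implied by the 1:1:1:3 ratio and the 10 μg total can be worked out with a short calculation. The sketch below assumes the ratio is by mass, which the text does not state explicitly; the function name is illustrative only.

```python
def split_by_ratio(total_ug, ratio):
    """Split a total DNA mass (in micrograms) across plasmids by a ratio.

    Assumption: the 1:1:1:3 ratio is a mass ratio, not a molar ratio.
    """
    parts = sum(ratio.values())
    return {name: total_ug * share / parts for name, share in ratio.items()}


# Ratio and total taken from the packaging description above.
mix = split_by_ratio(
    10.0,
    {"pCMV-VSV-G": 1, "pRSV-Rev": 1, "pMDLg/pRRE": 1, "target vector": 3},
)

for name, micrograms in mix.items():
    print(f"{name}: {micrograms:.2f} ug")
```

Under this assumption, each packaging plasmid contributes about 1.67 μg and the lentiviral target vector contributes 5 μg.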

Overexpression assay

DU145, LNCaP, and 22Rv1 cells were grown to 70–80% confluency for optimal transfection efficiency. The transfection mixture consisted of the pcDNA3.1 construct, P3000 reagent, Lipofectamine 3000 reagent (#L3000015, Thermo Fisher Scientific), and Opti-MEM (#11058021, Thermo Fisher). The prepared mixture was added to the cells and incubated for 48 h to allow for gene expression. Post-incubation, cells were harvested for subsequent analyses. For establishing stable overexpression cells, 22Rv1 cells were infected with Lenti-X Tet-One inducible Puro V5 constructs. Post-infection, cells were selected and maintained under appropriate conditions to ensure stable integration and expression of the target gene.

RNA isolation, reverse transcription, and quantitative PCR

Total RNA was isolated using the EZ-10 DNAaway RNA Mini-Preps Kit (#B618133, Sangon Biotech). A total of 1 μg of RNA was reverse transcribed using the HiScript III RT SuperMix for qPCR kit (#R323-01, Vazyme), and the resulting cDNA was diluted 20-fold. RNA expression was quantified using the ChamQ universal SYBR qPCR master mix (#Q711-02, Vazyme) on the Light Cycler 480 (Roche). GAPDH or β-actin was used as the reference for normalizing gene expression levels in the samples. Each sample was measured in triplicate. Relative gene expression was calculated using the ΔΔCT method (ΔΔCT = ΔCT [sample] − ΔCT [control average]). The sequences of all oligonucleotides used in these procedures are provided in Supplementary Table 2.
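The ΔΔCT calculation can be illustrated with a minimal sketch. It assumes the conventional conversion of ΔΔCT to relative expression as 2^(−ΔΔCT), which presumes roughly 100% amplification efficiency. The function names and Ct values are invented for illustration and are not data from this study.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """dCT: mean Ct of the target gene minus mean Ct of the reference gene."""
    return mean(ct_target) - mean(ct_reference)


def relative_expression(sample_dct, control_dcts):
    """ddCT = dCT[sample] - mean(dCT[controls]); fold change = 2 ** (-ddCT).

    Assumption: ~100% PCR efficiency, so one cycle equals a two-fold change.
    """
    ddct = sample_dct - mean(control_dcts)
    return 2 ** (-ddct)


# Hypothetical triplicate Ct values (target gene vs. reference gene).
control_dct = delta_ct([24.0, 24.1, 23.9], [18.0, 18.0, 18.0])
knockdown_dct = delta_ct([26.0, 26.1, 25.9], [18.0, 18.0, 18.0])

fold = relative_expression(knockdown_dct, [control_dct])
print(f"Relative expression vs. control: {fold:.2f}")
```

A sample whose ΔCT is two cycles higher than the control average corresponds to about one quarter of the control expression.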

Western blot

The cell pellet was resuspended in lysis buffer, followed by centrifugation. The supernatant containing the extracted proteins was collected. Protein concentrations were determined using the BCA Protein Assay Kit (#P0012S, Beyotime Biotechnology). Equal amounts of protein lysate (30 μg) were denatured using protein loading buffer (#P0015F, Beyotime Biotechnology). The denatured proteins were separated by SDS-PAGE and transferred to 0.45 μm PVDF membranes (#IPVP00010, Millipore). The membrane was blocked for 1 h at room temperature using blocking buffer (5% nonfat milk in TBST) with gentle shaking. The blocked membrane was incubated overnight at 4 °C with primary antibodies diluted in blocking buffer, under gentle rotation. Post-incubation, the membrane was washed five times for 5 min each with TBST. The membrane was then incubated with HRP-conjugated secondary antibody diluted in blocking buffer for 1 h at room temperature on a rotor. Afterwards, the membrane was washed five times for 5 min each using TBST. Finally, the membrane was developed using Omni-ECL Western Blotting Substrate (#SQ202L, Epizyme) or Omni-ECL Femto Maximum Sensitivity Substrate (#SQ201, Epizyme). The developed blot was imaged using the ChemiDoc Imaging System (Bio-Rad). Original blots are provided in the source data file. The specific antibodies used in this study are listed in Supplementary Table 4.

Tumor cell biology experiments

For the cell proliferation assay, cells were seeded in 96-well plates (1 × 10^3 cells per well for DU145, PC3, and RM1, 6 × 10^3 cells per well for LNCaP, and 3 × 10^3 cells per well for 22Rv1 in 100 μl medium). Cell viability and proliferation were measured using the CCK-8 kit (#MA0218, MeilunBio) or the MTT kit (#SY316, Beyotime Biotechnology). Absorbance readings at 450 nm (CCK-8) or 490 nm (MTT) were taken at specific time points. Data, obtained from at least triplicate wells, were analyzed using a two-tailed Student’s t-test or two-way ANOVA.

For the colony formation assay, cells (1 × 10^3 for DU145 and PC3, 4 × 10^3 for 22Rv1, and 500 for RM1 cells) were seeded in 6-well or 12-well plates. After two weeks, colonies were fixed in 4% paraformaldehyde and stained with crystal violet (#A600331-0100, Sangon Biotech).

For cell migration assay, cells were trypsinized, resuspended in serum-free medium, and 200 μl were placed into 8 μm transwell inserts (#353097, BD). Lower chambers were filled with 600 μl of normal growth medium and cells were incubated for 36 h. Post-incubation, cells were fixed with 4% formaldehyde and stained with crystal violet.

For the cell invasion assay, the transwell inserts were coated with 100 μl Matrigel (#40183ES10, Yeasen) diluted in serum-free medium. Invasive cells on the bottom surface of the filters were counted in five microscopic fields per membrane. Both the migration and invasion assays were statistically analyzed using a two-tailed Student’s t-test or two-way ANOVA, with each assay performed in three replicates.

In vivo nude mice subcutaneous xenograft model

Male nude mice from Gempharmatech Company, China, were randomly divided into different groups, with six mice in each group. Control shRNA, NOL10 shRNA or USF1 shRNA stable DU145 or PC3 cells were harvested, trypsinized, and washed with PBS. Each mouse received a subcutaneous injection of 5 × 10^6 PC3 cells in 50 μl PBS mixed with 50 μl Matrigel (#40183ES10, Yeasen) into the right dorsum. Tumor sizes were measured weekly using a vernier caliper, and volumes were calculated using the formula: V = 0.5 × (Length × Width^2). After four weeks, mice were sacrificed, and subcutaneous tumors were removed for further analysis. The tumor weight and the average tumor diameter did not exceed the limits set for experimental mice in tumor research.
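The volume formula, together with the 20 mm diameter limit given in the Ethics Statement, can be expressed as a short sketch. The function names, the use of the length/width mean as the "average diameter", and the caliper readings are illustrative assumptions.

```python
def tumor_volume_mm3(length_mm, width_mm):
    """V = 0.5 x (Length x Width^2), with both caliper readings in mm."""
    return 0.5 * length_mm * width_mm ** 2


def within_diameter_limit(length_mm, width_mm, limit_mm=20.0):
    """Check the average tumor diameter against the 20 mm limit.

    Assumption: 'average diameter' is the mean of length and width.
    """
    return (length_mm + width_mm) / 2 <= limit_mm


# Hypothetical caliper readings for one tumor.
length, width = 10.0, 8.0
print(tumor_volume_mm3(length, width))       # volume in cubic millimetres
print(within_diameter_limit(length, width))  # True if within the limit
```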

Immunohistochemistry (IHC)

Subcutaneous tumor tissues from each group of mice were collected and fixed in 4% paraformaldehyde, dehydrated, and embedded in paraffin. Paraffin sections (5 μm thickness) were deparaffinized, rehydrated, and stained with haematoxylin and eosin (H&E). Sections underwent hydrogen peroxide treatment, antigen retrieval, and blocking. Overnight incubation with primary antibodies (NOL10, E-cadherin, Vimentin, Ki67 or USF1) at 4 °C was followed by application of biotinylated secondary antibodies and streptavidin conjugated HRP. Detection was developed using DAB substrate solution. Details of the antibodies used are provided in Supplementary Table 4.

RNA-seq and differential expression genes (DEG) analysis

Prostate cancer cells were transfected with either siRNA targeting NOL10 or USF1, or a negative control siRNA, incubated for 48 h under standard cell culture conditions, with two biological replicates. Total RNA was extracted using Trizol reagent (#15596018, Thermo Fisher Scientific). RNA-seq libraries were prepared using the Stranded mRNA-seq Lib Prep Module (RK20349, Abclonal). The quality of libraries was assessed using LabChip Touch, and sequencing was conducted at Annoroad Company with Illumina sequencing platforms.

Raw sequence data were assessed for quality using FastQC (v.0.11.9) (www.bioinformatics.babraham.ac.uk/projects/fastqc/). AdapterRemoval (v.2.3.2)71 was used for quality trimming and adapter removal with default parameters. The processed reads were aligned to the human genome (hg38) using STAR (v.2.7.9a)72 and the aligned BAM files were sorted using SAMtools (v.1.13)73. HTSeq (v.0.13.5)74 was employed to quantify aligned sequencing reads against UCSC gene annotation with the parameters “-s reverse” and “-i gene_id”. DESeq2 (v.1.30.1)75 was used for DEG analysis from the read count matrix. Genes with low expression (<5 cumulative read count across samples) were filtered out. An adjusted P value < 0.05 was applied to generate the list of differentially expressed genes. DEGs were ranked according to their fold change. Statistical tests were applied to control or treatment to ensure high correlations between technical replicates. Data normalization was performed using the variance stabilizing transformation (VST) method. A heatmap presenting DEGs between siRNA control and siRNA NOL10 or USF1 samples was generated using the R package “pheatmap” (v.1.0.12). Detailed information about the software and algorithms used is provided in Supplementary Table 5.
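The two filtering steps described above (removing genes with fewer than 5 cumulative reads, then keeping genes with an adjusted P value < 0.05 ranked by fold change) can be sketched as follows. This is not the DESeq2 workflow itself. The data structures, function names, gene labels, and values are hypothetical, and ranking by descending log2 fold change is an assumption about the ranking direction.

```python
def filter_low_counts(counts, min_total=5):
    """Keep genes whose read count summed across samples is at least min_total."""
    return {gene: per_sample for gene, per_sample in counts.items()
            if sum(per_sample) >= min_total}


def select_degs(results, alpha=0.05):
    """Keep genes with adjusted P < alpha, ranked by log2 fold change.

    Assumption: ranking is by descending log2 fold change.
    """
    hits = [row for row in results
            if row["padj"] is not None and row["padj"] < alpha]
    return sorted(hits, key=lambda row: row["log2fc"], reverse=True)


# Hypothetical count matrix: gene -> counts in four samples.
counts = {"GENE_A": [120, 98, 40, 35], "GENE_B": [1, 0, 2, 1], "GENE_C": [50, 55, 52, 49]}
kept = filter_low_counts(counts)

# Hypothetical differential expression results for the retained genes.
results = [
    {"gene": "GENE_A", "log2fc": -1.4, "padj": 0.001},
    {"gene": "GENE_C", "log2fc": 0.1, "padj": 0.80},
]
degs = select_degs(results)
print(sorted(kept), [row["gene"] for row in degs])
```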

Gene set enrichment analysis (GSEA)

We applied GSEA (v.4.0.3) to interpret the RNA-seq results of NOL10 or USF1 knockdown. A pre-ranked gene list was compiled by calculating a ranking score for each gene with the formula sign(logFC) × −log(p value), and the scores were sorted in descending order. The GSEA Preranked test was used to test the enrichment of phenotypic genes in Hallmark gene sets (H collection). Parameters were set as follows: Enrichment statistic = “weighted”, Max size (exclude larger sets) = 5000, number of permutations = 1000. All other parameters remained as default. GSEA enrichment plots were generated using R packages “clusterProfiler” (v.3.14.3)76 and “enrichplot” (v.1.12.0). The software and algorithms are listed in Supplementary Table 5.
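The ranking score used to build the pre-ranked list can be sketched as below. The base of the logarithm is not specified in the text, so base 10 is assumed here. The floor applied to P values of zero, the function names, and the gene labels and values are also illustrative assumptions.

```python
import math


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def rank_metric(log_fc, p_value, floor=1e-300):
    """sign(logFC) x -log10(p value).

    Assumptions: base-10 logarithm; P values of 0 are floored to avoid log(0).
    """
    return sign(log_fc) * -math.log10(max(p_value, floor))


def preranked(genes):
    """genes: iterable of (name, logFC, p value). Returns scores in descending order."""
    scored = [(name, rank_metric(log_fc, p)) for name, log_fc, p in genes]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical differential expression results.
toy = [("GENE_A", 1.5, 1e-4), ("GENE_B", -2.0, 1e-6), ("GENE_C", 0.3, 0.2)]
for name, score in preranked(toy):
    print(f"{name}\t{score:.3f}")
```

Strongly upregulated genes end up at the top of the list and strongly downregulated genes at the bottom, which is the ordering the GSEA Preranked test expects.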

Allele specific unbiased proteomics screening

To identify transcription factors (TFs) responsible for the allele-specific binding differences at rs4519489, we employed a modified version of the Proteome Wide Analysis of SNPs (PWAS) mass spectrometry approach51. This adaptation enabled the identification of TFs with preferential affinity for the T or A allele of rs4519489.

We first synthesized 29-base pair oligonucleotides harboring either the T or A allele of rs4519489, and labeled them using the Biotin 3’ End DNA Labeling Kit (#89818, Thermo Fisher Scientific). The biotinylated double-stranded oligonucleotides were then incubated with freshly prepared nuclear extracts from LNCaP cells (NE-PER Nuclear and Cytoplasmic Extraction Reagents, #78833, Thermo Fisher Scientific). Each DNA–protein binding reaction (total volume 100 μl) contained 54 μl ultrapure water, 10 μl 10× binding buffer, 5 μl poly(dI•dC) (1 μg/μl), 20 μl nuclear extract, 1 μl protease inhibitor, and 10 μl biotinylated oligonucleotides (50 nM). The mixtures were incubated at room temperature for 15 min.

Subsequently, Dynabeads M280 Streptavidin (#11205D, Thermo Fisher Scientific) were washed three times with washing buffer, and then incubated with the DNA-protein complexes for 20 min at room temperature. The resulting complexes were washed five times using a magnetic stand, and proteins were precipitated with ethanol and eluted in 8 M urea. For mass spectrometry (MS) analysis, protein samples were resuspended in 50 µl of 50 µM ammonium bicarbonate buffer.

Allele-specific DNA-protein complexes for the T and A alleles were analyzed via electrospray ionization nanoflow ultrahigh-performance liquid chromatography coupled with an LTQ Orbitrap XL mass spectrometer (Thermo Fisher Scientific). Samples were loaded onto a C18 trap column, and peptides were subsequently eluted onto an analytical column. Chromatographic separation was achieved using a factory-default extended gradient of 88 min with solvent A (0.1% formic acid in LC-MS grade water) and solvent B (0.1% formic acid in LC-MS grade acetonitrile). The spray voltage was dynamically optimized between 1.8 and 2.0 kV, with the ion transfer tube temperature maintained at 275 °C, and the RF lens level set to 40%. MS data were acquired in data-dependent acquisition (DDA) mode, selecting the top 16 most intense precursor ions.

Peptide identification was performed using MaxQuant (v1.6.14.0), searching against the Swiss-Prot Human database (release 2023_03), with the following parameters: Trypsin as the proteolytic enzyme, up to two missed cleavages, precursor mass tolerance of 10 ppm, and fragment mass tolerance of 0.05 Da. The sequences of oligonucleotides used in this allele-specific, unbiased proteomics screening are provided in Supplementary Table 2.

Chromatin immunoprecipitation (ChIP)

LNCaP, 22Rv1, PC3, and VCaP cells were cross-linked with 1% formaldehyde for 10 min, and fixation was stopped with 125 mM glycine at room temperature for 5 min with gentle shaking. Cell pellets were suspended in hypotonic lysis buffer (with protease inhibitor cocktail) for 45 min. Nuclei were washed with cold PBS and re-suspended in SDS lysis buffer (final 0.5% SDS). Chromatin was sonicated to ~400 bp for ChIP-qPCR and ChIP-AS-qPCR, and to ~200 bp for ChIP-seq (Diagenode Bioruptor or Covaris M220). Dynabeads Protein G (#10004D, Thermo Fisher Scientific) were washed twice with blocking buffer and then incubated with antibodies (6 μg for TF and 2 μg for histone modification antibodies) at 4 °C overnight. The sonicated chromatin (300 μg for TF, and 20 μg for histone ChIP assay) was diluted in IP buffer to a final volume of 1.3 ml and then added to 40 μl of the Dynabeads-antibody complex. After incubation overnight at 4 °C, the complex was washed six times with washing buffers. The DNA-protein complexes were separated from the beads with extraction buffer and reverse cross-linked with Proteinase K and NaCl at 65 °C overnight. The DNA was purified using the MinElute PCR Purification Kit (#28006, Qiagen).

For the tissue ChIP assay, the samples were cut into small pieces with fine scissors, fixed in 1.5% formaldehyde for 10 min at room temperature, and then quenched with glycine. The tissues were mechanically disrupted by applying 8 cycles in a tissue freezing grinder (Jingxin, China). To isolate nuclei, we suspended the tissue pellet in hypotonic lysis buffer (with DTT and protease inhibitor cocktail) for 40 min at 4 °C. The tissue mass was filtered out with a sterile 100 μm filter. Chromatin was sheared to 200–500 bp using a high power Bioruptor Plus sonicator or Covaris. For each ChIP, the chromatin (30 μg for a TF and 1.5 μg for a histone modification ChIP assay) was incubated with antibodies (4 μg for TF and 2 μg for histone) overnight at 4 °C. The antibody-chromatin complexes were conjugated with washed Protein G Dynabeads overnight at 4 °C. The 100 μl of eluted chromatin-protein complex was reverse cross-linked by adding 6 μl of 5 M NaCl and 5 μl of Proteinase K and then incubating overnight at 65 °C. The immunoprecipitated and input DNA was purified using the MinElute PCR Purification Kit (#28006, Qiagen). The specific antibodies used for these experiments are listed in Supplementary Table 4.

ChIP-qPCR, ChIP-AS-qPCR, and ChIP-seq

For ChIP-qPCR, qPCR was performed at the SNP site in triplicate. The enrichment of TFs at target DNA fragments was quantified relative to IgG controls. For ChIP-AS-qPCR, primers for allele-specific amplification of the rs4519489 region were designed to yield a 234 bp product with rs4519489 in the middle of the fragment. Genomic DNA from human prostate cancer cell lines (LNCaP, DU145, 22Rv1, PC3, and VCaP) was used as a template for PCR, with Sanger sequencing determining the genotypes at rs4519489. The sequences of oligonucleotides used are listed in Supplementary Table 2.
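Enrichment relative to an IgG control is commonly expressed as a fold enrichment computed from Ct values. The sketch below uses the formula 2^(Ct[IgG] − Ct[IP]), which is an assumption, since the text does not give the exact calculation. It also assumes roughly 100% PCR efficiency, and the function name and Ct values are invented for illustration.

```python
from statistics import mean


def fold_enrichment_over_igg(ct_ip, ct_igg):
    """Fold enrichment of the specific IP over the IgG control.

    Assumption: enrichment = 2 ** (mean Ct of IgG - mean Ct of IP),
    with roughly 100% amplification efficiency.
    """
    return 2 ** (mean(ct_igg) - mean(ct_ip))


# Hypothetical triplicate Ct values at the SNP site.
ct_tf_ip = [25.0, 25.0, 25.0]
ct_igg = [28.0, 28.0, 28.0]

print(fold_enrichment_over_igg(ct_tf_ip, ct_igg))
```

With these made-up values, the specific IP amplifies three cycles earlier than IgG, corresponding to an eight-fold enrichment.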

ChIP-seq libraries were prepared using the NEBNext Ultra II DNA Library Prep Kit (#E7103L, NEB) according to the manufacturer’s instructions. Sequencing was performed by Annoroad. The histone modification (H3K27ac, H3K4me1, and H3K4me3) ChIP-seq libraries were sequenced to yield 150 bp paired-end reads. FastQC (v.0.11.9) was used for quality assessment of the raw data. Adapters and short reads were removed using TrimGalore (v.0.6.7, RRID: SCR_011847). The trimmed reads were mapped to the human genome (hg38) using Bowtie2 (v.2.2.5)77 with default parameters. Low-quality alignments were excluded using SAMtools (v.1.13)73 with the parameters “-q 30 -F 3844”. Duplicate reads were identified and removed using the Picard toolkit (v.2.25.1, RRID: SCR_006525). MACS2 (v.2.1.4)78 was employed for peak calling with default parameters. We used the Integrative Genomics Viewer (IGV, v.2.12.3) for peak visualization and analysis.
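
The meaning of the filter “-q 30 -F 3844” can be made explicit with a short sketch. The flag bit values follow the SAM specification. The helper name `keep_alignment` and the example flags are illustrative only and are not part of the authors' pipeline.

```python
# SAM flag bits that sum to 3844; an alignment with any of them set is dropped by -F 3844.
EXCLUDED_BITS = {
    0x4: "read unmapped",
    0x100: "secondary alignment",
    0x200: "failed platform/vendor quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}
EXCLUDE_MASK = sum(EXCLUDED_BITS)  # 4 + 256 + 512 + 1024 + 2048 = 3844
MIN_MAPQ = 30                      # corresponds to -q 30

def keep_alignment(flag, mapq):
    """Return True if an alignment would pass '-q 30 -F 3844'."""
    return mapq >= MIN_MAPQ and (flag & EXCLUDE_MASK) == 0

# Illustrative records: (flag, MAPQ)
examples = [(99, 42), (1123, 42), (99, 10), (355, 60)]
for flag, mapq in examples:
    reasons = [name for bit, name in EXCLUDED_BITS.items() if flag & bit]
    print(flag, mapq, keep_alignment(flag, mapq), reasons)
```

Flag 99 is a properly paired first-in-pair read and is kept when its MAPQ is at least 30, whereas flags 1123 (duplicate) and 355 (secondary) are removed regardless of mapping quality.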

Expression quantitative trait loci (eQTL) analysis

To evaluate the association between rs4519489 genotype and NOL10 expression, we performed an eQTL analysis using the R package “Matrix eQTL” (v.2.2) in the CPGEA cohort comprising 134 normal prostate samples. The eQTL analysis fitted a linear regression model (“useModel = modelLINEAR”) between the expression and genotype data, with the other parameters left at their defaults (“pvOutputThreshold = 0.05, errorCovariance = numeric()”). Transcriptional profiles in the CPGEA cohort were assessed by RNA-seq, and the cohort was genotyped by whole-genome sequencing (WGS).
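
The additive linear model behind such an eQTL test regresses expression on allele dosage (0, 1, or 2 copies of one allele). The following is a minimal sketch of that regression in plain Python, not the Matrix eQTL implementation. The dosages and expression values are made up, and the function name `additive_eqtl` is hypothetical.

```python
def additive_eqtl(dosages, expression):
    """Least-squares fit of expression = intercept + slope * dosage.

    Returns (slope, intercept, r_squared). The slope is the estimated
    change in expression per additional copy of the counted allele.
    """
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical data: copies of the A allele and normalized expression values.
dosage_a = [0, 0, 1, 1, 2, 2]
expr = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]

slope, intercept, r_squared = additive_eqtl(dosage_a, expr)
print(f"effect per allele = {slope:.2f}, R^2 = {r_squared:.3f}")
```

A positive slope in this toy example means expression rises with each additional copy of the counted allele. A real analysis would also report a test statistic and P value for the slope.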

EMT score and AR signaling score

The EMT score was based on a 76-gene signature79 that was found to correlate with known EMT markers. The AR signaling score was estimated using a gene expression signature of 30 genes50, including MPHOSPH9, ADAM7, FOLH1, CD200, FKBP5, GLRA2, NDRG1, CAMKK2, MAN1A1, MED28, ELL2, ACSL3, PMEPA1, GNMT, ABCC4, HERC3, PIP4K2B, KLK3, EAF2, CENPN, MAPRE2, NKX3-1, KLK2, AR, TNK1, MAF, C1ORF116, TMPRSS2, TBC1D9B, and ZBTB10, which were chosen based on their robust activation or inhibition upon androgen stimulation.

NOL10 cell cycle signature (CCS) and cell cycle progression (CCP) score

The NOL10 cell cycle signature, composed of a predefined set of 32 genes80, was derived from the four top-enriched cell cycle-related pathways identified via GSEA. The genes from these enriched pathways were then intersected with the 267 genes that were found to be downregulated in our RNA-seq data upon NOL10 knockdown. The CCP score was calculated using a predefined set of 31 CCP genes71.
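
The intersection step described above can be sketched with set operations. The pathway names and gene identifiers below are placeholders rather than the actual pathways or genes, and pooling the pathway genes before intersecting is an assumption about how the step was carried out.

```python
# Placeholder gene sets standing in for the four top-enriched cell cycle pathways.
pathway_genes = {
    "PATHWAY_1": {"GENE_A", "GENE_B", "GENE_C"},
    "PATHWAY_2": {"GENE_B", "GENE_D"},
    "PATHWAY_3": {"GENE_E", "GENE_F"},
    "PATHWAY_4": {"GENE_A", "GENE_G"},
}

# Placeholder for the genes downregulated upon knockdown.
downregulated = {"GENE_B", "GENE_C", "GENE_G", "GENE_X", "GENE_Y"}

# Pool the pathway genes, then keep only those that are also downregulated.
pooled = set().union(*pathway_genes.values())
signature = sorted(pooled & downregulated)
print(signature)
```

Genes that are downregulated but absent from every pathway (here `GENE_X` and `GENE_Y`) do not enter the signature.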

Multivariate analysis

We investigated the association of biochemical recurrence and overall survival in prostate cancer patients with the NOL10 cell cycle signature and clinical variables, including age, tumor stage, Gleason score, PSA level, seminal vesicle status, surgical margin status, and extraprostatic extension status. These factors are critical for understanding the progression and prognosis of prostate cancer. The Cox proportional hazards model was applied to investigate the relationship between patient prognosis and the NOL10 cell cycle signature. Based on the NOL10 cell cycle signature, samples were stratified into two groups: those with higher expression and those with lower expression. The criterion for stratification was the mean value of the NOL10 cell cycle signature.
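
To illustrate the modeling idea, the sketch below dichotomizes a signature score at its mean and fits a Cox model with that single binary covariate by Newton-Raphson on the partial likelihood. It is a deliberately simplified, assumption-laden example: the study's model includes several clinical covariates, the data here are invented, tied event times are not handled, and the function names are hypothetical.

```python
import math

def cox_score(times, events, x, beta):
    """Gradient and Hessian of the Cox partial log-likelihood (one covariate)."""
    grad, hess = 0.0, 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        weights = [math.exp(beta * x[j]) for j in risk]
        total = sum(weights)
        m1 = sum(w * x[j] for w, j in zip(weights, risk)) / total
        m2 = sum(w * x[j] ** 2 for w, j in zip(weights, risk)) / total
        grad += x[i] - m1
        hess -= m2 - m1 ** 2
    return grad, hess

def cox_fit(times, events, x, n_iter=25):
    """Newton-Raphson estimate of the log hazard ratio."""
    beta = 0.0
    for _ in range(n_iter):
        grad, hess = cox_score(times, events, x, beta)
        if abs(grad) < 1e-10:
            break
        beta -= grad / hess
    return beta

# Invented follow-up times (months), event indicators, and signature scores.
times = [2, 3, 5, 8, 4, 6, 9, 10]
events = [1, 1, 1, 0, 1, 1, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.3, 0.1, 0.4]

cutoff = sum(scores) / len(scores)               # mean-based stratification
high = [1 if s > cutoff else 0 for s in scores]  # 1 = higher-expression group

beta = cox_fit(times, events, high)
hazard_ratio = math.exp(beta)
print(f"log HR = {beta:.3f}, HR (high vs low) = {hazard_ratio:.2f}")
```

In practice an established survival package would be used instead, with the clinical variables entered alongside the signature group.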

Univariate analysis

For the univariate analysis, we investigated the association of biochemical recurrence and metastasis in prostate cancer patients with single or pairwise combinations of the gene expression levels of NOL10 and USF1. The z-score sum of gene expression was calculated, and patients with prostate cancer were then stratified into two groups: those with higher expression and those with lower expression. The median value of these cumulative expression levels served as the threshold for stratification. Statistics were summarized and presented in forest plots.
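
The z-score sum and median split can be sketched as follows. The expression values are hypothetical, and the use of the sample standard deviation for the z-scores is an assumption, since the text does not specify it.

```python
from statistics import mean, median, stdev

def zscores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]

# Hypothetical expression values for six patients.
nol10 = [5.1, 6.3, 4.8, 7.0, 5.9, 6.6]
usf1 = [3.2, 4.1, 3.0, 4.4, 3.5, 3.9]

# Combined score per patient: sum of the two genes' z-scores.
combined = [a + b for a, b in zip(zscores(nol10), zscores(usf1))]

# Median of the combined score is the stratification threshold.
cutoff = median(combined)
groups = ["high" if s > cutoff else "low" for s in combined]
print(groups)
```

Standardizing each gene before summing keeps the gene with the larger dynamic range from dominating the combined score.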

Gene expression correlation analysis

We performed co-expression analysis to evaluate the expression correlations among NOL10, USF1, and the NOL10 CCS, CCP, or EMT scores in multiple independent cohorts of cancerous prostate tissues. Both Pearson’s product-moment correlation and Spearman’s rank correlation (rho) were applied in all expression correlation tests.
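
The two coefficients differ in what they measure: Pearson’s captures linear association, whereas Spearman’s is Pearson’s coefficient computed on ranks and so captures any monotonic association. A self-contained sketch with made-up values follows; the variable names are illustrative.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up expression values with a monotonic but non-linear relationship.
gene_a = [1, 2, 3, 4, 5]
gene_b = [1, 4, 9, 16, 25]

r = pearson(gene_a, gene_b)
rho = spearman(gene_a, gene_b)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Here rho equals 1 because the relationship is perfectly monotonic, while r falls slightly below 1 because it is not linear.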

Receiver operating characteristic (ROC) analysis

To evaluate the predictive potential of NOL10 and USF1 expression for 1-, 3-, 5-, and 10-year survival of prostate cancer patients in multiple cohorts, ROC analyses were performed by adding the expression data that were statistically associated with survival to a multivariable-adjusted logistic regression model81.
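
Once a model yields a predicted risk per patient, the area under the ROC curve (AUC) equals the probability that a randomly chosen patient with the event receives a higher score than a randomly chosen patient without it. The sketch below computes the AUC from that pairwise definition. The scores and labels are invented, and the logistic regression fitting step itself is not shown.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Invented predicted risks; label 1 = event within the time horizon (e.g. 5 years).
predicted_risk = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
event = [1, 1, 1, 0, 0, 0]

auc = roc_auc(predicted_risk, event)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two outcome groups.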

Survival analysis

Kaplan-Meier survival analysis was conducted to evaluate the impact of SNP genotype or expression levels of NOL10, USF1, or NOL10 CCS on patient prognosis in multiple independent clinical prostate cancer data sets. Patients were stratified based on the SNP genotype or the median value of gene expression levels. To investigate the synergistic effect of NOL10 and USF1 on patient survival, we included prostate cancer patients with concordant dual high or dual low expression levels of NOL10 and USF1. Kaplan-Meier survival analysis was conducted using the R package “survival” (v.3.2.13) and assessed using the log-rank test.
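
The Kaplan-Meier estimator multiplies, at each event time, the fraction of at-risk patients who survive that time. The sketch below applies it to one invented patient group and does not reproduce the R “survival” workflow. The log-rank comparison between groups is omitted.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    curve = []
    survival = 1.0
    event_times = sorted({t for t, d in zip(times, events) if d})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, d in zip(times, events) if u == t and d)
        survival *= 1 - n_events / at_risk
        curve.append((t, survival))
    return curve

# Invented follow-up data for one group: the patient at time 4 is censored.
times = [2, 3, 4, 5]
events = [1, 1, 0, 1]

km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(f"t = {t}: S(t) = {s:.2f}")
```

The censored patient lowers the number at risk at later times without producing a drop in the curve at the censoring time.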

Statistical analysis and data visualization

Throughout the study, continuous variables are presented as medians and interquartile ranges. Discrete variables are reported as actual numbers or percentages. All statistical analyses were performed using RStudio (v.1.2.5033) with the R environment (v.3.6.3) unless otherwise specified. To assess the expression of NOL10, USF1, or NOL10 CCS in human samples, we compared their expression among normal prostate tissue, primary prostate tumor, and tumor metastasis in multiple prostate cancer clinical cohorts. We evaluated the association of candidate gene expression with other clinicopathological features such as clinical T stage, lymph node metastasis, Gleason score, prostate-specific antigen (PSA) level, seminal vesicle status, person neoplasm status, and BCR. The Mann-Whitney U test was used to compare gene expression between two groups in clinical cohorts, while the Kruskal-Wallis H test was applied for cohorts with three or more groups. For the experimental part, data are presented as means ± SD using GraphPad Prism 6 software. Differences between two groups were assessed using the two-tailed Student’s t test. Variables in three or more groups were compared using the two-way ANOVA test. Asterisks indicate the significance levels (*P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001). For comparative analyses, P < 0.05 was considered statistically significant. The software and algorithms are listed in Supplementary Table 5.
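
The asterisk convention maps a P value to a label as sketched below. The thresholds are those given in the text; the "ns" label for non-significant results and the function name are assumptions added for illustration.

```python
def significance_label(p_value):
    """Map a P value to the asterisk notation used for figure annotations."""
    if p_value < 0.0001:
        return "****"
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"  # assumed label for P >= 0.05; not defined in the text

for p in (0.2, 0.03, 0.004, 0.0005, 0.00002):
    print(p, significance_label(p))
```

Checking the strictest threshold first ensures each P value receives the label with the most asterisks it qualifies for.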

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Supplementary information

41467_2025_64005_MOESM2_ESM.pdf (55.6KB, pdf)

Description of Additional Supplementary Files

Supplementary Data 1 (11.5KB, xlsx)
Supplementary Data 2 (19.1KB, xlsx)
Supplementary Data 3 (75KB, xlsx)
Reporting Summary (102.5KB, pdf)

Source data

Source Data (26.5MB, zip)

Acknowledgements

We want to acknowledge the participants and investigators of the CPGEA and FUSCC studies. This work was supported by the Shanghai Interactional Collaborative Project (23410713300), the National Natural Science Foundation of China (82372628 to P.Z.; 82203416 to Z.W.; 82073082, 82311530050 to G.-H.W.), the National Key Research and Development Program of China (2022YFC2703600), Jane ja Aatos Erkon säätiö, Sigrid Juséliuksen Säätiö, Syöpäjärjestöt, Fudan University Recruit Funding to G.-H.W., and Research Council of Finland Profi8 funding (decision number 365202) to Q.Z., as well as the National Institute of Health (R01 CA250018-01) to L.W. Linux high-performance computing servers were supported by the Medical Research Data Center in Shanghai Medical College of Fudan University, the High-performance Computing Platform of Suzhou Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, and the CSC-IT Center for Science Ltd.

Author contributions

Conceptualisation, G.-H.W. and L.W.; D.D. and P.Z. performed most of the experiments and analysed the data. Z.W. performed bioinformatics analysis with essential assistance from Q.Z. and W.X. D.D., Z.W., P.Z., and G.-H.W. prepared figures. Methodology, D.D., P.Z., Z.W., M.L., Q.Z., Y.W., J.Z., X.Y., W.X., Q.-X.Z., Y.Z., L.W., and G.-H.W. Software, Z.W., Q.Z., W.X., and G.-H.W. Validation, D.D. and P.Z. Formal analysis, D.D., P.Z., Z.W., and G.-H.W. Investigation, D.D., P.Z., Y.W., Y.Z., and G.-H.W. Resources, Y.Z., L.W., and G.-H.W. Data curation, D.D., P.Z., Z.W., Q.Z., W.X., and G.-H.W. Writing-original draft, P.Z., D.D., and G.-H.W. Writing-review & editing, D.D., Z.W., P.Z., G.-H.W. with inputs from all authors. Visualisation, D.D., Z.W., P.Z., and G.-H.W. Supervision, project administration, and funding acquisition, L.W., P.Z. and G.-H.W.

Peer review

Peer review information

Nature Communications thanks Jyotsna Batra, Ping Mu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Data availability

All data and software used in this study are accessible. The publicly available GWAS data in prostate cancer used in this study were obtained from the GWAS catalog. The RNA-seq and ChIP-seq data generated in this study have been deposited in the NCBI Gene Expression Omnibus (GEO) under accession code GSE287279. The publicly available RNA-seq or microarray data including CPGEA, TCGA, MSKCC, GTEx, DKFZ, FHCRC, NPC, Rld, SMMU, Stockholm, SU2C-PCF, Yu (GSE6919), Chandran (GSE10645), Taylor (GSE21034), Grasso (GSE35988), and Penney (GSE62872) were retrieved from public databases including cBioPortal for Cancer Genomics, Oncomine database, and GEO database. The remaining data supporting the findings of this study are available in the Article, Supplementary Information or Source Data file. A reporting summary for this article is available as a Supplementary file. The deposited data are listed in Supplementary Table 6. Source data are provided with this paper.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Dandan Dong, Zixian Wang.

Contributor Information

Liang Wang, Email: liang.wang@moffitt.org.

Peng Zhang, Email: peng_zhang@fudan.edu.cn.

Gong-Hong Wei, Email: gonghong_wei@fudan.edu.cn.

Supplementary information

The online version contains supplementary material available at 10.1038/s41467-025-64005-w.

References

  • 1. Bray, F. et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 74, 229–263 (2024).
  • 2. Culp, M. B., Soerjomataram, I., Efstathiou, J. A., Bray, F. & Jemal, A. Recent global patterns in prostate cancer incidence and mortality rates. Eur. Urol. 77, 38–52 (2020).
  • 3. Zhou, C. K. et al. Prostate cancer incidence in 43 populations worldwide: an analysis of time trends overall and by age group. Int. J. Cancer 138, 1388–1400 (2016).
  • 4. Center, M. M. et al. International variation in prostate cancer incidence and mortality rates. Eur. Urol. 61, 1079–1092 (2012).
  • 5. Tikkinen, K. A. O. et al. Prostate cancer screening with prostate-specific antigen (PSA) test: a clinical practice guideline. BMJ 362, k3581 (2018).
  • 6. Wong, M. C. et al. Global incidence and mortality for prostate cancer: analysis of temporal patterns and trends in 36 countries. Eur. Urol. 70, 862–874 (2016).
  • 7. Tsodikov, A. et al. Reconciling the effects of screening on prostate cancer mortality in the ERSPC and PLCO Trials. Ann. Intern. Med. 167, 449–455 (2017).
  • 8. Sandhu, S. et al. Prostate cancer. Lancet 398, 1075–1090 (2021).
  • 9. Giannareas, N. et al. Extensive germline-somatic interplay contributes to prostate cancer progression through HNF1B co-option of TMPRSS2-ERG. Nat. Commun. 13, 7320 (2022).
  • 10. Benafif, S., Kote-Jarai, Z., Eeles, R. A. & Consortium, P. A review of prostate cancer genome-wide association studies (GWAS). Cancer Epidemiol. Biomark. Prev. 27, 845–857 (2018).
  • 11. Mucci, L. A. et al. Familial risk and heritability of cancer among twins in Nordic countries. JAMA 315, 68–76 (2016).
  • 12. Gudmundsson, J. et al. Genome-wide association study identifies a second prostate cancer susceptibility variant at 8q24. Nat. Genet. 39, 631–637 (2007).
  • 13. Tian, P., Zhong, M. & Wei, G. H. Mechanistic insights into genetic susceptibility to prostate cancer. Cancer Lett. 522, 155–163 (2021).
  • 14. Conti, D. V. et al. Trans-ancestry genome-wide association meta-analysis of prostate cancer identifies new susceptibility loci and informs genetic risk prediction. Nat. Genet. 53, 65–75 (2021).
  • 15. Wang, A. et al. Characterizing prostate cancer risk through multi-ancestry genome-wide discovery of 187 novel risk variants. Nat. Genet. 55, 2065–2074 (2023).
  • 16. Grisanzio, C. et al. Genetic and functional analyses implicate the NUDT11, HNF1B, and SLC22A3 genes in prostate cancer pathogenesis. Proc. Natl. Acad. Sci. USA 109, 11252–11257 (2012).
  • 17. Farashi, S., Kryza, T., Clements, J. & Batra, J. Post-GWAS in prostate cancer: from genetic association to biological contribution. Nat. Rev. Cancer 19, 46–59 (2019).
  • 18. Freedman, M. L. et al. Principles for the post-GWAS functional characterization of cancer risk loci. Nat. Genet. 43, 513–518 (2011).
  • 19. Edwards, S. L., Beesley, J., French, J. D. & Dunning, A. M. Beyond GWASs: illuminating the dark road from association to function. Am. J. Hum. Genet. 93, 779–797 (2013).
  • 20. Jiang, J., Cui, W., Vongsangnak, W., Hu, G. & Shen, B. Post genome-wide association studies functional characterization of prostate cancer risk loci. BMC Genomics 14, S9 (2013).
  • 21. Gallagher, M. D. & Chen-Plotkin, A. S. The post-GWAS Era: from association to function. Am. J. Hum. Genet. 102, 717–730 (2018).
  • 22. Melnikov, A. et al. Systematic dissection and optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nat. Biotechnol. 30, 271–277 (2012).
  • 23. Khetan, S. et al. Functional characterization of T2D-associated SNP effects on baseline and ER stress-responsive beta cell transcriptional activation. Nat. Commun. 12, 5242 (2021).
  • 24. Arnold, C. D. et al. Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science 339, 1074–1077 (2013).
  • 25. Liu, S. et al. Systematic identification of regulatory variants associated with cancer risk. Genome Biol. 18, 194 (2017).
  • 26. Fulco, C. P. et al. Systematic mapping of functional enhancer-promoter connections with CRISPR interference. Science 354, 769–773 (2016).
  • 27. Ahmed, M. et al. CRISPRi screens reveal a DNA methylation-mediated 3D genome dependent causal mechanism in prostate cancer. Nat. Commun. 12, 1781 (2021).
  • 28. Tehranchi, A. K. et al. Pooled ChIP-Seq links variation in transcription factor binding to complex disease risk. Cell 165, 730–741 (2016).
  • 29. Zhang, P. et al. High-throughput screening of prostate cancer risk loci by single nucleotide polymorphisms sequencing. Nat. Commun. 9, 2022 (2018).
  • 30. Li, G. et al. High-throughput identification of noncoding functional SNPs via type IIS enzyme restriction. Nat. Genet. 50, 1180–1188 (2018).
  • 31. Al Olama, A. A. et al. A meta-analysis of 87,040 individuals identifies 23 new susceptibility loci for prostate cancer. Nat. Genet. 46, 1103–1109 (2014).
  • 32. Schumacher, F. R. et al. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nat. Genet. 50, 928–936 (2018).
  • 33. Li, J. et al. A genomic and epigenomic atlas of prostate cancer in Asian populations. Nature 580, 93–99 (2020).
  • 34. Gao, P., Dong, X., Wang, Y. & Wei, G. H. Optimized CRISPR/Cas9-mediated single nucleotide mutation in adherent cancer cell lines. STAR Protoc. 2, 100419 (2021).
  • 35. Tong, H. et al. Programmable A-to-Y base editing by fusing an adenine base editor with an N-methylpurine DNA glycosylase. Nat. Biotechnol. 41, 1080–1084 (2023).
  • 36. The Cancer Genome Atlas Research Network. The molecular taxonomy of primary prostate cancer. Cell 163, 1011–1025 (2015).
  • 37. Penney, K. L. et al. Association of prostate cancer risk variants with gene expression in normal and tumor tissue. Cancer Epidemiol. Biomark. Prev. 24, 255–260 (2015).
  • 38. Labbe, D. P. et al. High-fat diet fuels prostate cancer progression by rewiring the metabolome and amplifying the MYC program. Nat. Commun. 10, 4358 (2019).
  • 39. Ren, S. et al. RNA-seq analysis of prostate cancer in the Chinese population identifies recurrent gene fusions, cancer-associated long noncoding RNAs and aberrant alternative splicings. Cell Res. 22, 806–821 (2012).
  • 40. Shao, N. et al. A novel gene signature to predict immune infiltration and outcome in patients with prostate cancer. Oncoimmunology 9, 1762473 (2020).
  • 41. Nakagawa, T. et al. A tissue biomarker panel predicting systemic progression after PSA recurrence post-definitive prostate cancer therapy. PLoS One 3, e2318 (2008).
  • 42. Taylor, B. S. et al. Integrative genomic profiling of human prostate cancer. Cancer Cell 18, 11–22 (2010).
  • 43. Grasso, C. S. et al. The mutational landscape of lethal castration-resistant prostate cancer. Nature 487, 239–243 (2012).
  • 44. Li, X. et al. Loss of SYNCRIP unleashes APOBEC-driven mutagenesis, tumor heterogeneity, and AR-targeted therapy resistance in prostate cancer. Cancer Cell 41, 1427–1449 (2023).
  • 45. Li, F. et al. LKB1 inactivation promotes epigenetic remodeling-induced lineage plasticity and antiandrogen resistance in prostate cancer. Cell Res. 35, 59–71 (2025).
  • 46. Beltran, H. et al. Divergent clonal evolution of castration-resistant neuroendocrine prostate cancer. Nat. Med. 22, 298–305 (2016).
  • 47. Ren, S. et al. Whole-genome and transcriptome sequencing of prostate cancer identify new genetic alterations driving disease progression. Eur. Urol. 73, 322–339 (2018).
  • 48. Abida, W. et al. Genomic correlates of clinical outcome in advanced prostate cancer. Proc. Natl. Acad. Sci. USA 116, 11428–11436 (2019).
  • 49. Kumar, A. et al. Substantial interindividual and limited intraindividual genomic diversity among tumors from men with metastatic prostate cancer. Nat. Med. 22, 369–378 (2016).
  • 50. Deplancke, B., Alpern, D. & Gardeux, V. The Genetics of Transcription Factor DNA Binding Variation. Cell 166, 538–554 (2016).
  • 51. Butter, F. et al. Proteome-wide analysis of disease-associated SNPs that show allele-specific transcription factor binding. PLoS Genet. 8, e1002982 (2012).
  • 52. Hallikas, O. et al. Genome-wide prediction of mammalian enhancers based on analysis of transcription-factor binding affinity. Cell 124, 47–59 (2006).
  • 53. Jolma, A. et al. DNA-binding specificities of human transcription factors. Cell 152, 327–339 (2013).
  • 54. Consortium, G. T. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
  • 55. Ross-Adams, H. et al. Integration of copy number and transcriptomics provides risk stratification in prostate cancer: A discovery and validation cohort study. EBioMedicine 2, 1133–1144 (2015).
  • 56. Bushweller, J. H. Targeting transcription factors in cancer - from undruggable to reality. Nat. Rev. Cancer 19, 611–624 (2019).
  • 57. Gerhauser, C. et al. Molecular evolution of early-onset prostate cancer identifies molecular risk markers and clinical trajectories. Cancer Cell 34, 996–1011 e1018 (2018).
  • 58. Sud, A., Kinnersley, B. & Houlston, R. S. Genome-wide association studies of cancer: current insights and future perspectives. Nat. Rev. Cancer 17, 692–704 (2017).
  • 59. Uffelmann, E. et al. Genome-wide association studies. Nat. Rev. Methods Primers 1, 59 (2021).
  • 60. Tam, V. et al. Benefits and limitations of genome-wide association studies. Nat. Rev. Genet. 20, 467–484 (2019).
  • 61. Al Olama, A. A. et al. A meta-analysis of 87,040 individuals identifies 23 new susceptibility loci for prostate cancer. Nat. Genet. 46, 1103–1109 (2014).
  • 62. Rebbeck, T. R. Prostate cancer genetics: variation by race, ethnicity, and geography. Semin. Radiat. Oncol. 27, 3–10 (2017).
  • 63. Dias, A., Kote-Jarai, Z., Mikropoulos, C. & Eeles, R. Prostate Cancer Germline Variations and Implications for Screening and Treatment. Cold Spring Harb. Perspect. Med. 8, a030379 (2018).
  • 64. Dadaev, T. et al. Fine-mapping of prostate cancer susceptibility loci in a large meta-analysis identifies candidate causal variants. Nat. Commun. 9, 2256 (2018).
  • 65. Spisak, S. et al. A biallelic multiple nucleotide length polymorphism explains functional causality at 5p15.33 prostate cancer risk locus. Nat. Commun. 14, 5118 (2023).
  • 66. Prokop, J. W. et al. Emerging role of ODC1 in neurodevelopmental disorders and brain development. Genes (Basel) 12, 470 (2021).
  • 67. Yengo, L. et al. A saturated map of common genetic variants associated with human height. Nature 610, 704–712 (2022).
  • 68. Asiimwe, I. G. et al. A genome-wide association study of plasma concentrations of warfarin enantiomers and metabolites in sub-Saharan black-African patients. Front. Pharm. 13, 967082 (2022).
  • 69. Ahmad, Y., Boisvert, F. M., Gregor, P., Cobley, A. & Lamond, A. I. NOPdb: nucleolar proteome database–2008 update. Nucleic Acids Res. 37, D181–D184 (2009).
  • 70. Jin, X. et al. PQBP5/NOL10 maintains and anchors the nucleolus under physiological and osmotic stress conditions. Nat. Commun. 14, 9 (2023).
  • 71. Schubert, M., Lindgreen, S. & Orlando, L. AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Res. Notes 9, 88 (2016).
  • 72. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
  • 73. Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021).
  • 74. Anders, S., Pyl, P. T. & Huber, W. HTSeq-a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
  • 75. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
  • 76. Wu, T. et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innov. (Camb.) 2, 100141 (2021).
  • 77. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
  • 78. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
  • 79. Byers, L. A. et al. An epithelial-mesenchymal transition gene signature predicts resistance to EGFR and PI3K inhibitors and identifies Axl as a therapeutic target for overcoming EGFR inhibitor resistance. Clin. Cancer Res. 19, 279–290 (2013).
  • 80. Giannareas, N. et al. Extensive germline-somatic interplay contributes to prostate cancer progression through HNF1B co-option of TMPRSS2-ERG. Nat. Commun. 13, 7320 (2022).
  • 81. Chen, Y. et al. Multi-factors including inflammatory/immune, hormones, tumor-related proteins and nutrition associated with chronic prostatitis NIH IIIa plus b and IV based on project. Sci. Rep. 7, 9143 (2017).



Articles from Nature Communications are provided here courtesy of Nature Publishing Group