Translational Cancer Research. 2025 Jul 14;14(7):4058–4070. doi: 10.21037/tcr-2024-2472

Integrative analysis of CRISPR screening and gene expression data identifies a three-gene prognostic model associated with immune microenvironment in neuroblastoma

Xin Li 1,2, Wanrong Li 2,3, Jian Wang 2,3
PMCID: PMC12335716  PMID: 40792157

Abstract

Background

Neuroblastoma is a heterogeneous pediatric tumor with variable clinical outcomes. Current prognostic markers are insufficient to predict patient survival accurately, necessitating the identification of novel biomarkers and therapeutic targets. This study aimed to develop a robust prognostic model by integrating CRISPR screening data and transcriptomic profiles, and to explore its correlation with the tumor immune microenvironment.

Methods

We integrated Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) screening data from the DepMap database (version 24Q2) and gene expression profiles from neuroblastoma patients to identify key genes associated with neuroblastoma prognosis. Essential genes with Computational Evaluation of RNAi Essentiality Scores (CERES) scores less than −1 in at least 80% of 34 neuroblastoma cell lines were intersected with differentially expressed genes (|logFC| >2, P<0.05) from the National Genomics Data Center (NGDC) dataset (accession code HRA002064), resulting in 43 overlapping genes. Random forest analysis and multivariate Cox regression were conducted on the GSE49710 training set (n=498) to construct a prognostic model. The model was externally validated using the E-MTAB-8248 dataset (n=223). Immune infiltration and immunotherapy response were assessed using Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data (ESTIMATE), Microenvironment Cell Populations counter (MCPcounter), Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts (CIBERSORT), immunophenoscore (IPS), and Tumor Immune Dysfunction and Exclusion (TIDE) algorithms.

Results

A three-gene prognostic model comprising PKMYT1, CDT1, and NCAPG was established. Patients were stratified into high-risk and low-risk groups based on the median RiskScore of 9.514526. In the training set, high-risk patients exhibited significantly poorer overall survival compared to low-risk patients (log-rank test, P<0.001). The model outperformed traditional clinical factors and demonstrated consistent prognostic value in the external validation cohort. High-risk patients showed lower immune cell infiltration, higher TIDE scores, and lower IPS values, suggesting an immunosuppressive microenvironment and reduced likelihood of responding to immunotherapy. In contrast, low-risk patients had higher immune infiltration and a predicted immunotherapy response rate of 70% versus 36% in the high-risk group.

Conclusions

The three-gene prognostic model effectively stratifies neuroblastoma patients by survival risk and correlates with immune microenvironment characteristics. This model has potential clinical utility for prognosis prediction and guiding personalized immunotherapy strategies in neuroblastoma.

Keywords: Neuroblastoma, prognostic model, Clustered Regularly Interspaced Short Palindromic Repeats screening (CRISPR screening), immune infiltration


Highlight box.

Key findings

• A three-gene prognostic model (PKMYT1, CDT1, NCAPG) derived from integrative Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) screening and gene expression data effectively stratifies neuroblastoma patients into distinct risk groups.

• The model correlates with immune infiltration levels and predicts varying potential for immunotherapy response.

What is known and what is new?

• It is known that neuroblastoma is highly heterogeneous, and existing prognostic markers, including MYCN amplification, do not fully capture its complexity. Current approaches lack a robust method to integrate functional genomic data with clinical outcomes.

• This study introduces a novel prognostic risk score by combining essential gene dependencies and differential gene expression profiles, providing improved predictive performance and insight into the tumor immune microenvironment.

What is the implication, and what should change now?

• The findings suggest that integrating functional genomic data into prognostic models can enhance accuracy and clinical utility.

• Incorporating this three-gene model into current risk stratification practices may guide more personalized therapy, including immunotherapy selection.

• Future steps should focus on prospective validation and the development of treatment protocols that leverage this prognostic model to improve patient outcomes.

Introduction

Neuroblastoma is the most common extracranial solid tumor in children, accounting for approximately 15% of pediatric cancer-related deaths worldwide (1). Originating from neural crest cells, neuroblastoma exhibits remarkable heterogeneity in clinical behavior, ranging from spontaneous regression to aggressive progression with poor prognosis (2). Despite advances in multimodal therapies, the survival rate for high-risk neuroblastoma remains unsatisfactory, necessitating the exploration of novel prognostic markers and therapeutic targets (3).

Genomic alterations and dysregulated gene expression play critical roles in neuroblastoma pathogenesis. Amplification of the MYCN oncogene, chromosomal aberrations, and mutations in genes such as ALK have been associated with disease progression and unfavorable outcomes (4). However, these biomarkers do not fully capture the complexity of neuroblastoma biology, highlighting the need for comprehensive analyses to identify additional molecular determinants of prognosis.

The advent of high-throughput sequencing and functional genomic screening has facilitated the identification of essential genes and pathways involved in cancer cell survival and proliferation (5). The Dependency Map (DepMap) project provides a valuable resource for uncovering gene dependencies across various cancer types using CRISPR-Cas9 screening data (6). Integrating such functional genomic data with patient-derived gene expression profiles can enhance our understanding of tumor biology and lead to the discovery of novel prognostic biomarkers.

In addition, the tumor microenvironment, particularly immune cell infiltration, has emerged as a critical factor influencing tumor progression and response to therapy (7). Immunotherapies, including immune checkpoint inhibitors, have shown promise in treating various malignancies but have yet to achieve significant success in neuroblastoma (8). Understanding the immunological landscape of neuroblastoma and its relationship with tumor-intrinsic factors may provide insights into improving therapeutic strategies.

In this study, we aimed to identify key genes associated with neuroblastoma prognosis by integrating CRISPR screening data from the DepMap database and gene expression data from patient samples. We first extracted neuroblastoma-specific essential genes from the DepMap CRISPR screen and intersected them with differentially expressed genes (DEGs) identified from neuroblastoma and ganglioneuroma (GN) sequencing data. Through random forest analysis and Cox regression modeling, we constructed a three-gene prognostic risk model comprising PKMYT1, CDT1, and NCAPG. The model was validated in an independent cohort and demonstrated superior predictive performance compared to traditional clinical factors.

Furthermore, we explored the relationship between the risk score derived from our model and the tumor immune microenvironment. We found that the high-risk group exhibited lower immune cell infiltration and higher immune evasion scores, suggesting a more immunosuppressive milieu. Conversely, the low-risk group showed higher immunogenicity and a greater likelihood of responding to immunotherapy.

Our findings provide a novel prognostic model for neuroblastoma that not only stratifies patients based on survival risk but also reflects the immunological characteristics of the tumor. This model holds potential for guiding personalized treatment strategies and improving outcomes for patients with neuroblastoma. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2024-2472/rc).

Methods

Data acquisition and preprocessing

We obtained multiple datasets from publicly available databases to ensure a comprehensive analysis. CRISPR screening data were sourced from the Broad Institute’s Dependency Map database (version 24Q2). Specifically, we extracted CERES gene effect scores for neuroblastoma cell lines, identifying 34 neuroblastoma-related cell lines for inclusion. Genes with CERES scores less than −1 in at least 80% of these cell lines were considered essential for neuroblastoma cell survival, as lower scores indicate a greater dependency of the cell line on that gene. Concurrently, neuroblastoma sequencing data were downloaded from the National Genomics Data Center (NGDC) under the accession code HRA002064. This dataset comprised gene expression profiles of neuroblastoma and GN samples. Raw expression data were processed using the Robust Multi-array Average (RMA) method for background correction and normalization. Differential expression analysis between neuroblastoma and GN samples was conducted using the limma package in R, with thresholds set at an absolute log2 fold change (|logFC|) >2 and a P value <0.05 to identify significant DEGs.
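The essentiality filter and the intersection with DEGs described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the pipeline used in the study (which was run in R); the function names, gene names, and toy values are hypothetical, and only the thresholds (score less than −1 in at least 80% of cell lines, |logFC| >2, P<0.05) come from the text.

```python
def essential_genes(gene_effect, score_cutoff=-1.0, min_fraction=0.8):
    """Return genes whose score is below `score_cutoff` in at least
    `min_fraction` of the cell lines.

    gene_effect: dict mapping gene -> list of gene-effect scores, one per cell line.
    """
    selected = set()
    for gene, scores in gene_effect.items():
        n_dependent = sum(1 for s in scores if s < score_cutoff)
        if n_dependent / len(scores) >= min_fraction:
            selected.add(gene)
    return selected


def significant_degs(deg_table, lfc_cutoff=2.0, p_cutoff=0.05):
    """Return genes with |logFC| > lfc_cutoff and P < p_cutoff.

    deg_table: dict mapping gene -> (log2 fold change, P value).
    """
    return {
        gene
        for gene, (log_fc, p_value) in deg_table.items()
        if abs(log_fc) > lfc_cutoff and p_value < p_cutoff
    }


# Toy data (hypothetical): 5 cell lines instead of the 34 used in the study.
toy_effect = {
    "GENE_A": [-1.4, -1.2, -1.6, -1.1, -0.7],  # 4 of 5 lines below -1: essential
    "GENE_B": [-1.3, -0.2, -0.5, -1.1, -0.9],  # 2 of 5 lines below -1: not essential
    "GENE_C": [-2.0, -1.8, -1.5, -1.9, -1.7],  # 5 of 5 lines below -1: essential
}
toy_degs = {
    "GENE_A": (2.6, 0.001),   # passes both DEG thresholds
    "GENE_B": (3.1, 0.0004),  # DEG, but not essential
    "GENE_C": (1.2, 0.01),    # essential, but |logFC| too small
}

# Genes that are both essential and differentially expressed.
overlap = essential_genes(toy_effect) & significant_degs(toy_degs)
print(sorted(overlap))
```

In this toy example only GENE_A satisfies both criteria, which mirrors how the study's two gene lists are reduced to their intersection.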

For model training, the GSE49710 dataset was retrieved from the Gene Expression Omnibus (GEO), containing gene expression profiles and survival data for 498 neuroblastoma patients. This dataset was used as the training set for constructing the prognostic model. An external validation set was obtained from the ArrayExpress database, specifically the E-MTAB-8248 dataset, which included 223 neuroblastoma patient samples with corresponding clinical information.

To analyze gene expression at the single-cell level, we acquired two single-cell RNA sequencing (scRNA-seq) datasets, PMC and GOSH, from publicly accessible repositories. Quality control of scRNA-seq data was performed using the Seurat package, filtering out cells with fewer than 200 genes or more than 10% mitochondrial gene expression to ensure data reliability. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
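The single-cell quality-control thresholds stated above amount to a simple per-cell rule. The sketch below expresses that rule in plain Python for illustration only; the study applied it with the Seurat package in R, and the function name, cell names, and values here are hypothetical. Whether cells lying exactly on a threshold are kept is an assumption.

```python
def passes_qc(n_genes, percent_mito, min_genes=200, max_percent_mito=10.0):
    """Keep a cell only if it has at least `min_genes` detected genes and
    no more than `max_percent_mito` percent mitochondrial expression.

    Assumption: cells exactly on either threshold are retained.
    """
    return n_genes >= min_genes and percent_mito <= max_percent_mito


# Toy cells (hypothetical): name -> (detected genes, percent mitochondrial).
toy_cells = {
    "cell_1": (1500, 3.2),   # kept
    "cell_2": (150, 2.0),    # removed: fewer than 200 genes
    "cell_3": (2200, 14.5),  # removed: more than 10% mitochondrial
    "cell_4": (200, 10.0),   # kept: on both thresholds
}

kept = [name for name, (n_genes, pct) in toy_cells.items() if passes_qc(n_genes, pct)]
print(kept)
```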

Feature selection using random forest algorithm

The 43 overlapping genes were subjected to feature selection using the random forest algorithm implemented in the randomForest package in R. The random forest model is an ensemble learning method that operates by constructing multiple decision trees during training and outputting the mode of the classes for classification tasks or mean prediction for regression tasks. We set the number of trees (ntree) to 1,000 to ensure stability and reproducibility of results. The mean decrease in Gini coefficient was used to evaluate the importance of each gene. The top six genes with the highest importance scores were selected for subsequent analysis.
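To make the importance measure concrete, the sketch below computes the decrease in Gini impurity produced by a single split on one gene. It is a simplified Python illustration, not the randomForest implementation: in a random forest this quantity is accumulated over every split that uses the gene and averaged over all trees. The binary outcome label, the function names, and the toy values are assumptions for illustration; the text does not state which outcome variable was supplied to the algorithm.

```python
def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum over classes of (class proportion)^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(values, labels, threshold):
    """Impurity of the parent node minus the size-weighted impurity of the
    two child nodes obtained by splitting at `values <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (
        len(left) / n * gini_impurity(left) + len(right) / n * gini_impurity(right)
    )
    return gini_impurity(labels) - weighted_children


# Toy data (hypothetical): expression of one gene and a binary outcome label.
expression = [1.0, 2.0, 3.0, 4.0]
informative_outcome = [0, 0, 1, 1]    # split at 2.5 separates the classes perfectly
uninformative_outcome = [0, 1, 0, 1]  # split at 2.5 leaves both children mixed

print(gini_decrease(expression, informative_outcome, 2.5))
print(gini_decrease(expression, uninformative_outcome, 2.5))
```

A gene that separates the outcome classes well yields a large decrease, whereas an uninformative gene yields a decrease near zero, which is why ranking by this measure favors prognostically relevant genes.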

Construction of the prognostic risk model

To develop a prognostic model, we performed multivariate Cox proportional hazards regression analysis using the survival package in R. The six genes identified from the random forest analysis were included as covariates. To assess multicollinearity among these genes, we calculated variance inflation factors (VIFs) using the car package. Genes with a VIF >5 were considered to exhibit multicollinearity and were excluded from the model to prevent distortion of the regression coefficients. After this evaluation, three genes—PKMYT1, CDT1, and NCAPG—were retained. The prognostic risk score (RiskScore) for each patient was calculated based on the multivariate Cox regression coefficients using the formula:

\[
\text{RiskScore} = 0.6123075 \times \text{PKMYT1 expression} + 0.3990650 \times \text{CDT1 expression} + 1.0120016 \times \text{NCAPG expression} \tag{1}
\]

Patients in the training set were stratified into high-risk and low-risk groups using the median RiskScore of 9.514526 as the cutoff value.
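The RiskScore calculation and the median-based stratification can be sketched as follows. The coefficients and the cutoff are taken from the text; everything else (function names, toy expression values, and the choice to assign a score exactly equal to the cutoff to the low-risk group) is an assumption made for illustration.

```python
# Multivariate Cox regression coefficients reported for the three-gene model.
COEFFICIENTS = {"PKMYT1": 0.6123075, "CDT1": 0.3990650, "NCAPG": 1.0120016}

# Median RiskScore of the training set, used as the cutoff.
TRAINING_MEDIAN_CUTOFF = 9.514526


def risk_score(expression):
    """Weighted sum of the three genes' expression values."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def risk_group(score, cutoff=TRAINING_MEDIAN_CUTOFF):
    """Assumption: scores strictly above the cutoff are 'high', all others 'low'."""
    return "high" if score > cutoff else "low"


# Toy patients (hypothetical expression values).
patient_a = {"PKMYT1": 5.0, "CDT1": 4.0, "NCAPG": 5.0}
patient_b = {"PKMYT1": 4.0, "CDT1": 4.0, "NCAPG": 4.0}

for name, patient in (("A", patient_a), ("B", patient_b)):
    score = risk_score(patient)
    print(name, round(score, 4), risk_group(score))
```

Because the cutoff is a fixed constant, the same two functions can be applied unchanged to patients from another cohort, provided the expression values are on a comparable scale.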

Evaluation of the prognostic model

The prognostic performance of the RiskScore model was assessed through several statistical methods. Kaplan-Meier survival analysis was conducted using the survminer package to compare overall survival between high-risk and low-risk groups. The log-rank test was applied to determine the statistical significance of survival differences.
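For readers unfamiliar with the Kaplan-Meier estimator, the sketch below shows the product-limit calculation that underlies each survival curve. It is a minimal Python illustration with hypothetical follow-up data, not the survminer workflow used in the study, and it omits confidence intervals and the log-rank test.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Toy follow-up data (hypothetical): times in years, 1 = death, 0 = censored.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]

for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

In this toy example the estimated survival drops to 0.8, 0.6, and 0.3 at years 2, 3, and 5; the patient censored at year 3 still counts as at risk for the death at year 3 but contributes no drop of their own.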

To evaluate the predictive accuracy of the RiskScore, time-dependent receiver operating characteristic (ROC) curves were generated using the timeROC package. The area under the ROC curve (AUC) was calculated to quantify the model’s discrimination ability. We compared the AUC values of the RiskScore with traditional clinical prognostic factors, including the International Neuroblastoma Staging System (INSS) stage, MYCN amplification status, and patient age.
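The idea behind an AUC at a fixed time horizon can be illustrated with a deliberately simplified calculation. The sketch below treats patients who died by the horizon as cases and patients followed beyond the horizon as controls, and it discards patients censored before the horizon. This is an assumption made for clarity: the timeROC package instead accounts for censoring with weighting, so the numbers from this sketch would not match its output exactly. All names and data are hypothetical.

```python
def auc_at_horizon(scores, times, events, horizon):
    """Probability that a randomly chosen case has a higher score than a
    randomly chosen control (ties count as one half).

    Cases:    death observed at or before `horizon`.
    Controls: follow-up longer than `horizon`.
    Patients censored at or before `horizon` are dropped (simplification).
    """
    cases = [s for s, t, e in zip(scores, times, events) if t <= horizon and e == 1]
    controls = [s for s, t, e in zip(scores, times, events) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy cohort (hypothetical): RiskScore, follow-up in years, 1 = death.
toy_scores = [10.5, 9.9, 10.0, 8.7, 9.6]
toy_times = [1.0, 2.5, 6.0, 7.0, 2.0]
toy_events = [1, 1, 0, 0, 0]

print(auc_at_horizon(toy_scores, toy_times, toy_events, horizon=3.0))
```

Here three of the four case-control pairs are ordered correctly, giving an AUC of 0.75; the fifth patient, censored at 2 years, is excluded from the 3-year calculation.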

Decision curve analysis (DCA) was performed using the ggDCA package to assess the clinical utility of the RiskScore model. DCA evaluates the net benefit of a predictive model across a range of threshold probabilities, aiding in determining its value in clinical decision-making.
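The net benefit plotted in a decision curve can be computed directly from counts of true and false positives. The sketch below uses the standard definition, net benefit = TP/n − (FP/n) × p_t/(1 − p_t), for a binary outcome at a single threshold probability p_t. It is a simplified Python illustration that ignores censoring, not the ggDCA implementation; the function names and toy data are hypothetical.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of treating every patient whose predicted risk is at or
    above `threshold` (a probability strictly between 0 and 1)."""
    n = len(outcome)
    true_pos = sum(
        1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y == 1
    )
    false_pos = sum(
        1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y == 0
    )
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcome, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    return net_benefit([1.0] * len(outcome), outcome, threshold)


# Toy data (hypothetical): predicted event probabilities and observed outcomes.
toy_risk = [0.9, 0.8, 0.3, 0.6, 0.1]
toy_outcome = [1, 1, 0, 0, 0]

print(net_benefit(toy_risk, toy_outcome, threshold=0.5))
print(net_benefit_treat_all(toy_outcome, threshold=0.5))
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero by definition.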

Single-cell expression analysis

The expression patterns of the 43 overlapping genes were examined at the single-cell level using the Seurat package. Data integration from the PMC and GOSH scRNA-seq datasets was achieved through canonical correlation analysis (CCA) to correct for batch effects. Cells were clustered using the Louvain algorithm, and cell types were annotated based on canonical markers. Dot plots were generated to visualize the expression levels of the overlapping genes across different cell clusters, focusing on their distribution in tumor cells versus non-tumor cells.

Immune infiltration analysis

We assessed the tumor immune microenvironment in the training set using three computational algorithms:

  • ESTIMATE: provided stromal and immune scores by analyzing gene expression signatures specific to stromal and immune cells. The ESTIMATE score is the sum of these two scores, reflecting tumor purity;

  • MCPcounter: quantified the abundance of eight immune cell populations and two stromal cell populations using marker gene expression;

  • CIBERSORT: utilized a deconvolution algorithm to estimate the proportions of 22 immune cell types from bulk tumor gene expression data.

The immune cell infiltration levels estimated by these methods were compared between high-risk and low-risk groups using the Wilcoxon rank-sum test.
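The group comparison used for the infiltration estimates can be illustrated with a small rank-sum calculation. The sketch below computes the Mann-Whitney U statistic with average ranks for ties and a two-sided P value from the normal approximation, without tie or continuity correction. These simplifications are assumptions for illustration, so the P value can differ slightly from that returned by R; the function names and toy scores are hypothetical.

```python
import math


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (U statistic for x, z score, two-sided P value).

    Uses the normal approximation without tie or continuity correction.
    """
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u_x - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, z, p_value


# Toy immune scores (hypothetical) for high-risk and low-risk samples.
high_risk_scores = [1.0, 2.0, 3.0]
low_risk_scores = [4.0, 5.0, 6.0]

u, z, p = rank_sum_test(high_risk_scores, low_risk_scores)
print(u, round(z, 3), round(p, 4))
```

With such small groups an exact test would normally be preferred; the normal approximation is shown because it scales to cohorts of the size analyzed here.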

Prediction of immunotherapy response

To predict the potential response to immunotherapy, we employed two computational tools:

  • Immunophenoscore (IPS): obtained from The Cancer Immunome Atlas (TCIA), IPS is a quantitative metric based on the expression of genes related to antigen presentation, effector cells, suppressor cells, and immune checkpoints. Higher IPS values suggest increased immunogenicity and a better response to immune checkpoint inhibitors.

  • Tumor Immune Dysfunction and Exclusion (TIDE): a computational framework that models two primary mechanisms of tumor immune evasion: T-cell dysfunction in tumors with high infiltration of cytotoxic T lymphocytes and T-cell exclusion in tumors with low infiltration. Higher TIDE scores indicate a higher likelihood of immune evasion and potential resistance to immunotherapy.

Statistical analysis

All statistical analyses were conducted using R software (version 4.0.2). Continuous variables were expressed as mean ± standard deviation (SD) or median with interquartile range (IQR) and compared using the Student’s t-test or Wilcoxon rank-sum test, depending on data distribution. Categorical variables were summarized as counts and percentages and compared using the Chi-squared test or Fisher’s exact test. Survival curves were generated using the Kaplan-Meier method, and differences were assessed with the log-rank test. A two-sided P value less than 0.05 was considered statistically significant.

Results

Identification of key genes associated with neuroblastoma

To identify key genes associated with neuroblastoma, we first obtained CRISPR screening data from the Broad Institute’s DepMap database (version 24Q2). We extracted data specific to neuroblastoma, resulting in a selection of 34 neuroblastoma-related cell lines. Genes that exhibited CERES scores less than −1 in at least 80% of these cell lines were considered essential for neuroblastoma cell survival. This criterion led to the identification of 599 candidate genes.

Concurrently, we downloaded neuroblastoma sequencing data (accession code HRA002064) from the NGDC. Differential expression analysis was performed between neuroblastoma and GN groups. Genes with an |logFC| greater than 2 and a P value less than 0.05 were deemed significantly differentially expressed. This analysis yielded 2,032 DEGs.

Intersecting the 599 essential genes from the CRISPR screen with the 2,032 DEGs from the sequencing data resulted in 43 overlapping genes (Figure 1A). To understand the biological significance of these genes, we conducted Gene Ontology (GO) enrichment analysis using the clusterProfiler package. The GO analysis revealed that these genes were significantly enriched in biological processes related to the cell cycle and cellular ubiquitination (Figure 1B), suggesting their involvement in cell proliferation and protein degradation pathways critical for tumor growth.

Figure 1.

Identification of overlapping genes associated with neuroblastoma. (A) Venn diagram showing the overlap between essential genes identified from the Dependency Map (DepMap) CRISPR screen (version 24Q2) and differentially expressed genes from the NGDC sequencing data (accession code HRA002064), resulting in 43 overlapping genes. (B) GO enrichment analysis of the 43 overlapping genes, indicating significant enrichment in cell cycle and cellular ubiquitination processes. (C,D) Heatmaps displaying the expression profiles of the 43 overlapping genes in single-cell RNA sequencing datasets from PMC and GOSH, showing predominant expression in tumor cells. CRISPR, Clustered Regularly Interspaced Short Palindromic Repeats; GO, Gene Ontology; GOSH, Great Ormond Street Hospital; NGDC, National Genomics Data Center; PMC, Princess Máxima Center.

Further exploration of the expression patterns of these 43 genes at the single-cell level was performed using two neuroblastoma single-cell RNA sequencing datasets: PMC and GOSH. Analysis of these datasets demonstrated that the majority of the overlapping genes were predominantly expressed in tumor cells (Figure 1C,1D). This observation underscores the potential role of these genes in neuroblastoma tumorigenesis and highlights them as possible targets for therapeutic intervention.

Construction and validation of a prognostic risk model

To develop a robust prognostic model for neuroblastoma, we utilized the 43 overlapping genes identified from the previous analyses. These genes were subjected to feature selection using a random forest algorithm on the GSE49710 dataset, which comprises gene expression profiles of 498 neuroblastoma patients. The random forest analysis ranked the genes based on their importance scores, and the top six genes with the highest scores were selected (Figure 2A,2B).

Figure 2.

Construction and evaluation of the three-gene prognostic risk model. (A) Feature importance plot from the random forest algorithm used to reduce the number of significant genes on the GSE49710 dataset. (B) Bar graph of the relative importance of the top six genes affecting survival outcomes. (C) Forest plot from the multivariate Cox regression analysis of PKMYT1, CDT1, and NCAPG. (D) Kaplan-Meier survival curves comparing overall survival between high-risk and low-risk groups in the training set, stratified by the median RiskScore of 9.514526. (E) Scatter plot showing the distribution of RiskScores and survival status for each patient in the training set. (F) ROC curves comparing the predictive accuracy of the RiskScore and traditional clinical factors. (G) DCA comparing the net benefit of the RiskScore and traditional clinical factors. (H) Heatmap illustrating the distribution of RiskScores alongside clinical characteristics for each patient. CI, confidence interval; DCA, decision curve analysis; ROC, receiver operating characteristic.

Subsequently, we performed multivariate Cox proportional hazards regression analysis on these six genes to determine their independent prognostic significance. To address potential multicollinearity among the variables, VIFs were calculated, and genes with high VIFs were excluded. This process resulted in the identification of three key genes—PKMYT1, CDT1, and NCAPG—which were retained for model construction (Figure 2C). The prognostic risk score (RiskScore) for each patient was calculated using the following formula derived from the multivariate Cox regression coefficients:

\[
\text{RiskScore} = 0.6123075 \times \text{PKMYT1 expression} + 0.3990650 \times \text{CDT1 expression} + 1.0120016 \times \text{NCAPG expression} \tag{2}
\]

Patients were stratified into high-risk and low-risk groups based on the median RiskScore of 9.514526 within the training set. Kaplan-Meier survival analysis revealed that patients in the high-risk group had significantly poorer overall survival compared to those in the low-risk group (log-rank test, P<0.001; Figure 2D). The distribution of RiskScores and corresponding survival status for each patient is presented in Figure 2E, illustrating that higher RiskScores are associated with increased mortality.

To assess the predictive performance of the RiskScore, time-dependent ROC curves were generated for both short-term (3-year) and long-term (5-year) survival predictions. The AUC values for the RiskScore were higher than those for traditional clinical prognostic factors, including INSS stage, MYCN amplification status, and patient age (Figure 2F).

Furthermore, DCA was performed to evaluate the clinical utility of the RiskScore compared to conventional prognostic factors. The DCA curves demonstrated that our RiskScore model provided the highest net benefit across a range of threshold probabilities for both short-term and long-term survival (Figure 2G), suggesting that it may be more beneficial for guiding clinical decision-making.

Finally, we explored the relationship between the RiskScore and clinical characteristics. A heatmap was generated to visualize the distribution of RiskScores alongside clinical parameters such as INSS stage, MYCN status, and age (Figure 2H). The analysis indicated that the RiskScore is independent of these clinical factors, further supporting its potential as an independent prognostic indicator.

To explore the protein-level expression and subcellular localization of the three model genes, we consulted the Human Protein Atlas and retrieved representative images from breast and lung cancer tissues or cell lines. Immunohistochemistry images for CDT1 and NCAPG showed strong expression in tumor tissues with distinct nuclear and cytoplasmic/membrane localization, respectively (Figure S1A,S1B). For PKMYT1, immunofluorescence analysis in MCF7 cells revealed a perinuclear and Golgi pattern (Figure S1C).

To evaluate the link between transcriptional phenotypes and risk classification, we calculated mesenchymal (MES) and adrenergic (ADRN) single-sample gene set enrichment analysis (ssGSEA) scores for each sample in GSE49710 using gene sets from van Groningen et al. (9). The difference between the two (MES score minus ADRN score, denoted as the M-A score) was significantly higher in the high-risk group and positively correlated with the RiskScore (ρ=0.43, P<2.2e−16), indicating that our model may reflect a MES-like, high-risk cellular state (Figure S2).

External validation of the prognostic risk model

To validate the prognostic performance of our three-gene RiskScore model in an independent cohort, we utilized the E-MTAB-8248 dataset as an external validation set. This dataset comprises gene expression profiles and clinical data from 223 neuroblastoma patients not included in the training set. The RiskScore for each patient in the validation set was calculated using the same formula established in the training set.

Patients were stratified into high-risk and low-risk groups based on the median RiskScore of 9.514526 derived from the training set. Consistent with the training set results, Kaplan-Meier survival analysis demonstrated that patients in the high-risk group had significantly poorer overall survival compared to those in the low-risk group (log-rank test, P<0.001; Figure 3A). The distribution of RiskScores and corresponding survival status for each patient in the validation set is illustrated in Figure 3B. The plot indicates that higher RiskScores are associated with increased mortality, corroborating the findings from the training set.

Figure 3.

External validation of the prognostic risk model. (A) Kaplan-Meier survival curves comparing overall survival between high-risk and low-risk groups in the E-MTAB-8248 validation set (n=223), using the median RiskScore from the training set. (B) Scatter plot showing the distribution of RiskScores and survival status for each patient in the validation set. (C) ROC curves demonstrating the predictive accuracy of the RiskScore. (D) Comparison of AUC values between the RiskScore and traditional clinical factors. (E) DCA comparing the net benefit of the RiskScore and traditional clinical factors in the validation set. AUC, area under the curve; CI, confidence interval; DCA, decision curve analysis; ROC, receiver operating characteristic.

To assess the predictive accuracy of the RiskScore in the validation set, we generated ROC curves. The RiskScore achieved high AUC values, indicating robust predictive performance (Figure 3C). Specifically, the RiskScore outperformed traditional clinical prognostic factors, such as INSS stage, MYCN amplification status, and patient age, demonstrating higher AUC values (Figure 3D).

Furthermore, we compared the clinical utility of the RiskScore with conventional prognostic factors using DCA. The DCA curves revealed that our RiskScore model provided the highest net benefit across a range of threshold probabilities in the validation set (Figure 3E), suggesting superior potential for guiding clinical decision-making.

Lastly, we examined the concordance between the RiskScore and traditional clinical factors in predicting patient outcomes. The analysis confirmed that the RiskScore remained an independent prognostic indicator in the validation cohort, reinforcing its generalizability and reliability.

To explore oncogene-associated expression differences, we analyzed gene expression stratified by MYCN amplification in the GSE49710 cohort, observing significantly higher levels of PKMYT1, CDT1, and NCAPG in MYCN-amplified tumors. ALK mutation data were not available in public patient datasets; thus, we evaluated neuroblastoma cell line RNA-seq data from GSE89413. PKMYT1 and CDT1 were significantly upregulated in MYCN-amplified lines, while NCAPG showed a non-significant trend. No statistically significant differences were observed between ALK-mutant and wild-type lines (Figure S3A-S3C).

Immune infiltration and predicted immunotherapy response between high-risk and low-risk groups

To explore the immunological differences associated with the RiskScore, we analyzed the immune infiltration levels and predicted immunotherapy responses between the high-risk and low-risk groups in the training set. Three computational algorithms—ESTIMATE, MCPcounter, and CIBERSORT—were employed to assess the tumor immune microenvironment of each sample.

The analysis revealed that the high-risk group exhibited significantly lower levels of immune cell infiltration compared to the low-risk group (Figure 4A). Specifically, there was a marked decrease in the infiltration of CD8+ T cells and cytotoxic T lymphocytes in the high-risk group, which are critical components of the anti-tumor immune response. This suggests that tumors in the high-risk group may possess an immunosuppressive microenvironment that facilitates tumor progression.

Figure 4.

Immune infiltration and predicted immunotherapy response between high-risk and low-risk groups. (A) Heatmap comparing immune cell infiltration levels between high-risk and low-risk groups, assessed by ESTIMATE, MCPcounter, and CIBERSORT algorithms. (B) Box plot illustrating the TIDE scores of high-risk and low-risk groups. (C) Box plot showing the IPS of high-risk and low-risk groups. (D) Bar chart depicting the relationship between RiskScore and predicted immunotherapy response rates. (E) Bar chart summarizing the predicted immunotherapy response rates in high-risk and low-risk groups. ***, P<0.001. CIBERSORT, Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts; ESTIMATE, Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data; IPS, Immunophenoscore; MCPcounter, Microenvironment Cell Populations counter; NR, non-responder; R, responder; TIDE, Tumor Immune Dysfunction and Exclusion.

To further investigate the potential for immunotherapy benefit, we utilized the IPS and TIDE algorithms to predict the immunogenicity and immunotherapy responsiveness of the samples. The high-risk group demonstrated significantly higher TIDE scores (Figure 4B), indicating a higher likelihood of immune evasion mechanisms and reduced sensitivity to immune checkpoint inhibitors. Conversely, the low-risk group showed higher IPS values (Figure 4C), suggesting a more favorable immunogenic profile and a greater potential to benefit from immunotherapy.

We also leveraged the TIDE web platform to predict immunotherapy response rates within the training set. The analysis revealed an inverse relationship between the RiskScore and predicted immunotherapy responsiveness. As shown in Figure 4D, patients with higher RiskScores were less likely to respond to immunotherapy. A summary of the predicted response rates indicated that only 36% of patients in the high-risk group were predicted to respond to immunotherapy, compared to 70% in the low-risk group (Figure 4E). These findings imply that the low-risk group, characterized by higher immune infiltration and lower immunosuppression, may derive more significant clinical benefits from immunotherapy.

Discussion

In this study, we developed a novel prognostic model for neuroblastoma by integrating CRISPR screening data from the DepMap database with gene expression profiles from patient samples. Neuroblastoma, a malignancy arising from neural crest cells, continues to pose significant clinical challenges due to its heterogeneous nature and variable clinical outcomes (10). Despite advancements in multimodal therapies, the prognosis for high-risk neuroblastoma remains poor, underscoring the need for reliable prognostic markers and targeted therapies.

Our approach began with the identification of 43 overlapping genes that were both essential for neuroblastoma cell survival and differentially expressed between neuroblastoma and GN tissues. The essential genes were determined based on CERES scores from CRISPR screens, identifying genes whose knockout significantly impairs cell viability. The DEGs were obtained from sequencing data, focusing on those with significant expression changes (|logFC| >2, P<0.05). GO enrichment analysis of these overlapping genes revealed significant involvement in cell cycle regulation and cellular ubiquitination processes, which are critical pathways in tumorigenesis (11). The single-cell RNA sequencing analysis further demonstrated that these genes were predominantly expressed in tumor cells, suggesting their direct involvement in neuroblastoma pathology. This finding aligns with previous studies highlighting the importance of cell cycle regulators and ubiquitination in cancer progression (12-14).

Through random forest analysis and multivariate Cox regression, we narrowed the list to three key genes: PKMYT1, CDT1, and NCAPG. PKMYT1 is a kinase that regulates the cell cycle by inhibiting the activation of cyclin-dependent kinase 1 (CDK1), thereby controlling the G2/M transition (15,16). Previous studies have implicated PKMYT1 in cancer cell proliferation and survival (16-18). CDT1 is essential for DNA replication initiation, and its dysregulation can lead to genomic instability, a hallmark of cancer (19). NCAPG is a component of the condensin complex involved in chromosome condensation during mitosis, and its overexpression has been associated with poor prognosis in various cancers (20). The inclusion of these genes in our prognostic model underscores their potential as therapeutic targets.

The prognostic model demonstrated robust predictive performance in both the training and validation cohorts. In the training set (GSE49710, n=498), the RiskScore effectively stratified patients into high-risk and low-risk groups with significantly different survival outcomes (log-rank test, P<0.001). Importantly, the model outperformed traditional clinical factors such as INSS stage, MYCN status, and patient age. The external validation using the E-MTAB-8248 dataset (n=223) confirmed the model’s generalizability, further highlighting its potential clinical utility.
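As a rough illustration of how a three-gene RiskScore can stratify patients, the sketch below assumes the score is the usual Cox linear predictor (a weighted sum of gene expression values) and that patients are split at the cohort median. Both are assumptions for illustration: the coefficients are placeholders rather than the values fitted in this study, and the cutoff actually used may differ.

```python
from statistics import median

# Placeholder coefficients for illustration only; NOT the fitted values.
HYPOTHETICAL_COEFS = {"PKMYT1": 0.4, "CDT1": 0.3, "NCAPG": 0.2}


def risk_score(expression, coefs=HYPOTHETICAL_COEFS):
    """Weighted sum of expression values (Cox-style linear predictor)."""
    return sum(coef * expression[gene] for gene, coef in coefs.items())


def stratify(scores, cutoff=None):
    """Label each patient 'high' or 'low' relative to `cutoff`.

    If no cutoff is given, the cohort median is used (an assumption).
    """
    if cutoff is None:
        cutoff = median(scores.values())
    return {
        patient: ("high" if score > cutoff else "low")
        for patient, score in scores.items()
    }


# Toy expression values (made up) for four hypothetical patients.
toy_patients = {
    "P1": {"PKMYT1": 1.0, "CDT1": 1.0, "NCAPG": 1.0},
    "P2": {"PKMYT1": 3.0, "CDT1": 2.0, "NCAPG": 2.0},
    "P3": {"PKMYT1": 2.0, "CDT1": 1.0, "NCAPG": 1.0},
    "P4": {"PKMYT1": 4.0, "CDT1": 3.0, "NCAPG": 3.0},
}

scores = {patient: risk_score(expr) for patient, expr in toy_patients.items()}
groups = stratify(scores)
print(groups)
```

The resulting group labels would then feed a Kaplan-Meier comparison with a log-rank test, which is not shown here.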

One notable finding of our study is the association between the RiskScore and the tumor immune microenvironment. High-risk patients exhibited significantly lower levels of immune cell infiltration, particularly CD8+ T cells and cytotoxic T lymphocytes, as assessed by the ESTIMATE, MCPcounter, and CIBERSORT algorithms. This immunosuppressive microenvironment may facilitate tumor progression and resistance to therapy (21). Conversely, low-risk patients demonstrated higher immune infiltration, suggesting a more active anti-tumor immune response.

The IPS and TIDE analyses provided insights into the potential responsiveness to immunotherapy. High-risk patients had higher TIDE scores and lower IPS values, indicating increased immune evasion mechanisms and reduced likelihood of benefiting from immune checkpoint inhibitors. In contrast, low-risk patients had a higher predicted response rate to immunotherapy (70% vs. 36%), suggesting that they may derive more benefit from such treatments. These findings are particularly relevant given the limited success of immunotherapy in neuroblastoma to date (8). Our model could aid in identifying patients who are more likely to respond to immunotherapy, thereby personalizing treatment strategies.

The integration of functional genomic data with clinical outcomes represents a significant advancement in neuroblastoma research. Previous studies have primarily focused on genetic mutations and amplifications, such as MYCN amplification and ALK mutations (4,22,23). While these markers are valuable, they do not fully capture the complexity of neuroblastoma biology. Our study incorporates gene dependency data, which reflects the functional importance of genes in cancer cell survival, providing a more dynamic understanding of tumor biology.

This study has several limitations. First, although we validated our model in an external cohort, prospective clinical trials are necessary to confirm its prognostic value and clinical applicability. Second, the mechanisms by which PKMYT1, CDT1, and NCAPG contribute to neuroblastoma progression and immune evasion were not explored; functional experiments are needed to elucidate their roles and validate them as therapeutic targets. Third, the immunotherapy response predictions are based on computational algorithms and require validation in clinical settings.

Future research should focus on investigating the biological functions of the identified genes in neuroblastoma and their interactions with the immune microenvironment. Additionally, incorporating other omics data, such as proteomics and metabolomics, may provide a more comprehensive understanding of neuroblastoma pathogenesis. Personalized treatment approaches based on the RiskScore could be developed, potentially improving outcomes for patients with neuroblastoma.

Conclusions

In conclusion, we developed and validated a three-gene prognostic model that effectively stratifies neuroblastoma patients by survival risk and correlates with the tumor immune microenvironment. This model has the potential to inform prognosis and guide personalized therapeutic strategies, including immunotherapy, thereby contributing to improved management of neuroblastoma.

Supplementary

The article’s supplementary files:

tcr-14-07-4058-rc.pdf (203.8KB, pdf)
tcr-14-07-4058-coif.pdf (257.5KB, pdf)
DOI: 10.21037/tcr-2024-2472

Acknowledgments

None.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Footnotes

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-2024-2472/rc

Funding: This work was supported by grants from the Tianjin Health Technology Project (Grant No. 2022QN106) to X.L.

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2024-2472/coif). The authors have no conflicts of interest to declare.

References

  • 1. Park JR, Eggert A, Caron H. Neuroblastoma: biology, prognosis, and treatment. Pediatr Clin North Am 2008;55:97-120, x. doi: 10.1016/j.pcl.2007.10.014
  • 2. Maris JM. Recent advances in neuroblastoma. N Engl J Med 2010;362:2202-11. doi: 10.1056/NEJMra0804577
  • 3. London WB, Castel V, Monclair T, et al. Clinical and biologic features predictive of survival after relapse of neuroblastoma: a report from the International Neuroblastoma Risk Group project. J Clin Oncol 2011;29:3286-92. doi: 10.1200/JCO.2010.34.3392
  • 4. Pugh TJ, Morozova O, Attiyeh EF, et al. The genetic landscape of high-risk neuroblastoma. Nat Genet 2013;45:279-84. doi: 10.1038/ng.2529
  • 5. Srivastava A, Creek DJ. Discovery and Validation of Clinical Biomarkers of Cancer: A Review Combining Metabolomics and Proteomics. Proteomics 2019;19:e1700448. doi: 10.1002/pmic.201700448
  • 6. Meyers RM, Bryan JG, McFarland JM, et al. Computational correction of copy number effect improves specificity of CRISPR-Cas9 essentiality screens in cancer cells. Nat Genet 2017;49:1779-84. doi: 10.1038/ng.3984
  • 7. Dallavalasa S, Beeraka NM, Basavaraju CG, et al. The Role of Tumor Associated Macrophages (TAMs) in Cancer Progression, Chemoresistance, Angiogenesis and Metastasis - Current Status. Curr Med Chem 2021;28:8203-36. doi: 10.2174/0929867328666210720143721
  • 8. Anderson J, Majzner RG, Sondel PM. Immunotherapy of Neuroblastoma: Facts and Hopes. Clin Cancer Res 2022;28:3196-206. doi: 10.1158/1078-0432.CCR-21-1356
  • 9. van Groningen T, Koster J, Valentijn LJ, et al. Neuroblastoma is composed of two super-enhancer-associated differentiation states. Nat Genet 2017;49:1261-6. doi: 10.1038/ng.3899
  • 10. Whittle SB, Smith V, Doherty E, et al. Overview and recent advances in the treatment of neuroblastoma. Expert Rev Anticancer Ther 2017;17:369-86. doi: 10.1080/14737140.2017.1285230
  • 11. Sellers WR. A blueprint for advancing genetics-based cancer therapy. Cell 2011;147:26-31. doi: 10.1016/j.cell.2011.09.016
  • 12. Pagano M, Tam SW, Theodoras AM, et al. Role of the ubiquitin-proteasome pathway in regulating abundance of the cyclin-dependent kinase inhibitor p27. Science 1995;269:682-5. doi: 10.1126/science.7624798
  • 13. Dang F, Nie L, Wei W. Ubiquitin signaling in cell cycle control and tumorigenesis. Cell Death Differ 2021;28:427-38. doi: 10.1038/s41418-020-00648-0
  • 14. Dagar G, Kumar R, Yadav KK, et al. Ubiquitination and deubiquitination: Implications on cancer therapy. Biochim Biophys Acta Gene Regul Mech 2023;1866:194979. doi: 10.1016/j.bbagrm.2023.194979
  • 15. Maton G, Thibier C, Castro A, et al. Cdc2-cyclin B triggers H3 kinase activation of Aurora-A in Xenopus oocytes. J Biol Chem 2003;278:21439-49. doi: 10.1074/jbc.M300811200
  • 16. Zhang QY, Chen XQ, Liu XC, et al. PKMYT1 Promotes Gastric Cancer Cell Proliferation and Apoptosis Resistance. Onco Targets Ther 2020;13:7747-57. doi: 10.2147/OTT.S255746
  • 17. Zhang Q, Zhao X, Zhang C, et al. Overexpressed PKMYT1 promotes tumor progression and associates with poor survival in esophageal squamous cell carcinoma. Cancer Manag Res 2019;11:7813-24. doi: 10.2147/CMAR.S214243
  • 18. Li H, Wang L, Zhang W, et al. Overexpression of PKMYT1 associated with poor prognosis and immune infiltration may serve as a target in triple-negative breast cancer. Front Oncol 2022;12:1002186. doi: 10.3389/fonc.2022.1002186
  • 19. Petropoulos M, Champeris Tsaniras S, Taraviras S, et al. Replication Licensing Aberrations, Replication Stress, and Genomic Instability. Trends Biochem Sci 2019;44:752-64. doi: 10.1016/j.tibs.2019.03.011
  • 20. Xiao C, Gong J, Jie Y, et al. NCAPG Is a Promising Therapeutic Target Across Different Tumor Types. Front Pharmacol 2020;11:387. doi: 10.3389/fphar.2020.00387
  • 21. Fridman WH, Zitvogel L, Sautès-Fridman C, et al. The immune contexture in cancer prognosis and treatment. Nat Rev Clin Oncol 2017;14:717-34. doi: 10.1038/nrclinonc.2017.101
  • 22. Carpenter EL, Mossé YP. Targeting ALK in neuroblastoma--preclinical and clinical advancements. Nat Rev Clin Oncol 2012;9:391-9. doi: 10.1038/nrclinonc.2012.72
  • 23. Rosswog C, Fassunke J, Ernst A, et al. Genomic ALK alterations in primary and relapsed neuroblastoma. Br J Cancer 2023;128:1559-71. doi: 10.1038/s41416-023-02208-y

    Articles from Translational Cancer Research are provided here courtesy of AME Publications