Summary
Diffuse Large B-Cell Lymphoma (DLBCL) is a heterogeneous disease in which a subset of patients exhibits treatment resistance and poor prognosis. Genomic assays have been widely employed to identify high-risk individuals characterized by rearrangements in the MYC, BCL2, and BCL6 genes. These patients typically undergo more aggressive treatment; however, their outcomes still vary considerably. This study introduces a MYC Signature Score (MYCSS) derived from gene expression profiles and specifically designed to evaluate MYC overactivation in DLBCL patients. MYCSS was validated across several independent cohorts to assess its ability to stratify patients based on MYC-related genetic and molecular aberrations. Our results indicate that MYCSS significantly refines prognostic accuracy beyond that of conventional MYC biomarkers focused on genetic aberrations. More importantly, we found that nearly 50% of patients identified as high-risk by traditional MYC metrics actually share similar survival prospects with those having no MYC aberrations. These patients may benefit from standard GCB-based therapies rather than more aggressive treatments. MYCSS provides a robust signature for identifying high-risk patients, aiding precision treatment while minimizing potential overtreatment in DLBCL.
Keywords: Diffuse Large B-Cell Lymphoma, MYC Rearrangement, Gene Signatures, Prognostic Stratification
Graphical Abstract

Clinically, high-risk Diffuse Large B-Cell Lymphoma (DLBCL) patients are identified through genomic assays detecting MYC, BCL2, and BCL6 gene rearrangements or MYC overexpression. This study constructs a MYC signature using MYC mutation/amplification status and RNA-Seq data. Validation of the MYC signature score (MYCSS) in independent datasets reveals that approximately 50% of patients classified as high-risk by traditional MYC-related features or novel signatures such as MHG/DZsig exhibit survival rates similar to those without MYC aberrations. This signature can be applied in combination with MYC genomic assays to improve personalized treatment while minimizing overtreatment in DLBCL.
Background
Diffuse Large B-Cell Lymphoma (DLBCL), the most common type of non-Hodgkin’s lymphoma, is an aggressive cancer that arises from B cells. Treatment with a rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone regimen (R-CHOP) has been the therapeutic standard for almost twenty years, achieving a cure in 50 to 70% of patients.1 Although many patients with DLBCL have favorable disease outcomes and experience remission with first-line therapy, nearly half of all patients experience disease progression, and approximately 25% of DLBCL patients die from their illness.2 Of the ~40% of patients who experience refractory disease after treatment or disease relapse, around 80% will ultimately die from lymphoma, despite further treatment with salvage chemotherapy or autologous stem cell transplantation (ASCT).3 In recent years, research efforts in the field have focused on improving the R-CHOP regimen, including newer regimens such as DA-R-EPOCH (dose-adjusted rituximab, etoposide, prednisone, vincristine, cyclophosphamide, and doxorubicin), and personalized therapy is especially important for patients with high-risk tumor genetics or poor prognostic markers.1
DLBCL is a highly heterogeneous disease with well-recognized biological, pathological, and clinical variability between patients.3 Numerous studies have attempted to classify disease severity and identify high-risk patients within DLBCL’s heterogeneous disease landscape.4–6 The first major stride in classification was the identification of gene expression-based molecular subtypes for risk stratification and response prediction in DLBCL. The cell-of-origin (COO) signature categorizes DLBCL samples into activated B-cell-like (ABC) and germinal center B-cell-like (GCB) subtypes.7 ABC DLBCL is associated with markedly worse outcomes than GCB DLBCL when patients receive standard chemoimmunotherapy.7 ABC and GCB DLBCL are thought to have unique disease mechanisms underlying their variable responses, and understanding an individual patient’s subtype can assist with patient prognostication and treatment selection.7,8 Another gene-based classifier has been developed to identify a molecular high-grade (MHG) subtype, which consists mostly of samples from the GCB COO class and is associated with poor prognosis.9,10 Additionally, the double-hit signature (DHITsig), which was recently renamed the Dark-Zone signature11 (DZsig), was originally developed by Ennishi et al.12 to identify DLBCL patients with poor prognoses whose gene expression characteristics were similar to those of high-grade B-cell lymphoma patients with double-hit mutations. In a recent clinical study, DZsig-positive patients demonstrated a poorer prognosis than both ABC and GCB DLBCL patients.13
Furthermore, DLBCL patient prognoses can also be differentiated based on the presence of key chromosomal abnormalities. Notably, the rearrangement of MYC, which encodes a “master” transcription factor that regulates cell growth and division across multiple biological pathways, has been frequently observed in DLBCL.14 DLBCL with MYC and BCL2 and/or BCL6 translocations (double-hit and triple-hit, respectively) is associated with a poor prognosis and thereby defined as high-grade B-cell lymphoma according to WHO classification.15 Interestingly, some DLBCLs exist in which MYC and BCL2 genes are overexpressed at the protein level but without genetic rearrangements.16 Currently, it has become a routine practice to detect the rearrangement of MYC, BCL2, and BCL6 by fluorescence in situ hybridization (FISH). DLBCL patients with MYC double/triple hits typically receive more intensive therapeutic treatment, and MYC gene rearrangement is one of the most powerful predictors of an adverse outcome while on R-CHOP therapy.17 In addition to rearrangement, MYC can also be activated via other types of MYC aberrations, including MYC gene mutations and amplification.18
In this study, we developed a MYC gene signature to quantify the MYC activity of DLBCL samples based on their gene expression profiles. Our MYC signature can effectively distinguish good versus poor prognostic subgroups in DLBCL patients with MYC aberrations or high-risk subtypes such as MHG or DZsig-positive. Importantly, our results indicate that at least 50% of DLBCL patients with aberrant MYC genes or double-hit rearrangements do not have a poor prognosis. Instead, they have comparable survival to patients without MYC aberrations. Further, the MYC signature can better predict patient sensitivity to chemotherapy in both R-CHOP and RB-CHOP regimens. The signature provides additional prognostic value in each DLBCL molecular subtype and has great potential to improve personalized treatment.
Methods
Dataset collection
In this study, we utilized four datasets, one training dataset to develop the MYC signature and three validation datasets, including a total of 3,035 DLBCL patients (Supplementary Table S1). The Reddy dataset19 was obtained from the European Genome-phenome Archive (EGA) (https://ega-archive.org/) under the accession number EGAS00001002606. This dataset provides RNA-seq data, whole-exome sequencing data, and clinical information. The somatic mutations and copy number variations for 150 cancer driver genes were reported for samples with whole-exome sequencing. From this dataset, we selected a total of 604 samples with complete gene expression, MYC mutation/amplification, and survival data for defining a MYC gene signature. The signature can be applied to independent gene expression datasets to determine the MYC activity of DLBCL samples based on their gene expression profiles.
The expression profiles and clinical data of the Sha10, Lacy20, and Frei21,22 datasets were downloaded from the Gene Expression Omnibus (GEO) database (RRID:SCR_005012) (https://www.ncbi.nlm.nih.gov/geo/) under accession numbers GSE117556, GSE181063, and GSE31312, respectively, including 924, 1,037, and 470 DLBCL patients, as the validation datasets for evaluating the prediction of prognosis or treatment response by MYC signature score.
Constructing the MYC signature
To construct the weighted MYC signature that accounted for any pathway-level dysregulation in gene expression, we leveraged the selected Reddy dataset, which comprises gene expression and MYC variation data for 604 samples. This genetic signature represents the impact of MYC gene alterations on the expression levels across all genes. We utilized the MYC nonsynonymous mutation and amplification status reported by Reddy et al., which identified 32 samples with either MYC nonsynonymous mutation (n=24), MYC amplification (n=7) or both (n=1) (Supplementary Table S2). Notably, all except one of the 25 nonsynonymous MYC mutations were also identified in a separate study by Dreval et al.35 The expression profiles of these 32 DLBCL samples were compared with those from MYC wild-type samples (without MYC mutation or amplification events regardless of aberrations in other genes). For each gene, the following multiple linear regression model was implemented:
Gene expression ~ MYC mutation status + IPI + COO
in which MYC mutation status is a binary variable with 1 indicating the presence of a non-synonymous MYC mutation or amplification and 0 indicating its absence. IPI (International Prognostic Index)23 indicates the risk stratification indicator based on the patient’s clinical information, and COO (Cell-of-Origin)24 indicates the DLBCL subtype of GCB (germinal center B cell-like), ABC (activated B cell-like), or UNC (unclassified) according to gene expression profiles. Based on the model, the coefficient (beta) and the statistical significance (P-value) of the MYC mutation status term were calculated for all genes, indicating their association with MYC mutation status. Following that, two separate weight vectors were created: an upregulated weight (wup) and a downregulated weight (wdown). For each gene, when beta > 0, wup = −log(P-value) and wdown = 0. Alternatively, when beta < 0, wdown = −log(P-value) and wup = 0. To eliminate extreme values before downstream analysis, all weights above 10 were trimmed to 10. The minimum weight was then subtracted from each vector and the result divided by the range, thus rescaling wup and wdown to values between 0 and 1. The 40 top-weighted genes (cutoff > 0.5) from the differential expression analysis used to construct the MYCSS are listed in Supplementary Table S3.
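The weighting scheme described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `myc_signature_weights`, the use of a base-10 logarithm (the text states only "−log(P-value)"), and the toy coefficients and P-values are all assumptions made for illustration. The per-gene regression itself is taken as already fitted.

```python
import math

def myc_signature_weights(betas, p_values, cap=10.0):
    """Turn per-gene regression results into up/down weights scaled to [0, 1].

    betas    -- coefficient of the MYC mutation status term for each gene
    p_values -- matching P-values (must be > 0)
    cap      -- weights above this value are trimmed to it (10 in the text)
    """
    w_up, w_down = [], []
    for beta, p in zip(betas, p_values):
        # Assumption: base-10 log; the source does not state the base.
        strength = min(-math.log10(max(p, 1e-300)), cap)
        w_up.append(strength if beta > 0 else 0.0)
        w_down.append(strength if beta < 0 else 0.0)

    def rescale(weights):
        # Subtract the minimum and divide by the range (min-max scaling).
        lo, hi = min(weights), max(weights)
        if hi == lo:
            return [0.0 for _ in weights]
        return [(w - lo) / (hi - lo) for w in weights]

    return rescale(w_up), rescale(w_down)

# Hypothetical values for four genes, for illustration only.
toy_betas = [0.8, -0.5, 0.1, -1.2]
toy_p_values = [1e-12, 1e-4, 0.5, 1e-6]
w_up, w_down = myc_signature_weights(toy_betas, toy_p_values)
print(w_up)
print(w_down)
```

In this toy run the first gene has a raw weight of 12, which is trimmed to 10 before scaling, so it receives the maximum upregulated weight. Genes with a negative coefficient contribute only to the downregulated vector.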
Calculation of MYC signature score and identification of DZsig and MHG in DLBCL
With the MYC signature defined above, sample-specific MYC signature scores (MYCSS) were calculated for patients in all datasets using a modified gene set enrichment analysis method termed Binding Associated with Sorted Expression, or BASE,25 as previously described26 (Supplementary Method). Additionally, the detailed methods used to identify DZsig-positive and MHG subtype patients are provided in the Supplementary Method.
Results
Novel MYC signature score accurately predicts patient survival
The Reddy dataset offers paired MYC somatic mutation data and detailed gene expression profiles from samples obtained from 604 patients with DLBCL. Using this dataset, we defined a MYC signature that captures the gene expression changes in other pathway genes and used it to compute MYC signature scores (MYCSS) for prognosis prediction. Establishing a MYC signature score accounts for the downstream pathway inactivation or overactivation that may result from any MYC perturbation, either from a direct mutation or any alternate mechanism (Figure 1A). As a validation, the MYCSS distinguished DLBCL samples with MYC translocation from those without among 67 samples with known MYC translocation status determined by FISH assay (Figure S1A). Next, we further validated the performance of our score in the setting of MYC mutations and rearrangements.
Figure 1. MYC signature predicts patient survival more accurately than MYC features.

(A) Framework for constructing the MYC signature and the overall design of this study. (B) MYC signature scores (MYCSS) are significantly different between samples with positive versus negative MYC features, MYC mutation, rearrangement, and IHC MYC expression. (C) Patients with high-MYCSS (MYCSS-Hi) and low-MYCSS (MYCSS-Lo) show significantly different overall survival (OS) in the Sha dataset. (D) Conventional methods for MYC-based risk stratification demonstrating overall survival benefit based on MYC mutation, rearrangement, genetic aberration status, mRNA expression, and protein expression via IHC. Of note, the MYCSS provides more accurate risk stratification than MYC features. The Sha dataset was used for all analyses.
Leveraging data from 924 patients in the Sha dataset, we found that the MYCSS directly correlates with and can predict MYC mutation status, rearrangement status, and expression changes as determined by immunohistochemistry (IHC). Patients without a MYC aberration (i.e., without a mutation or rearrangement event) have a significantly lower MYCSS than patients with a MYC aberration (Figure 1B). Furthermore, our dichotomous MYCSS can accurately predict overall patient survival: patients with higher MYC signature scores (MYCSS-Hi) have a significantly poorer prognosis than patients with lower MYC signature scores (MYCSS-Lo) (P = 3e-8, Figure 1C). While most conventional MYC-based prognostic measures, such as mutation status, rearrangement status, other MYC genetic aberrations, and RNA expression, are significantly associated with patient survival, the dichotomous MYCSS improves overall survival (OS) and progression-free survival (PFS) prediction over each of these methods (Figure 1D; Figure S1B-C).
Validation of MYC signature score in independent datasets confirms the accuracy of survival prediction
After validating the performance of the MYCSS against key MYC-related variables in the Sha dataset, we wanted to evaluate further the ability of our dichotomous MYCSS to predict overall survival in two additional and independent datasets. In the Reddy and Lacy datasets, a significant decrease in overall survival in the high MYCSS cohorts compared to the low MYCSS cohorts is observed (Figure 2A–B, respectively). As a next step, we evaluated the performance of the MYCSS in the context of existing molecular signatures, such as cell-of-origin (COO) groups (ABC and GCB). In a multivariable Cox model of the Sha dataset and after adjusting for additional clinical factors and conventional prognostic models, the MYCSS remained the most significant predictor of overall survival (Figure 2C). This trend is consistent for PFS in the Sha dataset and OS in the Reddy and Lacy datasets (Figure S2A-C). Furthermore, using a univariate Cox regression of the MYCSS in COO groups, the score is prognostic within both the ABC and GCB subgroups for each of the three analyzed datasets (Figure 2D), and the MYCSS remains the most significant MYC-based predictor of OS and PFS when analyzed in multiple multivariable models (Figure S2D). The Sha dataset includes patients treated with R-CHOP or RB-CHOP. When stratified by the treatment, MYCSS was significantly associated with patient prognosis after adjusting for MYC RNA expression, COO, IPI, LDH, and age (Figure S2E).
Figure 2. Validation of MYCSS in independent datasets confirms accuracy of survival prediction.

(A-B) Kaplan-Meier survival curves demonstrating overall survival in MYCSS-Hi versus MYCSS-Lo patients in the Reddy dataset (A) and the Lacy dataset (B). Higher MYCSS is associated with lower overall survival rates. (C) Multivariable Cox regression analysis identifies MYCSS as the only significant MYC-related variable after adjusting for established clinical factors in the Sha dataset. The other MYC-related variables are MYC mRNA expression, MYC rearrangement, and MYC protein expression from IHC. (D) MYCSS is predictive of patient survival in both ABC and GCB subtypes. Results are from univariate Cox regression models using MYCSS as a continuous variable.
MYC signature score identifies high- versus low-risk patients, even in patients with genetic MYC aberrations
Next, we sorted all DLBCL samples in decreasing order of their MYCSS and examined the distribution of samples with MYC genetic alterations. As shown, samples with MYC rearrangements, mutations, positive IHC, or double-hits tend to have a higher MYCSS (Figure 3A). However, some samples with high MYCSS do not have any MYC alterations, suggesting that other mechanisms lead to MYC overactivation. Furthermore, samples harboring the same type of MYC genetic alteration varied in their MYCSS, which can only partially (R2 = 0.28, Figure S3A) be explained by their variation in MYC mRNA expression. For samples with non-synonymous MYC mutations, the difference in their MYCSS might indicate the functional impact of different mutations. Based on the functional impact predicted by the SIFT score, we separated MYC-mutated samples from the Reddy and Sha datasets into two groups, those with tolerated mutations and those with deleterious mutations27 (Figure S3B). The deleterious group demonstrated a significantly higher MYCSS than the tolerated group (Figure 3B).
Figure 3. MYCSS further stratifies patients into high- versus low-risk subgroups within high-risk patients selected by MYC aberrations.

(A) Heatmap demonstrating MYCSS (red = MYCSS-Hi, blue = MYCSS-Lo, white = missing value or not tested) in the context of MYC rearrangements, MYC mutations, MYC IHC, and double-hits (DH, MYC/BCL2 rearranged). (B) Boxplot demonstrating that MYCSS predicts the functional impact of MYC mutations. A significantly higher MYCSS was observed in samples with deleterious mutations than in those with tolerated mutations in the MYC gene. (C) Kaplan-Meier survival curves comparing the overall survival of three groups (no event of interest, event of interest with MYCSS-Lo, and event of interest with MYCSS-Hi) across four conditions (MYC event, MYC mutation, MYC rearrangement, and double-hit). DH-positive patients harbor both MYC and BCL2 rearrangements, whereas DH-negative patients do not harbor both. Of note, the MYCSS-Hi group shows significantly poorer survival than the MYCSS-Lo group within the event-of-interest (high-risk) patients selected by the four MYC-related features. The HRs between Pos-MYCSS-Lo and Neg, and between Pos-MYCSS-Hi and Pos-MYCSS-Lo, are as follows: MYC-event: 0.52 and 10.39; MYC-mutation: 0.92 and 5.68; MYC-rearrangement: 0.43 and 19.22; Double-Hit: 0 and >100.
MYC alterations, especially rearrangement or double-hit, have been used to identify high-risk DLBCL patients for treatment stratification.17 Considering that these patients still vary considerably in their MYCSS, we tested whether our score could further distinguish high- versus low-risk patients within the subset selected based on a MYC event, including mutation, rearrangement, and double-hit. MYC mRNA expression levels do not explain the MYCSS differences in patients with MYC rearrangements (R2 = 0.002, Figure S4A and S4B). When we stratified DLBCL samples with a MYC event (mutation or rearrangement) based on their MYCSS, the high-score subgroup showed a significantly worse prognosis than the low-score subgroup (P = 2e-4) (Figure 3C). Of note, the prognosis of the low-score subgroup is comparable with that of patients free of any MYC event, suggesting that at least 50% of patients with a MYC event may have less aggressive disease. For patients selected by MYC mutation, rearrangement, or double-hit, the MYCSS can successfully distinguish high-risk versus low-risk patients (Figure 3C). Progression-free survival results show a similar pattern to overall survival in the Sha dataset (Figure S4C-F). These results also hold true for OS in the Reddy dataset, although the relationship could be less statistically significant due to the limited number of samples with MYC mutation or translocation (Figure S4G-I). In other words, the MYCSS separates high-risk from low-risk patients among those who have a MYC event, mutation, rearrangement, or double-hit. Among patients with various high-risk MYC events, the MYCSS can identify those whose prognosis is comparable to that of patients without a MYC event, which can help prevent overtreatment.
MYC signature score accurately quantifies MYC regulatory activity in DLBCL
We also examined the relationship between the MYCSS and MYC expression (Figure S5A). Although they are significantly correlated, the expression of the MYC gene can only explain 16% of the variation of MYCSS (R2 = 0.16) (Figure 4A). In patients with high MYC expression, the MYCSS still retains the ability to differentiate between high-risk and low-risk patients (P = 8e-6, Figure 4B, S5B-C). Within the high MYC-RNA group, the MYCSS can identify low-risk patients whose outcomes are similar to those of the low MYC-RNA group. In patients with high MYC expression, the MYCSS remains the most significant feature in a multivariable Cox model that includes other relevant clinical parameters (Figure S5D). To investigate the differences in gene expression between the MYCSS-Hi and MYCSS-Lo patients in the cohort with high MYC expression, gene set enrichment analysis (GSEA) was performed using hallmark pathways from the Molecular Signatures Database (MSigDB). Our results indicated that MYC target genes have significantly higher expression levels in the MYCSS-Hi than the MYCSS-Lo patients (Figure 4C), suggesting a higher regulatory activity of MYC in samples with high MYCSS (despite their similar MYC expression levels). Finally, the MYCSS differentiated between high- and low-risk patients in a cohort with positive MYC protein expression as determined by IHC (P = 0.002, Figure 4D) while IHC status alone could not significantly predict OS (Figure S5E). The same applies to PFS in the Sha dataset and OS in the Reddy dataset (Figure S5F-G). Our results indicate that the MYCSS can accurately quantify MYC regulatory activity in DLBCL samples and thus provide critical prognostic value in addition to MYC alterations, mRNA, and protein expression.
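As a reading aid for the "variance explained" statements in this section, the following minimal sketch computes R2 for a simple linear regression of one variable on another, which equals the squared Pearson correlation. The function name `r_squared` and the toy vectors are hypothetical and are not taken from the study data.

```python
def r_squared(x, y):
    """Fraction of the variance in y explained by a simple linear fit on x.

    For a single predictor this equals the squared Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return (s_xy * s_xy) / (s_xx * s_yy)

# Hypothetical toy vectors: a perfectly linear pair and an uncorrelated pair.
expression = [1.0, 2.0, 3.0, 4.0]
print(r_squared(expression, [2.0, 4.0, 6.0, 8.0]))
print(r_squared(expression, [1.0, -1.0, -1.0, 1.0]))
```

An R2 of 0.16, as reported for MYC mRNA expression versus MYCSS, therefore means that 84% of the variation in the score is not accounted for by a linear dependence on MYC expression.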
Figure 4. MYCSS accurately quantifies MYC regulatory activity in DLBCL.

(A) Scatterplot demonstrating the relationship between MYC RNA expression and MYCSS. Of note, MYC mRNA expression explains only 16% of the MYCSS variation. (B) Kaplan-Meier survival curve investigating overall survival in patients with high MYC RNA expression stratified by MYCSS. The HRs between mRNA-Hi & MYCSS-Lo and mRNA-Lo, and between mRNA-Hi & MYCSS-Hi and mRNA-Hi & MYCSS-Lo, are 0.88 and 3.05, respectively. (C) GSEA indicates that MYC target genes tend to have higher expression levels in MYCSS-Hi than in MYCSS-Lo samples. Position in the ranked list indicates the rank of each gene-set member among all genes ordered by the fold-change between MYCSS-Hi and MYCSS-Lo patients. (D) Kaplan-Meier survival curve investigating overall survival in patients with positive IHC stratified by MYCSS. The HRs between IHC-Hi & MYCSS-Lo and IHC-Lo, and between IHC-Hi & MYCSS-Hi and IHC-Hi & MYCSS-Lo, are 0.42 and 5.41, respectively.
MYC signature score provides additional prognostic value compared to clinical variables and current molecular signatures
We next investigated our MYCSS in the context of DLBCL molecular subtypes7 (GCB, ABC, UNC, and MHG) defined based on commonly used gene signatures and found that the MHG subtype demonstrated the highest MYCSS (Figure 5A). The MHG signature was defined to identify aggressive DLBCL samples and is highly prognostic on its own (Figure S6A). Within the MHG subtype, our MYCSS identifies the truly high-risk patients, while patients with low MYCSS have a similar prognosis to non-MHG patients, suggesting that a large fraction of MHG samples are not aggressive (Sha dataset: OS in Figure 5B, PFS in S6B; Reddy dataset: OS in Figure S6C). The difference in MYCSS cannot be explained by MYC RNA expression (R2 = 0.04, Figure S6D), as there is no significant difference in MYC RNA expression between the high and low MYCSS groups (Figure S6E). Our score remained robust in identifying patients at high and low risk among patients with non-MHG subtypes (Sha dataset: OS in Figure 5C, PFS in S6F; Reddy dataset: OS in Figure S6G).
Figure 5. MYCSS provides additional prognostic value compared to clinical variables and current molecular signatures.

(A) MYCSS distribution across DLBCL subtypes (UNC, GCB, ABC, and MHG). (B) Kaplan-Meier overall survival (OS) curves comparing MYCSS-high (MYCSS-Hi) and MYCSS-low (MYCSS-Lo) subgroups within MHG patients and non-MHG groups. The HR between MHG & MYCSS-Lo and non-MHG, and MHG & MYCSS-Hi and MHG & MYCSS-Lo, are 1.16 and 5.68, respectively. (C) Kaplan-Meier OS curves for MYCSS-Hi and MYCSS-Lo in non-MHG patients. (D) Kaplan-Meier OS curves for MYCSS-Hi and MYCSS-Lo subgroups among DZsig-positive (DZpos) and DZsig-negative (DZneg) patients. The HR between DZpos & MYCSS-Lo and DZneg, and DZpos & MYCSS-Hi and DZpos & MYCSS-Lo, are 0.93 and 4.24, respectively. (E) Kaplan-Meier OS curves for MYCSS-Hi and MYCSS-Lo subgroups within GCB patients classified by DZsig and MHG status, compared with DZneg & non-MHG groups. The HR between DZposMHG & MYCSS-Lo and DZneg & non-MHG, and DZposMHG & MYCSS-Hi and DZposMHG & MYCSS-Lo, are 1.32 and 7.94, respectively. (F) Kaplan-Meier OS curves for MYCSS-Hi and MYCSS-Lo in patients with an unclassified cell-of-origin (COO). (G) Kaplan-Meier OS curves for MYCSS-Hi and MYCSS-Lo in non-MHG patients without MYC events. (H) Multivariable Cox regression analysis for OS among non-MHG patients without MYC events. All analyses in this figure are based on the Sha dataset.
The DZsig was developed to identify DLBCL patients with poor prognoses and genetic expression characteristics similar to high-grade B-cell lymphoma patients with GCB COO and double-hit rearrangement. We observed that MYCSS can significantly identify a subgroup of DLBCL patients among those positive for DZsig (DZsig-Pos) whose prognosis is indistinguishable from that of DZsig-negative (DZsig-Neg) patients (OS: Figure 5D, PFS: Figure S6H). Furthermore, when considering only patients with the GCB subtype, MYCSS also significantly distinguishes patient prognoses among those at highest risk (DZsig-Pos and MHG-positive), identifying a low-risk subgroup with a prognosis similar to DZsig-Neg and non-MHG GCB patients (OS: Figure 5E, PFS: Figure S6I). Interestingly, if GCB patients are first stratified into MYCSS-Hi and MYCSS-Lo groups based on MYCSS, the MYCSS-Hi group can subsequently be divided into DZsigPos+MHG and DZsigNeg+non-MHG, with each subgroup exhibiting significantly different prognoses. However, even among the MYCSS-Hi patients, those characterized as DZsigNeg+non-MHG still demonstrate a prognosis significantly worse than that of the MYCSS-Lo patients (as shown in OS and PFS in Figure S6J-K). This result highlights MYCSS’s prognostic efficacy and its capacity to identify low-risk subgroups among high-risk patients with outcomes comparable to the lowest-risk groups.
Additionally, in the UNC samples, which cannot be confidently assigned to the other subtypes, the MYCSS still stratified patients into subgroups with significantly different prognoses (OS in Figure 5F and PFS in S6L). Furthermore, the MYCSS demonstrates prognostic value even in non-MHG patients without any detectable MYC alterations (Figure 5G and S7A-B). In a multivariable Cox model in non-MHG patients free of MYC alterations, the binary MYC signature score remained significant in predicting OS (Figure 5H) and PFS (Figure S7C) after adjusting for DZsig groups, MYC RNA expression, COO, IPI, LDH, and age, suggesting that the MYCSS provides additional prognostic value compared to established clinical variables and molecular signatures.
MYC signature score predicts chemotherapy response
Next, we aimed to understand the ability of our MYCSS to predict disease response to chemotherapy. We found that patients with confirmed and unconfirmed complete responses, partial responses, and stable disease demonstrated lower MYCSS than patients with progressive disease (Figure 6A). Additionally, logistic regression of treatment response on MYCSS in the three datasets showed that the score is significantly related to treatment response (Figure S8A). R-CHOP/RB-CHOP resistance, prevalent in patients with progressive disease, becomes more frequent as the MYCSS increases (Figure 6B, left), and the number of patients with resistance is significantly greater in the high MYCSS group than in the low MYCSS group (Figure 6B, right; Figure S8B). The MYCSS can better identify drug-resistant patients than stratifications based on MYC mutation, MYC rearrangement, MYC RNA expression, MYC IHC status, and the MHG subtype (Figure 6C). For these analyses, event-based and MYCSS-based stratifications were performed in the same subset of patients to achieve fair comparisons. For example, 364 patients from the Sha dataset had MYC mutation data, with 29 possessing the mutation and 335 demonstrating WT MYC. Treatment resistance could be significantly distinguished (P = 0.02) between these two groups. From these same 364 patients, we selected the 29 patients with the highest MYCSS and compared them with the remaining 335 patients with a lower MYCSS, which demonstrated an improved ability to differentiate treatment resistance among patients (P = 6e-5). Within groups of patients with MYC mutations, MYC rearrangements, high MYC RNA expression, and positive MHG subtypes, the MYCSS retains the ability to identify R-CHOP/RB-CHOP resistant patients, highlighting the ability of the score to identify authentic high-risk patients who may already have other potentially risk-increasing factors (Figure 6D). While not as significant, the MYCSS can also identify R-CHOP/RB-CHOP resistant patients in some groups without the event of interest (Figure S8C).
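The size-matched comparison described above (selecting as many top-MYCSS patients as there are event-positive patients, then testing each split against resistance) can be sketched as follows. This is an illustrative sketch only: the helper names, the two-sided form of Fisher's exact test, and the six-patient toy cohort are assumptions and do not reproduce the study data or the authors' code.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k resistant patients in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    p_observed = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are as likely as, or less likely than, the observed one.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

def top_k_flags(scores, k):
    """Flag the k samples with the highest score (the size-matched 'Hi' group)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = set(order[:k])
    return [i in chosen for i in range(len(scores))]

def resistance_table(in_group, resistant):
    """Count (group & resistant, group & not, rest & resistant, rest & not)."""
    pairs = list(zip(in_group, resistant))
    return (sum(g and r for g, r in pairs), sum(g and not r for g, r in pairs),
            sum(not g and r for g, r in pairs), sum(not g and not r for g, r in pairs))

# Synthetic toy cohort (not study data): scores, event status, resistance.
mycss = [0.9, 0.8, 0.7, 0.2, 0.1, 0.05]
event_pos = [True, False, False, True, True, False]
resistant = [True, True, True, False, False, False]

mycss_hi = top_k_flags(mycss, k=sum(event_pos))  # same group size as event-Pos
print(fisher_exact_two_sided(*resistance_table(event_pos, resistant)))
print(fisher_exact_two_sided(*resistance_table(mycss_hi, resistant)))
```

Matching the group sizes means that any difference between the two P-values reflects which patients were selected rather than how many, which is the fairness condition stated for these comparisons.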
Figure 6. MYCSS predicts chemotherapy response.

(A) MYCSS stratified by R-CHOP responses (CR: complete response; CRu: unconfirmed complete response; PR: partial response; SD: stable disease; PD: progressive disease). P-values were calculated using the Wilcoxon rank-sum test. (B) Percentage of R-CHOP resistant patients stratified by continuous (left) and dichotomous (right) MYCSS. P-values were calculated using Fisher’s exact test. (C) The R/RB-CHOP treated patients selected based on MYCSS (MYCSS-Hi) show a higher resistance rate than those selected by various MYC events (event-Pos), such as MYC mutation, MYC rearrangement, MYC RNA, MYC IHC, and MHG. For these analyses, event-based and MYCSS-based stratifications were performed in the same subset of patients to achieve fair comparisons. P-values were calculated using Fisher’s exact test. (D) MYCSS further distinguishes resistant versus non-resistant patients treated with R/RB-CHOP within patients with MYC mutation, MYC rearrangement, high MYC RNA expression, positive MYC IHC status, and MHG subtype. P-values were calculated using Fisher’s exact test. All analyses in this figure are based on the Sha dataset.
Discussion
In DLBCL, accurately stratifying risk and predicting patient response to chemotherapy are important considerations due to the poor prognosis associated with resistance to first-line therapies.3 In the present study, we define a MYC signature score that robustly stratifies patients into high- or low-risk groups and accurately predicts patient response to R-CHOP and RB-CHOP chemotherapy regimens. Notably, at least 50% of patients with MYC aberrations demonstrate an improved disease prognosis that more closely mimics patients without MYC aberrations, suggesting that MYC aberration status alone is insufficient to determine high- versus low-risk patients. Furthermore, this finding suggests that some low-risk patients with MYC aberrations may be overtreated or benefit from alternative therapeutic approaches.
To solve the challenge of patient stratification, we developed our MYC signature score to be agnostic to the underlying genetic aberration and robustly identify high-risk patients with various underlying genetic mutations. We validated the performance of our MYC signature score across three independent DLBCL datasets and consistently demonstrated successful stratification of high- and low-risk patients, irrespective of MYC aberration status. Even in patients with MYC mutations, MYC rearrangements, or the more aggressive MHG subtype, our MYC signature score can accurately and significantly distinguish between high- and low-risk patients. Our MYC signature score may help guide treatment selection in patients with DLBCL so that high-risk patients can receive tailored therapies and/or additional supportive care while low-risk patients do not need highly aggressive treatment.
Previous studies have shown that MYC aberrations alone are insufficient to predict chemotherapy response14, and our study demonstrated that MYC RNA expression levels only partially explain the increased risk conferred to patients. This insufficiency is partly explained by the “double-hit” model, which suggests that aberrations in both MYC and BCL2 or BCL6 are required for a patient to be genuinely at high risk of resisting first-line treatment. However, double-hit lymphomas without the MHG signature show no evidence of worse outcomes than other GCB DLBCL.10 Although the MHG score can identify high-risk patients without double hits, our MYC signature score provides additional prognostic information. In the present study, we demonstrated the ability of our score to stratify risk in non-MHG and non-MHG/non-MYC-aberrant samples, thereby capturing a previously neglected group of patients who may benefit from alternative therapy options. Taken together, our MYC signature score can identify high- and low-risk patients across a diverse array of patient groups, whether defined by previously published high-risk signatures (e.g., MHG), MYC aberrations along the central dogma axis (e.g., high MYC RNA), or double-hit mutations (e.g., MYC and BCL6 mutations).
Classification systems, such as COO or other molecular signatures, are often used in clinical practice to identify subsets of patients with high-risk disease who may respond poorly to standard chemotherapy.28 Importantly, within the GCB group, which typically has a better prognosis, MYCSS identified a subset of patients who were DZsigPos + MHG, both indicators of poor prognosis, yet had a prognosis as favorable as those who were DZsigNeg + non-MHG. Conversely, when patients are first divided into MYCSS-Hi and MYCSS-Lo, DZsig + MHG can significantly stratify MYCSS-Hi patients, but the prognosis of MYCSS-Lo patients remains significantly better. In other words, DZsig + MHG cannot identify a group within MYCSS-Hi whose prognosis is similar to that of MYCSS-Lo patients. Taken together, our results indicate that a large fraction of patients whom alternative metrics classify as high-risk may actually be low-risk and have a prognosis similar to that of patients without MYC aberrations, suggesting that these patients may be overtreated because of deficiencies in current classification methods. In addition, of the 40 most informative genes in our MYC signature (Supplementary Table S3), only 10 are shared with the MHG signature (172 genes) and only 5 with the DZsig signature (104 genes), suggesting that the predictive power of these signatures is driven by different genes.
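The overlap figures quoted above amount to a set intersection between the top MYCSS genes and each published signature (10 of 40 genes, or 25%, for MHG; 5 of 40, or 12.5%, for DZsig). A minimal sketch of such a comparison follows; the gene identifiers are toy placeholders rather than members of any real signature, and the function name is an assumption made for illustration.

```python
def signature_overlap(top_genes, other_signature):
    """Return the shared genes and the fraction of top_genes they represent."""
    top = set(top_genes)
    if not top:
        raise ValueError("top_genes must not be empty")
    shared = top & set(other_signature)
    return sorted(shared), len(shared) / len(top)


# Toy placeholder identifiers; substitute the real gene lists
# (e.g. Supplementary Table S3 and the published signature gene sets).
mycss_top = ["G1", "G2", "G3", "G4"]
other_sig = ["G3", "G4", "G5", "G6", "G7"]

shared, fraction = signature_overlap(mycss_top, other_sig)
print(f"{len(shared)} shared genes ({fraction:.0%} of the top list): {shared}")
```

Reporting the overlap as a fraction of the top MYCSS genes, rather than of the larger signature, keeps the two comparisons on the same denominator.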
The MYC signature score presented in this study may be an important risk-stratification tool for clinicians. Risk stratification is an active research area across many cancer types, including breast29, bladder30, anal31, and colorectal32 cancers and melanoma33. Furthermore, multigene molecular tests have improved risk stratification, but a significant challenge in the field is translating these tools into clinical practice.34 To our knowledge, no multigene molecular tests are commercially available for DLBCL. Our present study demonstrates that our MYC signature score can predict patient sensitivity to R/RB-CHOP chemotherapy regimens. Patients considered high-risk by MYCSS may achieve better outcomes through more frequent disease monitoring to detect recurrence sooner or through earlier enrollment in clinical trials. Although the MYC signature was developed using RNA-seq data (the Reddy dataset), we demonstrated its prognostic value in two independent datasets (the Sha and Lacy datasets). Both datasets were generated on the Illumina microarray platform (WG-DASL), suggesting that the signature is robust across gene expression profiling platforms. It therefore has the potential to be developed into a genomic assay on gene quantification platforms commonly used in clinics, such as NanoString nCounter.
While our current study presents a novel method for patient risk stratification in DLBCL, a cancer characterized by poor prognosis in patients who do not respond to front-line therapies, there are a few limitations. First, the dichotomous version of our score may have a decreased ability to correctly classify “border case” patients or patients who may not clearly fit into the high- or low-risk group. However, the continuous MYC signature score may be used in these border cases for more precise evaluation. Second, this study evaluated the performance of the MYC signature score in three published datasets, but future studies can expand into additional or larger published datasets of patients with DLBCL. Finally, while this study focuses on our score’s ability to stratify risk in traditional R/RB-CHOP chemotherapy regimens, we do not explore the MYC signature score’s ability to predict risk in patients treated with immunotherapies3, primarily due to the lack of available datasets.
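To illustrate the first limitation, the sketch below shows one way a dichotomous call could flag border cases for review on the continuous scale. It is not the authors' procedure: the function name, the cutoff of 0.0, and the margin of 0.1 are all hypothetical values chosen only for demonstration.

```python
def classify_mycss(score, cutoff=0.0, margin=0.1):
    """Assign a risk group, flagging scores close to the cutoff as borderline.

    cutoff and margin are hypothetical; a real cutoff would come from the
    training cohort and a real margin from the score's measurement variability.
    """
    if abs(score - cutoff) < margin:
        return "borderline"
    return "MYCSS-Hi" if score > cutoff else "MYCSS-Lo"


# Hypothetical scores for illustration only.
for s in (0.5, 0.05, -0.5):
    print(s, classify_mycss(s))
```

Patients labeled borderline would then be evaluated using the continuous score rather than forced into either group.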
Our study presents a robust method to classify DLBCL patients into high- and low-risk groups with an opportunity to be used as a clinical decision-making adjunct to improve outcomes in DLBCL patients. Our MYC signature score is able to identify high- and low-risk patients in a diverse group of samples with a range of MYC aberrations, double-hits, and previously published risk stratification subtypes. Future studies should apply this score to emerging datasets for DLBCL patients treated with immunotherapy.
Supplementary Material
Acknowledgements
This study is supported by the Cancer Prevention Research Institute of Texas (CPRIT) (RR180061 to CC) and the National Cancer Institute of the National Institutes of Health (1R01CA269764 to CC). CC is a CPRIT Scholar in Cancer Research.
Footnotes
Conflict of Interest
The authors report no conflicts of interest.
Data Availability Statement
The MYC signature developed in this study, the BASE algorithm used to calculate the MYC Signature Score (MYCSS), and the computed MYCSS values are available in our GitHub repository at https://github.com/JRLi/DLBCL-MYCSS. The raw data from the collected DLBCL datasets are publicly accessible. PubMed IDs and Accession IDs are provided in Supplementary Table S1. Further inquiries can be directed to the corresponding author.
References
- 1. Major A, Smith SM. DA-R-EPOCH vs R-CHOP in DLBCL: How do we choose? Clin Adv Hematol Oncol. 2021;19(11):698–709.
- 2. Maurer MJ, Habermann TM, Shi Q, et al. Progression-free survival at 24 months (PFS24) and subsequent outcome for patients with diffuse large B-cell lymphoma (DLBCL) enrolled on randomized clinical trials. Annals of Oncology. 2018;29(8):1822–1827. doi: 10.1093/annonc/mdy203
- 3. He MY, Kridel R. Treatment resistance in diffuse large B-cell lymphoma. Leukemia. 2021;35(8):2151–2165. doi: 10.1038/s41375-021-01285-3
- 4. Miyawaki K, Sugio T. Lymphoma Microenvironment in DLBCL and PTCL-NOS: the key to uncovering heterogeneity and the potential for stratification. J Clin Exp Hematop. 2022;62(3):127–135. doi: 10.3960/jslrt.22027
- 5. Susanibar-Adaniya S, Barta SK. 2021 Update on Diffuse large B cell lymphoma: A review of current data and potential applications on risk stratification and management. Am J Hematol. 2021;96(5):617–629. doi: 10.1002/ajh.26151
- 6. Liu Y, Barta SK. Diffuse large B-cell lymphoma: 2019 update on diagnosis, risk stratification, and treatment. Am J Hematol. 2019;94(5):604–616. doi: 10.1002/ajh.25460
- 7. Nowakowski GS, Czuczman MS. ABC, GCB, and Double-Hit Diffuse Large B-Cell Lymphoma: Does Subtype Make a Difference in Therapy Selection? Am Soc Clin Oncol Educ Book. Published online 2015:e449–57. doi: 10.14694/EdBook_AM.2015.35.e449
- 8. Li S, Young KH, Medeiros LJ. Diffuse large B-cell lymphoma. Pathology. 2018;50(1):74–87. doi: 10.1016/j.pathol.2017.09.006
- 9. Cucco F, Barrans S, Sha C, et al. Distinct genetic changes reveal evolutionary history and heterogeneous molecular grade of DLBCL with MYC/BCL2 double-hit. Leukemia. 2020;34(5):1329–1341. doi: 10.1038/s41375-019-0691-6
- 10. Sha C, Barrans S, Cucco F, et al. Molecular High-Grade B-Cell Lymphoma: Defining a Poor-Risk Group That Requires Different Approaches to Therapy. J Clin Oncol. 2019;37(3):202–212. doi: 10.1200/JCO.18.01314
- 11. Alduaij W, Collinge BJ, Ben-Neriah S, et al. Molecular Determinants of Clinical Outcomes In a Real-World Diffuse Large B-cell Lymphoma Population. Blood. Published online October 27, 2022. doi: 10.1182/blood.2022018248
- 12. Ennishi D, Jiang A, Boyle M, et al. Double-Hit Gene Expression Signature Defines a Distinct Subgroup of Germinal Center B-Cell-Like Diffuse Large B-Cell Lymphoma. J Clin Oncol. 2019;37(3):190–201. doi: 10.1200/JCO.18.01583
- 13. Urata T, Naoi Y, Jiang A, et al. Distribution and clinical impact of molecular subtypes with dark zone signature of DLBCL in a Japanese real-world study. Blood Adv. 2023;7(24):7459–7470. doi: 10.1182/bloodadvances.2023010402
- 14. Xia Y, Zhang X. The Spectrum of MYC Alterations in Diffuse Large B-Cell Lymphoma. Acta Haematol. 2020;143(6):520–528. doi: 10.1159/000505892
- 15. Petrich AM, Gandhi M, Jovanovic B, et al. Impact of induction regimen and stem cell transplantation on outcomes in double-hit lymphoma: a multicenter retrospective analysis. Blood. 2014;124(15):2354–2361. doi: 10.1182/blood-2014-05-578963
- 16. Li W, Gupta SK, Han W, et al. Targeting MYC activity in double-hit lymphoma with MYC and BCL2 and/or BCL6 rearrangements with epigenetic bromodomain inhibitors. J Hematol Oncol. 2019;12(1):73. doi: 10.1186/s13045-019-0761-2
- 17. Rosenwald A, Bens S, Advani R, et al. Prognostic Significance of MYC Rearrangement and Translocation Partner in Diffuse Large B-Cell Lymphoma: A Study by the Lunenburg Lymphoma Biomarker Consortium. J Clin Oncol. 2019;37(35):3359–3368. doi: 10.1200/JCO.19.00743
- 18. de Jonge AV, Roosma TJA, Houtenbos I, et al. Diffuse large B-cell lymphoma with MYC gene rearrangements. Eur J Cancer. 2016;55:140–146. doi: 10.1016/j.ejca.2015.12.001
- 19. Reddy A, Zhang J, Davis NS, et al. Genetic and Functional Drivers of Diffuse Large B Cell Lymphoma. Cell. 2017;171(2):481–494.e15. doi: 10.1016/j.cell.2017.09.027
- 20. Lacy SE, Barrans SL, Beer PA, et al. Targeted sequencing in DLBCL, molecular subtypes, and outcomes: a Haematological Malignancy Research Network report. Blood. 2020;135(20):1759–1771. doi: 10.1182/blood.2019003535
- 21. Frei E, Visco C, Xu-Monette ZY, et al. Addition of rituximab to chemotherapy overcomes the negative prognostic impact of cyclin E expression in diffuse large B-cell lymphoma. J Clin Pathol. 2013;66(11):956–961. doi: 10.1136/jclinpath-2013-201619
- 22. Visco C, Li Y, Xu-Monette ZY, et al. Comprehensive gene expression profiling and immunohistochemical studies support application of immunophenotypic algorithm for molecular subtype classification in diffuse large B-cell lymphoma: a report from the International DLBCL Rituximab-CHOP Consortium Program Study. Leukemia. 2012;26(9):2103–2113. doi: 10.1038/leu.2012.83
- 23. International Non-Hodgkin’s Lymphoma Prognostic Factors Project. A predictive model for aggressive non-Hodgkin’s lymphoma. N Engl J Med. 1993;329(14):987–994. doi: 10.1056/NEJM199309303291402
- 24. Wright G, Tan B, Rosenwald A, Hurt EH, Wiestner A, Staudt LM. A gene expression-based method to diagnose clinically distinct subgroups of diffuse large B cell lymphoma. Proc Natl Acad Sci U S A. 2003;100(17):9991–9996. doi: 10.1073/pnas.1732008100
- 25. Cheng C, Yan X, Sun F, Li LM. Inferring activity changes of transcription factors by binding association with sorted expression profiles. BMC Bioinformatics. 2007;8:452. doi: 10.1186/1471-2105-8-452
- 26. Zhao Y, Varn FS, Cai G, Xiao F, Amos CI, Cheng C. A P53-Deficiency Gene Signature Predicts Recurrence Risk of Patients with Early-Stage Lung Adenocarcinoma. Cancer Epidemiol Biomarkers Prev. 2018;27(1):86–95. doi: 10.1158/1055-9965.EPI-17-0478
- 27. Ng PC, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003;31(13):3812–3814. doi: 10.1093/nar/gkg509
- 28. Susanibar-Adaniya S, Barta SK. 2021 Update on Diffuse large B cell lymphoma: A review of current data and potential applications on risk stratification and management. Am J Hematol. 2021;96(5):617–629. doi: 10.1002/ajh.26151
- 29. Yu F, Quan F, Xu J, et al. Breast cancer prognosis signature: linking risk stratification to disease subtypes. Brief Bioinform. 2019;20(6):2130–2140. doi: 10.1093/bib/bby073
- 30. Nabavizadeh R, Bobrek K, Master VA. Risk stratification for bladder cancer: Biomarkers of inflammation and immune activation. Urol Oncol. 2020;38(9):706–712. doi: 10.1016/j.urolonc.2020.04.006
- 31. Clifford GM, Alberts CJ. Molecular Risk Stratification for Anal Cancer Prevention. Clin Infect Dis. 2021;72(12):2164–2166. doi: 10.1093/cid/ciaa399
- 32. Ichimasa K, Kudo SE, Miyachi H, Kouyama Y, Misawa M, Mori Y. Risk Stratification of T1 Colorectal Cancer Metastasis to Lymph Nodes: Current Status and Perspective. Gut Liver. 2021;15(6):818–826. doi: 10.5009/gnl20224
- 33. Trager MH, Geskin LJ, Samie FH, Liu L. Biomarkers in melanoma and non-melanoma skin cancer prevention and risk stratification. Exp Dermatol. 2022;31(1):4–12. doi: 10.1111/exd.14114
- 34. Heinzel A, Perco P, Mayer G, Oberbauer R, Lukas A, Mayer B. From molecular signatures to predictive biomarkers: modeling disease pathophysiology and drug mechanism of action. Front Cell Dev Biol. 2014;2. doi: 10.3389/fcell.2014.00037
- 35. Dreval K, Cruz M, Rushton C, et al. Revisiting Reddy: A DLBCL Do-Over. bioRxiv. Published online November 22, 2023.
