Cell Reports Medicine. 2024 Feb 29;5(3):101443. doi: 10.1016/j.xcrm.2024.101443

Follicular lymphoma B cells exhibit heterogeneous transcriptional states with associated somatic alterations and tumor microenvironments

Jordan E Krull 1, Kerstin Wenzl 1, Melissa A Hopper 1, Michelle K Manske 1, Vivekananda Sarangi 2, Matthew J Maurer 2, Melissa C Larson 2, Patrizia Mondello 1, ZhiZhang Yang 1, Joseph P Novak 1, Makayla Serres 1, Kaitlyn R Whitaker 1, Jose C Villasboas Bisneto 1, Thomas M Habermann 1, Thomas E Witzig 1, Brian K Link 3, Lisa M Rimsza 4, Rebecca L King 5, Stephen M Ansell 1, James R Cerhan 2, Anne J Novak 1,6
PMCID: PMC10983045  PMID: 38428430

Summary

Follicular lymphoma (FL) is an indolent non-Hodgkin lymphoma of germinal center origin, which presents with significant biologic and clinical heterogeneity. Using RNA-seq on B cells sorted from 87 FL biopsies, combined with machine-learning approaches, we identify 3 transcriptional states that divide the biological ontology of FL B cells into inflamed, proliferative, and chromatin-modifying states, with relationship to prior GC B cell phenotypes. When integrated with whole-exome sequencing and immune profiling, we find that each state is associated with a combination of mutations in chromatin modifiers, copy-number alterations to TNFAIP3, and T follicular helper (Tfh) cell interactions, or primarily by a microenvironment rich in activated T cells. Altogether, these data define FL B cell transcriptional states across a large cohort of patients, contribute to our understanding of FL heterogeneity at the tumor cell level, and provide a foundation for guiding therapeutic intervention.

Keywords: follicular lymphoma, B cell, transcriptome, germinal center, tumor microenvironment, genomics

Highlights

  • B cells from follicular lymphoma exhibit 3 distinct transcriptional states

  • FL B cells differ by enhanced inflammation, proliferation, or chromatin remodeling

  • Tumor cell states correlate with unique immune-microenvironment features

  • Unique mutation and CNV profiles highlight potential genetic causes of heterogeneity


Krull et al. analyze bulk transcriptional, genomic, and immune profiles of B cells from follicular lymphoma and reveal 3 distinct transcriptional states. These cell states underscore the inherent variability of FL tumors, independent of stroma, and implicate intrinsic differences as an underpinning to FL heterogeneity.

Introduction

Follicular lymphoma (FL) is the most common indolent non-Hodgkin lymphoma (NHL) and presents as a heterogeneous disease with a highly variable clinical course. FL is thought to develop over a long period of time as a B cell acquires somatic alterations following egress from the bone marrow and subsequent germinal center (GC) reactions.1 The hallmark genetic event, t(14;18), which places BCL2 under transcriptional control of the immunoglobulin heavy-chain locus, arises during early B cell development, and occurs in 80%–90% of FL patients.2 Although FL is considered an indolent lymphoma, the clinical course is highly variable, with an overall survival (OS) at 5 years in the rituximab era ranging from 90% for low-risk patients to 65% for high-risk patients by the FL International Prognostic Index (FLIPI).3 Despite the high response rates to frontline treatments, most patients will experience relapse of their disease and subsequent lines of therapy. A greater understanding of FL biology presents an opportunity for more targeted approaches and biologically intuitive treatments.

Despite advances in other subtypes of NHL, such as diffuse large B cell lymphoma, the identification of biologic clusters in newly diagnosed FL from comprehensive genomic or transcriptomic datasets has yet to be accomplished. Gene expression data have, however, been used for the identification of FL prognostic subgroups. Dave et al. used gene expression data from prerituximab era cases and identified two gene sets (IR1 and IR2) that delineated patients with poor and favorable prognosis.4 More recently, Huet et al. developed a gene-expression signature and assessed the value of a 23-gene score to predict progression or relapse after diagnosis for immunochemotherapy (IC)-treated cases.5 Notably, both studies used supervised methodologies to discover biological relationships to clinical outcomes, but the biologic diversity of FL tumors themselves was not addressed. Although these studies are important for our understanding of FL, the prognostic value of those results is heavily dependent on cohort characteristics and the heterogeneity of the tumor microenvironment (TME). Single-cell transcriptomic analyses of FL tumor cells, although limited in number, have shown that FL B cells exhibit heterogeneous profiles distinct from non-neoplastic GC B cells and that multiple subclones may exist within each tumor.6,7

Whole-exome sequencing (WES) and whole-genome sequencing of FL tumors and follow-up validation studies have discovered that point mutations in genes involved in epigenetic regulation and chromatin modification, including KMT2D, EZH2, CREBBP, EP300, and MEF2B, dominate the FL landscape.8,9,10 Studies to date have focused primarily on the identification of small site mutations that are recurrent in FL tumorigenesis11,12 or involved in tumor clonal evolution.13,14 Although informative, these studies have been limited by small discovery cohorts and/or the use of targeted sequencing approaches, and they often include samples from relapsed posttreatment patients. Thus, the full landscape of alterations present in newly diagnosed FLs has not been fully ascertained.

In addition to genetic alterations, the FL TME is quite heterogeneous, with a multitude of cell types unique to FL or lymph nodes having been characterized.15,16 Many of these cell types have been found to vary across FL populations in a manner related to clinical outcomes,17,18,19 but why the nontumor cells vary between patients is under investigation. Patterns of FL TME have also been studied revealing four distinct subtypes of the lymphoma microenvironment in primary human FL based on the relative abundance of T cell subsets referred to as naive, warm, depleted, and intermediate.20 Recent evidence has shown that mutated chromatin-modifying enzymes, among other unique genetic lesions, in FL have the capacity to increase immune evasion in tumor cells and support microenvironmental niches that promote tumor growth.21,22,23,24 Furthermore, FL tumor cells have been shown to not only shape the immune content of their environment but also the gene expression within local nonmalignant entities.25,26 These studies point to the intrinsic activity of the tumor cells as a key contributor to the TME content.

In this study, we leveraged computational techniques with WES, RNA-seq from sorted B cells, and immune profiling data from untreated and newly diagnosed FL tumors to gain further insight into the biologic basis for heterogeneity in FL. Through this analysis, we identified 3 groups of FL with unique transcriptional states. We further defined molecular activities and possible cellular origins that constitute these states, and identified related immune and genetic profiles. Altogether, this comprehensive, systems-level approach better defines our understanding of FL biology and provides potential avenues for the diagnostic evaluation and identification of therapeutic targets.

Results

Profiling of follicular lymphoma B cell transcriptional states

To identify unique B cell transcriptional states from FL tumors, we performed RNA-seq on isolated B cells from FL tumors (n = 87) and benign lymph nodes (n = 5); sample and patient data are described in Figures 1A and S1 and Table S1. Filtering gene expression values based on low expression or potential sources of cluster artifacts resulted in 10,015 remaining genes available for analysis. We implemented non-negative matrix factorization (NMF) and consensus clustering to define an ideal quantity of stable cell states between patients. We identified 3 FL B cell groups (G1–G3) as the most stable solution, based on consensus clustering selection criteria of iterative NMF runs (Figures 1B and S1C). Hierarchical clustering of matching B cell and whole-tumor (presort) RNA-seq samples could not resolve the intersample relationships, suggesting that the grouping pattern is specific to the B cell RNA-seq (Figure S1B). To determine whether our clusters were driven by the tumor biopsy site or clinical characteristics (histologic grade, stage, FLIPI), we looked for the association of each feature with individual clusters and found that groups were significantly related to histological grade and FLIPI scores (p < 0.05), although no group could be defined solely by their diagnostic features nor their biopsy site (Figure 1C). Nevertheless, we did note that less aggressive characteristics favored G1 and more aggressive characteristics favored G2, based on FLIPI and staging. Although there were no significant differences in event-free or OS between the groups (Figures S2A and S2B), nearly 40% of G2 and 30% of G3 experienced an event within 24 months of diagnosis, whereas no G1 patients experienced an early event (p < 0.05) (Figure S2C).
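
The consensus step described above can be illustrated with a minimal sketch. It assumes each NMF run has already been reduced to one cluster label per sample; the NMF fitting itself is omitted, and the labels and the function name `consensus_matrix` are hypothetical rather than taken from the study's pipeline. The sketch only shows how a sample-sample pairing frequency, as displayed in a consensus map, can be computed.

```python
from itertools import combinations


def consensus_matrix(label_runs):
    """Fraction of runs in which each pair of samples shares a cluster label."""
    n_samples = len(label_runs[0])
    counts = [[0] * n_samples for _ in range(n_samples)]
    for labels in label_runs:
        # A sample always co-clusters with itself.
        for i in range(n_samples):
            counts[i][i] += 1
        for i, j in combinations(range(n_samples), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    n_runs = len(label_runs)
    return [[c / n_runs for c in row] for row in counts]


# Hypothetical labels for 4 samples from 4 clustering runs.
# Label IDs need not match between runs; only co-membership matters.
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
consensus = consensus_matrix(runs)
```

Values near 0 or 1 indicate stable pairings across runs, whereas intermediate values indicate samples whose assignment changes between runs.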

Figure 1.

FL B cell transcriptional groups

(A) Schematic of samples used for data generation. Dotted lines indicate patient-matched data/samples.

(B) Consensus clustering map result of 200 NMF runs on FL and benign B cell (n = 87) log2(TPM+1) gene expression. The heatmap indicates the sample-sample pairing frequency. Three groups were identified: group 1 (red, n = 20), group 2 (blue, n = 24), and group 3 (green, n = 43). Tissue type bar displays sample origin: FL (gray) and benign (blue).

(C) Proportional association between diagnostic clinical attributes and each B cell NMF group. Fisher’s exact test ∗p < 0.05.

(D) B cell group signature scores from the 23-gene high-risk signature and IR-1/IR-2 signatures on matching bulk, unsorted tumor RNA-seq samples. The y axis represents the average value of signature genes’ Z scores with each sample. Independent Wilcoxon rank-sum tests; ∗p < 0.05; ∗∗p < 0.01; ∗∗∗p < 0.001.

To determine whether our FL B cell group assignment was related to previously defined transcriptional gene sets, we scored the samples on immune-response-1 (IR-1), IR-2, and a 23-gene high-risk signature4,5 using matched bulk RNA-seq samples (Figure 1D). Given the large intragroup variability within each gene set, G1–G3 did not exhibit a relationship to these signatures. However, much like the favorable outcomes of G1 (Figure S2), the 23-gene high-risk signature, which is positively associated with poor outcomes, trended low for G1 compared to G2 (p < 0.01). The IR-1 and IR-2 signatures were more prevalent in G1 and G2 (IR-1, p < 0.001; IR-2, p < 0.01, p < 0.05); however, the 2 groups appear to have a similar enrichment score for both gene sets. Taken together, our analysis of RNA-seq data from purified FL B cells reveals biologic groups that are not defined by clinical characteristics or previously characterized gene expression subgroups.
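
The signature scoring in Figure 1D is described as the average of the signature genes' Z scores within each sample. A minimal sketch of that calculation follows, assuming Z scores are computed per gene across samples with the population standard deviation; the gene names, expression values, and the function name `signature_scores` are hypothetical.

```python
from statistics import mean, pstdev


def signature_scores(expression, signature_genes):
    """Average per-gene Z score of the signature genes in each sample.

    expression: dict mapping gene -> list of values, one per sample.
    """
    n_samples = len(next(iter(expression.values())))
    z_by_gene = []
    for gene in signature_genes:
        if gene not in expression:
            continue  # gene not measured in this dataset
        values = expression[gene]
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # a constant gene carries no information
        z_by_gene.append([(v - mu) / sd for v in values])
    if not z_by_gene:
        raise ValueError("no usable signature genes")
    return [mean(z[s] for z in z_by_gene) for s in range(n_samples)]


# Hypothetical log-scale expression for 3 samples.
expr = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [5.0, 5.0, 5.0],
}
scores = signature_scores(expr, ["GENE_A", "GENE_B", "GENE_C", "GENE_MISSING"])
```

Because each gene is standardized before averaging, highly expressed genes do not dominate the score.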

FL B cell states exhibit unique biological activities

To determine the biologic programs driving G1–G3, we looked at the transcriptional states underlying each group. NMF itself does not result in a group assignment, but rather a number of continuous values (coefficients) for each sample, that represent a continuous transcriptional state of coordinated gene expression, in which every sample expresses each of the 3 states to some degree (Figure 2A, top). The highest expressing state in each sample is analogous to their group assignment from consensus clustering. To determine a rank order for gene contribution to each state, we scored feature weights by their positive and negative relationships to each coefficient (Figure 2A, left; Table S2) and found that a number of genes important for FL and B cell biology were strongly associated with each of the 3 states (Figure 2A, right). G1 included genes positively enriched for inflammation and B cell motility (e.g., interleukin-6 [IL-6], IFNGR1, S1PR1, APRIL, TACI), whereas regulators of B cell transcription were negatively associated (e.g., MEF2B, EZH2). G2 had strong positive associations with genes involved in the proliferation and maintenance of DNA replication (e.g., MKI67, POLQ, RAD51, CCNB2). Lastly, G3 was defined by the higher expression of canonical GC B cell markers (e.g., CD38, EP300, MME [CD10]) along with low expression of IRF4 and BHLHE40, both of which restrain GC B cell phenotype commitment.27,28
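
As a minimal sketch of how continuous state coefficients relate to a discrete group label, the snippet below takes a hypothetical NMF H matrix (states in rows, samples in columns) and reports the highest-valued state per sample together with the relative contribution of every state. The numbers and the function name `dominant_state` are illustrative assumptions, not values from the study.

```python
def dominant_state(h_matrix, state_names):
    """Return (dominant state, per-state proportions) for each sample.

    h_matrix: list of rows, one per state; columns are samples.
    """
    n_samples = len(h_matrix[0])
    assignments = []
    for s in range(n_samples):
        column = [row[s] for row in h_matrix]
        total = sum(column)
        # Proportions show that each sample expresses every state to some degree.
        proportions = [c / total for c in column] if total > 0 else column
        best = max(range(len(column)), key=lambda k: column[k])
        assignments.append((state_names[best], proportions))
    return assignments


# Hypothetical coefficients for 3 states across 4 samples.
H = [
    [0.90, 0.10, 0.20, 0.30],
    [0.05, 0.70, 0.10, 0.30],
    [0.05, 0.20, 0.70, 0.40],
]
assigned = dominant_state(H, ["G1", "G2", "G3"])
```

The fourth hypothetical sample shows why the continuous view matters: its dominant state is G3, but the other two states contribute almost as much.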

Figure 2.

Divergent FL B cell gene expression states highlight unique biological activities

(A, left) Gene contribution scores for each group state. (See STAR Methods.) Higher/more significant positive contribution (red) and lower/more significant negative contribution (blue). (Center) Relative gene expression heatmap (n = 92) of group signature-defining genes (n = 500 genes/group). Selected genes of interest are highlighted at right. (Top) Individual sample state coefficient values for each factor/state from the NMF H matrix.

(B) Normalized enrichment scores (NESs) from selected significant gene sets for each group state. Gene sets are grouped vertically based on the group with which they share the most significance.

(C) GSEA plots of significant gene sets for group states 1 (top row), 2 (center row), and 3 (bottom row).

(D) VIPER regulator activity enrichment plot for group states. Selected top significant results (p < 0.01) are shown for positive enrichment (top proteins, red bar) and negative enrichment (bottom proteins, blue bar). Red lines depict genes positively regulated by the protein, and blue lines depict genes negatively regulated by the protein. Genes are ordered according to their group state contribution score (bottom bar plot).

We next sought to determine whether specific pathways were contained within each state, which may contribute to the biological heterogeneity of FL B cells. Gene set enrichment analysis (GSEA) on gene contribution ranks from each state revealed 352 significant pathway associations among all 3 states (Table S3). Among the most significant gene sets, a pattern of significant intersample heterogeneity surrounding inflammation, growth/proliferation, and epigenetic reprogramming emerged (Figure 2B). The G1 state was most enriched in programs related to type 1 interferon (IFN) signaling, IL-6 signaling, MYD88/NF-κB (nuclear factor κB), and repressed cell cycle. The G2 state was dominated by proliferative signals, metabolic reprogramming, B cell receptor (BCR) signaling, and response to stress or damage. Positive enrichment in programs related to epigenetic reprogramming, metabolic reprogramming, and cell adhesion was seen in the G3 state, in addition to the negative enrichment of MYD88/NF-κB and inflammation.

Many of the enriched gene sets highlight coordinated functions in each state, which are summarized by 1 or 2 gene sets highlighting the activity in each state. We observed notable positive enrichment of IL-6, Janus kinase, and signal transducers and activators of transcription (STAT) signaling and negative enrichment of G2M cell-cycle checkpoint in G1 state, which highlights the inflamed antiproliferative nature of this state (Figure 2C). The G2 state was both positively enriched in the hallmark mammalian target of rapamycin complex 1 (mTORC1) gene set and the BCR complex signaling, both of which are capable of activating and sustaining the proliferative, centroblastic nature of this state, along with a number of the other enriched gene sets in the G2 state. Finally, the G3 state was positively enriched in chromatin modification, summarizing the numerous related epigenetic gene sets enriched, and was negatively enriched in antigen processing and presentation.

B cells undergo multiple phenotypic and morphologic changes during their development, all of which are regulated by a handful of master transcriptional regulators, such as B cell lymphoma 6 (BCL6), interferon regulatory factor 4 (IRF4), and PAX5, among others.29 We next used VIPER30 to measure the activity of B cell master regulators in G1–G3 states and filtered them for nonredundant and significant regulators because of their known role in lymphomagenesis (Figure 2D; Table S4). The G1 state was significantly enriched for activity in all 3 proteins of the IFN stimulated gene factor 3 complex (IRF9, STAT1, and STAT2), which respond to IFN-β signals. The transcription factor responsible for IFN-β production, IRF7, was also significantly enriched, further suggesting type 1 IFN signaling in the G1 state. This state also had increased activity in KLF9, which is known to suppress cell-cycle progression; this was consistent with significantly decreased activity in Forkhead box protein M1 (FOXM1) and topoisomerase IIα (TOP2A), as well as significant negative enrichment of the G2M gene set (Figure 2C, top). The G2 state was significantly enriched for positive regulators of cell growth and proliferation (FOXM1, TOP2A, HMGA1, MYB), as well as the DNA damage response (protein kinase, DNA-activated, catalytic subunit [PRKDC], HMGB1). In addition, the G2 state was enriched for ILF2 (NFAT), which has been shown to be further activated by PRKDC,31 both of which are downstream of BCR activation. Likewise, the G2 state displayed reduced activity in regulators of the GC reaction FOXO1, CREB binding protein (CREBBP), and SPEN. The G3 state was significantly enriched for increased activity in proteins involved in maintaining nuclear receptor co-repressor (NCOR)/SMRT (silencing mediator of retinoic acid and thyroid hormone receptor) transcriptional repression activity (BCL6, MEF2B, ZBTB5, ZBTB20, MLL, SPEN). These included methyltransferases as well as proteins that recruit HDAC proteins. In summary, these data suggest that transcriptional activities related to inflammation, immune response, cell proliferation, metabolic reprogramming, and chromatin modification define transcriptional heterogeneity between FL tumors at the B cell level.

GC programs are partially retained in FL B cell groups

Among the top positively enriched gene sets for the G1 and G2 states were centrocyte, centroblast, light zone, and dark zone. This was relatively unexpected, considering that recent evidence suggests that FL B cells have a desynchronized GC program or would almost ubiquitously skew toward the light-zone phenotype as opposed to the dark zone.6,32 However, the GC reaction, reviewed by De Silva and Klein,33 is nuanced and far from existing in 2 simple states, as evidenced by Holmes et al.34 To determine whether any GC programs underlie the biology of G1–G3, we scored FL B cell gene expression for the enrichment of GC B cell states using gene sets and methods outlined by Holmes et al.34 (Figure S3). The 13 possible states were in developmental order from dark zone (DZ.a–DZ.c), to intermediate phenotypes (INT.a–INT.e), to light zone (LZ.a and LZ.b), to post-GC phenotypes (prememory, preM; plasmablast, PBL.a and PBL.b). Every sample scored significantly positive (p < 0.05) for at least 1 of the 13 GC states analyzed (Figure 3A; Table S5). Each of the 13 gene sets was tested independently, and most of the samples (87.6%) scored significantly (p < 0.05) for more than 1 gene set, often from adjacent phenotypes (e.g., DZ.a and DZ.b) or positive and negative scores from opposing phenotypes (e.g., DZ.c and preM). Gene expression patterns were also consistent with the highest-scoring gene set for each sample, highlighting the heterogeneity of the B cell samples even when considering only genes that control the GC program (Figure 3B).

Figure 3.

FL B cell programs associate with independent GC phenotypes

(A) Weighted voting prediction scores of GC B cell cluster gene sets for each sample. Empirically derived null distribution 95% confidence interval (CI) from imputed data, in gray. Scores range from −1 to 1, with 0 representing equal number of votes for and against a gene set in a sample.

(B) Heatmap of normalized RNA-seq values of FL B cell samples (n = 92), for the top 50 genes in each gene set. Single cell clusters are listed on the left side of the map and FL B cell grouping assignments are represented on the top bar. Samples are ordered based on the highest GC B cell cluster score.

(C) GC B cell cluster enrichment analysis from hypergeometric testing of B cell group assignment. Bars represent the false discovery rate-corrected p values (q-val). Positive and negative associations (residuals) are plotted directionally with –log(q) values. Gray area represents p = 0.05.

Hypergeometric testing for associations between each GC gene set and the FL B cell group revealed strong associations (q < 0.05) for all 3 groups (Figure 3C), although G1–G3 were not clearly defined by a single phenotype. G1 had strong associations with post-GC phenotypes (preM and PBL.b), as well as 2 intermediate phenotypes (INT.b and INT.d). INT.d and preM are considered developmentally continuous phenotypes, suggesting that the B cells in this subset of FL are in the process of exiting or have exited the GC reaction. G2 had significant associations with a number of proliferative phenotypes, DZ.a, LZ.b, and PBL.a, not to mention weaker associations with the remaining DZ phenotypes (Figure 3C). Conversely, G3 samples enriched positively only in the INT.e phenotype and had very strong negative associations with INT.d and preM (Figure 3C). These data further support the hypothesis that FL B cells have a relatively desynchronized GC program and that G1–G3 have gene expression signatures not solely defined by individual GC phenotypes, but rather resemble aggregates of interrelated GC phenotypes, a result corroborated by recent work by Attaf et al.35 Based on the FL B cell transcriptional profiles (Figure 2) and associated GC programs identified in each group thus far, we now refer to G1 as inflamed memory-like (INFM), G2 as proliferating dark zone-like (PDZ), and G3 as chromatin-modifying intermediate (CMI).
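
For readers unfamiliar with hypergeometric enrichment testing, a minimal sketch follows. It computes the upper-tail probability of observing at least k samples from one B cell group among the samples that score for a given GC phenotype, then applies a Benjamini-Hochberg correction. The choice of Benjamini-Hochberg is an assumption (the text states only that p values were false discovery rate corrected), the overlap counts are hypothetical, and the directional residuals shown in Figure 3C are not reproduced here.

```python
from math import comb


def hypergeom_upper_tail(k, M, n, N):
    """P(X >= k) when drawing N of M samples, n of which belong to the group."""
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonic q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical example: 87 samples, 20 in one group, 15 samples scoring for a
# GC phenotype, of which 9, 4, or 6 (three phenotypes) fall in that group.
p_values = [hypergeom_upper_tail(k, 87, 20, 15) for k in (9, 4, 6)]
q_values = benjamini_hochberg(p_values)
```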

Predicting FL B cell states in external gene expression data

This study uses RNA-seq-derived gene expression from bulk, sorted B cells; however, there are no external sorted B cell datasets with which to validate our signatures. To overcome this challenge, we used our matching bulk FL RNA-seq samples to derive a gene signature matrix for the INFM, PDZ, and CMI signatures (Figure S4). This signature can then be used to deconvolve bulk FL RNA-seq and gene expression profiling (GEP) data to generate signature values using a linear ν-support vector regression (ν-SVR). The performance showed significant (p < 0.001) correlations between the cell states in our matching bulk FL RNA-seq samples compared to their FL B cell NMF results. We then applied this method to 4 independent FL datasets of diagnostic FL biopsies (n = 742) (Figures S5A–S5D). Not only were the 3 signatures identifiable in all cohorts but also most of the samples displayed a dominant signature. In addition, gene expression patterns of signature genes were distinct and consistent with sample signature predictions, suggesting that these signatures reflect sample heterogeneity found in a large cohort of external samples. Lastly, we were able to detect a similar pattern of less-aggressive characteristics in INFM cases in an independent FL cohort with available outcome data, with no cases having an event or failing event-free survival at 24 months (EFS24) (Figures S5E–S5G).

FL B cell states associate with specific tumor-immune and stromal interactions

FL tumors develop in an immune cell-rich TME, and the unique molecular characteristics of each patient’s tumor cells may contribute to evasion of immune surveillance or promote an environment that provides continuous growth signals. To survey the TME and identify immune cell subsets in our FL samples, we used available cytometry by time of flight (CyTOF) data from cell suspensions on matched lymph node samples (n = 60) from Figures 1, 2, and 3 and clustered using the phenograph algorithm. We identified 20 unique cell populations (Figure 4A) that classified into known major and minor immune subsets, including 4 subsets of T follicular helper (Tfh) cells (Figure 4B). Overall, non-B immune cell content varied from 2% to 84% across all of the samples, with the highest mean percentage of immune cells being detected in INFM (40%) and the lowest being detected in CMI (21%) (Figure 4C). In addition, the inverse relationship between the CMI and INFM states was significantly correlated with immune content, regardless of group assignment, suggesting that the CMI and INFM B cell states have opposing relationships to the overall TME (Figure 4D).
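
The relationship in Figure 4D, a correlation between the difference in CMI and INFM state coefficients and the non-B immune cell percentage, can be sketched as follows. All values are hypothetical and chosen only to illustrate the calculation; `pearson_r` is a plain implementation of the Pearson coefficient, not the study's code.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical state coefficients and immune content for 5 samples.
cmi = [0.7, 0.5, 0.2, 0.1, 0.4]
infm = [0.1, 0.2, 0.6, 0.8, 0.3]
immune_pct = [12.0, 20.0, 45.0, 60.0, 25.0]  # non-B immune cells, % of live cells

# Positive difference = higher CMI, negative = higher INFM (as in Figure 4D).
state_diff = [c - i for c, i in zip(cmi, infm)]
r = pearson_r(state_diff, immune_pct)
```

Using the difference between the two coefficients rather than the group label keeps every sample on a continuous axis, which matches the statement that the relationship holds regardless of group assignment.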

Figure 4.

FL B cell states shape their local immune profile

(A) Uniform manifold approximation and projection of immune metaclusters from individually profiled FL patient CyTOF samples (n = 60). Clusters were identified using PhenoGraph and are colored by manual annotation.

(B) Heatmap of mean surface protein expression in each identified cluster mentioned in (A). Cluster identities listed next to their coordinating colors from (A).

(C) Boxplot of immune content as percentage of live cells in each sample by their B cell group.

(D) Correlation plot between the difference in CMI vs. INFM state coefficients (positive = higher CMI, negative = higher INFM) and the non-B immune cell percentage (% of live cells). A standard linear regression line is displayed with 95% CIs, along with (r) Pearson correlation, and representative p value.

(E) (Left) Heatmap of correlation matrix between cell cluster abundances in all of the samples, depicting cellular communities (black boxes). Cell values from Pearson correlation. Dendrogram from correlation distance. (Right) Heatmap of Pearson correlation between cell abundance and sample B cell group state. Pearson p values are printed where p < 0.1.

(F) Heatmap of Spearman correlation between single-sample gene set scoring, using SingScore of published GC stromal gene sets, and B cell group states (n = 54). Pearson p values are printed in cells where p < 0.1.

(G) CODEX multiplexed immunofluorescence analysis of CD21 (white), CD20 (blue), Ki-67 (yellow), CD4 (green), and CD8 (magenta) on 1 sample from each B cell group. (Left) Baseline cell type abundance in each sample (% of live) from CyTOF. (Right) First 3 panels, stitched images equal to 10× magnification with 500-μm scale. White box delineating the magnified view in panel 4 of follicle and interfollicular region, with 200-μm scale.

To identify significant relationships between INFM, PDZ, and CMI and their immune cell content, we first organized the FL TME into cellular communities using a correlation matrix, which, when paired with hierarchical clustering, groups populations whose abundance is coupled (Figure 4E, left). Next, to determine the relationship between the cell abundances and the 3 B cell states, we modeled the capacity of each normalized B cell state’s expression to predict the percentage of cell abundance in each sample. Among these relationships, the most striking was the inverse relationship the INFM and CMI states had with the presumably antitumor and proeffector cell communities (Figure 4E, right). Notably, the INFM state had a strong positive relationship with proinflammatory cell types, inflammatory monocyte-derived dendritic cells (mDCs) (p = 0.006), CD4 Th1 effector memory T cells (Tem) (p = 0.018) and CD8 Tem (p = 0.035), all of which are capable of secreting type 1 IFNs and tumor necrosis factor α (TNF-α). As indicated by the variability in non-B cell content among the PDZ group, very few cell types associated with the PDZ state; however, there were positive associations of PDZ with CD14+ mDCs (p = 0.007) and CD57hi “GC” Tfh cells (p = 0.034), and a trend with ICOShi Tfh cells (p = 0.070). In addition, the CMI state uniquely had significant negative associations with half of the cell types, including most of the cell communities containing T cells. In addition, we observed similar patterns of TME and FL B cell state association, in an independent cohort of FL samples, although these exact cell types could not be extracted (Figure S6).

FL-involved lymph node (LN) tissue can also contain a number of stromal cells that are often excluded from immune cell phenotyping due to their lack of abundance in dissociated tissue cell suspensions. To define the stromal cell TME, we used RNA-seq from matched frozen sections with predicted group assignments to score the relative abundance of expression signatures from 3 types of stromal cells (vascular pericytes [DNs], follicular DCs [FDCs], and follicular reticular cells [FRCs]), which were recently identified in FL.26 Similar to the immune microenvironment, the largest differences in predicted stromal composition were between INFM and CMI. INFM and PDZ groups both had significantly (p < 0.05) higher signatures from FDCs and FRCs compared to CMI (Figure 4F). In addition, INFM had a significantly higher signature for DN stromal cells (described as pericytes) than CMI.

To visually confirm the TME trends identified in Figures 4A–4F, CODEX multiplexed immunofluorescence imaging of a representative tumor section from each of the 3 groups was performed (Figure 4G). Immune cell content from the CyTOF analysis of each case is shown on the left side of the panel in Figure 4G. CD21 staining patterns reveal large definable follicles in INFM and PDZ, and the pattern of CD20 staining places B cells in close association with FDCs. Conversely, the CMI sample had smaller follicles with softer margins, and the CD20 pattern shows no inherent structural organization to B cell locations. In support of the proliferative signature, the PDZ sample also contained a strikingly large number of Ki-67+ cells localized primarily to the follicle, compared to INFM and CMI, which had low numbers of Ki-67+ cells and no localization, respectively. Similar to Figure 4E, the INFM sample had abundant CD8 T cell staining, which predominantly localized with B cells in the follicles. This was in stark contrast to the PDZ sample, which had CD8 staining almost entirely sequestered to the interfollicular regions. The PDZ sample had enhanced expression of CD4 in the follicles, where B cells were sequestered, compared to the INFM sample, which supports an association between PDZ and Tfh cells. The CMI sample largely demonstrates a disrupted immunoarchitecture compared to the INFM and PDZ samples; however, CD4 and CD8 stained much dimmer in the limited follicle spaces. Together, these findings reveal a connection between the tumor B cell intrinsic state and the local microenvironment and suggest that FL tumor B cell states are related to their local TME.

Genetic profile of FL tumors

To characterize the overall genetic landscape of the tumors in our cohort and to define the association of genetic variants with our B cell states, we performed WES on tumor/normal pairs from FL tumors (n = 123), 119 of which had matched RNA-seq from both sorted and bulk data available for assignment into INFM, CMI, or PDZ. We first identified 4 significant copy-number gain peaks and 7 significant copy loss peaks across the genome (Figure 5A). These regions include gains in REL, MYC, and BCL2, as well as losses of TNFAIP3, TNFRSF14, and FAS. In addition, the landscape of mutational signatures was derived from these samples using a BayesNMF model published previously,36 which revealed 3 mutational signatures across these samples (Figure 5B). Sig1 was identified as a signature of aging, with an abundance of nonclustered, C>T mutations at CpG motifs and strong correlation to both COSMIC signature 1 and patient age (p < 0.01) (Figure S7). Sig3, aligning with a number of DNA damage response signatures, made up a majority of the clustered mutations, and its most abundant substitutions resembled the known activation-induced cytidine deaminase (AID) C>T/G mutations in RCY motifs (Figures 5B, S7B, and S7C).36,37

Figure 5.

Landscape of somatic alterations in FL

(A) GISTIC 2.0 analysis of CNVs identified from WES (n = 123). G-score plot of significant CNA peaks (q < 0.1).

(B) Mutational signature trinucleotide single-base-substitution profiles derived from BayesNMF resultant signatures on all SNVs derived from WES (n = 123). Sig1 = aging, Sig2 = unknown, Sig3 = AID-like.

(C) Oncoplot of individual mutation and CNV events across FL patients (n = 123). Mutation variants are filtered to >1% variant allele frequency and genes that contain significant driver scores from CHASM and/or VEST. (Left) Driver scores from CHASM and VEST gene level scores.

(D) Ratios of clonal to subclonal events by gene/q-band. Analysis performed using ABSOLUTE and variant CCFs > 0.85 were considered clonal.

To identify novel as well as previously described driver genes, we used the OpenCRAVAT mutation impact scoring algorithms CHASM and VEST (Figure 5C, left panel). In total, we identified 158 recurrently mutated proteins, which are shown in the oncoplot (frequency of >3% in the cohort; Figure 5C). As has been previously reported, the most frequently altered genes (>20%) were KMT2D, BCL2, TNFRSF14, CREBBP, and MEF2B, all of which had mutation frequencies similar to those reported in FL in the COSMIC database (https://cancer.sanger.ac.uk),38 although a number exhibited patterns different from those reported in the literature (Table S6). Copy gains at chromosomes 2p16.1 and 18q21.33 and copy losses at chromosomes 6q14.2–23.3 and 1p36.23 were among the most prevalent copy-number alterations (CNAs), ranging from 18.7% to 22%. Previous reports suggest CREBBP and KMT2D to be among the earliest genetic events in the development of FL21,39; however, only 72 (58.5%) patients in this cohort have a mutation in KMT2D and/or CREBBP, suggesting that a number of possible early tumorigenic events have yet to be discovered. To identify the frequency of early versus late events in each mutated gene and copy-number variation (CNV), we used ABSOLUTE40 to measure the estimated cancer cell fraction (CCF) of each variant and further subclassified them into clonal (CCF ≥0.85) and subclonal (CCF <0.85) (Figure 5D). Most CNVs were predominantly clonal (mean = 82.6%), whereas the genes that were more frequently mutated in the cohort averaged closer to 50% clonal events (BCL2, 55.8%; CREBBP, 57.5%; EZH2, 54.5%), with the exception of KMT2D (76.9%). Taken together, these data highlight the importance of CNAs as significant genetic events and expand the breadth of known recurrent somatic alterations and early genetic events in FL.
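
The clonal versus subclonal split described above can be illustrated with a minimal sketch. It assumes that per-variant CCF estimates (for example, from ABSOLUTE) are already available as simple (feature, CCF) pairs; the feature names and CCF values below are hypothetical, and only the 0.85 cutoff comes from the text.

```python
# Minimal sketch: fraction of clonal events per gene or cytoband.
# Assumption: CCF estimates were computed upstream; the data here are made up.
CLONAL_CCF_THRESHOLD = 0.85  # cutoff stated in the text (CCF >= 0.85 is clonal)


def clonal_fraction(events):
    """Return {feature: fraction of events that are clonal}.

    events: iterable of (feature_name, ccf) pairs, one per variant or CNV call.
    """
    counts = {}
    for feature, ccf in events:
        clonal, total = counts.get(feature, (0, 0))
        is_clonal = 1 if ccf >= CLONAL_CCF_THRESHOLD else 0
        counts[feature] = (clonal + is_clonal, total + 1)
    return {feature: clonal / total for feature, (clonal, total) in counts.items()}


# Hypothetical toy input
toy_events = [("GENE_X", 0.90), ("GENE_X", 0.40), ("BAND_Y", 0.85)]
fractions = clonal_fraction(toy_events)
```

In this toy input, half of the GENE_X events and all of the BAND_Y events fall at or above the cutoff.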

Genetic events relate to FL tumor cell states

To understand the influence of genetic events on our newly identified FL B cell states, we next examined the enrichment of genetic features among each state. When we looked at overall tumor mutation burden (TMB), in INFM, PDZ, and CMI, we found that the INFM group trended lower compared to both PDZ and CMI (Figure 6A). Since the Sig3 mutational signature loosely resembled the known AID motif and aberrant AID activity is considered to be a major contributor of DNA mismatches leading to B cell malignancies,36,41,42,43 we next quantified the association between FL B cell state classes and the exposure to the Sig3 mutational signature (Figure 6B). Similar to TMB, Sig3 signature exposure trended upward from INFM to CMI, and the increased exposure in CMI, compared to INFM, approached the standard significance threshold (p = 0.069).

Figure 6.

Somatic alterations associate with B cell states

(A) Box and violin plots of TMB by B cell group (n = 119). Pairwise Wilcoxon test p values displayed.

(B) Box and violin plots of the Sig3 mutational signature exposure by B cell group (n = 119). Pairwise Wilcoxon test p values displayed.

(C) Heatmap of B cell state and mutation mutual association analysis was performed using a generalized linear model between sample B cell state values and genotype profile. Degree of association is determined from linear model coefficients and are colored from negative (purple) to positive (orange). Gray circles depict the association p value, and significant associations are marked with a thick black border (p < 0.05). Only genes with at least 1 association p < 0.1 are shown.

We next tested individual alterations for their contribution to the three B cell states (Figure 6C). There were no variants that were significantly enriched in INFM, although this was not unexpected because this group has a shared gene signature with our normal control tissue (Figure 1B) and trended the lowest for mutation burden. Several gene alterations were significantly positively enriched in the PDZ state, including immunoregulatory genes (HLA-B and β2M), copy-number losses to 6q14.2–23.3, which includes the NF-κB inhibitor gene TNFAIP3 (A20 protein), and genes associated with cell growth (ZFHX3) and differentiation (UBR5). Finally, the CMI state enriched positively for alterations in chromatin-modifying proteins such as CREBBP and KMT2D, mutations in BCL2, 10q23.31–24.1 copy-number losses, and copy-number gains at 2p16.1, including REL. Together, these results implicate coordinated somatic alterations to specific genes, or lack thereof, as an upstream contributor to the 3 FL B cell states.

Discussion

Understanding the molecular features of FL is a critical step to defining the biological basis for tumor heterogeneity, defining targetable pathways, and identifying biomarkers of outcomes, all high priorities for the National Cancer Institute (NCI)-driven paradigm for progress in follicular lymphoma.44,45 In the present study, we used our rich resource of tissue samples from highly annotated newly diagnosed or untreated FL tumors paired with innovative machine learning approaches to identify key transcriptional programs in FL, without the contaminating effects of the TME. We found that FL tumors are transcriptionally heterogeneous and coalesce into 3 continuous states (INFM, PDZ, CMI) with unique biological phenotypes. Specifically, these transcriptional states differentiate by activity in IFN and inflammatory signaling (INFM); proliferation, DNA damage response, and metabolic reprogramming (PDZ); or epigenetic activity (CMI). In addition, when paired with matched immune profiling and WES data, we further identified unique associations of these FL programs with somatic alterations and the tumor cell extrinsic microenvironment. We also expanded the known profile of recurrent mutations and CNAs in FL. Importantly, our study incorporated unbiased sample selection criteria that resulted in a sample cohort that is representative of all FL stages, grades, and FLIPI scores and included samples from all of the treatment categories (watch and wait, immunochemotherapy, rituximab monotherapy, and other treatment). The advantage of this is that it provides a larger window into the molecular features of all FL at diagnosis and/or pretreatment, in which new biology can be described and genomic landscape trends can be more accurately estimated.

Since much of the current heterogeneity in FL has been attributed to differences in TME or clinical presentation, the molecular basis for heterogeneity in FL tumor cells remains unknown. Glas et al. provided the largest unsupervised look at the FL transcriptome at diagnosis to date (N = 72) and concluded that FL was homogeneous, despite noting 3 general sample groupings from the hierarchical clustering.46 Given that transcriptomics on bulk tissue aggregates gene expression from all of the cells in the tissue, nontumor cell variation clouds over tumor cell heterogeneity.4,5 However, the finding that FL B cells exhibit coordinated heterogeneous profiles is not unprecedented, considering a number of single-cell RNA-seq studies have shown a similar result that tumor B cells grouped by patient, with little to no overlap between patients.7,20,47,48

Enhanced inflammation and NF-κB activation have previously been shown to define subclasses of FL, such as t(14;18) negative FL.49,50 APRIL and TACI were both defining genes for INFM, which also displayed characteristics of an NF-κB/IFN-β regulatory program, and have been found to play a role in FL biology.50 IFN-β is generally thought to be proapoptotic for tumor cells; however, APRIL, TACI, and type 1 IFNs can counteract this and are important for memory B cell survival at the expense of proliferation.51,52,53 Type 1 IFNs are potent activators of tumor-resident T cells and are a viable immunotherapy candidate in some cancers. Consistent with the inflammatory phenotype in INFM, T cells, DCs, FRCs, and vascular pericytes were enriched in the INFM group state (Figure 4). Interestingly, enhanced T cell signatures in FL have mostly been attributed to better outcomes, much like this group. Our genetic analysis of INFM identified a lower mutation burden and lack of enriched driver mutations. Without a significant number of genetic events, it is possible to hypothesize that INFM tumors have yet to acquire a sufficient quantity of genetic driver lesions to clearly distinguish it from benign tissue, maintaining an indolent disease sustained in part by the inflamed local TME.

The PDZ state exhibited significant enrichment of cell-cycle progression, inflammation, DNA damage response, and mTORC1 and BCR activity. In support of this, MKI67, AURKA, and CCNB1 were significant defining genes for the PDZ state, and FOXM1 and MYB activity had significant positive associations with the PDZ state (Figures 2A and 2D); both of the latter were previously identified as key regulators of B cell differentiation into centroblasts and are potent regulators of proliferation.54 The hallmark mTORC1 gene set was also significantly enriched in the PDZ state, which is capable of stimulating proliferation in B cells when activated by AKT via phosphatidylinositol 3-kinase and BCR stimulation.55,56 Immune profiling found that Tfh cells, CD14+ mDCs, and FDCs were positively associated with the PDZ state (Figure 4), all of which are known to support FL B cell growth and survival through tonic BCR signals and CD40 signaling, among others.16,57,58,59 This was further supported by the high quantity of Ki-67+ cells localized almost entirely to areas staining positive for FDCs. Genetic analysis additionally revealed a number of positively associated somatic alterations that can enhance the effect of these pathways on proliferation, such as copy-number losses of TNFAIP360 and mutations in ZFHX3.61 This group of cases had the worst outcome (EFS and OS) compared to INFM and CMI, but inhibitors of FOXM1 or mTOR may present promising therapeutic avenues for these patients in the future. Together, our data suggest that PDZ FL B cell state represents more aggressive FL tumors of proliferative, GC cycling B cells that are receiving growth and survival signals from a supportive cellular environment in addition to alterations that potentiate BCR and NF-κB signals.

CMI was the largest group in our cohort (50.4%) and enriched for a transcriptional program related to epigenetic reprogramming, metabolic reprogramming, and cell adhesion (Figures 2B and 2C, bottom). Concurrently, the CMI state displayed a strong association with alterations in a number of chromatin modifiers capable of controlling B cell differentiation, including KMT2D, CREBBP, and 10q23.31–24.1 losses (LCOR). Numerous canonical GC B cell genes (BCL2, CD38, MME, EP300) positively defined this group signature, and significant activity in BCL6, EP300, and FOXO1 supports the conclusion that CD40 and BCR stimulation do not occur in these samples. As a B cell exits the DZ, BCR stimulation leads to a decrease in FOXO1 and an intake of antigen for processing62 and subsequent engagement with Tfh cells (i.e., LZ). Whereas B cells typically undergo apoptosis at this stage if sufficient BCR stimulation is not given, mutations in BCL2 and loss of FAS were enhanced in the CMI state, providing the necessary inhibition of pro-apoptotic signals without the necessary LZ signals. In support of this, CMI negatively enriched for antigen processing and presentation gene sets and had significant negative associations with predominantly CD4 T cells, such as Tfh. These findings agree with previous findings that mutations in chromatin-modifying proteins (CREBBP, KMT2D, EZH2, and EP300) are capable of repressing antigen presentation and decreasing interactions with T cells,21,23 blocking necessary LZ signals, and promoting the BCL6 dependence of FL cells.63 Taken together, these data suggest that CMI group tumors have acquired the necessary genetic lesions in chromatin-modifier genes to sustain B cell survival and growth, independent of T cell help, and are developmentally stunted from exiting the GC state.

In addition to identifying FL states, our study highlights the complexity of profiling somatic genomic events in FL. We identified a large number of genes with mutations (Figure 5C); however, prior studies reflect a slightly different landscape compared to ours.14,21,64,65 The population frequencies of mutations in KMT2D and CREBBP are much higher in other works (Table S6). However, these studies lack significant numbers of samples with matched germlines, which is the gold standard for identifying somatic events, and enrich heavily for cohort characteristics representative of a small subset of newly diagnosed FLs (i.e., IC treatment, high grade, pretransformation). Our results reveal a more accurate picture of the mutation landscape in FL and represent the largest true discovery set to date.

Although it advances our current understanding of FL, this study also highlights a number of key aspects to consider with future work. Although our groups are not prognostic, they do provide some guidance as to the axis of susceptibility for targeted treatment. All 3 groups had dominant molecular characteristics that can guide new treatment strategies. For instance, INFM had enrichment of inflammation and cytotoxic cells, suggesting an immunotherapy such as chimeric antigen receptor-T cells or bispecific antibodies may be particularly effective in these patients. PDZ had strong metabolic characteristics, which pointed strongly toward dysregulated mTORC1, for which mTOR inhibition has shown some promise in FL.66 CMI, however, was defined by epigenetic dysregulation and a depleted TME. Tazemetostat has been shown to be an effective therapy in FL, targeting the chromatin methylator EZH2, although it has not been tested in a frontline setting.67 Intriguingly, although response to EZH2 inhibition is superior in patients harboring EZH2 mutations, wild-type patients also appear to benefit to a smaller extent, so perhaps epigenetically dysregulated FL overall could benefit from this drug. In summary, we show that FL tumor B cells differentiate among 3 transcriptional states (INFM, PDZ, CMI), with definable biological phenotypes and unique relationships to genetic and microenvironmental underpinnings.

Limitations of the study

The samples used in this study are sorted B cells from FL tumors. Because there are no distinguishing markers for FL B cells compared to normal B cells, we cannot rule out the potential for normal B cell signatures influencing our NMF results. However, prior studies using a similar sample type suggest that a majority of FL biopsies demonstrate kappa or lambda restriction in ≥95% of B cells from FL tumor samples.6,7 This study uses a diverse cohort of patients representative of a real-world clinical practice with no selection for clinical subgroups or initial treatments, which may contribute to the limited prognostic utility detected in our subgroups. However, the goal of the study was not necessarily to determine the prognostic value of these B cell states, but rather to describe novel biology, because gene expression studies have already extensively explored this question in IC-treated FL. Although we were able to validate the presence of our clusters in independent datasets, we were not able to fully validate the supporting mutation profiles, and this should be further investigated in larger, more contemporary cohorts with available WES data. Lastly, the B cell developmental origin of FL, among LZ and DZ, has been debated, and recent evidence suggests that the effort may be fruitless.6 However, our results do not insinuate that these states indicate a GC phenotypic identity of FL B cell samples, but rather identify something like a phenotypic scar, in which a state represents the most recent GC phenotype from which the samples deviated. Therefore, our results are consistent with the findings of Milpied et al.,6 in that these 3 B cell states exhibit signs of a prior GC reaction phenotype, but that GC dysregulation from alterations specifically to the GC reaction checkpoints pushed the tumor cells into adjacent phenotypic categories.

STAR★Methods

Key resources table

REAGENT or RESOURCE SOURCE IDENTIFIER
Antibodies

CD21-BX032 (EP3093)—Atto 550-RX032 Akoya Catalog: 4450027
CD20-BX007 (L26)—Alexa Fluor™ 750-RX007 Akoya Catalog: 4450018, RRID:AB_2915939
CD8-BX026 (C8/144B)—Atto 550-RX026 Akoya Catalog: 4250012, RRID:AB_2915960
CD4-BX003 (EPR6855)—Cy5-RX003 Akoya Catalog: 4350018, RRID:AB_2915936
Ki-67-BX047 (AKYP0052)-Atto 550-RX047 Akoya Catalog: 4250019, RRID:AB_2895046

Biological samples

Frozen tumor and lymph node biopsies This study N/A
Peripheral blood DNA This study N/A

Critical commercial assays

EasySep Human B Cell Enrichment Kit II Without CD43 Depletion STEMCELL Technologies Catalog: 17963
EasySep Human Memory B Cell Isolation Kit STEMCELL Technologies Catalog: 17864
TruSeq RNA Exome Kit Illumina Catalog: 20020189
Qiagen QIAamp DNA mini kit Qiagen Catalog: 51304
Agilent SureSelect XT Agilent N/A
Agilent SureSelect XT AllExon v5 + UTR kit Agilent N/A

Deposited data

Raw Data dbGAP Database dbGAP: PHS002989
Bulk B cell RNA-Seq Holmes et al. (2020)34 GEO: GSE139833
Human reference genome NCBI build 38, GRCh38 Genome Reference Consortium http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/
FL Biopsy CyTOF Yang et al. (2019)17 Data on hand
Stromal Gene Sets Mourcin et al. (2021)26 N/A
ExAC Karczewski et al. (2017)68 RRID:SCR_004068
1000 Genomes Project Clarke et al. (2012)69 RRID:SCR_008801

Software and algorithms

R (v4.2) R Development Core Team https://www.R-project.org/
FLBstate Prediction Novak Lab https://github.com/NovakLab/FL_Bstate
MAP-RSeq (v3.0.2) Kalari et al. (2014)70 https://bioinformaticstools.mayo.edu/research/maprseq/
Subread Liao et al. (2013)71 https://subread.sourceforge.net/
NMF R package (v0.26) Gaujoux and Seoighe (2010)72 https://cran.r-project.org/web/packages/NMF/
msigdbr R package (v7.5.1) Liberzon et al. (2015)73 https://cran.r-project.org/web/packages/msigdbr/
Harmonizome Rouillard et al. (2016)74 https://maayanlab.cloud/Harmonizome/
ReCiPa R package (v3.0) Vivar et al. (2013)75 https://cran.r-project.org/web/packages/ReCiPa/
fgsea R package (v1.28) Korotkevich et al. (2016)76 https://bioconductor.org/packages/release/bioc/html/fgsea.html
VIPER R package (v1.36) Alvarez et al. (2016)30 https://bioconductor.org/packages/release/bioc/html/viper.html
e1071 R package (v1.7) N/A https://cran.r-project.org/web/packages/e1071/
cytofclean R package (v1.0.3) N/A https://www.github.com/JimboMahoney/cytofclean
Phenograph R package Levine et al. (2015)77 https://github.com/JinmiaoChenLab/Rphenograph
SingScore R package (v1.22.0) Foroutan et al. (2018)78 https://bioconductor.org/packages/release/bioc/html/singscore.html
CODEX Multiplex Analysis Viewer (v1.5) Akoya https://help.codex.bio/codex/mav/overview
ImageJ fiji (v1.53f51) Schindelin et al. (2012)79 https://imagej.net/software/fiji/
Agilent AGeNT LocatIt (v4.0.1) Agilent N/A
BWA aligner (v0.7.10) Li and Durbin (2009)80 RRID:SCR_010910, https://www.illumina.com/products/by-type/informatics-products/basespace-sequence-hub/apps/bwa-aligner.html
GATK (v3.4-46) McKenna et al. (2010)81 RRID:SCR_001876, https://gatk.broadinstitute.org/hc/en-us
SomaticSniper (v1.0.4.2) Larson et al. (2011)82 RRID:SCR_005108, https://gmt.genome.wustl.edu/packages/somatic-sniper/
JointSNVMix (v0.8-b2) Roth et al. (2012)83 RRID:SCR_006804
MuTect (v1.1.7) Cibulskis et al. (2013)84 RRID:SCR_000559
Maftools R package (v3.18) Mayakonda et al. (2018)85 https://bioconductor.org/packages/release/bioc/html/maftools.html
PatternCNV (v1.0) Wang et al. (2014)86 https://bioinformaticstools.mayo.edu/research/patterncnv/
ABSOLUTE Carter et al. (2012)40 RRID:SCR_005198
GISTIC 2.0 Mermel et al. (2011)87 RRID:SCR_000151
Sigminer R Package (v2.2.2) Wang et al. (2020)88 https://cran.r-project.org/web/packages/sigminer/
OpenCRAVAT (v2.2.7) Pagel et al. (2020)89 https://opencravat.org/index.html
GraphPad Prism (v9.4.1) GraphPad RRID:SCR_002798

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Anne J. Novak, Ph.D. (Novak.Anne@mayo.edu).

Materials availability

This study did not produce any new unique reagents.

Data and code availability

  • Raw genomic and transcriptomic data files can be accessed via the database of Genotypes and Phenotypes (dbGAP) at https://www.ncbi.nlm.nih.gov/gap/ with dbGaP Study Accession: phs002989.

  • Original code has been deposited at GitHub as a package and is publicly available as of the date of publication. DOIs are listed in the key resources table. This paper does not report original code beyond this package.

  • Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Experimental model and study participant details

All patient samples in this study were collected with informed consent for research use and were approved by the Mayo Clinic Institutional Review Board in accordance with the Declaration of Helsinki.

Lymphoma and reactive lymph node tissues were selected from either the Mayo Clinic Lymphoma Biobank, the University of Iowa and Mayo Clinic Lymphoma Specialized Program of Research Excellence (SPORE) Biospecimen Core, or the SPORE Molecular Epidemiology Resource Biospecimen Core. There was no specific selection for sample sex or gender and the cohort is appropriately balanced for this characteristic.

Method details

FL sample collection and selection

Tumors from 129 newly diagnosed (93.2%) and/or untreated (6.8%) follicular lymphoma patients were included in this study, with clinical characteristics detailed in Table S1. Lymphoma tissue and tissue from reactive lymph nodes (n = 5) were selected from either the Mayo Clinic Lymphoma Biobank, the University of Iowa and Mayo Clinic Lymphoma Specialized Program of Research Excellence (SPORE) Biospecimen Core, or the SPORE Molecular Epidemiology Resource Biospecimen Core. Benign reactive lymph node samples were acquired by core needle biopsy and diagnosed by pathology review. Cryopreserved samples (N = 96) were processed from biopsies and frozen as single cell suspensions in DMSO cryopreservative. Frozen section samples (N = 59) were fresh excisional biopsies frozen in OCT mounting medium. A detailed description of the population from which this cohort is selected is outlined by Cerhan et al. (2017).90 Tumor DNA and/or RNA were extracted from whole tissue sections cut from FL biopsies frozen in OCT (Tumor OCT), or from ficolled single cell suspensions frozen in DMSO cryopreservative (Tumor DMSO). RNA was also isolated from purified FL B cells (Tumor B Cell). Briefly, cryopreserved single cell suspensions were initially thawed and washed with media. Each sample was then incubated at room temperature for 15 min in EDTA-free media with DNAse (375 U/ml), followed by a wash to neutralize the enzyme. B cells were isolated using an EasySep Human B cell enrichment kit without CD43 depletion (Stem Cell Tech, Cat# 19154), in accordance with manufacturer specifications. All samples were between 2 × 10⁷ and 3 × 10⁷ cells at the time of sort, which is well within the manufacturer-recommended cell concentration for the sort. The negative fractions (B cells) were immediately pelleted and resuspended in Qiazol for RNA extraction (see RNA sequencing). Germline DNA for WES was extracted from blood (n = 102) or the non-B cell fraction from the B cell enrichment (n = 21). Additionally, memory B cells were isolated from the peripheral blood of 5 healthy controls using the EasySep Human Memory B Cell Isolation Kit per manufacturer’s instructions (STEMCELL Technologies, Cat. No. 17864).

RNA sequencing

RNA was extracted from benign lymph nodes (n = 5), biopsies from samples stored in DMSO (n = 89), or fresh-frozen sections in OCT (n = 37), and sequencing was performed at the Mayo Clinic Genome Analysis Core using the Illumina TruSeq RNA Exome Kit for library preparation, sequencing platform HiSeq 4000, 100 × 2 paired end reads. The raw RNA sequencing paired-end reads were processed through the Mayo Clinic RNA-seq bioinformatics pipeline, MAP-RSeq version 3.0.2.70 Briefly, MAP-RSeq employs the very fast, accurate and splice-aware aligner STAR91 to align reads to the reference human genome build hg38. Gene and exon expression quantification were performed using the Subread71 package to obtain raw counts. Samples with library sizes below 15 million counts were removed from analysis, resulting in the removal of three samples from further analysis. Genes with median adjusted counts below 30 were removed from analysis. Counts were then normalized for library size and gene length by calculating transcripts per million (TPM), followed by log2(TPM+1) transformation. Finally, genes were filtered to include only protein coding, miRNA, and lncRNA genes and exclude genes encoded on X, Y, and M chromosomes.
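
The normalization step at the end of this pipeline can be sketched as follows. This is a minimal illustration of the standard TPM calculation followed by the log2(TPM+1) transform, not the MAP-RSeq implementation; the function names, counts, and gene lengths are hypothetical.

```python
import math


def tpm(counts, lengths_kb):
    """Convert one sample's raw gene counts to transcripts per million.

    counts: raw read counts per gene; lengths_kb: gene lengths in kilobases.
    """
    # Length-normalize first (reads per kilobase), then scale to one million.
    rpk = [count / length for count, length in zip(counts, lengths_kb)]
    per_million = sum(rpk) / 1e6
    return [value / per_million for value in rpk]


def log2_tpm(counts, lengths_kb):
    """Apply the log2(TPM + 1) transform described in the text."""
    return [math.log2(value + 1.0) for value in tpm(counts, lengths_kb)]


# Hypothetical toy sample with three genes
toy_counts = [100, 200, 700]
toy_lengths_kb = [1.0, 2.0, 7.0]
toy_tpm = tpm(toy_counts, toy_lengths_kb)
toy_log = log2_tpm(toy_counts, toy_lengths_kb)
```

Because the three toy genes have the same count per kilobase, they receive equal TPM values that sum to one million.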

Non-negative matrix factorization – sample clustering

B cell gene expression clustering was performed using non-negative matrix factorization (NMF), as previously described.92 NMF was performed on the B cell gene expression data using the NMF R package.72 We used the Brunet et al. (2004)92 method and performed 200 runs on factor numbers (k) 2 to 6. The ideal factoring rank was determined to be 3 based on suggestions by Brunet et al. (2004)92 to select the factor that occurs before a significant drop in the cophenetic correlation coefficient. A factor of 3 also had the highest cophenetic correlation coefficient. Samples were assigned to specific groups based on the result of the consensus clustering, built within the NMF R package. Group state coefficients were acquired from the result of the NMF function. Unlike the groups, these are not the result of multiple runs and consensus clustering, but are the factor coefficients (H matrix) from the run that resulted in the lowest reconstruction error. Each state was assigned to represent one of the groups based on overlapping samples between the assigned group and the highest-value state coefficient in each sample. Finally, state coefficients were normalized to a sum of 1 in each sample.
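
The last two steps of this paragraph (finding the highest-value state coefficient per sample and normalizing coefficients to sum to 1) can be sketched as below. This is an illustrative sketch only: it assumes the H matrix has already been obtained from an NMF run, and the row-to-state labeling and toy values are hypothetical.

```python
def normalize_states(h_matrix):
    """Scale each sample's state coefficients so that they sum to 1.

    h_matrix: list of rows, one per state, each row holding one value per sample.
    """
    n_samples = len(h_matrix[0])
    normalized = [[0.0] * n_samples for _ in h_matrix]
    for s in range(n_samples):
        total = sum(row[s] for row in h_matrix)
        for k, row in enumerate(h_matrix):
            normalized[k][s] = row[s] / total
    return normalized


def dominant_state(h_matrix, state_names):
    """Return, for each sample, the name of the state with the largest coefficient."""
    n_samples = len(h_matrix[0])
    return [
        state_names[max(range(len(h_matrix)), key=lambda k: h_matrix[k][s])]
        for s in range(n_samples)
    ]


# Hypothetical 3-state x 3-sample coefficient matrix; the row order is an assumption.
toy_h = [
    [2.0, 1.0, 0.5],
    [1.0, 3.0, 0.5],
    [1.0, 1.0, 4.0],
]
toy_names = ["INFM", "PDZ", "CMI"]
toy_norm = normalize_states(toy_h)
toy_labels = dominant_state(toy_h, toy_names)
```

Note that in the study the group labels come from consensus clustering; the dominant state is used here only to show how a state could be matched to a group by overlapping samples.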

External prognostic gene set scoring

For relative scoring among a set of samples, gene expression values were scaled to z-scores. For each sample in the set, a single gene set was scored as the mean z-score of the genes in that gene set. This represents only a score relative to the other samples in the set. This was not performed for sets of fewer than 80 samples, to ensure an accurate estimate of the population mean.
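
A minimal sketch of this mean z-score scoring is given below. It assumes expression is stored as a mapping from gene to a list of per-sample values and uses the sample standard deviation; both are assumptions, and the gene names and values are hypothetical. The toy example uses three samples purely for readability, whereas the text restricts scoring to sets of at least 80 samples.

```python
import statistics


def zscores(values):
    """Scale one gene's expression across samples to z-scores."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (an assumption)
    return [(value - mean) / sd for value in values]


def gene_set_scores(expression, gene_set):
    """Score each sample as the mean z-score over the genes in gene_set.

    expression: dict mapping gene name -> list of values, one per sample.
    Genes missing from the expression data are skipped.
    """
    scaled = [zscores(expression[gene]) for gene in gene_set if gene in expression]
    n_samples = len(scaled[0])
    return [sum(row[s] for row in scaled) / len(scaled) for s in range(n_samples)]


# Hypothetical toy data
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [2.0, 4.0, 6.0],
    "GENE_C": [5.0, 5.0, 1.0],
}
toy_scores = gene_set_scores(toy_expression, ["GENE_A", "GENE_B"])
```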

B cell state gene scoring

In order to assess contribution of all genes, regardless of the degree to which they are expressed overall, we computed the relationship between genes and states using scaled coefficient correlation values (Equation 1). The scaled coefficient correlation value (C) is calculated as the Pearson correlation between the expression of gene (i) and the coefficient value (j) for one of the three states across all samples multiplied by the negative log of the p value from this correlation:

$$C_{i,j} = r_{i,j} \cdot \left(-\log p_{i,j}\right) \qquad \text{(Equation 1)}$$

where $C_{i,j}$ is the scaled coefficient correlation between gene $i$ and coefficient $j$ from the NMF result. The value $r_{i,j}$ is the Pearson correlation between gene $i$ and coefficient $j$, and $p_{i,j}$ is the resultant p value. This takes into account the direction of the correlation, the degree to which a gene is correlated to that state, and a scaling to the likelihood of being a false positive.
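
Equation 1 can be sketched in a few lines, following the description of a Pearson correlation multiplied by the negative log of its p value. Two points are assumptions rather than the authors' implementation: the p value is approximated with the Fisher z-transformation so that only the standard library is needed (a t-distribution test would normally be used), and the logarithm is taken in base 10 because the base is not stated. All data are hypothetical, and genes with zero variance are not handled.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p value via the Fisher z approximation (an assumption)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))


def scaled_coefficient_correlation(gene_expression, state_coefficient):
    """C = r * (-log10 p): signed correlation weighted by its significance."""
    r = pearson_r(gene_expression, state_coefficient)
    p = max(approx_p_value(r, len(gene_expression)), 1e-300)  # avoid log(0)
    return r * -math.log10(p)


# Hypothetical toy data: one gene and one state coefficient across six samples
toy_gene = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_state = [0.10, 0.20, 0.25, 0.40, 0.50, 0.70]
c_positive = scaled_coefficient_correlation(toy_gene, toy_state)
c_negative = scaled_coefficient_correlation(toy_gene[::-1], toy_state)
```

Reversing the gene's expression flips the sign of the score while leaving its magnitude unchanged, which is the behavior that allows genes to be ranked as driving or countering a state.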

Gene pathway inference

To determine biological activity of each B cell state, we employed gene set enrichment analysis (GSEA). GSEA requires a ranked list of genes and the scores that determined the rank, usually from differential expression. In order to run GSEA on each of the three states, we ranked genes based on their scaled contribution score (Equation 1). This orders genes based on their likelihood of driving a single state or countering a single state. Gene sets were obtained from the MSigDB Reactome, GO-BP, and Hallmark collections using the R package msigdbr.73,93,94 Additionally, gene sets from the NCI lymphoid malignancy network,95 FL publications,4,5,6 and transcription factor target gene sets from the Harmonizome database74 were used. The ChIP-X, ENCODE, and TRANSFAC transcription factor target sets were collected from the Harmonizome, and identical transcription factors, along with their gene sets, were combined. To remove redundant gene sets, we combined gene sets into superpathways based on gene content overlap, using the ReCiPa package in R.75 Briefly, the method calculates overlapping information content between all gene sets and combines those with significant overlap into superpathways. The resultant superpathways were manually annotated based on the identities of the contained pathways.

Following this, the gene contributions to each state were ordered and GSEA was run using the R package fgsea.76 Enrichment was computed using the standard scoring method and an epsilon of 0. GSEA was only run on gene sets 40 to 500 genes in size. Results were filtered for adjusted p value <0.05, and a secondary collapse of redundant signals was run post hoc using the collapsePathways() function from the package. Results are reported in Table S2 and listed based on the coefficient from which the gene set was derived. There were no competing gene sets enriched in any of the three states; however, to avoid semantic redundancy, we manually selected gene sets to display in Figure 3.

VIPER master regulator

To infer potential master regulators of each B cell state, we used the VIPER R package.30 The msVIPER function requires a cell context-specific regulatory network, usually calculated from ARACNe. Because the package was originally developed using a B cell transcriptional network, we utilized this provided network as our network input. Next, the method operates similarly to GSEA and requires an ordered list of genes and their rank scores, so each state was run with the previously calculated gene-state contributions. It is optional but recommended to include a permuted null model of gene scores as an input. The built-in permuted null method results in null distributions from a t-test, but these scores would be on a different distribution than our gene contribution scores. Therefore, we shuffled gene expression values and state coefficients randomly and recalculated gene coefficient contribution scores from Equation 1, over 1000 permutations. Finally, the msVIPER function was run with these inputs, followed by shadow and synergy analyses. Results displayed reflect the most significant regulators for each group, filtered for non-redundancy and interpretability.

Germinal center program association

B cell gene expression data were interrogated for germinal center programs according to methods outlined by Holmes et al. (2020).34 Briefly, RNA-seq data from bulk B cells were scaled to z-scores and subsetted to only the 50 most significant defining genes for the 13 single-cell clusters in the Holmes et al. paper (acquired from Table S2). The bulk sample score for a single-cell cluster was calculated by summing the individual scores of genes (+1 or −1), where a gene received +1 if its z-score and the single-cell cluster enrichment were of the same sign and −1 if the signs were different. The final score for a B cell sample and GC cluster ranged from −1 to 1, based on the summed vote divided by the total number of votes possible. To ensure robustness from random effects, we calculated an empirically derived null distribution for each GC cluster gene set. This was accomplished by randomizing genes and samples, followed by a calculation of the enrichment scores described above, which was repeated 1000 times. We generated p values of individual scores using an empirical cumulative distribution function (ecdf) of the empirically derived score distributions. The scoring quality was tested against controls included in Holmes et al. (2020),34 which were bulk RNA-seq samples of sorted human tonsil DZ, LZ, centroblast, and memory B cells. Non-log-scaled TPM values were acquired from GEO (GSE139833). FL samples were individually assigned to a scCluster based on the highest scoring gene set, among those that were significant. To determine association between scCluster scores and B cell group assignment, we ran hypergeometric testing between in-class vs. out-of-class designation and the number of samples scoring above the upper 95% CI for a particular GC cluster. The same was run for negative associations of those scores occurring below the lower 95% CI. Subsequent p values were FDR adjusted and displayed as -log(q-value).
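
The sign-vote score and its empirical null can be sketched as follows. This is a simplified illustration under stated assumptions: the null is built by shuffling z-scores among genes within one sample (the text randomizes both genes and samples), the p value is an upper-tail proportion, and a z-score of exactly 0 is treated as negative. Gene names and values are hypothetical.

```python
import random


def vote_score(sample_z, cluster_signs):
    """Summed sign votes divided by the number of votes, giving a value in [-1, 1].

    sample_z: dict gene -> z-score in one bulk sample.
    cluster_signs: dict gene -> +1 or -1 (direction of enrichment in the cluster).
    """
    votes, total = 0, 0
    for gene, sign in cluster_signs.items():
        if gene not in sample_z:
            continue
        total += 1
        same_sign = (sample_z[gene] > 0) == (sign > 0)
        votes += 1 if same_sign else -1
    return votes / total if total else 0.0


def empirical_p(sample_z, cluster_signs, n_perm=1000, seed=0):
    """Observed score and the fraction of permuted scores at least as large."""
    rng = random.Random(seed)
    observed = vote_score(sample_z, cluster_signs)
    genes = list(sample_z)
    values = [sample_z[gene] for gene in genes]
    at_least = 0
    for _ in range(n_perm):
        shuffled = values[:]
        rng.shuffle(shuffled)
        if vote_score(dict(zip(genes, shuffled)), cluster_signs) >= observed:
            at_least += 1
    return observed, at_least / n_perm


# Hypothetical toy sample and cluster signature
toy_z = {"G1": 1.2, "G2": -0.5, "G3": 0.3, "G4": -2.0}
toy_signs = {"G1": 1, "G2": -1, "G3": 1, "G4": 1}
toy_score, toy_p = empirical_p(toy_z, toy_signs)
```

In the toy example three of four genes agree in sign with the cluster signature, so the score is (3 − 1) / 4.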

Cluster and coefficient prediction algorithm

In order to obtain genes which are linearly defining for each state in bulk RNA-seq samples, we utilized methods inspired by Newman et al. (2019)96 to deconvolve B cell states from bulk tumor RNA-seq. First, we determined a basis matrix (W), as if these samples had yielded the same state values from NMF. The bulk, unsorted RNA-seq data represents samples from two different origins (processed single cell suspension frozen in DMSO (DMSO) and fresh frozen tumor sections (OCT)), with 16 samples overlapping between the two. Using all samples from both origins, we removed genes which had high association with sample origin. Next, we combined non-overlapping sample gene expression profiles (non-logged TPMs) with preference for OCT in the matching samples, in our signature training set (n = 91). Because of the non-negative constraint of NMF, we utilized non-negative least-squares regression to infer basis values of genes from bulk RNA-seq given known coefficients (state values) from matching bulk B cell gene expression samples.

NNLS is prone to spurious error due to the sheer number of genes being tested, so we accounted for basis gene variance by randomly sampling 50% of samples and calculating a basis, which was repeated 10 times, resulting in 10 basis matrix results for each group. We then filtered out genes with basis values less than 1 in all factors. Basis values were then transformed to log2(value + 1) and filtered for genes with low variance (σ2 < 0.5) within any of the three factors among the 10 runs. Finally, we ran a Wilcoxon rank-sum test across every gene comparing each factor’s basis values to the other two for significance testing. To select genes for the signature matrix, a score of -log(p value)∗Fold-Change was generated for each gene and each basis factor. Once genes were selected, we calculated a signature matrix by averaging the basis values from each group for each gene.

To determine the predicted state value for each sample, we utilized ν-SVR from the e1071 R package. Much like CIBERSORTx,96 we targeted a signature matrix as our predicted value and collected state values from the coefficients following training. The input for each sample is the log2(TPM+1) values of genes in the signature matrix. Additionally, the target signature matrix is also log transformed as log2(value + 1). Resultant coefficients for each sample were adjusted to sum to one before comparing to prior values during training. In order to determine an optimal nu value and an optimal number of genes for the signature matrix, we ran a grid search to calculate coefficient values on each sample and computed a root mean squared error (RMSE) for nu values ranging from 0.1 to 0.9 by 0.1 increments and a number of genes per state (25, 50, 75, 100, 125, 150). We selected the optimal pair as a nu value of 0.1 and 50 genes per state (150 total) (Figure S4A). The resultant signature (Figure S4B) was then used to determine the state prediction accuracy in each of the training samples. Normally, SVR would create a “perfect” model for the training set and the result would be a perfect prediction. However, by predicting a signature common to all samples, the result is a new model for each sample. Therefore, we utilized the training data to evaluate the accuracy of this resultant signature in the cryopreserved samples and the frozen section samples separately (Figure S4C). Subsequently, the testing set samples were run using this same signature matrix.

We assigned test samples to groups by first assigning each sample to the group of its maximum coefficient. If a sample was assigned to the INFM group but the CMI state value was greater than 0.38, it was reassigned to CMI. If a sample was first assigned to PDZ and the INFM state value was greater than 0.42, it was reassigned to INFM; and if a sample was assigned to CMI and the PDZ state value was greater than 0.36, it was reassigned to PDZ. This heuristic represents a dominance feature of each state on another and rotates the decision boundary slightly beyond the simple highest score.
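The assignment rule above can be written as a short Python sketch. The thresholds are those stated in the text; the function and variable names are hypothetical, and applying the reassignment only once (no chaining) is an assumption.

```python
# Maps the initially assigned state to (dominant rival state, rival threshold),
# using the thresholds given in the text.
REASSIGN = {
    "INFM": ("CMI", 0.38),
    "PDZ": ("INFM", 0.42),
    "CMI": ("PDZ", 0.36),
}

def assign_group(coefs):
    """coefs: dict with keys 'INFM', 'PDZ', 'CMI' holding normalized state values.

    Assumption: a single reassignment pass; ties in the maximum resolve to the
    first key encountered.
    """
    first = max(coefs, key=coefs.get)
    rival, cutoff = REASSIGN[first]
    return rival if coefs[rival] > cutoff else first

# Hypothetical sample: INFM is the maximum, but CMI exceeds 0.38.
example = assign_group({"INFM": 0.45, "CMI": 0.40, "PDZ": 0.15})
```

In the example, the sample is first assigned to INFM and then moved to CMI because its CMI value (0.40) exceeds 0.38.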

CyTOF processing

Mass cytometry samples were originally processed, stained, and run as described in Yang et al. (2020).15 Samples were run individually over the course of 12 days and 500k events were collected for each sample. Raw fcs files were obtained and pre-processed in R according to a custom pipeline based on previously outlined methods.97,98 Briefly, arcsinh transformed events from samples run on the same date (batch) were imported into R, without extreme value truncation. Bead normalization was performed as previously described.99 Samples were then concatenated if split between two runs. Gaussian mixture based modeling of Ce140 expression, assuming a mixture of cells and beads, identified the lower tail 5% probability of belonging to the bead distribution (Ce140-high) and selected all events below that value. Live singlets were identified using a series of statistically derived gates on Gaussian parameters, Ir191 DNA, and live-dead stain, as previously described98 and in github/JimboMahoney/cytofclean.

To QC samples, we measured the percentage of events which were CD3+/CD19+ and excluded samples with more than 2% of their events falling in this gate. Additionally, samples were excluded if they had fewer than 50,000 live singlet events following pre-processing, with the exception of 1 sample which had 47,000 events.

Surface markers can have unique ranges of expression, which represent real biological differences but are not reflective of marker importance; therefore, markers were scaled to a local maximum within each sample. Marker maxima were determined as the 99.5th percentile of 50,000 randomly selected events from each sample. Some markers in our panel are exclusive to cell types which may be quite rare in some samples. Therefore, the true 99.9th percentile for that marker may be much lower than the actual upper level of expression. In order to account for this possibility, we compared these marker maxima to the 99.5th percentile maxima of events collected on the same day, which had already been normalized together. For every sample, the midpoint between the two maxima was used as the maximum for that marker and that sample. Importantly, we did not use the batch maxima alone for each sample because some inter-sample variability can occur after normalization, despite samples being normalized together.
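The midpoint scaling can be sketched in Python as follows. This is an illustration only: the function names and the linear-interpolation percentile are assumptions, the 50,000-event subsampling is omitted, and scaled values are not clipped because the text does not say whether they were.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    xs = sorted(values)
    if not xs:
        raise ValueError("percentile of empty sequence")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def marker_maximum(sample_events, batch_events, q=99.5):
    """Midpoint between the sample-level and batch-level q-th percentiles."""
    return (percentile(sample_events, q) + percentile(batch_events, q)) / 2.0

def scale_marker(sample_events, batch_events, q=99.5):
    """Scale one marker in one sample to its midpoint maximum."""
    m = marker_maximum(sample_events, batch_events, q)
    if m <= 0:
        return [0.0 for _ in sample_events]  # assumption: marker absent everywhere
    return [x / m for x in sample_events]

# Hypothetical intensities: the batch reaches higher expression than this sample.
sample = [float(v) for v in range(201)]     # 0 .. 200
batch = [2.0 * v for v in range(201)]       # 0 .. 400
scaled = scale_marker(sample, batch)
```

Here the sample-level maximum is 199 and the batch-level maximum is 398, so the marker is scaled to 298.5 and a rare bright population in this sample would not be forced to the top of the scale.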

CyTOF clustering

A single-sample metaclustering approach was used as previously described.77,100 Our panel was heavily populated by T cell markers and more limited for discovering B cell phenotypes. However, because B cells are the predominant population, a limited number of markers describe much of the differences between cells and mask the identification of smaller populations. To circumvent this problem, normalized marker expression of 50,000 events from each sample was first clustered using Phenograph with k = 20, resulting in 20–30 clusters.77 To identify and remove any doublet events remaining from processing, events in clusters which had mean expression of CD3 and CD19 above 0.5 for both were removed. This process removes a maximum of 5–6% of events from each sample. Samples with more than 5000 doublet events removed were excluded from further analysis. The remaining clusters were then separated into B cell and non-B cell buckets, based on cluster mean CD19 expression. Sample-specific cell clusters were identified from the B cell and non-B cell normalized events separately, using Phenograph. For each bucket, we set k to a dynamic range between a minimum of 7 and a maximum of 15, based on floor((number of cells in bucket)∗0.002), as previously described.100 Resultant cluster marker centroids and the number of cells per cluster were saved.
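The dynamic choice of k for each bucket amounts to a clamped floor, sketched here in Python. The function name and default arguments are hypothetical; only the 0.002 factor and the 7 to 15 range come from the text.

```python
import math

def phenograph_k(n_cells, fraction=0.002, k_min=7, k_max=15):
    """Return floor(n_cells * fraction), clamped to the range [k_min, k_max]."""
    return max(k_min, min(k_max, math.floor(n_cells * fraction)))

# Hypothetical bucket sizes.
ks = [phenograph_k(n) for n in (1000, 5000, 50000)]
```

A bucket of 1,000 cells falls back to the minimum of 7, a bucket of 5,000 cells gives k = 10, and a bucket of 50,000 cells is capped at 15.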

After repeating this process for each sample, there were 3,331 resultant clusters from all samples. A final check for doublet events was run with the aforementioned cutoff criteria, and clusters were removed if expressing both CD19 and CD3. Because the marker panel contained mostly markers meant to identify phenotypic diversity of T cells, B cell clusters were more likely to inadvertently combine with non-B clusters due to marker expression similarities outside of primary lineage markers. Therefore, we split clusters using a simple k-means approach on the mean lineage marker expression (CD45, CD3, CD19, CD56, CD14, CD141) in these clusters (k = 4), which split the clusters into B cell, T cell, and a non-B or T cell group. To combine these data into cellular meta-clusters, we ran a final step of Phenograph on the cluster centroids from the groups delineated in the k-means. The Phenograph k was set to 10 for each, and marker inclusion was specific to known expression in T cells, B cells, or other cell types, to decrease the likelihood of artifact cell types. The resultant metaclusters were then evaluated for combining based on their exclusivity to a limited number of batches or to fewer than 20 samples (i.e., likely artifacts). Clusters that met these criteria were combined with their most similar neighbor based on marker expression and expert manual review. B cell clusters from CyTOF were not the subject of investigation and the T cell driven panel made it difficult to identify biologically relevant B cell metaclusters; therefore, all B cell metaclusters were combined into a single B cell metacluster, in addition to any remaining B cell metaclusters from the other category. Final cluster phenotypic annotations were determined by expert manual review.

Tumor microenvironment stromal and B cell deconvolution

To estimate relative quantities of stromal cells, we utilized a gene set scoring approach on gene sets published by Mourcin et al. (2021).26 These gene sets represented defining genes for follicular dendritic cells (FDC), follicular reticular cells (FRC), and vascular pericytes (DN), derived from scRNA-seq of human lymph nodes. Samples from frozen sections represent tissues without manipulation and were therefore used for this analysis. Using TPM values from frozen section RNA-seq, we removed genes from each gene set which could potentially be aberrantly expressed by FL B cells, by removing genes with a Pearson correlation greater than 0.2 between the gene expression and the estimated B cell content, a process originally outlined by Jiménez-Sánchez et al. (2019).101 We then used the singScore R package78 to score each sample on the remaining genes of each gene set, and scores were scaled to Z score values.
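The correlation-based gene filter can be sketched in Python as below. The function names and toy data are hypothetical; the 0.2 cutoff is the one stated in the text, and returning a correlation of 0 for constant vectors is an assumption.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: constant vectors are treated as uncorrelated
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def filter_gene_set(gene_set, expression, b_cell_content, max_r=0.2):
    """Keep genes whose expression does not track B cell content (r <= max_r).

    expression: dict gene -> list of TPM values, one per sample
    b_cell_content: list of estimated B cell percentages, same sample order
    """
    return [g for g in gene_set
            if g in expression and pearson(expression[g], b_cell_content) <= max_r]

# Hypothetical four-sample example.
b_content = [10, 20, 30, 40]
expr = {"G1": [1, 2, 3, 4], "G2": [4, 3, 2, 1], "G3": [1, 3, 1, 3], "G4": [2, 1, 1, 2]}
kept = filter_gene_set(["G1", "G2", "G3", "G4"], expr, b_content)
```

G1 (r = 1) and G3 (r ≈ 0.45) rise with B cell content and are dropped, whereas G2 (r = −1) and G4 (r = 0) are retained for scoring.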

To estimate the percentage of B cells in all samples, we utilized RNA-seq from samples which also had B cell quantities measured by CyTOF to train a linear support vector regression model (ε-SVR). In addition to 65 FL samples with B cell percentage measured by CyTOF, we simulated 5 additional tumors with 100% B cell purity by randomly sampling 10 FL B cell RNA-seq samples and recalculating the TPM gene expression values for a mixture of these samples. Using these 70 “bulk” samples and their measured B cell quantities, we selected genes with a Pearson correlation value above 0.5 or below −0.7 between B cell content and the TPM values. We tuned the ε-SVR model using the selected input genes to predict B cell percentage, across cost values of (1, 10, 50, 100, 200, 400) and epsilon values of 0.1–1, by 0.1, on a linear kernel and a 10-fold cross validation. A cost of 1 and an epsilon of 0.5 were optimal and yielded an RMSE of 9.24%. A final model was trained using the optimal parameters from tuning and B cell quantities were computed on test samples.

Cell-type abundance and B cell state relationship

Determining the relationship between FL B cell states and cell-type abundance was based on a Pearson correlation between the normalized state coefficient from NMF and the percent abundance of a particular cell-type. In the case of stromal cells, scaled cell-type scores were used in lieu of percentages and predicted state values were used, except in frozen section samples with a matching B cell RNA-sample. Significance was determined based on the Pearson correlation p value and only values below 0.1 are shown.

CODEX multiplexed IHC

Methods for CODEX of these samples have been outlined previously.19 Briefly, 3 FL samples, one to represent each of the FL states, were selected. Sections from FFPE blocks were stained with hematoxylin and eosin (H&E) and annotated by an expert pathologist to identify malignant follicles and inter-follicular tumor areas. ROIs were selected using a 20× air objective during CODEX cycles. Each ROI consisted of 5 × 5 tiles with 30% overlap. An 8-micron thick section, adjacent to those already evaluated from the FFPE tissue blocks, was mounted on a poly-lysine coated glass coverslip and stained with a cocktail containing 17 nucleotide-barcoded primary antibodies following heat-based antigen retrieval, although for this study we only report on the expression of 5 antibodies. For the purposes of this study, tissue was stained using the following antibodies purchased from Akoya: DAPI (#7000003), CD21-BX032 (EP3093)—Atto 550-RX032 (#4450027), CD20-BX007 (L26)—Alexa Fluor 750-RX007 (#4450018), CD8-BX026 (C8/144B)—Atto 550-RX026 (#4250012), CD4-BX003 (EPR6855)—Cy5-RX003 (#4350018), Ki-67-BX047 (AKYP0052)-Atto 550-RX047 (#4250019). The section underwent nuclear staining (DAPI) and was loaded on the stage of an automated inverted fluorescence microscope connected to the robotic fluidic system known as Co-Detection by Indexing (CODEX, Akoya Biosciences). Each tissue underwent a total of 8 iterative cycles including 2 blank cycles (beginning and end) for auto-fluorescence background subtraction. Following CODEX acquisition, images were processed using the CODEX processor software (Akoya Biosciences). Analysis was performed using the CODEX Multiplex Analysis Viewer (MAV) software in ImageJ Fiji (v1.53f51).

Whole exome sequencing

Whole exome sequencing was performed on tumor-normal pairs from 123 patients (Table S1). A detailed description of the whole exome sequencing method has been published previously.102 Briefly, tumor biopsy DNA was extracted from samples stored in DMSO (n = 90), or fresh-frozen sections in OCT (n = 33), along with matching germline samples from blood (n = 123), using the Qiagen QIAamp DNA mini kit (51304). Matched germline and tumor DNA samples were collected and sent for whole exome sequencing at the Mayo Clinic Genome Analysis Core. Library preparation was done using the Agilent SureSelect XT and Agilent SureSelect XT AllExon v5 + UTR kit and sequencing was carried out on an Illumina NovaSeq 6000, 150 × 2 paired end reads. Raw FASTQ files were processed in the Mayo Clinic Bioinformatics core. Prior to alignment to the human reference, Agilent AGeNT LocatIt (version 4.0.1) was used to remove duplicate reads using unique molecular barcodes per fragment information. Non-duplicate reads were mapped to the human reference genome (hg38) using BWA aligner version 0.7.10, followed by realignment and recalibration using GATK version 3.4–46. Somatic mutations were called using consensus calls from three somatic callers, SomaticSniper (v1.0.4.2), JointSNVMix (v0.8-b2), and MuTect (v1.1.7). Mutations were annotated using an in-house annotation tool, BioR. To further filter for artifacts and germline mutations, we assessed germline mutations from 987 samples from an in-house biobank and public datasets, ExAC68 and The 1000 Genomes Project.69 Mutations present at a frequency of greater than 1% in each of the datasets were excluded. Additionally, any mutation with a depth of less than 10 supporting reads and/or with a variant allele frequency of less than 1% was filtered out.
All sample identities and their matched germlines were confirmed using the maftools sampleSwaps() function.103 No sample cross contamination was found and all samples appropriately matched with their germline counterpart.

Absolute copy number and mutation frequency resolution

Whole exome copy number profiles were first generated using PatternCNV,86 which resulted in a segmentation file for each sample with segment size, location, and copy ratios across the genome. To identify sample tumor purity and ploidy, and to adjust mutation/CNV cancer cell fraction (CCF), ABSOLUTE40 was run using a combination of the segmentation and mutation maf files for each sample individually. Chromosome X segments were excluded from this analysis due to the annotated nature of the segmentation file compared to that of an SNP array. Along with the model likelihood ratio test, measured and predicted B cell contents in each sample, from CyTOF and deconvolution (see tumor microenvironment stromal and B cell deconvolution), were used as an a posteriori estimation of tumor cell content, to identify candidate models which are both close to the prior B cell measurement and among the likeliest models. We found that model selection was dramatically simplified by the addition of mutations to the ABSOLUTE algorithm.

Mutational signature analysis

To identify independent mutational processes present in these samples, we utilized an approach similar to Chapuy et al., which applied trinucleotide mutations from WES to a Bayesian non-negative matrix factorization (bayesNMF). First, unfiltered synonymous and non-synonymous somatic variants above 1% allele frequency were classified as clustered or non-clustered based on whether their nearest mutant neighbor distance (NMD), i.e., the minimum distance to the nearest mutation in the same sample, was less than or greater than 5 kb, respectively. Two separate trinucleotide matrices were made from clustered and non-clustered mutations and then columns were combined for bayesNMF, such that each sample had counts for a clustered and a non-clustered form of each trinucleotide. Signatures were extracted using the L1WL2H form of bayesNMF within the sigminer R package,88 with the combined trinucleotide matrix as input. For this analysis, the most recent COSMIC single base signatures (SBS) for the hg38 reference genome were used as a reference signature to which the bayesNMF could converge. The model was initialized with 10 signatures, 50 runs, 2e5 iterations, and a tolerance of 1e−7. Following 50 runs, the optimal number of k factors (k = 3) was determined as the number of factors yielding the highest mean cosine similarity with the reference signature amongst all 50 runs. Signatures were annotated using a combination of the cosine similarities between signatures and the reference COSMIC SBS signatures, the relative exposure among clustered and non-clustered mutations, and manual inspection of base changes. Single sample exposures were calculated with the sigminer R package. Signature 1 demonstrated an abundance of C>T mutations at CpG motifs and was predominantly non-clustered. Signature 1 also had strong cosine similarity to SBS1 and the signature exposure was significantly correlated with age in this dataset, so it was determined that Signature 1 was an aging signature.
Signature 3 exhibited enriched C>T and C>G mutations, particularly in the known AID motif, RCY (R = A/G and Y = C/T), and the signature was mostly comprised of clustered mutations. Because of this, we considered Signature 3 to be a possible AID-like signature, although it did not have a particularly strong cosine similarity to SBS84 or SBS85. Considering that an overwhelming majority of the samples used to derive COSMIC SBS signatures are from non-lymphoid tumors, whereas AID is presumed to be most important in lymphoid tumors, it is likely that the true context of AID mutagenesis is not captured in this database. Signature 2 consisted mostly of T>G mutations at ATG sites and was mostly non-clustered. This signature was present in most samples and exhibited no specific gene enrichment. The strongest cosine similarity was to SBS55, which is considered a potential sequencing artifact, although the evidence is unclear. We are as yet unsure of the nature of this signature, but the available data on SBS55 suggest these mutations may be associated with regions bound by acetylated H3K27.
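The clustered versus non-clustered classification described above can be sketched in Python. The function name and input layout are hypothetical. Treating a distance of exactly 5 kb as non-clustered, and a mutation with no neighbor on the same chromosome as non-clustered, are assumptions because the text does not specify either case.

```python
from collections import defaultdict

def classify_clustered(mutations, max_gap=5000):
    """Label each mutation by its nearest-neighbor distance within the same sample.

    mutations: iterable of (sample, chrom, pos) tuples
    Returns dict (sample, chrom, pos) -> 'clustered' or 'non-clustered'.
    """
    by_key = defaultdict(list)
    for sample, chrom, pos in mutations:
        by_key[(sample, chrom)].append(pos)

    labels = {}
    for (sample, chrom), positions in by_key.items():
        positions.sort()
        for i, pos in enumerate(positions):
            gaps = []
            if i > 0:
                gaps.append(pos - positions[i - 1])
            if i < len(positions) - 1:
                gaps.append(positions[i + 1] - pos)
            # Assumption: a lone mutation on a chromosome has no neighbor distance.
            is_clustered = bool(gaps) and min(gaps) < max_gap
            labels[(sample, chrom, pos)] = "clustered" if is_clustered else "non-clustered"
    return labels

# Hypothetical calls from two samples.
calls = [("S1", "chr1", 1000), ("S1", "chr1", 3000), ("S1", "chr1", 20000),
         ("S2", "chr1", 1500), ("S1", "chr2", 1200)]
labels = classify_clustered(calls)
```

The two S1 mutations 2 kb apart on chr1 are labeled clustered; the S1 mutation 17 kb away and the mutations without a same-sample neighbor are labeled non-clustered. Counts from each label would then fill the two trinucleotide matrices.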

Copy number alterations detection

To determine oncogenic patterns of copy number variations in these samples, we applied the GISTIC2.0 algorithm to adjusted log2(copy ratios).87 Using the output from ABSOLUTE, we performed in-silico admixture removal (ISAR), to generate a segmentation file which represents copy number segments from the cancer cell fraction. The Gene Pattern module for GISTIC2.0 was run using this adjusted segmentation file as input, along with a q-value threshold of 0.1, confidence interval cutoff of 75%, and a focal length cutoff of 0.98. Because of the ISAR adjusted segmentation files, amplification threshold was set to 0.43, and the deletion threshold was set to −0.9, which are representative cutoffs for complete haploid copy losses and gains. Finally, log2(Copy Ratio) values were capped to between −3 and +2.
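For orientation, the idealized log2 copy ratios for a single-copy gain and a single-copy loss can be written out as below. This assumes a diploid baseline and segments fully corrected for admixture; both are simplifying assumptions made here for illustration.

```latex
\log_2\!\left(\frac{2+1}{2}\right) = \log_2 1.5 \approx 0.585,
\qquad
\log_2\!\left(\frac{2-1}{2}\right) = \log_2 0.5 = -1
```

The thresholds of 0.43 and −0.9 lie slightly inside these idealized values.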

Driver mutation filtration

Potential driver alterations were identified by two strategies. First, we included all mutations in 268 known lymphoma driver genes (Chapuy et al., 2018; Pillonel et al., 2018; Reddy et al., 2017; Schmitz et al., 2018), and all mutations in genes reported as mutated in previous FL studies.9,12,14,21,64,104 Second, we performed a de-novo driver prediction using openCRAVAT.89 To filter mutated genes, we utilized the ChASM score and VEST score produced by CRAVAT. The ChASM score is related to the oncogenic driver potential of a missense mutation,105 and VEST measures the functional significance of a variant.106 Both are included because ChASM is incapable of evaluating variants other than missense. Genes with at least two mutations that each had a ChASM score p value <0.1 and a VEST score p value <0.01 were included. Additionally, genes which had no variants with a ChASM score were included if they had at least two variants with a VEST p value <0.05. Genes selected from the openCRAVAT driver prediction were combined with the genes acquired from published literature.
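The de-novo gene filter can be expressed as a small Python sketch. The record layout (`gene`, `chasm_p`, `vest_p`) and the function name are hypothetical and do not reflect the openCRAVAT output format; only the p value cutoffs and the two-variant minimum come from the text.

```python
from collections import defaultdict

def select_driver_genes(variants, min_hits=2):
    """Return the set of genes passing the ChASM/VEST rules described in the text.

    variants: iterable of dicts with keys 'gene', 'chasm_p', 'vest_p';
              a p value of None means the score was not available.
    """
    by_gene = defaultdict(list)
    for v in variants:
        by_gene[v["gene"]].append(v)

    selected = set()
    for gene, vs in by_gene.items():
        if any(v["chasm_p"] is not None for v in vs):
            # Gene has ChASM-scored variants: require both scores to pass.
            hits = sum(1 for v in vs
                       if v["chasm_p"] is not None and v["vest_p"] is not None
                       and v["chasm_p"] < 0.1 and v["vest_p"] < 0.01)
        else:
            # No ChASM-scored variants at all: fall back to VEST alone.
            hits = sum(1 for v in vs
                       if v["vest_p"] is not None and v["vest_p"] < 0.05)
        if hits >= min_hits:
            selected.add(gene)
    return selected

# Hypothetical variant records.
records = [
    {"gene": "GENE_A", "chasm_p": 0.05, "vest_p": 0.001},
    {"gene": "GENE_A", "chasm_p": 0.08, "vest_p": 0.005},
    {"gene": "GENE_B", "chasm_p": None, "vest_p": 0.02},
    {"gene": "GENE_B", "chasm_p": None, "vest_p": 0.04},
    {"gene": "GENE_C", "chasm_p": 0.05, "vest_p": 0.03},
    {"gene": "GENE_C", "chasm_p": None, "vest_p": 0.001},
]
drivers = select_driver_genes(records)
```

GENE_A passes on two missense variants, GENE_B passes on the VEST-only rule, and GENE_C fails because it has a ChASM-scored variant but fewer than two variants meeting both cutoffs.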

Causal somatic alterations model

We estimated the association between somatic alterations and B cell states by performing linear regression between state values and genotype. First, the genotype matrix was reduced to only samples with B cell state scores (n = 119), followed by genes/q-bands which were altered in 5 or more of the remaining samples. For each gene-state pair, we used a generalized linear model to predict the state value given the alteration status of each sample. We filtered genes to only those with a p value below 0.1 for at least one state and report regression coefficients to determine if the association is positive or negative.
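The recurrence filter and the sign of the regression coefficient can be illustrated with a minimal Python sketch. It assumes an ordinary least-squares fit (a Gaussian model with identity link), which the text does not specify, and it omits the p value calculation; the function names and data are hypothetical.

```python
def recurrent_alterations(genotype, min_samples=5):
    """Keep genes/bands altered (status 1) in at least min_samples samples.

    genotype: dict feature -> list of 0/1 alteration calls, one per sample
    """
    return {g: calls for g, calls in genotype.items() if sum(calls) >= min_samples}

def ols_slope(x, y):
    """Least-squares slope of y on x; for 0/1 x this is the difference in group means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("alteration status is constant across samples")
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx

# Hypothetical data: 10 samples, one recurrent and one rare alteration.
genotype = {
    "GENE_X": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
    "GENE_Y": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
}
state_values = [0.6, 0.6, 0.6, 0.6, 0.6, 0.2, 0.2, 0.2, 0.2, 0.2]
kept = recurrent_alterations(genotype)
coefs = {g: ols_slope(calls, state_values) for g, calls in kept.items()}
```

Only GENE_X passes the recurrence filter, and its positive coefficient of 0.4 indicates that altered samples have higher state values on average.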

Quantification and statistical analysis

Additional statistical tests were performed using the R statistical software and Prism software version 9.4.1 (GraphPad Software). Where appropriate, Fisher’s exact tests, Fisher-Freeman-Halton exact tests, and Wilcoxon rank-sum tests were performed in R. Pearson correlations and p values were derived in R using the Hmisc package. Log rank tests were performed in Prism. For all situations of multiple comparisons, p values were adjusted using the Benjamini-Hochberg FDR correction, and listed as a q-value or adjusted p value.
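For reference, the Benjamini-Hochberg adjustment can be sketched in a few lines of Python. This is an illustrative implementation of the standard step-up procedure, not the code used in the study, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p values (q-values) in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity of q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical p values from four tests.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

For these four p values the adjusted values are approximately 0.02, 0.04, 0.04, and 0.02, in the original order.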

Additional resources

The algorithm to predict the reported B cell states and their class assignment has been formatted into a simple package on github at https://github.com/NovakLab/FL_Bstate. The package is intended for prediction of all three FL B cell state values and the assigned classes, using either RNA-Seq or Array based GEP, from bulk sample sources. This is the same methodology used within the manuscript.

Acknowledgments

This work was supported in part by the NIH/NCI grants SPORE-P50 CA97274 (to J.R.C. and A.J.N.), R01 CA212162-01A1 (to A.J.N. and J.R.C.), U01 CA195568 (to J.R.C.), and 5T32AI007425 (to J.E.K.). We thank Dr. Janek Walker for his kind review and editing of this manuscript. We also acknowledge BioRender for the creation of the graphical abstract.

Author contributions

Conceptualization, J.E.K., J.R.C., and A.J.N. Methodology, J.E.K., K.W., and A.J.N. Formal analysis, J.E.K., K.W., M.A.H., V.S., M.C.L., M.J.M., J.P.N., and K.R.W. Data curation, J.E.K., M.K.M., V.S., Z.Y., M.S., K.R.W., J.C.V.B., T.M.H., T.E.W., B.K.L., L.M.R., R.L.K., and S.M.A. Writing – original draft, J.E.K. and A.J.N. Project administration, A.J.N. Funding acquisition, J.R.C. and A.J.N. Writing – review & editing, all authors.

Declaration of interests

A.J.N. has received research funding from Bristol Myers Squibb.

Published: February 29, 2024

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.xcrm.2024.101443.

Supplemental information

Document S1. Figures S1–S7
mmc1.pdf (7MB, pdf)
Table S1. Cohort clinical table
mmc2.xlsx (21.3KB, xlsx)
Table S2. FL B cell state gene contribution
mmc3.xlsx (790.5KB, xlsx)
Table S3. Full GSEA results
mmc4.xlsx (129.9KB, xlsx)
Table S4. Full VIPER Master Regulator results
mmc5.xlsx (118KB, xlsx)
Table S5. GC association votes
mmc6.xlsx (13.6KB, xlsx)
Table S6. FL mutational frequency study comparison
mmc7.xlsx (11.8KB, xlsx)
Document S2. Article plus supplemental information
mmc8.pdf (16.4MB, pdf)

References

  • 1. Roulland S., Faroudi M., Mamessier E., Sungalee S., Salles G., Nadel B. In: Advances in Immunology. Alt F.W., editor. Academic Press; 2011. Chapter 1 - Early Steps of Follicular Lymphoma Pathogenesis; pp. 1–46.
  • 2. Cleary M.L., Sklar J. Nucleotide sequence of a t(14;18) chromosomal breakpoint in follicular lymphoma and demonstration of a breakpoint-cluster region near a transcriptionally active locus on chromosome 18. Proc. Natl. Acad. Sci. USA. 1985;82:7439–7443. doi: 10.1073/pnas.82.21.7439.
  • 3. Nooka A.K., Nabhan C., Zhou X., Taylor M.D., Byrtek M., Miller T.P., Friedberg J.W., Zelenetz A.D., Link B.K., Cerhan J.R., et al. Examination of the follicular lymphoma international prognostic index (FLIPI) in the National LymphoCare study (NLCS): a prospective US patient cohort treated predominantly in community practices. Ann. Oncol. 2013;24:441–448. doi: 10.1093/annonc/mds429.
  • 4. Dave S.S., Wright G., Tan B., Rosenwald A., Gascoyne R.D., Chan W.C., Fisher R.I., Braziel R.M., Rimsza L.M., Grogan T.M., et al. Prediction of Survival in Follicular Lymphoma Based on Molecular Features of Tumor-Infiltrating Immune Cells. N. Engl. J. Med. 2004;351:2159–2169. doi: 10.1056/nejmoa041869.
  • 5. Huet S., Tesson B., Jais J.-P., Feldman A.L., Magnano L., Thomas E., Traverse-Glehen A., Albaud B., Carrère M., Xerri L., et al. A gene-expression profiling score for prediction of outcome in patients with follicular lymphoma: a retrospective training and validation analysis in three international cohorts. Lancet Oncol. 2018;19:549–561. doi: 10.1016/s1470-2045(18)30102-5.
  • 6. Milpied P., Cervera-Marzal I., Mollichella M.-L., Tesson B., Brisou G., Traverse-Glehen A., Salles G., Spinelli L., Nadel B. Human germinal center transcriptional programs are de-synchronized in B cell lymphoma. Nat. Immunol. 2018;19:1013–1024. doi: 10.1038/s41590-018-0181-4.
  • 7. Andor N., Simonds E.F., Czerwinski D.K., Chen J., Grimes S.M., Wood-Bouwens C., Zheng G.X.Y., Kubit M.A., Greer S., Weiss W.A., et al. Single-cell RNA-Seq of follicular lymphoma reveals malignant B-cell types and coexpression of T-cell immune checkpoints. Blood. 2019;133:1119–1129. doi: 10.1182/blood-2018-08-862292.
  • 8. Bödör C., Grossmann V., Popov N., Okosun J., O’Riain C., Tan K., Marzec J., Araf S., Wang J., Lee A.M., et al. EZH2 mutations are frequent and represent an early event in follicular lymphoma. Blood. 2013;122:3165–3168. doi: 10.1182/blood-2013-04-496893.
  • 9. Okosun J., Wolfson R.L., Wang J., Araf S., Wilkins L., Castellano B.M., Escudero-Ibarz L., Al Seraihi A.F., Richter J., Bernhart S.H., et al. Recurrent mTORC1-activating RRAGC mutations in follicular lymphoma. Nat. Genet. 2016;48:183–188. doi: 10.1038/ng.3473.
  • 10. Pasqualucci L., Dominguez-Sola D., Chiarenza A., Fabbri G., Grunn A., Trifonov V., Kasper L.H., Lerach S., Tang H., Ma J., et al. Inactivating mutations of acetyltransferase genes in B-cell lymphoma. Nature. 2011;471:189–195. doi: 10.1038/nature09730.
  • 11. Morin R.D., Johnson N.A., Severson T.M., Mungall A.J., An J., Goya R., Paul J.E., Boyle M., Woolcock B.W., Kuchenbauer F., et al. Somatic mutations altering EZH2 (Tyr641) in follicular and diffuse large B-cell lymphomas of germinal-center origin. Nat. Genet. 2010;42:181–185. doi: 10.1038/ng.518.
  • 12. Morin R.D., Mendez-Lago M., Mungall A.J., Goya R., Mungall K.L., Corbett R.D., Johnson N.A., Severson T.M., Chiu R., Field M., et al. Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma. Nature. 2011;476:298–303. doi: 10.1038/nature10351.
  • 13. Green M.R., Gentles A.J., Nair R.V., Irish J.M., Kihira S., Liu C.L., Kela I., Hopmans E.S., Myklebust J.H., Ji H., et al. Hierarchy in somatic mutations arising during genomic evolution and progression of follicular lymphoma. Blood. 2013;121:1604–1611. doi: 10.1182/blood-2012-09-457283.
  • 14. Okosun J., Bödör C., Wang J., Araf S., Yang C.-Y., Pan C., Boller S., Cittaro D., Bozek M., Iqbal S., et al. Integrated genomic analysis identifies recurrent mutations and evolution patterns driving the initiation and progression of follicular lymphoma. Nat. Genet. 2014;46:176–181. doi: 10.1038/ng.2856.
  • 15. Yang Z.-Z., Kim H.J., Wu H., Jalali S., Tang X., Krull J.E., Ding W., Novak A.J., Ansell S.M. TIGIT Expression Is Associated with T-cell Suppression and Exhaustion and Predicts Clinical Outcome and Anti–PD-1 Response in Follicular Lymphoma. Clin. Cancer Res. 2020;26:5217–5231. doi: 10.1158/1078-0432.ccr-20-0558.
  • 16. Rawal S., Chu F., Zhang M., Park H.J., Nattamai D., Kannan S., Sharma R., Delgado D., Chou T., Lin H.Y., et al. Cross Talk between Follicular Th Cells and Tumor Cells in Human Follicular Lymphoma Promotes Immune Evasion in the Tumor Microenvironment. J. Immunol. 2013;190:6681–6693. doi: 10.4049/jimmunol.1201363.
  • 17. Yang Z.-Z., Kim H.J., Villasboas J.C., Price-Troska T., Jalali S., Wu H., Luchtel R.A., Polley M.-Y.C., Novak A.J., Ansell S.M. Mass Cytometry Analysis Reveals that Specific Intratumoral CD4+ T Cell Subsets Correlate with Patient Survival in Follicular Lymphoma. Cell Rep. 2019;26:2178–2193.e3. doi: 10.1016/j.celrep.2019.01.085.
  • 18. Smeltzer J.P., Jones J.M., Ziesmer S.C., Grote D.M., Xiu B., Ristow K.M., Yang Z.Z., Nowakowski G.S., Feldman A.L., Cerhan J.R., et al. Pattern of CD14+ Follicular Dendritic Cells and PD1+ T Cells Independently Predicts Time to Transformation in Follicular Lymphoma. Clin. Cancer Res. 2014;20:2862–2872. doi: 10.1158/1078-0432.ccr-13-2367.
  • 19. Mondello P., Fama A., Larson M.C., Feldman A.L., Villasboas J.C., Yang Z.-Z., Galkin I., Svelolkin V., Postovalova E., Bagaev A., et al. Lack of intrafollicular memory CD4 + T cells is predictive of early clinical failure in newly diagnosed follicular lymphoma. Blood Cancer J. 2021;11 doi: 10.1038/s41408-021-00521-4.
  • 20. Han G., Deng Q., Marques-Piubelli M.L., Dai E., Dang M., Ma M.C.J., Li X., Yang H., Henderson J., Kudryashova O., et al. Follicular lymphoma microenvironment characteristics associated with tumor cell mutations and MHC class II expression. Blood Cancer Discov. 2022;3:428–443. doi: 10.1158/2643-3230.Bcd-21-0075.
  • 21. Green M.R., Kihira S., Liu C.L., Nair R.V., Salari R., Gentles A.J., Irish J., Stehr H., Vicente-Dueñas C., Romero-Camarero I., et al. Mutations in early follicular lymphoma progenitors are associated with suppressed antigen presentation. Proc. Natl. Acad. Sci. USA. 2015;112:E1116–E1125. doi: 10.1073/pnas.1501199112.
  • 22. Boice M., Salloum D., Mourcin F., Sanghvi V., Amin R., Oricchio E., Jiang M., Mottok A., Denis-Lagache N., Ciriello G., et al. Loss of the HVEM Tumor Suppressor in Lymphoma and Restoration by Modified CAR-T Cells. Cell. 2016;167:405–418.e13. doi: 10.1016/j.cell.2016.08.032.
  • 23. Béguelin W., Teater M., Meydan C., Hoehn K.B., Phillip J.M., Soshnev A.A., Venturutti L., Rivas M.A., Calvo-Fernández M.T., Gutierrez J., et al. Mutant EZH2 Induces a Pre-malignant Lymphoma Niche by Reprogramming the Immune Response. Cancer Cell. 2020;37:655–673.e11. doi: 10.1016/j.ccell.2020.04.004.
  • 24. Bararia D., Hildebrand J.A., Stolz S., Haebe S., Alig S., Trevisani C.P., Osorio-Barrios F., Bartoschek M.D., Mentz M., Pastore A., et al. Cathepsin S Alterations Induce a Tumor-Promoting Immune Microenvironment in Follicular Lymphoma. Cell Rep. 2020;31 doi: 10.1016/j.celrep.2020.107522.
  • 25. Kiaii S., Clear A.J., Ramsay A.G., Davies D., Sangaralingam A., Lee A., Calaminici M., Neuberg D.S., Gribben J.G. Follicular Lymphoma Cells Induce Changes in T-Cell Gene Expression and Function: Potential Impact on Survival and Risk of Transformation. J. Clin. Oncol. 2013;31:2654–2661. doi: 10.1200/jco.2012.44.2137.
  • 26. Mourcin F., Verdière L., Roulois D., Amin R., Lamaison C., Sibut V., Thamphya B., Pangault C., Monvoisin C., Huet S., et al. Follicular lymphoma triggers phenotypic and functional remodeling of the human lymphoid stromal cell landscape. Immunity. 2021;54:1788–1806.e7. doi: 10.1016/j.immuni.2021.05.019.
  • 27.Rauschmeier R., Reinhardt A., Gustafsson C., Glaros V., Artemov A.V., Dunst J., Taneja R., Adameyko I., Månsson R., Busslinger M., Kreslavsky T. Bhlhe40 function in activated B and TFH cells restrains the GC reaction and prevents lymphomagenesis. J. Exp. Med. 2022;219 doi: 10.1084/jem.20211406. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Ochiai K., Maienschein-Cline M., Simonetti G., Chen J., Rosenthal R., Brink R., Chong A.S., Klein U., Dinner A.R., Singh H., Sciammas R. Transcriptional Regulation of Germinal Center B and Plasma Cell Fates by Dynamical Control of IRF4. Immunity. 2013;38:918–929. doi: 10.1016/j.immuni.2013.04.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Song S., Matthias P.D. The Transcriptional Regulation of Germinal Center Formation. Front. Immunol. 2018;9:2026. doi: 10.3389/fimmu.2018.02026. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Alvarez M.J., Shen Y., Giorgi F.M., Lachmann A., Ding B.B., Ye B.H., Califano A. Functional characterization of somatic mutations in cancer using network-based inference of protein activity. Nat. Genet. 2016;48:838–847. doi: 10.1038/ng.3593. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Kim Wiese A., Schluterman Burdine M., Turnage R.H., Tackett A.J., Burdine L.J. DNA-PKcs controls calcineurin mediated IL-2 production in T lymphocytes. PLoS One. 2017;12 doi: 10.1371/journal.pone.0181608. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Victora G.D., Dominguez-Sola D., Holmes A.B., Deroubaix S., Dalla-Favera R., Nussenzweig M.C. Identification of human germinal center light and dark zone cells and their relationship to human B-cell lymphomas. Blood. 2012;120:2240–2248. doi: 10.1182/blood-2012-03-415380. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.De Silva N.S., Klein U. Dynamics of B cells in germinal centres. Nat. Rev. Immunol. 2015;15:137–148. doi: 10.1038/nri3804. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Holmes A.B., Corinaldesi C., Shen Q., Kumar R., Compagno N., Wang Z., Nitzan M., Grunstein E., Pasqualucci L., Dalla-Favera R., Basso K. Single-cell analysis of germinal-center B cells informs on lymphoma cell of origin and outcome. J. Exp. Med. 2020;217 doi: 10.1084/jem.20200483. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Attaf N., Dong C., Gil L., Cervera-Marzal I., Gharsalli T., Navarro J.-M., Mboumba D.-L., Chasson L., Lemonnier F., Gaulard P., et al. Cold Spring Harbor Laboratory; 2022. Functional Plasticity and Recurrent Cell States of Malignant B Cells in Follicular Lymphoma. [Google Scholar]
  • 36.Kasar S., Kim J., Improgo R., Tiao G., Polak P., Haradhvala N., Lawrence M.S., Kiezun A., Fernandes S.M., Bahl S., et al. Whole-genome sequencing reveals activation-induced cytidine deaminase signatures during indolent chronic lymphocytic leukaemia evolution. Nat. Commun. 2015;6:8866. doi: 10.1038/ncomms9866. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Pettersen H.S., Galashevskaya A., Doseth B., Sousa M.M.L., Sarno A., Visnes T., Aas P.A., Liabakk N.-B., Slupphaug G., Sætrom P., et al. AID expression in B-cell lymphomas causes accumulation of genomic uracil and a distinct AID mutational signature. DNA Repair. 2015;25:60–71. doi: 10.1016/j.dnarep.2014.11.006. [DOI] [PubMed] [Google Scholar]
  • 38.Tate J.G., Bamford S., Jubb H.C., Sondka Z., Beare D.M., Bindal N., Boutselakis H., Cole C.G., Creatore C., Dawson E., et al. COSMIC: the Catalogue Of Somatic Mutations In Cancer. Nucleic Acids Res. 2019;47:D941–D947. doi: 10.1093/nar/gky1015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Green M.R., Gentles A.J., Nair R.V., Irish J.M., Kihira S., Liu C.L., Kela I., Hopmans E.S., Myklebust J.H., Ji H., et al. Hierarchy in somatic mutations arising during genomic evolution and progression of follicular lymphoma. Blood. 2013;121:1604–1611. doi: 10.1182/blood-2012-09-457283. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Carter S.L., Cibulskis K., Helman E., McKenna A., Shen H., Zack T., Laird P.W., Onofrio R.C., Winckler W., Weir B.A., et al. Absolute quantification of somatic DNA alterations in human cancer. Nat. Biotechnol. 2012;30:413–421. doi: 10.1038/nbt.2203. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Chapuy B., Stewart C., Dunford A.J., Kim J., Kamburov A., Redd R.A., Lawrence M.S., Roemer M.G.M., Li A.J., Ziepert M., et al. Molecular subtypes of diffuse large B cell lymphoma are associated with distinct pathogenic mechanisms and outcomes. Nat. Med. 2018;24:679–690. doi: 10.1038/s41591-018-0016-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Rogozin I.B., Lada A.G., Goncearenco A., Green M.R., De S., Nudelman G., Panchenko A.R., Koonin E.V., Pavlov Y.I. Activation induced deaminase mutational signature overlaps with CpG methylation sites in follicular lymphoma and other cancers. Sci. Rep. 2016;6 doi: 10.1038/srep38133. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Ye X., Ren W., Liu D., Li X., Li W., Wang X., Meng F.-L., Yeap L.-S., Hou Y., Zhu S., et al. Genome-wide mutational signatures revealed distinct developmental paths for human B cell lymphomas. J. Exp. Med. 2021;218 doi: 10.1084/jem.20200573. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Huet S., Sujobert P., Salles G. From genetics to the clinic: a translational perspective on follicular lymphoma. Nat. Rev. Cancer. 2018;18:224–239. doi: 10.1038/nrc.2017.127. [DOI] [PubMed] [Google Scholar]
  • 45.Friedberg J.W., Kahl B.S., Leonard J.P. The Roadmap Forward in Follicular Lymphoma: Time for a precision approach. ASH Clinical News. 2015:29–30. [Google Scholar]
  • 46.Glas A.M., Kersten M.J., Delahaye L.J.M.J., Witteveen A.T., Kibbelaar R.E., Velds A., Wessels L.F.A., Joosten P., Kerkhoven R.M., Bernards R., et al. Gene expression profiling in follicular lymphoma to assess clinical aggressiveness and to guide the choice of treatment. Blood. 2005;105:301–307. doi: 10.1182/blood-2004-06-2298. [DOI] [PubMed] [Google Scholar]
  • 47.Haebe S., Shree T., Sathe A., Day G., Czerwinski D.K., Grimes S.M., Lee H., Binkley M.S., Long S.R., Martin B., et al. Single-cell analysis can define distinct evolution of tumor sites in follicular lymphoma. Blood. 2021;137:2869–2880. doi: 10.1182/blood.2020009855. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Roider T., Seufert J., Uvarovskii A., Frauhammer F., Bordas M., Abedpour N., Stolarczyk M., Mallm J.-P., Herbst S.A., Bruch P.-M., et al. Dissecting intratumour heterogeneity of nodal B-cell lymphomas at the transcriptional, genetic and drug-response levels. Nat. Cell Biol. 2020;22:896–906. doi: 10.1038/s41556-020-0532-x. [DOI] [PubMed] [Google Scholar]
  • 49.Leich E., Salaverria I., Bea S., Zettl A., Wright G., Moreno V., Gascoyne R.D., Chan W.-C., Braziel R.M., Rimsza L.M., et al. Follicular lymphomas with and without translocation t(14;18) differ in gene expression profiles and genetic alterations. Blood. 2009;114:826–834. doi: 10.1182/blood-2009-01-198580. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Gupta M., Dillon S.R., Ziesmer S.C., Feldman A.L., Witzig T.E., Ansell S.M., Cerhan J.R., Novak A.J. A proliferation-inducing ligand mediates follicular lymphoma B-cell proliferation and cyclin D1 expression through phosphatidylinositol 3-kinase–regulated mammalian target of rapamycin activation. Blood. 2009;113:5206–5216. doi: 10.1182/blood-2008-09-179762. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Tsuji S., Cortesão C., Bram R.J., Platt J.L., Cascalho M. TACI deficiency impairs sustained Blimp-1 expression in B cells decreasing long-lived plasma cells in the bone marrow. Blood. 2011;118:5832–5839. doi: 10.1182/blood-2011-05-353961. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.He B., Chadburn A., Jou E., Schattner E.J., Knowles D.M., Cerutti A. Lymphoma B Cells Evade Apoptosis through the TNF Family Members BAFF/BLyS and APRIL. J. Immunol. 2004;172:3268–3279. doi: 10.4049/jimmunol.172.5.3268. [DOI] [PubMed] [Google Scholar]
  • 53.Badr G., Saad H., Waly H., Hassan K., Abdel-Tawab H., Alhazza I.M., Ahmed E.A. Type I interferon (IFN-α/β) rescues B-lymphocytes from apoptosis via PI3Kδ/Akt, Rho-A, NFκB and Bcl-2/BclXL. Cell. Immunol. 2010;263:31–40. doi: 10.1016/j.cellimm.2010.02.012. [DOI] [PubMed] [Google Scholar]
  • 54.Lefebvre C., Rajbhandari P., Alvarez M.J., Bandaru P., Lim W.K., Sato M., Wang K., Sumazin P., Kustagi M., Bisikirska B.C., et al. A human B-cell interactome identifies MYB and FOXM1 as master regulators of proliferation in germinal centers. Mol. Syst. Biol. 2010;6:377. doi: 10.1038/msb.2010.31. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Leseux L., Hamdi S.M., Al Saati T., Capilla F., Recher C., Laurent G., Bezombes C. Syk-dependent mTOR activation in follicular lymphoma cells. Blood. 2006;108:4156–4162. doi: 10.1182/blood-2006-05-026203. [DOI] [PubMed] [Google Scholar]
  • 56.Fruchon S., Kheirallah S., Al Saati T., Ysebaert L., Laurent C., Leseux L., Fournié J.J., Laurent G., Bezombes C. Involvement of the Syk–mTOR pathway in follicular lymphoma cell invasion and angiogenesis. Leukemia. 2012;26:795–805. doi: 10.1038/leu.2011.248. [DOI] [PubMed] [Google Scholar]
  • 57.Travert M., Ame-Thomas P., Pangault C., Morizot A., Micheau O., Semana G., Lamy T., Fest T., Tarte K., Guillaudeux T. CD40 Ligand Protects from TRAIL-Induced Apoptosis in Follicular Lymphomas through NF-κB Activation and Up-Regulation of c-FLIP and Bcl-xL. J. Immunol. 2008;181:1001–1011. doi: 10.4049/jimmunol.181.2.1001. [DOI] [PubMed] [Google Scholar]
  • 58.Amin R., Mourcin F., Uhel F., Pangault C., Ruminy P., Dupré L., Guirriec M., Marchand T., Fest T., Lamy T., Tarte K. DC-SIGN–expressing macrophages trigger activation of mannosylated IgM B-cell receptor in follicular lymphoma. Blood. 2015;126:1911–1920. doi: 10.1182/blood-2015-04-640912. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Linley A., Krysov S., Ponzoni M., Johnson P.W., Packham G., Stevenson F.K. Lectin binding to surface Ig variable regions provides a universal persistent activating signal for follicular lymphoma cells. Blood. 2015;126:1902–1910. doi: 10.1182/blood-2015-04-640805. [DOI] [PubMed] [Google Scholar]
  • 60.Wenzl K., Manske M.K., Sarangi V., Asmann Y.W., Greipp P.T., Schoon H.R., Braggio E., Maurer M.J., Feldman A.L., Witzig T.E., et al. Loss of TNFAIP3 enhances MYD88L265P-driven signaling in non-Hodgkin lymphoma. Blood Cancer J. 2018;8 doi: 10.1038/s41408-018-0130-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Hu Q., Zhang B., Chen R., Fu C., Fu X., Li J., Fu L., Zhang Z., Dong J.-T. ZFHX3 is indispensable for ERβ to inhibit cell proliferation via MYC downregulation in prostate cancer cells. Oncogenesis. 2019;8 doi: 10.1038/s41389-019-0138-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Luo W., Weisel F., Shlomchik M.J. B Cell Receptor and CD40 Signaling Are Rewired for Synergistic Induction of the c-Myc Transcription Factor in Germinal Center B Cells. Immunity. 2018;48:313–326.e5. doi: 10.1016/j.immuni.2018.01.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Jiang Y., Ortega-Molina A., Geng H., Ying H.-Y., Hatzi K., Parsa S., McNally D., Wang L., Doane A.S., Agirre X., et al. CREBBP Inactivation Promotes the Development of HDAC3-Dependent Lymphomas. Cancer Discov. 2017;7:38–53. doi: 10.1158/2159-8290.cd-16-0975. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Krysiak K., Gomez F., White B.S., Matlock M., Miller C.A., Trani L., Fronick C.C., Fulton R.S., Kreisel F., Cashen A.F., et al. Recurrent somatic mutations affecting B-cell receptor signaling pathway genes in follicular lymphoma. Blood. 2017;129:473–483. doi: 10.1182/blood-2016-07-729954. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Ma M.C.J., Tadros S., Bouska A., Heavican T., Yang H., Deng Q., Moore D., Akhter A., Hartert K., Jain N., et al. Subtype-specific and co-occurring genetic alterations in B-cell non-Hodgkin lymphoma. Haematologica. 2022;107:690–701. doi: 10.3324/haematol.2020.274258. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Witzig T.E., Reeder C.B., Laplant B.R., Gupta M., Johnston P.B., Micallef I.N., Porrata L.F., Ansell S.M., Colgan J.P., Jacobsen E.D., et al. A phase II trial of the oral mTOR inhibitor everolimus in relapsed aggressive lymphoma. Leukemia. 2011;25:341–347. doi: 10.1038/leu.2010.226. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Morschhauser F., Tilly H., Chaidos A., McKay P., Phillips T., Assouline S., Batlevi C.L., Campbell P., Ribrag V., Damaj G.L., et al. Tazemetostat for patients with relapsed or refractory follicular lymphoma: an open-label, single-arm, multicentre, phase 2 trial. Lancet Oncol. 2020;21:1433–1442. doi: 10.1016/S1470-2045(20)30441-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Karczewski K.J., Weisburd B., Thomas B., Solomonson M., Ruderfer D.M., Kavanagh D., Hamamsy T., Lek M., Samocha K.E., Cummings B.B., et al. The ExAC browser: displaying reference data information from over 60 000 exomes. Nucleic Acids Res. 2017;45:D840–D845. doi: 10.1093/nar/gkw971. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Clarke L., Zheng-Bradley X., Smith R., Kulesha E., Xiao C., Toneva I., Vaughan B., Preuss D., Leinonen R., Shumway M., et al. The 1000 Genomes Project: data management and community access. Nat. Methods. 2012;9:459–462. doi: 10.1038/nmeth.1974. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Kalari K.R., Nair A.A., Bhavsar J.D., O'Brien D.R., Davila J.I., Bockol M.A., Nie J., Tang X., Baheti S., Doughty J.B., et al. MAP-RSeq: Mayo Analysis Pipeline for RNA sequencing. BMC Bioinf. 2014;15:224. doi: 10.1186/1471-2105-15-224. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.Liao Y., Smyth G.K., Shi W. The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Res. 2013;41:e108. doi: 10.1093/nar/gkt214. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 72.Gaujoux R., Seoighe C. A flexible R package for nonnegative matrix factorization. BMC Bioinf. 2010;11:367. doi: 10.1186/1471-2105-11-367. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Liberzon A., Birger C., Thorvaldsdóttir H., Ghandi M., Mesirov J.P., Tamayo P. The Molecular Signatures Database Hallmark Gene Set Collection. Cell Syst. 2015;1:417–425. doi: 10.1016/j.cels.2015.12.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Rouillard A.D., Gundersen G.W., Fernandez N.F., Wang Z., Monteiro C.D., McDermott M.G., Ma’Ayan A. The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database. 2016;2016:baw100. doi: 10.1093/database/baw100. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Vivar J.C., Pemu P., McPherson R., Ghosh S. Redundancy control in pathway databases (ReCiPa): an application for improving gene-set enrichment analysis in Omics studies and "Big data" biology. OMICS A J. Integr. Biol. 2013;17:414–422. doi: 10.1089/omi.2012.0083. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Korotkevich G., Sukhov V., Budin N., Shpak B., Artyomov M.N., Sergushichev A. Cold Spring Harbor Laboratory; 2016. Fast Gene Set Enrichment Analysis. [Google Scholar]
  • 77.Levine J.H., Simonds E.F., Bendall S.C., Davis K.L., Amir E.a.D., Tadmor M.D., Litvin O., Fienberg H.G., Jager A., Zunder E.R., et al. Data-Driven Phenotypic Dissection of AML Reveals Progenitor-like Cells that Correlate with Prognosis. Cell. 2015;162:184–197. doi: 10.1016/j.cell.2015.05.047. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Foroutan M., Bhuva D.D., Lyu R., Horan K., Cursons J., Davis M.J. Single sample scoring of molecular phenotypes. BMC Bioinf. 2018;19 doi: 10.1186/s12859-018-2435-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Schindelin J., Arganda-Carreras I., Frise E., Kaynig V., Longair M., Pietzsch T., Preibisch S., Rueden C., Saalfeld S., Schmid B., et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods. 2012;9:676–682. doi: 10.1038/nmeth.2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Li H., Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.McKenna A., Hanna M., Banks E., Sivachenko A., Cibulskis K., Kernytsky A., Garimella K., Altshuler D., Gabriel S., Daly M., DePristo M.A. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–1303. doi: 10.1101/gr.107524.110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Larson D.E., Harris C.C., Chen K., Koboldt D.C., Abbott T.E., Dooling D.J., Ley T.J., Mardis E.R., Wilson R.K., Ding L. SomaticSniper: identification of somatic point mutations in whole genome sequencing data. Bioinformatics. 2012;28:311–317. doi: 10.1093/bioinformatics/btr665. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Roth A., Ding J., Morin R., Crisan A., Ha G., Giuliany R., Bashashati A., Hirst M., Turashvili G., Oloumi A., et al. JointSNVMix: a probabilistic model for accurate detection of somatic mutations in normal/tumour paired next-generation sequencing data. Bioinformatics. 2012;28:907–913. doi: 10.1093/bioinformatics/bts053. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Cibulskis K., Lawrence M.S., Carter S.L., Sivachenko A., Jaffe D., Sougnez C., Gabriel S., Meyerson M., Lander E.S., Getz G. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 2013;31:213–219. doi: 10.1038/nbt.2514. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Mayakonda A., Lin D.-C., Assenov Y., Plass C., Koeffler H.P. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res. 2018;28:1747–1756. doi: 10.1101/gr.239244.118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Wang C., Evans J.M., Bhagwate A.V., Prodduturi N., Sarangi V., Middha M., Sicotte H., Vedell P.T., Hart S.N., Oliver G.R., et al. PatternCNV: a versatile tool for detecting copy number changes from exome sequencing data. Bioinformatics. 2014;30:2678–2680. doi: 10.1093/bioinformatics/btu363. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Mermel C.H., Schumacher S.E., Hill B., Meyerson M.L., Beroukhim R., Getz G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 2011;12:R41. doi: 10.1186/gb-2011-12-4-r41. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Wang S., Tao Z., Wu T., Liu X.-S. Sigflow: an automated and comprehensive pipeline for cancer genome mutational signature analysis. Bioinformatics. 2021;37:1590–1592. doi: 10.1093/bioinformatics/btaa895. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Pagel K.A., Kim R., Moad K., Busby B., Zheng L., Tokheim C., Ryan M., Karchin R. Integrated Informatics Analysis of Cancer-Related Variants. JCO Clin. Cancer Inform. 2020;4:310–317. doi: 10.1200/cci.19.00132. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Cerhan J.R., Link B.K., Habermann T.M., Maurer M.J., Feldman A.L., Syrbu S.I., Thompson C.A., Farooq U., Novak A.J., Slager S.L., et al. Cohort Profile: The Lymphoma Specialized Program of Research Excellence (SPORE) Molecular Epidemiology Resource (MER) Cohort Study. Int. J. Epidemiol. 2017;46:1753–1754i. doi: 10.1093/ije/dyx119. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Dobin A., Davis C.A., Schlesinger F., Drenkow J., Zaleski C., Jha S., Batut P., Chaisson M., Gingeras T.R. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Brunet J.P., Tamayo P., Golub T.R., Mesirov J.P. Metagenes and molecular pattern discovery using matrix factorization. Proc. Natl. Acad. Sci. USA. 2004;101:4164–4169. doi: 10.1073/pnas.0308531101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Liberzon A., Subramanian A., Pinchback R., Thorvaldsdóttir H., Tamayo P., Mesirov J.P. Molecular signatures database (MSigDB) 3.0. Bioinformatics. 2011;27:1739–1740. doi: 10.1093/bioinformatics/btr260. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Gillespie M., Jassal B., Stephan R., Milacic M., Rothfels K., Senff-Ribeiro A., Griss J., Sevilla C., Matthews L., Gong C., et al. The reactome pathway knowledgebase 2022. Nucleic Acids Res. 2022;50:D687–D692. doi: 10.1093/nar/gkab1028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Shaffer A.L., Wright G., Yang L., Powell J., Ngo V., Lamy L., Lam L.T., Davis R.E., Staudt L.M. A library of gene expression signatures to illuminate normal and pathological lymphoid biology. Immunol. Rev. 2006;210:67–85. doi: 10.1111/j.0105-2896.2006.00373.x. [DOI] [PubMed] [Google Scholar]
  • 96.Newman A.M., Steen C.B., Liu C.L., Gentles A.J., Chaudhuri A.A., Scherer F., Khodadoust M.S., Esfahani M.S., Luca B.A., Steiner D., et al. Determining cell type abundance and expression from bulk tissues with digital cytometry. Nat. Biotechnol. 2019;37:773–782. doi: 10.1038/s41587-019-0114-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Crowell H.L., Chevrier S., Jacobs A., Sivapatham S., Tumor Profiler Consortium. Bodenmiller B., Robinson M.D. An R-based reproducible and user-friendly preprocessing pipeline for CyTOF data. F1000Res. 2020;9:1263. doi: 10.12688/f1000research.26073.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Bagwell C.B., Inokuma M., Hunsberger B., Herbert D., Bray C., Hill B., Stelzer G., Li S., Kollipara A., Ornatsky O., Baranov V. Automated Data Cleanup for Mass Cytometry. Cytometry A. 2020;97:184–198. doi: 10.1002/cyto.a.23926. [DOI] [PubMed] [Google Scholar]
  • 99.Finck R., Simonds E.F., Jager A., Krishnaswamy S., Sachs K., Fantl W., Pe'Er D., Nolan G.P., Bendall S.C. Normalization of mass cytometry data with bead standards. Cytometry A. 2013;83:483–494. doi: 10.1002/cyto.a.22271. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100.Goswami S., Walle T., Cornish A.E., Basu S., Anandhan S., Fernandez I., Vence L., Blando J., Zhao H., Yadav S.S., et al. Immune profiling of human tumors identifies CD73 as a combinatorial target in glioblastoma. Nat. Med. 2020;26:39–46. doi: 10.1038/s41591-019-0694-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101.Jiménez-Sánchez A., Cast O., Miller M.L. Comprehensive Benchmarking and Integration of Tumor Microenvironment Cell Estimation Methods. Cancer Res. 2019;79:6238–6246. doi: 10.1158/0008-5472.can-18-3560. [DOI] [PubMed] [Google Scholar]
  • 102.Hartert K.T., Wenzl K., Krull J.E., Manske M., Sarangi V., Asmann Y., Larson M.C., Maurer M.J., Slager S., Macon W.R., et al. Targeting of inflammatory pathways with R2CHOP in high-risk DLBCL. Leukemia. 2021;35:522–533. doi: 10.1038/s41375-020-0766-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103.Westphal M., Frankhouser D., Sonzone C., Shields P.G., Yan P., Bundschuh R. SMaSH: Sample matching using SNPs in humans. BMC Genom. 2019;20 doi: 10.1186/s12864-019-6332-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104.Tsukamoto T., Nakano M., Sato R., Adachi H., Kiyota M., Kawata E., Uoshima N., Yasukawa S., Chinen Y., Mizutani S., et al. High-risk follicular lymphomas harbour more somatic mutations including those in the AID-motif. Sci. Rep. 2017;7 doi: 10.1038/s41598-017-14150-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105.Carter H., Chen S., Isik L., Tyekucheva S., Velculescu V.E., Kinzler K.W., Vogelstein B., Karchin R. Cancer-Specific High-Throughput Annotation of Somatic Mutations: Computational Prediction of Driver Missense Mutations. Cancer Res. 2009;69:6660–6667. doi: 10.1158/0008-5472.can-09-1133. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106.Carter H., Douville C., Stenson P.D., Cooper D.N., Karchin R. Identifying Mendelian disease genes with the Variant Effect Scoring Tool. BMC Genom. 2013;14:S3. doi: 10.1186/1471-2164-14-s3-s3. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

  • Document S1. Figures S1–S7: mmc1.pdf (7 MB, PDF)

  • Table S1. Cohort clinical table: mmc2.xlsx (21.3 KB, XLSX)

  • Table S2. FL B cell state gene contribution: mmc3.xlsx (790.5 KB, XLSX)

  • Table S3. Full GSEA results: mmc4.xlsx (129.9 KB, XLSX)

  • Table S4. Full VIPER Master Regulator results: mmc5.xlsx (118 KB, XLSX)

  • Table S5. GC association votes: mmc6.xlsx (13.6 KB, XLSX)

  • Table S6. FL mutational frequency study comparison: mmc7.xlsx (11.8 KB, XLSX)

  • Document S2. Article plus supplemental information: mmc8.pdf (16.4 MB, PDF)

Data Availability Statement

  • Raw genomic and transcriptomic data files can be accessed via the database of Genotypes and Phenotypes (dbGaP) at https://www.ncbi.nlm.nih.gov/gap/ under dbGaP Study Accession phs002989.

  • Original code has been deposited at GitHub as a package and is publicly available as of the date of publication. DOIs are listed in the key resources table. This paper does not report original code beyond this package.

  • Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.


Articles from Cell Reports Medicine are provided here courtesy of Elsevier