Abstract
Gene expression profiling technologies have revolutionized cell biology, enabling researchers to identify gene signatures linked to various biological attributes of melanomas, such as pigmentation status, differentiation state, proliferative versus invasive capacity, and disease progression. Although the discovery of gene signatures has significantly enhanced our understanding of melanocytic phenotypes, reconciling the numerous signatures reported across independent studies and different profiling platforms remains a challenge. Current methods for classifying melanocytic gene signatures depend on exact gene overlap and comparison with unstandardized baseline transcriptomes. In this study, we aimed to categorize published gene signatures into clusters based on their similar patterns of expression across clinical cutaneous melanoma specimens. We analyzed nearly 800 melanoma samples from six gene expression repositories and developed a classification framework for gene signatures that is resilient against biases in gene identification across profiling platforms and inconsistencies in baseline standards. Using 39 frequently cited published gene signatures, our analysis revealed seven principal classes of gene signatures that correlate with previously identified phenotypes: Differentiated, Mitotic/MYC, AXL, Amelanotic, Neuro, Hypometabolic, and Invasive. Each class is consistent with the phenotypes that the constituent gene signatures represent, and our classification method does not rely on overlapping genes between signatures. To facilitate broader application, we created WIMMS (What Is My Melanocytic Signature, available at https://wimms.tanlab.org/), a user-friendly web application. WIMMS allows users to categorize any gene signature, determining its relationship to predominantly cited signatures and its representation within the seven principal classes.
Cutaneous melanoma is an aggressive form of skin cancer, characterized by its tendency to metastasize and its capacity to acquire resistance to pharmacological treatments (Siegel et al., 2024). Central to the processes of metastatic dissemination and therapeutic resistance is the alteration of gene expression profiles (Arozarena & Wellbrock, 2019; Centeno et al., 2023; Karras et al., 2022; Konieczkowski et al., 2014; Rambow et al., 2019; Shaffer et al., 2017; Tirosh et al., 2016). With the ultimate goal of enhancing clinical outcomes through improved diagnostics, prognostics, and selection of therapeutic regimens, considerable research has been dedicated to identifying gene signatures associated with distinct melanoma phenotypes. Examples of such studies include: i) the comparison of transcriptomes between metastatic and primary tumors (Alonso et al., 2007; Jaeger et al., 2007; Kauffmann et al., 2008; Winnepenninckx et al., 2006); ii) profiling collections of cell lines derived from tumors with different metastatic potentials (Hoek et al., 2006; Hoek, Eichhoff, et al., 2008; Widmer et al., 2012); iii) characterization of heterogeneity in metastatic melanoma tumors, cell lines, and skin using single-cell RNA sequencing (scRNA-seq) (Belote et al., 2021; Jerby-Arnon et al., 2018; Pozniak et al., 2024; Rambow et al., 2018; Tirosh et al., 2016; Wouters et al., 2020); iv) spatial sequencing of clinical specimens (Karras et al., 2022); and v) myriad studies assessing changes in gene expression upon molecular perturbation (e.g. gene knockdown or ectopic expression) across various model systems (Ryu et al., 2011).
Advancements in transcriptomic profiling and computational analysis have facilitated the identification of melanocytic gene signatures, driving a shift from the earlier binary categorization of ‘proliferative’ and ‘invasive’ signatures described by Hoek et al. towards a collection of over a hundred distinct gene signatures (Hoek et al., 2006; Wouters et al., 2020). However, it has become increasingly challenging to understand how the numerous phenotypes and associated gene signatures identified in different studies interrelate. Genomic classification relies on the absolute identification of a precise molecular change (e.g. BRAFV600E is always BRAFV600E), permitting high-confidence interstudy comparisons. In contrast, transcriptomic classification is neither absolute nor precise, such that signatures for the same phenotype can differ more than tenfold in size. For example, the “proliferative” signature from Hoek et al. has 51 genes, while Verfaillie and colleagues reported a list of 770 genes (Hoek et al., 2006; Verfaillie et al., 2015). The difficulty in creating a reproducible taxonomy for transcriptomic classification, and the consequent ballooning of new and different gene signatures, stems from multiple sources: differences in sample type, biases in transcriptome profiling technology, and reliance on relative, rather than absolute, strategies for gene signature identification.
First, a wide variety of sample types have been used in studying phenotypes in cutaneous melanoma, including cell cultures and lines, patient tissues, and patient-derived xenografts (PDX), each offering unique advantages and disadvantages. Cell cultures and lines allow for thorough molecular and phenotypic profiling and well-controlled experimental design and manipulation (Hoek et al., 2006; Tsoi et al., 2018; Widmer et al., 2012; Wouters et al., 2020). However, these in vitro models require that cells are active in cell cycle, hindering the study of slow-cycling phenotypes. Clinical specimens are, by definition, the most representative of the human disease, but are often limited in quantity and cannot be experimentally manipulated with comparable ease. Tumors in PDX models retain the complex microenvironment with the caveat of the often absent intact immune system and may suffer from contamination from the mouse transcriptome. Thus, while each of this diverse selection of available models possesses distinct advantages and has yielded valuable insights into melanoma phenotypes, the differences between these models contribute to distinctions between gene signatures.
Parallel to sample type considerations, the rapid evolution of transcriptome profiling technologies is another factor that influences gene signature generation and reproducibility. Bulk transcriptome profiling technologies are low-cost, posing less of an obstacle for clinical utility. However, results from bulk methods have a low signal-to-noise ratio due to the mixing of all cell types in the samples. Compared to the hybridization-based microarray technologies used to generate profiles for most large clinical transcriptomic repositories, next-generation-sequencing-based technologies can capture more types of transcripts, such as those of novel genes, splicing variants, and sequence variants (Wang et al., 2009). In addition, RNA-seq has a wider dynamic range for transcript detection and a higher sensitivity for poorly expressed genes (Wang et al., 2009). Even within the category of RNA-seq, there are factors that can affect which genes are detected. For instance, polyadenylated (polyA) mRNA selection and rRNA depletion are both commonly used techniques for mRNA enrichment, yet Zhao et al. found that polyA selection results in higher exonic coverage than rRNA depletion (Zhao et al., 2018). Single-cell technologies offer unprecedented cellular resolution but at the sacrifice of transcript resolution. Plate-based scRNA-seq methods have higher gene detection sensitivity but poorer scalability than droplet-based protocols (Hwang et al., 2018). Spatial technologies, while introducing an additional dimension, exhibit bias stemming from protocol differences, with sequencing-based methods offering greater gene detection efficiency than image-based counterparts (Moses & Pachter, 2022).
Finally, the statistical analysis employed in generating gene signatures introduces another layer of variation. Differential expression (DE) analysis is used to compare phenotypes, and differentially expressed genes are used as signatures (Costa-Silva et al., 2017). However, the results from a DE analysis are highly contingent on the dataset. Experiments with a small sample size suffer from a low statistical power, resulting in different thresholds for statistical significance. For example, Alonso et al. and Jaeger et al. have similar numbers of samples, yet false discovery rates of 0.2 and 0.005 were chosen as the thresholds for statistical significance, respectively (Alonso et al., 2007; Jaeger et al., 2007). In addition, the identity of the transcriptome(s) chosen for the baseline is as much a determinant for identified DE genes as the experimental groups. Therefore, the relative nature of DE analysis likely contributes to the differences between signatures that represent the same cell states. In single-cell analysis, the cell state information is typically unknown, whereas phenotypes of whole samples can be learned through other assays or from clinical information. Thus, the Gene Set Enrichment Analysis (GSEA) or other pathway enrichment analyses are commonly used to determine the phenotypic identity of a cell population. However, the variability in enriched gene sets between datasets may lead to identical cell states being erroneously classified as distinct or, conversely, two different cell states classified as similar.
In short, due to distinctions in experimental design and computational analysis, the landscape of melanoma gene signatures has become convoluted. Significant strides have been made in summarizing major cell states via literature review, which has pioneered a taxonomy for different phenotypes, but it remains ambiguous how to place each distinct signature across disparate studies into this framework, and the complexity and growing volume of data render manual organization impractical if not impossible (Arozarena & Wellbrock, 2019; Rambow et al., 2019). With the continual emergence of new signatures from ongoing research, there is an urgent need for an unbiased and automated categorization method to deconvolute the current landscape in order to facilitate progress in discerning the utility of consistent clinical phenotypes. In this study, we introduce a framework for clustering melanocytic gene signatures that depends neither on gene overlap nor on undefined baseline transcriptomes. This framework is based on the premise that the most clinically useful gene signatures will be those recurrently observed in human disease, and which can be measured and identified using widespread technologies currently compatible with clinical standards of care. Our approach, therefore, utilizes the bulk transcriptomes of clinical specimens as the benchmark and clusters gene signatures based on their expression patterns in these specimens. With this framework, we have categorized gene signatures commonly cited within the melanocytic field. Additionally, we have developed a user-friendly web application that enables users to classify any signature, thereby complementing the currently available DE analysis and GSEA tools.
First, we assembled gene expression datasets from six cutaneous melanoma repositories (Fig. 1A). These datasets were selected based on four criteria: 1) the samples must be cutaneous melanoma clinical specimens; 2) the sample size must be larger than 15; 3) the final collection must include both microarray and RNA-seq datasets; 4) the data must be publicly available. We obtained the gene expression data of TCGA-SKCM from the GDC portal (Weinstein et al., 2013). This dataset has 472 samples, including both primary and metastatic tumor specimens. These samples also have a variety of driver mutations and treatment statuses. Next, we selected five additional datasets generated with microarray and RNA-seq: GEO GSE19234 (Bogunovic et al., 2009), GSE53118 (Mann et al., 2013), GSE46517 (Kabbarah et al., 2010), GSE50509 (Rizos et al., 2014), and ENA PRJEB23709 (Gide et al., 2019). The data from Bogunovic et al. include 44 metastatic melanoma tissue samples obtained from surgery, with some patients having undergone additional treatments. The dataset from Mann et al. consists of expression profiles from 79 samples of stage III melanoma with various driver mutations. Kabbarah et al. generated a transcriptomic dataset from 31 primary and 73 metastatic melanoma specimens. The Rizos et al. study included 59 metastases harboring the BRAFV600E mutation from patients treated with either dabrafenib or vemurafenib. Since Gide et al. provided fastq files, we processed the data using Cutadapt, STAR, and RSEM to create the count matrix (Supplementary Table 1) (Dobin et al., 2013; Li & Dewey, 2011; Martin, 2011) and included data from 18 samples treated with anti-PD-1 monotherapy or combined anti-CTLA-4 and anti-PD-1 therapy. Collectively, the datasets encompass melanoma transcriptomes of various clinical stages, patient populations, treatments, and genomic backgrounds.
Figure 1. Schematic and validation of classification correlation pipeline.

A. The six large gene expression datasets assembled for the analysis, totaling 778 specimens. B. The 39 gene signatures from 15 studies included in the original clustering. C. Schematic and example of Z-score transformation. D. Schematic and example of Z-score aggregation. E. Schematic of final signature Z-score matrix. F. The three primary outputs of the pipeline. Hierarchical clustering dendrogram (left) shows five major clusters of signatures based upon classification correlation (dotted red line). The seven principal classes (see Figure 2) are color-coded. The Signature Relationship heatmap (middle) shows the pairwise correlation coefficient of all signatures. The Gene Membership heatmap (right) shows overlap fractions of all pairs of signatures. G. Scatter plot of all pairwise comparisons of correlation and overlap fraction. H&I. Scatter plots of representative pairs of signatures (as indicated by colored points in G) with no overlapping genes. J. Box-and-whisker plot comparing pairwise correlation, Jaccard Index, and gene overlap fraction within “Invasive/Invasion” and “Proliferative/Proliferation” signatures.
Following data compilation, we surveyed the recent literature for commonly referenced gene signatures and collected 39 signatures from 15 published studies (Fig. 1B and Supplementary Table 2). These signatures represent diverse transcriptional phenotypes in human melanoma and melanocytes, collected from a variety of sources (clinical specimens, primary cultures, cell lines) using different profiling platforms (microarray, bulk sequencing, single-cell sequencing). The signatures include the most established proliferative and invasive phenotypes, as well as cell states that are resistant to targeted therapy. While we believe the collection captures key cell states and phenotypes, we emphasize that it does not exhaustively cover the extensive array of published signatures. Rather, our selection aims to encompass a wide range of specimen types, phenotypes, and profiling methods and to include those signatures commonly utilized for benchmarking new signatures. Our goal is to establish this curated set of gene signatures as a foundational reference that enables the easy integration of additional signatures, thereby enhancing subsequent analysis and interpretation.
For direct comparisons between signatures, we aimed to derive a single score for each gene set. Since gene expression data are prone to batch effects, we scaled all datasets by transforming the raw expression values into Z-scores across all specimens (Fig. 1C). The Z-scores were computed by $z = \frac{x - \mu}{\sigma}$, where x is each expression value of a gene, and μ and σ are respectively the mean expression and standard deviation of the same gene across all samples within a dataset. This effectively ranks the specimens by their expression level within a dataset. Then, expression values of the same genes from different datasets can be compared. Next, Z-scores of genes that belong to the same signatures are averaged to compute the final aggregate Z-score representing each signature (Fig. 1D), resulting in a signature-by-specimen Z-score matrix (Fig. 1E). Using this Z-score matrix, we performed bootstrapped hierarchical clustering with average linkage and Pearson’s correlation (referred to henceforth as “correlation”) using the “Pvclust” R package (Suzuki & Shimodaira, 2006), and we also generated a Z-score correlation heatmap between all pairs of signatures (Fig. 1F). Our framework rests on the assumption that a cell state is achieved by coordinated co-expression of genes. Thus, if two signatures are related, the overall gene expression trends of these signatures across specimens should be highly correlated. In other words, if two groups of genes (signatures) are highly expressed across the same specimens, both likely contribute to the same phenotype.
Since our goal was to classify signatures based on similar patterns of expression across clinical specimens instead of grouping them based on overlapping genes, we first examined whether signatures can be highly correlated even with minimal overlap of specific genes. We calculated the fraction of overlapping genes between pairwise signatures and visualized the correlation and overlap between all pairs of signatures (Fig. 1F & G). As expected, we observed that some signatures do share an appreciable number of genes, and these signatures are highly correlated. More importantly, the results show a wide range of correlations from strong anti-correlation (Fig. 1H) to strong positive correlation (Fig. 1I) in signature pairs with no overlapping genes. To compare correlation to alternative measures of similarity based on gene membership/overlap, we performed a comparative analysis using nine signatures designated as “invasive/invasion” or “proliferative/proliferation” in the original studies by Rambow et al., Jeffs et al., Widmer et al., Hoek et al., and Verfaillie et al. (Hoek et al., 2006; Jeffs et al., 2009; Rambow et al., 2018; Verfaillie et al., 2015; Widmer et al., 2012). We computed the correlation and overlap fraction as detailed above. The pairwise Jaccard Indices were calculated using $J(p, q) = \frac{|p \cap q|}{|p \cup q|}$, where p and q are gene signatures. The results show that, in spite of low gene overlap fractions, correlation values are significantly higher than the Jaccard Indices within each category (Invasive: Wilcoxon test p<0.001, Proliferative: Student’s t-test p<0.01; Fig. 1J), suggesting that our framework can indeed capture meaningful biological signals without relying on the overlapping genes between signatures.
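The two membership-based measures can be made concrete with a short Python sketch. The Jaccard Index follows its standard definition (size of the intersection divided by size of the union). The text does not define the overlap fraction explicitly, so the version here, which normalizes by the smaller signature, is an assumption; the gene names are placeholders.

```python
def jaccard_index(p, q):
    """Standard Jaccard Index: |p ∩ q| / |p ∪ q|."""
    p, q = set(p), set(q)
    union = p | q
    return len(p & q) / len(union) if union else 0.0


def overlap_fraction(p, q):
    """Shared genes divided by the size of the smaller signature.

    Assumed definition; the manuscript does not state the denominator.
    """
    p, q = set(p), set(q)
    smaller = min(len(p), len(q))
    return len(p & q) / smaller if smaller else 0.0


# Placeholder signatures: 2 shared genes, 5 genes in the union.
sig_p = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
sig_q = ["GENE_A", "GENE_B", "GENE_E"]

print(jaccard_index(sig_p, sig_q))     # 2 / 5
print(overlap_fraction(sig_p, sig_q))  # 2 / 3
```

Both measures are zero for any pair of signatures with no shared genes, regardless of how similarly the two gene sets behave across specimens, which is why they cannot distinguish the anti-correlated from the positively correlated non-overlapping pairs.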
We next sought to assign terminology rooted in the literature to each signature cluster. We initially considered the five higher-level clusters of the hierarchical tree (Fig. 1F). Upon examining the constituent signatures, these five groups showed consistent themes with the biological interpretation from the original studies. Based upon the interpretations from the original studies, we designated them as follows: Differentiated, Mitotic/MYC, Amelanotic, Hypometabolic, and Dedifferentiated (Fig. 2A & 2B). The Differentiated class encompasses signatures that are characteristic of pigmentation and differentiation, exemplified by MITF targets from Hoek et al. and Rambow et al., as well as pigmented melanocytes from adult skin (Belote et al., 2021; Hoek, Schlegel, et al., 2008; Rambow et al., 2018). The Mitotic/MYC class represents signatures of mitotic processes, including DNA replication from Kauffmann et al. and mitosis from Rambow et al. (Kauffmann et al., 2008; Rambow et al., 2018). The Amelanotic class represents a population of non-pigmented melanocytes, such as those on volar anatomic sites. Volar melanocytes are especially critical as an aggressive subtype of melanoma, acral melanoma, arises from these cells (Belote et al., 2021; Okamoto et al., 2014). The Hypometabolic class contains a single signature from Rambow et al. that describes a nutrient-deprived state (Rambow et al., 2018).
Figure 2. Identification and characterization of principal signature classes.

A. Global correlation heatmap of all pairs of signature Z-scores. Each cell in the heatmap represents the correlation coefficient between the two corresponding signature Z-score pairs. The rows and columns are ordered by hierarchical clustering with the dendrogram on top. Rows are labeled by the corresponding principal classes. B. All constituent signatures for each of the seven principal classes. C. Histogram of all pairwise gene overlap fractions within the same principal class. For each of the seven principal classes, the gene overlap fractions of all signature pairs within the same class are computed. These gene overlap fractions from all seven classes are aggregated and counted. The vertical red dotted line denotes the overlap fraction at 0.5. D. Bar plots of representative Enrichr results showing the enrichments of all principal classes in three GO and Hallmark terms. E. Bar plots of Enrichr results showing the three KEGG and Hallmark terms in which the Mitotic/MYC class is the most enriched.
It was notable that three frequently discussed types of gene signatures (those associated with AXL expression, a neural-crest-like phenotype, or invasive behavior) are all contained within a single node. Within this Dedifferentiated parent cluster, signatures associated with each of these phenotypes did form subclusters. Since recent single-cell sequencing studies and literature reviews identify these phenotypes as distinct (Karras et al., 2022; Pozniak et al., 2024; Rambow et al., 2018, 2019; Tsoi et al., 2018; Wouters et al., 2020), we further divided the Dedifferentiated branch into three subclasses. The AXL class includes the AXL program from Tirosh et al. and a TNF-α signature that is highly correlated to AXL expression from Riesenberg et al. (Riesenberg et al., 2015; Tirosh et al., 2016). The Neuro class represents a neural-crest-like cell state traditionally considered an intermediate cell state (Rambow et al., 2019; Wouters et al., 2020). Lastly, the Invasive class represents one of the more established aggressive states that is typically associated with metastatic dissemination (Hoek et al., 2006). The close correlation of the signatures for the three dedifferentiated subclusters indicates that even though single-cell sequencing studies have established each as a distinct cluster, cells expressing each of these programs tend to co-occur in clinical specimens. To confirm that signature clusters are not merely a consequence of high amounts of gene overlap, we examined the pairwise overlap of signatures within each cluster. Fig. 2C shows that the majority of signatures within the same class have an overlap fraction below 0.5 (see also Fig. 1F), suggesting that the clustering results are driven by biological signal rather than by shared gene membership. Overall, we propose a taxonomy of five major classes of cutaneous melanoma gene signatures with three subclasses within the Dedifferentiated class, resulting in a total of seven principal classes (Fig. 2A & 2B): Differentiated, Mitotic/MYC, Amelanotic, Hypometabolic, AXL, Neuro, and Invasive. It is important to note that the precise choice of label is subjective. The purpose of the classification system developed here is to assess the similarity between gene sets that may describe the same biological cell state despite having few overlapping genes. This clustering and proposed taxonomy are consistent with established literature and initial biological interpretations; however, individual investigators are encouraged to consult the original experiments that defined each signature for more nuanced interpretations of the phenotypes.
We next derived consensus gene signatures for each class. We pooled the Z-scores of all the genes within one class and then computed the correlation between each gene and each signature in the same class. Each class consensus signature consists of genes that are both highly correlated to the signatures within the class and also unique to the class (Supplementary Table 3). We performed pathway enrichment analysis of the seven class consensus signatures using Enrichr with Hallmark, GO Biological Process, and KEGG gene sets. The findings revealed a congruence between the distinctive biological themes characterizing each class and their corresponding signatures. For instance, the Differentiated class is highly enriched in GO_Melanin_Biosynthetic_Process, and the Invasive class is highly enriched in Hallmark_EMT (Fig. 2D). Five classes show enrichment in Hallmark_Glycolysis, but the Hypometabolic class is the most enriched. The Mitotic/MYC class is strongly enriched in KEGG_DNA_Replication, and intriguingly, this class is also uniquely enriched in Hallmark_MYC_Targets (Fig. 2E), suggesting the potential of MYC as a biomarker for this class. Overall, the seven classes and the constituent signatures are consistent with the previously understood underlying biology and Enrichr results.
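One way to read the consensus-signature step is sketched below in Python. The text states only that consensus genes are highly correlated with the signatures of their class and unique to that class; the cutoff (`min_r = 0.5`), the use of the mean correlation across member signatures, and the interpretation of "unique" as "not selected for any other class" are assumptions, and all gene, signature, and class names are placeholders.

```python
import numpy as np


def pearson_rows(a, b):
    """Pearson correlation between every row of `a` and every row of `b`."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def consensus_signatures(z, genes, sig_scores, sig_genes, sig_class, min_r=0.5):
    """Pick genes correlated with their class's signatures and unique to it."""
    row = {g: i for i, g in enumerate(genes)}
    classes = sorted(set(sig_class.values()))
    selected = {}
    for c in classes:
        members = [s for s in sig_class if sig_class[s] == c]
        # Pool every gene that belongs to any signature of this class.
        pool = sorted({g for s in members for g in sig_genes[s] if g in row})
        r = pearson_rows(z[[row[g] for g in pool]],
                         np.vstack([sig_scores[s] for s in members]))
        selected[c] = {g for g, r_g in zip(pool, r) if r_g.mean() >= min_r}
    consensus = {}
    for c in classes:
        others = set().union(*(selected[o] for o in classes if o != c))
        consensus[c] = sorted(selected[c] - others)  # enforce uniqueness
    return consensus


# Toy data: two independent programs, one gene driven by both, one noise gene.
rng = np.random.default_rng(1)
n = 200
s1, s2 = rng.normal(size=n), rng.normal(size=n)
genes = ["X1", "X2", "X3", "Y1", "Y2", "SHARED", "NOISE"]
signal = np.vstack([s1, s1, s1, s2, s2, s1 + s2, np.zeros(n)])
expr = signal + 0.2 * rng.normal(size=signal.shape)
z = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, ddof=1, keepdims=True)

sig_genes = {"sigX1": ["X1", "X2", "SHARED", "NOISE"],
             "sigX2": ["X2", "X3"],
             "sigY1": ["Y1", "Y2", "SHARED"]}
sig_class = {"sigX1": "ClassX", "sigX2": "ClassX", "sigY1": "ClassY"}
row = {g: i for i, g in enumerate(genes)}
sig_scores = {s: z[[row[g] for g in m]].mean(axis=0) for s, m in sig_genes.items()}

consensus = consensus_signatures(z, genes, sig_scores, sig_genes, sig_class)
print(consensus)
```

In the toy example, the gene driven by both programs is excluded from both consensus lists and the uncorrelated gene is excluded by the correlation cutoff, leaving only class-specific genes.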
Our Enrichr analysis additionally highlights the need to categorize gene signatures into clusters for a comprehensive global analysis, rather than depending solely on relative gene expression against a single, non-standardized baseline transcriptome. Take, for instance, the Hallmark_EMT gene set, which is often reported as enriched in GSEA analyses of melanocytic signatures. We found that while the AXL, Invasive, and Neuro classes all show enrichment for Hallmark_EMT, the level of enrichment varies markedly, and therefore perceived enrichment in any gene signature derived from DE will be substantially influenced not only by the transcriptional profile of interest, but also by the choice of baseline transcriptome for comparison (Fig. 2D). This finding cautions against the common practice of interpreting any two gene signatures as indicative of the same phenotype based solely on their shared enrichment in gene sets, such as Hallmark_EMT. In contrast, the Mitotic/MYC class is the only class enriched for Hallmark_MYC_Targets_V2, and it is thus less likely that transcriptional profiles found enriched for this gene set would represent a different class. Our framework offers a more standardized approach that allows for a nuanced interpretation of the degree of enrichment, providing a more objective assessment of their biological significance.
In addition to classifying previously published signatures, we also developed a user-friendly web application (step-by-step tutorial in Supplementary Material), WIMMS (What Is My Melanocytic Signature), available at https://wimms.tanlab.org/. The main utility of this web application is to help researchers readily visualize the relationship between a new signature and the 39 signatures used here, as well as to classify the signature into one of the seven classes we identified or categorize it as an independent class. Based on the class signatures, we include a function (Composition) in the web application that allows the user to simultaneously compare their signature to all seven classes and obtain a quantitative readout. This method permits a signature to exhibit similarities to several classes. Since transcriptional programs can exhibit modularity, such that strong expression of one program does not necessarily preclude expression of an orthogonal program, this visualization offers a more comprehensive understanding of relationships to existing signatures than what can be readily determined through a dendrogram. In addition, the application includes an overlap analysis function so that users can determine whether the classification is a consequence of high gene-specific overlap. Lastly, we included the results from the Enrichr enrichment analysis (Chen et al., 2013) so that users can determine the biological relevance of their signature after classification.
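The idea of a multi-class readout can be illustrated with a small Python sketch. The text does not specify how the Composition function computes its readout, so the Pearson correlation used here is a stand-in rather than the WIMMS implementation; the function name `composition`, the two class names chosen, and the simulated scores are illustrative assumptions.

```python
import numpy as np


def composition(user_score, class_scores):
    """Correlate one signature's per-specimen score with each class's score."""
    return {name: float(np.corrcoef(user_score, score)[0, 1])
            for name, score in class_scores.items()}


# Simulated per-specimen aggregate Z-scores for two classes (hypothetical).
rng = np.random.default_rng(2)
n_specimens = 100
class_scores = {
    "Differentiated": rng.normal(size=n_specimens),
    "Invasive": rng.normal(size=n_specimens),
}

# A modular signature: mostly the Invasive program, partly the Differentiated one.
user_score = (class_scores["Invasive"]
              + 0.5 * class_scores["Differentiated"]
              + 0.3 * rng.normal(size=n_specimens))

readout = composition(user_score, class_scores)
closest = max(readout, key=readout.get)
print(closest, readout)
```

Unlike a dendrogram, which places a signature on a single branch, a per-class readout of this kind keeps the secondary similarity visible alongside the closest class.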
To showcase how WIMMS can efficiently facilitate signature identification, we classified four signatures from Pozniak et al. – named in the original study as Melanocytic, Mesenchymal-Like, Mitotic, and Neural Crest-like – which were not included in the original 39 signature framework (Fig. 3A). In accordance with the original interpretation, the Melanocytic and Mitotic signatures aligned with the Differentiated and Mitotic/MYC classes, respectively. Similarly, the Mesenchymal signature clusters with the Invasive class, which is most enriched for the Hallmark_EMT gene set. While this alignment is consistent with the Pozniak nomenclature, the Composition function highlights substantial correlation with additional clusters as well, notably the Neuro class, in which the signature is only slightly less enriched compared to the Invasive class. In other words, our framework reveals that while it was not inaccurate to describe this gene signature as “Mesenchymal,” it is also important to acknowledge that this signature also possesses similarities to other classes. The Neural Crest-like signature represents a state interpreted in the original study as dedifferentiated along with the Mesenchymal phenotype, and the authors demonstrated that other previously described signatures are not strongly enriched in this cell cluster. Results from our framework paint a similar picture, with this signature being clustered into the broad dedifferentiated class but as an independent branch from the Invasive, Neuro, and AXL subclasses (Fig. 3A). Moreover, the Neural Crest-like signature shows enrichments in six out of seven classes (Fig. 3E), while the Melanocytic and Mitotic signatures are more exclusively enriched in only one or two classes (Fig. 3B & 3C). This observation suggests that the Neural Crest-like state is a transitional state linked to a dedifferentiated phenotype as opposed to an exclusive identity demonstrated by Melanocytic and Mitotic signatures. 
These insights collectively underscore the reliability of our method in mirroring established interpretations from the field while offering additional nuanced interpretations. Moreover, this analysis was swiftly conducted within minutes, utilizing only lists of gene signatures and a straightforward point-and-click interface, underscoring the user-friendly nature of the web application.
Figure 3. Example of signature classification using WIMMS.

A. Overall clustering results of four signatures from Pozniak et al (Pozniak et al., 2024). Pozniak signatures are in red rectangles. B. Radar plot showing the enrichment of the Melanocytic signature from Pozniak et al. among the seven principal classes. C. Radar plot showing the enrichment of the Mitotic signature from Pozniak et al. among the seven principal classes. D. Radar plot showing the enrichment of the Mesenchymal signature from Pozniak et al. among the seven principal classes. E. Radar plot showing the enrichment of the Neural Crest Like signature from Pozniak et al. among the seven principal classes.
In this study, we developed methods to systematically classify major melanocytic gene signatures. We provide an easy-to-use web application that allows users to classify their own signatures within this framework, benchmarking any signature to established melanocytic cell states. With the suggested taxonomy of melanoma signatures, this application will facilitate the identification of novel signatures and the consolidation of redundant signatures. Other functionalities, such as overlap and composition analyses, will provide orthogonal evidence for the classification. The class signatures are an important resource that offers a reference for biomarker discovery. We anticipate this tool to complement traditional DE analysis and GSEA in the field of melanoma research.
Significance.
The challenge of reconciling the plethora of published gene signatures, each identified through comparing relative gene expression across various sample types and profiling technologies, poses a barrier to fully understanding the spectrum of melanocytic cell phenotypes. To address this, we have developed a new framework that categorizes published signatures into seven distinct classes. Leveraging this framework, we have created a web application that streamlines the process of clustering new gene signatures within these established categories. Overall, we provide a taxonomy for melanoma gene signatures that clarifies the identification of major phenotypes, coupled with an intuitive interface for signature integration and, ultimately, facilitating research into better diagnostic, prognostic, and therapeutic outcomes.
Acknowledgments
This work was supported by a National Cancer Institute R01 (R01CA229896) to RLJ-T. We utilized the Shared Resources for Research Informatics supported by the National Cancer Institute of the National Institutes of Health under Award Number P30CA042014.
References
- Alonso SR, Tracey L, Ortiz P, Pérez-Gómez B, Palacios J, Pollán M, Linares J, Serrano S, Sáez-Castillo AI, Sánchez L, Pajares R, Sánchez-Aguilera A, Artiga MJ, Piris MA, & Rodríguez-Peralto JL (2007). A High-Throughput Study in Melanoma Identifies Epithelial-Mesenchymal Transition as a Major Determinant of Metastasis. Cancer Research, 67(7), 3450–3460. 10.1158/0008-5472.CAN-06-3481
- Arozarena I, & Wellbrock C (2019). Phenotype plasticity as enabler of melanoma progression and therapy resistance. Nature Reviews Cancer, 19(7), Article 7. 10.1038/s41568-019-0154-4
- Belote RL, Le D, Maynard A, Lang UE, Sinclair A, Lohman BK, Planells-Palop V, Baskin L, Tward AD, Darmanis S, & Judson-Torres RL (2021). Human melanocyte development and melanoma dedifferentiation at single-cell resolution. Nature Cell Biology, 23(9), Article 9. 10.1038/s41556-021-00740-8
- Bogunovic D, O’Neill DW, Belitskaya-Levy I, Vacic V, Yu Y-L, Adams S, Darvishian F, Berman R, Shapiro R, Pavlick AC, Lonardi S, Zavadil J, Osman I, & Bhardwaj N (2009). Immune profile and mitotic index of metastatic melanoma lesions enhance clinical staging in predicting patient survival. Proceedings of the National Academy of Sciences, 106(48), 20429–20434. 10.1073/pnas.0905139106
- Centeno PP, Pavet V, & Marais R (2023). The journey from melanocytes to melanoma. Nature Reviews Cancer, 23(6), Article 6. 10.1038/s41568-023-00565-7
- Chen EY, Tan CM, Kou Y, Duan Q, Wang Z, Meirelles GV, Clark NR, & Ma’ayan A (2013). Enrichr: Interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics, 14(1), 128. 10.1186/1471-2105-14-128
- Costa-Silva J, Domingues D, & Lopes FM (2017). RNA-Seq differential expression analysis: An extended review and a software tool. PLOS ONE, 12(12), e0190152. 10.1371/journal.pone.0190152
- Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, & Gingeras TR (2013). STAR: Ultrafast universal RNA-seq aligner. Bioinformatics, 29(1), 15–21. 10.1093/bioinformatics/bts635
- Gide TN, Quek C, Menzies AM, Tasker AT, Shang P, Holst J, Madore J, Lim SY, Velickovic R, Wongchenko M, Yan Y, Lo S, Carlino MS, Guminski A, Saw RPM, Pang A, McGuire HM, Palendira U, Thompson JF, … Wilmott JS (2019). Distinct Immune Cell Populations Define Response to Anti-PD-1 Monotherapy and Anti-PD-1/Anti-CTLA-4 Combined Therapy. Cancer Cell, 35(2), 238–255.e6. 10.1016/j.ccell.2019.01.003
- Hoek KS, Eichhoff OM, Schlegel NC, Döbbeling U, Kobert N, Schaerer L, Hemmi S, & Dummer R (2008). In vivo Switching of Human Melanoma Cells between Proliferative and Invasive States. Cancer Research, 68(3), 650–656. 10.1158/0008-5472.CAN-07-2491
- Hoek KS, Schlegel NC, Brafford P, Sucker A, Ugurel S, Kumar R, Weber BL, Nathanson KL, Phillips DJ, Herlyn M, Schadendorf D, & Dummer R (2006). Metastatic potential of melanomas defined by specific gene expression profiles with no BRAF signature. Pigment Cell Research, 19(4), 290–302. 10.1111/j.1600-0749.2006.00322.x
- Hoek KS, Schlegel NC, Eichhoff OM, Widmer DS, Praetorius C, Einarsson SO, Valgeirsdottir S, Bergsteinsdottir K, Schepsky A, Dummer R, & Steingrimsson E (2008). Novel MITF targets identified using a two-step DNA microarray strategy. Pigment Cell & Melanoma Research, 21(6), 665–676. 10.1111/j.1755-148X.2008.00505.x
- Hwang B, Lee JH, & Bang D (2018). Single-cell RNA sequencing technologies and bioinformatics pipelines. Experimental & Molecular Medicine, 50(8), Article 8. 10.1038/s12276-018-0071-8
- Jaeger J, Koczan D, Thiesen H-J, Ibrahim SM, Gross G, Spang R, & Kunz M (2007). Gene Expression Signatures for Tumor Progression, Tumor Subtype, and Tumor Thickness in Laser-Microdissected Melanoma Tissues. Clinical Cancer Research, 13(3), 806–815. 10.1158/1078-0432.CCR-06-1820
- Jeffs AR, Glover AC, Slobbe LJ, Wang L, He S, Hazlett JA, Awasthi A, Woolley AG, Marshall ES, Joseph WR, Print CG, Baguley BC, & Eccles MR (2009). A Gene Expression Signature of Invasive Potential in Metastatic Melanoma Cells. PLOS ONE, 4(12), e8461. 10.1371/journal.pone.0008461
- Jerby-Arnon L, Shah P, Cuoco MS, Rodman C, Su M-J, Melms JC, Leeson R, Kanodia A, Mei S, Lin J-R, Wang S, Rabasha B, Liu D, Zhang G, Margolais C, Ashenberg O, Ott PA, Buchbinder EI, Haq R, … Regev A (2018). A Cancer Cell Program Promotes T Cell Exclusion and Resistance to Checkpoint Blockade. Cell, 175(4), 984–997.e24. 10.1016/j.cell.2018.09.006
- Kabbarah O, Nogueira C, Feng B, Nazarian RM, Bosenberg M, Wu M, Scott KL, Kwong LN, Xiao Y, Cordon-Cardo C, Granter SR, Ramaswamy S, Golub T, Duncan LM, Wagner SN, Brennan C, & Chin L (2010). Integrative Genome Comparison of Primary and Metastatic Melanomas. PLOS ONE, 5(5), e10770. 10.1371/journal.pone.0010770
- Karras P, Bordeu I, Pozniak J, Nowosad A, Pazzi C, Van Raemdonck N, Landeloos E, Van Herck Y, Pedri D, Bervoets G, Makhzami S, Khoo JH, Pavie B, Lamote J, Marin-Bejar O, Dewaele M, Liang H, Zhang X, Hua Y, … Marine J-C (2022). A cellular hierarchy in melanoma uncouples growth and metastasis. Nature, 610(7930), Article 7930. 10.1038/s41586-022-05242-7
- Kauffmann A, Rosselli F, Lazar V, Winnepenninckx V, Mansuet-Lupo A, Dessen P, van den Oord JJ, Spatz A, & Sarasin A (2008). High expression of DNA repair pathways is associated with metastasis in melanoma patients. Oncogene, 27(5), Article 5. 10.1038/sj.onc.1210700
- Konieczkowski DJ, Johannessen CM, Abudayyeh O, Kim JW, Cooper ZA, Piris A, Frederick DT, Barzily-Rokni M, Straussman R, Haq R, Fisher DE, Mesirov JP, Hahn WC, Flaherty KT, Wargo JA, Tamayo P, & Garraway LA (2014). A Melanoma Cell State Distinction Influences Sensitivity to MAPK Pathway Inhibitors. Cancer Discovery, 4(7), 816–827. 10.1158/2159-8290.CD-13-0424
- Li B, & Dewey CN (2011). RSEM: Accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics, 12(1), 323. 10.1186/1471-2105-12-323
- Mann GJ, Pupo GM, Campain AE, Carter CD, Schramm S-J, Pianova S, Gerega SK, De Silva C, Lai K, Wilmott JS, Synnott M, Hersey P, Kefford RF, Thompson JF, Yang YH, & Scolyer RA (2013). BRAF Mutation, NRAS Mutation, and the Absence of an Immune-Related Expressed Gene Profile Predict Poor Outcome in Patients with Stage III Melanoma. Journal of Investigative Dermatology, 133(2), 509–517. 10.1038/jid.2012.283
- Martin M (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.Journal, 17(1), Article 1. 10.14806/ej.17.1.200
- Moses L, & Pachter L (2022). Museum of spatial transcriptomics. Nature Methods, 19(5), Article 5. 10.1038/s41592-022-01409-2
- Okamoto N, Aoto T, Uhara H, Yamazaki S, Akutsu H, Umezawa A, Nakauchi H, Miyachi Y, Saida T, & Nishimura EK (2014). A melanocyte–melanoma precursor niche in sweat glands of volar skin. Pigment Cell & Melanoma Research, 27(6), 1039–1050. 10.1111/pcmr.12297
- Pozniak J, Pedri D, Landeloos E, Herck YV, Antoranz A, Vanwynsberghe L, Nowosad A, Roda N, Makhzami S, Bervoets G, Maciel LF, Pulido-Vicuña CA, Pollaris L, Seurinck R, Zhao F, Flem-Karlsen K, Damsky W, Chen L, Karagianni D, … Marine J-C (2024). A TCF4-dependent gene regulatory network confers resistance to immunotherapy in melanoma. Cell, 187(1), 166–183.e25. 10.1016/j.cell.2023.11.037
- Rambow F, Marine J-C, & Goding CR (2019). Melanoma plasticity and phenotypic diversity: Therapeutic barriers and opportunities. Genes & Development, 33(19–20), 1295–1318. 10.1101/gad.329771.119
- Rambow F, Rogiers A, Marin-Bejar O, Aibar S, Femel J, Dewaele M, Karras P, Brown D, Chang YH, Debiec-Rychter M, Adriaens C, Radaelli E, Wolter P, Bechter O, Dummer R, Levesque M, Piris A, Frederick DT, Boland G, … Marine J-C (2018). Toward Minimal Residual Disease-Directed Therapy in Melanoma. Cell, 174(4), 843–855.e19. 10.1016/j.cell.2018.06.025
- Riesenberg S, Groetchen A, Siddaway R, Bald T, Reinhardt J, Smorra D, Kohlmeyer J, Renn M, Phung B, Aymans P, Schmidt T, Hornung V, Davidson I, Goding CR, Jönsson G, Landsberg J, Tüting T, & Hölzel M (2015). MITF and c-Jun antagonism interconnects melanoma dedifferentiation with pro-inflammatory cytokine responsiveness and myeloid cell recruitment. Nature Communications, 6(1), Article 1. 10.1038/ncomms9755
- Rizos H, Menzies AM, Pupo GM, Carlino MS, Fung C, Hyman J, Haydu LE, Mijatov B, Becker TM, Boyd SC, Howle J, Saw R, Thompson JF, Kefford RF, Scolyer RA, & Long GV (2014). BRAF Inhibitor Resistance Mechanisms in Metastatic Melanoma: Spectrum and Clinical Impact. Clinical Cancer Research, 20(7), 1965–1977. 10.1158/1078-0432.CCR-13-3122
- Ryu B, Moriarty WF, Stine MJ, DeLuca A, Kim DS, Meeker AK, Grills LD, Switzer RA, Eller MS, & Alani RM (2011). Global Analysis of BRAFV600E Target Genes in Human Melanocytes Identifies Matrix Metalloproteinase-1 as a Critical Mediator of Melanoma Growth. Journal of Investigative Dermatology, 131(7), 1579–1583. 10.1038/jid.2011.65
- Shaffer SM, Dunagin MC, Torborg SR, Torre EA, Emert B, Krepler C, Beqiri M, Sproesser K, Brafford PA, Xiao M, Eggan E, Anastopoulos IN, Vargas-Garcia CA, Singh A, Nathanson KL, Herlyn M, & Raj A (2017). Rare cell variability and drug-induced reprogramming as a mode of cancer drug resistance. Nature, 546(7658), Article 7658. 10.1038/nature22794
- Siegel RL, Giaquinto AN, & Jemal A (2024). Cancer statistics, 2024. CA: A Cancer Journal for Clinicians, 74(1), 12–49. 10.3322/caac.21820
- Suzuki R, & Shimodaira H (2006). Pvclust: An R package for assessing the uncertainty in hierarchical clustering. Bioinformatics, 22(12), 1540–1542. 10.1093/bioinformatics/btl117
- Tirosh I, Izar B, Prakadan SM, Wadsworth MH, Treacy D, Trombetta JJ, Rotem A, Rodman C, Lian C, Murphy G, Fallahi-Sichani M, Dutton-Regester K, Lin J-R, Cohen O, Shah P, Lu D, Genshaft AS, Hughes TK, Ziegler CGK, … Garraway LA (2016). Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science (New York, N.Y.), 352(6282), 189–196. 10.1126/science.aad0501
- Tsoi J, Robert L, Paraiso K, Galvan C, Sheu KM, Lay J, Wong DJL, Atefi M, Shirazi R, Wang X, Braas D, Grasso CS, Palaskas N, Ribas A, & Graeber TG (2018). Multi-stage Differentiation Defines Melanoma Subtypes with Differential Vulnerability to Drug-Induced Iron-Dependent Oxidative Stress. Cancer Cell, 33(5), 890–904.e5. 10.1016/j.ccell.2018.03.017
- Verfaillie A, Imrichova H, Atak ZK, Dewaele M, Rambow F, Hulselmans G, Christiaens V, Svetlichnyy D, Luciani F, Van den Mooter L, Claerhout S, Fiers M, Journe F, Ghanem G-E, Herrmann C, Halder G, Marine J-C, & Aerts S (2015). Decoding the regulatory landscape of melanoma reveals TEADS as regulators of the invasive cell state. Nature Communications, 6(1), Article 1. 10.1038/ncomms7683
- Wang Z, Gerstein M, & Snyder M (2009). RNA-Seq: A revolutionary tool for transcriptomics. Nature Reviews Genetics, 10(1), Article 1. 10.1038/nrg2484
- Weinstein JN, Collisson EA, Mills GB, Shaw KRM, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, & Stuart JM (2013). The Cancer Genome Atlas Pan-Cancer analysis project. Nature Genetics, 45(10), Article 10. 10.1038/ng.2764
- Widmer DS, Cheng PF, Eichhoff OM, Belloni BC, Zipser MC, Schlegel NC, Javelaud D, Mauviel A, Dummer R, & Hoek KS (2012). Systematic classification of melanoma cells by phenotype-specific gene expression mapping. Pigment Cell & Melanoma Research, 25(3), 343–353. 10.1111/j.1755-148X.2012.00986.x
- Winnepenninckx V, Lazar V, Michiels S, Dessen P, Stas M, Alonso SR, Avril M-F, Ortiz Romero PL, Robert T, Balacescu O, Eggermont AMM, Lenoir G, Sarasin A, Tursz T, van den Oord JJ, Spatz A, & On behalf of the Melanoma Group of the European Organization for Research and Treatment of Cancer. (2006). Gene Expression Profiling of Primary Cutaneous Melanoma and Clinical Outcome. JNCI: Journal of the National Cancer Institute, 98(7), 472–482. 10.1093/jnci/djj103
- Wouters J, Kalender-Atak Z, Minnoye L, Spanier KI, De Waegeneer M, Bravo González-Blas C, Mauduit D, Davie K, Hulselmans G, Najem A, Dewaele M, Pedri D, Rambow F, Makhzami S, Christiaens V, Ceyssens F, Ghanem G, Marine J-C, Poovathingal S, & Aerts S (2020). Robust gene expression programs underlie recurrent cell states and phenotype switching in melanoma. Nature Cell Biology, 22(8), 986–998. 10.1038/s41556-020-0547-3
- Zhao S, Zhang Y, Gamini R, Zhang B, & von Schack D (2018). Evaluation of two main RNA-seq approaches for gene quantification in clinical RNA sequencing: polyA+ selection versus rRNA depletion. Scientific Reports, 8(1), Article 1. 10.1038/s41598-018-23226-4
