Abstract
Rhabdomyosarcoma (RMS) is a common childhood cancer that shares features with developing skeletal muscle. Yet, the conservation of cellular hierarchy with human muscle development and the identification of molecularly-defined tumor-propagating cells have not been reported. Using single-cell RNA sequencing, DNA-barcode cell fate mapping, and functional stem cell assays, we uncovered shared tumor cell hierarchies in RMS and human muscle development. We also identified common developmental stages at which tumor cells become arrested. Fusion-negative (FN-) RMS resemble early myogenic cells found in embryonic and fetal development, while fusion-positive (FP-) RMS express a highly specific gene program found in muscle cells transiting from embryonic to fetal development at 7-7.75 weeks of age. FP-RMS also have neural-pathway enriched states, suggesting less-rigid adherence to muscle-lineage hierarchies. Finally, we identified a molecularly-defined tumor-propagating subpopulation in FN-RMS that shares remarkable similarity to bi-potent, muscle mesenchyme progenitors that can make both muscle and osteogenic cells.
Introduction
Many cancers contain less-differentiated cell types that have the capacity to self-renew and proliferate to drive tumor growth1. These tumor-propagating cells (TPCs) also differentiate to give rise to all the cell types within the tumor. Indeed, molecularly-defined TPCs have been identified in acute myeloid leukemia2, breast cancer3, and colorectal cancer4 among others. Yet, some cancers are not hierarchically-organized and exhibit extreme cellular plasticity that drives tumor growth, the most notable example being melanoma5. In addition to defining roles for TPCs in driving tumor growth in specific malignancies, it is not well understood if the same self-renewal programs and cell fate decisions found in the predicted tissue-of-origin are recapitulated in cancer. For example, medulloblastomas coopt the same Sonic Hedgehog (SHH) and WNT signaling pathways to drive cancer self-renewal as found in non-malignant neuron precursor cells6,7. By contrast, MYC drives cancer self-renewal programs that are not restricted to the predicted tissue-of-origin8. These findings suggest that a subset of tumors re-use the same developmental stem cell pathways found in their proposed originating tissue while others adopt new self-renewal programs as part of their transformation process.
Rhabdomyosarcoma (RMS) is the most common soft tissue sarcoma of childhood and shares histopathological features with hierarchically organized skeletal muscle9-11, making it an ideal model to address these questions. Rhabdomyosarcoma consists of two major subtypes, including fusion-positive RMS that harbor PAX3 or PAX7 translocations with FOXO1 (FP-RMS), and fusion-negative RMS that are largely transformed by RAS pathway activation (FN-RMS)12-15. Clinical assignment of high risk RMS includes harboring PAX:FOXO1 translocations, age <1 or older than 10 years of age, developing primary tumor at unfavorable locations, failing to achieve local tumor control, and/or progressing to metastatic disease16-19. Although these common clinical characteristics inform treatment, it is also clear that additional molecular heterogeneity underlies tumor aggression and therapy resistance. For example, PAX3-FOXO1 fusion-positive RMS and the MYODL122R spindle-variant of RMS have poor prognosis compared to fusion-negative or PAX7-FOXO1 RMS17-19, suggesting additional molecular and tumor heterogeneity beyond the purported two RMS subtypes. Indeed, additional genetic perturbations including P53 pathway inactivation are risk factors for developing aggressive and treatment resistant disease in both FN- and FP-RMS20. Despite roles for genetic mutations and molecular heterogeneity in driving RMS aggression, both RMS subtypes express muscle-lineage transcription factors including MYOD (Myoblast Determination Protein 1), MYF5 (Myogenic Factor 5), and/or Myogenin21,22 and morphologically resemble undifferentiated mononucleated muscle cells found throughout fetal, embryonic, and adult development including dermomyotome, satellite cells, muscle progenitors, and myoblasts/myocytes12,23-26. These data suggest that underlying muscle developmental pathways drive the growth and maintenance of a wide array of RMS tumors.
To date, direct assignment of RMS molecular cell states to those from normal human skeletal muscle development has not been reported. Nor is it known to what extent RMS cell hierarchies recapitulate those found in muscle, or the maturation stage at which tumor cells arrest in human development. The controversy over RMS arising from multiple possible cells of origin12,23-25,27 and the lack of a detailed molecular description of human RMS tumor cell heterogeneity, including the cell types that sustain tumor cell growth, led us to perform single-cell RNA sequencing of human RMS and compare the results with human muscle development. We also performed functional stem cell assays to identify a largely quiescent tumor-propagating cell (TPC) in FN-RMS that drives cancer regrowth following stress. This FN-RMS TPC shares remarkable similarity to the recently described bipotent mesenchymal stem cell population that makes both muscle and osteogenic lineages28.
Results
scRNA-sequencing reveals RMS heterogeneity
To investigate the cell states and the conservation of muscle developmental hierarchies in RMS, we performed droplet-based 10x single cell RNA-sequencing of patient-derived xenografts (PDXs, n=9 from 7 patients; Fig. 1a, Supplementary Table 1). A mouse cell depletion kit removed stromal cells, with subsequent PDX samples containing only 0-1.3% mouse cells, which were efficiently excluded because they failed to map to the human genome. For primary, frozen patient samples, common stromal cell types were identified and then excluded based on Cellassign29 (n=4 from 3 patients, Extended Data Fig. 1a). RMS cells were independently confirmed as tumor based on expression of well-known RMS-expressed genes including MYOD, MYOG, and DES and of a highly specific FN- or FP-core signature identified below (Extended Data Fig. 1b, Fig. 6, Extended Data Fig. 9, and Supplementary Table 2).
Cells were next clustered using shared nearest neighbor (SNN) clustering analysis30 and visualized using UMAP rendering (Uniform Manifold Approximation and Projection, Fig. 1a, Extended Data Fig. 1c). Cell clusters with similar gene expression were combined and conserved cell states assigned using the Molecular Signatures Database v7.431 (Fig. 1b-c, Extended Data Fig. 1d, and Supplementary Table 2). From this analysis, we uncovered common pan-cancer cell states including proliferative, hypoxic, apoptotic, interferon and ER stress responsive cell signatures (Fig. 1d, Extended Data Fig. 1e, 2). We also discovered RMS-specific cell states that included i) a differentiated muscle cell population that expressed MYLPF (Myosin Light chain 2), ACTC1 (Actin Alpha Cardiac Muscle 1), LRRN1 (Leucine Rich Repeat Neuronal 1), TNNT3 (Troponin T3) and TSPAN33 (Tetraspanin-33); ii) a mesenchymal-enriched population (Mesen. like) that expressed extracellular matrix and mesenchymal genes that included MMP2 (Matrix Metalloproteinase 2), CD44, PTN (Pleiotrophin), POSTN (Periostin), and THY1 (CD90); and iii) neural-pathway enriched cell types found only in single FP-RMS samples (Fig. 1d, Extended Data Fig. 3). MYOD, Desmin, and MYC were expressed across all tumors and all RMS cell states, confirming that RMS cells arrest in and express genes associated with muscle lineage commitment (Extended Data Fig. 4). Two mice xenografted with FN-MAST85 PDX were independently analyzed by single cell sequencing, revealing largely similar cell state constitution across engrafted animals (Fig. 1d). Similar cell state composition was also observed from single-nuclei sequencing of four additional primary patient samples (Fig. 1d, denoted with asterisks, and Extended Data Fig. 1).
Immunofluorescence staining showed a uniform distribution of cell states within PDX tumors, irrespective of whether cells were located at the invasive edge or within the central tumor mass, with the exception of hypoxic and interferon-responsive cells that were regionally confined to heterogeneous patches throughout the tumor mass (Extended Data Fig. 5). IHC staining confirmed similar overall percentages of tumor cell states when compared with scRNA sequencing (Extended Data Fig. 5). Immunohistochemistry cell state markers were identified from our scRNA-sequencing and included Ki67 (MKI67) that labelled proliferative cells, NDRG1 (N-myc Downstream Regulated 1) for hypoxic cells, and MX1 (MX Dynamin Like GTPase 1) for interferon responsive cells. RMS-specific cell states were assessed using EGFR (Epidermal growth factor receptor) for mesenchymal-like cells and TNNT3 and myosin heavy chain (MF20) for differentiated muscle subpopulations.
A more detailed analysis of RMS intertumoral heterogeneity revealed that a vast majority of tumors contained differentiated muscle cells (Fig. 1d). Yet, one FN-RMS model (PDX MAST39 and its metastatic lesion MAST85) and one FN-primary patient sample (29806) did not contain differentiated muscle cells when assessed by single cell sequencing and IHC, consistent with the clinical presentation of a subset of RMS that lack differentiated muscle cell types. All five FN-RMS tumors contained mesenchymal-enriched cells, while only two of the five FP-RMS contained this cell subpopulation (Fig. 1d). Finally, a majority of FP-RMS harbored neural pathway enriched cells (n=4 of 5), suggesting that FP-RMS tumors may commonly adopt these cell states as part of the transformation process (Fig. 1d, Extended Data Fig. 3). Neural pathway-enriched tumor cell populations contained non-overlapping expressed genes that differed between clusters within the same tumor and across FP-RMS (Extended Data Fig. 3, Supplementary Table 2). Finally, all patient-derived RMS contained large numbers of cells that failed to express any of the above transcriptional gene modules and were assigned as “ground state” (Fig. 1c-d). These ground state cells are committed to muscle lineage and express muscle-specific genes including MYOD, MYOG, and Desmin (Extended Data Fig. 4). Importantly, comparable numbers of genes were detected in ground state cells when compared with other cell subpopulations (Extended Data Fig. 1e, 2b-c), obviating the possibility that ground state cells clustered together because of low transcript detection. Gene expression was also analyzed in 3-dimensional space based on the combined expression of gene modules for proliferation, muscle, and mesenchymal cell states. This orthogonal approach confirmed that each cell state was molecularly distinct and comprised largely non-overlapping cell states (Extended Data Fig. 6a-c).
These data show that there are four dominant, transcriptionally-defined RMS tumor cell states that comprise proliferative cells, ground state cells, mesenchymal cells, and differentiated muscle cells.
Not all RMS cells can initiate tumor growth
To determine the latency of tumor growth in FN- and FP-RMS PDX models in an unbiased manner, we next transplanted RMS cells from nine PDXs into NSG mice (n=3 mice per group, 1x10^5 and 1x10^4 cells, Fig. 2a-c). FN-RMS re-established tumors faster when compared with FP-RMS (35.4±11.7 days and 72±6.9 days, respectively; 1x10^5 cells/animal, p=5.5e-08, similar results were seen in animals engrafted with 1x10^4 cells). To further investigate if a single RMS cell can remake tumor and generate all the subsequent cellular states, single tumor cells from four PDXs were engrafted into the flanks of NSG mice (n=30-60 single cells/PDX, Fig. 2a, d). In total, three of the four PDXs generated tumors following single cell xenograft engraftment, including two fusion-negative and one fusion-positive RMS. scRNA sequencing confirmed that each tumor derived from implantation of a single RMS cell had similar cell state composition (Fig. 2d) and clustered with its parental tumor following TSNE visualization performed using all tumors (Extended Data Fig. 6d). These results confirm that a single tumor cell can repopulate the entirety of RMS cell states, including the neural pathway enriched cell states in FP-RMS, and also suggest that some RMS contain unexpectedly high numbers of TPCs.
FN-RMS contain a molecularly-defined TPC
To assess cell lineage and fate decisions in FN-RMS, scRNA sequencing was next completed using Lineage And RNA RecoverY (LARRY) barcoding of human FN-RMS RD cells (Fig. 3). Importantly, RD cells contain the same four dominant tumor cell states in both 2D culture and xenografts (Fig. 3b, Extended Data Fig. 6e-f). LARRY uses a unique lentiviral barcode inserted at the 3’ UTR of an expressed GFP that is integrated into parental cells. Progeny are then clonally traced to follow daughter cell fates over time32. Here, RD cells were lentivirally infected at a MOI of 0.3, ensuring that each cell integrated one copy of the unique barcode (Fig. 3a). Cells were grown for two days to permit 1-2 cell divisions and GFP+ cells isolated by FACS. A portion of the sample was used for 10x scRNA sequencing to assess the initial cell states of parent cells and included a LARRY specific primer to amplify the barcode (~5x10^4 reads per cell). The remaining cells were cultured in high serum or low serum/differentiation media for 4 days. Daughter cells were then harvested for scRNA sequencing and a portion of the cells grown in differentiation media were replated into high serum growth media for 3 days. In total, LARRY barcodes were detected in 26.4 to 47.8% of scRNA sequenced cells across conditions (range 2,040 to 2,470 cells/condition). Bioinformatic analysis confirmed that a large majority of cells found in the LARRY barcoded library retained their cell state following short-term culture (Fig. 3c). In total, ≥446 barcodes were shared in parental and daughter cells across experimental conditions (Fig. 3d), permitting lineage tracing and cell fate mapping over time (Fig. 3e-g).
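The choice of a low MOI can be illustrated with a short calculation. The sketch below assumes that the number of barcode integrations per cell follows a Poisson distribution with mean equal to the MOI; this is a common idealization and is not stated in the text. Under that assumption it estimates the fraction of infected cells carrying exactly one barcode. The function names are illustrative only.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability of exactly k integrations when the mean is `moi`."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def single_copy_fraction(moi: float) -> float:
    """Among cells with at least one integration, fraction with exactly one."""
    infected = 1.0 - poisson_pmf(0, moi)
    return poisson_pmf(1, moi) / infected


if __name__ == "__main__":
    moi = 0.3  # value reported in the text
    infected = 1.0 - poisson_pmf(0, moi)
    print(f"fraction of cells infected: {infected:.3f}")
    print(f"single-copy fraction among infected: {single_copy_fraction(moi):.3f}")
```

Under this idealized model, a lower MOI raises the single-copy fraction at the cost of infecting fewer cells, which is the usual trade-off when preparing barcoded libraries.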
As may be expected, parental cells were largely comprised of proliferative cells and drove the bulk of tumor growth under high serum conditions (Fig. 3f). Yet, ground state cells also led to the production of large numbers of daughter cells, suggesting that ground and proliferative cell states dynamically oscillate in RMS. Importantly, both the ground and proliferative parental cells produced all cell types when grown in high serum, and only ground, muscle and mesenchymal-enriched cells when grown in low serum. By contrast, mesenchymal-enriched cells and muscle-lineage expressing cells were largely non-proliferative when grown in high serum and underwent only rare symmetric cell divisions to produce daughter cells with the same cell fate. These LARRY lineage tracing results align well with the scRNA sequencing showing that mesenchymal-enriched and muscle cell states are largely non-proliferative during tumor growth in patients and mouse xenografts (Extended Data Fig. 6a-b). Finally, serial replating from low serum into high serum revealed that mesenchymal-enriched cells could re-enter cell cycle and proliferate to remake cells from all four cell states. By contrast, differentiated muscle cells divided to a far lesser degree, and failed to make mesenchymal-enriched cells. These data show that the mesenchymal-enriched cell population contributes to tumor growth under low-nutrient, stress conditions and comprises a TPC that can reconstitute all cell states in FN-RMS.
To directly investigate the mesenchymal-enriched RMS cell state in driving cancer growth, we next sought to functionally assign tumor propagating potential to discrete populations of FN-RMS cells. Cell surface markers identified from our scRNA sequencing were used to enrich for the mesenchymal-like and differentiated muscle subpopulations from FN-RMS including two PDXs (MAST139, MSK7471) and three RMS cell lines (RD, 381T and SMS-CTR; Fig. 4, Extended Data Fig. 7, Supplementary Fig. 1). As expected, FACS using either i) CD44/CD90 or ii) CD90/CHODL combinations of antibodies allowed isolation of RMS cells that were highly enriched for the mesenchymal cell state when assessed by quantitative real time PCR (Fig. 4c, Extended Data Fig. 7b, f, j, n). Similar high-level purity for differentiated muscle cell fractions was observed following FACS using TSPAN33, LRRN1 or ERBB3 antibodies. Fusion-negative SMS-CTR contained mesenchymal cell states but lacked differentiated muscle cell populations (Extended Data Fig. 7m-p), consistent with the intertumoral heterogeneity seen in a subset of FN-RMS and previous studies showing lack of differentiation potential in this cell line33. Following 3D culturing, mesenchymal-like cells generated significantly more tumorspheres in all five FN-RMS models analyzed, especially when compared to muscle differentiation-enriched cell types (Fig. 4d, Extended Data Fig. 7d, h, l, p). Tumorspheres generated from mesenchymal-like cells were also larger in size (Fig. 4e, Extended Data Fig. 7c, g, k, o) and quantitatively enriched for TPCs when compared to counter-selected negative cell types or differentiated muscle cells (p<0.026, Fig. 4f, Supplementary Table 3).
Our results were next extended to mouse xenografts. Specifically, FACS sorted cells were engrafted into NSG mice at limiting dilution. Mesenchymal-like enriched TPCs remade tumors with high efficiency in both FN-MAST139 and FN-MSK74711, especially when compared to differentiated muscle (Fig. 5a-c, p<0.026 ELDA analysis, Extended Data Fig. 8a-b, Supplementary Table 3). Mice engrafted with mesenchymal-like TPCs also exhibited faster time to tumor re-growth and had overall increased numbers of animals with disease (Fig. 5b, Extended Data Fig. 8b). ELDA confirmed TPC enrichment within the mesenchymal-enriched sorted cells when compared with engraftment of counter-selected cells or those enriched for the differentiated muscle cell state (p<0.026 ELDA, Fig. 5c, Supplementary Table 3). Tumors generated from engraftment of mesenchymal-enriched cells also had similar histology and overall numbers of heterogeneous cell populations as bulk engrafted tumors (Fig. 5d-e, Extended Data Fig. 8c-d). By contrast, the few tumors that were generated from CD44−/CD90− or CD90−/CHODL− cells showed significantly lower reconstitution of mesenchymal-enriched cell states and had elevated differentiated cells based on flow analysis and immunohistochemistry for Myosin Heavy Chain (MF20) and Troponin Fast Muscle Protein 3 (TNNT3, Fig. 5d-e, Extended Data Fig. 8c-d). Finally, engrafted tumors arising from mesenchymal-like cells had similar proliferation rates compared with parental tumors, whereas tumors in animals engrafted with mesenchymal-negative cells were significantly less proliferative (Fig. 5e, Extended Data Fig. 8d, p<0.001, two-sided Student’s t-test). These data show that FN-RMS contain a distinct and molecularly defined mesenchymal-pathway enriched TPC that is largely quiescent during normal growth conditions, and yet has the potential to re-enter cell cycle, divide, and create a tumor with the same underlying heterogeneity as parental tumors when grown in culture and xenografted mice.
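The limiting dilution comparisons above rest on estimating a TPC frequency from the fraction of animals that form tumors at each cell dose. The following is a minimal sketch of that idea, assuming the single-hit Poisson model conventionally used in limiting dilution analysis, in which the probability of a tumor at dose d is 1 - exp(-f*d) for TPC frequency f. It is not the ELDA software itself, it does not compute confidence intervals or p-values, and the example counts are hypothetical placeholders rather than data from this study.

```python
import math


def score(f, groups):
    """Derivative of the single-hit log-likelihood with respect to f.

    groups: iterable of (cells_per_mouse, n_mice, n_mice_with_tumor).
    """
    total = 0.0
    for dose, n_mice, n_tumors in groups:
        x = f * dose
        # dose / (e^x - 1); treated as 0 for large x to avoid overflow
        hit = dose / math.expm1(x) if x < 700 else 0.0
        total += n_tumors * hit - (n_mice - n_tumors) * dose
    return total


def estimate_tpc_frequency(groups, lo=1e-12, hi=1.0, iters=200):
    """Maximum-likelihood TPC frequency by bisection on the score function."""
    groups = list(groups)
    if not any(k > 0 for _, _, k in groups) or not any(k < n for _, n, k in groups):
        raise ValueError("need at least one tumor-bearing and one tumor-free mouse")
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisect on a log scale
        if score(mid, groups) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    # Hypothetical counts for illustration only: (dose, mice, mice with tumor)
    example = [(10_000, 3, 3), (1_000, 3, 2), (100, 3, 0)]
    f = estimate_tpc_frequency(example)
    print(f"estimated TPC frequency: 1 in {1 / f:,.0f} cells")
```

The score function decreases monotonically in f, so bisection is sufficient; a dedicated package would additionally report confidence intervals and test for differences between sorted populations.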
RMS shares molecular similarity with embryonic/fetal muscle
Davicioni et al. had previously identified sub-type specific transcriptional gene programs in primary human RMS34, leading us to hypothesize that subtype-specific transcriptional programs may be associated with arrest at specific stages of human muscle development. Gene expression was determined using the aggregate summation of scRNA-seq data from each tumor and then differentially regulated genes were identified by comparing gene expression between FP- and FN-PDXs (>1.5 log2 fold change, p<0.05 two-sided Student’s t-test). This gene list was then compared with the previously defined sub-type specific gene signatures identified by Davicioni et al. to generate a highly specific core-signature gene profile for either FN-RMS or FP-RMS (n=67 and 93 genes respectively, Fig. 6a, Supplementary Table 2). As may be expected, each core gene module was ubiquitously and similarly expressed across all cells and was highly specific to FN- or FP-RMS, which was easily visualized using dot plot expression renderings for the ten most representative genes in each signature (Fig. 6b, Extended Data Fig. 9a). Moreover, the FP-RMS core signature was significantly enriched for PAX3 regulated genes defined by Gryder et al.35 (n=40 of 93 genes, p=3.35x10^-32, Fisher Exact Test), but also contained an even larger fraction of genes that PAX3 is not known to regulate (n=53 genes, Fig. 6c). We next used the LISA algorithm to predict the transcriptional regulators of differentially expressed genes within each core signature. LISA queries a large dataset of well-annotated histone mark ChIP-seq and chromatin accessibility profiles to construct a chromatin model related to the regulation of queried gene lists36. LISA analysis again showed higher enrichment of PAX3 sites in the FP-RMS core genes (Fig. 6c, right panel). By contrast, the non-overlapping and the FN-RMS core signature did not have high enrichment of predicted PAX3 regulated genes.
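The overlap between the FP-RMS core signature and PAX3-regulated genes is assessed above with a Fisher exact test. As a sketch of how such an overlap test works, the one-sided p-value is the upper tail of a hypergeometric distribution. Only the overlap (40) and signature size (93) come from the text; the gene universe size and the number of PAX3-regulated genes used below are hypothetical placeholders, so the printed value is not expected to reproduce the reported p-value.

```python
from math import comb


def overlap_p_value(universe: int, annotated: int, signature: int, overlap: int) -> float:
    """One-sided Fisher exact (hypergeometric upper-tail) p-value.

    Probability of drawing at least `overlap` annotated genes when `signature`
    genes are sampled without replacement from `universe` genes, of which
    `annotated` carry the annotation.
    """
    upper = min(annotated, signature)
    favorable = sum(
        comb(annotated, i) * comb(universe - annotated, signature - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / comb(universe, signature)


if __name__ == "__main__":
    UNIVERSE = 20_000      # hypothetical number of genes tested
    PAX3_TARGETS = 1_000   # hypothetical size of the PAX3-regulated gene set
    p = overlap_p_value(UNIVERSE, PAX3_TARGETS, signature=93, overlap=40)
    print(f"one-sided overlap p-value (placeholder background): {p:.3g}")
```

Exact integer arithmetic via `math.comb` avoids the floating-point underflow that can occur when very small tail probabilities are accumulated term by term.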
To test whether these subtype-specific core signatures were also enriched at specific stages of muscle development, we next mapped our FP- and FN- core RMS signatures to human embryonic, fetal, and adult muscle cell populations identified by scRNA sequencing (Fig. 6d-e)28. The FN-RMS core signature was expressed across muscle cells isolated from embryonic and fetal development, but not adult muscle (Fig. 6e, Extended Data Fig. 9b-c). By contrast, the FP-RMS core signature was enriched at a highly specific developmental stage at which embryonic muscle is transitioning to fetal muscle at 7-7.75 weeks of age. Dot plot expression renderings and TSNE plots showing representative core-signature genes confirmed expression of FN- and FP-core genes in these same stages of normal muscle development (Extended Data Fig. 9b). These findings support a model that both RMS subtypes express gene programs found in embryonic or fetal muscle, and uncovered that FP-RMS express transcriptional programs associated with a tightly controlled developmental stage at which myogenic cells transit from embryonic to fetal muscle development.
RMS share stem cell hierarchies with embryonic/fetal muscle
To investigate whether distinct RMS cell states mimic those found in human embryonic and fetal development, we next assessed the gene expression patterns between RMS and normal human muscle. We first quantified enrichment of our RMS transcriptional gene modules in scRNA sequencing of human muscle progenitors (MP), myocytes/myoblasts (MB/MC), and skeletal muscle mesenchymal stem/progenitor cells (SkM. Mesen) from embryonic, fetal, and adult development (Fig. 7a)28. The RMS proliferation gene module was enriched in human MP cells of 6–7-week-old-embryonic skeletal tissue (p < 0.0001, by GSEA, Fig. 7b), reflecting shared muscle-specific transcriptional programs related to cell-cycle and rapid expansion of these cells during development. By contrast, the differentiated RMS muscle transcriptional gene module was enriched within the MB/MC cells across a range of developmental time points including 6-7, 9, and 12-14 weeks of age (p < 0.0001, Fig. 7a-b).
GSEA analysis also showed that the mesenchymal-like RMS TPCs were highly transcriptionally similar to the recently described bi-potent Skeletal muscle mesenchymal stem/progenitor cells (SkM. Mesen cells)28. Notably, the mesenchymal-enriched TPC signature was preferentially expressed in SkM. Mesen cells at 9 and 12-14 weeks of embryonic development (p < 0.0001, FDR < 0.0001, NES: 2.078, Fig. 7a-b). Like normal SkM. Mesen cells, these mesenchymal-enriched TPCs also uniquely expressed the osteogenic genes OGN (Osteoglycin) and MGP (Matrix Gla protein, Fig. 7c, Extended Data Fig. 10a). We verified high expression of OGN and MGP in FACS isolated mesenchymal-like RMS cells using both quantitative real-time PCR and antibody co-staining following flow analysis (Fig. 7d, Extended Data Fig. 10b-c). These data verify the remarkable similarities of RMS TPCs with the SkM. Mesen cell population found in early human muscle development28. Results were further validated by performing the reciprocal gene expression analysis using gene sets enriched in developmental stages of normal muscle and querying them against our RMS cell states using GSEA (Supplementary Table 4). This analysis again showed that the early MP gene signature from 6-7 week old embryo muscle was enriched specifically in the proliferative RMS cells. By contrast, SkM. Mesenchymal cell signatures from both 9 and 12-14 week embryonic muscle were enriched only in mesenchymal-like RMS cells while myoblast/myocyte signatures from all three developmental time points were enriched only within the differentiated muscle cell states of RMS (NES>1.5, FDR<0.25, p<0.001 by GSEA, Supplementary Table 4). Thus, FN-RMS contain similar cell states as those found in early, human muscle development and contain mesenchymal-enriched RMS TPCs that are transcriptionally similar to the SkM. Mesen cells.
We next functionally assessed the ability of FN-RMS mesenchymal-enriched TPCs to generate osteogenic cell types, which would be predicted if these cells share transcriptional and functional similarity with the bipotent SkM. Mesenchymal population. FACS sorted mesenchymal, differentiated muscle, or counter selected cells were isolated from FN-MAST139, RD and 381T cells and cultured in osteogenic differentiation media for 18 days (Fig. 7e-f). The mesenchymal-like cells from all three models generated significantly more Alizarin Red S+ osteogenic lineage cells, while the counter-selected and differentiated muscle cell fractions failed to efficiently generate osteogenic cells (p<0.01 by two-sided Student’s t-test for all comparisons, Fig. 7e-f). These data support a shared stem cell state and functionality between FN-RMS TPCs and the recently defined bi-potential SkM. Mesen stem cell28.
Discussion
Our work has uncovered remarkable heterogeneity in patient-derived rhabdomyosarcomas using single-cell transcriptomic profiling. Most notably, we discovered a unique mesenchymal-like cell population that expresses high transcript levels for mesenchymal genes and drives FN-RMS tumor growth. This FN-RMS TPC shares molecular, developmental, and functional similarity with the recently described human SkM. Mesen muscle stem/progenitor cell that can generate terminally-differentiated muscle and yet has bipotentiality to produce osteogenic cells28. Indeed, our experiments showed that the mesenchymal-enriched TPCs express osteogenic lineage markers and can generate both muscle and osteogenic cells. Our results contrast with previous reports suggesting FN-RMS TPCs resemble pluripotent ES cells37 or satellite cells12,23,24, likely reflecting lack of a full molecular and transcriptional characterization of these RMS cell types or direct comparison to human muscle development at the single-cell level. Importantly, all FN-RMS studied to date contain mesenchymal-enriched tumor-propagating cells, suggesting that this cell type is found in a majority of FN-RMS and can drive tumor growth following stress.
The mesenchymal-enriched TPCs found in FN-RMS patients and PDXs do not commonly express proliferative genes such as MKI67, CCNB1, CDK1, and E2F1 (Fig. 1), indicating they are likely quiescent under normal growth conditions. This observation was supported by LARRY barcode lineage tracing where a majority of tumor growth was driven by proliferative and ground state cells. Yet, mesenchymal-enriched FN-RMS TPCs can remake tumor and produce all the cell states following growth in low-serum, stress conditions or after xenograft implantation of low numbers of cells into immune deficient mice. These largely quiescent TPCs are predicted to be more resistant to radiation and chemotherapies that kill rapidly dividing cells. Indeed, Patel et al. have identified quiescent RMS “mesodermal” cells that are therapy-resistant and marked by MEOX2 and EGFR, markers that also define the mesenchymal TPC population described here38. Discovery of this FN-RMS TPC is expected to yield new insights into therapeutic targeting of these cells and to enable future assessment of their roles in driving relapse and metastasis. Indeed, our work has recently shown that EGFR-targeted immunotherapies can curb RMS xenograft tumor growth in both zebrafish and mouse models39, raising the intriguing possibility that EGFR immunotherapies can specifically target and kill these FN-RMS TPCs.
We also discovered that pediatric FN- and FP-RMS express highly specific transcriptional gene programs shared with distinct stages of human muscle development. FN-RMS share transcriptional programs with both human fetal and embryonic muscle. By contrast, FP-RMS express a highly specific gene program found only in muscle cells that are transitioning from embryonic to fetal development at 7-7.75 weeks of age. Interestingly, this finding suggests that FP-RMS may develop from early muscle precursors in the first trimester and yet present clinically much later, most typically in the second decade of life11,40. These results are also consistent with our xenograft studies showing longer latency of tumor regrowth in FP-RMS when compared with FN-RMS. Yet, given the extreme plasticity and preponderance of neural like cell states found in a majority of FP-RMS, it is also likely that a subset of FP-RMS could originate from non-muscle cell types and ultimately adopt this transitory muscle cell fate as part of the transformation process. For example, PAX3-FOXO1 and PAX7-FOXO1 elicit potent transformation of chick embryonic neural cells into alveolar rhabdomyosarcoma41 and clinical case reports have suggested a neural cell of origin in a small fraction of human FP-RMS42,43. Akin to our findings in RMS, correlative gene expression studies of normal breast epithelium and breast cancer suggest that each of the five dominant cancer subtypes segregates along the normal differentiation hierarchy and may arise from and arrest in different putative cells-of-origin44. For example, Claudin-low tumors share remarkable similarity with multipotent mammary stem cells, while other breast cancer subtypes likely arise from more differentiated luminal progenitor cells and exhibit plasticity to dedifferentiate towards a basal-like stem cell fate.
These and many other studies raise the interesting possibility that a subset of human tumors arise from and arrest within tissue-restricted stem cell pools while others can adopt these cell fates as part of the transformation process.
In total, our work has uncovered a remarkable conservation of underlying cellular hierarchy between human muscle development and RMS. We have also identified a molecularly-defined and largely quiescent tumor-propagating cell in FN-RMS that shares molecular, developmental, and functional similarity to the newly described bi-potent, muscle mesenchyme stem/progenitor cell28.
Methods
Institutional approvals and sample procurement
Excess, de-identified tumor material was collected from consented patients at MGH in agreement with local institutional ethical regulations and institutional review board approval under human IRB protocol #2007P002464 (single cell RNA sequencing analysis was completed under this protocol for samples 20696, 21202, 29806, and 20082 shown in Fig. 1). Patient-derived xenografts were provided by St. Jude Children’s Research Hospital45 and Memorial-Sloan Kettering Cancer Center. These PDX models were created from tumors of consented patients under IRB approval and shared with MGH under MTA (human IRB protocol #2009P002756, PDXs used under this protocol are denoted by prefix MAST or MSK, see Supplementary Table 1). Mouse studies were approved by the MGH Institutional Animal Care and Use committee under protocol #2013N000038 (experiment #3). As outlined in animal protocol #2013N000038, mice were humanely euthanized by inhalation of CO2 or exsanguination under isoflurane anesthesia for tissue harvest if any tumor ulceration is detected, if the tumor impairs mobility, or when the tumor size reaches no greater than 4,189 mm3 ((4/3) x π x (L/2) x (W/2) x (D/2), with L≤20mm, W≤20mm) for subcutaneous xenografts. Mice on protocol would have been humanely euthanized if they exhibited clinical signs of distress including weight loss greater than 15% of body weight, lack of movement or lethargy/weakness causing inability to eat or drink water, signs of significant pain and/or distress, labored breathing. Additional criteria for euthanasia include lesions covering more than 10% of the skin, hunched posture, distended abdomen, diarrhea, coughing, central nervous system signs such as tremors, spasticity, seizures, or paralysis. In total 8 of the 229 mice followed exceeded this end points of 4,189 mm3 in Fig. 2, Fig. 5 and Extended Data Fig. 8, all of which were within the tumor volume range at the second last time point (See SourceData).
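The volume endpoint stated above follows directly from the ellipsoid formula. The sketch below evaluates (4/3) x π x (L/2) x (W/2) x (D/2) for caliper measurements in millimetres and checks a measurement against the 4,189 mm3 limit. The text gives limits for L and W only; taking D as 20 mm as well is an assumption made here because it reproduces the stated endpoint. The function and constant names are illustrative.

```python
import math

# Endpoint from the protocol text, in cubic millimetres
MAX_TUMOR_VOLUME_MM3 = 4189


def ellipsoid_volume_mm3(length_mm: float, width_mm: float, depth_mm: float) -> float:
    """Tumor volume from three caliper diameters, modeled as an ellipsoid."""
    return (4.0 / 3.0) * math.pi * (length_mm / 2) * (width_mm / 2) * (depth_mm / 2)


def exceeds_endpoint(length_mm: float, width_mm: float, depth_mm: float) -> bool:
    """True when the estimated volume is above the protocol endpoint."""
    return ellipsoid_volume_mm3(length_mm, width_mm, depth_mm) > MAX_TUMOR_VOLUME_MM3


if __name__ == "__main__":
    # Assumes D = 20 mm in addition to the stated L and W limits
    print(f"{ellipsoid_volume_mm3(20, 20, 20):.0f} mm3 at L = W = D = 20 mm")
    print("exceeds endpoint:", exceeds_endpoint(22, 20, 20))
```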
NSG mice were initially engrafted with nine frozen, independent PDXs or RD cell lines (n=3 mice/tumor, n=30 mice total). These engrafted animals were subsequently used for studies outlined in Fig. 4, 7d-e, and Extended Data Fig. 4, 5, 7, 10b-c; a subset was used for bulk single cell RNA sequencing (Fig. 1, Extended Data Fig. 6f) and/or for passaging the tumors into secondary animals. Tumor growth kinetics were followed in 9 PDX models serially engrafted into NSG mice (n=54 NSG mice total, n=3 mice/dilution, two cell dilutions total, Fig. 2b). Engraftment from single cells was completed using 75 NSG mice (n=4 PDX models, each engrafted with a single tumor cell into both hind flanks, Fig. 2d). Limiting dilution cell transplantation experiments used 72 mice (n=2 PDX models, 3 cell dilutions, 3 mice/arm, 4 sorted cell populations, see Fig. 5 and Extended Data Fig. 8). In all experiments, mice were sacrificed before tumors reached 6000mm3 (see Fig. 2, Fig. 5, Extended Data Fig. 8 and corresponding Source Data files for detailed tumor volumes for specific experiments). Mice were housed in the MGH CCM BCL2 mouse facility within the CNY149 facility at a temperature of 70°F (range of 65°-75°F), humidity of 30%-70% RH, and a lighting cycle of 7:00am ON-7:00pm OFF. Mice were anesthetized with 5% isoflurane for 10 minutes and then euthanized by aortic exsanguination.
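The limiting dilution transplants described above were summarized with extreme limiting dilution analysis (ELDA, see Table S3). As a rough illustration of the idea behind such an analysis, the sketch below fits a single-hit Poisson model by maximum likelihood. It is not the ELDA software and omits its confidence intervals and tests. The function names, the search bounds, and the example doses and outcomes are all hypothetical.

```python
import math


def log_likelihood(log_freq, doses, n_tested, n_positive):
    """Single-hit Poisson model: an injection of d cells forms a tumor with
    probability 1 - exp(-f * d), where f is the TPC frequency."""
    f = math.exp(log_freq)
    total = 0.0
    for d, n, k in zip(doses, n_tested, n_positive):
        p_pos = -math.expm1(-f * d)          # 1 - exp(-f*d), stable for small f*d
        if k > 0:
            total += k * math.log(p_pos)
        total += (n - k) * (-f * d)          # log of exp(-f*d) for tumor-free mice
    return total


def estimate_tpc_frequency(doses, n_tested, n_positive,
                           low=-20.0, high=0.0, iterations=200):
    """Maximize the likelihood over log(frequency) by ternary search.
    The log-likelihood is concave in log(frequency), so the search converges.
    If every or no animal is positive the estimate runs to a search bound."""
    for _ in range(iterations):
        m1 = low + (high - low) / 3
        m2 = high - (high - low) / 3
        if log_likelihood(m1, doses, n_tested, n_positive) < \
                log_likelihood(m2, doses, n_tested, n_positive):
            low = m1
        else:
            high = m2
    return math.exp((low + high) / 2)


# Hypothetical design: 3 cell doses, 3 mice per arm, made-up outcomes.
freq = estimate_tpc_frequency([10000, 1000, 100], [3, 3, 3], [3, 2, 0])
print(f"estimated TPC frequency: 1 in {1 / freq:.0f} cells")
```

With a single dose the estimate reduces to the closed form -ln(1 - k/n)/d, which gives a quick sanity check on the search.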
Mouse xenografts
1-5 x 10^6 frozen, viable PDX cells were transplanted subcutaneously along with Matrigel into three 6-8 week-old female NOD.Cg-PrkdcscidIl2rgtm1Wjl/SzJ mice (NSG, 100 microliters). Mice were reared in a BCL2 facility as previously described46. Tumor volume was measured 2-3 times per week using caliper measurement47. Once mice developed tumors of ≤6000mm3, necropsy was performed to harvest the tumor. A portion of each tumor was fixed in 4% PFA and the remaining tissue was used to isolate single cells. Specifically, tumors were macerated in RPMI medium supplemented with dissociation enzymes (Miltenyi Biotec, cat No. 130-095-929) and incubated at 37°C for 20-45 minutes. Cells were then manually aspirated to dissociate clumps and filtered through a 100 µm mesh strainer to remove tissue debris. Cell suspensions were washed once with 1xPBS at 4°C, centrifuged, and then resuspended. Dead cells were removed using a dead cell removal kit (Miltenyi Biotec, cat No. 130-090-101) and mouse cells were removed using the mouse cell depletion kit (Miltenyi Biotec, cat No. 130-104-694). Viable tumor cells were counted and resuspended to a density of 100,000 cells/mL in 1xPBS/0.04%BSA. One or two PDX engrafted tumors were analyzed by scRNA sequencing. A portion of PDX engrafted tumor cells was also transplanted again into NSG mice (1 x 10^5, 1 x 10^4, or single cells in Matrigel). A subset of experiments also used FACS sorted cells for transplantation48. Single cell suspensions were also used for 3D sphere colony assays, q-RT-PCR experiments and/or 10X Genomics single cell sequencing. All PDX tumors were analyzed for mycoplasma contamination (MycoAlert Mycoplasma Detection Kit, Lonza), by FKHR break-apart FISH to confirm fusion status49, and by short-tandem repeat analysis (Human STR profiling cell authentication service, ATCC) to confirm the identity of tumor samples.
Cell culture
Human fusion-negative RD (purchased from ATCC), 381T, and SMS-CTR cell lines and PDX explants were cultured in either DMEM (Gibco) supplemented with 10% FBS and 1% penicillin/streptomycin (regular growth media) or DMEM/F12 supplemented with 2% Horse Serum and 1% penicillin/streptomycin (differentiation media). Cells were dissociated by 0.05% Trypsin/0.05mM EDTA for 5 mins prior to staining, sorting, or loading into 10X Genomics for library construction. All cell line and PDX models were STR-profiled using ATCC short-tandem-repeats services and confirmed to be mycoplasma-free prior to experiments (MycoAlert Mycoplasma Detection Kit, Lonza).
Single-cell RNA sequencing
For PDX tumors, single cell suspensions were created as outlined above and then immediately processed for library preparation using 10X Genomics Chromium Chip A/B Single Cell kit and Single Cell GEM, Library & Gel Bead kit (cat No. 1000092/100075 and 1000073/1000074), according to the manufacturer’s protocol. Library quantification and quality checks were performed using the Agilent High Sensitivity DNA kit (Agilent # 5067-4626) and Bioanalyzer. Paired-end sequencing was performed using the Illumina NextSeq 500 v2.5 High Output Kit (75 cycles), with 28 cycles for read 1 and 55 cycles for read 2, single indexed with 8 cycles, according to the 10X Genomics manufacturer’s recommendations. 45,529 cells were analyzed across the 9 PDX models with an average of 2,780 +/−1,368 genes detected per cell (<0.1% doublet rate).
Primary, snap-frozen patient samples were subjected to single-nucleus RNA-seq50. Specifically, samples were washed in 4°C 1xPBS and macerated in TST (Tween with Salts and Tris) nuclei lysis buffer50. Samples were filtered using a 40 µm mesh strainer to obtain single nuclei. Nuclei were immediately processed using the 10X Genomics kit for library preparation. 29,441 cells were analyzed from the four primary patient samples, with an average of 3,275 +/−1,376 genes being detected per cell (<5.23% doublet rate).
Single-cell and single-nucleus RNA-seq processing
Single cell RNA-seq raw base call (BCL) files from Illumina Basespace were demultiplexed and converted into text-based FASTQ files using the 10X Genomics Cell Ranger pipeline (v3.1.0) mkfastq command (https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/3.1). Reference genome sequence and transcript annotations for sequence alignment and transcript read quantification were prepared. We used the human hg19 reference (STAR genome index) and transcriptome annotation from the 10x Genomics website (General Transfer Format (GTF) v3.0.0) (https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/3.0) to align reads and quantify single-cell gene expression for the human patient-derived xenograft (PDX) samples. Cells having >5% mouse reads were excluded from analysis. For single-nucleus RNA-seq, a custom ‘pre-mRNA’ human hg19 reference was built50. The Cell Ranger pipeline (v3.1.0) was then used to perform sequence alignment, basic read quality filtering, and cell barcode and unique molecular identifier (UMI) counting with the corresponding species reference genomes and transcriptome annotations. Since the PDX samples might contain mouse cells, they were filtered by combining the hg19 and mm10 references from the 10X Genomics website (v3.0.0, https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/3.0). The pipeline output for downstream analysis contains filtered cell barcodes and transcript IDs, read counts per cell and gene, and a web page summarizing data quality, including a basic t-SNE for clusters and differential gene expression visualization. To detect and remove doublets we used the Single-Cell Remover of Doublets (Scrublet, v0.2.1)51. Scrublet was run with the following parameters: expected_doublet_rate 0.06, min_counts 2, min_cells 3, min_gene_variability_pctl 85 and n_prin_comps 30.
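For readers who want to reproduce the doublet-calling step, the Scrublet settings listed above map onto the package's Python interface roughly as follows. This is a sketch rather than the authors' script: the wrapper name call_doublets and the two settings dictionaries are hypothetical, and the input is assumed to be a cells x genes raw count matrix. The import is placed inside the function so the settings can be inspected without Scrublet installed.

```python
# Settings quoted in the text; the dictionary names are illustrative only.
SCRUBLET_INIT = {"expected_doublet_rate": 0.06}
SCRUBLET_RUN = {
    "min_counts": 2,
    "min_cells": 3,
    "min_gene_variability_pctl": 85,
    "n_prin_comps": 30,
}


def call_doublets(counts_matrix):
    """Score doublets for a cells x genes raw count matrix (assumed layout).

    Returns (doublet_scores, predicted_doublets) as produced by
    Scrublet.scrub_doublets.
    """
    import scrublet as scr  # lazy import: only needed when actually run

    scrub = scr.Scrublet(counts_matrix, **SCRUBLET_INIT)
    return scrub.scrub_doublets(**SCRUBLET_RUN)
```

Cells flagged in predicted_doublets would then be dropped before the Seurat processing described next.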
The cells passing quality control were processed with Seurat (v3.2.2)52 in the R environment (v3.6.1). First, we removed cells with a high mitochondrial ratio (>20%), a low number of expressed genes (<1000), a high number of expressed genes (>8000), and potential mouse cells (mouse reads ratio >5%). We normalized read counts using the LogNormalize function with a scaling factor of 10,000, and then selected the top 2,000 variable genes across cells using the vst method. We then regressed out unwanted covariates, including nUMI, nGene, ribosomal and mitochondrial percentage and, for PDX samples, the fraction of mouse reads, and scaled the data to a maximum value of 10 using ScaleData. We then performed dimensionality reduction (PCA) based on the top 1,000 variable genes. The top 20 principal components (PCs) were used for clustering. To this end, we first constructed the Shared Nearest Neighbor (SNN) graph and then performed SNN modularity optimization-based clustering at multiple resolutions ranging from 0.4 to 2. Based on the same 20 PCs, we generated a two-dimensional embedding to visualize cells and cluster labels using the Uniform Manifold Approximation and Projection (UMAP) method30.
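The cell-level filters and the log-normalization described above can be written out in a few lines. The sketch below is a plain-Python illustration, not the Seurat code used in the study: the function and constant names are hypothetical, and the normalization assumes the usual LogNormalize definition, log1p(count / total counts x scale factor).

```python
import math

# Thresholds taken from the text.
MAX_MITO_FRACTION = 0.20
MIN_GENES = 1000
MAX_GENES = 8000
MAX_MOUSE_FRACTION = 0.05
SCALE_FACTOR = 10_000


def passes_qc(n_genes, mito_fraction, mouse_fraction=0.0):
    """Keep a cell only if it clears every filter listed in the text."""
    return (
        mito_fraction <= MAX_MITO_FRACTION
        and MIN_GENES <= n_genes <= MAX_GENES
        and mouse_fraction <= MAX_MOUSE_FRACTION
    )


def log_normalize(counts, scale_factor=SCALE_FACTOR):
    """Per-cell normalization: log1p(count / total * scale_factor)."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale_factor) for c in counts]


# Toy cell with three genes; the values are illustrative only.
print(passes_qc(n_genes=2500, mito_fraction=0.05, mouse_fraction=0.01))
print(log_normalize([0, 1, 9]))
```

Variable-gene selection, covariate regression, PCA, and clustering are left to Seurat and are not reproduced here.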
Cluster assignments and identifying differentially expressed genes in specific cell states
Cluster assignment was assessed at resolutions of 0.4, 0.6, 0.8, 1.0, 1.2, 1.6, 1.8, and 2.0 to ensure accurate cluster assignment and identification of meaningful cell states. A resolution of 0.8 was chosen for differential expression analysis because it maintained the resolution of clusters and avoided over-calling of cluster assignments. A custom function was implemented to calculate the log fold change and the differential proportions of cells expressing a gene, and to perform Fisher’s exact test on the proportion of cells expressing or not expressing a gene (log counts per million (CPM)) between a target cluster (foreground) and the remaining clusters (background). The p-values of the Fisher’s exact test were adjusted by the Benjamini-Hochberg (BH) method to compute the FDR (q-value). After iterating over all clusters, differentially expressed genes were filtered at an FDR of 0.01, together with the requirement that more than 10% of either foreground or background cells expressed the gene. The differentially expressed genes were analyzed using the Molecular Signatures Database v7.4 (https://www.gsea-msigdb.org/gsea/msigdb/annotate.jsp) and ClusterProfiler53 (v3.19)31,54,55.
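The testing scheme described above (Fisher's exact test per gene, Benjamini-Hochberg adjustment, then filtering) can be sketched with the standard library alone. This is an illustration rather than the authors' custom function: all names are hypothetical, the test is assumed to be two-sided, and the FDR cutoff is assumed to be a strict "less than 0.01".

```python
from math import comb


def fisher_exact_two_sided(fg_pos, fg_neg, bg_pos, bg_neg):
    """Two-sided Fisher's exact test on a 2x2 table of cells expressing (pos)
    or not expressing (neg) a gene in foreground vs background clusters."""
    row_fg, row_bg = fg_pos + fg_neg, bg_pos + bg_neg
    col_pos = fg_pos + bg_pos
    denom = comb(row_fg + row_bg, col_pos)

    def prob(x):  # hypergeometric probability of x expressing foreground cells
        return comb(row_fg, x) * comb(row_bg, col_pos - x) / denom

    p_obs = prob(fg_pos)
    lo, hi = max(0, col_pos - row_bg), min(row_fg, col_pos)
    # Sum every table that is no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def keep_gene(q_value, frac_fg, frac_bg, fdr=0.01, min_frac=0.10):
    """Filter from the text: FDR cutoff plus >10% expressing cells in
    either the foreground or the background."""
    return q_value < fdr and (frac_fg > min_frac or frac_bg > min_frac)
```

For real data with thousands of cells per cluster, a vectorized implementation would be faster; the exact integer arithmetic here is chosen for clarity.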
To distinguish between tumor and non-tumor cells in primary tumor samples, we used Cellassign with a set of immune markers29. The final gene sets were identified from conserved clusters across tumors (>0.1 enrichment and >1.5 fold change between foreground and background cells within each tumor and >50% of cells containing the gene of interest). Seurat, ComplexHeatmap56 and SuperExactTest57 were used to visualize the normalized gene expression patterns and gene set intersection size across identified cell states. For primary patient tumor cluster annotations, we used the ClusterProfiler enrichment analysis to first assess the association of each cluster-based differentially expressed gene set with those conserved gene sets from PDX and the most significant PDX gene set (e.g., muscle) was assigned to each cluster.
Single-cell RNA sequencing data from normal muscle, generated using the Drop-seq methodology28, were also analyzed in our work. These data were obtained from single cell sequencing of normal human muscle at 6, 6.5, 7, 7.25, 7.75, 12, 14, 17, and 18 weeks, and at 7, 11, 34, and 42 years of age. Data were converted to a Seurat 3.0 object and processed as described above to generate an integrated UMAP embedding for the visualization of marker expression and stages. For the developmental muscle data at 6-7, 9 and 12-14 weeks, differentially expressed genes, t-SNE embedding coordinates, the normalized gene expression matrix and cell type annotations were downloaded directly from the UCSC Cell Browser (http://cells.ucsc.edu/?ds=skeletal-muscle)28. The differentially expressed genes from different cell types and weeks were then converted to pre-ranked gene lists based on fold change and p-values. GSEA enrichment analysis was performed on these gene lists using GSEApy (https://github.com/zqfang/GSEApy) with our annotated RMS gene sets.
LARRY barcoding and 10X Genomics scRNA sequencing
The LARRY barcoding approach was adapted from recently published work32 and applied to the 10X Genomics single-cell sequencing platform. An extra PCR step was used to amplify the LARRY-GFP-barcode amplicons with the Phusion TAQ polymerase enzyme. The cDNA library and LARRY-GFP-barcodes were amplified using the same index. The original LARRY computational pipeline was adapted for use with the 10X Genomics single-cell sequencing platform (https://github.com/AllonKleinLab/LARRY/). The GFP UTR sequence (CGTTGCTAGGAGAGACCATATG) was used to locate candidate barcode sequences in the R2 paired-end reads; 29bp barcodes were then identified adjacent to the 3' end of the GFP UTR sequence and validated by their motif composition (TG at 4-5bp, CA at 10-11bp, AC at 16-17bp, GA at 22-23bp, G at the last bp). Cell barcodes from the R1 paired-end reads were used to match the barcoded cells to the cells' transcriptomes. Cell states were predicted in silico using Seurat FindClusters with resolution 0.8, and manually assigned through expert review and Enrichr enrichment analysis with MSigDB and internal RMS signatures. Quantitation of shared barcoded cells within the LARRY library was limited to those that contained only two cells. Significance for lineage assignment required greater than or equal to 10 daughter cells having the same barcode originating from a given parent cell state (Fig. 3E,F). Arrow direction and size indicate the percent probability of a parent cell dividing to produce a daughter cell with a specific cell fate.
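The barcode extraction rule described above can be illustrated with a small function. This sketch is not the adapted LARRY pipeline: the function name is hypothetical, the barcode is assumed to start immediately after the GFP UTR sequence on the same strand as the read, and the motif positions are interpreted as 0-based offsets, which is the reading under which four spacer bases precede each fixed motif and the barcode totals 29 bp.

```python
GFP_UTR = "CGTTGCTAGGAGAGACCATATG"
BARCODE_LENGTH = 29

# (offset, fixed bases). Assumption: the positions quoted in the text are
# 0-based offsets into the 29 bp barcode.
MOTIFS = [(4, "TG"), (10, "CA"), (16, "AC"), (22, "GA"), (28, "G")]


def extract_barcode(read):
    """Return the 29 bp barcode following the GFP UTR in an R2 read,
    or None when the UTR is absent or the motif check fails."""
    start = read.find(GFP_UTR)
    if start == -1:
        return None
    begin = start + len(GFP_UTR)
    candidate = read[begin:begin + BARCODE_LENGTH]
    if len(candidate) < BARCODE_LENGTH:
        return None                      # read ends before a full barcode
    for offset, bases in MOTIFS:
        if candidate[offset:offset + len(bases)] != bases:
            return None                  # fixed motif missing at this offset
    return candidate


# Synthetic example read; the variable bases are made up.
example_barcode = "AAAATG" + "CCCCCA" + "GGGGAC" + "TTTTGA" + "AAAAG"
example_read = "ACG" + GFP_UTR + example_barcode + "TTT"
print(extract_barcode(example_read))
```

Validated barcodes would then be joined to transcriptomes through the cell barcode carried on the matching R1 read.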
Histopathology, immunofluorescence, and immunohistochemistry
PDX tumors were fixed in formalin, processed, and embedded in paraffin blocks by the Molecular Pathology Histology Core of the Massachusetts General Hospital. Serial 5 µm sections were prepared for H&E (hematoxylin and eosin), immunofluorescence and immunohistochemistry (IHC) staining. H&E staining was performed at the Molecular Pathology Histopathology Core at MGH, whereas IHC staining for MYOD and DESMIN was performed by the Brigham and Women’s Hospital Pathology Core24,46. Histology images were taken using an Olympus BX41 microscope with a CCD camera and a 20X objective. Immunofluorescence was performed in the lab using the following protocol. Paraffin sections were deparaffinized in xylene and rehydrated using ethanol and water. Antigens were retrieved with citrate antigen retrieval buffer (pH 6.0). Sections were then blocked and incubated with primary and secondary antibodies according to the protocol established by the Isacke Lab58. A Zeiss LSM710 inverted confocal microscope with a 20X objective was used for imaging. ImageJ (v2.0.0) was used for image processing and analysis. Primary antibodies included MX1 (abcam, ab95926), TNNT3 (abcam, ab118886), NDRG1 (CST #9485T), Ki67 (CST #9449S), and MF20 (R&D MAB4470), each applied at a dilution of 1:100 in blocking buffer (PBS with 2% FCS and 1% BSA). Secondary antibodies included Alexa Fluor 488 Goat anti-Rabbit IgG secondary antibody (Fisher Scientific A11034, 1:500 dilution in blocking buffer) or Alexa Fluor 488 Goat anti-Mouse IgG secondary antibody (Fisher Scientific A11001, 1:500 dilution in blocking buffer). IHC stained cells were quantified within the central tumor mass and invasive edge (defined by <20 cells adjacent the invasive edge) by taking images of multiple fields, performing automated cell counting using an ImageJ plug-in, and comparing groups with a two-sided Student’s t-test. Four images were analyzed per condition, ranging from 207-643 cells/field.
Fluorescence activated cell sorting (FACS)
Dissociated tumor cells from PDXs or cultured cells were stained with fluorophore-conjugated primary antibodies. Primary antibodies included PE-CD90 (BioLegend #328109), FITC-CD44 (BioLegend #338803), FITC-CHODL (abcam, ab134924), FITC-TSPAN33 (Fisher Scientific MAB8405), PE-LRRN1 (Creative Biolabs, TAB-522MZ, TB0777#281-6 (N1mAb)), PE-ERBB3 (BioLegend #324705), and FITC-LRRN1, each used at a dilution of 1:200 in sorting buffer (PBS, with 1% FBS and 1% NaN3). The antibody purification kit (ab102784) was used for antibodies supplied in sodium azide/glycerol, which interferes with fluorophore conjugation. The FITC-conjugating kit (ab102884) and PE-conjugating kit (ab102893) were used to conjugate fluorophores to antibodies for which no commercially conjugated antibody was available. DAPI was used to counter-select dead cells.
The BD FACSAria Fusion Cell Sorter was used with a 100 µm nozzle. A purity check was performed after each sort with >1 x 10^3 cells.
Reverse Transcription and Quantitative PCR
FACS sorted cells were used in q-RT-PCR analysis (1 x 10^4 cells/sample). RNA was extracted using the NEBNext Single Cell/Low Input cDNA Synthesis and Amplification Module kit (New England Biolabs # E6421S). Amplified cDNA was then used in each quantitative PCR reaction (PowerUp SYBR Green Master Mix, Fisher Scientific A25742) and run in triplicate for each primer set (Supplementary Table 5). GraphPad was used for qRT-PCR data analysis, and ANOVA followed by a two-sided Student’s t-test was used to compare expression levels between sorted cell populations.
Tumor sphere assays
Tumor sphere assays were performed as previously described48. Briefly, single cell suspensions obtained from FACS were immediately seeded at limiting dilution into ultra-low attachment 6-well plates. DMEM/F12 supplemented with vitamin-free B27, bFGF (20ng/mL), and EGF (10ng/mL) was used as tumorsphere medium. Tumorspheres were imaged and counted at either 10 days for cell lines or 14 days for PDXs using an inverted phase-contrast microscope. Tumorspheres were counted by size as small (25-50 µm), medium (50-100 µm), or large (>100 µm). ImageJ was used to process and analyze the images. ANOVA and two-sided Student’s t-tests in GraphPad were used for statistical analysis.
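The size classes used for counting tumorspheres can be expressed as a simple binning function. The sketch below is illustrative only: the function name is hypothetical, and because the text does not say which class a sphere of exactly 50 µm belongs to, assigning it to "medium" is an assumption, as is leaving spheres under 25 µm uncounted.

```python
from collections import Counter


def classify_tumorsphere(diameter_um):
    """Bin a tumorsphere by diameter in micrometers.

    Assumptions: spheres below 25 um are not counted (None), and a sphere
    of exactly 50 um is placed in the medium class.
    """
    if diameter_um < 25:
        return None
    if diameter_um < 50:
        return "small"
    if diameter_um <= 100:
        return "medium"
    return "large"


# Made-up diameters from one well.
diameters = [18, 30, 47, 62, 99, 140]
counts = Counter(c for c in map(classify_tumorsphere, diameters) if c)
print(dict(counts))
```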
Osteogenic differentiation assay
Osteogenic differentiation was performed according to previously established protocols28,59. In our work, sorted RMS cells were seeded into collagen I coated 24-well plates, and then cultured in RPMI supplemented with 2% FBS, beta-glycerophosphate and vitamin C for 18 days before Alizarin Red S staining. Medium was replaced every other day. Sort purity was >85% and viability >95%. Alizarin Red S was dissolved in HCl, pH4.2 and applied to 4%PFA fixed cells. Images were taken using an Olympus MVX10 Microscope and Olympus DP74 Camera. Data were analyzed in GraphPad using one-way ANOVA followed by a two-sided Student’s t-test.
Statistics and reproducibility
Most experiments in the accompanying manuscript used a sample size of 3. This is the minimum required for running statistical analysis, is common in the field, saves on costs and animals, and does not require statistical analysis a priori to pre-determine sample size. These included in vivo transplant experiments, ex vivo tumorsphere and osteogenic differentiation assays, and qRT-PCR of sorted PDX samples. All work was replicated at least twice (in most instances three times) using biological replicates as noted in the text, with exception of mouse xenograft transplants as is customary in the field. Animals were followed for a minimum of 12 months for tumor onset and no animals were excluded from our studies.
For RMS cell line and PDX explant studies, we used G-power to calculate the sample size based on preliminary pilot studies. Tumorsphere experiments were randomized across different wells of 6-well plates. Tumor sections of PDX samples were analyzed blinded, and up to 4 images were selected for imaging and downstream analysis. Quantification of all immunofluorescence images of PDX samples was performed using ImageJ by a researcher who was blinded to patient sample and experimental information. Data were assumed to be normally distributed. No data were excluded from any studies in the manuscript. In total, 1 to 2 tumors per PDX or single primary patient samples were single-cell RNA-sequenced.
Extended Data
Supplementary Material
Table S1. Patient and PDX tumors.
Table S2. Genes expressed within each cell state and within the "core" signature of FN- and FP-RMS. Genes expressed in common cell states (left). Unique gene modules including neural-like states and unidentified subpopulations (middle). Fusion-Negative (FN), or Fusion-Positive (FP) gene modules shown to the right.
Table S3. Quantification of tumor propagating cell enrichment in sorted FN-RMS cell subpopulations. Extreme limiting dilution analysis (ELDA) was used to determine the TPC frequency (TPC freq), 95% confidence interval (95% CI), and p-value (<0.05 denotes significance).
Table S4. Quantification of myogenic developmental cell signature enrichment in RMS cell subpopulations by Gene Set Enrichment Analysis (GSEA). GSEA compared the top 100 genes expressed in each developmental stage with the proliferation, mesen.like and muscle modules in MSK74711 and MAST139. Normalized Enrichment Scores (NES). False discovery rate (FDR). Green highlighting denotes significant enrichment (GSEA analysis, NES >+1.5, FDR<0.25, and padj<0.001). Not significant (NS).
Table S5. Quantitative real-time PCR primers.
Supplementary Fig. 1. Representative example of sorting strategy used for FACS experiments. MAST139 cells were isolated from a PDX grown in mice and live cells were gated first by size using FSC-H vs. SSC-H. Then single cells were selected based on linear relationship between SSC-H and SSC-A. Viable cells were next selected based on DAPI dye exclusion and gated based on markers of interest, in this case FITC-CD44 and PE-CD90.
Acknowledgments:
This work was supported by NIH grants R01CA154923 (D.M.L.), R01CA215118 (D.M.L.), R01CA211734 (D.M.L.), U54CA231630 (D.M.L.), R00HG008399 (L.P.), R35HG010717 (L.P.), and R01AR064327 (A.D.P.). Additional funding included the Liddy Shriver Sarcoma Initiative (D.M.L.), the MGH Research Scholars Program (D.M.L.), Infinite Love for Kids Fighting Cancer Grant (D.M.L., F.D.C.), the Rally Foundation (D.M.L.), The Truth 365 (D.M.L.), the Summer’s Way/Friends of TJ Young Investigator Award (Y.W.), Tosteson & Fund for Medical Discovery Fellowship from MGH (Y.C.), the Alex’s Lemonade Stand Foundation Young Investigator Award (Y.C.), CIRM Quest DISC2-10696 (A.D.P.), UCLA BSCRC (A.D.P.), the Ayoub Centennial Chair (A.D.P.), Paulie Strong Foundation (F.D.C.), The Grayson Fund (F.D.C.), Willens Family Fund (F.D.C.), and Pediatric Cancer Foundation (F.D.C.). We thank the MGH Department of Pathology Flow and Image Cytometry Research Core, which has been supported by NIH grants 1S10OD012027-01A1, 1S10OD016372-01, 1S10RR020936-01, and 1S10RR023440-01A1. We thank Drs. Michael Dyer and Elizabeth Stewart from the Childhood Solid Tumor Network (CSTN) at St. Jude for a subset of PDX models used in this work. We thank Alison Friedmann and David Ebb from the MGH Pediatric Hematology/Oncology department. We also thank Deb O’Neill and Liz Millet for helpful and stimulating discussions.
Footnotes
Competing Interests Statement: A.J.I. receives royalties from ArcherDx and consults for Paige.AI, Repare Therapeutics, Oncoclinicas Brasil, and Kinnate Biopharma. M.L.S. is an equity holder, scientific co-founder, and advisory board member of Immunitas Therapeutics. L.P. has financial interests in Edilytics and SeQure Dx, Inc. All potential competing interests are reviewed and managed by Massachusetts General Hospital and Mass General Brigham HealthCare in accordance with their conflict-of-interest policies. D.M.L receives sponsored research funds from NextCure for an unrelated project.
Data Availability
ScRNA-seq and snRNA-seq data are available at Gene Expression Omnibus (GEO) under accession #GSE195709. Source data have been provided as Source Data files. All other data supporting the findings of this study are available from the corresponding author on reasonable request.
Code Availability
All the analysis scripts have been deposited at GitHub and can be accessed using the links: https://github.com/qinqian/sc_normal_muscle, and https://github.com/qinqian/rms_analysis.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
References
- 1. Reya T, Morrison SJ, Clarke MF & Weissman IL. Stem cells, cancer, and cancer stem cells. Nature 414, 105–111 (2001).
- 2. Miyamoto T, Weissman IL & Akashi K. AML1/ETO-expressing nonleukemic stem cells in acute myelogenous leukemia with 8;21 chromosomal translocation. Proc Natl Acad Sci U S A 97, 7521–7526, doi: 10.1073/pnas.97.13.7521 (2000).
- 3. Ginestier C et al. ALDH1 is a marker of normal and malignant human mammary stem cells and a predictor of poor clinical outcome. Cell Stem Cell 1, 555–567, doi: 10.1016/j.stem.2007.08.014 (2007).
- 4. Merlos-Suarez A et al. The intestinal stem cell signature identifies colorectal cancer stem cells and predicts disease relapse. Cell Stem Cell 8, 511–524, doi: 10.1016/j.stem.2011.02.020 (2011).
- 5. Quintana E et al. Efficient tumour formation by single human melanoma cells. Nature 456, 593–598 (2008).
- 6. Kenney AM, Cole MD & Rowitch DH. Nmyc upregulation by sonic hedgehog signaling promotes proliferation in developing cerebellar granule neuron precursors. Development 130, 15–28, doi: 10.1242/dev.00182 (2003).
- 7. Gilbertson RJ & Ellison DW. The origins of medulloblastoma subtypes. Annu Rev Pathol 3, 341–365, doi: 10.1146/annurev.pathmechdis.3.121806.151518 (2008).
- 8. Wong DJ et al. Module map of stem cell genes guides creation of epithelial cancer stem cells. Cell Stem Cell 2, 333–344, doi: 10.1016/j.stem.2008.02.009 (2008).
- 9. Parham DM & Barr FG. Classification of rhabdomyosarcoma and its molecular basis. Adv Anat Pathol 20, 387–397, doi: 10.1097/PAP.0b013e3182a92d0d (2013).
- 10. Horn RC Jr. & Enterline HT. Rhabdomyosarcoma: a clinicopathological study and classification of 39 cases. Cancer 11, 181–199 (1958).
- 11. Yohe ME et al. Insights into pediatric rhabdomyosarcoma research: Challenges and goals. Pediatr Blood Cancer 66, e27869, doi: 10.1002/pbc.27869 (2019).
- 12. Langenau DM et al. Effects of RAS on the genesis of embryonal rhabdomyosarcoma. Genes Dev 21, 1382–1395 (2007).
- 13. Chen X et al. Targeting oxidative stress in embryonal rhabdomyosarcoma. Cancer Cell 24, 710–724, doi: 10.1016/j.ccr.2013.11.002 (2013).
- 14. Shern JF et al. Comprehensive genomic analysis of rhabdomyosarcoma reveals a landscape of alterations affecting a common genetic axis in fusion-positive and fusion-negative tumors. Cancer Discov 4, 216–231, doi: 10.1158/2159-8290.CD-13-0639 (2014).
- 15. Linardic CM, Downie DL, Qualman S, Bentley RC & Counter CM. Genetic modeling of human rhabdomyosarcoma. Cancer Res 65, 4490–4495, doi: 10.1158/0008-5472.CAN-04-3194 (2005).
- 16. Hibbitts E et al. Refinement of risk stratification for childhood rhabdomyosarcoma using FOXO1 fusion status in addition to established clinical outcome predictors: A report from the Children's Oncology Group. Cancer Med 8, 6437–6448, doi: 10.1002/cam4.2504 (2019).
- 17. Agaram NP et al. MYOD1-mutant spindle cell and sclerosing rhabdomyosarcoma: an aggressive subtype irrespective of age. A reappraisal for molecular classification and risk stratification. Mod Pathol 32, 27–36, doi: 10.1038/s41379-018-0120-9 (2019).
- 18. Sorensen PH et al. PAX3-FKHR and PAX7-FKHR gene fusions are prognostic indicators in alveolar rhabdomyosarcoma: a report from the children's oncology group. J Clin Oncol 20, 2672–2679, doi: 10.1200/JCO.2002.03.137 (2002).
- 19. Heske CM et al. Survival outcomes of patients with localized FOXO1 fusion-positive rhabdomyosarcoma treated on recent clinical trials: A report from the Soft Tissue Sarcoma Committee of the Children's Oncology Group. Cancer 127, 946–956, doi: 10.1002/cncr.33334 (2021).
- 20. Shern JF et al. Genomic Classification and Clinical Outcome in Rhabdomyosarcoma: A Report From an International Consortium. J Clin Oncol 39, 2859–2871, doi: 10.1200/JCO.20.03060 (2021).
- 21. Sebire NJ & Malone M. Myogenin and MyoD1 expression in paediatric rhabdomyosarcomas. J Clin Pathol 56, 412–416 (2003).
- 22. Tenente IM et al. Myogenic regulatory transcription factors regulate growth in rhabdomyosarcoma. Elife 6, doi: 10.7554/eLife.19214 (2017).
- 23. Rubin BP et al. Evidence for an unanticipated relationship between undifferentiated pleomorphic sarcoma and embryonal rhabdomyosarcoma. Cancer Cell 19, 177–191, doi: 10.1016/j.ccr.2010.12.023 (2011).
- 24. Ignatius MS et al. In vivo imaging of tumor-propagating cells, regional tumor heterogeneity, and dynamic cell movements in embryonal rhabdomyosarcoma. Cancer Cell 21, 680–693, doi: 10.1016/j.ccr.2012.03.043 (2012).
- 25. Hettmer S et al. Sarcomas induced in discrete subsets of prospectively isolated skeletal muscle cells. Proc Natl Acad Sci U S A 108, 20002–20007, doi: 10.1073/pnas.1111733108 (2011).
- 26. Preussner J et al. Oncogenic Amplification of Zygotic Dux Factors in Regenerating p53-Deficient Muscle Stem Cells Defines a Molecular Cancer Subtype. Cell Stem Cell 23, 794–805 e794, doi: 10.1016/j.stem.2018.10.011 (2018).
- 27. Drummond CJ et al. Hedgehog Pathway Drives Fusion-Negative Rhabdomyosarcoma Initiated From Non-myogenic Endothelial Progenitors. Cancer Cell 33, 108–124 e105, doi: 10.1016/j.ccell.2017.12.001 (2018).
- 28. Xi H et al. A Human Skeletal Muscle Atlas Identifies the Trajectories of Stem and Progenitor Cells across Development and from Human Pluripotent Stem Cells. Cell Stem Cell 27, 181–185, doi: 10.1016/j.stem.2020.06.006 (2020).
- 29. Zhang AW et al. Probabilistic cell-type assignment of single-cell RNA-seq for tumor microenvironment profiling. Nat Methods 16, 1007–1015, doi: 10.1038/s41592-019-0529-1 (2019).
- 30. Becht E et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat Biotechnol, doi: 10.1038/nbt.4314 (2018).
- 31. Subramanian A et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102, 15545–15550 (2005).
- 32. Weinreb C, Rodriguez-Fraticelli A, Camargo FD & Klein AM. Lineage tracing on transcriptional landscapes links state to fate during differentiation. Science 367, doi: 10.1126/science.aaw3381 (2020).
- 33. Chen EY et al. Glycogen synthase kinase 3 inhibitors induce the canonical WNT/β-catenin pathway to suppress growth and self-renewal in embryonal rhabdomyosarcoma. Proc Natl Acad Sci U S A 111, 5349–5354, doi: 10.1073/pnas.1317731111 (2014).
- 34. Davicioni E et al. Molecular classification of rhabdomyosarcoma--genotypic and phenotypic determinants of diagnosis: a report from the Children's Oncology Group. Am J Pathol 174, 550–564 (2009).
- 35. Gryder BE et al. PAX3-FOXO1 Establishes Myogenic Super Enhancers and Confers BET Bromodomain Vulnerability. Cancer Discov 7, 884–899, doi: 10.1158/2159-8290.CD-16-1297 (2017).
- 36. Qin Q et al. Lisa: inferring transcriptional regulators through integrative modeling of public chromatin accessibility and ChIP-seq data. Genome Biol 21, 32, doi: 10.1186/s13059-020-1934-6 (2020).
- 37. Walter D et al. CD133 positive embryonal rhabdomyosarcoma stem-like cell population is enriched in rhabdospheres. PLoS One 6, e19506, doi: 10.1371/journal.pone.0019506 (2011).
- 38. Patel AG et al. The Myogenesis Program Drives Clonal Selection and Drug Resistance in Rhabdomyosarcoma. bioRxiv, 2021.2006.2016.448386, doi: 10.1101/2021.06.16.448386 (2021).
- 39. Yan C et al. Single-cell imaging of T cell immunotherapy responses in vivo. J Exp Med 218, doi: 10.1084/jem.20210314 (2021).
- 40. Fletcher CDM et al. WHO classification of tumours of soft tissue and bone. (2013).
- 41. Gonzalez Curto G et al. The PAX-FOXO1s trigger fast trans-differentiation of chick embryonic neural cells into alveolar rhabdomyosarcoma with tissue invasive properties limited by S phase entry inhibition. PLoS Genet 16, e1009164, doi: 10.1371/journal.pgen.1009164 (2020).
- 42. Khalatbari MR, Jalaeikhoo H, Hamidi M & Moharamzad Y. Primary spinal epidural rhabdomyosarcoma: a case report and review of the literature. Childs Nerv Syst 28, 1977–1980, doi: 10.1007/s00381-012-1822-9 (2012).
- 43. Chikhalkar S et al. Alveolar rhabdomyosarcoma arising in a giant congenital melanocytic nevus in an adult--case report with review of literature. Int J Dermatol 52, 1372–1375, doi: 10.1111/j.1365-4632.2011.05448.x (2013).
- 44. Fu NY, Nolan E, Lindeman GJ & Visvader JE. Stem Cells and the Differentiation Hierarchy in Mammary Gland Development. Physiol Rev 100, 489–523, doi: 10.1152/physrev.00040.2018 (2020).
- 45. Stewart E et al. Orthotopic patient-derived xenografts of paediatric solid tumours. Nature 549, 96–100, doi: 10.1038/nature23647 (2017).
- 46. Hayes MN et al. Vangl2/RhoA Signaling Pathway Regulates Stem Cell Self-Renewal Programs and Growth in Rhabdomyosarcoma. Cell Stem Cell 22, 414–427.e416, doi: 10.1016/j.stem.2018.02.002 (2018).
- 47.Tomayko MM & Reynolds CP Determination of subcutaneous tumor size in athymic (nude) mice. Cancer Chemother Pharmacol 24, 148–154, doi: 10.1007/BF00300234 (1989). [DOI] [PubMed] [Google Scholar]
- 48.Skoda J et al. Serial Xenotransplantation in NSG Mice Promotes a Hybrid Epithelial/Mesenchymal Gene Expression Signature and Stemness in Rhabdomyosarcoma Cells. Cancers (Basel) 12, doi: 10.3390/cancers12010196 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Mehra S et al. Detection of FOXO1 (FKHR) gene break-apart by fluorescence in situ hybridization in formalin-fixed, paraffin-embedded alveolar rhabdomyosarcomas and its clinicopathologic correlation. Diagn Mol Pathol 17, 14–20, doi: 10.1097/PDM.0b013e3181255e62 (2008). [DOI] [PubMed] [Google Scholar]
- 50.Slyper M et al. A single-cell and single-nucleus RNA-Seq toolbox for fresh and frozen human tumors. Nat Med 26, 792–802, doi: 10.1038/s41591-020-0844-1 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Wolock SL, Lopez R & Klein AM Scrublet: Computational Identification of Cell Doublets in Single-Cell Transcriptomic Data. Cell Syst 8, 281–291.e289, doi: 10.1016/j.cels.2018.11.005 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Butler A, Hoffman P, Smibert P, Papalexi E & Satija R Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat Biotechnol 36, 411–420, doi: 10.1038/nbt.4096 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Yu G, Wang LG, Han Y & He QY clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287, doi: 10.1089/omi.2011.0118 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 54.Liberzon A et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 27, 1739–1740, doi: 10.1093/bioinformatics/btr260 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Liberzon A et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst 1, 417–425, doi: 10.1016/j.cels.2015.12.004 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Gu Z, Eils R & Schlesner M Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–2849, doi: 10.1093/bioinformatics/btw313 (2016). [DOI] [PubMed] [Google Scholar]
- 57.Wang M, Zhao Y & Zhang B Efficient Test and Visualization of Multi-Set Intersections. Sci Rep 5, 16923, doi: 10.1038/srep16923 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Robertson D, Savage K, Reis-Filho JS & Isacke CM Multiple immunofluorescence labelling of formalin-fixed paraffin-embedded (FFPE) tissue. BMC Cell Biol 9, 13, doi: 10.1186/1471-2121-9-13 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Slemmons KK et al. A method to culture human alveolar rhabdomyosarcoma cell lines as rhabdospheres demonstrates an enrichment in stemness and Notch signaling. Biol Open 10, doi: 10.1242/bio.050211 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Hu Y & Smyth GK ELDA: extreme limiting dilution analysis for comparing depleted and enriched populations in stem cell and other assays. J Immunol Methods 347, 70–78, doi: 10.1016/j.jim.2009.06.008 (2009). [DOI] [PubMed] [Google Scholar]
Associated Data
Supplementary Materials
Table S1. Patient and PDX tumors.
Table S2. Genes expressed within each cell state and within the "core" signature of FN- and FP-RMS. Genes expressed in common cell states are shown on the left; unique gene modules, including neural-like states and unidentified subpopulations, in the middle; and fusion-negative (FN) or fusion-positive (FP) gene modules on the right.
Table S3. Quantification of tumor-propagating cell (TPC) enrichment in sorted FN-RMS cell subpopulations. Extreme limiting dilution analysis (ELDA) was used to determine the TPC frequency (TPC freq), 95% confidence interval (95% CI), and p-value (p < 0.05 denotes significance).
Table S4. Quantification of myogenic developmental cell signature enrichment in RMS cell subpopulations by Gene Set Enrichment Analysis (GSEA). GSEA was performed on the top 100 genes expressed in each developmental stage, which were compared to the proliferation, mesen.like, and muscle modules in MSK74711 and MAST139. Green highlighting denotes significant enrichment (NES > +1.5, FDR < 0.25, and padj < 0.001). NES, normalized enrichment score; FDR, false discovery rate; NS, not significant.
Table S5. Quantitative real-time PCR primers.
Supplementary Fig. 1. Representative example of the sorting strategy used for FACS experiments. MAST139 cells were isolated from a PDX grown in mice, and live cells were first gated by size using FSC-H vs. SSC-H. Single cells were then selected based on the linear relationship between SSC-H and SSC-A. Viable cells were next selected by DAPI dye exclusion and gated on the markers of interest, in this case FITC-CD44 and PE-CD90.
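The TPC frequencies and confidence intervals reported in Table S3 were obtained by ELDA. As a rough illustration of the single-hit Poisson model that limiting dilution analysis rests on, the following Python sketch estimates a TPC frequency by maximum likelihood and attaches a Wald interval on the log scale. It is not the ELDA software and not the authors' code: the function names, the search bounds, and the example counts are all hypothetical, and the sketch assumes that at least one dose group contains both positive and negative outcomes and that every dose is at least one cell.

```python
import math

def tpc_loglik(freq, doses, tested, positive):
    """Log-likelihood of the single-hit Poisson model at a given TPC frequency."""
    total = 0.0
    for d, n, r in zip(doses, tested, positive):
        p_neg = math.exp(-freq * d)           # P(no TPC among d injected cells)
        if r > 0:
            total += r * math.log1p(-p_neg)   # animals that formed tumors
        total += (n - r) * (-freq * d)        # animals that stayed tumor-free
    return total

def estimate_tpc_frequency(doses, tested, positive, lo=1e-9, hi=1.0, iters=200):
    """Maximise the log-likelihood over log(frequency) by golden-section search.

    lo and hi are assumed search bounds (1 TPC per 1e9 cells up to every cell).
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(lo), math.log(hi)
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if tpc_loglik(math.exp(c), doses, tested, positive) > \
           tpc_loglik(math.exp(d), doses, tested, positive):
            b = d   # maximum lies in [a, d]
        else:
            a = c   # maximum lies in [c, b]
    return math.exp((a + b) / 2)

def wald_ci(freq, doses, tested, z=1.96):
    """Approximate 95% interval for the frequency, computed on the log scale."""
    info = sum(
        n * d * d * math.exp(-freq * d) / -math.expm1(-freq * d)
        for d, n in zip(doses, tested)
    )                                          # expected Fisher information in freq
    se_log = 1.0 / (freq * math.sqrt(info))    # delta method: SE of log(freq)
    return freq * math.exp(-z * se_log), freq * math.exp(z * se_log)

# Illustrative counts only; these are NOT data from this study.
example_doses = [10, 100, 1000]     # cells injected per animal
example_tested = [6, 6, 6]          # animals per dose
example_positive = [1, 3, 6]        # animals with tumor engraftment

f_hat = estimate_tpc_frequency(example_doses, example_tested, example_positive)
ci_low, ci_high = wald_ci(f_hat, example_doses, example_tested)
print(f"Estimated TPC frequency: 1 in {1 / f_hat:.0f} cells")
print(f"Approximate 95% CI: 1 in {1 / ci_high:.0f} to 1 in {1 / ci_low:.0f} cells")
```

Comparing two sorted subpopulations, as in Table S3, would additionally require a test for a difference in frequency between groups, which this sketch does not attempt.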
Data Availability Statement
The scRNA-seq and snRNA-seq data are available at the Gene Expression Omnibus (GEO) under accession number GSE195709. Source data have been provided as Source Data files. All other data supporting the findings of this study are available from the corresponding author on reasonable request.
All analysis scripts have been deposited on GitHub at https://github.com/qinqian/sc_normal_muscle and https://github.com/qinqian/rms_analysis.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.