Abstract
Single-cell multi-omics can provide a unique perspective on tumor cellular heterogeneity. Most previous single-cell whole-genome RNA sequencing (scWGS-RNA-seq) methods demonstrate utility with intact cells from fresh samples. Among them, many are not applicable to frozen samples that cannot produce intact single-cell suspensions. We have developed scONE-seq, a versatile scWGS-RNA-seq method that amplifies single-cell DNA and RNA without separating them from each other and hence is compatible with frozen biobanked samples. We benchmarked scONE-seq against existing methods using fresh and frozen samples to demonstrate its performance in various aspects. We identified a unique transcriptionally normal-like tumor clone by analyzing a 2-year frozen astrocytoma sample, demonstrating that performing single-cell multi-omics interrogation on biobanked tissue by scONE-seq could enable previously unidentified discoveries in tumor biology.
scONE-seq generates paired single-cell DNA and RNA data and uncovers a normal-like tumor clone by integrative analysis.
INTRODUCTION
Single-cell genomics has become a mainstay technology used to dissect multicellular organisms and tissues that are composed of cells with diverse functions (1–3). The power of this approach has been demonstrated in several cell atlas studies: Novel cell types have been discovered that further led to the elucidation of new mechanisms; complex cellular interactions and transitions associated with disease initiation or progression have been revealed, and cross-species analyses have shed light on evolutionary processes (4–7). The use of single-cell technology in studying cancer is especially important. Regulatory mechanisms underlying drug resistance or immune evasion are elusive and complex, and tumor cell heterogeneity is a major contributing factor to this complexity, making it particularly challenging to dissect these mechanisms with bulk techniques (8–10). Single-cell technologies have greatly enhanced our understanding of tumor heterogeneity and accelerated mechanistic discovery. At the phenotype level, single-cell RNA sequencing (scRNA-seq) has been used to uncover drug-resistant melanoma subpopulations and to characterize cancer stem cell subpopulations in glioblastoma (11, 12). It has also enabled a more comprehensive phenotypic understanding of the tumor microenvironment (TME) in many cancers including glioma and colorectal cancer (13, 14). At the genotype level, genomic instability contributes to cancer initiation, progression, relapse, and metastasis. With single-cell whole-genome sequencing (scWGS), the clonal structure of the tumor can be resolved, and evolutionary analysis based on copy number variations (CNVs) can reveal tumor progression (15, 16).
Both the genomic and transcriptomic heterogeneity of tumors contribute to disease pathology, and hence, it is crucial to understand both. Meanwhile, frozen tumors in biobanks represent most of the readily available clinical samples for cancer studies. Several single-cell methods that interrogate DNA and RNA simultaneously in the same cell (scWGS-RNA-seq) have been developed (17–22), but their applicability to frozen biobanked tissues is limited. Among these methods, many require physical separation of the nucleus from the cytosol (19, 21); methods in this category such as Trio-seq and DNTR-seq are incompatible with nuclei from frozen tissue. In principle, methods such as G&T-seq, DR-seq, and sci-L3 DNA/RNA can be applied to single nuclei derived from frozen samples. G&T-seq uses poly-thymine beads to capture mRNA from a single cell, followed by library preparation of the mRNA and DNA separately (17). DR-seq uses reverse transcription (RT) to barcode RNA in a single cell and then quasilinear amplification to preamplify both genomic DNA and RNA. Reactions are separated into two halves to generate DNA and mRNA libraries, respectively (18). The most recently published sci-L3 DNA/RNA co-assay uses SDS to remove the nucleosome followed by a split-pooling strategy for DNA and RNA amplification, demonstrating the potential for high-throughput capability (22). However, these methods inevitably result in more sample loss and reduced data quality: G&T-seq requires physical separation of DNA and RNA where nucleic acids are lost in the process; DR-seq suffers from biases introduced by quasilinear amplification, and the SDS treatment and extensive multiple washing that are used in sci-L3 result in lower sensitivity. Furthermore, these methods are labor-intensive, technically demanding, and time-consuming. These limitations, therefore, prompted us to develop a scWGS-RNA-seq method that is versatile, easy to use, and compatible with single nuclei from frozen tissues.
scONE-seq amplifies the transcriptome and genome of a single cell or nucleus simultaneously in a one-pot reaction. During the coamplification, specifically designed DNA and RNA barcodes with unique molecular identifiers (UMIs) label each nucleic acid species, which allows transcriptomic and genomic information to be distinguished after sequencing (23–25). Benefiting from this design, scONE-seq has several advantages: It has a simplified library construction workflow; it is compatible with standard single-cell isolation methods such as fluorescence-activated cell sorting (FACS); being a one-pot reaction, its throughput can be easily scaled up using liquid-handling robots while it is still readily accessible for manual operation; the simplified workflow eliminates transferring steps that cause material loss, thereby ensuring good library quality; and because DNA and RNA can be coamplified, scONE-seq is applicable to nuclei from frozen samples or tissue types such as liver, bone, and brain, where whole cells are difficult to obtain. These advantages lay the foundation for a versatile method.
We first benchmarked the technical performance of scONE-seq and showed that it is comparable to existing methods using various sample types, including cell lines and lymphocytes from the peripheral blood mononuclear cells (PBMCs) of a healthy donor. We then showed that scONE-seq allows the discovery of novel disease-related phenotypes in a frozen IDH1-mutant astrocytoma sample. Using scONE-seq, we integrated both clonal and transcriptomic information and identified a unique normal-like tumor clone in this sample. Further analysis suggests that this subpopulation exhibits molecular phenotypes related to tumor-neuron synapse formation and immune repression regulation. This discovery demonstrates the power of scONE-seq to distinguish unique cell subpopulations from true normal cells by harnessing both genome and transcriptome information. Furthermore, we established a framework for integrative single-cell multi-omics WGS/RNA-seq analysis on biobanked samples to unveil layers of complexity in tumor cells, enabled by our multifaceted scONE-seq approach.
RESULTS
A molecular barcoding strategy enables accurate and sensitive co-profiling of DNA and RNA from a single cell in a one-tube reaction
To achieve single-cell genome and transcriptome co-profiling, we devised a workflow to amplify RNA and DNA simultaneously (Fig. 1A and Supplementary Text). Briefly, single cells or nuclei are sorted into polymerase chain reaction (PCR) plates containing lysis buffer using FACS. Plates containing sorted single cells can then be processed immediately or stored at −80°C for months before further processing. We then use Tn5 with a custom adaptor to fragment and label the genome or any DNA within the cells (26). In this step, the amplification adaptor, which includes a 6-nucleotide “DNA barcode” and an 8-nucleotide UMI, is added to the fragmented DNA (fDNA). RNAs are then reverse-transcribed to cDNA using RT primers that are composed of a priming sequence adapted from that of the MATQ-seq protocol (27), a 6-nucleotide “RNA barcode,” and an 8-nucleotide UMI. The RT primer binds to the internal regions of RNA transcripts, thereby capturing both polyadenylated (polyA) and non-polyA RNAs to detect full-length cDNA. A 3′ adaptor is then added through subsequent poly-cytosine (polyC) tailing and nondenaturation PCR (27, 28). Once DNA-specific and RNA-specific barcodes have been added, fDNA and cDNA can be amplified simultaneously (fig. S1, A and B). The products are subsequently used to construct a sequencing library (fig. S1C).
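To illustrate how modality-specific barcodes allow DNA- and RNA-derived reads to be separated after sequencing, the following minimal Python sketch classifies reads by a 6-nucleotide barcode followed by an 8-nucleotide UMI. The barcode sequences, the assumption that the barcode and UMI sit at the very start of the read, and all function names are hypothetical and are not taken from the scONE-seq protocol. Collapsing duplicates by identical UMI and insert sequence is a simplification; a real pipeline would typically deduplicate on UMI plus mapping position.

```python
from collections import defaultdict

# Hypothetical 6-nt barcodes; the actual scONE-seq sequences are not given in the text.
DNA_BARCODE = "ACGTAC"
RNA_BARCODE = "TGCATG"
BARCODE_LEN = 6
UMI_LEN = 8


def classify_read(seq):
    """Return (modality, umi, insert) for one read, or None if it cannot be assigned.

    Assumes the layout [6-nt barcode][8-nt UMI][insert] from the read start.
    """
    barcode = seq[:BARCODE_LEN]
    umi = seq[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    insert = seq[BARCODE_LEN + UMI_LEN:]
    if len(umi) < UMI_LEN:
        return None  # read too short to carry a full UMI
    if barcode == DNA_BARCODE:
        return ("DNA", umi, insert)
    if barcode == RNA_BARCODE:
        return ("RNA", umi, insert)
    return None  # unrecognized barcode


def split_reads(reads):
    """Count unique molecules per modality by collapsing identical (UMI, insert) pairs."""
    unique = defaultdict(set)
    for seq in reads:
        hit = classify_read(seq)
        if hit is not None:
            modality, umi, insert = hit
            unique[modality].add((umi, insert))
    return {modality: len(pairs) for modality, pairs in unique.items()}
```

In this sketch, reads with an unrecognized barcode are discarded rather than rescued by fuzzy matching, which keeps the DNA/RNA assignment conservative.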
Fig. 1. Overview of scONE-seq library preparation and benchmarking results.
(A) The molecular mechanism of the scONE-seq workflow. (B) Box plot shows gene detection numbers in the scONE-seq whole-cell dataset, scONE-seq nucleus dataset, and SS2 dataset (HCT116, n = 90, 93, and 94, respectively). All samples were downsampled to 40,000 mapped reads to match the nuclei dataset (P < 2 × 10⁻¹⁶, t test between scONE-seq cells and SS2). (C) Gene body coverage for scONE-seq cells, nuclei, and Smart-seq2 (n = 90, 93, and 94, respectively). Error areas indicate ±SD between cells. (D) Accuracy across mock samples (150,000 mapped reads). Pearson correlations were calculated from log-transformed TPM (transcripts per million). (E) Lorenz curve of bulk and scONE-seq data (cells, 88; nuclei, 83). Percentiles of the genome covered are plotted against the cumulative fraction of reads. Perfect coverage uniformity results in a straight line with a slope of 1. Error areas indicate ±SD between cells. (F) Dot plots with normalized counts across the genome superimposed with solid line plots to visualize integer copy numbers. Amplification regions are in red; deletion regions are in light blue. Data from bulk HCT116 WGS (top; bin size = 25 kb and depth = 30×), HCT116 scONE-seq pseudo-bulk data (middle; bin size = 500 kb and n = 88), and representative single-cell HCT116 scONE-seq data (bottom; bin size = 500 kb, n = 1, and depth = 0.056×) are shown. (G) The bar plot shows the fraction of mapped regions from different assays (n = 93, 90, 1, 94, 1, and 1, respectively). scONE-seq RNA control refers to RNA-only assays. scONE-seq DNA refers to DNA-only assays.
We first benchmarked the transcriptome profiles from scONE-seq against those from Smart-seq2 (SS2) using a variety of test samples: extracted RNA-free Escherichia coli genomes (mock DNA), extracted DNA-free human total RNA (mock RNA), the mixture of the two (i.e., E. coli DNA mixed with human total RNA), and HCT116 single cells/nuclei (29). scONE-seq detected more genes per cell than SS2, which indicates its high sensitivity (Fig. 1B and fig. S2, A and B). This is likely because scONE-seq captures total RNA from both the nucleus and the cytosol by completely lysing both membranes, whereas SS2 only detects cytosolic RNA with sufficiently long polyA tails. Our platform therefore captures a more diverse set of molecules at any given sequencing depth (fig. S2B) (27, 30). We also showed that genes detected by scONE-seq largely overlapped with those detected by SS2 (fig. S2C). scONE-seq profiles full-length transcripts with high gene body coverage, albeit with a slight 3′ bias compared to SS2 (Fig. 1C). Beyond that, we compared scONE-seq with other single-cell methods, including SS2, G&T-seq, DR-seq, DNTR-seq, and sci-L3 DNA/RNA co-assay; scONE-seq performed well in mitochondrial and ribosomal read proportion, gene detection number, and cell-cell correlation analysis (fig. S2D). In addition to the evaluation of quality control (QC) metrics, the sample-to-sample correlation analysis with mock samples showed that scONE-seq has comparable accuracy to SS2 (Fig. 1D). ERCC spike-ins were used to further estimate the accuracy and sensitivity of scONE-seq in comparison to other methods (fig. S2E). From this analysis, we concluded that scONE-seq has comparable performance to other benchmarked methods (fig. S2E).
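The sample-to-sample accuracy comparison rests on Pearson correlations of log-transformed TPM values. A minimal sketch of that calculation is given below. The use of log10, the pseudocount of 1, the length normalization in kilobases, and the function names are assumptions made for illustration; the text only states that TPM values were log-transformed.

```python
import math


def tpm(counts, lengths_kb):
    """Convert raw read counts to transcripts per million, given gene lengths in kb."""
    rates = [c / l for c, l in zip(counts, lengths_kb)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def log_tpm_correlation(counts_a, counts_b, lengths_kb, pseudocount=1.0):
    """Pearson correlation between two samples on log10(TPM + pseudocount)."""
    log_a = [math.log10(v + pseudocount) for v in tpm(counts_a, lengths_kb)]
    log_b = [math.log10(v + pseudocount) for v in tpm(counts_b, lengths_kb)]
    return pearson(log_a, log_b)
```

Because TPM normalizes each sample to a fixed total, two samples that differ only in sequencing depth give a correlation of 1 under this sketch, which is why it is a reasonable way to compare libraries of different sizes.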
Next, we compared scONE-seq RNA data generated from HCT116 nuclei with that of live whole cells. Nuclei data generally detect slightly fewer genes than whole-cell data and have higher gene dropout rates (fig. S2D). Nonetheless, benefiting from the nonseparation strategy, scONE-seq nuclei data still exhibit better sensitivity when compared with some scWGS-RNA-seq methods. We found a high correlation between gene expression from whole cells and nuclei (R = 0.8976), with a small group of genes (n = 405) having higher expression in whole cells than in nuclei (fig. S2F). Genes that are enriched in the whole-cell dataset are highly expressed housekeeping genes including actin genes, adenosine triphosphate synthase genes, translation initiation factor genes, and ribosome subunit protein genes (RPL genes), which is in concordance with previous findings (31). Notably, Gene Ontology enrichment analysis suggests that these genes are associated with ribosomal function (fig. S2G).
Most current single-cell DNA sequencing (DNA-seq) is performed at shallow depth (less than 0.5×), which is sufficient for CNV estimation (15, 16). Therefore, we focused on evaluating CNV calling by scONE-seq. To benchmark the data quality, we assessed the proportion of mitochondrial reads and PCR duplication rates of scONE-seq and other methods including DOP-PCR, ACT, DR-seq, G&T-seq, DNTR-seq, and sci-L3 DNA assay (fig. S3A). We then validated the WGS capability of scONE-seq by assessing the Lorenz curves and coefficient of variation (CV; Fig. 1E and fig. S3, A to C), which show that our platform has better coverage uniformity compared to other single-cell multi-omics methods that are also applicable to nuclei. The breadth of coverage for a cell with 0.5 million UMIs (read length = 100 base pairs) is around 1.12%, and pooling 134 cells achieves ~70% coverage, which is similar to bulk data with the same depth (fig. S3D). Although scONE-seq has a slightly lower coverage uniformity compared with single-cell DNA-seq methods, CNVs detected with scONE-seq DNA data are consistent with those defined by bulk and pseudo-bulk (88 cells; Fig. 1F and fig. S3E). This demonstrates that the current coverage uniformity of scONE-seq is sufficient for correct CNV calling. We further assessed the CNV calling accuracy of scONE-seq by treating CNVs calculated from bulk WGS as a reference, as HCT116 is a relatively homogeneous cell line (fig. S3F). We conclude that 0.5 million UMIs (corresponding to 0.67 million reads) are needed for precise CNV detection in whole cells. With more than 1 million UMIs, CNVs called by scONE-seq can achieve 95% precision and 95% recall in whole cells. In the nuclei dataset, scONE-seq generated a comparable recall rate to the whole-cell data, albeit with slightly reduced precision.
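The two coverage-uniformity summaries used here, the Lorenz curve and the CV, can both be computed from per-bin read counts. The sketch below is a minimal illustration that assumes fixed-size genomic bins and uses hypothetical function names. The Gini-style area summary is an added convenience that is not reported in the text.

```python
import math


def lorenz_curve(bin_counts):
    """Return (genome_fraction, read_fraction) points from per-bin read counts.

    Bins are sorted from least to most covered, so a perfectly uniform
    library traces the diagonal (a straight line with slope 1).
    """
    ordered = sorted(bin_counts)
    total = sum(ordered)
    n = len(ordered)
    xs, ys = [0.0], [0.0]
    running = 0
    for i, count in enumerate(ordered, start=1):
        running += count
        xs.append(i / n)
        ys.append(running / total)
    return xs, ys


def gini(bin_counts):
    """Gini-style summary: 0 for uniform coverage, approaching 1 when reads pile up."""
    xs, ys = lorenz_curve(bin_counts)
    # Trapezoidal area under the Lorenz curve.
    area = sum((xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2 for i in range(1, len(xs)))
    return 1 - 2 * area


def coefficient_of_variation(bin_counts):
    """Population standard deviation of bin counts divided by their mean."""
    n = len(bin_counts)
    mean = sum(bin_counts) / n
    variance = sum((c - mean) ** 2 for c in bin_counts) / n
    return math.sqrt(variance) / mean
```

A curve that bows further below the diagonal, or a larger CV, indicates less uniform amplification across the genome.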
Furthermore, to evaluate the sensitivity of CNV detection in scONE-seq, we surveyed whether small-scale CNVs (500 to 5000 kb) could be detected in a single cell (fig. S4, A and B). scONE-seq is able to detect 800-kb CNVs with high detection rates using a bin size of 100 kb, and it robustly detects 4000-kb CNVs with different bin sizes. It is common practice to study CNVs in pseudo-bulk data to achieve higher resolution (15, 32); consistent with this, our minimum CNV detection analysis shows that sensitivity improves as more cells are pooled into the pseudo-bulk (fig. S4C). The caveat is that chromosomal location can also affect the detection of CNVs (fig. S4, A and C).
The cross-contamination of barcodes between DNA and RNA was evaluated. We found no RNA-to-DNA contamination and minor DNA-to-RNA contamination (~8%), which can be detected using a cross-species experiment (Supplementary Text). We investigated the mapping region of the single-cell RNA data from DNA and RNA co-profiling and those from only RNA profiling. The mapping difference between exon and intron depends on the relative proportions of DNA and RNA sequenced. Because the rate of mislabeling remains constant, a suboptimal Tn5 concentration would result in a larger quantity of DNA mislabeled as RNA. Thus, when Tn5 concentration is not optimized for a balanced representation of DNA and RNA capture, a higher intronic fraction (~9.5% higher than the RNA-only control) will be observed in the RNA-barcoded data (Fig. 1G). As expected, scONE-seq nuclei data contain a higher proportion of intronic reads compared with those from whole cells.
In summary, scONE-seq can profile the genome and transcriptome of a single cell without compromising the data quality of either modality (table S1). This series of benchmarking analyses highlights the satisfactory performance of scONE-seq in both CNV calling accuracy and gene detection sensitivity, whereas other methods either suffer from reduced uniformity of DNA data or comparatively poor RNA capture. In addition, scONE-seq has a more straightforward workflow that is manageable for users. To increase scalability, we integrated our platform with a liquid dispensing system, which reduces reagent costs and increases throughput.
scONE-seq data correctly assign cell types from primary donor samples
After assessing the technical performance of scONE-seq, we evaluated whether it can accurately identify cellular subtypes within a mixed population by applying it to samples with known biological heterogeneity. To do so, we performed scONE-seq on two different cell lines and on a primary PBMC sample from a healthy donor.
We examined the cell type identification accuracy of scONE-seq using the strategy mentioned above. With unsupervised graph-based clustering of the RNA expression data, cells belonging to the same cell type are clustered together (Fig. 2A). We assigned cell type identities based on their sample sources, and well-studied marker genes for each cell type were differentially expressed (Fig. 2B). We used scDASH to remove ribosomal RNA so that a large proportion of detected genes were protein-coding genes (fig. S5A) (33). Non-polyA genes with potentially important biological functions can be captured; for example, histone genes that lack poly-A tails and are cell cycle related (fig. S5, B and C) (34).
Fig. 2. scONE-seq cell type classification and clone identification with CNVs.
(A) UMAP of scONE-seq RNA data from HCT116, NPC43, and lymphocytes (n = 183, 162, and 188, respectively). Cells from the same sample source are clustered together. (B) Differentially expressed gene (DEG) heatmap. DEGs group cells by their source. Common markers for these cells are labeled in the heatmap. (C) UMAP of integrated lymphocyte transcriptomic data from scONE-seq (n = 188), Smart-seq2 (n = 183), and 10x Genomics (n = 1659). (D) Cell type annotations are based on known markers of immune cells. The expression of CD45RO is based on isoform-level quantification (only available for scONE-seq and Smart-seq2). (E) Cell type composition shows no difference between the two datasets (P = 0.3705, chi-square test). (F) Heatmap shows CNV profiles of samples from multiple sources. Cells are organized by hierarchical clustering (normal, n = 184; HCT116, n = 171; and NPC43, n = 137). (G) The minimum evolution tree with diploid as the root shows that NPC43 cells used in this study acquired more CNVs compared with the genome state when the cell line was originally established. The underlined number shows the length of the designated Manhattan distance.
We further inspected lymphocytes from the same PBMC sample profiled using both scONE-seq and SS2. We integrated both datasets with a publicly available, independently generated PBMC dataset (10x Genomics) that consists of 1659 lymphocytes to annotate cells on the basis of known lymphocyte markers (Fig. 2, C and D) (35). We used the public dataset as the “ground truth” of cell type identities. To evaluate the accuracy of cell type classification in each dataset, we first clustered the scONE-seq, SS2, and 10x datasets independently and annotated cells with the same markers as before (fig. S6, A to C). At similar sequencing depths, scONE-seq faithfully recovered the cell types identified by the reference dataset. Notably, CD4+ T effector (Teff) cells could not be distinguished well by SS2. In contrast, scONE-seq detected markers of Teff that are shown to be only sparsely expressed in the 10x dataset, including ITGB1 and AHNAK (fig. S6, D to F). scONE-seq was also able to reveal two B cell subtypes (fig. S6, A and D). In addition, there was no significant difference observed in the cell type composition of cells profiled by scONE-seq and SS2 (P = 0.3705, chi-square test; Fig. 2E). Last, we evaluated the cell type classification ability of the two methods at varying sequencing depths. For profiling lymphocytes, around 10,000 UMIs, corresponding to 50,000 reads, and 200 cells were needed to achieve accurate cell type labeling with the scONE-seq dataset, which achieved higher accuracy than SS2 at a similar sequencing depth (fig. S7, A and B). Benefiting from the high sensitivity of our method, these results collectively demonstrate that scONE-seq RNA data can accurately capture the biological variation within heterogeneous samples to achieve correct cell type classification and clustering.
scONE-seq data identify distinct clones in different samples
The analysis above shows the robustness of scONE-seq’s cell type assignment in RNA data. We then evaluated the performance of clone identification with scONE-seq WGS data. Here, we used scONE-seq WGS data that were obtained simultaneously from samples used in the previous cell type assignment analysis and delineated their clonal CNV structure, followed by hierarchical clustering with their integer copy number profiles (Fig. 2F). We observed that HCT116 maintained a relatively homogeneous clonal composition, whereas NPC43, a primary patient-derived cell line that shows strong genome instability, was composed of three main clones (Fig. 2F). Furthermore, the CNV structure of these three clones differed substantially compared to the state of cell line establishment (36), especially in chromosomes 1, 3, 4, 6, 7, and 11 (Fig. 2G and fig. S8, A to D). The distinctions between clones are mainly found in chromosomes 1, 3, 7, and 11 (fig. S8, A to D). On the basis of this observation, the change in chromosome copy numbers during primary cell line culture could be a common phenomenon in cell lines with abundant CNVs and unstable genomes. Studies have shown extensive genetic variations across different cell lines, and single cells from some cell lines can give rise to populations with multiple clones because of genome instability (15, 37). In addition, with the matched transcriptomes of every single cell and their corresponding copy number states, we mapped the clonal information to the transcriptome UMAP for NPC43 and found that the CNVs in NPC43 were not markedly reflected in the transcriptome state (fig. S8E). This demonstrates that scONE-seq can identify both phenotype and genotype for each cell, which allows researchers to relate a cell’s expression state directly to its copy number profile.
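Clonal comparison here relies on integer copy number profiles, with the evolutionary tree in Fig. 2G rooted at the diploid state and branch lengths given as Manhattan distances. The sketch below shows how such distances could be computed between profiles. The toy profiles, the clone names, and the assumption that every bin is autosomal (diploid baseline of 2) are hypothetical and are not data from this study.

```python
def manhattan(profile_a, profile_b):
    """Manhattan distance between two integer copy number profiles of equal length."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same bins")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))


def pairwise_distances(profiles):
    """Return {(name_a, name_b): distance} for every unordered pair of profiles."""
    names = sorted(profiles)
    return {
        (p, q): manhattan(profiles[p], profiles[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


def distance_from_diploid(profiles):
    """Distance of each profile from an all-2 (diploid, autosomal) root."""
    return {name: manhattan(p, [2] * len(p)) for name, p in profiles.items()}


# Hypothetical per-bin integer copy numbers for three illustrative clones.
toy_profiles = {
    "clone_A": [2, 2, 3, 3, 2, 1],
    "clone_B": [2, 2, 3, 3, 1, 1],
    "clone_C": [2, 4, 3, 3, 1, 1],
}
```

A distance matrix of this kind can then be passed to any hierarchical clustering or tree-building routine; the choice of that routine is left open in this sketch.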
Dissecting the clonal structure and cell type subpopulations of an IDH1-mutant astrocytoma
Gliomas, especially high-grade ones, are some of the most aggressive malignant tumors originating in the brain (38, 39). When studying gliomas or other brain tissues using single-cell technology, it is challenging to obtain intact dissociated whole single cells, especially cells with complex morphology that leads to biases in cell type sampling (40). Hence, single-nucleus isolation is more widely used for brain scRNA-seq. To profile both the genotypic and phenotypic heterogeneity in a biobanked sample, we applied scONE-seq on single nuclei isolated from a snap-frozen astrocytoma specimen that had been stored since resection for 2 years: a second recurrent (2R) astrocytoma sample with IDH1(R132H), TP53(P278S), and ATRX(R781*) mutations (fig. S9A). The primary (P) and first recurrent (1R) samples were limited in quantity and subjected to whole-exome sequencing (WES) and RNA-seq in bulk (Fig. 3A) (41).
Fig. 3. scONE-seq reveals the clonal composition of the IDH1-mutant astrocytoma.
(A) Schematic showing the diagnosis history of the patient whose tumor sample is studied here. The patient was diagnosed with IDH1-mutant (grade 4) astrocytoma. Surgery was performed to excise the tumor, and concurrent chemoradiotherapy was then applied. The recurrences of the tumor were excised without further drug treatment. Tumor specimens were snap-frozen in a liquid nitrogen tank and stored for 2 years before being subjected to nuclei extraction. (B) UMAP of scONE-seq DNA copy number data indicates four genome states in this sample (normal cell, n = 586; 2R clone 1, n = 17; 2R clone 2, n = 20; and 2R clone 3, n = 432). (C) Heatmap of integer copy numbers of all the 2R astrocytoma cells profiled; three clones with different copy number profiles are observed (2R clone 1, n = 17; 2R clone 2, n = 20; and 2R clone 3, n = 432). The bottom annotation represents CNVs in this tumor sample. Glioma/astrocytoma driver genes are shown. Amplified genes are in red; deleted genes are in dark blue. (D) The minimum evolution tree with diploid as root. WES data–inferred CNVs (labeled with black) from the same patient were integrated with pseudo-clonal CNVs (labeled with light-blue circle) to show the evolutionary relationship between the tumor recurrences and their various clones. The P clones are inferred from bulk WES data. The underlined number shows the length of the designated Manhattan distance.
We delineated the clonal architecture of this 2R astrocytoma sample. Using dimension reduction on scWGS data, we clustered cells into four distinct genomic states (Fig. 3B), consisting of one cluster of cells defined as normal and three other clusters defined as tumor clones (fig. S10A). Whole-genome duplication was also found in this tumor and validated by measuring each cell’s 4′,6-diamidino-2-phenylindole (DAPI) intensity using flow cytometry and estimating the B-allele frequency of each cell (fig. S10, B to D). The integer copy number of cells was subsequently calculated (Fig. 3C). On the basis of the genomic profile of each clone and the WES data from the primary and recurrent tumors, the phylogenetic tree of the patient was constructed. The 2R clone 1 was found to be closest to the root (normal cell), with fewer loss-of-heterozygosity (LOH) events, and had very similar genome alterations to the primary tumor WES data (Figs. 3D and 4A and fig. S10E). The 2R clones 2 and 3 harbored many of the same deletion regions as the 1R tumor, resulting in LOH (Fig. 4A). We also investigated known cancer driver genes and found BRAF, MET, and MYC being amplified in all the 2R clones (Fig. 3C) (42). Several key deletion events were found to only occur in the 2R clones 2 and 3, including the deletion of CDKN2A and PTEN. Notably, the homozygous deletion of CDKN2A is a known prognostic factor for the IDH mutant astrocytoma, and it has recently been incorporated into the World Health Organization central nervous system tumor classification scheme as a sufficient criterion for classifying cases as grade 4, regardless of classical histology criteria (fig. S10F) (43).
Fig. 4. scONE-seq reveals the clonal allele frequency and CNVs.
(A) Top: The CNV plot. Dots show normalized counts across the human genome, and the solid line represents the estimated integer copy number. Bottom: The mirrored B-allele frequency (BAF) plot across the genome. Relatively amplified regions are highlighted in red; relatively deleted regions are highlighted in blue. Dots close to the red belt in the mirrored BAF plot indicate LOH in those regions. Dots close to the blue belt in the mirrored BAF plot indicate imbalanced haplotypes in those regions. The top bar highlights the LOH regions of the genome. The clonal pseudo-bulk genome information for each 2R clone is also shown.
Next, we performed unsupervised graph-based clustering on scONE-seq RNA data. Multiple cell clusters were annotated on the basis of their canonical cell type gene signatures. We found that this tumor contains tumor-associated macrophages (TAMs), neurons, astrocytes, oligodendrocytes, and tumor cells (Fig. 5, A and B). Such a complex TME illustrates this recurrent tumor’s highly infiltrated phenotype. The tumor cells, those that had transcriptomes classified as tumor-like, displayed high epidermal growth factor receptor expression, a well-known feature of high-grade gliomas. These tumor cells can further be characterized into four cellular states based on meta-module scores described by Neftel et al. (12) (Fig. 5A): oligodendrocyte progenitor cell–like (OPC-like), neural progenitor cell–like (NPC-like), mesenchymal-like (MES-like), and astrocyte-like (AC-like) cellular states.
Fig. 5. scONE-seq reveals the TME composition of an IDH1-mutant astrocytoma.
(A) UMAP of scONE-seq RNA data shows the cell types that make up the TME in this second-recurrence IDH1-mutant astrocytoma sample. Tumor cells are classified into four cellular states on the basis of their meta-module scores. (B) Dot plot shows representative markers used for annotating cell types. (C) UMAP of scONE-seq RNA data annotated with clonal information. 2R clone 1 cells are clustered with normal astrocytes as indicated by a red arrow. (D) Volcano plot shows the DEGs between the 2R clone 1 and clone 3 cells. Genes with higher expression in clone 1 cells are colored red. Genes with higher expression in clone 3 are colored blue. (E and F) UMAPs show the expression pattern of four key markers to distinguish the 2R clone 1 cells in the scONE-seq RNA dataset (E) and 10x single-nucleus RNA sequencing (snRNA-seq) dataset (F). The markers are XIST (deletion in 2R clone 3), RFX3 (homozygous deletion in 2R clone 3), and ADCY8 and GRIA1 (unique expression in clone 1 compared to normal astrocytes). (G) UMAP shows the integration of scONE-seq and 10x snRNA-seq dataset. Integrated data (left) retain all cell types identified by scONE-seq or 10x snRNA-seq. Split UMAP (middle and right) shows adequate mixing of cell types found separately by the two methods, with 2R clone 1 cells (from scONE-seq) and putative clone 1 cells (from snRNA-seq) falling into the same integrated cluster.
We used the paired RNA data to superimpose the cell type information onto the clonal information so that clonal subpopulations with unique functional, phenotypic features could be identified. To do so, we mapped the clonal information to the RNA UMAP to visualize the clonal distribution among different cell types (Fig. 5C). The 2R clone 3 was the major clone of this tumor and was differentiated into all four tumor phenotypes: OPC-like, NPC-like, MES-like, and AC-like cellular states. The 2R clone 2 consisted predominantly of AC-like cells. The 2R clone 1 is the most interesting: All cells from this clone were clustered with normal astrocytes using RNA data alone, indicating transcriptome similarity between 2R clone 1 and normal astrocytes that are indistinguishable with scRNA-seq data; but upon superimposing matched genotype and phenotype, this unique population of astrocyte-like tumor cells with clearly abnormal genotype is revealed (Fig. 5C). We then sought to validate the presence of this clone within the tumor and to investigate its potential role in the TME.
Characterization of a unique tumor clone with normal astrocyte-like phenotype
To verify the existence of clone 1 cells, we first identified gene markers unique to clone 1, including XIST, RFX3, ADCY8, and GRIA1, which distinguished them from other tumor and normal cells (Fig. 5, D and E). With these markers, we observed a similar phenomenon in the droplet-based single-nucleus RNA sequencing (snRNA-seq) dataset, where a putative clone 1 population was found to be adjacent to normal astrocytes (Fig. 5F and fig. S11, A and B). As validation, we integrated the scONE-seq RNA dataset and 10x dataset to show that our scONE-seq dataset of 1210 nuclei captured all cell types that were observed in droplet snRNA-seq of 4416 nuclei and that clone 1 merged with the putative clone 1 cells from the droplet-based snRNA-seq dataset (Fig. 5G). The abnormality of the 2R clone 1 is not definitively identifiable using either transcriptomic clustering or RNA-inferred CNVs (fig. S11, C and D). This highlights the importance of directly measuring copy number profiles as a standard to identify cancer cells, which is particularly important when studying tumor initiation to capture the normal-to-cancer transition.
Then, we performed histological analyses on formalin-fixed, paraffin-embedded (FFPE) sections from both the primary and 2R tumors of this patient to identify the cells with IDH1(R132H) and ADCY8. Anti-IDH1(R132H) is expected to label all tumor cells, as this is a known somatic mutation in IDH-mutant glioma cells and has also been validated with clonal pseudo-bulk Integrative Genomics Viewer (IGV) tracks (fig. S12A) (44). Anti-ADCY8 is expected to mark some normal neurons and normal astrocytes in addition to clone 1 cells. Hence, putative clone 1 cells are those cells marked by double-positive staining of anti-IDH1(R132H) and anti-ADCY8. We looked at the overall staining pattern across the whole slide section and noted that the IDH1(R132H)-positive tumor cells were distributed over the entire section for both the primary and 2R tumors, although with a different pattern. The ADCY8 signals appeared stronger in 2R tumor sections and were specifically concentrated in certain regions that also expressed IDH1(R132H) more strongly (fig. S12B). These ADCY8-positive regions were always near the IDH1(R132H)-negative “normal adjacent” regions (fig. S12B). The double-positive cells that we suspected to be the putative clone 1 cells appeared to be near other normal and malignant cells (Fig. 6A). These immunostaining results reveal the spatial distribution of putative clone 1 cells in the tumor sections.
Fig. 6. Characterization of features of 2R clone 1.
(A) The immunofluorescent images colabel the IDH1(R132H) and ADCY8 in the FFPE section of the patient (scale bars, 20 μm). Top: Images from the 2R tumor. Bottom: Images from the primary tumor from the same patient. Yellow arrow indicates the costained putative clone 1 cells. The red arrow indicates the other tumor cells. The green arrow indicates normal astrocytes or GABAergic neurons. (B) AMPAR subunit–encoding gene expression pattern in this 2R IDH1-mutant astrocytoma. Clone 1 cells have highly expressed GRIA1. (C) TGFβ signaling gene expression pattern in this 2R IDH1-mutant astrocytoma. Clone 1 cells and normal astrocytes have highly expressed TGFB2. Receptors (TGFBR1 and TGFBR2) are mostly expressed in TAMs.
The presence of these clone 1 tumor cells with their normal-like transcriptional signature among normal astrocytes and neurons prompted us to examine the gene expression of clone 1 cells associated with signaling and cell-cell communication. Several studies have demonstrated that glioma cells can form synaptic structures with normal neurons as a signaling conduit within the tumor (45, 46). This was found to occur via AMPA receptors (AMPAR), a glutamate receptor subtype (47, 48). AMPARs are tetrameric, and there are four subunit proteins involved, GluA1-4, encoded by the genes GRIA1-4, respectively (49, 50). We found the GRIA genes to be differentially expressed between the different tumor clones in this 2R tumor (Fig. 6B). The major clone, clone 3, expressed GRIA2-4 but not GRIA1, whereas clone 1 was the only tumor subpopulation that expressed GRIA1, with the other three GRIA family genes expressed at much lower levels. Clone 1 also coexpressed GRIA1 and APOE, a transcriptomic signature thought to identify tumor cells with a tumor-neuron synapse (fig. S12C) (47). We subsequently performed ligand-receptor analysis for the different subpopulations and found transforming growth factor–β (TGFβ) signaling transcripts to be strongly and specifically expressed in normal astrocytes, clone 1 cells, and TAMs (fig. S12D), with clone 1 cells expressing the ligands and TAMs predominantly expressing the receptors (Fig. 6C).
Transcriptomic regulation is multifaceted, and copy number change is one of the important factors. We evaluated the copy number states of genes differentially expressed in 2R clone 3 relative to normal cells or to 2R clone 1 and found enrichment in regions with copy number changes (fig. S13, A to D). When comparing 2R clone 3 and 2R clone 1, more genes were enriched in regions with copy number changes than when comparing 2R clone 3 and normal cells (fig. S13, B and D). Transcription is regulated via many layers of epigenetic controls such as chromatin accessibility, DNA/histone modifications, and posttranscriptional regulation, but these states are not currently measurable in the same cell simultaneously. Therefore, while there certainly is a correlation between copy number and transcript abundance, the degree of this correlation can also be affected by epigenetic regulation.
DISCUSSION
We have developed scONE-seq, a versatile method that offers balanced data quality between DNA and RNA readouts, ease of use, and a customizable level of scale. We designed this platform to be compatible with frozen tissue samples that have been stored for years. This feature makes it easier to plan and perform larger-scale clinical multi-omics single-cell studies in two ways: It makes studies on existing biobanked samples possible, which we have demonstrated here; it also removes the burden of having to immediately process fresh samples from clinical researchers, whose priority is patient care. The application of scONE-seq on frozen tumor tissue in our study led to an integrated investigation of the genotypic and molecular phenotypic heterogeneity of astrocytoma. By superimposing the multi-omics data, it enabled us to reveal and characterize differentiated tumor clones, which resonates with the idea that tumor clones can produce hierarchical cellular states (6, 12). In addition, we found a unique tumor clone in this astrocytoma tumor with a very similar transcriptomic phenotype to normal astrocytes. Putative clone 1 cells expressing the same gene markers as clone 1 were detected and validated using both 10x Genomics snRNA-seq and immunostaining on tissue sections. Combining the histological and transcriptomic evidence of clone 1 cells, we propose some potential roles of these cells in the TME, including cell-cell communication via both signaling molecules and potential tumor-neuron synapse formation. Specifically, we found clone 1 to be the only tumor subpopulation that expressed GRIA1, an AMPAR encoding gene, while the other clones expressed other GRIA genes. GRIA1-encoded GluA1 subunit often forms GluA1 homomer AMPARs, which are calcium-permeable and broadly found in synapses in early development (50). Calcium-permeable AMPARs are a key signaling molecule in the tumor-neuron synapse (47, 48). 
The unique expression of GRIA1 in clone 1 hints at a distinctive and potentially multifaceted role in the intricate network of cell-cell communication in the TME. We also found TGFβ signaling molecules to be prominently expressed in normal astrocytes, clone 1 cells, and TAMs. TGFβ signaling in glioblastomas can promote invasiveness and angiogenesis and suppress the immune system. It is therefore a therapeutic target for gliomas. The expression of TGFβ pathway–associated genes by clone 1 cells is comparable to that of the normal astrocytes within the tumor, raising intriguing questions about the role of clone 1 in tumor immune modulation and its possible interactions with TAMs. Whether clone 1 cells are a commonly occurring cell type in gliomas in general or in astrocytoma specifically, and how they are involved in the TME, will be an interesting avenue of future investigation on a larger sample set. Collectively, from these results, we posit that cancer studies based only on scRNA-seq could underestimate important layers of tumor heterogeneity, whereas simultaneous detection of DNA could contribute meaningful and informative insight into tumor evolution. Meanwhile, clonal analysis based on scWGS data alone might overlook the complex interactions within a TME. By deciphering the genetic and phenotypic heterogeneity within the tumor ecosystem with scONE-seq, we can reveal the interplay of clonal expansion, tumor hierarchical cellular states, and the TME.
The proof-of-concept study here is limited by the patient and cell numbers. We anticipate that future large-scale studies will produce a more informative understanding of cancer and have the potential to advance our knowledge in a more generalizable manner. Benefitting from its one-tube reaction system, scONE-seq has flexible scalability. For applications in large-scale studies, we have established a low-volume, higher-throughput version of scONE-seq by adapting it to a liquid handler platform (see Materials and Methods). To achieve higher throughput, a robotic platform that integrates the liquid handler and plate transferring is a promising but expensive solution. With further technology development, integrating the scONE-seq molecular mechanism onto a droplet pico-injection platform could be a more practical solution in the long run to perform even larger-scale multi-omics studies (51). Alternatively, integrating scONE-seq data with droplet-based single-cell data is a complementary approach to achieve higher throughput. Various procedures can be easily added to the scONE-seq workflow to profile additional layers of information: To detect chromatin accessibility, a nucleus tagmentation step with customized Assay for Transposase-Accessible Chromatin (ATAC) adaptors could be added before FACS (52, 53); similarly, quantitative protein estimation could be achieved by using DNA-barcoded antibodies before the single-cell sorting steps of scONE-seq (54). A paired high-depth single-cell somatic mutation landscape could also be integrated into scONE-seq by jointly performing WES or any hybridized target sequencing panel with the standard scONE-seq library. In general, scONE-seq is a flexible platform that can be further expanded and developed to measure different signals from a cell. We expect that scONE-seq will be a powerful tool to dissect cellular heterogeneity and will inspire the development of other ultrahigh-throughput single-cell multi-omics methods.
MATERIALS AND METHODS
Experimental design
Single-cell or nucleus isolation
HCT116 (American Type Culture Collection), NPC43 (provided by K. W. Lo in the Chinese University of Hong Kong), HUVEC (Lonza), and H9 (WiCell) cells were dissociated with trypsin-EDTA (0.25%) solution (Thermo Fisher Scientific) and stained with propidium iodide (1 mg/ml; Thermo Fisher Scientific) to exclude dead cells. Fresh whole blood was taken in the Hong Kong University of Science and Technology (HKUST) clinic from a healthy human donor. Lymphocytes were isolated via Ficoll-Paque Plus (GE Healthcare) density centrifugation. The red blood cells were removed with 1× red blood cell lysis buffer (Thermo Fisher Scientific). The frozen IDH1-mutant astrocytoma tissue (stored at −80°C) was obtained from Prince of Wales Hospital. The nuclei isolation protocol used for the frozen tumor was based on previous studies (55). Nuclei isolation from HCT116 was based on the Tween with salts and Tris (TST) protocol (56). Nuclei were resuspended in sorting buffer [1× phosphate-buffered saline (PBS), 1% bovine serum albumin (BSA), ribonuclease (RNase) inhibitor (1 U/ml), and DAPI (0.1 mg/ml; Thermo Fisher Scientific)]. Cells or nuclei were then loaded onto a FACSAria III flow cytometer (BD Biosciences) to sort single cells into PCR tubes (96 or 384 PCR plates) containing lysis buffer. The lysis buffer consisted of RNase Inhibitor (2.5 U/μl; NEB), 0.15% Triton X-100 (Sigma-Aldrich), and 6 μM dithiothreitol (DTT; Thermo Fisher Scientific). The sorted samples can be stored at −80°C for months.
Sample processing
Mock RNA (total RNA) samples were extracted from HCT116. Mock DNA samples were extracted from E. coli. Fifty picograms of human RNA and 5 pg of E. coli DNA were used to perform scONE-seq. Lymphocytes were used to perform scONE-seq and Smart-seq2 in parallel. IDH1-mutant astrocytoma nuclei were used to perform scONE-seq and 10x Genomics in parallel. The quantity of ERCC spike-in used in mock and HCT116 experiments was 1 μl of 1:500,000 diluted ERCC.
Buffer and enzyme preparation
Mix 12 μl of Tn5 adaptor (One-Tn5 or P5-Tn5, 100 μM) and 12 μl of Mosaic (100 μM). Vortex and spin down the sample. Place the tube on a thermocycler with the temperature preset to 95°C. Gradually reduce the heat until the oligonucleotides have reached room temperature (−1°C/min). Mix 4 μl of adaptor from the above step with 20 μl of Tn5 (Novoprotein). Incubate at 24°C for 45 min. The enzyme can be stored at −20°C for years. Twenty milligrams of proteinase K (Sigma-Aldrich) was dissolved in 500 μl of nuclease-free water. Glycerol (500 μl; Sigma-Aldrich) was then added to obtain the proteinase K solution (20 mg/ml).
The preamplification steps of scONE-seq
For standard scONE-seq preamplification, a single cell/nucleus was sorted into 2.3 μl of lysis buffer. Proteinase K (PK) (0.1 μl; 20 mg/ml), 0.15 μl of MgCl2 (250 mM), 0.3 μl of deoxynucleotide (dNTP) (10 mM each), and 0.25 μl of H2O were added to completely lyse cells or nuclei by incubating at 55°C for 15 min. Then, the proteinase K was deactivated at 75°C for 15 min. The tagmentation reaction was performed to fragment the genomic DNA and add the DNA-specific barcode. This reaction was performed in 5-μl volume, by adding 1.025 μl of H2O, 1.0 μl of PEG-8000 (polyethylene glycol, molecular weight 8000; 40% w/v), 0.125 μl of TAPS (400 mM TAPS-NaOH, pH 7.9), 0.1 μl of RNase Inhibitor, 0.05 μl of KAPA polymerase (Roche), and 0.0002 μl of One-Tn5 with custom adaptor (GTCTCGTGGGCTCGG TCATG NNNNNNNN AGATGTGTATAAGAGACAG). The reaction was incubated at 55°C for 10 min followed by 72°C for 10 min. Then, 0.1 μl of proteinase K (Sigma-Aldrich) and 0.6 μl of H2O were used to deactivate the enzyme in the buffer by incubating at 55°C for 15 min followed by 75°C for 15 min. Thereafter, we performed RT in 9-μl volume by adding 0.43 μl of H2O, 0.14 μl of MgCl2 (250 mM), 1.28 μl of tris-HCl (500 mM; pH 8.0), 0.6 μl of DTT (100 mM), 0.2 μl of SuperScript III, 0.15 μl of RT primer (10 μM), and 0.1 μl of RNase Inhibitor. RT was carried out at 12°C for 12 s followed by a gradient increasing to 50°C (12°C for 15 s, 15°C for 45 s, 20°C for 30 s, 30°C for 30 s, and 42°C for 45 s) and incubating for 50 min and 55°C for 50 min. Subsequently, the residual primers and RNA were removed with 0.6 μl of thermolabile Exo I (NEB; 37°C for 20 min; 65°C for 20 min), 0.1 μl of RNase If (NEB), and 0.1 μl of RNase H (NEB) at 37°C for 15 min. Then, 0.55 μl of nuclease-free water, 0.1 μl of the terminal transferase (NEB), and 0.1 μl of deoxycytidine triphosphate (dCTP; 100 mM) were used to add the C-tail to cDNA fragments.
This reaction was performed at 37°C for 5 min, and the enzyme was immediately deactivated by adding 0.2 μl of proteinase K and 0.55 μl of H2O (55°C for 15 min; 75°C for 15 min). Second-strand synthesis was then performed by adding 0.3 μl of 3′ adaptor (10 mM), 1 μl of KAPA HiFi Fidelity Buffer (5×), 1 μl of dNTP (10 mM each), 0.1 μl of (NH4)2SO4 (250 mM), and 0.1 μl of KAPA Polymerase (final volume, 15 μl). The reaction was incubated at 72°C for 5 min [10 cycles (1 min at 48°C; 1 min at 72°C)] and 5 min at 72°C, in a thermal cycler. An additional residual primer removal reaction was performed with 0.7 μl of Exo I (NEB; 37°C for 30 min; 55°C for 15 min; 65°C for 15 min). Last, 14 μl of KAPA HotStart ReadyMix (2×), 0.4 μl of (NH4)2SO4 (250 mM), 0.68 μl of dimethyl sulfoxide (DMSO; Thermo Fisher Scientific), and 4 μl of amplification primer (10 μM) were added to amplify DNA and RNA simultaneously. The PCR was performed at 98°C for 4 min [18 to 22 cycles (20 s at 98°C; 4.25 min at 72°C)] and 10 min at 72°C, in a thermal cycler. All primer sequences can be found in table S2.
The low-volume high-throughput scONE-seq
We established a low-volume high-throughput version of scONE-seq with a MANTIS (FORMULATRIX) liquid dispenser. The volume was scaled down ~3.5-fold. To avoid evaporation, 6 μl of Vapor-Lock (QIAGEN) was added at the very beginning (before lysis buffer) (57). In addition, dNTP concentration was increased. All liquid handling steps were performed on MANTIS.
In detail, 0.7 μl of lysis buffer was used with all components maintaining the same concentration. PK (0.03 μl) and 0.07 μl of H2O were then added to lyse cells. A total of 0.7 μl of tagmentation mix (0.24 μl of dNTP) with all other components maintaining the same concentration was added to perform DNA tagmentation. PK (0.03 μl) and 0.07 μl of H2O were then added to deactivate all enzymes. RT mix (0.7 μl) was then added to perform the RT reaction. Subsequently, 0.2 μl of thermolabile Exo I was added to remove residual primers, and 0.2 μl of RNase mix (0.03 μl of RNase If, 0.03 μl of RNase H, and 0.14 μl of H2O) was added to remove residual RNA. Then, 0.2 μl of poly-C tailing mix (0.03 μl of terminal transferase, 0.03 μl of dCTP, and 0.14 μl of H2O) was added. PK (0.03 μl) and 0.07 μl of H2O were then added to deactivate all enzymes in the buffer. Second-strand synthesis was then performed by adding 0.8 μl of second-strand synthesis mix with all components maintaining the same concentration. Exo I (0.2 μl) was then added to remove 3′ adaptors. Last, 3.6 μl of KAPA HotStart ReadyMix (2×), 0.1 μl of (NH4)2SO4 (250 mM), 0.18 μl of DMSO (Thermo Fisher Scientific), and 1.08 μl of amplification primer (10 μM) were added to amplify DNA and RNA simultaneously (10-μl volume). In theory, the volume could be further scaled down with a better liquid dispenser. The reaction programs were kept the same as in the standard protocol.
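As a quick arithmetic check on the ~3.5-fold scale-down, the volumes quoted for the standard and low-volume protocols can be compared component by component. The sketch below uses only volumes stated in the text; the dictionary layout and the function name `fold_reduction` are illustrative and are not part of the published workflow.

```python
# Volumes (in microliters) quoted in the text for the standard protocol and
# for the low-volume (MANTIS) protocol. Only components stated in both
# protocols are listed here.
STANDARD_UL = {
    "lysis buffer": 2.3,
    "proteinase K (lysis step)": 0.1,
    "KAPA HotStart ReadyMix (2x)": 14.0,
    "(NH4)2SO4 (250 mM)": 0.4,
    "DMSO": 0.68,
    "amplification primer (10 uM)": 4.0,
}
LOW_VOLUME_UL = {
    "lysis buffer": 0.7,
    "proteinase K (lysis step)": 0.03,
    "KAPA HotStart ReadyMix (2x)": 3.6,
    "(NH4)2SO4 (250 mM)": 0.1,
    "DMSO": 0.18,
    "amplification primer (10 uM)": 1.08,
}


def fold_reduction(standard, low_volume):
    """Return the standard/low-volume ratio for every shared component."""
    return {name: standard[name] / low_volume[name] for name in standard}


folds = fold_reduction(STANDARD_UL, LOW_VOLUME_UL)
for name, fold in folds.items():
    print(f"{name}: {fold:.2f}-fold")
```

The per-component ratios fall between roughly 3.3 and 4.0, which is consistent with the approximate 3.5-fold figure given above.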
Adjusting the DNA/RNA reads ratio
For different research purposes, different sequencing depths for DNA and RNA data may be desired. This can be achieved easily by adjusting the Tn5 amount for tagmentation, which adjusts the yield of the DNA preamplification. In general, more Tn5 will shorten the DNA fragments and increase the number of moles of DNA sequenced, and vice versa. The suggested amount of Tn5 for single cells is 0.0001 μl, as the whole cell contains more RNA, while for single nuclei, it is 0.00005 μl.
scONE-seq sequencing library construction
Preamplified samples were purified with AMPure XP beads (Beckman Coulter). Samples were diluted to 0.1 ng/μl, and tagmentation reaction was performed with the following components: 1× TAPS buffer [50 mM TAPS-NaOH and 25 mM MgCl2 (pH 8.0)], 8% PEG-8000, and 0.003 μl of P5-Tn5 (TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG). The reaction was performed at 55°C for 15 min. Samples were then amplified with Illumina sequencing index primers (Sangon) by using KAPA HiFi HotStart Polymerase Kit (Roche). The enrichment PCR was incubated at 95°C for 10 min [10 to 11 cycles (20 s at 98°C, 15 s at 60°C, and 30 s at 72°C)] and 2 min at 72°C, in a thermal cycler. Samples were then pooled and purified with AMPure XP beads. scDASH protocol was then used to remove the abundant ribosome and mitochondrial sequences (33, 58). Double size selection can be performed to optimize the library size. The library was then sequenced on Illumina NextSeq500 with customized sequencing primers (table S2). Alternatively, the Illumina sequencing library could be converted to the MGI sequencing library with an MGI conversion kit (MGI) and sequenced with customized sequencing primers (table S2). The costs for scONE-seq are summarized in table S3.
DNA/RNA reads demultiplexing
A standard scONE-seq run could include 200 to 300 cells for a high-throughput Illumina kit (yielding 300 to 350 million pairs of reads). Cells with fewer than 0.1 million reads were filtered out directly. Sequencing reads were first filtered with fastp (59), which retained around 85 to 90% of reads for DNA/RNA read demultiplexing. We then used Cutadapt to separate the fastq files into DNA fastq files, RNA fastq files, and unmatched fastq files (3 to 5%) (60). During this process, the UMI of each read was extracted and appended to the fastq read header with fastp. Generally, the duplication rate in the fastq files was around 6 to 11% (table S4). The detailed code can be found on GitHub (https://github.com/0YuLei0/scONE-seq-data-processing).
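The demultiplexing logic can be illustrated with a minimal pure-Python sketch. It is a stand-in for the Cutadapt and fastp steps, not the published pipeline. It assumes that each read starts with a short modality tag followed by an 8-nt UMI, as suggested by the custom One-Tn5 adaptor (TCATG followed by NNNNNNNN). The RNA tag shown here is a hypothetical placeholder (the actual sequences are in table S2), and the function name `split_read` is illustrative.

```python
def split_read(seq, tags, umi_len=8):
    """Classify one read by its leading tag.

    Returns (label, umi, insert). `tags` maps a label such as "DNA" to the
    tag sequence expected at the start of the read. Reads that match no tag
    are labeled "unmatched" and returned unchanged.
    """
    for label, tag in tags.items():
        if seq.startswith(tag):
            umi_start = len(tag)
            umi = seq[umi_start:umi_start + umi_len]
            return label, umi, seq[umi_start + umi_len:]
    return "unmatched", "", seq


TAGS = {
    "DNA": "TCATG",  # assumed from the One-Tn5 custom adaptor in the text
    "RNA": "GACTC",  # HYPOTHETICAL placeholder, not the real RNA tag
}

reads = [
    "TCATG" + "ACGTACGT" + "AGATGTGTATAAGAGACAGTTGCA",
    "GACTC" + "TTTTCCCC" + "GGGATCCA",
    "NNNNNACGTACGTACGT",
]
for read in reads:
    print(split_read(read, TAGS))
```

Exact prefix matching is used here for brevity; a real run would rely on Cutadapt's error-tolerant adapter matching instead.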
DNA data analysis
DNA fastq files were mapped to hg38 with BWA mem (61). To perform UMI-based deduplication, read2 reads in bam files were extracted with samtools and deduplicated with umi_tools (62, 63). The deduplicated read2 reads were used to retrieve their paired read1 reads, and these paired fastq files were then realigned to hg38 with BWA mem.
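The idea behind UMI-based deduplication can be sketched as follows. This simplified version collapses reads that share an identical mapping position, strand, and UMI. It is not the algorithm umi_tools uses by default, which additionally merges UMIs that differ by sequencing errors. The record fields and the function name `dedup_by_umi` are illustrative assumptions.

```python
def dedup_by_umi(reads):
    """Keep the first read for every (chrom, pos, strand, umi) combination.

    `reads` is an iterable of dicts with the keys "name", "chrom", "pos",
    "strand", and "umi". Input order is preserved in the output.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read["chrom"], read["pos"], read["strand"], read["umi"])
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


read2_alignments = [
    {"name": "r1", "chrom": "chr1", "pos": 1000, "strand": "+", "umi": "ACGTACGT"},
    {"name": "r2", "chrom": "chr1", "pos": 1000, "strand": "+", "umi": "ACGTACGT"},  # duplicate of r1
    {"name": "r3", "chrom": "chr1", "pos": 1000, "strand": "+", "umi": "TTGGCCAA"},  # same position, new UMI
    {"name": "r4", "chrom": "chr2", "pos": 500, "strand": "-", "umi": "ACGTACGT"},
]
kept_names = [read["name"] for read in dedup_by_umi(read2_alignments)]
print(kept_names)
```

The names of the retained read2 reads could then be used to pull the matching read1 mates before realignment, as described above.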
For standard CNV analysis, Ginkgo was used to generate the normalized counts with a 500-kb window size (64). For allele-specific CNV analysis, CHISEL was used to generate allele frequency information (65). In this process, the CV of normalized counts in copy number neutral regions was calculated for each cell. Cells with a CV of less than 0.5 were kept for downstream analysis. The integer copy number calculation was based on previous studies (15). In this pipeline, the segmentation was performed with copynumber and aCGH in R (66).
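The CV-based quality filter described above can be sketched in a few lines. The example assumes that the CV is the population standard deviation divided by the mean of the normalized bin counts; the exact estimator, the toy counts, and the function names are assumptions made for illustration.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return standard deviation / mean (population SD is an assumption)."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean of normalized counts is zero")
    return pstdev(values) / m


def filter_cells_by_cv(counts_by_cell, neutral_bins, max_cv=0.5):
    """Return {cell: cv} for cells whose CV in neutral bins is below max_cv.

    `counts_by_cell` maps a cell ID to its list of normalized bin counts and
    `neutral_bins` lists the indices of copy number neutral bins.
    """
    kept = {}
    for cell, counts in counts_by_cell.items():
        cv = coefficient_of_variation([counts[i] for i in neutral_bins])
        if cv < max_cv:
            kept[cell] = cv
    return kept


# Toy normalized counts; the last bin is treated as non-neutral.
counts_by_cell = {
    "cell_a": [1.0, 1.1, 0.9, 1.0, 2.0],  # even coverage in neutral bins
    "cell_b": [0.2, 1.9, 0.1, 1.8, 1.0],  # noisy coverage in neutral bins
}
kept = filter_cells_by_cv(counts_by_cell, neutral_bins=[0, 1, 2, 3])
print(sorted(kept))
```

Restricting the calculation to copy number neutral bins keeps true copy number changes from inflating the CV of an otherwise well-amplified cell.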
WES data from different tumor samples were analyzed with CNVkit and THetA to infer the copy number profiles (67, 68). The integer copy number data from WES and scONE-seq were combined to perform the phylogenetic reconstruction, adapted from a previous method (15).
Other single-cell multi-omics or DNA-seq data were downloaded (15–18, 21, 22). All fastq files were filtered with fastp. DNA data were mapped with BWA mem. For mitochondrial proportion and duplication rate evaluation, bam files were downsampled to 0.5 or 0.25 million (sci-L3) fragments. Mitochondrial reads were extracted and summarized with samtools. Duplication reads were marked with Picard toolkit and summarized with samtools. For uniformity evaluation, duplicated reads were removed with the Picard toolkit. The remaining deduplicated reads were downsampled to 0.5 or 0.25 million fragments. To select normal genome regions for method comparison, we performed hierarchical clustering to select clustered copy number neutral regions. Data from these regions were compared to evaluate the uniformity of methods (CV).
The minimum CNV detection analysis was performed with the scONE-seq HCT116 dataset (cells, nuclei, and pseudo-bulks). The criterion for the detection of a small-scale CNV was that a cell, nucleus, or pseudo-bulk had more than 80% of regions matching the reference CNVs in this small-scale CNV region. The detection rate was then calculated on the basis of the sample number.
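The detection criterion can be written out explicitly. In this sketch, each sample is represented by its integer copy number calls across the bins of one small-scale CNV region, and the detection rate is the fraction of samples in which more than 80% of bins match the reference. The bin-level representation and the function names are assumptions for illustration.

```python
def cnv_detected(sample_states, reference_states, min_match=0.8):
    """True if strictly more than `min_match` of bins match the reference."""
    if len(sample_states) != len(reference_states):
        raise ValueError("sample and reference must cover the same bins")
    matches = sum(s == r for s, r in zip(sample_states, reference_states))
    return matches / len(reference_states) > min_match


def detection_rate(samples, reference_states, min_match=0.8):
    """Fraction of samples in which the CNV is detected."""
    detected = sum(cnv_detected(s, reference_states, min_match) for s in samples)
    return detected / len(samples)


# Toy example: a five-bin gain (copy number 3) in the reference profile.
reference = [3, 3, 3, 3, 3]
samples = [
    [3, 3, 3, 3, 3],  # 100% match, detected
    [3, 3, 3, 3, 3],  # 100% match, detected
    [3, 3, 3, 3, 2],  # exactly 80% match, not "more than 80%"
    [2, 2, 3, 3, 3],  # 60% match, not detected
]
rate = detection_rate(samples, reference)
print(rate)
```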
RNA data analysis
UMI-based deduplication was also performed with RNA fastq files. The workflow was kept the same except that BWA was replaced with STAR (69). Mapping region analysis was performed with bedtools (70). The fastq files were quantified with Kallisto (cDNA quantification) for cell lines and PBMC or Salmon for the frozen tumor sample (premature RNA quantification) (71, 72). 10x snRNA-seq data were quantified with kb-python (73). The expression data were analyzed using Seurat with the sctransform pipeline (normalization, dimension reduction, dataset integration, finding clusters, and differential gene analysis) (74). The astrocytoma cellular state meta-module scoring was performed following the original paper (12). The ligand-receptor analysis was performed with CellChat (75). CNV inference was performed with CopyKAT (76).
RNA data from other methods were downsampled and quantified with Kallisto (17, 18, 21, 22). Gene expression matrices from DR-seq and sci-L3 were downloaded directly. We filtered sci-L3 RNA data to keep cells with more than 100 UMIs. The scONE-seq dataset was filtered by setting a threshold of more than 3500 detected genes. Data from all other methods were left unfiltered to show the data quality. Commonly detected genes were defined as the 3000 most frequently expressed genes of each dataset. After excluding ribosomal and mitochondrial genes from the commonly detected genes, cell-cell Pearson correlations were calculated within each dataset. Last, scDblFinder was used to estimate the cell doublet scores (77).
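The gene selection and correlation steps can be sketched as below. The example assumes that "most frequently expressed" means detected in the largest number of cells and that ribosomal and mitochondrial genes can be recognized by the human gene symbol prefixes RPS, RPL, and MT-. Both are assumptions, as are the function names and the toy matrix.

```python
from itertools import combinations
from math import sqrt


def commonly_detected_genes(matrix, genes, top_n=3000,
                            excluded_prefixes=("MT-", "RPS", "RPL")):
    """Indices of the top_n most widely detected genes, minus excluded ones.

    `matrix` is a list of per-cell count lists aligned to `genes`.
    """
    detection = [sum(1 for cell in matrix if cell[j] > 0)
                 for j in range(len(genes))]
    ranked = sorted(range(len(genes)), key=lambda j: -detection[j])[:top_n]
    return [j for j in ranked if not genes[j].startswith(excluded_prefixes)]


def pearson(x, y):
    """Pearson correlation; returns NaN if either vector has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def pairwise_correlations(matrix, gene_idx):
    """Pearson correlation for every pair of cells over the selected genes."""
    profiles = [[cell[j] for j in gene_idx] for cell in matrix]
    return {(a, b): pearson(profiles[a], profiles[b])
            for a, b in combinations(range(len(profiles)), 2)}


genes = ["GAPDH", "MT-CO1", "RPS3", "ACTB", "XIST"]
matrix = [
    [5, 9, 4, 3, 0],
    [10, 8, 5, 6, 0],
    [2, 7, 6, 1, 1],
]
selected = commonly_detected_genes(matrix, genes, top_n=4)
print([genes[j] for j in selected])
print(pairwise_correlations(matrix, selected))
```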
For the lymphocyte dataset, a threshold of 2000 gene detection number was applied for the scONE-seq dataset, and a threshold of 1000 gene detection number was applied for the SS2 dataset. The mitochondrial proportion was also checked to be lower than 5% for each dataset.
Regarding the frozen astrocytoma sample, nuclei with fewer than 1500 detected genes or doublet scores higher than 2 were removed. The mitochondrial proportion was also checked to be lower than 2% in all nuclei. In the matched 10x Genomics dataset, nuclei with fewer than 1500 detected genes, mitochondrial proportions higher than 3%, or doublet scores higher than 0.5 were removed.
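The two sets of thresholds for the astrocytoma data can be expressed as a small filter. The thresholds come from the text above; treating the 2% mitochondrial check for scONE-seq as a filter, the handling of values exactly at a threshold, and the field and function names are assumptions made for illustration.

```python
# Thresholds from the text; keys are illustrative names.
SCONE_SEQ_QC = {"min_genes": 1500, "max_doublet_score": 2.0, "max_mito_fraction": 0.02}
TENX_QC = {"min_genes": 1500, "max_doublet_score": 0.5, "max_mito_fraction": 0.03}


def passes_qc(nucleus, min_genes, max_doublet_score, max_mito_fraction):
    """True if a nucleus meets all three thresholds.

    `nucleus` is a dict with "n_genes", "doublet_score", and "mito_fraction".
    Values exactly at a threshold are kept (an assumption).
    """
    return (nucleus["n_genes"] >= min_genes
            and nucleus["doublet_score"] <= max_doublet_score
            and nucleus["mito_fraction"] <= max_mito_fraction)


nuclei = [
    {"id": "n1", "n_genes": 4200, "doublet_score": 0.3, "mito_fraction": 0.005},
    {"id": "n2", "n_genes": 1200, "doublet_score": 0.1, "mito_fraction": 0.001},  # too few genes
    {"id": "n3", "n_genes": 3000, "doublet_score": 1.2, "mito_fraction": 0.010},  # fails only the 10x doublet cutoff
]
kept_scone = [n["id"] for n in nuclei if passes_qc(n, **SCONE_SEQ_QC)]
kept_tenx = [n["id"] for n in nuclei if passes_qc(n, **TENX_QC)]
print(kept_scone, kept_tenx)
```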
Visualization
Plots were created using the ggplot2 R package. Heatmaps were created with the ComplexHeatmap package (78). Figures were prepared in Inkscape.
Immunohistochemistry (IHC) analysis
Slides were obtained from Prince of Wales Hospital. Xylene and ethanol were used to remove the wax. Antigen retrieval was performed with sodium citrate buffer (Thermo Fisher Scientific) at 98°C for 15 min. Blocking was performed in 10% normal serum (goat and donkey, Abcam) with 1% BSA in PBST buffer (0.05% Triton X-100). IDH1(R132H) antibody (1:40; Dianova) and ADCY8 antibody (1:200; Abcam) were added to slides and incubated at 4°C overnight in a humid box. Secondary antibodies (1:400; anti-mouse, anti-rabbit; Thermo Fisher Scientific) were used to provide the fluorescent signal. The mounting buffer with DAPI (Abcam) was used to stain the nucleus and retain fluorescence. Images were taken with Zeiss Axio Scan.Z1 Slide Scanner with a ×20 objective (Zeiss).
Statistical analysis
Between-group differences in discrete values were calculated using the chi-square test. Differences in parametric distributions (gene detection number, CVs, and Pearson correlation coefficient) were quantified using the Student’s t test.
Acknowledgments
We thank Y. Huang (Peking University) and J. Wang (Tsinghua University) for ongoing support and advice on this project. We also thank T. H. T. Cheung (HKUST) for providing us with the Tn5 enzyme that was used in a portion of the experiments and K. Wai Lo (CUHK) for providing the NPC43 cell line used in the study. We thank D. C. Y. Leung (HKUST) for reading and feedback on the manuscript. We also want to thank Z. Wen (HKUST), Y. Wei (CUHK), and C. Yang (HKUST) for discussion and advice on the data analysis.
Funding: This work was supported by: the Hong Kong Research Grant Council (26101016, 16101118, T12-704/16R-2, and C4001-18G), a Hong Kong University of Science and Technology’s startup grant (R9364), the Hong Kong University of Science and Technology Big Data for Bio Intelligence Laboratory (BDBI), The Hong Kong University of Science and Technology Center for Aging Science, The Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), and The Chau Hoi Shuen Foundation (to A.R.W.); the Hong Kong Research Grant Council (26102719 and 16101021) and the Mainland-Hong Kong Joint Funding Scheme (MHP/004/19) (to J.W.); The Lo Ka Chung Foundation through the Hong Kong Epigenomics Project (to A.R.W. and J.W.); and the Health and Medical Research Fund (HMRF; 07180736) (to H.-K.N.).
Author contributions: L.Y. and A.R.W. designed the study. L.Y. developed scONE-seq chemistry, generated libraries, performed computational analysis and IHC staining, prepared figures, and wrote the manuscript text. X.W. provided input to scONE-seq chemistry, generated libraries, performed IHC staining, and prepared figures. Q.M. performed computational analysis. S.S.T.T. prepared the nuclei 10x library and edited the manuscript. D.S.C.L. provided input to the scONE-seq chemistry. A.K.Y.C., W.S.P., H.-K.N., and D.T.M.C. prepared patient samples, metadata, and FFPE sections and assessed histopathology. J.W. and A.R.W. supervised the work. A.R.W. also conceived the study, obtained funding, analyzed the data, prepared figures, and wrote the manuscript. All authors edited and approved the final manuscript.
Competing interests: A.R.W. and L.Y. have filed a USPTO patent on scONE-seq, application number PCT/IB2021/000713, The Hong Kong University of Science and Technology, filed 19 October 2021. All other authors declare that they have no competing interests.
Data and materials availability: The data generated in this study have been deposited into the NCBI Gene Expression Omnibus with accession code GSE185269. Code and scripts used for performing the analysis can be found at GitHub: https://github.com/0YuLei0/scONE-seq-data-processing. A permanent code repository can be found at Zenodo: https://doi.org/10.5281/zenodo.6796059. All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials.
Supplementary Materials
This PDF file includes:
Supplementary Text
Figs. S1 to S13
Tables S1 to S4
REFERENCES AND NOTES
- 1.Wu A. R., Wang J., Streets A. M., Huang Y.,Single-cell transcriptional analysis. Annu. Rev. Anal. Chem. 10,439–462 (2017). [DOI] [PubMed] [Google Scholar]
- 2.Gawad C., Koh W., Quake S. R.,Single-cell genome sequencing: Current state of the science. Nat. Rev. Genet. 17,175–188 (2016). [DOI] [PubMed] [Google Scholar]
- 3.Nam A. S., Chaligne R., Landau D. A.,Integrating genetic and non-genetic determinants of cancer evolution by single-cell multi-omics. Nat. Rev. Genet. 22,3–18 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Villani A.-C., Satija R., Reynolds G., Sarkizova S., Shekhar K., Fletcher J., Griesbeck M., Butler A., Zheng S., Lazo S., Jardine L., Dixon D., Stephenson E., Nilsson E., Grundberg I., McDonald D., Filby A., Li W., de Jager P. L., Rozenblatt-Rosen O., Lane A. A., Haniffa M., Regev A., Hacohen N.,Single-cell RNA-seq reveals new types of human blood dendritic cells, monocytes, and progenitors. Science 356,eaah4573 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Treutlein B., Brownfield D. G., Wu A. R., Neff N. F., Mantalas G. L., Espinoza F. H., Desai T. J., Krasnow M. A., Quake S. R.,Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq. Nature 509,371–375 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Venteicher A. S., Tirosh I., Hebert C., Yizhak K., Neftel C., Filbin M. G., Hovestadt V., Escalante L. E., Shaw M. K. L., Rodman C., Gillespie S. M., Dionne D., Luo C. C., Ravichandran H., Mylvaganam R., Mount C., Onozato M. L., Nahed B. V., Wakimoto H., Curry W. T., Iafrate A. J., Rivera M. N., Frosch M. P., Golub T. R., Brastianos P. K., Getz G., Patel A. P., Monje M., Cahill D. P., Rozenblatt-Rosen O., Louis D. N., Bernstein B. E., Regev A., Suvà M. L.,Decoupling genetics, lineages, and microenvironment in IDH-mutant gliomas by single-cell RNA-seq. Science 355,eaai8478 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Shami A. N., Zheng X., Munyoki S. K., Ma Q., Manske G. L., Green C. D., Sukhwani M., Orwig K. E., Li J. Z., Hammoud S. S.,Single-cell RNA sequencing of human, macaque, and mouse testes uncovers conserved and divergent features of mammalian spermatogenesis. Dev. Cell 54,529–547.e12 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.McGranahan N., Swanton C.,Clonal heterogeneity and tumor evolution: Past, present, and the future. Cell 168,613–628 (2017). [DOI] [PubMed] [Google Scholar]
- 9.Kreso A., Dick J. E.,Evolution of the cancer stem cell model. Cell Stem Cell 14,275–291 (2014). [DOI] [PubMed] [Google Scholar]
- 10.Prager B. C., Xie Q., Bao S., Rich J. N.,Cancer stem cells: The architects of the tumor ecosystem. Cell Stem Cell 24,41–53 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Shaffer S. M., Dunagin M. C., Torborg S. R., Torre E. A., Emert B., Krepler C., Beqiri M., Sproesser K., Brafford P. A., Xiao M., Eggan E., Anastopoulos I. N., Vargas-Garcia C. A., Singh A., Nathanson K. L., Herlyn M., Raj A.,Rare cell variability and drug-induced reprogramming as a mode of cancer drug resistance. Nature 546,431–435 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12. Neftel C., Laffy J., Filbin M. G., Hara T., Shore M. E., Rahme G. J., Richman A. R., Silverbush D., Shaw M. K. L., Hebert C. M., Dewitt J., Gritsch S., Perez E. M., Castro L. N. G., Lan X., Druck N., Rodman C., Dionne D., Kaplan A., Bertalan M. S., Small J., Pelton K., Becker S., Bonal D., Nguyen Q.-D., Servis R. L., Fung J. M., Mylvaganam R., Mayr L., Gojo J., Haberler C., Geyeregger R., Czech T., Slavc I., Nahed B. V., Curry W. T., Carter B. S., Wakimoto H., Brastianos P. K., Batchelor T. T., Stemmer-Rachamimov A., Martinez-Lage M., Frosch M. P., Stamenkovic I., Riggi N., Rheinbay E., Monje M., Rozenblatt-Rosen O., Cahill D. P., Patel A. P., Hunter T., Verma I. M., Ligon K. L., Louis D. N., Regev A., Bernstein B. E., Tirosh I., Suvà M. L., An integrative model of cellular states, plasticity, and genetics for glioblastoma. Cell 178, 835–849.e21 (2019).
- 13. Müller S., Kohanbash G., Liu S. J., Alvarado B., Carrera D., Bhaduri A., Watchmaker P. B., Yagnik G., di Lullo E., Malatesta M., Amankulor N. M., Kriegstein A. R., Lim D. A., Aghi M., Okada H., Diaz A., Single-cell profiling of human gliomas reveals macrophage ontogeny as a basis for regional differences in macrophage activation in the tumor microenvironment. Genome Biol. 18, 234 (2017).
- 14. Zhang L., Li Z., Skrzypczynska K. M., Fang Q., Zhang W., O’Brien S. A., He Y., Wang L., Zhang Q., Kim A., Gao R., Orf J., Wang T., Sawant D., Kang J., Bhatt D., Lu D., Li C.-M., Rapaport A. S., Perez K., Ye Y., Wang S., Hu X., Ren X., Ouyang W., Shen Z., Egen J. G., Zhang Z., Yu X., Single-cell analyses inform mechanisms of myeloid-targeted therapies in colon cancer. Cell 181, 442–459.e29 (2020).
- 15. Minussi D. C., Nicholson M. D., Ye H., Davis A., Wang K., Baker T., Tarabichi M., Sei E., Du H., Rabbani M., Peng C., Hu M., Bai S., Lin Y.-W., Schalck A., Multani A., Ma J., McDonald T. O., Casasent A., Barrera A., Chen H., Lim B., Arun B., Meric-Bernstam F., Van Loo P., Michor F., Navin N. E., Breast tumours maintain a reservoir of subclonal diversity during expansion. Nature 592, 302–308 (2021).
- 16. Gao R., Davis A., McDonald T. O., Sei E., Shi X., Wang Y., Tsai P.-C., Casasent A., Waters J., Zhang H., Meric-Bernstam F., Michor F., Navin N. E., Punctuated copy number evolution and clonal stasis in triple-negative breast cancer. Nat. Genet. 48, 1119–1130 (2016).
- 17. Macaulay I. C., Haerty W., Kumar P., Li Y. I., Hu T. X., Teng M. J., Goolam M., Saurat N., Coupland P., Shirley L. M., Smith M., van der Aa N., Banerjee R., Ellis P. D., Quail M. A., Swerdlow H. P., Zernicka-Goetz M., Livesey F. J., Ponting C. P., Voet T., G&T-seq: Parallel sequencing of single-cell genomes and transcriptomes. Nat. Methods 12, 519–522 (2015).
- 18. Dey S. S., Kester L., Spanjaard B., Bienko M., van Oudenaarden A., Integrated genome and transcriptome sequencing of the same cell. Nat. Biotechnol. 33, 285–289 (2015).
- 19. Hou Y., Guo H., Cao C., Li X., Hu B., Zhu P., Wu X., Wen L., Tang F., Huang Y., Peng J., Single-cell triple omics sequencing reveals genetic, epigenetic, and transcriptomic heterogeneity in hepatocellular carcinomas. Cell Res. 26, 304–319 (2016).
- 20. Li L., Guo F., Gao Y., Ren Y., Yuan P., Yan L., Li R., Lian Y., Li J., Hu B., Gao J., Wen L., Tang F., Qiao J., Single-cell multi-omics sequencing of human early embryos. Nat. Cell Biol. 20, 847–858 (2018).
- 21. Zachariadis V., Cheng H., Andrews N., Enge M., A highly scalable method for joint whole-genome sequencing and gene-expression profiling of single cells. Mol. Cell 80, 541–553.e5 (2020).
- 22. Yin Y., Jiang Y., Lam K. W. G., Berletch J. B., Disteche C. M., Noble W. S., Steemers F. J., Camerini-Otero R. D., Adey A. C., Shendure J., High-throughput single-cell sequencing with linear amplification. Mol. Cell 76, 676–690.e10 (2019).
- 23. Kivioja T., Vähärautio A., Karlsson K., Bonke M., Enge M., Linnarsson S., Taipale J., Counting absolute numbers of molecules using unique molecular identifiers. Nat. Methods 9, 72–74 (2012).
- 24. Klein A. M., Mazutis L., Akartuna I., Tallapragada N., Veres A., Li V., Peshkin L., Weitz D. A., Kirschner M. W., Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
- 25. Macosko E. Z., Basu A., Satija R., Nemesh J., Shekhar K., Goldman M., Tirosh I., Bialas A. R., Kamitaki N., Martersteck E. M., Trombetta J. J., Weitz D. A., Sanes J. R., Shalek A. K., Regev A., McCarroll S. A., Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
- 26. Picelli S., Björklund A. K., Reinius B., Sagasser S., Winberg G., Sandberg R., Tn5 transposase and tagmentation procedures for massively scaled sequencing projects. Genome Res. 24, 2033–2040 (2014).
- 27. Sheng K., Cao W., Niu Y., Deng Q., Zong C., Effective detection of variation in single-cell transcriptomes using MATQ-seq. Nat. Methods 14, 267–270 (2017).
- 28. Tang F., Barbacioru C., Wang Y., Nordman E., Lee C., Xu N., Wang X., Bodeau J., Tuch B. B., Siddiqui A., Lao K., Surani M. A., mRNA-seq whole-transcriptome analysis of a single cell. Nat. Methods 6, 377–382 (2009).
- 29. Picelli S., Björklund Å. K., Faridani O. R., Sagasser S., Winberg G., Sandberg R., Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods 10, 1096–1098 (2013).
- 30. Fan X., Zhang X., Wu X., Guo H., Hu Y., Tang F., Huang Y., Single-cell RNA-seq transcriptome analysis of linear and circular RNAs in mouse preimplantation embryos. Genome Biol. 16, 148 (2015).
- 31. Thrupp N., Sala Frigerio C., Wolfs L., Skene N. G., Fattorelli N., Poovathingal S., Fourne Y., Matthews P. M., Theys T., Mancuso R., de Strooper B., Fiers M., Single-nucleus RNA-seq is not suitable for detection of microglial activation genes in humans. Cell Rep. 32, 108189 (2020).
- 32. Zahn H., Steif A., Laks E., Eirew P., Vaninsberghe M., Shah S. P., Aparicio S., Hansen C. L., Scalable whole-genome single-cell library preparation without preamplification. Nat. Methods 14, 167–173 (2017).
- 33. Loi D. S. C., Yu L., Wu A. R., Effective ribosomal RNA depletion for single-cell total RNA-seq by scDASH. PeerJ 9, e10717 (2021).
- 34. Marzluff W. F., Wagner E. J., Duronio R. J., Metabolism and regulation of canonical histone mRNAs: Life without a poly(A) tail. Nat. Rev. Genet. 9, 843–854 (2008).
- 35. Farber D. L., Yudanin N. A., Restifo N. P., Human memory T cells: Generation, compartmentalization and homeostasis. Nat. Rev. Immunol. 14, 24–35 (2014).
- 36. Lin W., Yip Y. L., Jia L., Deng W., Zheng H., Dai W., Ko J. M. Y., Lo K. W., Chung G. T. Y., Yip K. Y., Lee S.-D., Kwan J. S.-H., Zhang J., Liu T., Chan J. Y.-W., Kwong D. L.-W., Lee V. H.-F., Nicholls J. M., Busson P., Liu X., Chiang A. K. S., Hui K. F., Kwok H., Cheung S. T., Cheung Y. C., Chan C. K., Li B., Cheung A. L.-M., Hau P. M., Zhou Y., Tsang C. M., Middeldorp J., Chen H., Lung M. L., Tsao S. W., Establishment and characterization of new tumor xenografts and cancer cell lines from EBV-positive nasopharyngeal carcinoma. Nat. Commun. 9, 4663 (2018).
- 37. Ben-David U., Siranosian B., Ha G., Tang H., Oren Y., Hinohara K., Strathdee C. A., Dempster J., Lyons N. J., Burns R., Nag A., Kugener G., Cimini B., Tsvetkov P., Maruvka Y. E., O’Rourke R., Garrity A., Tubelli A. A., Bandopadhayay P., Tsherniak A., Vazquez F., Wong B., Birger C., Ghandi M., Thorner A. R., Bittker J. A., Meyerson M., Getz G., Beroukhim R., Golub T. R., Genetic and transcriptional evolution alters cancer cell line drug response. Nature 560, 325–330 (2018).
- 38. Ceccarelli M., Barthel F. P., Malta T. M., Sabedot T. S., Salama S. R., Murray B. A., Morozova O., Newton Y., Radenbaugh A., Pagnotta S. M., Anjum S., Wang J., Manyam G., Zoppoli P., Ling S., Rao A. A., Grifford M., Cherniack A. D., Zhang H., Poisson L., Carlotti C. G. Jr., da Cunha Tirapelli D. P., Rao A., Mikkelsen T., Lau C. C., Yung W. K. A., Rabadan R., Huse J., Brat D. J., Lehman N. L., Barnholtz-Sloan J. S., Zheng S., Hess K., Rao G., Meyerson M., Beroukhim R., Cooper L., Akbani R., Wrensch M., Haussler D., Aldape K. D., Laird P. W., Gutmann D. H.; TCGA Research Network, Noushmehr H., Iavarone A., Verhaak R. G. W., Molecular profiling reveals biologically discrete subsets and pathways of progression in diffuse glioma. Cell 164, 550–563 (2016).
- 39. Hu H., Mu Q., Bao Z., Chen Y., Liu Y., Chen J., Wang K., Wang Z., Nam Y., Jiang B., Sa J. K., Cho H.-J., Her N.-G., Zhang C., Zhao Z., Zhang Y., Zeng F., Wu F., Kang X., Liu Y., Qian Z., Wang Z., Huang R., Wang Q., Zhang W., Qiu X., Li W., Nam D.-H., Fan X., Wang J., Jiang T., Mutational landscape of secondary glioblastoma guides MET-targeted trial in brain tumor. Cell 175, 1665–1678.e18 (2018).
- 40. Habib N., Avraham-Davidi I., Basu A., Burks T., Shekhar K., Hofree M., Choudhury S. R., Aguet F., Gelfand E., Ardlie K., Weitz D. A., Rozenblatt-Rosen O., Zhang F., Regev A., Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat. Methods 14, 955–958 (2017).
- 41. Mu Q., Chai R., Liu H., Yang Y., Zhao Z., Lui M. H., Bao Z., Song D., Jiang B., Sa J. K., Cho H. J., Chang Y., Chan K. H. Y., Loi D. S. C., Tam S. S. T., Chan A. K. Y., Wu A. R., Poon W. S., Ng H. K., Chan D. T. M., Iavarone A., Nam D.-H., Jiang T., Wang J., MYC amplification at diagnosis drives therapy-induced hypermutation of recurrent glioma. ResearchSquare 10.21203/rs.3.rs-138020/v1 (2021).
- 42. Wang J., Cazzato E., Ladewig E., Frattini V., Rosenbloom D. I. S., Zairis S., Abate F., Liu Z., Elliott O., Shin Y.-J., Lee J.-K., Lee I.-H., Park W.-Y., Eoli M., Blumberg A. J., Lasorella A., Nam D. H., Finocchiaro G., Iavarone A., Rabadan R., Clonal evolution of glioblastoma under therapy. Nat. Genet. 48, 768–776 (2016).
- 43. Louis D. N., Perry A., Wesseling P., Brat D. J., Cree I. A., Figarella-Branger D., Hawkins C., Ng H. K., Pfister S. M., Reifenberger G., Soffietti R., von Deimling A., Ellison D. W., The 2021 WHO classification of tumors of the central nervous system: A summary. Neuro Oncol. 23, 1231–1251 (2021).
- 44. Johnson B. E., Mazor T., Hong C., Barnes M., Aihara K., Lean C. Y. M., Fouse S. D., Yamamoto S., Ueda H., Tatsuno K., Asthana S., Jalbert L. E., Nelson S. J., Bollen A. W., Gustafson W. C., Charron E., Weiss W. A., Smirnov I. V., Song J. S., Olshen A. B., Cha S., Zhao Y., Moore R. A., Mungall A. J., Jones S. J. M., Hirst M., Marra M. A., Saito N., Aburatani H., Mukasa A., Berger M. S., Chang S. M., Taylor B. S., Costello J. F., Mutational analysis reveals the origin and therapy-driven evolution of recurrent glioma. Science 343, 189–193 (2014).
- 45. Venkataramani V., Tanev D. I., Kuner T., Wick W., Winkler F., Synaptic input to brain tumors: Clinical implications. Neuro Oncol. 23, 23–33 (2021).
- 46. Jung E., Alfonso J., Osswald M., Monyer H., Wick W., Winkler F., Emerging intersections between neuroscience and glioma biology. Nat. Neurosci. 22, 1951–1960 (2019).
- 47. Venkataramani V., Tanev D. I., Strahle C., Studier-Fischer A., Fankhauser L., Kessler T., Körber C., Kardorff M., Ratliff M., Xie R., Horstmann H., Messer M., Paik S. P., Knabbe J., Sahm F., Kurz F. T., Acikgöz A. A., Herrmannsdörfer F., Agarwal A., Bergles D. E., Chalmers A., Miletic H., Turcan S., Mawrin C., Hänggi D., Liu H.-K., Wick W., Winkler F., Kuner T., Glutamatergic synaptic input to glioma cells drives brain tumour progression. Nature 573, 532–538 (2019).
- 48. Venkatesh H. S., Morishita W., Geraghty A. C., Silverbush D., Gillespie S. M., Arzt M., Tam L. T., Espenel C., Ponnuswami A., Ni L., Woo P. J., Taylor K. R., Agarwal A., Regev A., Brang D., Vogel H., Hervey-Jumper S., Bergles D. E., Suvà M. L., Malenka R. C., Monje M., Electrical and synaptic integration of glioma into neural circuits. Nature 573, 539–545 (2019).
- 49. Henley J. M., Wilkinson K. A., Synaptic AMPA receptor composition in development, plasticity and disease. Nat. Rev. Neurosci. 17, 337–350 (2016).
- 50. Diering G. H., Huganir R. L., The AMPA receptor code of synaptic plasticity. Neuron 100, 314–329 (2018).
- 51. Abate A. R., Hung T., Marya P., Agresti J. J., Weitz D. A., High-throughput injection with microfluidics using picoinjectors. Proc. Natl. Acad. Sci. U.S.A. 107, 19163–19166 (2010).
- 52. Cao J., Cusanovich D. A., Ramani V., Aghamirzaie D., Pliner H. A., Hill A. J., Daza R. M., McFaline-Figueroa J. L., Packer J. S., Christiansen L., Steemers F. J., Adey A. C., Trapnell C., Shendure J., Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361, 1380–1385 (2018).
- 53. Lareau C. A., Duarte F. M., Chew J. G., Kartha V. K., Burkett Z. D., Kohlway A. S., Pokholok D., Aryee M. J., Steemers F. J., Lebofsky R., Buenrostro J. D., Droplet-based combinatorial indexing for massive-scale single-cell chromatin accessibility. Nat. Biotechnol. 37, 916–924 (2019).
- 54. Stoeckius M., Hafemeister C., Stephenson W., Houck-Loomis B., Chattopadhyay P. K., Swerdlow H., Satija R., Smibert P., Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).
- 55. Corces M. R., Trevino A. E., Hamilton E. G., Greenside P. G., Sinnott-Armstrong N. A., Vesuna S., Satpathy A. T., Rubin A. J., Montine K. S., Wu B., Kathiria A., Cho S. W., Mumbach M. R., Carter A. C., Kasowski M., Orloff L. A., Risca V. I., Kundaje A., Khavari P. A., Montine T. J., Greenleaf W. J., Chang H. Y., An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 14, 959–962 (2017).
- 56. Slyper M., Porter C. B. M., Ashenberg O., Waldman J., Drokhlyansky E., Wakiro I., Smillie C., Smith-Rosario G., Wu J., Dionne D., Vigneau S., Jané-Valbuena J., Tickle T. L., Napolitano S., Su M. J., Patel A. G., Karlstrom A., Gritsch S., Nomura M., Waghray A., Gohil S. H., Tsankov A. M., Jerby-Arnon L., Cohen O., Klughammer J., Rosen Y., Gould J., Nguyen L., Hofree M., Tramontozzi P. J., Li B., Wu C. J., Izar B., Haq R., Hodi F. S., Yoon C. H., Hata A. N., Baker S. J., Suvà M. L., Bueno R., Stover E. H., Clay M. R., Dyer M. A., Collins N. B., Matulonis U. A., Wagle N., Johnson B. E., Rotem A., Rozenblatt-Rosen O., Regev A., A single-cell and single-nucleus RNA-Seq toolbox for fresh and frozen human tumors. Nat. Med. 26, 792–802 (2020).
- 57. Hagemann-Jensen M., Ziegenhain C., Sandberg R., Scalable single-cell RNA sequencing from full transcripts with Smart-seq3xpress. Nat. Biotechnol. 40, 1452–1457 (2022).
- 58. Gu W., Crawford E. D., O’Donovan B. D., Wilson M. R., Chow E. D., Retallack H., DeRisi J. L., Depletion of abundant sequences by hybridization (DASH): Using Cas9 to remove unwanted high-abundance species in sequencing libraries and molecular counting applications. Genome Biol. 17, 41 (2016).
- 59. Chen S., Zhou Y., Chen Y., Gu J., fastp: An ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
- 60. Martin M., Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12 (2011).
- 61. Li H., Durbin R., Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
- 62. Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R.; 1000 Genome Project Data Processing Subgroup, The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
- 63. Smith T., Heger A., Sudbery I., UMI-tools: Modeling sequencing errors in unique molecular identifiers to improve quantification accuracy. Genome Res. 27, 491–499 (2017).
- 64. Garvin T., Aboukhalil R., Kendall J., Baslan T., Atwal G. S., Hicks J., Wigler M., Schatz M. C., Interactive analysis and assessment of single-cell copy-number variations. Nat. Methods 12, 1058–1060 (2015).
- 65. Zaccaria S., Raphael B. J., Characterizing allele- and haplotype-specific copy numbers in single cells with CHISEL. Nat. Biotechnol. 39, 207–214 (2021).
- 66. Nilsen G., Liestøl K., van Loo P., Moen Vollan H. K., Eide M. B., Rueda O. M., Chin S. F., Russell R., Baumbusch L. O., Caldas C., Børresen-Dale A. L., Lingjærde O. C., Copynumber: Efficient algorithms for single- and multi-track copy number segmentation. BMC Genomics 13, 591 (2012).
- 67. Talevich E., Shain A. H., Botton T., Bastian B. C., CNVkit: Genome-wide copy number detection and visualization from targeted DNA sequencing. PLOS Comput. Biol. 12, e1004873 (2016).
- 68. Oesper L., Mahmoody A., Raphael B. J., THetA: Inferring intra-tumor heterogeneity from high-throughput DNA sequencing data. Genome Biol. 14, R80 (2013).
- 69. Dobin A., Davis C. A., Schlesinger F., Drenkow J., Zaleski C., Jha S., Batut P., Chaisson M., Gingeras T. R., STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
- 70. Quinlan A. R., Hall I. M., BEDTools: A flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
- 71. Bray N. L., Pimentel H., Melsted P., Pachter L., Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).
- 72. Patro R., Duggal G., Love M. I., Irizarry R. A., Kingsford C., Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods 14, 417–419 (2017).
- 73. Melsted P., Booeshaghi A. S., Liu L., Gao F., Lu L., Min K. H., da Veiga Beltrame E., Hjörleifsson K. E., Gehring J., Pachter L., Modular, efficient and constant-memory single-cell RNA-seq preprocessing. Nat. Biotechnol. 39, 813–818 (2021).
- 74. Stuart T., Butler A., Hoffman P., Hafemeister C., Papalexi E., Mauck W. M. III, Hao Y., Stoeckius M., Smibert P., Satija R., Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21 (2019).
- 75. Jin S., Guerrero-Juarez C. F., Zhang L., Chang I., Ramos R., Kuan C.-H., Myung P., Plikus M. V., Nie Q., Inference and analysis of cell-cell communication using CellChat. Nat. Commun. 12, 1088 (2021).
- 76. Gao R., Bai S., Henderson Y. C., Lin Y., Schalck A., Yan Y., Kumar T., Hu M., Sei E., Davis A., Wang F., Shaitelman S. F., Wang J. R., Chen K., Moulder S., Lai S. Y., Navin N. E., Delineating copy number and clonal substructure in human tumors from single-cell transcriptomes. Nat. Biotechnol. 39, 599–608 (2021).
- 77. Germain P.-L., Robinson M. D., Lun A., Garcia Meixide C., Macnair W., Doublet identification in single-cell sequencing data using scDblFinder. F1000Res. 10, 979 (2022).
- 78. Gu Z., Eils R., Schlesner M., Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–2849 (2016).
Associated Data
Supplementary Materials
Supplementary Text
Figs. S1 to S13
Tables S1 to S4