Abstract
Conventional methods fall short in unraveling the dynamics of rare cell types related to aging and diseases. Here we introduce EasySci, an advanced single-cell combinatorial indexing strategy for exploring age-dependent cellular dynamics in the mammalian brain. Profiling approximately 1.5 million single-cell transcriptomes and 400,000 chromatin accessibility profiles across diverse mouse brains, we identified over 300 cell subtypes, uncovering their molecular characteristics and spatial locations. This comprehensive view elucidates rare cell types expanded or depleted upon aging. We also investigated cell-type-specific responses to genetic alterations linked to Alzheimer’s disease, identifying associated rare cell types. Additionally, by profiling 118,240 human brain single-cell transcriptomes, we discerned cell- and region-specific transcriptomic changes tied to Alzheimer’s pathogenesis. In conclusion, this research offers a valuable resource for probing cell-type-specific dynamics in both normal and pathological aging.
Subject terms: Transcriptomics, Alzheimer's disease
Single-cell transcriptomes and single-cell chromatin accessibility profiles generated using EasySci provide a global view of aging and Alzheimer’s pathogenesis-associated cell population dynamics in human and mouse brains.
Main
Progressive changes in brain cell populations, which can occur during aging, may contribute to functional decline and increased risks for neurodegenerative diseases such as Alzheimer’s disease (AD)1–4. Although the recent advances in single-cell genomics have created unprecedented opportunities to explore the cell-type-specific dynamics across the entire mammalian brain5–8, most prior studies relied on a relatively shallow sampling of the brain cell populations and failed to reveal rare aging or AD-associated cell types. Additionally, they were technically limited in several ways, including failing to recover isoform-level gene expression patterns and the associated chromatin landscape that regulates cell-type-specific alterations across aging stages.
Here, we introduced EasySci, a cost-effective single-cell profiling strategy based on extensive optimization of single-cell RNA sequencing (RNA-seq) by combinatorial indexing9. While the original method has been widely used to study embryonic and fetal tissues10,11, it remains restricted to gene quantification proximal to the 3’ end and limited in efficiency and cell recovery rate11. EasySci provided improved conditions for cell lysis, fixation, sample preservation, enzymatic reaction, oligonucleotide design, and purification methodologies (Supplementary Table 1). Several test conditions were inspired by optimizations described in recently developed or optimized single-cell techniques12,13. The major features of EasySci are as follows: (i) 1 million single-cell transcriptomes were prepared for ~US $700 (library preparation cost only, not including personnel or sequencing cost; Fig. 1a–c); (ii) reverse transcription (RT) with indexed oligo-dT and random hexamer primers was achieved, thus recovering cell-type-specific gene expression with full gene body coverage (Fig. 1d); (iii) cell recovery rate, as well as the number of transcripts detected per cell, were substantially improved through optimized nuclei storage, enzymatic reactions and improved primer design (Fig. 1e and Extended Data Fig. 1); and (iv) an extensively improved single-cell data processing pipeline was developed for both gene counting and exonic counting using paired-end single-cell RNA-seq data (Methods).
Leveraging the technical innovations during the development of EasySci-RNA, we further optimized the single-cell chromatin accessibility profiling method by combinatorial indexing (sci-ATAC-seq3)14,15. The key optimizations include (i) a tagmentation reaction with indexed Tn5 that is fully compatible with the indexed ligation primers of EasySci-RNA; and (ii) a modified nuclei extraction and cryostorage procedure to further increase the library complexity. (A comprehensive quality comparison with other single-cell sequencing assay for transposase-accessible chromatin (scATAC) protocols is shown in Extended Data Fig. 2.) It is noteworthy that the assay for transposase-accessible chromatin with sequencing (ATAC-seq) signal specificity of EasySci-ATAC parallels that of the original sci-ATAC-seq14,15, albeit lower than that of 10x ATAC-seq, potentially due to the indexed Tn5 used in single-cell combinatorial indexing. The detailed protocols for EasySci are included as supplementary files (Supplementary Protocols 1 and 2) to help individual laboratories cost-efficiently generate gene expression and chromatin accessibility profiles from millions of single cells.
Results
A single-cell catalog of the mouse brain in aging and AD
We first applied EasySci to characterize cell-type-specific gene expression and chromatin accessibility profiles across the entire mouse brain, sampling different ages, sexes and genotypes (Fig. 1f). We collected C57BL/6 wild-type (WT) mouse brains at 3 months (n = 4), 6 months (n = 4) and 21 months (n = 4). To gain insight into the early molecular changes associated with the pathophysiology of AD, two mutants from the same C57BL/6 background at 3 months were included: an early-onset AD (EOAD) model (5xFAD) that overexpresses mutant human amyloid-beta precursor protein and human presenilin 1 harboring multiple AD-associated mutations16; and a late-onset AD (LOAD) model (APOE*4/Trem2*R47H) that carries two of the highest risk factor mutations of LOAD, including a humanized ApoE knock-in allele and missense mutations in the mouse Trem2 gene17,18.
In brief, nuclei were extracted from the whole brain and then deposited to different wells for indexed RT (RNA) or transposition (ATAC), such that the first index indicated the originating sample and assay type of any given well. The resulting EasySci libraries (RNA and ATAC) were sequenced separately, yielding a total of 20 billion paired-end reads. After filtering out low-quality cells and doublets, we recovered gene expression profiles in 1,469,111 single nuclei (a median of 70,589 nuclei per brain sample; Extended Data Fig. 3a) and chromatin accessibility profiles in 376,309 single nuclei (a median of 18,112 nuclei per brain sample, Extended Data Fig. 3b) across conditions. Despite shallow sequencing depth (~4,340 and ~16,000 raw reads per cell for RNA and ATAC, respectively), we recovered an average of 1,788 unique molecular identifiers (UMIs) (RNA, median of 935 UMIs) and 5,515 unique fragments (ATAC, median of 3,918) per nucleus (Extended Data Fig. 3c–f), comparable to other published datasets10,11,14.
With UMAP visualization19 and Louvain clustering20, we identified 31 main cell types by gene expression clusters (a median of 16,370 cells per cell type; Fig. 1g), annotated based on cell-type-specific gene markers2. Each cell type was present in nearly all individuals, except for rare pituitary cells (0.09% of the population), which were absent in 3 out of 20 individuals (Extended Data Fig. 3g). The cell-type-specific fractions in the global cell population ranged from 0.05% (inferior olivary nucleus neurons) to 32.5% (cerebellum granule neurons) (Fig. 1h). An average of 74 marker genes were identified for each main cell type (defined as at least a twofold expression difference between first- and second-ranked cell types; false discovery rate (FDR) of 5%; and transcripts per million (TPM) > 50 in the target cell type; Supplementary Table 2). In addition to the established marker genes, we identified novel markers that were not previously associated with the respective cell types, such as markers for microglia (e.g., Arhgap45 and Wdfy4), astrocytes (e.g., Celrr and Adamts9) and oligodendrocytes (e.g., Sec14l5 and Galnt5) (Extended Data Fig. 3h).
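The three marker-gene criteria above can be sketched as a simple filter. This is a minimal illustration rather than the authors' pipeline: the function name `is_marker`, the input layout (per-cell-type TPM values for one gene plus a precomputed FDR-adjusted q value), and the reading of "first- and second-ranked" as the two highest-expressing cell types are all assumptions.

```python
def is_marker(tpm_by_cell_type, q_value,
              min_fold=2.0, max_fdr=0.05, min_tpm=50.0):
    """Return the cell type this gene marks, or None if it fails a criterion.

    tpm_by_cell_type: dict of cell type name -> aggregated TPM (hypothetical layout).
    q_value: FDR-adjusted P value from a differential expression test run elsewhere.
    """
    # Rank cell types by expression, highest first.
    ranked = sorted(tpm_by_cell_type.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < 2:
        return None
    (top_type, top_tpm), (_, second_tpm) = ranked[0], ranked[1]

    # Criterion: TPM > 50 in the target (top-ranked) cell type.
    if top_tpm <= min_tpm:
        return None
    # Criterion: FDR of 5%.
    if q_value > max_fdr:
        return None
    # Criterion: at least a twofold difference between first and second ranked.
    # A second-ranked TPM of 0 trivially satisfies the fold-change requirement.
    if second_tpm > 0 and top_tpm / second_tpm < min_fold:
        return None
    return top_type


# Toy values, made up for illustration only.
example = {"microglia": 120.0, "astrocytes": 30.0, "oligodendrocytes": 5.0}
print(is_marker(example, q_value=0.01))  # microglia
```

Comparing the top-ranked cell type against the runner-up, rather than against the mean of all others, keeps a gene from qualifying when it is shared by two closely related cell types.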
Several integration analyses were performed to validate the recovered cell types across different layers. First, we applied a deep-learning-based strategy21 to integrate transcriptome and chromatin accessibility profiles, yielding 31 main cell types (Fig. 1g). The gene body accessibility and expression of marker genes across cell types were highly correlated (Fig. 1i), as well as the fraction of each cell type (Pearson correlation r = 0.95, P = 6.68 × 10−16) (Fig. 1j). We further investigated the epigenetic controls of the diverse brain cell types through differential accessibility analysis (Extended Data Fig. 4a). We identified a median of 474 differential accessible peaks per cell type (FDR of 5%, TPM > 20 in the target cell type; Extended Data Fig. 4b,c and Supplementary Table 3). Key cell-type-specific transcription factor (TF) regulators were discovered by correlation analysis between motif accessibility and expression patterns, such as Spi1 in microglia22, Nr4a2 in cortical projection neurons 3 (ref. 23) and Pou4f1 in inferior olivary nucleus neurons24 (Extended Data Fig. 4d).
We next integrated our dataset with a 10x Visium spatial transcriptomics dataset through a modified NNLS approach (Methods). As expected, specific brain cell types were mapped to distinct anatomical locations (Fig. 1k,l), especially for region-specific cell types such as cortical projection neurons (clusters 6–8), cerebellum granule neurons (cluster 3) and hippocampal dentate gyrus neurons (cluster 9). These integration analyses confirmed the annotations and spatial locations of main cell types in our single-cell datasets.
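For readers unfamiliar with non-negative least squares (NNLS), the underlying idea can be sketched as follows: the expression vector of one spatial spot is modeled as a non-negative mixture of per-cell-type reference profiles. The sketch below is plain NNLS solved by projected gradient descent, swapped in for simplicity; it does not reproduce the modified approach described in the Methods, and the function name, matrix layout and toy numbers are assumptions.

```python
def nnls_projected_gradient(A, b, n_iter=2000):
    """Approximately solve min ||A x - b||^2 subject to x >= 0.

    A: list of rows, genes x cell types (reference expression profiles).
    b: list of length genes (expression observed in one spatial spot).
    Projected gradient descent is a simple stand-in for a dedicated NNLS solver.
    """
    n_genes, n_types = len(A), len(A[0])
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so 1 / that value is a safe (if conservative) step size.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n_types
    for _ in range(n_iter):
        residual = [sum(A[g][k] * x[k] for k in range(n_types)) - b[g]
                    for g in range(n_genes)]
        grad = [sum(A[g][k] * residual[g] for g in range(n_genes))
                for k in range(n_types)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[k] - step * grad[k]) for k in range(n_types)]
    return x


# Toy reference: 3 genes x 2 cell types; the spot is a 0.7 / 0.3 mixture.
reference = [[10.0, 0.0], [0.0, 8.0], [2.0, 2.0]]
spot = [7.0, 2.4, 2.0]
weights = nnls_projected_gradient(reference, spot)
fractions = [w / sum(weights) for w in weights]
print([round(f, 3) for f in fractions])
```

On this toy input the recovered fractions should be close to the 0.7 and 0.3 used to construct the spot. In practice a library routine such as `scipy.optimize.nnls` would normally replace the hand-written loop.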
In-depth view of cellular subtypes in the mammalian brain
Rather than performing subclustering analysis with the gene expression alone, we exploited the unique feature of EasySci-RNA (that is, full gene body coverage) by incorporating both gene counts and exonic counts for principal-component analysis followed by unsupervised clustering. The approach substantially increased the clustering resolution, as shown in a microglia subtype example (Fig. 2a,b). Leveraging this subclustering strategy, we identified a total of 359 subclusters, with a median of 1,038 cells in each group (Fig. 2c). All subclusters were contributed by multiple individuals, with a median of nine exonic markers enriched in each subcluster (Extended Data Fig. 5a,b and Supplementary Table 4). Some subtype-specific exonic markers were not detected by conventional differential gene analysis (for example, Map2-ENSMUSE00000443205.3 in microglia-8; Extended Data Fig. 5c). Notably, our strategy favors detecting extremely rare cell types, such as rare pinealocytes (choroid plexus epithelial cells 7, 21 cells, marked by Tph1 and Ddc25) and tanycytes (vascular leptomeningeal cells-2, 35 cells, marked by Fndc3c1, Scn7a26) (Extended Data Fig. 5d–g).
About 75% of the 359 cell subclusters can be validated through integration analysis with other datasets (Fig. 2d). Our initial integration with a single-cell dataset featuring highly detailed cell type annotations2 enables the validation of 112 subclusters, each matching with cell types documented in the previous study2 (Fig. 2e and Supplementary Table 5). These corresponding cell types were further validated by their cell-type-specific markers, exemplified by neuronal intermediate progenitor cells, vascular smooth muscle cells, and olfactory ensheathing cells (Fig. 2f). Next, we integrated the 10x Visium spatial transcriptomics datasets27 and determined the region-specificity of the recovered cell types or subtypes27 (Extended Data Fig. 5h,i). We then expanded the analysis to include an extensive spatial transcriptomics dataset encompassing 75 coronal sections of the mouse brain27,28 and discovered 122 subclusters with high spatial mapping scores (Supplementary Table 5 and Methods). For instance, our analysis revealed that choroid plexus epithelial cells-6 were primarily situated in the lateral ventricle, whereas cortical projection neurons 1-1 were predominantly found in the amygdala (Fig. 2g). As the third approach to confirm these subclusters, we utilized a deep-learning-based method21 to integrate the snRNA-seq and snATAC-seq data from each main cell type and recovered 224 ‘corresponding subclusters’ between the two molecular layers (Fig. 2h,i). As expected, the subclusters validated by ATAC-seq data exhibit more markers than those not validated (Fig. 2j). For example, the chromatin landscape for all 24 subclusters from cortical projection neurons 1 cells was recognized and validated by the significant enrichment of marker gene expression and activity in the target subcluster (Fig. 2k). 
We further explored cis-regulatory elements at the cell subtype resolution by correlation-based linkage analysis and unveiled a global network of putative enhancer-gene pairs shaping brain cell heterogeneity (Extended Data Fig. 6).
We next investigated key molecular programs underlying diverse cellular subtypes by clustering genes based on their expression variance across all 359 cell subclusters (Extended Data Fig. 7). We identified 21 gene modules (GMs), with the largest one (GM1) corresponding to a group of housekeeping genes. Several GMs were enriched in specific cell subtypes, such as the ependymal cell-specific GM29 (GM11), and pituitary cells subtype-6 specific GM (GM9)30. Similar analysis revealed programs in other rare subtypes, such as microglia-13 (GM19), vascular leptomeningeal cells-12 (GM20) and choroid plexus epithelial cells-7 (GM2). Remarkably, rare proliferating cells were identified through a cell-cycle-related GM (GM6), which include both conventional proliferating markers (for example, Mki67), and a group of less-studied lncRNAs (for example, Gm29260 and Gm37065) (Extended Data Fig. 7c and Supplementary Table 6).
Aging-associated population dynamics at subtype resolution
To obtain a global view of brain cell population dynamics across the adult lifespan, we first quantified the cell-type-specific fractions recovered from each individual mouse. Differential abundance analyses were conducted across all 359 subclusters, yielding 45 and 29 significantly changed subclusters during early growth (between 3 and 6 months) and aging (between 6 and 21 months; Fig. 3a and Supplementary Tables 7 and 8), respectively. Significantly changed cell subtypes were strongly correlated between genders (Fig. 3b).
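The first step described above, quantifying subcluster fractions per individual, and a simple effect size between two age groups can be sketched as follows. This is an illustration under stated assumptions: the function names, the pseudo-count and the toy labels are hypothetical, and the statistical test used to call significance is not specified in this section, so the sketch stops at the log2 fold change.

```python
import math
from collections import Counter


def subcluster_fractions(labels):
    """Fraction of one individual's cells assigned to each subcluster."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def log2_fold_change(fractions_a, fractions_b, subcluster, pseudo=1e-6):
    """log2 ratio of mean fractions (group b over group a) for one subcluster.

    fractions_a, fractions_b: lists of per-individual fraction dicts.
    pseudo: small constant (an assumption) that avoids division by zero.
    """
    mean_a = sum(f.get(subcluster, 0.0) for f in fractions_a) / len(fractions_a)
    mean_b = sum(f.get(subcluster, 0.0) for f in fractions_b) / len(fractions_b)
    return math.log2((mean_b + pseudo) / (mean_a + pseudo))


# Toy labels, made up for illustration: subcluster "A" drops from 4% to 1%.
young = [subcluster_fractions(["A"] * 4 + ["B"] * 96) for _ in range(2)]
aged = [subcluster_fractions(["A"] * 1 + ["B"] * 99) for _ in range(2)]
print(round(log2_fold_change(young, aged, "A"), 2))  # -2.0
```

Computing fractions within each individual before averaging keeps animals with more recovered nuclei from dominating the comparison.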
Consistent with the growth of the olfactory bulb (OB) during early development31, we observed significant expansion in all OB neuron subtypes during this phase. Meanwhile, a rare astrocyte subtype (AS-14, Lyn+ Adgrb1+) and a vascular leptomeningeal cell subtype (VLC-4, Sox10+ Mybpc1+) exhibited significant expansion in the same period (Fig. 3c). AS-14 featured genes (for example, Adgrb1, also known as BAI1) involved in the clean-up of apoptotic neuronal debris produced during fast brain growth32, and VLC-4 highly expressed genes (for example, Sox10 and Mybpc1 (refs. 33,34)) involved in the growth of axons35. Both subclusters were spatially mapped to the OB region, suggesting their potential involvement in OB expansion (Fig. 3c). In contrast to the early growth stage, most OB neurons remained relatively stable during aging, with only a few subtypes showing significant changes. Key examples include the expansion of an OB neuron subtype corresponding to excitatory neurons in the mitral cell layer of the OB region36 (OBN 3–3, marked by Cpa6 and Col23a1), and the depletion of OB neuroblasts2,37 (OBN 1–11, marked by Robo2 and Prokr2). Integration analysis with spatial transcriptomics datasets indicates that these cell types were mapped to different regions of the OB (Fig. 3c).
More than twenty brain cell subtypes showed a marked reduction across the adult lifespan. For example, the most depleted populations in the aged brain include OB neuroblasts (OBN 1–11, marked by Prokr2 and Robo2 (refs. 2,37)), OB neuronal progenitor cells (OBN 1–17, marked by Mki67 and Egfr38), and dentate gyrus neuroblasts (DGN-8, marked by Sema3c and Igfbpl1 (ref. 39)) (Fig. 3d). DG neuroblasts declined even during early growth, suggesting an earlier decline of DG neurogenesis compared to OB neurogenesis. In contrast to the age-associated depletion of neurogenesis progenitors, oligodendrocyte progenitors (OPC-4, marked by Pdgfra and Mki67) remained relatively stable. However, newly formed oligodendrocytes (OLG-6, marked by Prom1 and Tcf7l1 (refs. 38,40)) and committed oligodendrocyte precursors (OPC-6, marked by Bmp4 and Enpp6 (refs. 38,40,41)) decreased during aging, indicating impaired oligodendrocyte differentiation. The age-associated population dynamics were further validated using the scATAC-seq dataset (Fig. 3d and Extended Data Fig. 8a,b) and our companion study in which we tracked cell dynamics via metabolic labeling42. Furthermore, we identified subtype-specific TF regulators using both gene expression and TF motif accessibility. These include recognized regulators of neurogenesis (for instance, Sox2 and E2f2 (refs. 43,44)), demonstrating the potential of our datasets to unveil key epigenetic signatures of aging-associated cell subtypes (Fig. 3e).
A total of 14 cell subtypes notably expanded in the aged brain, such as a microglia subtype (MG-9, Apoe+, Csf1+) corresponding to a previously reported DAM45, and a reactive oligodendrocyte subtype (OLG-7, C4b+, Serpina3n+46,47). With the scATAC-seq dataset, we further confirmed its expansion (Fig. 3f and Extended Data Fig. 8b,c) and identified its associated TFs. For example, the OLG-7 associated TF, Stat3 (Fig. 3e), plays a critical role in regulating inflammation and immunity in the brain48. We also performed a spatial transcriptomics experiment using adult and aged mouse brains. Strikingly, we detected a significant enrichment of the reactive oligodendrocyte-specific markers (for example, C4b and Serpina3n) around the subventricular zone (SVZ) (Fig. 3g,h), indicating an age-related activation of inflammation signaling around the adult neurogenesis niche.
We next explored the subtype manifestation of aging signatures by differentially expressed (DE) gene analysis. We identified 7,135 aging-associated signatures across 359 subclusters (Supplementary Table 9 and Extended Data Fig. 9a). Of the 580 genes significantly altered in multiple (≥3) subtypes, 241 showed consistent directions. For example, Nr4a3 (a gene involved in DNA repair49) was significantly decreased in aged neuron subtypes (striatal neurons, OB neurons, and interneurons). Hdac4, encoding a histone deacetylase50, decreased in aged astrocytes and ependymal cells. Insulin-degrading enzyme (IDE), involved in amyloid-beta clearance51, also increased in neuron subtypes. We also identified age-related changes in non-coding RNAs, many of which showed high cell-type specificity (for example, B230209E15Rik in cortical projection neuron subtypes) but were not well characterized previously (Extended Data Fig. 9b).
AD pathogenesis-associated gene signatures and cell subtypes
Through comparison of subcluster fractions in two AD models to age-matched WT controls (3 months old), we detected 16 and 14 significantly changed subclusters (FDR of 5%, at least twofold change) in the EOAD (5xFAD) model and LOAD (APOE*4/Trem2*R47H) model, respectively (Fig. 4a and Supplementary Tables 10 and 11). Most significantly altered subtypes correlated between genders (Fig. 4b) and between the two AD models, even though they had distinct genetic perturbations in different cell types (Fig. 4c). For example, a rare choroid plexus epithelial cell subtype (CPEC-4) was strongly depleted (by more than twofold) in both models. This cell type is marked by significant enrichment of multiple mitochondrial genes linked to neuroprotective factors against neurodegeneration (for example, mt-Rnr2 (ref. 52)) or Tau protein levels in cerebrospinal fluid (mt-Rnr1 and mt-Nd5 (ref. 53)). Through spatial transcriptomics analysis, we verified its location around the SVZ and confirmed its depletion in the EOAD (5xFAD) model, suggesting that mitochondrial dysfunction in choroid plexus epithelial cells plays a role in neurodegenerative diseases (Fig. 4d,e).
By contrast, another choroid plexus epithelial cell subtype (CPEC-6; marked by Sptlc3+ (ref. 54), Fer1l6+) expanded in both AD models (over twofold increase) (Fig. 4b). A similar expansion was observed in a rare interbrain and midbrain neuron subtype (IMN 1–13, marked by Col25a1+, Ndrg1+) that expresses Col25a1, a membrane-associated collagen reported to promote intracellular amyloid formation in mouse models55 (Fig. 4c). Spatial transcriptomic analysis confirmed the up-regulation of IMN 1–13 specific gene markers in the thalamus region of the 5xFAD mouse brain (Fig. 4d,e), providing further validation of the AD-related neuron subtype change. Additionally, a septal nuclei neuron subtype, IMN 2-9 (marked by Prdm16 and Ano2), which in both AD models significantly overexpressed GMs related to axonogenesis (for example, Nrp1 and Slit2) and synaptogenesis (for example, Ptprd and Nrxn1) (ref. 56), was also significantly expanded in both AD models (Fig. 4f–h), aligning with the observed enlargement of the septal nuclei region several years before the onset of memory decline57.
Meanwhile, we observed a significant expansion of microglia subtype 9 (marked by Apoe and Csf1) in early-onset 5xFAD mice, aligning with previous reports45. This disease-associated microglia (DAM) also expanded in aged mice but was not evident in the late-onset APOE*4/Trem2*R47H model at 3 months of age (validated by both RNA and ATAC), potentially indicating a correlation with disease onset (Fig. 4i). We further investigated its DE genes (Extended Data Fig. 8d) and key TFs exhibiting consistency between cell-type-specific gene expression and motif accessibility (Fig. 4j). The enriched TFs were reported to be involved in microglia expansion during aging and AD58–60. Additionally, we quantified the enrichment of genetic variants linked to human traits61 and observed significant enrichment of AD heritability in microglia cells at both the main cell type level and particularly in the microglia-9 subtype, highlighting the role of DAM in AD pathogenesis (Extended Data Fig. 8e,f).
We identified subtype-specific manifestations of key AD-related molecular signatures. In the 5xFAD (EOAD) model, we found 6,792 subcluster-specific DE genes, whereas the APOE*4/Trem2*R47H (LOAD) model had 7,192 subcluster-specific DE genes (Extended Data Fig. 9c,f and Supplementary Tables 12 and 13). The Apoe gene was globally down-regulated in the APOE*4/Trem2*R47H mice, possibly due to the replacement of the Apoe gene with the human sequence. Many AD-associated gene signatures exhibited consistent changes across cellular subtypes, such as increased stress-related markers (for example, Hsp90aa1 and Txnrd1) in neuron subtypes in the 5xFAD mice. The expression of Reln62 decreased in various cell types in both models, aligning with a previous report of Reln depletion before the onset of amyloid-beta pathology in the human frontal cortex63. Other intriguing observations included the down-regulation of Tlcd4, a gene involved in lipid trafficking and metabolism64, in multiple subclusters in the 5xFAD mice. Interestingly, despite the differences in genetics and disease onset between the two AD models, there were remarkably consistent alterations in cell-type-specific molecular profiles. We identified 559 subcluster-specific DE genes shared between both AD mutants, suggesting common molecular mechanisms between early- and late-onset AD models (Extended Data Fig. 9g). We also investigated the connection between aging and AD-associated changes using transcriptomic aging clocks65, revealing significantly accelerated biological aging in both AD models (Extended Data Fig. 9h). Although most cell types demonstrated accelerated aging-related molecular changes, specific cell types only exhibited these signs in LOAD (Extended Data Fig. 9i). This is further validated by consistent cell-type-specific changes of aging-associated gene signatures (for example, Neat1 and Zfp423) in aged and AD models (Extended Data Fig. 9k).
Detection of dysregulated gene signatures in human AD brains
To compare molecular signatures associated with AD pathogenesis in mouse models and human patients, we sequenced a total of 118,240 single-nuclei transcriptomes (a median of 5,585 nuclei per sample, with the sequencing depth of 13,850 raw reads and a median of 1,109 UMIs per nucleus; Extended Data Fig. 10a,b) from 24 human brain samples across two brain regions (hippocampus, superior and middle temporal gyrus (SMTG)), derived from six patients with AD and six age- and gender-matched controls (Supplementary Table 14). Thirteen main cell types were identified through integration analysis with the mouse dataset and validated by the specific expression of known markers (Fig. 5a and Extended Data Fig. 10c–e).
A total of 4,171 and 2,149 cell-type-specific DE genes were identified in the hippocampus and SMTG, respectively (Fig. 5b and Supplementary Table 15). Exactly 349 genes were significantly changed in the same cell type from two distinct regions, among which 332 were altered consistently (Fig. 5c). For example, oligodendrocytes in AD samples from both regions exhibited decreased expression of the oligodendrocyte terminal differentiation factor OPALIN66 and the oxidation stress protector OXR1 (ref. 67). Concurrently, we observed an up-regulation of genes related to programmed cell death (for example, FLCN and RASSF2)68,69, suggesting an elevated stress in oligodendrocytes from AD brains. Other examples include the microglia-specific up-regulation of PTPRG70, and astrocyte-specific down-regulation of several transmembrane transporters (for example, AQP4) and neurotransmitter metabolism enzymes (for example, GLUD1)71,72.
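The cross-region comparison above amounts to intersecting two DE tables on (cell type, gene) and checking whether the direction of change agrees. A minimal sketch follows; the function name, the dictionary layout and the placeholder gene names and fold changes are assumptions made up for illustration, not values from the study.

```python
def concordance(de_region_1, de_region_2):
    """Count DE entries shared between two regions and those with the same sign.

    Each input maps (cell_type, gene) -> log2 fold change (AD versus control)
    and is assumed to contain only significant entries.
    """
    shared = set(de_region_1) & set(de_region_2)
    consistent = {key for key in shared
                  if (de_region_1[key] > 0) == (de_region_2[key] > 0)}
    return len(shared), len(consistent)


# Placeholder entries with made-up fold changes.
hippocampus = {("oligodendrocytes", "GENE_A"): -1.2,
               ("microglia", "GENE_B"): 0.8,
               ("astrocytes", "GENE_C"): -0.5}
smtg = {("oligodendrocytes", "GENE_A"): -0.9,
        ("astrocytes", "GENE_C"): 0.4}
print(concordance(hippocampus, smtg))  # (2, 1)

# Fraction of shared genes altered consistently, from the counts in the text.
print(round(332 / 349, 3))  # 0.951
```

Keying on the (cell type, gene) pair rather than the gene alone matches the text's requirement that the change occur in the same cell type in both regions.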
Interestingly, some AD-associated gene signatures exhibited region-specific expression patterns. For example, GPNMB, encoding a transmembrane glycoprotein associated with microglia activation in AD brains73, showed increased expression in the microglia from the hippocampus but not from the SMTG. On the other hand, MMP24, encoding a member of the metalloproteinase family implicated in AD pathogenesis74, showed increased expression in cortical projection neurons unique within the SMTG (Fig. 5d). Notably, inhibition of MMP24 has been demonstrated to decrease amyloid-beta levels and promote cognitive functions in mouse models75, suggesting its potential role as a novel therapeutic target for AD.
Finally, we explored the human-mouse relevance for AD-associated gene signatures and molecular pathways. Despite differences in the species and disease stages between the two datasets, several genes encoding heat shock proteins (for example, HSP90AA1 and HSPH1) were up-regulated across multiple cell types in both species (Fig. 5e). The elevated chaperone system potentially reduces the formation of toxic oligomeric assemblies in AD brains76, further validating the dysfunction of proteostasis as a molecular marker of AD77. Meanwhile, we identified down-regulated genes in both humans and mice. One of the examples, PLP1, was reported as a subtype-specific driver gene contributing to AD pathogenesis78. Another gene, PDE10A, plays a key role in promoting neuronal survival, with its reduction detected in our datasets and multiple neurodegenerative diseases (for example, Huntington’s disease79 and Parkinson’s disease80) (Fig. 5f). Importantly, the above-mentioned trends were readily validated by another single-cell dataset investigating AD in the human prefrontal cortex6 (Extended Data Fig. 10f). In summary, the human-mouse relevance analysis identified species-conserved genetic programs associated with AD pathogenesis.
Discussion
In this study, we introduced EasySci, a cost-effective technical framework for individual laboratories to generate gene expression and chromatin accessibility profiles from millions of single cells. We used EasySci to analyze 1.5 million single-cell transcriptomes with full gene body coverage and 380,000 chromatin accessibility profiles across mammalian brains of different ages and genotypes. The datasets enable the identification of over 300 cellular subtypes throughout the brain, including highly rare cell types representing less than 0.01% of the total brain cell population. Furthermore, we discovered region-specific effects attributable to aging and AD and examined the manifestation of molecular signatures associated with aging and AD on a cell-type-specific basis.
As highlighted by our subcluster level analysis, the effects of aging and AD on the global brain cell population are profoundly cell-type specific. Although most brain cell types stay relatively stable under various conditions, we identified over 50 cell subtypes exhibiting over twofold change in brains affected by aging and AD models. Many of these cell subtypes were rare and overlooked in conventional single-cell analysis. For example, the aging brain is characterized by the depletion of both rare neuronal progenitor cells and differentiating oligodendrocytes, associated with the enrichment of a C4b+ Serpina3n+ reactive oligodendrocyte subtype surrounding the SVZ, suggesting a potential interplay between oligodendrocytes, localized inflammatory signals and the stem cell niche.
The lack of reliable mouse models remains a big challenge in studying late-onset AD. The novel APOE*4/Trem2*R47H model aims to overcome this limitation by introducing two of the strongest late-onset AD-associated mutations81. We found consistent molecular and cellular population dynamics between the well-established 5xFAD and the novel APOE*4/Trem2*R47H model. For example, we observed shared subtypes that were depleted (for example, mt-Cytb+ mt-Rnr2+ choroid plexus epithelial cell) or enriched (for example, Col25a1+ Ndrg1+ interbrain and midbrain neuron) in both early- and late-onset AD mutant brains. Meanwhile, differences were also observed between the two AD models, as expected given the different onset times. The absence of an increase in the DAM population in the LOAD model may be due to its lack of amyloid deposition82 or to its genetic perturbations, as both Trem2 and Apoe play a role in the activation of this cell population45.
In addition, we investigated AD-associated gene signatures in human brains by profiling over 100,000 single-nucleus transcriptomes derived from 24 human brain samples from control and AD patients, across two distinct anatomical locations. Although most AD-associated gene dynamics are profoundly cell-type and region specific, we identified dysregulated genetic signatures that are conserved between different locations in the human brains. Moreover, integrating the human and mouse brain datasets further revealed molecular pathways shared between human AD patients and mouse AD models, which suggests that the mouse AD model can serve as a model system to investigate the function and regulation of these conserved features associated with AD or neuronal dysfunction.
Of note, there are several inherent limitations of the study. First, the analysis covers only around 2% of the total mouse brain population (estimated at approximately 100 million cells), which means extremely rare cell subtypes may still be overlooked. Additionally, our relatively shallow sequencing depth might hinder the detection of lowly expressed transcripts or minor aging-related cellular state changes. Nevertheless, the validity of our key biological findings is reinforced by the consistent results across different genders (male versus female), genotypes (EOAD versus LOAD), and orthogonal approaches (such as comparisons between single-cell transcriptome, chromatin accessibility or spatial transcriptomics). This lends significant credence to our discoveries, even when considering the limitations of the study.
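The coverage figure quoted above can be checked with a short calculation using the nuclei counts reported in the Results. Which counts enter the numerator is an assumption here; both the RNA-only and the combined RNA plus ATAC totals are shown, and the 100 million denominator is the approximate estimate given in the text.

```python
rna_nuclei = 1_469_111       # single-nucleus transcriptomes after filtering
atac_nuclei = 376_309        # chromatin accessibility profiles after filtering
brain_cells = 100_000_000    # approximate total quoted in the text

rna_only = 100 * rna_nuclei / brain_cells
combined = 100 * (rna_nuclei + atac_nuclei) / brain_cells

print(f"RNA only:   {rna_only:.2f}%")   # 1.47%
print(f"RNA + ATAC: {combined:.2f}%")   # 1.85%
```

Both values round to roughly 2%, in line with the stated limitation. Because the nuclei were pooled across 20 brains, the fraction of any single brain that was sampled is correspondingly smaller.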
In summary, we have showcased the power of highly scalable single-cell genomics to delve into the dynamics of rare cell types, uncovering novel subtypes associated with aging and disease. Though our focus was on brain tissues, the strategic approach could be readily extended to systematically explore cellular states across an entire organism. Such exploration could illuminate the rare vulnerable cell populations to aging and diseases, opening up pathways to develop targeted therapeutic strategies.
Methods
Animals
C57BL/6 WT mouse brains at 3 months (n = 4), 6 months (n = 4) and 21 months (n = 4) were collected in this study. Two AD models at 3 months old from the same C57BL/6 background were added, including an early-onset model (5xFAD, JAX stock #034840) that overexpresses mutant human amyloid-beta precursor protein with the Swedish (K670N, M671L), Florida (I716V) and London (V717I) familial AD (FAD) mutations and human presenilin 1 harboring two FAD mutations, M146L and L286V. Brain-specific overexpression is achieved by neural-specific elements of the mouse Thy1 promoter16. The second, late-onset AD model (APOE*4/Trem2*R47H, JAX stock #028709) in this study carries two of the highest risk factor mutations of LOAD81, including a humanized APOE knock-in allele, where exons 2 and 3 and most of exon 4 of the mouse gene were replaced by the human ortholog including exons 2, 3, 4 and some part of the 3’ UTR. Furthermore, a knock-in missense point mutation in the mouse Trem2 gene was also introduced, consisting of an R47H mutation, along with two other silent mutations. Two male and two female mice are included in each condition. Mice were housed socially. All animal procedures were in accordance with institutional, state, and government regulations and approved under institutional animal care and use committee protocols 21049 and 20047.
EasySci-RNA library preparation
A detailed step-by-step EasySci-RNA protocol is included as Supplementary Protocol 1.
Human brain sample
Twenty-four post-mortem human brain samples across two regions (hippocampus and SMTG) and twelve individuals, including six controls and six patients with AD, ranging from 70 to 94 years of age, were collected from the University of Kentucky AD Center Tissue Bank. Each included participant who donated samples for this study signed a relevant consent form (including consent for unrestricted sharing of clinical, pathological and genetic information for dementia research) that was approved by the UK Internal Review Board (UK IRB #44009).
Computational procedures for processing EasySci-RNA libraries
A custom computational pipeline was developed to process the raw fastq files from the EasySci libraries. First, similar to our previous studies10,11, the barcodes of each read pair were extracted, and both adaptor and barcode sequences were trimmed from the reads. Second, an extra trimming step was implemented using Trim Galore85 with default settings to remove the poly(A) sequences and the low-quality base calls from the cDNA. Afterward, the paired-end sequences were aligned to the genome with the STAR aligner86, and the PCR duplicates were removed. Finally, the reads were split into SAM files per cell, and the gene expression was counted using a custom script. At this stage, the reads from the same cell originating from the short dT and the random hexamer RT primers were counted as independent cells. During the gene counting step, we assigned reads to genes if the aligned coordinates overlapped with the gene locations on the genome. If a read was ambiguous between genes and derived from the short dT RT primer, we assigned the read to the gene with the closest 3’ end; otherwise, the read was labeled as ambiguous and not counted. If no gene was found during this step, we then searched for candidate genes 1,000 bp upstream of the read or genes on the opposite strand. Reads without any overlapping genes were discarded. A similar strategy was used to generate an exon count matrix across cells.
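The read-to-gene assignment rule described above can be illustrated with a minimal Python sketch. This is not the published custom script: the Gene class, the assign_read function, the primer label "short_dT" and the strand-specific overlap test are all hypothetical choices made for illustration, and the fallback search (1,000 bp upstream or opposite strand) is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+" or "-"

    @property
    def three_prime_end(self) -> int:
        # The 3' end is the right edge on the plus strand, the left edge on the minus strand.
        return self.end if self.strand == "+" else self.start


def assign_read(read_start: int, read_end: int, strand: str,
                primer: str, genes: List[Gene]) -> Optional[str]:
    """Return the assigned gene name, or None if unassigned or ambiguous."""
    # Assumption: only genes on the same strand as the read count as overlapping.
    hits = [g for g in genes
            if g.strand == strand and g.start < read_end and read_start < g.end]
    if not hits:
        return None  # the fallback search described in the text is not sketched here
    if len(hits) == 1:
        return hits[0].name
    if primer == "short_dT":
        # Ambiguous short dT read: choose the gene with the closest 3' end.
        read_3p = read_end if strand == "+" else read_start
        return min(hits, key=lambda g: abs(g.three_prime_end - read_3p)).name
    return None  # ambiguous random hexamer read: not counted


genes = [Gene("GeneA", 100, 500, "+"), Gene("GeneB", 400, 900, "+")]
```

With these toy genes, a short dT read at 420-480 on the plus strand overlaps both genes and goes to GeneA, whose 3' end (500) is closer than that of GeneB (900); the same read from a random hexamer primer is left uncounted.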
To compare the performance of EasySci-RNA with the commercial 10x Chromium system, we subsampled ~3,800 raw reads/cell from one randomly selected PCR batch of our large-scale mouse brain experiment, a 10x v2 Chromium dataset83, a 10x v3 dataset (https://www.10xgenomics.com/resources/datasets/5k-adult-mouse-brain-nuclei-isolated-with-chromium-nuclei-isolation-kit-3-1-standard) and a SPLiT-seq dataset33. After the subsampling, the EasySci data were processed with the custom computational pipeline, whereas the 10x Chromium data were processed with 10x Genomics’ Cell Ranger software87. We removed low-quality cells (unassigned reads >30%, UMIs >20,000 and genes <200) and selected the top 1,000 highest-quality cells from the 10x Chromium dataset83 and a deeply sequenced EasySci-RNA library profiling adult mouse brains. Subsequently, we subsampled these cells to different sequencing depths and quantified the unique transcripts/genes detected per cell. Based on this comparison, we recommend a sequencing depth of no less than 5,000 raw reads per nucleus to ensure adequate coverage and detection of a substantial number of unique molecules.
Cell clustering and annotation analysis
After gene counting, we kept the cells with reads identified by both RT primers. We then merged the reads from the same cells. Low-quality cells were removed if they met any of the following criteria: (i) the percentage of unassigned reads >30%, (ii) the number of UMIs >20,000 or (iii) the number of detected genes <200. We then used Scrublet88 to identify and remove potential doublets. To identify distinct clusters of cells, we subjected the 1,469,111 single-cell gene expression profiles to UMAP visualization and Louvain clustering, similar to our previous study10. We then co-embedded our data with the published datasets2,89,90 through Seurat91, and clusters were annotated based on overlapped cell types. The annotations were manually verified and refined based on marker genes. DE genes across cell types were identified with the differentialGeneTest() function of Monocle 2 (ref. 92). To identify cell-type-specific gene markers, we selected genes that were DE across different cell types (FDR of 5%, likelihood ratio test), with over twofold expression difference between the first- and second-ranked cell types and TPM >50 in the first-ranked cell type.
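The expression-based part of the marker selection can be sketched as follows, assuming a genes-by-cell-types matrix of aggregated TPM values. The function name select_markers and the toy values are hypothetical, and the FDR filter from the likelihood ratio test is assumed to have been applied beforehand.

```python
import numpy as np


def select_markers(tpm, cell_types, min_fold=2.0, min_tpm=50.0):
    """Map gene row index -> cell type for genes passing the marker cutoffs.

    tpm: genes x cell types array of aggregated TPM values.
    """
    tpm = np.asarray(tpm, dtype=float)
    order = np.argsort(tpm, axis=1)          # ascending rank of cell types per gene
    rows = np.arange(tpm.shape[0])
    top_idx = order[:, -1]                   # first-ranked cell type
    top = tpm[rows, top_idx]
    second = tpm[rows, order[:, -2]]         # second-ranked cell type
    keep = (top > min_tpm) & (top > min_fold * second)
    return {int(g): cell_types[top_idx[g]] for g in np.flatnonzero(keep)}


cell_types = ["Astrocytes", "Microglia", "Oligodendrocytes"]
toy_tpm = [
    [120, 40, 10],   # passes: 120 > 50 and 120 > 2 x 40
    [60, 55, 1],     # fails the twofold cutoff
    [30, 5, 2],      # fails the TPM cutoff
    [0, 0, 200],     # passes
]
markers = select_markers(toy_tpm, cell_types)
```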
Cell subclustering analysis
We selected each main cell type and applied PCA (combined matrix including the first 30 principal components derived from the gene-level expression matrix and the first 10 principal components derived from the exon-level expression matrix), UMAP and Louvain clustering similarly to the major cluster analysis. We then merged subclusters that were not readily distinguishable in the UMAP space, as described before10. DE genes and exons across cell types were identified with the differentialGeneTest() function of Monocle 2 (ref. 92). To identify subcluster-specific DE genes associated with aging or AD models, we sampled a maximum of 5,000 cells per condition for downstream DE gene analysis using the differentialGeneTest() function of Monocle 2 (ref. 92). The sex of the animals was included as a covariate to reduce sex-specific batch effects.
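One way to picture the combined gene-level and exon-level embedding is the NumPy sketch below. It is an illustration only: the random toy matrices stand in for normalized expression, the helper top_principal_components is hypothetical, and the PCA here is a plain SVD of centered data rather than the routine used in the study.

```python
import numpy as np


def top_principal_components(matrix, n_components):
    """Return cell scores on the first n_components PCs (cells x n_components)."""
    centered = matrix - matrix.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Toy stand-ins (cells x features) for the gene-level and exon-level matrices
# of a single main cell type.
rng = np.random.default_rng(0)
gene_expr = rng.normal(size=(60, 200))
exon_expr = rng.normal(size=(60, 500))

# 30 gene-level PCs followed by 10 exon-level PCs, one row per cell.
combined = np.hstack([
    top_principal_components(gene_expr, 30),
    top_principal_components(exon_expr, 10),
])
```

The combined matrix (here 60 cells by 40 components) would then be the input to UMAP and Louvain clustering.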
To detect cellular fraction changes at the subtype level across various conditions, we first generated a cell count matrix by computing the number of cells from every subcluster in each RT well profiled by EasySci-RNA. Each RT well was regarded as a replicate comprising cells from a specific mouse individual. Of note, we repeated the same analysis using the number of cells from each subcluster in each mouse individual (instead of RT well) and the results were highly consistent. We then applied the likelihood ratio test to identify significantly changed subclusters between different conditions, with the differentialGeneTest() function of Monocle 2 (ref. 92). Subclusters were removed if they had <20 cells in either the male or female samples. The fold change was calculated by normalizing the number of cells in a cluster by the total number of cells in the corresponding condition, then dividing the normalized values in the case and control conditions after adding a small number (10⁻⁵) to reduce the effect of the very small clusters. In addition, we considered subclusters to change significantly only if there was over twofold change between conditions and the q-value was less than 0.05.
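The fold-change calculation can be written out as a short sketch. The function names are hypothetical, and treating "over twofold change" as applying in either direction (expansion or depletion) is an assumption.

```python
def abundance_fold_change(case_count, case_total, ctrl_count, ctrl_total,
                          pseudocount=1e-5):
    """Ratio of normalized subcluster fractions (case over control)."""
    case_fraction = case_count / case_total
    ctrl_fraction = ctrl_count / ctrl_total
    # The small constant dampens ratios driven by very small clusters.
    return (case_fraction + pseudocount) / (ctrl_fraction + pseudocount)


def changes_significantly(fold_change, q_value, min_fold=2.0, max_q=0.05):
    # Assumption: a twofold decrease counts as well as a twofold increase.
    large_effect = fold_change > min_fold or fold_change < 1.0 / min_fold
    return large_effect and q_value < max_q


# Toy numbers: 300 of 100,000 cells in the case versus 100 of 100,000 in the control.
fc = abundance_fold_change(300, 100_000, 100, 100_000)
```

In this toy case the ratio is 0.00301/0.00101, just under threefold, slightly below the raw ratio of 3 because of the added constant.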
Integration analysis with external datasets and to locate the spatial distributions of main cell types and subtypes
To annotate the spatial locations of main cell types, we integrated the EasySci-RNA data with publicly available 10x Visium spatial transcriptomics datasets (https://www.10xgenomics.com/resources/datasets/mouse-brain-section-coronal-1-standard-1-0-0, https://www.10xgenomics.com/resources/datasets/mouse-brain-serial-section-1-sagittal-anterior-1-standard-1-0-0, https://www.10xgenomics.com/resources/datasets/mouse-brain-serial-section-1-sagittal-posterior-1-standard-1-0-0) through an NNLS approach: we first aggregated cell-type-specific UMI counts, normalized by the library size, multiplied by 100,000 and log-transformed after adding a pseudocount. A similar procedure was applied to calculate the normalized gene expression in each spatial spot captured in the 10x Visium dataset. We then applied NNLS regression to predict the gene expression of each spatial spot in 10x Visium data using the gene expression of all cell types recovered in EasySci-RNA data, similar to our previous study10. The same approach10 was applied to integrate our EasySci-RNA dataset with a large single-cell dataset featuring highly detailed cell type annotations2 for identification of shared cellular states in two datasets.
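A minimal sketch of the NNLS step, using scipy.optimize.nnls, is given below. The pseudocount of 1 and the natural logarithm are assumptions (the text does not specify them), the function names are hypothetical, and the toy spot is constructed as a known mixture so that the recovered weights can be checked.

```python
import numpy as np
from scipy.optimize import nnls


def normalize(counts, scale=1e5, pseudocount=1.0):
    """Library-size normalize each column, scale and log-transform.

    counts: genes x profiles (cell types or spatial spots).
    """
    counts = np.asarray(counts, dtype=float)
    return np.log(counts / counts.sum(axis=0, keepdims=True) * scale + pseudocount)


def spot_weights(cell_type_expr, spot_expr):
    """Non-negative weight of each cell type in explaining one spatial spot."""
    coefficients, _residual_norm = nnls(cell_type_expr, spot_expr)
    return coefficients


# Toy reference: 4 genes x 2 cell types of aggregated UMI counts.
reference = normalize([[90, 5], [5, 80], [3, 10], [2, 5]])
# Toy spot built as a 70/30 mixture of the two normalized profiles.
spot = 0.7 * reference[:, 0] + 0.3 * reference[:, 1]
weights = spot_weights(reference, spot)
```

Running the regression once per spot yields a spots-by-cell-types weight matrix that can be displayed on the tissue coordinates.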
To spatially map EasySci cell subtypes, we first aggregated ~50 single-cell transcriptomes identified by k-means clustering of cells in the UMAP space of subclustering analysis. We then integrated the EasySci-RNA data with the above 10x Visium spatial transcriptomics datasets and a published spatial dataset28, using cell2location27 following the default settings. To establish the corresponding regions of EasySci subclusters, we utilized the regional annotation of the spatial pixels and manually reviewed the anatomical regions of the top 10 pixels with the highest mapping score. To remove low-quality spatial mappings, only mapping scores above 1 were considered.
GM analysis
We performed GM analysis to identify the molecular programs underlying different cell types in the brain. First, we aggregated the gene expression across all subclusters. The aggregated gene count matrix was then normalized by the library size and then log-transformed. Genes were removed if they exhibited low expression (less than 1 in all subclusters) or low variance of expression (that is, the gene expression fold change between the maximum expressed subcluster and the median expression across subclusters is less than 5). The filtered matrix was used as input for UMAP visualization19 (metric = ‘cosine’, min_dist = 0.01, n_neighbors = 30). We then clustered genes based on their 2D UMAP coordinates through densityClust package (rho = 1, delta = 1)93.
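The gene filter that precedes the UMAP embedding can be sketched as below. This is an interpretation under stated assumptions: the cutoffs are applied to the normalized values passed in, genes with a median of zero and nonzero maximum are treated as having unbounded fold change, and the function name filter_genes is hypothetical.

```python
import numpy as np


def filter_genes(expr, min_expr=1.0, min_fold=5.0):
    """Boolean mask over genes (rows) of a genes x subclusters matrix."""
    expr = np.asarray(expr, dtype=float)
    max_expr = expr.max(axis=1)
    median_expr = np.median(expr, axis=1)
    # Low expression: below min_expr in every subcluster.
    expressed = max_expr >= min_expr
    # Low variance: maximum less than min_fold times the median across subclusters.
    with np.errstate(divide="ignore", invalid="ignore"):
        fold = np.where(median_expr > 0, max_expr / median_expr, np.inf)
    return expressed & (fold >= min_fold)


toy_expr = [
    [0.2, 0.5, 0.9],   # removed: never reaches 1
    [10.0, 2.0, 1.0],  # kept: maximum is 5 x median
    [4.0, 3.0, 3.5],   # removed: nearly uniform
    [0.0, 0.0, 8.0],   # kept: restricted to one subcluster
]
mask = filter_genes(toy_expr)
```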
EasySci-ATAC library preparation and sequencing
The detailed protocol for EasySci-ATAC library preparation is included in Supplementary Protocol 2.
Data processing for EasySci-ATAC
After sequencing, base calls were converted to fastq format and demultiplexed using Illumina’s bcl2fastq/v2.19.0.316 tolerating one mismatched base in barcodes (edit distance <2). Downstream sequence processing was similar to sci-ATAC-seq94. To compare the performance of EasySci-ATAC with other methods, we extracted reads containing barcodes from cells passing quality control (3,636 cells from one PCR well of the EasySci-ATAC library, 8,067 nuclei from the 10x-ATACv2 library and 5,494 nuclei from the sci-ATAC-seq library15). We normalized for sequencing depth differences by subsampling reads from the 10x-ATACv2 and sci-ATAC-seq libraries, resulting in an average of 6,360 raw reads per cell across all three libraries. We processed the data through the same computational pipeline described above. Peak calling was performed on each dataset separately with these parameters: --nomodel --extsize 200 --shift -100 -q 0.05. For peak counting, a union peak set was generated by merging the peaks called from the three datasets. Cells were determined to be accessible at a given peak if a read from a cell overlapped with the peak. The peak-count matrix was generated by a custom Python script with the HTSeq package95.
Cell filtering, clustering and annotation for EasySci-ATAC
We used SnapATAC2/v1.99.99.3 (refs. 96,97) to preprocess the EasySci-ATAC dataset. Cells with <1,500 fragments and TSS enrichment <2 were discarded. Potential doublet cells and doublet-derived subclusters were detected using an iterative clustering strategy10 modified to suit scATAC-seq data. We then used a deep-learning-based framework, scJoint21, to annotate main ATAC-seq cell types by using the EasySci-RNA dataset as a reference. First, we subsampled 5,000 cells from each main cell type of the EasySci-RNA dataset, and selected genes detected in more than 10 cells. Then, the gene count matrix and cell type labels of EasySci-RNA, along with the gene activity matrix of EasySci-ATAC, were input into the scJoint pipeline with default parameters. The joint embedding layers calculated by scJoint were used for UMAP visualizations using the Python package umap/v0.5.3 (ref. 19). Louvain clusters were identified using the Seurat functions FindNeighbors() and FindClusters() based on the UMAP coordinates. Cells were assigned to the prediction label with the highest abundance within each Louvain cluster. Clusters with low purity (that is, <80% of cells were from the most abundant cell type) were removed. Finally, to validate the integration-based annotations, we selected DE genes identified from the RNA-seq data with the following criteria: fold change between the maximum and the second maximum expressed cell type >1.5, q-value < 0.05, TPM >20 in the maximum RNA group and reads per million >50 in the maximum ATAC group. The top 10 DE genes ranked by fold change were selected using RNA-seq data for each cell type. If there were fewer than 10 genes passing the cutoff, we selected the top genes ranked by the fold change between the maximum expressed cell type and the mean expression of other cell types. We then calculated the aggregated gene count and gene body accessibility for each cell type. Subcluster-level integrations were similar to the main-cluster-level integrations.
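The majority-vote labeling and purity filter can be sketched in a few lines of Python. The function name annotate_clusters and the toy labels are hypothetical; the sketch only illustrates the rule stated above, not the scJoint or Seurat steps that produce the inputs.

```python
from collections import Counter, defaultdict


def annotate_clusters(cluster_ids, predicted_labels, min_purity=0.8):
    """Map each retained cluster to its most abundant predicted label."""
    labels_by_cluster = defaultdict(list)
    for cluster, label in zip(cluster_ids, predicted_labels):
        labels_by_cluster[cluster].append(label)

    annotations = {}
    for cluster, labels in labels_by_cluster.items():
        top_label, top_count = Counter(labels).most_common(1)[0]
        # Clusters in which the top label covers <80% of cells are dropped.
        if top_count / len(labels) >= min_purity:
            annotations[cluster] = top_label
    return annotations


cluster_ids = [0] * 5 + [1] * 5
predicted = ["Astrocytes"] * 5 + ["Microglia"] * 3 + ["Oligodendrocytes"] * 2
annotations = annotate_clusters(cluster_ids, predicted)
```

Here cluster 0 is labeled as astrocytes, whereas cluster 1 (60% microglia) is removed.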
Differential accessible peak analysis
Nonduplicate ATAC-seq reads of cells from each main cell type were aggregated and peaks were called on each group separately with these parameters: --nomodel --extsize 200 --shift -100 -q 0.05 using MACS2/v2.1.1 (ref. 98). To correct for differences in read depth or the number of nuclei per cell type, we converted MACS2 peak scores (−log10(q-value)) to ‘score per million’99 and filtered peaks by choosing a score-per-million cutoff of 1.3. Peak summits were extended by 250 bp on either side and then merged with bedtools/v2.30.0. Cells were determined to be accessible at a given peak if a read from a cell overlapped with the peak. The peak-count matrix was generated by a custom Python script with the HTSeq package95.
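The summit extension and merging step can be mimicked in pure Python, as sketched below. The study used bedtools for merging; this stand-in assumes 0-based summit positions and merges overlapping or directly adjacent intervals on the same chromosome, and the function name extend_and_merge is hypothetical.

```python
def extend_and_merge(summits, flank=250):
    """Extend (chrom, position) summits by `flank` bp per side and merge overlaps."""
    intervals = sorted(
        (chrom, max(0, pos - flank), pos + flank) for chrom, pos in summits
    )
    merged = []
    for chrom, start, end in intervals:
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2]:
            # Overlapping or touching the previous interval: extend it.
            last[2] = max(last[2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(interval) for interval in merged]


summits = [("chr1", 1000), ("chr1", 1300), ("chr1", 5000), ("chr2", 1000)]
peaks = extend_and_merge(summits)
```

The two nearby chr1 summits collapse into a single 800-bp peak, whereas isolated summits yield 500-bp peaks.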
We used the R package Signac/v1.7.0 (ref. 100) to perform the dimension reduction analysis using the peak-count matrix. We subsampled 5,000 cells from each main cell type and performed TF-IDF normalization using RunTFIDF(), followed by singular value decomposition using RunSVD() and retained the 2nd to 30th dimensions for UMAP visualizations using RunUMAP(). Differentially accessible peaks across cell types were identified using Monocle 2 (ref. 92) with the differentialGeneTest() function; 5,000 cells were subsampled from each cell type for this analysis. Peaks detected in fewer than 50 cells were filtered out. We selected peaks that were differentially accessible across cell types by the following criteria: 5% FDR (likelihood ratio test), and with TPM >20 in the target cell type.
Transcription factor motif analysis
We used chromVAR/v1.16.0 (ref. 101) to assess the TF motif accessibility using cisBP motif sets curated by chromVARmotifs/v0.2.0 (refs. 101,102). We subsampled 5,000 cells from each main cell type, and calculated the motif deviation score for each single cell using the Signac wrapper RunChromVAR(). The motif deviation scores of each single cell were rescaled to (0, 10) using the R function rescale() and then aggregated for each cell type. In addition, we also aggregated the gene expression of each TF in each cell type. We then computed the Pearson correlations between the aggregated motif matrix and aggregated TF expression matrix after scaling across all main cell types. TF analysis at the subcluster level was performed similarly with modifications. For each cell type of interest, we selected peaks detected in more than 20 cells and only kept cells with more than 500 reads in peaks. Peaks were resized to 500 bp (±250 bp around the center) and motif occurrences were identified using the matchMotifs() function from motifmatchr/v1.16.0 (ref. 103). The motif deviation matrix was calculated using the chromVAR function computeDeviations(). Then, the motif deviation scores were rescaled to (0, 10) and aggregated per subcluster. Pearson correlation was calculated between the aggregated motif activity and aggregated TF expression across subclusters after scaling (subclusters with <20 cells were excluded).
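The rescaling, aggregation and correlation steps for a single TF can be sketched with NumPy as follows. The function names are hypothetical, aggregation by the per-cell-type mean is an assumption (the text does not state the aggregation function), and the toy values are invented for illustration.

```python
import numpy as np


def rescale(values, low=0.0, high=10.0):
    """Linearly map values so that the minimum becomes `low` and the maximum `high`."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    return low + (values - values.min()) / span * (high - low)


def aggregate_by_type(scores, cell_types):
    """Mean score per cell type (assumption: aggregation by the mean)."""
    labels = np.asarray(cell_types)
    types = sorted(set(cell_types))
    return types, np.array([scores[labels == t].mean() for t in types])


def motif_expression_correlation(motif_by_type, tf_expr_by_type):
    """Pearson correlation between motif activity and TF expression across cell types."""
    return float(np.corrcoef(motif_by_type, tf_expr_by_type)[0, 1])


# Toy deviation scores for one motif in six cells from three cell types.
deviation = rescale([-2.0, 0.0, 2.0, 6.0, 1.0, 1.0])
types, motif_activity = aggregate_by_type(deviation, ["A", "A", "B", "B", "C", "C"])
tf_expression = np.array([5.0, 40.0, 12.0])   # aggregated TF expression for A, B, C
r = motif_expression_correlation(motif_activity, tf_expression)
```

A strongly positive correlation across cell types is consistent with the TF acting as an activator at its motif sites, and a negative one with repression.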
LDSC analysis
The LDSC computational pipeline was modified from Cusanovich et al.15 and based on the LDSC software104 (https://github.com/bulik/ldsc). Specifically, to integrate human and mouse data, we first used the UCSC utility liftOver105 to lift all GWAS SNPs to the mouse genome. We then took the set of differentially accessible peaks across main clusters and across microglia subclusters, and annotated each SNP according to whether or not it overlapped one of these peaks. We then followed the recommended workflow for running LDSC using HapMap SNPs106 and precomputed files corresponding to 1000 Genomes Phase 3, excluding the MHC region, to generate an LDSC model for each chromosome and peak set. Only main cell types or subclusters containing DE peaks in every chromosome were included in the following analysis.
To calculate enrichments based on each model, we first regenerated the baseline model (version 1.1) provided from the LDSC website and used this as the reference for enrichment calculation. Results for all trait/cluster pairs were gathered into a single file. P values were calculated from z-scores assigned to coefficients reported by ldsc.py and coefficients were divided by the average per-SNP heritability for traits associated with a given test. Tests were corrected for multiple hypothesis testing using the Benjamini-Hochberg method and only tests with a q-value of 0.05 or lower were considered significant.
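The conversion from coefficient z-scores to P values and the Benjamini-Hochberg correction can be sketched in plain Python. The one-sided (upper-tail) test is an assumption, since the text does not state the sidedness, and both function names are hypothetical.

```python
import math


def one_sided_p(z):
    """Upper-tail P value of a standard normal z-score (assumed one-sided test)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest to the smallest P value, enforcing monotonic q-values.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        q_values[index] = running_min
    return q_values


q = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
significant = [value <= 0.05 for value in q]
```

In this toy example only the first test passes the 0.05 cutoff; the tests with P = 0.03 and P = 0.04 share a q-value of about 0.053.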
Cis-regulatory elements linkage analysis
We first constructed pseudo-cells by aggregating the RNA-seq and ATAC-seq profiles of the same subclusters. Aggregated count matrices were normalized to TPM and log-transformed after adding one pseudocount. We only retained genes and peaks with a TPM value greater than 10 in the maximum expressed pseudo-cells. Then, for each gene, we calculated the Pearson correlation coefficient (PCC) between its gene expression and the chromatin accessibility of its nearby accessible sites (±500 kb from the TSS) across aggregated subclusters. To define a PCC threshold, we generated a set of background pairs by permuting the subcluster IDs of the ATAC-seq matrix and used an empirically defined significance threshold of FDR < 0.01 to select significantly positively correlated cis-regulatory element-gene pairs. We kept only the top linked gene with the highest PCC for each peak, and distal peaks overlapping the promoters of other genes were filtered out.
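The permutation-based thresholding can be illustrated with the NumPy sketch below. The empirical FDR is defined here as the number of background pairs at or above a cutoff divided by the number of observed pairs at or above it; this definition, like the function names, is an assumption and may differ from the procedure actually used.

```python
import numpy as np


def row_pearson(x, y):
    """Pearson correlation between matching rows of x and y (pairs x subclusters)."""
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    return (xc * yc).sum(axis=1) / np.sqrt((xc ** 2).sum(axis=1) * (yc ** 2).sum(axis=1))


def permuted_background(expr, access, rng):
    """PCCs after shuffling the subcluster IDs (columns) of the accessibility matrix."""
    return row_pearson(expr, access[:, rng.permutation(access.shape[1])])


def pcc_threshold(observed, background, fdr=0.01):
    """Smallest observed PCC at which the empirical FDR falls below `fdr`."""
    observed = np.asarray(observed, dtype=float)
    background = np.asarray(background, dtype=float)
    for cutoff in np.sort(observed):
        n_observed = (observed >= cutoff).sum()
        n_background = (background >= cutoff).sum()
        if n_background / n_observed < fdr:
            return float(cutoff)
    return None  # no cutoff reaches the requested FDR


# Toy gene-peak pairs (rows) across three subclusters (columns).
expr = np.array([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])
access = np.array([[2.0, 4.0, 6.0], [3.0, 2.0, 1.0]])
pcc = row_pearson(expr, access)
```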
Spatial gene expression profiling of mouse brains
Spatial gene expression profiling followed the Visium Spatial Gene Expression User Guide (catalog no. CG000160), the Visium Spatial Tissue Optimization User Guide (catalog no. CG000238 Rev A, 10x Genomics) and the Visium Spatial Gene Expression User Guide (catalog no. CG000239 Rev A, 10x Genomics).
Transcriptomic aging clock analysis
A ridge regression model was employed to predict the ln(age) of pseudobulk cells (on average 15 cells merged) utilizing 80% of the pseudobulk cells from 3, 6, and 21-month-old mice. Predicted ages were subsequently calculated for the remaining 20% of WT mice and the entirety of the AD models. Individual models were crafted for each cell type.
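A minimal closed-form ridge regression on ln(age) is sketched below with NumPy. The penalty strength alpha, the function names and the single-feature toy data are all assumptions for illustration; the text does not report the regularization parameter or the software used.

```python
import numpy as np


def fit_ridge(features, log_age, alpha=1.0):
    """Closed-form ridge fit with an unpenalized intercept.

    features: pseudobulk cells x genes; log_age: ln(age in months).
    """
    feature_mean = features.mean(axis=0)
    target_mean = log_age.mean()
    centered = features - feature_mean
    gram = centered.T @ centered + alpha * np.eye(features.shape[1])
    weights = np.linalg.solve(gram, centered.T @ (log_age - target_mean))
    intercept = target_mean - feature_mean @ weights
    return weights, intercept


def predict_age_months(features, weights, intercept):
    """Back-transform the predicted ln(age) to months."""
    return np.exp(features @ weights + intercept)


# Toy training data: one feature that tracks ln(age) exactly for ages 3, 6 and 12.
train_x = np.array([[0.0], [1.0], [2.0]])
train_y = np.log(np.array([3.0, 6.0, 12.0]))
weights, intercept = fit_ridge(train_x, train_y, alpha=1e-9)
predicted = predict_age_months(np.array([[3.0]]), weights, intercept)
```

In the analysis described above, one such model would be fitted per cell type on 80% of the WT pseudobulk cells and then applied to the held-out 20% and to the AD models.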
Clustering, annotation and differential analysis for human brain samples
A digital gene expression matrix was constructed from the raw sequencing data as described before. To identify distinct clusters of cells corresponding to different cell types in the human brain samples, we co-embedded the human cells from both regions with our mouse brain dataset (up to 5,000 cells randomly sampled from each of 31 cell types), and clusters were annotated based on overlapped cell types. The annotations were manually verified and refined based on marker genes. Subsequently, the hippocampus and SMTG human datasets were integrated to construct a shared low-dimensional space with only human cells.
DE genes between AD and control samples for each cell type in each region were identified using Monocle 2 (refs. 107,108) with the differentialGeneTest() function. Main cell types with fewer than 50 cells were excluded from the analysis (that is, choroid plexus epithelial cells and vascular leptomeningeal cells in the SMTG). DE genes were filtered based on the following cutoffs: q-value < 0.05, with fold change (FC) > 1.5 between the maximum and second expressed condition, and with TPM >50 in the highest expressed condition. To further validate human-mouse shared gene expression changes, we used a recently published AD single-cell dataset from the human prefrontal cortex6.
Statistics and reproducibility
Statistical analyses are detailed in figure legends (Fig. 1 and Extended Data Fig. 9) and were performed using R software (version 4.0.1). The number of cells or pseudobulk cells used for the comparisons are detailed in the figure legends and the number of replicates are detailed in Methods. For spatial integration analysis in Fig. 1k,l, Fig. 3g,h, Fig. 4e and Extended Data Fig. 5h,i, each spatial transcriptomic datum includes one section of the experiment.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41588-023-01572-y.
Acknowledgements
We thank all members of the Cao lab for helpful discussions and feedback. We thank J. Shendure (University of Washington) for insightful feedback on this work. We also thank members of the Rockefeller University Genomics Resource Center (SCR_020986), High-Performance Computing Resource Center and Comparative Bioscience Center for their exceptional assistance with library sequencing and animal maintenance. This work was funded by grants from the National Institutes of Health (DP2HG012522, R01AG076932 and RM1HG011014 to J.C.; P30AG072946 and P01AG078116 to P.T.N.; and R01AG072758 to L.G.) and the Sagol Network GerOmic Award (J.C.). This work is partly supported by the Pershing Square Foundation, Bill Ackman and Neri Oxman.
Author contributions
J.C. and W.Z. conceptualized and supervised the project. J.L. and A.S. developed the experimental and computational pipeline for EasySci-RNA profiling of all samples. G.B. and Z.L. developed the experimental and computational pipeline for EasySci-ATAC profiling of all samples. A.A. performed the 10x Visium spatial transcriptomics experiment. S.A. and P.N. processed the human brain samples for single-cell profiling experiments. A.S. and Z.L. performed the downstream analysis with assistance from E.M., A.L., A.E., C.S., Z.X., Z.Z. and J.B. J.C., W.Z., Z.L. and A.S. wrote the paper with input and biological insight from P.N., L.G. and other co-authors.
Peer review
Peer review information
Nature Genetics thanks Inge Holtman, Luciano Martelotto and Vivek Swarup for their contribution to the peer review of this work.
Data availability
All relevant data generated in this study are deposited to public repositories and are publicly released. Raw and processed data of single-cell RNA-seq/ATAC-seq profiling were deposited at the NCBI Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE212606).
Code availability
Custom computational scripts for processing EasySci data were deposited in Zenodo109 (10.5281/zenodo.8395492) and GitHub (https://github.com/JunyueCaoLab/EasySci).
Competing interests
J.C., W.Z., A.S. and J.L. are inventors on pending patent applications related to EasySci-RNA-seq. The other authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Andras Sziraki, Ziyu Lu, Jasper Lee.
These authors jointly supervised this work: Wei Zhou, Junyue Cao.
Contributor Information
Wei Zhou, Email: wzhou@rockefeller.edu.
Junyue Cao, Email: jcao@rockefeller.edu.
Extended data is available for this paper at 10.1038/s41588-023-01572-y.
Supplementary information
The online version contains supplementary material available at 10.1038/s41588-023-01572-y.
References
- 1. Erö C, Gewaltig M-O, Keller D, Markram H. A cell atlas for the mouse brain. Front. Neuroinform. 2018;12:84. doi: 10.3389/fninf.2018.00084.
- 2. Zeisel A, et al. Molecular architecture of the mouse nervous system. Cell. 2018;174:999–1014. doi: 10.1016/j.cell.2018.06.021.
- 3. Mathys H, et al. Single-cell transcriptomic analysis of Alzheimer’s disease. Nature. 2019;570:332–337. doi: 10.1038/s41586-019-1195-2.
- 4. Xia X, Jiang Q, McDermott J, Han J-DJ. Aging and Alzheimer’s disease: comparison and associations from molecular to system level. Aging Cell. 2018;17:e12802. doi: 10.1111/acel.12802.
- 5. Ximerakis M, et al. Single-cell transcriptomic profiling of the aging mouse brain. Nat. Neurosci. 2019;22:1696–1708. doi: 10.1038/s41593-019-0491-3.
- 6. Morabito S, et al. Single-nucleus chromatin accessibility and transcriptomic characterization of Alzheimer’s disease. Nat. Genet. 2021;53:1143–1155. doi: 10.1038/s41588-021-00894-z.
- 7. Tabula Muris Consortium. A single-cell transcriptomic atlas characterizes ageing tissues in the mouse. Nature. 2020;583:590–595. doi: 10.1038/s41586-020-2496-1.
- 8. Wang R, et al. Construction of a cross-species cell landscape at single-cell level. Nucleic Acids Res. 2022;51:501–516. doi: 10.1093/nar/gkac633.
- 9. Cao J, et al. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science. 2017;357:661–667. doi: 10.1126/science.aam8940.
- 10. Cao J, et al. A human cell atlas of fetal gene expression. Science. 2020;370:eaba7721. doi: 10.1126/science.aba7721.
- 11. Cao J, et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature. 2019;566:496–502. doi: 10.1038/s41586-019-0969-x.
- 12. Ma S, et al. Chromatin potential identified by shared single-cell profiling of RNA and chromatin. Cell. 2020;183:1103–1116. doi: 10.1016/j.cell.2020.09.056.
- 13. Martin BK, et al. An optimized protocol for single cell transcriptional profiling by combinatorial indexing. Nat. Protoc. 2023;18:188–207. doi: 10.1038/s41596-022-00752-0.
- 14. Domcke S, et al. A human cell atlas of fetal chromatin accessibility. Science. 2020;370:eaba7612. doi: 10.1126/science.aba7612.
- 15.Cusanovich DA, et al. A single-cell atlas of in vivo mammalian chromatin accessibility. Cell. 2018;174:1309–1324. doi: 10.1016/j.cell.2018.06.052. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Oakley H, et al. Intraneuronal beta-amyloid aggregates, neurodegeneration, and neuron loss in transgenic mice with five familial Alzheimer’s disease mutations: potential factors in amyloid plaque formation. J. Neurosci. 2006;26:10129–10140. doi: 10.1523/JNEUROSCI.1202-06.2006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Desimone A, et al. The influence of ApoE4 on the clinical outcomes and pathophysiology of degenerative cervical myelopathy. JCI Insight. 2021;6:e149227. doi: 10.1172/jci.insight.149227. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Xiang X, et al. The Trem2 R47H Alzheimer’s risk variant impairs splicing and reduces Trem2 mRNA and protein in mice but not in humans. Mol. Neurodegener. 2018;13:49. doi: 10.1186/s13024-018-0280-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.McInnes L, et al. UMAP: Uniform Manifold Approximation and Projection. J. Open Source Softw. 2018;3:861. doi: 10.21105/joss.00861. [DOI] [Google Scholar]
- 20.Blondel VD, et al. Fast unfolding of communities in large networks. J. Stat. Mech. 2008;2008:10008. doi: 10.1088/1742-5468/2008/10/P10008. [DOI] [Google Scholar]
- 21.Lin Y, et al. scJoint integrates atlas-scale single-cell RNA-seq and ATAC-seq data with transfer learning. Nat. Biotechnol. 2022;40:703–710. doi: 10.1038/s41587-021-01161-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22. Yeh H, Ikezu T. Transcriptional and epigenetic regulation of microglia in health and disease. Trends Mol. Med. 2019;25:96–111. doi: 10.1016/j.molmed.2018.11.004.
- 23. Watakabe A, et al. Comparative analysis of layer-specific genes in mammalian neocortex. Cereb. Cortex. 2007;17:1918–1933. doi: 10.1093/cercor/bhl102.
- 24. McEvilly RJ, et al. Requirement for Brn-3.0 in differentiation and survival of sensory and motor neurons. Nature. 1996;384:574–577. doi: 10.1038/384574a0.
- 25. Mays JC, et al. Single-cell RNA sequencing of the mammalian pineal gland identifies two pinealocyte subtypes and cell type-specific daily patterns of gene expression. PLoS ONE. 2018;13:e0205883. doi: 10.1371/journal.pone.0205883.
- 26. Campbell JN, et al. A molecular census of arcuate hypothalamus and median eminence cell types. Nat. Neurosci. 2017;20:484–496. doi: 10.1038/nn.4495.
- 27. Kleshchevnikov V, et al. Cell2location maps fine-grained cell types in spatial transcriptomics. Nat. Biotechnol. 2022;40:661–671. doi: 10.1038/s41587-021-01139-4.
- 28. Ortiz C, et al. Molecular atlas of the adult mouse brain. Sci. Adv. 2020;6:eabb3446. doi: 10.1126/sciadv.abb3446.
- 29. Kuleshov MV, et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44:W90–W97. doi: 10.1093/nar/gkw377.
- 30. Liu J, et al. Tbx19, a tissue-selective regulator of POMC gene expression. Proc. Natl Acad. Sci. USA. 2001;98:8674–8679. doi: 10.1073/pnas.141234898.
- 31. Tufo C, et al. Development of the mammalian main olfactory bulb. Development. 2022;149:dev200210. doi: 10.1242/dev.200210.
- 32. Sokolowski JD, et al. Brain-specific angiogenesis inhibitor-1 expression in astrocytes and neurons: implications for its dual function as an apoptotic engulfment receptor. Brain Behav. Immun. 2011;25:915–921. doi: 10.1016/j.bbi.2010.09.021.
- 33. Rosenberg AB, et al. Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science. 2018;360:176–182. doi: 10.1126/science.aam8999.
- 34. Tepe B, et al. Single-cell RNA-seq of mouse olfactory bulb reveals cellular heterogeneity and activity-dependent molecular census of adult-born neurons. Cell Rep. 2018;25:2689–2703. doi: 10.1016/j.celrep.2018.11.034.
- 35. Barraud P, et al. Neural crest origin of olfactory ensheathing glia. Proc. Natl Acad. Sci. USA. 2010;107:21040–21045. doi: 10.1073/pnas.1012248107.
- 36. Monavarfeshani A, Knill CN, Sabbagh U, Su J, Fox MA. Region- and cell-specific expression of transmembrane collagens in mouse brain. Front. Integr. Neurosci. 2017;11:20. doi: 10.3389/fnint.2017.00020.
- 37. Puverel S, Nakatani H, Parras C, Soussi-Yanicostas N. Prokineticin receptor 2 expression identifies migrating neuroblasts and their subventricular zone transient-amplifying progenitors in adult mice. J. Comp. Neurol. 2009;512:232–242. doi: 10.1002/cne.21888.
- 38. Pastrana E, Cheng L-C, Doetsch F. Simultaneous prospective purification of adult subventricular zone neural stem cells and their progeny. Proc. Natl Acad. Sci. USA. 2009;106:6387–6392. doi: 10.1073/pnas.0810407106.
- 39. Kumar A, et al. Transcriptomic analysis of the signature of neurogenesis in human hippocampus suggests restricted progenitor cell progression post-childhood. IBRO Rep. 2020;9:224–232. doi: 10.1016/j.ibror.2020.08.003.
- 40. Marques S, et al. Transcriptional convergence of oligodendrocyte lineage progenitors during development. Dev. Cell. 2018;46:504–517. doi: 10.1016/j.devcel.2018.07.005.
- 41. Zhang Y, et al. An RNA-sequencing transcriptome and splicing database of glia, neurons, and vascular cells of the cerebral cortex. J. Neurosci. 2014;34:11929–11947. doi: 10.1523/JNEUROSCI.1860-14.2014.
- 42. Lu Z, et al. Tracking cell-type-specific temporal dynamics in human and mouse brains. Cell. 2023;186:4345–4364.e24. doi: 10.1016/j.cell.2023.08.042.
- 43. Graham V, Khudyakov J, Ellis P, Pevny L. SOX2 functions to maintain neural progenitor identity. Neuron. 2003;39:749–765. doi: 10.1016/S0896-6273(03)00497-5.
- 44. Li J, et al. Transcription factors Sp8 and Sp9 coordinately regulate olfactory bulb interneuron development. Cereb. Cortex. 2018;28:3278–3294. doi: 10.1093/cercor/bhx199.
- 45. Keren-Shaul H, et al. A unique microglia type associated with restricting development of Alzheimer’s disease. Cell. 2017;169:1276–1290. doi: 10.1016/j.cell.2017.05.018.
- 46. Zhou Y, et al. Human and mouse single-nucleus transcriptomics reveal TREM2-dependent and TREM2-independent cellular responses in Alzheimer’s disease. Nat. Med. 2020;26:131–142. doi: 10.1038/s41591-019-0695-9.
- 47. Kenigsbuch M, et al. A shared disease-associated oligodendrocyte signature among multiple CNS pathologies. Nat. Neurosci. 2022;25:876–886. doi: 10.1038/s41593-022-01104-7.
- 48. See AP, et al. The role of STAT3 activation in modulating the immune microenvironment of GBM. J. Neurooncol. 2012;110:359–368. doi: 10.1007/s11060-012-0981-6.
- 49. Paillasse MR, de Medina P. The NR4A nuclear receptors as potential targets for anti-aging interventions. Med. Hypotheses. 2015;84:135–140. doi: 10.1016/j.mehy.2014.12.003.
- 50. Di Giorgio E, et al. HDAC4 degradation during senescence unleashes an epigenetic program driven by AP-1/p300 at selected enhancers and super-enhancers. Genome Biol. 2021;22:129. doi: 10.1186/s13059-021-02340-z.
- 51. Zhang Y, Wang P. Age-related increase of insulin-degrading enzyme is inversely correlated with cognitive function in APPswe/PS1dE9 mice. Med. Sci. Monit. 2018;24:2446–2455. doi: 10.12659/MSM.909596.
- 52. Hashimoto Y, et al. A rescue factor abolishing neuronal cell death by a wide spectrum of familial Alzheimer’s disease genes and Abeta. Proc. Natl Acad. Sci. USA. 2001;98:6336–6341. doi: 10.1073/pnas.101133498.
- 53. Cavalcante GC, et al. Mitochondrial genetics reinforces multiple layers of interaction in Alzheimer’s disease. Biomedicines. 2022;10:880. doi: 10.3390/biomedicines10040880.
- 54. Mielke MM, Lyketsos CG. Alterations of the sphingolipid pathway in Alzheimer’s disease: new biomarkers and treatment targets? Neuromolecular Med. 2010;12:331–340. doi: 10.1007/s12017-010-8121-y.
- 55. Tong Y, Xu Y, Scearce-Levie K, Ptácek LJ, Fu Y-H. COL25A1 triggers and promotes Alzheimer’s disease-like pathology in vivo. Neurogenetics. 2010;11:41–52. doi: 10.1007/s10048-009-0201-5.
- 56. Morabito S, Reese F, Rahimzadeh N, Miyoshi E, Swarup V. hdWGCNA identifies co-expression networks in high-dimensional transcriptomics data. Cell Rep. Methods. 2023;3:100498.
- 57. Butler T, et al. Volume of the human septal forebrain region is a predictor of source memory accuracy. J. Int. Neuropsychol. Soc. 2012;18:157–161. doi: 10.1017/S1355617711001421.
- 58. Oeckinghaus A, Ghosh S. The NF-kappaB family of transcription factors and its regulation. Cold Spring Harb. Perspect. Biol. 2009;1:a000034. doi: 10.1101/cshperspect.a000034.
- 59. Liu X-F, et al. Nrf2 as a target for prevention of age-related and diabetic cataracts by against oxidative stress. Aging Cell. 2017;16:934–942. doi: 10.1111/acel.12645.
- 60. Bommer GT, MacDougald OA. Regulation of lipid homeostasis by the bifunctional SREBF2-miR33a locus. Cell Metab. 2011;13:241–247. doi: 10.1016/j.cmet.2011.02.004.
- 61. Finucane HK, et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 2015;47:1228–1235. doi: 10.1038/ng.3404.
- 62. Seripa D, et al. The RELN locus in Alzheimer’s disease. J. Alzheimers Dis. 2008;14:335–344. doi: 10.3233/JAD-2008-14308.
- 63. Herring A, et al. Reelin depletion is an early phenomenon of Alzheimer’s pathology. J. Alzheimers Dis. 2012;30:963–979. doi: 10.3233/JAD-2012-112069.
- 64. Attwood MM, Schiöth HB. Characterization of five transmembrane proteins: with focus on the Tweety, Sideroflexin, and YIP1 domain families. Front. Cell Dev. Biol. 2021;9:708754. doi: 10.3389/fcell.2021.708754.
- 65. Buckley MT, et al. Cell-type-specific aging clocks to quantify aging and rejuvenation in neurogenic regions of the brain. Nat. Aging. 2023;3:121–137. doi: 10.1038/s43587-022-00335-4.
- 66. de Faria O, Jr, et al. TMEM10 promotes oligodendrocyte differentiation and is expressed by oligodendrocytes in human remyelinating multiple sclerosis plaques. Sci. Rep. 2019;9:3606. doi: 10.1038/s41598-019-40342-x.
- 67. Volkert MR, Crowley DJ. Preventing neurodegeneration by controlling oxidative stress: the role of OXR1. Front. Neurosci. 2020;14:611904. doi: 10.3389/fnins.2020.611904.
- 68. Schmidt LS, Linehan WM. FLCN: the causative gene for Birt-Hogg-Dubé syndrome. Gene. 2018;640:28–42. doi: 10.1016/j.gene.2017.09.044.
- 69. Cooper WN, et al. RASSF2 associates with and stabilizes the proapoptotic kinase MST2. Oncogene. 2009;28:2988–2998. doi: 10.1038/onc.2009.152.
- 70. Luo J, et al. PTPRG activates m6A methyltransferase VIRMA to block mitochondrial autophagy mediated neuronal death in Alzheimer’s disease. Preprint at medRxiv. 2022. doi: 10.1101/2022.03.11.22272061.
- 71. Silva I, Silva J, Ferreira R, Trigo D. Glymphatic system, AQP4, and their implications in Alzheimer’s disease. Neurol. Res Pr. 2021;3:5. doi: 10.1186/s42466-021-00102-7.
- 72. Kulijewicz-Nawrot M, Syková E, Chvátal A, Verkhratsky A, Rodríguez JJ. Astrocytes and glutamate homoeostasis in Alzheimer’s disease: a decrease in glutamine synthetase, but not in glutamate transporter-1, in the prefrontal cortex. ASN Neuro. 2013;5:273–282. doi: 10.1042/AN20130017.
- 73. Hüttenrauch M, et al. Glycoprotein NMB: a novel Alzheimer’s disease associated marker expressed in a subset of activated microglia. Acta Neuropathol. Commun. 2018;6:108. doi: 10.1186/s40478-018-0612-3.
- 74. Zipfel P, Rochais C, Baranger K, Rivera S, Dallemagne P. Matrix metalloproteinases as new targets in Alzheimer’s disease: opportunities and challenges. J. Med. Chem. 2020;63:10705–10725. doi: 10.1021/acs.jmedchem.0c00352.
- 75. Baranger K, et al. MT5-MMP is a new pro-amyloidogenic proteinase that promotes amyloid pathology and cognitive decline in a transgenic mouse model of Alzheimer’s disease. Cell. Mol. Life Sci. 2016;73:217–236. doi: 10.1007/s00018-015-1992-1.
- 76. Arawaka S, Machiya Y, Kato T. Heat shock proteins as suppressors of accumulation of toxic prefibrillar intermediates and misfolded proteins in neurodegenerative diseases. Curr. Pharm. Biotechnol. 2010;11:158–166. doi: 10.2174/138920110790909713.
- 77. Cornejo VH, Hetz C. The unfolded protein response in Alzheimer’s disease. Semin. Immunopathol. 2013;35:277–292. doi: 10.1007/s00281-013-0373-9.
- 78. Neff RA, et al. Molecular subtyping of Alzheimer’s disease using RNA sequencing data reveals novel mechanisms and targets. Sci. Adv. 2021;7:eabb5398. doi: 10.1126/sciadv.abb5398.
- 79. Niccolini F, et al. Altered PDE10A expression detectable early before symptomatic onset in Huntington’s disease. Brain. 2015;138:3016–3029. doi: 10.1093/brain/awv214.
- 80. Niccolini F, et al. Loss of phosphodiesterase 10A expression is associated with progression and severity in Parkinson’s disease. Brain. 2015;138:3003–3015. doi: 10.1093/brain/awv219.
- 81. Karch CM, Goate AM. Alzheimer’s disease risk genes and mechanisms of disease pathogenesis. Biol. Psychiatry. 2015;77:43–51. doi: 10.1016/j.biopsych.2014.05.006.
- 82. Kotredes KP, et al. Uncovering disease mechanisms in a novel mouse model expressing humanized APOEε4 and Trem2*R47H. Front. Aging Neurosci. 2021;13:735524. doi: 10.3389/fnagi.2021.735524.
- 83. Ding J, et al. Systematic comparison of single-cell and single-nucleus RNA-sequencing methods. Nat. Biotechnol. 2020;38:737–746. doi: 10.1038/s41587-020-0465-8.
- 84. Stoeckius M, et al. Cell Hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics. Genome Biol. 2018;19:224. doi: 10.1186/s13059-018-1603-1.
- 85. Krueger F, James F, Ewels P, Afyounian E, Schuster-Boeckler B. FelixKrueger/TrimGalore: a wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data. GitHub. 2021. doi: 10.5281/zenodo.5127899.
- 86. Dobin A, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635.
- 87. Zheng GXY, et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 2017;8:14049. doi: 10.1038/ncomms14049.
- 88. Wolock SL, Lopez R, Klein AM. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 2019;8:281–291. doi: 10.1016/j.cels.2018.11.005.
- 89. Yao Z, et al. A transcriptomic and epigenomic cell atlas of the mouse primary motor cortex. Nature. 2021;598:103–110. doi: 10.1038/s41586-021-03500-8.
- 90. Kozareva V, et al. A transcriptomic atlas of mouse cerebellar cortex comprehensively defines cell types. Nature. 2021;598:214–219. doi: 10.1038/s41586-021-03220-z.
- 91. Stuart T, et al. Comprehensive integration of single-cell data. Cell. 2019;177:1888–1902. doi: 10.1016/j.cell.2019.05.031.
- 92. Qiu X, et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods. 2017;14:979–982. doi: 10.1038/nmeth.4402.
- 93. Rodriguez A, Laio A. Machine learning. Clustering by fast search and find of density peaks. Science. 2014;344:1492–1496. doi: 10.1126/science.1242072.
- 94. Cao J, et al. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science. 2018;361:1380–1385. doi: 10.1126/science.aau0730.
- 95. Anders S, Pyl PT, Huber W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015;31:166–169. doi: 10.1093/bioinformatics/btu638.
- 96. Fang R, et al. Comprehensive analysis of single cell ATAC-seq data with SnapATAC. Nat. Commun. 2021;12:1337. doi: 10.1038/s41467-021-21583-9.
- 97. Zhang K, Zemke NR, Armand EJ, Ren B. SnapATAC2: a fast, scalable and versatile tool for analysis of single-cell omics data. Preprint at bioRxiv. 2023. doi: 10.1101/2023.09.11.557221.
- 98. Zhang Y, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137.
- 99. Corces MR, et al. The chromatin accessibility landscape of primary human cancers. Science. 2018;362:eaav1898. doi: 10.1126/science.aav1898.
- 100. Stuart T, Srivastava A, Madad S, Lareau CA, Satija R. Single-cell chromatin state analysis with Signac. Nat. Methods. 2021;18:1333–1341. doi: 10.1038/s41592-021-01282-5.
- 101. Schep AN, Wu B, Buenrostro JD, Greenleaf WJ. chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat. Methods. 2017;14:975–978. doi: 10.1038/nmeth.4401.
- 102. Weirauch MT, et al. Determination and inference of eukaryotic transcription factor sequence specificity. Cell. 2014;158:1431–1443. doi: 10.1016/j.cell.2014.08.009.
- 103. Schep A. motifmatchr: Fast Motif Matching in R. GitHub. 2017. https://github.com/GreenleafLab/motifmatchr/
- 104. Bulik-Sullivan BK, et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 2015;47:291–295. doi: 10.1038/ng.3211.
- 105. Hinrichs AS, et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 2006;34:D590–D598. doi: 10.1093/nar/gkj144.
- 106. International HapMap Consortium. The International HapMap Project. Nature. 2003;426:789–796. doi: 10.1038/nature02168.
- 107. Trapnell C, et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 2014;32:381–386. doi: 10.1038/nbt.2859.
- 108. Qiu X, et al. Single-cell mRNA quantification and differential analysis with Census. Nat. Methods. 2017;14:309–315. doi: 10.1038/nmeth.4150.
- 109. Sziraki A, Lu Z. Computational pipeline for processing EasySci data. Zenodo. 2023. doi: 10.5281/zenodo.8395492.
Data Availability Statement
All relevant data generated in this study have been deposited in public repositories and are publicly available. Raw and processed data from single-cell RNA-seq/ATAC-seq profiling were deposited in the NCBI Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE212606).
Custom computational scripts for processing EasySci data were deposited in Zenodo109 (doi: 10.5281/zenodo.8395492) and GitHub (https://github.com/JunyueCaoLab/EasySci).