Abstract
The development of the human brain involves unique processes (not observed in many other species) that can contribute to neurodevelopmental disorders1–4. Cerebral organoids enable the study of neurodevelopmental disorders in a human context. We have developed the CRISPR–human organoids–single-cell RNA sequencing (CHOOSE) system, which uses verified pairs of guide RNAs, inducible CRISPR–Cas9-based genetic disruption and single-cell transcriptomics for pooled loss-of-function screening in mosaic organoids. Here we show that perturbation of 36 high-risk autism spectrum disorder genes related to transcriptional regulation uncovers their effects on cell fate determination. We find that dorsal intermediate progenitors, ventral progenitors and upper-layer excitatory neurons are among the most vulnerable cell types. We construct a developmental gene regulatory network of cerebral organoids from single-cell transcriptomes and chromatin modalities and identify autism spectrum disorder-associated and perturbation-enriched regulatory modules. Perturbing members of the BRG1/BRM-associated factor (BAF) chromatin remodelling complex leads to enrichment of ventral telencephalon progenitors. Specifically, mutating the BAF subunit ARID1B affects the fate transition of progenitors to oligodendrocyte and interneuron precursor cells, a phenotype that we confirmed in patient-specific induced pluripotent stem cell-derived organoids. Our study paves the way for high-throughput phenotypic characterization of disease susceptibility genes in organoid models with cell state, molecular pathway and gene regulatory network readouts.
Subject terms: Stem cells, Stem-cell biotechnology, Development of the nervous system
We develop a high-throughput CRISPR screening system in cerebral organoids and identify vulnerable cell types and gene regulatory networks associated with autism spectrum disorder from single-cell transcriptomes and chromatin modalities.
Main
Human cortical development involves unique and intricate processes. Following neural tube formation, neuroepithelial cells within the telencephalon proliferate, expand and generate radial glial progenitors, intermediate progenitors and outer radial glial progenitors. In the dorsal region, these progenitors give rise to layered excitatory neurons. In the ventral telencephalon, they generate interneurons that migrate into the dorsal cortex to integrate with excitatory neurons. These processes are governed by precise and highly orchestrated genetic and molecular programs, many of which have remained elusive3. Research into neurodevelopmental disorders (NDDs) has advanced our understanding of human brain development and helped to reveal how it can go awry. However, many NDDs, such as autism spectrum disorder (ASD), are diagnosed only after birth, when brain development is almost complete. Analysing the developmental and cell type-specific defects associated with ASD in a human context is crucial but is often constrained to neuroimaging and postmortem tissue studies. Moreover, coexpression network analyses of ASD genes indicate that the developmental defects associated with ASD may arise during fetal stages5,6, periods that are difficult to investigate.
Studying the genetic aetiology of NDDs enhances our understanding of disease mechanisms1,7,8, but it usually requires access to the developmental processes of the human brain. Brain organoids recapitulate early brain development and generate diverse cell types found in vivo9. Although organoids have been used to investigate disease-associated genes9–11, they are limited by phenotypic variability and low throughput. Recent studies combining CRISPR screening technology with organoids have revealed the power of such strategies for discovering new gene functions12,13. However, such screens are limited by low-content readouts, which often use guide RNA (gRNA) counts to assess proliferation phenotypes when challenged with genetic perturbations. Although CRISPR screening coupled with single-cell transcriptomic readout provides unprecedented resolution for phenotypic characterization14–16, such approaches have not been fully explored in organoids. The feasibility of single-cell perturbation screening in heterogeneous tissues that undergo long-term differentiation and consist of diverse cell types remains unclear.
Here, we describe the CHOOSE system, which combines parallel genetic perturbations with single-cell transcriptomic readout in mosaic cerebral organoids. We deliver barcoded pairs of gRNAs as a pooled lentiviral library to stem cells and generate telencephalic organoids to identify the loss-of-function phenotypes of 36 high-risk ASD genes at the level of cell types and molecular pathways. Using single-cell multiomic data, we construct a developmental gene regulatory network (GRN) of cerebral organoids and identify ASD-enriched regulatory hubs connected to the genes that are dysregulated in response to genetic perturbations. Among the 36 genes, ARID1B perturbation caused one of the most significant changes in cell type composition. Specifically, perturbing ARID1B expands ventral radial glia cells and increases their transition to early oligodendrocyte precursor cells (OPCs), a phenotype we verify in brain organoids generated from ARID1B patient-derived induced pluripotent stem cell (iPS cell) lines.
Single-cell CRISPR screening in barcoded organoids
Single-cell RNA sequencing (scRNA-seq) is a high-throughput method used to analyse cellular heterogeneity in complex tissues. To establish an organoid system that enables CRISPR perturbations with single-cell transcriptomic readout, we used a human embryonic stem cell (hES cell) line expressing an enhanced specificity SpCas9 (eCas9), which has substantially reduced off-target effects and is controlled by an upstream loxp stop element12 (Fig. 1a). To regulate eCas9 induction, we engineered a lentiviral vector to deliver 4-hydroxytamoxifen-inducible CRE recombinase and a dual single-guide RNA (sgRNA) cassette (Fig. 1a). The dual sgRNA consists of two sgRNAs targeting the same gene, expressed under the U6 or H1 promoter. The dual gRNA is located within the 3′ long terminal repeat (LTR) and is thus transcribed by RNA polymerase II to be captured by scRNA-seq assays17. To ensure efficient generation of loss-of-function alleles, we determined the editing efficiency of each sgRNA pair using a flow cytometry-based gRNA reporter assay (Fig. 1b and Extended Data Fig. 1a). In this assay, a pre-assembled array of gRNA-targeting sequences fused with TagBFP is used to generate 3T3 fibroblast reporter cell lines. sgRNA and eCas9 are then delivered into the reporter cell lines by lentiviruses. Successful genome editing causes frameshift mutations, resulting in the loss of blue fluorescent protein (BFP) fluorescence, which enables the quantitative evaluation of gRNA efficiency (Fig. 1c and Extended Data Fig. 1b,c), although it does not allow for the determination of whether a heterozygous or homozygous mutation was introduced. Using our reporter assay, we selected efficient sgRNA pairs for 36 ASD genes (Extended Data Fig. 1d,e and Supplementary Table 1). Immunohistochemistry analysis of several perturbations in organoids further confirmed the loss of protein products for the majority of selected gRNAs (Extended Data Fig. 2).
We individually cloned sgRNA pairs and pooled them equally to construct a lentiviral plasmid library (Extended Data Fig. 3a,b). To ensure that lentiviral integration frequency was limited to one per cell, we used a low infection rate of 2.5% (ref. 18) (Extended Data Fig. 3c–e). Our analysis indicated homogeneous distribution of the gRNAs in both the plasmid library and hES cells, which was maintained after the formation of embryoid bodies (Fig. 1d). Notably, developing human brain tissue generates clones of highly variable sizes both in vivo and in vitro12,19. To monitor the clonal complexity of the founder cells, we introduced a unique clone barcode (UCB; 1.4 × 10⁷ combinations) for each dual sgRNA cassette to label individual lentiviral integration events (Fig. 1a and Extended Data Fig. 4a,b). Using this strategy, we obtained an average of 2,770 unique clones for each perturbation, which we used to generate mosaic embryoid bodies (Fig. 1e). Altogether, we established a highly efficient and controlled pooled screening system with high clonal complexity in the organoid.
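The rationale for the low infection rate can be illustrated with a short calculation. The sketch below assumes that lentiviral integrations per cell follow a Poisson distribution, a common modelling assumption that is not stated in the text above; the function name is hypothetical. Under that assumption, it estimates the multiplicity of infection implied by a given infected fraction and the share of infected cells expected to carry more than one integration.

```python
import math


def multiple_integration_fraction(infected_fraction: float) -> tuple[float, float]:
    """Estimate MOI and the share of infected cells with >1 integration.

    Assumes integrations per cell are Poisson distributed (an assumption
    made for illustration, not a statement about the screen itself).
    """
    if not 0.0 < infected_fraction < 1.0:
        raise ValueError("infected_fraction must lie strictly between 0 and 1")
    # P(0 integrations) = exp(-MOI) = 1 - infected_fraction
    moi = -math.log(1.0 - infected_fraction)
    # P(exactly 1 integration) = MOI * exp(-MOI)
    p_single = moi * math.exp(-moi)
    p_multiple = infected_fraction - p_single
    return moi, p_multiple / infected_fraction


moi, multi_share = multiple_integration_fraction(0.025)
print(f"implied MOI: {moi:.4f}")
print(f"infected cells with more than one integration: {multi_share:.2%}")
```

With an infected fraction of 2.5%, this model predicts that only about 1% of infected cells carry more than one integration, which is consistent with the stated aim of one integration per cell.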
CHOOSE organoids generate diverse cell types
Cortical abnormalities are a prominent feature of ASD8,20. Many ASD risk genes associated with transcriptional regulation and chromatin remodelling are crucial for cortical development21,22. Therefore, we aimed to leverage our methodology to explore loss-of-function phenotypes for 36 transcriptional control genes with high ASD causal confidence (Simons Foundation Autism Research Initiative (SFARI) gene score 1)7.
We used previously established protocols that reproducibly generate human telencephalon organoids23,24 (Extended Data Fig. 5a–c). eCas9 was induced in 5-day-old embryoid bodies, followed by neural induction. Fluorescence-activated cell sorting (FACS)-based analyses suggest that mutant cells (GFP+/dTomato+) remained at low percentages throughout development, with an average of 21.8% on day 120 (Extended Data Fig. 5d–f). It is likely that this could limit the mutant cell–cell interactions within the mosaic tissues. Single-cell transcriptome profiling of cerebral organoids at 4 months revealed a large diversity of dorsal and ventral telencephalon cell populations (Fig. 1f,g and Extended Data Fig. 6a–j) (14 independent pools of organoids, 3–7 organoids each pool, 65 organoids in total, three independent batches). We first annotated cell clusters based on control cells (non-targeting gRNA control) and eCas9-uninduced cells (35,203 cells) (Extended Data Fig. 6d–f). Cell-type labels for the full CHOOSE dataset (49,754 cells) were then derived through a label transfer. Broadly, we found that perturbed cells adopted cell states resembling those found in the unperturbed controls (Fig. 1f). We identified progenitor cells with dorsal (PAX6) or ventral (ASCL1, OLIG2) origins. Among the cells with dorsal identity, we identified radial glial cells (RGCs; VIM), cycling RGCs (ASPM), outer RGCs (oRGCs; HOPX) and intermediate progenitor cells (IPCs; EOMES). These progenitors differentiated into excitatory neurons with specific layer identities, including layer 5/6 neurons (L5/6; BCL11B), L6 cortical thalamic projection neurons (FOXP2, TLE4)25, L4 neurons (RORB, UNC5D, NR2F1)26,27 and L2/3 neurons (SATB2). 
Ventral radial glial cells (v-RGCs; cell cycling ventral RGCs, ccv-RGCs) differentiated into interneuron precursor cells (INPs; DLX2), which generated interneurons with lateral ganglionic eminence (LGE) origin (LGE-IN; MEIS2) or caudal ganglionic eminence (CGE) origin (CGE-IN; NR2F2)28. Notably, we found a cluster of interneurons expressing MEIS2 and high levels of PAX6 (LGE PAX6+ interneurons), a signature that resembles mouse olfactory bulb precursors that were recently reported to generate neurons redirected to white matter in primates29. In addition to neuronal populations, we identified glial cell populations including astrocytes (S100B, APOE, ALDH1L1) and OPCs (OLIG2, PDGFRA) (Fig. 1f,g). RNA velocity analysis30,31 revealed developmental trajectories from neural progenitor cells to neuronal populations in both the dorsal and ventral telencephalons (Extended Data Fig. 6k). Further analysis of our organoid dataset and a primary developing human brain dataset32 revealed cell type-specific expression patterns for several ASD genes that we targeted in our screen (Extended Data Fig. 6l,m). In summary, our scRNA-seq dataset of 4-month-old cerebral organoids recapitulates diverse telencephalic cell types that are present in the developing human brain.
Cell proliferation and depletion phenotypes
Aberrant cell proliferation during brain development has been suggested to contribute to ASD pathology33. To test whether ASD genetic perturbations could affect cell proliferation, we recovered gRNA information from scRNA-seq complementary DNA libraries as well as bulk-extracted genomic DNA from organoid pools of four different batches (Extended Data Fig. 7a). We observed a heterogeneous gRNA representation in eCas9-induced cells at 4 months. A time course analysis revealed a deviation from the initial gRNA distribution as early as day 20 (Extended Data Fig. 7b). When comparing eCas9-induced with uninduced cells from scRNA-seq libraries, we found significantly enriched and depleted gRNAs (Fig. 2a,b) (induced cells, n = 14 independent pools of organoids, three batches; uninduced cells, n = 8 independent pools of organoids, two batches). Using a FACS-based approach, we further confirmed the enrichment and depletion phenotypes in individually perturbed organoids for the four genes (KMT2C, LEO1, ADNP and WAC) that showed the largest effect sizes (Extended Data Fig. 8).
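The comparison of gRNA representation between induced and uninduced cells can be illustrated as follows. This is a minimal sketch, not the statistical procedure used in the screen: it computes a log2 fold change of each gRNA's share of cells between two conditions, with a pseudocount to avoid division by zero. The function name, the pseudocount and the example counts are all hypothetical.

```python
import math


def grna_log2_fold_change(induced, uninduced, pseudocount=1.0):
    """Return log2(induced share / uninduced share) for each gRNA target.

    `induced` and `uninduced` map a target name to the number of cells
    carrying that gRNA. Shares are computed after adding the pseudocount.
    """
    targets = sorted(set(induced) | set(uninduced))
    ind = {t: induced.get(t, 0) + pseudocount for t in targets}
    unind = {t: uninduced.get(t, 0) + pseudocount for t in targets}
    ind_total = sum(ind.values())
    unind_total = sum(unind.values())
    return {
        t: math.log2((ind[t] / ind_total) / (unind[t] / unind_total))
        for t in targets
    }


# Hypothetical cell counts per gRNA target, for illustration only.
induced_counts = {"GENE_A": 300, "GENE_B": 100, "CONTROL": 200}
uninduced_counts = {"GENE_A": 100, "GENE_B": 100, "CONTROL": 100}

for target, lfc in grna_log2_fold_change(induced_counts, uninduced_counts).items():
    print(f"{target}: log2 fold change = {lfc:+.2f}")
```

A positive value indicates that the gRNA is over-represented after eCas9 induction and a negative value indicates depletion. Significance would still need to be assessed across replicate pools.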
To ensure that the observed cell proliferation phenotypes were not driven by clonal effects, we determined the complexity of clones recovered from scRNA-seq libraries. On average, we recovered 125 clones for each perturbation and found that the clones were distributed across all libraries (Extended Data Fig. 4c,d). Analysing the size of each clone, we found a mean of 4.4 cells per clone (Extended Data Fig. 4e,f). These data suggest that cells captured in the CHOOSE screen came from diverse and relatively small clones, which are crucial for mitigating dominant clonal effects. Bulk analysis of the genomic DNA with a higher cell number input (50,000–150,000) revealed high clonal complexity in both eCas9-uninduced and eCas9-induced cells, with a homogeneous distribution only in uninduced cells (Extended Data Fig. 4g,h). In conclusion, using a pooled, high-complexity barcoding screening system, we successfully identified ASD risk genes that play essential roles in cell proliferation and survival.
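The clonal summary described above (number of clones per perturbation and cells per clone) can be sketched in a few lines. The example assumes that each cell has already been assigned a target gene and a unique clone barcode; the input format, function name and example values are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter, defaultdict
from statistics import mean


def clone_summary(cells):
    """Summarize clonal complexity per perturbation.

    `cells` is an iterable of (target_gene, clone_barcode) tuples, one per
    cell. Returns, per gene, the number of clones, the mean clone size and
    the fraction of cells that belong to the largest clone.
    """
    clone_sizes = Counter(cells)  # (gene, barcode) -> number of cells
    sizes_per_gene = defaultdict(list)
    for (gene, _barcode), size in clone_sizes.items():
        sizes_per_gene[gene].append(size)
    return {
        gene: {
            "n_clones": len(sizes),
            "mean_clone_size": mean(sizes),
            "largest_clone_fraction": max(sizes) / sum(sizes),
        }
        for gene, sizes in sizes_per_gene.items()
    }


# Hypothetical per-cell assignments, for illustration only.
example_cells = (
    [("ARID1B", "bc1")] * 3 + [("ARID1B", "bc2")] + [("KMT2C", "bc3")] * 2
)
for gene, stats in clone_summary(example_cells).items():
    print(gene, stats)
```

A large `largest_clone_fraction` would flag a perturbation whose apparent phenotype may be driven by a single dominant clone.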
Cell-type-specific effects of perturbing ASD genes
Cell type-specific alterations have been observed in patients with ASD and brain organoid disease models11,34. The large cellular diversity detected in our system enables us to systematically assess, compare and categorize the effects of ASD gene perturbations on cell states. Using a Cochran–Mantel–Haenszel (CMH) test stratified by library replicates, we first assessed the differential abundance of dorsal versus ventral telencephalic cells, as well as the abundance of each individual cell type in each perturbation versus a non-targeting gRNA control (11 independent pools of organoids, two independent batches) (Fig. 2c and Extended Data Fig. 9).
For 24 perturbations, we detected significant changes in the ratio of dorsal to ventral cells (Fig. 2c). Notably, most perturbations (21 of 24) lowered the dorsal-to-ventral ratio. Perturbation of KMT2C, for example, led to a strong enrichment of ventral cells. For 23 perturbations, we detected changes in the abundance of at least one cell type (CMH test, false discovery rate (FDR) < 0.05) (Fig. 2c). On the other hand, six perturbations specifically targeted one cell type without affecting others, including ADNP (L2/3 enrichment), POGZ (RGC enrichment) and SETD5 (ccv-RGC enrichment).
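To make the stratified abundance test concrete, the following is a minimal sketch of a Cochran–Mantel–Haenszel test on 2 × 2 tables, with one table per library replicate. It is written in plain Python for illustration and is not the analysis code used in the screen; the function name, the table layout and the example counts are assumptions, and multiple-testing correction (FDR) is not included.

```python
import math


def cmh_test(tables, continuity_correction=False):
    """Cochran-Mantel-Haenszel test across stratified 2x2 tables.

    Each table is ((a, b), (c, d)) for one stratum (here: one library):
        a = perturbed cells in the cell type, b = perturbed cells elsewhere,
        c = control cells in the cell type,   d = control cells elsewhere.
    Returns (chi-square statistic, P value with 1 d.f., pooled odds ratio).
    """
    deviation = variance = or_num = or_den = 0.0
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        deviation += a - (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        or_num += a * d / n
        or_den += b * c / n
    if variance == 0.0:
        raise ValueError("no informative strata")
    adjusted = abs(deviation) - (0.5 if continuity_correction else 0.0)
    statistic = max(adjusted, 0.0) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    odds_ratio = or_num / or_den if or_den > 0 else math.inf
    return statistic, p_value, odds_ratio


# Hypothetical counts from two libraries, for illustration only.
libraries = [((30, 70), (10, 90)), ((25, 75), (12, 88))]
stat, p, odds = cmh_test(libraries)
print(f"CMH statistic = {stat:.2f}, P = {p:.2e}, pooled odds ratio = {odds:.2f}")
```

A pooled odds ratio above 1 corresponds to enrichment of the cell type among perturbed cells and a value below 1 to depletion. Stratifying by library keeps differences in composition between libraries from being mistaken for a perturbation effect.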
Among the progenitors, we identified IPC depletion as a strong, convergent effect in 12 perturbations (Fig. 2c) (for example, CHD2, KAT2B and KMT2C). Additionally, we observed an enrichment of v-RGCs and/or ccv-RGCs in 10 perturbations (for example, ARID1B, BCL11A and DEAF1) and an enrichment of INPs in 10 perturbations (for example, ILF2, MED13 and TCF20). To validate these phenotypes, we generated individually perturbed organoids for several genes. We performed immunohistochemical analyses of IPCs and INPs at day 60, a stage when organoids present a radially organized structure including ventricular zone (VZ), subventricular zone (SVZ) and cortical plate, enabling robust examination of the progenitors35. Consistent with the screen data, we detected significantly decreased EOMES+ IPCs in KMT2C and PHF3 perturbations (Extended Data Fig. 11) and increased DLX2+ INPs in KMT2C, MED13, PHF3 and TBL1XR1 perturbations (Extended Data Fig. 12). Furthermore, we analysed specific neuronal subpopulations and found that L2/3 excitatory neurons were more impacted and depleted in the majority of perturbations (Fig. 2c). Analysis of interneurons revealed enrichment of both CGE-INs and LGE-INs in three perturbations and enrichment of only CGE-INs or LGE-INs in four perturbations. In addition, LGE PAX6+ interneurons were depleted by PHF3 perturbation. These data indicate an interneuron subtype-specific response to ASD genetic perturbations. Beyond neuronal cell clusters, we found that astrocytes were significantly enriched in CIC perturbation.
To assess the consistency of the effects across replicates from different batches, we performed a t-test on individual enrichment and depletion effects (Extended Data Fig. 10). We observed that most effects detected at the single-cell level were also largely consistent across organoids grown from different batches, further supporting the robustness and reproducibility of our system.
Collectively, the CHOOSE system allowed us to simultaneously investigate the effects of multiple ASD genes on cell fate determination. We found that progenitors, including IPCs and INPs, as well as L2/3 excitatory neurons, were among the most affected cell types. Furthermore, our data indicate that ASD pathology could emerge as early as the neural progenitor stage.
Altered gene expression upon ASD gene perturbation
To further assess the molecular changes caused by each perturbation, we performed a differential gene expression analysis, comparing each perturbation with controls within dorsal and ventral trajectories. We detected 2,071 differentially expressed genes (DEGs) across all perturbations (Fig. 3a and Supplementary Data 2). Additionally, we could identify genes dysregulated in both dorsal and ventral populations, as well as those specifically dysregulated in one population (KMT2C, LEO1 perturbation) (Extended Data Fig. 13a). We ranked DEGs by detection frequency and discovered that many genes were differentially expressed in multiple perturbations (Fig. 3b). Notably, in the dorsal populations, seven perturbations caused CHCHD2 downregulation (Fig. 3a). CHCHD2 encodes a mitochondrial protein, and its downregulation has been observed in neurons from the postmortem brains of patients with ASD34. In the ventral cell populations, the most frequently detected DEG is the adhesion molecule gene CSMD1 (Fig. 3a,b), which is upregulated by ARID1B, CIC, MED13 and PHF3 perturbations and downregulated by LEO1 and KMT2C perturbations. To ensure a balanced impact of differential DEG numbers across all perturbations on downstream analyses, we selected the top 30 DEGs for each perturbation (TOP-DEGs) for gene ontology term enrichment analysis. We found that cell adhesion, cell differentiation, forebrain development and axogenesis were among the most associated biological processes (Fig. 3c and Supplementary Data 3). We also observed many perturbation-specific gene ontology terms, many of which confirm previous studies, further supporting the power of detecting complex biological phenotypes with the CHOOSE system (Extended Data Fig. 13b and Supplementary Data 4). 
Some notable examples include the gene ontology terms ribosome assembly (SETD5 perturbation)36, mitochondrion organization (FOXP1 perturbation)37, lipid homoeostasis (IRF2BPL perturbation)38, autophagosome maturation (KAT2B perturbation)39 and cilium development (MECP2 perturbation)40.
ASD-associated regulatory modules
When combining the TOP-DEGs from all perturbations, we found that they were significantly enriched in risk genes associated with ASD (SFARI database, 1,031 genes; approximately twofold enrichment, Fisher exact test P < 10⁻⁵ against genes expressed in 5% of cells as background) (Fig. 3d and Supplementary Data 2). Notably, we did not observe enrichment in risk genes for other NDDs, such as intellectual disability (ID; sysID database, 936 primary ID genes) (Fig. 3d), suggesting that certain biological processes and regulatory programs might be specifically relevant to ASD-associated gene perturbations. To explore these potential gene regulatory ‘hubs’, we first generated single-cell multiome data, including single-cell transcriptome and chromatin accessibility modalities, from 4-month-old cerebral organoids (Extended Data Fig. 14a–g). Using Pando41, we harnessed these multimodal measurements to infer a GRN of the developing telencephalon and extract sets of genes regulated by each transcription factor (TF) module, as well as positive and negative regulatory interactions between the TFs (Extended Data Fig. 14h,i and Supplementary Data 5). We visualized this GRN on the level of TFs using a uniform manifold approximation and projection (UMAP) embedding42, which revealed distinct TF groups active in neural progenitor cells (PAX6, GLI3), inhibitory neurons (DLX2, DLX6) and excitatory neurons (NEUROD2, NFIB, SATB2) as well as regulatory interactions between the TFs (Fig. 3e).
To test whether regulatory subnetworks indeed exist at which ASD risk genes accumulate, we tested all TF modules for enrichment with SFARI genes. We found significant enrichment for a set of 40 TFs (adjusted Fisher test P < 0.01, more than twofold enrichment; for example, EOMES, OLIG1, DLX2) (Extended Data Fig. 14j), among which 14 TFs were encoded by ASD risk genes (for example, NFIA, BCL11A, MEF2C) (Fig. 3e). All TF regulatory modules enriched in SFARI genes together form an ASD-associated sub-GRN (Supplementary Data 6).
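The module enrichment test can be sketched as a one-sided Fisher exact test, which is equivalent to a hypergeometric tail probability. The example below is an illustration in plain Python rather than the analysis code used here; the function name and the example numbers are hypothetical, and P values would still need adjustment for the number of TF modules tested.

```python
from math import comb


def module_enrichment(overlap, module_size, risk_set_size, background_size):
    """One-sided Fisher exact test for enrichment of risk genes in a module.

    Returns (P value, fold enrichment), where the P value is the
    hypergeometric probability of observing at least `overlap` risk genes
    among `module_size` genes drawn from a background of `background_size`
    genes that contains `risk_set_size` risk genes.
    """
    max_overlap = min(module_size, risk_set_size)
    total = comb(background_size, module_size)
    tail = sum(
        comb(risk_set_size, k)
        * comb(background_size - risk_set_size, module_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    expected = module_size * risk_set_size / background_size
    return tail / total, overlap / expected


# Hypothetical numbers: a 10-gene module containing 5 of 10 risk genes,
# tested against a background of 100 expressed genes.
p_value, fold = module_enrichment(5, 10, 10, 100)
print(f"P = {p_value:.2e}, fold enrichment = {fold:.1f}")
```

The choice of background matters: restricting it to genes expressed in the dataset, as done above for the TOP-DEG analysis, avoids inflating enrichment through genes that could never have been detected.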
Next, we assessed the transcriptomic effect of ASD genetic perturbations in the context of the inferred GRN. We performed enrichment tests (Fisher exact test) on perturbation-induced TOP-DEGs (CHOOSE DEGs) from dorsal and ventral telencephalic cells separately. We found that, similar to ASD risk genes, CHOOSE DEGs were enriched in specific TF modules (Fig. 3f and Supplementary Data 7). In the ventral telencephalic cells, CREB5, MEIS2, NFIA and OLIG1 were most strongly affected, whereas dorsal telencephalon-specific DEGs were strongly enriched in SATB2, BACH2, MEF2C and EOMES modules. Notably, some of the ASD-associated TF modules were among the most strongly enriched in CHOOSE DEGs, supporting their role in ASD-associated gene dysregulation (Fig. 3f). We finally present gene regulatory subnetworks of OLIG1 and EOMES, which are both enriched in ASD risk genes and strongly affected by ASD genetic perturbations (Fig. 3g). Oligodendrocyte transcription factor 1 (OLIG1) is preferentially expressed in the ventral telencephalon and is a key regulator for interneuron and oligodendrocyte lineages. Eomesodermin (EOMES) is a key TF for the fate specification of IPCs in the dorsal telencephalon. The enrichment of OLIG1 and EOMES regulomes suggests potentially vulnerable cell fate specification-related regulatory networks upon ASD genetic perturbations.
Thus, we have characterized gene expression changes for each genetic perturbation in both dorsal and ventral telencephalon and uncovered molecular programs shared between different perturbations. Leveraging GRN inference from multiomic data, we further identified ASD-associated TF modules during cortical development and critical regulatory hubs underlying the detected gene expression changes.
Effects of ARID1B perturbation on v-RGCs
Among the 36 genes, we found that ARID1B perturbation caused one of the most significant enrichments of v-RGCs (Fig. 2c). Notably, the OLIG1 regulatory module was also enriched in DEGs caused by ARID1B perturbation (Extended Data Fig. 15a). These data motivated us to further investigate how v-RGCs are affected by ARID1B perturbation in the screen. We used Cellrank43 to delineate the developmental trajectories leading to different interneuron subtypes and OPC populations (Fig. 4a). We visualized the terminal fate probabilities for each cell as a circular projection, which revealed a distinct differentiation trajectory from ventral progenitors towards early OPCs (OLIG2, PDGFRA) and a branching of INPs (DLX2) into different inhibitory neuronal fates (DLX5) (Fig. 4b). We found that ARID1B-perturbed cells were strongly enriched in the OPC trajectory and had a higher percentage of OLIG2+ v-RGCs (Fig. 4c,d). This is an interesting finding given that OLIG2 is known to regulate progenitor self-renewal at earlier developmental stages and is a master regulator for oligodendrocyte lineage specification in the ventral telencephalon44,45. We then analysed the fate transition probabilities of ventral progenitors and found that ARID1B-perturbed v-RGCs have significantly higher transition probabilities towards early OPCs than neuronal fates (Fig. 4e).
Loss-of-function mutations in ARID1B have been shown to cause ID and ASD7,46. To confirm whether our findings are relevant to human disorders, we recruited two patients with heterozygous ARID1B mutations. Patient 1 harbours a nucleotide duplication (c.2201dupG), resulting in an early STOP codon. Patient 2 carries a microdeletion (6q25.3del) that includes exon 8 and the downstream region of the ARID1B locus (Extended Data Fig. 15b,c). We established iPS cell lines from both patients and a mutation-corrected cell line for patient 1 as an isogenic control. To investigate the behaviour of v-RGCs, we used a previously published protocol that uses smoothened agonist (SAG) and inhibitor of Wnt production-2 (IWP2) to specifically guide organoids to develop ventral telencephalic tissue47,48. In line with our previous findings, we observed considerably increased OLIG2+, DLX2+ and DLX2+OLIG2+ cells in 40-day-old organoids from both patients compared with control organoids (Fig. 4f,g). To further explore the potential consequences of such defects in patients, we analysed the prenatal brain structure of patient 2 at two gestation stages (gestational weeks (GW) 22 and GW31) by intrauterine super-resolution magnetic resonance imaging (MRI). Three-dimensional reconstruction of the ganglionic eminence (GE), the source of ventral telencephalon progenitors, and normalization to cortical and total brain volume revealed an enlarged GE compared with multiple age-matched controls at both examined developmental stages (Fig. 4h and Extended Data Fig. 15d–h), which could be partially due to an increase in ventral progenitors. Taken together, the enrichment of v-RGCs and ccv-RGCs (Fig. 2c), the higher transition probability of v-RGCs to early OPCs, and the increased proportion of OLIG2+ cells in our screen and in organoids generated from two patient iPS cell lines all suggest that ARID1B perturbation leads to abnormal ventral progenitor expansion and aberrant cell fate specification (Fig. 4i). The enlarged volume of GE in the patient with an ARID1B mutation is consistent with these observations.
Discussion
We have developed the CHOOSE system to characterize loss-of-function phenotypes of high-risk ASD genes across dozens of cell types spanning early brain developmental stages in cerebral organoids. By employing a pooled CRISPR screening system in conjunction with validation, our study provides a developmental and cell type-specific phenotypic database for ASD gene loss-of-function research. IPCs, transit-amplifying dorsal progenitors that generate neurons for all cortical layers and contribute to the evolutionary expansion of the human cortex49,50, appeared to be particularly susceptible to ASD genetic perturbations. Among the ventral telencephalon cells, we have identified strong enrichment of v-RGCs, ccv-RGCs and INPs, which are progenitors that differentiate into interneurons and oligodendrocytes44,51. Furthermore, our findings indicate that L2/3 excitatory neurons are more vulnerable than other neuronal populations to ASD perturbations. This aligns with the observation that ASD risk gene coexpression networks are enriched in upper-layer neurons during development and that these neurons are preferentially affected in ASD patients6,34. In our screen, we assessed 19 ASD genes known to be involved in epigenetic regulation. Despite their broad role in cell differentiation, perturbations of these genes impacted specific cell types and biological processes during brain development. DEG and gene ontology enrichment analyses revealed both common and distinct molecular processes impaired in different perturbations, suggesting that both convergent and divergent mechanisms contribute to ASD pathophysiology. Furthermore, we constructed a telencephalon developmental trajectories-based GRN and identified ASD-associated regulatory modules in dorsal and ventral cell populations. The OLIG1 module is particularly interesting as many of its downstream targets are ASD risk genes. 
This module was previously identified as crucial for oligodendrocyte differentiation in the developing human cortex22, highlighting the involvement of the oligodendrocytes in ASD pathophysiology.
The combination of high-content perturbation screening and validation in a patient-specific context exemplifies the effectiveness of employing organoid systems to study NDDs. We discovered that loss of ARID1B leads to increased transition of ventral progenitors to early OPCs. Importantly, perturbations of three BAF complex members (ARID1B, BCL11A and SMARCC2) all lead to enrichment of v-RGCs, indicating the critical role of the BAF complex in regulating ventral telencephalon cell fate specification. Given the cell type-specific expression of each BAF subunit and their involvement in NDDs52, it would be interesting to investigate how ARID1B or other subunits regulate oligodendrocyte and interneuron specification, as well as their contribution to NDDs.
Our study has limitations. First, our system lacks certain brain cell types, such as microglia, and does not include interneurons derived from the medial ganglionic eminence. Future studies should explore the impact of ASD gene alterations on these cell types or use a system that better resembles in vivo cell-type complexity. Second, we do not know whether perturbed cells are heterozygous or homozygous for each mutant. It would be interesting to generate precisely edited cells to compare mutation-specific phenotypes. Additionally, the effects of perturbations of ASD risk genes on cell-type abundances can sometimes be transient11,53, and therefore certain abnormalities during development may not be captured.
The ability to determine cell type-specific contributions to genetic disorders in a systematic, scalable and efficient manner will greatly enhance our understanding of disease mechanisms. As the CHOOSE system provides a robust, precisely controlled screening strategy, we anticipate that it will be widely applied beyond brain organoids to study disease-associated genes.
Methods
Stem cell and cerebral organoid culture conditions
Feeder-free hES cells or iPS cells were cultured on hES cell-qualified Matrigel (Corning, catalogue no. 354277)-coated plates with Essential8 stem cell medium supplemented with bovine serum albumin (BSA). H9 embryonic stem cells were obtained from WiCell. Cells were maintained in a 5% CO2 incubator at 37 °C. All cell lines were authenticated using a short tandem repeat assay, tested for genomic integrity using single-nucleotide polymorphism (SNP) array genotyping and routinely tested negative for mycoplasma.
Cerebral organoids were generated using a previously published protocol with modifications23. In brief, cells were cultured to 70–80% confluency, and 16,000 live cells in 150 μl Essential8 medium supplemented with Revitacell (ThermoFisher, catalogue no. A2644501) were added to each well of a U-bottom ultralow attachment 96-well plate (Corning, catalogue no. CLS3473) to form embryoid bodies. For eCas9 induction, 4-hydroxytamoxifen (Sigma-Aldrich, catalogue no. H7904) was added on day 5 at a concentration of 0.3 μg ml−1. Neural induction was started on day 6. Embryoid bodies were embedded in Matrigel (Corning, catalogue no. 3524234) at day 11 or 12 based on morphology check. CHIR99021 (Merck, catalogue no. 361571) at 3 μM was added from day 13 to day 16, and medium was switched to improved differentiation medium supplemented with B27 minus vitamin A (IDM-A) at day 14. On day 25, medium was switched to improved differentiation medium supplemented with B27 plus vitamin A (IDM+A); 1% dissolved Matrigel was added to the medium from day 40 to day 90. From day 60 to day 70, medium was gradually switched to Brainphys neuronal medium (Stemcell Technologies, catalogue no. 05790) and supplemented with brain-derived neurotrophic factor (BDNF) (20 ng ml−1; Stemcell Technologies, catalogue no. 78005.3), glial cell line-derived neurotrophic factor (GDNF) (20 ng ml−1; Stemcell Technologies, catalogue no. 78058.3) and bucladesine sodium (1 mM; MedChemExpress, catalogue no. HY-B0764)24. For ventralized organoids, we followed a previously published protocol47. Embryoid bodies were not embedded, and patterning factors, including 100 nM SAG (Merck-Millipore, catalogue no. US1566660) and 2.5 μM IWP2 (Sigma-Aldrich, catalogue no. IO536), were added from day 5 to day 11.
CHOOSE screen
sgRNA selection and cloning
The top four sgRNAs were first selected on the basis of predictions using multilayered Vienna Bioactivity CRISPR (VBC) score54 and then subjected to the reporter assay (below) to test editing efficiency. sgRNAs were cloned into the gRNA reporter assay lentivirus construct (containing the dual sgRNA cassette: U6-sgRNA1-H1-sgRNA2) using the GeCKO cloning protocol55. The two sgRNAs were cloned using type IIS class restriction enzymes FastDigest BpiI (ThermoFisher, catalogue no. FD1014) and Esp3I (ThermoFisher, catalogue no. FD0454) separately and verified using Sanger sequencing. All gRNAs used for this study can be found in Supplementary Table 1.
sgRNA reporter assay
A construct containing dTomato-2A-gRNA target array-TagBFP under the RSV promoter was assembled using Gibson assembly. The construct was packaged into retrovirus using the Platinum-E retroviral packaging cell line via the calcium phosphate-based transfection method. Virus-containing supernatant (Dulbecco’s modified Eagle’s medium (DMEM), 10% fetal bovine serum (FBS), 2 mM l-glutamine, 100 U ml−1 penicillin and 0.1 mg ml−1 streptomycin) was collected for up to 72 h, filtered through a 0.45-μm filter and then stored on ice. Retroviruses were then used to infect NIH-3T3 cells, and dTomato-positive cells were sorted using flow cytometry into single cells to establish reporter cell lines. To deliver sgRNAs, the lentiviral construct containing the dual gRNA cassette and the spleen focus-forming virus (SFFV) promoter driving eCas9 were packaged using HEK293 cells to produce lentivirus. The reporter 3T3 cell lines generated above were cultured in six-well plates and infected with lentivirus containing a dual sgRNA cassette targeting each gene individually. BFP fluorescence was measured at 7, 14 and 20 days postinfection. Changes in BFP fluorescence at 20 days postinfection were used to evaluate the gRNA editing efficiency. In total, 98 dual sgRNA cassettes were tested for 36 genes.
Generation of barcoded CHOOSE lentivirus, hES cell infection and embryoid body generation
The CHOOSE lentiviral vector was constructed based on a previously published lentiviral vector that carries a CAG-driven ERT2-Cre-ERT2-P2A-EGFP-P2A-puro cassette12. A multicloning site including NheI and SgsI recognition sequences was introduced to the 3′ LTR of the lentivirus backbone according to the CROP-seq vector design17. Then, the original U6 sgRNA expression cassette was removed; instead, the dual sgRNA (U6-sgRNA1-H1-sgRNA2) cassette was introduced to the 3′ LTR cloning site. To generate a barcoded library, the following primers were used to individually amplify each dual sgRNA cassette from the lentiviral construct used in the reporter assay (8–10 cycles, monitored using a quantitative polymerase chain reaction (qPCR) machine and stopped on reaching the logarithmic phase) while introducing a 15-base-pair barcode:
FW primer: 5′-tcgaccgctagcagggcctatttcccatga-3′.
RV primer: 5′-cagtagggcgcgccNVDNHBNVDNHBNVDccggcgaaccatgatcaaa-3′.
Equal molar amounts of amplicons for the ASD library (36 paired sgRNAs targeting ASD genes) or control library (a paired non-targeting control gRNA) were pooled. Amplicons and lentiviral backbone were then digested with FastDigest NheI (ThermoFisher, catalogue no. FD0973) and FastDigest SgsI (ThermoFisher, catalogue no. FD1894) and gel purified. Ligation was performed using T4 DNA ligase (ThermoFisher, catalogue no. EL0011) and cleaned up by phenol-chloroform extraction. In total, 90 ng of ASD library plasmids and 30 ng of control library plasmids were used for electroporation of MegaX DH10B T1R Electrocomp Cells (ThermoFisher, catalogue no. C640003) following the manufacturer’s guide. Bacteria were plated on lysogeny broth (LB) plates containing ampicillin. Dilutions were performed to calculate the complexity; 2.6 × 10^7 colonies were obtained for the ASD library, and 0.5 × 10^7 colonies were obtained for the control library. Lentiviruses were packaged using HEK293T cells, and infection of hES cells was performed as before12. Infection rate was controlled to be lower than 5% to prevent multiple infections18; 6.6 × 10^5 ASD library cells and 2.3 × 10^5 control library cells positive for GFP were sorted by flow cytometry. Cells were recovered and passaged two times in 10 cm dishes to maintain maximum complexity. Cells were mixed at a ratio of 96:4 (ASD:control) and then used to make embryoid bodies. For individual gene validations, lentivirus carrying a dual gRNA cassette targeting only one gene was packaged and used to infect the eCas9-inducible cell line. Cells were then collected by FACS and used to make embryoid bodies. Organoids were cultured using the conditions described above.
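The degenerate barcode encoded in the RV primer can be checked computationally. The following minimal Python sketch, which is not part of the published pipeline, translates the IUPAC pattern NVDNHBNVDNHBNVD into a regular expression, tests whether a candidate barcode follows it and computes the theoretical number of distinct barcodes. The function names are illustrative. The sketch assumes that barcodes are compared in primer orientation; sequences read from the opposite strand would need to be reverse complemented first.

```python
import re

# Standard IUPAC degenerate codes that occur in the barcode part of the RV primer.
IUPAC = {"N": "ACGT", "V": "ACG", "D": "AGT", "H": "ACT", "B": "CGT"}

# Degenerate stretch copied from the RV primer (15 positions).
BARCODE_PATTERN = "NVDNHBNVDNHBNVD"


def pattern_to_regex(pattern):
    """Translate an IUPAC pattern into a compiled regular expression."""
    return re.compile("".join("[" + IUPAC[code] + "]" for code in pattern))


def theoretical_complexity(pattern):
    """Number of distinct sequences that the degenerate pattern can encode."""
    total = 1
    for code in pattern:
        total *= len(IUPAC[code])
    return total


BARCODE_RE = pattern_to_regex(BARCODE_PATTERN)


def follows_pattern(barcode):
    """True if the barcode has the right length and allowed bases at each position."""
    return BARCODE_RE.fullmatch(barcode.upper()) is not None
```

Under this pattern, each NVD or NHB triplet allows 4 × 3 × 3 = 36 sequences, so `theoretical_complexity(BARCODE_PATTERN)` evaluates to 36^5 = 60,466,176.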
Cerebral organoid tissue dissociation, FACS and scRNA-seq
For each library, three to seven organoids at 4 months were pooled, washed twice in Dulbecco’s phosphate-buffered saline (DPBS)−/− and dissociated using the gentleMACS dissociator in trypsin–accutase (1×) solution with TURBO DNase (2 μl ml−1; ThermoFisher, catalogue no. AM2238). After dissociation, DPBS−/− supplemented with 10% FBS (DPBS–10% FBS) was gradually added to stop the reaction. Samples were then centrifuged at 400g for 5 min at 4 °C, and the supernatant was aspirated without touching the pellet. The pellet was then resuspended in an additional 1–2 ml of DPBS–10% FBS and filtered through a 70 μm strainer into FACS tubes. Cells were then stained with viability dye DRAQ7 (Biostatus; DR70250, 0.3 mM). Target live cells were sorted with a BD FACSAria III on the Alexa 700 filter with low pressure (100 μm nozzle) and collected in DPBS–10% FBS at 4 °C. Cells were then centrifuged and resuspended in DPBS–10% FBS to achieve a target concentration of 450–1,000 cells per microliter. Samples with more than 85% viability were processed. For each library, 16,000 cells were loaded onto a 10× Chromium controller to target a recovery of 10,000 cells. Libraries were prepared using the Chromium Single Cell 3′ Reagent Kits (v.3.1) following the 10× user guide. Libraries were sequenced on a Novaseq S2 or S4 flow cell with a target of 25,000 paired-end reads per cell.
Custom genomic reference
Each cell expresses eCas9 from a genomic locus (AAVS1) and a polyadenylated dual sgRNA cassette, which is delivered by lentivirus and integrated into the genome. To cover these extrinsic elements, we built a custom genomic reference for mapping 10× single-cell data by amending the GRCh38 human reference. As the individual gRNA sequences differed, we masked them by Ns so as not to interfere with mapping (individual gRNA information is distinguished in a separate counting pipeline). The sequences added covered the genomic loci of AAVS1 with eCas9-dTomato-WPRE-SV40 and the masked lentiviral construct.
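The masking step can be illustrated with a short Python sketch. It is a hypothetical helper rather than the code used to build the reference, and it assumes that the positions of the variable gRNA sequences within the construct are known as 0-based, end-exclusive intervals.

```python
from typing import List, Tuple


def mask_regions(sequence: str, regions: List[Tuple[int, int]]) -> str:
    """Replace every (start, end) interval of the sequence with Ns.

    Intervals are 0-based and end-exclusive (an assumption of this sketch).
    """
    bases = list(sequence)
    for start, end in regions:
        if not 0 <= start <= end <= len(bases):
            raise ValueError("interval (%d, %d) lies outside the sequence" % (start, end))
        # Overwrite the variable stretch so that it cannot bias read mapping.
        bases[start:end] = "N" * (end - start)
    return "".join(bases)
```

Because the masked construct keeps its original length, coordinates of the remaining elements in the added reference sequence are unchanged.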
Emulsion PCR and target amplification
Emulsion polymerase chain reaction (PCR) was used to recover gRNA and unique clone barcode (UCB) sequences from plasmid libraries, from genomic DNA extracted from lentivirus-infected hES cells and from cells sorted from CHOOSE mosaic organoids, and from 10× single-cell complementary DNA libraries. Emulsion PCR was chosen to reduce PCR bias and to prevent the generation of chimeric PCR products56,57. AmpliTaq Gold 360 master mix (ThermoFisher, catalogue no. 4398876) was used for all PCR reactions. Emulsion PCR was performed using the Micellula DNA Emulsion & Purification Kit (EURX, catalogue no. E3600) according to the manufacturer’s guide. For target amplification from 10× single-cell libraries, heminested emulsion PCRs were performed using the following primers:
First PCR: forward primer (FW): 5′-gcagacaaatggctgaacgctgacg-3′, reverse primer (RV): 5′-ccctacacgacgctcttccgatct-3′; second PCR: FW: 5′-ggagttcagacgtgtgctcttccgatcttgggaatcttataagttctgtatgagaccactctttcc-3′, RV: 5′-ccctacacgacgctcttccgatct-3′.
Amplicons were then indexed with unique NEB dual indexing primers, and amplifications were monitored in a qPCR machine and stopped when reaching the logarithmic phase. Amplicons were sequenced using the Illumina Nextseq2000 or Novaseq6000 system. All primers used can also be found in Supplementary Table 1.
gRNA and UCB recovery and analyses
gRNA sequences were extracted by cutting 5′- and 3′-flanking regions with cutadapt (10% error rate, 1–3 nucleotide (nt) overlap, no indels)58. Sequences were filtered to be between 15 and 21 nt long. The corrected cell barcode (CBC) and the unique molecular identifier (UMI) of each read were derived via the 10× Genomics Cell Ranger 6.0.1 alignment59. Only reads with a corresponding gene expression (GEX) cell were accepted. Reads and target sequences were joined by allowing partial overlaps and Hamming distances of up to two. Reads were counted towards unique CBC–UMI–gRNA combinations. A read count cutoff of 1% of the median read count of the UMI with the highest read count per cell was applied. Cells with only one gRNA and more than one read were kept. In addition, within unique CBC–UMI combinations, only gRNAs with more than 20% of the maximal read count of that group were kept. After read filtering, UMIs were counted for each CBC–gRNA combination. If more than one gRNA was found within a cell, only the gRNAs with a UMI count equal to the maximum UMI count were kept. Only one-to-one combinations were considered further. Analogous to gRNA extraction, UCBs were extracted with at least 6 nt overlap to the flanks. Sequences 15 nt long that followed the synthesis pattern were selected. Further processing was analogous to that for gRNAs.
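The core of the assignment logic can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the published counting pipeline: reads are matched only to library sequences of equal length (partial overlaps are not modelled), the read count cutoffs are omitted and all function names are hypothetical.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def match_grna(read_seq, library, max_dist=2):
    """Return the single library gRNA within max_dist mismatches, else None."""
    hits = [name for name, seq in library.items()
            if len(seq) == len(read_seq) and hamming(seq, read_seq) <= max_dist]
    return hits[0] if len(hits) == 1 else None


def assign_cells(records, library):
    """Assign one gRNA per cell from (cell_barcode, umi, read_sequence) records."""
    umis = defaultdict(set)  # (cell, gRNA) -> distinct UMIs supporting the pair
    for cbc, umi, seq in records:
        grna = match_grna(seq, library)
        if grna is not None:
            umis[(cbc, grna)].add(umi)

    per_cell = defaultdict(dict)  # cell -> {gRNA: UMI count}
    for (cbc, grna), supporting in umis.items():
        per_cell[cbc][grna] = len(supporting)

    assignment = {}
    for cbc, counts in per_cell.items():
        top = max(counts.values())
        best = [g for g, n in counts.items() if n == top]
        # Keep only one-to-one combinations; ties are discarded.
        if len(best) == 1:
            assignment[cbc] = best[0]
    return assignment
```

In this sketch, a cell with two UMIs for one gRNA and one UMI for another is assigned to the former, whereas a cell with equal UMI support for two gRNAs is dropped.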
Preprocessing of single-cell transcriptomics data
We first aligned reads to the custom genomic reference defined above with Cell Ranger 6.0 (10x Genomics) using pre-mRNA gene models and default parameters to produce the cell-by-gene UMI count matrix. UMI counts were then analysed in R using Seurat v.4 (ref. 60). We first retained features detected in at least three cells. Next, we selected high-quality cells based on the number of genes detected (minimum 1,000, maximum 8,000) and removed cells with high mitochondrial (more than 15%) or ribosomal (more than 20%) messenger RNA content. Thereafter, expression matrices of high-quality cells were normalized (LogNormalize) and scaled to a total expression of 10,000 UMIs for each cell. Principal component analysis (PCA) was performed based on the z-scaled expression of the 2,000 most variable features (FindVariableFeatures()).
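The cell-level quality filter and the normalization can be restated compactly in Python. The analysis itself was run with Seurat in R; the sketch below only mirrors the thresholds given above, with hypothetical function names and the assumption that the gene-number bounds are inclusive. The normalization follows the usual definition of log-normalization, log(1 + count / total × 10,000).

```python
import math


def passes_qc(n_genes, pct_mito, pct_ribo,
              min_genes=1000, max_genes=8000, max_mito=15.0, max_ribo=20.0):
    """True if a cell meets the gene-number, mitochondrial and ribosomal criteria."""
    return (min_genes <= n_genes <= max_genes
            and pct_mito < max_mito
            and pct_ribo < max_ribo)


def log_normalize(counts, scale_factor=10000):
    """Log-normalize one cell: log1p of counts scaled to scale_factor total UMIs."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale_factor) for c in counts]
```

For example, `passes_qc(2500, 5.0, 10.0)` accepts a cell, whereas a cell with 900 detected genes or 20% mitochondrial content is rejected.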
Integration and annotation of single-cell transcriptomics data
To annotate the dataset, we first extracted cells with control gRNAs and merged them with cells from uninduced organoids (35,203 cells). We integrated these unperturbed cells across libraries using Harmony61 with default parameters. Using the integrated space, we clustered the dataset at a resolution of one using the Louvain algorithm62 and annotated the clusters as dorsal and ventral telencephalons based on marker gene expression. We then split both trajectories and clustered again with a resolution of two to annotate the cell types more finely. This annotation of unperturbed cells was used to perform a label transfer onto the full dataset with perturbed cells using Seurat. The full CHOOSE dataset was further filtered for cells for which gRNAs were detected and integrated across libraries using the Seurat anchoring method. The integrated count matrix was log-normalized and scaled before computing a PCA. To visualize the dataset, the first 20 principal components were used to compute a UMAP embedding.
Assessment of target gene expression in organoid and primary cell types
To assess how the target genes in our screen were expressed in organoid and primary tissue, we obtained gene expression data from cell clusters in the developing human brain32 from https://storage.googleapis.com/linnarsson-lab-human/HumanFetalBrainPool.h5. For both the primary data and our organoid dataset, we summarized log-normalized expression for each cell type (‘CellClass’ in the primary dataset) by computing the arithmetic mean. We visualized the expression of CHOOSE target genes with a heat map as displayed in Extended Data Fig. 6l,m.
RNA velocity
To obtain count matrices for spliced and unspliced transcripts, we used kallisto (v.0.46.2)63 through the fastq-processing command of the loompy command line tool, which is part of the Python package loompy (v.3.0.7; https://linnarssonlab.org/loompy/). Using scVelo (v.0.2.4)31, moments were computed based on the first 20 principal components using the function scvelo.pp.moments() with n_neighbors = 30. RNA velocity was subsequently calculated using the function scvelo.tl.velocity() (mode = ‘stochastic’), and a velocity graph was constructed using scvelo.tl.velocity_graph(). To obtain a pseudotemporal ordering describing the two differentiation trajectories, we first removed clusters annotated as cycling cells (MKI67+) and astrocytes (S100B+) from the dataset. We then calculated a pseudotime based on the velocity graph using the function scvelo.tl.velocity_pseudotime() for both trajectories separately.
Differential gRNA representation analysis
To test whether perturbations affected fitness or proliferation capacity of cells, we compared gRNA representation in eCas9-induced (n = 14 pools of organoids from three batches) versus uninduced (n = 8 pools of organoids from two batches) samples. For each pool of organoids, we computed the fractions of cells with each gRNA. We then computed the average fold change of detection percentage between induced and uninduced samples and performed a two-sided t-test comparing both distributions. Multiple-testing correction on the resulting P values was performed using the Benjamini–Hochberg method.
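The bookkeeping behind this comparison can be sketched in Python with the standard library only. The sketch computes per-pool gRNA fractions, the fold change of the mean fraction between induced and uninduced pools, and Benjamini–Hochberg adjusted P values. The two-sided t-test itself is not reimplemented; its P values are assumed to come from a statistics package. All function names are illustrative.

```python
from collections import Counter
from statistics import mean


def grna_fractions(cell_grnas):
    """Fraction of cells carrying each gRNA within one pool of organoids."""
    counts = Counter(cell_grnas)
    total = sum(counts.values())
    return {grna: n / total for grna, n in counts.items()}


def mean_fold_change(induced_pools, uninduced_pools, grna):
    """Mean fraction in induced pools divided by mean fraction in uninduced pools."""
    induced = mean([pool.get(grna, 0.0) for pool in induced_pools])
    uninduced = mean([pool.get(grna, 0.0) for pool in uninduced_pools])
    if uninduced == 0:
        raise ValueError("gRNA not detected in uninduced pools")
    return induced / uninduced


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value to enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A fold change below one indicates that a gRNA is detected less often after eCas9 induction, which would be consistent with a fitness or proliferation effect of the perturbation.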
Differential abundance testing
To assess how the perturbation of ASD risk genes changes abundances of different organoid cell populations, we tested for enrichment of each gRNA in each annotated cell state versus the control. To control for confounding effects through differential gRNA sampling in different libraries, we used a Cochran–Mantel–Haenszel (CMH) test stratified by library. Multiple-testing correction was performed using the Benjamini–Hochberg method, and a significance threshold of 0.05 was applied to the resulting false discovery rate (FDR). Enrichment effects were plotted using the signed −log10 FDR: that is, the sign of the log odds ratio (effect size) multiplied by the −log10 FDR-corrected P value. To further assess the variability of the differential abundance effects across independent pools of organoids, we computed cell-type fold enrichment for each organoid pool and gRNA. For this, we used 14 scRNA-seq libraries obtained from independent pools of organoids as replicates from three batches. Two batches (11 replicates) used non-targeting gRNA as a control, and a third batch (three replicates) used eCas9-uninduced cells as an alternative control. We additionally computed a background distribution of enrichment effects from randomly permuted gRNA labels. We then performed a t-test for each perturbation and cell type against this background distribution.
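The stratified test and the signed score can be sketched in Python as follows. Each library contributes one 2 × 2 table (a, b, c, d), in which a and b are the numbers of cells with the target gRNA inside and outside the cell state, and c and d are the corresponding numbers for control gRNAs. This layout, the function names and the absence of a continuity correction by default are assumptions of the sketch. The P value uses the chi-squared distribution with one degree of freedom, computed through the complementary error function.

```python
import math


def cmh_test(tables, correct=False):
    """Cochran-Mantel-Haenszel test over a list of (a, b, c, d) strata.

    Returns the test statistic, the P value and the Mantel-Haenszel odds ratio.
    """
    diff = var = num = den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than two cells carries no information
        diff += a - (a + b) * (a + c) / n
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        num += a * d / n
        den += b * c / n
    odds_ratio = num / den if den > 0 else math.inf
    if var == 0:
        return 0.0, 1.0, odds_ratio
    delta = max(abs(diff) - (0.5 if correct else 0.0), 0.0)
    stat = delta ** 2 / var
    # Survival function of a chi-squared variable with one degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, odds_ratio


def signed_neg_log10(fdr, odds_ratio):
    """Sign of the log odds ratio multiplied by -log10 of the FDR."""
    sign = 1.0 if odds_ratio > 1 else (-1.0 if odds_ratio < 1 else 0.0)
    return sign * -math.log10(fdr)
```

Positive scores therefore indicate enrichment of the perturbation in a cell state relative to the control, and negative scores indicate depletion.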
Local cell compositional enrichment test
To visualize the compositional changes induced by the genetic perturbations at a finer resolution, we used a method outlined in Nikolova et al.64. In brief, a k-nearest neighbour (kNN) graph (k = 200) of cells was constructed on the basis of Euclidean distance on the PCA-reduced CCA (canonical-correlation analysis) space. Next, a CMH test stratified by library was performed on the neighbourhood of each cell, comparing frequencies of the gRNA or gRNA pool and the pool of control gRNAs within and outside of the neighbourhood. The resulting neighbourhood enrichment score of each cell was defined as signed −log(P), where the sign was determined by the sign of the log-transformed odds ratio. A random walk with restart procedure was then applied to smooth the neighbourhood enrichment score of each cell. The smoothed enrichment scores were visualized on the UMAP embedding using the ggplot2 (ref. 65) function stat_summary_hex() (bins = 50).
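The smoothing step can be illustrated with a minimal Python sketch of a random walk with restart on a neighbour graph. The restart probability, the iteration limit and the use of a plain average over neighbours are assumptions made for illustration; the parameters used in the original analysis are not stated here.

```python
def smooth_scores(neighbors, scores, restart=0.5, n_iter=100, tol=1e-9):
    """Smooth per-cell scores by a random walk with restart.

    neighbors maps each cell to a list of neighbouring cells; scores maps each
    cell to its raw enrichment score. At every step a cell keeps a fraction
    `restart` of its raw score and takes the rest from its neighbours' average.
    """
    current = dict(scores)
    for _ in range(n_iter):
        updated = {}
        for cell, nbrs in neighbors.items():
            if nbrs:
                average = sum(current[n] for n in nbrs) / len(nbrs)
            else:
                average = current[cell]  # isolated cells keep their own value
            updated[cell] = restart * scores[cell] + (1.0 - restart) * average
        change = max(abs(updated[c] - current[c]) for c in updated)
        current = updated
        if change < tol:
            break
    return current
```

With a restart probability of one the raw scores are returned unchanged, and smaller values spread each score further across the graph.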
Differential expression analysis
To investigate the transcriptomic changes caused by each perturbation, we performed differential expression analysis based on logistic regression. We used the Seurat function FindMarkers() (test.use = ‘LR’) to find differentially expressed genes (DEGs) for each gRNA label versus control. Tests were performed on log-normalized transcript counts Y while treating library, cell_type and n_UMI as covariates in the model:

$$Y \sim \mathrm{gRNA} + \mathrm{library} + \mathrm{cell\_type} + \mathrm{n\_UMI}$$
Testing within each developmental trajectory was performed by omitting the cell_type covariate. Multiple-testing correction was performed using the Benjamini–Hochberg method, and a significance threshold of 0.05 was applied to the resulting FDR to obtain a set of DEGs (CHOOSE DEGs). We further selected top 30 DEGs on the basis of absolute fold change for each gRNA (TOP-DEGs).
DEG enrichment analysis
To assess the biological processes in which the detected DEGs were involved, we performed gene ontology enrichment across all TOP-DEGs globally as well as using all detected DEGs for each target gene in excitatory and inhibitory neuron trajectories separately. As a background gene set, we used all genes expressed in more than 5% of cells in our dataset. To perform gene ontology analysis, we used the function enrichGO() from the R package clusterProfiler66 with pAdjustMethod = ‘fdr’. We filtered the results using a significance threshold of FDR < 0.01. To test whether the set of TOP-DEGs was enriched for ASD-associated genes, we first obtained a list of risk genes from SFARI (https://gene.sfari.org/database/gene-scoring/, 11 April 2021). We then tested the enrichment using a Fisher exact test with all genes expressed in more than 5% of cells in our dataset as the background. To assess the specificity of this enrichment, we obtained a list of intellectual disability (ID) risk genes from sysID (936 primary ID genes, https://sysndd.dbmr.unibe.ch, 17 March 2022) and tested for enrichment among TOP-DEGs in the same way.
Processing of single-cell multiome data and GRN inference
Initial transcript count and peak accessibility matrices for the multiome data were obtained from sequencing reads with Cell Ranger Arc and further processed using the Seurat (v.4.0.1) and Signac (v.1.4.0)67 R packages. Peaks were called from the fragment file using MACS2 (v.2.2.6)68 and combined in a common peak set before merging. Transcript counts were log-normalized, and peak counts were normalized using term frequency–inverse document frequency normalization. To assess the cell composition of the multiome data, integration with the CHOOSE scRNA-seq data was performed using Seurat (FindIntegrationAnchors() -> IntegrateData()) with default parameters. As a preprocessing step to GRN inference with Pando41, chromatin accessibility data were first coarse grained to a high-resolution cluster level. For this, control cells from the CHOOSE dataset were combined with the multiome dataset, and Louvain clustering was performed at a resolution of 20 based on the first 20 principal components calculated from the 2,000 most variable features (RNA). For each cluster, peak accessibility was summarized by computing the arithmetic mean from binarized peak counts so that each cell in the cluster was represented by the detection probability vector of each peak. To constrain the set of peaks considered by Pando, we used the union of PhastCons conserved elements69 from an alignment of 30 mammals (obtained from https://genome.ucsc.edu/) and candidate cis-regulatory elements derived from the ENCODE project70 (initiate_grn()). In these regions, we scanned for TF motifs (find_motifs()) based on the motif database shipped with Pando, which was compiled from motifs derived from JASPAR and CIS-BP. Based on motif matches, cell-level log-normalized transcript counts and cluster-level peak accessibilities, we then inferred the GRN using the Pando function infer_grn() (peak_to_gene_method = ‘GREAT’, upstream = 100,000, downstream = 100,000) for the 5,000 most variable features. 
Here, genes were associated with candidate regulatory regions within 100,000 bp of the gene body using the method proposed by GREAT71. From the model coefficients returned by Pando, TF modules were constructed using the function find_modules() (P_thresh = 0.05, rsq_thresh = 0.1, nvar_thresh = 10, min_genes_per_module = 5). To visualize subnetworks centred around one TF, we computed the shortest path from the TF to every gene in the GRN graph. If there were multiple shortest paths, we retained the one with the lowest average P value. The resulting graph was visualized with the R package ggraph (https://github.com/thomasp85/ggraph) using the circular tree layout.
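The coarse-graining of chromatin accessibility that precedes GRN inference can be sketched in plain Python. For each cluster, the sketch binarizes peak counts and averages them across cells, which yields a detection probability per peak. The data layout (dictionaries keyed by cell) and the function name are assumptions chosen for readability; the original analysis operated on Seurat and Signac objects in R.

```python
from collections import defaultdict


def cluster_detection_probability(peak_counts, clusters):
    """Per-cluster fraction of cells in which each peak is detected.

    peak_counts maps each cell to a list of counts (one entry per peak);
    clusters maps each cell to its high-resolution cluster label.
    """
    n_cells = defaultdict(int)
    detected = {}
    for cell, counts in peak_counts.items():
        label = clusters[cell]
        n_cells[label] += 1
        sums = detected.setdefault(label, [0] * len(counts))
        for j, count in enumerate(counts):
            if count > 0:  # binarize: any fragment counts as detected
                sums[j] += 1
    return {label: [s / n_cells[label] for s in sums]
            for label, sums in detected.items()}
```

Every cell of a cluster is then represented by the same detection probability vector, which reduces the sparsity of single-cell accessibility measurements.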
Enrichment testing for TF modules
To find subnetworks of the GRN at which ASD-associated genes accumulate, we first obtained a list of ASD risk genes from SFARI (https://gene.sfari.org/database/gene-scoring/). For all genes included in SFARI (1,031 genes), we tested for enrichment in TF modules using a Fisher exact test. All genes expressed in more than 5% of cells in our dataset (12,079 genes) were treated as the background. Fisher test P values were corrected for multiple testing using the Benjamini–Hochberg method, and significant enrichment was defined as FDR < 0.01 and more than twofold enrichment (odds ratio). To assess which TF modules were most affected by genetic perturbations of ASD-associated genes, we similarly used a Fisher exact test. For the set of TOP-DEGs, we tested for enrichment in any of the inferred TF modules. Here, all genes included in the GRN (5,000 most variable features) were treated as the background.
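The enrichment test for a single TF module can be sketched in Python using the hypergeometric distribution. The sketch assumes a one-sided test for over-representation, which is not specified above, and uses the sample odds ratio as the fold-enrichment measure; function names are illustrative.

```python
from math import comb, inf


def fisher_enrichment(module_genes, risk_genes, background_genes):
    """One-sided Fisher exact test for enrichment of risk genes in a module.

    Genes outside the background are ignored. Returns (p_value, odds_ratio).
    """
    background = set(background_genes)
    module = set(module_genes) & background
    risk = set(risk_genes) & background
    a = len(module & risk)          # risk genes in the module
    b = len(module - risk)          # other genes in the module
    c = len(risk - module)          # risk genes outside the module
    d = len(background) - a - b - c
    n, n_risk, n_module = a + b + c + d, a + c, a + b
    # Probability of observing at least `a` risk genes in a module of this size.
    tail = sum(comb(n_risk, k) * comb(n - n_risk, n_module - k)
               for k in range(a, min(n_risk, n_module) + 1))
    p_value = tail / comb(n, n_module)
    odds_ratio = (a * d) / (b * c) if b * c > 0 else inf
    return p_value, odds_ratio


def is_enriched(fdr, odds_ratio, max_fdr=0.01, min_odds=2.0):
    """Apply the thresholds described above: FDR < 0.01 and more than twofold enrichment."""
    return fdr < max_fdr and odds_ratio > min_odds
```

The P values returned for all modules would then be corrected with the Benjamini–Hochberg method before the thresholds are applied.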
CellRank analysis
To better understand the differentiation trajectories leading up to inhibitory neuron populations, we used CellRank43 to compute transition probabilities into each terminal fate based on the previously computed velocity pseudotime. First, the clusters with the highest pseudotime for each terminal cell state were annotated as terminal states. We then constructed a Palantir kernel72 (PalantirKernel()) based on velocity pseudotime and used Generalized Perron Cluster Cluster Analysis73 (GPCCA()) to compute a terminal fate probability matrix (compute_absorption_probabilities()). All CellRank functions were run with default parameters. Fate probabilities for each cell were visualized using a circular projection74. In brief, we evenly spaced terminal states around a circle and assigned each state t an angle α_t. We then computed two-dimensional coordinates (x_i, y_i) for each cell i from the transition probability matrix F, with entries f_it, for N cells and n_t terminal states as

$$x_i = \sum_{t} f_{it}\cos\alpha_t, \qquad y_i = \sum_{t} f_{it}\sin\alpha_t$$
To visualize enrichment of perturbed cells in this space, we used the method outlined in Nikolova et al.64. Here, the kNN graph (k = 100) was computed using Euclidean distances in fate probability space, and enrichment scores were visualized on the circular projection. Otherwise, the method was performed as described above.
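The circular projection of fate probabilities can be sketched in a few lines of Python. The sketch assumes the usual form of this projection, in which terminal states are placed at evenly spaced angles on the unit circle and each cell is mapped to the probability-weighted sum of the corresponding unit vectors. The function name and the choice of starting angle are illustrative.

```python
import math


def circular_projection(fate_probs):
    """Map rows of fate probabilities to two-dimensional coordinates.

    fate_probs is a list with one row per cell; each row holds one probability
    per terminal state. Terminal state t is placed at angle 2*pi*t / n_states.
    """
    if not fate_probs:
        return []
    n_states = len(fate_probs[0])
    angles = [2.0 * math.pi * t / n_states for t in range(n_states)]
    coords = []
    for row in fate_probs:
        x = sum(p * math.cos(a) for p, a in zip(row, angles))
        y = sum(p * math.sin(a) for p, a in zip(row, angles))
        coords.append((x, y))
    return coords
```

Cells committed to a single fate land on the circle at the position of that terminal state, whereas cells with equal probabilities for all fates land at the centre.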
Immunofluorescence
Organoid tissues were fixed in paraformaldehyde at 4 °C overnight followed by washing in PBS three times for 10 min. Tissues were then allowed to sink in 30% sucrose overnight, followed by embedding in O.C.T. compound (Sakura, catalogue no. 4583). Tissues were frozen on dry ice and cryosectioned at 20 μm. For staining, sections were first blocked and permeabilized in 0.1% Triton X-100 in PBS (0.1% PBTx) with 4% normal donkey serum. Sections were then stained with primary and secondary antibodies diluted in 0.1% PBTx with 4% normal donkey serum. Sections were washed in PBS three times for 10 min after each antibody staining and mounted in DAKO fluorescent mounting medium (Agilent Technologies, catalogue no. S3023). The following antibodies were used in this study: DLX2 (Santa Cruz, catalogue no. SC393879, 1:100); OLIG2 (Abcam, catalogue no. ab109186, 1:100); SOX2 (R&D, catalogue no. MAB2018, 1:500); FOXG1 (Abcam, catalogue no. ab18259, 1:200); EOMES (R&D, catalogue no. AF6166, 1:200); ARID1B (Cell Signaling, catalogue no. 92964, 1:100); ADNP (ThermoFisher, catalogue no. 702911, 1:250); BCL11A (Abcam, catalogue no. 191401, 1:250); PHF3 (Sigma, catalogue no. HPA024678, 1:250); SMARCC2 (ThermoFisher, catalogue no. PA5-54351, 1:250); KMT2C (Sigma, catalogue no. HPA074736, 1:250); Alexa 488, 568 and 647 conjugated secondary antibodies (ThermoFisher, 1:250); and Hoechst (ThermoFisher, catalogue no. H3569, 1:10,000).
Microscopy, image processing and quantification
Tissue sections were imaged using an Olympus IX3 Series inverted microscope equipped with a dual-camera Yokogawa W1 spinning disk. Images were acquired with 10× 0.75 (air) working distance (WD) 0.6 mm or 40× 0.75 (air) WD 0.5 mm objectives and produced by the cellSens software.
For DLX2 and OLIG2 quantification in Fig. 4, images were processed and quantified using Fiji. Based on the size of the tissue, 5–12 regions from each organoid were selected using the Hoechst channel. In total, n = 108 areas (13 organoids from four batches) from the ARID1B control group (c.2201dupG repair), n = 104 areas (15 organoids from four batches) from the ARID1B+/− (c.2201dupG) group and n = 94 areas (15 organoids from three batches) from the ARID1B+/− (6q25.3del) group were collected and subjected to automated segmentation using a Fiji macro. Both DLX2 and OLIG2 channels were used to define the cell body area, followed by intensity measurement. Area mean intensity was used to set the threshold. For protein expression quantification in Extended Data Fig. 2, organoids with individual gene perturbations costained for each gene were processed and quantified using Fiji. Five to fourteen cortical plate regions were analysed per gene. Areas containing both uninduced (dTomato−) and induced (dTomato+) cells were selected and subjected to automated segmentation using a Fiji macro. The Hoechst channel was used to define the cell body area, followed by intensity measurement. Detected cells were separated into wild-type and perturbed cells by setting a threshold of mean intensity in the dTomato channel. Additionally, KMT2C protein expression was compared between wild-type (dTomato−) and mutant (dTomato+) VZ areas. VZs were individually outlined, and mean dTomato and KMT2C intensities were measured. For IPC abundance analysis, organoids with individual gene perturbations were costained for EOMES. Mutant columns expressing dTomato were individually segmented, and EOMES+ cells were identified by setting a threshold for EOMES intensity. The number of EOMES+ cells was normalized to the total number of cells. Percentages of EOMES+ cells were compared between individual gene perturbations and non-targeting gRNA control groups.
For INP abundance analysis, organoids were costained with DLX2. A Fiji macro for automated segmentation was used to identify DLX2+ cells throughout the entire tissue. Areas containing multiple rosettes from each organoid were collected for quantification. The number of DLX2+ cells was normalized to the tissue area and compared between individual gene perturbations and non-targeting gRNA control groups.
Patient sample collection
The study was approved by the local ethics committee of the Medical University of Vienna. Study inclusion criteria were as follows: (1) mutation in the ARID1B gene proven by whole-exome sequencing, (2) age between 0 and 18 years old, (3) continuous follow-up at the Vienna General Hospital and (4) availability of fetal brain MRI data. After informed consent, 10 ml of blood was collected from two selected patients for iPS cell reprogramming.
Reprogramming of PBMCs into iPS cells
iPS cells were generated from peripheral blood mononuclear cells (PBMCs) isolated from patient blood samples as previously described75. In brief, 10 ml blood was collected in sodium citrate collection tubes. PBMCs were isolated via a Ficoll–Paque density gradient, and erythroblasts were expanded for 9 days. Erythroblast-enriched populations were infected with Sendai Vectors expressing human OCT3/4, SOX2, KLF4 and cMYC (CytoTune; Life Technologies, A1377801). Three days after infection, cells were switched to mouse embryonic fibroblast feeder layers. Five days after infection, the medium was changed to iPS cell medium (KoSR + FGF2). Ten to 21 days after infection, the transduced cells began to form colonies that exhibited iPS cell morphology. iPS cell colonies were picked and passaged every 5–7 days after transfer to the mTeSR culture system (Stemcell Technologies).
Generation of isogenic control cell line for patient 1
Isogenic control cell lines for patient 1 were generated using CRISPR–Cas9. Streptococcus pyogenes Cas9 protein with two nuclear localization signals was purified as previously described76. gRNA transcription was performed with the HiScribe T7 High Yield RNA Synthesis Kit (NEB) according to the manufacturer’s protocol, and gRNAs were purified via phenol:chloroform:isoamyl alcohol (25:24:1; Applichem) extraction followed by ethanol precipitation. The homology-directed repair (HDR) template (custom single-stranded oligodeoxynucleotides; Integrated DNA Technologies) was designed to span 100 base pairs up- and downstream of the mutation site. iPS cells had been grown in mTeSR for 14 passages before the procedure. For generation of isogenic control cell lines, cells were washed with DPBS−/− and incubated for 5 min at 37 °C with 1 ml of accutase solution (Sigma-Aldrich, A6964-500ML). The plate was gently tapped to detach cells, and cells were gently pipetted to generate a single-cell suspension, pelleted by spinning at 200g for 3 min and counted using Trypan Blue solution (ThermoFisher Scientific). For nucleofection, 1.0 × 10^6 cells were spun down and resuspended in Buffer R of the Neon Transfection System (ThermoFisher Scientific) at a concentration of 2 × 10^7 cells per millilitre. Twelve nanograms of sgRNA and 5 ng of Cas9 protein were combined in resuspension buffer to form the Cas9–sgRNA ribonucleoprotein complex. The reaction was mixed and incubated at 37 °C for 5 min. Five microliters of the HDR template (100 μM) were added to the Cas9–sgRNA ribonucleoprotein complex and combined with the cell suspension. Electroporations were performed using a Neon Transfection System (ThermoFisher Scientific) with 100 μl Neon Pipette Tips using the embryonic stem cells electroporation protocol (1,400 V, 10 ms, three pulses). Cells were seeded in one Matrigel-coated well of a six-well plate in mTeSR.
After a recovery period of 3 days, a single-cell suspension was generated, and cells were split into another well of a six-well plate for banking and sparsely into two 10-cm dishes for colony formation from single cells. After colony growth for 1 week, individual colonies were picked and each was seeded into one well of a 96-well plate. After colony expansion, gDNA was extracted using DNA QuickExtract Solution (Lucigen), followed by PCR and Sanger sequencing to confirm repair of the mutation.
Fetal MRI and 3D reconstruction
Women with singleton pregnancies undergoing fetal MRI at a tertiary care centre from January 2016 to December 2021 were retrospectively reviewed. This study was approved by the institutional ethics board, and all examinations were clinically indicated. A retrospective review of patient records was performed, and a patient with a positive genetic testing report for ARID1B mutation was selected. The participant was included in further analysis, and the gestational age (given in gestational weeks and days postmenstruation) was determined by first-trimester ultrasound. High-quality super-resolution reconstruction was obtained13. Age-matched control subjects were identified and included if they presented an absence of confounding comorbidities, including structural cerebral or cardiac anomalies or fetal growth restriction.
Fetal MRI scans were conducted using 1.5-T (Philips Ingenia/Intera) and 3-T magnets (Philips Achieva). The mother was examined in a supine position or, if necessary, in a left recumbent position to achieve sufficient imaging quality. The examinations were performed within 45 min, neither sedation nor MRI contrast medium was applied, and both the fetal head and body were imaged. Fetal brain imaging included T2-weighted sequences in three orthogonal planes (slice thickness = 3–4 mm, echo time = 140 ms, field of view = 230 mm) of the fetal head. Postprocessing was conducted as previously described77. Super-resolution imaging was generated using a volumetric super-resolution algorithm77. The resulting super-resolution data were quality assessed, and only cases that met high-quality standards (score of less than or equal to two of five) were included in the analysis. Atlas-based segmentation was performed for the fetal cortex and total brain volume by nonrigid mapping of a publicly available spatiotemporal, anatomical fetal brain atlas for each investigated case77,78. Segmentation of the GE was performed manually using the open-source application ITK-SNAP79. To delineate the T2-weighted hypointense GE, histological fetal atlases by Bayer and Altman80,81 were used as a reference guide. Volumetric data were generated, and calculations for the GE were made based on the investigated gestational ages.
Statistics
Details of the statistical analyses are given in the relevant Methods sections. No statistical methods were used to predetermine sample size unless specified. Neither blinding nor randomization was used unless specified.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41586-023-06473-y.
Acknowledgements
We thank the patients and their families for participating in this study. We thank all members of the laboratory of J.A.K. for support and discussions; F. Bonnay, E. Chatzidaki, O. L. Eichmüller, R. Najm and J. Sidhaye for comments on the manuscript; M. Nezhyba and T. Lendl from the IMP/IMBA Biooptics Facility for technical support; A. Vogt, F. Drochter and T. Grentzinger from the VBCF NGS facility for single-cell RNA sequencing library preparation; the IMBA Stem Cell Core facility for generation of induced pluripotent stem cell lines; and M. Balmana Esteban and the Christoph Bock group (CeMM) for the help with single-cell multiomics sequencing. Work in the laboratory of J.A.K. is supported by SFARI (pilot award no. 724430); the Austrian Federal Ministry of Education, Science and Research; the Austrian Academy of Sciences; the City of Vienna; the Austrian Science Fund (FWF) Special Research Programme (grant no. F 7804-B) and two Stand-Alone grants (grant no. P 35680 and no. P 35369); and a European Research Council (ERC) Advanced Grant under the European Union’s Horizon 2020 programs (grant no. 695642 and no. 874769). Work in the laboratory of B.T. is supported by the European Research Council (organomics and braintime (to B.T.)); the Chan Zuckerberg Initiative DAF, an advised fund of the Silicon Valley Community Foundation (grant no. CZF2019-002440); the Swiss National Science Foundation (grant no. 310030_192604); and the National Centre of Competence in Research Molecular Systems Engineering. A.V. is supported by an EMBO Fellowship (grant no. ALTF-1112-2019). J.S.F. was supported by a Boehringer Ingelheim Fonds PhD fellowship. C.M.-C. was supported by the SCORPION Austrian Science Fund (FWF) DOC 72-B27.
Author contributions
C.L. and J.A.K. conceived the project and experimental design and secured the funding. C.L., J.S.F. and J.A.K. prepared the manuscript with input from all authors. C.L. performed all the experiments and analysed the data with the help of T.R.B., J.T., A.M.P., A.V., J.B.L. and C.E. J.S.F. performed the analysis of all single-cell RNA sequencing and multiome data under the supervision of B.T. M.S. and G.K. performed patient diagnosis and analysed magnetic resonance imaging. C.M.-C. generated induced pluripotent stem cell lines under the supervision of N.S.C. U.E. provided sgRNA prediction.
Peer review
Peer review information
Nature thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Data availability
Raw sequencing datasets were deposited into ArrayExpress with the following accession codes: single-cell RNA sequencing and associated amplicon (E-MTAB-13148), bulk genomic DNA-derived amplicon (E-MTAB-13140) and single-cell multiomics (E-MTAB-13144). ARID1B cell line genotype data were deposited into the European Genome-Phenome Archive (EGAS00001007381). Processed Seurat objects were deposited into Zenodo (https://zenodo.org/record/7083558).
Code availability
The Pando R package is available on GitHub (https://github.com/quadbiolab/Pando). Other custom code used in the analyses has been deposited on GitHub (https://github.com/quadbiolab/ASD_CHOOSE).
Competing interests
J.A.K. is on the supervisory and scientific advisory board of a:head bio AG (https://aheadbio.com) and is an inventor on several patents relating to cerebral organoids. The remaining authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Chong Li, Jonas Simon Fleck
Change history
11/14/2023
A Correction to this paper has been published: 10.1038/s41586-023-06836-5
Contributor Information
Chong Li, Email: chong.li@imba.oeaw.ac.at.
Barbara Treutlein, Email: barbara.treutlein@bsse.ethz.ch.
Juergen A. Knoblich, Email: juergen.knoblich@imba.oeaw.ac.at.
Extended data is available for this paper at 10.1038/s41586-023-06473-y.
Supplementary information
The online version contains supplementary material available at 10.1038/s41586-023-06473-y.
References
- 1. Hu WF, Chahrour MH, Walsh CA. The diverse genetic landscape of neurodevelopmental disorders. Annu. Rev. Genom. Hum. Genet. 2014;15:195–213. doi: 10.1146/annurev-genom-090413-025600.
- 2. Klingler E, Francis F, Jabaudon D, Cappello S. Mapping the molecular and cellular complexity of cortical malformations. Science. 2021;371:eaba4517. doi: 10.1126/science.aba4517.
- 3. Libé-Philippot B, Vanderhaeghen P. Cellular and molecular mechanisms linking human cortical development and evolution. Annu. Rev. Genet. 2021;55:555–581. doi: 10.1146/annurev-genet-071719-020705.
- 4. Lui JH, Hansen DV, Kriegstein AR. Development and evolution of the human neocortex. Cell. 2011;146:18–36. doi: 10.1016/j.cell.2011.06.030.
- 5. Willsey AJ, et al. Coexpression networks implicate human midfetal deep cortical projection neurons in the pathogenesis of autism. Cell. 2013;155:997–1007. doi: 10.1016/j.cell.2013.10.020.
- 6. Parikshak NN, et al. Integrative functional genomic analyses implicate specific molecular pathways and circuits in autism. Cell. 2013;155:1008–1021. doi: 10.1016/j.cell.2013.10.031.
- 7. Satterstrom FK, et al. Large-scale exome sequencing study implicates both developmental and functional changes in the neurobiology of autism. Cell. 2020;180:568–584.e23. doi: 10.1016/j.cell.2019.12.036.
- 8. de la Torre-Ubieta L, Won H, Stein JL, Geschwind DH. Advancing the understanding of autism disease mechanisms through genetics. Nat. Med. 2016;22:345–361. doi: 10.1038/nm.4071.
- 9. Lancaster MA, et al. Cerebral organoids model human brain development and microcephaly. Nature. 2013;501:373–379. doi: 10.1038/nature12517.
- 10. Mariani J, et al. FOXG1-dependent dysregulation of GABA/glutamate neuron differentiation in autism spectrum disorders. Cell. 2015;162:375–390. doi: 10.1016/j.cell.2015.06.034.
- 11. Paulsen B, et al. Autism genes converge on asynchronous development of shared neuron classes. Nature. 2022;602:268–273. doi: 10.1038/s41586-021-04358-6.
- 12. Esk C, et al. A human tissue screen identifies a regulator of ER secretion as a brain-size determinant. Science. 2020;370:935–941. doi: 10.1126/science.abb5390.
- 13. Michels BE, et al. Pooled in vitro and in vivo CRISPR-Cas9 screening identifies tumor suppressors in human colon organoids. Cell Stem Cell. 2020;26:782–792.e7. doi: 10.1016/j.stem.2020.04.003.
- 14. Dixit A, et al. Perturb-seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell. 2016;167:1853–1866.e17. doi: 10.1016/j.cell.2016.11.038.
- 15. Jaitin DA, et al. Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-seq. Cell. 2016;167:1883–1896.e15. doi: 10.1016/j.cell.2016.11.039.
- 16. Adamson B, et al. A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell. 2016;167:1867–1882.e21. doi: 10.1016/j.cell.2016.11.048.
- 17. Datlinger P, et al. Pooled CRISPR screening with single-cell transcriptome read-out. Nat. Methods. 2017;14:297–301. doi: 10.1038/nmeth.4177.
- 18. Doench JG. Am I ready for CRISPR? A user’s guide to genetic screens. Nat. Rev. Genet. 2018;19:67–80. doi: 10.1038/nrg.2017.97.
- 19. Bizzotto S, et al. Landmarks of human embryonic development inscribed in somatic mutations. Science. 2021;371:1249–1253. doi: 10.1126/science.abe1544.
- 20. Chen JA, Peñagarikano O, Belgard TG, Swarup V, Geschwind DH. The emerging picture of autism spectrum disorder: genetics and pathology. Annu. Rev. Pathol. Mech. Dis. 2015;10:111–144. doi: 10.1146/annurev-pathol-012414-040405.
- 21. De Rubeis S, et al. Synaptic, transcriptional, and chromatin genes disrupted in autism. Nature. 2014;515:209–215. doi: 10.1038/nature13772.
- 22. Trevino AE, et al. Chromatin and gene-regulatory dynamics of the developing human cerebral cortex at single-cell resolution. Cell. 2021;184:5053–5069.e23. doi: 10.1016/j.cell.2021.07.039.
- 23. Lancaster MA, et al. Guided self-organization and cortical plate formation in human brain organoids. Nat. Biotechnol. 2017;35:659–666. doi: 10.1038/nbt.3906.
- 24. Eichmüller OL, et al. Amplification of human interneuron progenitors promotes brain tumors and neurological defects. Science. 2022;375:eabf5546. doi: 10.1126/science.abf5546.
- 25. Tsyporin J, et al. Transcriptional repression by FEZF2 restricts alternative identities of cortical projection neurons. Cell Rep. 2021;35:109269. doi: 10.1016/j.celrep.2021.109269.
- 26. Zhong Y, et al. Identification of the genes that are expressed in the upper layers of the neocortex. Cereb. Cortex. 2004;14:1144–1152. doi: 10.1093/cercor/bhh074.
- 27. Oishi K, et al. Identity of neocortical layer 4 neurons is specified through correct positioning into the cortex. eLife. 2016;5:e10907. doi: 10.7554/eLife.10907.
- 28. Shi Y, et al. Mouse and human share conserved transcriptional programs for interneuron development. Science. 2021;374:eabj6641. doi: 10.1126/science.abj6641.
- 29. Schmitz MT, et al. The development and evolution of inhibitory neurons in primate cerebrum. Nature. 2022;603:871–877. doi: 10.1038/s41586-022-04510-w.
- 30. La Manno G, et al. RNA velocity of single cells. Nature. 2018;560:494–498. doi: 10.1038/s41586-018-0414-6.
- 31. Bergen V, Lange M, Peidli S, Wolf FA, Theis FJ. Generalizing RNA velocity to transient cell states through dynamical modeling. Nat. Biotechnol. 2020;38:1408–1414. doi: 10.1038/s41587-020-0591-3.
- 32. Braun E, et al. Comprehensive cell atlas of the first-trimester developing human brain. Preprint at bioRxiv. 2022. doi: 10.1101/2022.10.24.513487.
- 33. Courchesne E, et al. The ASD Living Biology: from cell proliferation to clinical phenotype. Mol. Psychiatr. 2019;24:88–107. doi: 10.1038/s41380-018-0056-y.
- 34. Velmeshev D, et al. Single-cell genomics identifies cell type–specific molecular changes in autism. Science. 2019;364:685–689. doi: 10.1126/science.aav8130.
- 35. Qian X, et al. Sliced human cortical organoids for modeling distinct cortical layer formation. Cell Stem Cell. 2020;26:766–781.e9. doi: 10.1016/j.stem.2020.02.002.
- 36. Nakagawa T, et al. The autism-related protein SETD5 controls neural cell proliferation through epigenetic regulation of rDNA expression. iScience. 2020;23:101030. doi: 10.1016/j.isci.2020.101030.
- 37. Wang J, et al. Mitochondrial dysfunction and oxidative stress contribute to cognitive and motor impairment in FOXP1 syndrome. Proc. Natl Acad. Sci. USA. 2022;119:e2112852119. doi: 10.1073/pnas.2112852119.
- 38. Chen H-H, et al. IRF2BP2 reduces macrophage inflammation and susceptibility to atherosclerosis. Circ. Res. 2015;117:671–683. doi: 10.1161/CIRCRESAHA.114.305777.
- 39. Jia Y-L, et al. P300/CBP-associated factor (PCAF) inhibits the growth of hepatocellular carcinoma by promoting cell autophagy. Cell Death Dis. 2016;7:e2400. doi: 10.1038/cddis.2016.247.
- 40. Frasca A, et al. MECP2 mutations affect ciliogenesis: a novel perspective for Rett syndrome and related disorders. EMBO Mol. Med. 2020;12:e10270. doi: 10.15252/emmm.201910270.
- 41. Fleck JS, Jansen SMJ, Wollny D, et al. Inferring and perturbing cell fate regulomes in human brain organoids. Nature. 2022. doi: 10.1038/s41586-022-05279-8.
- 42. Becht E, et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol. 2019;37:38–44. doi: 10.1038/nbt.4314.
- 43. Lange M, et al. CellRank for directed single-cell fate mapping. Nat. Methods. 2022;19:159–170. doi: 10.1038/s41592-021-01346-6.
- 44. Petryniak MA, Potter GB, Rowitch DH, Rubenstein JLR. Dlx1 and Dlx2 control neuronal versus oligodendroglial cell fate acquisition in the developing forebrain. Neuron. 2007;55:417–433. doi: 10.1016/j.neuron.2007.06.036.
- 45. Sun Y, et al. Phosphorylation state of Olig2 regulates proliferation of neural progenitors. Neuron. 2011;69:906–917. doi: 10.1016/j.neuron.2011.02.005.
- 46. Moffat JJ, Smith AL, Jung E-M, Ka M, Kim W-Y. Neurobiology of ARID1B haploinsufficiency related to neurodevelopmental and psychiatric disorders. Mol. Psychiatr. 2022;27:476–489. doi: 10.1038/s41380-021-01060-x.
- 47. Bagley JA, Reumann D, Bian S, Lévi-Strauss J, Knoblich JA. Fused cerebral organoids model interactions between brain regions. Nat. Methods. 2017;14:743–751. doi: 10.1038/nmeth.4304.
- 48. Bajaj S, et al. Neurotransmitter signaling regulates distinct phases of multimodal human interneuron migration. EMBO J. 2021;40:e108714. doi: 10.15252/embj.2021108714.
- 49. Martínez-Cerdeño V, Noctor SC, Kriegstein AR. The role of intermediate progenitor cells in the evolutionary expansion of the cerebral cortex. Cereb. Cortex. 2006;16:i152–i161. doi: 10.1093/cercor/bhk017.
- 50. Pebworth M-P, Ross J, Andrews M, Bhaduri A, Kriegstein AR. Human intermediate progenitor diversity during cortical development. Proc. Natl Acad. Sci. USA. 2021;118:e2019415118. doi: 10.1073/pnas.2019415118.
- 51. Kessaris N, et al. Competing waves of oligodendrocytes in the forebrain and postnatal elimination of an embryonic lineage. Nat. Neurosci. 2006;9:173–179. doi: 10.1038/nn1620.
- 52. Sokpor G, Xie Y, Rosenbusch J, Tuoc T. Chromatin remodeling BAF (SWI/SNF) complexes in neural development and disorders. Front. Mol. Neurosci. 2017;10:243. doi: 10.3389/fnmol.2017.00243.
- 53. Villa CE, et al. CHD8 haploinsufficiency links autism to transient alterations in excitatory and inhibitory trajectories. Cell Rep. 2022;39:110615. doi: 10.1016/j.celrep.2022.110615.
- 54. Michlits G, et al. Multilayered VBC score predicts sgRNAs that efficiently generate loss-of-function alleles. Nat. Methods. 2020;17:708–716. doi: 10.1038/s41592-020-0850-8.
- 55. Sanjana NE, Shalem O, Zhang F. Improved vectors and genome-wide libraries for CRISPR screening. Nat. Methods. 2014;11:783–784. doi: 10.1038/nmeth.3047.
- 56. Omelina ES, Ivankin AV, Letiagina AE, Pindyurin AV. Optimized PCR conditions minimizing the formation of chimeric DNA molecules from MPRA plasmid libraries. BMC Genom. 2019;20:536. doi: 10.1186/s12864-019-5847-2.
- 57. Williams R, et al. Amplification of complex gene libraries by emulsion PCR. Nat. Methods. 2006;3:545–550. doi: 10.1038/nmeth896.
- 58. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 2011;17:10–12. doi: 10.14806/ej.17.1.200.
- 59. Zheng GXY, et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 2017;8:14049. doi: 10.1038/ncomms14049.
- 60. Hao Y, et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573–3587.e29. doi: 10.1016/j.cell.2021.04.048.
- 61. Korsunsky I, et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods. 2019;16:1289–1296. doi: 10.1038/s41592-019-0619-0.
- 62. Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008;2008:P10008. doi: 10.1088/1742-5468/2008/10/P10008.
- 63. Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 2016;34:525–527. doi: 10.1038/nbt.3519.
- 64. Nikolova MT, et al. Fate and state transitions during human blood vessel organoid development. Preprint at bioRxiv. 2022. doi: 10.1101/2022.03.23.485329.
- 65. Wickham H. ggplot2: Elegant Graphics for Data Analysis. Springer; 2016.
- 66. Wu T, et al. clusterProfiler 4.0: a universal enrichment tool for interpreting omics data. Innovation. 2021;2:100141. doi: 10.1016/j.xinn.2021.100141.
- 67. Stuart T, Srivastava A, Madad S, Lareau CA, Satija R. Single-cell chromatin state analysis with Signac. Nat. Methods. 2021;18:1333–1341. doi: 10.1038/s41592-021-01282-5.
- 68. Feng J, Liu T, Qin B, Zhang Y, Liu XS. Identifying ChIP-seq enrichment using MACS. Nat. Protoc. 2012;7:1728–1740. doi: 10.1038/nprot.2012.101.
- 69. Siepel A, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005;15:1034–1050. doi: 10.1101/gr.3715005.
- 70. The ENCODE Project Consortium, et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature. 2020;583:699–710. doi: 10.1038/s41586-020-2493-4.
- 71. McLean CY, et al. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 2010;28:495–501. doi: 10.1038/nbt.1630.
- 72. Setty M, et al. Characterization of cell fate probabilities in single-cell data with Palantir. Nat. Biotechnol. 2019;37:451–460. doi: 10.1038/s41587-019-0068-4.
- 73. Reuter B, Weber M, Fackeldey K, Röblitz S, Garcia ME. Generalized Markov state modeling method for nonequilibrium biomolecular dynamics: exemplified on amyloid β conformational dynamics driven by an oscillating electric field. J. Chem. Theory Comput. 2018;14:3579–3594. doi: 10.1021/acs.jctc.8b00079.
- 74. Velten L, et al. Human haematopoietic stem cell lineage commitment is a continuous process. Nat. Cell Biol. 2017;19:271–281. doi: 10.1038/ncb3493.
- 75. Agu CA, et al. Successful generation of human induced pluripotent stem cell lines from blood samples held at room temperature for up to 48 hr. Stem Cell Rep. 2015;5:660–671. doi: 10.1016/j.stemcr.2015.08.012.
- 76. Jinek M, et al. A programmable dual-RNA–guided DNA endonuclease in adaptive bacterial immunity. Science. 2012;337:816–821. doi: 10.1126/science.1225829.
- 77. Gholipour A, et al. A normative spatiotemporal MRI atlas of the fetal brain for automatic segmentation and analysis of early brain growth. Sci. Rep. 2017;7:476. doi: 10.1038/s41598-017-00525-w.
- 78. Yushkevich PA, et al. User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. Neuroimage. 2006;31:1116–1128. doi: 10.1016/j.neuroimage.2006.01.015.
- 79. Ebner M, et al. An automated framework for localization, segmentation and super-resolution reconstruction of fetal brain MRI. Neuroimage. 2020;206:116324. doi: 10.1016/j.neuroimage.2019.116324.
- 80. Bayer SA, Altman J. The Human Brain during the Second Trimester. Taylor & Francis; 2005.
- 81. Bayer SA, Altman J. Atlas of Human Central Nervous System Development. Taylor & Francis; 2003.