Abstract
Cell type-specific investigations commonly employ gene reporters or single-cell (sc) analytical techniques. However, reporter line development is arduous and generally limited to a single gene of interest, while scRNA-seq frequently yields equivocal results that preclude definitive cell identification. To examine gene expression profiles of multiple retinal cell types derived from human pluripotent stem cells (hPSCs), we performed scRNA-seq on optic vesicle-like structures (OVs) cultured under cGMP-compatible conditions. However, efforts to apply traditional scRNA-seq analytical methods based on unbiased algorithms were unrevealing. Therefore, we developed a simple, versatile, and universally applicable approach that generates gene expression data akin to those obtained from reporter lines. This method ranks single cells by expression level of a bait gene and searches the transcriptome for genes whose cell-to-cell rank order expression most closely matches that of the bait. Moreover, multiple bait genes can be combined to refine datasets. Using this approach, we provide further evidence for the authenticity of hPSC-derived retinal cell types.
Keywords: Pluripotent stem cells, Retina, High-Throughput RNA Sequencing, Gene Expression Profiling

Introduction
Most human pluripotent stem cell (hPSC) differentiation protocols aim to generate an enriched population of cells with in vivo-like characteristics, preferably without the use of serum or other undefined media components. But even under ideal conditions, such methods often produce complex cultures containing a diversity of cell types that mature asynchronously over time.
Among hPSC-derived retinal progeny, retinal pigment epithelial (RPE) cells differentiate early and are readily identified and cultured as purified monolayers, features that have accelerated their use in disease modeling and clinical trials. Unlike RPE, the neural retina (NR) is a heterogeneous tissue consisting of multiple cell classes, all of which originate from a common progenitor that populates the optic vesicle (OV). 3D hPSC-OV (or “retinal organoid”) protocols allow isolation of these progenitors, which differentiate in a conserved spatiotemporal manner spanning many months [1–4]. During this process, there is considerable overlap in the production of NR cell classes, such that at any given time hPSC-OVs contain an evolving array of cell types of varying maturity. Adding to this complexity is the potential loss of cells that reside deepest within the structures where long-term metabolic support may be lacking. Given the potential impact of these culture dynamics on in vitro models, drug screens, or cell replacement strategies, there is a need to better characterize differentiating hPSC-OV populations.
Methods to assess the composition of complex cultures are numerous and include gene reporters and single-cell analytical techniques. To date, hPSC reporter lines have been helpful in broadly evaluating NR progenitor cells (NRPCs) and photoreceptors (PRs) [2, 5, 6]. In particular, comparative bulk RNA-seq analysis on sorted retinal cells has provided insight into the pooled transcriptomes of all PRs present within early organoids [5]. Despite their utility, the process of creating and characterizing reporter lines is painstaking and satisfactory outcomes are not always achieved. Furthermore, few genes are completely specific to one cell type or maturation stage, necessitating the use of multiple reporters for more refined analyses.
In contrast to reporter lines, methods for investigating single-cell transcriptomics theoretically allow detailed molecular characterization of every cell type without genetic manipulation. However, challenges also exist for single-cell analyses, such as capture bias, transcript processing artifacts, and inconclusive or contradictory findings. As such, these studies often rely upon a limited set of marker genes to assign cell identity. For example, single-cell microarray studies of developing mouse retina revealed a diverse array of cell types [7–10], but did so by classifying cells based on selected NR marker expression. Newer methods such as single-cell RNA-seq (scRNA-seq) offer advantages over microarray techniques, but the aforementioned challenges remain.
In the present study, we employed scRNA-seq to investigate the diversity and authenticity of cell types within differentiating hPSC-OVs cultured under serum-free conditions. To bolster this effort, we generated an hPSC reporter line that robustly labels photoreceptor lineage cells, thus allowing us to monitor OV sample purity, reproducibility, and maturation in real time across multiple differentiation runs. Initial efforts revealed that unbiased PCA methods to mine scRNA-seq datasets did not clearly define retinal cell types. Therefore, we developed a simple and universally applicable method that extracts gene expression information from scRNA-seq datasets similar to a reporter line, except that it can be applied to any gene (or combination of genes) of interest. Together, these findings demonstrate the utility of our analytical approach and confirm the potential for hPSCs to produce highly authentic retinal cell types for basic science discovery and clinical translation.
Materials and Methods
Generation, retinal differentiation, and characterization of CRX+/tdTomato hPSCs
A BAC clone (CTD-3013L6) with the complete CRX coding region was obtained (CalTech Human BAC Library D, Life Technologies) and red-ET recombination was used to introduce the tdTomato gene (Clontech) and a PGK/gb promoter driving a neomycin resistance gene upstream of the start codon in exon 2 of CRX. A 29-kb BamHI fragment containing the modified CRX gene was then subcloned into the hES-2A-inducible BAC vector. Human embryonic stem cells (hESCs, WA09 line, WiCell) were harvested for transfection 2 days post-passaging and resuspended in OptiMEM (Life Technologies) at 5 × 10^6 cells/mL. 500 μl of cells was added to a cuvette containing 30 μg of linearized CRX reporter construct and 10 μg mRNA encoding a pair of zinc finger proteins (5 μg each) specific to CRX (Sigma) in 300 μl PBS. Cells were electroporated (320V, 200 μF) and plated. Geneticin (100 μg/mL) was added until single drug-resistant colonies formed. Genomic DNA was extracted from drug-resistant colonies and analyzed by qPCR using primers specific to the endogenous CRX locus or to both the endogenous and transgenic CRX sequences. Clones with one copy of endogenous CRX relative to an internal control were identified as targeted. G-banding analysis (WiCell) was performed to confirm karyotype. The CRX+/tdTomato line was then grown in mTeSR (WiCell) on matrigel-coated plates and differentiated to OVs using established protocols [1]. Characterization was performed with ICC and RT-PCR as previously described [11], as well as with high content imaging analysis (Supporting Information Methods; Supporting Information Tables 1 and 2 for antibodies and RT-PCR primers). A subset of PRs was recorded via standard whole-cell patch clamp electrophysiology (Supporting Information Methods).
Human prenatal and adult eye tissue collection
Human prenatal eyes (122 days gestation) were obtained from the Laboratory of Developmental Biology (University of Washington-Seattle). Tissue collection methods adhered to IRB requirements, NIH guidelines, and the Helsinki Declaration. Adult human eyes (64 years old) were acquired from the Lions Eye Bank of Wisconsin.
Single-cell capture and cDNA library preparation for RNA-seq
TdTomato+ OVs and adult retinas were dissociated using papain. All procedures of cell loading, capture, and library preparations were performed following the Fluidigm user manual and as previously described [12]. Briefly, 5000–8000 cells were loaded onto a C1 Single-Cell Auto Prep fluidic circuit (Fluidigm). Cell capture efficiency was determined using the EVOS FL Auto Cell Imaging system (Life Technologies) and empty sites or sites with >1 cell were excluded from further analysis. Reverse transcription and cDNA amplification were performed using the SMARTer PCR cDNA Synthesis kit (Clontech) and the Advantage 2 PCR kit (Clontech). Full-length, single-cell cDNA libraries were harvested from the C1 chip and diluted to 0.1–0.3 ng/μl, fragmented, and amplified using the Nextera XT DNA Sample Preparation and Index Kits (Illumina). Libraries were multiplexed at 24–48 libraries per lane and 51-bp single-end reads were sequenced (Illumina HiSeq 2500 system). Remaining cells were processed for bulk RNA-seq as previously described [13, 14] (also Supporting Information Methods).
Quantification of gene expression levels for scRNA-seq
FASTQ files were generated from Illumina HiSeq 2500 output by CASAVA (v1.8.2), reads were mapped to the human transcriptome (RefGene v1.2.3) using Bowtie (v0.12.8), and normalized gene expression values (in TPM) were calculated by RSEM (v1.2.3) [15]. To exclude cells with poor quality data, the SinQC program was used [16]. To control for potential batch effects, we further performed median-by-ratio normalization [17] on SinQC-passed cells. The scRNA-seq reads were submitted to GEO (accession number GSE98556).
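As an illustration of the normalization step, the following is a minimal Python sketch of one common formulation of median-by-ratio normalization. It is not the authors' implementation: the function names are hypothetical, and the handling of zero values (here, restricting the reference to genes detected in every cell) is an assumption, since the text does not specify it.

```python
import math
from statistics import median


def median_by_ratio_size_factors(expr):
    """Return one size factor per cell.

    expr: dict mapping cell name -> list of expression values,
          with the same gene order in every list (hypothetical layout).
    """
    cells = list(expr)
    n_genes = len(expr[cells[0]])
    # Assumption: only genes detected in every cell are used, so that the
    # per-gene geometric mean across cells is defined.
    usable = [g for g in range(n_genes) if all(expr[c][g] > 0 for c in cells)]
    if not usable:
        raise ValueError("no gene is detected in every cell")
    # Log of the geometric mean of each usable gene across cells.
    log_geo_mean = {
        g: sum(math.log(expr[c][g]) for c in cells) / len(cells) for g in usable
    }
    factors = {}
    for c in cells:
        # Ratio of this cell's value to the cross-cell reference, per gene.
        ratios = [math.exp(math.log(expr[c][g]) - log_geo_mean[g]) for g in usable]
        factors[c] = median(ratios)
    return factors


def normalize(expr):
    """Divide every value in a cell by that cell's size factor."""
    factors = median_by_ratio_size_factors(expr)
    return {c: [v / factors[c] for v in vals] for c, vals in expr.items()}
```

In sparse single-cell data, few genes may be detected in every cell, so a real pipeline would likely need a more permissive reference gene set than this sketch uses.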
Global gene expression analysis of scRNA-seq data
We used two independent methods to globally analyze gene expression patterns across all cells: PCA and hierarchical clustering. PCA was performed on log10(normalized TPM + 1) values. Hierarchical clustering was performed with average linkage on cell-to-cell pairwise Spearman correlation coefficients of normalized TPMs.
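The two inputs described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and the distance matrix passed to the clustering routine is assumed to hold 1 − rho for each pair of cells, since the text states the correlation measure and linkage but not how correlations were converted to distances.

```python
import math


def log10_tpm(tpm):
    """Transform applied before PCA: log10(normalized TPM + 1)."""
    return math.log10(tpm + 1.0)


def average_linkage(dist, labels):
    """Agglomerative clustering with average linkage.

    dist:   square, symmetric list of lists of pairwise distances
            (assumed here to be 1 - Spearman rho between cells).
    labels: one name per row of dist.
    Returns the merge history as (members_a, members_b, distance) tuples.
    """
    clusters = {i: [i] for i in range(len(labels))}
    merges = []
    while len(clusters) > 1:
        keys = sorted(clusters)
        best = None
        for x in range(len(keys)):
            for y in range(x + 1, len(keys)):
                a, b = clusters[keys[x]], clusters[keys[y]]
                # Average linkage: mean distance over all cross-cluster pairs.
                d = sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))
                if best is None or d < best[0]:
                    best = (d, keys[x], keys[y])
        d, ka, kb = best
        merges.append((
            sorted(labels[i] for i in clusters[ka]),
            sorted(labels[i] for i in clusters[kb]),
            d,
        ))
        # Merge the second cluster into the first.
        clusters[ka] = clusters[ka] + clusters.pop(kb)
    return merges
```

The pairwise search is quadratic per merge, which is adequate for a few hundred cells but would be replaced by a library routine for larger datasets.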
Spearman’s Rank Correlational Coefficient Analysis
Spearman's rank correlation coefficient analysis (SRCCA) was used to determine the strength of association between individual marker genes and all genes. Genes were then ranked based on Spearman’s correlation coefficients (rho) from high to low. SRCCA was performed in the R programming language (http://www.R-project.org/). To determine overlap in the top 200 correlating genes for different bait genes, lists were imported into Venny 2.1 (http://bioinfogp.cnb.csic.es/tools/venny/).
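The ranking procedure can be sketched as follows. The authors performed SRCCA in R; this Python version is only an illustrative reconstruction under stated assumptions: the function names are hypothetical, ties receive average ranks, and genes with constant expression (for which rho is undefined) are skipped.

```python
def average_ranks(values):
    """Rank values (1 = smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5


def srcca(expr, bait, top_n=200):
    """Rank all genes by Spearman rho against the bait gene.

    expr: dict mapping gene name -> list of per-cell expression values.
    Returns up to top_n (gene, rho) pairs, highest rho first.
    """
    bait_ranks = average_ranks(expr[bait])
    scored = []
    for gene, values in expr.items():
        if gene == bait:
            continue
        # Spearman rho is the Pearson correlation of the rank vectors.
        rho = pearson(bait_ranks, average_ranks(values))
        if rho == rho:  # drop NaN (constant genes)
            scored.append((gene, rho))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

For example, `srcca(expr, "CRX")` would return the 200 genes whose cell-to-cell rank order most closely follows that of CRX, given an `expr` table built from the normalized TPM values.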
Results
Verification of OV culture purity using a CRX-targeted hPSC reporter line
The CRX promoter has been employed in animal models and PSCs to tag PRs from early retinogenesis through to maturity, which also makes it useful as an indicator of retinal identity at all stages of development. Given this utility, we developed a CRX-targeted hPSC reporter line to assess differentiating OV populations in real time and assure consistency between cultures and RNA-seq experiments. To create the line, we used bacterial artificial chromosome-mediated homologous recombineering to insert tdTomato 3′ to the start codon in exon 2 of one CRX allele of the WA09 line (Figure S1A). Clones were screened to detect on- and off-target events using a qPCR copy number assay and G-banding confirmed a normal karyotype for targeted clones (Figure S1B, C).
We then differentiated CRX+/tdTomato hPSCs to NR using a serum-free, 3D OV protocol that is compatible with current Good Manufacturing Practice guidelines [1]. As such, cells generated by this method are potentially suitable for clinical use. By day (d) 10 of differentiation, colonies expressing eye field transcription factors were present (Figure S1D–H) that gave rise to OVs containing a nearly pure population of proliferative NRPCs (Figure S1I). Following isolation at d20–25, OVs were placed in individual culture wells and monitored for tdTomato expression. Quantification showed that the majority of OVs were tdTomato+ by d40, while >97% were fluorescent by d60 (Figure 1A, B). Conversely, non-retinal (i.e., forebrain) neurospheres generated by the same cultures had negligible fluorescence (Figure 1C, D). By d80, all OVs were tdTomato+ (Figure 1E, F). These results demonstrate our ability to consistently produce and isolate pure cultures of hPSC-derived NR cell populations, an important quality control step prior to embarking on RNA-seq studies.
Figure 1. Assessing hPSC-OV production and PR maturation with a CRX reporter line.
A) Consecutive live images showed increased tdTomato fluorescence from one CRX+/tdTomato OV from d28 to d60 (identical exposure). B) Percent of OVs expressing tdTomato over time (n=48 OVs). C, D) Forebrain neurospheres served as a negative control and did not fluoresce (n=48 for panel D). E, F) Low magnification brightfield (E) and tdTomato fluorescence (F) images of a CRX+/tdTomato OV culture demonstrating fluorescence in all OVs at d80. G) d80 patch-clamped tdTomato+ cell. H) Electrophysiological profiles of early (n=20) and late (n=41) stage PRs were distinct. I) Profiles of early tdTomato+ cells were heavily weighted by cones, whereas profiles of late tdTomato+ cells were predominantly rods (determined via single-cell PCR; n=5 for each cohort). J, K) RT-PCR showed expression of PR markers, including cone- and rod-specific markers (J), and all other major NR cell classes (K) from d0-d243 of differentiation. L, M) Cell counts were performed at d70 (L) and d135 (M) of differentiation for NR markers using the Operetta® High Content Imaging System (n=3 biological replicates for each, error bars = SEM). Data presented as mean ± SEM. Scale bars=100μm.
OVs were then immunostained using a CRX primary antibody (Ab) [18], which demonstrated that all tdTomato+ cells were CRX+ (Figure S2A–C). To confirm the specificity of the CRX antibody, we immunostained human fetal retina and found robust signal in the ONL and RPE (consistent with Esumi et al. 2009), and fainter signal in the inner nuclear layer where BPCs reside (Figure S2D–F). With further differentiation, tdTomato continued to be expressed in developing PRs (Figure S3A), but was excluded from other NR cell types present in OVs (Figure S3B, C). However, tdTomato expression was detected in adherent RPE monolayers left behind after removal of OVs (Figure S3D–F). Thus, of the three CRX+ cell classes present in developing human retina, our hPSC reporter line labelled PRs and RPE. However, since RPE cells are rarely present in OV cultures using our production and isolation methods, tdTomato labeling was almost exclusively reflective of the differentiating PR population.
To determine whether tdTomato+ PRs adopt distinct rod or cone electrophysiological signatures over time, we compared average whole cell electrophysiological current profiles of tdTomato+ cells at early (d80–100) and late (d200–220) stages of OV differentiation (Figure 1G–I). TdTomato-negative cells were analyzed as a control. Distinct current profiles were found in early (n=20) vs. late (n=41) stage tdTomato+ PRs, with lower outward and inward currents observed in the latter group (Figure 1H). We retrieved cytoplasm from a subset of individual cells post-recording (n=5 for both groups) and determined rod vs. cone identity using single cell RT-PCR. Only THRB- or GNAT2-expressing cones were found in the early cohort, while NRL-expressing rods were captured in the late cohort. Comparison of these electrophysiological profiles confirmed that the early tdTomato+ cell recordings were predominantly weighted by cone responses, whereas late recordings were weighted by rods (Figure 1I), consistent with prior mouse studies [19].
Changes in retinal cell type composition over time
To further characterize tdTomato+ OVs prior to RNA-seq studies, we next investigated production of all major NR cell types between d0 and d243 of differentiation (Figure 1J–M). Genes important to PR differentiation and/or function were upregulated by d43 (Figure 1J). More specifically, cone production started by d43 as demonstrated by RXRG and THRB upregulation, with genes indicative of more mature cones expressed later (ARR3, OPN1SW, OPN1MW). Rod differentiation trailed that of cones, with NR2E3 and GNAT1 upregulated by d100, followed by CNGA1, SAG, and RHO. Expression of markers of other major NR cell types was also demonstrated (Figure 1K). The retinal ganglion cell (RGC) genes ATOH7 and POU4F2 were among the earliest expressed, but later became undetectable owing to RGC loss over time. Markers of horizontal cells (HCs), amacrine cells (ACs), bipolar cells (BPCs), and Müller glia (MG) were present as well. VSX2 was present by d9 in NRPCs (Figure 1K and S1H), and persisted at late stages of development where it remains expressed in BPCs.
To quantify cell type production by protein expression, high content image analysis (HCIA) was performed on dissociated tdTomato+ OVs at d70 and d135 using the Operetta® system (Figure 1L, M). At d70, 50.3 ± 5.2% of cells expressed CRX (n=3 independent cultures) and 89.1 ± 3.2% of CRX+ cells were tdTomato+. Additional PR markers were also expressed, including RCVRN (31.5 ± 2.9%), THRB (3.4 ± 1.9%), NRL (4.0 ± 2.7%), NR2E3 (0.8 ± 0.6%), and OC1 (15.9 ± 2.5%). Other early NR cell types were present as well, as indicated by expression of PAX6 (NRPCs and RGCs; 34.9 ± 4.4%), VSX2 (NRPCs and BPCs; 14.6 ± 3.8%), BRN3 (RGCs; 8.9 ± 0.9%), and GFAP and S100 (MG; 4.3 ± 1.8% and 2.6 ± 1.2%, respectively). Cell division (Ki-67) was evident in 10.8 ± 2.8% of all cells. By d135 the number of PRs increased substantially, with CRX and RCVRN expression in 69.29 ± 4.1% and 75.2 ± 2.6% of all cells, respectively. Of note, tdTomato labeling was present in a nearly identical cell percentage (70.6 ± 3.6%). Likewise, the percentages of cells expressing THRB (17.0 ± 8.3%), OPN1SW (9.35 ± 4.11%), NRL (32.7 ± 1.8%), and NR2E3 (29.4 ± 3.2%) increased by d135, while expression of the early PR marker OC1 declined to 1.7 ± 0.4%. Other markers examined at d135 included PAX6 (14.5 ± 3.6%), VSX2 (2.2 ± 1.1%), Ki-67 (1.8 ± 0.7%), BRN3 (0.4 ± 0.2%), GFAP (9.4 ± 2.3%), S100 (1.2 ± 0.5%), and MITF (0.2 ± 0.1%). Overall, the number of PRs and MGs increased from d70 to d135, while the number of RGCs and proliferating NRPCs declined. Of note, MITF+ cells were rare at all time points, indicative of the low rate of RPE contamination inherent to this OV culture protocol.
Comparison of unbiased vs. instructed scRNA-seq analyses of hPSC-OVs
To further probe the authenticity of PRs and other NR cell types in hPSC-derived retinal cultures, we pursued single-cell transcriptome analysis. Dissociated cells from unsorted tdTomato+ OVs at early (d70) and late (d218) stages of differentiation and from 64-year-old adult retina were captured and prepared for scRNA-seq analysis using the Fluidigm C1 system (Figure 2). We evaluated overall data quality by comparing a synthetic ensemble scRNA-seq dataset to the bulk RNA-seq dataset obtained using the same sample. The transcriptome-wide correlation between the average of all single cells and the bulk sample was high (Rho = 0.87) (Figure 2B), indicating that the single-cell ensemble recapitulated the heterogeneity present in the bulk RNA-seq dataset. Thereafter, stringent quality control parameters were applied to all samples using the SinQC program [16]. SinQC is a non-arbitrary method that distinguishes true biological variability from technical artifact by integrating gene expression patterns and data quality information. Only data from cells that met SinQC cutoffs were used for analysis (Figure 2C–I).
Figure 2. Quality control analysis of scRNA-seq data using the SinQC program.
A) Cells from dissociated hPSC-OVs or human adult retina were captured and processed in Fluidigm C1 microfluidic plates. B) Comparison of the synthetic ensemble scRNA-seq test dataset with its corresponding bulk RNA-seq dataset from a d70 OV culture. The transcriptome-wide correlation between the two datasets was high (Rho = 0.87, Spearman rank correlation), indicating that the single cell ensemble captured much of the heterogeneity present in the bulk dataset. C) Total captured cells vs. number of cells that passed SinQC analysis. D, F, H) Principal component analysis (PCA) from d70 (D), d218 (F), and adult retina (H), including comparisons between principal components 1, 2, and 3, of all cells from each sample. Cells that passed QC are identified in blue, while cells that failed are identified in red. E, G, I) SinQC assesses data quality along 3 major parameters: number of mapped reads, mapping rate, and library complexity. Thereafter, the program determines the primary parameter(s) that led to rejection of cells within the sample. D70 OV cells that failed SinQC thresholds were rejected primarily due to low mapped reads (E), while cells from d218 OVs were rejected primarily due to low library complexity (G). In adult retinal cells, low number of mapped reads was the primary cause of rejection (I).
We next performed unbiased principal component analysis (PCA) and hierarchical clustering on all profiled cells from each sample group. Cells from d70, d218, or adult retina clustered using this approach with some overlap but no apparent sub-clustering within groups (Figure 3A–B). In a concerted attempt to identify retinal subpopulations, we repeated the d70 hierarchical clustering analysis using a pre-selected group of marker genes for PRs, RGCs, NRPCs, and RPE (Figure 3C). While broad clusters were evident, significant variability in marker expression was observed within clusters, with numerous cells displaying mixed signatures. We also looked at the percentage of all captured cells expressing common NR cell markers at >1 transcript per million (TPM), a cutoff often used to indicate biologically relevant expression [20, 21]. As expected, the percentage of cells expressing PR, HC, AC, BPC, or MG genes at >1 TPM increased between d70 and d218, while the percentage of cells expressing NRPC or RGC genes declined (Figure 3D). However, most single cells expressed markers of two or more different NR cell types, hampering efforts to unequivocally assign identity. For example, nearly all d218 and adult retina cells simultaneously expressed genes for cones (GNB3), rods (GNAT1), and BPCs (GRM6) at >1 TPM (Figure 3D). Thus, reliance upon the mere presence or absence of gene expression above an arbitrary threshold yielded ambiguous results.
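The threshold-based tally described above, and the ambiguity it produces, can be illustrated with a short Python sketch. The function names and the data layout are hypothetical; only the >1 TPM cutoff and the marker-to-class pairings (GNB3 for cones, GNAT1 for rods, GRM6 for BPCs) come from the text.

```python
def percent_expressing(tpm, gene, threshold=1.0):
    """Percentage of cells in which `gene` exceeds the TPM threshold.

    tpm: dict mapping gene name -> list of per-cell TPM values.
    """
    values = tpm[gene]
    return 100.0 * sum(v > threshold for v in values) / len(values)


def classes_per_cell(tpm, marker_classes, threshold=1.0):
    """For each cell, collect the cell classes whose marker exceeds the threshold.

    marker_classes: dict mapping marker gene -> cell class label.
    Returns a list with one set of class labels per cell.
    """
    n_cells = len(next(iter(tpm.values())))
    calls = [set() for _ in range(n_cells)]
    for gene, cell_class in marker_classes.items():
        for i, value in enumerate(tpm[gene]):
            if value > threshold:
                calls[i].add(cell_class)
    return calls


def count_ambiguous(calls):
    """Number of cells assigned to two or more classes."""
    return sum(len(c) >= 2 for c in calls)
```

A cell that exceeds 1 TPM for GNB3, GNAT1, and GRM6 simultaneously would be assigned three classes by this scheme, which is the kind of mixed signature that prevented unequivocal identity assignment.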
Figure 3. scRNA-seq analysis of d70 and d218 hPSC-OVs and human adult retina.
A,B) Single cells passing SinQC analysis were analyzed with PCA (A) and hierarchical clustering (B), which revealed general clustering of cells within d70 (green) and d218 (red) OVs and adult retina (black) samples. C) Using d70 OV data as an example, hierarchical clustering using preselected markers for PRs, RGCs, RPCs, and RPE showed cell type-selective clustering. However, most individual cells demonstrated expression of markers of >1 cell subtype. D) The percentage of cells that expressed individual marker genes at >1 TPM was analyzed in each sample.
Superimposition of relative NR gene expression levels in PCA plots
We next asked if consideration of relative expression levels of NR markers could aid in delineating cell clusters. Toward this end, PCA plots (Figure 4A–C) were augmented to display single-cell data points based on the number of detected transcripts of a particular NR marker, with darker and larger data points indicating a higher TPM value (Figure 4D–I). This approach highlighted an overall increase in the number of high CRX-expressing cells captured from d218 OVs and adult retina samples compared to d70 (Figure 4D–F). Interestingly, the majority of cells in d218 OV plots that expressed high levels of CRX, THRB, and RXRG arose within a discernible cluster (Figure 4E, G–H). However, cells with high relative expression levels of non-PR genes were also present within this same plot region (Figure 4I). Thus, even augmented PCA offered minimal assistance in evaluating these scRNA-seq datasets.
Figure 4. Augmentation of PCA plots by relative gene expression weighting.
A–C) PCA plots of d70 (A) and d218 (B) OVs and adult retina (C) showed no clear cell subclusters. D–F) PCA plots of each sample weighted by CRX expression (i.e., greater expression = larger and darker data point). D70 OV cells (D) and adult retinal cells (F) did not appear to cluster based upon CRX expression levels; however, higher CRX-expressing cells from d218 OVs showed some clustering (E, dashed oval). G–I) Further interrogation of this d218 CRX cluster showed individual cells that also expressed high levels of the cone markers THRB (G) and RXRG (H) and/or the BPC marker GRM6 (I). A single cell expressing THRB, RXRG, and GRM6 is highlighted (arrow in G–I).
Examination of rank order gene expression correlations across hPSC-OV scRNA-seq datasets
Given these findings, we asked whether ranking relative NR gene expression across scRNA-seq datasets in the absence of PCA would be sufficient to yield insight into OV composition and authenticity. To do so, we developed a simple strategy of ordering all cells based solely on relative expression of a target (or “bait”) gene – from highest to lowest expresser (Figure 5). We then employed Spearman’s rank correlation coefficient analysis (SRCCA), a nonparametric method used to describe monotonic functions, to obtain an ordered list of genes whose relative expression among single cells correlated most closely with that of the bait gene. In addition to known cell type-associated genes, SRCCA can be used to discover novel genes by correlating bait gene expression with the entire transcriptome.
Figure 5. Spearman’s rank correlational coefficient analysis (SRCCA) captures PR gene associations.
A–G) SRCCA was performed on selected PR marker (“bait”) genes. Cells were first ordered from high (left/red) to low (right/blue) bait gene expression and plotted by z-score. Thereafter, an unbiased scRNA-seq dataset search was performed to find the top 20 genes whose cell-to-cell expression pattern was most similar to that of the bait gene. (Corresponding color coding for z-scores in A–G are shown in a–g). In d70 OV cells, few genes indicative of mature PR function correlated with CRX as bait (A), whereas in d218 OV cells (B) and adult retina cells (C) the majority of the top 20 correlated genes are known to be expressed in mature PRs. Analyses using the PR precursor gene PRDM1 (D,E) and the early cone gene THRB (F,G) as baits showed similar trends from d70 to d218 OVs. More specifically, cone-specific genes (e.g., PDE6H, GNAT2, ARR3, GNB3) were top correlators with THRB in d218 OV samples (G).
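The display step described in this legend (ordering cells by bait expression and plotting per-gene z-scores) can be sketched in Python as follows. The function names are hypothetical, and the use of the population standard deviation is an assumption, since the legend does not state how z-scores were computed.

```python
from statistics import mean, pstdev


def order_cells_by_bait(expr, bait):
    """Cell indices sorted from highest to lowest bait gene expression.

    expr: dict mapping gene name -> list of per-cell expression values.
    """
    values = expr[bait]
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


def zscore_row(values):
    """Z-score one gene across cells (population SD; zeros if constant)."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


def bait_ordered_zscores(expr, bait, genes):
    """Z-scored rows for `genes`, with columns in bait-ranked cell order."""
    order = order_cells_by_bait(expr, bait)
    rows = {}
    for gene in genes:
        z = zscore_row(expr[gene])
        rows[gene] = [z[i] for i in order]
    return rows
```

Each returned row corresponds to one line of the heat map, with the highest bait-expressing cell in the first column.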
To validate this approach, we first performed SRCCA with CRX as the bait gene (Figure 5A–C). At d70, the majority of the top 20 correlated genes were not known to play a role in PR development (Figure 5A). However, by d218 the list consisted mostly of established PR genes (Figure 5B), consistent with progressive OV maturation. Similarly, SRCCA of the adult NR scRNA-seq dataset identified numerous known PR genes as correlating highly with rank order CRX expression (Figure 5C). Next, we selected bait genes associated with early PR precursors (PRDM1) or developing cones (THRB) (Figure 5D–G). At d70, both genes highly correlated with each other and with additional genes involved in early PR differentiation (e.g., OTX2) (Figure 5D, F). By d218, the list of known PR genes correlating with PRDM1 and THRB increased (Figure 5E,G) and included several found in the CRX lists from d218 OV and adult NR datasets (Figure 5B,C). Additionally, THRB SRCCA revealed multiple cone-specific genes (PDE6H, GNAT2, ARR3, GNB3) (Figure 5G). The high degrees of correlation between these bait genes and other known PR genes support the use of SRCCA in generating cell type-specific gene expression profiles from scRNA-seq datasets.
To test this strategy further, we extended our list of bait genes to include transcription factors involved in the differentiation of other major retinal cell classes (Figure S4). Unbiased lists generated by SRCCA once again highlighted known genes indicative of expected retinal cell type(s) for each bait gene. Thus, SRCCA proved capable of grouping genes belonging to particular cell classes, depending on the specificity of the bait gene.
Combinatorial SRCCA analysis reveals high-confidence groupings of genes associated with PRs or other NR cell types
To further validate and refine cell type-associated gene lists, we combined SRCCA results from multiple NR bait genes. To begin, we looked for overlap among the top 200 gene expression correlators for CRX and PRDM1 in d218 OVs (Figure 6A). At this time point, PRDM1 ranked 47th among the top 200 genes correlating with CRX, and CRX ranked 63rd among the top 200 genes correlating with PRDM1 (rho = 0.46). Strikingly, 142 of the top 200 correlating genes (71%) for these two PR markers were shared. To search for genes associated with early cones, we added RXRG to the analysis, which reduced the percentage of co-segregating genes slightly to 62% (Figure 6B). After adding a second cone precursor gene (THRB), nearly 50% gene overlap was still obtained (Figure 6C). Importantly, including RXRG and/or THRB succeeded in excluding all rod-specific genes from the analysis. Next, we used the same approach with markers of two unrelated cell types, CRX (PRs) and VIM (MG), as well as CRX and MKI67 (proliferation marker), and no overlap in their respective lists of top 200 SRCCA genes was found (Figure 6D).
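The combinatorial step amounts to intersecting the top-ranked gene lists from several baits. The authors used Venny 2.1 for this; the Python sketch below is an equivalent illustration with hypothetical function names, and it assumes each list is already sorted from highest to lowest rho.

```python
def shared_top_genes(top_lists):
    """Genes present in every bait's list of top correlators.

    top_lists: dict mapping bait gene -> list of top-correlating gene names,
               ordered from highest to lowest rho.
    """
    sets = [set(genes) for genes in top_lists.values()]
    return set.intersection(*sets) if sets else set()


def percent_shared(top_lists, list_size=200):
    """Shared genes as a percentage of the list size."""
    return 100.0 * len(shared_top_genes(top_lists)) / list_size


def rank_within(top_lists, bait, gene):
    """1-based rank of `gene` in `bait`'s list, or None if absent."""
    genes = top_lists[bait]
    return genes.index(gene) + 1 if gene in genes else None
```

With two lists of 200 genes sharing 142 entries, `percent_shared` returns 71.0, matching the CRX and PRDM1 comparison above; adding further baits to `top_lists` can only shrink or preserve the shared set.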
Figure 6. Combining SRCCA profiles from multiple bait genes reveals high confidence PR genes.
A) To refine SRCCA PR gene lists, we looked for overlap in the top 200 genes* whose cell-to-cell expression profiles correlated with the PR bait genes CRX and PRDM1. This comparison revealed 142 shared genes in the d218 OV scRNA-seq dataset. B, C) To search for cone-specific genes, we added RXRG (B) and RXRG + THRB (C) to create 3- and 4-way analyses, which highlighted 97 genes. D) We also examined markers for unrelated cell types, CRX (PRs) and GFAP (MG) or MKI67 (proliferative cells), and found no overlap in their respective lists of top 200 correlating genes. E) Of the 97 genes identified in (C), 65 genes are known to be expressed in PRs (cone-specific genes are indicated with an asterisk), while the remaining 32 are putative cone-specific or general PR genes. F, G) SRCCA was performed using two putative PR genes identified in E as baits: AKAP9 (F) and MEGF9 (G). The top 200 correlating genes for each were then subjected to gene ontology (GO) analysis, which showed GO term enrichment consistent with PR function. H–J) SRCCA top 200 gene profiles for AKAP9 and MEGF9 showed a high degree of overlap with each other (H) and with CRX (I–J). [*Rho value ranges for the top 200 genes correlated with each bait gene: CRX (0.68–0.30), PRDM1 (0.69–0.34), RXRG (0.72–0.39), THRB (0.69–0.38), VIM (0.78–0.47), MKI67 (0.47–0.21), AKAP9 (0.67–0.34), MEGF9 (0.63–0.33).]
Of the 97 SRCCA genes common to CRX, PRDM1, RXRG, and THRB, 65 are known to be expressed in PRs, 5 of which are specifically expressed in cones (Figure 6E). The remaining 32 genes were considered high confidence PR and/or cone genes. We then performed SRCCA on two of these putative PR genes, A-Kinase Anchoring Protein-9 (AKAP9) and Multiple EGF-Like Domains-9 (MEGF9). Examination of Gene Ontology (GO) terms associated with the top 200 correlating genes for AKAP9 and MEGF9 revealed functions consistent with PRs (Figure 6F, G). Further scrutiny of the genes included in the AKAP9 and MEGF9 GO analyses revealed four that were not shared with CRX, PRDM1, RXRG, and THRB. In addition, SRCCA expression profiles of MEGF9 and AKAP9 showed a high degree of overlap with each other (118 of 200; Figure 6H) and with the PR marker CRX (Figure 6I, J).
Lastly, combinatorial SRCCA analysis was applied to scRNA-seq datasets using NRL and NR2E3 (rod PR), ATOH7 and POU4F2 (RGC), VSX2 and VIM (RPCs), or PMEL and TYRP1 (RPE) as bait genes (Figure 7). The two rod markers shared 57 genes in adult retina scRNA-seq datasets (Figure 7A), the majority of which are known PR and/or rod-specific genes (no cone-specific genes were identified). The ATOH7 and POU4F2 pairing highlighted 34 genes in d70 scRNA-seq datasets, half of which are known to be expressed in RGCs (Figure 7B). The RPC and RPE bait gene pairings likewise extracted numerous known and putative genes associated with these particular cell types (Figure 7C, D). Altogether, SRCCA offers a simple, robust, and highly versatile approach to mine scRNA-seq data to authenticate hPSC-OV progeny and search for putative NR cell type-associated genes.
Figure 7. Combinatorial SRCCA reveals gene expression associations for multiple retinal cell types.
Marker genes of any cell type can be used as bait for SRCCA. A) Combinatorial SRCCA analysis of rod markers NRL and NR2E3 revealed overlap in 57 of the top 200 correlating genes, including many known rod PR genes (asterisks). B–D) Similar results were obtained for markers of RGCs (ATOH7 and POU4F2), RPCs (VSX2 and VIM), and RPE (PMEL and TYRP1). Human adult retinal cells were profiled in A, while d70 hPSC-OVs were analyzed in B–D. [Rho value ranges for the top 200 genes correlated with each bait gene: NRL (0.42–0.20), NR2E3 (0.46–0.21), ATOH7 (0.45–0.23), VSX2 (0.60–0.28), VIM (0.55–0.32), PMEL (0.74–0.37), TYRP1 (0.74–0.36).]
Discussion
We describe a rapid and versatile method to mine scRNA-seq datasets for genes associated with any cell marker combination of interest. This approach provides an alternative to standard scRNA-seq analyses [22, 23], which generally use statistical algorithms such as PCA to examine cell-to-cell transcriptome variances and, ideally, separate cell types into distinct graphical clusters. In the present study, PCA plots did not yield clearly defined cell populations, although cells from similarly aged samples broadly grouped together. The presence of transient intermediary cell types in differentiating OV cultures cannot wholly explain this outcome, since clusters were not forthcoming from adult NR samples either. Similarly, selective capture of a limited subset of cell types is unlikely given the strong transcriptome-wide correlation between bulk and average single-cell RNA-seq datasets.
In contrast to PCA, analyzing scRNA-seq datasets based on the expression of a fully pre-determined cohort of retinal cell type-selective marker genes did succeed in generating graphical clusters. However, many of these genes were expressed more widely in our scRNA-seq datasets than expected. CRX, for example, was expressed at >1 TPM in nearly all cells from d218 OV and adult retina samples, despite tdTomato fluorescence being restricted to PRs and RPE. Thus, the biological relevance of low level gene expression is unclear and can confound scRNA-seq data interpretation.
These findings led us to consider relative expression levels of NR marker genes as parameters, reasoning that cells expressing higher levels of a particular marker are more likely to belong to a corresponding cell class. First, we enhanced PCA plots by superimposing relative expression levels of individual markers. However, even high expressing cells were found to be widely interspersed on PCA plots, precluding the assignment of identity based on graphical location alone. Of note, this observation also indicates that increasing the number of cells analyzed by PCA would not improve cell clustering.
We therefore eliminated PCA and focused exclusively on relative expression levels of target (or bait) genes across single cells. After ordering all cells in a scRNA-seq dataset from highest to lowest expresser, we applied SRCCA to search the transcriptome for genes whose cell-to-cell expression levels most closely matched that of the bait gene. This approach produced unbiased lists of genes that “virtually sort” together. Typically, gene lists such as these are obtained using bulk RNA-seq datasets following cell-sorting, either through use of cell surface markers or gene-specific reporter lines.
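The ranking step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names (`average_ranks`, `spearman_rho`, `srcca`), the dictionary layout (gene name mapped to a list of per-cell expression values), and the toy numbers are all hypothetical, and constant genes are arbitrarily assigned a rho of 0.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant gene is treated as uncorrelated
    return cov / sqrt(vx * vy)

def srcca(expression, bait, top_n=200):
    """Score every other gene against the bait and keep the top_n by rho."""
    bait_values = expression[bait]
    scored = [(gene, spearman_rho(bait_values, values))
              for gene, values in expression.items() if gene != bait]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]

# Toy expression values for five cells (illustrative numbers only).
toy = {
    "CRX":    [50.0, 30.0, 10.0, 2.0, 0.0],
    "GENE_A": [40.0, 35.0, 8.0, 1.0, 0.5],   # same cell-to-cell rank order as the bait
    "GENE_B": [0.0, 1.0, 5.0, 20.0, 60.0],   # opposite rank order
    "GENE_C": [3.0, 3.0, 3.0, 3.0, 3.0],     # constant across cells
}
top = srcca(toy, "CRX", top_n=2)
```

Because only rank order enters the calculation, a gene that tracks the bait from cell to cell scores highly even when its absolute expression level differs from that of the bait.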
Advantages of our approach are multiple. First, it does not require production, characterization, and sorting of reporter lines. Furthermore, for primary human tissue such as the retinal sample examined in this study, reporters are not an option, and suitable cell surface markers are often not available. Also, successful production of a faithful, robust reporter line is not guaranteed. As such, investigators tend to gravitate to the same proven genes to create reporters, as we did using CRX. By contrast, our technique is equally applicable to any gene and requires only minutes to generate the type of lists presented herein.
Second, unlike reporter lines, our method does not require genome modification and thus reflects influences of all native gene regulatory mechanisms. This is important, since even targeting a reporter to an endogenous locus can fail to recapitulate the full expression profile of a particular gene. Indeed, our CRX+/tdTomato hPSC line, as well as the allele-targeted CRX reporter line generated by the Lako lab [6], did not label BPCs, a known CRX+ NR cell type. This omission may be due to disruption of cryptic BPC-specific regulatory sequences at or distal to the reporter gene insertion site. Ultimately, this shortcoming was favorable in our effort to label differentiating PRs, but it underscores the need for caution in interpreting results from reporter lines.
A third advantage of SRCCA is that putative cell type-associated genes can be fed back into the analysis to further assess their selectivity and potential functional role(s). One such gene, AKAP9, encodes a member of a group of proteins that bind and compartmentalize the regulatory subunit of cAMP-dependent protein kinase. While the function of AKAP9 in differentiating PRs has not been described, a role for AKAP9 in ciliogenesis has been reported [24]. Another novel gene, MEGF9, encodes a transmembrane protein expressed widely in the developing nervous system (including the outermost portion of mouse PRs) that is postulated to act as a guidance or signaling molecule [25].
Fourth, and perhaps most compelling, is the ease with which SRCCA lists from multiple bait genes can be combined to generate a focused group of expressed genes highly correlated with particular cell types. To obtain similar data from a reporter line would require multiple gene editing steps and reduce the number of fluorescence emission channels available for characterization.
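Combining lists from several bait genes amounts to a set intersection, as the following sketch shows. It assumes each bait's top-correlating genes are already available as a list of gene names; the function name `shared_genes` and the placeholder gene identifiers are hypothetical and do not reproduce any result from this study.

```python
def shared_genes(top_lists):
    """Return genes found in every bait's top-correlating list, sorted by name."""
    gene_sets = [set(genes) for genes in top_lists.values()]
    if not gene_sets:
        return []
    return sorted(set.intersection(*gene_sets))

# Placeholder lists standing in for the top correlating genes of each bait.
top_lists = {
    "BAIT_1": ["GENE_1", "GENE_2", "GENE_3", "GENE_4"],
    "BAIT_2": ["GENE_2", "GENE_3", "GENE_4", "GENE_5"],
    "BAIT_3": ["GENE_3", "GENE_4", "GENE_6"],
}

two_way = shared_genes({k: top_lists[k] for k in ("BAIT_1", "BAIT_2")})
three_way = shared_genes(top_lists)
```

Adding a bait can only keep or shrink the shared list, which is why the 3- and 4-way comparisons yield progressively more focused gene sets.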
A limitation of our approach is that the operator is required to choose the bait gene(s) to be analyzed. However, it is common for scRNA-seq studies that employ unbiased PCA, hierarchical clustering, or t-distributed stochastic neighbor embedding to utilize marker genes to assign cell identity [26, 27]. Using these methods, cell types were confirmed within clusters based upon the presence of a specific marker gene. However, results presented here and elsewhere [8, 9] suggest that transcripts indicative of multiple different cell types can be attributed to the same cell. Attention to relative levels of gene expression as well as the use of multiple markers may help address this problem.
Other limitations must also be considered. In this study, we arbitrarily examined the top 200 correlating genes for each bait gene; however, this undoubtedly excludes genes of importance. In addition, gene expression levels between cell types cannot be directly compared with SRCCA. Artifacts of scRNA-seq stemming from cell capture bias and transcript processing are also points of concern. To monitor the former, we determined the transcriptome-wide correlation between bulk RNA-seq and average scRNA-seq data. For the latter, we employed a rigorous statistical method (SinQC) for separating technical artifacts from real biological variability based on multiple unbiased criteria. Finally, it is important to emphasize that, unlike PCA, our method groups genes with similar cell-to-cell expression patterns and not the cells themselves. Thus, it does not provide a direct assessment of the number of cells belonging to a specific cell class.
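The capture-bias check mentioned above, comparing bulk RNA-seq with the average of the single-cell profiles, can be sketched as follows. The choice of coefficient and transformation is an assumption made for illustration (Pearson correlation on log2(TPM + 1) values); this section does not specify which was used. All function names and toy values are hypothetical.

```python
from math import log2, sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(vx * vy)

def mean_profile(cells, genes):
    """Average each gene across cells; genes missing from a cell count as 0."""
    return [sum(cell.get(g, 0.0) for cell in cells) / len(cells) for g in genes]

def bulk_vs_single_cell(bulk, cells):
    """Correlate the bulk profile with the mean single-cell profile (log scale)."""
    genes = sorted(bulk)
    bulk_log = [log2(bulk[g] + 1.0) for g in genes]
    sc_log = [log2(v + 1.0) for v in mean_profile(cells, genes)]
    return pearson(bulk_log, sc_log)

# Toy TPM-like values (illustrative only).
bulk = {"G1": 100.0, "G2": 10.0, "G3": 0.0}
representative = [{"G1": 120.0, "G2": 0.0, "G3": 0.0},
                  {"G1": 80.0, "G2": 20.0, "G3": 0.0}]
skewed = [{"G1": 0.0, "G2": 0.0, "G3": 50.0}]  # captured cells unlike the bulk sample

r_match = bulk_vs_single_cell(bulk, representative)
r_skewed = bulk_vs_single_cell(bulk, skewed)
```

A high coefficient suggests that the captured cells, taken together, resemble the bulk sample; a low one would point to biased capture of a subset of cell types.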
Conclusion
We describe a method of scRNA-seq analysis that provides information akin to a gene-specific reporter line, albeit with a limitless number and combination of genes of interest. This method does not supplant others as a means to mine scRNA-seq datasets, but instead offers an alternative approach when cell clusters are not readily apparent on standard PCA plots. Upon applying the technique to serum-free cultures of 3D OVs, genes indicative of PRs and other NR cell types grouped together, providing evidence for the authenticity of OV progeny and establishing a basis for analyzing genes of potential importance in NR cell development and function.
Supplementary Material
A) CRX-tdTomato BAC construct used for reporter line creation. B) qPCR copy number assay confirming targeting. C) Normal karyotype of targeted CRX+/tdTomato line. D–G) At d10, cells within neural clusters expressed eye field transcription factors (EFTFs), including RAX (D), PAX6 (E), LHX2 (D,E), SIX3 and SIX6 (F), and OTX2 (G), along with proliferation marker KI67 (G). H) RT-PCR showed expression of EFTFs and NR transcription factors by d9. I) OVs purified at d30 contained KI67+ NRPCs expressing VSX2. Scale bar = 100μm in G; scale bar = 50μm in I.
A–C) tdTomato (A) is co-expressed with CRX (B) in day 45 CRX+/tdTomato hPSC-OVs (merged in C). D–F) For comparison, CRX expression was examined in human fetal retina (d122), where it was detected in the ONL and INL of the NR (D) and in RPE (arrows) at low (E) and high (F) magnification. Light microscopic and fluorescence images were merged in E and F to show RPE pigmentation. Scale bars = 100μm in C and E; scale bar in F = 20μm.
A) At d45, many CRX-tdTomato+ PRs co-express RCVRN. B) By d100, CRX-tdTomato+ PRs are situated at the periphery of OVs, whereas cells expressing VSX2, a marker for RPCs and bipolar cells, are found deeper within the developing neuroblastic/inner nuclear layer. C) BRN3+ ganglion cells are found in the innermost portion of OVs at d100. These and all other non-PR NR cells are tdTomato-negative. D) Light microscopic image of an adherent RPE monolayer culture from the hPSC-CRX+/tdTomato line. E–F) CRX-tdTomato expression was seen in a subset of hPSC-RPE cells (E). By overlaying light microscopic and fluorescence images (F), it was noted that CRX-tdTomato expression was robust in lightly pigmented RPE, and was not detected in heavily pigmented RPE (asterisks in D–F). Scale bars = 100μm.
A–C) VSX2 was examined as an RPC and BPC marker in early (A) and late (B) OVs, and in adult retina (C). Genes expressed in RPCs correlated with VSX2 at d70, including SFRP2, HES1, SOX2, VIM, and DKK3 (A). By d218 the number of VSX2+ cells decreased (B) compared to adult retina (C). D–F) SOX9 was used as a marker for RPCs and MG in early (D) and late (E) OVs, and in adult retina (F). At d70, SRCCA for SOX9 correlated with proliferation genes (CCND1 and PCNA) (D), while at d218 SOX9 expression correlated with RPC and MG genes SOX2, VIM, DKK3, and NES (E) and mature MG genes CRABP1, PAX6, and AQP4 in adult retina (F). G–I) ONECUT1 is expressed in RPCs, early cones, and HCs. In d70 OVs (G), ONECUT1 was most correlated with PTF1A, an HC transcription factor. At d218 (H), few cells expressed ONECUT1, although a correlation with PROX1 (HC gene) was noted. Adult retina ONECUT1 SRCCA results are shown in I. J,K) ATOH7 and POU4F2 (RGCs) were expressed in d70 OV cells (for further analysis see Figure 7). L) PTF1A (HCs and ACs) correlated with ONECUT1 at d70 (and vice-versa). M–O) PROX1 (HCs) SRCCA of d70 (M) and d218 (N) OVs and in adult retina (O) are shown. In d218 OV cells, the top correlating genes with PROX1 were the HC transcription factors ONECUT1 and ISL1. Z-score legends for A–O are shown in panels a–o.
Significance Statement.
Cell type evaluation in complex cultures can be accomplished via reporter lines or single cell analytical techniques. Toward this end, corresponding authors and colleagues examined differentiating human pluripotent stem cell-derived optic vesicles using standard single cell RNA-seq principal component analysis, which yielded equivocal results. Subsequent application of Spearman’s rank correlation coefficient analysis allowed rapid, in-depth examination of retinal cell progeny and gene profiling of individual neural retina cell types. This simple and highly versatile analytical method is applicable not only to stem cell-derived retinal cultures, but also to any complex culture or tissue.
Acknowledgments
Grants
NIH R01EY21218 (DMG), R01EY24995 (BRP) and P30HD03352, the Foundation Fighting Blindness Wynn-Gund Translational Research Acceleration Program (DMG), the Retina Research Foundation Humble Directorship (DMG) and MD Matthews Research Professorship (BRP), the McPherson Eye Research Institute Sandra Lemke Trout Chair in Eye Research (DMG), Research to Prevent Blindness (UW-Madison Department of Ophthalmology and Visual Sciences).
We thank Jennifer Bolin and Angela Elwell for technical assistance and Drs. David Frisch and Fred Blattner for the hES-2A–inducible BAC vector. This work was funded by NIH R01EY21218 (DMG), R01EY24995 (BRP) and P30HD03352, the Foundation Fighting Blindness Wynn-Gund Translational Research Acceleration Program (DMG), the Retina Research Foundation Humble Directorship (DMG) and MD Matthews Research Professorship (BRP), the McPherson Eye Research Institute Sandra Lemke Trout Chair in Eye Research (DMG), the Muskingum County Community Foundation, and Research to Prevent Blindness.
Footnotes
Disclosure of Potential Conflicts of Interest
Drs. David Gamm and M. Joseph Phillips have an ownership interest in Opsis Therapeutics LLC, which has licensed the technology to generate optic vesicles from pluripotent stem cell sources reported in this publication. Dr. Gamm also declared intellectual rights in Wisconsin Alumni Research Foundation and consultant role with FUJIFILM Cellular Dynamics International. All other authors declare no conflicts.
Author contributions
M. Joseph Phillips: Conception and design, Collection and/or assembly of data, Data analysis and interpretation, Manuscript writing, Final approval of manuscript.
Peng Jiang: Conception and design, Collection and/or assembly of data, Data analysis and interpretation, Manuscript writing, Final approval of manuscript.
Sara Howden: Collection and/or assembly of data, Data analysis and interpretation.
Patrick Barney: Collection and/or assembly of data, Data analysis and interpretation.
Jee Min: Collection and/or assembly of data, Data analysis and interpretation.
Nathaniel W. York: Collection and/or assembly of data, Data analysis and interpretation.
Li-Fang Chu: Collection and/or assembly of data, Data analysis and interpretation.
Elizabeth E. Capowski: Data analysis and interpretation.
Abigail Cash: Collection and/or assembly of data.
Shivani Jain: Collection and/or assembly of data.
Katherine Barlow: Collection and/or assembly of data.
Tasnia Tabassum: Collection and/or assembly of data.
Ron Stewart: Data analysis and interpretation.
Bikash R. Pattnaik: Financial support, Data analysis and interpretation.
James A. Thomson: Financial support, Data analysis and interpretation, Final approval of manuscript.
David M. Gamm: Financial support, Data analysis and interpretation, Manuscript writing, Final approval of manuscript.
References
- 1. Meyer JS, Howden SE, Wallace KA, et al. Optic vesicle-like structures derived from human pluripotent stem cells facilitate a customized approach to retinal disease treatment. Stem Cells. 2011;29:1206–1218. doi: 10.1002/stem.674.
- 2. Nakano T, Ando S, Takata N, et al. Self-formation of optic cups and storable stratified neural retina from human ESCs. Cell Stem Cell. 2012;10:771–785. doi: 10.1016/j.stem.2012.05.009.
- 3. Zhong X, Gutierrez C, Xue T, et al. Generation of three-dimensional retinal tissue with functional photoreceptors from human iPSCs. Nat Commun. 2014;5:4047. doi: 10.1038/ncomms5047.
- 4. Reichman S, Terray A, Slembrouck A, et al. From confluent human iPS cells to self-forming neural retina and retinal pigmented epithelium. Proc Natl Acad Sci U S A. 2014;111:8518–8523. doi: 10.1073/pnas.1324212111.
- 5. Kaewkhaw R, Kaya KD, Brooks M, et al. Transcriptome Dynamics of Developing Photoreceptors in Three-Dimensional Retina Cultures Recapitulates Temporal Sequence of Human Cone and Rod Differentiation Revealing Cell Surface Markers and Gene Networks. Stem Cells. 2015;33:3504–3518. doi: 10.1002/stem.2122.
- 6. Collin J, Mellough CB, Dorgau B, et al. Using Zinc Finger Nuclease Technology to Generate CRX-Reporter Human Embryonic Stem Cells as a Tool to Identify and Study the Emergence of Photoreceptors Precursors During Pluripotent Stem Cell Differentiation. Stem Cells. 2016;34:311–321. doi: 10.1002/stem.2240.
- 7. Cherry TJ, Trimarchi JM, Stadler MB, et al. Development and diversification of retinal amacrine interneurons at single cell resolution. Proc Natl Acad Sci U S A. 2009;106:9495–9500. doi: 10.1073/pnas.0903264106.
- 8. Mizeracka K, Trimarchi JM, Stadler MB, et al. Analysis of gene expression in wild-type and Notch1 mutant retinal cells by single cell profiling. Dev Dyn. 2013;242:1147–1159. doi: 10.1002/dvdy.24006.
- 9. Trimarchi JM, Stadler MB, Cepko CL. Individual retinal progenitor cells display extensive heterogeneity of gene expression. PLoS One. 2008;3:e1588. doi: 10.1371/journal.pone.0001588.
- 10. Trimarchi JM, Stadler MB, Roska B, et al. Molecular heterogeneity of developing retinal ganglion and amacrine cells revealed through single cell gene expression profiling. J Comp Neurol. 2007;502:1047–1065. doi: 10.1002/cne.21368.
- 11. Phillips MJ, Wallace KA, Dickerson SJ, et al. Blood-derived human iPS cells generate optic vesicle-like structures with the capacity to form retinal laminae and develop synapses. Invest Ophthalmol Vis Sci. 2012;53:2007–2019. doi: 10.1167/iovs.11-9313.
- 12. Chu LF, Leng N, Zhang J, et al. Single-cell RNA-seq reveals novel regulators of human embryonic stem cell differentiation to definitive endoderm. Genome Biol. 2016;17:173. doi: 10.1186/s13059-016-1033-x.
- 13. Hou Z, Jiang P, Swanson SA, et al. A cost-effective RNA sequencing protocol for large-scale gene expression studies. Sci Rep. 2015;5:9570. doi: 10.1038/srep09570.
- 14. Phillips MJ, Perez ET, Martin JM, et al. Modeling human retinal development with patient-specific induced pluripotent stem cells reveals multiple roles for visual system homeobox 2. Stem Cells. 2014;32:1480–1492. doi: 10.1002/stem.1667.
- 15. Li B, Dewey CN. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics. 2011;12:323. doi: 10.1186/1471-2105-12-323.
- 16. Jiang P, Thomson JA, Stewart R. Quality control of single-cell RNA-seq by SinQC. Bioinformatics. 2016;32:2514–2516. doi: 10.1093/bioinformatics/btw176.
- 17. Leng N, Dawson JA, Thomson JA, et al. EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments. Bioinformatics. 2013;29:1035–1043. doi: 10.1093/bioinformatics/btt087.
- 18. Tran NM, Zhang A, Zhang X, et al. Mechanistically distinct mouse models for CRX-associated retinopathy. PLoS Genet. 2014;10:e1004111. doi: 10.1371/journal.pgen.1004111.
- 19. Pattnaik B, Jellali A, Sahel J, et al. GABAC receptors are localized with microtubule-associated protein 1B in mammalian cone photoreceptors. J Neurosci. 2000;20:6789–6796. doi: 10.1523/JNEUROSCI.20-18-06789.2000.
- 20. Jiang P, Hou Z, Bolin JM, et al. RNA-Seq of Human Neural Progenitor Cells Exposed to Lead (Pb) Reveals Transcriptome Dynamics, Splicing Alterations and Disease Risk Associations. Toxicol Sci. 2017;159:251–265. doi: 10.1093/toxsci/kfx129.
- 21. Jiang P, Nelson JD, Leng N, et al. Analysis of embryonic development in the unsequenced axolotl: Waves of transcriptomic upheaval and stability. Dev Biol. 2017;426:143–154. doi: 10.1016/j.ydbio.2016.05.024.
- 22. Kumar P, Tan Y, Cahan P. Understanding development and stem cells using single cell-based analyses of gene expression. Development. 2017;144:17–32. doi: 10.1242/dev.133058.
- 23. Bacher R, Kendziorski C. Design and computational analysis of single-cell RNA-sequencing experiments. Genome Biol. 2016;17:63. doi: 10.1186/s13059-016-0927-y.
- 24. Hurtado L, Caballero C, Gavilan MP, et al. Disconnecting the Golgi ribbon from the centrosome prevents directional cell migration and ciliogenesis. J Cell Biol. 2011;193:917–933. doi: 10.1083/jcb.201011014.
- 25. Brandt-Bohne U, Keene DR, White FA, et al. MEGF9: a novel transmembrane protein with a strong and developmentally regulated expression in the nervous system. Biochem J. 2007;401:447–457. doi: 10.1042/BJ20060691.
- 26. Treutlein B, Brownfield DG, Wu AR, et al. Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq. Nature. 2014;509:371–375. doi: 10.1038/nature13173.
- 27. Macosko EZ, Basu A, Satija R, et al. Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell. 2015;161:1202–1214. doi: 10.1016/j.cell.2015.05.002.