Abstract
To overcome the ethical and technical limitations of in vivo human disease models, the broader scientific community frequently employs model organism-derived cell lines to investigate disease mechanisms, pathways, and therapeutic strategies. Despite the widespread use of certain in vitro models, many still lack contemporary genomic analysis supporting their use as a proxy for the affected human cells and tissues. Consequently, it is imperative to determine how accurately and effectively any proposed biological surrogate may reflect the biological processes it is assumed to model. One such cellular surrogate of human disease is the established mouse neural precursor cell line, SN4741, which has been used to elucidate mechanisms of neurotoxicity in Parkinson disease for over 25 years. Here, we use a combination of classic and contemporary genomic techniques (karyotyping, RT-qPCR, single cell RNA-seq, bulk RNA-seq, and ATAC-seq) to characterize the transcriptional landscape, chromatin landscape, and genomic architecture of this cell line, and evaluate its suitability as a proxy for midbrain dopaminergic neurons in the study of Parkinson disease. We find that SN4741 cells possess an unstable triploidy and consistently exhibit low expression of dopaminergic neuron markers across assays, even when the cell line is shifted to the non-permissive temperature that drives differentiation. The transcriptional signatures of SN4741 cells suggest that they are maintained in an undifferentiated state at the permissive temperature and differentiate into immature neurons at the non-permissive temperature; however, they may not be dopaminergic neuron precursors, as previously suggested. Additionally, the chromatin landscapes of SN4741 cells, in both the differentiated and undifferentiated states, are not concordant with the open chromatin profiles of ex vivo, mouse E15.5 forebrain- or midbrain-derived dopaminergic neurons.
Overall, our data suggest that SN4741 cells may reflect early aspects of neuronal differentiation but are likely not a suitable proxy for dopaminergic neurons as previously thought. The implications of this study extend broadly, illuminating the need for robust biological and genomic rationale underpinning the use of in vitro models of molecular processes.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12864-023-09398-y.
Keywords: Parkinson disease, Mouse-derived cell lines, Immortalized cell lines, Chromatin accessibility, RNA-seq, ATAC-seq, scRNA-seq, Genomic characterization, Disease-relevant model systems
Background
In vitro cellular surrogates present an excellent opportunity for elucidating the molecular mechanisms behind human disease without the ethical and technical limitations of in vivo systems. As such, studies of human disease that employ genomic or cellular manipulations, or assays that require high cell quantity and quality, are often conducted in vitro to ensure biological and statistical robustness [1–3]. For example, in vitro models are frequently employed in studies of the role of genomic regulation in human disease, in the identification of candidate genes and regulatory elements, and in the evaluation of their functional characteristics through genetic manipulations and high-throughput assays [4–7]. As genome-wide association studies (GWASs) continue to reveal human disease-associated variants, it is becoming evident that most of them lie within non-coding regions of the genome [8]. Such regions frequently represent cis-regulatory elements (CREs), required for the transcriptional modulation of cognate genes. The assays required to evaluate their function [4, 9, 10], or connect CREs with the promoters they modulate [11], often require large cell numbers, making in vitro cellular systems the preferred strategy.
Prioritizing non-coding GWAS variants and disease-relevant sequences for extensive investigation requires knowledge of their chromatin accessibility status. Open chromatin is more likely to harbor functional sequences, and since chromatin accessibility profiles vary across cell types and developmental time, it is important to prioritize disease-associated variants that lie within open chromatin regions in the disease-relevant cell type(s) [8, 12, 13]. It is also critical to functionally evaluate the biological consequences of disease-associated variation, test the efficacy of potential therapeutics, and observe the effects of disease-relevant insults in the appropriate cellular context [8, 14, 15]. Therefore, when studying disease-associated variation, the most effective in vitro cellular surrogates should ideally mimic the chromatin architecture and transcriptional profiles of the in vivo cell types affected by disease.
In Parkinson disease (PD), midbrain (MB) dopaminergic (DA) neurons in the substantia nigra (SN) are the primary affected cell type [16]. Preferential degeneration of these neurons elicits a progressive neurodegenerative disorder characterized by motor deficits [16]. As the second most common neurodegenerative disorder, affecting approximately 1% of adults over 70 years old [17, 18], PD is the focus of extensive research efforts. As such, various cell lines have been used as in vitro proxies of MB DA neurons to study the cellular impacts of PD-relevant insults, as well as candidate PD-associated sequences, their functions, and their potential as therapeutic targets [19].
One such cell line, SN4741, is reported to be a clonal DA neuronal progenitor line that was established in 1999 from mouse embryonic day 13.5 (E13.5) SN tissue [20]. The SN was dissected from transgenic mice containing 9.0 kb of the 5’ promoter region of rat tyrosine hydroxylase (Th), fused to the temperature-sensitive mutant Simian Virus 40 T antigen (SV40Tag-tsA58) oncogene [20]. The goal of this Th promoter transgene was to enable selective acquisition of DA neurons, while the purpose of the SV40Tag oncogene was to facilitate conditional immortalization of the cell line. The temperature sensitive mutant form of this immortalizing gene (tsA58) should permit uncontrolled differentiation and proliferation at the permissive temperature (33 °C), maintain cells in an undifferentiated state at 37 °C, and since tsA58 displays diminished activity at 39 °C, it should direct differentiation that more closely resembles primary cells when the culture is shifted to this non-permissive temperature [20].
As an established mouse neural precursor line, SN4741 cells have since been used to elucidate mechanisms of neurotoxicity in PD [21–25], test the efficacy of therapeutic targets against PD-relevant insults [26, 27], and assay the impacts of PD-associated genetic mutation [28, 29] and transcriptional regulation [30–32]. Important technological advances have also arisen since the genesis and implementation of the SN4741 cell line, including chromatin conformation capture technologies [11, 33, 34], RNA-sequencing (RNA-seq) [35], and assay for transposase-accessible chromatin using sequencing (ATAC-seq) [36]. In this study, we exploit these modern approaches to assess the suitability of SN4741 as an in vitro proxy for DA neurons and determine the extent to which this cell line is appropriate for prioritizing and investigating the mechanisms by which PD-associated variation confers disease risk.
Through a combination of karyotyping, single-cell (sc)RNA-seq, and RT-qPCR, we evaluate the genomic integrity of this immortalized cell line, determine how the transcriptional profile and expression of DA neuron marker genes in this line change between undifferentiated (37 °C) and differentiated (39 °C) states, and evaluate whether these transcriptional changes are consistent throughout the differentiation process. The data we collected suggest that while these cells show evidence of exiting a proliferative state and entering a more differentiated state, they are an unsuitable model of SN DA neurons, as they possess aneuploidy and structural abnormalities, as well as consistently low expression of DA neuron markers upon differentiation. We employ bulk RNA-seq to quantify transcriptional differences between differentiated and undifferentiated SN4741 cells and determine that, while transcriptional profiles change to reflect differentiation, they do not show strong evidence that these cells are entering a DA state. We then compare chromatin accessibility profiles of undifferentiated and differentiated SN4741 cells with those of ex vivo mouse E15.5 midbrain (MB) and forebrain (FB) neurons and determine that the chromatin accessibility profiles of SN4741 cells do not reflect the cellular population from which they were derived. Collectively, cytogenetic, chromatin, and transcriptional data suggest that the SN4741 cell line is not as strong a cellular surrogate for DA neurons as previously thought. Ultimately, this work underscores the importance of leveraging technological advances in genomic and cellular analyses to evaluate, and re-evaluate, the suitability of established model systems in disease biology.
Results
SN4741 is an unstable polyploid cell line
G-band karyotyping was performed on 20 SN4741 metaphase spreads and a representative karyogram (Fig. 1A) was generated. The karyotype was interpreted as abnormal and polyploid, with complex numerical abnormalities and unbalanced structural abnormalities. Although most abnormalities were consistently present in these cells, none of the 20 cells assessed had the same chromosome complement, and no normal cells were observed. All cells possessed at least one copy of each mouse autosome (1 through 19) and female sex chromosomes; however, most chromosomes were triploid in each cell (Fig. 1B). These karyotypic abnormalities already call into question the viability of these cells as a surrogate for human neurodegenerative disease. Because these cells are genetically unstable, there may be large experimental batch effects as the cell populations shift across divisions. Furthermore, gene dosage that deviates severely from the normal copy number in DA neurons may lead to confounded and unreliable results.
Undifferentiated and differentiated SN4741 cells express similar levels of dopaminergic neuron marker genes by RT-qPCR
Preliminary analysis by RT-qPCR confirmed expression of a variety of DA neuron markers: forkhead box A2 (Foxa2), nuclear receptor subfamily 4 group A, member 2 (Nr4a2), solute carrier family 6 member 3 (Slc6a3), and tyrosine hydroxylase (Th). Compared to the expression of these markers in the undifferentiated SN4741 cell culture (37 °C), relative expression of all markers remained at similar levels (Foxa2, p = 0.601; Nr4a2, p = 0.425; Slc6a3, p = 0.729; Th, p = 0.265) when the cells were shifted to the higher temperature condition (39 °C; Fig. 1C). While elevated Th has been used as a marker of differentiation into DA neurons in previous work with SN4741 cells [20, 37, 38], an increase in Th expression is not exclusively associated with DA neurons. Th is a marker for all catecholaminergic neurons (dopaminergic and adrenergic) [39], and evidence suggests that Th expression is transient in other neurons throughout embryonic development [40–42]. These results indicate that at the non-permissive temperature, SN4741 cells may not be fully differentiating into DA neuron progenitors.
scRNA-seq reveals that SN4741 cells differentiate at the non-permissive temperature, but lack expression of DA neuron marker genes
To assess the consistency of the differentiation protocol, transcriptomes were generated from ≥ 17,000 cells across four replicates cultured at the permissive temperature (37 °C) and four replicates cultured at the non-permissive temperature (39 °C). Analysis of the single-cell transcriptomes reveals that the cells cluster by growth temperature (Fig. 1D). This separation of cells by temperature is accompanied by changes to the cell cycle, with cells at the permissive temperature (37 °C) mostly in either G2M or S phase, while cells at the non-permissive temperature (39 °C) are mostly in G1 phase (Fig. 1E), indicating that they may be differentiated. In expression analysis, markers of proliferation that are expressed in G2M phase, like Marker of Proliferation Ki-67 (Mki67), are predominantly expressed in cells at the permissive temperature (p < 2.225e-308, Fig. 1F), corroborating the cell cycle analysis. When shifted to the non-permissive temperature, SN4741 cells appear to robustly differentiate, exemplified by a decrease in the expression of Nestin (Nes), a neural stem cell marker (p < 2.225e-308, Fig. 1F). Additional transcriptional changes at this non-permissive temperature include a significant increase in the expression of a neural marker CUGBP Elav-Like Family Member 5 (Celf5, p < 2.225e-308) [43], as well as genes that have been found to regulate neural stem cell self-renewal (Inhibitor of DNA Binding 2, Id2, p = 9.236e-251; High Mobility Group AT-Hook 2, Hmga2, p < 2.225e-308) [44, 45], neurogenesis (Iroquois Homeobox 3, Irx3, p < 2.225e-308) [46], and arborization of neurons (Sodium Voltage-Gated Channel Beta Subunit 1, Scn1b, p < 2.225e-308) [47], indicating that these cells may be differentiating into neural precursor cells (Fig. 1G). Furthermore, Cadherin 13 (Cdh13, p < 2.225e-308), a modulator of GABAergic neurons, is significantly upregulated at this non-permissive temperature (Fig. 1G), while a variety of DA neuron markers, including Aldehyde Dehydrogenase 1 Family Member A1 (Aldh1a1), Foxa2, LIM Homeobox Transcription Factor 1 Beta (Lmx1b), Nr4a2, Paired-like homeodomain 3 (Pitx3), Slc6a3, and Th, fail to meet the criteria for differential expression (Fig. 1H). Collectively, these results suggest that while SN4741 cells are differentiating towards a neuronal fate when shifted to the non-permissive temperature, they may not be entering a clear DA trajectory under these conditions.
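The recurring bound p < 2.225e-308 in the paragraph above coincides with the smallest positive normalized double-precision number. This suggests, though the text does not state it, that these p-values fell below what the analysis software could represent and were reported at that floor. A minimal Python sketch of such reporting, assuming IEEE 754 doubles (the helper `format_p` is hypothetical and not part of any tool named in this study):

```python
import sys

# Smallest positive normalized IEEE 754 double (about 2.225e-308).
FLOAT_FLOOR = sys.float_info.min


def format_p(p: float) -> str:
    """Report a p-value, using an inequality once it drops below the floor."""
    if p < FLOAT_FLOOR:
        return f"p < {FLOAT_FLOOR:.3e}"
    return f"p = {p:.3e}"


# One underflowed value and one value quoted in the text (Id2).
print(format_p(0.0))
print(format_p(9.236e-251))
```

Read this way, values reported at the floor indicate only that the test statistic was extreme; they carry no further ranking information among the genes that share the bound.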
ATAC-seq identifies differential open chromatin profiles in SN4741 cells at the permissive and non-permissive temperatures
To consider how chromatin accessibility changes between the two temperatures, we performed ATAC-seq on SN4741 cells in both the undifferentiated and differentiated states. Libraries were confirmed to be technically and biologically relevant (Supplemental Fig. 1), and well correlated between replicates (Supplemental Fig. 2; Fig. 2A-B).
A total of 83,778 consensus open chromatin regions were identified, with 70% of peaks shared between the two temperatures (Fig. 2C). Principal component analysis of these consensus regions suggests a clear separation in the chromatin state between the two temperatures (Fig. 2D). To explore these differences, we performed differential accessibility analysis with DiffBind [48], to find a total of 5,055 differentially accessible regions: 2,654 enriched in the permissive temperature and 2,401 enriched at the non-permissive temperature (log2FC > 1, FDR < 0.05; Fig. 2E).
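The thresholding step described above can be sketched in a few lines of Python. This is an illustration only and not the DiffBind workflow: the `Region` type, the region names, and their values are made up, and both the symmetric cutoff (|log2FC| > 1) and the sign convention (positive means more accessible at 39 °C) are assumptions.

```python
from typing import NamedTuple

# Thresholds quoted in the text; applying the fold-change cutoff
# symmetrically (|log2FC| > 1) is an assumption of this sketch.
LOG2FC_CUTOFF = 1.0
FDR_CUTOFF = 0.05


class Region(NamedTuple):
    name: str      # hypothetical region identifier
    log2fc: float  # assumed sign: positive = more accessible at 39 °C
    fdr: float


def classify(region: Region) -> str:
    """Assign a region to a temperature, or mark it as not differential."""
    if region.fdr >= FDR_CUTOFF or abs(region.log2fc) <= LOG2FC_CUTOFF:
        return "not_differential"
    return "non_permissive" if region.log2fc > 0 else "permissive"


# Made-up regions for illustration only.
regions = [
    Region("region_a", 2.3, 0.001),
    Region("region_b", -1.8, 0.020),
    Region("region_c", 0.4, 0.0001),  # significant, but the change is small
    Region("region_d", 3.0, 0.200),   # large change, but not significant
]
calls = {r.name: classify(r) for r in regions}

# Consistency check on the counts reported in the text.
reported_total = 2_654 + 2_401
fraction_differential = reported_total / 83_778
```

The two reported subsets sum to the stated 5,055 regions, which is roughly 6% of the 83,778 consensus regions.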
Gene ontology analysis of genes adjacent to differentially accessible regions largely recapitulates the scRNA-seq analysis; functions associated with regions preferentially open at the permissive temperature suggest the maintenance of the undifferentiated, cell-cycling state (Fig. 2F). The gene ontology of genes adjacent to regions preferentially accessible at the non-permissive temperature is less coherent and suggests cell differentiation towards several fates (blood vessels, cartilage, tooth), none of which are neuronal. Perhaps unsurprisingly, these regions also show evidence of a response to temperature stress [49] (Fig. 2G).
Overall, there is a shift in chromatin accessibility between the two temperatures that indicates that the cells transition from an undifferentiated to a differentiated state as they move from the permissive to the non-permissive temperature. However, the differences in chromatin accessibility provide no evidence that SN4741 cells are differentiating towards a neuronal lineage.
Comparison of chromatin accessibility in SN4741 cells fails to recapitulate the chromatin landscape of ex vivo mouse DA neurons
To evaluate the potential relationship between SN4741 cells and the DA neurons they are presumed to model, we compared the chromatin accessibility of SN4741 cells at both temperatures to previously generated ex vivo mouse embryonic DA neuron chromatin accessibility profiles (NCBI GEO: GSE122450; [50]).
Considering the consensus peak set of 165,334 regions generated from all in vitro and ex vivo samples and their normalized read counts, we observe a clear separation between the SN4741 cell culture model and the ex vivo DA neurons by correlation and principal component analysis (Fig. 3A, B). Examining the raw overlap of peaks between the SN4741 cells and ex vivo neurons, just 12.5% (20,667) are present in all four cell types/conditions (Fig. 3C). The chromatin profiles are largely exclusive between the SN4741 cell culture model and the ex vivo DA neurons: 41.3% (68,304) of regions are accessible solely in the ex vivo neuron populations and 40% (65,857) are exclusively accessible in the SN4741 cell culture models. There is little overlap between the ex vivo and cultured samples. When the ex vivo midbrain DA neurons are compared with SN4741 cells at the non-permissive (differentiated) temperature, only 183 peaks are restricted to these two populations.
The chromatin profiles of ex vivo embryonic DA neurons and their prospective in vitro cell culture surrogate are virtually independent. They exhibit scant overlap in their global chromatin profiles and bear little resemblance to each other at regulatory regions of key DA neuron genes (Fig. 3D). It is worth noting that the lack of concordance between in vitro SN4741 cells (2D culture) and ex vivo DA neurons (3D) may also be due, in part, to differences in culturing and isolation conditions. Previous studies have found that 2D culture conditions do not fully recapitulate in vivo or ex vivo 3D conditions [51, 52]. Regardless, neither the analysis of the SN4741 chromatin accessibility profiles in isolation nor their comparison with ex vivo neurons suggests that these cells are appropriate models of embryonic DA neurons.
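The overlap percentages quoted above follow directly from the reported peak counts. The short Python sketch below recomputes them; the dictionary keys are labels chosen for this illustration, and the leftover category is derived by subtraction and is not a figure reported in the text.

```python
# Peak counts quoted in the text.
CONSENSUS_PEAKS = 165_334
peak_counts = {
    "shared_by_all_four": 20_667,
    "ex_vivo_only": 68_304,
    "sn4741_only": 65_857,
}


def percent(count: int, total: int = CONSENSUS_PEAKS, digits: int = 1) -> float:
    """Express a peak count as a percentage of the consensus set."""
    return round(100 * count / total, digits)


percentages = {label: percent(n) for label, n in peak_counts.items()}

# Peaks in any other sharing pattern (derived here, not reported in the text).
other_patterns = CONSENSUS_PEAKS - sum(peak_counts.values())

# The 183 peaks restricted to ex vivo MB neurons and SN4741 cells at 39 °C.
mb_and_39_only = percent(183, digits=2)
```

Under these counts the SN4741-only share comes out at 39.8%, which the text rounds to 40%, and the 183 peaks shared only by MB neurons and differentiated SN4741 cells amount to about 0.11% of the consensus set.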
Transcriptional changes in SN4741 cells indicate differentiation from pluripotent stem cells into brain cells that do not fully resemble MB DA neurons
Bulk RNA-seq data were also generated for SN4741 cells, at both the permissive and non-permissive temperatures, to determine whether transcriptome changes reflect differentiation into DA neurons, or other neural cell types. To evaluate the RNA-seq libraries, quality-control measures were performed in silico (Supplemental Fig. 3). PCA (Supplemental Fig. 3B) and sample-sample distances (Supplemental Fig. 3C) reaffirmed that samples cultured at the same temperature are more like one another than samples cultured at the alternate temperature.
We found that 735 genes were upregulated at the non-permissive temperature (adjusted p-value < 0.01 and log2 FC > 1.5), and 954 genes were downregulated (adjusted p-value < 0.01 and log2 FC < -1.5) at the non-permissive temperature. The list of genes significantly downregulated at the non-permissive temperature was submitted to Enrichr (https://maayanlab.cloud/Enrichr/) [53–55] for gene ontology (GO) and analysis of cell type markers. Consistent with the observation that cells at the non-permissive temperature are differentiated and in G1 phase of the cell cycle, downregulated genes resulted in GO terms strongly enriched for mitotic and DNA replication processes (Fig. 4A). Additionally, significantly downregulated genes at the non-permissive temperature overlap with subsets of PanglaoDB [56] cell type marker genes, suggesting that these cells are shifting away from a state that resembles neural stem cells (Fig. 4B).
Similarly, the list of significantly upregulated genes was submitted to Enrichr for GO and analysis of cell type markers. As expected, upregulated genes resulted in GO terms for biological processes that indicate a more terminally differentiated cell type (Fig. 4C): “synaptic vesicle docking”, “negative regulation of osteoblast proliferation”, “lens fiber cell differentiation”, “regulation of osteoblast proliferation”, and “forebrain regionalization”. While not included in the top 10 terms by combined score ranking, “neuron remodeling”, “synaptic transmission, glutamatergic”, “neuron maturation”, and “synaptic transmission, cholinergic” were also identified as significantly associated terms. Notably, “synaptic transmission, dopaminergic” and “dopaminergic neuron differentiation” were also listed but were not significant (Fig. 4C), as Th was the lone overlapping marker gene for these terms.
In line with GO terms enriched for biological processes involving differentiation, possibly in neuronal cells, overlapping PanglaoDB [56] cell type marker genes suggest that SN4741 cells at the non-permissive temperature most significantly resemble immature neurons (Fig. 4D). “Oligodendrocytes”, “retinal progenitor cells”, “satellite glial cells”, “dopaminergic neurons”, “adrenergic neurons”, “GABAergic neurons”, and “glutamatergic neurons” were also listed as cell types with significant marker gene overlap.
The distribution of various cell type marker genes on a volcano plot, indicating the log2FC in expression and -log10 adjusted p-values of DE genes, reveals that the specific genes overlapping “pluripotent stem cell” markers (26/112) cluster as the most highly significantly downregulated genes (Fig. 4E). In contrast, only two of the upregulated marker genes overlapping “immature neurons” (16/136, Fig. 4F) and “oligodendrocytes” (17/178, Fig. 4G) cluster in a similarly strong way. Plotting the 11/119 overlapping upregulated genes for “dopaminergic neurons” (Fig. 4H) reveals that 7/11 overlapping genes (Celf5, Dpysl5, Cacna1b, Tmem179, Nova2, Nrxn1, and Cntn2) are also marker genes for immature neurons. Plotting the DA neuron markers also assayed by RT-qPCR validates that the relative expression of these markers is consistent between these highly sensitive assays. At 39 °C, Th expression significantly increases (log2FC = 1.552651, p = 3.779e-05); Nr4a2 (log2FC = -1.042794, p = 1.555e-30) and Foxa2 (log2FC = -0.4142541, p = 1.303e-17) expression actually decreases, but neither gene meets the thresholds set for differential expression because the fold-change is too small; and Slc6a3 was filtered out due to low read counts across both temperature conditions.
To confirm the GO-indicated cell types, normalized read counts for select marker genes were plotted for each temperature replicate: Celf5 (p < 2.225e-308) [43], Nrxn1 (p = 0.0002) [57], Ntrk1 (p = 4.907e-09) [58], and Unc13a (p = 1.719e-103) [59], for “immature neurons” were upregulated at 39 °C relative to cells at 37 °C (Fig. 5A); Olig3 (p = 1.101e-13) [60], Il33 (p = 6.241e-117) [61], Hdac11 (p = 3.778e-152) [62], and Ptgds (p = 8.108e-66) [63] for “oligodendrocytes” were upregulated at 39 °C relative to cells at 37 °C (Fig. 5B); and Ccna2 (p < 2.225e-308) [64], Cdc6 (p < 2.225e-308) [65], Cenpf (p < 2.225e-308) [66], and Gins1 (p < 2.225e-308) [67] for “pluripotent stem cells” were downregulated at 39 °C relative to cells at 37 °C (Fig. 5C).
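The reason Th is called as differentially expressed while Nr4a2 and Foxa2 are not can be made concrete by applying the stated thresholds (adjusted p-value < 0.01 and log2FC above 1.5 or below -1.5) to the values quoted above. A minimal Python sketch follows; the function `call_de` is hypothetical, and treating the quoted p-values as adjusted p-values is an assumption.

```python
# Thresholds stated for the bulk RNA-seq analysis.
LOG2FC_CUTOFF = 1.5
PADJ_CUTOFF = 0.01


def call_de(log2fc: float, padj: float) -> str:
    """Classify one gene against the significance and fold-change cutoffs."""
    if padj >= PADJ_CUTOFF:
        return "not_significant"
    if log2fc > LOG2FC_CUTOFF:
        return "up"
    if log2fc < -LOG2FC_CUTOFF:
        return "down"
    return "significant_but_below_fold_change_cutoff"


# (log2FC, p) pairs as quoted in the text; p is assumed to be adjusted.
markers = {
    "Th": (1.552651, 3.779e-05),
    "Nr4a2": (-1.042794, 1.555e-30),
    "Foxa2": (-0.4142541, 1.303e-17),
}
calls = {gene: call_de(lfc, p) for gene, (lfc, p) in markers.items()}
```

With these inputs only Th clears both cutoffs. Nr4a2 and Foxa2 are highly significant but change by less than 1.5 log2 units, which is why they are excluded despite their small p-values.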
The differentially expressed gene sets were then analyzed using STRING (https://string-db.org/) [68]. The set of upregulated genes was enriched for protein-protein interactions (number of edges = 705; expected number of edges = 438; PPI enrichment p-value < 1.0e-16) and GO terms such as “Neuron differentiation” (GO:0030182; FDR = 0.0016), “Neuron development” (GO:0048666; FDR = 0.0070), and “Neurogenesis” (GO:0022008; FDR = 0.0085), supporting that this upregulated gene set is a meaningful group likely belonging to a network involved in neuronal maturation. The set of downregulated genes was also enriched for protein-protein interactions (number of edges = 9251; expected number of edges = 1910; PPI enrichment p-value < 1.0e-16) and GO terms such as “Cell cycle” (GO:0007049; FDR = 5.90e-36), “Mitotic cell cycle” (GO:0000278; FDR = 6.97e-34), and “Cell division” (GO:0051301; FDR = 6.35e-25), further confirming that these cells are no longer undergoing cell division.
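One simple way to compare the two interaction networks is the ratio of observed to expected edges, computed from the counts quoted above. The Python sketch below is a descriptive summary only: it does not reproduce how STRING derives its PPI enrichment p-value, and the function name is hypothetical.

```python
def edge_ratio(observed: int, expected: int) -> float:
    """Ratio of observed to expected protein-protein interaction edges."""
    if expected <= 0:
        raise ValueError("expected edge count must be positive")
    return observed / expected


# Edge counts quoted in the text for each gene set.
upregulated_ratio = edge_ratio(705, 438)
downregulated_ratio = edge_ratio(9251, 1910)
```

By this measure the downregulated set is far more densely connected than the upregulated set (roughly 4.8-fold versus 1.6-fold over expectation), in keeping with its much stronger enrichment for cell cycle terms.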
Finally, previously generated reads per kilobase of exon per million reads mapped (RPKM) from ex vivo E15.5 mouse embryonic DA neuron bulk RNA-seq [50] were used to compare how closely the SN4741 transcriptome resembles the neuronal populations they are expected to model. Similar to our results comparing chromatin accessibility between these two datasets, correlation of RPKM shows a clear separation between the SN4741 cell culture model and the ex vivo DA neurons (average r2: MB vs 37 = 0.653; MB vs 39 = 0.659; FB vs 37 = 0.636; FB vs 39 = 0.643) (Fig. 5D). Collectively, these results confirm that at the non-permissive temperature, SN4741 cells are no longer rapidly dividing neural stem cells. However, while the transcriptional profile of these cells indicates that they are differentiating towards cell types present in the brain, these cells do not fully possess characteristics of the MB DA neurons they are meant to model.
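The r2 values above summarize how closely two expression profiles track each other across genes. A minimal Python sketch of a squared Pearson correlation follows. The text does not specify how the correlation was computed, so the log2(RPKM + 1) transform and both function names are assumptions.

```python
import math
from typing import Sequence


def log_rpkm(values: Sequence[float]) -> list[float]:
    """Assumed transform: log2(RPKM + 1), which damps highly expressed genes."""
    return [math.log2(v + 1.0) for v in values]


def pearson_r2(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Mean of the four comparisons reported in the text.
reported_r2 = [0.653, 0.659, 0.636, 0.643]
mean_reported_r2 = sum(reported_r2) / len(reported_r2)
```

The four reported values average about 0.65 and differ little between temperatures or brain regions, consistent with the statement that neither SN4741 condition resembles the MB or FB populations more closely than the other.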
Discussion
It is critically important that studies of human disease generate biologically accurate data, whether aimed at elucidating molecular mechanisms, onset and progression, or management and therapeutics. In the context of discovery biology or the illumination of human health and disease mechanisms, misattribution of cellular identity, or other deviations from biological accuracy, may result in the misinterpretation of biological findings or misdirected research efforts. When studying human disease, cellular surrogates are often used to overcome the ethical and technical limitations of employing animal models. Therefore, it is imperative that disease-relevant insights are predicated on robust data generated from model systems representing human biology as accurately as possible.
Here, we demonstrate the importance of assessing in vitro models of disease to determine the extent to which they can yield biologically accurate data that can be used to inform aspects of human disease. The SN4741 cell line has been used to study neurotoxicity and therapeutic interventions [21–27], PD-associated genetic mutation [28, 29] and cell signaling and transcriptional regulation [30–32, 37], since it was initially characterized as an immortal, mouse MB-derived cell line that differentiates into DA neurons at a non-permissive temperature [20]. However, contemporary genomic analyses have not been leveraged to characterize and evaluate the SN4741 cell line as a suitable proxy for DA neurons in PD, until now.
We employed karyotyping, RT-qPCR, and scRNA-seq to assess the genomic stability of these cells and determine how consistently they differentiate into DA neurons at the non-permissive temperature. We generated bulk RNA-seq and ATAC-seq data from this cell line at both the permissive and non-permissive temperatures to extensively characterize it, to document how transcriptional landscapes and chromatin accessibility profiles shift in response to temperature-induced differentiation, and to compare these profiles with known profiles of ex vivo DA neurons. Our results suggest that SN4741 is an unstable, polyploid cell line that is unlikely to be a viable differentiation model of DA neurons, and thus is likely not a robust proxy by which to study MB DA neurons in the context of human phenotypes, including PD, schizophrenia, addiction, memory, or movement disorders.
The results of karyotyping alone indicate that any data generated using SN4741 cells may be biologically inaccurate due to extreme variability in chromosome complement and therefore, copy number variation, between individual cells. Consequently, the results of previous studies evaluating neurotoxicity [21, 23, 24, 26, 27, 30, 32, 38], cellular signaling pathways [22, 25, 28, 69, 70], and transcriptional profiling [37] in these cells may have been unduly influenced by the extreme imbalance in gene dosage that we found to vary from cell to cell. For example, alpha-synuclein (SNCA) has been consistently implicated in PD risk [71–74], particularly due to variants that promote α-synuclein misfolding [75] and overexpression [50] or events that result in gene amplification [76, 77]. Snca is present on mouse chromosome 6 and the karyotypes generated for SN4741 cells show that chromosome 6 is triploid in most assayed cells (Fig. 1B). Therefore, using the SN4741 cell line to model neurodegeneration in PD may result in inaccurate data due to an exaggerated vulnerability towards degeneration imposed by elevated Snca copy number, by gene dosage effects of other interacting gene products in relevant pathways, or by the structural instability of this line.
Even if this cell line could be adopted to study Snca overexpression/amplification, ATAC-seq profiling of open chromatin regions in this cell line at the permissive and non-permissive temperatures indicates that these cells do not possess chromatin accessibility profiles similar to those of ex vivo, mouse E15.5 MB neurons. In PD, disease is characterized by the degeneration of MB DA neurons, while DA neurons of the FB are spared. Therefore, the chromatin profiles of MB DA neurons, as well as the differentially accessible regions of the genome between MB and FB neurons, may influence the preferential vulnerability of MB neurons in PD [50]. In the context of exploiting these chromatin profiles to study PD-associated variability and neurotoxicity, SN4741 cells are likely a poor model, as the open chromatin regions of these cells are not a reliable proxy for mouse E15.5 MB or FB DA neurons.
The chromatin accessibility profiles of SN4741 cells not only fail to cluster with ex vivo populations of mouse MB neurons, but the transcriptional landscapes of these cells suggest that these cells have shifted towards a more differentiated state that may be less DA than previously thought. Examination of cell cycle markers by scRNA-seq demonstrates that SN4741 cells at the non-permissive temperature are more differentiated than cells at the permissive temperature, as expected [20]. GO terms for genes that are significantly downregulated at the non-permissive temperature reinforces that these cells are no longer rapidly dividing, pluripotent stem cells. However, RT-qPCR, scRNA-seq, and bulk RNA-seq in these cells fail to detect significant upregulation of most key DA neuron markers in the differentiated cells, except for Th in the bulk RNA-seq data. Th is not exclusively expressed by DA neurons at embryonic timepoints [40–42]. In fact, significantly upregulated genes in SN4741 cells at the non-permissive temperature that overlap with GO terms and cell cycle marker genes suggests that Th is the only significantly upregulated gene overlapping with biological processes involving DA neurons. Rather, additional overlapping cell type marker genes suggest that these cells more closely resemble immature neurons.
In parallel, we generated promoter capture (pc)Hi-C data at the non-permissive temperature with the intention of exploring how non-coding disease-relevant variants interact with promoters and potentially regulate gene expression in MD DA neurons. As our group is focused on PD-associated variation, which is unlikely to act broadly in immature neurons, our group has not analyzed the resulting data, beyond basic quality control (Supplemental Fig. 4). While the SN4741 cells at the non-permissive temperature fail to recapitulate the transcriptomic or chromatin state of DA neurons, it is of potential interest for follow-up studies that they do resemble some immature neuron types. Although not analyzed by our group, we generated output files for interaction detection, and this data has been made publicly available for others to explore (accessible through: https://github.com/rachelboyd/SN4741_pcHiC), as it may be useful to study genomic interactions at promoters driving an immature neuronal state. However, the cell type best represented by SN4741 cells at the non-permissive temperature still requires deeper characterization.
Any data generated using SN4741 cells in the context of DA neuron modeling and/or PD must be interpreted with caution and in light of the appropriate caveats. Due to the instability and polyploidy of this cell line, we recommend that the use of SN4741 cells for PD-related research be re-evaluated. Future studies designed to fine-tune the classification of these cells may support the use of SN4741 cells as a model of other neuronal or non-neuronal cells. Additionally, the differentiation trajectory of these cells may be amenable to intervention(s) that could drive their molecular state towards one that resembles DA neurons more closely.
Conclusions
This study establishes a valuable precedent with broad implications across biological and disease-related research. Prior to using SN4741 cells to study non-coding regulatory variation in PD, we characterized this cell line to determine its suitability as a model of DA neurons in PD, and found that these cells are unstable, polyploid cells that do not demonstrate strong molecular characteristics of MB DA neurons. These cells express low levels of DA neuron markers, and chromatin landscapes in differentiated SN4741 cells scarcely overlap open chromatin regions in ex vivo mouse E15.5 midbrain neurons. We demonstrate the importance of genomic characterization of in vitro model systems prior to generating data and valuable resources that may be used to inform aspects of human disease. In future studies that utilize in vitro models of any human disease, due diligence to confirm their suitability as surrogates could save time, resources, and possibly lives, by avoiding misdirection and advancing successful therapeutic development.
Methods
No animals were directly used in this study. All assays were carried out using animal-derived cell lines or publicly available animal-derived cell data.
Cell culture
SN4741 cells were obtained from the Ernest Arenas group at the Karolinska Institutet. SN4741 cells were confirmed to be mycoplasma free using a MycoAlert® Mycoplasma Detection Assay (Lonza) and were maintained in high glucose Dulbecco’s Modified Eagle Medium (DMEM; Gibco 1196502), supplemented with 1% penicillin–streptomycin and 10% fetal bovine serum (FBS) in a humidified 5% CO2 incubator at 37 °C. Cells at 80% confluence were passaged by trypsinization approximately every 2–3 days. To induce differentiation, 24 h after the cells were passaged, media was replaced by DMEM supplemented with 1% penicillin–streptomycin and 0.5% FBS at 39 °C. Cells were allowed to grow and differentiate in these conditions for 48 h before harvesting for experimentation.
G-band karyotyping
At passage 21, undifferentiated SN4741 cells were sent to the WiCell Research Institute (Madison, Wisconsin), at 40–60% confluency, for chromosomal G-band analyses. Karyotyping was conducted on 20 metaphase spreads, at a band resolution of > 230, according to the International System for Human Cytogenetic Nomenclature.
cDNA synthesis and RT-qPCR for DA neuron markers
RNA was extracted from both differentiated and undifferentiated cells by following the RNeasy Mini Kit (QIAGEN) protocol, as written. Extracted RNA was quantified using a Nanodrop. 1 μg of each RNA sample underwent first-strand cDNA synthesis using the SuperScript III First-Strand Synthesis System for RT-PCR (Invitrogen) according to the oligo(dT) method. qPCR was performed with Power SYBR Green Master Mix (Applied Biosystems), using primers for β-actin (Actb), Foxa2, Nr4a2, Slc6a3, and Th (Table S1). Reactions were run in triplicate under default SYBR Green Standard cycle specifications on the Viia7 Real-Time PCR System (Applied Biosystems). Normalized relative quantification and error propagation followed the data analysis and associated calculations proposed by Taylor et al. (2019) [78], with results normalized to Actb. A two-sided t-test was performed using the R function “t.test” (alternative = “two.sided”). The remaining RNA and cDNA were subsequently stored at -80 °C.
Single cell RNA-seq library preparation, sequencing, and alignment
Both differentiated (39 °C) and undifferentiated (37 °C) cells were trypsinized, and scRNA-seq libraries were generated following the Chromium 10X pipeline [79]. Four replicates were assayed at each temperature, for a total of > 17,000 cells. Cell capture, cDNA generation, and library preparation were performed with the standard protocol for the Chromium Single Cell 3’ V3 reagent kit. Libraries were quantified with the Qubit dsDNA High Sensitivity Assay (Invitrogen) in combination with the High Sensitivity DNA Assay (Agilent) on the Agilent 2100 Bioanalyzer. Single-cell RNA-sequencing libraries were pooled and sequenced on an Illumina NovaSeq 6000 (SP flow cells), using 2 × 50 bp reads per library, to a combined depth of 1.6 billion reads. The quality of sequencing was evaluated via FastQC. Paired-end reads were aligned to the mouse reference genome (mm10) using the CellRanger v3.0.1 pipeline. Unique molecular identifier (UMI) counts were quantified per gene per cell (“cellranger count”) and aggregated (“cellranger aggr”) across samples with no normalization.
Single cell RNA-seq analysis
Using Seurat [80] (v4.2.0), cells were filtered to remove stressed/dying cells (% of reads mapping to the mitochondria > 15%) and empty droplets or doublets (number of unique genes detected < 200 or > 6,000). Cells were scored for their stage in the cell cycle using “CellCycleScoring()” on cell cycle genes provided by Seurat (“cc.genes”). Counts were then normalized using “SCTransform” (vst.flavor = “v2”) and corrected for percent mitochondrial reads and sequencing depth. Principal component (PC) analysis was performed and a PC cut-off was identified using “ElbowPlot()”. Using this PC cut-off and a minimum distance of 0.001, UMAP was used for dimensionality reduction. Expression was plotted on a log scale with “VlnPlot()” for a variety of proliferation and DA neuron markers. Differentially expressed genes were identified using “FindMarkers” (min.diff.pct = 0.2).
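The cell-level filter described above can be sketched in a few lines of Python. This is only an illustration of the thresholds stated in the text (mitochondrial reads > 15%, unique genes < 200 or > 6,000), not the Seurat code used in the study; the function name `passes_qc` and the per-cell values are hypothetical.

```python
# Thresholds taken from the text; everything else here is illustrative.
MAX_PCT_MITO = 15.0
MIN_GENES = 200
MAX_GENES = 6000


def passes_qc(n_genes, pct_mito):
    """Return True if a cell survives the mitochondrial and gene-count filters."""
    if pct_mito > MAX_PCT_MITO:
        return False  # treated as a stressed or dying cell
    return MIN_GENES <= n_genes <= MAX_GENES


# Hypothetical per-cell metrics: (unique genes detected, % mitochondrial reads)
cells = {
    "cell_1": (150, 2.0),    # too few genes: likely an empty droplet
    "cell_2": (3500, 4.5),   # passes both filters
    "cell_3": (7200, 3.0),   # too many genes: likely a doublet
    "cell_4": (2800, 22.0),  # high mitochondrial fraction
    "cell_5": (200, 15.0),   # exactly on both boundaries: kept
}

kept = [name for name, (n_genes, pct_mito) in cells.items()
        if passes_qc(n_genes, pct_mito)]
print(kept)
```

Cells that sit exactly on a threshold are retained here because the text describes removal with strict inequalities.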
ATAC-seq library preparation and quantification
ATAC-seq libraries were generated for four replicates of undifferentiated (37 °C) and differentiated (39 °C) SN4741 cells, according to the Omni-ATAC protocol [81], with minor modifications. Aliquots of 50,000 cells were centrifuged at 2000 × g for 20 min at 4 °C, and the resulting pellets were resuspended in 50μL of resuspension buffer. Cells were left to lyse for 3 min on ice before being centrifuged again at 2000 × g for 20 min at 4 °C. The resulting nuclei pellets were then tagmented, as written, using 50μL of transposition mixture and then incubated at 37 °C for 30 min in a 1000 RPM thermomixer. After transposition, DNA was purified with the Zymo DNA Clean and Concentrator -5 Kit and eluted in 21μL of elution buffer.
Pre-amplification of the transposed fragments was performed according to the conditions outlined in the Omni-ATAC protocol [81]; however, 12 pre-amplification cycles were run in lieu of qPCR amplification to determine additional cycles. The amplified libraries were prepared according to the Nextera DNA Library Prep Protocol Guide, except that libraries were purified with 40.5μL AMPure XP beads (Beckman Coulter), and 27.5μL of resuspension buffer was added to each sample. All libraries were quantified with the Qubit dsDNA High Sensitivity Assay (Invitrogen) in combination with the High Sensitivity DNA Assay (Agilent) on the Agilent 2100 Bioanalyzer.
ATAC-seq sequencing, alignment, and peak calling
Libraries were sequenced on an Illumina NovaSeq 6000 (SP flow cells), using 2 × 50 bp reads per library, to a total combined depth of 1.6 billion reads. The quality of sequencing was evaluated with FastQC (v0.11.9) [82] and summarized with MultiQC (v1.13) [83]. Reads were aligned to the mouse reference genome (mm10) in local mode with Bowtie2 [84] (v2.4.1), using -X 1000 to increase the maximum pair distance to 1000 bp. Samtools (v1.15.1) [85] and Picard (v2.26.11; http://broadinstitute.github.io/picard/) were used to sort, deduplicate and index reads. Peaks were called with MACS3 (v3.0.0a7; https://github.com/macs3-project/MACS) [86], specifying --nomodel and --nolambda for the ‘callpeak’ command. Peaks overlapping mm10 blacklisted/block listed regions called by ENCODE [87, 88] and in the original ATAC-seq paper [36] were removed with BEDTools (v2.30.0) [89].
For visualization with IGV, IGVTools (v2.15.2) was used to convert read pileups to TDFs. The fraction of reads in peaks was calculated with DeepTools (v3.5.1) [90] using the plotEnrichment command. The average mapping distance flag was extracted from the SAM files with a custom script available at our GitHub repo (https://github.com/sarahmcclymont/SN4741_ATAC/) to generate the fragment length plot. Mouse (mm10) transcriptional start site (TSS) coordinates were downloaded from the UCSC Genome Browser [91] (Mouse genome; mm10 assembly; Genes and Gene Predictions; RefSeq Genes track using the table refGene), and DeepTools (v3.5.1) [90] was used to plot the pileup of reads overtop of these TSSs. Conservation under peaks (phastCons) [92] and the genomic distribution of peaks were calculated using the Cis-regulatory Element Annotation System (CEAS) [93] and conservation tool of the Cistrome [94] pipeline. Analysis can be found at http://cistrome.org/ap/u/smcclymont/h/sn4741-atac-seq-ceas-and-conservation.
ATAC-seq normalization and differential peak analysis
Each sample’s peak file and BAM file were read into and analyzed with DiffBind (v3.8.1) [48]. Peaks present in two or more libraries were considered in the consensus peakset. Reads overlapping these consensus peaks were counted with ‘dba.count()’, specifying summits = 100 and bRemoveDuplicates = TRUE. These read counts were normalized with ‘dba.normalize()’ on the full library size using the RLE normalization method, as it is native to the DESeq2 analysis we employed in the following ‘dba.analyze()’ step. The volcano plot was generated with a custom script from the output of the ‘dba.report()’ command, where th = 1, fold = 0, and bCounts = T, to output all peaks regardless of their fold change or significance. Significantly differentially accessible regions (filtered for abs(Fold) > 1 & FDR < 0.05) were submitted to GREAT (v4.0.4) [95], and the gene ontology of the nearest gene, as identified with the basal + extension method (where proximal was considered to be ± 5 kb), was assessed and plotted.
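To make the "present in two or more libraries" rule concrete, the following Python sketch merges overlapping peaks across libraries and keeps only merged intervals supported by at least two of them. It is a simplified stand-in under stated assumptions, not DiffBind's implementation: the function name `consensus_peaks`, the half-open BED-style coordinates, and the toy peak lists are all hypothetical, and the summit re-centering applied by ‘dba.count()’ is not modeled.

```python
from collections import defaultdict


def consensus_peaks(libraries, min_overlap=2):
    """Merge overlapping peaks and keep intervals seen in >= min_overlap libraries.

    `libraries` is a list of peak lists; each peak is (chrom, start, end),
    with half-open coordinates as in BED files.
    """
    tagged = defaultdict(list)
    for lib_idx, peaks in enumerate(libraries):
        for chrom, start, end in peaks:
            tagged[chrom].append((start, end, lib_idx))

    consensus = []
    for chrom in sorted(tagged):
        intervals = sorted(tagged[chrom])
        cur_start, cur_end, libs = intervals[0][0], intervals[0][1], {intervals[0][2]}
        for start, end, lib_idx in intervals[1:]:
            if start < cur_end:
                # Overlaps the running interval: extend it and record the library.
                cur_end = max(cur_end, end)
                libs.add(lib_idx)
            else:
                if len(libs) >= min_overlap:
                    consensus.append((chrom, cur_start, cur_end))
                cur_start, cur_end, libs = start, end, {lib_idx}
        if len(libs) >= min_overlap:
            consensus.append((chrom, cur_start, cur_end))
    return consensus


# Hypothetical peak calls from four libraries
lib1 = [("chr1", 100, 200), ("chr1", 500, 600)]
lib2 = [("chr1", 150, 250), ("chr2", 10, 50)]
lib3 = [("chr1", 580, 700)]
lib4 = [("chr2", 300, 400)]

peaks = consensus_peaks([lib1, lib2, lib3, lib4])
print(peaks)
```

In this toy input, both chr1 intervals are supported by two libraries and are retained, while the two chr2 peaks each appear in a single library and are dropped.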
ATAC-seq comparison to ex vivo MB and FB DA neurons
Previously generated ATAC-seq libraries from ex vivo E15.5 mouse embryonic DA neurons [50] were re-analyzed in parallel, following all the above alignment and filtering steps. DiffBind (v3.8.1) was used to compare the samples, as above, and the R package UpSetR (v1.4.0) [96] was used to plot the overlap of peaks between conditions.
Bulk RNA-seq library preparation, sequencing, and alignment
Cells were run through QIAshredder (Qiagen) and total RNA was extracted using the RNeasy Mini Kit (Qiagen) according to the manufacturer’s recommendations, except that RNA was eluted twice in 50μL of water. Total RNA integrity was determined with the RNA Pico Kit (Agilent) on the Agilent 2100 Bioanalyzer. RNA samples were sent to the Johns Hopkins University Genetic Resources Core Facility (GRCF) for library prep (NEBNext Ultra II directional library prep kit with poly-A selection) and sequencing. The libraries were pooled and sequenced on an Illumina NovaSeq 6000 (SP flow cells), using 2 × 50 bp reads per library, to a combined depth of 1.6 billion reads. The quality of sequencing was evaluated via FastQC. FASTQ files were aligned to the mouse reference genome (mm10) with HISAT2 [97] (v2.0.5) and sample reads from different lanes were merged using samtools [85] (v.1.10) function “merge”. Aligned reads from individual samples were quantified against the mm10 reference transcriptome with the subread [98–100] (v1.6.1) function “featureCounts” [101], using -t exon and -g gene_id (Supplemental Fig. 3A).
Bulk RNA-seq analysis
The DESeq2 (v3.15) package was used for data quality assessment and analyses. A DESeqDataSet of count data was generated using “DESeqDataSetFromMatrix” (design = ~ temp). The data underwent variance stabilizing transformation (vst) prior to using “plotPCA” to visualize experimental covariates/batch effects (Supplemental Fig. 3B) and R package “pheatmap” (v1.0.12; https://CRAN.R-project.org/package=pheatmap) to visualize the sample-to-sample distances (Supplemental Fig. 3C).
Genes with an average of at least 1 read for each sample were analyzed to identify differentially expressed (DE) genes between temperature conditions, using the function “DESeq”. P-value distribution after differential expression (DE) analysis (Supplemental Fig. 3D) verified that the majority of called DE genes are significant. Results (alpha = 0.01) were generated and subjected to log fold change shrinkage using the function “lfcShrink” (type = “apeglm”) [102] for subsequent visualization and ranking. The function “plotMA” was used to generate MA plots, both before and after LFC shrinkage, to visualize the log2 fold changes attributable to the non-permissive temperature shift over the mean of normalized counts for all the samples in the DESeqDataSet (Supplemental Fig. 3E-F). MA plots demonstrated that log fold change shrinkage of the data successfully diminished the effect size of lowly expressed transcripts with relatively high levels of variability.
Volcano plots were generated using a custom function to visualize log2 fold changes of specific genes in the dataset. A gene was considered significantly differentially expressed if it demonstrated an adjusted p-value < 0.01 and |log2 FC| > 1.5. These significantly differentially expressed genes were submitted to Enrichr [53–55] for analyses within the “ontologies” and “cell types” categories. The upregulated and downregulated gene sets were passed to STRING [68] for analysis of protein-protein interactions and network relationships.
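The significance rule above (adjusted p-value < 0.01 and |log2 FC| > 1.5) and the split into upregulated and downregulated sets can be expressed as a short Python sketch. The function name `classify_gene`, the gene labels, and the numeric values are hypothetical and are not results from this study; treating a missing adjusted p-value as non-significant is an assumption of this sketch.

```python
def classify_gene(log2_fc, padj, lfc_cutoff=1.5, alpha=0.01):
    """Label a gene 'up', 'down', or 'ns' (not significant)."""
    if padj is None:
        return "ns"  # assumption: genes without an adjusted p-value are not called
    if padj >= alpha or abs(log2_fc) <= lfc_cutoff:
        return "ns"
    return "up" if log2_fc > 0 else "down"


# Hypothetical (log2 fold change, adjusted p-value) pairs
results = {
    "geneA": (2.3, 1e-8),    # strong, significant increase
    "geneB": (-3.1, 5e-4),   # strong, significant decrease
    "geneC": (1.2, 1e-10),   # significant but below the fold-change cutoff
    "geneD": (4.0, 0.2),     # large change but not significant
    "geneE": (-1.8, None),   # no adjusted p-value available
}

upregulated = [g for g, (lfc, padj) in results.items() if classify_gene(lfc, padj) == "up"]
downregulated = [g for g, (lfc, padj) in results.items() if classify_gene(lfc, padj) == "down"]
print(upregulated, downregulated)
```

Keeping the two directions as separate lists mirrors how the upregulated and downregulated gene sets are submitted separately for downstream enrichment and network analysis.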
Bulk RNA-seq comparison to ex vivo MB and FB DA neurons
Read counts from the SN4741 bulk RNA-seq dataset were converted to RPKM and compared to bulk RNA-seq data from previously generated ex vivo mouse embryonic DA neurons (NCBI GEO: GSE122450; [50]). A Pearson correlation heatmap was generated using ggplot2 [103].
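As a rough illustration of the two calculations named above, the sketch below converts read counts to RPKM (reads per kilobase of exon per million reads mapped) and computes a Pearson correlation between two samples. It is a minimal sketch, not the authors' script: the gene lengths and counts are hypothetical, the total of mapped reads is approximated by the sum of the toy counts, and no log transformation is applied because the text does not specify one.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million reads mapped."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical exon lengths (bp) and read counts for two samples
gene_lengths = {"geneA": 2000, "geneB": 1000, "geneC": 4000}
sample_1 = {"geneA": 500, "geneB": 100, "geneC": 50}
sample_2 = {"geneA": 450, "geneB": 120, "geneC": 80}


def sample_rpkm(counts):
    """RPKM vector for one sample, with genes in a fixed (sorted) order."""
    total = sum(counts.values())  # assumption: stands in for total mapped reads
    return [rpkm(counts[g], gene_lengths[g], total) for g in sorted(gene_lengths)]


r = pearson(sample_rpkm(sample_1), sample_rpkm(sample_2))
print(round(r, 3))
```

Computing this coefficient for every pair of samples yields the matrix that a correlation heatmap displays.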
Promoter capture HiC library generation
PcHiC was performed as previously described [104], with minor modifications. Briefly, SN4741 cells were cultured at the non-permissive temperature and plated at five million cells per 10 cm dish. The cells were crosslinked using 1% formaldehyde, snap frozen using liquid nitrogen, and stored at -80 °C. The cells were Dounce homogenized, and restriction enzyme digestion was performed using 400 units of HindIII-HF overnight at 37 °C. The total volume was maintained at 500 µL through addition of 1X NEBuffer 2.1. Heat inactivation was performed at 80 °C for 20 min, and biotinylated-dCTP was used for the biotin fill-in reaction. Blunt-end ligation was performed using Thermo T4 DNA ligase, with cohesive end units maintained at 15,000 and buffer and water volumes adjusted to ensure that a total volume of 665 µL was added to each Hi-C tube. Crosslink reversal was performed overnight, with additional (50 μL) proteinase K added for 2 h the following day. DNA purification was split across two reactions using 2 mL PhaseLock tubes, and volumes were adjusted accordingly. Each PhaseLock reaction was split again into two vials for ethanol purification, and centrifugation at step 6.3.8 was performed at room temperature. The pellets were dissolved in 450 µL 1X TLE and transferred to a 0.5 mL 30 kD Amicon Column. After washing, the column was inverted into a new container, and no additional liquid was added to raise the volume to 100 µL. All four reactions were combined, the total volume was determined, and RNase A (1 mg/mL) equal to 1% of the total volume was added for 30 min at 37 °C.
The libraries were assessed for successful blunt-end ligation by a ClaI restriction enzyme digest of PCR products, as previously described [104]. Biotin was removed from un-ligated ends and DNA was sheared to a size of 200–300 bp using the Covaris M220 (High setting, 35 cycles of 30 s “on” and 90 s “off”; vortexing/spinning down samples and changing sonicator water every 5 cycles). Size selection was performed using AMPure XP magnetic beads, as previously described [104], except that all resuspension volumes were increased by 5 µL, so that 5 μL could be used for QC with the High Sensitivity DNA Assay (Agilent) on the Agilent 2100 Bioanalyzer (at three stages: post-sonication, post-0.8 × size-selection, and post-1.1 × size-selection). The remaining protocol was performed as described. Capture probes (Arbor Biosciences; https://github.com/nbbarrientos/SN4741_pcHiC) were designed against mouse (mm10) RefSeq transcription start sites, filtering out “XM” and “XR” annotated genes. The remaining promoters were intersected with the in silico HindIII-digested mouse genome, to retain all HindIII fragments containing a promoter. Potential probe sites were assessed within ± 330 bp of the HindIII cut site on either end of the fragment, and finalized probe sets were filtered using no repeats and “strict” criteria, as defined by Arbor Biosciences. After generating a uniquely indexed HiC library with complete Illumina adapters, probes targeting promoter-containing fragments were hybridized following the Arbor Biosciences capture protocol (v4) at 65 °C, with 1 µg DNA and one round of capture. The library was PCR amplified before sequencing on an Illumina NovaSeq 6000 (SP flow cells), using 2 × 50 bp reads per library, to a combined depth of 1.6 billion reads.
Promoter capture HiC data analysis
Raw pcHiC reads for each replicate (n = 4) were evaluated for quality via FastQC. FASTQ files were mapped to mm10 using Bowtie2 [84] (v.2.4.1) and filtered using HiCUP [105] (v. 0.8). The HiCUP pipeline was configured with the following parameters: FASTQ format (Format: Sanger), maximum di-tag length (Longest: 700), and minimum di-tag length (Shortest: 50); filtering and alignment statistics were reported (Supplemental Fig. 4A-C). BAM files were generated for each replicate using samtools [85] (v.1.10). Read coverage similarities and replicate correlation were assessed with DeepTools [90] (v.3.5.1), using the function “multiBamSummary” (in bins mode) to analyze the entire genome. A Pearson correlation heatmap was generated using the function “plotCorrelation” (Supplemental Fig. 4D). As a result of the high Pearson correlation coefficient among replicates (r > 0.93), library replicates were combined. The CHiCAGO [106] (v. 1.18.0) pipeline was used to convert the merged BAM file into CHiCAGO format. The digested mm10 reference genome was used to generate a restriction map file, a baited restriction map file, and the remaining input files (.npb, .nbpb, and .poe) required to run the CHiCAGO pipeline.
Acknowledgements
The authors would like to acknowledge Ernest Arenas (Karolinska Institutet) for providing SN4741 cells, as well as the Johns Hopkins Genetic Resources Core Facility (GRCF) and the WiCell Research Institute for providing technical services.
Abbreviations
- Actb
β-Actin
- Aldh1a1
Aldehyde Dehydrogenase 1 Family Member A1
- ATAC-seq
Assay for Transposase-Accessible Chromatin using Sequencing
- BAM
Binary Alignment and Map
- Cacna1b
Calcium channel, voltage-dependent, N type, alpha 1B subunit
- Ccna2
Cyclin A2
- Cdc6
Cell division cycle 6
- Cdh13
Cadherin 13
- CEAS
Cis-Regulatory Element Annotation System
- Celf5
CUGBP Elav-Like Family Member 5
- Cenpf
Centromere protein F
- Cntn2
Contactin 2
- CO2
Carbon Dioxide
- CRE
Cis Regulatory Element
- DA
Dopaminergic
- DMEM
Dulbecco’s Modified Eagle Medium
- Dpysl5
Dihydropyrimidinase-like 5
- E13.5/15.5
Embryonic Day 13.5/15.5
- ENCODE
Encyclopedia of DNA Elements
- FB
Forebrain
- FBS
Fetal Bovine Serum
- FDR
False Discovery Rate
- Foxa2
Forkhead Box A2
- Gins1
GINS complex subunit 1 (Psf1 homolog)
- GO
Gene Ontology
- GRCF
Genetic Resources Core Facility
- GWAS
Genome-Wide Association Study
- Hdac11
Histone deacetylase 11
- Hmga2
High Mobility Group AT-Hook 2
- Id2
Inhibitor Of DNA Binding 2
- IGV
Integrative Genomics Viewer
- Il33
Interleukin 33
- Irx3
Iroquois Homeobox 3
- LFC
Log Fold-Change
- Lmx1b
LIM Homeobox Transcription Factor 1 Beta
- MB
Midbrain
- Mki67
Marker of Proliferation Ki-67
- Nes
Nestin
- Nova2
NOVA alternative splicing regulator 2
- Nr4a2
Nuclear Receptor Subfamily 4 Group A, Member 2
- Nrx1
Neurexin 1
- Ntrk1
Neurotrophic receptor tyrosine kinase 1
- OCR
Open Chromatin Region
- Olig3
Oligodendrocyte transcription factor 3
- PC(A)
Principal Component (Analysis)
- pcHi-C
Promoter-Capture Hi-C
- PD
Parkinson Disease
- Pitx3
Paired-like homeodomain 3
- Ptgds
Prostaglandin D2 synthase
- (q)PCR
(Quantitative) Polymerase Chain Reaction
- QC
Quality Control
- RNA
Ribonucleic Acid
- RPKM
Reads per kilobase of exon per million reads mapped
- RT
Reverse Transcriptase
- Scn1b
Sodium Voltage-Gated Channel Beta Subunit 1
- scRNA-seq
Single Cell RNA sequencing
- Slc6a3
Solute Carrier Family 6 Member 3
- SN
Substantia Nigra
- SNCA/Snca
Alpha-synuclein
- SV40Tag
Simian Virus 40 T antigen
- TH/Th
Tyrosine Hydroxylase
- Tmem179
Transmembrane protein 179
- ts
Temperature-Sensitive
- TSS
Transcriptional Start Site
- Unc13a
Unc-13 homolog A
- vst
Variance Stabilizing Transformation
Authors’ contributions
New data were generated by S.A.M., P.W.H., W.D.L., E.W.L., and J.R.; data were analyzed by R.J.B., S.A.M., P.W.H., N.B.B., W.D.L., and A.S.M. The manuscript was written by R.J.B., S.A.M., N.B.B., W.D.L., and A.S.M. Figures were created by R.J.B., S.A.M., and N.B.B. All authors reviewed and approved the manuscript.
Funding
This research, undertaken at Johns Hopkins University School of Medicine, was supported in part by awards from the National Institutes of Health (NS62972 and MH106522) to A.S.M., by T32 GM007814-40 to R.J.B. and N.B.B., and by the Canadian Institutes of Health Research (DFD-181599) to R.J.B.
Availability of data and materials
All data and analysis pipelines are available at https://github.com/rachelboyd. ATAC-sequencing, RNA-sequencing, single-cell RNA-sequencing, and promoter-capture Hi-C data are available at the Gene Expression Omnibus (GEO) under the series accession number GSE225084.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Rachel J. Boyd and Sarah A. McClymont contributed equally to this work.
Contributor Information
Rachel J. Boyd, Email: rboyd25@jhmi.edu
Sarah A. McClymont, Email: sarahmcclymont@gmail.com
Nelson B. Barrientos, Email: nbarrie1@jhu.edu
Paul W. Hook, Email: phook2@jhmi.edu
William D. Law, Email: williamdlaw@gmail.com
Rebecca J. Rose, Email: rebeccarose10@gmail.com
Eric L. Waite, Email: eric.waite@pennmedicine.upenn.edu
Jay Rathinavelu, Email: jay.rathinavelu@gmail.com
Dimitrios Avramopoulos, Email: adimitr1@jhmi.edu
Andrew S. McCallion, Email: andy@jhmi.edu
References
- 1.Ormond KE, Mortlock DP, Scholes DT, Bombard Y, Brody LC, Faucett WA, et al. Human germline genome editing. Am J Hum Genet. 2017;101(2):167–176. doi: 10.1016/j.ajhg.2017.06.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Barbosa DJ, Capela JP, de Lourdes BM, Carvalho F. In vitro models for neurotoxicology research. Toxicol Res. 2015;4(4):801–842. [Google Scholar]
- 3.Hirsch C, Schildknecht S. In vitro research reproducibility: keeping up high standards. Front Pharmacol. 2019;10:1484. doi: 10.3389/fphar.2019.01484. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Fisher S, Grice EA, Vinton RM, Bessling SL, Urasaki A, Kawakami K, et al. Evaluating the biological relevance of putative enhancers using Tol2 transposon-mediated transgenesis in zebrafish. Nat Protoc. 2006;1(3):1297–1305. doi: 10.1038/nprot.2006.230. [DOI] [PubMed] [Google Scholar]
- 5.Gorkin DU, Lee D, Reed X, Fletez-Brant C, Bessling SL, Loftus SK, et al. Integration of ChIP-seq and machine learning reveals enhancers and a predictive regulatory sequence vocabulary in melanocytes. Genome Res. 2012;22(11):2290–2301. doi: 10.1101/gr.139360.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Shlyueva D, Stampfel G, Stark A. Transcriptional enhancers: from properties to genome-wide predictions. Nat Rev Genet. 2014;15(4):272–286. doi: 10.1038/nrg3682. [DOI] [PubMed] [Google Scholar]
- 7.Gasperini M, Findlay GM, McKenna A, Milbank JH, Lee C, Zhang MD, et al. CRISPR/Cas9-mediated scanning for regulatory elements required for HPRT1 expression via thousands of large, programmed genomic deletions. Am J Hum Genet. 2017;101(2):192–205. doi: 10.1016/j.ajhg.2017.06.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Maurano MT, Humbert R, Rynes E, Thurman RE, Haugen E, Wang H, et al. Systematic localization of common disease-associated variation in regulatory DNA. Science (1979) 2012;337(6099):1190–5. doi: 10.1126/science.1222794. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Kheradpour P, Ernst J, Melnikov A, Rogov P, Wang L, Zhang X, et al. Systematic dissection of regulatory motifs in 2000 predicted human enhancers using a massively parallel reporter assay. Genome Res. 2013;23(5):800–811. doi: 10.1101/gr.144899.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Shim S, Kwan KY, Li M, Lefebvre V, Šestan N. Cis-regulatory control of corticospinal system development and evolution. Nature. 2012;486:74–9. doi: 10.1038/nature11094. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Schoenfelder S, Javierre BM, Furlan-Magaril M, Wingett SW, Fraser P. Promoter capture Hi-C: high-resolution, genome-wide profiling of promoter interactions. J Vis Exp. 2018;2018(136):e57320. doi: 10.3791/57320. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Schaub MA, Boyle AP, Kundaje A, Batzoglou S, Snyder M. Linking disease associations with regulatory information in the human genome. Genome Res. 2012;22(9):1748–1759. doi: 10.1101/gr.136127.111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Ernst J, Kheradpour P, Mikkelsen TS, Shoresh N, Ward LD, Epstein CB, et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011;473(7345):43–49. doi: 10.1038/nature09906. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Boyle EA, Li YI, Pritchard JK. An expanded view of complex traits: From polygenic to omnigenic. Cell. 2017;169(7):1177–1186. doi: 10.1016/j.cell.2017.05.038. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Lee D, Gorkin DU, Baker M, Strober BJ, Asoni AL, McCallion AS, et al. A method to predict the impact of regulatory variants from DNA sequence. Nat Genet. 2015;47(8):955–961. doi: 10.1038/ng.3331. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Fearnley JM, Lees AJ. Ageing and Parkinson’s disease: Substantia nigra regional selectivity. Brain. 1991;114:2283–2301. doi: 10.1093/brain/114.5.2283. [DOI] [PubMed] [Google Scholar]
- 17.Marras C, Beck JC, Bower JH, Roberts E, Ritz B, Ross GW, et al. Prevalence of Parkinson’s disease across North America. NPJ Parkinsons Dis. 2018;4(1):21. doi: 10.1038/s41531-018-0058-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Dorsey ER, Bloem BR. The Parkinson pandemic - a call to action. JAMA Neurol. 2018;75(1):9–10. doi: 10.1001/jamaneurol.2017.3299. [DOI] [PubMed] [Google Scholar]
- 19.Ferrari E, Cardinale A, Picconi B, Gardoni F. From cell lines to pluripotent stem cells for modelling Parkinson’s disease. J Neurosci Methods. 2020;340:108741. doi: 10.1016/j.jneumeth.2020.108741. [DOI] [PubMed] [Google Scholar]
- 20.Son JH, Chun HS, Joh TH, Cho S, Conti B, Lee JW. Neuroprotection and neuronal differentiation studies using substantia nigra dopaminergic cells derived from transgenic mouse embryos. J Neurosci. 1999;19(1):10. doi: 10.1523/JNEUROSCI.19-01-00010.1999. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Chang J, Le ZX, Yu H, Chen J. Downregulation of RTN1-C attenuates MPP+-induced neuronal injury through inhibition of mGluR5 pathway in SN4741 cells. Brain Res Bull. 2019;146:1–6. doi: 10.1016/j.brainresbull.2018.11.026. [DOI] [PubMed] [Google Scholar]
- 22.Chen J, Li M, Zhou X, Xie A, Cai Z, Fu C, et al. Rotenone-induced neurodegeneration is enabled by a p38-Parkin-ROS signaling feedback loop. J Agric Food Chem. 2021;69(46):13942–13952. doi: 10.1021/acs.jafc.1c04190. [DOI] [PubMed] [Google Scholar]
- 23.Guiney SJ, Adlard PA, Lei P, Mawal CH, Bush AI, Finkelstein DI, et al. Fibrillar α-synuclein toxicity depends on functional lysosomes. J Biol Chem. 2020;295(51):17497–17513. doi: 10.1074/jbc.RA120.013428. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Chun HS, Gibson GE, Degiorgio LA, Zhang H, Kidd VJ, Son JH. Dopaminergic cell death induced by MPP+, oxidant and specific neurotoxicants shares the common molecular mechanism. J Neurochem. 2001;76(4):1010–1021. doi: 10.1046/j.1471-4159.2001.00096.x. [DOI] [PubMed] [Google Scholar]
- 25.Chun HS, Lee H, Son JH. Manganese induces endoplasmic reticulum (ER) stress and activates multiple caspases in nigral dopaminergic neuronal cells, SN4741. Neurosci Lett. 2001;316(1):5–8. doi: 10.1016/s0304-3940(01)02341-2. [DOI] [PubMed] [Google Scholar]
- 26.Zeng W, Zhang W, Lu F, Gao L, Gao G. Resveratrol attenuates MPP+-induced mitochondrial dysfunction and cell apoptosis via AKT/GSK-3β pathway in SN4741 cells. Neurosci Lett. 2017;637:50–56. doi: 10.1016/j.neulet.2016.11.054. [DOI] [PubMed] [Google Scholar]
- 27. Cai Z, Zeng W, Tao K, Lu F, Gao G, Yang Q. Myricitrin alleviates MPP+-induced mitochondrial dysfunction in a DJ-1-dependent manner in SN4741 cells. Biochem Biophys Res Commun. 2015;458(2):227–233. doi: 10.1016/j.bbrc.2015.01.060.
- 28. Mao K, Chen J, Yu H, Li H, Ren Y, Wu X, et al. Poly (ADP-ribose) polymerase 1 inhibition prevents neurodegeneration and promotes α-synuclein degradation via transcription factor EB-dependent autophagy in mutant α-synucleinA53T model of Parkinson’s disease. Aging Cell. 2020;19(6):e13163. doi: 10.1111/acel.13163.
- 29. Gui C, Ren Y, Chen J, Wu X, Mao K, Li H, et al. p38 MAPK-DRP1 signaling is involved in mitochondrial dysfunction and cell death in mutant A53T α-synuclein model of Parkinson’s disease. Toxicol Appl Pharmacol. 2020;388:114874. doi: 10.1016/j.taap.2019.114874.
- 30. Dong Y, Xiong J, Ji L, Xue X. MiR-421 aggravates neurotoxicity and promotes cell death in Parkinson’s disease models by directly targeting MEF2D. Neurochem Res. 2021;46(2):299–308. doi: 10.1007/s11064-020-03166-0.
- 31. Yoo MS, Chun HS, Son JJ, DeGiorgio LA, Kim DJ, Peng C, et al. Oxidative stress regulated genes in nigral dopaminergic neuronal cells: correlation with the known pathology in Parkinson’s disease. Mol Brain Res. 2003;110(1):76–84. doi: 10.1016/s0169-328x(02)00586-7.
- 32. Wang B, Cai Z, Lu F, Li C, Zhu X, Su L, et al. Destabilization of survival factor MEF2D mRNA by neurotoxin in models of Parkinson’s disease. J Neurochem. 2014;130(5):720–728. doi: 10.1111/jnc.12765.
- 33. Dekker J, Rippe K, Dekker M, Kleckner N. Capturing chromosome conformation. Science. 2002;295(5558):1306–1311. doi: 10.1126/science.1067799.
- 34. Lieberman-Aiden E, Van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009;326(5950):289–293. doi: 10.1126/science.1181369.
- 35. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008;5(7):621–628. doi: 10.1038/nmeth.1226.
- 36. Buenrostro JD, Giresi PG, Zaba LC, Chang HY, Greenleaf WJ. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods. 2013;10(12):1213–1218. doi: 10.1038/nmeth.2688.
- 37. Nissim-Eliraz E, Zisman S, Schatz O, Ben-Arie N. Nato3 integrates with the Shh-Foxa2 transcriptional network regulating the differentiation of midbrain dopaminergic neurons. J Mol Neurosci. 2013;51:13–27. doi: 10.1007/s12031-012-9939-6.
- 38. Fishman-Jacob T, Reznichenko L, Youdim MBH, Mandel SA. A sporadic Parkinson disease model via silencing of the ubiquitin-proteasome/E3 ligase component SKP1A. J Biol Chem. 2009;284(47):32835–32846. doi: 10.1074/jbc.M109.034223.
- 39. Weihe E, Depboylu C, Schütz B, Schäfer MKH, Eiden LE. Three types of tyrosine hydroxylase-positive CNS neurons distinguished by dopa decarboxylase and VMAT2 co-expression. Cell Mol Neurobiol. 2006;26(4–6):659–678. doi: 10.1007/s10571-006-9053-9.
- 40. Jonakait GM, Markey KA, Goldstein M, Dreyfus CF, Black IB. Selective expression of high-affinity uptake of catecholamines by transiently catecholaminergic cells of the rat embryo: studies in vivo and in vitro. Dev Biol. 1985;108(1):6–17. doi: 10.1016/0012-1606(85)90003-x.
- 41. Cochard P, Goldstein M, Black IB. Ontogenetic appearance and disappearance of tyrosine hydroxylase and catecholamines in the rat embryo. Proc Natl Acad Sci U S A. 1978;75(6):2986–2990. doi: 10.1073/pnas.75.6.2986.
- 42. Asmus SE, Parsons S, Landis SC. Developmental changes in the transmitter properties of sympathetic neurons that innervate the periosteum. J Neurosci. 2000;20(4):1495–1504. doi: 10.1523/JNEUROSCI.20-04-01495.2000.
- 43. Ladd AN, Charlet-B. N, Cooper TA. The CELF family of RNA binding proteins is implicated in cell-specific and developmentally regulated alternative splicing. Mol Cell Biol. 2001;21(4):1285–1296. doi: 10.1128/MCB.21.4.1285-1296.2001.
- 44. Nishino J, Kim I, Chada K, Morrison SJ. Hmga2 promotes neural stem cell self-renewal in young, but not old, mice by reducing p16Ink4a and p19Arf expression. Cell. 2008;135(2):227. doi: 10.1016/j.cell.2008.09.01.
- 45. Park HJ, Hong M, Bronson RT, Israel MA, Frankel WN, Yun K. Elevated Id2 expression results in precocious neural stem cell depletion and abnormal brain development. Stem Cells. 2013;31(5):1010. doi: 10.1002/stem.1351.
- 46. Dou Z, Son JE, Hui CC. Irx3 and Irx5 - Novel regulatory factors of postnatal hypothalamic neurogenesis. Front Neurosci. 2021;15:1447. doi: 10.3389/fnins.2021.763856.
- 47. Reid CA, Leaw B, Richards KL, Richardson R, Wimmer V, Yu C, et al. Reduced dendritic arborization and hyperexcitability of pyramidal neurons in a Scn1b-based model of Dravet syndrome. Brain. 2014;137:1701–1715. doi: 10.1093/brain/awu077.
- 48. Stark R, Brown GD. DiffBind: differential binding analysis of ChIP-Seq peak data. Bioconductor. 2011. Available online at: http://bioconductor.org/packages/release/bioc/html/DiffBind.html.
- 49. Li H, Liu Y, Gu Z, Li L, Liu Y, Wang L, et al. p38 MAPK-MK2 pathway regulates the heat-stress-induced accumulation of reactive oxygen species that mediates apoptotic cell death in glial cells. Oncol Lett. 2018;15(1):775. doi: 10.3892/ol.2017.7360.
- 50. McClymont SA, Hook PW, Soto AI, Reed X, Law WD, Kerans SJ, et al. Parkinson-associated SNCA enhancer variants revealed by open chromatin in mouse dopamine neurons. Am J Hum Genet. 2018;103(6):874–892. doi: 10.1016/j.ajhg.2018.10.018.
- 51. Xie Z, Bailey A, Kuleshov MV, Clarke DJB, Evangelista JE, Jenkins SL, et al. Gene set knowledge discovery with Enrichr. Curr Protoc. 2021;1(3):e90. doi: 10.1002/cpz1.90.
- 52. Chen EY, Tan CM, Kou Y, Duan Q, Wang Z, Meirelles GV, et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics. 2013;14:128. doi: 10.1186/1471-2105-14-128.
- 53. Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44(W1):W90–W97. doi: 10.1093/nar/gkw377.
- 54. Franzén O, Gan LM, Björkegren JLM. PanglaoDB: a web server for exploration of mouse and human single-cell RNA sequencing data. Database. 2019;2019(1):46. doi: 10.1093/database/baz046.
- 55. Zeng L, Zhang P, Shi L, Yamamoto V, Lu W, Wang K. Functional impacts of NRXN1 knockdown on neurodevelopment in stem cell models. PLoS One. 2013;8(3):e59685. doi: 10.1371/journal.pone.0059685.
- 56. Kaplan DR, Miller FD. Neurotrophin signal transduction in the nervous system. Curr Opin Neurobiol. 2000;10(3):381–391. doi: 10.1016/s0959-4388(00)00092-1.
- 57. Reddy-Alla S, Böhme MA, Reynolds E, Beis C, Grasskamp AT, Mampell MM, et al. Stable positioning of Unc13 restricts synaptic vesicle fusion to defined release sites to promote synchronous neurotransmission. Neuron. 2017;95(6):1350–1364.e12. doi: 10.1016/j.neuron.2017.08.016.
- 58. Storm R, Cholewa-Waclaw J, Reuter K, Bröhl D, Sieber M, Treier M, et al. The bHLH transcription factor Olig3 marks the dorsal neuroepithelium of the hindbrain and is essential for the development of brainstem nuclei. Development. 2009;136(2):295–305. doi: 10.1242/dev.027193.
- 59. Sung HY, Chen WY, Huang HT, Wang CY, Chang SB, Tzeng SF. Down-regulation of interleukin-33 expression in oligodendrocyte precursor cells impairs oligodendrocyte lineage progression. J Neurochem. 2019;150(6):691–708. doi: 10.1111/jnc.14788.
- 60. Liu H, Hu Q, D’Ercole AJ, Ye P. Histone Deacetylase 11 regulates oligodendrocyte-specific gene expression and cell development in OL-1 oligodendroglia cells. Glia. 2009;57(1):1. doi: 10.1002/glia.20729.
- 61. Sakry D, Yigit H, Dimou L, Trotter J. Oligodendrocyte precursor cells synthesize neuromodulatory factors. PLoS One. 2015;10(5):e0127222. doi: 10.1371/journal.pone.0127222.
- 62. Yam CH, Fung TK, Poon RYC. Cyclin A in cell cycle control and cancer. Cell Mol Life Sci. 2002;59(8):1317–1326. doi: 10.1007/s00018-002-8510-y.
- 63. Borlado LR, Méndez J. CDC6: from DNA replication to cell cycle checkpoints and oncogenesis. Carcinogenesis. 2008;29(2):237–243. doi: 10.1093/carcin/bgm268.
- 64. Ma L, Zhao X, Zhu X. Mitosin/CENP-F in mitosis, transcriptional control, and differentiation. J Biomed Sci. 2006;13(2):205–213. doi: 10.1007/s11373-005-9057-3.
- 65. Nagahama Y, Ueno M, Miyamoto S, Morii E, Minami T, Mochizuki N, et al. PSF1, a DNA replication factor expressed widely in stem and progenitor cells, drives tumorigenic and metastatic properties. Cancer Res. 2010;70(3):1215–1224. doi: 10.1158/0008-5472.CAN-09-3662.
- 66. Szklarczyk D, Gable AL, Nastou KC, Lyon D, Kirsch R, Pyysalo S, et al. The STRING database in 2021: customizable protein-protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res. 2021;49(D1):D605–D612. doi: 10.1093/nar/gkaa1074.
- 67. Choi KC, Kim SH, Ha JY, Kim ST, Son JH. A novel mTOR activating protein protects dopamine neurons against oxidative stress by repressing autophagy related cell death. J Neurochem. 2010;112(2):366–376. doi: 10.1111/j.1471-4159.2009.06463.x.
- 68. Bryja V, Čajánek L, Grahn A, Schulte G. Inhibition of endocytosis blocks Wnt signalling to β-catenin by promoting dishevelled degradation. Acta Physiol. 2007;190(1):55–61. doi: 10.1111/j.1365-201X.2007.01688.x.
- 69. Nalls MA, Blauwendraat C, Vallerga CL, Heilbron K, Bandres-Ciga S, Chang D, et al. Identification of novel risk loci, causal insights, and heritable risk for Parkinson’s disease: a meta-analysis of genome-wide association studies. Lancet Neurol. 2019;18(12):1091–1102. doi: 10.1016/S1474-4422(19)30320-5.
- 70. Blauwendraat C, Nalls MA, Singleton AB. The genetic architecture of Parkinson’s disease. Lancet Neurol. 2020;19:170–178. doi: 10.1016/S1474-4422(19)30287-X.
- 71. Chang D, Nalls MA, Hallgrímsdóttir IB, Hunkapiller J, van der Brug M, Cai F, et al. A meta-analysis of genome-wide association studies identifies 17 new Parkinson’s disease risk loci. Nat Genet. 2017;49(10):1511–1516. doi: 10.1038/ng.3955.
- 72. Blauwendraat C, Heilbron K, Vallerga CL, Bandres-Ciga S, von Coelln R, Pihlstrøm L, et al. Parkinson’s disease age at onset genome-wide association study: defining heritability, genetic loci, and α-synuclein mechanisms. Mov Disord. 2019;34(6):866–875. doi: 10.1002/mds.27659.
- 73. Polymeropoulos MH, Lavedan C, Leroy E, Ide SE, Dehejia A, Dutra A, et al. Mutation in the α-synuclein gene identified in families with Parkinson’s disease. Science. 1997;276(5321):2045–2047. doi: 10.1126/science.276.5321.2045.
- 74. Singleton AB, Farrer M, Johnson J, Singleton A, Hague S, Kachergus J, et al. Alpha-synuclein locus triplication causes Parkinson’s disease. Science. 2003;302(5646):841. doi: 10.1126/science.1090278.
- 75. Ibáñez P, Bonnet AM, Débarges B, Lohmann E, Tison F, Pollak P, et al. Causal relation between alpha-synuclein gene duplication and familial Parkinson’s disease. Lancet. 2004;364(9440):1169–1171. doi: 10.1016/S0140-6736(04)17104-3.
- 76. Taylor SC, Nadeau K, Abbasi M, Lachance C, Nguyen M, Fenrich J. The ultimate qPCR experiment: producing publication quality, reproducible data the first time. Trends Biotechnol. 2019;37(7):761–774. doi: 10.1016/j.tibtech.2018.12.002.
- 77. Zheng GXY, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017;8(1):1–12. doi: 10.1038/ncomms14049.
- 78. Butler A, Hoffman P, Smibert P, Papalexi E, Satija R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat Biotechnol. 2018;36(5):411–420. doi: 10.1038/nbt.4096.
- 79. Corces MR, Trevino AE, Hamilton EG, Greenside PG, Sinnott-Armstrong NA, Vesuna S, et al. An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat Methods. 2017;14(10):959–962. doi: 10.1038/nmeth.4396.
- 80. Butler A, Hoffman P, Smibert P, Papalexi E, Satija R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat Biotechnol. 2018;36(5):411–420. doi: 10.1038/nbt.4096.
- 81. Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016;32(19):3047–3048. doi: 10.1093/bioinformatics/btw354.
- 82. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357. doi: 10.1038/nmeth.1923.
- 83. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078. doi: 10.1093/bioinformatics/btp352.
- 84. Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9(9):1–9. doi: 10.1186/gb-2008-9-9-r137.
- 85. Amemiya HM, Kundaje A, Boyle AP. The ENCODE blacklist: identification of problematic regions of the genome. Sci Rep. 2019;9(1):9354. doi: 10.1038/s41598-019-45839-z.
- 86. Dunham I, Kundaje A, Aldred SF, Collins PJ, Davis CA, Doyle F, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489(7414):57–74. doi: 10.1038/nature11247.
- 87. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–842. doi: 10.1093/bioinformatics/btq033.
- 88. Ramírez F, Ryan DP, Grüning B, Bhardwaj V, Kilpert F, Richter AS, et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44(W1):W160–W165. doi: 10.1093/nar/gkw257.
- 89. Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, Haussler D, et al. The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 2004;32(Database issue):D493. doi: 10.1093/nar/gkh103.
- 90. Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005;15(8):1034–1050. doi: 10.1101/gr.3715005.
- 91. Shin H, Liu T, Manrai AK, Liu SX. CEAS: cis-regulatory element annotation system. Bioinformatics. 2009;25(19):2605–2606. doi: 10.1093/bioinformatics/btp479.
- 92. Liu T, Ortiz JA, Taing L, Meyer CA, Lee B, Zhang Y, et al. Cistrome: an integrative platform for transcriptional regulation studies. Genome Biol. 2011;12(8):1–10. doi: 10.1186/gb-2011-12-8-r83.
- 93. McLean CY, Bristor D, Hiller M, Clarke SL, Schaar BT, Lowe CB, et al. GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol. 2010;28(5):495–501. doi: 10.1038/nbt.1630.
- 94. Conway JR, Lex A, Gehlenborg N. UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics. 2017;33(18):2938–2940. doi: 10.1093/bioinformatics/btx364.
- 95. Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015;12(4):357–360. doi: 10.1038/nmeth.3317.
- 96. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5(10):R80. doi: 10.1186/gb-2004-5-10-r80.
- 97. Huber W, Carey VJ, Gentleman R, Anders S, Carlson M, Carvalho BS, et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat Methods. 2015;12(2):115–121. doi: 10.1038/nmeth.3252.
- 98. Liao Y, Smyth GK, Shi W. The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Res. 2013;41(10):e108. doi: 10.1093/nar/gkt214.
- 99. Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014;30(7):923–930. doi: 10.1093/bioinformatics/btt656.
- 100. Zhu A, Ibrahim JG, Love MI. Heavy-tailed prior distributions for sequence count data: removing the noise and preserving large differences. Bioinformatics. 2019;35(12):2084–2092. doi: 10.1093/bioinformatics/bty895.
- 101. Wickham H. ggplot2: elegant graphics for data analysis. New York: Springer-Verlag; 2016.
- 102. Belaghzal H, Dekker J, Gibcus JH. Hi-C 2.0: an optimized Hi-C procedure for high-resolution genome-wide mapping of chromosome conformation. Methods. 2017;123:56–65. doi: 10.1016/j.ymeth.2017.04.004.
- 103. Wingett S, Ewels P, Furlan-Magaril M, Nagano T, Schoenfelder S, Fraser P, et al. HiCUP: pipeline for mapping and processing Hi-C data. F1000Res. 2015;4:1310. doi: 10.12688/f1000research.7334.1.
- 104. Cairns J, Freire-Pritchett P, Wingett SW, Várnai C, Dimond A, Plagnol V, et al. CHiCAGO: robust detection of DNA looping interactions in Capture Hi-C data. Genome Biol. 2016;17(1):1–17. doi: 10.1186/s13059-016-0992-2.
- 105. Wingett S, Ewels P, Furlan-Magaril M, Nagano T, Schoenfelder S, Fraser P, et al. HiCUP: pipeline for mapping and processing Hi-C data. F1000Res. 2015;4:1310. doi: 10.12688/f1000research.7334.1.
- 106. Cairns J, Freire-Pritchett P, Wingett SW, Várnai C, Dimond A, Plagnol V, et al. CHiCAGO: robust detection of DNA looping interactions in Capture Hi-C data. Genome Biol. 2016;17(1):1–17. doi: 10.1186/s13059-016-0992-2.
Data Availability Statement
All data and analysis pipelines are available at https://github.com/rachelboyd. ATAC-sequencing, RNA-sequencing, single-cell RNA-sequencing, and promoter-capture Hi-C data are available at the Gene Expression Omnibus (GEO) under the series accession number GSE225084.