Summary
Long interspersed nuclear element 1 (L1) retrotransposons represent a vast source of genetic variability. However, mechanistic analysis of whether and how L1s contribute to human developmental programs is lacking, in part due to the challenges associated with specific profiling and manipulation of human L1 expression. Here, we show that thousands of hominoid-specific L1 integrants are expressed in human induced pluripotent stem cells and cerebral organoids. The activity levels of individual L1 promoters vary widely and correlate with an active epigenetic state. Efficient on-target CRISPR interference (CRISPRi) silencing of L1s revealed nearly a hundred co-opted L1-derived chimeric transcripts, and L1 silencing resulted in changes in neural differentiation programs and reduced cerebral organoid size. Together, these data implicate L1s and L1-derived transcripts in hominoid-specific CNS developmental processes.
Keywords: transposable elements, LINE-1, non-coding genome, epigenetics, neurodevelopment, evolution, cerebral organoids, hiPSCs, pluripotency
Graphical abstract

Highlights
-
•
Evolutionarily young L1s are highly expressed in hiPSCs
-
•
L1s display dynamic expression and epigenetic regulation during differentiation
-
•
L1s act as alternative promoters of primate- and human-specific transcript isoforms
-
•
CRISPRi silencing of L1s impairs the differentiation of cerebral organoids
Adami et al. demonstrate that evolutionarily young L1s are expressed in human pluripotent stem cells and are dynamically regulated throughout neural differentiation. The study examines the role of L1s as alternative promoters of primate- and human-specific transcriptional isoforms, highlighting their significance in human brain development and potential involvement in evolution.
Introduction
The human genome consists of at least 50% transposable elements (TEs) that have been incorporated and expanded by waves of retrotransposition, resulting in significant interspecies and interindividual variation in genomic composition.1,2,3,4 Due to their ability to mobilize, TEs pose a potential threat to genomic integrity.5 TEs are therefore assumed to be transcriptionally silenced in mammalian cells via epigenetic mechanisms such as DNA methylation.6,7,8,9,10 However, recent studies have begun to challenge this simplistic view, by demonstrating that some TEs are selectively expressed in various mammalian cell types during early development where they may contribute to important cellular functions.11,12,13,14,15
Long interspersed nuclear elements 1 (L1s) are the most abundant and only autonomously mobilizing family of TEs in humans, accounting for ∼17% of human genomic DNA.1,4,16 Because L1s colonized the human genome in different waves via a copy-and-paste mechanism, it is possible to approximate the evolutionary age of each individual L1 element and assign them to chronologically ordered subfamilies.17 Hundreds of thousands of L1s are primate specific, and thousands are human specific.18 L1s are transcribed from a CpG-rich internal 5′ RNA polymerase II promoter as a bicistronic mRNA encoding two proteins, ORF1p and ORF2p, which are essential for L1 mobilization.19,20,21,22,23 Notably, the L1 promoter is bidirectional, and in evolutionarily young L1s, the antisense transcript encodes a small peptide, ORF0, whose function is poorly characterized.24 L1 antisense promoters can also give rise to chimeric transcripts and act as alternative promoters for non-canonical isoforms of protein-coding genes.8,25,26 L1s are highly expressed during early mouse development, where they regulate global chromatin accessibility and influence developmental potency and self-renewal.11,12 In particular, L1s are expressed in the developing brain, where they regulate the rate of neuronal differentiation and maturation.27,28,29 The mechanism by which L1s exert these roles remains debated but appears to be independent of retrotransposition.11,12,27,28 Rather, L1s may influence transcriptomes via cis-acting mechanisms, acting as enhancers or alternative promoters, or via trans-acting mechanisms, where they act as regulatory non-coding RNAs.30 However, the functional relevance of L1 expression in early human development remains poorly understood. The intact L1 loci expressed during mouse development are not present in the human genome, where other primate- and human-specific L1 loci are found.4 Furthermore, the expression of human L1s can be affected by different regulatory elements.31 Although several pathways that regulate L1 transcription are conserved between mice and humans, the expression profile of L1s in human cells is likely to differ considerably from that in mice. This, combined with the species specificity of many L1 insertions, may affect the potential functional role of these elements.
In this study, we used a combination of short- and long-read RNA sequencing (RNA-seq) approaches coupled with targeted cleavage and nuclease release (cleavage under targets and release using nuclease [CUT&RUN]) epigenomic profiling, long-read DNA methylation analysis, and tailored bioinformatic approaches to demonstrate that L1-derived transcripts are highly expressed in human induced pluripotent stem cells (hiPSCs) and human cerebral organoids and that their expression correlates with the presence of active histone marks and the absence of DNA methylation. We applied an optimized CRISPR interference (CRISPRi)-based system that allows efficient and specific silencing of L1 expression in hiPSCs and organoids and found that L1s exert profound cis-acting effects on gene expression by acting as alternative promoters for almost 100 protein-coding genes and long non-coding RNAs (lncRNAs). This L1-mediated cis activity is not essential for the maintenance of pluripotency in hiPSCs but is important for the regulation of neural differentiation in cerebral organoids. L1 silencing reduces the size of cerebral organoids at time points corresponding to the proliferation of neural progenitors. In summary, these results demonstrate that L1s are wired into gene regulatory networks in human pluripotent stem cells and provide a layer of primate- and human-specific transcriptome complexity that influences neural differentiation and may have contributed to the evolution of the human brain.
Results
Evolutionarily young L1s are highly expressed in hiPSCs
The human genome contains approximately half a million individual L1 copies, most of which are transcriptionally inactivated due to 5′ truncations and the accumulation of mutations and deletions.4 Only L1s with intact 5′ promoter regions are capable of driving L1 transcription, and our genome contains approximately 6,000 full-length (>6 kb) L1 elements (Figure 1A). We characterized the expression of individual full-length L1 elements in two commercially available hiPSC lines (hereafter referred to as hiPSC1 and hiPSC2, see STAR Methods for details) (Figure S1A). We generated bulk RNA-seq data using an in-house 2 × 150 bp, polyadenylate (poly(A)) enriched stranded library preparation with a reduced fragmentation step to optimize library insert size for L1 analysis (Figure 1A). We obtained an average of 58 million reads per sample. Due to their repetitive nature, a large proportion of reads originating from L1s are expected to ambiguously map to the reference genome.26,32 To avoid an inflated signal due to multimapping reads, we used a unique mapping approach to investigate the expression of individual L1 elements (Figure 1A). This bioinformatic approach yielded an average of 85.9% uniquely mapped reads per sample. By using these uniquely mapped reads it is possible to investigate the transcriptional status of most individual L1 loci, except for some of the most evolutionarily recent integrants, as well as polymorphic L1 alleles that are not present in the hg38 reference genome.26
Figure 1.
L1s are highly expressed in hiPSCs
(A) Left: schematic of the experimental design (top) to profile L1 expression in hiPSCs. Bioinformatic rationale (bottom) to quantify L1 expression by using a unique mapping approach to quantify stranded expression of unique L1 loci. Middle: overview of a full-length L1 element with an intact 5′ UTR and two long open reading frames (ORF1 and ORF2). The zoom-in shows the bidirectional promoter encoding the L1 body in sense and ORF0 in antisense. Arrows indicate the sense and antisense promoters. Right: phylogenetic tree showing the evolutionary age of different L1 subfamilies.
(B) Expression (reads per kilobase per million mapped reads [RPKM]) of full-length (>6 kb), evolutionarily young L1s in two hiPSC lines. n = 3 technical replicates.
(C) Number of L1s belonging to primate-specific L1 subfamilies (L1HS to L1PA4) expressed in hiPSCs.
(D) Normalized read counts in antisense (red) and sense (blue) and of full-length L1HS (top) and L1PA2 (bottom) per sample (n = 3 technical replicates, t test). Bars represent mean normalized expression, error bars show mean ± standard deviation.
(E) Genome browser tracks showing normalized expression (RPKM) of an L1HS element in two hiPSC lines (dark blue, forward transcription; purple, reverse transcription).
(F) Western blots (WBs) of ORF1p in two hiPSC lines (top) and actin-β (bottom) as a loading control protein.
(G) Immunocytochemistry of the pluripotency marker NANOG (red) and the L1-derived protein ORF1p (green) in hiPSCs. DAPI nuclear staining (blue) is included in the overlay.
In line with what previously has been reported, we found that many individual full-length L1 loci were highly expressed in hiPSCs.33,34 These L1s belonged predominantly to primate-specific families (Figures 1B–1E), including both hominoid-specific elements (L1PA2 to L1PA3) and human-specific elements (L1HS) (Figures S1B–S1D). We detected the expression of 2,323 unique loci of evolutionarily young L1s (Figures 1B, 1C, S1C, and S1D). The RNA-seq signal was present throughout the L1 body, with a slight enrichment at the 3′ end, suggesting that transcription of most L1s terminates in the internal L1 polyadenylation signal (Figures 1B and 1E). Most of the transcription was in the same orientation as the L1s (in sense, Figure 1D). Most expressed L1s were found in intergenic regions, but for those L1s located within genes, transcription of L1s primarily occurred in the direction opposite to the host gene (Figure S1C). Taken together, these results suggest that most L1 transcripts originate from the L1 promoter and are not a consequence of readthrough or bystander transcription. We also found evidence for activity of the antisense L1 promoter, resulting in transcription extending into the upstream flanking genome (Figures 1E and S1D).
L1s encode open reading frames (ORFs) that lead to the translation of peptides, including the RNA binding protein ORF1p. Western blot (WB) analysis confirmed that ORF1p is highly expressed in hiPSCs (Figure 1F). Immunocytochemistry (ICC) of ORF1p in hiPSCs showed localization to cytoplasmic, perinuclear puncta, consistent with what has been previously described in other human cell lines (Figure 1G).35 In summary, these results demonstrate that L1 elements are highly expressed in human pluripotent stem cells.
Transcription of unique L1 loci correlates with their epigenetic status
The RNA and protein analyses show that L1s are highly expressed in hiPSCs, raising the possibility that the epigenetic status of L1 promoters is associated with an active state in this cell type. To address this question, we performed CUT&RUN epigenomic analysis36 to determine whether the histone mark H3K4me3, which is associated with active promoters, is present on evolutionarily young L1s in hiPSCs (Figures 2A and S2A). An additional advantage of H3K4me3 profiling is that the signal of these histone modifications spreads to the unique flanking genomic context, allowing accurate identification of individual transcriptionally active L1 promoters.26,37 The resulting sequencing data were uniquely mapped, followed by peak calling and intersection with full-length L1s. The CUT&RUN analysis identified 133 high-confidence H3K4me3 peaks located at the 5′ end of evolutionarily young full-length L1s, and most of these loci were confirmed to be expressed in the bulk RNA-seq dataset (Figures 2B and 2C).
Figure 2.
The epigenetic profile over L1 loci correlates with their expression
(A and B) Schematics of the CUT&RUN and DNA methylation analysis workflow.
(C) H3K4me3 peaks over the promoter of young (L1HS-PA4) >6 kb L1s (n = 2 biological replicates) (RPKM normalized). TSS, transcription start site. Profile plot at the top shows the summed signal.
(D) Violin plots showing DNA methylation levels (ONT DNA sequencing data) over the promoter of full-length (>6 kb), evolutionarily young (L1HS–L1PA4) L1 elements in two hiPSC lines, including H3K4me3 peak-called L1s. Boxplot centers correspond to the median, hinges correspond to the first and third quartile, and whiskers stretch from the first and third quartile to +1.5 interquartile range (IQR).
(E) Scatterplots showing a negative correlation between normalized expression of individual full-length (>6 kb) L1s (x axis; normalized counts by gene sizeFactors as calculated by DESeq2 [i.e., median of ratios]) and the percentage of methylated CpG sites at their promoters (y axis). R2 and p value of fitted linear models between methylation and expression per L1 subfamily are shown.
(F) Genome browser tracks showing, from top to bottom, H3K4me3 mark at the L1 promoter (RPKM normalized) and transcription of the element (dark blue, forward transcription; purple, reverse transcription, RPKM normalized). ONT DNA reads in the region, with black dots indicating methylated CpG sites, and methylation coverage of the L1 elements are shown at the bottom (light blue, hiPSC1; dark blue, hiPSC2).
Transcriptional silencing of L1s has previously been associated with the presence of CpG DNA methylation at the L1 promoter.8,9 To investigate whether L1 expression in hiPSCs correlates with the absence of DNA methylation, we performed genome-wide methylation profiling using long-read sequencing from Oxford Nanopore Technologies (ONT) (Figure 2A). This analysis revealed that the DNA methylation levels of the L1 promoters negatively correlated with their evolutionary age. Promoters of human-specific L1 elements were hypomethylated when compared to more ancient subfamilies (Figure 2D). Notably, L1s that carried an H3K4me3 mark displayed very low levels of DNA methylation over the L1 promoter, and the lack of DNA methylation at individual L1 loci correlated with the RNA expression at the same loci (Figures 2D and 2E). For example, we found several instances where the L1 promoter was hypomethylated in hiPSCs and where this correlated with the presence of H3K4me3 and RNA-seq signal (Figure 2F). These results demonstrate that the epigenetic status of each individual L1 promoter in hiPSCs, including the absence of DNA methylation and the presence of H3K4me3, correlates with the transcriptional activity of individual loci.
Efficient, on-target L1 silencing using CRISPRi
To investigate the functional role of L1s in human pluripotency and early human brain development, we optimized a lentiviral CRISPRi system38 to transcriptionally silence L1s in hiPSCs (Figures 3A and 3B). To achieve specific and efficient transcriptional silencing, we tested several guide RNA (gRNA) designs (see STAR Methods for details). Ultimately, we selected two distinct 20 bp gRNAs that target the 5′ UTR (at +802 bp for gRNA1 and at +71 bp for gRNA2) of young, full-length L1s close to their transcription start site (TSS),39 which, when co-expressed with a Krüppel-associated box transcriptional repressor domain fused to a catalytically inactive Cas9 (KRAB-dCas9) and a GFP selection marker, efficiently silenced L1 transcription (Figures 3A–3E and S3A). As a control, we transduced hiPSCs with the same lentiviral CRISPRi system but using a non-targeting control gRNA (see STAR Methods for details). No differences in cell death were observed after transduction between the control-transduced and the L1-CRISPRi-transduced hiPSCs. The percentage of successfully transduced cells (GFP-positive hiPSCs selected via fluorescence-activated cell sorting (FACS) was reproducibly comparable between conditions (Figure S3B).
Figure 3.
CRISPRi-based silencing of L1s in hiPSCs does not affect human pluripotency
(A) Schematic of the gRNA target sites within the full-length L1s. The gRNAs were designed to target the 5′ UTR of evolutionarily young, >6 kb L1s.
(B) Schematic of the CRISPRi workflow and downstream analyses.
(C) CUT&RUN analysis of dCas9 (gRNA1, control signal) (left) and bulk RNA-seq data showing the normalized expression of uniquely mapped, full-length (>6 kb), evolutionarily young (L1HS to L1PA4) L1s in control vs. L1-CRISPRi hiPSCs (all tracks RPKM normalized).
(D) H3K4me3 peaks over the promoter of young (L1HS–PA4) >6 kb L1s in control vs. L1-CRISPRi hiPSCs. Profile plot at the top shows the summed signal.
(E) Genome browser tracks illustrating dCas9 enrichment (gRNA1, control signal), H3K4me3 loss over L1 elements’ promoter, and the loss of expression in control vs. L1-CRISPRi hiPSCs (lilac, dCas9 CUT&RUN signal; green, H3K4me3 CUT&RUN signal; dark blue, forward transcription; purple, reverse transcription) (all tracks RPKM normalized).
(F) Expression of L1 families analyzed using TEtranscripts in control vs. L1-CRISPRi hiPSCs (heatmap showing normalized expression, scaled by row).
(G) Western blots (WBs) of ORF1p (top) and actin-β (bottom) in control vs. L1-CRISPRi hiPSCs.
(H) Immunostaining of the pluripotency marker NANOG (red) and the L1-derived protein ORF1p (green) in control vs. L1-CRISPRi hiPSCs. DAPI nuclear staining is in blue.
(I) Mass spectrometry (MS) data showing changes (padj) in ORF1p levels upon L1-CRISPRi in hiPSCs. See STAR Methods for statistical analysis details. Bars show protein levels (× 1,000), and error bars correspond to ± standard deviation.
(J) Left: heatmap showing log2 normalized expression (RNA-seq) of pluripotency and differentiation markers in control vs. L1-CRISPRi hiPSCs. Right: heatmap of MS data showing expression of pluripotency markers in control vs. L1-CRISPRi hiPSCs (heatmap showing log2 normalized expression).
To evaluate the efficiency and on-target specificity of the lentiviral CRISPRi system, we performed an in-depth epigenetic profiling of the L1-CRISPRi hiPSCs and control transduced cells. We performed CUT&RUN analysis of the dCas9 protein, which is expected to be recruited to the gRNA target site, as well as the active histone mark H3K4me3 and the repressive histone mark H3K9me3 (Figures 3C, 3D, and S3C). Using unique mapping, we observed clear enrichment of dCas9 protein at the 5′ UTR of evolutionarily young, full-length (>6 kb) L1s in L1-CRISPRi hiPSCs (Figure 3C). This was accompanied by a loss of H3K4me3 and gain of H3K9me3 at gRNA binding sites (Figures 3D and S3C), confirming efficient on-target engagement of the L1-CRISPRi system at the promoter of evolutionarily young L1s. H3K9me3 localization was confined to the 5′ UTR of the targeted L1 elements (Figure S3C), suggesting that L1s are specifically silenced, with limited heterochromatin spread to their surrounding genomic locations. To further characterize the CRISPRi system, we examined other (non-L1) genomic loci where a dCas9 peak was detected (n = 272 CUT&RUN peaks) (Figures S3F and S3G). Importantly, these peaks were also detected in cells expressing the non-targeting gRNA, and the levels of H3K4me3, H3K9me3, and transcription were unchanged between L1-CRISPRi and control cells at these sites, strongly implying that these are technical artifacts of the dCas9 CUT&RUN experiment, which required a cross-linking approach, rather than off-target effects associated with the L1 gRNAs (Figure S3G).
To investigate the transcriptional consequences of L1-CRISPRi transduction, we performed RNA-seq and analyzed the expression of unique L1 loci (Figures 3C, 3E, and S3D). We found that the expression of evolutionarily young full-length L1 elements was almost completely silenced in L1-CRISPRi hiPSCs, including both in-sense transcription and antisense promoter activity (Figures 3C, 3E, S3D, S3E, and S3I). To quantify the extent of L1 silencing, we used the TE-oriented read quantification software TEtranscripts.40 This approach confirmed a global transcriptional silencing of evolutionarily young L1 families (Figures 3F and S3H), leading to complete loss of the L1-derived protein ORF1p in the L1-CRISPRi hiPSCs assessed with WB, immunocytochemistry, and quantitative mass spectrometry (MS) (Figures 3G–3I and S3J–S3L). We confirmed efficient L1 silencing using the lentiviral CRISPRi system at both the RNA and the protein levels using both of our gRNAs in the two different hiPSC lines (Figures S3D, S3E, S3I, and S3J–S3L). Taken together, these data demonstrate efficient and specific on-target activity of the L1-CRISPRi system.
L1 expression is not essential for the maintenance of human pluripotency
Two previous studies have suggested that L1s may play a critical role in early mouse development and that L1 expression is important for the control of developmental potency.11,12 Recently, similar observations have also been made in naive human embryonic stem cells (ESCs).41 However, we found that L1-CRISPRi hiPSCs remained proliferative and exhibited a morphology characteristic of human pluripotent stem cells (Figures 3H and S4A). Consistent with this, RNA-seq analysis confirmed that L1-CRISPRi hiPSCs maintained high expression of pluripotency-related genes, such as POU5F1, NANOG, and SOX2, while genes associated with differentiation of all other lineages (ecto-, endo-, meso-, and trophoectodermal markers) remained transcriptionally silent (Figures 3J and S4B). We also verified the expression of pluripotency-related proteins in L1-CRISPRi using the mass spectrometry data and found high levels of expression of POU5F1, NANOG, UTF1, and SOX2 (Figures 3J and S4B). The expression of NANOG was further verified by immunostaining (Figures 3H and S3K). Finally, we confirmed the pluripotent state of the control- and L1-CRISPRi-transduced hiPSCs by successfully differentiating them into all three germ layers (Figure S4C). These data show that the L1-CRISPRi hiPSCs remain in a proliferative pluripotent state, demonstrating that L1 transcription is not essential for the maintenance of human pluripotent stem cells.
L1s influence the expression of protein-coding genes and lncRNAs in cis
To investigate the transcriptional consequences of L1 silencing in hiPSCs, we performed differential gene expression analysis comparing L1-CRISPRi transduced cells to the control cells (Figure 4A). We found that 99 genes were significantly downregulated (padj < 0.05, log2 fold change > 1) in L1-CRISPRi hiPSCs, while only 6 genes were upregulated. Most of the downregulated genes (61/99) were in the vicinity (≤50 kb) of a CRISPRi-silenced L1 (Figure 4B), showing that the transcriptional silencing of L1s impacts mainly on nearby gene expression. Overall, this indicates that most of the gene expression changes in the L1-CRISPRi hiPSCs are direct cis-related effects of the L1 transcriptional silencing, while any downstream effects appear to be minor. Among the downregulated genes, 67 were non-coding RNAs, mostly lncRNAs, while 32 were protein-coding genes (Figures 4B, 4C, and S5C).42 The transcriptional response in L1-CRISPRi hiPSCs was highly reproducible with the second L1-targeting gRNA, as well as in the second hiPSC line, suggesting that many of these transcripts are likely to be directly regulated by L1s in cis in hiPSCs (Figures 4C and S5A–S5C).
Figure 4.
L1s drive the expression of protein-coding genes and long non-coding RNAs in hiPSCs
(A) Scatterplot showing mean gene expression in L1-CRISPRi (y axis) and control (x axis) hiPSCs and summary of the differential expression analysis (DEA) (bulk RNA-seq, n = 3 replicates/condition). Blue dots, significantly downregulated genes; red dots, significantly upregulated genes; gray dots, non-significant (DESeq2: Wald test; padj < 0.05; log2[fold change] > 1).
(B) Top: distribution between protein-coding and non-coding downregulated genes. Bottom: number of downregulated genes split by their distance from the nearest full-length L1. In purple, number of genes that overlap with a full-length L1 (intragenic).
(C) Heatmap showing all the normalized expression of downregulated protein-coding genes that overlap with a full-length L1 (n = 3 replicates/condition, heatmap showing log2 normalized expression).
(D) Schematic showing the expression of a canonical transcriptional isoform from the canonical gene promoter vs. expression of an L1-driven alternative transcriptional isoform.
(E) Genome browser tracks showing the expression of L1-driven alternative transcriptional isoforms at the ELAPOR2 (top) and PPP1R1C (bottom) loci in hiPSCs and its silencing upon L1-CRISPRi. Left shows normalized transcription (RPKM) in L1-CRISPRi and control hiPSCs and ONT long-read direct RNA from control hiPSCs showing the different isoforms. Right shows a zoom-in to the L1 elements showing, from top to bottom, bulk RNA-seq data (dark blue, forward transcription; purple, reverse transcription), H3K4me3 CUT&RUN (green, control vs. L1-CRISPRi hiPSCs), and ONT direct RNA reads from control hiPSCs.
(F) Normalized exon expression between control and L1-CRISPRi hiPSCs of the first three and last three exons of ELAPOR2 and all the exons of PPP1R1C (padj DESeq2: Wald test, n = 3 replicates per condition). Intronic L1 position is indicated in blue. Bars represent mean normalized expression, and error bars show mean ± standard error.
L1s can act as alternative promoters for protein-coding genes using the activity of the L1 antisense promoter8,25,26 (Figure 4D). Focusing on the downregulated protein-coding genes, we indeed found several examples of this phenomenon where genes on the strand opposite a nearby L1 were transcriptionally silenced upon L1-CRISPRi (Figures 4D and 4E). These observations suggest that the antisense L1 promoter drives expression of an alternative isoform of these protein-coding genes. In line with this, we confirmed that the transcription of genes hosting an intronic, CRISPRi-targeted L1 element in the same orientation was not affected upon L1-CRISPRi (Figure S5D and S5E). Direct long-read ONT RNA-seq analysis (Figure S5F and S5G) in hiPSCs confirmed the existence of full-length L1 transcripts as well as L1-fusion transcripts in these genes (Figures 4E and S5F–S5I). For example, an L1PA2 provides an alternative promoter for a hominoid isoform of ELAPOR2, a gene involved in the BMP signaling pathway,43 and another L1PA2 provides an alternative hominoid promoter for PPP1R1C, which is part of a major serine/threonine phosphatase (Figure 4E). The direct long read RNA-seq analysis confirmed that the majority of ELAPOR2 transcripts start in the L1 element and are then spliced into the conserved downstream exons (Figure 4E). The L1PA2 promoter in PPP1R1C is driving almost all detected expression of this gene in hiPSCs (Figure 4E). These two L1-derived transcripts have previously been reported in the RefSeq or GENCODE database, which further confirms the robustness of our approach (Figures 4E and 4F). Analysis of exon usage of these two genes in hiPSCs confirmed that exons downstream of the L1 insertion are highly expressed and downregulated by L1-CRISPRi transduction, while exons upstream of the L1 insertion are lowly expressed and not affected by L1-CRISPRi (Figure 4F), confirming that L1-promoter activity is responsible for most of the transcription of these two genes in hiPSCs. We also found that the majority of the downregulated lncRNAs use an L1 antisense promoter to drive their expression (Figures S5B, S5H, and S5I). One such example is LINC00648, an lncRNA involved in various types of cancer.44,45 LINC00648 is highly expressed in hiPSCs and uses an L1PA2 as its promoter. Upon L1-CRISPRi, the expression of this lncRNA is completely shut down (Figure S5I) together with the L1PA2 element, confirming the L1 as the driver of transcription.
Taken together, the L1-CRISPRi RNA-seq analysis revealed nearly a hundred genes that depend on L1 promoter activity in human pluripotent cells. All these L1s represent hominoid or human-specific insertions, demonstrating that these elements provide a hominoid-specific layer of transcriptome complexity during early human development. Notably, several of these genes, including ELAPOR2, PPP1R1C, and PLCB1 (Figures 4C and 4E), are genes involved in brain development,43,46,47 suggesting a potential role for cis-acting L1s in hominoid brain speciation. Consistent with this, RNA-seq and CUT&RUN analysis of human fetal forebrain tissue26 confirmed the presence of several of these L1-driven transcript isoforms in the developing human brain, suggesting that they may play a functional role in vivo (Figure S5J).
Expression of L1 loci in cerebral organoids is variable and DNA methylation dependent
To study the role of L1s in human brain development, we generated human cerebral organoids as a model for human brain development in a 3D setting (Figure 5A). We cultured hiPSC-derived unguided cerebral organoids for 15 days. At this time point the organoids are mostly composed of neural rosettes, as determined using bright-field imaging and ZO1/PAX6 immunostaining (Figures 5B and S6A), representing an early stage of human brain development when neural progenitor cells (NPCs) are expanding.48,49 Using single-nucleus RNA-seq (snRNA-seq) analysis, we confirmed that most of the cells displayed a transcriptome characteristic of NPCs, while a minority of the cells expressed genes associated with early neurons (Figures 5C and S6B). We found no pluripotent cells in the organoids, as determined by OCT4/NANOG immunostaining and expression of OCT4 or NANOG in the snRNA-seq data (Figure S6C and S6D). These results confirmed that the unguided cerebral organoids are a suitable model system to study the exit from the pluripotent stage into a committed NPC.
Figure 5.
Expression of evolutionarily young full-length L1s in cerebral organoids
(A) Schematic of the workflow for the generation of unguided cerebral organoids and downstream analyses.
(B) Bright-field images of differentiating cerebral organoids at different time points and immunohistochemistry on day 15 organoids for ZO1 (red) and PAX6 (green). DAPI is included in blue in the overlay.
(C) Left: uniform manifold approximation and projection (UMAP) showing clusters found in day 15 unguided cerebral organoids. Middle, top: UMAP displaying the different cell types found in day 15 cerebral organoids. Middle, bottom: bar plot showing the percentage cell type composition of the cerebral organoids at day 15 of differentiation. Right: dot plot displaying selected neuronal and NPC markers used to characterize the cell clusters (dot size shows the percentage of cells expressing the gene, color indicates average expression in each cell type).
(D) Expression of uniquely mapped evolutionarily young full-length L1s in hiPSCs and day 15 cerebral organoids.
(E) Violin plots of the methylation status over the promoter of L1HS-L1PA4 >6 kb L1s in hiPSCs vs. day 15 cerebral organoids. Zoom-in plots indicate mean methylation levels (red dot) per condition per subfamily. Boxplot centers correspond to median, hinges correspond to the first and third quartile, and whiskers stretch from the first and third quartile to +1.5 IQR.
(F) Genome browser tracks showing normalized transcription (RPKM) of L1s in hiPSCs and day 15 cerebral organoids (dark blue, forward transcription; purple, reverse transcription) and ONT DNA reads across day 15 organoids and hiPSCs, black dots indicating methylated CpGs, and methylation coverage of the L1 elements at the bottom (green, organoids; blue, hiPSCs).
To investigate L1 expression, we performed bulk RNA-seq, which revealed that L1s were expressed in cerebral organoids, albeit at lower levels compared to hiPSCs (Figures 5D and S6E). The reduced L1 expression correlated with an overall increase in DNA methylation at L1 promoters in organoids compared to hiPSCs (Figure 5E). This is consistent with the restoration of DNA methylation during differentiation.50 However, of the 2,323 L1s expressed in hiPSCs, 750 were completely silenced in organoids, while the expression of 1,573 loci could still be detected (Figure S6F). This difference was also reflected in the DNA methylation status, where L1s expressed in organoids showed lower levels of DNA methylation at the L1 promoter than the silenced loci (Figure S6G). This suggests that when analyzing the epigenetic status of L1s, it is not possible to treat them as a single uniform family. Rather, each locus seems to be regulated in an independent manner, as previously suggested.9,10,51,52 We found fully methylated L1 loci that were transcriptionally silent in organoids, as well as fully demethylated loci in organoids that were expressed at similar levels compared to hiPSCs (Figures 5F and S6H). This is surprising, as the different L1 loci share the same regulatory sequences, but mirrors observations we have made previously in the developing and adult human brain.26 Taken together, these results demonstrate that, at the global level, the expression of L1s depends on the cellular context during early human development and neural differentiation but that epigenetic state and transcriptional activity of each individual L1 element is surprisingly variable.
L1s regulate early neural differentiation in cerebral organoids
To investigate whether L1 expression plays a role in the exit from human pluripotency and early neural differentiation, we generated L1-CRISPRi organoids. L1 silencing did not affect the formation of the organoids, which displayed characteristic neural rosettes after 15 days of differentiation as well as expression of ZO1 and PAX6 (Figures 6A–6C, S7A, and S7B). To assess the cellular and molecular consequences of L1-CRISPRi inhibition on human cerebral organoids, we used snRNA-seq (Figures 6D–6I and S7C–S7E). High-quality data were generated from a total of 66,206 cells, including 39,181 from L1-CRISPRi organoids (two gRNAs, two cell lines; in total 11 libraries made out of five organoids each) and 27,025 from control organoids (LacZ-gRNA, two cell lines in total; eight libraries made out of five organoids each). We performed an unbiased clustering analysis to identify and quantify the different cell types present in the organoids. Eight separate clusters of cells were identified, with the majority of cells being non-proliferating NPCs (cluster 0) and proliferating NPCs (cluster 1), as well as clusters containing newborn neurons at different stages of maturation (Figures 6D and 6E). All clusters contained cells from both L1-CRISPRi and control organoids, and we found no apparent difference in the contribution of L1-CRISPRi organoids to the different clusters, suggesting that L1s do not have a major influence on developmental fate in cerebral organoids (Figure 6F).
Figure 6.
L1s regulate early neural differentiation in cerebral organoids
(A) Workflow for the differentiation of L1-CRISPRi and control hiPSCs into day 15 unguided cerebral organoids and downstream analyses.
(B) Differentiation of the unguided cerebral organoids (hiPSC1 control vs. L1-CRISPRi). Bright-field images included from days 4, 9, and 14 of differentiation. Right: immunostaining showing neural rosettes and expression of the tight junction marker ZO1 (red) and the neural progenitor cell marker PAX6 (green). Nuclear staining DAPI is included in the overlay (blue).
(C) Normalized expression (RPKM) of uniquely mapped evolutionarily young (L1HS–L1PA4), >6 kb L1s in control vs. L1-CRISPRi unguided cerebral organoids at day 15 of differentiation.
(D) Left: UMAP showing clustering of day 15 organoids (control + L1-CRISPRi). Middle: UMAP colored by cell-cycle score (S + G2M). Right: UMAPs showing the identified cell types in control and L1-CRISPRi day 15 cerebral organoids.
(E) Dot plot showing the expression of selected markers used for the characterization of the cell clusters (dot size shows the percentage of cells expressing the gene, color indicates average expression in cell type per condition).
(F) Bar plot of the cell type distribution across batches and cell lines of control vs. L1-CRISPRi organoids.
(G) Left: heatmap showing commonly downregulated genes (L1-CRISPRi vs. control organoids) in all cell lines and guides (genes selected based on hiPSC1 gRNA1 avg log2FC < −0.25, hiPSC1 gRNA2, and hiPSC2 gRNA1 avg log2FC < 0) in the snRNA-seq data. The full list of downregulated genes can be found in Table S1. Right: heatmap with all the upregulated genes found in all cell lines and guides (genes selected based on hiPSC1 gRNA1 avg log2FC > 0.25, hiPSC1 gRNA2, and hiPSC2 gRNA1 avg log2FC > 0). The average expression of the upregulated genes in NPCs across conditions is also shown.
(H) Violin plots of selected genes upregulated in control (gray) compared to L1-CRISPRi (purple) organoids in cluster 0 and cluster 1 (Seurat::FindMarkers, Wilcoxon test; number of nuclei per condition as in Table S2).
(I) Gene set enrichment analysis of upregulated genes from clusters 0 and 1 (control vs. L1-CRISPRi organoids).
(J) Growth curves showing the size measurement distribution of control vs. L1-CRISPRi cerebral organoids from day 2 to day 15 of differentiation. Control is gray, L1-CRISPRi is purple. Area was measured as square micrometers and was assessed with Fiji ImageJ. Student’s t test was performed to compare the two conditions at each time point.
Next, we analyzed the transcriptional difference between L1-CRISPRi and control organoids. We first confirmed the transcriptional silencing of the L1s by using an in-house bioinformatics pipeline that allows for the analysis of L1 expression from the snRNA-seq dataset.26,53 By backtracking the reads from the cells that make up each cluster, we are able to create a pseudo-bulk population of cells by cluster and analyze L1 expression using unique mapping. This pseudo-bulk approach greatly increases the sensitivity of L1 analysis and allows for a quantitative estimation of L1 expression at single-cell-type resolution. We found clear evidence of L1 expression from multiple loci in most clusters in the control organoids, as well as evidence of silenced L1 expression in the L1-CRISPRi organoids (Figure S7C). In addition, we confirmed that several of the L1 cis-regulated genes (see Figure 4C) were also silenced in all clusters in the L1-CRISPRi organoids (Figure S7D).
In the two largest NPC clusters (cluster 0 and cluster 1), representing non-proliferating and proliferating NPCs, respectively, we found a set of genes that were consistently up- or downregulated in L1-CRISPRi organoids among the different batches, in both cell lines, and using either of the gRNAs (Figure 6G). Downregulated genes (n = 116, log2FC < −0.25) were predominantly located in the close vicinity of an L1, and many of these genes were the same as the cis-regulated genes identified in the iPSCs (Figure 6G). Interestingly, many of the upregulated genes (n = 41, log2FC > 0.25), which likely represent downstream consequences of the L1 silencing, have been linked to neuronal growth and neurogenesis, including AKAP6, TOX, NTRK2, and SCN3A54,55,56,57 (Figures 6G–6I). Indeed, gene set enrichment analysis of the consistently upregulated genes in the major NPC clusters (cluster 0 and cluster 1) identified terms such as central nervous system differentiation, forebrain, and synapse development, driven by genes related to functions in cAMP and calcium signaling, such as AKAP6 and ADCY2 (Figures 6H and 6I).54,58,59 These changes in gene expression could be validated across replicates and batches using a pseudo-bulk analysis (Figure S7E). Thus, the transcriptome of NPCs in organoids that lack L1 expression suggests a more advanced differentiated profile when compared to control cerebral organoids, implying that L1 expression is important for an appropriate progress of the finely tuned neural differentiation. In line with this, we found that quantification of organoid size throughout differentiation revealed that L1-CRISPRi organoids were reproducibly and significantly smaller than control organoids. This difference appeared after 10 days and increased until day 15, the last time point quantified (Figure 6J). The results were reproduced in three independent batches using the two different cell lines and the two different gRNAs (Figure 6J, t test, hiPSC1 gRNA1 batch 1, p = 0.0089; hiPSC1 gRNA1 batch 2, p = 0.00015; hiPSC1 gRNA2, p = 0.0088; and hiPSC2 gRNA1, p = 0.00019).
Together, these results show that silencing of L1s results in organoids containing the same cell types as control organoids, suggesting that L1s do not influence developmental fate. However, we found that the L1-CRISPRi organoids exhibited transcriptome changes consistent with alterations in neural development in NPCs, resulting in organoids that are reduced in size. These observations are consistent with an important role for L1 expression in the exit from pluripotency and early human brain development.
Discussion
Several recent studies have shown that L1s are expressed in early mammalian development, where they are proposed to play an important regulatory role.11,12,41 However, the abundance and repetitive nature of L1s make it difficult to accurately measure their transcription, particularly at individual loci.32 Thus, the majority of these recent studies have largely considered L1s as a single family, thereby limiting mechanistic analysis of how L1s may influence early developmental programs. In our study, we address this issue in hiPSCs and cerebral organoids using a detailed multiomics analysis, including long-read sequencing and custom bioinformatics pipelines. Our results show that thousands of unique L1 loci are expressed in human pluripotent stem cells and during early neural differentiation, where they add a hominoid-specific layer of transcriptional complexity.
We find that L1 expression in hiPSCs is largely restricted to human-specific (L1HS) and hominoid-specific (L1PA2 and L1PA3) elements. Expression of these elements correlates with lower levels of DNA methylation, suggesting that hominoid-specific L1s escape repressive epigenetic mechanisms that otherwise silence TEs in pluripotent stem cells.60 This is consistent with a previous study that found that ancestral L1PA3 elements underwent structural sequence changes that allowed them and subsequent younger L1 families to escape silencing by KRAB zinc-finger proteins,61 a large family of transcription factors that mediate transcriptional silencing of TEs in early development. However, upon differentiation into cerebral organoids, evolutionarily young L1s were largely silenced by DNA methylation. The mechanism underlying this phenomenon is unknown, but may be related to other TE repression mechanisms, such as those associated with the HUSH complex.62,63,64 This observation is similar to those previously made in adult human tissues, including the brain, where evolutionarily young L1s are hypermethylated.65 Importantly though, our data highlight that many L1 elements are completely demethylated and highly expressed in cerebral organoids and therefore do not follow this general rule. Several mechanisms may underlie this discrepancy. For example, the L1 integration site is likely to be important, and the presence of highly active nearby gene promoters or other regulatory elements may influence the epigenetic status of L1s and their expression.10,26,37,51,66 In addition, single-nucleotide variants or small deletions in the regulatory regions of individual L1 loci could prevent the recruitment of silencing factors.9 Crucially, our data underscore that, when analyzing L1 expression, it is not sufficient to investigate expression at the family level, but rather, approaches that allow single-locus resolution are needed.
Since the expression of L1s in hiPSCs correlates with the presence of H3K4me3 and low levels of DNA methylation at the promoter region, our data confirm that the L1 promoter is active in hiPSCs.33,34 This promoter activity provides a rich source of hominoid-specific transcripts. To investigate how L1s influence the transcriptome of hiPSCs, we optimized a CRISPRi system that silences the L1 promoter with high precision and efficiency. A combination of RNA-seq, mass spectrometry proteomics, and CUT&RUN epigenomic analysis confirmed on-target silencing of virtually all active L1 promoters in hiPSCs, resulting in the loss of L1 RNA and protein and inhibition of L1-mediated cis-acting effects. We found that about 100 genes in hiPSCs, a third of which are protein coding, depend on the activity of the L1 promoter. For the protein-coding genes, the L1 antisense promoter is used as an alternative promoter, leading to the expression of hominoid-specific alternative isoforms of these conserved protein-coding genes. Thus, L1s are wired into the transcriptional network of human pluripotent stem cells. In addition to acting as alternative promoters, it has been shown that L1s can act as enhancers over large distances.42 In line with this, several of the genes we found to be downregulated in the current study were located more than 50 kb away from the nearest downregulated L1, which may be the consequence of long-range effects.
Despite the global loss of L1 promoter activity, we found that L1-CRISPRi hiPSCs remained in a proliferative pluripotent state as demonstrated by RNA-seq, mass spectrometry proteomics, immunocytochemistry, and trilineage differentiation analyses. This is somewhat at odds with two previous studies using mouse and human embryonic stem cells,11,41 which suggested that L1s play an important role in developmental potency. However, the cells used in those studies were naive, totipotent embryonic stem cells, which represent an earlier developmental stage than the hiPSCs used in this study. hiPSCs represent a primed pluripotent developmental stage. Therefore, the precise early developmental stage may be important in relation to the functional role of L1s during early development. Moreover, the cited studies used different technical approaches to silence L1s, relying on antisense oligonucleotides targeting ORF1 or ORF2. The key advantage of our CRISPRi approach is that, by targeting the gRNA to the 5′ end of L1s, we limit the targeted loci to full-length L1s that are transcriptionally active. This increases specificity, as most L1s are 5′ truncated as a consequence of an imperfect retrotransposition event. Such fragmented L1s are present in very high numbers throughout the human genome and are often part of other transcripts, making it difficult to avoid off-target effects when targeting more central parts of the L1 sequence.
Theoretically, L1s can mediate regulatory functions either through cis-acting mechanisms on nearby genes, as documented in the current study and previously in mouse embryonic stem cells,67 or through trans-acting mechanisms, where L1-derived RNA or proteins carry out other regulatory mechanisms throughout the cell. Our CRISPRi approach is limited because it cannot distinguish between cis- and trans-acting mechanisms in relation to the functional outcome we detect using the organoid model. Thus, it is possible that the cis-acting mechanisms that we document at protein-coding genes are complemented by trans-acting L1 functions, which previously have been documented both in pluripotent cells and during brain development.11,26,27,28,29,41 To fully understand the mechanisms underlying how L1s influence human brain development, new molecular tools are needed that can distinguish between cis- and trans-acting mechanisms.
Several of the L1-regulated genes we documented in this study have been implicated in brain development, suggesting that L1s may play an important role in this process. Our data show that cerebral organoids in which L1 expression was silenced were smaller in size and had an altered transcriptome of NPCs compared to control counterparts. Our findings are reminiscent of previously observed differences when comparing human cerebral organoids with those derived from non-human great apes, implicating L1s in the evolution of the human brain.68,69,70 In addition, many of the genes we found upregulated in L1-CRISPRi organoids have been implicated in neurodevelopmental and psychiatric disorders. Thus, the modest, yet robust, changes observed in the transcriptional profile of L1-CRISPRi organoids suggest that the dysregulation of young, full-length L1s during early brain development could lead to neurodevelopmental defects specifically related to developmental timing. As L1s are highly polymorphic within the human population, the prevalence of particular L1 copies, single-nucleotide polymorphisms, and structural variants in fixed L1s in the genome could therefore influence the etiology of brain disorders through altered brain development.71,72
In conclusion, our results illustrate how L1s provide a layer of hominoid-specific transcriptional complexity in human pluripotent stem cells that is important for the exit of pluripotency and early brain development. Thus, L1s represent a set of genetic material that contributes to important gene regulatory and transcriptional networks in early human development that could have contributed to human brain evolution. Our findings suggest that L1s should no longer be neglected and that these sequences must be included as individual loci in future investigations to study human evolution and the underlying genetic causes of human disorders.
Limitations of the study
This study uses a CRISPRi tool to silence L1 transcription. While our data demonstrate that this tool efficiently targets the promoter of young L1s with limited off-target effects, the tool has some limitations. For instance, L1-derived transcripts that do not depend on the L1 promoter, such as readthrough transcripts, are not targeted by this system. These transcripts may play a regulatory role in human cells. Additionally, the CRISPRi approach cannot distinguish between cis- and trans-acting mechanisms, which limits our mechanistic understanding of the observed phenotype. Thus, additional molecular investigations using alternative tools are needed to fully understand how L1s impact early human brain development.
Furthermore, the iPSC and organoid model systems used in this study recapitulate some aspects of early human development. However, there are limitations to the accuracy of the cell models in comparison to actual human development, and they are associated with experimental variation, which limits the strength of some conclusions.
Resource availability
Lead contact
Requests for further information, resources, and reagents should be directed to and will be fulfilled by the lead contact, Johan Jakobsson (johan.jakobsson@med.lu.se).
Materials availability
All unique plasmids and vectors generated in this study are available from the lead contact.
Data and code availability
-
•
All data needed to evaluate the conclusions in the paper are present in the paper and/or the supplemental information. The sequencing data presented in this study have been deposited at the GEO superseries GEO: GSE283600. Proteomic data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository73 with the identifier PRIDE: PXD059661 (https://www.ebi.ac.uk/pride/archive/projects/PXD059661). Original code has been deposited at GitHub and is publicly available at https://molecular-neurogenetics.github.io/truster/ (https://doi.org/10.5281/zenodo.14362613) and git@github.com:Molecular-Neurogenetics/L1CRISPR_Adami_2025.git (https://doi.org/10.5281/zenodo.15638389).
-
•
The data, code, protocols, and key lab materials used and generated in this study are listed in the key resources table available within the article and deposited to Zenodo at https://doi.org/10.5281/zenodo.16094977.
-
•
An earlier version of this article was posted to bioRxiv at https://doi.org/10.1101/2025.01.17.633315.
Acknowledgments
We would like to thank D. Trono and R. Foray for excellent comments on the manuscript and S. Henikoff for providing reagents. We also thank U. Jarl and A. Hammarberg for technical assistance. We acknowledge and thank J. Hansson and S. Ghosh from the Lund Stem Cell Center Proteomics facility for producing the mass spectrometry data and contributing with their expertise in proteomics data. We acknowledge Clinical Genomics Lund, SciLifeLab, and Center for Translational Genomics (CTG) Lund University for providing expertise and service with sequencing and analysis. We also acknowledge the Cell and Gene Technologies Core at the Lund Stem Cell Center for performing the three-germ-layer differentiation experiment. We are grateful to all members of the Jakobsson laboratory.
This study was partially funded by the joint efforts of the Michael J. Fox Foundation for Parkinson's Research (MJFF) and the Aligning Science Across Parkinson's (ASAP) initiative. MJFF administers the grants ASAP-000520, ASAP-024296, and ASAP-025170 on behalf of ASAP and itself. The work was also supported by grants from the Swedish Research Council (2022-02673 to J.J. and 2021-03494 to C.H.D.), the Swedish Brain Foundation (FO2023-0232 to J.J.), Cancerfonden (222185Pj to J.J.), Barncancerfonden (PR2023-0099 to J.J.), NIHR Cambridge Biomedical Research Centre (NIHR203312 to R.A.B), the Swedish Society for Medical Research (S19-0100 to C.H.D.), and the Swedish Government Initiative for Strategic Research Areas (MultiPark & StemTherapy). P.G. acknowledges support from the European Union’s Horizon Europe Research and Innovation Programme under the Marie Skłodowska-Curie Actions Postdoctoral Fellowship grant agreement (project 101105804 – brainTEaser). For the purpose of open access, the authors have applied a CC_BY public copyright license to the author accepted manuscript (AAM) version arising from this publication.
Author contributions
Design and interpretation, all authors; conceptualization, A.A., R.G., and J.J.; experimental research, A.A., P.G., P.A.J., F.D., S.K., L.C.-V., D.A.M.A., and J.G.J.; bioinformatics, R.G., P.G., Y.S., O.T., and C.H.D.; material, reagents, and expertise, A.K., M.G.H., and R.A.B.; writing – original draft, A.A., R.G., and J.J.; writing – review & editing: all authors.
Declaration of interests
M.G.H. is a member of the scientific advisory board of Transposon Therapeutics.
STAR★Methods
Key resources table
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Antibodies | ||
| rabbit anti-H3K9me3, polyclonal | Abcam | RRID:AB_306848 |
| rabbit anti-H3K4me3, polyclonal | Active Motif | RRID:AB_2615077 |
| goat anti-rabbit IgG, polyclonal | Abcam | RRID:AB_10681025 |
| rabbit anti-PAX6, polyclonal | BioLegend | RRID:AB_2565003 |
| mouse anti-ZO1, monoclonal | Invitrogen | RRID:AB_2477799 |
| mouse anti-ORF1p, clone 4H1 | Sigma-Aldrich | RRID:AB_2941775 |
| rabbit anti-Cas9, polyclonal | Takara | RRID:AB_3251495 |
| mouse anti-OCT3/4 | Santa Cruz | RRID: AB_628051 |
| rabbit anti-NANOG | Abcam | RRID: AB_446437 |
| donkey anti-rabbit Alexa647 | Jackson ImmunoResearch Labs | RRID:AB_2492288 |
| donkey anti-mouse Cy3 | Jackson ImmunoResearch Labs | RRID: AB_2340813 |
| anti-human CD144 (VE-Cadherin) | Miltenyi Biotec | RRID:AB_2819510 |
| anti-human CD140b | Miltenyi Biotec | RRID:AB_2783952 |
| anti-human SOX17 | Miltenyi Biotec | RRID:AB_2653493 |
| anti-human CD184 (CXCR4) | Miltenyi Biotec | RRID:AB_2752173 |
| anti-human PAX6 | Miltenyi Biotec | RRID:AB_2819462 |
| anti-human/mouse SOX2 | Miltenyi Biotec | RRID:AB_2784458 |
| Chemicals, peptides, and recombinant proteins | ||
| ConA-coated magnetic beads (BioMag®Plus Concanavalin A) | Bangs Laboratories | Cat# BP531 |
| Roche cOmplete protease inhibitor | Sigma Aldrich | Cat# 11697498001 |
| Rock-inhibitor, Y27632 | Miltenyi Biotec | Cat# 130-106-538 |
| Penicillin-Streptomycin | GIBCO (ThermoFisher) | Cat# 15140122 |
| Draq7 | BD Biosciences | Cat# 564904 |
| Knockout replacement serum (KSR) | GIBCO(ThermoFisher) | Cat# 10828010 |
| Laminin-521 | Biolamina | Cat# LN521 |
| StemMACS iPS-Brew XF, human | Miltenyi biotec | Cat# 130-104-368 |
| StemPro Accutase | GIBCO(ThermoFisher) | Cat# A1110501 |
| Polyethyleneimine | PEI Polysciences | Cat# PN 23966 |
| SYBR Green I Master | Roche | Cat# 4887352001 |
| DPBS | GIBCO(ThermoFisher) | Cat# 14190086 |
| Matrigel | Corning | Cat# 356234 |
| OCT | HistoLab | Cat# 45830 |
| Critical commercial assays | ||
| NucleoSpin PCR Cleanup and Gel Extraction | Macherey-Nagel | Cat# 740609.50 |
| STEMdiff™ Cerebral Organoids kit | STEMCELL Technologies | Cat# 08570 |
| Rneasy mini kit (miniRNeasy) | QIAGEN | Cat# 74104 |
| TruSeq Stranded mRNA LP (48 Spl) | Illumina | Cat# 20020594 |
| Hyperprep kit (KAPA Biosystems) | Roche | Cat# 7962347001 |
| Chromium Single Cell 3′ Library | 10X Genomics | Cat# PN-1000268 |
| Chromium Single Cell 5′ Library | 10X Genomics | Cat# PN-1000263 |
| miRNA Easy Mini Kit | Qiagen | Cat# 217004 |
| StemMACS™ Trilineage Differentiation Kit | Miltenyi Biotec | Cat# 130-115-660 |
| NEBNext® Single Cell/Low Input cDNA Synthesis & Amplification Module | New England Biolabs | Cat#: E6421S for 24 reactions or E6421L for 96 reactions |
| Qubit dsDNA HS kit | Invitrogen(ThermoFisher) | Cat# Q32851 |
| Bioanalyzer High Sensitivity kit | Agilent | N/A |
| TRIzol™ reagent | Invitrogen | Cat# 15596026 and 15596018 |
| NEBNext® High Input Poly(A) mRNA Isolation Module | NEB | NEB #E3370S (24 reactions) |
| Deposited data | ||
| bulkRNA seq data (raw) | GEO | GSE283600 |
| CUTnRUN data (raw) | GEO | GSE283600 |
| 10X data (raw) | GEO | GSE283600 |
| ONT data (raw) | GEO | GSE283600 |
| Proteomics (MS) data (raw) | PRIDE | PXD059661 |
| iPSC QC | Zenodo | https://doi.org/10.5281/zenodo.15675716 |
| Immunostaining and Western Blot images | Zenodo | https://doi.org/10.5281/zenodo.15673660 |
| Experimental models: Cell lines | ||
| hiPSC line (HS2), male | RIKEN | RRID:CVCL_DQ30 |
| hiPSC line (HS1), female | RIKEN | RRID:CVCL_DQ11 |
| HS1_L1HS-CRISPRi_g1 | This paper | RRID:CVCL_E8I4 |
| HS1_L1HS-CRISPRi_g2 | This paper | RRID:CVCL_E8I5 |
| HS1_L1HS-CRISPRi_LacZ | This paper | RRID:CVCL_E8I6 |
| HS2_L1HS-CRISPRi_g1 | This paper | RRID:CVCL_E8I7 |
| HS2_L1HS-CRISPRi_LacZ | This paper | RRID:CVCL_E8I8 |
| Oligonucleotides | ||
| L1-CRISPRi gRNA1 forward: CACCGGTGGAGCCCACCACAGCTCA | This paper (ordered from: Eurofins) | N/A |
| L1-CRISPRi gRNA1 reverse: AAACTGAGCTGTGGTGGGCTCCACC | This paper (ordered from: Eurofins) | N/A |
| L1-CRISPRi gRNA1 forward: CACCGGCGTGAGCGACGCAGAAGAC | This paper (ordered from: Eurofins) | N/A |
| L1-CRISPRi gRNA2 reverse: AAACGTCTTCTGCGTCGCTCACGCC | This paper (ordered from: Eurofins) | N/A |
| LacZ gRNA forward: CACCGTGCGAATACGCCCACGCGAT | Eurofins | N/A |
| LacZ gRNA reverse: AAACATCGCGTGGGCGTATTCGCAC | Eurofins | N/A |
| Recombinant DNA | ||
| pLV hU6-sgRNA hUbC-dCas9-KRAB-T2a-GFP | Addgene/C. Gersbach | RRID: Addgene_71237 |
| FUGW U6 gLacZ dCas9-KRAB-T2a-GFP (LV3599) | This paper | RRID: Addgene_234883 |
| FUGW U6 gL1HSg1dCas9-KRAB-T2a-GFP (LV3824) | This paper | RRID: Addgene_234882 |
| FUGW U6 gL1HSg2dCas9-KRAB-T2a-GFP (LV3822) | This paper | RRID: Addgene_234881 |
| pMDL | Addgene; Zufferey et al74 | RRID: Addgene _12251 |
| psRev | Addgene; Zufferey et al74 | RRID: Addgene _12253 |
| pMD2G | Addgene; Zufferey et al74 | RRID: Addgene_12259 |
| Software and algorithms | ||
| Seurat (version 5.0.2 and 5.1.0) | Stuart et al75 | RRID:SCR_007322 |
| STAR (version 2.7.8a) | Dobin et al76 | RRID:SCR_004463 |
| MethylArtist (version 1.2.6) | https://github.com/adamewing/methylartist | N/A |
| IgV (version 2.18.2) | http://www.broadinstitute.org/igv/ | RRID:SCR_011793 |
| Spectronaut (version 18) | https://biognosys.com/software/spectronaut/ | N/A |
| Msstats (version 4.12.0) | http://msstats.org/ | RRID:SCR_014353 |
| Fiji (version 2.14.0/1.54f) | http://fiji.sc | RRID:SCR_002285 |
| Dorado (version 0.5.1-CUDA-11.7.0 and 0.7.1-CUDA-11.7.0) | https://github.com/nanoporetech/dorado | RRID:SCR_025883 |
| TEcount (TEtrancripts) (version 2.2.3) | Jin et al40 | RRID:SCR_023208 |
| SAMtools (version 1.16.1 and 1.18) | Li et al77 | RRID:SCR_002105 |
| Deeptools (version 2.5.4) | Ramirez et al78 | RRID:SCR_016366 |
| featureCounts (version 1.6.3) | Liao et al79 | RRID:SCR_012919 |
| CellRanger (version 6.0.0) | Zheng et al80 | RRID:SCR_017344 |
| bamtofastq (version 1.4.1) | N/A | RRID:SCR_023215 |
| subset-bam(version 1.1.0) | N/A | RRID:SCR_023216 |
| trusTEr (version 0.1.2) | https://doi.org/10.5281/zenodo.7589548 | https://doi.org/10.5281/zenodo.7589547 |
| DESeq2 (version 1.44.0) | Love et al81 | RRID:SCR_015687 |
| bowtie2 (version 2.4.4) | Langmead & Salzberg 82 | RRID:SCR_016368 |
| minimap2 | Li 83 | RRID:SCR_018550 |
| Bedtools (version 2.30.0) | Quinlan & Hall 84 | RRID:SCR_006646 |
| mosdepth (version 0.3.10) | https://github.com/brentp/mosdepth | RRID:SCR_018929 |
| Homer (version 5.1) | Heinz et al85 | RRID:SCR_010881 |
| Other | ||
| bulkRNA seq data analysis (code) | GitHub | https://doi.org/10.5281/zenodo.15638389 |
| CUTnRUN data analysis (code) | GitHub | https://doi.org/10.5281/zenodo.15638389 |
| 10X data analysis (code) | GitHub | https://doi.org/10.5281/zenodo.15638389 |
| Proteomics (MS) data (code) | GitHub | https://doi.org/10.5281/zenodo.15638389 |
| ONT data analysis (code) | GitHub | https://doi.org/10.5281/zenodo.15638389 |
Experimental model and study participant details
Cell culture
Two commercially available hiPSC lines generated by mRNA transfection were used in the study (RBRC-HPS0328 606A1 and RBRC-HPS0360 648A1 [RIKEN]; referred to as hiPSC1 (RRID:CVCL_DQ11; female) and hiPSC2 (RRID:CVCL_DQ30; male), respectively). The cells were cultured as previously described.86,87 Briefly, hiPSCs were maintained on Lam521-coated (0.7 μg/cm2 [Biolamina]) Nunc Δ multidishes in iPS media (StemMACS iPS-Brew XF and 0.5% penicillin/streptomycin [GIBCO]). The cells were passaged every 2-4 days until 70-90% confluency. At passaging, cells were rinsed once with DPBS (GIBCO) and dissociated with Accutase (250 μl/well in 12-wells plates) [GIBCO]) for 7-10 minutes at 37 °C. After incubation, the cells were collected, transferred to 10 ml of wash medium (9 ml DMEM/F-12 [GIBCO]; 1 ml Knockout Serum Replacement [GIBCO]), and centrifuged at 400 x g for 5 minutes. The cells were resuspended in 1 ml iPS media and plated with 10 μM of Y27632 (Rock inhibitor [Miltenyi]) for expansion. Media was changed daily.
The quality control of the iPSCs (pluripotency capacity, genomic integrity, and iPSCs quality control table) is reported at the following link: https://doi.org/10.5281/zenodo.15675716.
A detailed protocol of the hiPSC culture and handling can be found at the following link: https://doi.org/10.17504/protocols.io.kqdg334pqg25/v1.
Unguided cerebral organoids culture
Organoids were cultured using the STEMdiff™ Cerebral Organoids kit (STEMCELL Technologies [Catalog # 08570]) and following the manufacturer’s protocol until day 15 of differentiation. Shortly, at day 0 of differentiation hiPSCs at 70-90% confluency were detached and centrifuged as described above (see cell culture: hiPSCs), then resuspended in 1 ml of EB Seeding Medium (STEMdiff™ Cerebral Organoids Basal Medium 1 and STEMdiff™ Cerebral Organoids Supplement A, supplemented with 10 um of Y-27632 (Rock Inhibitor)). Cells were counted and the required volume to obtain 90000 cells/ml was calculated. 100 μl per well were then distributed in a round-bottom ultra-low attachment 96-well plate (9000 cells/well). Plate was left incubating undisturbed for 24 h. At day 2 and 4 of differentiation 100 μl of fresh EB Formation Medium (STEMdiff™ Cerebral Organoids Basal Medium 1 and STEMdiff™ Cerebral Organoids Supplement A, 4:1) were added to each well. On day 5, organoids were transferred to an ultra-low attachment 24-well plate with 0.5 ml of Induction Medium (STEMdiff™ Cerebral Organoids Basal Medium 1 and STEMdiff™ Cerebral Organoids Supplement B), in each well (1 organoids/well to avoid organoids fusion). Organoids were incubated at 37 °C for 48 h. On day 7, organoids were embedded with ∼15 μl of Matrigel per organoid and transferred to an ultra-low attachment 6-well plate with 3 ml of Expansion Medium (STEMdiff™ Cerebral Organoids Basal Medium 2, STEMdiff™ Cerebral Organoids Supplement C, STEMdiff™ Cerebral Organoids Supplement D) per well. 10-12 organoids per well were cultured together. On day 10 and 13 media was changed with 3 ml/well of Maturation Medium (STEMdiff™ Cerebral Organoids Basal Medium 2, STEMdiff™ Cerebral Organoids Supplement E). On day 15 of differentiation, the organoids were collected for downstream analyses. 3 organoids were collected and snap-frozen for each bulk RNA sequencing replicate (n= 3 replicates per condition). 4-5 organoids were collected and snap-frozen for each snRNA seq replicate (n= 3 replicates per condition). 2 organoids were collected for immunostaining (see: immunohistochemistry for samples preparation). 5 organoids were collected for long-read DNA sequencing.
Method details
Pluripotency capacity assay: Trilineage differentiation
Human iPSC were cultured in mTeSR Plus medium (STEMCELL Technologies, Cat # 100-0276) on Laminin 521 (ThermoFisher Scientific, Cat # A29248) coated plates, passaged as single cells with TrypLE Select (ThermoFisher Scientific, Cat # 12563011). Cells were seeded in media supplemented with 10uM Y27632 (Millipore, Cat # 688000). Trilineage differentiation was carried out with StemMACS™ Trilineage Differentiation Kit (Miltenyi Biotec, # 130-115-660).
| Germ Layer | Antibody | Dilution |
|---|---|---|
| Mesoderm | Miltenyi Biotec, CD144 (VE-Cadherin) Antibody, anti-human, FITC, REAfinity™ (Cat #130-123-688; RRID:AB_2819510) | 1:50 |
| Mesoderm | Miltenyi Biotec, CD140b Antibody, anti-human, APC, REAfinity™ (Cat #130-121-052; RRID:AB_2783952) | 1:50 |
| Endoderm | Miltenyi Biotec, Sox17 Antibody, anti-human, PE, REAfinity™ (Cat #130-111-032; RRID:AB_2653493) | 1:50 |
| Endoderm | Miltenyi Biotec, CD184 (CXCR4) Antibody, anti-human, APC, REAfinity™ (Cat #130-120-708; RRID:AB_2752173) | 1:50 |
| Ectoderm | Miltenyi Biotec, PAX-6 Antibody, anti-human, APC, REAfinity™ (Cat #130-123-267; RRID:AB_2819462) | 1:50 |
| Ectoderm | Miltenyi Biotec, Sox2 Antibody, anti-human/mouse, FITC, REAfinity™ (Cat #130-120-721; RRID:AB_2784458) | 1:50 |
All the tested cell lines were mycoplasma-free (qPCR-based analysis done by Eurofins Genomics). Detailed results of the assay and iPSC quality control table can be found at the following link: https://doi.org/10.5281/zenodo.15675716.
Bulk RNA sequencing
Total RNA was isolated from cells using the RNeasy Mini Kit (Qiagen). The sequencing libraries were then generated using Illumina TruSeq Stranded mRNA library prep kit (with poly-A selection) and sequenced on a NovaSeq6000 or Novaseq X plus (paired end 2 × 150bp).
To optimize the library insert size for L1 analysis, we followed the manufacturer’s instructions for the Stranded mRNA library preparation, but without fragmentation of the insert. This results in fragments that are just 150 bp shorter than the total library size (around 480 bp in total). Detailed protocol can be found at https://doi.org/10.17504/protocols.io.36wgqjqbkvk5/v1.
Bulk RNA-seq analysis
Detailed protocols for all the following sections within Bulk RNA-seq analysis can be found at https://doi.org/10.17504/protocols.io.yxmvm2m55g3p/v1.
TE subfamily quantification
For the quantification of TE subfamilies, the reads were mapped using STAR aligner (version 2.7.8a; RRID:SCR_004463)76 with an hg38 index and GENCODE version 38 (RRID:SCR_014966) as the guide GTF (--sjdbGTFfile), allowing for a maximum of 100 multimapping loci (--outFilterMultimapNmax 100) and 200 anchors (--winAnchorMultimapNmax). The rest of the parameters affecting the mapping was left in default as for version 2.6.0c.
The TE subfamily quantification was performed using TEcount from the TEToolkit (version 2.2.3; RRID:SCR_023208) in mode multi (--mode). GENCODE annotation v38 was used as the input gene GTF (--GTF), and the provided hg38 GTF file from the author’s web server was used as the TE GTF (--TE).
TE quantification
Reads were mapped using STAR aligner (version 2.7.8a; RRID:SCR_004463) with an hg38 index and GENCODE version 38 (RRID:SCR_014966) as the guide GTF (--sjdbGTFfile). We allowed a single mapping locus (--outFilterMultimapNmax 1) and a ratio of mismatches to the mapped length of 0.03 (--outFilterMismatchNoverLmax).
We measured in-sense and in-antisense transcription over features as previously described26 (see STAR Methods sections “Bulk RNAseq: TE quantification” and “Bulk RNAseq: Comparison between sense and antisense transcription over TEs”). For consistency (and to avoid quantifying over simple repeats, small RNAs, and low-complexity regions), we quantified TEs using the curated hg38 GTF file provided by the TEtranscripts authors (version 2.2.3; RRID:SCR_023208; https://github.com/mhammell-laboratory/TEtranscripts). We considered an element to be “expressed” in a particular condition if at least one sample’s normalized expression was higher than two, i.e., TE counts divided by the sample distances (sizeFactor) as calculated by DESeq2 using the gene quantification (see the “bulk RNA-seq analysis: gene quantification” section).
To create in-sense and in-antisense deeptools heatmaps of evolutionary young L1 subfamilies, we used DeepTools’ (version 2.5.4; RRID:SCR_016366) computeMatrix, computeMatrixOperations, and plotHeatmap functions as previously described26 (the method section “Bulk RNAseq: Transcription over evolutionary young L1 elements in bulk datasets”).
Gene quantification
Using uniquely-mapped reads, genes were quantified using featureCounts79 from the subread package (version 1.6.3; RRID:SCR_012919) forcing strandness (-s 2) to quantify by gene_name (-g) from the GTF of GENCODE version 38 (RRID: SCR_014966).
Differential gene expression analysis
We performed differential expression analysis using DESeq2 (version 1.44.0; RRID:SCR_015687) with the read count matrix from featureCounts (subread version 1.6.3; RRID:SCR_012919) as input. Fold changes were shrunk using DESeq2:: lfcShrink.
For the produced heatmaps, counts were normalized by median of ratios as described by Love et al.,81 summed with a pseudo-count of 1 and log2-transformed.
For further detail, please refer to the Rmarkdown on the GitHub.
Differential TE subfamilies expression analysis
We performed differential expression analysis using DESeq2 (version 1.44.0; RRID:SCR_015687) per guide RNA, per cell line, and per experiment (some experiments were reproduced on a second batch of organoids) using the uniquely-mapped read quantification of TEs (see section above “bulk RNAseq: TE quantification”). Fold changes were shrunk using DESeq2:: lfcShrink.
Using the gene DESeq2 object (see section above), we normalized the TE subfamily counts by dividing the read count matrix by the sample distances (sizeFactor) as calculated by DESeq2 using the gene quantification (see the “bulk RNA-seq analysis: gene quantification” section). For heatmap visualization, a pseudo-count of 1 was added and log2-transformed.
Western blot
RIPA buffer (Sigma-Aldrich) and a complete protease inhibitor cocktail were used to lyse the cells on ice for 30 minutes. The cells were then pelleted at 17 000xg, 4° C, for 20 minutes. Supernatants were collected and mixed with Bolt LDS loadig buffer 4x (Novex) and Bolt reducing agent 10x (Novex) and boiled at 95 °C for 5 minutes prior separation on a 4–12% Tris-glycine SDS-PAGE gel (200 V, 45 min). Proteins were then transferred from the gel to a PVDF membrane with the Transblot-Turbo Transfer system (BioRad). The membrane was blocked for 1 h in TBST with 5% skimmed milk (MTBST) before incubation overnight at 4° C with mouse anti-L1-ORF1p [Millipore MABC1152, 1:1,000 dilution] diluted in MTBST. The following day, the membrane was washed twice for 15 minutes in TBST and incubated for 1 h at room temperature with HRP-conjugated anti-mouse secondary antibody (Cell signalling, 1:5000) diluted in MTBST. After washing the membrane twice in TBST and once in TBS, the protein was detected by chemiluminescence using ECL Select reagents (Cytiva) according to manufacturer’s instructions and then imaged with a Chemi-Doc system (BioRad). Finally, the membrane was stripped using the Restore PLUS Western Blot Stripping Buffer (Thermo) as per instructions, re-blocked for 1 h in MTBST after which the procedure for the β-actin staining (HRP-472 conjugated anti-β-actin, Sigma A3854, 1:50 000 dilution). was performed as above.
Detailed protocol can be found at https://doi.org/10.17504/protocols.io.ewov1d562vr2/v1.
Western blot images included in the paper are deposited on Zenodo at the following link: https://doi.org/10.5281/zenodo.15045170.
Immunocytochemistry
HiPSCs staining. Cells were rinsed once with DPBS and fixed for 15 minutes with 4% paraformaldehyde (Merck Millipore). The fixed cells were then rinsed three times with DPBS, and then pre-blocked for ∼1h in blocking solution (KPBS with 0.25% Triton X-100 [FisherScientific] and 5% normal donkey serum). The primary antibodies were added to the blocking solution and incubated overnight at 4 °C. On the following day, cells were washed twice with KPBS and the secondary antibodies, diluted in the blocking solution, were added and incubated at room temperature for 2 hours. Stained cells were rinsed once with KPBS, then stained with DAPI (1:1000 dilution, [Sigma-Aldrich]) for 10 minutes. Finally, 2 rinses with KPBS were performed before imaging. Stainings were imaged on a Leica microscope (model DMI60000 B), and images were cropped and adjusted on Fiji ImageJ version 2.14.0/1.54f (RRID:SCR_002285; http://fiji.sc).
Antibodies used: mouse anti-OCT3/4 (1:300 dilution [Santa Cruz, RRID: AB_628051]); rabbit anti-NANOG (1:600 dilution [Abcam, RRID: AB_446437]); mouse anti-hORF1p, clone 4H1 (1:300 dilution [Sigma-Aldrich, RRID: RRID:AB_2941775]); donkey anti-mouse Cy3 (1:550 dilution [Jackson Lab]); donkey anti-rabbit Alexa647 (1:550 dilution [Jackson Lab]).
The protocol can be found at https://doi.org/10.17504/protocols.io.5qpvor7pdv4o/v1.
Immunocytochemistry images included in the paper are deposited on Zenodo at the following link: https://doi.org/10.5281/zenodo.15045170.
CUT&RUN
Two different CUT&RUN protocols were used depending on the target. A standard CUT&RUN protocol was applied to profile H3K4me3 and H3K9me3. A crosslinking step was added to profile dCas9 in L1 CRISPRi hiPSCs.
Standard CUT&RUN
The protocol performed follows a previously described CUT&RUN protocol.36,88 Briefly, 500k cells were harvested, washed twice in Wash buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 0.5 mM spermidine, 1 Roche Complete Protease Inhibitor EDTA-free tablet diluted in dH2O) and then attached to 10 ul of pre-activated ConcavalinA-coated magnetic beads. 10 μL of beads per sample were pre-activated in Binding buffer (20 mM HEPES pH 7.9, 10 mM KCl, 1 mM CaCl2, 1 mM MnCl2) and kept on ice until use. The bead-bound cells were then placed in a magnetic stand and, after carefully removing the supernatant, resuspended in ice-cold Antibody buffer (20 mM HEPES pH 7.5, 0.15 M NaCl, 0.5 mM spermidine, 1 Roche Complete Protease Inhibitor EDTA-free tablet, 0.05% w/v digitonin, 2 mM EDTA, 1:100 primary antibodies). The cells were left incubating in Antibody buffer overnight at 4 °C on gentle shake. On the following day, samples were placed on the magnetic stand, the liquid removed, and the beads were washed twice with Digitonin buffer (Wash buffer + 0.05% w/v digitonin). 50 μL of Protein A-MNase at a concentration of 700 ng/ml diluted in digitonin buffer were then carefully added to the beads while gently vortexing. The samples were incubated with the Protein A-MNase for 1h at 4 °C on gentle shake. Bead-bound cells were washed twice with 1 ml of Digitonin buffer (samples placed in the magnetic stand, liquid removed without dislodging the beads) and then resuspended in 100 μL of Digitonin buffer and cooled down to 0-2 °C for 5 minutes. Genome digestion was achieved by adding 2 mM CaCl2 to the samples which were then kept at 0 °C for 30 minutes. The reaction was quenched with 100 μL of 2X Stop buffer (0.35 M NaCl, 20 mM EDTA, 4 mM EGTA, 0.05% digitonin, 50 ng/ml of RNAse A, 50 ng/ml of glycogen, 10 ng/ml spike in DNA, all diluted in distilled H2o) and vortexing. After a 30 minutes incubation at 37 °C to release the DNA fragments, samples were placed on the magnetic stand and the supernatant was collected into a new 1.5 ml Eppendorf tube for DNA extraction via spin-column (NucleoSpin clean-up kit [Macherey-Bagel]).
Antibodies used with this protocol: anti-H3K9me3 (1:100 dilution [Abcam]); anti-H3K4me3 (1:100 dilution [Abcam]); IgG (1:100 dilution [Abcam]).
Detailed protocol can be found at https://doi.org/10.17504/protocols.io.36wgqdb83vk5/v1.
Crosslinked CUT&RUN
The protocol was the same as above except for some additional reagents in the buffers and the cross-linking (and cross-linking reversion) steps. Briefly, cells were centrifuged at 600xg for 3 minutes and the pellet was then resuspended in 1 ml of 0.1% formaldehyde solution diluted in the cell culture media to allow light crosslinking. After 1 minute incubation at room temperature, glycine was added to a final concentration of 125 mM, and the cells were centrifuged again at 600xg for 3 minutes. The supernatant was removed, and the cells resuspended in 800 μL of XL-Wash Buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 0.5 mM spermidine, 1 Roche Complete Protease Inhibitor EDTA-free tablet, 1% Triton X-100, 0.05% SDS, diluted in dH2O). Cells were centrifuged at 1000xg for 3 minutes and resuspended again in XL-Wash Buffer. Centrifugation was performed at 1200xg for 3 minutes two times more, and the pellet resuspended in XL-Wash Buffer. 10 μL of pre-activated ConA-coated beads was added per sample and allowed to bind (5-10 minutes incubation on an end-over-end rotator). After placing the samples on the magnetic stand, cleared supernatant was removed and beads were resuspended in 100 μL XL-Antibody Buffer (XL-Wash Buffer, 0.05% digitonin, 2 mM EDTA, 1:50 anti-Cas9 antibody), in which they were incubated overnight at 4 °C on gentle shake. On the following day, samples were placed on a magnetic stand and, after removing the supernatant, resuspended in XL-Digitonin buffer (XL-Wash buffer, 0.05% digitonin). The wash was repeated once more and then the beads were resuspended in 50 μL/sample of Protein A-MNase diluted in XL-Digitonin buffer (final concentration 700 ng/ml), followed by 1 hour incubation at 4 °C on gentle shake. Bead-bound samples were washed then twice with XL-Digitonin buffer and then resuspended in 100 μL of XL-Digitonin buffer and cooled down to 0-2 °C for 5 minutes. 2ul of 100 mM CaCl2 was added to each sample and then incubated at 0 °C for 30 minutes. The reaction was stopped by adding 2X Stop buffer (see: standard CUT&RUN protocol) and incubated at 37°C for 30 minutes to release the fragmented chromatin. The samples were placed on the magnetic stand and the supernatant was collected in a new Eppendorf tube, where 0.09% SDS and 0.22 mg/ml of Proteinase K were added to reverse the crosslinking and incubated overnight at 55°C. DNA was extracted using NucleoSpin PCR Cleanup columns (Macherey-Nagel).
The antibody used in this step is anti-Cas9 (1:50 dilution [Takara]).
A detailed protocol can be found at https://doi.org/10.17504/protocols.io.bp2l6dr5rvqe/v1.
CUT&RUN analysis
Paired-end reads (2 × 75) were aligned to the human genome (hg38) using bowtie2 (version 2.4.4; RRID:SCR_016368)85 (–local –very-sensitive-local –no-mixed –no-unal –no-discordant –phred33 -I 10 -X 700), converted to BAM files with SAMtools (version 1.16.1; RRID:SCR_002105) and sorted (SAMtools version 1.16.1; RRID:SCR_002105). Reads per kilobase per million mapped reads (RPKM) normalized bigwig coverage tracks were made with bamCoverage (DeepTools, version 2.5.4; RRID:SCR_016366).78 To normalize the signal from the dCas9 CUT&RUN bigwig files, we performed a subtraction (--ratio) using bigwigCompare of dCas9 gRNA1 cells signal minus the dCas9 control cells signal.
Tag directories were created using Homer (version 5.1; RRID:SCR_010881)82 makeTagDirectory on default parameters. Peak calling was performed using findPeaks (Homer), using the option “factor” as style for dCas9 CUT&RUN (-style). The rest of the parameters were left on default options. A window of 100 bp was added at both sides of the significant peaks (p-value < 0.05, as reported by homer) and only dCas9 L1-CRISPRi peaks that had no overlap in dCas9 control peaks were kept as true gRNA binding sites (bedr::in.region). and intersected (BEDtools,84 version 2.30.0; RRID:SCR_006646) to bed files containing coordinates of >6-kbp L1HS or any other L1PA subfamily (on-target; bedtools intersect; repeatmasker hg38 version open-4.0.5). The rest (-v) of the called peaks were considered off-targets. Matrices for heatmaps (both on- and off- target ones) were created using computeMatrix (DeepTools, version 2.5.4; RRID:SCR_016366) and visualized using plotHeatmap (DeepTools).
Tag directories of histone marks were created using Homer (version 5.1; RRID:SCR_010881)82 makeTagDirectory on default parameters. Peak calling was performed using findPeaks (Homer, version 5.1), using the option “histone” as style for the H3K4me3 and H3K9me3 CUT&RUN (-style). To identify active promoters among full-length evolutionary young L1s, the significant peaks (p-value < 0.05, as reported by homer) were intersected (BEDtools, version 2.30.0; RRID:SCR_006646) to a bed file containing the coordinates of the first 900bps of >6-kbp L1HS, L1PA2, L1PA3 and L1PA4 (repeatmasker hg38 version open-4.0.5).
Oxford nanopore DNA sequencing
HMW DNA was extracted from frozen hiPSC pellets (500 000-1 million cells) and day 15 cerebral organoids (5 organoids) using the Nanobind HMW DNA Extraction kit (PacBio) following the manufacturer’s instructions. HMW DNA was eluted in 100 μL of Buffer EB. Size selection to enrich for fragments >5 kb was performed using the Short Read Eliminator XS kit (PacBio) according to the manufacturer’s instructions. HMW DNA quantity and quality were assessed by Nanodrop and Qubit from the top, middle, and bottom of each tube, and on a TapeStation 4200 (Agilent) using the Agilent Genomic DNA Screen Tape Assay (Ref No. G2991-90040, Edition 08/2015). HMW DNA was sheared to approximately 10kb fragment length using a g-TUBE (Covaris) following the manufacturer’s protocol. Library preparation for whole genome sequencing was performed using the SQK-LSK114 Ligation Sequencing kit (Oxford Nanopore Technologies) with >1 μg DNA input. Libraries were each sequenced seperately using a FLO-PRO114M PromethION Flow Cell R10.4.1 on a PromethION platform (Oxford Nanopore Technologies) at SciLifeLab Uppsala and in-house on a PromethION 2 Solo (Oxford Nanopore Technologies) for 72 hours. Raw sequencing data (POD5 format) were basecalled and mapped to the human reference genome (GRCh38/hg38) using Dorado version 0.5.1-CUDA-11.7.0 (RRID:SCR_025883; https://github.com/nanoporetech/dorado) utilizing the super accurate basecalling model dna_r10.4.1_e8.2_400bps_sup@v4.3.0 and dna_r10.4.1_e8.2_400bps_sup@v4.3.0_5mCG_5hmCG@v1 for methylation aware basecalling. Resulting BAM files were sorted and indexed using SAMtools version 1.18.77
Coverage was measured using mosdepth89 (version 0.3.10; RRID:SCR_018929) on fast mode (-x) for all chromosomes with a widow size of 1000bps (-b 1000), using reads with a MAPQ > 20 (-Q 20). Copy number variations (CNVs) were detected using Spectre (https://github.com/fritzsedlazeck/Spectre) using the mosdepth output (--coverage), with hg38 as the reference fasta file (--reference). Detailed results of the CNV analysis can be found at the following link: https://doi.org/10.5281/zenodo.15675716.
Locus-specific L1 and L1 promoter (first 900 bp) methylation status were visualized using MethylArtist version 1.2.690 (https://github.com/adamewing/methylartist) using the segmeth (--motif CG, with hg38 as reference genome) and locus functions with default parameters.
Long-read direct RNA sample preparation, sequencing and analysis
RNA extraction
RNA was extracted from one hiPS cell pellet (3 million cells) using the Trizol RNA extraction protocol (Trizol reagent, Invitrogen; Pub. No. MAN0001271) . Briefly, 1 ml of Trizol (Invitrogen #15596026) was added to the pellet to lyse the cells. The lysate was incubated for 5 minutes at room temperature (RT), and then 200 μl of chloroform were added followed by thorough mixing. The sample was then incubated at RT for an additional 3 minutes prior centrifugation at 12000 x g at 4 °C for 20 minutes. The aqueous phase (containing RNA) was transferred to a new tube where 500 μl of isopropanol where added prior a 10 minutes incubation at 4 °C. After that, RNA was pelleted via centrifugation at 12000 x g at 4 °C for 10 minutes. The RNA was resuspended in 1 ml of freshly prepared 75% ethanol, vortexed, and centrifuged for 5 minutes at 7500 x g at 4 °C. The pelleted RNA was air-dried and finally resuspended in 35 μl of RNase-free water. Lastly, the suspension was incubated at 56 °C for 15 minutes.
Poly(A) mRNA isolation
The NEBNext High Input Poly(A) mRNA Isolation Module (NEB #E3370S) protocol was used to select the mRNA for long-read direct RNA sequencing. Shortly, 20 μl of High Input Oligo d(T)25 Beads were washed twice in 100 μl of RNA Binding Buffer (2X). The supernatant was removed by placing the beads in a magnetic rack until the solution was clear. 50 μl of RNA Binding Buffer (2) were then added to the isolated RNA (see: RNA extraction), which was previously diluted to 50 μl. The RNA was denaturated by heating the sample at 65 °C for 5 minutes and then cooling it to 4 °C. Beads were then resuspended and incubated at room temperature for 5 minutes prior to placing them on the magnetic holder and removing the supernatant. Two additional washes were then performed with 200 μl of Wash Buffer. After removing the Wash Buffer, beads were thoroughly resuspended in 50 μl of Tris Buffer and then heated to 80 °C for 2 minutes, followed by a cool down to 25 °C. A second purification step followed with the addition of 50 μl of RNA Binding Buffer to increase the specificity of mRNA binding. The mRNA-bound beads were again washed with 200 μl of Wash Buffer. The mRNA was then finally eluted with 17 μl of Tris Buffer and incubation at 80 °C for 2 minutes, followed by cool down to 25 °C. The sample was replaced on the magnetic rack, and 15 μl of eluted RNA were transferred to a clean nuclease-free PCR tube and placed on ice immediately. The yield of the purified mRNA was assessed using a Qubit 3 fluorometer and Qubit RNA Assay kit.
ONT direct RNA sequencing library preparation and data analysis
The library for long-read direct RNA sequencing was prepared according to manufacturer’s protocol and using the Direct RNA Sequencing Kit (SQK-RNA004) from Oxford Nanopore Technologies. The sample was sequenced using a PromethION 2 Solo IT on a PromethION RNA Flow Cell (FLO-PRO004RA). Data was basecalled using dorado (version 0.7.1-CUDA-11.7.0) with model rna004_130bps_sup@v3.0.1 and --modified-bases-models using dorado's model rna004_130bps_sup@v3.0.1_m6A_GRACH@v1. Reads were mapped using minimap283 (-ax splice -uf -k14 -y) using hg38 as the reference genome. Visualization was done using the sorted and indexed bam file (samtools 1.16.1) on IgV (version 2.18.2) (http://www.broadinstitute.org/igv/).
CRISPR inhibition (CRISPRi)
To silence the transcription of LINE-1s in hiPSCs, we adapted the protocol detailed previously.38,74 The single guide sequences designed to recognize the 5’ UTR of full-length young L1 elements were previously published.39 The guide sequences were inserted into a deadCas9-KRAB-T2A-GFP lentiviral backbone, pLV hU6-sgRNA hUbC-dCas9-KRAB-T2a-GFP, a gift from Charles Gersbach (RRID: Addgene_71237), using annealed oligos and the BsmBI cloning site. Lentivirus was produced as described below (see: lentiviral production). hiPSCs were transduced with MOI 10 of LacZ (RRID: Addgene_234883) and LINE-1-targeting gRNA lentiviral particles (RRID: Addgene_234882 and (RRID: Addgene_234881). Guide efficiency was validated using bulk RNA sequencing, due to the repetitive nature of the targets.
| gRNAs | Direction | Sequence (with BsmBI cloning site) – 5’-3’ |
|---|---|---|
| L1 CRISPRi guide 1 | forward | CACCGGTGGAGCCCACCACAGCTCA |
| reverse | AAACTGAGCTGTGGTGGGCTCCACC | |
| L1 CRISPRi guide 2 | forward | CACCGGCGTGAGCGACGCAGAAGAC |
| reverse | AAACGTCTTCTGCGTCGCTCACGCC | |
| Non-targeting control (LacZ) | forward | CACCGTGCGAATACGCCCACGCGAT |
| reverse | AAACATCGCGTGGGCGTATTCGCAC |
Between 7 and 10 days after transduction, GFP-positive cells were selected via FACS (FACS Aria, BD Biosciences). Briefly, cells were detached as described above (see: cell culture), then resuspended in iPS media containing RY27632 (10 μM) and Draq7 (1:1000, BD Bioscience), and strained with a 70μm (BD Bioscience) filter. Gating parameters were determined by side and forward scatter to eliminate debris and dead cells. The GFP-positive gates were set using untransduced hiPSCs. Sorting gates and strategies were validated via reanalysis of sorted cells with a requirement of > 95% purity cut-off. 100 000-200 000 GFP-positive/Draq7-negative cells were collected for each sample, spun down at 400 x g for 5 min and snap-frozen on dry ice (for bulk RNA sequencing). Cell pellets were kept at −80°C until RNA was isolated. 300 000 GFP-positive/Draq7-negative cells per samples were also collected, spun down at 400 x g for 5 min, resuspended in iPS media containing RY27632 (10 μM) and expanded as described above (see: cell culture). GFP-positive cells were also frozen down for further use (e.g. cerebral organoids differentiation). If the FACS-sorted hiPSCs were replated for expansion, they were passaged once before any downstream functional genomic assays or differentiation into unguided cerebral organoids were performed, in order to allow the cells to full recover. If the transduced cells were cryopreserved, they were sorted using FACS after thawing, before any subsequent experiments or analyses.
The cell lines produced from these lentiviral transductions are: HS1_L1HS-CRISPRi_g1 (RRID:CVCL_E8I4); HS1_L1HS-CRISPRi_g2 (RRID:CVCL_E8I5); HS1_L1HS-CRISPRi_LacZ (RRID:CVCL_E8I6); HS2_L1HS-CRISPRi_g1 (RRID:CVCL_E8I7); HS2_L1HS-CRISPRi_LacZ (RRID:CVCL_E8I8).
A detailed protocol for lentiviral-mediated CRISPRi can be found at https://doi.org/10.17504/protocols.io.4r3l29ezqv1y/v1
Lentiviral production
Lentiviral vectors were produced according to Zufferey et al.91 and were titered by qRT-PCR. Shortly, HEK293T cells (RRID: CVCL_0063) were grown to a confluency of 70 – 90% for lentiviral production and co-transfected with third-generation packaging and envelope vectors (pMDL [RRID:Addgene_12251], psRev [RRID:Addgene_12253], pMD2G [RRID:Addgene_12259]) together with Polyethyleneimine (PEI Polysciences PN 23966) in DPBS (GIBCO) in conjunction with the previously generated plasmids (see above: CRISPR inhibition). The lentiviruses were then harvested two days after transfection. The media was collected, filtered and centrifuged at 19500 x g for 2 hours at 4° C. The supernatant was removed from the tubes, the pellet resuspended in cold DPBS and left at 4° C overnight. The resulting lentivirus was aliquoted, titered, and stored at −80°C.
Detailed protocol can be found at https://doi.org/10.17504/protocols.io.8epv5ro85g1b/v1
Mass spectrometry data generation and analysis
A detailed protocol for each of the following sections for mass spectrometry data generation and analysis can be found doi: https://www.ebi.ac.uk/pride/archive/projects/PXD059661
Sample preparation for cellular proteome analysis
Pellets corresponding to 250,000 cells from L1 CRISPRi hiPSCs samples and the LacZ control (3 replicates per condition) were solubilized in 50 mM ammonium bicarbonate (Sigma) with 0.1% RapiGest (Waters) to denature proteins and shaken on a thermomixer (Eppendorf) at 400 rpm for 15 minutes decreasing the temperature from 80°C to 56°C. This was followed by reduction of disulfide bonds with 0.1M dithiothreitol at 56°C, cysteine alkylation with 0.2 M iodoacetamide at room temperature in the dark, and digestion overnight at 37°C with sequencing grade modified trypsin (enzyme: protein ratio 1:50 [Promega]). Digested peptides were acidified with 10% trifluoroacetic acid (TFA) and RapiGest was precipitated by incubation at 37°C. Peptides were desalted with SepPak C18 cartridges (Waters), dried by vacuum centrifugation, and stored at −80°C until LC-MS analysis. To generate a spectral library for the deep proteomic profile for the L1 CRISPRi hiPSCs proteome, 15-20 μg of peptides pooled from each sample were fractionated into 8 fractions by high-pH reversed-phase (HpH-RP) fractionation. Fractioned samples were dried by vacuum centrifugation and stored at −80°C until LC-MS analysis.
Liquid chromatography and mass spectrometry (LC-MS)
LC-MS analyses were carried out on an Orbitrap Exploris 480 MS instrument with a reverse phase UltiMate 3000 UHPLC system via an EASY-Spray ion source equipped with FAIMS Pro (all Thermo Fisher Scientific). Peptides for the single-shot samples were measured with data-independent acquisition (DIA). Digested peptides were loaded onto a trap cartridge (Acclaim PepMap C18, 5 mm particle size, 0.3 mm inner diameter x 5 mm length, Thermo Fisher Scientific) and separated by EASY-Spray analytical column (2 mm particle size, 75 mm inner diameter x 500 mm length, Thermo Fisher Scientific). Each sample was injected twice and eluted with a linear gradient ranging from 2-19% Solvent B (0.1% formic acid (FA) in 80% acetonitrile (ACN) over 80 min, 19-41% B over 40 min, 41-90% B over 5 min and held at 90-95% B for 5 min at a constant flow rate of 300 nl/min at 450C. For data-dependent acquisition (DDA) and DIA measurements, the spray voltage was set at 2.1 kV, ion transfer tube temperature was set at 275° C, and FAIMS compensation voltages (CV) were set to -45 and -60. For DIA, peptides were analyzed with one full scan (350–1,400 m/z, R = 120,000) at a normalized AGC target of 300%, followed by 38 DIA MS/MS scans (350–1,050 m/z) in HCD mode (isolation window 18 m/z, 1 m/z window overlap, normalized collision energy 27%), with fragments detected in the Orbitrap (R = 15,000). The fractions were measured with DDA to generate the spectral library. Each fraction was injected twice and eluted over 180 min gradients (UltiMate 3000 UHPLC, Thermo Fisher Scientific) ranging from 2-25% Solvent B (0.1%FA in 80% ACN) over 100 min, 25-40% B over 20 min, 40-90% B over 2 min and held at 90% B for 5 min at a constant flow rate of 300 nl/min at 450C. The full MS scan was performed in the Orbitrap in the range of 350 to 1400 m/z at a resolution of 60,000 with automatic gain control (AGC) set to 300 and maximum ion injection time set to auto. The intensity threshold was set to 5.0e3 and mass tolerance to 10 ppm. The most intense ions selected in the first MS scan were isolated for higher-energy collision-induced dissociation (HCD) at a precursor isolation window width of 1.6 m/z, and normalized AGC of 75%. Maximum ion injection time for MS2 was set to auto with MS2 resolution set to 15,000. The first mass and the normalized collision energy were set to 110 m/z and 30%, respectively. All data were acquired in positive polarity and MS/MS for DIA and DDA were acquired in centroid and profile mode, respectively.
MS raw data processing and statistical analysis
Spectronaut (version 18, Biognosys AG; https://biognosys.com/software/spectronaut/) was used to build the spectral library and analyze the single-shot DIA runs. For the generation of the hybrid library, the reverse phase fraction runs (DDA) were combined with the single-shot individual sample runs (DIA) and searched with ‘Pulsar’ using default settings. FDR was set to 1% to determine the significance level. The single-shot DIA samples were searched against the hybrid library with the human SwissProt reference proteome (https://www.uniprot.org/uniprotkb; 20,602 entries downloaded on October 2024) and commonly used contaminants. Searches used carbamidomethylation as fixed modifications, methionine oxidation, and protein N-terminal acetylation as variable modifications. The Trypsin/P proteolytic cleavage rule was used, permitting a maximum of 2 missed cleavages and a minimum peptide length of 7 amino acids. ‘Cross run normalization’ was enabled with Normalization Strategy set to ‘local normalization’ based on rows with ‘Identified in All Runs (Complete)’. Data filtering was set to Q-value and the Q-value thresholds were set to 0.01 at PSM, peptide, and protein levels. Protein quantification and statistical analysis were performed with Msstats92 (version 4.12.0; RRID:SCR_014353; http://msstats.org/). Contaminants were filtered and features were converted to MSStats format for downstream processing. Uninformative features were removed, and missing values were imputed with the ‘MBimpute’ function within MSStats. For statistical analysis, MSStats ‘group comparison’ was performed for the gRNA samples against the LacZ control. Differentially expressed proteins were selected with adjusted p-value of less than 0.05 and a fold change of more than 1.5 for each comparison.
Immunohistochemistry
Organoids collected for staining were transferred into a 24-well plate and fixed with 4% paraformaldehyde (PFA) for 2 hours. Fixed organoids were then rinsed twice in DPBS and immersed in a 1:1 OCT (catalog no. 45830, HistoLab) and 30% sucrose emulsion in which they were incubated overnight at 4 °C on gentle shake. On the following day, the organoids were transferred to a cryomold in OCT and frozen on dry ice, to be stored at -80 °C until cryosectioning. Organoids were cut on a cryostat at -20 °C into sections of 20 um thickness and placed onto Superfrost Plus microscope slides.
A hydrophobic PAP Pen was used to draw the borders of the slides. Slides were then washed twice in 1X KPBS (PBS + 0.3% Triton X-100) and fixed again with 4% PFA for 10 minutes. After two washes with KPBS, slides were immersed in Blocking solution (TKPBS + 5% Normal Donkey Serum) for at least 1 h. ∼300 μl of Primary antibody solution (blocking solution + primary antibodies at the chosen dilution) were added to each slide and these were incubated overnight at 4 °C in a humidified chamber. On the following day, slides were washed three times with KPBS and then incubated for at least 1 h with ∼300 μl of Secondary antibody solution (blocking solution + primary antibodies at the chosen dilution, 1:1000 DAPI) in the dark. Slides were then washed three times with KPBS and finally coverslipped with FluorSave mounting medium. Slides were imaged on a confocal Microscope Leica TSC SP8 and images cropped and adjusted on ImageJ Fiji.
Antibodies used: mouse anti-ZO1(Invitrogen, 339100; 1:300); rabbit anti-PAX6 (Biolegend, 901301; 1:300); anti-mouse Cy3 (Jackson Lab; 1:550); anti-rabbit Alexa Fluor647 (Jackson Lab; 1:550).
Detailed protocol can be found at https://doi.org/10.17504/protocols.io.261gerkm7l47/v1.
Immunohistochemistry images included in the paper are deposited on Zenodo at the following link: https://doi.org/10.5281/zenodo.15045170.
Single nuclei isolation
The nuclei isolation from organoids was performed as previously published53,93 using a sucrose gradient-based isolation. In brief, the organoids were thawed and dissociated in ice-cold lysis buffer (0.32 M sucrose, 5 mM CaCl2, 3 mM MgAc, 0.1 mM Na2EDTA, 10 mM Tris-HCl, pH 8.0, 1 mM DTT) with a 1 ml tissue douncer (Wheaton). The lysate was then carefully layered on top of a sucrose cushion (1.8 M sucrose, 3 mM MgAc, 10 mM Tris-HCl pH 8.0, and 1 mM DTT diluted in milliQ water) before centrifugation at 30,000 × g for 2 hours and 15 min. Once the supernatant was removed, the pelleted nuclei were softened for 10 min in 100 μl of nuclear storage buffer (15% sucrose, 10 mM Tris-HCl pH 7.2, 70 mM KCl, and 2 mM MgCl2, all diluted in milliQ water), resuspended in 300-800 μl of dilution buffer (10 mM Tris-HCl pH 7.2, 70 mM KCl, and 2 mM MgCl2, diluted in milliQ water) and then filtered through a cell strainer (70 μm). The nuclei were sorted via FANS (with a FACS Aria, BD Biosciences) at 4° C at low flow rate using a 100 μm nozzle (reanalysis showed >95% purity).
Detailed protocol can be found at https://doi.org/10.17504/protocols.io.5jyl8j678g2w/v1.
Single nuclei sequencing
The nuclei for single nuclei RNA sequencing (8500-10000 nuclei per sample) were loaded onto the Chromium Next GEM Chip G Single Cell Kit along with the reverse transcription mastermix according to manufacturer’s protocol for the Chromium Next GEM single cell 3’ kit (10X Genomics, PN-1000268) to generate single-cell gel beads in emulsion. cDNA amplification was done following the guidelines from 10X Genomics using 13 cycles of amplification of the 3’ libraries. Sequencing libraries were generated with unique dual indices (TT set A) and pooled for sequencing on a Novaseq6000 or Novaseq X plus using a 100-cycle kit and 28-10-10-90 reads.
Single nuclei RNAseq analysis
Gene quantification
The raw base calls were demultiplexed and converted to sample-specific fastq files using 10x Genomics Cell Ranger mkfastq (version 6.0.0; RRID:SCR_017344).80 Cell Ranger count was run with default settings, using an mRNA reference for single-cell samples and a pre-mRNA reference (generated using 10x Genomics Cell Ranger 6.0.0 guidelines) for single-nucleus samples.
Clustering
Samples were analyzed using Seurat (version 5.0.2; RRID:SCR_007322).75 For each sample, cells were filtered out if the percentage of mitochondrial content was over 10% (perc_mitochondrial). Cells were discarded if the number of genes detected (nFeature_RNA) was higher than two SDs over the mean in the sample or lower than a SD below the mean in the sample. Following the merging of the samples (Seurat::merge), counts were normalized (Seurat::NormalizeData, LogNormalize) and scaled (Seurat::ScaleData). Principal components were calculated using Seurat::RunPCA, and the first 10 precomputed principal components were used to construct the (shared) nearest neighbor graph (Seurat::FindNeighbors) to determine nuclei clusters (Seurat::FindClusters) using a resolution of 0.1. Cell types were annotated using canonical marker gene expression.
Gene differential expression analysis
We used Seurat’s FindMarkers grouped by cell types and on default parameters as for version 5.1.0 to identify differentially expressed genes (Wilcoxon test). A gene was considered to be differentially expressed on a cell type if its adjusted P value was below 0.05. To define the lists of genes consistently up or downregulated in all experiments (Figure 6G), we performed individual comparisons between L1-CRISPR and LacZ conditions in each cell type and guide. We used hiPS6 gRNA1 as our “base” comparison. If a gene had a log2FC > 0.25 in our “base” comparison and log2FC > 0 in the rest of the comparisons (hiPS6 gRNA1 vs LacZ, hiPS48 gRNA1 vs LacZ, and hiPS48 gRNA2 vs LacZ), the gene is considered to be a consistently upregulated gene. Similarly, if a gene had a log2FC < -0.25 in our “base” comparison and log2FC < 0 in the rest of the comparisons, the gene is considered to be a consistently downregulated gene.
TE quantification
We used an in-house pseudo-bulk approach to processing snRNA-seq data to quantify TE expression per cluster, similar to what has been previously described.53 All clustering, normalization and merging of samples were performed using the contained scripts of get_clusters.R [get_custers() from the Sample class] and merge_samples.R [merge_samples() from the Experiment class] of trusTEr (version 0.1.2; https://doi.org/10.5281/zenodo.14362613) using the unique mapping mode when processing clusters. Documentation of the pipeline can be found at https://molecular-neurogenetics.github.io/truster/.
Dependencies were ran using the following software versions:
| Cellranger from 10x Genomics version 6.0.0 (RRID:SCR_017344). |
| subset-bam from 10x Genomics version 1.1.0 (RRID:SCR_023216). |
| bamtofastq from 10x Genomics version 1.4.1 (RRID:SCR_023215). |
| STAR aligner version 2.7.8a (RRID:SCR_004463). |
| SAMtools version 1.16.1 (RRID:SCR_002105). |
| TEcount version 2.2.3 (RRID:SCR_023208). |
| featureCounts subread version 1.6.3 (RRID:SCR_012919). |
| Seurat version 5.0.2 (RRID:SCR_007322). |
TrusTEr was ran using unique mapping (unique = True) and accounting for genes (include_genes = True). SizeFactors for each pseudocluster were calculated using gene counts (DESeq2) and used to normalize pseudobulked TE expression.
Published: August 22, 2025
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.xgen.2025.100979.
Supplemental information
References
- 1.Lander E.S., Linton L.M., Birren B., Nusbaum C., Zody M.C., Baldwin J., Devon K., Dewar K., Doyle M., FitzHugh W., et al. Initial sequencing and analysis of the human genome. Nature. 2001;409:860–921. doi: 10.1038/35057062. [DOI] [PubMed] [Google Scholar]
- 2.de Koning A.P.J., Gu W., Castoe T.A., Batzer M.A., Pollock D.D. Repetitive elements may comprise over two-thirds of the human genome. PLoS Genet. 2011;7 doi: 10.1371/journal.pgen.1002384. [DOI] [Google Scholar]
- 3.Nurk S., Koren S., Rhie A., Rautiainen M., Bzikadze A.V., Mikheenko A., Vollger M.R., Altemose N., Uralsky L., Gershman A., et al. The complete sequence of a human genome. Science. 2022;376:44–53. doi: 10.1126/science.abj6987. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Hoyt S.J., Storer J.M., Hartley G.A., Grady P.G.S., Gershman A., de Lima L.G., Limouse C., Halabian R., Wojenski L., Rodriguez M., et al. From telomere to telomere: The transcriptional and epigenetic state of human repeat elements. Science. 2022;376 doi: 10.1126/science.abk3112. [DOI] [Google Scholar]
- 5.Payer L.M., Burns K.H. Transposable elements in human genetic disease. Nat. Rev. Genet. 2019;20:760–772. doi: 10.1038/s41576-019-0165-8. [DOI] [PubMed] [Google Scholar]
- 6.Deniz Ö., Frost J.M., Branco M.R. Regulation of transposable elements by DNA modifications. Nat. Rev. Genet. 2019;20:417–431. doi: 10.1038/s41576-019-0106-6. [DOI] [PubMed] [Google Scholar]
- 7.Yoder J.A., Walsh C.P., Bestor T.H. Cytosine methylation and the ecology of intragenomic parasites. Trends Genet. 1997;13:335–340. doi: 10.1016/s0168-9525(97)01181-5. [DOI] [PubMed] [Google Scholar]
- 8.Jonsson M.E., Brattås P.L., Gustafsson C., Petri R., Yudovich D., Pircs K., Verschuere S., Madsen S., Hansson J., Larsson J., et al. Activation of neuronal genes via LINE-1 elements upon global DNA demethylation in human neural progenitors. Nat. Commun. 2019;10:3182. doi: 10.1038/s41467-019-11150-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Sanchez-Luque F.J., Kempen M.J.H.C., Gerdes P., Vargas-Landin D.B., Richardson S.R., Troskie R.L., Jesuadian J.S., Cheetham S.W., Carreira P.E., Salvador-Palomeque C., et al. LINE-1 Evasion of Epigenetic Repression in Humans. Mol. Cell. 2019;75:590–604.e12. doi: 10.1016/j.molcel.2019.05.024. [DOI] [PubMed] [Google Scholar]
- 10.Lanciano S., Philippe C., Sarkar A., Pratella D., Domrane C., Doucet A.J., van Essen D., Saccani S., Ferry L., Defossez P.A., Cristofari G. Locus-level L1 DNA methylation profiling reveals the epigenetic and transcriptional interplay between L1s and their integration sites. Cell Genom. 2024;4 doi: 10.1016/j.xgen.2024.100498. [DOI] [Google Scholar]
- 11.Percharde M., Lin C.J., Yin Y., Guan J., Peixoto G.A., Bulut-Karslioglu A., Biechele S., Huang B., Shen X., Ramalho-Santos M. A LINE1-Nucleolin Partnership Regulates Early Development and ESC Identity. Cell. 2018;174:391–405.e19. doi: 10.1016/j.cell.2018.05.043. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Jachowicz J.W., Bing X., Pontabry J., Bošković A., Rando O.J., Torres-Padilla M.E. LINE-1 activation after fertilization regulates global chromatin accessibility in the early mouse embryo. Nat. Genet. 2017;49:1502–1510. doi: 10.1038/ng.3945. [DOI] [PubMed] [Google Scholar]
- 13.Macfarlan T.S., Gifford W.D., Driscoll S., Lettieri K., Rowe H.M., Bonanomi D., Firth A., Singer O., Trono D., Pfaff S.L. Embryonic stem cell potency fluctuates with endogenous retrovirus activity. Nature. 2012;487:57–63. doi: 10.1038/nature11244. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Grow E.J., Flynn R.A., Chavez S.L., Bayless N.L., Wossidlo M., Wesche D.J., Martin L., Ware C.B., Blish C.A., Chang H.Y., et al. Intrinsic retroviral reactivation in human preimplantation embryos and pluripotent cells. Nature. 2015;522:221–225. doi: 10.1038/nature14308. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Wang J., Xie G., Singh M., Ghanbarian A.T., Raskó T., Szvetnik A., Cai H., Besser D., Prigione A., Fuchs N.V., et al. Primate-specific endogenous retrovirus-driven transcription defines naive-like stem cells. Nature. 2014;516:405–409. doi: 10.1038/nature13804. [DOI] [PubMed] [Google Scholar]
- 16.Beck C.R., Garcia-Perez J.L., Badge R.M., Moran J.V. LINE-1 elements in structural variation and disease. Annu. Rev. Genomics Hum. Genet. 2011;12:187–215. doi: 10.1146/annurev-genom-082509-141802. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Khan H., Smit A., Boissinot S. Molecular evolution and tempo of amplification of human LINE-1 retrotransposons since the origin of primates. Genome Res. 2006;16:78–87. doi: 10.1101/gr.4001406. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Konkel M.K., Walker J.A., Batzer M.A. LINEs and SINEs of primate evolution. Evol. Anthropol. 2010;19:236–249. doi: 10.1002/evan.20283. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Feng Q., Moran J.V., Kazazian H.H., Jr., Boeke J.D. Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition. Cell. 1996;87:905–916. doi: 10.1016/s0092-8674(00)81997-2. [DOI] [PubMed] [Google Scholar]
- 20.Mathias S.L., Scott A.F., Kazazian H.H., Jr., Boeke J.D., Gabriel A. Reverse transcriptase encoded by a human transposable element. Science. 1991;254:1808–1810. doi: 10.1126/science.1722352. [DOI] [PubMed] [Google Scholar]
- 21.Swergold G.D. Identification, characterization, and cell specificity of a human LINE-1 promoter. Mol. Cell Biol. 1990;10:6718–6729. doi: 10.1128/mcb.10.12.6718-6729.1990. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Hata K., Sakaki Y. Identification of critical CpG sites for repression of L1 transcription by DNA methylation. Gene. 1997;189:227–234. doi: 10.1016/s0378-1119(96)00856-6. [DOI] [PubMed] [Google Scholar]
- 23.Khazina E., Weichenrieder O. Human LINE-1 retrotransposition requires a metastable coiled coil and a positively charged N-terminus in L1ORF1p. eLife. 2018;7 doi: 10.7554/eLife.34960. [DOI] [Google Scholar]
- 24.Denli A.M., Narvaiza I., Kerman B.E., Pena M., Benner C., Marchetto M.C.N., Diedrich J.K., Aslanian A., Ma J., Moresco J.J., et al. Primate-specific ORF0 contributes to retrotransposon-mediated diversity. Cell. 2015;163:583–593. doi: 10.1016/j.cell.2015.09.025. [DOI] [PubMed] [Google Scholar]
- 25.Speek M. Antisense promoter of human L1 retrotransposon drives transcription of adjacent cellular genes. Mol. Cell Biol. 2001;21:1973–1985. doi: 10.1128/MCB.21.6.1973-1985.2001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Garza R., Atacho D.A.M., Adami A., Gerdes P., Vinod M., Hsieh P., Karlsson O., Horvath V., Johansson P.A., Pandiloski N., et al. LINE-1 retrotransposons drive human neuronal transcriptome complexity and functional diversification. Sci. Adv. 2023;9 doi: 10.1126/sciadv.adh9543. [DOI] [Google Scholar]
- 27.Toda T., Bedrosian T.A., Schafer S.T., Cuoco M.S., Linker S.B., Ghassemzadeh S., Mitchell L., Whiteley J.T., Novaresi N., McDonald A.H., et al. Long interspersed nuclear elements safeguard neural progenitors from precocious differentiation. Cell Rep. 2024;43 doi: 10.1016/j.celrep.2024.113774. [DOI] [Google Scholar]
- 28.Mangoni D., Simi A., Lau P., Armaos A., Ansaloni F., Codino A., Damiani D., Floreani L., Di Carlo V., Vozzi D., et al. LINE-1 regulates cortical development by acting as long non-coding RNAs. Nat. Commun. 2023;14:4974. doi: 10.1038/s41467-023-40743-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Bodea G.O., Botto J.M., Ferreiro M.E., Sanchez-Luque F.J., de Los Rios Barreda J., Rasmussen J., Rahman M.A., Fenlon L.R., Jansz N., Gubert C., et al. LINE-1 retrotransposons contribute to mouse PV interneuron development. Nat. Neurosci. 2024;27:1274–1284. doi: 10.1038/s41593-024-01650-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30.Jonsson M.E., Garza R., Johansson P.A., Jakobsson J. Transposable Elements: A Common Feature of Neurodevelopmental and Neurodegenerative Disorders. Trends Genet. 2020;36:610–623. doi: 10.1016/j.tig.2020.05.004. [DOI] [PubMed] [Google Scholar]
- 31.Rosser J.M., An W. L1 expression and regulation in humans and rodents. Front. Biosci. 2012;4:2203–2225. doi: 10.2741/537. [DOI] [Google Scholar]
- 32.Lanciano S., Cristofari G. Measuring and interpreting transposable element expression. Nat. Rev. Genet. 2020;21:721–736. doi: 10.1038/s41576-020-0251-y. [DOI] [PubMed] [Google Scholar]
- 33.Wissing S., Muñoz-Lopez M., Macia A., Yang Z., Montano M., Collins W., Garcia-Perez J.L., Moran J.V., Greene W.C. Reprogramming somatic cells into iPS cells activates LINE-1 retroelement mobility. Hum. Mol. Genet. 2012;21:208–218. doi: 10.1093/hmg/ddr455. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Klawitter S., Fuchs N.V., Upton K.R., Muñoz-Lopez M., Shukla R., Wang J., Garcia-Cañadas M., Lopez-Ruiz C., Gerhardt D.J., Sebe A., et al. Reprogramming triggers endogenous L1 and Alu retrotransposition in human induced pluripotent stem cells. Nat. Commun. 2016;7 doi: 10.1038/ncomms10286. [DOI] [Google Scholar]
- 35.Rodic N., Sharma R., Sharma R., Zampella J., Dai L., Taylor M.S., Hruban R.H., Iacobuzio-Donahue C.A., Maitra A., Torbenson M.S., et al. Long interspersed element-1 protein expression is a hallmark of many human cancers. Am. J. Pathol. 2014;184:1280–1286. doi: 10.1016/j.ajpath.2014.01.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Skene P.J., Henikoff S. An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife. 2017;6 doi: 10.7554/eLife.21856. [DOI] [Google Scholar]
- 37.Philippe C., Vargas-Landin D.B., Doucet A.J., van Essen D., Vera-Otarola J., Kuciak M., Corbin A., Nigumann P., Cristofari G. Activation of individual L1 retrotransposon instances is restricted to cell-type dependent permissive loci. eLife. 2016;5 doi: 10.7554/eLife.13926. [DOI] [Google Scholar]
- 38.Johansson P.A., Adami A., Jakobsson J. CRISPRi-mediated transcriptional silencing in iPSCs for the study of human brain development. STAR Protoc. 2022;3 doi: 10.1016/j.xpro.2022.101285. [DOI] [Google Scholar]
- 39.Gu Z., Liu Y., Zhang Y., Cao H., Lyu J., Wang X., Wylie A., Newkirk S.J., Jones A.E., Lee M., Jr., et al. Silencing of LINE-1 retrotransposons is a selective dependency of myeloid leukemia. Nat. Genet. 2021;53:672–682. doi: 10.1038/s41588-021-00829-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Jin Y., Tam O.H., Paniagua E., Hammell M. TEtranscripts: a package for including transposable elements in differential expression analysis of RNA-seq datasets. Bioinformatics. 2015;31:3593–3599. doi: 10.1093/bioinformatics/btv422. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Zhang J., Ataei L., Mittal K., Wu L., Caldwell L., Huynh L., Sarajideen S., Tse K., Simon M.M., Mazid M.A., et al. LINE1 and PRC2 control nucleolar organization and repression of the 8C state in human ESCs. Dev. Cell. 2025;60:186–203.e13. doi: 10.1016/j.devcel.2024.09.024. [DOI] [PubMed] [Google Scholar]
- 42.Li X., Bie L., Wang Y., Hong Y., Zhou Z., Fan Y., Yan X., Tao Y., Huang C., Zhang Y., et al. LINE-1 transcription activates long-range gene expression. Nat. Genet. 2024;56:1494–1502. doi: 10.1038/s41588-024-01789-5. [DOI] [PubMed] [Google Scholar]
- 43.Araki T., Kusakabe M., Nishida E. A transmembrane protein EIG121L is required for epidermal differentiation during early embryonic development. J. Biol. Chem. 2011;286:6760–6768. doi: 10.1074/jbc.M110.177907. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Verma D., Kapoor S., Kumari S., Sharma D., Singh J., Benjamin M., Bakhshi S., Seth R., Nayak B., Sharma A., et al. Decoding the genetic symphony: Profiling protein-coding and long noncoding RNA expression in T-acute lymphoblastic leukemia for clinical insights. PNAS Nexus. 2024;3 doi: 10.1093/pnasnexus/pgae011. [DOI] [Google Scholar]
- 45.Zhao L., Gao M., Zhao X. Verification of expression of LINC00648 in the serum of lung cancer patients by TCGA database. Cell. Mol. Biol. 2020;66:101–108. [Google Scholar]
- 46.Wu C.C., Brugeaud A., Seist R., Lin H.C., Yeh W.H., Petrillo M., Coppola G., Edge A.S.B., Stankovic K.M. Altered expression of genes regulating inflammation and synaptogenesis during regrowth of afferent neurons to cochlear hair cells. PLoS One. 2020;15 doi: 10.1371/journal.pone.0238578. [DOI] [Google Scholar]
- 47.Yang Y.R., Kang D.S., Lee C., Seok H., Follo M.Y., Cocco L., Suh P.G. Primary phospholipase C and brain disorders. Adv. Biol. Regul. 2016;61:80–85. doi: 10.1016/j.jbior.2015.11.003. [DOI] [PubMed] [Google Scholar]
- 48.Medelnik J.P., Roensch K., Okawa S., Del Sol A., Chara O., Mchedlishvili L., Tanaka E.M. Signaling-Dependent Control of Apical Membrane Size and Self-Renewal in Rosette-Stage Human Neuroepithelial Stem Cells. Stem Cell Rep. 2018;10:1751–1765. doi: 10.1016/j.stemcr.2018.04.018. [DOI] [Google Scholar]
- 49.Ando-Akatsuka Y., Yonemura S., Itoh M., Furuse M., Tsukita S. Differential behavior of E-cadherin and occludin in their colocalization with ZO-1 during the establishment of epithelial cell polarity. J. Cell. Physiol. 1999;179:115–125. doi: 10.1002/(SICI)1097-4652(199905)179:2<115::AID-JCP1>3.0.CO;2-T. [DOI] [PubMed] [Google Scholar]
- 50.Iurlaro M., von Meyenn F., Reik W. DNA methylation homeostasis in human and mouse development. Curr. Opin. Genet. Dev. 2017;43:101–109. doi: 10.1016/j.gde.2017.02.003. [DOI] [PubMed] [Google Scholar]
- 51.Freeman B., White T., Kaul T., Stow E.C., Baddoo M., Ungerleider N., Morales M., Yang H., Deharo D., Deininger P., Belancio V.P. Analysis of epigenetic features characteristic of L1 loci expressed in human cells. Nucleic Acids Res. 2022;50:1888–1907. doi: 10.1093/nar/gkac013. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Gerdes P., Chan D., Lundberg M., Sanchez-Luque F.J., Bodea G.O., Ewing A.D., Faulkner G.J., Richardson S.R. Locus-resolution analysis of L1 regulation and retrotransposition potential in mouse embryonic development. Genome Res. 2023;33:1465–1481. doi: 10.1101/gr.278003.123. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Jonsson M.E., Garza R., Sharma Y., Petri R., Södersten E., Johansson J.G., Johansson P.A., Atacho D.A., Pircs K., Madsen S., et al. Activation of endogenous retroviruses during brain development causes an inflammatory response. EMBO J. 2021;40 doi: 10.15252/embj.2020106423. [DOI] [Google Scholar]
- 54.Wang Y., Cameron E.G., Li J., Stiles T.L., Kritzer M.D., Lodhavia R., Hertz J., Nguyen T., Kapiloff M.S., Goldberg J.L. Muscle A-Kinase Anchoring Protein-alpha is an Injury-Specific Signaling Scaffold Required for Neurotrophic- and Cyclic Adenosine Monophosphate-Mediated Survival. EBioMedicine. 2015;2:1880–1887. doi: 10.1016/j.ebiom.2015.10.025. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Roussel-Gervais A., Sgroi S., Cambet Y., Lemeille S., Seredenina T., Krause K.H., Jaquet V. Genetic knockout of NTRK2 by CRISPR/Cas9 decreases neurogenesis and favors glial progenitors during differentiation of neural progenitor stem cells. Front. Cell. Neurosci. 2023;17 doi: 10.3389/fncel.2023.1289966. [DOI] [Google Scholar]
- 56.Zaman T., Helbig K.L., Clatot J., Thompson C.H., Kang S.K., Stouffs K., Jansen A.E., Verstraete L., Jacquinet A., Parrini E., et al. SCN3A-Related Neurodevelopmental Disorder: A Spectrum of Epilepsy and Brain Malformation. Ann. Neurol. 2020;88:348–362. doi: 10.1002/ana.25809. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Artegiani B., de Jesus Domingues A.M., Bragado Alonso S., Brandl E., Massalini S., Dahl A., Calegari F. Tox: a multifunctional transcription factor and novel regulator of mammalian corticogenesis. EMBO J. 2015;34:896–910. doi: 10.15252/embj.201490061. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Gray M., Nash K.R., Yao Y. Adenylyl cyclase 2 expression and function in neurological diseases. CNS Neurosci. Ther. 2024;30 doi: 10.1111/cns.14880. [DOI] [Google Scholar]
- 59.Tibbo A.J., Baillie G.S. Phosphodiesterase 4B: Master Regulator of Brain Signaling. Cells. 2020;9 doi: 10.3390/cells9051254. [DOI] [Google Scholar]
- 60.Rosspopoff O., Trono D. Take a walk on the KRAB side: (Trends in Genetics, 39:11 p:844-857, 2023) Trends Genet. 2024;40:203–205. doi: 10.1016/j.tig.2023.12.007. [DOI] [PubMed] [Google Scholar]
- 61.Jacobs F.M.J., Greenberg D., Nguyen N., Haeussler M., Ewing A.D., Katzman S., Paten B., Salama S.R., Haussler D. An evolutionary arms race between KRAB zinc-finger genes ZNF91/93 and SVA/L1 retrotransposons. Nature. 2014;516:242–245. doi: 10.1038/nature13760. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Liu N., Lee C.H., Swigut T., Grow E., Gu B., Bassik M.C., Wysocka J. Selective silencing of euchromatic L1s revealed by genome-wide screens for L1 regulators. Nature. 2018;553:228–232. doi: 10.1038/nature25179. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Pandiloski N., Horváth V., Karlsson O., Koutounidou S., Dorazehi F., Christoforidou G., Matas-Fuentes J., Gerdes P., Garza R., Jönsson M.E., et al. DNA methylation governs the sensitivity of repeats to restriction by the HUSH-MORC2 corepressor. Nat. Commun. 2024;15:7534. doi: 10.1038/s41467-024-50765-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Robbez-Masson L., Tie C.H.C., Conde L., Tunbak H., Husovsky C., Tchasovnikarova I.A., Timms R.T., Herrero J., Lehner P.J., Rowe H.M. The HUSH complex cooperates with TRIM28 to repress young retrotransposons and new genes. Genome Res. 2018;28:836–845. doi: 10.1101/gr.228171.117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 65.Ewing A.D., Smits N., Sanchez-Luque F.J., Faivre J., Brennan P.M., Richardson S.R., Cheetham S.W., Faulkner G.J. Nanopore Sequencing Enables Comprehensive Transposable Element Epigenomic Profiling. Mol. Cell. 2020;80:915–928.e5. doi: 10.1016/j.molcel.2020.10.024. [DOI] [PubMed] [Google Scholar]
- 66.Pitkanen E., Cajuso T., Katainen R., Kaasinen E., Välimäki N., Palin K., Taipale J., Aaltonen L.A., Kilpivaara O. Frequent L1 retrotranspositions originating from TTC28 in colorectal cancer. Oncotarget. 2014;5:853–859. doi: 10.18632/oncotarget.1781. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Perez-Rico Y.A.B.,A., Henao Misikova L., Mulugeta E., de Almeida S.F., Muotri A.R., Heard E., Gendrel A. Transcriptional perturbation of LINE-1 elements reveals their cis-regulatory potential. bioRxiv. 2024 doi: 10.1101/2024.02.20.581275. Preprint at. [DOI] [Google Scholar]
- 68.Kanton S., Boyle M.J., He Z., Santel M., Weigert A., Sanchís-Calleja F., Guijarro P., Sidow L., Fleck J.S., Han D., et al. Organoid single-cell genomic atlas uncovers human-specific features of brain development. Nature. 2019;574:418–422. doi: 10.1038/s41586-019-1654-9. [DOI] [PubMed] [Google Scholar]
- 69.Benito-Kwiecinski S., Giandomenico S.L., Sutcliffe M., Riis E.S., Freire-Pritchett P., Kelava I., Wunderlich S., Martin U., Wray G.A., McDole K., Lancaster M.A. An early cell shape transition drives evolutionary expansion of the human forebrain. Cell. 2021;184:2084–2102.e19. doi: 10.1016/j.cell.2021.02.050. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Marchetto M.C., Hrvoj-Mihic B., Kerman B.E., Yu D.X., Vadodaria K.C., Linker S.B., Narvaiza I., Santos R., Denli A.M., Mendes A.P., et al. Species-specific maturation profiles of human, chimpanzee and bonobo neural cells. eLife. 2019;8 doi: 10.7554/eLife.37527. [DOI] [Google Scholar]
- 71.Ewing A.D., Kazazian H.H., Jr. High-throughput sequencing reveals extensive variation in human-specific L1 content in individual human genomes. Genome Res. 2010;20:1262–1270. doi: 10.1101/gr.106419.110. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 72.Sudmant P.H., Rausch T., Gardner E.J., Handsaker R.E., Abyzov A., Huddleston J., Zhang Y., Ye K., Jun G., Fritz M.H.Y., et al. An integrated map of structural variation in 2,504 human genomes. Nature. 2015;526:75–81. doi: 10.1038/nature15394. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Perez-Riverol Y., Bai J., Bandla C., García-Seisdedos D., Hewapathirana S., Kamatchinathan S., Kundu D.J., Prakash A., Frericks-Zipper A., Eisenacher M., et al. The PRIDE database resources in 2022: a hub for mass spectrometry-based proteomics evidences. Nucleic Acids Res. 2022;50:D543–D552. doi: 10.1093/nar/gkab1038. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Johansson P.A., Brattås P.L., Douse C.H., Hsieh P., Adami A., Pontis J., Grassi D., Garza R., Sozzi E., Cataldo R., et al. A cis-acting structural variation at the ZNF558 locus controls a gene regulatory network in human brain development. Cell Stem Cell. 2022;29:52–69.e8. doi: 10.1016/j.stem.2021.09.008. [DOI] [PubMed] [Google Scholar]
- 75.Stuart T., Butler A., Hoffman P., Hafemeister C., Papalexi E., Mauck W.M., 3rd, Hao Y., Stoeckius M., Smibert P., Satija R. Comprehensive Integration of Single-Cell Data. Cell. 2019;177:1888–1902.e21. doi: 10.1016/j.cell.2019.05.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Dobin A., Davis C.A., Schlesinger F., Drenkow J., Zaleski C., Jha S., Batut P., Chaisson M., Gingeras T.R. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 78.Ramirez F., Dundar F., Diehl S., Gruning B.A., Manke T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 2014;42:W187–W191. doi: 10.1093/nar/gku365. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Liao Y., Smyth G.K., Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014;30:923–930. doi: 10.1093/bioinformatics/btt656. [DOI] [PubMed] [Google Scholar]
- 80.Zheng G.X.Y., Terry J.M., Belgrader P., Ryvkin P., Bent Z.W., Wilson R., Ziraldo S.B., Wheeler T.D., McDermott G.P., Zhu J., et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 2017;8 doi: 10.1038/ncomms14049. [DOI] [Google Scholar]
- 81.Love M.I., Huber W., Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9:357–359. doi: 10.1038/nmeth.1923. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–3100. doi: 10.1093/bioinformatics/bty191. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Quinlan A.R., Hall I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Heinz S., Benner C., Spann N., Bertolino E., Lin Y.C., Laslo P., Cheng J.X., Murre C., Singh H., Glass C.K. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell. 2010;38:576–589. doi: 10.1016/j.molcel.2010.05.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Grassi D.A., Brattås P.L., Jönsson M.E., Atacho D., Karlsson O., Nolbrant S., Parmar M., Jakobsson J. Profiling of lincRNAs in human pluripotent stem cell derived forebrain neural progenitor cells. Heliyon. 2020;6 doi: 10.1016/j.heliyon.2019.e03067. [DOI] [Google Scholar]
- 87.Nolbrant S., Heuer A., Parmar M., Kirkeby A. Generation of high-purity human ventral midbrain dopaminergic progenitors for in vitro maturation and intracerebral transplantation. Nat. Protoc. 2017;12:1962–1979. doi: 10.1038/nprot.2017.078. [DOI] [PubMed] [Google Scholar]
- 88.Skene P.J., Henikoff J.G., Henikoff S. Targeted in situ genome-wide profiling with high efficiency for low cell numbers. Nat. Protoc. 2018;13:1006–1019. doi: 10.1038/nprot.2018.015. [DOI] [PubMed] [Google Scholar]
- 89.Pedersen B.S., Quinlan A.R. Mosdepth: quick coverage calculation for genomes and exomes. Bioinformatics. 2018;34:867–868. doi: 10.1093/bioinformatics/btx699. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Cheetham S.W., Kindlova M., Ewing A.D. Methylartist: tools for visualizing modified bases from nanopore sequence data. Bioinformatics. 2022;38:3109–3112. doi: 10.1093/bioinformatics/btac292. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Zufferey R., Nagy D., Mandel R.J., Naldini L., Trono D. Multiply attenuated lentiviral vector achieves efficient gene delivery in vivo. Nat. Biotechnol. 1997;15:871–875. doi: 10.1038/nbt0997-871. [DOI] [PubMed] [Google Scholar]
- 92.Huang T., Choi M., Tzouros M., Golling S., Pandya N.J., Banfai B., Dunkley T., Vitek O. MSstatsTMT: Statistical Detection of Differentially Abundant Proteins in Experiments with Isobaric Labeling and Multiple Mixtures. Mol. Cell. Proteomics. 2020;19:1706–1723. doi: 10.1074/mcp.RA120.002105. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Sodersten E., Toskas K., Rraklli V., Tiklova K., Björklund A.K., Ringnér M., Perlmann T., Holmberg J. Author Correction: A comprehensive map coupling histone modifications with gene regulation in adult dopaminergic and serotonergic neurons. Nat. Commun. 2018;9:4639. doi: 10.1038/s41467-018-07154-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
-
•
All data needed to evaluate the conclusions in the paper are present in the paper and/or the supplemental information. The sequencing data presented in this study have been deposited at the GEO superseries GEO: GSE283600. Proteomic data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository73 with the identifier PRIDE: PXD059661 (https://www.ebi.ac.uk/pride/archive/projects/PXD059661). Original code has been deposited at GitHub and is publicly available at https://molecular-neurogenetics.github.io/truster/ (https://doi.org/10.5281/zenodo.14362613) and git@github.com:Molecular-Neurogenetics/L1CRISPR_Adami_2025.git (https://doi.org/10.5281/zenodo.15638389).
-
•
The data, code, protocols, and key lab materials used and generated in this study are listed in the key resources table available within the article and deposited to Zenodo at https://doi.org/10.5281/zenodo.16094977.
-
•
An earlier version of this article was posted to bioRxiv at https://doi.org/10.1101/2025.01.17.633315.






