Plant Communications. 2023 Sep 15;5(2):100717. doi: 10.1016/j.xplc.2023.100717

Single-cell transcriptome analysis dissects lncRNA-associated gene networks in Arabidopsis

Zhaohui He 1,7, Yangming Lan 1,7, Xinkai Zhou 1,7, Bianjiong Yu 1,7, Tao Zhu 1, Fa Yang 1, Liang-Yu Fu 1, Haoyu Chao 2, Jiahao Wang 3,4, Rong-Xu Feng 5, Shimin Zuo 3,4, Wenzhi Lan 1, Chunli Chen 6, Ming Chen 2,∗, Xue Zhao 1,∗∗, Keming Hu 3,4,∗∗∗, Dijun Chen 1,∗∗∗∗
PMCID: PMC10873878  PMID: 37715446

Abstract

The plant genome produces an extremely large collection of long noncoding RNAs (lncRNAs) that are generally expressed in a context-specific manner and have pivotal roles in regulation of diverse biological processes. Here, we mapped the transcriptional heterogeneity of lncRNAs and their associated gene regulatory networks at single-cell resolution. We generated a comprehensive cell atlas at the whole-organism level by integrative analysis of 28 published single-cell RNA sequencing (scRNA-seq) datasets from juvenile Arabidopsis seedlings. We then provided an in-depth analysis of cell-type-related lncRNA signatures that show expression patterns consistent with canonical protein-coding gene markers. We further demonstrated that the cell-type-specific expression of lncRNAs largely explains their tissue specificity. In addition, we predicted gene regulatory networks on the basis of motif enrichment and co-expression analysis of lncRNAs and mRNAs, and we identified putative transcription factors orchestrating cell-type-specific expression of lncRNAs. The analysis results are available at the single-cell-based plant lncRNA atlas database (scPLAD; https://biobigdata.nju.edu.cn/scPLAD/). Overall, this work demonstrates the power of integrative single-cell data analysis applied to plant lncRNA biology and provides fundamental insights into lncRNA expression specificity and associated gene regulation.

Key words: single-cell transcriptomics, long noncoding RNAs, lncRNAs, gene regulatory networks, GRNs, plants


This study presents the investigation of the diverse transcriptional patterns of lncRNAs and their associated gene networks on the basis of a comprehensive cell atlas (scPLAD) in Arabidopsis. The results show that the cell-type-specific expression of lncRNAs can explain their tissue specificity. In addition, putative transcription factors orchestrating cell-type-specific expression of lncRNAs have been identified.

Introduction

Plant genomes produce myriad noncoding RNAs (ncRNAs), among which the long ncRNAs (lncRNAs) are classically >200 nt in length and show no discernable coding potential (Liu et al., 2015; Yu et al., 2019). Emerging evidence has shown that lncRNAs play crucial roles throughout plant growth and development (Yu et al., 2019; Jha et al., 2020; Wierzbicki et al., 2021; Palos et al., 2023), participating in biological processes such as biotic/abiotic stress responses (Jha et al., 2020), leaf morphogenesis (Wu et al., 2013), stem elongation (Patil et al., 2019), tillering (Gou et al., 2017), and flowering (Heo and Sung, 2011; Shivaraj et al., 2018). In the past decade, high-throughput RNA sequencing (RNA-seq) technology and advanced bioinformatics tools have been used to predict thousands of lncRNAs in various plant species (Palos et al., 2023).

Despite the importance and abundance of plant lncRNAs, their functional analysis is still in its early stages because of their low copy number and poor sequence conservation compared with mRNAs in plant species. However, given their common transcriptional features and correlated expression patterns, it is likely that lncRNAs and protein-coding genes are co-regulated within the same gene networks or biological pathways (Guttman et al., 2009; Quinn and Chang, 2015; Statello et al., 2020). Gene regulatory network (GRN) analysis can be used to better understand the functions of lncRNAs, as it has been used extensively to decipher gene functions (Statello et al., 2020; Zhao et al., 2022). One such analytical tool is ncFANs, which was constructed on the basis of this concept. In the updated version, ncFANs version 2.0, the relationship between protein-coding genes and lncRNAs was calculated using a random-forest-based network (Liao et al., 2011; Zhang et al., 2021b).

Although lncRNAs exhibit tissue- or condition-specific expression (Liu et al., 2012), the heterogeneity of bulk RNA-seq data leads to averaging of lncRNA transcripts across different cell types, obscuring their specificity and their correlation with coding genes in GRNs constructed from bulk RNA-seq datasets. Recently, single-cell RNA-seq (scRNA-seq) has emerged as a powerful technology for investigating cell-type-specific expression and associated GRNs of lncRNAs (Shaw et al., 2021; Xu et al., 2022). Studies have shown that lncRNAs exhibit high cell-to-cell variability at the single-cell level, and some lncRNAs are expressed exclusively in specific cell types or during specific stages of cell growth (Zhao et al., 2022). However, most scRNA-seq methods amplify only the 3′ ends of transcripts, limiting their ability to predict lncRNAs de novo. Although the SMART-seq2 protocol (Picelli et al., 2014) can generate full-length cDNAs from polyadenylated transcripts and is useful for analyzing lncRNAs (Luo et al., 2021), it has much lower throughput and has not yet been widely used in plants (Lopez-Anido et al., 2021).

In the current study, we leveraged accurate lncRNA prediction from bulk RNA-seq data and the advantage of scRNA-seq for lncRNA quantification to produce a comprehensive single-cell expression atlas of lncRNAs in plants. For this, we developed a computational pipeline to analyze the expression specificity and cell-type-specific regulation of plant lncRNAs (Figure 1). We generated a cell-type-resolution lncRNA transcriptome atlas by comprehensive integration of 28 published scRNA-seq datasets from juvenile Arabidopsis seedlings. We identified an extensive list of lncRNA signatures that showed cell-type-specific expression patterns, which were confirmed by experimental validation. We showed that the tissue specificity of lncRNAs can be explained by their cell-type-specific expression. In addition, we used GRN analysis to explore the cell-type functions of lncRNAs at the single-cell level. Overall, our study not only provides a computational framework for expression analysis of lncRNAs but also offers a valuable resource for functional dissection of lncRNAs in plants. All analysis results from the study are available at the single-cell-based plant lncRNA atlas database (scPLAD; https://biobigdata.nju.edu.cn/scPLAD/).

Figure 1.

Workflow of the study

Results

Identification of cell-type-specific lncRNA signatures using single-cell transcriptome analysis

To unravel the lncRNA-associated GRNs that regulate cell-type-specific gene expression in plants, we constructed a catalog of 24 848 lncRNAs from recent studies (Okamoto et al., 2010; Liu et al., 2012; Wang et al., 2014a, 2014b; Di et al., 2014; Zhu et al., 2014; Li et al., 2016; Zhao et al., 2018; Szcześniak et al., 2019) (Supplemental Table 1) and compiled a comprehensive compendium of single-cell transcriptome data by integrating 28 published scRNA-seq datasets (Jean-Baptiste et al., 2019; Ryu et al., 2019; Zhang et al., 2019, 2021a; Liu et al., 2020; Wendrich et al., 2020; Farmer et al., 2021; Long et al., 2021; Lopez-Anido et al., 2021) from juvenile seedlings of Arabidopsis thaliana (Figure 2A; Supplemental Figures 1A and 1B; Table 1; Supplemental Table 2). After stringent data filtering for quality control, the integrated transcriptome atlas contained 186 030 high-quality single cells with 15 319 expressed lncRNAs (61.65% of annotated lncRNAs) and 26 363 protein-coding genes (PCs; 91.44% of annotated PCs) (Supplemental Figure 1C). We were able to annotate 14 distinct major cell types or states (including epidermis, columella, cortex, defense-related cells, endodermis, guard cells, lateral root cap, meristem, pericycle/bundle sheath [BS], phloem, photosynthetic cells, procambium, root hair, and xylem cells) from 44 cell clusters according to canonical protein-coding gene signatures (referred to as “canonical markers” hereafter) (Figure 2B; Supplemental Figure 2; and Supplemental Tables 3 and 4). Notably, some cells across various tissues can display shared gene expression characteristics. For instance, cell types like the root-specific pericycle and the aerial-part-specific BS have been found to exhibit similar gene expression patterns (Zhang et al., 2021a; Wang et al., 2021). Specifically, in our dataset, clusters 6 and 17 were predominantly made up of cortex cells; however, they included a noteworthy 35.7% of cells from aerial part samples. 
Similarly, clusters 19, 37, and 21, which were primarily composed of endodermis cells, surprisingly contained 26.7% of cells from aerial part samples. This similarity in gene expression patterns between cell clusters from different tissues underscores the complexity and interconnectedness of gene regulation across diverse cell types and tissues.

Figure 2.

Integrative analysis of single-cell transcriptome data in Arabidopsis seedlings to depict the expression specificity of mRNAs and lncRNAs

(A) Uniform manifold approximation and projection (UMAP) plot displaying an integrated cell map of 14 major cell types.

(B) Heatmap showing the expression patterns of top protein-coding marker genes (row) in each cell cluster (column). Cell clusters are colored according to their annotated cell types as in (A). Representative genes are shown on the right.

(C) Heatmap showing the expression patterns of top lncRNA signatures (row) in each cell cluster (column). Representative lncRNAs are labeled on the right.

(D) UMAP plots displaying the expression of representative protein-coding genes (top panel) and lncRNAs (bottom panel).

Table 1.

Published scRNA-seq data from Arabidopsis seedlings used in this study

Sample information | Number of datasets | Cell number (total/used) | Reference
6 days, root tip | 6 | 27 834/27 798 | Wendrich et al., 2020
5 days, root tip | 3 | 21 321/21 180 | Ryu et al., 2019
10 days, root tip | 1 | 14 332/13 514 | Zhang et al., 2019
5 days, leaf | 4 | 42 007/41 551 | Liu et al., 2020
7–8 days, whole root | 2 | 13 620/13 168 | Jean-Baptiste et al., 2019
7 days, whole root | 5 | 10 946/10 431 | Farmer et al., 2021
10 days, root tip | 1 | 1267/1206 | Long et al., 2021
7 days, shoot apex | 2 | 40 350/39 741 | Zhang et al., 2021a
10–12 days, seedling/leaf | 4 | 21 094/17 441 | Lopez-Anido et al., 2021
Total | 28 | 192 771/186 030 |

On the basis of both bulk and scRNA-seq data, lncRNAs generally exhibited greater specificity in expression than protein-coding genes (Supplemental Figure 3). We sought to identify cell-type-specific lncRNAs (“lncRNA signatures”) by comparing lncRNA expression among different cell types. In this manner, we identified 445 potential lncRNA signatures from the annotated cell types (Supplemental Table 5). The top 10 representative lncRNA signatures for different clusters (157 unique lncRNAs in total) are shown in Figure 2C. These lncRNA signatures showed overall patterns of cell-type expression specificity similar to those of canonical markers. For example, CASP5 is a well-known marker gene for the Casparian strip in root endodermal cells (Roppolo et al., 2011) and was expressed exclusively in endodermis in our data, as expected. AthLNC021492 showed an expression pattern similar to that of CASP5, suggesting that this lncRNA is a potential signature for endodermal cells. In short, the above analysis provides valuable candidate lncRNAs for further functional investigations.

Cell-type-specific expression of lncRNAs explains their tissue specificity

To estimate the biological relevance of cell types and gene signatures identified on the basis of scRNA-seq data, we used CIBERSORT (Newman et al., 2015) to deconvolve the abundance of annotated cell types in bulk transcriptome data. This deconvolution process involves regression analysis on a reference expression matrix constructed from a predefined set of marker genes. To validate this approach, we applied it to cell-specific bulk RNA-seq datasets (n = 60) obtained from Arabidopsis root tissues (Li et al., 2016) (Supplemental Table 6). As anticipated, the distribution of cell-type proportions in bulk samples aligned with the expected cell types (Figure 3A). For instance, the predominant cell type within the cortex bulk sample was indeed cortex cells, whereas guard and photosynthetic cells were conspicuously scarce across all root samples. These observations indicate that the deconvolution approach can accurately estimate cell type abundances within bulk samples.
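The regression idea behind cell-type deconvolution can be illustrated with a minimal sketch. CIBERSORT itself fits the mixture by support vector regression; the example below swaps in a simpler non-negative least-squares fit solved by projected gradient descent. The signature matrix, the three cell types, and the bulk profile are invented toy values, not data from this study.

```python
def deconvolve(signature, bulk, n_iter=2000, lr=0.005):
    """Estimate non-negative cell-type fractions f so that signature * f ~ bulk.

    signature: list of gene rows, each a list of per-cell-type reference expression.
    bulk: list of expression values (same gene order) for one bulk sample.
    NOTE: simplified stand-in for CIBERSORT; lr must be small relative to the
    scale of the signature matrix for the iteration to converge.
    """
    n_genes, n_types = len(signature), len(signature[0])
    fractions = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        # Residual between the reconstructed mixture and the observed bulk profile.
        residual = [
            sum(signature[g][t] * fractions[t] for t in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        # Gradient of 0.5 * ||residual||^2 with respect to each fraction.
        gradient = [
            sum(signature[g][t] * residual[g] for g in range(n_genes))
            for t in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        fractions = [max(0.0, fractions[t] - lr * gradient[t]) for t in range(n_types)]
    total = sum(fractions)
    return [f / total for f in fractions] if total > 0 else fractions


# Hypothetical reference: 4 marker genes x 3 cell types.
CELL_TYPES = ["cortex", "xylem", "guard cell"]
SIGNATURE = [
    [10.0, 1.0, 0.0],
    [1.0, 8.0, 0.0],
    [0.0, 1.0, 9.0],
    [2.0, 2.0, 2.0],
]
# Toy bulk sample built as 70% cortex + 30% xylem + 0% guard cell.
BULK = [7.3, 3.1, 0.3, 2.0]

estimated = dict(zip(CELL_TYPES, deconvolve(SIGNATURE, BULK)))
print(estimated)
```

In this toy root-like sample the guard-cell fraction is driven to zero, mirroring the observation that guard and photosynthetic cells were scarce across root samples.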

Figure 3.

Cell-type deconvolution of bulk RNA-seq data reveals tissue-specific expression of lncRNAs

(A) Heatmap representing the composition of cell types inferred from cell-specific bulk RNA-seq data (n = 60) using CIBERSORT. Columns represent 60 bulk RNA-seq datasets obtained from Arabidopsis root tissues. Rows represent the 14 major cell types identified from the scRNA-seq data in this study. The bar diagram (right) shows the sum of values for each row in the heatmap.

(B) Heatmap showing different types of tissues with distinct cell-type compositions inferred from bulk RNA-seq data by CIBERSORT. Columns represent 95 bulk RNA-seq datasets collected from different tissues.

(C) Relative expression of protein-coding marker genes for different cell types (the x axis) in bulk RNA-seq data from different tissues (with distinct color codes). For each cell type, the top 20 genes with the highest expression values were selected. Each point on the boxplot represents the expression level of these selected genes in the bulk RNA-seq data.

(D) lncRNA signatures from different cell types, presented as in (C).

We next investigated the cell-type composition of 95 bulk RNA-seq datasets from various major tissues at the juvenile growth stage (Supplemental Figure 4 and Supplemental Table 7). Outcomes of the cell-type deconvolution analysis revealed distinct abundance patterns across various tissues. Notably, significant variations in the prevalence of cell types were observed among different tissue types (Figure 3B). Cell types specific to roots (such as endodermis, root hair, columella, and lateral root cap) and shoots (such as guard cells and photosynthetic cells) predominantly corresponded to samples from their respective tissue regions. Moreover, cell types common to both roots and shoots (including epidermis, pericycle/BS, phloem, and xylem) exhibited a pronounced representation in samples from both tissue parts. Of particular interest, photosynthetic, columella, and meristem cells displayed relatively higher proportions and notable discrepancies across the analyzed samples. Specifically, photosynthetic cells were predominantly associated with samples derived from shoot-related sources, consistent with their prominent representation in scRNA-seq data from shoots (Figure 3B). Columella cells were highly deconvolved in roots, whereas meristem cells were overrepresented in tissues with high activation of cell proliferation, such as root tips and shoot apexes (Figure 3B).

We observed that cell-type-specific mRNA markers and lncRNA signatures from photosynthetic, columella, and meristem cells were highly expressed in the corresponding tissues enriched for these cells (Figure 3C and 3D). For less prominently represented cell types as determined by the deconvolution analysis, expression profiles of both mRNA markers and lncRNA signatures exhibited comparatively weak discrimination among different tissue types (Supplemental Figure 4). Nonetheless, it is important to emphasize that, compared with mRNA markers, the cell-type-specific expression of lncRNA signatures exhibited a relatively lower degree of tissue specificity (Figure 3C and 3D). Taken together, findings from the deconvolution analysis suggest that previously recognized tissue-specific expression of lncRNAs is partially attributed to their cell-type-specific expression.

Constructing lncRNA-associated gene networks at the cell-type level

Transcription factors (TFs) orchestrate the expression of both protein-coding and noncoding genes (e.g., lncRNAs) in a cell-specific manner. Therefore, cell identity and function can be partially depicted by the expression of TFs and their target genes in a scenario where they form co-expression GRNs. To map cell-type-specific GRNs including both protein-coding genes and lncRNAs, we predicted GRNs on the basis of co-expression analysis and TF motif enrichment using SCENIC (Aibar et al., 2017). In brief, SCENIC aims to identify gene sets that are both co-expressed with TFs and enriched in the corresponding TF motifs. SCENIC then uses the AUCell algorithm to score the activity of the discovered regulons (each comprising a TF and its co-expressed target genes) in individual cells. Some TFs may be involved in multiple regulons. In total, we identified 184 representative regulons governed by 141 distinct TFs across all cell types (Figure 4A; Supplemental Figure 5), and regulons consisting of TFs and their target genes (including PCs and lncRNAs) were visualized as integrated networks (consisting of multiple regulons) in a cell-type-specific manner (Supplemental Figure 6). As expected, the regulon activity and expression level of target genes colocalized over the single-cell map (Supplemental Figure 7).

Figure 4.

Construction and evaluation of lncRNA-associated gene networks at the cell-type level

(A) Networks showing representative regulons for different cell types. Large nodes denote distinct cell types; small nodes represent TF regulons whose protein-coding and lncRNA targets are shown in pie charts.

(B) Clustering dendrogram of 60 samples for co-expression network analysis using WGCNA, resulting in 68 module eigengenes (MEs; so-called gene modules) with assigned module colors.

(C) Chord diagram depicting the concurrence of genes between gene modules (n = 68) and cell types (n = 14). The linked lines represent the number of common edges between the WGCNA network and the predicted scRNA-seq GRN, with the line colors corresponding to gene modules.

(D) SCENIC-predicted regulons (Obs.) in xylem show significantly higher weighted correlations derived from WGCNA than random controls (CK) for both lncRNA-related regulons and protein-coding-gene-related regulons. As controls, transcription factor (TF) and gene pairs were randomly generated from the same reference gene list in the xylem (repeated 1000 times). See also Supplemental Figures 9 and 10.

(E) Heatmap displaying the expression patterns of 434 genes from the “blue” gene module.

(F) Gene Ontology (GO) enrichment analysis of protein-coding genes (n = 331) from the “blue” gene module.

To verify the confidence of inferred cell-type-specific GRNs from scRNA-seq data, we used bulk RNA-seq data (n = 60 samples) collected from individual cell types and developmental zones of the Arabidopsis root (Li et al., 2016) to construct co-expression networks between mRNAs and lncRNAs using WGCNA (Langfelder and Horvath, 2008). Using this approach, we identified 68 gene modules (Figure 4B). We compared the gene modules identified from bulk RNA-seq data with cell-type-specific GRNs identified from scRNA-seq data, and we observed consistent mapping between them (Figure 4C). In addition, the TF–mRNA or TF–lncRNA regulatory pairs inferred by SCENIC were quantified by weighted correlation coefficients obtained from the WGCNA co-expression network analysis. This analysis revealed that the inferred regulatory relationship (derived from scRNA-seq data) is significantly better than expected in terms of co-expression weight (calculated on the basis of bulk RNA-seq data), not only for protein-coding genes but also for lncRNAs (Figure 4D). These findings strongly indicate that the inferred regulatory relationships demonstrate robustness at the level of individual cells and across various tissues, confirming the high confidence of the inferred cell-type-specific GRNs. Notably, the “blue” gene module closely resembled the xylem-specific GRN, with genes (including both protein-coding genes and lncRNAs) from this module specifically expressed in xylem samples (Figure 4C and 4E). Accordingly, functional analysis using protein-coding genes from this module revealed an enrichment in biological pathways related to “cell wall biogenesis” and “xylem development” (Figure 4F). In short, these cell-type-specific GRNs provide entry points for investigating the functions of associated lncRNAs.
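The comparison of observed regulon pairs with random controls can be sketched as follows. The statistical test behind Figure 4D is not specified in this section, so the empirical P value below is an assumption chosen for illustration; the weight lookup, TF names, and gene names are likewise hypothetical placeholders for WGCNA-derived co-expression weights.

```python
import random


def empirical_p(weight, observed_pairs, tfs, genes, n_perm=1000, seed=0):
    """Compare the mean co-expression weight of observed TF-target pairs with
    the means of randomly drawn TF-gene pair sets of the same size.

    weight: function (tf, gene) -> co-expression weight (e.g., from a bulk network).
    Returns (observed_mean, empirical_p_value).
    """
    rng = random.Random(seed)

    def mean_weight(pairs):
        return sum(weight(tf, gene) for tf, gene in pairs) / len(pairs)

    observed = mean_weight(observed_pairs)
    n_as_extreme = 0
    for _ in range(n_perm):
        # Random control: same number of pairs, drawn from the same reference lists.
        control = [(rng.choice(tfs), rng.choice(genes)) for _ in observed_pairs]
        if mean_weight(control) >= observed:
            n_as_extreme += 1
    # Add-one correction so the P value is never exactly zero.
    return observed, (n_as_extreme + 1) / (n_perm + 1)


# Hypothetical weights: predicted regulon pairs are strongly co-expressed in bulk data.
TFS = ["TF1", "TF2"]
GENES = ["gene%d" % i for i in range(1, 21)]
PREDICTED = [("TF1", "gene1"), ("TF1", "gene2"), ("TF2", "gene3")]
HIGH = set(PREDICTED)


def toy_weight(tf, gene):
    return 0.9 if (tf, gene) in HIGH else 0.05


observed_mean, p_value = empirical_p(toy_weight, PREDICTED, TFS, GENES)
print(observed_mean, p_value)
```

Drawing the control pairs from the same reference gene list as the observed pairs keeps the comparison restricted to genes that could have been selected in the first place.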

Analysis of cell-type-specific GRNs

Using the inferred GRNs, the functions of lncRNAs can be inferred from those of linked genes with known functions in the networks. We were therefore able to infer the potential roles of lncRNAs in cell-specific functions on the basis of their co-expression patterns with PCs when lncRNAs and PCs were regulated by a common set of upstream regulators. In xylem cells, our analysis identified the regulons of ANAC007 (VND7) and ANAC030 (VND4) (Figure 5A and 5B)—NAC-domain TFs necessary for xylem formation (Kubo et al., 2005)—and xylem-specific regulons of MYB103, MYB46, and ANAC073 (SND2)—TFs that regulate secondary wall biosynthesis in xylem tissues (Ohman et al., 2013; Zhong et al., 2021; Nookaraju et al., 2022). The predicted target (protein-coding) genes of these regulons, including WOX14 (Denis et al., 2017) and ANAC075 (SND4) (Zhong et al., 2021), were functionally enriched in biological pathways related to “xylem development” (Figure 5C). Notably, we observed that a number of lncRNAs such as AthLNC018000 and AthLNC011193 (Figure 5A and 5B) were co-expressed with these protein-coding target genes and regulated by the same xylem-specific regulators. These xylem-specific target genes (including both lncRNAs and protein-coding genes) were generally highly expressed in the xylem bulk RNA-seq datasets (Figure 5D). We also confirmed the xylem-specific expression patterns of these lncRNAs in additional single-cell transcriptome datasets available at scPlantDB (He et al., 2023) (Figure 5E; Supplemental Figure 8). To validate our computational analysis, we performed in situ hybridization on Arabidopsis seedlings and found that AthLNC018000 and AthLNC011193 were expressed exclusively in cells of the vascular tissue (Figure 5F and Supplemental Figure 9). Collectively, these findings suggest that the identified xylem-specific lncRNAs may have potential functions in xylem development.

Figure 5.

Analysis of xylem-specific gene regulatory networks

(A) UMAP plots displaying the regulatory activity (AUCell score) of selected xylem-specific regulons (top) and gene expression patterns of regulon targets, including protein-coding genes (middle) and lncRNAs (bottom).

(B) Subnetwork view of the xylem-specific regulon of target genes.

(C) Gene Ontology (GO) enrichment analysis of target genes of the xylem-specific regulons.

(D) Heatmap showing the expression patterns of target genes (lncRNAs and mRNAs) from the xylem network in cell-specific bulk RNA-seq data (n = 60).

(E) Boxplot depicting the expression levels of AthLNC011193 and AthLNC018000 in various cell types across 10 well-annotated single-cell transcriptome datasets available at scPlantDB. These scPlantDB datasets are identified as follows: SRP267870, SRP285817, SRP330542, SRP394711, CRA002977_1, ERP132245, SRP173393, SRP320285, SRP338044, and SRP398011.

(F) Localization of fluorescein isothiocyanate (FITC)-labeled lncRNAs in the Arabidopsis leaf was visualized using a confocal microscope. Expression of AthLNC018000 and AthLNC011193 is shown. v, vein; m, mesophyll; x, xylem. Scale bars, 50 μm.

We also provided additional examples of cell-type-specific GRNs (Supplemental Figures 10 and 11). In the case of columella-specific GRNs, we observed that certain lncRNA-associated regulons, which include AT1G04353, AthLNC010807, and AthLNC009197, exhibit high activity in columella cells. The co-expressed protein-coding genes in these regulons are functionally enriched in biological processes related to “rhythmic process” and “gibberellin mediated signaling pathway” (Supplemental Figure 10). Among root-hair-specific regulons, we found that AthLNC025037 and AthLNC019865 are regulated by root-hair-specific regulators such as RAP2.11, which has previously been linked to root hair developmental processes (Kim et al., 2012) (Supplemental Figure 11). This finding implies that the identified lncRNAs may play a role in GRNs that govern root hair development. Taken together, our findings offer valuable insights for studying the functions of coding and ncRNAs at the cellular level.

scPLAD enables online exploration of lncRNA expression and gene networks at the single-cell level

To help plant scientists fully explore our results, we developed an integrated, web-based, user-friendly platform: single-cell-based plant lncRNA atlas database (scPLAD; https://biobigdata.nju.edu.cn/scPLAD/; Supplemental Figure 12), which enables users to quickly retrieve the expression and regulatory networks of genes of interest (PCs and/or lncRNAs) from an integrated cell map. In the future, we plan to update scPLAD by including more single-cell omics datasets and new plant species.

Discussion

Since the identification of the first plant lncRNA, enod40 in Medicago plants, in 1994 (Crespi et al., 1994), many plant lncRNAs have been reported as crucial regulators involved in plant development and responses to abiotic and biotic stresses (Nejat and Mantri, 2018; Wierzbicki et al., 2021; Palos et al., 2023). Despite numerous studies revealing tissue-specific and even cell-specific expression patterns of lncRNAs in mammals, our understanding of the cell-type expression patterns and functional characterization of the majority of plant lncRNAs remains limited (Flynn and Chang, 2014; Gloss and Dinger, 2016; Mattick et al., 2023; Palos et al., 2023). Nevertheless, lncRNAs have been found to have a higher tissue specificity than protein-coding genes in plants (Li et al., 2014; Zhang et al., 2014; Wang et al., 2015). Emerging single-cell techniques, which are designed to decode the heterogeneity among a group of cells, represent very promising methods for lncRNA studies. In fact, scRNA-seq has helped to identify lncRNAs specific to radial glia cells in the neocortex (Liu et al., 2016), further demonstrating that undersampled lncRNAs can be captured in single-cell analyses.

However, because of inherent limitations (such as drop-out events) of current scRNA-seq techniques, cell-specific lncRNA transcripts are hard to distinguish from technical noise in a single cell. In this work, we assumed that if a lncRNA was occasionally expressed in multiple cells from the same cell type or state, albeit at a relatively low level, it could be a reliable transcript in single-cell analysis. As a proof of concept, 28 published scRNA-seq datasets collected from different tissues of Arabidopsis seedlings were integrated to annotate a comprehensive cell map. Using a similar strategy of cell-type-specific marker identification for protein-coding genes, we identified hundreds of potential lncRNA signatures in a cell-type-specific manner. Notably, we observed that meristem-associated lncRNA signatures were generally highly expressed in all tested tissues compared with other lncRNA signatures in the deconvolution analysis (Figure 3D), consistent with the fact that lncRNAs are known to play important roles in differentiation in both plants and animals (Mattick et al., 2023). The possibility of integrating scRNA-seq data is becoming increasingly viable for plant lncRNA research, especially as comprehensive single-cell transcriptome analysis becomes achievable for individual laboratories.

Functional annotation of lncRNAs remains a major challenge because of their modest effects on an organism’s phenotype, which make genetic screening an inefficient method for identification of these transcripts. As an alternative, computational prediction of lncRNA functions can be used to identify candidate lncRNAs for further experimental validation. In this study, we used a network-based approach to predict putative functions of lncRNAs based on their co-expression with protein-coding genes. In contrast to co-expression analysis using bulk RNA-seq data, single-cell co-expression analysis offers several distinct advantages. First, construction of cell-type-specific GRNs is a superior approach to bulk RNA-seq because it overcomes the limitations of averaging transcripts from multiple cell types. This is particularly important for lncRNAs that are exclusively expressed in specific cell types or cellular states. Second, inferring co-expression networks on the basis of bulk RNA-seq data requires a large number of samples to ensure robustness. Conversely, in single-cell analysis, the individuality of each cell permits it to be treated as an independent “sample,” rendering the inference of co-expression networks feasible within a single scRNA-seq dataset. Using SCENIC, we identified 184 cell-type-specific regulons, each containing upstream TFs and target genes of both lncRNAs and mRNAs. Notably, certain regulons were found to regulate more lncRNAs than protein-coding genes in specific cell types, such as the bZIP48 regulon in pericycle/BS cells and the HSFA1E regulon in lateral root cap cells (Figure 5B). These findings suggest potential functional associations between lncRNAs and TFs in specific cellular contexts. However, it is important to note that these computational methods for predicting lncRNA functions are not perfect and may have limitations. Therefore, experimental validation is necessary to confirm the accuracy and relevance of the functional predictions. 
This can be achieved through techniques such as CRISPR-Cas9 gene editing, RNAi, and overexpression studies.

In conclusion, our analyses deepen our understanding of the expression and regulation of Arabidopsis lncRNAs at a single-cell resolution and offer valuable resources for functional dissection of lncRNAs in plants. We anticipate that our study may provide the groundwork for future studies on functional mechanisms of lncRNAs in plant growth and development.

Methods

Plant materials and growth conditions

A. thaliana (ecotype Col-0) was used as the experimental material. Prior to germination on 1/2 Murashige and Skoog (MS) agar plates, the seeds were sterilized by three immersions in 75% alcohol for three minutes each. The plants were then grown at 21°C under a long-day photoperiod consisting of 16 h of light and 8 h of darkness.

In situ hybridization

Frozen sections were prepared from 10-day-old seedlings as described in a recent study (Solanki et al., 2020) for fluorescence in situ hybridization (FISH). We used custom-designed oligonucleotide probes specific to the target lncRNAs of interest. To visualize the probes, sections were incubated with rabbit anti-biotin (Solarbio) for biotin-labeled probes and goat anti-rabbit IgG/fluorescein isothiocyanate (FITC) (Solarbio) to label fluorescence (Hu et al., 2021). After washing off unbound probes, the sections were imaged under a fluorescence microscope with laser excitation for FITC at 488 nm. The biotin-labeled RNA probes were synthesized by Beijing Tsingke Biotech Co., Ltd. Primers are detailed in Supplemental Table 8.

LncRNA reference database

Droplet-based methods, like the 10x approach (Zheng et al., 2017), focus primarily on sequencing either the 3′ or 5′ termini of transcripts, significantly limiting the ability to predict lncRNAs. Because Arabidopsis lncRNAs have been comprehensively predicted on the basis of bulk RNA-seq data under various conditions, we preferred to build a reference lncRNA database based on existing studies. We therefore collected a comprehensive list of predicted lncRNAs from recent studies (Okamoto et al., 2010; Liu et al., 2012; Wang et al., 2014a, 2014b; Di et al., 2014; Zhu et al., 2014; Li et al., 2016; Zhao et al., 2018; Szcześniak et al., 2019) and merged it with well-annotated lncRNAs from the TAIR database (https://www.arabidopsis.org). We used the following criteria to filter the lncRNAs. First, we retained all Araport11 lncRNAs, as these lncRNAs have been annotated using stringent criteria. Second, we excluded predicted lncRNAs that overlapped, on the same strand, with annotated genes (both protein-coding genes and lncRNAs) from the Araport11 database. Third, any overlapping lncRNAs from different sources were merged into a single lncRNA locus and assigned a specific ID (starting with AthLNC, as shown in Supplemental Table 1). In addition, we excluded intronic lncRNAs and sense lncRNAs from our analysis, as their quantification could be influenced by their host genes. As a result, our final lncRNA list comprised 12 693 long intergenic ncRNAs (lincRNAs) and 12 155 natural antisense lncRNAs (NAT lncRNAs), which were used as the lncRNA reference in this study.
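The second and third filtering criteria amount to interval operations, which can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the coordinates are invented, the "LNC" ID prefix is a placeholder, and treating merging as strand-aware with closed (inclusive) coordinates is an assumption.

```python
def overlaps_same_strand(candidate, annotated_genes):
    """True if a candidate (chrom, strand, start, end) overlaps any annotated
    gene on the same chromosome and strand (closed coordinates)."""
    chrom, strand, start, end = candidate
    return any(
        g_chrom == chrom and g_strand == strand and start <= g_end and g_start <= end
        for g_chrom, g_strand, g_start, g_end in annotated_genes
    )


def merge_loci(intervals, prefix="LNC"):
    """Merge overlapping same-strand intervals into loci and assign sequential IDs."""
    merged = []
    # Sorting groups intervals by chromosome and strand, then orders them by start.
    for chrom, strand, start, end in sorted(intervals):
        last = merged[-1] if merged else None
        if last and last[0] == chrom and last[1] == strand and start <= last[3]:
            last[3] = max(last[3], end)  # extend the current locus
        else:
            merged.append([chrom, strand, start, end])
    return [("%s%06d" % (prefix, i), *locus) for i, locus in enumerate(merged, 1)]


# Hypothetical predictions pooled from several sources.
CANDIDATES = [
    ("Chr1", "+", 100, 500),
    ("Chr1", "+", 400, 900),    # overlaps the previous prediction -> one locus
    ("Chr1", "-", 450, 600),    # opposite strand -> kept separate
    ("Chr2", "+", 100, 200),
    ("Chr1", "+", 2000, 2500),  # overlaps an annotated gene -> excluded
]
ANNOTATED = [("Chr1", "+", 2400, 3000)]

kept = [c for c in CANDIDATES if not overlaps_same_strand(c, ANNOTATED)]
loci = merge_loci(kept)
for locus in loci:
    print(locus)
```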

scRNA-seq and bulk RNA-seq data collection

Arabidopsis scRNA-seq datasets (n = 28; Supplemental Table 2) from juvenile seedlings were collected from previous studies (Jean-Baptiste et al., 2019; Ryu et al., 2019; Zhang et al., 2019, 2021a; Liu et al., 2020; Wendrich et al., 2020; Farmer et al., 2021; Long et al., 2021; Lopez-Anido et al., 2021). To analyze these scRNA-seq data in a uniform manner, we downloaded the original fastq files from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) (https://www.ncbi.nlm.nih.gov/sra/) or CNCB NGDC (https://ngdc.cncb.ac.cn/) databases. Bulk RNA-seq datasets (n = 95; Supplemental Table 7) from juvenile Arabidopsis seedlings were also collected for cell-type deconvolution analysis.

RNA-seq data processing

RNA-seq reads were mapped to the A. thaliana reference genome (TAIR10) using STAR (version 020201) (Dobin et al., 2013). Expression levels (FPKM [fragments per kilobase of transcript per million mapped reads]) of all annotated protein-coding genes and lncRNAs were estimated with RSEM (version 1.2.22) (Li and Dewey, 2011). A pseudo-value of 1e−6 was added to the RSEM-derived FPKM values (to avoid zeros), and the resulting values were log2 transformed. A gene was considered to be expressed only if its estimated FPKM value was >0.1 in at least one sample. Reproducibility of RNA-seq experiments was evaluated using Spearman’s correlation analysis (Supplemental Figure 1F).
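As a minimal sketch of the transformation and expression filter described above, the following uses the two constants given in the text (pseudo-value 1e−6 and FPKM threshold 0.1). The function names and the dictionary layout of the FPKM table are assumptions for illustration, not the analysis code of this study.

```python
import math

PSEUDO_VALUE = 1e-6  # added before the log transform to avoid log2(0)
MIN_FPKM = 0.1       # a gene must exceed this in at least one sample


def log2_fpkm(fpkm):
    """Log2-transform an FPKM value after adding the pseudo-value."""
    return math.log2(fpkm + PSEUDO_VALUE)


def expressed_genes(fpkm_table):
    """Return the genes with FPKM > MIN_FPKM in at least one sample.

    fpkm_table: dict mapping gene ID -> list of FPKM values across samples
    (a hypothetical layout chosen for this sketch).
    """
    return {
        gene
        for gene, values in fpkm_table.items()
        if any(v > MIN_FPKM for v in values)
    }


# Hypothetical gene IDs and values.
table = {"geneA": [0.0, 0.05], "geneB": [0.0, 0.2]}
kept = expressed_genes(table)
```

The filter is applied to untransformed FPKM values, so the threshold of 0.1 is not affected by the pseudo-value.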

scRNA-seq data processing

Raw scRNA-seq data in fastq format were aligned to the reference genome (TAIR10) and subjected to barcode assignment and unique molecular identifier (UMI) counting using the CellRanger version 3.1.0 pipeline (10x Genomics). 10x Single Cell gene expression libraries are strand specific, and CellRanger counts only sense-strand reads for both mRNAs and lncRNAs. The quantification of a lncRNA is therefore not influenced by the expression of an mRNA, even if the two originate from the same genomic position. The counting process specifically excludes UMIs from antisense reads, ensuring reliable and independent quantification of lncRNA expression levels. Filtered count matrices (identical criteria for both mRNAs and lncRNAs) from the CellRanger pipeline were further processed using the Seurat package (version 4.0.0) (Hao et al., 2021). Cells that expressed fewer than 200 genes or in which mitochondrial genes accounted for more than 10% of UMI counts were removed from the analysis. The top 3000 most variably expressed genes were determined using the “vst” method in the “FindVariableFeatures” function and scaled using “ScaleData” with regression on the proportion of mitochondrial UMIs (mt.percent).
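The per-cell quality filter can be sketched in a few lines. This is an illustration of the two thresholds stated above and not the Seurat code used in the study. The function name, the per-cell dictionary layout, and the use of the `ATMG` locus prefix to recognize mitochondrial genes are assumptions; the text does not state how mitochondrial genes were identified.

```python
def passes_qc(counts, mito_prefix="ATMG", min_genes=200, max_mito_pct=10.0):
    """Decide whether one cell passes the quality filter (hypothetical helper).

    counts: dict mapping gene ID -> UMI count for a single cell.
    """
    total = sum(counts.values())
    if total == 0:
        return False
    # Number of genes with at least one UMI in this cell.
    n_detected = sum(1 for c in counts.values() if c > 0)
    # Percentage of UMIs assigned to mitochondrial genes.
    mito = sum(c for gene, c in counts.items() if gene.startswith(mito_prefix))
    mito_pct = 100.0 * mito / total
    return n_detected >= min_genes and mito_pct <= max_mito_pct


# Hypothetical cells for illustration.
good = {f"AT1G{i:05d}": 1 for i in range(250)}
good["ATMG00010"] = 10                   # about 3.8% mitochondrial UMIs
high_mito = dict(good, ATMG00010=100)    # about 28.6% mitochondrial UMIs
sparse = {f"AT1G{i:05d}": 1 for i in range(100)}
```

A cell is removed if it fails either criterion, so the function returns True only when both conditions hold.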

We used a “two-stepwise strategy” for scRNA-seq data integration. First, different scRNA-seq datasets from a specific study were aligned with canonical correlation analysis (CCA). We then used the robust principal-component analysis (RPCA) method in Seurat to integrate scRNA-seq data from different studies.

For visualization, the “RunPCA” function was used to compute the top 30 principal components using the top variably expressed genes. UMAP (uniform manifold approximation and projection) was used to visualize cell clusters. Clustering was performed for integrated expression values on the basis of shared-nearest-neighbor (SNN) graph clustering (Louvain community detection-based method) using “FindClusters” with a resolution of 0.8. The “FindAllMarkers” function was used with default parameters to identify markers for each cluster. Cell types were annotated on the basis of known marker genes. Lastly, the “FindAllMarkers” function was run again for each cell type. Top protein-coding marker genes and lncRNAs for each cluster are provided in Supplemental Tables 4 and 5.

Measuring tissue specificity with entropy

We assessed tissue specificity for both lncRNAs and mRNAs in our study using entropy-based metrics on both bulk RNA-seq and scRNA-seq datasets. Specifically, the gene specificity score was defined as $1 - \mathrm{entropy}/\log_2 n$, where n represents the number of samples in bulk RNA-seq data (n = 95) or the number of cell types in scRNA-seq data (n = 14) (Schug et al., 2005).
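A minimal sketch of this score follows, assuming that the entropy is the Shannon entropy (base 2) of a gene's expression profile after normalization to relative abundances, as in Schug et al. (2005). The function name and the handling of all-zero profiles are choices made for this sketch.

```python
import math


def specificity_score(expr):
    """Compute 1 - H / log2(n) for one gene.

    expr: non-negative expression values on a linear scale, one per sample
    (bulk RNA-seq) or per cell type (scRNA-seq).
    """
    n = len(expr)
    total = sum(expr)
    if n < 2 or total <= 0:
        return float("nan")  # score is undefined for unexpressed genes
    # Relative abundance across samples; zero terms contribute nothing to H.
    probs = [x / total for x in expr if x > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n)


restricted = specificity_score([5.0, 0.0, 0.0, 0.0])  # one cell type only
uniform = specificity_score([1.0, 1.0, 1.0, 1.0])     # equal in all cell types
```

The score ranges from 0 for a gene expressed equally everywhere to 1 for a gene detected in a single sample or cell type, which makes values comparable between the bulk (n = 95) and single-cell (n = 14) settings.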

Cell-type deconvolution

We generated a signature matrix consisting of all annotated cell types from scRNA-seq data. Using this signature matrix, we deconvoluted cell-type proportions from bulk RNA-seq datasets with default parameters of CIBERSORT (Newman et al., 2015).

Single-cell regulatory network inference

We followed a strategy similar to that of the scPlant pipeline (Cao et al., 2023) to create the cisTarget database for Arabidopsis that is required to run SCENIC. Specifically, we collected TF DNA-binding motifs for Arabidopsis from several databases, including JASPAR (Fornes et al., 2020), CIS-BP (Weirauch et al., 2014), and PlantTFDB (Jin et al., 2017). All motif information was subjected to redundancy filtering and has been compiled in our ChIP-Hub database (Fu et al., 2022). The cisTarget database was then constructed according to the SCENIC (Aibar et al., 2017) tutorial instructions (https://github.com/aertslab/create_cisTarget_databases). We used pySCENIC (version 0.11.2, https://github.com/aertslab/pySCENIC) with default parameters to infer co-expression modules using the cisTarget database created above. The single-cell TF regulation activity was calculated with default parameters. For each cell type, the regulon specificity score (RSS) was summarized for each TF. We then plotted the RSS values by rank to create RSS curves. TFs whose RSS values were greater than a cell-type-specific threshold were considered to be cell-type-specific regulators. The optimal RSS threshold for each cell type was determined using an elbow statistic (the location of a bend or knee in the plot) on the basis of the RSS curves (Supplemental Figure 5). In this manner, cell-type-specific regulons were identified and visualized in networks.
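The text does not specify which elbow statistic was used. As one possible illustration, the sketch below locates the elbow as the point on the ranked RSS curve that lies farthest from the straight line joining the first and last points. This choice of statistic, the function names, and the TF names in the example are all assumptions; the sketch does not reproduce the thresholds of Supplemental Figure 5.

```python
def elbow_threshold(scores):
    """Return the score at the elbow of the descending rank curve.

    The elbow is taken as the point with the largest distance from the
    chord between the first and last ranked points (an assumed statistic).
    """
    ranked = sorted(scores, reverse=True)
    n = len(ranked)
    if n < 3:
        raise ValueError("need at least three scores to locate an elbow")
    x0, y0 = 0, ranked[0]
    x1, y1 = n - 1, ranked[-1]

    def distance(i):
        # Point-to-line distance up to a constant factor (the chord length).
        return abs((y1 - y0) * i - (x1 - x0) * ranked[i] + x1 * y0 - y1 * x0)

    elbow = max(range(n), key=distance)
    return ranked[elbow]


def specific_regulators(rss, threshold=None):
    """Return TFs whose RSS is strictly greater than the threshold.

    rss: dict mapping TF name -> RSS for one cell type.
    """
    if threshold is None:
        threshold = elbow_threshold(rss.values())
    return sorted(tf for tf, score in rss.items() if score > threshold)


# Hypothetical RSS values for one cell type.
rss = {"TF1": 0.9, "TF2": 0.8, "TF3": 0.2, "TF4": 0.15, "TF5": 0.1, "TF6": 0.05}
threshold = elbow_threshold(rss.values())
regulators = specific_regulators(rss)
```

Because the threshold is derived from each curve separately, every cell type receives its own cutoff, consistent with the per-cell-type thresholds described above.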

Co-expression network analysis

We used the WGCNA package (Langfelder and Horvath, 2008) with default parameters to construct weighted co-expression networks between protein-coding genes and lncRNAs using bulk RNA-seq data from Li et al. (2016). Only genes with a tissue-specificity tau score (Yanai et al., 2005) >0.85 were used in the analysis, resulting in 5265 protein-coding genes and 6253 lncRNAs. We used the weighted correlation coefficients derived from WGCNA to reflect the strength and direction of co-expression between pairs of genes across samples or conditions. These coefficients indicate how closely the expression levels of two genes are related and can be used to assess the biological relevance of gene pairs. By considering the strength of the correlation, WGCNA can identify gene pairs with moderate to strong co-expression patterns, which are more likely to be biologically relevant and involved in similar regulatory processes or functional pathways.
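The tau filter applied before network construction can be sketched as follows, using the standard definition from Yanai et al. (2005): tau = Σ(1 − x_i / x_max) / (n − 1), where x_i is the expression in tissue i. The function names and the table layout are assumptions for this sketch; only the cutoff of 0.85 comes from the text.

```python
def tau(expr):
    """Tissue-specificity index tau for one gene.

    expr: non-negative expression values on a linear scale, one per tissue.
    """
    n = len(expr)
    x_max = max(expr) if expr else 0.0
    if n < 2 or x_max <= 0:
        return float("nan")  # undefined for unexpressed genes
    return sum(1.0 - x / x_max for x in expr) / (n - 1)


def tissue_specific_genes(table, cutoff=0.85):
    """Return the genes whose tau exceeds the cutoff.

    table: dict mapping gene ID -> list of expression values per tissue
    (a hypothetical layout chosen for this sketch).
    """
    return sorted(gene for gene, expr in table.items() if tau(expr) > cutoff)


# Hypothetical expression profiles.
table = {
    "specific": [10.0, 0.0, 0.0, 0.0],   # tau = 1.0
    "broad": [5.0, 5.0, 5.0, 5.0],       # tau = 0.0
    "graded": [10.0, 5.0, 0.0],          # tau = 0.75
}
selected = tissue_specific_genes(table)
```

Comparisons against NaN are always false, so unexpressed genes are dropped by the filter without special handling.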

Statistical analysis

If not specified, all statistical analyses and data visualization were performed in R (version 4.0.0). R packages such as ggplot2 and igraph were used for graphics. The clusterProfiler package (Wu et al., 2021) was used for gene set enrichment analysis.

Data and code availability

The integrated cell map and associated analysis results from this study can be retrieved and viewed at the web-based platform scPLAD (https://biobigdata.nju.edu.cn/scPLAD/).

Code availability

No special methods were implemented in this study. R code used to analyze data and generate figures is available from the corresponding authors upon reasonable request.

Funding

This work is supported by grants from the National Natural Science Foundation of China (grants 32070656, 32270709, 32070677, and 32000362), the Natural Science Foundation of Jiangsu Higher Education Institutions of China (grant 23KJA210002), the open funds of the Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding (grant PL202105), the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), and the 2023 Postgraduate Research & Practice Innovation Program of Jiangsu Province (grant KYCX23_0131). The authors acknowledge the Center for Information Technology and the High Performance Computing Center of Nanjing University for providing high-performance computing (HPC) resources. We thank Professor Xiaobo Zhao from Zhejiang University for kindly providing valuable information on photosynthetic cells.

Author contributions

D.C. conceived the study. D.C., M.C., and K.H. supervised the study. Z.H. performed data analyses with support from X. Zhou, Y.L., T.Z., F.Y., H.C., R.-X.F., and S.Z. X. Zhou, B.Y., and D.C. designed the web-based application. Y.L. performed experimental validations with support from L.-Y.F., J.W., W.L., and C.C. D.C. wrote the manuscript with input from K.H. X. Zhao and D.C. contributed to the discussion section. All authors reviewed and approved the submitted manuscript.

Acknowledgments

No conflict of interest is declared.

Published: September 15, 2023

Footnotes

Published by the Plant Communications Shanghai Editorial Office in association with Cell Press, an imprint of Elsevier Inc., on behalf of CSPB and CEMPS, CAS.

Supplemental information is available at Plant Communications Online.

Contributor Information

Ming Chen, Email: mchen@zju.edu.cn.

Xue Zhao, Email: zhaoxue@nju.edu.cn.

Keming Hu, Email: hukm@yzu.edu.cn.

Dijun Chen, Email: dijunchen@nju.edu.cn.

Supplemental information

Document S1. Supplemental Figures 1–12
mmc1.pdf (26.5MB, pdf)
Data S1. Supplemental Tables 1–8
mmc2.xlsx (1.2MB, xlsx)
Document S2. Article plus supplemental information
mmc3.pdf (29.9MB, pdf)

References

  1. Aibar S., González-Blas C.B., Moerman T., Huynh-Thu V.A., Imrichova H., Hulselmans G., Rambow F., Marine J.-C., Geurts P., Aerts J., et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods. 2017;14:1083–1086. doi: 10.1038/nmeth.4463.
  2. Cao S., He Z., Chen R., Luo Y., Fu L.-Y., Zhou X., He C., Yan W., Zhang C.-Y., Chen D. scPlant: A versatile framework for single-cell transcriptomic data analysis in plants. Plant Commun. 2023. Advance Access published May 2023. doi: 10.1016/j.xplc.2023.100631.
  3. Crespi M.D., Jurkevitch E., Poiret M., D’Aubenton-Carafa Y., Petrovics G., Kondorosi E., Kondorosi A. enod40, a gene expressed during nodule organogenesis, codes for a non-translatable RNA involved in plant growth. EMBO J. 1994;13:5099–5112. doi: 10.1002/j.1460-2075.1994.tb06839.x.
  4. Denis E., Kbiri N., Mary V., Claisse G., Conde E Silva N., Kreis M., Deveaux Y. WOX14 promotes bioactive gibberellin synthesis and vascular cell differentiation in Arabidopsis. Plant J. 2017;90:560–572. doi: 10.1111/tpj.13513.
  5. Di C., Yuan J., Wu Y., Li J., Lin H., Hu L., Zhang T., Qi Y., Gerstein M.B., Guo Y., Lu Z.J. Characterization of stress-responsive lncRNAs in Arabidopsis thaliana by integrating expression, epigenetic and structural features. Plant J. 2014;80:848–861. doi: 10.1111/tpj.12679.
  6. Dobin A., Davis C.A., Schlesinger F., Drenkow J., Zaleski C., Jha S., Batut P., Chaisson M., Gingeras T.R. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635.
  7. Farmer A., Thibivilliers S., Ryu K.H., Schiefelbein J., Libault M. Single-nucleus RNA and ATAC sequencing reveals the impact of chromatin accessibility on gene expression in Arabidopsis roots at the single-cell level. Mol. Plant. 2021;14:372–383. doi: 10.1016/j.molp.2021.01.001.
  8. Flynn R.A., Chang H.Y. Long Noncoding RNAs in Cell-Fate Programming and Reprogramming. Cell Stem Cell. 2014;14:752–761. doi: 10.1016/j.stem.2014.05.014.
  9. Fornes O., Castro-Mondragon J.A., Khan A., van der Lee R., Zhang X., Richmond P.A., Modi B.P., Correard S., Gheorghe M., Baranašić D., et al. JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 2020;48:D87–D92. doi: 10.1093/nar/gkz1001.
  10. Fu L.-Y., Zhu T., Zhou X., Yu R., He Z., Zhang P., Wu Z., Chen M., Kaufmann K., Chen D. ChIP-Hub provides an integrative platform for exploring plant regulome. Nat. Commun. 2022;13:3413. doi: 10.1038/s41467-022-30770-1.
  11. Gloss B.S., Dinger M.E. The specificity of long noncoding RNA expression. Biochim. Biophys. Acta. 2016;1859:16–22. doi: 10.1016/j.bbagrm.2015.08.005.
  12. Gou J., Fu C., Liu S., Tang C., Debnath S., Flanagan A., Ge Y., Tang Y., Jiang Q., Larson P.R., et al. The miR156-SPL4 module predominantly regulates aerial axillary bud formation and controls shoot architecture. New Phytol. 2017;216:829–840. doi: 10.1111/nph.14758.
  13. Guttman M., Amit I., Garber M., French C., Lin M.F., Feldser D., Huarte M., Zuk O., Carey B.W., Cassady J.P., et al. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature. 2009;458:223–227. doi: 10.1038/nature07672.
  14. Hao Y., Hao S., Andersen-Nissen E., Mauck W.M., Zheng S., Butler A., Lee M.J., Wilk A.J., Darby C., Zager M., et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573–3587.e29. doi: 10.1016/j.cell.2021.04.048.
  15. He Z., Luo Y., Zhou X., Zhu T., Lan Y., Chen D. scPlantDB: a comprehensive database for exploring cell types and markers of plant cell atlases. Nucleic Acids Res. 2023;28:gkad706. doi: 10.1093/nar/gkad706.
  16. Heo J.B., Sung S. Vernalization-Mediated Epigenetic Silencing by a Long Intronic Noncoding RNA. Science. 2011;331:76–79. doi: 10.1126/science.1197349.
  17. Hu B., Xue Z., Zhang C. Protocols for Small RNA FISH in Plants. Chin. Bull. Bot. 2021;56:330.
  18. Jean-Baptiste K., McFaline-Figueroa J.L., Alexandre C.M., Dorrity M.W., Saunders L., Bubb K.L., Trapnell C., Fields S., Queitsch C., Cuperus J.T. Dynamics of Gene Expression in Single Root Cells of Arabidopsis thaliana. Plant Cell. 2019;31:993–1011. doi: 10.1105/tpc.18.00785.
  19. Jha U.C., Nayyar H., Jha R., Khurshid M., Zhou M., Mantri N., Siddique K.H.M. Long non-coding RNAs: emerging players regulating plant abiotic stress response and adaptation. BMC Plant Biol. 2020;20:466. doi: 10.1186/s12870-020-02595-x.
  20. Jin J., Tian F., Yang D.-C., Meng Y.-Q., Kong L., Luo J., Gao G. PlantTFDB 4.0: toward a central hub for transcription factors and regulatory interactions in plants. Nucleic Acids Res. 2017;45:D1040–D1045. doi: 10.1093/nar/gkw982.
  21. Kim M.J., Ruzicka D., Shin R., Schachtman D.P. The Arabidopsis AP2/ERF Transcription Factor RAP2.11 Modulates Plant Response to Low-Potassium Conditions. Mol. Plant. 2012;5:1042–1057. doi: 10.1093/mp/sss003.
  22. Kubo M., Udagawa M., Nishikubo N., Horiguchi G., Yamaguchi M., Ito J., Mimura T., Fukuda H., Demura T. Transcription switches for protoxylem and metaxylem vessel formation. Genes Dev. 2005;19:1855–1860. doi: 10.1101/gad.1331305.
  23. Langfelder P., Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinf. 2008;9:559. doi: 10.1186/1471-2105-9-559.
  24. Li B., Dewey C.N. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinf. 2011;12:323. doi: 10.1186/1471-2105-12-323.
  25. Li L., Eichten S.R., Shimizu R., Petsch K., Yeh C.-T., Wu W., Chettoor A.M., Givan S.A., Cole R.A., Fowler J.E., et al. Genome-wide discovery and characterization of maize long non-coding RNAs. Genome Biol. 2014;15:R40. doi: 10.1186/gb-2014-15-2-r40.
  26. Li S., Yamada M., Han X., Ohler U., Benfey P.N. High-Resolution Expression Map of the Arabidopsis Root Reveals Alternative Splicing and lincRNA Regulation. Dev. Cell. 2016;39:508–522. doi: 10.1016/j.devcel.2016.10.012.
  27. Liao Q., Xiao H., Bu D., Xie C., Miao R., Luo H., Zhao G., Yu K., Zhao H., Skogerbø G., et al. ncFANs: a web server for functional annotation of long non-coding RNAs. Nucleic Acids Res. 2011;39:W118–W124. doi: 10.1093/nar/gkr432.
  28. Liu J., Jung C., Xu J., Wang H., Deng S., Bernad L., Arenas-Huertero C., Chua N.-H. Genome-Wide Analysis Uncovers Regulation of Long Intergenic Noncoding RNAs in Arabidopsis. Plant Cell. 2012;24:4333–4345. doi: 10.1105/tpc.112.102855.
  29. Liu X., Hao L., Li D., Zhu L., Hu S. Long non-coding RNAs and their biological roles in plants. Genomics Proteomics Bioinformatics. 2015;13:137–147. doi: 10.1016/j.gpb.2015.02.003.
  30. Liu S.J., Nowakowski T.J., Pollen A.A., Lui J.H., Horlbeck M.A., Attenello F.J., He D., Weissman J.S., Kriegstein A.R., Diaz A.A., Lim D.A. Single-cell analysis of long non-coding RNAs in the developing human neocortex. Genome Biol. 2016;17:67. doi: 10.1186/s13059-016-0932-1.
  31. Liu Z., Zhou Y., Guo J., Li J., Tian Z., Zhu Z., Wang J., Wu R., Zhang B., Hu Y., et al. Global Dynamic Molecular Profiling of Stomatal Lineage Cell Development by Single-Cell RNA Sequencing. Mol. Plant. 2020;13:1178–1193. doi: 10.1016/j.molp.2020.06.010.
  32. Long Y., Liu Z., Jia J., Mo W., Fang L., Lu D., Liu B., Zhang H., Chen W., Zhai J. FlsnRNA-seq: protoplasting-free full-length single-nucleus RNA profiling in plants. Genome Biol. 2021;22:66. doi: 10.1186/s13059-021-02288-0.
  33. Lopez-Anido C.B., Vatén A., Smoot N.K., Sharma N., Guo V., Gong Y., Anleu Gil M.X., Weimer A.K., Bergmann D.C. Single-cell resolution of lineage trajectories in the Arabidopsis stomatal lineage and developing leaf. Dev. Cell. 2021;56:1043–1055.e4. doi: 10.1016/j.devcel.2021.03.014.
  34. Luo H., Bu D., Shao L., Li Y., Sun L., Wang C., Wang J., Yang W., Yang X., Dong J., et al. Single-cell Long Non-coding RNA Landscape of T Cells in Human Cancer Immunity. Genomics Proteomics Bioinformatics. 2021;19:377–393. doi: 10.1016/j.gpb.2021.02.006.
  35. Mattick J.S., Amaral P.P., Carninci P., Carpenter S., Chang H.Y., Chen L.-L., Chen R., Dean C., Dinger M.E., Fitzgerald K.A., et al. Long non-coding RNAs: definitions, functions, challenges and recommendations. Nat. Rev. Mol. Cell Biol. 2023;24:430–447. doi: 10.1038/s41580-022-00566-8.
  36. Nejat N., Mantri N. Emerging roles of long non-coding RNAs in plant response to biotic and abiotic stresses. Crit. Rev. Biotechnol. 2018;38:93–105. doi: 10.1080/07388551.2017.1312270.
  37. Newman A.M., Liu C.L., Green M.R., Gentles A.J., Feng W., Xu Y., Hoang C.D., Diehn M., Alizadeh A.A. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods. 2015;12:453–457. doi: 10.1038/nmeth.3337.
  38. Nookaraju A., Pandey S.K., Ahlawat Y.K., Joshi C.P. Understanding the Modus Operandi of Class II KNOX Transcription Factors in Secondary Cell Wall Biosynthesis. Plants. 2022;11:493. doi: 10.3390/plants11040493.
  39. Öhman D., Demedts B., Kumar M., Gerber L., Gorzsás A., Goeminne G., Hedenström M., Ellis B., Boerjan W., Sundberg B. MYB103 is required for FERULATE-5-HYDROXYLASE expression and syringyl lignin biosynthesis in Arabidopsis stems. Plant J. 2013;73:63–76. doi: 10.1111/tpj.12018.
  40. Okamoto M., Tatematsu K., Matsui A., Morosawa T., Ishida J., Tanaka M., Endo T.A., Mochizuki Y., Toyoda T., Kamiya Y., et al. Genome-wide analysis of endogenous abscisic acid-mediated transcription in dry and imbibed seeds of Arabidopsis using tiling arrays. Plant J. 2010;62:39–51. doi: 10.1111/j.1365-313X.2010.04135.x.
  41. Palos K., Yu L., Railey C.E., Nelson Dittrich A.C., Nelson A.D.L. Linking discoveries, mechanisms, and technologies to develop a clearer perspective on plant long noncoding RNAs. Plant Cell. 2023;35:1762–1786. doi: 10.1093/PLCELL/KOAD027.
  42. Patil V., McDermott H.I., McAllister T., Cummins M., Silva J.C., Mollison E., Meikle R., Morris J., Hedley P.E., Waugh R., et al. APETALA2 control of barley internode elongation. Development. 2019;146:dev170373. doi: 10.1242/dev.170373.
  43. Picelli S., Faridani O.R., Björklund A.K., Winberg G., Sagasser S., Sandberg R. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 2014;9:171–181. doi: 10.1038/nprot.2014.006.
  44. Quinn J.J., Chang H.Y. Unique features of long non-coding RNA biogenesis and function. Nat. Rev. Genet. 2015;17:47–62. doi: 10.1038/nrg.2015.10.
  45. Roppolo D., de Rybel B., Dénervaud Tendon V., Pfister A., Alassimone J., Vermeer J.E.M., Yamazaki M., Stierhof Y.D., Beeckman T., Geldner N. A novel protein family mediates Casparian strip formation in the endodermis. Nature. 2011;473:380–383. doi: 10.1038/nature10070.
  46. Ryu K.H., Huang L., Kang H.M., Schiefelbein J. Single-Cell RNA Sequencing Resolves Molecular Relationships Among Individual Plant Cells. Plant Physiol. 2019;179:1444–1456. doi: 10.1104/pp.18.01482.
  47. Schug J., Schuller W.-P., Kappen C., Salbaum J.M., Bucan M., Stoeckert C.J. Promoter features related to tissue specificity as measured by Shannon entropy. Genome Biol. 2005;6:R33. doi: 10.1186/gb-2005-6-4-r33.
  48. Shaw R., Tian X., Xu J. Single-Cell Transcriptome Analysis in Plants: Advances and Challenges. Mol. Plant. 2021;14:115–126. doi: 10.1016/j.molp.2020.10.012.
  49. Shivaraj S.M., Jain A., Singh A. Highly preserved roles of Brassica MIR172 in polyploid Brassicas: ectopic expression of variants of Brassica MIR172 accelerates floral transition. Mol. Genet. Genom. 2018;293:1121–1138. doi: 10.1007/s00438-018-1444-3.
  50. Solanki S., Ameen G., Zhao J., Flaten J., Borowicz P., Brueggeman R.S. Visualization of spatial gene expression in plants by modified RNAscope fluorescent in situ hybridization. Plant Methods. 2020;16:71–79. doi: 10.1186/s13007-020-00614-4.
  51. Statello L., Guo C.J., Chen L.L., Huarte M. Gene regulation by long non-coding RNAs and its biological functions. Nat. Rev. Mol. Cell Biol. 2020;22:96–118. doi: 10.1038/s41580-020-00315-9.
  52. Szcześniak M.W., Bryzghalov O., Ciomborowska-Basheer J., Makałowska I. CANTATAdb 2.0: Expanding the Collection of Plant Long Noncoding RNAs. Methods in Molecular Biology. 2019.
  53. Wang H., Chung P.J., Liu J., Jang I.C., Kean M.J., Xu J., Chua N.H. Genome-wide identification of long noncoding natural antisense transcripts and their responses to light in Arabidopsis. Genome Res. 2014;24:444–453. doi: 10.1101/gr.165555.113.
  54. Wang Y., Wang X., Deng W., Fan X., Liu T.T., He G., Chen R., Terzaghi W., Zhu D., Deng X.W. Genomic features and regulatory roles of intermediate-sized non-coding RNAs in Arabidopsis. Mol. Plant. 2014;7:514–527. doi: 10.1093/mp/sst177.
  55. Wang M., Yuan D., Tu L., Gao W., He Y., Hu H., Wang P., Liu N., Lindsey K., Zhang X. Long noncoding RNAs and their proposed functions in fibre development of cotton (Gossypium spp.). New Phytol. 2015;207:1181–1197. doi: 10.1111/nph.13429.
  56. Wang Y., Huan Q., Li K., Qian W. Single-cell transcriptome atlas of the leaf and root of rice seedlings. Journal of Genetics and Genomics. 2021;48:881–898. doi: 10.1016/j.jgg.2021.06.001.
  57. Weirauch M.T., Yang A., Albu M., Cote A.G., Montenegro-Montero A., Drewe P., Najafabadi H.S., Lambert S.A., Mann I., Cook K., et al. Determination and Inference of Eukaryotic Transcription Factor Sequence Specificity. Cell. 2014;158:1431–1443. doi: 10.1016/j.cell.2014.08.009.
  58. Wendrich J.R., Yang B., Vandamme N., Verstaen K., Smet W., Van de Velde C., Minne M., Wybouw B., Mor E., Arents H.E., et al. Vascular transcription factors guide plant epidermal responses to limiting phosphate conditions. Science. 2020;370. doi: 10.1126/science.aay4970.
  59. Wierzbicki A.T., Blevins T., Swiezewski S. Long Noncoding RNAs in Plants. Annu. Rev. Plant Biol. 2021;72:245–271. doi: 10.1146/annurev-arplant-093020-035446.
  60. Wu H.-J., Wang Z.-M., Wang M., Wang X.-J. Widespread Long Noncoding RNAs as Endogenous Target Mimics for MicroRNAs in Plants. Plant Physiol. 2013;161:1875–1884. doi: 10.1104/pp.113.215962.
  61. Wu T., Hu E., Xu S., Chen M., Guo P., Dai Z., Feng T., Zhou L., Tang W., Zhan L., et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation. 2021;2. doi: 10.1016/j.xinn.2021.100141.
  62. Xu Z., Wang Q., Zhu X., Wang G., Qin Y., Ding F., Tu L., Daniell H., Zhang X., Jin S. Plant Single Cell Transcriptome Hub (PsctH): an integrated online tool to explore the plant single-cell transcriptome landscape. Plant Biotechnol. J. 2022;20:10–12. doi: 10.1111/pbi.13725.
  63. Yanai I., Benjamin H., Shmoish M., Chalifa-Caspi V., Shklar M., Ophir R., Bar-Even A., Horn-Saban S., Safran M., Domany E., et al. Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics. 2005;21:650–659. doi: 10.1093/bioinformatics/bti042.
  64. Yu Y., Zhang Y., Chen X., Chen Y. Plant Noncoding RNAs: Hidden Players in Development and Stress Responses. Annu. Rev. Cell Dev. Biol. 2019;35:407–431. doi: 10.1146/annurev-cellbio-100818-125218.
  65. Zhang Y.-C., Liao J.-Y., Li Z.-Y., Yu Y., Zhang J.-P., Li Q.-F., Qu L.-H., Shu W.-S., Chen Y.-Q. Genome-wide screening and functional analysis identify a large number of long noncoding RNAs involved in the sexual reproduction of rice. Genome Biol. 2014;15:512. doi: 10.1186/s13059-014-0512-1.
  66. Zhang T.-Q., Xu Z.-G., Shang G.-D., Wang J.-W. A Single-Cell RNA Sequencing Profiles the Developmental Landscape of Arabidopsis Root. Mol. Plant. 2019;12:648–660. doi: 10.1016/j.molp.2019.04.004.
  67. Zhang T.-Q., Chen Y., Wang J.-W. A single-cell analysis of the Arabidopsis vegetative shoot apex. Dev. Cell. 2021;56:1056–1074.e8. doi: 10.1016/j.devcel.2021.02.021.
  68. Zhang Y., Bu D., Huo P., Wang Z., Rong H., Li Y., Liu J., Ye M., Wu Y., Jiang Z., et al. ncFANs v2.0: an integrative platform for functional annotation of non-coding RNAs. Nucleic Acids Res. 2021;49:W459–W468. doi: 10.1093/nar/gkab435.
  69. Zhao X., Li J., Lian B., Gu H., Li Y., Qi Y. Global identification of Arabidopsis lncRNAs reveals the regulation of MAF4 by a natural antisense RNA. Nat. Commun. 2018;9:5056. doi: 10.1038/s41467-018-07500-7.
  70. Zhao X., Lan Y., Chen D. Exploring long non-coding RNA networks from single cell omics data. Comput. Struct. Biotechnol. J. 2022;20:4381–4389. doi: 10.1016/J.CSBJ.2022.08.003.
  71. Zheng G.X.Y., Terry J.M., Belgrader P., Ryvkin P., Bent Z.W., Wilson R., Ziraldo S.B., Wheeler T.D., McDermott G.P., Zhu J., et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 2017;8. doi: 10.1038/ncomms14049.
  72. Zhong R., Lee C., Haghighat M., Ye Z.H. Xylem vessel-specific SND5 and its homologs regulate secondary wall biosynthesis through activating secondary wall NAC binding elements. New Phytol. 2021;231:1496–1509. doi: 10.1111/nph.17425.
  73. Zhu Q.H., Stephen S., Taylor J., Helliwell C.A., Wang M.B. Long noncoding RNAs responsive to Fusarium oxysporum infection in Arabidopsis thaliana. New Phytol. 2014;201. doi: 10.1111/nph.12537.



