Abstract
Background
One of the most promising strategies of Pine Pitch Canker (PPC) management is the use of reproductive plant material resistant to the disease. Understanding the complexity of plant transcriptome that underlies the defence to the causal agent Fusarium circinatum, would greatly facilitate the development of an accurate breeding program. Long non-coding RNAs (lncRNAs) are emerging as important transcriptional regulators under biotic stresses in plants. However, to date, characterization of lncRNAs in conifer trees has not been reported. In this study, transcriptomic identification of lncRNAs was carried out using strand-specific paired-end RNA sequencing, from Pinus radiata samples inoculated with F. circinatum at an early stage of infection.
Results
Overall, 13,312 lncRNAs were predicted through a bioinformatics approach, including long intergenic non-coding RNAs (92.3%), antisense lncRNAs (3.3%) and intronic lncRNAs (2.9%). Compared with protein-coding RNAs, pine lncRNAs are shorter, have lower expression, lower GC content and harbour fewer and shorter exons. A total of 164 differentially expressed (DE) lncRNAs were identified in response to F. circinatum infection in the inoculated versus mock-inoculated P. radiata seedlings. The predicted cis-regulated target genes of these pathogen-responsive lncRNAs were related to defence mechanisms such as kinase activity, phytohormone regulation, and cell wall reinforcement. Co-expression network analysis of DE lncRNAs, DE protein-coding RNAs and lncRNA target genes also indicated a potential network regulating pectinesterase activity and cell wall remodelling.
Conclusions
This study presents the first comprehensive genome-wide analysis of P. radiata lncRNAs and provides the basis for future functional characterizations of lncRNAs in relation to pine defence responses against F. circinatum.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12864-022-08408-9.
Keywords: Pinus radiata, Fusarium circinatum, lncRNA, Transcriptomics network, Conifer defence, RNA-Seq.
Background
The study of the different types and functions of non-coding RNAs (ncRNAs) has recently gained prominence [1]. During the last decade, a new class of ncRNA, long non-coding RNA (lncRNA), has emerged as another eukaryotic transcript class where all transcripts greater than 200 nt in length that lack coding potential are included [2]. Similar to protein-coding genes, lncRNAs are transcribed by RNA polymerase II, capped, polyadenylated and usually spliced [3]. Accumulating evidence supports that lncRNAs participate in many cellular processes by regulating gene expression in a cis-regulatory manner, influencing genes around their transcription site, or leaving their transcription sites to exert their function elsewhere as a trans-acting transcript [4]. Sense and anti-sense, intergenic as well as intronic (located into an intron) are the main groups for classifying the lncRNAs according to their orientation with respect to the nearest protein-coding gene in the genome and genomic location [5]. Known mechanism of action including molecular signalling, decoys (binding to regulatory elements such as miRNAs blocking their molecular interaction), guides (directing specific RNA-protein complexes to specific targets) and scaffolds as central platforms for regulation, are associated to the majority of lncRNAs [6].
The growing number of studies focusing on the interference of plant lncRNAs in different biological processes, including fertility, photomorphogenesis, wood formation, and biotic and abiotic stress, has demonstrated their important regulatory role in the transcription system [7–9]. Some of these lncRNAs have been experimentally validated, most of them being from model plants. For example in Arabidopsis, two lncRNAs, COOLAIR and COLDAIR, have been shown to be crucial in the regulation of cold stress response [10, 11]. Likewise, DRIR lncRNA regulates the expression of a series of genes involved in drought and salt stress-responsive [12]. The regulatory role of the lncRNA IPS1 has also been reported blocking the miRNA mir399 that suppress the expression of the gene responsible for the phosphate uptake [13]. In Populus tomentosa, the interaction of the NERD gene and its regulatory lncRNA NERDL, is involved in the wood formation processes [14]. Moreover, some lncRNAs associated with biotic stress have been characterized in plants. These included lncRNAs that regulate positively the expression of defence-related PR genes such as ELENA1, identified in Arabidopsis as a factor enhancing resistance against the pathogen Pseudomonas syringae, and lncRNA39026 that increases resistance against Phytophthora infectans in tomato [15, 16]. The biosynthesis or signalling of plant hormones have been altered by lncRNAs as well. In cotton plants, the silencing of two lncRNAs (GhlncNAT- ANX2 and GhlncNAT-RLP7) led to increased resistance to Verticillium dahliae and Botrytis cinerea, possibly due to the transcriptional induction of two lipoxygenases involved in the jasmonic acid defence signalling pathway [17]. In addition, overexpression of lncRNA ALEX1 in rice increased jasmonic acid levels enhancing resistance to the bacteria Xanthomonas oryzae pv. oryzae [18].
Next Generation Sequencing (NGS) technologies and computational methods have enabled a deeper study of the transcriptomic data and have been widely applied for the identification and characterization of plant lncRNAs [19]. Recently, a number of lncRNAs involved in plant-pathogen interactions has been computationally predicted in non-model plants. In Brassica napus, 931 lncRNAs were identified in response to Sclerotinia sclerotiorum infection, one of them (TCONS_00000966) as antisense regulator of genes involved in plant defence [20]. Li et al. [21] discovered Musa acuminata lncRNAs related to resistance against Fusarium oxysporum f. sp. cubense infection. Particularly, lncRNAs involved in the expression of pathogenesis-related proteins and peroxidases were mainly induced in the resistant cultivar, whereas lncRNAs related to auxin and salicylic acid signal transductions could predominantly be induced in the susceptible cultivar. In the Paulownia witches’ broom disease interaction, nine lncRNAs were predicted to target twelve genes based on a co-expression network model in the tree [22]. In kiwifruit leaves infected by P. syringae, a weighted gene co-expression network analysis revealed a number of lncRNAs closely related to plant immune response and signal transduction [23]. Likewise, Feng et al. [24] identified 14,525 lncRNAs related to the walnut anthracnose resistance. This analysis showed that the target genes of the up-regulated lncRNAs were enriched in immune-related processes during the infection of the causal agent Colletotrichum gloeosporioides. These studies highlight the important role of lncRNAs in plant defence, thus further research is needed to decipher their function and interference in the transcriptomic system.
Fusarium circinatum is an invasive pathogen that causes Pine Pitch Canker (PPC). This disease affects conifers, resulting in a serious economic and ecological impact on nurseries and pine stands [25]. Since the first report in 1945 in North America, the presence of F. circinatum has been notified in 14 countries of America as well as Asia, Africa and Europe [26]. The long-distance dispersion as a result of globalization of plant trade and movement of contaminated soil and seed, represents the main pathway for new introductions of the pathogen into disease-free regions [27]. The establishment of the disease in field is of great concern since no feasible measures are available to control or eradicate F. circinatum [28]. Thus, the development of resistant genotypes through breeding and/or genetic engineering may be one of the most efficient PPC management strategies in the long-term [28, 29]. In this context, several transcriptome analyses with the aim of unravelling molecular defence responses have provided detailed insights about the molecular mechanisms underlying disease progression in the Pinus-F. circinatum pathosystem. These studies have examined the response of hosts through a different degree of susceptibility, from highly susceptible (Pinus radiata, Pinus patula) to moderate (Pinus pinaster) and highly resistant (Pinus tecunumanii, Pinus pinea) [30–35]. However, the role of lncRNAs in the regulation of defence network in conifers has not been studied yet. In the present study, a strand-specific RNA-Seq has been conducted in order to characterize lncRNAs present in high susceptible P. radiata and elucidate how lncRNA expression profiles change in response to F. circinatum infection.
Results
Disease monitoring
The survival analysis revealed clear significant differences between the inoculation and control conditions (χ2 = 116, p< 0.001). At 10 days post inoculation (dpi) all seedlings inoculated with F. circinatum showed symptoms of PPC (resin and/or necrosis at the stem and wilting) and started to die at 33 dpi (Fig. 1). A seedling was considered dead when the pathogen had caused a girdling lesion and more than 75% of the needles were necrotic. By the end of the experiment, 92.2% of the inoculated seedlings had died. No mortality was recorded for control seedlings.
Deep sequencing and transcripts assembly
High-throughput strand-specific RNA-Seq of nine libraries constructed from stem tissue of P. radiata inoculated with F. circinatum and mock-inoculated were analysed. Raw data of the experiment have been deposited at the NCBI under the SRA numbers SRR15100123-31 (BioProject PRJNA742852). Almost 590 million 150-base pair-end reads on polyadenylated (polyA) selected RNAs were generated by the Illumina platform. RNA-Seq reached average depths of ca. 65.5 million reads (55 to 84 million reads; Table S1). After adapter and low-quality nucleotides trimming, an average of 78% of paired reads and 11% of mates from broken pairs were retained. Approximately 74.21% and 70.33% of reads from inoculated and mock-inoculated libraries successfully mapped to the Pinus taeda reference genome, respectively (Table S1). Considering the infected samples, the average of 2.63% reads mapped to the F. circinatum genome confirmed the presence of the pathogen.
Nine high-depth transcriptomes were generated. Six of them were reconstructed from P. radiata inoculated with F. circinatum, and the other three were generated from the mock-inoculated seedlings. After merging all of them, the unique transcriptome assembled were composed of 87,427 loci and 127,677 transcripts, with 43.1% GC content. A total of 51,212 (40.11%) transcripts were shared with the reference annotation file (Pita_v2.01.gtf) and discarded for lncRNA detection analysis since these transcripts were known as protein-coding RNAs. The remaining 76,465 transcripts were further categorized into different class codes according to its relationship with its closest reference transcript (Table 1).
Table 1.
Class code | After assemblya | LncRNAs predictedb | Descriptionc | ||
---|---|---|---|---|---|
Transcript no. | % | Transcript no. | % | ||
x | 902 | 0.71 | 446 | 3.35 | Overlapping an exon of an annotated gene at the opposite strand |
i | 1,178 | 0.92 | 383 | 2.88 | Fully contained in a known intron |
y | 500 | 0.39 | 189 | 1.42 | Contains a reference gene within its intron |
p | 516 | 0.4 | 0 | 0 | Adjacent to the 5’ end of an annotated gene at the same strand |
u | 45,705 | 35.8 | 12,280 | 92.32 | Intergenic region |
aNumber of transcripts belonging to each class code from the non-redundant set of transcripts generated after transcript assembly
bNumber of transcripts belonging to each class code from the total number of predicted lncRNAs
cBrief explanation of the class codes
Genome-wide identification and characterization of pine lncRNAs
The 76,465 total unknown transcripts were subjected to several sequential filter steps to obtain the lncRNA transcripts (Fig. 2). A total of 13,312 lncRNAs (length ≥ 200 nt, open reading frame coverage < 50%, and potential coding score < 0.5) and 47,473 potential new isoforms were obtained at the end of the pipeline. Using the FEELnc classifier module, the class distributions of the pine lncRNAs was performed according to their location relative to the nearest protein-coding gene based on the reconstructed transcriptome. The majority of the lncRNAs were long intergenic non-coding RNAs (lincRNAs) with 12,291 (92.3%) transcripts, followed by long non-coding natural antisense transcripts (lncNAT) with 445 (3.3%) transcripts and 383 (2.9%) intronic transcripts. In addition, 25 lncRNAs were also identified as known miRNA precursors belonging to 10 miRNA families being the most represented MIR160, MIR159 and MIR1314. The Rfam and miRBase analyses also allowed the identification of 174 transcripts that were found to be distributed among 32 conserved RNA families including ribosomal RNA (rRNAs), transfer RNA (tRNAs), histones and several small nucleolar RNAs (snoRNAs; Table S2-S3).
The average length of protein-coding transcripts (1,200 bp) was higher than that of lincRNAs (750 bp), lncNATs (452 bp) and intronic lncRNAs (565 bp). However, while most of lncNATs and intronic lncRNAs showed short lengths (300 bp), lincRNAs and protein-coding transcripts exhibited a similar trend of length distribution (Fig. 3A). Overall, the size distribution of the lncRNAs ranged from 200 to 7,393 bp, with the majority of these transcripts ranging from 200 to 400 bp. Differences in the analysis of the exon number were also found. While the lncRNAs showed an average exon number of 2.5, the protein-coding transcripts had 4.1 exons (Fig. 3B). This analysis also revealed that two-exon transcripts were the most represented in this study. The highest ratio of two-exon transcripts was found in lncNATs (77.3%) and intronic lncRNAs (75.7%), followed by lincRNAs (66.9%). In the group of protein-coding transcripts, the ratio of two-exon transcripts was not so high (32%). Regarding the exon length, similarly to the transcript length, the exons belonging to the lncNAT and intronic lncRNA transcripts showed shorter lengths (100-300 bp) than those belonging to protein-coding transcripts (Fig. 3C). Once again, the distribution of the exon lengths from the lincRNA transcripts was similar to that of protein-coding transcripts.
The average expression levels of lncRNAs in terms of fragment per kilobase of exon per million mapped reads (FPKM) was lower (3.3) than those of protein-coding transcripts (5.6; Fig. 3D). In addition, the GC content in lncRNAs (41%) was slightly lower than that in protein-coding transcripts (44.8%), showing the intronic lncRNA transcripts the lowest percentage (Fig. 3E).
All the lncRNA transcripts were aligned against the known lncRNAs of 10 different plant species from the CANTATA database: Chenopodium quinoa, Brassica napus, Malus domestica, Zea mays, Arabidopsis thaliana, Oryza rufipogon, Vitis vinifera, Populus trichocarpa, Prunus persica and Ananas comosus. Likewise, known lncRNAs of all plant species present in the GreeNc database, except those species already examined with the CANTATA database, were confronted with the lncRNAs of P. radiata. A number of 1,131 (8.6%) lncRNAs were conserved across the ten species of CANTATA (Table S4). In addition, a total of 1,421 (10.8%) lncRNA transcripts, corresponding to known lncRNA genes from the GreeNc database (Table S5), were obtained. Therefore, 2,552 (19.3%) lncRNAs showed homology with known lncRNAs from other plant species. The highest homology ratio (number of hits of pine lncRNAs with those of each plant species to the total number of lncRNAs of each plant species) was observed with the woody plant P. trichocarpa (5.03%; Figure S1).
Differential expression analysis in response to F. circinatum infection and prediction of candidate target genes
The expression changes of lncRNAs between the P. radiata seedlings inoculated with F. circinatum and controls were analysed. The principal component analyses (PCA) allowed to identify two sample outliers among the pathogen-inoculated condition that were discarded for the differential expression analysis (Figure S2). A total of 164 lncRNA transcripts were identified as differentially expressed (p-value < 0.05, log2 (|Fold-change|) ≥ 1) under the pathogen infection, 146 of which were up-regulated and 18 down-regulated (Table S6-S7). Among the differentially expressed (DE) lncRNAs, 157 were lincRNA transcripts and the remainder were two intronic lncRNAs, one lncNATs, and four lncRNA transcripts containing a coding-protein in its intron. DE lncRNAs were clustered in a heat map in order to visualize the expression pattern of both conditions of the analysis (Fig. 4). On the other hand, 2,369 protein-coding RNA were up-regulated and 189 down-regulated by the pathogen infection (Table S8-S9).
Analysis of lncRNAs cis-interacting genes
To predict the role of cis-acting lncRNAs of P. radiata in response to F. circinatum, the protein-coding transcripts located within a 10 kb window upstream and 100 kb downstream were investigated. A total of 4,268 lncRNA–mRNA interaction pairs were recorded by the FEELnc classifier module (Table S10). However, one lncRNA could have more than one target gene, and a target gene could be the target of one or more lncRNAs. In fact, a number of 2,760 candidate cis target genes were observed for 3,750 lncRNAs, of which 3,342 had a single candidate target gene and 408 lncRNAs had multiple interactions. The maximum number of target genes for a single lncRNA was five, which was reached by seven lncRNAs (Table S11). Moreover, the 73% of the 2,760 candidate target genes were targeted by one lncRNA, while one candidate target gene could be targeted by up to 30 different lncRNAs.
In total, 39 candidate target genes were predicted for the 37 DE lncRNAs (Table 2). The function prediction of these DE lncRNAs was based on the functional annotation of their nearby target genes. Among these targeted genes, there were genes encoding for receptor-like protein kinases (RLKs), enzymes associated to the cell-wall reinforcement and lignification (pectin methylesterases inhibitor, uclacyanin and 4-coumarate-CoA ligase), and enzymes involved in the attenuation of oxidative stress (glutathione S-transferase). One RLK that was predicted to be targeted by the up-regulated lncRNAPiRa.29753.1 was, in turn, induced by the pathogen infection. Two pectin methylesterases (PME) were predicted to be regulated by lncRNAPiRa.23041.2 and lncRNAPiRa.22160.1 transcribed in the same orientation in a downstream location. One of the targeted PME was DE by the pathogen infection, whereas the other PME did not. Moreover, the coding region for 4-coumarate-CoA ligase 3 (4CL3) targeted by lncRNAPiRa.33098.2 was also present among the differentially expressed genes (DEGs) of the coding RNAs analysis. One gene harbours the DNA-binding motif MYB, a transcription factor with a role in plant stress tolerance, was potentially regulated by a lncNAT (lncRNAPiRa.31525.1). The lncRNAPiRa.85000.6 lncRNA, which was predicted to target an ethylene receptor 2 (ETR2) gene involved in the ethylene signal transduction pathway, was transcribed in the same strand and orientation than its RNA partner from an upstream location. In addition, two genes encoding for photoassimilate-responsive protein 1 (PAR1) were predicted to be targeted by lncRNAPiRa.61651.3 and lncRNAPiRa.33277.3, the latter being DE between conditions.
Table 2.
LncRNA | Log2FCa | Targeted gene | Log2FCa | Direction and typeb | Location | Description of targeted gene |
---|---|---|---|---|---|---|
lncRNAPiRa.44237.18 | 9.76 ↑ | PITA_00496 | Antisense, intergenic (41,417) | Convergent, downstream | Pentatricopeptide repeat-containing protein At4g13650 | |
lncRNAPiRa.64325.1 | 3.15 ↑ | PITA_01014 | Antisense, genic | Containing, exonic | Transcript with domain: DUF4228 | |
lncRNAPiRa.32343.2 | 11.3 ↑ | PITA_13284 | Antisense, intergenic (51,151) | Convergent, downstream | CYCD2 | |
lncRNAPiRa.35491.1 | 9.18 ↑ | PITA_33574 | Antisense, intergenic (3,155) | Convergent, downstream | Leaf rust 10 disease-resistance locus receptor-like protein kinase-like 1.2 isoform X1 | |
lncRNAPiRa.42942.2 | 8.54 ↑ | PITA_15284 | Antisense, intergenic (92,774) | Divergent, upstream | Ribosomal RNA methyltransferase FtsJ domain-containing protein | |
lncRNAPiRa.22160.1 | 9.5 ↑ | PITA_12411 | 6.58 ↑ | Sense, intergenic (9,711) | Same strand, downstream | Pectin methylesterase 17 |
lncRNAPiRa.31525.1 | 9.63 ↑ | PITA_31792 | Antisense, intergenic (35,358) | Convergent, downstream | Transcript with domain: Myb_DNA-binding | |
lncRNAPiRa.79902.12 | 7.12 ↑ | PITA_05666 | 10.1 ↑ | Sense, genic | containing | PDC1 |
PITA_12210 | 11.9 ↑ | Sense, intergenic (87) | Same strand, downstream | PDC1 | ||
lncRNAPiRa.70333.4 | 7.9 ↑ | PITA_01539 | Sense, genic | Nested, intronic | Uclacyanin 1 | |
lncRNAPiRa.51697.3 | 3.35 ↑ | PITA_34628 | Sense, genic | Containing, intronic | Transcript with domain: Peptidase_S28, Peptidase_S9 | |
lncRNAPiRa.61651.3 | 8.53 ↑ | PITA_42898 | Antisense, intergenic (7,280) | Convergent, downstream | PAR1 | |
lncRNAPiRa.33277.3 | 3.35 ↑ | PITA_08467 | 3.54 ↑ | Sense, intergenic (376) | same strand, upstream | PAR1 |
lncRNAPiRa.45077.2 | 8.14 ↑ | PITA_26106 | Antisense, intergenic (87,766) | Convergent, downstream | Purple acid phosphatase | |
lncRNAPiRa.23041.2 | 9.27 ↑ | PITA_28262 | Sense, intergenic (9,586) | Same strand, downstream | Pectin methylesterase 17 | |
lncRNAPiRa.85490.1 | 6.99 ↑ | PITA_28228 | Antisense, intergenic (85,760) | Divergent, upstream | unknown [Picea sitchensis] | |
lncRNAPiRa.47042.1 | 7.53 ↑ | PITA_13092 | Sense, intergenic (31,121) | Same strand, upstream | Transcript with domain: PP2C | |
lncRNAPiRa.19024.1 | 5.58 ↑ | PITA_42377 | 5.17 ↑ | Sense, intergenic (542) | Same strand, downstream | Non-symbiotic hemoglobin 1 (HB) |
lncRNAPiRa.25700.7 | 3.79 ↑ | PITA_23327 | Sense, genic | Containing, intronic | Peptidase S9 | |
PITA_25465 | Sense, intergenic (541) | Same strand, downstream | Prolyl endopeptidase | |||
lncRNAPiRa.25968.1 | 2.91 ↑ | PITA_42840 | 4.85 ↑ | Sense, intergenic (33,267) | Same strand, downstream | Transcript with domain: USP |
lncRNAPiRa.29628.1 | 6.8 ↑ | PITA_10474 | Sense, intergenic (812) | Same strand, downstream | Transcript with domain: Glycolytic-Fructose-bisphosphate aldolase class-I | |
lncRNAPiRa.80857.1 | 6.78 ↑ | PITA_28959 | Antisense, genic | Nested, intronic | ALN | |
lncRNAPiRa.29753.1 | 7.2 ↑ | PITA_38537 | 6.4 ↑ | Sense, intergenic (69,405) | Same strand, downstream | leaf rust 10 disease-resistance locus receptor-like protein kinase-like protein 2.4 |
lncRNAPiRa.33098.2 | 6.8 ↑ | PITA_43179 | 5.01 ↑ | Sense, intergenic (6,072) | Same strand, downstream | 4-coumarate-CoA ligase, partial (4CL3) |
lncRNAPiRa.80336.1 | 4.89 ↑ | PITA_17252 | Antisense, intergenic (47,603) | Convergent, downstream | Protein chromatin remodeling 24 | |
lncRNAPiRa.61651.4 | 5.82 ↑ | PITA_42898 | Antisense, intergenic (7,280) | Convergent, downstream | unknown [Picea sitchensis] | |
lncRNAPiRa.64704.5 | 6.91 ↑ | PITA_22879 | Sense, intergenic (6023) | Same strand, downstream | Lambda class glutathione S-transferase (GSTL1) | |
lncRNAPiRa.75647.1 | 5.93 ↑ | PITA_04032 | Antisense, intergenic (8,949) | Divergent, upstream | Transcript with domain: RRM_1 | |
lncRNAPiRa.85000.6 | 9.91 ↑ | PITA_16990 | Sense, intergenic (5,468) | Same strand, upstream | Ethylene receptor 2 (ETR2) | |
lncRNAPiRa.33190.1 | 7.85 ↑ | PITA_44567 | Sense, intergenic (66,345) | Same strand, downstream | Transcript with domain: EamA | |
lncRNAPiRa.31184.1 | 2.25 ↑ | PITA_16807 | Antisense, intergenic (55,755) | Divergent, upstream | Transcript with domain: LEA_3 | |
lncRNAPiRa.78332.11 | 2.65 ↑ | PITA_41139 | 4.04 ↑ | Sense, genic | Overlapping, intronic | CBS domain-containing protein cbscbspb3 |
lncRNAPiRa.42813.1 | 9.29 ↑ | PITA_02986 | Sense, intergenic (190) | Same strand, downstream | Hypothetical protein 0_9919_01, partial [Pinus taeda] | |
lncRNAPiRa.78487.3 | 9.39 ↑ | PITA_28133 | Antisense, intergenic (33,344) | Convergent, downstream | Transcript with domain: zf-CCCH | |
lncRNAPiRa.84511.1 | 4.53 ↑ | PITA_13110 | 6.3 ↑ | Sense, intergenic (82) | Same strand, downstream | Transcript with domain: Cellulase |
lncRNAPiRa.62823.1 | 9.14 ↓ | PITA_01229 | Sense, intergenic (76,300) | Same strand, downstream | UBA52 | |
lncRNAPiRa.83146.2 | 7.43 ↓ | PITA_05626 | Sense, intergenic (69,363) | Same strand, downstream | Pyridoxal kinase-like protein isoform X1 | |
lncRNAPiRa.38350.3 | 6.97 ↓ | PITA_18454 | Sense, intergenic (494) | Same strand, downstream | CC-NBS-LRR resistance-like protein |
a The symbol ↑ refers to up-regulated expression and ↓ refers to down-regulation expression of lncRNAs and genes
b Numbers in parenthesis indicate the genomic distance between the lncRNA and its potential target gene
The pine lncRNA lncRNAPiRa.79902.12 was predicted to target two genes encoding for the pyruvate decarboxylase 1 (PDC1) enzyme, which both were up-regulated by the pathogen infection. Furthermore, one gene that participates in chromatin modifications (chromatin remodelling 24) and three genes that contain canonical RNA-binding domains (pentatricopeptide repeat-containing protein, ribosomal RNA methyltransferase FtsJ domain containing protein, CCCH-type Znf protein) were predicted to be targeted in an antisense manner by lncRNAs. None of the latter three genes were differentially expressed in the coding RNAs expression analysis.
The enrichment analysis of Gene Ontology (GO) terms and KEGG pathways of the nearby protein-coding RNAs revealed potential functions in which DE lncRNAs could be involved (Fig. 5). The three target genes regulating the down-regulated lncRNAs were not associated to any GO term neither KEGG pathway, thus the analysis showed results only for the up-regulated lncRNAs (Table S12). Biological and metabolic processes were the most representative GO terms for the biological process category, followed by macromolecule metabolic process and response to stimulus and stress in this dataset. Several GO terms associated with low-oxygen conditions including response to hypoxia and response to decreased oxygen levels were enriched. In addition, catabolism and metabolism of allantoin were also enriched. Genes involved in cell periphery and cell wall were represented for cellular components. For molecular functions, the pine lncRNAs were enriched for GO terms such as catalytic activity, binding and hydrolase activity. The KEGG pathways enriched in the target genes of the up-regulated lncRNAs were ‘glycolysis/gluconeogenesis’ and ‘microbial metabolism in diverse environments’ (Table S13).
Co-expression gene modules associated with P. radiata defence response
A dendrogram, in which the samples were clustered according to their condition using the CEMiTool package, was generated (Fig. 6A). The modular expression analysis revealed genes that may act together or are similarly regulated during the defence responses to F. circinatum infection. The dissimilarity threshold of 0.8 was used as a cut-off on hierarchical clustering, which identified two co-expression modules (Fig. 6B and C). The largest module contained 320 co-expressed transcripts (M1): 307 DEGs, 13 DE lncRNAs, and three targeted genes (PDC1, PME and RLK; Table S14). Transcripts in M1 were enriched mainly for biological processes related to the pectinesterase activity and cell wall remodeling among others (Fig. 6D; Table S15). Indeed, three DEGs encoding for pectin methylesterase 17 were identified as gene hubs in this module (Table 3). The second module (M2) consisted of 30 DEGs and one DE lncRNA (Table S14), however, no significant GO terms were identified. The top gene hubs of both modules are shown in Table 3.
Table 3.
Transcript | Description |
---|---|
Hub genes - M1 | |
PITA.22172.1 | Pectin methylesterase 17 |
PITA.22173.1 | Pectin methylesterase 17 |
PITA_04671 | Pectin methylesterase 17 |
PITA.84236.10 | Alcohol dehydrogenase, partial (ADH1) |
PITA_08271 | Early nodulin-93-like |
Hub genes - M2 | |
PITA.37728.4 | 2-methylene-furan-3-one reductase |
PITA.32347.3 | unknown |
PITA.69828.1 | hypothetical protein |
PITA.7538.2 | Glutathione S-transferase, partial (GST) |
PITA.87100.2 | Pheophytinase, chloroplastic-like |
Discussion
Over the past decade, the complexity of eukaryote genome expression has become apparent mainly due to the development of next-generation sequencing technologies. Particularly, the sequencing of RNA (RNA-Seq) has revealed an important part of non-coding transcriptome that should not be ignored. Indeed, a large number of studies have recently reported lncRNAs to be essential in the regulation of a wide range of biological and molecular processes by activating their nearly protein-coding genes using a cis–mediated mechanism or distant genes in a trans-acting manner [36]. Stress conditions lead to transcriptomic reprogramming where lncRNAs also play a key role. In plants, numerous lncRNAs under biotic stress have been identified to date, although further studies for non-model plants are still required. In the last years, the transcriptomic responses of conifers to fungal infections have been increasingly studied. In particular, several transcriptomic studies have demonstrated that the F. circinatum infection causes substantial changes in the pine gene expression [30–35]. However, to our knowledge, no reports investigating the long non-coding RNAs of conifer trees in response to fungal attacks have been published so far. The results reported here, therefore, provide a first insight into the regulatory mechanisms of lncRNAs involved in defence reactions against F. circinatum of a highly susceptible species such as P. radiata at an early stage of infection.
The percentage of P. radiata reads mapped the P. taeda reference genome conforms to acceptable mapping ratios [37], which can be expected as genome-wide comparisons between both species have revealed a significant collinearity between their genomes using restriction fragment length polymorphisms (RFLPs) and microsatellite markers [38]. Although P. taeda and P. radiata belong to different subsections (Australes and Oocarpae, respectively), they are in the same subgenus (Pinus) and section (Pinus). Therefore, in absence of conifer genome sequence of the species of interest, the use of those of taxonomically related species in order to perform an unannotated transcript identification pipeline can be employed to assist and improve the transcript assembly process when using short reads [39].
The combination of the strand-specific RNA-Seq approach and high coverage sequencing (up to 84 million reads per sample) allowed the identification of lncRNAs that are commonly expressed at low levels and lncNATs that would otherwise have been difficult to find [40]. Moreover, the visual analysis of the PCA identified two sample outliers among the pathogen-inoculated condition (Figure S2), which may be
due to technical failures during the multi-step process of RNA-Seq experiment (mRNA isolation, reverse transcription, library construction and sequencing). Since these errors, known as batch effects, lead to decreased statistical power [41], outliers were discarded for the differential expression analysis. Overall, a total of 13,312 lncRNAs were identified from the P. radiata transcriptome, of which 164 were F. circinatum-responsive lncRNAs comprised mainly by intergenic lncRNAs (Table S6-S7). This is consistent with previous analyses where the number of lncRNAs in response to biotic stress was comparable. In Paulownia tomentosa, two similar studies found 112 and 110 lncRNAs to be involved in phytoplasma infection [22, 42]. Similarly, among 94 and 302 lncRNAs were identified in susceptible and resistant M. acuminata roots in response to F. oxysporum f. sp. cubense, with the highest value in the resistant roots after 51 h post-inoculation [21]. The number of S. sclerotiorum-responsive lncRNAs was slightly higher in B. napus with 662 at 24 h decreasing until 308 at 48 h [20]. In addition, intergenic lncRNAs were also the most abundant responsive transcripts in all these studies. Therefore, the pattern appears to follow the same trend in conifer trees.
In general, lncRNAs demonstrate low and tissue-specific expression patterns and lack of conservation [3, 43, 44]. Indeed, lncRNAs of P. radiata showed lower expression than the protein-coding RNAs, and only 19.3% of them were conserved among 46 different plant species. However, the low level of transcriptome conservation in P. radiata to angiosperms has also been shown in xylem tissues (15-32%; E-value ≤ 10−5), compared with the highly conserved xylem transcriptome within conifers (78-82%; E-value ≤ 10−5) [45]. Thus, it may not be a characteristic of conifer lncRNAs. The genomic features of the lncRNA transcripts of P. radiata were consistent with those previously characterized in other organisms [46]. As expected, the lncRNAs were shorter in terms of overall length and contained lower number of exons (Fig. 3 A and B). The analysis showed a bias toward two-exon lncRNAs, which is explained by the retention of only monoexonic transcripts with antisense localization in the identification process of lncRNAs. The length of the exons was also shorter in lncNATs and intronic lncRNAs when comparing with protein-coding RNAs, however, the distribution of the length of exons belonging to lincRNAs was closer to that of the protein-coding transcripts (Fig. 3C). In this regard, some exceptions have been found in other plants such as cotton (Gossypium arboretum) and chickpea (Cicer arietinum) where the exon length of the lincRNAs was even longer than protein-coding RNAs [47]. The GC content of the assembled transcripts of P. radiata (43.1%) was similar to that of the transcriptome of other Pinus species such as P. tecunumanii (44%) [32]. Separately, the GC content in pine lncRNAs (41%) was lower than in protein-coding RNAs (44.8%), which had been reported before as a common feature of lncRNAs due to different evolutionary pressures in ORFs [48].
The role of lncRNAs in the positive or negative regulation of gene expression is well known [3]. One of the conserved mechanisms of action of the lncRNAs is their function as decoys by sequestering RNA-binding proteins (RBP), miRNAs or chromatin-modifying complexes [6]. Thus, the lncRNA ultimately inhibits its particular function. Several DE lncRNAs of P. radiata inoculated by F. circinatum seem to fit into this functional mechanism. Four antisense lncRNAs were predicted to target genes encoding RBPs including pentatricopeptide repeat-containing protein (PPR2), ribosomal RNA methyltransferase FtsJ domain containing protein, CCCH-type zinc finger protein and RNA recognition motif (RRM) containing protein. Moreover, another antisense DE lncRNA was predicted to target a chromatin-remodelling gene (Table 2). Therefore, the reprogramming exerted by the infection of F. circinatum on pine transcription affects not only the protein-coding genes, but also the non-coding part of the genome.
The induction of plant defences is a complex biological process that causes a dramatic transcriptomic reprogramming throughout the genome [49]. Previous studies have shown that a vast number of genes are either up- or down-regulated in response to F. circinatum infection [31, 33–35]. Several functional groups of genes have repeatedly been identified as induced upon the pathogen infection. These groups include signal perception and transduction, biosynthesis of defence hormone and secondary metabolites, and cell wall reinforcement and lignification. Some of the GO terms enriched by the potential target genes of the lncRNAs identified in this study were related to these functional groups including biological processes such as cell wall modification and signalling of the abscisic acid (ABA), ethylene (ET) and cytokinin hormones (Table S12). These results suggest for the first time that the lncRNAs may play a key role in the process of pine defence to F. circinatum as previously reported in other pathosystems [8, 50]. Indeed, the enrichment of various GO terms related to the ABA signalling suggests an involvement of the pine lncRNAs in this pathway. In turn, ABA accumulation has been previously associated with increased PPC susceptibility [51–54], therefore, a deeper investigation of these lncRNAs in the pine is needed in order to better understand the complex regulation of ABA responses.
Plant signalling molecules such as protein kinases, reactive oxygen species (ROS) and hormones are critical in mounting an appropriate defence response [55]. Genes with kinase activity have a role in signal transduction triggering the downstream signalling. Two genes with predicted functions in receptor-like kinase were cis-regulated by lncRNAs, being one of them DE by the pathogen infection (Table 2). The other one was potentially regulated by a lncNAT. Positive cis-regulatory feature of NATs by mediating histone modifications at the locus has been previously reported [44]. This behaviour has been also seen in LAIR, a rice lncNAT that up-regulates the expression of its neighbour leucine-rich repeat receptor kinase [56]. Despite that a large number of genes [43] encoding glutathione S-transferases (GSTs) were up-regulated under the pathogen infection (Table S8), the GST predicted to be regulated by the downstream lncRNAPiRa.64704.1 was not among the DEGs. Joshi et al. [20] also identified one lncRNA of B. napus located in the upstream of a gene encoding for a GST in response to S. sclerotiorum infection. GST genes are highly induced under biotic stress due to their role in the attenuation of oxidative stress and the participation in hormone transport [57]. In addition, a transcript predicted to encode a non-symbiotic hemoglobin 1, which is involved in ROS and NO scavenging [58], was DE in the analysis and predicted to be targeted by lncRNAPiRa.19024.1 (Table 2). These findings seem to indicate that lncRNAs could be also involved in the cell detoxification after an oxidative burst provoked by a fungal infection.
Phytohormones trigger an effective defence response against biotic stress [59]. Several studies have pointed to lncRNAs as participants in the complex network of hormone regulation. In M. acuminata infected by F. oxysporum f. sp. cubense, lncRNAs were found to be predominantly associated with auxin and salicylic acid signal transduction in susceptible cultivars, whereas all phytohormones were potentially regulated by lncRNAs in resistant cultivars [21]. Genes related to the salicylic acid-mediated defence process were co-expressed with lncRNAs in kiwifruit plant challenged with the bacteria P. syringae [23]. Likewise, lncRNAs of resistant walnuts to C. gloeosporioides were predicted to trans-regulate genes involved in defence pathways of the jasmonic acid and auxins [24]. A previous transcriptome analysis of P. radiata showed the induction of abscisic acid signalling under the infection of F. circinatum [31]. A type 2 C protein phosphatase (PP2C) family gene, which negatively regulates abscisic acid responses [60, 61], could be regulated by lncRNAPiRa.47042.1 located upstream in the same strand despite not belonging to the DEGs (Table 2). The implication of this lncRNA in the abscisic acid signalling regulation would need further investigation.
The phytohormone ethylene represents one of the core components of the plant immune system [62]. When ethylene binds with its ETRs activates the transcriptional cascade of ethylene-regulated genes [63]. Seedlings of P. tecunumanii, P. patula, P. pinea and P. radiata inoculated with F. circinatum have demonstrated to induce ethylene biosynthesis and signalling genes [31, 33, 35]; however, only ETR2 has been found to be induced in the moderate resistant specie P. pinaster at 5 and 10 dpi [34]. Under stress conditions, when the concentration of ethylene is high, the transcription of ETR2 contributes to the stabilization of ethylene levels by attenuating its signalling output and restore the ability to respond to subsequent ethylene signal [64]. In the present study, ETR2 has not been DE in P. radiata but was presumably influenced by lncRNAPiRa.85000.6, which has been DE by F. circinatum (Table 2). Therefore, we can hypothesize that the ethylene response seems to be fine-tuned in P. pinaster, which does not occur in P. radiata, possibly due to the influence of this lncRNA located upstream of its transcription. It would be worthwhile to further investigate the regulatory function of this lncRNA as it could be a key factor in overcoming the PPC disease.
The potential function of lncRNAs in wood formation has been previously observed in different plant species. In a study of cotton lncRNAs, these were enriched for lignin catabolic processes and their role in lignin biosynthesis by regulating the expression of LAC4 was suggested [65]. In Populus, 16 genes targeted by lncRNAs were involved in wood formation processes, including lignin biosynthesis [9], and 13 targeted genes were associated to cellulose and pectin synthesis [66]. In addition, the lncRNA NERDL regulates the Needed for rdr2-independent DNA methylation (NERD) gene, which is also involved in the wood formation in Populus [67]. The enzyme that catalyse the hemicellulose xyloglucan was predicted to be targeted by a lncRNA of Paulownia tomentosa and had a role in the hyperplasia caused by a phytoplasma infection [22]. Cell wall reinforcement and lignification are the most common induced defences against pathogens, for that, the cell wall suffers a remodelling process that has been documented in the P. radiata-F. circinatum pathosystem [31, 35]. The demethylesterification of pectin, controlled by PMEs, is considered to affect the porosity of the cell wall and, thus, exposes the plant to an easier degradation by pathogen enzymes [68]. However, PME activity has been also associated with the activation of plant immunity and resistance against pathogens [69]. In a recent study, in contrast to P. radiata, the resistant species P. pinea infected by F. circinatum showed a high induction of pectin methylesterase inhibitor (PMEI) genes and an inhibition of PMEs [35]. In this study, two lncRNAs were predicted to target two PMEs, one of them was up-regulated by the pathogen infection, which could suggest a positive regulation from the lncRNA activity (Table 2). In addition, the co-expression analysis of F. circinatum responsive lncRNAs and mRNAs indicated a clear enrichment for PME activity (Fig. 6D). The transcriptional regulation of these enzymes could be related to the susceptibility of P. radiata and would be worth further investigation. Another gene containing a cellulase domain was also up-regulated in the expression analysis of protein-coding RNAs and predicted to be regulated by an induced lncRNA (Table 2). Moreover, the analysis identified a potential lncRNA cis-regulating positively a gene encoding for 4CL3 (Table 2), one of the key enzymes of the phenylpropanoid pathway. In plants, this pathway leads to the production of secondary metabolites and cell wall lignification, both associated to plant defence. The transcriptional regulation of the 4CL gene by lncRNAs has been also reported in P. tomentosa, that together with the targeted gene encoding the caffeoyl-CoA 3-O-methyltransferase (CCOMT) enzyme by another lncRNA, highlighted the potential role of these molecules in lignin formation in wood with different properties [9]. These findings provide increasing evidence for the involvement of lncRNAs in cell wall remodelling and lignification process.
Although the role of the hypoxia in the plant-pathogen interaction has not yet been determined, hypoxia-responsive genes have been reported to be induced in some plants during pathogen infections [70]. Indeed, the analysis of DEGs showed that a large number of genes encoding for PDC1 and alcohol dehydrogenase 1 (ADH1), which are required in the fermentative pathway under low-oxygen conditions, were highly induced by F. circinatum infection (>10 log2[fold change]; Table S8). Among them, two PDC1 were potentially targeted by two pine lncRNAs (Table 2). This together with the functional analysis results of the lncRNAs where several enriched GO terms were associated to hypoxia suggests a role of pine lncRNAs in an insufficient oxygen situation.
Conclusions
In summary, the computational analysis allowed to identify 13,312 lncRNAs in P. radiata. Compared to the protein-coding RNAs, the lncRNAs were shorter, with fewer exons and showed lower expression levels. In total 164 lncRNAs were reported as responsive to F. circinatum infection. GO enrichment of genes that either overlap with or are neighbours of these pathogen-responsive lncRNAs suggested involvement of important defence processes including signal transduction and cell wall reinforcement. These results present a comprehensive map of lncRNAs in P. radiata under F. circinatum infection and provide a starting point to understand their regulatory mechanisms and functions in conifer defence. In turn, a thorough understanding of the mechanism of gene regulation will contribute to the improvement of breeding programs for resistant pine commercialization, one of the most promising approaches for PPC management.
Methods
Inoculum preparation and inoculation trial
The F. circinatum isolate 072 obtained from an infected P. radiata tree in the North of Spain (Cantabria, Spain) was used. The isolate was cultured in Petri dishes containing PDA medium (Scharlab S.L., Spain) for a week at 25 °C. Then, to stimulate the sporulation of the fungus, four mycelial agar plugs were subcultured in an Erlenmeyer flask with 100 mL of PDB medium (Scharlab S.L., Spain) and incubated in an orbital shaker at 150 rpm during 48 h at 25ºC. Afterwards, the conidial suspension was adjusted with a haemocytometer at 106 conidia mL−1 for the inoculation.
Six-month-old seedlings from P. radiata seeds originating from the same provenance (Galicia, Spain), which had been assessed and provided by the Consellería do Medio Rural (Xunta de Galicia, Spain), were used for the inoculation trial. The plants, with an approximate stem diameter of 2.5 ± 0.5 cm, were inoculated on the stem by making a wound with a sterile scalpel and pipetting 10 µL of conidial suspension [71]. The same process was applied for the control seedlings that were mock-inoculated with sterilized distilled water. The inoculated wound was immediately sealed with Parafilm® to prevent drying. Sixty seedlings were inoculated for each treatment (inoculation with pathogen and mock-inoculation). Plants were placed in a growth chamber at 21.5 ºC with a 14-h photoperiod and kept for 67 days during which symptoms were monitored according to the scale of symptoms (slightly modified) described by Correll et al. [72], where 0 = healthy plant, 1 = resin and/or necrosis at the point of inoculation and healthy foliage, 2 = resin and/or necrosis beyond the point of inoculation, 3 = accentuated wilting and appreciable dieback, 4 = dead plant. Mortality rates were daily recorded.
The survival analysis based on the non-parametric estimator Kaplan-Meier [73] was performed with the “Survival” package [74] to test the mortality of the plants. Survival curves were created with the “Survfit” function and the differences between the curves were tested with the “Survdiff” function. All analyses were performed using R software environment [75].
RNA extraction and paired-end strand-specific sequencing
A piece of the stem from the upper part of the inoculation point (ca. 1 cm length) was sampled at four dpi for the transcriptomic analysis. The timing of sampling was choosen since it could be an adequate representation of the initial phase of the infection process, which has been previously established as a seven-day period [76]. The harvested tissues were immediately frozen in liquid nitrogen and ground to a fine powder using a mortar and pestle. RNA extractions were performed using the Spectrum™ Plant Total RNA Kit (Sigma Aldrich, USA) following the manufacturer’s protocols including the optional on-column DNase 1 digestion (DNASE10-1SET, Sigma-Aldrich, St. Louis, MO, USA). After RNA extraction, samples were transferred to RNase- and DNase-free tubes (Axygen®, USA) and stored at -80 °C. The concentration and purity of the RNA extracted were measured using the Multiskan GO Spectrophotometer (A260/A280 ≥ 1.8, A260/A230 ≥ 1.8 and concentration > 50 ng/µl; Thermo Fisher Scientific, Waltham, MA, USA). RNA integrity was checked by agarose gel electrophoresis (1% TAE).
Six biological replicates of inoculated and three of mock-inoculated treatment randomly selected from the inoculation trial were sent to Macrogen Co. (Seoul, South Korea) for sequencing. Sequenced samples showed a RNA integrity number (RIN) ≥ 7 measured by an Agilent 2100 Bioanalyzer. The strand-specific RNA-Seq libraries were constructed using the Illumina TruSeq Stranded mRNA protocol with polyadenylated mRNAs and lncRNAs enrichment and an insert size of 300 bp (150 × 2 paired-end reads). Sequencing was performed on the Illumina NovaSeq 6000 Sequencing System (Illumina Inc., USA).
Genome mapping and reference-based transcriptome assembly
All sequenced libraries were assessed for quality control using FastQC v.0.11.9 [77] and trimmed for Illumina adaptor sequences and low-quality base-calls using Trimmomatic v.0.38 [78]. The trimmed reads with high quality were then aligned to the P. taeda reference genome sequence (Pita_v2.01; Treegenes database [79]) using HISAT2 v.2.0.0 [80] with parameters “--known-splicesite-infile”, “--dta” and “--rna-strandness RF”. In order to ensure the presence of F. circinatum biomass in the samples, the reads were also mapped to its publically available genome sequence (accession number JAGGEA000000000). The SAM files from the pine mapping were processed with the SAMtools utility [81] for converting to binary alignment map (BAM) format, sorting by coordinates and removing duplicates. The transcripts for each sample were reconstructed separately by StringTie v.2.1.4 [82] using the “-G option” with the annotation file of P. taeda (Pita.2_01.entap_annotations.tsv; Treegenes database [79]. This file was previously fixed with Gffread utility v.0.12.1 [83] for the correct understanding by StringTie program. After the transcriptome assembly, the nine resulting GTF files were merged to generate a non-redundant set of transcripts with unique identifiers using the StringTie “-merge” parameter, where only transcripts with expression levels > 0.1 FPKM were included. Finally, this newly experiment-level transcriptome was further compared with the P. taeda reference annotation GTF file (Pita_v2.01; Treegenes database [79]) using the software Gffcompare v.0.12.1 [83], classifying transcripts in different class codes according to their nature/origin.
LncRNAs identification
Based on all the assembled transcripts, the known transcripts marked with the class code “=” were excluded before conducting the potential long non-coding RNAs identification. The remaining transcripts were subjected to the coding potential predictor FEELnc v.0.2 tool [84] as well as several filters to ensure reliability of lncRNAs. Firstly, the FEELnc filter module was used to remove short transcripts (< 200 nt) and keep only monoexonic transcripts with antisense localization. After that, the sequences of the resulting transcripts were extracted with Gffread v.0.12.1 [83] and the fasta file output was piped to the Eukaryotic Non-Model Transcriptome Annotation Pipeline (EnTAP) v.0.9.2 [85] for transcript annotation. Briefly, GeneMarkS-T v.5.1 [86] was used for ORF prediction and the sequence aligner DIAMOND v.1.9.2 [87] conducted the similarity search with default settings (E-value < 10−5) using the NCBI non-redundant protein database (release-201). After that, the assignment of protein domains (Pfam), GO terms and KEGG pathways was performed using EggNOG v.1.0.3 [88]. Finally, EnTAP filtered contaminants to retain only high-quality transcripts. Subsequently, the FEELnc codpot module was used with the shuffling mode to calculate a coding potential score (CPS) for the un-annotated transcripts using a random forest algorithm trained with multi k-mer frequencies and relaxed ORFs. The specificity threshold was set at 0.95 in order to increase the robustness of the final set of novel lncRNAs. The remaining transcripts were designated as lncRNAs and further classified according to the ‘Gffcompare’ output as lincRNAs categorized with class code ‘u’, lncNAT from the class code ‘x’, and intronic transcripts that were those with class code ‘i’ [89].
In order to investigate the conservation of the pine lncRNAs, two recently released and updated databases of known plant lncRNAs were used [40]. All the transcripts designated as lncRNA were aligned against CANTATA database [90] and GreeNc database [91] using the blastn algorithm (E-value <10−5) of the BLAST v.2.9.0 software suite [92, 93]. Moreover, the transcripts were also aligned to the Rfam (version 14.1) and miRBase (version 21) non-coding RNA databases with designated threshold value (E-value <10−5) using the blastn algorithm in order to detect housekeeping non-coding RNAs including tRNAs, rRNAs and snoRNAs, and miRNA precursors.
Differential expression analysis
StringTie together with the “-e” parameter was employed to estimate expression for all transcripts of the experiment-level transcriptome [82]. The output file was reformatted using the “prepDE.py” script for further expression analysis [94]. DESeq2 v.1.24.1 [95] was used to identify DE lncRNA transcripts based on the matrix of the estimated counts. DEGs were identified equally. The pairwise comparison of inoculated and control plants were evaluated using Wald tests. To visualize the similarity of the replicates and identify any sample outliers, the PCA was constructed using the rlog-transformed expression values. Transcripts were considered as differentially expressed if the adjusted p-values (padj) for multiple testing, using Benjamini–Hochberg to estimate the false discovery rate (FDR) [96], was less than 0.05 and the |log2 (Fold Change)| ≥ 1.
Potential target gene prediction and functional enrichment
Based on the genome location of the lncRNAs relative to the neighbouring genes, the nearest protein-coding genes transcribed within a 10 kb window upstream or 100 kb downstream were considered as potential cis-regulated target genes. These genes were identified using the FEELnc classifier module [84] and annotated using the EnTAP pipeline [85] as described above but implemented with the RefSeq complete protein database (release-201) and the UniProtKB/Swissprot database (release-2020_05).
Functional enrichment analysis of the target genes associated with the DE lncRNAs was conducted. DE lncRNA transcripts were divided into up- and down-regulated subsets for efficient functional analysis [97]. Using all genes as background, GO and KEGG enrichment analysis were conducted by GOSeq v.1.38.0 based on the Wallenius non-central hyper-geometric distribution that allows the adjustment for transcript length bias [98]. The GO terms and KEGG pathways with corrected p-values lower than 0.05 were considered to be enriched in the group. Redundant gene ontology categories were parsed using Revigo [99].
Co-expression analysis and identification of hub genes
In order to predict the co-expression modules and determine the GO terms that differentiate the transcriptome induced by F. circinatum, a weighted gene co-expression network analysis approach implemented in the R-based Co-Expression Modules identification Tool (CEMiTool) package v.1.8.3 [100] was conducted in R software. Network analysis was carried out on the expression data for three gene sets: DE lncRNAs, DEGs and targeted genes predicted by FEELnc. A variance stabilizing transformation (VST) was used and transcripts were filtered to reduce correlation between variance and gene expression. The Spearman’s method was used for calculating the correlation coefficients and a soft thresholding power (β) of 6 was selected. The co-expressed modules were subjected to over-representation analysis (ORA) based on the hypergeometric test [101] using the GO terms to determine the most significant module functions (q-value <= 0.05) [90]. Moreover, genes with the highest connectivity, known as hub genes and considered functionally important genes [102] were identified in each module.
Additional files
Acknowledgements
The authors thank Marcos García for informatics support and Sergio Díez for helping in R scripting. In addition, we would like to thank Jonatan Niño for reviewing the text. Special thanks to Alex Douglas for hosting CZ-B at the University of Aberdeen (Scotland) and teaching her bioinformatics. The data analysis has been carried out using the resources of the “Centro de Supercomputación de Castilla y León” (SCAYLE) under the valuable technical support of Carmen Calvo and Jesús Lorenzana.
Abbreviations
- ncRNA
Non-coding RNA
- miRNAs
MicroRNAs
- lncRNAs
Long non-coding RNA
- NGS
Next generation sequencing
- PPC
Pine Pitch Canker
- RNA-Seq
RNA sequencing
- dpi
Days post inoculation
- polyA
Polyadenylated
- lincRNAs
Long intergenic non-coding RNAs
- lncNATs
Long non-coding natural antisense transcripts
- rRNA
Ribosomal RNAs
- tRNA
Transfer RNAs
- snoRNAs
Small nucleolar RNAs
- bp
Base pair
- FPKM
Fragment per kilobase of exon per million mapped reads
- PCA
Principal component analysis
- DE
Differentially expressed
- kb
Kilobases
- mRNA
Messenger RNA
- RLKs
Receptor-like protein kinases
- PMEs
Pectin methylesterases.
- 4CL3
4-coumarate-CoA ligase 3
- DEGs
Differentially expressed genes
- ETR2
Ethylene receptor 2
- PAR1
Photoassimilate-responsive protein 1
- PDC1
Pyruvate decarboxylase 1
- FC
Fold-change
- GO
Gene Ontology
- KEGG
Kyoto Encyclopedia of Genes and Genomes
- GSEA
Gene set enrichment analysis
- NES
Normalised Enrichment Score
- BH
Benjamini-Hochberg
- RBPs
RNA-binding proteins
- PPR2
Pentatricopeptide repeat-containing protein
- RRM
RNA recognition motif
- ROS
Reactive Oxygen Species
- GST
Glutathione S-transferases
- NO
Nitric Oxide
- PP2C
Type 2 C protein phosphatase
- NERD
Needed for rdr2-independent DNA methylation
- PMEI
Pectin methylesterase inhibitor
- CCOMT
Caffeoyl-CoA 3-O-methyltransferase
- ADH1
Alcohol dehydrogenase 1
- PDA
Potato Dextrose Agar
- PDB
Potato Dextrose Broth
- TAE
Tris-Acetate-EDTA
- RIN
RNA integrity number
- BAM
Binary Alignment Map
- EnTAP
Eukaryotic Non-Model Transcriptome Annotation Pipeline
- CPS
Coding Potential Score
- ORF
Open Reading Frame
- padj
Adjusted p-value
- FDR
False Discovery Rate
- VST
variance stabilizing transformation
- ORA
Over-representation analysis
Authors’ contributions
CZ-B and JM-G performed the experimental work. CZ-B conducted the bioinformatics analysis and interpreted data analyses. AS-V supervised the bioinformatics analysis. JJD obtained funding to support the research. CZ-B wrote the manuscript with input from JM-G, AS-V and JJD, who supervised aspects of this research. The authors read and approved the final manuscript.
Funding
This work was supported by project PID2019-110459RB-I00 funded by MICINN (Spain) as well as the project VA208P20 funded by JCYL (Spain), both co-financed by FEDER (EU) budget. University of Valladolid supported CZ-B (Rector’s Resolution of November 11, 2016).
Availability of data and materials
Pine seeds and fungal isolate used in this study will be provided upon request in order to support reproducibility. Likewise, the datasets generated and analysed during the current study are available in the SRA database (https://www.ncbi.nlm.nih.gov/sra) of NCBI repository. The SRA accession numbers are SRR15100123-31 (BioProject PRJNA742852; https://www.ncbi.nlm.nih.gov/search/all/?term=PRJNA742852). The reference genome (Pita_v2.01) is publicly available in Treegenes database (https://treegenesdb.org/FTP/Genomes/Pita/v2.01/genome/Pita.2_01.fa.gz).
Declarations
Ethics approval and consent to participate
Not applicable. Pine seedlings used in this study were sourced from the provenance 01. Galicia litoral provided by the Consellería do Medio Rural (Xunta de Galicia, Spain). The voucher specimen of P. radiata had been deposited to our lab but not to any publicly available herbarium. Fusarium circinatum 072 isolate used in this study was obtained with the pertinent permission from the competent administration (Government of Cantabria), an institution that was also collaborating in the research study that was ongoing at the time. This isolate is maintained by our group.
Consent for publication
Not applicable.
Competin interests
The authors declare that they have no conflicts of interest.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Cristina Zamora-Ballesteros, Email: cristinazamoraballesteros@gmail.com.
Jorge Martín-García, Email: jorgemg@pvs.uva.es.
Julio Javier Diez, Email: julioJavier.Diez@uva.es.
References
- 1.Eddy SR. Non–coding RNA genes and the modern RNA world. Nat. Rev. Genet. 2001;2:919–929. doi: 10.1038/35103511. [DOI] [PubMed] [Google Scholar]
- 2.Kapranov P, Cheng J, Dike S, Nix DA, Duttagupta R, Willingham AT, et al. RNA maps reveal new RNA classes and a possible function for pervasive transcription. Science. 2007;316:1484–8. doi: 10.1126/science.1138341. [DOI] [PubMed] [Google Scholar]
- 3.Quan M, Chen J, Zhang D. Exploring the secrets of long noncoding RNAs. Int J Mol Sci. 2015;16:5467–96. doi: 10.3390/ijms16035467. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Gil N, Ulitsky I. Regulation of gene expression by cis-acting long non-coding RNAs. Nat Rev Genet. 2020;21:102–17. doi: 10.1038/s41576-019-0184-5. [DOI] [PubMed] [Google Scholar]
- 5.Ma L, Bajic VB, Zhang Z. On the classification of long non-coding RNAs. RNA Biol. 2013;10:924–33. doi: 10.4161/rna.24604. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Wang KC, Chang HY. Molecular Mechanisms of Long Noncoding RNAs. Mol Cell. 2011;43:904–14. doi: 10.1016/j.molcel.2011.08.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Liu X, Hao L, Li D, Zhu L, Hu S. Long Non-coding RNAs and Their Biological Roles in Plants. Genomics. Proteomics Bioinforma. 2015;13:137–47. doi: 10.1016/j.gpb.2015.02.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Sanchita, Trivedi PK, Asif MH. Updates on plant long non-coding RNAs (lncRNAs): the regulatory components. Plant Cell Tissue Organ Cult. 2020;140:259–69. doi: 10.1007/s11240-019-01726-z. [DOI] [Google Scholar]
- 9.Chen J, Quan M, Zhang D. Genome-wide identification of novel long non-coding RNAs in Populus tomentosa tension wood, opposite wood and normal wood xylem by RNA-seq. Planta. 2015;241:125–43. doi: 10.1007/s00425-014-2168-1. [DOI] [PubMed] [Google Scholar]
- 10.Heo JB, Sung S. Vernalization-mediated epigenetic silencing by a long intronic noncoding RNA. Science. 2011;331:76–9. doi: 10.1126/science.1197349. [DOI] [PubMed] [Google Scholar]
- 11.Swiezewski S, Liu F, Magusin A, Dean C. Cold-induced silencing by long antisense transcripts of an Arabidopsis Polycomb target. Nature. 2009;462:799–803. doi: 10.1038/nature08618. [DOI] [PubMed] [Google Scholar]
- 12.Qin T, Zhao H, Cui P, Albesher N, Xionga L. A nucleus-localized long non-coding RNA enhances drought and salt stress tolerance. Plant Physiol. 2017;175:1321–36. doi: 10.1104/pp.17.00574. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Franco-Zorrilla JM, Valli A, Todesco M, Mateos I, Puga MI, Rubio-Somoza I, et al. Target mimicry provides a new mechanism for regulation of microRNA activity. Nat Genet. 2007;39:1033–7. doi: 10.1038/ng2079. [DOI] [PubMed] [Google Scholar]
- 14.Shi W, Quan M, Du Q, Zhang D. The interactions between the Long Non-coding RNA NERDL and its target gene affect wood formation in Populus tomentosa. Front Plant Sci. 2017;8:1035. [DOI] [PMC free article] [PubMed]
- 15.Seo JS, Sun HX, Park BS, Huang CH, Yeh SD, Jung C, et al. ELF18-INDUCED LONG-NONCODING RNA associates with mediator to enhance expression of innate immune response genes in Arabidopsis. Plant Cell. 2017;29:1024–38. doi: 10.1105/tpc.16.00886. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Hou X, Cui J, Liu W, Jiang N, Zhou X, Qi H, et al. LncRNA39026 enhances tomato resistance to Phytophthora infestans by decoying miR168a and inducing PR gene expression. Phytopathology. 2020;110:873–80. doi: 10.1094/PHYTO-12-19-0445-R. [DOI] [PubMed] [Google Scholar]
- 17.Zhang L, Wang M, Li N, Wang H, Qiu P, Pei L, et al. Long noncoding RNAs involve in resistance to Verticillium dahliae, a fungal disease in cotton. Plant Biotechnol J. 2018;16:1172–85. doi: 10.1111/pbi.12861. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Yu Y, Zhou YF, Feng YZ, He H, Lian JP, Yang YW, et al. Transcriptional landscape of pathogen-responsive lncRNAs in rice unveils the role of ALEX1 in jasmonate pathway and disease resistance. Plant Biotechnol J. 2020;18:679–90. doi: 10.1111/pbi.13234. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Tripathi R, Chakraborty P, Varadwaj PK. Unraveling long non-coding RNAs through analysis of high-throughput RNA-sequencing data. Non-coding RNA Res. 2017;2:111–8. doi: 10.1016/j.ncrna.2017.06.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Joshi RK, Megha S, Basu U, Rahman MH, Kav NN V. Genome Wide Identification and Functional Prediction of Long Non-Coding RNAs Responsive to Sclerotinia sclerotiorum Infection in Brassica napus. PLoS One. 2016;11:e0158784. doi: 10.1371/journal.pone.0158784. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Li W, Li C, Li S, Peng M. Long noncoding RNAs that respond to Fusarium oxysporum infection in “Cavendish” banana (Musa acuminata) Sci Rep. 2017;7:1–13. doi: 10.1038/s41598-017-17179-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Wang Z, Zhai X, Cao Y, Dong Y, Fan G. Long non-coding RNAs responsive to Witches’ Broom disease in Paulownia tomentosa. Forests. 2017;8:348. doi: 10.3390/f8090348. [DOI] [Google Scholar]
- 23.Wang Z, Liu Y, Li L, Li D, Zhang Q, Guo Y, et al. Whole transcriptome sequencing of Pseudomonas syringae pv. actinidiae-infected kiwifruit plants reveals species-specific interaction between long non-coding RNA and coding genes. Sci Rep. 2017;7:1–15. doi: 10.1038/s41598-017-05377-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Feng S, Fang H, Liu X, Dong Y, Wang Q, Yang KQ. Genome-wide identification and characterization of long non-coding RNAs conferring resistance to Colletotrichum gloeosporioides in walnut (Juglans regia) BMC Genomics. 2021;22:15. doi: 10.1186/s12864-020-07310-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Wingfield MJ, Hammerbacher A, Ganley RJ, Steenkamp ET, Gordon TR, Wingfield BD, et al. Pitch canker caused by Fusarium circinatum - A growing threat to pine plantations and forests worldwide. Australas Plant Pathol. 2008;37:319–34. doi: 10.1071/AP08036. [DOI] [Google Scholar]
- 26.Drenkhan R, Ganley B, Martín-García J, Vahalík P, Adamson K, Adamčíková K, et al. Global Geographic Distribution and Host Range of Fusarium circinatum, the Causal Agent of Pine Pitch Canker. Forests. 2020;11. 10.3390/f11070724.
- 27.Zamora-Ballesteros C, Diez JJ, Martín-García J, Witzell J, Solla A, Ahumada R, et al. Pine Pitch Canker (PPC): Pathways of Pathogen Spread and Preventive Measures. Forests. 2019;10:1158. doi: 10.3390/f10121158. [DOI] [Google Scholar]
- 28.Martín-García J, Zas R, Solla A, Woodward S, Hantula J, Vainio EJ, et al. Environmentally-friendly methods for controlling pine pitch canker. Plant Pathol. 2019;68:843–860. doi: 10.1111/ppa.13009. [DOI] [Google Scholar]
- 29.Gordon TR, Swett CL, Wingfield MJ. Management of Fusarium diseases affecting conifers. Crop Prot. 2015;73:28–39. doi: 10.1016/j.cropro.2015.02.018. [DOI] [Google Scholar]
- 30.Visser EA, Wegrzyn JL, Steenkmap ET, Myburg AA, Naidoo S. Combined de novo and genome guided assembly and annotation of the Pinus patula juvenile shoot transcriptome. BMC Genomics. 2015;16:1057. doi: 10.1186/s12864-015-2277-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Carrasco A, Wegrzyn JL, Durán R, Fernández M, Donoso A, Rodriguez V, et al. Expression profiling in Pinus radiata infected with Fusarium circinatum. Tree Genet Genomes. 2017;13. 10.1007/s11295-017-1125-0.
- 32.Visser EA, Wegrzyn JL, Myburg AA, Naidoo S. Defence transcriptome assembly and pathogenesis related gene family analysis in Pinus tecunumanii (low elevation) BMC Genomics. 2018;19:632. doi: 10.1186/s12864-018-5015-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Visser EA, Wegrzyn JL, Steenkamp ET, Myburg AA, Naidoo S. Dual RNA-Seq analysis of the pine-Fusarium circinatum interaction in resistant (Pinus tecunumanii) and susceptible (Pinus patula) hosts. Microorganisms. 2019;7:7–9. doi: 10.3390/microorganisms7090315. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Hernández-Escribano L, Visser EA, Iturritxa E, Raposo R, Naidoo S. The transcriptome of Pinus pinaster under Fusarium circinatum challenge. BMC Genomics. 2020;21:1–18. doi: 10.1186/s12864-019-6444-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Zamora-Ballesteros C, Pinto G, Amaral J, Valledor L, Alves A, Diez JJ, et al. Dual RNA-Sequencing Analysis of Resistant (Pinus pinea) and Susceptible (Pinus radiata) Hosts during Fusarium circinatum Challenge. Int J Mol Sci. 2021;22:5231. doi: 10.3390/ijms22105231. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Geisler S, Coller J. RNA in unexpected places: Long non-coding RNA functions in diverse cellular contexts. Nat Rev Mol Cell Biol. 2013;14:699–712. doi: 10.1038/nrm3679. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, et al. A survey of best practices for RNA-seq data analysis. Genome Biol. 2016;17:1–19. doi: 10.1186/s13059-016-0881-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Devey ME, Sewell MM, Uren TL, Neale DB. Comparative mapping in loblolly and radiata pine using RFLP and microsatellite markers. Theor Appl Genet. 1999;99:656–62. doi: 10.1007/s001220051281. [DOI] [PubMed] [Google Scholar]
- 39.Lischer HEL, Shimizu KK. Reference-guided de novo assembly approach improves genome reconstruction for related species. BMC Bioinformatics. 2017;18:1–12. doi: 10.1186/S12859-017-1911-6/FIGURES/6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Rai MI, Alam M, Lightfoot DA, Gurha P, Afzal AJ. Classification and experimental identification of plant long non-coding RNAs. Genomics. 2019;111:997–1005. doi: 10.1016/j.ygeno.2018.04.014. [DOI] [PubMed] [Google Scholar]
- 41.Leek JT, Scharpf RB, Bravo HC, Simcha D, Langmead B, Johnson WE, et al. Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet. 2010;11:733–9. doi: 10.1038/nrg2825. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Fan G, Cao Y, Wang Z. Regulation of long noncoding RNAs responsive to phytoplasma infection in Paulownia tomentosa. Int J Genomics. 2018 doi: 10.1155/2018/3174352. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Chen L, Zhu QH, Kaufmann K. Long non-coding RNAs in plants: emerging modulators of gene activity in development and stress responses. Planta. 2020;252:92. doi: 10.1007/s00425-020-03480-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Yu Y, Zhang Y, Chen X, Chen Y. Plant noncoding RNAs: Hidden players in development and stress responses. Annu Rev Cell Dev Biol. 2019;35:407–31. doi: 10.1146/annurev-cellbio-100818-125218. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Li X, Wu HX, Southerton SG. Comparative genomics reveals conservative evolution of the xylem transcriptome in vascular plants. BMC Evol Biol. 2010;10:1–14. doi: 10.1186/1471-2148-10-190. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Cabili M, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 2011;25:1915–27. doi: 10.1101/gad.17446611. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Zaynab M, Fatima M, Abbas S, Sharif Y, Umair M, Zafar MH, et al. Role of secondary metabolites in plant defense against pathogens. Microb Pathog. 2018;124:198–202. doi: 10.1016/j.micpath.2018.08.034. [DOI] [PubMed] [Google Scholar]
- 48.Shuai P, Liang D, Tang S, Zhang Z, Ye CY, Su Y, et al. Genome-wide identification and functional prediction of novel and drought-responsive lincRNAs in Populus trichocarpa. J Exp Bot. 2014;65:4975–83. doi: 10.1093/jxb/eru256. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Kovalchuk A, Kerio S, Oghenekaro AO, Jaber E, Raffaello T, Asiegbu FO. Antimicrobial Defenses and Resistance in Forest Trees: Challenges and Perspectives in a Genomic Era. Artic Annu Rev Phytopathol. 2013;51:221–44. doi: 10.1146/annurev-phyto-082712-102307. [DOI] [PubMed] [Google Scholar]
- 50.Zhu Q-H, Stephen S, Taylor J, Helliwell CA, Wang M-B. Long noncoding RNAs responsive to Fusarium oxysporum infection in Arabidopsis thaliana. New Phytol. 2014;201:574–84. doi: 10.1111/nph.12537. [DOI] [PubMed] [Google Scholar]
- 51.Amaral J, Pinto G, Flores-Pacheco JA, Díez-Casero JJ, Cerqueira A, Monteiro P, et al. Effect of Trichoderma viride pre-inoculation in pine species with different levels of susceptibility to Fusarium circinatum: physiological and hormonal responses. Plant Pathol. 2019;68:1645–53. doi: 10.1111/ppa.13080. [DOI] [Google Scholar]
- 52.Amaral J, Correia B, António C, Rodrigues AM, Gómez-Cadenas A, Valledor L, et al. Pinus Susceptibility to Pitch Canker Triggers Specific Physiological Responses in Symptomatic Plants: An Integrated Approach. Front Plant Sci. 2019;10:509. doi: 10.3389/fpls.2019.00509. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Amaral J, Correia B, Escandón M, Jesus C, Serôdio J, Valledor L, et al. Temporal physiological response of pine to Fusarium circinatum infection is dependent on host susceptibility level: the role of ABA catabolism. Tree Physiol. 2020;tpaa143. 10.1093/treephys/tpaa143 [DOI] [PubMed]
- 54.Cerqueira A, Alves A, Berenguer H, Correia B, Gómez-Cadenas A, Diez JJ, et al. Phosphite shifts physiological and hormonal profile of Monterey pine and delays Fusarium circinatum progression. Plant Physiol Biochem. 2017;114:88–99. doi: 10.1016/j.plaphy.2017.02.020. [DOI] [PubMed] [Google Scholar]
- 55.Yu X, Feng B, He P, Shan L. From Chaos to Harmony: Responses and Signaling upon Microbial Pattern Recognition. Annu Rev Phytopathol. 2017;55:109–37. doi: 10.1146/annurev-phyto-080516-035649. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Wang Y, Luo X, Sun F, Hu J, Zha X, Su W, et al. Overexpressing lncRNA LAIR increases grain yield and regulates neighbouring gene cluster expression in rice. Nat Commun. 2018;9:1–9. doi: 10.1038/s41467-018-05829-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Gullner G, Komives T, Király L, Schröder P. Glutathione S-transferase enzymes in plant-pathogen interactions. Front Plant Sci. 2018;9:1836. doi: 10.3389/fpls.2018.01836. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Bahmani R, Kim DG, Na JD, Hwang S. Expression of the tobacco non-symbiotic class 1 hemoglobin gene hb1 reduces cadmium levels by modulating cd transporter expression through decreasing nitric oxide and ROS level in Arabidopsis. Front Plant Sci. 2019;10:201. doi: 10.3389/fpls.2019.00201. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Checker VG, Kushwaha HR, Kumari P, Yadav S. Role of phytohormones in plant defense: Signaling and cross talk. In: Singh A, Singh I, editors. Molecular Aspects of Plant-Pathogen Interaction. Singapore: Springer; 2018. pp. 159–84. [Google Scholar]
- 60.Cao J, Jiang M, Li P, Chu Z. Genome-wide identification and evolutionary analyses of the PP2C gene family with their expression profiling in response to multiple stresses in Brachypodium distachyon. BMC Genomics. 2016;17:175. doi: 10.1186/s12864-016-2526-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Jung C, Nguyen NH, Cheong JJ. Transcriptional regulation of protein phosphatase 2 C genes to modulate abscisic acid signaling. Int J Mol Sci. 2020;21:1–18. doi: 10.3390/ijms21249517. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 62.Müller M, Munné-Bosch S. Ethylene Response Factors: A Key Regulatory Hub in Hormone and Stress Signaling. Plant Physiol. 2015;169:32–41. doi: 10.1104/pp.15.00677. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Sakai H, Hua J, Chen QG, Chang C, Medrano LJ, Bleecker AB, et al. ETR2 is an ETR1-like gene involved in ethylene signaling in Arabidopsis. Proc Natl Acad Sci U S A. 1998;95:5812–7. doi: 10.1073/pnas.95.10.5812. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Zhao Q, Guo HW. Paradigms and paradox in the ethylene signaling pathway and interaction network. Mol Plant. 2011;4:626–34. doi: 10.1093/mp/ssr042. [DOI] [PubMed] [Google Scholar]
- 65.Wang M, Yuan D, Tu L, Gao W, He Y, Hu H, et al. Long noncoding RNAs and their proposed functions in fibre development of cotton (Gossypium spp.) New Phytol. 2015;207:1181–97. doi: 10.1111/nph.13429. [DOI] [PubMed] [Google Scholar]
- 66.Tian J, Song Y, Yang X, Ci D, Chen J, Xie J, et al. Population genomic analysis of gibberellin-responsive long non-coding RNAs in Populus. J Exp Bot. 2016;67:2467–82. doi: 10.1093/jxb/erw057. [DOI] [PubMed] [Google Scholar]
- 67.Shi W, Quan M, Du Q, Zhang D. The Interactions between the Long Non-coding RNA NERDL and Its Target Gene Affect Wood Formation in Populus tomentosa. Front Plant Sci. 2017;8:1035. doi: 10.3389/fpls.2017.01035. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Raiola A, Lionetti V, Elmaghraby I, Immerzeel P, Mellerowicz EJ, Salvi G, et al. Pectin methylesterase is induced in Arabidopsis upon infection and is necessary for a successful colonization by necrotrophic pathogens. Mol Plant-Microbe Interact. 2011;24:432–40. doi: 10.1094/MPMI-07-10-0157. [DOI] [PubMed] [Google Scholar]
- 69.Del Corpo D, Fullone MR, Miele R, Lafond M, Pontiggia D, Grisel S, et al. AtPME17 is a functional Arabidopsis thaliana pectin methylesterase regulated by its PRO region that triggers PME activity in the resistance to Botrytis cinerea. Mol Plant Pathol. 2020;21:1620–33. doi: 10.1111/mpp.13002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Loreti E, Perata P. The Many Facets of Hypoxia in Plants. Plants. 2020;9:745. doi: 10.3390/plants9060745. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.Martín-García J, Paraschiv M, Flores-Pacheco JA, Chira D, Diez JJ, Fernández M. Susceptibility of Several Northeastern Conifers to Fusarium circinatum and Strategies for Biocontrol. Forests. 2017;8:318. doi: 10.3390/f8090318. [DOI] [Google Scholar]
- 72.Correll JC, Gordon TR, McCain AH, Fox JW, Koehler CS, Wood DL, et al. Pitch Canker Disease in California: Pathogenicity, Distribution, and Canker Development on Monterey Pine (Pinus radiata) Plant Dis. 1991;75:676–82. doi: 10.1094/PD-75-0676. [DOI] [Google Scholar]
- 73.Kaplan EL, Meier P. Nonparametric Estimation from Incomplete Observations. J Am Stat Assoc. 1958;53:481. doi: 10.2307/2281868. [DOI] [Google Scholar]
- 74.Therneau TM. Survival Analysis. R package survival version 3.2-7. 2020. https://cran.r-project.org/package=survival. Accessed 21 Jan 2021.
- 75.R Core Team. R: The R Project for Statistical Computing. CRAN. 2019. https://www.r-project.org/. Accessed 16 Nov 2020.
- 76.Martín-Rodrigues N, Espinel S, Sanchez-Zabala J, Ortíz A, González-Murua C, Duñabeitia MK. Spatial and temporal dynamics of the colonization of Pinus radiata by Fusarium circinatum, of conidiophora development in the pith and of traumatic resin duct formation. New Phytol. 2013;198:1215–27. doi: 10.1111/nph.12222. [DOI] [PubMed] [Google Scholar]
- 77.Andrews S. FastQC a quality control tool for high throughput sequence data. 2012. http://www.bioinformatics.babraham.ac.uk/projects/fastqc/. Accessed 28 Feb 2020.
- 78.Bolger AM, Lohse M, Usadel B, Trimmomatic A flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–20. doi: 10.1093/bioinformatics/btu170. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Wegrzyn JL, Lee JM, Tearse BR, Neale DB. TreeGenes: A Forest Tree Genome Database. Int J Plant Genomics. 2008;412875. 10.1155/2008/412875. [DOI] [PMC free article] [PubMed]
- 80.Kim D, Langmead B, Salzberg SL. HISAT: A fast spliced aligner with low memory requirements. Nat Methods. 2015;12:357–60. doi: 10.1038/nmeth.3317. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 81.Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–9. doi: 10.1093/bioinformatics/btp352. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Pertea M, Pertea GM, Antonescu CM, Chang T-C, Mendell JT, Salzberg SL, et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. 2015;33:290–5. doi: 10.1038/nbt.3122. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Pertea M, Pertea G. GFF Utilities: GffRead and GffCompare. F1000Research. 2020;9:304. doi: 10.12688/f1000research.23297.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.Wucher V, Legeai F, Hédan B, Rizk G, Lagoutte L, Leeb T, et al. FEELnc: A tool for long non-coding RNA annotation and its application to the dog transcriptome. Nucleic Acids Res. 2017;45:57. doi: 10.1093/nar/gkw1306. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Hart AJ, Ginzburg S, Xu MS, Fisher CR, Rahmatpour N, Mitton JB, et al. EnTAP: Bringing faster and smarter functional annotation to non-model eukaryotic transcriptomes. Mol Ecol Resour. 2020;20:591–604. doi: 10.1111/1755-0998.13106. [DOI] [PubMed] [Google Scholar]
- 86.Tang S, Lomsadze A, Borodovsky M. Identification of protein coding regions in RNA transcripts. Nucleic Acids Res. 2015;43:e78. doi: 10.1093/nar/gkv227. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nat Methods. 2015;12:59–60. doi: 10.1038/nmeth.3176. [DOI] [PubMed] [Google Scholar]
- 88.Huerta-Cepas J, Szklarczyk D, Forslund K, Cook H, Heller D, Walter MC, et al. EGGNOG 4.5: A hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Res. 2016;44:286–93. doi: 10.1093/nar/gkv1248. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Budak H, Kaya SB, Cagirici HB. Long Non-coding RNA in Plants in the Era of Reference Sequences. Front Plant Sci. 2020;11:276. doi: 10.3389/fpls.2020.00276. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Szcześniak MW, Bryzghalov O, Ciomborowska-Basheer J, Makałowska I. CANTATAdb 2.0: Expanding the Collection of Plant Long Noncoding RNAs. In: Chekanova JA, Wang HV, editors. Methods in Molecular Biology. New York, NY: Humana Press Inc.; 2019. pp. 415–29. [DOI] [PubMed] [Google Scholar]
- 91.Gallart AP, Pulido AH, De Lagrán IAM, Sanseverino W, Cigliano RA. GREENC: A wiki-based database of plant lncRNAs. Nucleic Acids Res. 2016;44:D1161–6. doi: 10.1093/nar/gkv1215. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Kalvari I, Nawrocki EP, Ontiveros-Palacios N, Argasinska J, Lamkiewicz K, Marz M, et al. Rfam 14: Expanded coverage of metagenomic, viral and microRNA families. Nucleic Acids Res. 2021;49:D192–200. doi: 10.1093/nar/gkaa1047. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Kozomara A, Birgaoanu M, Griffiths-Jones S, MiRBase From microRNA sequences to function. Nucleic Acids Res. 2019;47:D155–62. doi: 10.1093/nar/gky1141. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.CCB. StringTie. Transcript assembly and quantification for RNA-Seq. 2019. http://ccb.jhu.edu/software/stringtie/index.shtml?t=manual. Accessed 7 Jul 2021.
- 95.Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple. J R Stat Soc Ser B. 1995;57:289–300. [Google Scholar]
- 97.Hong G, Zhang W, Li H, Shen X, Guo Z. Separate enrichment analysis of pathways for up- and downregulated genes. J R Soc Interface. 2014;11:20130950. doi: 10.1098/rsif.2013.0950. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Young MD, Wakefield MJ, Smyth GK, Oshlack A. Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biol. 2010;11:14. doi: 10.1186/gb-2010-11-2-r14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Supek F, Bošnjak M, Škunca N, Šmuc T. REVIGO Summarizes and Visualizes Long Lists of Gene Ontology Terms. PLoS One. 2011;6:e21800. doi: 10.1371/journal.pone.0021800. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 100.Russo PST, Ferreira GR, Cardozo LE, Bürger MC, Arias-Carrasco R, Maruyama SR, et al. CEMiTool: A Bioconductor package for performing comprehensive modular co-expression analyses. BMC Bioinformatics. 2018;19:1–13. doi: 10.1186/s12859-018-2053-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Yu G, Wang LG, Han Y, He QY, ClusterProfiler An R package for comparing biological themes among gene clusters. Omi A J Integr Biol. 2012;16:284–7. doi: 10.1089/omi.2011.0118. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Tahmasebi A, Ashrafi-Dehkordi E, Shahriari AG, Mazloomi SM, Ebrahimie E. Integrative meta-analysis of transcriptomic responses to abiotic stress in cotton. Prog Biophys Mol Biol. 2019;146:112–22. doi: 10.1016/j.pbiomolbio.2019.02.005. [DOI] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
Pine seeds and fungal isolate used in this study will be provided upon request in order to support reproducibility. Likewise, the datasets generated and analysed during the current study are available in the SRA database (https://www.ncbi.nlm.nih.gov/sra) of NCBI repository. The SRA accession numbers are SRR15100123-31 (BioProject PRJNA742852; https://www.ncbi.nlm.nih.gov/search/all/?term=PRJNA742852). The reference genome (Pita_v2.01) is publicly available in Treegenes database (https://treegenesdb.org/FTP/Genomes/Pita/v2.01/genome/Pita.2_01.fa.gz).