Data in Brief. 2019 Sep 7;26:104489. doi: 10.1016/j.dib.2019.104489

The dataset of de novo transcriptome assembly of Falcataria moluccana cambium from gall-rust (Uromycladium falcatarium) infected and non-infected tree

Hasyyati Shabrina a, Ulfah J Siregar b, Deden D Matra c, Iskandar Z Siregar b
PMCID: PMC6811874  PMID: 31667253

Abstract

Sengon (Falcataria moluccana), formerly known as Albizia falcataria or Paraserianthes falcataria, is an essential tree species for the development of community-based timber plantations, especially in Indonesia. These plantations currently face a significant disease threat: infection by the gall-rust fungus (Uromycladium falcatarium). However, the molecular mechanisms of the tree's response to the disease are still poorly understood. We sequenced cambium RNA using BGISEQ-500 technology to provide a de novo assembled transcriptome dataset from gall-rust infected and non-infected trees. The transcriptome was assembled using Trinity v.2.3.2. The raw reads are available from DDBJ under accession number DRA007983.

Keywords: De novo, Falcataria moluccana, Gall-rust (Uromycladium falcatarium) infected, Next-generation sequencing (NGS), RNA, Transcriptome


Specifications Table

| Field | Details |
|---|---|
| Subject | Agricultural and Biological Sciences: Forestry |
| Specific subject area | Molecular study in forestry |
| Type of data | RNA sequencing data |
| How data were acquired | Data were acquired by next-generation sequencing on the BGISEQ-500 platform |
| Data format | Raw sequencing reads and assembled contigs |
| Parameters for data collection | Cambium was collected from the outer sapwood of a gall-rust infected and a non-infected Sengon tree. |
| Description of data collection | RNA was sequenced by next-generation sequencing on the BGISEQ-500 platform at the Beijing Genomics Institute (BGI), Hong Kong. |
| Data source location | City/Region: Bogor/West Java; Country: Indonesia; Latitude and longitude (GPS coordinates) of collected samples: 6°32′38.9″S 106°44′24.5″E |
| Data accessibility | Repository name: DDBJ (DNA Data Bank of Japan); Data identification number: DRA007983; Direct URL to data: https://ddbj.nig.ac.jp/DRASearch/submission?acc=DRA007983 |
| Related research articles | N.E. Lelana, S. Wiyanto, Gianto, I.Z. Siregar. Genetic diversity of Falcataria moluccana and its relationship to the resistance of gall rust disease. Biodiversitas (2018): 12–17. DOI: https://doi.org/10.13057/biodiv/d190102; D. Rahmawati, N. Khumaida, U.J. Siregar. Morphological and phytochemical characterization of susceptible and resistant Sengon (Falcataria moluccana) tree to gall rust disease. Biodiversitas 20 (2019): 907–913. DOI: https://doi.org/10.13057/biodiv/d200340 |

Value of the data
  • These data provide the assembled transcriptome of Sengon (Falcataria moluccana) from cambium tissue.

  • The data are of practical use for developing genetic markers to assist improvement programs for gall-rust resistant Sengon trees.

  • The data will also support differential gene expression analysis to understand the molecular mechanisms of gall-rust infection in Sengon.

1. Data

These data present the de novo transcriptome assembly of Falcataria moluccana (Miq.) Barneby & J.W. Grimes (Indonesia: Sengon, Malaysia: Batai, Hawaii: Albizia, East Timor: Madre de Cacao, Philippines: Falcata). Sengon is widely cultivated in Indonesia as a source of veneer, light construction timber, and pellets for bioenergy. However, the plantations face a highly damaging threat that degrades the wood and lowers productivity: infection by the gall-rust fungus (Uromycladium falcatarium). Although the fungus can infect any tissue, the stem wood, which is the main product of the plantation, is the most severely damaged part. After penetrating the bark, the fungus wounds cells in the cambium, kills some of them, and grows within the cambium tissue by blocking the nutrient supply. The fungus also secretes growth hormones that cause undifferentiated cambium cells to grow rapidly, producing wood deformities.

The Sengon transcriptome was derived from RNA collected from the cambium of the stem and sequenced on the BGISEQ-500 platform. The properties of the reads and the assembled sequences are shown in Table 1, and an overview of the transcriptome annotation is presented in Table 2. The KEGG pathway analysis identified pathways with high numbers of contigs that are involved in gall-rust infection (Table 3). Among the microsatellite motifs in the merged sequences (Table 4), mono- (27,917; 39.83%) and dinucleotide (8,242; 11.76%) repeats were the most common; these percentages are relative to the 70,089 merged contigs. The three most frequent motifs were A/T (26,157; 64.33%), AG/CT (4,300; 10.58%), and AT/AT (2,195; 5.40%). The most frequent tri-, tetra-, penta-, and hexanucleotide motifs were AAG/CTT (1,378; 3.39%), AAAG/CTTT (83; 0.2%), AAAAG/CTTTT (8; 0.02%), and AAACCC/GGGTTT (7; 0.02%), respectively. The motif percentages are relative to 40,662, the sum of the merged column in Table 4.

Table 1.

The properties of raw reads and assembled sequences of Sengon cambium.

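| Features | Infected | Non-infected | Merged |
|---|---|---|---|
| Raw reads: number | 79,054,112 | 79,037,484 | 158,090,000 |
| Raw reads: bases | 7,905,411,200 | 7,903,748,400 | 15,810,000,000 |
| Unique transcripts: number | 380,032 | 118,171 | 400,633 |
| Unique transcripts: bases | 278,860,152 | 114,550,451 | 325,278,365 |
| Unique transcripts: length range (bp) | 201-17,824 | 201-8,098 | 201-12,634 |
| Unique transcripts: average length (bp) | 733.78 | 969.36 | 811.91 |
| Unique transcripts: N50 (bp) | 1,324 | 1,439 | 1,521 |
| Unique transcripts: GC content (%) | 40.1 | 40.6 | 39.8 |
| Contigs: number | 72,968 | 41,332 | 70,089 |
| Contigs: bases | 100,689,847 | 57,231,623 | 115,068,922 |
| Contigs: length range (bp) | 201-21,986 | 201-8,098 | 181-15,997 |
| Contigs: average length (bp) | 733.78 | 1,385 | 1,641 |
| Contigs: N50 (bp) | 1,074 | 1,243 | 1,382 |
| Contigs: GC content (%) | 41.1 | 40.8 | 40.6 |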
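
As a reading aid for Table 1, the following minimal Python sketch shows how the average length and N50 of an assembly are conventionally computed. It is not part of the authors' pipeline: the function names and the toy length list are illustrative assumptions, and only the totals used in the mean-length check (278,860,152 bases over 380,032 unique transcripts from the infected sample) are taken from Table 1.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths supplied")


def mean_length(total_bases, n_sequences):
    """Average sequence length from the total base count and sequence count."""
    return total_bases / n_sequences


# Hypothetical contig lengths for illustration only (not from the dataset).
toy_lengths = [201, 350, 800, 1200, 2500]
print(n50(toy_lengths))  # 1200

# Infected-sample unique transcripts, totals as reported in Table 1.
print(round(mean_length(278_860_152, 380_032), 2))  # 733.78
```

In practice the lengths would be read from the assembled FASTA file rather than typed in by hand.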

Table 2.

Functional annotation of Sengon contigs.

Values are numbers of contigs, with percentages in parentheses.

| Source | Infected | Non-infected | Merged |
|---|---|---|---|
| Contig number | 72,968 | 41,332 | 70,089 |
| Nr NCBI | - | - | 68,927 (98.34) |
| Nt NCBI | 61,213 (83.89) | 37,986 (91.90) | 62,880 (89.71) |
| SwissProt | 40,694 (55.76) | 25,633 (62.02) | 42,783 (61.04) |
| TrEMBL | - | - | 58,593 (83.59) |
| Gene ontology | - | - | 17,134 (24.45) |
| KEGG | - | - | 3,256 (4.65) |

Table 3.

Top 20 KEGG pathways with the highest contig numbers in the merged sequences.

| Pathway | Pathway ID | #Enzymes in pathway | #Contigs of enzyme |
|---|---|---|---|
| Purine metabolism | map00230 | 7 | 54 |
| Thiamine metabolism | map00730 | 2 | 44 |
| Biosynthesis of antibiotics | map01130 | 8 | 31 |
| Pyrimidine metabolism | map00240 | 8 | 17 |
| Folate biosynthesis | map00790 | 2 | 16 |
| One carbon pool by folate | map00670 | 3 | 16 |
| beta-Lactam resistance | map01501 | 1 | 13 |
| Drug metabolism - other enzymes | map00983 | 2 | 13 |
| Penicillin and cephalosporin biosynthesis | map00311 | 1 | 13 |
| Terpenoid backbone biosynthesis | map00900 | 1 | 12 |
| Aminoacyl-tRNA biosynthesis | map00970 | 3 | 9 |
| Alanine, aspartate and glutamate metabolism | map00250 | 3 | 8 |
| Cysteine and methionine metabolism | map00270 | 3 | 8 |
| Porphyrin and chlorophyll metabolism | map00860 | 3 | 6 |
| Galactose metabolism | map00052 | 1 | 6 |
| Oxidative phosphorylation | map00190 | 1 | 5 |
| Glutathione metabolism | map00480 | 2 | 4 |
| Pantothenate and CoA biosynthesis | map00770 | 3 | 4 |
| Cyanoamino acid metabolism | map00460 | 1 | 3 |
| Glycine, serine and threonine metabolism | map00260 | 2 | 3 |
| Phenylalanine, tyrosine and tryptophan biosynthesis | map00400 | 2 | 3 |

Table 4.

Number and motif types of microsatellites in Sengon contigs.

Values are numbers of contigs, with percentages in parentheses.

| Motifs | Infected | Non-infected | Merged |
|---|---|---|---|
| Mononucleotide | 15,979 (21.90) | 18,786 (45.45) | 27,917 (39.83) |
| Dinucleotide | 7,157 (9.81) | 3,879 (9.38) | 8,242 (11.76) |
| Trinucleotide | 3,676 (5.04) | 2,258 (5.46) | 4,172 (5.95) |
| Tetranucleotide | 216 (0.30) | 94 (0.23) | 255 (0.36) |
| Pentanucleotide | 31 (0.04) | 6 (0.01) | 28 (0.04) |
| Hexanucleotide | 41 (0.06) | 22 (0.05) | 48 (0.07) |
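
Motif labels such as A/T, AG/CT, and AAG/CTT in the Data section appear to pair each repeat unit with its reverse complement, so that the same microsatellite read from either strand is counted once. The sketch below illustrates that grouping under this assumption. The function names are hypothetical and are not taken from MISA or from the authors' scripts.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Reverse complement of a DNA repeat unit."""
    return motif.translate(_COMPLEMENT)[::-1]


def rotations(motif):
    """All cyclic rotations of a repeat unit (AG and GA describe the same repeat)."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def motif_class(motif):
    """Label a repeat unit as 'unit/reverse-complement', using the
    alphabetically smallest rotation of each strand."""
    motif = motif.upper()
    forward = min(rotations(motif))
    reverse = min(rotations(reverse_complement(motif)))
    first, second = sorted((forward, reverse))
    return f"{first}/{second}"


for unit in ("T", "TC", "GA", "TA", "CTT"):
    print(unit, "->", motif_class(unit))
```

Under this convention TC and GA both fall into the AG/CT class, and the self-complementary unit TA is labelled AT/AT, which matches the labels used in the text.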

2. Experimental design, materials, and methods

Cambium samples of Sengon trees were collected from a private plantation in Bogor, West Java, Indonesia (6°32′38.9″S 106°44′24.5″E). RNA was extracted from the cambium of one gall-rust infected tree and one gall-rust free tree, without replication. The RNA was extracted by BGI from 80 mg of tissue using the established CTAB-pBIOZOL method. The integrity and quantity of the isolated RNA were assessed with a NanoDrop ND-1000 spectrophotometer and an Agilent 2100 Bioanalyzer. The non-infected sample was treated with Ribo-Zero because of ribosomal RNA contamination. RNA sequencing was carried out on the BGISEQ-500 (BGI, Hong Kong), and the raw reads were pre-processed to discard adaptors and low-quality reads. The pre-processed reads were assembled de novo with Trinity v.2.3.2 [1], and high-quality contigs were obtained [2], [3]. These contigs were annotated using the BLAST+ program [4] against the NCBI non-redundant (Nr) and nucleotide (Nt) databases and the UniProt protein sequence databases (SwissProt and TrEMBL). Contigs annotated against the Nr database were processed with Blast2GO [5] to obtain Gene Ontology and KEGG pathway annotations. Contigs containing microsatellites were identified with the MISA program [6], using minimum repeat numbers of 10 for mononucleotide motifs, 6 for dinucleotide motifs, and 5 for tri-, tetra-, penta-, and hexanucleotide motifs; the interruption parameter (the maximum difference between two microsatellites) was set to 100 bases.

Acknowledgments

This research was partially funded by the Master Program of Education Leading to Doctoral Degree for Excellent Graduates (PMDSU) scheme of the Ministry of Research, Technology and Higher Education of the Republic of Indonesia (KEMENRISTEKDIKTI) and by the United States Agency for International Development (USAID) through the Sustainable Higher Education Research Alliances (SHERA) program - Centre for Development of Sustainable Region (CDSR). In 2017–2021, CDSR is led by the Center for Energy Studies, Gadjah Mada University (UGM).

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  • 1. Grabherr M.G. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 2011;29:644–652. doi: 10.1038/nbt.1883.
  • 2. Matra D.D., Ritonga A.W., Natawijaya A., Poerwanto R., Sobir, Widodo W.D., Inoue E. Dataset of the first de novo transcriptome assembly of the arillode of Baccaurea motleyana. Data in Brief. 2019;22:332–335. doi: 10.1016/j.dib.2018.12.031.
  • 3. Matra D.D., Ritonga A.W., Natawijaya A., Poerwanto R., Sobir, Widodo W.D., Inoue E. Dataset from de novo transcriptome assembly of Nephelium lappaceum aril. Data in Brief. 2019;22:566–569. doi: 10.1016/j.dib.2018.12.034.
  • 4. Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
  • 5. Conesa A., Götz S., García-Gómez J.M., Terol J., Talón M., Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21:3674–3676. doi: 10.1093/bioinformatics/bti610.
  • 6. Thiel T., Michalek W., Varshney R.K., Graner A. Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor. Appl. Genet. 2003;106:411–422. doi: 10.1007/s00122-002-1031-0.

Articles from Data in Brief are provided here courtesy of Elsevier
