Data in Brief. 2019 Jun 5;25:104099. doi: 10.1016/j.dib.2019.104099

Genome/transcriptome collection of plethora of economically important, previously unexplored organisms from India and abroad

Arijit Panda a,b,1, Narendrakumar M Chaudhari b,1, Mayuri Mukherjee a,b, Samrat Ghosh a,b, Aditya Narayan Sarangi b, C Mathu Malar a,b, Shashi Kant a,b, Diya Sen a, Abhishek Das a,b, Subhadeep Das a,b, Deeksha Singh a,b, Asharani Prusty a,b, Sucheta Tripathy a,b
PMCID: PMC6595405  PMID: 31294057

Abstract

Genome and transcriptome sequencing data are valuable resources for researchers carrying out biological experiments that involve cloning and characterizing genes. We present genome sequence data from different clades of life, including photosynthetic prokaryotes, oomycete pathogens, probiotic bacteria, endophytic yeasts and filamentous fungi, and the pathogenic protozoan Leishmania donovani. We also present paired control and treated stress-response transcriptomes of Cyanobacteria growing in extreme conditions. The cyanobacterial species in this dataset were isolated from extreme environments, including desiccated monuments, hot springs and saline archipelagos. The probiotic Lactobacillus paracasei was isolated from the Indian subcontinent. The dataset also includes the early infectious stage (early passage) of Leishmania donovani, the protozoan that causes kala-azar. The endophyte Arthrinium malaysianum, which was isolated as a contaminant, has significant bioremediation properties. Our collaborators isolated the endophyte Rhodotorula mucilaginosa JGTA-S1 from the uranium mine area of Jaduguda, eastern India. They also isolated Phytophthora ramorum, a heterozygous diploid oomycete pathogen that causes sudden oak death on the coast of California, USA, which is part of the data as well. This dataset is a unique heterogeneous collection from various sources, analyzed using “Genome Annotator Light (GAL): A Docker-based package for genome analysis and visualization” (Panda et al., 2019) and presented in a web site automatically created by GAL at http://www.eumicrobedb.org/cglab.

Keywords: Annotation, Genome, Transcriptome


Specifications Table

Subject area: Biology, Genetics
More specific subject area: Genomics, Computational Genomics, Transcriptomics
Type of data: Genome sequences (text), transcriptomic raw data, alignment files in SAM/BAM format, database (online data and graphics)
How data was acquired: Next-generation DNA and RNA sequencing
Data format: Filtered, annotated and analyzed complete genomes, assemblies and raw transcriptomic sequences
Experimental factors: Organisms collected from various sources such as monuments, hot springs, saline archipelagos, hosts with pathogens, endophytes, etc.
Experimental features: Genome and transcriptome sequencing of the microbes was carried out on second- and third-generation sequencing platforms.
Data source location: India; California, USA
Data accessibility: The collection of analyzed genomes of various microbes is publicly accessible at the following URL: http://www.eumicrobedb.org/cglab
This resource was created from a collection of genomes sequenced in our laboratory at CSIR-Indian Institute of Chemical Biology, Kolkata, India, and in collaborating labs. Newly sequenced genomes will be added to the same repository from time to time.
Details of the data, with their accession numbers, are provided in Table 1.
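
The specifications above list alignment files in SAM/BAM format among the data types. As a rough illustration of how such files can be inspected, the sketch below parses the first six mandatory fields of one SAM alignment line and decodes two of the standard flag bits. The read name, contig name and sequence in the example line are invented for illustration and do not come from this collection.

```python
from dataclasses import dataclass

# Bit flags defined by the SAM format specification.
FLAG_UNMAPPED = 0x4   # the read has no alignment
FLAG_REVERSE = 0x10   # the read aligns to the reverse strand


@dataclass
class SamRecord:
    qname: str   # read name
    flag: int    # bitwise flag
    rname: str   # reference (contig or chromosome) name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # CIGAR string

    @property
    def is_mapped(self) -> bool:
        return not (self.flag & FLAG_UNMAPPED)

    @property
    def is_reverse(self) -> bool:
        return bool(self.flag & FLAG_REVERSE)


def parse_sam_line(line: str) -> SamRecord:
    """Parse the first six mandatory fields of one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has at least 11 tab-separated fields")
    return SamRecord(
        qname=fields[0],
        flag=int(fields[1]),
        rname=fields[2],
        pos=int(fields[3]),
        mapq=int(fields[4]),
        cigar=fields[5],
    )


# Hypothetical alignment line, invented for illustration only.
example = "read1\t16\tcontig_1\t101\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII"
record = parse_sam_line(example)
```

Header lines in a SAM file start with `@` and would need to be skipped before calling `parse_sam_line`. BAM files are binary and require a dedicated reader.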
Value of the data
  1. The importance of this data collection lies in the economic significance of the organisms sequenced. It mostly comprises extremophilic Cyanobacteria and fungi isolated from diverse sources. It also includes one fungus-like organism, a pathogen of oak trees, isolated from the California coast.

  2. The collection provides not only sequence data but also analyzed genomes with their annotations, together with comparative genomics visualizations in browser tracks.

  3. Most of the sequenced genomes come from the Indian subcontinent and reveal some of the hidden biodiversity of India's lands, oceans and monuments.

  4. Genome sequences of pathogens that destroy commercially important plants may be of considerable value to researchers and economists in the affected countries.

  5. The probiotic and immunomodulatory activities of some microbes can be explored with this collection, which includes sequenced and assembled genomes of probiotic strains, e.g. Lactobacillus paracasei Lbs2 (published as Lactobacillus casei Lbs2 [8]), isolated from the gut of an Indian individual.

  6. The collection also includes genomes of economically important yeasts, e.g. the plant-growth-promoting yeast Rhodotorula mucilaginosa JGTA-S1, found in Typha angustifolia, a macrophyte growing in a wetland near the uranium mine in Jaduguda, India.

  7. Transcriptomic data for some key microbes are also provided for studying expression profiles.

  8. Overall, these data may open new possibilities for collaboration on the genomics and transcriptomics of previously unexplored organisms.

1. Data

The data collection includes sequenced and analyzed genomes of cyanobacteria, fungi, yeast and oomycetes from our laboratory and from the laboratories of our collaborators. It serves as a repository of genomic information on economically or environmentally important microorganisms. The collection currently includes genomic data, along with transcriptomic data from paired experiments for some cyanobacteria and fungi (Table 1). It does not currently include proteomic or metabolomic data.
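
To illustrate how paired control and treated transcriptomes of this kind are commonly compared, the sketch below computes a per-gene log2 fold change from two count tables. This is not the authors' analysis method: the gene identifiers, the counts and the pseudocount of 1 are all assumptions made for the example, and a real analysis would first normalize for library size and use replicate-aware statistics.

```python
import math


def log2_fold_change(control: float, treated: float, pseudocount: float = 1.0) -> float:
    """Return log2((treated + pseudocount) / (control + pseudocount))."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def compare_paired(control_counts: dict, treated_counts: dict, pseudocount: float = 1.0) -> dict:
    """Compute log2 fold changes for genes present in both count tables."""
    shared = sorted(set(control_counts) & set(treated_counts))
    return {
        gene: log2_fold_change(control_counts[gene], treated_counts[gene], pseudocount)
        for gene in shared
    }


# Hypothetical read counts, invented for illustration only.
control = {"gene_a": 99, "gene_b": 7, "gene_c": 0}
treated = {"gene_a": 399, "gene_b": 7, "gene_c": 3}
changes = compare_paired(control, treated)
```

The pseudocount keeps the ratio defined when a gene has zero reads in one condition, at the cost of shrinking fold changes for weakly expressed genes.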

Table 1.

List of selected microbes sequenced and analyzed in the data collection with their data accession and publications.

| Sr. No. | Taxonomy | Full Name in NCBI | Data Type | NCBI BioSample | NCBI BioProject | NCBI Accession | Data Source | Citation |
|---|---|---|---|---|---|---|---|---|
| 1 | Cyanobacteria | Halomicronema excentricum str. Lakshadweep | Genomic | SAMN09787254 | PRJNA485276 | QVFV00000000.1 | Marine archipelagos of the Lakshadweep islands | UP |
| 2 | Cyanobacteria | Hassallia byssoidea VB512170 | Genomic | SAMN03174155 | PRJNA266752 | JTCM00000000 | Desiccated monuments of eastern India | [2] |
| 3 | Cyanobacteria | Lyngbya confervoides BDU141951 | Genomic | SAMN03217274 | PRJNA268230 | JTHE00000000 | Marine Cyanobacteria from Southern India | [3] |
| 4 | Cyanobacteria | Mastigocladus laminosus UU774 | Genomic | SAMN05942614 | PRJNA350610 | MNPM00000000 | Hot springs of eastern India | UP |
| 5 | Cyanobacteria | Scytonema millei VB511283 | Genomic | SAMN03200207 | PRJNA267760 | JTJC00000000 | Desiccated monuments of eastern India | [4] |
| 6 | Cyanobacteria | Scytonema tolypothrichoides VB-61278 | Genomic | SAMN03274355 | PRJNA271448 | JXCA00000000 | Desiccated monuments of eastern India | [5] |
| 7 | Cyanobacteria | Tolypothrix bouteillei VB521301 | Genomic | SAMN02697214 | PRJNA242379 | JHEG00000000 | Desiccated monuments of eastern India | [6] |
| 8 | Cyanobacteria | Tolypothrix campylonemoides VB511288 | Genomic | SAMN03274356 | PRJNA271449 | JXCB00000000 | Desiccated monuments of eastern India | [7] |
| 9 | Cyanobacteria | Westiellopsis prolifica IICB1 | Genomic | SAMN06473105 | PRJNA377809 | NAPS00000000 | Freshwater Cyanobacteria from culture collections of India | UP |
| 10 | Firmicutes | Lactobacillus paracasei Lbs2 | Genomic | SAMN02910029 | PRJNA255080 | JPKN00000000 | Isolated from the guts of healthy humans from Northern India | [8] |
| 11 | Fungi | Rhodotorula mucilaginosa JGTA-S1 | Genomic | SAMN07313544 | PRJNA393004 | PEFX00000000 | Endophytes isolated from Jaduguda mine, eastern India | [9] |
| 12 | Leishmania | Leishmania donovani strain:MHOM/IN/1983/AG83(early passage) | Genomic | SAMN04145217 | PRJNA297706 | GCA_001989975.1 (36 chromosomes) | Leishmania pathogens isolated from diseased rodents from India; the cultures represent early passage (<5 passages) | [10] |
| 13 | Oomycetes | Phytophthora ramorum isolate:CDFA1418886 | Genomic | SAMN08537704 | PRJNA434169 | PUHL00000000 | A virulent pathogen of oak trees in Southern CA, USA. These organisms are fungus-like but distinct from fungi. PacBio sequencing was used to improve the assembly. | [11], [12] |
| 14 | Cyanobacteria | Halomicronema excentricum str. Lakshadweep | Transcriptomics | SAMN09787254 | PRJNA485276 | Submitted and Processing | Control and organisms grown under acid stress at pH 4.5 | UP |
| 15 | Cyanobacteria | Mastigocladus laminosus UU774 | Transcriptomics | SAMN05942614 | PRJNA350610 | Submitted and Processing | Cyanobacteria grown under heat stress and control conditions | UP |
| 16 | Cyanobacteria | Scytonema tolypothrichoides VB-61278 | Transcriptomics | SAMN03274355 | PRJNA271448 | Submitted and Processing | Whole transcriptome | UP |
| 17 | Cyanobacteria | Tolypothrix campylonemoides VB511288 | Transcriptomics | SAMN03274356 | PRJNA271449 | Submitted and Processing | Whole transcriptome | UP |
| 18 | Cyanobacteria | Tolypothrix bouteillei VB521301 | Transcriptomics | SAMN02697214 | PRJNA242379 | Submitted and Processing | Whole transcriptome | UP |

Note: UP, unpublished.
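
The accession columns of Table 1 mix several NCBI identifier types. The following sketch shows one way to tell them apart and to build a lookup URL for the NCBI E-utilities `esearch` endpoint. The regular expressions are assumptions based only on the accession shapes seen in Table 1 and are not exhaustive, and the mapping of WGS master records to the `nuccore` database is likewise an assumption. The code only constructs URLs and makes no network request.

```python
import re
from urllib.parse import urlencode

# Accession shapes as they appear in Table 1 (assumed, not exhaustive).
PATTERNS = {
    "biosample": re.compile(r"^SAMN\d+$"),
    "bioproject": re.compile(r"^PRJNA\d+$"),
    # WGS master record: four letters, two-digit version, six zeros.
    "wgs_master": re.compile(r"^[A-Z]{4}\d{2}0{6}(\.\d+)?$"),
    "assembly": re.compile(r"^GCA_\d{9}\.\d+$"),
}

# Entrez database assumed to hold each identifier type.
ENTREZ_DB = {
    "biosample": "biosample",
    "bioproject": "bioproject",
    "wgs_master": "nuccore",
    "assembly": "assembly",
}

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def classify_accession(accession: str) -> str:
    """Return the identifier type of an accession, or 'unknown'."""
    for kind, pattern in PATTERNS.items():
        if pattern.match(accession):
            return kind
    return "unknown"


def esearch_url(accession: str) -> str:
    """Build an E-utilities esearch URL for a recognized accession."""
    kind = classify_accession(accession)
    if kind == "unknown":
        raise ValueError(f"unrecognized accession: {accession!r}")
    query = urlencode({"db": ENTREZ_DB[kind], "term": accession})
    return f"{EUTILS_ESEARCH}?{query}"


# Accessions taken from row 1 of Table 1.
url = esearch_url("PRJNA485276")
```

Entries such as "Submitted and Processing" are classified as `unknown`, so rows whose transcriptome accessions are still pending can be filtered out before any lookup.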

2. Experimental design, materials and methods

Table 1 lists the current set of complete genomes and transcriptomes from diverse sources sequenced or processed at our lab or in collaboration.

3. Other useful links to related resources

Acknowledgements

The sequencing was supported by grants-in-aid from the Department of Biotechnology, Govt. of India, and the Council of Scientific and Industrial Research, Govt. of India, to ST.

Conflict of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  • 1. Panda A., Chaudhari N.M., Tripathy S. Genome Annotator Light (GAL): a Docker-based package for genome analysis and visualization. Genomics. 2019. doi: 10.1016/j.ygeno.2019.03.012.
  • 2. Singh D., Chandrababunaidu M.M., Panda A., Sen D., Bhattacharyya S., Adhikary S.P., Tripathy S. Draft genome sequence of cyanobacterium Hassallia byssoidea strain VB512170, isolated from Monuments in India. Genome Announc. 2015;3:e00064-15. doi: 10.1128/genomeA.00064-15.
  • 3. Chandrababunaidu M.M., Sen D., Tripathy S. Draft genome sequence of filamentous marine cyanobacterium Lyngbya confervoides strain BDU141951. Genome Announc. 2015;3:e00066-15. doi: 10.1128/genomeA.00066-15.
  • 4. Sen D., Chandrababunaidu M.M., Singh D., Sanghi N., Ghorai A., Mishra G.P., Madduluri M., Adhikary S.P., Tripathy S. Draft genome sequence of the terrestrial cyanobacterium Scytonema millei VB511283, isolated from Eastern India. Genome Announc. 2015;3:e00009-15. doi: 10.1128/genomeA.00009-15.
  • 5. Das A., Panda A., Singh D., Chandrababunaidu M.M., Mishra G.P., Bhan S., Adhikary S.P., Tripathy S. Deciphering the genome sequences of the hydrophobic cyanobacterium Scytonema tolypothrichoides VB-61278. Genome Announc. 2015;3:e00228-15. doi: 10.1128/genomeA.00228-15.
  • 6. Chandrababunaidu M.M., Singh D., Sen D., Bhan S., Das S., Gupta A., Adhikary S.P., Tripathy S. Draft genome sequence of Tolypothrix boutellei strain VB521301. Genome Announc. 2015;3:e00001-15. doi: 10.1128/genomeA.00001-15.
  • 7. Das S., Singh D., Madduluri M., Chandrababunaidu M.M., Gupta A., Adhikary S.P., Tripathy S. Draft genome sequence of bioactive-compound-producing cyanobacterium Tolypothrix campylonemoides strain VB511288. Genome Announc. 2015;3:e00226-15. doi: 10.1128/genomeA.00226-15.
  • 8. Bhowmick S., Malar M., Das A., Thakur B.K., Saha P., Das S., Rashmi H.M., Batish V.K., Grover S., Tripathy S. Draft genome sequence of Lactobacillus casei Lbs2. Genome Announc. 2014;2:e01326-14. doi: 10.1128/genomeA.01326-14.
  • 9. Sen D., Paul K., Saha C., Mukherjee G., Nag M., Ghosh S., Das A., Seal A., Tripathy S. A unique life-strategy of an endophytic yeast Rhodotorula mucilaginosa JGTA-S1 - a comparative genomics viewpoint. DNA Res. 2019;26(2):131–146. doi: 10.1093/dnares/dsy044.
  • 10. Sinha R., Mathu Malar C., Raghwan S.D., Das S., Shadab M., Chowdhury R., Tripathy S., Ali N. Genome plasticity in cultured Leishmania donovani: comparison of early and late passages. Front. Microbiol. 2018;9. doi: 10.3389/fmicb.2018.01279.
  • 11. Elliott M., Yuzon J., C M.M., Tripathy S., Bui M., Chastagner G.A., Coats K., Rizzo D.M., Garbelotto M., Kasuga T. Characterization of phenotypic variation and genome aberrations observed among Phytophthora ramorum isolates from diverse hosts. BMC Genomics. 2018;19:320. doi: 10.1186/s12864-018-4709-7.
  • 12. Malar C M., Yuzon J.D., Das S., Das A., Panda A., Ghosh S., Tyler B.M., Kasuga T., Tripathy S. Haplotype-phased genome assembly of virulent Phytophthora ramorum isolate ND886 facilitated by long-read sequencing reveals effector polymorphisms and copy number variation. Mol. Plant Microbe Interact. 2019. doi: 10.1094/MPMI-08-18-0222-R.
  • 13. Panda A., Sen D., Ghosh A., Gupta A., Prakash Mishra G., Singh D., Ye W., Tyler B.M., Tripathy S. EumicrobeDBLite: a lightweight genomic resource and analytic platform for draft oomycete genomes. Mol. Plant Pathol. 2018;19:227–237. doi: 10.1111/mpp.12505.

Articles from Data in Brief are provided here courtesy of Elsevier
