Biotechnology Reports. 2017 Aug 24;16:18–20. doi: 10.1016/j.btre.2017.07.006

Draft genome sequence of Sclerospora graminicola, the pearl millet downy mildew pathogen

S Chandra Nayaka a, H Shekar Shetty a, C Tara Satyavathi b, Rattan S Yadav c, PB Kavi Kishor d, M Nagaraju d, TA Anoop e, M Milner Kumar e, Boney Kuriakose e, Navajeet Chakravartty e, AVSK Mohan Katta e, VB Reddy Lachagari e, Om Vir Singh b, Pranav Pankaj Sahu c, Swati Puranik c, Pankaj Kaushal f, Rakesh K Srivastava g
PMCID: PMC5647520  PMID: 29062722

Highlights

  • We sequenced the downy mildew pathogen, which is one of the most important production constraints for pearl millet.

  • In a first attempt, the whole genome of Sclerospora graminicola pathotype 1 from India was sequenced and annotated.

  • The overall genome coverage achieved was 40×.

  • The estimated genome size of S. graminicola was 299.9 Mb.

  • Out of 65,404 genes that were predicted, a total of 38,120 genes were annotated.

Keywords: Sclerospora graminicola, Pathotype 1, Pearl millet, Downy mildew, Whole genome sequence

Abstract

Sclerospora graminicola is one of the most important biotic production constraints of pearl millet in India, Africa and other parts of the world. We report a de novo whole genome assembly and analysis of pathotype 1, one of the most virulent pathotypes of S. graminicola from India. The draft genome assembly contained 299,901,251 bp with 65,404 genes. This study may help in understanding the evolutionary pattern of the pathogen and in elucidating effector evolution, supporting the design of effective, durable resistance breeding strategies in pearl millet.


Pearl millet [Pennisetum glaucum (L.) R. Br.] is an important crop of the semi-arid and arid regions of the world. It can grow in harsh and marginal environments and has the highest degree of tolerance to drought and heat stress among cereals [1].

Downy mildew, caused by Sclerospora graminicola (Sacc.) Schroet., is the most devastating disease of pearl millet, particularly on genetically uniform hybrids. The estimated annual grain yield loss due to downy mildew ranges from approximately 10 to 80% [2], [3], [4], [5], [6], [7].

Pathotype 1 has been reported to be a highly virulent pathotype of Sclerospora graminicola in India [8]. We report a de novo whole genome assembly and analysis of S. graminicola pathotype 1 from India.

The susceptible pearl millet genotype Tift 23D2B1P1-P5 was used to obtain single-zoospore isolates from the original oosporic sample. The library for whole genome sequencing was prepared according to the instructions for the NEB Ultra DNA library kit for Illumina (New England Biolabs, USA). The libraries were normalized, pooled and sequenced on the Illumina HiSeq 2500 platform (Illumina Inc., San Diego, CA, USA) at 2 × 100 bp read length. Mate pair (MP) libraries were prepared using the Nextera mate pair library preparation kit (Illumina Inc., USA). These libraries were normalized, pooled and sequenced on the Illumina MiSeq platform (Illumina Inc., USA) at 2 × 300 bp read length. One SMRTbell library with a 20 kb insert size was prepared and sequenced on the PacBio RS II platform.

Whole genome sequencing generated 7.38 Gb of data with 73,889,924 paired-end reads from the paired-end library on the Illumina HiSeq 2500, and 1.15 Gb with 3,851,788 reads from the mate pair library on the Illumina MiSeq. Illumina reads were filtered to retain those with a quality score of at least 20, and duplicate reads were removed before assembly. A total of 597,293 filtered subreads with an average read length of 6.39 kb were generated on the PacBio RS II with P6-C4 chemistry. Approximately 51% of these data came from reads longer than 10 kb, with a maximum read length of 49,261 bp.

The sequences were assembled using several genome assemblers, including ABySS, MaSuRCA, Velvet, SOAPdenovo2, and ALLPATHS-LG. The hybrid assembly generated by the MaSuRCA [9] algorithm was found to be superior to those from the other assemblers (Table 1). The assembled draft genome sequence of S. graminicola pathotype 1 was 299,901,251 bp in length, with an N50 of 17,909 bp and a minimum scaffold size of 1 kb. It consisted of 26,786 scaffolds, the longest being 238,843 bp, and had a GC content of 47.2%. The overall coverage was 40×.

Gene prediction on the draft genome sequence using AUGUSTUS, with Saccharomyces cerevisiae as the model, resulted in 65,404 genes. Assembly completeness assessed with CEGMA revealed 92.7% of proteins completely present and 95.6% partially present, while the BUSCO v3 fungal dataset indicated 64.9% complete, 12.4% fragmented and 22.7% missing out of 290 BUSCO groups. A total of 52,285 predicted genes showed homology in a BLASTX search against the nr database, and 38,120 genes had a significant BLASTX match at an E-value cutoff of 1e-5 and 40% identity. Of the 38,120 annotated genes, 11,873 had UniProt entries, 7248 had GO terms and 9686 had KEGG IDs. Of the 7248 GO terms, 2724 were associated with biological processes. Some important GO terms are listed in Table 2.

During annotation, we observed many proteins with known roles in pathogenicity. These include Crinkler (CRN) family protein, Glucanase inhibitor, Serine protease inhibitor, Cysteine inhibitor, INF1 Elicitin-like protein, SWI4 1, Peter Pan-like protein suppressor, Sterol binding protein, PexRD2, Glyceraldehyde-3-phosphate dehydrogenase, Ribonuclease, HECT E3 ubiquitin ligase, Alpha-1,2-Mannosidase, Endo-1,3(4)-beta-glucanase putative, Palmitoyltransferase, Serine/threonine-protein phosphatase 2A activator, Protein kinase, putative, NAD-dependent histone deacetylase sir2-like protein, rpp 13-like proteins, rpm, Glycoside hydrolase, Pre-mRNA-splicing factor SF2, NADH dehydrogenase flavoprotein 1, Mitochondrial Aldehyde dehydrogenase, Deoxyhypusine hydroxylase, DEAD/DEAH box RNA helicase, putative, CAMK protein kinase, Ornithine aminotransferase, mitochondrial Phosphatidate cytidylyltransferase, Acetolactate synthase, Inositol hexakisphosphate and diphosphoinositol-pentakisphosphate kinase.
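
The annotation filter described above (E-value cutoff of 1e-5 and 40% identity) can be illustrated with a short sketch. It assumes the BLASTX results were saved in BLAST's standard 12-column tabular format (`-outfmt 6`); the paper does not state the output format or the scripts used, so the function name `filter_blast_hits` and the example rows are hypothetical.

```python
import csv
import io

def filter_blast_hits(handle, max_evalue=1e-5, min_identity=40.0):
    """Return query IDs with at least one hit passing both thresholds.

    Expects BLAST tabular output (-outfmt 6) with the default columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore.
    """
    passing = set()
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        pident, evalue = float(row[2]), float(row[10])
        if evalue <= max_evalue and pident >= min_identity:
            passing.add(row[0])
    return passing

# Made-up rows for illustration only (not data from the study).
example = io.StringIO(
    "gene1\tprotA\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-50\t350\n"
    "gene2\tprotB\t35.0\t150\t97\t1\t1\t450\t1\t150\t1e-10\t60\n"
    "gene3\tprotC\t55.0\t90\t40\t0\t1\t270\t1\t90\t0.002\t40\n"
)
print(sorted(filter_blast_hits(example)))
```

In this toy input only `gene1` passes: `gene2` falls below the identity threshold and `gene3` exceeds the E-value cutoff.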

Table 1. Comparative statistics of the promising genome assemblers.

| Assembler | Minimum (bp) | Maximum (bp) | Mean (bp) | N50 (bp) | No. of contigs | Sum of contigs (bp) | CEGMA complete (%) | CEGMA partial (%) |
|---|---|---|---|---|---|---|---|---|
| Abyss_DBG2OLC Scaffold | 2730 | 235,195 | 27,432 | 22,557 | 5126 | 140,615,056 | 56.45 | 62.5 |
| SOAP_DBG2OLC Scaffolds | 2079 | 194,864 | 28,748 | 26,386 | 5404 | 155,354,949 | 65.73 | 71.37 |
| MaSuRCA_Scaffolds | 1000 | 238,843 | 11,196 | 17,909 | 26,786 | 299,901,251 | 89.52 | 93.95 |
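
For readers unfamiliar with the N50 statistic reported in Table 1, the following minimal sketch shows the usual definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly. The function name `n50` and the toy lengths are illustrative and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that brings the running sum to half the total
        # assembly length defines the N50.
        if running >= half_total:
            return length

toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # total length 54
print(n50(toy_scaffolds))  # 10 + 9 + 8 = 27 reaches half of 54, so N50 is 8
```

A higher N50 indicates a more contiguous assembly, but it has to be read alongside total assembly size and completeness, which is why the table also lists the sum of contigs and the CEGMA scores.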

Table 2. Important biological processes identified using GO annotation.

| Biological process | Number of GO terms |
|---|---|
| DNA integration [GO:0015074] | 699 |
| NADH dehydrogenase (ubiquinone) activity [GO:0008137] | 95 |
| Cytochrome-c oxidase activity [GO:0004129] | 81 |
| ATP binding [GO:0005524] | 79 |
| Heme binding [GO:0020037] | 68 |
| Cytochrome-c oxidase activity [GO:0004129] | 68 |
| Hydrogen ion transmembrane transporter activity [GO:0015078] | 61 |
| Proton-transporting ATP synthase complex, coupling factor F(o) [GO:0045263] | 58 |
| Microtubule motor activity [GO:0003777] | 35 |
| Intracellular signal transduction [GO:0035556] | 34 |
| Small-subunit processome [GO:0032040] | 29 |
| Unfolded protein binding [GO:0051082] | 24 |
| Magnesium ion binding [GO:0000287] | 20 |
| Intracellular protein transport [GO:0006886] | 20 |

Repetitive element analysis with Repbase revealed 115 Ty1/Copia elements, 50 Gypsy elements, 419 small RNAs, 23,618 simple repeats and 3365 low-complexity repeats. Microsatellite analysis with the MISA tool revealed 8179 mononucleotide repeats, 1082 low-complexity repeats and 5562 dinucleotide to hexanucleotide repeats. S. graminicola pathotype 1 genome characteristics and resources are listed in Table 3.
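
The kind of microsatellite search described above can be sketched with regular expressions. This is a simplified illustration and not the MISA implementation: the minimum repeat counts (10 for mononucleotide, 6 for dinucleotide and 5 for tri- to hexanucleotide motifs) are assumed values, since the thresholds used in the study are not reported, and compound or overlapping repeats are not handled. The names `MIN_REPEATS`, `is_primitive` and `find_ssrs` are hypothetical.

```python
import re

# Assumed minimum repeat counts per motif length (not reported in the paper).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA')."""
    n = len(motif)
    for d in range(1, n):
        if n % d == 0 and motif[:d] * (n // d) == motif:
            return False
    return True

def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)

# Toy sequence for illustration: A x12, AG x7 and ACG x5 embedded in spacers.
toy = "GG" + "A" * 12 + "CC" + "AG" * 7 + "TTT" + "ACG" * 5 + "T"
for start, motif, count in find_ssrs(toy):
    print(start, motif, count)
```

The `is_primitive` check prevents a mononucleotide run such as `AAAAAAAAAAAA` from also being counted as an `AA` dinucleotide repeat.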

Table 3. Sclerospora graminicola pathotype 1 genome characteristics and resources.

| Name | Genome characteristic/resource |
|---|---|
| NCBI BioProject ID | PRJNA325098 |
| NCBI BioSample ID | SAMN05219233 |
| NCBI SRA accession No. | SRP076363 with accession numbers SRR3658180 and SRR3658181 |
| Sequence type | Illumina HiSeq2500 and Illumina MiSeq, PacBio RSII |
| Total number of reads | 73,889,924 from PE library, 3,851,788 from MP library |
| Read length | 2 × 100 bp for PE and 2 × 300 bp for MP |
| Overall coverage | 40× |
| Estimated genome size | 299.9 Mb |
| Predicted protein coding genes | 65,404 |
| Annotated genes | 38,120 |
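
As a rough consistency check on the read counts, read lengths and coverage listed in Table 3, sequencing depth can be approximated as total sequenced bases divided by assembly size. The sketch below is illustrative only. It assumes every Illumina read has its nominal length and uses the PacBio subread count and mean subread length reported in the text. The paper does not state how the 40× figure was derived, so the function name `estimate_depth` and the choice of inputs are assumptions.

```python
# Assembly size reported for the MaSuRCA draft genome.
ASSEMBLY_SIZE_BP = 299_901_251

# (number of reads, read length in bp); nominal lengths are an assumption.
LIBRARIES = {
    "illumina_pe": (73_889_924, 100),
    "illumina_mp": (3_851_788, 300),
    "pacbio_subreads": (597_293, 6_390),  # mean subread length of 6.39 kb
}

def estimate_depth(libraries, genome_size_bp):
    """Approximate depth as total sequenced bases over genome size."""
    total_bases = sum(count * length for count, length in libraries.values())
    return total_bases / genome_size_bp

depth = estimate_depth(LIBRARIES, ASSEMBLY_SIZE_BP)
print(f"approximate depth: {depth:.1f}x")
```

Under these assumptions the estimate is on the order of 40×, in line with the reported coverage. Quality filtering and duplicate removal would reduce the usable depth somewhat.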

The S. graminicola pathotype 1 sample has been deposited at the National fungal herbarium facility with accession number 52052 at the Herbarium Cryptogamae Indiae Orientalis (HCIO), Division of Plant Pathology, Indian Agricultural Research Institute (IARI), New Delhi, India.

Information on deposited data

The genome information of downy mildew pathogen is available in the NCBI GenBank database. The Sclerospora graminicola whole genome shotgun (WGS) project has the project accession MIQA00000000. This version of the project (02) has the accession number MIQA02000000, and consists of sequences MIQA02000001-MIQA02026786, with BioProject ID PRJNA325098 and BioSample ID SAMN05219233, and can be accessed at https://www.ncbi.nlm.nih.gov/bioproject/PRJNA325098/.

Author contributions

RKS, HSS, CNS designed the experiment. RKS, CNS, SSH, CTS, RSY, VBRL, ATA, MKM, BK, NC, MKAVSK performed research. RKS, CNS, VBRL, SSH, RSY, CTS, PPS, SP, PK, OVS wrote the manuscript.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

The authors gratefully acknowledge the support received from the All India Coordinated Research Project on Pearl Millet (AICRP-PM), Mandor, and the Indian Institute of Millets Research (IIMR), Hyderabad, in collecting and maintaining downy mildew inoculum. The financial assistance received from ICAR and AgriGenome Labs is gratefully acknowledged. The authors also acknowledge the Herbarium Cryptogamae Indiae Orientalis (HCIO), Division of Plant Pathology, Indian Agricultural Research Institute (IARI), for undertaking conservation of the deposited fungal specimen.

References

  • 1. Supriya A., Senapathy S., Hash C.T., Thirunavukkarasu N., Vengaldas Kankanti R., Sharma R., Thakur R.P., Veeranki P.R., Yadav R.C., Srivastava R.K. QTL mapping of pearl millet rust resistance using an integrated DArT- and SSR-based linkage map. Euphytica. 2016;209(2):461–476.
  • 2. Singh S.D., King S.B., Werder J. International Crops Research Institute for the Semi-Arid Tropics; India: 1993. Downy Mildew Disease of Pearl Millet. Information Bulletin No. 37, Patancheru, Andhra Pradesh 502324; p. 36.
  • 3. Singh S.D. Downy mildew of pearl millet. Plant Dis. 1995;79:545–550.
  • 4. Hash C.T., Singh S.D., Thakur R.P., Talukdar B.S. Breeding for disease resistance. In: Khairwal I.S., Rai K.N., Andrews D.J., Harinarayana G., editors. Pearl Millet Breeding. Oxford & IBH; New Delhi, India: 1999. pp. 337–379.
  • 5. Hess D.E., Thakur R.P., Hash C.T., Sérémé P., Magill C.W. Pearl millet downy mildew: problems and control strategies for a new millennium. In: Leslie J.F., editor. Sorghum and Millets Diseases Ames. Iowa State Press; Iowa, USA: 2002. pp. 37–42.
  • 6. Yadav O.P., Rai K.N. Genetic improvement of pearl millet in India. Agric. Res. 2013;2:275–292.
  • 7. Anup C.P., Kini K.R. Analysis of dynamics of proteome in resistant cultivar of pearl millet seedlings during Sclerospora graminicola infection. J. App. Biol. Biotechnol. 2016;4:067–071.
  • 8. Sudisha J., Ananda K.S., Shetty H.S. Characterization of Downy Mildew Isolates of Sclerospora graminicola by using differential cultivars and molecular markers. J. Cell Mol. Biol. 2008;7:41–55.
  • 9. Zimin A.V., Marçais G., Puiu D., Roberts M., Salzberg S.L., Yorke J.A. The MaSuRCA genome assembler. Bioinformatics. 2013;29:2669–2677. doi: 10.1093/bioinformatics/btt476.

Articles from Biotechnology Reports are provided here courtesy of Elsevier
