Bioinformation. 2016 Jul 26;12(4):233–236. doi: 10.6026/97320630012233

A searchable database for the genome of Phomopsis longicolla (isolate MSPL 10-6)

Omar Darwish 1,#, Shuxian Li 2,*,#, Zane May 1, Benjamin Matthews 3, Nadim W Alkharouf 1
PMCID: PMC5290664  PMID: 28197060

Abstract

Phomopsis longicolla (syn. Diaporthe longicolla) is an important seed-borne fungal pathogen that primarily causes Phomopsis seed decay (PSD) in most soybean production areas worldwide. This disease severely decreases soybean seed quality by reducing seed viability and oil quality, altering seed composition, and increasing the frequency of moldy and/or split beans. To facilitate investigation of the genetic basis of fungal virulence factors and to understand the mechanism of disease development, we designed and developed a database for P. longicolla isolate MSPL 10-6 that contains information about the genome assemblies (contigs), gene models, gene descriptions and GO functional ontologies. A web-based front end to the database was built using ASP.NET, which allows researchers to search and mine the genome of this important fungus. This database represents the first reported genome database for a seed-borne fungal pathogen in the Diaporthe–Phomopsis complex. The database will also be a valuable resource for research and agricultural communities. It will aid in the development of new control strategies for this pathogen.

Availability:

http://bioinformatics.towson.edu/Phomopsis_longicolla/HomePage.aspx

Keywords: Phomopsis longicolla, MSPL 10-6, database, annotations and genomic sequence

Background

Phomopsis longicolla (syn. Diaporthe longicolla) is an important seed-borne fungal pathogen that primarily causes Phomopsis seed decay (PSD) in most soybean production areas worldwide [1,2]. This disease severely decreases soybean seed quality by reducing seed viability and oil quality, altering seed composition, and increasing the frequency of moldy and/or split beans [3,4,5,6]. Analyses of the internal transcribed spacer (ITS) region [7], the small subunit of the mitochondrial ribosomal RNA gene [8], and other genes and regions of P. longicolla have been reported. Recently, the genome of P. longicolla isolate MSPL 10-6, which was isolated from field-grown soybean seed in Mississippi, USA, was sequenced [9]. A database for P. longicolla isolate MSPL 10-6 that contains information about the genome assemblies (contigs), gene models, gene descriptions and GO functional ontologies will allow researchers to search and mine the genome of this important fungus. The database will be a valuable resource for research and agricultural communities, and will facilitate investigation of the genetic basis of fungal virulence factors and an understanding of the mechanism of disease development. To our knowledge, this database represents the first reported genome database for a seed-borne fungal pathogen in the Diaporthe–Phomopsis complex.

Methodology of Development

The database was designed, implemented and hosted using Microsoft SQL Server 2008 Enterprise Edition. Microsoft Visual Studio 2013 was used to design and implement the web pages, which were programmed using ASP.NET framework 4.0 with the C# programming language. Both the database and the website reside on the same server at Towson University in Baltimore, MD, USA. This server runs Microsoft Windows Server 2012 and Internet Information Services (IIS V7.0). The database stores the assembly of the P. longicolla MSPL 10-6 genome (108 scaffolds) [9] and its annotations. In addition to the sequences, the database also houses information on gene function and gene ontology distributions.

Utility to the biological community

The database contains the genome sequence of P. longicolla MSPL 10-6 and the 16,597 genes that were annotated. The annotation includes GO ontologies that have been assigned to most genes (process, molecular function and cellular component). The database’s web-accessible interface (Figure 1) provides an easy way to search, browse and download the sequences and functional annotation data stored in the database. The following are the main functions the website provides:

Figure 1. A snapshot of the database main web page showing a quick summary of the project and the functions available on the website.

[1] Search:

Users can search by GO ontology terms or by sequence description (Figure 2). A partial string can be entered if the user is not sure of the full GO term or gene name. Both the search by GO ontologies and the search by description return their results in a table, from which the user can select any record to see details about that specific sequence/gene. The information includes sequence name, sequence description, sequence length, BLAST e-value, gene ontology, InterProScan results and the actual sequence in FASTA format.

Figure 2. A snapshot of the search pages. Users can search by gene description and/or GO ontologies. More detailed information on a record (i.e., a sequence) can be obtained by clicking the “Details” link next to the sequence ID.
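The partial-match search described above can be illustrated with a minimal sketch. It is not the site's actual implementation (which uses SQL Server and ASP.NET); it assumes a hypothetical single-table schema (`gene` with `seq_id`, `description`, `go_term`) and made-up demo rows, and uses Python's built-in `sqlite3` module only to show how a substring query over descriptions or GO terms might be expressed.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'gene' table.

    The schema and the rows are illustrative assumptions, not the real
    P. longicolla database contents.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gene (seq_id TEXT PRIMARY KEY, description TEXT, go_term TEXT)"
    )
    conn.executemany(
        "INSERT INTO gene VALUES (?, ?, ?)",
        [
            ("demo_0001", "polyketide synthase", "oxidoreductase activity"),
            ("demo_0002", "glycoside hydrolase family protein",
             "carbohydrate metabolic process"),
            ("demo_0003", "hypothetical protein", "membrane"),
        ],
    )
    return conn


def _escape_like(fragment):
    """Escape LIKE wildcards so user input is matched literally."""
    return fragment.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def search(conn, fragment, column="description"):
    """Return rows whose chosen column contains the given fragment."""
    # Column names cannot be bound as parameters, so whitelist them.
    if column not in ("description", "go_term"):
        raise ValueError("column must be 'description' or 'go_term'")
    pattern = "%" + _escape_like(fragment) + "%"
    sql = (
        f"SELECT seq_id, description, go_term FROM gene "
        f"WHERE {column} LIKE ? ESCAPE '\\' ORDER BY seq_id"
    )
    return conn.execute(sql, (pattern,)).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for row in search(db, "hydrol"):
        print(row)
```

The search fragment is passed as a bound parameter rather than concatenated into the SQL text, which is the usual way to keep a public search box safe from SQL injection.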

[2] Statistics and Graphs:

The website provides static pages that display the annotation statistics (lengths of coding regions, number of exons, etc.) along with bar graphs depicting the distributions of GO ontologies.
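As a rough illustration of how a GO distribution like the one shown in those bar graphs could be tallied, the sketch below counts genes per GO category and renders a plain-text bar chart. The input format (pairs of gene ID and category name) and the function names are assumptions made for this example; the paper does not describe how its graphs were generated.

```python
from collections import Counter


def category_counts(annotations):
    """Count genes per GO category.

    `annotations` is an iterable of (gene_id, category) pairs; duplicates
    are dropped so that a gene is counted once per category.
    """
    unique_pairs = set(annotations)
    return Counter(category for _, category in unique_pairs)


def text_bars(counts, width=40):
    """Render counts as text bars scaled to the largest category."""
    top = max(counts.values(), default=0)
    lines = []
    # Largest category first; ties are broken alphabetically.
    for category, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        bar = "#" * (round(width * n / top) if top else 0)
        lines.append(f"{category:<20} {n:>6} {bar}")
    return lines


if __name__ == "__main__":
    demo = [
        ("g1", "biological process"),
        ("g2", "biological process"),
        ("g2", "molecular function"),
        ("g3", "cellular component"),
    ]
    print("\n".join(text_bars(category_counts(demo))))
```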

[3] Download:

The website allows users to download the complete assembled genome (FASTA format) and the annotations in both FASTA and GFF3 formats. Raw sequences are available from the SRA database at: http://www.ncbi.nlm.nih.gov/nuccore/AYRD00000000/
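Once downloaded, the FASTA and GFF3 files can be inspected with a few lines of code. The following is a minimal sketch using only the Python standard library; the helper names and the tiny in-memory examples are assumptions for illustration and do not reflect the contents or file names of the actual downloads.

```python
import io


def read_fasta(handle):
    """Yield (record_id, sequence) pairs from a FASTA-formatted handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # The record ID is the first whitespace-delimited token.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def count_gff3_features(handle):
    """Count feature types (column 3) in a GFF3-formatted handle."""
    counts = {}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # GFF3 feature lines have exactly nine columns
        counts[cols[2]] = counts.get(cols[2], 0) + 1
    return counts


if __name__ == "__main__":
    fasta = io.StringIO(">scaffold_demo example\nACGT\nACG\n")
    for record_id, seq in read_fasta(fasta):
        print(record_id, len(seq))
```

For real files, the same functions can be called with handles from `open(path)`; for large-scale work, an established parser such as Biopython's `SeqIO` may be a better choice.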

Caveats

The assembly and genome annotation of P. longicolla cannot be considered a complete reference for the species, as only one strain (MSPL 10-6) was sequenced.

Future Development

Other strains of P. longicolla will be added to this database and website once they have been sequenced and annotated.

Author Contributions:

Omar Darwish and Zane May designed and developed the database and user interface under the guidance of Nadim Alkharouf at Towson University. Shuxian Li at the USDA-ARS led and coordinated the project and was in charge of fungal culture and DNA preparation for sequencing as well as the overall design of the experiments. Benjamin Matthews acted as a scientific consultant. All authors contributed to the writing of the manuscript.

Acknowledgments

This work was partially supported by the USDA-ARS project 6066-21220-012-00D. We are grateful to Phillip SanMiguel at Purdue Genomics Core Facility for sequencing. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the United States Department of Agriculture. USDA is an equal opportunity provider and employer.

Edited by P Kangueane

Citation: Darwish et al. Bioinformation 12(4): 233-236 (2016)

References

  • 1. Hobbs TW, et al. Mycologia. 1985;77:535.
  • 2. Li S, et al. Plant Dis. 2010;94:1035. doi: 10.1094/PDIS-94-8-1035.
  • 3. Hepperly PR, Sinclair JB. Phytopathology. 1978;68:1684.
  • 4. Sinclair JB. Plant Dis. 1993;329.
  • 5. Li S. Phomopsis seed decay of soybean. In: Soybean: Molecular Aspects of Breeding. Vienna, Austria: Intech Publisher; 2011. p277-292.
  • 6. Li S. Phomopsis Seed Decay. In: Compendium of Soybean Diseases and Pests. Fifth Edition. Minnesota, USA: APS Press; 2015. p47-48.
  • 7. Zhang AW, et al. Phytopathology. 1998;88:1306. doi: 10.1094/PHYTO.1998.88.12.1306.
  • 8. Li S, et al. Plant Dis. 2001;85:1031. doi: 10.1094/PDIS.2001.85.9.1031A.
  • 9. Li S, et al. Genome Data. 2014;3:55.

Articles from Bioinformation are provided here courtesy of Biomedical Informatics Publishing Group
