Abstract
A draft genome sequence of the yeast Pachysolen tannophilus CBS 4044/NRRL Y-2460 is presented. The organism has the potential to be developed as a cell factory for biorefineries due to its ability to utilize waste feedstocks. The sequenced genome size was 12,238,196 bp, consisting of 34 scaffolds. A total of 4,463 genes from 5,346 predicted open reading frames were annotated with function.
GENOME ANNOUNCEMENT
The yeast Pachysolen tannophilus was first isolated from wood extracts used in leather tanning (2) and has gained interest due to its ability to utilize d-xylose (1, 7, 8) for fuel ethanol production (6, 8, 10). It can ferment the common sugars (glucose, mannose, and galactose) occurring in hemicellulose hydrolysate mixtures (8) and can produce ethanol from glycerol (4). The whole-genome sequencing of P. tannophilus was performed to provide genetic information as a necessary step toward engineering of the strain and its potential development as an industrial cell factory.
Sequencing was performed by a whole-genome shotgun strategy with an Illumina genome analyzer (Beijing Genomics Institute, Shenzhen, China). The raw short reads were assembled into 279 contigs, which were ordered into 34 scaffolds (>2 kb) with an N50 size of 1.1 Mb by using the SOAPdenovo package (3). Augustus software, version 2.5 (9), trained for predicting genes in Debaryomyces hansenii, was used to identify protein-coding genes in the genome, and the putative amino acid sequences were used for subsequent gene function annotation. Functional annotation was accomplished by BLASTP analysis (E-value < 1 × 10⁻⁵) of the protein sequences against the COG and KEGG databases, and the best hit was selected. Pulsed-field gel electrophoresis (PFGE) (5) was performed to estimate the number and approximate sizes of the chromosomes. PFGE was run for 48 h at 3 V/cm with a 500-s switch time at an included angle of 106° in 0.5× Tris-borate-EDTA (TBE) on 0.75% low-melting-point (LMP) agarose at 14°C.
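For readers unfamiliar with the N50 statistic quoted for the scaffolds, the following is a minimal sketch of the usual definition: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The scaffold lengths in the example are hypothetical and are not taken from the P. tannophilus assembly; the function name `n50` is likewise only illustrative.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running total to at least
        # half of the assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical scaffold lengths, for illustration only.
example_lengths = [1_500_000, 1_100_000, 900_000, 400_000, 100_000]
example_n50 = n50(example_lengths)
print(f"N50 = {example_n50} bp")
```

An N50 of 1.1 Mb therefore means that at least half of the assembled bases lie in scaffolds of 1.1 Mb or longer.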
The total length of the sequenced genome was 12,238,196 bp (excluding N bases), with a GC content of 29.82%. A total of 1,970.8 Mb of raw data was generated, representing around 145-fold coverage of the P. tannophilus genome. Five thousand three hundred forty-six protein-coding genes (coding sequences [CDSs]) were predicted, and 4,463 (83.5%) genes were annotated with function.
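The coverage and annotation figures can be checked with simple arithmetic. The sketch below uses only numbers given in this announcement. Which genome size was used as the denominator for the coverage figure is not stated; it is an assumption here that the roughly 145-fold value corresponds to the PFGE-based estimate of about 13.6 Mb, since dividing by the assembled length gives a somewhat higher value.

```python
RAW_DATA_MB = 1970.8          # raw sequence data, from the text
ASSEMBLED_BP = 12_238_196     # assembled genome length, from the text
PFGE_ESTIMATE_MB = 13.6       # PFGE-based genome size estimate, from the text
PREDICTED_CDS = 5_346
ANNOTATED_CDS = 4_463


def fold_coverage(raw_mb, genome_mb):
    """Average sequencing depth: raw bases divided by genome size."""
    return raw_mb / genome_mb


coverage_vs_assembly = fold_coverage(RAW_DATA_MB, ASSEMBLED_BP / 1e6)
# Assumption: the reported ~145-fold figure uses the PFGE estimate.
coverage_vs_pfge = fold_coverage(RAW_DATA_MB, PFGE_ESTIMATE_MB)
annotated_percent = 100 * ANNOTATED_CDS / PREDICTED_CDS

print(f"coverage vs. assembled length: {coverage_vs_assembly:.0f}-fold")
print(f"coverage vs. PFGE estimate:    {coverage_vs_pfge:.0f}-fold")
print(f"annotated CDSs:                {annotated_percent:.1f}%")
```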
Four coding sequences of P. tannophilus were retrieved from GenBank and compared with the annotated CDSs; they differed at 2 bp over a total length of 4,803 bp. Furthermore, 13 full-length coding sequences were PCR amplified and resequenced to estimate accuracy. The total resequenced length was 21,111 bp, with 100% similarity to the assembled sequence.
PFGE separated six chromosomal bands, two of which probably migrated as doublets. The sizes of the chromosomal bands were estimated to be 2.9 ± 0.05 Mb (mean ± standard deviation), 2.1 ± 0.04 Mb, 1.9 ± 0.05 Mb, 1.6 ± 0.08 Mb (doublet), 1.3 ± 0.07 Mb (doublet), and 0.98 ± 0.02 Mb based on comparison with the yeast marker Hansenula wingei. The estimated genome size of P. tannophilus was approximately 13.6 ± 0.4 Mb with an estimated 8 chromosomes.
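The chromosome count and genome size estimate follow from the band sizes if each doublet is counted as two chromosomes of the same size. The sketch below makes that tally explicit, using the mean band sizes given above. How the authors derived the reported figure and its uncertainty is not stated, so this is only a consistency check under the doublet assumption.

```python
# (mean band size in Mb, number of chromosomes assumed to co-migrate)
bands = [
    (2.9, 1),
    (2.1, 1),
    (1.9, 1),
    (1.6, 2),   # probable doublet
    (1.3, 2),   # probable doublet
    (0.98, 1),
]

chromosome_count = sum(copies for _, copies in bands)
genome_size_mb = sum(size * copies for size, copies in bands)

print(f"chromosomes: {chromosome_count}")
print(f"summed band sizes: {genome_size_mb:.2f} Mb")
```

The summed band sizes fall within the reported range of 13.6 ± 0.4 Mb, somewhat above the 12.2 Mb of assembled sequence.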
Nucleotide sequence accession numbers.
The draft genome sequences of P. tannophilus were deposited in EMBL with contig accession numbers CAHV01000001 to CAHV01000267.
ACKNOWLEDGMENTS
This work was funded by the European Community's 7th Framework Research Programme under grant agreement number 213506 (project GLYFINERY), providing financial support to X.L. and M.W.
REFERENCES
- 1. Fu N, Peiris P. 2008. Co-fermentation of a mixture of glucose and xylose to ethanol by Zymomonas mobilis and Pachysolen tannophilus. World J. Microbiol. Biotechnol. 24:1091–1097
- 2. Kurtzman C. 1983. Biology and physiology of the D-xylose fermenting yeast Pachysolen tannophilus, p 73–83. In Feichter A, Jeffries TW. (ed), Advances in biochemical engineering and biotechnology, vol 27, Springer, Berlin, Germany
- 3. Li R, et al. 2010. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 20:265–272
- 4. Liu X, Jensen PR, Workman M. 2012. Bioconversion of crude glycerol feedstocks into ethanol by Pachysolen tannophilus. Bioresour. Technol. 104:579–586
- 5. Maringele L, Lydall D. 2006. Pulsed-field gel electrophoresis of budding yeast chromosomes, p 65–73. In Xiao W. (ed), Yeast protocol, vol. 313, Humana Press, Totowa, NJ
- 6. Sathesh-Prabu C, Murugesan AG. 2011. Potential utilization of sorghum field waste for fuel ethanol production employing Pachysolen tannophilus and Saccharomyces cerevisiae. Bioresour. Technol. 102:2788–2792
- 7. Schneider H, Wang PY, Chan YK, Maleszka R. 1981. Conversion of D-xylose into ethanol by the yeast Pachysolen tannophilus. Biotechnol. Lett. 3:89–92
- 8. Slininger PJ, Bolen PL, Kurtzman CP. 1987. Pachysolen tannophilus: Properties and process considerations for ethanol production from d-xylose. Enzyme Microb. Technol. 9:5–15
- 9. Stanke M, Steinkamp R, Waack S, Morgenstern B. 2004. AUGUSTUS: a web server for gene finding in eukaryotes. Nucleic Acids Res. 32(Suppl 2):W309–W312
- 10. Zhao L, Yu J, Zhang X, Tan T. 2010. The ethanol tolerance of Pachysolen tannophilus in fermentation on xylose. Appl. Biochem. Biotechnol. 160:378–385
