ABSTRACT
Porcisia hertigi is a parasitic kinetoplastid first isolated from porcupines (Coendou rothschildi) in central Panama in 1965. We present the complete genome sequence of P. hertigi, isolate C119, strain LV43, sequenced using combined short- and long-read technologies. This complete genome sequence will contribute to our knowledge of the parasitic genus Porcisia.
ANNOUNCEMENT
Porcisia hertigi was discovered in porcupines (Coendou rothschildi) in central Panama (1) and later in Costa Rica (2). Initially classified as Leishmania hertigi, subsequent phylogenetic studies resulted in its reclassification into a new genus, Porcisia (3), within the subfamily Leishmaniinae (4). To date, only a partial assembly of Porcisia deanei (5) (strain TCC258) is available within this genus. Here, we present the complete genome sequence of Porcisia hertigi, strain LV43, isolate C119 (WHO code MCOE/PA/1965/C119;LV43), isolated from a porcupine in Panama in 1965. This new complete genome sequence will contribute to our understanding of the evolution of both the genus Porcisia and the subfamily Leishmaniinae.
Parasites were grown using an in vitro culture system previously developed for Leishmania (Mundinia) orientalis axenic amastigotes (6). They were first grown as promastigotes in Schneider’s insect medium at 26°C and then in M199 medium supplemented with 10% FCS, 2% stable human urine, 1% basal medium Eagle vitamins, and 25 μg/ml gentamicin sulfate, with subpassage to fresh medium every 4 days to sustain parasite growth and viability. DNA was extracted and purified using a Qiagen DNeasy blood and tissue kit with the spin column protocol, according to the manufacturer’s instructions. The concentration of the extracted DNA was assessed using a Qubit fluorometer, a microplate reader, and agarose gel electrophoresis. All sequencing libraries were prepared from the same extracted DNA sample to avoid any inconsistency.
Short-read library construction and sequencing were contracted to (i) BGI (Shenzhen, China) for DNBSEQ libraries, producing paired-end reads (270 bp and 500 bp) using the Illumina HiSeq platform, and (ii) Aberystwyth University (Aberystwyth, UK) for TruSeq Nano DNA libraries, producing paired-end reads (300 bp) using the Illumina MiSeq platform. We performed long-read library preparation and sequencing according to the Nanopore protocol (SQK-LSK109) on R9 flow cells (FLO-MIN106). The read quality was assessed using MultiQC (7), incorporating the use of FastQC for the Illumina short reads and pycoQC for the Nanopore long reads.
We assembled the long reads using Flye (8), with default parameters, to generate chromosome-scale scaffolds. Then, using Minimap2 (9) and SAMtools (10), we mapped the short reads onto the assembled scaffolds to compensate for erroneous bases within the long reads and create consensus sequences. After polishing the assembly using Pilon (11), another round of consensus short-read mapping was performed. Then, we removed duplicated contigs and sorted the remainder according to length using Funannotate (12). Finally, we separated the chimeric sequences and performed scaffolding using RaGOO (13) with the Leishmania major strain Friedlin genome (GenBank accession number GCA_000002725.2) (14) as a reference guide, aligning all 36 chromosomes for our assembly, with the exception of 38 unplaced contigs totaling 1,892,991 bp.
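The deduplication and length-sorting step described above can be illustrated with a minimal Python sketch. This is not Funannotate's implementation: as a simplifying assumption, it only drops contigs that are exact copies, or exact reverse complements, of a contig already seen. The function names and the toy contigs are hypothetical.

```python
from typing import Dict, List, Tuple

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def dedupe_and_sort(contigs: Dict[str, str]) -> List[Tuple[str, str]]:
    """Drop exact duplicates (either strand), then sort longest first.

    Assumption: duplicates are detected by exact sequence identity only.
    A real pipeline would typically use alignment-based criteria.
    """
    seen = set()
    kept = []
    for name, seq in contigs.items():
        forward = seq.upper()
        # Use a strand-independent key so both orientations collide.
        key = min(forward, reverse_complement(forward))
        if key in seen:
            continue
        seen.add(key)
        kept.append((name, seq))
    kept.sort(key=lambda item: len(item[1]), reverse=True)
    return kept


# Hypothetical toy contigs, not data from the assembly.
toy_contigs = {
    "contig_a": "ACGTTGCA",
    "contig_b": "ACG",
    "contig_c": "TGCAACGT",  # exact reverse complement of contig_a
    "contig_d": "ACGTTGCAAC",
}

for name, seq in dedupe_and_sort(toy_contigs):
    print(name, len(seq))
```

In this toy input, contig_c is discarded because it is the reverse complement of contig_a, and the remaining contigs are listed from longest to shortest.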
The analysis workflow for assembly and annotation was performed using Snakemake (15) and is available online for reproducibility purposes (https://github.com/hatimalmutairi/LGAAP), including the software versions and parameters used (16). Figure 1 compares our assembly with other complete genomes.
FIG 1.
Assembly comparison of P. hertigi LV43 with Endotrypanum monterogeii LV88, L. major Friedlin, and P. deanei TCC258.
We assessed the assembly completeness using BUSCO (17), with the lineage data set for the phylum Euglenozoa, containing 130 single-copy orthologs from 31 species, finding that 126 of these were present (96.92% completeness). We carried out functional annotation and prediction using the MAKER2 (18) annotation pipeline in combination with AUGUSTUS (19) gene prediction software. Table 1 shows additional summary metrics for the sequencing, assembly, and annotation.
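The completeness percentage quoted above follows directly from the two ortholog counts. The short Python sketch below reproduces that arithmetic; the function name is hypothetical, and it assumes that "present" corresponds to the count BUSCO reports as found out of the 130 expected orthologs.

```python
def busco_completeness(found: int, total: int) -> float:
    """Return the percentage of expected single-copy orthologs that were found."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * found / total


# Counts taken from the text: 126 of 130 Euglenozoa orthologs present.
completeness = busco_completeness(126, 130)
print(f"{completeness:.2f}%")  # prints 96.92%
```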
TABLE 1.
Detailed summary metrics of the genome sequencing, assembly, and annotation for P. hertigi LV43
| Feature(s) | Metric(s) |
|---|---|
| Total no. of reads | 27,383,632 |
| No. of MiSeq reads | 3,785,008 |
| No. of HiSeq reads | 23,382,754 |
| No. of MinION reads (read N50 [bp]) | 215,870 (20,520) |
| Total read length (Gbp) | 13.41 |
| Genome coverage (×) | 177.1 |
| Total no. of scaffolds | 74 |
| Genome size (bp) | 34,958,538 |
| N50 (bp) | 967,170 |
| GC content (%) | 56.00 |
| No. of Ns (% of genome) | 320 (0.001) |
| No. of genes | 7,891 |
| Gene density (genes/Mb) | 225.7 |
| No. of exons | 8,270 |
| Mean gene length (bp) | 1,908 |
| Total length of CDSs^a (Mb [% of genome]) | 14.70 (42.06) |
^a CDSs, coding DNA sequences.
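Several metrics in Table 1 are simple functions of contig lengths or of other rows. The Python sketch below shows one common way to compute N50 and GC content on hypothetical toy data, then cross-checks a few table values against each other. The function names are illustrative, and excluding Ns from the GC denominator is an assumption rather than the documented behavior of the tools used for this assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L cover at least half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no lengths given")


def gc_percent(seq: str) -> float:
    """GC percentage over unambiguous bases (assumption: Ns are ignored)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence has no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


# Toy data, not taken from the assembly.
print(n50([10, 8, 6, 4, 2]))            # total 30, half 15, so N50 is 8
print(round(gc_percent("GGCCATNN"), 2))  # 4 GC bases out of 6, 66.67

# Cross-checks using values reported in Table 1 and the text.
genome_bp = 34_958_538
total_reads = 3_785_008 + 23_382_754 + 215_870  # MiSeq + HiSeq + MinION
gene_density = 7_891 / (genome_bp / 1_000_000)  # genes per Mb
n_percent = 100.0 * 320 / genome_bp             # Ns as % of genome
total_scaffolds = 36 + 38                       # chromosomes + unplaced contigs

print(total_reads, round(gene_density, 1), round(n_percent, 3), total_scaffolds)
```

The derived values agree with the table: the three read counts sum to 27,383,632, the gene density rounds to 225.7 genes/Mb, the N fraction rounds to 0.001%, and the 36 chromosomes plus 38 unplaced contigs give 74 scaffolds.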
Data availability.
The assembly and annotations are available under GenBank assembly accession number GCA_017918235.1. The master record for the whole-genome sequencing project is available at JAFJZO000000000.1. The raw sequence reads are available under BioProject accession number PRJNA691541.
ACKNOWLEDGMENT
This work is funded by a Ph.D. studentship grant to H.A. from the Ministry of Health and Public Health Authority of Saudi Arabia.
Contributor Information
Derek Gatherer, Email: d.gatherer@lancaster.ac.uk.
Jason E. Stajich, University of California, Riverside
REFERENCES
- 1. Herrer A. 1971. Leishmania hertigi sp. n., from the tropical porcupine, Coendou rothschildi Thomas. J Parasitol 57:626–629. doi: 10.2307/3277928.
- 2. Zeledon R, Ponce C, de Ponce E. 1977. Finding of Leishmania hertigi in the Costa Rican porcupine. J Parasitol 63:924–925. doi: 10.2307/3279912.
- 3. Espinosa OA, Serrano MG, Camargo EP, Teixeira MMG, Shaw JJ. 2018. An appraisal of the taxonomy and nomenclature of trypanosomatids presently classified as Leishmania and Endotrypanum. Parasitology 145:430–442. doi: 10.1017/S0031182016002092.
- 4. Jirků M, Yurchenko VY, Lukeš J, Maslov DA. 2012. New species of insect trypanosomatids from Costa Rica and the proposal for a new subfamily within the Trypanosomatidae. J Eukaryot Microbiol 59:537–547. doi: 10.1111/j.1550-7408.2012.00636.x.
- 5. Albanaz ATS, Gerasimov ES, Shaw JJ, Sadlova J, Lukes J, Volf P, Opperdoes FR, Kostygov AY, Butenko A, Yurchenko V. 2021. Genome analysis of Endotrypanum and Porcisia spp., closest phylogenetic relatives of Leishmania, highlights the role of amastins in shaping pathogenicity. Genes (Basel) 12:444. doi: 10.3390/genes12030444.
- 6. Chanmol W, Jariyapan N, Somboon P, Bates MD, Bates PA. 2019. Axenic amastigote cultivation and in vitro development of Leishmania orientalis. Parasitol Res 118:1885–1897. doi: 10.1007/s00436-019-06311-z.
- 7. Ewels P, Magnusson M, Lundin S, Kaller M. 2016. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32:3047–3048. doi: 10.1093/bioinformatics/btw354.
- 8. Kolmogorov M, Yuan J, Lin Y, Pevzner PA. 2019. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol 37:540–546. doi: 10.1038/s41587-019-0072-8.
- 9. Li H. 2016. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics 32:2103–2110. doi: 10.1093/bioinformatics/btw152.
- 10. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079. doi: 10.1093/bioinformatics/btp352.
- 11. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963. doi: 10.1371/journal.pone.0112963.
- 12. Li W-C, Wang T-F. 2021. PacBio long-read sequencing, assembly, and Funannotate reannotation of the complete genome of Trichoderma reesei QM6a. Methods Mol Biol 2234:311–329. doi: 10.1007/978-1-0716-1048-0_21.
- 13. Alonge M, Soyk S, Ramakrishnan S, Wang X, Goodwin S, Sedlazeck FJ, Lippman ZB, Schatz MC. 2019. RaGOO: fast and accurate reference-guided scaffolding of draft genomes. Genome Biol 20:224. doi: 10.1186/s13059-019-1829-6.
- 14. Ivens AC, Peacock CS, Worthey EA, Murphy L, Aggarwal G, Berriman M, Sisk E, Rajandream M-A, Adlem E, Aert R, Anupama A, Apostolou Z, Attipoe P, Bason N, Bauser C, Beck A, Beverley SM, Bianchettin G, Borzym K, Bothe G, Bruschi CV, Collins M, Cadag E, Ciarloni L, Clayton C, Coulson RMR, Cronin A, Cruz AK, Davies RM, De Gaudenzi J, Dobson DE, Duesterhoeft A, Fazelina G, Fosker N, Frasch AC, Fraser A, Fuchs M, Gabel C, Goble A, Goffeau A, Harris D, Hertz-Fowler C, Hilbert H, Horn D, Huang Y, Klages S, Knights A, Kube M, Larke N, Litvin L, et al. 2005. The genome of the kinetoplastid parasite, Leishmania major. Science 309:436–442. doi: 10.1126/science.1112680.
- 15. Mölder F, Jablonski KP, Letcher B, Hall MB, Tomkins-Tinch CH, Sochat V, Forster J, Lee S, Twardziok SO, Kanitz A, Wilm A, Holtgrewe M, Rahmann S, Nahnsen S, Köster J. 2021. Sustainable data analysis with Snakemake. F1000Res 10:33. doi: 10.12688/f1000research.29032.2.
- 16. Almutairi H, Urbaniak MD, Bates MD, Jariyapan N, Kwakye-Nuako G, Thomaz-Soccol V, Al-Salem WS, Dillon RJ, Bates PA, Gatherer D. 2021. LGAAP: Leishmaniinae Genome Assembly and Annotation Pipeline. Microbiol Resour Announc 10:e0043921. doi: 10.1128/MRA.00439-21.
- 17. Simao FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. 2015. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31:3210–3212. doi: 10.1093/bioinformatics/btv351.
- 18. Holt C, Yandell M. 2011. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12:491. doi: 10.1186/1471-2105-12-491.
- 19. Stanke M, Steinkamp R, Waack S, Morgenstern B. 2004. AUGUSTUS: a Web server for gene finding in eukaryotes. Nucleic Acids Res 32:W309–W312. doi: 10.1093/nar/gkh379.