Wellcome Open Res. 2022 Dec 5;7:294. [Version 1] doi: 10.12688/wellcomeopenres.18575.1

Genome sequence of Leishmania mexicana MNYC/BZ/62/M379 expressing Cas9 and T7 RNA polymerase

Tom Beneke, Ulrich Dobramysl, Carolina Moura Costa Catta-Preta, Jeremy Charles Mottram, Eva Gluenz, Richard J Wheeler
PMCID: PMC9975418  PMID: 36874584

Abstract

We present the genome sequence of Leishmania mexicana MNYC/BZ/62/M379 modified to express Cas9 and T7 RNA polymerase, revealing high similarity to the reference genome (MHOM/GT/2001/U1103). Through RNAseq-based annotation of coding sequences and untranslated regions, we provide primer sequences for construct and sgRNA template generation for CRISPR-assisted gene deletion and endogenous tagging.

Keywords: Leishmania mexicana, Genome, Pilon polish

Introduction

Leishmania mexicana is a human-infective unicellular eukaryote and one of the species that cause leishmaniasis. It is commonly used as a model Leishmania species for molecular cell biology due to its lower virulence (causing cutaneous rather than visceral leishmaniasis) and its ability to readily differentiate into the amastigote form in appropriate axenic culture. We have previously described the generation of a genetically modified L. mexicana MNYC/BZ/62/M379 expressing Cas9 and T7 RNA polymerase as a strain enabling rapid reverse genetic modification [1]. As this is not the reference genome strain (which is MHOM/GT/2001/U1103) [2] and may have accumulated mutations during laboratory culture and/or under the selection pressures of Cas9 or T7 expression, we sequenced the genome of this widely used strain as a high-quality reference for the design of reverse genetic strategies.

Methods

We have previously confirmed that these promastigotes are infectious to the sandfly vector [3]. To confirm that the line was also infectious to mammals, we infected the footpad of an eight-week-old female BALB/c mouse with stationary phase promastigotes (2.0 × 10^6); after four weeks we purified amastigotes from the excised lesion, back-transformed them to promastigotes in axenic culture in M199 supplemented with 20% FCS and 50 µg/ml gentamicin (Roche), and grew them for seven passages. This gave rise to the cell line L. mex Cas9 T7 M. Genomic DNA from before and after mouse passage was extracted by phenol-chloroform extraction as previously described [4]. For Illumina sequencing, isolated DNA was diluted in 300 µL resuspension buffer (from the Illumina TruSeq Nano DNA Library kit) and sonicated for nine minutes using a Bioruptor 300 (Diagenode, set to low [0.5 interval]). The resulting 600 bp DNA fragments were processed for library construction using the TruSeq Nano DNA Library kit (Illumina). Sequencing followed on an Illumina NextSeq550 in paired-end mode (2×150 nt) using a NextSeq 500/550 Mid Output 300 Cycles v2 Kit (Illumina, v2 kits now discontinued). We obtained 24,656,567 reads (734 Gb) and 26,015,449 reads (7.8 Gb) from before and after passage respectively. For Nanopore sequencing, a library was constructed from the after-passage sample (not additionally sonicated) using the 1D² (SQK-LSK309) kit and sequenced on a MinION (FLO-MIN106 flow cell), obtaining 329,692 1D and 48,711 1D² reads (total, 2.3 Gb; mean length, 6.1 kb; read N50, 8.0 kb; longest read, 448 kb).
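The Illumina yields quoted above can be sanity-checked from the read counts. The sketch below assumes that each reported "read" is one 2×150 nt pair, which is our reading of the text rather than something it states. Under that assumption the after-passage figure is reproduced (about 7.8 Gb) and the before-passage sample comes out at about 7.4 Gb, so the stated 734 Gb may be worth checking against the deposited data.

```python
# Expected paired-end yield = read pairs x 2 mates x read length.
# Assumption (not stated in the text): each reported "read" is one 2x150 nt pair.
READ_LENGTH_NT = 150
MATES_PER_PAIR = 2


def expected_yield_gb(read_pairs: int) -> float:
    """Return the expected sequencing yield in gigabases (1 Gb = 1e9 nt)."""
    return read_pairs * MATES_PER_PAIR * READ_LENGTH_NT / 1e9


before_gb = expected_yield_gb(24_656_567)  # before mouse passage
after_gb = expected_yield_gb(26_015_449)   # after mouse passage

print(f"before passage: {before_gb:.1f} Gb")  # before passage: 7.4 Gb
print(f"after passage:  {after_gb:.1f} Gb")   # after passage:  7.8 Gb
```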

We generated a trial de novo MNYC/BZ/62/M379 Cas9/T7 assembly using the Nanopore reads. Following adapter trimming using Porechop v0.2.4, we used a minimap2/racon/miniasm pipeline (v2.17-r974, v1.4.20 and v0.3-r179 respectively) [5–7], then polished the assembly by mapping the adapter-trimmed (using TrimGalore! v0.6.0 [8]) Illumina reads using BWA-MEM v0.7.17 [9], followed by 10 rounds of polishing using Pilon v1.23 [10] (total length, 31.4 Mb; contigs, 109; mean length, 291 kb; N50, 640 kb; longest contig, 2.87 Mb). Synteny inspection using SyMAP [11] in comparison to the MHOM/GT/2001/U1103 reference genome [2] showed one or two contigs per chromosome (except for chromosomes 8 and 34, four contigs each, and chromosomes 10 and 19, three contigs each) and no evidence for chromosomal segmental deletion or duplication.
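For readers unfamiliar with the N50 statistic quoted for the reads and the assembly, a minimal sketch of its usual definition follows. The contig lengths are hypothetical toy values, not the lengths from this assembly.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of
    length >= L together cover at least half of the total length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no lengths given")


# Hypothetical contig lengths in kb (total 290, half 145).
toy_contigs_kb = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs_kb))  # 70, because 80 + 70 = 150 >= 145
```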

To simplify genome annotation, we therefore opted to polish the MHOM/GT/2001/U1103 genome (NCBI Genome Assembly GCA_000234665.4) with the Illumina reads to generate a MNYC/BZ/62/M379 Cas9/T7 genome, instead of using the de novo assembly. Following adapter trimming using TrimGalore! v0.6.0 (default settings) and removal of unfixable reads, the genome was polished by mapping the Illumina reads to the genome using BWA-MEM v0.7.17 [9], followed by one round of polishing using Pilon v1.23 fixing SNPs and indels [10], which identified 21,500 SNPs, 3,828 small insertions and 4,878 small deletions. Pilon, run in changes mode, identified only 193 SNPs and no changes in coding sequences following mouse passage. Note that neither T7 nor Cas9 is present in this polished genome, as Cas9 is not chromosomally integrated (it is instead expressed from an episome [1]) and T7 is integrated into the highly repetitive 18S rRNA array, which is collapsed in the reference genome.
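The SNP, insertion and deletion counts above can be tallied from a list of changes like the one Pilon reports in changes mode. The sketch below is illustrative only: it assumes each line has four whitespace-separated fields (original position, new position, original allele, new allele) with "." marking an empty allele, and the example lines and contig names are invented. It is not the authors' analysis code, which is available from the repository listed under Analysis code.

```python
from collections import Counter


def classify_change(old_allele: str, new_allele: str) -> str:
    """Classify one change by comparing the original and new alleles.
    A '.' stands for an empty allele (assumed file convention)."""
    if old_allele == ".":
        return "insertion"
    if new_allele == ".":
        return "deletion"
    if len(old_allele) == 1 and len(new_allele) == 1:
        return "snp"
    return "other"  # e.g. multi-base substitutions


def tally_changes(lines):
    """Count change types over an iterable of change lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        _old_pos, _new_pos, old_allele, new_allele = fields
        counts[classify_change(old_allele, new_allele)] += 1
    return counts


# Invented example lines in the assumed four-column layout.
example = [
    "chr1:1200 chr1_pilon:1200 A G",
    "chr1:3405 chr1_pilon:3405 . TT",
    "chr2:880-882 chr2_pilon:880 CAG .",
]
print(tally_changes(example))
```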

Aneuploidy is known to be common among Leishmania [2]. Indeed, chromosomal coverage was not uniform; coverage was 144±5 (mean±s.d.) excluding three outliers: chromosome 3 (coverage, 218; triploid), chromosome 16 (coverage, 214; triploid) and chromosome 30 (coverage, 277; tetraploid).
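The copy number assignments follow from scaling each outlier's coverage against the baseline. The short sketch below assumes that the chromosomes at the baseline coverage of 144 are present in two copies, an assumption not stated explicitly in the text.

```python
BASELINE_COVERAGE = 144  # mean coverage of the non-outlier chromosomes
BASELINE_COPIES = 2      # assumption: baseline chromosomes are present in two copies


def estimated_copies(coverage: float) -> float:
    """Scale coverage to a copy number relative to the baseline."""
    return BASELINE_COPIES * coverage / BASELINE_COVERAGE


outliers = {3: 218, 16: 214, 30: 277}  # chromosome -> coverage, from the text
for chromosome, coverage in outliers.items():
    copies = estimated_copies(coverage)
    print(f"chromosome {chromosome}: {copies:.2f} copies, nearest integer {round(copies)}")
# chromosome 3: 3.03 copies, nearest integer 3
# chromosome 16: 2.97 copies, nearest integer 3
# chromosome 30: 3.85 copies, nearest integer 4
```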

Updated MHOM/GT/2001/U1103 genome annotations were prepared from existing resources, then transferred to MNYC/BZ/62/M379 Cas9/T7, accounting for coordinate changes due to indels. MHOM/GT/2001/U1103 ORFs and non-coding RNAs from TriTrypDB v50 [12] were taken as the starting set. Previous RNAseq analysis [13] (BioProject accession number PRJEB8829) mapped spliced leader acceptor sites (SLASs, the sites of trans splicing of a leader sequence common to all processed mRNAs) and polyadenylation sites (PASs), which together define the bounds of each mRNA and from which suggested gene extensions and truncations were listed. We included these changes when a valid ORF (with a start and stop codon and no internal stops) was retained, and mapped the 5′ and 3′ UTRs based on the most commonly observed SLAS and PAS for each gene respectively. Previously identified novel genes with a valid ORF and evidence for expression as a polyadenylated transcript [12] were also included. To distinguish these gene models from the reference genome annotation, we prefixed the gene names with “LmxM379c”, indicating the strain name and its expression of Cas9 and T7.
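The "valid ORF" criterion described above (a start codon, a stop codon and no internal stops) can be written as a simple check. This is a sketch under the assumption of an ATG start codon and the three standard stop codons; the function name is hypothetical and this is not the authors' annotation code.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard stop codons
START_CODON = "ATG"                  # assumption: ATG start only


def is_valid_orf(seq: str) -> bool:
    """True if seq starts with a start codon, ends with a stop codon,
    is a whole number of codons and has no internal stop codon."""
    seq = seq.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != START_CODON or codons[-1] not in STOP_CODONS:
        return False
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


print(is_valid_orf("ATGGCTTAA"))     # True: start, one codon, stop
print(is_valid_orf("ATGTAAGCTTGA"))  # False: internal stop codon
```

A proposed extension or truncation would be kept only if the adjusted sequence still passes such a check.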

Cas9 enables CRISPR-assisted genome editing, while T7 RNA polymerase allows use of sgDNA encoding a T7 promoter and sgRNA to program Cas9 activity. Using our previously published ‘LeishGEdit’ pipeline [1] and our updated primer design software [14], which is based on the CCTop CRISPR/Cas9 Target Prediction Tool [15], we designed primers for: 1) PCR-based generation of constructs for endogenous protein tagging (uf/ur primers for N terminal tagging or df/dr primers for C terminal tagging, using the pPOT, pLPOT and pPLOT series of plasmids) and gene deletion (uf/dr primers, using the pT series); 2) PCR-based generation of sgRNA templates for tagging and deletion (5g/3g primers for deletion, 5g primer for N terminal tagging and 3g primer for C terminal tagging); and 3) validation of gene deletion by diagnostic PCR, using primers within each protein-coding gene ORF (vf/vr primers, designed with the Primer3 primer design software [16]). We also designed uf primers carrying a unique 17-nt DNA barcode [14] for generating barcoded pools of deletion mutants.
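As background to sgRNA target selection, the sketch below lists candidate 20-nt protospacers that are immediately followed by an NGG PAM on either strand. It assumes a Cas9 with an NGG PAM requirement and a 20-nt guide, which the text does not specify, and it does no off-target scoring. It is a toy illustration, not the CCTop algorithm or the LeishGEdit primer design software, and the function names are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_protospacers(seq: str, length: int = 20):
    """Return (strand, start, protospacer) for every site followed by NGG.
    'start' is 0-based on the scanned strand (for '-', on the reverse
    complement of seq). Assumes an NGG PAM and a 20-nt guide by default."""
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{length}}})[ACGT]GG)")
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1)))
    return hits


# Invented target: a 20-nt site followed by the PAM 'AGG'.
target = "GATTACAGATTACAGATTAC" + "AGG" + "TTT"
for hit in find_protospacers(target):
    print(hit)  # ('+', 0, 'GATTACAGATTACAGATTAC')
```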

As this set of primers accounts for strain-specific SNPs and indels, we recommend them as a standardised ‘first attempt’ for tagging and deleting genes in L. mex Cas9 T7 M, and we will be using them for future high-throughput reverse genetic analyses.

Ethics statement

All experiments were conducted according to the Animals (Scientific Procedures) Act of 1986, United Kingdom, and had approval from the University of York Animal Welfare and Ethical Review Body (AWERB) committee. All efforts were undertaken to minimise the suffering of animals.

Funding Statement

This work was supported by Wellcome 221944 [https://doi.org/10.35802/221944]; 211075 [https://doi.org/10.35802/211075]; 200807 [https://doi.org/10.35802/200807] and an MRC PhD studentship [15/16_MSD_836338]. Eva Gluenz was supported by the Royal Society.

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

[version 1; peer review: 2 approved, 1 approved with reservations]

Data availability

Underlying data

BioProject: Leishmania mexicana MNYC/BZ/62/M379 Cas9/T7 whole genome sequencing; Accession number: PRJNA853937, https://identifiers.org/bioproject:PRJNA853937 [17]

Sequence Read Archive: Next generation sequencing of Leishmania mexicana Cas9/T7 strain: after mouse passage (SRR19895146). Accession number: SRR19895146; https://identifiers.org/insdc.sra:SRR19895146 [18]

Sequence Read Archive: Nanopore sequencing of Leishmania mexicana Cas9/T7 strain (SRR20123517). Accession number: SRR20123517; https://identifiers.org/insdc.sra:SRR20123517 [19]

Extended data

Zenodo: Gene tagging and gene deletion resources for Leishmania mexicana MNYC/BZ/62/M379 Cas9/T7 strain, https://doi.org/10.5281/zenodo.7313190 [20]

This project contains the primer sequences, barcodes and the GFF file containing the sequence and annotations of the L. mexicana MNYC/BZ/62/M379 Cas9/T7 strain.

Analysis code

All code for genome assembly, polishing, annotation updates and annotation transfer is available from GitHub: https://github.com/Wheeler-Lab/genome-lmexcas9t7/tree/v1.0.1; and archived in Zenodo: https://doi.org/10.5281/zenodo.7357174 [21]

License: GNU GPL-3.0

Reporting guidelines

Zenodo: ARRIVE E-10 Checklist for “Genome sequence of Leishmania mexicana MNYC/BZ/62/M379 expressing Cas9 and T7 RNA polymerase”, https://doi.org/10.5281/zenodo.7330926 [22].

Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).

References

  • 1. Beneke T, Madden R, Makin L, et al.: A CRISPR Cas9 high-throughput genome editing toolkit for kinetoplastids. R Soc Open Sci. 2017;4(5):170095. doi: 10.1098/rsos.170095
  • 2. Rogers MB, Hilley JD, Dickens NJ, et al.: Chromosome and gene copy number variation allow major structural change between species and strains of Leishmania. Genome Res. 2011;21(12):2129–42. doi: 10.1101/gr.122945.111
  • 3. Beneke T, Demay F, Hookway E, et al.: Genetic dissection of a Leishmania flagellar proteome demonstrates requirement for directional motility in sand fly infections. PLoS Pathog. 2019;15(6):e1007828. doi: 10.1371/journal.ppat.1007828
  • 4. Sambrook J, Russell DW: Purification of nucleic acids by extraction with phenol:chloroform. CSH Protoc. 2006;2006(1):pdb.prot4455. doi: 10.1101/pdb.prot4455
  • 5. Li H: Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics. 2016;32(14):2103–10. doi: 10.1093/bioinformatics/btw152
  • 6. Li H: Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34(18):3094–100. doi: 10.1093/bioinformatics/bty191
  • 7. Vaser R, Sovic I, Nagarajan N, et al.: Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 2017;27(5):737–46. doi: 10.1101/gr.214270.116
  • 8. Krueger F: Trim Galore. 2021.
  • 9. Li H: Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997 [q-bio]. 2013. doi: 10.6084/M9.FIGSHARE.963153.V1
  • 10. Walker BJ, Abeel T, Shea T, et al.: Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One. 2014;9(11):e112963. doi: 10.1371/journal.pone.0112963
  • 11. Soderlund C, Bomhoff M, Nelson WM: SyMAP v3.4: a turnkey synteny system with application to plant genomes. Nucleic Acids Res. 2011;39(10):e68. doi: 10.1093/nar/gkr123
  • 12. Amos B, Aurrecoechea C, Barba M, et al.: VEuPathDB: the eukaryotic pathogen, vector and host bioinformatics resource center. Nucleic Acids Res. 2022;50(D1):D898–D911. doi: 10.1093/nar/gkab929
  • 13. Fiebig M, Kelly S, Gluenz E: Comparative Life Cycle Transcriptomics Revises Leishmania mexicana Genome Annotation and Links a Chromosome Duplication with Parasitism of Vertebrates. PLoS Pathog. 2015;11(10):e1005186. doi: 10.1371/journal.ppat.1005186
  • 14. Beneke T, Gluenz E: Bar-seq strategies for the LeishGEdit toolbox. Mol Biochem Parasitol. 2020;239:111295. doi: 10.1016/j.molbiopara.2020.111295
  • 15. Stemmer M, Thumberger T, Del Sol Keyer M, et al.: CCTop: An Intuitive, Flexible and Reliable CRISPR/Cas9 Target Prediction Tool. PLoS One. 2015;10(4):e0124633. doi: 10.1371/journal.pone.0124633
  • 16. Untergasser A, Cutcutache I, Koressaar T, et al.: Primer3--new capabilities and interfaces. Nucleic Acids Res. 2012;40(15):e115. doi: 10.1093/nar/gks596
  • 17. Beneke T, Dobramysl U, Catta-Preta CMC, et al.: Leishmania mexicana MNYC/BZ/62/M379 Cas9/T7 whole genome sequencing [Data set]. NCBI BioProject PRJNA853937. 2022.
  • 18. Beneke T, Dobramysl U, Catta-Preta CMC, et al.: Next generation sequencing of Leishmania mexicana Cas9/T7 strain: after mouse passage (SRR19895146) [Data set]. Sequence Read Archive; Accession number: SRR19895146. 2022.
  • 19. Beneke T, Dobramysl U, Catta-Preta CMC, et al.: Nanopore sequencing of Leishmania mexicana Cas9/T7 strain (SRR20123517) [Data set]. Sequence Read Archive; Accession number: SRR20123517. 2022.
  • 20. Beneke T, Dobramysl U, Catta-Preta CMC, et al.: Gene tagging and gene deletion resources for Leishmania mexicana MNYC/BZ/62/M379 Cas9/T7 strain. Zenodo. 2022. doi: 10.5281/zenodo.6832399
  • 21. Dobramysl U, Wheeler RJ: Pilon polish code for the Leishmania mexicana MNYC/BZ/62/M379 Cas9/T7 genome. Zenodo. 2022. doi: 10.5281/zenodo.7357174
  • 22. Beneke T, Dobramysl U, Catta-Preta CMC, et al.: ARRIVE E-10 Checklist for "Genome sequence of Leishmania mexicana MNYC/BZ/62/M379 expressing Cas9 and T7 RNA polymerase". Zenodo. 2022. doi: 10.5281/zenodo.7330925
Wellcome Open Res. 2023 Jan 12. doi: 10.21956/wellcomeopenres.20597.r53618

Reviewer response for version 1

Stephen M Beverley

This is a fairly straightforward work providing the genome sequence of a workhorse Leishmania mexicana strain in common use for CRISPR genome editing studies. It will be a useful resource to many laboratories.

The authors might comment upon the 'clonality' of this line - in past work from these investigators the parasites were not always of clonal origin and hence some heterogeneity may exist which could confound the analysis. Were the parent lines clonal, and since their origin, how many cell doublings have occurred that might further introduce heterogeneity?

Secondly, the use of triploid or tetraploid for specific chromosomes is incorrect - it should be trisomic or tetrasomic.

Are sufficient details of methods and materials provided to allow replication by others?

Yes

Is the rationale for creating the dataset(s) clearly described?

Yes

Are the datasets clearly presented in a useable and accessible format?

Yes

Are the protocols appropriate and is the work technically sound?

Yes

Reviewer Expertise:

Leishmania molecular genetics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Wellcome Open Res. 2023 Jan 4. doi: 10.21956/wellcomeopenres.20597.r53621

Reviewer response for version 1

Richard McCulloch

In this article, Beneke and colleagues provide an extremely useful resource: the genome sequence of a Leishmania mexicana strain expressing Cas9 and T7 polymerase, which is widely used to allow for CRISPR genome editing. The main focus of the article is in understanding if these modifications of an existing and sequenced 'reference' L. mexicana strain (to allow CRISPR editing) have resulted in undetected genome changes, including sequence variations that might reduce the effectiveness of primers based on the reference genome. The data suggest no large scale genome changes have occurred, but that some SNPs should be accounted for in designing primers for CRISPR-based experiments. As such, the study and new genome are clearly of considerable value. Nonetheless, some methods could be more clearly explained.

  1. Are the authors correct when they state that they generated 734 Gb and 7.8 Gb of sequence from before and after passage? This difference appears at odds with the very similar number of reads recovered from the two samples: 24,656,567 and 26,015,449.

  2. In the article, the authors suggest that they used Pilon v1.23 to 'polish the [reference] MHOM/GT/2001/U1103 genome'. I would suggest this phrasing is perhaps confusing and could be improved. Pilon is an algorithm that can be used to improve an existing genome assembly or identify differences between related assemblies. In the context of the de novo assembly of the MNYC/BZ/62/M379 Cas9/T7 genome, 'polishing' appears appropriate, since the Illumina reads improve the Nanopore assembly. However, it appears that the approach taken when comparing the MNYC/BZ/62/M379 Cas9/T7 Illumina reads to the MHOM/GT/2001/U1103 genome assembly has not 'polished' (i.e. altered) the latter genome, but instead identified differences between them.

  3. In describing the differences between the two genomes, what do the authors mean by 'small' insertions and deletions; can they specify the size range?

  4. The following sentence is somewhat confusing:

    'Indeed, chromosomal coverage was not uniform, coverage was 144±5 (mean±sd.) excluding three outliers: chromosome 3 (coverage, 218; triploid), 16 (coverage, 214; triploid) and 30 (coverage, 277; tetraploid).' What is the 'coverage' they refer to; and do they mean that coverage only deviated from relatively uniform levels in the three chromosomes mentioned, or are they suggesting there may be further ploidy variation (not discussed)?

Are sufficient details of methods and materials provided to allow replication by others?

Partly

Is the rationale for creating the dataset(s) clearly described?

Yes

Are the datasets clearly presented in a useable and accessible format?

Yes

Are the protocols appropriate and is the work technically sound?

Yes

Reviewer Expertise:

Genome biology of trypanosomatid parasites

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Wellcome Open Res. 2022 Dec 23. doi: 10.21956/wellcomeopenres.20597.r53617

Reviewer response for version 1

José María Requena

In this article, Beneke et al. report on the genomic sequencing and genome assembly for the Leishmania mexicana MNYC/BZ/62/M379 strain, which was manipulated to express CRISPR-associated protein 9 (Cas9) and T7 RNA polymerase. The creation of this cell line represents the starting point (and essential step) to apply the successful methodology of CRISPR-assisted gene deletion and endogenous tagging that was developed previously by these authors. Although the genome sequence for another L. mexicana strain (MHOM/GT/2001/U1103) is publicly available, it is necessary to know the exact nucleotide sequence for the genes in the particular Leishmania line in which CRISPR methodology will be used. As mentioned by the authors, the L. mexicana MNYC/BZ/62/M379 strain is widely used in different laboratories.

The methodological procedures used for generating the raw sequencing data are described in a detailed manner. Also, the bioinformatics procedures used for genome assembly and sequence polishing are clearly described. Apart from generating the genome sequence, they conducted genome annotation and transcript delineation (including UTRs). As another valuable resource, the authors have designed oligonucleotide sequences to accomplish gene tagging or deletion of every gene in L. mexicana MNYC/BZ/62/M379.

In sum, this article contains valuable datasets. In this regard, this reviewer acknowledges the effort made by the authors to generate these valuable resources and encourages users to report possible inconsistencies in the information provided, as an effective way to collectively improve and curate genome annotations.

Are sufficient details of methods and materials provided to allow replication by others?

Yes

Is the rationale for creating the dataset(s) clearly described?

Yes

Are the datasets clearly presented in a useable and accessible format?

Yes

Are the protocols appropriate and is the work technically sound?

Yes

Reviewer Expertise:

Molecular Biology, Leishmania, NGS, genomics, transcriptomics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.


    Articles from Wellcome Open Research are provided here courtesy of The Wellcome Trust
