Abstract
Ascochyta rabiei is the causal organism of ascochyta blight of chickpea and is present in chickpea crops worldwide. Here we report the release of a high-quality PacBio genome assembly for the Australian A. rabiei isolate ArME14. We compare the ArME14 genome assembly with an Illumina assembly for the Indian A. rabiei isolate ArD2. The ArME14 assembly has gapless sequences for 12 chromosomes with telomere sequences at both ends and nine large contig sequences that extend to one telomere. The total length of the ArME14 assembly was 40,927,385 bp, which was 6.27 Mb longer than the ArD2 assembly. Division of the genome by OcculterCut into GC-balanced and AT-dominant segments reveals that 21% of the genome consists of gene-sparse, AT-rich isochores. Transposable elements and repetitive DNA sequences in the ArME14 assembly made up 15% of the genome. A total of 11,257 protein-coding genes were predicted, compared with 10,596 for ArD2. Many of the predicted genes missing from the ArD2 assembly were in genomic regions adjacent to AT-rich sequence. We compared the complement of predicted transcription factors and secreted proteins for the two A. rabiei genome assemblies and found that the isolates contain almost the same set of proteins. The small number of differences could represent real differences in the gene complement between isolates or could result from the different sequencing methods used. Prediction pipelines were applied for carbohydrate-active enzymes, secondary metabolite clusters and putative protein effectors. We predict that ArME14 contains between 450 and 650 CAZymes, 39 putative protein effectors and 26 secondary metabolite clusters.
Keywords: PacBio, Pleosporales, Dothideomycetes, plant pathogen, chickpea
Ascochyta blight of chickpea (Cicer arietinum L.) is caused by Ascochyta rabiei (Pass.) Labrouse (teleomorph: Didymella rabiei (Kovachevski) von Arx, syn. Mycosphaerella rabiei (Kovachevski), syn. Phoma rabiei) (de Gruyter et al. 2009; Aveskamp et al. 2010) and is the major biotic constraint affecting production of chickpea worldwide. Chickpea is a legume crop species of global economic importance (Fondevilla et al. 2015). Worldwide production was 12 million tonnes in 2016, with two-thirds grown in India. Australia is a major exporter, and the combined control costs and losses attributed to ascochyta blight in Australia in 2012 were around AU$40 million (Murray and Brennan 2012), which represents approximately 10% of crop value.
A. rabiei is a haploid, heterothallic Dothideomycete fungus (order Pleosporales) that causes disease by producing necrosis in leaf, stem and pod tissues (Trapero-Casas and Kaiser 1992; Peever et al. 2004). Little is known about how it causes disease. It produces solanapyrones, phytotoxic secondary metabolites found in culture filtrates, and these have long been considered to play a role in pathogenicity (Alam et al. 1989; Hamid and Strange 2000). However, deletion of the genes responsible for solanapyrone biosynthesis did not affect virulence (Kim et al. 2015a,b). Attention has since shifted to effectors, proteins that control the plant-pathogen interaction and in some cases induce necrosis (Lo Presti et al. 2015; Tan and Oliver 2017). In terms of pathogen life cycle and population structure, both mating types are found in A. rabiei populations in Israel (Lichtenzveig et al. 2005), North America (Peever et al. 2004) and Canada (Armstrong et al. 2001), thus providing a mechanism for sexual recombination. In Australia, most reports suggest the presence of only mating type MAT1-2 (Leo et al. 2015; Mehmood et al. 2017) and the absence of mating in the population. SSR genotyping has found the Australian A. rabiei population to be highly homogeneous, with around 70% of isolates tested belonging to a single dominant haplotype designated ARH01 (Leo et al. 2015; Mehmood et al. 2017).
In 2016, an Illumina short-read genome assembly of an Indian A. rabiei isolate, ArD2, was published (Verma et al. 2016). In their analysis, Verma et al. (2016) predicted 758 secreted proteins from a total of 10,596 proteins encoded by the 34.6 Mb genome assembly. Of the 758 predicted secreted proteins, 201 were annotated as Carbohydrate-Active Enzymes (CAZymes) (Lombard et al. 2014). These included proteins containing carbohydrate-binding modules and LysM domains that characterize chitin-binding effectors, which suppress the immune response of plants to fungal pathogens (de Jonge et al. 2010). There were 323 putative effectors with no known protein domain, and 70 proteins of the predicted secretome showed sequence similarity to members of the Pathogen-Host Interaction (PHI) database (Winnenburg et al. 2006) with annotated involvement in virulence or pathogenicity (Verma et al. 2016). In addition to proposed virulence proteins, transcription factors that control gene expression during plant infection by A. rabiei have been predicted for ArD2 and assessed for their contribution to pathogenesis (Verma et al. 2017). Verma et al. (2017) suggest that, for A. rabiei, Myb transcription factors play a role in regulating the expression of genes encoding secreted proteins, such as effectors. Gene regulation by transcription factors during plant infection may also vary depending on specific isolate and host interactions (Verma et al. 2017).
Here, we present a genome assembly for an Australian A. rabiei isolate, ArME14, that was produced by combining whole-genome DNA sequence data from both Illumina and PacBio SMRT sequencing. The updated genome assembly features 12 full-length chromosome contigs with telomere sequences at both ends, 9 partial chromosomal contigs with one telomere end, and 13 smaller fragments with no telomere. The total assembled genome length was 40.9 Mb and genome annotation predicted 11,257 gene models.
Materials and methods
Fungal culture and DNA extraction
A. rabiei isolate ArME14 was collected from chickpea at the Department of Agriculture and Food Western Australia (DAFWA) field station, Medina, Western Australia in 2004. Fungal cultures were grown for three days in potato dextrose liquid media, at approx. 22° with shaking (150 rpm). For Illumina sequencing, DNA was prepared using a standard CTAB extraction method. RNA was removed by incubation with DNase-free RNase A and DNA was resuspended in TE buffer (10 mM Tris-HCl 1 mM EDTA, pH 8). DNA concentration was determined by NanoDrop Spectrophotometer (Thermo-Fisher Scientific, Waltham, MA, USA) and quality and purity were assessed by agarose gel electrophoresis. For PacBio sequencing, maxi-prep DNA extractions were produced using a modified method from Xin and Chen (2012). Fungal material was grown in Yeast Extract Glucose liquid media at approx. 22° with shaking (180 rpm), for 72 hr. DNA was resuspended in 2 mL Tris-HCl, pH 8.0 and treated with 20 µg.mL-1 DNase-free RNase A. DNA was purified using Ampure XP beads (Agencourt, Beckman-Coulter, USA) in a 96-well microtitre plate. The final DNA solution was quantified using Qubit and Nanodrop (Thermo Fisher Scientific) assays. Gel electrophoresis on a 1% agarose gel was used to assess DNA quality.
Genome sequencing
Short-read DNA sequencing was performed at the Allan Wilson Genome Centre (Massey University, Palmerston North, New Zealand) using an Illumina Genome Analyzer (Illumina Inc., San Diego, CA, USA). Illumina TruSeq paired-end libraries were prepared from A. rabiei isolate ArME14 DNA and size-selected for 200 bp fragments, from which 75 bp reads were sequenced. Single-Molecule, Real-Time (SMRT) PacBio sequencing was performed by Genome Quebec (McGill University, Montreal, Canada). Libraries were prepared with size-selected 17 kb fragments from sheared genomic DNA using P6-C4 chemistry, and sequenced on six SMRT cells using a PacBio RSII instrument (Pacific Biosciences, Menlo Park, CA, USA).
Reference genome assembly
PacBio SMRT reads were assembled using the CANU v 1.2 assembler (Berlin et al. 2015) and the resulting intermediate assembly sequences were corrected using Illumina reads via PILON v 1.2.1 (Walker et al. 2014). A single mitochondrial genome contig from the assembly was identified by homology with other published Dothideomycete genome sequences and was designated ‘Mitochondrion MT’ in the new assembly. Sequencing statistics for the final corrected A. rabiei ArME14 reference assembly including overall percent GC content, were calculated using QUAST v 4.6.2 (Gurevich et al. 2013). Telomere sequences were manually recorded based on the presence of TTAGGG tandem repeat sequences at contig ends (Schechtman 1990). Computational steps were coordinated using Nextflow (Di Tommaso et al. 2017).
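Telomeres in this study were recorded by manual inspection. As an illustration only, the same check could be sketched in Python as below. The function names, the 200 bp end window and the minimum of five tandem repeats are assumptions chosen for the example and are not parameters reported in the text.

```python
import re

# Telomere repeat named in the text, plus its reverse complement as it
# would read at the opposite end of a contig.
TELOMERE_FWD = "TTAGGG"
TELOMERE_REV = "CCCTAA"


def has_telomere(end_seq: str, min_repeats: int = 5) -> bool:
    """Return True if end_seq holds at least min_repeats tandem telomere units."""
    pattern = (
        rf"(?:{TELOMERE_FWD}){{{min_repeats},}}"
        rf"|(?:{TELOMERE_REV}){{{min_repeats},}}"
    )
    return re.search(pattern, end_seq.upper()) is not None


def classify_contig(seq: str, window: int = 200, min_repeats: int = 5) -> str:
    """Classify a contig by how many of its two ends carry telomere repeats."""
    left = has_telomere(seq[:window], min_repeats)
    right = has_telomere(seq[-window:], min_repeats)
    # Booleans add as integers, giving 0, 1 or 2 telomeric ends.
    return {2: "complete", 1: "one_telomere", 0: "no_telomere"}[left + right]
```

A contig classified as "complete" corresponds to what the text calls a full-length chromosome contig with telomere sequences at both ends.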
Genome annotation and analysis
Gene prediction for the reference A. rabiei ArME14 assembly was performed using the annotation program, AUGUSTUS v 3.3 (Stanke et al. 2004, 2006; König et al. 2016), based on sequence homology of in vitro and in planta RNASeq and Massive Analysis of cDNA Ends (MACE) libraries from the A. rabiei BioProject, PRJNA288273 (Fondevilla et al. 2015). We used BUSCO version 3.0 (Simão et al. 2015) to assess assembly and annotation completeness by running protein fasta files with benchmarking against the Ascomycota_odb9 single-copy orthologs file downloaded from the BUSCO website September 2019 (https://busco.ezlab.org/). We used the program OcculterCut v 1.1 (Testa et al. 2016) to scan the genome assembly to determine its percent GC content distribution.
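To make the idea of a GC content distribution concrete, the following minimal Python sketch computes GC fraction in fixed windows along a sequence. It is not the OcculterCut algorithm, which segments the genome in its own way; the window size, the 0.40 threshold and all function names are assumptions made for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0


def windowed_gc(seq: str, window: int = 1000, step: int = 1000):
    """Yield (start, gc_fraction) for successive windows along seq."""
    for start in range(0, max(len(seq) - window + 1, 1), step):
        yield start, gc_fraction(seq[start:start + window])


def at_rich_fraction(seq: str, threshold: float = 0.40, window: int = 1000) -> float:
    """Share of non-overlapping windows whose GC fraction is below threshold."""
    values = [gc for _, gc in windowed_gc(seq, window, window)]
    return sum(gc < threshold for gc in values) / len(values)
```

A histogram of the windowed values would show two peaks for a genome that is partitioned into GC-balanced and AT-rich segments.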
For detection and assessment of transposable element (TE) and repeat sequences, we used the suite of detection and classification programs in the PiRATE-Galaxy pipeline virtual machine as described by Berthelier et al. (2018). PiRATE-Galaxy uses the similarity-based detection programs RepeatMasker (Smit et al. 2013) and TE-HMMER (Berthelier et al. 2018), a custom program based on HMMER (Eddy 1995) and tBLASTn (Altschul et al. 1990); the structure-based programs MITE-Hunter (Han and Wessler 2010), SINE-Finder (Wenke et al. 2011), Helsearch (Yang and Bennetzen 2009) and LTRharvest (Ellinghaus et al. 2008); and the repetitiveness-based programs TEdenovo (Flutre et al. 2011) and RepeatScout (Price et al. 2005). CD-HIT-EST (Li and Godzik 2006) in PiRATE-Galaxy was used to reduce redundancy in the combined TE and repeat sequence dataset by removing duplicated sequences with 100% identity to other, longer sequences in the dataset. Sequences shorter than 500 nucleotides were removed. Repeat and TE sequences were classified using the program PASTEC (Hoede et al. 2014), with nucleotide, protein and profile HMM databanks from the PiRATE-Galaxy server (Berthelier et al. 2018).
Calculating coverage of ArME14 by the previous ArD2 assembly was performed with NUCMER (Kurtz et al. 2004) using the maxmatch argument and plotted using ggplot (Wickham 2016). Prediction of secreted proteins from the A. rabiei ArME14 set of annotated protein sequences was accomplished using SignalP version 5.0 (Almagro Armenteros et al. 2019), and DeepSig (Savojardo et al. 2018). From the set of annotated proteins, we applied effector selection criteria, including mature polypeptide molecular weight less than 25 KDa, number of cysteines, presence of a secretion signal and EffectorP 2.0 score greater than 0.8 (Sperschneider et al. 2016, 2018), to predict putative effector proteins using a custom pipeline written in Python [Johannes Debler. (2019, November 4). JWDebler/effector_selection: First working release (Version v1.0). Zenodo. http://doi.org/10.5281/zenodo.3526820]. Where there was disagreement between SignalP and DeepSig on the signal peptide processing site, the custom pipeline chose the site determined by SignalP.
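The selection criteria described above can be expressed as a simple filter. The sketch below is not the published effector_selection pipeline. The record fields and function name are hypothetical, and the cysteine criterion is left out because no cut-off for it is stated in the text.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Per-protein summary; every field name here is hypothetical."""
    protein_id: str
    has_signal_peptide: bool   # e.g. a positive secretion signal prediction
    mature_mw_kda: float       # molecular weight after signal peptide removal
    effectorp_score: float     # EffectorP 2.0 score


def select_effectors(candidates, max_mw_kda=25.0, min_score=0.8):
    """Keep secreted proteins smaller than max_mw_kda with score above min_score."""
    return [
        c for c in candidates
        if c.has_signal_peptide
        and c.mature_mw_kda < max_mw_kda
        and c.effectorp_score > min_score
    ]
```

Raising `min_score` or lowering `max_mw_kda` gives progressively more stringent candidate sets from the same input.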
CAZymes were identified from the ArME14 annotated set of proteins using the dbCAN2 web-based meta server (Zhang et al. 2018), implementing HMMER v 3.2.1 (Eddy 1996) with the HMMdb release 8.0, DIAMOND (Buchfink et al. 2015) and Hotpep (Busk et al. 2017). Secondary metabolite clusters were identified using the antiSMASH fungal version v 5.0, web-based prediction server (Medema et al. 2011; Blin et al. 2019). A CIRCOS plot that illustrates all annotated genome features was produced using the CIRCOS software v 0.69-9 (Krzywinski et al. 2009).
Data availability
Illumina and PacBio genome sequencing data for A. rabiei ArME14 described herein, and the reference genome assembly have been deposited in the Sequence Read Archive and NCBI database, under the BioProject accession number PRJNA510692. A. rabiei ArME14 BioSample number is SAMN10613128. Illumina SRA entries are deposited under SRX5179494 and PacBio SRA data under SRX5172972. The GenBank assembly accession number is GCA_004011695.1. Supplemental material available at figshare: https://doi.org/10.25387/g3.11589420
Results and discussion
PacBio SMRT sequencing of ArME14 produced 34 contigs, including one mitochondrial genomic contig, at 166x sequencing depth (Table 1). At 40,927,385 bp, the ArME14 assembly was 18% larger than the 34,658,250 bp ArD2 genome assembly (Verma et al. 2016). Telomeres were identified manually by sequence observation and all were TTAGGG repeats of approximately 100 bp in length. Repeated TTAGGG sequences are reported to characterize the telomere regions in filamentous fungi such as Neurospora crassa (Schechtman 1990), Cladosporium fulvum (Coleman et al. 1993) and Magnaporthe oryzae (Rehmeyer et al. 2006; Farman 2007), and this sequence motif is a conserved feature in ArME14. Of the 33 nuclear contigs, 12 had TTAGGG telomere sequences at both ends (contig sizes 3,373,759 to 1,223,093 bp). Nine others had one telomere (contig sizes 2,532,578 to 1,278,587 bp). Values for L50 and N50 (Table 1) were 9 and 1,812,190 bp, compared with 64 and 154,808 bp, respectively, for ArD2 (Verma et al. 2016) (Figure 1). Akamatsu et al. (2012) used pulsed-field gel electrophoresis to determine chromosome number and size for multiple A. rabiei isolates from 21 countries. The number of chromosomes ranged from 12 to 16, and total genome size estimates ranged from 23 Mb to 34 Mb (Akamatsu et al. 2012). Our whole genome sequencing suggests that A. rabiei ArME14 possesses at least 17 chromosomes and has a substantially larger genome than previously estimated (Akamatsu et al. 2012). The mitochondrial genomic sequence was assembled as a single contig of 74,173 bp (Figure 1) and was identified by homology with other fungal and Dothideomycete mitochondrial genome sequences. PacBio genome sequencing for fungi facilitates the assembly of long contigs by resolving the repetitive and AT-rich regions that characterize these species. For ArME14, we have been able to assemble 12 end-to-end chromosomal contigs, as evidenced by the telomere sequences that terminate these contigs.
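For readers unfamiliar with the contiguity statistics quoted above, the following Python sketch shows how N50 and L50 are conventionally computed from a list of contig lengths. It also includes one possible reading of the "at least 17 chromosomes" estimate, in which each contig with a single telomere can pair with at most one other such contig. That reading is an assumption of this sketch and is not stated in the text, and both function names are illustrative.

```python
import math


def n50_l50(lengths):
    """Return (N50, L50) for a collection of contig lengths.

    N50 is the length of the contig at which the running total of
    length-sorted contigs first reaches half the assembly size;
    L50 is the number of contigs needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


def min_chromosomes(both_telomeres: int, one_telomere: int) -> int:
    """Lower bound on chromosome number from telomere-capped contigs.

    Assumes two one-telomere contigs can at most join into one chromosome.
    """
    return both_telomeres + math.ceil(one_telomere / 2)
```

With the counts reported here, `min_chromosomes(12, 9)` evaluates to 17.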
Recently produced, highly resolved genome assemblies for phytopathogenic fungi include Verticillium dahliae (Faino et al. 2016), Botrytis cinerea (Van Kan et al. 2017), Sclerotinia sclerotiorum (Derbyshire et al. 2017), Pyrenophora teres (Syme et al. 2018) and Pyrenophora tritici-repentis (Moolhuijzen et al. 2018). Each of these used PacBio sequencing, supplemented by optical mapping and in some cases genetic mapping to confirm the assembly, particularly across repetitive DNA within the genomes. For V. dahliae, combining optical mapping with PacBio sequencing reduced the assembly from 119 contigs to 8 contigs (Faino et al. 2015). Although we did not use optical or genetic mapping, the deep sequencing coverage (166x) and Illumina correction give us confidence in the ArME14 genome sequence and in the organization of GC-equilibrated and AT-rich sections in the assembly.
Table 1. Summary assembly and annotation statistics for Illumina sequencing of A. rabiei isolate, ArD2 (Verma et al. 2016) and PacBio SMRT sequencing for ArME14.
Assembly statistics | ArD2 Illumina (Verma et al. 2016) | ArME14 PacBio SMRT |
---|---|---|
Genome size (bp) | 34,658,250 | 40,927,385 |
Total sequenced bases | 100 Gb | ∼6.8 Gb |
Coverage | 178x | 166x (928,353 reads) |
Number of scaffolds/contigs | 338 a | 33 b |
Largest scaffold/contig size (bp) | 1,160,210 a | 3,373,759 b |
L50 | 64 a | 9 b |
N50 (bp) | 154,808 a | 1,812,190 b |
GC (%) | 51.6 | 49.2 |
% Repetitive sequence | 9.9 | 12.6 |
Complete chromosomes | — | 12 |
Annotation statistics | | |
Number of protein coding genes | 10,596 | 11,257 |
Predicted secreted proteins | 758 c (1,111 d) | 1,145 c |
Predicted effectors | 328 c (36 d) | 39 c |
Predicted sec. metabolite clusters | 26 e | 26 |
Predicted no. of CAZymes | 1,727 c (441 f) | 451 c f |
a Scaffolds for ArD2 Illumina assembly (GCA_001630375.1).
b Contigs for ArME14 PacBio SMRT assembly.
c Differences in numbers are likely due largely to different selection criteria.
d Secretome and effector predictions for the ArD2 assembly using the same methods applied to ArME14.
e Unknown prediction method for secondary metabolite clusters.
f CAZyme prediction using the dbCAN2 meta server in this study.
AUGUSTUS (v 3.3) (Stanke et al. 2004, 2006; König et al. 2016) and expressed gene sequence data from a published A. rabiei (isolate P4) transcriptome project (Fondevilla et al. 2015) were used to annotate transcribed gene features of the ArME14 genome assembly. Figure 1 shows GC-balanced, gene-rich regions as thick black bars interspersed between AT-rich, gene-sparse regions. Homology-based alignment of the ArD2 and ArME14 genome assemblies (Figure 2) shows that unique and near-exact matches between the two genomes cover the majority of the ArME14 genome. However, a substantial proportion of the homologous regions are between non-uniquely matching sequences, which are likely to be repetitive regions in both assemblies. Prediction of transposable elements and other repetitive DNA sequences for ArME14 identified that these regions comprise approximately 6.14 Mb or 15% of the genome (Table 2) and this value is roughly equal to the amount of non-unique matches indicated by genome alignment (Figure 2). The most abundant of the transposable and repetitive element types present in the A. rabiei genome were the Class I, long terminal repeats (LTR) with 3.3 Mb (54%), Class II terminal inverted repeats (TIR), with 1.8 Mb (30%) and long interspersed nuclear element (LINE) with 0.45 Mb (7.3%). The ArME14 genome assembly had a lower overall GC content compared with the ArD2 assembly (Table 1). Using OcculterCut (Testa et al. 2016) we found that the content of AT-rich DNA sequence was higher for the ArME14 assembly than for the Illumina ArD2 assembly. Around 20% of the ArME14 genome has a low GC content (between 29% and 37% GC) compared to 9.4% for ArD2. The content of GC-equilibrated regions was 32 Mb for ArME14 and 30 Mb for ArD2. 
Overestimation of the amount of repetitive DNA in the ArME14 genome assembly due to mis-assembly is possible, but the difference in AT-rich, repetitive DNA between ArD2 and ArME14 can be explained by the more complete sequencing and assembly achieved with PacBio sequencing. Distribution of GC content varies widely among the Dothideomycete plant pathogenic fungi (Figure 3), with Parastagonospora nodorum, Pyrenophora tritici-repentis and Zymoseptoria tritici having mostly 50–55% GC content and the canola blackleg disease pathogen, Leptosphaeria maculans, having approximately one third of its genome as AT-rich DNA (Rouxel et al. 2011; Testa et al. 2016). A. rabiei has a similar GC content distribution to the barley pathogen, Pyrenophora teres f. teres (Figure 3 A). Size distributions for both the AT-rich and GC-balanced regions of ArME14 were highly variable, with average sizes of approximately 6,200 and 25,000 bp, respectively (Figure 3 B). Filamentous plant pathogen genomes tend to have substantial proportions of repetitive and AT-rich sequence, and their complement of genes includes a large number that encode secreted proteins. The genome architecture of ArME14 revealed by PacBio sequencing fits the “two-speed genome” model proposed by Dong et al. (2015). The striking feature of this model is that positive selection on genes located near repetitive DNA regions leads to higher rates of evolution in species whose genome architecture fits the model (Dong et al. 2015).
Table 2. Transposable element and repetitive DNA sequences from A. rabiei ArME14.
Class | Type | Number of sequences | % of total sequences | Total nucleotides | % of total nucleotides | Average size |
---|---|---|---|---|---|---|
I | LTR | 780 | 43 | 3,336,780 | 54 | 4,278 |
 | LINE | 177 | 9.7 | 450,418 | 7.3 | 2,545 |
 | LARD | 26 | 1.4 | 194,939 | 3.2 | 7,498 |
 | TRIM | 5 | 0.3 | 5,668 | 0.1 | 1,134 |
 | SINE | 2 | 0.1 | 1,034 | 0.02 | 517 |
II | TIR | 683 | 38 | 1,820,841 | 30 | 2,666 |
 | Helitron | 44 | 2.4 | 171,920 | 2.8 | 3,907 |
 | MITE | 7 | 0.4 | 4,193 | 0.1 | 599 |
SSR | No cat a | 82 | 4.5 | 137,974 | 2.2 | 1,682 |
 | Host gene a | 4 | 0.2 | 8,223 | 0.1 | 2,056 |
 | SSR | 6 | 0.3 | 5,102 | 0.1 | 850 |
Total | | 1,816 | 100 | 6,137,092 | 100 | |
a “No Cat” and “Host gene” are categories assigned by the PiRATE-Galaxy server and describe unclassified (no category) sequences and potential host genes, respectively.
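The derived columns of Table 2 follow from the counts and nucleotide totals by simple arithmetic. The short Python sketch below reproduces a few of them as a consistency check. The helper names are illustrative, and only three rows are included for brevity.

```python
GENOME_SIZE_BP = 40_927_385   # ArME14 assembly length (Table 1)
TOTAL_REPEAT_BP = 6_137_092   # total repeat nucleotides (Table 2)

# (type, number of sequences, total nucleotides) for three rows of Table 2
ROWS = [
    ("LTR", 780, 3_336_780),
    ("TIR", 683, 1_820_841),
    ("LINE", 177, 450_418),
]


def average_size(count: int, total_bp: int) -> int:
    """Mean element length in bp, rounded to the nearest integer."""
    return round(total_bp / count)


def percent(part: float, whole: float, digits: int = 1) -> float:
    """Percentage of whole represented by part."""
    return round(100 * part / whole, digits)


for name, count, total_bp in ROWS:
    print(name, average_size(count, total_bp), percent(total_bp, TOTAL_REPEAT_BP))
print("genome share", percent(TOTAL_REPEAT_BP, GENOME_SIZE_BP))
```

The final line gives the share of the assembly occupied by annotated repeats, which corresponds to the approximately 15% quoted in the text.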
We predicted 11,257 protein coding genes in ArME14, 661 more than for ArD2, and again this discrepancy is likely a result of the different sequencing methods used and differences in the gene model prediction and annotation results. Using tBLASTn (Altschul et al. 1990), we identified 405 annotated ArME14 protein coding genes that were not found in the ArD2 genome assembly and almost all of these were located near contig ends or near annotated transposable element sequences. It is unclear whether the observed differences in the complement of genes between the two isolates is due to lack of sequence data for these regions in ArD2, difficulty in assembling such regions, or due to real deletions or insertions of gene-encoding sequence at these locations. Each of these possibilities can be explained by the AT-rich and repetitive nature of DNA sequence where these missing genes are located. BUSCO analysis indicated a substantial improvement for the sequencing and annotation for A. rabiei with only three missing and 12 fragmented genes for ArME14, compared with 37 missing and 17 fragmented for ArD2 (Supplementary Figure: Figure_S1).
Effector proteins of plant pathogenic fungi are usually predicted based on the presence of a secretion signal, small protein size and a high proportion of cysteine residues (Jones et al. 2018). Our first step in effector protein prediction was therefore to determine the set of secreted proteins. SignalP v 5.0 predicted 1,145 secreted proteins for ArME14. Verma et al. (2016) predicted fewer secreted proteins for ArD2 (758). However, when we applied the same prediction method to ArD2 as to ArME14, we found a similar number (1,111) of secreted proteins, suggesting that the two genomes are highly similar with respect to their complement of secreted proteins. We compared the 1,145 ArME14 secreted proteins with the 1,111 sequences predicted for ArD2 using tBLASTn and found that 22 ArD2 proteins were not present in the ArME14 genome and 29 ArME14 proteins were not present in ArD2 (Supplementary File, File_S2). Table 3 shows the number of putative effector proteins predicted from the total A. rabiei ArME14 proteome using selection criteria of increasing stringency. For ArD2, 328 effectors were predicted, and these made up a large proportion of the secreted proteins that were not Carbohydrate-Active Enzymes (Verma et al. 2016). In contrast, our study used EffectorP v 2.0 (Sperschneider et al. 2016, 2018) as a more specific tool for fungal effector prediction. Using a mature protein size threshold of 25 KDa and an EffectorP score threshold of 0.8, we nominated 39 protein sequences, designated PE01 to PE39, as putative effectors (PE). Full details of the 39 putative effector proteins are presented in the Supplementary File, File_S4. Three of the ArME14 putative effectors were missing from the ArD2 proteome, with only 36 ArD2 putative effectors predicted using the same selection criteria as for ArME14.
A subsequent tBLASTn search of the ArD2 assembly located one of these “missing” proteins, the ArME14 PE22 ortholog, as an un-annotated sequence in the ArD2 assembly. The ArME14 putative effector genes PE34 and PE36 were absent from the ArD2 nucleotide sequence. Both genes are located in highly repetitive regions of sub-telomeric DNA and may have been absent from ArD2 or not assembled correctly in the ArD2 Illumina genome assembly. From the set of ArD2 secreted proteins not found in ArME14, one was a candidate effector with a mature protein molecular weight of 14.7 KDa and an EffectorP 2.0 score of 0.64, although this score was below our EffectorP threshold of 0.8. From the 29 ArME14 secreted proteins not found in ArD2, we predicted seven to be effectors with an EffectorP score greater than 0.6, but only two had EffectorP scores above 0.8. These two proteins were PE34 and PE36 (Supplementary data), as discussed above. In the “two-speed genome” model, genes located close to or within highly repetitive sequence evolve at a higher rate, with greater rates of positive selection (Oliver 2012; Grandaubert et al. 2014; Dong et al. 2015; Raffaele et al. 2015). This evolutionary process is illustrated by seven small secreted protein avirulence effectors of L. maculans, for which the corresponding genes are located in AT-rich regions of the L. maculans genome and display evidence of Repeat-Induced Point mutation (RIP) and positive selection in their sequences (Van de Wouw et al. 2010; Grandaubert et al. 2014). Similar evolutionary processes are likely to have shaped the pathogen-host relationship for A. rabiei and chickpea, and further insights into the molecular mechanisms of pathogenicity in this species will be uncovered through functional analysis of these predicted effector proteins.
Table 3. Summary of potential pathogenicity genome features, including: secondary metabolite clusters, predicted effector genes and CAZyme genes identified from the A. rabiei ArME14 genome assembly. Detailed tables are provided in Supplementary File, File_S4.
Class | Number |
---|---|
Putative effectors | |
EffP > 0.8, MW < 25KDa a | 39 |
EffP > 0.8, MW < 15KDa a | 27 |
EffP > 0.9, MW < 15KDa a | 15 |
Secondary metabolite clusters | 26 |
T1PKS | 7 |
T3PKS | 1 |
NRPS | 2 |
NRPS-like | 7 |
NRPS/NRPS-like – T1PKS | 4 |
Indole | 1 |
Terpene | 4 |
CAZymes | 451 |
AA - Auxiliary activities | 77 |
CBM - Carbohydrate-binding module | 3 |
CE - Carbohydrate esterase | 31 |
GH - Glycoside hydrolase | 227 |
GT - Glycosyl transferase | 82 |
PL - Polysaccharide lyase | 31 |
a Mature protein MW.
The secondary metabolite cluster prediction tool, antiSMASH (Medema et al. 2011; Blin et al. 2019) predicted 26 clusters in both ArD2 and ArME14 (Table 3). Verma et al. (2016) similarly predicted 26 clusters for ArD2. The antiSMASH-predicted clusters in ArME14 matched clusters from the ArD2 genome assembly in almost all cases, with some BLAST hits spread across multiple ArD2 scaffolds. Notably, the NRPS/T1PKS cluster 10-1 on ArME14 contig 10 has a polyketide synthase gene (g5897) that was absent from the ArD2 assembly although other ortholog genes for the cluster were present. Details of the ArME14 secondary metabolite clusters are presented in the Supplementary File, File_S4. Predicted clusters were homologous to characterized clusters designated for the biosynthesis of known fungal secondary metabolites including: cluster 16.2, melanin (Akamatsu et al. 2010), cluster 3.1, mellein (Chooi et al. 2015), and cluster 7.3, solanapyrone (Kim et al. 2015a,b). There were a further six clusters with characterized secondary metabolite homologs with proposed roles in fungal physiology or reproduction and 17 other gene clusters putatively producing molecules with unknown structures and functions. It is likely that some of these gene clusters will have a role in producing novel molecules required for virulence and host specificity. Of the 26 secondary metabolite clusters, eight were located within sub-telomeric regions of the ArME14 assembly and two were bounded by highly repetitive regions populated by transposable elements. Similar to the predicted effectors, the presence of secondary metabolite clusters in repeat-rich regions of the genome confers mobility between species and rapid adaptation through processes such as Repeat-Induced Point mutation (RIP) (Hane and Oliver 2008; Fudal et al. 2009; Rouxel et al. 2011; Testa et al. 2016; Seidl and Thomma 2017). 
The features of the ArME14 genome are consistent with repetitive genome structure having played a role in the evolution and host adaptation of A. rabiei.
Carbohydrate-Active Enzymes (CAZymes) are a key feature of all fungi, and in plant pathogens these enzymes are essential for degrading host plant polysaccharides during penetration, colonization and nutrient acquisition from host tissues. Our CAZyme predictions for ArME14 produced 451 CAZyme sequences (Table 3), which is substantially fewer than the published number of 1,727 for ArD2 (Verma et al. 2016). Applied to ArD2, our search method identified only 441 CAZymes. A total of 650 CAZymes were predicted by at least one of the dbCAN2 tools, so the A. rabiei ArME14 genome has at least 450 and possibly up to 650 CAZymes. These estimates are similar to those reported for other plant pathogenic fungi, including other Dothideomycetes (Zhao et al. 2013). The main distinction in CAZyme complement among fungi is that necrotrophic fungal pathogens generally have a greater number (approx. 400-850) than biotrophs (approx. 170-320) (Zhao et al. 2013). Fungal pathogens of dicots generally have a set of CAZymes adapted to the types of carbohydrates found in dicot cell walls. Zhao et al. (2013) report that dicot pathogens generally have more polysaccharide lyases that degrade pectate and pectin (classes PL1 and PL3), which are more abundant in the cell walls of dicots than of monocots. In A. rabiei ArME14 there were nine PL1 CAZymes, which is similar to the average number reported for dicot pathogens and considerably greater than the average of three PL1 enzymes for monocot pathogens (Zhao et al. 2013). In addition, A. rabiei ArME14 contained 12 pectin-degrading polygalacturonases from family GH28, where the averages for dicot and monocot pathogens are 13 and 5, respectively (Zhao et al. 2013).
Figure 4 summarizes the genome structure and the locations of transposable and repetitive elements, putative effectors, CAZymes and secondary metabolite clusters in a CIRCOS plot. A plot of percent GC content in the CIRCOS format emphasizes partitioning of the genome into AT-rich, gene-sparse and GC-rich, gene-dense sections. Notably, 62% of the 39 predicted effector genes were located within 50 kb of repetitive regions and 23% were between 50 and 100 kb from the nearest repeat-rich region. Genome features are provided in a Supplementary genome feature file (Supplementary Data, File_S3).
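The proximity figures above depend on measuring the gap between each effector gene and its nearest repeat-rich region. A minimal Python sketch of that calculation follows. It assumes simple (start, end) coordinates on a single contig, and the function names and bin labels are illustrative rather than taken from the analysis itself.

```python
def distance_to_nearest(gene, repeats):
    """Smallest gap in bp between a gene interval and any repeat interval.

    Intervals are (start, end) tuples on the same contig; overlap gives 0.
    """
    g_start, g_end = gene
    return min(
        max(0, r_start - g_end, g_start - r_end)
        for r_start, r_end in repeats
    )


def bin_by_distance(genes, repeats, near=50_000, far=100_000):
    """Count genes within `near`, between `near` and `far`, and beyond `far`."""
    counts = {"within_50kb": 0, "50_to_100kb": 0, "beyond_100kb": 0}
    for gene in genes:
        d = distance_to_nearest(gene, repeats)
        if d < near:
            counts["within_50kb"] += 1
        elif d <= far:
            counts["50_to_100kb"] += 1
        else:
            counts["beyond_100kb"] += 1
    return counts
```

Dividing each count by the number of genes gives proportions of the kind reported for the 39 predicted effectors.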
Transcription factors are an important feature of the A. rabiei genome. Of the 381 identified in the ArD2 genome assembly, three were found using tBLASTn searches to be absent from ArME14 (KZM27601.1, KZM27726.1 and KZM27745.1). Functional annotation of predicted ArME14 proteins using InterProScan identified 126 proteins described as transcription factors in addition to the 378 ArD2 transcription factor orthologs. Most of these were present in the ArD2 assembly but were either not annotated as transcription factors or not annotated as protein-encoding genes. We found three putative transcription factor genes in ArME14 for which there was no homologous DNA sequence in the ArD2 assembly. These were ArME14 g29, g427 and g4943, each described as containing fungal transcription factor domains. Putative transcription factor sequences from the comparisons between A. rabiei ArD2 and ArME14 are provided in the Supplementary Material, File_S2.
Understanding the mechanisms of virulence of plant pathogens is critical to the effective control of plant disease in crop production. Filamentous fungi of the order Pleosporales, including P. nodorum, P. tritici-repentis, Cochliobolus heterostrophus and P. teres f. teres among others, share many overarching features that govern their primary functions as plant pathogens. Notwithstanding the similarities in genome structure and function among these species, there are also differences in virulence genes and effectors that determine host specialization in plant pathogens. Furthermore, regulation of gene expression is critical to the production of virulence factors and to the interaction of pathogen and plant host (Verma et al. 2017). The publication of a near-complete, high-fidelity genome assembly for A. rabiei complements the previously published genome assembly (Verma et al. 2016) and provides the basis for further work in the field of chickpea ascochyta blight research.
Acknowledgments
This work was funded by the Australian Grains Research and Development Corporation (GRDC) research grants UMU00021 and UMU00022 at Murdoch University, and CUR00014 and CUR00023 at Curtin University. RMS acknowledges the Malaysian Ministry of Higher Education and Universiti Malaysia Terengganu for providing a scholarship. This work was supported by resources from the Pawsey Supercomputing Centre, Kensington, Western Australia and the National Computational Infrastructure (NCI) funded by the Australian Government. The authors gratefully acknowledge the contributions of Judith Lichtenzveig to the early conception and establishment of research projects CUR00014 and the pulse pathogen program of CUR00023, isolate selection and initial genome sequencing. Robert Syme is acknowledged for producing the PacBio genome assembly, genome annotation and NUCMER comparison.
Footnotes
Supplemental material available at figshare: https://doi.org/10.25387/g3.11589420.
Communicating editor: S. Smith
Literature Cited
- Akamatsu H. O., Chilvers M. I., Kaiser W. J., and Peever T. L., 2012. Karyotype polymorphism and chromosomal rearrangement in populations of the phytopathogenic fungus, Ascochyta rabiei. Fungal Biol. 116: 1119–1133. https://doi.org/10.1016/j.funbio.2012.07.001
- Akamatsu H. O., Chilvers M. I., Stewart J. E., and Peever T. L., 2010. Identification and function of a polyketide synthase gene responsible for 1,8-dihydroxynaphthalene-melanin pigment biosynthesis in Ascochyta rabiei. Curr. Genet. 56: 349–360. https://doi.org/10.1007/s00294-010-0306-2
- Alam S. S., Bilton J. N., Slawin A. M. Z., Williams D. J., Sheppard R. N. et al., 1989. Chickpea blight: Production of the phytotoxins solanapyrones A and C by Ascochyta rabiei. Phytochemistry 28: 2627–2630. https://doi.org/10.1016/S0031-9422(00)98054-3
- Almagro Armenteros J. J., Tsirigos K. D., Sønderby C. K., Petersen T. N., Winther O. et al., 2019. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat. Biotechnol. 37: 420–423. https://doi.org/10.1038/s41587-019-0036-z
- Altschul S. F., Gish W., Miller W., Myers E. W., and Lipman D. J., 1990. Basic Local Alignment Search Tool. J. Mol. Biol. 215: 403–410. https://doi.org/10.1016/S0022-2836(05)80360-2
- Armstrong C. L., Chongo G., Gossen B. D., and Duczek L. J., 2001. Mating type distribution and incidence of the teleomorph of Ascochyta rabiei (Didymella rabiei) in Canada. Can. J. Plant Pathol. 23: 110–113. https://doi.org/10.1080/07060660109506917
- Aveskamp M. M., de Gruyter J., Woudenberg J. H. C., Verkley G. J. M., and Crous P. W., 2010. Highlights of the Didymellaceae: A polyphasic approach to characterise Phoma and related pleosporalean genera. Stud. Mycol. 65: 1–60. https://doi.org/10.3114/sim.2010.65.01
- Berlin K., Koren S., Chin C. S., Drake J. P., Landolin J. M. et al., 2015. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat. Biotechnol. 33: 623–630. https://doi.org/10.1038/nbt.3238
- Berthelier J., Casse N., Daccord N., Jamilloux V., Saint-Jean B. et al., 2018. A transposable element annotation pipeline and expression analysis reveal potentially active elements in the microalga Tisochrysis lutea. BMC Genomics 19: 378. https://doi.org/10.1186/s12864-018-4763-1
- Blin K., Shaw S., Steinke K., Villebro R., Ziemert N. et al., 2019. antiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucleic Acids Res. 47: W81–W87. https://doi.org/10.1093/nar/gkz310
- Buchfink B., Xie C., and Huson D. H., 2015. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12: 59–60. https://doi.org/10.1038/nmeth.3176
- Busk P. K., Pilgaard B., Lezyk M. J., Meyer A. S., and Lange L., 2017. Homology to peptide pattern for annotation of carbohydrate-active enzymes and prediction of function. BMC Bioinformatics 18: 214. https://doi.org/10.1186/s12859-017-1625-9
- Chooi Y., Krill C., Barrow R. A., Chen S., Trengove R. et al., 2015. An In Planta-Expressed Polyketide Synthase Produces (R)-Mellein in the Wheat Pathogen Parastagonospora nodorum. Applied and Environmental Microbiology 81: 177–186.
- Coleman M. J., McHale M. T., Arnau J., Watson A., and Oliver R. P., 1993. Cloning and characterisation of telomeric DNA from Cladosporium fulvum. Gene 132: 67–73. https://doi.org/10.1016/0378-1119(93)90515-5
- Derbyshire M., Denton-Giles M., Hegedus D., Seifbarghi S., Rollins J. et al., 2017. The complete genome sequence of the phytopathogenic fungus Sclerotinia sclerotiorum reveals insights into the genome architecture of broad host range pathogens. Genome Biol. Evol. 9: 593–618. https://doi.org/10.1093/gbe/evx030
- Dong S., Raffaele S., and Kamoun S., 2015. The two-speed genomes of filamentous pathogens: Waltz with plants. Curr. Opin. Genet. Dev. 35: 57–65. https://doi.org/10.1016/j.gde.2015.09.001
- Eddy S. R., 1996. Hidden Markov models. Curr. Opin. Struct. Biol. 6: 361–365. https://doi.org/10.1016/S0959-440X(96)80056-X
- Eddy S. R., 1995. Multiple alignment using hidden Markov models. Proc. Int. Conf. Intell. Syst. Mol. Biol. 3: 114–120.
- Ellinghaus D., Kurtz S., and Willhoeft U., 2008. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9: 18. https://doi.org/10.1186/1471-2105-9-18
- Faino L., Seidl M. F., Datema E., van den Berg G. C. M., Janssen A. et al., 2015. Single-Molecule Real-Time Sequencing Combined with Optical Mapping Yields Completely Finished Fungal Genome. MBio 6. https://doi.org/10.1128/mBio.00936-15
- Faino L., Seidl M. F., Shi-kunne X., Pauper M., van den Berg G. C. M. et al., 2016. Transposons passively and actively contribute to evolution of the two-speed genome of a fungal pathogen. Genome Res. 26: 1091–1100. https://doi.org/10.1101/gr.204974.116
- Farman M. L., 2007. Telomeres in the rice blast fungus Magnaporthe oryzae: The world of the end as we know it. FEMS Microbiol. Lett. 273: 125–132. https://doi.org/10.1111/j.1574-6968.2007.00812.x
- Flutre T., Duprat E., Feuillet C., and Quesneville H., 2011. Considering transposable element diversification in de novo annotation approaches. PLoS One 6: e16526. https://doi.org/10.1371/journal.pone.0016526
- Fondevilla S., Krezdorn N., Rotter B., Kahl G., and Winter P., 2015. In planta Identification of Putative Pathogenicity Factors from the Chickpea Pathogen Ascochyta rabiei by De novo Transcriptome Sequencing Using RNA-Seq and Massive Analysis of cDNA Ends. Front. Microbiol. 6: 1–15. https://doi.org/10.3389/fmicb.2015.01329
- Fudal I., Ross S., Brun H., Besnard A.-L., Ermel M. et al., 2009. Repeat-Induced Point mutation (RIP) as an alternative mechanism of evolution toward virulence in Leptosphaeria maculans. Mol. Plant Microbe Interact. 22: 932–941. https://doi.org/10.1094/MPMI-22-8-0932
- Grandaubert J., Lowe R. G. T., Soyer J. L., Schoch C. L., Fudal I. et al., 2014. Transposable Element-assisted evolution and adaptation to host plant within the Leptosphaeria maculans-Leptosphaeria biglobosa species complex of fungal pathogens. BMC Genomics 15: 891. https://doi.org/10.1186/1471-2164-15-891
- de Gruyter J., Aveskamp M. M., Woudenberg J. H. C., Verkley G. J. M., Groenewald J. Z. et al., 2009. Molecular phylogeny of Phoma and allied anamorph genera: Towards a reclassification of the Phoma complex. Mycol. Res. 113: 508–519. https://doi.org/10.1016/j.mycres.2009.01.002
- Gurevich A., Saveliev V., Vyahhi N., and Tesler G., 2013. QUAST: Quality assessment tool for genome assemblies. Bioinformatics 29: 1072–1075. https://doi.org/10.1093/bioinformatics/btt086
- Hamid K., and Strange R. N., 2000. Phytotoxicity of solanapyrones A and B produced by the chickpea pathogen Ascochyta rabiei (Pass.) Labr. and the apparent metabolism of solanapyrone A by chickpea tissues. Physiol. Mol. Plant Pathol. 56: 235–244. https://doi.org/10.1006/pmpp.2000.0272
- Han Y., and Wessler S. R., 2010. MITE-Hunter: a program for discovering miniature inverted-repeat transposable elements from genomic sequences. Nucleic Acids Res. 38: e199. https://doi.org/10.1093/nar/gkq862
- Hane J. K., and Oliver R. P., 2008. RIPCAL: A tool for alignment-based analysis of repeat-induced point mutations in fungal genomic sequences. BMC Bioinformatics 9: 478. https://doi.org/10.1186/1471-2105-9-478
- Hoede C., Arnoux S., Moisset M., Chaumier T., Inizan O. et al., 2014. PASTEC: An automatic transposable element classification tool. PLoS One 9: e91929. https://doi.org/10.1371/journal.pone.0091929
- Jones D. A. B., Bertazzoni S., Turo C. J., Syme R. A., and Hane J. K., 2018. Bioinformatic prediction of plant-pathogenicity effector proteins of fungi. Curr. Opin. Microbiol. 46: 43–49. https://doi.org/10.1016/j.mib.2018.01.017
- de Jonge R., van Esse H. P., Kombrink A., Shinya T., Desaki Y. et al., 2010. Conserved fungal LysM effector ECP6 prevents chitin-triggered immunity in plants. Science 329: 953–955. https://doi.org/10.1126/science.1190859
- Van Kan J. A. L., Stassen J. H. M., Mosbach A., van der Lee T. A. J., Faino L. et al., 2017. A gapless genome sequence of the fungus Botrytis cinerea. Mol. Plant Pathol. 18: 75–89. https://doi.org/10.1111/mpp.12384
- Kim W., Park J. J., Gang D. R., Peever T. L., and Chen W., 2015a. A novel type pathway-specific regulator and dynamic genome environments of a solanapyrone biosynthesis gene cluster in the fungus Ascochyta rabiei. Eukaryot. Cell 14: 1102–1113. https://doi.org/10.1128/EC.00084-15
- Kim W., Park C., Park J., Akamatsu H. O., Peever T. L. et al., 2015b. Functional Analyses of the Diels-Alderase Gene sol5 of Ascochyta rabiei and Alternaria solani Indicate that the Solanapyrone Phytotoxins Are Not Required for Pathogenicity. Mol. Plant Microbe Interact. 28: 482–496. https://doi.org/10.1094/MPMI-08-14-0234-R
- König S., Romoth L. W., Gerischer L., and Stanke M., 2016. Simultaneous gene finding in multiple genomes. Bioinformatics 32: 3388–3395.
- Krzywinski M., Schein J., Birol I., Connors J., Gascoyne R. et al., 2009. Circos: An information aesthetic for comparative genomics. Genome Res. 19: 1639–1645. https://doi.org/10.1101/gr.092759.109
- Kurtz S., Phillippy A., Delcher A. L., Smoot M., Shumway M. et al., 2004. Versatile and open software for comparing large genomes. Genome Biol. 5: R12.
- Leo A. E., Ford R., and Linde C. C., 2015. Genetic homogeneity of a recently introduced pathogen of chickpea, Ascochyta rabiei, to Australia. Biol. Invasions 17: 609–623. https://doi.org/10.1007/s10530-014-0752-8
- Li W., and Godzik A., 2006. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22: 1658–1659. https://doi.org/10.1093/bioinformatics/btl158
- Lichtenzveig J., Gamliel E., Frenkel O., Michaelido S., Abbo S. et al., 2005. Distribution of mating types and diversity in virulence of Didymella rabiei in Israel. Eur. J. Plant Pathol. 113: 15–24. https://doi.org/10.1007/s10658-005-8914-2
- Lombard V., Ramulu H. G., Drula E., Coutinho P. M., and Henrissat B., 2014. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. 42: D490–D495. https://doi.org/10.1093/nar/gkt1178
- Medema M. H., Blin K., Cimermancic P., de Jager V., Zakrzewski P. et al., 2011. AntiSMASH: Rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic Acids Res. 39: W339–W346. https://doi.org/10.1093/nar/gkr466
- Mehmood Y., Sambasivam P., Kaur S., Davidson J., Leo A. E. et al., 2017. Evidence and Consequence of a Highly Adapted Clonal Haplotype within the Australian Ascochyta rabiei Population. Front. Plant Sci. 8: 1029. https://doi.org/10.3389/fpls.2017.01029
- Moolhuijzen P., See P. T., Hane J. K., Shi G., Liu Z. et al., 2018. Comparative genomics of the wheat fungal pathogen Pyrenophora tritici-repentis reveals chromosomal variations and genome plasticity. BMC Genomics 19: 1–23.
- Murray G. M., and Brennan J. P., 2012. The current and potential costs from diseases of pulse crops in Australia, Grains Research and Development Corporation, Canberra, Australia.
- Oliver R., 2012. Genomic tillage and the harvest of fungal phytopathogens. New Phytol. 196: 1015–1023. https://doi.org/10.1111/j.1469-8137.2012.04330.x
- Peever T. L., Salimath S. S., Su G., Kaiser W. J., and Muehlbauer F. J., 2004. Historical and contemporary multilocus population structure of Ascochyta rabiei (teleomorph: Didymella rabiei) in the Pacific Northwest of the United States. Mol. Ecol. 13: 291–309. https://doi.org/10.1046/j.1365-294X.2003.02059.x
- Lo Presti L., Lanver D., Schweizer G., Tanaka S., Liang L. et al., 2015. Fungal Effectors and Plant Susceptibility. Annu. Rev. Plant Biol. 66: 513–545. https://doi.org/10.1146/annurev-arplant-043014-114623
- Price A. L., Jones N. C., and Pevzner P. A., 2005. De novo identification of repeat families in large genomes. Bioinformatics 21: i351–i358. https://doi.org/10.1093/bioinformatics/bti1018
- Raffaele S., Fairer R. A., Cano L. M., Studholme D. J., Thines M. et al., 2015. Genome Evolution Following Host Jumps in the Irish Potato Famine. Genome Biol. Evol. 330: 1540–1543.
- Rehmeyer C., Li W., Kusaba M., Kim Y.-S., Brown D. et al., 2006. Organization of chromosome ends in the rice blast fungus, Magnaporthe oryzae. Nucleic Acids Res. 34: 4685–4701. https://doi.org/10.1093/nar/gkl588
- Rouxel T., Grandaubert J., Hane J. K., Hoede C., van de Wouw A. P. et al., 2011. Effector diversification within compartments of the Leptosphaeria maculans genome affected by Repeat-Induced Point mutations. Nat. Commun. 2: 202. https://doi.org/10.1038/ncomms1189
- Savojardo C., Martelli P. L., Fariselli P., and Casadio R., 2018. DeepSig: Deep learning improves signal peptide detection in proteins. Bioinformatics 34: 1690–1696. https://doi.org/10.1093/bioinformatics/btx818
- Schechtman M. G., 1990. Characterization of telomere DNA from Neurospora crassa. Gene 88: 159–165. https://doi.org/10.1016/0378-1119(90)90027-O
- Seidl M. F., and Thomma B. P. H. J., 2017. Transposable Elements Direct The Coevolution between Plants and Microbes. Trends Genet. 33: 842–851. https://doi.org/10.1016/j.tig.2017.07.003
- Simão F. A., Waterhouse R. M., Ioannidis P., Kriventseva E. V., and Zdobnov E. M., 2015. BUSCO: Assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31: 3210–3212. https://doi.org/10.1093/bioinformatics/btv351
- Smit A. F., Hubley R., and Green P., 2013. RepeatMasker Open-4.0. http://www.repeatmasker.org.
- Sperschneider J., Dodds P. N., Gardiner D. M., Singh K. B., and Taylor J. M., 2018. Improved prediction of fungal effector proteins from secretomes with EffectorP 2.0. Mol. Plant Pathol. 19: 2094–2110. https://doi.org/10.1111/mpp.12682
- Sperschneider J., Gardiner D. M., Dodds P. N., Tini F., Covarelli L. et al., 2016. EffectorP: Predicting fungal effector proteins from secretomes using machine learning. New Phytol. 210: 743–761. https://doi.org/10.1111/nph.13794
- Stanke M., Schöffmann O., Morgenstern B., and Waack S., 2006. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics 7: 62. https://doi.org/10.1186/1471-2105-7-62
- Stanke M., Steinkamp R., Waack S., and Morgenstern B., 2004. AUGUSTUS: A web server for gene finding in eukaryotes. Nucleic Acids Res. 32: W309–W312. https://doi.org/10.1093/nar/gkh379
- Syme R. A., Martin A., Wyatt N. A., Lawrence J. A., Muria-Gonzalez M. J. et al., 2018. Transposable element genomic fissuring in Pyrenophora teres is associated with genome expansion and dynamics of host-pathogen genetic interactions. Front. Genet. 9: 130. https://doi.org/10.3389/fgene.2018.00130
- Tan K., and Oliver R. P., 2017. Regulation of proteinaceous effector expression in phytopathogenic fungi. PLoS Pathog. 13: e1006241. https://doi.org/10.1371/journal.ppat.1006241
- Testa A. C., Oliver R. P., and Hane J. K., 2016. OcculterCut: A comprehensive survey of AT-rich regions in fungal genomes. Genome Biol. Evol. 8: 2044–2064. https://doi.org/10.1093/gbe/evw121
- Di Tommaso P., Chatzou M., Floden E. W., Barja P. P., Palumbo E. et al., 2017. Nextflow enables reproducible computational workflows. Nat. Biotechnol. 35: 316–319. https://doi.org/10.1038/nbt.3820
- Trapero-Casas A., and Kaiser W. J., 1992. Development of Didymella rabiei, the teleomorph of Ascochyta rabiei, on chickpea straw. Phytopathology 82: 1261–1266. https://doi.org/10.1094/Phyto-82-1261
- Verma S., Gazara R. K., Nizam S., Parween S., Chattopadhyay D. et al., 2016. Draft genome sequencing and secretome analysis of fungal phytopathogen Ascochyta rabiei provides insight into the necrotrophic effector repertoire. Sci. Rep. 6: 24638. https://doi.org/10.1038/srep24638
- Verma S., Gazara R. K., and Verma P. K., 2017. Transcription Factor Repertoire of Necrotrophic Fungal Phytopathogen Ascochyta rabiei: Predominance of MYB Transcription Factors As Potential Regulators of Secretome. Front. Plant Sci. 8: 1037. https://doi.org/10.3389/fpls.2017.01037
- Walker B. J., Abeel T., Shea T., Priest M., Abouelliel A. et al., 2014. Pilon: An integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9: e112963. https://doi.org/10.1371/journal.pone.0112963
- Wenke T., Dobel T., Sorensen T. R., Junghans H., Weisshaar B. et al., 2011. Targeted identification of short interspersed nuclear element families shows their widespread existence and extreme heterogeneity in plant genomes. Plant Cell 23: 3117–3128. https://doi.org/10.1105/tpc.111.088682
- Wickham H., 2016. ggplot2: Elegant Graphics for Data Analysis. Springer, New York.
- Winnenburg R., Baldwin T., Urban M., Rawlings C., Kohler J. et al., 2006. PHI-base: a new database for pathogen host interactions. Nucleic Acids Res. 34: D459–D464. https://doi.org/10.1093/nar/gkj047
- Van de Wouw A. P., Cozijnsen A. J., Hane J. K., Brunner P. C., McDonald B. A. et al., 2010. Evolution of linked avirulence effectors in Leptosphaeria maculans is affected by genomic environment and exposure to resistance genes in host plants. PLoS Pathog. 6: e1001180. https://doi.org/10.1371/journal.ppat.1001180
- Xin Z., and Chen J., 2012. A high throughput DNA extraction method with high yield and quality. Plant Methods 8: 26. https://doi.org/10.1186/1746-4811-8-26
- Yang L., and Bennetzen J. L., 2009. Structure-based discovery and description of plant and animal Helitrons. Proc. Natl. Acad. Sci. USA 106: 12832–12837. https://doi.org/10.1073/pnas.0905563106
- Zhang H., Yohe T., Huang L., Entwistle S., Wu P. et al., 2018. DbCAN2: A meta server for automated carbohydrate-active enzyme annotation. Nucleic Acids Res. 46: W95–W101. https://doi.org/10.1093/nar/gky418
- Zhao Z., Liu H., Wang C., and Xu J. R., 2013. Comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi. BMC Genomics 14: 274. https://doi.org/10.1186/1471-2164-14-274
Data Availability Statement
Illumina and PacBio genome sequencing data for A. rabiei ArME14 described herein, and the reference genome assembly have been deposited in the Sequence Read Archive and NCBI database, under the BioProject accession number PRJNA510692. A. rabiei ArME14 BioSample number is SAMN10613128. Illumina SRA entries are deposited under SRX5179494 and PacBio SRA data under SRX5172972. The GenBank assembly accession number is GCA_004011695.1. Supplemental material available at figshare: https://doi.org/10.25387/g3.11589420