PLOS Genetics. 2024 Mar 18;20(3):e1011207. doi: 10.1371/journal.pgen.1011207

Genome biology and evolution of mating-type loci in four cereal rust fungi

Zhenyan Luo 1, Alistair McTaggart 2, Benjamin Schwessinger 1,*
Editor: Tatiana Giraud3
PMCID: PMC10977897  PMID: 38498573

Abstract

Permanent heterozygous loci, such as sex- or mating-compatibility regions, often display suppression of recombination and signals of genomic degeneration. In Basidiomycota, two distinct loci confer mating compatibility. These loci encode homeodomain (HD) transcription factors and pheromone receptor (Pra)-ligand allele pairs. To date, an analysis of genome level mating-type (MAT) loci is lacking for obligate biotrophic basidiomycetes in the Pucciniales, an order containing serious agricultural plant pathogens. Here, we focus on four species of Puccinia that infect oat and wheat, including P. coronata f. sp. avenae, P. graminis f. sp. tritici, P. triticina and P. striiformis f. sp. tritici. MAT loci are located on two separate chromosomes supporting previous hypotheses of a tetrapolar mating compatibility system in the Pucciniales. The HD genes are multiallelic in all four species while the PR locus appears biallelic, except for P. graminis f. sp. tritici, which potentially has multiple alleles. HD loci are largely conserved in their macrosynteny, both within and between species, without strong signals of recombination suppression. Regions proximal to the PR locus, however, displayed signs of recombination suppression and genomic degeneration in the three species with a biallelic PR locus. Our observations support a link between recombination suppression, genomic degeneration, and allele diversity of MAT loci that is consistent with recent mathematical modelling and simulations. Finally, we confirm that MAT genes are expressed during the asexual infection cycle, and we propose that this may support regulating nuclear maintenance and pairing during infection and spore formation. Our study provides insights into the evolution of MAT loci of key pathogenic Puccinia species. 
Understanding mating compatibility can help predict possible combinations of nuclear pairs, generated by sexual reproduction or somatic recombination, and the potential evolution of new virulent isolates of these important plant pathogens.

Author summary

Sexes in animals and some plants are determined by sex chromosomes. In fungi, mate compatibility is determined by mating-type (MAT) loci, which share some features with sex chromosomes including recombination suppression around heterozygous loci. Here, we study the MAT loci in fungal pathogens from the order Pucciniales, which cause rust diseases on many economically important plants, including wheat and oats. We show that one of the MAT loci is multiallelic, while the other is biallelic in most cases. The biallelic locus shows strong signs of recombination suppression and genetic deterioration with an increase in number of transposable elements and gene deserts surrounding the locus. Our findings on the genome biology of MAT loci in four economically important pathogens will improve predictions on potential novel virulent isolates that can lead to large scale pandemics in agriculture.

Introduction

The evolutionary origin of sex and mating-type determining chromosomes or loci is a fundamental question in biology. It is widely accepted that X and Y chromosomes were originally homologous, but recombination suppression caused gradual genetic degeneration of the Y chromosome [1]. Footprints of genetic degeneration include increased rates of (non-)synonymous substitutions (dN, dS), accumulation of transposable elements (TEs), accumulation of inversions, reduced gene expression and/or reduced gene numbers, all of which are consequences of recombination cessation [2–5].

In contrast to many animals and plants, antagonistic selection is unlikely to be the evolutionary mechanism that causes recombination cessation in fungal mating-type (MAT) loci [6–8]. Instead, mathematical modelling and stochastic simulations suggest that non-recombining DNA fragments, caused for example by inversions, can be fixed solely due to the presence of deleterious mutations in genomes [2]. Non-recombining fragments are beneficial if they carry fewer deleterious mutations than average and can increase in frequency. However, when becoming frequent, their recessive deleterious mutations will be exposed, selected against and thereby prevent fixation. Yet, they can be fixed at permanently heterozygous loci due to their sheltering effect for deleterious mutations [2].

Initial fixation of recombination suppressors, such as inversions, can lead to progressive extension of non-recombining DNA fragments by an accumulation of additional inversions over longer evolutionary timeframes [2]. Importantly, this model predicts that non-recombining DNA fragments around MAT loci are larger at biallelic loci and in species with smaller effective population sizes, short haploid phases, outcrossing mating systems, high mutation rates, and extended dikaryotic life stages [2]. Indeed, several fungal species carry extensive non-recombining DNA fragments around MAT loci. Recombination suppression can range from hundreds of kbp (kilo base pairs) to several Mbp (mega base pairs) and lead to genetic degeneration [5,9], such as in Neurospora tetrasperma [10], Podospora anserina [11], Schizothecium tetrasporum [12], Agaricus bisporus [13], Ustilago hordei [14], several species of Microbotryum [8,15] and Cryptococcus spp. [16,17].

Basidiomycota have evolved unique mating systems to govern nuclear compatibility, mate selection, and life cycles [9]. From a genetic perspective, two MAT loci determine mating-type identity and non-self-recognition in most Basidiomycota. The pheromone receptor (PR) locus contains a pheromone receptor gene (Pra) and at least one pheromone peptide precursor gene (mfa) [18]. The PRA protein is a transmembrane localized G protein-coupled receptor, which recognizes the processed and post-translationally modified mature pheromone peptide MFA encoded by the compatible PR allele [9]. The PR locus defines pre-mating compatibility and gamete fusion [3]. Downstream of initial gamete fusion, the homeodomain (HD) locus determines success in post-mating development [9]. The HD locus contains two tightly linked homeodomain transcription factor genes (bW-HD1 and bE-HD2 in Pucciniales [19, 20], originally designated as bE-HD1 and bW-HD2 in Ustilago maydis [21]), which are linked by a short DNA fragment (~1 kbp) and are outwardly transcribed in opposite directions. Their protein products must be of different allelic specificity to form heterocomplexes and to activate a transcriptional cascade downstream of PRA-MFA [9]. The HD transcription factors regulate cellular development during mating, maintenance of the dikaryotic state, and control pathogenicity in some plant pathogens such as smuts [15].

The HD and PR loci can be either physically linked or unlinked, which means that they segregate together or independently, respectively. The mating compatibility system in Basidiomycota influences genomic organization and resulting segregation patterns of MAT loci. Most basidiomycete species display either bipolar or tetrapolar mating compatibility systems and sexual gametes (haploids) must have different alleles at MAT loci to be compatible [9]. In bipolar species the HD and PR loci are physically linked or alternatively, one locus has lost its function in mating compatibility. In bipolar species, only one MAT locus determines mate compatibility. In tetrapolar species, the HD and PR loci are unlinked, and both determine mate compatibility. A bipolar mating compatibility system is favorable in inbreeding populations that are primarily selfing, as the likelihood of compatibility amongst gametes derived from the same diploid individual is 50 percent [15]. In comparison, tetrapolar mating has 25 percent compatibility of gametes under selfing conditions. Outcrossing mating behavior leads to multiallelism at MAT loci because syngamy of haploid gametes derived from different individuals increases compatibility odds [6,9,15]. In many tetrapolar basidiomycete species, the HD locus is highly polymorphic within the population, with tens to hundreds of known or estimated alleles that are under negative frequency-dependent selection [15,22,23]. A tetrapolar mating compatibility system is thought to be the ancestral state in Basidiomycota, yet several species have evolved bipolarity independently. For example, Microbotryum spp. are highly selfing, and multiple independent evolutionary events have linked the HD and the PR locus into DNA fragments of different sizes and ages within this genus [3,7,8,24,25]. These independent events at different timescales show clear signatures of recombination suppression and genetic degeneration. 
Similarly, in the Ustilaginomycotina, several species evolved bipolarity by linking both MAT loci, such as U. hordei, U. bromivora or Sporisorium scitamineum [9]. In contrast, a recent study showed that tetrapolarity has evolved multiple times in the human skin fungus, Malassezia spp., where the ancestral behavior is pseudobipolar [26].
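The selfing compatibility odds quoted in this paragraph (50 percent for bipolar, 25 percent for tetrapolar) can be reproduced by simple enumeration. The following is a minimal sketch, assuming a single diploid parent that is heterozygous at every MAT locus, independent segregation of unlinked loci, and that two gametes are compatible only if they differ at every MAT locus. The function name and allele labels are illustrative placeholders, not taken from the study.

```python
from itertools import product

def selfing_compatibility(loci):
    """Fraction of gamete pairs from one diploid parent that are compatible.

    `loci` is a list of (allele_a, allele_b) tuples, one per independently
    segregating MAT locus. Two gametes are treated as compatible only if
    they carry different alleles at every locus (an assumption of this sketch).
    """
    # Each haploid gamete inherits one allele per locus.
    gametes = list(product(*loci))
    pairs = list(product(gametes, repeat=2))
    compatible = sum(
        all(x != y for x, y in zip(g1, g2)) for g1, g2 in pairs
    )
    return compatible / len(pairs)

# Bipolar: a single locus effectively determines compatibility
# (HD and PR linked, or one locus has lost its mating function).
bipolar = selfing_compatibility([("a1", "a2")])

# Tetrapolar: unlinked HD and PR loci both determine compatibility.
tetrapolar = selfing_compatibility([("HD1", "HD2"), ("PR1", "PR2")])

print(bipolar, tetrapolar)  # 0.5 0.25
```

The same function can be called with more than two loci, but it does not model multiallelic populations or outcrossing, which would require allele frequencies that are not given here.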

Detailed analyses of MAT loci and their genome biology are lacking for rust fungi (order Pucciniales, division Basidiomycota) [9]. This lack of knowledge on mating-type is despite Pucciniales being the largest order of fungal plant pathogens, causing diseases with significant environmental and economic impact for trees and crops such as poplar, paperbarks, wheat, oat, soybean, and coffee [27–29]. Studying mating systems in Pucciniales has been difficult as they are obligate biotrophs with complex life cycles. They cannot be cultured in vitro and many species require multiple hosts to complete their sexual life cycle [9]. For example, macrocyclic and heteroecious rust fungi have five spore stages and require two hosts to complete their sexual life cycles. During the asexual infection cycle on the “primary” host, for example wheat and oat for rust on cereal crops, fungi are dikaryotic and produce re-infective urediniospores that harbor two distinct nuclei carrying compatible MAT loci. The sexual cycle is initiated with the production of teliospores, which represent the diploid phase of rust fungi and the site of meiosis. Resulting haploid basidiospores infect the “alternate” host, for example Berberis spp. for rust fungi of cereal crops, and form haploid infection structures with distinct MAT loci. Fusion of cells with compatible MAT loci generates dikaryotic intercellular mycelia within the plant tissue that gives rise to dikaryotic aeciospores that infect the “primary” host [7] to complete the full life cycle [9, 30]. However, it is important to note that recent cytogenomic and cytogenetic work questions this clear delineation of nuclear state and ploidy in Pucciniales at different life stages, with evidence for the occurrence of diploid nuclei throughout the life cycle in many species [31].

The comprehensive role of MAT genes during the life cycle of rust fungi is not yet understood. It is hypothesized that MAT genes regulate nuclear pairing during dikaryotic spore production and mediate compatibility of haploid cell fusions [9,30]. Original research suggested that most species of rust fungi are bipolar [30], yet more recent studies on Melampsora lini and P. coronata var. coronata revealed a tetrapolar mating compatibility system [32,33]. In the absence of direct experimental evidence such as experimental mating or gene knockout studies, genomic insights into MAT loci from genome assemblies of dikaryons can provide hypotheses on the mating compatibility system. Independent initial genome analyses of several rust fungi revealed two unlinked MAT loci with PR and HD loci located on different contigs, scaffolds or chromosomes including Austropuccinia psidii [34], P. graminis f. sp. tritici [35] and P. triticina [19,20]. This genome organization supports hypotheses of a tetrapolar mating compatibility system in these rust fungi. These rust fungi encode Pra receptor genes belonging to the STE3 gene family. The likely bona fide receptors are encoded by STE3.2–2 and STE3.2–3 that display clear signatures of ancient trans-specific polymorphisms, which are proposed to be ancestral to Basidiomycota [36]. Here the sequences of a given mating-type are more similar to pheromone receptor alleles of the same mating-type in distantly related fungi than the alternate allele in the same species. In most cases, at least one mfa allele has been found in proximity to the STE3.2-2/3 locus. To date, the PR locus is supported as biallelic in available population level datasets for P. triticina and P. striiformis f. sp. tritici [19,23,37]. Melampsora larici-populina may be multiallelic at the PR locus [9] and population level analyses are missing for other rust fungi. The HD locus is multiallelic in P. triticina and P. striiformis f. sp. tritici [19,23,37] encoding variants of bW-HD1 and bE-HD2 that are highly dissimilar only in the variable N-terminal domain, shown to be essential for functional heterodimerization in other Basidiomycota [38,39].

It is noteworthy that direct biochemical or genetic evidence is lacking to show that any of these MAT genes govern mating compatibility in rust fungi. Yet, complementation assays in U. maydis and host induced gene silencing on wheat demonstrated the importance of Pra and HD genes of P. triticina for mating in a heterologous system and for spore production during asexual infection [19]. In addition, MAT genes were expressed during the asexual and/or sexual infection cycle in several rust fungal species [19,40]. These initial studies brought insights and left several outstanding knowledge gaps for MAT loci in rust fungi: 1) What is the chromosomal organization of PR and HD loci and how does it compare between closely related species? 2) What is the allelic diversity of MAT genes within rust fungal species? 3) What is the composition and organization of MAT loci proximal regions, how does it compare between PR and HD loci, and how does it vary between closely related species? 4) What is the compositional and organizational variation of MAT loci within rust fungal species? 5) Is the expression of MAT genes during the asexual infection cycle common in rust fungi?

Addressing these questions has been challenging because initial genome assemblies of rust fungi were fragmented and the structure of MAT loci could not be assessed consistently. For example, clustering of repetitive elements around MAT loci [18,41,42] is a challenge for studying rust fungi with repeat-rich genomes because repetitive sequences cause gaps, fragmentation, and phase-errors in assembled genomes using short-read or noisy long-read sequencing technologies. Here, we overcome this challenge and address important knowledge gaps around MAT loci in four cereal rust fungi by providing a detailed comparative analysis of publicly available chromosome-scale phased genome assemblies and complementary Illumina short-read datasets. The study species include the oat crown rust fungus, P. coronata f. sp. avenae [43–45], and three wheat rust fungi, P. graminis f. sp. tritici [35,46,47], P. triticina [48–51] and P. striiformis f. sp. tritici [52–58], which combined are the biggest threat to wheat production globally causing losses of several billion dollars every year [29]. Our detailed comparative analyses set out to address the following specific objectives. 1) To determine the chromosomal organization of MAT loci. 2) To assess the allelic diversity of these loci in the four cereal rust fungi. 3) To evaluate synteny, recombination suppression, and genomic degeneration around PR and HD loci with the prediction that multiallelic loci are more syntenic than biallelic loci and that the latter show stronger footprints of recombination suppression and genomic degeneration. 4) To gauge the conservation and/or plasticity of MAT loci within a cereal rust fungus species using P. triticina as model system. And 5) to determine the expression of Pra and HD genes during the asexual infection cycle of the cereal hosts.

We show that HD genes are multiallelic in four species of cereal rust fungi and Pra genes are most likely biallelic in three out of four species. Only biallelic MAT loci show strong signs of recombination suppression and genomic degeneration which supports recent mathematical modelling. Our results provide novel insights into the genome biology of MAT loci in cereal rust fungi and will benefit predictions about mating and hybridization events that may be linked to the evolution of novel pathogenicity traits.

Results

Genomic organisation and inheritance of mating-type genes suggest cereal rust fungi have a tetrapolar mating compatibility system

We set out to test previous observations that suggest cereal rust fungi have a tetrapolar mating compatibility system with unlinked PR and HD loci [19,20]. We made use of seven available chromosome-level genome assemblies of four cereal rust fungi, namely P. coronata f. sp. avenae (Pca) [43–45], P. graminis f. sp. tritici (Pgt) [35,46,47,58], P. triticina (Pt) [37,48–51,59] and P. striiformis f. sp. tritici (Pst) [52,54–57,60], in addition to five partially phased assemblies (Table 1). Our initial analysis focused on one reference genome per species with Pca 203, Pgt 21–0, Pt 76, and Pst 134E.

Table 1. Cereal Rust Fungi Genomes used in the present study.

Species | Strain | Genome size | BUSCO (%) | Source | Download date | Citation
Puccinia graminis f. sp. tritici | Ug99 | 176.2 Mb | 93.4 [S:7.7, D:85.7] | NCBI | 26/05/2021 | [35]
Puccinia graminis f. sp. tritici | 21–0* | 176.9 Mb | 94.2 [S:8.3, D:85.9] | NCBI | 26/05/2021 |
Puccinia striiformis f. sp. tritici | DK0911 | 157.8 Mb | 91.1 [S:38.4, D:52.7] | JGI | 1/06/2021 | [54]
Puccinia striiformis f. sp. tritici | 104E | 126.5 Mb | 92.1 [S:7.9, D:84.2] | JGI | 26/05/2021 | [53]
Puccinia striiformis f. sp. tritici | 134E* | 167.7 Mb | 92.6 [S:6.8, D:85.8] | Provided by the authors | 21/06/2021 | [52]
Puccinia coronata f. sp. avenae | 12NC29 | 105.2 Mb | 89.9 [S:18.1, D:71.8] | NCBI | 26/05/2021 | [43]
Puccinia coronata f. sp. avenae | 12SD80 | 99.2 Mb | 90.2 [S:33.1, D:57.1] | NCBI | 26/05/2021 | [43]
Puccinia coronata f. sp. avenae | 203* | 208.1 Mb | 92.1 [S:6.9, D:85.2] | CSIRO data access portal | 20/06/2022 | [44]
Puccinia triticina | 76* | 253.5 Mb | 91.0 [S:1.7, D:89.3] | Provided by the authors | 28/05/2021 | [48]
Puccinia triticina | 15* | 243.9 Mb | 91.1 [S:1.6, D:89.5] | NCBI | 02/11/2023 | [59]
Puccinia triticina | 19NSW04* | 253 Mb | 91.2 [S:0.9, D:90.3] | NCBI | 10/10/2023 | [37]
Puccinia triticina | 20QLD87* | 248.1 Mb | 91.3 [S:1.3, D:90.0] | NCBI | 10/10/2023 | [37]
Puccinia polysora f. sp. zeae | GD1913* | 1.7 Gb | 91.1 [S:1.4, D:89.7] | NCBI | 07/14/2023 | [81]

The table provides information on all genome assemblies used in this study, their total dikaryotic genome assembly size (“Genome size”), and completeness as assessed with BUSCO (“BUSCO (%)”). For the “BUSCO (%)” column the first number presents the percentage of complete BUSCO genes identified. The numbers in the brackets provide the percentage of single-copy (S) and of duplicated (D) BUSCO genes in the dikaryotic genome assembly. Additional presented metadata includes species name (“Species”), strain name (“Strain”), source (“Source”), download date in DD/MM/YYYY (“Download date”) and the initial reference (“Citation”). * denotes dikaryotic genome assemblies that are chromosome-scale and phased. NCBI–National Center for Biotechnology Information, JGI–Joint Genome Institute, CSIRO–Commonwealth Scientific and Industrial Research Organization.
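As a quick sanity check on how the “BUSCO (%)” entries are structured, the sketch below parses a few entries copied from Table 1 and verifies that the leading number equals the sum of the single-copy (S) and duplicated (D) percentages. The dictionary keys, function names, and tolerance are illustrative choices and not part of the study's workflow.

```python
import re

# BUSCO strings copied from Table 1: "complete [S:single, D:duplicated]".
busco = {
    "Pgt Ug99": "93.4 [S:7.7, D:85.7]",
    "Pgt 21-0": "94.2 [S:8.3, D:85.9]",
    "Pst 134E": "92.6 [S:6.8, D:85.8]",
    "Pca 203": "92.1 [S:6.9, D:85.2]",
    "Pt 76": "91.0 [S:1.7, D:89.3]",
    "Pt 19NSW04": "91.2[S:0.9, D:90.3]",  # no space before "[" in the source
}

PATTERN = re.compile(r"([\d.]+)\s*\[S:([\d.]+),\s*D:([\d.]+)\]")

def parse_busco(text):
    """Return (complete, single, duplicated) percentages as floats."""
    match = PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized BUSCO string: {text!r}")
    return tuple(float(x) for x in match.groups())

def is_consistent(text, tol=0.05):
    """True if complete == single + duplicated, within rounding tolerance."""
    complete, single, duplicated = parse_busco(text)
    return abs(complete - (single + duplicated)) <= tol

for name, entry in busco.items():
    print(name, parse_busco(entry), is_consistent(entry))
```

A high duplicated fraction is what one would expect when both haplotypes of a dikaryon are present in the assembly, since most conserved genes then appear twice.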

We used previously characterized MAT genes from the Pt isolate BBBD (Table 2) [19] as queries to identify orthologs in available genomes and proteomes (Table 1). We identified HD (bW-HD1 and bE-HD2) and Pra (STE3.2–2 and STE3.2–3) alleles on chromosome 4 and chromosome 9 respectively, whereas STE3.2–1 alleles were located on chromosome 1 (S1 Fig). This is consistent with previous preliminary analyses for Pgt and Pt that suggested HD and PR loci are located on two distinct chromosomes [35,37]. We confirmed these results using five genome resources for cereal rust fungi that are only partially phased and non-chromosome scale (Table 1). We identified two alleles of bW-HD1, bE-HD2 and Pra (STE3.2–2 or STE3.2–3) in all five genome assemblies with the exception of STE3.2–2 in Pst 104E and Pca 12SD80 [43,53]. In the case of Pst 104E, STE3.2–2 was absent from the genome assembly but we confirmed its presence in the raw sequencing data by mapping sequencing reads against the Pst 134E phased chromosome scale genome assembly [52]. STE3.2–2 displayed average genome-wide read coverage without any nucleotide variation confirming that Pst 104E was also heterozygous for Pra. In the case of Pca 12SD80, we identified two copies of STE3.2–2 on contig 000183F and 000183F_004. This mis-characterization of STE3.2–2 in Pst 104E and Pca 12SD80 is likely caused by early long-read genome assembly errors using noisy PacBio long-reads and suboptimal genome assembly algorithms that struggle with highly repetitive regions. Similar observations have been made for missing Pra genes reported for earlier genome versions of Melampsora larici-populina [61].

Table 2. MAT reference genes used as query and outgroups.

Species Strain Source Gene name Gene ID Citation
Puccinia graminis f. sp. tritici isolate CDL 75–36–700–3 (race SCCL) NCBI & Cuomo et al., 2017 bW-HD1 PGTG_05143 [19]
bE-HD2 PGTG_05144
STE3.2–1 PGTG_00333
STE3.2–2 PGTG_19559
STE3.2–3 PGTG_01392
Pgtmfa1 -
Pgtmfa2 -
Pgtmfa3 -
Puccinia triticina isolate 1–1 / race 1 (BBBD) NCBI & Cuomo et al., 2017 bW-HD1 PTTG_09683 PTTG_27730 [19]
bE-HD2 PTTG_10928 PTTG_03697
STE3.2–1 PTTG_09751
STE3.2–2 PTTG_28830
STE3.2–3 PTTG_09693
Ptmfa1/3 -
Ptmfa2 -
Puccinia striiformis f. sp. tritici 78 NCBI & Cuomo et al., 2017 bW-HD1 PSTG_05919 PSTG_18670 [19]
bE-HD2 PSTG_05918 PSTG_19315
STE3.2–1 PSTG_02613
STE3.2–2 PSTG_15127
STE3.2–3 PSTG_15070
Pstmfa1 -
Pstmfa2 -
Puccinia polysora f. sp. zea GD1913 NCBI bW-HD1 FUNB_006397 [81]
bE-HD2 FUNA_006226
STE3.2–2 FUNA_023003
STE3.2–1 FUNA_001167
STE3.2–3 FUNA_013409

The table provides gene names (“Gene name”) and gene identifiers (“Gene ID”) of MAT genes from Puccinia triticina initially used as query sequences and from Puccinia polysora f. sp. zeae used as outgroups. Additional presented metadata includes species name (“Species”), strain name (“Strain”), source (“Source”), and the initial reference (“Citation”). NCBI–National Center for Biotechnology Information.

Overall, the two MAT loci, HD and PR, are unlinked and heterozygous in all twelve dikaryotic genome assemblies, which is consistent with a genome informed hypothesis that cereal rust fungi are tetrapolar.

Mfa pheromone peptide precursor genes are closely linked to STE3.2–2 but not STE3.2–3

We searched the available dikaryotic genomes for putative pheromone peptide precursor genes, which encode mating-factor-a (MFA) peptides. Mfas are often linked to Pra alleles and are predicted to bind to the compatible pheromone receptors encoded at the allelic PR locus. We used the previously identified mfa1, mfa2, and mfa3 genes of Pst, Pgt, and Pt as queries [19]. We identified a single mfa2 pheromone peptide precursor gene in all species. In all cases, mfa2 and STE3.2–2 were closely linked, located within 500–1100 bp from each other, and encoded on the same DNA strand (S2 Fig). The mfa2 derived precursor peptides were all 34 amino acids long with a characteristic CAAX motif at the C-terminus, where C is cysteine, A is an aliphatic amino acid, and X is any amino acid [62] (S3B Fig). The MFA2 amino acid sequences were 100 percent identical at species rank (S3B Fig).
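The C-terminal CAAX motif described above lends itself to a simple pattern check. The sketch below is illustrative only: it assumes that “aliphatic” means A, V, I, L or M (the accepted residue set is not specified in the text), and the two example peptides are made-up placeholders rather than real MFA2 precursors.

```python
import re

# Assumption: the aliphatic positions accept A, V, I, L or M; the exact
# residue set is not stated in the text. X is any of the 20 standard residues.
ALIPHATIC = "AVILM"
ANY_RESIDUE = "ACDEFGHIKLMNPQRSTVWY"
CAAX = re.compile(f"C[{ALIPHATIC}]{{2}}[{ANY_RESIDUE}]$")

def has_caax_motif(peptide):
    """True if the peptide ends in C, two aliphatic residues, then any residue."""
    return CAAX.search(peptide.upper()) is not None

# Hypothetical toy sequences, not taken from any rust fungus genome.
print(has_caax_motif("MKTSEQCVIA"))  # True: ends in C-V-I-A
print(has_caax_motif("MKTSEQCVI"))   # False: only three residues from the C
```

Widening or narrowing `ALIPHATIC` changes which candidates pass, so the set should be chosen to match the definition used in the cited motif literature.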

In contrast, the sequence, length, number, and location of mfa genes associated with STE3.2–3 varied between species (S2, S3A, and S3C Figs). Pca and Pst encoded a single mfa1 allele. Pca-mfa1 of Pca 203 encoded a 76 amino acid peptide and was located 0.54 Mbp upstream of STE3.2–3. In Pst, Pst-mfa1 encoded a 74 amino acid peptide and was located 10 kbp and 13 kbp away from STE3.2–3 in Pst 134E and Pst DK0911, respectively. In contrast, Pt carried two identical copies of Pt-mfa1/3 encoding 61 amino acid peptides. In the Pt 76 reference, the two Pt-mfa1/3 copies were associated with STE3.2–3 at a distance of 0.24 Mbp and 0.27 Mbp, respectively (S2 Fig). In addition, Pgt encoded two mfa genes in close proximity to STE3.2–3 that were located upstream and downstream at a distance of 30 kbp and 96 kbp, respectively (S2 Fig). Yet in contrast to Pt, Pgt mfa1 and mfa3 encoded highly distinct pheromone peptide precursors that were only 23.10% identical at the amino acid level and varied in length (68 vs 57 amino acids, respectively).

Further analysis of all three MFA precursor peptide alleles suggests a distinct maturation pathway for each. MFA1 precursor peptides appear to contain three tandem pheromone peptide repeats of similar sequence (S3A Fig), which has been reported for Microbotryum spp. [22,63] and Ascomycete fungi [64] previously. The predicted mature peptide encoded by the first repeat is sequence-conserved between all four cereal rust species, while the second repeat is species-specific, and the third repeat is sequence-conserved between Pgt, Pst and Pca, and has a single amino acid variation in Pt (S3A Fig). In contrast, MFA2 precursor peptides lack detectable pheromone peptide repeats and are likely processed into a single mature MFA peptide which is sequence-specific in each species (S3B Fig). Similarly, MFA3 of Pgt did not contain amino acid repeat sequences; however, we identified the “QWGNGSHYC” amino acid sequence at the C-terminus of the Pgt-MFA3 precursor peptide (S3C Fig). This predicted mature peptide sequence of Pgt-MFA3 is highly similar to the predicted mature pheromone peptide of Pgt-MFA1, “QWGNGSHMC”, with a single amino acid variation. We cannot exclude the possibility that MFA3 is processed into additional mature peptides and future studies are needed to define if the observed amino acid variations lead to different receptor specificities.
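The single amino acid difference between the two predicted mature peptides quoted above can be located with a position-wise comparison. This is a minimal sketch for equal-length, ungapped peptides only; the function name is an illustrative choice, and comparing full-length precursors of different lengths (such as the 68 and 57 amino acid Pgt precursors) would require a proper sequence alignment instead.

```python
def compare_peptides(a, b):
    """Position-wise comparison of two equal-length peptide strings.

    Returns a list of (1-based position, residue_a, residue_b) for every
    mismatch, plus the percent identity over the full length.
    """
    if len(a) != len(b):
        raise ValueError("peptides must have equal length (no gaps handled)")
    diffs = [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]
    identity = 100 * (len(a) - len(diffs)) / len(a)
    return diffs, identity

# Predicted mature peptides as quoted in the text.
diffs, identity = compare_peptides("QWGNGSHYC", "QWGNGSHMC")  # MFA3 vs MFA1
print(diffs)               # [(8, 'Y', 'M')]
print(round(identity, 1))  # 88.9
```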

Taken together, in all four cereal rust fungi, Pra alleles were associated with species-specific mfa genes. Yet, the number, distance, organization, and sequence of mfa genes and their predicted mature pheromone peptides varied between species.

Genealogies elucidate distinct evolutionary histories for the different MAT genes

Having identified the MAT genes, we investigated the genealogical relationships of each individual gene and compared them within and between species. This tested for shared or distinct evolutionary histories, which can provide indications for recombination within and between genes. For example, trans-specific polymorphisms indicate that recombination cessation is older than speciation [36,65]. In the absence of a robust phylogeny including all four study species [66], we first generated a multilocus species tree based on a multiple sequence alignment of 2,284 single ortholog protein sequences [67] including P. polysora f. sp. zeae as an outgroup (Fig 1A). The species tree showed that the three wheat rust fungi shared a most recent common ancestor when compared to the oat rust fungus Pca (Fig 1A). In the wheat rust clade, Pgt was sister to Pt, and Pst shared a common ancestor with them both. We aligned HD genes bW-HD1 and bE-HD2 separately to construct independent gene trees using P. polysora f. sp. zeae orthologs as outgroups (Table 2 and Fig 1B and 1C). The gene trees of bW-HD1 and bE-HD2 grouped alleles from each species into species-specific clades (Fig 1B and 1C) and we therefore did not find any evidence for trans-specific polymorphisms for either HD gene.

Fig 1. HD genealogies suggest distinct evolutionary histories for bW-HD1 and bE-HD2 in four cereal rust fungi.

(A) Species tree of Puccinia polysora f. sp. zeae (Ppz), P. coronata f. sp. avenae (Pca), P. graminis f. sp. tritici (Pgt), P. triticina (Pt) and P. striiformis f. sp. tritici (Pst) inferred from 2284 single-copy orthogroups. Numbers on nodes represent local support values computed by FastTree. Puccinia polysora f. sp. zeae (Ppz) has been used for rooting the species tree. (B) and (C) Bayesian rooted gene trees built from bW-HD1 or bE-HD2 coding-based sequence alignments, respectively. Trees are based on a HKY + G model of molecular evolution. In either case, alleles from the same species are grouped into the same clade. Each node is labelled with its values of posterior probability (PP). PP values above 0.95 are considered to have strong evidence for monophyly of a clade and PP values of identical alleles are not displayed. The scale bar represents the number of nucleotide substitutions per site. * marks an allele with minor variations only outside the variable domain, which means this allele is predicted to be functionally equivalent to its closest neighbor. Alleles of the same species are colored with identical background: Pca (yellow), Pgt (green), Pt (blue), Pst (orange).

We identified multiple bW-HD1 and bE-HD2 alleles for the four cereal rust fungi suggesting that the HD locus is multiallelic (see also below). We observed several shared HD alleles between isolates in the case of Pt while HD alleles in Pca and Pst appeared to be more diverse. In the case of Pgt, we were limited to only four alleles, making any meaningful conclusion difficult. We next tested our null hypothesis that bW-HD1 and bE-HD2 have similar evolutionary histories because they are closely linked by a short DNA fragment. We performed approximately unbiased (AU) tests [68] within each species to investigate if the tree topologies of bW-HD1 and bE-HD2 are congruent with each other. In Pca, Pst, and Pt the pAU-value was less than 0.05, which rejected the null hypothesis of congruent tree topologies. This suggests distinct evolutionary histories for the bW-HD1 and bE-HD2 genes in these species, which might be caused by recombination within the HD locus. In contrast, the AU-test for Pgt suggested similar tree topologies for bW-HD1 and bE-HD2, which likely is influenced by the low sample number of four alleles. We also applied RDP5 [69] to detect potential signals of recombination at the nucleotide level. We detected potential signals for recombination in the HD locus for all species, which further supports the results of the AU test [70] and is similar to recombination events between b alleles reported in U. maydis [71].

Next, we explored the evolutionary relationship between Pra alleles at the PR locus by building a single gene tree using P. polysora f. sp. zeae STE3.2–2 and STE3.2–3 as outgroups. The Pra gene tree formed two obvious clades based on allele identity grouping the STE3.2–2 alleles of all species into one clade and all STE3.2–3 alleles into another clade including respective alleles of P. polysora f. sp. zeae (Fig 2). In each clade, there was little to no intra-species variation for STE3.2–2 and STE3.2–3 based on the minimal branch lengths that separated allele copies in each species sub-clade. This indicates that Pra might be biallelic in cereal rust fungi (see also below). The clear grouping by allele identity rather than by species of rust fungi indicates strong trans-specific polymorphisms and a long-term suppression of recombination at the PR locus that predates ancestral speciation.

Fig 2. Pra genealogy displays ancient trans-specific polymorphism in cereal rust fungi.

Bayesian gene tree built from Pra (STE3.2–2 and STE3.2–3) coding-based sequence alignment. The tree is based on a TN93 + I model of molecular evolution. The Pra alleles, STE3.2–2 and STE3.2–3, were grouped into two clades by allele identity and not species identity. Each node is labelled with its values of posterior probability (PP). PP values above 0.95 are considered as strong evidence for monophyly of a clade and PP values of identical alleles are not displayed. The scale bar represents the number of nucleotide substitutions per site. Species-specific background coloring is the same as for Fig 1.

We next investigated if the putative pheromone peptide precursor mfa genes at the PR locus followed similar evolutionary histories compared to their co-located Pra genes. Like the Pra genealogy, the gene tree of mfa alleles revealed strong trans-specific polymorphisms because mfa copies clustered by allele and not species identity (S4 Fig). For example, all mfa2 alleles grouped into one clade according to their physical association with the specific Pra allele STE3.2–2. Similarly, all mfa1 alleles grouped together with similar topology as their linked STE3.2–3 copies. The species-specific Pgt-mfa3s, which are also physically linked to STE3.2–3, were placed as a sister group to the mfa1 clade, being more closely related to mfa1 than to mfa2.

HD loci display limited signs of genomic deterioration and their synteny is conserved within species

Different effects on recombination have been reported around MAT loci in basidiomycetes [8,17]. Hence, we investigated signals of altered recombination, synteny, and genomic deterioration around the MAT loci in rust fungi. We defined the HD locus as the DNA fragment which includes bW-HD1, bE-HD2, and the DNA sequence in between [22]. First, we analyzed whether there was a reduction in synteny on chromosome 4, surrounding the HD locus, as a signature for long-term recombination suppression. We used MUMmer software [72] to align the haplotypes of chromosome 4 from Pca, Pgt, Pt, and Pst, against each other (S5 Fig). Overall macro-synteny of chromosome 4 haplotypes was conserved in all cereal rust fungi with Pt 76 the most syntenic (S5C Fig). Macro-synteny in a 40 kb-sized window around the HD locus was mostly conserved in Pt 76, Pca 203, and Pst 134E while this initial analysis suggested that this locus was less syntenic in Pgt 21–0 (S5 Fig).

We investigated the individual allelic divergence along chromosome 4 using synonymous divergence (dS) values as an additional measure for footprints of ancient recombination suppression [8]. This analysis tested if the HD locus and proximal genes showed an increase in dS values, which could be caused by long-term suppression of recombination and accumulation of independent mutations, as seen in other Basidiomycota [15]. We calculated the pairwise dS values for all genes across all sister chromosomes in all four species of Puccinia. This whole-genome analysis of all allele pairs on sister chromosomes suggested that chromosome 4 did not have increased dS values when compared to all other chromosomes (S6 Fig). The dS values of alleles for both HD genes fell within the upper 95% quantile of dS values on chromosome 4. Yet, dS values of directly adjacent genes were not elevated beyond the background level when we plotted dS values for allele pairs along chromosome 4 (Fig 3). Consistently, regions around the HD locus did not show any distinctive patterns of gene or transposon density, which are common features of genetic deterioration (Fig 3). We also tested whether the HD locus was potentially linked to centromeres as reported in other Basidiomycota [15,73]. We used previously identified centromere locations for Pgt and Pst [52,74] and identified centromere locations for Pca and Pt using genome-wide Hi-C heatmaps based on the bowtie-like Hi-C interaction features known for fungal centromeres (S7–S10 Figs) [75]. This analysis revealed that the HD locus is likely not directly linked to centromeres in any of the four cereal rust fungi.
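The quantile comparison described above can be illustrated with a minimal sketch. It assumes a hypothetical list of dS values, one per allele pair on a chromosome, and reports the fraction of values that do not exceed the dS value of a gene of interest. The function name, the toy values, and the 0.95 threshold check are illustrative assumptions and do not reproduce the analysis pipeline used in this study.

```python
from bisect import bisect_right


def empirical_quantile(values, x):
    """Return the fraction of values that are less than or equal to x."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    return bisect_right(ordered, x) / len(ordered)


# Hypothetical dS values for allele pairs on one chromosome (not study data).
chrom_ds = [0.01, 0.02, 0.02, 0.03, 0.05, 0.04, 0.01, 0.02, 0.03, 0.30]
# Hypothetical dS value of an HD allele pair.
hd_ds = 0.30

q = empirical_quantile(chrom_ds, hd_ds)
# True when the gene lies in the upper tail of the chromosome-wide distribution.
in_upper_tail = q >= 0.95
print(f"empirical quantile: {q:.3f}, upper tail: {in_upper_tail}")
```

An empirical quantile is used here because it makes no assumption about the shape of the dS distribution, which is typically strongly right-skewed.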

Fig 3. Synonymous divergence (dS) values of HD genes but not their immediate neighbors are slightly elevated on chromosome 4.

Synonymous divergence values (dS) for all allele pairs are plotted along chromosome 4A for (A) P. coronata f. sp. avenae (“Pca 203”), (B) P. graminis f. sp. tritici (“Pgt 21–0”), (C) P. triticina (“Pt 76”), and (D) P. striiformis f. sp. tritici (“Pst 134E”). In each panel, the top track shows the dS values (“dS”) of allele pairs along chromosome 4. Each dot corresponds to the dS value of a single allele pair. The second and third tracks show the averaged TE (“TE”) and gene (“gene”) density along chromosome 4 in 10 kbp-sized windows, respectively. The HD genes (bW-HD1 and bE-HD2) are highlighted with a red line and red shading indicates a 0.4 mbp-sized window around the HD locus. Predicted centromeric regions are marked with blue shading. The two lower tracks (dS values and gene locations) provide a detailed zoomed-in view of the red shaded area around the HD locus. Species-specific background coloring is the same as for Fig 1.

Lastly, we investigated overall conservation of nucleotide and gene-coding regions for the HD locus including proximal regions containing 40 neighboring genes on either side of the locus. We combined fine-scale nucleotide synteny with gene conservation analysis. We specifically tested if cereal rust fungi contain syntenic blocks with conserved protein-coding genes surrounding MAT loci, as was found for other species, e.g. Trichosporonales spp. [17]. We used blastn to identify conserved nucleotide sequences within dikaryotic genomes and between species. We plotted genes, transposons, nucleotide and protein-coding gene conservation for proximal regions to the HD locus for all four cereal rust fungi (Fig 4A). Protein-coding genes and their order are mostly conserved within dikaryotic genomes of the same species with 58/80 genes in Pca 203, 40/80 in Pgt 21–0, 57/80 in Pt 76, and 62/80 in Pst 134E being conserved. Similarly, we observed considerable nucleotide conservation and synteny within dikaryotic genomes with the exception of Pgt 21–0, which is consistent with our initial MUMmer-based analysis (S5 Fig). Overall, synteny and protein-coding gene conservation were very limited between species for regions proximal to the HD locus. We could only identify three conserved genes across all four cereal rust fungi. These three genes code for an integral membrane protein (PTHR12459), D-2-hydroxyglutarate dehydrogenase (PTHR43716) and Glucose-6-phosphate 1-epimerase (IPR025532), with all three clustering downstream of the HD locus (Fig 4A). Our TE analysis revealed that TE coverage was overall consistent at the order classification level [76] within species but varied across species (S11 Fig). For example, terminal inverted repeats (TIRs) and other undetermined Class II DNA transposons dominated the HD locus of Pst 134E, while the HD locus of other rust fungi had higher coverage of Class I RNA transposons including long terminal repeats (LTRs).

Fig 4. HD loci are partially conserved within species whereas PR loci are highly heterozygous and display strong signals of genomic degeneration.

Synteny graphs of (A) the HD locus and (B) the PR locus including proximal regions in P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”), and P. striiformis f. sp. tritici (“Pst 134E”). Proximal regions are defined as 40 genes downstream and upstream of HD and Pra genes, respectively. HD loci are relatively syntenic within each dikaryotic genome but less conserved between species. PR loci display little synteny in Pgt 21–0, Pt 76 and Pst 134E and show strong signs of transposon accumulation (see also S13 Fig). There is very little conservation of the PR loci across species. Red lines between chromosome sections represent gene pairs with identity higher than 70% and grey shades represent conserved nucleotide sequences (≥ 1000 bp and identity ≥ 90%). For additional annotations please refer to the included legend (“Legend”).
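The thresholds given in the legend for conserved nucleotide sequences (at least 1000 bp in length and at least 90% identity) can be applied to blastn results with a short filter. The sketch below assumes that blastn was run with the standard 12-column tabular output, where the third and fourth columns hold percentage identity and alignment length. Whether the original analysis used this output format is an assumption, and the function name and example records are hypothetical.

```python
def filter_blast_hits(lines, min_length=1000, min_identity=90.0):
    """Keep tabular blastn hits that pass length and identity thresholds.

    Assumes the standard 12-column tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        pident = float(fields[2])
        length = int(fields[3])
        if length >= min_length and pident >= min_identity:
            kept.append((fields[0], fields[1], pident, length))
    return kept


# Hypothetical records, not results from this study.
example_hits = [
    "chr4A\tchr4B\t95.2\t1500\t70\t2\t1\t1500\t10\t1509\t0.0\t2500",
    "chr4A\tchr4B\t88.0\t3000\t360\t0\t2000\t4999\t2100\t5099\t0.0\t3100",
    "chr4A\tchr4B\t99.0\t400\t4\t0\t6000\t6399\t6100\t6499\t1e-150\t700",
]
conserved = filter_blast_hits(example_hits)
print(conserved)
```

Only the first record passes both thresholds; the second fails on identity and the third on length.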

Overall, our analysis of the HD locus suggests that it is conserved within each dikaryotic genome at species rank, while conservation has been eroded between species over longer evolutionary timescales.

The PR locus displays strong signs of genomic degeneration

We investigated the organization and synteny of the PR locus, which we defined as the DNA fragment including Pra, mfa genes, and the DNA sequences in between. Initial MUMmer-based synteny analysis revealed clear macro- and micro-synteny breaks around the PR locus in all cereal rust fungi (S12 Fig). It also revealed accumulation of highly repetitive sequences closely associated with the PR locus, especially in Pca 203 and Pst 134E. This increase in TEs correlates with a reduction in gene density around the PR locus (Figs 4B and 5). The loss in synteny was specific to the PR locus, as overall macro-synteny of chromosome 9 was conserved in all species (S12 Fig).

Fig 5. PR loci display elevated synonymous divergence (dS) values, accumulation of transposable elements and depletion of genes.

Synonymous divergence values (dS) for all allele pairs are plotted along chromosome 9A for (A) P. coronata f. sp. avenae (“Pca 203”), (B) P. graminis f. sp. tritici (“Pgt 21–0”), (C) P. triticina (“Pt 76”), and (D) P. striiformis f. sp. tritici (“Pst 134E”). In each panel, the top track shows the dS values (“dS”) of allele pairs along chromosome 9. Each dot corresponds to the dS value of a single allele pair. The second and third track shows the averaged TE (“TE”) and gene (“gene”) density along chromosome 9 in 10 kbp-sized windows, respectively. The Pra alleles (STE3.2–2 and STE3.2–3) are highlighted with a red line and red shading indicates a 0.4 mbp-sized window around the PR locus. Predicted centromeric regions are marked with blue shading. The two lower tracks (dS values and gene locations) provide a detailed zoomed in view of red shaded area around the PR locus. Species-specific background coloring is the same as for Fig 1.

Consistent with the macro-synteny analysis, dS values for the Pra alleles STE3.2–2/STE3.2–3 belonged to the 99.9% quantile of dS values of all allele pairs found on chromosome 9 in all species (Figs 5 and S6). Gene density dropped significantly around Pra genes with STE3.2–2/STE3.2–3 located within or adjacent to ‘gene deserts’ that extended over 1 mbp in the case of Pst 134E. The PR locus and its proximal regions also appear to be enriched in TEs. One exception was observed around the PR locus of Pgt 21–0 which exhibited a higher gene density and lower amounts of TEs when compared to the other three species, which might be a biological feature or a technical artefact (see Discussion). None of the Pra alleles was physically closely linked to centromeres except for Pst 134E STE3.2–3, which was found 164 kbp from the centromere, embedded in a > 1 Mb long stretch of TEs depleted of genes (Fig 5).
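The gene and TE density tracks in Figs 3 and 5 are reported in 10 kbp windows. A minimal sketch of such a windowed density calculation is given below. It assumes features are supplied as hypothetical 0-based, half-open (start, end) intervals on a single chromosome and merges overlapping features so that no base is counted twice. The function names and toy coordinates are illustrative and are not taken from the study's code.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def window_density(intervals, chrom_length, window=10_000):
    """Fraction of each fixed-size window covered by features."""
    n_windows = -(-chrom_length // window)  # ceiling division
    covered = [0] * n_windows
    for start, end in merge_intervals(intervals):
        start, end = max(start, 0), min(end, chrom_length)
        pos = start
        while pos < end:
            idx = pos // window
            stop = min(end, (idx + 1) * window)
            covered[idx] += stop - pos
            pos = stop
    # The last window may be shorter than the others.
    sizes = [min(window, chrom_length - i * window) for i in range(n_windows)]
    return [c / s for c, s in zip(covered, sizes)]


# Hypothetical TE annotations on a 25 kbp toy chromosome.
toy_tes = [(0, 5_000), (4_000, 12_000)]
densities = window_density(toy_tes, chrom_length=25_000)
print(densities)
```

Runs of consecutive windows with a gene density near zero would correspond to the ‘gene deserts’ described above, although the exact cut-off for calling a desert is a choice this sketch leaves open.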

Detailed analyses of nucleotide synteny and protein-coding gene conservation confirmed signals of extended recombination suppression, loss of synteny and accumulation of transposable elements around the PR locus (Fig 4B). We observed little conservation of nucleotide sequences and protein-coding genes in regions proximal to Pra within dikaryotic genomes or between species. Indeed, we could not identify any conserved protein-coding gene proximal to Pra across the four cereal rust fungi. The TE composition in regions proximal to Pra varied within dikaryotic genomes and between species (S11B Fig). This indicates the accumulation of distinct transposable element families around the PR locus in different cereal rust species (see more detail for Pt below).

Our results show that the PR locus appears highly plastic with little conservation within and between species.

HD genes are multiallelic in cereal rust fungi while Pra genes display little variation at the species rank

We extended our species rank analysis of variation in MAT genes (Figs 1 and 2) using publicly available genomic data for all four species (S1 Table). We identified between 12 and 15 representative isolates per species that were assigned to as many distinct genomic lineages as possible based on previous population genetic studies [35,45–47,49–51,55–58] (S1 Table). Our reasoning for this approach was to avoid repeated sampling from clonal lineages that are predicted to show no genetic variation at MAT genes. We quality controlled whole-genome short-read datasets and mapped them against the respective reference genome to estimate the genetic variation in MAT genes (S13–S16 and S18–S21 Figs). We identified high levels of polymorphisms at the HD locus for all analyzed cereal rust fungi. In our dataset, Pca showed the highest level of variation of the HD locus for all species (S13 Fig). When mapping Illumina short-read data of the Pca isolates against the Pca 203 reference, the HD locus was highly polymorphic including heterozygous single nucleotide polymorphisms (SNPs) and multiple unaligned regions and gaps. This suggested that the HD loci present in our selected Pca isolates were not well reflected in the Pca 203 reference assembly. This is likely due to sampling from diverse sexual populations of Pca in the USA [45]. We therefore implemented a novel de novo reconstruction approach for the HD locus and its encoded HD genes directly from whole-genome Illumina short-read datasets (see Methods for detail). We validated our approach by reconstructing the Pca 203 HD locus from publicly available Illumina short-read datasets. We confirmed that the two reconstructed HD alleles were identical to the ones derived from the dikaryotic reference genome assembly (Fig 1) via a nucleotide alignment of the two reference bW-HD1 and bE-HD2 alleles to the de novo reconstructed HD alleles (S17A Fig). In total, we reconstructed eleven bW-HD1 and twelve bE-HD2 alleles for Pca (S17B Fig). 
Using this approach, we identified extensive genetic variation for bW-HD1 and bE-HD2 in Pgt and Pt (S17C–S17D Figs) while making use of recently published bW-HD1 and bE-HD2 alleles in case of Pst [23] (S17E Fig). We identified six, nine, and ten bW-HD1 and six, eight, and ten bE-HD2 alleles for Pgt, Pt, and Pst, respectively. Overall, this suggests that the HD genes are multiallelic in all four cereal rust fungi, which is consistent with other recent reports [23,37].
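Counting alleles across isolates amounts to collapsing identical sequences. The following sketch shows one simple way to do this, assuming hypothetical per-isolate lists of allele sequences and treating two sequences as the same allele only when they are identical after case normalization and gap removal. This identity criterion, the function names, and the toy sequences are assumptions for illustration and may differ from the criteria applied in this study.

```python
def distinct_alleles(sequences_by_isolate):
    """Return the set of distinct allele sequences across all isolates."""
    alleles = set()
    for sequences in sequences_by_isolate.values():
        for seq in sequences:
            # Normalize case and drop alignment gaps before comparing.
            alleles.add(seq.upper().replace("-", ""))
    return alleles


def allelic_state(n_alleles):
    """Label a locus by its number of distinct alleles."""
    if n_alleles <= 1:
        return "monoallelic"
    return "biallelic" if n_alleles == 2 else "multiallelic"


# Hypothetical toy sequences; each dikaryotic isolate carries two alleles.
toy = {
    "isolate_1": ["ATGAAA", "ATGCCC"],
    "isolate_2": ["ATGAAA", "ATGGGG"],
    "isolate_3": ["atgccc", "ATGTTT"],
}
n_alleles = len(distinct_alleles(toy))
print(n_alleles, allelic_state(n_alleles))
```

Because sampling is finite, a count obtained this way is a lower bound on the number of alleles segregating in a population.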

Compared to the HD genes, the Pra genes are reported to be far less polymorphic in many basidiomycetes [9,77,78]. Hence, we next focused our analysis on Pra genes using the same whole-genome Illumina short-read datasets (S1 Table). As expected, the Pra genes were far less polymorphic within each species compared to HD genes (S18–S22 Figs), which is consistent with our initial analysis of Pra genes extracted from genome assemblies (Figs 1 and 2). In contrast to the highly polymorphic HD genes in Pca, we identified only two SNPs in the coding regions of STE3.2–2 and STE3.2–3 with a single SNP being non-synonymous (S22 Fig). Similarly, STE3.2–2 and STE3.2–3 copies of all Pst isolates had identical coding sequences. Our short-read mapping analysis of Pra in Pt identified four SNPs in coding regions of STE3.2–2 and STE3.2–3. These SNPs gave rise to a single additional STE3.2–2 variant with two amino acid substitutions close to the C-terminus (S22 Fig). Pgt was the most polymorphic for Pra. We identified multiple distinct copies of STE3.2–2 and STE3.2–3 in the global Pgt population. We identified one STE3.2–2 variant with several amino acid changes (S22B Fig). STE3.2–3 was the most polymorphic in Pgt, including two isolates, TTTSK and UVPgt60, which contain potential nonsense mutations leading to premature stop codons (S22B Fig). Overall, our analysis suggests that Pra is most likely biallelic in Pca, Pt, and Pst, while Pgt may have more than two functional Pra alleles.

Lastly, we investigated the nucleotide sequences of all mfa alleles at the species rank. mfa1 and mfa2 alleles were fully conserved in Pca and Pst, while Pt had only a single non-synonymous change in mfa1/3 (S3 Fig). In contrast, mfa1 and mfa3 in Pgt had several non-synonymous variations, yet all were located outside the predicted mature pheromone peptide sequences (S23 Fig). This species level variation in Pgt at mfa is consistent with the variation observed in STE3.2–2 and STE3.2–3.

MAT loci are conserved at the species level in Puccinia triticina while showing transposable element expansion specific to the STE3.2–3 allele

We made use of four chromosome-scale and fully-phased genome assemblies for Pt (Table 1) to explore structural conservation or plasticity of the MAT loci within a species of cereal rust fungi. We analyzed macro-synteny, detailed nucleotide synteny, protein-coding gene conservation and TE composition for the HD and PR locus within and between Pt 76 [48], Pt 15 [59], Pt 19NSW04 and Pt 20QLD87 [37]. A recent study suggested that Pt 19NSW04 arose via somatic hybridization between Pt 76 and Pt 20QLD87 or close relatives in Australia mid-2010 [37]. Hence, Pt 19NSW04 and Pt 76 share one complete nuclear haplotype B, which contains STE3.2–2, and Pt 19NSW04 and Pt 20QLD87 share one complete nuclear haplotype C, which contains STE3.2–3. The Pt 15 isolate is unrelated and was sampled in China in 2015.

Macro-synteny plots of the HD locus-containing chromosomes suggested that shared nuclear haplotypes, e.g. Pt 19NSW04 and Pt 76 hapB or Pt 19NSW04 and Pt 20QLD87 hapC, are more conserved when compared to the HD locus contained within each dikaryotic genome (S5 and S24 Figs). Detailed nucleotide synteny and protein-coding gene conservation analysis revealed that gene order and function are highly conserved in regions proximal to the HD locus between all tested nuclear haplotypes (S25 Fig). To investigate whether any specific TE families cluster around the HD locus, we built a TE database based on the Pt 76 reference genome and used it to annotate TEs in all four isolates. We identified similar coverage of TEs, at the order classification level, around the HD locus in all dikaryotic genomes (S26 Fig).

We used identical analyses for the PR locus to understand its evolution at species level in Pt. We compared the overall structural relationship between chromosomes containing STE3.2–2 or STE3.2–3 using macro-synteny plots. Like the HD locus, chromosomes of shared nuclear haplotypes, e.g. Pt 19NSW04 vs Pt 76 hapB with STE3.2–2, were more similar to each other than non-shared haplotypes, e.g. Pt 76 hapB vs Pt 15 hapA with STE3.2–3 (S12 and S27A Figs). The same applied for STE3.2–2 (S12 and S27B Figs). We clearly observed clustering of repetitive sequences around the STE3.2–2 and STE3.2–3 locus with the latter more pronounced (S27 Fig). These repetitive sequences appeared to be allele specific as they were not visible in macro-synteny plots comparing STE3.2–2 and STE3.2–3 containing chromosomes (S12 Fig). Detailed analyses of nucleotide synteny and protein-coding genes showed that genes proximal to either Pra allele were mostly conserved while some haplotypes appeared to have inversions downstream of STE3.2–2 relative to the Pt 76 reference (Fig 6A and 6B). Gene synteny was more conserved for genes surrounding STE3.2–3 yet gene distance appeared to be especially variable between the two mfa1/3s and STE3.2–3 (Fig 6B). We explored the TE composition around each Pra allele and tested if TE expansions could explain the difference in intergenic distances around the STE3.2–3 allele. TE composition around STE3.2–2 and STE3.2–3 was markedly different, with the STE3.2–3 allele displaying an increase of LTR retrotransposons (Fig 6C). TE coverage and composition around STE3.2–2 was overall consistent between the different haplotypes (Fig 6C). In the case of STE3.2–3 we found a single LTR Ty3 (also known as Gypsy [79]) TE family (Ty3_Pt_STE3.2–3, Fig 6D) highly expanded at the locus with varying coverage between haplotypes ranging from 14.7% to 25.4% with Pt 20QLD87 having the highest coverage (Fig 6C). 
The percentage identity relative to the consensus sequence was > 99% and most copies were around 8 kb (Fig 6E), which indicates active transposition of Ty3_Pt_STE3.2–3 in recent history. To investigate whether this TE family was specific to the STE3.2–3 allele, we compared its abundance at the PR locus with its abundance across the whole genome. We found that Ty3_Pt_STE3.2–3 only clustered on STE3.2–3 containing chromosomes in all four strains with only a very limited number of single copies found on other chromosomes (S28 Fig). Indeed, Ty3_Pt_STE3.2–3 had a strong preference to insert between Ptmfa1/3s and STE3.2–3 (S29 Fig). To further investigate the structure of Ty3_Pt_STE3.2–3, we used LTR_FINDER [80] to identify 5’ and 3’ LTR regions and other features related to retrotransposons. Ty3_Pt_STE3.2–3 was found to have a short LTR of around 150 bp in all four strains, a target site repeat (TSR) region of ‘AAGT’, and pol genes containing reverse transcriptase (RT), ribonuclease H (RH), and integrase (INT). However, we failed to identify any group-specific antigen (gag) in these TEs (Fig 6D).
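Percentage identity of an individual TE copy relative to a family consensus can be computed from a pairwise alignment. The sketch below assumes two hypothetical, already aligned sequences of equal length with ‘-’ marking gaps, and counts gap positions as mismatches. Several definitions of percentage identity are in use, so this is one possible convention and not necessarily the one underlying Fig 6E.

```python
def percent_identity(aligned_copy, aligned_consensus):
    """Percent identity over alignment columns, counting gaps as mismatches."""
    if len(aligned_copy) != len(aligned_consensus):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_copy.upper(), aligned_consensus.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, not sequences from this study.
consensus = "ACGTACGTAC"
te_copy = "ACGTACGTAA"
identity = percent_identity(te_copy, consensus)
print(f"{identity:.1f}% identity")
```

High identity to the consensus across many full-length copies is what motivates the inference of recent transposition, since older insertions are expected to have accumulated independent mutations.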

Fig 6. The PR locus shows an STE3.2–3 allele specific accumulation of a unique Ty3-transposon family in Puccinia triticina.

Synteny graphs of (A) STE3.2–2 and (B) STE3.2–3 including proximal regions in four different P. triticina isolates Pt 15, Pt 19NSW04, Pt 20QLD87, and Pt 76. Both loci are overall syntenic while the STE3.2–3 locus displays an extension of the intergenic distance especially between the pheromone receptor and the two mfa genes. Red lines between chromosome sections represent gene pairs with identity higher than 70% and grey shades represent conserved nucleotide sequences (≥ 1000 bp and identity ≥ 90%). For additional annotations please refer to the included legend (“Legend”). (C) Transposable element coverage at the order classification level at the PR locus in the four different P. triticina isolates; empty white bars represent the coverage of Ty3_Pt_STE3.2–3. (D) Structure of a representative Ty3_Pt_STE3.2–3 copy. LTR–Long Terminal Repeat, RT–Reverse transcriptase, RH–RNAse H, and INT–Integrase. Distributions of (E) percentage identity relative to the consensus sequence (left) and lengths (right) of individual Ty3_Pt_STE3.2–3 insertions at the STE3.2–3 gene and proximal regions in four different P. triticina isolates.

STE3.2–1 genes are highly conserved within and among cereal rust fungi

Similar to previous studies [19,34], we identified an additional Pra-like gene, STE3.2–1, on chromosome 1. However, its two alleles are nearly identical in all dikaryotic genome assemblies and are not associated with any pheromone peptide precursor genes [70]. We investigated the genealogy of the Pra-like STE3.2–1 using P. polysora f. sp. zeae STE3.2–1 as an outgroup [81]. The two STE3.2–1 alleles identified in each isolate were almost identical and grouped by species identity. Overall, STE3.2–1 showed very little variation within each species (S30 Fig).

We also analyzed STE3.2–1 and its proximal regions as above for the MAT loci. The three independent analyses revealed that STE3.2–1 and its proximal regions are highly conserved at the nucleotide and protein-coding gene level within dikaryotic genomes and are syntenic between species (S31–S33 Figs). This shows a clear difference between the STE3.2–1 gene and the two MAT loci. Hence, as previously suggested, STE3.2–1 is likely not involved in mating compatibility in the examined cereal rust fungi.

MAT genes are expressed late in the asexual infection cycle during urediniospore production

Importance of MAT genes in mate compatibility is well established for Pt and Pst [19,82], yet it is unclear if they are expressed and functional during asexual reproduction on cereal hosts as found in Pt [19]. Hence, we investigated the expression patterns of MAT genes in Pca, Pgt and Pst using publicly available RNA-seq infection time series of their cereal hosts [43,46,53,83] (S2 Table). We applied the trimmed mean of M-values (TMM) normalization to read counts and assessed quality of the RNA-seq datasets by multidimensional scaling (MDS) plots (S34 Fig). The MDS plots confirmed the suitability of the datasets for detailed expression analysis based on technical replicates clustering closely together and one dimension separating samples according to their infection progress. The stable expression of two house-keeping genes for each species across all samples further confirmed the suitability of the datasets for detailed MAT gene expression analysis (S35 Fig).

STE3.2–2 and STE3.2–3 were upregulated during the asexual infection process and always displayed highest expression at the latest time point available, which coincides with sporulation and production of urediniospores (Fig 7). We confirmed the differential expression at later infection timepoints with a likelihood ratio test with a p-value cut-off of < 0.05 (S36 Fig). Similarly, HD genes are upregulated during asexual infection of the cereal host (Fig 7). We did not observe any expression of STE3.2–1 in any of the samples. Lastly, we were also interested to see if we could detect transcripts of MAT genes in cereal rust spores before infection and in specialized infection structures called haustoria. We used an additional Pst RNA-seq dataset and compared the expression of MAT genes in ungerminated spores, germinated spores, 5-days post infection, 9-days post infection and haustoria enriched samples (S37 Fig) [53]. HD genes displayed some expression in spores albeit lower than observed in later stages of infection during spore production in the asexual cycle. In contrast, we did not detect any expression of Pra genes in ungerminated and germinated spores. Yet, Pra genes were expressed and upregulated at later stages of the asexual infection consistent with other publicly available datasets (Figs 7C and S37).
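For orientation, the p-value of a likelihood ratio test can be sketched as follows. The example assumes that a full and a reduced model have already been fitted and that they differ by a single parameter, so that the test statistic is compared to a chi-squared distribution with one degree of freedom. The log-likelihood values are hypothetical, and the sketch does not reproduce the model fitting or the software used for the expression analysis in this study.

```python
import math


def lrt_pvalue_df1(loglik_reduced, loglik_full):
    """P-value of a likelihood ratio test with one degree of freedom.

    Uses the identity that the chi-squared survival function with one
    degree of freedom equals erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    stat = max(stat, 0.0)  # guard against small negative values from rounding
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log-likelihoods for one gene (not study data).
p_value = lrt_pvalue_df1(loglik_reduced=-120.0, loglik_full=-112.5)
is_significant = p_value < 0.05
print(f"p = {p_value:.2e}, significant: {is_significant}")
```

When many genes are tested at once, p-values are usually adjusted for multiple testing before a cut-off is applied; that step is omitted here.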

Fig 7. MAT gene expression is upregulated in the late stage of asexual infection cycle of the cereal host.

(A) Trimmed mean of M-values (TMM)-normalized values of MAT genes in P. coronata f. sp. avenae (“Pca 12NC29”) at 48 and 120 hours post infection (hpi). (B) TMM-normalized values of MAT genes in P. graminis f. sp. tritici (“Pgt 21–0”) at 48, 72, 96, 120, 144 and 168 hpi. (C) TMM-normalized values of MAT genes in P. striiformis f. sp. tritici (“Pst 87/66”) at 24, 48, 72, 120, 168, 216 and 264 hpi.

Discussion

Our comparative analysis of predicted MAT genes and MAT loci in four cereal rust fungi provides novel insight into the evolution of these genes and their proximal regions within and between species.

First, we confirmed that HD and PR loci are unlinked and located on two distinct chromosomes in all four cereal rust fungi, which supports previous analyses on more fragmented genome assemblies or for individual rust fungal species [19,23,35]. Further, both loci were heterozygous in all twelve dikaryotic genome assemblies. Taken together, these observations strongly support the hypothesis of rust fungi displaying a tetrapolar mating compatibility system. However, this genome biology-informed hypothesis requires experimental validation via crosses and genetic analysis of resulting offspring.

As previously observed for Pt [19], the Pra homologs STE3.2–1 were highly conserved in all cereal rust fungi and had no associated mfa genes. This suggests that STE3.2–1 does not regulate mate compatibility. In P. coronata f. sp. avenae, P. graminis f. sp. tritici and P. striiformis f. sp. tritici, STE3.2–1 is flanked by anaphase promoting complex APC10/Doc1, which regulates mitosis [84], and Elongation factor EF1B [85]. Together, these findings suggest that STE3.2–1 might play a role during the cell cycle or different developmental programs as reported for other basidiomycete fungi [86,87]. However, as previously reported for P. triticina [19], expression of STE3.2–1 was absent or very low during the infection cycle of the cereal host.

Fungi with a tetrapolar mating compatibility system are often multiallelic for at least one of the two MAT loci. For example, U. maydis has many distinct transcription factor alleles at the HD locus [88,89]. We performed extensive analysis of the HD locus extracted from dikaryotic genome assemblies and developed a workflow to de novo reconstruct loci directly from Illumina short-read sequencing data. We investigated the allelic diversity of bW-HD1 and bE-HD2 in 12–15 additional isolates for each cereal rust fungal species. For each species these were sampled in a way to provide the best available representation of unique genotypes from global populations. This analysis demonstrated that the HD locus is multiallelic in the four species of cereal rust fungi. The four species had between 6 and 12 alleles of bW-HD1 and bE-HD2. This supports previous analyses of the allelic diversity of these genes in P. striiformis f. sp. tritici, which identified nine alleles of bW-HD1 and bE-HD2 [23]. Our reported number of bW-HD1 and bE-HD2 alleles is likely an underestimate given the limited sampling currently publicly available. Most analyzed isolates were retrieved from cereal hosts during the asexual reproduction cycle and from regions that are known to be devoid of sexually recombining populations. The only exception was P. coronata f. sp. avenae isolates, which are reported to be derived from sexually recombining populations in the United States of America [45]. We identified the highest allelic variation of HD genes (eleven bW-HD1 and twelve bE-HD2) in P. coronata f. sp. avenae. This is similar to previous studies of isolates from likely sexual populations of P. striiformis f. sp. tritici in China, India, and Pakistan, which showed the highest diversity in HD alleles compared to other clonal lineages in Western wheat growing regions [23].

In contrast to multiallelic HD genes, the Pra and associated mfa genes were biallelic in P. coronata f. sp. avenae, P. triticina and P. striiformis f. sp. tritici. In all cases, one Pra allele was linked to one specific invariant mfa allele whereby P. triticina carried two mfa1/3 genes that were nearly identical in the sequence of the predicted mature pheromone peptide. This biallelism of Pra/mfa is similar to other tetrapolar dimorphic basidiomycetes like Microbotryum sp., U. maydis and Cryptococcus sp. [63,90,91]. The notable exception was P. graminis f. sp. tritici, which encodes two mfa genes (mfa1 and mfa3) in proximity to STE3.2–3. The two encoded precursor peptides are highly variable and are likely processed into two distinct mature pheromone peptides with at least one amino acid difference. In addition, we cannot exclude that MFA3 encodes additional mature pheromone peptides, which might lead to distinct receptor activation. Interestingly, Pgt STE3.2–3 was also the most variable Pra allele in all species. We identified three STE3.2–3 variants that have at least four non-synonymous nucleotide changes relative to the most common Pgt STE3.2–3 variant. Moreover, the STE3.2–3 variants in Pgt 126–6711, Pgt ME-02, and Pgt PK-01 shared four distinctive amino acid changes in the C-terminus while being linked to an mfa1 allele that gives rise to MFA1 with four amino acid substitutions. All analyzed P. graminis f. sp. tritici isolates carried an invariant STE3.2–2 allele with only one mfa gene in proximity and we did not identify a clearly distinct additional Pra allele. Hence, the functional significance of the observed variation at the STE3.2–3 locus is currently difficult to assess without more extensive sampling, more phased chromosome-scale genome assemblies, and direct functional crossing studies in P. graminis f. sp. tritici. 
Yet, the organization of the STE3.2–3 gene is reminiscent of the PR locus of Sporisorium reilianum in the order Ustilaginales. In S. reilianum, the PR locus is at least triallelic with each allele encoding one pheromone receptor and two distinct pheromone peptides. Each of the mature pheromones derived from one allele specifically activates only one of the receptors encoded by the two other alleles [92]. More studies are needed to assess if the PR locus in P. graminis f. sp. tritici is bi- or multiallelic.

We initially predicted that biallelic MAT loci would show stronger signatures of recombination suppression and genomic degeneration than multiallelic loci based on a published mathematical model [2] and observations in other basidiomycete species [9,93]. Indeed, the multiallelic HD locus was mostly syntenic within dikaryotic genomes of the four cereal rust species. This initial analysis with a limited set of genomes per species is likely to hold true at the population level. In the case of P. triticina, the eight HD alleles of four isolates were all highly syntenic with most proximal genes conserved between the different alleles. Consistently, the composition of TEs at order classification level was highly similar between HD alleles of the same species.

In contrast, the PR locus showed strong signs of recombination suppression and genomic degeneration; however, the exact patterns were not identical in all four species. PR proximal regions showed the strongest signs of recombination suppression and were the most extended in P. coronata f. sp. avenae, P. triticina and P. striiformis f. sp. tritici with Pra genes located at edges or within large gene deserts and TE islands of 0.7–1.2 Mbp. At least some of these repeats were shared between the two PR alleles within each dikaryotic genome of P. coronata f. sp. avenae and P. striiformis f. sp. tritici based on whole chromosome alignments. This was not the case for PR alleles in P. triticina. In all three cases, there appeared to be differences in composition and coverage of TEs at the order, superfamily, and family classification level between the two PR alleles. This was most obvious for P. triticina, where we analyzed four dikaryotic genome assemblies and identified a TE family specific to STE3.2–3, namely Ty3_Pt_STE3.2–3. Ty3_Pt_STE3.2–3 displayed preferential insertion at the STE3.2–3 allele between the pheromone receptor and the two mfa genes. Ty3_Pt_STE3.2–3 is likely a currently active TE based on its sequence conservation, length of individual TE copies, and the suggested nuclear exchange between the studied Australian isolates within the last decade [37]. These initial observations of partial allele-specific TE composition and coverage in P. coronata f. sp. avenae, P. triticina and P. striiformis f. sp. tritici contrast with observations from TE analysis of the biallelic MAT locus in 15 species of Microbotryum. The MAT loci in Microbotryum spp. did not display allele-specific patterns of TE composition, yet were likely reservoirs for TE families that drove TE expansion at a genome scale at discrete time points associated with the extension of evolutionary strata around the MAT locus [42].

Compared to these observations at biallelic PR loci, the PR locus in P. graminis f. sp. tritici contained fewer TEs, more genes, and was much shorter at ~300 kbp. In addition, the TE composition and coverage were very similar between the two alleles within the dikaryotic genome of Pgt 21–0. These observations could be biological or a technical artifact of the assembly process of this specific genome. The Pgt 21–0 assembly is based on older, noisy PacBio long-read technologies, included extensive manual curation and relied on gene synteny between contigs and haplotypes for scaffolding. This gene-based scaffolding might have been broken by long stretches of TEs around the PR locus, as observed for the three other cereal rusts, and thereby introduced scaffolding errors in Pgt 21–0 [35]. Several other genomes generated with older, error-prone PacBio long-read technologies (e.g. Pst 104E and Pca 12SD80) had issues assembling the PR locus correctly. Alternatively, these marked differences of the P. graminis f. sp. tritici PR locus might reflect biological reality, as models predict that recombination suppression and genomic degeneration are more pronounced at biallelic loci [2]. If the PR locus in P. graminis f. sp. tritici is not truly biallelic, as potentially indicated by the gene-based analysis of Pra and mfa alleles, the observed organization and level of genomic degeneration at the PR locus could support this prediction. However, more sampling and high-quality, phased chromosome-scale genome assemblies are necessary to differentiate these alternative hypotheses concerning the PR locus in P. graminis f. sp. tritici.

The functions of MAT genes in rust fungi are currently unknown, beyond their likely role in mediating mate compatibility. However, in P. triticina, P. striiformis f. sp. tritici and M. larici-populina [19,40,82,94], MAT genes are expressed during the sexual and asexual infection cycle. Suppression of MAT gene expression in P. triticina during the asexual infection cycle on wheat reduces infection severity and spore production [19]. Here, we show that MAT genes are also upregulated during the late stage of the asexual infection cycle of the cereal host in P. coronata f. sp. avenae, P. graminis f. sp. tritici, and P. striiformis f. sp. tritici. This suggests a broader conservation of MAT gene expression during the late stages of the asexual infection cycle of rust fungi. Cells of rust fungi carry more than two nuclei in the same cytosol during the asexual infection cycle [95]. Urediniospores always carry two nuclei of compatible mating-types and never two nuclei of identical mating-types. Studies from the early 20th century show that the two nuclei of urediniospore mother cells undergo synchronous nuclear division within the same cytoplasm giving rise to four nuclei within the same urediniospore precursor cells [95]. This is followed by nuclear movement to restore appropriate nuclear pairing and planar cell division that enables urediniospore maturation and reconstitutes the dikaryotic state [95]. The observed upregulation of MAT genes aligns with the development of urediniospore primordia and spore production. Hence, we hypothesize that MAT genes regulate nuclear pairing during production of dikaryotic urediniospores from primordia.

Understanding mating-types in cereal rust fungi has significant implications for agriculture and prediction of new virulent isolates. Recent reports suggest that nuclear exchange between isolates during the asexual infection cycle (also known as somatic hybridization [96]), in addition to sexual reproduction, can lead to nuclear assortment [96]. Nuclear exchange and the viability of the resulting offspring are likely regulated by MAT genes because all proposed events have given rise to isolates with dikaryotic genomes that carry opposing MAT gene pairs. We confirmed these observations with our MAT gene analysis in the case of Pgt Ug99 (Pgt 21–0 and unknown nuclear donor) and Pt 19NSW04 (Pt 76 and Pt 20QLD87 as proposed nuclear donors). Conversely, our analysis showed that nuclear exchange has not occurred for the analyzed isolates of P. coronata f. sp. avenae and P. striiformis f. sp. tritici, as none of the isolates share identical HD and Pra alleles that could be carried in the same nucleus.
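The allele-sharing argument above can be illustrated with a minimal Python sketch. It assumes unphased MAT genotypes, so sharing at least one HD allele and at least one Pra allele is treated as a necessary (not sufficient) condition for two isolates to carry a common nucleus. The function name, isolate names, and allele labels are hypothetical and do not correspond to alleles reported in this study.

```python
def could_share_nucleus(isolate_a, isolate_b):
    """Return True if two isolates share at least one HD and one Pra allele.

    Each isolate is a dict with the keys "HD" and "Pra", each holding the set
    of alleles found in its dikaryotic genome (unphased).
    """
    shared_hd = set(isolate_a["HD"]) & set(isolate_b["HD"])
    shared_pra = set(isolate_a["Pra"]) & set(isolate_b["Pra"])
    return bool(shared_hd) and bool(shared_pra)


# Hypothetical genotypes for illustration only.
isolates = {
    "isolate_X": {"HD": {"HD_a", "HD_b"}, "Pra": {"STE3.2-2", "STE3.2-3"}},
    "isolate_Y": {"HD": {"HD_b", "HD_c"}, "Pra": {"STE3.2-2", "STE3.2-3"}},
    "isolate_Z": {"HD": {"HD_d", "HD_e"}, "Pra": {"STE3.2-2", "STE3.2-3"}},
}
```

With a biallelic PR locus every dikaryon carries both Pra alleles, so in this toy setting the HD alleles decide whether a shared nucleus is possible.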

In cases of nuclear exchange, these nuclei behave like entire linkage groups, with virulence (effector) allele complements being tied to specific MAT alleles, if somatic hybridization occurs in the absence of parasexual reproduction as currently proposed. This implies that the nuclear MAT alleles define the possible virulence allele combinations that can arise via nuclear exchange, from either sexual or somatic reproduction. We hence can predict the most likely novel virulence allele combinations that might arise given circulating nuclear genomes in existing cereal rust fungal populations on their cereal hosts. This observation requires us to generate phased and chromosome-scale genome assemblies of all cereal rust fungi at the population level. These will improve disease management strategies and the durability of characterized resistance loci in the global crop germplasm, leading to smarter rust resistance breeding programs.

Material and methods

Assessment of used genomic resources

Detailed information on isolates and genomes used in this study can be found in Table 1. Completeness of genome assemblies was assessed by BUSCO v5.5.0 [97] in the genome mode with the basidiomycota_odb10 gene set.

Identification of MAT genes

Annotations of Pt 19NSW04 and Pt 20QLD87 were downloaded from the CSIRO Data Access Portal (URL: https://data.csiro.au/), and CDS and protein data of Pt 19NSW04 and Pt 20QLD87 were generated with Gffread [98]. MAT genes including the HD and Pra genes previously identified in P. triticina (Table 2) [19] were used as blastp and blastn [99] queries to identify orthologs in P. graminis f. sp. tritici Ug99 and 21–0 [35], P. striiformis f. sp. tritici 134E [52], 104E [53] and DK0911 [54], P. triticina 76 [48], 19NSW04, 20QLD87 [37] and 15 [59], P. polysora GD1913 [81], and P. coronata f. sp. avenae 203 [44], 12NC29 and 12SD80 [43] isolates. The STE3.2–3 allele missing from the Pst 104E assembly was recovered by mapping short reads to the reference genome Pst 134E with bwa-mem2 [100]. Mfa genes were identified by using the mfa genes of Pt BBBD, Pgt SCCL and Pst 78 [19] as a custom motif database to search for similar patterns in all studied species with Geneious Prime v. 2022.1.1 [101], using a function built from fuzznuc [102]. Karyograms of Pca 203, Pgt 21–0, Pt 76 and Pst 134E were made with RIdeogram 0.2.2 [103].

Genealogical analysis of the HD and Pra genes

To construct a species tree for cereal rust fungi, protein sequences encoded in one nuclear genome from each cereal rust fungus species under study were used as input data. The input data were processed by Orthofinder 2.5.5 [67,104] to infer maximum likelihood trees from multiple sequence alignments (MSA). FastTree v2.1 [105] was used to calculate likelihood-based local support values. Coding regions of identified HD and Pra genes were aligned with MACSE v2.07 [106] with default settings. Coding-sequence based alignments were trimmed with trimAl v1.2 [107] with the option -gt 0.9 to remove alignment positions with gaps in more than 10% of sequences, and trimmed alignments were realigned with MACSE. Nucleotide alignments were imported into BEAUti v2.7.6 [108] to generate .xml files, and bModelTest 1.3.3 [109] was used to test the best nucleotide substitution model separately with a Markov chain length of 15 million. HKY+G was chosen for inferring the two HD genes, whereas TN93+I was chosen for Pra genes and STE3.2–1 individually. BEAST 2 MCMC was then run under a strict molecular clock with the Yule Model tree. Markov chains with a length of 15 million were used to provide a sufficient effective sample size (ESS) of independent samples. Tracer v.1.7.2 [110] was used to check that ESS values of all parameters were above 200 for each run. TreeAnnotator [108] was used to generate a maximum clade credibility tree with 10% burn-in. Figtree v.1.4.4 [111] was used to visualize the final output.
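The gap-threshold trimming step can be illustrated with a minimal Python sketch. It assumes that a gap threshold of 0.9 retains only alignment columns in which at least 90% of sequences carry no gap. The function name and the toy alignment are hypothetical, and the sketch ignores codon structure, which a coding-sequence workflow would need to preserve.

```python
def trim_gappy_columns(alignment, gap_threshold=0.9, gap_char="-"):
    """Keep alignment columns whose fraction of non-gap residues is at least
    gap_threshold. `alignment` maps sequence names to aligned strings."""
    if not alignment:
        return {}
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    n_seqs = len(names)
    keep = []
    for i in range(length):
        non_gap = sum(1 for name in names if alignment[name][i] != gap_char)
        if non_gap / n_seqs >= gap_threshold:
            keep.append(i)
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


# Toy alignment (hypothetical sequences): column 4 is a gap in two of three rows.
toy = {
    "allele_1": "ATG-CA",
    "allele_2": "ATGGCA",
    "allele_3": "ATG-CA",
}
trimmed = trim_gappy_columns(toy)
```

In the toy alignment only the fourth column falls below the threshold, so all other columns are retained.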

Nucleotide sequences of identified mfa genes were aligned with MACSE v2.07 [106] with default settings and trimmed with trimAl v1.2 [107] with the option -gt 0.9 to remove gaps. The trimmed alignment was used to construct a maximum likelihood tree with IQ-TREE 2 (version 2.1.4-beta) [112]. The best model was estimated by ModelFinder (-m MFP) [113] based on the corrected Akaike Information Criterion (AICc) score [114], and 10000 replicates were set for ultrafast bootstrap [115]. The K2P+G4 substitution model was chosen by ModelFinder [113].

To compare topologies of bW-HD1 and bE-HD2 gene trees, IQ-TREE 2 (version 2.1.4-beta) [112] was used to build unrooted maximum likelihood trees for each species and both genes separately. The best model for each alignment was estimated by ModelFinder (-m MFP) [113] based on the AICc score [114], and 10000 replicates were set for ultrafast bootstrap [115]. The top 100 optimal gene trees of bW-HD1 and bE-HD2 were used as alternate topologies after removing branch length values to perform AU tests [68] with IQ-TREE 2 (-keep-ident -zb 10000 -n 0 -au), with pAU < 0.05 suggesting topological incongruence between the two HD genes.

To detect potential recombination events, the HD locus plus 1 kb of proximal DNA sequence of each species was aligned with MAFFT v7.490 [116,117] in Geneious Prime v. 2022.1.1 [101], and both sides of each alignment were trimmed manually. The RDP [118], GENECONV [119], BOOTSCAN [120], MAXCHI [121], CHIMAERA [122], SISCAN [123], and 3SEQ [124] methods in the program RDP5 [69] were applied to detect unique recombination events for the HD locus of each species using default settings. Recombination events that were detected by at least five of the above methods were accepted. RDP5 projects of each species are available in the Dryad Database [70].
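The acceptance rule for recombination events (support from at least five of the seven methods) can be sketched as follows. This is an illustrative filter only: the input structure, function name, and event identifiers are hypothetical and do not reflect the RDP5 project file format.

```python
RDP5_METHODS = ("RDP", "GENECONV", "BOOTSCAN", "MAXCHI", "CHIMAERA", "SISCAN", "3SEQ")


def accepted_events(events, min_methods=5):
    """Return the sorted ids of events supported by at least `min_methods`
    of the seven detection methods.

    `events` maps an event id to the collection of method names that
    detected that event.
    """
    accepted = []
    for event_id, methods in events.items():
        supporting = set(methods) & set(RDP5_METHODS)
        if len(supporting) >= min_methods:
            accepted.append(event_id)
    return sorted(accepted)


# Hypothetical detection results for illustration only.
toy_events = {
    "event_1": {"RDP", "GENECONV", "BOOTSCAN", "MAXCHI", "CHIMAERA"},
    "event_2": {"RDP", "MAXCHI", "3SEQ"},
}
```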

Estimation of centromeric regions

Hi-C reads of Pca 203 and Pt 76 were mapped back to their corresponding haploid genome assembly with Juicer 1.6 [75] and processed with the 3d-dna [125] pipeline. Juicebox 1.8.8 [75] was used to visualize Hi-C maps. The approximate positions of centromeric regions were estimated by zooming in on the Hi-C heatmap in Juicebox and selecting the region corresponding to strong bowtie-like interaction signals, which have been validated and widely used in previous studies [126,127].

dS value estimation for allele pairs and plotting

Inspired by the approach used by Branco et al. [8] for identifying evolutionary strata on mating-type chromosomes of Microbotryum species, proteinortho 6.0.22 [128] was used to pair orthologs one-to-one with the -synteny and -singles options. Muscle 3.8.1551 [129] was applied to generate coding-sequence based protein alignments with default settings. PAML 4.9 [130] was used to calculate synonymous divergence values from alignments generated with Muscle. Genomes were split into 10 kbp windows, and the density of genes and repeats was calculated with bedtools v 2.29.2 [131]. Results were visualized with karyoploteR v3.17 [132].
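The windowed density calculation can be sketched in plain Python. This sketch assumes that density means the fraction of bases in each 10 kbp window covered by at least one feature, that intervals are 0-based and half-open, and that overlapping features are merged before counting. The function name is hypothetical, and the sketch is not a reimplementation of the bedtools commands used here.

```python
def window_density(chrom_length, features, window=10_000):
    """Fraction of each fixed-size window covered by features.

    `features` is an iterable of (start, end) tuples, 0-based half-open.
    The last window may be shorter than `window`.
    """
    n_windows = (chrom_length + window - 1) // window
    covered = [0] * n_windows
    # Merge overlapping features so shared bases are counted once.
    merged = []
    for start, end in sorted(features):
        end = min(end, chrom_length)
        if start >= end:
            continue
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    # Distribute each merged interval over the windows it touches.
    for start, end in merged:
        for w in range(start // window, (end - 1) // window + 1):
            w_start = w * window
            w_end = min(w_start + window, chrom_length)
            covered[w] += min(end, w_end) - max(start, w_start)
    return [
        covered[w] / (min((w + 1) * window, chrom_length) - w * window)
        for w in range(n_windows)
    ]
```

For example, a 25 kbp sequence with two overlapping features spanning positions 0 to 12,000 yields one fully covered window, one window covered to 20%, and one empty window.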

Synteny analysis of MAT genes and proximal regions

To investigate macrosyntenic relationships between haplotypes, self-alignment analyses between the two haplotypes of each species were done with NUCmer (version 3.23) [72] using the --maxmatch option. Alignments shorter than 1000 bp or with less than 90% identity were removed, and filtered data were visualized with Matplotlib [133].
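The alignment filter described above can be sketched as follows. The record type and function name are hypothetical, coordinates are assumed to be 1-based and inclusive, and alignment length is assumed to be measured on the reference haplotype; parsing of the NUCmer output itself is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    ref_start: int
    ref_end: int
    query_start: int
    query_end: int
    identity: float  # percent identity, 0 to 100


def filter_alignments(alignments, min_length=1000, min_identity=90.0):
    """Drop alignments shorter than `min_length` bp or below `min_identity`."""
    kept = []
    for aln in alignments:
        # abs() handles reverse-orientation hits where start > end.
        length = abs(aln.ref_end - aln.ref_start) + 1
        if length >= min_length and aln.identity >= min_identity:
            kept.append(aln)
    return kept


# Hypothetical alignments for illustration only.
toy_alignments = [
    Alignment(1, 5000, 10, 5010, 98.5),      # kept
    Alignment(6000, 6500, 7000, 7500, 99.0),  # too short
    Alignment(9000, 12000, 9000, 12000, 85.0),  # identity too low
]
```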

To investigate synteny conservation, NUCmer (version 3.23) [72] was used to align nucleotide sequences of MAT genes and proximal regions. Blastn 2.10.0 [99] was used to identify conserved genes (identity higher than 70%, e-value < 0.05) between and within species. Results of 40 genes flanking the MAT genes were visualized with gggenomes 0.9.9.9000 [134].

General repeat annotations

The REPET v3.0 pipeline [135,136] was applied together with Repbase v22.05 [137] to annotate repeats in all genomes used in this study. Config files were based on default settings, and the Wicker classification system was used to classify TE families into Class-Order-Superfamily-Family [76]. TEdenovo was used to predict novel repetitive elements and to construct specific custom databases from each genome individually as described in the manual instructions. Constructed custom databases were used for repeat annotation with TEannot. The reference TE database built for Pt 76 from TEdenovo with the REPET v3.0 pipeline was used as reference for annotating repeats in Pt 15 [59], Pt 19NSW04 and Pt 20QLD87 [37] with RepeatMasker 4.1.5 [138]. To make analyses consistent, the Pt 76 genome was reannotated in the same way as the other three isolates.

To calculate the coverage of each TE family, classification information from TEannot was used. For TEs with more than one potential classification, hits from Repbase which have more than 70% of similarity were used as evidence for reclassification.
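As an illustration of a per-family coverage calculation, the sketch below computes the percentage of nucleotides in a locus covered by each TE family. It assumes 0-based half-open coordinates and merges overlapping copies of the same family so that shared bases are counted once. The function name and family labels are hypothetical, and the reclassification step based on Repbase hits is not modelled.

```python
from collections import defaultdict


def te_coverage_percent(annotations, locus_start, locus_end):
    """Percentage of the locus covered by each TE family.

    `annotations` is an iterable of (family, start, end) tuples.
    Annotations are clipped to the locus boundaries.
    """
    locus_length = locus_end - locus_start
    if locus_length <= 0:
        raise ValueError("locus_end must be greater than locus_start")
    by_family = defaultdict(list)
    for family, start, end in annotations:
        start, end = max(start, locus_start), min(end, locus_end)
        if start < end:
            by_family[family].append((start, end))
    coverage = {}
    for family, intervals in by_family.items():
        intervals.sort()
        total = 0
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:  # overlapping or adjacent copy
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
        coverage[family] = 100.0 * total / locus_length
    return coverage


# Hypothetical annotations for illustration only.
toy_te = [("Ty3_A", 0, 100), ("Ty3_A", 50, 200), ("TIR_B", 900, 1100)]
```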

LTR_FINDER v1.07 [80] with ps_scan [139,140] was used to identify detailed structures of LTRs of Ty3_Pt_STE3.2–3 in default setting. Reverse transcriptase (A0A0C4EV49_PUCT1), Retrotransposon gag domain (A0A0C4EKM8_PUCT1), Integrase, catalytic domain (A0A0C4EMG6_PUCT1) and Retropepsins (A0A0C4F2V2_PUCT1) [141], were additionally used as query to search for motifs in identified LTR retrotransposons using tblastn 2.12.0+ [99].

Identification of variants in MAT loci

Sequence data of Pca were published by [45], data of Pgt were published by [35,46,47,58] and under PRJNA39437, data of Pt used in this study have been published earlier by [49–51] and under PRJNA39803 and PRJNA39801, and data of Pst were published by [55–57] and under PRJNA60743. All of the above data were downloaded from the NCBI SRA database (URL: https://www.ncbi.nlm.nih.gov/sra) (S1 Table). Four isolates with published chromosome-scale haplotype-phased genome assemblies, Pca 203 [44], Pgt 21–0 [35], Pt 76 [48] and Pst 134E [52], were used as references for mapping. Trimmomatic 0.39.2 [142] was used to remove adapters from raw reads, whereas FastQC v0.11.8 [143] was applied for assessing the quality before and after trimming. Bwa-mem2 [100] was used to map trimmed reads to reference genomes. MarkDuplicates (Picard) 1.70 [144] was applied to remove PCR-generated duplicates. Samtools 1.12 [145] was used to remove mapped reads with mapping quality lower than 30. IGV 2.16.2 [146] and Qualimap 2 [147] were both used to check mapping quality.
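The mapping-quality filter can be illustrated on plain-text SAM records. This sketch relies only on the SAM convention that header lines start with "@" and that MAPQ is the fifth tab-separated field; the function name and toy records are hypothetical, and it is not the samtools command used in this study.

```python
def filter_sam_by_mapq(lines, min_mapq=30):
    """Yield SAM header lines and alignment records with MAPQ >= min_mapq."""
    for line in lines:
        if line.startswith("@"):
            yield line  # keep header lines unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[4]) >= min_mapq:  # field 5 of a SAM record is MAPQ
            yield line


# Hypothetical SAM records for illustration only.
toy_sam = [
    "@SQ\tSN:chr9A\tLN:1000\n",
    "read1\t0\tchr9A\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read2\t0\tchr9A\t200\t12\t4M\t*\t0\t0\tACGT\tIIII\n",
]
```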

For reconstruction of Pra alleles, freebayes-parallel 1.3.6 [148,149] was applied to detect variants, and bcftools 1.12 [145] was used to exclude low quality variants with -e 'QUAL<40'. Samtools 1.12 [145] was used to filter out regions of interest. The Ensembl Variant Effect Predictor (VEP) 88.9 [150] was used to annotate sequence variants. Bcftools 1.12 [145] was used to generate consensus sequences, and nucleotide and protein alignments were generated and visualized with Geneious Prime v. 2022.1.1 [101].
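The combination of quality filtering and consensus generation can be sketched for the simplest case. The sketch assumes single-nucleotide substitutions only, 1-based positions, and a single alternative allele per site; indels and heterozygous calls are out of scope. The function name and toy data are hypothetical, and the code is not equivalent to the bcftools workflow used here.

```python
def consensus_from_snps(reference, variants, min_qual=40.0):
    """Apply SNPs with QUAL >= min_qual to a reference sequence.

    `variants` is an iterable of (pos_1based, ref_base, alt_base, qual).
    """
    seq = list(reference)
    for pos, ref, alt, qual in variants:
        if qual < min_qual:
            continue  # excluded as low quality
        if len(ref) != 1 or len(alt) != 1:
            continue  # indels are not handled in this sketch
        if reference[pos - 1] != ref:
            raise ValueError(f"reference mismatch at position {pos}")
        seq[pos - 1] = alt
    return "".join(seq)


# Hypothetical reference and variant calls for illustration only.
toy_reference = "ATGCCA"
toy_variants = [(3, "G", "A", 55.0), (5, "C", "T", 12.0)]
```

Only the first toy variant passes the quality threshold, so the consensus differs from the reference at position 3 alone.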

For de novo reconstruction of HD alleles, Spades [151], a de novo assembler, was used for building draft genome assemblies from available whole-genome Illumina short-read datasets (S1 Table). For isolates with average read lengths > 101 bp, k-mer size of 101 was chosen for building draft genomes. For isolates with average read lengths < 101 bp, k-mer size of 51 bp was chosen. HD genes used in genealogical analysis were used as reference genes, blastn 2.12.0+ [99] was used to search and identify HD allele containing contigs of the draft genome assemblies. KAT [152] was used to compute a k-mer matrix from de novo assembled HD contigs and reference HD loci sequences with default setting (k-mer size = 27). KAT filter was used to obtain all paired-end reads which contained HD k-mers based on the created HD k-mer matrix (-T 0.2). Spades was used to reassemble all HD related paired-end reads of the specific isolate. Final outputs were visualized in Geneious, CDS regions of reconstructed HD alleles were predicted by mapping the CDS of reference HD alleles to reconstructed contigs, ORFs were presumed with Geneious.
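The idea of retrieving HD-related read pairs by shared k-mers can be sketched in plain Python. This is a simplified stand-in and not the KAT implementation: the threshold is interpreted here as the fraction of a read's k-mers found in the target set, a pair is kept if either mate passes, and both rules are assumptions. The function names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq, k=27):
    """Set of all k-mers in a sequence (empty if the sequence is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_fraction(read, target, k=27):
    """Fraction of the read's distinct k-mers that occur in the target set."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return 0.0
    return len(read_kmers & target) / len(read_kmers)


def filter_read_pairs(pairs, references, k=27, threshold=0.2):
    """Keep read pairs in which at least one mate shares enough k-mers
    with the reference sequences (both strands considered)."""
    target = set()
    for ref in references:
        target |= kmers(ref, k) | kmers(revcomp(ref), k)
    return [
        (r1, r2)
        for r1, r2 in pairs
        if max(kmer_fraction(r1, target, k), kmer_fraction(r2, target, k)) >= threshold
    ]


# Hypothetical 60 nt reference and reads for illustration only.
toy_ref = "ATGGCGTACCTGAAGTCTAGGCTTACGATCGGATCCATGCAAGTTCGACTGAGCTTAGCA"
on_target = toy_ref[5:45]
off_target = "T" * 40
toy_pairs = [(on_target, off_target), (off_target, off_target)]
```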

Expression analysis of MAT genes

RNA-seq reads of rust fungi on infected plants were obtained from the NCBI SRA database (URL: https://www.ncbi.nlm.nih.gov/sra). The datasets included RNA-seq data of Pst 87/66 on wheat (PRJEB12497) [83], Pca 12NC29 on Brachypodium distachyon (PRJNA398546) [43], Pgt 21–0 on wheat (PRJNA415866) [46], and Pst 104E on wheat (PRJNA396589) [53]. Raw reads were quality checked with FastQC v0.11.8 [143], and Trimmomatic 0.50 [142] was used to trim low quality reads with the parameters ILLUMINACLIP:adapter.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36. Kallisto 0.44.0 [153] was then applied to align and count transcripts with the -b 100 parameter for all paired-end reads and -b 50 -l 200 -s 20 --single --single-overhang for single-end reads. Pst 104E cDNA data was used as reference for Pst 104E infected tissues. Since no reference genome of Pst 87/66 is available, cDNA data of Pst 134E was used as reference instead, with HD genes of Pst 87/66 supplemented to obtain better quantification of the specific HD alleles. Pgt 21–0 and Pca 12NC29 were used as reference for mapping reads from Pgt 21–0 and Pca 12NC29 infected tissues, respectively. Gene names of Pca 12NC29 and Pst 104E were added with funannotate annotate v 1.8.5 [154,155] with default parameters. Kallisto outputs were imported to EdgeR 3.17 [156] with tximportdata [157] following user instruction, followed by normalizing with the TMM method. Significant upregulation of MAT genes was assessed with the LRT method, and ggplot2 [158] was used to visualize the result. Housekeeping genes were identified in each dataset based on the following conditions: p-value > 0.1 and log(FC) < 0.5. Functional annotations of candidate housekeeping genes were retrieved from interproscan [155].
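The housekeeping gene criteria can be sketched as a simple filter over differential expression results. One assumption is made explicit here: the fold-change criterion is applied to the absolute log fold change, so that genes changing in either direction are excluded; the text states only log(FC) < 0.5. The function name and gene identifiers are hypothetical.

```python
def candidate_housekeeping(results, min_p=0.1, max_abs_logfc=0.5):
    """Return sorted gene ids with p-value > min_p and |logFC| < max_abs_logfc.

    `results` maps a gene id to a (log_fold_change, p_value) tuple.
    Using the absolute value of logFC is an assumption of this sketch.
    """
    return sorted(
        gene
        for gene, (logfc, p_value) in results.items()
        if p_value > min_p and abs(logfc) < max_abs_logfc
    )


# Hypothetical differential expression results for illustration only.
toy_results = {
    "gene_A": (0.10, 0.80),   # stable: candidate
    "gene_B": (2.30, 0.001),  # significantly upregulated
    "gene_C": (-1.20, 0.40),  # large change despite high p-value
    "gene_D": (-0.30, 0.25),  # stable: candidate
}
```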

Dryad DOI

https://doi.org/10.5061/dryad.w0vt4b8zm [70].

Supporting information

S1 Table. List of whole-genome Sequence Read Archive data used in the present study.

The table provides information on all whole-genome Sequence Read Archive (SRA) data used in the present study. Metadata includes species abbreviation (“Species”), isolate name (“Isolate”), SRA identifier (“SRA ID”), and the initial reference (“Citation”). Pca: P. coronata f. sp. avenae; Pgt: Puccinia graminis f. sp. tritici; Pt: P. triticina; Pst: P. striiformis f. sp. tritici.

(XLSX)

pgen.1011207.s001.xlsx (12.4KB, xlsx)
S2 Table. List of RNAseq Sequence Read Archive data used in the present study.

The table provides information on all RNAseq Sequence Read Archive (SRA) data used in the present study. Metadata includes species abbreviation (“Species”), isolate name (“Isolate”), timepoint of infection (hpi, hours post infection) or sample type (“Sample Type”), timepoint/status of spores (“Time point/Status”), replicate number (“Replicate”), SRA identifier (“SRA ID”), and the initial reference (“Citation”). Pca: P. coronata f. sp. avenae; Pgt: Puccinia graminis f. sp. tritici; Pst: P. striiformis f. sp. tritici.

(XLSX)

pgen.1011207.s002.xlsx (11.7KB, xlsx)
S1 Fig. Two unlinked MAT loci suggest tetrapolar mating types in four Puccinia spp.

Karyograms of P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”) with the positions of HD, PR and STE3.2–1 loci marked by black arrow heads. HD, PR and STE3.2–1, located on chromosome 4, chromosome 9 and chromosome 1, respectively, suggest tetrapolar mating types in these four species.

(TIF)

pgen.1011207.s003.tif (1.1MB, tif)
S2 Fig. Genetic distance between pheromone precursor genes (mfa) and pheromone receptor (Pra) genes varies from 100 bp to 100 kb.

The diagrams display the location of STE3.2–2 and STE3.2–3 and their linked mfa genes on chromosome 9A and 9B in P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”). The grey bar represents chromosome subsections containing the genes of interest with the numbers indicating the absolute location in mega base pairs on chromosome 9A and 9B. The genetic distance between mfa1/mfa3 and STE3.2–3 is highly variable between species, whereas STE3.2–2 and mfa2 are tightly linked in all species.

(TIF)

pgen.1011207.s004.tif (233.8KB, tif)
S3 Fig. MFA protein alignments indicate mating factor a proteins are overall conserved within species.

(A) Alignment of MFA1 protein sequences from four different rust fungal species. MFA1, which is linked to STE3.2–3, is mostly conserved within species. Two near identical MFA1/3 copies can be identified in four P. triticina isolates. In P. graminis f. sp. tritici, MFA1 has four amino acid substitutions in Pgt SCCL versus Pgt 21–0. (B) Alignment of MFA2 protein sequences from four different rust fungal species. MFA2 is fully conserved within each species. (C) Alignment of MFA3 protein sequences from three P. graminis f. sp. tritici isolates. Mfa3 is only present in P. graminis f. sp. tritici, in close proximity to STE3.2–3. Predicted mature pheromone sequences are outlined by boxes.

(TIF)

pgen.1011207.s005.tif (933.3KB, tif)
S4 Fig. Pheromone precursor genes group according to their proximate Pra alleles and display strong signals of trans-specific polymorphisms.

Maximum likelihood tree of mfas identified in four cereal rust fungi including multiple isolates per species. Tips are labelled with the species abbreviation and gene names, and isolate names are provided in parentheses. Branch support was assessed by 10000 replicates. The scale bar represents 0.5 substitutions per site. Pca: P. coronata f. sp. avenae; Pgt: Puccinia graminis f. sp. tritici; Pt: P. triticina; Pst: P. striiformis f. sp. tritici.

(TIF)

pgen.1011207.s006.tif (848.5KB, tif)
S5 Fig. Whole chromosome alignments of HD loci containing chromosomes between the two haplotypes of each dikaryotic genome exhibit slightly reduced synteny around HD locus.

The figure shows dot plots of whole chromosome alignments between the two HD loci containing chromosomes from dikaryotic genome assemblies. Each panel consists of dot plots of the whole chromosome and subset dot plots zooming into the HD locus. The HD locus is labelled and line colors show the nucleotide percentage identity and nucleotide orientation as indicated in the figure legend. Subfigures A to D show P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”), respectively.

(TIF)

pgen.1011207.s007.tif (630.2KB, tif)
S6 Fig. Allele pairs on chromosomes 4 and 9 are not overall more diverged when compared to other chromosomes.

The plots show the distribution of dS values of allele pairs on sister chromosomes in four cereal rust fungal species. Bar plots show the distribution of dS values up to the 99% quantile. Red lines represent threshold of dS values of 95% of all alleles. Blue lines represent threshold of dS values of 90% of all alleles. Black points show dS values of individual allele pairs. HD and STE3 allele pairs are highlighted as red points. Subfigures A to D show P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”), respectively.

(TIF)

pgen.1011207.s008.tif (695.5KB, tif)
S7 Fig. Hi-C heatmap of P. coronata f. sp. avenae (“Pca 203”) hapA.

(TIF)

pgen.1011207.s009.tif (3.3MB, tif)
S8 Fig. Hi-C heatmap of P. coronata f. sp. avenae (“Pca 203”) hapB.

(TIF)

pgen.1011207.s010.tif (3.1MB, tif)
S9 Fig. Hi-C heatmap of P. triticina (“Pt 76”) hapA.

(TIF)

pgen.1011207.s011.tif (3.9MB, tif)
S10 Fig. Hi-C heatmap of P. triticina (“Pt 76”) hapB.

(TIF)

pgen.1011207.s012.tif (3.8MB, tif)
S11 Fig. Coverage of different transposable element orders at the HD, PR, and STE3.2–1 locus.

The plots show the percentage of nucleotides covered by different transposable element orders at (A) HD locus (B) PR locus (C) STE3.2–1 locus. Each subfigure A to C shows the coverage in each haplotype of the dikaryotic genomes of P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”). Different TE orders are color coded as shown in the legend. TEs with no assigned class are labelled “Undetermined”. TEs with no assigned order but belonging to Class I (RNA retrotransposons) or Class II (DNA transposons) are labelled “Undetermined Class I” or “Undetermined Class II”, respectively.

(TIF)

pgen.1011207.s013.tif (790.2KB, tif)
S12 Fig. Whole chromosome alignments of PR loci containing chromosomes between two haplotypes of each dikaryotic genome exhibit strong signs of synteny loss.

The figure shows dot plots of whole chromosome alignments between the two PR loci containing chromosomes from the dikaryotic genome assemblies. Each panel consists of dot plots of the whole chromosome and subset dot plots zooming into the PR locus. The PR locus is labelled and line colors show the nucleotide percentage identity and nucleotide orientation as indicated in the figure legend. Subfigures A to D show P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”), respectively.

(TIF)

pgen.1011207.s014.tif (518.8KB, tif)
S13 Fig. IGV screen shots of short-read Illumina mapping of various P. coronata f. sp. avenae isolates against the HD locus in the Pca 203 reference.

(A) shows mapping against the HD locus on chromosome 4A and (B) chromosome 4B, respectively.

(TIF)

pgen.1011207.s015.tif (1.1MB, tif)
S14 Fig. IGV screen shot of short-read Illumina mapping of various P. graminis f. sp. tritici isolates against the HD locus in the Pgt 21–0 reference.

(A) shows mapping against the HD locus on chromosome 4A and (B) chromosome 4B, respectively.

(TIF)

pgen.1011207.s016.tif (4.7MB, tif)
S15 Fig. IGV screen shots of short-read Illumina mapping of various P. triticina isolates against the HD locus in the Pt 76 reference.

(A) shows mapping against the HD locus on chromosome 4A and (B) chromosome 4B, respectively.

(TIF)

pgen.1011207.s017.tif (811.6KB, tif)
S16 Fig. IGV screen shots of short-read Illumina mapping of various P. striiformis f. sp. tritici isolates against the HD locus in the Pst 134E reference.

(A) shows mapping against the HD locus on chromosome 4A and (B) chromosome 4B, respectively.

(TIF)

pgen.1011207.s018.tif (986KB, tif)
S17 Fig. Nucleotide alignments of the HD gene coding sequence in four cereal rust fungi indicate bW-HD1 and bE-HD2 are multiallelic in each species.

(A) Nucleotide alignment of the de novo reconstructed HD locus from Pca 203 Illumina short-read data with the coding regions of bW-HD1 and bE-HD2 alleles from the Pca 203 dikaryotic reference genome. (B) to (D) multiple sequence alignments of de novo reconstructed HD coding regions, (E) multiple sequence alignments of Pst bW-HD1 and bE-HD2 alleles [159]. In each subfigure B to E, the top two tracks show the consensus sequence length and relative sequence identity, respectively. Subfigures B to E show P. coronata f. sp. avenae (“Pca”), P. graminis f. sp. tritici (“Pgt”), P. triticina (“Pt”) and P. striiformis f. sp. tritici (“Pst”), respectively. The bW-HD1 and bE-HD2 alleles are numbered in accordance with Fig 1.

(TIF)

S18 Fig. IGV screen shots of short-read Illumina mapping of various P. coronata f. sp. avenae isolates against the PR locus in the Pca 203 reference.

(A) shows mapping against the PR locus on chromosome 9A and (B) chromosome 9B, respectively.

(TIF)

pgen.1011207.s020.tif (291.6KB, tif)
S19 Fig. IGV screen shot of short-read Illumina mapping of various P. graminis f. sp. tritici isolates against the PR locus in the Pgt 21–0 reference.

(A) shows mapping against the PR locus on chromosome 9A and (B) chromosome 9B, respectively.

(TIF)

pgen.1011207.s021.tif (446.7KB, tif)
S20 Fig. IGV screen shots of short-read Illumina mapping of various P. triticina isolates against the PR locus in the Pt 76 reference.

(A) shows mapping against the PR locus on chromosome 9A and (B) chromosome 9B, respectively.

(TIF)

pgen.1011207.s022.tif (349.6KB, tif)
S21 Fig. IGV screen shots of short-read Illumina mapping of various P. striiformis f. sp. tritici isolates against the PR locus in the Pst 134E reference.

(A) shows mapping against the PR locus on chromosome 9A and (B) chromosome 9B, respectively.

(TIF)

pgen.1011207.s023.tif (326.6KB, tif)
S22 Fig. Amino acid alignments of STE3.2–2 and STE3.2–3 alleles of four cereal rust fungi.

Multiple sequence alignments of de novo reconstructed STE3.2–2 and STE3.2–3 protein sequences. Each subfigure contains two MSAs, one for STE3.2–2 and one for STE3.2–3. Subfigures A to D show P. coronata f. sp. avenae (“Pca”), P. graminis f. sp. tritici (“Pgt”), P. triticina (“Pt”) and P. striiformis f. sp. tritici (“Pst”), respectively.

(TIF)

pgen.1011207.s024.tif (476.4KB, tif)
S23 Fig. Amino acid alignments of the three MFA alleles from various P. graminis f. sp. tritici isolates.

Amino acid substitutions are highlighted by color, whereas predicted mature pheromone sequences are outlined by boxes.

(TIF)

pgen.1011207.s025.tif (992.1KB, tif)
S24 Fig. Whole chromosome alignments of HD genes containing chromosomes between haplotypes of four different P. triticina isolates.

The figure shows dot plots of whole chromosome alignments of HD loci containing chromosomes derived from distinct dikaryotic genomes of four different P. triticina isolates including Pt 15, Pt 19NSW04, Pt 20QLD87 against Pt 76. Each panel consists of a dot plot of the whole chromosome and a subset dot plot zooming into the HD locus. The HD locus is labelled and line colors show the nucleotide percentage identity and nucleotide orientation as indicated in the figure legend. (A) Comparison of nucleotide sequence of chromosome 4s of Pt 15 and Pt 76. (B) Comparison of nucleotide sequence of chromosome 4s of Pt 19NSW04 and Pt 76. (C) Comparison of nucleotide sequence of chromosome 4s of Pt 20QLD87 and Pt 76. (D) Comparison of nucleotide sequence of chromosome 4s of Pt 19NSW04 and Pt 20QLD87.

(TIF)

pgen.1011207.s026.tif (3.6MB, tif)
S25 Fig. The HD locus is highly conserved in four P. triticina isolates.

Synteny graphs of HD loci including proximal regions in the four P. triticina isolates Pt 15, Pt 19NSW04, Pt 20QLD87 and Pt 76. Red lines between chromosome sections represent gene pairs with nucleotide sequence identity higher than 70%, and grey shading connects conserved nucleotide sequences (≥ 1000 bp and identity ≥ 90%). For additional annotations please refer to the provided legend (“Legend”).

(TIF)

pgen.1011207.s027.tif (1.6MB, tif)
S26 Fig. Coverage of different transposable element orders at the HD locus in four P. triticina isolates.

The plots show the percentage of nucleotides covered by different transposable element orders at the HD locus. Each subfigure shows the coverage in each haplotype of the dikaryotic genomes of P. triticina isolates Pt 15, Pt 19NSW04, Pt 20QLD87 and Pt 76. Different TE orders are color coded as shown in the legend. TEs with no assigned class are labelled “Undetermined”. TEs with no assigned order but belonging to Class I (RNA retrotransposons) or Class II (DNA transposons) are labelled “Undetermined Class I” or “Undetermined Class II”, respectively.

(TIF)

pgen.1011207.s028.tif (326.2KB, tif)
S27 Fig. Whole chromosome alignments of Pra gene containing chromosomes between haplotypes of four different P. triticina isolates.

The figure shows dot plots of whole chromosome alignments of STE3.2–2 or STE3.2–3 containing chromosomes derived from distinct dikaryotic genomes of four different P. triticina isolates including Pt 15, Pt 19NSW04, Pt 20QLD87 against Pt 76. Each panel consists of a dot plot of the whole chromosome and a subset dot plot zooming into the proximal region of STE3.2–2 or STE3.2–3. STE3.2–2 or STE3.2–3 is labelled and line colors show the nucleotide percentage identity and nucleotide orientation as indicated in the figure legend. (A) Comparison of nucleotide sequence of chromosome 9s containing the STE3.2–2 gene. (B) Comparison of nucleotide sequence of chromosome 9s containing the STE3.2–3 gene.

(TIF)

pgen.1011207.s029.tif (1.3MB, tif)
S28 Fig. Nucleotide coverage distribution of Ty3_Pt_STE3.2–3 on the chromosomes of four P. triticina isolates.

The plots show the percentage of nucleotides covered by the transposable element family Ty3_Pt_STE3.2–3 on each chromosome of four P. triticina isolates Pt 15 (A), Pt 19NSW04 (B), Pt 20QLD87 (C) and Pt 76 (D). Chromosomes carrying STE3.2–3 are highlighted in red.

(TIF)

pgen.1011207.s030.tif (325.2KB, tif)
S29 Fig. Schematic illustration of the location of Ty3_Pt_STE3.2–3 copies at the STE3.2–3 locus in four P. triticina isolates.

(TIF)

pgen.1011207.s031.tif (118.7KB, tif)
S30 Fig. Genealogy of STE3.2–1 in rust fungi indicates that STE3.2–1 sequences are conserved within species.

Bayesian rooted gene tree built from a STE3.2–1 coding-based sequence alignment from four cereal rust fungi: P. coronata f. sp. avenae (Pca), P. graminis f. sp. tritici (Pgt), P. triticina (Pt) and P. striiformis f. sp. tritici (Pst). P. polysora f. sp. zeae (Ppz) GD1913 was included as an outgroup. The tree is based on a TN93+I model of molecular evolution. Each node is labelled with its posterior probability (PP) value. PP values above 0.95 are considered strong evidence for monophyly of a clade, and PP values of identical alleles are not displayed. The scale bar represents the number of nucleotide substitutions per site. Alleles of the same species are colored with an identical background: Pca (yellow), Pgt (green), Pt (blue), Pst (orange).

(TIF)

pgen.1011207.s032.tif (525.7KB, tif)
S31 Fig. Whole chromosome alignments of STE3.2–1 gene containing chromosomes between two haplotypes of each dikaryotic genome suggest high conservation of STE3.2–1.

The figure shows dot plots of whole chromosome alignments between the two STE3.2–1 gene containing chromosomes from dikaryotic genome assemblies. Each panel consists of dot plots of the whole chromosome and subset dot plots zooming into the STE3.2–1 proximal region. The position of STE3.2–1 gene is labelled, and line colors show the nucleotide percentage identity and nucleotide orientation as indicated in the figure legend. Subfigures A to D show P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”), respectively.

(TIF)

pgen.1011207.s033.tif (626.8KB, tif)
S32 Fig. Synteny analysis of STE3.2–1 genes and their flanking regions reveals no evidence of genetic degeneration.

Synteny graphs of the STE3.2–1 locus including proximal regions in P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”), and P. striiformis f. sp. tritici (“Pst 134E”). Proximal regions are defined as 40 genes downstream and upstream of the STE3.2–1 alleles, respectively. STE3.2–1 proximal regions are highly syntenic within each dikaryotic genome and conserved between species. Red lines between chromosome sections represent gene pairs with sequence identity higher than 70% and grey shades represent conserved nucleotide sequences (≥ 1000 bp and identity ≥ 90%). For additional annotations please refer to the provided legend (“Legend”).

(TIF)

pgen.1011207.s034.tif (1.5MB, tif)
S33 Fig. Synonymous divergence (dS) values between STE3.2–1 alleles along chromosome 1A suggest conservation of this locus.

Synonymous divergence values (dS) for all allele pairs are plotted along chromosome 1A for (A) P. coronata f. sp. avenae (“Pca 203”), (B) P. graminis f. sp. tritici (“Pgt 21–0”), (C) P. triticina (“Pt 76”), and (D) P. striiformis f. sp. tritici (“Pst 134E”). In each panel, the top track shows the dS values (“dS”) of allele pairs along chromosome 1. Each dot corresponds to the dS value of a single allele pair. The second and third tracks show the averaged TE (“TE”) and gene (“gene”) density along chromosome 1 in 10 kbp-sized windows, respectively. The STE3.2–1 alleles are highlighted with a red line and red shading indicates a 0.4 Mbp-sized window around the STE3.2–1 genes. Predicted centromeric regions are marked with blue shading. The two lower tracks (dS values and gene locations) provide a detailed zoomed-in view of the red shaded area around the STE3.2–1 alleles. Species-specific background coloring is the same as for Fig 1.

(TIF)

pgen.1011207.s035.tif (1.6MB, tif)
S34 Fig. Multidimensional scaling (MDS) plots of the RNAseq datasets used in this study.

MDS plots were made with TMM-normalized counts for quality control. Each dot represents a single sample, and replicates are color coded as indicated in each subfigure legend. (A) MDS plot of TMM-normalized values of Pca at 48 hours post infection (hpi) and 120 hpi. (B) MDS plot of TMM-normalized values of Pgt at 48, 72, 96, 120, 148 and 168 hpi. (C) MDS plot of TMM-normalized values of Pst at 24, 48, 72, 120, 168, 216 and 264 hpi. (D) MDS plot of TMM-normalized values of Pst ungerminated spores (US), germinated spores (GS), 144 hpi, 216 hpi and haustoria enriched samples (HE).

(TIF)

pgen.1011207.s036.tif (367KB, tif)
S35 Fig. Expression of housekeeping genes in four RNAseq datasets.

(A) TMM-normalized values of housekeeping genes in P. coronata f. sp. avenae (“Pca 12NC29”), P. graminis f. sp. tritici (“Pgt 21–0”), and P. striiformis f. sp. tritici (“Pst 87/66” and “Pst 104E”). (B) A likelihood ratio test (LRT) was applied to test for significant upregulation of housekeeping genes between timepoints; none of the housekeeping genes show significant upregulation between timepoints. Red dashed lines indicate logFC = 0.5 and logFC = -0.5, respectively. Stars above or below bars indicate statistically significant differences between the two adjacent time points: *p<0.05, **p<0.01, ***p<0.001. Genes are labelled with different colors: Transcription elongation factor TFIIS (TFIIS), Actin-related protein 3 (ARP3), Actin/actin-like protein 2 (ACTIN2), Ubiquitin carboxyl-terminal hydrolase 6 (UBP6), Conserved oligomeric Golgi complex subunit 3 (COG3), Ctr copper transporter 2 (CTR2).

(TIF)

pgen.1011207.s037.tif (314.5KB, tif)
S36 Fig. Upregulation of MAT genes in late stages of the asexual life cycle.

A likelihood ratio test (LRT) was applied to test for significant upregulation of MAT genes between timepoints. Red dashed lines indicate logFC = 0.5 and logFC = -0.5, respectively. Stars above or below bars indicate statistically significant differences between the two adjacent time points: *p<0.05, **p<0.01, ***p<0.001. MAT genes are labelled with different colors. (A) The expression levels of MAT genes in P. coronata f. sp. avenae (“Pca 12NC29”) were compared between 120 hours post infection (hpi) and 48 hpi. (B) The expression levels of MAT genes in P. graminis f. sp. tritici (“Pgt 21–0”) were compared between 72 hpi and 48 hpi, 96 hpi and 72 hpi, 120 hpi and 96 hpi, 144 hpi and 120 hpi, 168 hpi and 144 hpi. (C) The expression levels of MAT genes in P. striiformis f. sp. tritici (“Pst 87/66”) were compared between 48 hpi and 24 hpi, 72 hpi and 48 hpi, 120 hpi and 72 hpi, 168 hpi and 120 hpi, 216 hpi and 168 hpi, 264 hpi and 216 hpi. (D) The expression levels of MAT genes in P. striiformis f. sp. tritici (“Pst 104E”) were compared between germinated spores (GS) and ungerminated spores (US), 144 hpi and GS, 216 hpi and 144 hpi, haustoria enriched samples (HE) and 216 hpi.

(TIF)

pgen.1011207.s038.tif (226.9KB, tif)
S37 Fig. MAT genes are upregulated in the late asexual infection stage of P. striiformis f. sp. tritici (“Pst 104E”).

TMM-normalized values of MAT genes in ungerminated spores (US), germinated spores (GS), 144 hpi, 216 hpi and haustoria enriched samples (HE) of Pst. Genes are labelled with different colors.

(TIF)

pgen.1011207.s039.tif (86.5KB, tif)

Acknowledgments

We kindly thank Dr. P. Tobias and Dr. M. Moeller for critical reading of and comments on this manuscript. We thank Dr. M. Bui for advice on iqtree2 usage, Prof. C. Linde for useful suggestions, and R. Tam for providing a snakemake pipeline for trimming raw sequencing data and centromere information of Pst 134E. We thank J. Lin for suggestions on data visualization. This work was supported by computational resources provided by the Australian Government through the National Computational Infrastructure (NCI) under the ANU Merit Allocation Scheme.

Data Availability

Analysis code used in this study is available at the GitHub repository https://github.com/ZhenyanLuo/codes-used-for-mating-type. Alignments used in genealogical studies, exported RDP5 project files, dS values of gene pairs in the studied cereal rust species, presumed CDS of reconstructed HD alleles, Pra alleles, TE annotation and classification files, and normalized gene expression matrix files are available at Dryad (Luo Z, Schwessinger B. Supporting information of genome biology and evolution of mating type loci in four cereal rust fungi [Dataset] 2024. Available from: https://doi.org/10.5061/dryad.w0vt4b8zm).

Funding Statement

This work was supported by an Australian Research Council Future Fellowship FT180100024 to B. S. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

  • 95.Savile DBO. Nuclear structure and behavior in species of the Uredinales. American Journal of Botany. 1939:585–609. [Google Scholar]
  • 96.Park RF, Wellings CR. Somatic hybridization in the Uredinales. Annual review of Phytopathology. 2012;50:219–39. doi: 10.1146/annurev-phyto-072910-095405 [DOI] [PubMed] [Google Scholar]
  • 97.Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Molecular biology and evolution. 2021;38(10):4647–54. doi: 10.1093/molbev/msab199 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Pertea G, Pertea M. GFF utilities: GffRead and GffCompare. F1000Research. 2020;9. doi: 10.12688/f1000research.23297.2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99.Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL. NCBI BLAST: a better web interface. Nucleic acids research. 2008;36(suppl_2):W5–W9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100.Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. bioinformatics. 2009;25(14):1754–60. doi: 10.1093/bioinformatics/btp324 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101.Geneious Prime 2022.0.2 [Available from: https://www.geneious.com.
  • 102.Rice P, Longden I, Bleasby A. EMBOSS: the European molecular biology open software suite. Elsevier current trends; 2000. [DOI] [PubMed] [Google Scholar]
  • 103.Hao Z, Lv D, Ge Y, Shi J, Weijers D, Yu G, et al. RIdeogram: drawing SVG graphics to visualize and map genome-wide data on the idiograms. PeerJ Computer Science. 2020;6:e251. doi: 10.7717/peerj-cs.251 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104.Emms DM, Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome biology. 2019;20(1):1–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105.Price MN, Dehal PS, Arkin AP. FastTree 2–approximately maximum-likelihood trees for large alignments. PloS one. 2010;5(3):e9490. doi: 10.1371/journal.pone.0009490 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106.Ranwez V, Douzery EJ, Cambon C, Chantret N, Delsuc F. MACSE v2: toolkit for the alignment of coding sequences accounting for frameshifts and stop codons. Molecular biology and evolution. 2018;35(10):2582–4. doi: 10.1093/molbev/msy159 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107.Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009;25(15):1972–3. doi: 10.1093/bioinformatics/btp348 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 108.Bouckaert R, Heled J, Kühnert D, Vaughan T, Wu C-H, Xie D, et al. BEAST 2: a software platform for Bayesian evolutionary analysis. PLoS computational biology. 2014;10(4):e1003537. doi: 10.1371/journal.pcbi.1003537 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109.Bouckaert RR, Drummond AJ. bModelTest: Bayesian phylogenetic site model averaging and model comparison. BMC evolutionary biology. 2017;17:1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 110.Rambaut A, Drummond AJ, Xie D, Baele G, Suchard MA. Posterior summarization in Bayesian phylogenetics using Tracer 1.7. Systematic biology. 2018;67(5):901–4. doi: 10.1093/sysbio/syy032 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 111.FigTree 2016 [Available from: http://tree.bio.ed.ac.uk/software/figtree/.
  • 112.Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, Von Haeseler A, et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Molecular biology and evolution. 2020;37(5):1530–4. doi: 10.1093/molbev/msaa015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 113.Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nature Methods. 2017;14(6):587–9. doi: 10.1038/nmeth.4285 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 114.HURVICH CM TSAI C-L. Regression and time series model selection in small samples. Biometrika. 1989;76(2):297–307. [Google Scholar]
  • 115.Minh BQ, Nguyen MAT, Von Haeseler A. Ultrafast approximation for phylogenetic bootstrap. Molecular biology and evolution. 2013;30(5):1188–95. doi: 10.1093/molbev/mst024 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 116.Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular biology and evolution. 2013;30(4):772–80. doi: 10.1093/molbev/mst010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 117.Katoh K, Misawa K, Kuma Ki, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic acids research. 2002;30(14):3059–66. doi: 10.1093/nar/gkf436 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 118.Martin D, Rybicki E. RDP: detection of recombination amongst aligned sequences. Bioinformatics. 2000;16(6):562–3. doi: 10.1093/bioinformatics/16.6.562 [DOI] [PubMed] [Google Scholar]
  • 119.Padidam M, Sawyer S, Fauquet CM. Possible emergence of new geminiviruses by frequent recombination. Virology. 1999;265(2):218–25. doi: 10.1006/viro.1999.0056 [DOI] [PubMed] [Google Scholar]
  • 120.Martin D, Posada D, Crandall K, Williamson C. A modified bootscan algorithm for automated identification of recombinant sequences and recombination breakpoints. Virologica Sinica. 2005;37(2). doi: 10.1089/aid.2005.21.98 [DOI] [PubMed] [Google Scholar]
  • 121.Smith JM. Analyzing the mosaic structure of genes. Journal of molecular evolution. 1992;34:126–9. doi: 10.1007/BF00182389 [DOI] [PubMed] [Google Scholar]
  • 122.Posada D, Crandall KA. Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proceedings of the National Academy of Sciences. 2001;98(24):13757–62. doi: 10.1073/pnas.241370698 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 123.Gibbs MJ, Armstrong JS, Gibbs AJ. Sister-scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics. 2000;16(7):573–82. doi: 10.1093/bioinformatics/16.7.573 [DOI] [PubMed] [Google Scholar]
  • 124.Boni MF, Posada D, Feldman MW. An exact nonparametric method for inferring mosaic structure in sequence triplets. Genetics. 2007;176(2):1035–47. doi: 10.1534/genetics.106.068874 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 125.Dudchenko O, Batra SS, Omer AD, Nyquist SK, Hoeger M, Durand NC, et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science. 2017;356(6333):92–5. doi: 10.1126/science.aal3327 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 126.Yow AG, Zhang Y, Bansal K, Eacker SM, Sullivan S, Liachko I, et al. Genome sequence of Monilinia vaccinii-corymbosi sheds light on mummy berry disease infection of blueberry and mating type. G3. 2021;11(2):jkaa052. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 127.Schultz DT, Haddock SH, Bredeson JV, Green RE, Simakov O, Rokhsar DS. Ancient gene linkages support ctenophores as sister to other animals. Nature. 2023:1–8. doi: 10.1038/s41586-023-05936-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 128.Lechner M, Findeiß S, Steiner L, Marz M, Stadler PF, Prohaska SJ. Proteinortho: detection of (co-) orthologs in large-scale analysis. BMC bioinformatics. 2011;12(1):1–9. doi: 10.1186/1471-2105-12-124 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 129.Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research. 2004;32(5):1792–7. doi: 10.1093/nar/gkh340 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 130.Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Molecular biology and evolution. 2007;24(8):1586–91. doi: 10.1093/molbev/msm088 [DOI] [PubMed] [Google Scholar]
  • 131.Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2. doi: 10.1093/bioinformatics/btq033 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 132.Gel B, Serra E. karyoploteR: an R/Bioconductor package to plot customizable genomes displaying arbitrary data. Bioinformatics. 2017;33(19):3088–90. doi: 10.1093/bioinformatics/btx346 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 133.Hunter JD. Matplotlib: A 2D graphics environment. Computing in science & engineering. 2007;9(03):90–5. [Google Scholar]
  • 134.Thomas Hackl MJA. gggenomes: A Grammar of Graphics for Comparative Genomics 2022. [Available from: https://github.com/thackl/gggenomes. [Google Scholar]
  • 135.Flutre T, Duprat E, Feuillet C, Quesneville H. Considering transposable element diversification in de novo annotation approaches. PloS one. 2011;6(1):e16526. doi: 10.1371/journal.pone.0016526 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 136.Quesneville H, Bergman CM, Andrieu O, Autard D, Nouaud D, Ashburner M, et al. Combined evidence annotation of transposable elements in genome sequences. PLoS Comput Biol. 2005;1(2):e22. doi: 10.1371/journal.pcbi.0010022 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 137.Bao W, Kojima KK, Kohany O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mobile Dna. 2015;6(1):11. doi: 10.1186/s13100-015-0041-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 138.Smit A, Hubley R, Green P. RepeatMasker Open-4.0. 2013–2015. 2015. [Google Scholar]
  • 139.Gattiker A, Gasteiger E, Bairoch A. ScanProsite: a reference implementation of a PROSITE scanning tool. Applied bioinformatics. 2002;1(2):107–8. [PubMed] [Google Scholar]
  • 140.Sigrist CJ, De Castro E, Cerutti L, Cuche BA, Hulo N, Bridge A, et al. New and continuing developments at PROSITE. Nucleic acids research. 2012;41(D1):D344–D7. doi: 10.1093/nar/gks1067 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 141.Consortium U. UniProt: a worldwide hub of protein knowledge. Nucleic acids research. 2019;47(D1):D506–D15. doi: 10.1093/nar/gky1049 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 142.Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20. doi: 10.1093/bioinformatics/btu170 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 143.Andrews FK, Anne Segonds-Pichon, Laura Biggins, Christel Krueger, Jo Montgomery. FastQC: a quality control tool for high throughput sequence data 2010. [Available from: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/. [Google Scholar]
  • 144.Picard toolkit: Broad Institute; 2019 [Available from: https://github.com/broadinstitute/picard.
  • 145.Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. doi: 10.1093/bioinformatics/btp352 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 146.Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in bioinformatics. 2013;14(2):178–92. doi: 10.1093/bib/bbs017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 147.Okonechnikov K, Conesa A, García-Alcalde F. Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data. Bioinformatics. 2016;32(2):292–4. doi: 10.1093/bioinformatics/btv566 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 148.Tange O. GNU parallel 2018: Lulu. com; 2018. [Google Scholar]
  • 149.Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. arXiv preprint arXiv:12073907. 2012. [Google Scholar]
  • 150.McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, et al. The ensembl variant effect predictor. Genome biology. 2016;17(1):1–14. doi: 10.1186/s13059-016-0974-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 151.Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19(5):455–77. doi: 10.1089/cmb.2012.0021 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 152.Mapleson D, Garcia Accinelli G, Kettleborough G, Wright J, Clavijo BJ. KAT: a K-mer analysis toolkit to quality control NGS datasets and genome assemblies. Bioinformatics. 2017;33(4):574–6. doi: 10.1093/bioinformatics/btw663 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 153.Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nature biotechnology. 2016;34(5):525–7. doi: 10.1038/nbt.3519 [DOI] [PubMed] [Google Scholar]
  • 154.Jon Palmer JS. nextgenusfs/funannotate: funannotate v1.5.3 Zenodo2019 [Available from: 10.5281/zenodo.2604804. [DOI]
  • 155.Jones P, Binns D, Chang H-Y, Fraser M, Li W, McAnulla C, et al. InterProScan 5: genome-scale protein function classification. Bioinformatics. 2014;30(9):1236–40. doi: 10.1093/bioinformatics/btu031 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 156.Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2009;26(1):139–40. doi: 10.1093/bioinformatics/btp616 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 157.Soneson C, Love MI, Robinson MD. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Research. 2015;4. doi: 10.12688/f1000research.7563.2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 158.Wickham H. ggplot2: elegant graphics for data analysis: springer; 2016. [Google Scholar]
  • 159.Holden S, Bakkeren G, Hubensky J, Bamrah R, Abbasi M, Qutob D, et al. Uncovering the history of recombination and population structure in western Canadian stripe rust populations through mating-type alleles. bioRxiv. 2023:2023.03. 30.534825. doi: 10.1186/s12915-023-01717-9 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Tatiana Giraud, Eva H Stukenbrock

22 Sep 2023

Dear Dr Schwessinger,

Thank you very much for submitting your Research Article entitled 'Genome biology and evolution of mating type loci in four cereal rust fungi' to PLOS Genetics.

The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the current manuscript. Based on the reviews, we will not be able to accept this version of the manuscript, but we would be willing to review a much-revised version. We cannot, of course, promise publication at that time.

Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by each reviewer. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments are included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool.  PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, use the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.

Yours sincerely,

Tatiana Giraud

Guest Editor

PLOS Genetics

Eva Stukenbrock

Section Editor

PLOS Genetics

The manuscript has been evaluated by three referees, who agree that this manuscript on the mating-type loci of rust fungi could be of interest for PLOS Genetics, providing valuable insights into mating-type loci in rust fungi. Despite the positive assets acknowledged by the referees, however, serious concerns were raised, in particular about methods, generalization from one isolate to species, unwarranted conclusions about HD gene recombination, language, format, clarity of the text and figures, lack of citations of previous works, the inclusion of a single PR allele in gene genealogies, which does not allow analysing trans-specific polymorphism, lack of discussion on the drivers of recombination suppression, lack of information on the size of recombination suppression, lack of analyses of polymorphism data and of recombination suppression at HD genes, and lack of data availability. I have to agree with these concerns. The manuscript is not very well written; many sentences are unclear, make no sense, are awkward or have incorrect syntax. The introduction includes irrelevant information about resistance genes but lacks essential information, and the predictions at the end of the introduction seem ad hoc, while more interesting ones could be stated. Some examples are highlighted below and by the referees, but the text needs to be edited throughout the manuscript, and the figures improved. I would encourage resubmission only if you are able to revise the manuscript along these lines and add the corresponding analyses. The referees also provide a list of excellent additional suggestions and questions, which should also be addressed. Please also be sure to cite previous works on mating-type loci in rusts.

More specific suggestions

-Please number the pages

-L29: It's not the loci that give rise to recombination suppression, they undergo recombination suppression; It's not necessarily the linkage either.

-L60: I would rather say "sex and mating-type"

-L66-68, L516 : it’s not really « initial sheltering » but rather « a combination of an initial advantage of carrying fewer deleterious mutations than average and the sheltering of these few deleterious mutations when increasing in frequency ». I would also highlight here the relevant prediction made in Jay et al 2022 that extension of recombination suppression should only occur in dikaryotic species, and only around biallelic loci, not multi-allelic ones. It would thus fit the finding that it occurs around the biallelic PR locus and not the multiallelic HD locus; come back to this in the discussion.

-L90, L93 : advantageOUS

-L93 : awkward sentence. It’s not that « multi allelism » is advantageous (this is wrong group selection reasoning), but that any new, rare allele is advantageous, which leads to multi-allelism ; this reasoning is very different.

-L98 : awkward sentence : the mating system is selfing versus outcrossing while the mating compatibility system is bipolar or tetrapolar, this is very different and the relationship between the two systems is not simple. A tetrapolar compatibility system does not decrease inbreeding (you can always self), but it is less advantageous under inbreeding, this is very different. I would delete this whole paragraph, not relevant to the study anyway.

-L105-132 : delete also this paragraph, it’s not relevant to the study and there are several awkward sentences (for example an isolate is not « adapted », it’s populations that adapt not individuals). Just give the minimal information on the role of MAT genes in the life cycle.

-L133 : why « likely » ? based on what ?

-L136 : MAT loci not alleles. See https://www.nature.com/articles/s41467-023-41413-4.pdf

-L143-150 : awkward predictions, we don’t understand the reasoning behind them; they seem ad hoc, made after you knew the results. I would rather explain predictions about recombination suppression around biallelic loci and not multiallelic ones as explained above based on Jay et al. 2022.

-avoid abbreviations, and in particular stick to conventional Latin abbreviations for species names.

-Table legends are too short, remove capitals, explain abbreviations, explain all lines and columns

-L184-186 : awkward sentence

-L201 : « fully conserved » is unclear : present ? at the same location ? identical sequence ?

-L205 : « composition » is unclear : identical sequence ?

-L218 : see Petit et al (2012) Evolution 66: 3519–3533.

-L222, L252, L522 : explain better what trans-specific polymorphism means and refer to the first paper on this on the PR gene : Devier et al 2009 Genetics 181:209-223. Predictions about this in the introduction would be better than the predictions currently at the end of the introduction.

-L232 : either rather than both

-L235 : unclear what is your initial hypothesis and why. This whole page is hard to follow, having the last sentence at the beginning would help (but see referees’ comments on this inference, likely unreliable due to low bootstraps)

-L261 : run a formal test of congruence

-L267 : these are not phylogenetic trees but gene genealogies (phylogenies are for species)

-L271 : alleles and not genes

-L300 HD locus

-L310 and elsewhere : S in dS should be subscript

-L326 : delete « pair » and « are »

-L397 : conservation is unclear : in synteny ? sequence ? within instead of « of each »

-L398 : any conserved

-L404 : awkward wording

-L401 : ancestral to what ?

-L415 : awkward wording

-L428 : no plural at polymorphism

-L439-440 : this makes no sense.

-L445-449 : unclear ; what is the « status » ? conserved means identical sequences ?

-L453 : I don’t think this is true beyond this particular species in the given reference

-L457 : delete significant

-L460 : describe is not the right term here

-L463, L466 : we’re lost with the jargon here, these terms should be explained in the introduction instead of the long irrelevant paragraph on R genes. Explain whether these findings are consistent with expectations given the life cycle and where the MAT genes act.

-L482 : mating compatibility not sexual reproduction, this is very different

-L487 : make it clear whether or not other genes are trapped into the non-recombining region

-L489 : the PR locus

-L491 : the hypothesis

-L492 : delete parasexual reproduction, this has not been defined nor shown.

-L494 : mating is not tetrapolar, mating compatibility is, this is very different

-L506 : why mitosis activity ?

-L508 : delete « between haplotypes but » and begin a new sentence with « Chromosomes… »

-L519 : likely caused ; See https://www.nature.com/articles/s41467-023-41413-4.pdf

-L528 : awkward sentence, it’s more balancing selection

-L529 : awkward sentence, the causal « because « sounds odd : it’s not the cause but the observation allowing the inference

-L532 : see referees’ concerns about this, I agree this should not work and is poorly supported.

-L537 : it’s not consistent but similar

-L545-6 : this is wrong group selection reasoning : selection does not work for the good of the species

-L548 : pheromone receptor gene and singular or locus ?

-L550 : at each ?

-L556 : unclear

-L558-560 : unclear and unconvincing, delete

-L566 : unclear ; of different mating types ?

-L573-3 : this should be explained in the introduction instead of the long digression about R genes

-L576 : unclear ; is a verb missing ?

-L582-596 : this makes no sense and is irrelevant, delete

-L601 : delete between species, pleonasm

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: In this study, the group of B. Schwessinger made use of several Puccinia striiformis f. sp. tritici (Pst) genome assemblies that they generated previously; they were among the first to pioneer the proper haplophasing of the two haploid genomes residing in the two haploid nuclei of each urediniospore (the source of gDNA used for sequencing projects). Here, they compared these to genomes from three other Puccinia species pathogenic to cereals (mostly wheat), whose haplophased genomes were recently published. They present the organization and variation of the MAT loci (HD and PRA/mfa) among these species/formae speciales. Although the various mating-type genes (HD and PRA/mfa), including their likely tetrapolar arrangement in Puccinia species, have been reported in a few studies, a more in-depth comparison of their arrangement and genomic neighborhood in various Puccinia species (the four presented here) has not yet been done. This has become feasible only recently because of the above-mentioned haplophased assemblies. From the comparisons, they infer evolutionary principles of the MAT loci for these rust fungi.

Claims:

- the tetrapolar arrangement of the HD and PRA/mfa mating-type loci in all four species; has been substantiated

- little degeneration is found around HD loci when comparing haplotype regions, similar to other genomic regions, as an indication of low recombination suppression; has been substantiated

- PRA/mfa mating-type locus much more degenerated among haplotypes, surrounded by higher levels of repetitive DNA (transposable elements or TEs, and low gene density / “gene deserts”) than other genome regions, as an indication of recombination suppression preceding speciation; substantiated to a certain degree but see comments below on TEs

- tried to locate centromeres and link these to MAT loci; see comments below

- the HD genes are multiallelic “highly polymorphic” within various species isolates; see comments below

- the PRA locus is biallelic with one allele in each haplotype (STE3.2-2 and STE3.2-3) in the sampled species; substantiated. Potential polymorphisms may indicate more allelic variants, particularly in Pgt (correlating with potential three allelic PRA genes?); see comments below

- use of publicly available transcriptome data for the four species indicates both MAT loci are expressed during the asexual infection cycle and higher expression is seen in later stages of the infection (very few data exist on expression during the sexual stages on alternate hosts); tentative, see comments below.

Overall, a nice assessment of MAT loci in this group of fungi, but some issues would need to be resolved before statements / claims can be made.

Line 350-351: “Composition of transposons vary across species, DNA transposons dominated the HD locus of Pst134E but HD loci of other rust fungi had higher coverage of RNA transposons (S9 Fig).”

And Line 400-401: “Pst 210 and Pt 76 showed strong signals of accumulation of independent transposable element families in each nuclear haplotype, which indicates invasion by TEs since cereal rusts shared a most recent common ancestor.”

- As the supplementary figure only contains single representatives of the species, can species-wide conclusions be drawn? Could you provide more explanation on the families of TEs and how the results were generated for figure S9?

Lines 568-575: “detailed transcriptomic analysis of MAT genes now gives an initial indication that these genes are important for nuclear pairing in the asexual stage, which is consistent with earlier preliminary reports in Pt (21) and M. larici-populina (62, 63). Our analysis suggests that MAT genes are expressed in asexual life cycle stages of cereal rust fungi. In our detailed transcriptomic analysis, we found both HD and PRA genes were upregulated during the infection of the asexual host with a peak at final infection stages that align with spore production and dispersal HD genes are known to control post-mating growth, such as control development of fruiting-body in some Ascomycota and Basidiomycota species (64).“ Also Figure 6.

- Does the up-regulation of these genes really warrant this conclusion? Moreover, from figure 6C, bE2-HD2 does not change in expression. Based on the plot, between 24h and 48h, there is a lot of variation for the bW1-HD1, bW2-HD1, and Ste3.2-3 transcripts, and using the median point no increase of expression when compared to 5days post infection, is there an explanation?

- Is there a lot of variation between genes at the early time points? For early time points, is there a possibility that there is not enough fungal biomass to get a meaningful number of reads, or depth, to make these inferences?

- Is it possible to get information from kallisto or downstream edgeR on how many genes have counts at the early time points relative to spore or late sample times? And do you have sufficient counts at these early time points for differential expression analysis? Another quick method would be to look at housekeeping genes throughout time points or genes that don’t change.

- Also, we could not find any reference to the parameters used with edgeR within the GitHub repository, only DESeq2; the parameters and methodology would be good to see.
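
The check on counts at early time points suggested above could be sketched roughly as follows. This is only an illustration, assuming the standard kallisto abundance.tsv layout (columns target_id, length, eff_length, est_counts, tpm); the sample names, transcript names, values and the threshold of 10 estimated counts are hypothetical and not taken from the manuscript.

```python
import csv
import io


def genes_with_counts(abundance_tsv_text, min_count=10.0):
    """Count transcripts whose kallisto est_counts reach min_count."""
    reader = csv.DictReader(io.StringIO(abundance_tsv_text), delimiter="\t")
    return sum(1 for row in reader if float(row["est_counts"]) >= min_count)


# Hypothetical toy tables standing in for per-sample abundance.tsv files.
HEADER = "target_id\tlength\teff_length\test_counts\ttpm\n"
samples = {
    "24hpi": HEADER
    + "geneA\t1000\t800\t2\t1.0\n"
    + "geneB\t1200\t1000\t0\t0.0\n"
    + "geneC\t900\t700\t15\t8.0\n",
    "spores": HEADER
    + "geneA\t1000\t800\t250\t40.0\n"
    + "geneB\t1200\t1000\t90\t12.0\n"
    + "geneC\t900\t700\t400\t75.0\n",
}

# Number of transcripts with usable counts in each sample.
summary = {name: genes_with_counts(text) for name, text in samples.items()}
print(summary)
```

With real data, the text of each sample's abundance.tsv would be read from disk instead of the toy strings; a much smaller number at early time points than in spores would point to insufficient fungal reads.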

Issues with Methods.

Lines 631 -635: “Estimation of centromeric regions…. Hi-C reads of Pca 203 and Pt 76 were mapped back to corresponding diploid genome assembly with Juicer 1.6 (71) and processed with 3d-dna (72) pipeline. Juicerbox 1.8.8 (71) was used to visualize Hi-C maps and manually estimate centromeric regions by identifying strong interaction signals.”

- Please include the Hi-C contact maps used for centromere identification. As discussed with respect to the PR locus, this region is marked by repeats and transposable elements, which would affect the mapping of Hi-C reads; how can you be confident where the centromere is? Is there a centromere-specific sequence motif shared between the analyzed isolates, as found in other fungi?

- It would help if it was mentioned at the beginning of the methods that the specific scripts and parameters used in the study are in the attached git repo. Also, include how REPET/TEdenovo was run in the main methods, as transposable elements are in the main figures.

Lines 658-664: “…before and after trimming. Bwa-mem2 (65) was used to map trimmed reads to reference genome, MarkDuplicates (Picard) 1.70 (92) was applied to remove PCR-generated duplicates. Qualimap 2 (93) was used to check mapping coverage and quality. Freebayes-parallel 1.3.6 (94, 95) was processed to detect variants, samtools 1.12 (96) was used to filter out regions of interest. The Ensembl Variant Effect Predictor (VEP) 88.9 (97) was used to annotate sequence variants. Bcftools 1.12 (96) was used to generate consensus sequence, nucleotide and protein alignments were generated and visualized by Geneious Prime v. 2022.1.1 (66).”

- Could you provide coverage plots for each of the genes for all the isolates to ensure that the genes are fully covered by short reads, and that there is sufficient coverage for consensus sequences to be created from the variants?

- Are there regions where reads are not aligning due to differences? This also has implications for the (HD) allele diversity you claim.
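
As an illustration of the per-gene coverage check requested above, breadth and mean depth over a gene interval could be summarised as follows. This is a minimal sketch, assuming three-column per-base depth lines (chromosome, 1-based position, depth) such as those written by samtools depth; the chromosome name, coordinates, depths and the minimum depth of 10 are hypothetical. Positions missing from the input are treated as zero depth.

```python
def gene_coverage(depth_lines, chrom, start, end, min_depth=10):
    """Return (breadth, mean_depth) for a 1-based inclusive gene interval.

    breadth is the fraction of gene positions with depth >= min_depth.
    """
    length = end - start + 1
    covered = 0
    total_depth = 0
    for line in depth_lines:
        name, pos, depth = line.rstrip("\n").split("\t")
        pos, depth = int(pos), int(depth)
        if name == chrom and start <= pos <= end:
            total_depth += depth
            if depth >= min_depth:
                covered += 1
    return covered / length, total_depth / length


# Hypothetical depth records; position 104 is absent, i.e. uncovered.
depth_lines = [
    "chr4A\t101\t12",
    "chr4A\t102\t30",
    "chr4A\t103\t4",
    "chr4A\t105\t20",
]
breadth, mean_depth = gene_coverage(depth_lines, "chr4A", 101, 105)
print(breadth, mean_depth)
```

A gene with breadth well below 1 in a given isolate would be a candidate for unaligned regions, and any consensus sequence built over it should be interpreted with caution.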

Other issues and suggestions:

Line 76: …’ to govern nuclear compatibility and inbreeding within populations’; maybe better: ‘… to govern mate selection within populations and nuclear compatibility’? It is not solely for “inbreeding”.

Line 84: maybe add ‘..which are outwardly transcribed in opposite directions.’?

Line 85: ‘cascade’, one word

Line 87: ‘…and the involvement of one or both loci in mating’?? Better: ‘Different segregation patterns of these mating-type loci define the terms bipolar or tetrapolar mate compatibility.’

Line Suggested: ‘…are physically linked and hence only one MAT locus controls mate compatibility.’

Line 90: advantageous

Line 98: The tetrapolar..

Line 101-104: A very nice new example is: Coelho et al., 2023: doi:10.1073/pnas.2305094120

Line 113: ‘is prevented by management strategies that remove the sexual host e.g. Berberis’. I think, though this has helped in the past, the fact that the alternate host(s) may not be available in that region of production of the primary host, may be more important?

Line 118: ‘it poses serious risks’

Line 122: Suggestion: ‘As alternatives to sexual reproduction and gene assorting, somatic hybridisation in which nuclei are exchanged during hyphal interactions can increase genotypic diversity including the generation of novel effector complements (29, 30).’

Line 126: ref 31 is cited but this publication is based on a seriously flawed Pt genome assembly (this publication should be retracted and not cited). If Pt genome assembly information is needed, a better resource is a preprint by Sperschneider et al. bioRxiv preprint https://doi.org/10.1101/2022.11.28.518271

Line 146: delete ‘hypothesis’

Line 157 – 159: “Here we made use of the four available chromosome-level genome assemblies of cereal rust fungi at the time of the study including Pca 203, Pst 134E, Pt 76 and Pgt 210 (Table 1). We included P. polysora f. sp. zeae (40) or Melampsora larici-populina as an outgroup.”

- It would be nice to see in the table which assemblies are complete/assembly information sizes etc.

Line 169: ‘…that are thought to bind to’; this has not yet been proven, and is thought to be analogous to other, more-studied basidiomycete fungi….

Line 174 – 175: “In the case of Pst 104E, STE3.2-2 was absent from the genome assembly but we recovered its haplotype by mapping raw sequencing reads of Pst 104E to the Pst 134E phased chromosome scale genome assembly (34)”.

- Could you clarify what was done here? Did you map reads and then assemble this region? Or take the consensus of the mapped reads as STE3.2-2?

Line 216: ‘…is also diverse…’

Line 392: ‘the PRA genes of Pst 134E were found to be ~164kbp from the centromere of chromosome 9A and ~473kbp from that of chromosome 9B, likely physically-linked to their centromeres’.

- To which figure does this refer, and are these typos? Those distances don’t look “physically-linked”?!

Line 317-318: “The information of centromere locations suggests that the HD loci are likely not directly linked to centromeres in all four species”.

- This was based on Juicer Hi-C contact plots; can they be included in the supplementary material?

Line 386-387: “Gene density dropped significantly around PRA genes with STE3.2-2/STE3.2-3 located within or adjacent to ‘gene deserts’ that extended over 1mbps in the case of Pst 134E.”

- Could this be a factor of current annotation limits as these regions contain large repetitive regions / transposons and normally these would be masked in conventional annotation pipelines? Do all gene annotations have RNA support, or vice versa, are there RNA reads mapping to areas where no genes have been “annotated”?

Line 423: I do not see the Pca HD maps in this S14 figure to substantiate the text….?

Line 424-430: “This is likely caused by the fact that isolates selected were mostly linked to sexual populations of Pca found in the USA (49). Hence we also used the most closely related available reference of each sample for short read mapping purposes including references from Pca 203 (50), Pca 12NC29 and Pca 12SD80 (49). This showed that the HD loci in Pca display high levels of polymorphisms including heterozygous SNPs, multiple unaligned regions and gaps, which suggests the HD loci present in our selected Pca isolates are not well reflected in the available references.”

- Would you expect this for the other isolates?

Line 439-444: “We identified slightly more SNPs in STE3.2-2 and STE3.2-3 of Pt which we estimate to give rise to a maximum of four distinct haplotypes. Yet each predicted haplotype has only a very limited number of amino acid substitutions. Pgt was the most polymorphic at the STE3.2-2 and STE3.2-3 loci. We identified three haplotypes of STE3.2-2 with several amino acid changes. STE3.2-3 was the most polymorphic in Pgt including two isolates, TTTSK and UVPgt60, which contain potential nonsense mutations leading to pre-mature stop codons.”

- If these regions are marked by an increase in repeats, what is the mappability of the short reads to these genes and loci?

And, as mentioned at line 513-514: “and complete loss of synteny within and among species surrounding he PR loci”.

- Doesn't this suggest that there will be issues of mapping from other isolates to these regions if haplotypes of the same isolate show high heterozygosity for the PR loci as in Fig 4 B?

Line 484: ‘two loci that control mate compatibility’; I think such statements (and in general throughout the manuscript, e.g. also lines 548-550) need to be presented as tentative since no functional molecular work has demonstrated this in the rusts. All based on analogous work in other basidiomycetes, primarily Ustilago and Cryptococcus.

Line 485: “…HD loci are highly multiallelic”; again, overstated since only cursory, indirect data from a few isolates per species have been presented; see comment above on potential problems with mappability of (Illumina) short reads.

Line 519: “…likely be caused’; ‘…is likely caused’

Lines 520-521: “Interestingly, the TE families surrounding PR loci are not shared within or among species.”

- Again, is this a valid claim when only looking at four species and single isolates from each?

Line 555 and on, Discussion on potential multiallelic PRA alleles in Pgt; there is precedent in Sporisorium reilianum: Schirawski et al. 2005 DOI:10.1128/EC.4.8.1317-1327.2005

Line 564 – 566: “This is even though nuclei of the same mating type share the same cytoplasm during colonisation of the asexual host."

- I do not understand this comment. If it read “…sexual host...”, it would maybe make sense ((haploid) teliospores undergo another mitotic division so a dikaryotic cell / infection hypha directly penetrates the alternate host leaf / epidermal cell).

Line 567: …is unknown; I don’t think that this preliminary transcript analysis is enough to prove that ‘… (MAT genes) are important for nuclear pairing in the asexual stage’. Some experimentation (Cuomo et al., 2017) suggests that they affect overall virulence…..

Line 574: ‘…and dispersal.’ Period

Lines 582-586: “Individual nuclear genomes of rust fungi are entire linkage groups, with effector haplotype complements being tied to specific MAT haplotypes if somatic hybridisation occurs in the absence of parasexual reproduction. Hence, we can predict the possible combinations of effector complements that can arise by somatic hybridisation based on the knowledge to which MAT haplotypes they are linked to.”

- Doesn’t this suggest that there are only a few combinations based on haplotypes that can exist? Can this claim be made when there is a large amount of transposable elements? Are you confident there is no asexual mechanism for information exchange between haplotypes during somatic hybridization?

Fig 1. Pca203: bW26HD1? (delete 6?); Pca12SD80 should be bW6-HD1?? It’s confusing. Please, call haplotype allele numbers consistently. Check all!

Fig 2. Wrt incongruency and inconsistency in a phylogenetic tree when analyzing STE3.2-2 and STE3.2-3 together; in several analyses in other studies, the pheromone receptor sequences were C-terminally truncated to exclude the cytoplasmic tail to optimize the alignment and allow various PRA alleles to be directly compared (e.g., Coelho et al. 2017; Cuomo et al. 2017; Bakkeren et al., 2008, FGB 45, S15). No detail is given on how the pile-ups and analyses were done.

Legend Fig S10: ‘yellow boxes indicate PR loci’; I do not see yellow boxes….?

Reviewer #2: Luo and colleagues offer a genome-level analysis of mating type (MAT) loci in four economically important cereal rust species. The study reveals that these species possess a tetrapolar mating system characterized by an unlinked biallelic pheromone/receptor (PR) locus and a multiallelic homeodomain transcription factor (HD) locus. Through a range of genomic analyses, the authors report signs of recombination suppression and genomic degeneration around the PR locus. Conversely, the HD locus appears to be more conserved and does not show signs of genetic degeneration, suggesting ongoing recombination in its neighboring regions as normally observed in other tetrapolar basidiomycetes. The work also emphasizes the potential role of these MAT loci in promoting nuclear pairing during the asexual cycle, offering insights into the evolution of virulence in these critical pathogens. Overall, the paper has the potential to significantly advance our understanding of mating compatibility mechanisms in this important group of fungi with possible repercussions for agriculture and disease management.

While the manuscript provides valuable insights into mating type loci in rust fungi, it needs several revisions. First, the language needs polishing for clarity and readability, and typographical errors and inconsistent formatting should be fixed. Second, the figures need a thorough review to improve readability and presentation. Finally, the Methods section is too vague, making it hard to understand the research process and raising concerns about reproducibility. These general issues should be addressed alongside my more specific comments/suggestions below.

- Major points -

Line 83: It appears that the manuscript uses reversed nomenclature for the bW (bWest) and bE (bEast) genes, potentially following the precedent set in a previous paper on rust fungi mating type loci (Cuomo et al. 2017, G3, 7:361-376). However, based on foundational research on mating type loci in smut fungi (Ustilaginomycotina), the bW gene corresponds to HD2 and bE to HD1. Therefore, I recommend revising this information for the sake of accuracy and ensuring that this corrected nomenclature is consistently applied throughout the manuscript, so that it can be adopted and reflected in future studies.

Lines 163-164: The manuscript establishes that the HD and PR loci are located on separate chromosomes, thereby supporting a tetrapolar MAT organization. Given the significance of this finding to the overall narrative of the study, it would be beneficial to emphasize this information through visual representation in a main figure. Specifically, I recommend the addition of a figure (potentially as Fig. 1) that provides a simplified version of what is currently shown in S1 Fig. This figure could graphically depict the chromosomes containing the PR and HD loci for each of the four studied species. To provide additional evolutionary context, this could be juxtaposed with the species phylogeny that is currently presented in S4 Fig. Merging these key elements into a main figure would enhance the comparative aspects of the study and facilitate a more immediate understanding of the findings.

Lines 205-207: The authors present interesting findings regarding the genomic organization of pheromone receptor and pheromone genes across the Puccinia species examined. Specifically, they note that STE3.2-2 is found adjacent to MFA2, while MFA1 is located at varying distances from STE3.2-3. Additionally, in Pt, two identical MFA1 genes were identified, possibly indicative of a recent duplication event. On the other hand, in Pgt, two distinct pheromone genes (MFA1 and MFA3) were found in association with STE3.2-3. These observations raise several questions: (1) Why were the two identical genes in Pt labeled as MFA1/3? Is this a nomenclatural choice or does it imply something about the gene function? (2) Given that MFA3 is unique to Pgt, could it be that this gene product does not function in mating type recognition, or is not even a mating pheromone? (3) Could the authors speculate on what might constitute the protein sequence of the mature, active form of the pheromone? (4) How does this compare with other Pucciniales previously sequenced (e.g., Melampsora larici-populina)?

Lines 234-241: The authors suggest that the differing branching patterns of HD1 and HD2 within each species may indicate recombination between these genes. This could potentially lead to self-compatible HD1-HD2 allele pairs. However, an alternative explanation could be a lack of strong phylogenetic signal, as evidenced by some low bootstrap values. I recommend a more cautious interpretation of these results. Additionally, a phylogenetic analysis using protein sequences might be informative. It's important to see if the nucleotide differences result in functionally distinct proteins. Also, in many other basidiomycetes, the N-terminal regions of HD1 and HD2 are more variable, possibly affecting self/non-self-recognition. Have you explored this in the available HD alleles?

Lines 263-264: The conclusion in this section suggests that the incongruent topologies of STE3.2-2 and STE3.2-3 indicate long-term recombination suppression around the PR locus. However, the actual evidence supporting long-term recombination suppression would be phylogenetic clustering of alleles by mating type across species, rather than clustering by species (i.e., trans-specific polymorphism). This is because the two alleles ceased recombining long ago and have subsequently accumulated substitutions independently. It is also worth noting that the incongruent topologies between the two trees are associated with relatively low bootstrap values in the STE3.2-2 tree. Again, these low bootstrap values may affect the confidence level of the inferred phylogenetic relationships, potentially complicating the interpretation.
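
The distinction drawn above, between clustering by mating type and clustering by species, can be made concrete with a small monophyly check. This is only a sketch: the rooted tree is written by hand as nested tuples and its topology is hypothetical (not taken from Fig 2), and bootstrap support is ignored.

```python
def collect_clades(tree, out):
    """Append the leaf set under every node of a nested-tuple tree to out."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
    else:
        leaves = frozenset().union(*(collect_clades(child, out) for child in tree))
    out.append(leaves)
    return leaves


def is_monophyletic(tree, group):
    """True if some node of the rooted tree contains exactly the given leaves."""
    clades = []
    collect_clades(tree, clades)
    return frozenset(group) in clades


# Hypothetical topology in which receptor alleles group across species.
tree = (
    (("Pca|STE3.2-2", "Pst|STE3.2-2"), "Pt|STE3.2-2"),
    (("Pca|STE3.2-3", "Pst|STE3.2-3"), "Pt|STE3.2-3"),
)
by_allele = is_monophyletic(tree, ["Pca|STE3.2-2", "Pst|STE3.2-2", "Pt|STE3.2-2"])
by_species = is_monophyletic(tree, ["Pca|STE3.2-2", "Pca|STE3.2-3"])
print(by_allele, by_species)
```

In this toy case the alleles form clades and the species do not, which is the pattern expected under trans-specific polymorphism; clustering by species would give the opposite result.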

Lines 450-478: This study provides important data on mating type gene expression. Since STE3.2-1 doesn't appear to be mating type-specific based on your results, it might be useful to examine if this gene is upregulated under the conditions you studied, or how its expression varies during infection or spore formation. This could give more context on a possible role of this gene. Additionally, you might consider referencing similar genes in other basidiomycetes that don't directly affect mating type. For instance, in Cryptococcus deneoformans, the Cpr2 gene competes with the Ste3 receptor and triggers unisexual reproduction when overexpressed. Similarly, in some mushroom species, receptor-like genes near the PR locus are linked to vegetative growth, not mating type (e.g., Wirth S, et al. 2021). This could hint at broader functions for genes like STE3.2-1 in rust fungi.

- Minor points -

Line 43: The term “di-om mating” appears in the abstract but is not explained or mentioned elsewhere in the manuscript. The terminology is not only unclear but might also be too technical for a broader readership at the abstract level. Did you intend to write “di-mon mating”?

Lines 62-65: The sentence “Genetic degeneration footprints are higher rates of (non)synonymous substitutions (dN, dS), accumulation of transposable elements, reduced gene expression and reduced gene numbers which are all a consequence of recombination cessation” could benefit from clarification. Are all these “footprints” of genetic degeneration always observed in non-recombining regions?

Line 72: Please consider changing “100s” to “hundreds”.

Line 74: According to standard taxonomic nomenclature, the abbreviation “sp.” in Microbotryum sp. should not be italicized. Furthermore, if referring to multiple species within the Microbotryum genus, the correct form should be “Microbotryum spp.” Please verify this and other instances throughout the text for consistency, such as on lines 101 and 501.

Line 77: The phrase “In most cases, two MAT loci control mating” could be more accurately stated as, “In most cases, two MAT loci determine mating type identity.”

Lines 78-79: To enhance clarity and consistency in the manuscript, consider standardizing the nomenclature for the pheromone receptor genes, choosing either “PRA” or “STE3” for uniform reference throughout. Additionally, please take the opportunity to review and standardize the formatting of gene names (italics vs. non-italics, uppercase vs. lowercase) in both the main text and figure legends, such as in S2 Fig.

Line 81: The term “PR haplotype” is used, which may not align well with conventional nomenclature for individual variations of a gene or locus, especially in the context of mating type loci literature. To ensure clarity and consistency, consider using the term “PR allele” or “idiomorph” instead. Furthermore, when discussing STE3.2-2 and STE3.2-3, it might be more appropriate to refer to these as two different alleles of the same gene. This would align better with terminology used in other parts of the manuscript, such as lines 252-253, and contribute to a more cohesive narrative.

Line 84: It is important to clarify that the products of HD1 and HD2 from the same allele do not form heterodimers. Heterodimerization occurs exclusively between HD1 and HD2 products that are derived from different alleles. Consequently, these products must differ in compatible mating partners.

Line 90: The term “Biopolar” should be corrected to “Bipolar,” and the phrase “are advantages in” should be revised to “are advantageous in.” Similar corrections are needed on line 93.

Line 91: The phrase “that undergo selfings” should be corrected to “that undergo selfing.”

Lines 93-95: The sentence, “In outcrossing populations it is advantages that MAT loci are multiallelic because this increases the success of compatible gametes (in doing what?) derived from two independent individuals,” needs revision for clarity and completeness.

Lines 93-95: The sentence regarding the advantages of multiallelic MAT loci in outcrossing populations is currently unclear and incomplete. Specifically, it would be helpful to clarify what is meant by “increases the success of compatible gametes.” Are you referring to a higher likelihood of successful fertilization, greater genetic diversity, or some other form of “success”?

Lines 96-97: The segment “with ten to hundred known and estimated haplotypes” is unclear and needs clarification.

Line 98: While the term “plesiomorphic” is technically accurate, it might come across as overly technical and potentially less accessible to a broader audience. In this case, it might be beneficial to use the term “ancestral” as a replacement to enhance readability.

Lines 100-103: To offer more background and improve clarity, consider rewording this section along the lines of: “For example, Microbotryum spp. are highly selfing, and multiple independent events have linked the HD locus and the PR locus into large non-recombining regions. These regions differ in both their size and age and have been formed by successive steps of recombination suppression, resulting in several adjacent “evolutionary strata” of differentiation between non-recombination chromosomes.”

Line 112-113: You may want to rephrase the sentence to: “as sexual reproduction is prevented by management strategies that remove the secondary host (e.g., Berberis) required for sexual reproduction.”

Lines 116-117: To better articulate the point, consider revising the sentence to: “The asexual epidemic potential of a given rust isolate is largely determined by the complement of effector genes encoded in its two nuclei, as well as by the resistance (R) genes present in crop varieties grown in a specific region, which confer them the ability to recognize and mount defenses against specific pathogens or their effectors”.

Line 146: “loci” should not be italicized.

Line 158: There seems to be a discrepancy in the notation for “Pgt 210.” Is this the same as “Ptg 21-0”? If so, please consider standardizing.

Line 184: For enhanced clarity, consider rewording the sentence along the lines of: “We employed a phylogenetic approach to trace the inheritance of PRA and HD genes in each haplotype. Under a tetrapolar system, compatible haplotypes in a dikaryotic genotype are expected to have one copy of either STE3.2-2 or STE3.2-3 and exhibit different alleles at the HD locus.”

Lines 207-208: To conform with standard notation, please separate the numbers from the units in the manuscript. For example, change “0.54Mb” to “0.54 Mb” and “10kb” to “10 kb,” and apply this consistently throughout the text.

Line 222: For the mention of trans-specific polymorphism, consider adding a relevant citation to provide further context and support for this concept.

Line 308: The use of the term “sister chromosomes” could introduce ambiguity. I recommend using the term “homologous chromosomes” instead. Additionally, it would be insightful if the manuscript could also address the extent of synteny conservation among the MAT chromosomes across the species studied. Providing information on the average level of divergence between the analyzed haplotypes in different species would also be beneficial.

Line 314-318: The phrase “user features” in this sentence is unclear and might cause confusion. Could you please elaborate on what you mean by this term or consider using a more straightforward expression? Also, it is mentioned that no “obviously aberrant patterns of gene or transposon density” were observed around the HD locus. Could you specify whether there is a particular analysis that supports this assertion? Lastly, I believe the methods used to predict the centromere locations are not very well discussed – if these were determined based on previous research, please provide the appropriate citations.

Lines 386-387: It is noteworthy that in Pca and Pst, there is a marked drop in gene density on only one side of the pheromone receptor gene locus. Could this asymmetry be directly correlated with the position of the pheromone genes that is distinct between the two PR alleles/idiomorphs?

Line 423: It is stated that “Pca showed the highest level of variation of HD loci at the species level (S14 Fig).” However, upon reviewing S14 Fig, I noticed that this figure only displays HD1-HD2 allele variation for three of the species (Pst, Pgt, and Pt). Therefore, the data supporting this claim for Pca appears to be missing or not clearly presented in the supplemental figure. Could you please clarify or correct this?

Line 439: A higher number of SNPs in the STE3.2-2 and STE3.2-3 genes was found in Pgt compared to other species. Could this increase be attributed to the Pgt isolates analyzed being more divergent? It might be helpful to expand on this by comparing SNP frequencies in other genomic regions as well. For instance, S16 Fig. suggests that STE3.2-1, which is not directly related to mating type determination based on your findings, also seems to exhibit a high level of variation. This could indicate that the elevated SNP levels in STE3.2-2 and STE3.2-3 may not be gene-specific but rather a broader characteristic of the Pgt isolates examined. If so, this might challenge the suggestion made in the Discussion section (lines 489-490) that “Pgt might be multiallelic at the PR locus.”
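
The comparison with other genomic regions suggested above amounts to normalising SNP counts by sequence length and relating the gene to a background. A minimal sketch follows; all counts and lengths are hypothetical placeholders and not values from the manuscript.

```python
def snps_per_kb(n_snps, length_bp):
    """SNP density expressed as SNPs per kilobase."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return 1000.0 * n_snps / length_bp


def fold_over_background(gene_snps, gene_len, bg_snps, bg_len):
    """Ratio of the gene's SNP density to the background SNP density."""
    background = snps_per_kb(bg_snps, bg_len)
    if background == 0:
        raise ValueError("background has no SNPs; ratio is undefined")
    return snps_per_kb(gene_snps, gene_len) / background


# Hypothetical numbers: 12 SNPs in a 1.5 kb gene against
# 4,000 SNPs in 1 Mb of other genomic sequence from the same isolates.
gene_density = snps_per_kb(12, 1500)
fold = fold_over_background(12, 1500, 4000, 1_000_000)
print(gene_density, fold)
```

A ratio close to 1 for STE3.2-2 and STE3.2-3, or a similar ratio for STE3.2-1, would suggest that the elevated SNP counts reflect overall divergence among the isolates rather than something specific to the PR locus.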

Lines 503-507: Regarding the role of STE3.2-1, I find it important to mention that other non-mating type pheromone-like receptor genes have been found in other basidiomycetes, including Cpr2, which is known in Cryptococcus deneoformans to compete with the Ste3 receptor for signaling and whose overexpression elicits unisexual reproduction (Hsueh et al., 2009, EMBO J. 28:1220-1233). In mushroom species, pheromone receptor-like genes are also found close to the PR locus, but these do not determine mating type identity. A recent report suggests they may have a role in vegetative growth and self-signaling (e.g., see Wirth S, et al. 2021, J Fungi, 7:399), thus revealing an important role for hyphal growth and development.

Lines 526-528: While the presence of trans-specific polymorphism in the STE3 alleles of rust fungi is appropriately highlighted, it might be beneficial to the discussion to also note that STE3.1 and STE3.2 alleles in other Pucciniomycotina species such as Microbotryum, as well as in red yeasts and Leucosporidiales (for further context, see Coelho et al. 2010, PLoS Genetics 6:e1001052 and Maia et al., 2015, Genetics 201:75-89), appear to be evolutionarily older than those specific to the Pucciniales. This contextual information could provide additional depth to your discussion on the evolutionary aspects of these alleles.

Line 539: Typo in the term “Pucciniomycotina”

Lines 600-601: As it reads, the statement "There is no evidence of recombination suppression at HD loci, as there are no trans-specific polymorphisms between species and support for macro-synteny among species" appears to be contradictory or unclear.

Line 625: As stated above, the Method section requires careful revision and added details. For example, in the “Phylogenetic analysis of the HD and PRA genes” subsection, the absence of parameters used in OrthoFinder for orthology inference and in IQ-TREE2 for phylogenetic analysis is a notable omission.

Line 637: Should be Branco et al.

Fig 3, Line 329: The phrase "10k bp sized windows" should be amended to "10 kb-sized windows". Moreover, the function of the shaded black/grey areas below the dS plots is unclear. If these are intended to represent gene-containing regions, it would be helpful to make this explicit in the figure legend or main text. I also noticed that not all shaded areas have corresponding dS values. If this discrepancy exists because an allele is missing in one of the two haplotypes, please clarify that point. My concerns extend to the calculation of dS values, as mentioned in my comment for S8 Fig below. Additionally, to improve visual distinction between the tracks, consider applying different color schemes for Transposable Elements (TEs) and genes. Lastly, as a general comment for all figures in the manuscript, maintaining a consistent color code for labeling different species would enhance readability and interpretation, as observed in Figures 1 and 2.

Fig 4: The gene tracks are rather small, making it difficult to distinguish individual genes. To enhance readability, I suggest enlarging the gene tracks.

S2 Fig. Line 36: Please correct the phrase “cross species” to “across species” for clarity.

S3 Fig. Line 47: The word “there” appears to be a typographical error and should be corrected to “their.”

S4 Fig: To aid in cross-referencing between the text and the figure, consider adding the species abbreviations next to each species name in the phylogenetic tree.

S5 Fig: I suggest that the phylogenetic tree be rooted at the midpoint. This would prevent inadvertently implying a directional bias in the tree when using the MlpPh1 gene as the outgroup.

S8 Fig: The figure includes calculated dS values for the STE3.2-2 and STE3.2-3 genes (also plotted in Fig 5). Given that these two genes are not easily alignable, it would be helpful to clarify the methodology used to calculate these dS values. Additionally, for enhanced transparency, consider providing the percentage of gene pairs for which dS could be calculated in each chromosome across all studied species and/or include the associated raw data of these dS calculations.

References Section: I noticed several inconsistencies in the formatting of the references, particularly concerning the use of italics. For example, references 33 and 59 do not follow the same formatting as others in the list. Please carefully revise the entire section.

Reviewer #3: In this study the authors investigated the structure, evolution and function of mating type loci in several cereal rust fungi using publicly available genome datasets and expression data analysis.

Very little is known about the evolution of mating type loci in rusts compared to other Basidiomycota, and their study can be highly relevant to better understand the sexual and asexual cycles of these damaging pathogens.

The study goal is very interesting and I didn’t detect any major methodological flaws. However, I found overall that the state of the art was not sufficiently described, some parts of the materials and methods were not sufficiently detailed, and some results need to be better discussed in their context.

Major comments:

-For example, recombination suppression around the PR locus was shown previously to be likely ancestral to Basidiomycota (see Devier et al 2008 Genetics); this doesn’t appear clearly in the abstract, the introduction or the discussion of the results.

-The estimated size of the region with recombination suppression found around the PR locus is not clearly described; how much did this region vary in size between species? Between haplotypes?

-The gene tree for the PR locus should include both haplotypes to support a conclusion of trans-specific polymorphism (Fig2). Trans-specific polymorphism is clear from the mfa genes, however.

-It should be noted that there seems to be some haplotype clustering within species for the bW-HD1 gene in both Pt and Pst. This may suggest some recent recombination suppression (Fig1), which is consistent with the high dS value for this gene.

-Looking at the allelic diversity within species, I wonder whether the authors considered retrieving the PR/HD sequences by BLAST analysis of de novo assemblies; read mapping in a region rich in TEs, such as the PR region, may be challenging. The method used to estimate the number of alleles at the HD and PR loci needs to be described further.

-The finding that the MAT genes are expressed late in the asexual infection cycle during spore production is interesting; however, they are likely not the only genes to be overexpressed at this stage. More evidence / functional analysis is needed to check that MAT loci govern compatibility of nuclei in somatic hybridization; L576-582: the conclusions need to be toned down.

-The evolutionary drivers of recombination suppression are presented in the introduction, but the finding of recombination suppression around the PR loci is not discussed in this context in the discussion.

Minor comments:

Introduction

-l60-68: other models predict recombination suppression around mating-type loci and could be presented here

-l75: more Basidiomycota examples could be cited, as recombination suppression has been described in some species of Cryptococcus, Agaricus, Ustilago… It could be interesting to describe whether recombination suppression was associated with bipolarity/tetrapolarity, mating system, or number of alleles at mating-type loci in these systems.

-l93: typo for “advantages”

-l101: the recurrent comparison with Microbotryum should be justified with a phylogenetic context that may not be obvious to readers; furthermore, the evolution of recombination suppression in this genus should be better described: whether it occurs in bipolar or tetrapolar species, around HD or PR genes or linking them, etc.

-l105-106: this sentence is not related to the previous paragraph and the main goal of the study in my opinion

-l143-145: More context needs to be given to understand why the authors have presented these hypotheses; I failed to understand why the authors tested these hypotheses in particular when reading the introduction.

Results

- Table 1: statistics about the genome assemblies (N90, L50 …) seem to be lacking and should be presented

The authors mention “chromosome scale” assemblies; are telomeres and centromeres well identified? Are these haplotype-phased assemblies? Are all genomes from dikaryotic genotypes?

-l184: “phylogenetic hypothesis” is not clear

-l345: previous sentences state the opposite

-l393: a distance of 473 kb to the centromere may be high for physical linkage; are there other elements supporting linkage to the centromere, such as segregation analyses?

Material & method

-l621: the mapping approach needs to be further described; how was the sequence retrieved after mapping?

-l623: the search of orthologs with GeniousPrime needs to be further described; which dataset was used?

-l626-627: which dataset was used to build the rooted species tree? Is it coding sequence?

-L637: typo “Branco et al”

-One separate section could summarize all strains/genomic data used in this study

-the RNAseq dataset used (time points, number of replicates per time point) needs to be better described

-the TE annotation and gene models used need further description

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: No: See comments to the authors. For example, more data on TE categories; better numerical data on gene transcript levels / read depths (especially wrt HD allele variants calling); parameters used with EdgeR within the github; Hi-C contact maps / data; identification of centromeres.

Reviewer #2: No: For example, the absence of raw data for both the dS calculations and the RNA-seq analyses is noticeable. Including these as supplementary datasets would enhance the transparency and reproducibility of the work.

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Decision Letter 1

Tatiana Giraud, Eva H Stukenbrock

29 Jan 2024

Dear Dr Schwessinger,

Thank you very much for submitting your Research Article entitled 'Genome biology and evolution of mating type loci in four cereal rust fungi' to PLOS Genetics.

 

The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some concerns that we ask you address in a revised manuscript.

We therefore ask you to modify the manuscript according to the review recommendations. Your revisions should address the specific points made by each reviewer.

In addition we ask that you:

1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images.

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Tatiana Giraud

Guest Editor

PLOS Genetics

Eva Stukenbrock

Section Editor

PLOS Genetics

The revised manuscript has been evaluated by the same three referees, who agree that this manuscript on the mating-type loci of rust fungi has been improved following their suggestions. The referees nevertheless still had extensive concerns, in particular about the surprising similarity between PR alleles, the delimitation of the mating-type loci, the clustering of transposable elements as well as the lack of clarity and length of the text.

I have to agree with these concerns. About the clustering of Ty elements, note that a previous study found such enrichment in non-recombining regions of mating-type chromosomes (https://doi.org/10.1038/s41467-023-41413-4). In addition, the manuscript is still not very well written. It should really be corrected by a native English specialist in the field. Some issues are highlighted below, but there are many others all along the manuscript.

I would encourage resubmission only if you are able to revise the manuscript along these lines. The referees also provide a list of excellent additional suggestions and questions, which should also be addressed.

More specific suggestions:

-L66: it’s not alleviated but increased!

-L34, L564: it’s not the mating types that are tetrapolar but the species.

-L40, L578: correlation is between two quantitative variables; what is an allele status? Allele number?

-L42: what does evolutionary conservation mean here?

-L71-76: awkward and not really exact. A correct formulation would be for example: “… suggest that non-recombining fragments could fix more easily when linked to biallelic permanently heterozygous loci such as mating-type loci due to the sheltering of deleterious mutations. Indeed, non-recombining fragments with fewer deleterious mutations than average are beneficial and can rise in frequency, but they will suffer from exposing their (few) recessive deleterious mutations at a homozygous stage when becoming frequent, except if they are linked to a permanently heterozygous locus that will shelter them.”

-L77: predicts that

-L85-95 : keep the general ideas first (i.e. from L91) and the specific statements later (i.e. from L85).

-L107, L118 : « heterothallic » is not a mating system (heterothallic fungi can undergo both outcrossing and diploid selfing)

-L211 : tetrapolar is not a mating system either, it does not control selfing and outcrossing either

-L221 : suggested that (« that » should not be omitted in written text)

-L370 : consistent with what ? what does « order level » mean ?

-L374, L404, 462 : what is conserved ? gene order ? sequence identity ? « conserved » makes no sense « within genomes », conserved applies to long evolutionary times, do you mean are similar ?

-L382, 383 and elsewhere : why loci plural ? There is a single locus, even across species, these are not different loci

-L388 : why mbps plural ?

-l389 : well, it’s not concomitant, it’s causal : it’s because TEs have accumulated that the regions are gene poor

-L422 : awkward : the polymorphism exists regardless of whether you map (so not only « when mapping »)

-L426 : isolates are not populations

-L430 : again, the locus is the same across species so the formulation is incorrect, the locus is not specific to Pca, the sequence is

-L449 : gave rise is not optimal formulation

-All along the ms : check that you add a hyphen between two words that qualify a third one, e.g. L477 protein-coding gene, or L455 chromosome-scale, otherwise your long sentences are hard to read

P21-22 : this is hard to read, much too detailed

-L519 : MAT loci are not involved in sexual reproduction per se but in mate compatibility, this is very different. The HD locus is also involved in dikaryotic growth, why shouldn’t this be also the case in asexual reproduction ?

-L541 : any expression

-L565 : as already said in a previous version, it’s not that multiallelism is beneficial under outcrossing (this is wrong group selection reasoning, evolution does not act for the good of the population), but that rare alleles are selected for under outcrossing, which leads to multi-allelism; this is very different.

-L624 : they actually follow the same trend

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Comments on rebuttal (Jan 2024)

I very much appreciate that the authors took most of the comments and issues from the 4 (!) reviewers to heart and seriously revamped the study. Hence, in my opinion, the manuscript has been substantially improved. All of the issues we raised have been addressed, which made the supplementary data set quite large: but the many added mapping data, alignments and redone (molecular) phylogenies made the data much more supportive of the conclusions. Also, generating the many HD alleles de novo for proper comparisons and subsequent conclusions added much clarity to the manuscript.

The intro is now much more focused on the evolution of the mating-type region and known literature; many concepts are now introduced.

Some minor suggestions:

Line 25, right at the start of the abstract: I don’t quite get the term “Obligate heterozygous loci such as”. I would delete this whole first sentence since you come back later to it.

Line 27, delete “for”

Line 28: suggestion: “To date, an analysis of genome-level mating-type (MAT) loci is lacking in the obligate biotrophic basidiomycetes in….”

Line 32: “P. graminis….”

Line 34: “tetrapolar mating systems in the Pucciniales. The HD locus is found to be multiallelic….”

Line 36: “were..”; switching to past tense compared to the sentence above. Be consistent!

Line 39: “…with a clear biallelic PR locus”; you already said this a few sentences before….

Line 43: “…is related to correct nuclear pairing during spore formation.”; is unclear. Is the intended meaning “..may play a role in maintenance of a proper dinuclear state.”, or something similar?

Line 85: delete “behind”

Line 143: “The role of MAT genes during the life cycle of rust fungi is rudimentary. Though it is hypothesized that MAT genes regulate appropriate nuclear paring during dikaryotic spore production and mediate the compatibility of haploid cell fusions on the “alternate” host”; I do not understand the first sentence, what is meant by “rudimentary”, the knowledge of their role(s)? Also, the hypothesis brought forward is heavily based on molecular work in the smuts; maybe this should be mentioned (and a ref).

Line 202: “unequivocal”; in the abstract the term “appears to be biallelic” is used, which is more carefully phrased. You show the presence of two alleles here in the few surveyed isolates; it’s nuanced in the Discussion, with Pgt potentially having more alleles. This may be confirmed when more isolate genomes are generated worldwide in the future. Even for the other three investigated species, can this be excluded?

Line 292: I am not sure that the two genes bE-HD2 and bW-HD1 actually share the same promoter; yes, they are tightly linked, physically close together and outwardly expressed and may depend on similar/the same transcription induction elements, but I don’t think they can be defined as sharing the same promoter. Experiments will need to be done to ascertain this.

Line 296: “…. , which might be caused by recombination within the HD locus.” Could different rates of genetic drift (accumulation of nucleotide mutations) followed by different selection pressures be the cause?

Line 662. Wrt MAT gene expression during spore stages, Zhan et al., 2023: https://doi.org/10.1007/s44154-023-00107-z, can be cited here again (you already mentioned this ref earlier). Ironically, their data set, though publicly available, cannot easily be checked for these MAT genes!

Data availability: the link to the repository Dryad (https://doi.org/10.5061/dryad.w0vt4b8zm) does not seem to work.

Reviewer #2: Luo and colleagues have commendably addressed the concerns and suggestions from my initial review, significantly enhancing the quality and clarity of their manuscript. Notably, improvements in terminology, formatting, figures, and content have been made. The extension of the methods section, coupled with the provision of raw data on Dryad and analytic code on GitHub, is particularly appreciated for its contribution to transparency and reproducibility.

In response to specific points raised in my initial review, the authors have opted to maintain the nomenclature of HD1 (as bWest) and HD2 (as bEast) genes, but with added clarifications provided in the text. Additionally, they provide a more comprehensive approach to explore HD allele diversity and variability in the N-terminal regions of HD. The authors also acknowledge the limits of their dataset regarding STE3.2-1 expression and added relevant references to similar findings in other fungi.

In this revised evaluation, I have focused on assessing how these and other changes have impacted the overall narrative and quality of the manuscript. Below, I provide additional comments and suggestions on specific points. Overall, the manuscript has undergone significant improvements and is moving closer to aligning with the high standards of PLOS Genetics.

Major points

1) PR and HD locus regions: The manuscript would benefit greatly from a clearer delineation of what the authors consider to be the PR locus region in each of the haplotypes for the species studied. While the dot plot analysis provides valuable information, an overarching view of the synteny for the PR-containing chromosomes in each species would significantly enhance the clarity and comprehensiveness of the findings. The HD locus region (only HD1-HD2?) should also be highlighted.

2) Mature Pheromone Peptides Predictions: In the section concerning the predicted Mfa1 and Mfa3 proteins in Pgt (Lines 266-268 and Fig S3), the authors mention the possibility of distinct mature pheromone peptides being produced by these two genes, each potentially having different receptor specificities. However, the authors don’t provide the predicted sequences for these mature peptides. Clarifying these sequences is important for assessing their specificity and understanding the functional implications of these differences. Upon closer examination, it appears that the Mfa1 precursor may encode multiple copies of the putative peptide moiety. For instance, the Ptmfa1/3 precursor contains two copies of the sequence “QWGNGSHIC” with CAAX motifs separated by a spacer sequence. This contrasts with the Mfa2 precursors, which seem to encode only one mature peptide. This pattern of multiple peptide encoding is also observed in other Pucciniomycotina species, such as Rhodotorula spp. and Microbotryum spp. and also noted in other Pucciniales.

3) Hi-C data and PR/HD locus linkage to centromeres: The conclusion that HD loci are likely not directly linked to centromeres, derived from Hi-C data, warrants further clarification. How does this method definitively determine the absence of direct (genetic?) linkage, and were these assessments also made for the PR locus? The potential limitations in resolution or mapping in complex genomic regions should also be addressed.

4) Evolutionary history of STE3.2-2 and STE3.2-3 alleles: An intriguing aspect noted is the considerable identity shared between the STE3.2-2 and STE3.2-3 alleles within the same species (~43% identity, ~56% similarity). This raises an interesting question about their evolutionary history. Do you think this level of similarity could indicate a more recent allele turnover, potentially following the loss of the STE3.1 allele variant? This seems particularly noteworthy when considering that in other Pucciniomycotina species, the STE3.1 and STE3.2 alleles are more divergent and often not alignable for dS calculations. Furthermore, I am curious about the divergence rates of non-MAT genes within the presumed PR locus region. Specifically, do these non-MAT genes exhibit increased synonymous substitution rates (dS) compared to other genes located on the same chromosome but outside the putative PR region? This assumes, of course, that there are genes common to both haplotypes within the PR region, allowing for such a comparison. Alternatively, is the gene content within the PR region too divergent between the haplotypes to enable a meaningful comparison of substitution rates?

5) Clustering of Ty3_Pt_STE3.2-3 TE family: The observation that the Ty3_Pt_STE3.2-3 TE family predominantly clusters on chromosomes containing the STE3.2-3 locus, particularly within a region bordered by the pheromone and receptor genes, is a notable finding. This specific clustering suggests that this element is restricted to this genomic region. I’m curious about several aspects regarding this observation. Firstly, what might be the potential mechanisms that restrict the Ty3_Pt_STE3.2-3 to the PR STE3.2-3 locus? Could there be unique genomic or epigenetic features in the vicinity of the STE3.2-3 locus that might explain why this TE family does not transpose more frequently to other genomic regions? Secondly, have similar specific TE elements been observed within the STE3.2-2-containing haplotype? Thirdly, the absence of a GAG domain in Ty3_Pt_STE3.2-3 TEs is intriguing. Given that GAG domains are typically associated with the structural proteins of TEs, could this absence affect the mobility or functionality of these TEs?
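As an aside on major point 2, the check for repeated peptide moieties and CAAX motifs in a pheromone precursor could be sketched as below. This is a rough illustration under stated assumptions: the "aliphatic" positions are approximated as A, V, I, L or M (definitions vary), the helper names `find_caax` and `count_peptide` are hypothetical, and the toy precursor is constructed purely for demonstration around the “QWGNGSHIC” sequence quoted above; it is not a sequence from the manuscript.

```python
import re

# Assumption: "aliphatic" is approximated as A, V, I, L or M; definitions vary.
# The lookahead lets overlapping CAAX-like matches be reported as well.
CAAX = re.compile(r"(?=(C[AVILM][AVILM][A-Z]))")


def find_caax(seq):
    """Return (0-based start, motif) for every CAAX-like match in seq."""
    return [(m.start(), m.group(1)) for m in CAAX.finditer(seq.upper())]


def count_peptide(seq, peptide):
    """Count non-overlapping occurrences of a candidate mature peptide."""
    return seq.upper().count(peptide.upper())


# Toy precursor built for illustration only; not a sequence from the study.
toy = "MDSFT" + "QWGNGSHIC" + "VIA" + "EEKPNT" + "QWGNGSHIC" + "VIA"
hits = find_caax(toy)
copies = count_peptide(toy, "QWGNGSHIC")
```

Running the same scan over the Mfa1, Mfa2 and Mfa3 precursors would show directly whether each encodes one or several putative mature peptides.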

Minor points:

1) Lines 28-30, Abstract section: The statement, "To date, genome-level mating type (MAT) loci analysis is lacking for obligate biotrophic basidiomycetes in the order Pucciniales, which contains many economically important plant pathogens," could be refined to highlight the novel aspects of the study more explicitly, by utilizing chromosome-level and phased genome assemblies to analyze mating type (MAT) loci.

2) Line 95/97: “PRA” and “MFA” nomenclature. Please double-check and confirm that this nomenclature for proteins and genes is the one adopted by the journal and in fungal genetics.

3) Line 97: the use of “matching allelic PR locus” would perhaps be more clearly expressed as “compatible PR allele.”

4) Line 100: The phrase “also known as bE-HD1 and bW-HD2” could be rephrased for clarity, and to provide a historical context, to "originally designated as bE-HD1 and bW-HD2 in Ustilago maydis” and include in addition, the original reference where this designation was first established.

5) Line 121: “ten to hundred known…”, consider revising to “ten to hundreds of known…”

6) Lines 140-143: The discussion about cell types and ploidy changes during the life cycle is important in the context of understanding the evolution of non-recombining regions. Therefore, it might be beneficial to consider citing and contextualizing this information in light of the recent findings by Talhinhas et al. (Microbiol Spectr. 2023 11:e0153223. doi: 10.1128/spectrum.01532-23). This study reports the widespread occurrence of replicating haploid and diploid nuclei in various life cycle stages of Pucciniales species, suggesting a unique life cycle that is distinct from traditional haplontic, diplontic, or haplodiplontic cycles.

7) Lines 293-295: In the context of discussing tree topologies in phylogenetic analysis, using the term "congruent" instead of "concurrent" might be more accurate and descriptive. Furthermore, since the authors suggest the possibility of recombination within the HD locus, conducting a more direct test of recombination employing software like RDP5 could help substantiate this hypothesis. Although different from the hypothesis proposed here, I also recommend considering the study conducted on Ustilago maydis as a pertinent reference as it demonstrates how novel HD specificities can arise from single homologous recombination events, leading to simultaneous changes in the dimerization subdomains of the bE and bW proteins (see Kämper J, et al. 2020. New Phytol. 228:1001-1010. doi: 10.1111/nph.16755; also check Perlin MH. 2020. New Phytol. 228:799-801. doi: 10.1111/nph.16847).

8) Lines 387-391: In the section discussing gene density around PRA genes, it is mentioned that STE3.2-2/STE3.2-3 are located within or adjacent to 'gene deserts' and these regions are enriched in TEs. However, the text refers to Pgt 21-0 as an exception, but it's not entirely clear what this exception pertains to. Does the exception refer to the lack of a 'gene desert', the absence of TE enrichment, or a different pattern of gene density around the PR locus in Pgt 21-0? Clarification on this point would help in better understanding the unique characteristics of the PR locus in Pgt 21-0 compared to the other species studied.

9) Upon reviewing the data presented in S12 Fig related to the TE composition in Pra proximate regions, I noticed that although the proportions of different transposable elements vary, it appears that the classes of TEs present are more or less the same across the different cereal rust fungi. This observation seems somewhat contradictory to the statement in the manuscript (lines 399-402) about the varied TE composition and accumulation in these regions. If so, please clarify.

10) Line 421: S18 Fig seems to refer to Pgt not Pca.

11) Line 434: Not sure that “de novo assembly” should be considered a “novel” approach.

12) Line 455-461: Regarding the assessment of more than two functional Pra alleles in Pgt, I am not entirely convinced by this conclusion. It seems more likely that the observed variations are variants of the same allele class, a common occurrence when comparing more divergent strains or lineages. A more cautious or nuanced discussion of this finding would be beneficial, considering the high degree of allele variation in fungal species. Additionally, the authors indicate that mfa1 and mfa3 in Pgt had several non-synonymous changes and that this is consistent with the significant variation observed in STE3.2-2 and STE3.2-3. However, I think it would be informative to determine whether the observed variations at the mfa1 and mfa3 loci translate into differences in the mature peptides produced.

13) Lines 498-506: In the section detailing TE analysis around the STE3.2-3 and STE3.2-2 loci, I have some points and suggestions to enhance clarity and comparability of the presented data. First, the current format of presenting Fig 6C and S30B separately makes direct comparison challenging. I suggest either placing these figures together or, alternatively, generating a plot that highlights the TE ratio or differences between STE3.2-3 and STE3.2-2. Secondly, the 'empty' portions of the bars in Fig 6C related to LTRs are not immediately clear and should be explained. Third, regarding the quantification of LTR Ty3 TE Family in Fig 6C, the text mentions that Pt 20QLD87 has the highest coverage of the LTR Ty3 TE family (Ty3_Pt_STE3.2-3), as depicted in Fig 6D. However, it's not evident from Fig 6C how this conclusion is drawn, as the coverage of this specific TE family doesn’t seem to be separately quantified in that figure. Could you elucidate how this conclusion was reached or consider revising Fig 6C to explicitly demonstrate the coverage of Ty3_Pt_STE3.2-3?

14) I have noticed that the supplementary figures in the provided PDF are of generally low resolution. It's possible that this issue might have arisen due to the PDF conversion process during submission, but please double-check.

Reviewer #3: In this study the authors investigated the structure, evolution and function of mating-type loci in several rust fungi using publicly available genome datasets and expression data analysis.

I found that overall the authors addressed many of the reviewers' comments on the previously submitted version and that this version of the manuscript is very interesting.

My major comments about this new version are:

-throughout the manuscript, I noted inconsistencies between abbreviations for the PR locus (Pra; PR; PRA…); it is unclear if there are differences between these abbreviations;

-Polymorphism within populations between the two PR alleles seems low to me for a region that likely has recombination suppression (L444: one single SNP); it doesn’t seem consistent with the dS value computed, or I misinterpreted this part of the results.

Did the authors use the de novo genome approach to check the polymorphism at PR loci as they did at HD loci?

-It is unclear whether the hypothesis of a role for MAT genes in the regulation of nuclear pairing is a new hypothesis or relies on previous observations (two references are cited L145; in the discussion, it appears as a new hypothesis). L678: indications could be given on how to further test this hypothesis.

-I had difficulty understanding the last part of the results about “shared nuclear haplotypes” and what the authors' main conclusion was. Maybe a schematic or clear conclusions would help.

-the text overall could be shortened (synteny analyses appear several times in the methods; the part about STE3.2-1 loci could be merged into the results)

Minor comments:

-L35: unclear what “genetic features” refers to

-L40: unclear what “allele status” refers to; bi-allelic or multiallelic

-L64: state that it is an example

-L79: the model cited doesn’t state that an outcrossing mating system is necessary; the model may work under inbreeding or automixis.

-L84: cite Basidiomycota examples as they are more relevant to your study: Agaricus, Ustilago, Cryptococcus…

-L120 : what is known about the PR locus? It is explained L155, but could be explained here

-L125: the term evolutionary strata needs to be defined

-L143: unclear what the term “rudimentary” means

-L147-148: indicate what evidence previous studies rely on

-L195: use “degeneration” instead of “deterioration”

-L617 : no “s” on “model” if only one is cited

-L625: “the locus is extended” is unclear to me; do the authors mean “the region with signal of recombination suppression around the PR locus”? How did the authors define this region precisely?

-L630-631: TE composition and coverage at the order level: it is not very clear to me what “order” means

********** 

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

********** 

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Guus Bakkeren

Reviewer #2: No

Reviewer #3: No

Decision Letter 2

Tatiana Giraud, Eva H Stukenbrock

28 Feb 2024

Dear Dr Schwessinger,

Thank you very much for submitting your Research Article entitled 'Genome biology and evolution of mating type loci in four cereal rust fungi' to PLOS Genetics.

The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some concerns that we ask you address in a revised manuscript.

We therefore ask you to modify the manuscript according to the review recommendations. Your revisions should address the specific points made by each reviewer.

In addition we ask that you:

1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images.

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Tatiana Giraud

Guest Editor

PLOS Genetics

Eva Stukenbrock

Section Editor

PLOS Genetics

The manuscript has been much improved. However, there remain multiple occurrences of awkward writing, including formulations already flagged as incorrect in the previous rounds of review, as highlighted below.

-L4, L29, L213: mating-type loci

-L33 : as already said in the previous rounds of reviews, it’s not mating that is tetrapolar but the mating compatibility system

-L41 : we confirm that

-L38 : delete « extensive »

-L44 : delete « of Puccinia » or « pathogenic Puccinia species »

-L46 : delete « from sexual reproduction or somatic recombination » it does not make sense here

-L49 : sexes are, not sex is (sex is sexual reproduction)

-L50 : replace « allele divs » by « features », this sentence makes no sense

-L57-58, L210-211, 725-727: this is too far-fetched

-L61 : delete the ‘ signs ; delete loci or chromosome

-L71, 78, 80 : delete « or new TE insertions », it is unclear how this can suppress recombination here

-L72-76 : this is still very awkwardly explained and makes no sense in the present state. Replace by : « Instead, mathematical modelling and stochastic simulations suggest that non-recombining DNA fragments, caused for example by inversions, can be fixed solely due to the presence of deleterious mutations in genomes. Non-recombining fragments are beneficial if they carry fewer deleterious mutations than average and can increase in frequency. However, when becoming frequent, their (few) recessive deleterious mutations will be exposed and selected against, preventing fixation. They can fix at permanent biallelic, heterozygous loci due to their sheltering of deleterious mutations (2). »

-L106, 108 and elsewhere: replace « mating behavior » by « mating compatibility system »

-L113 : bipolar mating compatibility, not behavior

-L115, 120, 160, 214, 215, 592 and elsewhere : tetrapolar mating compatibility, not behavior (you say L116 outcrossing behavior, which is here correct and shows the confusion around the term mating behavior in the current state)

-L117 : compatibility odds

-L157 : the mating compatibility system (compatibility would be between two cells)

-L179 : these studies did not « open » gaps, they brought insights and left gaps, or gaps remain after these studies

-L207 : at least three ? or is the fourth one multi-allelic ?

-Please stick to the accepted Latin abbreviation system for species names

-L362 : replace « aberrant patterns of gene or transposon density » by « particular patterns of gene or transposon density » (enriched TE patterns are not « aberrant » under recombination suppression)

-L431 : « whole-genome »

-L432 : genome singular as there is « respective » before

-L474 : « This species rank variation » : it is the level that is ranked not the variation itself

-L476, 497 : level instead of rank

-L543 : « the two MAT genes » : it sounds as you already know the two MAT genes while in the sentence you aim to investigate a potential mating-type function for a third one, so this sounds unclear, reformulate

-L593 : as already said, the causality is reversed : it’s not that outcrossing benefits from new alleles (which is wrong, group selection reasoning), but that, under outcrossing, new, rare alleles are selected for, this is completely different.

-L601 : unclear formulation, add « each » after species ? or is the number all four species pooled ?

-L618 : this is also the case in Microbotryum, see Michael Hood’s studies

-L650 : « exact trends » seems contradictory, replace by patterns

-L688 : why would they have other functions ? Delete this sentence, or rephrase saying there is no evidence of additional functions

Decision Letter 3

Tatiana Giraud, Eva H Stukenbrock

4 Mar 2024

Dear Dr Schwessinger,

We are pleased to inform you that your manuscript entitled "Genome biology and evolution of mating type loci in four cereal rust fungi" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional acceptance, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field.  This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about making your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Tatiana Giraud

Guest Editor

PLOS Genetics

Eva Stukenbrock

Section Editor

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-23-00971R3

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Acceptance letter

Tatiana Giraud, Eva H Stukenbrock

12 Mar 2024

PGENETICS-D-23-00971R3

Genome biology and evolution of mating type loci in four cereal rust fungi

Dear Dr Schwessinger,

We are pleased to inform you that your manuscript entitled "Genome biology and evolution of mating type loci in four cereal rust fungi" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Anita Estes

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Table. List of whole-genome Sequence Read Archive data used in the present study.

    The table provides information on all whole-genome Sequence Read Archive (SRA) data used in the present study. Metadata includes species abbreviation (“Species”), isolate name (“Isolate”), SRA identifier (“SRA ID”), and the initial reference (“Citation”). Pca: P. coronata f. sp. avenae; Pgt: Puccinia graminis f. sp. tritici; Pt: P. triticina; Pst: P. striiformis f. sp. tritici.

    (XLSX)

    pgen.1011207.s001.xlsx (12.4KB, xlsx)
    S2 Table. List of RNAseq Sequence Read Archive data used in the present study.

    The table provides information on all RNAseq Sequence Read Archive (SRA) data used in the present study. Metadata includes species abbreviation (“Species”), isolate name (“Isolate”), timepoint of infection (hpi, hours post infection) or sample type (“Sample Type”), timepoint/status of spores (“Time point/Status”), replicate number (“Replicate”), SRA identifier (“SRA ID”), and the initial reference (“Citation”). Pca: P. coronata f. sp. avenae; Pgt: Puccinia graminis f. sp. tritici; Pst: P. striiformis f. sp. tritici.

    (XLSX)

    pgen.1011207.s002.xlsx (11.7KB, xlsx)
    S1 Fig. Two unlinked MAT loci suggest tetrapolar mating types in four Puccinia spp.

    Karyograms of P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”) with the positions of HD, PR and STE3.2–1 loci marked by black arrow heads. HD, PR and STE3.2–1, located on chromosome 4, chromosome 9 and chromosome 1, respectively, suggest tetrapolar mating types in these four species.

    (TIF)

    pgen.1011207.s003.tif (1.1MB, tif)
    S2 Fig. Genetic distance between pheromone precursor genes (mfa) and pheromone receptor (Pra) genes varies from 100 bp to 100 kb.

    The diagrams display the location of STE3.2–2 and STE3.2–3 and their linked mfa genes on chromosome 9A and 9B in P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”). The grey bar represents chromosome subsections containing the genes of interest, with the numbers indicating the absolute location in megabase pairs on chromosome 9A and 9B. The genetic distance between mfa1/mfa3 and STE3.2–3 is highly variable between species, whereas STE3.2–2 and mfa2 are tightly linked in all species.

    (TIF)

    pgen.1011207.s004.tif (233.8KB, tif)
    S3 Fig. MFA protein alignments indicate mating factor a proteins are overall conserved within species.

    (A) Alignment of MFA1 protein sequences from four different rust fungal species. MFA1, which is linked to STE3.2–3, is mostly conserved within species. Two near-identical MFA1/3 copies can be identified in four P. triticina isolates. In P. graminis f. sp. tritici, MFA1 has four amino acid substitutions in Pgt SCCL versus Pgt 21–0. (B) Alignment of MFA2 protein sequences from four different rust fungal species. MFA2 is fully conserved within each species. (C) Alignment of MFA3 protein sequences from three P. graminis f. sp. tritici isolates. MFA3 is only present in P. graminis f. sp. tritici, in close proximity to STE3.2–3. Predicted mature pheromone sequences are outlined by boxes.

    (TIF)

    pgen.1011207.s005.tif (933.3KB, tif)
    S4 Fig. Pheromone precursor genes group according to their proximate Pra alleles and display strong signals of trans-specific polymorphisms.

    Maximum likelihood tree of mfa genes identified in four cereal rust fungi, including multiple isolates per species. Tips are labelled with species abbreviations and gene names; isolate names are provided in parentheses. Branch support was assessed by 10000 replicates. The scale bar represents 0.5 substitutions per site. Pca: P. coronata f. sp. avenae; Pgt: Puccinia graminis f. sp. tritici; Pt: P. triticina; Pst: P. striiformis f. sp. tritici.

    (TIF)

    pgen.1011207.s006.tif (848.5KB, tif)
    S5 Fig. Whole chromosome alignments of HD loci containing chromosomes between the two haplotypes of each dikaryotic genome exhibit slightly reduced synteny around the HD locus.

    The figure shows dot plots of whole chromosome alignments between the two HD loci containing chromosomes from dikaryotic genome assemblies. Each panel consists of a dot plot of the whole chromosome and a subset dot plot zooming into the HD locus. The HD locus is labelled and line colors show the nucleotide percentage identity and nucleotide orientation as indicated in the figure legend. Subfigures A to D show P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”), respectively.

    (TIF)

    pgen.1011207.s007.tif (630.2KB, tif)
    S6 Fig. Allele pairs on chromosomes 4 and 9 are not overall more diverged when compared to other chromosomes.

    The plots show the distribution of dS values of allele pairs on sister chromosomes in four cereal rust fungal species. Bar plots show the distribution of dS values up to the 99% quantile. Red lines represent the threshold of dS values for 95% of all alleles. Blue lines represent the threshold of dS values for 90% of all alleles. Black points show dS values of individual allele pairs. HD and STE3 allele pairs are highlighted as red points. Subfigures A to D show P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”), respectively.

    (TIF)

    pgen.1011207.s008.tif (695.5KB, tif)
    S7 Fig. Hi-C heatmap of P. coronata f. sp. avenae (“Pca 203”) hapA.

    (TIF)

    pgen.1011207.s009.tif (3.3MB, tif)
    S8 Fig. Hi-C heatmap of P. coronata f. sp. avenae (“Pca 203”) hapB.

    (TIF)

    pgen.1011207.s010.tif (3.1MB, tif)
    S9 Fig. Hi-C heatmap of P. triticina (“Pt 76”) hapA.

    (TIF)

    pgen.1011207.s011.tif (3.9MB, tif)
    S10 Fig. Hi-C heatmap of P. triticina (“Pt 76”) hapB.

    (TIF)

    pgen.1011207.s012.tif (3.8MB, tif)
    S11 Fig. Coverage of different transposable element orders at the HD, PR, and STE3.2–1 loci.

    The plots show the percentage of nucleotides covered by different transposable element orders at (A) the HD locus, (B) the PR locus, and (C) the STE3.2–1 locus. Each subfigure A to C shows the coverage in each haplotype of the dikaryotic genomes of P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”). Different TE orders are color coded as shown in the legend. TEs with no assigned class are labelled “Undetermined”. TEs with no assigned order but belonging to Class I (RNA retrotransposons) or Class II (DNA transposons) are labelled “Undetermined Class I” or “Undetermined Class II”, respectively.

    (TIF)

    pgen.1011207.s013.tif (790.2KB, tif)
    S12 Fig. Whole chromosome alignments of PR loci containing chromosomes between two haplotypes of each dikaryotic genome exhibit strong signs of synteny loss.

    The figure shows dot plots of whole chromosome alignments between the two PR loci containing chromosomes from the dikaryotic genome assemblies. Each panel consists of dot plots of the whole chromosome and subset dot plots zooming into the PR locus. The PR locus is labelled and line colors show the nucleotide percentage identity and nucleotide orientation as indicated in the figure legend. Subfigures A to D show P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”), respectively.

    (TIF)

    pgen.1011207.s014.tif (518.8KB, tif)
    S13 Fig. IGV screen shots of short-read Illumina mapping of various P. coronata f. sp. avenae isolates against the HD locus in the Pca 203 reference.

    (A) shows mapping against the HD locus on chromosome 4A and (B) chromosome 4B, respectively.

    (TIF)

    pgen.1011207.s015.tif (1.1MB, tif)
    S14 Fig. IGV screen shot of short-read Illumina mapping of various P. graminis f. sp. tritici isolates against the HD locus in the Pgt 21–0 reference.

    (A) shows mapping against the HD locus on chromosome 4A and (B) chromosome 4B, respectively.

    (TIF)

    pgen.1011207.s016.tif (4.7MB, tif)
    S15 Fig. IGV screen shots of short-read Illumina mapping of various P. triticina isolates against the HD locus in the Pt 76 reference.

    (A) shows mapping against the HD locus on chromosome 4A and (B) chromosome 4B, respectively.

    (TIF)

    pgen.1011207.s017.tif (811.6KB, tif)
    S16 Fig. IGV screen shots of short-read Illumina mapping of various P. striiformis f. sp. tritici isolates against the HD locus in the Pst 134E reference.

    (A) shows mapping against the HD locus on chromosome 4A and (B) chromosome 4B, respectively.

    (TIF)

    pgen.1011207.s018.tif (986KB, tif)
    S17 Fig. Nucleotide alignments of the HD gene coding sequence in four cereal rust fungi indicate bW-HD1 and bE-HD2 are multiallelic in each species.

    (A) Nucleotide alignment of the de novo reconstructed HD locus from Pca 203 Illumina short-read data with the coding regions of bW-HD1 and bE-HD2 alleles from the Pca 203 dikaryotic reference genome. (B) to (D) Multiple sequence alignments of de novo reconstructed HD coding regions. (E) Multiple sequence alignments of Pst bW-HD1 and bE-HD2 alleles [159]. In each subfigure B to E, the top two tracks show the consensus sequence length and relative sequence identity, respectively. Subfigures B to E show P. coronata f. sp. avenae (“Pca”), P. graminis f. sp. tritici (“Pgt”), P. triticina (“Pt”) and P. striiformis f. sp. tritici (“Pst”), respectively. The bW-HD1 and bE-HD2 alleles are numbered in accordance with Fig 1.

    (TIF)

    S18 Fig. IGV screen shots of short-read Illumina mapping of various P. coronata f. sp. avenae isolates against the PR locus in the Pca 203 reference.

    (A) shows mapping against the PR locus on chromosome 9A and (B) chromosome 9B, respectively.

    (TIF)

    pgen.1011207.s020.tif (291.6KB, tif)
    S19 Fig. IGV screen shot of short-read Illumina mapping of various P. graminis f. sp. tritici isolates against the PR locus in the Pgt 21–0 reference.

    (A) shows mapping against the PR locus on chromosome 9A and (B) chromosome 9B, respectively.

    (TIF)

    pgen.1011207.s021.tif (446.7KB, tif)
    S20 Fig. IGV screen shots of short-read Illumina mapping of various P. triticina isolates against the PR locus in the Pt 76 reference.

    (A) shows mapping against the PR locus on chromosome 9A and (B) chromosome 9B, respectively.

    (TIF)

    pgen.1011207.s022.tif (349.6KB, tif)
    S21 Fig. IGV screen shots of short-read Illumina mapping of various P. striiformis f. sp. tritici isolates against the PR locus in the Pst 134E reference.

    (A) shows mapping against the PR locus on chromosome 9A and (B) chromosome 9B, respectively.

    (TIF)

    pgen.1011207.s023.tif (326.6KB, tif)
    S22 Fig. Amino acid alignments of STE3.2–2 and STE3.2–3 alleles of four cereal rust fungi.

    Multiple sequence alignments of de novo reconstructed STE3.2–2 and STE3.2–3 protein sequences. Each subfigure contains two MSAs, one for STE3.2–2 and one for STE3.2–3. Subfigures A to D show P. coronata f. sp. avenae (“Pca”), P. graminis f. sp. tritici (“Pgt”), P. triticina (“Pt”) and P. striiformis f. sp. tritici (“Pst”), respectively.

    (TIF)

    pgen.1011207.s024.tif (476.4KB, tif)
    S23 Fig. Amino acid alignments of the three MFA alleles from various P. graminis f. sp. tritici isolates.

    Amino acid substitutions are highlighted by color, whereas predicted mature pheromone sequences are outlined by boxes.

    (TIF)

    pgen.1011207.s025.tif (992.1KB, tif)
    S24 Fig. Whole chromosome alignments of HD genes containing chromosomes between haplotypes of four different P. triticina isolates.

    The figure shows dot plots of whole chromosome alignments of HD loci containing chromosomes derived from distinct dikaryotic genomes of four different P. triticina isolates, including Pt 15, Pt 19NSW04 and Pt 20QLD87 against Pt 76. Each panel consists of a dot plot of the whole chromosome and a subset dot plot zooming into the HD locus. The HD locus is labelled and line colors show the nucleotide percentage identity and nucleotide orientation as indicated in the figure legend. (A) Comparison of nucleotide sequence of chromosome 4s of Pt 15 and Pt 76. (B) Comparison of nucleotide sequence of chromosome 4s of Pt 19NSW04 and Pt 76. (C) Comparison of nucleotide sequence of chromosome 4s of Pt 20QLD87 and Pt 76. (D) Comparison of nucleotide sequence of chromosome 4s of Pt 19NSW04 and Pt 20QLD87.

    (TIF)

    pgen.1011207.s026.tif (3.6MB, tif)
    S25 Fig. The HD locus is highly conserved in four P. triticina isolates.

    Synteny graphs of HD loci including proximal regions in the four P. triticina isolates Pt 15, Pt 19NSW04, Pt 20QLD87 and Pt 76. Red lines between chromosome sections represent gene pairs with nucleotide sequence identity higher than 70% and grey shades represent conserved nucleotide sequences (≥ 1000 bp and identity ≥ 90%). For additional annotations please refer to the provided legend (“Legend”).

    (TIF)

    pgen.1011207.s027.tif (1.6MB, tif)
    S26 Fig. Coverage of different transposable element orders at the HD locus in four P. triticina isolates.

    The plots show the percentage of nucleotides covered by different transposable element orders at the HD locus. Each subfigure shows the coverage in each haplotype of the dikaryotic genomes of P. triticina isolates Pt 15, Pt 19NSW04, Pt 20QLD87 and Pt 76. Different TE orders are color coded as shown in the legend. TEs with no assigned class are labelled “Undetermined”. TEs with no assigned order but belonging to Class I (RNA retrotransposons) or Class II (DNA transposons) are labelled “Undetermined Class I” or “Undetermined Class II”, respectively.

    (TIF)

    pgen.1011207.s028.tif (326.2KB, tif)
    S27 Fig. Whole chromosome alignments of Pra gene containing chromosomes between haplotypes of four different P. triticina isolates.

    The figure shows dot plots of whole chromosome alignments of STE3.2–2 or STE3.2–3 containing chromosomes derived from distinct dikaryotic genomes of four different P. triticina isolates, including Pt 15, Pt 19NSW04 and Pt 20QLD87 against Pt 76. Each panel consists of a dot plot of the whole chromosome and a subset dot plot zooming into the proximal region of STE3.2–2 or STE3.2–3. The STE3.2–2 or STE3.2–3 genes are labelled and line colors show the nucleotide percentage identity and nucleotide orientation as indicated in the figure legend. (A) Comparison of nucleotide sequence of chromosome 9s containing the STE3.2–2 gene. (B) Comparison of nucleotide sequence of chromosome 9s containing the STE3.2–3 gene.

    (TIF)

    pgen.1011207.s029.tif (1.3MB, tif)
    S28 Fig. Nucleotide coverage distribution of Ty3_Pt_STE3.2–3 on the chromosomes of four P. triticina isolates.

    The plots show the percentage of nucleotides covered by the transposable element family Ty3_Pt_STE3.2–3 on each chromosome of four P. triticina isolates Pt 15 (A), Pt 19NSW04 (B), Pt 20QLD87 (C) and Pt 76 (D). Chromosomes carrying STE3.2–3 are highlighted in red.

    (TIF)

    pgen.1011207.s030.tif (325.2KB, tif)
    S29 Fig. Schematic illustration of the location of Ty3_Pt_STE3.2–3 copies at the STE3.2–3 locus in four P. triticina isolates.

    (TIF)

    pgen.1011207.s031.tif (118.7KB, tif)
    S30 Fig. Genealogy of STE3.2–1 in rust fungi indicates that STE3.2–1 sequences are conserved within species.

    Bayesian rooted gene tree built from a coding sequence-based alignment of STE3.2–1 from four cereal rust fungi: P. coronata f. sp. avenae (Pca), P. graminis f. sp. tritici (Pgt), P. triticina (Pt) and P. striiformis f. sp. tritici (Pst). P. polysora f. sp. zeae (Ppz) GD1913 was included as outgroup. Trees are based on a TN93+I model of molecular evolution. Each node is labelled with its posterior probability (PP) value. PP values above 0.95 are considered strong evidence for monophyly of a clade, and PP values of identical alleles are not displayed. The scale bar represents the number of nucleotide substitutions per site. Alleles of the same species are colored with identical background: Pca (yellow), Pgt (green), Pt (blue), Pst (orange).

    (TIF)

    pgen.1011207.s032.tif (525.7KB, tif)
    S31 Fig. Whole chromosome alignments of STE3.2–1 gene containing chromosomes between two haplotypes of each dikaryotic genome suggest high conservation of STE3.2–1.

    The figure shows dot plots of whole chromosome alignments between the two STE3.2–1 gene containing chromosomes from dikaryotic genome assemblies. Each panel consists of dot plots of the whole chromosome and subset dot plots zooming into the STE3.2–1 proximal region. The position of STE3.2–1 gene is labelled, and line colors show the nucleotide percentage identity and nucleotide orientation as indicated in the figure legend. Subfigures A to D show P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”) and P. striiformis f. sp. tritici (“Pst 134E”), respectively.

    (TIF)

    pgen.1011207.s033.tif (626.8KB, tif)
    S32 Fig. Synteny analysis of STE3.2–1 genes and their flanking regions reveals no evidence of genetic degeneration.

    Synteny graphs of the STE3.2–1 locus including proximal regions in P. coronata f. sp. avenae (“Pca 203”), P. graminis f. sp. tritici (“Pgt 21–0”), P. triticina (“Pt 76”), and P. striiformis f. sp. tritici (“Pst 134E”). Proximal regions are defined as 40 genes downstream and 40 genes upstream of the STE3.2–1 alleles. STE3.2–1 proximal regions are highly syntenic within each dikaryotic genome and conserved between species. Red lines between chromosome sections represent gene pairs with sequence identity higher than 70% and grey shades represent conserved nucleotide sequences (≥ 1000 bp and identity ≥ 90%). For additional annotations please refer to the provided legend (“Legend”).

    (TIF)

    pgen.1011207.s034.tif (1.5MB, tif)
    S33 Fig. Synonymous divergence (dS) values between STE3.2–1 alleles along chromosome 1A suggest conservation of this locus.

    Synonymous divergence values (dS) for all allele pairs are plotted along chromosome 1A for (A) P. coronata f. sp. avenae (“Pca 203”), (B) P. graminis f. sp. tritici (“Pgt 21–0”), (C) P. triticina (“Pt 76”), and (D) P. striiformis f. sp. tritici (“Pst 134E”). In each panel, the top track shows the dS values (“dS”) of allele pairs along chromosome 1. Each dot corresponds to the dS value of a single allele pair. The second and third tracks show the averaged TE (“TE”) and gene (“gene”) density along chromosome 1 in 10 kbp-sized windows, respectively. The STE3.2–1 alleles are highlighted with a red line, and red shading indicates a 0.4 Mbp-sized window around the STE3.2–1 genes. Predicted centromeric regions are marked with blue shading. The two lower tracks (dS values and gene locations) provide a detailed zoomed-in view of the red shaded area around the STE3.2–1 alleles. Species-specific background coloring is the same as for Fig 1.
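The TE and gene density tracks summarise annotations in fixed 10 kbp windows. The caption does not say whether density means the fraction of covered bases or the number of features per window, so the following is only a minimal sketch assuming base coverage. The function name `window_density` and the example intervals are hypothetical and are not taken from the authors' code.

```python
def window_density(features, chrom_length, window=10_000):
    """Fraction of bases covered by features in each fixed-size window.

    features: iterable of (start, end) intervals, 0-based and half-open.
    Assumption: density = covered bases / window size (not feature counts).
    """
    # Merge overlapping intervals so that no base is counted twice.
    merged = []
    for start, end in sorted(features):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    n_windows = -(-chrom_length // window)  # ceiling division
    covered = [0] * n_windows
    for start, end in merged:
        end = min(end, chrom_length)
        pos = start
        # Split each interval at window boundaries.
        while pos < end:
            idx = pos // window
            win_end = min((idx + 1) * window, end)
            covered[idx] += win_end - pos
            pos = win_end

    # The last window may be shorter than the others.
    densities = []
    for i, bases in enumerate(covered):
        size = min(window, chrom_length - i * window)
        densities.append(bases / size)
    return densities


# Hypothetical example: two overlapping annotations on a 25 kbp sequence.
example = window_density([(0, 5_000), (4_000, 12_000)], chrom_length=25_000)
```

Counting features per window instead would only require replacing the covered-base tally with a counter keyed on each feature's start position.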

    (TIF)

    pgen.1011207.s035.tif (1.6MB, tif)
    S34 Fig. Multidimensional scaling (MDS) plots of the RNAseq datasets used in this study.

    MDS plots were made with TMM-normalized counts for quality control. Each dot represents a single sample, and replicates are color coded as indicated in each subfigure legend. (A) MDS plot of TMM-normalized values of Pca at 48 hours post infection (hpi) and 120 hpi. (B) MDS plot of TMM-normalized values of Pgt at 48, 72, 96, 120, 144 and 168 hpi. (C) MDS plot of TMM-normalized values of Pst at 24, 48, 72, 120, 168, 216 and 264 hpi. (D) MDS plot of TMM-normalized values of Pst ungerminated spores (US), germinated spores (GS), 144 hpi, 216 hpi and haustoria enriched samples (HE).

    (TIF)

    pgen.1011207.s036.tif (367KB, tif)
    S35 Fig. Expression of housekeeping genes in four RNAseq datasets.

    (A) TMM-normalized values of housekeeping genes in P. coronata f. sp. avenae (“Pca 12NC29”), P. graminis f. sp. tritici (“Pgt 21–0”), and P. striiformis f. sp. tritici (“Pst 87/66” and “Pst 104E”). (B) A likelihood ratio test (LRT) was applied to test for significant upregulation of housekeeping genes between timepoints; none of the housekeeping genes shows significant upregulation between timepoints. Red dashed lines indicate logFC = 0.5 and logFC = -0.5, respectively. Stars above or below bars indicate statistically significant differences between the two adjacent time points: *p<0.05, **p<0.01, ***p<0.001. Genes are labelled with different colors: Transcription elongation factor TFIIS (TFIIS), Actin-related protein 3 (ARP3), Actin/actin-like protein 2 (ACTIN2), Ubiquitin carboxyl-terminal hydrolase 6 (UBP6), Conserved oligomeric Golgi complex subunit 3 (COG3), Ctr copper transporter 2 (CTR2).

    (TIF)

    pgen.1011207.s037.tif (314.5KB, tif)
    S36 Fig. Upregulation of MAT genes in late stages of the asexual life cycle.

    A likelihood ratio test (LRT) was applied to test for significant upregulation of MAT genes between timepoints. Red dashed lines indicate logFC = 0.5 and logFC = -0.5, respectively. Stars above or below bars indicate statistically significant differences between the two adjacent time points: *p<0.05, **p<0.01, ***p<0.001. MAT genes are labelled with different colors. (A) The expression levels of MAT genes in P. coronata f. sp. avenae (“Pca 12NC29”) were compared between 120 hours post infection (hpi) and 48 hpi. (B) The expression levels of MAT genes in P. graminis f. sp. tritici (“Pgt 21–0”) were compared between 72 hpi and 48 hpi, 96 hpi and 72 hpi, 120 hpi and 96 hpi, 144 hpi and 120 hpi, and 168 hpi and 144 hpi. (C) The expression levels of MAT genes in P. striiformis f. sp. tritici (“Pst 87/66”) were compared between 48 hpi and 24 hpi, 72 hpi and 48 hpi, 120 hpi and 72 hpi, 168 hpi and 120 hpi, 216 hpi and 168 hpi, and 264 hpi and 216 hpi. (D) The expression levels of MAT genes in P. striiformis f. sp. tritici (“Pst 104E”) were compared between germinated spores (GS) and ungerminated spores (US), 144 hpi and GS, 216 hpi and 144 hpi, and haustoria enriched samples (HE) and 216 hpi.
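The star notation in this legend maps p-values to three significance levels, and the dashed lines mark logFC = ±0.5. As a minimal sketch of how such annotations could be derived from a table of contrasts, the snippet below assigns stars from p-values and reports whether a bar extends beyond the dashed lines. Treating ±0.5 as a cut-off is an assumption for illustration, since the legend describes the lines only as visual references. The function names and example values are hypothetical.

```python
def significance_stars(p_value):
    """Map a p-value to the star notation used in the legend."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


def annotate_contrasts(contrasts, lfc_line=0.5):
    """Add stars and a beyond-the-line flag to each contrast.

    contrasts: list of (gene, logFC, p_value) tuples.
    lfc_line: position of the dashed reference lines (assumed cut-off).
    """
    annotated = []
    for gene, log_fc, p_value in contrasts:
        annotated.append({
            "gene": gene,
            "logFC": log_fc,
            "stars": significance_stars(p_value),
            "beyond_line": abs(log_fc) > lfc_line,
        })
    return annotated


# Hypothetical values, not taken from the study.
rows = annotate_contrasts([("geneA", 1.2, 0.0004), ("geneB", -0.3, 0.2)])
```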

    (TIF)

    pgen.1011207.s038.tif (226.9KB, tif)
    S37 Fig. MAT genes are upregulated in the late asexual infection stage of P. striiformis f. sp. tritici (“Pst 104E”).

    TMM-normalized values of MAT genes in ungerminated spores (US), germinated spores (GS), 144 hpi, 216 hpi and haustoria enriched samples (HE) of Pst. Genes are labelled with different colors.

    (TIF)

    pgen.1011207.s039.tif (86.5KB, tif)
    Attachment

    Submitted filename: 20240108_ResponseToEditorAndReviewers.docx

    pgen.1011207.s040.docx (75.2KB, docx)
    Attachment

    Submitted filename: 20240223_RebuttalLetter.pdf

    pgen.1011207.s041.pdf (224.4KB, pdf)
    Attachment

    Submitted filename: 20240229_ResponseToReviewer.pdf

    pgen.1011207.s042.pdf (135.1KB, pdf)

    Data Availability Statement

    Analysis code used in this study is available in the GitHub repository https://github.com/ZhenyanLuo/codes-used-for-mating-type. Alignments used in genealogical studies, exported RDP5 project files, dS values of gene pairs in the studied cereal rust species, presumed CDS of reconstructed HD alleles, Pra alleles, TE annotation and classification files, and normalized gene expression matrix files are available at Dryad (Luo Z, Schwessinger B. Supporting information of genome biology and evolution of mating type loci in four cereal rust fungi [Dataset] 2024. Available from: https://doi.org/10.5061/dryad.w0vt4b8zm).


    Articles from PLOS Genetics are provided here courtesy of PLOS
