ABSTRACT
Coriolopsis trogii is a typical thermotolerant basidiomycete fungus, but its thermotolerance mechanisms are currently unknown. In this study, two monokaryons of C. trogii strain Ct001 were assembled: Ct001_29 had a genome assembly size of 38.85 Mb and encoded 13,113 genes, while Ct001_31 was 40.19 Mb in length and encoded 13,309 genes. Comparative intra- and interstrain genomic analysis revealed the rich genetic diversity of C. trogii, which included more than 315,194 single-nucleotide polymorphisms (SNPs), 30,387 insertion/deletions (indels), and 1,460 structural variations. Gene family analysis showed that the expanded families of C. trogii were functionally enriched in lignocellulose degradation activities. Furthermore, a total of 14 allelic pairs of heat shock protein 20 (HSP20) genes were identified in the C. trogii genome. The expression profile obtained from RNA sequencing (RNA-Seq) showed that four tandem-duplicated allelic pairs, HSP20.5 to HSP20.8, had more than 5-fold higher expression at 35°C than at 25°C. In particular, HSP20.5 and HSP20.8 were the most highly expressed HSP20 genes. Allelic expression bias was found for HSP20.5 and HSP20.8; the expression of Ct29HSP20.8 was at least 1.34-fold higher than that of Ct31HSP20.8, and that of Ct31HSP20.5 was at least 1.5-fold higher than that of Ct29HSP20.5. The unique structural and expression profiles of the HSP20 genes revealed by these haplotype-resolved genomes provide insight into the molecular mechanisms of high-temperature adaptation in C. trogii.
IMPORTANCE Heat stress is one of the most frequently encountered environmental stresses for most mushroom-forming fungi. Currently available fungal genomes are mostly haploid because high heterozygosity hinders diploid genome assembly. Here, two haplotype genomes of C. trogii, a thermotolerant basidiomycete, were assembled separately. A conserved tandem cluster of four HSP20 genes showing allele-specific expression was found to be closely related to high-temperature adaptation in C. trogii. The obtained haploid genomes and their comparison offer a more thorough understanding of the genetic background of C. trogii. In addition, the responses of HSP20 genes at 35°C, which may contribute to the growth and survival of C. trogii at high temperatures, could inform the selection and breeding of elite strains in the future.
KEYWORDS: haplotype genome, HSP20, thermotolerance, Coriolopsis trogii
INTRODUCTION
Coriolopsis trogii, previously known as Trametes trogii (https://www.mycoguide.com/guide/fungi/basi/agar/poly/poly/trma/trogii), is a globally distributed white-rot basidiomycete that has been recognized as an excellent source of ligninolytic enzymes and thermostable laccases (1–6). The crude extract and exopolysaccharides of C. trogii also show strong pharmacological effects, such as anticancer and antioxidative effects (7–9). Most mushroom-forming fungi are sensitive to temperature, and temperatures above 25°C during the cultivation process may seriously affect the yield and quality of the fruiting bodies (10, 11). The optimal temperature for mycelial growth in C. trogii is about 35°C, which is significantly higher than that for most filamentous fungi (5), making it a good candidate for studying the heat resistance or adaptation of fungi.
Heat shock protein 20 (HSP20) genes comprise a major family of HSP genes and encode a group of molecular chaperones that possess a conserved sequence of 80 to 100 amino acid residues, called an α-crystallin domain (ACD), in their C-terminal region (12). HSP20s bind to partially folded or denatured proteins, preventing their irreversible aggregation and keeping them stable (13). HSP20 genes are induced by elevated temperature and are associated with stress responses in a range of different species (12, 14–19). In addition to environmental adaptation, HSP20 genes are associated with development in plants (12, 20, 21) and bacteria (22). However, the gene number and structure of the HSP20 family and their relationship with thermotolerance in C. trogii are currently unknown.
In basidiomycetes, two different nuclei coexist side by side in one cell during the majority of their life cycle (23). To achieve better assembly and annotation results, monokaryon strains derived from protoplasts or single spores are usually selected to conduct genome sequencing (24–27). However, considerable phenotypic and genetic diversity is found between different monokaryons, strains, or haplotypes (23, 28, 29), and thus more genomes may provide a more comprehensive genetic background of a single species. Currently, only one whole-genome sequence for a monokaryon of C. trogii (S0301) is available (30). Additional genomic resources are thus critically important for realizing a more comprehensive and in-depth understanding of C. trogii.
In our current study, two protoplast-derived mating-compatible monokaryons, namely, Ct001_29 (mating type AxBx) and Ct001_31 (mating type AyBy), of C. trogii strain Ct001 were sequenced and assembled. Comparative genomic analysis was conducted, and genetic variations were identified. Through comprehensive RNA sequencing (RNA-Seq) analysis and allele-specific expression (ASE) analysis, genes encoding HSP20s, especially the duplicated allelic pairs Ct29HSP20.8/Ct31HSP20.8 and Ct29HSP20.5/Ct31HSP20.5, were suggested to be the major HSP genes participating in the high-temperature adaptation of C. trogii.
RESULTS
Assembly and annotation of the Ct001_29 and Ct001_31 genomes.
The diploid strain Ct001 was isolated from the field in Shandong, China, in 2019. A total of 60 monokaryons were obtained, of which two mating-compatible haploids (Ct001_29 and Ct001_31) were randomly selected for this study. In total, 100× to 120× coverage from Illumina reads and 50× to 60× coverage from PacBio reads was generated for each genome. The 38.85-Mb assembly of Ct001_29 consisted of 43 contigs, with a contig N50 of 2.53 Mb, and the 40.19-Mb assembly of Ct001_31 consisted of 38 contigs, with a contig N50 of 2.48 Mb (Table 1 and Fig. 1A). The high degree of completeness of these assemblies was supported by the k-mer spectrum (see Fig. S1 in the supplemental material) and by Benchmarking Universal Single-Copy Ortholog (BUSCO) analysis. A total of 283 BUSCOs (97.5% of genes in the fungal lineage of odb9) were identified in both assemblies, of which 282 were complete and only one was fragmented. The assembly quality of these two genomes was comparable to that of a previously published C. trogii draft genome (S0301) (30) in terms of total genome size and N50 value (Table 1).
TABLE 1.
Statistics for assembly features of C. trogii nuclear genomes
| Statistic | Ct001_29 | Ct001_31 | S0301^a |
|---|---|---|---|
| No. of contigs | 43 | 38 | 29 |
| Total length (Mb) | 38.85 | 40.19 | 39.88 |
| Longest contig (Mb) | 4.63 | 4.78 | 4.82 |
| Shortest contig (kb) | 1.52 | 1.92 | 20.37 |
| Avg contig length (kb) | 903.50 | 1,057.75 | 1,375.01 |
| Contig N50 (Mb) | 2.53 | 2.48 | 2.4 |
| GC content (%) | 55.40 | 55.37 | 55.47 |
| Repeat content (%) | 7.64 | 10.20 | |
| No. of protein-coding genes | 13,113 | 13,309 | 14,508 |
^a The S0301 genome sequence was previously published by Liu et al. (30).
FIG 1.
Global view of the Ct001_29 genome. (A) The longest 20 contigs of Ct001_29; (B) G+C percentage; (C) gene density; (D) repeat content; (E) variant density in Ct001_29 versus that in Ct001_31; (F) variant density in Ct001_29 versus that in S0301 (30); (G) syntenic blocks (sequence length, ≥5 kb) within the Ct001_29 genome. All of the statistics were calculated over 50-kb nonoverlapping windows.
Repetitive sequences represented 7.97% and 10.18% of Ct001_29 and Ct001_31, respectively (see Table S1 in the supplemental material). Long terminal repeats (LTRs) were the dominant repeat elements in both genomes, accounting for 2.34% of Ct001_29 and 3.45% of Ct001_31. A total of 13,113 protein-coding genes were predicted in Ct001_29, while 13,309 were predicted in Ct001_31. Low-GC-content regions tended to have low gene density (Fig. 1B and C). Gene functional annotation of Ct001_29 revealed that approximately 86.97% (11,402), 58.48% (7,667), and 77.58% (10,171) of the predicted genes could be annotated using data from InterProScan, Pfam, and UniProt, respectively.
Evolution of the C. trogii genome.
To investigate the evolutionary history of the C. trogii genome, an orthologous gene analysis using C. trogii and 23 other representative species (see Table S2 in the supplemental material) was conducted. A total of 176 single-copy orthologous genes were used for phylogenetic tree construction. The phylogenetic tree showed that the estimated divergence time between the C. trogii lineage and the Ganoderma lucidum and Ganoderma sinense lineages was around 104 million years ago (Mya) (Fig. 2A). There were 84, 88, and 168 gene families that had possibly expanded in the Ct001_29, Ct001_31, and S0301 (30) strains, respectively, while 674, 641, and 880 gene families, respectively, appeared to have contracted compared to those in other taxa in this tree (Fig. 2B and C). Genes belonging to glycoside hydrolase families, especially GH16, were significantly enriched in the observed expanded genes (Fig. 2D), which may be related to the strong lignocellulose degradation ability of this species.
FIG 2.
Phylogenetic analysis of C. trogii. (A) Phylogenetic tree (maximum likelihood) constructed based on 176 single-copy genes in C. trogii and 23 other species using RAxML. Mya, million years ago. The bootstrap values for the most recent common ancestor (MRCA) were 65% for Umay versus Clac, 76% for Pstr versus Pcar, and 100% for others. Gluc, Ganoderma lucidum; Gsin, Ganoderma sinense; Dsqu, Dichomitus squalens; Tvil, Trametes villosa; Tspx, Trametes sp. AH28-2; Tpub, Trametes pubescens; Tver, Trametes versicolor; Tpol, Trametes polyzona; Thir, Trametes hirsuta; Tcoc, Trametes coccinea; Tsan, Trametes sanguinea; Tcin, Trametes cinnabarina; Wcoc, Wolfiporia cocos; Ppla, Postia placenta; Fpin, Fomitopsis pinicola; Pcar, Phanerochaete carnosa; Pstr, Puccinia striiformis; Post, Pleurotus ostreatus; Asub, Auricularia subglabra; Umay, Ustilago maydis; Clac, Cystobasidiopsis lactophilus; Ncra, Neurospora crassa; Scer, Saccharomyces cerevisiae; S0301, C. trogii. (B) Gene families identified in 24 species. “Gene family number” represents the number of gene families identified in the corresponding species; “gene number” represents the number of all of the predicted genes in the genome of the corresponding species; “gene in family” represents the total number of genes that can be classified into gene families. (C) Gene family expansion and contraction. Ct_29 and Ct_31 represent Ct001_29 and Ct001_31, respectively. (D) Functional enrichment of the expanded genes.
Comparative genomics among C. trogii strains.
To capture the genomic diversity of C. trogii through both interstrain (S0301 versus Ct001_29) and intrastrain (Ct001_31 versus Ct001_29) comparisons, the sequences of Ct001_29, Ct001_31, and S0301 (30) were compared in a pairwise manner. Overall, strong syntenic relationships between these genomes were observed, although several structural differences existed (Fig. 3A). Substantial variations were identified, including 330,591 single-nucleotide polymorphisms (SNPs), 56,638 insertion/deletions (indels), and 1,563 structural variations (SVs), in the interstrain comparisons, as well as 315,194 SNPs, 30,387 indels, and 1,460 SVs in the intrastrain comparisons (Fig. 3B). The variants were distributed unevenly across the whole genome, with some regions possessing high variant density (Fig. 1E and F). We found that most of these SVs had lengths of less than 2 kb. Among the interstrain SVs, 704 were distributed in intergenic regions and 1,097 were distributed in gene regions, with most (997) occurring in exons. Similarly, among the intrastrain SVs, 714 were distributed in intergenic regions and 975 were distributed in gene regions, with most (878) occurring in exons (Fig. 3C). Randomly selected SNP and indel loci were then amplified successfully (except for the locus targeted by primer indel_5), and these loci were confirmed by Sanger sequencing (Fig. S2A to C). A set of randomly selected SVs was confirmed by sequence amplification and agarose gel electrophoresis (see Fig. S2D in the supplemental material). On average, SV-associated genes in Ct001_29 showed lower expression than SV-free genes (Fig. 3D), which suggested that SVs may influence the expression of nearby genes.
FIG 3.
Comparative analysis of C. trogii genomes. (A) Sequence syntenic alignment of S0301 (30), Ct001_31, and Ct001_29. Gray lines indicate shared syntenic blocks; horizontal lines indicate contigs of Ct001_29 (blue), Ct001_31 (green), and S0301 (orange). (B) Variant length distribution of S0301 versus that of Ct001_29 and that of Ct001_31 versus that of Ct001_29. The affected base size of a single-nucleotide polymorphism (SNP) (blue bar) is 0, the affected base size of an insertion (green bar) is >0, and the affected base size of a deletion (pink bar) is <0. DEL, deletion; INS, insertion. (Upper) Variant numbers of the comparison between Ct001_29 and Ct001_31; (lower) variant numbers of the comparison between Ct001_29 and S0301. (C) Number of structural variations (SVs) overlapping with specific genomic features. The “merged” columns represent numbers of SVs from the union of two comparisons. (D) Expression of SV-associated genes and SV-free genes. TPM, transcripts per million. (E) Functional enrichment of Ct001_29-specific genes and Ct001_31-specific genes. (F) Agarose gel electrophoresis of haplotype-specific genes. Genes of 128441, 128431, 076031, 069651, and 038071 are Ct001_31 specific, whereas the others are Ct001_29 specific. (G) Expression heatmap of haplotype-specific genes under different conditions. Carbon, carbon source; M, mycelium; P, primordium; FB, fruiting body; MM, glucose; SM, sucrose; LM, lignin; CM, cellulose; XM, xylan; PDA, potato dextrose agar.
In the genome comparison between S0301 and Ct001_29 (interstrain), 2,236 genes were highly conserved (variable bases accounted for less than 0.2% of a gene) and 3,257 were highly variable (variable bases accounted for more than 2% of a gene). In the genome comparison between Ct001_31 and Ct001_29 (intrastrain), 2,822 genes were highly conserved and 2,643 were highly variable. The conserved genes were functionally enriched in genome stability maintenance, such as DNA recombination, endonuclease activity, and DNA integration. The variable genes detected, however, were functionally enriched in catalyzing activities, such as oxidoreductase and monooxygenase.
In total, Ct001_31 had 365 haplotype-specific genes, and Ct001_29 had 360 haplotype-specific genes (the list of all of the haplotype-specific genes can be obtained at http://www.gpgenome.com:8080/species/62343). Ct001_31-specific genes were enriched in DNA-binding functions, and Ct001_29-specific genes were enriched in signal transduction-related functions (Fig. 3E). To validate the authenticity of the haplotype-specific genes, we used complementary DNA (cDNA) from Ct001_29 and Ct001_31 as the templates for PCR amplifications. Based on the band patterns (presence or absence) observed by agarose gel electrophoresis, five selected Ct001_31-specific genes and three Ct001_29-specific genes were further validated (Fig. 3F). Most haplotype-specific genes showed low expression levels under all of the culture conditions (Fig. 3G), and about half of them exhibited no expression at all (43.06% of Ct001_29-specific genes and 47.40% of Ct001_31-specific genes), suggesting that they play accessory roles in the growth and development of C. trogii.
Expression profile of C. trogii under different conditions.
Ct001 can use a wide spectrum of carbon sources and can produce fruiting bodies on a variety of natural substrates (see Fig. S3A in the supplemental material). Ct001 grew faster at 35°C than at 25°C under all of the tested conditions, which confirmed the high-temperature adaptation ability of this organism (Fig. S3B and C). To investigate the molecular mechanisms of high-temperature adaptation in C. trogii, RNA-Seq analysis of mycelia cultured on different carbon sources and at different temperatures, as well as of different developmental stages of Ct001, was conducted (Fig. 4A and B).
FIG 4.
Expression profile of C. trogii diploid strain Ct001 under different conditions. (A) Different developmental stages of C. trogii. M, mycelium; P, primordium; FB, fruiting body. (B) Growth status of mycelia cultured on different carbon sources and at different temperatures. MM, glucose; SM, sucrose; LM, lignin; CM, cellulose; XM, xylan; PDA, potato dextrose agar. (C) Heatmap of differentially expressed genes under different conditions. “Carbon” indicates the carbon source. The Z-score is the standardized value of transcripts per million for each gene.
Specific expression was observed under different conditions (developmental stages, carbon sources, or temperatures) (Fig. 4C). During different developmental stages, 2,628 differentially expressed genes (DEGs) (vegetative mycelia versus primordia) and 430 DEGs (primordia versus fruiting bodies) were identified (see Fig. S4C in the supplemental material), and these developmental DEGs were enriched in hydrolase activity, polysaccharide catabolic process, and aromatic compound catabolic process. A total of 2,243 carbon-related DEGs were identified (Fig. S4A), and these genes were enriched in functions of heme binding, monooxygenase activity, and the polysaccharide/aromatic compound/glucan/cellulose catabolic process. A total of 45 overlapping temperature-related DEGs were identified in growth on all six carbon groups, while this number was 155 under any combination of five different carbon groups (Fig. S4B); these 155 DEGs were enriched in chaperone complex and cellular response to unfolded/misfolded proteins. Notably, four HSP20 genes (HSP20.5 to HSP20.8) were differentially expressed under all of the tested conditions and showed extremely high expression levels at 35°C.
The HSP20 gene family is specifically involved in the high-temperature adaptation of C. trogii.
In total, 14 allelic pairs of HSP20 genes were identified in C. trogii (see Fig. S5 in the supplemental material), among which 8 pairs belonging to 3 groups were duplicated. Specifically, the group involving HSP20.5 to HSP20.8 was tandem duplicated, forming a tandem cluster (Fig. 5A). According to our RNA-Seq results, most of the HSP20 genes were more highly expressed at 35°C than at 25°C (Fig. S5C). The expression of HSP20.5 to HSP20.8 was then validated using quantitative PCR (qPCR). Due to their high sequence similarity, only one primer pair was designed, and it was used to detect the total expression of these genes (Ct29HSP20.5 to Ct29HSP20.8 and Ct31HSP20.5 to Ct31HSP20.8) as a whole. The expression levels at 35°C were significantly higher (more than 5-fold) than those at 25°C under all of the tested conditions (see Fig. S6A in the supplemental material), indicating that these four HSP20 genes were widely involved in the high-temperature adaptation of C. trogii.
FIG 5.
Structure and expression profile of the C. trogii HSP20 tandem cluster. (A) Syntenic relationship of genes in the HSP20 tandem cluster of Ct001_29, Ct001_31, and S0301 (30). Green blocks represent genes on a forward strand, while blue blocks represent genes on a reverse strand. Wavy lines indicate shared syntenic genes, and blue wavy lines indicate syntenic relationship of HSP20 genes. (B) Normalized read counts of HSP20 genes under different conditions. MM, glucose; SM, sucrose; LM, lignin; CM, cellulose; XM, xylan.
In this cluster, the allelic pairs Ct29HSP20.5-Ct31HSP20.5 and Ct29HSP20.8-Ct31HSP20.8 showed high expression, while Ct29HSP20.6-Ct31HSP20.6 and Ct29HSP20.7-Ct31HSP20.7 showed quite low expression under all of the tested conditions at the mycelial stage. Similarly, Ct29HSP20.8-Ct31HSP20.8 showed high expression across the developmental stages (Fig. 5B). According to our sequence similarity analysis, the genes of the Ct29HSP20.5-Ct31HSP20.5 and Ct29HSP20.8-Ct31HSP20.8 allelic pairs had similar promoter regions (Fig. S6B), suggesting that they may be subject to similar cis-regulation. Allele dominance was found in Ct29HSP20.5-Ct31HSP20.5 and Ct29HSP20.8-Ct31HSP20.8, as the expression of Ct31HSP20.5 was consistently higher (at least 1.5-fold) than that of Ct29HSP20.5, while that of Ct29HSP20.8 was consistently higher (at least 1.3-fold) than that of Ct31HSP20.8 (Fig. 5B). To validate the high expression of Ct29HSP20.5-Ct31HSP20.5 and Ct29HSP20.8-Ct31HSP20.8, we amplified a 563-bp sequence from this cluster and conducted clone sequencing. Among the 56 tested clones, the proportion of Ct29HSP20.5:Ct31HSP20.5:Ct29HSP20.6:Ct31HSP20.6:Ct29HSP20.7:Ct31HSP20.7:Ct29HSP20.8:Ct31HSP20.8 was 7:19:1:0:0:0:20:9 (Fig. S6C and D).
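The clone-sequencing proportions reported above can be turned into allele ratios with a few lines of arithmetic. The following is only an illustrative sketch: the counts are the 7:19:1:0:0:0:20:9 proportions given in the text, while the function name `allele_ratio` and the dictionary layout are hypothetical choices made for this example, not part of the original analysis.

```python
# Clone counts taken from the 7:19:1:0:0:0:20:9 proportion reported in the text.
clone_counts = {
    "Ct29HSP20.5": 7, "Ct31HSP20.5": 19,
    "Ct29HSP20.6": 1, "Ct31HSP20.6": 0,
    "Ct29HSP20.7": 0, "Ct31HSP20.7": 0,
    "Ct29HSP20.8": 20, "Ct31HSP20.8": 9,
}

def allele_ratio(counts, numerator, denominator):
    """Ratio of clone counts between two alleles (hypothetical helper)."""
    if counts[denominator] == 0:
        raise ZeroDivisionError(f"no clones observed for {denominator}")
    return counts[numerator] / counts[denominator]

total_clones = sum(clone_counts.values())

# Share of clones that come from the HSP20.5 and HSP20.8 allelic pairs.
major_share = sum(
    n for gene, n in clone_counts.items() if gene.endswith(("HSP20.5", "HSP20.8"))
) / total_clones

ratio_hsp20_5 = allele_ratio(clone_counts, "Ct31HSP20.5", "Ct29HSP20.5")
ratio_hsp20_8 = allele_ratio(clone_counts, "Ct29HSP20.8", "Ct31HSP20.8")

print(f"total clones: {total_clones}")
print(f"HSP20.5 + HSP20.8 share: {major_share:.1%}")
print(f"Ct31HSP20.5 / Ct29HSP20.5: {ratio_hsp20_5:.2f}")
print(f"Ct29HSP20.8 / Ct31HSP20.8: {ratio_hsp20_8:.2f}")
```

With only 56 clones, these ratios are coarse estimates. They point in the same direction as the RNA-Seq-based allelic bias, but they should not be read as precise fold changes.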
DISCUSSION
“Pangenome” and comparative genomics studies based on large numbers of high-quality genomes have recently become widely used research approaches, especially in model fungal species, such as Saccharomyces cerevisiae (31) and Aspergillus spp. (32, 33). These studies have revealed the abundant genetic diversity in a given species and provide a solid foundation for the resolution of genotype-phenotype relationships, as well as illustrating the importance of having more complete genomes in a population. Several macrofungal taxa, such as G. lucidum, have more than one set of genomes available, and these genomes are typically from different strains with different phenotypes (26, 34). However, genome information from different karyotypes of the same strain is still limited. In this study, we successfully assembled two haplotype genomes and conducted comparative analysis of C. trogii based on the obtained high-quality genomes. Rich genetic diversity was detected between the different strains of C. trogii. Moreover, significant differences between the two karyotypes included, but were not limited to, genome length, number of genes, and gene composition. Sequencing the genomes of these two karyotypes allowed us to gain a more comprehensive understanding of the genetic background of this strain.
HSP20 genes represent the most abundant small heat shock protein (sHSP) genes in plants, and include, for example, 51 members in soybean (35), 48 in potato (14), and 41 in apple (15). Tandem and segmental duplications may be the major contributors to HSP20 expansion. According to earlier work (18), fungi carry fewer HSP20 genes, usually fewer than five. In this study, 14 HSP20 members were identified in C. trogii, and this expansion may have been the result of tandem duplication. Most HSP20 genes in plants have no introns or only one intron (14, 15, 21). According to previous studies, genes with few or no introns are considered to be rapidly activated in response to various stresses (36), and this is also the case for HSP20 genes, as these genes usually have no introns and are more easily induced (14, 15). However, the HSP20 genes in this study typically had two introns, suggesting that their evolutionary status differs from that of plant HSP20 genes.
The induced high expression of HSP20 genes can enhance thermotolerance in organisms (15, 17). In this study, rather than a transient induction, the continuous high expression of HSP20 genes at 35°C appeared to support high-temperature adaptive growth in C. trogii. Due to the high degree of sequence similarity, it was difficult to accurately analyze the expression levels of the duplicated genes. Benefiting from the genomic data of the different karyotypes, allele-specific expression analysis of HSP20 genes was then conducted, and two duplicated allelic pairs (Ct29HSP20.5-Ct31HSP20.5 and Ct29HSP20.8-Ct31HSP20.8) showed the highest expression, suggesting an important role in high-temperature adaptive growth in C. trogii. The other two allelic pairs, Ct29HSP20.6-Ct31HSP20.6 and Ct29HSP20.7-Ct31HSP20.7, which had low expression levels under all of the tested conditions, may be functionally redundant or may confer better fitness under other conditions (18). The responses of HSP20 genes to heat stress may functionally contribute to the growth and survival of C. trogii at high temperatures, which might be useful for the selection and breeding of elite strains.
MATERIALS AND METHODS
Strains, cultivation, and fruiting body collection.
The dikaryotic C. trogii strain Ct001 was maintained on potato dextrose agar (PDA) at 4°C and stored at the Institute of Bioengineering, Guangdong Academy of Sciences. Fresh mycelial blocks (5 mm in diameter) were transferred to PDA plates kept at 25°C, 28°C, 32°C, 35°C, 37°C, or 42°C in the dark for 5 days. The growth rate was calculated based on hyphal radial growth under each condition. Monokaryotic strains with opposite mating types were obtained by methods previously reported (dedikaryotization via hyphal lysis, protoplast isolation, and colony regeneration) (37) from Ct001, and the monokaryons Ct001_29 and Ct001_31 were randomly selected.
The mycelia of Ct001 were cultured on minimal medium (MM) plates containing glucose (20 g/liter), (NH4)2SO4 (1.5 g/liter), K2HPO4 (1.0 g/liter), MgSO4 (0.3 g/liter), and vitamin B1 (0.5 mg/liter). Next, the glucose was replaced with an equal amount of four other carbon sources (20 g/liter), namely sucrose (SM), lignin (LM), cellulose (CM), and xylan (XM). Ct001 was then inoculated into the above four culture media, and the plates were further cultured at 25°C or 35°C for 5 days. For each sample (including mycelia cultured on PDA for 5 days at 25°C or 35°C), the mycelia were quickly scraped and mixed to produce a biological repeat sample, which was then frozen in liquid nitrogen and stored at −80°C. Three replicates were prepared for all of the treatments. The growth rates of Ct001 on different carbon sources were determined at 25°C and 35°C.
The 5-day-old mycelia of Ct001 cultured on PDA plates at 25°C were inoculated into culture vessels (300 ml) containing culture compost. The culture compost consisted of 10% oak wood, 70% sugarcane bagasse, 19% wheat bran, and 1% gypsum, with a final water content of 65%. The vessels were incubated at 25°C with approximately 50% humidity in the dark. These vessels were fully covered with mycelia after 18 days, at which point the surface of the medium was scraped with a sterilized scalpel and further cultivated for an additional 5 days. These vessels were then transferred to a fruiting room. The temperature was maintained at 28°C, and the room humidity was maintained at 85%, with a photoperiod of 12 h at 300 lx and 12 h in the dark. Three replicates of primordia (28 days after inoculation) and fruiting bodies (32 days after inoculation) were collected and quickly frozen in liquid nitrogen for further analysis.
Genome and RNA sequencing.
Genome sequencing was performed using samples of Ct001_29 and Ct001_31. For each strain, 8 μg of genomic DNA extracted via the cetyltrimethylammonium bromide method (38) was conjugated to a 16-bp barcode sequence, and then a 20-kb insert library was constructed. These libraries were sequenced in one single-molecule real-time (SMRT) cell on the PacBio Sequel II platform. In addition, 10 μg of genomic DNA from each of Ct001_29 and Ct001_31 was used to construct paired-end (PE) libraries with an average insert size of 300 bp. Sequencing of these libraries was performed on the NovaSeq platform (Illumina, Inc., USA).
RNA-Seq was performed on a total of 42 samples, including mycelia cultured on different media, at different temperatures, or with primordia or fruiting bodies. The total RNA extraction/quality control and sequencing library construction of all of the samples were conducted using methods previously reported (24). These libraries were sequenced on an Illumina NovaSeq platform, and paired-end 150-bp reads were generated.
Assembly of the genome.
Raw read quality was assessed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and low-quality bases or reads were filtered out using Skewer (39) with the following criteria: trimming from the 3′-end base to achieve a quality score of >30 and exclusion of reads with a length of <100 bp or an average quality of <30. The PacBio data were assembled using Canu v1.8 (40) and refined with Racon (41) and Pilon (42). k-mer analysis toolkit (KAT) (43) analysis and BUSCO analysis (44) with the fungi odb9 database were used to test the accuracy and completeness of our assembled genomes.
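As a reminder of how the contig N50 values reported in Table 1 are defined, the sketch below computes N50 from a list of contig lengths. It is a minimal illustration, not the tool used in this study: the function name `n50` is hypothetical and the contig lengths are toy values rather than the real Ct001 contigs.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L of the contig at which, when contigs are sorted
    from longest to shortest, the running total first reaches at least
    half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("no contig lengths given")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Hypothetical toy assembly (lengths in bp), not the real Ct001 contigs.
toy_contigs = [500, 400, 300, 200, 100]
print(n50(toy_contigs))

# Average contig length for Ct001_29 from the totals in Table 1 (in kb).
avg_contig_kb_ct29 = 38.85 * 1000 / 43
print(round(avg_contig_kb_ct29, 1))
```

For the toy list, the two longest contigs (500 + 400) already cover more than half of the 1,500-bp total, so the N50 is 400.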
Repeat sequence and gene annotation.
Dispersed repeated sequences at the DNA level were detected through an approach combining de novo prediction and homology-based searching. RepeatModeler v1.0.11 (http://www.repeatmasker.org/RepeatModeler/) was used to construct a de novo repeat library, and this de novo library was then mixed with Repbase (a database of eukaryotic repetitive elements) to conduct repeat searching using RepeatMasker v4.06 (http://www.repeatmasker.org/RMDownload.html).
Nuclear genomes for Ct001_29 and Ct001_31 were next annotated using EuGene v4.2 (45). Protein coding genes were functionally annotated by searching the following databases: Pfam (46), UniProt (https://sparql.uniprot.org/), and InterProScan (47).
Gene family and phylogenetic analysis.
A total of 26 genomes (including Ct001_29, Ct001_31, and S0301 [30]) were used to conduct phylogenetic analysis, and the annotations and sequences of all of the other genomes considered were downloaded from NCBI (see Table S2 in the supplemental material). To identify gene family numbers, we performed an all-against-all comparison using BLASTP (48) with an E value cutoff of 1 × 10⁻⁵, and the OrthoMCL (49) method was used to cluster our BLASTP results into paralogous and orthologous clusters. The orthologous gene families calculated by OrthoMCL were subjected to CAFE5 (50) for identification of expansion and contraction using default parameters.
In total, 176 single-copy genes were used to construct a phylogenetic tree. Multiple-sequence alignments of these 176 genes were generated using MUSCLE v3.8.31 (51), and the alignments were concatenated into one long sequence for each species. A phylogenetic tree was then constructed using RAxML v8.2.11 (52) with 1,000 bootstraps, the PROTGAMMAJTTF model, and Neurospora crassa and S. cerevisiae as the outgroups. Two divergence time calibration points were fixed in our molecular clock analysis: the most recent common ancestor (MRCA) of S. cerevisiae and G. lucidum diverged 723 Mya, and the MRCA of Ustilago maydis and G. lucidum diverged 466 Mya (timetree.org). The divergence times of the other nodes were then calculated using MCMCtree (53).
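The concatenation step described above (joining per-gene alignments into one long sequence per species) can be sketched as follows. This is an illustrative example only: the function name `concatenate_alignments`, the dictionary-based input format, and the toy sequences are assumptions, and the gap-filling fallback for a missing species is added for robustness even though single-copy orthologs should be present in every species.

```python
def concatenate_alignments(gene_alignments, species):
    """Concatenate per-gene alignments into one supermatrix row per species.

    gene_alignments: list of dicts mapping species name -> aligned sequence;
                     all sequences within one gene must have equal length.
    species:         ordered list of species names to include.
    """
    parts = {sp: [] for sp in species}
    for alignment in gene_alignments:
        widths = {len(seq) for seq in alignment.values()}
        if len(widths) != 1:
            raise ValueError("sequences in one alignment must share a length")
        width = widths.pop()
        for sp in species:
            # Assumption: a species missing from a gene is padded with gaps.
            parts[sp].append(alignment.get(sp, "-" * width))
    return {sp: "".join(chunks) for sp, chunks in parts.items()}

# Toy alignments for two hypothetical genes and two hypothetical taxa.
toy_genes = [
    {"taxonA": "MK-L", "taxonB": "MKAL"},
    {"taxonA": "GG", "taxonB": "G-"},
]
supermatrix = concatenate_alignments(toy_genes, ["taxonA", "taxonB"])
for name, seq in supermatrix.items():
    print(f">{name}\n{seq}")
```

The resulting rows all have the same length, which is what a maximum-likelihood program expects from a concatenated alignment.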
Comparative genomics analysis.
The syntenic relationships among Ct001_29, Ct001_31, and S0301 (30) were analyzed using the MCScanX package (54) with default settings. Ct001_31 and S0301 were mapped to Ct001_29 using Minimap2 (55), and variants were called based on these mapping results. Variants with base changes shorter than 50 bp were defined as insertions/deletions (indels), and those longer than 50 bp were considered to be structural variations (SVs). When the number of overlapped bases (between genetic variations and genes) accounted for ≤0.2% of the total length of a gene, it was defined as a conserved gene, and when this ratio was ≥2%, this gene was defined as a variable gene. The predicted genes for Ct001_29 and Ct001_31 were then compared with each other, and the genes without any comparison results in the opposite genome were considered to be haplotype-specific genes.
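The size-based variant classes and the conserved/variable gene definitions above amount to simple threshold rules, sketched below. The function names and the label "intermediate" are hypothetical. The text does not say how a change of exactly 50 bp is handled, so treating it as an SV here is an assumption.

```python
SV_MIN_BP = 50  # assumption: a change of exactly 50 bp is treated as an SV

def classify_variant(size_change_bp):
    """Classify a variant by its signed size change relative to the reference.

    0 -> SNP, positive -> insertion, negative -> deletion (as in Fig. 3B).
    """
    if size_change_bp == 0:
        return "SNP"
    scale = "SV" if abs(size_change_bp) >= SV_MIN_BP else "indel"
    direction = "insertion" if size_change_bp > 0 else "deletion"
    return f"{scale} ({direction})"

def classify_gene(variant_bases, gene_length):
    """Label a gene by the fraction of its bases overlapped by variants."""
    if gene_length <= 0:
        raise ValueError("gene_length must be positive")
    fraction = variant_bases / gene_length
    if fraction <= 0.002:   # <=0.2% of the gene length
        return "conserved"
    if fraction >= 0.02:    # >=2% of the gene length
        return "variable"
    return "intermediate"   # hypothetical label for everything in between

print(classify_variant(0), classify_variant(3), classify_variant(-120))
print(classify_gene(2, 1500), classify_gene(45, 1500), classify_gene(10, 1500))
```

Genes that fall between the two cutoffs belong to neither category in the text, so the sketch keeps them in a separate class rather than forcing a binary split.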
For validation of the genetic variations and haplotype-specific genes between Ct001_29 and Ct001_31, mycelia from Ct001_29 and Ct001_31 were collected separately for DNA and RNA isolation after cultivation on PDA for 5 days at 25°C or 35°C. A total of 28 SNPs, 10 indels, and 8 SVs were randomly selected, and 20 primer pairs were designed according to their conserved flanking sequences (several loci could be amplified by a single primer pair) (see Table S3 in the supplemental material). PCR amplification was then conducted using genomic DNA from Ct001_29 and Ct001_31 as the templates. Three Ct001_29-specific genes and five Ct001_31-specific genes were randomly selected, and primer pairs were designed accordingly (Table S3). The cDNA from Ct001_29 and Ct001_31 was used as the amplification template. All of the primers were designed and synthesized by Tsingke Biotechnology Co., Ltd. (Beijing, China). All of the amplification products were analyzed using agarose gel electrophoresis or were sequenced at Tsingke Biotechnology Co., Ltd.
Expression profile analysis.
The raw reads generated by RNA-Seq were trimmed and quality controlled using Skewer (39). HISAT and StringTie (56) were used to calculate the expression levels of each gene (transcripts per million [TPM]). DEGs between these different samples were identified using DESeq2 (57), and DEGs were defined as those genes that had a log2 fold change of ≥1 and a P value of ≤1E−3. To analyze the DEGs from mycelia grown on different carbon sources, pairwise comparisons of all of the expression data at 25°C were conducted, and the expression data at 35°C were processed similarly. The intersection of any pairwise comparison at 25°C and 35°C (15 groups in total; see Fig. S4A in the supplemental material) was considered to represent the DEGs responsive to carbon sources. To analyze the DEGs of mycelia grown at different temperatures, expression data between 25°C and 35°C on each carbon source were compared, and the intersection of all of these comparisons (six groups in total; see Fig. S4B) was considered to be those DEGs responsive to temperature. To analyze the DEGs during different developmental stages, the expression data of mycelia cultured at 25°C on PDA, primordia, and fruiting bodies were compared (Fig. S4C). An expression heatmap was drawn using pheatmap (https://cran.r-project.org/web/packages/pheatmap/index.html).
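The thresholding and intersection logic for the temperature-responsive DEGs can be sketched in Python as follows. This is a simplified illustration, not the DESeq2 workflow itself: the input is assumed to be a precomputed table of log2 fold changes and P values, the use of the absolute fold change is an assumption, and the gene and carbon source names are hypothetical.

```python
def call_degs(results, min_abs_log2fc=1.0, max_p=1e-3):
    """Return genes passing the fold change and P value thresholds.

    results: dict mapping gene ID -> (log2 fold change, P value).
    """
    return {
        gene
        for gene, (log2fc, p_value) in results.items()
        # Assumption: up- and downregulated genes are both retained.
        if abs(log2fc) >= min_abs_log2fc and p_value <= max_p
    }


def shared_degs(comparisons):
    """Intersect the DEG sets obtained from several comparisons."""
    deg_sets = [call_degs(results) for results in comparisons.values()]
    return set.intersection(*deg_sets) if deg_sets else set()


# Hypothetical 35°C versus 25°C results on two carbon sources.
by_carbon_source = {
    "source_A": {"g1": (2.5, 1e-5), "g2": (0.4, 1e-6), "g3": (-1.8, 5e-4)},
    "source_B": {"g1": (1.2, 9e-4), "g2": (3.0, 1e-8), "g3": (-1.1, 0.02)},
}
temperature_responsive = shared_degs(by_carbon_source)
```

In this toy input, only g1 passes both thresholds on every carbon source, so it is the only gene retained as temperature responsive.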
Identification and analysis of the C. trogii HSP20 gene family.
All of the annotated HSP20 genes in Ct001_29 and Ct001_31 were manually corrected using Apollo (58) as previously reported (59), and genes with an HSP20 domain (Pfam identifier [ID] PF00011) were considered HSP20 genes. Full-length protein sequences of two sets of HSP20 gene members in Ct001_29 and Ct001_31 were used to construct a phylogenetic tree using RAxML (52). All of the HSP20 genes from Ct001_29 were searched against those from Ct001_31 using BLASTP (48). By combining our phylogenetic tree and protein identity (≥85%), the allelic (one to one) relationship between these HSP20 genes was confirmed. The duplication of HSP20 genes was determined using the following two criteria: (i) the protein length of the shorter sequence covered ≥50% of the longer sequence, and (ii) the similarity of the two aligned sequences was ≥90%.
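The two duplication criteria and the identity cutoff for allelic pairs can be written as simple predicates. The sketch below is illustrative only: it interprets criterion (i) as the ratio of the shorter to the longer protein length, assumes that similarity and identity are given as percentages from a pairwise alignment, and omits the phylogenetic evidence that was combined with identity in this study. The function names are hypothetical.

```python
def is_duplicated_pair(len_a, len_b, similarity_pct,
                       min_coverage=0.5, min_similarity=90.0):
    """Apply the two duplication criteria to a pair of HSP20 proteins."""
    # Assumption: coverage is the shorter length divided by the longer length.
    coverage = min(len_a, len_b) / max(len_a, len_b)
    return coverage >= min_coverage and similarity_pct >= min_similarity


def is_allelic_candidate(identity_pct, min_identity=85.0):
    """Check the identity cutoff; tree support must be assessed separately."""
    return identity_pct >= min_identity
```

For instance, two proteins of 150 and 160 amino acids with 95% similarity satisfy both duplication criteria, whereas a 70-amino-acid fragment aligned to a 160-amino-acid protein fails the coverage criterion.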
The expression of allelic genes was calculated as the average sequencing depth of SNP loci between allelic genes and was normalized by the total read count of three replicates. The expression level of duplicated HSP20 genes was verified by clone sequencing. Partial sequences (563 bp) of HSP20.5 to HSP20.8 were amplified and cloned. A primer pair (5′-CCCCCTTTCTCCCTCACTA-3′ and 5′-AACMACAACCATCTCCWCCRT-3′) for these genes was designed, and PCR was conducted using the cDNA obtained from mycelia cultured on PDA at 35°C. The amplification products were cloned, and 56 clones were sequenced at Tsingke Biotechnology Co., Ltd.
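One possible reading of the allelic expression calculation is sketched below in Python. The text does not specify how the three replicates were pooled, so the pooling scheme (mean depth over all SNP loci and replicates, divided by the summed read count) and the per-million scaling are assumptions, as are the function names and the toy numbers.

```python
def allelic_expression(depths_per_replicate, reads_per_replicate, scale=1e6):
    """Mean allele-specific depth at SNP loci, normalized by total read count.

    depths_per_replicate: one list of SNP-locus depths per replicate.
    reads_per_replicate: total read count of each replicate.
    """
    depths = [d for replicate in depths_per_replicate for d in replicate]
    if not depths:
        raise ValueError("at least one SNP locus is required")
    mean_depth = sum(depths) / len(depths)
    # Assumption: normalization uses the summed read count, scaled per million.
    return mean_depth / sum(reads_per_replicate) * scale


def allelic_bias(expr_a, expr_b):
    """Fold difference between the two alleles of one gene."""
    return expr_a / expr_b


# Hypothetical depths at two SNP loci in three replicates.
reads = [1_000_000, 1_000_000, 1_000_000]
allele_29 = allelic_expression([[100, 200], [300, 400], [200, 300]], reads)
allele_31 = allelic_expression([[50, 100], [150, 200], [100, 150]], reads)
fold = allelic_bias(allele_29, allele_31)
```

Because only positions that differ between the two alleles are counted, reads can be assigned to one haplotype unambiguously, which is what allows a fold difference between alleles to be estimated.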
The synthesized cDNA for the RNA-Seq was used for qPCR. ChamQ Universal SYBR qPCR master mix (Vazyme, Nanjing, China) was used for qPCR as described previously (24). Glyceraldehyde 3-phosphate dehydrogenase (GAPDH) was used as a reference. Primers (see Table S3 in the supplemental material) were designed and synthesized by Tsingke Biotechnology Co., Ltd.
Gene functional enrichment analysis.
Gene ontology (GO) enrichment analysis was carried out on genes in expanded families, conserved/variable genes, haplotype-specific genes, and DEGs using clusterProfiler (60), and enrichment results with a P value of <1E−3 were retained.
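Over-representation analyses of this kind are typically based on a hypergeometric test. The following Python sketch shows that test in isolation; it is not a reimplementation of clusterProfiler, and the function name and the toy counts are assumptions for illustration.

```python
from math import comb


def enrichment_p_value(hits, selected, annotated, background):
    """Upper-tail hypergeometric P value for one GO term.

    hits: selected genes carrying the term.
    selected: size of the gene set of interest.
    annotated: background genes carrying the term.
    background: total number of background genes.
    """
    upper = min(selected, annotated)
    total = comb(background, selected)
    favorable = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(hits, upper + 1)
    )
    return favorable / total


# Toy example: all 5 selected genes carry a term annotated to 5 of 10 genes.
p_toy = enrichment_p_value(hits=5, selected=5, annotated=5, background=10)
keep_term = p_toy < 1e-3
```

With these toy counts the P value is 1/252, which is above the 1E−3 cutoff, so the term would not be retained.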
Data availability.
The genome sequences and raw sequencing data have been deposited in GenBank under project accession numbers PRJNA747173 and PRJNA749681. All of the sequences and related annotations, including nuclear genomes and RNA-Seq data, can be accessed from the Global Pharmacopoeia Genome Database (GPGD) (61) at the following URL: http://www.gpgenome.com:8080/species/62343.
ACKNOWLEDGMENTS
This research was funded by the GDAS' Project of Science and Technology Development (grant 2020GDASYL-20200103071), by the “13th Five-Year Plan” Key Field Research Project of the China Academy of Chinese Medical Sciences (grant ZZ10-007), by the Project Quality Standard System Construction for the Whole Industry Chain of Chinese Medicinal Decoction Pieces from Guangdong Provincial Drug Administration of China (grant 002009/2019KT1261/2020ZDB25), and by the Special Foundation of Guangzhou Key Laboratory (grant 202002010004).
S.X. and Z.H. designed the experiments; L.W., B.L., and L.G. performed the experiments and analyzed the data; L.W. and B.L. wrote the manuscript; and S.X., L.W., and B.L. discussed the manuscript.
We declare no conflicts of interest.
Footnotes
Supplemental material is available online only.
Contributor Information
Shuiming Xiao, Email: smxiao@icmm.ac.cn.
Zhihai Huang, Email: zhhuang7308@163.com.
Giuseppe Ianiri, University of Molise.
REFERENCES
- 1. Patrick F, Mtui G, Mshandete AM, Kivaisi A. 2010. Optimized production of lignin peroxidase, manganese peroxidase and laccase in submerged cultures of Trametes trogii using various growth media compositions. Tanzania J Sci 36:1–18.
- 2. Campos PA, Levin LN, Wirth SA. 2016. Heterologous production, characterization and dye decolorization ability of a novel thermostable laccase isoenzyme from Trametes trogii BAFC 463. Process Biochem 51:895–903. doi: 10.1016/j.procbio.2016.03.015.
- 3. Ai MQ, Wang FF, Huang F. 2015. Purification and characterization of a thermostable laccase from Trametes trogii and its ability in modification of kraft lignin. J Microbiol Biotechnol 25:1361–1370. doi: 10.4014/jmb.1502.02022.
- 4. Yang X, Wu Y, Zhang Y, Yang E, Qu Y, Xu H, Chen Y, Irbis C, Yan J. 2020. A thermo-active laccase isoenzyme from Trametes trogii and its potential for dye decolorization at high temperature. Front Microbiol 11:241. doi: 10.3389/fmicb.2020.00241.
- 5. Yan J, Chen Y, Niu J, Chen D, Chagan I. 2015. Laccase produced by a thermotolerant strain of Trametes trogii LK13. Braz J Microbiol 46:59–65. doi: 10.1590/S1517-838246120130895.
- 6. Krumova E, Kostadinova N, Miteva-Staleva J, Stoyancheva G, Spassova B, Abrashev R, Angelova M. 2018. Potential of ligninolytic enzymatic complex produced by white-rot fungi from genus Trametes isolated from Bulgarian forest soil. Eng Life Sci 18:692–701. doi: 10.1002/elsc.201800055.
- 7. Mazmancı B, Mazmanci MA, Unyayar A, Unyayar S, Cekic FO, Deger AG, Yalin S, Comelekoglu U. 2011. Protective effect of Funalia trogii crude extract on deltamethrin-induced oxidative stress in rats. Food Chem 125:1037–1040. doi: 10.1016/j.foodchem.2010.10.014.
- 8. He P, Geng L, Wang Z, Mao D, Wang J, Xu C. 2012. Fermentation optimization, characterization and bioactivity of exopolysaccharides from Funalia trogii. Carbohydr Polym 89:17–23. doi: 10.1016/j.carbpol.2012.01.093.
- 9. Rashid S, Unyayar A, Mazmanci MA, McKeown SR, Banat IM, Worthington J. 2011. A study of anti-cancer effects of Funalia trogii in vitro and in vivo. Food Chem Toxicol 49:1477–1483. doi: 10.1016/j.fct.2011.02.008.
- 10. Zhang RY, Hu DD, Zhang YY, Goodwin PH, Huang CY, Chen Q, Gao W, Wu XL, Zou YJ, Qu JB, Zhang JX. 2016. Anoxia and anaerobic respiration are involved in “spawn-burning” syndrome for edible mushroom Pleurotus eryngii grown at high temperatures. Sci Hortic 199:75–80. doi: 10.1016/j.scienta.2015.12.035.
- 11. Foulongne-Oriol M, Navarro P, Spataro C, Ferrer N, Savoie J-M. 2014. Deciphering the ability of Agaricus bisporus var. burnettii to produce mushrooms at high temperature (25°C). Fungal Genet Biol 73:1–11. doi: 10.1016/j.fgb.2014.08.013.
- 12. Waters ER. 2013. The evolution, function, structure, and expression of the plant sHSPs. J Exp Bot 64:391–403. doi: 10.1093/jxb/ers355.
- 13. Becker J, Craig EA. 1994. Heat-shock proteins as molecular chaperones. Eur J Biochem 219:11–23. doi: 10.1007/978-3-642-79502-2_2.
- 14. Zhao P, Wang D, Wang R, Kong N, Zhang C, Yang C, Wu W, Ma H, Chen Q. 2018. Genome-wide analysis of the potato Hsp20 gene family: identification, genomic organization and expression profiles in response to heat stress. BMC Genomics 19:61. doi: 10.1186/s12864-018-4443-1.
- 15. Yao F, Song C, Wang H, Song S, Jiao J, Wang M, Zheng X, Bai T. 2020. Genome-wide characterization of the HSP20 gene family identifies potential members involved in temperature stress response in apple. Front Genet 11:609184. doi: 10.3389/fgene.2020.609184.
- 16. Hombach A, Ommen G, MacDonald A, Clos J. 2014. A small heat shock protein is essential for thermotolerance and intracellular survival of Leishmania donovani. J Cell Sci 127:4762–4773. doi: 10.1242/jcs.157297.
- 17. Li D, Yang F, Lu B, Chen D, Yang W. 2012. Thermotolerance and molecular chaperone function of the small heat shock protein HSP20 from hyperthermophilic archaeon, Sulfolobus solfataricus P2. Cell Stress Chaperones 17:103–108. doi: 10.1007/s12192-011-0289-z.
- 18. Wu J, Wang M, Zhou L, Yu D. 2016. Small heat shock proteins, phylogeny in filamentous fungi and expression analyses in Aspergillus nidulans. Gene 575:675–679. doi: 10.1016/j.gene.2015.09.044.
- 19. Kirbach BB, Golenhofen N. 2011. Differential expression and induction of small heat shock proteins in rat brain and cultured hippocampal neurons. J Neurosci Res 89:162–175. doi: 10.1002/jnr.22536.
- 20. Sarkar NK, Kim YK, Grover A. 2009. Rice sHsp genes: genomic organization and expression profiling under stress and development. BMC Genomics 10:393. doi: 10.1186/1471-2164-10-393.
- 21. Ji XR, Yu YH, Ni PY, Zhang GH, Guo DL. 2019. Genome-wide identification of small heat-shock protein (HSP20) gene family in grape and expression profile during berry development. BMC Plant Biol 19:433. doi: 10.1186/s12870-019-2031-4.
- 22. Xie J, Peng J, Yi Z, Zhao X, Li S, Zhang T, Quan M, Yang S, Lu J, Zhou P, Xia L, Ding X. 2019. Role of hsp20 in the production of spores and insecticidal crystal proteins in Bacillus thuringiensis. Front Microbiol 10:2059. doi: 10.3389/fmicb.2019.02059.
- 23. Gehrmann T, Pelkmans JF, Ohm RA, Vos AM, Sonnenberg ASM, Baars JJP, Wösten HAB, Reinders MJT, Abeel T. 2018. Nucleus-specific expression in the multinuclear mushroom-forming fungus Agaricus bisporus reveals different nuclear regulatory programs. Proc Natl Acad Sci USA 115:4429–4434. doi: 10.1073/pnas.1721381115.
- 24. Wang L, Gao W, Wu X, Zhao M, Qu J, Huang C, Zhang J. 2018. Genome-wide characterization and expression analyses of Pleurotus ostreatus MYB transcription factors during developmental stages and under heat stress based on de novo sequenced genome. Int J Mol Sci 19:2052. doi: 10.3390/ijms19072052.
- 25. Chen LF, Gong YH, Cai YL, Liu W, Zhou Y, Xiao Y, Xu ZY, Liu Y, Lei XY, Wang GZ, Guo MP, Ma XL, Bian YB. 2016. Genome sequence of the edible cultivated mushroom Lentinula edodes (shiitake) reveals insights into lignocellulose degradation. PLoS One 11:e0160336. doi: 10.1371/journal.pone.0160336.
- 26. Chen SL, Xu J, Liu C, Zhu YJ, Nelson DR, Zhou SG, Li CF, Wang LZ, Guo X, Sun YZ, Luo HM, Li Y, Song JY, Henrissat B, Levasseur A, Qian J, Li JQ, Luo X, Shi LC, He L, Xiang L, Xu XL, Niu YY, Li QS, Han MV, Yan HX, Zhang J, Chen HM, Lv A, Wang Z, Liu MZ, Schwartz DC, Sun C. 2012. Genome sequence of the model medicinal mushroom Ganoderma lucidum. Nat Commun 3:913. doi: 10.1038/ncomms1923.
- 27. Yuan Y, Wu F, Si J, Zhao Y-F, Dai Y-C. 2019. Whole genome sequence of Auricularia heimuer (Basidiomycota, Fungi), the third most important cultivated mushroom worldwide. Genomics 111:50–58. doi: 10.1016/j.ygeno.2017.12.013.
- 28. Wang L, Zhao M, Wu X, Huang C, Qu JB. 2019. Comparative genomic analyses of two Pleurotus ostreatus strains. Mycosystema 38:2133–2143. doi: 10.13346/j.mycosystema.190237.
- 29. Schwessinger B, Sperschneider J, Cuddy WS, Garnica DP, Miller ME, Taylor JM, Dodds PN, Figueroa M, Park RF, Rathjen JP. 2018. A near-complete haplotype-phased genome of the dikaryotic wheat stripe rust fungus Puccinia striiformis f. sp. tritici reveals high interhaplotype diversity. mBio 9:e02275-17. doi: 10.1128/mBio.02275-17.
- 30. Liu Y, Wu Y, Zhang Y, Yang X, Yang E, Xu H, Yang Q, Chagan I, Cui X, Chen W, Yan J. 2019. Lignin degradation potential and draft genome sequence of Trametes trogii S0301. Biotechnol Biofuels 12:256. doi: 10.1186/s13068-019-1596-3.
- 31. Li G, Ji B, Nielsen J. 2019. The pan-genome of Saccharomyces cerevisiae. FEMS Yeast Res 19:foz064. doi: 10.1093/femsyr/foz064.
- 32. Kjærbølling I, Vesth T, Frisvad JC, Nybo JL, Theobald S, Kildgaard S, Petersen TI, Kuo A, Sato A, Lyhne EK, Kogle ME, Wiebenga A, Kun RS, Lubbers RJM, Mäkelä MR, Barry K, Chovatia M, Clum A, Daum C, Haridas S, He G, LaButti K, Lipzen A, Mondo S, Pangilinan J, Riley R, Salamov A, Simmons BA, Magnuson JK, Henrissat B, Mortensen UH, Larsen TO, de Vries RP, Grigoriev IV, Machida M, Baker SE, Andersen MR. 2020. A comparative genomics study of 23 Aspergillus species from section Flavi. Nat Commun 11:1106–1112. doi: 10.1038/s41467-019-14051-y.
- 33. McCarthy CGP, Fitzpatrick DA. 2019. Pan-genome analyses of model fungal species. Microb Genom 5:e000243. doi: 10.1099/mgen.0.000243.
- 34. Tian Y, Wang Z, Liu Y, Zhang G, Li G. 2021. The whole-genome sequencing and analysis of a Ganoderma lucidum strain provide insights into the genetic basis of its high triterpene content. Genomics 113:840–849. doi: 10.1016/j.ygeno.2020.10.015.
- 35. Lopes-Caitar VS, Carvalho MD, Darben LM, Kuwahara MK, Nepomuceno AL, Dias WP, Abdelnoor RV, Marcelino-Guimarães FC. 2013. Genome-wide analysis of the Hsp 20 gene family in soybean: comprehensive sequence, genomic organization and expression profile analysis under abiotic and biotic stresses. BMC Genomics 14:577. doi: 10.1186/1471-2164-14-577.
- 36. Jeffares DC, Penkett CJ, Bähler J. 2008. Rapidly regulated genes are intron poor. Trends Genet 24:375–378. doi: 10.1016/j.tig.2008.05.006.
- 37. Qu J, Zhao M, Hsiang T, Feng X, Zhang J, Huang C. 2016. Identification and characterization of small noncoding RNAs in genome sequences of the edible fungus Pleurotus ostreatus. Biomed Res Int 2016:2503023. doi: 10.1155/2016/2503023.
- 38. Allen GC, Flores-Vergara MA, Krasynanski S, Kumar S, Thompson WF. 2006. A modified protocol for rapid DNA isolation from plant tissues using cetyltrimethylammonium bromide. Nat Protoc 1:2320–2325. doi: 10.1038/nprot.2006.384.
- 39. Jiang H, Lei R, Ding SW, Zhu S. 2014. Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinformatics 15:182. doi: 10.1186/1471-2105-15-182.
- 40. Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. 2017. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 27:722–736. doi: 10.1101/gr.215087.116.
- 41. Vaser R, Sović I, Nagarajan N, Šikić M. 2017. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res 27:737–746. doi: 10.1101/gr.214270.116.
- 42. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963. doi: 10.1371/journal.pone.0112963.
- 43. Mapleson D, Garcia Accinelli G, Kettleborough G, Wright J, Clavijo BJ. 2017. KAT: a K-mer analysis toolkit to quality control NGS datasets and genome assemblies. Bioinformatics 33:574–576. doi: 10.1093/bioinformatics/btw663.
- 44. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. 2015. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31:3210–3212. doi: 10.1093/bioinformatics/btv351.
- 45. Sallet E, Gouzy J, Schiex T. 2019. EuGene: an automated integrative gene finder for eukaryotes and prokaryotes. Methods Mol Biol 1962:97–120. doi: 10.1007/978-1-4939-9173-0_6.
- 46. El-Gebali S, Mistry J, Bateman A, Eddy SR, Luciani A, Potter SC, Qureshi M, Richardson LJ, Salazar GA, Smart A, Sonnhammer ELL, Hirsh L, Paladin L, Piovesan D, Tosatto SCE, Finn RD. 2019. The Pfam protein families database in 2019. Nucleic Acids Res 47:D427–432. doi: 10.1093/nar/gky995.
- 47. Jones P, Binns D, Chang HY, Fraser M, Li W, McAnulla C, McWilliam H, Maslen J, Mitchell A, Nuka G, Pesseat S, Quinn AF, Sangrador-Vegas A, Scheremetjew M, Yong SY, Lopez R, Hunter S. 2014. InterProScan 5: genome-scale protein function classification. Bioinformatics 30:1236–1240. doi: 10.1093/bioinformatics/btu031.
- 48. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. 2009. BLAST+: architecture and applications. BMC Bioinformatics 10:421. doi: 10.1186/1471-2105-10-421.
- 49. Li L, Stoeckert CJ, Roos DS. 2003. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res 13:2178–2189. doi: 10.1101/gr.1224503.
- 50. Mendes FK, Vanderpool D, Fulton B, Hahn MW. 2020. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics 36:5516–5518. doi: 10.1093/bioinformatics/btaa1022.
- 51. Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797. doi: 10.1093/nar/gkh340.
- 52. Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313. doi: 10.1093/bioinformatics/btu033.
- 53. Xu B, Yang Z. 2013. PAMLX: a graphical user interface for PAML. Mol Biol Evol 30:2723–2724. doi: 10.1093/molbev/mst179.
- 54. Wang Y, Tang H, Debarry JD, Tan X, Li J, Wang X, Lee T-h, Jin H, Marler B, Guo H, Kissinger JC, Paterson AH. 2012. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res 40:e49. doi: 10.1093/nar/gkr1293.
- 55. Li H. 2018. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34:3094–3100. doi: 10.1093/bioinformatics/bty191.
- 56. Pertea M, Kim D, Pertea GM, Leek JT, Salzberg SL. 2016. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat Protoc 11:1650–1667. doi: 10.1038/nprot.2016.095.
- 57. Wang L, Feng Z, Wang X, Wang X, Zhang X. 2010. DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics 26:136–138. doi: 10.1093/bioinformatics/btp612.
- 58. Dunn NA, Unni DR, Diesh C, Munoz-Torres M, Harris NL, Yao E, Rasche H, Holmes IH, Elsik CG, Lewis SE. 2019. Apollo: democratizing genome annotation. PLoS Comput Biol 15:e1006790. doi: 10.1371/journal.pcbi.1006790.
- 59. Wang L, Huang Q, Zhang L, Wang Q, Liang L, Liao B. 2020. Genome-wide characterization and comparative analysis of MYB transcription factors in Ganoderma species. G3 (Bethesda) 10:2653–2660. doi: 10.1534/g3.120.401372.
- 60. Yu G, Wang L, Han Y, He Q. 2012. ClusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16:284–287. doi: 10.1089/omi.2011.0118.
- 61. Liao B, Hu H, Xiao S, Zhou G, Sun W, Chu Y, Meng X, Wei J, Zhang H, Xu J, Chen S. 2021. GPGD, an integrated and mineable genomics database for traditional medicines from major pharmacopoeias worldwide. Sci China Life Sci. doi: 10.1007/s11427-021-1968-7.