Plant Biotechnology Journal. 2023 Jul 14;21(11):2209–2223. doi: 10.1111/pbi.14123

Comparative genomics of the medicinal plants Lonicera macranthoides and L. japonica provides insight into genus genome evolution and hederagenin‐based saponin biosynthesis

Xiaojian Yin 1,2, Yaping Xiang 1, Feng‐Qing Huang 1, Yahui Chen 1, Hengwu Ding 3, Jinfa Du 1, Xiaojie Chen 1, Xiaoxiao Wang 1, Xinru Wei 1, Yuan‐Yuan Cai 1, Wen Gao 1, Dongshu Guo 4, Raphael N Alolga 1, Xianzhao Kan 3, Baolong Zhang 4, Gerardo Alejo‐Jacuinde 5, Ping Li 1, Lam‐Son Phan Tran 5, Luis Herrera‐Estrella 5,6, Xu Lu 1, Lian‐Wen Qi 1
PMCID: PMC10579715  PMID: 37449344

Summary

Lonicera macranthoides (LM) and L. japonica (LJ) are medicinal plants widely used in treating viral diseases, such as COVID‐19. Although the two species are morphologically similar, their secondary metabolite profiles are significantly different. Here, metabolomics analysis showed that LM contained ~86.01 mg/g hederagenin‐based saponins, approximately 2000‐fold more than LJ. To gain molecular insights into its secondary metabolite production, a chromosome‐level genome of LM was constructed, comprising 9 pseudo‐chromosomes with 40 097 protein‐encoding genes. Genome evolution analysis showed that LM and LJ diverged 1.30–2.27 million years ago (MYA). The two plant species experienced a common whole‐genome duplication event that occurred ∼53.9–55.2 MYA, before speciation. Genes involved in hederagenin‐based saponin biosynthesis were arranged in clusters on the chromosomes of LM and were more highly expressed in LM than in LJ. Among them, the oleanolic acid synthase (OAS) and UDP‐glycosyltransferase 73 (UGT73) families were much more highly expressed in LM than in LJ. Specifically, LmOAS1 was found to effectively catalyse the C‐28 oxidation of β‐amyrin to form oleanolic acid, the precursor of hederagenin‐based saponins. LmUGT73P1 was found to catalyse the conversion of cauloside A to α‐hederin. We further identified the key amino acid residues of LmOAS1 and LmUGT73P1 for their enzymatic activities. Additionally, compared with their collinear genes in LJ, LmOAS1 and LmUGT73P1 showed ‘neighbourhood replication’ in the LM genome. Collectively, the genomic resource and candidate genes reported here lay a foundation for fully elucidating the genome evolution of the Lonicera genus and the hederagenin‐based saponin biosynthetic pathway.

Keywords: Caprifoliaceae (honeysuckle), chromosome‐level genome, hederagenin‐based saponins, Lonicera macranthoides, oleanolic acid synthase, UDP‐glycosyltransferase 73P1

Introduction

The honeysuckle has been one of the most widely used herbs in traditional Chinese medicine (TCM) for more than a thousand years (Shang et al., 2011). Its usage for human health benefit was first recorded in the Tang Materia Medica (Tang Ben Cao) dating as far back as the year 659. In TCM theory, the honeysuckle flower is described as having sweet taste, a cold property, and a link with the channels of the lungs, heart, and stomach. TCM practitioners use the honeysuckle for treatments of a variety of health conditions, including heat‐related illnesses, viral respiratory infections, skin diseases, and inflammation. The honeysuckle flowers have played a significant role in preventing and treating viral diseases, such as the SARS coronavirus (Lee et al., 2021) and influenza (Li et al., 2021a; Zhou et al., 2015). These flowers are a major ingredient in an herbal prescription called Yinqiaosan, which has been demonstrated to be as efficacious as oseltamivir in the treatment of human H1N1 influenza (Wang et al., 2011). Recently, herbal formulations containing the honeysuckle flowers, such as Lianhua Qingwen Capsules (Hu et al., 2021), Toujie Quwen Granules (Zhang et al., 2021), and Jinhua Qinggan Granules (An et al., 2021) contributed immensely to the management of COVID‐19 in China. Additionally, honeysuckle flower is also authorized as a functional food, and is widely used in herbal teas to reduce the effect of summer heat on human body temperature and relieve sore throat (Ge et al., 2004; Wang, 2010).

The honeysuckle belongs to the genus Lonicera that is composed of approximately 100 species of shrubs and climbers in the Caprifoliaceae family (Ge et al., 2022). Among these species, Lonicera japonica (LJ) and L. macranthoides (LM) are widely used as medicinal plant sources of honeysuckle flowers (Gao et al., 2012; Ge et al., 2004). Dried flower buds and flowers of LJ have been officially recorded as “Jinyinhua” since the 1963 Edition of the Chinese Pharmacopoeia (Chinese Pharmacopoeia Commission, 1963). LJ is also listed as a dietary supplement in the United States Pharmacopoeia and the National Formulary (United States Pharmacopoeia, 2017). LM was first recorded in the 2005 Edition of the Chinese Pharmacopoeia as a major plant source of “Shanyinhua” and independently from LJ (Jinyinhua; Chinese Pharmacopoeia Commission, 2005). The functions and indications of these two Lonicera species, as captured in the Chinese Pharmacopoeia, are nearly equivalent. In the latest 2020 Edition of Chinese Pharmacopoeia, Jinyinhua is one of the composition elements in approximately 92 TCM prescriptions, and Shanyinhua is present in 14 TCM prescriptions (Chinese Pharmacopoeia Commission, 2020). However, uncertainty remains as to which one provides better efficacies in specific formulations. Consequently, there is an urgent need for comprehensive comparisons between the two Lonicera species regarding their genome sequences, secondary metabolite compositions and abundances, pharmacological activities, and clinical efficacies. Such comparisons will promote a better understanding of the two herbs, facilitate governmental regulations, and lead to safer clinical use.

The hederagenin‐based saponins, with notable examples such as macranthoidin B, dipsacoside B, and macranthoidin A (Gao et al., 2012), have been shown to have strong antiviral activities (Kashiwada et al., 1998; Li et al., 2021b). The biosynthetic pathway of the hederagenin‐based saponins involves three main stages: the initial stage, the triterpenoid skeletal construction stage, and the glycosylation stage (Chung, 2020; Li et al., 2021c). In the biosynthetic process, squalene cyclooxygenase (Srisawat et al., 2019), β‐amyrin synthase, and oleanolic acid synthase (OAS; Jo et al., 2017) are the key enzymes for the construction of the triterpenoid saponin skeleton. Cytochrome P450s and UDP‐glycosyltransferases (UGTs) are the key enzymes in the triterpenoid modification process (Zhang et al., 2020). Recently, genomic studies have been applied to exploring the biosynthesis of secondary metabolites, such as carotenoids (Pu et al., 2020; Xu et al., 2020), triterpenes (Su et al., 2021), wogonin (Zhao et al., 2019), and triptolide (Tu et al., 2020). The biosynthesis of the hederagenin‐based saponins has been partly documented in Staphylococcus aureus and Ilex species (Chung, 2020; Wen et al., 2017). However, the key genes, including those involved in the biosynthesis of oleanolic acid, hederagenin, and hederagenin‐based saponins, remain largely unexplored in LM.

Here, we report a comprehensive comparison of the metabolomes, genomes, and transcriptomes of LM and LJ, with the aim of providing insights into their genome evolution, and mechanistic explanation for their differential secondary metabolite production. We also provide evidence through heterologous expression in N. benthamiana, protein purification, and enzyme activity analyses to support the assertion that LmOAS1 and LmUGT73P1 play important roles in oleanolic acid and α‐hederin production, respectively, in LM. Furthermore, we identified key amino acids involved in the enzymatic activities of LmOAS1 and LmUGT73P1, laying a foundation for large‐scale bioproduction of the hederagenin‐based saponins.

Results

Metabolomic comparison between LM and LJ

A total of 12 batches of LM and 22 batches of LJ were collected from different places in China. Since LM and LJ belong to the same genus (i.e., Lonicera), the morphology of their dried flower buds is very similar (Figure 1a), with a few exceptions. For example, filamentous hairs are commonly abundant on the surface of LJ flower buds but absent on LM flower buds (Figure S1). A liquid chromatography‐quadrupole time of flight‐mass spectrometer (LC‐QTOF‐MS/MS) was employed for metabolome analysis of dried flower buds of LM and LJ. The total ion chromatograms showed similar chemical profiles between the two herbs within the retention time of 0–30 min (Figure 1b; Figure S2). Thereafter, LM notably displayed abundant secondary metabolites within 30–50 min (Figure 1b). Principal component analysis indicated a clear separation between LM and LJ in terms of secondary metabolite production (Figure S3). By comparing precursor and fragment ions of the bud samples with reference compounds and previous reports, 36 compounds (peak numbers, 1–36) were identified in both LJ and LM, including 10 flavonoids, 10 iridoids, 14 organic acids, and 2 saponins. Importantly, LM presented seven (peak numbers, 37–43) high‐abundant hederagenin‐based saponins that were absent in LJ (Figure 1b, Data S1). A heatmap based on peak abundance clearly demonstrated that the contents of flavonoids and iridoids were higher in LJ than in LM, while organic acids and saponins were much more abundant in LM (Figure 1c). We then focused on the quantification of nine hederagenin‐based saponins, including macranthoidin B (peak number, 35), macranthoside A (peak number, 36), dipsacoside B (peak number, 37), asperosaponin VI (peak number, 38), macranthoside (peak number, 39), H‐hederin (peak number, 40), α‐hederin (peak number, 41), cauloside A (peak number, 42), and hederagenin (peak number, 43). 
Noticeably, LM buds contained an average of 86.01 mg/g hederagenin‐based saponins, while LJ buds only had trace amounts of these saponins, as low as 0.045 mg/g (Figure 1d). Macranthoidin B, macranthoside A and dipsacoside B were the major saponins in LM at the average amounts of 72.81, 5.71 and 6.90 mg/g, respectively (Figure 1d).
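The fold difference quoted for the two species follows directly from the average contents above. A minimal sketch of that arithmetic, using only the values reported in the text (the variable and function names are illustrative, not from the study):

```python
# Average hederagenin-based saponin content (mg/g), as reported in the text.
LM_TOTAL = 86.01
LJ_TOTAL = 0.045

# The three major saponins in LM (mg/g), as reported in the text.
MAJOR_LM = {
    "macranthoidin B": 72.81,
    "macranthoside A": 5.71,
    "dipsacoside B": 6.90,
}

def fold_difference(high, low):
    """Ratio of two average contents."""
    return high / low

fold = fold_difference(LM_TOTAL, LJ_TOTAL)
major_share = sum(MAJOR_LM.values()) / LM_TOTAL

print(f"LM/LJ fold difference: {fold:.0f}")
print(f"Share of LM saponins from the three major compounds: {major_share:.1%}")
```

The ratio comes out at a little over 1900, which the text rounds to 2000‐fold, and the three major compounds account for nearly all of the saponin content measured in LM.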

Figure 1.

Metabolome comparison between the dried flower buds of Lonicera macranthoides (LM) and L. japonica (LJ). (a) Macroscopic characteristics of dried flower buds of LM and LJ. Scale bars: 0.5 cm. (b) Total ion chromatograms (TIC) of LM and LJ by liquid chromatography‐quadrupole time of flight‐mass spectrometry (LC‐QTOF‐MS) in electrospray ionization (ESI) negative ion mode. Peaks 1–36 are common components present in both LM and LJ. Peaks 37–43 are only detected in LM. (c) Heatmap of the 43 identified compounds in 12 batches of LM and 22 batches of LJ. These compounds were classified into four main chemical groups including 10 flavonoids, 10 iridoids, 14 organic acids, and nine saponins. Each compound was displayed by min–max normalization of peak abundance. Colours from blue to red indicate concentrations of compounds from low to high. (d) The average content of each hederagenin‐based saponin in 12 batches of LM and 22 batches of LJ.

Construction of the LM genome

The genome sequence and assembly of LM were generated using sequencing data obtained from the Illumina and PacBio platforms. Using k‐mer frequency analysis of the Illumina sequencing reads, the genome size of LM was estimated at 888.36 Mb, with a relatively high level of heterozygosity (1.44%) and a 52.6% content of high‐copy‐number repetitive DNA sequences (Figure S4, Table S1). To construct the genome sequence of LM, the datasets generated by short‐read Illumina sequencing, long‐read PacBio sequencing, and chromatin conformation capture (Hi‐C) sequencing were integrated. We obtained 56.74 Gb of short‐read sequences, 107.89 Gb of long reads, and 121.35 Gb of Hi‐C data (~64×, ~121×, and ~140×, respectively), representing a total of ~325‐fold coverage of the estimated LM genome size (Tables S1–S3). The full‐length assembly of LM was 818.38 Mb, including 1609 contigs, with a contig N50 of 1.1 Mb and a scaffold N50 of 82.4 Mb (Table S4). Of the total assembled genome, 811.06 Mb (99.11%) was anchored onto nine pseudo‐chromosomes, of which 741.75 Mb (91.45%) was oriented (Tables S5 and S6). The chromosomal interaction signal was strong (Figure S5), indicating that the quality of the Hi‐C assembly was high. Evaluation of genome completeness indicated 94.98% coverage of the conserved core eukaryotic genes by the Core Eukaryotic Genes Mapping Approach (CEGMA; Parra et al., 2007) analysis (Table S7), and 94.36% coverage of plant‐specific conserved genes by the Benchmarking Universal Single‐Copy Orthologs (BUSCO; Simão et al., 2015) analysis (Table S8). Additionally, 94.89% of the short‐read sequences could be mapped back to the long‐read assembly (Table S9).
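The coverage and anchoring figures in this paragraph can be checked with a few divisions. The sketch below assumes that depth is simply data volume divided by the k‐mer genome size estimate, with 1 Gb taken as 1000 Mb; the names are illustrative and not part of the study's pipeline.

```python
GENOME_ESTIMATE_MB = 888.36   # k-mer based genome size estimate
ASSEMBLY_MB = 818.38          # full-length assembly
ANCHORED_MB = 811.06          # anchored onto pseudo-chromosomes
ORIENTED_MB = 741.75          # anchored and oriented

# Sequencing data volume per platform (Gb), as reported in the text.
DATA_GB = {"Illumina": 56.74, "PacBio": 107.89, "Hi-C": 121.35}

def depth(data_gb, genome_mb=GENOME_ESTIMATE_MB):
    """Sequencing depth as data volume over genome size (assumes 1 Gb = 1000 Mb)."""
    return data_gb * 1000 / genome_mb

depths = {platform: depth(gb) for platform, gb in DATA_GB.items()}
total_depth = sum(depths.values())
anchored_pct = ANCHORED_MB / ASSEMBLY_MB * 100
oriented_pct = ORIENTED_MB / ANCHORED_MB * 100

for platform, value in depths.items():
    print(f"{platform}: {value:.1f}x")
print(f"total: {total_depth:.1f}x")
print(f"anchored: {anchored_pct:.2f}%  oriented: {oriented_pct:.2f}%")
```

Under these assumptions the Illumina and PacBio depths match the rounded values in the text, while the Hi‐C and total depths come out a few units below the quoted ~140× and ~325×, which is consistent with those figures having been rounded up.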

Using the assembly data, we performed genome annotation and identified a total of 40 097 protein‐coding genes with an average sequence length of 5231 bp per gene (Table S10, Figure S6). Approximately 97.9% of the genes were annotated based on a sequence blast against homologous sequences and protein domains (Table S11). On average, every predicted gene contained 5.44 exons with an average length of 295.5 bp per exon (Table S12). In total, 492.05 Mb repetitive elements representing 60.11% genome sequences were identified (Table 1; Table S13). Like most plant genomes, the predominant types of transposable elements were long terminal repeat retrotransposons (18.79% gypsy and 18.48% copia retro‐elements) and large retrotransposon derivatives (14.47%; Table S13). A schematic outline of the genome construction is given in Figure 2a.

Table 1.

Global statistics for assembly and annotation of Lonicera japonica and L. macranthoides genomes

| | L. japonica | L. macranthoides |
|---|---|---|
| Length | | |
| Total length, Mb | 903.8 | 818.4 |
| Chromosome length, Mb | 843.2 | 811.1 |
| Unplaced length, Mb | 60.6 | 7.3 |
| Scaffold number | | |
| Total scaffold number | 145 | 293 |
| Chromosome number | 9 | 9 |
| Unplaced scaffold number | 136 | 284 |
| Scaffold N50 length, Mb | 84.4 | 82.4 |
| Transposable elements (TEs) | | |
| TE quantity, Mb (% of genome) | 526 (58.2%) | 492 (60.11%) |
| Genome annotation | | |
| Protein‐coding genes | 33 939 | 40 097 |
| Average gene length, bp | 3527.9 | 5231.4 |
| Exon number | 157 099 | 218 179 |
| Exon length, Mb | 38.0 | 64.5 |
| Average exon length, bp | 241.7 | 295.5 |
| Intron number | 123 160 | 178 082 |
| Intron length, Mb | 81.7 | 145.2 |
| Average intron length, bp | 663.9 | 815.8 |
| Gene density, genes/Mb | 37.55 | 48.99 |

Genome information, including gene structure, scaffold numbers, transposable elements, and gene densities counted in L. japonica and L. macranthoides.
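Several rows of Table 1 are derived from others, so the table can be cross‐checked against itself. The sketch below recomputes a few of them from the primary rows; the dictionary layout and function name are illustrative assumptions, and small differences from the tabulated averages are expected because the lengths are given to only one decimal place.

```python
# Primary values from Table 1 (lengths in Mb, counts as integers).
TABLE1 = {
    "L. japonica": dict(total_mb=903.8, chrom_mb=843.2, genes=33939,
                        exons=157099, exon_mb=38.0,
                        introns=123160, intron_mb=81.7),
    "L. macranthoides": dict(total_mb=818.4, chrom_mb=811.1, genes=40097,
                             exons=218179, exon_mb=64.5,
                             introns=178082, intron_mb=145.2),
}

def derived_stats(row):
    """Recompute the derived rows of Table 1 from the primary ones."""
    return {
        "unplaced_mb": row["total_mb"] - row["chrom_mb"],
        "gene_density_per_mb": row["genes"] / row["total_mb"],
        "exons_per_gene": row["exons"] / row["genes"],
        "mean_exon_bp": row["exon_mb"] * 1e6 / row["exons"],
        "mean_intron_bp": row["intron_mb"] * 1e6 / row["introns"],
        # A gene with n exons has n - 1 introns, so introns = exons - genes.
        "introns_expected": row["exons"] - row["genes"],
    }

stats = {species: derived_stats(row) for species, row in TABLE1.items()}
for species, values in stats.items():
    print(species, {k: round(v, 2) for k, v in values.items()})
```

The recomputed gene densities are in genes per Mb, and the intron counts equal exon count minus gene count for both species.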

Figure 2.

Construction of the Lonicera macranthoides (LM) genome and its comparison with the L. japonica (LJ) genome. (a) Global view of the LM genome. Tracks from outside to inside: a, the nine pseudo‐chromosomes (Chr1–Chr9); b, transposable element density; c, density of repeat sequences; d, gene density (500 kb window); e and f, INDEL polymorphism density (500 kb window) and SNP density (500 kb window); g, linking lines in the centre of the circle, which correspond to pairs of homologous genes. (b) Venn diagram analysis of orthologous groups and genes between LM and LJ. (c) CDS and PEP similarity of orthologous genes between LM and LJ. Error bars indicate the maximum and minimum sequence similarity of orthologous genes. (d) Macrosynteny visualization between LM and LJ. The syntenic blocks in grey indicate collinearity of genes between LM and LJ, red blocks represent intra‐chromosomal inversions, and the blocks in orange and blue indicate intra‐chromosomal and inter‐chromosomal translocations, respectively. (e) Estimation of divergence time (MYA) between LM and LJ using orthologous gene pairs within collinear blocks. (f) Phylogenetic analysis and whole‐genome duplication events of LM and LJ. The inferred phylogenetic tree was constructed using 231 common single‐copy genes in LM, LJ, and 11 other plant species (Daucus carota, Lactuca sativa, Chrysanthemum nankingense, Solanum lycopersicum, Tripterygium wilfordii, Populus trichocarpa, Arabidopsis thaliana, Vitis vinifera, Zea mays, Oryza sativa, and Amborella trichopoda). Gene family expansions are indicated in green, and gene family contractions are indicated in red. The timings of WGD and WGT events are superimposed on the tree. Divergence times are estimated by maximum likelihood. γ represents the gamma triplication event. CDS, coding sequence; INDEL, insertion and deletion; MYA, million years ago; PEP, peptide; SNP, single nucleotide polymorphism; WGD, whole‐genome duplication; WGT, whole‐genome triplication.

Comparison of LM and LJ genomes for evolutionary study of Lonicera genus

Comparisons were made between our assembled LM genome and the reported LJ genome (Pu et al., 2020), revealing a slightly smaller genome size for LM (818.4 Mb) than for LJ (903.8 Mb; Table 1). LM contained a total of nine chromosomes with a length of 811.1 Mb, compared with 843.2 Mb for the nine chromosomes of LJ (Table 1). The number of protein‐coding genes for LM was 40 097, with a gene density (total gene number/genome size) of 48.99 genes/Mb, which was greater than the 33 939 genes and density of 37.55 genes/Mb for LJ (Table 1). The average gene length was 5231.4 bp in LM and 3527.9 bp in LJ (Table 1). LM contained more introns, with a greater average length, than LJ (Table 1). Among the annotated genes, LM and LJ shared 17 134 common orthologous groups (25 311 genes in LM and 27 426 genes in LJ; Figure 2b). The sequence similarities for the common orthologous genes between LM and LJ averaged 96.13% (median 97.5%) at the DNA level and 96.39% (median 97.8%) at the protein level (Figure 2c). A total of 12 946 genes corresponding to 905 orthologous groups were specifically present in LM and absent in LJ (Figure 2b; Table S14). Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis showed that LM‐specific genes were mainly attributed to protein kinases, tRNA biogenesis, and sesquiterpene/triterpenoid biosynthesis (Figure S7). In total, 155 syntenic blocks containing 16 625 collinear genes were found in the comparative genomic analysis of LM and LJ (Data S2). Intergenomic co‐linearity analysis showed that most genes were linearly arranged between the chromosomes of LM and LJ (Figure 2d). Interestingly, some chromosome inversions and translocations were found between LM and LJ, especially on chromosomes 7 and 8 (Figure 2d). The synonymous substitution rate (Ks) distribution of the collinear gene pairs peaked at 0.020–0.035 (Data S2). Using the universal substitution rate of 7.7 × 10⁻⁹ mutations per site per year, the genomes of LM and LJ were predicted to have diverged from their common ancestor between 1.30 and 2.27 million years ago (MYA; Figure 2e), suggesting that they are closely related.

To further investigate the evolution of the Lonicera genus, orthologous groups from LM, LJ, and 11 other plant species were analysed, producing 31 618 orthologous groups involving 404 827 genes (Figure S8, Tables S15 and S16). A phylogenetic tree was constructed based on 231 single‐copy genes shared by the 13 analysed species. The phylogenetic tree indicated that LM and LJ shared a common gamma whole‐genome triplication (γ‐WGT) event at approximately 120 MYA with Daucus carota, Lactuca sativa, Chrysanthemum nankingense, Solanum lycopersicum, Tripterygium wilfordii, Populus trichocarpa, Arabidopsis thaliana, and Vitis vinifera. Importantly, LM and LJ displayed a special whole‐genome duplication (WGD) event approximately 54.5 MYA (Figure 2f). Divergence time by the phylogenetic tree indicated a sister relationship between LM and LJ (Figure 2f). In comparison with 11 additional plant species, the Lonicera genus (LM and LJ) was the closest relative to the ancestor of D. carota with an estimated divergence time of 71.2 MYA, and a secondary linkage to Asteraceae species C. nankingense and L. sativa with a divergence period around 75 MYA (Figure 2f). The Ks distributions of paralogous genes further confirmed a WGD and a γ‐WGT event in the Lonicera genus, a γ‐WGT event in V. vinifera, and Dc‐α, Dc‐β, and γ‐WGT events in D. carota (Figure S9). Distributions of logarithmic Ks values showed that the Ks peak values were 0.83 for LM and 0.85 for LJ (Figures S10 and S11), also indicating a special WGD event at approximately 53.9–55.2 MYA for the Lonicera genus (Figure 2f). Intergenomic co‐linearity (Figure S12) and gene‐block (Figure S13) analyses indicated a double gene syntenic relationship for the Lonicera genus compared with V. vinifera.
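The dates quoted in this section are consistent with the standard conversion from synonymous substitution rate to time, T = Ks / (2r), where r is the substitution rate per site per year. The text does not state this formula explicitly, so the sketch below should be read as an assumption that happens to reproduce the reported values; the function names are illustrative.

```python
RATE = 7.7e-9  # substitutions per site per year, as stated in the text

def ks_to_mya(ks, rate=RATE):
    """Convert a Ks value to divergence time in million years, T = Ks / (2r)."""
    return ks / (2 * rate) / 1e6

def mya_to_ks(mya, rate=RATE):
    """Inverse conversion: the Ks value implied by a divergence time."""
    return 2 * rate * mya * 1e6

# Ks peaks reported for the Lonicera-specific WGD.
wgd_lm = ks_to_mya(0.83)
wgd_lj = ks_to_mya(0.85)

# Ks values implied by the reported LM-LJ divergence window.
ks_low = mya_to_ks(1.30)
ks_high = mya_to_ks(2.27)

print(f"WGD: {wgd_lm:.1f}-{wgd_lj:.1f} MYA")
print(f"Speciation window corresponds to Ks {ks_low:.3f}-{ks_high:.3f}")
```

With this rate, the Ks peaks of 0.83 and 0.85 map onto the 53.9–55.2 MYA window for the WGD event, and the 1.30–2.27 MYA divergence window corresponds to Ks values of roughly 0.020–0.035.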

Genome‐based transcriptomics identifies biosynthetic genes for hederagenin‐based saponins

Metabolome analysis clearly showed a considerably higher level of the hederagenin‐based saponins in LM than in LJ. We then focused on identifying the gene families involved in the biosynthetic pathway of the hederagenin‐based saponins. The hederagenin‐based saponins are mainly derived from the methylerythritol phosphate or mevalonic acid pathways, an enzymatic process that consists of more than 20 steps (Figure 3a). Unexpectedly, LJ and LM displayed comparable copy numbers of saponin‐biosynthetic genes (Table S17). As a further characterization, we performed comparative transcriptomics. A total of 5211 genes showed higher transcript levels in LM than in LJ, with a change ratio >2 and adjusted P < 0.05 (Data S3). In contrast, 3308 genes were less expressed in LM than in LJ, with a change ratio <0.5 and adjusted P < 0.05 (Data S3). Among these differentially expressed genes (DEGs), 98 might be involved in hederagenin‐based saponin biosynthesis (Data S4). As expected, 71 of the 98 DEGs were more highly expressed in LM than in LJ (Figure 3b; Figure S14). Among them, most of the genes belonging to the UGT family, such as those of the UGT74 and UGT85 subfamilies, were more highly expressed in LM than in LJ (Figure S14). It is worth noting that the transcript levels of an OAS‐encoding gene (EVM0008033) and a UGT73‐encoding gene (EVM0012591) were extremely high in LM, but almost undetectable in LJ (Figure 3c). In addition, the upstream genes, such as those encoding farnesyl diphosphate synthase, squalene synthase, and squalene epoxidase, had higher transcript abundance in LM than in LJ (Data S4). In line with this finding, oleanolic acid, an indispensable precursor of hederagenin‐based saponins, was markedly more abundant in LM than in LJ (Figure 3d).
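The two DEG criteria described above (change ratio >2 or <0.5, both with adjusted P < 0.05) amount to a simple three‐way classification. The sketch below illustrates that rule on made‐up genes; the gene names and values are hypothetical and are not taken from Data S3, and the function is not the study's actual pipeline.

```python
def classify(ratio, padj, up=2.0, down=0.5, alpha=0.05):
    """Label one gene from its LM/LJ expression ratio and adjusted P value."""
    if padj >= alpha:
        return "unchanged"
    if ratio > up:
        return "higher in LM"
    if ratio < down:
        return "lower in LM"
    return "unchanged"

# Hypothetical genes: (LM/LJ change ratio, adjusted P). Illustrative only.
examples = {
    "gene_a": (8.0, 0.001),   # large increase, significant
    "gene_b": (0.2, 0.010),   # large decrease, significant
    "gene_c": (3.0, 0.200),   # large increase, not significant
    "gene_d": (1.3, 0.001),   # significant, but below the ratio threshold
}

labels = {gene: classify(ratio, padj) for gene, (ratio, padj) in examples.items()}
for gene, label in labels.items():
    print(gene, label)
```

Both conditions must hold for a gene to count as differentially expressed, so a large ratio with a weak adjusted P value, or a strong P value with a modest ratio, leaves the gene in the unchanged group.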

Figure 3.

Comparative transcriptomic analysis of genes involved in the hederagenin‐based saponin biosynthetic pathway. (a) The potential biosynthetic pathway of the hederagenin‐based saponins. The dotted line indicates unidentified steps. (b) Heatmap of differentially expressed genes involved in the hederagenin‐based saponin biosynthetic pathway in LM and LJ. Each gene is displayed by log2 normalization of Fragments Per Kilobase of exon model per Million mapped fragments (FPKM) values. Colours from blue to red indicate the expression level of genes from low to high. (c) Volcano plot of genes involved in the biosynthetic pathways of hederagenin‐based saponins. Red circles indicate genes highly expressed in LM, blue circles indicate genes highly expressed in LJ, and grey circles indicate genes with similar expression levels in LM and LJ. (d) The content of oleanolic acid in LM and LJ. ACAT, acetoacetyl‐CoA thiolase; CMK, 4‐(cytidine‐5‐diphospho)‐2‐C‐methyl‐D‐erythritol kinase; DXR, 1‐deoxy‐D‐xylulose‐5‐phosphate reductase; DXS, 1‐deoxy‐D‐xylulose‐5‐phosphate synthase; FPS, farnesyl diphosphate synthase; HDR, 1‐hydroxy‐2‐methyl‐2‐E‐butenyl‐4‐diphosphate reductase; HDS, 1‐hydroxy‐2‐methyl‐2‐E‐butenyl‐4‐diphosphate synthase; HMGR, 3‐hydroxy‐3‐methyl glutaryl coenzyme A reductase; HMGS, 3‐hydroxy‐3‐methyl glutaryl coenzyme A synthase; IDI, isopentenyl diphosphate isomerase; MCT, 2‐C‐methyl‐D‐erythritol‐4‐phosphate cytidylyltransferase; MDS, 2‐C‐methyl‐D‐erythritol‐2,4‐cyclodiphosphate synthase; MVD, mevalonate 5‐diphosphate decarboxylase; MVK, mevalonate kinase; OAS, oleanolic acid synthase; PMK, phosphomevalonate kinase; SQE, squalene epoxidase; SQS, squalene synthase; UGT, UDP‐glycosyltransferase; βAS, beta‐amyrin synthase.

LmOAS1 catalyses β‐Amyrin to oleanolic acid

EVM0008033 (named LmOAS1), a highly expressed OAS‐encoding gene in LM, was cloned to confirm the step of oleanolic acid formation (Figure 4a). Protein extracted from yeast strains expressing LmOAS1 effectively produced oleanolic acid from β‐amyrin by oxidising the C‐28 position to a carboxyl group (Figure 4b). To identify the key amino acids for sustaining the enzymatic activity of LmOAS1, molecular docking and mutation assays were performed. The docking experiment identified the amino acid regions 145–150, 340–355 and 435–441 as lying around the binding pocket. The region 145–150, comprising K145, P146, E147, A148, L149 and R150 and positioned in non‐conserved sequences, was chosen for the mutation assay (Figure 4c; Figure S15). Residues P146, A148 and L149 were mutated into amino acids with opposite polarity. Residues K145, E147 and R150 were mutated into amino acids with opposite charge. Mutation of E147 did not affect the enzyme activity, whereas the K145E, P146S, and A148S mutations reduced it (Figure 4d). Importantly, the L149G and R150N mutations resulted in complete loss of enzyme activity (Figure 4d). In addition, amino acids 1–26 were predicted to form the transmembrane motif of LmOAS1 (Figure 4e,f). An in vitro enzymatic assay showed that when amino acids 1–26 were removed, the catalytic activity of LmOAS1 in producing oleanolic acid was lost (Figure 4g). The collinearity analysis indicated that LmOAS1 was expanded to two copies (EVM0033113 and EVM0008033) on chromosome 5 of LM, compared with its single collinear gene (Lj5A798G50) in LJ (Figure 4h).

Figure 4.

Functional identification of LmOAS1. (a) Schematic diagram of the reaction from β‐amyrin to oleanolic acid. (b) In vitro assay of LmOAS1 in the catalysis of β‐amyrin to oleanolic acid. The extracted ion chromatogram of the product, oleanolic acid (m/z 455.35), is shown. (c) Overview of the LmOAS1 model docked with β‐amyrin. β‐Amyrin is shown in brown and the mutated amino acids around the binding pocket are marked in green and blue. The image was produced with PyMOL. (d) The effect of amino acid site mutation on LmOAS1 enzyme activity. (e) Sequence alignment of LmOAS1 (EVM0008033). (f) Transmembrane motif prediction for LmOAS1. The transmembrane domain of LmOAS1 was predicted using TMHMM server version 2.0. Purple indicates the transmembrane motif, pink indicates sequence outside the cell membrane, and green indicates sequence inside the cell membrane. (g) The effect of removing amino acids 1–26 on LmOAS1 enzyme activity. (h) The collinearity relationship of LmOAS1 between Lonicera macranthoides (LM) and L. japonica (LJ). The syntenic blocks are connected by grey lines. The syntenic target genes are connected by yellow lines. The extracted ion chromatogram of the product, oleanolic acid (m/z 455.35), is shown. OAS, oleanolic acid synthase.

LmUGT73P1 catalyses cauloside A to α‐hederin

Next, EVM0012591, a highly expressed UGT‐encoding gene in LM, was also cloned to confirm the key step of glycosylation (Figure 5a). We renamed EVM0012591 as LmUGT73P1 because of its high homology to UGT73P2 from Glycine max and a common plant secondary product glycosyltransferase (PSPG) motif in the C‐terminus (Figure S16). Using nickel‐nitrilotriacetic acid (Ni‐NTA) affinity chromatography, the recombinant protein expressed in Escherichia coli was purified and verified by SDS–PAGE (Figure S17). Three possible substrates, namely oleanolic acid, hederagenin, and cauloside A, were separately incubated with LmUGT73P1 and one of the sugar donors uridine 5‐diphosphate arabinose (UDP‐Ara), uridine 5‐diphosphate glucose (UDP‐Glc), or uridine 5‐diphosphate rhamnose (UDP‐Rha). We did not observe any products when oleanolic acid or hederagenin was added as the substrate (Figure S18). However, when cauloside A was used as the substrate, LmUGT73P1 effectively transferred a Rha residue onto cauloside A, forming the expected product α‐hederin (Figure 5b; Figure S19). For further in vivo confirmation, LmUGT73P1 was transiently expressed in the leaves of Nicotiana benthamiana with cauloside A supplied as the substrate. As expected, LmUGT73P1 effectively catalysed the conversion of cauloside A to α‐hederin in N. benthamiana by adding a Rha residue (Figure S20). Kinetic analysis of LmUGT73P1 gave a Km value of 32.96 μM (Figure 5c).
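The assignment of the transferred sugar as rhamnose can be illustrated from the mass shift between substrate and product. The sketch below uses the m/z values quoted in the Figure 5 caption for cauloside A (603.39) and α‐hederin (749.45) and compares their difference with standard monoisotopic residue masses (sugar minus water) for the three donor sugar classes tested. The tolerance and the function name are assumptions chosen for illustration, not the study's own procedure.

```python
# Monoisotopic mass added when a sugar is attached (sugar minus H2O), in Da.
SUGAR_RESIDUE_MASS = {
    "pentose (e.g. Ara)": 132.0423,      # C5H8O4
    "deoxyhexose (e.g. Rha)": 146.0579,  # C6H10O4
    "hexose (e.g. Glc)": 162.0528,       # C6H10O5
}

def assign_sugar(substrate_mz, product_mz, tol=0.02):
    """Return the sugar class whose residue mass matches the observed shift.

    Assumes both ions carry the same charge and adduct, so the m/z
    difference equals the mass difference. Returns None if nothing matches.
    """
    shift = product_mz - substrate_mz
    for name, mass in SUGAR_RESIDUE_MASS.items():
        if abs(shift - mass) <= tol:
            return name
    return None

CAULOSIDE_A_MZ = 603.39
ALPHA_HEDERIN_MZ = 749.45

assigned = assign_sugar(CAULOSIDE_A_MZ, ALPHA_HEDERIN_MZ)
print(f"shift = {ALPHA_HEDERIN_MZ - CAULOSIDE_A_MZ:.2f} Da -> {assigned}")
```

The observed shift of about 146.06 Da matches a deoxyhexose such as rhamnose and is well separated from the shifts expected for a pentose or a hexose, in line with UDP‐Rha being the productive donor.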

Figure 5.

Functional identification of LmUGT73P1. (a) Potential glycosylation process of the hederagenin‐based saponins. The step underlined in grey was confirmed in this study; other steps, shown with dotted lines, remain to be explored. (b) In vitro assay of LmUGT73P1 in the catalysis of cauloside A to α‐hederin. Total ion chromatograms (TIC) of the substrate, cauloside A (m/z 603.39), and the product, α‐hederin (m/z 749.45), are shown. (c) Determination of kinetic parameters for LmUGT73P1. The Km value was calculated using cauloside A as the substrate and UDP‐Rha as the donor. (d) Overview of the LmUGT73P1 model docked with the substrate cauloside A and the donor UDP‐Rha. Cauloside A is shown in dark blue, UDP‐Rha is shown in red, and the mutated amino acids around the binding pocket are marked in light blue. (e) Effect of amino acid mutation on LmUGT73P1 enzyme activity. The extracted ion chromatogram (EIC) of the product, α‐hederin (m/z 749.45), is shown. (f) The collinearity relationship of LmUGT73P1 between Lonicera macranthoides (LM) and L. japonica (LJ). The syntenic blocks are connected by grey lines. The syntenic target genes are connected by yellow lines. UDP‐Rha, uridine 5‐diphosphate rhamnose; UGT, UDP‐glycosyltransferase.

To identify the key amino acids for LmUGT73P1 enzymatic activity, molecular docking and mutation assays were then performed. The amino acid residues, including S53, E83, K258, E312, T330, and V338, around the binding pocket were changed into alanine using single‐site mutation (Figure 5d). Results showed that mutations of S53A, K258A, E312A, T330A, or V338A did not significantly affect the enzymatic activity of LmUGT73P1, whereas mutation of E83A noticeably reduced its enzymatic activity (Figure 5e). The collinearity analysis indicated that the LmUGT73P1 was expanded to two copies (EVM0009409 and EVM0012591) on chromosome 4 of LM, while its collinear gene Lj4C768T0 in LJ has a single copy (Figure 5f).

Discussion

LM and LJ are two of the most widely used plant sources of honeysuckle flowers among the various Lonicera species. In this work, we performed a comprehensive comparison between LM and LJ in terms of their metabolome, genome, and transcriptome. The major findings included the following: (i) quantitative analysis showing that LM buds contained a 2000‐fold higher level of hederagenin‐based saponins than LJ buds; (ii) the assembly and annotation of a reference‐grade genome of LM, which provides insights into the evolution of the Lonicera genus and the divergence of LM from LJ; (iii) genome‐based transcriptomic analysis revealing that most of the genes involved in hederagenin‐based saponin biosynthesis had much higher expression levels in LM than in LJ; (iv) identification of LmOAS1, an OAS‐encoding gene, and LmUGT73P1, a UGT‐encoding gene, which encode the enzymes responsible for the biosynthesis of oleanolic acid from β‐amyrin and the conversion of cauloside A to α‐hederin, respectively; and (v) discovery of an interesting phenomenon, the so‐called ‘neighbourhood replication’ of LmOAS1 and LmUGT73P1 in the LM genome in comparison with their collinear genes in LJ.

More specifically, qualitative and quantitative analyses were performed to compare the chemical compositions of LM and LJ. Belonging to the same genus (Lonicera), LM and LJ present similar morphological characteristics as well as common secondary metabolites. A total of 43 major chemical constituents belonging to four groups of compounds (flavonoids, iridoids, organic acids, and saponins) were characterized in LM, of which 36 were shared by LJ. In the herbal market, LJ is usually adulterated with LM because of a 10‐fold price difference. It should be noted that the two herbs displayed marked differences in their chemical profiles. A remarkable feature shown in this work, as well as in previous reports (Gao et al., 2012), is that the hederagenin‐based saponins are approximately 2000‐fold higher in LM (86.01 mg/g) than in LJ (0.045 mg/g). Macranthoidin B, the most abundant hederagenin‐based saponin, generally serves as a chemical marker for the quality control of LM and its differentiation from LJ in the Chinese Pharmacopoeia. The hederagenin‐based saponins have been demonstrated to possess strong antiviral functions (Kashiwada et al., 1998; Li et al., 2021a). Particularly, intermediate α‐hederin was reported to possess the therapeutic potential to combat COVID‐19 by in silico study (Jakhmola Mani et al., 2022; Salim and Noureddine, 2020). In addition, caffeoylquinic acids were also higher in LM than LJ (Gao et al., 2016). In contrast, iridoids and flavonoids were lower in LM than LJ. These findings will be useful to facilitate the quality control of LM and LJ.

Until now, only the genome of LJ had been reported among the ~100 Lonicera species (Pu et al., 2020). Thus, the high‐quality genome of LM reported in this study serves as an invaluable resource for phylogenomic studies of the Lonicera genus. Similar to previous findings (Pu et al., 2020), we found a close relationship of LJ with the Asteraceae species C. nankingense and L. sativa. Furthermore, phylogenomic analysis demonstrated that the Lonicera genus (LM and LJ) was much closer to D. carota, with an estimated divergence time of 71.2 MYA. A specific WGD event was reported to have occurred ~51 MYA in LJ (Pu et al., 2020). In close agreement, we found that LM and LJ shared a common WGD event dated to approximately 53.9–55.2 MYA, before the differentiation of the two Lonicera species. The minor difference in the estimated timing of the WGD event might be caused by the use of different software and datasets. At ~55 MYA, during the period called the Palaeocene‐Eocene Thermal Maximum, dramatic climate change occurred on Earth (Frieling et al., 2017; Wagner et al., 2021). It is possible that this lineage‐specific WGD event enabled the Lonicera genus to better cope with the drastic environmental changes experienced at that time (Crow and Wagner, 2005).

Recent genomic studies have demonstrated that WGD events can affect the metabolic diversification of secondary metabolites, such as triterpenes (Su et al., 2021), wogonin (Zhao et al., 2019), and triptolide (Tu et al., 2020). In accordance with these observations, the results of our study indicated that the inferred WGD event contributed to the duplication of genes involved in the biosynthesis of the hederagenin‐based saponins in LM (Figure S21). Interestingly, these genes were arranged in several clusters on the chromosomes of LM (Figure S22). The divergence time of LM and LJ was estimated at 1.30–2.27 MYA. Intergenomic co‐linearity analysis indicated that most genes were collinearly arranged between the chromosomes of LM and LJ. It is important to note that some chromosome inversions and translocations were found between them. Chromosome inversions and translocations usually reduce meiotic recombination and lead to genetic isolation (Zapata et al., 2016) and geographic differentiation (He et al., 2021). It is possible that these structural rearrangements provide evidence of the primordial force leading to speciation between LM and LJ. In addition, 3854 genes were identified in the chromosome inversion regions. Among them, 362 (9.39%) genes were more highly expressed and 209 (5.42%) were more lowly expressed in LM than in LJ. Interestingly, two significantly changed hederagenin‐based saponin biosynthetic genes (EVM0008033 and EVM0033113) are located in these regions (Figure S23), suggesting that chromosome inversions might affect saponin biosynthesis by regulating gene expression.

Genome‐based transcriptome analysis indicated that the hederagenin‐based saponin biosynthesis‐related genes were more highly expressed in LM than in LJ. Specifically, the LmOAS1 transcript level was extremely high in LM but almost undetectable in LJ. LmOAS1 (EVM0008033), a type II cytochrome oxidase (Tamura et al., 2017a), belongs to the CYP716A family (Figure S24). CYP716A enzymes were reported to catalyse the oxidation of β‐Amyrin to yield oleanolic acid in plants such as Medicago truncatula (Carelli et al., 2011) and Glycyrrhiza uralensis (Tamura et al., 2017b). In accordance with these reports, our results also demonstrated that LmOAS1 was expanded in LM and could catalyse the oxidation of β‐Amyrin to produce oleanolic acid. The key amino acid sites for the enzymatic activities of OAS family members have been little studied (Carelli et al., 2011). Using a single‐site mutation strategy, this work indicated that the amino acid residues K145, P146, A148, L149, and R150 were crucial in sustaining the catalytic activity of LmOAS1. Identification of these key amino acids should therefore provide a better understanding of the catalytic function of OAS. Additionally, we found that LmOAS1 lacking the 1–26 amino acid region also lost catalytic activity, suggesting an indispensable role for this transmembrane motif.

Many UGT genes were more highly expressed in LM than in LJ. UGTs encode enzymes that catalyse the formation of saponins with different backbones, including dammarane (Yan et al., 2014), cucurbitane (Itkin et al., 2016), and oleanane (Orme et al., 2019). The UGT73 subfamily enzymes have been reported to catalyse the glycosylation of oleanane saponins (Augustin et al., 2012). In this work, LmUGT73P1 was expanded in LM, and the results of in vitro and in vivo assays demonstrated that LmUGT73P1 is capable of effectively transferring a Rha residue to cauloside A to produce the intermediate α‐hederin for macranthoidin B biosynthesis. Several studies have reported the substrate promiscuity of UGTs (He et al., 2019; Tang et al., 2019). We showed that LmUGT73P1 acts on cauloside A but not on oleanolic acid or hederagenin. Importantly, site‐directed mutagenesis indicated that the enzymatic activity of LmUGT73P1 was drastically reduced when the negatively charged E83 was mutated to the neutral A83, indicating that the charge of amino acids might be an important factor for the catalytic activity of the UGT73 subfamily members (Nomura et al., 2019). Additionally, compared with their corresponding collinear genes in LJ, LmOAS1 and LmUGT73P1 exhibited an interesting phenomenon of ‘neighbourhood replication’ in the LM genome. Both LmOAS1 and LmUGT73P1 had higher expression levels in LM than their collinear genes in LJ. The regulatory motifs in the 2000‐bp promoter regions (upstream of the translation start site) of LmOAS1 and LmUGT73P1 differ from those in the 2000‐bp promoter regions of their respective collinear genes (Lj5A798G50 and Lj4C768T0, respectively) in LJ (Figure S25). The ‘neighbourhood replication’ phenomenon and the differences in regulatory motifs between the promoter regions of the OAS1 and UGT73P1 genes of LM and LJ might be an important reason for the differential production of these enzymes and related compounds in LM and LJ plants.

Conclusion

Remarkable differences between LM and LJ in terms of their metabolomes, genomes, and transcriptomes were reported in this work. These findings provide insights into the genome evolution of the Lonicera genus and the differential hederagenin‐based saponin production in LM and LJ. The roles of LmOAS1 and LmUGT73P1 in the production of oleanolic acid and α‐hederin, respectively, were identified using protein mutation and in vitro and in vivo enzyme activity assays. The challenge of genetically transforming Lonicera plants must be overcome to enable functional studies, by genetic means, of the genes encoding LmOAS1 and LmUGT73P1, as well as genes encoding other enzymes involved in hederagenin‐based saponin production in the Lonicera genus. In future studies, genes involved in the biosynthesis of the hederagenin‐based saponins could be cloned and integrated into microbes, such as E. coli or yeast, to produce large quantities of the active compounds for therapeutic applications. Accordingly, the LM genome sequence can serve as a vital resource for studying the genetic foundation of secondary metabolism and for designing molecular breeding strategies to produce high‐quality honeysuckle cultivars.

Materials and methods

Plant material

The LM cultivar collected from Longhui County, Hunan Province, China, was used for sequencing. Healthy fresh leaves were collected, and external contaminants were removed by washing with ultrapure water three times. The leaves were then frozen in liquid nitrogen and stored at −80 °C before DNA extraction. For metabolomic analyses, 22 batches of LJ were obtained from Shandong Province, China, and 12 batches of LM were obtained from Hunan Province, Hubei Province, and Chongqing city, China. For RNA‐sequencing experiments, green flower bud samples were collected from the individuals and stored at −80 °C. For the genome survey, tender leaves were collected and used.

Metabolome profiling of LM and LJ

Metabolites were extracted from LM and LJ using 50% methanol according to a previously published method (Gao et al., 2016) with slight modification and analysed by LC‐QTOF‐MS. Details of this procedure are provided in methods section of the Supporting Information S1. The identification of metabolites was performed based on the mass spectra, retention times, and fragmentation patterns relative to reference compounds and literature. Reference standards for macranthoidin B, macranthoside A, dipsacoside B, asperosaponin VI, macranthoside, H‐hederin, α‐hederin, cauloside A, and hederagenin (Chengdu Alfa Biotechnology CO, LTD, Sichuan, China) were used to determine the concentrations of hederagenin‐based saponins. The identified metabolites were then subjected to unsupervised principal component analysis and heatmap clustering using R v.3.5.1 based on their peak areas.

Genome sequencing

Genomic DNA of the LM leaves was obtained with the Plant DNA extraction Kit (TIANGEN) and fragmented randomly. DNA sequencing libraries were constructed in accordance with the standard Illumina library preparation protocols for the genome survey and contig polishing. Paired‐end libraries, with an average insert size of around 220 bp, were then prepared in accordance with the manufacturer's instructions (Illumina, San Diego, CA). A total of 56.74 Gb of high‐quality data was produced for the genome survey and contig polishing. Next, high‐molecular‐weight genomic DNA was screened by BluePippin electrophoresis. Libraries with an insert size of 15–20 kb were constructed and sequenced on the PacBio Sequel system with P6‐C4 chemistry. In total, nine single molecule real‐time cells were sequenced, producing 107.89 Gb of high‐quality data for genome assembly. Hi‐C libraries were constructed with reference to an earlier published method (Mascher et al., 2017). A total of six Hi‐C fragment libraries, comprising five DpnII libraries and one HindIII library with fragment sizes between 300 and 700 bp, were constructed and sequenced on the Illumina HiSeqX Ten platform. In total, 121.35 Gb of high‐quality Hi‐C data were generated.
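As a rough illustration of what these data volumes mean, nominal sequencing depth can be estimated by dividing each yield by the genome size. The sketch below uses the yields stated above and the final assembly size of 811.06 Mb reported in the assembly section; the function and variable names are illustrative only, and the calculation ignores read filtering and mapping rates.

```python
# Sequencing yields (Gb) as stated in the text; the genome size is the final
# assembly size (811.06 Mb) reported in the assembly section.
GENOME_SIZE_GB = 811.06 / 1000.0

YIELDS_GB = {
    "Illumina paired-end": 56.74,
    "PacBio Sequel": 107.89,
    "Hi-C": 121.35,
}


def fold_coverage(yield_gb: float, genome_gb: float = GENOME_SIZE_GB) -> float:
    """Return nominal fold coverage: sequenced bases divided by genome size."""
    if genome_gb <= 0:
        raise ValueError("genome size must be positive")
    return yield_gb / genome_gb


if __name__ == "__main__":
    for library, gb in YIELDS_GB.items():
        print(f"{library}: {fold_coverage(gb):.1f}x")
```

Under these assumptions, the PacBio data correspond to roughly 133‐fold nominal coverage and the Illumina data to roughly 70‐fold.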

Genome features estimation and assembly

Genome size was evaluated using the k‐mer frequency (Koren et al., 2017). Processing of the PacBio data involved the removal of sequencing adaptors and low‐quality reads using the PacBio SMRT Analysis package with stringent parameters (min Sub Read Length, 500). The high‐quality PacBio subreads were qualified using Canu (version 1.5) software (Klasberg et al., 2021) with the parameter ‘corrected error rate’ set to 0.045, and then subjected to contig assembly by wtdbg (https://github.com/ruanjue/wtdbg) and Canu software. The Quickmerge (version 0.2) package (Walker et al., 2014) was applied to merge both assembly results using the wtdbg contigs as reference input. The Illumina short reads (56.74 Gb) were used to polish the merged contigs using Pilon (version 1.22; Walker et al., 2014) and BWA (version 0.7.10‐r789; Chin et al., 2013). Subsequently, the polished contigs were further scaffolded using Hi‐C data.

In brief, a total of 121.35 Gb of high‐quality Hi‐C data were generated after adaptor sequences were trimmed and low‐quality (over 10% N base pairs or Q10 < 50%) paired‐end reads were removed. BWA (version 0.7.10‐r789) was used to map Hi‐C data based on the aln method. Scaffolding was performed using the uniquely mapped reads of quality >20. HiC‐Pro (version 2.8.1; Servant et al., 2015) was used to perform duplicate removal, sorting, and quality checks. The Hi‐C links were combined in 50‐kb bins and separately normalized for intra‐ and intercontig contacts. LACHESIS (Burton et al., 2013) was used to order the contigs into scaffolds. A final genome size of 811.06 Mb and scaffold N50 size of 82.4 Mb were obtained.
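For readers unfamiliar with the N50 statistic quoted above, the following minimal sketch shows how it is conventionally computed from a list of sequence lengths. The lengths in the usage note are hypothetical toy values, not the LM scaffolds, and the function is not taken from any of the tools named in this section.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Accumulate from the longest sequence downwards until half is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    toy_scaffolds = [100, 80, 60, 40, 20]  # hypothetical lengths
    print(n50(toy_scaffolds))
```

With the toy lengths, the two longest sequences (100 and 80) already cover more than half of the total of 300, so the N50 is 80.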

Genome annotation

The repeat sequences of the LM genome were identified and annotated with a combination of de novo and homologue search strategies. Prediction of gene structure was based on homology blast, de novo annotation, and transcriptome analysis. Gene function assignment of LM was performed with BLAST in public databases. The accuracy and completeness of the gene prediction were assessed with CEGMA pipeline search, expressed sequence tag (EST) alignment, and BUSCO datasets.

Genome comparison and evolution

The orthologous genes of the 13 representative plant genomes were identified through retrieval of their full genome sequences from websites. The phylogenetic tree was inferred with RAxML (version 8.2.12; Stamatakis, 2014) using the GTRGAMMA model, 100 starting trees and 1000 bootstrap replicates. Gene family expansions and contractions were identified by CAFE (version 4.2.1; De Bie et al., 2006). Because the genome of V. vinifera did not undergo any other WGD event, and the genome of D. carota underwent another two WGD events (Dc‐α, Dc‐β) following the γ event, we selected these two species (V. vinifera and D. carota) as references to confirm the number of WGD events for the two honeysuckle species (LM and LJ). The WGD event times for LM and LJ were estimated using wgd (version 3.0; Zwaenepoel and De Peer, 2019). The co‐linearity analyses between the honeysuckle species and V. vinifera were performed using JCVI 1.0.5 (https://github.com/tanghaibao/jcvi). After that, the Ks distribution of LM was measured using WGDI (version 0.51; Sun et al., 2022).

To compare the differences between the two honeysuckle species at the genome level, the shared and species‐specific orthologous groups between LM and LJ were reanalyzed using OrthoFinder (Emms and Kelly, 2019). KEGG enrichment analysis was performed for the species‐specific orthologous groups in LM. The differences in chromosomal structure between LM and LJ were visualized by the co‐linearity method using JCVI. The sequence similarities and Ks values between these gene pairs were calculated using EMBOSS Needle (version 6.6; Li et al., 2015), and phylogenetic analysis was performed using maximum likelihood (PAML; version 4.9b; Yang, 2007).

Evolution of hederagenin‐based saponin biosynthesis‐related genes

To assess tandem duplications that might have taken place in every gene family in the biosynthetic pathway, we determined the positions of all genes identified in the assembly. Divergence time of gene duplication in the hederagenin‐based saponins biosynthetic pathway was evaluated with reference to the phylogenetic tree of Arabidopsis genes. Selected paralogous gene pairs were subjected to Ks calculation using the Nei and Gojobori method in the PAML program (version 4.9b). The divergence time was subsequently calculated from the obtained Ks value based on the relation T = Ks/(2r), where r represents a substitution rate of 7.7 × 10⁻⁹ mutations per site per year for eudicots.
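The relation T = Ks/(2r) can be sketched as a small helper. The substitution rate is the value stated above; the function names and the example Ks value are illustrative assumptions and do not come from the study's data.

```python
# Substitution rate for eudicots as stated in the text
# (mutations per site per year).
SUBSTITUTION_RATE = 7.7e-9


def divergence_time_mya(ks: float, rate: float = SUBSTITUTION_RATE) -> float:
    """Convert a Ks value to a divergence time in million years (T = Ks / 2r)."""
    if rate <= 0:
        raise ValueError("substitution rate must be positive")
    return ks / (2.0 * rate) / 1e6


def ks_for_time(time_mya: float, rate: float = SUBSTITUTION_RATE) -> float:
    """Inverse relation: the Ks expected after `time_mya` million years."""
    return 2.0 * rate * time_mya * 1e6


if __name__ == "__main__":
    example_ks = 0.154  # hypothetical Ks for a paralogous pair
    print(f"{divergence_time_mya(example_ks):.2f} MYA")
```

With this rate, a hypothetical Ks of 0.154 corresponds to about 10 MYA, because 0.154 / (2 × 7.7 × 10⁻⁹) = 1.0 × 10⁷ years.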

Transcriptome profiling of LM and LJ

Total RNA was extracted from all samples for transcriptome analyses. The quality and concentrations of extracted RNA samples were determined, and cDNA library construction and sequencing were performed by the Biomarker Technologies Corporation (Beijing, China). After sequencing, clean data (clean reads) were obtained by removing reads containing adapters, reads containing poly‐N, and low‐quality reads from the raw data. Gene expression was quantified as Fragments Per Kilobase of exon model per Million mapped fragments (FPKM). Differential gene expression was analysed from the read counts with DESeq2 (Anders and Huber, 2010) using a model based on the negative binomial distribution. The resulting P values were adjusted using the Benjamini and Hochberg approach for controlling the false discovery rate. Genes with an adjusted P‐value < 0.05 and a fold change >2 or <0.5 were considered differentially expressed genes (DEGs).
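The quantities used in this section can be illustrated with a short, self‐contained sketch: the FPKM definition, the Benjamini and Hochberg adjustment, and the DEG thresholds stated above. This is not the DESeq2 implementation, which estimates dispersions and performs the statistical test itself; the function names and the numbers in the usage example are hypothetical.

```python
def fpkm(fragments: float, gene_length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments per kilobase of exon model per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the adjusted
    # values stay monotone (each is the minimum of p * m / rank seen so far).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def is_deg(fold_change: float, padj: float, alpha: float = 0.05) -> bool:
    """Apply the thresholds from the text: padj < 0.05 and fold change > 2 or < 0.5."""
    return padj < alpha and (fold_change > 2.0 or fold_change < 0.5)


if __name__ == "__main__":
    print(fpkm(500, 2000, 10_000_000))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
    print(is_deg(fold_change=3.2, padj=0.02))
```

Note that a gene with a large fold change is still excluded if its adjusted P value is 0.05 or higher, and vice versa; both criteria must hold.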

Cloning and yeast transformation of LmOAS1

The full‐length coding sequence (CDS) of LmOAS1 was amplified from cDNA of LM using the primers listed in Table S18, following the protocol of Phanta Max Super‐Fidelity DNA Polymerase (Vazyme, Nanjing, China). To obtain EVM0008033 lacking 26 amino acids at the N‐terminus of the protein, primers pYES226cut0008033‐F and pYES20008033‐R listed in Table S18 were used. The wild‐type EVM0008033 and the truncated EVM0008033 were each cloned into a yeast expression vector. Cytochrome P450 reductase ART1 obtained from Arabidopsis thaliana was also cloned into the yeast system.

The transformed yeasts were screened on synthetic dropout (SD)/‐Leu/‐Ura plates. The strain co‐transformed with the empty vector and pRS425‐Leu‐ART1 was used as the negative control. Each of the transformed yeast strains was verified according to the protocol of 2 × Rapid Taq Master Mix (TaKaRa, Kyoto, Japan) before cultivation. The yeast cells were harvested by centrifugation at 3000 g for 10 min. Subsequent procedures were performed at 4 °C or on ice. Harvested cells were washed three times with 100 mM potassium phosphate buffer (pH 7.4) and resuspended in the same buffer containing 1 mM EDTA. Glass beads were added to lyse the cells by vortexing for 20 min. The lysed cells were centrifuged at 12 000 g for 10 min. The supernatant was collected and stored as crude yeast protein at −80 °C until use.

In vitro enzymatic activity assay of LmOAS1

The activity of LmOAS1 was tested in a 2 mL reaction mixture containing 20 mM Glc‐6‐phosphate, 2.5 units of Glc‐6‐phosphate dehydrogenase, 30 mg of β‐Amyrin, 2 mM NADPH and 1.8 mL of crude yeast proteins. After incubating the reaction mixture for 6 h at 28 °C, the reaction was terminated and extracted with 2 mL of ethyl acetate. The extract was evaporated, re‐dissolved in 110 μL of 100% methanol and subjected to LC‐QTOF‐MS/MS analysis. Enzymatic products were monitored by comparing both the retention time and mass spectra with those of standard oleanolic acid (Yuanye Bio‐Technology Co., Ltd, Shanghai, China).

Molecular docking and site‐directed mutagenesis of LmOAS1

The three‐dimensional model of LmOAS1 was constructed based on the X‐ray structure of CYP120A (resolution: 2.10 Å; Kühnel et al., 2008) using the Profiles‐3D method. The substrate β‐Amyrin was docked into the binding pocket of the LmOAS1 model using the Autodock software (Morris et al., 2009). Amino acid residues surrounding the substrate at a distance of less than 5 Å and facing the active site center were chosen as mutation candidates. Residues K145, P146, E147, A148, L149, and R150 were selected for subsequent site‐directed mutagenesis to generate the K145E, P146S, E147H, A148S, L149G, and R150N variants. The fragments of mutated LmOAS1 were amplified from the vector pYES2‐Ura‐EVM0008033 using the primers listed in Table S18, following the protocol of Phanta Max Super‐Fidelity DNA Polymerase (Vazyme). Fragments of each mutated LmOAS1 were ligated and cloned into the vector pYES2‐Ura according to the protocol of the ClonExpress Ultra One Step Cloning Kit (Vazyme). The enzyme activities of the mutated LmOAS1 versions were determined using the in vitro enzymatic assay described above.
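The distance criterion used to shortlist mutation candidates can be sketched as follows. This is only an illustration of the "less than 5 Å from the substrate" filter; it does not implement the second criterion (side chains facing the active site center), and the coordinates and the residue label G300 in the example are hypothetical toy values rather than atoms from the LmOAS1 model.

```python
import math


def residues_within(substrate_atoms, residue_atoms, cutoff=5.0):
    """Return labels of residues with any atom closer than `cutoff` (in Å)
    to any substrate atom.

    substrate_atoms: list of (x, y, z) tuples.
    residue_atoms: dict mapping a residue label (e.g. "K145") to a list of
        (x, y, z) tuples for that residue's atoms.
    """
    selected = []
    for label, atoms in residue_atoms.items():
        if any(math.dist(atom, sub) < cutoff
               for atom in atoms for sub in substrate_atoms):
            selected.append(label)
    return selected


if __name__ == "__main__":
    # Hypothetical toy coordinates, for illustration only.
    substrate = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    residues = {
        "K145": [(4.0, 1.0, 0.0)],   # about 2.7 Å from the second atom
        "G300": [(12.0, 0.0, 0.0)],  # far from the substrate
    }
    print(residues_within(substrate, residues))
```

In practice the coordinates would be read from the docked model, and the shortlisted residues would then be inspected visually for side‐chain orientation.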

Protein expression and purification of LmUGT73P1

The full‐length LmUGT73P1 cDNA was amplified by polymerase chain reaction (PCR) using designed primers (Table S18) and inserted into the pET‐28a (+) vector (Invitrogen, Carlsbad, USA). The recombinant plasmid pET‐28a‐LmUGT73P1 was then introduced into E. coli BL21 (DE3; Transgen Biotech, Beijing, China) for heterologous expression. Expression of the target protein was induced, and the protein was purified from the transformed E. coli cells using Ni‐NTA agarose (QIAGEN, Dusseldorf, Germany). The purified protein was then concentrated and desalted using the Amicon Centrifugal Filter (Millipore, Billerica, MA) and stored at −80 °C for in vitro assays. The activity of LmUGT73P1 was analysed by in vitro enzyme assay.

In vitro assays and kinetic measurements of LmUGT73P1

The function of LmUGT73P1 was characterized by co‐incubating 20 μg purified protein, 0.1 mM substrate (oleanolic acid, hederagenin, and cauloside A were used separately), and 0.5 mM sugar donor (mixture of UDP‐Rha, UDP‐Ara, and UDP‐Glu) in 100 μL of 50 mM Tris–HCl buffer (pH 8.0, 37 °C, 1 h). Reactions were quenched with ice‐cold methanol and centrifuged at 12 000 g for 10 min. The supernatants were dried, re‐dissolved in methanol and subjected to LC‐QTOF‐MS/MS analysis.

Kinetic parameters of LmUGT73P1 were calculated using cauloside A as the substrate. Assays were performed in a final volume of 100 μL, consisting of 50 mM Tris–HCl (pH 7.5), 1 μg LmUGT73P1, 10 mM UDP‐Rha, and different concentrations of the substrate (2.5, 5, 10, 20, 40, 80, 100 μM). After incubation at 37 °C for 20 min, the reactions were quenched with ice‐cold methanol and centrifuged at 12 000 g for 10 min, and the supernatants were analysed by LC‐QTOF‐MS/MS. The value of Km was calculated with the Michaelis–Menten plotting method.
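The text does not specify how the Michaelis–Menten plot was fitted, so the sketch below uses one common stand‐in, the Hanes–Woolf linearization (S/v = S/Vmax + Km/Vmax), solved by ordinary least squares. The substrate concentrations are those listed above; the velocities are synthetic values generated from an assumed Km of 20 μM and Vmax of 1.0 (arbitrary units) and are not measurements from this study.

```python
def fit_michaelis_menten(substrate, velocity):
    """Estimate (Km, Vmax) by linear regression of S/v against S.

    Hanes-Woolf form: S/v = (1/Vmax) * S + Km/Vmax, so the slope is 1/Vmax
    and the intercept is Km/Vmax.
    """
    if len(substrate) != len(velocity) or len(substrate) < 2:
        raise ValueError("need at least two matched (S, v) points")
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, velocity)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


if __name__ == "__main__":
    concentrations_um = [2.5, 5, 10, 20, 40, 80, 100]  # from the text
    # Synthetic velocities from assumed parameters (illustration only).
    assumed_km, assumed_vmax = 20.0, 1.0
    rates = [assumed_vmax * s / (assumed_km + s) for s in concentrations_um]
    km, vmax = fit_michaelis_menten(concentrations_um, rates)
    print(f"Km = {km:.2f} uM, Vmax = {vmax:.2f}")
```

For noisy experimental data, a direct nonlinear fit of v = Vmax·S/(Km + S) is often preferred, since linearizations weight the measurement errors unevenly.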

Molecular docking and site‐directed mutagenesis of LmUGT73P1

The three‐dimensional structures of the substrate and sugar donor were generated and optimized by Chem3D software (CambridgeSoft, Cambridge, MA, USA). The structure model of LmUGT73P1 was predicted by the Robetta server (http://robetta.bakerlab.org/), and the pocket and cavity were then identified with the web server POCASA (https://g6altair.sci.hokudai.ac.jp/g6/service/pocasa/). Cauloside A and UDP‐Rha were docked into the substrate binding pocket using AutoDock Vina with a grid box (30 Å × 30 Å × 40 Å) centered at (−1.1, 30.6, −35.5) Å. Amino acid residues surrounding the binding pocket were chosen as candidates. Residues S53, E83, K258, E312, T330, and V338 were selected for subsequent site‐directed mutagenesis to make the S53A, E83A, K258A, E312A, T330A, and V338A variants. The fragments of mutated LmUGT73P1 were amplified from the vector pET‐28a (+)‐LmUGT73P1 using the primers listed in Table S18, following the protocol of Phanta Max Super‐Fidelity DNA Polymerase (Vazyme). Fragments of each mutated LmUGT73P1 were ligated and cloned into the vector pET‐28a (+) according to the protocol of the ClonExpress Ultra One Step Cloning Kit (Vazyme). The enzyme activities of the mutated LmUGT73P1 versions were analysed through in vitro enzymatic assay.
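The grid box described above could be expressed as an AutoDock Vina configuration file. The sketch below only generates the text of such a file from the stated center and box dimensions; the receptor, ligand, and output file names are hypothetical placeholders, and how the two ligands (cauloside A and UDP‐Rha) were prepared and docked is not described in the text, so a single ligand entry is assumed here.

```python
def vina_config(receptor, ligand, center, size, out="docked.pdbqt"):
    """Build the text of an AutoDock Vina config file for one ligand.

    center and size are (x, y, z) tuples in Å.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    config_text = vina_config(
        receptor="LmUGT73P1_model.pdbqt",  # hypothetical file name
        ligand="cauloside_A.pdbqt",        # hypothetical file name
        center=(-1.1, 30.6, -35.5),        # grid center from the text
        size=(30.0, 30.0, 40.0),           # grid box from the text
    )
    print(config_text)
```

The resulting text can be saved to a file and passed to Vina with its `--config` option.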

Expression of UDP‐glycosyltransferase (LmUGT73P1) in N. benthamiana and product analysis

The LmUGT73P1‐encoding gene was amplified by PCR with the gene‐specific primers UGT73P1–3 × Flag‐BamHI and UGT73P1–3 × Flag‐SpeI and inserted into the pHB‐3 × Flag vector. After verification by PCR and DNA sequencing, the pHB‐UGT73P1 vector was transferred into the Agrobacterium tumefaciens GV3101 strain. The positive clones were cultured at 28 °C in LB liquid medium. After centrifugation at 4000 g for 10 min, the A. tumefaciens cells were collected and resuspended in MMA buffer to obtain a final suspension with an OD600 of 0.8 for each transformant. After incubation for 2 h at room temperature, the suspension was injected into the leaves of N. benthamiana. An A. tumefaciens suspension without UGT73P1 (empty vector) was used as the control. After growth in a glasshouse for 5 days, fresh leaves of the infiltrated plants were harvested and dried at 40 °C. Dried leaves were ground into a powder, and metabolites were extracted with 50% methanol solution and analysed using the UPLC‐QTOF‐MS/MS method.

Conflict of interest

The authors declare that they have no conflict of interest.

Author contributions

L.‐W.Q., X.Y., L.H.‐E., X.L., L.‐S.P.T., X.K., B.Z., and P.L. designed research; X.Y., Y.X., Y.C., X.W., X.C., and X.W. performed research; X.Y., J.D., H.D., Y.‐Y.C., W.G., D.G., and G.A.‐J. analysed the data; X.Y., L.‐W.Q., X.L., L.H.‐E., X.L., L.‐S.P.T., and F.‐Q.H. wrote the paper.

Supporting information

Figure S1 Macroscopic characteristics of dried flower buds of Lonicera macranthoides (LM) and L. japonica (LJ).

Figure S2 Chemical profiling of Lonicera macranthoides (LM) and L. japonica (LJ).

Figure S3 Comparative principal component (PC) analysis (PCA) of Lonicera macranthoides (LM) and L. japonica (LJ).

Figure S4 Evaluation of the genome size of Lonicera macranthoides (LM).

Figure S5 Hi‐C intra‐chromosomal contact map for the genome assembly (2n = 18).

Figure S6 Venn diagram analysis of coding genes in the genome of Lonicera macranthoides.

Figure S7 KEGG pathway enrichment analysis of unique genes in Lonicera macranthoides compared with L. japonica.

Figure S8 Clusters of orthologous gene families in Lonicera macranthoides (LM), L. japonica (LJ), and 11 other fully‐sequenced plant species.

Figure S9 Ks distributions of paralogous genes within Lonicera macranthoides, L. japonica, Daucus carota, and Vitis vinifera.

Figure S10 Ks distributions in Lonicera macranthoides.

Figure S11 Ks distributions in Lonicera japonica.

Figure S12 Macrosynteny visualization among Lonicera macranthoides (LM), L. japonica (LJ), and Vitis vinifera (Viv) by intergenomic co‐linearity analysis.

Figure S13 Summary of the syntenic analysis between Lonicera macranthoides (LM) and Vitis vinifera (Viv).

Figure S14 Heatmap of differentially expressed genes involved in the hederagenin‐based saponin biosynthetic pathway in Lonicera macranthoides (LM) and L. japonica (LJ).

Figure S15 The amino acid sequence alignment of targeted oleanolic acid synthase (OAS, EVM0008033) in LM.

Figure S16 Phylogenetic tree of differentially expressed UDP‐glycosyltransferases (UGTs) between Lonicera macranthoides (LM) and L. japonica (LJ).

Figure S17 SDS‐PAGE analysis of the recombinant LmUGT73P1.

Figure S18 LC‐QTOF‐MS/MS analysis of product extracted from in vitro enzyme assay.

Figure S19 LC‐QTOF‐MS/MS confirmation of α‐hederin.

Figure S20 LC‐QTOF‐MS/MS analysis of product extracted from in vivo assay.

Figure S21 Evolution of genes involved in hederagenin‐based saponin biosynthesis in Lonicera macranthoides (LM).

Figure S22 Chromosomal localization analysis of genes involved in the biosynthetic pathway of hederagenin‐based saponins in Lonicera macranthoides.

Figure S23 Transcriptional expression profile of genes located in chromosome inversions and translocations regions.

Figure S24 Phylogenetic tree of OAS identified in LM and their relationships with reported OAS genes.

Figure S25 Sequence analysis of LmOAS1 and LmUGT73P1 in LM and LJ.

Table S1 Genome survey statistic results of Lonicera macranthoides.

Table S2 The filter data from original sequenced data of Lonicera macranthoides.

Table S3 The statistics of Hi‐C data.

Table S4 Genome assembly of Lonicera macranthoides.

Table S5 The quality estimation of Hi‐C data.

Table S6 The scaffold number and length grouped on pseudochromosomes.

Table S7 The quality estimation of genome assembly for Lonicera macranthoides using CEGMA database.

Table S8 The quality estimation of genome assembly for Lonicera macranthoides using BUSCO database.

Table S9 The quality estimation of genome assembly for Lonicera macranthoides using EST.

Table S10 The statistical results of gene prediction of Lonicera macranthoides.

Table S11 The statistical results of function annotation of Lonicera macranthoides genome.

Table S12 The statistical results of gene structure prediction of Lonicera macranthoides.

Table S13 Summary of repeat sequence in the Lonicera macranthoides (LM) genome.

Table S14 The statistical comparison results of Lonicera macranthoides and Lonicera japonica.

Table S15 List of species used in phylogenomic analyses.

Table S16 Summary statistics for gene family inference and gene counts for each family and each species.

Table S17 The copy number of genes involved in hederagenin‐based saponin biosynthesis.

Table S18 Primers used in this study.

PBI-21-2209-s002.docx (7.9MB, docx)

Data S1 The identified compounds from Lonicera macranthoides (LM) and L. japonica (LJ).

Data S2 The synonymous substitution rate (Ks) distribution of collinear genes between Lonicera macranthoides (LM) and L. japonica (LJ) genomes.

Data S3 List of the differentially expressed genes in green floral buds of Lonicera macranthoides (LM) and L. japonica (LJ).

Data S4 Transcriptional change tendency of key genes involved in hederagenin‐based saponin biosynthesis in green floral buds of Lonicera macranthoides (LM) versus L. japonica (LJ).

PBI-21-2209-s001.xlsx (2.3MB, xlsx)

Acknowledgements

We thank Huiying Wang of the State Key Laboratory of Natural Medicines, Department of Pharmacognosy, Institute of Pharmaceutical Science, China Pharmaceutical University for her help in preparing purified samples for NMR spectra analysis. This work was partially supported by the National Natural Science Foundation of China (NSFC, nos. 81825023 and 82173918), the National Key R&D Program of China (2019YFC1711000), grants from the Basic Science program of CONACyT‐Mexico (Grant 00126261), and the Governor University Research Initiative program (05‐2018) from the State of Texas (L.H.‐E.).

Contributor Information

Lam‐Son Phan Tran, Email: son.tran@ttu.edu.

Luis Herrera‐Estrella, Email: luis.herrera-estrella@ttu.edu.

Xu Lu, Email: luxu666@163.com.

Lian‐Wen Qi, Email: qilw@cpu.edu.cn.

Data availability statement

The genome and transcriptome sequence data have been deposited under NCBI BioProject numbers PRJNA800599 and PRJNA825098, respectively. Source data are provided with this paper.

References

  1. An, X. , Xu, X. , Xiao, M. , Min, X. , Lyu, Y. , Tian, J. , Ke, J. et al. (2021) Efficacy of Jinhua Qinggan Granules combined with western medicine in the treatment of confirmed and suspected COVID‐19: a randomized controlled trial. Front. Med. 8, 728055. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Anders, S. and Huber, W. (2010) Differential expression analysis for sequence count data. Genome Biol. 11, R106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Augustin, J.M. , Drok, S. , Shinoda, T. , Sanmiya, K. , Nielsen, J.K. , Khakimov, B. , Olsen, C.E. et al. (2012) UDP‐glycosyltransferases from the UGT73C subfamily in Barbarea vulgaris catalyze sapogenin 3‐O‐glucosylation in saponin‐mediated insect resistance. Plant Physiol. 160, 1881–1895. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Burton, J.N. , Adey, A. , Patwardhan, R.P. , Qiu, R. , Kitzman, J.O. and Shendure, J. (2013) Chromosome‐scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat. Biotechnol. 31, 1119–1125. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Carelli, M. , Biazzi, E. , Panara, F. , Tava, A. , Scaramelli, L. , Porceddu, A. , Graham, N. et al. (2011) Medicago truncatula CYP716A12 is a multifunctional oxidase involved in the biosynthesis of hemolytic saponins. Plant Cell, 23, 3070–3081. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Chin, C.S. , Alexander, D.H. , Marks, P. , Klammer, A.A. , Drake, J. , Heiner, C. , Clum, A. et al. (2013) Nonhybrid, finished microbial genome assemblies from long‐read SMRT sequencing data. Nat. Methods, 10, 563–569. [DOI] [PubMed] [Google Scholar]
  7. Chinese Pharmacopoeia Commission (1963) Pharmacopoeia of the People's Republic of China, p. 168. Beijing: People's Medical Publishing House and Chemical Industry Press. [Google Scholar]
  8. Chinese Pharmacopoeia Commission (2005) Pharmacopoeia of the People's Republic of China, p. 152. Beijing: People's Medical Publishing House and Chemical Industry Press. [Google Scholar]
  9. Chinese Pharmacopoeia Commission (2020) Pharmacopoeia of the People's Republic of China. Beijing: People's Medical Publishing House and Chemical Industry Press. [Google Scholar]
  10. Chung, P.Y. (2020) Novel targets of pentacyclic triterpenoids in Staphylococcus aureus: a systematic review. Phytomedicine, 73, 152933. [DOI] [PubMed] [Google Scholar]
  11. Crow, K.D. and Wagner, G.P. (2005) What is the role of genome duplication in the evolution of complexity and diversity? Mol. Biol. Evol. 23, 887–892.
  12. De Bie, T., Cristianini, N., Demuth, J.P. and Hahn, M.W. (2006) CAFE: a computational tool for the study of gene family evolution. Bioinformatics, 22, 1269–1271.
  13. Emms, D.M. and Kelly, S. (2019) OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 1–14.
  14. Frieling, J., Gebhardt, H., Huber, M., Adekeye, O.A., Akande, S.O., Reichart, G.J., Middelburg, J.J. et al. (2017) Extreme warmth and heat‐stressed plankton in the tropics during the Paleocene‐Eocene Thermal Maximum. Sci. Adv. 3, e1600891.
  15. Gao, W., Yang, H., Qi, L.W., Liu, E.H., Ren, M.T., Yan, Y.T., Chen, J. et al. (2012) Unbiased metabolite profiling by liquid chromatography–quadrupole time‐of‐flight mass spectrometry and multivariate data analysis for herbal authentication: classification of seven Lonicera species flower buds. J. Chromatogr. A 1245, 109–116.
  16. Gao, W., Wang, R., Li, D., Liu, K., Chen, J., Li, H.J., Xu, X. et al. (2016) Comparison of five Lonicera flowers by simultaneous determination of multi‐components with single reference standard method and principal component analysis. J. Pharm. Biomed. Anal. 117, 345–351.
  17. Ge, B., Lu, X.Y., Yi, K. and Tian, Y. (2004) The active constituent and pharmaceutical action of Flos Lonicerae and its application. Chin. Wild Plant Res. 23, 13–16.
  18. Ge, L., Xie, Q., Jiang, Y., Xiao, L., Wan, H., Zhou, B., Wu, S. et al. (2022) Genus Lonicera: new drug discovery from traditional usage to modern chemical and pharmacological research. Phytomedicine, 96, 153889.
  19. He, J.B., Zhao, P., Hu, Z.M., Liu, S., Kuang, Y., Zhang, M., Li, B. et al. (2019) Molecular and structural characterization of a promiscuous C‐glycosyltransferase from Trollius chinensis. Angew. Chem. Int. Ed. Engl. 58, 11513–11520.
  20. He, S., Sun, G., Geng, X., Gong, W., Dai, P., Jia, Y., Shi, W. et al. (2021) The genomic basis of geographic differentiation and fiber improvement in cultivated cotton. Nat. Genet. 53, 916–924.
  21. Hu, K., Guan, W.J., Bi, Y., Zhang, W., Li, L., Zhang, B., Liu, Q. et al. (2021) Efficacy and safety of Lianhuaqingwen capsules, a repurposed Chinese herb, in patients with coronavirus disease 2019: a multicenter, prospective, randomized controlled trial. Phytomedicine, 85, 153242.
  22. Itkin, M., Davidovich‐Rikanati, R., Cohen, S., Portnoy, V., Doron‐Faigenboim, A., Oren, E., Freilich, S. et al. (2016) The biosynthetic pathway of the nonsugar, high‐intensity sweetener mogroside V from Siraitia grosvenorii. Proc. Natl Acad. Sci. USA, 113, E7619–E7628.
  23. Jakhmola Mani, R., Sehgal, N., Dogra, N., Saxena, S. and Pande Katare, D. (2022) Deciphering underlying mechanism of Sars‐CoV‐2 infection in humans and revealing the therapeutic potential of bioactive constituents from Nigella sativa to combat COVID19: in‐silico study. J. Biomol. Struct. Dyn. 40, 2417–2429.
  24. Jo, H.J., Han, J.Y., Hwang, H.S. and Choi, Y.E. (2017) β‐Amyrin synthase (EsBAS) and β‐amyrin 28‐oxidase (CYP716A244) in oleanane‐type triterpene saponin biosynthesis in Eleutherococcus senticosus. Phytochemistry, 135, 53–63.
  25. Kashiwada, Y., Wang, H.K., Nagao, T., Kitanaka, S., Yasuda, I., Fujioka, T., Yamagishi, T. et al. (1998) Anti‐AIDS agents. 30. Anti‐HIV activity of oleanolic acid, pomolic acid, and structurally related triterpenoids. J. Nat. Prod. 61, 1090–1095.
  26. Klasberg, S., Schmidt, A.H., Lange, V. and Schöfl, G. (2021) DR2S: an integrated algorithm providing reference‐grade haplotype sequences from heterozygous samples. BMC Bioinformatics, 22, 236.
  27. Koren, S., Walenz, B.P., Berlin, K., Miller, J.R., Bergman, N.H. and Phillippy, A.M. (2017) Canu: scalable and accurate long‐read assembly via adaptive k‐mer weighting and repeat separation. Genome Res. 27, 722–736.
  28. Kühnel, K., Ke, N., Cryle, M.J., Sligar, S.G., Schuler, M.A. and Schlichting, I. (2008) Crystal structures of substrate‐free and retinoic acid‐bound cyanobacterial cytochrome P450 CYP120A1. Biochemistry, 47, 6552–6559.
  29. Lee, Y.R., Chang, C.M., Yeh, Y.C., Huang, C.F., Lin, F.M., Huang, J.T., Hsieh, C.C. et al. (2021) Honeysuckle aqueous extracts induced let‐7a suppress EV71 replication and pathogenesis in vitro and in vivo and is predicted to inhibit SARS‐CoV‐2. Viruses, 13, 308.
  30. Li, W., Cowley, A., Uludag, M., Gur, T., McWilliam, H., Squizzato, S., Park, Y.M. et al. (2015) The EMBL‐EBI bioinformatics web and programmatic tools framework. Nucleic Acids Res. 43, W580–W584.
  31. Li, H., Cheng, C., Li, S., Wu, Y., Liu, Z., Liu, M., Chen, J. et al. (2021a) Discovery and structural optimization of 3‐O‐β‐chacotriosyl oleanane‐type triterpenoids as potent entry inhibitors of SARS‐CoV‐2 virus infections. Eur. J. Med. Chem. 215, 113242.
  32. Li, M., Wang, Y., Jin, J., Dou, J., Guo, Q., Ke, X., Zhou, C. et al. (2021b) Inhibitory activity of honeysuckle extracts against influenza A virus in vitro and in vivo. Virol. Sin. 36, 490–500.
  33. Li, C.W., Zhang, H., Rao, P., Zheng, D.R., Wang, Y. and Li, Y.H. (2021c) Research progress on biosynthesis pathway of pentacyclic triterpenoids in plants. Chin. Tradit. Herb. Drug, 52, 11.
  34. Mascher, M., Gundlach, H., Himmelbach, A., Beier, S., Twardziok, S.O., Wicker, T., Radchuk, V. et al. (2017) A chromosome conformation capture ordered sequence of the barley genome. Nature, 544, 427–433.
  35. Morris, G.M., Huey, R., Lindstrom, W., Sanner, M.F., Belew, R.K., Goodsell, D.S. and Olson, A.J. (2009) AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J. Comput. Chem. 30, 2785–2791.
  36. Nomura, Y., Seki, H., Suzuki, T., Ohyama, K., Mizutani, M., Kaku, T., Tamura, K. et al. (2019) Functional specialization of UDP‐glycosyltransferase 73P12 in licorice to produce a sweet triterpenoid saponin, glycyrrhizin. Plant J. 99, 1127–1143.
  37. Orme, A., Louveau, T., Stephenson, M.J., Appelhagen, I., Melton, R., Cheema, J., Li, Y. et al. (2019) A noncanonical vacuolar sugar transferase required for biosynthesis of antimicrobial defense compounds in oat. Proc. Natl Acad. Sci. USA, 116, 27105–27114.
  38. Parra, G., Bradnam, K. and Korf, I. (2007) CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics, 23, 1061–1067.
  39. Pu, X., Li, Z., Tian, Y., Gao, R., Hao, L., Hu, Y., He, C. et al. (2020) The honeysuckle genome provides insight into the molecular mechanism of carotenoid metabolism underlying dynamic flower coloration. New Phytol. 227, 930–943.
  40. Salim, B. and Noureddine, M. (2020) Identification of compounds from Nigella sativa as new potential inhibitors of 2019 novel coronasvirus (COVID‐19): molecular docking study. ChemRxiv, 3, 1–12.
  41. Servant, N., Varoquaux, N., Lajoie, B.R., Viara, E., Chen, C.J., Vert, J.P., Heard, E. et al. (2015) HiC‐Pro: an optimized and flexible pipeline for Hi‐C data processing. Genome Biol. 16, 259.
  42. Shang, X., Pan, H., Li, M., Miao, X. and Ding, H. (2011) Lonicera japonica Thunb.: ethnopharmacology, phytochemistry and pharmacology of an important traditional Chinese medicine. J. Ethnopharmacol. 138, 1–21.
  43. Simão, F.A., Waterhouse, R.M., Ioannidis, P., Kriventseva, E.V. and Zdobnov, E.M. (2015) BUSCO: assessing genome assembly and annotation completeness with single‐copy orthologs. Bioinformatics, 31, 3210–3212.
  44. Srisawat, P., Fukushima, E.O., Yasumoto, S., Robertlee, J., Suzuki, H., Seki, H. and Muranaka, T. (2019) Identification of oxidosqualene cyclases from the medicinal legume tree Bauhinia forficata: a step toward discovering preponderant α‐amyrin‐producing activity. New Phytol. 224, 352–366.
  45. Stamatakis, A. (2014) RAxML version 8: a tool for phylogenetic analysis and post‐analysis of large phylogenies. Bioinformatics, 30, 1312–1313.
  46. Su, W., Jing, Y., Lin, S., Yue, Z., Yang, X., Xu, J., Wu, J. et al. (2021) Polyploidy underlies co‐option and diversification of biosynthetic triterpene pathways in the apple tribe. Proc. Natl Acad. Sci. USA, 118, e2101767118.
  47. Sun, P., Jiao, B., Yang, Y., Shan, L., Li, T., Li, X., Xi, Z. et al. (2022) WGDI: a user‐friendly toolkit for evolutionary analyses of whole‐genome duplications and ancestral karyotypes. Mol. Plant, 15, 1841–1851.
  48. Tamura, K., Teranishi, Y., Ueda, S., Suzuki, H., Kawano, N., Yoshimatsu, K., Saito, K. et al. (2017a) Cytochrome P450 monooxygenase CYP716A141 is a unique beta‐amyrin C‐16 beta oxidase involved in triterpenoid saponin biosynthesis in Platycodon grandiflorus. Plant Cell Physiol. 58, 874–884.
  49. Tamura, K., Seki, H., Suzuki, H., Kojoma, M., Saito, K. and Muranaka, T. (2017b) CYP716A179 functions as a triterpene C‐28 oxidase in tissue‐cultured stolons of Glycyrrhiza uralensis. Plant Cell Rep. 36, 437–445.
  50. Tang, Q.Y., Chen, G., Song, W.L., Fan, W., Wei, K.H., He, S.M., Zhang, G.H. et al. (2019) Transcriptome analysis of Panax zingiberensis identifies genes encoding oleanolic acid glucuronosyltransferase involved in the biosynthesis of oleanane‐type ginsenosides. Planta, 249, 393–406.
  51. Tu, L., Su, P., Zhang, Z., Gao, L., Wang, J., Hu, T., Zhou, J. et al. (2020) Genome of Tripterygium wilfordii and identification of cytochrome P450 involved in triptolide biosynthesis. Nat. Commun. 11, 971.
  52. United States Pharmacopeial Convention (2017) United States Pharmacopoeia, p. 4709. Rockville MD: United States Pharmacopoeia, USP41‐NF36.
  53. Wagner, C.L., Egli, R., Lascu, I., Lippert, P.C., Livi, K.J.T. and Sears, H.B. (2021) In situ magnetic identification of giant, needle‐shaped magnetofossils in Paleocene–Eocene Thermal Maximum sediments. Proc. Natl Acad. Sci. USA, 118, e2018169118.
  54. Walker, B.J., Abeel, T., Shea, T., Priest, M., Abouelliel, A., Sakthikumar, S., Cuomo, C.A. et al. (2014) Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One, 9, e112963.
  55. Wang, L.J. (2010) The study progress of Lonicera japonica. Med. Inform. 8, 2293–2296.
  56. Wang, C., Cao, B., Liu, Q.Q., Zou, Z.Q., Liang, Z.A., Gu, L., Dong, J.P. et al. (2011) Oseltamivir compared with the Chinese traditional therapy Maxingshigan–Yinqiaosan in the treatment of H1N1 influenza: a randomized trial. Ann. Intern. Med. 155, 217–225.
  57. Wen, L., Yun, X., Zheng, X., Xu, H., Zhan, R., Chen, W., Xu, Y. et al. (2017) Transcriptomic comparison reveals candidate genes for triterpenoid biosynthesis in two closely related Ilex species. Front. Plant Sci. 8, 634.
  58. Xu, Z., Pu, X., Gao, R., Demurtas, O.C., Fleck, S.J., Richter, M., He, C. et al. (2020) Tandem gene duplications drive divergent evolution of caffeine and crocin biosynthetic pathways in plants. BMC Biol. 18, 63.
  59. Yan, X., Fan, Y., Wei, W., Wang, P., Liu, Q., Wei, Y., Zhang, L. et al. (2014) Production of bioactive ginsenoside compound K in metabolically engineered yeast. Cell Res. 24, 770–773.
  60. Yang, Z. (2007) PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591.
  61. Zapata, L., Ding, J., Willing, E.M., Hartwig, B., Bezdan, D., Jiao, W.B., Patel, V. et al. (2016) Chromosome‐level assembly of Arabidopsis thaliana Ler reveals the extent of translocation and inversion polymorphisms. Proc. Natl Acad. Sci. USA, 113, E4052–E4060.
  62. Zhang, L., Ren, S., Liu, X., Liu, X., Guo, F., Sun, W., Feng, X. et al. (2020) Mining of UDP‐glucosyltrfansferases in licorice for controllable glycosylation of pentacyclic triterpenoids. Biotechnol. Bioeng. 117, 3651–3663.
  63. Zhang, Q., Yue, S., Wang, W., Chen, Y., Zhao, C., Song, Y., Yan, D. et al. (2021) Potential role of gut microbiota in traditional Chinese medicine against COVID‐19. Am. J. Chin. Med. 49, 785–803.
  64. Zhao, Q., Yang, J., Cui, M.Y., Liu, J., Fang, Y., Yan, M., Qiu, W. et al. (2019) The reference genome sequence of Scutellaria baicalensis provides insights into the evolution of wogonin biosynthesis. Mol. Plant, 12, 935–950.
  65. Zhou, Z., Li, X., Liu, J., Dong, L., Chen, Q., Liu, J., Kong, H. et al. (2015) Honeysuckle‐encoded atypical microRNA2911 directly targets influenza A viruses. Cell Res. 25, 39–49.
  66. Zwaenepoel, A. and Van de Peer, Y. (2019) wgd: simple command line tools for the analysis of ancient whole genome duplications. Bioinformatics, 35, 2153–2155.

Associated Data

Supplementary Materials

Figure S1 Macroscopic characteristics of dried flower buds of Lonicera macranthoides (LM) and L. japonica (LJ).

Figure S2 Chemical profiling of Lonicera macranthoides (LM) and L. japonica (LJ).

Figure S3 Comparative principal component (PC) analysis (PCA) of Lonicera macranthoides (LM) and L. japonica (LJ).

Figure S4 Evaluation of the genome size of Lonicera macranthoides (LM).

Figure S5 Hi‐C intra‐chromosomal contact map for the genome assembly (2n = 18).

Figure S6 Venn diagram analysis of coding genes in the genome of Lonicera macranthoides.

Figure S7 KEGG pathway enrichment analysis of unique genes in Lonicera macranthoides compared with L. japonica.

Figure S8 Clusters of orthologous gene families in Lonicera macranthoides (LM), L. japonica (LJ), and 11 other fully‐sequenced plant species.

Figure S9 Ks distributions of paralogous genes within Lonicera macranthoides, L. japonica, Daucus carota, and Vitis vinifera.

Figure S10 Ks distributions in Lonicera macranthoides.

Figure S11 Ks distributions in Lonicera japonica.

Figure S12 Macrosynteny visualization among Lonicera macranthoides (LM), L. japonica (LJ), and Vitis vinifera (Viv) by intergenomic co‐linearity analysis.

Figure S13 Summary of the syntenic analysis between Lonicera macranthoides (LM) and Vitis vinifera (Viv).

Figure S14 Heatmap of differentially expressed genes involved in the hederagenin‐based saponin biosynthetic pathway in Lonicera macranthoides (LM) and L. japonica (LJ).

Figure S15 The amino acid sequence alignment of targeted oleanolic acid synthase (OAS, EVM0008033) in LM.

Figure S16 Phylogenetic tree of differentially expressed UDP‐glycosyltransferases (UGTs) between Lonicera macranthoides (LM) and L. japonica (LJ).

Figure S17 SDS‐PAGE analysis of the recombinant LmUGT73P1.

Figure S18 LC‐QTOF‐MS/MS analysis of product extracted from in vitro enzyme assay.

Figure S19 LC‐QTOF‐MS/MS confirmation of α‐hederin.

Figure S20 LC‐QTOF‐MS/MS analysis of product extracted from in vivo assay.

Figure S21 Evolution of genes involved in hederagenin‐based saponin biosynthesis in Lonicera macranthoides (LM).

Figure S22 Chromosomal localization analysis of genes involved in the biosynthetic pathway of hederagenin‐based saponins in Lonicera macranthoides.

Figure S23 Transcriptional expression profile of genes located in chromosome inversions and translocations regions.

Figure S24 Phylogenetic tree of OAS identified in LM and their relationships with reported OAS genes.

Figure S25 Sequence analysis of LmOAS1 and LmUGT73P1 in LM and LJ.

Table S1 Genome survey statistic results of Lonicera macranthoides.

Table S2 The filter data from original sequenced data of Lonicera macranthoides.

Table S3 The statistics of Hi‐C data.

Table S4 Genome assembly of Lonicera macranthoides.

Table S5 The quality estimation of Hi‐C data.

Table S6 The scaffold number and length grouped on pseudochromosomes.

Table S7 The quality estimation of genome assembly for Lonicera macranthoides using CEGMA database.

Table S8 The quality estimation of genome assembly for Lonicera macranthoides using BUSCO database.

Table S9 The quality estimation of genome assembly for Lonicera macranthoides using EST.

Table S10 The statistical results of gene prediction of Lonicera macranthoides.

Table S11 The statistical results of function annotation of Lonicera macranthoides genome.

Table S12 The statistical results of gene structure prediction of Lonicera macranthoides.

Table S13 Summary of repeat sequence in the Lonicera macranthoides (LM) genome.

Table S14 The statistical comparison results of Lonicera macranthoides and Lonicera japonica.

Table S15 List of species used in phylogenomic analyses.

Table S16 Summary statistics for gene family inference and gene counts for each family and each species.

Table S17 The copy number of genes involved in hederagenin‐based saponin biosynthesis.

Table S18 Primers used in this study.

PBI-21-2209-s002.docx (7.9MB, docx)

Data S1 The identified compounds from Lonicera macranthoides (LM) and L. japonica (LJ).

Data S2 The synonymous substitution rate (Ks) distribution of collinear genes between Lonicera macranthoides (LM) and L. japonica (LJ) genomes.

Data S3 List of the differentially expressed genes in green floral buds of Lonicera macranthoides (LM) and L. japonica (LJ).

Data S4 Transcriptional change tendency of key genes involved in hederagenin‐based saponin biosynthesis in green floral buds of Lonicera macranthoides (LM) versus L. japonica (LJ).

PBI-21-2209-s001.xlsx (2.3MB, xlsx)

Data Availability Statement

The genome and transcriptome sequence data have been deposited under NCBI BioProject numbers PRJNA800599 and PRJNA825098, respectively. Source data are provided with this paper.


Articles from Plant Biotechnology Journal are provided here courtesy of Society for Experimental Biology (SEB) and the Association of Applied Biologists (AAB) and John Wiley and Sons, Ltd