Horticulture Research. 2026 Jan 20;13(3):uhaf343. doi: 10.1093/hr/uhaf343

The jujube pan-TE map identifies transposable element insertions associated with domestication and pericarp development

Xingnuo Li 1,2,#, Aidi Zhang 3,#, Muqaddas Bano 4,5, Juan Jin 6, Qing Hao 7, Dingyu Fan 8, Liang Chen 9, Xiujun Zhang 10,11
PMCID: PMC12977166  PMID: 41821678

Abstract

Jujube (Ziziphus jujuba Mill.) is a fruit crop of high economic value, renowned for its distinctive flavor and wide range of phenotypic diversity. Despite major advancements in jujube genomics, the role of genetic variants underlying agronomic trait formation is still poorly understood. Here, we used seven high-quality jujube genomes to construct a pan-TE (transposable element) map and investigated how TEs shape genome evolution and agronomic traits. We found that TEs constitute 29.05%–30.38% of the genome, predominantly long terminal repeat (LTR) retrotransposons such as Copia and Gypsy. A positive correlation (R2 = 0.76) between TE content and genome size underscores their role in genomic expansion. TE insertions within gene bodies are associated with significantly reduced gene expression, particularly for genes involved in cell wall biosynthesis and glucose metabolism. Population-scale analysis of 1041 accessions identified 4176 transposable element insertion polymorphisms (TIPs) that distinguish wild and cultivated groups. Wild jujubes harbor stress-related TIPs (e.g. in peroxidase genes), whereas cultivated accessions carry TIPs linked to fruit development. Notably, a Gypsy insertion upstream of the cellulose synthase gene ZjCESA4 is associated with reduced expression and thinner pericarp in ‘Dongzao’ compared to ‘Huizao’. Similarly, a downstream LTR/Gypsy insertion near the MADS-box transcription factor gene ZjAGL18 correlates with suppressed expression, highlighting the recurrent targeting of key regulatory genes by TEs during domestication. Our findings demonstrate that TIPs are a major source of genetic variation in jujube, providing molecular markers for breeding programs that aim to balance fruit quality and stress resilience.

Introduction

Plant genomes encompass a significant quantity of transposable elements (TEs) [1]. By causing structural changes, altering gene regulation, and promoting adaptive responses, TEs have contributed to the evolution of genomes [2–9]. TEs can induce phenotypic changes by modifying the expression of endogenous genes, either through cis-acting or trans-acting regulation [10–12]. These regulatory mechanisms have been confirmed in plants by more in-depth research. For instance, in grape (Vitis vinifera), the insertion of the long terminal repeat (LTR) retrotransposon Gret1 (grapevine retrotransposon 1) upstream of the VvmybA1 gene in Cabernet grapes represses its expression, resulting in white fruit skins and giving rise to the white-skinned cultivar Chardonnay [13, 14]. In apple (Malus domestica), LTR retrotransposon insertions into the fourth or sixth intron of MdPI, a B-class MADS-box gene essential for flower development, resulted in a petal-less flower phenotype [15]. The insertion of a Copia LTR-RT in tomato abolishes the fruit pedicel abscission zone by disrupting functional JOINTLESS2 (J2) MADS-box transcription factor transcripts [16]. Whole genome sequencing data from 602 Solanum lycopersicum accessions were used to examine the roles of transposable element insertion polymorphisms (TIPs) in genetic diversity, with 40 TIPs found to be associated with key agronomic traits [17]. In Brassica rapa, four TIP-containing genes were identified that exhibit adaptive responses to diverse climatic conditions, facilitating the development of various vegetable crops [18].

Traditional studies of crop domestication have largely focused on single reference genomes and small-scale mutations (SNPs and InDels), overlooking the significant contributions of structural variations (SVs) to evolutionary processes. The advent of pan-genomics, enabled by the proliferation of whole genome assemblies, now allows systematic investigation of SVs and their impact on phenotypic diversity and genome evolution [19]. Indeed, SVs have been recognized as key regulators of agronomic traits such as flowering time, fruit taste, size, and yield [16, 20–24]. Among SVs, TIPs remain an understudied source of phenotypic variation, despite their potential to drive adaptive evolution [17]. Notably, in Rosaceae fruit crops, TE insertions have played pivotal roles in domestication by shaping key fruit quality traits. For instance, LTR retrotransposon insertions are associated with red skin color and reduced astringency in apple through disruption of MdMYB1 and MdMYBTT [25, 26], a TE insertion in the promoter of PpBL controls blood flesh color in peach [27], and an LTR retrotransposon in the promoter of PsMYB10.2 regulates flesh color in Japanese plum [28].

In jujube (Ziziphus jujuba Mill.), an economically important fruit crop in Asia, notable phenotypic variation exists in traits such as pericarp thickness, fruit morphology, flowering time, and shoot architecture [20, 21, 29]. Recent genomic efforts have produced draft genomes for ‘Dongzao’ [29], ‘Junzao’ [30], and ‘Suanzao’ [31], followed by gapless assemblies [32] and haplotype-resolved T2T genomes [33]. Population genomic analysis of 672 accessions yielded novel insights into domestication history. Four reference-grade genomes (including ‘Huizao’) were assembled, and population analysis of 1059 accessions was conducted to expand this resource, with pan-genome construction further identifying candidate genes for traits like seed setting rate, bearing shoot length, and leaf size. These candidate genes include the MADS-box transcription factor ZjAGL28, which regulates flowering and ripening time [34].

While studies in established crop systems have illuminated how TEs control traits such as fruit symmetry, pedicel abscission, and skin color, the role of TEs in governing fruit textural and morphological domestication traits remains less explored. Jujube (Z. jujuba Mill.) offers an excellent system for addressing this gap, owing to its unique and dramatic domestication syndrome. During its domestication, jujube has undergone remarkable divergence in fruit shape, orchestrated by key genes such as the ERF transcription factor ZjFS3 [35], and, most notably for this study, in pericarp texture and thickness. This is evidenced by the striking contrast between thin-skinned, crisp fresh-eating cultivars (e.g. ‘Dongzao’) and thick-skinned, tough dry-eating cultivars (e.g. ‘Huizao’), with the pericarp being a complex structure rich in cellulose and lignin [36]. This extreme variation in a fundamental textural trait, which is central to postharvest handling and utility, is a hallmark of jujube domestication not observed to the same degree in canonical models. Therefore, jujube presents a unique opportunity to unravel the genomic basis of pericarp differentiation and to discover TE-driven regulatory mechanisms that underlie fruit texture, a pivotal dimension of the domestication syndrome that has received limited research attention.

Although TEs are known to influence genome size and gene expression [2, 37] and constitute a major component of the jujube genome [31, 34], the specific role of TIPs in cultivar-specific adaptation remains uncharacterized. While previous studies have linked SNPs and CNVs to domestication traits [33, 34], the functional contribution of TIPs as cis-regulatory variants remains uninvestigated, especially when it comes to traits like pericarp development and their impacts on gene families including MADS-box factors. Beyond their established roles in flower and fruit development, MADS-box genes like AGL18 are pivotal regulators of fundamental processes such as flowering time and embryogenesis [38, 39]. Recent studies show that AGL18 homologs function in complex networks; for instance, they can form repression complexes to delay flowering or, through alternative splicing, generate isoforms that antagonistically regulate key developmental transitions. This intricate regulatory potential makes MADS-box genes compelling targets for transposable element-mediated diversification during domestication. Although TEs have been shown to control traits such as fruit symmetry and skin color in model species like tomato and grape, their role in regulating development and domestication-related processes is less understood. Jujube provides an ideal system to address this gap, as its domestication has led to a striking divergence in multiple horticulturally important traits, including pericarp texture and reproductive timing. This remarkable phenotypic diversity, especially in fruit texture and developmental timing, provides a unique opportunity to dissect the TE-mediated mechanisms underlying trait evolution. This dimension of domestication is rarely accessible in other fruit crops.

Here, we integrate genomic, transcriptomic, and population genetic data to investigate how TE insertions shape gene expression and phenotypic divergence in jujube. The seven jujube genomes were strategically selected to capture the core genetic and phenotypic diversity of the species. Our selection encompasses the wild progenitor (‘Suanzao’) and key cultivated varieties from the major eastern (‘Dongzao’, ‘Huizao’) and western (‘Junzao’) geographic regions. Furthermore, the inclusion of haplotype-resolved assemblies for ‘Huizao’, ‘Junzao’, and ‘Suanzao’ enables a more comprehensive detection of genetic variation, including heterozygous TIPs. This framework ensures that our pan-genome and TIPs analysis is grounded in a representative and evolutionarily meaningful sampling strategy. We analyzed TE dynamics across seven high-quality genomes through pan-genome comparison, transcriptomics, and population scale TIP profiling. Our study not only identifies TIPs associated with pericarp development but also reveals TE insertions near MADS-box genes, suggesting a broader role of TEs in regulating developmental transitions during domestication. Phylogenetic and population structure analyses confirm that these genomes capture the core genetic diversity of jujube, establishing a robust evolutionary framework for our study. Our findings reveal TE-mediated regulatory mechanisms contributing to pericarp differentiation and developmental timing, providing novel insights into the genomic basis of jujube domestication.

Results

Transposable element composition in jujube genomes

Based on the analysis of seven jujube genomes, namely ‘Dongzao’ (DZ), two haplotypes of ‘Huizao’ (HA and HB), two haplotypes of ‘Junzao’ (JA and JB), and two haplotypes of ‘Suanzao’ (SA and SB), approximately 29.05%–30.38% of sequences were transposable elements (TEs) (Table S1). LTR retrotransposons were the most abundant group, constituting 26.71% (SB) to 28.02% (DZ and JB) of the genomes, with the Copia and Gypsy superfamilies predominating. In the ‘Dongzao’ genome, Copia elements accounted for 7.89% and Gypsy elements for 11.02% of the genome. Unclassified repeats ranged from 0.6% (SA) to 0.71% (HB). Short interspersed nuclear elements (SINEs) were nearly negligible, detected only in trace amounts in HA, HB, and SB, while long interspersed nuclear elements (LINEs) consistently accounted for approximately 0.02% across all genomes. Among DNA transposons, DTM elements were the most abundant. For example, in DZ, DNA transposons accounted for 1.7% of the genome, with DTM elements contributing 1.68%. Other DNA transposon subclasses, such as DTA, DTC, DTH, and DTT, were less abundant, with proportions ranging from 0.01% (DTT) to 0.18% (DTC). Helitron elements were also relatively scarce, with proportions ranging from 0.06% in SA to 0.09% in JB.

A combined phylogenetic tree and bar chart (Fig. 1a and Fig. S1), with the tree constructed from single-copy genes, illustrated the relatedness among the jujube genomes alongside their TE composition. Furthermore, linear regression analysis found a significant positive association between assembled genome size and overall TE content (Fig. S2) (R2 = 0.76, P = 1.08e−2). This indicates that as assembled genome size increases, TE content also tends to increase. The nucleotide counts of total TE sequences in these jujube genomes were relatively similar, suggesting a certain degree of consistency in overall TE content at the sequence level.
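
The regression statistic reported above can be illustrated with a minimal sketch. The function below is an ordinary least-squares fit written for illustration only; it is not the authors' analysis script, and the input values are hypothetical placeholders rather than the measurements behind Fig. S2.

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared


# Hypothetical placeholder values (Mb), NOT data from the study.
genome_size_mb = [400.0, 410.0, 420.0, 430.0]
te_content_mb = [118.0, 121.0, 124.0, 127.0]
slope, intercept, r2 = linear_fit_r2(genome_size_mb, te_content_mb)
print(f"slope={slope:.3f}, R2={r2:.3f}")
```

With only seven assemblies, a fit of this kind is sensitive to individual points, so the R2 value is best read as descriptive.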

Figure 1.


Characteristics of jujube genomes’ transposable elements (TEs). (a) TE categorization and content in each of the seven genomes of jujubes. Single-copy genes from the seven jujube genomes were used to build the phylogenetic tree, and the accession ‘Ppe’ (P. persica) was used as an outgroup. Numbers at nodes indicate estimated divergence times in million years ago (MYA). DNA transposons were divided into six primary superfamilies, including Helitron, DTM (Mutator), DTA (hAT), DTC (CACTA), DTH (PIF/Harbinger), and DTT (Tc1/Mariner). (b) Comparison of gene expression levels (FPKM) between genes with and without TE insertions in genic regions across five developmental stages. Statistical significance was determined by Wilcoxon rank-sum tests (***P < 1e−150 for all stages). The boxplots show the distribution of expression values, with the lower expression of TIP-containing genes being consistently observed throughout fruit development. (c) Distribution of nonsynonymous to synonymous SNP ratios (dN/dS) in genes with and without TE insertions across different genomic regions. Boxplots show the median and quartiles of dN/dS ratios. Wilcoxon rank-sum tests indicate significantly higher dN/dS ratios in TIP-containing genes across all regions (**P < 1e−13 for all comparisons), with the most pronounced effect observed in coding sequences.

Gene expression was closely associated with transposable element insertions. TE insertions were found in introns of 1433 gene models in the ‘Dongzao’ genome. To quantitatively assess the impact of TE insertions on gene expression, we compared the expression levels (FPKM) of genes with and without TE insertions across five fruit developmental stages. Wilcoxon rank-sum tests revealed that genes with TE insertions in genic regions exhibited significantly lower expression levels than those without TE insertions at all developmental stages (all P < 1e−150) (Fig. 1b). This consistent pattern is in line with a suppressive effect of intragenic TE insertions on gene expression. To investigate the evolutionary constraints on genes with TE insertions, we compared the ratio of nonsynonymous to synonymous substitutions (dN/dS) across different genomic regions. Wilcoxon rank-sum tests revealed that genes with TE insertions had significantly higher dN/dS ratios than those without TE insertions in all genomic regions examined (genic: P < 1e−48; CDS: P < 1e−34; upstream: P < 1e−13; downstream: P < 1e−13) (Fig. 1c), indicating that TE insertions could be linked to a greater frequency of nonsynonymous mutations. These results suggest a possible connection among TE insertions, genome size, and the rate of gene evolution.
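
For readers unfamiliar with the test, the following is a minimal sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney U) comparison using the normal approximation. It is an illustration under stated assumptions, not the software used in the study; the function name and the FPKM values are hypothetical.

```python
import math


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (U statistic for sample a, z score, two-sided P value).
    Ties receive mid-ranks and the variance carries the standard tie
    correction. No continuity correction is applied.
    """
    n1, n2 = len(a), len(b)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        t = j - i                     # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_a += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_a = rank_sum_a - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every value identical: no evidence of a shift
        return u_a, 0.0, 1.0
    z = (u_a - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u_a, z, p


# Hypothetical FPKM values, NOT data from the study.
fpkm_with_te = [0.5, 1.2, 2.0, 3.1, 0.8]
fpkm_without_te = [4.5, 6.0, 2.5, 8.2, 5.1]
u, z, p = rank_sum_test(fpkm_with_te, fpkm_without_te)
print(f"U={u}, z={z:.2f}, P={p:.4f}")
```

The normal approximation is adequate for gene sets of the size compared here (thousands of genes); for very small samples an exact test would be preferable.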

Landscapes of TE insertion polymorphisms in jujube genomes

We employed the ITIPs pipeline (https://github.com/caixu0518/ITIPs) [18] to identify TE insertions across jujube genomes. Genome insertions and deletions (INDELs) were detected among seven jujube genomes (DZ, HA, HB, JA, JB, SA, and SB) to explore their genomic variations. The number of insertions varied markedly, ranging from 102 299 in DZ to 117 341 in JB, with total lengths spanning from 95 748 595 bp in DZ to 107 860 443 bp in HB. Similarly, deletion events ranged from 83 068 in DZ to 93 850 in SA, with total lengths from 98 779 965 bp in JA to 108 379 114 bp in SA (Table S2). Based on the transposable element composition in jujube genomes, we identified the portions of insertions and deletions in the seven jujube genomes that belong to TE insertions and deletions (Fig. S4).

We separated the jujube pan-genome into aligned and unaligned sections relative to the genomic sequence of ‘Dongzao’ (Fig. S3). Among the seven genomes, the aligned portions made up 76.2% to 88.0%, whereas the unaligned regions made up 12.0% to 23.8% (Table S3). In total, we found 6319 TE insertions in the aligned regions and 5314 in the unaligned regions. These insertions lay within genic regions, defined here as the gene body plus 2 kb upstream and downstream, and genes carrying them are referred to as TIP-containing genes. Across the seven jujube genomes, 27 706 genes were found to contain TIPs. In ‘Dongzao’, 3335 genes were TIP-containing, around 11% of its total genes, and they were distributed across all 12 chromosomes (Fig. 2a). Among all TIP-containing genes in the seven jujube genomes, TE insertions were found in the 1 kb upstream flanking region of 1044 to 1530 of the genes, in the 1 kb downstream flanking region of 924 to 1398 of the genes, in intronic regions of 904 to 1387 of the genes, and in coding regions of 172 to 739 of the genes (Fig. 2b and Table S5).
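
The positional categories used above (upstream, downstream, intron, coding region) can be made concrete with a small sketch. This is not the ITIPs implementation; the `Gene` record and `classify_insertion` function are hypothetical names, coordinates are assumed to be 1-based and inclusive, and untranslated regions are lumped with introns for simplicity.

```python
from dataclasses import dataclass
from typing import List, Tuple

FLANK_BP = 2000  # 2 kb flanking window, as used for genic regions in the text


@dataclass
class Gene:
    start: int                  # 1-based, inclusive
    end: int
    strand: str                 # "+" or "-"
    cds: List[Tuple[int, int]]  # CDS intervals, 1-based inclusive


def classify_insertion(gene: Gene, pos: int, flank: int = FLANK_BP) -> str:
    """Return 'CDS', 'intron', 'upstream', 'downstream', or 'intergenic'."""
    if gene.start <= pos <= gene.end:
        if any(s <= pos <= e for s, e in gene.cds):
            return "CDS"
        return "intron"  # simplification: UTRs are counted here as well
    if gene.start - flank <= pos < gene.start:
        # Left of the gene is upstream only for plus-strand genes.
        return "upstream" if gene.strand == "+" else "downstream"
    if gene.end < pos <= gene.end + flank:
        return "downstream" if gene.strand == "+" else "upstream"
    return "intergenic"


# Hypothetical minus-strand gene with two coding exons.
example = Gene(start=10_000, end=12_000, strand="-",
               cds=[(10_000, 10_500), (11_500, 12_000)])
for position in (9_000, 10_200, 11_000, 13_000, 20_000):
    print(position, classify_insertion(example, position))
```

Strand awareness matters here: for a minus-strand gene, an insertion at a lower coordinate than the gene lies downstream of it, not upstream.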

Figure 2.


Transposable element (TE) insertion distributions within the jujube genome. (a) Chromosomal distributions of repeats (i) and genes (ii) as well as TE insertion genes (iii) across the 12 chromosomes of the ‘Dongzao’ genome. Number of INDELs (iv) and TE insertion gene expression (v) in a sliding window of 1000 kb. (b) Number of genes with TE insertions in different jujube genomes. (c) Gene transcription levels in genic areas with and without TE insertions. The Student–Newman–Keuls test was used for multiple comparisons at a significance level of 0.01. (d) The numbers of various TE insertions inside the genic areas. Upstream and downstream (2 kb) of the gene body are considered genic areas. (e) The ratios of various TE insertions inside the genic areas. Upstream and downstream (2 kb) of the gene body are considered genic areas.

We also annotated each TIP and concentrated on the two main TE classes (DNA transposons and LTR retrotransposons) (Table S4). We found 488 DNA transposon TIPs and 5687 LTR retrotransposon TIPs in the ‘Dongzao’ reference genome. LTR retrotransposon TIPs were thus 11.65 times as numerous as DNA transposon TIPs, a ratio much higher than the ratio of LTR retrotransposon to DNA transposon content in the genome (6.24). In the other six jujube genomes, the ratio of LTR retrotransposon TIPs to DNA transposon TIPs varied from 9.44 to 13.15, again much greater than the corresponding genomic content ratios, which ranged from 6.01 to 6. Among LTR retrotransposons, most TIPs were Gypsy elements. We also determined how many TIPs were found in coding regions, introns, and within 2 kb flanking regions of genes (Fig. 2d). Copia (9.39%) and Gypsy (10.67%) TIPs were inserted in coding regions at rates similar to or higher than those of the DNA transposon superfamilies DTA (5.33%), DTC (9.02%), DTH (7.14%), and DTM (9.44%). Because insertions in coding regions may generate greater genetic diversity, this pattern implies that LTR retrotransposons may be more important in genome diversification (Fig. 2e).
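
The ratio comparison for ‘Dongzao’ can be checked directly from the counts given in the text. The short calculation below uses only those reported numbers; the "fold excess" quantity is an illustrative derived value, not a statistic reported by the authors.

```python
ltr_tips = 5687       # LTR retrotransposon TIPs in 'Dongzao' (from the text)
dna_tips = 488        # DNA transposon TIPs in 'Dongzao' (from the text)
content_ratio = 6.24  # LTR : DNA transposon genomic content ratio (from the text)

tip_ratio = ltr_tips / dna_tips
fold_excess = tip_ratio / content_ratio  # how far TIPs overshoot genomic content

print(f"LTR:DNA TIP ratio = {tip_ratio:.2f}")
print(f"excess over genomic content ratio = {fold_excess:.2f}x")
```

In other words, LTR retrotransposons are over-represented among polymorphic insertions by a little under twofold relative to their share of annotated TE sequence.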

TE insertions contribute to transcriptomic changes and trait variation

As is commonly observed, TE insertions tend to reduce gene expression levels (Fig. 2c), which may subsequently lead to phenotypic variation. ‘Dongzao’ and ‘Huizao’, two closely related cultivated jujube varieties, differ strikingly in their fruit skin characteristics: ‘Dongzao’ has relatively thin skin, whereas ‘Huizao’ has thicker skin. In the reference genome of ‘Dongzao’, we identified 3310 genes harboring TE insertions within gene bodies or their flanking regions. Of these, 1875 genes differ between ‘Dongzao’ and ‘Huizao’ in TE insertions or deletions in their genic or flanking regions: 974 genes carry TE insertions in ‘Dongzao’, while the other 901 show TE deletions in ‘Dongzao’ relative to ‘Huizao’ (Fig. 3b and Table S9). This extensive polymorphism highlights the potential of TIPs to drive cultivar differentiation. Gene ontology (GO) enrichment analysis of these 1875 genes showed significant enrichment for terms including ‘cellulose biosynthetic process’ and ‘hydrolase activity, hydrolyzing O-glycosyl compounds’ (Fig. 3d), linking the global TE insertion landscape to pathways governing cell wall structure and pericarp development. To connect TIPs to transcriptional changes, we mapped transcriptomic data from both cultivars across five fruit developmental stages using ‘Dongzao’ as the reference and identified 203 differentially expressed genes (DEGs). Wilcoxon tests confirmed that intragenic TE insertions were associated with significantly stronger suppression of gene expression than insertions in flanking regions (P < 0.001), underscoring the functional importance of insertion context (Fig. S5).
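
GO enrichment of a gene set is commonly assessed with a one-sided hypergeometric test. The text does not state which test or tool was used, so the sketch below is an assumption made for illustration; the background size, term size, and overlap are hypothetical, and only the gene-set size (1875) comes from the text.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, term_genes, selected_genes, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    total_genes:    annotated genes in the background
    term_genes:     background genes carrying the GO term
    selected_genes: size of the gene set being tested
    overlap:        genes in the set that carry the GO term
    """
    upper = min(term_genes, selected_genes)
    denom = comb(total_genes, selected_genes)
    numer = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, selected_genes - k)
        for k in range(overlap, upper + 1)
    )
    return numer / denom


# Only 1875 is taken from the text; the other numbers are hypothetical.
p_value = hypergeom_enrichment_p(total_genes=30_000, term_genes=40,
                                 selected_genes=1875, overlap=10)
print(f"enrichment P = {p_value:.3g}")
```

In practice, P values from many GO terms would also need multiple-testing correction, which is omitted here.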

Figure 3.


TE insertions influence gene expression related to pericarp development in ‘Dongzao’ and ‘Huizao’. (a) Heatmap of the expression of TIP-harboring DEGs across five developmental stages of jujube. (b) Venn diagram of TIP-containing genes in ‘Dongzao’ (DZ) and ‘Huizao’ (HZ), illustrating the number of genes with TE insertions unique to each cultivar or shared between them. (c) Co-expression network of DEGs with TE insertions. Nodes represent genes, and edges represent significant expression correlations. (d) GO enrichment analysis of TIP-containing genes identified in the DZ–HZ comparison. Significantly enriched terms related to cell wall biology are shown.

Among the 203 DEGs, 25 contained TE insertions. Functional annotation revealed that eight of these were associated with glycosyl hydrolase activity and three were involved in cell wall biogenesis and cellulose synthesis (Fig. S6). Expression analysis of these TE-associated DEGs revealed a striking disparity: genes related to cell wall biogenesis were highly expressed in ‘Huizao’ but showed low expression levels in ‘Dongzao’ (Fig. 3a). GO enrichment analysis of these DEGs further confirmed the involvement of key terms such as ‘hydrolase activity, hydrolyzing O-glycosyl compounds’ and ‘carbohydrate metabolic process’ (Fig. S7). Furthermore, these genes were interconnected in a co-expression network (Fig. 3c).

These results support a coherent model: the highly polymorphic TE insertions between ‘Dongzao’ and ‘Huizao’ are enriched in cell wall-related pathways; these insertions, particularly those within gene bodies, are associated with significantly lower gene expression; and a subset of them is linked to differential expression of genes critical for pericarp development, which may ultimately contribute to the divergent fruit texture observed between the two cultivars.

Population structure analysis based on TIPs

To explore the impacts of TIPs on the domestication of jujube (Z. jujuba Mill.) morphotypes, we analyzed TE insertions in 1141 diverse jujube genomes [34]. Using the ITIPs pipeline [18], we identified TIPs across these accessions. After filtering out untyped loci and samples with excessive missing genotypes, 1041 high-quality samples and 4176 TE insertion sites were retained for downstream analysis.
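
The filtering step described above can be sketched as follows. The missing-rate thresholds and the order of the two filters are assumptions made for illustration, since the text does not report them, and `filter_tip_matrix` is a hypothetical helper rather than part of the ITIPs pipeline.

```python
def filter_tip_matrix(genotypes, max_sample_missing=0.2, max_site_missing=0.2):
    """Drop poorly typed samples, then poorly typed TIP sites.

    genotypes: dict mapping sample name -> list of calls per TIP site,
               with None marking a missing (untyped) call.
    Thresholds are illustrative assumptions, not the study's values.
    Returns (filtered genotypes, indices of retained sites).
    """
    n_sites = len(next(iter(genotypes.values())))
    # Step 1: remove samples with an excessive fraction of missing calls.
    kept = {
        sample: calls
        for sample, calls in genotypes.items()
        if sum(c is None for c in calls) / n_sites <= max_sample_missing
    }
    # Step 2: remove sites that are mostly untyped among the remaining samples.
    keep_idx = [
        i for i in range(n_sites)
        if kept
        and sum(calls[i] is None for calls in kept.values()) / len(kept) <= max_site_missing
    ]
    filtered = {s: [calls[i] for i in keep_idx] for s, calls in kept.items()}
    return filtered, keep_idx


# Toy matrix with hypothetical calls (1 = insertion present, 0 = absent).
toy = {
    "acc_A": [1, 0, None],
    "acc_B": [1, None, None],
    "acc_C": [0, 1, None],
    "acc_D": [None, None, None],
}
filtered, sites = filter_tip_matrix(toy, max_sample_missing=0.5)
print(filtered, sites)
```

Filtering samples before sites avoids discarding informative loci merely because a few low-quality accessions were untyped there.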

Population structure analysis revealed an optimal genetic clustering (K = 6), clearly differentiating wild and cultivated jujube populations, as well as resolving five distinct subgroups within the cultivated gene pool. The phylogenetic tree constructed from TIP data (Fig. 4a) recapitulated the divergence between wild and cultivated lineages, consistent with previous genomic studies based on whole genome SNPs [34]. Geographical distribution analysis showed that wild jujube accessions primarily originated from Shanxi, Henan, and Hebei provinces in northern China’s Yellow River Basin, aligning with the species’ hypothesized center of origin. Among the cultivated subgroups, accessions from eastern regions (east of the Taihang Mountains) made up most of Groups II, IV, and V, whereas Groups I and III were mostly from western regions (west of the Taihang Mountains). A principal component analysis (PCA) based on TIPs and genetic components in 1041 jujube genomes also showed that TIPs were closely linked to the domestication of jujube morphotypes (Fig. 4a and Fig. S8). These findings point to a close relationship between TE-driven genetic variation and regional adaptation throughout jujube domestication.

Figure 4.


TIPs linked to selection and domestication of jujube morphotypes. (a) The genetic components calculated for 1041 jujube accessions using TIPs. (b) Genome-wide TIPs used to identify genomic signatures of selection in the ‘Dongzao’ genome. Significantly enriched TIPs are shown by larger and more prominent markers. (c) Genotypes of the top 50 selection signatures in the wild and eastern populations in jujube. The phylogenetic tree of 1041 jujube accessions was constructed using TIPs. ‘CC’ indicates consistency with the reference genome, ‘GG’ indicates variation from the reference genome, while missing loci (‘NN’) and heterozygous loci (‘CG’) are colored in gray and yellow, respectively.

Identification of significantly enriched TIPs in geographically distinct jujube populations

To identify TIPs under selection during domestication, we analyzed their genomic distribution across 1041 jujube accessions. Three geographical groupings were used for trait domestication analysis: wild populations, western cultivated subgroups (Groups I and III), and eastern cultivated subgroups (Groups II, IV, and V). In each comparison, accessions from the target group were designated as the derived group, while all others served as the control. Using Fisher’s exact test, we identified significant enrichments of TIPs in each group: 2365 TIPs linked to 1726 genes in the western group (including ‘Dongzao’ and ‘Huizao’; Table S7 and Fig. 4b), 2703 TIPs associated with 1948 genes in the wild group (Fig. 4c and Table S6), and 2211 TIPs corresponding to 1688 genes in the eastern group (Table S8).
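
The per-TIP comparison described above reduces to a 2×2 contingency table (target group versus all other accessions, insertion present versus absent). The following is a minimal sketch of a two-sided Fisher's exact test on such a table; it is illustrative only, the counts are hypothetical, and the significance threshold and any multiple-testing correction used in the study are not reproduced here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows:    target group / all other accessions
    Columns: insertion present / insertion absent
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of the table whose top-left cell equals x,
        # with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts for one TIP: 8 of 10 target-group accessions carry the
# insertion, versus 1 of 6 accessions in the control group.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"P = {p_value:.4f}")
```

Because thousands of TIPs are tested, each against the same grouping, P values from a scan of this kind are usually adjusted for multiple comparisons before calling enrichments.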

Functional annotation of wild-enriched TIPs revealed strong associations with stress resistance pathways, including peroxidase activity (GO:0004601) and reactive oxygen species metabolism (GO:0006979). To assess whether this TIP enrichment reflected a general propensity for TE accumulation, we compared TE density patterns between peroxidase and ROS-related genes and the genome-wide background. Interestingly, we observed an opposite trend: while ‘Suanzao’ (a member of the wild group) exhibited the highest TE density within these specific stress-related genes, it showed the lowest genome-wide TE density among the four accessions. This inverse relationship indicates that peroxidase and ROS-related genes are not innate TE hotspots. Instead, the elevated TE density specifically in the wild group suggests selection for these insertions, which may bolster environmental resilience (Fig. 5c and d).

Figure 5.


Population-specific and shared TIPs and their functional impacts. (a) Venn diagram showing the distribution of TIPs among wild, western cultivated, and eastern cultivated jujube populations. Numbers indicate the count of TIPs unique to each group or shared between them. (b) GO enrichment analysis of genes associated with TIPs. The chart shows significantly enriched GO terms for the core TIPs shared by all three groups (Shared) and for TIPs unique to each lineage (wild, western, eastern). The dashed vertical line indicates the threshold for statistical significance (−log10(P) at P = 0.05). (c) Comparison of transposable element (TE) insertion density in peroxidase/ROS-related gene regions. Density was calculated as the cumulative length of all TE insertions within peroxidase/ROS-related genes (including 1 kb upstream and 1 kb downstream flanking regions) divided by the total length of all peroxidase/ROS-related genomic regions (gene bodies plus 1 kb upstream and 1 kb downstream sequences). P values (0.0610, 0.0168) are from pairwise comparisons. (d) Comparison of genome-wide TE insertion density. Density was calculated as the cumulative length of all TE insertions (bp) divided by the total length of all TIP-containing genes with their 1 kb upstream and downstream flanking regions.
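
The density definition in panels (c) and (d) can be written out as a short calculation. This sketch assumes a single chromosome, 1-based inclusive coordinates, gene regions that already include the 1 kb flanks and do not overlap one another, and insertions clipped to the part that falls inside a region; these are simplifying assumptions, and `te_density` is a hypothetical helper.

```python
def te_density(regions, insertions):
    """TE insertion density = inserted TE bp inside regions / total region bp.

    regions:    list of (start, end) gene regions, flanks already included,
                assumed non-overlapping
    insertions: list of (start, end) TE insertion intervals,
                assumed non-overlapping
    """
    total_region_bp = sum(end - start + 1 for start, end in regions)
    te_bp = 0
    for r_start, r_end in regions:
        for t_start, t_end in insertions:
            overlap = min(r_end, t_end) - max(r_start, t_start) + 1
            if overlap > 0:
                te_bp += overlap  # count only the part inside the region
    return te_bp / total_region_bp


# Hypothetical coordinates, NOT data from the study.
gene_regions = [(1, 1000), (2001, 3000)]
te_insertions = [(901, 1100), (2500, 2599)]
density = te_density(gene_regions, te_insertions)
print(f"TE density = {density:.3f}")
```

Normalizing by region length is what allows the stress-related gene set to be compared with the genome-wide background, since the two differ greatly in total size.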

To dissect the contributions of shared and private genetic variants, we cataloged TIPs unique to or shared among the wild, eastern, and western groups (Fig. 5a). Functional analysis of these TIPs revealed a clear partition between core and specialized adaptations. The TIPs shared by all three lineages were enriched in genes encoding fundamental cellular resilience mechanisms, including heat shock protein binding and the GPI anchor transamidase complex (Fig. 5b). This suggests that these insertions represent a foundational, ancestral adaptation that is core to the jujube genome’s functional architecture. In contrast, the lineage-unique TIPs reveal distinct evolutionary trajectories. Wild-specific TIPs were enriched for ubiquitin protein ligase activity, indicating selection for fine-tuned control of protein degradation, a crucial mechanism for responding to environmental stresses. Western-specific TIPs were linked to FAD binding (FAD is a key cofactor for numerous oxidoreductases), potentially enhancing energy metabolism and photosynthetic efficiency. Eastern-specific TIPs were associated with pyridoxal phosphate binding and general biosynthetic process, pointing toward metabolic specialization. This TIP landscape supports a two-layer evolutionary model: an initial layer of shared TE insertions that conferred essential, basal cellular robustness, upon which a second layer of lineage-specific insertions was superimposed to tailor physiology to distinct ecological and agronomic niches, thereby driving diversification.
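
The shared versus lineage-unique partition underlying the Venn diagram amounts to simple set operations on TIP identifiers. The sketch below shows one way to compute it; the function name and the TIP identifiers are hypothetical.

```python
def partition_tips(wild, western, eastern):
    """Split TIPs into those shared by all groups and those unique to one.

    Returns (shared, unique), where `unique` maps group name -> set of TIPs
    found in that group and in no other group.
    """
    groups = {"wild": set(wild), "western": set(western), "eastern": set(eastern)}
    shared = groups["wild"] & groups["western"] & groups["eastern"]
    unique = {}
    for name, members in groups.items():
        others = set().union(*(g for other, g in groups.items() if other != name))
        unique[name] = members - others
    return shared, unique


# Hypothetical TIP identifiers, NOT loci from the study.
shared, unique = partition_tips(
    wild={"tip1", "tip2", "tip3"},
    western={"tip2", "tip3", "tip4"},
    eastern={"tip3", "tip5"},
)
print(sorted(shared), {k: sorted(v) for k, v in unique.items()})
```

TIPs present in exactly two groups (such as `tip2` in this toy input) fall into neither category and correspond to the pairwise overlap regions of the Venn diagram.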

TIP-containing genes under strong selection govern pericarp differentiation between ‘Dongzao’ and ‘Huizao’

Among these candidates, we identified TIP-containing genes potentially involved in pericarp thickness variation in jujube. Through integrated analyses of population-scale TIP distributions, genomic TE insertions, and transcriptomic data, we identified Chr09.922 as a key candidate gene (designated ZjCESA4). A comparison of three ‘Dongzao’ and three ‘Huizao’ accessions revealed an LTR/Gypsy-type TE insertion (DDZ467) in the CDS of ZjCESA4, which was present in ‘Dongzao’ but absent in ‘Huizao’. In ‘Dongzao’, the expression of ZjCESA4 was considerably lower than in ‘Huizao’, as measured by FPKM, across the five fruit developmental phases (young fruit, fruit swelling, white ripe, half red, and all red) (Fig. 6a). Notably, during the fruit swelling stage, ‘Huizao’ exhibited an FPKM of 270, while ‘Dongzao’ showed only 19, a roughly 14-fold difference. Statistically significant differences were observed at multiple developmental stages (marked by *).

Figure 6.


TE insertion and expression analysis of ZjCESA4, a key gene for pericarp differentiation in jujube. (a) Expression levels (FPKM) of ZjCESA4 in ‘Dongzao’ (DZ) and ‘Huizao’ (HZ) across five fruit developmental stages (young fruit, fruit swelling, white ripe, half red, all red). Asterisks (*) indicate stages with significant differences (Student’s t-test, P < 0.05). (b) Distribution of TIPs upstream of ZjCESA4 in ‘Dongzao’ (DZ) and ‘Huizao’ (HZ) accessions. ‘Huizao’ shows enrichment of accessions without TE insertions (−TE), while ‘Dongzao’ shows enrichment of accessions with TE insertions (+TE). (c) Gene structure diagram of ZjCESA4, showing Gypsy insertion (DDZ467) located in CDS regions. Exons are denoted by boxes and introns by lines. (d) Population-level validation of the DDZ467 TE insertion frequency associated with pericarp thickness. The insertion ratio [number of accessions with the insertion (+TE)/total number of genotyped accessions] of the key Gypsy-type TE (DDZ467) in the cellulose synthase gene ZjCESA4 was compared between grouped jujube accessions. The dry group comprises thick pericarp varieties (‘dazao’, ‘suanzao’, ‘youzao’), while the fresh group comprises thin pericarp, fresh-eating varieties (‘jidanzao’, ‘mizao’, ‘niunaizao’).

The distribution plot (Fig. 6b) shows that all three ‘Dongzao’ accessions carry the TE insertion, whereas all three ‘Huizao’ accessions lack it, reinforcing the association between the TE insertion and reduced gene expression. The gene structure diagram (Fig. 6c) further visualizes the TE’s position relative to the gene and its exons, suggesting a potential regulatory disruption. These findings highlight the potential of TE insertions to modulate gene expression. This may help explain the disparity in pericarp properties between ‘Dongzao’ and ‘Huizao’, which is likely attributable to variation in cellulose content.

To validate our findings beyond the ‘Dongzao’ and ‘Huizao’ comparison, we leveraged our population scale dataset. We strategically defined groups of thick-skinned (dry group: ‘dazao’, ‘suanzao’, ‘youzao’) and thin-skinned (fresh group: ‘jidanzao’, ‘mizao’, ‘niunaizao’) varieties, which are consistently documented in the authoritative Flora of China [40]. Analysis of the TE insertion frequency in pericarp thickness-related candidate genes revealed a significantly higher frequency of the ‘+TE’ allele (e.g. the Gypsy insertion upstream of ZjCESA4) in the fresh eating group compared to the dry eating group (Fig. 6d). This differential distribution across groups with divergent pericarp traits provides strong population genetic support for the role of these TIPs in pericarp differentiation.
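The insertion ratio compared in Fig. 6d is defined in the legend as the number of accessions with the insertion (+TE) divided by the total number of genotyped accessions. A minimal Python sketch of that calculation is given below. It is an illustration rather than the script used in this study: the function name is hypothetical, the genotype codes follow those described in the Methods and Fig. 7b (‘GG’, ‘CG’, ‘CC’, ‘NN’), and it assumes that heterozygous (‘CG’) accessions count as +TE and that ‘NN’ accessions are excluded from the denominator.

```python
from typing import Iterable, Optional


def insertion_ratio(genotypes: Iterable[str]) -> Optional[float]:
    """Fraction of genotyped accessions that carry the TE insertion.

    Assumptions (not stated in the text): 'CC' and 'CG' both count as +TE,
    and 'NN' (missing) accessions are not counted as genotyped.
    """
    genotyped = [g for g in genotypes if g != "NN"]
    if not genotyped:
        return None  # no genotyped accession, so the ratio is undefined
    carriers = sum(1 for g in genotyped if g in ("CC", "CG"))
    return carriers / len(genotyped)


# Hypothetical genotype calls for two variety groups (illustrative only).
fresh_group = ["CC", "CC", "GG", "NN"]
dry_group = ["GG", "GG", "CG", "GG"]
print(insertion_ratio(fresh_group), insertion_ratio(dry_group))
```

Returning `None` for a group without genotyped accessions keeps an undefined ratio distinct from a true frequency of zero.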

A MADS-box transcription factor gene harbors a TE insertion associated with population differentiation

We experimentally validated a novel LTR/Gypsy insertion, DDZ2655, located downstream of the MADS-box transcription factor gene Chr11.950 (designated ZjAGL18) (Fig. 7d). Using polymerase chain reaction (PCR) primers flanking the insertion site (Fig. 7c), we confirmed the presence/absence polymorphism: the TE insertion in ‘Dongzao’ prevented PCR amplification, resulting in no observable band, whereas its absence in ‘Huizao’ allowed successful amplification of the expected fragment (Fig. 7a and f). Genotyping of DDZ2655 across 1041 jujube accessions revealed a pronounced population-specific distribution (Fig. 7b). The insertion was predominantly found in the western cultivated group, present at a moderate frequency in the eastern cultivated group, and largely absent from the wild population. This pattern suggests that the DDZ2655 insertion was targeted by selection during domestication, particularly in western cultivars. Expression analysis showed that accessions carrying the DDZ2655 insertion exhibited significantly lower expression levels of ZjAGL18 across multiple fruit developmental stages compared to those without the insertion (Fig. 7e). This correlation suggests that the downstream TE insertion may act as a cis-regulatory variant that represses the expression of this key developmental transcription factor. Orthologs of ZjAGL18 in other species, such as BjuAGL18-1 in Brassica juncea, act as repressors of flowering. Because ZjAGL18 expression is reduced in accessions carrying the TE insertion, its suppression in jujube could be expected to promote an earlier flowering phenotype. This aligns with the general domestication trend of selecting for accelerated reproductive cycles, suggesting that the TE insertion near ZjAGL18 may have been selected for its impact on flowering time, a key agronomic trait. Coupled with the previously identified TE-associated MADS-box gene ZjAGL28 [34], the discovery of DDZ2655 underscores the recurrent targeting of MADS-box genes by TE insertions during jujube domestication, pointing to their pivotal role in the evolution of domesticated traits.

Figure 7.

Characterization of the TE insertion DDZ2655 downstream of the MADS-box gene ZjAGL18. (a) Distribution of TIPs downstream of ZjAGL18 in ‘Dongzao’ (DZ) and ‘Huizao’ (HZ) accessions. ‘Huizao’ shows enrichment of accessions without TE insertions (−TE), while ‘Dongzao’ shows enrichment of accessions with TE insertions (+TE). (b) Genotype distribution of the DDZ2655 insertion across wild, eastern cultivated, and western cultivated jujube populations. Genotypes are classified as homozygous reference (GG), heterozygous (CG), or homozygous for the TE insertion (CC). (c) Schematic diagram of the PCR-based genotyping strategy for the DDZ2655 insertion. Arrows indicate the positions of primers 2655-F and 2655-R, which flank the insertion site. (d) Gene structure diagram of ZjAGL18, showing the Gypsy insertion (DDZ2655) located in the downstream region. Exons are denoted by boxes and introns by lines. (e) Expression levels (FPKM) of ZjAGL18 in ‘Dongzao’ (DZ) and ‘Huizao’ (HZ) across five fruit developmental stages (young fruit, fruit swelling, white ripe, half red, all red). Asterisks (*) indicate stages with significant differences (Student’s t-test, P < 0.05). (f) PCR validation of the DDZ2655 insertion. Primer positions are indicated in (c), with amplification results showing the presence of the TE insertion in ‘Dongzao’ and its absence in ‘Huizao’.

Discussion

TE insertions modulate gene expression and phenotypic variations

Our analysis of seven jujube genomes shows that transposable elements (TEs) constitute 29.05%–30.38% of the assembled sequences, with LTR retrotransposons such as Copia and Gypsy being the most prevalent types, consistent with previous findings in jujube [34]. The positive correlation (R2 = 0.76) between total TE content and genome size underscores their role as key drivers of genomic expansion, aligning with their established function in plant genome evolution [1, 2]. Moreover, the enrichment of LTR retrotransposons in coding regions highlights their potential role in generating adaptive genetic variation, consistent with their ‘molecular drive’ in plant evolution [20, 24, 41–44].

Genes carrying TE insertions showed significant differences in expression, a pattern previously reported in blood oranges, tomatoes, and soybeans [5, 8, 45–47]. Such insertions may disrupt splicing machinery or recruit repressive chromatin marks, resulting in transcriptional silencing [37]. The elevated nonsynonymous-to-synonymous SNP ratio in TIP-containing genes further suggests that TEs may accelerate coding sequence evolution, potentially through insertional mutagenesis or altered selection pressures. Such regulatory impacts may also underlie morphological differences. The contrasting pericarp thickness between ‘Dongzao’ and ‘Huizao’ provides a compelling case study for TE-driven phenotypic variation. Our integrated analysis identified a Gypsy insertion (DDZ467) in the coding sequence of the cellulose synthase gene ZjCESA4 as a key candidate. The presence of this insertion in ‘Dongzao’ was coupled with a drastic reduction in gene expression, most notably during the fruit swelling stage when cell wall biosynthesis is most active (Fig. 6a and c). The functional impact of this suppressed expression is supported by physiological data, which show a consistently lower cellulose content in ‘Dongzao’ pericarp compared to ‘Huizao’ [48]. Furthermore, population genetic analysis showed a higher frequency of this TE insertion in fresh-eating, thin-pericarp varieties (Fig. 6d). Therefore, the ZjCESA4 locus illustrates a trajectory from a population-specific TE polymorphism, through the disruption of gene expression, to a biochemical difference that underpins a key domestication trait.

Notably, TEs also appear to recurrently target key regulatory gene families. In addition to ZjCESA4, we identified a downstream LTR/Gypsy insertion (DDZ2655) near the MADS-box transcription factor gene ZjAGL18. This insertion was associated with suppressed expression and showed a structured distribution across wild and cultivated groups, suggesting selection during domestication. Together with the previously reported TE-associated MADS-box gene ZjAGL28 [34], these cases highlight the repeated recruitment of TIPs in modulating developmental regulators—a mechanism that may have broadly influenced the evolution of domesticated traits in jujube. The recurrent targeting of MADS-box genes, including ZjAGL18 and ZjAGL28, underscores the central role of this gene family in jujube domestication. The functional characterization of AGL18 orthologs in other species reveals its involvement in critical developmental switches, from somatic embryogenesis to the precise control of flowering time via antagonistic protein isoforms [38, 39]. The modulation of ZjAGL18 expression by a TE insertion illustrates how jujube domestication could co-opt complex, pre-existing regulatory networks. Transposable elements likely acted as natural mutational tools to fine-tune such master regulators, ultimately shaping agronomically vital traits including reproductive timing and embryogenic capacity, which are fundamental for cultivar selection.

These findings align with studies in other species, where TE insertions in regulatory regions alter spatiotemporal gene expression by disrupting cis-regulatory elements [49–52]. In jujube, TEs have thus served as a major source of both structural and regulatory variation, shaping not only genomic architecture but also key agronomic traits.

TIPs shape population genetics and domestication signatures

Wild and cultivated jujube populations were clearly genetically separated, according to population structure analysis based on TE insertion polymorphisms (TIPs). Cultivated groups formed different geographic clades (western vs. eastern), whereas wild accessions clustered independently. These findings highlight the use of TIPs for determining domestication history and are in line with earlier SNP-based research [34]. Wild populations exhibited enrichment of TIPs in stress-related genes [e.g. peroxidase activity, reactive oxygen species (ROS) metabolism], reflecting their adaptation to harsh environments [53,54]. Crucially, we observed a strikingly converse trend between the wild group and the genome-wide TE density; while wild jujube exhibits the lowest overall TE density, its peroxidase and ROS-related genes display the highest density of TE insertions among all groups (Fig. 5c and d). This inverse pattern rules out passive accumulation and provides strong evidence for positive selection, indicating that TIPs were specifically recruited to fine-tune these key defense genes for adaptation to harsh environments.

The functional divergence of TIPs was further dissected by categorizing them into shared and lineage-specific sets. This revealed a two-layer evolutionary model. The first layer consists of TIPs shared by all wild and cultivated populations, which are enriched in fundamental cellular resilience pathways such as heat shock protein binding and the GPI–anchor transamidase complex (Fig. 5b). This core set of insertions likely represents an ancestral adaptation that is fundamental to the jujube genome. The second layer comprises lineage-unique TIPs that have driven recent diversification: wild-specific TIPs are linked to ubiquitin protein ligase activity for precise stress response control; western-specific TIPs to FAD binding for enhanced energy metabolism; and eastern-specific TIPs to pyridoxal phosphate binding and biosynthetic process for metabolic specialization (Fig. 5b). This layered model illustrates how initial domestication leveraged standing TE variation, followed by lineage-specific adaptations fueled by new insertions.

Our findings establish TEs as key drivers of genomic and phenotypic diversity in jujube and extend the growing body of evidence on the role of TIPs in fruit domestication, which has so far come largely from Rosaceae crops. Previous research in apple, peach, and Japanese plum has demonstrated that TIPs can be major regulators of pigmentation and taste, primarily by disrupting or modulating transcription factors involved in pigment and proanthocyanidin biosynthesis (e.g. MdMYB1, PpBL, PsMYB10.2, MdMYBTT) [25–28]. In contrast, our work reveals a TE-mediated mechanism underlying a fundamental textural trait, pericarp thickness. The association between a Gypsy insertion and the expression of a cellulose synthase gene (ZjCESA4) highlights a previously underappreciated pathway: the direct modulation of cell wall biosynthesis machinery. This finding indicates that the impact of TIPs in fruit domestication extends beyond the well-documented realms of color and metabolism to include structural properties, a dimension less explored in related species. The pronounced divergence in pericarp thickness between jujube cultivars thus provides a unique model to dissect this distinct class of TE-driven adaptation.

The functional impact of this mechanism is directly confirmed by physiological data showing lower cellulose content in ‘Dongzao’ pericarp [48]. The identified TIPs, such as DDZ467 in ZjCESA4, thus provide valuable molecular markers for breeding programs aiming to introgress stress tolerance traits from wild accessions while maintaining desirable fruit quality characteristics. The practical application of these TIPs is facilitated by their nature as presence/absence polymorphisms, which allows for their straightforward conversion into simple, robust, and cost-effective PCR-based assays (e.g. using insertion-specific primers), making them highly accessible for routine germplasm screening. Furthermore, as stable structural variants, these TIPs are expected to be heritable and consistent across generations, providing reliable selection targets. While linkage drag remains a general consideration in marker-assisted breeding, our population genetic atlas of TIPs provides a foundational resource for future fine-mapping efforts to pinpoint the causal variants responsible for the traits, thereby enabling more precise selection and minimizing the introgression of undesirable genomic segments.

Materials and methods

Plant materials and sequencing

For RNA sequencing, fresh leaf samples of the dried jujube cultivar ‘Huizao’ were collected in 2023 from the Fresh Jujube Production Demonstration Park experimental base in Yangtake Township, Maigeti County, Xinjiang Uygur Autonomous Region, China (70.40°E, 38.54°N). For transcriptomic analysis, fruits from ‘Huizao’ and from ‘Dongzao’, a fresh-eating jujube cultivar, were sampled at five fruit development stages (31, 63, 78, 98, and 108 days after flower blooming), corresponding to the young fruit stage, expansion stage, white-ripening stage, half-ripening stage, and full-ripening stage. The genome assembly of ‘Huizao’ and the raw transcriptome sequencing data produced in this study have been deposited at the National Center for Biotechnology Information (NCBI) under BioProject accession number PRJNA1262080.

High molecular weight genomic DNA (gDNA) was isolated from ‘Huizao’ leaf tissue using a modified CTAB protocol. For sequencing, we constructed and sequenced three complementary libraries: an Illumina short insert library (350 bp) on a NovaSeq 6000 platform, a PacBio HiFi library on a Sequel II system, and an Oxford Nanopore ultra-long read library on a PromethION device. A Hi-C library was also prepared and sequenced on Illumina NovaSeq to support chromosome-scale scaffolding.

The initial contig-level assembly was generated from the PacBio HiFi and Nanopore reads using HiFiasm (version:0.18.2-r467). The contigs were then scaffolded into chromosomes by aligning Hi-C reads with Juicer (version:1.6) and performing ordering and orientation with 3D-DNA. To achieve a gap-free, telomere-to-telomere (T2T) assembly, ultra-long Nanopore reads were independently assembled with NextDenovo. The resulting contigs were mapped to the chromosomal assembly using Winnowmap to identify and close any remaining gaps, resulting in a complete, haplotype-resolved reference genome for ‘Huizao’.

The jujube resequencing data used in this study are available from the NCBI under BioProject accession numbers PRJNA1051535 and PRJNA560664. For the haplotype-resolved T2T genome assemblies of ‘Suanzao’ and ‘Junzao’, annotation files for protein-coding genes and other relevant information were obtained from https://figshare.com/s/ad5d747ccc2ccbb2b65b. We downloaded the genome sequence and annotation data for ‘Dongzao’ from https://figshare.com/s/56c2299b47a5efd8708f.

Transposable elements identification in jujube genomes

A nonredundant TE library was constructed for jujube using the panEDTA pipeline (v2.1.0) [55]. To ensure a high confidence library, we applied stringent parameters: a frequency filter (-f 3) to retain only TE families with ≥3 full-length copies across genomes, and a maximum divergence threshold (--maxdiv 40) during annotation. From the pipeline output, we extracted the DNA transposons and LTR retrotransposons, which included six different DNA transposons (Helitron, DTM, DTA, DTC, DTH, and DTT) as well as Copia-like and Gypsy-like LTR-RTs [56]. Next, we used the resulting TE library to determine the total TE content in each of the 20 genomes using RepeatMasker (version: open-4.0.7) [57].

Phylogenetic and divergence time analysis

To reconstruct the evolutionary relationships and divergence times among jujube genomes, we selected Rhamnella rubrinervis, Prunus dulcis, Prunus persica, M. domestica, Arabidopsis thaliana, and V. vinifera as the outgroup. Single-copy orthologous genes identified by OrthoFinder (version:2.5.2) [58] were aligned using MAFFT (version:7.310) [59], and the resulting alignments were concatenated for phylogenetic tree construction. A maximum likelihood tree was inferred using RAxML (version:8.2.12) [60] with 1000 bootstrap replicates. Divergence times were estimated with MCMCTree (version:4.5) [61], using calibration points from the TimeTree database (Arabidopsis–Vitis: ~125 MYA; Prunus–Malus: ~105.5 MYA).

Detection of TE insertion in jujube genomes

We used Smartie-sv [62] to find insertion/deletion (INDEL) sequences in seven jujube genomes by aligning them to the ‘Dongzao’ reference genome using the ITIPs program [18]. The alignments were performed with the following key parameters: -minPctIdentity 50 (minimum 50% sequence identity), -minMapQV 30 (minimum mapping quality), -minMatch 10 (minimum seed length), and -nproc 6 (number of processor cores). To fully identify INDELs throughout the jujube pan-genome, the same methodology was performed, using each of the seven jujube genomes as the reference in turn.

Based on these alignments, we categorized the pan-genome into ‘aligned’ and ‘unaligned’ regions. A region in a query genome was classified as an ‘aligned region’ if it could be successfully mapped to the reference under the above parameters; otherwise, it was defined as an ‘unaligned region’, representing sequences absent or highly divergent from the reference.

After that, BLASTN (version:2.10.1) was used to align the generated INDEL sequences to the previously created TE library. TE insertions were defined as INDEL sequences that had coverage and sequence similarity of more than 80%. Lastly, the downstream and upstream sequences of each TE insertion were extracted, totaling 2 kb. These TE sequences were further utilized for genotyping along with their flanking areas.
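The filtering step described above can be sketched in Python as follows. This is a minimal illustration, not the ITIPs implementation: the `Hit` record and function names are hypothetical, the fields are assumed to come from a tabular BLASTN output that includes the query start, end, and length, and ‘coverage’ is assumed to mean the fraction of the INDEL (query) sequence covered by the alignment.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One BLASTN alignment of an INDEL sequence against the TE library."""
    indel_id: str    # query: INDEL sequence identifier
    te_family: str   # subject: TE library entry
    pident: float    # percent identity of the alignment
    qstart: int      # alignment start on the INDEL (1-based)
    qend: int        # alignment end on the INDEL (1-based)
    qlen: int        # full length of the INDEL sequence


def is_te_insertion(hit: Hit, min_identity: float = 80.0,
                    min_coverage: float = 0.80) -> bool:
    """True if identity and query coverage both exceed the 80% thresholds."""
    aligned = abs(hit.qend - hit.qstart) + 1
    coverage = aligned / hit.qlen
    return hit.pident > min_identity and coverage > min_coverage


def select_te_insertions(hits: Iterable[Hit]) -> List[str]:
    """Return the IDs of INDELs with at least one qualifying TE hit."""
    selected: List[str] = []
    for hit in hits:
        if is_te_insertion(hit) and hit.indel_id not in selected:
            selected.append(hit.indel_id)
    return selected


# Hypothetical hits (illustrative values only).
example_hits = [
    Hit("indel1", "Gypsy_1", 92.5, 1, 950, 1000),   # passes both thresholds
    Hit("indel2", "Copia_3", 85.0, 1, 500, 1000),   # coverage too low
    Hit("indel3", "Gypsy_2", 78.0, 1, 1000, 1000),  # identity too low
]
print(select_te_insertions(example_hits))
```

Evaluating each hit on its own is a simplification. An INDEL covered by several partial hits to the same TE family would need those alignments merged before coverage is computed.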

Determination of TIPs on a population scale

The genotype profiles of TIPs were derived from population-scale short-read data using the TE insertion genotyping script from the ITIPs pipeline. First, all short reads from the jujube accessions were mapped to the TE insertions and their flanking sequences using ‘bwa mem’ (version: 0.7.17-r1188) with the parameters ‘-T 20 -Y’ [63]. If a read from a sample aligned to both upstream and downstream flanking sequences, the locus was labeled as ‘GG’, indicating the absence of a TE insertion in that sample. Conversely, if a read aligned to the TE insertion sequence and at least one flanking sequence (either upstream or downstream), the sample was classified as harboring the TE insertion and denoted as ‘CC’. Loci with missing data were labeled as ‘NN’. Finally, genotype information for TE insertions was obtained for each resequenced genome.
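The genotype calling rules above can be expressed as a short Python sketch. It illustrates the decision logic only and is not the ITIPs genotyping script. Each read is represented by the hypothetical labels ‘up’, ‘te’, and ‘down’ for the segments it aligns to. Two points are assumptions: a sample with both kinds of supporting reads is called ‘CG’ (the heterozygous code used in Fig. 7b), and a read aligning to both flanks counts as evidence for the empty site only if it does not also align to the TE.

```python
from typing import Iterable, Set


def call_genotype(reads: Iterable[Set[str]]) -> str:
    """Call a TIP genotype from the segments each read aligns to.

    Each read is a set drawn from {"up", "te", "down"} (hypothetical labels
    for the upstream flank, the TE insertion, and the downstream flank).
    """
    has_insertion = False   # a read joins the TE to at least one flank
    has_empty_site = False  # a read joins both flanks without the TE
    for segments in reads:
        if "te" in segments and ("up" in segments or "down" in segments):
            has_insertion = True
        elif "up" in segments and "down" in segments:
            has_empty_site = True
    if has_insertion and has_empty_site:
        return "CG"  # assumption: both kinds of evidence -> heterozygous
    if has_insertion:
        return "CC"
    if has_empty_site:
        return "GG"
    return "NN"  # no informative read at this locus


print(call_genotype([{"up", "down"}]))                  # empty site only
print(call_genotype([{"up", "te"}]))                    # insertion junction only
print(call_genotype([{"up", "down"}, {"te", "down"}]))  # both kinds of evidence
```

Reads that align to a single segment only, for example entirely within the TE, carry no junction information and leave the call at ‘NN’ in this sketch.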

TE insertion polymorphism-based phylogenetic and population analysis

After obtaining the genotypes of TE insertions in all resequenced jujube genomes, we used a Python script to merge all genotype files into a single PED file, nosex file, and MAP file. The PED file contains genotyping information for each sample, the MAP file includes positional information for each TE insertion, and the nosex file lists accession IDs. These files were converted to a single VCF file using Plink (version:1.90b4) (http://www.cog-genomics.org/plink/1.9/) [64]. Homozygous loci with a minor allele frequency (MAF) <0.05 were eliminated and, using the settings ‘--geno 0.02 --mind 0.02’, TE insertions missing in more than 2% of samples and samples with more than 2% missing TE insertion genotypes were filtered out. Based on this dataset, phylogenetic, population structure, and principal component analyses (PCA) were then performed. After using a Python script to compute the distance matrix between samples from their TIP genotypes, we built a neighbor-joining phylogenetic tree of the 1041 jujube accessions using FastME (version: 2.1.6.4) [65]. Admixture (version: 1.3.0) [66] was used to perform population structure analysis with K = 2–10 clusters. The results at K = 5 and K = 6 were chosen to reflect the genetic structure of the 1041 jujube accessions. PCA of TIPs in the 1041 jujube genomes was conducted using Plink with the parameters ‘--noweb --pca 20’.
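The distance metric used to build the neighbor-joining tree is not specified here. As an illustration only, the Python sketch below computes one simple choice: the proportion of loci with differing genotype calls, counted over loci called in both samples. The function names and example genotypes are hypothetical, and the script used in this study may define distance differently.

```python
from itertools import combinations
from typing import Dict, List


def tip_distance(a: List[str], b: List[str], missing: str = "NN") -> float:
    """Proportion of differing genotype calls over loci called in both samples."""
    shared = [(x, y) for x, y in zip(a, b) if x != missing and y != missing]
    if not shared:
        return float("nan")  # no locus is called in both samples
    return sum(1 for x, y in shared if x != y) / len(shared)


def distance_matrix(genotypes: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Symmetric pairwise distance matrix keyed by accession name."""
    names = sorted(genotypes)
    dist: Dict[str, Dict[str, float]] = {name: {name: 0.0} for name in names}
    for s, t in combinations(names, 2):
        d = tip_distance(genotypes[s], genotypes[t])
        dist[s][t] = d
        dist[t][s] = d
    return dist


# Hypothetical genotype calls at four TIP loci (illustrative only).
example = {
    "wild1": ["GG", "GG", "CC", "NN"],
    "east1": ["GG", "CC", "CC", "GG"],
    "west1": ["CC", "CC", "CC", "GG"],
}
matrix = distance_matrix(example)
print(matrix["wild1"]["east1"], matrix["east1"]["west1"])
```

A matrix of this kind would then be written in the input format expected by the tree-building program. Excluding loci with missing calls pairwise, rather than dropping them globally, keeps as much data as possible for each comparison.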

PCR validation of transposable element insertions

gDNA was extracted from jujube leaves using the CTAB protocol with modifications. In brief, fresh leaves were ground into a fine powder using liquid nitrogen. The powdered sample was then washed three times with 350 mM sorbitol buffer to eliminate polysaccharide contamination. Subsequently, the DNA was extracted using a 3× CTAB buffer. PCR amplification was conducted according to the reaction system and program given below, and the products were resolved by 1% agarose gel electrophoresis. The primers used for amplification were: 2655-F: GTATGTAATAGTAGCTGGAGTTGTGATTTTGT; 2655-R: GGTGACGAGCATGATAACGAAGGTCAAT. PCR reaction system: 2× Phanta Max Master Mix (Vazyme, Nanjing, China), 10 μl; 2655-F/R, 0.5 μl; gDNA, 1 μl; ddH2O, 8 μl. Amplification program: 95°C for 5 min, followed by 35 cycles of 95°C for 30 s, 56°C for 30 s, and 72°C for 90 s.

Acknowledgements

This research was funded by the National Natural Science Foundation of China (Grant No. 32070682), the International Science and Technology Cooperation Program of Hubei Province (Grant No. 2023EHA048) and the Start-up Fund of Wuhan University of Technology (Grant No. 40121005). We would like to thank the members of the Key Laboratory of Genome Research and Genetic Improvement of Xinjiang Characteristic Fruits and Vegetables, China, for providing the ‘Huizao’ genome and transcriptome data.

Contributor Information

Xingnuo Li, Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430000, China; University of Chinese Academy of Sciences, Beijing, 100049, China.

Aidi Zhang, State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China.

Muqaddas Bano, Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430000, China; University of Chinese Academy of Sciences, Beijing, 100049, China.

Juan Jin, Key Laboratory of Genome Research and Genetic Improvement of Xinjiang Characteristic Fruits and Vegetables, Institute of Horticulture Crops, Xinjiang Academy of Agricultural Sciences, Urumqi 830091, China.

Qing Hao, Key Laboratory of Genome Research and Genetic Improvement of Xinjiang Characteristic Fruits and Vegetables, Institute of Horticulture Crops, Xinjiang Academy of Agricultural Sciences, Urumqi 830091, China.

Dingyu Fan, Key Laboratory of Genome Research and Genetic Improvement of Xinjiang Characteristic Fruits and Vegetables, Institute of Horticulture Crops, Xinjiang Academy of Agricultural Sciences, Urumqi 830091, China.

Liang Chen, Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430000, China.

Xiujun Zhang, Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430000, China; School of Mathematics and Statistics, Wuhan University of Technology, Wuhan 430070, China.

Author contributions

X.Z., L.C., and D.F. conceptualized the idea and supervised the project. J.J. and Q.H. collected the plant materials. M.B. polished the manuscript. X.L. and A.Z. performed statistical analysis. X.L. drafted the manuscript. All authors have read the manuscript and agreed with the current version of the manuscript for publication.

Data availability

All of the data are available in the manuscript or supplementary materials.

Conflicts of interest statement

The authors declare no conflict of interest.

Supplementary material

Supplementary material is available at Horticulture Research online.

References

  • 1. Feschotte  C, Jiang  N, Wessler  SR. Plant transposable elements: where genetics meets genomics. Nat Rev Genet. 2002;3:329–41 [DOI] [PubMed] [Google Scholar]
  • 2. Bennetzen  JL. Transposable element contributions to plant gene and genome evolution. Plant Mol Biol. 2000;42:251–69 [PubMed] [Google Scholar]
  • 3. Bourque  G, Burns  KH, Gehring  M. et al.  Ten things you should know about transposable elements. Genome Biol. 2018;19:199. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Catlin  NS, Josephs  EB. The important contribution of transposable elements to phenotypic variation and evolution. Curr Opin Plant Biol. 2022;65:102140 [DOI] [PubMed] [Google Scholar]
  • 5. Chuong  EB, Elde  NC, Feschotte  C. Regulatory activities of transposable elements: from conflicts to benefits. Nat Rev Genet. 2017;18:71–86 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Niu  XM, Xu  YC, Li  ZW. et al.  Transposable elements drive rapid phenotypic variation in Capsella rubella. Proc Natl Acad Sci U S A. 2019;116:6908–13 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Quadrana  L, Etcheverry  M, Gilly  A. et al.  Transposition favors the generation of large effect mutations that may facilitate rapid adaption. Nat Commun. 2019;10:3421. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Sultana  T, Zamborlini  A, Cristofari  G. et al.  Integration site selection by retroviruses and transposable elements in eukaryotes. Nat Rev Genet. 2017;18:292–308 [DOI] [PubMed] [Google Scholar]
  • 9. Zhang  X, Meng  L, Liu  B. et al.  A transposon insertion in FLOWERING LOCUS T is associated with delayed flowering in Brassica rapa. Plant Sci. 2015;241:211–20 [DOI] [PubMed] [Google Scholar]
  • 10. Cui  X, Cao  X. Epigenetic regulation and functional exaptation of transposable elements in higher plants. Curr Opin Plant Biol. 2014;21:83–8 [DOI] [PubMed] [Google Scholar]
  • 11. Lisch  D. Epigenetic regulation of transposable elements in plants. Annu Rev Plant Biol. 2009;60:43–66 [DOI] [PubMed] [Google Scholar]
  • 12. Rebollo  R, Romanish  MT, Mager  DL. Transposable elements: an abundant and natural source of regulatory sequences for host genes. Annu Rev Genet. 2012;46:21–42 [DOI] [PubMed] [Google Scholar]
  • 13. Kobayashi  S, Goto-Yamamoto  N, Hirochika  H. Retrotransposon-induced mutations in grape skin color. Science. 2004;304:982. [DOI] [PubMed] [Google Scholar]
  • 14. Lisch  D. How important are transposons for plant evolution?  Nat Rev Genet. 2013;14:49–61 [DOI] [PubMed] [Google Scholar]
  • 15. Yao  JL, Dong  YH, Morris  BA. Parthenocarpic apple fruit production conferred by transposon insertion mutations in a MADS-box transcription factor. Proc Natl Acad Sci U S A. 2001;98:1306–11 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Alonge  M, Wang  X, Benoit  M. et al.  Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell. 2020;182:145–161.e23 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Domínguez  M, Dugas  E, Benchouaia  M. et al.  The impact of transposable elements on tomato diversity. Nat Commun. 2020;11:4058. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Cai  X, Lin  R, Liang  J. et al.  Transposable element insertion: a hidden major source of domesticated phenotypic variation in Brassica rapa. Plant Biotechnol J. 2022;20:1298–310 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Bayer, P.E., Golicz, A.A., Scheben, A., Batley, J. and Edwards, D. (2020) Plant pan-genomes are the new reference. Nat. Plants, 6, 914–920. [DOI] [PubMed] [Google Scholar]
  • 20. Liu  J, Zhou  R, Wang  W. et al.  A copia-like retrotransposon insertion in the upstream region of the SHATTERPROOF1 gene, BnSHP1. A9, is associated with quantitative variation in pod shattering resistance in oilseed rape. J Exp Bot. 2020;71:5402–13 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. Liu  M, Wang  J, Wang  L. et al.  The historical and current research progress on jujube—a superfruit for the future. Hortic Res. 2020;7:119. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Hufford  MB, Seetharam  AS, Woodhouse  MR. et al.  De novo assembly, annotation, and comparative analysis of 26 diverse maize genomes. Science. 2021;373:655–62 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Qin  P, Lu  H, Du  H. et al.  Pan-genome analysis of 33 genetically diverse rice accessions reveals hidden genomic variations. Cell. 2021;184:3542–3558.e16 [DOI] [PubMed] [Google Scholar]
  • 24. Song  JM, Guan  Z, Hu  J. et al.  Eight high-quality genomes reveal pan-genome architecture and ecotype differentiation of Brassica napus. Nat Plants. 2020;6:34–45 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25. Wang  Y, Xia  B, Lin  Q. et al.  Convergent domestication of bitter apples and pears by selecting mutations of MYB transcription factors to reduce proanthocyanidin levels. Mol Hortic. 2025;5:51. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26. Zhang  L, Hu  J, Han  X. et al.  A high-quality apple genome assembly reveals the association of a retrotransposon and red fruit colour. Nat Commun. 2019;10:1494. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27. Wang  J, Cao  K, Li  Y. et al.  Genome variation and LTR-RT analyses of an ancient peach landrace reveal mechanism of blood-flesh fruit color formation and fruit maturity date advancement. Hortic Res. 2023;11:uhad265. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. Fiol  A, García  S, Dujak  C, Pacheco  I, Infante  R, Aranzana  MJ. An LTR retrotransposon in the promoter of a PsMYB10.2 gene associated with the regulation of fruit flesh color in Japanese plum. Hortic Res. 2022;9:uhac206. Published 2022 Sep 13. 10.1093/hr/uhac206 [DOI] [PMC free article] [PubMed]
  • 29. Liu  MJ, Zhao  J, Cai  QL. et al.  The complex jujube genome provides insights into fruit tree biology. Nat Commun. 2014;5:5315. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Huang  J, Zhang  C, Zhao  X. et al.  The jujube genome provides insights into genome evolution and the domestication of sweetness/acidity taste in fruit trees. PLoS Genet. 2016;12:e1006433 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Shen  LY, Luo  H, Wang  XL. et al.  Chromosome-scale genome assembly for Chinese sour jujube and insights into its genome evolution and domestication signature. Front Plant Sci. 2021;12:773090 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Yang  M, Han  L, Zhang  S. et al.  Insights into the evolution and spatial chromosome architecture of jujube from an updated gapless genome assembly. Plant Commun. 2023;4:100662 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Li  K, Chen  R, Abudoukayoumu  A. et al.  Haplotype-resolved T2T reference genomes for wild and domesticated accessions shed new insights into the domestication of jujube. Hortic Res. 2024;11:uhae071. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34. Guo  M, Lian  Q, Mei  Y. et al.  Analyses of pan-genome and resequencing atlas unveil the genetic basis of jujube domestication. Nat Commun. 2024;15:9320. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Guo  M, Zhang  Z, Cheng  Y. et al.  Comparative population genomics dissects the genetic basis of seven domestication traits in jujube. Hortic Res. 2020;7:89. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36. Zhang  Q, Wang  L, Wang  Z. et al.  The regulation of cell wall lignification and lignin biosynthesis during pigmentation of winter jujube. Hortic Res. 2021;8:238. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. Slotkin  RK, Martienssen  R. Transposable elements and the epigenetic regulation of the genome. Nat Rev Genet. 2007;8:272–85 [DOI] [PubMed] [Google Scholar]
  • 38. Deng  Q, Lu  H, Liu  D. et al.  Modulation of flowering by an alternatively spliced AGL18-1 transcript in Brassica juncea. Crop J. 2025;13:456–67 [Google Scholar]
  • 39. Paul  P, Joshi  S, Tian  R. et al.  The MADS-domain factor AGAMOUS-Like18 promotes somatic embryogenesis. Plant Physiol. 2022;188:1617–31 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Wu  ZY, Raven  PH. Flora of China. Vol. 12. Beijing: Science Press; St. Louis: Missouri Botanical Garden Press; 2007 [Google Scholar]
  • 41. Baduel  P, Quadrana  L, Hunter  B. et al.  Relaxed purifying selection in autopolyploids drives transposable element over-accumulation which provides variants for local adaptation. Nat Commun. 2019;10:5818. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42. Carpentier  MC, Manfroi  E, Wei  FJ. et al.  Retrotranspositional landscape of Asian rice revealed by 3000 genomes. Nat Commun. 2019;10:24. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43. Quadrana  L, Bortolini Silveira  A, Mayhew  GF. et al.  The Arabidopsis thaliana mobilome and its impact at the species level. Elife. 2016;5:e15716 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44. Serrato-Capuchina  A, Matute  DR. The role of transposable elements in speciation. Genes (Basel). 2018;9:254. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45. Busch  BL, Schmitz  G, Rossmann  S. et al.  Shoot branching and leaf dissection in tomato are regulated by homologous gene modules. Plant Cell. 2011;23:3595–609 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Butelli  E, Licciardello  C, Zhang  Y. et al.  Retrotransposons control fruit-specific, cold-dependent accumulation of anthocyanins in blood oranges. Plant Cell. 2012;24:1242–55 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47. Xu  M, Brar  HK, Grosic  S. et al.  Excision of an active CACTA-like transposable element from DFR2 causes variegated flowers in soybean [Glycine max (L.) Merr.]. Genetics. 2010;184:53–63 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48. Qu  S, Ren  T, Fan  D. et al.  Anatomical observation on the differences in the fruit texture between Huizao and Dongzao jujube and related enzyme activities. J Fruit Science. 2024;41:1811–20 [Google Scholar]
  • 49. Guillet-Claude  C, Birolleau-Touchard  C, Manicacci  D. et al.  Nucleotide diversity of the ZmPox3 maize peroxidase gene: relationships between a MITE insertion in exon 2 and variation in forage maize digestibility. BMC Genet. 2004;5:19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50. Li  J, Wang  Z, Peng  H. et al.  A MITE insertion into the 3′-UTR regulates the transcription of TaHSP16.9 in common wheat. Crop J. 2014;2:381–7 [Google Scholar]
  • 51. Tovkach  A, Ryan  PR, Richardson  AE. et al.  Transposon-mediated alteration of TaMATE1B expression in wheat confers constitutive citrate efflux from root apices. Plant Physiol. 2013;161:880–92 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52. Yang  G, Dong  J, Chandrasekharan  M. et al.  Kiddo, a new transposable element family closely associated with rice genes. Mol Genet Genomics. 2001;266:417–24 [DOI] [PubMed] [Google Scholar]
  • 53. Ni  X, Wang  Y, Dai  L. et al.  The transcription factor GmbZIP131 enhances soybean salt tolerance by regulating flavonoid biosynthesis. Plant Physiol. 2025;197:kiaf092. [DOI] [PubMed] [Google Scholar]
  • 54. Zhang  H, Wang  Z, Li  X. et al.  The IbBBX24-IbTOE3-IbPRX17 module enhances abiotic stress tolerance by scavenging reactive oxygen species in sweet potato. New Phytol. 2022;233:1133–52 [DOI] [PubMed] [Google Scholar]
  • 55. Ou  S, Scheben  A, Collins  T. et al.  Differences in activity and stability drive transposable element variation in tropical and temperate maize. Genome Res. 2024;34:1140–53 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56. Feschotte  C, Pritham  EJ. DNA transposons and the evolution of eukaryotic genomes. Annu Rev Genet. 2007;41:331–68 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Tarailo-Graovac  M, Chen  N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics. 2009;Chapter 4:Unit 4.10. [DOI] [PubMed] [Google Scholar]
  • 58. Emms  DM, Kelly  S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20:238. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. Katoh  K, Standley  DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–80 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Stamatakis  A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61. Puttick  MN. MCMCtreeR: functions to prepare MCMCtree analyses and visualize posterior ages on trees. Bioinformatics. 2019;35:5321–2 [DOI] [PubMed] [Google Scholar]
  • 62. Kronenberg  ZN, Fiddes  IT, Gordon  D. et al.  High-resolution comparative analysis of great ape genomes. Science. 2018;360:eaar6343. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63. Li  H, Durbin  R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64. Chang  CC, Chow  CC, Tellier  LC. et al.  Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience. 2015;4:7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65. Lefort  V, Desper  R, Gascuel  O. FastME 2.0: a comprehensive, accurate, and fast distance-based phylogeny inference program. Mol Biol Evol. 2015;32:2798–800 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66. Alexander  DH, Novembre  J, Lange  K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009;19:1655–64 [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data


Supplementary Materials

Web_Material_uhaf343

Data Availability Statement

All of the data are available in the manuscript or supplementary materials.


Articles from Horticulture Research are provided here courtesy of Oxford University Press
