Abstract
Almond, as one of the most important nut tree species worldwide, has been extensively studied for its kernel flavor, lipid composition, and medicinal properties. However, the molecular regulatory mechanisms underlying its flesh development remain largely unexplored. Therefore, this study conducted integrated transcriptomic and metabolomic analyses on flesh samples of the ‘Wanfeng’ almond fruit at five developmental stages. Transcriptome sequencing generated 107.25 Gb of Clean Data, with thousands of differentially expressed genes identified between adjacent stages, indicating dynamic transcriptional reprogramming during flesh development. Enrichment analyses revealed that early and middle developmental stages were primarily associated with cellular development, photosynthesis, and biosynthesis, while later stages were linked to substrate decomposition and programmed cell death. The plant hormone signal transduction pathway played a critical regulatory role in almond flesh development. Metabolomic profiling identified 6,528 metabolites, with most differential metabolites accumulating before DAF96. Metabolite classification highlighted six dominant categories, including Fatty Acyls and Organooxygen compounds. Integrated analysis demonstrated the involvement of the phenylpropanoid biosynthesis pathway in lignin synthesis, which likely regulates cellular structure, stability, and flesh hardness throughout development. The hierarchical network model provides novel insights into the temporal regulation of almond flesh development, particularly the regulatory axis centered on 4CL and its interacting transcription factors, laying a theoretical foundation for subsequent functional validation. The qRT-PCR results highlight dynamic transcriptional regulation of key genes across six phytohormone and phenylpropanoid biosynthesis pathways during flesh development in almond. 
This study provides reference data for further exploration of the regulatory mechanisms underlying almond flesh development.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12870-025-07893-w.
Keywords: Prunus dulcis, Flesh, Transcriptome, Metabolome, Regulatory network
Introduction
Flesh development is a multi-layered biological process governed by dynamic gene regulatory networks, which coordinate hormone signaling (e.g., ethylene, auxin, and gibberellins), metabolic pathways (e.g., sugar-acid biosynthesis and secondary metabolite synthesis), and cellular remodeling (e.g., pectin degradation and cellulose realignment) to drive flesh expansion and material accumulation [1–3]. Cultivated Prunus fruit trees typically exhibit a double sigmoidal growth pattern, characterized by a distinct growth arrest phase during endocarp lignification [4]. This process encompasses four developmental stages: the first exponential growth phase (S1), stone hardening initiation phase (S2), second exponential growth phase (S3), and maturation phase (S4), all occurring post-fruit set. Stage 1 (S1) involves rapid initial fruit expansion, Stage 2 (S2) features slowed growth accompanied by endocarp lignification, Stage 3 (S3) marks a second rapid growth phase with accelerated flesh tissue enlargement, and Stage 4 (S4) represents the final size attainment and development of species-specific quality traits [5].
According to molecular phylogenetic evidence from the Angiosperm Phylogeny Group classification system, almond is classified within the expanded genus Prunus, with its traditional classification as an independent genus Amygdalus no longer valid. Typical cultivated Prunus fruit trees, such as peach, plum, and apricot, exhibit a characteristic fleshy mesocarp expansion pattern during fruit development. This process involves continuous cell expansion and vacuolization, accumulating water, soluble sugars, organic acids, and other compounds to form a sweet-flavored, fleshy edible tissue, which constitutes their core economic value. Almond (Prunus dulcis) shares a close phylogenetic relationship with peach (Prunus persica), displaying high similarity at both genomic and phenotypic levels (e.g., branches, leaves, and flowers), and the two species can serve as rootstocks for each other via grafting [6, 7]. However, almond exhibits significant differences in fruit development compared to peach. Early fruit development in almond aligns with that of peach, characterized by continuous mesocarp growth and expansion. As almond fruit progresses to mid-late developmental stages, its mesocarp development diverges markedly, ceasing expansion to form only a thin, rigid fleshy layer that does not soften. At maturity, almond fruit undergoes natural dehiscence along the ventral suture in the pericarp (exocarp and endocarp), accompanied by mesocarp dehydration and shrinkage, ultimately resulting in a desiccated, hardened lignified structure [8].
Almond, as one of the most important nut tree species globally, has been predominantly studied for its kernel flavor, lipid composition, and medicinal properties, while the molecular regulatory mechanisms underlying its mesocarp development remain largely unexplored [9, 10]. In recent years, China’s rapid economic growth has driven an increasing emphasis on diversified development strategies in the fruit industry to enhance economic and ecological benefits. Investigating almond mesocarp development mechanisms could provide theoretical support for developing almond flesh by-products, promoting the transformation of the almond industry from single-kernel utilization to whole-fruit valorization. Furthermore, elucidating the mechanisms of almond mesocarp formation may offer new insights into the genetic regulatory mechanisms and evolutionary patterns of fruit development in Prunus species, thereby providing data references for improving flesh texture, postharvest storability, and processing traits in other cultivated Prunus fruit trees. Therefore, this study systematically characterized the molecular regulatory features of almond mesocarp development, including key gene expression dynamics, metabolic pathway shifts, metabolite profiles, and interaction networks, through integrated transcriptomic and metabolomic analyses. The findings aim to deepen the understanding of molecular regulation during almond mesocarp development and provide valuable data for future almond and Prunus fruit tree breeding and quality improvement efforts.
Materials and methods
Sample collection
Three healthy and vigorous 10-year-old ‘Wanfeng’ almond trees from the State-owned No. 2 Forestry Center Almond Germplasm Repository in Shache County, Kashgar Region, Xinjiang (E77°26′, N38°41′) were selected for fruit sampling. For each tree, similarly sized fruits were collected from the middle canopy layer in four orientations (east, south, west, north). Based on the full bloom date of ‘Wanfeng’ almond (March 23, 2024), fruits were sampled at five developmental stages, expressed as days after flowering (DAF): DAF15 (April 7), DAF53 (May 15), DAF87 (June 20), DAF118 (July 21), and DAF149 (August 21) (Figure S1). DAF15 represents the young fruit stage, where the fruit is only approximately the size of a soybean, primarily undergoing cell division and tissue differentiation, with the ovule in the initial stage of development. DAF53 represents the initial fruit expansion stage, where the fruit volume begins to increase significantly, the endocarp remains uncalcified and soft, and the kernel appears as a transparent liquid endosperm. DAF87 represents the late fruit expansion stage, where the fruit volume approaches its maximum size, the endocarp begins to harden, and the kernel transitions from a liquid to a gel-like state. DAF118 represents the pre-maturity stage, where the fruit size stabilizes, the endocarp fully hardens, the kernel develops fully and accumulates dry matter, lipids, and other substances, and the embryonic organs mature. DAF149 represents the physiological maturity stage, where the fruit partially splits along the ventral suture, exposing the kernel; the endocarp separates easily, and the plump kernel appears milky white, reaching the optimal state for harvest. Fruits collected from each tree at each stage were pooled as one biological replicate, with the three trees representing three biological replicates. After rapid freezing in liquid nitrogen, the exocarp and stone were removed, and the mesocarp (flesh) was preserved at −80 °C for subsequent analyses.
RNA extraction and library construction
Total RNA was extracted from mesocarp samples using the RNAprep Pure Plant Kit (Tiangen, China) following the manufacturer’s protocol. The extracted RNA was assessed for purity and concentration using a NanoDrop 2000 spectrophotometer, and RNA integrity was verified with an Agilent 2100 Bioanalyzer. Qualified RNA samples were processed for library construction. Eukaryotic mRNA was enriched using Oligo(dT)-attached magnetic beads, followed by fragmentation with Fragmentation Buffer. First- and second-strand cDNA synthesis was performed using fragmented mRNA as templates, and the resulting double-stranded cDNA was purified. Purified cDNA underwent end repair, poly-A tailing, and sequencing adapter ligation. AMPure XP beads were used for size selection, and PCR amplification was performed to enrich the cDNA libraries. After library construction, initial quantification was conducted using a Qubit 3.0 Fluorometer, with concentrations required to exceed 1 ng/μl. Insert fragment size distribution was verified using the Qsep400 High-throughput Analysis System. Libraries meeting size criteria were accurately quantified via Q-PCR to ensure effective concentrations > 2 nM. Qualified libraries were sequenced on the Illumina NovaSeq 6000 platform in PE150 (Paired-End 150) mode.
Data quality control and mapping
Raw sequencing data were subjected to quality control using FastQC (v0.20.0) with parameters set as: -q 5 -n 5, to remove sequences containing > 5 unknown bases (N), reads with > 50% of bases having quality scores < 5, and adapter sequences [11]. Clean reads from all mesocarp transcriptomes were aligned to the T2T genome of ‘Wanfeng’ almond using HISAT2 (v2.1.0) with parameters: --novel-splicesite-outfile XX.ss [12]. The resulting SAM files were converted to sorted BAM format using Samtools (v1.9) [13]. Alignment efficiency and mapping statistics were evaluated using the rnaseq module of Qualimap (v2.2) [14].
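The alignment and sorting steps can be illustrated with a minimal sketch that only assembles the command lines, without running them. The sample name, read file names, index prefix, and thread count are hypothetical placeholders; only the `--novel-splicesite-outfile` option is taken from the text, and the remaining options are standard HISAT2 and Samtools arguments chosen for illustration.

```python
from typing import List


def build_mapping_commands(sample: str, index_prefix: str, threads: int = 4) -> List[List[str]]:
    """Return argv lists for aligning one paired-end sample and sorting the result.

    File names below are hypothetical; adapt them to the actual clean-read files.
    """
    read1 = f"{sample}_1.clean.fq.gz"   # hypothetical clean-read file (mate 1)
    read2 = f"{sample}_2.clean.fq.gz"   # hypothetical clean-read file (mate 2)
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"

    # HISAT2 alignment, also writing newly discovered splice sites to <sample>.ss
    hisat2_cmd = [
        "hisat2", "-p", str(threads), "-x", index_prefix,
        "-1", read1, "-2", read2,
        "--novel-splicesite-outfile", f"{sample}.ss",
        "-S", sam,
    ]
    # Coordinate-sort the SAM output into BAM
    sort_cmd = ["samtools", "sort", "-@", str(threads), "-o", bam, sam]
    return [hisat2_cmd, sort_cmd]


if __name__ == "__main__":
    for cmd in build_mapping_commands("DAF15_rep1", "wanfeng_t2t_index"):
        print(" ".join(cmd))
```

Building the argument lists separately from execution makes it easy to inspect or log the exact commands for all 15 samples before submitting them.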
Gene expression and identification of differentially expressed genes
Gene expression was quantified from the aligned BAM files using StringTie (v3.0.0) with the parameters -e -p 4 [15]. Ballgown (R script) was used to process these results, generating a read count matrix of gene expression and, ultimately, an FPKM matrix [16]. Differentially expressed genes were identified with DESeq2 (R script) using the criteria |log2(fold change, FC)| ≥ 1 and false discovery rate ≤ 0.05 [17]. The FC threshold is stated in terms of upregulation but applies symmetrically to downregulation: an FC threshold of 2 selects FC ≥ 2 as upregulated and FC ≤ 1/2 as downregulated, whereas a threshold of 1 selects differential genes or metabolites without considering FC.
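The selection rule can be sketched in a few lines. This is an illustration of the stated thresholds only, not the DESeq2 workflow itself; the function name and the example values are hypothetical.

```python
import math


def is_deg(fold_change: float, fdr: float,
           fc_threshold: float = 2.0, fdr_cutoff: float = 0.05) -> bool:
    """Apply the symmetric fold-change rule together with the FDR cutoff.

    |log2(FC)| >= log2(fc_threshold) covers both FC >= threshold (upregulated)
    and FC <= 1/threshold (downregulated). A threshold of 1 disables the FC filter.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return abs(math.log2(fold_change)) >= math.log2(fc_threshold) and fdr <= fdr_cutoff


if __name__ == "__main__":
    # Hypothetical (fold change, FDR) pairs
    for fc, fdr in [(2.0, 0.01), (0.5, 0.05), (1.5, 0.001), (4.0, 0.2)]:
        print(fc, fdr, is_deg(fc, fdr))
```

With the default threshold of 2, the rule is equivalent to |log2(FC)| ≥ 1.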
GO and KEGG enrichment
GO (Gene Ontology) enrichment analysis encompasses three ontologies: Biological Process, Cellular Component, and Molecular Function, each describing distinct attributes of gene functions. The basic unit of GO is a term, with each term representing a specific functional attribute. KEGG (Kyoto Encyclopedia of Genes and Genomes) (http://www.kegg.jp/) serves as the primary public database for pathway analysis, and pathway enrichment significance was evaluated based on KEGG Pathway units. All GO and KEGG background annotation data were derived from the T2T (Telomere-to-Telomere) genome annotation of ‘Wanfeng’ almond (Table S1).
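The statistical test used for enrichment is not specified here. As a minimal sketch, assuming the commonly used one-sided hypergeometric test, the enrichment P-value for a single GO term or KEGG pathway could be computed as follows; the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background: int, annotated: int, selected: int, overlap: int) -> float:
    """P(X >= overlap) under the hypergeometric distribution.

    background: genes in the annotated genome background
    annotated:  background genes assigned to the term or pathway
    selected:   DEGs in the comparison group
    overlap:    DEGs assigned to the term or pathway
    """
    upper = min(annotated, selected)
    total = comb(background, selected)
    # math.comb returns 0 when the draw is impossible, so no extra bounds check is needed
    tail = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 10 background genes, 5 in the pathway, 5 DEGs, all 5 in the pathway
    print(hypergeom_enrichment_p(10, 5, 5, 5))
```

Exact integer arithmetic keeps the sketch free of external dependencies; for genome-scale inputs a statistics library would usually be preferred for speed.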
Identification of transcription factors
This study identified transcription factors in the T2T genome protein dataset of ‘Wanfeng’ almond using the transcription factor prediction tool on the Baimaike Biotech Cloud Platform (https://www.biocloud.net/), selecting the plant database with parameter settings of E-value ≤ 1e−5.
Construction of time-ordered gene co-expression network
Based on all differentially expressed genes obtained from the four comparison groups (DAF15 vs DAF53, DAF53 vs DAF87, DAF87 vs DAF118, and DAF118 vs DAF149), this study employed the TO-GCN (Time-ordered gene coexpression network) tool to construct a co-expression network for time-series transcriptomes of flesh samples from five developmental stages of the ‘Wanfeng’ almond. Differentially expressed genes with an average FPKM value less than 1 across the five flesh samples were removed [18]. Finally, the regulatory network diagram was constructed using the Cytoscape tool [19].
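The low-expression filter applied before network construction can be illustrated as below. The gene identifiers and FPKM values are hypothetical, and each gene is assumed to carry one FPKM value per developmental stage.

```python
from typing import Dict, List


def filter_low_expression(fpkm_by_gene: Dict[str, List[float]],
                          min_mean: float = 1.0) -> Dict[str, List[float]]:
    """Keep genes whose average FPKM across stages is at least min_mean."""
    return {
        gene: values
        for gene, values in fpkm_by_gene.items()
        if values and sum(values) / len(values) >= min_mean
    }


if __name__ == "__main__":
    # Hypothetical FPKM values at DAF15, DAF53, DAF87, DAF118, DAF149
    toy = {
        "geneA": [0.2, 0.4, 0.1, 0.0, 0.3],    # mean 0.2, removed
        "geneB": [5.0, 12.0, 8.5, 3.1, 1.4],   # mean 6.0, kept
    }
    print(sorted(filter_low_expression(toy)))
```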
Metabolite detection and differential metabolite analysis
Beijing Biomarker Technologies Co., Ltd. was commissioned to perform metabolite detection and quantification on mesocarp samples of ‘Wanfeng’ almond at five developmental stages: DAF15, DAF53, DAF87, DAF118, and DAF149. For metabolite extraction, an appropriate volume of extraction solution and magnetic beads were added to each sample for grinding and sonication. After centrifugation, the supernatant was collected and vacuum dried, then reconstituted in an appropriate amount of extraction solution before instrumental analysis. The analysis platform was a Waters Acquity I-Class PLUS ultra-high-performance liquid chromatograph coupled with a Waters Xevo G2-XS QTOF high-resolution mass spectrometer [20]. Sample analysis was performed according to the corresponding parameters. Raw data collected with MassLynx V4.2 were processed with Progenesis QI software for peak extraction, peak alignment, and other data processing operations. Metabolites were identified in Progenesis QI against the online METLIN database, public databases, and a database built in-house by BMKGENE, supplemented by theoretical fragment identification. Differential metabolites were preliminarily selected using the Variable Importance in Projection (VIP) obtained from the OPLS-DA model, and were further filtered using the fold change and the p-value from univariate analysis. The selection criteria were as follows. First, metabolites with a fold change ≥ 1 were selected [21]; a difference in metabolite abundance between the control and experimental groups exceeding this threshold was considered significant. Second, for sample groups with biological replicates, metabolites were additionally required to have a VIP value ≥ 1. The VIP value indicates how strongly the between-group difference of a metabolite contributes to sample classification and discrimination in the model, and metabolites with VIP ≥ 1 are generally considered significantly different [22]. Third, for sample groups with biological replicates, differential metabolites were further filtered by a t test, retaining those with a p-value < 0.05 [23].
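The three screening criteria (fold change, VIP, and p-value) can be combined as in the following minimal sketch. It assumes that VIP values and t-test p-values have already been computed elsewhere, and that the fold-change threshold is applied symmetrically, as described for differential genes; the function name and example values are hypothetical.

```python
def is_differential_metabolite(fold_change: float, vip: float, p_value: float,
                               fc_threshold: float = 1.0,
                               vip_cutoff: float = 1.0,
                               p_cutoff: float = 0.05) -> bool:
    """Combine the fold-change, VIP, and p-value criteria for one metabolite.

    The FC threshold is treated symmetrically (FC >= t or FC <= 1/t), so the
    default threshold of 1 places no restriction on fold change.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    passes_fc = fold_change >= fc_threshold or fold_change <= 1.0 / fc_threshold
    return passes_fc and vip >= vip_cutoff and p_value < p_cutoff


if __name__ == "__main__":
    # Hypothetical (fold change, VIP, p-value) triples
    for fc, vip, p in [(1.2, 1.5, 0.01), (1.2, 0.8, 0.01), (3.0, 1.2, 0.20)]:
        print(fc, vip, p, is_differential_metabolite(fc, vip, p))
```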
Quantitative real-time PCR validation
Three differentially expressed genes were randomly selected from each of the six pathways (auxin, cytokinin, ethylene, brassinosteroid, methyl jasmonate, and phenylpropanoid biosynthesis) in the L6 hierarchical network of the TO-GCN results, and their expression levels were measured in flesh samples from the five developmental stages of the ‘Wanfeng’ almond. Primers for qRT-PCR (Real-Time Quantitative Reverse Transcription PCR) amplification of the selected genes were designed using Primer Premier 5 software [24]. Actin, adopted from a previously published almond study, was used as the internal reference gene. The fluorescence quantification procedure followed the protocol described in a previously published article by our team [25]. The relative expression of the target genes was calculated using the 2^(−ΔΔCT) method [26].
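The relative quantification step can be written out as a short worked example. The CT values are invented for illustration, and the choice of calibrator sample (for instance, the earliest stage) is an assumption rather than something stated in the text.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Relative expression by the 2^(-ddCT) method.

    dCT  = CT(target) - CT(reference gene), computed per sample
    ddCT = dCT(sample) - dCT(calibrator)
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical CT values: the target gene amplifies 2 cycles earlier (relative
    # to the reference gene) in the sample than in the calibrator, a 4-fold increase.
    print(relative_expression(22.0, 18.0, 24.0, 18.0))
```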
Results
Transcriptome overview
This study conducted transcriptome sequencing on mesocarp samples of ‘Wanfeng’ almond at five developmental stages (DAF15, DAF53, DAF87, DAF118, and DAF149) using the Illumina NovaSeq 6000 platform. After quality control, raw data from 15 mesocarp samples yielded 107.25 Gb of Clean Data, with each sample containing ≥ 5.83 Gb of Clean Data, Q30 base percentages ≥ 92.09%, and GC content ranging from 44.12% to 46.18%, indicating high sequencing quality suitable for reference genome alignment (Table S2). Clean reads from all 15 samples were aligned to the ‘Wanfeng’ almond T2T genome, with overall alignment rates ranging from 89.02% to 97.22% (Table S3). Uniquely mapped reads accounted for 85.75%–94.10% of Clean reads, while multiple mapped reads comprised 2.38%–3.90%, demonstrating excellent genome coverage and supporting downstream analyses. Principal component analysis clearly distinguished mesocarp samples across the five developmental stages, with high clustering consistency among three biological replicates within each stage (Figure S2A). A clustered heatmap based on expression data revealed strong correlations among replicates within the same stage and marked divergence between stages (Figure S2B). These results confirm significant divergence in gene expression patterns across developmental stages and high experimental reproducibility.
Differentially expressed gene analysis
To investigate gene expression changes in ‘Wanfeng’ almond mesocarp samples across five developmental stages (DAF15, DAF53, DAF87, DAF118, and DAF149), differentially expressed genes (DEGs) were identified between adjacent developmental stages using DESeq2. In total, 7,634, 5,377, 7,274, and 4,258 DEGs were identified in the four comparison groups (DAF15 vs DAF53, DAF53 vs DAF87, DAF87 vs DAF118, and DAF118 vs DAF149, respectively) (Table S4). Among these, the numbers of upregulated DEGs were 3,829, 2,412, 3,630, and 1,890, while the numbers of downregulated DEGs were 3,805, 2,965, 3,644, and 2,368, respectively (Fig. 1A). A Venn diagram identified 567 DEGs shared across all four comparison groups (Fig. 1B, Table S5).
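The set of DEGs shared by all comparison groups, as summarized in the Venn diagram, corresponds to a plain set intersection. The sketch below uses hypothetical gene identifiers, not the actual DEG lists.

```python
from typing import Dict, Iterable, Set


def shared_degs(deg_lists: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the genes present in every comparison group."""
    sets = [set(genes) for genes in deg_lists.values()]
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    # Hypothetical DEG lists for the four adjacent-stage comparisons
    toy = {
        "DAF15_vs_DAF53": ["g1", "g2", "g3"],
        "DAF53_vs_DAF87": ["g2", "g3", "g4"],
        "DAF87_vs_DAF118": ["g2", "g3", "g5"],
        "DAF118_vs_DAF149": ["g3", "g2", "g6"],
    }
    print(sorted(shared_degs(toy)))
```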
Fig. 1.
Differentially expressed gene analysis of the DAF15 vs DAF53, DAF53 vs DAF87, DAF87 vs DAF118, and DAF118 vs DAF149 comparison groups. A Number of DEGs in the four comparison groups. The abscissa represents the DEG sets (blue, all DEGs; orange, upregulated genes; green, downregulated genes), and the ordinate represents the number of DEGs. B Venn diagram of differentially expressed genes among the four comparison groups. C Significantly enriched KEGG pathways of DEGs among the four comparison groups. G1, G2, G3 and G4 represent DAF15 vs DAF53, DAF53 vs DAF87, DAF87 vs DAF118 and DAF118 vs DAF149, respectively
This study performed GO and KEGG enrichment annotation on DEGs from four comparison groups (DAF15 vs DAF53, DAF53 vs DAF87, DAF87 vs DAF118, and DAF118 vs DAF149), with significance analyzed based on P-value. Results showed that DEGs across all four groups were enriched in multiple GO terms under Biological Process, Cellular Component, and Molecular Function categories (Figure S3), with most DEGs enriched in metabolism-related KEGG pathways (Figure S4). The top 20 significantly enriched GO terms in Biological Process, Cellular Component, and Molecular Function were further classified (Figure S5, Figure S6 and Figure S7; Table S6). DEGs in DAF15 vs DAF53 were associated with DNA-binding transcription factors, auxin signaling pathways, cell wall metabolism, ribosome and plasma membrane-related structures, regulating cell proliferation, energy metabolism, and flesh expansion. DEGs in DAF53 vs DAF87 were linked to chlorophyll binding, photosynthesis, enhanced cell wall-modifying enzyme activity, and photosystem I activity, promoting photosynthetic product accumulation and cell expansion. DEGs in DAF87 vs DAF118 participated in hydrolase and calmodulin-binding activities, ketone/aldehyde catabolism, Golgi apparatus and myosin-mediated cell wall remodeling, facilitating cell wall degradation. DEGs in DAF118 vs DAF149 were enriched in ubiquitin–proteasome system activation, peptide phosphorylation, methylglyoxal degradation, ubiquitin ligase complexes, and lysosomal membrane functions, accelerating protein turnover, senescence, and cell death. Additionally, the top 20 enriched KEGG pathways aligned with GO enrichment results (Fig. 1C; Table S6). Collectively, these findings reflect two hallmark phases in almond mesocarp development: cell proliferation and material accumulation (DAF15–DAF87) and metabolic shift toward degradation and programmed cell death (DAF118–DAF149), consistent with the observed developmental characteristics.
Analysis of plant hormone signal transduction pathway
KEGG enrichment analysis revealed that the Plant hormone signal transduction pathway (ko04075) was significantly enriched across all four comparison groups and plays a critical role in regulating cellular development (Fig. 1C, Table S6). Thus, this study analyzed DEGs in five hormone pathways—auxin, cytokinin, ethylene, brassinosteroid, and methyl jasmonate—associated with cell development and senescence (Fig. 2A). Auxin pathway: A total of 57 DEGs were identified, including ARF (Auxin response factor) (14), AUX/IAA (Auxin-responsive protein IAA) (16), AUX1 (Auxin influx carrier 1) (4), GH3 (Gretchen Hagen3) (5), SAUR (Small Auxin-Up RNA) (14), and TIR1 (Transport inhibitor response 1) (4) (Fig. 2B). These DEGs were predominantly upregulated in DAF53 and DAF87 mesocarp samples. Cytokinin pathway: 30 DEGs were identified, categorized as A-ARR (Two-component response regulator ARR-A) (4), AHP (Arabidopsis histidine phosphotransfer proteins) (6), B-ARR (B-type Arabidopsis Response Regulator) (17), and CRE1 (Cytokinin Response 1) (3). These genes were mainly upregulated in DAF53 and DAF87 mesocarp samples (Fig. 2C). Ethylene pathway: 22 DEGs were identified, including CTR1 (Constitutive Triple Response 1) (5), EBF1/2 (EIN3-binding F-box protein) (3), EIN3 (Ethylene-insensitive protein 3) (3), ERF1/2 (Ethylene-responsive transcription factor 1/2) (4), ETR (Ethylene receptor) (1), and SIMKK (Mitogen-activated protein kinase kinase 4/5) (6). These genes showed upregulated expression in DAF15, DAF53, and DAF87 mesocarp samples, with partial upregulation in DAF149 (Fig. 2D). Brassinosteroid pathway: 50 DEGs were identified, encompassing BRI1 (Protein brassinosteroid insensitive 1) (35), BSK (BR-signaling kinase) (7), BZR1/2 (Brassinosteroid resistant 1/2) (5), CYCD3 (cyclin D3) (2), and TCH4 (TouCH-induced protein 4) (1). These DEGs were predominantly upregulated in DAF53, DAF87, and DAF118 mesocarp samples (Fig. 2E). 
Methyl jasmonate pathway: 26 DEGs were identified, including COI1 (Coronatine-insensitive protein 1) (3), JAR1 (Jasmonic acid-amino synthetase) (3), JAZ (jasmonate ZIM domain-containing protein) (5), and MYC2 (Transcription factor MYC2) (15). Most DEGs were upregulated in DAF53, DAF87, DAF118, and DAF149 mesocarp samples (Fig. 2F). In summary, all five hormone pathways likely play pivotal roles in almond mesocarp development. Notably, the methyl jasmonate pathway may be particularly critical in regulating late-stage cellular senescence and death processes.
Fig. 2.
Expression patterns of DEGs in five hormone pathways in flesh samples at five stages. A Hormone Flow Chart. B Heatmap of Differentially Expressed Genes in the Auxin Pathway. C Heatmap of Differentially Expressed Genes in the Cytokinin Pathway. D Heatmap of Differentially Expressed Genes in the Ethylene Pathway. E Heatmap of Differentially Expressed Genes in the Brassinosteroid Pathway. F Heatmap of Differentially Expressed Genes in the Jasmonic Acid Pathway. The gene expression heat map is normalized by the Z-score method. Blue to red indicates low expression to high expression. The different color columnar regions corresponding to the left side of each heat map represent different gene types. The number in brackets is the number of genes
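The row-wise Z-score normalization used for the heatmaps can be sketched as follows. Whether the population or the sample standard deviation was used is not stated; the population form is assumed here, and the input values are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def zscore_row(values: List[float]) -> List[float]:
    """Scale one gene's expression values across stages to mean 0 and unit SD."""
    mu = mean(values)
    sd = pstdev(values)  # population standard deviation (assumption)
    if sd == 0:
        # A gene with constant expression carries no pattern; map it to zeros
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


if __name__ == "__main__":
    # Hypothetical FPKM values of one gene at the five stages
    print([round(z, 3) for z in zscore_row([2.0, 10.0, 14.0, 6.0, 3.0])])
```

Normalizing each gene separately makes the color scale comparable between genes with very different absolute expression levels.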
Identification of differentially expressed transcription factors
Transcription factors (TFs) play pivotal roles in plant growth and development by regulating gene expression patterns to ensure proper tissue development. GO enrichment analysis revealed that DNA-binding transcription factor activity (GO:0003700) was significantly enriched across all four comparison groups (Figure S5). Using the transcription factor prediction tool on the Baimaike Cloud Platform, this study identified 1,761 TF genes in the ‘Wanfeng’ almond T2T genome, spanning 74 TF families, including WRKY, MYB, and NAC. Among the four comparison groups (DAF15 vs DAF53, DAF53 vs DAF87, DAF87 vs DAF118, and DAF118 vs DAF149), 589, 405, 568, and 379 DEGs, respectively, were classified as TFs, representing 58–69 TF families (Table S7). Further analysis of the top 10 TF families in each group showed that AP2/ERF, bHLH, C2C2, C2H2, MYB, NAC, and WRKY were consistently present across all four groups (Fig. 3). Notably, AP2/ERF TFs increased to 50 members in the DAF87 vs DAF118 group but sharply decreased to 22 in the DAF118 vs DAF149 group. Additionally, bZIP, HB, FAR1, GRAS, MADS, MYB-related, and C3H TFs appeared in specific comparison groups. These results suggest that almond mesocarp development is jointly regulated by multiple types of transcription factors.
Fig. 3.
Histogram of the top 10 transcription factor families in the four comparison groups
Metabolite identification and classification
This study identified 6,528 metabolites in mesocarp samples of ‘Wanfeng’ almond across five developmental stages, with 2,678 metabolites detected in positive ion mode and 3,850 metabolites in negative ion mode (Table S8). Their abundance exhibited significant temporal dynamics during flesh development (Fig. 4A, 4B). As broad-target metabolomics does not assign functional classifications, the 6,528 metabolites were annotated using KEGG, HMDB, and LIPID MAPS databases. KEGG annotation: 252 metabolites were mapped to 94 pathways across 15 major categories, with some metabolites assigned to multiple pathways. HMDB annotation: 855 metabolites were classified into 15 superclasses and 171 subclasses. LIPID MAPS annotation: 237 metabolites were categorized into 6 main classes and 29 subclasses. The top 20 metabolite types from each database were highlighted. KEGG showed predominant enrichment in pathways related to “Biosynthesis of other secondary metabolites” and “Amino acid metabolism”, with the highest metabolite counts (Fig. 4C). HMDB revealed five dominant categories: Fatty Acyls, Organooxygen compounds, Carboxylic acids and derivatives, Prenol lipids, and Benzene and substituted derivatives, each containing > 50 metabolites (Fig. 4D). LIPID MAPS identified Flavonoids as the most abundant class (75 metabolites) (Fig. 4E).
Fig. 4.
Metabolite analysis of flesh samples at five stages. A Cluster heat map of metabolites in positive ion mode. B Cluster heat map of metabolites in negative ion mode. C KEGG database classification. D HMDB database classification. E LIPID MAPS database classification. The metabolite abundance heat maps are normalized by the Z-score method. Blue to red indicates low to high abundance
Differential metabolites and KEGG enrichment analysis
Differential metabolites reveal how metabolite abundance varies across developmental stages in plant tissues, offering an effective approach to screening key metabolites. This study identified 1,231, 3,248, 2,019, and 2,398 differential metabolites in the four comparison groups (DAF15 vs DAF53, DAF53 vs DAF87, DAF87 vs DAF118, and DAF118 vs DAF149, respectively), with 735 (496), 2,436 (812), 929 (1,090), and 1,329 (1,069) upregulated (downregulated) metabolites. A total of 4,808 differential metabolites were identified, with 220 metabolites shared across all four groups (Fig. 5A, 5B). Based on k-means clustering trends, these 4,808 differential metabolites were grouped into two major clusters (Fig. 5C): 2,482 metabolites exhibited higher relative abundance in DAF15, DAF87, and DAF118 mesocarp samples, while 2,326 metabolites showed higher abundance in DAF53 samples. These results indicate that most metabolites accumulate predominantly during the fruit expansion phase (DAF15–DAF118), corresponding to the peak developmental phase of the mesocarp, and demonstrate a distinct temporal gradient in accumulation.
Fig. 5.
Analysis of differential metabolites in the four comparison groups. A Number of differential metabolites in the four comparison groups. B Venn diagram of differential metabolites in the four comparison groups. C K-means clustering trend diagram of the 4,808 differential metabolites. The x-axis shows the sample groups and the y-axis shows the standardized metabolite content; each colored line represents the average trend in metabolite content across groups for one k-means cluster, and Sub Class denotes the number of the metabolite category sharing the same trend
This study analyzed the top 20 significantly enriched KEGG pathways across four comparison groups. In DAF15 vs DAF53, core pathways included Butanoate metabolism (ko00650), Sulfur metabolism (ko00920), Riboflavin metabolism (ko00740), Thiamine metabolism (ko00730), and Vitamin B6 metabolism (ko00750), suggesting differential metabolites in these stages are involved in energy metabolism and oxidative stress responses (Fig. 6A). DAF53 vs DAF87 showed significant enrichment in pathways such as Anthocyanin biosynthesis (ko00942) and Flavone and flavonol biosynthesis (ko00944), indicating potential roles of differential metabolites in regulating flesh pigmentation and material accumulation (Fig. 6B). DAF87 vs DAF118 highlighted pathways including Biosynthesis of unsaturated fatty acids (ko01040), Ascorbate and aldarate metabolism (ko00053), and Glucosinolate biosynthesis (ko00966), implicating differential metabolites in lipid dynamics and antioxidant processes (Fig. 6C). DAF118 vs DAF149 featured pathways such as Glyoxylate and dicarboxylate metabolism (ko00630), Citrate cycle (TCA cycle) (ko00020), and Cutin, suberine and wax biosynthesis (ko00073), suggesting differential metabolites drive metabolic shifts linked to flesh senescence and degradation product reallocation (Fig. 6D). Cross-group analysis revealed that Biosynthesis of unsaturated fatty acids (ko01040), Anthocyanin biosynthesis (ko00942), and Aminoacyl-tRNA biosynthesis (ko00970) were shared across three comparison groups, potentially regulating membrane structure remodeling, stress resistance enhancement, and protein synthesis/degradation, respectively.
Fig. 6.
Top 20 significantly enriched KEGG pathways in the four comparison groups. A DAF15 vs DAF53 comparison group, B DAF53 vs DAF87 comparison group, C DAF87 vs DAF118 comparison group, D DAF118 vs DAF149 comparison group. The ordinate shows the names of the differential pathways, and the abscissa shows the differential abundance score (DA Score). The DA Score reflects the overall change of all metabolites in a metabolic pathway: a score of 1 indicates that all annotated metabolites in the pathway are upregulated, and −1 indicates that all are downregulated. The length of each line segment represents the absolute value of the DA Score, and the size of the dot at its endpoint represents the number of differential metabolites in the pathway (larger dots indicate more metabolites). Dots to the left of the central axis indicate overall downregulation of the pathway and dots to the right indicate overall upregulation, with longer segments indicating a stronger tendency. The color of the line segment and dot reflects the DA Score: the closer to red, the greater the score, and the closer to blue, the smaller the score
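The DA Score described in this caption can be illustrated with a short calculation. The formula below, (upregulated − downregulated) / annotated metabolites in the pathway, is the commonly used definition and is assumed here because it reproduces the stated limits of 1 and −1; the counts are hypothetical.

```python
def da_score(n_up: int, n_down: int, n_annotated: int) -> float:
    """Differential abundance score of one pathway.

    n_up / n_down: numbers of upregulated / downregulated metabolites in the pathway
    n_annotated:   number of annotated metabolites in the pathway
    """
    if n_annotated <= 0:
        raise ValueError("pathway must contain at least one annotated metabolite")
    if n_up + n_down > n_annotated:
        raise ValueError("differential metabolites cannot exceed annotated metabolites")
    return (n_up - n_down) / n_annotated


if __name__ == "__main__":
    print(da_score(5, 0, 5))   # all annotated metabolites upregulated
    print(da_score(0, 4, 4))   # all annotated metabolites downregulated
    print(da_score(3, 1, 8))   # mixed pathway
```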
Joint analysis of differentially expressed genes and differential metabolites
This study utilized the KEGG Markup Language tool to visualize relationships among the top 10 enriched pathways, differentially expressed genes (DEGs), and differential metabolites (DEMs) across four comparison groups, constructing an integrated network (Fig. 7, Table S9). Results revealed that 23 pathways, 524 DEGs, and 26 DEMs collectively formed a potential regulatory network. Among these, Vitamin B6 metabolism (ko00750), Porphyrin and chlorophyll metabolism (ko00860), Zeatin biosynthesis (ko00908), and Carotenoid biosynthesis (ko00906) pathways were independent, while the remaining 19 pathways were interconnected through shared DEGs. For instance, alpha-Linolenic acid metabolism (ko00592) and Glycerophospholipid metabolism (ko00564) shared seven DEGs: Pdu25338 (lipase class 3 family protein), Pdu15218 (lipase class 3 family protein), Pdu01806 (lipase class 3 family protein), Pdu09944 (lipase class 3 family protein), Pdu21081 (Sugar-Dependent 1 like lipase), Pdu21837 (lipase class 3 family protein), and Pdu21836 (lipase class 3 family protein). Additionally, some pathways were linked via shared DEMs. For example, Melibiose was associated with both Galactose metabolism (ko00052) and ABC transporters (ko02010). Notably, seven DEMs—5-Hydroxyconiferaldehyde; 5-Hydroxyconiferyl alcohol; Choline; Coniferyl alcohol; D-1-Aminopropan-2-ol O-phosphate; and L-Isoleucine—may be implicated in regulating cellular structural stability and mesocarp hardening.
Fig. 7.
Potential regulatory network of KEGG pathway, differentially expressed genes and differential metabolites
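How pathways become connected through shared DEGs can be sketched with plain set intersections. The seven lipase gene IDs below are the ones reported above as shared between ko00592 and ko00564; the remaining gene names and the helper functions `pathway_links` and `isolated_pathways` are hypothetical placeholders, and the real network was built from the full annotation in Table S9.

```python
from itertools import combinations


def pathway_links(pathway_genes):
    """Map each pathway pair to the DEGs they share (pairs with none are omitted)."""
    links = {}
    for a, b in combinations(sorted(pathway_genes), 2):
        shared = pathway_genes[a] & pathway_genes[b]
        if shared:
            links[(a, b)] = shared
    return links


def isolated_pathways(pathway_genes):
    """Pathways that share no DEG with any other pathway."""
    linked = {p for pair in pathway_links(pathway_genes) for p in pair}
    return sorted(set(pathway_genes) - linked)


# DEGs reported in the text as shared by ko00592 and ko00564
shared_lipases = {
    "Pdu25338", "Pdu15218", "Pdu01806", "Pdu09944",
    "Pdu21081", "Pdu21837", "Pdu21836",
}

# Toy input: the extra gene names are hypothetical placeholders
example = {
    "ko00592": shared_lipases | {"hypothetical_gene_A"},
    "ko00564": shared_lipases | {"hypothetical_gene_B"},
    "ko00750": {"hypothetical_gene_C"},
}

links = pathway_links(example)
isolated = isolated_pathways(example)
```

The same intersection logic would apply to shared DEMs, for example a metabolite annotated to both Galactose metabolism and ABC transporters.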
Analysis of phenylpropanoid biosynthetic pathway
Based on the above findings, seven differential metabolites (5-Hydroxyconiferaldehyde, 5-Hydroxyconiferyl alcohol, Choline, Coniferyl alcohol, D-1-Aminopropan-2-ol O-phosphate, and L-Isoleucine) were mapped to the Phenylpropanoid biosynthesis pathway (ko00940), a critical pathway regulating lignin synthesis in plants. Lignin plays a key role in modulating flesh hardness and delaying softening. Accordingly, this study reconstructed a partial schematic of the phenylpropanoid biosynthesis pathway and analyzed the expression patterns of the associated DEGs and DEMs (Fig. 8A, Table S10). A total of 84 DEGs were identified across nine gene categories in the pathway: 4CL (4-coumarate-CoA ligase) (9), CAD (Cinnamyl-alcohol dehydrogenase) (25), CCR (Cinnamoyl-CoA reductase) (13), COMT (Caffeic acid 3-O-methyltransferase) (3), CYP73A (Trans-cinnamate 4-monooxygenase) (2), E2.1.1.104 (Caffeoyl-CoA O-methyltransferase) (2), F5H (Ferulate-5-hydroxylase) (1), PAL (Phenylalanine ammonia-lyase) (1), and POD (Peroxidase) (28). Heatmap analysis revealed stage-specific upregulation of these DEGs across the five developmental stages: CAD, CCR, COMT, and E2.1.1.104 genes were predominantly upregulated in DAF15 and DAF53 mesocarp samples, most 4CL genes were upregulated in DAF149, and POD genes showed elevated expression in DAF15, DAF53, and DAF87 (Fig. 8B). Nine DEMs (L-Phenylalanine, Cinnamic acid, Coumaric acid, Coumaraldehyde, Caffeoyl alcohol, Ferulic acid, Coniferyl alcohol, 5-Hydroxyconiferaldehyde, and 5-Hydroxyconiferyl alcohol) were identified in the pathway. Abundance profiles indicated that L-Phenylalanine and Cinnamic acid reached their highest accumulation levels in DAF53 mesocarp samples (Fig. 8C). Notably, the other seven DEMs exhibited increased accumulation during DAF87, DAF118, and DAF149. Based on these results, it is speculated that the phenylpropanoid biosynthesis pathway may play a role in regulating the texture of almond mesocarp.
Fig. 8.
Phenylpropanoid biosynthesis pathway analysis. A Phenylpropanoid biosynthesis pathway. B Expression heat map of the nine types of differentially expressed genes. C Abundance heat map of the nine differential metabolites. In the flow chart (A), red rectangles mark differential metabolites mapped to the pathway, and pink rectangles mark metabolites that were not mapped. The gene expression and metabolite abundance heat maps in (B) and (C) were normalized using the Z-score method, with blue to red indicating low to high values. The colored bars on the left side of each heat map denote the different gene types and metabolites.
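The Z-score normalization mentioned in the caption can be sketched as follows. This is an illustrative version that assumes row-wise scaling (each gene or metabolite across the five stages) with the sample standard deviation; the caption does not state these details, and the input values below are hypothetical.

```python
from statistics import mean, stdev


def row_zscore(row):
    """Scale one gene's (or metabolite's) values to mean 0 and unit sample SD.

    A row with no variation is returned as all zeros to avoid division by zero.
    """
    m = mean(row)
    s = stdev(row)
    if s == 0:
        return [0.0] * len(row)
    return [(x - m) / s for x in row]


# Hypothetical values for one gene at DAF15, DAF53, DAF87, DAF118, DAF149
hypothetical_profile = [1, 2, 3, 4, 5]
z = row_zscore(hypothetical_profile)
```

Because each row is scaled independently, the heat-map colors show when a gene or metabolite is relatively high or low across stages, not how its absolute level compares with other rows.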
TO-GCN analysis of hormones and phenylpropanoid biosynthesis-related genes
This study constructed a temporal gene co-expression network (TO-GCN) between differentially expressed transcription factors (TFs) and key genes in hormone and phenylpropanoid biosynthesis pathways across five developmental stages of ‘Wanfeng’ almond mesocarp (Fig. 9A, Table S11). The initial node of the network was Pdu02243, an AP2/ERF transcription factor highly expressed in DAF15 mesocarp samples and exhibiting a declining trend in subsequent stages. In the C1 + C2 + GCN framework, relationships with Pearson correlation coefficients ≥ 0.85 between 887 TFs and 9,962 structural genes were visualized using Cytoscape. The final time-ordered co-expression network comprised 887 differentially expressed TFs (including the L1 initial node) and 261 key genes, organized into 7 hierarchical regulatory layers (L1–L7). By integrating FPKM values of DEGs across DAF15, DAF53, DAF87, DAF118, and DAF149 mesocarp samples, differential expression heatmaps were generated for layers L2–L7, with L1 serving as a reference (Fig. 9B). Based on layer-specific expression patterns, mesocarp development was divided into three phases: initial phase (DAF15, fruit formation, soybean-sized), transition phase (DAF53, DAF87, DAF118, continuous expansion followed by growth arrest and texture changes), and termination phase (DAF149, programmed cell death, dehiscence, and dehydration). Stage-specific analysis revealed 152 TFs and 48 key genes highly expressed in the initial phase (L2/L3), 615 TFs and 195 key genes in the transition phase (L4–L6), and 40 TFs and 18 key genes in the termination phase (L7). The network encompassed 67 TF families, with bZIP, B3, WRKY, C2H2, C2C2, NAC, HB, bHLH, AP2/ERF, and MYB being the most abundant (Fig. 9C). TF numbers surged from initial (L2/L3) to transition phases (L4–L6) but sharply declined in the termination phase (L7), while hierarchical expression patterns exhibited high conservation (Fig. 9D). 
These results demonstrate that gene expression abundance in the hierarchical network aligns closely with mesocarp developmental traits. Specifically, 841 TFs (94.92%) and 102 key genes (91.07%) regulated developmental processes during DAF15–DAF118, while gene activity was diminished or silenced in DAF149 as mesocarp maturation halted.
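The two ideas behind the time-ordered network, a Pearson correlation cutoff (≥ 0.85) and hierarchical layers counted outward from an initial node, can be sketched in a simplified form. The sketch assumes that layers are assigned by breadth-first search from the seed, which is our understanding of the TO-GCN approach; it treats all genes as one undirected graph and is not the TO-GCN implementation itself. Gene names and expression values are hypothetical.

```python
from collections import deque
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sx * sy)


def coexpression_edges(expr, cutoff=0.85):
    """Undirected adjacency: genes are linked when their correlation >= cutoff."""
    genes = sorted(expr)
    adj = {g: set() for g in genes}
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(expr[a], expr[b]) >= cutoff:
                adj[a].add(b)
                adj[b].add(a)
    return adj


def bfs_layers(adj, seed):
    """Assign layer 1 to the seed and layer k+1 to unvisited neighbours of layer k."""
    level = {seed: 1}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(adj[node]):
            if neighbour not in level:
                level[neighbour] = level[node] + 1
                queue.append(neighbour)
    return level


# Hypothetical profiles over the five stages (DAF15 ... DAF149)
expr = {
    "TF_seed": [1, 2, 3, 4, 5],
    "TF_a": [2, 4, 6, 8, 10],
    "gene_x": [5, 4, 3, 2, 1],
}
adjacency = coexpression_edges(expr)
layers = bfs_layers(adjacency, "TF_seed")
```

In this toy input only `TF_a` is positively co-expressed with the seed, so it is the only gene placed in a layer; genes that no chain of edges connects to the seed receive no layer.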
Fig. 9.
TO-GCN Analysis. A TO-GCN Analysis of Hormones and Phenylpropanoid Biosynthesis-Related Genes. B Expression heatmap of differentially expressed genes across seven hierarchical layers. C Transcription factor binding sites in the upstream promoter sequence of the Pdu03147 gene. D Expression heatmap of five genes. E Hierarchical regulatory network of the Pdu03147 gene. Red arrows indicate the presence of corresponding transcription factor binding sites in promoter regions
Within the TO-GCN, we elucidated the regulatory network architecture of Pdu03147 (4CL), a key gene in the phenylpropanoid biosynthesis pathway (Fig. 9E). Pdu03147 occupied a central hub position in the network, forming close regulatory associations with 21 transcription factors (TFs). Among these, Pdu07239 (MADS) acted as a fifth-layer regulator, Pdu00846 (C2C2) and Pdu05080 (HD-ZIP) as fourth-layer regulators, Pdu04105 (KNOX), Pdu02801 (HD-ZIP), and Pdu10284 (DBB) as third-layer regulators, and Pdu19065 (AUX/IAA) and Pdu08953 (ARF) as second-layer regulators. A group of 13 TFs, including Pdu23309 (Dof), regulated Pdu03147 directly (first layer). To identify transcription factor binding sites (TFBS) across the hierarchical layers, the promoter sequence of Pdu03147 was analyzed, revealing a regulatory cascade involving Pdu07239, Pdu05080, Pdu02801, and Pdu08953. This cascade potentially modulates metabolites in the phenylpropanoid pathway, including Coumaric acid, 5-Hydroxyconiferyl alcohol, 4-Hydroxycinnamyl aldehyde, 5-Hydroxyconiferaldehyde, Coniferyl alcohol, and Ferulic acid. Expression patterns showed that Pdu07239, Pdu05080, Pdu02801, Pdu08953, and Pdu03147 were significantly upregulated in DAF53, with Pdu03147 also upregulated in DAF149. These findings provide a data reference for future research on almond mesocarp.
Quantitative real-time PCR (qRT-PCR) expression analysis
qRT-PCR analysis of flesh samples across five developmental stages revealed significant expression changes in 18 selected DEGs associated with six pathways—auxin, cytokinin, ethylene, brassinosteroid, methyl jasmonate, and phenylpropanoid biosynthesis. Notably, expression patterns for the majority of these DEGs showed strong consistency between qRT-PCR results and transcriptomic FPKM trends throughout fruit development (Fig. 10, Figure S8, Table S12). In the phenylpropanoid biosynthesis pathway, Pdu00909 (4CL) exhibited peak expression (8.7) at DAF149, while Pdu20693 (POD) showed elevated levels at DAF53 (10.31), DAF87 (7.57), and DAF118 (17.43). Pdu24484 (E2.1.1.104) displayed minimal fluctuations across all stages. Within the methyl jasmonate pathway, Pdu07572 (MYC2) peaked at DAF118 (14.64), and Pdu09190 (JAZ) showed high expression at DAF118 (72.91) and DAF149 (43.88), whereas Pdu15659 (MYC2) remained stable. In the auxin pathway, Pdu00580 (ARF) and Pdu11112 (GH3) were significantly upregulated at DAF118 (15.69 and 5.45, respectively), while Pdu12166 (TIR1) peaked at DAF87 (25.42). For cytokinin signaling, two B-ARR genes, Pdu16901 and Pdu20787, were markedly induced at DAF118 (5.76 and 32.99), whereas Pdu02843 showed no significant variation. Ethylene-related genes Pdu13267 and Pdu17403 (SIMKK) exhibited pronounced upregulation at DAF53, DAF87, and DAF149, while Pdu24786 (EBF1/2) displayed only modest increases across DAF53–DAF149. In brassinosteroid signaling, Pdu00798 (TCH4) was sharply upregulated at DAF118 (26.93), Pdu11482 (BRI1) peaked at DAF118 (15.82) and DAF149 (8.68), and Pdu17542 (BSK) remained relatively stable throughout development.
Fig. 10.
The qRT-PCR expression levels of 18 genes. Rows one to six show genes of the phenylpropanoid biosynthesis, methyl jasmonate, auxin, cytokinin, ethylene, and brassinosteroid pathways, respectively.
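Relative expression values such as those reported above are commonly derived from Ct values with the 2^(−ΔΔCt) method. This section does not state how the values were calculated, so the sketch below is an assumption for illustration only; it presumes roughly 100% amplification efficiency, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Fold change of a target gene relative to a calibrator sample (2^-ddCt).

    ct_target / ct_reference                       -- Ct values in the sample of interest
    ct_target_calibrator / ct_reference_calibrator -- Ct values in the calibrator
                                                      (for example, the earliest stage)
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    # Each cycle of difference corresponds to a two-fold change in template
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier than in
# the calibrator, while the reference gene is unchanged
fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)
```

With the calibrator set to the first sampling stage, a value above 1 indicates higher expression than at that stage and a value below 1 indicates lower expression.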
Discussion
The flesh of the fruit plays a dual role: it serves as a natural barrier protecting seeds from external environmental stresses and acts as an effective medium for promoting species dispersal through animal foraging behavior. In recent years, with the rapid advancement of sequencing technologies and significant reductions in associated costs, transcriptomic and metabolomic analyses have emerged as powerful tools to unravel the diverse characteristics of fruit flesh, such as color, flavor, texture, and nutritional composition [27, 28]. Notably, research on almond has predominantly focused on the kernel, while transcriptomic and metabolomic studies on its fleshy tissue remain scarcely reported. Therefore, this study conducted integrated transcriptomic and metabolomic analyses on flesh samples of the ‘Wanfeng’ almond at five developmental stages: DAF15, DAF53, DAF87, DAF118, and DAF149.
This study obtained 107.25 Gb of clean transcriptomic data from 15 flesh samples of the ‘Wanfeng’ almond at five developmental stages: DAF15, DAF53, DAF87, DAF118, and DAF149. DEGs can reveal important biological processes potentially involved in regulating growth and development, metabolic processes, and signal transduction [29]. This study discovered widespread gene expression changes throughout almond flesh development. A large number of DEGs were identified in all four comparison groups, indicating that each key transition in the flesh – from early expansion and substance accumulation to later maturation, senescence, and ultimately dehydration and splitting – is accompanied by the activation and switching of complex transcriptional regulatory networks. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analyses provided crucial clues for understanding the biological significance of these dynamic changes [30, 31]. Unlike the typical double-sigmoid growth curve (characterized by two rapid expansion phases) observed in the flesh of Prunus species such as peach, plum, and apricot, almond flesh development exhibits a unimodal expansion pattern. In the early stages (DAF15 to DAF53), DEGs were significantly enriched in pathways related to cell proliferation and cell wall regulation, driving rapid cell division and flesh expansion. In subsequent stages (DAF87, DAF118, and DAF149), enrichment shifted towards functions associated with photosynthetic product accumulation and metabolism. This indicates that the expansion process is concentrated in the early developmental stages, lacking a typical lag phase and secondary expansion process [4]. 
Notably, unlike transcriptomic studies of thick-fleshed, high-sugar-acid flavor species like peach and apricot, the DEG enrichment results in almond mesocarp revealed minimal significant enrichment for key pathways directly involved in the biosynthesis of sugars (e.g., sucrose, starch metabolism) or organic acids (e.g., citric acid, malic acid cycles) [32, 33]. This finding suggests that the core metabolic activity of almond flesh during development may prioritize fundamental metabolism required to maintain basic cellular structure and function, rather than the substantial synthesis and accumulation of compounds such as soluble sugars and organic acids that confer flavor and thick texture. This difference in metabolic strategy may underlie the formation of the thin and astringent flesh structure in almonds and reflects distinct adaptive resource allocation strategies among different Prunus species during fruit evolution. Transcription factors play a central role in regulating flesh development. By integrating endogenous hormone signals and external environmental factors, they activate or suppress downstream target gene expression networks. This process finely coordinates key biological processes within flesh cells, including proliferation and expansion, textural changes, pigment accumulation, and sugar-acid metabolism, ultimately determining fruit quality and ripening progression [34, 35]. Transcriptional regulation in almond mesocarp exhibited stage-specific dynamics. During the DAF15 and DAF53 stages, transcription factor families such as AP2/ERF were highly enriched, potentially participating in cell division and proliferation. At the DAF87 and DAF118 stages, AP2/ERF enrichment peaked and factors like FAR1 appeared, suggesting their potential co-regulation of diverse biosynthetic pathways. 
By the DAF149 stage, bHLH transcription factors dominated, and the MADS-box family was enriched for the first time, potentially associated with maturation or the activation of programmed cell death. These stage-dependent transcriptional shifts require functional validation but provide potential clues for deciphering the regulatory network governing almond fruit development.
Plant hormones act as crucial intercellular signaling molecules, and their signal transduction pathway (ko04075) serves as a core hub coordinating developmental processes such as plant cell division, differentiation, expansion, and death [36]. Auxin and cytokinin drive the cell cycle and morphogenesis [37, 38]. Gibberellins and brassinosteroids promote cell elongation [39, 40]. Ethylene, abscisic acid, and jasmonates finely regulate organ senescence and programmed cell death [41–43]. This study found this pathway significantly enriched during all developmental stage transitions, highlighting the critical role of the hormone network in coordinating almond flesh development. In-depth analysis of DEGs related to five hormone classes (auxin, cytokinin, ethylene, brassinosteroids, and methyl jasmonate) revealed distinct expression patterns across the almond flesh developmental stages. Notably, genes related to auxin, cytokinin, and brassinosteroids were generally upregulated at DAF53 and DAF87, suggesting that they may play important roles in promoting flesh cell division and expansion. The activity of DEGs in the ethylene pathway across three stages (DAF15, DAF53, and DAF87) and the significant upregulation of methyl jasmonate pathway DEGs in the two later stages (DAF118 and DAF149) indicate that jasmonate signaling may play a more critical or synergistic role than ethylene in driving programmed cell death, dehydration, and the final splitting process in the late stages of almond flesh development. This potentially differs from models in which ethylene primarily drives ripening and softening in some Prunus fruits, warranting further investigation [44, 45].
Almond flesh lacks typical softening phenomena in its late developmental stages. Within the phenylpropanoid biosynthesis pathway, key genes such as CAD and 4CL efficiently catalyze the biosynthesis of lignin monomers, driving the lignification process of cell walls, thereby maintaining and enhancing flesh hardness [46, 47]. Therefore, this study investigated the lignin synthesis branch of the phenylpropanoid pathway. The accumulation patterns of its metabolites (e.g., Coniferyl alcohol, 5-Hydroxyconiferaldehyde) during late development, combined with the expression dynamics of related genes (e.g., CAD, CCR, POD), suggest that this pathway potentially plays a role in regulating late-stage cell wall reinforcement (lignification), texture hardening, and preparation for eventual dehydration and splitting in almond flesh. This significant cell wall remodeling event is typically avoided during the late development of thick-fleshed, fresh-eating Prunus fruits (e.g., peach, apricot), further highlighting the unique developmental strategy of almond flesh. At the transcriptional regulation level, seven transcription factor families, including AP2/ERF, bHLH, MYB, NAC, and WRKY, were consistently present at high proportions across the four comparison groups, indicating their important roles in regulating the almond flesh developmental process. Members of these TF families are typically involved in broad processes such as the cell cycle, metabolism, stress response, and cell fate determination [48–50]. Furthermore, this study employed the TO-GCN tool to construct a time-series gene co-expression network integrating differentially expressed TFs and key DEGs (hormone pathway and phenylpropanoid pathway genes). This network successfully integrated gene expression patterns across key developmental stages, delineated the developmental process into distinct initial, transition, and termination phases, and revealed a hierarchical regulatory relationship. 
In summary, the above results provide a data reference for understanding the regulatory characteristics of almond flesh development.
Metabolomics technology enables the unbiased and comprehensive profiling of the rich metabolome in fruits, providing key evidence for systematically revealing the molecular mechanisms underlying quality formation, developmental regulation, and environmental responses [51, 52]. In the fruit development of typical cultivated Prunus trees, metabolite accumulation features are most notably manifested in sugars, organic acids, and flavor-related compounds [53]. This study identified 6,528 metabolites in flesh samples of ‘Wanfeng’ almond across five developmental stages: DAF15, DAF53, DAF87, DAF118, and DAF149. According to accumulation patterns, most metabolites showed a decreasing trend as the flesh matured. Annotation of the 6,528 metabolites revealed that six major classes predominated: Fatty Acyls, Organooxygen compounds, Carboxylic acids and derivatives, Prenol lipids, Benzene and substituted derivatives, and Flavonoids. These six classes of metabolites possess multiple functions, including regulating secondary metabolism, enhancing stress resistance, delaying cellular oxidative damage, promoting lignin accumulation, and enhancing the mechanical strength of cell walls. This suggests that almond flesh also possesses complex regulatory mechanisms during development [54]. The most significant feature is the stark contrast to typical fresh-eating Prunus fruits (e.g., peach, plum, apricot), characterized by the lack of significant metabolites associated with sugars, organic acids, and flavor substances. This further corroborates the transcriptomic results, indicating that almond flesh during development may primarily focus on fundamental metabolic activities to ensure the normal development and maturation of the seed. Fundamental metabolic activities provide the necessary energy and substances for flesh cells, maintaining basic cellular life activities, thereby ensuring the kernel receives adequate nutrition during development. 
This may represent the reproductive strategy distinguishing almond from its close relatives such as peach, plum, and apricot. In summary, through the detection and quantitative analysis of metabolites in flesh samples from five developmental stages of ‘Wanfeng’ almond, this study not only reveals the changes in metabolite types and accumulation patterns during flesh development but also provides a reference for understanding the potential metabolic characteristics of almond flesh development.
Integrated transcriptomic and metabolomic analysis can provide a more comprehensive understanding of changes and regulatory mechanisms in biological systems [55, 56]. Therefore, this study integrated transcriptomic and metabolomic analyses to reveal the core synergistic regulatory network driving the unique developmental trajectory of almond flesh, particularly its key late-stage transitions. Unlike the significant enrichment of metabolites in flavor compounds during fruit development in fresh-eating Prunus species, this study highlights the structural–functional orientation of almond flesh development. Integrated analysis of the early stages (DAF15, DAF53, and DAF87) showed enrichment in pathways such as amino acid and energy metabolism, reflecting the demand for fundamental substance synthesis supporting cell expansion. However, the most critical finding lies in the late developmental stages (DAF118 and DAF149), where the co-enrichment of genes and metabolites clearly points to a regulatory network centered on phenylpropanoid metabolism (phenylalanine metabolism; phenylalanine, tyrosine and tryptophan biosynthesis) and the cutin, suberine, and wax biosynthesis pathways. This activation is not an isolated event; extensive cross-talk between pathways (e.g., phenylpropanoid biosynthesis sharing key genes and metabolites with ubiquinone biosynthesis and flavonoid biosynthesis) constitutes a networked regulatory module. The core biological output of this module is driving the significant synthesis and accumulation of structural macromolecules such as lignin, cutin, and waxes. Lignification (the core product of phenylpropanoid metabolism) reinforces the cell wall, while the cutin/wax layer forms a protective barrier. Together, they act to maintain flesh structural integrity, resist stress, and prepare for the final drying and splitting. This forms a converse strategy to the prevalent cell wall degradation and softening during the ripening of fresh-eating fruits. 
Therefore, it is hypothesized that the lack of softening in late-stage almond flesh development may essentially result from this active structural reinforcement program. qRT-PCR confirmed the dynamic expression of genes in multiple plant hormone pathways (auxin, cytokinin, brassinosteroid, and methyl jasmonate) and of key phenylpropanoid pathway genes during ‘Wanfeng’ almond flesh development, with the most pronounced changes at DAF118. The coordinated upregulation of multi-pathway genes at this stage suggests that converging hormone signals drive late fruit maturation processes, including cell expansion, secondary metabolism, and stress adaptation. The burst-like expression of key phenylpropanoid pathway genes (e.g., POD) may be associated with lignin deposition or oxidative stress response, collectively regulating flesh texture development.
Conclusions
This study conducted integrated transcriptomic and metabolomic analyses on mesocarp tissues of the ‘Wanfeng’ almond at five developmental stages. Comparative analysis between adjacent developmental stages identified a substantial number of differentially expressed genes (DEGs). These findings revealed that almond mesocarp development exhibits a unimodal expansion pattern. The first phase was primarily associated with cell proliferation, photosynthesis, and substance accumulation. The second phase was characterized by shifts in metabolite dynamics and the initiation of programmed cell death. Metabolomic profiling identified 6,528 metabolites. Notably, the phenylpropanoid biosynthesis pathway was implicated as potentially involved in regulating mesocarp hardness. Furthermore, the construction of a TO-GCN elucidated potential hierarchical regulatory relationships among key transcription factors and DEGs involved in five phytohormone signaling pathways and the phenylpropanoid biosynthesis pathway. qRT-PCR validation confirmed the dynamic expression patterns of key genes within the five phytohormone pathways and the phenylpropanoid biosynthesis pathway throughout mesocarp development. Collectively, this study provides a foundational data resource for subsequent exploration of the regulatory mechanisms governing almond mesocarp development.
Supplementary Information
Acknowledgements
We would like to extend our sincere gratitude to Zeng Xiangxi, an Automation major at the College of Optoelectronic Information and Computer Engineering, University of Shanghai for Science and Technology, for his significant assistance with computer programming, statistical analysis, and figure preparation.
Authors’ contributions
D.Z: Writing-review & editing, Writing-original draft, Formal analysis, Data curation. Z.Y: Writing-review & editing, Supervision. X.Z: Writing-editing, Formal analysis, Data curation. Y.H: Writing-original draft, Visualization. X.L: Writing-original draft, Visualization. B.Z: Writing-review & editing, Writing-original draft, Project administration, Funding acquisition. The authors read and approved the final manuscript.
Funding
2024 Xinjiang Autonomous Region Key R&D Task Special Project “Research and Demonstration of Key Technologies for Almond Breeding and Efficient Production and Storage and Processing”, Project No.: 2024B02018, Task (Topic) 2: Research and Demonstration of Key Technologies for Efficient Almond Production
Data availability
The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive (Genomics, Proteomics & Bioinformatics 2021) in the National Genomics Data Center (Nucleic Acids Res 2024), China National Center for Bioinformation/Beijing Institute of Genomics, Chinese Academy of Sciences (GSA: CRA022929) and are publicly accessible at https://ngdc.cncb.ac.cn/gsa.
The whole-genome sequence data reported in this paper have been deposited in the Genome Warehouse in National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation, under accession number GWHFQHI00000000.1 and are publicly accessible at https://ngdc.cncb.ac.cn/gwh.
Declarations
Ethics approval and consent to participate
These plant materials don’t include any species at risk of extinction. We declare that all the experimental plants were collected with permission from local authorities of agricultural department. We comply with relevant institutional, national, and international guidelines and legislation for plant study.
The manuscript does not include data or description of human or animal patients.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Pei MS, Cao SH, Wu L, et al. Comparative transcriptome analyses of fruit development among pears, peaches, and strawberries provide new insights into single sigmoid patterns. BMC Plant Biol. 2020;20(1):108. 10.1186/s12870-020-2317-6. Published 2020 Mar 6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Mishra BS, Sharma M, Laxmi A. Role of sugar and auxin crosstalk in plant growth and development. Physiol Plant. 2022;174(1):e13546. 10.1111/ppl.13546. [DOI] [PubMed] [Google Scholar]
- 3.Paniagua C, Santiago-Doménech N, Kirby AR, Gunning AP, Morris VJ, Quesada MA, et al. Structural changes in cell wall pectins during strawberry fruit development. Plant Physiol Biochem. 2017;118:55–63. 10.1016/j.plaphy.2017.06.001. [DOI] [PubMed] [Google Scholar]
- 4.Dardick C, Callahan AM. Evolution of the fruit endocarp: molecular mechanisms underlying adaptations in seed protection and dispersal strategies. Front Plant Sci. 2014;5:284. 10.3389/fpls.2014.00284. PMID: 25009543; PMCID: PMC4070412. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Li M, Galimba K, Xiao Y, et al. Comparative transcriptomic analysis of apple and peach fruits: insights into fruit type specification. Plant J. 2022;109(6):1614–29. 10.1111/tpj.15633. [DOI] [PubMed] [Google Scholar]
- 6.Cao T, DeJong TM, Kirkpatrick BC. Almond leaf scorch disease development on almond branches high-grafted on peach rootstock. Plant Dis. 2013;97(2):277–81. 10.1094/PDIS-06-12-0580-RE. [DOI] [PubMed] [Google Scholar]
- 7.Bielsa B, Sanz MÁ, Rubio-Cabetas MJ. Uncovering early response to drought by proteomic, physiological and biochemical changes in the almond × peach rootstock “Garnem.” Funct Plant Biol. 2019;46(11):994–1008. 10.1071/FP19050. [DOI] [PubMed] [Google Scholar]
- 8. Sakar EH, El Yamani M, Boussakouran A, et al. Codification and description of almond (Prunus dulcis) vegetative and reproductive phenology according to the extended BBCH scale. Sci Hortic. 2019;247:224–34. 10.1016/j.scienta.2018.12.024.
- 9. Özcan MM. A review on some properties of almond: ımpact of processing, fatty acids, polyphenols, nutrients, bioactive properties, and health aspects. J Food Sci Technol. 2023;60(5):1493–504. 10.1007/s13197-022-05398-0.
- 10. Ouzir M, Bernoussi SE, Tabyaoui M, Taghzouti K. Almond oil: a comprehensive review of chemical composition, extraction methods, preservation conditions, potential health benefits, and safety. Compr Rev Food Sci Food Saf. 2021;20(4):3344–87. 10.1111/1541-4337.12752.
- 11. Chen S. Ultrafast one‐pass FASTQ data preprocessing, quality control, and deduplication using fastp. Imeta. 2023;2(2):e107. 10.1002/imt2.107. Published 2023 May 8.
- 12. Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. 2019;37(8):907–15. 10.1038/s41587-019-0201-4.
- 13. Li H, Handsaker B, Wysoker A, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. 10.1093/bioinformatics/btp352.
- 14. Okonechnikov K, Conesa A, García-Alcalde F. Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data. Bioinformatics. 2016;32(2):292–4. 10.1093/bioinformatics/btv566.
- 15. Shumate A, Wong B, Pertea G, Pertea M. Improved transcriptome assembly using a hybrid of long and short reads with StringTie. PLoS Comput Biol. 2022;18(6):e1009730. 10.1371/journal.pcbi.1009730. Published 2022 Jun 1.
- 16. Frazee AC, Pertea G, Jaffe AE, Langmead B, Salzberg SL, Leek JT. Ballgown bridges the gap between transcriptome assembly and expression analysis. Nat Biotechnol. 2015;33(3):243–6. 10.1038/nbt.3172. PMID: 25748911; PMCID: PMC4792117.
- 17. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550. 10.1186/s13059-014-0550-8. PMID: 25516281; PMCID: PMC4302049.
- 18. Chang YM, Lin HH, Liu WY, Yu CP, Chen HJ, Wartini PP, et al. Comparative transcriptomics method to infer gene coexpression networks and its applications to maize and rice leaf transcriptomes. Proc Natl Acad Sci U S A. 2019;116(8):3091–9. 10.1073/pnas.1817621116.
- 19. Majeed A, Mukhtar S. Protein-protein interaction network exploration using Cytoscape. Methods Mol Biol. 2023;2690:419–27. 10.1007/978-1-0716-3327-4_32. PMID: 37450163.
- 20. Arbouche N, Walch A, Raul JS, Kintz P. Intentional overdose of glargine insulin: determination of the parent compound in postmortem blood by LC-HRMS. J Forensic Sci. 2023;68(3):1077–83. 10.1111/1556-4029.15247.
- 21. He L, Li C, Chen Z, Huo Y, Zhou B, Xie F. Combined metabolome and transcriptome analysis reveal the mechanism of water stress in Ophiocordyceps sinensis. BMC Genomics. 2024;25(1):1014. 10.1186/s12864-024-10785-2. PMID: 39472792; PMCID: PMC11523607.
- 22. Wang M, Chen L, Liang Z, He X, Liu W, Jiang B, et al. Metabolome and transcriptome analyses reveal chlorophyll and anthocyanin metabolism pathway associated with cucumber fruit skin color. BMC Plant Biol. 2020;20(1):386. 10.1186/s12870-020-02597-9. PMID: 32831013; PMCID: PMC7444041.
- 23. Liu X, Fan HM, Liu DH, Liu J, Shen Y, Zhang J, et al. Transcriptome and metabolome analyses provide insights into the watercore disorder on “Akibae” pear fruit. Int J Mol Sci. 2021;22(9):4911. 10.3390/ijms22094911.
- 24. Singh VK, Mangalam AK, Dwivedi S, Naik S. Primer premier: program for design of degenerate primers from a protein sequence. Biotechniques. 1998;24(2):318–9. 10.2144/98242pf02. PMID: 9494736.
- 25. Yu Z, Zhang D, Hu S, et al. Genome-wide analysis of the almond AP2/ERF superfamily and its functional prediction during dormancy in response to freezing stress. Biol. 2022;11(10):1520. 10.3390/biology11101520. Published 2022 Oct 17.
- 26. Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2−ΔΔCT method. Methods. 2001;25(4):402–8. 10.1006/meth.2001.1262.
- 27. Yu X, Ali MM, Li B, Fang T, Chen F. Transcriptome data-based identification of candidate genes involved in metabolism and accumulation of soluble sugars during fruit development in “Huangguan” plum. J Food Biochem. 2021;45(9):e13878. 10.1111/jfbc.13878.
- 28. Zhang Y, Shu H, Mumtaz MA, et al. Transcriptome and metabolome analysis of color changes during fruit development of pepper (Capsicum baccatum). Int J Mol Sci. 2022;23(20):12524. 10.3390/ijms232012524.
- 29. Bianchimano L, De Luca MB, Borniego MB, Iglesias MJ, Casal JJ. Temperature regulation of auxin-related gene expression and its implications for plant growth. J Exp Bot. 2023;74(22):7015–33. 10.1093/jxb/erad265.
- 30. Peng J, Uygun S, Kim T, Wang Y, Rhee SY, Chen J. Measuring semantic similarities by combining gene ontology annotations and gene co-function networks. BMC Bioinformatics. 2015;16(1):44. 10.1186/s12859-015-0474-7. Published 2015 Feb 14.
- 31. Kanehisa M. KEGG bioinformatics resource for plant genomics and metabolomics. Methods Mol Biol. 2016;1374:55–70. 10.1007/978-1-4939-3167-5_3.
- 32. Zhou H, Wang L, Su M, Zhang X, Du J, Li X, et al. Comparative network analysis reveals the regulatory mechanism of 1-methylcyclopropene on sugar and acid metabolisms in yellow peach stored at non-chilling temperatures. Plant Physiol Biochem. 2024;216:109100. 10.1016/j.plaphy.2024.109100.
- 33. Zhang Q, Feng C, Li W, Qu Z, Zeng M, Xi W. Transcriptional regulatory networks controlling taste and aroma quality of apricot (Prunus armeniaca L.) fruit during ripening. BMC Genomics. 2019;20(1):45. 10.1186/s12864-019-5424-8. PMID: 30646841; PMCID: PMC6332858.
- 34. Liu Y, Lv G, Yang Y, Ma K, Ren X, Li M, et al. Interaction of AcMADS68 with transcription factors regulates anthocyanin biosynthesis in red-fleshed kiwifruit. Hortic Res. 2022;10(2):uhac252. 10.1093/hr/uhac252.
- 35. Li H, Yang Y, Zhang W, Zheng H, Xu X, Li H, et al. Promoter replication of grape MYB transcription factor is associated with a new red flesh phenotype. Plant Cell Rep. 2024;43(6):136. 10.1007/s00299-024-03225-8. PMID: 38709311.
- 36. Cosgrove DJ. Structure and growth of plant cell walls. Nat Rev Mol Cell Biol. 2024;25(5):340–58. 10.1038/s41580-023-00691-y. Epub 2023 Dec 15. PMID: 38102449.
- 37. Gomes GLB, Scortecci KC. Auxin and its role in plant development: structure, signalling, regulation and response mechanisms. Plant Biol. 2021;23(6):894–904. 10.1111/plb.13303.
- 38. Yang W, Cortijo S, Korsbo N, Roszak P, Schiessl K, Gurzadyan A, et al. Molecular mechanism of cytokinin-activated cell division in Arabidopsis. Science. 2021;371(26):1350–5. 10.1126/science.abe2305.
- 39. Mesejo C, Yuste R, Reig C, Martínez-Fuentes A, Iglesias DJ, Muñoz-Fambuena N, et al. Gibberellin reactivates and maintains ovary-wall cell division causing fruit set in parthenocarpic Citrus species. Plant Sci. 2016;247:13–24. 10.1016/j.plantsci.2016.02.018.
- 40. Delesalle C, Vert G, Fujita S. The cell surface is the place to be for brassinosteroid perception and responses. Nat Plants. 2024;10(2):206–18. 10.1038/s41477-024-01621-2. Epub 2024 Feb 22. PMID: 38388723.
- 41. Parveen S, Altaf F, Farooq S, Lone ML, Ul Haq A, Tahir I. The swansong of petal cell death: insights into the mechanism and regulation of ethylene-mediated flower senescence. J Exp Bot. 2023;74(14):3961–74. 10.1093/jxb/erad217.
- 42. Zhao L, Zhang W, Song Q, Xuan Y, Li K, Cheng L, et al. A WRKY transcription factor, TaWRKY40-D, promotes leaf senescence associated with jasmonic acid and abscisic acid pathways in wheat. Plant Biol. 2020;22(6):1072–85. 10.1111/plb.13155.
- 43. An JP, Wang XF, Zhang XW, You CX, Hao YJ. Apple BT2 protein negatively regulates jasmonic acid-triggered leaf senescence by modulating the stability of MYC2 and JAZ2. Plant Cell Environ. 2021;44(1):216–33. 10.1111/pce.13913.
- 44. Cao H, Chen J, Yue M, Xu C, Jian W, Liu Y, et al. Tomato transcriptional repressor MYB70 directly regulates ethylene-dependent fruit ripening. Plant J. 2020;104(6):1568–81. 10.1111/tpj.15021.
- 45. Chen X, Liu Y, Zhang X, Zheng B, Han Y, Zhang RX. PpARF6 acts as an integrator of auxin and ethylene signaling to promote fruit ripening in peach. Hortic Res. 2023;10(9):uhad158. 10.1093/hr/uhad158.
- 46. Özparpucu M, Gierlinger N, Burgert I, Van Acker R, Vanholme R, Boerjan W, et al. The effect of altered lignin composition on mechanical properties of CINNAMYL ALCOHOL DEHYDROGENASE (CAD) deficient poplars. Planta. 2018;247(4):887–97. 10.1007/s00425-017-2828-z. Epub 2017 Dec 21. PMID: 29270675.
- 47. Cheng C, Zhang C, Jin X, Wang T, Zhang Y, Wang Y, et al. Calcium disrupts CML38/WRKY46-NAC187-CCR cascade to inhibit the formation of lignin-related physiological disorders in pear fruit. Plant Biotechnol J. 2025. 10.1111/pbi.70158.
- 48. Liu H, Wang K, Yang J, Wang X, Mei Q, Qiu L, et al. The apple transcription factor MdbHLH4 regulates plant morphology and fruit development by promoting cell enlargement. Plant Physiol Biochem. 2023;205:108207. 10.1016/j.plaphy.2023.108207.
- 49. Peng K, Xiao G, Shi Y, Huang X. Transcription factor CsNAC25 mediating dual roles in tea plant secondary cell wall formation and trichome development. Plant Sci. 2025;356:112499. 10.1016/j.plantsci.2025.112499.
- 50. Wang Y, Li Y, He SP, Xu SW, Li L, Zheng Y, et al. The transcription factor ERF108 interacts with AUXIN RESPONSE FACTORs to mediate cotton fiber secondary cell wall biosynthesis. Plant Cell. 2023;35(30):4133–54. 10.1093/plcell/koad214.
- 51. Nicolaï BM, Xiao H, Han Q, Tran DT, Crouch E, Hertog MLATM, et al. Spatio-temporal dynamics of the metabolome of climacteric fruit during ripening and post-harvest storage. J Exp Bot. 2023;74(20):6321–30. 10.1093/jxb/erad230.
- 52. Liu R, Deng Y, Liu Y, Wang Z, Yu S, Nie Y, et al. Combined analysis of transcriptome and metabolome reveals the potential mechanism of the enantioselective effect of chiral penthiopyrad on tomato fruit flavor quality. J Agric Food Chem. 2022;70(7):10872–85. 10.1021/acs.jafc.2c03870.
- 53. Zhang Z, Shi Q, Wang B, Ma A, Wang Y, Xue Q, et al. Jujube metabolome selection determined the edible properties acquired during domestication. Plant J. 2022;109(5):1116–33. 10.1111/tpj.15617.
- 54. Khan N. Exploring plant resilience through secondary metabolite profiling: advances in stress response and crop improvement. Plant Cell Environ. 2025;48(7):4823–37. 10.1111/pce.15473.
- 55. Huang T, Zhang X, Wang Q, Guo Y, Xie H, Li L, et al. Metabolome and transcriptome profiles in quinoa seedlings in response to potassium supply. BMC Plant Biol. 2022;22(1):604. 10.1186/s12870-022-03928-8. PMID: 36539684; PMCID: PMC9768898.
- 56. Fan W, Li B, Tian H, Li X, Ren H, Zhou Q. Metabolome and transcriptome analysis predicts metabolism of violet-red color change in Lilium bulbs. J Sci Food Agric. 2022;102(7):2903–15. 10.1002/jsfa.11631.
Data Availability Statement
The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive (Genomics, Proteomics & Bioinformatics 2021) in the National Genomics Data Center (Nucleic Acids Res 2024), China National Center for Bioinformation/Beijing Institute of Genomics, Chinese Academy of Sciences (GSA: CRA022929) and are publicly accessible at https://ngdc.cncb.ac.cn/gsa.
The whole-genome sequence data reported in this paper have been deposited in the Genome Warehouse in the National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation, under accession number GWHFQHI00000000.1 and are publicly accessible at https://ngdc.cncb.ac.cn/gwh.