Main text
Standard transcriptome analyses based on a single reference genome often fail to capture genotype-specific gene expression, limiting our understanding of transcriptional diversity within species. Although pan-genome studies have documented extensive structural variation across crops (Tao et al., 2020), including maize, rice, and soybean, the functional implications of this variation for gene regulation, and transcriptomic variation within a species more generally, remain underexplored. Recently, Guo et al. (2025) presented PanBaRT20, a comprehensive pan-transcriptome resource for barley. Built from long- and short-read RNA sequencing (RNA-seq) data from 20 diverse genotypes across five tissues (Figure 1), PanBaRT20 achieved an average mapping efficiency of 87.3% for RNA-seq read alignment during transcript quantification, an improvement of 11.1 percentage points over BaRTv2.0 (Coulter et al., 2022).
Figure 1.
Schematic illustration of the pan-transcriptome development process and its benefits for functional genomics.
The diagram shows the five barley (Hordeum vulgare) tissue types selected for RNA sequencing, illustrating the sampling strategy. The workflow depicts the sequencing methodologies employed, namely short-read RNA-seq and long-read PacBio isoform sequencing (Iso-seq), which together enable detailed transcriptional profiling. The figure also summarizes the research benefits and their practical implications, indicating the potential applicability of this approach to other crops.
PanBaRT20 also revealed more gene isoforms and structural variations than BaRTv2.0. Barley (Hordeum vulgare), the fourth most widely cultivated cereal crop globally, exhibits significant adaptability across diverse agroecological conditions and plays a central role in food and beverage production. Its diploid genome and considerable phenotypic variability make it an excellent model for studying gene regulation within the Triticeae tribe (Mascher et al., 2017).
Earlier transcriptome approaches, constrained by single-reference datasets such as BaRTv2.0, often overlooked genotype-specific transcripts, impeding efforts to link genetic diversity with phenotypic outcomes (Coulter et al., 2022). By integrating short-read RNA-seq and long-read PacBio Iso-seq data across five tissues, Guo et al. (2025) developed a detailed transcriptional profile that goes beyond genomic predictions alone. This pan-transcriptome deepens our understanding of regulatory mechanisms in barley and provides a valuable framework for crop research.
PanBaRT20 incorporates genotype-specific reference transcript datasets (GsRTDs) from 20 barley genotypes spanning diverse geographic origins. The resource includes 79 600 genes and 582 000 transcripts across five tissues and raises mapping efficiency from 76.2% to 87.3%. Functional analysis revealed core genes (21.85%), shell genes (40.47%), and cloud genes (37.68%), with shell and cloud genes enriched for stress-response functions. Additionally, the number of splice junctions detected increased from 146 600 to 311 300, reflecting enhanced detection of alternative splicing. Copy-number variation (CNV) in CBF2/4 genes and structural variation (the 141 Mb inversion on chromosome 7H) demonstrate genotype-specific regulatory complexity beyond sequence variation alone.
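The core/shell/cloud partition can be illustrated with a short sketch. The thresholds used here are an assumption made for illustration (core: present in all 20 genotypes; cloud: present in exactly one; shell: anything in between), and Guo et al. (2025) may define the categories differently. The function names, gene identifiers, and presence counts are hypothetical.

```python
from collections import Counter

N_GENOTYPES = 20  # number of genotypes in PanBaRT20


def classify_gene(n_present: int, n_total: int = N_GENOTYPES) -> str:
    """Assign a pan-transcriptome category from a presence count.

    Assumed thresholds (illustrative, not taken from the source):
    core = all genotypes, cloud = exactly one, shell = anything in between.
    """
    if not 1 <= n_present <= n_total:
        raise ValueError("presence count must be between 1 and n_total")
    if n_present == n_total:
        return "core"
    if n_present == 1:
        return "cloud"
    return "shell"


def category_fractions(presence: dict[str, int]) -> dict[str, float]:
    """Return the percentage of genes falling in each category."""
    counts = Counter(classify_gene(n) for n in presence.values())
    total = sum(counts.values())
    return {cat: 100.0 * counts[cat] / total for cat in ("core", "shell", "cloud")}


# Hypothetical toy input: gene ID -> number of genotypes in which it is detected.
toy_presence = {"geneA": 20, "geneB": 20, "geneC": 12, "geneD": 3, "geneE": 1}
fractions = category_fractions(toy_presence)
```

Applied to a real presence/absence table, the same tally would yield category percentages comparable to those reported above, provided the category definitions match those of the original study.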
PanBaRT20 also has several limitations. It analyzes only five tissues, risking omission of stress-induced transcripts in unsampled organs. Moreover, it does not address the technical trade-offs between sequencing approaches: short reads provide higher sequencing depth and more accurate quantification, whereas long reads enable better detection and resolution of full-length transcript isoforms. It also overlooks biases from pooling samples, which can obscure genotype-specific expression and may mask regulatory variation. By comparison, pan-transcriptome studies in maize (368 genotypes; Jin et al., 2016) and wheat (9 cultivars; White et al., 2024) have taken different approaches, emphasizing de novo annotation and the identification of core/dispensable gene sets to capture genetic diversity and presence/absence variation; however, these approaches do not necessarily address the technical limitations discussed above.
In their analysis, Guo et al. (2025) focused on structural variation at selected loci. For example, a 141 Mb inversion on chromosome 7H is present in 40% of post-2000 UK varieties and affects 75 differentially expressed genes, including starch metabolism components linked to grain quality. Although the study does not demonstrate direct phenotypic correlations, the functional significance of the inversion may extend beyond the number of differentially expressed genes. Furthermore, CBF2/4 copy-number variation (one to five copies) correlates with elevated basal expression. Previous studies have established the role of CBF genes in frost tolerance, but the current study provides only expression-level correlations without direct validation under stress conditions.
Classifying shell (40.47%) and cloud (37.68%) genes as dispensable appears paradoxical given their enrichment for stress-response functions, reflecting a phenomenon of “conditional dispensability,” where genes are dispensable under normal conditions but essential during stress. The observed doubling of splice junctions (311 300 vs. 146 600) reflects enhanced detection of transcript diversity; however, potential technical artifacts from transcript redundancy require validation. The organization of 12 190 core orthologs into 738 expression modules indicates substantial genotype-specific expression divergence, complicating the development of stable expression-based breeding markers. Therefore, empirical field tests validating CNV phenotypes and stress-response effects are essential to harness PanBaRT20 for crop improvement.
PanBaRT20 is complemented by the Morex Atlas, which comprises over 300 RNA-seq samples describing tissue-specific expression variation across developmental stages and stress conditions. Principal-component analysis indicates that tissue type exerts a stronger influence on transcript profiles than environmental stresses. Network analyses show that 12 190 of the 13 652 single-copy core orthologs exhibit contrasting expression across genotypes, forming 738 co-expression modules organized into six communities and indicating genotype-specific divergence. For example, GA2ox7 shows genotype-specific expression potentially associated with domestication traits, while CBF2/4 CNV correlates with expression patterns related to stress tolerance. By refining gene annotation across 76 genotypes (Jayakodi et al., 2024), PanBaRT20 provides a comprehensive platform for transcriptional breeding research.
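One simple way to think about co-expression modules is as groups of genes whose expression profiles are strongly correlated across samples. The sketch below is a minimal illustration under stated assumptions: it uses Pearson correlation with a hard threshold followed by connected components, which is a deliberate simplification and not necessarily the network method used by Guo et al. (2025). The gene names, expression values, and the 0.9 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sx * sy)


def coexpression_modules(expr: dict[str, list[float]],
                         threshold: float = 0.9) -> list[set[str]]:
    """Group genes into modules: connected components of the graph whose
    edges join gene pairs with Pearson r >= threshold (assumed cut-off)."""
    neighbours = {gene: set() for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if pearson(expr[g1], expr[g2]) >= threshold:
            neighbours[g1].add(g2)
            neighbours[g2].add(g1)

    modules, seen = [], set()
    for gene in expr:
        if gene in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [gene], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        modules.append(component)
    return modules


# Hypothetical expression values across five samples.
toy_expr = {
    "gene1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene2": [2.1, 4.0, 6.2, 8.1, 9.9],
    "gene3": [5.0, 4.0, 3.0, 2.0, 1.0],
    "gene4": [9.8, 8.1, 6.0, 4.2, 2.0],
}
modules = coexpression_modules(toy_expr)
```

In this toy input, the two rising profiles and the two falling profiles fall into separate modules. Running the same grouping per genotype and comparing module membership is one way to picture the genotype-specific divergence described above.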
Tong et al. (2025) systematically classified the barley basic helix-loop-helix (bHLH) transcription factors into core, shell, and cloud classes, identifying 201 orthologs compared with the 161 found in a single-reference genome-based study. This demonstrates how pan-genome approaches can uncover previously hidden genetic diversity. One of the most important applications of PanBaRT20 is the systematic identification of stress-responsive gene networks. Categorizing genes into shell and cloud groups offers a criterion for prioritizing candidate genes, with these stress-responsive categories serving as accessible entry points for functional validation. GA2ox7 is a clear example of how this pan-transcriptomic approach can be applied to targeted breeding.
Advanced network analysis and machine learning algorithms will be needed to fully exploit PanBaRT20, which contains 738 co-expression modules comprising 12 190 orthologs with genotype-specific expression patterns. The multi-dimensional nature of these data could be leveraged to predict phenotypic effects for traits such as plant height, stress tolerance, and grain quality, a concept already illustrated by GA2ox7 in height regulation and by CBF2/4 in potential stress tolerance. PanBaRT20 is particularly valuable because conventional correlation-based approaches may fail to capture the non-linear and genotype-specific interactions revealed by Guo et al. (2025).
PanBaRT20 and its supporting GitHub repositories, together with the Morex Atlas, establish a foundation for an open-access platform that complies with the principles of findability, accessibility, interoperability, and reusability (FAIR). These resources promote broader research collaboration and can accelerate new discoveries in barley genetics and breeding.
This platform could also be extended to multi-species applications; however, three main challenges must be addressed: (1) standardized tissue sampling protocols across species with different developmental timings, (2) robust ortholog detection algorithms capable of resolving one-to-many relationships among phylogenetically distant species, and (3) machine learning models that integrate heterogeneous expression data while accounting for species-specific regulatory mechanisms. Additionally, predicting combined stress responses requires controlled experimental datasets that capture stress interactions, validated across diverse genetic backgrounds. This in turn requires multi-institutional collaboration to establish standardized protocols for cross-species data collection and analysis, particularly given the logistical challenges of coordinating experiments across different crop systems and research institutions.
The mechanisms driving variation in alternative splicing (junctions increasing from 146 600 in GsRTDs to 311 300 in PanBaRT20), CNVs (e.g., CBF2/4 linked to frost tolerance), and the 141 Mb inversion on 7H warrant deeper investigation. Analyses of CNV loci could uncover regulatory effects, refining selection strategies for yield, disease resistance, and nutrient efficiency (Varshney et al., 2021). Integrating pan-transcriptomic data with quantitative trait locus mapping and expression databases (for example, linking drought-responsive transcripts to yield quantitative trait loci) may help identify functional variants that contribute to stable productivity. Such integration could accelerate breeding by correlating transcripts with phenotypes, thereby enhancing sustainability (Jayakodi et al., 2020).
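The idea of linking stress-responsive transcripts to yield quantitative trait loci can be sketched as a coordinate-overlap query. This is a minimal illustration, not a pipeline from the cited studies: the `Interval` type, identifiers, and coordinates are hypothetical, and a real analysis would also need a coordinate system shared across genotypes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A named genomic interval with 1-based, inclusive coordinates."""
    name: str
    chrom: str
    start: int
    end: int

    def overlaps(self, other: "Interval") -> bool:
        """True if both intervals lie on the same chromosome and share a base."""
        return (self.chrom == other.chrom
                and self.start <= other.end
                and other.start <= self.end)


def transcripts_in_qtls(transcripts: list[Interval],
                        qtls: list[Interval]) -> dict[str, list[str]]:
    """Map each QTL name to the transcripts whose coordinates overlap it."""
    return {q.name: [t.name for t in transcripts if t.overlaps(q)] for q in qtls}


# Hypothetical coordinates, for illustration only.
toy_transcripts = [
    Interval("tx_drought_1", "7H", 1_000, 5_000),
    Interval("tx_drought_2", "7H", 90_000, 95_000),
    Interval("tx_drought_3", "2H", 2_000, 6_000),
]
toy_qtls = [Interval("yield_qtl_A", "7H", 4_000, 50_000)]
hits = transcripts_in_qtls(toy_transcripts, toy_qtls)
```

Overlap alone only nominates candidates; whether a transcript within a QTL interval contributes to the trait would still need the kind of validation discussed above.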
Funding
This work was supported by the Project of Sanya Yazhou Bay Science and Technology City (Grant No. SKJC-JYRC-2024-26).
Acknowledgments
We apologize to colleagues whose work could not be cited due to space limitations. The authors declare that they have no conflicts of interest.
Author contributions
Y.J. and R.K.V. conceptualized the study. F.K., Y.J., and A.R. wrote the initial draft. A.R., A.A., and S.W. improved the initial draft and prepared the figure. Y.J., S.W., and R.K.V. supervised, reviewed, and edited the manuscript. All authors have read and approved the final version of the manuscript.
Published: August 19, 2025
Contributor Information
Rajeev K. Varshney, Email: varshney@murdoch.edu.au.
Jun Yang, Email: yang9yj@hainanu.edu.cn.
References
- Coulter M., Entizne J.C., Guo W., Bayer M., Wonneberger R., Milne L., Schreiber M., Haaning A., Muehlbauer G.J., McCallum N., et al. BaRTv2: a highly resolved barley reference transcriptome for accurate transcript-specific RNA-seq quantification. Plant J. 2022;111:1183–1202. doi: 10.1111/tpj.15871.
- Guo W., Schreiber M., Marosi V.B., Bagnaresi P., Jørgensen M.E., Braune K.B., Chalmers K., Chapman B., Dang V., Dockter C., et al. A barley pan-transcriptome reveals layers of genotype dependent transcriptional complexity. Nat. Genet. 2025;57:441–450. doi: 10.1038/s41588-024-02069-y.
- Jayakodi M., Padmarasu S., Haberer G., Bonthala V.S., Gundlach H., Monat C., Lux T., Kamal N., Lang D., Himmelbach A., et al. The barley pan-genome reveals the hidden legacy of mutation breeding. Nature. 2020;588:284–289. doi: 10.1038/s41586-020-2947-8.
- Jayakodi M., Lu Q., Pidon H., Rabanus-Wallace M.T., Bayer M., Lux T., Guo Y., Jaegle B., Badea A., Bekele W., et al. Structural variation in the pangenome of wild and domesticated barley. Nature. 2024;636:654–662. doi: 10.1038/s41586-024-08187-1.
- Jin M., Liu H., He C., Fu J., Xiao Y., Wang Y., Xie W., Wang G., Yan J., Yan J. Maize pan-transcriptome provides novel insights into genome complexity and quantitative trait variation. Sci. Rep. 2016;6. doi: 10.1038/srep18936.
- Mascher M., Gundlach H., Himmelbach A., Beier S., Twardziok S.O., Wicker T., Radchuk V., Dockter C., Hedley P.E., Russell J., et al. A chromosome conformation capture ordered sequence of the barley genome. Nature. 2017;544:427–433. doi: 10.1038/nature22043.
- Tao Y., Jordan D.R., Mace E.S. A graph-based pan-genome guides biological discovery. Mol. Plant. 2020;13:1247–1249. doi: 10.1016/j.molp.2020.07.020.
- Tong C., Jia Y., Hu H., Zeng Z., Chapman B., Li C. Pangenome and pantranscriptome as the new reference for gene-family characterization: A case study of basic helix-loop-helix (bHLH) genes in barley. Plant Commun. 2025;6:101190. doi: 10.1016/j.xplc.2024.101190.
- Varshney R.K., Roorkiwal M., Sun S., Bajaj P., Chitikineni A., Thudi M., Singh N.P., Du X., Upadhyaya H.D., Khan A.W., et al. A chickpea genetic variation map based on the sequencing of 3,366 genomes. Nature. 2021;599:622–627. doi: 10.1038/s41586-021-04066-1.
- White B., Lux T., Rusholme-Pilcher R., Juhász A., Kaithakottil G., Duncan S., Simmonds J., Rees H., Wright J., Colmer J., et al. De novo annotation of the wheat pan-genome reveals complexity and diversity of the hexaploid wheat pan-transcriptome. Preprint at bioRxiv. 2024. doi: 10.1101/2024.01.09.574802.

