Plant Communications. 2025 Dec 5;7(2):101658. doi: 10.1016/j.xplc.2025.101658

RIFinder reveals widespread adaptive remote introgression in grass genomes

Yujie Huang 1, Shiyu Zhang 1, Hanyang Lin 2, Chenxu Liu 3, Zhefu Li 1, Kun Yang 1, Yutong Liu 1, Linfeng Jin 1, Chuanlong Lu 3, Yuan Cheng 3, Chaoyi Hu 4, Huifang Zhao 1, Guoping Zhang 1, Qian Qian 5, Longjiang Fan 1,5,6,∗, Dongya Wu 1,6,7,∗∗
PMCID: PMC12903403  PMID: 41353563

Abstract

Genetic transfers are pervasive across both prokaryotes and eukaryotes, primarily involving canonical genomic introgression between species or genera and horizontal gene transfer (HGT) across kingdoms. However, DNA transfer between phylogenetically distant species, which differs from canonical introgression and HGT in certain aspects of its temporal scale and mechanistic features, here defined as remote introgression (RI), has received less attention in evolutionary genomics. In this study, we present RIFinder, a novel phylogeny-based method for the detection of RI events, and apply it to a comprehensive dataset of 122 grass genomes. Our analysis identifies 622 RI events originating from 543 distinct homologous genes, revealing distinct characteristics among grass subfamilies. Specifically, the subfamily Pooideae contains the largest number of introgressed genes, whereas Bambusoideae contains the fewest. Comparisons among the accepted genes, their donor copies, and native homologs demonstrate that introgressed genes undergo post-transfer localized adaptation and show significant functional enrichment in stress-response pathways. Notably, we identify a large Triticeae-derived segment in the Chloridoideae species Cleistogenes songorica, which is potentially associated with its exceptional drought tolerance. Furthermore, we provide compelling evidence that RI has contributed to the origin and diversification of biosynthetic gene clusters for gramine, a defensive alkaloid chemical, across grass species. Our study establishes a robust method for RI detection and highlights its critical role in adaptive evolution. The Python implementation of RIFinder is publicly available at https://github.com/Ne0tea/RIFinder.

Key words: remote introgression, adaptive evolution, gene cluster, phylogenetic incongruence


This study presents RIFinder, a newly developed tool for the detection of remote introgression (RI), and applies it to 122 grass genomes. The results reveal widespread RI events that have shaped aspects of adaptive evolution, including enhanced stress resilience and diversification of biosynthetic gene clusters, uncovering a hidden layer of reticulate evolution in plants.

Introduction

DNA transfer across species is pervasive across taxonomic groups, acting as a key driver of evolutionary diversification. Previous studies have highlighted the crucial roles of genomic introgression in facilitating adaptive evolution (Edelman et al., 2019; Taylor and Larson, 2019). For example, introgression of the EPAS1 allele from archaic Denisovans into modern humans confers high-altitude adaptation in Tibetan populations (Huerta-Sánchez et al., 2014). Genomic introgression through hybridization and successive back-crossing has become a fundamental strategy for introducing beneficial alleles from related wild species into elite varieties in modern crop breeding. Horizontal gene transfer (HGT), another form of genetic exchange, typically occurs through asexual mechanisms such as transformation, conjugation, or transduction in prokaryotes (Soucy et al., 2015). Although relatively rare in eukaryotes, well-documented cases of HGT have played pivotal roles in shaping adaptive traits in recipient lineages (Etten and Bhattacharya, 2020; Keeling, 2024). For instance, the acquisition of genes from Listeria influences male courtship behavior toward females in lepidopteran insects (Li et al., 2022). The whitefly (Bemisia tabaci), a major agricultural pest, has incorporated BtPMaT1 from its host plants to neutralize plant-derived toxins (Xia et al., 2021). In plant macroevolution, HGT events from bacteria may have facilitated adaptation of early-diverging plant lineages, such as in cell-wall biosynthesis (Feng et al., 2024) and biotic resistance (Li et al., 2018). Moreover, HGT through physical contact, such as between parasitic plants and their hosts via haustoria, has been recognized as a contributor to the adaptive evolution of parasitic species (Yoshida et al., 2010).

An extensive review of 5318 studies on DNA transfer published over the past 5 years (Supplemental Figure 1) revealed that HGT events in eukaryotes primarily involve inter-kingdom transfers, whereas nearly all genomic introgressions occur among closely related taxa (e.g., congeneric species), as expected. This dichotomy leaves a conceptual gap: the possibility of genetic exchange between lineages separated by intermediate phylogenetic distances. To clearly differentiate such exchanges from established mechanisms, we define them as remote introgression (RI). RI can take place along intermediate branches, involving either ancient introgression events shared across multiple species or more recent genetic transfers between reproductively isolated lineages via previously unexplored pathways. This conceptual framework offers a new lens through which to explore reticulate evolution and addresses a gap in current evolutionary biology discourse.

The grass family (Poaceae) is among the most important plant families for humans, serving as the principal source of calories for global populations. Grass species are generally divided into two major evolutionary lineages that diverged >80 million years ago (mya): the PACMAD lineage, which includes the Panicoideae (PAN), Arundinoideae, Chloridoideae (CHL), Micrairoideae, Aristidoideae, and Danthonioideae subfamilies, and the BOP lineage, which includes the Bambusoideae (BAM), Oryzoideae (ORY), and Pooideae (POO) subfamilies (Huang et al., 2022). Several cases of gene transfer have occurred across subfamilies despite their deep evolutionary divergence. For example, a fragment containing five genes related to abiotic stress response was reported to have been transferred from Panicum (PAN) to wild barley species (Hordeum, POO) (Mahelka et al., 2017, 2021). Likewise, seven Bx genes in Triticeae involved in the biosynthesis of DIMBOA (2,4-dihydroxy-7-methoxy-1,4-benzoxazin-3-one) were acquired from ancestral Panicoideae via RI (Wu et al., 2022a). Another compelling case of RI involves the transfer of momilactone A biosynthetic genes from wheat (Triticeae) to rice (Oryza), enabling production of this allelopathic defense compound in a distantly related lineage (Wu et al., 2022b). Although these events were previously described as horizontal or lateral transfers, we categorize them here as RI to clearly distinguish them from canonical HGT.

The evolutionary histories of genes are frequently obscured by processes such as duplication, loss, transfer, and incomplete lineage sorting (ILS), all of which contribute to gene-tree discordance and reticulate evolution (Edwards et al., 2007; Steenwyk et al., 2023). The detection of genomic introgression and HGT under these conditions relies primarily on site pattern analysis and phylogenetic inference. Classic methods such as D statistics (ABBA–BABA test) or f statistics (e.g., f3 and f4) are widely used. Although site-pattern-based methods are effective for detecting recent gene flow among closely related lineages, their sensitivity declines sharply as phylogenetic distance increases. Moreover, their design limitations, typically confined to four or five taxa, render them ill suited for identifying RI across broader evolutionary scales, especially in large, complex phylogenies. Phylogeny-based methods (e.g., QuIBL) leverage gene-tree topologies and branch-length distributions to distinguish introgression from ILS (Edelman et al., 2019), but they are generally restricted to analyses of three or four taxa. Similarly, PhyloNet integrates multiple gene trees to infer phylogenetic networks and hybridization events (Than et al., 2008). Collectively, these methods are poorly adapted to uncovering RI across deep evolutionary timescales. With the increasing availability of high-quality genomes, the discovery of hidden evolutionary relationships among distantly related species necessitates the development of novel algorithms tailored to large-scale evolutionary studies.

In this study, we aimed to identify RI events across large-scale genomic datasets and systematically examine their prevalence and adaptive significance. Given that there were no appropriate approaches for achieving this aim, we established a comprehensive workflow by introducing a modular tool, Remote Introgression Finder (RIFinder), specifically designed for RI detection. Using RIFinder, we identified widespread RI events across 122 grass genomes and characterized the structural and functional features of RI-associated genes. Notably, we identified a ∼30-kbp RI segment potentially involved in drought tolerance of the Eurasian desert plant Cleistogenes songorica and revealed the complex evolutionary trajectory of defense-related gramine biosynthesis gene clusters (GBGCs).

Results

Inferring remote introgression events between phylogenetically distant lineages using RIFinder

To efficiently identify RI events among hundreds of genomes in a high-throughput manner, we developed RIFinder, a novel analytical workflow (Figure 1A). RIFinder consists of three stepwise modules: phylogenetic inference of homologous groups, RI signal detection, and statistical significance testing (Supplemental Figure 2). The aim of RIFinder is to detect gene–species phylogenetic conflicts among evolutionarily distant species (inter-clade). First, species-level taxonomic clades are defined to construct a reference phylogeny, which serves as the backbone for comparison with gene-level phylogenies. Protein sequences from multiple species are clustered into homologous groups (HGs) and aligned. For each HG, a gene tree is constructed and simplified by collapsing monophyletic branches composed exclusively of sequences from a single clade. This step minimizes noise from within-clade phylogenetic discordance and mitigates biases introduced by unequal sampling among clades (Figure 1A). Recognizing that conventional homology clustering cannot reliably distinguish orthologs from paralogs, we implemented a tree-splitting algorithm to partition multi-copy HG trees into smaller, ortholog-like (OG-like) subtrees (Figure 1A). Each OG-like tree is subsequently scanned to identify nodes that exhibit topological incongruence with the reference species phylogeny as potential RI signals (Figure 1A).
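The collapsing and scanning steps described above can be illustrated with a minimal Python sketch. It is not RIFinder's implementation: the nested-tuple tree encoding, the gene names, the clade assignments, and the simplified conflict rule (a sister pair of collapsed clades drawn from different lineages) are all assumptions made for illustration.

```python
# Illustrative sketch only; tree encoding, gene names, and the conflict rule are assumptions.
LINEAGE = {"BAM": "BOP", "ORY": "BOP", "POO": "BOP", "PAN": "PACMAD", "CHL": "PACMAD"}

def collapse(tree, clade_of):
    """Collapse every subtree whose leaves all belong to one clade into that clade's label."""
    if isinstance(tree, str):                      # leaf: a gene identifier
        return clade_of[tree]
    children = [collapse(child, clade_of) for child in tree]
    if all(isinstance(c, str) for c in children) and len(set(children)) == 1:
        return children[0]                         # monophyletic single-clade subtree
    return tuple(children)

def cross_lineage_cherries(tree):
    """Return sister pairs of collapsed clades that come from different lineages."""
    if isinstance(tree, str):
        return []
    hits = []
    if len(tree) == 2 and all(isinstance(c, str) for c in tree):
        a, b = tree
        if LINEAGE[a] != LINEAGE[b]:
            hits.append((a, b))
    for child in tree:
        hits.extend(cross_lineage_cherries(child))
    return hits

# Hypothetical gene tree: one POO gene (poo3) is nested among PAN genes.
clade_of = {"ory1": "ORY", "ory2": "ORY", "poo1": "POO", "poo2": "POO",
            "pan1": "PAN", "pan2": "PAN", "poo3": "POO", "chl1": "CHL"}
gene_tree = ((("ory1", "ory2"), ("poo1", "poo2")),
             (("pan1", ("pan2", "poo3")), "chl1"))

collapsed = collapse(gene_tree, clade_of)
candidates = cross_lineage_cherries(collapsed)
print(collapsed)
print(candidates)
```

In this toy tree, the two ORY genes and the two native POO genes each collapse to a single tip, while the POO gene nested among PAN homologs survives collapsing and is reported as a candidate conflict.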

Figure 1.

RIFinder workflow and performance evaluation.

(A) Overview of RIFinder processing steps. Tips from closely related species are marked with circles of the same color. Phylogenetic trees of different taxa are shown on a gray background. Red triangles mark candidate RI nodes detected by topological conflict analysis.

(B) Performance evaluation on simulated datasets. Upper panels represent distinct phylogenetic scenarios, and bar charts display precision (left, solid lines) and recall (right, dashed lines) metrics. The x-axis indicates the gene-transfer probability parameter in simulations.

In addition to introgression, ILS also plays an essential role in shaping patterns of reticulate evolution. To distinguish RI from ILS, we incorporated a modified branch-length test (mBLT) algorithm (Supplemental Figure 3). Under ILS without introgression, triplet topologies should exhibit no significant branch-length disparity compared with their sister nodes in the species tree. By contrast, introgression is expected to produce a discordant branch-length distribution, with the introgressed lineage nested within an otherwise divergent clade. Furthermore, to minimize false positives, we filtered phylogenetically ambiguous signals, including nodes with low bootstrap support, low Bayesian posterior probabilities, or evidence of long-branch attraction (Supplemental Figure 2). For each high-confidence RI event, gene copies with incongruent phylogenetic placement are designated as the acceptors, whereas their closest non-incongruent homologs are defined as the donors.
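The general intuition behind branch-length comparisons can be sketched as follows. This is not the mBLT itself, whose details are given in Supplemental Figure 3. It is a generic illustration that assumes an introgressed copy sits at an unusually short distance from its putative donor relative to a background of distances between the same two clades. The function name, the background values, and the smoothed empirical p value are all assumptions.

```python
# Generic illustration, not the authors' mBLT; all values below are hypothetical.
def empirical_p_shorter(observed, background):
    """Share of background distances that are <= the observed one (with +1 smoothing)."""
    n_le = sum(1 for d in background if d <= observed)
    return (n_le + 1) / (len(background) + 1)

# Hypothetical patristic distances between two clades in trees without conflict.
background = [0.42, 0.45, 0.39, 0.48, 0.44, 0.41, 0.47, 0.43, 0.46]

p_short = empirical_p_shorter(0.12, background)    # candidate much closer to its donor
p_typical = empirical_p_shorter(0.44, background)  # candidate at a typical distance
print(p_short, p_typical)
```

A small value indicates that the candidate is closer to its donor than nearly all background pairs, the pattern expected under introgression rather than ILS. With only nine background trees the smallest attainable value is 0.1, so a real analysis would need a much larger background.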

To evaluate the performance of RIFinder, we simulated different RI levels across divergent clades (see methods). Across gene trees with various RI frequencies, RIFinder consistently achieved high precision (85.7%–98.6%; Figure 1B). In particular, at a simulated transfer rate of 0.1%, it achieved a recall rate of 89.0% (Figure 1B). In Poaceae, RI events were calculated to range from 0.03% to 0.3% of all genes (see below and Supplemental Figure 4). At biologically relevant transfer frequencies (0.04% and 0.1%), RIFinder demonstrated robust performance, achieving high precision (95.9%–98.2% and 93.5%–96.3%, respectively) and high recall (85.0%–87.6% and 83.5%–84.0%, respectively) across the three scenarios (Figure 1B). These results indicate that the simulation framework reflects biologically realistic introgression dynamics, reinforcing confidence in the applicability of RIFinder to real-world genomes. Collectively, these results suggest that RIFinder offers a well-balanced solution for the detection of RI, particularly in large and complex phylogenomic datasets.
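For reference, precision and recall on a simulated dataset can be computed as in the short sketch below. The event identifiers are hypothetical, and the sketch assumes that each simulated transfer and each detected event can be matched by a shared identifier.

```python
# Minimal sketch; event identifiers are hypothetical.
def precision_recall(detected, simulated):
    """Precision = TP / detected events; recall = TP / simulated (true) events."""
    true_positives = len(detected & simulated)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(simulated) if simulated else 0.0
    return precision, recall

simulated = {"HG1", "HG2", "HG3", "HG4", "HG5"}   # transfers planted in the simulation
detected = {"HG1", "HG2", "HG3", "HG9"}           # events reported by the detector

precision, recall = precision_recall(detected, simulated)
print(precision, recall)
```

Here three of four reported events are true (precision 0.75) and three of five planted events are recovered (recall 0.6).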

Widespread RI events across grass genomes

We used RIFinder to systematically identify RI events between PACMAD and BOP lineages in the grass family by integrating high-quality genomes from 78 Poaceae species and the outgroup species pineapple (Ananas comosus; Supplemental Table 1). Polyploid genomes were split into haploid subgenomes, resulting in a total of 122 haploid genomes: 11 from CHL, 37 from PAN, 32 from BAM, 12 from ORY, and 29 from POO (Figure 2A). The basal Poaceae species Pharus latifolius served as an additional outgroup (Ma et al., 2021). A final total of 5.15 million protein-coding transcripts were analyzed to detect DNA transfers between the PACMAD and BOP lineages. A total of 622 candidate RI events were identified between the two lineages across various divergence nodes, including 286, 64, and 272 events shared by singleton, doubleton, and multiple species, respectively (Figure 2A and Supplemental Table 2). RIFinder successfully identified several previously characterized events with strong RI signals, such as the Bx cluster in Triticeae (Wu et al., 2022a), the MABGC cluster in Oryza (Wu et al., 2022b), and a stress-response-related DNA segment in barley (Hordeum) transferred from Panicum (Mahelka et al., 2021).
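The partition of events by the number of sharing species can be expressed as a small sketch. The event identifiers and genome labels are hypothetical, and the sketch assumes that sharing is counted over the set of acceptor genomes carrying each event. The last lines only check that the three reported categories sum to the reported total.

```python
from collections import Counter

# Sketch under stated assumptions; event IDs and genome labels are hypothetical.
def sharing_class(n_genomes):
    """Classify an event by how many genomes share it."""
    if n_genomes < 1:
        raise ValueError("an event must involve at least one genome")
    if n_genomes == 1:
        return "singleton"
    if n_genomes == 2:
        return "doubleton"
    return "multiple"

events = {
    "RI_0001": {"genome_A"},
    "RI_0002": {"genome_A", "genome_B"},
    "RI_0003": {"genome_A", "genome_B", "genome_C"},
    "RI_0004": {"genome_D"},
}
counts = Counter(sharing_class(len(genomes)) for genomes in events.values())
print(counts)

# Counts reported in the text for the three categories.
reported = {"singleton": 286, "doubleton": 64, "multiple": 272}
reported_total = sum(reported.values())
print(reported_total)
```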

Figure 2.

Widespread remote introgression events between PACMAD and BOP lineages in Poaceae.

(A) RI events as donor (left half of pies) and acceptor (right half of pies) across the grass family. Pie diameter at the internal node shows the transfer frequency, and the color denotes the donor/acceptor taxonomy. Bar height indicates the frequency of RI events in leaves. Major crops are annotated with red dots at leaf nodes, and orphan crops are annotated with light-red dots. Poaceae subfamilies are labeled with silhouettes of representative species (images obtained from BioRender, https://biorender.com). Nodes with bootstrap support <95 are marked with red stars. BAM, Bambusoideae; ORY, Oryzoideae; POO, Pooideae; CHL, Chloridoideae; PAN, Panicoideae; Arun, Arundinarieae; Brach, Brachypodieae; Andro, Andropogoneae; Cyn, Cynodonteae; Erag, Eragrostideae. For full names of all abbreviated grass genome labels, see Supplemental Table 1.

(B) Relationship between RI events as donors and acceptors. Dot colors correspond to species taxonomy.

(C) Distribution of RI time estimates in rice and wheat, with the times for Bx6 and Bx6-like RI events highlighted in red.

(D) Maximum-likelihood phylogenetic tree of Bx6 homologs using protein sequences from grasses, with P. latifolius serving as an outgroup. Nodes with bootstrap values <60 are colored black, and those with values 60–80 are colored light blue. Leaf background colors correspond to species taxonomic classifications. Bx6-like branches of RI origin are highlighted in red.

RI events have occurred multiple times during the diversification of grass species, with the frequency of PACMAD-to-BOP transfers exceeding that of the reverse (Figure 2A). Intriguingly, we observed a significant correlation between the number of acceptor and donor events per haploid genome (p = 5.85e−10, two-tailed t-test), although the patterns varied by subfamilies (Figure 2B). In the BOP lineage, POO and ORY acquired an average of 117.7 and 67.3 genes per haploid genome, respectively, significantly more than BAM (Supplemental Figure 5). In PACMAD, CHL species accepted fewer RI genes from BOP than PAN species. PAN contributed the highest number of transferred genes (∼73.7 per haploid genome) relative to other subfamilies (Supplemental Figure 4A). Interestingly, BAM exhibited the lowest numbers of both donor and acceptor genes among the subfamilies, potentially owing to its distinct life cycle (Figure 2A). BAM grasses typically exhibit prolonged vegetative growth followed by a synchronized, monocarpic flowering event, which limits temporal overlap with the flowering of other species and may reduce opportunities for inter-lineage gene transfer. Notably, some Poaceae species are primarily polyploid, exhibiting subgenome dominance in gene expression and selection, with BAM as a representative example (Ma et al., 2024). In contrast to other subfamilies that show no significant subgenome preference for RI-derived gene retention, BAM exhibited a striking subgenome bias in retention ratios (p = 2.08e−10, two-tailed t-test; Supplemental Figure 5C).
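A correlation of this kind is commonly tested by converting the correlation coefficient into a t statistic with n − 2 degrees of freedom. The sketch below assumes a Pearson correlation, which the text does not specify, and uses made-up counts. The p value would then be read from a t distribution, which is omitted here to keep the example dependency-free.

```python
from math import sqrt

# Sketch assuming a Pearson correlation; the counts below are made up.
def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_t(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))

donor_counts = [1, 2, 3, 4, 5]       # hypothetical donor events per haploid genome
acceptor_counts = [2, 1, 4, 3, 5]    # hypothetical acceptor events per haploid genome

r = pearson_r(donor_counts, acceptor_counts)
t = correlation_t(r, len(donor_counts))
print(r, t)
```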

In POO, the Triticeae tribe (e.g., barley, rye, and wheat) harbored significantly more acceptor genes than the sister tribe Poeae (e.g., alkaligrass and oat), likely owing to a Triticeae-specific RI pulse after the divergence of the two core Pooideae tribes (Figure 2A). Molecular dating of each RI event in wheat (Triticeae) and oat (Poeae) revealed that the recent RI pulse around 30 mya was specific to Triticeae species, consistent with the divergence time between Triticeae and Poeae (∼35.7 mya) (Zhang et al., 2022) (Figure 2C). For example, Bx6 genes encode a 2-oxoglutarate-dependent dioxygenase in the DIMBOA biosynthesis pathway for biotic defense (Jonczyk et al., 2008). Previous reports indicated that Bx6 genes in POO species arose from the duplication of native homologs and were transferred from PAN (Wu et al., 2022a). Here, expanded sampling revealed a novel Bx6 homologous monoclade in Triticeae, termed Bx6-like, that is phylogenetically nested within the PAN homologs (Figure 2D and Supplemental Figure 6A). Divergence time estimates between Bx6 and Bx6-like clades with corresponding RI-donor PAN homologs indicated that the Bx6-like RI event occurred more recently than the original Bx6 event (Figure 2C). The restricted presence of Bx6-like genes in Triticeae supported the notion of a lineage-specific RI. Topology tests confirmed that the phylogeny of duplicated Bx6 genes was highly robust and supported the introgression from PAN to Triticeae (Supplemental Figure 6B). Expression of the Bx6-like gene TraesCS2B01G076400 in hexaploid wheat (Triticum aestivum) was induced by multiple stresses, similar to that of the native copy TraesCS1B01G103700 in the B subgenome and other Bx6 copies (TraesCS2B01G066000 and TraesCS2A01G051700; Supplemental Figure 7).

Characteristics and functional implication of RI genes

The proteins encoded by RI genes showed pronounced involvement in stress-response processes, although the functional enrichment varied across subfamilies (Figure 3A). In rice, domains such as lipoxygenase (PF00305) and NB-ARC (PF00931), known for defense against pathogens and parasitic fungi, were significantly over-represented (Figure 3A). The enrichment of WD40 domains (PF00400) in rice indicates that RI-derived genes likely contribute significantly to both developmental processes and abiotic stress responses (Eryong and Bo, 2022). In addition, enrichment of terpene synthase domains (PF01397) in RI genes suggests a potential role in the biosynthesis of secondary metabolites, such as momilactone. In wheat, we observed significant enrichment of wax ester synthase domains (PF03007, WS/DAGs, p = 4.76e−45; PF06974, WS/DGAT C-terminal, p = 1.15e−58; Fisher’s exact test), which are critical for wax formation on leaf surfaces, an adaptive trait that enhances tolerance to both biotic and abiotic stress (Aharoni et al., 2004). The pentatricopeptide repeat (PPR) family domain (PF13041) also exhibited enrichment signals, suggesting diverse functional roles of RI-derived genes in wheat (Chen et al., 2018). In maize, aconitase-related RI genes, implicated in stress responses and photosynthesis, were significantly enriched (Pascual et al., 2021). In tef (Eragrostis tef), we also detected significant enrichment of NB-ARC and leucine-rich repeat domains (Duxbury et al., 2021; Alessi and Pfeffer, 2024). Collectively, RI-derived genes frequently encode proteins involved in signaling and stress responses, potentially enabling dynamic modulation of gene expression and immune strategies in acceptor species.
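Domain enrichment of this kind is typically assessed with a one-sided Fisher's exact test, which is equivalent to an upper-tail hypergeometric probability. The sketch below assumes that one-sided form and uses toy numbers. It does not reproduce the p values reported above and omits the multiple-testing adjustment.

```python
from math import comb

# Sketch assuming a one-sided test; the numbers are toy values.
def enrichment_p(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the domain."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Toy example: 20 genes in the genome, 5 carry the domain,
# 4 RI-derived genes, 3 of which carry the domain.
p_value = enrichment_p(k=3, n=4, K=5, N=20)
print(p_value)
```

Because `math.comb` works on exact integers, the tail sum stays exact until the final division, which helps when the genome-wide background contains tens of thousands of genes.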

Figure 3.

Functional characterization of RI-derived genes in the grass family.

(A) Top 10 enriched gene families of RI acceptor genes in rice (O. sativa), wheat (Triticum aestivum), maize (Zea mays), and tef (Eragrostis tef). Color gradient indicates the statistical significance (Fisher’s exact test, p-adjusted) of gene family enrichment ratios relative to the whole-genome background.

(B) Expression of 54 RI-derived genes across tissues and stress treatments in rice. Chromosomal ideograms above the heatmap annotate the genomic loci of these RI genes. Colored rectangles denote gene categories: gray for native orthologs (connected to RI genes via dashed lines), blue for cloned abiotic stress–resistance genes, and red for biotic stress–resistance genes. Stars mark genes within selective sweep regions (Jing et al., 2023).

(C) Expression of 35 RI-derived genes and their corresponding native copies across tissues and experimental treatments in wheat. Subgenome-specific absences are marked by hollow rectangles with dashed borders. N, nitrogen; Mory, rice blast; RSV, rice stripe virus; BPH, brown planthopper; Xt, Xanthomonas translucens; Zt, Zymoseptoria tritici; Stripe, stripe rust; Powdery, powdery mildew; Cp, Claviceps purpurea; Fg, Fusarium graminearum; Have, Heterodera avenae.

As foreign copies of native genes, RI-derived genes may share evolutionary fates with gene duplicates, undergoing loss or retention through mechanisms such as compensatory drift or subfunctionalization (Birchler and Yang, 2022). We profiled the transcriptomic dynamics of the accepted RI genes in rice and wheat (Supplemental Tables 3 and 4). Among the 54 foreign RI genes in rice, 40 were expressed in at least one tissue (fragments per kilobase per million [FPKM] > 1). Notably, GF14a exhibited constitutive expression across all tissues examined (Figure 3B). This gene has been reported to repress the activity of OsMYBS2, thus protecting against photodamage and photoinhibition under conditions of excess light (Fu et al., 2021). In addition, GF14a resides within a selective sweep region, implying its role in domestication (Figure 3B). Forty genes displayed differential expression under stress treatments (Figure 3B). Bph14 was induced by flooding treatment but downregulated during brown planthopper (BPH) infestation (Figure 3B). Upon infestation by BPH, Bph14 triggers the salicylic acid signaling pathway, leading to a reduction in BPH feeding and growth rates (Hu et al., 2017). The observed increase in its expression under flooding conditions likely reflects adaptive changes in rice metabolic processes and hormonal signaling associated with submergence stress. Under phosphate starvation and salt stress, two homologs (LOC_Os01g37000 and LOC_Os06g16640) exhibited divergent expression patterns compared with the native ortholog (LOC_Os07g38590), suggesting their functional specialization in stress adaptation (Supplemental Figure 8).
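The expression criterion used here (FPKM > 1 in at least one tissue) amounts to a simple filter, sketched below. The gene identifiers, tissue names, and expression values are hypothetical.

```python
# Minimal sketch; gene IDs, tissues, and FPKM values are hypothetical.
def expressed_genes(fpkm, threshold=1.0):
    """Return genes whose FPKM exceeds the threshold in at least one tissue."""
    return {gene for gene, by_tissue in fpkm.items()
            if any(value > threshold for value in by_tissue.values())}

fpkm = {
    "gene_A": {"root": 0.2, "leaf": 5.1, "panicle": 0.0},
    "gene_B": {"root": 0.4, "leaf": 0.9, "panicle": 1.0},   # never strictly above 1
    "gene_C": {"root": 12.3, "leaf": 8.7, "panicle": 15.0},
}

expressed = expressed_genes(fpkm)
print(sorted(expressed))
```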

By contrast, gene expression patterns in wheat were influenced by polyploidy and subgenome asymmetry (Ramírez-González et al., 2018). Among the 191 foreign genes, 112 (58.6%) were expressed from the A (n = 31), B (n = 42), and D (n = 39) subgenomes, with 28 genes exhibiting tissue-specific expression (Figure 3C). In addition, 179 RI genes derived from polyploidization or duplication were retained across subgenomes, 145 of which were preserved in all three subgenomes. Notably, 52 gene copies showed no detectable expression under any treatment, likely owing to dosage compensation or transcriptional silencing (Figure 3C). This result indicates that, compared with rice, the polyploid structure of the wheat genome has introduced greater complexity in RI-derived gene regulation, characterized by subgenome-specific retention and divergent functional trajectories among homeologs.

A Triticeae-derived segment in C. songorica associated with drought tolerance

The tetraploid CHL species C. songorica (2n = 4× = 40, abbreviated Cson) is a vital perennial forage grass and an ecologically important C4 species that thrives in temperate saline, semi-arid, and desert regions of Central Asia and shows exceptional drought tolerance (Yan et al., 2019; Zhang et al., 2021) (Figure 4A). A ∼30-kbp segment on chromosome 6 in the A subgenome (Cson-A) that contains five protein-coding genes—CsA601273 (RIG1), CsA601274 (RIG2), CsA601275 (RIG3), CsA601276 (RIG4), and CsA601277 (RIG5)—was identified as an RI fragment (Figure 4B). Among these genes, RIG1 and RIG5 are homologous to OsMATE2 and OsPSKR15, both implicated in abiotic stress responses (Supplemental Table 5) (Das et al., 2018; Nagar et al., 2022). RIG1, RIG2, RIG4, and RIG5 contain multidrug antimicrobial extrusion domains, WD40 repeat domains, strictosidine synthase domains, and leucine-rich repeat domains, respectively (Supplemental Table 5). Phylogenetic analyses revealed that these RI-acquired genes in Cson-A clustered within the POO clade and were sister to Achnatherum splendens (Aspl; 2n = 2× = 48), a Triticeae species dominant in alkaline grasslands (Figure 4A and 4B). Notably, the donor genes in Aspl were located within a ∼20-kbp region with identical gene order and orientation, suggesting that they were transferred in a single RI event (Figure 4C). Local sequence alignment demonstrated high similarity between the two gene clusters, particularly in the protein-coding regions (Supplemental Figure 9).

Figure 4.

Evolution and functional implications of a Triticeae-derived 30-kbp segment in Cleistogenes songorica.

(A) Representative plants of C. songorica (Cson, left) and A. splendens (Aspl, right). Credits: K. Nyamsuren (Cson) and R.J. Soreng (Aspl).

(B) Maximum-likelihood phylogenetic trees of RIGs. P. latifolius and A. comosus serve as outgroups. Branches are colored by species taxonomic classification. Nodes with bootstrap values <60 are shown in black and those with values 60–80 in light blue. Nested topological structures are magnified in gray blocks, with RIGs highlighted. Rectangular blocks at terminal nodes denote syntenic group (SG) assignments: same-colored blocks indicate genes within the same SG, and gray blocks represent singleton SGs.

(C) Genomic synteny of the RI-derived segment across grass families. RIG homologs are colored and connected by red lines; other genes are gray. Chromosome labels are colored by species taxonomy. Orientations of RIGs and their native copies are shown with arrows.

(D) Nucleotide divergence between homologous segments in Aspl and Cson, calculated in 200-bp sliding windows. Statistical significance of distribution differences was tested by a two-tailed t-test.

(E) Expression profiles of RIGs and HRIGs under high-temperature and drought stress. Statistical significance was determined by pairwise two-tailed Student’s t-tests. ns, not significant; ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001.

(F) Geographic distribution of Z. japonica, Cson, and Aspl. Data source: Global Biodiversity Information Facility (GBIF; GBIF Secretariat, 2023a, 2023b, 2023c).

(G) Ecological niche differentiation across two climatic gradients for Z. japonica, Cson, and Aspl.

Notably, this RI segment was exclusively present in Cson-A but absent from the B subgenome (Cson-B) and other CHL genomes, implying recent lineage-specific acquisition (Figure 4B and 4C; Supplemental Figure 10). Technical artifacts were ruled out by uniform read coverage across the RI donor and acceptor segments and the absence of clipping signals at their boundaries when raw sequencing reads from Cson and Aspl were remapped to their assemblies (Supplemental Figure 9). In addition, a homologous region containing CsB800052 (HRIG2), CsB800053 (HRIG3), and CsB800054 (HRIG5) was observed in Cson-B. Phylogenetic analysis confirmed these B-subgenome genes as vertically inherited copies, distinct from the horizontally acquired cluster in Cson-A (Figure 4B). Alignments of protein domain sequences further supported the evolutionary affinity between the RI-acquired genes in Cson-A and the RI donor genes in Aspl (Supplemental Figure 11). Topology tests using nucleotide alignments revealed that introgression models had significantly better support than non-RI hypotheses (Supplemental Figure 12). Collectively, these results strongly indicate that this gene cluster in Cson-A originated from Aspl or its close relatives via RI. Interestingly, the RI donor cluster in Aspl has a segmental duplicate (SegDup) on chromosome 13 that is absent in other POO species (e.g., barley and wheat; Figure 4B and 4C). Non-RI homologs of this cluster exhibited conserved genomic orientation across grass species, implying that their presence predated grass divergence (Figure 4C). Analyses of sequence divergence revealed that the Cson-A cluster was derived from the Aspl RI donor rather than its SegDup (Figure 4D). Expression profiling of RI-acquired genes and their native homologs under stress revealed no co-expression (Supplemental Figure 13). Under drought stress, RIG2 and RIG3 were significantly downregulated, whereas RIG5 was upregulated (Figure 4E). Among native homologs, only HRIG2 showed a response to stress. Similar expression patterns were observed under high-temperature stress (Figure 4E). These findings suggest that the RI cluster in Cson-A from Aspl has undergone functional divergence from its native counterparts, potentially contributing to the exceptional drought tolerance of C. songorica.
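Windowed nucleotide divergence between two aligned segments, as used for the donor versus SegDup comparison, can be sketched as below. The sketch assumes a simple p-distance (proportion of mismatched sites), skips alignment columns containing gaps, and treats the step size as a free parameter, since the text specifies only the 200-bp window. The sequences are toy strings with a 4-bp window for readability.

```python
# Sketch assuming p-distance per window; the step size is an assumption.
def window_divergence(seq_a, seq_b, window=200, step=200):
    """Proportion of mismatched, ungapped sites in each window of an alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    values = []
    for start in range(0, len(seq_a) - window + 1, step):
        pairs = [(a, b)
                 for a, b in zip(seq_a[start:start + window], seq_b[start:start + window])
                 if a != "-" and b != "-"]               # drop gapped columns
        if pairs:
            values.append(sum(a != b for a, b in pairs) / len(pairs))
        else:
            values.append(None)                          # window with no comparable sites
    return values

# Toy alignment, 4-bp windows for readability.
aligned_a = "ACGTACGTAC-T"
aligned_b = "ACGAACGTTCGT"
divergence = window_divergence(aligned_a, aligned_b, window=4, step=4)
print(divergence)
```

Running the same function on the acceptor segment against the donor and against its SegDup yields two distributions of window values that can then be compared statistically.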

Geographically, Aspl and Cson share a sympatric distribution in East Asian saline–alkaline grasslands, whereas Cson’s phylogenetically closest species, Zoysia japonica, is predominantly restricted to the coastal regions of East Asia and Australia, and Oropetium thomaeum is endemic to Africa (Figure 4F and Supplemental Figure 14). Analyses of climatic niche differentiation indicated substantial overlap between Aspl and Cson (Warren’s I = 0.760 and Schoener’s D = 0.572) relative to the null hypothesis of niche equivalency (Supplemental Figure 15A), whereas Aspl/Z. japonica (Warren’s I = 0.0388 and Schoener’s D = 0.1028, p < 0.001) and Cson/Z. japonica (Warren’s I = 0.0157 and Schoener’s D = 0.0714, p < 0.001) displayed strong partitioning (Supplemental Figure 15B). Species-density data along two climatic gradients showed that the niche space of Cson was nested entirely within that of Aspl (Figure 4G). These findings imply a potential link between niche proximity and cross-subfamily gene flow from the broad-range species (Aspl) to the narrow-range species (Cson).
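For readers unfamiliar with the two overlap indices, the sketch below computes Schoener's D and Warren's I from two occupancy or suitability vectors defined over the same grid cells. It assumes the standard definitions, D = 1 − ½ Σ|p_x − p_y| and I = 1 − ½ Σ(√p_x − √p_y)², after each vector is normalized to sum to 1. The input values are toy numbers, not the data behind the reported statistics.

```python
from math import sqrt

# Sketch using the standard definitions; input vectors are toy values.
def niche_overlap(suit_x, suit_y):
    """Return (Schoener's D, Warren's I) for two suitability vectors on the same grid."""
    if len(suit_x) != len(suit_y):
        raise ValueError("both species must be scored on the same grid cells")
    total_x, total_y = sum(suit_x), sum(suit_y)
    px = [v / total_x for v in suit_x]               # normalize to sum to 1
    py = [v / total_y for v in suit_y]
    d = 1 - 0.5 * sum(abs(a - b) for a, b in zip(px, py))
    i = 1 - 0.5 * sum((sqrt(a) - sqrt(b)) ** 2 for a, b in zip(px, py))
    return d, i

d_partial, i_partial = niche_overlap([1, 1, 0], [0, 1, 1])   # half-overlapping niches
d_none, i_none = niche_overlap([1, 0], [0, 1])               # disjoint niches
print(d_partial, i_partial, d_none, i_none)
```

Both indices range from 0 (no overlap) to 1 (identical niches), which is why the Aspl/Cson values stand out against the near-zero values for the comparisons with Z. japonica.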

Evolutionary trajectory of gramine biosynthesis gene clusters (GBGCs) in grasses

Gramine, an allelopathic indole alkaloid, exhibits broad-spectrum defense activities against aphids, fungi, viruses, and other plants, and its synthesis has been reported in some grass species (e.g., Hordeum) (Leete and Minich, 1977; Lu et al., 2019). Two core biosynthetic genes, AMIS/CYP76M57, encoding a cytochrome P450 monooxygenase, and NMT, encoding an N-methyltransferase, have been shown to mediate gramine production in the wild barley Hordeum vulgare subsp. spontaneum and specific cultivated barleys (H. vulgare). These genes are physically adjacent in the genome (Figure 5A) (Dias et al., 2024; Ishikawa et al., 2024), with AMIS encoding an enzyme that oxidizes tryptophan into aminomethylindole (AMI), and NMT encoding an enzyme that methylates AMI twice via the intermediate N-methylaminomethylindole to yield gramine (Figure 5A). Importantly, genes within the GBGC exhibited RI signals across grass subfamilies (Figure 5B).

Figure 5.

RI-mediated evolution of gramine biosynthesis genes.

(A) Genomic locus of the gramine biosynthesis gene cluster (GBGC) in wild barley, H. vulgare subsp. spontaneum var. B1K-04-12, and the biosynthetic pathway of gramine from tryptophan.

(B) Maximum-likelihood phylogenetic tree of AMIS homologs in grasses. Core P450 genes in clade 2 (CYP76M5–CYP76M8) from the ORY subfamily are collapsed, with leaf node numbers labeled. Background colors denote grass subfamilies. The RI-related clade is highlighted in red. Rectangular blocks at terminal nodes indicate SGs.

(C) Evolutionary distribution of GBGC genes in grass genomes. The numbers between AMIS and NMT represent their genomic distances. Genes flanking the slash are located on two different chromosomes.

(D and G) Location and gene elements around GBGCs in O. sativa var. NIP (D) and T. aestivum var. Chinese Spring (G).

(E) Graph of the rice pangenome at the GBGC locus. Two major haplotypes are represented in green and blue. The circular plot below illustrates the proportional distribution of these haplotypes across different rice groups. African, Oryza barthii and Oryza glaberrima; Or-12, group 1/2 of O. rufipogon; XI, O. sativa ssp. xian/indica; Or-3, group 3 of O. rufipogon; GJ, O. sativa ssp. geng/japonica.

(F) Nucleotide diversity (π) between wild (Or-3 of O. rufipogon) and domesticated (O. sativa ssp. japonica/geng) populations and Tajima’s D values in 100-kb windows spanning chromosome 12. Dashed lines represent a Tajima’s D value of zero.

(H) Expression profiles of TaAMISa, TaELL1, and TaAMISb in the wheat spike under Fusarium head blight treatment. Statistical significance was determined using pairwise two-tailed Student’s t-tests. ns, not significant; ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001.

(I) Extracted ion chromatograms (EICs) showing detection of gramine (m/z = 175.12, product) in C. macrourus and H. vulgare (cv. 7552) by high-performance liquid chromatography–mass spectrometry analysis.

Phylogenetic analysis of AMIS homologs defined three clades: a positionally conserved clade (clade 1, whose genes were phylogenetically clustered and colocalized in syntenic blocks across grass species), an ORY-specific clade (clade 2, comprising four core P450 genes [CYP76M5–CYP76M8] involved in phytocassane biosynthesis on chromosome 2), and a GBGC-associated clade (clade 3). Phylogenetic discordance was primarily observed in clade 3, although a small cluster of Oryza genes nested within PACMAD in clade 1 (Figure 5B). Basal members in clade 1 (Aluo-C_Alu08Cg07060) and clade 3 (Aluo-C_Alu08Cg07070) were tandem duplicates in the Ampelocalamus luodianensis genome (BAM), indicating their common origin (Figure 5B). BAM homologs from clade 1 were found across multiple bamboo subgenome types, including the ancestral type H in herbaceous bamboos, whereas clade 3 homologs were confined to type C genomes, the only subgenome shared by all polyploid bamboo species, and were entirely absent from other subgenomes, suggesting subgenome-specific retention and selection-driven specialization (Ma et al., 2024). The topology of clade 3 was statistically supported and incongruent with the species tree (Supplemental Figure 16), with homologs from Eragrostis (CHL), Cenchrus (PAN), and POO species forming a monoclade sister to the Oryza lineage. Within POO, AMIS from wild barley (H. vulgare subsp. spontaneum) occupied a basal position, and its two duplicates clustered with other POO homologs, particularly in Triticeae species, which exhibited extensive tandem duplication (Figure 5B).

The phylogenetic topology of NMT homologs mirrored that of AMIS genes with significant RI signals: CHL, PAN, and POO copies were nested within ORY when BAM species were set as outgroups (Supplemental Figures 17 and 18). Unlike AMIS homologs, NMT homologs were largely absent from POO, except in Hordeum. Gene clustering of AMIS and NMT was identified in both ORY (e.g., Oryza sativa and Zizania latifolia) and PAN (e.g., Cenchrus; Figure 5B) lineages in addition to Hordeum. Synteny comparisons revealed identical gene orientations, supporting a co-introgression scenario for the two genes, despite their physical separation by several kilobases (Figure 5C). By comprehensively integrating the phylogenetic and syntenic evidence for the two genes with subfamily lineage divergence time, we inferred that the gathering of AMIS and NMT emerged first in the ancestral lineage of BOP, showing an ILS pattern when comparing the gene topology to the species phylogeny (Grass Phylogeny Working Group III, 2025). Subsequently, a remote introgression event occurred in the common ancestor of PAN and CHL, whose divergence time was much earlier than that of the radiation divergence among BOP subfamilies (Figure 5C).

In rice, two additional NMT homologs, OsNMTa (LOC_Os12g10140) and OsNMTb (LOC_Os12g10170), were present near the AMIS–NMT cluster (<200 kb), derived from duplication of OsNMT (LOC_Os12g09770) and subsequent fission (Supplemental Figures 19 and 20). RNA sequencing (RNA-seq) data showed upregulation of these genes in response to blast infection and nitrogen starvation (Supplemental Figure 21). Co-expression network analysis revealed that OsNMTb was co-expressed with the MABGC (Supplemental Figure 22). However, similar to the presence-and-absence variation polymorphism of GBGC in barley (Figure 5C and Supplemental Figure 23), this genomic region was almost exclusively observed in O. sativa ssp. japonica/geng and its wild ancestral group Or-3 of O. rufipogon, as shown from the rice pangenome graph (Figure 5E and Supplemental Figure 24). Moreover, this region was under artificial selection during the domestication of O. sativa ssp. japonica/geng (Figure 5F). In wheat, two tandem AMIS homologs were strongly induced by Fusarium infection and co-expressed with the neighboring gene TaELL1, which encodes tryptamine 5-hydroxylase, indicating their potential involvement in the 5-hydroxytryptamine biosynthesis pathway that contributes to pathogen resistance (Figure 5G and 5H). Similarly, extensive structural variations were observed in wheat pangenomes, reflecting balancing selection on this immunity-related locus (Supplemental Figure 25).

To assess the capacity for gramine biosynthesis in grass species with GBGCs, we first quantified the expression of GBGC genes in fresh leaves of Z. latifolia, Cenchrus macrourus, and O. sativa upon methyl jasmonate (MeJA) treatment, with barley serving as a positive control. Significant upregulation was observed for barley GBGC genes, but they showed no detectable expression in O. sativa or Z. latifolia, even under MeJA treatment (Supplemental Figure 26). In C. macrourus, the GBGC genes were transcriptionally active under MeJA-treatment conditions, although their expression was low (Supplemental Figure 26). Consistent with these observations, extracted ion chromatograms (EICs) from gramine-targeted metabolic profiling revealed trace levels of gramine in leaves under control conditions and 4 h post treatment (Figure 5I). In contrast to the distinct peak in barley, which clearly aligned with that of the gramine standard, three gramine-related ion peaks to the left of the barley peak (designated unknowns 1, 2, and 3) exhibited changes in abundance under MeJA stress (Figure 5I). These compounds may be gramine derivatives or isoforms that respond to biotic stress in C. macrourus, rather than gramine as observed in barley, a possibility that requires further investigation.

Discussion

In contrast to well-documented instances of cross-kingdom HGT and intra-genus introgression in eukaryotes, RI is an underexplored mechanism of adaptive gene acquisition (Supplemental Figure 27). Here, we developed a phylogeny-based workflow optimized for RI detection. Compared with sporadic previous reports (e.g., the momilactone A biosynthetic cluster in rice, stress-resistance gene transfers from Panicum to wild barley) (Mahelka et al., 2021; Wu et al., 2022b), we detected 622 RI events between the BOP and PACMAD lineages using RIFinder, providing novel insights into the adaptive evolution of grass species. To highlight the functional significance of RI, two representative examples were comprehensively investigated: a drought-related gene cluster in Cson (CHL), likely derived from Aspl relatives (POO), and gramine-associated genes in Hordeum (POO) and Cenchrus (PAN) that originated from ORY.

Distinguishing true RI from ILS still poses a challenge for accurate RI detection, although we have implemented a modified branch-length test (mBLT) to exclude ILS-driven topological incongruence. In addition, the current framework is constrained by limited phylogenetic resolution. Although selective retention may confine RI signals to regions near functional genes, future improvements will enhance resolution and enable detection of a broader spectrum of RI events by integrating a k-mer-based or pangenome-alignment framework. Given our conservative emphasis on accuracy over sensitivity, and considering taxonomic undersampling and challenges in distant homology detection, our reported RI events likely represent a lower-bound estimate.

Taking grass species as an example, the uneven distribution of RI events across subfamilies likely reflects biological and ecological characteristics such as life history (including the reproductive cycle) and geographic/niche distribution. In previous studies, exposure of reproductive tissues during early development facilitated gene acquisition in plants (Huang, 2013; Ma et al., 2022). Our findings point to distribution range and reproductive timing as putatively critical variables influencing the propensity for RI in grasses. For instance, a long reproductive cycle in bamboo reduces opportunities for gene exchange (Yeasmin et al., 2015). In addition, we observed asymmetry in acceptor and donor gene numbers between the BOP and PACMAD lineages, with PACMAD species more frequently acting as donors and BOP lineages as acceptors. This imbalance may reflect differences in ecological and geographic distributions between the two clades. POO species are primarily temperate, ORY species are tropical/subtropical, and BAM species are restricted to South America, southern Africa, and Southeast Asia. By contrast, PAN and CHL span a broader latitudinal and ecological range, from the Arctic Circle to the Cape of Good Hope, resulting in a lower likelihood of fixation (Supplemental Figure 28).

Regarding the genetic mechanisms that underlie RI, two evolutionary scenarios are plausible: introgression between ancestral lineages of extant species or hybridization between evolutionarily divergent taxa that have overcome reproductive barriers. Ancestral introgression likely predates the establishment of reproductive isolation, aligning with the concept of a “speciation continuum” (Nosil et al., 2009; Kulmuni et al., 2020), particularly in clades undergoing radiation, where incomplete reproductive barriers may fail to prevent genetic exchange (Marques et al., 2019; Pontarp et al., 2024). Nevertheless, the detection of such ancient events is complicated by ILS, which may obscure introgression signals (Stull et al., 2023). Empirical evidence from diverse taxa supports the feasibility of distant hybridization as a pathway for RI. In primates, a recent study identified Y chromosome introgression in Cercopithecini monkeys, estimated at ∼8 mya (Jensen et al., 2024). In plants, ancient introgressions have been repeatedly detected in bamboo genomes (Guo et al., 2019). Recent experimental work demonstrated that reproductive barriers between distantly related species can be overcome. In Arabidopsis, hybrid embryos were successfully generated via pollen manipulation between species that diverged >14 mya (Lan et al., 2023). In Brassica, removal of reactive oxygen in the stigma alleviated the intergeneric reproductive barrier between Arabidopsis and Brassica (Huang et al., 2023). These findings across diverse taxa collectively indicate the possibility of RI via ancestral gene flow and naturally facilitated hybridization between widely divergent lineages.

Methods

RIFinder workflow

RIFinder is a Python-based tool developed to detect RI events from large genomic datasets. It is designed to identify phylogenetic discordance arising during species divergence, which may signal introgression between distantly related taxa. It integrates modules for gene-tree inference, signal quantification, and statistical testing (Supplemental Figure 2). The workflow begins by reconstructing gene trees for homologous gene groups, incorporating stringent quality control to remove unreliable phylogenies (e.g., those with abnormal branch lengths or poorly supported nodes). These curated gene trees are compared with species trees to identify topological discrepancies. RIFinder quantifies RI values for each gene tree using a topology-based scoring formula. To ensure specificity, a statistical significance module tests whether observed signals exceed expectations under the neutral ILS scenario. By integrating these steps into a unified and automated framework, RIFinder leverages multi-genome data to accurately capture reticulate evolutionary processes that are frequently overlooked in conventional tree-based models.

MCL clustering and gene phylogeny

Pairwise protein-sequence alignments were performed across all genome assemblies using DIAMOND (v2.0.2) with an E-value threshold of 1e−5 and a minimum query coverage of 50% (Buchfink et al., 2021). On the basis of these similarity scores, protein sequences were clustered into HGs using the Markov clustering algorithm (MCL) with an inflation parameter of 2. For HGs with sufficient taxonomic representation, multiple sequence alignments were constructed using MAFFT (Rozewicki et al., 2019), and poorly aligned regions were pruned using trimAl (Capella-Gutiérrez et al., 2009). Substitution models were selected for each alignment using ModelFinder (Kalyaanamoorthy et al., 2017), and gene trees were inferred under a maximum-likelihood framework using IQ-TREE (v2.0) with 3000 ultra-fast bootstrap replicates (Minh et al., 2020).

Compressing and splitting gene trees

Upon initialization of RIFinder, the pipeline performs quality control of gene trees by removing extreme outliers characterized by long branch lengths. For each gene tree, the distances from each leaf to its three immediate ancestral nodes are calculated. Any leaf with a distance exceeding the 99th percentile of the distance distribution within the tree is flagged and removed. After outlier removal, the algorithm traverses each node in the gene tree. For monophyletic groups from a single defined clade, a node-collapsing operation is applied to simplify the tree by reducing intra-clade topological complexity (Figure 1A). Paralogous genes, especially those resulting from ancient or lineage-specific duplication, are often included in gene family trees, complicating ortholog inference and phylogenetic reconstruction. Because similarity-based clustering methods tend to group orthologs and paralogs, large gene families may be artificially inflated. To address this issue, RIFinder implements a tree decomposition algorithm to split large gene trees into smaller, orthologous group-like (OG-like) clusters.
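The long-branch filter described above can be illustrated with a minimal sketch. It is not taken from the RIFinder source: the `Node` class, the function names, and the reading of "distance to the three immediate ancestral nodes" as the summed branch length over up to three steps toward the root are all assumptions made for illustration.

```python
class Node:
    """Minimal rooted-tree node (hypothetical stand-in for a tree library)."""
    def __init__(self, name="", dist=0.0):
        self.name, self.dist = name, dist      # dist: branch leading to this node
        self.children, self.up = [], None

    def add(self, *kids):
        for kid in kids:
            kid.up = self
            self.children.append(kid)
        return self

def leaves(node):
    if not node.children:
        return [node]
    return [lf for child in node.children for lf in leaves(child)]

def ancestor_distance(leaf, levels=3):
    """Summed branch length from a leaf up to `levels` ancestors (or the root)."""
    total, node = 0.0, leaf
    for _ in range(levels):
        if node.up is None:
            break
        total += node.dist
        node = node.up
    return total

def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def flag_long_branches(root, q=99.0):
    """Names of leaves whose ancestor distance exceeds the q-th percentile."""
    dist = {lf.name: ancestor_distance(lf) for lf in leaves(root)}
    cutoff = percentile(list(dist.values()), q)
    return sorted(name for name, d in dist.items() if d > cutoff)

# Toy tree: five cherries with short branches, plus one very long leaf branch.
root = Node("root")
for i in range(5):
    pair = Node(f"n{i}", 0.1)
    pair.add(Node(f"g{i}a", 0.1), Node(f"g{i}b", 5.0 if i == 4 else 0.1))
    root.add(pair)

flagged = flag_long_branches(root)
print(flagged)
```

In this toy tree only the leaf with the 5.0 branch lies above the 99th-percentile cutoff, so it is the only one flagged for removal.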

The principle of the tree-splitting algorithm is described using the following pseudocode:

  FOR a rooted gene family phylogeny (gtree):
      OGlike ← []
      forall leaf of gtree do
          node ← leaf
          while number of clades under node < required number of clades do
              node ← node.up
          end
          AddToList(OGlike, node)
      end
      return OGlike
  ENDFOR

RIFinder iterates through the leaves of each gene tree to assess whether the current subtree contains taxonomically diverse representation. Specifically, it checks whether the number of unique clade identifiers in the subtree meets a user-defined threshold (default: n, where n is the total number of taxa in the tree). If a current subtree fails the criteria, the algorithm moves upward to the parent node and re-evaluates the taxonomic diversity. Once a subtree satisfies the inclusion criteria, it is extracted for downstream RI analysis; otherwise, it is excluded from further consideration. This sequential process of outlier filtering, node collapsing, and tree decomposition ensures that RIFinder operates on high-confidence, taxonomically representative gene subtrees, thereby enhancing the resolution and reliability of RI signal detection across complex, multi-copy gene families.
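The tree-splitting step can be sketched in Python as follows. This is an illustrative reading of the pseudocode and the paragraph above, not the RIFinder implementation: the `Node` class, `split_og_like`, the `clade_of` mapping, and the toy gene names are hypothetical, and duplicate subtrees reached from different leaves are merged here as a simplifying assumption.

```python
class Node:
    """Minimal rooted-tree node (hypothetical stand-in for a tree library)."""
    def __init__(self, name=""):
        self.name, self.children, self.up = name, [], None

    def add(self, *kids):
        for kid in kids:
            kid.up = self
            self.children.append(kid)
        return self

def leaves(node):
    if not node.children:
        return [node]
    return [lf for child in node.children for lf in leaves(child)]

def n_clades(node, clade_of):
    """Number of distinct clade identifiers among the leaves under `node`."""
    return len({clade_of[lf.name] for lf in leaves(node)})

def split_og_like(root, clade_of, min_clades):
    """Climb from each leaf until the subtree spans `min_clades` clades."""
    og_like = []
    for leaf in leaves(root):
        node = leaf
        while node is not None and n_clades(node, clade_of) < min_clades:
            node = node.up                      # move to the parent and re-check
        # node is None when even the root fails the criterion: leaf is excluded
        if node is not None and node not in og_like:
            og_like.append(node)
    return og_like

# Toy family: two paralogous subtrees, each covering clades A, B, and C.
og1 = Node("og1").add(Node("x").add(Node("a1"), Node("b1")), Node("c1"))
og2 = Node("og2").add(Node("y").add(Node("a2"), Node("b2")), Node("c2"))
root = Node("root").add(og1, og2)
clade_of = {name: name[0].upper() for name in ["a1", "b1", "c1", "a2", "b2", "c2"]}

subtrees = split_og_like(root, clade_of, min_clades=3)
print([sorted(lf.name for lf in leaves(s)) for s in subtrees])
```

With a threshold of three clades, the toy family is split at `og1` and `og2`, so each paralogous copy is analyzed as its own OG-like subtree rather than as one inflated family.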

Remote introgression (RI) scoring

After pre-processing and filtering, gene phylogenies are screened for topology indicative of RI (Figure 1A). The species phylogeny is divided into two discrete taxonomic clades, representing the major groups for investigation. Species are classified as either congruent (i.e., belonging to the same monophyletic clade) or discordant (i.e., falling outside the expected monophyletic group) per gene tree. This classification aligns with the taxonomic level defined for RI. RI scores are computed for leaves in the gene tree on the basis of the taxonomic identity of sister branches, the number of discordant taxa, and their distribution across the tree. The threshold is determined using Cayley’s formula: n^(n−2), where n denotes the number of taxa in a given subclade. This reflects the number of distinct unrooted tree topologies that can be formed from n tips. Any leaf whose RI score exceeds the threshold is labeled as a candidate RI event (Figure 1A). To calculate RI scores within each OG-like subtree, the algorithm starts at a target leaf that meets the minimum taxa threshold and proceeds iteratively. At each internal node, the algorithm examines the newly incorporated descendant branches to identify taxa that deviate from the expected monophyletic group. Scoring begins upon detection of the first discordant taxon and continues until a taxon from the original clade reappears, at which point scoring stops. Three distinct scoring scenarios are implemented. If all newly added leaves belong to the same discordant clade, their total number is added directly to the node’s RI score. If discordant taxa originate from multiple distinct clades, a fractional score is calculated as the ratio of discordant taxa to the total number of newly added taxa. If not all expected discordant clades are represented, the tree is excluded from downstream analysis to avoid potential bias from rooting artifacts or incomplete sampling. 
The full logic of the RI scoring procedure is outlined in pseudocode format as follows.

  Input: A gene tree topology
  CT ← clade_i
  CN ← clade_j
  scoreL ← []
  forall leaves do
      score ← 0
      clade ← leaves
      dclade ← !leaves
      if taxon < CT then
          Continue
      else
          node ← leaves
          while node parent exist do
              node ← node.up
              newTaxn ← getNewleaf(node)
              allTaxn ← getLeaf(node)
              if dclade in allTaxn then
                  score ← score + weight
              end
              if clade in newTaxn then
                  Break
              end
          end
      end
      AddToList(scoreL, score)
  end
  Return scoreL

To ensure the robustness of the results, only nodes with bootstrap values over 70 (default) are retained. In gene family trees with multiple plausible outgroup placements, the scoring process is repeated across alternative roots, and RI scores are aggregated to mitigate rooting bias.
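One possible reading of the scoring walk is sketched below. It covers only the first two scoring scenarios (whole-count and fractional increments) and stops once a taxon from the leaf's own group reappears after scoring has begun. The tree-exclusion rule, the bootstrap filter, and re-rooting are omitted. All names (`Node`, `ri_score`, `group_of`, `subclade_of`) and the toy genes are hypothetical, and the order of the stop check relative to scoring is an assumption rather than a detail confirmed by the text.

```python
class Node:
    """Minimal rooted-tree node (hypothetical stand-in for a tree library)."""
    def __init__(self, name=""):
        self.name, self.children, self.up = name, [], None

    def add(self, *kids):
        for kid in kids:
            kid.up = self
            self.children.append(kid)
        return self

def leaves(node):
    if not node.children:
        return [node]
    return [lf for child in node.children for lf in leaves(child)]

def ri_score(leaf, group_of, subclade_of):
    """Accumulate an RI score while climbing from `leaf` toward the root."""
    own = group_of[leaf.name]
    score, started = 0.0, False
    seen = {leaf.name}
    node = leaf.up
    while node is not None:
        current = {lf.name for lf in leaves(node)}
        new = current - seen                       # leaves added at this node
        own_new = [n for n in new if group_of.get(n) == own]
        disc = [n for n in new if group_of.get(n) not in (own, None)]
        if started and own_new:
            break                                  # original group reappears
        if disc:
            started = True
            single = len({subclade_of[n] for n in disc}) == 1
            if single and len(disc) == len(new):
                score += len(disc)                 # one discordant clade only
            else:
                score += len(disc) / len(new)      # mixed: fractional score
        seen = current
        node = node.up
    return score

# Toy tree ((((poo1, pan1), pan2), chl1), ory1): a POO gene nested in PACMAD.
poo1, pan1 = Node("poo1"), Node("pan1")
n1 = Node().add(poo1, pan1)
n2 = Node().add(n1, Node("pan2"))
n3 = Node().add(n2, Node("chl1"))
root = Node().add(n3, Node("ory1"))

group_of = {"poo1": "BOP", "ory1": "BOP",
            "pan1": "PACMAD", "pan2": "PACMAD", "chl1": "PACMAD"}
subclade_of = {"poo1": "POO", "ory1": "ORY",
               "pan1": "PAN", "pan2": "PAN", "chl1": "CHL"}

print(ri_score(poo1, group_of, subclade_of), ri_score(pan1, group_of, subclade_of))
```

In the toy tree, the POO gene passes three nodes that add only PACMAD leaves before an ORY gene reappears, so it accumulates a higher score than its PAN neighbor, which meets a gene from its own group after one step.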

Modified branch-length testing

After candidate RI events were identified, each case was further evaluated to rule out alternative sources of phylogenetic incongruence, with a primary focus on ILS. To distinguish RI from ILS, we developed an mBLT inspired by the framework proposed by Suvorov et al. (2022) (Supplemental Figure 3). The mBLT estimates relative coalescence times by analyzing branch lengths in gene trees, measured as substitutions per site. Under an RI scenario, the most recent common ancestor (MRCA) between the donor and recipient taxa is expected to exhibit shorter branch lengths than expected under ILS alone. By contrast, ILS typically leads to older coalescence times and does not generate consistent directional differences across alternative discordant topologies. This forms the null hypothesis of mBLT: in the absence of introgression, coalescence time should not differ significantly between topologies. For each candidate RI event, three clades (acceptor, donor, and sister) are defined as a triplet. Pairwise distances (denoted as d) are computed on the basis of external branch lengths between leaves from the recipient clade and those from the donor or sister clade, respectively. Specifically, for each leaf in the recipient clade, distances to all donor leaves d(Donor,Accepted) and all sister-clade leaves d(Donor,Sister) are measured. The distributions of d(Donor,Accepted) and d(Donor,Sister) are then compared using a two-tailed independent t-test. Under pure ILS, the expected relationship is d(Donor,Accepted) ≈ d(Donor,Sister), whereas d(Donor,Accepted) < d(Donor,Sister) indicates a possible introgression event consistent with the discordant topology. Conceptually, the mBLT aligns with previously described methods such as the original BLT (Suvorov et al., 2022) and the D3 statistic (Hahn and Hibbins, 2019), leveraging topological information across loci to detect introgression.
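The comparison at the core of the mBLT can be sketched with the standard library alone. The sketch assumes a pooled-variance (Student's) two-sample t-test and obtains the two-tailed p value by numerically integrating the t density; the text does not state whether a pooled or Welch variant is used. The function names and the distance values are hypothetical, and the distances are taken as given rather than extracted from a tree.

```python
import math
from statistics import mean, variance

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|) via Simpson integration of the density on [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps                                  # steps must be even
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)

def mblt(d_donor_accepted, d_donor_sister, alpha=0.05):
    """Return (t, p, supports_RI) for the two distance distributions."""
    x, y = d_donor_accepted, d_donor_sister
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))
    p = two_tailed_p(t, df)
    # RI is supported only if the difference is significant AND in the
    # expected direction: donor-accepted distances are the shorter ones.
    return t, p, (p < alpha and mean(x) < mean(y))

# Hypothetical distances (substitutions per site) for one candidate event.
d_da = [0.10, 0.12, 0.11, 0.09, 0.13]
d_ds = [0.30, 0.28, 0.33, 0.31, 0.29]
t_stat, p_value, supports_ri = mblt(d_da, d_ds)
print(round(t_stat, 2), p_value < 0.05, supports_ri)
```

The direction check matters because a two-tailed test alone would also flag cases in which donor-accepted distances are significantly longer, which does not fit the RI expectation.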

For genome-wide application, we apply mBLT to all homologous gene trees in the pangenome, and p values are adjusted using the Benjamini–Hochberg procedure with a false discovery rate (FDR) threshold of 0.05. It is important to note that mBLT has reduced sensitivity when introgression occurs at similar rates into both the recipient and sister clades or when gene flow took place in a deep ancestral lineage. Once a candidate RI event is confirmed, associated metadata, including recipient and donor clades, discordant topologies, and relevant annotations, are recorded (Supplemental Table 2). Donor and accepted copy identities are assigned on the basis of their respective last common ancestors in the species tree. In cases where a subset of leaves within a discordant clade fail to meet the criteria but are phylogenetically nested within the target clade, the corresponding RI event is annotated as a recent introgression.
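The Benjamini–Hochberg adjustment mentioned above can be written in a few lines. This is a generic sketch of the standard step-up procedure, not code from RIFinder, and the p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])   # indices by rising p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.001, 0.008, 0.039, 0.041, 0.20]     # hypothetical per-tree mBLT p values
adj = benjamini_hochberg(raw)
kept = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, kept)
```

In this toy example, the third and fourth tests have raw p values below 0.05 but adjusted values just above it, so only the first two candidates pass the FDR threshold.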

Topology testing

For individual phylogenies of particular interest, analyses are repeated following the same procedures described above but with more rigorous alignment settings in MAFFT (i.e., additional iterations) and limited manual curation, including removal of long-branching or poorly aligned taxa. Substitution models are reselected using ModelFinder, and maximum-likelihood phylogenies are reconstructed using IQ-TREE, with ultra-fast bootstrap support computed over 3000 replicates (Minh et al., 2020). The resulting trees are visualized and annotated using iTOL (v5; Letunic and Bork, 2021). To test the robustness of alternative topologies, we use three tree topology-comparison methods implemented in IQ-TREE: the RELL approximation, the Shimodaira–Hasegawa (SH) test, and the approximately unbiased (AU) test, each based on 10 000 bootstrap replicates (Minh et al., 2020). AU and SH tests yield p values, and alternative topologies are rejected if p < 0.05. The RELL approximation is used to compute posterior weights for each tested topology (Kishino et al., 1990). To minimize the confounding effects of alignment gaps, trimAl is used to remove poorly aligned regions, and the resulting conserved alignments are used for topology testing (Capella-Gutiérrez et al., 2009). In addition, phylogenetic trees are constructed based on nucleotide coding sequences (CDSs) as well as separately for the first and second codon positions (codon12) and third codon positions (codon3), following the same maximum-likelihood framework.

Simulation

Benchmark datasets were generated under three distinct topologies using GenPhyloData in JPrIME v0.3.7 (Sjöstrand et al., 2014). These topological structures represent a variety of configurations under a rooted bifurcating framework, ranging from simple (2 versus 2) to complex (3 versus 4) scenarios, with each subgeneric clade comprising multiple species forming monophyletic groups. The bifurcating species trees for the three topological relationships were generated with HostTreeGen using the parameters “-timespan 10,” “-birth rate 0.3,” and “-death rate 0.1,” with each subgeneric clade containing more than four species. The birth–death process was used to generate species trees, with extinct lineages systematically pruned from the final phylogenies. The three resultant species trees, which contained 33, 45, and 59 extant taxa, respectively, and each of which included an additional outgroup lineage, served as backbone trees for subsequent gene-tree simulations. To maintain experimental consistency across simulations, all gene trees were generated using GuestTreeGen with identical parameters: duplication rate = 0.2 and loss rate = 0.1 per million years, with leaf node counts constrained to 1–4 times the species number. An additional constraint limited each species to ≤10 gene leaf nodes in any simulated gene tree. To simulate diverse biological contexts of RI, we generated three distinct datasets corresponding to transfer rates of 0.1%, 0.4%, 0.7%, 1%, and 2% per lineage-million years. A strict filtering protocol was implemented to curate RI: (1) outgroup-associated RI events were removed to avoid ancestral lineage interference, and (2) intra-subclade transfers were excluded to focus on biologically meaningful transfer. 
Subsequently, branch lengths were temporally calibrated using the BranchRelaxer module (Sjöstrand et al., 2014), which uses a lognormal-distributed relaxed molecular clock to rescale gene-tree branches while preserving topological congruence with species trees. The simulation pipeline yielded 7402 high-confidence gene trees containing RI events, with each transfer-rate condition (0.1%, 0.4%, 0.7%, 1%, and 2%) producing >150 qualifying trees that met all evolutionary criteria. The performance of RIFinder was quantified through rigorous comparison against introgression events reported in these datasets.

Grass genomes

All genomic sequences and gene-model annotations from 78 high-quality grass assemblies and the outgroup species A. comosus were obtained from Phytozome (https://phytozome-next.jgi.doe.gov), the National Genomics Data Center (NGDC) (https://ngdc.cncb.ac.cn), and Ensembl Plants (https://plants.ensembl.org). These genomes included representatives from the basal grass group (P. latifolius), Oryzoideae (Z. latifolia, Leersia perrieri, and Oryza spp.), Bambusoideae (Dendrocalamus spp., Ampelocalamus luodianensis, Bonia amplexicaulis, Guadua angustifolia, Hsuehochloa calcarea, Melocanna baccifera, Olyra latifolia, Otatea glauca, Phyllostachys edulis, Raddia guianensis, and Rhipidocladum racemiflorum), Pooideae (Avena spp., Alopecurus myosuroides, Achnatherum splendens, Aegilops tauschii, Brachypodium distachyon, Brachypodium stacei, Dasypyrum villosum, H. vulgare, Lolium multiflorum, Puccinellia tenuiflora, Secale cereale, Triticum spp., and Thinopyrum elongatum), Chloridoideae (Eragrostis spp., Eleusine indica, O. thomaeum, C. songorica, and Z. japonica), and Panicoideae (Alloteropsis semialata, Cenchrus spp., Digitaria spp., Echinochloa spp., Zea mays, Sorghum bicolor, Miscanthus sinensis, Dichanthelium oligosanthes, Pennisetum giganteum, Panicum spp., Setaria spp., Paspalum vaginatum, and Saccharum spontaneum). Polyploid genomes with chromosome-level assemblies were split into subgenomes (Supplemental Table 1). A final total of 122 haploid grass genomes were used in the analysis.

Genomic synteny

Whole-genome protein sequences were compared pairwise among the 122 genomes using BLASTP. The best hit for each BLAST search was retained, and the threshold was set to an E-value less than 1e−5 and an identity greater than 50%. The genes or proteins were ordered according to their physical positions on each chromosome of each species. Gene-to-gene synteny among grass species was then analyzed on the basis of gene order within each genome. Micro-synteny analysis was performed with JCVI (Tang et al., 2024). A syntelog-based pangenome was constructed with SynPan (https://github.com/dongyawu/PangenomeEvolution) using 119 chromosome-level assemblies.
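The best-hit filtering step can be sketched as below, assuming BLASTP results in the standard 12-column tabular layout (query, subject, percent identity, ..., E-value, bit score). Ranking hits by bit score is an assumption, since the text does not state how the "best" hit was chosen; the function name and gene identifiers are hypothetical.

```python
def best_hits(lines, max_evalue=1e-5, min_identity=50.0):
    """Keep, per query, the highest-scoring hit passing both thresholds."""
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        query, subject = f[0], f[1]
        pident, evalue, bits = float(f[2]), float(f[10]), float(f[11])
        if evalue >= max_evalue or pident <= min_identity:
            continue                                # fails E-value or identity
        if query not in best or bits > best[query][1]:
            best[query] = (subject, bits)
    return {q: s for q, (s, _) in best.items()}

rows = [
    "q1\ts1\t92.0\t300\t20\t1\t1\t300\t1\t300\t1e-80\t520",
    "q1\ts2\t60.0\t280\t100\t3\t1\t280\t5\t284\t1e-30\t210",
    "q2\ts3\t45.0\t250\t130\t4\t1\t250\t1\t250\t1e-20\t150",   # identity too low
    "q3\ts4\t75.0\t120\t30\t0\t1\t120\t1\t120\t1e-3\t60",      # E-value too high
]
hits = best_hits(rows)
print(hits)
```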

Functional enrichment

All predicted protein-coding sequences were annotated using InterProScan (v5.65) (Zdobnov and Apweiler, 2001), and Pfam domains were determined with default parameters, along with Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. Gene families were identified on the basis of Pfam annotations. For enrichment analysis, all genes across the grass assemblies were used as background. We tested the null hypothesis that the Pfam associated with RI reflects a random sampling of total protein families. To this end, each term was compared against the background using Fisher’s exact test. Fold change was calculated based on the number of each Pfam in the subfamilies.
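The enrichment test for a single Pfam family can be sketched as a one-sided Fisher's exact test, computed as a hypergeometric upper tail. The one-sided form and the definition of fold change as the ratio of the family's frequency among RI genes to its frequency in the background are assumptions; the text does not spell out either. The counts are hypothetical.

```python
from math import comb

def pfam_enrichment(k, n, K, N):
    """
    k: RI genes carrying the Pfam domain      n: all RI genes
    K: background genes carrying the domain   N: all background genes
    Returns (one-sided p value for enrichment, fold change).
    """
    upper = min(n, K)
    # Probability of drawing at least k domain-carrying genes in n draws.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    p = tail / comb(N, n)
    fold = (k / n) / (K / N)
    return p, fold

p, fold = pfam_enrichment(k=3, n=4, K=5, N=20)
print(round(p, 4), fold)
```

When many Pfam families are tested, the resulting p values would normally be corrected for multiple testing before interpretation.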

Expression and selection of RI acceptor genes in rice and wheat

Uniform expression data for rice and wheat from multiple tissues under various treatments were obtained from the Plant Public RNA-seq Database (PPRD, https://plantrnadb.com/) (Yu et al., 2022). Specifically, we selected 618 libraries for rice and 476 libraries for wheat (Supplemental Table 6). For rice, we collected transcriptome data from three varieties, covering 11 different tissues and 18 distinct treatments (15 abiotic and three biotic). For wheat, the dataset included 11 different tissues and 13 distinct treatments (five abiotic stress and eight biotic stress). Accepted genes with putative native copies in rice and wheat were extracted according to the phylogeny and manually checked to generate accepted–native pairs. To investigate artificial selection on potential RI genes during domestication, we screened for overlap between candidate RI genes and previously reported selective sweep regions in rice and wheat (He et al., 2019; Jing et al., 2023).

Expression of accepted RI genes in C. songorica

One hundred and twenty-four RNA-seq libraries from C. songorica were obtained from NCBI and mapped to the C. songorica genome sequence using TopHat2 (Supplemental Table 6) (Kim et al., 2013). Gene expression in each sample was quantified using featureCounts, and FPKM values were computed with a custom script (Liao et al., 2014). Differentially expressed genes were analyzed using PyDESeq2 (Muzellec et al., 2023). FDRs were adjusted using the posterior probability. FDR < 0.05 and |log2(fold change)| ≥ 2 were set as the threshold for significant differential expression. A blockwise co-expression network was constructed, and modules were identified with the R package WGCNA (v1.66; Langfelder and Horvath, 2008). The number of modules was detected automatically, and the number of genes in a module was limited to between 20 and 5000 genes.
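The FPKM conversion and the significance thresholds can be illustrated as follows. The authors' custom script is not shown in the text, so this sketch uses the conventional definition (fragments per kilobase of transcript per million mapped fragments) and takes the library size as the sum of counted fragments, which is an assumption. The gene names and numbers are hypothetical.

```python
def fpkm(counts, lengths_bp):
    """FPKM = count * 1e9 / (gene length in bp * total counted fragments)."""
    total = sum(counts.values())
    return {g: c * 1e9 / (lengths_bp[g] * total) for g, c in counts.items()}

def is_differentially_expressed(log2_fold_change, fdr,
                                fdr_cutoff=0.05, lfc_cutoff=2.0):
    """Apply the thresholds FDR < 0.05 and |log2(fold change)| >= 2."""
    return fdr < fdr_cutoff and abs(log2_fold_change) >= lfc_cutoff

counts = {"geneA": 500, "geneB": 1500}          # hypothetical fragment counts
lengths = {"geneA": 1000, "geneB": 3000}        # hypothetical gene lengths (bp)
values = fpkm(counts, lengths)
print(values)
print(is_differentially_expressed(-2.4, 0.01), is_differentially_expressed(1.5, 0.001))
```

The two toy genes receive the same FPKM because the threefold difference in counts is matched by a threefold difference in length.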

Geographic distribution

Distribution ranges of all species involved in the study were obtained from the Global Biodiversity Information Facility (GBIF, https://www.gbif.org). For subfamily distributions, we traversed the distribution range of each genus and integrated them according to the phylogenetic classification. After removal of locations with fewer than 10 occurrences, the final distribution maps were created. For C. songorica, A. splendens, O. thomaeum, and Z. japonica, we removed locations with only one occurrence and visualized the distribution ranges using heatmaps.
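
The occurrence filter can be illustrated with a short sketch. The article does not state how a "location" is delimited, so the 1-degree grid cell used here, like the function name and parameters, is purely an assumption for the purpose of the example.

```python
from collections import Counter
from math import floor


def filter_sparse_locations(records, min_occurrences=10, cell_deg=1.0):
    """Keep (lat, lon) records whose grid cell holds at least min_occurrences records.

    A "location" is approximated by a cell_deg x cell_deg grid cell; this
    definition is an assumption, not the procedure used in the study.
    """
    def cell(lat, lon):
        return (floor(lat / cell_deg), floor(lon / cell_deg))

    counts = Counter(cell(lat, lon) for lat, lon in records)
    return [(lat, lon) for lat, lon in records
            if counts[cell(lat, lon)] >= min_occurrences]


# Toy records: ten points in one cell plus a singleton elsewhere
records = [(30.1 + 0.01 * i, 120.2) for i in range(10)] + [(45.5, 100.5)]
kept_subfamily = filter_sparse_locations(records, min_occurrences=10)
kept_species = filter_sparse_locations(records, min_occurrences=2)
```

Setting `min_occurrences=2` corresponds to removing locations with only one occurrence, as done for the four focal species.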

Ecological niche comparison

The initial dataset included 514, 55, and 940 records for A. splendens, C. songorica, and Z. japonica, respectively. To reduce sampling bias in niche modeling, occurrences with pairwise distance <5 km were excluded from the raw dataset using the thin function in the R package spThin (v0.2.0) (Aiello-Lammens et al., 2015). The data-trimming process resulted in 681 records (199 for A. splendens, 35 for C. songorica, and 445 for Z. japonica). We then retrieved near-present (1970–2000) climate layers for 19 bioclimatic variables (BIO1–BIO19) at a 2.5-arcmin spatial resolution from the WorldClim database (Fick and Hijmans, 2017). Five uncorrelated bioclimatic variables (|Pearson’s r| < 0.7; BIO1, BIO2, BIO3, BIO8, and BIO15) were selected to characterize the climatic niche volumes. Using Warren’s I and Schoener’s D statistics, we estimated pairwise niche equivalency among the three species using the ecospat.niche.equivalency.test function (with 100 replicates) in the R package ecospat (v4.1.2) (Schoener, 1968; Warren et al., 2008; Di Cola et al., 2017). To provide an intuitive visualization of the climatic niche space occupied by the three species, we partitioned the climatic niches into two gradients using principal-component analysis and plotted the species density along these gradients with the ecospat.plot.niche.dyn function in the R package ecospat.
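
For readers unfamiliar with the two overlap statistics, the sketch below computes Schoener’s D and Warren’s I from two occupancy densities defined over the same set of environmental grid cells. It illustrates the formulas only: the kernel-density estimation in environmental space and the permutation-based equivalency test performed by ecospat are not reproduced, and the function names are hypothetical.

```python
from math import sqrt


def _normalize(density):
    """Scale a list of non-negative densities so that it sums to 1."""
    total = sum(density)
    if total <= 0:
        raise ValueError("density must have a positive sum")
    return [v / total for v in density]


def schoener_d(dens1, dens2):
    """Schoener's D = 1 - 0.5 * sum(|p1_i - p2_i|); 0 = no overlap, 1 = identical."""
    p1, p2 = _normalize(dens1), _normalize(dens2)
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(p1, p2))


def warren_i(dens1, dens2):
    """Warren's I = 1 - 0.5 * sum((sqrt(p1_i) - sqrt(p2_i))^2)."""
    p1, p2 = _normalize(dens1), _normalize(dens2)
    return 1.0 - 0.5 * sum((sqrt(a) - sqrt(b)) ** 2 for a, b in zip(p1, p2))


# Toy densities over three environmental cells, for illustration only
d_partial = schoener_d([1, 1, 0], [0, 1, 1])
i_partial = warren_i([1, 1, 0], [0, 1, 1])
```

Both statistics range from 0 (no overlap) to 1 (identical niches); the equivalency test then compares the observed value with values obtained after randomly reallocating occurrences between the two species.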

Pangenome graph and local sequence alignment

We used 70 rice genome assemblies to construct a local pangenome graph of the GBGC locus in rice (Xie et al., 2025). Genomic coordinates of GBGC homologs across these assemblies were identified by BLAST searches using the T2T-NIP sequence as a query (Shang et al., 2023). For each best hit, a 1-Mb interval (±500-kb flanking region) was extracted and used to build the local graph with Minigraph (v0.21-r606; Li et al., 2020). The graph structure was visualized using Bandage (v0.8.1; Wick et al., 2015). Local alignment of these intervals was performed against the T2T-NIP reference (AGIS1.0) using Minimap2 (Li, 2021). Alignments were filtered to retain only those with sequence identity >85% and aligned length >10 kb. Haplotype structures were manually curated and summarized across accessions. For barley and wheat, 76 and 18 genome assemblies were analyzed, respectively, alongside the reference genomes of ‘B1K-04-12’ (barley) and ‘ZM366’ (wheat) (Jayakodi et al., 2024; Jiao et al., 2025). Genomic intervals surrounding GBGC loci were extracted. Alignment results were filtered using a threshold of length >10 kb and identity >85% for barley and identity >70% for wheat.
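
The alignment filter can be sketched as a pass over Minimap2 output in PAF format. The article does not specify how sequence identity was computed; the sketch assumes it is the number of matching bases (PAF column 10) divided by the alignment block length (column 11). The function name and the toy records are hypothetical.

```python
def filter_paf(lines, min_identity=0.85, min_aln_len=10_000):
    """Keep PAF records with identity > min_identity and block length > min_aln_len.

    Identity is taken as matches / alignment block length (PAF columns 10 and 11);
    this definition is an assumption of the sketch.
    Returns tuples of (query name, target name, alignment length, identity).
    """
    kept = []
    for line in lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        n_match, aln_len = int(fields[9]), int(fields[10])
        if aln_len == 0:
            continue
        identity = n_match / aln_len
        if identity > min_identity and aln_len > min_aln_len:
            kept.append((fields[0], fields[5], aln_len, identity))
    return kept


def paf_line(query, n_match, aln_len):
    """Build a toy 12-column PAF record (hypothetical coordinates)."""
    return "\t".join([query, "1000000", "0", str(aln_len), "+", "ref",
                      "1000000", "0", str(aln_len), str(n_match), str(aln_len), "60"])


toy = [paf_line("acc1", 47500, 50000),   # identity 0.95, 50 kb
       paf_line("acc2", 8900, 9000),     # too short
       paf_line("acc3", 16000, 20000)]   # identity 0.80
rice_like = [rec[0] for rec in filter_paf(toy, min_identity=0.85)]
wheat_like = [rec[0] for rec in filter_paf(toy, min_identity=0.70)]
```

The two calls mirror the thresholds given in the text: identity >85% for rice and barley and identity >70% for wheat, each with aligned length >10 kb.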

Gramine-related gene expression

Barley (H. vulgare cv. HOR9043) and C. macrourus were planted in a greenhouse with a 16-h photoperiod at 18°C/15°C (day/night). Z. latifolia and O. sativa were planted under natural field conditions. For exogenous jasmonic acid treatment, 20-day-old plants were sprayed with 150 μM methyl jasmonate (MeJA; 392707, Sigma, Germany). Leaf samples were collected at 0, 0.5, 1, 2, 3, 4, 6, 9, and 12 h; total RNA was extracted using the HiPure Plant RNA Plus Kit (R4150, Magen Biotechnology, Guangzhou, China), and cDNA was obtained using the HiScript II qRT Supermix for qPCR reagent (R223, Vazyme, Nanjing, China). RT–qPCR was performed using the SupRealQ Purple Universal SYBR qPCR Master Mix reagent (Q412, Vazyme, Nanjing, China) on the CFX96 real-time system (Bio-Rad, USA) according to the manufacturer’s instructions. The reaction procedure was 95°C for 30 s, 95°C for 10 s, and 60°C for 30 s. Relative gene expression was calculated by the 2^(−ΔΔCt) method using EF1a, ACTB, OsActin, or Zl18S rRNA as the internal reference. Primers used for constructs are listed in Supplemental Table 7.
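
The relative-expression calculation can be written out explicitly. The sketch below assumes that the 0-h sample serves as the calibrator and that amplification efficiency is 100% for both the target and reference genes; the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    dCt = Ct(target) - Ct(reference) within each sample;
    ddCt = dCt(treated) - dCt(control); fold change = 2 ** (-ddCt).
    Assumes 100% amplification efficiency for both genes.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier after MeJA
fold = relative_expression(ct_target_treated=22.0, ct_ref_treated=18.0,
                           ct_target_control=25.0, ct_ref_control=18.0)
```

With these toy values, ΔΔCt is −3 and the target gene is therefore estimated to be eightfold induced relative to the 0-h sample.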

Gramine quantification

All samples were freeze-dried and ground into powder using a mixer mill. After addition of 1000 μl extraction solvent (precooled at −20°C; methanol/water, 3:1), the samples were vortexed for 30 s, homogenized at 38 Hz for 4 min, and sonicated for 5 min in an ice-water bath. The homogenization and sonication cycles were repeated three times, followed by incubation at −20°C for 1 h and centrifugation at 12 000 rpm and 4°C for 15 min. An 80-μl aliquot of the clear supernatant was transferred to an autosampler vial for liquid chromatography–tandem mass spectrometry (LC–MS/MS). Another 50 μl of supernatant was diluted 100 times for LC–MS/MS analysis. For gramine quantification, samples were analyzed by an ultra-high-performance liquid chromatography–tandem mass spectrometry (UPLC–MS/MS)-based targeted method combined with targeted metabolite profiling. The UPLC analytical conditions were as follows: Waters Acquity UPLC BEH C18 column (particle size 1.7 μm, 2.1 × 100 mm); solvent system, water (0.1% formic acid) and methanol; gradient program, 0 min, 10% B; 2 min, 10% B; 5 min, 90% B; 7 min, 90% B; 7.1 min, 10% B; 10 min, 10% B; flow rate, 300 μl/min; temperature, 35°C; injection volume, 1 μl. Targeted metabolite profiling was performed using scheduled multiple reaction monitoring with a Sciex QTRAP 6500+ triple quadrupole mass spectrometer equipped with an electrospray ionization (ESI) interface. The ESI source operation parameters were as follows: curtain gas, 35 psi; ion spray voltage, 4500 V; source temperature, 550°C; nebulizer gas, 55 psi; heater gas, 55 psi; declustering potential, 80 V; entrance potential, 10 V; collision cell exit potential, 13 V.

Data and code availability

Genome assemblies in this study are downloadable from public data warehouses. The RI detection results for Poaceae species, including the initial output and filtered results from RIFinder, are publicly available on Zenodo at https://doi.org/10.5281/zenodo.15877093. The source code for RIFinder is available on GitHub (https://github.com/Ne0tea/RIFinder) under Apache License 2.0.

Funding

This work was supported by the National Key Research and Development Program (2023YFD1400502) to L.F. and the Young Scientists Fund of the National Natural Science Foundation of China (no. 32300490) to D.W.

Acknowledgments

No conflict of interest declared.

Author contributions

D.W. conceived and initiated this study; Y.H. and S.Z. collected the assemblies; S.Z. performed quality control; Y.H., S.Z., Z.L., L.J., and K.Y. processed the RNA-seq data and analyzed the expression profiles; Y.H., Y.L., and S.Z. conducted the RIFinder evaluation based on simulated data; H.L. performed ecological niche comparison; C. Liu, C. Lu, and Y.C. provided gramine quantification; C.H. and H.Z. provided samples and advice on experiments; Y.H. analyzed all major components; G.Z. and Q.Q. discussed the results; D.W. and L.F. supervised all analyses; Y.H. and D.W. wrote the manuscript with input from all co-authors; and all authors discussed the results and commented on the manuscript.

Published: December 5, 2025

Footnotes

Supplemental information is available at Plant Communications Online.

Contributor Information

Longjiang Fan, Email: fanlj@zju.edu.cn.

Dongya Wu, Email: wudongya@zju.edu.cn.

Supplemental information

Document S1. Supplemental Figures 1–28
mmc1.pdf (6.1MB, pdf)
Data S1. Supplemental Tables 1–7
mmc2.xlsx (287.5KB, xlsx)
Document S2. Article plus supplemental information
mmc3.pdf (14.7MB, pdf)

References

  1. Aharoni A., Dixit S., Jetter R., Thoenes E., van Arkel G., Pereira A. The SHINE clade of AP2 domain transcription factors activates wax biosynthesis, alters cuticle properties, and confers drought tolerance when overexpressed in Arabidopsis. Plant Cell. 2004;16:2463–2480. doi: 10.1105/tpc.104.022897.
  2. Aiello-Lammens M.E., Boria R.A., Radosavljevic A., Vilela B., Anderson R.P. spThin: an R package for spatial thinning of species occurrence records for use in ecological niche models. Ecography. 2015;38:541–545.
  3. Alessi D.R., Pfeffer S.R. Leucine-rich repeat kinases. Annu. Rev. Biochem. 2024;93:261–287. doi: 10.1146/annurev-biochem-030122-051144.
  4. Birchler J.A., Yang H. The multiple fates of gene duplications: deletion, hypofunctionalization, subfunctionalization, neofunctionalization, dosage balance constraints, and neutral variation. Plant Cell. 2022;34:2466–2474. doi: 10.1093/plcell/koac076.
  5. Buchfink B., Reuter K., Drost H.-G. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat. Methods. 2021;18:366–368. doi: 10.1038/s41592-021-01101-x.
  6. Capella-Gutiérrez S., Silla-Martínez J.M., Gabaldón T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009;25:1972–1973. doi: 10.1093/bioinformatics/btp348.
  7. Chen L., Li Y.X., Li C., Shi Y., Song Y., Zhang D., Li Y., Wang T. Genome-wide analysis of the pentatricopeptide repeat gene family in different maize genomes and its important role in kernel development. BMC Plant Biol. 2018;18:366. doi: 10.1186/s12870-018-1572-2.
  8. Das N., Bhattacharya S., Bhattacharyya S., Maiti M.K. Expression of rice MATE family transporter OsMATE2 modulates arsenic accumulation in tobacco and rice. Plant Mol. Biol. 2018;98:101–120. doi: 10.1007/s11103-018-0766-1.
  9. Di Cola V., Broennimann O., Petitpierre B., Breiner F.T., D’Amen M., Randin C., Engler R., Pottier J., Pio D., Dubuis A., et al. ecospat: an R package to support spatial analyses and modeling of species niches and distributions. Ecography. 2017;40:774–787.
  10. Dias S.L., Chuang L., Liu S., Seligmann B., Brendel F.L., Chavez B.G., Hoffie R.E., Hoffie I., Kumlehn J., Bültemeier A., et al. Biosynthesis of the allelopathic alkaloid gramine in barley by a cryptic oxidative rearrangement. Science. 2024;383:1448–1454. doi: 10.1126/science.adk6112.
  11. Duxbury Z., Wu C.-H., Ding P. A comparative overview of the intracellular guardians of plants and animals: NLRs in innate immunity and beyond. Annu. Rev. Plant Biol. 2021;72:155–184. doi: 10.1146/annurev-arplant-080620-104948.
  12. Edelman N.B., Frandsen P.B., Miyagi M., Clavijo B., Davey J., Dikow R.B., García-Accinelli G., Van Belleghem S.M., Patterson N., Neafsey D.E., et al. Genomic architecture and introgression shape a butterfly radiation. Science. 2019;366:594–599. doi: 10.1126/science.aaw2090.
  13. Edwards S.V., Liu L., Pearl D.K. High-resolution species trees without concatenation. Proc. Natl. Acad. Sci. 2007;104:5936–5941. doi: 10.1073/pnas.0607004104.
  14. Eryong C., Bo S. OsABT, a rice WD40 domain-containing protein, is involved in abiotic stress tolerance. Rice Sci. 2022;29:247–256.
  15. Van Etten J., Bhattacharya D. Horizontal gene transfer in eukaryotes: not if, but how much? Trends Genet. 2020;36:915–925. doi: 10.1016/j.tig.2020.08.006.
  16. Feng X., Zheng J., Irisarri I., Yu H., Zheng B., Ali Z., de Vries S., Keller J., Fürst-Jansen J.M.R., Dadras A., et al. Genomes of multicellular algal sisters to land plants illuminate signaling network evolution. Nat. Genet. 2024;56:1018–1031. doi: 10.1038/s41588-024-01737-3.
  17. Fick S.E., Hijmans R.J. WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas. Int. J. Climatol. 2017;37:4302–4315.
  18. Fu X., Liu C., Li Y., Liao S., Cheng H., Tu Y., Zhu X., Chen K., He Y., Wang G. The coordination of OsbZIP72 and OsMYBS2 with reverse roles regulates the transcription of OsPsbS1 in rice. New Phytol. 2021;229:370–387. doi: 10.1111/nph.16877.
  19. Grass Phylogeny Working Group III. A nuclear phylogenomic tree of grasses (Poaceae) recovers current classification despite gene tree incongruence. New Phytol. 2025;245:818–834. doi: 10.1111/nph.20263.
  20. Guo Z.-H., Ma P.-F., Yang G.-Q., Hu J.-Y., Liu Y.-L., Xia E.-H., Zhong M.-C., Zhao L., Sun G.-L., Xu Y.-X., et al. Genome sequences provide insights into the reticulate origin and unique traits of woody bamboos. Mol. Plant. 2019;12:1353–1365. doi: 10.1016/j.molp.2019.05.009.
  21. GBIF Secretariat. GBIF Secretariat; 2023. Zoysia japonica Steud. GBIF Backbone Taxonomy.
  22. GBIF Secretariat. GBIF Secretariat; 2023. Adscita splendens subsp. splendens. GBIF Backbone Taxonomy.
  23. GBIF Secretariat. GBIF Secretariat; 2023. Cleistogenes songorica (Roshev.) Ohwi. GBIF Backbone Taxonomy.
  24. Hahn M.W., Hibbins M.S. A three-sample test for introgression. Mol. Biol. Evol. 2019;36:2878–2882. doi: 10.1093/molbev/msz178.
  25. He F., Pasam R., Shi F., Kant S., Keeble-Gagnere G., Kay P., Forrest K., Fritz A., Hucl P., Wiebe K., et al. Exome sequencing highlights the role of wild-relative introgression in shaping the adaptive landscape of the wheat genome. Nat. Genet. 2019;51:896–904. doi: 10.1038/s41588-019-0382-2.
  26. Hu L., Wu Y., Wu D., Rao W., Guo J., Ma Y., Wang Z., Shangguan X., Wang H., Xu C., et al. The coiled-coil and nucleotide binding domains of BROWN PLANTHOPPER RESISTANCE14 function in signaling and resistance against planthopper in rice. Plant Cell. 2017;29:3157–3185. doi: 10.1105/tpc.17.00263.
  27. Huang J. Horizontal gene transfer in eukaryotes: the weak-link model. Bioessays. 2013;35:868–875. doi: 10.1002/bies.201300007.
  28. Huang W., Zhang L., Columbus J.T., Hu Y., Zhao Y., Tang L., Guo Z., Chen W., McKain M., Bartlett M., et al. A well-supported nuclear phylogeny of Poaceae and implications for the evolution of C4 photosynthesis. Mol. Plant. 2022;15:755–777. doi: 10.1016/j.molp.2022.01.015.
  29. Huang J., Yang L., Yang L., Wu X., Cui X., Zhang L., Hui J., Zhao Y., Yang H., Liu S., et al. Stigma receptors control intraspecies and interspecies barriers in Brassicaceae. Nature. 2023;614:303–308. doi: 10.1038/s41586-022-05640-x.
  30. Huerta-Sánchez E., Jin X., Somel M., Bianba Z., Bianba Z., Peter B.M., Vinckenbosch N., Liang Y., Yi X. Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA. Nature. 2014;512:194–197. doi: 10.1038/nature13408.
  31. Ishikawa E., Kanai S., Shinozawa A., Hyakutake M., Sue M. Hordeum vulgare CYP76M57 catalyzes C2 shortening of tryptophan side chain by C-N bond rearrangement in gramine biosynthesis. Plant J. 2024;118:892–904. doi: 10.1111/tpj.16644.
  32. Jayakodi M., Lu Q., Pidon H., Rabanus-Wallace M.T., Bayer M., Lux T., Guo Y., Jaegle B., Badea A., Bekele W., et al. Structural variation in the pangenome of wild and domesticated barley. Nature. 2024;636:654–662. doi: 10.1038/s41586-024-08187-1.
  33. Jensen A., Horton E.R., Amboko J., Parke S.-A., Hart J.A., Tosi A.J., Guschanski K., Detwiler K.M. Y chromosome introgression between deeply divergent primate species. Nat. Commun. 2024;15. doi: 10.1038/s41467-024-54719-8.
  34. Jiao C., Xie X., Hao C., Chen L., Xie Y., Garg V., Zhao L., Wang Z., Zhang Y., Li T., et al. Pan-genome bridges wheat structural variations with habitat and breeding. Nature. 2025;637:384–393. doi: 10.1038/s41586-024-08277-0.
  35. Jing C.-Y., Zhang F.-M., Wang X.-H., Wang M.-X., Zhou L., Cai Z., Han J.-D., Geng M.-F., Yu W.-H., Jiao Z.-H., et al. Multiple domestications of Asian rice. Nat. Plants. 2023;9:1221–1235. doi: 10.1038/s41477-023-01476-z.
  36. Jonczyk R., Schmidt H., Osterrieder A., Fiesselmann A., Schullehner K., Haslbeck M., Sicker D., Hofmann D., Yalpani N., Simmons C., et al. Elucidation of the final reactions of DIMBOA-glucoside biosynthesis in maize: characterization of Bx6 and Bx7. Plant Physiol. 2008;146:1053–1063. doi: 10.1104/pp.107.111237.
  37. Kalyaanamoorthy S., Minh B.Q., Wong T.K.F., von Haeseler A., Jermiin L.S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods. 2017;14:587–589. doi: 10.1038/nmeth.4285.
  38. Keeling P.J. Horizontal gene transfer in eukaryotes: aligning theory with data. Nat. Rev. Genet. 2024;25:416–430. doi: 10.1038/s41576-023-00688-5.
  39. Kim D., Pertea G., Trapnell C., Pimentel H., Kelley R., Salzberg S.L. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14:R36. doi: 10.1186/gb-2013-14-4-r36.
  40. Kishino H., Miyata T., Hasegawa M. Maximum likelihood inference of protein phylogeny and the origin of chloroplasts. J. Mol. Evol. 1990;31:151–160.
  41. Kulmuni J., Butlin R.K., Lucek K., Savolainen V., Westram A.M. Towards the completion of speciation: the evolution of reproductive isolation beyond the first barriers. Phil. Trans. R. Soc. B. 2020;375. doi: 10.1098/rstb.2019.0528.
  42. Lan Z., Song Z., Wang Z., Li L., Liu Y., Zhi S., Wang R., Wang J., Li Q., Bleckmann A., et al. Antagonistic RALF peptides control an intergeneric hybridization barrier on Brassicaceae stigmas. Cell. 2023;186:4773–4787.e12. doi: 10.1016/j.cell.2023.09.003.
  43. Langfelder P., Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinf. 2008;9:559. doi: 10.1186/1471-2105-9-559.
  44. Leete E., Minich M.L. Biosynthesis of gramine in Phalaris arundinacea. Phytochemistry. 1977;16:149–150.
  45. Letunic I., Bork P. Interactive tree of life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49:W293–W296. doi: 10.1093/nar/gkab301.
  46. Li H. New strategies to improve minimap2 alignment accuracy. Bioinformatics. 2021;37:4572–4574. doi: 10.1093/bioinformatics/btab705.
  47. Li F.-W., Brouwer P., Carretero-Paulet L., Cheng S., de Vries J., Delaux P.-M., Eily A., Koppers N., Kuo L.-Y., Li Z., et al. Fern genomes elucidate land plant evolution and cyanobacterial symbioses. Nat. Plants. 2018;4:460–472. doi: 10.1038/s41477-018-0188-8.
  48. Li H., Feng X., Chu C. The design and construction of reference pangenome graphs with minigraph. Genome Biol. 2020;21:265. doi: 10.1186/s13059-020-02168-z.
  49. Li Y., Liu Z., Liu C., Shi Z., Pang L., Chen C., Chen Y., Pan R., Zhou W., Chen X.X., et al. HGT is widespread in insects and contributes to male courtship in Lepidopterans. Cell. 2022;185:2975–2987.e10. doi: 10.1016/j.cell.2022.06.014.
  50. Liao Y., Smyth G.K., Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014;30:923–930. doi: 10.1093/bioinformatics/btt656.
  51. Lu A., Wang T., Hui H., Wei X., Cui W., Zhou C., Li H., Wang Z., Guo J., Ma D., Wang Q. Natural products for drug discovery: discovery of gramines as novel agents against a plant virus. J. Agric. Food Chem. 2019;67:2148–2156. doi: 10.1021/acs.jafc.8b06859.
  52. Ma P.-F., Liu Y.-L., Jin G.-H., Liu J.-X., Wu H., He J., Guo Z.-H., Li D.-Z. The Pharus latifolius genome bridges the gap of early grass evolution. Plant Cell. 2021;33:846–864. doi: 10.1093/plcell/koab015.
  53. Ma J., Wang S., Zhu X., Sun G., Chang G., Li L., Hu X., Zhang S., Zhou Y., Song C.-P., Huang J. Major episodes of horizontal gene transfer drove the evolution of land plants. Mol. Plant. 2022;15:857–871. doi: 10.1016/j.molp.2022.02.001.
  54. Ma P.-F., Liu Y.-L., Guo C., Jin G., Guo Z.-H., Mao L., Yang Y.-Z., Niu L.-Z., Wang Y.-J., Clark L.G., et al. Genome assemblies of 11 bamboo species highlight diversification induced by dynamic subgenome dominance. Nat. Genet. 2024;56:710–720. doi: 10.1038/s41588-024-01683-0.
  55. Mahelka V., Krak K., Kopecký D., Fehrer J., Šafář J., Bartoš J., Hobza R., Blavet N., Blattner F.R. Multiple horizontal transfers of nuclear ribosomal genes between phylogenetically distinct grass lineages. Proc. Natl. Acad. Sci. 2017;114:1726–1731. doi: 10.1073/pnas.1613375114.
  56. Mahelka V., Krak K., Fehrer J., Caklová P., Nagy Nejedlá M., Čegan R., Kopecký D., Šafář J. A Panicum-derived chromosomal segment captured by Hordeum a few million years ago preserves a set of stress-related genes. Plant J. 2021;105:1141–1164. doi: 10.1111/tpj.15167.
  57. Marques D.A., Meier J.I., Seehausen O. A combinatorial view on speciation and adaptive radiation. Trends Ecol. Evol. 2019;34:531–544. doi: 10.1016/j.tree.2019.02.008.
  58. Minh B.Q., Schmidt H.A., Chernomor O., Schrempf D., Woodhams M.D., von Haeseler A., Lanfear R. IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 2020;37:1530–1534. doi: 10.1093/molbev/msaa015.
  59. Muzellec B., Teleńczuk M., Cabeli V., Andreux M. PyDESeq2: a Python package for bulk RNA-seq differential expression analysis. Bioinformatics. 2023;39. doi: 10.1093/bioinformatics/btad547.
  60. Nagar P., Sharma N., Jain M., Sharma G., Prasad M., Mustafiz A. OsPSKR15, a phytosulfokine receptor from rice enhances abscisic acid response and drought stress tolerance. Physiol. Plant. 2022;174. doi: 10.1111/ppl.13569.
  61. Nosil P., Harmon L.J., Seehausen O. Ecological explanations for (incomplete) speciation. Trends Ecol. Evol. 2009;24:145–156. doi: 10.1016/j.tree.2008.10.011.
  62. Pascual J., Rahikainen M., Angeleri M., Alegre S., Gossens R., Shapiguzov A., Heinonen A., Trotta A., Durian G., Winter Z., et al. ACONITASE 3 is part of the ANAC017 transcription factor-dependent mitochondrial dysfunction response. Plant Physiol. 2021;186:1859–1877. doi: 10.1093/plphys/kiab225.
  63. Pontarp M., Lundberg P., Ripa J. The succession of ecological divergence and reproductive isolation in adaptive radiations. J. Theor. Biol. 2024;587. doi: 10.1016/j.jtbi.2024.111819.
  64. Ramírez-González R.H., Borrill P., Lang D., Harrington S.A., Brinton J., Venturini L., Davey M., Jacobs J., van Ex F., Pasha A., et al. The transcriptional landscape of polyploid wheat. Science. 2018;361. doi: 10.1126/science.aar6089.
  65. Rozewicki J., Li S., Amada K.M., Standley D.M., Katoh K. MAFFT-DASH: integrated protein sequence and structural alignment. Nucleic Acids Res. 2019;47:W5–W10. doi: 10.1093/nar/gkz342.
  66. Schoener T.W. The Anolis lizards of Bimini: resource partitioning in a complex fauna. Ecology. 1968;49:704–726.
  67. Shang L., He W., Wang T., Yang Y., Xu Q., Zhao X., Yang L., Zhang H., Li X., Lv Y., et al. A complete assembly of the rice Nipponbare reference genome. Mol. Plant. 2023;16:1232–1236. doi: 10.1016/j.molp.2023.08.003.
  68. Sjöstrand J., Tofigh A., Daubin V., Arvestad L., Sennblad B., Lagergren J. A Bayesian method for analyzing lateral gene transfer. Syst. Biol. 2014;63:409–420. doi: 10.1093/sysbio/syu007.
  69. Soucy S.M., Huang J., Gogarten J.P. Horizontal gene transfer: building the web of life. Nat. Rev. Genet. 2015;16:472–482. doi: 10.1038/nrg3962.
  70. Steenwyk J.L., Li Y., Zhou X., Shen X.-X., Rokas A. Incongruence in the phylogenomics era. Nat. Rev. Genet. 2023;24:834–850. doi: 10.1038/s41576-023-00620-x.
  71. Stull G.W., Pham K.K., Soltis P.S., Soltis D.E. Deep reticulation: the long legacy of hybridization in vascular plant evolution. Plant J. 2023;114:743–766. doi: 10.1111/tpj.16142.
  72. Suvorov A., Kim B.Y., Wang J., Armstrong E.E., Peede D., D’Agostino E.R.R., Price D.K., Waddell P., Lang M., Courtier-Orgogozo V., et al. Widespread introgression across a phylogeny of 155 Drosophila genomes. Curr. Biol. 2022;32:111–123.e5. doi: 10.1016/j.cub.2021.10.052.
  73. Tang H., Krishnakumar V., Zeng X., Xu Z., Taranto A., Lomas J.S., Zhang Y., Huang Y., Wang Y., Yim W.C., et al. JCVI: a versatile toolkit for comparative genomics analysis. iMeta. 2024;3. doi: 10.1002/imt2.211.
  74. Taylor S.A., Larson E.L. Insights from genomes into the evolutionary importance and prevalence of hybridization in nature. Nat. Ecol. Evol. 2019;3:170–177. doi: 10.1038/s41559-018-0777-y.
  75. Than C., Ruths D., Nakhleh L. PhyloNet: a software package for analyzing and reconstructing reticulate evolutionary relationships. BMC Bioinf. 2008;9:322. doi: 10.1186/1471-2105-9-322.
  76. Warren D.L., Glor R.E., Turelli M. Environmental niche equivalency versus conservatism: quantitative approaches to niche evolution. Evolution. 2008;62:2868–2883. doi: 10.1111/j.1558-5646.2008.00482.x.
  77. Wick R.R., Schultz M.B., Zobel J., Holt K.E. Bandage: interactive visualization of de novo genome assemblies. Bioinformatics. 2015;31:3350–3352. doi: 10.1093/bioinformatics/btv383.
  78. Wu D., Hu Y., Akashi S., Nojiri H., Guo L., Ye C.-Y., Zhu Q.-H., Okada K., Fan L. Lateral transfers lead to the birth of momilactone biosynthetic gene clusters in grass. Plant J. 2022;111:1354–1367. doi: 10.1111/tpj.15893.
  79. Wu D., Jiang B., Ye C.-Y., Timko M.P., Fan L. Horizontal transfer and evolution of the biosynthetic gene cluster for benzoxazinoids in plants. Plant Commun. 2022;3. doi: 10.1016/j.xplc.2022.100320.
  80. Xia J., Guo Z., Yang Z., Han H., Wang S., Xu H., Yang X., Yang F., Wu Q., Xie W., et al. Whitefly hijacks a plant detoxification gene that neutralizes plant toxins. Cell. 2021;184:3588. doi: 10.1016/j.cell.2021.06.010.
  81. Xie L., Huang Y., Huang W., Shang L., Sun Y., Chen Q., Bi S., Suo M., Zhang S., Yang C., et al. Genetic diversity and evolution of rice centromeres. Nat. Genet. 2025;57:2808–2818. doi: 10.1038/s41588-025-02365-1.
  82. Yan Q., Wu F., Yan Z., Li J., Ma T., Zhang Y., Zhao Y., Wang Y., Zhang J. Differential co-expression networks of long non-coding RNAs and mRNAs in Cleistogenes songorica under water stress and during recovery. BMC Plant Biol. 2019;19:23. doi: 10.1186/s12870-018-1626-5.
  83. Yeasmin L., Ali M.N., Gantait S., Chakraborty S. Bamboo: an overview on its genetic diversity and characterization. 3 Biotech. 2015;5:1–11. doi: 10.1007/s13205-014-0201-5.
  84. Yoshida S., Maruyama S., Nozaki H., Shirasu K. Horizontal gene transfer by the parasitic plant Striga hermonthica. Science. 2010;328:1128. doi: 10.1126/science.1187145.
  85. Yu Y., Zhang H., Long Y., Shu Y., Zhai J. Plant Public RNA-seq Database: a comprehensive online database for expression analysis of ∼45 000 plant public RNA-Seq libraries. Plant Biotechnol. J. 2022;20:806–808. doi: 10.1111/pbi.13798.
  86. Zdobnov E.M., Apweiler R. InterProScan – an integration platform for the signature-recognition methods in InterPro. Bioinformatics. 2001;17:847–848. doi: 10.1093/bioinformatics/17.9.847.
  87. Zhang J., Wu F., Yan Q., John U.P., Cao M., Xu P., Zhang Z., Ma T., Zong X., Li J., et al. The genome of Cleistogenes songorica provides a blueprint for functional dissection of dimorphic flower differentiation and drought adaptability. Plant Biotechnol. J. 2021;19:532–547. doi: 10.1111/pbi.13483.
  88. Zhang L., Zhu X., Zhao Y., Guo J., Zhang T., Huang W., Huang J., Hu Y., Huang C.-H., Ma H. Phylotranscriptomics resolves the phylogeny of Pooideae and uncovers factors for their adaptive evolution. Mol. Biol. Evol. 2022;39. doi: 10.1093/molbev/msac026.

Articles from Plant Communications are provided here courtesy of Elsevier
